Vol. 9: 100 Years of Gravity and Accelerated Frames: The Deepest Insights
of Einstein and Yang-Mills
(J.-P. Hsu & D. Fine)
100 Years of
Gravity and
Accelerated Frames
The Deepest Insights of
Einstein and Yang-Mills
Editors
Jong-Ping Hsu
Dana Fine
University of Massachusetts Dartmouth, USA
World Scientific
NEW JERSEY . LONDON . SINGAPORE . BEIJING . SHANGHAI . HONG KONG . TAIPEI . CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
QC178.A15 2005
530.11--dc22
2005050077
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222
Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
To
Arthur Fine
"In 1953-1954, I was visiting Brookhaven and Bob was my office mate.
We discussed many things in physics, from the experimental results pouring
out of the new Cosmotron, to theoretical topics like renormalization and the
Ward identity. It was in that year that we found the very elegant and unique
generalization of Maxwell's equation. We were pleased by the beauty of the
generalization, but neither of us had anticipated its great impact on physics
20 years later."
C. N. Yang,
in "Remembering Robert L. Mills" by Samuel L. Marateck,
Physics Today, p. 14, October 2003.
"The Rubaiyat"
Omar Khayyam
(transl. by Edward Fitzgerald)
Preface
This book is a collection of papers and writings from the past 100 years on
ideas and problems related to gravity, gauge fields and accelerated frames. The grand
triumphs of Einstein's theory of gravity and Yang-Mills' theory in physics are well
known. It is believed that both theories are based on the principle of 'gauge
invariance,' although not on the same kind of action. Einstein's theory is linear in
spacetime curvature, while Yang-Mills' theory is quadratic in gauge curvature. Now,
at the dawn of the 21st century, invariance principles in physics have transcended the
kinematical and dynamical contexts from which they originated to become the
foundation of our understanding of the physical world. Using this framework of
invariance principles, this book surveys the development of gravitational and
Yang-Mills fields, as well as spacetime transformations of accelerated frames. It also
attempts to reveal the problems and limitations of various formulations of
gravitational and Yang-Mills fields. The intent is to enlarge and broaden the reader's
views on the subjects.
Einstein was TIME magazine's person of the 20th century (cf. TIME magazine), and
his contributions to physics are arguably unmatched by anyone except Newton.
The gravitational force and accelerated frames were two ingredients in the young
Einstein's 'happiest thoughts' in 1907. The simple thought that 'If a person falls freely
he will not feel his own weight,' made a deep impression on him and impelled him
toward a successful theory of gravitation. Unfortunately, accelerated spacetime
transformations for non-inertial frames have still not been well developed. However,
they are important because one cannot claim to have a complete understanding of the
physical world, especially the basic gravitational and Yang-Mills fields, if one
understands physics only from the viewpoint of the special and limited class of
inertial frames. Strictly speaking, all real frames of reference in the physical world
are non-inertial because of the long range of the gravitational force. In particular,
when one talks about an inherent property of nature (e.g., values of fundamental
constants such as the fine structure constant and the speed of light), a reasonable
criterion is that the property must be present in both inertial and non-inertial frames.
In this sense, the book suggests that the present understanding of gravitational and
Yang-Mills fields is far from complete.
The formulations of the gravitational and Yang-Mills theories are both an
effect and a cause of scientific development in experiment and theory. Progress in
physics is made through the collective effort of many physicists. The community of
physicists is like the thousand-hand Guan-Yin: Each hand accomplishes only a
partial or small task, yet the overall accomplishment is enormous. As we shall see in
this volume, in the pursuit of physical laws, the right track has often been discovered
only after many failures by well-known and not-so-well-known pathfinders.
It is hoped that the present volume will bring some of the unfulfilled aspects
of the profound thoughts to the attention of physicists and mathematicians of the 21st
century. For various reasons, research in the areas of spacetime symmetry and
special relativity is sometimes discouraged by general editorial policy. Fortunately,
so far this is not true in the cases of general relativity and Yang-Mills theory. If the
deepest insights of Einstein and Yang-Mills can inspire readers to pursue these
subjects further, the chief purpose of the book will have been achieved.
We are grateful that Chairman M. Ninomiya of the Progress of Theoretical
Physics, Acta Physica Polonica B, Elsevier Ltd, and Editorial Administrator R. W.
Brown of the Annals of the New York Academy of Sciences freely granted us
permission to reprint their papers. We would like to thank L. Hsu, H. L. Chen, N.
Cleffi and E. M. Winiarz for their help. This book was supported in part by the Prof.
George Leung Memorial Fund of the University of Massachusetts Dartmouth
Foundation, the Potz Science Fund, and the World Scientific Publishing Company.
Hsu/Jong-Ping
Contents
Preface
Acknowledgements
Remarks on the Development of the Gravitational and Yang-Mills Fields, and Accelerated Frames
Appendices
A Marcel Grossmann (1878-1936)
J. P. Hsu and D. Fine
B Remembering Robert L. Mills
S. L. Marateck
Acknowledgements
On Homogeneous Gravitational Fields in the General Theory of Relativity and the Clock Paradox
Reprinted with permission from C. Møller, Danske Vid. Sel. Mat-Fys. 20, No. 19 (1943).
© 1943 The Royal Danish Academy.
Generalized Lorentz Transformations for Linearly Accelerated Frames with Limiting Four-Dimensional Symmetry
Reprinted with permission from J. P. Hsu and L. Hsu, Chinese Journal of Physics 35, 407
(1997). © 1997 The Physical Society of the Republic of China.
Generalizing Lorentz Transformations for Accelerated Frames and Their Physical Implications
Contributed by D. T. Schmitt and Tobias Kleinschmidt. Based on a paper submitted to the Int. J.
Modern Phys. (to be published).
Feynman Rules for Electromagnetic and Yang-Mills Fields from the Gauge-Independent Field-Theoretic Formalism (Extract)
Reprinted with permission from S. Mandelstam, Phys. Rev. 175, 1580 (1968).
© 1968 The American Physical Society.
Concept of Nonintegrable Phase Factors and Global Formulation of Gauge Fields (Extract)
Reprinted with permission from T. T. Wu and C. N. Yang, Phys. Rev. D12, 3845 (1975).
© 1975 The American Physical Society.
Gauge Fields
Reprinted with permission from Robert Mills, Am. J. Phys. 57, 493 (1989).
© 1989 American Institute of Physics.
Chen Ning Yang and Tai Tsun Wu in Leiden (1984) (Photo courtesy of Judy Wong)
T. D. Lee and C. N. Yang at the Institute for Advanced Study (Courtesy of the Archives of the Institute
for Advanced Study, Princeton, New Jersey, USA)
1 Introduction
In the past 100 years, the ideas of general coordinate invariance and of gauge invariance
have played leading roles in the investigation of the fundamental interactions of nature
(gravitational, weak, electromagnetic and strong interactions). Physics Today calls 2005
the World Year of Physics to celebrate physics in its broadest context as part of the
human experience and to raise awareness of physics within the broad population. [1] It
also happens to be roughly the 50th birth-year of Yang-Mills theory, the 100th birth-year
of Einstein's 'happiest thought', as well as the 100th anniversary of the publication of the
celebrated theory of special relativity. [2] It is thus a fitting time to review the ideas of
Einstein and of Yang and Mills, their impact on both physics and mathematics, and some open
problems in related areas.
In 1907, a simple thought flashed through young Einstein's mind:
"I was sitting in a chair in the patent office at Bern when all of a sudden a thought
occurred to me: 'If a person falls freely he will not feel his own weight.' I was startled.
This simple thought made a deep impression on me. It impelled me toward a theory of
gravitation."
He told this story in his Kyoto lecture. [3] This 'happiest thought', as Einstein called it,
involves two ingredients:
i. gravitational force, and
ii. accelerated frames.
These two related physical subjects were first discussed by Einstein in his 1907 review paper
entitled "On the Relativity Principle and the Conclusions Drawn From It." [4] Apparently,
these problems had engrossed Einstein's thought shortly after he published his landmark
paper on special relativity 100 years ago. Einstein realized that his knowledge of the relevant
mathematics was inadequate, so he turned to his mathematician friend and university
colleague Grossmann. Grossmann's help proved significant for Einstein in realizing his dream
theory.
Around 1947, a graduate student at the University of Chicago, C. N. Yang, also had a
simple thought: [5]
"Since Maxwell's equations and the conservation of electric charge are intimately related,
and the conservation of isotopic spin has been established by experiments, should it imply
another kind of gauge field?"
Yang tried and tried, but he was just unable to overcome a key difficulty. [6] (See sec. 3)
Such efforts, and failures, are typical, everyday occurrences in graduate students' offices. It
seems this thought made a deep impression on him. Seven years later, it also impelled
Yang and Mills toward a non-Abelian gauge theory, when a spark from their discussions
grew to illuminate the key difficulty.
These two instances, and many others, illustrate the remarkable fact that new
thoughts and ideas often emerge from refreshingly young minds. As Norbert Wiener said
eloquently to young mathematicians: "You must devote this brief springtime of top creative
ability to the discovery of new fields and new problems, of such richness and compelling
character that you can scarcely exhaust them in your life." This goal appears to apply
equally well to all young researchers.
2 Gravity
Historically, Newton created the basic framework for understanding the static property of
gravitational force in his grand PRINCIPIA (1686): the inverse-square law of universal
gravity and the laws of motion. Newton achieved an extraordinary unification, perhaps
the first in physics: he demonstrated that the same force and laws of motion apply in the
celestial and the sub-lunar spheres. The style of Newton's PRINCIPIA follows closely that
of the ELEMENTS by Euclid, a pioneering and profound mathematical thinker. For example, Book
III of PRINCIPIA starts from rules of reasoning in philosophy, introduces phenomena,
and follows these with propositions. [7] Newton proposed the first scientific understanding
of the solar system based on his universal law for (static) gravitational force and his powerful
general methods of mathematics.
About 220 years later, Poincaré investigated the Lorentz-invariant and kinematic
properties of gravity in his comprehensive paper on relativity finished in 1905. After he discussed
and derived all essential results of special relativity for mechanics and electrodynamics, he
reached an insightful conclusion that the gravitational action propagates with the speed of
light, based on the invariants of the Lorentz group. [8] Poincaré is known for his broad and
universal mind and for his contributions to mathematics and physics, comparable to those
of Gauss.
Through his persistence and ingenuity, Einstein's 'happiest thought' of 1907 eventually led
to arguably the greatest leap forward in the human endeavor to understand the universe,
"The Foundation of the General Theory of Relativity." This followed 8 years of hard work
and about 30 not-completely-correct papers on gravity and/or general relativity. [9] The
result is Einstein's splendid equation for gravitation, which has been extensively discussed
and tested by experiments.
Einstein conceived and worked on the idea of general coordinate invariance during
1911-12. His knowledge of mathematics was inadequate to express his radically new ideas, so he
sought help, pleading to his mathematician friend: "Grossmann, you must help me, or else
I'll go crazy!" Thus started one of the most beautiful collaborations between two scientists
in different disciplines. [10] Einstein's insight was that the law of gravity must be invariant
under arbitrary spacetime coordinate transformations (one-to-one and twice-differentiable).
This idea implies that the physics of gravity should be formulated and understood in any
frame of reference, and it eventually led to a theory of gravity which revolutionized our
concepts of the physical universe, endowing the universe with pseudo-Riemannian geometry.
The impact on mathematics can be seen from the participation in the development of Einstein's
idea by such brilliant mathematicians as Hilbert, Cartan, Levi-Civita, Weyl and others.
Indeed the impact went well beyond the sciences to include literature and art, which is part
of how Einstein became an international celebrity and TIME magazine's person of the
century.
In 1915, after attending Einstein's talks and after many discussions and extensive
correspondence, Hilbert was able to grasp the essence of Einstein's idea and express it through an
invariant action involving the linear scalar curvature. He finished the paper "The
Foundation of Physics" containing this invariant action about a week before Einstein completed
his landmark paper. [11] Hilbert was a newcomer in this field, and one can feel a fierce
competition. Indeed, their relation appears to have been strained for a short period of
time. Presumably this was due to their difference in philosophy (action principle versus
detailed dynamical analysis) rather than a dispute over priority. In fact, Hilbert clearly
credits Einstein with the idea for the theory. Apparently, Einstein liked Hilbert's method
of deriving the gravitational equation from one single principle of variation. He also
published a paper, "Hamilton's Principle and the General Theory of Relativity," in 1916. The
recent public dissemination of a hand-edited galley proof of Hilbert's paper has shed new
light on the development of Hilbert's formulation and its relation to Einstein's, as discussed
in detail in [12]. Hilbert's treatment of gravity (and electromagnetism) was based on an
elegant invariant formalism which he argued mathematically was unique given, as axioms,
the requirement of general covariance and some reasonable additional assumptions. [11] He
even believed what we would now call the 'theory of everything' could be constructed on
the basis of an axiomatically-determined geometrically-invariant action. This situation
resembles the way the mathematician Poincaré grasped the essence of the relativity principle in
1905 through his complete understanding of the symmetry group of the Lorentz
transformations. He derived, for the first time, the invariant law for the motion of a charged particle
by using a Lorentz invariant action. [8] The young Einstein was unable to do so; in his 1905
landmark paper he only obtained an approximate and non-invariant equation for the motion
of a charged particle with small accelerations. The principle of least action is now a powerful
and standard method for the formulation of physical theories.
Furthermore, Cartan recognized in 1922 an unsatisfactory feature in Einstein's
equation of gravity, namely, that the energy-momentum tensor does not have a geometrical
meaning. [13] He showed how, in an Einstein universe with a given $ds^2$, the energy tensor attached
to each volume element of that universe can be defined geometrically. Following this line
of research, he published a paper "On a generalization of the concept of Riemann
curvature and spaces with torsion." [14] Cartan's work created the only satisfactory mathematical
framework for physicists to be able to introduce fermions or spinors into Einstein's theory.
This is the only known way for fermions to be coupled to the gravitational field according to
the requirement of general coordinate invariance. All these examples clearly show the need
for collaboration between physicists and mathematicians to develop a physical theory.
In some sense, Einstein's idea of general coordinate invariance suggests a drastic new
approach in physics: namely, the geometrization of all physical fields. If one follows this
approach to treat the electromagnetic force, which is velocity-dependent, one might choose
to employ Finsler geometry rather than Riemannian geometry. The reason is that the
fundamental metric tensors of Finsler geometry depend on both position and velocity
(i.e., the differentials of coordinates). [15] It appears that, so far, all attempts to geometrize
the classical electromagnetic field have not been successful, let alone quantum electrodynamics
and Einstein's unified theory based on Riemannian geometry.
Einstein's tremendously successful theory of gravity also presents a huge problem in
physics. When Dyson gave the Gibbs lecture on "Missed Opportunities" under the
auspices of the American Mathematical Society in 1972, he stressed that the most glaring
incompatibility of concepts in contemporary physics is that between the principle of general
coordinate invariance and all quantum-mechanical and quantum-field-theoretic descriptions
of nature. [16] Such an incompatibility is intimately related to the difficulty of quantization
in curved spacetime. As a perturbative theory in flat spacetime, Einstein's theory is very
complicated and is not renormalizable. [17] This is of course a challenge to
mathematicians and physicists; as Bohr was fond of saying: "How wonderful that we have met with a
paradox. Now we have some hope of making progress."
where a, b, c = 1, 2, 3, and F s are the constant matrix representations of the isospin SU(2)
group. This addition solved Yang's original difficulty, and the rest is history. This type
of new term in the field strength is essential for all non-Abelian gauge fields. Adding this
term to the gauge field strength is comparable in its significance to adding the displacement
current term to the original Ampere law by Maxwell, which made the electromagnetic theory
consistent and complete.
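For orientation, the structure of such a field strength can be written in one common textbook convention. The symbols used here ($B^a_\mu$ for the gauge potential, $g$ for the coupling constant, $\epsilon^{abc}$ for the SU(2) structure constants) and the normalization are assumptions, and need not coincide with those of the original equation:

```latex
F^{a}_{\mu\nu} \;=\; \partial_\mu B^{a}_{\nu} \;-\; \partial_\nu B^{a}_{\mu}
\;+\; g\,\epsilon^{abc}\, B^{b}_{\mu} B^{c}_{\nu},
\qquad a, b, c = 1, 2, 3 .
```

Setting $g = 0$ removes the last term and leaves a Maxwell-type field strength for each component $a$; the term quadratic in the potential is the new, non-Abelian ingredient.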
Nevertheless, this non-trivial and unique generalization of Maxwell's equations did not
attract much attention initially. When Dyson wrote the article "Innovation in Physics" in
1958, he failed to mention the discovery of non-Abelian gauge fields by Yang and Mills. [16]
Even when Yang himself gave a talk on "The Future of Physics" at MIT in 1961, he just
asked a related question: "What are the bases of the invariance under charge conjugation,
and the invariance under isotopic spin rotation, both of which, unlike space-time symmetries,
are known to be violated?" Yang did not mention the non-Abelian gauge fields.
There were in fact serious problems associated with the original Yang-Mills theory. One
of the problems is that the Yang-Mills field is massless, while all observed particles with strong
interactions have mass. This incompatibility between Yang-Mills field theory and
experiment became the central issue when Yang was invited by Oppenheimer to give a talk at the
Institute for Advanced Study, Princeton, on this work. Pauli persistently and repeatedly
asked Yang the question regarding the mass of the new field. Yang could not give him a
satisfactory answer. Pauli made a strong criticism, and the talk was almost stopped by
him. [19] It was well known that Pauli was super-critical of every new physical idea,
including Einstein's ideas. In fact, Pauli himself had had a similar idea and investigated the same
problem before. He also obtained the expression for the field strength of the new gauge
field. According to Pauli's colleague Gulmanelli, [20] Pauli gave up the whole
investigation because the new field quantum was a massless vector particle and, hence, contradicted
experiments. To all practical physicists at that time, it was obvious that the Yang-Mills
theory with zero mass field did not exist in nature, because a zero mass field would have
been easily detected in strong-interaction experiments.
Here, one sees clearly the difference between Yang and Pauli regarding their tastes in
physics research. Yang said later: "We did not know how to make the theory fit experiment.
It was our judgment, however, the beauty of the idea alone merited attention." [21] Even
so, in a 2003 remembrance of Mills, Yang, recalling their creation of the gauge theory,
acknowledged: "We were pleased by the beauty of the generalization, but neither of us had
anticipated its great impact on physics 20 years later." [22]
Although most physicists ignored Yang and Mills' work at that time, T. D. Lee and R.
Utiyama were immediately attracted to their idea. Lee and Yang proposed generalized
gauge transformations to understand the conservation of heavy particle (or baryon)
number in 1955. They argued that such a new conservation law implies the existence of a new
long-range repulsive force between baryons (e.g., protons and neutrons). The corresponding
force would be attractive between baryons and anti-baryons. They used the Eötvös experiment
to estimate the strength of such a new force. They found it to be about times
smaller than that of the gravitational force. Similar consideration of the conservation law
for lepton number leads to a new and very weak long-range repulsive force between electrons.
It is quite possible that such a Lee-Yang force between baryons in galaxies played a role
similar to the so-called dark energy and has direct relevance to the observed accelerating
expansion of the universe. [23]
One year later, in 1956, Utiyama generalized Yang and Mills' discussion of the SU(2) gauge
group to a general group with n gauge functions, and grasped the essential idea expressed
rules beyond one-loop to show that Feynman's idea of a ghost particle works for both the
Yang-Mills theory and the Einstein theory. He published three long and detailed papers in
1967: "Quantum Theory of Gravity" I, II, and III. [28] The whole thing is very complicated
and the physical reason for the presence of the ghost particle only in the intermediate states
of a physical process is not completely clear. However, the results of Feynman and DeWitt
inspired two Russian physicists, Faddeev and Popov, to write a most elegant paper, "Feynman
Diagrams for the Yang-Mills Field" (with only 2 pages!), in the same year. [29] This paper
completely clarified and solved the problem of unitarity and gauge invariance to all orders,
on the basis of the Feynman path integral.
Basically, Faddeev and Popov solved Feynman's problem of unitarity by considering the
gauge invariance of physical amplitudes. They showed that a gauge condition such as
$\partial_\mu B^{\mu}_{a} = 0$ in the Yang-Mills theory cannot be consistently imposed for all time, in contrast
to that in quantum electrodynamics. Faddeev and Popov proposed a new method to enforce
the gauge condition for all time and, hence, maintain the gauge invariance of physical
amplitudes. They showed that it leads to the closed loop with a (scalar) ghost particle
propagating along it with a specific interaction, which restores the unitarity of the physical
amplitudes. The same method based on Feynman's path integral can also be applied to
Einstein's theory of gravity. [30]
The path integral is another not-completely-baked but (probably) profound idea of
Feynman's. It is useful in physics but still lacks a mathematical foundation. The status of Dirac's
delta function around the 1930s was similar. Now we have a foundation for the delta function
in the mathematical theory of distributions, but we still lack a mathematical basis for
Feynman's path integral. This problem was also discussed by Dyson in his lecture "Missed
Opportunities." [31] In this sense, the results of Faddeev and Popov may not be
considered as rigorously established within the framework of quantum field theory. Mandelstam
obtained the same results on the basis of quantum field theory. [32]
Thus, the Yang-Mills theory (with or without masses) can be established as a gauge
theory which satisfies both unitarity and renormalizability. So far, this is the best theory
of strong, weak and electromagnetic interactions that physicists can construct within the
framework of local field theory. [33] As noted above, Einstein's theory of gravity can also be
regarded as a gauge theory. However, there are important differences: Einstein's theory is
based on curved spacetime and its action is linear in the spacetime curvature. In contrast,
the conventional Yang-Mills theory is based on flat spacetime and its action is quadratic
in gauge curvature. As a result, Einstein's theory is gauge invariant and unitary, but it is
not renormalizable. Only when one can construct a unitary, renormalizable and consistent
theory of gravity can one claim to have solved the problem of quantum gravity.
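The contrast between an action linear in curvature and one quadratic in curvature can be made explicit. In one common convention (units with $c = 1$; the normalizations and the metric signature are assumptions here, not taken from the text), the two actions read:

```latex
S_{\text{Einstein}} \;=\; \frac{1}{16\pi G}\int d^{4}x\, \sqrt{-g}\; R ,
\qquad
S_{\text{Yang-Mills}} \;=\; -\frac{1}{4}\int d^{4}x\; F^{a}_{\mu\nu}\, F^{a\,\mu\nu} .
```

Here $R$ is the scalar curvature, which enters to the first power, while the gauge curvature $F^{a}_{\mu\nu}$ enters to the second power.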
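$$w_I = \left(x + \frac{1}{\alpha^*}\right)\sinh(\alpha^* w^*), \quad x_I = \left(x + \frac{1}{\alpha^*}\right)\cosh(\alpha^* w^*) - \frac{1}{\alpha^*}, \quad y_I = y, \quad z_I = z,$$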
where $w_I = ct_I$ and $w^* = ct^*$. In 1972, T. Y. Wu and Y. C. Lee derived the same
accelerated transformations from kinematical considerations based on Lorentz transformations for
short spacetime intervals. [35] In the limit of zero acceleration, they reduce to the identity
transformation, and although there are reasons to term these constant accelerations, the
velocity $\beta$ of the $F(w^*, x, y, z)$ frame is not a linear function of the time $w^*$: $\beta = \tanh(\alpha^* w^*)$.
J.P. Hsu and L. Hsu reparametrized the time coordinate and generalized further to arrive
at transformations with two properties: [36]
i. minimal departure from the Lorentz transformation: the velocity $\beta$ is a linear function
of the (accelerated-frame) time $w$, $\beta = \beta_o + \alpha_o w$, and
ii. limiting 4-dimensional symmetry: the transformation to an allowed accelerated frame
reduces to a (generally non-trivial) Lorentz boost in the limit of zero acceleration.
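The difference between the two velocity laws can be checked numerically. The following is a minimal Python sketch, not taken from the source: the function names are hypothetical, velocities are in units of c, and the hyperbolic-motion form used for the inertial coordinates of the accelerated frame's origin is the standard textbook one, assumed here to correspond to the transformations of Wu and Lee.

```python
import math


def beta_hyperbolic(alpha_star: float, w_star: float) -> float:
    """Velocity (units of c) for the law beta = tanh(alpha* w*)."""
    return math.tanh(alpha_star * w_star)


def beta_linear(beta_o: float, alpha_o: float, w: float) -> float:
    """Velocity (units of c) for the law beta = beta_o + alpha_o * w."""
    beta = beta_o + alpha_o * w
    if abs(beta) >= 1.0:
        # The linear law only describes a physical frame while |beta| < 1.
        raise ValueError("|beta| must stay below 1")
    return beta


def origin_in_inertial_frame(alpha_star: float, w_star: float) -> tuple[float, float]:
    """Inertial coordinates (w_I, x_I) of the accelerated frame's origin x = 0.

    Assumed (standard hyperbolic-motion) form:
    w_I = sinh(alpha* w*) / alpha*,  x_I = (cosh(alpha* w*) - 1) / alpha*.
    """
    return (math.sinh(alpha_star * w_star) / alpha_star,
            (math.cosh(alpha_star * w_star) - 1.0) / alpha_star)


def origin_velocity_numeric(alpha_star: float, w_star: float, h: float = 1e-5) -> float:
    """Central-difference estimate of dx_I/dw_I along the origin's worldline."""
    w_plus, x_plus = origin_in_inertial_frame(alpha_star, w_star + h)
    w_minus, x_minus = origin_in_inertial_frame(alpha_star, w_star - h)
    return (x_plus - x_minus) / (w_plus - w_minus)


if __name__ == "__main__":
    alpha = 0.3  # illustrative value only
    for w in (0.0, 0.5, 2.0, 5.0):
        # Closed-form tanh law next to the numerically differentiated worldline.
        print(w, beta_hyperbolic(alpha, w), origin_velocity_numeric(alpha, w))
```

For small values of the product of acceleration and time the two laws agree to first order. At large times the tanh law saturates below 1 on its own, whereas the linear law has to be restricted to the range where |β| < 1, which the sketch enforces with an explicit check.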
The resulting transformation between an inertial frame $F_I$ and an allowed accelerated
frame $F$, moving with a constant acceleration $\alpha_o$ in the x-direction, is
(5)
and then
ii. making the usual 4-dimensional Lorentz boost for $(W\,dw, dx, dy, dz)$ involving the
(spacetime-dependent) velocity $\beta = \beta_o + \alpha_o w$,
$$dw_I = \gamma(W\,dw + \beta\,dx), \quad dx_I = \gamma(dx + \beta W\,dw), \quad dy_I = dy, \quad dz_I = dz. \qquad (6)$$
These equations can be obtained from (4) by differentiation. The synchronization factor
W in the first operation guarantees that the differential equations in (6) are integrable and
that the time in the accelerated frame F automatically becomes the synchronized time in
the Lorentz transformation when the acceleration $\alpha_o$ approaches zero. [36]
fiber bundle and the square norm of this connection's curvature, respectively. Indeed, Wu
and Yang published a dictionary [38] allowing the physicist and the mathematician each to
translate the terms the other had developed for the same concepts. The recognition of this
overlap of interests quickly led to a renewal of interaction between what had become two
distinct communities. As outlined below, basic questions in the new physics suggested new
realms of mathematical exploration; conversely, existing mathematical constructs, notably
index theory, provided insights in the new physics.
The formulation of an action immediately raises the question of its critical points; that
is, the solutions of the corresponding Euler-Lagrange equations. For the Yang-Mills action
these solutions can be non-trivial field configurations known as instantons. Hitchin, Atiyah,
Singer, Drinfeld and Manin [39] began studying the spaces of instantons. In the case of
self-dual Yang-Mills instantons Donaldson [40] obtained results which firmly established
the Yang-Mills equations as a useful tool in the study of manifolds. He developed what
is now known as the Donaldson invariant, which proves sensitive not only to the topology
but to the differentiable structure of a manifold. This served as the key to resolve a long-
standing question of whether there could be inequivalent differentiable structures on a given
topological four-manifold. (The answer is yes.)
The chiral and non-Abelian anomaly, first discovered using the technique of current
algebras, provided another avenue from Yang-Mills physics to mathematics. This avenue
proved to be very much a two-way street. Bringing techniques from global analysis of
manifolds, including index theory, to bear on a puzzle arising from their study of the Yang-
Mills path integral, Singer and Atiyah [41] developed a new, global formulation of the
anomaly. In so doing, they resolved some outstanding issues in quantum physics, and
inspired a generation of physicists to study new techniques of differential geometry. On
the mathematical side, this inaugurated the study of the topology and geometry of certain
infinite-dimensional spaces. Fine and Fine [42] present the history of the anomaly and the
attendant interplay between mathematics and physics in detail.
The profound influence of Yang-Mills on mathematics is ongoing. Attempts to rigorously
define the quantum Yang-Mills field theory continue. Moreover, there is a direct connection
between the mathematical formulation of the anomaly in Yang-Mills theory and the contem-
porary formulation of string theory, which, in turn, is a continuing source of mathematical
studies. String theory is also a descendant of general relativity: it is supposed to reduce
to general relativity in a classical limit, and is conjectured to reduce to higher-dimensional
supergravity in an appropriate parameter regime.
Topological quantum field theory is an area of active research in which trying to follow
ideas as they move between mathematics and physics is like watching a tennis match from
center court. After Donaldson developed his invariants by mathematically analyzing the
classical Yang-Mills equations, Atiyah conjectured there should be a quantum field theory in
which they arise naturally. In response, Witten wrote down a Lagrangian for a topologi-
cal Yang-Mills theory, followed shortly by a topological gravity theory. Atiyah and Jeffrey
promptly re-interpreted the topological Yang-Mills theory in terms of equivariant
cohomology. Since then topological quantum field theories have led to new mathematical conjectures,
most of which have then been rigorously proven, and provided both mathematicians and
physicists with examples of path integrals which can be treated non-perturbatively.
Arguably, the most important influence of Yang and Mills' work has been bringing
significant portions of the mathematics and physics communities back into their historically
close contact, once again sharing a language and working on aspects of the same problems,
after a long period of divergence.
There is an interesting story in which Yang described his joy of comprehension of the
relation of physics and mathematics: [43]
"In 1975, impressed with the fact that gauge fields are connections on fiber bundles, I
drove to the house of Shiing Shen Chern in El Cerrito, near Berkeley. ..... I told him that
I had finally learned from Jim Simons the beauty of fiber-bundle theory and the profound
Chern-Weil theorem. I said I found it amazing that gauge fields are exactly connections on
fiber bundles, which the mathematicians developed without reference to the physical world.
I added, 'this is both thrilling and puzzling, since you mathematicians dreamed up these
concepts out of nowhere.' He immediately protested, 'No, no. These concepts were not
dreamed up. They were natural and real.'"
"However, when Einstein created his theory of gravitation, he put forward the term 'general
relativity' which confused everything. This term was adopted in the sense of 'general
covariance', i.e. in the sense of the covariance of equations with respect to arbitrary
transformations of coordinates accompanied by transformations of the $g_{\mu\nu}$. But we have seen
that this kind of covariance has nothing to do with the uniformity of space, while in one way
or another relativity is connected with uniformity. This means that 'general relativity' has
nothing to do with 'relativity as such'. At the same time the latter received the name 'special'
relativity, which purports to indicate that it is a special case of 'general' relativity." [44]
Fock's criticism is constructive because it helps to clarify the relation, or the lack of
relation, between special relativity and general relativity. In special relativity, as discussed
by Lorentz, Poincaré, Einstein and Minkowski, spacetime is flat. However, in Einstein's
general relativity, spacetime is curved, so that he could incorporate the physical effects
of gravity into the Riemannian curvature tensor. Perhaps the idea of generalizing special
relativity first came to mind when he was thinking about generalizing physical laws from
inertial frames to non-inertial frames with arbitrary velocities. [4] This does not justify
the term 'general relativity' in the sense of frame-independence, because the accelerated
transformations in (3) and (4) show that there is no relativity between inertial and
accelerated frames. Note that the curvature of spacetime in these accelerated frames must still
be zero. Nevertheless, the terms reveal the continuity of Einstein's reasoning after 1905.
"... Perhaps they speak of the Principle of Equivalence. If so, it is my turn to have
a blank mind, for I have never been able to understand this principle. Does it mean that the
signature of the space-time metric is +2 (or -2 if you prefer the other convention)? If so,
it is important, but hardly a Principle. Does it mean that the effects of a gravitational
field are indistinguishable from the effects of an observer's acceleration? If so, it is false.
In Einstein's theory, either there is a gravitational field or there is none, according as the
Riemann tensor does not or does vanish. This is an absolute property; it has nothing to
do with any observer's world-line. Space-time is either flat or curved, and in several places
in the book I have been at considerable pains to separate truly gravitational effects due to
curvature of space-time from those due to curvature of the observer's world-line (in most
ordinary cases the latter predominate). The Principle of Equivalence performed the essential
office of midwife at the birth of general relativity, but, as Einstein remarked, the infant would
never have got beyond its long-clothes had it not been for Minkowski's concept. I suggest
that the midwife be now buried with appropriate honours and the facts of absolute space-time
faced."
As noted above, the Riemann curvature tensor of the spacetime of accelerated frames
characterized by the transformations (4) vanishes, just as in the inertial frames. Thus, the
physical effects related to these accelerated transformations have nothing to do with the
gravitational field.
The physics in non-inertial frames deserves more theoretical and experimental investigation.
Clearly, one cannot be content with understanding physics only in the
usual inertial frames, which are the basic framework for the standard models and particle
physics. The Lorentz and Poincaré transformations are linear and carry the whole arithmetic
spacetime into itself. Accelerated transformations, such as those of Møller and the
Wu transformations, are nonlinear, and they carry only a portion of spacetime in an accelerated
frame F to the whole spacetime in an inertial frame $F_I$. The notion of pseudo-group was
discussed by Veblen and Whitehead in order to deal with the transformations which carry
the whole space into portions of space. A set of transformations is called a pseudo-group
if it satisfies the conditions: (i) If the resultant of two transformations in the set exists, it
is also in the set. (ii) The set contains the inverse of each transformation in the set. Thus,
it is possible that the concept of pseudo-group could become relevant in both the differential
geometry and the physics of general accelerated transformations of spacetime. [46]
In early 1950, Einstein said that he was not sure whether differential geometry was to
be the framework for further progress, but if it was then he believed he was on the right
track. [47] Indeed, in the past 50 years, most researchers in this area have followed Einstein's
approach in investigating the 'gauge theory of gravity', based on Riemannian geometry and a
gauge symmetry group in curved spacetime. [48] However, the difficulties related to the
ultraviolet divergence and the energy-momentum tensor in Einstein's theory have not been
overcome. These problems are probably related to the fact that Einstein's approach to
the gravitational field is characterized by a drastic departure from the grand tradition of all
classical and quantum fields, as stressed by Dyson. [31] It appears that the framework of
Riemannian geometry is too general for field theory in the sense that once a gauge symmetry
is introduced in it, the gauge symmetry, however powerful it may be, cannot harness the
ultraviolet divergence. In contrast, we know that gauge symmetry in Yang-Mills theory can
exercise its full power to harness ultraviolet divergence within the usual framework of local
field theory based on flat spacetime. This appears to be the secret essence for the success
of unified electroweak theory and quantum chromodynamics.
Thus, a burning question is: Is it possible to realize a union of Einstein's theory and
Yang-Mills theory to overcome the divergence difficulties and to understand gravitational
experiments? Two key features for connecting Einstein's theory to experiments and observations
are the field equation (involving a spacetime curvature tensor) and the Einstein-Grossmann
metric $g_{\mu\nu}dx^\mu dx^\nu$, which is essential for the motion of classical objects and light
rays. Similarly, there are also two basic features of Yang-Mills theory; namely, an action in-
volving quadratic gauge curvature with a symmetry group and an underlying flat spacetime.
In view of the divergence difficulty associated with curved spacetime, it is worthwhile to
consider a Yang-Mills gravity [49] which is characterized by: (a) an action with quadratic
gauge curvature with translational gauge symmetry, (b) flat spacetime, and (c) an effective
Einstein-Grossmann metric. This is interesting because the external spacetime translation
gauge symmetry naturally leads to an effective Einstein-Grossmann metric, provided the
external symmetry group is implemented in the action according to the Yang-Mills approach
for internal gauge groups. Furthermore, the Yang-Mills theory has the advantage of con-
necting translation gauge symmetry to the conserved energy-momentum tensor which is the
source of the gravitational field. Table 1 shows a comparison of key features of Yang-Mills
theory, Yang-Mills gravity, gauge theory of gravity and Einstein's theory. [50]
In the bundle language, Einstein's gravitational potential is a metric which determines
the Levi-Civita connection on the tangent (vector) bundle, while Yang-Mills potentials are
connections on principal fiber bundles. Thus, Einstein's gravitational field is not exactly
the same as the Yang-Mills field. In this context, it is worth noting that Ashtekar's approach to
loop quantum gravity actually does formulate Einstein's gravitational theory as an SU(2)
Yang-Mills theory defined on the principal fiber bundle associated to the frame bundle. The
dynamical variable is no longer the metric but the vierbein and a spin connection. [51]
In the past 100 years, the track record of research in the fundamental physics of
particles and fields could perhaps be briefly summarized as follows:
Local quantum fields with gauge symmetry appear to be the most effective key at hand to
unlock the secret of the basic interactions of quantum particles and, hence, the last mystery
of quantum gravity. The concepts of Regge poles, current algebra and superconvergence, once
very hot, turn out to be not viable. Geometrization of all physical fields based on Riemann-Cartan
or Finsler geometry is still a dream, while supersymmetric string theory and loop
quantum gravity are only visions.
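Table 1. A comparison of key features of gauge fields with internal and various external spacetime symmetries

Yang-Mills Theory (Minkowski spacetime)
  $A^a_\mu$, $\Phi$
  [Lie groups (internal): $\delta\Phi_j(x) = -i\,\omega^a(x)(\tau_a)_{jk}\Phi_k(x)$, $\delta A^a_\mu = \frac{1}{f}\partial_\mu\omega^a + K_{abc}\,\omega^b A^c_\mu$]
  group generators: $\tau_a$
  $\Delta_\mu\Phi = (\partial_\mu + i f A^a_\mu\tau_a)\Phi$
  $[\Delta_\mu, \Delta_\nu] = i f F^a_{\mu\nu}\tau_a$, $F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - f K_{abc} A^b_\mu A^c_\nu$
  $L = -\frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} + \ldots$
  maximum coupling: 4-vertex
  [inertial frames; $|\omega^a(x)|$ is small.]

Yang-Mills Gravity (Flat spacetime)
  $\phi_{\mu\nu}$, $\Phi$
  [translation and arbitrary coordinate transformations: $x^\mu \to x^\mu + \Lambda^\mu(x)$]
  $\phi'_{\mu\nu} = \phi_{\mu\nu} - \Lambda^\lambda\partial_\lambda\phi_{\mu\nu} - \phi_{\mu\alpha}\partial_\nu\Lambda^\alpha - \phi_{\alpha\nu}\partial_\mu\Lambda^\alpha$
  group generators: $p_\mu = -iD_\mu$
  $(D_\mu + f\phi_{\mu\nu}D^\nu)\Phi \equiv J_{\mu\alpha}D^\alpha\Phi$, $J_{\mu\alpha} = P_{\mu\alpha} + f\phi_{\mu\alpha}$
  $[J_{\mu\alpha}D^\alpha, J_{\nu\beta}D^\beta] = C_{\mu\nu\lambda}D^\lambda$, $C_{\mu\nu\lambda} = J_{\mu\alpha}D^\alpha J_{\nu\lambda} - J_{\nu\beta}D^\beta J_{\mu\lambda}$
  $L = -(C_{\mu\alpha\beta}C^{\mu\beta\alpha} - C_{\mu\alpha}{}^{\alpha}C^{\mu\beta}{}_{\beta})/(2f^2) + \frac{1}{2}(G^{\mu\nu}\partial_\mu\Phi\,\partial_\nu\Phi - m^2\Phi^2)$
  $G^{\alpha\beta} = P_{\mu\nu}J^{\mu\alpha}J^{\nu\beta}$ = effective metric tensor
  maximum coupling: 4-vertex
  [inertial and non-inertial frames with the metric tensor $P_{\mu\nu}$; $D_\mu$ is the covariant derivative w.r.t. the Levi-Civita connection in flat spacetime.]

Gauge Theory of Gravity (Curved spacetime)
  $B^\mu_i$, $\Phi$
  maximum coupling: $\infty$-vertex
  [assume … = 0, and $\partial_i = h^\mu_i\partial_\mu$, where $h^\mu_i = \delta^\mu_i + fB^\mu_i$; $\eta^{ij}h^\mu_i h^\nu_j = g^{\mu\nu}$.]

Einstein's Theory of Gravity (Curved spacetime)
  $\Gamma^\lambda_{\mu\nu}$, $\Phi$
  maximum coupling: $\infty$-vertex*
  [* in the … term.]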
References
[1] S. G. Benka, Physics Today, Jan. 2005, p. 10.
[2] It is more reasonable to view the creation of the celebrated theory of special relativity as
both a cause and an effect of scientific progress, through the collective effort of Lorentz
(1904), Poincaré (1905), Einstein (1905) and Minkowski (1908).
[3] J. Ishiwara, Einstein Koen-Roku (Tokyo-Tosho, Tokyo, 1977). See also A. Pais, Subtle
is the Lord ... (Oxford Univ. Press, 1982) p. 179.
[4] A. Einstein, paper C in Chapter 1 of this volume.
[5] C. N. Yang, Talking about Physics Research and Teaching (the 5th talk at the
Graduate School of the Chinese University of Science and Technology, Beijing, China,
1986/5/27 - 1986/6/12). "This kind of repeated failure at some seemingly good idea
is, of course, a common experience for all research workers," he said later; see Chen
Ning Yang, Selected Papers 1945-1980 With Commentary (W. H. Freeman and Comp.,
1983) p. 19. Robert Mills, Am. J. Phys. 57, 493 (1989).
[6] It was related to the existence of the third term $\epsilon_{abc} b^b_\mu b^c_\nu$ in the field strength $f^a_{\mu\nu}$.
[7] I. Newton, paper A in Chapter 1 of this volume. Euclid's ELEMENTS started with the
definitions of point, line, etc., followed by 5 postulates, while Newton's PRINCIPIA
started with the definitions of mass, kinetic energy, ..., followed by the 3 laws of
motion.
[8] H. Poincaré, Rend. Circ. Mat. Palermo 21, 129 (1906), "On the Dynamics of the Electron",
Section 9, "Hypothesis Concerning Gravitational Force". Its English translation
in paper B in Chapter 1 is an extract from A. A. Logunov's book On the Articles by
Henri Poincaré "ON THE DYNAMICS OF THE ELECTRON".
[9] The Collected Papers of Albert Einstein (Ed. J. Stachel, Princeton University Press,
1993).
[10] A. Pais, Subtle is the Lord ..., The Science and the Life of Albert Einstein (Oxford
Univ. Press, 1982), p. 212. Pais gave a lucid and detailed discussion of the Einstein-Grossmann
collaboration in Chapter 12 of his book. As far as the final result is concerned,
the Einstein-Grossmann collaboration resembles, perhaps, the discovery of the structure
of DNA in 1953 by a collaboration of physicist Francis Crick and biologist James
Watson. Grossmann's contribution and help to Einstein's theory deserves to be more
widely known. There exists a first page of Einstein's landmark paper "The Foundation
of the General Theory of Relativity", in which he said: "Finally, I want to acknowledge
gratefully my friend, the mathematician Grossmann, whose help not only saved me the
effort of studying the pertinent mathematical literature, but who also helped me in my
search for the field equations of gravitation." See ref. 4. Unfortunately, this page was not
published together with Einstein's landmark paper. It is fitting that since 1975 there
has been a regular International Grossmann Conference on Gravity to commemorate
his contribution.
[11] D. Hilbert, paper C in Chapter 2 of this volume. In this paper, Hilbert already attempted
to unify electromagnetic and gravitational forces in his theory. Also, we note
that the addition of a cosmological term in the Einstein-Hilbert gravitational equation
will upset the elegant and unique invariant action for Einstein's theory of gravity and,
hence, is untenable from the viewpoint of symmetry.
[12] L. Corry, J. Renn and J. Stachel, Science 278, 1270-73 (1997); T. Sauer,
Archive for History of Exact Sciences 53, #6, 529-575 (1999);
echo.mpiwg-berlin.mpg.de/content/relativityrevolution/hilbert
[13] So far, the energy-momentum tensor in Einstein's equation is still not well understood.
[14] E. Cartan, paper D in Chapter 2 of this volume. For a more detailed discussion of
torsion in gauge theory, see J. M. Nester, in An Introduction to Kaluza-Klein Theories
(Ed. H. C. Lee, World Scientific, 1984) pp. 83-115.
[15] J. P. Hsu, Nuovo Cimento 109B, 645 (1994).
[16] F. Dyson, paper G in Chapter 6 of this volume, and Bull. Am. Math. Soc. 78, 635
(1972).
[17] See, for example, B. DeWitt, papers B and C in Chapter 6 of this volume. For a review
of quantum gravity, see E. Alvarez, Rev. Mod. Phys. 61, 561 (1989).
[18] F. Dyson, Sci. Am. 199, 74 (1958).
[19] "The rascal", as Einstein once called Pauli, was so extraordinary that Weyl said: "Pauli
combines in an exemplary way physical insight and mathematical skill." But one also
knew that, in 1925, Pauli did not believe that Kronig's new idea of the electron spin
had any connection with reality, and discouraged Kronig from publishing it. In the
same year, Pauli refused to collaborate with Born to develop matrix mechanics.
[20] C. P. Enz, No Time to be Brief, A scientific biography of Wolfgang Pauli (Oxford
Univ. Press, 2002) pp. 481-482. Aside from Pauli, Ronald Shaw also discussed a possible
generalization of gauge invariance in his Ph.D. thesis (1954, unpublished); see R. Mills,
ref. 5.
[21] This belief resembles Voigt's belief when he published his paper on Doppler effects in
1887. Without having any experimental evidence, Voigt postulated (i) the invariance
of the laws for the propagation of the light wave (in aether) and (ii) the universal
constant of the speed of light to derive a (conformal) 4-dimensional space and time
transformation and the Doppler effects. Such a Voigt transformation differs from the
Lorentz transformation by an overall constant. Based on Doppler effects, Voigt obtained
an approximate relativistic time and presented the first challenge to Newton's absolute
time about 20 years before the discovery of special relativity. See W. Voigt, Nachr.
Ges. Wiss. Goettingen, 41 (1887). For an English translation, see A. Ernst and J. P.
Hsu, Chin. J. Phys. 39, 211 (2001).
[22] Article B of Appendices in this volume.
[25] See, for example, papers in Chapter 7 of this volume. For the gauge theory with the
Poincaré group, see J. M. Nester, in An Introduction to Kaluza-Klein Theories (Ed. H.
C. Lee, World Scientific, 1984) pp. 83-115; F. W. Hehl, P. von der Heyde, G. D. Kerlick
and J. M. Nester, Rev. Mod. Phys. 48, 393 (1976).
[26] J. J. Sakurai, Ann. of Phys. 11, 1 (1960).
[27] P. W. Higgs, Phys. Rev. Lett. 12, 132 (1964). The Higgs mechanism for introducing
masses to gauge bosons is related to 'spontaneous symmetry breaking', which is only
an apparent breaking of symmetry because the essence of gauge symmetry is still
preserved. This mechanism was crucial for Weinberg to construct explicitly a celebrated
lepton model to unify the electromagnetic and weak interactions, and to show the power
of Yang-Mills fields through the astonishing predictions of gauge-boson masses and
others. All Weinberg's predictions were confirmed by experiments later. This is one of
the greatest achievements of 20th century science. Glashow and Salam also had
similar ideas.
[28] B. DeWitt, papers B and C in Chapter 6 of this volume.
[36] J. P. Hsu and L. Hsu, papers D and E in Chapter 5 of this volume. The Wu transformation
is named to honor Ta-You Wu's idea of a kinematic approach to finding an
accelerated transformation within the framework of spacetime with a vanishing Riemann
curvature tensor. The time w in the Wu transformation (4) can be synchronized
by a set of computerized clocks. See J. P. Hsu, ref. 35, pp. 289-290.
[41] M. F. Atiyah and I. M. Singer, Proc. Nat. Acad. Sci. U.S.A. 81 2597-2600 (1984)
[42] A. Fine and D. Fine, Studies in History and Philosophy of Modern Physics, 28B #2
307-323 (1997)
[43] V. Fock, Z. Phys. 39, 226 (1927); F. London, Z. Phys. 42, 375 (1927). For a review of the
history of the idea of gauge symmetry, see Chen-Ning Yang, Phys. Today, June 1980,
pp. 42-49; or in JingShin Theoretical Physics Symposium in Honor of Prof. Ta-You Wu
(Ed. J. P. Hsu and L. Hsu, World Scientific, 1998) pp. 61-71.
[44] V. Fock, The Theory of Space, Time, and Gravitation (Pergamon Press, 1964, 2nd
Revised Edition, transl. by N. Kemmer), p. xvii.
[45] J. L. Synge, Relativity: The General Theory (North-Holland, 1966), Preface.
[46] O. Veblen and J. H. C. Whitehead, The Foundations of Differential Geometry (Cambridge
Univ. Press, 1953) pp. 37-38.
[47] A. Pais, Subtle is the Lord ..., The Science and the Life of Albert Einstein (Oxford Univ.
Press, 1982), p. 467.
[48] See papers C and D in Chapter 4 and those in Chapter 7 of this volume. Y. M. Cho,
Phys. Rev. D14, 3341 (1976).
[49] See paper D in Chapter 8 of this volume.
[50] See papers A in Chapter 4, E in Chapter 7, D in Chapter 8 and B in Chapter 2 of this
volume.
[51] A. Ashtekar, Phys. Rev. D 36, 1587 (1987); A. Ashtekar, Phys. Rev. Lett. 57, 2244
(1986); C. Rovelli and L. Smolin, Nucl. Phys. B 331, 80 (1990).
Chapter 1
THE MATHEMATICAL PRINCIPLES
OF
NATURAL PHILOSOPHY
ISAAC NEWTON
Translated by Andrew Motte
BOOK III.
IN the preceding Books I have laid down the principles of philosophy,
principles not philosophical, but mathematical; such, to wit, as we may
build our reasonings upon in philosophical inquiries. These principles are
the laws and conditions of certain motions, and powers or forces, which
chiefly have respect to philosophy; but, lest they should have appeared of
themselves dry and barren, I have illustrated them here and there with
some philosophical scholiums, giving an account of such things as are of
more general nature, and which philosophy seems chiefly to be founded on;
such as the density and the resistance of bodies, spaces void of all bodies,
and the motion of light and sounds. It remains that, from the same principles,
I now demonstrate the frame of the System of the World. Upon
this subject I had, indeed, composed the third Book in a popular method,
that it might be read by many; but afterward, considering that such as
had not sufficiently entered into the principles could not easily discern the
strength of the consequences, nor lay aside the prejudices to which they had
been many years accustomed, therefore, to prevent the disputes which might
be raised upon such accounts, I chose to reduce the substance of this Book
into the form of Propositions (in the mathematical way), which should be
read by those only who had first made themselves masters of the principles
established in the preceding Books: not that I would advise any one to the
previous study of every Proposition of those Books; for they abound with
such as might cost too much time, even to readers of good mathematical
learning. It is enough if one carefully reads the Definitions, the Laws of
Motion, and the first three Sections of the first Book. He may then pass
on to this Book, and consult such of the remaining Propositions of the
first two Books, as the references in this, and his occasions, shall require.
RULE I.
We are to admit no more causes of natural things than such as are both
true and sufficient to explain their appearances.
To this purpose the philosophers say that Nature does nothing in vain,
and more is in vain when less will serve; for Nature is pleased with simplicity,
and affects not the pomp of superfluous causes.
RULE II.
Therefore to the same natural effects we must, as far as possible, assign
the same causes.
As to respiration in a man and in a beast; the descent of stones in Europe
and in America; the light of our culinary fire and of the sun; the reflec-
tion of light in the earth, and in the planets.
RULE III.
The qualities of bodies, which admit neither intension nor remission of
degrees, and which are found to belong to all bodies within the reach
of our experiments, are to be esteemed the universal qualities of all
bodies whatsoever.
For since the qualities of bodies are only known to us by experiments, we
are to hold for universal all such as universally agree with experiments;
and such as are not liable to diminution can never be quite taken away.
We are certainly not to relinquish the evidence of experiments for the sake
of dreams and vain fictions of our own devising; nor are we to recede from
the analogy of Nature, which uses to be simple, and always consonant to
itself. We no other way know the extension of bodies than by our senses,
nor do these reach it in all bodies; but because we perceive extension in
all that are sensible, therefore we ascribe it universally to all others also.
That abundance of bodies are hard, we learn by experience; and because
the hardness of the whole arises from the hardness of the parts, we therefore
justly infer the hardness of the undivided particles not only of the bodies
we feel but of all others. That all bodies are impenetrable, we gather not
from reason, but from sensation. The bodies which we handle we find impenetrable,
and thence conclude impenetrability to be an universal property
of all bodies whatsoever. That all bodies are moveable, and endowed with
certain powers (which we call the vires inertiæ) of persevering in their motion,
or in their rest, we only infer from the like properties observed in the
PHÆNOMENA, OR APPEARANCES.
PHÆNOMENON I.
That the circumjovial planets, by radii drawn to Jupiter's centre, describe
areas proportional to the times of description; and that their
periodic times, the fixed stars being at rest, are in the sesquiplicate
proportion of their distances from its centre.
This we know from astronomical observations. For the orbits of these
planets differ but insensibly from circles concentric to Jupiter; and their
motions in those circles are found to be uniform. And all astronomers
agree that their periodic times are in the sesquiplicate proportion of the
semi-diameters of their orbits; and so it manifestly appears from the
following table.
The periodic times of the satellites of Jupiter.
1d. 18h. 27′ 34″.    3d. 13h. 13′ 42″.    7d. 3h. 42′ 36″.    16d. 16h. 32′ 9″.
The distances of the satellites from Jupiter's centre.
From the observations of                 1        2        3        4
Townly by the Microm. . . . . . . .      5,52     8,78     …        …
Cassini by the Telescope . . . . . .     …        …        …        …
Cassini by the eclip. of the satel. .    …        …        …        …
From the periodic times . . . . . .      5,667    9,017    …        …
Mr. Pound has determined, by the help of excellent micrometers, the
diameters of Jupiter and the elongation of its satellites after the following
manner. The greatest heliocentric elongation of the fourth satellite from
Jupiter's centre was taken with a micrometer in a 15 feet telescope, and at
the mean distance of Jupiter from the earth was found about 8′ 16″. The
elongation of the third satellite was taken with a micrometer in a telescope
of 123 feet, and at the same distance of Jupiter from the earth was found
4′ 42″. The greatest elongations of the other satellites, at the same distance
of Jupiter from the earth, are found from the periodic times to be 2′
56″ 47‴, and 1′ 51″ 6‴.
The diameter of Jupiter taken with the micrometer in a 123 feet telescope
several times, and reduced to Jupiter's mean distance from the earth,
proved always less than 40″, never less than 38″, generally 39″. This diameter
in shorter telescopes is 40″, or 41″; for Jupiter's light is a little
dilated by the unequal refrangibility of the rays, and this dilatation bears
a less ratio to the diameter of Jupiter in the longer and more perfect telescopes
than in those which are shorter and less perfect. The times in
which two satellites, the first and the third, passed over Jupiter's body, were
observed, from the beginning of the ingress to the beginning of the egress,
and from the complete ingress to the complete egress, with the long telescope.
And from the transit of the first satellite, the diameter of Jupiter
at its mean distance from the earth came forth 37⅛″, and from the transit
of the third 37⅜″. There was observed also the time in which the shadow
of the first satellite passed over Jupiter's body, and thence the diameter of
Jupiter at its mean distance from the earth came out about 37″. Let us
suppose its diameter to be 37¼″ very nearly, and then the greatest elongations
of the first, second, third, and fourth satellite will be respectively
equal to 5,965, 9,494, 15,141, and 26,63 semi-diameters of Jupiter.
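The closing arithmetic of this passage can be retraced with a short sketch. It assumes the elongations are read as 1′ 51″ 6‴, 2′ 56″ 47‴, 4′ 42″ and 8′ 16″ (minutes, seconds and thirds of arc) and the diameter of Jupiter as 37¼″; the function and variable names are illustrative only and are not part of the original text.

```python
# Sketch: convert Pound's angular elongations into semi-diameters of Jupiter.
# The readings below (minutes, seconds, thirds of arc) and the 37.25" diameter
# are assumptions about how the figures in the passage are to be read.

def to_arcseconds(minutes=0, seconds=0, thirds=0):
    """Return an angle in arcseconds; one third is 1/60 of a second of arc."""
    return 60 * minutes + seconds + thirds / 60

JUPITER_DIAMETER = 37.25               # arcseconds, at Jupiter's mean distance
SEMI_DIAMETER = JUPITER_DIAMETER / 2   # arcseconds

elongations = {
    "first": to_arcseconds(1, 51, 6),
    "second": to_arcseconds(2, 56, 47),
    "third": to_arcseconds(4, 42),
    "fourth": to_arcseconds(8, 16),
}

# Each elongation expressed as a multiple of Jupiter's semi-diameter.
in_semi_diameters = {
    name: angle / SEMI_DIAMETER for name, angle in elongations.items()
}

for name, value in in_semi_diameters.items():
    print(f"{name:>6} satellite: {value:.3f} semi-diameters")
```

Under these readings the ratios come out close to the quoted 5,965, 9,494, 15,141 and 26,63, the second differing only in the third decimal place.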
PHÆNOMENON II.
That the circumsaturnal planets, by radii drawn to Saturn's centre, describe
areas proportional to the times of description; and that their
periodic times, the fixed stars being at rest, are in the sesquiplicate
proportion of their distances from its centre.
For, as Cassini from his own observations has determined, their distances
from Saturn's centre and their periodic times are as follow.
The periodic times of the satellites of Saturn.
1d. 21h. 18′ 27″.    2d. 17h. 41′ 22″.    4d. 12h. 25′ 12″.    15d. 22h. 41′ 14″.
79d. 7h. 48′ 00″.
The distances of the satellites from Saturn's centre, in semi-diameters of
its ring.
From observations . . . . . .      1 19/20.   2 1/2.   3 1/2.   8.   24.
From the periodic times . . .      1,93.      2,47.    3,45.    8.   23,35.
The greatest elongation of the fourth satellite from Saturn's centre is
commonly determined from the observations to be eight of those semi-diameters
very nearly. But the greatest elongation of this satellite from
Saturn's centre, when taken with an excellent micrometer in Mr. Huygens'
telescope of 123 feet, appeared to be eight semi-diameters and 7/10 of a semi-diameter.
And from this observation and the periodic times the distances
of the satellites from Saturn's centre in semi-diameters of the ring are 2,1.
2,69. 3,75. 8,7. and 25,35. The diameter of Saturn observed in the same
telescope was found to be to the diameter of the ring as 3 to 7; and the
diameter of the ring, May 25-29, 1719, was found to be 43″; and thence
the diameter of the ring when Saturn is at its mean distance from the
earth is 42″, and the diameter of Saturn 18″. These things appear so in
very long and excellent telescopes, because in such telescopes the apparent
magnitudes of the heavenly bodies bear a greater proportion to the dilatation
of light in the extremities of those bodies than in shorter telescopes.
If we, then, reject all the spurious light, the diameter of Saturn will not
amount to more than 16″.
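The phrase "from the periodic times" can be illustrated with a minimal sketch of the sesquiplicate proportion (period proportional to the 3/2 power of distance), scaled from the fourth satellite. It assumes the periodic times are read as 1d 21h 18′ 27″, 2d 17h 41′ 22″, 4d 12h 25′ 12″, 15d 22h 41′ 14″ and 79d 7h 48′ 00″; the helper names are hypothetical.

```python
# Sketch: distances of Saturn's satellites inferred from their periodic times,
# assuming distance is proportional to (period)^(2/3) and fixing the distance
# of the fourth satellite. The period readings are assumptions about the table.

def to_days(days, hours=0, minutes=0, seconds=0):
    """Convert days, hours, minutes and seconds of time to decimal days."""
    return days + hours / 24 + minutes / 1440 + seconds / 86400

periods = [
    to_days(1, 21, 18, 27),
    to_days(2, 17, 41, 22),
    to_days(4, 12, 25, 12),
    to_days(15, 22, 41, 14),
    to_days(79, 7, 48, 0),
]

def distances_from_periods(periods, ref_index, ref_distance):
    """Scale every distance from one reference satellite via T^(2/3)."""
    ref_period = periods[ref_index]
    return [ref_distance * (t / ref_period) ** (2 / 3) for t in periods]

# Fourth satellite at 8 semi-diameters of the ring, then at 8.7.
from_8 = distances_from_periods(periods, ref_index=3, ref_distance=8.0)
from_8_7 = distances_from_periods(periods, ref_index=3, ref_distance=8.7)

for a, b in zip(from_8, from_8_7):
    print(f"{a:6.2f}  {b:6.2f}")
```

With the fourth satellite placed at 8 semi-diameters the sketch gives values near the tabulated 1,93, 2,47, 3,45, 8 and 23,35; placing it at 8,7 gives values near 2,1, 2,69, 3,75, 8,7 and 25,35, so the second set is the first rescaled by 8,7/8.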
PHÆNOMENON III.
That the five primary planets, Mercury, Venus, Mars, Jupiter, and Saturn,
with their several orbits, encompass the sun.
That Mercury and Venus revolve about the sun, is evident from their
moon-like appearances. When they shine out with a full face, they are, in
respect of us, beyond or above the sun; when they appear half full, they
are about the same height on one side or other of the sun; when horned,
they are below or between us and the sun; and they are sometimes, when
directly under, seen like spots traversing the sun's disk. That Mars surrounds
the sun, is as plain from its full face when near its conjunction with
the sun, and from the gibbous figure which it shews in its quadratures.
And the same thing is demonstrable of Jupiter and Saturn, from their appearing
full in all situations; for the shadows of their satellites that appear
sometimes upon their disks make it plain that the light they shine with is
not their own, but borrowed from the sun.
PHÆNOMENON IV.
That the fixed stars being at rest, the periodic times of the five primary
planets, and (whether of the sun about the earth, or) of the earth about
the sun, are in the sesquiplicate proportion of their mean distances
from the sun.
This proportion, first observed by Kepler, is now received by all astronomers;
for the periodic times are the same, and the dimensions of the orbits
are the same, whether the sun revolves about the earth, or the earth about
the sun. And as to the measures of the periodic times, all astronomers are
agreed about them. But for the dimensions of the orbits, Kepler and Bullialdus,
above all others, have determined them from observations with the
greatest accuracy; and the mean distances corresponding to the periodic
times differ but insensibly from those which they have assigned, and for
the most part fall in between them; as we may see from the following table.
The periodic times with respect to the fixed stars, of the planets and earth
revolving about the sun, in days and decimal parts of a day.
♄             ♃            ♂           ♁           ♀           ☿
10759,275.    4332,514.    686,9785.   365,2565.   224,6176.   87,9692.
The mean distances of the planets and of the earth from the sun.
                                     ♄         ♃         ♂         ♁         ♀        ☿
According to Kepler . . . . . . .    951000.   519650.   152360.   100000.   72400.   38806.
    "     to Bullialdus . . . . .    954195.   522520.   152350.   100000.   72395.   38585.
    "     to the periodic times .    954006.   520096.   152369.   100000.   72333.   38710.
As to Mercury and Venus, there can be no doubt about their distances
from the sun; for they are determined by the elongations of those planets
from the sun; and for the distances of the superior planets, all dispute is
cut off by the eclipses of the satellites of Jupiter. For by those eclipses
the position of the shadow which Jupiter projects is determined ; whence
we have the heliocentric longitude of Jupiter. And from its helio-
centric and geocentric longitudes compared together, we determine its
distance.
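The row of mean distances "according to the periodic times" can be approximately recomputed from the sesquiplicate proportion. The sketch below takes the earth's mean distance as 100000 and uses the periodic times of the table in decimal days, with the Mars period taken as 686.9785 days (an assumed reading of the table entry); all names are illustrative.

```python
# Sketch: mean distances from the sun computed from periodic times, assuming
# distance is proportional to (period)^(2/3) and the earth's distance is 100000.
# The Mars period (686.9785 days) is an assumed reading of the table entry.

EARTH_DISTANCE = 100000

periods_days = {
    "Saturn": 10759.275,
    "Jupiter": 4332.514,
    "Mars": 686.9785,
    "Earth": 365.2565,
    "Venus": 224.6176,
    "Mercury": 87.9692,
}

def mean_distance(period_days):
    """Distance on the scale where the earth's mean distance is 100000."""
    ratio = period_days / periods_days["Earth"]
    return EARTH_DISTANCE * ratio ** (2 / 3)

computed = {name: mean_distance(t) for name, t in periods_days.items()}

for name, distance in computed.items():
    print(f"{name:<8} {distance:10.0f}")
```

The computed figures agree with the tabulated row (954006, 520096, 152369, 100000, 72333, 38710) to within about one part in a thousand, and they fall between or close to the determinations of Kepler and Bullialdus, as the text asserts.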
PHÆNOMENON V.
Then the primary planets, by radii drawn to the earth, describe areas no
wise proportional to the times; but that the areas which they describe
by radii drawn to the sun are proportional to the times of description.
For to the earth they appear sometimes direct, sometimes stationary,
nay, and sometimes retrograde. But from the sun they are always seen
direct, and to proceed with a motion nearly uniform, that is to say, a little
swifter in the perihelion and a little slower in the aphelion distances, so as
to maintain an equality in the description of the areas. This is a noted
proposition among astronomers, and particularly demonstrable in Jupiter,
from the eclipses of his satellites; by the help of which eclipses, as we have
said, the heliocentric longitudes of that planet, and its distances from the
sun, are determined.
PHÆNOMENON VI.
That the moon, by a radius drawn to the earth's centre, describes an area
proportional to the time of description.
This we gather from the apparent motion of the moon, compared with
its apparent diameter. It is true that the motion of the moon is a little
disturbed by the action of the sun: but in laying down these Phænomena
I neglect those small and inconsiderable errors.
PROPOSITIONS.
PROPOSITION I. THEOREM I.
That the forces by which the circumjovial planets are continually drawn
off from rectilinear motions, and retained in their proper orbits, tend
to Jupiter's centre; and are reciprocally as the squares of the distances
of the places of those planets from that centre.
The former part of this Proposition appears from Phæn. I, and Prop.
II or III, Book I; the latter from Phæn. I, and Cor. 6, Prop. IV, of the same
Book.
The same thing we are to understand of the planets which encompass
Saturn, by Phæn. II.
A.A.Logunov
ON THE ARTICLES
BY HENRI POINCARÉ
Translated by G.Pontecorvo
H. Poincaré
THE DYNAMICS
OF THE ELECTRON¹
(23 July 1905)
INTRODUCTION
It would seem at first sight that the aberration of light and the optical and
electrical effects related thereto should afford a means of determining the absolute
motion of the Earth, or rather its motion relative to the ether instead of relative to
the other celestial bodies. An attempt at this was made, indeed, by Fresnel, but
he soon perceived that the Earth's motion does not affect the laws of refraction
and reflection. Similar experiments, such as that using a water-filled telescope, or
any in which only the first-order terms relative to the aberration were considered,
likewise yielded only negative results. The explanation of this was soon found;
but Michelson, who devised an experiment wherein the terms involving the square
of the aberration should be detectable, was equally unsuccessful.
This impossibility of experimentally demonstrating the absolute motion of
the Earth appears to be a general law of Nature; it is reasonable to assume the
existence of this law, which we shall call the relativity postulate, and to assume
that it is universally valid. Whether this postulate, which so far is in agreement
with experiment, be later confirmed or disproved by more accurate tests, it is, in
any case, of interest to see what consequences follow from it.
* We note that the relativity postulate, the postulate of total
impossibility of revealing absolute motion, as formulated in his first
short note «On the dynamics of the electron», was first mentioned
by Poincaré in his report to the Congress of Art and Science held in
Saint Louis in 1904².
In this report Poincaré lists the main principles of theoretical
physics and formulates the relativity principle in accordance with
¹Poincaré H. On the dynamics of the electron // The relativity principle: Collection of works
of the classics of relativity. Leningrad, 1935. P.51-139. Rend. del Circ. Mat. di Palermo 21 (1906)
P.129-176.
²Poincaré H. The present and the future of mathematical physics // The relativity principle: Coll.
of works on special relativity theory. - Moscow. 1973. - P.27-44; Poincaré H. L'état actuel et
l'avenir de la Physique mathématique. - Janvier 1904. - V.28. Ser. 2. - P.302-324 // The
Monist. - 1905. - V.XV. N 1.
which the laws of physical phenomena must be the same for a motionless
observer and for an observer experiencing uniform motion along
a straight line, so that there is no way, and cannot be any way, of
determining whether one experiences such motion or not.
We cannot be satisfied with formulae that are merely placed side by side and
agree only by a lucky chance; these formulae must, as it were, interlock. The
mind will consent only when it sees the reason for the agreement, and when this
agreement even seems to have been predictable.
But the matter may be viewed in a different light, as an analogy will show. Let
us imagine some astronomer before Copernicus, pondering upon the Ptolemaic
system. He would notice that, for every planet, either the epicycle or the deferent
is traversed in the same time. This cannot be due to chance, and there must be
some mysterious bond between all the planets of the system.
Then Copernicus, by a simple change of the coordinate axes which were
supposed fixed, did away with this seeming relationship: every planet described
one circular orbit only, and the periods of revolution became independent of one
another - until Kepler once more established the relationship that had apparently
been destroyed.
Now, there may be an analogy with our problem. If we assume the relativity
postulate, we find a quantity common to the law of gravitation and the laws
of electromagnetism, and this quantity is the velocity of light; and this same
quantity appears in every other force, of whatever origin. There can be only two
explanations.
Either, everything in the universe is of electromagnetic origin; or, this con-
stituent which appears common to all the phenomena of physics has no real
existence, but arises from our methods of measurement. What are these meth-
ods? One might first reply, the bringing into juxtaposition of objects regarded
as invariable solid things; but this is no longer so in our present theory, if the
Lorentz contraction is assumed. In this theory, two lengths are by definition
equal if they are traversed by light in the same time.
Perhaps the abandonment of this definition would suffice to overthrow Lorentz's
theory as decisively as the system of Ptolemy was by the work of Copernicus.
Should this ever happen, it would by no means argue the futility of
Lorentz's analysis: whatever the faults of the Ptolemaic theory, it was the
necessary foundation for Copernicus to build upon.
⁷Einstein A. Collection of scientific papers: in 4 volumes. Ed. I.Ye.Tamm, Ya.A.Smorodinsky,
B.C.Kuznetsov. - Moscow: Nauka, 1965, V.I. P.685-686.
I have therefore not hesitated to publish these incomplete results, even though
at the present time the entire theory may seem to be threatened by the discovery
of cathode rays.
SECTION 9.
HYPOTHESES CONCERNING GRAVITATION
* «Mass has two aspects: these are both inertia and the gravitational
mass, which is involved in Newtonian gravity as a multiplication
factor. If the coefficient of inertia is constant, can the
gravitational mass also be constant? That is the question».
H. Poincaré
The present and future of mathematical physics (1904)¹⁸
Thus, Lorentz's theory would entirely account for the impossibility of demonstrating
absolute motion, provided that all forces were of electromagnetic origin.
But there exist forces, such as gravitation, which cannot be regarded as being
of electromagnetic origin. It may happen that two systems of bodies create
equivalent electromagnetic fields, in the sense of exerting the same action upon
electrified bodies and currents, while at the same time these two systems do not
exert the same gravitational action upon Newtonian masses.
The gravitational field is therefore not identical with the electromagnetic field.
Lorentz was thus compelled to augment his hypothesis by assuming that forces,
of whatever origin, and in particular gravitation, are affected by translation
(or, if one prefers, by the Lorentz transformation) in the same way as the
electromagnetic forces.
We must now examine this hypothesis in detail. If the Newtonian force is
to behave in such a way under the Lorentz transformation, we can no longer
suppose that this force depends only on the relative position of the attracting and
the attracted body at the instant concerned; it must depend also on the velocities
of the two bodies. Moreover, we may reasonably assume that the force acting
upon the attracted body, at an instant t, depends on the position and velocity of
the body at that instant; but it will also depend on the position and velocity of the
attracting body, not at the instant but at some previous instant, as if gravitation
required a certain time for its propagation.
¹⁸Poincaré H. L'avenir de la Physique mathématique // Bulletin des Sciences Mathématiques.
Janvier 1904 - V.28. Ser.2. - P.302-324.
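The idea of a "previous instant" can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the propagation is taken to occur at the velocity of light (set to 1), the attracting body is taken to move uniformly, and the function name `retarded_delay` and its arguments are hypothetical, not notation from the text.

```python
import math


def retarded_delay(rel_pos, v1, c=1.0):
    """Return the delay t <= 0 at which the influence must have left the
    attracting body in order to reach the attracted body at the instant t0.

    rel_pos: position of the attracting body relative to the attracted body
             at the instant t0 (hypothetical input).
    v1:      constant velocity of the attracting body.
    """
    p_dot_v = sum(p * v for p, v in zip(rel_pos, v1))
    p2 = sum(p * p for p in rel_pos)
    v2 = sum(v * v for v in v1)
    if v2 >= c * c:
        raise ValueError("speed of the attracting body must be below c")
    # |rel_pos + v1*t| = -c*t  gives  (c^2 - v2) t^2 - 2 (p.v) t - p2 = 0;
    # the non-positive root is the earlier (retarded) instant.
    a = c * c - v2
    disc = p_dot_v ** 2 + a * p2
    return (p_dot_v - math.sqrt(disc)) / a


# A body at rest at distance 3 acts with a delay of 3 time units.
print(retarded_delay((3.0, 0.0, 0.0), (0.0, 0.0, 0.0)))
```

For a moving attracting body the delay is no longer simply the present distance, which is why the force must involve the earlier position and velocity.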
Let us consider therefore the position of the attracted body at the instant $t_0$,
and let its coordinates at that instant be $\vec r_0 = (x_0, y_0, z_0)$ and the components of its
velocity be $\vec v$, and let us consider the attracting body at the corresponding instant
$t_0 + t$, its coordinates at that instant being $\vec r_0 + \vec r$ and its velocity components $\vec v_1$.
First of all, we must have a relationship
we shall therefore first investigate this case, assuming that these velocities are
constant, and therefore that the two bodies are executing a common uniform
motion of translation in a straight line.
We may assume that the x-axis has been taken to be parallel to this motion
of translation, so that $v_\perp = 0$, and we shall take $\beta = v_\parallel = v_{1\parallel}$.
If, under these conditions, we apply the Lorentz transformation, the two
bodies will be at rest after the transformation, with
Moreover,
and
/s
Fll = -T(X - ~ l l t ) / t, FJ. = -TJ./?~. 3 .
/
(4
which may also be written
It is known that the substitutions forming this group (if $l = 1$) are linear and
such that the quadratic form
$$x^2 + y^2 + z^2 - t^2$$
is invariant. Putting
If
$$x,\ y,\ z,\ t\sqrt{-1},$$
$$\delta x,\ \delta y,\ \delta z,\ \delta t\sqrt{-1},$$
$$\delta_1 x,\ \delta_1 y,\ \delta_1 z,\ \delta_1 t\sqrt{-1},$$
are regarded as the coordinates of three points $P$, $P'$, $P''$ in four-dimensional
space, we see that the Lorentz transformation is simply a rotation of this space
about a fixed origin.
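The statement that the transformation is a rotation can be verified numerically. The following sketch assumes units with the velocity of light equal to 1 and $l = 1$, and a boost along the x-axis only; the function names are illustrative. It checks that the quadratic form is preserved, and that the same result follows from an ordinary rotation formula applied to the coordinates $(x, t\sqrt{-1})$ with an imaginary angle.

```python
import cmath
import math


def boost_x(event, beta):
    """Lorentz boost along x with velocity beta (c = 1, l = 1)."""
    x, y, z, t = event
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (x - beta * t), y, z, gamma * (t - beta * x))


def quadratic_form(event):
    """The invariant x^2 + y^2 + z^2 - t^2."""
    x, y, z, t = event
    return x * x + y * y + z * z - t * t


def boost_as_rotation(event, beta):
    """Same boost, written as a rotation in the (x, t*sqrt(-1)) plane
    through the imaginary angle i*atanh(beta)."""
    x, y, z, t = event
    phi = 1j * math.atanh(beta)
    w = 1j * t  # fourth coordinate t*sqrt(-1)
    x_new = x * cmath.cos(phi) + w * cmath.sin(phi)
    w_new = -x * cmath.sin(phi) + w * cmath.cos(phi)
    return (x_new.real, y, z, (w_new / 1j).real)


event = (1.0, 2.0, 3.0, 4.0)
print(quadratic_form(event), quadratic_form(boost_x(event, 0.6)))
```

Because a rotation preserves distances from the origin, the invariance of the quadratic form is immediate in this picture.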
The only distinct invariants are therefore the six distances of the points $P$, $P'$,
$P''$ from one another and from the origin, or alternatively the two expressions
Let us now consider how the components of the force are transformed. We
return to equations (11) of section 1, which refer not to the force $\vec F$ discussed
here but to the force per unit volume. Putting
$$f'_y = f_y, \qquad f'_z = f_z. \tag{6}$$
Thus, $f_x$, $f_y$, $f_z$, $f_t$ are transformed in the same manner as $x$, $y$, $z$, $t$. The
invariants of the group will therefore be
Evidently
$$\vec F / \vec f = T / f_t = 1/\rho.$$
Thus the Lorentz transformation will act upon $\vec F$, $T$ in the same way as upon
$\vec f$, $f_t$, except that these expressions will in addition be multiplied by
$$\frac{\rho}{\rho'} = \frac{1}{\gamma(1 - \beta v_x)} = \frac{\delta t}{\delta t'}.$$
Likewise, the transformation will act upon $\vec v$, 1 in the same way as upon
$\delta x$, $\delta y$, $\delta z$, $\delta t$, except that these expressions will in addition be multiplied by
the same factor,
$$\frac{\delta t}{\delta t'} = \frac{1}{\gamma(1 - \beta v_x)}.$$
Let us now regard $f_x$, $f_y$, $f_z$, $f_t$ as being the coordinates of a fourth point $Q$;
the invariants will then be functions of the distances between the five points $O$,
$P$, $P'$, $P''$, $Q$ and these functions must be homogeneous of degree zero, firstly
with respect to $f_x$, $f_y$, $f_z$, $f_t$, $\delta x$, $\delta y$, $\delta z$, $\delta t$ (which variables can subsequently be
replaced by $F_x$, $F_y$, $F_z$, $T$, $\vec v$, 1), and secondly with respect to $\delta_1 x$, $\delta_1 y$, $\delta_1 z$, 1
(which variables can subsequently be replaced by $\vec v_1$, 1).
In this way we find, in addition to the four invariants (5), four further and
distinct invariants, namely
work per unit time and force reduced to unit charge - $[\gamma(\vec F\vec v),\ \gamma\vec F]$;
four-component velocity, or four-momentum - $[\gamma,\ \gamma\vec v]$;
charge and current - $[\rho,\ \rho\vec v]$;
scalar and vector potential - $[\varphi,\ \vec A]$
with $\gamma = 1/\sqrt{1 - v^2}$.
The velocity of propagation is then much more rapid than that of light, but in
certain cases might be negative, which, as we have said, seems hardly acceptable.
We shall therefore abide by hypothesis (A).
2. The four invariants (7) must be functions of the invariants (5).
3. When both bodies are at absolute rest, $\vec F$ must have the values given
by Newton's law; when the bodies are at relative rest, the values must be those
given by equations (4).
In the case of absolute rest, the first two invariants (7) must reduce to
According to hypothesis (A), the second and third of the invariants (5) become
We may therefore assume, for example, that the first two invariants ( 5 )
reduce to
- 2 '>
(1 - t: ,)-/(I' + GI).'. -41 - e f / ( r + 1 7 ~ 1 ) .
but other combinations are possible.
It is necessary to choose some combination, and a third equation is also
needed to determine $\vec F$. In making the choice, we shall attempt to remain as
close as possible to Newton's law. Let us then examine the result when the
squares of the velocities $\vec v$, $\vec v_1$, etc., are neglected (and $t = -r$).
The four invariants (5) then become
or, since,
t = -r ' r'= r'l + ClI': T = 7-1 - F C l ,
In the second of these expressions I have written $r_1$ in place of $r$, since $r$ is
multiplied by $v - v_1$ and the square of $v$ is neglected.
Newton's law gives, for these four invariants (7),
If, therefore, we denote the second and third invariants (5) by $A$ and $B$, and
the first three invariants (7) by $M$, $N$ and $P$, Newton's law will be obeyed, to
within terms of the order of the squares of the velocities, by putting
$$M = \frac{1}{B^4}, \qquad N = \frac{A}{B^2}, \qquad P = \frac{A - B}{B^3}. \tag{8}$$
This solution is not unique: if the fourth invariant (5) is denoted by $C$, then
$C - 1$ is of the order of $v^2$, as is $(A - B)^2$.
We may therefore add to the right-hand side of each of the equations (8) a
term consisting of $C - 1$ multiplied by any function of $A$, $B$ and $C$, and a term
consisting of $(A - B)^2$ also multiplied by any function of $A$, $B$ and $C$.
The solution (8) appears the simplest at first sight, but it cannot be accepted.
Since $M$, $N$ and $P$ are functions of $\vec F$ and $T = \vec F\vec v$, these equations yield values
of $\vec F$; but the resulting values may in some cases be imaginary.
In order to avoid this difficulty, we proceed differently, putting
1 1
by analogy with
1
?=dm1
as in the Lorentz substitution.
Then, with the condition $-r = t$, the invariants (5) become
We therefore put
$$\vec F = a\,\vec r/\gamma_0 + b\,\vec v + c\,\vec v_1\,\gamma_1/\gamma_0,$$
When $\vec F$, $T$ are replaced by their values (9), the result is, after multiplication
by $\gamma_0^2$,
$$-Aa - b - Cc = 0. \tag{10}$$
The desired conclusion is that the values of $\vec F$ should remain in accordance
with Newton's law when the squares of the velocities $\vec v$, $\vec v_1$, etc., and the products
of the accelerations and the distances are neglected in comparison with the square
of the velocity of light.
We can take
$$b = 0, \qquad c = -aA/C.$$
To the approximation used,
We must therefore take as the invariant $a$ one which reduces to $-1/r^3$ within
the approximation adopted, that is, $1/B^3$. The equations (9) then become
It is seen, first of all, that the corrected attraction consists of two components,
one parallel to the vector joining the position of the two bodies, and the other
parallel to the velocity of the attracting body.
When we speak of the position or the velocity of the attracting body, we
mean its position or velocity at the instant when the gravitational wave leaves
it; but the position or the velocity of the attracted body means its position or
velocity at the instant when the gravitational wave reaches it, this wave being
assumed to be propagated with the velocity of light.
I believe that it would be premature to attempt to continue the discussion of
these formulae, and I shall therefore confine myself to making a few comments.
1. The solutions (11) are not unique; for the common factor $1/B^3$ may be replaced
by
$$\frac{1}{B^3} + (C - 1)\,f_1(A, B, C) + (A - B)^2 f_2(A, B, C).$$
so that the total force is divisible into three components corresponding to the
three parentheses in equation (12). The first component is somewhat similar to
the mechanical force due to the electric field, the other two to the mechanical
force due to the magnetic field. By virtue of comment 1, I may replace $1/B^3$
in equations (11) by $C/B^3$, so that the components of $\vec F$ are linear functions of the velocity $\vec v$ of
the attracted body, $C$ having been eliminated from the denominator of (11′). This
completes the analogy.
Putting then -
e'= - ( , ( F + r & ) , 12 = -(l[C?l x F], (13)
with $C$ eliminated from the denominator of (11′), we obtain
Doc. 47
ON THE RELATIVITY PRINCIPLE AND THE CONCLUSIONS DRAWN FROM IT
by A. Einstein
[Jahrbuch der Radioaktivität und Elektronik 4 (1907): 411-462]
$$x' = x - vt$$
$$y' = y$$
$$z' = z$$
2.2 - XI = xi - xi = (2 - (1
t, = u1 t, = 0,
v = yt = y r ,
u = 711 + $1 .
u=re %.
Nevertheless, we shall maintain formula (30).
¹In accordance with (1), we thereby also assume a certain restriction with
respect to the values of $\xi = x'$.
and
quantities p, u, 1, L, I, etc., if we limit ourselves to an infinitesimally
short period that is infinitesimally close to the time of relative rest of
S and Σ. Further, we have to replace $t$ by the local time $\sigma$. However,
we must not simply put
and
We thus obtain
and
Doc. 13
Outline of a Generalized Theory of Relativity and of a
Theory of Gravitation
I. Physical Part
by Albert Einstein
The theory expounded in what follows derives from the conviction that the
proportionality between the inertial and the gravitational mass of bodies is an exactly
valid law of nature that must already find expression in the very foundation of
theoretical physics. I already sought to give expression to this conviction in several
earlier papers by seeking to reduce the gravitational mass to the inertial mass;¹ this
endeavor led me to the hypothesis that, from a physical point of view, an (infinitesimally
extended, homogeneous) gravitational field can be completely replaced by a
state of acceleration of the reference system. This hypothesis can be expressed
pictorially in the following way: An observer enclosed in a box can in no way decide
whether the box is at rest in a static gravitational field, or whether it is in accelerated
motion, maintained by forces acting on the box, in a space that is free of gravitational
fields (equivalence hypothesis).
We know the fact that the law of proportionality of inertial and gravitational
mass is satisfied to an extraordinary degree of accuracy from the fundamentally
important investigation by Eötvös,² which is based on the following argument. A
body at rest on the surface of the Earth is acted upon by gravity as well as by the
centrifugal force resulting from Earth's rotation. The first of these forces is
proportional to the gravitational mass, and the second to the inertial mass. Thus, the
direction of the resultant of these two forces, i.e., the direction of the apparent
gravitational force (direction of the plumb) would have to depend on the physical
nature of the body under consideration if the proportionality of the inertial and
gravitational mass were not satisfied. In that case the apparent gravitational forces
acting on parts of a heterogeneous rigid system would, in general, not merge into a
resultant; instead, in general, there would still be a torque associated with the
apparent gravitational forces that would have to make itself noticeable if the system
were suspended from a torsion-free thread. By having established the absence of such
torques with great care, Eötvös proved that, for the bodies that he investigated, the
relationship of the two masses was independent of the nature of the body to such a
degree of exactness that the relative difference in this relationship that might still
exist from one substance to another must be smaller than one twenty-millionth.
The decomposition of radioactive substances occurs with a release of such
significant quantities of energy that the change in the inertial mass of the system that
corresponds to that energy decrease according to the theory of relativity is not very
small relative to the total mass.³ In the case of the decay of radium, for example,
this decrease amounts to one ten-thousandth of the total mass. If these changes of the
inertial mass did not correspond to changes in the gravitational mass, then there
would have to be deviations of the inertial mass from the gravitational mass much
greater than those allowed by Eötvös's experiments. Hence it must be considered very
probable that the identity of the inertial and gravitational mass is exactly satisfied.
For these reasons it seems to me that the equivalence hypothesis, which asserts the
essential physical identity of the gravitational with the inertial mass, possesses a high
degree of probability.⁴
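The force of the radioactivity argument lies in a comparison of two numbers quoted above. As a minimal sketch, using only the two fractions stated in the text (the variable names are illustrative):

```python
from fractions import Fraction

# "smaller than one twenty-millionth": bound on the relative difference
EOTVOS_BOUND = Fraction(1, 20_000_000)

# "one ten-thousandth of the total mass": inertial mass decrease for radium
RADIUM_MASS_DECREASE = Fraction(1, 10_000)

# How many times larger the mass change is than the experimental bound
ratio = RADIUM_MASS_DECREASE / EOTVOS_BOUND
print(f"mass decrease exceeds the bound by a factor of {ratio}")
```

If the energy released carried inertial but not gravitational mass, the resulting discrepancy would exceed the experimental bound by this factor, which is the sense in which the deviations would be "much greater" than allowed.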
From the foregoing, one can already infer that there cannot exist relationships
between the space-time coordinates $x_1, x_2, x_3, x_4$ and the results of measurements
obtainable by means of measuring rods and clocks that would be as simple as those
in the old relativity theory. With regard to time, this has already been found to be true in
the case of the static gravitational field.⁸ The question therefore arises, what is the
physical meaning (measurability in principle) of the coordinates $x_1, x_2, x_3, x_4$.
We note in this connection that $ds$ is to be conceived as the invariant measure
of the distance between two infinitely close space-time points. For that reason, $ds$
must also possess a physical meaning that is independent of the chosen reference
system. We will assume that $ds$ is the naturally measured distance between the two
space-time points, and by this we will understand the following.
The immediate vicinity of the point $(x_1, x_2, x_3, x_4)$ with respect to the coordinate
system is determined by the infinitesimal variables $dx_1, dx_2, dx_3, dx_4$. We assume
that, in their place, new variables $d\xi_1, d\xi_2, d\xi_3, d\xi_4$ are introduced by means of a
linear transformation in such a way that
$$ds^2 = d\xi_1^2 + d\xi_2^2 + d\xi_3^2 - d\xi_4^2.$$
In this transformation the $g_{\mu\nu}$ are to be viewed as constants; the real cone $ds^2 = 0$
appears referred to its principal axes. Then the ordinary theory of relativity holds in
this elementary $d\xi$ system, and the physical meaning of lengths and times shall be
the same in this system as in the ordinary theory of relativity, i.e., $ds^2$ is the square of
the four-dimensional distance between two infinitely close space-time points,
measured by means of a rigid body that is not accelerated in the $d\xi$-system, and by
means of unit measuring rods and clocks at rest relative to it.
From this one sees that, for given $dx_1, dx_2, dx_3, dx_4$, the natural distance that
corresponds to these differentials can be determined only if one knows the quantities
$g_{\mu\nu}$ that determine the gravitational field. This can also be expressed in the
following way: the gravitational field influences the measuring bodies and clocks in
a determinate manner.
From the fundamental equation
one sees that, in order to fix the physical dimensions of the quantities $g_{\mu\nu}$ and $x_\nu$, yet
another stipulation is required. The quantity $ds$ has the dimension of a length.
Likewise, we wish to view the $x_\nu$ ($x_4$ too) as lengths, and thus we do not ascribe any
physical dimension to the quantities $g_{\mu\nu}$.
Given this state of affairs, and in view of the old theory of relativity, it seems
natural to assume that the transformation group we are seeking also includes the
linear transformations. Hence we require that $\Gamma_{\mu\nu}$ be a tensor with respect to
arbitrary linear transformations.
Now it is easy to prove (by carrying out the transformation) the following
theorems:
1. If $\Theta_{\alpha\beta\ldots\lambda}$ is a contravariant tensor of rank n with respect to linear
transformations, then
One can also see from the following argument that this operator is related to the
Laplacian operator. In the theory of relativity (absence of gravitational field) one
would have to set
$$g_{11} = g_{22} = g_{33} = -1, \quad g_{44} = c^2, \quad g_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu;$$
hence
$$\gamma_{11} = \gamma_{22} = \gamma_{33} = -1, \quad \gamma_{44} = \frac{1}{c^2}, \quad \gamma_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu.$$
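The step marked "hence" uses the fact that the contravariant fundamental tensor is the reciprocal (matrix inverse) of the covariant one. A minimal check, assuming the diagonal values just given and an arbitrary illustrative value of c (the helper names are hypothetical):

```python
from fractions import Fraction


def covariant_flat(c):
    """g with diagonal (-1, -1, -1, c^2), zero elsewhere."""
    diag = [Fraction(-1), Fraction(-1), Fraction(-1), Fraction(c) ** 2]
    return [[diag[i] if i == j else Fraction(0) for j in range(4)] for i in range(4)]


def contravariant_flat(c):
    """gamma with diagonal (-1, -1, -1, 1/c^2), zero elsewhere."""
    diag = [Fraction(-1), Fraction(-1), Fraction(-1), 1 / Fraction(c) ** 2]
    return [[diag[i] if i == j else Fraction(0) for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


c = 3  # illustrative value only; any nonzero c works
product = matmul(covariant_flat(c), contravariant_flat(c))
print(product)
```

The product is the unit matrix, confirming that the two sets of values are reciprocal to one another.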
c2
If a gravitational field is present that is sufficiently weak, i.e., if the $g_{\mu\nu}$ and $\gamma_{\mu\nu}$
differ only infinitesimally from the values just given, then one obtains instead of the
expression (a), neglecting the second-order terms,
¹⁴ $\gamma_{\mu\nu}$ is the contravariant tensor reciprocal to $g_{\mu\nu}$ (Part II, §1).
If the field is static and only $g_{44}$ is variable, we thus arrive at the case of the
Newtonian theory of gravitation if we take the expression obtained for the quantity
$\Gamma_{44}$ up to a constant.
Hence one might think that, up to a constant factor, the expression (a) must
already be the generalization of $\Delta\varphi$ that we are seeking. But this would be a
mistake; for alongside this expression, in a generalization of this kind there could also
appear terms that are themselves tensors and that vanish when we neglect the kinds
of terms just indicated. This always occurs when two first derivatives of the $g_{\mu\nu}$ or
$\gamma_{\mu\nu}$ are multiplied by each other. Thus, for example,
solves the problem. The fact that the momentum law is satisfied follows from the
identity
differential quotients.
If the differential equation for $\varphi$ were not yet known, the problem of finding it
would be reduced to that of finding this identity. What is essential for us to realize
is that this identity can be derived if one of the terms occurring in it is known. All
one has to do is to apply repeatedly the product differentiation rule in the forms
$$\frac{\partial}{\partial x_\nu}(uv) = \frac{\partial u}{\partial x_\nu}\,v + \frac{\partial v}{\partial x_\nu}\,u$$
and
$$u\,\frac{\partial v}{\partial x_\nu} = \frac{\partial}{\partial x_\nu}(uv) - \frac{\partial u}{\partial x_\nu}\,v,$$
and then finally to put the terms that are differential quotients on the left side and the
rest of the terms on the right side. For example, if one starts with the first term of
the above identity, one obtains, one after another,
(σ = 1, 2, 3, 4)
is the momentum (or energy) imparted by the gravitational field to the matter per unit
volume. For the energy-momentum law to be satisfied, the differential expressions
$\Gamma_{\mu\nu}$ of the fundamental quantities $\gamma_{\mu\nu}$ that enter the gravitational equations
$$\kappa\,\Theta_{\mu\nu} = \Gamma_{\mu\nu}$$
must be chosen such that
can be rewritten in such a way that it appears as the sum of differential quotients.
On the other hand, we know that the term (a) appears in the expression sought for
$\Gamma_{\mu\nu}$. Hence the identity that is being sought has the following form:
Sum of differential quotients
Thus, the expression for $\Gamma_{\mu\nu}$ that is enclosed between the curly brackets on the
right-hand side is the tensor that is being sought that enters into the gravitational
equations
$$\kappa\,\Theta_{\mu\nu} = \Gamma_{\mu\nu}.$$
To make these equations more comprehensible, we introduce the following
abbreviations:
Likewise, for the sake of brevity, we introduce the following notations for
differential operations carried out on the fundamental tensors y and g:
and
Each of these operators yields again a tensor of the same kind (w. resp. to linear
transformations).
With the application of these abbreviations the identity (12) assumes the form
or also
If we write the conservation law (10) for matter and the conservation law (12a)
for the gravitational field in the form
then one recognizes that the stress-energy tensor $\vartheta_{\mu\nu}$ of the gravitational field enters
the conservation law for the gravitational field in exactly the same way as the tensor
$\Theta_{\mu\nu}$ of the material process enters the conservation law for this process; this is a
noteworthy circumstance considering the difference in the derivation of the two laws.
From equation (12a) follows the expression for the differential tensor entering
into the gravitational equations
$$\Gamma_{\mu\nu} = \Delta_{\mu\nu}(\gamma) - \kappa\,\vartheta_{\mu\nu}. \tag{17}$$
Thus, the gravitational equations (11) are of the form
$$\Delta_{\mu\nu}(\gamma) = \kappa\,(\Theta_{\mu\nu} + \vartheta_{\mu\nu}). \tag{18}$$
(σ = 1, 2, 3, 4)
This shows that the conservation laws hold for the matter and the gravitational
field taken together.
In the foregoing we have given preference to the contravariant tensors, because
the contravariant stress-energy tensor of the flow of incoherent masses can be
expressed in an especially simple manner. However, we can express the fundamental
relations that we have obtained just as simply by using covariant tensors. Instead of
$\Theta_{\mu\nu}$, we must then take $T_{\mu\nu} = \sum_{\alpha\beta} g_{\mu\alpha}\, g_{\nu\beta}\, \Theta_{\alpha\beta}$ as the stress-energy tensor of the
material process. Instead of equation (10), we obtain through term-by-term
reformulation
It follows from this equation and equation (16) that the equations of the gravitational
field can also be written in the form
$$-D_{\mu\nu}(g) = \kappa\,(t_{\mu\nu} + T_{\mu\nu}); \tag{21}$$
these equations can also be derived directly from (18). The equation that corresponds
to (19) reads
II
Mathematical Part
by Marcel Grossmann
The mathematical tools for developing the vector analysis of a gravitational field,
which is characterized by the invariance of the line element
$$ds^2 = \sum_{\mu\nu} g_{\mu\nu}\, dx_\mu\, dx_\nu,$$
derive from Christoffel's fundamental paper on the transformation of quadratic
differential forms. Taking Christoffel's results as their starting point, Ricci and
Levi-Civita² developed their methods of the absolute differential calculus, i.e., a
differential calculus that is independent of the coordinate system, which permit our
giving an invariant form to the differential equations of mathematical physics. But
since the vector analysis of a Euclidean space referred to arbitrary curvilinear
coordinates is formally identical with the vector analysis of an arbitrary manifold
specified by its line element, the extension of the vector-analytical conceptions that
Minkowski, Sommerfeld, Laue, et al. worked out for the theory of relativity in recent
years to the general theory of Einstein's expounded above does not present any
difficulty.
With some practice, the general vector analysis obtained in this way is as simple
to handle as the special vector analysis of three- or four-dimensional Euclidean space;
in fact, the greater generality of its conceptions lends it a clarity that is lacking often
enough in the special case.
The theory of special tensors (§3) has been treated to the full in a paper by
Kottler,³ published while this work was in progress; the treatment is based on the
theory of integral forms, something that is not possible in the general case.
Since more detailed mathematical investigations will have to be done in
connection with Einstein's theory of gravitation, and especially in connection with the
problem of the differential equations of the gravitational field, a systematic
presentation of the general vector analysis might be in order. I have purposely not
employed geometrical aids because, in my opinion, they contribute very little to an
intuitive understanding of the conceptions of vector analysis.
The factor ½ serves to simplify the result but is inconsequential from the point of
view of the theory of invariants.
i.e., the left side of the investigated equation, up to the factor $1/\sqrt{g}$. Thus, if that
equation is divided by $\sqrt{g}$, then its left side represents the σ-component of a
covariant vector, and is, therefore, in fact, covariant. For that reason, the content of
those four equations can also be expressed thus:
The divergence of the (contravariant) stress-energy tensor of the material flow
or of the physical process vanishes.
In the sense of our general vector analysis, the theory of these differential
covariants leads to the differential tensors that are given with a gravitational field.
The complete system of these differential tensors (with respect to arbitrary
transformations) goes back to a covariant differential tensor of fourth rank found by
Riemann¹² and, independently of him, by Christoffel,¹³ which we shall call the
Riemann differential tensor, and which reads as follows:
(45)
$$\{i\rho, lm\} = \sum_k \gamma_{\rho k}\,(ik, lm), \quad \text{or, when solved,} \quad (ik, lm) = \sum_\rho g_{k\rho}\,\{i\rho, lm\}.$$
In general vector analysis, the four-index symbols of the second kind take on the
meaning of the components of a mixed tensor that is covariant of third rank and
contravariant of first rank.¹⁴
The extraordinary importance of these conceptions for the differential geometry¹⁵
of a manifold that is given by its line element makes it a priori probable that these
general differential tensors may also be of importance for the problem of the
differential equations of a gravitational field. To begin with, it is, in fact, possible
to specify a covariant differential tensor of second rank and second order $G_{im}$ that
could enter into those equations, namely,
$$G_{im} = \sum_{kl} \gamma_{kl}\,(ik, lm) = \sum_k \{ik, km\}. \tag{46}$$
It turns out, however, that in the special case of the infinitely weak, static
gravitational field this tensor does not reduce to the expression $\Delta\varphi$. We must
therefore leave open the question to what extent the general theory of the differential
tensors associated with a gravitational field is connected with the problem of the
(47)
The first sum on the right-hand side has the desired form of a sum of differential
quotients and shall be denoted by A, so that we have
We once again integrate by parts in the second sum on the right-hand side. The
identity will then take the form
The first of the sums obtained on the right-hand side can be written as a sum of
differentials and shall be denoted by
(49)
16 The derivation of the identity we are seeking becomes simpler, without affecting the
result, if we put the factor $\sqrt{g}$ inside the differentiation sign.
The first two sums have the form of terms such as we place on the left side of
our identity. We denote them by
The third of the sums appearing on the right has the form of a sum of differential
quotients; if we eliminate ?!k.from it with the help of the above formula (29), this
2%
sum proves to be the quantity A that has already been introduced. Finally, we replace
?!& in the last sum in accord with the same formula. In this way we find
ax,
or
Since i is interchangeable with k, and p with v , we can write the second sum as
Doc. 30
[p. 769] The Foundation of the General Theory of Relativity
by A. Einstein
The theory which is presented in the following pages conceivably constitutes the
farthest-reaching generalization of a theory which, today, is generally called the
theory of relativity; I will call the latter one, in order to distinguish it from the
first named, the special theory of relativity, which I assume to be known. The
generalization of the theory of relativity has been facilitated considerably by
Minkowski, a mathematician who was the first one to recognize the formal
equivalence of space coordinates and the time coordinate, and utilized this in the
construction of the theory. The mathematical tools that are necessary for general
relativity were readily available in the absolute differential calculus, which is based
upon the research on non-Euclidean manifolds by Gauss, Riemann, and Christoffel,
and which has been systematized by Ricci and Levi-Civita and has already been
applied to problems of theoretical physics. In section B of the present paper I
developed all the necessary mathematical tools, which cannot be assumed to be
known to every physicist, and I tried to do it in as simple and transparent a manner
as possible, so that a special study of the mathematical literature is not required for
the understanding of the present paper. Finally, I want to acknowledge gratefully my
friend, the mathematician Grossmann, whose help not only saved me the effort of
studying the pertinent mathematical literature, but who also helped me in my search
for the field equations of gravitation.
$$
\begin{matrix}
-1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & +1
\end{matrix}
\qquad (4)
$$
We shall find hereafter that the choice of such co-ordinates
is, in general, not possible for a finite region.
From the considerations of § 2 and § 3 it follows that
the quantities $g_{\sigma\tau}$ are to be regarded from the physical standpoint
as the quantities which describe the gravitational
field in relation to the chosen system of reference. For, if
we now assume the special theory of relativity to apply to a
certain four-dimensional region with the co-ordinates properly
chosen, then the $g_{\sigma\tau}$ have the values given in (4). A free
material point then moves, relatively to this system, with
uniform motion in a straight line. Then if we introduce new
space-time co-ordinates $x_1, x_2, x_3, x_4$, by means of any substitution
we choose, the $g_{\sigma\tau}$ in this new system will no longer
be constants, but functions of space and time. At the same
time the motion of the free material point will present itself
in the new co-ordinates as a curvilinear non-uniform motion,
and the law of this motion will be independent of the nature
of the moving particle. We shall therefore interpret this
motion as a motion under the influence of a gravitational
field. We thus find the occurrence of a gravitational field
connected with a space-time variability of the $g_{\sigma\tau}$. So, too,
in the general case, when we are no longer able by a suitable
choice of co-ordinates to apply the special theory of relativity
to a finite region, we shall hold fast to the view that the $g_{\sigma\tau}$
describe the gravitational field.
Thus, according to the general theory of relativity, gravitation
occupies an exceptional position with regard to other
forces, particularly the electromagnetic forces, since the ten
functions representing the gravitational field at the same time
define the metrical properties of the space measured.
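The statement that a mere substitution of co-ordinates turns the constant values (4) into functions of space and time can be checked on a small example. The sketch below is only an illustration under stated assumptions: it keeps just one space co-ordinate and the time co-ordinate, uses units in which the velocity of light is 1, and picks the hypothetical substitution x = x' + a t'^2/2, t = t' (a uniformly accelerated frame). The function name and the parameter `a` are not from the text.

```python
# Sketch: transform the constant metric (4) to accelerated co-ordinates.
# Assumptions (not from the text): only (x1, x4) are kept, c = 1, and the
# substitution is x = x' + a*t'**2/2, t = t'.

def transformed_metric(a, t_new):
    """Return the 2x2 metric g'_{sigma tau} in the new co-ordinates at time t_new."""
    eta = [[-1.0, 0.0],
           [0.0, 1.0]]            # restriction of (4) to (x1, x4)
    # jac[m][s] = d(old co-ordinate m) / d(new co-ordinate s)
    jac = [[1.0, a * t_new],      # dx/dx' = 1, dx/dt' = a*t'
           [0.0, 1.0]]            # dt/dx' = 0, dt/dt' = 1
    n = 2
    # Covariant transformation law: g'_{st} = sum_{m,k} jac[m][s] * eta[m][k] * jac[k][t]
    return [[sum(jac[m][s] * eta[m][k] * jac[k][t]
                 for m in range(n) for k in range(n))
             for t in range(n)]
            for s in range(n)]

g_new = transformed_metric(a=0.1, t_new=2.0)
print(g_new)  # the time-time and mixed components now depend on t'
```

At t' = 0 the transformed components coincide with (4); at later times the mixed and time-time components differ from the constant values, which is the "space-time variability" the paragraph above interprets as a gravitational field.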
$$g_{\mu\sigma}\, g^{\nu\sigma} = \delta_\mu^\nu \qquad (16)$$
where the symbol $\delta_\mu^\nu$ denotes 1 or 0, according as $\mu = \nu$ or
$\mu \neq \nu$.
Instead of the above expression for $ds^2$ we may thus write
$$g_{\mu\sigma}\,\delta_\nu^\sigma\, dx_\mu\, dx_\nu$$
or, by (16),
$$g_{\mu\sigma}\, g_{\nu\tau}\, g^{\sigma\tau}\, dx_\mu\, dx_\nu.$$
But, by the multiplication rules of the preceding paragraphs,
the quantities
$$d\xi_\sigma = g_{\mu\sigma}\, dx_\mu$$
form a covariant four-vector, and in fact an arbitrary vector,
since the $dx_\mu$ are arbitrary. By introducing this into our expression
we obtain
$$ds^2 = g^{\sigma\tau}\, d\xi_\sigma\, d\xi_\tau.$$
Since this, with the arbitrary choice of the vector $d\xi_\sigma$, is a
scalar, and $g^{\sigma\tau}$ by its definition is symmetrical in the indices
$\sigma$ and $\tau$, it follows from the results of the preceding paragraph
or
121 = l .
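The two ways of writing ds^2 used above, once with the covariant components and the dx, once with the contravariant components and the lowered differentials, can be compared numerically. The following is a minimal sketch under assumptions of my own: a two-dimensional symmetric metric with arbitrarily chosen entries, and hypothetical names such as `inverse_2x2`, `dx` and `dxi`.

```python
# Sketch: check relation (16) and the two expressions for ds^2 on a 2x2 example.
# The metric entries and displacement components are arbitrary illustrative values.

def inverse_2x2(g):
    """Inverse of a 2x2 matrix; plays the role of the contravariant tensor."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

g = [[2.0, 0.5],
     [0.5, 1.0]]                 # covariant components g_{mu nu}
g_inv = inverse_2x2(g)           # contravariant components g^{mu nu}
dx = [0.3, -0.7]                 # co-ordinate differentials dx_mu

# Relation (16): sum over sigma of g_{mu sigma} g^{nu sigma} is the unit matrix
delta = [[sum(g[m][s] * g_inv[n][s] for s in range(2)) for n in range(2)]
         for m in range(2)]

# Lowered differentials: dxi_sigma = g_{mu sigma} dx_mu
dxi = [sum(g[m][s] * dx[m] for m in range(2)) for s in range(2)]

ds2_from_dx = sum(g[m][n] * dx[m] * dx[n] for m in range(2) for n in range(2))
ds2_from_dxi = sum(g_inv[s][t] * dxi[s] * dxi[t] for s in range(2) for t in range(2))

print(ds2_from_dx, ds2_from_dxi)  # the two values agree up to rounding
```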
$$\delta \int_{P}^{P'} ds = 0 \qquad (20)$$
Carrying out the variation in the usual way, we obtain
from this equation four differential equations which define the
geodetic line; this operation will be inserted here for the sake
of completeness. Let $\lambda$ be a function of the co-ordinates $x_\nu$,
and let this define a family of surfaces which intersect the
required geodetic line as well as all the lines in immediate
proximity to it which are drawn through the points P and P'.
Any such line may then be supposed to be given by expressing
its co-ordinates $x_\nu$ as functions of $\lambda$. Let the symbol $\delta$
indicate the transition from a point of the required geodetic
to the point corresponding to the same $\lambda$ on a neighbouring
line. Then for (20) we may substitute
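But since
$$\delta w = \frac{1}{w}\left\{\frac{1}{2}\frac{\partial g_{\mu\nu}}{\partial x_\sigma}\frac{dx_\mu}{d\lambda}\frac{dx_\nu}{d\lambda}\,\delta x_\sigma + g_{\mu\nu}\frac{dx_\mu}{d\lambda}\,\delta\!\left(\frac{dx_\nu}{d\lambda}\right)\right\}$$
and
$$\delta\!\left(\frac{dx_\nu}{d\lambda}\right) = \frac{d}{d\lambda}(\delta x_\nu),$$
where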
Since the values of $\delta x_\sigma$ are arbitrary, it follows from this that
$$\kappa_\sigma = 0 \qquad (20c)$$
are the equations of the geodetic line.
If ds does not vanish along the geodetic line we may
choose the length of the arc s, measured along the geodetic
line, for the parameter $\lambda$. Then w = 1, and in place of (20c)
we obtain
$$g_{\mu\sigma}\frac{d^2x_\mu}{ds^2} + \frac{\partial g_{\mu\sigma}}{\partial x_\nu}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} - \frac{1}{2}\frac{\partial g_{\mu\nu}}{\partial x_\sigma}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} = 0$$
or, by a mere change of notation,
$$g_{\alpha\sigma}\frac{d^2x_\alpha}{ds^2} + [\mu\nu, \sigma]\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} = 0 \qquad (20d)$$
where, following Christoffel, we have written
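For orientation, the bracket in (20d) is taken here to be the Christoffel symbol of the first kind in its standard form; the display below is a sketch of that standard definition and of the one symmetrization step that connects the two forms of the geodetic equation, not a quotation of the original equation.

```latex
% Standard definition of the Christoffel symbol of the first kind (assumed here)
[\mu\nu, \sigma] = \frac{1}{2}\left(
    \frac{\partial g_{\mu\sigma}}{\partial x_\nu}
  + \frac{\partial g_{\nu\sigma}}{\partial x_\mu}
  - \frac{\partial g_{\mu\nu}}{\partial x_\sigma}\right)

% Symmetrization step: the product dx_mu dx_nu is symmetric in mu and nu, so
\frac{\partial g_{\mu\sigma}}{\partial x_\nu}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds}
  = \frac{1}{2}\left(
      \frac{\partial g_{\mu\sigma}}{\partial x_\nu}
    + \frac{\partial g_{\nu\sigma}}{\partial x_\mu}\right)
    \frac{dx_\mu}{ds}\frac{dx_\nu}{ds}
```

With this step, the second and third terms of the equation preceding (20d) combine into the single bracket term of (20d).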
§ 10. The Formation of Tensors by Differentiation
With the help of the equation of the geodetic line we can
now easily deduce the laws by which new tensors can be
formed from old by differentiation. By this means we are
able for the first time to formulate generally covariant
differential equations. We reach this goal by repeated appli-
cation of the following simple law :-
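If in our continuum a curve is given, the points of which
are specified by the arcual distance s measured from a fixed
point on the curve, and if, further, $\varphi$ is an invariant function
of space, then $d\varphi/ds$ is also an invariant. The proof lies in
this, that ds is an invariant as well as $d\varphi$.
As
$$\frac{d\varphi}{ds} = \frac{\partial\varphi}{\partial x_\mu}\frac{dx_\mu}{ds}$$
therefore
$$\psi = \frac{\partial\varphi}{\partial x_\mu}\frac{dx_\mu}{ds}$$
is also an invariant, and an invariant for all curves starting
from a point of the continuum, that is, for any choice of the
vector $dx_\mu$. Hence it immediately follows that
$$A_\mu = \frac{\partial\varphi}{\partial x_\mu} \qquad (24)$$
is a covariant four-vector, the gradient of $\varphi$. According to our rule,
the differential quotient
$$\chi = \frac{d\psi}{ds}$$
taken on a curve, is similarly an invariant. Inserting the
value of $\psi$, we obtain in the first place
$$\chi = \frac{\partial^2\varphi}{\partial x_\mu\,\partial x_\nu}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} + \frac{\partial\varphi}{\partial x_\mu}\frac{d^2x_\mu}{ds^2}$$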
a
is a tensor. Similarly
w
- 34)
-
ax, ax,,
being the outer product of two vectors, is a tensor. By ad-
dition, there follows the tensor character of
and consequently, from what has already been proved, for any
vector A,.
By means of the extension of the vector, we may easily
define the extension of a covariant tensor of any rank
This operation is a generalization of the extension of a vector.
We restrict ourselves to the case of a tensor of the second
rank, since this suffices to give a clear idea of the law of
formation.
As has already been observed, any covariant tensor of the
second rank can be represented * as the sum of tensors of the
* By outer multiplication of the vector with arbitrary components $A_{11}$, $A_{12}$,
$A_{13}$, $A_{14}$ by the vector with components 1, 0, 0, 0, we produce a tensor with
components
$$
\begin{matrix}
A_{11} & A_{12} & A_{13} & A_{14} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{matrix}
$$
By the addition of four tensors of this type, we obtain the tensor $A_{\mu\nu}$ with any
assigned components.
and
In accordance with (31) and (29), the last term of this expression
may be written
$$A^\alpha_{\mu\sigma} = \frac{\partial A^\alpha_\mu}{\partial x_\sigma} - \{\sigma\mu, \tau\}A^\alpha_\tau + \{\sigma\tau, \alpha\}A^\tau_\mu \qquad (39)$$
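On contracting (38) with respect to the indices $\beta$ and $\sigma$
(inner multiplication by $\delta_\beta^\sigma$), we obtain the vector
$$A^\alpha = \frac{\partial A^{\alpha\beta}}{\partial x_\beta} + \{\beta\gamma, \beta\}A^{\alpha\gamma} + \{\beta\gamma, \alpha\}A^{\gamma\beta}.$$
On account of the symmetry of $\{\beta\gamma, \alpha\}$ with respect to the indices
$\beta$ and $\gamma$, the third term on the right-hand side vanishes,
if $A^{\alpha\beta}$ is, as we will assume, an antisymmetrical tensor. The
second term allows itself to be transformed in accordance
with (29a). Thus we obtain
$$A^\alpha = \frac{1}{\sqrt{-g}}\frac{\partial\left(\sqrt{-g}\,A^{\alpha\beta}\right)}{\partial x_\beta} \qquad (40)$$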
where
$$\sqrt{-g} = 1$$
It must be pointed out that there is only a minimum of
arbitrariness in the choice of these equations. For besides
$G_{\mu\nu}$ there is no tensor of second rank which is formed from
the $g_{\mu\nu}$ and its derivatives, contains no derivatives higher than
second, and is linear in these derivatives.*
These equations, which proceed, by the method of pure
* Properly speaking, this can be affirmed only of the tensor
$$G_{\mu\nu} + \lambda\, g_{\mu\nu}\, g^{\alpha\beta} G_{\alpha\beta}$$
where $\lambda$ is a constant. If, however, we set this tensor = 0, we come back again
to the equations $G_{\mu\nu} = 0$.
But
The terms arising from the last two terms in round brackets
are of different sign, and result from each other (since the denomination
of the summation indices is immaterial) through
interchange of the indices $\mu$ and $\beta$. They cancel each other
in the expression for $\delta H$, because they are multiplied by the
32, ax,
and, consequently,
or *
The third term of this expression cancels with the one arising
from the second term of the field equations (47); using
relation (50), the second term may be written
$$\kappa\left(t_\mu^\sigma - \tfrac{1}{2}\,\delta_\mu^\sigma\, t\right),$$
where $t = t_\alpha^\alpha$. Thus instead of equations (47) we obtain
§ 16. The General Form of the Field Equations of Gravitation
The field equations for matter-free space formulated in
§ 15 are to be compared with the field equation
$$\nabla^2\varphi = 0$$
of Newton's theory. We require the equation corresponding
to Poisson's equation
$$\nabla^2\varphi = 4\pi\kappa\rho,$$
where $\rho$ denotes the density of matter.
The special theory of relativity has led to the conclusion
that inert mass is nothing more or less than energy, which
finds its complete mathematical expression in a symmetrical
tensor of second rank, the energy-tensor. Thus in the
general theory of relativity we must introduce a corresponding
energy-tensor of matter $T_\sigma^\alpha$, which, like the energy-components
$t_\sigma^\alpha$ [equations (49) and (50)] of the gravitational field,
will have mixed character, but will pertain to a symmetrical
covariant tensor.*
The system of equation (51) shows how this energy-tensor
(corresponding to the density $\rho$ in Poisson's equation) is to
be introduced into the field equations of gravitation. For if
we consider a complete system (e.g. the solar system), the
total mass of the system, and therefore its total gravitating
action as well, will depend on the total energy of the system,
and therefore on the ponderable energy together with the
gravitational energy. This will allow itself to be expressed
by introducing into (51), in place of the energy-components
of the gravitational field alone, the sums $t_\mu^\sigma + T_\mu^\sigma$ of the
energy-components of matter and of gravitational field. Thus instead
of (51) we obtain the tensor equation
$$\sqrt{-g} = 1$$
where we have set $T = T_\mu^\mu$ (Laue's scalar). These are the
The first and third terms of the round brackets yield contributions
which cancel one another, as may be seen by
interchanging, in the contribution of the third term, the
summation indices $\alpha$ and $\sigma$ on the one hand, and $\beta$ and $\lambda$
on the other. The second term may be re-modelled by (31),
so that we have
first place
or
D. MATERIAL PHENOMENA
The mathematical aids developed in part B enable us
forthwith to generalize the physical laws of matter (hydrodynamics,
Maxwell's electrodynamics), as they are formulated
in the special theory of relativity, so that they will fit in with
the general theory of relativity. When this is done, the
general principle of relativity does not indeed afford us a
further limitation of possibilities; but it makes us acquainted
with the influence of the gravitational field on all processes,
* On this question cf. D. Hilbert, Nachr. d. K. Gesellsch. d. Wiss. zu
Göttingen, Math.-phys. Klasse, 1915, p. 3.
$$p,\quad \rho,\quad \frac{dx_1}{ds},\quad \frac{dx_2}{ds},\quad \frac{dx_3}{ds},\quad \frac{dx_4}{ds}$$
If the $g_{\mu\nu}$ are also unknown, the equations (53) are
brought in. These are eleven equations for defining the ten
functions $g_{\mu\nu}$, so that these functions appear over-defined.
We must remember, however, that the equations (57a) are
already contained in the equations (53), so that the latter
represent only seven independent equations. There is good
reason for this lack of definition, in that the wide freedom of
the choice of co-ordinates causes the problem to remain
mathematically undefined to such a degree that three of the
functions of space may be chosen at will.*
Let
. (65a)
~p = - UE)
where
E
§ 21. Newton's Theory as a First Approximation
As has already been mentioned more than once, the
special theory of relativity as a special case of the general
theory is characterized by the $g_{\mu\nu}$ having the constant values
(4). From what has already been said, this means complete
neglect of the effects of gravitation. We arrive at a closer
approximation to reality by considering the case where the
$g_{\mu\nu}$ differ from the values of (4) by quantities which are small
compared with 1, and neglecting small quantities of second
and higher order. (First point of view of approximation.)
It is further to be assumed that in the space-time territory
under consideration the $g_{\mu\nu}$ at spatial infinity, with a suitable
choice of co-ordinates, tend toward the values (4); i.e. we are
considering gravitational fields which may be regarded as
generated exclusively by matter in the finite region.
It might be thought that these approximations must lead
us to Newton's theory. But to that end we still need to approximate
the fundamental equations from a second point of
view. We give our attention to the motion of a material
point in accordance with the equations (16). In the case of
the special theory of relativity the components
$$\frac{dx_1}{ds},\quad \frac{dx_2}{ds},\quad \frac{dx_3}{ds}$$
$$v = \sqrt{\left(\frac{dx_1}{dx_4}\right)^2 + \left(\frac{dx_2}{dx_4}\right)^2 + \left(\frac{dx_3}{dx_4}\right)^2}$$
may occur, which is less than the velocity of light in vacuo.
If we restrict ourselves to the case which almost exclusively
offers itself to our experience, of v being small as compared
with the velocity of light, this denotes that the components
$$\frac{dx_1}{ds},\quad \frac{dx_2}{ds},\quad \frac{dx_3}{ds}$$
are to be treated as small quantities, while $dx_4/ds$, to the
second order of small quantities, is equal to one. (Second
point of view of approximation.)
Now we remark that from the first point of view of approximation
the magnitudes $\Gamma^\tau_{\mu\nu}$ are all small magnitudes of
at least the first order. A glance at (46) thus shows that in
this equation, from the second point of view of approximation,
we have to consider only terms for which $\mu = \nu = 4$. Restricting
ourselves to terms of lowest order we first obtain in
place of (46) the equations
$$\frac{d^2x_\tau}{dt^2} = -\frac{1}{2}\frac{\partial g_{44}}{\partial x_\tau} \qquad (\tau = 1, 2, 3). \qquad (67)$$
This is the equation of motion of the material point according
to Newton's theory, in which $\tfrac{1}{2}g_{44}$ plays the part of the
gravitational potential. What is remarkable in this result
is that the component $g_{44}$ of the fundamental tensor alone
defines, to a first approximation, the motion of the material
point.
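Equation (67) can be tried out numerically. The sketch below is an illustration under assumptions that are not in the text: units in which the velocity of light is 1, a single point source whose strength is the hypothetical constant `GM`, and the weak-field form g_44 = 1 - 2GM/r. The gradient is taken by central differences, and the result is compared with the Newtonian inverse-square acceleration.

```python
import math

# Sketch: right-hand side of (67) for an assumed weak static field of a point mass.
GM = 1.0e-3   # hypothetical source strength (c = 1), small so that g_44 stays close to 1

def g44(x, y, z):
    """Assumed weak-field time-time component, g_44 = 1 - 2*GM/r."""
    r = math.sqrt(x * x + y * y + z * z)
    return 1.0 - 2.0 * GM / r

def acceleration(x, y, z, h=1.0e-5):
    """Return -(1/2) * gradient of g_44, computed by central differences."""
    ax = -0.5 * (g44(x + h, y, z) - g44(x - h, y, z)) / (2.0 * h)
    ay = -0.5 * (g44(x, y + h, z) - g44(x, y - h, z)) / (2.0 * h)
    az = -0.5 * (g44(x, y, z + h) - g44(x, y, z - h)) / (2.0 * h)
    return ax, ay, az

ax, ay, az = acceleration(10.0, 0.0, 0.0)
newton = -GM / 10.0 ** 2          # inverse-square acceleration toward the source
print(ax, newton)                 # the two numbers agree to within the differencing error
```

The agreement reflects the remark above that, to a first approximation, the single component g_44 determines the motion.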
We now turn to the field equations (53). Here we
have to take into consideration that the energy-tensor of
matter is almost exclusively defined by the density of
matter in the narrower sense, i.e. by the second term of the
right-hand side of (58) [or, respectively, (58a) or (58b)].
If we form the approximation in question, all the components
vanish with the one exception of $T_{44} = \rho = T$. On the left-hand
side of (53) the second term is a small quantity of
second order; the first yields, to the approximation in
question,
or
$$\gamma = \sqrt{-\frac{g_{44}}{g_{22}}} = 1 - \frac{\alpha}{2r}\left(1 + \frac{x_2^2}{r^2}\right).$$
Carrying out the calculation, this gives
What happens physically is not haphazard; on the contrary, the following two
axioms hold true:
Axiom I (Mie's axiom of the world-function3): The law of what happens physically
is determined by a world-function H that contains the following as arguments:
(l, k = 1, 2, 3, 4)
1 Sitzungsber. d. Berliner Akad. 1914 S. 1030, 1915 S. 778, 799, 831, 844.
2 Ann. d. Phys. 1912, Bd. 37 S. 511, Bd. 39 S. 1, 1913, Bd. 40 S. 1.
3 Mie's world-functions do not contain exactly these arguments; in particular, the use of the
arguments (2) goes back to Born; nevertheless it is precisely the introduction and use of such a
world-function in the Hamiltonian principle which is characteristic of Mie's electrodynamics.
(3)
obviously can appear, where $g^{\mu\nu}$ means the minor determinant of the determinant
of g corresponding to $g_{\mu\nu}$, divided by g.
Axiom II (Axiom of general invariance4): The world-function H is an invariant
with respect to an arbitrary transformation of the world-parameters $w_s$.
Axiom II is the simplest mathematical expression for the requirement that the
coupling of the potentials $g_{\mu\nu}$, $q_s$ is in and of itself completely independent of the
way one chooses to name world-points using world-parameters.
The following mathematical theorem, whose proof I will lay out elsewhere, provides
the Leitmotiv for the construction of my theory.
Theorem I. If J is an invariant with respect to arbitrary transformations of
the four world-parameters which contains n quantities and their derivatives, and if
one forms from
$$\delta \int J \sqrt{g}\, d\omega = 0$$
and then in regard to the four electrodynamic potentials qd the four Lagrangian
differential equations
(5)
4 Mie has already imposed the requirement of orthogonal invariance. In Axiom II above, the
Einsteinian fundamental idea of general invariance is given the simplest expression, albeit
in Einstein the Hamiltonian principle plays only a secondary role and his functions H are certainly
not general invariants nor do they contain the electric potentials.
For brevity we denote the left-hand sides of the equations (4), (5) respectively as
$[\sqrt{g}H]_{\mu\nu}$, $[\sqrt{g}H]^h$.
The equations (4) might be called the basic equations of gravitation, the equations
(5) the basic electrodynamic equations or the generalized Maxwell's equations.
Due to the theorem given above, the equations (5) may be viewed as a
consequence of the equations (4); that is, based on that mathematical proposition,
we can immediately state the claim that in the indicated sense the electrodynamic
phenomena are effects of gravitation. In this realization I discern the simple and
quite surprising solution of the problem of Riemann who was the first to seek the
theoretical relationship between gravitation and light.
In the following, we use the easily proven fact that if $p^j$ (j = 1, 2, 3, 4) denotes
an arbitrary contravariant vector, the expression
a covariant vector.
Furthermore, we present two mathematical theorems, that read as follows:
Theorem II. If J is an invariant depending on $g^{\mu\nu}$, $g^{\mu\nu}_l$, $g^{\mu\nu}_{lk}$, $q_s$, $q_{sk}$, then identically
in all arguments and for every arbitrary contravariant vector $p^s$
wherein
and abbreviate
The proof of (6) falls out easily, because this identity is obviously correct when $p^s$
is a constant vector from which it follows in general due to its invariance.
Theorem III. If J is an invariant depending only on the $g^{\mu\nu}$ and their derivatives,
and, as above, the variational derivative of $\sqrt{g}J$ with respect to $g^{\mu\nu}$ is denoted
$[\sqrt{g}J]_{\mu\nu}$, then the expression, with $h^{\mu\nu}$ understood to be some contravariant tensor,
(7)
in the manner that this equation is fulfilled for all arguments, namely the gP and
their derivatives.
For the proof we consider the integral
$$\int J\sqrt{g}\, d\omega, \qquad d\omega = dw_1\, dw_2\, dw_3\, dw_4$$
which gives
and, due to the manner of construction of the Lagrangian derivative, is also ac-
cordingly
/ (7 - is) pdw = 0
the expression
is a contragradient vector.
If we therefore form the expression
this no longer contains the second derivatives and is therefore of the form
wherein
(9)
(10)
Recalling now the basic equations (4) and (5), it follows from addition of (10)
and (12) that
Now,
C - &awl
a
(Hp'-a 1
- b 1 - c )1 = O .
1
will be a contravariant vector and indeed the latter obviously satisfies the identity
(14) $e^l = H p^l - a^l - b^l - c^l - d^l$
$H = K + L$,
where K denotes the invariant arising from the Riemann tensor (curvature of the
four-dimensional manifold)
Equating to zero the coefficients of ps on the left-hand side yields the equation
or
that is, the derivatives of the electrodynamic potentials $q_s$ appear only in the combination
$$M_{ks} = q_{sk} - q_{ks}.$$
We thereby recognize that with our assumptions, the invariant L beyond the potentials
$g^{\mu\nu}$, $q_s$ depends solely on the components of the skew-symmetric invariant
tensor
$$M = (M_{ks}) = \mathrm{Rot}(q_s);$$
that is, the so-called electromagnetic six-vector. This result, by which the character
of Maxwell's equations is really conditioned, ensues here essentially as a consequence
of general invariance, thus on the basis of Axiom II.
If we set the coefficient of pk on the left-hand side of the identity (15) to zero,
then we obtain, using (16),
($\delta_s^l = 0$, $l \neq s$; $\delta_s^s = 1$);
Due to the formula (21) developed in the following, we see from this in particular
that the electromagnetic energy and with it also the total energy-vector $e^l$ may be
expressed in terms of K alone, so that only the $g^{\mu\nu}$ and their derivatives, but not
the $q_s$ and their derivatives appear therein. If in the expression (18) one goes to
the boundary case
$$g_{\mu\nu} = 0 \ (\mu \neq \nu), \qquad g_{\mu\mu} = 1$$
then the same agrees exactly with that which Mie set out in his electrodynamics:
the Mie electromagnetic energy-tensor is thus nothing but the general
invariant tensor arising through differentiation of the invariant L with
respect to the gravitational potential $g^{\mu\nu}$, a circumstance which first directed
me to the necessarily close relationship between the Einsteinian general theory of
relativity and Mie's electrodynamics and gave me the conviction of the correctness
of the theory developed here.
It remains yet to show directly from the assumption
(20) $H = K + L$
how the generalized Maxwell's equations (5) presented above are a result of the
gravitational equations (4) in the sense given above. Using the notation introduced
above for the variational derivatives with respect to the $g^{\mu\nu}$, the gravitational
equations due to (20) take the form
as follows easily without calculation from the fact that $K_{\mu\nu}$ is, other than $g_{\mu\nu}$, the only
tensor of second rank and K is the only invariant that can be constructed with
only the $g^{\mu\nu}$ and their first and second derivatives $g^{\mu\nu}_k$, $g^{\mu\nu}_{kl}$.
The differential equations of gravity coming about in this way appear to me to be in
agreement with the ambitious theory of general relativity presented by Einstein in
his later treatments.
If we continue in general as above to denote the variational derivatives of $\sqrt{g}J$
with respect to the electrodynamic potential $q_h$ by
$$[\sqrt{g}J]^h = \frac{\partial \sqrt{g}J}{\partial q_h} - \sum_k \frac{\partial}{\partial w_k}\frac{\partial \sqrt{g}J}{\partial q_{hk}}$$
then due to (20) the electrodynamic basic equation takes the form
(22) $[\sqrt{g}L]^h = 0$
Now as K is an invariant depending solely on the $g^{\mu\nu}$ and their derivatives, according
to Theorem III the equality (7) obtains identically, wherein
(23) is =
as
and
(25)
The first term on the right-hand side is, due to (21) and (23), nothing other than
i,. The last term on the right proves to be cancelled by the last term on right in
(25); in fact,
as the expression
dMsv -
aqms - d2q, - a2qs -~d2qm
dwm d ~ , dwsdwm dw,dw, dw,dw,
is symmetric in s , m and the first factor in the summation in (26) comes out skew-
symmetric in s, m.
The equation
(27)
follows immediately from (25); that is, from the gravitational equations (4) follow
indeed the four mutually linearly independent combinations (27) of the electro-
dynamic basic equations (5) and their first derivatives. This is the precise math-
ematical expression of the above claim expressed generally about the character of
electrodynamics as a consequence of gravity.
As L in consequence of our assumptions should not depend on the derivatives of
the $g^{\mu\nu}$, L must be a function of four certain general invariants which correspond
to the special orthogonal invariants given by Mie and of which the two most simple
are these:
and
k ,1
The most simple, and looking at the construction of K, most obvious Ansatz for
L is likewise the one which corresponds to Mie's electrodynamics; namely,
$$L = \alpha Q + f(q)$$
or, more specifically following Mie:
$$L = \alpha Q + \beta q^3,$$
where f(q) denotes any function of q and $\alpha$, $\beta$ denote constants.*
t T h e last t e r m on the right-hand side is q 5 n o t q in the original. (Translator)
As one sees, the few simple assumptions expressed in the Axioms I and II suffice
with appropriate interpretation for the construction of the theory: thereby not
only are our conceptions of space, time and motion transformed from the ground
up in the sense laid forth by Einstein, but I am also of the conviction that through
the basic equations presented here the most intimate, hitherto hidden, processes
within the atom will receive clarification and most particularly it must be generally
possible to trace all physical constants back to mathematical constants - as then
thereby the possibility first edges nearer that in principle a science of the style of
geometry will evolve from physics: certain is the most excellent reputation of the
axiomatic method which, as we see here, takes into its service the powerful instruments
of analysis; namely, variational calculus and invariant theory.
Translator's notes:
1. Hilbert writes in the erudite manner of a well-educated mathematician of the early
20th century. I have endeavored to translate his German into an English preserving
the feel of his style. Insofar as I have succeeded, credit is due to my wife, Heidemarie
Floerke, whose help in sorting through the nuances of Hilbert's German was
invaluable. The extent to which the reader may mistake Hilbert here for an elderly
German academic writing in broken English is my fault entirely.
2. This paper does not appear in Hilbert's collected works.() As Pais notes,() Hilbert's
claim that electrodynamics can be viewed as a consequence of gravity rests on a
misunderstanding. The identities among the gravitational potentials $g_{\mu\nu}$ and the
electrodynamical potentials $q_s$ on which Hilbert bases his claim express the Bianchi
identities, which are automatically true, and of which Hilbert was apparently unaware.
In a subsequent version,(11) Hilbert corrects this and other errors. This
version, which presents verbatim much of the present paper, is included in his collected
works. Presumably, Hilbert was not anxious to have the erroneous early
version more widely read. From a historical perspective, however, it provides a
fascinating glimpse of Hilbert poised, in 1915, to give an axiomatic, geometric foundation
to all of theoretical physics.
In a recent note(1) I showed how, in an Einstein universe with a given $ds^2$, the energy
tensor attached to each volume element of that universe can be defined geometrically;
this is the tensor which, set equal to zero, gives the laws of gravitation in any region
devoid of matter. The definition that I gave makes the curvature of the universe depend
on a certain rotation associated with every closed, infinitesimal contour, and this rotation
was introduced on the basis of the concept of parallel transport of Levi-Civita. This last
concept itself, although it was originally presented using geometrical considerations, is
rather difficult to define in a precise way without calculation. But it is possible, it seems
to me, to show the major significance of it by generalizing the concept even of space; at
the same time that will lead us to geometrical images of material universes physically
richer than our universe, at least as it is usually considered; that will also show us the true
rationale of the fundamental laws governing the energy tensor (law of symmetry, law of
conservation).
Let us restrict ourselves to the case of three dimensions, the generalization to four
dimensions being easy. Imagine a space which, in the immediate neighborhood of each
point, has all the characteristics of Euclidean space. The inhabitants of this space will
know, for example, how to locate points infinitely close to a point A by means of an
orthogonal triple having this point A as origin; but we will suppose further that they have
a law enabling them to orient, in relation to the triple at origin A, every coordinate triple
having its origin A' close to A; in particular that will give a sense for them to say that two
directions, one coming from A and another from A', are parallel. Ultimately, such a
space will be defined by the law of mutual orientation (of a Euclidean nature) of two
triples with origins infinitely close.
A space of the preceding kind is not completely defined by its $ds^2$. The $ds^2$, indeed,
determines only one part of the operation that allows the passage from a triple with origin
A to an infinitely close triple with origin A', namely a translation A-A'; in addition, as
one knows, $ds^2$ being fixed, a rotation can still be defined according to an arbitrary law.
That granted, when one describes a closed, infinitesimal contour starting from point A
and returning there, the divergence between the space considered and Euclidean space
will show itself in the following way. Let us attach a coordinate triple to each point M of
the contour; to pass from the triple attached to M to the triple attached to the infinitely
close point M', one needs to make an infinitesimal translation and rotation whose
components one knows with respect to the moving triple with origin M.
Imagine that this collection of infinitesimal displacements is carried out in a Euclidean
space starting from an initial triple chosen arbitrarily. When the point M of non-
Euclidean space that starts from A returns there after having described the closed contour,
in Euclidean space one will not recover the initial triple, but for that to obtain it will be
necessary to carry out a complementary displacement whose components will be well
defined with respect to the initial triple. This complementary displacement is otherwise
independent of the law whereby one attached a triple to each point M of the contour.
In sum, associated with any infinitesimal closed contour of the given space are an
infinitesimal translation and rotation (on the order of magnitude of the surface area
bounded by the contour), which express the divergence between this space and
Euclidean space. The rotation can be represented by a vector with origin A and the
translation by a couple. One can then prove the following conservation law: If one
considers an infinitesimal volume, the vectors and the couples associated with different
elements of the surface bounding the volume are in equilibrium.
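The claim that the rotation attached to a small closed contour is of the order of the enclosed surface area can be illustrated on the simplest curved example. The sketch below rests on assumptions that go beyond the text: it takes the unit sphere with the parallel transport of Levi-Civita (so there is rotation but no translation), uses a circle of constant colatitude `theta0` as the contour, and integrates the transport equations with a standard Runge-Kutta step. All names are hypothetical.

```python
import math

# Sketch: rotation acquired by a vector carried around a small circle on the unit sphere.
# In the orthonormal frame (e_theta, e_phi) the transport equations along the circle
# of colatitude theta0 are  du/dphi = cos(theta0) * w,  dw/dphi = -cos(theta0) * u.

def transport_around_circle(theta0, steps=4000):
    """Carry the vector (u, w) = (1, 0) once around the circle; return its final components."""
    c = math.cos(theta0)

    def deriv(u, w):
        return c * w, -c * u

    u, w = 1.0, 0.0
    h = 2.0 * math.pi / steps
    for _ in range(steps):          # classical fourth-order Runge-Kutta step
        k1u, k1w = deriv(u, w)
        k2u, k2w = deriv(u + 0.5 * h * k1u, w + 0.5 * h * k1w)
        k3u, k3w = deriv(u + 0.5 * h * k2u, w + 0.5 * h * k2w)
        k4u, k4w = deriv(u + h * k3u, w + h * k3w)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        w += h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return u, w

theta0 = 0.1
u, w = transport_around_circle(theta0)
rotation = math.atan2(w, u) % (2.0 * math.pi)        # angle between final and initial vector
area = 2.0 * math.pi * (1.0 - math.cos(theta0))      # area of the spherical cap inside the contour
print(rotation, area)                                # both are close to pi * theta0**2
```

Because the frame returns to itself after one circuit, the change in the components is the rotation associated with the contour; on the unit sphere it equals the enclosed area, which for a small circle shrinks like the square of its radius.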
Return now to the case where we are given simply $ds^2$. An easy calculation shows that,
among all the laws of mutual orientation of two triples of infinitely close origin
compatible with the given $ds^2$, there is only one for which the translation associated with
an arbitrary, infinitesimal closed contour is null. It is this law which leads to the concept
of parallel displacement of Levi-Civita. The couple in question above disappears, and this
is why the elastic tensor satisfies the law of symmetry.
In the general case where there is a translation associated with any infinitesimal closed
contour, one can say that the given space is different from Euclidean space in two
respects: 1) by a curvature in the sense of Riemann, which results in rotation; 2) by a
torsion, which results in the translation.
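
In modern notation (an assumption of this note, not the formalism used in the text above), the translation and rotation attached to an infinitesimal closed contour are usually encoded in the torsion and curvature 2-forms of a moving frame. A minimal sketch, writing $e^a$ for the coframe and $\omega^a{}_b$ for the connection 1-forms (both symbols are introduced here purely for illustration):

```latex
\begin{align}
T^a &= de^a + \omega^a{}_b \wedge e^b
  && \text{(torsion: translation per unit area of contour)} \\
R^a{}_b &= d\omega^a{}_b + \omega^a{}_c \wedge \omega^c{}_b
  && \text{(curvature: rotation per unit area of contour)}
\end{align}
```

In this language the law of parallel displacement of Levi-Civita described above is the case $T^a = 0$.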
In a space with curvature and torsion, the method of moving triples, as in Euclidean
space, allows one to build a theory of the curvature of curves (and even of surfaces). A
straight line will be characterized by the property of having null (relative) curvature at all
of its points; i.e., of preserving the same direction locally. A straight line is no longer
necessarily the shortest path from one point to another; it is in spaces devoid of torsion;
exceptionally, it can also be in certain special torsion spaces.
A very simple example of this last case is the following. Imagine a space ℰ that
corresponds point by point with a Euclidean space E, the correspondence preserving
distances. The difference between the two spaces will be as follows: two orthogonal
triples originating from two infinitely close points A and A' of ℰ will be parallel when
the corresponding triples of E can result one from the other by a helicoidal displacement
at a given rate in a given sense (right-handed, for example), with the line that connects
their origins as axis. Lines of ℰ then correspond to lines of E: they are still geodesics.
Space ℰ thus defined admits a 6-parameter group of transformations; it would be our
ordinary space seen by observers all of whose perceptions would be twisted.
Mechanically it would correspond to a medium with constant pressure and constant
torsion.
I will add that the preceding considerations which, from the point of view of
mechanics, are connected with the beautiful work of Messrs. E. and F. Cosserat on the
Euclidean action, are also connected with the theory of generalized spaces of H. Weyl
and can themselves be extended.
Chapter 3
extraordinary thing that all three are of the order 1.

This is the justification for an attempt to interpret these constants by means of the
following simple and illuminating picture. The 'radius of curvature' $R$ is interpreted (and is
for that reason given such a name) in the sense of Riemannian geometry: instead of the
infinite Euclidean space, one has thus a closed, finite, Riemannian space, the volume of
which is of the order of magnitude $R^3$. In the same intuitive way the Hubble effect is
interpreted as a Doppler effect. Different interpretations of the Hubble effect have, indeed,
often been attempted, but here the intuitive concept of an expanding space is retained
without modification. The empirical relation $R = cA$ then means that the radius of the
universe increases at a rate which is just of the magnitude of the velocity of light, an
attractive and revealing result. Also d = 1 means that this space, which has been expanding
with the velocity of light ever since that time which we recognize to be most remote in the
history of the universe, must once have been very small.

But what is meant by the fact that the last of the three dimensionless constants mentioned
is of the order of magnitude unity? This relation appears already in the well-known model of
the universe which was Einstein's bold attempt, in framing the general theory of relativity,
to realize the idea of a closed, Riemannian, physical space. Indeed, if $M$ is understood to be
the total mass of the universe (which is then of the order of magnitude $\rho R^3$) and $cA$ is
replaced as previously by $R$, the equation in question can be written in the form
$kM \sim R$; and Einstein had previously obtained the relation $kM = 4\pi^2 R$, for a closed
space with a time-independent radius $R$, on the basis of the relativistic theory of
gravitation.

From a consideration of the Hubble effect, however, the concept of a growing universe has
been reached: and then the empirical relation $kM \sim R$ gives a disturbing conclusion. If $R$
is not fixed, but is always growing, then neither can $kM$ be regarded any longer as
invariable; and, of the two factors, the gravitational constant $k$, and the mass $M$ of the
universe, at least one must vary likewise with time. An important contribution to the
elucidation of this situation is contained in an observation by A. Haas: the relation
$kM \sim R$ can also be written in the form $fM^2/R \sim Mc^2$, which means that the negative
potential energy of gravitation for the whole universe is equal to the sum of the
rest-energies of the masses of the stars. This provides a surprising solution of the problem
of the universe: it is possible for the total energy of the universe to have the value zero
exactly, through the cancelling out of the positive and negative contributions to the energy.
The relation $kM \sim R$ would then appear as a direct consequence of the conservation of
energy, which requires that the evolving universe should continue in a sequence of states the
total energy of which always has the value zero.

To proceed further, another dimensionless constant is required; those which can be produced
from the six cosmological quantities $c$, $k$, $A$, $\rho$, $\alpha$ and $R$ are exhausted by the
three dimensionless constants already mentioned. More are obtained if the cosmological
quantities are compared with those derived from microphysics. Hitherto the radius of the
universe $R$ and the mass of the universe $M \sim \rho R^3$ have been expressed in centimetres
and grams; now they will be expressed in terms of the fundamental units of microphysics. The
choice of these units is, of course, made uncertain by the appearance of the unexplained
dimensionless constants of microphysics: it is questionable whether one should take, as the
natural unit of mass, the mass of the proton, or that of the meson or electron, and whether
the natural unit of length should be Bohr's radius of the hydrogen atom, or the electronic
radius $e^2/m_e c^2$, the ratio of which is essentially the square of the fine-structure
constant. However, the dimensional ratios with which one has to do are so enormously large
that these distinctions are of little consequence; moreover, there is sufficient reason to
give preference to the mass $m_x$ of the meson, and to the electronic radius or 'elementary
length' $l = 2 \times 10^{-13}$ cm. If the ratio $M : m_x$, which is then of the order of
magnitude of the number of elementary particles in the whole Universe, is formed, then a
number of the colossal magnitude $10^{80}$ is obtained. Eddington had the boldness to affirm
that this number should not be simply accepted as something incapable or without need of
explanation; he sought, by means of some curious considerations, to give theoretical grounds
for supposing that the number of protons in the universe must have the value 2C The true
solution would appear to lie in another direction: an idea which Dirac has put forward in
quite a different connexion will now be introduced.

The ratio $R/l$ is of the same order of magnitude, $10^{40}$, already encountered in the
comparison of the gravitational and electromagnetic attractions between two elementary
particles; if these two dimensionless numbers are divided, therefore, a new dimensionless
quantity of the gratifying order of magnitude 1 is obtained. The ratio of the two attractive
forces is thereby compared, however, with a number which, as is already known, is not
constant: the ratio $R/l$ increases proportionally to the age of the universe; it is, in fact,
equal to the age of the universe, expressed in terms of the 'elementary time'
$\tau = l/c \approx 10^{-23}$ sec. Also, from the significant discovery of a quotient of the
order of magnitude unity, it must be concluded, with Dirac, that the constant of gravitation
is in reality not a constant, but inversely proportional to the age $A$ of the universe.

What is now the obstacle to applying this idea also to the ratio $M/m_x$, and asserting that
the number of elementary particles in the universe obviously increases as the square of the
age of the universe? Dirac, who was led away from this application by a special train of
thought (concerning the formulation of a cosmology with an infinite Euclidean space and an
infinite total mass of the universe), was probably influenced by a fear of contradicting the
principle of conservation of energy. However, the foregoing consideration has cleared the way
in this respect: with the perception that $kM \sim R$ expresses the conservation of energy,
the complete harmony of all the statements concerning the proportionality of $k$ to $A^{-1}$,
$M$ to $A^2$, and $R$ to $A$ is attained.

It is, therefore, accepted that there is a continual creation of matter in the space of the
universe, and the question arises, how and where this generation of matter, connected with
the growth of the universe, occurs. To answer this, the individual stars, instead of the
universe as a whole, are now taken as the subject for careful consideration in the light of
dimensional analysis. The mass of the sun amounts to $2 \times 10^{33}$ gm.; there are
certainly many stars with still smaller masses, and an established lower
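
The orders of magnitude quoted in the passage above can be checked with a few lines of arithmetic. The sketch below is not part of the original article: it assumes modern CGS values for the constants, takes the 'elementary length' as 2 × 10⁻¹³ cm as in the text, compares the forces between a proton and an electron (one possible reading of "two elementary particles"), and uses an illustrative age of the universe of 10¹⁰ years chosen only to show the scales involved.

```python
# Order-of-magnitude check of the large dimensionless numbers discussed above.
# All numerical inputs are assumptions (modern CGS values), not taken from the source.

C = 2.998e10            # speed of light, cm/s
G = 6.674e-8            # Newtonian gravitational constant, cm^3 g^-1 s^-2
E_CHARGE = 4.803e-10    # elementary charge, esu
M_PROTON = 1.673e-24    # proton mass, g
M_ELECTRON = 9.109e-28  # electron mass, g

ELEMENTARY_LENGTH = 2e-13        # cm, the 'elementary length' l of the text
AGE_SECONDS = 1e10 * 3.156e7     # illustrative age A: 10^10 years, in seconds

# 'Elementary time' tau = l / c
tau = ELEMENTARY_LENGTH / C

# Radius of the universe from the empirical relation R = c A
radius = C * AGE_SECONDS

# Ratio R / l, which equals the age expressed in elementary times
radius_over_length = radius / ELEMENTARY_LENGTH

# Ratio of electric to gravitational attraction between a proton and an electron
force_ratio = E_CHARGE**2 / (G * M_PROTON * M_ELECTRON)

print(f"tau = {tau:.2e} s")
print(f"R/l = {radius_over_length:.2e}")
print(f"electric/gravitational force ratio = {force_ratio:.2e}")
```

With these assumed inputs both large numbers lie within a factor of ten of 10⁴⁰, which is the coincidence the argument rests on; a different choice of particles or of the age shifts the values but not the scale.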
Abstract
The metric of the Jordan theory can be defined through the postulate
that point-masses move along geodesics. This postulate is equivalent to
another: the Compton wavelength of the elementary particles provides a
natural length-scale. The gravitational constant is defined as the ratio
of gravitational to inertial mass. Further, there generally follows from
the theory a vacuum dielectric constant $\varepsilon_0 = 1/\mu_0$ whose dependence
on $x$ depends on the choice of the exponent $\eta$ introduced by Jordan:
$\varepsilon_0 = x^{1+1/\eta}$ ($\eta \neq 0$).
Introduction
It is well known to be possible to formally combine the equations of the gravitational
field and the electromagnetic field by interpreting them as describing a five-dimensional
projective space¹. So that the correct number of field equations still obtains, one must
normalize the metric components through the side condition
$J = g_{\mu\nu} X^{\mu} X^{\nu} = 1,$ (1)
where $X^{\nu}$ denotes the 5 homogeneous coordinates.
P. Jordan² has suggested extending the theory by dropping the side condition (1) and
introducing $J$ as a variable scalar field in the theory. Jordan assumes that the field
equations follow from the 5-dimensional variational principle
$\delta \int J^{\alpha} \left( R - \lambda\, J_{|\mu} J^{|\mu} / J^{2} \right) \sqrt{G}\; d^{5}X.$ (2)
(cf. P.J. 27 (22)). Here $\eta = \alpha + 1/2$. Jordan chooses for $\eta$ the value 1. As
we do not wish to do that it is necessary, in the case $\eta \neq 0$, to set
with
The role of Mach's principle in physics is discussed in relation to the equivalence principle. The difficulties
encountered in attempting to incorporate Mach's principle into general relativity are discussed. A modified
relativistic theory of gravitation, apparently compatible with Mach's principle, is developed.
INTRODUCTION small mass, its effect on the metric is minor and can be
considered in the weak-field approximation. The ob-
IT is interesting that only two ideas concerning the
nature of space have dominated our thinking since
the time of Descartes. According to one of these pic-
server would, according to general relativity, observe
normal behavior of his apparatus in accordance with the
usual laws of physics. However, also according to general
tures, space is an absolute physical structure with
relativity, the experimenter could set his laboratory ro-
properties of its own. This picture can be traced from
tating by leaning out a window and firing his 22-caliber
Descartes vortices through the absolute space of
rifle tangentially. Thereafter the delicate gyroscope in
Newton,2 to the ether theories of the 19th century.
the laboratory would continue to point in a direction
The contrary view that the geometrical and inertial
nearly fixed relative to the direction of motion of the
properties of space are meaningless for an empty space,
rapidly receding bullet. The gyroscope would rotate
that the physical properties of space have their origin
relative to the walls of the laboratory. Thus, from the
in the matter contained therein, and that the only
meaningful motion of a particle is motion relative to point of view of Mach, the tiny, almost massless, very
distant bullet seems to be more important that the
other matter in the universe has never found its com-
plete expression in a physical theory. This picture is massive, nearby walls of the laboratory in determining
inertial coordinate frames and the orientation of the
also old and can be traced from the writings of Bishop
Berkeley3 to those of Ernst Mach.4 These ideas have gyroscope.6It is clear that what is being described here is
found a limited expression in general relativity, but it more nearly an absolute space in the sense of Newton
rather than a physical space in the sense of Berkeley
must be admitted that, although in general relativity
spatial geometries are affected by mass distributions, and Mach.
the geometry is not uniquely specified by the distribu- The above example poses a problem for us. Ap-
tion. I t has not yet been possible to specify boundary parently, we may assume one of a t least three things:
conditions on the field equations of general relativity 1. that physical space has intrinsic geometrical and
which would bring the theory into accord with Machs inertial properties beyond those derived from the matter
principle. Such boundary conditions would, among other contained therein ;
things, eliminate all solutions without mass present. 2. that the above example may be excluded as non-
It is necessary to remark that, according to the ideas physical by some presently unknown boundary condi-
of Mach, the inertial forces observed locally in an ac- tion on the equations of general relativity.
celerated laboratory may be interpreted as gravitational 3. that the above physical situation is not correctly
effects having their origin in distant matter accelerated described by the equations of general relativity.
relative to the laboratory. The imperfect expression
of this idea in general relativity can be seen by consider- These various alternatives have been discussed pre-
ing the case of a space empty except for a lone experi- viously. Objections to the first possibility are mainly
menter in his laboratory. Using the traditional, asymp- philosophical and, as stated previously, go back to the
totically Minkowskian coordinate system fixed relative time of Bishop Berkeley. A common inheritance of all
to the laboratory, and assuming a normal laboratory of present-day physicists from Einstein is an appreciation
for the concept of relativity of motion.
* Supported in part by research contracts with the U. S. Atomic As the universe is observed to be nonuniform, it
Energy Commission and the Office of Naval Research. would appear to be difficult to specify boundary condi-
t National Science Foundation Fellow; now at Loyola Uni- tions which would have the effect of prohibiting un-
versity, New Orleans, Louisiana.
E. T. Whittaker, History of the Theories of Aellter and Elec- suitable mass distributions relative to the laboratory
tricity (Thomas Nelson and Sons, New York, 1951). arbitrarily placed; for could not a laboratory be built
I. Newton, Principia Mathemalica Philosophiae Naturalis
(1686) (reprinted by University of California Press, Berkeley, near a massive star? Should not the presence of this
California, 1934). massive star contribute to the inertial reaction?
8 G. Berkeley, The Principles of Hunzan Knowledge, paragraphs
111-117, 1710-De Motu (1726). The difficulty is brought into sharper focus by con-
E. Mach, Conservation of Energy, note No. 1, 1872 (reprinted
by Open Court Publishing Com any, LaSalle, Illinois, 1911), and 6Because of the Thirring-Lense effect, [H. Thirring and J.
The Science of Mechanics, 1883 {eprinted by Open Court Publish- Lense, Phys. Zeits. 19, 156 (1918)], the rotating laboratory would
ing Company, LaSalle, Illinois, 1902), Chap. 11, Sec. VI. have a weak effect on the axis of the gyroscope.
925
143
sidering the laws of physics, including their quantitative due to the presence of distant accelerated matter.7
aspects, inside a static massive spherical shell. I t is This interpretation of the inertial reaction carries with
well known that the interior Schwarzschild solution is it an interesting implication. Consider a test body falling
flat and can be expressed in a coordinate system toward the sun. I n a coordinate system so chosen that
Minkowskian in the interior. Also, according to general the object is not accelerating, the gravitational pull of
relativity all Minkowskian coordinate systems are the sun may be considered as balanced by another
equivalent and the mass and radius of the spherical gravitational pull, the inertial reaction.8 Note that the
shell have no discernible effects upon the laws of physics balance is not disturbed by a doubling of all gravita-
as they are observed in the interior. Apparently the tional forces. Thus the acceleration is determined by the
spherical shell does not contribute in any discernible mass distribution in the universe, but is independent
way to inertial effects in the interior. What would of the strength of gravitational interactions. Designating
happen if the mass of the shell were decreased, or its the mass of the sun by m, and its distance by r enables
radius increased without limit? I t might be remarked the acceleration to be expressed according to Newton
also that Komare has attempted, without success, to as a=Gm,/r* or, from dimensional arguments, in terms
find suitable boundary- and initial-value conditions for of the mass distribution as a-mRc2/Mr2. Combining
general relativity which would bring into evidence the two expressions gives Eq. (1).
Machs principle. This relation has significance in a rough order-of-
The third alternative is the subject of this paper. magnitude manner only, but it suggests that either the
Actually the objectives of this paper are more limited ratio of M to R should be fixed by the theory, or alter-
than the formulation of a theory in complete accord natively that the gravitational constant observed locally
with Machs principle. Such a program would consist of should be variable and determined by the mass distribu-
two parts, the formulation of a suitable field theory tion about the point in question. The first of these two
and the formulation of suitable boundary- and initial- alternatives is of course, in part, simply the limitation
value conditions for the theory which would make the of mass distribution which it might be hoped would
space geometry depend uniquely upon the matter result from some boundary condition on the field equa-
distribution. This latter part of the problem is treated tions of general relativity. The second alternative is
only partially. not compatible with the strong principle of equiva-
At the end of the last section we shall briefly return 1enceO and general relativity. The reasons for this will
again to the problem of the rotating laboratory. be discussed below.
A principle as sweeping as that of Mach, having its If the inertial reaction may be interpreted as a gravi-
origins in matters of philosophy, can be described in tational force due to distant accelerated matter, it
the absence of a theory in a qualitative way only. A might be expected that the locally observed values of
model of a theory incorporating elements of Machs the inertial masses of particles would depend upon the
principle has been given by Sciama. From simple distribution of matter about the point in question. It
dimensional argumentss-9 as well as the discussion of should be noted, however, that there is a fundamental
Sciama, it has appeared that, with the assumption of ambiguity in a statement of this type, for there is no
validity of Machs principle, the gravitational constant direct way in which the mass of a particle such as an
G is related to the mass distribution in a uniform electron can be compared with that of another a t a
expanding universe in the following way : different space-time point. Mass ratios can be compared
at different points, but not masses. On the other hand,
G M / R c Z1.~ (1) gravitation provides another characteristic mass
Here M stands for the finite mass of the visible (i.e.,
(Ac/G)~=Z.l6XlO+ g,
causally related) universe, and R stands for the radius
of the boundary of the visible universe. and the mass ratio, the dimensionless number
The physical ideas behind Eq. (1) have been given
in references 7-9 and can be summarized easily. As m ( ~ / ~ z1c ) s x 10-23, (3)
stated before, according to Machs principle the only
meaningful motion is that relative to the rest of the provides an unambiguous measure of the mass of an
matter in the universe, and the inertial reaction experi- electron which can be compared at different space-
enced in a laboratory accelerated relative to the distant time points.
matter of the universe may be interpreted equivalently I t should also be remarked that statements such as
A and c are the same at all space-time points are in
as a gravitational force acting on a fixed laboratory
the same way meaningless within the same context
A. Komar. Ph.D. thesis, Princeton Universitv.
r .
1956 until a method of measurement is prescribed. I n fact,
(un ublished). it should be noted that h and c may be defined to be
b. W. Sciama, Monthly Notices Roy. Astron. SOC. 113, 34
(1953); The Unity of the Universe (Doubleday & Company, Inc., constant. A set of physical constants may be defined
New York, 1959), Cha s. 7 9. as constant if they cannot be combined to form one or
K. H. Dicke, Am. &e&t 47, 25 (1959).
R. H. Dicke, Science 129, 621 (1959). lK. H. Dicke, Am. J. Phys. 29, 344 (1960).
1 44
M A C H S P I< I N C I P L E 927
more dimensionless numbers. The necessity for this same physical situation, the formal structure of the
limitation is obvious, for a dimensionless number is theory would be very different for the two cases. Thus,
invariant under a transformation of units and the ques- for example, i t can be easily shown that uncharged
tion of the constancy of such dimensionless numbers is spinless particles whose masses are position dependent
to be settled, not by definition, but by measurements. no longer move on geodesics of the metric. (See Ap-
A set of such independent physical constants which are pendix I.) Thus, the definition of the metric tensor is
constant by definition is complete if it is impossible different for the two cases. The two metric tensors are
to include another without generating dimensionless connected by a conformal transformation.
numbers. The arbitrariness in the metric tensor which results
It should be noted that if the number, Eq. (3), from the indefiniteness in the choice of units of measure
should vary with position and h and G are defined as raises questions about the physical significance of Rie-
constant, then either m or G, or both, could vary with mannian geometry in relativity.12 I n particular the 14
position. There is no fundamental difference between invariants which characterize the space are generally
the alternatives of constant mass or constant G. How- not invariant under a conformal transformation inter-
ever, one or the other may be more convenient, for the preted as a redefinition of the metric tensor in the same
formal structure of the theory would, in a superficial space.13 Matters are even worse, for a more general
way, be quite different for the two cases. redefinition of the units of measure can be used to re-
T o return to Eq. (3), the odd size of this dimension- duce all 14 invariants to zero. It should be said that
less number has often been noticed as well as its ap- these remarks should not be interpreted as casting
parent relation to the large dimensionless numbers of doubt on the correctness or usefulness of Riemannian
astrophysics. The apparent relation of the square of the geometry in relativity, but rather that each such
reciprocal of this number [Eq. (3)] to the age of the geometry is but a particular representation of the theory.
universe expressed as a dimensionless number in atomic It would be expected that the physical content of the
time units and the square root of the mass of the visible theory should be contained in the invariants of the group
portion of the universe expressed in proton mass units of position-dependent transformations of units and co-
suggested to Dirac a causal connection that would lead ordinate transformations. The usual invariants of
to the value of Eq. (3) changing with time. The signifi- Riemannian geometry are not invariants under this
cance of Diracs hypothesis from the standpoint of wider group.
Machs principle has been discussed.8 I n general relativity the representation is one in
Dirac postulated a detailed cosmological model based which units are chosen so that atoms are described as
on these numerical coincidences. This has been criti- having physical properties independent of location. I t
cized on the grounds that it goes well beyond the empiri- is assumed that this choice is possible!
cal data upon which it is based.* Also in another publi- I n accordance with the above, a particular choice of
cation by one of us (R. H. D.), it will be shown that it units is made with the realization that the choice is
gives results not in accord with astrophysical observa- arbitrary and without an invariant significance. The
tions examined in the light of modern stellar evolution- theoretical structure appears to be simpler if one de-
ary theory. fines the inertial masses of elementary particles to be
On the other hand, it should be noted that a large constant and permits the gravitational constant to vary.
dimensionless physical constant such as the reciprocal I t should be noted that this is possible only if the mass
of Eq. (3) must be regarded as either determined by ratios of elementary particles are constant. There may
nature in a completely capricious fashion or else as re- be reasonable doubt about this?JO On the other hand,
lated to some other large number derived from nature. it would be expected that such quantities as particle
I n any case, it seems unreasonable to attempt to derive mass ratios or the fine-structure constant, if they
a number like loz3from theory as a purely mathematical depend upon mass distributions in the universe, would
number involving factors such as k / 3 . be much less sensitive in their dependence9 rather than
It is concluded therefore, that although the detailed the number given by Eq. (3) and their variation could
structure of Diracs cosmology cannot be justified by be neglected in a first crude theory. Also it should be
the weak empirical evidence on which it is based, the remarked that the requirements of the approximate
more general conclusion that the number [Eq. (3)] constancy of the ratio of inertial to passive gravitational
varies with time has a more solid basis. mass,I4 and the extremely stringent requirement of
If, in line with the interpretation of Machs principle spatial isotropy,16 impose conditions so severe that it
being developed, the dimensionless mass ratio given by has been found to be difficult, if not impossible, to
Eq. (3) should depend upon the matter distribution in 12 E. P. Wigner has questioned the physical significance of Kie-
the universe, with h and c constant by definition, either rnannian geometry on other grounds [Relativity Seminar, Stevens
the mass m or the gravitational constant, or both, must Institute, May 9, 1961 (unpublished)].
a B. Hoffman, Phys. Rev. 89.49 (1953).
vary. Although these are alternative descriptions of the I4 R. Eotvos, Ann: Physik 68; 11 (1922).
ISV. W. Hughes, H. G . Robinson, and V. Beltran-Lopez,
P. A. M. Dirdc, Proc. Roy. SOC (London) A165, 199 (1938). Phys. Rev. Letters 4,342 (1960).
145
construct a satisfactory theory with a variable fine- concrete spherical shell could be constructed with the
structure constant. laboratory in its interior.
I t should be emphasized that the above argument in- 2. The contrary view is that locally observed inertial
volving the large dimensionless numbers, Eq. (3), does reactions depend upon the mass distribution of the uni-
not concern Machs principle directly, but that Machs verse about the point of observation and consequently
principle and the assumption of a gravitational con- the quantitative aspects of locally observed physical
stant dependent upon mass distributions gives a laws (as expressed in the physical constants) are
reasonable explanation for varying constants. position dependent.
I t would be expected that both nearby and distant 3. I t is possible to reduce the variation of physical
matter should contribute to the inertial reaction experi- constants required by this interpretation of Machs
enced locally. If the theory were linear, which one does principle to that of a single parameter, the gravitational
not expect, Eq. (1)would suggest that it is the reciprocal constant.
of the gravitational constant which is determined locally 4. The separate but related problem posed by the
as a linear superposition of contributions from the mat- existence of very large dimensionless numbers repre-
ter in the universe which is causally connected to the senting quantitative aspects of physical laws is clarified
point in question. This can be expressed in a somewhat by noting that these large numbers involve G and that
symbolic equation : they are of the same order of magnitude as the large
numbers characterizing the size and mass distribution
G-1-Zi(mi/ric2), (4) of the universe.
5. The strong principle of equivalence upon which
where the sum is over all the matter which can con- general relativity rests is incompatible with these ideas.
tribute t o the inertial reaction. This equation can be However, it is only the weak principle which is
given an exact meaning only after a theory has been constructed. Equation (4) is also a relation from
Sciama's theory.

It is necessary to say a few words about the equivalence principle as it is used in general relativity and as
it relates to Mach's principle. As it enters general relativity, the equivalence principle is more than the
assumption of the local equivalence of a gravitational force and an acceleration. Actually, in general
relativity it is assumed that the laws of physics, including numerical content (i.e., dimensionless physical
constants), as observed locally in a freely falling laboratory, are independent of the location in time or
space of the laboratory. This is a statement of the strong equivalence principle.17 The interpretation of
Mach's principle being developed here is obviously incompatible with strong equivalence. The local equality
of all gravitational accelerations (to the accuracy of present experiments) is the weak equivalence principle.
It should be noted that it is the weak equivalence principle that receives strong experimental support from
the Eötvös experiment.

Before attempting to formulate a theory of gravitation which is more satisfactory from the standpoint of
Mach's principle than general relativity, the physical ideas outlined above, and the assumptions being made,
will be summarized:

1. An approach to Mach's principle which attempts, with boundary conditions, to allow only those mass
distributions which produce the correct inertial reaction seems foredoomed, for there do exist large
localized masses in the universe (e.g., white dwarf stars) and a laboratory could, in principle, be
constructed near such a mass. Also it appears to be possible to modify the mass distribution. For example,
a massive

has been directly supported by the very precise experiments of Eötvös.

A THEORY OF GRAVITATION BASED ON A SCALAR FIELD IN A RIEMANNIAN GEOMETRY

The theory to be developed represents a generalization of general relativity. It is not a completely
geometrical theory of gravitation, as gravitational effects are described by a scalar field in a Riemannian
manifold. Thus, the gravitational effects are in part geometrical and in part due to a scalar interaction.
There is a formal connection between this theory and that of Jordan,16 but there are differences and the
physical interpretation is quite different. For example, the aspect of mass creation in Jordan's theory is
absent from this theory.

In developing this theory we start with the weak principle of equivalence. The great accuracy of the Eötvös
experiment suggests that the motion of uncharged test particles in this theory should be, as in general
relativity, a geodesic in the four-dimensional manifold.

With the assumption that only the gravitational constant (or active gravitational masses) vary with
position, the laws of physics (exclusive of gravitation) observed in a freely falling laboratory should be
unaffected by the rest of the universe as long as self-gravitational fields are negligible. The theory should
be constructed in such a way as to exhibit this effect.

If the gravitational constant is to vary, it should be

16 P. Jordan, Schwerkraft und Weltall (Friedrich Vieweg und Sohn, Braunschweig, 1955); Z. Physik 157, 112
(1959). In this second reference, Jordan has taken cognizance of the objections of Fierz (see reference 19)
and has written his variational principle in a form which differs in only two respects from that expressed
in Eq. (16). See also reference 20.
17 For a discussion of this, see H. Bondi, Cosmology, 2nd edition, 1960.
$$0 = \delta \int \left[ R + (16\pi G/c^4)\, L \right] (-g)^{1/2}\, d^4x .$$
It is pointed out that the usual principle of invariance under isotopic spin rotation is not consistent with
the concept of localized fields. The possibility is explored of having invariance under local isotopic spin
rotations. This leads to formulating a principle of isotopic gauge invariance and the existence of a b field
which has the same relation to the isotopic spin that the electromagnetic field has to the electric charge. The
b field satisfies nonlinear differential equations. The quanta of the b field are particles with spin unity,
isotopic spin unity, and electric charge ±e or zero.
ments in recent years4 on the energy levels of light nuclei strongly suggest that this assumption is indeed
correct. An implication of this is that all strong interactions, such as the pion-nucleon interaction, must
also satisfy the same conservation law. This and the knowledge that there are three charge states of the
pion, and that pions can be coupled to the nucleon field singly, lead to the conclusion that pions have
isotopic spin unity. A direct verification of this conclusion was found in the experiment of Hildebrand5
which compares the differential cross section of the process $n + p \to \pi^0 + d$ with that of the
previously measured process $p + p \to \pi^+ + d$.

The conservation of isotopic spin is identical with the requirement of invariance of all interactions under
isotopic spin rotation. This means that when electromagnetic interactions can be neglected, as we shall
hereafter assume to be the case, the orientation of the isotopic spin is of no physical significance. The
differentiation between a neutron and a proton is then a purely arbitrary process. As usually conceived,
however, this arbitrariness is subject to the following limitation: once one chooses what to call a proton,
what a neutron, at one space-time point, one is then not free to make any choices at other space-time points.

It seems that this is not consistent with the localized field concept that underlies the usual physical
theories. In the present paper we wish to explore the possibility of requiring all interactions to be
invariant under independent rotations of the isotopic spin at all space-time points, so that the relative
orientation of the isotopic spin at two space-time points becomes a physically meaningless quantity (the
electromagnetic field being neglected).

We wish to point out that an entirely similar situation arises with respect to the ordinary gauge invariance
of a charged field which is described by a complex wave function $\psi$. A change of gauge6 means a change
of phase factor $\psi \to \psi'$, $\psi' = (\exp i\alpha)\psi$, a change that is devoid of any physical
consequences. Since $\psi$ may depend on x, y, z, and t, the relative phase factor of $\psi$ at two different
space-time points is therefore completely arbitrary. In other words, the arbitrariness in choosing the phase
factor is local in character.

We define isotopic gauge as an arbitrary way of choosing the orientation of the isotopic spin axes at all
space-time points, in analogy with the electromagnetic gauge which represents an arbitrary way of choosing
the complex phase factor of a charged field at all space-time points. We then propose that all physical
processes (not involving the electromagnetic field) be invariant under an isotopic gauge transformation,
$\psi \to \psi'$, $\psi' = S^{-1}\psi$, where S represents a space-time dependent isotopic spin rotation.

To preserve invariance one notices that in electrodynamics it is necessary to counteract the variation of
$\alpha$ with x, y, z, and t by introducing the electromagnetic field $A_\mu$ which changes under a gauge
transformation as

$$A_\mu' = A_\mu + \frac{1}{e}\frac{\partial \alpha}{\partial x_\mu} .$$

In an entirely similar manner we introduce a B field in the case of the isotopic gauge transformation to
counteract the dependence of S on x, y, z, and t. It will be seen that this natural generalization allows for
very little arbitrariness. The field equations satisfied by the twelve independent components of the B field,
which we shall call the b field, and their interaction with any field having an isotopic spin are essentially
fixed, in much the same way that the free electromagnetic field and its interaction with charged fields are
essentially determined by the requirement of gauge invariance.

In the following two sections we put down the mathematical formulation of the idea of isotopic gauge
invariance discussed above. We then proceed to the quantization of the field equations for the b field. In
the last section the properties of the quanta of the b field are discussed.

ISOTOPIC GAUGE TRANSFORMATION

Let $\psi$ be a two-component wave function describing a field with isotopic spin 1/2. Under an isotopic
gauge transformation it transforms by

$$\psi = S\psi' , \qquad (1)$$

where S is a 2x2 unitary matrix with determinant unity. In accordance with the discussion in the previous
section, we require, in analogy with the electromagnetic case, that all derivatives of $\psi$ appear in the
following combination:

$$(\partial_\mu - i\epsilon B_\mu)\psi .$$

$B_\mu$ are 2x2 matrices such that7 for $\mu = 1, 2,$ and 3, $B_\mu$ is Hermitian and $B_4$ is
anti-Hermitian. Invariance requires that

$$S(\partial_\mu - i\epsilon B_\mu')\psi' = (\partial_\mu - i\epsilon B_\mu)\psi . \qquad (2)$$

Combining (1) and (2), we obtain the isotopic gauge transformation on $B_\mu$:

$$B_\mu' = S^{-1} B_\mu S + \frac{i}{\epsilon}\, S^{-1}\frac{\partial S}{\partial x_\mu} . \qquad (3)$$

The last term is similar to the gradient term in the gauge transformation of electromagnetic potentials.

In analogy to the procedure of obtaining gauge invariant field strengths in the electromagnetic case, we

4 T. Lauritsen, Ann. Rev. Nuclear Sci. 1, 67 (1952); D. R. Inglis, Revs. Modern Phys. 25, 390 (1953).
5 R. H. Hildebrand, Phys. Rev. 89, 1090 (1953).
6 W. Pauli, Revs. Modern Phys. 13, 203 (1941).
7 We use the conventions $\hbar = c = 1$, and $x_4 = it$. Bold-face type refers to vectors in isotopic
space, not in space-time.
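The step from Eqs. (1) and (2) to the transformation law (3) can be spelled out in a few lines. The following is only a sketch: it assumes nothing beyond the relation $\psi = S\psi'$ and the invariance requirement (2), with primes denoting the transformed quantities.

```latex
\begin{align*}
(\partial_\mu - i\epsilon B_\mu)\psi
  &= (\partial_\mu - i\epsilon B_\mu)\,S\psi' \\
  &= S\,\partial_\mu\psi' + (\partial_\mu S)\,\psi' - i\epsilon B_\mu S\,\psi' \\
  &= S\Bigl[\partial_\mu - i\epsilon\Bigl(S^{-1}B_\mu S
       + \frac{i}{\epsilon}\,S^{-1}\partial_\mu S\Bigr)\Bigr]\psi' .
\end{align*}
% Comparing with the left-hand side of (2), S(\partial_\mu - i\epsilon B'_\mu)\psi',
% and using that \psi' is arbitrary, gives
%   B'_\mu = S^{-1} B_\mu S + (i/\epsilon) S^{-1} \partial_\mu S .
% Abelian check: for a pure phase S = e^{-i\alpha} (so that \psi' = e^{i\alpha}\psi),
%   B'_\mu = B_\mu + (1/\epsilon)\,\partial_\mu\alpha ,
% which has the same form as the electromagnetic rule with \epsilon in place of e.
```

The Abelian check in the comments shows why the last term of (3) is described as the analog of the gradient term for electromagnetic potentials.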
194 C. N. YANG AND R. L. MILLS
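$$\partial \mathfrak{J}_\mu / \partial x_\mu = 0. \qquad (16)$$

$\mathfrak{J}_{1,2,3}$ and $\mathfrak{J}_4$ are respectively the isotopic spin current density and isotopic
spin density of the system. The equation of continuity guarantees that the total isotopic spin

$$\mathbf{T} = \int \mathfrak{J}_4 \, d^3x$$

$$\ldots - \epsilon^2 (\mathbf{b}_\mu \times \mathbf{b}_\nu)^2 + \mathbf{J}_\mu \cdot \mathbf{b}_\mu
  - \bar\psi(\gamma_\mu \partial_\mu + m)\psi . \qquad (19)$$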
satisfy the law of conservation of electric charge, which is exact. The two states of the nucleon, namely
proton and neutron, differ by charge unity. Since they can transform into each other through the emission or
absorption of a b quantum, the latter must have three charge states with charges ±e and 0. Any measurement of
electric charges of course involves the electromagnetic field, which necessarily introduces a preferential
direction in isotopic space at all space-time points. Choosing the isotopic gauge such that this preferential
direction is along the z axis in isotopic space, one sees that for the nucleons

$$Q = \text{electric charge} = e\left(\tfrac{1}{2} + \epsilon^{-1} T^z\right),$$

and for the b quanta

$$Q = (e/\epsilon)\, T^z .$$

The interaction (7) then fixes the electric charge up to an additive constant for all fields with any
isotopic spin:

$$Q = e\left(\epsilon^{-1} T^z + R\right). \qquad (22)$$

The constants R for two charge conjugate fields must be equal but have opposite signs.10

FIG. 1. Elementary vertices for b fields and nucleon fields. Dotted lines refer to b field, solid lines with
arrow refer to nucleon field.

FIG. 2. Primitive divergences.

We next come to the question of the mass of the b quantum, to which we do not have a satisfactory answer. One
may argue that without a nucleon field the Lagrangian would contain no quantity of the dimension of a mass,
and that therefore the mass of the b quantum in such a case is zero. This argument is however subject to the
criticism that, like all field theories, the b field is beset with divergences, and dimensional arguments are
not satisfactory.

One may of course try to apply to the b field the methods for handling infinities developed for quantum
electrodynamics. Dyson's approach11 is best suited for the present case. One first transforms into the
interaction representation in which the state vector $\Psi$ satisfies

$$i\,\partial\Psi/\partial t = H_{\text{int}}\Psi ,$$

where $H_{\text{int}}$ was defined in Eq. (21). The matrix elements of the scattering matrix are then
formulated in terms of contributions from Feynman diagrams. These diagrams have three elementary types of
vertices illustrated in Fig. 1, instead of only one type as in quantum electrodynamics. The primitive
divergences are still finite in number and are listed in Fig. 2. Of these, the one labeled a is the one that
effects the propagation function of the b quantum, and whose singularity determines the mass of the b
quantum. In electrodynamics, by the requirement of electric charge conservation,12 it is argued that the mass
of the photon vanishes. Corresponding arguments in the b field case do not exist13 even though the
conservation of isotopic spin still holds. We have therefore not been able to conclude anything about the mass
of the b quantum.

A conclusion about the mass of the b quantum is of course very important in deciding whether the proposal of
the existence of the b field is consistent with experimental information. For example, it is inconsistent with
present experiments to have their mass less than that of the pions, because among other reasons they would
then be created abundantly at high energies and the charged ones should live long enough to be seen. If they
have a mass greater than that of the pions, on the other hand, they would have a short lifetime (say, less than
sec) for decay into pions and photons and would so far have escaped detection.

10 See M. Gell-Mann, Phys. Rev. 92, 833 (1953).
11 F. J. Dyson, Phys. Rev. 75, 486, 1736 (1949).
12 J. Schwinger, Phys. Rev. 76, 790 (1949).
13 In electrodynamics one can formally prove that $G_{\mu\nu} k_\nu = 0$, where $G_{\mu\nu}$ is defined by
Schwinger's Eq. (A12). ($G_{\mu\nu} A_\nu$ is the current generated through virtual processes by the
arbitrary external field $A_\nu$.) No corresponding proof has been found for the present case. This is due to
the fact that in electrodynamics the conservation of charge is a consequence of the equation of motion of the
electron field alone, quite independently of the electromagnetic field itself. In the present case the b field
carries an isotopic spin and destroys such general conservation laws.
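As an illustration of the charge formula (22), the assignments below tabulate Q for the nucleon, its charge conjugate, and the b quanta. This is a sketch under one stated assumption: the eigenvalues of $T^z$ are taken to be $\epsilon$ times the usual isotopic-spin projections, which is the normalization needed for the nucleon and b-quantum formulas in the text to give the charges (e, 0) and (±e, 0) quoted there.

```latex
\begin{align*}
\text{nucleon } (R = \tfrac12):\quad
  & T^z = +\tfrac{\epsilon}{2} \;\Rightarrow\; Q = e \ (\text{proton}), \qquad
    T^z = -\tfrac{\epsilon}{2} \;\Rightarrow\; Q = 0 \ (\text{neutron}); \\
\text{antinucleon } (R = -\tfrac12):\quad
  & T^z = +\tfrac{\epsilon}{2} \;\Rightarrow\; Q = 0, \qquad
    T^z = -\tfrac{\epsilon}{2} \;\Rightarrow\; Q = -e; \\
\text{b quantum } (R = 0):\quad
  & T^z = +\epsilon,\; 0,\; -\epsilon \;\Rightarrow\; Q = +e,\; 0,\; -e .
\end{align*}
```

The antinucleon row illustrates the statement that the constants R of two charge conjugate fields are equal in magnitude and opposite in sign.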
PHYSICAL REVIEW VOLUME 98, NUMBER 5 JUNE 1, 1955

necessitates the existence of a neutral vector massless field coupled to all heavy particles. A nucleon would
have a heavy-particle charge of $+\eta$ in such a field and an antinucleon would have a heavy-particle

should be noticed that in addition the assumption has also been made that the transformation that generates
the conservation of heavy particles is of the specific form (1).

We wish to thank Dr. J. Robert Oppenheimer for an interesting discussion.

* See M. Gell-Mann and A. Pais, Proceedings of the Glasgow Conference, July, 1954 (to be published).
* W. Pauli, Revs. Modern Phys. 13, 203 (1941).
* C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
* Eötvös, Pekár, and Fekete, Ann. Physik 68, 11 (1922).
Some systems of fields have been considered which are invariant under a certain group of transformations
depending on n parameters. A general rule is obtained for introducing a new field in a definite way with a
definite type of interaction with the original fields by postulating the invariance of these systems under a
wider group derived by replacing the parameters of the original group with a set of arbitrary functions.
The transformation character of this new field under the wider group is determined from the invariance
postulate. The possible types of the equations of the new fields can be also derived, giving rise to a certain
conservation law owing to the invariance. As examples, the electromagnetic, the gravitational and the
Yang-Mills fields are reconsidered following this line of approach.
using the concept of parallel displacement. On the other hand, we shall see in Sec. 4 that the covariant
derivative of any tensor or spinor can be derived from the postulate of invariance under the "generalized
Lorentz transformations" derived by replacing the six parameters of the usual Lorentz group with a set of six
arbitrary functions of x. In deriving such covariant derivatives it is unnecessary to use explicitly the
notion of parallel displacement.

Now the above stated classification of the interactions has only a tentative meaning. Some of the interactions
of the second class might be translated to the first class if we could find a transformation group by means
of which we can derive that interaction following the general scheme in Sec. 1. For example, if the
interaction between mesons and nucleons could be reinterpreted in a fashion analogous to those of the first
class, then one might presumably be able to get a wider viewpoint for interpreting the interactions between
the new unstable particles and the nucleons.

1. GENERAL THEORY

Let us consider a set of fields $Q^A(x)$, $(A = 1, 2, \ldots, N)$, with the Lagrangian density

is invariant under the following infinitesimal transformation:

$$Q^A \to Q^A + \delta Q^A, \qquad \delta Q^A = T_{(a),}{}^A{}_B\, \epsilon^a Q^B, \qquad (1.1)$$

$\epsilon^a$ = infinitesimal parameter $(a = 1, 2, \ldots, n)$,
$T_{(a),}{}^A{}_B$ = constant coefficient.

In addition, the transformation (1.1) is assumed to be a Lie group G depending on the n parameters
$\epsilon^a$. Thus there must be a set of constants $f_b{}^a{}_c$ called the "structure constants," which are
defined by

$$[T_{(a)}, T_{(b)}]^A{}_B = T_{(a),}{}^A{}_C\, T_{(b),}{}^C{}_B - T_{(b),}{}^A{}_C\, T_{(a),}{}^C{}_B
  = f_a{}^c{}_b\, T_{(c),}{}^A{}_B . \qquad (1.2)$$

These constants, $f_b{}^a{}_c$, have the following important properties:

$$f_a{}^m{}_b\, f_m{}^l{}_c + f_b{}^m{}_c\, f_m{}^l{}_a + f_c{}^m{}_a\, f_m{}^l{}_b = 0, \qquad
  f_a{}^c{}_b = -f_b{}^c{}_a . \qquad (1.3)$$

The relations (1.3) can be easily obtained from Jacobi's identity and the definition (1.2).

Now from the invariant character of I under the transformation (1.1) and from the fact that this invariance
is always preserved for an arbitrary domain $\Omega$, we have the invariance of the Lagrangian density
itself. Namely we have

The symbol $\equiv$ means that $\delta L$ must vanish at any world point and further that this relation does
not depend on the behavior of $Q^A$ and $Q^A{}_{,\mu}$. Substituting (1.1) into (1.4) we get

since the $\epsilon$'s are independent of each other. These n identities are the necessary and sufficient
conditions for the invariance of I under G.

If we take into account the field equation for $Q^A$, we obtain from (1.5) the following n conservation laws:

The first term, $(\;)\,\delta Q^A$, vanishes on account of the field equation.

Now let us consider the following transformation:

$$\delta Q^A = T_{(a),}{}^A{}_B\, \epsilon^a(x)\, Q^B, \qquad (1.1)'$$

$\epsilon^a(x)$ = infinitesimal arbitrary function,

instead of (1.1). In this case $\delta L$ does not vanish but becomes

or

by virtue of the identity (1.5).

In order to preserve the invariance of the Lagrangian under (1.1)', it is necessary to introduce a new field

$$A'^J(x), \qquad J = 1, 2, \ldots, M,$$

in such a way that the right-hand side of (1.8) can be cancelled with the contribution from this new field
$A'^J$.
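The variation of L under the position-dependent transformation can be worked out directly. The lines below are a reconstruction sketch, not the paper's own displayed equations (which are missing from this copy); it assumes $L = L(Q^A, Q^A{}_{,\mu})$ with $Q^A{}_{,\mu} \equiv \partial Q^A/\partial x^\mu$.

```latex
\begin{align*}
\delta Q^A{}_{,\mu}
  &= T_{(a),}{}^A{}_B\,\epsilon^a(x)\,Q^B{}_{,\mu}
   + T_{(a),}{}^A{}_B\,Q^B\,\frac{\partial\epsilon^a}{\partial x^\mu}, \\
\delta L
  &= \frac{\partial L}{\partial Q^A}\,\delta Q^A
   + \frac{\partial L}{\partial Q^A{}_{,\mu}}\,\delta Q^A{}_{,\mu} \\
  &= \epsilon^a(x)\Bigl[\frac{\partial L}{\partial Q^A}\,T_{(a),}{}^A{}_B\,Q^B
   + \frac{\partial L}{\partial Q^A{}_{,\mu}}\,T_{(a),}{}^A{}_B\,Q^B{}_{,\mu}\Bigr]
   + \frac{\partial L}{\partial Q^A{}_{,\mu}}\,T_{(a),}{}^A{}_B\,Q^B\,
     \frac{\partial\epsilon^a}{\partial x^\mu} .
\end{align*}
% The bracket is the combination that vanishes identically when the Lagrangian
% is invariant under the constant-parameter transformation, so only the term
% proportional to \partial\epsilon^a/\partial x^\mu survives. That leftover
% term is what the newly introduced field has to cancel.
```

The surviving term is linear in the gradient of $\epsilon^a(x)$, which is why the compensating field must itself transform with a gradient term.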
L'(QA,QA,
and consider the following transformation:
By using A", in place of A r J , the transformation
SQA= T(a),AB Q" e " ( ~ ) , character of A turns into
a 0
Now let us investigate the possible type of the From these relations and (1.16), we have
Lagrangian for the free A-field. Let it be denoted by
LO(Aap,Aap,"), A a , aA'Jdx'. =O.
The invariance postulate for LO under the transfor- Namely LOmust be a function of F alone and must
mation (1.11) leads to satisfy the identity (1.19).
As may easily be seen, the transformation character
of F is given by
$$\delta F^a{}_{\mu\nu} = \epsilon^b(x)\, f_b{}^a{}_c\, F^c{}_{\mu\nu} . \qquad (1.20)$$
Equation (1.20) can be verified by using the relation
(1.3).
Now let us define a set of matrices, M ( l ) , Mcz),
. . .M(,,),
in the following way:
(a$)-element of M(=)=M(o)'b= fc'b,
From (1.17) we see that the derivative of A should be (a, b, c = l , 2, . . .).
contained in LOthrough the combination
Then these matrices are a representation of degree n
a a for the generators of the Lie group G, since the relation
(1.3) can be written as
+-i- I aL
ax. aVpQA
aLn 6LT I
6 ~ ~ + 4 ~ ~ , + - =o,
aFarv 6Aap
-
- 1 (1.21)
(1.19)
Now let us choose the arbitrary function ra(x) in such
a way that the values of all the r's and &//ax's vanish
(See Appendix I.) Since Lo must have the form
on the boundary surface of the integration domain Q .
Lo(A,aA/ax) Lo' (Anp,Fapy), Then the integration of (1.21) over the domain D
becomes
we have the relations
lKd'x--O, (1.22)
fab. A
because the integration of the divergence term in (1.21) vanishes on account of our special choice of the
$\epsilon$'s. Since the $\epsilon$'s can be chosen arbitrarily within $\Omega$, K must vanish at every point
in $\Omega$, as is easily seen from (1.22).

Consequently the identity (1.21) is separated into the following two relations:

and (1.25) becomes

(1.29)

then we have the conservation of the current, i.e.,

$$\partial J^\mu{}_a / \partial x^\mu = 0 \qquad (a = 1, 2, \ldots) \qquad (1.30)$$

Thus we have obtained a general rule for introducing a new field A in a definite way when there exists some
conservation law such as (1.6) or there is a Lie group depending upon some parameters under which the system
is invariant.

In the following sections we shall consider the following groups as examples of the original Lie group: (1)
the phase transformation of a charged field, (2) the rotation group in the isotopic spin space, and (3) the
Lorentz group.

2. PHASE TRANSFORMATION GROUP AND THE ELECTROMAGNETIC FIELD

Let us consider a charged field Q and Q*. The Lagrangian of this system is assumed to be invariant under the
phase transformation

$$\delta Q^A = i\alpha Q^A, \qquad \delta Q^{A*} = -i\alpha Q^{A*}, \qquad \alpha = \text{a real constant}.$$

The current $J^\mu$ can be obtained from the two different expressions

As an example let us consider a system of proton and neutron fields:

$$\psi^a = \begin{pmatrix} \psi^1 \\ \psi^2 \end{pmatrix}
         = \begin{pmatrix} \text{proton} \\ \text{neutron} \end{pmatrix}.$$

The Lagrangian in the charge-independent theory is invariant under the rotation in the three-dimensional
isotopic spin space:

$$\delta\psi^a = i \sum_{c=1}^{3} \epsilon^c\, \tau_{(c)}{}^a{}_b\, \psi^b, \qquad
  \delta\bar\psi_a = -i \sum_{c=1}^{3} \epsilon^c\, \bar\psi_b\, \tau_{(c)}{}^b{}_a, \qquad (3.1)$$

where $\tau_{(1)}$, $\tau_{(2)}$, and $\tau_{(3)}$ are the usual isotopic spin matrices.
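For the isotopic rotation group the structure constants can be read off explicitly. The sketch below assumes that the isotopic spin matrices are normalized as $\tau_{(c)} = \sigma_c/2$ (with $\sigma_c$ the Pauli matrices), and that the generators in the sense of (1.1) are $T_{(c)} = i\tau_{(c)}$, as suggested by comparing $\delta\psi = i\epsilon^c\tau_{(c)}\psi$ with $\delta Q = T_{(a)}\epsilon^a Q$. Under these assumptions the result agrees with the values $\pm 1$ and $g_{ab} = 2\delta_{ab}$ quoted in Appendix I.

```latex
\begin{align*}
[\tau_{(a)}, \tau_{(b)}] &= i\,\varepsilon_{abc}\,\tau_{(c)}
  \quad\Longrightarrow\quad
  [T_{(a)}, T_{(b)}] = i^2\,[\tau_{(a)}, \tau_{(b)}]
                     = -\varepsilon_{abc}\,T_{(c)}, \\
f_a{}^c{}_b &= -\varepsilon_{abc}
  \quad\Longrightarrow\quad
  f_1{}^3{}_2 = f_3{}^2{}_1 = f_2{}^1{}_3 = -1, \qquad
  f_a{}^c{}_b = -f_b{}^c{}_a, \\
g_{ab} &= f_a{}^l{}_m\, f_l{}^m{}_b
        = \varepsilon_{aml}\,\varepsilon_{lbm}
        = \varepsilon_{aml}\,\varepsilon_{bml}
        = 2\,\delta_{ab} .
\end{align*}
```

With a different normalization of $\tau_{(c)}$ (for example $\tau_{(c)} = \sigma_c$) the structure constants and $g_{ab}$ would be rescaled by constant factors, but the antisymmetry and the Jacobi identity (1.3) would be unaffected.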
In this case the general notation T in Sec. 2 corre- The square of thc invariant length of the infinitesimal
sponds to r as follows line element is given by
hkr+hkr+8hkr, t k l =-Elk
As was stated in Sec. 1, Fa,,is transformed under the 6hk'= -d k h f .
rotation group as a vector, namely, the isotopic spin of
this B-field is unity. The expression for the "current" On account of this geometrical meaning of the h's,
has the form [see (1.25) and (1.24)]: we can transform the world tensor into the corre-
sponding local tensor defined with respect to the local
aLT . ar. aLo frame, or vice versa, using h~' or hk,. For example,
Juo=-= -*-----T(.), O L $J'--fG%
~ Bbpa
avp F",. (u>=hk, (u) Q' (HI8
Qk =hnp(u)
Qlr(a) Qk (a),
4. LORENTZ GROUP AND THE GRAVITATIONAL where the abbreviation
FIELD
Let us consider a system of fields Q" (x)being defined has been used.
e
= Q"$ ( 1>
.
with respect to same Lorentz frame. In addition, let us In this way we can rewrite the action integral as
assume that the action integral follows :
where 9 is defined by
is invariant under any Lorentz transformation.
Now besides the z-system, let us introduce an arbi- P = L ( Q " ( ~ ) , J w Q"..(u))
(~) h, (4.2)
trary system of the curvilinear coordinates U M (p= 1, 2,
3 , 4 ) . In what follows, the Latin and Greek indices and Q", stands for
represent quantities defined with respect to the x- aQ " (u)/ W.
system (or the local Lorentz frame) and to the u- a The world vector means a vector which is defined with respect
system respectively. t o the u-system.
The reason or the fact that the Q A in (4.2) is not are assumed to be transformed as
transformed into the corresponding world quantity
is that if Q is a spinor this rewriting is not possible, QA=&(~) T ( ~ z ) , Q,
B
because the spinor can be well defined only with 6hk,,=eki(N) P,,. (4.6)
respect to a Lorentz frame.
Now I is invariant under the following two kinds of Then, in order to retain the invariance of I under the
transformations: transformation (4.6), it is necessary to introduce a new
(1) The Lorentz transformation field
A E l p (24)= --A ikp(Zd),
6hhP= e k l hIP,
which has the following transformation character
~ Q = + T ( W . ~ Btk,
Q~ (4.3)
according to (1.11):
u =unchanged,
aekz
(2) The general point transformation Furthermore the new Lagrangian is given by
M-w+X+) =dr,
(4.4)
[see (1.10)].
SQ (u) Q(u) -QA(.) =0, - The factors 3 in (4.9) and f in (4.7) are necessary
because in summing up the terms in these expressions
SQf ,= --_ax QA. v.
with respect to the dummy suffices the same contribu-
all* tions are counted twice or four times.
Because of the general Lorentz transformation,
Now our Lagrangian (4.2) has the suitable form for under which each local frame a t each world point i s
the application of the general method stated in I, if the transformed differently, the reIation (4.5) was aban-
given functions, hkP,are regarded as a set of field doned. Since this relation is satisfied only when the
quantities satisfying the condition : basic world is flat, we are forced to take as our basic
ahk,,/auU= ahk,/auP, (4.5) space-time some Riemannian space with the metric
and having the transformation character (4.3) under
the Lorentz group. Though we will omit the condition
(4.5), the invariance of I under the transformations and the affine connection
(4.3) and (4.4) still holds. The only role of (4.5) is to
guarantee the possibility of finding the simplest and
most convenient system of coordinates (d,. . .,d).In
fact if we replace the parameters, eik, with a set of
arbitrary functions, eik(u),after the Lorentz transfor- Accordingly we wouId expect that there exists some
mation depending on such e(u)s, the relatiun (4.5) is relationship between A: and hk,,.
destroyed. I n order to obtain this relationship let us consider,
The condition (4.5) is inconsistent with the applica- as an example, the local tensor
tion of the general procedure of Sec. 1 to the present
problem. Accordingly we shall consider the hs as a set
of 16 independent given functions.
Now following the prescription of Sec. 1, let us con- Then from (4.9) we have
sider the generalized Lorentz transformation depend-
aQkl
ing upon a set of arbitrary functions eik(u)instead of the VfiQkl =-- A Em,,Qm-A Z m
z Qkm
parameters eik. Under this transformation, Q A and hk, aw
R. Utiyama, Progr. Theoret. Phys. (japan) 2, 38 (1947). F. J. Belinfante, Physica 7, 305 (1940); K. Husimi, Proc.
L. Rosenfeld, Ann. Physik 5, 113 (1930). National Research Council of Japan 4, 81 (1943).
By using h, this can be rewritten as then we can solve (4.13)for I. The solution is
or
where Qmvand I are defined by
(4.14)
Qkv=h,,, Qkm,
and where
ah, Apv,,,=hkP hipAk,,,.
ri.,,P=hf--hkp Ak,. (4.11)
a u p
(4.14)is just the relation desired.
I n general, the following relation is easily derived from Now from (4.14)we see that AP,,,, is a world tensor
(4.9) : of the third rank under the general point transformation
a (4.4)because the inhomogeneous term
.
V,Qkl.. ,pu. ..., aB... =-QkZ.. .,pa. 9 .
a)..., a8...
aur a2xp
and
(a/au+m=o, (4.19)
where ! l R u is
(4.15) can be rewritten formally as follows:
Fkl,=V,Akly-V,Akr-AkpA1bv+AlrblA$$, (4.15)
where V , A k l , does not behave Like a local tensor of
the second rank, but is a covariant world tensor as the
s f i c e s p and Y show. Using the expression (4.15) we
can prove the following relation (see Appendix 11)
Accordingly, the coefficient of a2E/d3 in (4.19) identi-
Fkluv=hi hk. RahrvJ (4.16) cally vanishes :
where R is the Riemann-Christoffel curvature tensor: di!T a?T
~ &-- hpO.
ahi, ,? ahi;,p y
Thus becomes
Though lak, is contained in our Lagrangian as well as IIJ1.Z ;EkB*is,
A and a A / d u , we can still prove that A k i , appears in
loonly through the combination P.
So far we have assumed that h, is a given function.
The behavior of hk, in general relativity is defined by
the field equation derived by the variational principle. a?T a&
The total Lagrangian density is now given by ---h,,+-htp,
+lIiWp,, ahi,,,..
Of course, in all these equations for the Q, h, and B fields, the Taking the variation with regard to h, we get
affinity Imust be used instead of r. (4.18) and (4.19) also hold
in this case, since B is invariant under the general Lorentz blo bi!
transformations. Consequently this B field is of no use in -+--=(I.
avoiding the trivial result : $Q%i,-D. ah, ahi,
Thus we have By taking into account the relation (1.3), the above
h 1C
. @Pn= -%Pi, expression becomes
with 6 K ,p v = f a l m ( f k b l f b m j f f I b j f b m k ) F k p v ti.
%Pi= -62/6hip,
or we have Using (1.3) again, we can rewrite the hrst term of this
@Pa= -5PW. expression as follows:
From this relation, which vanishes in virtue of the relation (1.3). Hence
this proposal concerning the transformation character
of f b a c is compatible with the covariant character of
can be easily derived. Ka,pv.
Using the quantity
ACKNOWLEDGMENTS
gab=fa'm flmb=gba
The author is most grateful to the Institute for
Advanced Study for a grant-in-aid and to Professor and its inverse gab, we can easily construct a tensor
Robert Oppenheimer for the kind hospitality extended algebra similar to that used in the theory of relativity.
him there. He is also indebted to members of the For example, we have invariants
Institute, especially to Dr. R. Arnowitt, for helpful
conversations. Ilpv,ps=gab Fa,, F b p r = H p o , p v .
In the case of the rotation group in three-dimensional
APPENDIX I. CONDITION (1.19) isotopic spin space (see Sec. 3 ) , f b a c has the following
Here we shall show how to construct an invariant values :
in terms of Fapr.
Consider a quantity G,, the transformation character
of which is contragradient to that of Fa,, under the
transformation (120).
Since
Therefore, we have
I
fi32= j 3 ' 1 =
f?k=
fZ13= - 1,
-fk'i,
otherwise f = 0.
gab=26ab,
GP,. and
Hpv,pa= 26.6 Fapv Fbps.
is invariant by definition, 6G, is given by
Another familiar example is the case of the Lorentz
6Ga= - tcf cba G b . group. Here we have
9 W. Pauli, Encyklopaedie der Mathematischen Wissenschaften
(B. G. Teubner, Leipzig, 1904-1922), Vol. 5, Chap. 19, p. 621. If.
4 j k . o bc d fab.Cdlm'g*il g*km-g*jm g*kl,
The factor in the bracket vanishes on account of the By using (A.3) this becomes
relation (1.3). Consequently there exists, in fact, a
family of invariant Lagrangians, Lo, which are functions 6pA", p=$" hf 6p6vkCA-A"f, p A'k, y hmP It".
of Fap. alone and satisfy the condition (1.19). Inserting this expression into (A.l) with (A.2), we have
APPENDIX 11. PROOF OF THE RELATION Fk',,,= h1A(6,6,-6,6p)hk~.
F",. =h'xhk a R a ~ p r
Fkl,. is given by As is well known, the Riemann-Christoffel tensor is
defined by
Fk',,=VpAk'y-V,Ak',-Akbp A 1 b u f A k b y A'b,. (A.1) (S,,6.-6dp) VX=RPX~"
V,
Now according to the general rule (4.12) for an arbitrary covariant vector V,.
Thus we get
V,Ak',= hk, h', 8J PO") (-4.2) Fk' # U = h" hk, RahpY.
electromagnetic field, we choose the Lagrangian of Now, under the more general transformations of the
lowest degree which satisfies the invariance require- form (2.1), but in which the parameters @ become
ments. arbitrary functions of position, the Lagrangian is no
The geometrical interpretation in terms of a Rieman- longer invariant, because the derivatives transform
nian space is discussed in Sec. 6, where we show that according to
the free Lagrangian we have obtained is just the usual
curvature scalar density, though expressed in terms of 6x,p = @Tax,p f @.,Tax, (2.4)
an a t k e connection FApuwhich is not necessarily sym- and the terms in to,,, do not cancel. In fact, one finds
metric. In fact, when no matter is present it is sym-
metric as a consequence of the equations of motion, but bL= -Po.J'h.
otherwise it has an antisymmetric part expressible in However, one can obtain a modified Lagrangian which
terms of the "spin density" @#ij. Thus there is a dif- is invariant by replacing X.,, in L by a quantity X ; ,
ference between this theory and the usual metric which transforms according to
theory of gravitation. This difference was first pointed
out by Weyl,' and has more recently been discussed by AX,,,= @Tax,,. (2.5)
Sciama.6 It arises from the fact that our free Lagrangian To do this' it is necessary to introduce 4n new field
is of first order in the derivatives, with the l r k ~and A iip variables As,, whose transformation properties involve
as independent variables. It is possible to re-express the c?,~. In fact, if one takes
theory in terms of the Christoffel connection Orh,," or
its local analog OA ijp,and this is done in Sec. 7. In that X;p= X++A"pTaX, (2.6)
case, additional terms quadratic in @ij, and multiplied then the condition (2.5) determines the transformation
by the gravitational constant, appear in the Lagrangian.

2. LINEAR TRANSFORMATIONS

We consider a set of field variables $\chi_A(x)$, which we regard as the
elements of a column matrix $\chi(x)$, with the Lagrangian

$L \equiv L\{\chi(x), \chi_{,\mu}(x)\}$,

where $\chi_{,\mu} = \partial_\mu\chi$. We also consider linear transformations
of the form

$\delta\chi = \epsilon^a T_a \chi$, (2.1)

where the $\epsilon^a$ are n constant infinitesimal parameters, and the $T_a$
are n given matrices satisfying commutation rules appropriate to the
generators of a Lie group,

$[T_a, T_b] = f_a{}^c{}_b T_c$.

The Lagrangian is invariant under these transformations if the n identities

$(\partial L/\partial\chi) T_a \chi + (\partial L/\partial\chi_{,\mu}) T_a \chi_{,\mu} = 0$, (2.2)

are satisfied, and we shall assume that this is so. Note that
$\partial L/\partial\chi$ must be regarded as a row matrix. The equations of
motion imply n conservation laws

$J^\mu{}_{a,\mu} = 0$,

where the "currents" are defined by6

$J^\mu{}_a \equiv -(\partial L/\partial\chi_{,\mu}) T_a \chi$. (2.3)

properties of the new fields uniquely. They are

$\delta A^a{}_\mu = \epsilon^b f_b{}^a{}_c A^c{}_\mu - \epsilon^a{}_{,\mu}$. (2.7)

In this way one obtains the invariant Lagrangian

$L'\{\chi, \chi_{,\mu}, A^a{}_\mu\} \equiv L\{\chi, \chi_{;\mu}\}$.

The expression $\chi_{;\mu}$ may be called the covariant derivative of $\chi$
with respect to the transformations (2.1). One may define covariant
currents by

$J'^\mu{}_a \equiv -(\partial L'/\partial A^a{}_\mu) \equiv -(\partial L/\partial\chi_{;\mu}) T_a \chi$, (2.8)

where L is regarded as a function of $\chi$ and $\chi_{;\mu}$. They transform
linearly according to

$\delta J'^\mu{}_a = -\epsilon^b f_b{}^c{}_a J'^\mu{}_c$,

and their covariant divergences vanish in virtue of the equations of
motion and the identities (2.2):

$J'^\mu{}_{a;\mu} \equiv J'^\mu{}_{a,\mu} - A^b{}_\mu f_b{}^c{}_a J'^\mu{}_c = 0$.

Two covariant differentiations do not in general commute. From (2.6)
one finds

$\chi_{;\mu\nu} - \chi_{;\nu\mu} = F^a{}_{\mu\nu} T_a \chi$,

where

$F^a{}_{\mu\nu} \equiv A^a{}_{\mu,\nu} - A^a{}_{\nu,\mu} - f_b{}^a{}_c A^b{}_\mu A^c{}_\nu$. (2.9)

Unlike $A^a{}_\mu$, the expression $F^a{}_{\mu\nu}$ is a covariant quantity
transforming according to

$\delta F^a{}_{\mu\nu} = \epsilon^b f_b{}^a{}_c F^c{}_{\mu\nu}$,

and one may, therefore, define its covariant derivative in an obvious
manner. It satisfies the cyclic identity

$F^a{}_{\mu\nu;\rho} + F^a{}_{\nu\rho;\mu} + F^a{}_{\rho\mu;\nu} \equiv 0$.

4 H. Weyl, Phys. Rev. 77, 699 (1950).
5 D. W. Sciama, Festschrift for Infeld (Pergamon Press, New York), to be
published.
6 We have defined $J^\mu{}_a$ with the opposite sign to that used by
Utiyama.3 This is because with this choice of sign the analogous quantity
for translations is $T^\mu{}_\rho$ rather than $-T^\mu{}_\rho$. The change may be
considered as a change of sign of $\epsilon^a$ and $T_a$, and there is a
corresponding change of sign in (2.6). This convention has the additional
advantage that the "local affine connection" $A^i{}_{j\mu}$ defined in Sec. 4
specifies covariant derivatives according to the same rule as
$\Gamma^\lambda{}_{\mu\nu}$.
7 For a full discussion, see footnote 3.
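The commutation rules assumed for the matrices $T_a$ in Sec. 2 can be checked on a concrete case. The sketch below is only an illustration under stated assumptions: it takes the three generators of spatial rotations as the $T_a$ and the Levi-Civita symbol as the structure constants $f_a{}^c{}_b$ (neither choice is made in the text), and the helper names are hypothetical.

```python
# Illustrative check of [T_a, T_b] = f_a^c_b T_c.
# Assumption (not from the text): T_a are the rotation generators in three
# dimensions, (T_a)_{ij} = -eps_{aij}, so that f_a^c_b = eps_{abc}.

def levi_civita(a, b, c):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(A, B):
    """Matrix commutator AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

# The three generators T_0, T_1, T_2.
T = [[[-levi_civita(a, i, j) for j in range(3)] for i in range(3)]
     for a in range(3)]

def structure_constant(a, c, b):
    """f_a^c_b for this example."""
    return levi_civita(a, b, c)

def check_commutation_rules():
    """True if [T_a, T_b] equals the sum over c of f_a^c_b T_c for all a, b."""
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            rhs = [[sum(structure_constant(a, c, b) * T[c][i][j]
                        for c in range(3))
                    for j in range(3)] for i in range(3)]
            if lhs != rhs:
                return False
    return True

def check_jacobi_identity():
    """True if the cyclic sum of double commutators vanishes for all a, b, c."""
    zero = [[0] * 3 for _ in range(3)]
    for a in range(3):
        for b in range(3):
            for c in range(3):
                terms = [commutator(T[a], commutator(T[b], T[c])),
                         commutator(T[b], commutator(T[c], T[a])),
                         commutator(T[c], commutator(T[a], T[b]))]
                total = [[sum(t[i][j] for t in terms) for j in range(3)]
                         for i in range(3)]
                if total != zero:
                    return False
    return True

print(check_commutation_rules(), check_jacobi_identity())
```

Integer matrices are used so that the comparisons are exact; any other matrix representation of a Lie algebra could be substituted for `T` together with its own structure constants.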
214 T. W. B. KIBBLE
It remains to h d a free Lagrangisn Lo for the new where aL/axu denotes the partial derivative with 6xed
fields. Clearly LOmust be separately invariant, and it x . It is sometimes useful to consider also the variation
is easy to see that this implies that it must contain at a fixed value of x,
A; only through the covariant combination PpY. The
simplest such Lagrangian is* 6&= x(s) -x(z)=sx-6bz~x,. (3.2)
(2.10)
In particular, it is obvious that 60 commutes with a,,
whence
where the tensor indices a.re raised with the flat-space
metric )IPV With diagonal elements (1, - 1, - 1, - l), ax,= (ax),- (6P)J.. (3.3)
and the index a is lowered with the metnc The action integral
gabE fad fcdb
I ( Q ) = s L(x)d&
assodated with the Lie group (except of course for a n
one-parameter group). It is clear that this Lagrangian
is not unique. All that is required is that it should be over a space-time region D is transformed under (3.1)
a scalar both in coordinate space and in the Lie-group into
space, and one could add to it terms of higher degree
Zl(Q)=J L(z?lladfllld4x.
in PFV. However, it seems reasonable to choose the n
Lagrangia~of lowest degree which satisfies the in-
variance requirements. Thus the action integral over an arbitrary region is
With the choice (2.10) of LO, the equations of motion invariant iP
for the new fields are
bL+L(SXfl).@=bgL+ ( D x f i ),p= 0. (3.4)
Faflu;= Jfl,,.
This is of course the typical transformation law of an
Because of the antisymmetry of Fafluone can define invariant density.
another current which is conserved in the strict sense : We now consider the specific case of Lorentz trans-
formations,
(Jpa+j.a).p=O, (2.11)
where 6x#= cpu+c, 6X= +fiVSfiX, (3.5)
jpa= A, jbe,,Fefiu. where e and t p v = - t v p are 10 real infinitesimal param-
This extra current jp,, may be regarded as the current eters, and the S, are matrices satisfying
of the new field A, itself, since it is expressible in the S,U+S, =0,
form
CS,v,S,l= )Iv J p w + ? J ~ u p -subs,,-)IflPsw= +jfi=hP.S. A.
pa=- (aLo/aAafl)= - (aLo/aAbvr)jaboAeu, (2.12) From (3.3) one has
which should be compared with (2.8). Note, however,
that it is not a covariant quantity. To obtain a strict 6X,,= +PSwX*- (3.6)
ConSeTvation law one must sacrifice the covariance of Moreover, since (ax.) ,#= ON,,= 0, the condition (3.4)
the current. for invariance of the action integral again reduces to
6L-0, and yields the 10 identitieslO
3. LORENTZ TRANSFORMATIONS
We now wish to consider infinitesimal variations of
aL/axp= L , ~ -(aL/ax)x,p
- - (aL/ax,,)x,,,,= 0, (3.7)
xp -
both the coordinates and the field Variables,
XI@= xfi+6xx,
X(x) -+ x(x)=x(x)+6x(x).
(3.1)
( a ~ / a x ) s , ~(aL/ax,)
+ (s,x,
+qd,v-~wX,p)
It will be convenient to allow for the possibility that (3.7), which express the conditions for translational
the Lagrangian may depend on x explicitly. Then, invariance, are equivalent to the requirement that L
under a variation (3.1), the change in L is be explicitly independent of x, as might be expected.
As before, the equations of motion may be used to
~ L (aL/ax)sx+
E (aL/ax,p)sx,fl+ ( a L / a x p ) w , obtain 10 conservation laws which follow from these
~
These are the conservation laws of energy, momentum, and angular momentum.
It is instructive to examine these transformations in terms of the
variation $\delta_0\chi$ also, which in this case is

$\delta_0\chi = -\epsilon^\rho\partial_\rho\chi + \tfrac{1}{2}\epsilon^{\rho\sigma}(S_{\rho\sigma} + x_\rho\partial_\sigma - x_\sigma\partial_\rho)\chi$.

On comparing this with (2.1), one sees that the role of the matrices $T_a$
is played by the differential operators $-\partial_\rho$ and
$S_{\rho\sigma} + x_\rho\partial_\sigma - x_\sigma\partial_\rho$. Thus, by analogy with the
definition (2.3) of the currents $J^\mu{}_a$, one might expect the currents
in this case to be

corresponding to the parameters $\epsilon^\rho$, $\epsilon^{\rho\sigma}$, respectively.
However, in terms of $\delta_0$, the condition for invariance (3.4) is not
simply $\delta_0 L \equiv 0$, and the additional term $\delta x^\rho L_{,\rho}$ is
responsible for the appearance of the term $L_{,\rho}$ in the identities
(3.7), and hence for the term $\delta^\mu_\rho L$ in $T^\mu{}_\rho$.

4. GENERALIZED LORENTZ TRANSFORMATIONS

We now turn to a consideration of the generalized transformations (3.5)
in which the parameters $\epsilon^\mu$ and $\epsilon^{\mu\nu}$ become arbitrary
functions of position. It is more convenient, and clearly equivalent, to
regard as independent functions $\epsilon^{\mu\nu}$ and

$\xi^\mu \equiv \epsilon^\mu{}_\nu x^\nu + \epsilon^\mu$,

since this avoids the explicit appearance of x. Moreover, one could
consider generalized transformations with $\xi^\mu = 0$ but nonzero
$\epsilon^{\mu\nu}$, so that the coordinate and field transformations can be
completely separated. In view of this fact, it is convenient to use Latin
indices for $\epsilon^{ij}$ (and for the matrices $S_{ij}$), retaining the Greek
ones for $\xi^\mu$ and $x^\mu$. Thus the transformations under consideration
are

$\delta x^\mu = \xi^\mu, \quad \delta\chi = \tfrac{1}{2}\epsilon^{ij} S_{ij}\chi$ (4.1)

or

$\delta_0\chi = -\xi^\mu\chi_{,\mu} + \tfrac{1}{2}\epsilon^{ij} S_{ij}\chi$. (4.2)

This notation emphasizes the similarity of the $\epsilon^{ij}$ transformations
to the linear transformations discussed in Sec. 2. These transformations
alone were considered by Utiyama.3 Evidently, the four functions $\xi^\mu$
specify a general coordinate transformation. The geometrical significance
of the $\epsilon^{ij}$ will be discussed in Sec. 6.

According to our convention, the differential operator $\partial_\mu$ must
have a Greek index. However, in the Lagrangian function L it would be
inconvenient to have two kinds of indices, and we shall, therefore,
regard L as a given function of $\chi$ and $\chi_k$ (no comma),11 satisfying
the identities (3.7) and (3.8). The original Lagrangian is then

It is of course not invariant under the generalized transformations
(4.1), but we shall later obtain an invariant expression by replacing
$\chi_k$ by a suitable quantity $\chi_{;k}$.

The transformation of $\chi_{,\mu}$ is given by

$\delta\chi_{,\mu} = \tfrac{1}{2}\epsilon^{ij} S_{ij}\chi_{,\mu} + \tfrac{1}{2}\epsilon^{ij}{}_{,\mu} S_{ij}\chi - \xi^\nu{}_{,\mu}\chi_{,\nu}$, (4.3)

and so the original Lagrangian transforms according to

$\delta L \equiv -\xi^\rho{}_{,\mu} J^\mu{}_\rho - \tfrac{1}{2}\epsilon^{ij}{}_{,\mu} S^\mu{}_{ij}$.

Note that it is $J^\mu{}_\rho$, rather than $T^\mu{}_\rho$, which appears here. The
reason for this is that we have not included the extra term
$L(\delta x^\mu)_{,\mu}$ in (3.4). The left-hand side of (3.4) actually has the
value

$\delta L + L(\delta x^\mu)_{,\mu} \equiv -\xi^\rho{}_{,\mu} T^\mu{}_\rho - \tfrac{1}{2}\epsilon^{ij}{}_{,\mu} S^\mu{}_{ij}$.

We now look for a modified Lagrangian which makes the action integral
invariant. The additional term just mentioned is of a different kind to
those previously encountered, in that it involves L and not
$\partial L/\partial\chi_k$. In particular, it includes contributions from terms in
L which do not contain derivatives. Thus it is clear that we cannot
remove it by replacing the derivative by a suitable covariant derivative.
For this reason, we shall consider the problem in two stages. We first
eliminate the noninvariance arising from the fact that $\chi_{,\mu}$ is not a
covariant quantity, and thus obtain an expression L' satisfying

$\delta L' \equiv 0$. (4.4)

Then, because the condition (3.4) for invariance of the action integral
requires the Lagrangian to be an invariant density rather than an
invariant, we make a further modification, replacing L' by $\mathfrak{L}$,
which satisfies

$\delta\mathfrak{L} + \xi^\mu{}_{,\mu}\mathfrak{L} \equiv 0$. (4.5)

The first part of this program can be accomplished by replacing $\chi_k$
in L by a "covariant derivative" $\chi_{;k}$ which transforms according to

$\delta\chi_{;k} = \tfrac{1}{2}\epsilon^{ij} S_{ij}\chi_{;k} - \epsilon^i{}_k\chi_{;i}$. (4.6)

The condition (4.4) then follows from the identities (3.8). To do this it
is necessary to introduce forty new field variables. We consider first
the $\epsilon^{ij}$ transformations, and eliminate the $\epsilon^{ij}{}_{,\mu}$ term
in (4.3) by setting12

$\chi_{|\mu} \equiv \chi_{,\mu} + \tfrac{1}{2} A^{ij}{}_\mu S_{ij}\chi$, (4.7)

where the $A^{ij}{}_\mu = -A^{ji}{}_\mu$ are 24 new field variables. We can then
impose the condition

$\delta\chi_{|\mu} = \tfrac{1}{2}\epsilon^{ij} S_{ij}\chi_{|\mu} - \xi^\nu{}_{,\mu}\chi_{|\nu}$ (4.8)

which determines the transformation properties of $A^{ij}{}_\mu$

11 Note that since we are using Latin indices for $S_{ij}$ the various
tensor components of $\chi$ must also have Latin indices, and for spinor
components the Dirac matrices must be $\gamma^k$.
12 $A^{ij}{}_\mu$ differs in sign from that of Utiyama.3 Compare footnote 6.
uniquely. They are

$\delta A^{ij}{}_\mu = \epsilon^i{}_k A^{kj}{}_\mu + \epsilon^j{}_k A^{ik}{}_\mu - \xi^\nu{}_{,\mu} A^{ij}{}_\nu - \epsilon^{ij}{}_{,\mu}$. (4.9)

The position with regard to the last term in (4.3) is rather different.
The term involving $\epsilon^{ij}{}_{,\mu}$ is inhomogeneous in the sense that it
contains $\chi$ rather than $\chi_{,\mu}$, just like the second term of (2.4),
but this is not true of the last term. Correspondingly, the
transformation law (4.8) of $\chi_{|\mu}$ is already homogeneous. This means
that to obtain an expression $\chi_{;k}$ transforming according to (4.6) we
should add to $\chi_{|\mu}$ not a term in $\chi$ but rather a term in
$\chi_{|\mu}$ itself. In other words, we can merely

equivalent, then the modified Lagrangians $\mathfrak{L}_1$ and $\mathfrak{L}_2$
are not necessarily equivalent. Consider for example the Lagrangian for a
real scalar field written in its first-order form

$L_1 = \pi^k\phi_{,k} - \tfrac{1}{2}\pi_k\pi^k - \tfrac{1}{2} m^2\phi^2$. (4.12)

This is equivalent to

$L_2 = -\pi^k{}_{,k}\phi - \tfrac{1}{2}\pi_k\pi^k - \tfrac{1}{2} m^2\phi^2$, (4.13)

but the corresponding modified Lagrangians differ by

$\mathfrak{L}_1 - \mathfrak{L}_2 = \mathfrak{h}(\pi^k\phi)_{;k}$ (4.14)
extending the dehition of covariant derivative one can This quantity is expressible in the form
evaluate the commutator X ; k [ - X ; [ k . However, this
quantity is not simply obtained by multiplying - 2 (&/aA
6 Pa>. S - fils -3 (aSo/aAm n u . . u ) j i j m n d u,
requirement that the covariant derivatives of the therefore equal t o the Christoffel connection OP,,.
vierbein components should vanish, (This is the analog for world tensors of OA 'j,.) Then
is symmetric, and Eq. (6.12) yields Einstein's familiar
hi'; +O, bi,:+0. (6.6) equations for empty space,
For a generic quantity a transforming according to
Zu=0.
6a= #Ciisij(y+ oJApU, (6.7)
However, when matter is present, P,,is"no longer
the covariant derivative is defined by2l symmetric, and its antisymmetric part is given by
(6.13). Then the tensor 4, is also nonsymmetric, and
,+
u:,,=a, $A 'jSip+r A , A @ a , (6.8) correspondingly the energy tensor density Zkuis in
whereas the e covariant derivative defined in Sec. 4 is general nonsymmetric, because hrfi does not appear in
obtained by simply omitting the last term of (6.8). I only through the symmetric combination gr'. Thus
Note that the two derivatives are equal for purely local the theory M e r s slightly from the usual one, in a way
tensors or spinors, but not otherwise. One easily finds first nofed by W e ~ lIn . ~the following section, we shall
that the commutator of two covariant differentiations investigate this difference in more detail.6
is given by Finally, we can rewrite the covariant conservation
laws in terms of world tensors. It is convenient to define
a:,v-a; ~Rii,Sijcu+RPu,4puU-CA,~;A, the contraction
where Rpu,,, and CAB.are defined in the usual way in CfiGC'pA,
terms of Rijrv and C$t. They are both world tensors,
and can easily be expressed in terms of Pry, in the formL4 since the covariant divergence of a vector density f'
is then
~~~,.=r~~,.~-r~.,.,-r~~ , r ~ . , + r ~ , r ~ ~ ~P;,=
(6.9) , f'.,+C,ffi. (6.14)
c',;,= rA,,-rAuP (6.10)
The conservation laws become
Thus one sees that RP.,,, is just the Riemann tensor
formed from the affine connection I'A,,. %',;,-C,%u,,+Ca,~u~= )Rpu,u@w,
From (6.6) it follows that SFw; , -c,ew=2w-?.p.
P =0, (6.11) I t may be noticed that these are slightly more com-
plicated than the expressions in terms of the a covariant
so that it is consistent to interpret FAfiu
as an &ne con- derivative.
nection in the Riemannian space. However, the de-
finition (6.5) evidently does not guarantee that it is 7. COMPARISON WITH METRIC THEORY
symmetric, so that in general it is not the Christoffel
connection. The curvature scalar R has the usual form For simplicity, we shall assume in this section that L
is only of first order in the derivatives, so that (5.10)
R- R*,, RNu=Ra&up is an explicit solution for A'S. The difference between
so that the free gravitational Lagrangian is just the the theory presented here and the usual one arises
usual one except for the nonsymmetry of It should because we are using a Lagrangian l o of first order, in
be remarked that it would be incorrect to treat the 64 which h p and A", are independent variables. The situ-
components of rA,,as independent variables, since ation is entirely analogous to that which obtains for
there are only 24 components of Aij,,. In fact the Yap, any theory with "derivative" interaction. In first-order
are restricted by the 40 identities (6.11). Thus there is form, the "momenta" Aij, are not just equal to deri-
no contradiction with the well-known fact that the vatives of the "coordinates" h p , or in other words to
first-order Palatini Lagrangian with nonsymmetric Far, OAij,. Thus an interaction which appears simple in
does not yield (6.11) as equations of motion."6 first-order form will be more complicated if a second-
The equations of motion (5.5) and (5.6) can be order Lagrangian is used, and vice versa.
rewritten in the form The second-order form of the Lagrangian may be
obtained by substituting for Aij, the expression (5.10).
@(&I"- 3g,P) = -%I", (6.12) This gives
@Ca,,= 6',tu-#S',SppU- i&A.SP,p. (6.13) I'=?$+Bo+'I,
From Eqs. (6.10) and (6.13) one sees that in the absence where O$ and O v a are obtained from and by replacing
of matter the a!ine connection P,, is symmetric, and A'S by OAij,, (or equivalently rA,,by and l? is
UThis is a generalization to nonsymmetric affinities of the an additional term quadratic in Skij, namely,
result proved in the appendix to footnote 3. See also footnotes 4
and 5. ' ~ = Q ~ ( 2 S i j k S j L i - S i j ~ j r + 2 S i i ~ (7.1)
jj~).
f l See for instance E. Schradinger, Space-time Structwe (Cam-
bridge University Press,New York, 1950). In this Lagrangian, only h k p and x are treated as inde-
pendent variables. The equations of motion are equivalent to those
previously obtained if the variables $A^{ij}{}_\mu$ are eliminated from the
latter by using (5.10).

The usual metric theory, on the other hand, is given by the Lagrangian

$\mathfrak{L}'' = {}^0\mathfrak{L} + {}^0\mathfrak{L}_0$,

without the extra terms (7.1). If this Lagrangian were written in a
first-order form by introducing additional independent variables
$A^{ij}{}_\mu$, then one would arrive at a form identical to the one given
here except for the appearance of extra terms equal to (7.1) with a
negative sign.

Thus we see that the only difference between the two theories is the
presence or absence of these direct-interaction terms. Now if we had not
set $\kappa = 1$, then $\mathfrak{L}_0$ would have a factor $\kappa^{-1}$, whereas
the terms (7.1) would appear with the factor $\kappa$. They are, therefore,
extremely small in comparison to other interaction terms. In particular,
for a Dirac field, they would be proportional to (see Appendix)

$\kappa\,\bar\psi\gamma_5\gamma_k\psi\,\bar\psi\gamma_5\gamma^k\psi$.

Thus they are similar in form to the Fermi interaction terms, but much
smaller in magnitude, so that it seems impossible that they would lead to
any observable difference between the predictions of the two theories.
Hence we must conclude that for all practical purposes the theory
presented here is equivalent to the usual one.

ACKNOWLEDGMENTS

The author is indebted to Drs. J. L. Anderson, P. W. Higgs, and D. W.
Sciama for helpful discussions and comments.

APPENDIX

In this appendix we shall discuss the remaining ambiguity in the
modified Lagrangian. It was pointed out in Sec. 4 that the generally
covariant Lagrangians obtained from two equivalent Lagrangians $L_1$ and
$L_2$ are in general inequivalent. One can now see that in fact they
differ by a covariant divergence. Thus (4.14) can be written in the form

but in view of (6.14) this is not equal to the ordinary divergence. It is
clear that quite generally changing L by a divergence must change
$\mathfrak{L}$ by the covariant divergence of a quantity which is a vector
density under coordinate transformations, and invariant under all other
transformations. This is the reason for the difference between this case
and that of the linear transformations of Sec. 2.

We now wish to investigate the possibility of choosing a criterion which
will select a particular form of L, and thus specify $\mathfrak{L}$
completely. There does not seem to be any really compelling reason for
one choice rather than another, but there are plausible arguments for a
particular choice.

The most obvious criterion would be to require that the Lagrangian should
be written in the symmetrized first-order form suggested by Schwinger,26
which in the case of the scalar field discussed in Sec. 4 is

$L = \tfrac{1}{2}(L_1 + L_2)$.

This corresponds to treating $\phi$ and $\pi^k$ on a symmetrical footing.
However, this may not in fact be the correct choice, because for some
purposes $\phi$ and $\pi^k$ should not be treated in this way. In fact, the
two Lagrangians differ in one important respect: $\mathfrak{L}_1$ is
independent of $A^{ij}{}_\mu$, whereas $\mathfrak{L}_2$ is not.
Correspondingly, for $L_1$ the quantity $S^k{}_{ij}$ vanishes, whereas for
$L_2$ one finds

$S^k{}_{ij} = (\delta^k_i\pi_j - \delta^k_j\pi_i)\phi \neq 0$.

The conservation laws in the two cases are of course the same, because
the quantities $T^k{}_i$ also differ. Now the tensor $S^k{}_{ij}$ has often
been interpreted as the spin density,8 so that the two cases differ with
regard to the separation of the total angular momentum into orbital and
spin terms. The scalar field is normally regarded as a field of spinless
particles, so that one would naturally expect $S^k{}_{ij}$ to vanish. This,
therefore, furnishes a possible criterion, which would select $L_1$
rather than $L_2$. With this choice, a preferred position is assigned to
the wave function $\phi$ rather than the momenta $\pi^k$, and the
derivatives are written on $\phi$ only. In this way one achieves a
vanishing spin tensor, because the matrices $S_{ij}$ are zero for the
scalar field $\phi$, but not for the vector $\pi^k$. It may be noticed that
$L_1$ is automatically selected if one writes the Lagrangian in its
second-order form in terms of $\phi$ only:

$L_1' = \tfrac{1}{2}\phi_{,k}\phi^{,k} - \tfrac{1}{2} m^2\phi^2$,

which yields the modified Lagrangian

$\mathfrak{L}_1' = \tfrac{1}{2}\mathfrak{h}(g^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - m^2\phi^2)$,

equivalent to $\mathfrak{L}_1$.27 This should be contrasted with the
second-order form of $\mathfrak{L}_2$, which is

and clearly differs from $\mathfrak{L}_1'$ by a covariant divergence.

This seems to be a reasonable criterion, but the arguments for it cannot
be regarded as conclusive. For, although it is true that the spin tensor
obtained from $L_2$ is nonzero, it is still true that the three
space-space components of the total spin

$S_{ij} = \int d^3x\,\mathfrak{S}^0{}_{ij}$

are zero. Thus $L_1$ and $L_2$ differ only in the values of the spin part
of the (0 i) components of angular momentum. Indeed, one easily sees that
it is true in general that adding a divergence to L will change only the
(0 i) components of $S_{ij}$. Since it is not at all clear what
significance should be attached to the separation of these components
into orbital and spin terms, it might be questioned whether one should
expect the spin terms to vanish even for a spinless particle. Even so,
the choice of $L_1$ seems in this case to be the most reasonable.

For a field of spin 1, the corresponding choice would be

$L_1 = -\tfrac{1}{2} f^{ij}(a_{i,j} - a_{j,i}) + \tfrac{1}{4} f^{ij} f_{ij} + \tfrac{1}{2} m^2 a_i a^i$,

which is again equivalent to the choice of the second-order Lagrangian in
terms of $a_i$ only. It yields

which is a reasonable definition of the spin density.28 The modified
Lagrangian may be expressed in terms of the world vector $a_\mu$ as

$\mathfrak{L} = -\tfrac{1}{4}\mathfrak{h}\,g^{\mu\rho}g^{\nu\sigma}(a_{\mu;\nu} - a_{\nu;\mu})(a_{\rho;\sigma} - a_{\sigma;\rho}) + \tfrac{1}{2}\mathfrak{h}\,m^2 g^{\mu\nu}a_\mu a_\nu$. (A.1)

It should be noticed that the electromagnetic Lagrangian is not obtained
simply by putting m = 0 in (A.1). The difference is that the derivatives
in (A.1) are covariant derivatives, and since $\Gamma^\lambda{}_{\mu\nu}$ is
nonsymmetric the covariant curl is not equal to the ordinary curl (though
both are of course tensors). In fact, (A.1) with m = 0 would not be gauge
invariant. The reason for the difference is that $a_i$ is here treated
simply as a component of $\chi$, whereas $A_\mu$ is introduced along with
the gravitational variables to ensure gauge invariance.29

For a spinor field $\psi$, symmetry between $\psi$ and $\bar\psi$ appears to
demand that one should choose the symmetrized Lagrangian

$L = \tfrac{1}{2}(\bar\psi i\gamma^k\psi_{,k} - \bar\psi_{,k} i\gamma^k\psi) - m\bar\psi\psi$,

which yields the spin density

Since the Lagrangian must be Hermitian, one could not write the
derivative on $\psi$ alone. There remains, however, another possible
choice: We could introduce a distinction between the left- and
right-handed components, $\psi_\pm = \tfrac{1}{2}(1 \pm i\gamma_5)\psi$, treating one
of them like $\phi$ and the other like $\pi^k$. This gives the Lagrangian

This form of Lagrangian may seem rather unnatural, but it should be
mentioned because there are other grounds for treating $\psi_+$ and
$\psi_-$ on a nonsymmetrical footing.30

26 J. Schwinger, Phys. Rev. 91, 713 (1953).
27 Here $\mathfrak{L}_1$ is a linearization of $\mathfrak{L}_1'$ in the sense of
T. W. B. Kibble and J. C. Polkinghorne, Nuovo cimento 8, 74 (1958).
28 Compare footnote 18.
29 This has the rather strange consequence that for the electromagnetic
field the spin tensor $\mathfrak{S}^k{}_{ij}$ vanishes, since the Lagrangian is
independent of $A^{ij}{}_\mu$.
30 See R. P. Feynman and M. Gell-Mann, Phys. Rev. 109, 193 (1958).
Chapter 5
Accelerated Frames:
ON HOMOGENEOUS
GRAVITATIONAL FIELDS IN THE
GENERAL THEORY OF RELATIVITY
AND THE CLOCK PARADOX
BY
C. MØLLER
KØBENHAVN
I KOMMISSION HOS EJNAR MUNKSGAARD
1943
$d\tau = dt\,\sqrt{1-u^2}$ (3)
connecting the proper time $d\tau$ of a clock moving with the
velocity u in a given system of reference with the time dt of
this system is valid only if the frame of reference is a system
of inertia like K. The application of (3) in K thus leads to
the correct formula (1), while the application of (3) in k which
leads to formula (2) is not justified, since k is accelerated in
the middle of the experiment and, therefore, does not constitute
a simple system of inertia during this interval.
In the space-time continuum introduced by MINKOWSKI, the
two events marked by the first and second encounters of the
clocks are represented by two points connected by the world
lines of C1 and C2, of which the first mentioned is a straight
line. Since the lengths of these world lines, on account of (3),
are proportional to the proper times $\Delta t_1$ and $\Delta t_2$ of the two
clocks, the statement expressed by (1) may be considered a
special case of the general statement that a straight line
connecting two points in Minkowski space is of greater length than
any other curve (of everywhere time-like character) connecting
the two points.
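The geometrical statement above can be illustrated numerically. The following is a minimal sketch, assuming units with c = 1 and a world line approximated by straight segments between events (T, X); the function name and the sample events are hypothetical and serve only as an illustration.

```python
import math

def proper_time(events):
    """Proper time along a piecewise-straight world line.

    `events` is a list of (T, X) pairs in a system of inertia, ordered in
    time. Each segment contributes sqrt(dT^2 - dX^2), which is formula (3)
    integrated over a stretch of constant velocity u = dX/dT.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        if abs(dx) >= dt:
            raise ValueError("segment is not time-like and future-directed")
        total += math.sqrt(dt * dt - dx * dx)
    return total

# Clock at rest between the two encounters (straight world line).
straight = [(0.0, 0.0), (10.0, 0.0)]
# Clock moving out with u = 0.6 and back with u = -0.6 (bent world line).
bent = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]

print(proper_time(straight), proper_time(bent))
```

The straight world line gives the larger proper time, in line with the statement that it is the longest time-like curve between the two events.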
Thus, it was clear that the discussion of the indicated ex-
periment could not lead to any difficulties for the special theory
of relativity, since this theory does not make any statement at
Nr. 19 5
all regarding the behaviour of clocks in accelerated systems
like k. The paradox arose again, however, in the general theory
of relativity, according to which a treatment of the behaviour
of C1, from the point of view of an observer in k, must be
possible. Neglecting the short interval during which k is no
system of inertia, we then find again the formula (2) for the
time increase of C1 measured with the time scale of k and, at
first sight, it is difficult to understand how it is possible to
account for the difference between (2) and (1) by consideration
of the short interval in which k is accelerated. The whole
question was clarified by EINSTEIN who pointed out that,
during this interval, the distant masses of the universe are
accelerated relative to k, and thus temporarily create a
gravitational field which influences the time rates of the clocks in
such a way that the total time increase of C1 measured in the time
scale of k is again given by (1).
In his paper just quoted, EINSTEIN did not give any explicit
calculations, but it is clear beforehand that the result of a
calculation must be as stated above. In fact, since $\Delta t_1$ and $\Delta t_2$
are proportional to the lengths of the world lines of C1 and C2
and these lengths, according to the basic assumptions of the
general theory of relativity, are independent of the space-time
coordinates used in their evaluation, it is obvious that we shall
get the same value for $\Delta t_1/\Delta t_2$ whether the calculation is
performed in K or in k. Nevertheless, it is instructive to calculate
directly the time increase of C1 during the existence of the
gravitational field in k. For small values of u, this has been done by
TOLMAN who assumed that terms in u higher than the second
can be neglected. In order to account for the lack of symmetry
between the treatment given to the clock C1, which was at no
time subjected to any force, and that given to the clock C2,
which was subjected to the force F in the middle of the
experiment, TOLMAN introduces a temporary homogeneous gravitational
field in the description where C1 is taken as the moving clock
and C2 as the one which remains at rest. This gravitational field
is allowed to act on C1 and C2 in such a way as to produce the
desired change in velocity of C1, while C2 remains at rest on
account of the force F. By means of the well-known formula
for the relative rates of two clocks situated at points of different
potential in a weak static gravitational field, TOLMAN then finds
for the total increase in time of C1 and C2 during the considered
experiment the relation
Taking for f and h the expressions
$f = X - \tfrac{1}{2} g T^2, \quad h = T$. (6)
$g_{11} = g_{22} = g_{33} = 1, \quad g_{14} = g_{41} = gt, \quad g_{44} = -(1 - g^2 t^2)$ (10)
in accordance with the fact that the measuring rods in k are
subjected to a Lorentz contraction.
The gravitational field in the frame of reference defined by
(6) has, therefore, not much resemblance with the gravitational
fields assumed in the previous discussions of the clock paradox.
Our first task will be, if possible, to choose the functions f and
h in (5) in such a way that the gravitational field in k is
static. The expression for the element of interval in the new
coordinates will then be of the form
$ds^2 = A\,dx^2 + dy^2 + dz^2 - D\,dt^2$, (13)
where A and D are functions of x only. This expression may
be further simplified by taking as coordinate $\int\sqrt{A}\,dx$ instead of
x so that the line element takes the form
$ds^2 = dx^2 + dy^2 + dz^2 - D\,dt^2$. (13')
If the desired transformation is at all possible, the functions
$g_{ik}$ defined by (9) and (13') must satisfy EINSTEIN's field
equations for an empty space
$R_i^k - \tfrac{1}{2}\delta_i^k R = 0$, (14)
$D = a(1+gx)^2$ (16')
containing two arbitrary constants, a and g.
By adequate choice of the time variable, the constant a may
be made equal to one, giving for the line element (13') the
expression
$ds^2 = dx^2 + dy^2 + dz^2 - (1+gx)^2\,dt^2$. (17)
RL,,,, = 0,
Q@ = - 1 - 2 g x : . (1 9 )
and the Newtonian gravitational potential am,which, in the case
of "weak" fields, is defined by the equation")
field postulated in previous discussions of the clock paradox.
The only necessary restriction regarding the values of x is the
condition $x > -1/g$.
The geometry of physical space in k is Euclidean, x, y, and
z being Cartesian coordinates. The time variable t is the time
measured by a standard clock situated at rest at the origin x = 0.
The increase of time $d\tau$ of a standard clock situated at any
other place is given by the formula
$d\tau = \sqrt{-g_{44}}\,dt = (1+gx)\,dt$, (22)
where $d\tau = \frac{1}{i}\,ds$ is the proper time of the particle and
$\Gamma^i_{kl}$ denote the ordinary Christoffel three index symbols. The values
of $\Gamma^i_{kl}$ in the case of (17) may also be taken from DINGLE's
paper, and we get
$\frac{d^2x}{dt^2} = -g(1+gx), \quad \frac{d^2y}{dt^2} = \frac{d^2z}{dt^2} = 0$.
If the gravitational potential $\Phi$ is defined by the equation
$\frac{d^2\vec{x}}{dt^2} = -\operatorname{grad}\Phi, \quad \vec{x} = (x, y, z)$,
we thus get
$\Phi = gx + \tfrac{1}{2} g^2 x^2$,
$d\tau = \sqrt{1+2\Phi}\,dt$
which, for small $\Phi$, reduces to the well-known formula6) for
weak fields
$d\tau = (1+\Phi)\,dt$. (29)
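The relation between the exact clock rate in the field (17) and the weak-field formula (29) can be checked with a few lines of arithmetic. This is a sketch only, in units with c = 1; the function names are hypothetical, and the exact rate is taken as the square root of 1 + 2Φ, with Φ the potential found above.

```python
import math

def potential(g, x):
    """Gravitational potential Phi = g x + (1/2) g^2 x^2 in the frame k."""
    return g * x + 0.5 * g * g * x * x

def clock_rate_exact(g, x):
    """d(tau)/dt = sqrt(1 + 2 Phi) for a standard clock at rest at x."""
    if 1.0 + g * x <= 0.0:
        raise ValueError("x must satisfy x > -1/g")
    return math.sqrt(1.0 + 2.0 * potential(g, x))

def clock_rate_weak(g, x):
    """Weak-field approximation d(tau)/dt = 1 + Phi, cf. (29)."""
    return 1.0 + potential(g, x)

# For g x = 0.2 the exact rate coincides with 1 + g x, as in (22).
print(clock_rate_exact(0.1, 2.0), clock_rate_weak(0.1, 2.0))
```

The two rates differ by a term of second order in Φ, so the weak-field formula is adequate only when g x is small compared with one.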
From (25) and (24) we then get
dz 1 +gx
and
9
1 + gx
which may also be written
ds
-
dca (1 +gx)a = -2ga.
1
x ' = -{V(l +'gzo)a-g~z$- l}. (33)
9
Introduction of (33) into (31) gives
y = Y, z = Z
$t = \frac{1}{2g}\,\ln\frac{1+gX+gT}{1+gX-gT}$
Y=y,Z=z
9
r = -1 +gx' (37)
For later use, we also write down the Lorentz transformation
connecting the space-time coordinates of two systems of
inertia with the relative velocity u
$x = \frac{X - X_0 - u(T - T_0)}{\sqrt{1-u^2}}, \quad y = Y, \quad z = Z, \quad t - t_0 = \frac{T - T_0 - u(X - X_0)}{\sqrt{1-u^2}}$. (39)
In (39), the space and time variables have been chosen in such
a way that the origin x = 0 of the system k at the time $t = t_0$
corresponds to the coordinate $X = X_0$ and the time $T = T_0$ in K.
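The transformation (39) is easily put into executable form. The sketch below assumes units with c = 1 and omits the trivial y and z variables; the function names are hypothetical. It checks the stated choice of origin and the invariance of the interval between two events.

```python
import math

def lorentz(X, T, u, X0=0.0, T0=0.0, t0=0.0):
    """Return (x, t) in k for the event (X, T) in K, following (39)."""
    if not -1.0 < u < 1.0:
        raise ValueError("relative velocity must satisfy |u| < 1")
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    x = gamma * ((X - X0) - u * (T - T0))
    t = t0 + gamma * ((T - T0) - u * (X - X0))
    return x, t

def interval(dx, dt):
    """Squared interval dx^2 - dt^2 between two events on the x-axis."""
    return dx * dx - dt * dt

# The event (X0, T0) in K is the origin x = 0 of k at the time t = t0.
print(lorentz(2.0, 3.0, 0.6, X0=2.0, T0=3.0, t0=1.5))
```

Because the offsets X0, T0 and t0 only shift the coordinates, differences between events transform by the homogeneous Lorentz transformation and the interval is unchanged.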
where the constant g is connected with the force F and the rest
mass m of C2 by the relation
F = mg. (41)
According to (40), the velocity $u = dX/dT$ is given by
[Fig. 1.]
The corresponding relation between T'' and $\tau''$ is, according
to the well-known formula from the special theory of relativity,

$u = \tanh g\tau'$
$\frac{1}{\sqrt{1-u^2}} = \cosh g\tau'$.
Now, let $\Delta'_K t_1$ denote the number of divisions registered by
C1 during the travel of C2 from O to B, as judged by an
observer in K, and let $\Delta''_K t_1$ be the corresponding number during
the period of uniform motion of C2 from B to A. We then have

and, for the total time elapsed between the two encounters of
C1 and C2, measured by C1 and C2, respectively, we get
$\Delta t_1 = 2(\Delta'_K t_1 + \Delta''_K t_1) = 2(T' + T'')$
$\Delta t_2 = 2(\tau' + \tau'')$. (48)
$L' = \frac{1}{g}\left(\sqrt{1+g^2T'^2} - 1\right) = \frac{1}{g}\left(\frac{1}{\sqrt{1-u^2}} - 1\right)$ (49)
while, obviously,
$L'' = uT''$.
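The quantities entering the calculation from the standpoint of K can be evaluated together. The sketch below is an illustration in units with c = 1, not part of the original treatment. It assumes that g, u and T'' are given; it uses $gT' = \sinh g\tau'$, which follows from (49) together with the relation $1/\sqrt{1-u^2} = \cosh g\tau'$, and $\tau'' = T''\sqrt{1-u^2}$ from formula (3). The function and key names are hypothetical.

```python
import math

def k_frame_totals(g, u, T_uniform):
    """Times and distances of the clock experiment as judged in K.

    g          -- acceleration constant of the force period
    u          -- velocity of C2 during the uniform motion (0 < u < 1)
    T_uniform  -- duration T'' of one period of uniform motion, in K
    """
    if g <= 0.0 or not 0.0 < u < 1.0 or T_uniform < 0.0:
        raise ValueError("require g > 0, 0 < u < 1 and T'' >= 0")
    tau_acc = math.atanh(u) / g                      # tau', from u = tanh(g tau')
    T_acc = math.sinh(g * tau_acc) / g               # T'
    L_acc = (math.cosh(g * tau_acc) - 1.0) / g       # L', cf. (49)
    L_uniform = u * T_uniform                        # L''
    tau_uniform = T_uniform * math.sqrt(1.0 - u * u) # tau''
    return {
        "tau_prime": tau_acc,
        "T_prime": T_acc,
        "L_prime": L_acc,
        "L_double_prime": L_uniform,
        "tau_double_prime": tau_uniform,
        "delta_t1": 2.0 * (T_acc + T_uniform),       # total reading of C1
        "delta_t2": 2.0 * (tau_acc + tau_uniform),   # total reading of C2
    }

result = k_frame_totals(g=1.0, u=0.6, T_uniform=4.0)
print(result["delta_t1"], result["delta_t2"])
```

For any admissible input the reading of C2 comes out smaller than that of C1, since $\tau' < T'$ and $\tau'' < T''$.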
arbitrarily. In the previous discussions of the clock paradox, it
has, however, tacitly been assumed that k should be a rigid
frame of reference. According to the considerations in Section 2,
it is then clear that the transformation connecting the
space-time variables of K and k must be given by (35) during the
accelerated motion of C2 from B to O and back. The motion of
the origin x = 0 relative to K is then, on account of (36),
identical with the motion of C2 given by (40), and the time
variable t is simply the proper time of the clock C2.
For all events satisfying the conditions $-\tau' < t < \tau'$, the
connection between the coordinates of K and k is, thus, given by
(35). For $t > \tau'$, the system k is a simple system of inertia, and
the corresponding space-time transformation is obtained from
(39) by putting
$t_0 = \tau'$, $T_0 = T'$, and $X_0 = L'$. (51)
Similarly, we have for $t < -\tau'$ the transformations (39) with
reversed signs of u, $\tau'$ and T'.
In the following, we shall use the equations (35) and (39)
in a somewhat different form. Solving the last equation (35)
with respect to gT and introducing into the first equation, we
get, if we omit the trivial transformations of the y and z variables,
} (52)
for
gT = (1 + gx) tgh gz
1 + gX = (1 + 8) cosh gz
which, by means of (43), (46), a n d (49) may be written
T = u(X-L)+T
On the other hand, $(\partial X/\partial T)_x$ is equal to u and -u for $t > \tau'$ and
$t < -\tau'$, respectively, which, on account of (46), is seen to be
in accordance with (54) for t equal to $\tau'$ and $-\tau'$.
While, thus, the velocities of the different points of k vary
continuously, it is clear that the accelerations must be
discontinuous for $t = \tau'$ and $-\tau'$, since the force F is assumed to set
in abruptly. This is also the reason for the sudden change in
the gravitational potential from the value zero to the value
given by (26) at these moments.
The system k defined by (52), (53), and (53) thus seems
to be the most natural frame of reference to be used in the
discussion of the clock paradox. The applicability of this system
of coordinates is only restricted by the condition that (38) must
be satisfied for $-\tau' < t < \tau'$, i.e. for
$-u(1+gX) < gT < u(1+gX)$, (55)
on account of (52) and (46). Since u is smaller than one, a
comparison of (38) and (55) shows that this condition is
satisfied for all events which take place at points $X > -1/g$.
b. We shall now treat the problem from the point of view
of an observer in k, according to which C2 is permanently
situated at rest at the origin o of k, while C1 at the beginning is
travelling with constant velocity u. The first encounter between
C1 and C2 takes place at the time $t = -\tau' - \tau''$. At $t = -\tau'$, C1
has arrived at a point b on the positive x-axis with the
coordinate x = l''.
[Fig. 2.]
During the time $-\tau' < t < \tau'$, C1 is subjected to the
gravitational field which brings it to rest at the time t = 0 at
a point a in the distance l' from b, and starts it back to o with
reversed motion. In spite of this gravitational field everywhere
present during this period, C2 remains at rest on account of the
force F which just counterbalances the gravitational force.
The behaviour of the clock C1 is now simply obtained from
(52) and (53) if we remember that the X-coordinate of C1 has
the constant value X = L' + L''.
From the second equation (53) we then get
$l'' = L''\sqrt{1-u^2}$, (56)
since l'' is the value of x for $t = \tau'$.
Further, since the x-value of $C_1$ at $t = 0$ is $l' + l''$, we get
from the second equation (52)
This formula is also easily obtained from the first equation (53)
if we remember that $\Delta'' t_1$ is the increase in T for $X = L' + L''$
during the interval $\tau' < t < \tau' + \tau''$. On account of (45), we may
also write
$\Delta'' t_1 = T''(1-u^2)$. (61)
$\Delta t_1 = 2(\Delta' t_1 + \Delta'' t_1) = 2(T' + T'')$
$\Delta t_2 = 2(\tau' + \tau'')$
in accordance with the expressions (48) derived from the
standpoint of an observer in K.
It is interesting to note that /itl remains finite in the limit-
ing case of very large forces F, where TI, T', and dKff vanish,
since Aktl in (59) contains a term which only depends upon u
and 7'. It is just this term which is essential for the solution
of the clock paradox.
Since $\Delta t_2$ in any case is smaller than $\Delta t_1$, and $\Delta'' t_1$,
according to (60), is smaller than $\tau''$, $\Delta' t_1$ must be greater than $\tau'$,
i.e. the clock $C_1$ goes faster than $C_2$ during this period. From
the point of view of an observer in k, the reason for this
difference in rate is to be sought mainly in the difference in
gravitational potential $\Phi$ at the places of the two clocks. The
behaviour of $C_1$, however, will in general not be like that of a
clock at rest at the point $x = l' + l'' = L' + L''$, even if $T'$ and
$\tau'$ are made small by use of a large force F. In fact, the number
of divisions registered by a clock at rest during the time d t = zt
is, according to (26). (28), or (ZL),given by
$d\tau = dt\sqrt{1 + 2\Phi - u^2}$
for the proper time of a particle moving with velocity u in the
gravitational potential $\Phi$. This general formula, which comprises
the special formulae (3) and (28), clearly shows that $\Delta' t_1$ in
general must be smaller than $(\Delta' t_1)_0$, since $C_1$ during the time
in question falls freely with increasing velocity from the place
$x = L' + L''$ towards smaller values of x, i.e. smaller values
of the potential $\Phi$.
Only in the case $u \ll 1$ considered by TOLMAN, where $\tanh g\tau'$
is equal to $g\tau'$, apart from terms of the third order in u (cf.
(46)), is it allowed to treat $C_1$ as a clock at rest during the
period of acceleration, since the difference between $\Delta' t_1$ and
$(\Delta' t_1)_0$ is then of higher order in u. Even in this case, where
gt may be treated as a small quantity, the equations (52),
however, do not reduce to the transformations given by (6). If
we neglect terms of higher order in gt, we obtain instead
$X = x + \tfrac{1}{2} g t^2 (1 + gx)$.
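The size of the neglected terms can be checked numerically. The sketch below assumes the constant-g transformation in the form $1 + gX = (1 + gx)\cosh gt$ (the form to which the general transformation of the next section reduces for constant g); the function names are illustrative only and do not come from the paper.

```python
import math

def X_exact(x, t, g):
    """X from 1 + gX = (1 + gx) cosh(gt), assumed constant-g form."""
    return ((1.0 + g * x) * math.cosh(g * t) - 1.0) / g

def X_small_gt(x, t, g):
    """Approximation keeping terms up to second order in gt."""
    return x + 0.5 * g * t * t * (1.0 + g * x)

if __name__ == "__main__":
    g, x = 1.0, 0.5
    for t in (0.05, 0.1, 0.2):
        err = X_exact(x, t, g) - X_small_gt(x, t, g)
        # The leading neglected term is (1 + gx) g^3 t^4 / 24.
        print(t, err, (1.0 + g * x) * g**3 * t**4 / 24.0)
```

Doubling t multiplies the error by roughly sixteen, as expected for a remainder of fourth order in gt.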
$$X = x\cosh\theta + \int_0^t \sinh\theta\,dt, \qquad Y = y, \qquad Z = z,$$
$$T = x\sinh\theta + \int_0^t \cosh\theta\,dt,$$
with
$$\theta(t) = \int_0^t g(t)\,dt. \qquad (64)$$
It is easily verified by means of direct calculation that the line
element (7) is really brought into the form (17) by the
transformation (64). Further, we see that the equations (64) in the
case of constant g reduce to the equations (52) which are
equivalent to (35). On the other hand, if g is assumed to be finite
and constant for $-\tau' < t < \tau'$ and zero for all other times, (64)
leads to the transformations (52) and (53) used in the
discussions of Section 3.
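The reduction to the constant-g case can also be checked numerically. The following is a minimal sketch, assuming that (64) reads $X = x\cosh\theta + \int_0^t \sinh\theta\,dt$, $T = x\sinh\theta + \int_0^t \cosh\theta\,dt$ with $\theta = \int_0^t g\,dt$; the integrals are done with a simple trapezoidal rule, and all function names are illustrative rather than taken from the paper.

```python
import math

def moller_transform(x, t, g, steps=20000):
    """Map (x, t) in k to (X, T) in K for an acceleration function g(t).

    Integrates theta = int g dt, int sinh(theta) dt and int cosh(theta) dt
    from 0 to t with the trapezoidal rule. Returns (X, T, theta).
    """
    h = t / steps
    theta = int_sinh = int_cosh = 0.0
    for i in range(steps):
        s0 = i * h
        theta_next = theta + 0.5 * h * (g(s0) + g(s0 + h))
        int_sinh += 0.5 * h * (math.sinh(theta) + math.sinh(theta_next))
        int_cosh += 0.5 * h * (math.cosh(theta) + math.cosh(theta_next))
        theta = theta_next
    X = x * math.cosh(theta) + int_sinh
    T = x * math.sinh(theta) + int_cosh
    return X, T, theta

def constant_g_closed_form(x, t, g0):
    """Closed form for constant g: gT = (1+gx) sinh gt, 1+gX = (1+gx) cosh gt."""
    X = ((1.0 + g0 * x) * math.cosh(g0 * t) - 1.0) / g0
    T = (1.0 + g0 * x) * math.sinh(g0 * t) / g0
    return X, T

def fixed_point_velocity(x, t, g, dt=1e-3):
    """Central-difference estimate of dX/dT at fixed x."""
    X1, T1, _ = moller_transform(x, t - dt, g)
    X2, T2, _ = moller_transform(x, t + dt, g)
    return (X2 - X1) / (T2 - T1)

if __name__ == "__main__":
    g0, x, t = 0.5, 0.3, 1.2
    print(moller_transform(x, t, lambda s: g0)[:2])
    print(constant_g_closed_form(x, t, g0))
    print(fixed_point_velocity(x, t, lambda s: g0), math.tanh(g0 * t))
```

Passing a different callable for `g` (for example one that vanishes outside an interval) gives the piecewise case without any change to the integration routine.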
When g is given as a function of t, the transformation (64)
and, consequently, also the motion of the origin of k with
respect to K is completely determined. Conversely, the function
g and the transformation (64) are uniquely determined by the
motion of the origin of k. Differentiating (64) by constant x,
we get
$$dX = \sinh\theta\,(1+gx)\,dt, \qquad dT = \cosh\theta\,(1+gx)\,dt. \qquad (65)$$
The velocity $U = (dX/dT)_x$ of a fixed point in k with respect to K
is, thus,
$$U = \tanh\theta, \qquad (66)$$
an equation which may be regarded as a generalization of (54).
Moreover, we get from (66), (65), and from the definition of $\theta$
in (64)
X = LY(T), (68)
$$g = \frac{d}{dT}\left(\frac{U}{\sqrt{1-U^2}}\right), \qquad t = \int_0^T \sqrt{1-U^2}\,dT,$$
which are easily derived from (67), (68), (66), and (65).
By means of the general equations (64), it is now easy to
treat the clock paradox for an arbitrary motion of the clock $C_2$
during the interval of acceleration. Since, however, the treatment
of this general case does not exhibit any essentially new features
as compared with the treatment of the special case discussed in
Section 3, we shall confine ourselves to the general remarks
already made in this section.
F. ROHRLICH (**)
University of Iowa - Iowa City, Ia.
L. WITTEN
RIAS - Baltimore, Md.
1. - Introduction.
general way the consequences of conformal relativity as they may affect
physical measurements. These questions are of interest in view of the conformal
invariance of the basic equations of classical physics (4-6). To the best of our
knowledge, no one has discussed the specific physical effects of conformal
relativity for a simple case, though very likely a number of physicists have thought
about and have been aware of the matters to be discussed in this paper. These
matters are very restricted in scope. We shall not consider all conformal
co-ordinate transformations but only those which take us from flat space to flat
space. In fact, we shall discuss in detail only one specific example of such a
transformation, the one responsible for transforming from an inertial co-ordinate
system to one which is accelerated uniformly with respect to it. Though this
transformation is a simple one, it does have physical significance. In the
one-dimensional case and for weak fields, its inverse represents a transformation
from a frame in which a particle falls freely in a static and uniform force field
to the rest frame of the particle (7).
In the next section, we discuss in detail the properties of our specific
transformation, its singularities and other peculiar features. In Section 3 we treat
various thought experiments relating to the variation of rest mass, length,
and time under a transformation to a uniformly accelerating co-ordinate frame.
We also discuss the so-called «twin paradox.» Finally, in Section 4, we deal
with a specific experiment: we calculate the frequency shift of the spectral
line of a freely falling emitter.
(1) H. WEYL: Sitzungsber. Preuss. Akad. Wiss. (1918), p. 465; Math. Zeits., 2,
384 (1918); Ann. Phys., 59, 101 (1919); Space, Time, and Matter (New York, 1950).
(2) A. EINSTEIN: Sitzungsber. Preuss. Akad. Wiss. (1921), p. 261.
(3) W. PAULI: Theory of Relativity (New York, 1958).
(2.1) x/@= P ( X . 2 ) .
The metric associated with S is $g_{\alpha\beta}(x)$. If the frame S' is a Riemann space
(i.e. we deal with a general relativistic co-ordinate transformation) the metric
in S' is given by
(2.2)
(2.3)
where
(2.4)
and
(2.5)
(2.6)
PHYSICAL CONSEQUENCES OF A CO-ORDINATE TRANSFORMATION ETC.
(2.7)
It follows that
(2.10)
where $\eta_{\mu\nu}$ is the Minkowski metric with signature +2. We shall consider
transformations to uniformly accelerating frames of reference. The Y - s of (2.1)
will represent an inversion in the origin, followed by translation by a constant
vector and then by a second inversion about the new origin (10). In other
words, we let (11)
(2.11) SB f
=-
P
(10) Such a transformation is discussed among other authors by T. FULTON and
F. ROHRLICH: Ann. Phys., 9, 499 (1960); S. A. BLUDMAN: Phys. Rev., 107, 1163 (1957).
(11) For any vector $A^\mu$, we define $A^2 = A_\mu A^\mu$:
(23)
t2.14)
where
(2.15)
(2.16) a($) = A - 2 ( 3 ) .
The transformation inverse to (2.11) and (2.12) is simple to exhibit. It too
must be an inversion in the origin, followed by a translation, this time in the
opposite sense, by the same constant vector $a^\mu$, and a second inversion:
(2.17)
where
and
(2.19) $\lambda'(x') = 1 + 2a_\mu x'^\mu + x'^2 a^2 = \dfrac{1}{\lambda(x)}$,
namely
(2.20) $a^\mu = \left(0;\ 0, 0, -\dfrac{g}{2}\right)$.
We observe that the transformation (2.11), (2.12), with $a^\mu$ defined as in
(2.20) does describe a uniformly accelerating frame of reference. This may
be made more obvious, if we do not set c = 1 and instead take the limit
$c \to \infty$. The conformal transformation in this limit becomes
(2.21) $\lim_{c\to\infty} \lambda(x) = 1$
(2.22)
I z=-$
gt2
1
with
(2.23) I. = 1 - $gat
The orbit of the particle, described by the observer at rest in the S' frame,
lies in the z'-t' plane and is hyperbolic, i.e. represents uniform acceleration:
(2.24)
Note that we have used only the transformation eqs. (2.11), (2.12) to
obtain (2.24), so it holds both for general relativistic co-ordinate transformations
and for conformal co-ordinate transformations. Not until we specify our metric
or, alternatively, the transformation of proper time, do we restrict ourselves
to one, or the other. Hyperbolic motion thus appears as a solution of either
a conformally (13,14) or a general relativistically covariant equation of motion
which reduces to the equation of motion for a free particle in the frame S.
The hyperbolic motion as seen by the observer at rest in S' would equally
well result if we considered S' to be an inertial frame in which the Lorentz
invariant equation of motion with constant rest mass and constant force held.
Thus we have three alternative and equivalent descriptions of the motion as
seen by the observer in S':
A) S' is a frame in which only a constant force is acting.
B1) S' is a frame in a Weyl space, uniformly accelerating with respect
to an inertial frame. This considers the transformation to be a conformal
co-ordinate transformation.
B2) S' is a frame in a Riemannian space uniformly accelerating with
respect to an inertial frame. This considers the transformation to be a general
relativistic co-ordinate transformation.
The principle of equivalence as usually considered regards the descriptions
A and B2 to be equivalent. We shall shortly discuss the equivalence of the
descriptions B1 and B2 for the simple class of transformations (2.11) and (2.12)
that we are considering. Hence, the principle of equivalence can be regarded
as an equivalence between A and B1 or between A and B2. For the case of
uniformly accelerating motion, the description used is a matter of
computational convenience.
For the sake of completeness, since we will use them later, we give here
the expression for the instantaneous velocity
(2.26)
where
(2.27)
Next we study the mapping of the z-t plane into the z'-t' plane for
(13) F. GÜRSEY: Nuovo Cimento, 3, 988 (1956); J. A. MC LENNAN: Nuovo Cimento,
5, 640 (1957); H. A. BUCHDAHL: Nuovo Cimento, 11, 496 (1959). References to earlier
work can be found in these papers.
(14) L. INFELD and A. SCHILD: Phys. Rev., 70, 410 (1946).
(2.28) $t = \pm\left(z + \dfrac{2}{g}\right)$
The fact that the transformation is singular is in no way disturbing. One
must merely obey the injunction to stay away from the singularities in
discussing any physical process. Using this transformation, we discuss physical
processes only for regions in space-time for which the transformation is
nonsingular.
In order to determine the mapping, we consider the most general straight
line in the z-t plane, parallel to the t-axis,
(2.29) $z = \dfrac{2}{g}(\alpha - 1)$ for all t, ($\alpha \gtrless 0$, $\alpha \neq 0$).
The transformed equations in the S' frame for these lines are
(2.30)
Thus, straight lines parallel to the t-axis map into a family of hyperbolas with
parallel asymptotes of slopes $\pm 1$. The vertices of the two arms of the hyper
Fig. 1. - The mapping of particles at rest in the S frame (z-t plane) at two typical
positions into their corresponding hyperbolic motions in the S' plane (z'-t' plane)
by means of the acceleration transformation.
(2.31)
Fig. 2. - The mapping of the z-t plane into the z'-t' plane for the transformation
of uniform acceleration. The singular lines in the z-t plane map into infinity in
the z'-t' plane.
Figure 1 illustrates the way typical lines in the z-t plane map into the
z'-t' plane. Figure 2 indicates the mapping of corresponding domains in the
two planes. The mapping is one to one for all regions of the plane except
along the singular lines.
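The mapping described above can be explored numerically. The sketch below is not taken from the paper: it assumes the standard form of the special conformal transformation, $x'^\mu = (x^\mu + a^\mu x^2)/\lambda(x)$ with $\lambda(x) = 1 + 2a\cdot x + a^2x^2$, the choice $a^\mu = (0; 0, 0, -g/2)$, signature +2 and c = 1. Under these assumptions a particle at rest at z = 0 traces the hyperbola $(z' + 1/g)^2 - t'^2 = 1/g^2$, and intervals are rescaled by $1/\lambda^2$.

```python
import math

def accel_transform(t, z, g):
    """Assumed special conformal map of the t-z plane; returns (t', z', lambda)."""
    a_z = -0.5 * g                      # assumed a^mu = (0; 0, 0, -g/2)
    x2 = z * z - t * t                  # x^2 with signature +2
    lam = 1.0 + 2.0 * a_z * z + a_z * a_z * x2
    if lam == 0.0:
        raise ValueError("point lies on a singular line of the transformation")
    return t / lam, (z + a_z * x2) / lam, lam

def hyperbola_residual(t, g):
    """(z' + 1/g)^2 - t'^2 - 1/g^2 for the particle at rest at z = 0."""
    tp, zp, _ = accel_transform(t, 0.0, g)
    return (zp + 1.0 / g) ** 2 - tp ** 2 - 1.0 / g ** 2

def interval_scaling(t, z, g, eps=1e-6):
    """Ratio lambda^2 * ds'^2 / ds^2 for a small displacement; should be near 1."""
    dt, dz = 0.3 * eps, 1.0 * eps
    t1, z1, lam = accel_transform(t, z, g)
    t2, z2, _ = accel_transform(t + dt, z + dz, g)
    ds2 = dz * dz - dt * dt
    ds2_new = (z2 - z1) ** 2 - (t2 - t1) ** 2
    return lam * lam * ds2_new / ds2

if __name__ == "__main__":
    g = 1.0
    print(hyperbola_residual(0.7, g))
    print(interval_scaling(0.4, 0.2, g))
    print(accel_transform(2.0 / (math.sqrt(3.0) * g), 0.0, g))
```

The singular lines of this assumed map are the points where `lam` vanishes, which is why the function refuses to evaluate there.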
(3.1)
In order to obtain the length effects, we now have to substitute the motion
satisfied in S' by a particle at rest in S. We used infinitesimal lengths in (3.1)
in order to enable us to substitute such particle motion. No such need arises
for Lorentz transformations so that finite lengths may be used when
comparing measurements in two inertial frames. The distinction is one between
a transformation for which the new variables are non-linear functions of the
old and one for which they are linear.
Substitution of the motion (2.24) in the expression (2.19) for $\lambda'$ yields
(3.3)
where $z'_i$ is the z' co-ordinate at $t'_i$. This agrees with the predictions of the
special theory of relativity. Observer S' was only able to make this prediction
if he knew explicitly the value $1/\lambda'$ by which he expanded space.
A general relativistic observer would make exactly the same calculation
but would introduce $\lambda'$ not as a scale factor but with the appropriate
components of the metric tensor. The distinction for this transformation is purely
formal.
(3.4)
Evaluation of (3.4) along the orbit gives, with the use of (3.2)
(3.5)
Thus we get a time dilation which is the same as the one which would be
obtained from the use of instantaneous Lorentz frames. Again the general
relativist would make the same calculation using $\lambda'$ as the appropriate
components of the metric tensor.
mate of the reading of a proper time interval between two world points on a
given world line will necessarily agree with the estimate of observer A for the
same world line going between the same two points.
Should observer B fail to keep track of his scaling factor, he will be
powerless to make any remarks about the ratio of times or distances at different
parts of his space. The interval $dt_c$ has no physical significance because there
is no clock which directly measures it. However, he can compare ratios of
infinitesimal lengths or distances at the same world-point ($\lambda'$ drops out of the
ratio); equivalently he can compare angles between world lines or any other
physically meaningful quantity that can be expressed independently of $\lambda'$
(a conformal invariant).
Realizing now that there is no paradox and that calculations made in the
frame, S, in which A is stationary must agree with those made in the frame,
S', in which B is stationary, we proceed to calculate a special case as an
exercise.
In S, the observer is stationary at $z_A = 0$. His path in S' is
(3.6)
(3.7)
The two paths will intersect at the times $t = t_0 = \pm 2/(\sqrt{3}\,g)$,
$t' = t'_0 = \pm\sqrt{3}/g$. The elapsed time for A as calculated by the observer S is
where the integral is calculated over the path $z_A = 0$. Observer B would find
for this elapsed time
(3.9)
where the integral is calculated over the path given by (3.6). Of course, the
transformation has been constructed so that $(dt'^2 - dz'^2)/\lambda'^2 = dt^2 - dz^2$ and
observer B will find that (3.9) gives the same result as (3.8). This can be
shown explicitly. By use of (3.6) and (3.2), the integral (3.9) reduces after
(3.1.0)
- 1'3 I 9
(3.11)
(3.13)
(3.13)
(3.1.4)
He knows without calculation that (3.13) must produce the same answer as
(3.14) and can also readily verify that the integrals give the same results by
using the transformation
4t;
(3.7.5) tB=
9 - g-ti,"'
conformal transformations, providing the rest mass is not invariant but
transforms appropriately. The charge of the particle and the velocity of light are
invariant. The proper mass must transform in the following way
(3.16)
For the transformation (2.11), (2.12) by virtue of (2.19), (2.20), in the plane
$x' = y' = 0$,
(3.18) $m_c = \dfrac{m}{1 - gz' + (g^2/4)(z'^2 - t'^2)}$
In the region of space-time for which $gz' \gg (g^2/4)(z'^2 - t'^2)$,
(3.19) $m_c \approx m(1 + gz')$.
Equation (3.17) suggests how the point of view of general relativity differs
from that of conformal transformations. For general relativity, the Lorentz
equation would be kept invariant, as would (3.17), by keeping each factor m
and $d\tau$ individually invariant; conformal relativity varies both. The value
of the transformed mass (3.18) is origin-dependent; this corresponds to the
analogous situation in general relativity which has an origin-dependent metric
tensor, $g_{\mu\nu}(x') = \lambda'(x')^{-2}\eta_{\mu\nu}$. The origin-dependence is not particularly surprising,
and occurs quite often in metrics commonly considered in general relativity.
Equation (3.19) which is an appropriate approximation for certain regions
of space-time suggests an interpretation in classical terms. $m_c$ is the total
energy of the particle which contains contributions due to the rest mass, m,
and to the gravitational potential, $mgz'$. The full eq. (3.18) apparently adds
a correction due to velocity and its effects to the total energy. The
origin-dependence of (3.19) can now be thought of as corresponding to the arbitrary
addition of a constant to the gravitational potential; the mass difference
between two points will according to (3.19) be origin-independent. However
the exact expression (3.18) yields an origin-dependence even to the mass
difference between two points. This is probably related to the actual physical
model that must be used to produce the acceleration field with which we are
dealing, much as the gravitational potential from a point mass will yield an
origin-dependence for the potential difference of two equally spaced points
along the same radial line.
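A small numerical comparison makes the origin-dependence concrete. The sketch below assumes that (3.18) reads $m_c = m/[1 - gz' + (g^2/4)(z'^2 - t'^2)]$ and that (3.19) is its weak-field form $m_c \approx m(1 + gz')$, with c = 1; the function names are illustrative and not part of the paper.

```python
def conformal_mass(m, g, z, t):
    """Assumed exact form of (3.18) in the plane x' = y' = 0."""
    return m / (1.0 - g * z + 0.25 * g * g * (z * z - t * t))

def conformal_mass_weak(m, g, z):
    """Assumed weak-field form (3.19): rest mass plus potential energy m g z'."""
    return m * (1.0 + g * z)

def mass_difference(m, g, z1, z2, t, exact=True):
    """Mass difference between two points at the same t'."""
    if exact:
        return conformal_mass(m, g, z2, t) - conformal_mass(m, g, z1, t)
    return conformal_mass_weak(m, g, z2) - conformal_mass_weak(m, g, z1)

if __name__ == "__main__":
    m, g = 1.0, 0.01
    # Same separation, shifted origin: the weak-field difference is unchanged,
    # the exact one is not.
    for shift in (0.0, 5.0):
        print(shift,
              mass_difference(m, g, 1.0 + shift, 2.0 + shift, 0.5, exact=False),
              mass_difference(m, g, 1.0 + shift, 2.0 + shift, 0.5, exact=True))
```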
(4.1) $p_*^\mu = p^\mu + k^\mu$,
where $p_*^\mu$ and $p^\mu$ are the momentum four-vectors of the atom immediately
before and after the emission, and $k^\mu$ is the photon four-momentum. We have
ignored the irrelevant dimensions x' and y', for motion along the z'-axis (we
also put $\hbar = 1$).
(4.3) $\omega(z'_i; z'_i, v_i) = (m_* - m)\,\dfrac{m_* + m}{2m_*}\,\gamma_i(1 - v_i)$.
The factor $m_* - m$ would yield $\omega$ if recoil were neglected and $v_i$ were zero;
the term $(m_* + m)/2m_*$ is a recoil correction; $\gamma_i(1 - v_i)$ is the Doppler effect.
If the observer is located at a point $z'_0 \neq z'_i$, energy conservation will give
where $m_c$ is related to m by eq. (3.16). The conservation law (assured by the
invariance of the theory under the translation group) is
(4.7)
This involves, when both observer and atom are at $z'_i$, according to (3.16)
and (3.18)
1 k =
1
x(x/) = 7
1
1 -P o
( w ; 0, 0,k ) .
It is evident that $k_c^\mu$ must be related to $k'^\mu$ in the same way that $m_c$ is related
to m, (3.16), in order to preserve the covariance of (4.7).
Equations (4.7) and (4.8) lead to the same results as in case A. In
particular, the observer S' will again find the result (4.3) for $\omega(z'_i; z'_i, v_i)$.
If the conformal observer is located at $z'_0$, he will apply a different scale
factor at $z'_0$ than at $z'_i$. Since the conformal observer considers himself to be
in field free space, $k_c^\mu$ is independent of position. Hence the ratio $\omega(z')/\lambda'(z')$
is a constant, and he finds
5. - Discussion.
APPENDIX
where
where .r$is related to T$ in the same way as are the Cristoffel symbols, (A.2)..
This requires the definition
With the affine connection defined in this way, and considering conformal
co-ordinate transformations, we are dealing with an example of a Weyl space;
in showing the invariance of the equation of motion of charged particles, this
affine connection is particularly helpful.
Abstract
A system S' (rocket) starts from rest in an inertial system S, and after a series of
accelerated, uniform and decelerated motions, comes back to rest at its initial position in S. An
exact calculation is carried out, from the standpoint of S, of the time intervals for the
arrivals at S of light signals sent back by S'. From the standpoint of S', S has made a
round trip after undergoing a series of free falls in gravitational fields and coasting
motions. An exact calculation is carried out for the 'proper time' intervals in S from the
standpoint of S'. It is shown that there is exact agreement between S and S' in their
reckonings of the total time intervals for the two frames, namely, both S and S' agree
quantitatively; to them, the time interval is longer for S than for S'.
The accelerated motion of S' relative to S explicitly used in the treatment of the problem
in the present work is that under time-independent field and subject to the condition of
local Lorentz contraction and dilation; the resulting motion turns out to be that obtained
earlier by Møller on entirely different considerations. The result of the present treatment
is, however, more general than this particular motion seems to imply, since by an arbitrary
coordinate transformation, it can be made to include an infinite number of accelerated
frames including time-dependent fields, all within the framework of flat space-time.
General remarks are given for the clock problem in the general theory of relativity in the
sense of Einstein's curved space.
That a returned clock should have lost time compared with the one at
home is so strange a conclusion that Einstein specifically wrote an article
(Einstein, 1918) in 1918, in the form of a dialogue between a critic and
himself, to show (1) how the trip will be viewed from the standpoints of both
frames, (2) how the reciprocal symmetry (in the sense of the special theory
of relativity) will be destroyed in this case by the accelerated motion of the
rocket, and that both frames will agree that the returned one will be slow,
and (3) that this is due to the loss of time, or the slowing down of the clock,
of the rocket during the accelerated portion of the rocket's trip when it
turns back.
For definiteness, let us pose the following situation. From the standpoint
of the inertial system S, the rocket (or the travelling twin) S' goes through
the following sequence of events:
B S C
F E
Einstein pointed out, however, that during the turning around parts
C-D, D-E in (1.2), S is at a higher gravitational potential than S', and
the clock of S is faster than that of S'. The clock in S will during C-D and
D-E gain time and more than compensate the loss as given by (1.4). If
the time intervals during C-D, D-E are very short compared with those
for the uniform motion parts, the gain during C-D, D-E by the clock
of S will be such as to bring the total time $\Delta T$ recorded by S (for the trip
B-C, C-D, D-E, E-F) to be longer than that $\Delta\tau$ recorded by S', in
accordance with (1.3), i.e.,
$\Delta T = \dfrac{\Delta\tau}{\sqrt{1 - \beta^2}}$
The above statement of Einstein has been expressed in explicit form by
Tolman (1934). Let us view the trip from the standpoint of S' as in (1.2).
Let $\tau_{AB}, \tau_{BC}, \tau_{CDE}, \tau_{EF}\ (= \tau_{BC}), \tau_{FA}\ (= \tau_{AB})$ be the proper time intervals as
recorded by S, and let $t_{BC}, t_{CDE}, t_{EF}\ (= t_{BC}), t_{FA}\ (= t_{AB})$ be the time intervals
as recorded by synchronised clocks at various points in (1.2) attached to
S'.† Then, by (1.4),
† We shall, without causing confusion, drop the prime for B', C', etc. in the subscripts
for the $\tau$'s and t's.
Let the average distance between S and S' during the turning around
portion C-D-E be approximately taken to be $x = v t_{BC}$. Since $v = g t_{CD}$,
the Doppler effect separation gives
$\approx \dfrac{1}{\sqrt{1 - \beta^2}} \times$ time for the trip recorded in S' (1.9)
This is in approximate agreement with (1.3). It is important to note that the
$\approx$ sign in (1.8) arises not so much because of the neglect of $\tau_{AB}$ and $t_{CDE}$ in
(1 .S) as because of the approximations made in obtaining (1.7).
In 1956, Dingle (1956) in a series of articles renewed the question of
whether the returned twin from a rocket trip is younger than his brother
who has stayed home. He believed that there should be no difference in
their aging, and that all earlier conclusions, including Einstein's, are erroneous.
His questioning of these earlier works by many physicists has led to a great
flux of discussions. Most authors (Arzelies, 1966) maintain the conclusion
of Einstein. In most cases, the arguments amount to the simple statement
that since the rocket S' has undergone accelerated and decelerated motions,
it is not on equal footing with S which is an inertial system, and hence the
reciprocal symmetry in the sense of the special theory of relativity has been
removed. This part of the argument is of course correct. But then, because
of attempts to simplify the problem for the non-specialist, the following
argument is usually put forward: One can make the time intervals for the
accelerated and decelerated parts very short compared with the time
intervals for the uniform relative motion parts [see (1.1) or (1.2)] and in the
limit negligible. Then, since only S is a preferred (in the sense that it is an
inertial) frame, one must only employ the relation (1.3). This part of the
argument is unfortunately misleading. We have seen in the preceding section
from the approximate treatment by Tolman that it is precisely the
acceleration (or, an equivalent gravitation field) during the turning around of the
rocket that slows down its clock (relative to the inertial frame S), and that
one obtains the result (1.9) in an approximation only, which is not exactly
the relation (1.3). The point that seems to have been forgotten in many
elementary discussions of the clock paradox is that while the compen-
u= (%),
If we assume that in S' the unit of length is the same as that in S, then the
condition of local Lorentz contraction is expressed by
This is equivalent to
and
Note that the $\tau$ called the proper time above and defined by (2.9) is not the
normal proper time $\tau_0$ defined by $d\tau_0 = ds$ which will be related to $\tau$ here by
$d\tau_0 = \sqrt{g_{44}}\,d\tau$. In the present work, we make use of $\tau$ in the calculation
of $\tau_0$.
u=
*+
AXfC,
, A, C,,C2 being constants.
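$\left(1 + \dfrac{aX}{c^2}\right)^2 = \left(1 + \dfrac{ax}{c^2}\right)^2 + \left(\dfrac{aT}{c}\right)^2$ (2.15)
The above equation in dT/dt can be integrated, and with the initial condition
t = 0 when T = 0 we obtain
$\dfrac{2at}{c} = \ln\left(1 + \dfrac{aX}{c^2} + \dfrac{aT}{c}\right) - \ln\left(1 + \dfrac{aX}{c^2} - \dfrac{aT}{c}\right)$ (2.17)
$\dfrac{1}{c}\left(\dfrac{dX}{dT}\right)_x = \dfrac{aT/c}{1 + (aX/c^2)} = \tanh\dfrac{at}{c}$ (2.18a)
$\dfrac{aT}{c} = \left(1 + \dfrac{ax}{c^2}\right)\sinh\dfrac{at}{c}$ (2.22)
$1 + \dfrac{aX}{c^2} = \left(1 + \dfrac{ax}{c^2}\right)\cosh\dfrac{at}{c}$ (2.23)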
(2.25)
. a(t - to)
sinh -- (2.26)
C C
(2.28)
u=- ()x
1 - [a(. - xo)/c2] dt
= ()dT x
= -tanha(t - to) (2.28a)
(3.5)
† In the following, we simplify writing by choosing the unit of time such that c = 1. All
time, velocity, acceleration T, t, v, $v_0$, a are to be replaced by cT, ct, v/c, $v_0/c$, $a/c^2$, to
convert to c.g.s. units.
‡ $\Delta T_1$ only represents the time interval for the clock at X = 0 to intercept all the light
signals sent to it by the clock attached to x = 0 within $\Delta\tau_1$. It does not really represent the
time of travel of the rocket (x = 0) from A to B as recorded by the clock at X = 0. The sum
in (3.10) is, however, the total time interval for the whole trip of S', as recorded by one
and the same clock at X = 0 in S.
(3.7)
For the parts B-C, E-F, if $\Delta\tau_2 = \Delta\tau_5$ are the proper time intervals in S', the
sum of the intervals for the arrivals of the signals sent back by S' during
these intervals is
$\Delta T_2 + \Delta T_5 = \Delta\tau_2\sqrt{\dfrac{1 + v_0}{1 - v_0}} + \Delta\tau_5\sqrt{\dfrac{1 - v_0}{1 + v_0}}$
Thus the total interval recorded by one single clock in S (at rest at X = 0)
from all signals sent back by S' during its round trip is
(3.10)
The proper time (recorded by one clock) in S' for the whole round trip is
$\Delta\tau = 4\Delta\tau_1 + 2\Delta\tau_2 = \dfrac{4}{a}\tanh^{-1} v_0 + 2\Delta\tau_2$ (3.11)
(3.12)
7
1
-VO
Part A-B. Let $\Delta T_1$ be the proper time interval (registered by a clock at
X = 0 in S) for X = 0 to reach the velocity $-v_0$ (relative to S'). From (2.18a),
$a\,\Delta T_1 = \tanh a\tau_1 = v_0$ (3.13)
This is now the time interval recorded by the clock in S, and is hence the
proper time interval $2\Delta T_2$, i.e.,
$2\Delta T_2 = \sqrt{1 - v_0^2}\;2\Delta\tau_2$ (3.15)
Part C-D. During C-D and D-E, S' would describe S as being acted
on by a gravitational field a in the positive x-direction, the motion being
described by equations (2.24)-(2.27). Equation (2.24) is
$[1 - a(x - x_0)]^2 = [1 - a(X - X_0)]^2 - a^2(T - T_0)^2$ (2.24)
The constants $x_0$, $X_0$, $T_0$ are determined as follows. From the standpoint of
S' [see (3.1), (3.2) and (3.8)], at C,
% ad 7 2
aT = 2, = UO (3.1 6)
1/( 1 - uo) + 1/(1 - uo)
and (2.28) leads to
At D,
u = 0 and
(3.17)
From the symmetry of the situation, it is clear that the time in S for the trip
A-B-C-D-E-F-A as seen (or, calculated) from the standpoint of S' is
twice the $\Delta T$ in (3.20), i.e.,
Total time $\Delta T$ in S =
(3.21)
(3.22)
the gain during C-D. The total result is to bring the two reckonings of
the proper time intervals, by S and S', of the round trip into exact agreement
with each other.†
It is seen from the foregoing results that all the calculations are exact, and
no approximations involving the assumption of making the accelerated
parts A-B, C-D-E, F-A (or A'-B', C'-D'-E', F'-A') very short compared
with the uniform relative motion part B-C, E-F (or B'-C', E'-F') have
been made. In fact, as emphasised by Einstein as early as in the 1918 paper
and brought out approximately by Tolman (1934) and exactly in (3.22)
above, it is precisely the accelerated parts that resolve the paradox. Had
one literally neglected the accelerated parts, (3.10) and (3.22) would have
become
Total time in S (as reckoned by S) = $\dfrac{2\Delta\tau_2}{\sqrt{1 - v_0^2}}$
Total time in S (as reckoned by S') = $2\sqrt{1 - v_0^2}\;\Delta\tau_2$
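The role of the accelerated parts in the totals can be illustrated numerically. The sketch below does not reproduce (3.10) or (3.22); it uses the standard hyperbolic-motion formulas, which are assumed here to describe the rocket origin x = 0, with c = 1: each of the four acceleration phases lasts a rocket proper time $\tau_1 = a^{-1}\tanh^{-1}v_0$ and an inertial time $\sinh(a\tau_1)/a = v_0/(a\sqrt{1 - v_0^2})$, and each coasting phase of proper duration $\Delta\tau_2$ lasts $\Delta\tau_2/\sqrt{1 - v_0^2}$ in S. The function name and argument names are illustrative.

```python
import math

def round_trip_times(a, v0, coast_proper):
    """Return (time in S, proper time in S') for the round trip, with c = 1.

    a            : proper acceleration of the rocket during the four
                   acceleration phases (assumed equal in all of them)
    v0           : coasting speed relative to S, 0 < v0 < 1
    coast_proper : rocket proper time of each of the two coasting phases
    """
    gamma0 = 1.0 / math.sqrt(1.0 - v0 * v0)
    tau1 = math.atanh(v0) / a          # rocket proper time per acceleration phase
    T1 = gamma0 * v0 / a               # inertial time per phase, sinh(a tau1)/a
    rocket_total = 4.0 * tau1 + 2.0 * coast_proper
    home_total = 4.0 * T1 + 2.0 * gamma0 * coast_proper
    return home_total, rocket_total

if __name__ == "__main__":
    for a in (1.0, 10.0, 1000.0):
        home, rocket = round_trip_times(a, 0.6, 10.0)
        print(a, home, rocket, home - rocket)
```

As a grows, both totals approach the coasting-only values, in line with the first of the two limiting expressions above; the second expression is the reckoning that omits the gain accumulated during the turn-around.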
On the other hand, had one done away entirely with the uniform relative
motion (coasting of rocket) parts B-C, E-F (B'-C', E'-F'), the results
(3.10) and (3.22) would have become:
Standpoint of S Standpoint of S'
† The results (3.10), (3.11), (3.22), (3.23) above are a little more complete than those of
Møller (1943) in that here S' starts out from rest and comes back at rest to S. There are
differences in details between this and Møller's work. For example, we calculate the time
intervals $\Delta T_1$, $\Delta T_2$, $\Delta T_5$, $\Delta T_6$ in (3.5)-(3.8) as recorded by one clock in S, and not Møller's
times T', T'' which are not the proper times of one clock. Also, as remarked in Section 2
above, the starting points in the two works are different. In an application of Møller's
work, Fock (1959) has obtained an erroneous conclusion.
Fock (1959) states that the time intervals recorded by the clocks A, B in S, S' are
given by
V=
74 - T B = - (*T- 3f)
CZ
where t = 2v/g is the time for the turning around part (C-D-E in (1.8) in the present
article), and T = uniformly moving part (B-C) + (E-F) + t. Thus $\tau_A - \tau_B$ can be $\gtrless 0$,
in disagreement with the results of everyone else. This strange result arises from the error
of the + sign in (62.09), which should have read $U = U_0 - g(x_1 - x)$. When this correction
is made, one would have
v2T
71 - 7 B = --
c14
which is in agreement with the approximate result of (1.8) of Tolman and others.
where the $g_{ij}$ are functions of X and t and hence no longer static. The
space is, however, Euclidean. The motions of S relative to S can be quite
arbitrary and very complicated, but the description can be reduced to that
of S by the transformation above so that in a sense the treatment of the
clock problem by means of S has covered a whole (infinite number) class
of accelerated motions relative to S. This class of accelerated motions has
not brought in any curved space properties in the sense of Einstein's general
theory of relativity.
The present work has thus treated and resolved the clock problem
without having really made recourse to Einstein's theory of gravitation
involving curved space. This is worth noting in view of the usual statement
in the literature that an exact treatment of the clock problem (i.e., to all
orders of $v_0/c$) calls for the general theory of relativity.
Other paths $C_1$, $C_2$ joining $P_1$ and $P_2$ will not correspond to the free motion
in the field $g_{\mu\nu}$, but will correspond to motions under agencies other than
the field represented by $g_{\mu\nu}$, and
$\int_{1\,(C_1)}^{2} ds \neq \int_{1\,(C)}^{2} ds$ (4.3)
References
Arzeliès, H. (1966). Relativistic Kinematics. Pergamon Press. Contains an extensive
bibliography on the clock problem.
Dingle, H. (1956). Nature, London, 177, 782.
Dingle, H. (1957a). Nature, London, 179, 1242.
Dingle, H. (1957b). Nature, London, 180, 499, 1275.
Darwin, C. G. (1957). Nature, London, 180, 976.
Einstein, A. (1911). Annalen der Physik, 35, 898. Translated and contained in Einstein et
al., The Principle of Relativity. Dover Publ. Inc., New York.
Einstein, A. (1918). Naturwissenschaften, 6, 697. An exposition of the relativity theory,
and of the clock paradox, in the form of a dialogue.
Fock, V. (1959). The Theory of Space, Time and Gravitation, Section 62, p. 214, eq. (62.16).
Pergamon Press.
McCrea, W. H. (1956). Nature, London, 177, 783.
McMillan, E. M. (1957). Science, New York, 126, 381.
Møller, C. (1943). Danske Vid. Sel. Mat.-Fys. Medd. XX, No. 19.
Tolman, R. C. (1934). Relativity, Thermodynamics and Cosmology. Oxford University
Press.
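(7.1) $w_I = \gamma\beta\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{\beta_0}{\alpha\gamma_0}$,
      $x_I = \gamma\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{1}{\alpha\gamma_0}$, $y_I = y$, $z_I = z$;
      $\beta = \alpha w + \beta_0$, $\gamma = 1/(1-\beta^2)^{1/2}$, $\gamma_0 = 1/(1-\beta_0^2)^{1/2}$,
which will be called the Wu transformation. If one wishes, one may define $w_I = ct_I$,
where $t_I$ is the usual Einstein time, in (7.1) for easy comparison with special relativity.
(But this definition is not necessary for deriving experimental results.) The inverse Wu
transformation of (7.1) is
(7.2) $w = \frac{w_I + \beta_0/(\alpha\gamma_0)}{\alpha\,[x_I + 1/(\alpha\gamma_0)]} - \frac{\beta_0}{\alpha}$,
      $x = \left[(x_I + 1/(\alpha\gamma_0))^2 - (w_I + \beta_0/(\alpha\gamma_0))^2\right]^{1/2} - \frac{1}{\alpha\gamma_0^2}$,
      $y = y_I$, $z = z_I$.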
One can verify that (7.1) and (7.2) reduce to four-dimensional transformations of the
form (2.2) in the limit of zero acceleration a. (See appendix.) We may remark that the
coordinate transformation between two CLA frames can be derived on the basis of (7.1)
or (7.2).
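The forward and inverse transformations can be checked against each other numerically. The following is a minimal sketch and not part of the original paper: the function names are illustrative, $w$ and $x$ are both treated as lengths, and the sample parameter values are arbitrary. It implements (7.1) and (7.2) for the $(w, x)$ components only, together with the zero-acceleration form $w_I = \gamma_0(w + \beta_0 x)$, $x_I = \gamma_0(x + \beta_0 w)$.

```python
import math


def gamma(beta):
    """Factor 1/sqrt(1 - beta^2) for a dimensionless velocity beta."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def wu_forward(w, x, alpha, beta0):
    """Map (w, x) in the CLA frame F to (w_I, x_I) in the inertial frame, per (7.1)."""
    g0 = gamma(beta0)
    beta = alpha * w + beta0          # velocity of F grows linearly in w
    g = gamma(beta)
    shifted = x + 1.0 / (alpha * g0 ** 2)
    w_i = g * beta * shifted - beta0 / (alpha * g0)
    x_i = g * shifted - 1.0 / (alpha * g0)
    return w_i, x_i


def wu_inverse(w_i, x_i, alpha, beta0):
    """Map (w_I, x_I) back to (w, x), per (7.2)."""
    g0 = gamma(beta0)
    a = w_i + beta0 / (alpha * g0)
    b = x_i + 1.0 / (alpha * g0)
    w = a / (alpha * b) - beta0 / alpha
    x = math.sqrt(b * b - a * a) - 1.0 / (alpha * g0 ** 2)
    return w, x


def lorentz(w, x, beta0):
    """Zero-acceleration limit: the usual boost with velocity beta0."""
    g0 = gamma(beta0)
    return g0 * (w + beta0 * x), g0 * (x + beta0 * w)


if __name__ == "__main__":
    w_i, x_i = wu_forward(2.0, 1.5, alpha=0.1, beta0=0.3)
    print("round trip:", wu_inverse(w_i, x_i, alpha=0.1, beta0=0.3))
    print("small alpha:", wu_forward(2.0, 1.5, alpha=1e-6, beta0=0.3))
    print("Lorentz:    ", lorentz(2.0, 1.5, beta0=0.3))
```

For small but nonzero $\alpha$ the two outputs agree up to terms of order $\alpha$; the sketch cannot be evaluated at $\alpha = 0$ exactly because the singular $1/\alpha$ terms cancel only analytically.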
From the viewpoint of limiting four-dimensional symmetry, the CLA transforma-
tion must be expressed in terms of the Cartesian coordinates rather than other
coordinates, just like the Lorentz transformation. Furthermore, the coordinates of
CLA frames should play the same role and have a similar physical meaning as those of
inertial frames. This appears to be different from the usual viewpoint that coordinates
for accelerated frames have no physical meaning. The Wu transformation (7.1), based
on the four-dimensional symmetry, differs from that obtained by Møller [6] based on
the approximate principle of equivalence in general relativity because they give
different spatial measurements by meter sticks or the Bohr radius of hydrogen atoms.
We believe that such a difference should be tested by, say, measuring a Doppler shift of
wavelength emitted from a source with a constant linear acceleration. We may remark
that the constant acceleration a in (7.1) can be shown to be related to constant change
of energy (or moving mass) per unit length measured in an inertial frame. This
differs from the usual definition of acceleration in (2.4). It is interesting to note that
such a constant acceleration a dictated by the limiting four-dimensional symmetry is
precisely what has been actually realized in linear accelerators in laboratories. Physical
implications of the Wu transformation and their experimental tests will be discussed in
a separate paper.
In the formulation of QED in sect. 6, the electron is, as usual, assumed to be a point
particle. However, if the physical electron is really a fuzzy point (in the sense of fuzzy
set theory with a bell-shape membership function having a width Lo) rather than a
geometric point, then there will be a departure from the four-dimensional symmetry at
short distances or large momentum [7]. A fuzzy-point model of a particle has been
interpreted as follows: a particle by itself is a structureless point particle, but it can
simultaneously exist at different places with different probabilities. As a result, the
position uncertainty of such a quantum particle has a minimum width $\Delta x \sim L_0$. The
Coulomb potential will be modified when $r < L_0$, and the photon propagator in (6.19)
will be modified when momentum becomes larger than $h/L_0$. For a detailed discussion
of the fuzzy-point model of particles, we refer to ref. [7].
One may ask: how can one realize the evolution variable $w$ in the extended
coordinate transformation (2.2) by physical means? Since the invariant phase of an
electromagnetic wave in the F frame is given by $k_0 w - \mathbf{k}\cdot\mathbf{r}$, where $k_0 = |\mathbf{k}|$, we can
define the lightime $w$ in terms of $k_0$, just as the length can be defined by the
wavelength $\lambda$ or $|\mathbf{k}|$. We note that the clocks, which show lightime in this theory,
are the same as those in taiji relativity [4] because they have exactly the same
four-dimensional transformation property. However, the taiji-time $w$ in F cannot be
factored into two well-defined functions $b$ and $t$ because of the absence of a second postulate,
while the lightime $w$ in extended relativity and common relativity can be factored into
two well-defined functions $b$ and $t$, as shown in (2.1), (2.2) and ref. [8].
Our discussions show that it is extremely important to be aware of what quantities
are actually measured in the experiments and what effects the assumption of a
universal speed of light may have had on the interpretation of the results. For example,
we have seen in paper I that the lifetime dilatation of unstable particle decay in flight
has little to do with the property of Reichenbach's time with a general parameter $q$ or
$q'$, because the lifetime $\tau$ is basically defined as the decay length divided by the
universal 2-way speed of light c. The basic reason is that the four-dimensional
symmetry dictates that the decay rates in, say, QED based on extended relativity can
only be defined in terms of the covariant lightime $w$ or $w'$, which has the dimension of
length.
The constant 2-way speed of light in extended relativity is in general not the
maximum speed of physical objects in the universe. Rather, it is the one-way speed of
light in a given direction, that is the maximum speed of any object in that direction, as
shown in (2.3). This holds for any inertial frame. It is worthwhile to note that this
property of light, being the maximum speed of all physical objects in any given
direction, is a logical consequence of the first postulate of relativity, as shown in taiji
relativity [4].
Suppose one writes $dw_I = \gamma(T\,dw + U\beta\,dx)$, $dx_I = \gamma(V\,dx + W\beta\,dw)$, $dy_I = dy$, $dz_I = dz$;
$\gamma = 1/(1-\beta^2)^{1/2}$,
where T, U, V and W are four unknown functions of $x$ and $w$. The new Wu
transformation (7.1) for a constant-linear-acceleration (CLA) frame can be derived from the
postulate of the limiting four-dimensional symmetry of taiji relativity and the initial condition that
a CLA transformation reduces to the spatial identity $\mathbf{r}_I = \mathbf{r}$ when the taiji-time $w = 0$ and the
initial velocity $\beta_0 = 0$. This initial condition holds also for the Lorentz transformation. Thus,
once the principle (or the first postulate) of relativity is rigorously stated to include the limiting
cases, the concept of acceleration is determined in the physical theory based on the extended
four-dimensional framework.
Within the present conceptual framework, the taiji-time $w$ in the Wu transformation (7.1) or
(A.2) is a primary concept and has the dimension of length. The motion of physical objects,
including light signals, is a derived concept and is described by dimensionless taiji-velocities
$d\mathbf{r}/dw$. The taiji-time $w$ can be realized by computerized Leonardo clocks [4]: We could program
any Leonardo clock in a CLA frame F to obtain a reading $w_I$ from the nearest clock in an inertial
frame $F_I$ and, based on its $F_I$-frame position $x_I$ and the given parameters $\alpha$ and $\beta_0$, compute the
taiji-time $w$ it should display, $w = (w_I + \beta_0/\alpha\gamma_0)/[\alpha(x_I + 1/\alpha\gamma_0)] - \beta_0/\alpha$. (See (7.2).) In the limit
of zero acceleration, $w$ shown on a Leonardo clock will automatically reduce to the taiji-time in the
four-dimensional transformation, $w = \gamma_0(w_I - \beta_0 x_I)$. It will not reduce to relativistic time, unless
the second postulate of a universal constant for the speed of light ($w = ct$, $w_I = ct_I$) is made in this
limit [4].
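The clock-programming rule just described amounts to evaluating the first line of (7.2). A minimal sketch follows, with hypothetical function names that do not come from the original paper: given the inertial-frame reading $w_I$ and position $x_I$, it returns the taiji-time the clock should display and compares it with the zero-acceleration value $\gamma_0(w_I - \beta_0 x_I)$.

```python
import math


def leonardo_clock_reading(w_i, x_i, alpha, beta0):
    """Taiji-time w a clock in the CLA frame should display, from the first line of (7.2)."""
    g0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    numerator = w_i + beta0 / (alpha * g0)
    denominator = alpha * (x_i + 1.0 / (alpha * g0))
    return numerator / denominator - beta0 / alpha


def inertial_limit_reading(w_i, x_i, beta0):
    """Zero-acceleration limit of the reading: gamma_0 (w_I - beta_0 x_I)."""
    g0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    return g0 * (w_i - beta0 * x_i)


if __name__ == "__main__":
    # Arbitrary sample values; alpha is made small to approach the inertial limit.
    for alpha in (1e-1, 1e-3, 1e-6):
        print(alpha, leonardo_clock_reading(2.0, 1.5, alpha, 0.3))
    print("limit", inertial_limit_reading(2.0, 1.5, 0.3))
```

For $\beta_0 = 0$ the rule reduces to $w = w_I/(1 + \alpha x_I)$, which is the form that appears in the appendix.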
* * *
This paper is dedicated to Prof. Ta-You Wu for his wonderful and tireless teaching
of physics and his ninetieth birthday. The work was supported in part by the Jing Shin
Research Fund of the UMass Dartmouth and by a grant from the Potz Science Fund.
APPENDIX
For simplicity, let us denote a CLA frame by $F(w, x, y, z)$ and an inertial frame by
$F_I(w_I, x_I, y_I, z_I)$. Suppose a CLA frame $F(w, x, y, z)$ is moving with a constant
acceleration $\alpha$, so that its velocity is
(A.1) $\beta = \alpha w + \beta_0$,
along the $+x$ axis. Guided by the limiting four-dimensional symmetry, we find that the
linearly accelerated transformation between $F_I$ and $F$ should be
(A.3) $w = ct_I/(1 + \alpha x_I) \approx ct_I(1 - \alpha x_I)$, $ct_I = w_I$,
      $x = (1/\alpha)[1 + 2\alpha x_I + \alpha^2(x_I^2 - c^2 t_I^2)]^{1/2} - 1/\alpha \approx x_I - c^2\alpha t_I^2/2$.
(A.4) $\alpha = g/c^2$
when velocities are small. In this sense, the Wu transformation (A.2) is a
four-dimensional generalization of the Galilean transformation for accelerated frames
in classical mechanics.
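The small-velocity limit can be illustrated with everyday numbers. The sketch below is an illustration only: the function names and the sample values ($g = 9.8\ \mathrm{m/s^2}$, $t_I = 10$ s, $x_I = 100$ m) are assumptions, not taken from the paper. It evaluates the exact expression for $x$ in (A.3) with $\alpha = g/c^2$ and compares it with the Galilean form $x_I - g t_I^2/2$. Because $\alpha$ is of order $10^{-16}\ \mathrm{m^{-1}}$, the square root is evaluated in an algebraically equivalent form that avoids subtracting two nearly equal numbers.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def x_in_cla_frame(x_i, t_i, g):
    """Exact x from (A.3) with alpha = g/c^2, written in a numerically stable form."""
    alpha = g / C ** 2
    w_i = C * t_i
    eps = 2.0 * alpha * x_i + alpha ** 2 * (x_i ** 2 - w_i ** 2)
    # (sqrt(1 + eps) - 1) / alpha == eps / (alpha * (1 + sqrt(1 + eps)))
    return eps / (alpha * (1.0 + math.sqrt(1.0 + eps)))


def x_galilean(x_i, t_i, g):
    """Accelerated Galilean transformation: x = x_I - g t_I^2 / 2."""
    return x_i - 0.5 * g * t_i ** 2


if __name__ == "__main__":
    exact = x_in_cla_frame(100.0, 10.0, 9.8)
    approx = x_galilean(100.0, 10.0, 9.8)
    print(exact, approx, exact - approx)
```

The difference between the two values is far below any laboratory length resolution for these inputs, which is the sense in which (A.3) generalizes the Galilean result.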
From (A.1) we obtain
= P and
We see that only in the approximation goo= 1 do we have (dx/dw),,
64.7) ( d ~ / d w ) ~=
, a = constant .
We note that the Wu transformation (A.2) holds for general WI and w. In the limit of
zero acceleration, it reduces to the four-dimensional taiji transformation [4]. If one
wishes, one may define
where $t_I$ and $t$ are, respectively, Einstein's time and extended Reichenbach's time
(and b is the corresponding ligh function), then the limit of zero acceleration of (A.2)
is the extended transformation (2.2) (where the inertial frame F corresponds to the
CLA frame F of (A.2) in the limit of zero acceleration). One can formulate, say, classical
electrodynamics in a CLA frame. According to taiji relativity, physical results in the
CLA frame F should be independent of the definition in (A.8).
REFERENCES
[1] HSU L., HSU J. P. and SCHNEBLE D., Nuovo Cimento B, 111 (1996) 1299. This is referred to as
paper I in the text.
[2] REICHENBACH H., The Philosophy of Space and Time (Dover, New York) 1958.
[3] EDWARDS W. F., Am. J. Phys., 31 (1963) 482.
[4] HSU J. P. and HSU L., Phys. Lett. A, 196 (1994) 1; 217 (1996) 359; HSU L. and HSU J. P.,
Nuovo Cimento B, 111 (1996) 1283.
[5] See, for example, BJORKEN J. D. and DRELL S. D., Relativistic Quantum Mechanics
(McGraw-Hill, New York) 1964, pp. 261-268 and pp. 285-286; SAKURAI J. J., Advanced
Quantum Mechanics (Addison-Wesley, Reading, Mass.) 1967, pp. 171-172 and pp. 181-188;
WEINBERG S., The Quantum Theory of Fields (Cambridge University Press, New York)
1995, pp. 134-147.
[6] MØLLER C., Danske Vid. Sel. Mat.-Fys., XX, No. 19 (1943); FOCK V., The Theory of Space
Time and Gravitation (Pergamon, New York) 1958, pp. 206-211; WU T. Y. and LEE Y. C., Int.
J. Theor. Phys., 5 (1972) 307; TA-YOU WU, Theoretical Physics, Vol. 4, Theory of Relativity
(Lian Jing Publishing Co., Taipei) 1978, pp. 172-175.
[7] HSU J. P., Nuovo Cimento B, 80 (1984) 183; 88 (1985) 140; HSU J. P. and PEI S. Y., Phys. Rev.
A, 37 (1988) 1406.
[8] For a detailed discussion of common time in the four-dimensional framework and its implications,
see HSU J. P., Nuovo Cimento B, 74 (1983) 67; 88 (1985) 140; 89 (1985) 30; Phys.
Lett. A, 97 (1983) 137; HSU J. P. and WHAN C., Phys. Rev. A, 38 (1988) 2248, appendix.
[9] HSU J. P., Nuovo Cimento B, 93 (1986) 178.
[10] In other words, all physical results in taiji relativity or extended relativity can be derived by
simply using the quantities $(w, x, y, z)$ and $(w', x', y', z')$ without ever mentioning time $t$ or $t'$
(measured in seconds) and speeds of light or other physical objects.
I. Introduction
transformations in the limit of zero acceleration. This is due to the stringent assumption
that the metric tensor $g_{00}$ is time-independent.
In this paper, we follow the kinematic approach and obtain a satisfactory CLA
transformation by postulating a new and natural principle of limiting four-dimensional
symmetry [2]: Any accelerated transformation of coordinates must reduce to the form with
4-dimensional symmetry in the limit of zero acceleration. We show that the set of
transformations for the CLA frames forms a new group, which is termed the Wu group. The
Wu transformation is a natural and simple generalization of the Lorentz transformation
and the Galilean transformation with constant acceleration. The limiting four-dimensional
symmetry principle contains more definite and satisfactory physical results than the
equation $R_{ik} = 0$, as far as CLA transformations are concerned. In the gravitational approach,
Einstein's covariant equation holds for any coordinate. However, the Lorentz
transformation prefers the Cartesian coordinate. Therefore, the natural assumption of the smooth
connection between a linearly accelerated frame and an inertial frame dictates that a CLA
coordinate is preferred for CLA transformations. This is an important difference between
kinematic and gravitational approaches.
In our discussions, a CLA frame $F(w, x, y, z)$ with the usual definition $w = ct$ is
introduced. But we know that the constant speed of light c has no operational definition
in any CLA frame. Fortunately, the physical results in the paper are actually independent
of the definition $w = ct$. In previous papers, we have shown that the logically simplest
theory of relativity, called taiji relativity, can be formulated solely on the basis of the first
postulate of relativity, without making any second postulate concerning the speed of light
[3]. The first postulate of relativity states that the laws of physics have the same form
in all inertial frames. We are able to formulate a 4-dimensional physical theory with the
coordinate $x_I^\mu = (w_I, x_I, y_I, z_I)$ for an inertial frame $F_I$, where $w_I$ is the taiji-time with
the dimension of length. The absence of the second postulate forbids one to express the
taiji-time $w_I$ in terms of the usual time $t_I$ (measured in seconds) and velocity because they
cannot be defined for all inertial frames in taiji relativity. Nevertheless, the taiji-time $w_I$
can be directly used as the evolution variable. Furthermore, the taiji-time with the unit of,
say, centimeters, can be physically realized by computerized clocks. Also, the invariant law
for the propagation of light, $ds^2 = dw_I^2 - d\mathbf{r}_I^2 = 0$, implies that the taiji-speed of a light signal
is dimensionless and has the universal value $|d\mathbf{r}_I/dw_I| = 1$ for all inertial frames.
A careful examination shows that taiji relativity is consistent with all previous experiments
[3]. Indeed, one can simply consider $w$ in a CLA frame as the evolution variable for a
physical system. One can have a grid of computerized clocks in a CLA frame. These
clocks can be synchronized without relying on the constant speed of light signals and will
automatically read taiji-time $w_I$ in the limit of zero acceleration.
where $t_I$ is shown by the Einstein clocks in the inertial frame $F_I$. As usual, one may define
$w = ct$, where the realization of $t$ through a grid of computerized clocks will be discussed
later in sec. 6. Although $R_{ik} = 0$ holds for arbitrary coordinates, we postulate the metric
(2.1) so that $ds^2$ and the resultant transformations are compatible with both Einstein's
vacuum equation $R_{ik} = 0$ and the new boundary conditions of limiting four-dimensional
symmetry. Since the CLA frame F moves along the x-axis, we look for axially symmetric
solutions with
where
Also, the coefficients of dx and dw in (2.10) must satisfy the integrability conditions
so that we have a finite coordinate transformation. It follows from (2.12) and (2.15) that
$G(w) = \alpha\gamma^2(w)/f$, (2.16)
$dZ(x)/dx = f X(x)$, (2.17)
where the constant $f$ comes from the separation of the variables $w$ and $x$. Using (2.16) and
(2.17), we can integrate (2.10) to obtain the finite transformation between $F_I$ and F,
Note that the constants of integration in (2.18) and the relations in (2.19) are all determined
by the limiting four-dimensional symmetry and a boundary condition at the origin,
$Z(0) < \infty$. In order to determine the precise form of the function $Z(x)$, we observe that
the Lorentz transformation reduces to the identity transformation, $\mathbf{r}' = \mathbf{r}$, when time and
velocity vanish, $t = 0$ and $V = 0$. Thus, it is natural to impose the same initial condition
on the accelerated transformation (2.18): namely, when time $t_I = 0$ and velocity $\beta_0 = 0$,
transformation (2.18) reduces to the identity transformation,
$\mathbf{r} = \mathbf{r}_I$, (2.20)
(2.22)
where the time $t_I = w_I/c$ is shown by the conventional Einstein clocks in the inertial
frame $F_I$. We shall call the result (2.22) the Wu transformation [4]. One can verify
that the transformation (2.22) with $w = ct$ includes the Lorentz transformation,
$w_I = \gamma_0(w + \beta_0 x)$, $x_I = \gamma_0(x + \beta_0 w)$, as a special case, $\alpha \to 0$. In this sense, it satisfies the
limiting four-dimensional symmetry, $s^2 = c^2t_I^2 - r_I^2 = w^2 - r^2$ as $\alpha \to 0$. The inverse Wu
transformation of (2.22) can be deduced:
We may remark that when $\beta_0 \to 0$, one can verify that (2.23) leads to the accelerated
Galilean transformation, $x \approx x_I - c^2\alpha t_I^2/2$. Thus, the acceleration $\alpha$ can be approximately
related to a constant acceleration $g$ in Newtonian mechanics, $\alpha \approx g/c^2$. The transformation
for the covariant differential operators $(\partial/\partial w, \partial/\partial\mathbf{r})$ can be deduced from (2.23):
In order to see the group property of the Wu transformation ( 3 . 2 ) for two accelerated
frames, we need t o consider a third accelerated frame F with a velocity p = dw, In
analogy t o (3.2), we can write down the transformation between F and F ,
W = y ~ ~ / { r [ Z y / at i/a - / a ] } ,
2 = {[,/,/a f - l / a / ] 2 - y2~~2w2}1/2- (3.4)
which has the same form as that of (3.2). Using (3.1) and (3.4), $s^2 = (ct_I)^2 - x_I^2$ can be
expressed in terms of the coordinate variables in F and F as follows:
$d(ct_I) = \gamma(W\,dw + \beta\,dx)$, $dx_I = \gamma(dx + \beta W\,dw)$, $dy_I = dy$, $dz_I = dz$; (3.7)
where
$W = \gamma^2(\gamma_0^{-2} + \alpha x)$; $\gamma_0 = (1 - \beta_0^2)^{-1/2}$.
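The differential form (3.7) can be checked against the finite Wu transformation by numerical differentiation. The sketch below is only an illustration under stated assumptions: it takes the finite transformation in the form $w_I = \gamma\beta(x + 1/\alpha\gamma_0^2) - \beta_0/\alpha\gamma_0$, $x_I = \gamma(x + 1/\alpha\gamma_0^2) - 1/\alpha\gamma_0$ with $\beta = \alpha w + \beta_0$, and the function names and sample values are hypothetical. It compares central-difference derivatives with the coefficients $\gamma W$, $\gamma\beta$, $\gamma\beta W$ and $\gamma$ read off from (3.7).

```python
import math


def wu_forward(w, x, alpha, beta0):
    """Finite Wu transformation (w, x) -> (w_I, x_I); form assumed as stated in the lead-in."""
    g0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    beta = alpha * w + beta0
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    shifted = x + 1.0 / (alpha * g0 ** 2)
    return g * beta * shifted - beta0 / (alpha * g0), g * shifted - 1.0 / (alpha * g0)


def numeric_jacobian(w, x, alpha, beta0, h=1e-5):
    """Central-difference estimates of (dw_I/dw, dw_I/dx, dx_I/dw, dx_I/dx)."""
    wp, xp = wu_forward(w + h, x, alpha, beta0)
    wm, xm = wu_forward(w - h, x, alpha, beta0)
    dwi_dw, dxi_dw = (wp - wm) / (2 * h), (xp - xm) / (2 * h)
    wp, xp = wu_forward(w, x + h, alpha, beta0)
    wm, xm = wu_forward(w, x - h, alpha, beta0)
    dwi_dx, dxi_dx = (wp - wm) / (2 * h), (xp - xm) / (2 * h)
    return dwi_dw, dwi_dx, dxi_dw, dxi_dx


def predicted_jacobian(w, x, alpha, beta0):
    """Coefficients read off from (3.7): (gamma W, gamma beta, gamma beta W, gamma)."""
    beta = alpha * w + beta0
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    big_w = g ** 2 * ((1.0 - beta0 ** 2) + alpha * x)   # W = gamma^2 (gamma_0^-2 + alpha x)
    return g * big_w, g * beta, g * beta * big_w, g


if __name__ == "__main__":
    print(numeric_jacobian(2.0, 1.5, 0.1, 0.3))
    print(predicted_jacobian(2.0, 1.5, 0.1, 0.3))
```

With these coefficients, $dw_I^2 - dx_I^2 = \gamma^2(1 - \beta^2)(W^2 dw^2 - dx^2) = W^2 dw^2 - dx^2$, which is the metric of the form $(W^2, -1, -1, -1)$.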
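In analogy with the Lorentz transformation in an arbitrary direction, (3.7) can be
generalized to the following form:
$d\mathbf{r}_I = d\mathbf{r} + (\gamma - 1)(\mathbf{u}\cdot d\mathbf{r})\mathbf{u}/u^2 + [\gamma^3\gamma_0^{-2}\mathbf{u} + \alpha\gamma^3(\mathbf{u}\cdot\mathbf{r})\mathbf{u}/u]\,dw$
$\quad = d\mathbf{r} + (\gamma - 1)(\mathbf{u}/u)\,d(\mathbf{u}\cdot\mathbf{r}/u) + (1/\alpha\gamma_0^2)(\mathbf{u}/u)\,d\gamma + (\mathbf{u}\cdot\mathbf{r})(\mathbf{u}/u^2)\,d(\gamma - 1)$,
$dw_I = \gamma^3[\gamma_0^{-2} + \alpha\,\mathbf{u}\cdot\mathbf{r}/u]\,dw + \gamma\,\mathbf{u}\cdot d\mathbf{r}$
$\quad = \gamma^3\gamma_0^{-2}\,dw + \gamma(\mathbf{u}\cdot\mathbf{r}/u)\,du\,(\gamma^2 - 1) - \gamma^3(\mathbf{u}\cdot\mathbf{r})\,u\,du + d(\gamma\,\mathbf{u}\cdot\mathbf{r})$, (3.8)
It can be verified that the general transformation (3.9) reduces to (2.22) if $\mathbf{u}$ is in the
x-direction, $\mathbf{u} = (\beta, 0, 0)$. In the zero acceleration limit, (3.9) reduces to
(3.10)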
instead of the usual three, $\hbar$, $e$ (in esu) [5] and $c$. These results can be applied to the present
formalism of physics in CLA frames.
Since the speed of light in a CLA frame F is not a universal constant, we shall write
$w \equiv bt$ and $u = dw/dt$, where $b$ is a variable in this section, so that one can see that the
physics in F does not depend on $w = ct$. The invariant action for a charged particle and
the electromagnetic potential $a_\mu(x)$ in F is assumed to be
$S = \int(-m\,ds - \bar{e}a_\mu dx^\mu) - (1/4)\int f_{\mu\nu}f^{\mu\nu}W\,d^4x = \int L\,dt - (1/4)\int f_{\mu\nu}f^{\mu\nu}W\,d^4x$; (4.1)
pi = (-myv,/C, - m y v y / C , - m y v , / C ) = g ; k p k , pi E mdxlds;
y ( 1 - ,2/C2)-/2; c =- uw, v 2 = vz2 + wy2 + va = -v;vi.
The Hamiltonian H = PO.with the same dimension as that of Pi,is defined by
PO = [ ( ~ L / ~ v- ~L I )/ Uv =~ po -t Fa0 = w[(P,- + m211i2+ Fao H;
PO = m y W = gooPo, po = mdxo/ds = m d w / d s .
where
Pp = (PO,-P), P = ( P z , P g , P * ) = (p1,P2,p3)= ( - P l , - p Z , - p 3 ) *
Note that (4.10) is consistent with (2.24) because $p_\mu$ and $\partial/\partial x^\mu$ should have the same
transformation property. Also, the CLA transformation of the covariant vector $dx_\mu$ is the
same as (4.10) because $dx_\mu = g_{\mu\nu}dx^\nu$ and $dx_{I\mu} = \eta_{\mu\nu}dx_I^\nu$, where $\eta_{\mu\nu} = (1, -1, -1, -1)$.
The invariant relation $g^{\mu\nu}p_\mu p_\nu = m^2$ implies
where we have used (4.5), (4.6), (4.8), (4.9), $g^{\mu\nu} = (W^{-2}, -1, -1, -1)$ and $p^\mu = g^{\mu\nu}p_\nu$.
This equation suggests that the generalized Dirac equation for the accelerated frame F
should have the form
[-y*(z)(Pp
- Ea,) - m]@= 0, P, = i J d / d x p . (4.12)
If one wishes, one can relate y*(x) in (4.12) t o constant Dirac matrices 7, by the relation
7*,() = e p (x)T, where e (z) is a tetrad.
4 4
V. Experimental Implications and Discussions
The result (4.10) and the new transformation (2.22) can be experimentally tested
by measuring the Doppler shift of the wavelength of light emitted from a CLA source. From Eq.
(4.10) one obtains the transformation of the covariant wave 4-vector $k_\mu = p_\mu/J$ between
an inertial frame $F_I$ and a CLA frame F. Note that $Jk_{I0}$ and $Jk_0$ are moving masses of
the same photon measured from $F_I$ and F respectively. Suppose the radiation source is at
rest at the origin of the F frame, $\mathbf{r} = 0$, and $k_\mu = (k_0, -k_1, 0, 0)$, where $k_0 = k_0$(rest) and
$k_1 = k_1$(rest). Experimentally, it is difficult to measure $k_0$(rest) and $k_1$(rest) in the CLA
frame. Thus we have to express them in terms of quantities measured in the inertial frame
(or laboratory) $F_I$. Using (4.10), the relation [6] $k_0$(rest) = $k_{I0}$(rest) and $Z(0) = \gamma_0^{-2}$, we
obtain the shifts of $k_{I0}$ (related to the photon's moving mass or atomic mass level [3]) for
waves emitted from a CLA source,
where (rest) denotes the source being at rest in $F_I$. A similar relation can be obtained for
the wavelength. Such new effects predicted by the Wu transformation for waves emitted
from a CLA source may be termed the Wu-Doppler effect. Note that Møller's transformation
will lead to a result different from (5.1) because $t$ in (2.9) and (2.22) with $w = ct$ [or (5.2)
below] must have the same physical interpretation. Such a difference can be tested by
measuring the Wu-Doppler effect (5.1) in the laboratory frame $F_I$ by using the method of
Ives-Stilwell [8].
We stress that the Wu transformation (2.22) does not depend on a specific relation
between $w$ and $t$. Suppose one assumes $w = ct$ in (2.22). The time $t$,
the acceleration approaches zero, $\alpha \to 0$. In this sense, these sophisticated computerized
clocks are generalized Einstein clocks for both inertial and non-inertial frames. We note
that the choice of $w = ct$ in (2.22) to synchronize computerized clocks in F does not imply
that the speed of light is a constant c in the accelerated frame. For the general case $w = bt$,
where $t$ is defined by an arbitrarily preassigned function $t(x_I, t_I)$, the time $t$ can also be
physically realized by the computerized clocks. It appears that all these different times
$t(x_I, t_I)$ are equally physical, in principle, for describing physical phenomena, as discussed
in [3]. However, from fundamental laws of physics such as (4.11) and (4.12), we can see
that the real evolution variable is $w$ rather than $t$. Of course, we can also make these
computerized clocks read $w$ directly.
The boundary condition $Z(0) < \infty$, which leads to $f = \alpha$ in (2.19), is imposed
for simplicity. It is not necessary: If $Z(0) < \infty$ is not imposed, then one has $f \neq \alpha$
in general. However, $G(w)$ and $Z(x)$ involve the factors $1/f$ and $f$ respectively, so that
$W(w, x) = G(w)Z(x)$ and the resultant physics do not depend on $f$.
The equivalence of the effects of a gravitational field and those of an observer's
acceleration played an essential role at the birth of general relativity. Nevertheless, some
authors suggested that it be buried with appropriate honors because it is false [9]. It is
the equivalence of gravitational and inertial mass which is precise and necessary for general
relativity.
What is the operational meaning of the constant acceleration $\alpha$? We show that the
constant acceleration $\alpha$ of a particle is directly and uniquely related to the change of its energy
per unit length as measured in an inertial frame $F_I$ [10]:
where we have used the differential transformation (2.10) and the momentum transformation
(4.10) with $p_\mu = Jk_\mu$. We stress that, within the framework of the four-dimensional
symmetry, the concept of uniform acceleration of a particle can only be defined in the sense
of (5.3), i.e., constant change of a particle's energy $p_{I0}$ per unit length, as measured in
an inertial frame $F_I$. It is gratifying to see that this is precisely what has been used in high
energy laboratories. Other definitions of acceleration, such as the change of velocity per unit
time, are only approximations for small velocities and are, strictly speaking, incompatible
with the 4-dimensional symmetry.
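The statement that uniform acceleration means a constant energy gain per unit length can be illustrated numerically. The following sketch rests on assumptions that are not spelled out in the surviving text: the particle is taken to be at rest in F at a fixed position $x$, its energy in $F_I$ is taken as $m\gamma$ with $\beta = \alpha w + \beta_0$, and its inertial-frame position is taken as $x_I = \gamma(x + 1/\alpha\gamma_0^2) - 1/\alpha\gamma_0$. The function names are hypothetical. Under these assumptions the ratio of energy gained to distance covered is the same for every pair of instants, namely $m\alpha/(\gamma_0^{-2} + \alpha x)$, which equals $m\alpha$ for $x = 0$ and $\beta_0 = 0$.

```python
import math


def energy_and_position(w, x, alpha, beta0, m=1.0):
    """Energy m*gamma and inertial-frame position x_I of a particle at rest in F at x."""
    g0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    beta = alpha * w + beta0
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    x_i = g * (x + 1.0 / (alpha * g0 ** 2)) - 1.0 / (alpha * g0)
    return m * g, x_i


def energy_gain_per_length(w1, w2, x, alpha, beta0, m=1.0):
    """Energy gained divided by distance covered in F_I between taiji-times w1 and w2."""
    e1, x1 = energy_and_position(w1, x, alpha, beta0, m)
    e2, x2 = energy_and_position(w2, x, alpha, beta0, m)
    return (e2 - e1) / (x2 - x1)


if __name__ == "__main__":
    # The ratio is the same for different intervals along the particle's path.
    print(energy_gain_per_length(1.0, 2.0, 0.0, 0.1, 0.0))
    print(energy_gain_per_length(3.0, 5.0, 0.0, 0.1, 0.0))
```

This mirrors how a linear accelerator is specified in practice, by the energy delivered per unit length rather than by a rate of change of velocity.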
Our results suggest that the kinematic approach first discussed by Wu and Lee
[1] is more fruitful than the conventional gravitational approach, provided the limiting
4-dimensional symmetry is postulated.
The work is supported in part by the Potz Science Fund. This paper is written
as an affectionate jubilee greeting to Taida's Physics Department. Appropriately, it deals
with 4-dimensional symmetry and time which engrossed JP's thoughts for many years as a
student at Taida.
References
[1] C. Møller, Danske Vid. Sel. Mat.-Fys. XX, No. 19 (1943) and The Theory of Relativity (Clarendon,
Oxford, 1969) pp. 255-258. Ta-You Wu and Y. C. Lee, Intern. J. Theor. Phys. 5, 307
(1972). Wu and Lee assumed (i) local FitzGerald-Lorentz contraction of length, (ii) local
time-dilatation and (iii) a time-independent $g_{00}$ for CLA coordinates, and derived the same
transformation.
[2] Jong-Ping Hsu and Leonardo Hsu, Nuovo Cim. B112 (to be published in April, 1997).
[3] J. P. Hsu and L. Hsu, Phys. Letters A196, 1-6 (1994); (Erratum) ibid. 217, 359 (1996);
Leonardo Hsu and Jong-Ping Hsu, Nuovo Cimento B111, 1283 (1996).
[4] Ta-You Wu, Theoretical Physics, vol. 4: Theory of Relativity (in Chinese, Lian Jing Publishing
Co., 1978) pp. 172-175; see also ref. 1. Roughly speaking, Wu explored the local relation
between accelerated transformation and Lorentz transformation, while we consider their
global relations based on the limiting four-dimensional symmetry principle.
[5] Note that the universal constant $\bar{e}$ is the charge measured in the electromagnetic unit (emu)
rather than in the electrostatic unit (esu).
[6] This equality is only approximate because there is no relativity or equivalence between F
and $F_I$. Nevertheless, it turns out to be an extremely good approximation because atomic
structure is very stable against constant-linear-acceleration. This is basically related to the
metric tensor $g_{00}$, and the smallness of atomic sizes, ~ cm.
[7] When we define $w = ct$, the speed of light measured in the CLA frame F is $C \equiv dr/dt =
cW(t, x)$ because the propagation of light is described by equation (2.1), $ds = 0$. Note that
C is anisotropic and depends on space and time in general. We stress that one can also
set $w = bt_I$ in (2.22) and (2.23) without upsetting its Wu group property. Thus, we have a
common time, $t = t_I$, for all frames and, hence, the speed of light measured in F by using
such a common time will be $C = c\gamma(1 - \beta)$, if $dx_I/dt_I = +c$. For discussions of common
time within the 4-dimensional framework, see J. P. Hsu, Phys. Lett. A97, 137 (1983); Nuovo
Cimento B74, 67 (1983); J. P. Hsu and C. Whan, Phys. Rev. A38, 2248 (1988), Appendix.
[8] H. E. Ives and G. R. Stilwell, J. Opt. Soc. Am. 28, 215 (1938); 31, 369 (1941).
[9] J. L. Synge, Relativity: The General Theory (North-Holland, Amsterdam, 1966) pp. ix-x. See
also V. A. Fock, The Theory of Space, Time and Gravitation (Pergamon, London, 1959).
[lo] For an object at rest in F I , we have ( d p o / d i ) , I = rna(y- - l ) / ( y Z 2 ) # ( d p I o / d z I ) z . It also
shows the lack of symmetry between a CLA frame F and an inertial frame F I .
*daniel.schmitt@physik.uni-ulm.de
Department of Theoretical Physics, University of Ulm, Germany
†tobias.kleinschmidt@gmx.de
Department of Physics, University of Massachusetts Dartmouth
North Dartmouth, MA 02747-2300, USA
1 Introduction
In the physics of the future, it is desirable that particle physics and quantum field
theory, including gravity, can be understood in both inertial and non-inertial frames. The
reason is that all physical frames of reference in the universe are, strictly speaking,
non-inertial because of the existence of the long range gravitational force. The inertial frame
is only an approximation or idealization of physically realizable reference frames. To
understand physics in inertial frames, we have the Lorentz transformations or Lorentz and
Poincaré invariance, so that physical theories can be formulated covariantly and tested
experimentally. In contrast, to understand physics in non-inertial frames, there is a group
of all point transformations (one-to-one and twice differentiable) of spacetime in which the
differential form $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$ is invariant. Its invariant theory is the tensor calculus of
the general theory of relativity. However, this group for non-inertial frames is too general
for quantum field theory and has few specific predictions. For example, there is no
quantum electrodynamics (QED) for accelerated reference frames which would allow us to
calculate particle lifetimes in accelerated frames.
Some physical phenomena involving accelerations of one particle can be treated, by
introducing a co-moving frame, on the basis of the invariant theory of the Lorentz group.
But such an invariant theory is inadequate for treating many other physical phenomena.
In particular, quantizations of fields in non-inertial frames cannot be treated on the basis
of the invariant theory of the Lorentz group.
This motivates us to investigate a simple subset of reference frames which have
constant linear accelerations. [1] Our discussions are based on the principle of limiting 4-dimensional
symmetry, which requires that all accelerated transformations reduce to the Lorentz
and Poincaré transformations in the limit of zero acceleration. [2] We discuss specific
spacetime transformations and their geometry and physical properties. Their geometry
and light propagation are illustrated in several graphs. We also discuss a specific experiment
of decay-length dilation to test the generalized Lorentz transformations.
course, experiment has the final say regarding the correct accelerated spacetime
transformation. Indeed, the Wu and the Møller transformations for constant accelerations are
just two simple generalizations, and only the Wu transformation includes the Lorentz
transformation as a limiting case of zero acceleration. [1] There are other simple
generalizations of the Lorentz transformation. In all these minimal generalizations, the spacetime
of CLA frames is characterized by a metric tensor of the form $(W^2, -1, -1, -1) \equiv P_{\mu\nu}$,
which may be called the Poincaré metric tensor. Furthermore, there exist finite
transformations for inertial and CLA transformations and, therefore, the spacetime of these CLA
frames is flat, i.e., having a vanishing Riemann curvature tensor. This property is useful
for formulations of field theory.
Let us consider the transformations between an inertial frame $F_I(w_I, x_I, y_I, z_I)$ and
a new constant-linear-acceleration (CLA) frame $F(w, x, y, z)$ which moves with a
time-dependent velocity $\beta(w)$ along the x-axis. Suppose the metric tensors in the inertial and
the CLA frames are:
$\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$ (1)
$P_{\mu\nu} = \mathrm{diag}(W^2, -1, -1, -1)$ (2)
where $W = W(w, \mathbf{r})$ is any real-valued function of spacetime in $F(w, x, y, z)$ with the
property $W \to 1$ for vanishing acceleration. Thus, the Poincaré metric tensor $P_{\mu\nu}$ reduces
to the Minkowski metric tensor $\eta_{\mu\nu}$ in the limit of zero acceleration.
To derive the spacetime transformation for CLA frames, one may start with a linear
relation for the differentials $(dw_I, dx_I)$ and $(dw, dx)$ with some unknown coefficients which
are functions of spacetime. Other components, $dy$ and $dz$, are unchanged. Based on
physical considerations, a finite transformation of spacetime must exist between $F_I$ and F.
Thus, these coefficients must satisfy integrability conditions, and the finite transformation
must reduce to the Lorentz transformation in the limit of zero acceleration, $\beta \to \beta_0$. One
obtains two types of simple generalizations of the Lorentz transformation from inertial
frames to constant-linear-acceleration frames, which are characterized by the metric tensor
of the form (2). [1,4]
Let us consider a class of transformations for an inertial frame $F_I(w_I, x_I, y_I, z_I)$ and a
constant-linear-acceleration frame $F(w, x, y, z)$, accelerated in the x-direction, [5] including
constant spacetime translations:
where $x_0^\mu = (w_0, x_0, y_0, z_0)$ are constants and the velocity function $\beta(w)$ will be specified
and discussed below. This general transformation of spacetime is interesting because it
includes the Wu transformation and the Møller transformation for accelerated frames as
special cases. Even though $\beta(w)$ may be an arbitrary function of time $w$ in $F$, it is actually
a specific and unique expression when we express it in terms of spacetime coordinates of
the inertial frame $F_I$:
Since the right-hand side of (4) does not involve any arbitrary function, this implies that,
from the viewpoint of observers in the inertial frame $F_I$, the frame $F$ always moves the
same way no matter what function is chosen for $\beta(w)$. This may be interpreted as a
flexibility of the time $w$ of the accelerated frame $F$, as we shall see below.
When the acceleration in (3) approaches zero, i.e., $\beta \to \beta_0$, the linear-acceleration
$$ y_I = y, \qquad z_I = z; \qquad \beta = \beta_0 + \alpha_0 w. $$
These are called the Wu transformations, which were the first generalization of the Lorentz
transformation obtained on the basis of limiting 4-dimensional symmetry. [2] We also
have the following transformation for the differential 4-vector $dx^\mu = (dw, dx, dy, dz)$ and the
explicit expression for the invariant interval $ds^2$:
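$$ w = \frac{1}{\alpha_0}\left(\frac{w_I + \beta_0/(\alpha_0\gamma_0)}{x_I + 1/(\alpha_0\gamma_0)} - \beta_0\right), $$
$$ x = \sqrt{\left(x_I + \frac{1}{\alpha_0\gamma_0}\right)^2 - \left(w_I + \frac{\beta_0}{\alpha_0\gamma_0}\right)^2} - \frac{1}{\alpha_0\gamma_0^2}, \qquad (9) $$
$$ y = y_I, \qquad z = z_I. $$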
Contributions of singular terms in the transformations (3) and (9) cancel in the limit of zero
acceleration. Thus, the inertial limit $\alpha_0 \to 0$ is well-defined for the Wu transformations.
We are used to picturing the physical world from the viewpoint of observers in an inertial
frame. Let us use the inverse Wu transformations (9) to show the change and
the physical properties of the space and time axes $(w, x)$ due to velocity and acceleration.
We observe that for constant $w$ and $x_I \neq -1/(\alpha_0\gamma_0)$, the time transformation in (9)
leads to straight lines. All lines corresponding to constant $w$ pass through the same point
$(w_I, x_I) = (-\beta_0/(\alpha_0\gamma_0), -1/(\alpha_0\gamma_0))$. In contrast, for constant $x$, the inverse Wu
transformation (9) leads to two sets of hyperbolic lines, which satisfy the condition
$|x_I + 1/(\alpha_0\gamma_0)| > |w_I + \beta_0/(\alpha_0\gamma_0)|$.
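As a numerical illustration of the inverse transformation (9), the sketch below implements it in Python together with a forward map obtained by solving (9) algebraically for $(w_I, x_I)$, under the choice $\beta = \beta_0 + \alpha_0 w$. The function names and the sample values of $\beta_0$ and $\alpha_0$ are illustrative assumptions, not part of the original text; units are such that c = 1.

```python
import math

def gamma_of(beta):
    """Lorentz factor for a dimensionless velocity beta (|beta| < 1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def wu_inverse(w_I, x_I, beta0, alpha0):
    """Inverse Wu transformation (9): inertial (w_I, x_I) -> CLA (w, x)."""
    g0 = gamma_of(beta0)
    W = w_I + beta0 / (alpha0 * g0)   # shifted inertial time
    X = x_I + 1.0 / (alpha0 * g0)     # shifted inertial position
    w = (W / X - beta0) / alpha0
    x = math.sqrt(X * X - W * W) - 1.0 / (alpha0 * g0 * g0)
    return w, x

def wu_forward(w, x, beta0, alpha0):
    """Forward map found by solving (9) for (w_I, x_I).

    Assumes beta = beta0 + alpha0 * w and a point on the physical side
    of the singular wall, x + 1/(alpha0 * gamma0**2) > 0.
    """
    g0 = gamma_of(beta0)
    beta = beta0 + alpha0 * w
    g = gamma_of(beta)
    rho = x + 1.0 / (alpha0 * g0 * g0)   # distance from the singular wall
    w_I = g * beta * rho - beta0 / (alpha0 * g0)
    x_I = g * rho - 1.0 / (alpha0 * g0)
    return w_I, x_I

if __name__ == "__main__":
    beta0, alpha0 = 0.3, 0.1          # hypothetical sample values
    w_I, x_I = wu_forward(0.5, 2.0, beta0, alpha0)
    # The round trip recovers (0.5, 2.0) up to floating-point rounding.
    print(wu_inverse(w_I, x_I, beta0, alpha0))
```

The round trip only works on the physical side of the wall, because the square root in (9) selects the positive branch.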
Figure 4 shows the changes of the lightcone given by the relation $s^2 = w_I^2 - x_I^2 = 0$, where
$w_I$ and $x_I$ can be expressed in terms of $w$ and $x$ by using the Wu transformation (7).
The straight lines represent the lightcone in an inertial reference frame for comparison.
It is interesting to observe that the region of spacetime consisting of all points
$[x + 1/(\alpha_0\gamma_0^2)] > 0$ and all $w$ from $(-1 - \beta_0)/\alpha_0$ to $(1 - \beta_0)/\alpha_0$ corresponds to the sector
of the $(w_I, x_I)$ plane between $(w_I + \beta_0/\alpha_0\gamma_0) = +(x_I + 1/\alpha_0\gamma_0)$ and
$(w_I + \beta_0/\alpha_0\gamma_0) = -(x_I + 1/\alpha_0\gamma_0)$. As $\alpha_0 \to 0$, this region becomes larger and larger, and
eventually covers the whole plane $(w_I, x_I)$, as one can see in Figures 1, 2 and 3. Thus, this region
of spacetime is identified with the physical world of the CLA frame $F$. Similarly, the region of
spacetime consisting of all $[x + 1/(\alpha_0\gamma_0^2)] < 0$ is represented by the opposite sector with negative
$(x_I + 1/\alpha_0\gamma_0)$. This opposite region of spacetime may be called the mirror spacetime of
the CLA frame. The rest of the $(w_I, x_I)$ plane cannot be reached for any real $w$ and $x$ for
$\alpha_0 \neq 0$. The time $w$ is physical only within the range $(-1 - \beta_0)/\alpha_0 < w < (1 - \beta_0)/\alpha_0$.
Outside this range, the CLA frame ceases to exist because the velocity $\beta(w)$ is greater
than 1, which is unphysical. The limits of time, $w \to (\pm 1 - \beta_0)/\alpha_0$, in the CLA frame
imply $w_I \to \pm\infty$ in the inertial frame (everywhere in physical space except at the singular
wall). This property is interlocked with the assumption $\beta(w) = \beta_0 + \alpha_0 w$. If one makes a
different assumption for the arbitrary function $\beta(w)$ in (3), one has a different time for a
CLA frame. [1]
The Wu transformations suggest that accelerations distort spacetime and form a horizon,
which corresponds to a wall singularity $x = -1/(\alpha_0\gamma_0^2)$ (with arbitrary $y$ and $z$
coordinates) of the coordinates for the CLA frame $F$, or $(x_I + 1/\alpha_0\gamma_0)^2 = (w_I + \beta_0/\alpha_0\gamma_0)^2$.
This singular wall separates physical spacetime from the mirror spacetime. The location
of the wall singularity depends on the sign of the acceleration $\alpha_0$ and the magnitude
$|\alpha_0\gamma_0^2|$. The mirror spacetime emerges from the quadratic equation associated with the
second equation in (9):
$$ \left(x + \frac{1}{\alpha_0\gamma_0^2}\right)^2 = \left(x_I + \frac{1}{\alpha_0\gamma_0}\right)^2 - \left(w_I + \frac{\beta_0}{\alpha_0\gamma_0}\right)^2 $$
Note that we also have a quadratic equation for the relativistic energy-momentum relation,
which leads to the negative energy solution. In some sense, the mirror spacetime resembles
the negative energy solution of a particle, so one may ask whether the mirror spacetime
exists in some physical sense. Such a question probably cannot be answered because even
if it exists there is no possibility of communication between these two sectors of spacetime
separated by the singular wall.
The rate of ticking of a clock and the observable speed of light in the CLA frame $F$
near this singular wall have very peculiar properties. Namely, as $x \to -1/(\alpha_0\gamma_0^2)$ in
(9) (i.e., $(x_I + 1/\alpha_0\gamma_0) \to (w_I + \beta_0/(\alpha_0\gamma_0))$), the Wu transformation leads to the following
results:
(a) The clock stands still:
The result (a) shows that the rate of ticking of a clock at rest in $F$ at the position
$(x, 0, 0)$, i.e., $dx = 0$, slows down in comparison with a clock at $(x_I, 0, 0)$ in the inertial
frame. The result (b) is due to the fact that the law for the propagation of light is given by
$ds^2 = 0$. The two properties in (10) and (11) are intimately related. We may remark that
it is natural for the ratio (10) to be positive. This is consistent with $(x_I + 1/(\alpha_0\gamma_0)) > 0$
and $(x_I + 1/\alpha_0\gamma_0)^2 > (w_I + \beta_0/(\alpha_0\gamma_0))^2$, where the last relation is consistent with $\beta^2 < 1$.
Note that the general transformations (3) with an arbitrary velocity function $\beta(w)$
correspond to constant linear acceleration in the following sense: Suppose a particle is at
rest in the CLA frame $F$ at the position $(x, 0, 0)$ = constant; we can derive the following
relation,
This result is consistent with the constant acceleration of a charged particle in a high-
energy linear accelerator, which has a constant potential drop per unit length. Thus, a
charged particle gains a constant kinetic energy per unit length. Note that the usual
definition of constant acceleration, $d^2x_I/dw_I^2$ = constant, in classical mechanics is only an
approximation for small velocities and is inconsistent with high-energy experiments.
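The sense in which a particle at rest in $F$ undergoes constant linear acceleration can be checked numerically. The sketch below assumes the forward map obtained by inverting (9) with $\beta = \beta_0 + \alpha_0 w$, takes $p_I = m\gamma\beta$ for a particle at fixed $x$, and estimates $dp_I/dw_I$ by finite differences. The result is compared with $m\alpha_0/(\gamma_0^{-2} + \alpha_0 x)$, the value given in note [9]. Function names and sample numbers are illustrative assumptions.

```python
import math

def gamma_of(beta):
    """Lorentz factor for a dimensionless velocity beta (|beta| < 1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def inertial_time_and_momentum(w, x, beta0, alpha0, m=1.0):
    """Inertial time w_I and momentum p_I = m*gamma*beta of a particle at rest at x in F.

    Assumes beta = beta0 + alpha0*w and the forward map found by inverting (9).
    """
    g0 = gamma_of(beta0)
    beta = beta0 + alpha0 * w
    g = gamma_of(beta)
    rho = x + 1.0 / (alpha0 * g0 * g0)
    w_I = g * beta * rho - beta0 / (alpha0 * g0)
    return w_I, m * g * beta

def force_estimates(x, beta0, alpha0, m=1.0, w_values=(0.0, 0.5, 1.0, 1.5)):
    """Finite-difference dp_I/dw_I between successive values of the CLA time w."""
    pts = [inertial_time_and_momentum(w, x, beta0, alpha0, m) for w in w_values]
    return [(p2 - p1) / (t2 - t1) for (t1, p1), (t2, p2) in zip(pts, pts[1:])]

def expected_force(x, beta0, alpha0, m=1.0):
    """Constant value m*alpha0 / (gamma0**-2 + alpha0*x) given in note [9]."""
    g0 = gamma_of(beta0)
    return m * alpha0 / (g0 ** -2 + alpha0 * x)

if __name__ == "__main__":
    # Hypothetical sample values; every estimate agrees with the expected constant.
    print(force_estimates(2.0, 0.3, 0.1), expected_force(2.0, 0.3, 0.1))
```

Because both $p_I$ and $w_I$ are linear in $\gamma\beta$ at fixed $x$, the finite differences are exact up to rounding, whereas $d^2x_I/dw_I^2$ is not constant.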
$$ N(w) = N_0 e^{-w/w_\tau}, \qquad (13) $$
where $N_0$ is the number of particles at time $w = 0$ and $w_\tau$ is the constant lifetime of the
particles at rest. The numerical value of $w_\tau$ is the same as the decay length $D = \tau_0 c$, where
$\tau_0$ is the usual lifetime at rest. Since the experiment is performed in the inertial frame $F_I$,
we express the law (13) in terms of observable variables in $F_I(x_I^\mu)$ by using the Lorentz
transformations between $F$ and $F_I$, i.e., the coordinates $x^\mu = (w, x, y, z)$ and $x_0^\mu$ in (5)
are replaced by $x^\mu = (w, x, y, z)$ and zero respectively. Using $x = y = z = 0$, we have
$$ N = N_0 e^{-w_I/(\gamma_0 w_\tau)} = N_0 e^{-x_I/(\gamma_0\beta_0 w_\tau)}. \qquad (14) $$
It shows that the particle lifetime is dilated by a factor $\gamma_0$. This result of lifetime dilation
has been confirmed with a very large value of $\gamma_0 \approx 700$ in high energy laboratories. [7]
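The inertial decay law (14) is easy to evaluate. The sketch below, a minimal illustration rather than the authors' own calculation, computes the surviving fraction $N/N_0$ as a function of laboratory position $x_I$; the function name is an assumption, and the sample values are the pion parameters quoted later in the text ($D$ = 7.8 m, $\beta_0$ = 0.8).

```python
import math

def surviving_fraction_inertial(x_I, beta0, decay_length):
    """N/N0 from (14) at lab position x_I for constant velocity beta0.

    decay_length is the rest decay length D = w_tau, in the same units as x_I.
    """
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 * beta0)
    return math.exp(-x_I / (gamma0 * beta0 * decay_length))

if __name__ == "__main__":
    for x_I in (0.0, 5.0, 10.4, 20.0):
        print(x_I, surviving_fraction_inertial(x_I, 0.8, 7.8))
```

With these values $\gamma_0\beta_0 D = (5/3)(0.8)(7.8\ \mathrm{m}) = 10.4$ m, so the population falls by a factor $e$ every 10.4 m in the laboratory rather than every 7.8 m.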
We stress that lifetime dilation of a particle due to constant acceleration is a purely
kinematic property and is independent of the detailed interactions of particles. To discuss the
lifetime dilation with constant linear accelerations for particles at rest in the CLA frame
$F(w, x, y, z)$, we assume that the form of the exponential decay law (13) still holds to a
good approximation: [8]
$$ N = N_0 e^{-w/w_\tau}, \qquad (15) $$
in a linearly accelerated frame. Now instead of using the Lorentz transformations, we use
their generalization, i.e., the Wu transformation (3) with $x_0^\mu = 0$, and express $w$ in terms
of the spacetime variables in the inertial laboratory frame $F_I(x_I^\mu)$:
where $(x, 0, 0)$ denotes the particle's fixed position in the CLA frame $F(x^\mu)$.
In the experiment of particle decay in flight, one measures the number of particle
decays as a function of position $x_I$ rather than time $w_I$ in an inertial laboratory. From
(15) and (16), we obtain the generalized decay law for unstable particles moving with
constant linear accelerations:
In the derivation of (16) and (17), we have used the inverse Wu transformation (9)
Figure 5: Particle Decay in Flight with Different Accelerations. This figure shows the
different decays of accelerated pions (mass $m_0$ = 140 MeV, 'decay length' $D$ = 7.8 m = $w_\tau$, and
$x$ = 0) with $\beta_0$ = 0.8. The symbol $x_I$[m] denotes the distance $x_I$ measured in meters, n denotes
N. From the upper to the lower graph, the constant 'acceleration' $F/m_0$ is respectively given by
1/(14 m), 1/(35 m), 1/(70 m), 1/(140 m), 1/(280 m), and 1/(560 m).
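$\gamma_0(1 + \alpha_0\beta_0\gamma_0^2 w)$ lead to the relation for $w$:
$$ w = \frac{1}{\beta_0}\left(\frac{x_I}{\gamma_0} - x - \alpha_0\gamma_0^2\, x_I x + \alpha_0\gamma_0^2 x^2\right) $$
This relation and (9) lead to the following approximate decay formula for small acceleration
$\alpha_0$,
(18)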
Clearly, when $\alpha_0 \to 0$ it reduces to the lifetime dilation (15) of special relativity, if the
particle is at rest at the origin of the moving frame, i.e., $x = 0$. However, when the location
of particles at rest in $F$ is at $x = a \neq 0$, one obtains a new effect of lifetime dilation due
to acceleration $\alpha_0$. Based on (19), one can verify that $N(a)/N(0) \neq N(0)/N(-a)$. This
result in a CLA frame suggests that we no longer have the usual translational invariance
of inertial frames. This new non-translational invariance in non-inertial frames, suggested
by the lifetime dilation for particle decay in flight with acceleration, could be tested by
experiments in the future. Of course, the measurements of $N(a)$, $N(0)$ and $N(-a)$ are
more difficult than the usual lifetime dilation. We note that the generalized law for the
lifetime dilation as shown in the decay law (17) or (19) holds for both particles moving
with constant velocity and constant-linear-acceleration.
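The position dependence described above can be explored numerically. The sketch below is not the paper's formula (17) or (19), which did not survive in this copy of the text; instead it assumes the exact relation $x_I = \gamma\,(x + 1/(\alpha_0\gamma_0^2)) - 1/(\alpha_0\gamma_0)$ obtained by inverting (9), with $\beta = \beta_0 + \alpha_0 w$, and inserts the resulting $w$ into the decay law (15). Function names and the value $\alpha_0$ = 0.01 per meter are illustrative assumptions.

```python
import math

def gamma_of(beta):
    """Lorentz factor for a dimensionless velocity beta (|beta| < 1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def cla_time_at(x_I, x, beta0, alpha0):
    """CLA time w at which a particle at rest at x in F reaches lab position x_I.

    Assumes alpha0 > 0, w >= 0, and the forward relation found by inverting (9).
    """
    g0 = gamma_of(beta0)
    g = (x_I + 1.0 / (alpha0 * g0)) / (x + 1.0 / (alpha0 * g0 * g0))
    beta = math.sqrt(1.0 - 1.0 / (g * g))
    return (beta - beta0) / alpha0

def surviving_fraction(x_I, x, beta0, alpha0, decay_length):
    """N/N0 = exp(-w/w_tau) from (15), with w evaluated at lab position x_I."""
    return math.exp(-cla_time_at(x_I, x, beta0, alpha0) / decay_length)

if __name__ == "__main__":
    args = dict(beta0=0.8, alpha0=0.01, decay_length=7.8)  # pion-like sample values
    n0 = surviving_fraction(20.0, 0.0, **args)
    na = surviving_fraction(20.0, 5.0, **args)
    nm = surviving_fraction(20.0, -5.0, **args)
    # The two ratios differ, illustrating the loss of translational invariance.
    print(na / n0, n0 / nm)
```

In the limit of very small `alpha0` and `x = 0` the function reproduces the special-relativistic law (14), which serves as a consistency check on the sketch.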
To see the behavior of the particle decay at different accelerations, let us plot the curves
with different values of $\alpha_0$. In linear accelerators, the constant acceleration $\alpha_0$ is expressed
in terms of a constant potential drop per unit length,
we have [9]
Let us use $m_0 = 140\ \mathrm{MeV}/c^2$ and the rest decay length $D$ = 7.8 meters for pions at rest in
$F$ moving with a constant velocity $\beta_0 = 0.8$ for a demonstration. The exponential law for
particle decay in flight with different potential drops per meter is shown in Figures 5 and 6.
It is important to determine experimentally the correct physical time in accelerated
frames before one formulates physical theories in accelerated frames. Therefore, this type
of experiment to test physical time in accelerated frames is crucial to free our understanding
of physics from the bondage of inertial frames. The experimental results will also motivate
physicists to formulate theories in both inertial and non-inertial frames and to extend our
view and understanding of the physical world.
The work was supported in part by the Jing Shin Research Fund of the UMassD
Foundation and the Potz Science Fund.
Figure 6: Particle Decay in Flight with Different Initial Velocities. This figure shows
the different decays of accelerated pions (mass $m_0$ = 140 MeV, decay length $D$ = 7.8 m = $w_\tau$,
and $x$ = 0) with the force $F$ = 1 MeV/m acting on them. The initial velocity $\beta_0$ is, from the
upper to the lower graph, given by 0.99, 0.5, 0.4, 0.3, 0.2, and 0.1 respectively.
References
[1] Daniel T. Schmitt and Jong-Ping Hsu, Intern. J. Modern Phys. A (2005, to be published).
[2] Jong-Ping Hsu and Leonardo Hsu, Nuovo Cimento B 112, 575 (1997) and Chin. J.
Phys. 35, 407 (1997).
[3] C. Møller, Danske Vid. Sel. Mat.-Fys. 20, No. 19 (1943); see also The Theory of
Relativity (Oxford University Press, 1952), Chapter VII; Ta-You Wu and Y. C. Lee,
Intern. J. Theoretical Phys. 5, 307 (1972); Ta-You Wu, Theoretical Physics, vol. 4,
Theory of Relativity (Lian Jing Publishing Co., Taipei, 1978) pp. 172-175; T. Fulton
and F. Rohrlich, Ann. Phys. 9, 499 (1960); T. Fulton, F. Rohrlich and L. Witten,
Nuovo Cimento XXVI, 652 (1962); E. A. Desloge and R. J. Philpott, Am. J. Phys.
55, 252 (1987).
[8] Explicit calculation of particle decay at rest based on quantum field theory formulated
in non-inertial frames appears to be non-trivial. So far, there is no satisfactory
formulation of quantum field theory in non-inertial frames. The assumption of a con-
stant lifetime of a particle at rest in CLA frames can be justified for small accelera-
tions. Furthermore, this constant value of lifetime is not crucial for experimental tests
discussed here. The position-dependence of N(x) in equation (17) or (18) is crucial
for experimental tests.
[9] Their relation can be obtained by calculating $(dp_I^0/dx_I)$ with $x$ fixed, where
$p_I^0 = m_0\gamma$. One has $(dp_I^0/dx_I)_x = (dp_I^1/dw_I)_x = m_0\alpha_0/(\gamma_0^{-2} + \alpha_0 x) = F$. See also
equation (12).
Chapter 6
believe that there were experiments; I imagined that there were a lot of experiments and
that the gravitational constant was more like the electrical constant and that they were coming
up with data on the various gravitating atoms, and so forth; and that it was a challenge to
calculate whether the theory agreed with the data. So that in each case I gave myself a specific
physical problem; not a question, what happens in a quantized geometry, how do you define
an energy tensor etc., unless that question was necessary to the solution of the physical
problem, so please appreciate that the plan of the attack is a succession of increasingly
complex physical problems; if I could do one, then I was finished, and I went to a harder
one imagining the experimenters were getting into more and more complicated situations.
Also I decided not to investigate what I would call familiar difficulties. The quantum
electrodynamics diverges; if this theory diverges, it's not something to be investigated unless it
produces any specific difficulties associated with gravitation. In short, I was looking entirely
for unfamiliar (that is, unfamiliar to meson physics) difficulties. For example, it's immediately
remarked that the theory is non-linear. This is not at all an unfamiliar difficulty;
the theory, for example, of the spin 1/2 particles interacting with the electromagnetic field
has a coupling term $\bar{\psi} A \psi$ which involves three fields and is therefore non-linear; that's
not a new thing at all. Now, I thought that this would be very easy and I'd just go ahead
and do it, and here's what I planned. I started with the Lagrangian of Einstein for the
interacting field of gravity and I had to make some definition for the matter since I'm dealing
with real bodies and make up my mind what the matter was made of; and then later I would
check whether the results that I have depend on the specific choice or they are more powerful.
I can only do one example at a time; I took spin zero matter; then, since I'm going to make
a perturbation theory, just as we do in quantum electrodynamics, where it is allowed (it is
especially more allowed in gravity where the coupling constant is smaller), $g_{\mu\nu}$ is written
as flat space as if there were no gravity plus $\kappa$ times $h_{\mu\nu}$, where $\kappa$ is the square root of the
gravitational constant. Then, if this is substituted in the Lagrangian, one gets a big mess,
which is outlined here.
$$ g_{\mu\nu} = \delta_{\mu\nu} + \kappa h_{\mu\nu}. $$
Substituting and expanding, and simplifying the results by a notation (a bar over a tensor
means
$$ \bar{x}_{\mu\nu} = \tfrac{1}{2}\left(x_{\mu\nu} + x_{\nu\mu} - \delta_{\mu\nu} x_{\sigma\sigma}\right); $$
notice that if $x_{\mu\nu}$ is symmetric, $\bar{\bar{x}}_{\mu\nu} = x_{\mu\nu}$) we get
First, there are terms which are quadratic in $h$; then there are terms which are quadratic
in $\varphi$, the spin zero meson field variable; then there are terms which are more complicated
than quadratic; for example, here is a term with two $\varphi$'s and one $h$, which I will write $h\varphi\varphi$
(I have written that one out, in particular); there are terms with three $h$'s; then there are
terms which involve two $h$'s and two $\varphi$'s; and so on and so on with more and more complicated
terms. The first two terms are considered as the free Lagrangian of the gravitational
field and of the matter.
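The remark that barring a symmetric tensor twice returns the original tensor can be verified directly. The sketch below assumes four dimensions and a plain sum over the repeated index $\sigma$ (no sign factors), which is how the bar notation reads in the text; the function name and the sample matrix are illustrative.

```python
def bar(x):
    """Return x-bar with components (x[m][k] + x[k][m] - delta_mk * trace(x)) / 2."""
    n = len(x)
    trace = sum(x[s][s] for s in range(n))
    return [[0.5 * (x[m][k] + x[k][m] - (trace if m == k else 0.0))
             for k in range(n)] for m in range(n)]

if __name__ == "__main__":
    # An arbitrary symmetric 4x4 example.
    sym = [[1.0, 2.0, 0.0, -1.0],
           [2.0, 3.0, 4.0, 0.5],
           [0.0, 4.0, -2.0, 1.5],
           [-1.0, 0.5, 1.5, 7.0]]
    double_bar = bar(bar(sym))
    print(max(abs(double_bar[i][j] - sym[i][j]) for i in range(4) for j in range(4)))
```

The identity depends on the dimension: the trace of $\bar{x}$ equals $(1 - n/2)$ times the trace of $x$, which is minus the trace only for $n = 4$.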
Now we look first at what we would want to solve the problem classically; we take the
variation of this with respect to $h$, and from the first term we produce a certain combination of
second derivatives, and on the other side a mess involving higher orders than first. And the
same with the $\varphi$, of course.
We will speak in the following way: (3) is a wave equation, of which $S_{\mu\nu}$ is the source, just
like (4) is the wave equation of which $\chi$ is the source. The problem is to solve those equations
in succession, and to use the usual methods of calculation of the quantum theory.
Inasmuch as I wanted to get into the minimum of difficulties, I just took a guess that I use
the same plan as I do in electricity; and the plan in electricity leads to the following
suggestion here: that if you have a source, you divide by the operator on the left side of (3)
in momentum space to get the propagator field. So I have to solve this equation (3). But
as you all know it is singular; the entire Lagrangian in the beginning was invariant under
a complicated transformation of $g$, which in the form of $h$ is the following; if you add to $h$
a gradient plus more, the entire system is invariant:
where $\xi_\mu$ is arbitrary, and $\mu$ and $\nu$ should be made symmetric in all these equations. As
a consequence of this same invariance in the complete Lagrangian one can show that the
source $S_{\mu\nu}$ must have zero divergence, $S_{\mu\nu,\nu} = 0$. In fact equations (3) would not be consistent
without this condition as can be seen by barring both sides and taking the divergence - the
left side vanishes identically. Now, because of the invariance of the equations, in the same
way that the Maxwell equations cannot be solved to get a unique vector potential - so
these can't be solved and we can't get a unique propagator. But because of the invariance
under the transformation some arbitrary choice of a condition on $h_{\mu\nu}$ can be made, analogous
to the Lorentz condition $A_{\mu,\mu} = 0$ in quantum electrodynamics. Making the simplest choice
which I know, I make the choice $\bar{h}_{\mu\sigma,\sigma} = 0$. This is four conditions and I have free the four
variables $\xi_\mu$ that I can adjust to make the condition satisfied by $h_{\mu\nu}$. Then this equation (3)
is very simple, because two terms in (3) fall away and all we have is that the d'Alembertian
of $h$ is equal to $\bar{S}$. Therefore the generating field from a source $S_{\mu\nu}$ will equal the $\bar{S}_{\mu\nu}$ times
$1/k^2$ in Fourier series, where $k^2$ is the square of the frequency, wave vector; the time part
might be called the frequency $\omega$, the space part $\mathbf{k}$. This is the analogue of the equation in
electricity that says that the field is $1/k^2$ times the current. In the method of quantum field
theory, you have a source which generates something, and that may interact later with
something else; the interaction, of course, is $S_{\mu\nu} h_{\mu\nu}$; so that, I say, one source may create a potential
which acts on another source. So, to take the very simplest example of two interacting
systems, let's say $S$ and $S'$, the result would be the following: $h$ would be generated by $S_{\mu\nu}$,
and then it would interact with $S'_{\mu\nu}$; so we would get for the interaction of two systems, of
two particles, the fundamental interaction that we investigate
$$ \kappa^2\, \bar{S}'_{\mu\nu}\, \frac{1}{k^2}\, S_{\mu\nu}. \qquad (6) $$
There is a singular point in the last term when $\omega = k$, and to be precise we put in the $+i\epsilon$
as is well-known from electrodynamics. You note that in the first two terms instead of one
over a four-dimensional $\omega^2 - k^2$ we have here just $1/k^2$, the momentum itself. $S_{44}$ is the
energy density, so this first term represents the two energy densities interacting with no $\omega$
dependence which means, in the Fourier transform, an interaction instantaneous in time;
and $1/k^2$ means $1/r$ in space, so there's an instantaneous $1/r$ interaction between masses,
Newton's law. In the next term there's another instantaneous term which says that Newton's
mass law should be corrected by some other components analogous to a kind of magnetic
interaction (not quite analogous because the magnetic interaction in electricity already
involves a $k^2 - \omega^2 + i\epsilon$ propagator rather than just $k^2$. But the $k^2 - \omega^2 + i\epsilon$ in gravitation
comes even later and is a much smaller term which involves velocities to the fourth). So
if we really wanted to do problems with atoms that were held together gravitationally it
would be very easy; we would take the first term, and possibly even the second as the
interaction. Being instantaneous, it can be put directly into a Schrödinger equation, analogous
to the $e^2/r$ term for electrical interaction. And that takes care of gravitation to a very high
accuracy, without a quantized field theory at all. However, for still higher accuracy we
have to do the radiative corrections, which come from the last term.
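The statement that $1/k^2$ with no $\omega$ dependence corresponds to an instantaneous $1/r$ interaction is the standard three-dimensional Fourier transform. A short worked form follows; the normalization $(2\pi)^{-3}$ is a convention chosen here for illustration, not taken from the lecture:

```latex
\int \frac{d^3k}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\mathbf{k}^2}
  = \frac{1}{4\pi^2}\int_0^\infty dk \int_{-1}^{1} d(\cos\theta)\, e^{ikr\cos\theta}
  = \frac{1}{2\pi^2 r}\int_0^\infty \frac{\sin kr}{k}\,dk
  = \frac{1}{2\pi^2 r}\cdot\frac{\pi}{2}
  = \frac{1}{4\pi r}.
```

Since the integrand carries no dependence on $\omega$, the transform in time is a delta function, which is the sense in which the interaction is instantaneous.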
Radiation of free gravitons corresponds to the situation that there is a pole in the
propagator. There is a pole in the last term when $\omega = k$, of course, which means that the wave
number and the frequency are related as for a mass zero particle. The residue of the pole,
we see, is the product of two terms; which means that there are two kinds of waves, one
generated by $S_{11} - S_{22}$ and the other generated by $S_{12}$, and so we have two kinds of
transverse polarized waves, that is there are two polarization states for the graviton. The linear
combination $S_{11} - S_{22} \pm 2iS_{12}$ varies with angle $\theta$ of rotation in the 1-2 plane as $e^{\pm 2i\theta}$, so the
graviton has spin 2, component $\pm 2$ along direction of polarization. Everything is clear
directly from the expression (7); I just wanted to illustrate that the propagator (6) of quantum
mechanics and all that we know about the classical situation are in evident coincidence.
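The rotation property behind the spin-2 statement can be checked with a few lines of Python. The sketch below rotates the 1-2 block of a symmetric tensor by an angle $\theta$, using the convention $S' = R S R^{T}$ (an assumption; the opposite convention simply flips the sign of the phase), and compares the combination $S_{11} - S_{22} \pm 2iS_{12}$ before and after. Function names and sample numbers are illustrative.

```python
import cmath
import math

def rotate_tensor_12(S, theta):
    """Rotate a symmetric 2x2 block S (indices 1, 2) by angle theta: S' = R S R^T."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]
    return [[sum(R[i][a] * S[a][b] * R[j][b] for a in range(2) for b in range(2))
             for j in range(2)] for i in range(2)]

def helicity_combination(S, sign=+1):
    """The combination S11 - S22 + sign * 2i * S12."""
    return (S[0][0] - S[1][1]) + sign * 2j * S[0][1]

if __name__ == "__main__":
    S = [[1.3, -0.4], [-0.4, 0.2]]   # arbitrary symmetric sample
    theta = 0.37
    ratio = helicity_combination(rotate_tensor_12(S, theta)) / helicity_combination(S)
    # The ratio agrees with exp(2i*theta) up to rounding.
    print(ratio, cmath.exp(2j * theta))
```

A vector component combination such as $A_1 \pm iA_2$ would pick up $e^{\pm i\theta}$ instead; the factor of 2 in the exponent is what identifies helicity 2.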
In order to proceed to make specific calculations by means of diagrams, beside the
propagator we need to know just what the junctions are, in other words just what the $S$'s
are for a particular problem; and I shall just illustrate how that's done in one example.
It is done by looking at the non-quadratic terms in the Lagrangian; I've written one out
completely. This one has an $h$ and two $\varphi$'s in the Lagrangian (2). The rules of the quantum
mechanics for writing this thing are to look at the $h$ and two $\varphi$'s: one $\varphi$ each refers to the
in and out particle, and the one $h$ corresponds to the graviton; so we immediately see in
that term a two particle interaction through a graviton (see Fig. 1).
Fig. 1
And we can immediately read off the answer for the interaction this way: if $p_1$ and $p_2$ are
the momenta of the particles and $q$ the momentum of the graviton; and $e_{\alpha\beta}$ is the polarization
tensor of the plane wave representing the graviton, that is $h_{\alpha\beta} = e_{\alpha\beta}\, e^{iq\cdot x}$, the Fourier
expansion of this term gives the amplitude for the coupling of two particles to a graviton
So this is a coupling of matter to gravity; it is first order, and then there are higher terms;
but the point I'm trying to make is that there is no mystery about what to write down -
everything is perfectly clear, from the Lagrangian. We have the propagator, we have the
couplings, we can write everything. A term like $hhh$ implies a definite formula for the
interaction of three gravitons; it is very complicated, and I won't write it down, but you
can read it right off directly by substituting momenta for the gradients. That such a term
exists is, of course, natural, because gravity interacts with any kind of energy, including
its own, so if it interacts with an object-particles it will interact with gravitons; so this is
the scattering of a graviton in a gravitational field, which must exist. So that everything
is directly readable and all we have to do is proceed to find out if we get a sensible physics.
I've already indicated that the physics of direct interactions is sensible; and I go ahead
now to compute a number of other things.
To take just one example, we compute the Compton effect, or the analogue rather,
of the Compton effect, in which a graviton comes in and out on a particle. The amplitude
for this is a sum of terms corresponding to the diagrams of Fig. 2. The amplitude for
the first diagram of Fig. 2 is the coupling (8) times the propagator for the intermediate
meson which reads $(p^2 - m^2)^{-1}$, which is the Fourier transform of the equation (4) which
is the propagation of the spin zero particle. Then there is another coupling of the same
form as (8). We multiply these together, to get the amplitude for that diagram
where we should substitute $p = p_2 + q^b = p_1 + q^a$. Then you must add similar contributions
from the other diagrams.
A B C D
Fig. 2
The third one comes in because there are terms with two $h$'s and two $\varphi$'s in the Lagrangian.
One adds the four diagrams together and gets an answer for the Compton effect. It is rather
simple, and quite interesting; that it is simple is what is interesting, because the labour is
fantastic in all these things.
But the thing I would like to emphasize is this; in this problem we used a certain wave
$e^a_{\alpha\beta}$ for the incoming graviton number $a$ say; the question is could we use a different one?
According to the theory, it should really be invariant under coordinate transformations
and so on, but what it corresponds to here is the analogue of gauge invariance, that you can
add to the potential a gradient (see (5)). And therefore it should be that if I changed $e^a_{\alpha\beta}$ of
a particular graviton to $e^a_{\alpha\beta} + q^a_\alpha \xi_\beta$ where $\xi$ is arbitrary, and $q^a$ is the momentum of the
graviton, there should be no change in the physics. In short, the amplitude should be unchanged;
and it is. The amplitude for this particular process is what I call gauge-invariant, or
coordinate-transforming invariant. At first sight this is somewhat puzzling, because you would
have expected that the invariance law of the whole thing is more complicated, including
the last two terms in (5), which I seem to have omitted. But those terms have been included;
you see asymptotically all you have to do is worry about the second term, the last two in
$h$'s times $\xi$'s are in fact generated by the last diagram, Fig. 2D; when I put a gradient in here
for this one, what this means is if I put for the incoming wave a pure gradient, I should get
zero. If I put the gradient $q^a_\alpha \xi_\beta$ in for $e^a_{\alpha\beta}$ on this term D, I get a coupling between $\xi$
and the other field $e^b_{\alpha\beta}$ because of the three graviton coupling. The result, as far as the
matter line is concerned is that it is acted on in first order by a resultant field eEu E, qf +
+ 1 q: ePvtU which is just the last two terms in (5). The rule is that the field which acts on the
matter itself must be invariant the way described by (5); but here in Fig. 2 I've already
calculated all the corrections, the generator and all the necessary non-linear modifications if I
take all the diagrams into account. In short, asymptotically far away if I include all kinds
of diagrams such as D, the invariance need be checked only for a pure gradient added to
an incoming wave. It takes care of the non-linearities by calculating them through the
interaction.
I would like, now, to emphasize one more point that is very important for our later
discussion. If I add a gradient, I said, the result was zero. Let's call $a$ the one graviton coming
in and $b$ the other one in every diagram. The result is zero if I use a gradient for $a$, only
if $b$ is a free graviton with no source; that is if it is either really an honest graviton
with $(q^a)^2 = 0$, or a pure potential, which is a solution of the free wave equation. That is
unlike electrodynamics, where the field $b$ could have been any potential at all and adding
a gradient to $a$ would have made no difference. But in gravity, it must be that $b$ is a pure
wave; the reason is very simple. There is no way to avoid this by changing any propagators;
this is not a disease - there is a physical reason. The reason can be seen as follows: If this
$b$ had a source let me modify my diagrams to show the source of $b$, suppose some other
matter particle made the $b$, so we add onto each $b$ line a matter line at the end, like Fig. 3a.
(E.g. Fig. 2a becomes Fig. 3b etc.)
a b c
Fig. 3
Now, if $b$ isn't a free wave, but it had a source, the situation is this. If this $a$ field is taken
as a gradient field which operates everywhere on everything in the diagram
it should give zero. But we forgot something; there's another type of diagram, if the $a$
is supposed to act on everything, one of which looks like Fig. 3c, in which the $a$ itself
acts on the source of $b$ and then $b$ comes over to interact with the original matter. In other
words, among all the diagrams where there is a source, there's also these of type 3c. The
sum of all diagrams is zero; but the sum of those like Fig. 2 without those of type 3c is not
zero, and therefore if I were to just calculate the diagrams of Fig. 2 and forget about the
source of $b$ and then put a gradient in for $a$ the result cannot be zero, but must be getting
ready to cancel the terms from the likes of 3c when I do it right. That will turn out
to be an important point to emphasize. I have done a lot of problems like this, without
closed loops but I won't bore you with all the problems and answers; there's nothing new,
I mean nothing interesting, in the sense that no apparent difficulties arise.
However, the next step is to take situations in which we have what we call closed loops,
or rings, or circuits, in which not all momenta of the problem are defined. Let me just
mention something. I've analyzed this method both by doing a number of problems, and by
a mathematical high-class elegant technique - I can do high class mathematics too, but
I don't believe in it, that's the difference. I have to check it in a problem. I can prove that
no matter how complicated the problem is, if you take it in the order in which there are
no rings, in which every momentum is determined, the invariance is satisfied, the system
is independent of what choice I made of gauge and of the propagator I made in the beginning;
and everything is all right, there are no difficulties. I emphasize that this contains all
the classical cases, and so I'm really saying there are no difficulties in the classical
gravitation theory. This is not meant as a grand discovery, because after all, you've been worrying
about all these difficulties that I say don't exist, but only for you to get an idea of the
calibration - what I mean by difficulties! If we take the next case, let's say the interaction of
two particles in a higher order, then you get diagrams of which I'll only begin to write
a few of them. One that looks like this in which two gravitons are exchanged,
a b c
Fig. 4
or, for instance, a graviton gets split into two gravitons and then comes back - these are
only the beginning of a whole series of frightening-looking pictures, which correspond to
the problem of calculating the Lamb shift, or the radiative corrections to the hydrogen
atom. When I tried to do this, I did it in a straightforward way, following all the rules, putting
in the propagator $1/k^2$, and so on. I had some difficulties, the thing didn't look gauge
invariant but that had to do with the way I was making the cutoffs, because the stuff is infinite.
Shortage of time doesn't permit me to explain the way I got around all those things, because
in spite of getting around all those things the result is nevertheless definitely incorrect.
It's gauge-invariant, it's perfectly O.K. looking, but it is definitely incorrect. The reason
I knew it was incorrect is the following. In order to get it gauge-invariant, I had to do a lot
of pushing and pulling, and I got the feeling that the thing might not be unique. I figured
that maybe somebody else could do it another way or something, and I was rather suspicious,
so I tried to get more tests for it; and a student of mine, by the name of Yura, tested to see
if it was unitary; and what that means is the following: Let me take instead of this scattering
problem, a problem of Fig. 4 in which time runs vertically, a problem which gives the same
diagrams but in which time is running horizontally, which is the annihilation of a pair, to
produce another pair, and we are calculating second order corrections to that problem.
Let's suppose for simplicity that in the final state the pair is in the same state as before.
Then, adding all these diagrams gives the amplitude that if you have a pair, particle and
antiparticle, they annihilate and recreate themselves; in other words it's the amplitude
that the pair is still in the same state as a function of time. The amplitude to remain in the
same state for a time T in general is of the form
$$e^{-i(E_0 - i\gamma/2)T};$$
you see that the imaginary part of the phase goes as $e^{-\gamma T/2}$; which means that the probability
of being in a state must decrease with time. Why does the probability decrease in time?
Because there's another possibility, namely, these two objects could come together, annihilate,
and produce a real pair of gravitons. Therefore, it is necessary that this decay rate of the
closed loop diagrams in Fig. 4 that I obtain by directly finding the imaginary part of the sum
agrees with another thing I can calculate independently, without looking at the closed loop
diagrams. Namely, what is the rate at which a particle and antiparticle annihilate into two
gravitons? And this is very easy to calculate (same set of diagrams as Fig. 2, only turned on
its side). I calculated this rate from Fig. 2, checked whether this rate agrees with the rate
at which the probability of the two particles staying the same decreases (imaginary part of
Fig. 4), and it does not check. Something's the matter.
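The unitarity test just described can be written out in symbols. What follows is only a sketch under stated assumptions: $E_0$ (the energy of the pair state), $\gamma$ (its decay rate) and $\Gamma_{2g}$ (the rate for the pair to annihilate into two real gravitons, taken from the tree diagrams of Fig. 2) are labels introduced here for illustration, and two-graviton annihilation is assumed to be the only open channel at this order.

```latex
% Requires amsmath. Assumed symbols (not from the talk):
%   E_0         energy of the particle-antiparticle state
%   \gamma      decay rate read off from the imaginary part of the loop diagrams (Fig. 4)
%   \Gamma_{2g} annihilation rate into two real gravitons from the trees (Fig. 2)
\begin{align}
  A(T) &= e^{-i(E_0 - i\gamma/2)T} = e^{-iE_0 T}\, e^{-\gamma T/2}, \\
  P(T) &= |A(T)|^2 = e^{-\gamma T}, \\
  \gamma\,\big|_{\text{loops, Fig. 4}} &\overset{?}{=} \Gamma_{2g}\,\big|_{\text{trees, Fig. 2}} .
\end{align}
```

The last line is the comparison that, according to the talk, does not check when the loops are computed with the straightforward rules.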
This made me investigate the entire subject in great detail to find out what the trouble
is. I discovered in the process two things. First, I discovered a number of theorems, which as
far as I know are new, which relate closed loop diagrams and diagrams without closed loop
diagrams (I shall call the latter diagrams trees). The unitarity relation which I have just
been describing, is one connection between a closed loop diagram and a tree; but I found
a whole lot of other ones, and this gives me more tests on my machinery. So let me just tell
you a little bit about this theorem, which gives other rules. It is rather interesting. As a matter
of fact, I proved that if you have a diagram with rings in it there are enough theorems
altogether, so that you can express any diagram with circuits completely in terms of diagrams
with trees and with all momenta for tree diagrams in physically attainable regions and on
the mass shell. The demonstration is remarkably easy. There are several ways of demonstra-
ting it; I'll only choose one. Things propagate from one place to another, as I said, with
amplitude $1/k^2$. When translated into space, that's a certain propagation function which
you might call $K_+(1, 2)$, a function of two positions, 1, 2, in space-time. It represents, in the
past, incoming waves and in the future, it represents outgoing waves; so you have
waves come in and out; and that's the conventional propagator, with the $i\varepsilon$ and so on,
as usually represented. However, this is only a solution of the propagator's equation,
the wave equation I mean; it is a special solution, as you all know. There are other solutions;
for instance there is a solution which is purely retarded, which I'll call $K_{ret}$, and which exists
only inside the future light-cone. Now, if you have two Green's functions for the same
equation they must differ by some solution of the homogeneous equation, say $K_x$. That
means $K_x$ is a solution of the free wave equation and $K_+ = K_{ret} + K_x$. In a ring like Fig. 4a
we have a whole product of these $K_+$'s. For example, for four points 1, 2, 3, 4 in a ring
we have a product like this: $K_+(1,2)K_+(2,3)K_+(3,4)K_+(4,1)$ (all K's are not the same,
some of them belong to the gravitons and some are propagators for the particles and so on).
But now let us see what happens if we were to replace one (or more) of these $K_+$ by $K_x$,
say $K_+(1,2)$ is $K_x(1,2)$? Then between 1, 2 we have just free particles, you've broken the
ring; you've got an open diagram, because $K_x$ is a free wave solution, and this means it's
an integral over all real momenta of free particles, on the mass shell and perfectly honest.
Therefore if we replace one of $K_+$ by $K_x$ then that particular line is opened; and the process
is changed to one in which there is a forward scattering of an extra particle; there's a fake
particle that belongs to this propagator that has to be integrated over, but it's a free diagram -
it is now a tree, and therefore perfectly definite and unique to calculate. But I said that I
could open every diagram; the reason is this. First I note that if I put $K_{ret}$ for every K in
a ring, I get zero
$$K_{ret}(1,2)\,K_{ret}(2,3)\,K_{ret}(3,4)\,K_{ret}(4,1) = 0 \qquad (9)$$
for to be non zero $t_1$ must be greater than $t_2$, $t_2 > t_3$, $t_3 > t_4$ and $t_4 > t_1$, which is impossible.
Now make the substitution $K_{ret} = K_+ - K_x$ in (9). You get either all $K_+$ in each factor,
which is the closed loop we want; or at least one $K_x$, which are represented by tree diagrams.
Since the sum is zero, closed loops can be represented as integrals over tree diagrams. I was
surprised I had never noticed this thing before.
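The substitution argument can be spelled out for a ring with $n$ lines. This is a sketch in notation assumed here, not taken from the talk: $K^{(i)}$ denotes the propagator on the $i$-th line of the ring, and $S$ is the set of lines that have been opened, that is, replaced by the free-wave solution $K_x$. The statement is meant at the level of the integrand, before integrating over the space-time points.

```latex
% Requires amsmath. Assumed notation: K^{(i)} is the propagator on line i of an n-line ring;
% S is the subset of lines replaced by the free-wave solution K_x.
\begin{align}
  0 &= \prod_{i=1}^{n} K_{\mathrm{ret}}^{(i)}
     = \prod_{i=1}^{n} \bigl( K_+^{(i)} - K_x^{(i)} \bigr)
     = \sum_{S \subseteq \{1,\dots,n\}} (-1)^{|S|}
       \prod_{i \in S} K_x^{(i)} \prod_{i \notin S} K_+^{(i)}, \\
  \prod_{i=1}^{n} K_+^{(i)} &=
     \sum_{\emptyset \neq S \subseteq \{1,\dots,n\}} (-1)^{|S|+1}
       \prod_{i \in S} K_x^{(i)} \prod_{i \notin S} K_+^{(i)} .
\end{align}
```

Every term on the right has at least one opened line, so none of them is a closed ring. Terms with one opened line enter with a plus sign, terms with two opened lines with a minus sign, and so on, which matches the add, subtract, add pattern of the one-ring rule given later in the talk.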
Well, then I checked whether these diagrams of Fig. 4 when opened into trees agreed
with the theorem. I mean I hoped that the theorem proved for other meson theories would
agree in principle for the gravity case, such that on opening a virtual graviton line the
tree would correspond to forward scattering of free graviton waves. And it does not work
in the gravity case. But, you say, how could it fail, after you just demonstrated that it ought
to work? The reason it fails is the following: This argument has to do with the position of
the poles in the propagators; a typical propagator is a factor $1/(k^2-m^2+i\varepsilon)$, the $+i\varepsilon$ due
to the poles, and all I'm doing here is changing the rule about the poles and picking up an
extra delta function $\delta(k^2-m^2)$ as a consequence, which is the free wave coming in and
out. What I want these free waves to represent in the gravity case are physical gravitons
and not something wrong. They do represent waves of $q^2 = 0$ of course, but, as it turns
out, not with the correct polarization to be free gravitons. I'd like to show it. It has to do
with the numerator, not the denominator. You see the propagator that I wrote before, which
was $\bar S_{\mu\nu}$ times $1/(k^2+i\varepsilon)$ times $S'_{\mu\nu}$, is being replaced by $\bar S_{\mu\nu}\,\delta(q^2)\,S'_{\mu\nu}$. Now when I make
$q^2 = 0$ I have a free wave instead of arbitrary momentum. This should be a real graviton
or else there's going to be physical trouble. It isn't; although it is of zero momentum, it
is not transverse. It does not make any difference in understanding the point so forget one
index in $S_{\mu\nu}$ - it's a lot of extra work to carry the other index so just imagine there's one
index: $S_\mu S'_\mu\,\delta(q^2)$. This combination $S_\mu S'_\mu$ is $S_4S'_4 - S_3S'_3 - S_1S'_1 - S_2S'_2$, where 4 is the
time and 3 is the direction, say, of momentum of the four-vector q. Then 1 and 2 are trans-
verse, and those are the only two we want. (Please appreciate I removed one index - I can
make it more elaborate, but it is the same idea.) That is we want only $-S_1S'_1 - S_2S'_2$ instead
of the sum over four. Now what about this extra term $S_4S'_4 - S_3S'_3$? Well, it is $S_4 - S_3$ times
$S'_4 + S'_3$ plus $S_4 + S_3$ times $S'_4 - S'_3$. But $S_4 - S_3$ is proportional to $q_\mu S_\mu$ (suppressing one index)
because $q_4$ in this notation is the frequency and equals $q_3$, if we assume the 3-direction is
the direction of the momentum. So $S_4 - S_3$ is the response of the system to a gradient
potential, which we proved was zero in our invariance discussion. Therefore, we have shown
$(S_4 - S_3)(S'_4 + S'_3) = 0$ and this should be accounted for by purely transverse wave contri-
butions. But it isn't, and it isn't because the proof that the response to a gradient potential
is zero required that the other particle that was interacting was an honest free graviton.
And four plus three in $S'_4 + S'_3$ is not honest - it's not transverse, it is not a correct kind
of graviton. You see, the only way you can get a polarization 4 + 3 going in the 4 - 3 direction is
to have what I call longitudinal response; it's not a transverse wave. Such a wave could only
be generated by an artificial source here of some silly kind; it is not a free wave. When
there's an artificial source for one graviton, even though the other is a pure gradient, the sum
of all the diagrams does not give zero. If the beam is not exactly that of a free wave, perfectly
transverse and everything, the argument that the gradient has to be zero must fail, for the
reason outlined previously.
Although this gradient for $S_4 - S_3$ is what I want and I hoped it was going to be zero
I forgot that the other end of it - $S'_4 + S'_3$ is a funny wave which is not a gradient, and
which is not a free wave - and therefore you do not get zero and should not get zero, and
something is fundamentally wrong.
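The polarization bookkeeping in this argument can be checked term by term. The following is a sketch with one index suppressed, as in the talk; the metric signature $(+,-,-,-)$ with index 4 as time and the momentum along the 3-direction are conventions assumed here for the illustration.

```latex
% Requires amsmath. Assumed conventions: index 4 is time, metric (+,-,-,-),
% momentum along the 3-direction so that q_4 = q_3 and q_1 = q_2 = 0 on the light cone.
\begin{align}
  S_\mu S'_\mu &= S_4 S'_4 - S_3 S'_3 - S_1 S'_1 - S_2 S'_2, \\
  S_4 S'_4 - S_3 S'_3 &= \tfrac{1}{2} \bigl[ (S_4 - S_3)(S'_4 + S'_3)
                        + (S_4 + S_3)(S'_4 - S'_3) \bigr], \\
  q_\mu S_\mu &= q_4 S_4 - q_3 S_3 = q_4\,(S_4 - S_3)
                 \quad \text{when } q_4 = q_3,\; q_1 = q_2 = 0 .
\end{align}
```

The factor $\tfrac{1}{2}$ in the second line is exact algebra that the spoken version leaves out. The third line shows why $S_4 - S_3$ is the response to a gradient: it is the contraction of the source with the momentum itself. The argument fails because the partner factor $S'_4 + S'_3$ is neither transverse nor a gradient.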
Incidentally I investigated further and discovered another very interesting point.
There is another theory, more well-known to meson physicists, called the Yang-Mills theory,
and I take the one with zero mass; it is a special theory that has never been investigated
in great detail. It is very analogous to gravitation; instead of the coordinate transformation
group being the source of everything, it's the isotopic spin rotation group that's the source
of everything. It is a non-linear theory, that's like the gravitation theory, and so forth. At
the suggestion of Gell-Mann I looked at the theory of Yang-Mills with zero mass, which has
a kind of gauge group and everything the same; and found exactly the same difficulty. And
therefore in meson theory it was not a strictly unknown difficulty, because it should have
been noticed by meson physicists who had been fooling around with the Yang-Mills theory. They
had not noticed it because they're practical, and the Yang-Mills theory with zero mass
obviously does not exist, because a zero mass field would be obvious; it would come out
of nuclei right away. So they didn't take the case of zero mass and investigate it carefully.
But this disease which I discovered here is a disease which exists in other theories. So at
least there is one good thing: gravity isn't alone in this difficulty. This observation that
Yang-Mills was also in trouble was of very great advantage to me; it made everything much
easier in trying to straighten out the troubles of the preceding paragraph, for several reasons.
The main reason is if you have two examples of the same disease, then there are many things
you don't worry about. You see, if there is something different in the two theories it is not
caused by that. For example, for gravity, in front of the second derivatives of $g_{\mu\nu}$ in the
Lagrangian there are other g's, the field itself. I kept worrying something was going to happen
from that. In the Yang-Mills theory this is not so, that's not the cause of the trouble, and so
on. That's one advantage - it limits the number of possibilities. And the second great
advantage was that the Yang-Mills theory is enormously easier to compute with than the
gravity theory, and therefore I continued most of my investigations on the Yang-Mills
theory, with the idea, if I ever cure that one, I'll turn around and cure the other. Because
I can demonstrate one thing; line for line it's a translation like music transcribed to a different
score; everything has its analogue precisely, so it is a very good example to work with.
Incidentally, to give you some idea of the difference in order to calculate this diagram Fig. 4b
the Yang-Mills case took me about a day; to calculate the diagram in the case of gravitation
I tried again and again and was never able to do it; and it was finally put on a computing
machine - I don't mean the arithmetic, I mean the algebra of all the terms coming in, just the
algebra; I did the integrals myself later, but the algebra of the thing was done on a machine
by John Matthews, so I couldn't have done it by hand. In fact, I think it's historically
interesting that it's the first problem in algebra that I know of that was done on a machine
that has not been done by hand.
Well, what then, now you have the difficulty; how do you cure it? Well I tried the
following idea: I assumed the tree theorem to be true, and used it in reverse. If every closed
ring diagram can be expressed as trees, and if trees produce no trouble and can be computed,
then all you have to do is to say that the closed loop diagram is the sum of the corresponding
tree diagrams, that it should be. Finally in each tree diagram for which a graviton line has
been opened, take only real transverse graviton to represent that term. This then serves
as the definition of how to calculate closed-loop diagrams; the old rules, involving a propagator
$1/(k^2 + i\varepsilon)$ etc. being superseded. The advantage of this is, first, that it will be gauge invariant,
second, it will be unitary, because unitarity is a relation between a closed diagram and an
open one, and is one of the class of relations I was talking about, so there's no difficulty.
And third, it's completely unique as to what the answer is; there's no arbitrary fiddling
around with different gauges and so forth, in the inside ring as there was before. So that's
the plan.
Now, the plan requires, however, one more point. It's true that we proved here that
every ring diagram can be broken up into a whole lot of trees; but, a given tree is not
gauge invariant. For instance the tree diagram of Fig. 2A is not. Each one of the four
diagrams of Fig. 2 is not gauge-invariant, nor is any combination of them except the sum of
all four. So the thing is the following. Suppose I take all the processes, all of them that
belong together in a given order; for example, all the diagrams of fourth order, of which
Fig. 4 illustrates three; I break the whole mess into trees, lots of trees. Then I must gather
the trees into baskets again, so that each basket contains the total of all of the diagrams of
some specific process (for example the four diagrams of Fig. 2), you see, not just some
particular tree diagram but the complete set for some process. The business of gathering the
tree diagrams together in bunches representing all diagrams for complete processes is impor-
tant, for only such a complete set is gauge invariant. The question is: Will any odd tree dia-
grams be left out or can they all be gathered into processes? The question is: Can we express
the closed ring diagrams for some process into a sum over various other processes of tree
diagrams for these processes?
Fig. 5
Well, in the case with one ring only, I am sure it can be done, I proved it can be done
and I have done it and it's all fine. And therefore the problem with one ring is fundamentally
solved; because we say, you express it in terms of open parts, you find the processes that
they correspond to, compute each process and add them together.
You might be interested in what the rule is for one ring; it's the sum of several pieces:
first it is the sum of all the processes which you get in the lower order, in which you scatter
one extra particle from the system. For instance, in Fig. 4 we have the rings for two particles
scattering. There is no external graviton but there are two internal ones; now we compute
in the same order a new problem in which there are two particles scattering, but while
that's happening another particle, for example a graviton, scatters forward. Some of the
diagrams for this are illustrated in Fig. 5. State f is the same state as g; so another graviton
comes in and is scattered forward. In other words we do the forward scattering of an extra
graviton. In addition, from breaking matter lines we have terms for the forward scattering
of an extra positron, plus the forward scattering of an extra electron, and so on; one adds
the forward scattering of every possible extra particle together. That is the first contribution.
But when you break up the trees, you also sometimes break two lines, and then you get
diagrams like Fig. 6 with two extra particles scattering (here a graviton and electron) so it
turns out you must now subtract all the diagrams with two extra particles of all kinds
scattering. Then add all diagrams with 3 extra particles scattering and so on. It's a nice rule,
it's quite beautiful; it took me quite a while to find; I have other proofs for other cases
that are easy to understand.
Fig. 6
Now, the next thing that anybody would ask which is a natural, interesting thing to
ask, is this. Is it possible to go back and to find the rule by which you could have integrated
the closed rings directly? In other words, change the rule for integrating the closed rings,
so that when you integrate them in a more natural fashion, with the new method, it will
give the same answer as this unique, absolute, definite thing of the trees. It's not necessary
to do this, because, of course, I've defined everything; but it's of great interest to do this,
because maybe I'll understand what I did wrong before. So I investigated that in detail.
It turns out there are two changes that have to be made - it's a little hard to explain in
terms of the gravitation, of which I'll only tell about one. Well, I'll try to explain the other,
but it might cause some confusion. Because I have to explain in general what I'm doing
when I do a ring. Mostly what it corresponds to is this: first you subtract from the Lagrangian
this
I6Hp;q;o dt.
In that way the equation of motion that results is not singular any more. Let me write
what it really is so that there's no trouble. You say to me what is this, there's a g in it and
an H in it? Yes. In doing a ring, there's a field variation over which you're integrating,
which I call H; and there's a g - which is the representative of all the outside disturbances
which can be summarized as being an effective external field g. And so you add to the
complicated Lagrangian that you get in the ordinary way an extra term, which makes
it no longer singular. That's the first thing; I found it out by trial and error before,
when I made it gauge invariant. But then secondly, you must subtract from the answer,
the result that you get by imagining that in the ring which involves only a graviton
going around, instead you calculate with a different particle going around, an artificial,
dopey particle is coupled to it. It's a vector particle, artificially coupled to the external
field, so designed as to correct the error in this one. The forms are evidently invariant,
as far as your g-space is concerned; these are like tensors in the g world; and therefore
it's clear that my answers are gauge invariant or coordinate transformable, and all that's
necessary. But they are also quantum-mechanically satisfactory in the sense that they are unitary.
Now, the next question is, what happens when there are two or more loops? Since
I only got this completely straightened out a week before I came here, I haven't had time
to investigate the case of 2 or more loops to my own satisfaction. The preliminary
investigations that I have made do not indicate that it's going to be possible to so easily
gather the things into the right barrels. It's surprising, I can't understand it; when you
gather the trees into processes, there seems to be some loose trees, extra trees. I don't
understand them at the moment, and I therefore do not claim that this method of
quantization can be obviously and evidently carried on to the next order. In short,
therefore, we are still not sure of the radiative corrections to the radiative corrections to
the Lamb shift, the uncertainty lies in energies of the order of magnitude of
rydbergs. I can therefore relax from the problem, and say: for all practical purposes
everything is all right. In the meantime, unfortunately, although I could retire from
the field and leave you experts who are used to working in gravitation to worry about
this matter, I can't retire on the claim that the number is so small and that the thing is
now really irrational, if it was not irrational before. Because, unfortunately, I also discov-
ered in the process that the trouble is present in the Yang-Mills theory; and secondly
I have incidentally discovered a tree-ring connection which is of very great interest and
importance in the meson theories and so on. And so I'm stuck to have to continue this
investigation, and of course you all appreciate that this is the secret reason for doing any
work, no matter how absurd and irrational and academic it looks; we all realize that no
matter how small a thing is, if it has physical interest and is thought about carefully enough,
you're bound to think of something that's good for something else.
DISCUSSION
Møller: May I, as a non-expert, ask you a very simple and perhaps foolish question.
Is this theory really Einstein's theory of gravitation in the sense that if you would have
here many gravitons the equations would go over into the usual field equations of Einstein?
Feynman: Absolutely.
Møller: You are quite sure about it?
Feynman: Yes, in fact when I work out the fields and I don't say in what order I'm
working, I have to do it in an abstract manner which includes any number of gravitons;
and then the formulas are definitely related to the general theory's formulas; and the in-
variance is the same; things like this that you see labelled as loops are very typical quantum-
mechanical things; but even here you see a tendency to write things with the right deriva-
tives, gauge invariant and everything. No, there's no question that the thing is the Ein-
steinian theory. The classical limit of this theory that I'm working on now is a non-linear
theory exactly the same as the Einsteinian equations. One thing is to prove it by equations;
the other is to check it by calculations. I have mathematically proven to myself so many
things that aren't true. I'm lousy at proving things - I always make a mistake. I don't
notice when I'm doing a path integral over an infinite number of variables that the Lagrang-
ian does not depend upon one of them, the integral is infinite and I've got a ratio of two
infinities and I could get a different answer. And I don't notice in the morass of things that
something, a little limit or sign, goes wrong. So I always have to check with calculations;
and I'm very poor at calculations - I always get the wrong answer. So it's a lot of work
in these things. But I've done two things. I checked it by the mathematics, that the forms
of the mathematical equations are the same; and then I checked it by doing a consid-
erable number of problems in quantum mechanics, such as the rate of radiation
from a double star held together by quantum-mechanical force, in several orders and
so on, and it gives the same answer in the limit as the corresponding classical problem.
Or the gravitational radiation when two stars - excuse me, two particles - go by each
other, to any order you want (not for stars, then they have to be particles of specified prop-
erties; because obviously the rate of radiation of the gravity depends on the give of the
stars - tides are produced). If you do a real problem with real physical things in it then I'm
sure we have the right method that belongs to the gravity theory. There's no question
about that. It can't take care of the cosmological problem, in which you have matter out
to infinity, or that the space is curved at infinity. It could be done I'm sure, but I haven't
investigated it. I used as a background a flat one way out at infinity.
Møller: But you say you are not sure it is renormalizable.
Feynman: I'm not sure, no.
Møller: In the limit of large number of gravitons this would not matter?
Feynman: Well, no; you see, there is still a classical electrodynamics; and it's not
got to do with the renormalizability of quantum electrodynamics. The infinities come in
different places. It's not a related problem.
Rosen: I'm not sure of this, not being one of the experts; but I have the impression
that because of the non-linearity of the Einstein equations there exists a difficulty of the
following kind. If the linear equations have a solution in the form of an infinite plane mono-
chromatic wave, there does not seem to correspond to that a more exact solution; because
you get piling up of energies in space and the solution then diverges at infinity. Could that.
wave, there's no meaning to the correction. So it must be understood in this way, that
the thing was emitted some time far in the past, and is going to be absorbed some time
in the future; and has not absolutely been going on forever. Then there's a very small
coefficient in front of the logarithm and then for any reasonable $q^2$, like the diameter of
the universe or something, I can still get a sensible answer; this is the shadow of the
phenomenon you're talking about, that the corrections to the propagation of a graviton,
dependent on the logarithm of the momentum squared carried by the graviton and which
would be infinite if it were really a zero momentum graviton exactly. And so a free
graviton just like that does not quite exist. And this is the correction for that. Strictly we
would have to work with wave packets, but they can be of very large extent compared to
the wave length of the gravitons.
Anderson: I'd like to ask if you get the same difficulty in the electromagnetic case
that you did in the Yang-Mills and gravitational cases?
Feynman: No, sir, you do not. Gauge invariance of diagrams such as Fig. 2 (there
is no 2D) is satisfied whether b is a free wave or not. That is because photons are not the
source of photons; they are uncharged.
Anderson: The other thing I would like to suggest is that in putting of things into
baskets, you might be able to get easily by always only starting out with vacuum dia-
grams and opening those successively.
Feynman: I tried that and it didn't go successfully.
Ivanenko: If I understood you correctly, you had used in the initial presentation the
transmutation of two particles into gravitons. Yes?
Feynman: It was one of the examples.
by scattering of photons in a Coulomb field, the scattered power has to be greater than twice
the square root of kT times the photon power divided by the averaging time of the experi-
ment. I believe that the incorrect results that have appeared in the literature have been
due to the statement that $\Delta P$ has to be greater than kT over t; dimensionally these things
are the same, but order of magnitude-wise this kind of experiment for the scatterer of which
I spoke requires something like $10^5$ watts. Maybe I can say something about this
afternoon; I don't want to take any more time.
DeWitt: I should like to ask Prof. Feynman the following questions. First, to give us
a careful statement of the tree theorem; and then outline, if he can to a brief extent, the
nature of the proof of the theorem for the one-loop case, which I understand does work.
And then, to also show in a little bit more detail the structure and nature of the fictitious
particle needed if you want to renormalize everything directly with the loops. And if you
like, do it for the Yang-Mills, if things are prettier that way.
Feynman: I usually don't find that to go into the mathematical details of proofs
in a large company is a very effective way to do anything; so, although that's the question
that you asked me - I'd be glad to do it - I could instead of that give a more physical
explanation of why there is such a theorem; how I thought of the theorem in the first place,
and things of this nature; although I do have a proof - I'm not trying to cover up.
DeWitt: May we have a statement of the theorem first?
Feynman: That I do not have. I only have it for one loop, and for one loop the careful
statement of the theorem is ... - look, let me do it my way. First - let me tell you how I
thought of this crazy thing. I was invited to Brussels to give a talk on electrodynamics -
the 50th anniversary of the 1911 Solvay Conference on radiation. And I said I'd make
believe I'm coming back, and I'm telling an imaginary audience of Einstein, Lorentz and
so on what the answer was. In other words, there are going to be intelligent guys, and I'll
tell them the answer. So I tried to explain quantum electrodynamics in a very elementary
way, and started out to explain the self-energy, like the hydrogen Lamb shift. How can
you explain the hydrogen Lamb shift easily? It turns out you can't at all - they didn't
even know there was an atomic nucleus. But, never mind. I thought of the following. I would
explain to Lorentz that his idea that he mentioned in the conference, that classically the
electromagnetic field could be represented by a lot of oscillators was correct. And that
Planck's idea that the oscillators are quantized was correct, and that Lorentz's suggestion,
which is also in that thing, that Planck should quantize the oscillators that the field is equi-
valent to, was right. And it was really amusing to discover that all that was in 1911. And
that the paper in which Planck concludes that the energy of each oscillator was not $n\hbar\omega$
but $(n+1/2)\hbar\omega$ which was also in that, was also right; and that this produced a difficulty,
because each of the harmonic oscillators of Lorentz in each of the modes had an energy
of $\hbar\omega/2$ which is an infinite amount of energy, because there are an infinite number of modes.
And that that's a serious problem in quantum electrodynamics and the first one we have
to remove. And the method we use to remove it is to simply redefine the energy so that
we start from a different zero, because, of course, absolute energy doesn't mean anything.
(In this gravitational context, absolute energy does mean something, but it's one of the
technical points I can't discuss, which did require a certain skill to get rid of, in making
a gravity theory; but never mind.) Now look - I make a little hole in the box and I let in
a little bit of hydrogen gas from a reservoir; such a small amount of hydrogen gas, that
the density is low enough that the index of refraction in space differs from one by an amount
proportional to $N$, the number of atoms. With the index being somewhat changed, the
frequency of all the normal modes is altered. Each normal mode has the same wavelength
as before, because it must fit into the box; but the frequencies are all altered. And there-
fore the $\hbar\omega$'s should all be shifted a trifle, because of the shift of index, and therefore there's
a slight shift of the energy. Although we subtract $\hbar\omega/2$ for the vacuum, there's a correction
when we put the gas in; and this correction is proportional to the number of atoms, and
can be associated with an energy for each atom. If you say, yes, but you had that energy
already when you had the gas in back in the reservoir, I say, but let us only compare the
difference in energy between the 2S and 2P state. When we change the excitation of the
hydrogen gas from 2S to 2P then it changes its index without removing anything; and the
energy difference that is needed to change the energy from 2S to the 2P for all these atoms
is not only the energy that you calculate with disregard of the zero point energy; but the
fact is that the zero point energy is changed very slightly. And this very slight difference
should be the Lamb effect. So I thought, it's a nice argument; the only question is, is it
true. In the first place it's interesting, because as you well know the index differs from one
by an amount which is proportional to the forward scattering for $\gamma$ rays of momentum k
and therefore that shift in energy is essentially the sum over all momentum states of the
forward scattering for $\gamma$ rays of momentum k. So I looked at the forward scattering and
compared it with the right formula for the Lamb shift, and it was not true, of course; it's
too simple an argument. But then I said, wait, I forgot something. Dirac explained to us
that there are negative energy states for the electron but that the whole sea of negative
energy states is filled. And, of course, if I put the hydrogen atoms in here all those electrons
in negative energy states are also scattering off the hydrogen atoms; and therefore their
states are all shifted; and therefore the energy levels of all those are shifted a tiny bit. And
therefore there's a shift in the energy due to those. And so there must be an additional term
which is the forward scattering of positrons, which is the same as scattering of negative
energy electrons. Actually, for the symmetry of things it is better to take half the case where you
make the positrons the holes and the other half where you make the electrons the holes;
so it should be 1/2 forward scattering by electrons, 1/2 scattering by positrons and scattering
by $\gamma$ rays - the sum of all those forward scattering amplitudes ought to equal the self-
energy of the hydrogen atom. And that's right. And it's simple, and it's very peculiar.
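The counting behind the photon part of this argument can be sketched as follows. This is only an illustrative outline in notation assumed here, not the speaker's own derivation: $N$ atoms in a box of volume $V$, $n_k$ the index of refraction for wave number $k$, and $f_k(0)$ the forward scattering amplitude, related to the index by the standard dilute-gas optical relation.

```latex
% Each mode keeps its wave number k (it must fit the box), so the index rescales the frequency:
\omega_k = \frac{c\,k}{n_k} \approx c\,k\,\bigl[1-(n_k-1)\bigr], \qquad |n_k-1| \ll 1 .

% Resulting shift of the zero-point energy, summed over modes:
\delta E_0 = \sum_k \tfrac{1}{2}\hbar\,\delta\omega_k
           \approx -\sum_k \tfrac{1}{2}\hbar c\,k\,(n_k-1) .

% Dilute gas: the index is fixed by the forward scattering amplitude,
n_k - 1 = \frac{2\pi}{k^2}\,\frac{N}{V}\,f_k(0)
\quad\Longrightarrow\quad
\frac{\delta E_0}{N} \approx -\frac{\pi\hbar c}{V}\sum_k \frac{f_k(0)}{k} .
```

The shift per atom is therefore a sum over modes of forward scattering amplitudes, which is the structure that the electron and positron terms described in the text are meant to complete.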
The reason it's peculiar is that these forward scatterings are real processes. At last I had
discovered a formula I had always wanted, which is a formula for energy differences (which
are defined in terms of virtual fields) in terms of actual measurable quantities, no matter
how difficult the experiment may be - I mean I have to be able to scatter these things. Many
times in studying the energy difference due to electricity (I suppose) between the proton
and the neutron, I had hoped for a theorem which would go something like this - this energy
difference between proton and neutron must be equal to the following sum of a bunch of
cross-sections for a number of processes, but all real physical processes, I don't care how
hard they are to measure. So this is the beginning of such a formula. It's rather surprising.
It's not the same as the usual formula - it's equal to it but it's not the same. I have no
formulation of the laws of quantum gravidynamics; I have a proposal on how to make the
calculations. When I make the proposal on how to do the closed loops, the obvious proposal
does not work; it gives non-unitarity and stuff like that. So the obvious proposal is no good;
it works O.K. for trees; so how am I going to define the answer for what would correspond to
a ring? The one I happen to have chosen is the following: I take the ring: in general, for any
meson theory, one closed ring can be written as equivalent to a whole lot of processes each
one of which is trees. I then define, as my belief as to what the ring ought to be in the grand
theory, that it's going to be also equal to the corresponding physical set of trees. When I said
this is equal to this, I didn't worry about gauge or anything else; what I meant was, if these
weren't gravitons but photons or any other neutral object - it doesn't make any difference
what they are - this theorem is right. So I suppose it's right also for real gravitons, and
I suppose also that what's being scattered is only transverse and is only a real free graviton
with $q^2 = 0$. Therefore, I say let this ring equal this set of trees. Every one of these terms
can be completely computed - it's a tree. And it's gauge invariant; that is, if I added an
extra potential on the whole thing, another outside disturbance of a type which is nothing
but a coordinate transformation - in short a pure gradient wave - to the whole diagram
then it comes on to all of these processes; but it makes no effect on any of them, and therefore
makes no effect on the sum; and therefore I know my definition of this ring is gauge-invariant.
Second, unitarity is a property of the breaking of this diagram; the imaginary part of this
equals something; if you take the imaginary part of this side, it's already broken up, in fact,
and you can prove immediately that it's the correct unitarity rule. Therefore it's going to
be unitary and so on and so on. And so I therefore define gravity with one ring in this
way. Now what prevents me from doing it with two rings? The lack of a complete statement
of what two rings is equal to in terms of processes; that is I can open the ring all right; but
I can't put the pieces - the broken diagrams - back together again into complete sets
such that each one is a complete physical process. In other words some of them correspond to
the scattering of a graviton, but leaving out some diagrams. But the scattering of a graviton
leaving out diagrams is no longer gauge invariant, I mean, not evidently gauge invariant,
and so the power of the whole thing collapses. I don't know what to do with it. So that's
the situation; that's why it is crucial to the particular plan. There's always, of course, another
way out. And that's the following (and that's what I tried to describe at the end of the talk -
maybe I talked too fast): After all, now I've defined what this result is equal to - by definition,
not that you should do a loop some way and get this, but that a loop is equal to this by
definition, and I'm not going to do a loop any other way. But, of course, from a practical
point of view or from the point of view purely of interest, the question is, can you come
back now and calculate the ring directly by some particular mathematical shenanigans,
and get the same answer as you get by adding the trees. And I found the way to do that.
I have another way, in other words, to do the ring integral directly. I have to subtract
something from a vector particle going around the ring instead of a graviton to get the answer
right. So I know the rule, and I know why the rule is, and I have a proof of the rule for
one loop. I have two ways of extending. I can either break this two loop diagram open and
get it back into the processes, like I did with the one ring - where so far I'm stuck. Or,
I can take the rule which I found here and try to guess the generalization for any number
of rings. Also stuck. But I've only had a week, gentlemen; I've only been able to straighten
out the difficulty of a single ring a week ago when I got everything cleaned up. It's more
than a week - I had to take a lot of time checking and checking; but I was only finished
checking to make sure of everything for this conference. And of course you're always
asking me about the thing I haven't had time to make sure about yet, and I'm sorry; I worked
hard to be sure of something, and now you ask me about those things I haven't had time.
I hoped that I would be able to get it. I still have a few irons to try; I'm not completely stuck -
maybe.
DeWitt: Because of the interest of the tricky extra particle that you mentioned at the
end, and its possible connection, perhaps, with some work of Dr Bialynicki-Birula,
have you got far enough on that so that you could repeat it with just a little more detail?
The structure of it and what sort of an equation it satisfies, and what is its propagator?
These are technical points, but they have an interest.
Feynman: Give me ten minutes. And let me show how the analysis of these tree
diagrams, loop diagrams and all this other stuff is done in a mathematical way. Now I will show
you that I too can write equations that nobody can understand. Before I do that I should
like to say that there are a few properties that this result has that are interesting. First of
all in the Yang-Mills case there also exists a theory which violates the original idea of symmetry
of the isotopic spin (from which it was originally invented) by the simple assumption that
the particle has a mass. That means to add to the Lagrangian a term $-\mu^2 a_\mu a_\mu$ where $a_\mu$
is an isotopic vector. You add this to the Lagrangian. This destroys the gauge invariance
of the theory - it's just like electrodynamics with a mass, it's no longer gauge-invariant,
it's just a dirty theory. Knowing that there is no such field with zero mass people say: "let's
put the mass term on". Now when you put a mass term on it is no longer gauge invariant.
But then it is also no longer singular. The Lagrangian is no longer singular for the same
reason that it is not invariant. And therefore everything can be solved precisely. The propa-
gator instead of being $\delta_{\mu\nu}$ between two currents is
where $q_\mu$ is the momentum of the propagating particle. The factor $1/(q^2 - \mu^2)$ is typical for mass $\mu$
but the part $-q_\mu q_\nu/\mu^2$ is an important term which can be taken to be zero in electrodynamics
but it is not obvious whether it can be taken to be zero in the case of Yang-Mills theory.
In fact it has been proved it cannot be taken to be zero; this propagator is used between two
currents. I am using the Yang-Mills example instead of the gravity example. I really want
only the case $\mu^2 = 0$, and am asking whether I can get there by first calculating finite $\mu^2$,
then taking the limit $\mu^2 = 0$.
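The displayed propagator, equation (10), did not survive in this copy. From the two pieces named in the text it would presumably read as sketched below. The overall sign and index placement are assumptions, and $J_\mu$, $J'_\nu$ are hypothetical conserved currents introduced only to illustrate why the extra term is harmless in electrodynamics.

```latex
% Assumed form of Eq. (10): propagator of a vector particle of mass \mu between two currents
D_{\mu\nu}(q) = \frac{\delta_{\mu\nu} - q_\mu q_\nu/\mu^2}{q^2 - \mu^2} .

% With conserved currents, q_\mu J_\mu = q_\nu J'_\nu = 0, the q_\mu q_\nu/\mu^2 piece drops out:
J_\mu\, D_{\mu\nu}(q)\, J'_\nu = \frac{J_\mu J'_\mu}{q^2 - \mu^2} .

% Where no such cancellation is available, the q_\mu q_\nu/\mu^2 piece blows up as \mu^2 \to 0,
% which is the obstruction to taking the massless limit directly in this form.
```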
Now, with $\mu^2 \neq 0$ this is a definite propagator and there are no ambiguities at the
closed rings, the closed loops. I have no freedom, I must compute this propagator. I mean
there is no reason for trouble, and there is no trouble. There is no gauge invariance either.
And of course I checked. I broke the rings and I computed by the broken ring theorem
method a closed loop problem of fair complexity (which in fact was the interaction of two
electrons). I computed it by the open ring method and by the closed ring method, and
of course it agreed, there is no reason that it shouldn't. It turned out that for tree diagrams
you don't have to worry about this $q_\mu q_\nu/\mu^2$ term, you can drop it - but not for the closed
ring - only for tree. Therefore the tree diagrams have the definite limit as $\mu^2$ goes to zero.
And yet I have the closed ring diagram which is equal to the tree diagram when the mass
is anything but zero, and therefore it ought to be true that the limit as $\mu^2$ goes to zero of the
ring is equal to the case when $\mu = 0$. It sounds like a great idea; why don't you define the
desired $\mu^2 = 0$ theory that way? Answer: You can't put $\mu^2$ equal zero in the form (10).
You can't do it because of the $q_\mu q_\nu/\mu^2$. So it was necessary next to see if there is a way to
re-express the ring diagrams, for the case with $\mu^2 \neq 0$, in a new form with a propagator
different from (10), that didn't have a $\mu^2$ in it, in such a form that you can take the limit
as $\mu^2$ goes to zero. Then that would be a new way to do the $\mu$ equal zero case; and that's
the way I found the formula. I'll try to explain how to find that theory.
We start with a definite theory, the Yang-Mills theory with a mass (the reason I do that
is that there's no ambiguity about what I am trying to do) and later on I take the mass to zero,
then the theory works something like this. You have the Lagrangian $\mathcal{L}(A, \varphi)$ which involves
the vector potential $A$ of this field and the fields $\varphi$ representing the matter with which this
object is interacting for zero mass, to which, for finite mass we add the term $\mu^2 A_\mu A_\mu$. This
is the Lagrangian that has to be integrated and the idea is that you integrate this over all
fields $A$ and $\varphi$; and that is the answer for the amplitude of the problem
But wait, what about the initial and final conditions? You have certain particles coming
in and going out. To simplify things (this is not essential) I'll just study the case that corres-
ponds only to gravitons in and out. I'll call them gravitons and mesons even though they
are vector particles. The question is first, what is the right answer if you have gravitons
represented by plane waves, $A_1$, $A_2$, $A_3$ ... going in (positive frequency in $A_i$) or out (nega-
tive frequency). You make the following field up. Let $A_{asym}$ be defined as $\alpha$ times the wave
function $A_1$ that represents the first graviton coming in a plane wave, plus $\beta$ times $A_2$ plus $\gamma$
times $A_3$ and so on.
Then you calculate this integral (11) subject to the condition that $A$ approaches $A_{asym}$
at infinity. The result of this is of course a function of $\alpha$, $\beta$, $\gamma$ ... and so on. Then what you
want for X is just the term first order in $\alpha$, $\beta$, $\gamma$ ... That means just one of each of these gravi-
tons coming in and out. That's the right formula for a regular theory, for meson theory.
You calculate the integral subject to the asymptotic condition, when you imagine all these
waves, but you take the first order perturbation with respect to each one of the incoming
waves. You never let the same photon operate twice; a photon operating twice is not a photon,
it is a classical wave. So you take the derivative of this with respect to $\alpha$, $\beta$, $\gamma$ and so on,
then setting them all equal to zero. That's the problem. (In general there's $\varphi$ asymptotic
too.)
Now the way I happened to do this is the following: Let us call $A_0$ the $A$ which satisfies
the classical equations of motion, which in this particular case will be
I solve this subject to the condition that $A_0$ equals $A_{asym}$. In other words, I find what is the
maximum or minimum - whatever it is - of the action in (11), subject to the asymptotic
condition. That's the beginning of analysing this.
The next thing is to make the simple substitution $A = A_0 + B$ and put it back in equa-
tion (11). Then if you take $\mathcal{L}$ of $A_0 + B$ (if $B$ is negligible you get $\mathcal{L}$ of $A_0$ and so forth)
so you get something like this
The integral is over all B, and B must go to zero asymptotically. This business can be expanded
in powers of B.
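Schematically, the expansion around the classical solution has the structure sketched below. The notation is assumed here rather than taken from the talk: $S[A]$ stands for the action including the mass term, and $\mathrm{Quad}_{A_0}(B)$ for the part quadratic in $B$, whose coefficients depend on $A_0$.

```latex
S[A_0 + B] = S[A_0]
  + \int \left.\frac{\delta S}{\delta A}\right|_{A_0} B \; d\tau
  + \mathrm{Quad}_{A_0}(B) + O(B^3),
\qquad
\left.\frac{\delta S}{\delta A}\right|_{A_0} = 0 .

% The linear term vanishes because A_0 solves the classical equations of motion
% (and B goes to zero asymptotically), so the lowest nontrivial contribution of
% the fluctuation B is the quadratic form.
```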
The quadratic form involves $A_0$, so the answer depends on $A_0$ - it's some complicated
functional of $A_0$. Anyway I won't say that all the time, I'll just remember that. We have
to integrate over all B. And the difficulty is - not difficulty, but the point is - that this
quadratic form in B is singular, because it came from the piece of the action that has an
invariance and this invariance keeps chasing us along. And there are certain transformations
of B which leave this Quad B part unchanged in first order. That transformation in the
Yang-Mills theory is
where the vectors are in isotopic spin space and $\alpha$ is considered as first order. This trans-
formation leaves the quadratic form invariant so the Quad (B) thing by itself is singular.
But it doesn't make any difference, because of the addition of the $\mu^2 BB$. If $\mu^2 \neq 0$, there
is no problem, but if $\mu^2 \to 0$, I'd be in trouble.
I discovered that if I make this change (16) in the actual Lagrangian and carry everything
up to second order it is exact, in fact because it's only second order. If I do it with the
exact change, the thing isn't invariant, it is only invariant to first order in $\alpha$. But if I make
the substitution exactly, then I get a certain addition to the Lagrangian, in other words the
Lagrangian of B' (this includes the $\mu^2$, the Lagrangian plus the $\mu^2$ term in B') is the Lagrangian
plus the $\mu^2$ term in B plus something like this
I have to explain that the semicolon is analogous to the semicolon in gravity. The semicolon
derivative $X_{;\mu}$ means the ordinary derivative of X minus A cross X and that's the analogue
of the Christoffel symbols. Anyway, I find out what happens to $\mathcal{L}$ when I make this trans-
formation. Now comes the idea, the trick, the nonsense: you start with the following thing;
you say, suppose instead of writing the original terms down, instead of writing the original
Lagrangian I were to write the following:
Now I say that the integral over $\alpha$ is some constant or other. So all I have done is to multiply
my original integral by $\mathcal{L}$ of B (by $\mathcal{L}$ of B I mean the whole thing, I mean this whole thing
is going to be $\mathcal{L}$ of B). If I can claim that when I integrate $\alpha$ I get something which is inde-
pendent of B, which is not self-evident. If I integrate over all $\alpha$ it does not look as if it is
independent of B - but after a moment's consideration you see that it is. Because if I can
solve a certain equation, which is $\alpha_{;\mu\mu} - \mu^2\alpha = B_{\mu;\mu}$, I can shift the value of $\alpha$ by that amount,
and then this term would disappear. In other words if I can solve this, and call this solution $\alpha_0$,
and change $\alpha$ to $\alpha'$ (shifted by $\alpha_0$), then the B would cancel and it would only be $\alpha'$ here. I did it a little
abstractly which is a little easier to explain; therefore, this term that I've added can be
thought of as an integral of the following nature: Integral of some B, plus an operator acting
on $\alpha$ (this complicated operator is the second derivative and so on) squared $\mathcal{D}\alpha$. And then
by that substitution I've just mentioned, this becomes equal to 1/2 the operator A
times $\alpha'$ squared $\mathcal{D}\alpha'$, which is equal to the integral of e to the one half of $\alpha'$ times A, the
operator A, times the operator A times $\alpha'$, integrated over primed $\alpha$. Now when you inte-
grate a quadratic form, which is a quadratic with an operator like this, you get one over
the square root of the determinant of the operator. So this thing is one over the square root
of the determinant of the operator AA. The determinant of the operator A times A is the
square of the determinant of A. So this is one over the determinant of the operator A, or
better it is one over the square root of the determinant of the operator A, squared; you'll see
in a minute why I like to write it in this way. In other words, when I've written this thing
down I've written the answer that I want. Let's call X the unknown answer that I want.
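A finite-dimensional analogue of the determinant step just described can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a real, damped Gaussian in place of the oscillatory functional integral, and the 2x2 matrix `A` and the helper names are made up for the example. It checks that the integral of exp(-x^T M x / 2) equals (2π)^(n/2) / sqrt(det M), and that for M = A^T A this is (2π)^(n/2) / |det A|.

```python
import numpy as np


def gaussian_integral_numeric(M, half_width=6.0, step=0.01):
    """Riemann-sum estimate of the 2-D integral of exp(-x^T M x / 2)."""
    grid = np.arange(-half_width, half_width + step, step)
    x, y = np.meshgrid(grid, grid, indexing="ij")
    # Quadratic form x^T M x written out for the two components (x, y).
    quad = M[0, 0] * x**2 + (M[0, 1] + M[1, 0]) * x * y + M[1, 1] * y**2
    return np.exp(-0.5 * quad).sum() * step**2


def gaussian_integral_exact(M):
    """Closed form (2*pi)^(n/2) / sqrt(det M) for symmetric positive-definite M."""
    n = M.shape[0]
    return (2.0 * np.pi) ** (n / 2.0) / np.sqrt(np.linalg.det(M))


# Hypothetical stand-in for "the operator A"; any invertible real matrix works.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
M = A.T @ A  # the quadratic form is built from the operator applied twice

numeric = gaussian_integral_numeric(M)
exact = gaussian_integral_exact(M)
# det(A^T A) = det(A)^2, so 1/sqrt(det M) is simply 1/|det A|.
via_det_A = (2.0 * np.pi) / abs(np.linalg.det(A))

print(numeric, exact, via_det_A)
```

The three printed numbers should agree to within the accuracy of the grid sum, which is the finite-dimensional version of "one over the square root of the determinant of AA is one over the determinant of A".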
Then this is equal to X divided by this determinant's square root, squared. Now comes the
trick - I now make the change from B to B'. We notice that B changed to B' is simply ...
oh!, this is wrong, that's what's wrong, it should be just this. Now I've got it. The change
from B to B' is to add something to B. Therefore to the differential of B it adds nothing,
it's just shifting the B to a new value. So I make the transformation from B to B' everywhere.
So then I have $d\alpha$ and $dB$, and now I have a new thing up here where I make use of the
formula for $\mathcal{L}$ of B':
You see there is a certain cross term generated here and another cross term coming from
expanding this out and the net result, with a little algebra here, is that it becomes $\mathcal{L}$ of B, but
the quadratic term doesn't cancel out and is left; there's one half of $B_{\mu;\mu}$ squared; that's
from this term; the cross term here cancels the cross term in there; and then we have only
the quadratic - I mean the $\alpha$ terms
And the problem is now to do this integral on $\alpha$; well, another miraculous thing happens.
I have the operator A, but this thing down here is $\alpha A \alpha$, and therefore its result is just the determi-
nant once; or the square of this integral is equal to this determinant, or something like
that. Therefore, when you get all the factors right, X, the unknown, is equal to
Sachs: I want to ask a question about long-range hopes. Perhaps for irrational reasons
people are particularly interested in those parts of the theory where there is a possibility of real
qualitative differences: what do the coordinates or topology mean in a quantized theory,
and this kind of junk. Now I wonder if you think that this perturbation theory can eventually
be jazzed up to cover also this kind of questions?
Feynman: The present theory is not a theory as it is incomplete. I do not give a rule
on how to do all problems. I expect of course that if I spend more time on figuring out how
to untangle the pretzels I shall be able to make it into such a theory. So let's suppose I did.
Now you can ask the question: would the completed job, assuming it exists, be of any interest
to esoteric questions about the quantization of gravity. Of course it would be, because it
would be the expression of the quantum theory; there is today no expression of the quantum
theory which is consistent. You say: but it's perturbation theory. But it isn't. I worked
on the thing analyzing it in the series of increasing accuracy, but that's only, obviously, when
I am doing problems and checking, or doing things like I just did. But even there I haven't
said how many times the vector potential $A_0$ is attaching to the diagram; there is no limit to
what order of external lines are involved in the calculation of $A_0$, for example. And so if
I get my general theorem for all orders, I'll have some kind of a formulation. The fact is
that in such things as electrodynamics and other theories, it has not been possible to figure
out the consequences of the quantum field theory in the case of strong interactions, because
of technical difficulties which are not technical difficulties just of the gravitation theory,
but exist all over the quantum field theory. I do not expect that the gravitational problems
will be any easier in that region than they are in any other field theory, so I can say very
little there. But at least one should certainly formulate the theory that you're trying to
calculate first, and then find out what the consequences are, before trying to do it the other
way round. So I think that you'll be frustrated by the difficulties that do appear whenever
any theory diverges. On the other hand, if you ask about the physical significance of the quanti-
zation of geometry, in other words about the philosophy behind it; what happens to the
metric, and all such questions, those I believe will be answerable, yes. I think you would
be able to figure out the physics of it afterwards, but I won't think about that until I have
it completely formulated; I don't want to start to work out the answer to something unless
I know what the equation is I am trying to analyze. But I don't have the doubt that you
will be able to do something, because after all you are describing the phenomena that you
would expect, and if you describe the phenomena then you expect you can then find some
kind of framework in which to talk to help to understand the phenomena.
Contrary to the situation which holds for the canonical theory described in the first paper of this series,
there exists at present no tractable pure operator language on which to base a manifestly covariant quantum
theory of gravity. One must construct the theory by analogy with conventional S-matrix theory, using
the c-number language of Feynman amplitudes when nothing else is available. The present paper undertakes
this construction. It begins at an elementary level with a treatment of the propagation of small disturbances
on a classical background. The classical background plays a fundamental role throughout, both as a technical
instrument for probing the vacuum (i.e., analyzing virtual processes) and as an arbitrary fiducial point for
the quantum fluctuations. The problem of the quantized light cone is discussed in a preliminary way, and
the formal structure of the invariance group is displayed. A condensed notation is adopted which permits
the Yang-Mills field to be studied simultaneously with the gravitational field. Generally covariant Green's
functions are introduced through the imposition of covariant supplementary conditions on small dis-
turbances. The transition from the classical to the quantum theory is made via the Poisson bracket of
Peierls. Commutation relations for the asymptotic fields are obtained and used to define the incoming
and outgoing states. Because of the non-Abelian character of the coordinate transformation group, the
separation of propagated disturbances into physical and nonphysical components requires much greater
care than in electrodynamics. With the aid of a canonical form for the commutator function, two distinct
Feynman propagators relative to an arbitrary background are defined. One of these is manifestly co-
variant, but propagates nonphysical as well as physical quanta; the other propagates physical quanta only,
but lacks manifest covariance. The latter is used to define external-line wave functions and non-radiatively-
corrected amplitudes for scattering, pair production, and pair annihilation by the background field. The
group invariance of these amplitudes is proved. A fully covariant generalization of the complete S matrix
is next proposed, and Feynman's tree theorem on the group invariance of non-radiatively-corrected n-particle
amplitudes is derived. The big problem of radiative corrections is then confronted. The resolution of this
problem is carried out in steps. The single-loop contribution to the vacuum-to-vacuum amplitude is first
computed with the aid of the formal theory of continuous determinants. This contribution is then func-
tionally differentiated to obtain the lowest-order radiative corrections to the n-quantum amplitudes.
These amplitudes split automatically into Feynman baskets, i.e., sums over tree amplitudes (bare scattering
amplitudes) in which all external lines are on the mass shell. This guarantees their group invariance. The
invariance can be made partially manifest by converting from the noncovariant Feynman propagator to
the covariant one, and this leads to the formal appearance of fictitious quanta which compensate the
nonphysical modes carried by the covariant propagator. Although avoidable in principle, these quanta
necessarily appear whenever manifestly covariant expressions are employed, e.g., in renormalization theory.
The fictitious quanta, however, appear only in closed loops and are coupled to real quanta through vertices
which vanish when the invariance group is Abelian. The vertices are nonsymmetric and always occur with
a uniform orientation around any fictitious quantum loop. The problem of splitting radiative corrections
into Feynman baskets becomes more difficult in higher orders, when overlapping loops occur. This problem
is approached with the aid of the Feynman functional integral. It is shown that the measure or volume
element for the functional integration plays a fundamental role in the decomposition into Feynman
baskets and in guaranteeing the invariance of radiative corrections under arbitrary changes in the choice
of basic field variables. The measure has two effects. Firstly, it removes from all closed loops the non-
causal chains of cyclically connected advanced (or retarded) Green's functions, thereby breaking them
open and ensuring that at least one segment of every loop is on the mass shell. Secondly it adds certain non-
local corrections to the operator field equations, which vanish in the classical limit $\hbar \to 0$. The question
arises why these removals and corrections are always neglected in conventional field theory without apparent
harm. It is argued that the usual procedures of renormalization theory automatically take care of them.
In practice the criteria of locality and unitarity are replaced by analyticity statements and Cutkosky rules.
It is virtually certain that the measure may be similarly ignored (set equal to unity) in gravity theory,
and that attention may therefore be confined to primary diagrams, i.e., diagrams which contain Feynman
propagators only, with no noncausal chains removed. A general algorithm is given for obtaining the
primary diagrams of arbitrarily high order, including all fictitious quantum loops, and the group invariance
of the amplitudes thereby defined is proved. Essential to all these derivations is the use of a background
field satisfying the classical free field equations. It is never necessary to employ external sources, and
hence the well-known difficulties arising with sources in a non-Abelian context are avoided.
* This research was supported in part by the Air Force Office of Scientific Research under Grant AFOSR-153-64, and in part by
the National Science Foundation under Grant GP7437.
† Permanent address.
1 B. S. DeWitt, Phys. Rev. 160, 1113 (1967). This paper will be referred to as I.
162 1195
prosaic questions as the scattering, production, absorp- This,however, is not the whole story, for the general
tion, and decay of individual quanta were left un- coordinate transformation group still has, even as a
touched. The main reason for this was that the canonical gauge group, profound physical implications. Some of
theory does not lend itself easily to the study of these these we have already encountered in I, and some we
questions when physical conditions are such that the shall encounter in the present paper. Others will appear
effects of vacuum processes must be taken into account. in the final paper of this series, which is to be devoted
A manifestly covariant formalism is needed instead. to applications of the covariant theory. If it were not
It is the task of the present paper to provide such a for these implications there would be little interest in
formalism. pushing our investigations further, for there is no
We must begin by making clear precisely what is likelihood that such prosaic processes as graviton-
meant by manifest covariance. I n conventional graviton scattering or curvature induced vacuum
S-matrix theory (whether based on a conventional polarization will ever be experimentally observed.4 The
field theory or not) manifest covariance means real reason for studying the quantum theory of gravity
manifest Lorentz covariance. In the context of a is that by uniting quantum theory and general relativity
theory of gravity the question arises whether it should one may discover, at no cost in the way of new axioms
mean more than this, since the classical theory from of physics, some previously unknown consequences of
which one starts has manifest general covariance. general coordinate invariance, which suggest new in-
Here one must be careful. There is an important teresting things that can be done with quantum field
difference between general covariance and ordinary theory as a whole.
Lorentz covariance, and neither one implies the other. Our problem will be to develop a formalism which
Lorentz covariance is the expression of a geometrical symmetry possessed by a system. In gravity theory it has relevance at most to the asymptotic state of the field. As has been emphasized by Fock,² the word "relativity" in the name "general relativity" has connotations of symmetry which are misleading. Far from being more relativistic than special relativity, general relativity is in fact less relativistic. For as soon as space-time acquires bumps (i.e., curvature) it becomes absolute in the sense that one may be able to specify position or velocity with respect to these bumps, provided they are sufficiently pronounced and distinguishable from one another. Only when the bumps coalesce into regions of uniform curvature does space-time regain its relativistic properties. It never becomes more relativistic than flat space-time, which is characterized by the 10-parameter Poincaré group.

The technical method of distinguishing between the Poincaré group and the general coordinate transformation group is to confine the operations of the latter group to a finite (but arbitrary) region of space-time. The asymptotic coordinates are then left undisturbed by general coordinate transformations, and only the operations of the Poincaré group (if that is indeed the asymptotic symmetry group of the problem) are allowed to change them. The general coordinate transformation group thus becomes a gauge group which, although historically an offspring of the Poincaré group and the equivalence principle, plays technically the rather obscure role of providing the analytic means by which the Einstein equations can be obtained from a variational principle and their essential locality displayed.³

... makes manifest the extent to which general covariance permeates the theory. This will be accomplished by introducing, instead of a flat background, an adjustable c-number background metric. Use of such a metric has the following fundamental technical advantages: (1) It facilitates the introduction of particle propagators which are generally covariant rather than merely Lorentz-covariant. (2) It reduces the study of radiative corrections to the study of the vacuum. (3) It makes possible the generally covariant isolation of divergences, which is essential to any renormalization program. (4) It renders theorems analogous to the Ward identity almost trivial. (5) It makes possible, in principle, the extension of the theory of radiative corrections to worlds for which space-time is not asymptotically flat and which may even be closed and finite. These advantages are typical of what we shall mean by the phrase "manifest covariance." Use of the phrase, however, is not to be understood as implying that the simple trick of introducing a variable background metric makes everything obvious. The generally covariant propagators will not be unique but will be choosable in various ways, analogous to the gauge choices in quantum electrodynamics, and we shall have to undertake a separate investigation, just as in quantum electrodynamics, to verify that the choice is irrelevant. This investigation turns out to be much more complicated than in the case of quantum electrodynamics.

Of the five advantages listed above as stemming from the use of a variable background metric only the first two will appear in the present paper. The third
² V. Fock, The Theory of Space-Time and Gravitation (Pergamon Press, New York, 1959).
³ The content of the Einstein equations can be expressed in an intrinsic coordinate-independent form only at the cost of introducing nonlocal structures. (See, for example, Ref. 32.) It can be argued [see S. Weinberg, Phys. Rev. 138, B988 (1965)] that the general coordinate transformation group is simply a consequence of the zero rest mass of the gravitational field and its long-range character.
⁴ Although one might hope for some very indirect cosmological evidence for such processes.
and fourth will be demonstrated in the following paper of this series, while the fifth remains a program for the future. It is not out of place here, however, to speculate briefly on this ultimate program. As long as the conventional S matrix is our chief concern it is appropriate to choose a background metric which is asymptotically flat. We shall see that Lorentz invariance of the S matrix then follows almost trivially from the formalism, in the limit in which the background metric becomes everywhere Minkowskian. Now it is obvious that scattering processes are also possible in an infinite world which is not asymptotically flat. In such a world it should be possible to construct a generalized S matrix in which the conventional plane-wave momentum eigenfunctions are replaced by wave functions appropriate to the altered asymptotic geometry. The asymptotic geometry itself would be fixed by choosing the background metric appropriately.

In a closed world no rigorous S matrix exists. The continuum of scattering states is replaced by a regime of discrete quantization, and, as we have seen in I, the wave function of the universe may even be unique. It may be conjectured that the formalism most appropriate to this case is obtained by choosing the background metric to be not a c number but rather an operator depending on a small number (e.g., one) of quantum variables similar to the operator R representing the radius of the Friedmann universe studied in I. These variables would be quantized by the canonical method, while the full q-number metric would continue to be treated by manifestly covariant methods. (Conditions of constraint would, of course, have to be imposed on the latter metric to take into account the fact that some of its degrees of freedom have been transferred to the background metric.) The resulting simultaneous use of both the canonical and covariant theories might help to reveal the relationship between them.

As has been remarked in I, no rigorous mathematical link has thus far been established between the canonical and covariant theories. In the case of infinite worlds it is believed that the two theories are merely two versions of the same theory, expressed in different languages, but no one knows for sure. The analysis of radiative corrections has turned out to be of such intricacy that the covariant theory has had to be developed completely within its own framework and independently of the canonical theory. Although the structure of the covariant theory is suggested by the formalism of field operators, and hence maintains a few points of contact with conventional field theory, the language of operators is dropped at a certain key stage and c-number criteria are thenceforth exclusively employed to maintain internal consistency. It turns out that the language of operators is a peculiarly unwieldy one in which to discuss questions of consistency when the invariance group of the theory is non-Abelian. The language of graphs and the S matrix is much more direct.

The latter language, embracing as it does many different particle theories at once, is also much less dependent on the detailed Lagrangian structure of the field theory on which it is based. It assumes that virtual processes may be described by an infinite set of basic diagrams, the combinatorial properties of which are the same for all field theories. In working out the details of how this language is to be extended to the non-Abelian case, we have attempted to develop it within as broad a framework as possible. Every theorem in this paper will therefore apply not only to the gravitational field but also to the Yang-Mills field⁵ which, like the gravitational field, possesses a non-Abelian invariance group.⁶

Section 2 begins with the introduction of a notation which is sufficiently general to embrace all boson field theories and at the same time condensed enough to reduce the highly complex analysis of subsequent sections to manageable proportions. A table is included to facilitate comparison of the condensed notation with the detailed forms which the various symbols take in the case of the Yang-Mills and gravitational fields. The notation is particularly useful in dealing with the second functional derivative of the action, which plays the role of the differential operator governing the propagation of infinitesimal disturbances on an arbitrary background field. It is also useful in dealing with the higher functional derivatives, which are the bare vertex functions of the theory. The problem of the quantized light cone is discussed in a preliminary way in Sec. 3, and its relationship to the nonrenormalizability of the theory is noted. Attention is called to the various roles of the background metric, one of which is to define the concepts of past and future. Green's theorem for an arbitrary differential operator is then derived.

Section 4 introduces a notation for the basic structures governing the action of the invariance group on the field variables. The relationship between manifest covariance and linearity of the group transformation laws is emphasized. In Sec. 5 it is pointed out that the infinitesimal disturbances themselves are determined only modulo an Abelian transformation group. This group, which is the tangent group of the full group, affects only the field variables but not physical observables. The latter are necessarily group-invariant. Infinitesimal disturbances satisfying retarded or ad-

⁵ C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
⁶ The term "invariance group," as used in this paper, will always refer to the infinite dimensional gauge group of the theory, and not to the finite dimensional (≤10) asymptotic isometry group, which is undetermined a priori. It is not hard to show that the Yang-Mills field and its gauge group can be given a metrical interpretation which suggests a physical kinship between the Yang-Mills and gravitational fields which is closer than the formal mathematical similarities between them alone indicate. [See B. S. DeWitt, Dynamical Theory of Groups and Fields (Gordon and Breach Science Publishers, Inc., New York, 1965), problem 77, p. 139.]
invariance criterion must therefore be used as a guide rather than as an a posteriori consistency check.

Section 15 pauses briefly to review the question of Lorentz invariance, to point out that the theory should also be invariant under changes in the specific variables with which one works, and to comment upon the utility of using c-number language exclusively. Section 16 then plunges into the main problem. The single-loop contribution to the vacuum-to-vacuum amplitude is computed with the aid of the formal theory of continuous determinants, and various alternative forms for it are given. There is no ambiguity about this contribution, and its group invariance is readily demonstrated. This contribution is functionally differentiated in Sec. 17 to yield the lowest-order contribution to single quantum production by the background field. The latter splits into two parts, one involving the covariant propagator for normal quanta and the other involving the covariant propagator for a set of fictitious quanta which compensate the nonphysical quanta that the first propagator also carries. The fictitious quanta are coupled to real quanta through asymmetric vertices which vanish when the invariance group is Abelian. With the aid of the fundamental lemma of Sec. 10 and a collection of new identities it is shown that the fictitious quanta can be formally avoided by replacing the covariant propagator by the noncovariant one which carries physical quanta only. The covariant propagators, however, are needed for the practical implementation of any renormalization program.

The lowest-order radiative corrections to the n-quantum amplitudes are analyzed in Sec. 18. These amplitudes split automatically into Feynman baskets, i.e., sums over tree amplitudes (lowest-order scattering amplitudes) in which all external lines are on the mass shell. The tree theorem then guarantees their group invariance. This invariance can be made partially manifest by converting from the noncovariant propagator to the covariant one, and the fictitious quanta again make their appearance.

The problem of splitting the radiative corrections into Feynman baskets becomes more difficult in higher orders, when overlapping loops occur. This problem is approached in Sec. 19 with the aid of the Feynman functional integral. When no invariance group is present it is shown that the measure or volume element for the functional integration plays a fundamental role in the decomposition into Feynman baskets and in guaranteeing the invariance of the vacuum-to-vacuum amplitude under arbitrary changes in the choice of basic field variables. The measure has two effects. Firstly, it removes from all closed loops the noncausal chains of cyclically connected advanced (or retarded) Green's functions, thereby breaking them open and insuring that at least one segment of every loop is on the mass shell. Secondly, it adds certain nonlocal corrections to the operator field equations, which vanish in the classical limit $\hbar \to 0$. The question arises why these removals and corrections are always neglected in conventional field theory without apparent harm. It is argued that the usual procedures of renormalization theory automatically take care of them and that in practice the criteria of locality and unitarity are replaced by analyticity statements and Cutkosky rules (see Ref. 52). A detailed investigation of these corrections when a group is present is undertaken in Sec. 20. The two-loop Feynman-basket decomposition of the preceding section is appropriately generalized and the result is reexpressed in terms of covariant propagators, including the fictitious quanta. It turns out that the total two-loop amplitude is obtainable from a set of covariant primary diagrams (containing Feynman propagators only, and hence off-mass-shell contributions in all lines) by a process of removing noncausal chains and adding nonlocal corrections, which is completely analogous to that of the no-group case. Moreover, the primary diagrams, taken together, are group-invariant as they stand, independently of the tree theorem. This suggests that even when a group is present the noncausal chains and nonlocal corrections may be neglected as in conventional field theory. The problem therefore becomes one of finding a general algorithm for obtaining the primary diagrams of arbitrarily high order, including all fictitious quantum loops. The remainder of Sec. 20 is devoted to the construction of such an algorithm. The generator for the algorithm is a Feynman functional integral for the vacuum-to-vacuum amplitude, which includes fields representing the fictitious quanta. The group invariance of this integral is explicitly demonstrated, and the fictitious quanta are shown formally to obey Fermi statistics despite their integral spin. No physical criteria are violated, however, since the fictitious quanta never occur outside of closed loops. Finally, the rules for inserting external lines into the primary vacuum diagrams are given, and the asymmetric vertices contained in the fictitious quantum loops are shown to have a uniform orientation around each loop.

2. NOTATION. INFINITESIMAL DISTURBANCES. BARE VERTEX FUNCTIONS

A quantum field theory begins with the selection of an action functional S. If the theory is local this functional is expressible in the form

$$S = \int \mathcal{L}\,dx, \qquad dx = dx^0\,dx^1\,dx^2\,dx^3, \qquad (2.1)$$

where $\mathcal{L}$, the Lagrangian (density), is a function of the dynamical variables and a finite number of their space-time derivatives at a single point. Various criteria such as covariance, self-consistency of the field equations, the existence of the vacuum as a state of lowest energy, and positive definiteness of the quantum-
mechanical Hilbert space in practice drastically limit the possible choices for $\mathcal{L}$. However, many different choices exist for the Lagrangian of a given field. Thus it is always possible to add a trivial divergence to the Lagrangian without changing the field equations at all. Moreover, the field variables may be replaced by arbitrary functions of themselves; this replaces the field equations by linear combinations of themselves. Finally, even the number of field variables is not unique; for example, alternative Lagrangians may be found leading to field equations which express some of the variables in terms of derivatives of others. What is important is that the choice of Lagrangian is basically irrelevant to the development of the theory of a given field and should be determined only by convenience. The quantum theory of a given field must be constructed in such a way that it is invariant under changes in the mode of description of the field.

It will prove convenient in what follows to adopt a highly condensed notation. The field variables (assumed here to be real) will be denoted by $\varphi^i$, and commas followed by indices from the middle of the Greek alphabet will be used to denote differentiation with respect to the space-time coordinates. The first part of the Greek alphabet will be reserved for group indices, to be introduced presently. Primes will be used to distinguish different points of space-time; they will also appear on associated indices, or on field symbols themselves, when it is desired to avoid cumbersome explicit appearances of the x's. In most cases, however, the primes will be simply omitted. This corresponds to making the indices i, j, etc. do double duty as discrete labels for field components and as continuous labels over the points of space-time. That is, an index such as i will really stand for the quintuple $(i, x^0, x^1, x^2, x^3)$ and the summation convention for repeated indices will be extended to include integrations over the x's. The significance of the indices thus becomes almost purely combinatorial. When this notation is employed it is necessary to remember that expressions such as $M_{ij}$ are really elements of continuous matrices and that the symbol $\delta_{ij}$ involves a 4-dimensional δ function.

For most purposes the form of the field equations is more important than the value of the action functional. Therefore, the domain of integration in (2.1) is unimportant; when otherwise unspecified it is to be understood as being large enough to embrace all points at which it may be desired to perform functional differentiations. Functional differentiation with respect to the field variables will be denoted by a comma followed by one or more Latin indices. Thus the field equations will be expressed in the symbolic form

$$S_{,i} = 0. \qquad (2.2)$$

In this paper no restriction is imposed on the range of Latin indices. Other conventions, to the extent they overlap, are the same as in I.

Suppose the form of the action functional suffers the following change:

$$S \to S + \epsilon A, \qquad (2.3)$$

where $\epsilon$ is an infinitesimal constant. Such a change may be thought of as being brought about by weak coupling to some external agent. The coupling produces an infinitesimal disturbance $\delta\varphi^i$ in the field, which satisfies the linear inhomogeneous equation

$$S_{,ij}\,\delta\varphi^j = -\epsilon A_{,i}. \qquad (2.4)$$

That is, $\varphi^i + \delta\varphi^i$ satisfies the field equations of the system $S + \epsilon A$ if $\varphi^i$ satisfies those of the system S. The undisturbed field $\varphi^i$ may be regarded as a background field upon which the disturbance $\delta\varphi^i$ propagates. The concept of the background field proves to be a useful one in the covariant theory, and will occur repeatedly in what follows.

For local theories the quantity $S_{,ij}$ has the form of a linear combination of δ functions and derivatives of δ functions, with functions of the field variables and their derivatives as coefficients. In Eq. (2.4) $S_{,ij}$ therefore plays the role of a linear differential operator with variable coefficients. The reader will find it useful to consult Table I, which lists the explicit forms which this and various other abstract symbols of the general formalism take in the cases of the Yang-Mills field and the gravitational field, respectively.

In the case of linear theories $S_{,ij}$ corresponds to a linear differential operator with constant coefficients, and the higher functional derivatives $S_{,ijk}$, etc., vanish. In nonlinear theories the higher functional derivatives are known as bare vertex functions. They describe the basic interactions between finite disturbances, the propagation of which, as will be seen later, provides a direct classical model for the quantum S matrix.

It is frequently convenient to introduce a further condensation of notation, namely to make the replacement

$$S_{,i_1 \cdots i_n} \to S_n \qquad (2.5)$$

and to drop the indices altogether. Equations (2.2) and (2.4) are then replaced by

$$S_1 = 0 \qquad (2.6)$$

and

$$S_2\,\delta\varphi = -\epsilon A_1, \qquad (2.7)$$

respectively. If the basic field variables are properly chosen the number of nonvanishing bare vertex functions is finite in the case of both the Yang-Mills and gravitational fields. Thus, for the Yang-Mills field we have $S_n = 0$ for $n > 4$ when the field variables are chosen as in Table I, while for the gravitational field we have $S_n = 0$ for $n > 9$ if the quantities $\varphi^{\mu\nu} \equiv g^{5/18} g^{\mu\nu} - \eta^{\mu\nu}$ are
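162 QUANTUM THEORY OF GRAVITY. II 1201

TABLE I. Expressions for the Yang-Mills and gravitational fields corresponding to quantities appearing in the abstract formalism.

Abstract symbol or equation: $\varphi^i$
  Yang-Mills field: $A^\alpha{}_\mu$
  Gravitational field: $g_{\mu\nu}$

Abstract symbol or equation: $S$
  Yang-Mills field: ... The indices μ, ν are raised and lowered by means of the Minkowski metric $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ and its inverse $\eta^{\mu\nu}$. The indices α, β, γ are raised and lowered by means of the Cartan metric $\gamma_{\alpha\beta} \equiv -c^{-2}\, c^{\gamma}{}_{\alpha\delta}\, c^{\delta}{}_{\beta\gamma}$ and its inverse $\gamma^{\alpha\beta}$. The c's are the structure constants of a compact n-dimensional semi-simple Lie group, and the constant $c^2$ is chosen so that $\det(\gamma_{\alpha\beta}) = 1$.
  Gravitational field: ... $\Gamma_{\mu\nu}{}^{\sigma} \equiv \tfrac{1}{2} g^{\sigma\tau}(g_{\mu\tau,\nu} + g_{\nu\tau,\mu} - g_{\mu\nu,\tau})$. The indices μ, ν, ρ, σ, τ are raised and lowered by means of the metric tensor $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$. In the remaining entries of this table the symbol $^{(4)}R$ is replaced simply by $R$.

Abstract symbol or equation: $0 = S_{,i}$
  Yang-Mills field: $0 = \delta S/\delta A^\alpha{}_\mu \equiv -F_\alpha{}^{\mu\nu}{}_{;\nu}$.
  Gravitational field: ...

Abstract symbol or equation: $\delta\xi^\alpha$
  Yang-Mills field: The infinitesimal group parameters are functions $\delta\xi^\alpha(x)$ which assign to each point x a corresponding infinitesimal transformation of the generating Lie group. Under inner automorphisms they transform according to the adjoint representation of the full group.
  Gravitational field: The infinitesimal group parameters are the functions $\delta\xi^\mu(x)$ appearing in the infinitesimal coordinate transformation $\bar{x}^\mu = x^\mu + \delta\xi^\mu$. Under inner automorphisms they transform as contravariant vectors. Note that group and coordinate indices coincide in the case of the general coordinate transformation group.

Abstract symbol or equation: $R^i{}_\alpha$
  Yang-Mills field: $R^\alpha{}_{\mu\beta'} = -\delta^\alpha{}_{\beta';\mu}$, with $\delta^\alpha{}_{\beta'} \equiv \delta^\alpha{}_\beta\,\delta(x, x')$.
  Gravitational field: $R_{\mu\nu\sigma'} = -\delta_{\mu\sigma';\nu} - \delta_{\nu\sigma';\mu}$, with $\delta_{\mu\sigma'} \equiv g_{\mu\sigma}\,\delta(x, x')$.

Abstract symbol or equation: $\delta\varphi^i = R^i{}_\alpha\,\delta\xi^\alpha$
  Yang-Mills field: $\delta A^\alpha{}_\mu = -\delta\xi^\alpha{}_{;\mu} = -\delta\xi^\alpha{}_{,\mu} - c^\alpha{}_{\gamma\beta} A^\gamma{}_\mu\,\delta\xi^\beta$. Semicolons denote invariant differentiation. A field quantity $\varphi$ which has the group transformation law $\delta\varphi = G_\alpha \varphi\,\delta\xi^\alpha$, where the $G_\alpha$ are the generators of a matrix representation of the generating Lie group, is defined to have the invariant derivative $\varphi_{;\mu} \equiv \varphi_{,\mu} + G_\alpha A^\alpha{}_\mu \varphi$. Invariant differentiation leaves transformation properties intact. It has the commutation law $\varphi_{;\mu\nu} - \varphi_{;\nu\mu} = -G_\alpha F^\alpha{}_{\mu\nu}\varphi$.
  Gravitational field: $\delta g_{\mu\nu} = -\delta\xi_{\mu;\nu} - \delta\xi_{\nu;\mu} = -g_{\mu\nu,\sigma}\,\delta\xi^\sigma - g_{\sigma\nu}\,\delta\xi^\sigma{}_{,\mu} - g_{\mu\sigma}\,\delta\xi^\sigma{}_{,\nu}$. Semicolons denote covariant differentiation. A field quantity $\varphi$ which has the group transformation law $\delta\varphi = -\varphi_{,\mu}\,\delta\xi^\mu + G^\nu{}_\mu \varphi\,\delta\xi^\mu{}_{,\nu} = -\varphi_{;\mu}\,\delta\xi^\mu + G^\nu{}_\mu \varphi\,\delta\xi^\mu{}_{;\nu}$, where the $G^\nu{}_\mu$ are the generators of a matrix representation of the linear group, is defined to have the covariant derivative $\varphi_{;\mu} \equiv \varphi_{,\mu} + G^\nu{}_\sigma \Gamma_{\nu\mu}{}^{\sigma} \varphi$. Covariant differentiation adds one covariant index. It ...

Abstract symbol or equation: $S_{,i} R^i{}_\alpha \equiv 0$
  Yang-Mills field: $F_\alpha{}^{\mu\nu}{}_{;\nu\mu} \equiv 0$. This identity is a consequence of the antisymmetry of $F^\alpha{}_{\mu\nu}$ and of the structure constants $c^\alpha{}_{\beta\gamma}$. $F^\alpha{}_{\mu\nu}$ transforms according to the adjoint representation of the group and also satisfies the cyclic identity $F_{\alpha\mu\nu;\sigma} + F_{\alpha\nu\sigma;\mu} + F_{\alpha\sigma\mu;\nu} = 0$.
  Gravitational field: ...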
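The normalization of the Cartan metric quoted in Table I can be checked in the simplest non-Abelian case. The following is a minimal sketch, assuming the group is SU(2) with structure constants $c^\alpha{}_{\beta\gamma} = \epsilon_{\alpha\beta\gamma}$ and assuming the reading $\gamma_{\alpha\beta} = -c^{-2} c^\gamma{}_{\alpha\delta} c^\delta{}_{\beta\gamma}$ of the table entry; all function names are illustrative and do not come from the paper. Under these assumptions the choice $c^2 = 2$ gives a unit metric, and lowering the first index yields completely antisymmetric structure constants.

```python
# Sketch: Cartan metric for su(2), assuming c^a_{bc} = epsilon_{abc}.
# Every name below is hypothetical and chosen only for this illustration.

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

N = 3
# C_UP[a][b][c] holds the structure constant c^a_{bc}.
C_UP = [[[levi_civita(a, b, c) for c in range(N)] for b in range(N)] for a in range(N)]

def cartan_metric(c_up, c2):
    """gamma_{ab} = -(1/c2) * sum over g, d of c^g_{ad} c^d_{bg}."""
    n = len(c_up)
    return [[-sum(c_up[g][a][d] * c_up[d][b][g] for g in range(n) for d in range(n)) / c2
             for b in range(n)] for a in range(n)]

def lower_first_index(gamma, c_up):
    """c_{abc} = gamma_{ad} c^d_{bc}."""
    n = len(c_up)
    return [[[sum(gamma[a][d] * c_up[d][b][c] for d in range(n)) for c in range(n)]
             for b in range(n)] for a in range(n)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_totally_antisymmetric(t):
    """True if t changes sign under exchange of any adjacent pair of indices."""
    n = len(t)
    return all(t[a][b][c] == -t[b][a][c] and t[a][b][c] == -t[a][c][b]
               for a in range(n) for b in range(n) for c in range(n))

GAMMA = cartan_metric(C_UP, c2=2.0)      # c^2 = 2 is the value that normalizes det to 1
C_LOW = lower_first_index(GAMMA, C_UP)

if __name__ == "__main__":
    print(GAMMA, det3(GAMMA), is_totally_antisymmetric(C_LOW))
```

For larger groups the same functions apply once `C_UP` is replaced by the appropriate structure constants; the value of `c2` that gives unit determinant then has to be recomputed.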
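The linear disturbance equation (2.4) of Sec. 2 is stated there without its intermediate step. As a brief sketch in the condensed notation of that section (nothing beyond a first-order Taylor expansion is assumed), one writes the field equations of the modified system $S + \epsilon A$ at the disturbed field and keeps terms of first order in small quantities:

```latex
\begin{align*}
0 &= (S + \epsilon A)_{,i}[\varphi + \delta\varphi] \\
  &= S_{,i}[\varphi] + S_{,ij}[\varphi]\,\delta\varphi^{j}
     + \epsilon A_{,i}[\varphi] + O(\epsilon^{2}) \\
  &= S_{,ij}[\varphi]\,\delta\varphi^{j} + \epsilon A_{,i}[\varphi] + O(\epsilon^{2}),
\end{align*}
```

where the last line uses the undisturbed field equations $S_{,i}[\varphi] = 0$ and the fact that $\delta\varphi$ is itself of order $\epsilon$. Dropping the second-order terms gives $S_{,ij}\,\delta\varphi^{j} = -\epsilon A_{,i}$, which in the index-free notation reads $S_2\,\delta\varphi = -\epsilon A_1$.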
The basic momentum-space propagators and vertices (including those for the fictitious quanta) are given for both the Yang-Mills and gravitational fields. These propagators are used to obtain the cross sections for gravitational scattering of two scalar particles, scattering of gravitons by scalar particles, graviton-graviton scattering, two-graviton annihilation of scalar-particle pairs, and graviton bremsstrahlung. Special features of these cross sections are noted. Problems arising in renormalization theory and the role of the Planck length are discussed. The gravitational Ward identity is derived, and the structure of the radiatively corrected 1-graviton vertex for a scalar particle is displayed. The Ward identity is only one of an infinity of identities relating the many-graviton vertex functions of the theory. The need for such identities may be eliminated in principle by computing radiative corrections directly in coordinate space, using the theory of manifestly covariant Green's functions. As an example of such a calculation, the contribution of conformal metric fluctuations to the vacuum-to-vacuum amplitude is summed to all orders. The physical significance of the renormalization terms is discussed. Finally, Weinberg's treatment of the infrared problem is examined. It is not difficult to show that the fictitious quanta contribute negligibly to infrared amplitudes, and hence that Weinberg's use of the DeDonder gauge is justified. His proof that the infrared problem in gravidynamics can be handled just as in electrodynamics is thereby made rigorous.

... that of II, which should be consulted for the definition of unfamiliar symbols, e.g., $S_n$ for the n-pronged bare vertex and $V_{(\alpha i)\beta}$ for the asymmetric vertex coupling real and fictitious quanta.

... were explored in I. In this third and final paper of the series we examine some of the consequences of the covariant theory.

1240 BRYCE S. DEWITT 162
Armed with the formalism constructed in II one can in principle carry out the calculation of any microprocess to any order of perturbation theory in a manner which is completely invariant and unambiguous except for the arbitrary high-energy cutoff which must be introduced to render divergent integrals finite. A few of these calculations have actually been performed, and the only thing which prevents more of them from being done is the extreme tediousness of the algebra involved and the lack of any experimental motivation for them. It is a pity that Nature displays such indifference to so intriguing and beautiful a subject, for the calculations themselves are of considerable intrinsic interest. The present paper contains several examples. They are by no means exhaustive but have been selected as useful landmarks in a still largely unexplored territory. Not all of these were originally carried out by the author, but it is hoped that their unified presentation here will make their results more accessible than hitherto.

Section 2 begins with the rules of calculation in momentum space. The basic structural elements of the theory, namely the propagators for real and fictitious quanta, the vertices $S_3$, $S_4$, $V_{(\alpha i)\beta}$, and the coupling with matter fields, are given for both the Yang-Mills and gravitational fields. The standard Feynman rules are summarized. The results of a few lowest-order scattering calculations based on these rules are given and discussed in Sec. 3. Included are the cross section for gravitational scattering of two scalar particles, the cross section for scattering of gravitons by scalar particles, the corresponding annihilation cross section, and the graviton-graviton cross section. Section 4 is devoted to the problem of gravitational bremsstrahlung. The role of the energy quadrupole moment tensor and the absence of the forward peak at high energies, characteristic of photon bremsstrahlung, are noted.

Section 5 discusses some of the problems which arise in renormalization theory. Although the Yang-Mills theory looks as if it may be renormalizable (provided its infrared difficulties can be disposed of), quantum gravidynamics is definitely not renormalizable in the usual sense. Tentative proposals for dealing with this situation are briefly described, as is also the evidence that gravity contains its own cutoff at the Planck length. Illustration of the actual details of the renormalization program, by explicit calculation of a radiative correction, is postponed to Sec. 7.

The gravitational Ward identity and its implications for gravitational form factors are derived in Sec. 6. The general structure of the radiatively corrected 1-graviton vertex is displayed in the case of a scalar particle. It is emphasized that the gravitational Ward identity is only one of an infinity of identities relating the many-graviton vertex functions of the theory. The need for Ward identities can be eliminated by computing radiative corrections directly in coordinate space rather than in momentum space. An example of such a calculation is given in Sec. 7, where the contribution of conformal metric fluctuations to the vacuum-to-vacuum amplitude is summed to all orders. The calculation, which is manifestly covariant throughout, makes use of an integral representation for the amplitude. A résumé is given of that part of the mathematical theory of covariant Green's functions which is needed.

Section 8 concludes the paper with a review of Weinberg's treatment of the infrared problem (see Ref. 37). If Yang-Mills quanta are assumed to be massless then, since they can act as their own sources, they give rise to the special infrared divergences which plague massless electrodynamics. Weinberg showed that gravity miraculously escapes these difficulties; its infrared divergences can be handled by the standard methods familiar in ordinary quantum electrodynamics. His proofs, however, were incomplete, since he did not have available a fully elaborated quantum theory. In particular he used the DeDonder gauge without taking into account the fictitious quanta. It is not difficult to show that the fictitious quanta contribute negligibly in the infrared limit. Weinberg's results are therefore rigorous.

2. RULES OF CALCULATION IN MOMENTUM SPACE

We begin with the vertex functions for the Yang-Mills field interacting with itself. We have seen in II that when the standard field variables are used only $S_3$ and $S_4$ are nonvanishing for this case. In momentum space these become (apart from a δ function expressing conservation of momentum)

The correspondence of momenta with indices is $p\,\alpha\mu$, $p'\,\beta\nu$, $p''\,\gamma\sigma$, $p'''\,\delta\tau$. All momenta are incoming (to the vertex), and momentum conservation implies $p + p' + p'' = 0$ for $S_3$ and $p + p' + p'' + p''' = 0$ for $S_4$. Indices on the structure constants are raised and lowered by means of the Cartan metric $\gamma_{\alpha\beta}$. When all indices are in the lower position the structure constants are completely antisymmetric.

In addition to the above vertices, the fictitious vertex $V_{(\alpha i)\beta}$ is needed for the calculation of radiative
corrections. For the Yang-Mills field it takes the form

[Eq. (2.3)]

The propagators for the normal and fictitious quanta are, respectively,

$$G \to \gamma^{\alpha\beta}\eta_{\mu\nu}/p^2, \qquad (2.4)$$

$$\hat{G} \to \gamma^{\alpha\beta}/p^2, \qquad (2.5)$$

with $p^2$ being understood to have the usual small negative imaginary part.

The corresponding quantities for the gravitational field are much more complicated. In this case we shall employ the momentum-index combinations $p\,\mu\nu$, $p'\,\sigma\tau$, $p''\,\rho\lambda$, $p'''\,\iota\kappa$. The vertices must not only be symmetric in each index pair but must also remain unchanged under arbitrary permutations of the momentum-index triplets. At least 171 separate terms are required in the complete expression for $S_3$ in order to exhibit this full symmetry, and for $S_4$ the number is 2850. However, these numbers can be greatly reduced by counting only the combinatorially distinct terms² and leaving it understood that the appropriate symmetrizations are to be carried out. In this way $S_3$ is reduced to 11 terms and $S_4$ to 28 terms, as follows:

[Eqs. (2.6) and (2.7)]

The "Sym" standing in front of these expressions indicates that a symmetrization is to be performed on each index pair μν, στ, etc. The symbol P indicates that a summation is to be carried out over all distinct permutations of the momentum-index triplets, and the subscript gives the number of permutations required in each case.

Expressions (2.6) and (2.7) can be obtained in a straightforward manner by repeated functional differentiation of the Einstein action. This procedure, however, is exceedingly laborious. A more efficient (but still lengthy) method is to make use of the hierarchy of identities (II, 17.31). It is a remarkable fact that once $S_2{}^0$ is known all the higher vertex functions, and hence the complete action functional itself, are determined by the general coordinate invariance of the theory. It is convenient, in the actual computation of the vertices via (II, 17.31), to invent diagrammatic schemes for displaying the combinatorics of indices. Since each reader will devise the scheme which suits him best we shall not shackle him by describing one here. We also make no attempt to display $S_5$ or any higher vertices.

The vertex $V_{(\alpha i)\beta}$ has the following form for the gravitational field:

[Eq. (2.8)]

where the momentum-index combinations are $p\,\mu$, $p'\,\nu$, $p''\,\sigma\tau$, and the symmetrization is to be performed on the index pair στ. The propagators for the normal and fictitious quanta are given by

$$G \to (\eta_{\mu\sigma}\eta_{\nu\tau} + \eta_{\mu\tau}\eta_{\nu\sigma} - \eta_{\mu\nu}\eta_{\sigma\tau})/p^2, \qquad (2.9)$$

$$\hat{G} \to \eta^{\mu\nu}/p^2. \qquad (2.10)$$

² The choice of terms is not completely unique since momentum conservation may be used to replace a given term by other terms. We give here what we believe (but have not proved) to be the expressions containing the smallest number of terms.
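The index structure of the gravitational propagator (2.9) can be checked mechanically. The sketch below assumes the numerator $\eta_{\mu\sigma}\eta_{\nu\tau} + \eta_{\mu\tau}\eta_{\nu\sigma} - \eta_{\mu\nu}\eta_{\sigma\tau}$ with the Minkowski metric diag(−1, 1, 1, 1) of Table I of II; the helper names and the sample tensor are illustrative only. It verifies that the numerator is symmetric in each index pair and under exchange of the two pairs, and that contracting it with a symmetric $h^{\sigma\tau}$ gives $2h_{\mu\nu} - \eta_{\mu\nu}\,\eta_{\sigma\tau}h^{\sigma\tau}$.

```python
# Sketch: symmetries of the numerator of the graviton propagator (2.9).
# Names are hypothetical; signature (-1, 1, 1, 1) is taken from Table I of II.

DIM = 4
ETA = [[(-1 if m == 0 else 1) if m == n else 0 for n in range(DIM)] for m in range(DIM)]

def numerator(m, n, s, t):
    """eta_{ms} eta_{nt} + eta_{mt} eta_{ns} - eta_{mn} eta_{st}."""
    return ETA[m][s] * ETA[n][t] + ETA[m][t] * ETA[n][s] - ETA[m][n] * ETA[s][t]

def has_pair_symmetries():
    """Symmetric in (m, n), in (s, t), and under exchange of the two pairs."""
    idx = range(DIM)
    return all(numerator(m, n, s, t) == numerator(n, m, s, t)
               == numerator(m, n, t, s) == numerator(s, t, m, n)
               for m in idx for n in idx for s in idx for t in idx)

def apply_numerator(h_up):
    """Contract the numerator with an upper-index tensor h^{st}."""
    idx = range(DIM)
    return [[sum(numerator(m, n, s, t) * h_up[s][t] for s in idx for t in idx)
             for n in idx] for m in idx]

def expected_action(h_up):
    """2 h_{mn} - eta_{mn} (eta_{st} h^{st}), valid for symmetric h."""
    idx = range(DIM)
    trace = sum(ETA[s][t] * h_up[s][t] for s in idx for t in idx)
    h_low = [[sum(ETA[m][s] * ETA[n][t] * h_up[s][t] for s in idx for t in idx)
              for n in idx] for m in idx]
    return [[2 * h_low[m][n] - ETA[m][n] * trace for n in idx] for m in idx]

# An arbitrary symmetric sample tensor with integer entries (exact arithmetic).
H_UP = [[s + t + 1 for t in range(DIM)] for s in range(DIM)]

if __name__ == "__main__":
    print(has_pair_symmetries(), apply_numerator(H_UP) == expected_action(H_UP))
```

Integer entries are used so that the comparison is exact; the same check applies to any symmetric tensor.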
If one wishes to calculate processes involving the interaction of the Yang-Mills and/or gravitational field with matter, additional vertices describing this interaction must be included. As prototypes of such vertices, we shall display those which arise from interactions with scalar (or pseudoscalar) particles. The latter particles contribute to the total action functional an expression of the form

(2.11)

where the covariant derivative is defined in Table I of II and where

~ - 7 ,rG,T= -Ga-, @= (2.12)

$\gamma$ being the matrix which connects the two forms of a self-contragredient representation (of the Yang-Mills Lie group) generated by the matrices $G_\alpha$ and $-G_\alpha^{\sim}$, respectively. We find

+ (PpPr+PrPp)q+ (Pp?
- (tP+fp)q- @~+~~)q. (2.16)

The corresponding vertices which describe the interaction of the gravitational and/or Yang-Mills fields with particles having spin are obtained by straightforward computation from the pertinent action functional. The latter is obtained in each case via the principle of minimal coupling (which, in the case of gravity, is nothing but the strong equivalence principle) from the corresponding action functional in the absence of gravitational and Yang-Mills fields, by replacing ordinary derivatives by covariant derivatives, the Minkowski metric $\eta_{\mu\nu}$ by $g_{\mu\nu}$ and the volume element $dx$ by $g^{1/2}dx$. We do not give here the results of such calculations for particles with spin but merely point out (what is more useful for the reader) that the three-pronged vertices, when sandwiched between normalized wave functions, always reduce, in the limit of zero momentum transfer, with particle momenta on the mass shell, to precisely the forms (2.13) and (2.15) regardless of the magnitude of the particle spin. This may be proved in each instance as a straightforward consequence of the gauge invariance of the theory and, when extended to the radiatively corrected vertices, constitutes a boundary condition on the Yang-Mills and gravitational form factors.3 [See also Sec. 6.]

It is to be emphasized that the inclusion of additional fields in no way affects the formal theoretical structure developed in II. The topology and invariance properties of diagrams remain completely unchanged. One simply permits the field indices $i$, $j$, etc., to extend over a greater range of values in order to accommodate the components of the new fields which have been added. The only differences are differences of detail such as, for example, the sign modifications due to statistics which appear when some of the added components are those of fermion fields, or changes in the structure of the invariance group which arise from having both the Yang-Mills and gravitational fields simultaneously present and interacting with each other.

The rules for combining vertices and propagators into transition amplitudes are completely standard. With the notational conventions of the present paper they may be summarized as follows: (1) An expression such as (2.1), (2.2), (2.3), (2.6), etc., for each vertex; (2) an expression such as (2.4), (2.5), (2.9), (2.10), etc., for each propagator; (3) a factor $(-i)/(2\pi)^4$ for each independent closed loop; (4) an additional factor $(-1)$ for each closed fermion or fictitious-quantum loop, or when necessary to assure antisymmetry of fermion amplitudes; (5) an over-all factor $i(2\pi)^4$ times a $\delta$ function assuring total energy-momentum conservation; (6) a wave function $u^i{}_A$ (see Table II of II) or its complex conjugate evaluated at $x=0$ for each external line; (7) integration over all the independent momenta.

Gauge invariance may be invoked as a useful consistency check in all calculations. However, it must be applied to the entire amplitude for a given process and not merely to a single diagram. It is therefore algebraically more laborious than corresponding checks in electrodynamics. It is no longer possible to exploit charge conservation by following individual lines

3 These are analogs of the electromagnetic form factors. The gravitational form factors are also sometimes referred to as stress-energy, mass, or mechanical form factors.

4 These are, in fact, the most important differences. It is worth mentioning that when fermion fields are included it is usually convenient to replace the metric field $g_{\mu\nu}$ by a vierbein field. Otherwise the group transformation laws are no longer linear. [See B. S. DeWitt and C. M. DeWitt, Phys. Rev. 87, 116 (1952).] We also mention that the combined vierbein-general-coordinate-transformation group has the structure of a semi-direct product based on the automorphisms of the vierbein group under general coordinate transformations. In the combined group only the vierbein group is an invariant subgroup. The coordinate transformation group is its factor group. Similar statements apply to the combined Yang-Mills-general-coordinate-transformation group. The analysis of these cases is therefore correspondingly complicated.
through diagrams, for now the conserved quantity which permit the amplitude to be recast in the form
Yang-Mills charge, energy-momentumleaks all over
every diagram. Moreover, when Yang-Mills quanta or
i{r,(i)rw,(2)-4r0i(i)7'o,(2)-4rM(i)r(2)
gravitons interact with themselves, the closed loops
form traffic jams of spurious charge which can be un-
snarled only by calling the fictitious quanta to the
rescue. -(-exchange and virtual annihilation terms. (3.9)
3. SCATTERING CROSS SECTIONS The first term yields an instantaneous "Newtonian"
interaction, while the second gives rise to a "delayed"
We now display some of the lowest-order amplitudes
interaction propagated by transverse gravitons. In this
and scattering cross sections which the covariant theory
case the factors which couple separately to the two
yields. One of the simplest is the amplitude for the
states of linear polarization are TuTtt and 27i2,
scattering of two identical scalar particles by exchange
of a single Yang-Mills quantum. This has the form respectively.
From (3.6) it is straightforward to compute the
differential cross section for gravitational scattering
+exchange and virtual annihilation terms, (3.1) of identical scalar particles in the center-of-mass frame.
One finds5
where
q=pi'-pi=Pi-pi, (3.2) d<r G2E2r(l+3z>2)(l-zi2)+4zi2(l+i>2) cos2(0/2)
(3.12)
-f-exchange and virtual annihilation terms , (3.5) fi/E
where a factor i(2-n-)~2S(pi'+p2pip^) has been re- In a similar manner one may compute the cross
moved, and the 3-axis has been chosen in the direction section for scattering of gravitons by scalar particles.
of the spatial part q of the space-like 4-vector q. The The relevant diagrams are shown in Pig. 1, the heavy
first term of (3.5) represents the instantaneous "Cou- lines denoting particles and the light lines gravitons.
lomb" interaction of the particles; the second repre- Diagrams (a) and (b) vanish in the rest frame of the
sents a "delayed" interaction propagated by transverse target particle, and one finds for unpolarized gravitons5
quanta, the factors jai and _;'a2 being separately coupled
da- G'm2
to the two states of linear polarization of these quanta.
The corresponding amplitude arising from exchange [l+2e sin2|0]2 sin4J0
of a graviton is
sin20
], (3-13)
X 0)" V+'T<r-i7''VT)2rl,T(2)/s!
/rf<r\
+exchange and virtual annihilation terms, (3.6) (3.14)
\<to/ NK
where
T,t= i (&)-1*[pl.pf,+p,p'lt- !, (p p'+m'fl. (3.7) (3.15)
Again we have conservation laws
6
C. F. Cooke, Ph.D. thesis, University of North Carolina, 1964
(3.8) (unpublished).
finds that it is impossible to produce two gravitons having opposite helicities by annihilation of Yang-Mills quanta, or conversely to produce a Yang-Mills pair having opposite helicities by the reverse process. The quanta in both the initial and final states must have identical helicities if the amplitude is to be nonvanishing. Helicity selection rules exist even for the process in which two Yang-Mills quanta coalesce to produce a single Yang-Mills quantum and a graviton. If both initial quanta have the same helicity the final quanta must have this helicity too; if the initial helicities are opposite the final helicities must be opposite. The same obviously holds for the reverse process.

4. GRAVITATIONAL BREMSSTRAHLUNG

Since the problem of gravitational radiation from accelerating masses has bedeviled classical relativists for years it is a pleasant surprise to discover that its treatment within the quantum framework is quite simple.8 Consider a scattering diagram in which one of the lines represents a scalar particle (real or virtual) of momentum $p$. Let the diagram be modified by the emission of a graviton of momentum $q$ from this line. If the momenta of all lines subsequent to the inserted graviton vertex are held fixed while those prior to the vertex are adjusted in such a way as to conserve momentum and keep external lines on the mass shell, then the only additional effect of the graviton emission is to introduce into the corresponding amplitude a factor

(2nY2 44
P P (PY+4Y)+PY(PP+PP) - PPYcmz+P. (P+ 411
X (4.1)
9
( p + qI2+ m2- i0

which follows from Eq. (2.15) and Table II of II. Alternatively, if the momenta prior to the vertex are held fixed we get a factor which differs from (4.1) by the replacement $q \to -q$.

If the graviton is emitted from an external line these factors reduce to

1 PPfJ+tdP,@+PYQ,-- l l r d Q)
~-
e*P*e**
1 (4.2)
(27r)32 dff @+ 2sP. 4- io

where $\eta = +1$ or $-1$ according as the external line is outgoing or incoming, and $p$ is held fixed on the mass shell. In the long-wavelength limit $q \to 0$ (4.2) itself reduces to

(4.3)

and simply multiplies the original amplitude. This limiting form actually holds for all external lines, regardless of the spin character of their associated particles. It even holds when the external line is a graviton line, provided the emission vertex is inserted not merely into a single diagram but into the sum of all diagrams contributing to the original amplitude. This may be verified in a straightforward manner by plugging in the 3-pronged graviton vertex (2.6) and eliminating the terms involving $q$. Of the remaining terms only those survive which yield a net contribution of the form (4.3); the rest disappear in virtue of the gauge invariance of the total original amplitude.9

The multiplicative factor (4.3) exhibits the well-known infrared divergence and can be obtained from a purely classical model. We note that the infrared divergence shows up only when the emission takes place from lines on the mass shell; it does not occur when the emission is from internal lines of a scattering diagram. The external lines therefore dominate the soft graviton emission. This means that the precise details of the scattering process have little relevance in the limit $q \to 0$, and that the long-wavelength end of the emission spectrum is determined primarily by the asymptotic trajectories of the incoming and outgoing particles, just as in the case of photon bremsstrahlung. For wavelengths large compared to the space-time region in which the collision takes place (the size of this region is determined by the magnitudes of typical energies exchanged in the collision) the effective graviton source is a stress tensor of the form

$T^{\mu\nu}(x) = \sum_n \eta_n m_n V_n^{\mu} V_n^{\nu} \int_0^{\infty} \delta(x - V_n\tau)\,d\tau$, (4.4)

which idealizes the particles to classical points colliding at the coordinate origin. Here $m_n$ and $V_n$ are, respectively, the mass and 4-velocity of the $n$th particle, and the sign factor $\eta_n$ tells whether the particle is incoming or outgoing. The summation is over all the external lines, and the velocities are subject to the energy-momentum conservation law:

$\sum_n \eta_n m_n V_n = 0$. (4.5)

The classical emission spectrum is obtained by projecting (4.4) onto the graviton wave functions $u^{\mu\nu *}(x,q)$ (see Table II of II). The corresponding quantum amplitude is

(4.6)
which, in view of the relation $p_n = m_n V_n$, is just (4.3) summed over all the external lines.

When the collision is nonrelativistic (4.6) reduces to

(4.7)

where the graviton gauge is chosen so that the components $e_{\pm 0}$ of the polarization vectors $e_\pm$ vanish, and $\Delta\Sigma$ is the change in the spatial integral of the total 3-stress dyadic as a result of the collision:

$\Delta\Sigma = \Delta\int \mathbf{T}\,d\mathbf{x} = \sum_n \eta_n \mathbf{p}_n \mathbf{v}_n$, (4.8)

$\mathbf{T} = \sum_n \theta(\eta_n x^0)\,\mathbf{p}_n \mathbf{v}_n\,\delta(\mathbf{x} - \mathbf{v}_n x^0)$, (4.9)

$\mathbf{v}_n = \mathbf{V}_n/V_n^0 = \mathbf{p}_n/E_n \approx \mathbf{p}_n/m_n$. (4.10)

Now it is well known10 that energy-momentum conservation permits the integral of the 3-stress dyadic to be reexpressed as one half the second time derivative of the second moment $\int \mathbf{x}\mathbf{x}\,T^{00}\,d\mathbf{x}$ of the energy density. Moreover, since $e_\pm^{*}\cdot e_\pm^{*} = 0$, the trace of $\Delta\Sigma$ may be removed from (4.7).11 Therefore the emission amplitude may be written in the alternative form:

t(2~QO)-~~e~*.~(d~Q/dt2).e~*, (4.11)

where $\Delta(d^2\mathbf{Q}/dt^2)$ denotes the change in the second time derivative of the energy quadrupole moment tensor

$\mathbf{Q} = \int (\mathbf{x}\mathbf{x} - \tfrac{1}{3}\mathbf{1}\,x^2)\,T^{00}\,d\mathbf{x}$, (4.12)

showing that soft gravitons are emitted predominantly in the quadrupole mode.

It is of interest to examine the angular distribution of the emitted radiation. From (4.6) one sees that each external line makes a contribution to the emission amplitude, which has an angular distribution of the form

(4.13)

where $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{q}$, and $\varphi$ is a helicity phase angle. In the case of photon bremsstrahlung the sine appears linearly instead of quadratically in the numerator, with the consequence that for relativistic collisions ($v \approx 1$) the emission is concentrated sharply in the forward directions of all the particles (initial as well as final). This peaking may be attributed to the individual Lorentz-contracted Coulomb fields, which resemble bundles of plane waves having momenta confined to narrow cones. These bundles (particularly their outer regions) have difficulty readjusting to the altered particle trajectories arising from the collision and hence partly escape as radiation.

In the gravitational case the sharp forward emission is absent.12 In fact for an extremely relativistic collision ($|\mathbf{p}_n| \approx E_n$) which is confined to a plane (e.g., 2-particle scattering) it is easy to verify that the total sum (4.6) yields an amplitude which vanishes for emission in the plane.13 This implies that, unlike photon emission, graviton emission is a cooperative phenomenon which cannot be traced to the individual particle fields. Indeed the real gravitational field of a particle, namely the Riemann tensor, falls off as the inverse cube rather than the inverse square of the distance, and hence its outer regions contribute negligibly to the emission. This has obvious implications for investigations of classical 2-body radiation as well as for attempts to introduce Weizsäcker-Williams approximation schemes into quantum calculations.

5. RENORMALIZATION AND THE PLANCK LENGTH

In lowest-order perturbation theory the formal rules of the manifestly covariant theory yield results which agree with the classical theory in the correspondence principle limit. In higher orders, divergences appear, just as they do for other field theories, and almost nothing is known about how to extract finite and physically meaningful radiative corrections from the results. In the case of quantum gravidynamics the severity of the divergences is such that the theory is not, by standard criteria, renormalizable. This is due to the quadratic momentum dependence of the vertices $S_n$ ($n \ge 3$), which in turn may be traced to the dependence of the light cone on the background field, i.e., to the field dependence of the coefficients of the second time derivatives appearing in $S_2$. Thus by counting momentum powers one finds for the superficial degree of divergence of any diagram

$D = -2L_i + 2\sum_n V_n + 4K$, (5.1)

where $L_i$ denotes the number of internal lines, $V_n$ the number of $n$-pronged vertices, and $K$ the number of independent momentum integrations. Now it is not difficult to show that

$K = L_i - \sum_n V_n + 1$. (5.2)

10 See, for example, L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields, translated by M. Hamermesh (Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1962), rev. 2nd Ed.

11 In view of the nonrelativistic energy conservation law $\sum_n \eta_n (m_n + \tfrac{1}{2}\mathbf{p}_n\cdot\mathbf{v}_n) = 0$, this trace is just twice the rest mass lost in the collision and already vanishes for elastic collisions.

12 This was first pointed out by R. P. Feynman in a mimeographed letter to V. F. Weisskopf dated January 4 to February 11, 1961 (unpublished).

13 Introducing unit vectors fi and a, in the directions of $\mathbf{q}$ and $\mathbf{p}_n$, respectively, one may write the amplitude in this case in the form
constx c qa.-=constX c q,~.(~+fi.a~,
which vanishes by energy-momentum conservation.
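The power counting of Eqs. (5.1) and (5.2) can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the diagram is assumed to be connected so that (5.2) applies with $\sum_n V_n$ read as the total number of vertices. Substituting (5.2) into (5.1) gives $D = 2(K+1)$, so the superficial degree of divergence grows with the number of independent momentum integrations regardless of how the lines and vertices are arranged.

```python
def independent_loops(internal_lines: int, vertices: int) -> int:
    """K from Eq. (5.2): K = L_i - (total number of vertices) + 1."""
    return internal_lines - vertices + 1


def superficial_degree(internal_lines: int, vertices: int) -> int:
    """D from Eq. (5.1): each propagator counts -2, each vertex +2,
    and each independent momentum integration +4."""
    loops = independent_loops(internal_lines, vertices)
    return -2 * internal_lines + 2 * vertices + 4 * loops


if __name__ == "__main__":
    # (internal lines, vertices) for a few illustrative connected diagrams;
    # these sample values are assumptions chosen only for demonstration.
    for lines, verts in [(2, 2), (3, 2), (5, 4), (6, 4)]:
        k = independent_loops(lines, verts)
        d = superficial_degree(lines, verts)
        print(f"L_i={lines} V={verts} K={k} D={d} 2(K+1)={2 * (k + 1)}")
```

Because $D$ depends only on $K$, adding loops always worsens the superficial divergence, which is the sense in which the theory fails the standard renormalizability criterion.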
simple class of diagrams, namely, those which represent two scalar particles exchanging gravitons in the ladder approximation. It turns out that the "leading terms" (i.e., the most divergent) of the Bethe-Salpeter amplitude can be summed exactly, and, owing to certain remarkable cancellations, the sum of the ladder-type contributions to the gravitational self-energy can be expanded in a power series in the bare mass, with no approximations whatever. The method can also be extended to the case of charged scalar particles, with one or more of the graviton ladder rungs replaced by photons, and a simple expression can be obtained for the lowest-order electromagnetic self-energy. The self-energies and renormalization constants found in this way are all finite.

The finiteness of these quantities may be traced to the behavior of the particle-particle scattering amplitude. In the limit of very high momentum transfer the singularity of the gravitational interaction kernel is displaced off the light cone in coordinate space and onto a hyperboloid lying at a distance $\lambda = (4\hbar G/\pi c^3)^{1/2} = 1.82 \times 10^{-33}$ cm in spacelike directions. This is roughly equivalent to endowing the scalar particles with the properties of hard spheres of diameter $\lambda$, and may be regarded as a manifestation of the smearing out of the light cone due to quantum fluctuations.

Similar results have been found for spin-1/2 electrons by Khriplovich,18 and there seems to be no reason why, with enough labor, they may not also be extended to particles of higher spin, including the graviton in interaction with itself. Thus gravity may indeed prove to be the universal regulator which renders all field theories finite.

It should be remarked that the self-energy functions which are obtained by summing ladder graphs appear to correspond to "good" spectral functions, which do a minimum of violence to unitarity. This suggests that no illegal analytic operations have inadvertently crept into the summation procedure. An improved calculational method, which insures analytic legality in general, has been developed by Halpern.17 He sums first the absorptive parts of any amplitude and then obtains the full amplitude by a dispersion integral. The technique is applicable to gravity theory as well as to other nonrenormalizable theories, and is amenable to $N/D$ approximation schemes. It is probably the safest method currently available, but it is very complicated to apply.

Although the finite results which have been obtained thus far are very suggestive, one must remember that they derive from restricted classes of diagrams. They are therefore not $\gamma$-invariant but depend on the particular gauge chosen for the internal graviton lines. So far calculations have been restricted to those gauges which avoid "dangerous" singularities in the resulting integral equations, or otherwise simplify the computational labor. It is clear that the results can give at best only a qualitative insight into the true analytic structure of the theory.

6. THE GRAVITATIONAL WARD IDENTITY

Although the computational difficulties involved in extracting physical information from quantum gravidynamics are formidable, the theory has a redeeming feature in its general covariance, which serves as a cross check on the consistency of various calculations and imposes constraints on the permissible forms of various amplitudes. One of these constraints has recently been discussed by Brout and Englert.19 These authors derive a generalized Ward identity relating the gravitational vertex function of a scalar particle to the self-energy function arising from all its interactions. Their derivation is easily generalized to the case of a particle of arbitrary spin.

Denote the field of the particle by $\varphi^A$. In addition to the functions $R^i{}_\alpha$ (or, in expanded notation, $R_{\mu\nu\sigma'}$) characterizing the coordinate transformation behavior of the gravitational field (see II) we now have corresponding functions $R^A{}_\alpha$ for $\varphi^A$. The explicit structure of these functions may be inferred from Table I of II:

$R^A{}_{\mu'} = -\varphi^A{}_{,\mu}\,\delta(x,x') + G^{\nu}{}_{\mu}{}^{A}{}_{B}\,\varphi^B\,\delta_{,\nu}(x,x')$. (6.1)

We note that $R^A{}_{\mu'}$ vanishes in the limit $\varphi^A \to 0$, and that its functional derivative has the momentum-space form

$R^A{}_{\mu',B''} \to -i\delta^A{}_B\,p''_\mu + iG^{\nu}{}_{\mu}{}^{A}{}_{B}\,p'_\nu$, (6.2)

in which the association of momenta with indices is $pA$, $p'\mu'$, $p''B''$ ($p + p' + p'' = 0$).

Let us denote the full (radiatively corrected) propagator for the particle by $S^{AB}$. It is the sum of the bare propagator $G^{AB}$ and a function obtained by applying the operator $G^{AB}\delta/\delta\varphi^B$ twice to the vacuum-to-vacuum amplitude. Since the vacuum-to-vacuum amplitude is an invariant the propagator $S^{AB}$, like $G^{AB}$, transforms in the manner indicated by the position of its indices.20 Its inverse must transform contragrediently:

$S^{-1}{}_{AB,i}R^i{}_\alpha + S^{-1}{}_{AB,C}R^C{}_\alpha = -S^{-1}{}_{CB}R^C{}_{\alpha,A} - S^{-1}{}_{AC}R^C{}_{\alpha,B}$. (6.3)

Equation (6.3) is the gravitational Ward identity. To get it into more familiar form one must reexpress it in momentum space, with all the background fields set equal to zero. In this limit $S^{-1}{}_{AB,i}$ becomes the negative of the gravitational vertex function, which is conven-

17 I. B. Khriplovich, report, Siberian Section, Academy of Science, USSR, Novosibirsk, 1965 (unpublished).

19 R. Brout and F. Englert, Phys. Rev. 141, 1231 (1966). See also K. Just and K. Rossberg, Nuovo Cimento 40, 1077 (1965).

20 This will be true even if $\varphi^A$ possesses a gauge group of its own, provided the gauge conditions which determine $G^{AB}$ are covariant. Note that the "background field" now includes $\varphi^A$ in addition to the metric field.
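The length quoted in Sec. 5 for the displaced singularity can be checked numerically. The sketch below is an illustration rather than part of the original paper. It assumes the garbled source formula reads $\lambda = (4\hbar G/\pi c^3)^{1/2}$, uses present-day CODATA values in cgs units, and uses a hypothetical function name. The result agrees with the quoted $1.82 \times 10^{-33}$ cm to the stated precision, which supports that reading of the formula.

```python
import math

# Physical constants in cgs units (CODATA values; an assumption of this
# sketch, since the paper does not state which values were used).
HBAR_ERG_S = 1.054571817e-27      # reduced Planck constant, erg s
G_CGS = 6.67430e-8                # Newton's constant, cm^3 g^-1 s^-2
C_CM_PER_S = 2.99792458e10        # speed of light, cm/s


def smearing_length_cm(hbar: float = HBAR_ERG_S,
                       g: float = G_CGS,
                       c: float = C_CM_PER_S) -> float:
    """Return (4*hbar*G / (pi*c^3))^(1/2) in centimetres."""
    return math.sqrt(4.0 * hbar * g / (math.pi * c ** 3))


if __name__ == "__main__":
    print(f"lambda = {smearing_length_cm():.3e} cm")
```

The value differs from the conventional Planck length $(\hbar G/c^3)^{1/2}$ only by the numerical factor $(4/\pi)^{1/2}$.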
tionally denoted by PV,the particle indices being The cancellation of divwgences which is implied by
suppressed and the index i being replaced by the more (6.12) applies only to the leading term of the vertex
explicit p . Making use of (6.2) and the momentum function, in the limit p+ p , and only on the mass
space form of Ri,, which is given in Table I1 of 11, one shell. In order that no divergences occur in the remain-
readily finds ing terms, or off the mass shell, the interactions which
the field Q A experiences with other fields must be of
2r,v(p,P)q.= .s-l (p)p,- s-~(P)P, the renormalizable type (or else summable to finite
- Cs-l(p)GY,-a,-S1(p)14,
(6.4) values). The example of the scalar particle provides an
where fi and p are, respectively, the incoming and adequate illustration of the conditions which must be
outgoing particle momenta and q=p-p is the in- satisfied. In this case we
coming graviton momentum. This, with the spin terms G1(p)=p+m2, (6.13)
involving GY, omitted is the equation given by Brout
and Englert. It holds, as a simple consequence of = ~CPrPV+PYP,- d P .P+ m9)3 (6.144
rPy(p,P)
general covariance, no matter how many other fields
= lPPY(P,p)-%vrrsm2, (6.14b)
are coupled to the field pAand involved in the structure
of the vertex function. where the index 0 refers to the bare mass, and we may
Now introduce the vertex and wave-function renor- write
malization constants Zl and Zp. They are defined by
S-l@)=4+m2+-z($),
(6.15)
u + ( P ) r P v ( P , P ) d P )=zl-lu+(P)Y,(P,P)~(P)
> am2=m2- mOz=Z(-ma) ,
p= -mz, (6.5) r,.(p,p)=rorY(p,p)+n,,(p,p). (6.16)
S--l(P)=zz-6-1 (P)+Z (P)l, The functions Z and A are related by the Ward identity
(6.6)
[az(P)/apIpLd=O, as follows:
where ypy and G are the bare vertex and propagation 2A,,(p,p)q=z(p))p,--z(~)P,. (6.17)
functions, respectively, m is the particle rest mass,
and u(p) is a particle wave function satisfying I t is not hard to show that the general solution of
(6.17) is
~-l(P)u(p)=~l(P)u(P)=O (6.7)
on the mass shell. From (6.6) we may infer
(P)/aplp~-,,p= Z2-[ aG1(p)/ap]pl_m2. (6.8)
On the other hand, (6.4) yields, in the limit p+ P,
2r,(P,?) = P , a ~ ( P ) / a P ~ - 7 r S - W where F is an arbitrary function. Therefore the graviton
-S-l(p)GV,-GF,t-S-l (P), (6.9) vertex of a scalar particle is characterized on the mass
whence, in virtue of (6.7), shell, by a single function of qz. This is the gravita-
tional form factor.
2at(P)r,V(P,P)4P) = P,U+(P)Cas-l (P)/WE($) I (6.10) Now introduce the renormalized self-energy function
p= -mz. 2, defined by
Now, since (6.4) is a consequence simply of general ~ ( y ) = S m z +(Z2--1)($+m2)+Zi-S:(pz),
covariance, it holds also if r,,and S- are replaced by (6.19)
ypuand G I , respectively. Therefore we have Z(-mz)=O, [d2($)/dp2]p*,,~=0.
2u(P)r, (p>f)(fi) = .$rut (P)[ac-l(p)/aylu(fi) 3 (6.11) I n terms of this function Eq. (6.18) takes the form
$=-ma, A,(P,fl)= (22-l- 1)YPY(P,P)-3gm2tllur
From (6.5), (6.8), and (6.11) it follows that
$Z_1 = Z_2$. (6.12)
When both vertex and wave-function radiative correc-
tions are taken into account the two renormalizations
cancel, and there remains only the graviton renormali-
zation Zs arising from vacuum polarization,2 which
has the effect of modifying the gravitation constant. X (qzvlly-qpqv) 1 (6.20)
The polarization of the vacuum by a gravitational field is of ZZEquation (6.14a) is obtained from (2.15) by making the
the quadrupole type. Examples of renormalization terms to which replacement p-+ -p, since p is here an outgoing and not an
it leads are given in Sec. 7. incoming momentum.
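For the scalar particle the bare functions (6.13) and (6.14a) can be checked against the Ward identity directly. The following sketch is an illustration under stated assumptions, not the paper's own calculation: it takes the metric signature $(-,+,+,+)$ (so that $p^2 = -m^2$ on the mass shell), reads the bare vertex as $\gamma_{\mu\nu}(p,p') = \tfrac{1}{2}[p_\mu p'_\nu + p_\nu p'_\mu - \eta_{\mu\nu}(p\cdot p' + m^2)]$ and the bare inverse propagator as $G^{-1}(p) = p^2 + m^2$, sets $q = p - p'$, and omits the spin terms. All function names are hypothetical. Under these assumptions $2\gamma_{\mu\nu}(p,p')q^\nu = G^{-1}(p)p'_\mu - G^{-1}(p')p_\mu$ holds for arbitrary off-shell momenta.

```python
import random

# Diagonal Minkowski metric with signature (-, +, +, +); an assumption
# inferred from the mass-shell condition p^2 = -m^2 used in the text.
METRIC = (-1.0, 1.0, 1.0, 1.0)


def dot(a, b):
    """Lorentz inner product of two contravariant 4-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def lower(a):
    """Lower the index of a contravariant 4-vector."""
    return [g * x for g, x in zip(METRIC, a)]


def inverse_propagator(p, m):
    """Bare scalar inverse propagator, read from Eq. (6.13)."""
    return dot(p, p) + m * m


def bare_vertex(p, pp, m):
    """Bare scalar-graviton vertex with lower indices, read from Eq. (6.14a)."""
    p_lo, pp_lo = lower(p), lower(pp)
    s = dot(p, pp) + m * m
    return [[0.5 * (p_lo[mu] * pp_lo[nu] + p_lo[nu] * pp_lo[mu]
                    - (METRIC[mu] if mu == nu else 0.0) * s)
             for nu in range(4)] for mu in range(4)]


def ward_identity_sides(p, pp, m):
    """Return both sides of 2*gamma_{mu nu} q^nu = G^-1(p) p'_mu - G^-1(p') p_mu."""
    q = [a - b for a, b in zip(p, pp)]          # q = p - p' (assumed convention)
    gamma = bare_vertex(p, pp, m)
    lhs = [2.0 * sum(gamma[mu][nu] * q[nu] for nu in range(4)) for mu in range(4)]
    p_lo, pp_lo = lower(p), lower(pp)
    rhs = [inverse_propagator(p, m) * pp_lo[mu] - inverse_propagator(pp, m) * p_lo[mu]
           for mu in range(4)]
    return lhs, rhs


if __name__ == "__main__":
    rng = random.Random(0)
    p = [rng.uniform(-2, 2) for _ in range(4)]
    pp = [rng.uniform(-2, 2) for _ in range(4)]
    lhs, rhs = ward_identity_sides(p, pp, m=1.3)
    print(max(abs(a - b) for a, b in zip(lhs, rhs)))
```

The momenta in the check are deliberately off the mass shell, since the identity is a consequence of general covariance and does not rely on the field equations.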
which suggests that we also introduce a renormalized form factor $\bar F$, defined by

F(pZ,p,qZ)= t(Zi-l- l)+Zz-lP (p,P,@9 (6.21)

Combining (6.14b), (6.16), and (6.20) we then get

F,~W,P)=ZJ,~(P,P) (6.22a)

-tC2(p3+2 (PIV,.+ +P@2,P2,q2)
X (q%pv-~rqv) 1 (6.22b)

which reduces, on the mass shell, to

$\bar\Gamma_{\mu\nu}(p,p') = \gamma_{\mu\nu}(p,p') + \bar F(-m^2, -m^2, q^2)\,(q^2\eta_{\mu\nu} - q_\mu q_\nu)$, $p^2 = p'^2 = -m^2$. (6.23)

The $Z_2$ factor in (6.22a) takes into account the wave-function renormalization arising from self-energy insertions in the external lines.

If the scalar particle is coupled to other fields through nonrenormalizable interactions then the functions $\bar\Sigma$ and $\bar F$ will diverge in perturbation theory. In particular, they will diverge if virtual gravitons are permitted to contribute to the vertex function. Thus unless an arbitrary cutoff is used, or someone discovers a way to sum gravitational interactions to all orders, the gravitational field must be allowed to act only through the external graviton line. Although the identity (6.12) continues to hold formally in the nonrenormalizable case, it is then of no utility. Because of the divergence which remains in $\bar F$, Eq. (6.23) will yield an infinite cross section for the scattering of the particle in an external gravitational field.

In the renormalizable case $\bar\Sigma$ and $\bar F$ are finite, and expression (6.23) has a well-defined limit as $q \to 0$, namely,

$\bar\Gamma_{\mu\nu}(p,p) = p_\mu p_\nu$, $p^2 = -m^2$. (6.24)

More generally, with particles of arbitrary spin one finds

ut(p)F,&,p)u(fi)= (~T)-~P,PY/~E , 4=- m 2 , (6.25)

when the wave functions $u(p)$ are chosen to correspond to $\delta$-function normalization with respect to 3-momentum. As Brout and Englert point out, the universality of (6.25) implies that the equivalence principle relating gravitational and inertial mass holds in the quantum theory as well as the classical theory. In particular the motion of a nonrelativistic particle in a slowly varying gravitational field is independent of its mass.

If a high-energy cutoff is permitted then the Ward identity may be applied to gravity itself, i.e., to the three-graviton vertex. In this case the wave function renormalization constants $Z_2$ and $Z_3$ coincide, and Eq. (6.12) tells us that $Z_1 = Z_2 = Z_3$. This leaves only the magnitude of the gravitation constant $G$, in terms of arbitrarily chosen (e.g., international mks) mass standards, to be determined by experiment.23

It is clear that the gravitational Ward identity is only one of an infinity of identities, derivable from Eq. (17.31) of II, which relate vertex functions involving $n$ gravitons to those involving $n-1$ gravitons. Such identities become superfluous if calculations are performed in coordinate space rather than in momentum space, for then the general covariance of the theory can be kept constantly manifest. That such calculations are actually feasible will be demonstrated in the next section.

7. RENORMALIZATION IN COORDINATE SPACE. CONFORMAL VACUUM FLUCTUATIONS

The chief tool for studying quantum gravidynamics directly in space-time is the theory of Green's functions in hyperbolic Riemannian manifolds developed by Hadamard.24 The basic structural element of this theory is the geodetic interval, denoted by $\sigma$,25 which is defined as one half the square of the distance along the geodesic between any two space-time points $x$ and $x'$. The geodetic interval is a symmetric function of $x$ and $x'$ which transforms as a biscalar, i.e., as a scalar separately at $x$ and $x'$. It satisfies the differential equation26

$\sigma = \tfrac{1}{2}\sigma_{;\mu}\sigma_{;}{}^{\mu} = \tfrac{1}{2}\sigma_{;\mu'}\sigma_{;}{}^{\mu'}$, (7.1)

and the boundary condition

(7.2)

In a general Riemannian manifold $\sigma$ is not single-valued, except when $x$ and $x'$ are sufficiently close to one another.27 The geodesics emanating from a given point will often, beyond a certain distance, begin to cross over one another. The locus of points at which the onset of overlap occurs forms an envelope of the

23 The necessity of measuring $G$ disappears if absolute units are adopted, with $\hbar = c = 16\pi G = 1$. However, the masses of the elementary particles must then be measured in absolute units, which is operationally the same thing as measuring both $G$ and the masses in mks units.

24 J. Hadamard, Lectures on Cauchy's Problem in Linear Partial Differential Equations (Yale University Press, New Haven, Connecticut, 1923).

25 B. S. DeWitt and R. W. Brehme, Ann. Phys. (N. Y.) 9, 220 (1960). See also J. L. Synge, Relativity: The General Theory (North-Holland Publishing Company, Amsterdam, 1960). Synge calls this function the world function and denotes it by the symbol $\Omega$.

26 The semicolons denote covariant differentiation. For a scalar this is the same as ordinary differentiation. $\sigma_{;}{}^{\mu}$ is a vector of length equal to the distance along the geodesic between $x$ and $x'$, tangent to the geodesic at $x$, and oriented in the direction $x' \to x$. $\sigma_{;}{}^{\mu'}$ is a vector of equal length, tangent to the geodesic at $x'$, and oriented in the opposite direction.

27 In some manifolds (e.g., some compact manifolds) every pair of points may be linked by more than one geodesic. It is always possible, however, to define a single-valued function $\sigma$ in the neighborhood of $x$ by starting at $x$ and following each geodesic emanating from $x$ until it hits a caustic.
family of geodesics, known as a cuzlstic swfuce. The the factors gill4 being inserted to insure the covariance
equation for the caustic surface relative to a given of operator
point is D-l= 0, where Taking matrix elements of (7.11) one obtains
D=-det(--a,,,.).
I) is a bidensily of unit weight at both
satisfies the boundary condition
(7.3)
x and x,which
where
g%(~,~)g/=i
L- (x,sIx,O)dF, (7.12)
1.1, (7.26)
where
%=[-D(t)dt, (7.25) Ell@)being the Hankel function of the second kind of
order 1. This formula has the series expansion
y=0.5772.. * , (7.28)
where the instructlions u+iO and -8-iO indicate Table I of 11). Because of the coordinate invariance
what is evident from (7.24), namely, that G(x,x) is of the theory the functional integration is redundant
the boundary vaue of a function of u and 8 which and ambiguous, and since no one has yet discovered
is analytic in the upper-half u plane and the upper-half an analytically accessible nonredundant subspace for
% plane. The singularity structure in u reflects the the integration, we are forced to accept Eq. (20.12) of
usual behavior of the Feynman propagator on the 11 as the effective definition of the integral. However,
light cone (u= 0). The remaining singularity structure there is an incomplete nonredundant subspace which is
symbolized by the logarithm of -%-iO, on the other easily accessible, namely, the subspace of all conformally
hand, is far from simple owing to the presence of the equivalent geometries. One may simply set
chronological ordering operation.
In the perturbative approach to quantum gravi- 4pv=Xgpv, (~cv++w=Bpv-~pv B w = (l+X)gpv, (7.30)
dynamics we must deal not with the scalar propagator and integrate over X, to obtain the partial contribution
(7.27) but with the vector and tensor propagators to (0, m 10, - a) arising from conformal fluctuations
6 .8 and Gj. However, the latter have structures in the vacuum geometry. The special interest of this
closely similar to (7.27); the only difference is that the integration is that it can be performed exactly, giving
operators a(t)
out of which % is built are slightly the conformal contribution to all orders of perturbation
more complicated, and the 1 standing on the right theory. The only fly in the ointment is that this is
of Eqs. (7.19), (7.22), (7.24), (7.26), and (7.27) is the one contribution for which high-energy damping
replaced by the geodetic parallel displacement functionF6 cannot be expected to produce a finite cutoff, There is
Therefore we can gain a qualitative understanding of no smearing out of the light cone, because conformal
the renormalization program in coordinate space al- metric fluctuations leave the light cone invariant.
ready by studying the scalar propagator. Moreover, It is easy to show that
there is an interesting nonperturbative treatment of g1/z ( 4 ) R = ( l + X ) p (4)R
the vacuum-to-vacuum amplitude in which the scalar
propagator itself directly enters: -3g/(1 +X)-X;,X;r-3glzX,,p, (7.31)
Consider the Feynman functional integral, Eq. and hence
(20.33) of 11,which may be rewritten in the form SC P+ $I-S[PI-S, i C ~ 3
~XP~QXPI = /-,UZ (4)g-g1/2 (OR+g1/2(R#.-+glr.
(SCp+41- SCPI- s,i~ql#)d+, (7.29)
= -3/g1/l (1+X)-lX; ,X;F dx . (7.32)
where S is the Einstein action and prv=grv--9rv (see
The following change of variables then suggests itself:
$X = \phi + \tfrac14\phi^2$, $1 + X = (1 + \tfrac12\phi)^2$. (7.33)
This change not only simplifies expression (7.32) but at the same time guarantees the integrity of the
signature of space-time. We may allow $\phi$ to range from $-\infty$ to $\infty$ without danger of encountering unphysical
geometries and at the trivial cost of counting each distinct geometry twice at every point instead of only once.

Several comments are now in order. First we remark that although the final result is divergent, the degree of
divergence is bounded. The singularity at $x' = x$ is therefore not an essential one as one might have expected
on the basis of Eq. (5.3). As a matter of fact (7.39) is identical in structure with the contributions which the
propagators $G^{ij}$ and $G^{\alpha\beta}$ of the full theory make in lowest perturbation order (i.e., the single closed loops
of Fig. 1).32
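The simplification brought by the change of variables (7.33) can be checked directly. The sketch below is only an illustration: it assumes the reading $X = \phi + \phi^2/4$, $1 + X = (1 + \phi/2)^2$ of (7.33) and a kinetic density proportional to $(1+X)^{-1}X_{;\mu}X^{;\mu}$ as in (7.32), and the function names are hypothetical. By the chain rule $X_{;\mu} = (1 + \phi/2)\phi_{;\mu}$, so the factor $(1+X)^{-1}$ cancels and the density reduces to $\phi_{;\mu}\phi^{;\mu}$.

```python
from fractions import Fraction


def x_of_phi(phi):
    """Assumed form of (7.33): X = phi + phi**2 / 4."""
    return phi + phi * phi / 4


def conformal_kinetic_density(phi, grad_phi_sq):
    """Return (1 + X)**-1 * X_{;mu} X^{;mu} for a given phi and phi_{;mu} phi^{;mu}.

    The chain rule gives X_{;mu} = (1 + phi/2) * phi_{;mu}.
    """
    jacobian = 1 + phi / 2                      # dX/dphi
    grad_x_sq = jacobian ** 2 * grad_phi_sq     # X_{;mu} X^{;mu}
    return grad_x_sq / (1 + x_of_phi(phi))


# Exact rational sample points; phi = -2 is skipped because 1 + X vanishes there.
samples = [Fraction(n, 3) for n in range(-12, 13) if n != -6]
for phi in samples:
    # 1 + X is a perfect square, so the conformal factor never changes sign.
    assert 1 + x_of_phi(phi) == (1 + phi / 2) ** 2
    # The density is that of a free field in phi.
    assert conformal_kinetic_density(phi, Fraction(7, 5)) == Fraction(7, 5)
```

Exact rational arithmetic is used so that the cancellation is verified identically rather than to rounding error.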
Thus we write I t may be conjectured that inclusion of the non-
conformal vacuum fluctuations will eliminate the di-
exp (iaconformal) vergences altogether, and that a rough approximation
to the exact vacuum-to-vacuum amplitude can be ob-
tained simply by making the replacement U ( X , Z ) +
$A-a in (7.39), where A is a high-energy cutoff of the
from which we immediately obtain order of unity in absolute units. The "i0" attached to
each u in (7.39) reflects the presence of unremoved
Wconforma~=acodorrn&[ p ] - t i ~ c o ~ o r r n a ~ [ ~ ] noncausal chains. In passing from to W these
imaginary infinitesimals should be discarded. We ob-
tain, therefore, the estimate
L O
Still cruder estimates o W can be obtained by
=-fi6 tr
L- s-l e ~ p ( i g ' / ~ F g ' /d~ss, ) (7.36)
- lim A-1/2A-1/2
a'-+%
r!J
P = l
6
(4)R. (7.42)
The trace symbol here means "integrate the diagonal This quantity raised to the nth power can be extracted
matrix element over space-time.'' Hence, making use from expression (7.19) or (7.22) for an. Moreover, it is
of (7.12), (7.13), (7.16) and (7.27), we find clear that the operator Q((1)has the dimensions of the
where
-
Wconforrnal=
s ~conforrnaid~+COnStant,(7.38)
curvature scalar and in the limit XI -+ x, is a kind of
nonlocal, or mean curvature averaged over a certain
neighborhood of x . If we represent the purely nonlocal
part schematically by A%, we may write
&conformal
O(t) -
Z'"Z
e"(Q ( 4 ) R + ~ % ) , (7.43)
+[27- In2 - &+In (- 8-i0) **The manifestly covariant occurrence of three distinct types
,-. of divergences: quartic, quadratic, and logarithmic, already in
I I..-,.
lowest order, implies that the conjecture of Brout and Englert
+ln(u+iO)]%* .1 (Ref. 19) that quantum gravidynamics is conventionally re-
(7.39) normalizable is unfounded.
-;-d.)
1
drr (8.2)
mann universe, with encouraging results. He finds that ,I--L l , T2B2+ . . ,
if the sign of the coefficient in front is negative, as (8.3)
would be the case for the contribution from a fermion field, this term succeeds in turning the collapse cycle
around before infinite curvature is reached. It may be objected that in applying the correction to the
Friedmann model one violates the boundary conditions of asymptotic flatness which were assumed to get it
in the first place. However, vacuum polarization is

The nonlocal part of the A* term, which has been omitted from (7.49), also has observable consequences.
T. W. Hill, Ph.D. thesis, University of North Carolina, 1965 (unpublished).
Not, however, until a density of the order of unity in absolute units is reached. At this density all the matter in the visible
universe has been compressed to a region the size of a nucleon.

where $\Gamma_0$ is the rate without graviton emission and $\Lambda$ is a parameter marking the dividing line between
soft and hard virtual gravitons. If $\Lambda$ is chosen to be of the order of the typical energies involved in the
physical process, Eq. (8.1) gives a fair estimate of the rigorous value which would be obtained for $\Gamma(E)$ if the
contributions from ultraviolet virtual gravitons were also included and appropriate renormalizations performed.
The soft gravitons make appreciable contributions only if attached to the external lines of the $\Gamma_0$

Conclusions reached for this case are presumably valid also for infinite worlds having other background geometries.
37 S. Weinberg, Phys. Rev. 140, B516 (1965).
m,n ( l - ~ n m ~ )
gnqmmnmm
vnm
ln-
1+Vnm
l-vnm
(8.4)
form (2.8) which these vertices possess, the uniform orientation guarantees that at least one of the vertices
in each infrared loop is proportional to an infrared momentum.

We conclude this section by repeating Weinberg's calculation of B in the nonrelativistic limit and correcting
a minor mistake in his result. The quantity $v_{nm}^2$ is first expanded in the form

$v_{nm}^2 = (\mathbf{v}_n - \mathbf{v}_m)^2 - v_n^2 v_m^2 + 2(v_n^2 + v_m^2)\,\mathbf{v}_n\cdot\mathbf{v}_m - 3(\mathbf{v}_n\cdot\mathbf{v}_m)^2 + \cdots$, (8.9)

where $\mathbf{v}_n = \mathbf{p}_n/E_n$. This expansion is then inserted into

$\eta_n\eta_m m_n m_m \frac{1 + v_{nm}^2}{v_{nm}(1 - v_{nm}^2)^{1/2}} \ln\frac{1 + v_{nm}}{1 - v_{nm}} = 2\eta_n\eta_m m_n m_m \left(1 + \tfrac{11}{6}v_{nm}^2 + \tfrac{63}{40}v_{nm}^4 + \cdots\right)$ (8.10)

to obtain a lengthy expression for B correct to the fourth order in the velocities. This expression can be
greatly simplified with the aid of the energy-momentum conservation laws

$\sum_n \eta_n m_n \left(1 + \tfrac12 v_n^2 + \tfrac38 v_n^4 + \cdots\right) = 0$,

and one finally obtains the compact formula

$B = (4G/5\pi)\,\mathrm{tr}(\Delta d^2Q/dt^2)^2$, (8.11)

where $\Delta d^2Q/dt^2$ is the dyadic previously defined by Eqs. (4.11) and (4.12), having the explicit traceless form

$\Delta d^2Q/dt^2 = \sum_n \eta_n m_n (\mathbf{v}_n\mathbf{v}_n - \tfrac13 \mathbf{1} v_n^2)$. (8.12)
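The two expansions used in this calculation can be checked numerically. The following sketch is not part of the original calculation: it assumes that $v_{nm}$ is the usual relativistic relative velocity, $v_{nm}^2 = 1 - (1 - v_n^2)(1 - v_m^2)/(1 - \mathbf{v}_n\cdot\mathbf{v}_m)^2$, as in Weinberg's paper, and all function names are illustrative. It compares the exact expression with the fourth-order truncation (8.9) and recomputes the coefficients $11/6$ and $63/40$ of (8.10) from the standard series of the logarithm and the inverse square root.

```python
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def v_rel_sq_exact(vn, vm):
    """Exact relative velocity squared (assumed definition of v_nm**2)."""
    x = dot(vn, vm)
    return 1.0 - (1.0 - dot(vn, vn)) * (1.0 - dot(vm, vm)) / (1.0 - x) ** 2


def v_rel_sq_fourth_order(vn, vm):
    """Right-hand side of (8.9), truncated after the fourth-order terms."""
    a, b, x = dot(vn, vn), dot(vm, vm), dot(vn, vm)
    diff = [p - q for p, q in zip(vn, vm)]
    return dot(diff, diff) - a * b + 2.0 * (a + b) * x - 3.0 * x * x


def log_factor_coefficients():
    """Coefficients of 1, v**2, v**4 in (1 + v**2) ln((1+v)/(1-v)) / (2 v sqrt(1 - v**2))."""
    log_series = [Fraction(1), Fraction(1, 3), Fraction(1, 5)]   # ln((1+v)/(1-v)) / (2v)
    inv_sqrt = [Fraction(1), Fraction(1, 2), Fraction(3, 8)]     # (1 - v**2)**(-1/2)
    numerator = [Fraction(1), Fraction(1), Fraction(0)]          # 1 + v**2

    def mul(p, q):
        # Product of two series in v**2, truncated after the v**4 term
        return [sum(p[i] * q[k - i] for i in range(k + 1)) for k in range(3)]

    return mul(mul(log_series, inv_sqrt), numerator)


vn = (0.01, 0.02, 0.0)
vm = (-0.015, 0.005, 0.01)
error = abs(v_rel_sq_exact(vn, vm) - v_rel_sq_fourth_order(vn, vm))
print("truncation error of (8.9):", error)
print("coefficients in (8.10):", log_factor_coefficients())
```

For velocities of order $10^{-2}$ the truncation error is of sixth order in the velocities, far smaller than the fourth-order terms themselves.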
Feynman and DeWitt showed that the rules must be changed for the calculation of contributions from
diagrams with closed loops in the theory of gauge-invariant fields. They suggested also a specific recipe
for the case of one loop. In this letter we propose a simple method for calculation of the contribution
from arbitrary diagrams. The method of Feynman functional integration is used.
The Feynman rules for the Yang-Mills field, originally derived by Feynman and DeWitt from S-matrix
theory and the tree theorem, are here derived as a consequence of field theory. Our starting point is the
gauge-independent, path-dependent formalism which we proposed earlier. The path-dependent Green's
functions in this theory are expressed in terms of auxiliary, path-independent Green's functions in such a
way that the path-dependence equation is automatically satisfied. The formula relating the path-dependent
to the auxiliary Green's functions is similar to the classical formula relating the path-dependent field
variables to the potentials. By using a notation similar but not identical to Schwinger's functional notation, the
infinite set of equations satisfied by the Green's function can be replaced by a single equation. When the
equation for the auxiliary Green's functions of electromagnetism is solved in a perturbation series, the usual
Feynman rules result. For the Yang-Mills field, however, one obtains extra terms; such terms correspond
precisely to the closed loops of fictitious scalar particles introduced by Feynman, DeWitt, and Faddeev
and Popov.
In the present treatment we shall avoid noncovariant quantities and we shall therefore not introduce
potentials as quantum-mechanical operators. Instead, we shall introduce auxiliary Green's functions. In
formalisms of quantum electrodynamics which employ potentials, whether in the Coulomb or Lorentz gauges,
one can define Green's functions

$G_{\mu\cdots}(x_1,\cdots;y_1,\cdots;z_1,\cdots) = \langle 0|T\{\psi(x_1)\cdots\bar\psi(y_1)\cdots A_\mu(z_1)\cdots\}|0\rangle$.

One can also define path-dependent but gauge-invariant Green's functions

$G_{\mu\nu\cdots}(x_1,P_1,\cdots;y_1,P_1',\cdots;z_1,\cdots) = \langle 0|T\{\Psi(x_1,P_1)\cdots\bar\Psi(y_1,P_1')\cdots F_{\mu\nu}(z_1)\cdots\}|0\rangle$.

The latter Green's functions can be expressed in terms of the former. In our present approach, the
path-independent Green's functions will be introduced, not as vacuum-expectation values of time-ordered
products, but as auxiliary functions in their own right. The physical, path-dependent Green's functions of our
theory will then be expressed in terms of the auxiliary Green's functions by using the same formulas as in
theories with potentials. The connection between the path-dependent and path-independent Green's functions
will guarantee that the path-dependence equations are satisfied, as we shall verify explicitly. We then have
to find the equations which the auxiliary Green's functions must satisfy in order that the path-dependent
Green's functions satisfy the correct equations.

For electrodynamics, such an approach has already been carried out by Sarker. He found that the equations
satisfied by the auxiliary Green's functions are similar, but not identical, to the equations satisfied by
the Green's functions of the Lorentz-gauge theory. The difference is due to the fact that he started with
the Maxwell equations $\partial F_{\mu\nu}(x)/\partial x_\mu + j_\nu = 0$, whereas the Lorentz-gauge theory starts with the equations
$\Box^2 A_\nu(x) + j_\nu = 0$. Nevertheless, he showed that the Green's functions calculated by the usual Feynman
rules do satisfy the correct equations. The Feynman rules were thus derived from a procedure which was
covariant throughout and which did not make use of an enlarged Hilbert space.

When we carry out a similar treatment for the Yang-Mills field, we shall again find that the equations
satisfied by our auxiliary Green's functions are slightly different from the corresponding equations in the
(incorrect) Lorentz-gauge theory. As with electromagnetism, the difference is due to the dropping of a
term $-\partial^2 A_\mu/\partial x_\mu \partial x_\nu$ in the Lorentz gauge. In this case, however, we shall find that the difference is important,
and that the solution to our equations contains terms besides those given by the Lorentz-gauge Feynman
rules.

A. Q. Sarker, Ann. Phys. (N.Y.) 24, 19 (1963).

Our results will be the same as those found by Feynman, DeWitt, and Faddeev and Popov. They showed
that the correct prescription was to take all Feynman diagrams of the Lorentz-gauge theory, together with
Feynman diagrams containing closed loops of fictitious scalar particles. In our treatment we shall find that
integrals corresponding to closed loops of scalar particles appear directly in the solution of the
Green's-function equations. We may associate such integrals with closed loops of scalar particles if we wish, but this
is purely a mnemonic device. The fictitious particles never occur in external lines, nor do they appear in the
intermediate states of the unitarity condition.

In our present formulation of the theory, the Feynman rules are thus rules for calculating auxiliary
Green's functions. We can then proceed to calculate the gauge-invariant, path-dependent Green's functions,
since we shall already have expressed them in terms of the auxiliary Green's functions. By using the
reduction formulas we can then calculate the S matrix. The fundamental reduction formulas of the theory
involve the path-dependent Green's functions. However, one can use these reduction formulas to derive
further reduction formulas involving the auxiliary Green's functions. Thus, from the Feynman rules for
the auxiliary Green's functions, one can derive Feynman rules for the S matrix by the usual reinterpretation
of the external lines.

The equations for the Green's functions are coupled integral equations between an infinite number of such
functions. Moreover, when expressing path-dependent Green's functions in terms of auxiliary Green's functions,
one finds that a single path-dependent Green's function is equal to the sum of an infinite number of
auxiliary Green's functions. It would be clumsy, if in principle possible, to carry out manipulations with such
infinite systems of equations. We require a shorthand for expressing the infinite sets of equations as single
equations. The Schwinger functional notation provides us with such a shorthand; Schwinger's functional
differential equation is equivalent to the complete set of equations for the Green's functions. Unfortunately it
does not appear to be an easy matter to express the equations for path-dependent Green's functions in
Schwinger's notation. We shall therefore use another notation in which our fundamental quantity corresponds
to Schwinger's $\delta/\delta\eta$ rather than to $\eta$. We shall indicate the connection between our notation and Schwinger's
but we shall not assume knowledge of his notation.

In the following section we shall illustrate some of our methods by using the $\lambda\phi^3$ theory. We shall find the
differential equations for the Green's functions and shall use them to construct the perturbation expansion. We
shall then develop our notation for simplifying the writing of the differential equations. Essentially what
we shall do is to form a linear space of all Green's functions and to write the differential equations as equations
for vectors in this space. In Sec. 3 we shall treat the
1582 STANLEY MANDELSTAM 175
electromagnetic field. We shall write down the equations for the path-dependent Green's functions and
shall reexpress them in our shorthand notation. Working within this notation, we shall then express our
path-dependent Green's functions in terms of new, path-independent, auxiliary Green's functions. We shall
determine the equations which the auxiliary Green's functions should satisfy in order that the path-dependent
Green's functions satisfy the required equations. On solving them, we shall find that they lead to the
ordinary Feynman rules. In Sec. 4 we shall treat the Yang-Mills field in a similar way. Here, however, we
shall find that the perturbation expansion contains terms besides those given by naive Feynman rules.

2. DIFFERENTIAL EQUATIONS FOR GREEN'S FUNCTIONS

In this section we shall summarize the method of determining Green's functions by solving differential
equations, and shall also develop our shorthand notation. The method is certainly not new but, as far as we
are aware, there is no easily available reference in which it is described, and we therefore felt it worthwhile to
describe its application to non-gauge fields before passing on to the gauge fields in which we are interested.

We shall treat the simple case of a neutral scalar field with $\lambda\phi^3$ coupling. The field equations will be

$(\Box^2 - \mu^2)\phi - \tfrac12\lambda\phi^2 = 0$, (2.1)

and the $\phi$'s will satisfy the commutation relations

$[\phi(\mathbf{x},t), \phi(\mathbf{y},t)] = [\dot\phi(\mathbf{x},t), \dot\phi(\mathbf{y},t)] = 0$, (2.2a)

(2.2b)

functions
(2.3d)
(2.3e)

One method of obtaining the perturbation series for the Green's functions is to use the differential equations
satisfied by them. This is the method we shall use in the following sections when treating gauge fields. Thus,
$G_2$ will satisfy the equation

$(\Box^2 - \mu^2)G_2(x_1,x_2) = \tfrac12\lambda G_3(x_1,x_1,x_2) + i\delta(x_1 - x_2)$. (2.4a)

FIG. 1. Diagrammatic representation of Eqs. (2.5).

Equation (2.4a) is obtained by applying the differential equation (2.1) to the factor $\phi(x_1)$ of (2.3a).
The first term on the right of (2.4a) arises from the interaction term in (2.1) while the second term is
obtained by applying the differential operator $-\partial^2/\partial x_0^2$ to the time ordering itself. In deriving this term it is of
course necessary to use the commutation relation (2.2b).

The higher Green's functions will satisfy similar equations. Thus $G_3$ will satisfy the equation

$(\Box^2 - \mu^2)G_3(x_1,x_2,x_3) = \tfrac12\lambda G_4(x_1,x_1,x_2,x_3) + i\delta(x_1 - x_2)G_1(x_3) + i\delta(x_1 - x_3)G_1(x_2)$. (2.4b)

Equations (2.4a) and (2.4b) can be integrated to yield the formulas

$G_2(x_1,x_2) = -\tfrac12 i\lambda \int dx'\, \tfrac12\Delta_F(x_1 - x')G_3(x',x',x_2) + \tfrac12\Delta_F(x_1 - x_2)$, (2.5a)

$G_3(x_1,x_2,x_3) = -\tfrac12 i\lambda \int dx'\, \tfrac12\Delta_F(x_1 - x')G_4(x',x',x_2,x_3) + \tfrac12\Delta_F(x_1 - x_2)G_1(x_3) + \tfrac12\Delta_F(x_1 - x_3)G_1(x_2)$. (2.5b)

Equations (2.5) are illustrated diagrammatically in Fig. 1.

If we are working in perturbation theory, the first Green's function on the right of (2.5a) or (2.5b) will
be required to one order lower than that on the left, since it contains an explicit factor $\lambda$. The second term
on the right of (2.5a) is known explicitly, while that on the right of (2.5b) only involves $G_1$. Hence, if we
construct the perturbation series order by order and, within each order, construct the functions $G_1, G_2, \ldots$
successively, the right-hand side of (2.5) will be known in terms of previously calculated functions. We can
therefore construct the entire perturbation series in this manner, and it is not difficult to see that we obtain the
usual prescription for Feynman diagrams.
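The order-by-order construction described above can be illustrated in a toy setting. The sketch below is not the space-time calculation of the text: it assumes a zero-dimensional analog in which the "Green's functions" are the moments $G_n = \langle\phi^n\rangle$ of the formal weight $\exp(-\tfrac12\mu^2\phi^2 - \tfrac16\lambda\phi^3)$, normalized so that $G_0 = 1$. Integrating a total derivative gives the hierarchy $\mu^2 G_{n+1} = nG_{n-1} - \tfrac12\lambda G_{n+2}$, which has the same structure as (2.5): the interaction term couples to a higher function but carries an explicit factor $\lambda$. The function name and its arguments are hypothetical.

```python
from fractions import Fraction


def green_moments(max_n, max_order, mu2=Fraction(1)):
    """Return coef with coef[k][n] = coefficient of lambda**k in G_n.

    Toy hierarchy (zero-dimensional analog, assumed for illustration):
        mu2 * G_{n+1} = n * G_{n-1} - (lambda / 2) * G_{n+2},   G_0 = 1.
    At order k the term with G_{n+2} is needed only at order k - 1, so each
    order is built from previously calculated functions.
    """
    coef = []
    for k in range(max_order + 1):
        # Lower orders must reach further up the hierarchy than higher ones.
        top = max_n + (max_order - k)
        g = [Fraction(0)] * (top + 1)
        g[0] = Fraction(1) if k == 0 else Fraction(0)
        for m in range(1, top + 1):          # construct G_1, G_2, ... successively
            n = m - 1
            free_part = n * g[n - 1] if n >= 1 else Fraction(0)
            interaction = coef[k - 1][n + 2] / 2 if k >= 1 else Fraction(0)
            g[m] = (free_part - interaction) / mu2
        coef.append(g)
    return coef


coef = green_moments(max_n=2, max_order=2)
print("G_2 at order 0:", coef[0][2])   # free "propagator" 1/mu2
print("G_1 at order 1:", coef[1][1])   # tadpole
print("G_2 at order 2:", coef[2][2])   # one-loop correction
```

With $\mu^2 = 1$ the three printed values are 1, -1/2 and 5/4, which agree with a direct expansion of the Gaussian moments. Signs and factors of $i$ differ from the Minkowski-space equations (2.4) and (2.5).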
In a field theory with a simple Lagrangian, such as the
$\lambda\phi^3$ theory, it is sufficient to write down the first few
equations (2.4) and (2.5); the form of the subsequent
equations is then fairly obvious. When writing down
equations for gauge fields and performing manipulations
with them, however, it would be somewhat cumbersome
to proceed in this manner. We require a notation in
which the whole series of equations (2.4) can be simply displayed. In the remainder of the section we shall
develop such a notation. We emphasize that we are doing nothing more than constructing a shorthand for
expressing the equations satisfied by the Green's functions.

We shall work with the linear space of the totality of functions $C_0, C_1(x_1), C_2(x_1,x_2), \ldots$ A typical vector
in the linear space may be written

The linear space is thus the sum of a series of subspaces

$\begin{pmatrix} C_0 \\ 0 \\ 0 \\ \vdots \end{pmatrix}, \quad \begin{pmatrix} 0 \\ C_1(x_1) \\ 0 \\ \vdots \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 0 \\ C_2(x_1,x_2) \\ \vdots \end{pmatrix}, \quad \ldots$

space on which the operators $\hat\phi(x)$ act. The $\hat\phi$'s do not satisfy the quantum-mechanical commutation rules.
In fact, all the $\hat\phi$'s and their space and time derivatives commute with one another. The linear space of vectors
$|\ )$ is thus a totally different space from the quantum-mechanical Hilbert space of vectors $|\ \rangle$.

It would be inconvenient if we had to display equations such as (2.8) whenever we used them, and, in fact,
there is a standard notation for expressing (2.8) in a more compact form. This notation is expressed in terms
of vectors in the dual space. A vector in the dual space is written in the form $(H|$, and is defined by means of its
scalar products $(H|C)$ with all vectors in our space $|C)$. The scalar products must depend linearly on $|C)$.
We define the vectors $(H_0|$, $(H_1(x_1)|$, $(H_2(x_1,x_2)|$, $\ldots$ in the dual space by the equation

$(H_0|C) = C_0$,
$(H_1(x_1)|C) = C_1(x_1)$, (2.9)
$(H_2(x_1,x_2)|C) = C_2(x_1,x_2)$,
$\ldots$
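The definition (2.9) can be made concrete with a small sketch. It is only an illustration of the bookkeeping, under the assumption that a vector $|C)$ is stored as a mapping from $n$ to the function $C_n$; the class and function names are hypothetical. A dual vector $(H_n(x_1,\ldots,x_n)|$ is then the linear functional that picks out the value $C_n(x_1,\ldots,x_n)$.

```python
class GreenVector:
    """A vector |C): the sequence of functions C_0, C_1(x1), C_2(x1, x2), ..."""

    def __init__(self, components):
        # components maps n to a callable of n arguments (C_0 takes none)
        self.components = dict(components)

    def component(self, n):
        # Components that were never specified are treated as zero
        return self.components.get(n, lambda *points: 0)

    def __add__(self, other):
        orders = set(self.components) | set(other.components)
        return GreenVector({
            n: (lambda *p, n=n: self.component(n)(*p) + other.component(n)(*p))
            for n in orders
        })

    def __rmul__(self, scalar):
        return GreenVector({
            n: (lambda *p, f=f: scalar * f(*p))
            for n, f in self.components.items()
        })


def dual_H(*points):
    """Dual vector (H_n(x1, ..., xn)|, defined by (H_n(x1..xn)|C) = C_n(x1..xn)."""
    def pairing(vector):
        return vector.component(len(points))(*points)
    return pairing


C = GreenVector({0: lambda: 2, 1: lambda x: 3 * x, 2: lambda x, y: x * y})
D = GreenVector({1: lambda x: x + 1})

print(dual_H()(C))          # (H_0|C) = C_0
print(dual_H(4)(C))         # (H_1(4)|C) = C_1(4)
print(dual_H(2, 5)(C))      # (H_2(2,5)|C) = C_2(2,5)
# Linearity of the scalar product in |C):
print(dual_H(4)(3 * C + D) == 3 * dual_H(4)(C) + dual_H(4)(D))
```

The last line checks the requirement that the scalar products depend linearly on $|C)$.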
Now, by (2.9),
$(H_1(x)|C) = C_1(x)$,
$(H_2(x,x_1)|C) = C_2(x,x_1)$, (2.12)

We now use (2.14) and (2.15) to express the vectors $(H_2(x_1,x_2)|$, $(H_3(x_1,x_2,x_3)|$, and $(H_0|\delta(x_1 - x_2)$ in terms
of the single vector $(H_1(x)|$:
4. YANG-MILLS FIELD

The massless Yang-Mills field appears to possess all the essential complications of the gravitational field
while lacking some of the algebraic complications. It is therefore instructive to consider this field before going
on to the gravitational field. We shall treat a self-interacting Yang-Mills field, since interaction with
other fields does not introduce any new features.

The path-dependent formalism for the Yang-Mills field has been examined by Bialynicki-Birula. The
procedure followed is analogous to that used for the electromagnetic field, with the difference that the
Yang-Mills field plays the dual role of the gauge field and the charged field. The field equations are simpler in
appearance than the Maxwell equations of electrodynamics, since there is no additional current term.
They take the form

$\partial F^{\alpha}_{\mu\nu}(x,P)/\partial x_\mu = 0$. (4.1a)

It is not difficult to check that Eqs. (4.1)-(4.3) are consistent with one another and with Lorentz
transformations. In fact, the equations of motion and commutation relations may be derived from the Lagrangian

One may define path-dependent Green's functions in the usual way. As in the electromagnetic case, it is
necessary to include $\delta$-function terms if the Green's functions are to be covariant. The definitions are
therefore as follows:

10 B. Zumino, J. Math. Phys. 1, 1 (1960).
Higher Green's functions may be defined m a similar and we need not explain it in detail again. We con-
way. struct the linear space of the totality of all functions
The field equations (4.la) may be rewritten as equa- C?,,...a,..(xl,Pl,. . .). We then construct the dual space
tions for the Green's function, analogous to Eqs. (3.8) and define the vector ( ~ r v . . . a . . . ( x l , P ~ , .1 in the
a )
for the electromagnetic Green's functions. Thus the usual way. We next define the operators Pa,,,(x,P) by
two-point Green's function satisfies the equation the equations
E.qX 1; d.5
. .) I
-ig%&
La dtJ4((x1- UGw'pr ,A . r u ( 2 9 1 , d ~ , d ' 4 )
1
61,~a$.,P.(xi,PI,Xg,P~)=gbyr d ~ r ) i ( X ~ ) G ' B ~ p ~ . p ~ . x ~
X(Xl,P1,X2,P2,X*,PO. (4.6)
to complete the definition of U.
In our present notation, the field equations (4.5)
become
Condensed Notation
The condensed notation which we shall use is where P' represents the portion of P leading to t h e
very similar to that used in the two previous sections, point 2.
j#P(x)==---- directly:
ax, a% V.q(x,P)=Vu6(d,P')v8,7(%',5P) (4.18)
where as usual P' is the portion of P leading to x'.
If we differentiate (4.13c) with respect to the endpoint of the path of integration, we obtain the equation

$\frac{\partial}{\partial x_\mu} V_{\alpha\gamma}(x,P) = g\,\epsilon_{\alpha\beta\delta} A^{\beta}_{\mu}(x) V_{\delta\gamma}(x,P)$. (4.19)

As we shall see below, this equation will enable us to express derivatives of path-dependent functions in terms
of derivatives of auxiliary functions. Equation (4.19)
can be written in integral form,
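A numerical sketch may help to visualize what a solution of (4.19) looks like. It rests on several assumptions that are not taken from the text: the equation is read as $\partial V/\partial x_\mu = g(\epsilon\cdot A_\mu)V$ with $(\epsilon\cdot A_\mu)_{\alpha\delta} = \epsilon_{\alpha\beta\delta}A^\beta_\mu$, the path is replaced by a polygon, the potential is a given classical function, and the contraction $d\xi_\mu A_\mu$ is a plain sum with no metric signs. Under these assumptions $V$ is an ordered product of isospin rotations, one per path segment. All function names are hypothetical.

```python
import math


def cross_matrix(a):
    """Matrix M with M[i][k] = sum_j eps_{ijk} a[j]."""
    ax, ay, az = a
    return [[0.0, -az, ay], [az, 0.0, -ax], [-ay, ax, 0.0]]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_exp(a):
    """exp(cross_matrix(a)) via Rodrigues' formula."""
    theta = math.sqrt(sum(c * c for c in a))
    m = cross_matrix(a)
    m2 = matmul(m, m)
    if theta < 1e-12:
        s, c = 1.0, 0.5                     # small-angle limits of the two coefficients
    else:
        s = math.sin(theta) / theta
        c = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + s * m[i][j] + c * m2[i][j]
             for j in range(3)] for i in range(3)]


def transport_matrix(path, potential, g):
    """Approximate V(x, P) for a polygonal path P given as a list of points.

    potential(x) returns, for each coordinate index mu, the isovector A_mu(x).
    Each segment contributes exp(g * dxi_mu * eps . A_mu), evaluated at its
    midpoint; later segments multiply from the left.
    """
    v = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for start, end in zip(path, path[1:]):
        mid = [(s + e) / 2.0 for s, e in zip(start, end)]
        a_mu = potential(mid)
        step = [g * sum((e - s) * a_mu[mu][b]
                        for mu, (s, e) in enumerate(zip(start, end)))
                for b in range(3)]
        v = matmul(rotation_exp(step), v)
    return v


# Constant potential pointing along the third isospin axis: V is a rotation
# about that axis through the angle g * A * (path length) = 1.0.
constant = lambda x: [(0.0, 0.0, 1.0), (0.0, 0.0, 0.0)]
straight = [(0.2 * k, 0.0) for k in range(11)]
V = transport_matrix(straight, constant, g=0.5)
print(V[0][0], math.cos(1.0))
print(V[1][0], math.sin(1.0))
```

Because each factor is a rotation, the product is an orthogonal matrix, and the matrix for a path is the product of the matrices for its successive portions, which is the composition property expressed by (4.18).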
* * * . (4.13~)
XA$'([')A,,r($)+
If WE. compare (4.13) with (3.24) and (3.34), we ob-
serve that we have to take the curl of A and multiply it
by the function V in order to obtain the path-dependent
Vny(%P)= &fg%b
1: dfflA~'(f)Vm6(r!,P)
1 (4.20)
sense that they both have the same commutation rela- If we compare (4.24) with (3.36) we notice that terms
tions (4.9) with the operators F,,fl(x',P'). corresponding to the right-hand side of (3.36a) and
Once we have defined the operators ANa(& we (3.364 appear on the right of (4.24). This is once more
can construct our enlarged dual space of vectors due to the fact that the Yang-MiUs field plays the dual
(Jl,,..:*..(q,** * ) I . We can then construct the linear role of gauge field and charged field.
space of vectors IG) and can define auxiliary, path- Let us tirst investigat_ehow the operators defined in
independent Green'_sfunctions G. Ths path-dependent (4.13) transform when A undergoes the transformation
Green's functions G can be expressed in terms of the (4.24). From (4.13b) one can verify directly that
G's by formulas analogous to (3.29). We shall not give
the details, which are the exact analog of the cor- fwa(x) +~~:;"(Bfhg-8,X,(X)K,(X). (4.25)
responding details in electrodynamics. We can also
write equations similar to (3.32) for the operators 7. The function 'b transforms in a similar way:
The gauge transformations are given by the equation The easiest way of verifying (4.26) is to show that
(4.19)' which may be taken as the defining equation for
V , remains invariant when A undergoes the trans-
formalion (4.24) and V the transformation (4.26).
Under such a transformation, the two sides of (4.19)
transform as follows:
The variable PWa(z,P)is therefore invariant under the = -g6,8&6(%) vae(S,P). (4.30)
transformation (4.24), and we are justified in callkg Field Equations
it a gauge transformation.
To define a generator of the gauge transformation, Our aiin is now to express the field equations (4.11)
we construct the operator as equations lor the auxiliary variables. The first term
of (4.11) is easily transformed:
J
a
=--v,,(x,P)j#y(.) The factor V,, in the second term of (4.34) is still not
ax, in front of the other factors, but we can move it into
this position by using the cornmutation relation (4.30)
= V&, ~ ) -a. f ; , ~ ( x ) + y 6 ~ ~ u ~ ( x ) ~ ~ ~ ( x ) ~ ~ , ~The
( ~ )
equation then becomes
s
ax,
[from (4.19)] u=(%,n = Ya,(~,P)l)?-+9+ d Y ~ ~ Y ( ~ , ~ ) ~ B ~ ) x Y
We have seen that the second tcnn --iUpQ(@) of e,y(x) = l).y(x)+ d r Y~(r)xya(x,r)+g.,ax~(z,x)
(4.11) is equivalent to the operator --zVuy(x,P)~,y(x)
in the sense that they both have the same commuta-
tion relations with the operator F,,@(.,P>. We may
therefore be tempted to rewrite the field equations
(4.11) as follows:
+gy6&6(x,%) , (4.36)
from (4.29).
We can thus generalize the field equations (4.32)
to read
uniquely in the enlarged linear space. We employ this The path dependence has been removed from (4.38),
freedom to find a definition of UVu(x,P)which gives and we shall adopt it as our field equation. By taking
consistent field equation%W e begin by writing the gauge-invariant derivative
s
Uv(Z,P)=Vay(Z,P)%~(x)+ dy YB(Y)XUpI(X,Y) (4.33)
of the factor within the parentheses we can easily show
where x is arbitrary. Since the second term commutes that the last term must satisfy the consistency condition
with eveiy gauge-invariant operator, the right-hand
side of (4.33) maintains the correct commutation rela-
(4.39)
tions (4.9). All the terms in the equation of motion
(4.32) have a factor Vu7(x,P) in front of them, and it
will be convenient for us if the last term in (4.33) also We have t o choose the [unction x in the deiinition
such a factor. We therefore define (4.36) of 6 so that (4.39) is satisfied.
In order to orient ourselves we shall first find a func-
tion eye,of the form qraf&a, which satisfies (4.39). The
term @,'a will not have precisely the form of the second
term of (4.36), but we shall then be able to modify it
so that all conditions are satisfied. The following func-
tion clearly satisfiea (4.39):
FIG. 3. Diagrammatic representation of the equation for the two-point Green's function in the Yang-Mills theory.
where
where the matrices $I$ and $\epsilon_\beta$ represent the symbols
$\delta_{\alpha\gamma}$ and $\epsilon_{\alpha\beta\gamma}$, considered as matrices in $\alpha$ and $\gamma$. The
superscript (1) on $\theta$ indicates that it is not our final
definition of this operator. Equation (4.40) can be rewritten
without using reciprocals of operators as
follows:
+gEuBByE~a*ApB(s)Ar'(Z)A.((2). (4.46)
By integrating (4.45) in the usual way and using (4.43)
for $\theta$, we obtain the result
One could then construct Feynman diagrams by arranging the vertices (4.48) and (4.49) in all possible
ways. In fact, Eq. (4.47) without the last term is identical to the equation we would have obtained by
starting from the Lagrangian

and writing down "naive" Feynman rules in the usual way.

The presence of the last term in (4.47) shows that the naive Feynman rules are not correct and that there
are additional terms in the perturbation expansion. From (4.42), we can expand $\theta$ as a perturbation series
in g as follows:

vertices: $w(p_1\alpha, p_2\beta\mu, p_3\gamma) = -(2\pi)^4 g\,\epsilon_{\alpha\beta\gamma}\,p_{3\mu}$; (4.53b)
an over-all factor $-1$. (4.53c)

In (4.53c), the quantities $p_1, \alpha$ and $p_3, \gamma$ refer to the dashed lines, the quantities $p_2, \beta\mu$ to the solid lines
representing the Yang-Mills quanta. We notice that the vertex factor is not symmetric in the two dashed
lines; it involves a factor $p_{3\mu}$ but no factor $p_{1\mu}$. It is for this reason that we have drawn arrows on the dashed
lines in Fig. 3. The factor $p_{3\mu}$ in (4.53b) is associated with the line directed away from the vertex.

The prescription for constructing Feynman diagrams is therefore to draw three-particle and four-particle
vertices with factors (4.48) and (4.49), and also polygons with any number of dashed lines and with factors (4.53)
associated with them. The three- and four-point vertices, as well as the vertices of the polygons, are
then to be joined by solid Yang-Mills lines in all possible ways.

The Feynman rules for our theory are the same as
those for a theory with fictitious scalar particles as
well as the Yang-Mills particles. The Feynman diagrams
contain three- and four-point vertices involving the
Yang-Mills lines alone. The factors (4.48) and (4.49) are associated with these vertices. In addition, the
diagrams contain vertices involving two scalar lines and one Yang-Mills line. Associated with such vertices
are the factors (4.53b). There is a further factor $-1$ associated with each closed loop of scalar particles. The
scalar lines only occur as internal lines and only in closed loops.

a
x-~AF(Z1-R)ig4r*Apr(X2). .
' CB,~&'(X~)
ax,,
a 1
X- -Ap(xn-y). (4.51)
ax,. 2

When (4.51) is substituted in the last term of (4.47),
we obtain the result
Note added in manuscript. Faddeev and Popov
(unpublished) have shown that their functional-
integration prescription can be related to Schwinger's
formulation of the Yang-Mills theory.6 This therefore
provides an alternative derivation of the Feynman
rules from a quantized field theory. Faddeev and
Popov have restricted themselves to Landau gauge.
A method is suggested (and applied to the Yang-Mills and gravitational fields) for the construction of
the generating functional (S matrix) for fields possessing an invariance group. The unitarity and gauge
independence of the S matrix on the mass shell are seen explicitly.
The classical gravitational field is described by the I n the case K,.=g()) the first term in (4.8) should be
actionis written as
Lo=4(d-g)glp,. (4.2)
formalism. We use the following designations:
Here g,, is the metric tensor, g= detg,,, gcgvA= PA,and
R,, is the curvature tensor of second rank (Ricci tensor) G(z)IG(~)~(x) 6iVo/6g(B,,(x) , (4.10)
R,,= avr*sr-aur~,v+
rP,,ur~yp-r~,,yr~pu.
(4.3) GWo/Gg(~)(~),(4.11)
GPv(~)=G(B),v(x)=
r*,x--+ wVA=
r~,A--~avr~.A+r~,Aa.E~ L= Lo++,B*J3++culj*.p6,,B*., (4.18)
-rrP~aX~-rrPhrav~-a,axS.
(4.7) where 6, is the Minkowski tensor. The case afO will
be considered only for these gauge functions (4.17)
,$P(x) are arbitrary infinitesimal functions of x, I n the
which depend linearly on K~~ and are independent
first-order formalism, gauge transformations of both K,, of revA.
and I p v h should be made; in the second-order formalism,
only gauge transformations of K, should be made.
The gauge variation of (4.1) has the form
B+P(%)=
I dy D+,P(x,y;K,r)B(Y). (4.19)
6Wo=
J dx ~ ( ~ ) [ R p V ~ G Y X ( ~ ) + R p V ~ u(4.8)
(4.20)
Fields (Pergamon, London, 1962), 2nd ed. 8KP
in the case of the second-order formalism. In the first-order formalism the following equation should be added:
in the second-order formalism, and to
TI n d d X ) rI drx(x) (4.29)
t ,< P;V<X
(4.21)
in the first-order formalism.
With the help of the identity (4.16), we obtain the following consequences of the field equations:
(Q+,BV)(x) = 0, (4.22)
Q+,= Q(x,y; K, Γ)
We impose the restriction on $\psi_\mu$ that in the limit
$K_{\mu\nu} = \delta_{\mu\nu}$, $\Gamma^{\lambda}_{\mu\nu} = 0$,
the operator $Q_\psi$ should be a nonsingular differential operator $Q^{(0)}_\psi$. Choose the $D_\psi$ function in the form
D+p=[ Q + - ] ~ + A y ( ) . (4.24)
Note that the generating functionals (4.26), written in terms of different variables $g^{(\beta_1)}_{\mu\nu}$ or $g^{(\beta_2)}_{\mu\nu}$, do not coincide in general. In this connection there arises the question of a choice of the true variables (the true measure) in terms of which the generating functional can be written as the functional integral of exp(iL) over the true variables. The proposed method does not permit the value of $\beta$ to be determined. It could be found in principle with the help of the correctly formulated canonical formalism for the gravitational field. However, we shall not investigate this problem in the present work.
We now give some arguments which show that the S matrix corresponding to the generating functional (4.26) and (4.27) is independent of the choice of the variables of the functional integration belonging to class (4.5). Suppose we integrate over $K_{\mu\nu}$ in (4.26). The corresponding Jacobian has in general the form
Q+()ppBY=
0. (4.25) where y is some number and
s
Za+= d(K,I)dB@ exp
I n the gauge
+Tr lnQ+(&())-l
1. (4.26) +Tr In&( -K)~(@+(O))-~
1
$J,=O,
i.e., for (Y= 0, we have
x e X p {~ ~ ~ ( L , + K , J Y . ) + T r l ~ Q + ( ~ ~. ( (4.27)
0))l} We can see from (4.26) that za+
corresponds to the
field equations for K,,, Ipi,and Br which are obtained
from (4.20)-(4.22) by the substitution B+P+B + r .
I n (4.26) and (4.27), d(K,r) is equal to
a+. -
rI rI dx,(x) (4.28) $= -aG,$+, GN+ -B+ =O , (4.20)
P<# 6%
function +,,.
Define the function &(K,r) by the relation If (4.42) is true, then the left and the right measures
are different. When proving the gauge independence of
the S matrix, this fact should be taken into account
[in particular, in (4.35) one should use the d p ~ ] .
The formal proof of the invariance of the S matrix
can be made in the general case DetJ(x)G(x-y)# 1 [then
Here S is an element of the coordinate gauge trans- it is necessary to assume the P-independence of the
formation group, and dp(S) is the measure of group S matrix). However, taking into account that the
integration. (For more details on the coordinate gauge arbitrary functions~ ( xare) the coordinates themselves,
transformation group see Ref. 12.) we can expect that
Let us explain some peculiarities of the coordinate
DetJ(x)G(x-y)= 1. (4.47)
group transformations. Under the transformation
Indeed, the G(z(x)-y) can be considered as the matrix
X -+ P(x), (4.36)
19 S. Coleman, J. Wess, and B. Zumino, Phys. Rev. 177, 2239 (1969).
20 C. W. Misner, Rev. Mod. Phys. 29, 497 (1957); J. R. Klauder, Nuovo Cimento 19, 1059 (1961); B. Laurent, Arkiv Fysik 16, 279 (1959); B. S. DeWitt, J. Math. Phys. 3, 1073 (1962).
of the permutation of the points. The corresponding Thus we prove that the S matrix of the gravitation field
finite-dimensional matrix has determinant 1. Therefore, is independent of the type of gauge condition (4.34).
the 6(z(x)--y) has determinant 1 if it is calculated by The proof that the S matrix is independent of the
the use of the finite-dimensional approximations. In as for the case when the gauge function is fixed can be
this case (4.47) follows from (4.41) and made similarly to Sec. 11. For this purpose substitution
of (4.6) or (4.6) should be made in expression (4.26)
dgI,=d/.W=dg=~ @(.x). (4.48) with
X.I(
tqX)= - (sa/zff)(c~*-llr~~)(~). (4.55)
Furthermore, d p has the property (2.44), and the Then
) d(K,r) for arbitrary ps are invariant
measures d ( ~and
1
under the gauge coordinate transformations. Below we Lo- ---$pSpYJ/y+KpJPY 3 Lo
assume that (4.47) is true. 2a
As in Sec. I1 the property (2.44) enables one to prove 1
the gauge invariance of &(K,r). We must know the
function &(K,r) only for K~~ and I,X satisfying the
condition (4.34). In this case the group integral is con- and the variation of the term T r In in (4.26) is compen-
centrated in the neighborhood of the unit element. The sated by the resulting Jacobian. We omit the corre-
gauge transformations have the form (4.6), (4.7), and sponding cumbersome calculations.*
&(S)=rId S W 3 (4.49) Let us pass to the consideration of some particular
3,s gauges.
As in Sec. I1 we obtain A. Harmonic Condition
AC(KJ7 I , P O Consider the class of gauges determined by the
function
= / d p ( x ) 6( (Qpv+Tp)(x)}
-Det-Q. (4.50) g1y+ a , g q X ) , g= (d-g)gpV. (4.57)
The harmonic condition corresponds to
Keeping (4.50) in mind, the expression (4.27) can be
(4.58)
rewritten in the form
We shall use gfiy as independent variables. By means of
Zo, = \d(Kj r)6{$,(X; r))A+(K,1) (4.14) (with p = $ ) , (4.23), and (4.24), we find
J
el%=
6~~(gxuaxau+aXgxua,)+axg~~a~.
(4.59)
Xexp[ i ~ ~ ~ ~ ( ~ o + (4.51)
~ ~ ~ ~ The
~ ~ generating
) ] . functional is equal to*I
Multiply (4.51) by (b+I(K,r) and perform the gauge +Tr InQ~fi-l]. (4.60)
transformation
r p v X-+
-+ K,,~S-~,
K~~ rS-lpVh. In transverse gauge (a=O), we have
The quantities LO,A,, and A+1 are invariant. Further-
more, the following substitution can be made on the
mass shell:
K~-~ -+ ~
K ~ , ,J
JCP
V V (4.53)
as discussed in Sec. 11.
Then we obtain 1
+4 Tr lngrvd,dy~-l S{ a p g y ( x ) ) . (4.61)
- - A#- %- ,p (4.62)
E. S. Fradkin and I. v. Tyutin, CNH hboratorio di Ciller-
netica report, Napoli, 1969 (unpublished).
are are
C. Dirac Gauge
+ ~ L p ~ p ~ + 6 ~ ~ p * p " (4.65)
)]~!.
The Feynman rules for the gravitational field in the gauge $\partial_\mu \mathfrak{g}^{\mu\nu} = 0$ were also obtained by Faddeev and Popov.3
We give the arguments which show that the S matrix obtained by Popov and Faddeev6 in the Dirac gauge17 coincides with the S matrix in the covariant gauges. Consider the following set of gauge conditions:
(4-g)eikPik0= 0 ,
G3k= a i [ ( - g ( 3 ) ) l / 3 e i k1-- 0 . (4.73)
B. Linearized Form of Harmonic Condition Here
Consider the class of gauges described by the function g(3)= detgik, g= (l/g00)g(3), eikgkj= Sji. (4.74)
$'p2s 6"X(a,gAp-%a&Acr). (4.66) In gauge (4.73) it is natural to use the first-order
formalism. We choose g"" and P r A as independent
We choose g, as independent variables. We then obtain variables. With the help of (4.14) (with a=0), (4.15),
with the help of (4.13) (with p=O), (4.32), and (4.24) and (4.23) one finds
Q O= ~[-2b-ie'"nk0-2~i(g0'c/g~O)e~~rlK0
~
erv2=gPvO
+(augp8-+argm8)
X (6""~3,6,@+ 6'flc3r6vu- PflaV). (4.67) +&e'krlk' -2 i s k e i h ] d - g , (4.75)
Qaoj= 0 , (4.76)
The generating functional isz1
Qj30=- 2 ~ i T i j - e ' i ' ( - g ( 3 ) ) ' 1 3 ( + ~ ~ l e k j a ~ - 6 ; ~ a ' ) , (4.77)
1 pij=(1/gOO)gOmei"(-g(3))1/3
2856 E. S. FRADKIN AND I. V. TYUTIN 2
The generating functional is g" and rVxfiand consequently to the same S matrix as
(4.81) does.
23'
I [
d(grY,r)6{+sfi) exp a jdz(Lo+gY'J,.)
Now let us integrate in (4.89) over all the I'pvx except
Faik. One can show6 that Lo takes the form
Lo 3 ? r " g i k - - a ( a , g ) . (4.90)
+Tr lnQ300?-1+Tr lnBj'(Q3(o)-1)ki], (4.81)
Here H ( r , g ) is the Hamiltonian of the gravitation
B.i=Q3Tji=
1- (-g(3))1/3(6j"e'ndld~+~elid~dj). field, the explicit form of which we do not need. ?rik are
the canonical momenta for gik:
According to the general arguments given in this
section, the generating functional (4.81) on the mass ?rik= [d(
-g(3))/dgao)(eikelm-eirekm)rol,, (4.91)
shell is equivalent to (4.60), (4.61), and (4.68).
Now we transform expression (4.75). Using (4.25) With the help of (4.91) the gauge condition (4.73)
and (4.80) the field equations for B can be obtained: can be rewritten exactly in the form given by Diracl?:
VBo= 0 , (Gj%'V+f6ae'dldj)Bi=
0. (4.82) $,30Egik?ri'l:=0, + i=- a i [ ( - g ( 3 ) ) 1 / 3 e i k ] = o . (4.92)
The only physical solution of (4.82) is B,,? 0. Thus g''" Let us pass from the integration over r C i k and g g v in
and rvx'satisfy (4.20) and (4.21) with B,(3)=O; i.e., (4.89) to the integration over riband gfip.The resulting
relations (4.4) for r d are valid, and gfi" satisfies the Jacobian can be omitted. The proof of this fact is
usual Einstein equations. analogous to that of the possibility of arbitrary choice
Substituting (4.4) into (4.75), one obtains of the functional integration variables belonging to
class (4.5).
Q30'= (d-g(3))eikVidk(da) The final expression for the generating functional in
-(da)(2/-g(3))eikVidk, (4.83) gauge (4.73) or (4.92) is
viak= didk-'y'ikdl. (4.84)
Here yikL is the three-dimensional Christoff el symbol,
and a= (go0)-l. Let us find the expression for F o i k with
Zff=
J d(g,a)d{+afi)
- h. O = o ,
rk .+ i i k r O . zk -
e i k t ~
1
+Tr lnA.\/or+Tr InBj-Tr 1nQ3(") (4.89)
for the generating functional with gauge condition (4.73) can be used instead of (4.81). The generating functional (4.89) leads to the same field equations for
with the canonical scheme can be traced. Furthermore, the method suggested proves to be convenient for constructing the perturbation expansion of the S matrix in theories partially invariant under a gauge group, the power of divergence in the S matrix being considerably reduced.
In this paper no attention was paid to possible interactions with other particles. The latter would not affect our considerations, however.
We would like to discuss briefly the problems which have not yet been solved.
(1) Owing to divergences, there is an important problem of introducing a regularization which will not affect the group properties of the theory. Recall that non-gauge-invariant regularization in electrodynamics creates the photon mass. From the more recent view, the resulting photon mass is due to Schwinger terms or, in the end, to the singular character of products of field operators at coincident points. In nonlinear theories this problem becomes even more complicated. The Schwinger terms affect even the renormalization constant, as for instance in the case of the Yang-Mills field.
(2) There is an interesting question whether the gravitation field is renormalizable in the framework of perturbation theory. (We mean here the usual perturbation expansion with respect to a coupling constant and not the method of Fradkin and Efimov.22) It is convenient to treat this problem using the variables $h^{\mu\nu}$ and $\Gamma^{\lambda}_{\mu\nu}$ in the first-order formalism where there are two vertices: a vertex $\Gamma\Gamma h$ and the vertex responsible for the interaction of $h^{\mu\nu}$ with the fictitious B field. The formal estimate of degrees of growth leads to the conclusion that the theory is of unrenormalizable type.

ACKNOWLEDGMENTS
The authors are grateful to the participants in the theoretical seminar at the P. N. Lebedev Institute for useful discussions. One of the authors (E. F.) thanks Dr. Popov and Dr. Faddeev for being so kind as to acquaint him with Ref. 6 prior to publication, and Professor E. R. Caianiello for his kind hospitality. He is also grateful to Professor Abdus Salam for hospitality at the International Centre for Theoretical Physics, Trieste.

22 E. S. Fradkin, Nucl. Phys. 49, 624 (1963); 76, 588 (1966); G. V. Efimov, Zh. Eksperim. i Teor. Fiz. 44, 2107 (1963) [Soviet Phys. JETP 17, 1417 (1963)]; Nuovo Cimento 32, 1046 (1964).
BULLETIN OF THE
AMERICAN MATHEMATICAL SOCIETY
Volume 78, Number 5. September 1972
style. I shall try to convince you by examining actual cases that the progress
of both mathematics and physics has in the past been seriously retarded
by our unwillingness to listen to one another. And I will end with an
attempt to identify some areas in which opportunities for future discov-
eries are now being missed.
REFERENCES
1. D. Hilbert, Mathematische Probleme, Lecture to the Second Internat. Congress of Math. (Paris, 1900), Arch. Math. und Phys. (3) 1 (1901), 44-63; 213-237; English transl., Bull. Amer. Math. Soc. 8 (1902), 437-479.
2. H. Minkowski, Raum und Zeit, Lecture to the 80th Assembly of Natural Scientists (Köln, 1908), Phys. Z. 10 (1909), 104-111; English transl., The principle of relativity, Aberdeen Univ. Press, Aberdeen, 1923.
Freeman Dyson
Looking back on this lecture thirty years later, I have the impression
that things have improved. Mathematicians and physicists are listening
more to one another now than they were then. Ideas and methods are
spreading more easily between the two disciplines. On the other hand,
there is now a widening gulf of incomprehension between two groups of
physicists, one group doing string theory and the other group doing other
kinds of physics. The string theorists and the mathematicians understand
each other, but the gap that used to separate mathematics from physics
now separates string theory from the rest of physics.
Chapter 7
491
Kenji HAYASHI
and Tadao NAKANO*
Department of Physics, Kyoto University, Kyoto
Department of Physics, Osaka City University, Osaka
Gauge fields together with nonlinear field equations to govern them are introduced by requiring that the Lagrangian should be invariant under an extended translation in space-time, i.e. a translation in which four parameters are replaced by four arbitrary coordinate-dependent functions. A prescription is given to convert a non-invariant canonical (pseudo) energy-momentum tensor into an invariant one.
The symmetric part of these field equations is examined for the two cases: (1) under linear and non-relativistic approximation, it reduces to the classical gravitational-field equation, (2) for static and spherically symmetric field, its solution is shown to correspond to Schwarzschild's solution. The antisymmetric part has no classical analogues, for there are no sources of skew-symmetric energy-momentum tensors in the classical experiments. A reasonable method is proposed to eliminate this redundant field.
1. Introduction
Since it was suggested that the electromagnetic interaction is best understood in terms of a principle of gauge invariance, under a gauge transformation with a coordinate-dependent function, there have been a number of attempts to deduce the existence of gauge fields coupled to conserved currents, starting with the idea of extended transformations.1)
It was shown that the invariance under the n-parameter Lie group of transformation referred to space-time and/or fields leads to the conservation of n generators. Further, invariance requirement under an extended transformation, i.e. a transformation whose n parameters are replaced by n space-time dependent functions, necessitates the introduction of n (generally) non-commuting vector fields together with field equations which they must satisfy.
The purpose of this paper is to deduce the existence of a gravitational field from the translational invariance in an extended sense just mentioned above. In order to construct the gravitational interaction, Utiyama has proposed to introduce 24 new field variables by postulating the invariance under an extended four-dimensional rotation which is specified by six skew-symmetric arbitrary functions $\omega_{kl}(x)$.3) However, the self-inconsistency of his scheme was pointed out by Kibble who has claimed that it is necessary to consider the extension of full 10-parameter inhomogeneous Lorentz group in place of the restricted six-parameter group.4) Then, our method is different from both of them and will be shown to be one of the simplest ways of discussing the gravitational interaction within the Lagrangian
2. General formulation
We start with the Lagrangian density**),***'
*) Einstein's theory of general relativity has been based on the general covariance under the
(2.1')
where
s k = Lo>,$6qA
-T d X i , aLo/8q$= Lo,$ ,
(2 - 9)
Ti,= Lo,p$q: - 6ikLo .
If the action integral is invariant under the translation (2-4), the conservation
law of the energy-momentum tensor defined above follows
T,,.,= 0 (2.10)
on account of the field equation (2.8).
Next we consider the extended translation**)
$\delta q^A = 0, \quad \delta x^\mu = \varepsilon^\mu(x)$, (2.11)
($\varepsilon^\mu(x)$: infinitesimal arbitrary function).
The invariance property of $L_0$ under the translation (2.4) breaks down in this case; the variation of the derivative does not vanish,
$\delta q^A_{,\mu} = (\delta q^A)_{,\mu} - q^A_{,\nu}\,\delta x^\nu{}_{,\mu} = -\varepsilon^\nu{}_{,\mu}\, q^A_{,\nu}$. (2.12)
*) This is called the Euler equation and is derived by postulating $\delta I = 0$ under the condition that $\delta^{*} q^A$ should vanish on the boundary surface of the integration domain.
**) In this case the Greek indices are used for convenience.
We shall further require the invariance of the action integral under the extended translation, by defining the covariant derivative through which the new field $a_k{}^{\mu}(x)$ is introduced so as to satisfy our postulate:
$D_k q^A = \{\delta_k{}^{\mu} + a_k{}^{\mu}(x)\}\, q^A_{,\mu} = b_k{}^{\mu}(x)\, q^A_{,\mu}$, (2.13)
$\delta D_k q^A = 0$. (2.14)
In order to satisfy Eq. (2.14), it follows immediately,
$\delta b_k{}^{\mu} = \varepsilon^{\mu}{}_{,\nu}\, b_k{}^{\nu}$. (2.15)
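To see why the transformation law (2.15) is the one that makes (2.14) hold, one can vary (2.13) term by term. The short check below is a sketch under the assumption that the covariant derivative reads $D_k q^A = b_k{}^{\mu} q^A_{,\mu}$ and that (2.12) reads $\delta q^A_{,\mu} = -\varepsilon^{\nu}{}_{,\mu} q^A_{,\nu}$; the index placement is an interpretation of the printed formulas, not a quotation.

```latex
\begin{align*}
\delta (D_k q^A)
  &= (\delta b_k{}^{\mu})\, q^A_{,\mu} + b_k{}^{\mu}\, \delta q^A_{,\mu} \\
  &= \varepsilon^{\mu}{}_{,\nu}\, b_k{}^{\nu}\, q^A_{,\mu}
     - b_k{}^{\mu}\, \varepsilon^{\nu}{}_{,\mu}\, q^A_{,\nu} \\
  &= 0 .
\end{align*}
```

The two terms in the second line are equal after the dummy indices $\mu$ and $\nu$ are interchanged in one of them, so they cancel identically.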
Therefore we recover the invariance of the action integral even under the extended translation (i) by simply replacing $q^A_{,k}$ in the original Lagrangian by the covariant derivative $D_k q^A$ defined above;
$L_0(q^A, q^A_{,k}) \to L(q^A, q^A_{,\mu}, b_k{}^{\mu}) = L(q^A, D_k q^A) = L_0(q^A, q^A_{,k} \to D_k q^A)$, (2.16)
hence its variation associated with (2.11) vanishes identically
$\delta L = 0$, (2.16)
and further (ii) by multiplying $L$ by a certain function $b(x)$ so as to satisfy the required identity (2.3):
$\delta\mathcal{L} + \mathcal{L}\,\varepsilon^{\mu}{}_{,\mu} = 0$, $\mathcal{L} = bL$. (2.17)
Accordingly, the transformation property of $b$ has to be
$\delta b = -\varepsilon^{\mu}{}_{,\mu}\, b$. (2.18)
The next tasks are then to construct such a function $b(x)$ and the invariant field strength from $b_k{}^{\mu}$ and its first derivatives. For these purposes, it is necessary to define the field $b^k{}_{\mu}$, inverse to $b_k{}^{\mu}$, from the following orthogonal relations:
(2.19)
Consequently, it follows
$\delta b^k{}_{\mu} = -\varepsilon^{\nu}{}_{,\mu}\, b^k{}_{\nu}$,
hence we choose
$b = \det(b^k{}_{\mu})$,
because it has the desired property (2.18). In other words, the invariant volume element becomes $b\,d^4x$ instead of $d^4x$. Suppose that we obtain a free Lagrangian $L_G$ for the new field, the action integral turns out
where
L = L + LG=b (L+LG) ,*) (2.20)
and LG consists of the invariant field strength. We shall write for short
<aa>
= (qA, bkp) (2.21)
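The choice of $b$ as the determinant of the inverse field, made above, can be checked numerically. The sketch below assumes that the inverse field varies as $\delta b^k{}_{\mu} = -\varepsilon^{\nu}{}_{,\mu} b^k{}_{\nu}$ (an interpretation of the printed index placement) and uses random placeholder matrices rather than physical fields; the function names are illustrative only. It compares the actual change of the determinant with the value $-\varepsilon^{\mu}{}_{,\mu}\, b$ required by (2.18).

```python
import random


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def check_det_variation(seed=0, scale=1e-6):
    """Return (actual change of det B, predicted change, det B).

    B[k][mu] stands in for the inverse field b^k_mu and E[nu][mu] for the
    gradient eps^nu_{,mu}. Both are random placeholders (assumptions),
    not solutions of any field equation.
    """
    rng = random.Random(seed)
    n = 4
    B = [[(1.0 if k == mu else 0.0) + 0.3 * rng.uniform(-1, 1)
          for mu in range(n)] for k in range(n)]
    E = [[scale * rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    # Assumed transformation law: delta b^k_mu = - eps^nu_{,mu} b^k_nu
    dB = [[-sum(E[nu][mu] * B[k][nu] for nu in range(n))
           for mu in range(n)] for k in range(n)]
    B_new = [[B[k][mu] + dB[k][mu] for mu in range(n)] for k in range(n)]
    b = det(B)
    actual = det(B_new) - b
    # Required property (2.18): delta b = - eps^mu_{,mu} b
    predicted = -sum(E[mu][mu] for mu in range(n)) * b
    return actual, predicted, b


if __name__ == "__main__":
    actual, predicted, b = check_det_variation()
    print("actual change   :", actual)
    print("predicted change:", predicted)
```

The two numbers should agree up to terms quadratic in the gradient, which is the content of the general identity $\delta \det B = \det B \, \mathrm{tr}(B^{-1} \delta B)$.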
The invariance of the action integral follows from the following identity analogous
to (2.3):
6*L + (LEV).,= 0 ,
which is just shown above to hold by means of our prescription. The above
identity is rewritten as before,
[ L ]Qa6*Qa+ S$ = 0 , (2.22)
where
(2.23)
(2.24)
(2-25)
As E, ~ t :and
GFLare chosen arbitrarily inside the integration domain 2, the
second term of Eq. (2.25) resolves itself into the three identities
* & ; ([L]bkybku)*p$
(mTy$fY)~p~o, (2.26)
;& ; + (T +a)
rL1.kpbk + (LP6kpXbkY).hE0
, (2 - 27)
,
eyfi&Ekx,pbkv30 (2* 28)
among which there are only two independent identities, for the last identity
implies
+
L:kx,, LEk,., = 0 (2 28)
and the differentiation of the second identity yields the first one, by making use of (2.28). Furthermore it suggests that the invariant field strength must contain an antisymmetric combination of $b^k{}_{\mu,\nu}$ with respect to the Greek subscripts and
~ -
(2 - 29)
Under the extended translation only the Greek indices are associated with the transformation properties. Then there arises a question: what role does the Latin index play? To see it, we consider the four-dimensional rotation of the field variables only,
3Y=O ,
(2.30)
8qA= TqA,
under the assumption that L is kept invariant under (2.30) and Lo under the
Lorentz transformation specified by
axk @kLxI , (@(LO = O),
6qA= TqA.
The assumed invariance properties respectively yield the following identities :
+
L!hATqA L!LkqATD,qA+ L!,kP6b,P= 0 , (2.31)
+ Lov,,*,TqP,- Lo*qhq$@Lk= O ,
Lo*,.*TqA
and (2.31) passes into
(6b, - w k l b L P ) L%,rl4$; = 0 ,
to agP,, and b,, the covariant vector as it transforms cogradiently to (2.12). bkp
is to be referred to as a vierbein system because of its dual character under
the extended translation and the four-dimensional rotation specified by (2.11)
and (2-30), respectively. These situations are made clear and are summed up
by the following statement. Under the combination of these two independent
transformations,
6x = E (x), (2.30)
6qA= TqA, -
(2 30)
L stays invariant if bkfi (or equivalently bk,,) transforms as follows:
-
(2 30)
The field strength ckl,Jlis reducible under the four-dimensional rotation ; the
irreducible parts of it are calculated by means of the standard method:
i) an irreducible tensor of rank 3,
~ L s i = ~ ( k l )-
m (1/3) (Sk:lC,nV-S?n(kC1))
ii) a vector,
CkF=C7Jlmk = (bbk)~,/b ,
iii) an axial vector,
C k A =i&i,n,lCin171/6 .
The tensor CkL,n is represented in terms of these irreducible tensors,
Cklm = (4/3) c;[ini] + (2/3) 6k[icZj + i~kl,m?~?ld.
We shall require that $L_G$ should be of the quadratic form in the first derivative of $b^k{}_{\mu}$. Thus, we choose
$\mathcal{L}_G = b\,(\alpha\, c^{T}_{klm} c^{T}_{klm} + \beta\, c^{V}_{k} c^{V}_{k} + \gamma\, c^{A}_{k} c^{A}_{k} + \delta)$ *) (2.32)
with the arbitrary constants $\alpha$, $\beta$, $\gamma$, $\delta$.
(2.27) , we obtain after some algebra
Bkl =7nTkl 9
(2 * 33)
Bkl = -bmipbFklm,p CmVFkmL + (1/2) C l m 7 l ~ ; c m-
n cm7tkFnin1 + ~ L Z L ~(2, 34)
*
F/ci;clm = 4b {aC.&m] + v]
PJk[lCm - (1/6) ~)I&kL?JlIIC?~) (2 35)
(2 * 36)
where all the tensors are converted into the local tensors in order to preserve
the invariance under the extended translation.
We shall manage to write down the equation of motion (2.33) in a simpler
form analogous to the divergent form (2.27),
- ( b l P b m F k l m ) *v = b, (mTkl + tkl) y (2 37)*
where
(2 35)
*
(2 - 39)
1
(brTl())Zp =0
\
Ptn= d3x:b;Tl(*), (2.42)
(At))= (A:))= ( $ k z ) ,
p k
i
= d3xb;( m T k l + tkl) . -
(2 43)
(2.48)
where ti2 does not contain an arbitrary constant $\beta$; it can be rewritten in terms of the irreducible tensor components,
ti2 = (1/2)ib {a- (4/9) r } (2ClklJllljCIP,,JCjA+ E,.lJljc,lLcjA
+ 3i (6k.C,,AC7,d -CkA4CLA)} .
(2.48)
In deriving (2.46) and (2.47) special care is taken in order to eliminate the second derivative of the field variables from the definitions of ti? and tiL, by dint of the useful identity
(2.49)
If we put
(2.50)
the above two equations (2.46) and (2.47) yield after the differentiation
(2.51)
(2.52)
and its first derivatives are so small as compared to unity that the quadratic
terms in a, and/or its derivatives lead only to secondary effects and are
hereafter neglected. In this linear approximation, all the Greek indices are replaced by the Latin indices as there remain no differences between them. From the orthogonal relations (2.19), it follows that
a
k &lk ,
and the various non-linear quantities pass into the linearized ones,
a
~ (km'l~)
(3-1)
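The step from the orthogonal relations to a linearized inverse field rests on a general first-order fact: if the vierbein is written as the unit matrix plus a small field $a$, then the unit matrix minus $a$ inverts it up to terms quadratic in $a$. The following sketch illustrates this with plain 4x4 matrices; the matrix layout and the function names are assumptions made for the illustration, not notation taken from the paper.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def first_order_inverse_error(a):
    """Largest entry of |(1 + a)(1 - a) - 1|, i.e. of |a @ a|.

    `a` plays the role of the small gauge field in b = 1 + a (assumed
    form); the result measures how far 1 - a is from the exact inverse.
    """
    n = len(a)
    ident = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    plus = [[ident[i][j] + a[i][j] for j in range(n)] for i in range(n)]
    minus = [[ident[i][j] - a[i][j] for j in range(n)] for i in range(n)]
    prod = matmul(plus, minus)
    return max(abs(prod[i][j] - ident[i][j])
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    # Hypothetical small field with entries of size 1e-3.
    a = [[1e-3 * (((i + 2 * j) % 3) - 1) for j in range(4)] for i in range(4)]
    print("error for |a| ~ 1e-3:", first_order_inverse_error(a))
    a_small = [[x / 10 for x in row] for row in a]
    print("error for |a| ~ 1e-4:", first_order_inverse_error(a_small))
```

Reducing the field by a factor of ten reduces the error by a factor of a hundred, which is why quadratic terms can be dropped consistently in the linear approximation.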
ck V
CkA== (1/3) ze*Zm<Z2m'
(a -2/9) {flt
(3-2)
(V9) r>
(3-3)
with
= amm , D = 9mdm .
Provided that the relation (2-50) holds, these two equations are completely
decomposed into the symmetric and anti-symmetric parts,
a,(lm)'mk ~ kl
(3-4)
- {a - (4/9) rl (3-5)
we obtain
( 3 * lo)
(3.11)
On the other hand, we choose the Lagrangian of a spinor field, say an
electron, as a matter field,
Lo= W.2) (Frkh L;d>+ 4 G ,
- (3.12)
hence we obtain the invariant substitute of it by the prescription stated in §2,
L = (1/2) b, - j + F~.
($rks,, iFZprkfi 111 (3 - 13)
The equation of motion derived from L=bL is
t7krrk$p+ (l/Z) ck.rL$+ m G =0 , ( 3 - 14)
which is reduced by the linear approximation (3.1) and the divergence-free condition (3.9) to
rk +
(3, - li (SkA- (l/2) S,pS) dr, 1A,,3, - (1/4) li-S.,}4t ?)L$ = 0 , (3* 14)
(SWL,,,= 1.
On multiplying the dual operator, we obtain the differential equation of second order,
{ (1 +/cs)0 -- 2 K s k , @ k o ? L - (1/4) R (ns)14
+ i { K (SknL.$TIL(1/2) S,la,) +
- 6 k ~ $ / ? t 2=~O , ( 3 * 15)
where
rrrl=a,, tidti .
Now let us proceed to the non-relativistic limit of (3.10), (3.11) and
(3.15) ;
n&=- l i m r l r O O =
1
--KO,
*) One is permitted to set $S_{kl} = 0 = A_{kl}$ except for $S_{00}$ only, as there are no components of sources for them.
where
R, = 2{ G i d X ) - r;J>,p}
7
.
R =gpYRpy
Transition from the Greek indices into the Latin ones (as already shown in
(2.36), for example) yields
It is easy to construct the general form of bk, such that it has the required
transformation properties mentioned above.
+ + BXaXa
baa = 6aa (1 A )
baa=iCXa,
bta= i D X a ,
bk6=1+E,
where
X a= xa/r, r2= 2
and A, B, E are functions of r only.
a,
is established.
Dn=DmD,, (5* 7)
we observe the connection between the energy-momentum tensor and its derivative
as follows
mFLk= ( 1 / 2 )bbi' ($rk$pp- $,'rk$) = b" TLk , (5.8)
(bL'makL)'p= -CmkLmTmL, (5.8')
(bLPmTlk)'p = - b ( 1 / 4 )Eklmncjmn ($r5rL$'j - $'.jr57L@) +
$- (bbmpcrnkt~rLfi)'p, (5 * 9 )
(l7LPtkL)'p +
= - CmkLtmL 2bkYb,1[p'v] .
(bL"b?nXFn~m)'X (5 * 10)
In fact, the sum of (5.5) and (5.10) vanishes if the field equation for the b',
field is employed. Thus, the derivative of the skew-symmetric energy-momentum
tensor cannot be made zero with any choices in the free parameters.
Case 3: $\alpha + \beta = 0$, $\alpha - (4/9)\gamma = 0$ and we add an axial-vector interaction to the matter field.
First, we consider the local homogeneous Lorentz transformation (compare
with (2- 30) ) ,
8x'=O,
8fi= (i/4) WkL (x)dkL$ , (@<kl) (x>= O) i (5.11) **)
n-
o$= - (.
2 / 4 )wkL qgkL . J
Under this transformation, the modified Lagrangian L' (3.13) is no longer invariant. According to our recipe we should introduce new fields in order to make the theory invariant.***) Instead of introducing new fields, however, we shall here add some interaction terms consisting of the gravitational field strength. There are the vector, the axial-vector and the tensor couplings constructed in terms of $c^{V}_{k}$, $c^{A}_{k}$ and $c^{T}_{klm}$ in the form of tri-linear interactions. On examining the respective transformation properties, we find a promising one, that is, an
*) Its linearized form is given by (3.15). It should be noticed that the covariant derivatives do not commute with each other, yielding the invariant field strength, $[D_k, D_l] = c_{mkl} D_m$.
**) $\omega_{kl}(x)$ and its first derivative are assumed to vanish on the boundary surface of the integration domain.
***) Detailed discussion of it will be made in a forthcoming paper.
axial-vector coupling,
LA= - (3'1/41C L A $ r S r k $ , (5.12)
and the modified Lagrangian
L' + L A (5.13)
remains invariant under the extended four-dimensional rotation (and of course
under the extended translation). From (5.13), the equation of motion is replaced
by
+
bkPrk$',, (1/2) C k v r k $ - (3i/4) Ck*rdk$ + ?JLfi =0 . (5 * 14)
It should be noticed, on the other hand, that the action integral
I ( 2 ) = \ d4xLG
2
is kept invariant even under (5.11), because of the particular choice in the
arbitrary coefficient (see (5.6) ) ,
61= 5
I
=0 .
= B(kl)Lohzd4x
Bkllokld4x
L
§6. Discussion
References
*) The field strength of the electromagnetic field is irreducible for itself and there remains
612
Ryoyu UTIYAMA
and Takeshi FUKUYAMA
It is shown that a symmetric tensor field of the second rank $A_{\mu\nu}(x)$ should be introduced in order to retain the invariance of the action-integral under a generalized translation $x^\mu \to x^\mu + \xi^\mu(x)$, provided that the original action-integral is invariant under inhomogeneous Lorentz transformations. It is further proved that the generalized gauge field $A_{\mu\nu}$ should appear in the Lagrangian in exactly the same fashion as the metric tensor $g_{\mu\nu}$ does in Einstein's theory of gravitation.
Some general feature is also discussed with respect to a law of conservation of some physical quantity which becomes no longer valid when the interaction with the generalized gauge field takes place, provided that the associated group is non-Abelian.
1. Introduction
A gravitational field was first interpreted as a kind of generalized gauge fields by one of the present authors1) by introducing a system of tetrads $h^{k}{}_{\mu}(x)$ and extending the Lorentz transformation of the tetrads at each world point to a larger group depending upon six arbitrary functions of $x$ instead of six parameters. Besides this article, some authors2),3) tried to introduce a gravitational field by extending the translation group to a general transformation of coordinates
$x^\mu \to x^\mu + \xi^\mu(x)$,
but their arguments seem rather unsatisfactory and complicated.
Many groups of transformations depending on parameters have been found in connection with the different kinds of conservation laws. Among these groups it is well known that the group of phase-transformations of complex fields was extended to the gauge transformation depending on an arbitrary scalar function $\lambda(x)$ connected with the existence of an electromagnetic field. The invariance under rotations in the iso-spin space was extended to the invariance under a generalized rotation group by an adjoined introduction of the Yang-Mills field. The most well-known group, namely the translation group, has been conjectured to be related with the gravitational field because the gravitational field is, following Einstein's equation, produced by the energy-momentum tensor of material fields, the conservation of which holds owing to the invariance of the material system under a translation of coordinates. In spite of such a conjecture, however, there has not been any convincing article which shows the gravitational field being derivable from the postulate that the action integral of a material system is invariant under a group of generalized translations depending upon four arbitrary functions of $x$.
The aim of the present paper is to show that a tensor field of the second rank should be introduced in order to retain the invariance of the action-integral and that this tensor field should appear in the original Lagrangian of the material field in exactly the same way as the metric tensor $g_{\mu\nu}$ does in Einstein's theory of gravitation. This conclusion is derived from the assumption that the original Lagrangian is invariant under inhomogeneous Lorentz transformations but the invariance under rotations of tetrads has not been assumed.
In addition to the derivation of the gravitational field, some general feature is discussed with respect to the laws of conservation. It is shown that a physical quantity owned by some field, say $\phi^A(x)$, which is conserved owing to the invariance of the action-integral of $\phi^A$ under some parameter-group of transformations, becomes unable to satisfy the law of conservation when the original $\phi$-field begins to interact with a generalized gauge field associated with the group mentioned above, provided that this group of transformations is non-Abelian. The conservation is recovered only when the quantity carried by the generalized gauge field is taken into account together with that possessed by the field $\phi^A$.
The present procedure of introducing the interaction of a gravitational field with a material system might include its application in a derivation of an S-matrix for a material system interacting with a gravitational field if the Lorentz-invariant S-matrix is known for this material system without the gravitational interaction.
§2. Fundamental postulate
$I = \int L\, d^4x$
i) translation
$x^\mu \to x'^\mu = x^\mu + a^\mu$, ($a^\mu$ = constant parameter)
~/&=yp=
I -1
1
0
p=v=11,2,3,
P+Y
p=v=O.
9
Here it has been assumed that the field $\phi^A$ is a kind of tensor and the transformation-coefficient $C$ is an appropriate sum of products of Kronecker's $\delta$.
Our fundamental postulate is that the action integral should be invariant under the generalized translations, which are a generalization of (i) and (ii) depending upon four infinitesimal arbitrary functions of $x$ in place of four parameters $a^\mu$, i.e.
$x^\mu \to x'^\mu = x^\mu + \xi^\mu(x)$. (2.1)
In order to realize this postulate, the original arguments of the Lagrangian, for example $\partial_\mu \phi^A$, should be replaced with an appropriately defined covariant derivative $\nabla_k \phi^A$ by introducing a new field $A_K(x)$.
Let a Lagrangian
8Ap,...pr OD (;,2lr):.
(5)= A,,l...vr f, , (2.2)
where the transformation coefficient D is
The expression of 6$A for the Lorentz transformation has the form
8$A= $B c?; f , v
*
Il =
s Lld x
should be invariant under (2.1) leads to the following various identities (see
Appendix A ) :
a, { [ L Y .C2;. $B + [Li]lar.
D (:::::L:)Y, . Ab,. . . b r }
+ [Ll]a.ar.Aa,...a,,p~O
+ {LI]-$A,~ , (2 - 4)
a, { [LJ.c<;. + [L,]+I-.D (
$B y
,...a:); -0 , .Ab,...br -T / l ) p } (2.5)
[Li]* c$ * $B + [ L I ] ~ D. .(:::$)$
~~ * A b , ...a,
+ { p and Y interchanged} S O , (2 7)
where the following abbreviations have been used :
dLZ . .$ B . [{~~.C~p+M~i...ar
I...a,)p 4 .A b l...brt. ~ ( L i . , . ~ r
ark$,
+ {Y and ,u interchanged}]=O .
A(q...UT)
(X)*
I f w e define (x) by
A(al'..ur) (x)cc Y&)
I...u,-,a)ua
9 (3.5)
the above relation is written as
* A ( 5 1 . . . a , ~ ,., ) ~ ~ ~
A(a1".5r-1a) (3* 6)
This result shows that $r$ should be $> 1$; otherwise (3.6) leads to a contradiction.
The relation (3.5) allows us to represent $Y$ in terms of $A^{(a_1 \cdots)}$ as
,...a,. ua
y > $ y ? - ) v a= (XI . z((fl...*r))(L8)
A('J,-JJr) , (3.7)
where the coefficient $Z$ on the right-hand side is an appropriate sum of products
of Kronecker's $\delta$.
Substituting (3.7) for $Y$ in (3.4) and making use of the definition (2.3)
of $D$, we have an important relation
- A(a,...5r.,p)
28:8{fi]=~* A(bl'..*T) {Z:fl...lrr)(~,q)
I-ar-,u)un +
,Z~f,'.:~~~rj($fn}. ( 3 . 8 )
*) $(\beta\lambda)$ means that $Y$ is symmetric with respect to suffixes inside the parentheses.
From our assumption that $A^{(a_1 \cdots a_r)}$ and $A_{(b_1 \cdots b_r)}$ are both fully symmetric with
respect to their suffixes, it is plausible to assume that the undetermined coefficient
$Z$ is also symmetric with respect to both superscripts $(a_1 \cdots a_r)$ and subscripts
$(b_1 \cdots b_r)$, in addition to the symmetric pair of subscripts $(\lambda\beta)$. Since we have
no information about the symmetry of $Z$ with respect to the extra superscripts
$\nu$ and $\alpha$, $Z$ having the above-mentioned symmetry with respect to the suffixes
can be expressed in terms of the products of Kronecker's $\delta$ in the following way:
tc i: (a,...a . )
[ O ( * , ...> . . ! : f i r ) . t d.,kS$;:::;.!$
o;isyp+ s$::::pzl*,)0~i8,1 ...h,. )8;;;ij) ,
i=l 1. J (L+ij)
(3-9)
where $a$, $b$, $c$ and $d$ are undetermined constants. The symbol $\delta^{(\cdots)}_{(\cdots)}$
in (3.9) means
where the summation should be taken over all the permutations of $(b_1 \cdots b_r)$.
Inserting the expression (3.9) into (3.8), and taking contractions of (3.8)
with respect to many different pairs of suffixes, we arrive at the result
$r = 2$, $a = b = 0$, $c = -d = \ldots$ (3.10)
with the normalization
$A^{(\mu\rho)}(x) \cdot A_{(\rho\nu)}(x) = \delta^{\mu}_{\nu}$. (3.11)
The details of the derivation of this result are given in Appendix B. (3.10)
determines $Z$, $Y$ and $\nabla_\kappa \phi_A$ as follows:
= 3 [o$:,,z8;,SY, +
z((;$;;& On((6a,,la) , ) J ; , ~ Y p
+o(a,a,)
(b,B) o*,S;+61~:~?6;,6~-2o(l,,
a (*]b2)1> Yala,)/J(W (3.12)
§ 4. Derivation of Lagrangian
Before beginning a discussion about (2.6), let us consider a little extension
of the Lorentz-invariance of the original Lagrangian $L$.
The Lorentz-invariance of the original Lagrangian can be made manifest by
writing explicitly the metric tensor $\eta_{\mu\nu}$. It is easily seen, however, that this
manifestly invariant expression of the action integral can also be invariant under
an affine transformation*)
$x^\mu \to x'^\mu = a^\mu{}_\nu\, x^\nu + a^\mu$, $\det(a^\mu{}_\nu) \neq 0$, (4.1)
*) In the case of an affine transformation, since the coefficient $a^\mu{}_\nu$ has no such restriction
as $\eta_{\mu\nu}\, a^\mu{}_\alpha\, a^\nu{}_\beta = \eta_{\alpha\beta}$, there cannot exist such a covariant affine tensor as $g_{\mu\nu}$ whose
components are kept unchanged under the transformation (4.1).
L ( d A > d A , X , C p w )= J l T . L ( d A , d A , x , C p Y ) y
C =d e t ( C J . (4.2)
The invariance of $I = \int L\, d^4x$ under an infinitesimal affine transformation
x-+x= xi+ A, .xu+ a
where
=5 ( $ A , r d A > A,)Y
where
A = det (A,).
The discussion so far developed neither compels us to interpret $A_{\mu\nu}$ as the
gravitational field nor gives any information on the signature of $A_{\mu\nu}$, but in order
to let the field equation of $\phi_A$ be hyperbolic, $A_{\mu\nu}$ should have the signature $-$,
$+$, $+$, $+$. In place of our $A_{\mu\nu}$ one can consider a particular tensor field $B_{\mu\nu}$
which can be derived by introducing a system of curvilinear coordinates $u$ into
the Minkowskian space, that is,
The question whether our $A_{\mu\nu}$ is identical with the fictitious gravitational field
$B_{\mu\nu}$ or is an entity completely different from $B_{\mu\nu}$, giving a non-vanishing
curvature tensor,*) is to be answered by the field equation of $A_{\mu\nu}$. Thus if
Einstein's equation is taken as a field equation, our $A_{\mu\nu}$ describes a permanent
gravitational field produced by the material field $\phi_A$.
*) The term "curvature tensor" means the Riemann-Christoffel curvature tensor when $A_{\mu\nu}$
is substituted for the metric tensor $g_{\mu\nu}$ in the ordinary definition of the curvature tensor.
8S, 1
8X = y S a b.A,,,,
__- , (5- 3 )
S, = S * A,,. (5.4)
The relationship between $S$ and $T$ is given by (2.6) with the aid of (5.1):
S, = TZ),- d~FpC, (5 * 5)
where
V S ~ =o
, , (5.3)
where
s u p=S,/JK ,
and the covariant derivative of S u pis
+ A;, *Sap
V k S , = dxSYP - A:p. S u b .
The existence of the non-vanishing right-hand side of (5.3) means that the energy-momentum
of $\phi_A$ is not conserved owing to the interaction of $\phi_A$ with $A$. It is
well known that (5.3) can be transformed into the expression
$\partial_\nu \{ S_\mu{}^\nu + t_\mu{}^\nu \} = 0$ (5.6)
with the aid of the field equation of $A$, where $t_\mu{}^\nu$ represents a pseudo energy-momentum
tensor density of the field $A$.
This result, that the energy-momentum of $\phi_A$ can no longer be conserved
when the interaction of $\phi_A$ with $A$ takes place, is a consequence of the general
theory of the non-Abelian gauge fields, on which a brief explanation will be given
in the next section.
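As a hedged aside, the step from a covariant law of the type (5.3) to a statement about ordinary (non-)conservation can be sketched as follows. The sketch assumes, purely for illustration, that $A_{\mu\nu}$ is used like a metric with Christoffel-type coefficients $\Gamma^{\mu}_{\nu\lambda}$ and that $S^{\mu\nu}$ is a tensor; these symbols and index placements are assumptions and not necessarily the paper's own notation.

```latex
% Assumed notation (not from the paper):
%   A = |det(A_{\mu\nu})|,  \Gamma^{\mu}_{\nu\lambda} built from A_{\mu\nu},
%   S^{\mu\nu} a tensor (not a density).
% Standard identity for the covariant divergence, using
% \Gamma^{\nu}_{\nu\lambda} = \partial_\lambda \ln\sqrt{A}:
\nabla_\nu S^{\mu\nu}
  = \frac{1}{\sqrt{A}}\,\partial_\nu\!\left(\sqrt{A}\,S^{\mu\nu}\right)
  + \Gamma^{\mu}_{\nu\lambda}\,S^{\nu\lambda}.
% Hence the covariant law \nabla_\nu S^{\mu\nu} = 0 is equivalent to
\partial_\nu\!\left(\sqrt{A}\,S^{\mu\nu}\right)
  = -\sqrt{A}\;\Gamma^{\mu}_{\nu\lambda}\,S^{\nu\lambda},
% whose right-hand side does not vanish in general. An ordinary
% conservation law is recovered only after a term built from the
% field A itself (a pseudo-tensor t) is added to the matter part.
```

Read this way, the source term on the right is what the text describes as the exchange of energy-momentum between $\phi_A$ and the field $A$.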
I= JLW ($A7 .
ht,.>d4~
R c (x)+ Ra, p .
6Aa, = Ab, .MbaC. (6.2)
This postulate of invariance gives rise to the following identities, if one follows
a similar line of argument to that given in Appendix A:
*) The first half of the content of this section is a review of the paper I, but our definition
of $j_{(a)}$ is different from that given by (1.27) on page 1601 of I.
**) For brevity, coordinate transformations are not considered in this section.
The identity (6.3) is transformed into the following expression with the aid of
(6.6) and (6.7):
(6.3)
It is easily seen from the definition (6.8) that $j_{(a)}$ has a transformation property
8j{i;A=j::;-fCaa. L (x)
which leads to the definition of the invariant derivative of $j_{(a)}$:
V Lj:;; = 9 , j{:) - A,..-f~.j::;. (6.9)
Making use of (6.9) and assuming the field equation of $\phi_A$
$[L]^A = 0$,
we can derive the equations of continuity for $j_{(a)}$ from the identity (6.3):
$\nabla_\mu\, j^{\mu}_{(a)} = 0$. (6.10)
(6.10) shows that the $(a)$-charge of the $\phi_A$-field defined by
(6.11)
where
(6-12)
(6.14)
Appendix
A. Derivation of identities ( 2 . 4 ) - ( 2 - 7 )
Consider a variation of
11 = J L( d A , dA,x, A(,...)>A(a1...),x> d 4 z
9
-t 8Ll 8A(+),
8A(U,...L
x
+ A1. Cyp} d *x
,
In terms of this
SI, =
Substituting the definitions (A.1) for $\delta\phi_A$ and $\delta A_{(a_1 \cdots)}$ in the above expression
and putting $\delta I_1 = 0$, we are led to the identities. Especially when $\xi^\mu(x) = a^\mu + \Delta^\mu{}_\nu\, x^\nu$
and $A_{(a_1 \cdots)}(x)$ is replaced with a constant affine tensor $C_{\mu\nu}$ (and consequently
$A_{(a_1 \cdots),\kappa} = 0$), the coefficient of each parameter $a^\mu$ or $\Delta^\mu{}_\nu$ in $\delta I_1$ should identically
vanish because the domain of integration can be arbitrarily chosen. The relations
thus obtained are the identities (4.3) and (4.4), where $L$ takes the place of the
present $L_1$.
On the contrary, if $\xi^\mu(x)$ is an arbitrary function of $x$, the first and the
second terms of the right-hand side in (A.3) can be transformed by a partial
integration to the following form:
[ * . * I p [a,{
= [L,lA.C2;.6B+ [L,]l.D(~::::)~
*A(b,...))
+ [ L J Ay 5 , , + [LJ-)
* * yo
A(a,*..),p] (A.5)
which is nothing but the identity (2.4). Inserting (A.5) into (A.4), we have
u,=l 3 , [ - . . 1 * . d 4 ~ = 0
for arbitrary t P s . By putting equal to zero the coefficients of cf, and its deriva-
tives i n the above identity, ( 2 . 5 ) - ( 2 . 7 ) can be derived.
etc.
T h e double contraction of (B.1) by putting p = l and Y=P leads t o
A: (5ra + 2 (27-+ 3 ) . r . b + 10. r (r+ 3 ) - c+ 2r(r- 1)(7- + 3 ) .d )==ZOb:,
which allows to put
= 6;
A aP-= A ( ~ % - . ~ r r lA(pa,...al.)
.
and c on s e q u e n t ly
A(%--%)
. A(a,..-ar) = 4 -
Inserting (B.3) into (B.2), we have
$5ra + 2(2r+3)\,rb + 10r(r+3)\,c + 2r(r-1)(r+3)\,d = 20$.
In a similar way, the contraction of (B.1) by putting $\alpha = \rho$ and $\mu = \lambda$ leads to
$6ra + 2r(2r+3)\,b + 10r(r+3)\,c + 2r(r-1)(r+3)\,d = 20$, (B.4)
where the normalization (B.3) and (B.3) have been employed. The third type
of contraction, $\alpha = \lambda$ and $\beta = \rho$, gives a relation
$15ra + 10r(2r+3)\,b + 2r(7r+3)\,c + 2r(2r+3)(r-1)\,d = 10$. (B.5)
n=O. (B.6)
If (B.1) is contracted by putting $\mu = \lambda$, we obtain a relation
{Z (T- +
1)* r - b G ( r - 1) -T. C S2r(r- 1) (rt l)d}A:,
= (5 - rb - r ( r + a ) c - Y(T- 1) .d}8s
;; - 2r(7-+ 2) b .s;aY,. (3.7)
Since the left-hand side of (B.7) is symmetric with respect to the pairs of suffixes
$(\alpha, \nu)$ and $(\beta, \rho)$, the same should be true for the right-hand side. Thus
we have
~ - T ~ - T ( T + ~ ) c - T ( T - ~ ) -d=-2r(r+2) * b , (B.8)
and consequently (B .7) becomes
+
{2r(r--I)b 6 ( r - 1) . T . c + 2 ~ ( 7 - 1 ) ( ~ + -ld)} A G = - 4 r ( r f 2) .b-8{%)).
(B .7)
Similarly, contractions fi = p and a =X lead to
{CT+ T(T- 1)d)A:; = (1- (2r+3) rb - T ( T + 1)C- T(T - 1)d}8:;;; (B -9)
and
{ ~ T ( Y - l)b + r ( r - 1)
C+ ~ ( r l)d}A$
- z= (1- 5rb - ~ ( r 1)
+C- T(T- 1) d}@$]
(B - 10)
respectively.
I n (B.7), (B.9) and @.lo), if the coefficients of -4:: do not vanish, these
relations show that &4::: should be proportional t o @:I.
Recalling the normaliza-
tion (B.3), we have to put
respectively.
A subtraction of (Beg) from (B.10) gives
( 2 ~ - 4 )- r * c = O . (B * 12)
If we choose r = 2, then (B.11) becomes unsolvable relations
(x)= $ 8 g ) .
A(x)*A,,
C S (r-l)d=O. (3 * 17)
I = r ( r + 1)c-t-r ( r - . l ) c l . (B .18)
A comparison of (B.15) with (B.17) gives
$(r-2)\,d = 0$.
Since the case $d = 0$ leads to a contradiction, as is easily seen, we obtain
$r = 2$.
It is easily verified that the solution of our problem is
fi=b=O, c= - d = J4
References
VOLUME 33, NUMBER 7    PHYSICAL REVIEW LETTERS    12 AUGUST 1974
A new integral formalism for gauge fields is described. Further developments are
presented, including gravitation equations related to, but not identical with, Einstein's
equations.
It was pointed out by Weyl many years ago that the electromagnetic field can be formulated
in terms of an Abelian gauge transformation. This idea was extended¹ in 1954 to the concept
of gauge fields for non-Abelian groups. That formulation, like the Weyl formulation for
electromagnetism, was based on the replacement of $\partial_\mu$ by $\partial_\mu - ieB_\mu$. One might call such
formulations differential formulations. It is the purpose of the present paper to reformulate
the concept of gauge fields in an integral formalism. The new formalism is conceptually
superior to the differential formalism and allows for natural developments of additional
concepts. It further allows a mathematical and physical discussion of the gravitational field
as a gauge field, resulting in equations related, but not identical, to Einstein's.
The basic point is the fact that electromagnetism is a nonintegrable phase factor, a fact
discussed many years ago by Dirac, Peierls, and others, and more recently by many authors.²
This fact is now generalized as follows:
Definition of a gauge field.-Consider a manifold with points on it labeled by $x^\mu$
($\mu = 1, 2, \ldots, n$) and consider a gauge G which is a Lie group with generators $X_k$
($k = 1, 2, \ldots, m$). [For G = U(1) we have electromagnetism; for G non-Abelian we have
non-Abelian gauge fields.] Define a path-dependent (i.e., nonintegrable) phase factor
$\varphi_{AB}$ as an element of the group G associated with path AB between two points A and B on
the manifold. The association is to have the group property: $\varphi_{ABC} = \varphi_{AB}\,\varphi_{BC}$, where the
paths AB and BC are segments of ABC. Furthermore, for an infinitesimal path A to $A + dx^\mu$ the
phase factor is close to the identity I of G, so that
$\varphi_{A(A+dx)} = I + b_\mu^k(x)\, X_k\, dx^\mu$. (1)
The function $b_\mu^k(x)$ defined on the manifold will be called a gauge potential; $\varphi_{AB}$ will be
called a gauge phase factor.
With this definition additional concepts and theorems are naturally developed. We summarize
some of these below. Details will be published elsewhere.
Gauge field strength.-Consider a path ABCDA forming the border of an infinitesimal
parallelogram with sides $dx$ and $dx'$. $\varphi_{ABCDA}$ can be computed by multiplying four phase
factors like (1) together, resulting in

where

$f_{\mu\nu}^k$ will be called a gauge field, or gauge field strength. They are the Faraday-Maxwell
fields when G = U(1).
Gauge transformation.-A gauge transformation in the integral formalism is defined by a
transformation
$\varphi_{AB} \to \varphi'_{AB} = \xi_A\, \varphi_{AB}\, \xi_B^{-1}$, (5)
where $\xi_A$ is an element of G which depends on the point A. It is clear that under (5)
$\varphi_{ABCDA} \to \varphi'_{ABCDA} = \xi_A\, \varphi_{ABCDA}\, \xi_A^{-1}$. (6)
Thus

where $R_\xi$ is the adjoint representation for the element $\xi$. The simple transformation
property (7) is the definition for the concept that $f_{\mu\nu}^k$ is gauge covariant. Generalization
to other representations R of G for a gauge-covariant quantity $\psi_K$ is immediate³:

$b_\mu^k$ is not gauge covariant; $f_{\mu\nu}^k$ is.
Gauge-covariant differentiation.-To retain
gauge covariance in differentiation we define
$\psi_{K|\mu} = \partial_\mu \psi_K + b_\mu^k\,(K|Z_k|J)\,\psi_J$, (9)
where $Z_k$ is the matrix representation of $X_k$. Generalization to other cases is obvious.
An interesting theorem is that
$f_{\mu\nu|\lambda}^k + f_{\nu\lambda|\mu}^k + f_{\lambda\mu|\nu}^k = 0$, (10)
which is the gauge-Bianchi identity.
Introduction of a Riemannian metric.-So far we need no metric for the manifold. Now we
introduce a metric for it and discuss arbitrary coordinate transformations. We come then
naturally to Riemannian covariant quantities and doubly covariant derivatives. $b_\mu^k$ is
Riemannian covariant, since $\varphi_{AB}$ is coordinate-system independent. $f_{\mu\nu}^k$ is doubly
covariant. We have
$\psi_{K\|\mu} = \psi_{K|\mu}$,
etc. It is easily shown that
$f_{\mu\nu\|\lambda}^k + f_{\nu\lambda\|\mu}^k + f_{\lambda\mu\|\nu}^k = 0$, (12)
which is satisfied by all gauge fields on all Riemannian manifolds.
Source of gauge fields.-We define, in analogy with electromagnetism, a source four-vector
$J_\mu^k$ for a gauge field:
$J_\mu^k = g^{\nu\lambda} f_{\mu\nu\|\lambda}^k = f_{\mu\nu}^{k\;\|\nu}$. (13)
After some computation one derives a theorem:
$g^{\mu\nu} J_{\mu\|\nu}^k = 0$ (conserved current), (14)
which in electromagnetism states charge conservation. In Ref. 1 this was Eq. (14). One can
also generalize Eqs. (15) and (16) of Ref. 1, leading to the concept of total charge.
Parallel-displacement gauge field.-For any Riemannian manifold, the important concept of
parallel displacement defines, along any path AB, a linear relationship between any vector
$V_A$ at A and its parallel vector $V_B$ at B. Thus parallel displacement is defined by an
$n \times n$ matrix $M_{AB}$ which gives this linear relationship. $M_{AB}$ is a representation of an
element of GL(n). Thus we have the following:
Theorem.-Parallel displacement defines a gauge field with G being GL(n). The index k has
$n^2$ values and we write $k = (\alpha\beta)$. The gauge potential and gauge fields are respectively

It is important to recognize that in this definition we have chosen a fixed coordinate
system. A coordinate transformation would generate a linear transformation in the vector
spaces $V_A$ and $V_B$. In other words $M_{AB} \to N_A\, M_{AB}\, N_B^{-1}$. Comparison with (5) shows thus
that a coordinate transformation generates a simultaneous gauge transformation of the
parallel-displacement gauge potential. In fact, the usual nonlinear term in the
transformation of the Christoffel symbol $\{{}^{\,\alpha}_{\beta\gamma}\}$ is precisely the nonlinear term needed
in the gauge transformation of the gauge noncovariant quantity $b_\mu^k$. In this connection we
observe that for GL(n),

where the semicolon represents the usual Riemannian covariant differentiation with $\alpha$ and
$\beta$ treated as usual contravariant and covariant indices. The rule works also in general.
E.g.,

Nontrivial sourceless gauge fields.-Gauge fields for which $f_{\mu\nu}^k \neq 0$ and $J_\mu^k = 0$ are of
physical interest. So far only nonanalytic examples are known.
We now can construct two general types of examples,
(a) Consider the natural Riemannian geometry of a semisimple Lie group. Its
parallel-displacement gauge field is sourceless and analytic.
(b) Consider the same Riemannian manifold of a group G as above in (a). Define $\varphi_{AB}$ as
that for an infinitesimal path AB, pAa=@-B). This gauge phase factor which is itself an
element of G gives a gauge field which is analytic and sourceless.
Pure spaces.-A Riemannian manifold for which the parallel-displacement gauge field is
sourceless will be called a pure space. A necessary and sufficient condition for a pure
space is

A four-dimensional Einstein space, i.e., one for
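The following is a hedged sketch, not the paper's own derivation, of how multiplying four phase factors of the form (1) around the parallelogram ABCDA yields a quantity bilinear in $dx$ and $dx'$. It assumes matrix-valued potentials $B_\mu \equiv b_\mu^k X_k$, structure constants defined by $[X_i, X_j] = C_{ij}{}^{k} X_k$, and each factor taken as the exponential of the potential at the midpoint of its side, which is accurate to second order. $F_{\mu\nu}$ is our label for the resulting coefficient. The overall sign and index order depend on orientation and ordering conventions, so the paper's own definition of $f_{\mu\nu}^k$ remains authoritative.

```latex
% Path: A -> B = A + dx -> C = A + dx + dx' -> D = A + dx' -> A.
% Midpoint exponents (assumed second-order-accurate form of (1)):
a_1 = B_\mu(A + \tfrac12 dx)\,dx^\mu, \qquad
b_1 = B_\nu(A + dx + \tfrac12 dx')\,dx'^\nu,
a_2 = B_\mu(A + dx' + \tfrac12 dx)\,dx^\mu, \qquad
b_2 = B_\nu(A + \tfrac12 dx')\,dx'^\nu .
% Group property \varphi_{ABC} = \varphi_{AB}\varphi_{BC}:
\varphi_{ABCDA} \simeq e^{a_1}\, e^{b_1}\, e^{-a_2}\, e^{-b_2}
  = I + (a_1 - a_2) + (b_1 - b_2) + [a, b] + O(3),
% with a = B_\mu(A) dx^\mu, b = B_\nu(A) dx'^\nu in the commutator
% (Baker-Campbell-Hausdorff to second order).
a_1 - a_2 = -\,\partial_\nu B_\mu \, dx^\mu dx'^\nu + O(3), \qquad
b_1 - b_2 = +\,\partial_\mu B_\nu \, dx^\mu dx'^\nu + O(3).
% Result under these conventions:
\varphi_{ABCDA} = I + F_{\mu\nu}\, dx^\mu dx'^\nu, \qquad
F_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu + [B_\mu, B_\nu],
% or, in components along the generators X_k,
F_{\mu\nu}^{k} = \partial_\mu b_\nu^{k} - \partial_\nu b_\mu^{k}
               + C_{ij}{}^{k}\, b_\mu^{i}\, b_\nu^{j}.
```

Two consistency checks follow directly. For G = U(1) the commutator drops out and only the curl of the potential remains, in line with the statement that the field strengths are then the Faraday-Maxwell fields. Under the gauge transformation (5), the closed-loop factor is conjugated as in (6), so the coefficient of $dx^\mu dx'^\nu$ transforms by conjugation with $\xi_A$, which is the covariance property the text attributes to $f_{\mu\nu}^k$.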
XX: What is your view concerning the unification of gravitational field and gauge field?
Yang: Comparing their formulas, one finds that they are very similar. There is no
question that they have an intimate relation. But as to what kind of relation after all, it is
still a controversial problem.
Both $F_{\mu\nu}$ in gauge field and $R_{\mu\nu}$ in gravitational field are curvatures in geometry. $R_{\mu\nu}$
is the second order derivative of $g_{\mu\nu}$. Therefore, Einstein's gravitational field equation
is a second order differential equation of $g_{\mu\nu}$, while the equation of motion $\partial^\mu F_{\mu\nu} = \ldots$
for gauge field is a first order differential equation of curvature. This is what happens in
electromagnetism. The concept that both $F_{\mu\nu}$ and $R_{\mu\nu}$ are curvatures is absolutely primary.
I think that this is absolutely unchangeable. Since $\partial^\mu F_{\mu\nu} = \ldots$ is the first order differential
equation of curvature, the gravitational field should be the third order differential equation
of $g_{\mu\nu}$. This is a clue that Einstein's theory of gravity needs to be modified. In 1974, when
the geometric structure was clarified, I proposed a new gravitational equation. However, I
knew how to write down the left-hand side of the equation and could not write down the
right-hand side. At present, this problem has not been solved.
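The derivative counting in this answer can be made explicit. The sketch below uses the standard textbook definitions of the field strength, the Christoffel symbols and the Riemann tensor; the symbols $A_\mu$, $J_\nu$, $T_{\mu\nu}$ and the constant $8\pi G$ are conventional choices assumed here, not notation taken from the interview.

```latex
% Gauge field: the curvature contains first derivatives of the potential,
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu],
% so the field equation is first order in F, second order in A:
\partial^\mu F_{\mu\nu} + \dots = J_\nu .
% Gravity: the connection contains first derivatives of the metric,
\Gamma^{\lambda}_{\mu\nu} = \tfrac12\, g^{\lambda\rho}
  \left(\partial_\mu g_{\rho\nu} + \partial_\nu g_{\rho\mu}
        - \partial_\rho g_{\mu\nu}\right),
% and the curvature contains second derivatives of the metric,
R^{\rho}{}_{\sigma\mu\nu} = \partial_\mu \Gamma^{\rho}_{\nu\sigma}
  - \partial_\nu \Gamma^{\rho}_{\mu\sigma}
  + \Gamma^{\rho}_{\mu\lambda}\Gamma^{\lambda}_{\nu\sigma}
  - \Gamma^{\rho}_{\nu\lambda}\Gamma^{\lambda}_{\mu\sigma}.
% Einstein's equation is algebraic in the curvature (second order in g):
R_{\mu\nu} - \tfrac12\, g_{\mu\nu} R = 8\pi G\, T_{\mu\nu}.
% A gravitational analogue of \partial^\mu F_{\mu\nu} = \dots must instead
% differentiate the curvature once, for example
R_{\mu\nu;\lambda} - R_{\mu\lambda;\nu} = \dots ,
% which is third order in g_{\mu\nu}.
```

The last line has the form of the 1974 pure-space equation quoted as Eq. (1) in the comment by Ni reprinted below; what should stand on its right-hand side is the problem the answer describes as unsolved.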
VOLUME 35, NUMBER 5    PHYSICAL REVIEW LETTERS    4 AUGUST 1975
Wei-Tou Ni
Institute of Physics, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
(Received 27 March 1975)
Recently by using an integral formalism for gauge fields, Yang has proposed the following
gravitational field equations for pure spaces:
$R_{ij;k} - R_{ik;j} = 0$. (1)
It is interesting to notice that (1) are satisfied by all vacuum solutions of Einstein's general relativity
$G_{ij} + \Lambda g_{ij} = 0$ (2)
and Nordström's second theory² (in Einstein-Fokker form)
$R = 0$, $C_{ijkl} = 0$, (3)
where $C_{ijkl}$ is the Weyl tensor. In fact, from Bianchi identities and the definition of the Weyl tensor,
a simple calculation shows that (1) are equivalent to
$R_{;k} = 0$, $C_{ijkl}{}^{;l} = 0$. (5)
Equations (4) are differentiated equations (antisymmetrized in j and k) of (2); while (5) are
differentiated equations (summed over l) of (3). Hence Yang's theory is a derivative theory of both
Einstein's theory and Nordström's theory. It is amusing and intriguing that the two eminent and
structurally different theories of gravity could be embraced in a single set of equations.
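The two inclusions claimed above can be checked in a few lines. The sketch below is hedged: it assumes four dimensions, a metric-compatible connection, $G_{ij} = R_{ij} - \tfrac12 g_{ij} R$, and the convention $R_{bd} = R^{a}{}_{bad}$; the index placement in the Weyl identity follows that convention and may differ from the original paper by an overall sign or index order.

```latex
% (a) Einstein vacuum solutions with cosmological constant satisfy (1).
% Trace of (2) in four dimensions:  -R + 4\Lambda = 0, so R = 4\Lambda and
R_{ij} = \tfrac12 g_{ij} R - \Lambda g_{ij} = \Lambda\, g_{ij}
\;\Longrightarrow\;
R_{ij;k} = \Lambda\, g_{ij;k} = 0
\;\Longrightarrow\;
R_{ij;k} - R_{ik;j} = 0 .
% (b) Contracting (1) with g^{ij} and using the contracted Bianchi
% identity R^{j}{}_{k;j} = \tfrac12 R_{;k}:
g^{ij}\left(R_{ij;k} - R_{ik;j}\right)
  = R_{;k} - \tfrac12 R_{;k} = \tfrac12 R_{;k} = 0 .
% (c) Divergence of the Weyl tensor in n dimensions:
\nabla^{a} C_{abcd} = \frac{n-3}{n-2}\left[
   R_{bd;c} - R_{bc;d}
   - \frac{1}{2(n-1)}\left(g_{bd} R_{;c} - g_{bc} R_{;d}\right)\right].
% For n = 4 and R_{;k} = 0 the bracket reduces to the left-hand side
% of (1), so (1) holds exactly when R_{;k} = 0 and the Weyl tensor is
% divergence-free. The Nordstrom conditions R = 0, C_{ijkl} = 0 satisfy
% both requirements trivially.
```

This makes the "derivative theory" remark concrete: (1) only constrains derivatives of the Ricci tensor, so any geometry obeying either set of undifferentiated equations is automatically a solution.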
Since Nordström's theory admits monopole radiations, Yang's theory admits them too. Therefore
no analog of the Birkhoff theorem of general relativity could be valid for Yang's theory. More
specifically, (1) have the following time-dependent spherical-symmetric solution (the time
dependence cannot be transformed away by coordinate changes):
where $c_0$ is an arbitrary constant, and $f$ and $g$ are two arbitrary functions. Hence spherical symmetry
does not imply time independence.
In the static spherical-symmetric case, the metric can be put in the form
$ds^2 = -e^{2\Phi(r)}\,dt^2 + e^{2\Lambda(r)}\,dr^2 + r^2\,(d\theta^2 + \sin^2\theta\, d\phi^2)$. (7)
For this metric (1) reduce to
A - 2 A 2 + A + + + f 2 + r - 2 ( 1 -e2*) = O
and
14    2521
Gauge theories of the Yang-Mills type and Einstein's theory of gravitation have a common
feature: the self-interaction of the fields. Then one is led to ask whether Einstein's theory
itself is a gauge theory. Of course this is an old question and many people¹⁻³ have suggested
that Einstein's theory can be viewed as the gauge theory of the four-dimensional translation
group $T_4$. Unfortunately certain features seem not to have been fully clarified so far, and it
is precisely these features that bear out the complete relationship between the Yang-Mills
and Einstein theories.
In this paper we show that if one only applies the gauge principle (this includes a
Yang-Mills-type Lagrangian quadratic in the field strengths) for the group of translation $T_4$
of space-time, the gauge theory that one obtains is unique and becomes precisely Einstein's
theory of gravitation. In this $T_4$ gauge formalism of Einstein's theory the translational
gauge potentials are identified as the nontrivial part of the vierbein fields and the gauge
field strengths are given in terms of the commutator coefficients (i.e., the anholonomity) of
the local orthonormal basis one starts with.
To prove that the unique gauge theory of the translation is Einstein's theory, it is
important to observe that although the gauge group $T_4$ is Abelian, it is not an
internal-symmetry group and acts on space-time itself. Fortunately the geometric meaning of
gauge theories has been well understood by now in terms of the bundle picture. The power of
this bundle picture has been appreciated by Cho and Freund in unifying gauge theories with
gravitation and also recently by Wu and Yang. In the following we will first prove our claim
in a formal way constructing the eight-dimensional bundle of the translation group $T_4$ over
space-time and then will give a precise physical meaning to this translational bundle. For
the details about the bundle formalism of gauge theories we refer the reader to Ref. 5.
Let us assume that the structural group G of our bundle P is $T_4$ with four commuting
generators $\xi_a$ ($a = 1, 2, 3, 4$),
$[\xi_a, \xi_b] = 0$, (1)
and that the base manifold M is the four-dimensional space-time with an orthonormal basis at
each point, i.e., four orthonormal vector fields $e_i$ ($i = 1, 2, 3, 4$) with the commutation
relations
$[e_i, e_j] = T_{ij}{}^{k}\, e_k$. (2)
Of course the basis independence of a theory is one of the basic principles in physics, and
one can choose any other basis if one wants to, but for obvious reasons the local orthonormal
basis (2) is the natural one to start with in our problem. Notice that this orthonormal basis
is not in general a coordinate basis since the basis vectors do not commute. If we introduce a
coordinate basis $\partial_\mu$ ($\mu = 1, 2, 3, 4$) with
$[\partial_\mu, \partial_\nu] = 0$,
then $e_i$ can be written in terms of the vierbein fields $h_i^\mu$,
$e_i = h_i^\mu\, \partial_\mu$,
and correspondingly we have
$T_{ij}{}^{k} = (\partial_i h_j^\mu - \partial_j h_i^\mu)\, h_\mu^k$. (3)
Here $\partial_i = h_i^\mu \partial_\mu$ is the directional derivative in the direction of $e_i$ and $h_\mu^i$ are the
inverse vierbein fields
$h_i^\mu h_\mu^j = \delta_i^j$, $h_i^\mu h_\nu^i = \delta_\nu^\mu$.
Observe that due to the commutation relation (2) of the basis $e_i$, the directional
derivatives $\partial_i$ do not commute either.
At this point we would like to emphasize that all the above expressions are just a matter of
a formalism and we have not assumed that our space-time is curved. Eventually we will create a
curvature by introducing gauge fields associated with $T_4$ symmetry, but we start with a flat
space-time and so far our $h_i^\mu$ remain trivial.
Now, given a connection form⁸ $\omega = \omega^a \xi_a$ in the bundle P, the gauge potentials $B_i^a$ are
as usual given by the connection coefficients of $\bar e_i$, the lift of $e_i$ into a four-dimensional
gauge-defining submanifold $\sigma$, i.e., a cross section of P:
$\omega^a(\bar e_i) = \kappa B_i^a$, (4)
where we have introduced a dimensional constant $\kappa$ (of dimension of a length) to give the
canonical dimension to the gauge potentials $B_i^a$. This $\kappa$ will serve as the coupling constant
for the gauge group $T_4$ and will be related to the gravitational constant later on.
With $\hat e_i$ ($i = 1, 2, 3, 4$) as the horizontal lift of $e_i$ and $\xi_a^*$ ($a = 1, 2, 3, 4$) as the
fundamental vector fields which are vertical, we clearly have
$[\xi_a^*, \xi_b^*] = 0$,
$[\xi_a^*, \hat e_k] = 0$, (5)
$[\hat e_i, \hat e_j] = T_{ij}{}^{k}\, \hat e_k - \kappa\, G_{ij}{}^{a}\, \xi_a^*$,
where $G_{ij}{}^{a}$ are the vertical components of the commutator coefficients of the horizontal
lift vector fields $\hat e_i$. The first two equations come from the definition and the third is due
to the fact that the projection of $[\hat e_i, \hat e_j]$ down to the base manifold is the same as
$[e_i, e_j]$. Notice that because of the Abelian character of the $\xi_a^*$'s the group action on the
bundle space is really a translation and there is no "rotation" whatsoever. Mathematically
this means that the holonomy group of P is $T_4$ and not the Lorentz group.
Let us recall that in the bundle picture the gauge field strengths are given by the vertical
commutator coefficients of two horizontal vector fields, i.e., by $G_{ij}{}^{a}$. To find the gauge
field strengths in terms of potentials $B_i^a$, notice that from Eq. (4) and from the definition
of $\omega^a$,
$\omega^a(\hat e_i) = 0$,
$\omega^a(\xi_b^*) = \delta_b^a$,
it follows that
$\hat e_i = \bar e_i - \kappa B_i^a\, \xi_a^*$,
and
$[\hat e_i, \hat e_j] = [\bar e_i - \kappa B_i^a \xi_a^*,\; \bar e_j - \kappa B_j^b \xi_b^*]$
$= T_{ij}{}^{k}\, \bar e_k - \kappa(\partial_i B_j^a - \partial_j B_i^a)\, \xi_a^*$
$= T_{ij}{}^{k}\, \hat e_k - \kappa[(\partial_i B_j^a - \partial_j B_i^a) - T_{ij}{}^{k} B_k^a]\, \xi_a^*$. (5')
Thus one has
$G_{ij}{}^{a} = (\partial_i B_j^a - \partial_j B_i^a) - T_{ij}{}^{k} B_k^a$. (6)
Now following Kibble's suggestion² we interpret the gauge potentials $B_i^a$ as the nontrivial
part of the vierbein fields
$h_i^\mu = \delta_i^\mu + \kappa B_i^\mu$. (7)
This means that we now have created the curvature of space-time by introducing the gauge
potentials $B_i^a$ for $T_4$ and making the vierbein fields $h_i^\mu$ nontrivial. Notice that the
decomposition of $h_i^\mu$ into $\delta_i^\mu$ and $B_i^\mu$ is basis-dependent since $\delta_i^\mu$ is not invariant
under a rotation of the local orthonormal basis $e_i$. From Eqs. (3), (6), and (7) one finds that

The gauge field strengths $G_{ij}{}^{a}$ are thus determined by the commutation coefficients
$T_{ij}{}^{k}$ of the orthonormal basis vectors that one starts with. We would like to emphasize
here that once the connection $\omega$ (i.e., the gauge potentials in physical terms) is given, the
gauge field strengths $G_{ij}{}^{a}$ are uniquely determined from the geometrical structure of the
bundle and are not something that one can define otherwise as sometimes suggested.
Gauge transformations in this picture are changes of bundle cross sections. If we change the
cross section $\sigma$ to $\sigma'$ by a four-translation $\theta^a(x)$ ($a = 1, 2, 3, 4$) in the
four-dimensional fiber space [geometrically $\theta^a(x)$ are simply a set of transition functions
that relate $\sigma'$ to $\sigma$], we clearly have
$\bar e'_i = \bar e_i + (\partial_i \theta^a)\, \xi_a^*$
and (8)
$B'^{a}_i = \frac{1}{\kappa}\, \omega^a(\bar e'_i) = B_i^a + \frac{1}{\kappa}\, \partial_i \theta^a$.
From Eq. (7) this means that under the gauge transformation we have
$h_i^\mu \to h'^{\mu}_i = h_i^\mu + \partial_i \theta^\mu$
$= h_i^\nu (\delta_\nu^\mu + \partial_\nu \theta^\mu)$
$= h_i^\nu X_\nu^\mu$,
where
$X_\nu^\mu = \delta_\nu^\mu + \partial_\nu \theta^\mu$,
and now the gauge transformation is unambiguously identified as a general coordinate
transformation in the coordinate basis $\partial_\mu$:
$\partial_\mu \to \partial'_\mu = (\delta_\mu^\nu + \partial_\mu \theta^\nu)\, \partial_\nu$
$= X_\mu^\nu\, \partial_\nu$. (9)
Notice that gauge transformations (or equivalently general coordinate transformations) do
not change
the orthonormal basis $e_i$: They change the coordinate basis $\partial_\mu$. Also notice that under a
local gauge transformation one has
$G'_{ij}{}^{a} = (\partial_i B'^{a}_j - \partial_j B'^{a}_i) - T_{ij}{}^{k} B'^{a}_k$
$= G_{ij}{}^{a} + \frac{1}{\kappa}(\partial_i \partial_j - \partial_j \partial_i)\theta^a - \frac{1}{\kappa} T_{ij}{}^{k}\, \partial_k \theta^a$
$= G_{ij}{}^{a}$, (10)
i.e., $G_{ij}{}^{a}$ are invariant under gauge transformations as expected for an Abelian gauge
group.
Once we have the gauge field strengths $G_{ij}{}^{a}$, we can write in the manner of Yang and Mills
the most general Lagrangian quadratic in these $G_{ij}{}^{a}$. Using Eq. (6') we have
$\mathcal{L} = \sqrt{-g}\; G_{ij}{}^{a} G_{kl}{}^{b}\, (a\, \eta^{ik}\eta^{jl}\eta_{ab} + b\, \eta^{ik}\delta^{l}_{a}\delta^{j}_{b} + c\, \eta^{ik}\delta^{j}_{a}\delta^{l}_{b})$
$= \frac{1}{\kappa^2} \sqrt{-g}\, (a\, T_{ijk}T_{ijk} + b\, T_{ijk}T_{ikj} + c\, T_{ijj}T_{ikk})$,
where
$\sqrt{-g} = \det(h^i_\mu)$,
$a$, $b$, and $c$ are for the time being arbitrary constants, and we have used the flat metric
$\eta_{ab}$ for the fiber space. Notice that in our formalism we do not need a Riemannian metric
a priori.
The crucial question now is whether the constants $a$, $b$, and $c$ are really arbitrary. To
answer this question let us point out that the above Lagrangian is basis dependent as it is
obtained using Eq. (6'). Now, if the theory is going to have any meaning at all, it should not
depend upon which orthonormal frame one starts with. This means that if one chooses a
different set of $e_i$'s, the Lagrangian should differ only by a total divergence. We now show
that this consistency requirement removes all the arbitrariness in $a$, $b$, and $c$.
Notice that under an infinitesimal change of orthonormal frame, one has
$h_i^\mu(x) \to h'^{\mu}_i(x) = h_i^\mu(x) + \omega_{ik}\, h_k^\mu(x)$, (11)
where $\omega_{ik}(x) = -\omega_{ki}(x)$ are six infinitesimal functions so that
$\delta\mathcal{L} = \sqrt{-g}\,(2a\, T_{ijk}\, \delta T_{ijk} + 2b\, T_{ijk}\, \delta T_{ikj} + 2c\, T_{ijj}\, \delta T_{ikk})$
$= \sqrt{-g}\,[2a\, T_{ijk}(\partial_i \omega_{jk} - \partial_j \omega_{ik}) + 2b\, T_{ijk}(\partial_i \omega_{kj} - \partial_k \omega_{ij}) + 2c\, T_{ijj}\, \partial_k \omega_{ki}]$
$= \sqrt{-g}\,[(4a - 2b)\, T_{ijk}\, \partial_i \omega_{jk} - 2b\, T_{ijk}\, \partial_k \omega_{ij} + 2c\, T_{ijj}\, \partial_k \omega_{ki}]$
$= \sqrt{-g}\,[(4a - 2b)\, T_{ijk}\, \partial_i \omega_{jk} - (2b + c)\, T_{ijk}\, \partial_k \omega_{ij}] - 2c\, \partial_\mu(h_i^\mu \sqrt{-g}\, \partial_k \omega_{ki})$. (12)
Here the last equality comes from the following identity:
$\partial_\mu(h_i^\mu \sqrt{-g}\, \partial_k \omega_{ki}) = -\sqrt{-g}\,(\tfrac12 T_{ijk}\, \partial_k \omega_{ij} + T_{ijj}\, \partial_k \omega_{ki})$.
Now in the last line of Eq. (12) the third term is explicitly a total divergence. But each of
the first two terms cannot be made into a total divergence and one is forced to choose
$a : b : c = 1 : 2 : -4$ to satisfy the consistency requirement. So the Lagrangian should have
the form

We would like to emphasize that any other linear combination in $\mathcal{L}$ does not yield a
meaningful theory.
Now it is readily seen that $\mathcal{L}$ is (again up to a divergence) precisely Einstein's
Lagrangian.¹⁴ This completes our argument that the four-dimensional translational gauge theory
with Lagrangian quadratic in the field strengths is precisely Einstein's theory of
gravitation. The fact that the Lagrangian (13) is equivalent to Einstein's one is of course
well known. But the geometrical meaning of this Lagrangian does not seem to have been fully
understood. We now understand it as the translational gauge formalism of Einstein's theory of
gravitation.
At this point one may wonder whether we have required a Lorentz gauge invariance by imposing
the independence of the theory under the local Lorentz transformation (11) of the orthonormal
basis $e_i$. Even so, however, one does not need to introduce Lorentz gauge fields in one's
theory if one has only scalars and internal gauge fields as one's source fields.⁹ This is so
because scalars are singlets under the Lorentz transformation and also the internal gauge
fields do not couple directly to the gauge fields of the Lorentz group owing to the gauge
invariance of the internal symmetry. In any event it should be made clear that the
independence of the theory under the local Lorentz transformation (11) is a consistency
condition that one has to require for one's theory.¹⁰ In the presence of spinor source fields,
of course, this consistency condition naturally leads us to introduce the gauge fields of the
Lorentz group to the theory and one obtains the Einstein-Cartan¹¹ theory of gravitation as has
been argued by Kibble.²,¹² In this case the translational gauge group is replaced by the
Poincaré group.⁹
III. PHYSICAL INTERPRETATION
We now wish to make clear how in the presence of source fields the translation group $T_4$
acts on them. By doing so we will give a precise physical meaning to our bundle of
translation group. Remember that we have treated our bundle just like a principal fiber
bundle of an internal-symmetry group except for one crucial difference, i.e., the
identification of Eq. (7). This equation interlocks the fiber space of the translation group
$T_4$ with space-time and allows us to speak of our $T_4$ as a space-time symmetry rather than an
internal symmetry. We will first justify this basic equation and clarify the meaning of the
translation group $T_4$.
Let us consider a scalar field $\phi(x)$ as the source field for simplicity and start with an
action integral written in the usual coordinate basis of a global Minkowski frame:

we started from the usual coordinate basis of a global Minkowski frame. But clearly one
should be able to start with any other basis as well. Indeed it may be more desirable to
construct the theory in a basis-independent way. So let us start from the beginning with a
local orthonormal frame $e_i$ and write down the action integral (14) as
*Work supported in part by the National Science Founda- See also L. N. Chang, K. I. Macrae, and F, Mansouri,
tion under Contract No. PHY74-08833 A01. i b i d . 2, 235 (1976).
t P r e s e n t address: Department of Physics, New York *Curnotation here i s the s a m e a s the one in Ref. 5.
University, New York. New York 10003. 'y. M. Cho, Phys. Rev. D (to b e published).
'See, e.g., S. L. Glashow and M. -11-Mann, Ann. Phys. "There appear different opinions in the l i t e r a t u r e with
(N.Y.) Is, 437 (1961); R. P. Feynman, Lectures on which we d o not agree. See, e.g., K. Hayashi, Nuovo
Gravitation (Caltech, Pasadena, Calif., 1963). Cimento 3, 639 (1973); Gen. Relativ. Gravit. A, 1
'T. W. B. Kibble, J. Math. Phys. 2, 212 (1961). (1974).
%ee, e.g.. K. Hayashi and T. Nakano, Prog. Theor. "$. Cartan, C. R. Acad. Sci.
n,
z,
". 593 (1922); Ann.
Phys. 2, 491 (1967): G. D. Kerlick, t h e s i s , Princeton Ecole Normale 325 (1923); 1 (1924).
Univ., 1975 (unpublished): 3 . M. N e s t e r , r e p o r t , 1974 '*Kibble's work followed after the pioneering work of
(unpublished). See also, F. A. Kaempffer, Phys. Rev. Utiyama. See R. Utiyama, Phys. Rev. 101, 1597
165, 1420 (1968). (1956). See also D. W. Sciama, Recent Developments
'ATTrautman, Rep. Math. Phys. 1,29 (1970). in General RelatiVity (Pergamon, New YorK, 1962).
5Y.M. Cho, J. Math. Phys. g, 2029 (1975). 1 3 h this paper our signature of the m e t r i c is (+, -, -, -).
6Y. M. Cho and P. G. 0. Freund, Phys. Rev. D G , 1711 I4C. N. Yang, Phys. Rev. Lett. 33, 445 (1974).
(1975). 15R. Pavelle, ehys. Rev. Lett. 34, 1114 (1975).
'T. T. Wu and C. N. Yang, Phys. Rev. D g , 3845 (1975).
J.P. HSU
Physics Department, Southeastern Massachusetts University, North Dartmouth, MA 02747, USA
We present a new fermion lagrangian which possesses exact symmetry under the local de Sitter group. The lagrangian
involves new scale gauge fields related to the newtonian force and the usual Yang-Mills phase gauge fields related to a
new gravitational spin force between two fermions. Generalization of the usual gauge theory for external symmetry
groups is also discussed.
I t has been suggested that gravity is related to gauge Mills phase gauge fields. They have different trans-
fields of four-dimensional symmetry such as the de formation property and, therefore, must be treated as
Sitter group [ 1,2]. The idea is quite interesting diffcrent and independent fields.
because the de Sitter group possesses the maxiniuin Lct 11sconsidcr the generalization o f ypJ,+ in the
four-dimensional symmetry [3] and is the unique gen- form for ;I non-abelian external symmetry group:
eralization of the PoincarC group. It also suggests tlie
existence of a new gravitational spin forcc between rql1C, 1 (1)
objects with nonzero net spin densities. The dc Sitter where rcL
involves both tlic Dirac matrices a n d scale
group is a rotational group in de Sitter space, which is gauge fields e j , and llic gauge-covariant derivative D,
the hypersurface of a four-dimensional sphcre of a hy- contains pliase gauge fields /$ = (oh, 0;):
perbolic character in one direction, einbeddcd in a
five-dimensional space. The radius of the spiierc is
rg 3 e j p= + efi i(yiyk - # y j > / 4 ~
denoted by L . The de Sitter group reduces to tlie
PoincarC group in tlie flat space limit L + m. =fizz*, (2)
One important ingredient in a realistic gauge theory
of gravity is the fermion field - a sourcc of the gravi-
tational field. But in previous discussions [ 1,2] one
either ignored the fermion field or discussed a fermion
lagrangian which has only upproximate symmetry un-
der local de Sitter gauge transformations. I t appears
that one cannot get a ferniion lagrangian with exuct
external gauge symmetry if one just employs the usual
Ej = ( 2 L e p , e$./L) .
Yang-Mills fields, i.e. phase gauge fields [4]. The quantity ZA is the matrix representation of the
In this paper, we present a new fermion lagrangian, S0(3,2) de Sitter group generators:
which has exact symmetry under the local de Sitter
group. It is necessary that the lagrangian involves new [Z,, Zc] = if&ZA , A = i, j k ; etc. (4)
scale gauge fields in addition t o the usual Yang- The local de Sitter gauge transformations are given by
$ $ = Ed$ >
* The work is supported in part by Southeastern Massachusetts +-
r, rp= tdr@t:l,
--f (5 cond) u ik .
g ~ LelpyekT7 (1 2 cond)
where For large L , gp is approximately the same as g@.In
the limit L w, it is natural t o interpret e f as the
&j= exp [ i d(x) Z
, 1. +
Volume 119B, number 4,5,6 PHYSICS LETTERS 23/30 December 1982
Chapter 8
The theory of gravitation is usually expressed in terms of an arbitrary system of coordinates. This results
in the appearance of weak equations connecting the Hamiltonian dynamical variables that describe a state
at a certain time, leading to supplementary conditions on the wave function after quantization. It is then
difficult to specify the initial state in any practical problem.
To remove the difficulty one must eliminate the weak equations by fixing the coordinate system. The
general procedure for this elimination is here described. A particular way of fixing the coordinate system is
then proposed and its effect on the Poisson bracket relations is worked out.
INTRODUCTION AND NOTATION

The problem of putting Einstein's equations for the gravitational field into the Hamiltonian form, as a preliminary to quantization, has recently received a good deal of attention, because of the development of mathematical methods sufficiently powerful to make it tractable.

The Hamiltonian form involves the concept of a physical state at a certain time, which means in a relativistic theory a state on a certain three-dimensional space-like surface in space-time. At first people chose the space-like surface independent of the coordinates $x^\mu$, which enabled them to preserve the four-dimensional symmetry of the equations. Later it was realized$^{3,4}$ that one could effect a substantial simplification, at the expense of giving up four-dimensional symmetry, by choosing a system of coordinates such that the three-dimensional surfaces $x^0 = \text{constant}$ are all space-like and dealing with the physical states on these surfaces.

The main features of the Hamiltonian formalism will be recapitulated here. The notation will be that used by the author, with the exception that the sign of the $g_{\mu\nu}$ will be changed throughout, to make $g_{00}$ negative. Greek suffixes take on the values 0, 1, 2, 3, lower-case Roman suffixes take on the values 1, 2, 3, the determinant of the $g_{\mu\nu}$ is $-J^2$, the determinant of the $g_{rs}$ is $K^2$, and the reciprocal matrix to $g_{rs}$ is $e^{rs}$. A lower suffix added to a field quantity denotes an ordinary derivative, while $|$ added to it denotes the covariant derivative.

We shall deal with the gravitational field in interaction with other fields, or possibly particles. Spinor fields are excluded, as they require a special treatment.

* The author's stay at the Institute for Advanced Study was supported by the National Science Foundation.
1 F. A. E. Pirani and A. Schild, Phys. Rev. 79, 986 (1950).
2 Bergmann, Penfield, Schiller, and Zatzkis, Phys. Rev. 80, 81 (1950).
3 Pirani, Schild, and Skinner, Phys. Rev. 87, 452 (1952).
4 P. A. M. Dirac, Proc. Roy. Soc. (London) A246, 333 (1958).
We have an action density of the form It should be noted that, for a vector A,, the ordinary
and covariant derivatives A , . and A , , , are both
e=e C f c M , independent of the go,. Their difference, namely
where is the action density of the gravitational field r,,JA,,is thus independent of the go,. We may take A ,
alone, involving the ,g, and their first derivatives, and here to be the unit normal, namely
2 ,is ~the action density of the other fields, involving the
other field quantities, q M say, and their first derivatives
and involving also the g,, but not derivatives of the gPv.
The gravitational action density is
ec= (16 r + ~ g q r , , ~ r , ~ - r,yprpsq, (1) is independent of the go,. This quantity may be called
where y is the gravitational constant, occurring in the the invariant velocity of g,, as it consists of the
numerator of Newtons law of force. To save writing, ordinary velocity graOmultiplied by a certain factor and
we shall take with certain terms added on, so as to produce a quantity
167ry= 1. (2) independent of the choice of coordinate system outside
the surface xo=t.
HAMILTONIAN FORM OF GRAVITATIONAL THEORY With the physical state described in this way, one
We shall deal with the physical state on the surface easily finds4 that for a dynamical variable Q not in-
xO=t and shall set up Hamiltonian equations of motion volving the gu0, dq/dxO is of the form
to determine how the state varies as t varies. The
Hamiltonian is, by the usual definition
independent of the go,. Let us consider the kind of With this SO,the momenta p@ conjugate to gp0
quantities that can enter into such a description. vanish weakly, which results in the degrees of freedom
Suppose there is a vector field A,. The three co- described by gM0,yodropping out from the Hamiltonian
variant components A , on the surface remain invariant formalism. The weak equations p@=O give, when one
under a change of coordinates which leaves the co- passes to the quantum theory, the conditions p.V=O,
ordinates of each point on the surface invariant. So which show that the wave function # does not involve
these A , will enter into the description. We cannot have the go,.
A o , but we have instead the normal component of A , The surviving gravitational momenta are
namely A,P, where 1, is the unit normal. Similarly for a p r a = K ( e v o e d 6 - , y a e a b ) r ab0/(-gm).
(9)
tensor B,,, which may be the covariant derivative
A,,, of A,,, we have the quantities B,,, BT$, B,,Y, They are built up from the invariant velocities (5).
B,PI. Each of these quantities is unaffected by a The fundamental Poisson bracket (P.b.) relations for
change of coordinates which leaves the points on the them are
surface invariant and is thus independent of the go,.
926 P. A. M. DIRAC
The expressions for XL and X, in (7) are found to be We can write the total Hamiltonian (7) in the form
3CL = K - y p y , , - *pr'psa) +B I
+(K-'(K2era)).)s + X ~ ~(1,1)
2 (pabgaa-)b + X M 8 , (12)
S
Xa=pabgoba-
where 4- (18)
gt.0ersXad3~,
Jj=+Kg
lSU GI," { (eroesb-er.?e=b)ei~v
(13)
+2 (er"e5b- eraebtL)eBU),
when it appears as Hlnain with arbitrary linear combina-
tions of X I , and of X,, for various values of zT,added on.
and XMI,, ~ C are
M ~the contributions arising from the These additional terms in the Hamil tonian produce
nongravitational fields. I t should be noted that the terms in the equations of motion in addition to those
terms B+(K-1(K2er"),),are equal to the density of the produced by Hmain,corresponding to the surface
three-dimensional scalar curvature of the surface xo= 1. xo= t undergoing arbitrary deformations and having
We have the weak equations arbitrary changes of its coordinate system zr as t varies.
XLZO, X,=O. (14) NEED FOR FIXATION OF THE COORDINATES
They are x equations or secondary constraints. To see To specify a physical state a t a particular time in the
where they come from, we note that Einstein's field classical theory, we must choose numerical values for
equations are all the dynamical coordinates and momenta so as to
satisfy the constraints (14). This involves solving some
R,'-+g,vRR,'= +T,', (15) differential equations, so it is not such a straight-
where Thyis the stress tensor produced by the non- forward matter as specifying a state in particle
gravitational fields. The left-hand side of (15) contains dynamics.
second derivatives of the gas and thus in general I n the quantum theory the situation is more compli-
contains accelerations g,ooo. The right-hand side of (15) cated. The constraints (14) go over into the conditions
contains no derivatives of the gas. Now the well-known on the wave function
identities xL*=o, (19)
(R,"- +g,'R,") I " ~ 0
X,*=O. (20)
may be written
To specify a state at a particular time involves obtaining
a solution of Eqs. (19), (20), which are functional
equations.
where the + a t the end indicates that some further Equation (20) expresses merely that $ must be
terms, not involving third derivatives of the g,, must invariant under changes of the coordinate system xr in
be added on. The right-hand side of (16) evidently the surface P=t. To get J. to satisfy this equation is
does not contain any third time derivatives g,gooo. thus not difficult. Equation (19) expresses the require-
Thus the left-hand side cannot involve third time ment that the state shall be specified in a way that is
derivatives, so R,O- +g,OR<'cannot involve accelerations independent of deformations of the surface xO=t. The
gaBoo. Thus if we take v = O in (15), we get equations treatment of such deformations is essentially as compli-
involving only dynamical coordinates and velocities. cated as the treatment of the passage from the surface
By substituting for the velocities here in terms of the xo=t to a neighboring surface x 0 = t + c , so to get $ to
momenta, we get four equations between dynamical satisfy (19) is essentially as complicated as solving the
coordinates and momenta only, which yield (14). equations of motion. Thus we have the situation that we
The main part of the Hamiltonian is obtained by cannot specify the initial state for a problem without
putting g=, -6,o in (7) and is thus solving the equations of motion. The formalism is thus
not suitable for dealing with practical problems.
The difficulty does not arise in the weak-field approxi-
mation, because then many of the terms in (19) get
neglected and the remaining ones, if expressed in terms
n
of Fourier components, are easy to handle,
To obtain a practical formalism of greater accuracy
than the weak-field approximation, i t is necessary to
after removal of a surface integral a t infinity. The introduce into the theory some new constraint that
removal of this surface integral does not disturb the fixes the surface xo=t, so that we no longer have the
validity of Hmninfor giving equations of motion, but possibility of making arbitrary deformations in it. Then
it results in Hmainnot vanishing weakly. the supplementary condition (19) gets eliminated. We
may also introduce some further constraints that fix the coordinate system $x^r$ in the surface. While not essential for getting a practical formalism, such further constraints serve to simplify the formalism by eliminating the conditions (20), and so making the task of specifying the initial state a trivial one.

The fixation of coordinates is advantageous also in the weak-field approximation, because it leads to some degrees of freedom dropping out from the formalism, the procedure being similar to the elimination of the longitudinal waves in electrodynamics.

When dealing with gravitational waves, people usually restrict the coordinate system by introducing the harmonic conditions

$(J g^{\mu\nu})_\nu = 0.$

These conditions would be quite unsuitable in the present formalism because they involve the $g_{\mu 0}$, which the present formalism allows to be completely arbitrary. Any restriction imposed on the $g_{\mu 0}$ would not help one in dealing with Eqs. (14) or (19) and (20). We need some restrictions which affect only the variables involved in (14), namely $g_{rs}$ and $p^{rs}$, and possibly also the nongravitational variables.

GENERAL METHOD

Let us examine the general principles which come into play when we introduce some new restrictions or constraints on the dynamical variables in a Hamiltonian theory. Suppose we have a number of weak equations $\chi_n = 0$ $(n = 1, 2, \ldots, N)$, which may be either primary or secondary constraints. We are taking $N$ to be finite for definiteness, but the same principles apply with $N$ infinite. Suppose further that these weak equations are all first-class, so that

$[\chi_n, \chi_{n'}] = 0.$

Now introduce some new restrictions, say the $M$ independent equations

$Y_m = 0, \quad m = 1, 2, \ldots, M$

with $M \le N$. They are, of course, weak equations. Suppose that none of them (and no linear combination of them) has zero P.b. with all the $\chi$'s, so that they are all second-class constraints. They will cause $M$ of the $\chi$'s to become second-class, while $N - M$ of the $\chi$'s (or linear combinations of them) remain first-class.

Suppose $\chi_1, \chi_2, \ldots, \chi_M$ become second-class, while $\chi_{M+1}, \ldots, \chi_N$ remain first-class. We now have the $2M$ second-class constraints $\chi_m = 0$, $Y_m = 0$ $(m = 1, 2, \ldots, M)$. Let us write $\chi_m = Y_{M+m}$, so that the $2M$ second-class constraints become $Y_s = 0$ $(s = 1, 2, \ldots, 2M)$.

There is no place for second-class weak equations in the quantum theory, so we have to transform them in some way. We shall see that we can change them into strong equations (holding as equations between operators in the quantum theory) provided we adopt a new definition of P.b.'s, which corresponds to the number of effective degrees of freedom being reduced by $M$.

In simple cases we can pick out directly the degrees of freedom that have to be dropped and those that survive. Let us take the special case when $M$ of the equations $Y_s = 0$ are

$p_m = 0, \quad m = 1, 2, \ldots, M.$ (21)

The remaining $M$ of them must then contain all the variables $q_m$ independently (otherwise the $p_m$ would not all be second-class), and so it must be possible to solve them for the $q_m$ and write them as

$q_m = f_m(q_{M+1}, q_{M+2}, \ldots, p_{M+1}, p_{M+2}, \ldots).$ (22)

We now see that the degrees of freedom associated with $q_m$, $p_m$ $(m = 1, 2, \ldots, M)$ cease to play an effective role in the dynamics. We can use Eqs. (21) and (22) to eliminate the variables $p_m$ and $q_m$ from the theory, which implies using these equations as definitions or as strong equations. We then work with P.b.'s that refer only to the other degrees of freedom.

In the general case one retains all the dynamical variables and merely changes their P.b.'s to correspond to the reduction in the number of degrees of freedom. To do this one first sets up the matrix of all the P.b.'s $[Y_s, Y_{s'}]$. It can be shown that this matrix has a nonvanishing determinant, provided there is no linear combination of the $Y_s$ that is first-class. One must then obtain the reciprocal matrix $C_{ss'}$, satisfying

$C_{ss'} [Y_{s'}, Y_{s''}] = \delta_{ss''}.$ (23)

Note that $C_{ss'}$ is a skew matrix, like $[Y_s, Y_{s'}]$. One then defines new P.b.'s by the formula

$[\xi, \eta]^* = [\xi, \eta] - [\xi, Y_s]\, C_{ss'}\, [Y_{s'}, \eta].$ (24)

It can be checked$^5$ that the new P.b.'s satisfy all the fundamental relations that P.b.'s ought to satisfy. From (23) and (24) we see at once that $[\xi, Y_s]^* = 0$ for any $\xi$. Thus the $Y_s$ now have zero P.b. with everything, so that we can consider the equations $Y_s = 0$ as strong equations and use them before working out P.b.'s.

5 P. A. M. Dirac, Can. J. Math. 2, 129 (1950).

In applying this method to the gravitational case we desire, of course, that the change in the P.b.'s shall not be too complicated. In particular, we would like to have no change at all in the P.b. of two quantities, neither of which involves the gravitational variables $g_{rs}$, $p^{rs}$. This result is ensured provided the two conditions hold: (i) The $Y_m$ $(m = 1, 2, \ldots, M)$ involve only the gravitational variables; (ii) The P.b.'s $[Y_m, Y_{m'}]$ all vanish. The proof is as follows.

We have already $[\chi_m, \chi_{m'}] = 0$ from the assumption that the $\chi$'s were originally first-class. With the further condition $[Y_m, Y_{m'}] = 0$ we have $[Y_s, Y_{s'}] = 0$ except when $1 \le s \le M$ and $M + 1 \le s' \le 2M$ or vice versa. This
leads to Ca81=0 except when 1 6 s < M and M+ 1 6 s We find that graand HI. have zero P.b. with p,* and K
< 2M or vice versa. The surviving elements of Care thus at all points.
C,, M+,,,,= -CM+,,,,, ., The elements C,,,, ,v+,,,! form a Let us change our basic dynamical coordinates from
matrix of M rows and columns, which is the reciprocal the six g, to the five independent gTBand Inx. The
of the matrix [x,,.,Y,,,]. momentum conjugate to 1nK is now, from (28), just
The formula (24) now reduces to Zp,, and the momenta conjugate to the graare certain
functions of the and grd.
cs,?l*-cs,rll= -Cm* M+m4Ct,~mICxm~,al The conditions (26) now take the form (21) and we
- Ct,xm,lCYm,rll). (25) have the equations XL=O playing the role of (22).
If 5 and 7 do not involve the gravitational variables, To put them into the form of (22) we must solve them,
the condition (i) above leads to [(,Ym]=O and with the help of (26), to get K expressed in terms of
[Y,,a]=O, so the right-hand side of (25) vanishes. quantities having zero P.b. with p,- and K. Such
The introduction of the new constraints into the quantities are the gr., Fa, $1.8) and the nongravi-
theory, when combined with the appropriate change in tational variables.
the P.b.s, leaves the Hamiltonian first-class. It follows From ( l l ) , the equation XL-O gives,
that the Hamiltonian equations of motion preserve all
the constraints.
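The modified Poisson bracket of the general method, Eqs. (23) and (24), can be illustrated on a small finite-dimensional system. The following is only a sketch under stated assumptions: a hypothetical phase space with two degrees of freedom, dynamical variables restricted to linear functions (so that brackets are exact numbers), and two illustrative second-class constraints of the form (21) and (22), namely $p_1 = 0$ and $q_1 = q_2$. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def poisson(f, g):
    """Poisson bracket of two linear phase-space functions.

    f and g are coefficient lists ordered (q1, ..., qn, p1, ..., pn).
    """
    n = len(f) // 2
    return sum(f[i] * g[n + i] - f[n + i] * g[i] for i in range(n))


def inverse(m):
    """Invert a small square matrix exactly by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (no pivot found).
    """
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [x / scale for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def dirac_bracket(xi, eta, constraints):
    """New P.b. of Eq. (24): [xi, eta] - [xi, Y_s] C_ss' [Y_s', eta]."""
    # Matrix of all P.b.'s [Y_s, Y_s'] and its reciprocal C, as in Eq. (23).
    delta = [[poisson(a, b) for b in constraints] for a in constraints]
    c = inverse(delta)
    correction = sum(poisson(xi, ys) * c[s][t] * poisson(yt, eta)
                     for s, ys in enumerate(constraints)
                     for t, yt in enumerate(constraints))
    return poisson(xi, eta) - correction


# Hypothetical example: coordinates ordered (q1, q2, p1, p2).
q1, q2 = [1, 0, 0, 0], [0, 1, 0, 0]
p1, p2 = [0, 0, 1, 0], [0, 0, 0, 1]
constraints = [p1, [1, -1, 0, 0]]   # Y_1 = p1, Y_2 = q1 - q2

print(dirac_bracket(q1, p1, constraints))  # 0: the pair (q1, p1) drops out
print(dirac_bracket(q2, p2, constraints))  # 1: the surviving pair is unchanged
print(dirac_bracket(q1, p2, constraints))  # 1: q1 behaves as q2, as in Eq. (22)
```

In this toy case every basic variable has vanishing new P.b. with both constraints, which is the property that allows the equations $Y_s = 0$ to be used as strong equations.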
in which we look upon the g, in B and X M Las expressed
FIXATION OF THE SURFACE in terms of the and K. This is a difficult equation to
solve generally for K. However, for gravitational fields
To fix the surface P=t, the natural conditions to that are not too strong, the important terms are those
take are that involve second derivatives of K, i.e., those on the
prT=grapr*=o. (26) left-hand side. We can therefore obtain the solution
This involves bringing into the theory one Y equation by a method of successive approximation, first putting
for each point of the surface. ~ = on l the right and solving the resulting simplified
One easily checks that equation, then substituting the first approximation for
K on the right and solving to get the second approxi-
[XB,g=puY-J = g,.p6, (x- x ) =0, mation, and so on. We shall consider this equation
so the conditions (26) do not disturb the first-class further in the next section, with reference to a particular
character of the equations X,= 0. This means that the system of coordinates, and for the present we shall
conditions (26) do not restrict the coordinate system x assume that the solution has been obtained.
in the surface, a result which is evident from the tensor Following the method of the preceding section for
character of (26). The conditions (26) mean geo- dealing with the second-class equations (21) and (22),
metrically that the surface shall have a maximum we express Hmninand X. in terms of the variables
three-dimensional area. The equations (26) and g,,, Bru,
ifra, Bt., p,, and K, and then eliminate puU and K
XL-O are now second-class and we can use them to from them by means of (26) and the solution of (30),
eliminate one degree of freedom at each point of space. which we may now use as strong equations. The
We have elimination from X, is trivial, as we get from (12),
using (26),
Cgra,puU1=g,*6(x-X).
It follows that the ratios of the g,, at any point have Xa=$abgabs-2(habgai) b+XM*. (31)
zero P.b.s with pu at all points of the surface. Let us If the nongravitational field variables are suitably
Put chosen, will not contain K . The elimination from
K = K ~ ,gre=g,,K-, ~=ersKZ. (27) Hmain leads to an expression
Then g,, involves only such ratios and has zero P.b.
with p- a t all points. There are five independent prnain J
p 91.~+B+Xnr~)d~x, (32)
= (K -3-ra-
grS,as their determinant is unity. The form the
reciprocal matrix to the matrix gY6,and also have the in which K is understood to have the appropriate valuc.
determinant unity. The integrand here may be considered as the energy
We have density or mass density. The complete Hamiltonian
[Kz,puu]= 3K%(x-- x), is now
and so
Put
[ l n ~ , p ~ ~+]6=( x - x) . (28) s
H*maln+ groeraXdd3x.
$rs= ( p r a - geiagobpcb)K2, (29) of the surface, i.e., the middle term of (18), has
$is=gmga b$Ob- disappeared.
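The method of successive approximation described for the equation determining $\kappa$ can be sketched numerically. The example below is a toy model and not the equation of the paper: it assumes one space dimension on the unit interval, the boundary value $\kappa = 1$ at both ends, and a hypothetical positive source $\kappa^{-7}$ standing in for the full right-hand side. It keeps only the structure "put $\kappa = 1$ on the right, solve the linear equation, substitute, and repeat".

```python
def solve_poisson_1d(rhs, h, boundary=1.0):
    """Solve -4 u'' = rhs on a uniform grid with u = boundary at both ends.

    Uses the Thomas algorithm on the tridiagonal system
    -u[i-1] + 2 u[i] - u[i+1] = h^2 rhs[i] / 4.
    """
    n = len(rhs)
    d = [h * h * r / 4.0 for r in rhs]
    d[0] += boundary
    d[-1] += boundary
    b = [2.0] * n
    for i in range(1, n):
        w = -1.0 / b[i - 1]      # sub-diagonal entry (-1) over previous pivot
        b[i] += w                # b[i] - w * c[i-1], super-diagonal c = -1
        d[i] -= w * d[i - 1]
    u = [0.0] * n
    u[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (d[i] + u[i + 1]) / b[i]
    return u


def successive_approximation(source, n=99, tol=1e-12, max_iter=50):
    """Iterate kappa -> solution of -4 kappa'' = source(kappa), starting from kappa = 1."""
    h = 1.0 / (n + 1)
    kappa = [1.0] * n            # first step: put kappa = 1 on the right-hand side
    for iteration in range(1, max_iter + 1):
        new = solve_poisson_1d([source(k) for k in kappa], h)
        change = max(abs(a - b) for a, b in zip(new, kappa))
        kappa = new
        if change < tol:
            return kappa, iteration
    return kappa, max_iter


# Hypothetical positive "mass density" depending on kappa itself.
kappa, iterations = successive_approximation(lambda k: k ** -7)
print(iterations, min(kappa), max(kappa))
```

With a positive source the iterates stay above 1 everywhere in the interior, in line with the remark in the QUANTIZATION section that a positive mass density gives $\kappa > 1$.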
We now have a Hamiltonian formalism in which which follows from (35), this reduces to
the degree of freedom described by and 1nK has
dropped out. The Hamiltonians (32) and (33) are
R= + ~ g ' ~ ~ ~(, &P ,F~b W - 2 b * Z b " k v ) -~ K - ~ U , , K , Z (39)
~~.
first-class even with the condition (26), so they lead The last term here, divided by lw, can be inter-
to equations of motion that preserve (26). The pro- preted as the mass density (or energy density) of the
cedure of substituting for M in the derivation of H*msin Newtonian field with the potential K- 1. It is negative
caused the introduction of the right amount of X L definite, corresponding to the Newtonian force being
into the Hamiltonian to ensure the preservation of (26). attractive. The remaining terms of B, together with
the first term on the right-hand side of (381, give the
FIXATION OF COORDINATES IN THE SURFACE energy density of the gravitational waves.
To get the theory into a more convenient form, one
THE NEW POISSON BRACKETS
must also fix the coordinate system 'x in the surface.
The most natural conditions to take for this purpose, With the coordinates fixed by (35), the P.b.'s of the
from the geometrical point of view, are the harmonic gravitational variables with one another and with the
conditions in three dimensions : nongravitational variables will be altered. The new
P.b.'s are given by formula (25) with Y, replaced by
(K5")*=0. (34) FUuand xmlreplaced by X'M. It thus reads
However, (34) does not have zero P.b. with (26), so if
we adopt (34) together with (26) we must change the Ct191*-ITE,?1= -~ ~ c ~ ~ ( x , 2 ' ~ ( c ~ , ~ ~ ~ J [ x
P.b. relationships between the nongravitational vari-
ables. To avoid this inconvenience, it is better to -[~,X'.][gF1L=,?]}d3xd3~'. (40)
replace (34) by
6 4 , = 0, (35) The coefficient C,'(x,x') is the reciprocal of the matrix
which does have zero P.b. with (26). [X'B,Eru,] and thus satisfies
With the coordinates fixed by (35), Eq. (30) reduces
to SC.a(.",2')[XII,F"u]d92'= g6.' (x- x"). (41)
-4vK=K4p9,a+B+XYL. (36)
where P denotes the Laplacian operator with respect Evaluating the P.b. here, we get
to the metric Era, namely
QZ= rw/axw.
The right-hand side in (36) equals the integrand in (32)
(37) s c"qz",T') { g 8 P * 6 a & ( X -
+&?6*&-
x')
2')) d32' =g,'6 (x- x''),
and is the mass density. To interpret (36), let us restore
which reduces to
the gravitational constant into the theory in accordance
with (2). It then becomes 4-$raC,,a(x',x).,= g6,'
V2Cmr(~',x) (x- x'), (42)
- (4'Ty)-'V2K= 16'~y~-'$~"$,,f (16~Ky)-~B+x,w~.(38) with 02 defined by (37).
This equation may be considered for fixed x', when
We now see that K-l is the Newtonian potential it is a differential equation for the unknown functions
generated by the mass density in a space with the Cu*(x',x)in the Variables x. The important domain for x
metric &. Thc fact that x occurs in the right-hand is now the neighborhood of x', since when x is far from
side of (38) can be understood as due to the Newtonian I' the functions CVr(x',x)are small. We can therefore
potential itself having some influence on the mass get an approximate solution by considering the space
density which generates it. as flat in this domain, so that the b b are constants.
Let us examine the term with B in (38). The expres- With this approximation we get, on differentiating
sion (13) for B, written in terms of the new variables, is (42) with respect to x',
B=-;K-' (KZrsu+ 2Kugrs) ( K g o b v + 6) VC,"= %6,(X- X'). (43 )
x { ( p a p b- p a z a b)geu+ 2 ( p pb -Zvagbu) -8u
e 1. Thc solution of this equation is
With the help of the equation
g,,c'*= 0,
which follows from the determinant of the grd being wherc I z - d denotes the distance from z t o x' with
unity, and of the equation respect to the metric LI.,
g,,,i? = 0, 1 X-dI ={g.*(Xr-x'p) (X"-r'*)) i. (44)
Equation (42) now becomes can then set up the wave function as a function of these
variables,
1 1
V"C,'= g'"6 (x- x') +-Era
whose solution is
-
l&lr (,x-x', >,. b! (FI'",O.
The effective domain of J. is that for which the
Zr* are restricted to have the determinant unity and
to satisfy b*,=O. # may be considered as undefined
outside this domain. When we operate on # with 6abor
with any dynamical variable in the theory, we get
another wave function defined in the same domain,
One could get the solution of (42) to a higher accuracy on account of pa' commuting with the determinant
by substituting for the F b in the left-hand side of (42), of the F a and with Fa..
(remembering that Zab occurs also in the operator 7,) There are no supplementary conditions to be imposed
their Taylor expansions in powers of x-x' and using on $. We can choose it arbitrarily to correspond to the
the first approximation for CVrin those terms in which initial state in any problem. There is just one equation
it occurs with a factor xr-x". By a process of successive for +, the Schrodinger equation
approximation one could get the solution to any
desired accuracy.
With the coefficients Cro(x,xt)in (40) determined, which fixes the state a t later times.
the new P.b.'s are determined. I t should be noted that For the theory to be self-consistent it is necessary
the new P.b. of any nongravitational variable with that the space-like surface on which the state is defined
g,, or Zra vanishes. However, its new P.b. with does shall always remain space-like. The condition for this
not vanish. is that K2, the determinant of the g, shall remain
always positive. I n the present formalism this means
QUANTIZATION
K ~ > O , with K determined by (36). If the mass density
To pass over to the quantum theory, we must make is always positive, (36) shows that K> 1 and there is no
all our dynamical variables into operators satisfying trouble. Difficulties arise only where there is a large
commutation relations corresponding to the new negative density. This occurs very close to a point
P.b.'s. We must then pick out a complete set of com- particle, on account of the last term in (39). The
muting observables. We may take these to consist of gravitational treatment of point particles thus brings
the Era a t all points x', together with a complete set of in one further difficulty, in addition to the usual ones
commuting nongravitational observables, f say. We in the quantum theory.
Takeshi Shirafuji
Physics Department, Saitama University, Saitama, Japan
(Received 6 February 1979)
A gravitational theory is formulated on the Weitzenböck space-time, characterized by the vanishing
curvature tensor (absolute parallelism) and by the torsion tensor formed of four parallel vector fields. This
theory is called new general relativity, since Einstein in 1928 first gave its original form. New general
relativity has three parameters $c_1$, $c_2$, and $\lambda$, besides the Einstein constant $\kappa$. In this paper we choose
$c_1 = 0 = c_2$, leaving open $\lambda$. We prove, among other things, that (i) a static, spherically symmetric
gravitational field is given by the Schwarzschild metric, that (ii) in the weak-field approximation an
antisymmetric field of zero mass and zero spin exists, besides gravitons, and that (iii) new general relativity
agrees with all the experiments so far carried out.
I. INTRODUCTION

In 1928 Einstein introduced the notion of absolute parallelism and tried to unify gravitation and electromagnetism, using tetrads with 16 degrees of freedom. His attempt failed because there was no Schwarzschild solution in his simplified field equation.$^2$ Later, in 1961 Møller revived Einstein's idea,$^3$ and Pellegrini and Plebanski found a Lagrangian formulation for absolute parallelism.$^4$ Recently this formalism was reconsidered by Møller.$^5$

In 1967, quite independently, Hayashi and Nakano started to formulate the gauge theory of the space-time translation group.$^6$ This theory was of no geometrical construction, but it was shown that, for a static, isotropic gravitational field, a symmetric part of their field equations is identical with the Einstein field equation in general relativity, and that, in the weak-field approximation, an antisymmetric part describes the propagation of an antisymmetric field, whose source is related to the intrinsic spin of spin-1/2 fundamental particles. Miyamoto and Nakano estimated effects of exchanging this field in the microscopic system. In later years Hayashi further developed the gauge theory into a more elaborate framework$^8$ and fixed the final form in 1973. Quite recently Hayashi pointed out the connection between the gauge theory of the space-time translation group and absolute parallelism.$^{10}$

Now we wish to unify these two developments mentioned above, following the geometry of underlying space-time structure. The Riemann-Cartan space-time $U_4$ is a paracompact, Hausdorff, connected $C^\infty$ four-dimensional manifold endowed with a locally Lorentzian metric $g$ and a linear affine connection $\Gamma$ which is metric,

From this equation we get

where the first term denotes the Levi-Civita connection,

and the second stands for the contortion tensor,

with the torsion tensor

In terms of the affine connection the curvature tensor is given by

The Riemann-Cartan space-time has both the curvature tensor and the torsion tensor. From this space-time follow two very interesting models of space-time. One is the well-known Riemann space-time $V_4$, which is obtained from the $U_4$ by setting the torsion tensor to be identically vanishing. From (1.2) follows the Levi-Civita connection. It is well known that general relativity is the theory of gravitation on this space-time, and that it ascribes gravitation to the Riemann-Christoffel curvature tensor formed of the Levi-Civita connection.

Another interesting model is the Weitzenböck space-time $A_4$, which is obtained from the $U_4$ by setting the curvature tensor to be identically vanishing,

Or, to put it equivalently, the Weitzenböck space-
19 3524 © 1979 The American Physical Society
19 NEW GENERAL RELATIVITY 3525
time is obtained by requiring the $U_4$ to admit absolute parallelism, i.e., to have a quadruplet (specified by k = 0, 1, 2, 3) of linearly independent parallel vector fields, $b = \{b_k\} = \{b_k{}^\mu\}$, which is defined by

$D^*_\nu b_k{}^\lambda = \partial_\nu b_k{}^\lambda + \Gamma^\lambda_{\mu\nu} b_k{}^\mu = 0$. (1.8)

Solving this equation we find the nonsymmetric affine connection,

$\Gamma^\lambda_{\mu\nu} = b_k{}^\lambda \partial_\nu b^k{}_\mu$, (1.9)

and the torsion tensor,

(1.10)

Here $b^* = \{b^k\} = \{b^k{}_\mu\}$ is also a quadruplet of parallel vector fields, which is inverse to $b$. It is straightforward to see that the curvature tensor indeed vanishes identically [see (1.7)]. See Fig. 1 for reduction of the Riemann-Cartan space-time.

We will give the name, new general relativity, to the theory of gravitation on the Weitzenböck space-time, since Einstein in 1928, after inventing general relativity, considered absolute parallelism for the first time, and the main consequences of the present theory will be analogous to those of general relativity so far as macroscopic phenomena are concerned. New general relativity attributes gravitation to the torsion tensor formed of the parallel vector fields.

As is well known, general relativity is formulated by the following fundamental assumptions, which we will compare with those of new general relativity: (A) Underlying space-time is the Riemann space-time, which has the metric tensor as the basic structure. All physical laws are expressed by equations that are covariant or form invariant under the group of general coordinate transformations. (B) The equivalence principle. (C) Gravitational field equations are derivable from the action principle. (D) The field equations are partial differential equations in the field variables of not higher than the second order. (E) The gravitational field is exhaustively described by the metric tensor alone.

In new general relativity the fundamental assumptions are as follows: (A') Underlying space-time is the Weitzenböck space-time, which has a quadruplet of the parallel vector fields as the fundamental structure. These parallel vector fields give rise to the metric tensor as a by-product. All physical laws are expressed by equations that are covariant or form invariant under the group of general coordinate transformations. (B') The equivalence principle is valid only in classical physics. (C') and (D') are the same as (C) and (D), but at this time we start from the microscopic action principle. (E') The gravitational field is exclusively described by a quadruplet of the parallel vector fields. As is closely related to (E'), we need to assume: (F') All physical laws are expressed by equations that are covariant or form invariant under the group of global Lorentz transformations. When general relativity is extended to the domain of microscopic system, one must use tetrads and has to assume: (F) All physical laws are expressed by equations that are covariant or form invariant under the group of local Lorentz transformations.

FIG. 1. [Diagram with boxes labeled "Riemann-Cartan Space-Time" and "Minkowski Space-Time".] The reduction of space-time is made in two particular cases: One is the Riemann space-time $V_4$ with a curvature tensor only (R), and the other the Weitzenböck space-time $A_4$ with a torsion tensor alone

We shall formulate new general relativity in the following manner: In Sec. II geometry of the Weitzenböck space-time is described in some detail, with emphasis on spinor wave functions defined in this space-time. In Sec. III microscopic matter Lagrangians are considered, such as of the electromagnetic field, of spin-1/2 fundamental particles and so forth. Their equations of motion are derived and then approximated by the WKB method to yield, in the classical limit, the geodesics of the metric $g$, along which point particles and light rays are defined to move. In Sec. IV a gravitational Lagrangian is constructed by the requirement of invariance under (1) the group of general coordinate transformations, (2) the group of global Lorentz transformations, (3) the parity operation, and by the demand that (4) the Lagrangian be quadratic in the torsion tensor. Gravitational field equations are derived, with three unknown parameters, $c_1$, $c_2$, and $c_3$. In Sec. V a static, spherically symmetric field outside a massive neutral body is determined, with two parameters, $c_1$ and $c_2$; in this case a term proportional to $c_3$ is vanishing identically. In Sec. VI comparison with all
$D^*V = (D^*_\nu V^i)\,E^\nu \otimes b_i = (D^*_\nu V^\mu)\,E^\nu \otimes E_\mu$, (2.10)

where

$D^*_\nu V^i = \partial_\nu V^i$, (2.11)

$D^*_\nu V^\mu = \partial_\nu V^\mu + \Gamma^\mu_{\lambda\nu} V^\lambda$. (2.12)

Thus, for the components $V^i$ with respect to a quadruplet of the parallel vector fields, the covariant derivative coincides with the usual derivative.

In the Weitzenböck space-time absolute parallelism of vectors at different points of M is defined in the following way: Consider a vector $V(p) = V^i b_i(p)$ at p and a vector $W(q) = W^i b_i(q)$ at q, where the point q can be arbitrarily separated from p. The parallelism of V and W is manifest: If their components are equal with each other,

$V^i = W^i$, (2.13)

then the two vectors, V(p) and W(q), are parallel with each other and of equal length.

In passing we make the remark that Latin indices are used to denote components with respect to a quadruplet of the parallel vector fields, and are raised and lowered by the Minkowski metric tensor, $\{\eta_{ij}\}$ or $\{\eta^{ij}\}$.

The affine connection, $\Gamma^* = \{\Gamma^\lambda_{\mu\nu}\}$, is not symmetric with respect to the exchange of lower two indices. The torsion tensor is given by

$T^\lambda{}_{\mu\nu} \equiv \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu} = b_k{}^\lambda(\partial_\nu b^k{}_\mu - \partial_\mu b^k{}_\nu)$. (2.14)

The curvature tensor formed of $\Gamma^*$ identically vanishes [see (1.7)], since parallel transfer of a vector is path independent owing to absolute parallelism. Thus the Weitzenböck space-time is characterized by the torsion tensor alone, and reduces to the Minkowski space-time provided the torsion tensor vanishes globally. See Fig. 1 for reduction of the Riemann-Cartan space-time. In the Minkowski space-time the parallel vector fields, which define absolute parallelism, coincide with the coordinate basis of a Cartesian coordinate system.

When a quadruplet of the parallel vector fields $b$ is subject to a global, proper, orthochronous Lorentz transformation,

$b'_k = \Lambda^l{}_k b_l$, (2.15a)

$\Lambda^i{}_m \eta_{ij} \Lambda^j{}_n = \eta_{mn}$, $\det\Lambda = 1$, $\Lambda^0{}_0 \ge 1$, $\partial_\mu \Lambda^i{}_k = 0$, (2.15b)

new absolute parallelism defined by new parallel vector fields $b'$ is equivalent to the original one. So geometry of the Weitzenböck space-time is invariant under the global, proper, orthochronous Lorentz group, $L_+^\uparrow \equiv \{\Lambda = (\Lambda^i{}_k) \in GL(4,R),\ \Lambda^i{}_m \eta_{ij} \Lambda^j{}_n = \eta_{mn},\ \det\Lambda = 1,\ \Lambda^0{}_0 \ge 1,\ \partial_\mu \Lambda^i{}_k = 0\}$. In conformity with this invariance of underlying geometry, we demand that physical laws should be invariant under the action of $L_+^\uparrow$. We call this the global Lorentz invariance.

In the Weitzenböck space-time, spinors are introduced as quantities which transform like two-valued representations of the proper, orthochronous Lorentz group $L_+^\uparrow$. Most elementary spinors are four kinds of two-component spinors, i.e., contravariant spinor $\{\xi^A\}$, covariant spinor $\{\chi_A\}$, dotted contravariant spinor $\{\xi^{\dot A}\}$, and dotted covariant spinor $\{\chi_{\dot A}\}$ for A = 1 and 2. Dotted spinors transform like the complex conjugate of undotted spinors. A spinor of higher rank is a quantity which transforms like a direct product of two-component spinors. A vector $V = V^i b_i$ is identified with a mixed spinor of second rank with components $V^{A\dot B}$,

$V^{A\dot B} = \sigma_i{}^{A\dot B} V^i$, (2.16)

where $\{\sigma_i{}^{A\dot B}\}$ is a set of Hermitian 2 x 2 matrices satisfying

(2.17)

where

(2.18)

One of the simplest choices, which we take in this paper, is

(2.19)

where $\{\sigma_1, \sigma_2, \sigma_3\}$ is a set of the Pauli matrices.

The four-component Dirac spinor $\psi$ is defined by a direct sum of a covariant spinor and a dotted contravariant spinor, and is written as a single column matrix

(2.20)

The conjugate Dirac spinor $\bar\psi$ is obtained from $\psi$ by

(2.21)

Now we extend the definition of absolute parallelism to include spinors. Consider a spinor at p, say a contravariant two-component spinor $\{\xi^A(p)\}$, and another spinor at q of the same type, say, $\{\zeta^A(q)\}$. If components of these spinors are equal, i.e., $\xi^A(p) = \zeta^A(q)$, then two spinors are defined to
be parallel and of the same magnitude. From (2.16) it follows that absolute parallelism of spinors implies absolute parallelism of vectors and tensors: In fact, for two vectors V at point p and W at another point q, equality of spinor components, $V^{A\dot B}(p) = W^{A\dot B}(q)$, implies $V^i(p) = W^i(q)$, because $\{\sigma_i{}^{A\dot B}\}$ is independent of space-time position.

When a spinor at point p is parallel transferred to another point q, its components are kept unchanged owing to absolute parallelism. Therefore, the covariant derivative $D^*_\mu$ of spinors coincides with the usual derivative $\partial_\mu$.

Finally, we make the following important remark: The parallel vector fields $b$ are different from the so-called tetrad fields $e$ by an arbitrary, position-dependent Lorentz transformation, which is called a local Lorentz transformation.

III. MATTER LAGRANGIAN AND EQUATIONS OF MOTION FOR TEST PARTICLES

A. Matter Lagrangian

In new general relativity we do not identify the six extra degrees of freedom of the parallel vector fields with the electromagnetic field strength, since we now know that such an attempt failed. Instead, we take the electromagnetic potential $A = \{A_\mu\}$ as the dynamical variable independent of the parallel vector fields. The matter part of the action is then represented as a sum of the action of fundamental particles and fields, i.e., of the electromagnetic field and several kinds of spin-1/2 fundamental particles;

interaction shall be assumed to hold in new general relativity, because this invariance plays the fundamental role in quantum electrodynamics. The electromagnetic Lagrangian density $L_{em}$ is then given by

$L_{em} = -\tfrac{1}{4} g^{\mu\rho} g^{\nu\sigma} F_{\mu\nu} F_{\rho\sigma}$, (3.3)

with

which is of the same form as the electromagnetic Lagrangian density used in general relativity.

Absolute parallelism is applied to spinor wave functions of fundamental spin-1/2 particles, and the Dirac Lagrangian density $L_D$ is given by15

$L_D = \tfrac{1}{2} i\hbar\, b_k{}^\mu [\bar\psi \gamma^k D^*_\mu \psi - (D^*_\mu \bar\psi)\gamma^k \psi] - m\bar\psi\psi$. (3.5a)

In this paper we use the unit, $\hbar = c = 1$, but throughout this section we write $\hbar$ explicitly for convenience of taking the semiclassical limit. For spinors the covariant derivative $D^*_\mu$ coincides with the usual derivative

$D^*_\mu \psi = \partial_\mu \psi$. (3.6a)

If we use the covariant derivative $\nabla_\mu$ of general relativity,

$\nabla_\mu \psi = (\partial_\mu + \tfrac{i}{2} A_{ij\mu} S^{ij})\psi$, (3.6b)

with respect to the Ricci rotation coefficients $\{A_{ij\mu}\}$,

$A_{ij\mu} \equiv b^k{}_\mu A_{ijk}$, $A_{ijk} = -\tfrac{1}{2}(T_{ijk} - T_{jik} - T_{kij})$, (3.7)

then $L_D$ of (3.5a) can be rewritten as
[iib, y (0:+ Su ), -m ]
where the electromagnetic current is defined by X (3.17)
[~bib,Yy(D,*+~v,)+m]$I=O ,
6 to which we apply the WKB approximation method.
jY=-Lint. (3.12)
J.4 Y We seek a semiclassical solution of (3.17) with
Equation (3.11) is just the Maxwell equation in the following form:
general relativity, and hence the law of electro-
magnetism is entirely free from the influence of 9 = ern(; s) 40 , (3.18)
absolute parallelism. In space-time with a given
background metric g, electromagnetic waves by assuming that t i s very small compared to S.
propagate in the same manner as in general rela- Usihg (3.18) in (3.17) and then putting each o r d e r
tivity: In the short-wavelength limit, in particu- of ( t / i ) to zero, we find (up to the f i r s t order)
lar, light rays propagate along the null geodesics (t/i): gYV(a,,s)(a,s)+m2=0, (3.19)
of the metric g.
The Dirac equation is derived from L , by taking ( E m : {2g(auS)(D,*+Sv,)
variation with respect to $, - b,b[[D:(a,S)]y1yk} $ I o = O . (3.20a)
[~b~Yyh(DZ+~uY)-m]J)=O, (3.13a) The last equation is rewritten as
o r equivalently, {2gY(a.SN,+ g Y Y [ v u ( ~ , S ) 1
- -
( i t b ~ y k V , 3tia,y5yk m ) $ =0 , (3.13b) +z =f0~ @(3.20b)
3 ~. ~ ~ j m b m Y ( 3 Y S ) a S 0
where { v Y } i s the vector part of the torsion tensor, with help of the relation between D,* and V , ,
vp=Ttiu, (3.14) D:4,= (V,+$ iKf,vSij)40,
(3.21)
and only the gravitational interaction is included. D:(aYs)=v,(aYS) -K?,,,,a,S,
with { K i j v }and {KAYv}
being the contortion tensor
B. Equations of motion for massive Dirac particles
defined by
We shall derive two equations of motion for a
freely falling Dirac particle, i.e., the equation of Ki,= biAblYK,Y
orbit and the equation of spin pPecession, by = i b ibjY(TiYV - T,,,- T,J
applying the WKB approximation method to the
= -A ijv. (3.22)
Dirac equation (3.13).
The particle of spin i is usually represented by The applicability condition of the semiclassical
a four-component spinor wave function obeying the solution (3.18) is that when it is used in (3.17) the
first-order Dirac equation, However, it is well t e r m s of o r d e r ( t / i ) are much l a r g e r than those
known that it can equally well be described by a of o r d e r ( t / i ) . Estimating I a,SI/ti -l/(wave-
two-component spinor wave function obeying a -
length) 1 / X , D,*$, @,/w with w the width of the
second-order wave equation. So there are two wave packet, and lD,*(a,S)l -E/XC with L being the
equivalent ways t o take the classical limit for the distance over which the parallel vector fields
particle of spin i, in accordance with which a {b*,} vary considerably, we obtain the following in-
wave equation is considered; a first-order wave equality:
equation o r a second-order one.
L>>X, w > > X . (3.23)
For our present purpose of deriving the spin equation in addition to the orbit equation, it is much more convenient to start from a second-order wave equation rather than from the Dirac equation (3.13). We thus introduce a two-component spinor wave function $\phi$ by20

$\phi = \tfrac{1}{2}(1 + \gamma_5)\psi$. (3.15)

Equation (3.19) is the Hamilton-Jacobi equation which describes the motion of freely falling particles in general relativity.21 The complete solution $S(x; \alpha_1, \alpha_2, \alpha_3)$ with three free parameters, $\alpha_1$, $\alpha_2$, and $\alpha_3$, determines the classical orbit by

$\partial S/\partial\alpha_a = \beta_a$ (= const), (a = 1, 2, 3). (3.24)
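The role of the Hamilton-Jacobi equation (3.19) and of the orbit condition (3.24) can be illustrated in the simplest possible setting. The sketch below is not taken from the paper: it assumes a flat background metric diag(-1, 1, 1, 1), the trial action S = -E t + p.x with the three momentum components playing the role of the free parameters, and the hypothetical helper names `energy`, `hamilton_jacobi_residual` and `orbit`.

```python
import math

# Flat background metric eta^{mu nu} = diag(-1, 1, 1, 1): an illustrative assumption.
ETA_INV_DIAG = (-1.0, 1.0, 1.0, 1.0)


def energy(p, m):
    """E = sqrt(|p|^2 + m^2) in units with c = 1."""
    return math.sqrt(sum(pa * pa for pa in p) + m * m)


def hamilton_jacobi_residual(p, m):
    """Left-hand side of (3.19) for the trial action S = -E t + p.x."""
    grad_s = (-energy(p, m),) + tuple(p)  # (dS/dt, dS/dx, dS/dy, dS/dz)
    return sum(g * d * d for g, d in zip(ETA_INV_DIAG, grad_s)) + m * m


def orbit(p, m, beta, t):
    """Solve dS/dp_a = beta_a, cf. (3.24), for x_a(t): x_a = beta_a + (p_a / E) t."""
    e = energy(p, m)
    return [b + (pa / e) * t for b, pa in zip(beta, p)]


if __name__ == "__main__":
    p, m, beta = (0.3, 0.4, 0.0), 1.0, (1.0, 2.0, 3.0)
    print("residual of (3.19):", hamilton_jacobi_residual(p, m))
    print("x(t = 5):", orbit(p, m, beta, 5.0))
```

In this flat-space case the orbit is a straight line with velocity p/E, which is the expected free-particle motion; in a curved background the same two steps apply, but S would have to be found from the full metric.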
3530 KENJI HAYASHI AND TAKESHI SHIRAFUJI 19
fir-
- --1
(four-velocity) ,
g8,S =(I (3.25)
dr m (3.32 c)
(3.26) In the integral of ( 3 . 3 2 ~ )compensation takes place
almost everywhere except in the space-time re-
Given the solution S ( x ; a l , a, a3)of (3.19), Eq. gion satisfying
(3.20) can be solved to define the spinor wave
function q50 in t e r m s of S. By virtue of (3.16), the
semiclassical expression f o r the Dirac spinor
wave function in t e r m s of S and cpo i s given by According to the condition (1) stated above, the
right-hand side of (3.33a) is negligibly s m a l l com-
pared to the macroscopic scale. Therefore, in the
$= exP(; s) $0 3 (3.27) macroscopic scale, the wave packet (3.32~)has
nonvanishing amplitude only along a world line
x ( T ) defined by
-
= (1 bkU,yk)@,,. (3.28) (3.33b)
The probability current, j = b,$y$, then takes Here T is the proper time along the world line.
the following form in the semiclassical approxi- The wave packet thus Propagates along a classical
mation: trajectory ~ ( 7 )satisfying the geodesic equation
j=pU, (3.29) (3.26).
Now we turn our consideration to the motion of
where p is defined by spin f o r a spin-f particle described by the spinor
p= - 2b4U,&,y$, . (3.30) wave function (3.32~). The spin polarization is
described by the spinor wave function &(x(T); a),
Equation (3.20b) of cpo ensures that j satisfies the since other two factors of ( 3 . 3 2 ~ )are s c a l a r func-
continuity equation, tions which have nothing t o do with intrinsic spin
polarization. We introduce a new spinor wave
(3.31) function $ $ x ( T ) ; 3)by
The expression (3.29) f o r the probability current 1
shows that, in the semiclassical approximation,
$;=T $0.
5
T h j ~ v = (tAuv - tAuu) + $ (gAuJ,-gAyVp)
+fiuvpa * (4.8)
+ c z ( v u v I I )c+3 ( a u a u ) ) . (4.18)
The tensor {tAUv}has the following properties de-
rived from the defining equation (4.5): Comparing (4.13) with (4.18), we find that the
parameters a r e effectively (under integration
tAUv=tUAv 9 (4.9)
symbol) related to each other by
gt,,,= 0 =gA*tA,,, (4.10) 1 1
c l = a l +- , c =a --
tAUu+tUvA+tvAL&=O (4.11) 3K 3 K
(4.19)
3
The above postulates of (1) to (4) require that c3=a3+-
4K
.
the most general Lagrangian density be of the
form It should be mentioned that one of the f r e e param-
eters, c,, c2, and c3, must be nonzero; other-
+ u,(zv,) + a3(aua,)+a,,
L, =al(tAutArv) (4.12)
wise, the left-hand side of a gravitational field
where a,, a,, and a3 a r e f r e e parameters, while equation would become symmetric, while the
a, is a cosmological term. right-hand side, the energy-momentum tensor of
In Appendix A we will treat a case of lifting up spin-4 fundamental particles, would become non-
the postulate of (3), by adding to (4.12) parity- symmetric. This is a contradiction.
violating t e r m s like (va,) and ( ~ , , , , , t ~ ~ t ; In
~). It is easy ro derive a gravitational field equa-
the r e s t of t h i s paper we shall neglect the cosmo- tion. F o r the sake of completeness we write down
logical term, so we have the gravitational action of a gravitational field equation when matter i s
present, by adding to the vacuum gravitational with the Dirac Lagrangian LgR used in general
action a matter action I,, which satisfies the relativity
postulates of (1) to (4),
LgR=+ i b k B yVu$ - (V,$) y 141 -wz$$. (4.30b)
I=Z,+Z,, (4.20)
It is useful t o split the gravitational field equa-
tion into the symmetric and antisymmetric parts:
,Z = d4xGL,. (4.21)
G({ })+2KD~Fb+2KVAF~uA+
2KH
F r o m this action follows the following field equa-
tion, by taking variation with respect to the -KguvL= KT), (4.31)
parallel vector fields b,, and then multiplying with 2D:F[udb+ 2vbflIVIk=T[lYI (4.32)
qkfblP:
where
Guy({ })+ ~KD:F+ 2~v~~~+2~H
-KgL= KT. (4.22)
Here the first t e r m denotes the Einstein tensor of
general relativity,
GuY({})=R({})-~guuR({}), (4.23)
and the tensor {FUvA}
stands f o r Furthermore, it is often useful to rewrite the
field equation in Latin indices:
-
~ + c2( gvb-gkv)
F X = ~ ~ ( t tUA)
-4 c3<-ap
= -Fby. (4.24)
The fourth t e r m {H} is defined by
HY= p f i ~ ; - + TYWF~~=HY@, (4.25)
which is shown to be symmetric upon inserting the
irreducible decomposition (4.8) of the torsion ten-
sor. Finally, L is given by
L= C l ( t w A r v ) +c*(vu,)+ c3(auau). (4.26)
A source t e r m is, as usual, defined by
6 I d 4 x G L , = /d4x=T,6bku
= -J dx=T:bb,. (4.27)
xu -- xu
1 bcz)--b(a) tropic form,
E
K(C, +Cz) We assume that a central gravitating body is a
- - '
1+K(CI+4Cz) (5'6) nonrelativistic system with all the components of
and a prime means differentiation with respect t o Fa being negligibly small compared t o ?ao; ?ao
>> I TuBl= 0. Then the gravitational field is weak;
Y. It is shown i n Appendix C that the constant
[1+K ( C ~+ 4c2)] is a nonzero number. the metric coefficients, A and B,a r e nearly unity,
There is no appearance of the parameter c3, but A = 1= B, and t e r m s quadratic i n A' and B' can be
only t h e parameters, c1 and cz, owing to a static, ignored in the field equation (5.4) with (5.5). We
then find that the gravitational field equation i n the
isotropic gravitational field. In other words, we
Newtonian limit is given by
can s a y nothing about the parameter c3 i n this
case. Now we proceed to study a solution of the
field equation (5.4) with (5.5) f o r the following
+- -
2 [A'+ (1 26)B'] = K P o ,
three cases: (A) the Newtonian limit, (B) the post-
Newtonian approximation, and (C) an exact s o h -
Y I (5.7a)
tion i n vacuum.
$(1 - 2\epsilon)A' + B' = 0$. (5.7b)

The external solution satisfying the boundary condition,

$\lim_{r\to\infty} A(r) = \lim_{r\to\infty} B(r) = 1$, (5.8)

is

$A(r) = 1 - \frac{2}{(1-\epsilon)(1-4\epsilon)[1+\kappa(c_1+4c_2)]}\,\frac{Gm}{r}$, (5.9a)

$B(r) = 1 + \frac{2(1-2\epsilon)}{(1-\epsilon)(1-4\epsilon)[1+\kappa(c_1+4c_2)]}\,\frac{Gm}{r}$, (5.9b)

with m the total mass of the source,

$m = \int \rho_0(x)\,d^3x = 4\pi\int r^2 \rho_0(r)\,dr$. (5.10)

It was found in Sec. III that the trajectory of a test particle is determined by the geodesic equation (3.26), which reduces for a nonrelativistic particle to

$\frac{d^2x^\alpha}{dt^2} = -\frac{1}{2}\frac{\partial A}{\partial x^\alpha} = -\frac{1}{(1-\epsilon)(1-4\epsilon)[1+\kappa(c_1+4c_2)]}\,\frac{\partial}{\partial x^\alpha}\left(-\frac{Gm}{r}\right)$. (5.11a)

Here the solution (5.9a) is used in the final step. We demand that the trajectory of a nonrelativistic test particle, specified by $x^\alpha(t)$, obeys the Newton equation of motion

(5.11b)

where $\phi$ is a gravitational potential, which takes the form

$\phi = -Gm/r$ (5.11c)

for a gravitational field around a spherical body with mass m. Accordingly, the parameters, $c_1$ and $c_2$, must satisfy the condition

$(1-\epsilon)(1-4\epsilon)[1+\kappa(c_1+4c_2)] = 1$, (5.12a)

which we shall assume hereafter. This condition is called the Newton approximation condition. In terms of $F_1 = \kappa c_1$ and $F_2 = \kappa c_2$, the Newton approximation condition reads as

$4F_1 + F_2 + 9F_1F_2 = 0$. (5.12b)

From this follow the two cases, $c_1 = 0 = c_2$ and $c_1 \neq 0 \neq c_2$. See Fig. 2 for the curve specified by (5.12b). Now, combining (5.6) and (5.12b), we find

$c_1 = -\frac{\epsilon}{3\kappa(1-\epsilon)}$, $c_2 = \frac{4\epsilon}{3\kappa(1-4\epsilon)}$. (5.12c)

Since $\epsilon$ is observable in solar-system experiments, as will be shown in Sec. VI, we draw the curves of (5.12c) versus $\epsilon$ in Fig. 3.

FIG. 2. The curve of (5.12b): $F_1 = \kappa c_1$ and $F_2 = \kappa c_2$ [$F_2 = -4F_1/(1 + 9F_1)$].

B. Vacuum solution in the post-Newtonian approximation

The field equation (5.4) with (5.5) can be rewritten in vacuum as follows,

(5.13a)

(5.13b)

(5.13c)
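The equivalence between the Newton approximation condition (5.12a) and the curve (5.12b) can be checked numerically. The following is a minimal sketch, assuming that (5.6) reads $\epsilon = \kappa(c_1 + c_2)/[1 + \kappa(c_1 + 4c_2)]$ and working throughout with $F_1 = \kappa c_1$ and $F_2 = \kappa c_2$; the function names are illustrative and do not come from the paper.

```python
def epsilon(f1, f2):
    """Parameter epsilon of (5.6), written with F1 = kappa*c1 and F2 = kappa*c2."""
    return (f1 + f2) / (1.0 + f1 + 4.0 * f2)


def newton_condition_lhs(f1, f2):
    """Left-hand side of (5.12a); equals 1 when the Newtonian limit is recovered."""
    eps = epsilon(f1, f2)
    return (1.0 - eps) * (1.0 - 4.0 * eps) * (1.0 + f1 + 4.0 * f2)


def f2_on_curve(f1):
    """Solve (5.12b), 4*F1 + F2 + 9*F1*F2 = 0, for F2 (requires F1 != -1/9)."""
    return -4.0 * f1 / (1.0 + 9.0 * f1)


if __name__ == "__main__":
    for f1 in (-0.05, 0.0, 0.02, 0.1):
        f2 = f2_on_curve(f1)
        print(f1, f2, epsilon(f1, f2), newton_condition_lhs(f1, f2))
```

Every point generated by `f2_on_curve` should give a left-hand side of (5.12a) equal to 1 up to rounding, whereas a generic pair such as F1 = F2 = 0.1 does not.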
(5.17a)
we obtain
where M i s the gravitational m a s s of a central
gravitating body, and p, y , and 6 are.expansion
parameters to be determined by the field equa-
tion. Using (5.14) i n (5.13), and putting each order
of (GM/r) equal to zero, we find that the param- = (1 -F)(l+F), (5.21)
e t e r s , p, y , and 6, are given by
where two constants, p and q , a r e defined by
p=1-c/2, y=l-2, 2
(5.15) p ~-{[(l- ~ ) ( 14~)]"'
- -2 ~ }
6 = i(l- 3~+ $ E ' ) . 1 - 5E
It is to be noticed that the Newton approximation =~+E+o(E'~,
condition (5.12a) is not used to derive (5.15), al- n (5.22a)
though the above results a r e consistent with
(5.12a).
+cF+g)((l-Z~h
2 A B
A' B' (5.16) +~)=0. It can be shown by direct calculations that this
solution indeed satisfies the field equation (5.13).
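As a cross-check on the expansion parameters quoted in (5.15), note that for $\epsilon = 0$ they reduce to $\beta = \gamma = 1$, the familiar values for the Schwarzschild metric in isotropic coordinates. The sketch below reads these two numbers off the isotropic Schwarzschild metric at small $x = GM/r$. It assumes the standard Eddington-Robertson form of the expansion, $A = 1 - 2x + 2\beta x^2 + \dots$ and $B = 1 + 2\gamma x + \dots$, which may differ in detail from the paper's (5.14); all function names are hypothetical.

```python
def schwarzschild_isotropic(x):
    """Metric functions A, B of ds^2 = -A dt^2 + B dx.dx for x = GM/r (standard result)."""
    a = ((1.0 - x / 2.0) / (1.0 + x / 2.0)) ** 2
    b = (1.0 + x / 2.0) ** 4
    return a, b


def fitted_beta_gamma(x):
    """Estimate beta and gamma from the assumed expansion
    A = 1 - 2x + 2*beta*x^2 + ...,  B = 1 + 2*gamma*x + ...  at small x."""
    a, b = schwarzschild_isotropic(x)
    beta = (a - 1.0 + 2.0 * x) / (2.0 * x * x)
    gamma = (b - 1.0) / (2.0 * x)
    return beta, gamma


def ppn_from_epsilon(eps):
    """beta and gamma as quoted in (5.15)."""
    return 1.0 - eps / 2.0, 1.0 - 2.0 * eps


if __name__ == "__main__":
    print("fitted at x = 1e-3:", fitted_beta_gamma(1e-3))
    print("from (5.15) at eps = 0:", ppn_from_epsilon(0.0))
```

The fitted values approach 1 as x decreases, with corrections of order x from the next terms of the expansion.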
(5.24)

in a static, isotropic gravitational field. The invariant distance $ds^2$ of (5.3a) becomes

(5.25)

where we have introduced the spherical polar coordinates by

$x^1 = r\sin\theta\cos\phi$, $x^2 = r\sin\theta\sin\phi$, $x^3 = r\cos\theta$. (5.26)

If the parameter $\epsilon$ of (5.6) is exactly zero, then two constants, p and q, are exactly equal to 2, and hence this metric coincides with the Schwarzschild metric written in the isotropic coordinates:

$ds^2 = -\left(\frac{1 - GM/2r}{1 + GM/2r}\right)^2 dt^2 + \left(1 + \frac{GM}{2r}\right)^4 [dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)]$. (5.27)

A. The equivalence principle

It has been verified experimentally to very high accuracy26 that the world line of a freely falling test body is independent of its composition and structure. The equivalence principle implies that the unique world line of a test body coincides with the geodesics of the metric $g$. It was shown in Sec. III that by taking the short-wavelength limit of the Maxwell and Dirac equations the photon and Dirac particles in the classical limit are to travel along the geodesics of the metric $g$. Thus, new general relativity is compatible with the equivalence principle in this limit.

In general relativity implications of the equivalence principle are concisely expressed by the conservation law,

$\nabla_\nu T_{GR}^{\mu\nu} = 0$, (6.1)

where $\{T_{GR}^{\mu\nu}\}$ is the matter energy-momentum tensor appearing on the right-hand side of the Einstein field equation. It follows from the conservation law that the world line of a freely falling test body is the geodesics of the metric $g$. The characteristic feature of general relativity is that the conservation law of (6.1) is a consequence of the Einstein gravitational field equation, and hence that mechanical equations of motion for matter are consequences of the same gravitational field equation.

Now we shall show that almost the same property holds also in new general relativity based on the Weitzenböck space-time. From the invariance of the gravitational action under the group of general coordinate transformations follows the identity27

(6.2)

where

(6.3)

After slight modification, this identity can be rewritten as

$\nabla_\nu B^{\mu\nu} - K^{\nu\lambda\mu} B_{\nu\lambda} = 0$, (6.4)

where $\{K^{\nu\lambda\mu}\}$ is the contortion tensor given by (3.22). From the definition of (6.3) it follows that the gravitational field equation takes the form

$B^{\mu\nu} = T^{\mu\nu}$ (6.5)

with the matter energy-momentum tensor $\{T^{\mu\nu}\}$ defined by (4.28). Using (6.5) in the identity (6.4), we get the response equation to gravitation,

$\nabla_\nu T^{\mu\nu} - K^{\nu\lambda\mu} T_{\nu\lambda} = 0$. (6.6)

This is the conservation law of new general relativity, corresponding to the conservation law (6.1) of general relativity. The energy-momentum tensor $\{T^{\mu\nu}\}$ is not symmetric in new general relativity. However, an antisymmetric part $\{T^{[\mu\nu]}\}$ is due to the contribution from the intrinsic spin
of spin-1/2 fundamental particles. For macroscopic bodies such as a test body employed in terrestrial experiments and astrophysical objects such as planets and stars, effects due to the intrinsic spin of spin-1/2 fundamental particles can be ignored, and hence their energy-momentum tensor can be supposed to be symmetric and of the same form as that of general relativity. Therefore, an energy-momentum tensor of macroscopic bodies satisfies the conservation law

$\nabla_\nu T^{\mu\nu} = 0$ (6.7)

owing to the antisymmetric property of the contortion tensor $\{K^{\nu\lambda\mu}\}$ with respect to $\nu$ and $\lambda$. The only exception seems to be compact stellar objects such as neutron stars and black holes: The spin direction of neutrons may happen to be aligned over the macroscopic scale inside neutron stars. If this is indeed the case, the gravitational response of neutron matter should be described by Eq. (6.6) instead of by the conservation law (6.1).

The equivalence principle is thus satisfied for macroscopic bodies in new general relativity, and the world line of a test body coincides with the geodesics of the metric $g$, although the metric $g$ itself may be different from that of general relativity.

In the microscopic scale new general relativity violates the equivalence principle, since effects due to the intrinsic spin of spin-1/2 fundamental particles cannot be ignored there, and an antisymmetric part of the energy-momentum tensor should be seriously taken into account. The motion of the intrinsic spin of a freely falling spin-1/2 fundamental particle, for example, does not satisfy the equivalence principle. As was shown in (3.37b), the spin vector $\{S^\mu\}$ obeys the equation of motion

$\nabla S^\mu/d\tau = -\tfrac{3}{2}\,\epsilon^{\mu\nu\rho\sigma} u_\nu a_\rho S_\sigma$ (6.8a)

with

to the hyperfine splitting of the atomic energy levels, and it shall be discussed in Sec. X.

B. Comparison with solar-system experiments

Since the invariant distance $ds^2$ of (5.3a) is written in the isotropic coordinates, the post-Newtonian parameters of the expansion (5.14a)-(5.14b), $\beta$ and $\gamma$, are the Eddington-Robertson parameters. Thus, by virtue of (5.15), the Eddington-Robertson parameters of new general relativity are given by

$\beta = 1 - \epsilon/2$, $\gamma = 1 - 2\epsilon$. (6.9)

The values of $\beta$ and $\gamma$ have been measured by the solar-system experiments:

$\gamma = 1.00 \pm 0.06$ (retardation of radio waves30), (6.10a)

$\gamma = 1.014 \pm 0.018$ (solar deflection31), (6.10b)

$\tfrac{1}{3}(2 + 2\gamma - \beta) = 1.003 \pm 0.005$ (perihelion advances32), (6.11)

$\eta \equiv 4\beta - \gamma - 3 = -0.001 \pm 0.015$ (lunar laser ranging33). (6.12)

From (6.9) it follows that the Nordtvedt parameter $\eta$ is vanishing in new general relativity;

$\eta = 0$. (6.13)

For the sake of safety, we here adopt the value (6.10b) for $\gamma$. Using (6.9) in (6.10b) and (6.11), we get

$\epsilon = -0.007 \pm 0.009$ from (6.10b), $\epsilon = -0.003 \pm 0.004$ from (6.11). (6.14a)

Combining these two values for $\epsilon$ as if they were independent, we are led to

$\epsilon = -0.004 \pm 0.004$. (6.14b)
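The numbers in (6.14a) and (6.14b) can be reproduced from (6.9)-(6.11) with a few lines of arithmetic. The sketch below is an illustration rather than the authors' procedure: it assumes that the two estimates are rounded to three decimals before being combined and that the combination is an inverse-variance weighted mean; the function names are hypothetical.

```python
import math


def eps_from_gamma(gamma, sigma):
    """Invert gamma = 1 - 2*eps from (6.9)."""
    return (1.0 - gamma) / 2.0, sigma / 2.0


def eps_from_perihelion(value, sigma):
    """Invert (2 + 2*gamma - beta)/3 = 1 - (7/6)*eps, which follows from (6.9)."""
    return (1.0 - value) * 6.0 / 7.0, sigma * 6.0 / 7.0


def combine(estimates):
    """Inverse-variance weighted mean of independent (value, sigma) pairs."""
    weights = [1.0 / s ** 2 for _, s in estimates]
    mean = sum(w * v for w, (v, _) in zip(weights, estimates)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


def nordtvedt_eta(eps):
    """eta = 4*beta - gamma - 3 with beta and gamma taken from (6.9)."""
    return 4.0 * (1.0 - eps / 2.0) - (1.0 - 2.0 * eps) - 3.0


if __name__ == "__main__":
    raw = [eps_from_gamma(1.014, 0.018), eps_from_perihelion(1.003, 0.005)]
    rounded = [(round(v, 3), round(s, 3)) for v, s in raw]  # assumed rounding step
    print("individual estimates:", rounded)
    print("combined estimate:", combine(rounded))
    print("Nordtvedt parameter:", nordtvedt_eta(-0.004))
```

The epsilon-dependence cancels identically in `nordtvedt_eta`, which is the content of (6.13). Combining the unrounded estimates instead gives a slightly smaller central value, so the rounding assumption matters at the last quoted digit.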
-
J U ~ = b i P b , Y J ~ ~ P = _ ~ g Ua0Y P U (7.6)
Taking the combination of [(7.2) + (K/X)x (7.3)], the gravitational field equation is rewritten as
G({})+L~~TPl (7.7)
with {L} defined by
L= -
~{aA[CUPGA(T!,,-T ; ~ ) + f Y P a A ( TTi;)]
~,,- - 3aa- ~ g u Y a P a , + 3 ~ p Y o ~ ( b + paap,vaoi ) } , (7.8)
I
where $a_i = b_i{}^{\mu} a_\mu$ is a scalar with respect to general coordinate transformations.

As is evident from the definition of the torsion tensor (4.4) and its irreducible components (4.5)-(4.7), the second term $L_{\mu\nu}$ of (7.7) does not transform like a tensor under a local Lorentz transformation

$$b'^{k}(x) = \Lambda^{k}{}_{l}(x)\, b^{l}(x), \qquad \Lambda^{k}{}_{m}(x)\,\eta_{kl}\,\Lambda^{l}{}_{n}(x) = \eta_{mn}. \qquad (7.9)$$

The energy-momentum tensor of the electromagnetic field depends on the parallel vector fields $b$ only through the metric tensor, $g_{\mu\nu} = b^{i}{}_{\mu}\,\eta_{ij}\,b^{j}{}_{\nu}$, and hence it is locally Lorentz invariant. The energy-momentum tensor of spin-1/2 fundamental particles, however, is not locally Lorentz invariant, due to the second term of the second line of (4.30a), i.e., f A,, b,K~0,&~~y~$. Thus, the energy-momentum tensor of matter is not locally Lorentz invariant, unless effects due to the intrinsic spin of spin-1/2 fundamental particles can be neglected. Therefore, the gravitational field equation (7.7) is not invariant under a local Lorentz transformation.

The gravitational field equation is considerably simplified in the particular case which satisfies the following two conditions: (1) The axial-vector part of the torsion tensor vanishes identically,

$$a^{\mu} = \tfrac{1}{6}\,\epsilon^{\mu\nu\rho\sigma}\, b^{k}{}_{\nu}\,(\partial_\sigma b_{k\rho} - \partial_\rho b_{k\sigma}) = 0, \qquad (7.10)$$

and (2) effects due to the intrinsic spin of spin-1/2 fundamental particles can be neglected. The first condition implies that the left-hand side of (7.7) becomes the Einstein tensor $G_{\mu\nu}(\{\})$. The second condition, on the other hand, allows us to treat spin-1/2 fundamental particles as if they were spinless; the energy-momentum tensor $T_{\mu\nu}$ on the right-hand side of (7.7) can then be identified with the energy-momentum tensor $T^{\mathrm{GR}}_{\mu\nu}$ used in general relativity. Thus, in this particular case the gravitational field equation (7.7) is identical with the Einstein field equation,

$$G_{\mu\nu}(\{\}) = \kappa\, T^{\mathrm{GR}}_{\mu\nu}. \qquad (7.11)$$

For example, suppose that the metric in the invariant distance,

$$ds^2 = -A(x)\,(dx^0)^2 + B(x)\,(dx^1)^2 + C(x)\,(dx^2)^2 + D(x)\,(dx^3)^2, \qquad (7.12)$$

is an exact solution of the Einstein field equation (7.11), where $A(x)$, $B(x)$, $C(x)$, and $D(x)$ are functions of $x$. Define the parallel vector fields $b = (b^{k}{}_{\mu})$ by
A',,(x) = 6',+ d,,(lC), w , ~ +wA, =0, (7.12). The parallel vector fields, defined by
(7.13), in such coordinates are equivalent to the
lwJk I << I (8.3) parallel vector fields of (8.7) defined in the iso-
the condition (8.2) becomes tropic coordinates. An example i s given by the
f u v ~ uI h
spherical polar coordinates ( t , Y,0, $1 introduced
b "b pW,*GC),, = 0 , (8.4) by (5.26): The Schwarzschild metric reads as
b y neglecting the second- and higher-order t e r m s d s2 =- A( y ) d t 2 +B( r. ) d r2 +C ( Y) d B 2 +D0)d@
(y, (8.9)
of wlb. Since the condition (8.4) i s linear in wIb,
the infinitesimal neighborhood of the unit element with A and B still given by (8.6), and
in h(4)has some of a Lie-algebra property: The
inverse, @-')Ik = 6', -
wJh, and the product, @'A)',
= 6',+ w' + w I I , satisfy (8.4) f o r any two infinites- (8.10)
imal local Lorentz transformations, A and A',
belonging to h ( b ) .
As a n e x a m p c of the extended WeitZenback
space-time, consider the static isotropic space-
Thus, the system of four orthonormal vectors, t',
time, which has the Schwarzschild metric for
the present case of c , = 0 = cz. Written in the iso- (8.11)
tropic coordinates used in Sec. V, the Schwarzs-
child metric is expressed in the isotropic form,
,
ds2 = -A(y)dtz + B ( ~ ) d f d P (8.5) with
with
Then, dropping a prime on A,,, we finally get Here we assume that an antisymmetric field
{A,,,,} i s so weak that we can neglect the second-
and higher-order t e r m s of {A,,").
In Sec. 11, the Dirac spinor wave function was
Since 5, decrease as l/r, the change of the space introduced by referring to the parallel vector
components, &I, a,$,
,- = decreases as l/rz, fields b; we denote it here by $ b . The Dirac spin-
and hence the ( I / r ) t e r m s of A , , do not change o r wave function $, , which is defined by referring
to the tetrad fields g, is related to $a by the local
under this gauge transformation. The expression
(9.35) is to be compared with the asymptotic Lorentz, transformation (10.2);
expression for ho,(Z, t),41 =U(A)$*,U(A)=l -&AwvSUv. (10.3)
It should be remarked h e r e that the spinor wave
(9.36) function qe is usually used in atomic physics to
describe the electron.
where $={Mu}is a total angular momentum of
Suppose that & satisfies the Dirac equation
the source,
(3.13b), then Eq. (10.3) implies that i), satisfies
M,(t)=c,,r/d3xX8Tboi(k, t). (9.37) (ips ,,- -
$a,,y 5y' m)qe = o (10.4)
by virtue of the following property of the covar-
See Table I f o r an illustration of {h,,,,] and pUv} iant derivative V,,:
in the asymptotic region.
moment of the proton; M,, 3p, and g, a r e the The last t e r m of (lOD15), which consists of two
mass, the spin, and the gyromagnetic ratio of the parts, describes the spin-spin interaction of the
proton, respectively. The Dirac equation (10.4) electron and the proton: One is due to the magnet-
then becomes ic moment of the proton, and the other due to an
antisymmetric field.
[ i y ' ( a , + i e A , ) - f a , y 5 y v ' -m]JI,=O , (10.8) The spin-spin coupling due t o a n antisymmetric
for the electron in hydrogen atom. field is not restricted to the c a s e of the electron
F o r the proton at rest a t the origin, the axial- and the proton, but quite universal. F o r any two
vector current of (9.13) is given by spin-f particles, A and B , separated by 7 , we
J,, = 0, 3, = 8, 6'((x3 . (10.9)
can show in the similar way that in the nonrela-
tivistic approximation the coupling with an anti-
Use of this in (9.11)-(9.12) shows that space- symmetn'c field leads to universal spin-spin cou-
space components of the antisymmetric part of pling,
T P v , T L a e , , vanish identically; therefore, we
find that
Aa~=O (1O.lOa)
around the proton. On the other hand, the (Oa)
components of an antisymmetric field a r e given
by (9.35):
(10.10b)
where 5, and 8, are the spin vectors of the spin-
Using (10.10a)-(10.10b) in (9.6b), we obtain the + particles, A and B, respectively. This spin-
axial-vector part of the torsion tensor around the spin coupling makes a contribution to the hyper-
proton at rest, fine splitting of energy levels in atoms and muon-
ium (the bound state of an electron and a positive
(10.11)
muon).
Let u s first consider the hyperfine structure in-
In order to evaluate the effects due to an anti- terval Av(H) of the ground state of the hydrogen
symmetric field, we rewrite the Dirac equation atom. We denote by A vQED( H ) the theoretical value
(10.8) into two-component wave equations, which is based on conventional quantum electrody-
namics and on the assumption that the proton is a
Dirac particle without internal structure. Adding
possible corrections to AvQeo ( H ) , we express
AU(H) as
Av(H) = A v ~ ~ ~ (t H ) [ ~a,&)].
6$'+ (10.17)
(10.12b)
where we put Here 6$) is the correction due t o internal structure
of the proton: The precise value of 6(;) is not
(10.13) known at present, but it is estimated to be 1-2
The last term 6,(H) is a possible correc-
and used the standard representation of the y tion which a r i s e s from universal spin-spin cou-
mat rice^.^' Here 5 denotes the momentum opera- pling of (10.16): From the expression (10.15) f o r
tor; p a = - i a / a x a .
In the Pauli approximation, the Hamiltonian, we obtain
in which (10.12b) may be approximated to
A/16n =O.OlZX -x (GeV)' . (10.18)
6A(H)=ezgp
/4mM, 4n
6,(ep) 5 . (10.24)
(ahu)= 'f" --
CD
xu
This upper limit can be improved, provided that k 0 (D6: + F z x ' x "
D
+ Ff,,,xaZ/(D2 +r2F2)
the fundamental constants pu/pp and a would be (1l.lb)
known with higher precision. Using (10.23) in
(10.24), we obtain and the invariant distance ds2is expressed in a
rotationally invariant form,
L 3X10m4(GeV)-2 .
h
-
4n (10.25) d s 2 = - ( C 2 - r 2 H 2 ) d t 2 +2DHdt(xudxu)
7- -
lim bk, = bkr, limb, = 6,
I- -
Then unknown functions, C , D, F and H , satisfy
. (11.11)
Here K is the Einstein gravitational constant,
K = 8 r G / c 4= 8nG, and A i s a new parameter,
bounded by X/4n < lo- Ec/(GeV) from precise
lim C ( t , r )= lim D ( t , r ) = 1, (11.12) experiments in quantum electrodynamics. (We
r-- 7-
leave open the possibility that X would be equal
I- - r- -
lim r H ( t , r )= lirn r F ( t , r )= 0,
I- -
limrP(t,r)=limrQ(t,r)=O.
r- - (11.14) From this action follows the gravitational field
equation,
Because of the boundary condition (11.13) for F , G({ }) + L = KT , (12.2)
the integral in the exponent of (11.10) converges
-
for r -a, and s o the exponential factor of (11.10) where
approaches a finite positive value for r-00.
Therefore, in o r d e r to satisfy the boundary con-
dition (11.14), the unknown function f ( t ) must
vanish, and hence we get
P ( l , r )= 0 , Q ( t , r )= 0 , (11.15) (12.3)
by virtue of (11.8b) and (11.10). It then follows Here a, = biua,, i s a vector with respect to global
from (11.4) that the axial-vector part of the tor- Lorentz transformations, but a s c a l a r with res-
Transformation
Riemann-Chr istoffel
curvature tensor
General coordinate
Torsion tensor
T?, = b:($bk,
General coordinate
- 8,bh,,)
with G W = O
Quantum Graviton ; Graviton;
spin 2 and massless spin 2 and massless
Scalar particle; positive
energy, spinless and
massless
Theory Macroscopic Microscopic
Equivalence Yes Yes, for macroscopic
principle phenomena
No, for microscopic
phenomena
which is interpreted as the gravitational radiation which, in view of (9.6b), is equivalent to the
produced by the source {T,,"}. Inspection of (A151 vanishing of the axial-vector p a r t of the torsion
shows that if {BAT""} does not identically vanish, tensor;
the field {h,,,} propagates inside the light cone as
if i t i s massive. a"=O. (A. 18b)
It seems natural, however, to r e s t r i c t the Therefore, {A,,,,} can be represented as curl of a
theoretical framework of gravitation by requiring vector field {But,
that gravitational radiation should propagate on
the light cone with the speed of light. In view of
A,,, = a,B, - B,B,. (A.19)
this criterion, the case of c , f 0 should be disre- Using (9.11)-(9.12), (A18a), and (A19) in (A8), we
garded unless the energy-momentum tensor satis- find that the field equation of {A,,,,} i s rewritten a s
fies
aATiu=0 , (A.16)
in addition to the ordinary energy-momentum con- the retarded solution of which is given by (A19),
servation law (9.9). Therefore, we shall assume with B, defined by
(A16) hereafter. Then the spin tensor {s""} is
totally antisymmetric with respect to i t s three
indices, and is represented as (9.12).
It follows from (A13) and (A16) that the symme-
tric field {h,,,} satisfies the field equation It follows from (A18b) that an antisymmetric
- field does not couple with spin-$ fundamental
Ohuu=-2KT(,u, > (A.17) particles [see the Dirac equation (3.13b)l. On the
which is nothing but the field equation (9.17) in the other hand, the electromagnetic field i s decoupled
case of cq= 0. Consequently, we find that the from an antisymmetric field, since the f o r m e r
symmetric field {h,,,,} is not influenced af all by interacts with the gravitational field through the
the parity-violating c , term of L,. metric tensor (g,,,}. Consequently, an antisymme-
From (A12) and (A16) i t follows that' t r i c field (A19) does not interact with fundamental
particles and fields, and so it i s entirely devoid
a,,Z u u = o , (A.18a) of physical reality.
The present case of c,#O i s invariant under the f r e e to redefine the time coordinate and the radius
gauge transformation (A10). Using this gauge by
freedom, we have put the harmonic condition
(9.15), which i s necessary to eliminate unphysical t'=Q(t,r), x ' " = $ ( t , r ) x ~ , (B3)
components of the symmetric field {huv}. We are with 9 and J, a r b i t r a r y functions of t and Y.
still left with the freedom to perform a gauge Under arbitrary coordinate transformation xu
transformation (A10) with A, satisfying the - x 111 , the parallel vector fields {b',,} transform
like covariant vectors
d'hlembert equation,
OA,=O. (A22) b lU ( x ' ) = ( 8xY/8x'")b*,(x) . (B4)
It follows from (A21), however, that {BJ satisfies For a redefinition (B3) of t and r , the transform-
the inhomogeneous d' Alembert equation, ation coefficients (ax"/axru) a r e given by
at/at'=($+rqWa,
axa/at' = -($/A)%,,
if matter exists. Accordingly, a gauge trans- (B5)
at/axra = - ( ~ ~ / r a ) x ~ ,
formation (A10) with (A22) is insufficient to make
an antisymmetric field (A19) vanishing in that
space-time region where there exists nonvanishing
source, {JS& 0.
Therefore, an antisymmetric field (A19), al-
though i t is unphysical, cannot be eliminated by
a symmetry transformation of the present case of
c,+ 0. This situation is to be contrasted with that
of the electromagnetic field, in which unphysical
components of the electromagnetic potential {A,,)
can be eliminated by choosing an appropriate Using (B5) in (B4), we obtain
gauge. It is unreasonable to accept a theory in-
volving such unphysical degrees of freedom that
cannot be removed by a symmetry transformation b'LO),,=$ [ ( J , + r J I * ) C - & G ] , (B7a)
of a theory. Consequently, we should disregard
the case of c,#O.
The parallel vector fields (B2) then take the fol- compwents, b(')", must vanish. The parallel
lowing form: vector fields (B11) then become
0) >
(bk,)=(" 0314)
0 D6.m
where C and D are unknown functions of Y alone.
Further reduction of {bk,) i s impossible: Any
of C, D, F, and H cannot be put to zero in ad- APPENDIX C: PROOF OF l+(cl +4c2)# 0
dition to E and G. This is evident from (B7a) and
(B7d) for C, D , and F. To prove this for H, we The field equations f o r the static isotropic gravi-
assume that the (a0) components, b('),,, were el- tational field become
iminated from ( B l l ) by a suitable redefinition of
t and r , then Eqs. (B?b)-(B7d) show that the func-
- [I- K(C1 - 2C2)B"
-K(C1 4- C 2 ) A "
n
'A. Einstein, (a) Sitzungsber. Preuss. Akad. Wiss. 217 "R. WeitZenback, Inuariantentheorie (Noordhoff, Gron-
(1928);@) 224 (1928); (c) 2 (1929);(d) 156 (1929); (e) ingen, 1923); Chap. XIII, Sec. 7.
18 (1930);(f) 401 (1930). "See, e.g., E. T. Davies and K. Ywo. in Convegno
'A. Einstein and W. Mayer, Sitzungsber. Preuss. Akad. Internazionale Celebratiuo del Centenario della Nas-
Wiss. 110 (1930). cita di Tulli Levi-Ciuita, Atti dei Convegni Ltncei
k. Mbller, K. Dan. Vidensk. Selsk. Mat. Fys. Skr. 1, (Academia Nazionale dei Lincei, Roma, 1975); p. 53.
No. 1 0 (1961). '3Throughout this paper we mean by "geodesics" the
4C. Pellegrini and J. Plebanski, K. Dan. Vidensk. Selsk. shortest (or longest) possible path between two points,
Mat. Fys. Skr. 2, No. 4 (1962). "length" being measured by the metric g.
k. Mbller, K. Dan. Vidensk. Selsk. Mat. Fys. Skr. 89, "See, for example, K. Hayashi and T. Shirafuji, Prog.
No. 13 (1978). Theor. Phys. 57, 302 (1977).
6K.Hayashi and T. Nakano. Prog. Theor. Phys. a, 491 ' 5 ~convention
r of the y m a t r i c e s i s a s follows:
0967).
%.Miyamoto and T. Nakano, Prog. Theor. Phys. 2, {yi. r')=-2q''. S i ' = ( i / 4 ) [ y i , 7'1,
295 (1971).
*K.Hayashi, (a) Gen. Relativ. Gravit. 4, 1 (1973);@)
y 5= i y 'y 'y2y '.
Lett. Nuovo Cimentn 5, 529 (1972); (c)5, 739 (1972); In the spinor representation (2.20)of J I , the Y matrices
e,
(d) 2, 883 (1972);(e) Phys. Lett. 497 (1973); are
(f) K B , 497 (1973).
'K. Hayashi, Nuovo Cimento g , 639 (1973).
'OK. Hayashi, Phys. Lett.E, 441 (1977).
16C. Mbller, The Theory of Relativity (Clarendon, Ox- U. S. A.46, 871 (1960);Phys. Rev. Lett. 2,215 (1960).
ford, 1952). "A. S. Eddington, The Mathematical Theory of Relativ-
"Although the magnitude of spin vanishes in the classi- i t y (Cambridge Univ. Press, Cambridge, England,
c a l limit F- 0, the spin polarization h a s the meaning- 1924), 2nd edition, p. 105; H. P. Robertson, in Space
ful classical limit. The classical equation of spin Age Astronomy, edited by A. J. Deutsch and W. B.
precession in a homogeneous electromagnetic field is Klempler (Academic, New York, 1962). p. 228.
now well established and employed in the experimental "J. D. Anderson et a l . . Astrophys. J. 200, 221 (1975).
study of the anomalous magnetic moment of muons and "E. B. Fomalont and R. A. Sramek, Phys. Rev. Lett.
electrons. See, for example, V. Bargmann, L. Michel, 36, 1475 0976).
and V. L. Telegdi, Phys. Rev. Lett. 2, 435 (1959). '%otnote 27 of I. I. Shapiro et al.. Phys. Rev. Lett.
WKB approximation method was f i r s t applied to 36, 555 (1976).
the Dirac equation in the electromagnetic field by 3 E G . Williams et al., Phys. Rev. Lett. 2, 551 (1976);
W. Pauli, Helv. Phys. Acts?, 179 0932). The classi- I. I. Shapiro et al., ibid. 36. 555 0976).
c a l equation of spin precession in the homogeneous %ee, for example. J. Ehlers and W. Kundt, in Gravita-
magnetic field was l a t e r derived by this method in tion, edited by L. Witten (Wiley, New York, 1962);
S. I. Rubinow and J. B. Keller, Phys. Rev. 131,2789 and W. Kinnersley, in General Relativity and G m v -
(1963); K. Rafanelli and R. Shiller, ibid. 3,B279 itation. edited by G. Shaviv and N. Rosen (Wiley, New
(1964). York, 1975).
''See, for example, R. P. Feynman and M. Gell-Mann, 35A. Friedmann, Z. Phys. lo,377 (1922); z, 326 (1924).
Phys. Rev. 109, 193 (1958). 36D. G. Boulware, Phys. Rev. D ll, 1404 (1975);l2,
the spinor representation, Cp is indeed a two-com- 350 (1975).
ponent spinor. We c a n a s welluse $=;(I -v5$ in- 37M. D. Kruskal, Phys. Rev. 119, 1743 (1960);G. Sze-
stead of (3.15) without any change in the result of the keres, Publ. Mat. Debrecen 7, 285 (1960).
classical limit. "The spin tensor { S x ' y } is taken to l o w e r t o r d e r in
"The Hamilton-Jacobi equation in classical mechanics the weak field, and so i t i s independent of the weak
i s treated in, for example, H. Goldstein, C h s i c a l field. The Tetrode formula (9.11) is equivalent to the
Mechanics (Addison-Wesley, Reading, Mass., 1950). total angular momentum conservation law,
Application to particle motion in general relativity can
be found in C. W. Misner, K. S. Thorne. and J. A. a ,,Mx""= 0,
Wheeler, Gravitation (Freeman. San Francisco, with MAWdefined by
1973).
"Then w c c a n put tl'= Q , O , O . O ) , br''=tSk', and so
MA P Y , ~AT',, - ,'TAU +s UY.
S"= (0, S) by virtue of (3.28) and (3.30). 39See, f o r example, S. Weinberg, Gmvitatwn and
23Theu matrix ( u k )is defined by ( u k ) = ( I , u ' , u z , u 3 ) . Cosmology (Wiley, New York. 1972). Chap. 10; o r
24aK,Hayashi and A. Bregman, Ann. Phys. (N.Y.) 15, C. W. Misner, K. S. Thorne, and J. A. Wheeler, Grav-
562 (1973);p. 597. itation (Freeman, San Francisco. 1973). Chap. 18.
24b1. M. Gel'fand. R. A. Minlos, and Z. Ya. Shapiro, 40Forquantization of the {A,,,,} field, see K. Hayashi,
Representations of the Rotation and Loren& Groups Phys. Lett. @, 497 (1973).
and Their Applications (Pergamon, Oxford, 1963). 4'See the second reference of Ref. 39, p. 449.
25R.C. Tolman, Relativity, Thermodynamics and Cos- "Here we u s e the standard representation of the y ma&
mology (Oxford Univ. P r e s s . Oxford. England, 1934), rices:
Eq. (82.14).
26P.J. Roll, R. Krotkov, and R. H. Dicke, Ann. Phys.
(N.Y.) 26, 442 (1964);V. B. Braginsky and V. I.
Panov, Zh. Eksp. Teor. Fiz. fi, 873 (1971) kov. "S. D. Drell and J. D. Sullivan, Phys. Rev. 154,1477
Phys.-JETP+, 464 (1971)J. (1967).
'?K.Hayashi, Lett. Nuovo C I m e n t o j , 529 0972). "E. N. Taylor, W. H. P a r k e r , and D. N. Langenberg,
28Fora spinning macroscopic test body such as a tor- Rev. Mod. Phys. 4 ,-l 375 (1969).
que-free gyroscope, the situation is different, and "The hyperfine s t r u c t u r e of muonium is reviewed both
the equation of the spin precession can be derived theoretically and experimentally by V. M. Hughes and
from the conservation law (6.7)by applying the method T. Kinoshita. in Mum Physics, edited by V. W. Hughes
developed by Papapetrou in general relativity; and C. S. Wu (Academic, New York, 1977), Vol. I,
A. Papapetrou, Proc. R. Soc. London-. 248 (1951); Chap. 11.
E. Corinaldesi and A. Papapetrou, ibid. E ,259 46G. Birkhoff, Relativity and Modern Physics (Harvard
(1951). See also L. Schiff, Proc, Natl. Acad. Sci. Univ. P r e s s , Cambridge, Mass.,, 1923), p. 253.
Brief Reports
Brief Reports are short papers which report on completed research which, while meeting the usual Physical Review standards of scientific
quality, does not warrant a regular article. (Addenda to papers previously published in the Physical Review by the same authors are
included in Brief Reports.) A Brief Report may be no longer than 3½ printed pages and must be accompanied by an abstract. The same
publication schedule as for regular articles is followed, and page proofs are sent to authors.
Takeshi Shirafuji
Physics Department, Saitama University, Urawa. Saitama 338, Japan
(Received 28 July 1981)
We make a short comment on our new general relativity formulated on the Weitzenböck space-time. The new
general relativity considered here has one free parameter besides the Einstein constant κ. The total action is
invariant under a class of local Lorentz transformations, besides being invariant under general coordinate and global
Lorentz transformations. The consequences of this restricted local Lorentz invariance are studied.
In a previous paper¹ we studied a gravitational theory based on the Weitzenböck space-time with absolute parallelism. This theory attributes gravity to the torsion of space-time, defined by²

$$T^{\lambda}{}_{\mu\nu} = b_i{}^{\lambda}\,(\partial_\nu b^{i}{}_{\mu} - \partial_\mu b^{i}{}_{\nu}) \qquad (1)$$

with $b = \{b^{i}\} = \{b^{i}{}_{\mu}\}$ a quartet of the parallel vector fields (or simply the parallel vector fields). Postulating that the gravitational Lagrangian density should be quadratic in the torsion tensor and conserve parity, we found that it is represented by

$$L_G = a_1\,(t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + a_2\,(v^{\mu} v_{\mu}) + a_3\,(a^{\mu} a_{\mu}), \qquad (2a)$$

where $a_1$, $a_2$, and $a_3$ are free parameters. Or equivalently, it is rewritten as

$$L_G = (1/2\kappa)\,R(\{\}) + c_1\,(t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + c_2\,(v^{\mu} v_{\mu}) + c_3\,(a^{\mu} a_{\mu}) + \text{a total derivative} \qquad (2b)$$

with $\kappa = 8\pi G$, $c_1 = a_1 + (1/3\kappa)$, $c_2 = a_2 - (1/3\kappa)$, and $c_3 = a_3 + (3/4\kappa)$, where $G$ denotes the Newtonian gravitational constant. Here $t_{\lambda\mu\nu}$, $v_\mu$, and $a_\mu$ are three irreducible building blocks of the torsion tensor, while $R(\{\})$ is the ordinary scalar curvature defined by the Christoffel symbol. We have compared the theory with solar-system experiments, and found that the parameters $c_1$ and $c_2$ are severely restricted by the currently available experimental data; $\kappa c_1 = 0.001 \pm 0.001$ and $\kappa c_2 = -0.005 \pm 0.005$. As for the parameter $c_3$, solar-system experiments do not give any restriction at all.

In view of this severe restriction on $\kappa c_1$ and $\kappa c_2$, we have proposed a particular model for which the parameters $c_1$ and $c_2$ are exactly vanishing; $c_1 = c_2 = 0$. It is the purpose of this paper to make a brief comment on the internal consistency of this model; in particular, we put forward an argument against a recent statement of internal inconsistency.³

The Lagrangian density of (2b) now becomes

$$L_G = (1/2\kappa)\,R(\{\}) + c_3\,(a^{\mu} a_{\mu}), \qquad (3)$$

with a total-derivative term neglected, and it has the following invariance property besides the invariance under general coordinate and global Lorentz transformations: namely, the $L_G$ of (3) is invariant under those local Lorentz transformations,

$$b'^{i}{}_{\mu} = b^{j}{}_{\mu}\,\Lambda_j{}^{i}(x),$$

or simply

$$b' = b\,\Lambda(x), \qquad (4)$$

with $\Lambda(x) = \{\Lambda_j{}^{i}(x)\}$ and $\Lambda^{T} \eta \Lambda = \eta$, which leave the axial-vector part of the torsion tensor $a^{\mu}$ invariant. Using the definition of $a^{\mu}$, $a^{\mu} = \tfrac{1}{6}\,\epsilon^{\mu\nu\rho\sigma} T_{\nu\rho\sigma}$, we see that the constraint imposed on $\Lambda(x)$ is given by

$$\epsilon^{ijmn}\, b_i{}^{\mu}\, b_n{}^{\nu}\, \Lambda_{jl}(x)\, \partial_\nu \Lambda_m{}^{l}(x) = 0, \qquad (5)$$

where $\Lambda_{jl}(x) = \eta_{lm}\,\Lambda_j{}^{m}(x)$. This property of $L_G$, which we shall hereafter refer to as the restricted local Lorentz invariance, is a characteristic feature of the present model with $c_1 = c_2 = 0$, and it has some consequences which we shall now discuss.

Consider the variation of the action under an infinitesimal local Lorentz transformation (4) constrained by (5), i.e., $\Lambda_j{}^{i}(x) = \delta_j{}^{i} + \omega_j{}^{i}(x)$ with

$$\epsilon^{ijmn}\, b_i{}^{\mu}\, b_n{}^{\nu}\, \partial_\nu \omega_{jm}(x) = 0, \qquad (6)$$
where wt,(x)= vlrnwrn,(x), w J i ( x ) =0, and where V , denotes the covariant derivative with r e -
Iw,,(z)[ << 1. The matter part of the action changes spect to the Ricci rotation coefficients formed of
like {b,}: the t e r m s in the square brackets a r e invar-
iant under any local Lorentz transformations.
Therefore, the Lagrangian density L, of (13) has
the required restricted local Lorentz invariance.
For the Rarita-Schwinger field of spin P, however,
= J d4x=(T(j1+ - - 1( G L6, , $ S i t q ) w , , ( x )
G 6q , the minimal Lagrangian density does not possess
(7) this invariance property, and one must add some
nonminimal coupling terms in an ad hoc manner to
where Ti is the energy-momentum tensor of mat- ensure the restricted local Lorentz invariance.
ter, For the gauge fields of internal symmetrv of the
c g T f J =b i , , 6 ( c g L,)/6b,, , (8) fundamental particles, such as photons, W *
mesons, 2 mesons, and gluons, the Lagrangian
and S f J is the infinitesimal Lorentz generator for density i s constructed by the usual Yang-Mills
the matter fields which we denote collectively by procedure, andhence it is described in t e r m s of
q . The gravitational part of the action, on the the metric tensor alone, namely, the Lagrangian
other hand, does not change under this transforma- density for the gauge fields of internal symmetry
tion: is invariant under any local Lorentz transforma-
tions.
6) d q x c g L y = - d 4 x c g B L 1 w , , ( x ) =0 , (9) The equations of motion for matter fields are de-
rived from L , by the action principle, and s o they
where B is defined by a r e covariant under the local Lorentz transforma-
c g B =-b,,6(~gLo)/6bi, . (10) tions constrained by (5). That is, we have
Therefore, the gravitational field equation, which
now reads as
Bfs T , (11) where w e denote the Lorentz transformation rule
o f q byqf=U(A)q.
requires that the TCshould also satisfy It is thus impossible to distinguish experimen-
1d 4 x 6 T C i J 1 w ,(x) = 0 (12)
tally (i.e., by observing the motion of test bodies)
two parallel vector fields b and b from one anoth-
e r , if these two parallel vector fields are related
when the matter fields obey their field equations to each other by a local Lorentz transformation
6(GgLM)/6= q 0. According to (7), this condition satisfying (5). In other words, these two parallel
of T is automatically guaranteed if and only if vector fields should be interpreted as physically
the matter part of the action is invariant under in- equivalent with each other. We can therefore
finitesimal local Lorentz transformations con- divide the s e t of parallel vector fields into equiva-
strained by (6). Namely, it is required by con- lence classes. Observing the motion of test bod-
sistency of the gravitational field equation that the i e s , one cannot unambiguously specify a single
matter part of the Laflangian density L , should quartet of parallel vector fields but only an equiv-
have the same restricted local Lorentz invariance alence c l a s s of parallel vector fields. Conse-
as the gravitational Lagrangian density of (3). quently, the underlying space-time of the present
The Lagrangian density of the fundamental par- model with c , c 2 = 0 is not a Weitzenbbck space-
ticles of spin *, which is derived from the special- time but a new c l a s s of space-time which may be
relativistic one by the minimal prescription, i s classified somewhere between the Riemann and the
given by WeitZenrock space-times. We shall call this new
L,= ( i / Z ) b / ( q y * e , q -e,qyq) -mqq class of space-time the extended Weitzenbbck
space-time.
=[(i/2)bh(q?bV,q -V,qY*q) -Wtqq] The gravitational field equation symbolically de-
+ $(qyby5q)a 7 (13) noted by (11)reads
where $\Lambda_{km}(x) = \eta_{mj}\,\Lambda_k{}^{j}(x)$.

The second term under the integral sign in (19) arises due to the fact that the local Lorentz transformation matrix $\Lambda(x)$ does depend on the parallel vector fields, and it represents the peculiar feature of the transformation law of the gravitational field equation in the present model with $c_1 = c_2 = 0$. Because of this second term in (19), the gravitational field equation is not covariant under the local Lorentz transformations constrained by (5). Nevertheless, it does follow from (14) and (19) that if $b'$ and $q'$ obey their field equations

$$\delta(\sqrt{-g}\,L)/\delta b' = 0, \qquad \delta(\sqrt{-g}\,L)/\delta q' = 0, \qquad (20)$$

then $b$ and $q$ also satisfy their field equations at the same time. This result indicates that the parallel vector fields are determined by the gravitational field equation up to a freedom of making the local Lorentz transformations constrained by (5).

We can thus conclude that the present model with $c_1 = c_2 = 0$ is internally consistent if the Lorentz transformation matrix $\Lambda(x)$ constrained by (5) depends smoothly on $b$. More specifically, we have assumed the integral representation (17) for $\Lambda_j{}^{i}(x)$, which we have not yet succeeded in proving.
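The restriction (5) on the local Lorentz transformations can be made concrete with a small numerical experiment. The sketch below is not taken from the paper: it assumes a flat background with the trivial tetrad, the signature (-, +, +, +), the sign convention $\epsilon^{0123} = +1$ for the permutation symbol, and hypothetical helper names (`rotated_tetrad`, `axial_torsion`). It evaluates $a^{\mu} = \tfrac{1}{6}\epsilon^{\mu\nu\rho\sigma} T_{\nu\rho\sigma}$ by finite differences for tetrads obtained from the trivial one by a position-dependent rotation in the 1-2 plane.

```python
import math
from itertools import permutations

ETA = (-1.0, 1.0, 1.0, 1.0)  # assumed Minkowski metric diag(-1, +1, +1, +1)


def perm_sign(p):
    """Sign of the permutation p of (0, 1, 2, 3), counted by inversions."""
    sign = 1
    for i in range(4):
        for j in range(i + 1, 4):
            if p[i] > p[j]:
                sign = -sign
    return sign


def rotated_tetrad(angle):
    """Return b(x) with b[k][mu]: the trivial flat tetrad rotated in the 1-2 plane by angle(x)."""
    def b(x):
        c, s = math.cos(angle(x)), math.sin(angle(x))
        return [
            [1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0],
        ]
    return b


def axial_torsion(b, x, h=1e-5):
    """a^mu = (1/6) eps^{mu nu rho sigma} b_{k nu} (d_sigma b^k_rho - d_rho b^k_sigma).

    The permutation symbol is used without a sqrt(-g) factor, which equals 1
    for the flat examples considered here.
    """
    # db[sigma][k][mu]: central finite difference of b^k_mu along x^sigma
    db = []
    for sigma in range(4):
        xp, xm = list(x), list(x)
        xp[sigma] += h
        xm[sigma] -= h
        bp, bm = b(xp), b(xm)
        db.append([[(bp[k][m] - bm[k][m]) / (2 * h) for m in range(4)]
                   for k in range(4)])
    b0 = b(x)
    a = [0.0] * 4
    for mu, nu, rho, sigma in permutations(range(4)):
        curl = sum(ETA[k] * b0[k][nu] * (db[sigma][k][rho] - db[rho][k][sigma])
                   for k in range(4))
        a[mu] += perm_sign((mu, nu, rho, sigma)) * curl / 6.0
    return a


x0 = [0.3, -0.2, 0.5, 0.7]  # arbitrary sample point (t, x1, x2, x3)
a_trivial = axial_torsion(rotated_tetrad(lambda x: 0.0), x0)
a_allowed = axial_torsion(rotated_tetrad(lambda x: x[1]), x0)    # angle depends on x^1
a_forbidden = axial_torsion(rotated_tetrad(lambda x: x[3]), x0)  # angle depends on x^3
print(a_trivial, a_allowed, a_forbidden)
```

In these conventions the trivial tetrad and the rotation whose angle depends on $x^1$ both give a vanishing $a^{\mu}$, so on this background the latter leaves the axial-vector part invariant and the two tetrads would fall into the same equivalence class. The rotation whose angle depends on $x^3$ produces $a^{0} = 2/3$ and is therefore excluded by the constraint.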
¹K. Hayashi and T. Shirafuji, Phys. Rev. D 19, 3524 (1979).
²We use the same notations and conventions as in Ref. 1.
³W. Kopczyński, University of Cologne report (unpublished).
A. A. LOGUNOV and M. A. MESTVIRISHVILI
USSR State Committee for Utilization of Atomic Energy
Institute for High Energy Physics, Serpukhov, Moscow
In the present paper a relativistic theory of gravity (RTG) is unambiguously constructed on the basis
of the special relativity and geometrization principle. In this, a gravitational field is treated as the
Faraday-Maxwell spin-2 and spin-0 physical field possessing energy and momentum. The source of a
gravitational field is the total conserved energy-momentum tensor of matter and of a gravitational field in
Minkowski space. In the RTG, the conservation laws are strictly fulfilled for the energy-momentum and
for the angular momentum of matter and a gravitational field. The theory explains the whole available
set of experiments on gravity. By virtue of the geometrization principle, the Riemannian space in our
theory is of field origin, since it appears as an effective force space due to the action of a gravitational field
on matter. The RTG leads to an exceptionally strong prediction: The Universe is not closed but just
flat. This suggests that in the Universe a missing mass should exist in a form of matter.
1. Introduction
In this paper the relativistic theory of gravitation (RTG) is constructed on the basis
of special relativity, and the ideas of Poincaré, Minkowski, Einstein and Hilbert are
developed further. The investigations of the authors1) are also reflected and
developed here.
At first the principle of relativity was applied to mechanical phenomena only. But then
Henri Poincaré formulated it as the universal principle for all physical phenomena:~)
The laws of physical phenomena should be the same both for an observer at rest and for
one who is in a state of uniform translational motion. So, we do not and cannot
have any means to distinguish whether we are in such a motion or not. Even now it is
commonly thought that the essence of the principle of relativity is restricted to the existence of
only one class of coordinate systems, the so-called inertial reference frames, within which
physical processes take place in the same way. However, as shown in Ref. 4), the
pseudo-Euclidean space-time geometry discovered by Minkowski allows one to formulate the
generalized principle of relativity, valid both for the class of inertial and that of noninertial
frames. The generalized principle of relativity was formulated in Ref. 4): Whichever
physical reference system is chosen, inertial or noninertial, one can always find an
infinite set of other frames, where physical phenomena are simultaneous with those in the
initial reference frame. Thus, we do not and cannot have any experimental means to
distinguish in what particular reference frame out of this infinite set we are.
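The statement that Minkowski space admits noninertial frames, each with its own metric tensor, can be illustrated numerically. The sketch below is not from the paper: it assumes units with c = 1, one spatial dimension, and a standard uniformly accelerated (Rindler-type) coordinate system with a hypothetical acceleration parameter `a`; the function names are illustrative. It pulls the flat metric back to the accelerated coordinates through a finite-difference Jacobian and compares the result with the closed form $-(1 + a x)^2\,dt^2 + dx^2$.

```python
import math

ETA = ((-1.0, 0.0), (0.0, 1.0))  # Minkowski metric in inertial coordinates (T, X), c = 1


def to_inertial(t, x, a=0.5):
    """Map coordinates (t, x) of a uniformly accelerated frame to inertial (T, X).

    Illustrative Rindler-type choice, assumed here for the example only.
    """
    return ((1.0 / a + x) * math.sinh(a * t), (1.0 / a + x) * math.cosh(a * t))


def induced_metric(transform, t, x, h=1e-6):
    """Return gamma[m][n] = (dX^A/dx^m) ETA[A][B] (dX^B/dx^n) with x^0 = t, x^1 = x."""
    plus_t, minus_t = transform(t + h, x), transform(t - h, x)
    plus_x, minus_x = transform(t, x + h), transform(t, x - h)
    # jac[A][m] = dX^A / dx^m, estimated by central differences
    jac = [[(plus_t[A] - minus_t[A]) / (2 * h),
            (plus_x[A] - minus_x[A]) / (2 * h)] for A in range(2)]
    return [[sum(jac[A][m] * ETA[A][B] * jac[B][n]
                 for A in range(2) for B in range(2))
             for n in range(2)] for m in range(2)]


a, t, x = 0.5, 0.4, 1.2
gamma = induced_metric(lambda t_, x_: to_inertial(t_, x_, a), t, x)
expected_tt = -(1.0 + a * x) ** 2  # closed-form time-time component
print(gamma, expected_tt)
```

The space-time is the same flat one throughout; only the components of its metric tensor change with the choice of frame, which is the sense in which a metric tensor of Minkowski space is set for a given reference frame.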
The discovery of the pseudo-Euclidean space-time geometry allows one to formulate
physical laws both in inertial and noninertial reference frames, and thus to disprove the
erroneous statement? on inapplicability of the special theory of relativity to accelerated
reference frames. This means that when describing physical phenomena in Minkowski
space subject to a physical problem we may choose any reference frame adequate for the
given problem and, hence, set a corresponding metric tensor γ of Minkowski space.
According to the ideology of the general relativity (GR), the special principle of relativity
cannot be applied for gravitational phenomena. It was that a very central point in which
almost seventy years ago Einstein and Hilbert turned away from a special theory of
relativity when costructing GR. This resulted in giving up the conservation laws for the
energy-momentum and angular momentum, as well as in the development of unphysical
concepts on the nonlocalizalility of gravitational energy, and of many other things, which
have nothing to do with gravity. These two eminent scientists left the surprisingly simple
Minkowski space with the maximal, ten-parameter,group of space motion and entered the
maze of the Riemannian geometry, which entangled the following generations of
physicists engaged in gravity. Some authors even consider giving up the energy-
momentum conservation laws in GR to be the most important principal step of this theory
which overthrew the concept of energy. But it would be too thoughtless if we renounced
the most important law of nature, i. e., the conservation law of energy-momentum and the
angular momentum of a closed system, without sound experimental grounds. It was
shown in Refs. 1) that, since GR does not and cannot have conservation laws for the
energy-momentum of both matter and a gravitational field, the inert mass defined in
Einstein's theory has no physical sense, the gravitational radiation flux, as it is defined in
GR, can always be annihilated by the corresponding choice of the admissible reference
frame, and hence, the Einstein quadrupole formula for the gravitational field radiation
does not follow from GR. General relativity does not, in principle, imply that a binary
system loses energy because of gravitational radiation. GR does not have the
classical Newtonian limit and, consequently, does not satisfy the most fundamental
principle of physics, i.e., the correspondence principle. This is what the absence of
energy-momentum conservation laws leads to, should one reject dogmatism, think seriously
about the heart of the problem and perform an almost elementary analysis. All of
this testifies to the fact that GR is not a satisfactory physical theory. Therefore, the
problem of constructing a classical theory of gravity which would satisfy all the requirements
imposed on a physical theory is quite vital.
As opposed to GR, our theory is based on the special principle of relativity which
we, following Poincaré, consider universal and, consequently, applicable to gravitational
phenomena as well. Thus, in our approach the conservation laws for energy-momentum
and angular momentum are fulfilled strictly and have a covariant character.
Therefore, our theory contains no pseudotensors and, as a consequence, no unphysical
concepts of the nonlocalizability of gravitational energy arise. Figuratively speaking, our
overriding problem is to construct, without leaving Minkowski space, an effective
Riemannian space of field origin with the help of a tensor gravitation field and the geometrization
principle, with the conservation laws for matter being strictly fulfilled. This will allow
us to use, if necessary, a Riemannian space already endowed with the conservation laws for
matter. Note that the Riemannian space constructed in such a way is, literally, of field
origin since the effective force space is generated by a gravitation field of Faraday-Maxwell
type. Thus, in the present paper we shall carry out this program, developing the
ideas of Refs. 6). In doing so we necessarily preserve the Hilbert-Einstein
equations, supplementing them with four new field equations. According to the new
equations, a gravitation field has, in the general case, only spins 2 and 0. This theory changes
the conventional concepts of space-time influenced by GR, takes us out of the maze of
Riemannian geometry and is in the spirit of the modern theories in elementary particle
physics. Everything turns out surprisingly simple and natural. The only thing to
wonder is that the way to this simplicity and lucidity took 70 years. It follows from our
theory, that the Einstein general principle of relativity has neither physical sense nor any
physical content.')
The theory developed in this paper is based on the concept of a gravitation field being
a physical field of Faraday-Maxwell type and possessing energy-momentum. Thus, a
gravitation field, as well as all other physical fields, is characterized by the
energy-momentum tensor of the system. We consider a gravitation field as a spin 2 and spin 0
physical field, and a free gravitation field to have spin 2. The space-time geometry for all
physical fields is pseudo-Euclidean (Minkowski space). Thus, the conservation laws for
the energy-momentum and angular momentum of a closed system are rigorously fulfilled.
This is the principal distinction between our theory and Einstein's GR. Another
important problem arising in the construction of a theory of gravity is that concerning the
interaction between a gravitation field and matter. We take a gravitation field to be
universal, and to act on all forms of matter identically. We construct our theory on the
basis of the geometrization principle, which says that the equations of motion of matter
under the action of the tensor gravitation field $\phi^{ik}$ in Minkowski space with the metric
tensor $\gamma^{ik}$ may be identically represented as the equations of motion of matter in the
effective Riemannian space-time with the metric tensor $g^{ik}$ depending on the gravitational
field $\phi^{ik}$ and the metric tensor $\gamma^{ik}$. In this way we introduce the concept of an effective
Riemannian space of field nature. Proceeding from Minkowski space and the geometrization
principle, the Lagrangian density has the general form
where $\tilde\phi^{ik} = \sqrt{-\gamma}\,\phi^{ik}$ is the density of the tensor of a field variable in the gravitational field
$\phi^{ik}$; $\tilde g^{ik} = \sqrt{-g}\,g^{ik}$ is the density of the metric tensor of the Riemannian space;
$\tilde\gamma^{ik} = \sqrt{-\gamma}\,\gamma^{ik}$ is the density of the metric tensor of Minkowski space; $\Phi_A$ are the fields of
matter.
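Read together with the definitions just given and with the dependencies listed at the start of the next section, the general form of the Lagrangian density can be sketched as follows. The split into two terms and the argument lists are an assumption inferred from that verbal description (matter couples only through the Riemannian density, the gravitational part depends on the Minkowski density and the field density), not a quotation of the authors' display.

```latex
L = L_g\!\left(\tilde\gamma^{ik},\ \tilde\phi^{ik}\right)
  + L_M\!\left(\tilde g^{ik},\ \Phi_A\right),
\qquad
\tilde\phi^{ik} = \sqrt{-\gamma}\,\phi^{ik},\quad
\tilde g^{ik} = \sqrt{-g}\,g^{ik},\quad
\tilde\gamma^{ik} = \sqrt{-\gamma}\,\gamma^{ik}.
```

In this reading the geometrization principle is the statement that $L_M$ sees the gravitational field only through $\tilde g^{ik}$, while $L_g$ keeps an explicit dependence on the Minkowski metric.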
In this theory, the Lagrangian density of the gravitational field depends on the metric
tensor $\gamma^{ik}$ and the gravitational field $\phi^{ik}$. That is why it crucially differs from GR,
where the Lagrangian density depends only on the metric tensor of the Riemannian space
$g^{ik}$. Thus, in our theory, contrary to GR, the geometrization of the Lagrangian
density of the gravitational field is not complete.
2. Geometrization principle and general relations
in the relativistic theory of a tensor field
Without any loss of generality, we shall assume the tensor density $\tilde g^{ik}$ of the metric
tensor of Riemannian space-time to be a local function depending on the tensor density $\tilde\gamma^{ik}$
of the metric tensor of Minkowski space and the tensor density $\tilde\phi^{ik}$ of the gravitation field.
Let the Lagrangian density $L_M$ depend only on the fields $\Phi_A$, on their first-order
covariant derivatives, and also on the tensor density $\tilde g^{ik}$ in virtue of the geometrization
principle. The Lagrangian density of the gravitational field is taken to depend on the
tensor density $\tilde\gamma^{ik}$, its first-order partial derivatives, as well as on the gravitation field
density $\tilde\phi^{ik}$ and its first-order covariant derivatives with respect to the Minkowski metric.
To obtain the conservation laws, we use the invariance of the action under an infinitesimal
coordinate shift. Indeed, since the action is a scalar, the variations of the actions of
matter, $\delta J_M$, and of the gravitational field, $\delta J_g$, will be zero under an arbitrary
infinitesimal coordinate transformation. Calculate first the variation of the action of
matter under the transformation
$$x'^i = x^i + \xi^i(x), \qquad (1.1)$$
$\xi^i$ being the infinitesimal shift four-vector.
Here and in what follows $D$ with a lower index denotes the covariant derivative with respect to the
Minkowski metric. Putting these expressions into (1.2) and integrating by parts, one obtains
In virtue of the arbitrariness of the vector $\xi^i$, one finds from the condition $\delta J_M = 0$ a strong
identity
Proceeding from this, strong identity (1.5) may be written in the form
Due to the least action principle, the equations for matter fields have the form
$$\frac{\delta L_M}{\delta \Phi_A} = 0. \qquad (1.9)$$
Taking into account the above equation, one may find from strong identity (1.8) the weak
identity:
8,(p.-$+o. (1.10)
Note, that the density of the energy-momentum tensor of matter in Riemannian space 2
is expressed v i a P as follows:
Thus, expression (1.10) entails the covariant equation for matter conservation in Riemannian
space:
$$\nabla_m T^{mn} = 0. \qquad (1.12)$$
If the number of equations for matter is four, then instead of equations for matter (1.9)
one can always use equivalent equations (1.12).
The variation of the action integral may be written in the equivalent form
It should be noted that this identity is valid irrespective of the fulfillment of the
equations of motion for matter and for gravitation field.
Let us introduce the notations:
(1.18)
Analogously, it follows from the invariance of the action of the gravitation field under
coordinate transformations (1.1) that:
Here
Adding expressions (1.18) and (1.19), we get
(1.21)
Here
,-PU= ,-y+ (1.22)
In virtue of the least action principle, the equations of the gravitation field have the form
If we take these equations into account, from (1.21) we obtain an extremely important
identity:
? ) = Fuu-+Puut).
T- u Y1- ~ ~ u PpvDu(
BPUVu(
(1-24)
G T P U = 29
, P U - r - P Y T- . (1-25)
Similarly, the density of the total energy-momentum tensor in Minkowski space equals
D tv = V T, = 0 . (1.28)
From the covariant equation for matter conservation in Riemannian space, it is not
clear what is conserved, while from the conservation law for the total energy-momentum
tensor $t^{mn}$ in Minkowski space it is clear that the energy-momentum of matter and
gravitation field taken together is conserved. Thus, in this theory Riemannian space appears as a
result of the action of the gravitation field on all forms of matter. That is why it is an
effective Riemannian space of field origin. Minkowski space finds its exact physical
reflection in the conservation laws for the energy-momentum tensor and the angular
momentum of matter and gravitation field taken together.
Since there are ten Killing vectors in a flat space, there are, consequently, ten
conservable integral quantities for a closed system of fields. If the number of equations
of motion for matter is four, then instead of them we may use the equations expressing the
total energy-momentum tensor conservation in Minkowski space:
D(tL+ G U ) =o . (1.29)
This equation, alongside those for a gravitation field, defines all the unknown characteristics
of matter and gravitation field. It is worth noting that both matter and gravitation
field in our theory are characterized by energy-momentum tensors. As a result, in our
theory, contrary to GR, no pseudotensors arise and, hence, there are no unphysical
concepts of the nonlocalizability of gravitation energy.
If we, following Hilbert and Einstein, choose the Lagrangian density of a gravitation
field in a completely geometrized form, i.e., depending on the metric tensor $g_{ik}$ of
Riemannian space and its derivatives only, e.g.,
$$L_g = \sqrt{-g}\,R,$$
where $R$ is the scalar curvature of Riemannian space, then in virtue of the field equations
the energy-momentum tensor density of the free gravitation field in Minkowski space
would always be equal to zero:
(1.30)
Under coordinate transformation (1.1), the action variation $\delta J_g$ is zero and, hence,
Here
JA=-[a~~+K~ADa~p,
where the density of the canonical tensor r a A is equal to
Putting into (2.1) formulae (1.14) and (1.15) for the variations of $\tilde\phi^{ik}$ and $\tilde\gamma^{ik}$, we shall
obtain, in virtue of the arbitrariness of the volume $\Omega$, the following strong identity:
Since the shift vector $\xi^\alpha$ is arbitrary, the last expression leads to the following identities:
D, raL= - L6P
D a P y , (2.6)
l3L -
rpa- D A K , =d ~a ~ - 8 6Lg -or 6L a 6Lg -ur
c a ~ d + 2 g J P a - 6 p6 p b r Y , (2.7)
66 p 64
KFA=- K i a . (2-8)
Henceforth our theory is based on the linear relation between the metric tensor density
$\tilde g^{\mu\nu}$ of the effective Riemannian space and the density $\tilde\phi^{\mu\nu}$ of the tensor gravitation field
$$\tilde g^{\mu\nu} = \tilde\gamma^{\mu\nu} + \tilde\phi^{\mu\nu}. \qquad (2.9)$$
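The linear relation (2.9) fixes the effective metric once the two densities are known, because in four dimensions $\det\tilde g^{\mu\nu} = (\sqrt{-g})^4\det g^{\mu\nu} = g$. The following is a minimal numerical sketch of that step, not the authors' procedure: the helper names (`det`, `inverse`, `metric_from_densities`) are hypothetical, Galilean coordinates are assumed so that $\sqrt{-\gamma}=1$, and the static weak field $\tilde\phi^{00}=4U$ with $U=0.01$ is a made-up illustrative input.

```python
import math


def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-15:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def inverse(m):
    """Matrix inverse via Gauss-Jordan elimination (matrix assumed non-singular)."""
    n = len(m)
    a = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


def metric_from_densities(gamma_dens, phi_dens):
    """From tilde-gamma and tilde-phi return (g^{mn}, g_{mn}, sqrt(-g))."""
    # Linear relation (2.9): tilde g = tilde gamma + tilde phi.
    g_dens = [[gamma_dens[i][j] + phi_dens[i][j] for j in range(4)]
              for i in range(4)]
    # In four dimensions det(tilde g^{mn}) equals g = det(g_{mn}).
    sqrt_minus_g = math.sqrt(-det(g_dens))
    g_upper = [[v / sqrt_minus_g for v in row] for row in g_dens]
    g_lower = inverse(g_upper)
    return g_upper, g_lower, sqrt_minus_g


# Galilean coordinates: gamma^{mn} = diag(1, -1, -1, -1), sqrt(-gamma) = 1.
gamma_dens = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
U = 0.01  # illustrative dimensionless potential (assumed value)
phi_dens = [[4 * U, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

g_upper, g_lower, sqrt_minus_g = metric_from_densities(gamma_dens, phi_dens)
print("g_00 =", g_lower[0][0], " g_11 =", g_lower[1][1])
```

For this input the covariant components come out close to $1-2U$ and $-(1+2U)$, the familiar weak-field pattern, which is a useful sanity check on the density bookkeeping.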
In this case we shall obtain the equalities:
As a result of elementary calculations, one obtains for IT,"^ the following expression:
It is just the identity which establishes the relation between the Hilbert tensor density in
Minkowski space and that of the canonical energy momentum tensor.
For further use it is convenient to introduce the following quantity as a characteristic
of the gravitation field:
t&=rPa-DiK,"'. (2.13)
In virtue of identity (2.12), this quantity coincides exactly with the density of the Hilbert
energy-momentum tensor in the case of a free gravitation field.
As was shown in Ref. 9), the symmetric tensor $\phi^{ik}$ of rank two may be represented as
the sum of the irreducible representations: one with spin 2, one with spin 1 and two with
spin 0:
$$\phi^{ik} = [P_2 + P_1 + P_0 + P_{0'}]^{ik}_{lm}\,\phi^{lm}. \qquad (3.1)$$
PZ 3 +XjmXnL]
=T[XiLXnm - XniX m' . (3.5)
In the x-representation the projection operators $P_s$ are nonlocal integrodifferential ones:
With the help of expressions (3.3)-(3.5) one may easily make sure that only operators
$P_2$ and $P_0$ are conserved:
qlP;",',=q1Pd",',=O, qmP;",t=qmPGnt=O. (3.6)
As may be easily verified, the tensor field has the only local operator of a lower order
which is linear in field. It equals
fik = [(PZ
- 2PO)@ ] i k (3-7)
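and its divergence is identically equal to zero:
$$\partial_i f^{ik} = 0. \qquad (3.8)$$
The field $f^{ik}$ describes only spins 2 and 0, i.e., in a more detailed form
$$f^{ik} = \Box B^{ik} - \partial^i\partial_m B^{mk} - \partial^k\partial_m B^{mi} + \gamma^{ik}\partial_m\partial_l B^{ml}, \qquad (3.9)$$
$$B^{ik} = \phi^{ik} - \tfrac{1}{2}\gamma^{ik}\phi. \qquad (3.10)$$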
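That the divergence of $f^{ik}$ vanishes identically can be checked directly from its explicit form. The short derivation below is a sketch under two assumptions: partial derivatives commute, and indices are raised with the constant Galilean metric $\gamma^{ik}$, so that $\Box = \partial_m\partial^m$. No property of $B^{ik}$ is used.

```latex
\begin{aligned}
f^{ik} &= \Box B^{ik} - \partial^i\partial_m B^{mk}
          - \partial^k\partial_m B^{mi} + \gamma^{ik}\,\partial_m\partial_l B^{ml},\\[4pt]
\partial_i f^{ik}
  &= \Box\,\partial_i B^{ik} - \Box\,\partial_m B^{mk}
     - \partial^k\,\partial_i\partial_m B^{mi}
     + \partial^k\,\partial_m\partial_l B^{ml}\\
  &= 0 .
\end{aligned}
```

The first two terms cancel after relabelling the summed index, and so do the last two. The result therefore holds for any $B^{ik}$, which is why it is an identity and not an additional field equation.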
harmonic coordinate conditions to solve island-type problems. It was Fock who paid
special attention to the importance of harmonic coordinate conditions in solving island
problems. He wrote as follows: "The above remarks concerning the privileged character
of the harmonic system of coordinates should not be understood, in any case, as some
kind of prohibition of the use of other coordinate systems. Nothing is more alien to our
point of view than such an interpretation." He went on: "Likewise, in the case of the
Theory of Gravitation, the existence of harmonic coordinates, defined apart from a
Lorentz transformation, though a fact of primary theoretical and practical importance,
does not in any way preclude the use of other, non-harmonic, coordinate systems." From
the point of view of our theory, when solving island problems, Fock was unconsciously
working with ordinary Galilean coordinates in an inertial reference frame, which are, as
known from special relativity, definitely singled-out coordinates. Therefore, in
Fock's calculations for island systems the harmonic conditions were not coordinate
conditions, as he thought them to be, but, as will be seen from our theory, field equations in
Galilean coordinates of an inertial reference frame. It was due to this very fact that they
played such an important role in his specific calculations, which neither Fock nor others
even suspected.
Thus, Fock considered harmonic conditions no more than privileged coordinate
conditions applicable to island-type problems only. This is quite natural, since he
and all his eminent predecessors were captured by Riemannian geometry, which
basically gave no possibility of gaining a deeper insight into the problem. In order to make
a step ahead and impose these conditions as universal covariant field equations, it was
necessary to give up the ideology of GR, leave the maze of Riemannian geometry
and apply the special principle of relativity in defiance of GR, as well as to introduce the
concept of a gravitation field as a Faraday-Maxwell physical field, possessing
energy and momentum. All of this was realized in our theory, with the choice
of coordinates being arbitrary and set only by the metric tensor $\gamma^{mn}$ of Minkowski space,
as is generally accepted in elementary particle theory. As for Eqs. (4.2), in our theory
they are comprehensive and universal because they are the gravitation field equations.
They have nothing to do with the choice of coordinates. In Minkowski space these
equations are written in the covariant form
Judging by § 3, we conclude that these field equations automatically exclude spins 1 and
0' from a gravitational tensor field. Thus, we have already constructed four covariant
equations (4.3) for the fourteen unknown variables of a gravitation field and of matter.
To construct the other ten equations, we draw a simple but far-reaching analogy with an
electromagnetic field. As is known, Maxwell's electrodynamic equations may be written in
the covariant form as follows:
$$D_\nu j^\nu = 0. \qquad (4.5)$$
(4.12)
Here relation (4.1) was taken into account. For this system of equations to be presented
in the form of (4.7) it is necessary to choose the constants a, b and c for the Lagrangian
density in a definite and unique way.
This suggests that the Lagrangian of the gravitation field with spins 2 and 0 in
Minkowski space is determined unambiguously. In order to make such a choice of the
coefficients a, b and c let us calculate the density of the energy-momentum tensor for
matter and a gravitational field.
Let us introduce the notations
(4.13)
(4.14)
Similarly
(4.15)
Calculating the variation of the total Lagrangian with respect to $\gamma_{mn}$ and with regard for the field
equations
where
H?=( g"DiG*"+ Gn'Dig*? g h u . (4-19)
In order that no new equations on the field $\phi^{ik}$ would arise from the equality
$$D_m t^{mn} = 0,$$
which would otherwise lead to an overdetermination of the system of equations, it is necessary
and sufficient for the coefficients a, b and c to satisfy the following conditions:
$$a = -b/2, \qquad c = b/4. \qquad (4.20)$$
Thus, with such a choice of the constants one comes to the identity
$$D_m t^{mn} = 0$$
which was contained in Eq. (4.7). With regard to the choice of the coefficients as in (4.20),
With consideration for Eq. (4.3), the complete system of equations for matter and
gravitational field will be 6)
$$\gamma^{pq}D_pD_q\tilde g^{mn} = 16\pi\,(t_g^{mn} + t_M^{mn}), \qquad (4.28)$$
$$D_m\tilde g^{mn} = 0, \qquad (4.29)$$
or in the Galilean coordinate system
$$\Box\tilde g^{mn} = 16\pi\,(t_g^{mn} + t_M^{mn}), \qquad \partial_m\tilde g^{mn} = 0.$$
Should we confine ourselves only to the first system of equations, (4.28), the separation
of the Riemannian space metric into the Minkowski space metric and the tensor
gravitational field would be of a conditional character and would not have any physical
meaning. The second system of four field equations, (4.29), decisively separates all that
relates to the inertia forces from all that is connected with the gravitational field. The two
systems of Eqs. (4.28) and (4.29) are generally covariant. The corresponding physical
conditions are imposed, as usual, within a given, for example, Galilean coordinate system
on the behaviour of the gravitational field. In the framework of GR, one cannot formulate
the conditions for the metric $g_{mn}$ while remaining in Riemannian space, since the asymptotics of
the metric always depends on the choice of the three-dimensional coordinate system. It
should also be noted that the equations of motion of matter are contained in the given system
of equations. The density of the energy-momentum tensor of the gravitational field in
Minkowski space is equal to
(4.30)
for Lagrangian density (4.22). Here, as we can see, there automatically appears the
second-rank curvature tensor $R_{pq}$ in Riemannian space. Similarly, the tensor density of
matter energy-momentum in Minkowski space is equal to
which directly follows from the expression for coupling (4.1). Putting the expressions
for the energy-momentum tensors of matter and gravitational field in field equations (4.7),
we transform them to the form of Hilbert-Einstein equations
8x( T p u - 1
TgfiuT)
(4.34)
$$D_m\tilde g^{mn} = 0. \qquad (4.36)$$
It is worth mentioning that Eqs. (4.36) are general and universal since these are field
equations describing a gravitational field with spins 2 and 0. The choice of a reference
frame (or coordinate system) is determined by the metric tensor $\gamma^{mn}$ of Minkowski space.
Hence, Eqs. (4.36) do not impose any restrictions on the choice of a coordinate system.
Consequently, the system of Eqs. (4.36) excludes spins 1 and 0' in the density of the tensor field
$\tilde\phi^{mn}$, leaving only spins 2 and 0. The required six components of gravitational field,
corresponding to these spins, and four components of matter are defined from field
equations (4.28) or from their equivalent Hilbert-Einstein equations (4.35). The system
of equations for gravitational field, (4.28) and (4.29), may be expressed in a somewhat
different form through the Hilbert energy-momentum tensor density in Riemannian space.
However, for this purpose we will have to obtain some relations, making use of the specific
expression for the Lagrangian density of gravitational field obtained by us earlier, (4.24):
where
Using these expressions, we calculate then the density of the tensor of the third rank,
K F , with formula (2.10).
With account for the equality
Using this expression and definition (2.13), we shall have for 5,the following expression:
1
$,, = rpa--D~oF
16n
- (4-38)
The quantities $\Gamma^{\lambda}_{\mu\nu}$ are tensors of the third rank with respect to linear transformations
of the coordinates. Therefore $L_g$ will be a scalar density with respect to the same
transformations. From the invariance of the action with respect to linear transformations,
we have
6J g = /d4x d, J
D
[ ++L 8] =0 . (4.40)
Here
J = - t n r a A
f l?$A6a[p, (4.41)
where the density of the canonical tensor r: is equal to
aLg ,
raA= -62Lg+&GPua(a,gpu) (4.42)
and the density of the tensor of the third rank, l??, is in this case
Since
rpa-adl&A= --[J-s
8n R p a - 1~ 8 p a R.] (4.48)
Putting the expression for the third-rank tensor density $K$ into (4.48), we shall obtain the identity
In the curvature tensor, one can always identically replace, leaving it unchanged, the
ordinary derivatives by the covariant ones in the Minkowski metric; therefore expression
(4.51) may be presented in the covariant form:
In this case the canonical tensor density in (4.52) will be equal to expression (2.3):
where the Lagrangian density $L_g$ is already presented in terms of the derivatives covariant
in the Minkowski metric, (4.24).
Using identity (4.52) we may present expression (4.38) in the form
(4.53)
As we have already established, the system of equations of matter and gravitational field,
(4.28) and (4.29), is equivalent to the system of Eqs. (4.35) and (4.36). With the help of
expression (4.53) the system of equations of matter and gravitational field may also be
rewritten in another equivalent form:
Here T,' is the Hilbert energy-momentum tensor density (1.6) for the matter in Rieman-
nian space. It is quite obvious that in virtue of (4.54) and (4.55) the conservation law for
energy-momentum tensor of matter and gravitational field has the form
D,(T,'+k)=O. (4.56)
The covariant matter conservation law in Riemannian space may identically be presented
in the form
As seen from this expression, matter acquires energy and momentum directly from the
gravitational field, the total energy-momentum tensor of matter and gravitational field
being always conserved rigorously. The construction of RTG on the basis of Minkowski
space and the geometrization principle allowed us to deal only with covariant quantities at
every stage of our reasoning. Here we give briefly some of our results following from
P= j d 3 1 [ t P f tk]
is an energy-momentum four-vector with respect to any coordinate transformations;
similarly, the angular momentum is also a tensor with respect to any coordinate
transformations in four-dimensional Minkowski space. It may also be shown that for any island
static system, the inertial mass is exactly equal to its active gravitating mass. The given
theory has extraordinary predictive power: it leads to a strictly definite
development of the Universe. 6) According to it, the Universe is not closed; it is flat in
virtue of Eq. (4.29),
$$ds^2 = c^2 d\tau^2 - V(\tau)\,(dx^2 + dy^2 + dz^2).$$
The expansion of the Universe is defined by the function $V(\tau)$, which is easily calculated from
the field equations. Equations (4.28) and (4.29) readily convince us that the total
energy density of matter and gravitational field is always zero at any instant of time of
the development of the Universe. The present-day density of all the forms of matter, $\rho_0$, should be
equal to its critical density $\rho_c$,
$$\rho_0 = \rho_c = 3H^2/8\pi G,$$
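To give a feeling for the size of the critical density $3H^2/8\pi G$, the formula can be evaluated numerically. This is a minimal sketch: the Hubble parameter of 70 km/s/Mpc is an assumed illustrative value that the authors do not quote, the function name `critical_density` is hypothetical, and the constants are standard rounded values.

```python
import math

G_NEWTON = 6.674e-11          # Newtonian constant, m^3 kg^-1 s^-2 (rounded)
METERS_PER_MPC = 3.0857e22    # one megaparsec in metres (rounded)


def critical_density(h0_km_s_mpc):
    """Return rho_c = 3 H^2 / (8 pi G) in kg/m^3 for H given in km/s/Mpc."""
    h_si = h0_km_s_mpc * 1.0e3 / METERS_PER_MPC  # convert to s^-1
    return 3.0 * h_si ** 2 / (8.0 * math.pi * G_NEWTON)


# Assumed illustrative value of the Hubble parameter.
rho_c = critical_density(70.0)
print(f"critical density: {rho_c:.3e} kg/m^3")
```

Because $\rho_c$ scales as $H^2$, a 10 percent change in the assumed Hubble parameter shifts the result by roughly 20 percent.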
Acknowledgements
References
1) A. A. Logunov et al., Theor. Math. Phys. 40 (1979), 291.
V. I. Denisov and A. A. Logunov, Theor. Math. Phys. 50 (1982), 3; Elementary Particle and Atomic
Nucleus Physics (1982), v. 13, part 3, p. 757; Modern Problems of Mathematics (Moscow, VINITI of
Acad. of Sciences, USSR, 1982), v. 21.
2) N. Rosen, Phys. Rev. 57 (1940), 147; Ann. of Phys. 22 (1963), 1.
A. Papapetrou, Proc. Roy. Irish Acad. A52 (1948), 11.
S. Gupta, Proc. Roy. Soc. A65 (1952), 608.
W. Thirring, Ann. of Phys. 16 (1961), 69.
3) H. Poincaré, Present and Future of Mathematical Physics (Bulletin des Sciences Mathématiques,
December, 1904), v. 28, ser. 2, p. 302; The Monist (January, 1905), v. XV, No. 1; Relativity Principle, ed. A. A.
Tyapkin (Atomizdat, Moscow, 1973).
4) A. A. Logunov, Lectures on the Theory of Relativity (Moscow Univ., 1984), (in Russian).
5) A. Einstein, Collected Works (Nauka, Moscow, 1965), v. I.
W. Pauli, Theory of Relativity (Pergamon Press, 1965).
C. Møller, The Theory of Relativity (Clarendon Press, Oxford, 1972).
L. I. Mandelstam, Lectures on Optics, Relativity Theory and Quantum Mechanics (Nauka, Moscow,
1973), p. 218.
6) A. A. Logunov and A. A. Vlasov, Minkowski Space as the Basis of the Physical Gravitation Theory
(Moscow Univ., Moscow, 1984); TMF (1984), v. 60, p. 319; Spherically Symmetric Solution in Gravitation
Theory Based on Minkowski Space (Moscow Univ., Moscow, 1984); TMF (1984), v. 60, p. 163.
A. A. Vlasov, A. A. Logunov and M. A. Mestvirishvili, IHEP Preprint 84-156 (Serpukhov, 1984).
7) V. A. Fock, Theory of Space, Time and Gravitation (Pergamon Press, London, 1959).
8) V. I. Ogievetsky and I. V. Polubarinov, Ann. of Phys. 35 (1965), 167; JINR Preprint P-2106 (1965).
9) C. Fronsdal, Suppl. Nuovo Cim. 9 (1958), 416.
K. J. Barnes, J. Math. Phys. 6 (1965), 788.
10) De Donder, La Gravifique Einsteinienne (Paris, 1921); Théorie des Champs Gravifiques (Paris, 1926).
V. A. Fock, J. of Phys. 1 (1939), 81; Rev. Mod. Phys. 29 (1957), 235.
Authors' comments:
'.....I think it appropriate to say that in the paper "Relativistic Theory of Gravitation"
some aspects that are of great importance and appeared in the later works had not
been taken into consideration. Therefore, the book "The Theory of Gravity" Moscow
NAUKA, 2001 (also see gr-qc/0210005 V2 21 Oct 2002), is more useful and
complete.'
Jong-Ping Hsu
Department of Physics,
University of Massachusetts Dartmouth
North Dartmouth, MA 02747-2300, USA
Yang-Mills gravity in flat spacetime can shed light on renormalizable quantum gravity.
In general relativity, the structure of couplings for $g_{\mu\nu}$ is very complicated and also has
non-trivial difficulties in both technical and conceptual aspects from the viewpoint of quantum
field theory. However, as a classical field theory, one of the strengths of general relativity
lies in its successful equations of motion for objects and light rays, based on the
Einstein-Grossmann metric $g_{\mu\nu}dx^\mu dx^\nu$. [1] As for renormalizable quantum field theory, Yang-Mills
fields have the best track record in theory and experiment, provided the underlying spacetime
is flat. In this paper, we show that Yang-Mills fields with spacetime translation gauge
symmetry have special features which provide a natural union of the Einstein-Grossmann
metric and the gravitational Yang-Mills field. The union implies: (i) The framework is
applicable to all general frames of reference (both inertial and non-inertial) in which the
spacetime is characterized by the vanishing Riemann-Christoffel curvature tensor. (ii) The
effective Einstein-Grossmann metric originates physically from a spin-2 Yang-Mills field in
flat spacetime.
The formulations for electromagnetic and Yang-Mills fields associated with internal gauge
groups have been developed extensively. They are based on the replacement
$\partial_\mu \to \partial_\mu + igB^a_\mu t_a$, where $t_a$ are the constant matrix representations of the gauge groups, which have
little to do with external spacetime. For external gauge groups related to spacetime, e.g.,
the de Sitter group or the Poincaré group, the gauge invariant Lagrangian involving fermions
turns out to be richer in content. [2]
In this paper, we concentrate on a specific simple external gauge group of translations
T(4) in flat spacetime. The translation group T(4) is the Abelian subgroup of the Poincaré
group. It is particularly interesting because it is the minimal gauge group related to the
conserved energy-momentum tensor which couples to a spin-2 field $\phi_{\mu\nu}$. However, the
generators of the translational group are the displacement operators, $p_\mu = i\partial_\mu$ ($c=\hbar=1$)
in inertial frames. In a general frame (inertial or non-inertial) with a metric tensor $P_{\mu\nu}$,
we replace $\partial_\mu$ by $D_\mu$, i.e., the partial covariant derivative with respect to the Levi-Civita
connection (or the metric tensor $P_{\mu\nu}$) in flat spacetime. Thus, the replacement in Yang-Mills
gravity takes a different form: $D_\mu \to D_\mu - ig\phi_{\mu\nu}p^\nu \equiv J_{\mu\nu}D^\nu$, where $D_\mu = P_{\mu\nu}D^\nu$
in such a gauge theory. The generators of this non-compact translation group do not have
a constant matrix representation. It is precisely this unique property that leads naturally
to an effective Einstein-Grossmann metric in flat spacetime. Such an effective metric
emerges from the Lagrangian of matter fields (as shown in equations (5) and (6) below).
Furthermore, the displacement operator of the translation gauge group dictates that the
coupling constant $g$ in $J_{\mu\nu}$ must have the dimension of length and that the interaction
cannot have both attractive and repulsive forces, in sharp contrast to the dimensionless
real coupling constants in electrodynamics and other Yang-Mills theories associated with
internal gauge groups.
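The algebra behind this replacement can be spelled out in a few lines. The sketch below assumes that, in a general frame, the translation generator acts as $p^\nu = iD^\nu$ (the covariant counterpart of the displacement operator quoted for inertial frames) and that $D_\mu = P_{\mu\nu}D^\nu$; the closed form of $J_{\mu\nu}$ is a consequence of these assumptions and is not quoted from the paper.

```latex
\begin{aligned}
D_\mu - i g\,\phi_{\mu\nu}\,p^\nu
  &= D_\mu - i g\,\phi_{\mu\nu}\,(i D^\nu) \\
  &= P_{\mu\nu} D^\nu + g\,\phi_{\mu\nu} D^\nu \\
  &= \left(P_{\mu\nu} + g\,\phi_{\mu\nu}\right) D^\nu \;\equiv\; J_{\mu\nu} D^\nu .
\end{aligned}
```

In an inertial frame $P_{\mu\nu} = \eta_{\mu\nu}$, and this reduces to $J_{\mu\nu} = \eta_{\mu\nu} + g\phi_{\mu\nu}$, the form used in Eq. (13). Since $g\phi_{\mu\nu}$ is added to the dimensionless $P_{\mu\nu}$ and a boson field has the dimension of inverse length in natural units, $g$ carries the dimension of length, in line with the statement above.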
The new formalism of external gauge symmetry for translations in flat spacetime leads
to a gauge-invariant action involving fermions. It suggests (a) the massless Yang-Mills
spin-2 field in flat spacetime as the gravitational gauge field [3], (b) a new gravitational gauge
equation in both inertial and non-inertial frames and (c) an effective metric $G_{\mu\nu}dx^\mu dx^\nu$ for
the motion of classical objects. In the post-Newtonian approximation, the present gauge
field equation is consistent with classical tests such as the perihelion shift of Mercury
and the time delay of radar echoes.
Let us consider the local spacetime translation with an arbitrary infinitesimal vector
gauge-function $\Lambda^\mu(x)$,
The basic point is that this transformation has a dual interpretation: (i) a shift of the
spacetime coordinates by an infinitesimal vector gauge-function $\Lambda^\mu(x)$, and (ii) an arbitrary
infinitesimal transformation. These two mathematical implications of the transformation (1)
dictate the following gauge transformation of spacetime translations for physical quantities
in the Lagrangian of fields (e.g., L + b p in (12) below):
$$ds^2 = dw_I^2 - dr_I^2 = W^2dw^2 - dr^2 = P_{\mu\nu}dx^\mu dx^\nu, \qquad W^2 = \gamma^4\left(\gamma_0^{-2} + \alpha_0 x\right)^2. \qquad (4)$$
All constant-linear-acceleration frames of reference have the metric tensor of the form
$P_{\mu\nu} = (W^2, -1, -1, -1)$. [6] The existence of the finite Wu transformations (3) implies that
the spacetime of the constant-linear-acceleration frames is flat, i.e., having zero
Riemann-Christoffel curvature tensor. The metric tensor $P_{\mu\nu}$ for a general frame of reference with
zero Riemann-Christoffel curvature tensor may be called the Poincaré metric tensor. In the
limit of zero acceleration, $\alpha_0 \to 0$, $P_{\mu\nu}$ in (4) reduces to the Minkowski metric tensor $\eta_{\mu\nu}$ of
inertial frames.
In order to see the connection of such a spin-2 field and the gravitational field, let us
consider the kinetic term in the Lagrangian of a scalar field $\Phi$: $(1/2)P^{\mu\nu}D_\mu\Phi D_\nu\Phi$. In the
presence of the spin-2 field $\phi_{\mu\nu}$, the translation gauge symmetry dictates the replacement,
Thus, we have
where $G_{\mu\nu}dx^\mu dx^\nu$ denotes the effective Einstein-Grossmann metric for motions of classical
objects.
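To see how a quadratic form in the gauge field arises from this replacement, the scalar kinetic term can be evaluated numerically in an inertial frame. The sketch below assumes $J_{\mu\nu} = \eta_{\mu\nu} + g\phi_{\mu\nu}$ (the inertial-frame form appearing in Eq. (13)) and builds $Q_{\alpha\beta} = \eta^{\mu\nu}J_{\mu\alpha}J_{\nu\beta}$, the coefficient of $\partial^\alpha\Phi\,\partial^\beta\Phi$ after the replacement. Identifying $Q$ with the paper's effective metric tensor is an assumption, and the function names and all numerical values are made up for illustration.

```python
# Inertial-frame sketch: Q_{ab} = eta^{mn} J_{ma} J_{nb}, with J = eta + g*phi.
# Names and numbers are illustrative assumptions, not taken from the paper.

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def matmul(a, b):
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add_scaled(a, b, scale):
    """Return a + scale * b, element by element."""
    return [[a[i][j] + scale * b[i][j] for j in range(4)] for i in range(4)]


def quadratic_form(phi, g):
    """Exact Q = J^T eta^{-1} J for J = eta + g*phi (eta is its own inverse)."""
    j = add_scaled(ETA, phi, g)
    return matmul(transpose(j), matmul(ETA, j))


def first_order_form(phi, g):
    """Weak-field approximation Q ~ eta + 2 g phi, dropping O(g^2) terms."""
    return add_scaled(ETA, phi, 2.0 * g)


# A made-up symmetric field configuration and a small coupling.
phi = [[0.5, 0.1, 0.0, 0.0],
       [0.1, 0.2, 0.0, 0.0],
       [0.0, 0.0, 0.3, 0.1],
       [0.0, 0.0, 0.1, 0.4]]
g = 1.0e-3

exact = quadratic_form(phi, g)
approx = first_order_form(phi, g)
max_dev = max(abs(exact[i][j] - approx[i][j])
              for i in range(4) for j in range(4))
print(f"largest second-order correction: {max_dev:.3e}")
```

To first order in $g$ the form reduces to $\eta_{\alpha\beta} + 2g\phi_{\alpha\beta}$, which has the structure of a weak-field metric perturbation; the remaining difference is of order $g^2$.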
This action (6) for particles suggests a simple and natural union of the Einstein-Grossmann
metric for motions of classical objects and Yang-Mills fields for gravity with the flat-spacetime
translation gauge group: namely, the spin-2 gauge field and its interaction with fermion
matter actually take place in flat spacetime; only the equation of motion of classical objects
is derived from the classical action $S_p$, which happens to have a form similar to the
Einstein-Grossmann metric $g_{\mu\nu}dx^\mu dx^\nu$. We stress that the effective metric tensor $G_{\mu\nu}$ in (5) and (6)
is completely determined by the Yang-Mills action with translation gauge symmetry.
The present theory of Yang-Mills gravity is formulated on the basis of the translation
gauge symmetry and the postulate of the effective metric tensor in (6) for the motion of
a classical particle in such a spin-2 field. However, from the field-theoretic viewpoint, the
real physical spacetime is still flat and the fundamental metric tensor is still $P_{\mu\nu}$ in general
frames of reference. Thus, $G_{\mu\nu}$ in (6) is treated as merely an effective metric tensor for
the motion of a classical object in the presence of the spin-2 gauge field, in the sense that the
4-dimensional effective interval is $ds_{ei}^2 = G_{\mu\nu}dx^\mu dx^\nu$ in the action $S_p$ for classical objects
such as planets, stars or light rays. We note that the Poincaré metric tensor $P_{\mu\nu}$ is a purely
geometrical property of spacetime and does not contain the physical field $\phi_{\mu\nu}$. This property is
important because it enables Yang-Mills gravity to have a very simple coupling, namely,
the maximum coupling is a 4-vertex (in Feynman diagrams).
The translational gauge invariance requires that a symmetric spin-2 field, $\phi_{\mu\nu} = \phi_{\nu\mu}$,
must couple to the fermion field $\psi$ via the energy-momentum tensor. We postulate the
following gauge-invariant fermion action $S_\psi$ in a general frame:
SQ = 1 L ~ g d ~ x , (7)
where $L_{\phi\psi}$ only changes by a divergence under the gauge transformation (2). Note that
the quadratic gauge-curvature in (12) can be written as (-l/2g2)( ~ C p , p C ~ a-pCpolaCpbp).
For simplicity, let us choose inertial frames rather than general non-inertial frames in the
following discussions of experimental implications. One of the ways to do this is to consider
the case,
$P_{\mu\nu} = \eta_{\mu\nu}, \quad J_{\mu\nu} = \eta_{\mu\nu} + g\phi_{\mu\nu},$ (13)
and all tensor indices are raised and lowered by $\eta_{\mu\nu}$. For weak fields, the linearized gauge
equation for the spin-2 field can be derived from (12):
which turns out to be formally the same as Einstein's equation for a weak gravitational field.
The classical particle action $S_p = -\int m\,ds_{ei}$, with the effective interval
$(ds_{ei})^2 = G_{\mu\nu}dx^\mu dx^\nu$, leads to the variation
(19)
by solving (16) with the spherical coordinates $x^\mu = (w, r, \theta, \phi)$ to the first-order approximation.
In order to show that the theory is viable beyond the first-order approximation (19), let
us compare the result (18) with the perihelion shift of Mercury [7, 8], which is sensitive
to the coefficient appearing in the second-order term in $I^{00}$ or $G_{00}$. We solve the non-linear
gauge field equations by the method of successive approximation and carry out the related
For $(\mu, \nu) = (0, 0), (1, 1), (2, 2), (3, 3)$, the gauge-field equation (21) can be written
respectively as
[Display equations for the radial functions $S(r)$, $R(r)$ and $T(r)$, including Eq. (23); garbled in the source.]
Equations (22) and (26) lead to the second-order approximation of the gauge field $g\phi_{00}$:
Gm G2m2
9400 = --r +-2r2 ' g411=--+-Grm GF2 (1- + -
,2) , (28)
which are $\xi$-dependent, just like the gauge field $\phi_{\mu\nu}$. In order to see the $\xi$-independent
physical results and their agreement with gravitational experiments, let us carry out the
expansion to the second order in all components $G_{00}$, $G_{11}$, $G_{22}$ and $G_{33}$ in the usual spherical
coordinate. For any given value of [, we can make a change of variable: p 2 = r2(1+ 2Gm/r+
[G2rn2/r2][4/(' - 3 ] ) , where the zeroth and the first orders terms are independent of [. We
obtain the following effective metric tensors $G_{\mu\nu}(\rho)$:
2Gm .40
Goo([>)= 1 - -$ 7 , G h ( p ] = -
P P-
(I+---
27 2::m2)
1 (30)
metric tensor $G_{\mu\nu}(\rho)$ in (5), (6) and (30) to the second order. Note that only the second-order
term in $G_{00}(\rho)$, i.e., $2G^2m^2/\rho^2$, differs from the corresponding term in Einstein's theory.
As we shall see below, this difference is too small to be detected with the available apparatus. [9]
We have shown that the effective metric tensor $G_{\mu\nu}(\rho)$ in the spherical coordinates
is independent of the gauge parameter $\xi$ in the post-Newtonian approximation.
The inverses of the non-vanishing components of $G_{\mu\nu}$ are given by
1 2Gm 4G2m2
IO0(p) = -=1+-+-,
Goo(P> P P2
Iyp)= L=
- I - :- 2 [ + -6G2m2]
,
Gll(P) P2
1 -1 1 -1
I(p) = - -- 133(p) = ___ -
-
G22(P) P2 G33(p) p2sin28
Let us choose $\theta = \pi/2$. The Hamilton-Jacobi equation for a planet with mass $m_p$ has the
following form:
By the general procedure for solving this equation, we look for an action $S$ in the form
$S = -E_o w + M\phi + f(\rho)$. The action $S$ is found to be
where $E_o$ and $M$ are respectively the constant energy and angular momentum. As usual, the
trajectory is determined by $\partial S/\partial M$ = constant:
OD 11 - 6G2m2
I II I-l+-----,
P2
where $I^{00}$ and $I^{11}$ are given by (31). This term $I^{00}|I^{11}|$ differs from the corresponding term
in Einstein's theory, in which $g^{00}|g^{11}| = 1$. For the approximate trajectory, we write (34) as
a differential equation with $u = 1/\rho$. We have
$P = \dfrac{M^2}{m_p^2\,Gm}.$
Einstein's theory does not have this type of correction term $Q$ because it has the relation
$g^{00}|g^{11}| = 1$. The new correction term $Q$ is of the order of $(Gm/P)\beta^2$, which is the result
of the relation $I^{00}|I^{11}| > 1$ in (34). This correction is extremely small and undetectable
because the velocity $\beta$ of the planet is very small in comparison with the speed of light.
To see the effect of this correction Q to the perihelion shift, we solve (36) by a change
of variable 3 = u( 1 - Q). We can write equation (36) as
where $e$ is the eccentricity. The semi-major axis $a$ can be expressed in terms of $e$ and $P$:
$a = P/(1 - e^2)$. The perihelion shift for one revolution of the planet is given by
2na 6nGm
P P
where the second term shows the difference between the present Yang-Mills gravity and
Einstein's theory. This result shows that the observable perihelion shift is independent of
the gauge parameter $\xi$, which appears in the second-order approximation of the solution of
the spin-2 gauge field $g\phi_{\mu\nu}$. Since the observational accuracy of the perihelion shift of
Mercury is about 1 %, the prediction (40) of Yang-Mills gravity can be tested only when
$(E_o^2 - m_p^2)/m_p^2 \approx \beta^2 \approx 0.01$. Thus, it is not possible to test the small correction in (40) of
Yang-Mills gravity in the solar system.
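As a rough numerical illustration of the leading term in the perihelion shift, $6\pi Gm/P$ with $P = a(1 - e^2)$, the sketch below evaluates it for Mercury in SI units (restoring the factor $c^2$ that the text sets to 1). The orbital elements and constants are standard reference values assumed here, not numbers taken from the text, and the small Yang-Mills correction term is not modelled.

```python
import math

# Assumed reference values (SI); not taken from the text.
GM_SUN = 1.32712440018e20      # heliocentric gravitational constant G*m, m^3/s^2
C = 299_792_458.0              # speed of light, m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def perihelion_shift_per_orbit(a_m, e):
    """Leading term 6*pi*G*m/(c^2 * P) in radians per revolution, with P = a(1 - e^2)."""
    P = a_m * (1.0 - e**2)
    return 6.0 * math.pi * GM_SUN / (C**2 * P)

def shift_arcsec_per_century(a_m, e, period_days):
    """Accumulate the per-orbit shift over one Julian century (36525 days)."""
    orbits_per_century = 36525.0 / period_days
    return perihelion_shift_per_orbit(a_m, e) * orbits_per_century * ARCSEC_PER_RAD

# Mercury: semi-major axis (m), eccentricity, orbital period (days); assumed values.
MERCURY = dict(a_m=5.7909e10, e=0.2056, period_days=87.969)

if __name__ == "__main__":
    print(f"{shift_arcsec_per_century(**MERCURY):.2f} arcsec per century")
```

With these inputs the leading term comes out close to the familiar value of about 43 arcseconds per century, so a 1 % observational accuracy corresponds to roughly 0.4 arcseconds per century.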
For the bending of light, using the eikonal equation (47) with $I^{\mu\nu}$ given by (31), one obtains
the trajectory of the ray, which is the same as (36) with $m_p \to 0$ and $E_o$ replaced by
$\omega_o = -\partial\psi/\partial w$ ($c = 1$). Following the usual procedure, we obtain
4Gmw, 3nGmw,
A$%-
M
where the second term in the bracket is negligible. So there is no observable difference
between Yang-Mills gravity and Einstein's theory in the known experiments.
If one compares the fermion equation, $(i\Gamma_\mu(P^{\mu\nu} + g\phi^{\mu\nu})D_\nu - m)\psi = 0$, derived from (12)
with the Dirac equation in quantum electrodynamics [i.e., $(i\gamma^\mu\partial_\mu - e\gamma^\mu A_\mu - m)\psi = 0$], one
can see a distinct difference: namely, the kinematical term $i\gamma^\mu\partial_\mu$ and the electromagnetic
coupling term $e\gamma^\mu A_\mu$ have a different relative sign if one takes the complex conjugate of the
Dirac equation. This implies the presence of both repulsive and attractive forces between
two charges. However, there is no change in the relative sign of the kinematical term and
the spin-2 coupling term in our fermion equation when one takes the complex conjugate.
This provides a natural explanation that the gravitational force is only attractive between
two fermions, in contrast to that in electrodynamics.
Yang-Mills gravity with the translational symmetry [10] has a well-defined conservation
law for the energy-momentum tensor, just as in ordinary field theory. It is
believed that such a spacetime gauge theory can shed light on quantum gravity because
(i) the maximum interaction vertex is a 4-vertex, just as in the usual Yang-Mills theory
with internal Lie groups, and (ii) it is based on gauge symmetry, which could minimize the
ultraviolet divergences. In light of these discussions, it is possible to understand gravity
based on the spacetime gauge theory with the translational symmetry in general frames of
reference.
The author would like to thank Zhenhua Ning for his help. The work is supported in
part by the Potz Science Fund and the Jing Shin Research Fund of the UMass Dartmouth
Foundation.
Note added. The energy-momentum tensor of gravitation $t_{\mu\nu}$ is defined by the field equation
(21) (with $\xi = 0$) written in the following form, $D^\lambda D_\lambda \phi_{\mu\nu} = -g(T_{\mu\nu} + t_{\mu\nu})$, in a general
frame. Using the usual approximations and gauge condition (15), we can calculate the
average energy-momentum of a gravitational plane wave and the power. For example, the
power $P_o$ emitted per unit solid angle in the direction $\mathbf{x}/|\mathbf{x}|$ can be written as
where $T(\mathbf{k}, \omega)$ is defined as follows: [9] Suppose one observes this radiation in the wave zone,
one can write the polarization tensor in terms of the Fourier transform of $T_{\mu\nu}$:
$T_{\mu\nu}(\mathbf{k}, \omega) \equiv \int d^3x\, T_{\mu\nu}(\mathbf{x}, \omega)\exp(-i\mathbf{k}\cdot\mathbf{x}),$
where the polarization tensor $e_{\mu\nu}(\mathbf{x}, \omega)$ is defined by the relation
$\phi_{\mu\nu}(\mathbf{x}, t) \approx [e_{\mu\nu}(\mathbf{x}, \omega)\exp(-ik_\lambda x^\lambda) + \text{c.c.}]$. The approximate result for the power
emitted per solid angle in Yang-Mills gravity turns out to be the same as that obtained in
Einstein's theory. [9]
References
[1] A. Einstein and M. Grossmann, Z. Math. Physik 62, 225 (1913). See also F. J. Dyson,
Bull. Am. Math. Soc. 78 (1972) 635, and S. Weinberg, Gravitation and Cosmology
(John Wiley and Sons, 1972) pp. 285-289.
[2] Jong-Ping Hsu, Phys. Lett. 119B (1982) 328. Such a gauge theory predicts a new
gravitational spin force produced by fermion spin densities, in addition to the usual
gravitational force produced by the mass density. The usual Yang-Mills-type formalism
(with Faddeev-Popov gauge compensation terms) for internal gauge groups is more
difficult to apply to this case.
[3] The idea of an effective Riemannian spacetime due to the presence of a symmetric
spin-2 field in flat spacetime was extensively discussed by Logunov and others. Their
theory has a different gauge transformation and a completely different Lagrangian. See
A. A. Logunov, The Theory of Gravity (Trans. by G. Pontecorvo, Moscow, Nauka,
2001) and references therein. For a discussion of the spin-2 field, see also H. van Dam
and M. Veltman, Nucl. Phys. B22 (1970) 397, and S. Weinberg, The Quantum Theory
of Fields. vol. 1. Foundations (Cambridge Univ. Press, 1995) pp. 246-255.
[4] Jong-Ping Hsu and Leonardo Hsu, Nuovo Cimento B 112 (1997) 575 and Chin. J.
Phys. 35 (1997) 407. Jong-Ping Hsu, Einstein's Relativity and Beyond - New Symmetry
Approaches (World Scientific, Singapore, 2000), Chapters 21-23. Daniel Schmidt and
Jong-Ping Hsu, Intern. J. Modern Phys. A (2005, to be published).
[5] C. Møller, Danske Vid. Sel. Mat.-Fys. 20 (1943) No. 19; Ta-You Wu and Y. C. Lee,
Intern. J. Theoretical Phys. 5 (1972) 307.
[6] For arbitrary linear accelerations with limiting 4-dimensional symmetry, one has
$P_{\mu\nu}dx^\mu dx^\nu = W^2dw^2 + 2U\,dw\,dx - dr^2$. See J. P. Hsu, Chin. J. Phys. 40 (2002) 265.
[7] L. Landau and E. Lifshitz, The Classical Theory of Fields (trans. by M. Hamermesh,
Addison-Wesley, 1951) p. 58 and pp. 312-316. See also S. Weinberg, ref. 1, pp. 185-201.
[8] Wei-Tou Ni, in The Proceedings of the Fourth International Workshop on Gravitation
and Astrophysics (Ed. L. Liu, J. Luo, X. Z. Li and J. P. Hsu, World Scientific, 2000)
pp. 1-19.
[9] S. Weinberg, Gravitation and Cosmology (John Wiley and Sons, 1972) p. 178 and pp.
259-273.
[10] Within the framework of curved spacetime, gauge gravity with translation gauge symmetry
was discussed by Y. M. Cho, Phys. Rev. D 14, 2515 (1976). His formulation and
results are very much different from ours. For example, Cho made the additional assumption
tP$ = 0 for the gauge covariant derivative, so that one has Ai$ = (ai+ B!<,)$ = ai$,
where tPare the translation group generators. Furthermore, he assumed a*$ = h;ap$,
where It; = 6; + f B r . This assumption is equivalent to assuming the curved space-
time. Yang-Mills formulation of gauge symmetry and gauge fields do not make these
additional assumptions.
Chapter 9
WEI-TOU NI
Center for Gravitation and Cosmology, Solar-System Division,
Purple Mountain Observatory, Chinese Academy of Sciences,
No. 2, Beijing W. Rd., Nanjing, China 210008
In 1859, Le Verrier discovered the Mercury perihelion advance anomaly. This anomaly turned out to
be the first relativistic-gravity effect observed. During the 141 years to 2000, the precision of
laboratory and space experiments, and astrophysical and cosmological observations on relativistic
gravity has improved by 3 orders of magnitude. In 1999, we envisaged a 3-6 order improvement in
the next 30 years in all directions of tests of relativistic gravity. In 2000, the interferometric
gravitational wave detectors began their runs to accumulate data. In 2003, the measurement of
relativistic Shapiro time-delay of the Cassini spacecraft determined the relativistic-gravity parameter
γ to be 1.000021 ± 0.000023 of general relativity, a 1.5-order improvement. In October 2004,
Ciufolini and Pavlis reported a measurement of the Lense-Thirring effect on the LAGEOS and
LAGEOS2 satellites to be 0.99 ± 0.10 of the value predicted by general relativity. In April 2004,
Gravity Probe B (Stanford relativity gyroscope experiment to measure the Lense-Thirring effect to 1
%) was launched and has been accumulating science data for more than 170 days now. μSCOPE
(MICROSCOPE: MICRO-Satellite à traînée Compensée pour l'Observation du Principe
d'Equivalence) is on its way for a 2007 launch to test the Galileo equivalence principle to $10^{-15}$. LISA
Pathfinder (SMART2), the technological demonstrator for the LISA (Laser Interferometer Space
Antenna) mission, is well on its way for a 2008 launch. STEP (Satellite Test of Equivalence
Principle) and ASTROD (Astrodynamical Space Test of Relativity using Optical Devices) are in a
good planning stage. Various astrophysical tests and cosmological tests of relativistic gravity will
reach precision and ultra-precision stages. Clock tests and atomic interferometry tests of relativistic
gravity will reach an ever-increasing precision. These will give revived interest and development both
in experimental and theoretical aspects of gravity, and may lead to answers to some profound
questions of gravity and the cosmos.
1. Introduction
A dimensionless parameter $\xi(\mathbf{x}, t)$ characterizing the strength of gravity at a spacetime
point $\mathcal{P}$ [with coordinates $(\mathbf{x}, t)$] due to a gravitating source is the ratio of the negative
of the potential energy, $mU$ (due to this source), to the inertial mass-energy $mc^2$ of a test
body at $\mathcal{P}$, i.e.,
where R is the distance to the source. For a nearly Newtonian system, we can use
Newtonian potential for U. The strength of gravity for various configurations is tabulated
in Table 1.
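For a nearly Newtonian point source, the ratio described above reduces to $GM/(Rc^2)$. The sketch below evaluates it for a few configurations; the masses and distances are standard reference values assumed for illustration, and the printed numbers are not claimed to reproduce the entries of Table 1.

```python
# Assumed reference constants (SI); not taken from the text.
G = 6.67430e-11          # Newtonian gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s

def gravity_strength(mass_kg, distance_m):
    """Dimensionless strength U/c^2 with the Newtonian potential U = G*M/R."""
    return G * mass_kg / (distance_m * C**2)

# (source mass in kg, distance to the source in m); illustrative values only.
CONFIGURATIONS = {
    "surface of the Sun": (1.98847e30, 6.957e8),
    "surface of the Earth": (5.9722e24, 6.371e6),
    "solar potential at 1 AU": (1.98847e30, 1.495978707e11),
}

if __name__ == "__main__":
    for name, (mass, distance) in CONFIGURATIONS.items():
        print(f"{name}: {gravity_strength(mass, distance):.3e}")
```

All three values are far below 1, which is why the Newtonian potential is an adequate stand-in for U in the solar system.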
Table 2. A road map (Highlights) for gravity. D denotes dynamical effect; EP denotes equivalence
principle effect.
This review is a five-year update from a previous review article (W.-T. Ni,
"Empirical tests of the relativistic gravity: the past, the present and the future", in Recent
advances and cross-century outlooks in physics: interplay between theory and
experiment: proceedings of the Conference held on March 18-20, 1999 in Atlanta,
Georgia, editors, Pisin Chen and Cheuk-Yin Wong [Singapore: World Scientific,
2000]; and pp. 1-19 in Gravitation and Astrophysics, editors, Liu L, Luo J, Li X-Z and
Hsu J-P [Singapore: World Scientific, 2000]).
In section 2, we review Mercury's perihelion advance and events leading to
general relativity. In section 3, we discuss the classical tests. In section 4, we review
precision measurement tests and the foundations of relativistic gravity. In section 5, we
review solar system tests since the revival (1960). In sections 6 and 7, we discuss
astrophysical tests and cosmological tests respectively. In section 8, we discuss
gravitational-wave observations in relation to testing relativistic gravity. In section 9, we
discuss next generation experiments in progress, planned and proposed. In section 10, we
give an outlook. In the appendix, we discuss empirical tests associated with the
Eddington-Robertson formalism.
3. Classical Tests
The perihelion advance anomaly of Mercury, the deflection of light passing the limb of
the Sun and the gravitational redshift are the three classical tests of relativistic gravity.
Using EEP, Einstein [30] derived the deflection of light passing the limb of the Sun in
1911. This agrees with the deflection of light derived by using the particle model of light in
the late 18th century. Before 1915, observations on light deflection were not successful
due to war and weather. Einstein's general relativity doubled the prediction of the
deflection of light (1″.75). The 1919 British solar eclipse expeditions reported reasonably
good agreement with the prediction of Einstein's relativity. Before 1960, there were
several such observations. The accuracy of these observations was not better than 10-20%.
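The factor of two between the two predictions can be checked numerically. The sketch below assumes the standard expressions $2GM/(c^2R)$ for the equivalence-principle (or particle-model) deflection at the solar limb and $4GM/(c^2R)$ for general relativity; the solar constants are reference values assumed here, not taken from the text.

```python
import math

# Assumed reference values (SI); not taken from the text.
GM_SUN = 1.32712440018e20      # m^3/s^2
C = 299_792_458.0              # m/s
R_SUN = 6.957e8                # nominal solar radius, m
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def limb_deflection_arcsec(factor):
    """Deflection factor*G*M/(c^2 * R) for a ray grazing the solar limb, in arcseconds."""
    return factor * GM_SUN / (C**2 * R_SUN) * ARCSEC_PER_RAD

half_value = limb_deflection_arcsec(2.0)   # 1911 (EEP) and particle-model value
full_value = limb_deflection_arcsec(4.0)   # general relativity

if __name__ == "__main__":
    print(f"EEP / particle model: {half_value:.3f} arcsec")
    print(f"general relativity:   {full_value:.3f} arcsec")
```

An observation accurate to 10-20% of the full value is enough to separate the two predictions, which differ by about 0.88 arcseconds.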
After Einstein [10] proposed the gravitational redshift, Freundlich [31] started the
long effort to disentangle the gravitational redshift of solar and other stellar spectral lines
from other causes. Over the next five decades, astronomers did not agree on whether
there is gravitational redshift empirically [32]. This question was finally settled and
gravitational redshift confirmed by Pound and Rebka [15] using the Mössbauer effect. The
improved result of Pound and Snider [15] confirmed the redshift prediction to 1 %
accuracy.
where $H = \det(H_{ik})$, $H_{ik}$ is a metric which generates the light cone for electromagnetic
propagation, and $e^{ijkl}$ is the completely antisymmetric symbol with $e^{0123} = 1$ [35-37].
Recently, Lämmerzahl and Hehl have shown that this non-birefringence guarantees,
without approximation, a Riemannian light cone, i.e., Eq. (4) [38].
Eq. (4) is verified empirically to high accuracy from pulsar observations and from
polarization measurements of extragalactic radio sources and will be discussed in section 6 on
the astrophysical tests. Let us now look into the empirical constraints for $H^{ik}$ and $\psi$. In Eq.
(3), $ds$ is the line element determined from the metric $g_{ik}$. From Eq. (4), the gravitational
coupling to electromagnetism is determined by the metric $H_{ik}$ and two scalar fields $\varphi$ and
$\psi$. If $H_{ik}$ is not proportional to $g_{ik}$, then the hyperfine levels of the lithium atom, the
beryllium atom, the mercury atom and other atoms will have additional shifts. But this is
not observed to high accuracy in Hughes-Drever experiments [39]. Therefore $H_{ik}$ is
proportional to $g_{ik}$ to a certain accuracy. Since a change of $H^{ik}$ to $\lambda H^{ik}$ does not affect $\chi^{ijkl}$
in Eq. (4), we can define $H_{11} = g_{11}$ to remove this scale freedom [35, 40].
In Hughes-Drever experiments [39], $\Delta m/m \le 0.5 \times 10^{-28}$ or $\Delta m/m_{e.m.} \le 0.3 \times 10^{-24}$,
where $m_{e.m.}$ is the electromagnetic binding energy. Using Eq. (4) in Eq. (3), we have three
kinds of contributions to $\Delta m/m_{e.m.}$. These three kinds are of the order of (i) $(H_{\mu\nu} - g_{\mu\nu})$, (ii)
$(H_{0\mu} - g_{0\mu})v$, and (iii) $(H_{00} - g_{00})v^2$ respectively [35, 40]. Here the Greek indices $\mu$, $\nu$
denote space indices. Considering the motion of laboratories from earth rotation, in the
solar system and in our galaxy, we can set limits on various components of $(H_{ik} - g_{ik})$ from
Hughes-Drever experiments as follows:
Thus, we see that for the constraint on $|H_{00} - g_{00}|/U$, Hughes-Drever experiments give
the most stringent limit. However, the STEP mission concept [46] proposes to improve the
WEP experiment by five orders of magnitude. This will again lead in precision in
determining $H_{00}$.
The theory (3) with $\chi^{ijkl}$ given by
where $\varphi$ is a scalar or pseudoscalar function of the gravitational field and $\epsilon^{ijkl} = (-g)^{-1/2}e^{ijkl}$,
is studied in [47] and [48]. In (3), the particles considered have charges but no spin. To
include spin-1/2 particles, we can add the Lagrangian for Dirac particles. Experimental
tests of the equivalence principle for polarized bodies are reviewed in [49].
To include QCD and other gauge interactions, we have generalized the $\chi$-$g$
framework [50]. Now we are working on a more comprehensive generalization to
include a framework to test special relativity, and a framework to test the gravitational
interactions of scalar particles and particles with spins together with gauge fields.
Table 4. Relativity-parameter determination from interplanetary radio ranging and from lunar laser
ranging.
In the last column of Table 4, the values come from two references [54] and [55]. In
[55], Williams et al. used a total of 15 553 LLR normal-point data in the period of March
1970 to April 2004 from Observatoire de la Côte d'Azur, McDonald Observatory and
Haleakala Observatory in their determination. Each normal point comprises from 3 to
about 100 photons. The weighted rms scatter after their fits for the last ten years of
ranges is about 2 cm (about $5 \times 10^{-11}$ of range).
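The fractional figure quoted for the lunar ranges follows from dividing the rms scatter by the Earth-Moon distance. The short sketch below does this arithmetic with an assumed mean distance of 3.844e8 m, a standard reference value that is not given in the text.

```python
# Assumed mean Earth-Moon distance (m); a standard reference value, not from the text.
MEAN_EARTH_MOON_DISTANCE_M = 3.844e8
RMS_SCATTER_M = 0.02   # the 2 cm weighted rms scatter quoted above

def fractional_range_precision(scatter_m, range_m):
    """Ratio of the post-fit scatter to the measured range."""
    return scatter_m / range_m

if __name__ == "__main__":
    ratio = fractional_range_precision(RMS_SCATTER_M, MEAN_EARTH_MOON_DISTANCE_M)
    print(f"fractional precision: {ratio:.1e}")
```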
In 2003, Bertotti, Iess and Tortora [56] reported a measurement of the frequency
shift of radio photons due to the relativistic Shapiro time-delay effect from the Cassini
spacecraft as they passed near the Sun during the June 2002 solar conjunction. From this
measurement, they determined $\gamma$ to be 1.000021 ± 0.000023.
With the advent of VLBI (Very Long Baseline Interferometry) at radio wavelengths,
the gravitational deflection of radio waves by the Sun from astrophysical radio sources
has been observed, and the accuracy of observation had been improved to $1.7 \times 10^{-3}$ for $\gamma$ [61,
62] in 1995. Recent analysis using VLBI data from 1979-1999 improved this result by
about four times to 0.99983 ± 0.00045 [63]. Fomalont and Kopeikin [64] measured the
effect of retardation of gravity by the field of moving Jupiter via VLBI observation of
light bending from a quasar.
The solar-system measurements have made possible the creation of high-accuracy
planetary and lunar ephemerides. Two most complete series of ephemerides are the
numerical DE ephemerides of JPL [65] and the EPM ephemerides of the Institute of
Applied Astronomy [53]. They are of the same level of accuracy and can be used to fit
experiments/observations and to determine astronomical constants. Krasinsky and
Brumberg [66] used these two series of ephemerides to analyze the major planet motions
and the AU (Astronomical Unit); Pitjeva [53] has recently used the EPM framework to
determine the AU and obtain 1 AU = 149 597 870 696.0 m. The JPL DE410
determination of this number is 1 AU = 149 597 870 697.4 m. The difference of 1.4 m
represents the realistic error in the determination of the AU. Pitjeva's [53] determination
of $\beta$ and $\gamma$ is obtained simultaneously with estimations for the solar oblateness and the
possible variability of the gravitational constant.
In 1918, Lense and Thirring predicted that the rotation of a body like Earth will drag
the local inertial frames of reference around it in general relativity. In 2004, Ciufolini and
Pavlis [67] reported a measurement of this Lense-Thirring effect on the two Earth
satellites, LAGEOS and LAGEOS2; it is 0.99 ± 0.10 of the value predicted by general
relativity. In the same year, Gravity Probe B (a space mission to test general relativity
using cryogenic gyroscopes in orbit) was launched in April and aims at measurement of
the Lense-Thirring effect to about 1 % [68].
With the Hipparcos mission, very accurate measurements of star positions at various
elongations from the Sun were accumulated. Most of the measurements were at
elongations greater than 47° from the Sun. At these angles, the relativistic light
deflections are typically a few mas; it is 4.07 mas according to general relativity at right
angles to the solar direction for an observer at 1 AU from the Sun. In the Hipparcos
measurements, each abscissa on a reference great-circle has a typical precision of 3 mas
for a star with 8-9 mag. There are about 3.5 million abscissae generated, and the
precision in angle or similar parameter determination is in the range. Froeschlé, Mignard
and Arenou [69] analyzed these Hipparcos data and determined the light deflection
parameter $\gamma$ to be 0.997 ± 0.003. This result demonstrated the power of precision optical
astrometry.
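The 4.07 mas figure at right angles to the Sun can be reproduced from the standard post-Newtonian expression for the deflection seen by an observer at distance $r$ from the Sun, $(1+\gamma)\,(GM/c^2r)\cot(\chi/2)$, where $\chi$ is the elongation of the star. This formula and the constants below are assumptions of the sketch (standard textbook values), not quantities stated in the text.

```python
import math

# Assumed reference values (SI); not taken from the text.
GM_SUN = 1.32712440018e20      # m^3/s^2
C = 299_792_458.0              # m/s
AU_M = 1.495978707e11          # astronomical unit, m
MAS_PER_RAD = 180.0 * 3600.0 * 1000.0 / math.pi

def deflection_mas(elongation_deg, gamma=1.0, observer_distance_m=AU_M):
    """Light deflection (mas) at a given solar elongation; gamma = 1 is general relativity."""
    chi = math.radians(elongation_deg)
    scale = (1.0 + gamma) * GM_SUN / (C**2 * observer_distance_m)
    return scale / math.tan(chi / 2.0) * MAS_PER_RAD

if __name__ == "__main__":
    for angle in (47.0, 90.0, 135.0):
        print(f"elongation {angle:5.1f} deg: {deflection_mas(angle):.2f} mas")
```

Because the deflection scales with $(1+\gamma)/2$ relative to general relativity, a fit of many abscissae at different elongations constrains $\gamma$ directly.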
6. Astrophysical Tests
In the early days, astronomical observations of the solar system provided the basis
for developing gravitation theories. With increasingly precise observations, astrophysics
and cosmology are becoming more important for such developments. Precise timing
of pulsars provides:
(i) confirmation of the quadrupole radiation formula for gravitational radiation [17],
(ii) additional testing ground for Post-Newtonian Parameters [17],
(iii) test of nonbirefringence of propagation of electromagnetic wave in a gravitational
field, and
(iv) upper limit of background gravitational-wave radiation [70-731.
We refer (i), (ii) and (iv) to references cited. Here, we discuss (iii).
With the null-birefringence observations of pulsar pulses and micropulses before
1980, the relations (4) for testing EEP are empirically verified to - [35-37].
With the present pulsar observations, these limits would be improved; such a detailed
analysis is in [74]. Analyzing the data from polarization measurements of extragalactic
radio sources, Haugan and Kauffmann [75] inferred that the resolution for null-
7. Cosmological Tests
In an attempt to find a static cosmology, Einstein added a cosmological constant $\Lambda$ to
his equation. The term containing $\Lambda$ can be interpreted as a modification of Einstein's
equation or it can be just interpreted as vacuum stress-energy.
Although Einstein considered the proposal of this term the biggest blunder of his life,
the value of $\Lambda$ needs to be determined using cosmological observations.
Recent evidence suggests that Type Ia supernovae (SNeIa) can be used as precise
cosmological distance indicators [78]. Early results with these SNeIa observations imply
that there is not enough gravitating matter to close the universe [18, 19] and that
currently the expansion of the Universe is accelerating [20, 21], indicating that the $\Lambda$-density
(cosmological term, dark energy or quintessence) is larger than the ordinary-matter
density. More supernovae observations together with more precise cosmic background
anisotropy measurements will be important in testing and determining the gravitational
equation in the cosmological context.
In section 4, we mentioned a nonmetric theory [33, 34, 47] in discussing the
foundations of relativistic gravity. Theories with spontaneous direction [79] and axion
theories also have such an electromagnetic interaction. The effect of $\varphi$ [in (9)] in this
theory is to change the phase of two different circular polarizations in a gravitational field
and give polarization rotation for linearly polarized light [47, 79, 80]. Using polarization
observations of radio galaxies, Carroll, Field and Jackiw [79, 80] put a limit of 0.1 on $\Delta\varphi$
over cosmological distances. Using a different analysis of polarization observations of
radio galaxies, Nodland and Ralston [81] found an indication of anisotropy in
electromagnetic propagation over cosmological distances with a birefringence scale of
order $10^{25}$ m (i.e., about 0.1 - 0.2 Hubble distance). This gave $\Delta\varphi \sim 5 - 10$ over the Hubble
distance. Later analyses [82-86] did not confirm this result and put a limit of $\Delta\varphi \le 1$
over the cosmological distance scale.
The natural coupling strength $\varphi$ is of order 1. However, the isotropy of our
observable universe to $10^{-5}$ may lead to a change $\Delta\varphi$ of $\varphi$ over the cosmological distance
scale $10^{-5}$ smaller [87]. Hence, observations to test and measure $\Delta\varphi$ to $10^{-6}$ are significant
and they are promising. In 2002, the DASI microwave interferometer observed the
polarization of the cosmic background. With the axial interaction (9), the polarization
anisotropy is shifted relative to the temperature anisotropy. In 2003, WMAP (Wilkinson
Microwave Anisotropy Probe) [88] found that the polarization and temperature are
correlated. This gives a constraint of $10^{-1}$ on $\Delta\varphi$ [89]. Planck Surveyor [90] will be
launched in 2007 with better polarization-temperature measurement and will give a
sensitivity to $\Delta\varphi$ of $10^{-2}$-$10^{-3}$. A dedicated future experiment on cosmic microwave
background radiation will reach $10^{-5}$-$10^{-6}$ $\Delta\varphi$-sensitivity. This is very significant as a
positive result may indicate that our patch of inflationary universe has a 'spontaneous
polarization' in the fundamental law of electromagnetic propagation influenced by
8. Gravitational-Wave Observations
The importance of gravitational-wave detection is twofold: (i) as probes to fundamental
physics and cosmology, especially black hole physics and early cosmology, and (ii) as
tools in astronomy and astrophysics to study compact objects and to count them. We
follow [97] to extend the conventional classification of gravitational-wave frequency
bands [98] into the ranges:
(i) High-frequency band (1-10 kHz): This is the frequency band that ground
gravitational-wave detectors are most sensitive to.
(ii) Low-frequency band (100 nHz - 1 Hz): This is the frequency band that space
gravitational-wave experiments are most sensitive to.
(iii) Very-low-frequency band (300 pHz - 100 nHz): This is the frequency band that the
pulsar timing experiments are most sensitive to.
(iv) Extremely-low-frequency band (1 aHz - 10 fHz): This is the frequency band that the
cosmic microwave anisotropy and polarization experiments are most sensitive to.
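The band limits listed above can be encoded as a small lookup, which makes it easy to see where a given source frequency falls. This is only an illustrative sketch: the function name `classify_band` is hypothetical, the limits are copied from the list as printed, a shared edge (100 nHz) is assigned to the higher band by treating each interval as half-open, and frequencies that fall between the quoted bands return `None`.

```python
# Band edges in Hz, as quoted in the list above; intervals are treated as [low, high).
BANDS = [
    ("high-frequency", 1e3, 1e4),                 # 1-10 kHz, ground detectors
    ("low-frequency", 100e-9, 1.0),               # 100 nHz - 1 Hz, space experiments
    ("very-low-frequency", 300e-12, 100e-9),      # 300 pHz - 100 nHz, pulsar timing
    ("extremely-low-frequency", 1e-18, 10e-15),   # 1 aHz - 10 fHz, CMB experiments
]

def classify_band(frequency_hz):
    """Return the name of the quoted band containing frequency_hz, or None if outside all bands."""
    for name, low, high in BANDS:
        if low <= frequency_hz < high:
            return name
    return None

if __name__ == "__main__":
    for f in (5e3, 1e-3, 1e-8, 1e-17, 100.0):
        print(f"{f:g} Hz -> {classify_band(f)}")
```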
The cryogenic resonant bar detectors have already reached a strain sensitivity of
$10^{-21}/(\mathrm{Hz})^{1/2}$ in the kHz region. Five such detectors (ALLEGRO, AURIGA,
EXPLORER, NAUTILUS and NIOBE) have been on the air, forming a network with
their bar axes quasi-parallel in a continuous search for bursts. The TAMA (300 m arm length)
interferometer started accumulating data in 2000. GEO and the kilometer-size
laser-interferometric gravitational-wave detectors, LIGO and VIRGO, took runs and
started to accumulate data also, with a strain sensitivity goal aimed at $10^{-23}/(\mathrm{Hz})^{1/2}$ in the
frequency range around 100 Hz. Various limits on the gravitational-wave strains for different
sources become significant. For example, analysis of data collected during the second
LIGO science run set strain upper limits as low as a few times $10^{-24}$ for some pulsar
sources; these translate into limits on the equatorial ellipticities of the pulsars, which are
smaller than $10^{-5}$ for the four closest pulsars [99].
Space interferometers (LISA [100, 101], ASTROD [102, 103]) for gravitational-wave
detection hold the most promise. LISA (Laser Interferometer Space Antenna) [100]
is aimed at detection of low-frequency ($10^{-4}$ to 1 Hz) gravitational waves with a strain
sensitivity of $4 \times 10^{-21}/(\mathrm{Hz})^{1/2}$ at 1 mHz. There are abundant sources for LISA: galactic
binaries (neutron stars, white dwarfs, etc.). Extra-galactic targets include supermassive
black hole binaries, supermassive black hole formation, and cosmic background
gravitational waves. A date of launch is hoped for 2013.
For the very-low-frequency band and for the extremely-low-frequency band, it is
more convenient to express the sensitivity in terms of energy density per logarithmic
frequency interval divided by the cosmic closure density $\rho_c$ for a cosmic background of
gravitational waves, i.e., $\Omega_g(f) = (f/\rho_c)\,d\rho(f)/df$.
The upper limits from pulsar timing observations on a gravitational wave
background are about R, 5 lo- in the frequency range 4-40 nHz [70], and Q, 5 4 x
at 6 x lo- Hz [72]. More pulsar observations with extended periods of time will improve
the limits by two orders of magnitude in the lifetime of present ground and space
gravitational-wave-detector projects. The COBE microwave-background quadrupole
anisotropy measurement [104, 105] gives a limit $\Omega_g(1\ \mathrm{aHz}) \sim 10^{-9}$ on the extremely-low-frequency
gravitational-wave background [106, 107]. Ground and balloon experiments
probe smaller-angle anisotropies and, hence, higher-frequency background. WMAP [108]
and Planck Surveyor [90] space missions could probe anisotropies with $l$ up to 2000 and
with higher sensitivity.
2013. In the present study, GAIA aims at a limiting magnitude of 21, with survey completeness
to visual magnitude 19-20, and proposes to measure the angular positions of 35 million
objects (to visual magnitude V = 15) to 10 μas accuracy and those of 1.3 billion objects
(to V = 20) to 0.2 mas accuracy. The observing accuracy for V = 10 objects is aimed at 4
μas. To increase the weight of measuring the relativistic light deflection parameter γ,
GAIA is planned to make measurements at elongations greater than 35° (as compared to
essentially 47° for Hipparcos) from the Sun. With all these, a simulation shows that
GAIA could measure γ to $1 \times 10^{-5}$ - $2 \times 10^{-7}$ accuracy [115].
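To see why the minimum solar elongation matters for the light deflection parameter, the following is a minimal sketch, assuming the standard first-order PPN expression for a distant source seen by an observer at distance r from the Sun, δ = (1 + γ)(GM/c²r) cot(χ/2), where χ is the elongation. The function name, the constants, and the choice of r = 1 AU are illustrative assumptions, not taken from the GAIA study.

```python
import math

GM_SUN_OVER_C2_M = 1476.625      # G*M_sun/c^2 in metres (assumed constant)
AU_M = 1.495978707e11            # astronomical unit in metres
RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0  # radians -> milliarcseconds


def deflection_mas(elongation_deg, gamma=1.0, observer_distance_m=AU_M):
    """First-order solar light deflection (mas) for a distant source.

    elongation_deg is the Sun-observer-source angle; gamma is the PPN
    parameter (1 in general relativity).
    """
    half_angle = math.radians(elongation_deg) / 2.0
    return ((1.0 + gamma) * GM_SUN_OVER_C2_M / observer_distance_m
            / math.tan(half_angle) * RAD_TO_MAS)


if __name__ == "__main__":
    # Compare the elongation limits quoted in the text with 90 degrees.
    for chi in (35.0, 47.0, 90.0):
        print(f"elongation {chi:5.1f} deg: {deflection_mas(chi):.2f} mas")
```

Because the deflection is proportional to (1 + γ), the signal available for constraining γ grows as cot(χ/2); under these assumptions, observing closer to the Sun gives a larger lever arm on γ for the same astrometric accuracy.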
10. Outlook
Physics is an empirical science, and so is gravitation. The road map for gravitation is clearly
empirical. As precision increases by orders of magnitude, we are in a position to
explore more deeply the origin of gravitation. The current and coming generations
hold such promise.
Acknowledgements
I would like to thank the National Natural Science Foundation (Grant No. 10475114),
and the Foundation of Minor Planets of Purple Mountain Observatory for supporting this
work.
Appendix
Since many readers are more familiar with the parametrization given by Eddington (A. S.
Eddington, The Mathematical Theory of Relativity [2nd ed., Cambridge University Press,
1924]) and Robertson (H. P. Robertson, p. 228 in Space Age Astronomy, ed. by A. J.
Deutsch and W. H. Klemperer [Academic Press, New York, 1962]) in testing relativistic
gravity, on the recommendation of J. P. Hsu (one of the editors of this book), I add this
explanatory appendix.
The Eddington-Robertson parametrization of the metric is
For dynamical tests of β and γ, Table 4 and Table 5 and the associated discussions in
section 5 and section 9 apply.
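For orientation, the Eddington-Robertson expansion is commonly written in isotropic coordinates as below. This is a sketch in one standard textbook convention (signature +,-,-,-, with U = GM/c²r); the notation and normalization are assumptions and may differ from those of the author's own display.

```latex
ds^2 = \left(1 - 2\alpha\,U + 2\beta\,U^2 + \cdots\right) c^2\,dt^2
     - \left(1 + 2\gamma\,U + \cdots\right)\left(dx^2 + dy^2 + dz^2\right),
\qquad U \equiv \frac{GM}{c^2 r}.
```

In this convention general relativity corresponds to α = β = γ = 1, and α is usually absorbed into the definition of the mass M, leaving β and γ as the parameters to be tested.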
References
1. I. Newton, Philosophiae Naturalis Principia Mathematica (London, 1687).
2. J. Kepler, Astronomia nova de motibus stellae Martis (Prague, 1609); Harmonice mundi (Linz,
1619).
3. G. Galilei, Discorsi e dimostrazioni matematiche intorno a due nuove scienze (Elzevir, Leiden,
1638).
4. U. J. J. Le Verrier, Théorie du mouvement de Mercure, Ann. Observ. imp. Paris (Mém.) 5, 1-196
(1859).
5. A. A. Michelson and E. W. Morley, Am. J. Sci. 34, 333 (1887).
6. H. A. Lorentz, Kon. Neder. Akad. Wet. Amsterdam. Versl. Gewone Vergad. Wis- en Natuurkd. Afd.
6, 809 (1904), and references therein.
7. H. Poincaré, C. R. Acad. Sci. 140, 1504 (1905), and references therein.
8. A. Einstein, Ann. Phys. 17, 891 (1905).
9. R. v. Eötvös, Math. Naturwiss. Ber. Ungarn 8, 65 (1889).
10. A. Einstein, Jahrb. Radioakt. Elektronik 4, 411 (1907); Corrections by Einstein in Jahrb.
Radioakt. Elektronik 5, 98 (1908); English translations by H. M. Schwartz in Am. J. Phys. 45,
512, 811, 899 (1977).
11. A. Einstein, Preuss. Akad. Wiss. Berlin, Sitzber. 778, 799, 831, 844 (1915).
12. F. Dyson, A. Eddington and C. Davidson, Phil. Trans. Roy. Soc. 220A, 291 (1920); F. Dyson,
A. Eddington and C. Davidson, Mem. Roy. Ast. Soc. 62, 291 (1920).
13. I. I. Shapiro, Phys. Rev. Lett. 13, 789 (1964).
14. P. G. Roll, R. Krotkov and R. H. Dicke, Ann. Phys. (U.S.A.) 26, 442 (1964).
15. R. V. Pound and G. A. Rebka, Phys. Rev. Lett. 4, 337 (1960); R. V. Pound and J. L. Snider,
Phys. Rev. Lett. 13, 539 (1964).
16. S. Newcomb, "Discussion and results of observations on transits of Mercury from 1677 to
1881", Astr. Pap. am. Ephem. naut. Alm., 1, 367-487 (U.S. Govt. Printing Office, Washington,
D.C., 1882).
17. J. H. Taylor, Rev. Mod. Phys. 66, 711 (1994); and references therein.
18. P. M. Garnavich, et al., Astrophys. J. 493, 53 (1998).
19. S. Perlmutter, et al., Nature 391, 51 (1998).
20. A. G. Riess, et al., Astron. J. 116, 1009 (1998).
21. S. Perlmutter, et al., Astrophys. J. 517(2), 565-586 (1999).
22. P. Moore, The Story of Astronomy, 5th revised edition (New York, Grosset & Dunlap
Publishing, 1977).
23. U. J. J. Le Verrier, C. R. Acad. Sci. Paris 29, 1 (1849); the English translation is from [24].
24. N. T. Roseveare, Mercury's Perihelion from Le Verrier to Einstein (Oxford, Clarendon Press,
1982); the reader is referred to this book for a thorough study of the history related to
Mercury's perihelion advance.
25. I. I. Shapiro, in General Relativity and Gravitation, ed. N. Ashby, D. F. Bartlett and W. Wyss,
p. 313 (Cambridge, Cambridge University Press, 1990).
26. H. Poincaré, L'état actuel et l'avenir de la physique mathématique, Bulletin des Sciences
Mathématiques, Tome 28, 2e série (reorganized 39-1), 306 (1904); the English translation is
from [27].
27. C. Marchal, Sciences 97-2 (April, 1997) and English translation provided by the author.
28. J. Stachel, p. 249 in Twentieth Century Physics, Vol. I, ed. L. M. Brown, A. Pais and B.
Pippard (New York, AIP Press, 1995).
29. D. Hilbert, Königl. Gesell. d. Wiss. Göttingen, Nachr., Math.-Phys. Kl., 395 (1915).
30. A. Einstein, Ann. Phys. (Germany) 35, 898 (1911).
31. K. Hentschel, Erwin Finlay-Freundlich and testing Einstein's theory of relativity, Arch. Hist.
Exact Sci. 47, 143 (1994); and references therein.
32. E. G. Forbes, Ann. Sci. 17, 143 (1961).
33. W.-T. Ni, Bull. Am. Phys. Soc. 19, 655 (1974).
34. W.-T. Ni, Phys. Rev. Lett. 38, 301 (1977).
35. W.-T. Ni, "Equivalence Principles and Precision Experiments", pp. 647-651, in Precision
Measurement and Fundamental Constants II, ed. by B. N. Taylor and W. D. Phillips, Natl.
Bur. Stand. (U.S.), Spec. Publ. 617 (1984).
36. W.-T. Ni, "Timing Observations of the Pulsar Propagations in the Galactic Gravitational Field
as Precision Tests of the Einstein Equivalence Principle", pp. 441-448 in Proceedings of the
Second Asian-Pacific Regional Meeting of the International Astronomical Union, ed. by B.
46. ESA SCI(93)4, STEP (Satellite Test of the Equivalence Principle), report on the phase A study
(1993).
47. W.-T. Ni, A Nonmetric Theory of Gravity, preprint, Montana State University, Bozeman,
Montana, USA (1973). The paper is available via
http://gravity5.phys.nthu.edu.tw/webpage/article4/index.html.
48. W.-T. Ni, "Spin, Torsion and Polarized Test-Body Experiments", pp. 531-540 in Proceedings
of the 1983 International School and Symposium on Precision Measurement and Gravity
Experiment, Taipei, Republic of China, January 24-February 2, 1983, ed. by W.-T. Ni
(Published by National Tsing Hua University, Hsinchu, Taiwan, Republic of China, June,
1983).
49. W.-T. Ni, Searches for the role of polarization and spin in gravitation, review article in
preparation for Reports on Progress in Physics (2005).
50. W.-T. Ni, Phys. Lett. A 120, 174 (1987).
51. C. M. Will and K. Nordtvedt, Jr., Astrophys. J. 177, 757 (1972); and references therein.
52. J. D. Anderson, E. L. Lau, S. Turyshev, J. G. Williams, and M. M. Nieto, Bulletin of the
American Astronomical Society 34, 660 (2002).
53. E. Pitjeva, "Precise determination of the motion of planets and some astronomical constants
from modern observations", 12 p, to be published in IAU Coll. N 196 / Transit of Venus: new
views of the solar system and galaxy (ed. D. W. Kurtz), Cambridge: Cambridge University
Press, 2005.
54. J. G. Williams, X. X. Newhall, and J. O. Dickey, Phys. Rev. D 53, 6730 (1996).
55. J. G. Williams, S. G. Turyshev, and D. H. Boggs, Phys. Rev. Lett. 93, 261101(4) (2004).
56. B. Bertotti, L. Iess, and P. Tortora, Nature 425, 374-376 (2003).
57. R. W. Hellings, P. J. Adams, J. D. Anderson, M. S. Keesey, E. L. Lau, E. M. Standish, V. M.
Canuto and I. Goldman, Phys. Rev. Lett. 51, 1609 (1983).
58. R. D. Reasenberg, Philos. Trans. R. Soc. London, Ser. A 310, 227 (1983); J. F. Chandler, R. D.
Reasenberg, and I. I. Shapiro, Bull. Amer. Astron. Soc. 25, 1233 (1993).
59. J. D. Anderson, J. K. Campbell, R. F. Jurgens, E. L. Lau, X. X. Newhall, M. A. Slade III, and E.
M. Standish Jr., "Recent Developments in Solar-System Tests of General Relativity", in
Proceedings of the 6th Marcel Grossmann Meeting on General Relativity, ed. H. Sato and T.
Nakamura, p. 353 (Singapore, World Scientific, 1992).
60. J. G. Williams, J. D. Anderson, D. H. Boggs, E. L. Lau, and J. O. Dickey, Bulletin of the
95. L.-S. Hou, W.-T. Ni and Y.-C. M. Li, Phys. Rev. Lett. 90 (20), 201101 (4) (2003); Preprint
physics/0009012 (2000).
96. B. R. Heckel et al., in CPT and Lorentz Symmetry II, V. A. Kostelecký, ed. (World Scientific,
Singapore, 2002).
97. K. S. Thorne, Gravitational Waves, p. 160 in Particle and Nuclear Astrophysics and Cosmology
in the Next Millennium, ed. E. W. Kolb and R. D. Peccei (World Scientific, Singapore, 1995).
98. W.-T. Ni, "ASTROD and gravitational waves", pp. 117-129 in Gravitational Wave Detection,
edited by K. Tsubono, M.-K. Fujimoto and K. Kuroda (Universal Academy Press, Tokyo,
Japan, 1997).
99. B. Abbott et al., "Limits on gravitational wave emission from selected pulsars using LIGO data",
arXiv: gr-qc/0410007 v2, 19 Jan. 2005 (2005).
100. LISA, Pre-Phase A Report, second edition, July (1998).
101. LISA, Laser Interferometer Space Antenna: A Cornerstone Mission for the Observation of
Gravitational Waves, ESA System and Technology Study Report, ESA-SCI 11 (2000).
102. A. Bec-Borsenberger, J. Christensen-Dalsgaard, M. Cruise, A. Di Virgilio, D. Gough, M.
Keiser, A. Kosovichev, C. Laemmerzahl, J. Luo, W.-T. Ni, A. Peters, E. Samain, P. H.
Scherrer, J.-T. Shy, P. Touboul, K. Tsubono, A.-M. Wu, and H.-C. Yeh, "Astrodynamical
Space Test of Relativity using Optical Devices ASTROD - A Proposal Submitted to ESA in
Response to Call for Mission Proposals for Two Flexi-Mission F2/F3", January 31, 2000.
103. W.-T. Ni, Int. J. Mod. Phys. D 11 (7), 947-962 (2002).
104. G. F. Smoot et al., Astrophys. J. 396, L1 (1992).
105. C. L. Bennett et al., Astrophys. J. 464, L1 (1996).
106. L. M. Krauss and M. White, Phys. Rev. Lett. 69, 969 (1992).
107. R. L. Davis, H. M. Hodges, G. F. Smoot, P. J. Steinhardt, and M. S. Turner, Phys. Rev. Lett.
69, 1856 (1992).
108. C. L. Bennett, M. Halpern, G. Hinshaw, et al., Astrophys. J. Suppl. 148 (1), 1-27 (2003).
109. P. Touboul, M. Rodrigues, G. Metris and B. Tatry, MICROSCOPE, testing the equivalence
principle in space, C. R. Acad. Sci. Ser. IV 2(9), 1271-1286 (2001).
110. T. Damour and K. Nordtvedt, Jr., Phys. Rev. Lett. 70, 2217 (1993).
111. T. Damour, F. Piazza, and G. Veneziano, Phys. Rev. D 66, 046007 (2002); Preprint
hep-th/0205111 (2002).
112. http://www.esa.int/esaSC/12039 l-index-0-m.html
113. A. Milani, D. Vokrouhlicky, D. Villani, C. Bonanno, and A. Rossi, Phys. Rev. D 66 (8),
082001(21) (2004).
114. http://www.esa.int/esaSC/ 120377-index-O-m.html
115. A. Vecchiato, M. G. Lattanzi, B. Bucciarelli, et al., Astron. Astrophys. 399 (1), 337-342
(2003).
116. W.-T. Ni, G. Bao, Y. Bao, et al., J. Korean Phys. Soc. 45, S118-S123 (2004).
117. S. G. Turyshev, M. Shao, and K. Nordtvedt, Class. Quantum Grav. 21, 2773-2799 (2004).
118. W.-T. Ni, S. Shiomi and A.-C. Liao, Class. Quantum Grav. 21, S641 (2004).
I. SEARCH AND DISCOVERY

Work leading to the discovery of the first pulsar in a binary system began more than twenty years ago, so it seems reasonable to begin with a bit of history. Pulsars burst onto the scene (Hewish et al., 1968) in February 1968, about a month after I completed my Ph.D. at Harvard University. Having accepted an offer to remain there on a post-doctoral fellowship, I was looking for an interesting new project in radio astronomy. When Nature announced the discovery of a strange new rapidly pulsating radio source, I immediately drafted a proposal, together with Harvard colleagues, to observe it with the 92 m radio telescope of the National Radio Astronomy Observatory. By late spring we had detected and studied all four of the pulsars which by then had been discovered by the Cambridge group, and I began thinking about how to find further examples of these fascinating objects, which were already thought likely to be neutron stars.

Pulsar signals are generally quite weak, but have some unique characteristics that suggest effective search strategies. Their otherwise noise-like signals are modulated by periodic, impulsive waveforms; as a consequence, dispersive propagation through the interstellar medium makes the narrow pulses appear to sweep rapidly downward in frequency. I devised a computer algorithm for recognizing such periodic, dispersed signals in the inevitable background noise, and in June 1968 we used it to discover the fifth known pulsar (Huguenin et al., 1968).

Since pulsar emissions exhibited a wide variety of new and unexpected phenomena, we observers put considerable effort into recording and studying their details and peculiarities. A pulsar model based on strongly magnetized, rapidly spinning neutron stars was soon established as consistent with most of the known facts (Gold, 1968). The model was strongly supported by the discovery of pulsars inside the glowing, gaseous remnants of two supernova explosions, where neutron stars should be created (Large et al., 1968; Staelin and Reifenstein, 1968), and also by an observed gradual lengthening of pulsar periods (Richards and Comella, 1969) and polarization measurements that clearly suggested a rotating source (Radhakrishnan and Cooke, 1969). The electrodynamical properties of a spinning, magnetized neutron star were studied theoretically (Goldreich and Julian, 1969) and shown to be plausibly capable of generating broadband radio noise detectable over interstellar distances. However, the rich diversity of the observed radio pulses suggested magnetospheric complexities far beyond those readily incorporated in theoretical models. Many of us suspected that detailed understanding of the pulsar emission mechanism might be a long time coming, and that, in any case, the details might not turn out to be fundamentally illuminating.

In September 1969 I joined the faculty at the University of Massachusetts, where a small group of us planned to build a large, cheap radio telescope especially for observing pulsars. Our telescope took several years to build, and during this time it became clear that whatever the significance of their magnetospheric physics, pulsars were interesting and potentially important to study for quite different reasons. As the collapsed remnants of supernova explosions, they could provide unique experimental data on the final stages of stellar evolution, as well as an opportunity to study the properties of nuclear matter in bulk. Moreover, many pulsars had been shown to be remarkably stable natural clocks (Manchester and Peters, 1972), thus providing an alluring challenge to the experimenter, with consequences and applications about which we could only speculate at the time. For such reasons as these, by the summer of 1972 I was devoting a large portion of my research time to the pursuit of accurate timing measurements of known pulsars, using our new telescope in western Massachusetts, and to planning a large-scale pulsar search that would use bigger telescopes at the national facilities.

I suspect it is not unusual for an experiment's motivation to depend, at least in part, on private thoughts quite unrelated to avowed scientific goals. The challenge of a good intellectual puzzle, and the quiet satisfaction of finding a clever solution, must certainly rank highly among my own incentives and rewards. If an experiment seems difficult to do, but plausibly has interesting consequences, one feels compelled to give it a try. Pulsar searching is the perfect example: it's clear that there must be lots of pulsars out there, and, once identified, they are not so very hard to observe. But finding each one for the first time is a formidable task, one that can become a sort of detective game. To play the game you invent an efficient way of gathering clues, sorting, and assessing them, hoping to discover the identities and celestial locations of all the guilty parties.

Most of the several dozen pulsars known in early 1972 were discovered by examination of strip-chart records, without benefit of further signal processing. Nevertheless, it was clear that digital computer techniques would

*Nobel Lecture, presented to the Royal Swedish Academy of Sciences on 8 December 1993.
Reviews of Modern Physics, Vol. 66, No. 3, July 1994. © 1994 The American Physical Society.
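The downward frequency sweep of dispersed pulses mentioned in the lecture can be illustrated with a short sketch. It assumes the usual cold-plasma dispersion law, in which the arrival delay scales as DM/ν², and uses a commonly quoted value of the dispersion constant. The function names, the channel frequencies, and the dispersion measure are hypothetical illustration values; this is not the search algorithm of the lecture, which also exploits periodicity.

```python
K_DM = 4.148808e3  # dispersion constant in s MHz^2 pc^-1 cm^3 (assumed value)


def dispersion_delay_s(dm, freq_mhz):
    """Arrival delay (s) at freq_mhz relative to infinite frequency."""
    return K_DM * dm / freq_mhz ** 2


def channel_shifts(freqs_mhz, dm, dt_s):
    """Delay of each channel, in whole samples, relative to the highest frequency."""
    t_ref = dispersion_delay_s(dm, max(freqs_mhz))
    return [round((dispersion_delay_s(dm, f) - t_ref) / dt_s) for f in freqs_mhz]


def make_dispersed_pulse(freqs_mhz, dm, dt_s, n_samples, pulse_bin):
    """Synthetic data: one unit-height pulse per channel, delayed by dispersion."""
    channels = []
    for shift in channel_shifts(freqs_mhz, dm, dt_s):
        series = [0.0] * n_samples
        series[(pulse_bin + shift) % n_samples] = 1.0
        channels.append(series)
    return channels


def dedisperse(channels, freqs_mhz, dm, dt_s):
    """Undo the per-channel delays and sum the channels into one time series."""
    n_samples = len(channels[0])
    summed = [0.0] * n_samples
    for series, shift in zip(channels, channel_shifts(freqs_mhz, dm, dt_s)):
        for i in range(n_samples):
            summed[i] += series[(i + shift) % n_samples]
    return summed


if __name__ == "__main__":
    freqs = [430.0, 425.0, 420.0, 415.0]   # hypothetical channel centres, MHz
    data = make_dispersed_pulse(freqs, dm=50.0, dt_s=1e-3, n_samples=256, pulse_bin=10)
    profile = dedisperse(data, freqs, dm=50.0, dt_s=1e-3)
    print("peak bin:", profile.index(max(profile)), "peak height:", max(profile))
```

Summing the channels without the shifts smears the pulse over many samples, whereas shifting with the correct trial dispersion measure stacks it into a single bin; a search can therefore scan trial values and look for the one that maximizes the peak.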
FIG. 3. Simplified block diagram of equipment used for timing pulsars at Arecibo. [Figure labels: GPS Satellites; UTC(NIST), Boulder; Synthesizer.]

FIG. 4. Pulse profiles obtained on April 24, 1992 during a five-minute observation of PSR 1913+16. The characteristic double-peaked shape, clearly seen in the de-dispersed profile at the bottom, is also discernible in the 32 individual spectral channels. [Horizontal axis: Time (s).]
predictions for the remaining PK parameters.
The binary systems most likely to yield measurable PK parameters are those with large masses and high eccen-
panions and nearly circular orbits can yield significant post-Keplerian measurements. The best present example is PSR 1855+09: its orbital plane is nearly parallel to

FIG. 9. The masses of ten neutron stars, measured by observing relativistic effects in binary pulsar orbits. Asterisks after pulsar names denote companions to the observed pulsars. [Figure labels: PSR 1802-07, PSR 1855+09, PSR 1913+16, PSR 1913+16*, PSR 2127+11C, PSR 2303+46, PSR 2303+46*; horizontal axis: mass in solar masses, 0.5 to 2.]
observations. Results based on 15 months of data (Taylor et al., 1992) have already produced significant measurements of four PK parameters: $\dot\omega$, $\gamma$, $r$, and $s$. In recent work not yet published, Wolszczan and I have measured the orbital decay rate, $\dot P_b$, and found it to be in accord with general relativity at about the 20% level. In fact, all measured parameters of the PSR 1534+12 system are consistent within general relativity, and it appears that when the full experimental analysis is complete, Einstein's theory will have passed three more very stringent tests under strong-field and radiative conditions.

I do not believe that general relativity necessarily contains the last valid words to be written about the nature of gravity. The theory is not, of course, a quantum theory, and at its most fundamental level the universe appears to obey quantum-mechanical rules. Nevertheless, our experiments with binary pulsars show that, whatever the precise directions of future theoretical work may be, the correct theory of gravity must make predictions that are asymptotically close to those of general relativity over a vast range of classical circumstances.

ACKNOWLEDGMENTS

Russell Hulse and I have many individuals to thank for their important work, both experimental and theoretical, without which our discovery of PSR 1913+16 could not have borne fruit so quickly or so fully. Most notable among these are Roger Blandford, Thibault Damour, Lee Fowler, Peter McCulloch, Joel Weisberg, and the skilled and dedicated technical staff of the Arecibo Observatory.

REFERENCES

Blandford, R., and S. A. Teukolsky, 1976, Arrival-time analysis for a pulsar in a binary system, Astrophys. J. 205, 580-591.
Burns, W. R., and B. G. Clark, 1969, Pulsar search techniques, Astron. Astrophys. 2, 280-287.
Camilo, F., 1994, Millisecond pulsar searches, in Lives of the Neutron Stars, NATO ASI Series, edited by A. Alpar (Kluwer, Dordrecht).
Damour, T., and N. Deruelle, 1985, General relativistic celestial mechanics of binary systems. I. The post-Newtonian motion, Ann. Inst. Henri Poincaré Phys. Théor. 43, 107-132.
Damour, T., and N. Deruelle, 1986, General relativistic celestial mechanics of binary systems. II. The post-Newtonian timing formula, Ann. Inst. Henri Poincaré Phys. Théor. 44, 263-292.
Damour, T., and J. H. Taylor, 1991, On the orbital period change of the binary pulsar PSR 1913+16, Astrophys. J. 366, 501-511.
Damour, T., and J. H. Taylor, 1992, Strong-field tests of relativistic gravity and binary pulsars, Phys. Rev. D 45, 1840-1868.
Epstein, R., 1977, The binary pulsar: Post Newtonian timing effects, Astrophys. J. 216, 92-100.
Gold, T., 1968, Rotating neutron stars as the origin of the pulsating radio sources, Nature 218, 731-732.
Goldreich, P., and W. H. Julian, 1969, Pulsar electrodynamics, Astrophys. J. 157, 869-880.
Haugan, M. P., 1985, Post-Newtonian arrival-time analysis for a pulsar in a binary system, Astrophys. J. 296, 1-12.
Hewish, A., S. J. Bell, J. D. H. Pilkington, P. F. Scott, and R. A. Collins, 1968, Observation of a rapidly pulsating radio source, Nature 217, 709-713.
Huguenin, G. R., J. H. Taylor, L. E. Goad, A. Hartai, G. S. F. Orsten, and A. K. Rodman, 1968, New pulsating radio source, Nature 219, 576.
Hulse, R. A., 1994, The discovery of the binary pulsar, in Les Prix Nobel (The Nobel Foundation).
Hulse, R. A., and J. H. Taylor, 1974, A high sensitivity pulsar survey, Astrophys. J. 191, L59-L61.
Hulse, R. A., and J. H. Taylor, 1975a, Discovery of a pulsar in a binary system, Astrophys. J. 195, L51-L53.
Hulse, R. A., and J. H. Taylor, 1975b, A deep sample of new pulsars and their spatial extent in the galaxy, Astrophys. J. 201, L55-L59.
Kaspi, V. M., J. H. Taylor, and M. Ryba, 1994, High-precision timing of millisecond pulsars. III. Long-term monitoring of PSRs B1855+09 and B1937+21, Astrophys. J. (in press).
Large, M. I., A. E. Vaughan, and B. Y. Mills, 1968, A pulsar supernova association, Nature 220, 340-341.
Manchester, R. N., and W. L. Peters, 1972, Pulsar parameters from timing observations, Astrophys. J. 173, 221-226.
Manchester, R. N., J. H. Taylor, and G. R. Huguenin, 1972, New and improved parameters for twenty-two pulsars, Nature Phys. Sci. 240, 74.
McCulloch, P. M., J. H. Taylor, and J. M. Weisberg, 1979, Tests of a new dispersion-removing radiometer on binary pulsar PSR 1913+16, Astrophys. J. 227, L133-L137.
Radhakrishnan, V., and D. J. Cooke, 1969, Magnetic poles and the polarization structure of pulsar radiation, Astrophys. Lett. 3, 225-229.
Rawley, L. A., J. H. Taylor, and M. M. Davis, 1988, Fundamental astrometry and millisecond pulsars, Astrophys. J. 326, 947-953.
Richards, D. W., and J. M. Comella, 1969, The period of pulsar NP 0532, Nature 222, 551-552.
Ryba, M. F., and J. H. Taylor, 1991, High precision timing of millisecond pulsars. I. Astrometry and masses of the PSR 1855+09 system, Astrophys. J. 371, 739-748.
Staelin, D. H., and E. C. Reifenstein, III, 1968, Pulsating radio sources near the Crab Nebula, Science 162, 1481-1483.
Stinebring, D. R., V. M. Kaspi, D. J. Nice, M. F. Ryba, J. H. Taylor, S. E. Thorsett, and T. H. Hankins, 1992, A flexible data acquisition system for timing pulsars, Rev. Sci. Instrum. 63, 3551-3555.
Taylor, J. H., 1972, A high sensitivity survey to detect new pulsars, research proposal submitted to the US National Science Foundation, September, 1972.
Taylor, J. H., 1974, A sensitive method for detecting dispersed radio emission, Astron. Astrophys. Suppl. Ser. 15, 367.
Taylor, J. H., 1991, Millisecond pulsars: Nature's most stable clocks, Proc. IEEE 79, 1054-1062.
Taylor, J. H., 1993, Testing relativistic gravity with binary and millisecond pulsars, in General Relativity and Gravitation 1992, edited by R. J. Gleiser, C. N. Kozameh, and O. M. Moreschi (Institute of Physics, Bristol), pp. 287-294.
Taylor, J. H., and R. J. Dewey, 1988, Improved parameters for four binary pulsars, Astrophys. J. 332, 770-776.
Taylor, J. H., L. A. Fowler, and P. M. McCulloch, 1979, Measurements of general relativistic effects in the binary pulsar PSR 1913+16, Nature 277, 431.
Taylor, J. H., R. A. Hulse, L. A. Fowler, G. W. Gullahorn, and J. M. Rankin, 1976, Further observations of the binary pulsar PSR 1913+16, Astrophys. J. 206, L53-L58.
Taylor, J. H., R. N. Manchester, and A. G. Lyne, 1993, Catalog of 558 pulsars, Astrophys. J. Suppl. Ser. 88, 529-568.
Taylor, J. H., and J. M. Weisberg, 1982, A new test of general relativity: Gravitational radiation and the binary pulsar PSR 1913+16, Astrophys. J. 253, 908-920.
Taylor, J. H., and J. M. Weisberg, 1989, Further experimental tests of relativistic gravity using the binary pulsar PSR 1913+16, Astrophys. J. 345, 434-450.
Taylor, J. H., A. Wolszczan, T. Damour, and J. M. Weisberg, 1992, Experimental constraints on strong-field relativistic gravity, Nature 355, 132-136.
Thorsett, S. E., Z. Arzoumanian, M. M. McKinnon, and J. H. Taylor, 1993, The masses of two binary neutron star systems, Astrophys. J. 405, L29-L32.
Wolszczan, A., 1991, A nearby 37.9 ms radio pulsar in a relativistic binary system, Nature 350, 688-690.
Other Perspectives*
I. MOTIVATION AND INTRODUCTION

The concept of the electromagnetic field was conceived by Faraday and Maxwell to describe electromagnetic effects in a space-time region. According to this concept, the field strength $f_{\mu\nu}$ describes electromagnetism. It was later realized,$^{1}$ however, that $f_{\mu\nu}$ by itself does not, in quantum theory, completely describe all electromagnetic effects on the wave function of the electron. The famous Bohm-Aharonov experiment, first beautifully performed by Chambers,$^{2}$ showed that in a multiply connected region where $f_{\mu\nu} = 0$ everywhere there are physical experiments for which the outcome depends on the loop integral

$$\frac{e}{\hbar c}\oint A_\mu\,dx^\mu \qquad (1)$$

around an unshrinkable loop. This raises the question of what constitutes an intrinsic and complete description of electromagnetism. In the present paper we wish to discuss this question and also its generalization to non-Abelian gauge fields.

An examination of the Bohm-Aharonov experiment indicates that in fact only the phase factor

$$\exp\left(\frac{ie}{\hbar c}\oint A_\mu\,dx^\mu\right), \qquad (2)$$

and not the phase (1), is physically meaningful. In other words, the phase (1) contains more information than the phase factor (2). But the additional information is not measurable. This simple point, probably implicitly recognized by many authors, is discussed in Sec. II. It leads to the concept of nonintegrable (i.e., path-dependent) phase factor as the basis of a description of electromagnetism.

This concept has been taken$^{3}$ as the basis of the definition of a gauge field. The discussions in Ref. 3, however, centered only on the local properties of gauge fields. To extend the concept to global problems we analyze in Sec. III the field produced by a magnetic monopole. We demonstrate how the quantization of the pole strength, a striking result due to Dirac,$^{4}$ is understood in this concept of electromagnetism. The demonstration is closely related to that in the original Dirac paper. Dirac discussed the phase factor of the wave function of an electron (which, among other things, depends on the electron energy). Our emphasis is on the nonintegrable electromagnetic phase factor (which does not depend on such quantities as the energy of the electron).

The monopole discussion leads to the recognition that in general the phase factor (and indeed the vector potential $A_\mu$) can only be properly defined in each of many overlapping regions of space-time. In the overlap of any two regions there exists a gauge transformation relating the phase factors defined for the two regions. This discussion is made more precise in Sec. IV. It leads to the definition of global gauges and global gauge transformations.

In Sec. V generalizations to non-Abelian gauge groups are made. The special cases of SU$_2$ and SO$_3$ gauge fields are discussed in Secs. VI and VII. A surprising result is that the monopole types are quite different for SU$_2$ and SO$_3$ gauge fields and for electromagnetism.

The mathematics of these results is in fact well known to the mathematicians in fiber bundle theory. An identification table of terminologies is given in Sec. V. We should emphasize that our interest in this paper does not lie in the beautiful, deep, and general mathematical development in fiber bundle theory. Rather we are concerned with the necessary concepts to describe the physics of gauge theories. It is remarkable that these concepts have already been intensively studied as mathematical constructs.

Section VIII discusses a gedanken generalized Bohm-Aharonov experiment for SU$_2$ gauge fields.
12 3845
3846 TAI TSUN WU AND CHEN NING YANG 12

less the mass of the gauge particle vanishes. In the last section we make several remarks.

II. DESCRIPTION OF ELECTROMAGNETISM

The Bohm-Aharonov experiment explores the electromagnetic effect on an electron beam (Fig. 1) in a doubly connected region where the electromagnetic field is zero. As predicted by Aharonov and Bohm, the fringe shift is dependent on the phase factor (2), which is equal to

FIG. 1. Bohm-Aharonov experiment (Refs. 1, 2). A magnetic flux is in the cylinder. Outside of the cylinder the field strength $f_{\mu\nu} = 0$. [Figure labels: electron beam, cylinder, interference plane.]

For this gauge transformation to be definable, $S$ must be single-valued, but $\alpha$ itself need not be. Now $(A_\mu)_b - (A_\mu)_a$ is curlless; hence (5) can always be solved for $\alpha$. But it is multiple-valued with an increment of

$$\Delta\alpha = \frac{e}{\hbar c}\,(\Omega_b - \Omega_a)$$

every time one goes around the cylinder. If (3) is satisfied, $\Delta\alpha = 2\pi \times$ integer and $S$ is single-valued. Case a and case b outside of the cylinder are then gauge-transformable into each other, and no physically observable effects would differentiate them. The same argument obviously holds if one studies the wave function of an interacting system of particles provided the charges of the particles are all integral multiples of $e$. Thus we have shown the validity of Theorem 1.

does not change the prediction of the outcome of any physical measurements. Following Ref. 3, we shall call the phase factor (7) a nonintegrable (i.e., path-dependent) phase factor.

Electromagnetism is thus the gauge-invariant manifestation of a nonintegrable phase factor. We shall develop this theme further in the next section.

III. FIELD DUE TO A MAGNETIC MONOPOLE

The definition of a nonintegrable phase factor (7) in a general case may present problems. To illustrate the problem, let us study the magnetic monopole field of Dirac.$^{4}$ Consider a static magnetic monopole of strength $g \neq 0$ at the origin $\vec{r} = 0$ and take the region $R$ of space-time under consideration to be all space-time minus the origin $\vec{r} = 0$. We shall now show the following:
CONCEPT OF NONINTEGRABLE PHASE FACTORS AND ... 3847
Theorem 2: There does not exist a singularity-free $A_\mu$ over all $R$.

If a singularity-free $A_\mu$ does exist throughout $R$, consider the loop integral $\oint A_\mu\,dx^\mu$ for time $t = 0$ around a circle at fixed spherical coordinates $r$ and $\theta$ with azimuthal angle $\phi = 0 \to 2\pi$. This integral, denoted by $\Omega(r, \theta)$, for $r > 0$, is equal to the magnetic flux through a cap bounded by the loop, or more explicitly $\Omega(r, \theta) = 2\pi g(1 - \cos\theta)$. At $\theta = 0$, $\Omega(r, 0) = 0$. Increasing $\theta$ leads to a continuous increase in $\Omega$ till one approaches $\theta = \pi$, at which

$$\Omega(r, \pi) = 4\pi g. \qquad (9)$$

But at $\theta = \pi$ the loop shrinks to a point. Therefore $\Omega(r, \pi) = 0$ since $A_\mu$ has no singularity. We have thus reached a contradiction and Theorem 2 is proved.

With an $A_\mu$ which has singularities, the nonintegrable phase factor becomes undefined if the path goes through a singularity. This difficulty must be resolved in order to use a nonintegrable phase factor as a fundamental concept to describe electromagnetism. It can be resolved in the following way. Let us seek to divide $R$ into two overlapping regions $R_a$ and $R_b$ and to define $(A_\mu)_a$ and $(A_\mu)_b$, each singularity-free in their respective regions, so that (i) their curls are equal to the magnetic field and (ii) in the overlapping region $(A_\mu)_a$ and $(A_\mu)_b$ are related by gauge transformation. One possible choice is to take the regions to be

$$R_a:\ 0 \le \theta < \pi/2 + \delta,\quad 0 < r,\quad 0 \le \phi < 2\pi,\quad \text{all } t,$$
$$R_b:\ \pi/2 - \delta < \theta \le \pi,\quad 0 < r,\quad 0 \le \phi < 2\pi,\quad \text{all } t, \qquad (10)$$

with an overlap extending throughout $\pi/2 - \delta < \theta < \pi/2 + \delta$. (We assume $0 < \delta \le \pi/2$.) Take

The gauge transformation in the overlap of the two regions is

$$S = S_{ab} = \exp(-i\alpha) = \exp\left(\frac{2ige}{\hbar c}\,\phi\right). \qquad (12)$$

This is an allowed gauge transformation if and only if $S$ is single-valued, i.e.

$$\frac{2ge}{\hbar c} = \text{integer} = D, \qquad (13)$$

which is Dirac's quantization. With (13) we have

To define the phase factor for a path we refer to Fig. 2, where a point in the overlapping region, such as point $P$, is regarded as two points $P_a$ and $P_b$. If a path is entirely within region a or b, we define $\Phi$ along the path by (7) with $(A_\mu)_a$ or $(A_\mu)_b$ in the integrand in the exponent. If the path $Q \leftarrow P$ is entirely within the overlapping region we have then two possible phase factors $\Phi_{Q_a P_a}$ and $\Phi_{Q_b P_b}$. It is easy to prove that

which merely states that $(A_\mu)_a$ and $(A_\mu)_b$ are related by a gauge transformation with the transformation factor (12).

For a path that crisscrosses in and out of the overlapping region, such as $A \to B \to C \to D \to E$ in Fig. 2, the definition of $\Phi$ is

$$\Phi_{EDCBA} = \Phi_{E D_a}\, S_{ab}(D)\, \Phi_{D_b C B_b}\, S_{ba}(B)\, \Phi_{B_a A}. \qquad (15)$$

Notice that fixing the path but sliding the points $B$ and $D$ along it does not change $\Phi_{EDCBA}$ [because of formulas like (14')] so long as $B$ and $D$ remain in the overlapping region.

The phase factor so defined satisfies the group property, e.g.,

$$\Phi_{EDCBA} = \Phi_{E D_a}\,\Phi_{D_a CBA} = \Phi_{E D_b}\,\Phi_{D_b CBA} = \Phi_{EDC}\,\Phi_{CBA}, \ \text{etc.} \qquad (16)$$

The relationship between the electromagnetic field and the phase factor around a loop is the same as usual. One only has to be careful that if the starting and terminating point $A$ is in the overlapping region, the phase factor is taken to be $\Phi_{A_a B A_a} = \Phi_{A_b B A_b}$, and not $\Phi_{A_a B A_b}$ or $\Phi_{A_b B A_a}$. The phase factor around the loop is then equal to
3848 TAI TSUN WU AND CHEN NING YANG 12
$\exp(ie\Omega/\hbar c)$, where $\Omega$ is the magnetic flux through a cap bordered by the loop. Notice that because of Dirac's quantization condition, the phase factor is the same whichever way one chooses the cap provided it does not pass through the point $\vec{r} = 0$ (any $t$).

We have satisfactorily resolved the difficulty mentioned at the beginning of this section, provided Dirac's quantization condition (13) is satisfied. We shall now prove the following.

Theorem 3: If (13) is not satisfied (the above method of resolving the difficulty would not work since) there exists no division of $R$ into overlapping regions $R_a, R_b, R_c, \ldots$ so that conditions (i) and (ii) stated above, properly generalized to the case of more than two regions, would hold.

To prove this statement, observe that if such a division is possible, one could generalize (15) and arrive at a satisfactory definition of the phase factor. The phase factor around a loop is then a continuous function of the loop. Take the loop to be a parallel on the sphere: $r$ fixed, $t = 0$, $\theta$ fixed, $\phi = 0 \to 2\pi$. The phase factor defined by the generalization of (15) is equal to

$$\exp\left[\frac{ie}{\hbar c}\,2\pi g(1 - \cos\theta)\right].$$

This is not equal to unity when $\theta = \pi$, since (13) is assumed to be invalid. Thus we have a contradiction.

Theorem 3 shows that if Dirac's quantization condition (13) is not satisfied, then the field of a magnetic monopole of strength $g$ cannot be taken as a realizable physical situation in $R$. (Of course, if one excludes the half-line $x = y = 0$, $z < 0$, or any half-line starting from $\vec{r} = 0$ leading to infinity, then it is possible to have any value for $g$.) This conclusion is the same as Dirac's, but viewed from a somewhat different point of emphasis.

IV. GENERAL DEFINITION OF GAUGE AND GLOBAL GAUGE TRANSFORMATION

Assuming that (13) holds, to round out our concept of a nonintegrable phase factor the question of the flexibility in the choice of the overlapping regions and the flexibility in the choice of $A_\mu$ in the regions must be faced. Both of these questions are related to gauge transformations.

Consider a gauge transformation $\xi$ in $R_b$ ($\xi$ will be assumed to be many times differentiable, but not necessarily analytic), resulting in a new potential $(A_\mu)'_b$. We shall illustrate schematically the transformation by elevating the region $b$ in Figure 3(a).

One could extend the region $b$. One could also contract it, provided the whole $R$ remains covered.

One could create a new region by considering a subregion of $b$ as an additional region $R_c$ [Figure 3(b)], and define the gauge transformation connecting them as the identity transformation so that $(A_\mu)_c = (A_\mu)_b$. One can then elevate $R_c$ and contract $R_b$, which results in Fig. 3(c).

FIG. 3. Distortions allowed in gauge transformation.

Through operations of the kind mentioned in the last three paragraphs, which we shall call distortions, we arrive at a large number of possibilities, each with a particular choice of overlapping regions and with a particular choice of gauge transformation from the original $(A_\mu)_a$ or $(A_\mu)_b$ to the new $A_\mu$ in each region. Each of such possibilities will be called a gauge (or global gauge). This definition is a natural generalization of the usual concept, extended to deal with the intricacies of the field of a magnetic monopole.

For each choice of gauge there is a definition of a nonintegrable phase factor for every path. The group condition $\Phi_{C_aBA_a} = \Phi_{C_aB_b}\Phi_{B_bA_a}$ is always satisfied.

Notice that the original gauge we started with was characterized by (a) specifying [in (10)] the regions [$R_a$ and $R_b$] and (b) specifying the gauge transformation factor (12) in the overlap (between $R_a$ and $R_b$). It does not refer to any specific $A_\mu$. [A distortion may of course lead to no changes in characterizations (a) and (b). Thus two different gauges may share the same characterizations (a) and (b).] In the case of the monopole field, we had chosen the vector potential to be given by (11). But, in fact, we can attach to this gauge any $(A_\mu)_a$ and $(A_\mu)_b$ provided they are gauge-transformed into each other by (12) in the region of overlap. (The resultant $f_{\mu\nu}$ is, of course, not a monopole field in general.) Thus a gauge is a concept not tied to any specific vector potential. We shall call the process of distortion leading from one gauge to another a global gauge transformation. It is also a concept not tied to any specific vector potential. It is a natural generalization of the usual gauge transformation.

The collection of gauges that can be globally gauge-transformed into each other will be said to
belong to the same gauge type.

The phase factor around a loop starts and ends at the same point in the same region. Thus it does not change under any global gauge transformation, i.e., we have, for Abelian gauge fields, the following.

Theorem 4a: The phase factor around any loop is invariant under a global gauge transformation.

It follows trivially from this, by taking an infinitesimal loop, that

Theorem 5a: The field strength $f_{\mu\nu}$ is invariant under a global gauge transformation.

For a given value of $D$, the gauge defined by (10) and (12) will be denoted by $\mathcal{G}_D$. For $D \neq D'$, the relationship, or rather the lack of relationship, between $\mathcal{G}_D$ and $\mathcal{G}_{D'}$ is shown by Theorem 6.

Theorem 6: For $D \neq D'$, $\mathcal{G}_D$ and $\mathcal{G}_{D'}$ are not related by a global gauge transformation, i.e., they are not of the same gauge type.

To prove this theorem we use Theorem 7.

Theorem 7: Between two gauge fields defined on the same gauge there exists a continuous interpolating gauge field defined on the same gauge.

To prove Theorem 7, we simply make a linear interpolation between the two original gauge fields which we shall denote by $(A_\mu)^{(\alpha)}$ and $(A_\mu)^{(\beta)}$:

$$(A_\mu)^{(t)} = t\,(A_\mu)^{(\alpha)} + (1 - t)\,(A_\mu)^{(\beta)},\qquad 0 \le t \le 1. \tag{18}$$

In an overlap between regions $a$ and $b$ this interpolating vector potential assumes values $(A_\mu)_a^{(t)}$ and $(A_\mu)_b^{(t)}$ which are related by the proper gauge transformation belonging to this overlap. Thus we have proved Theorem 7.

Now go back to Theorem 6 and assume it to be invalid. Then we can gauge-transform the vector potential belonging to the monopole of strength $D'\hbar c/2e$ to the gauge $\mathcal{G}_D$. For this gauge we have then two monopole fields of different pole strengths. Using Theorem 7 we interpolate between them and obtain unquantized magnetic monopoles, which contradict Theorem 3.

Notice that although in this proof of Theorem 6 we have used two specific gauge fields, the theorem itself does not refer to any specific gauge fields at all.

By the same argument as used in the proof of Theorem 7, any gauge field defined on $\mathcal{G}_D$ must have a magnetic monopole of strength $D\hbar c/2e$ at the excluded point $\vec{r} = 0$, in addition to possible fields produced by electric charges and currents. Thus the total magnetic flux around the origin $\vec{r} = 0$ is equal to $(2\pi\hbar c/e)D$ for any gauge field defined on $\mathcal{G}_D$. We shall state this as a theorem and give another proof of it.

Theorem 8: Consider gauge $\mathcal{G}_D$ and define any gauge field on it. The total magnetic flux through a sphere around the origin $\vec{r} = 0$ is independent of the gauge field and only depends on the gauge:

$$\oiint f_{\mu\nu}\,dx^\mu dx^\nu = -\frac{i\hbar c}{e}\oint \frac{\partial}{\partial x^\mu}(\ln S_{ab})\,dx^\mu, \tag{19}$$

where $S_{ab}$ is the gauge transformation defined by (12) for the gauge $\mathcal{G}_D$ in question, and the integral is taken around any loop around the origin $\vec{r} = 0$ in the overlap between $R_a$ and $R_b$, such as the equator on a sphere $r = 1$.

To prove this theorem we observe that the flux through the upper half of the sphere $r = 1$ is equal to the following integral around the equator:

$$\oint (A_\mu)_a\,dx^\mu. \tag{20a}$$

The flux through the lower half is equal to a similar integral around the equator:

$$-\oint (A_\mu)_b\,dx^\mu. \tag{20b}$$

Hence

$$\oiint f_{\mu\nu}\,dx^\mu dx^\nu = \oint \left[(A_\mu)_a - (A_\mu)_b\right]dx^\mu = -\frac{i\hbar c}{e}\oint \frac{\partial}{\partial x^\mu}(\ln S_{ab})\,dx^\mu, \tag{21}$$

which completes the proof. Using (13) and (12), the right-hand side of (21) is equal to $4\pi g$, as expected.

FIG. 4. Case of three regions for Theorem 8. The three paths from $P$ to $Q$ are in the three overlapping regions between $(R_a, R_b)$, $(R_b, R_c)$, and $(R_c, R_a)$.

If one starts with any gauge which is of the same gauge type as $\mathcal{G}_D$, and makes a global gauge transformation on it, the total flux is not changed by Theorem 5a. Thus (19), which depends only on the gauge, is in fact the same for all gauges of the same type. Notice that if there are more regions in a gauge than two, (19) should be replaced by a sum of line integrals along paths that are in the various overlaps between the regions. For a case of three regions there are three paths, which are illustrated in Fig. 4. Along each path the integral is of the form (19) with $S$ denoting the gauge transformation factor, such as (12), between the two regions containing the path. To prove Theorem 8 in this case one need only add three loop integrals together, each of the form of (20a) and (20b), and notice that along each path the integrand is always the difference of the vector potential $A_\mu$ between two regions, very much as in (21).

The first proof we gave above of Theorem 8 is easy and is obvious to a physicist. The second proof is more involved but is more intrinsic. The theorem is a special case of the Chern-Weil theorem which evolved from the famous Gauss-Bonnet-Allendoerfer-Weil-Chern theorem, a seminal development in contemporary mathematics. We want to emphasize two consequences of the theorem. (i) The right-hand side of (19) is independent of the gauge field, and only depends on the gauge type. (ii) The right-hand side of (19) has as integrand the gradient of $\ln S$. Since $S$ is single-valued, the integral must be equal to an integral multiple of a constant (in this case $2\pi i$). A remarkable fact is that these consequences remain valid in the general mathematical theorem, which is very deep.

V. GENERALIZATION TO NON-ABELIAN GAUGE FIELD

So far we have only considered electromagnetism and described it in terms of an Abelian gauge field that corresponds to the group $U_1$, or equivalently $SO_2$. On the basis of the discussions in the preceding section, the generalization to the non-Abelian case can be carried out without much difficulty. For a local region this has been done in Ref. 3. Extension to global considerations is our present focus of interest.

A gauge is defined by (a) a particular choice of overlapping regions and (b) a particular choice of single-valued gauge transformations $S_{ab}$ in the overlapping regions. The choice of gauge transformations clearly must satisfy the following two conditions.

(1) In the overlapping region $R_a \cap R_b$, the gauge transformations $S_{ab}$ from $a$ to $b$ and $S_{ba}$ from $b$ to $a$ are related by

$$S_{ab}S_{ba} = 1.$$

(2) If three regions $R_a$, $R_b$, and $R_c$ overlap, then there are gauge transformations $S_{ab}, S_{ba}, S_{ac}, S_{ca}, S_{bc}, S_{cb}$ so that

$$S_{ab}S_{bc}S_{ca} = 1,\ \text{etc.}$$

in $R_a \cap R_b \cap R_c$.

As in the case of electromagnetism, both the concept of a gauge and the concept of a global gauge transformation are not tied to any specific gauge potentials, denoted in general by $b^k_\mu$.

The nonintegrable phase factor for a given path is now an element of the gauge group. We shall still call it a phase factor. Since these phase factors do not in general commute with each other, Theorems 4a and 5a for the Abelian case need to be modified as follows.

Theorem 4: Under a global gauge transformation, the phase factor around any loop remains in the same class. The class does not depend on which point is taken as the starting point around the loop.

Theorem 5: The field strength $f^k_{\mu\nu}$ is covariant under a global gauge transformation.

Only Theorem 4 is not immediately transparent. For a loop $ABCA$, under a gauge transformation

$$\Phi_{ABCA} \to \Phi'_{ABCA} = S(A)\,\Phi_{ABCA}\,S^{-1}(A).$$

Thus $\Phi_{ABCA}$ and $\Phi'_{ABCA}$ are in the same class. Also around the same loop if we change the starting point from $A$ to $C$,

$$\Phi_{CABC} = \Phi_{CA}\,\Phi_{ABCA}\,\Phi_{AC}.$$

Hence changing the starting point does not change the class.

Theorem 4 defines the class of a loop. This concept is the generalization of the phase factor for electromagnetism around a loop with the magnetic flux as the exponent. It is a gauge-invariant concept.

These concepts have been extensively studied by the mathematicians in the framework of more general mathematical constructs. A translation of terminology is given in Table I.
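The two-region construction around Theorem 2 and Eqs. (10)-(13) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the usual two-patch monopole potentials (azimuthal components $g(1-\cos\theta)/(r\sin\theta)$ in the upper region and $-g(1+\cos\theta)/(r\sin\theta)$ in the lower one), and the function names are hypothetical, not taken from the paper. It evaluates the loop integral around a parallel in each region and tests whether the transition factor $S = \exp(iD\phi)$ returns to itself after one turn in $\phi$.

```python
import cmath
import math


def a_phi_north(g, r, theta):
    """Azimuthal potential in the upper region R_a (assumed two-patch form)."""
    return g * (1.0 - math.cos(theta)) / (r * math.sin(theta))


def a_phi_south(g, r, theta):
    """Azimuthal potential in the lower region R_b (assumed two-patch form)."""
    return -g * (1.0 + math.cos(theta)) / (r * math.sin(theta))


def loop_integral(a_phi, g, r, theta, steps=2000):
    """Sum A . dl around the parallel at fixed r and theta, phi from 0 to 2*pi."""
    dphi = 2.0 * math.pi / steps
    # The line element along the parallel is r * sin(theta) * dphi.
    return sum(a_phi(g, r, theta) * r * math.sin(theta) * dphi
               for _ in range(steps))


def transition_is_single_valued(d, tol=1e-9):
    """True if S = exp(i * d * phi) takes the same value at phi = 0 and phi = 2*pi."""
    return abs(cmath.exp(1j * d * 2.0 * math.pi) - 1.0) < tol
```

On the equator the upper-region integral reproduces $2\pi g(1-\cos\theta)$ with $\theta = \pi/2$, and the difference between the two regions is $4\pi g$, the total flux. The single-valuedness test passes only for integer `d`, which is the content of condition (13).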
Gauge fields
Robert Mills
Physics Department, The Ohio State University, Columbus, Ohio 43210
(Received 15 January 1987; accepted for publication 6 February 1989)
This article is a survey of the history and ideas of gauge theory. Described here are the gradual emergence of symmetry as a driving force in the shaping of physical theory; the elevation of Noether's theorem, relating symmetries to conservation laws, to a fundamental principle of nature; and the force of the idea (the "gauge principle") that the symmetries of nature, like the interactions themselves, should be local in character. The fundamental role of gauge fields in mediating the interactions of physics springs from Noether's theorem and the gauge principle in a remarkably clean and elegant way, leaving, however, some tantalizing loose ends that might prove to be the clue to a future deeper level of understanding. The example of the electromagnetic field as the prototype gauge theory is discussed in some detail and serves as the basis for examining the similarities and differences that emerge in generalizing to non-Abelian gauge theories. The article concludes with a brief examination of the dream of total unification: all the forces of nature in a single unified gauge theory, with the differences among the forces due to the specific way in which the fundamental symmetries are broken in the local environment.
there is only a very limited class of theories that can meet it and extremely little arbitrariness in the forms of interaction that are allowed.

It is the purpose of this article to explore this idea of gauge invariance: to tell something of the history of the idea, to give a survey of some of the physics needed to understand the principle, and to describe the logic of the gauge invariance principle and how it gives rise to very particular forms of physical theories. Three key concepts run through the discussion, distinct but tightly interrelated: symmetry, conservation laws, and gauge fields. We shall look first at how these interrelationships appear in the case of classical electromagnetic theory, which is the simplest example of a gauge field theory, and then examine how this provides the stimulus and basis for the generalization to what are now called non-Abelian gauge theories. I shall then give a sketch of the developments in more recent years: the resolution of a variety of formidable difficulties and the formulation of the Standard Model, the current widely accepted picture of the elementary particles and the gauge fields by which they interact.

While our present understanding of physics involves a unification of type in the sense that all the force fields of nature are seen to be of the same character, namely, gauge fields, I shall discuss, finally, the hope that most physicists share of a more complete unification: the hope of showing that all the forces are associated with a single gauge theory, with a single multicomponent gauge field whose different components are related to each other in some completely symmetrical way. This dream has already been partially realized in the Standard Model through the unification of the electromagnetic and weak interactions, but there still are major obstacles in the way of its final realization.

II. THE BEGINNINGS OF THE GAUGE IDEA

The key ideas leading up to the introduction of generalized gauge fields came from Noether, Weyl, and London. The underlying trend, of which gauge symmetry is a particular manifestation, is the growing realization in this century of the importance of symmetry to our basic understanding of the universe, to the point where it is now felt that it is the underlying symmetry of physical laws that drives the system, that determines the structure of the laws and the number and character of the elementary particles. This is a characteristically 20th-century development. Prior to this, symmetries were seen as accidental and if the theories of physics showed certain symmetrical structures, that was nice, but not of fundamental importance.

A. Noether

Emmy Noether (1882-1935) is regarded in mathematical circles as one of the most important mathematicians of this century, though not in fact for the work that has made her known to physicists. The theorem that bears her name, which has been the keystone in the development of symmetry as a guiding force in physics and for which (and for which alone) she is known in physics, is hardly known to mathematicians, who honor her for her work on commutative rings and algebraic number theory. In each case, though, she is noted for probing the underlying concepts upon which mathematical disciplines are based.

Noether's theorem, proved in 1918 while she was still (as a woman) striving for recognition at the University of Göttingen, relates in particular to variational principles as they apply to physics. The substance of the theorem, for our purposes, is that for every symmetry of nature there is a corresponding conservation law and for every conservation law there is a symmetry. In the Lagrangian formulation of a physical theory, the symmetry in question is a symmetry of the Lagrangian; since the form of the Lagrangian determines the equations of motion of the system being described, this means that it is a symmetry of those equations of motion, that is, of the physical theory itself.

Physics is characterized by a number of deeply significant conservation laws such as energy, linear and angular momentum, and electric charge, together with a number of others that have emerged in more recent years. Noether's theorem shows explicitly how these are related to the very structure (the symmetry, in fact) of physical laws. All theorems are proved on the basis of given hypotheses, and in the case of Noether's theorem, the most important assumption is that the equations of motion of physics are derivable from a variational principle, known as Hamilton's principle. Hamilton's principle states that for the true trajectory of a system (its history as a function of time) the time integral of the Lagrangian is stationary with respect to small changes of that trajectory away from its true shape. There is no obvious reason why the equations of motion should have this character. It is easy to devise a universe whose equations of motion do not satisfy any such variational principle, but in our universe they always do.

Hamilton's principle was first discovered in connection with mechanical systems, where the Lagrangian turns out to be the difference between the kinetic and potential energies, but the principle is easily extended to include velocity-dependent forces of certain types, including the magnetic force on a moving charged particle. (Hamilton's principle cannot be extended to include dissipative forces, but elementary forces are never dissipative.) Finally, it has turned out that systems of a completely nonmechanical nature, such as the electromagnetic field, also can be described in this way: Maxwell's equations, the equations of motion of the electromagnetic field, can also be derived from Hamilton's principle, though with a Lagrangian that seems to have nothing to do with such things as kinetic and potential energy. We have become so used to this state of affairs that when we are trying to devise a new theory we invariably look for the correct Lagrangian, assuming almost without question that the new theory must also obey Hamilton's principle. It always seems to work and, in consequence, we always have Noether's theorem also: the relation between symmetries and conservation laws.

When we make the transition to quantum theory, Noether's theorem is still true, although in most ways of doing quantum theory the proof looks very different. One such proof is outlined in Sec. III. It seems to me quite possible that Noether's theorem is the more fundamental fact: that the physical theories that we devise to describe the universe about us have the structure they do because of this fundamental relationship between symmetries and conservation laws. If this is so, then Noether's theorem becomes a principle rather than a theorem, like the principles of equivalence in special and general relativity; we should say then that classical physical laws take the Lagrangian form and quantum theory takes its characteristic Hamiltonian form as a consequence of Noether's principle.
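Hamilton's principle as stated above can be illustrated with a small numerical experiment. The sketch below is an assumption-laden toy, not anything from the article: it uses a unit-mass, unit-stiffness harmonic oscillator with Lagrangian $L = \tfrac12 m\dot{x}^2 - \tfrac12 kx^2$, the true trajectory $x(t) = \cos t$, and a perturbation $\epsilon\sin(\pi t/T)$ that vanishes at both endpoints. All function names are hypothetical. The first-order change of the discretized action should vanish (up to discretization error), while the second-order change does not.

```python
import math


def action(path, dt, m=1.0, k=1.0):
    """Discretized time integral of L = (1/2) m v^2 - (1/2) k x^2 along a sampled path."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt      # finite-difference velocity on the step
        xm = 0.5 * (x0 + x1)    # midpoint position on the step
        total += (0.5 * m * v * v - 0.5 * k * xm * xm) * dt
    return total


def varied_action(eps, n=2000, t_end=1.0):
    """Action of the true path cos(t) plus eps times a bump vanishing at the endpoints."""
    dt = t_end / n
    times = [i * dt for i in range(n + 1)]
    path = [math.cos(t) + eps * math.sin(math.pi * t / t_end) for t in times]
    return action(path, dt)


def first_variation(eps=1e-3):
    """Central-difference estimate of dS/d(eps) at eps = 0."""
    return (varied_action(eps) - varied_action(-eps)) / (2.0 * eps)


def second_variation(eps=1e-3):
    """Central-difference estimate of d^2 S / d(eps)^2 at eps = 0."""
    return (varied_action(eps) + varied_action(-eps)
            - 2.0 * varied_action(0.0)) / eps ** 2
```

Calling `first_variation()` should give a value close to zero, which is the stationarity that Hamilton's principle asserts for the true trajectory, whereas `second_variation()` is of order one for this choice of perturbation.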
B. Weyl and London

The idea of gauge invariance also had its inception in 1918. Hermann Weyl (1885-1955), a friend of Noether's at Göttingen, had been deeply influenced by Einstein and shared his vision of seeing electromagnetism as a manifestation of some kind of local symmetry, similar to the local symmetry that characterizes the general theory of relativity. In the case of general relativity, the symmetry in question is an invariance of the form of the basic equations under arbitrary curvilinear coordinate transformations, corresponding to the physical requirement that physical laws appear the same to all observers regardless of the state of motion (accelerating, rotating, or whatever) of their reference frames.

The invariance that Weyl hoped to exploit was an invariance with respect to change of scale: the requirement that physical laws be the same if the scale of all length measurements is changed by the same overall factor. Weyl wanted to require a local gauge invariance in which the scale changes are allowed to be different at different points in space and time, analogous to the curvilinear coordinate transformations of general relativity. The associated conservation law, by Noether's theorem, was to be the conservation of electric charge.

Einstein pointed out serious flaws in the idea and it lay dormant until 1927, by which time Schrödinger had introduced his wave equation for quantum theory (in 1926) and complex wavefunctions were seen to play a role in physics. In 1927, Fritz London (1900-1954) pointed out that the symmetry associated with electric charge conservation was not a scale invariance, but a phase invariance, i.e., the invariance of quantum theory under an arbitrary change in the complex phase of the wavefunction (explained in detail in Sec. II). The invariance under a global phase change (multiplication of the wavefunction by a constant phase factor $e^{i\theta}$) was trivial in fact; the nontrivial fact was that the existence of the electromagnetic field allows a much broader kind of invariance, invariance under a local phase change, in which the phase factor varies arbitrarily from one point to another in space-time. That is, $\theta$ becomes an arbitrary function of $x$, $y$, $z$, and $t$, the coordinates of space and time. How this works is explained in Sec. III.

Weyl also played a part in this modification of his idea and continued to use the name "gauge symmetry" to describe it, although it was now a misnomer, since the word "gauge" historically refers to a choice of length scale, rather than to the assignment of complex phases.

C. Yang and Mills

For some 25 years the idea of gauge invariance (almost always thought of in terms of local gauge invariance) was seen as a specific characteristic of electromagnetic theory, useful in various ways (e.g., as a check on the validity of calculation procedures), but not of more fundamental significance. Local gauge invariance was felt to have additional implications within electromagnetic theory, such as zero mass for the photon (although it was hard to make a rigorous proof of this) and the constancy, among the different elementary particles, of the elementary unit of electric charge.

The idea that local gauge invariance might have a more universal significance in physics began to be considered in the early 1950s, particularly by C. N. Yang (Yang Chen Ning) (1922- ), who was then at the Institute for Advanced Study in Princeton, NJ. For some years prior to this, since the time when Yang was a graduate student at the Southwest Associated University in Kunming, China, he had been much impressed with the relationship between charge conservation and gauge invariance and, in particular, by the fact that the whole structure of electromagnetic theory would be uniquely determined by the sole requirement of gauge invariance. After coming to the United States in 1945, as a graduate student at the University of Chicago, Yang began the attempt to generalize the gauge invariance argument to other conservation laws, in particular the conservation of isospin. Many conservation laws of various sorts have appeared since then, but at that time the only conservation law that bore a useful similarity to electric charge was the conservation of isospin (usually referred to then as isotopic spin). Isospin was an imperfect conservation law, violated by electromagnetic and weak interactions, but apparently strictly true for strong interactions. One could easily imagine a world with only the strong interaction, where the conservation of isospin and the associated symmetries would be exactly valid. If the gauge invariance idea could be generalized, the result should be a complete theory of the strong interaction, with isospin as the charge responsible for the interactions and the newly invented gauge field as the glue playing the same role as the electromagnetic field in electrodynamics.

During the academic year 1953-1954, Yang was a visitor to Brookhaven National Laboratory, about 80 km east of New York City on Long Island. Here the Cosmotron, then the biggest particle accelerator in the world (accelerating protons to energies of 2-3 GeV), was just beginning to produce the abundance of new and unfamiliar particles that have transformed the face of physics in the years since. I was at Brookhaven also, on a postdoctoral appointment, and was assigned to the same office as Yang. (I was still belatedly writing my dissertation, the study of a possible contribution to the fourth-order Lamb shift, under the guidance of Norman Kroll at Columbia University in New York.) Yang, who has demonstrated on a number of occasions his generosity to young physicists beginning their careers, told me about his idea of generalizing gauge invariance and we discussed it at some length. Having some background in quantum electrodynamics, I was able to contribute something to the discussions, especially with regard to the quantization procedures, and to a small degree in working out the formalism; however, the key ideas were Yang's. The predicted quanta would have a spin of 1, like the photon, but would also have isospin of 1, like the pi meson: This means that they would form a charge triplet, with positive, negative, and neutral states, just like the pion. The question of renormalizability was far beyond us, as was the question of the mass of the gauge field quantum. These questions were not to be resolved for another 10-15 years, by which time there had been an order-of-magnitude increase in the sophistication of physicists' understanding of quantum field theory.

At about the same time (also in 1954), Ronald Shaw, a student of Abdus Salam at Cambridge University in England, was also thinking deeply about possible generalizations of the idea of gauge invariance, influenced in particular by lecture notes of Schwinger. Shaw's unpublished doctoral dissertation (1954) on The Problem of Particle Types and Other Contributions to the Theory of Elementary
Particles includes a section (Invariance under General Isotopic Spin Transformations) that closely and independently parallels the argument of the 1954 paper of Yang and myself and duplicates the basic equations of non-Abelian gauge theory. There seems to be no question that the time was ripe for this development.

III. THE GAUGE PHILOSOPHY: LOCAL SYMMETRY

The idea at the core of gauge theory, as mentioned earlier, is the local symmetry principle: Every continuous symmetry of nature is a local symmetry. Let me now explain more fully what these words mean. First, what is a symmetry of nature? We say that an object is symmetrical if its appearance is unchanged by some transformation; we can say that its properties are invariant under the transformation. Look at some examples: An equilateral triangle looks the same if it is rotated by 120° or if it is reflected with respect to any of its altitudes, while a sphere has a much richer class of symmetries, being invariant under a rotation of any magnitude about any of its diameters or a reflection in any of its median planes. Another type of symmetry, called a translational symmetry, is illustrated by an infinitely long cylinder, which is invariant under any displacement parallel to its axis. The point, then, is that a symmetry necessarily involves some invariance property.

Now the world as we see it is not particularly symmetrical because the objects in it are irregular in shape and location. The rules of behavior of the physical world have a great deal of symmetry, however; this is what I mean by symmetries of nature. The fact that experiments work the same in China and in the US reflects an invariance of the laws of physics under spatial displacements and rotations, and hence a symmetry. The fact that the same experimental results are obtained at different times reveals the time displacement symmetry of those laws, etc.

The transformations we normally consider are global: a single rotation, for example, on the entire universe. Invariance under such a transformation is referred to as a "global symmetry." The meaning of a "local symmetry," on the other hand, would be that the objects or physical laws in question are invariant under a "local transformation," which is in fact a large number of separate transformations with a different one at every point in space and time. As an

tromagnetic theory and general relativity, both of which, as we shall see, show just this kind of local symmetry.

In general relativity, the relevant local symmetry is an invariance under arbitrary curvilinear coordinate transformations, as if one were making a different space-time rotation at every point. This is a sort of four-dimensional analog of the sliced cylinder described above, with the difference that we are now talking about the invariance, not of an object, but of physical laws. It is well known that insisting on this general invariance leads inevitably to a complete theory of the gravitational force.

In the electromagnetic case, which I talk about in more detail in Sec. IV, the symmetry in question is an invariance under changes in phase of complex fields or wavefunctions,

$$\psi \to \psi e^{i\theta}, \tag{1}$$

and it becomes a local symmetry if $\theta$ is taken to be an arbitrary function of the space-time coordinates $(x, y, z, t)$. This may not seem to have anything to do with electromagnetism but, in fact, as we shall see, this local symmetry can be realized only if we introduce an additional field with all the familiar properties of the electromagnetic field.

The electromagnetic case was in fact the inspiration for the further development of the gauge field idea. The fact that examples of local symmetry were already known to exist in nature strongly suggested that this might be a general principle and that we should examine other observed symmetries of nature to see if the same thing happens in every case. In Sec. IV, I will show how the gauge principle works in detail, first for the electromagnetic case and then for the non-Abelian generalization.

IV. CONSERVED QUANTITIES, SYMMETRIES, AND GAUGE FIELDS

We now need to find out how the assumption of local symmetry can lead to a physical theory and how it can determine the character of that theory. We find that in every case there is a characteristic logical pattern that emerges, represented graphically in Fig. 1, connecting conserved quantities, symmetries of nature, and gauge fields. First, as we have seen, there is Noether's theorem, which states that for every conservation law there is an associated symmetry and vice versa; second, there is the fact, mentioned above, that the requirement of local symmetry leads to a gauge field theory of a particular well-determined
example, consider a long circular cylinder, which is clearly
character; and third, we find that the gauge field theory
invariant under rotations about its axis. Now imagine that
determined in this way necessarily includes interactions
the cylinder is sliced into a large number N , say, of very thin
between the gauge field and the conserved quantity with
disks. The system is now invariant under a transformation
which we started. Thus we have the astonishing fact that
in which every ring is rotated through a different angle and
the resulting symmetry, a local symmetry, is a much richer
symmetry than the original one. The original global sym-
metry is an invariance under a set of transformations de- Noethers
Theorem
scribed by a single parameter, the angle of rotation, while in
the local symmetry case, we have invariance under a much Symmetry
Quont i t y
larger set of transformations described by N different an-
gles of rotation.
We now apply this idea, not to objects, but to the laws of Local
physics, and ask that somehow they manage to have this Symmetry
much richer type ofsymmetry that we saw in the case of the Gauge
thinly sliced cylinder. Why would one think it possible that Field
the laws of physics should have such an extended symme-
try? It would seem extremely unlikely if it were not for the
fact that we already know of two examples in nature: elec- Fig. I . The logical pattern of a gauge theory
516
for every true conservation law there is a complete theory of a gauge field for which the given conserved quantity is the source. The only restriction is that the conservation law be associated with a continuous symmetry (this would exclude, for example, parity, which is associated with reflection symmetry). The resulting theory has just one free parameter, the interaction strength. We shall now discuss in detail the different links in this logical pattern.

A. Noether's theorem

Noether's theorem, connecting conservation laws with symmetry, was originally developed in 1918 as a theorem in the calculus of variations, with an immediate application to classical Lagrangian mechanics. This theorem takes a particularly elegant and general form, however, in relation to Hamiltonian mechanics, whether of the classical or quantum mechanical type. I shall present Noether's theorem in the quantum context, but it will easily be seen by those who know about such things that the entire discussion can be brought back to the classical domain by simply [...]

$$\langle A \rangle = \psi^{*}\hat{A}\psi, \tag{2}$$

which illustrates the relation of the operator $\hat{A}$ to the corresponding physical observable $A$. [Here, the caret indicates an operator and the asterisk indicates conjugation. Note that the left side of Eq. (2) represents a physical quantity, the mean value of a physical variable, while the right side is a mathematical expression.] On the other hand, the same operator can be used to generate a unitary transformation on the state vectors of the system:

$$\psi \to \psi' = e^{-i\lambda\hat{A}}\psi, \tag{3}$$

where $\lambda$ is an arbitrary real parameter. Such a transformation on the state vector may represent a transformation on the physical state itself, so that, for example, if $\hat{A}$ is taken as $\hat{P}_z$, the z component of the momentum operator, then $\psi'$ represents the state that differs from $\psi$ by a displacement $\lambda\hbar$ in the z direction. Every continuous family of transformations is generated in this way and is thus associated uniquely with one of the physical variables of the theory.

Now, one of the well-known features of these linear operators is that they do not in general commute with each other (that is, multiplication of operators is not commutative) and whether or not two operators commute has physical significance. This significance takes different forms depending on which role each of the two operators is playing. Thus if two operators $\hat{A}$ and $\hat{B}$ commute, we can make four different statements about the variables and transformations associated with these operators:

(1) The transformations generated by $\hat{A}$ and $\hat{B}$ commute with each other. Thus rotations about different axes do not commute, while displacements in different directions do commute, that is, you get a different net rotation depending on the sequence in which the rotations are performed. This is directly associated with the fact that the different components of the angular momentum operator do not commute with each other, while the different components of the linear momentum operator do commute with each other.

(2) The dynamical variable $A$ is invariant under the transformations generated by $\hat{B}$. For example, the x component of position of a particle is invariant under displacements in the y direction (the operator $\hat{x}$ commutes with the operator $\hat{P}_y$), but is not invariant under displacements in the x direction (the operator $\hat{x}$ does not commute with the operator $\hat{P}_x$). In mathematical form, if $\hat{A}$ represents the variable $A$ and $\hat{B}$ generates the transformation

$$\psi' = e^{-i\lambda\hat{B}}\psi, \tag{4}$$

then the expectation value of $A$ is unaltered by the transformation:

$$\psi'^{*}\hat{A}\psi' = \psi^{*}\hat{A}\psi. \tag{5}$$

[...] the dynamical variable $A$ is invariant under the transformations generated by $\hat{B}$ if and only if the dynamical variable $B$ is invariant under the transformations generated by $\hat{A}$.

Now, our interest is in the case that $\hat{B}$, say, is taken to be the Hamiltonian operator $\hat{H}$, which is the operator associated with the total energy of the system. The transformations generated by $\hat{H}$ are time displacements, which is why $\hat{H}$ is the operator that appears in the Schrödinger equation,

$$i\hbar\,\frac{d\psi}{dt} = \hat{H}\psi, \tag{6}$$

whose integrated form,

$$\psi(t) = e^{-i\hat{H}t/\hbar}\,\psi(0), \tag{7}$$

corresponds to the general form (3). Thus we see why it is the energy operator that governs the dynamics of the system.

To say that a dynamical variable $A$ is invariant under the transformations generated by $\hat{H}$ is to say that $A$ is a constant of the motion, that is, its expectation value in any state will be invariant under any time displacement. On the other hand, to say that $H$ is invariant under the transformations generated by $\hat{A}$ is to say that $\hat{A}$ represents a symmetry of the dynamical laws because it is $\hat{H}$ that determines those dynamical laws. We now see (and this is our modified version of Noether's theorem) that these two statements are equivalent since each is equivalent to the statement that the operators $\hat{A}$ and $\hat{H}$ commute:

The dynamical variable $A$ is conserved if and only if the dynamical laws are invariant under the transformations generated by $\hat{A}$.
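As a small numerical illustration of the statement that a dynamical variable is conserved exactly when its operator commutes with the Hamiltonian, the following sketch uses a two-level system. Everything specific in it is an assumption made for illustration and does not come from the article: the choice of the Pauli matrix sigma_z as Hamiltonian, the initial state, the sample times, the units with hbar = 1, and all function names.

```python
import cmath

# Illustrative two-level system (an assumption for this sketch, not from the article).
# Operators are 2x2 matrices stored as nested lists; units are chosen so hbar = 1.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def expectation(op, psi):
    """Mean value psi* op psi for a two-component state vector."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * op_psi[i] for i in range(2)).real

def evolve(psi0, energies, t):
    """Apply exp(-iHt) for a diagonal H with the given energies."""
    return [cmath.exp(-1j * e * t) * c for e, c in zip(energies, psi0)]

SIGMA_Z = [[1, 0], [0, -1]]
SIGMA_X = [[0, 1], [1, 0]]
ENERGIES = (1.0, -1.0)                       # H = sigma_z (diagonal)
H = [[ENERGIES[0], 0], [0, ENERGIES[1]]]
PSI0 = [complex(0.6), complex(0.8)]          # normalized initial state

def drift(op, times=(0.0, 0.3, 0.7, 1.1)):
    """Spread of the expectation value of op over the sample times."""
    values = [expectation(op, evolve(PSI0, ENERGIES, t)) for t in times]
    return max(values) - min(values)

if __name__ == "__main__":
    for name, op in (("sigma_z", SIGMA_Z), ("sigma_x", SIGMA_X)):
        print(name, "commutator with H:", commutator(op, H),
              "drift of mean value:", drift(op))
```

In this toy case sigma_z commutes with H and its mean value stays fixed in time, while sigma_x does not commute with H and its mean value oscillates. The example only checks the equivalence for one hand-picked system; it is not a proof.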
B. Local symmetry and gauge fields: The electromagnetic case

We now come to the next link in the diagram shown in Fig. 1, where we relate the assumption of local symmetry to the existence of gauge fields. Suppose we consider some particular conservation law, such as that for the electric charge $Q$, and identify the associated symmetry transformations generated by the Hermitian operator $\hat{Q}$. The action of this transformation on a complex field $\psi(x)$ associated with particles of charge $ne$ is to change its phase by an amount proportional to $n$:

$$\psi(x) \to \psi'(x) = e^{-ine\theta}\psi(x). \tag{8}$$

In what follows it is convenient to work with infinitesimal transformations for which Eq. (8) reduces, by Taylor expansion of the exponential, to

$$\psi'(x) = (1 - ine\theta)\psi(x). \tag{9}$$

Here, $x$ represents the space-time coordinates and $\theta$ is the arbitrary infinitesimal parameter of the transformation. You can think of $\psi(x)$ as essentially equivalent to the Schrödinger or Dirac wavefunction. The invariance comes from the fact that any quantity that is physically observable involves the factors $\psi^{*}$ and $\psi$, so that the phase factors cancel. If several different fields are involved, as in a production or decay process, then the conservation of charge in the process is exactly related to the cancellation of these phase factors. The cancellation works the same for derivatives of $\psi$ as long as $\theta$ is taken as a constant with respect to $x$.

Now, however, I want to introduce the idea of local symmetry, as discussed in Sec. III, that is, I want to make a different transformation at each different point in space-time, which corresponds exactly to allowing $\theta$ to be an arbitrary function $\theta(x)$:

$$\psi'(x) = e^{-ine\theta(x)}\psi(x) \tag{10}$$

$$\phantom{\psi'(x)} = [1 - ine\theta(x)]\psi(x) \tag{11}$$

or

$$\delta\psi(x) = \psi'(x) - \psi(x) \tag{12}$$

$$\phantom{\delta\psi(x)} = -ine\theta(x)\psi(x). \tag{13}$$

The invariance of expressions such as $\psi^{*}\psi$ is easy to see, but the invariance is lost in expressions that involve derivatives of $\psi$ since derivatives of $\theta$ appear as well. Thus (letting $\partial_\mu$ represent the partial derivative with respect to $x^\mu$, with $x^0 = ct$, $x^1 = x$, $x^2 = y$, $x^3 = z$) we find that $\partial_\mu\psi(x)$ transforms as

$$\delta[\partial_\mu\psi(x)] = -ine\theta(x)\,\partial_\mu\psi(x) - ine[\partial_\mu\theta(x)]\psi(x), \tag{14}$$

with the first term of standard form [Eq. (13)] and an awkward second term involving the derivative of $\theta(x)$. In order to achieve local symmetry, we have to get rid of the second term in some way. To do this, we must replace the conventional derivative $\partial_\mu\psi$ by a "covariant" derivative $D_\mu\psi$ whose transformation law is required to be of the standard form,

$$\delta[D_\mu\psi(x)] = -ine\theta(x)\,D_\mu\psi(x). \tag{15}$$

This will make expressions such as $\psi^{*}D_\mu\psi$, for instance, invariant under the transformation. How can we accomplish this? We simply introduce the electromagnetic potentials (as you may have anticipated if you are familiar with the standard form of the Schrödinger equation in the presence of the electromagnetic field), that is, we combine the conventional derivative $\partial_\mu$ with a vector field $A_\mu$,

$$D_\mu = \partial_\mu + ineA_\mu, \tag{16}$$

with the transformation law for $A_\mu$ chosen in such a way that Eq. (15) is satisfied. If you substitute expression (16) into Eq. (15), and use Eq. (14) for the variation of $\partial_\mu\psi(x)$, then the terms that are left, after removing the terms that cancel, give

$$-ine(\partial_\mu\theta)\psi + ine\,\delta A_\mu\,\psi = 0, \tag{17}$$

which is satisfied if $A_\mu$ satisfies the transformation law

$$\delta A_\mu(x) = \partial_\mu\theta(x), \tag{18}$$

the familiar gauge transformation for electromagnetic potentials.

These expressions can be rewritten in a more familiar notation: $A_0 = \phi$, the scalar potential, and $-A_i$ ($i = 1, 2, 3$) are the components of the vector potential $\mathbf{A}$, so that Eq. (18) is equivalent to

$$\delta\phi = \frac{1}{c}\frac{\partial\theta}{\partial t}, \tag{19}$$

$$\delta\mathbf{A} = -\nabla\theta. \tag{20}$$

What have we proved? We started with a conservation law, the conservation of electric charge with its associated symmetry, and then (as if we had never heard of the electromagnetic field) we showed that the requirement that the symmetry be local forces us to introduce a gauge field, which turns out to be nothing but the familiar electromagnetic field. Before we turn to the more general kinds of gauge fields, let us continue with the example of the electromagnetic field and see how the gauge invariance condition fixes the form of the theory.

The standard procedure for setting up a physical theory is to start with a classical Lagrangian formulation and then go through a well-defined "quantization" procedure to generate the appropriate quantum mechanical theory. The classical equations of motion are generated from the Lagrangian by a variational principle (Hamilton's principle), which is discussed in Sec. II and explained in standard mechanics texts. Then the quantization procedure (a prescription for constructing the canonical momentum operators, the Hamiltonian operator, and the quantum mechanical equations of motion) guarantees that the resulting theory satisfies the important Correspondence Principle, which is the requirement that the theory be entirely consistent with the classical theory in the macroscopic domain, where the classical theory is known to be valid. We need only discuss the Lagrangian since the form of the theory is determined once that is known.

In the case of a field theory, the important quantity is the Lagrangian density $L$ (the Lagrangian itself is the space integral of $L$), which depends on the various fields and their partial derivatives and which must be a relativistic invariant in order that the resulting theory be Lorentz invariant. Indeed, the Lagrangian density must display all the symmetries that are required of the theory that it is to generate.

Thus we see that if we want to construct a dynamical theory of our new gauge field, we need to construct an appropriate Lagrangian density $L$ which is both Lorentz
and gauge invariant. If we add some fairly standard requirements about the physical reasonableness of the field equations, we find that $L$ is uniquely determined apart from trivial factors. If our gauge field interacts with just a single charged particle field $\psi$, then we expect to get two field equations, one for $\psi$ and one for the gauge field $A_\mu$. Now suppose that we already know the Lagrangian density for $\psi$ in the absence of the gauge field. It certainly involves derivatives of $\psi$, and to achieve gauge invariance we have found [Eq. (16)] that the ordinary partial derivative must be replaced by the gauge covariant derivative $D_\mu$. Thus $L$ now automatically includes an interaction term involving $A_\mu$ and $\psi$ that shows exactly how the charged particles behave in the presence of a given gauge field $A_\mu$. If $\psi$ is a Dirac field (for a relativistic spin-1/2 fermion), for example, then the appropriate terms in $L$ take the form

$$L_\psi = \bar{\psi}(i\gamma^\mu D_\mu - m)\psi, \tag{21}$$

where $\bar{\psi}$ is related to $\psi^{*}$, the conjugate of $\psi$. (The speed of light $c$ has been taken as equal to 1 and the repeated index $\mu$ is summed over by a standard summation convention.) In the classical limit, it can easily be shown that the interaction term in (21),

$$L_{\text{int}} = -ne\,\bar{\psi}\gamma^\mu\psi\,A_\mu, \tag{22}$$

corresponds exactly to the Lorentz force (the familiar electrostatic and magnetic forces) on the charged particle.

To get the dynamical equations for the gauge field, we need to add a term to $L$ that is again both Lorentz and gauge invariant and involves derivatives of $A_\mu$, so that the resulting equations for $A_\mu$ will have the character of field equations. We first look for gauge invariant expressions involving $A_\mu$ and find that the only one we can make is the combination

$$f_{\mu\nu} = \partial_\nu A_\mu - \partial_\mu A_\nu. \tag{23}$$

Since the variation of $A_\mu$ under a gauge transformation is simply the space-time gradient of $\theta(x)$ [Eq. (18)], the cross derivatives of $\theta(x)$ that appear in Eq. (23) cancel, leaving $f_{\mu\nu}$ invariant. The gauge invariant combination $f_{\mu\nu}$ is the familiar tensor representation of the electric and magnetic fields $\mathbf{E}$ and $\mathbf{H}$; the elements $f_{0i}$ ($i = 1, 2, 3$) are the components of $\mathbf{E}$ and the elements $f_{jk}$ give the components of $\mathbf{H}$. Next, we need to construct a relativistic invariant (Lorentz scalar) out of $f_{\mu\nu}$ to serve as the Lagrangian density for the field; the only way to do this that produces physically reasonable field equations is to take the scalar product

$$L_A = -\tfrac{1}{4} f_{\mu\nu} f^{\mu\nu}, \tag{24}$$

where $\mu$ and $\nu$ are summed over by the summation convention mentioned above and the superscripts (contravariant indices) indicate an additional factor of $-1$ when either $\mu$ or $\nu$ has the value 1, 2, or 3. The numerical factor merely serves to set the scale of the electromagnetic field vectors.

When the standard procedure is followed for generating the classical field equations from the Lagrangian density (24), with the interaction term (22), the result is simply Maxwell's equations. The interaction term generates a source term for Maxwell's equations, which is found to be just the charge-current density

$$j_\mu = ne\,\bar{\psi}\gamma_\mu\psi. \tag{25}$$

In the covariant notation that we have been using, the field equations, which as I said are just Maxwell's equations, take the form

$$\partial_\nu f^{\mu\nu} = j^\mu. \tag{26}$$

One might ask whether there should be some kind of gauge-covariant derivative, such as $D_\nu$, in Eq. (26); the answer is that there is no difference in this case since the extra term in $D_\mu$ [Eq. (16)] is proportional to the charge of the relevant particle and the charge of the photon is zero.

We have now completed the pattern shown in Fig. 1. The requirement of local symmetry has not only generated a gauge field of uniquely determined structure, but has dictated almost uniquely the form of the interaction: the precise form of the forces on the charged particle and the precise way in which the electric charge-current density serves as the source for the gauge field.

C. Local symmetry and gauge fields: The general case

Now we are ready to look at the more general case, in which we start with an arbitrary conservation law and, by the same logic that we used above, develop the theory of a gauge field whose relationship to the conserved quantity will be the same as that of the electromagnetic field to the electric charge.

The most obvious candidate for the conserved quantity might seem to be energy or momentum, but these in fact turn out to be much more complicated. The associated local symmetry is the invariance under local coordinate transformations and the associated gauge field, as mentioned in Sec. III, is the gravitational field; however, there are a number of sources of confusion that make this a poor case to present here. I discuss this case in more detail in Sec. VI.

The more typical case of a non-Abelian gauge field is obtained when the symmetry involved is an internal symmetry, i.e., one that is not associated with any kind of coordinate transformation. The phase invariance associated with charge conservation is one example of an internal symmetry; there can be others of this type associated with other conserved particle numbers such as baryon or lepton number. These examples are exactly equivalent in form to the electromagnetic case, so that the result of the logic would be a gauge field identical in structure to the electromagnetic field, but coupled to the appropriate conserved particle number. This is not what happens.

The other kind of internal symmetry is associated with families of identical particles, such as isospin multiplets (if one can treat them as identical) or multiplets of colored quarks. The conserved quantities in this case are associated with the quantum numbers that label the members of the multiplets, together with certain operators that induce transitions from one member of a multiplet to another. There is thus a family of operators that on one hand correspond to the conserved dynamical variables, e.g., isospin or color and, on the other hand, correspond to a group of transformations, the symmetry group of the multiplets. The fact that these operators do not in general commute with each other is precisely what makes this case so very different from the electromagnetic case, as we shall see in what follows. In each case it is the mathematical structure of the symmetry group that determines the structure of the gauge field and the form of the interaction. The symmetry groups have names according to their structure, so that one
speaks of U(1), the simple phase symmetry group of electromagnetic theory, or SU(2), the isospin symmetry group. [SU(2) is also the symmetry group of spatial rotations in quantum theory, which is why the language of isospin is so similar to the language of angular momentum, even though they have nothing to do with each other physically.]

I want to describe in qualitative terms what happens when we apply the gauge philosophy to one of these noncommuting (non-Abelian) symmetry groups. The historical example was the isospin case and indeed this is the simplest case to consider, though it should be emphasized that the same mathematical forms appear for any non-Abelian symmetry group, with only minor changes. I shall follow the example of the electromagnetic field step by step and try to clarify the differences.

First, the conserved electric charge operator $\hat{Q}$ is replaced by the family of operators $\hat{T}_\alpha$, which we take as the three components of isospin. The z component $\hat{T}_3$ (which has nothing to do with the z direction in real space) labels, by its eigenvalues, the members of an isospin multiplet, while the x and y components $\hat{T}_1$ and $\hat{T}_2$, which do not commute with $\hat{T}_3$, generate transformations that mix the different values of $T_3$. [In the case of color, the symmetry group is called SU(3) and there are eight operators $\hat{T}_\alpha$, two of which commute with each other so that their eigenvalues can serve to label the members of a color multiplet, a two-dimensional array.]

It is characteristic of all these symmetry groups that the set of operators $\hat{T}_\alpha$ is closed under the commutation relation, that is, the commutator is always just a linear combination of the same operators $\hat{T}_\alpha$. Thus we write

$$[\hat{T}_\alpha, \hat{T}_\beta] = i c_{\alpha\beta\gamma}\hat{T}_\gamma, \tag{27}$$

where the coefficients $c_{\alpha\beta\gamma}$, called the structure constants, are a set of numerical constants that completely characterize the local structure of the symmetry group; again, $\gamma$ is summed over. In the case of isospin, the coefficients take the values 0, ±1 in such a way as to give the usual angular-momentum-type commutation relations

$$[\hat{T}_1, \hat{T}_2] = i\hat{T}_3, \text{ etc.} \tag{28}$$

The complex field $\psi$ now has several components, depending on the size of the multiplet we want to describe. For the isospin case we can consider as an example just the nucleon doublet, consisting of the proton and neutron, so that [...]

There are now three arbitrary infinitesimal parameters $\theta_\alpha$ describing the transformation and the symbols $T_\alpha$ here represent 2 × 2 matrices which govern the mixing of the components $\psi_p$ and $\psi_n$ in the transformation. These matrices (proportional to the Pauli spin matrices) have the same commutation relations (27) as the operators $\hat{T}_\alpha$ and provide what is called a representation of the symmetry group. $\boldsymbol{\theta}$ and $\mathbf{T}$ are the three-component vectors whose components are the parameters $\theta_\alpha$ and the matrices $T_\alpha$.

Now, in exact parallel to the electromagnetic case, we try to impose the condition of local symmetry, which means that we allow $\boldsymbol{\theta}$ to be an arbitrary vector-valued function of $x$, so that

$$\delta\psi(x) = -ig\,\boldsymbol{\theta}(x)\cdot\mathbf{T}\,\psi(x), \tag{33}$$

in close analogy to Eq. (13). Again, we find that while terms not involving derivatives cause no problems, the variation of the derivative of $\psi$ has an additional term involving the derivative of $\boldsymbol{\theta}$, as in Eq. (14). To get rid of this term we again need to replace the conventional derivative $\partial_\mu$ with a covariant derivative $D_\mu$, defined in terms of a new field whose transformation rule can have a term proportional to $\partial_\mu\boldsymbol{\theta}(x)$. The correct form for $D_\mu\psi$ is given by

$$D_\mu\psi = (\partial_\mu + ig\,\mathbf{B}_\mu\cdot\mathbf{T})\psi, \tag{34}$$

with a new field $\mathbf{B}_\mu(x)$ that is a vector with respect to Lorentz transformations and also a vector with respect to isospin rotations; it therefore has 12 components in all. When the condition of covariance is imposed on $D_\mu\psi$, namely,

$$\delta(D_\mu\psi) = -ig\,\boldsymbol{\theta}\cdot\mathbf{T}\,(D_\mu\psi) \tag{35}$$

[like $\psi$ itself, Eq. (33)], then we find that the transformation rule for $\mathbf{B}_\mu$ is completely determined. Because of the fact that the matrices $T_\alpha$ do not commute, we find that the transformation rule is more complicated than that for the electromagnetic potentials $A_\mu$: [...]

The second term on the right side of (36), which is absent in the electromagnetic case [Eq. (18)], is extremely important and is associated in one way or another with all the interesting and novel features of non-Abelian gauge fields (i.e., fields associated with non-Abelian symmetry groups). This term reflects the isovector character of our new field (it is the normal expression for the change in a vector under rotation) and the fact that the quanta of this
[...] field, in particular the linear superposition of wave solutions, would be lost.

Photons are not charged, of course, but the B quanta are, in the sense that they carry isospin and isospin is the source (the analog of charge) for the B field. The description of the last paragraph may not apply to the electromagnetic case, but it applies almost exactly to the B field. Except for this self-interaction, the field equations for each isospin component of the B field are identical to those for the electromagnetic field.

Let us now complete the description of the B field, still following the logical sequence of the electromagnetic case. Recall first that the components $\mathbf{B}_\mu(x)$ are analogous to the electromagnetic potentials $A_\mu(x)$, so that we need to construct the electric and magnetic field analogs. These analogs take the covariant form $\mathbf{f}_{\mu\nu}$, which now has the character of an isovector, like $\mathbf{B}_\mu$, and is given [cp. Eq. (23)] by

$$\mathbf{f}_{\mu\nu} = \partial_\nu\mathbf{B}_\mu - \partial_\mu\mathbf{B}_\nu - g\,\mathbf{B}_\mu\times\mathbf{B}_\nu. \tag{37}$$

The electric field, whose spatial components $\mathbf{E}_i$ are given by $\mathbf{f}_{0i}$, has the character of an isovector also, so that it has nine components in all instead of the familiar three components. The spatial components $\mathbf{H}_i$ of the magnetic field, given by the elements $\mathbf{f}_{jk}$, have the same character.

The final steps, as with the electromagnetic field, are to construct the Lagrangian density and field equations for the B field. Because it must be Lorentz and gauge invariant, the Lagrangian density can only be

$$L_B = -\tfrac{1}{4}\,\mathbf{f}_{\mu\nu}\cdot\mathbf{f}^{\mu\nu} \tag{38}$$

[cp. Eq. (24)], while the field equations derived from (38) have the form

$$D_\nu\mathbf{f}^{\mu\nu} = \mathbf{j}^\mu, \tag{39}$$

like Eq. (26). Here $\mathbf{j}^\mu$ is the isospin current density associated with the other particles present (in our example, the neutron-proton field),

$$\mathbf{j}^\mu = g\,\bar{\psi}\gamma^\mu\mathbf{T}\psi. \tag{40}$$

The covariant derivative $D_\mu$ of the field $\mathbf{f}$ is no longer equivalent to the regular derivative $\partial_\mu$ since $\mathbf{f}$ is an isovector. The rule is

$$D_\mu\mathbf{f} = \partial_\mu\mathbf{f} + g\,\mathbf{B}_\mu\times\mathbf{f}, \tag{41}$$

in close analogy with Eqs. (36) and (37).

Now, the form of Eq. (39) does not tell us that $\mathbf{j}^\mu$ is conserved, and in view of our discussion we should not expect it to be since the B field carries isospin as well. However, if the second term on the right side of (41) is taken to the right side of Eq. (39), then Eq. (39) takes the form

$$\partial_\nu\mathbf{f}^{\mu\nu} = \mathbf{j}^\mu - g\,\mathbf{B}_\nu\times\mathbf{f}^{\mu\nu}, \tag{42}$$

and since the four-dimensional divergence of the left side of (39) is now automatically zero, we see that it is the right side of (42) that represents the full conserved isospin current, including the contribution of the B field.

It is important to mention that the structures we have developed are seen by mathematicians as having a deeply geometrical character, which is realized in the mathematical theory of fiber bundles. The space in which this geometrical character manifests itself is a kind of product space (the associated fiber bundle, to be more precise) in which a local vector space, with elements $\psi(x)$, is attached to each point $x$ of space-time. The term fiber bundle refers to the analogy with a collection of thin threads or fibers bound together into a much thicker bundle. Each thread corresponds to one of the local vector spaces at a point $x$; the collection of these into a product space is the bundle. (There is also a principal fiber bundle consisting of the collection of parameter spaces, each attached to a point $x$, associated with the local transformations discussed above.)

The process of moving from one point $x$ to a neighboring point requires us to define a connection between the two associated local vector spaces, giving rise to the covariant derivatives introduced in Eqs. (16) and (34), which bear a very close relationship to the covariant derivatives of general relativity theory. The failure of the covariant derivatives to commute is an indication of a kind of curvature in this extended space, with the quantities $\mathbf{f}_{\mu\nu}$ giving a precise characterization of that curvature, again in close analogy to the curvature tensor of general relativity. It seems very clear that this geometrical structure represents a wonderful and unexpected vindication of Einstein's vision of a unified geometrical picture of all the forces of nature.

This completes our outline of the non-Abelian gauge field logic, using the example of isospin as the conserved quantity that serves as the source for the gauge field B. At the time the theory was invented, isospin was the only reasonable candidate for this role and the hope was that the resulting gauge field might serve as the carrier of the strong interaction. There was a general feeling at the time that the idea, while beautiful, was beset by too many difficulties to be acceptable. A number of years had to pass and a number of sophisticated developments had to occur before it was discovered how the difficulties could be resolved and thus the non-Abelian gauge fields seen as an acceptable description of the fundamental interactions of nature.

V. RESOLVING DIFFICULTIES: THE GAUGE THEORY OF THE ELECTROWEAK INTERACTION

As mentioned briefly in Sec. II, the mass of the gauge quantum and the renormalizability of the theory were seen from the beginning as major problems. The resolution of these problems occurred over the course of time as part of the long and intricate process by which the modern gauge theories of the electroweak and strong interactions were developed. A proper account of these developments is beyond the scope of this article, but I shall give a brief description of how these particular difficulties were finally resolved.

The problem with mass is essentially that the forces to be described are short-range forces and thus require massive interaction quanta (as explained below), while the introduction of such a mass into a gauge theory destroys the gauge invariance, as well as making it virtually impossible that the theory be renormalizable.

The question of renormalizability has to do with the handling of certain infinite expressions that seem to arise inevitably when doing any kind of relativistic quantum field theory. For certain kinds of theory, including quantum electrodynamics, there is a logical procedure (discussed below) for dealing with these infinities that permits us to
make realistic finite calculations for real physical effects. The procedures are not mathematically rigorous, but they work; in the case of quantum electrodynamics (QED) they work with astonishing precision. Theories for which these procedures work are called renormalizable; nonrenormalizable theories are useless in practice and apparently unphysical.

A. Mass of gauge quanta

Our experience with gauge fields led us to suppose that the quanta would have to have zero mass in order for the theory to satisfy gauge invariance. Now, the range of an interaction is inversely proportional to the mass of the mediating quanta and zero-mass quanta imply long-range interactions, like the electromagnetic and gravitational forces. The strong and weak interactions, however, are of extremely short range and fall off exponentially at distances of the order of the nuclear size. (This is why the nuclei of neighboring atoms, for example, interact with each other only by electromagnetic forces, even though the nuclear force is much stronger.) Thus we seem to run into a contradiction and we ask the following question: Is it possible for the gauge quanta to acquire a mass in some way without violating gauge invariance?
In what we now believe is a reliable picture of fundamental particles and interactions at our present level of understanding, there are three separate gauge theories: the Glashow-Weinberg-Salam theory for electromagnetic and weak interactions, the color gauge theory for strong interactions, and general relativity for the gravitational interaction. The first two of these theories, together with the spectrum of elementary particles associated with them, make up what is now referred to as the Standard Model. In each of these three gauge theories the question of the mass of the quanta takes a different form. In gravitation theory, the mass is simply zero and the forces are long range. In quantum chromodynamics (QCD), the theory of the color gauge field that mediates the strong interaction, the mass is zero, but the confinement of color charge (discussed below) prevents its manifestation as a long-range force. Finally, in the electroweak theory, which unifies the electromagnetic and weak interactions, all the gauge quanta except the photon acquire a mass through a mechanism known as spontaneous symmetry breaking, which does not in fact spoil either the gauge invariance of the theory or its renormalizability.
The principle of spontaneous symmetry breaking is that the actual symmetry of a system may be less than the symmetry of the underlying physical laws. This is obvious in the world around us, which is full of objects that are neither translation nor rotation invariant, even though the physical laws of which they are a manifestation are both translation and rotation invariant. The new discovery is that even the vacuum state, the supposedly blank canvas on which the universe is painted, can fail to exhibit the full symmetry of the laws of physics. If the vacuum is unique, that is, if there is only one state of lowest energy, then it must indeed have the full symmetry of the laws, but if it is not unique, i.e., if the vacuum is a degenerate state, then this is no longer the case and for each unsymmetrical vacuum state there will be others of the same minimal energy related to the first by the various symmetry transformations under which the physical laws are invariant. A famous example is the ferromagnet, which, when cooled to absolute zero and thus brought to its state of lowest energy, must come to a magnetized state with some definite orientation even though any orientation is allowed. This ground state of the ferromagnet is the analog of the vacuum state for our universe. The symmetry of the underlying theory, then, is spontaneously broken in the sense that the universe we see, existing within the framework of such an unsymmetrical vacuum state, will exhibit this broken symmetry even in the way that the physical laws appear to operate. Now, the broken symmetry that we want to look at is the gauge symmetry of the electromagnetic and weak interactions, viewed together as associated with a single four-component gauge field. There is no evidence that the vacuum state for such a theory is degenerate, so the authors of the theory forced the vacuum to be degenerate by means of a device due to Higgs, which consists of the artificial introduction of an additional field, called the Higgs field, with properties chosen so as to make the vacuum state degenerate. In the example worked out in Sec. IV, where the symmetry was the isospin symmetry SU(2), the Higgs field might be taken to be an isospin vector field, with a quartic form for the field energy chosen in such a way that the field configuration of minimum energy is a uniform nonzero field with a definite but arbitrary orientation in the isospin vector space. Like the magnetization of the ferromagnet, this oriented Higgs field gives the observed universe a preferred orientation, now in isospin space, and provides an elegant model for the observed broken symmetry.
In the standard model, however, isospin symmetry is treated as accidental and is not associated with a gauge field [although efforts to extend the standard model to a larger symmetry group that will unify the strong and electroweak interactions (Grand Unified Theories, or GUTs) invariably incorporate the symmetry of flavor, of which isospin is one facet]. In the gauge theory of electroweak interactions, the symmetry group is taken to be $SU(2) \times U(1)$, where U(1) is the phase symmetry discussed in Sec. IV in connection with the electromagnetic field [Eq. (8)] and SU(2) is the same symmetry group used to describe isospin symmetry, but now applied to a somewhat different symmetry referred to as weak isospin. Weak isospin symmetry, unlike ordinary isospin, relates the weakly interacting particles, so that the electron and its neutrino form a doublet (with extra complications associated with left- and right-handed spin asymmetries that I will not go into here), as well as the muon and tau with their respective neutrinos. The isospin multiplets of hadrons (the strongly interacting particles) are modified so that, for example, the proton is paired, not with the neutron alone, but with a quantum mixture (linear superposition) of the neutron, the $\Lambda$, and the $\Sigma$.
This theory predicts the existence of four gauge quanta: a neutral photonlike object, sometimes called the $X$, associated with the U(1) symmetry and a weak-isospin triplet associated with the SU(2) symmetry, whose members are referred to as $W^0$, $W^{\pm}$. The Higgs symmetry-breaking mechanism now has several consequences: The $W^{\pm}$ particles acquire a mass and the $X$ and the $W^0$ are mixed, so that the neutral particles you see in nature are two different linear combinations of these two particles. One of these, christened the $Z$, has a mass and the other, the familiar photon, is massless. The masses of the $W^{\pm}$ and the $Z$ are governed by the structure of the uniform Higgs field background and do not affect the basic gauge invariance of the theory. The interaction strength associated with all four particles is essentially the same and the observed weakness of the weak interaction, mediated by the $W^{\pm}$ and the $Z$, is understood as a consequence of their masses, which are taken to be large enough (90-100 proton masses) to agree with what we see.

B. Renormalizability

Renormalizability was the other problem, reflecting one of the deep questions in quantum field theory, a reminder of the fact that the theory has no rigorous mathematical foundation and indeed has some serious flaws if viewed from a strictly mathematical viewpoint. Let us review just enough of the history of this problem to see how it relates to the new non-Abelian gauge theories.
The big problem is what are called divergences, or infinities. Since the early days of electromagnetic theory and on into the era of QED, certain calculations of what should be physically reasonable effects have given infinite answers. The first is the infinite electrostatic self-energy of the classical electron (or of any charged point particle). The energy density of the electric field is proportional to $|E|^2$ and the field at a distance $r$ from a particle of charge $e$ is proportional to $e/r^2$; thus one ought to be able to find the total energy in the electric field of the particle by integrating.
If the particle has radius $a$, say, the result is an integral proportional to $e^2 \int r^{-2}\,dr$, where the integration is from $a$ to $\infty$, giving $e^2/a$. If the electron is a point particle, then $a = 0$ and the energy is infinite. Is this physically reasonable? One might argue that this is an unobservable energy and that, therefore, its value is physically meaningless. Difficulties arise on account of relativity theory, however, which tells us that energy is equivalent to mass and that such a point particle should therefore have infinite inertia, which would certainly be an observable effect. A finite radius $a$, on the other hand, also produces contradictions with the requirements of relativistic invariance and we are left with a dilemma that is still with us, though in a more abstract form, 80 years later.
Since the advent of QED, the difficulty has shown up in the form of divergent integrals in the calculation of all sorts of physical effects. The historic example was the Lamb shift, referring to the measurement by Lamb and Retherford in 1947 of the extremely small shift of the spectral lines of hydrogen due to quantum effects in the electromagnetic field. The experiment revealed, and the theory in principle predicted, a contribution smaller by a factor $\alpha$ (the fine-structure constant, approximately equal to 1/137) than the smallest previously identified contribution. The actual calculation of this contribution was straightforward to set up using standard principles of quantum mechanical perturbation theory, as it is called, and gave the proper factor $\alpha$, multiplied, however, by an expression involving divergent integrals, that is, with a value equal to infinity. This highly unphysical result shows that there was something seriously wrong with the basic ideas of QED and caused considerable anxiety among physicists at the time.
The problem was solved, after a fashion, when it was realized that the fundamental parameters involved, the mass and electric charge of the electron, needed to be renormalized. These two parameters, $m$ and $e$, appear in the theory from the beginning and represent the mass and charge of the noninteracting electron, usually referred to as the bare mass and the bare charge. What was realized, then, was that the electromagnetic interaction, in addition to shifting atomic energy levels, would also alter the observed mass and charge. To obtain a prediction of the observed values, one needs to do another calculation; the resulting expressions are equal to the bare values plus corrections of order $\alpha$, $\alpha^2$, etc. These corrections are called the renormalized mass and charge. I called this a prediction, but that is not correct because in fact we do not know what the bare values are and we do know what are the observed, or renormalized, values. What we must do, then, is use these expressions to deduce the bare parameters from the observed values. This process is called renormalization and would be necessary and indeed quite straightforward even if there were no divergences in the theory. Predictions of experimental quantities such as the Lamb shift must be reexpressed in terms of the renormalized mass and charge since these are the observed values.
What happens, though, is that in the calculation of the renormalized mass and charge the same sort of divergent integrals appear as were found in the original Lamb shift calculation. What does this mean? First, it does not mean that the renormalized values are infinite (since they are the observed values), but that the bare values differ by some infinite factor from the observed values. The process of renormalization, then, requires us to manipulate these infinite quantities, which is clearly not allowed mathematically, but seems to be necessary to arrive at any kind of answer.
What we do in practice is find some artificial way of making the integrals converge so that the expressions we need to manipulate are finite. There are a number of ways of doing this, closely analogous to keeping a finite radius $a$ for the electron in the calculation of the electron self-energy discussed above (and indeed the self-energy is the mass). The divergent expressions are now finite for finite $a$ and become infinite only when $a$ is set equal to zero. The renormalization procedure can now be carried through and the expressions for the bare charge and mass in terms of the observed values substituted into the expression for the Lamb shift (or any other experimentally observable quantity). The astonishing result is that the troublesome integrals cancel out and disappear completely and when the cutoff parameter ($a$ in our example) is set equal to zero the result is still finite. The theory could now be compared with experiment (remember that the effect predicted is extremely tiny) and the agreement was spectacular. The initial agreement was at the level of a few parts per million and has steadily improved over the years.
Clearly, we have in some sense a correct theory, even though the procedures are mathematically unsound. In the case of QED, the results are to a large degree independent of what cutoff scheme is used; there are a number of different ways of replacing the divergent integrals with finite expressions and the end result, when the divergent parts have canceled, seems to be always the same. A theory for which this procedure works is called renormalizable. It is easy to find otherwise reasonable theories that are not renormalizable since there are a number of things that can go wrong; thus a very important fact about QED is that it is renormalizable and can therefore be used to make highly
precise predictions about the tiny effects such as the Lamb shift that validate its claim to be true.
It is evident, now, that if non-Abelian gauge theories are to be taken seriously, their renormalizability becomes a question of great importance. The nonlinear character of these theories, which is related directly, as discussed in Sec. IV, to their self-interaction, makes this question very difficult indeed, even if the quanta are massless. To make matters worse, the first such theory to show signs of really being true was the Glashow-Weinberg-Salam theory of the electroweak interaction and as we have seen the gauge quanta (apart from the photon) are massive. If the masses are put in directly, in violation of gauge invariance, the theory is found to be nonrenormalizable, but what if the masses arise from spontaneous symmetry breaking? This remained an open question for some time and glimpses of the truth were beginning to be seen by several physicists when, in 1971, Gerard 't Hooft, a young graduate student from the Netherlands, solved the problem completely. In a truly remarkable piece of work, 't Hooft confirmed the renormalizability of massless gauge theories, already shown implicitly by others, and went on to renormalize the massive case provided the masses were generated by spontaneous symmetry breaking, the Higgs mechanism.
This major breakthrough cleared the way for wide acceptance of the Glashow-Weinberg-Salam theory of the electroweak interaction. The authors of the idea had intended to produce an example of a theory with broken symmetry using the Higgs mechanism in an ad hoc way and hardly expected that the result would correspond to reality. In fact, however, the Higgs mechanism not only provided the quanta with masses, but allowed for complete renormalizability and resulted, surprisingly, in a practical theory that has enjoyed a remarkable degree of experimental confirmation.

VI. QCD AND THE STANDARD MODEL

We turn now to a brief description of the second of the two gauge theories that make up the Standard Model, namely, quantum chromodynamics (QCD), or color gauge theory, the gauge theory of the strong interactions. Again, there is a long and complicated history leading up to the point where such a theory could even be proposed since the major ingredients of the theory are impossible to observe directly. The color gauge field is unobservable; the conserved color charge is unobservable; and the quarks, which are the particles that carry the color charge, are unobservable. Actually, the last comment is not quite fair because although free quarks have never been seen and are probably impossible to produce, one can nevertheless say that the quarks inside the neutron or proton have been observed indirectly through the deep inelastic scattering of high-energy probes, in a manner very similar to the observation of the atomic nucleus by Rutherford and his co-workers in 1909, through the scattering of alpha particles.
The first stages leading to the development of color gauge theory involved the realization that the hadrons (the proton, neutron, and other strongly interacting particles, both mesons and baryons) could not be elementary particles. For one thing, there was no logical reason for identifying some of the hadrons as more fundamental than the rest, while schemes for treating all of them as equally fundamental proved unfruitful. For another thing, the hadrons had more structure than seemed appropriate for elementary particles, even taking into account the big effect that virtual quantum processes would have as a result of their large interaction strength. Finally, the observed symmetry patterns among the hadrons, described by quantum numbers known collectively as flavor (which is like a multidimensional extension of isospin) strongly suggested a composite structure, since it would take only a small number of subnuclear particles, with the correct fundamental symmetry characteristics, to permit the construction of the large variety of observed particles.
These subnuclear particles, christened quarks by Murray Gell-Mann, who was responsible for the present form of this hypothesis, were at first three in number. Two of the quarks, called the u and d (for up and down), constituted an isospin doublet and generated all the observed isospin multiplets, while the third, the s (for strange) quark, was responsible for the additional dimension of flavor known as strangeness.
Each of the known baryons (strongly interacting fermions) could then be modeled as a bound state of three quarks and each of the mesons (strongly interacting bosons) could be modeled as a bound state of a quark and an antiquark. It was also found to be essential (after some distress with the idea and a number of unsuccessful efforts to avoid it) to give the quarks fractional electric charge, so that the u quark has charge +2/3 and the d and s quarks each have charge -1/3, in units of e, the proton charge.
In more recent years, two additional flavors of quark have been identified, the c (for charm) and b (for bottom, or sometimes beauty), while a sixth flavor, to be known as t (for top, or truth) is widely felt to be needed to complete the picture. This would then match the six varieties of lepton in some deeper flavor symmetry that may some day unify the strong and electroweak interactions. In the standard model, though, flavor symmetry is not included as a true symmetry and is not associated with a gauge field. The many efforts to model the strong interactions in such a way were unsuccessful and the next level of truth was found to involve a deeper degree of complexity, as we shall now see.
The next chapter in this story came with the realization that the different flavors did not provide enough quarks to explain what we observed. In the first place, there was a difficulty with the Pauli exclusion principle. In order to explain several of the observed baryons, one had to suppose that three quarks of the same variety were bound in what was essentially the same orbital state. The quarks, though, had to be spin-1/2 fermions, just like electrons, which meant that there could be no more than two in the same orbital state, corresponding to the two possible spin states. (The idea that quarks might be neither bosons nor fermions, but obey some more complicated kind of parastatistics, was also explored, particularly by O. W. Greenberg in 1964, but has not proved fruitful.) The other difficulty was in finding a force that would bind the quarks together so strongly that they could never (or only rarely) escape and yet would not show up as a comparably strong force among the observed hadrons, seen as bound configurations of quarks. In 1965, Han and Nambu pointed the way to a solution of these difficulties by suggesting (in connection with a model of quarks with integer rather than fractional charge) that there might be additional quantum numbers needed to describe quarks, so that each flavor of quark could come in three varieties, later to be called colors, and linked by a new symmetry, with the structure known as SU(3). This dealt nicely with the exclusion principle and the color charge, as one might call it, could provide the basis for a binding force for the quarks.
It was not until 1972, though, when gauge theories had become more popular, that the idea of binding the quarks with a gauge field was put forward. In that year, Gell-Mann, Fritzsch, and Bardeen introduced the term color to describe these additional degrees of freedom and proposed a gauge theory based on the color SU(3) symmetry group. This also followed the observation by Adler that one can calculate the rate of decay of the pi meson into two photons and that the result depends strongly on the number of types of constituent particle. Adler found that the decay rate is off by a factor of about 3 if one used only the flavored quarks, but gives very reasonable agreement if one uses the color triplets.
One of the strong motivations for the new color SU(3) model was the need to explain why quarks were never observed. The hope was that there might be what is called color confinement, which would require that the force between color charges is strong enough at large distances to prevent the charges ever being separated. It is as if the Coulomb potential between electric charges, instead of falling off as $1/r$, were logarithmic, or even linear, at large distances. Then it would take an infinite amount of work to separate two opposite charges to an infinite distance from each other and any unbalanced charges would inevitably find each other and recombine into neutral composites. As it is, in fact, the long-range Coulomb force is such that free electric charges have a strong tendency to recombine and most matter at the macroscopic scale is very close to being electrically neutral.
In the case of quarks, the color charge is seen as a vector in a two-dimensional plane and the three different colors correspond to vectors at angles of 120° from each other, as shown in Fig. 2(a). In order to obtain a color neutral combination, it is necessary to add the color vectors in multiples of three. Thus, to get vectors to add to zero, three vectors, one of each type, can be added, as in Fig. 2(b), or a vector can be added to its negative (corresponding to an antiquark), as in Fig. 2(c). The first possibility corresponds to the baryons, consisting of three quarks and the second possibility corresponds to the mesons, consisting of a quark-antiquark pair. The use of the term color for this conserved vector quantity was motivated by the analogy to the three primary colors, usually taken as red, blue, and green or red, blue, and yellow. In order to achieve white, the neutral color, you must combine all three primary colors, or else combine a color with its complementary color, in a nice analogy with the two ways of adding the three color vectors as described above. In contrast to ordinary color, though, the three types of color charge are totally indistinguishable from each other, reflecting the symmetry upon which the gauge theory is based.

Fig. 2. Vector addition for color charge [panels (a), (b), (c)].

(A complication needs to be mentioned here: The conserved color charge is in actuality a vector in an eight-dimensional space, in analogy to the three-dimensional isospin space. Just as the isospin components fail to commute, so that the eigenvalues of just one of them, referred to as $I_3$, label the members of each isospin multiplet, in the same way just two of the eight components of color can be taken to be commuting, so that the quark multiplets are labeled by those two parameters and give rise to the two-dimensional vectors discussed above.)
The gauge quanta have come to be referred to as gluons, in reference to the long-standing puzzlement as to the nature of the glue that holds nuclear matter together; the color gauge field is often referred to as the gluon field. The quantum field theory of the color gauge field is called quantum chromodynamics (QCD) in analogy to quantum electrodynamics (QED), the quantum theory of the electromagnetic field. (Chromo- is a prefix, from a Greek root, meaning color.) The gluons are massless (there are no broken symmetries in this theory), but because they inevitably carry color charge (see Sec. IV), they are also confined, just as are the quarks. This, together with the color neutrality of all the physical hadrons, prevents the gluon field from generating a long-range force. Many people have speculated that there might be color neutral bound configurations of gluons, referred to as glueballs, but they would necessarily be massive.
Whether the gluon field does in fact produce confinement is still an unanswered question, but there are strong indications that it does, as a direct consequence of the nonlinear character of such a gauge field. In contrast to confinement, which concerns the behavior of the field at large distances from the source, there is also a very important question as to what happens at small distances. As mentioned at the beginning of this section, high-energy probes have pointed to the presence of quarks in the interior of baryons and furthermore have indicated that the quarks move very freely at these small distances, as if the forces between them were much weaker than the theory seems to demand. This weakness of the interaction at small distances has come to be called asymptotic freedom, in reference to the fact that very small distances correspond to very large momentum transfers (related to the Heisenberg uncertainty principle). It was very important, then, that any proposed theory show this sort of behavior; it has been one of the triumphs of QCD that Gross and Wilczek and, independently, Politzer, were able to show in 1972 that this was indeed the case.
With this settling of the problem of asymptotic freedom, all the major difficulties in formulating gauge theories of the strong, electromagnetic, and weak interactions were essentially solved, giving rise to what we have referred to as the Standard Model, the picture of the world as consisting of quarks and leptons, together with the two basic gauge fields associated with the strong and electroweak interactions.
I have not yet discussed the gravitational field, which seems to me to be the mystery at the center of the puzzle. It is clear that Einstein's general theory of relativity can be
understood in many ways as the gauge theory associated with the symmetries of space and time, but it does not fit easily into the pattern I have tried to describe. The local symmetry involved is the invariance of the theory under arbitrary curvilinear coordinate transformations, which can be seen as a kind of local Poincaré invariance (the complete symmetry of special relativity). The conserved quantities associated with Poincaré invariance, namely, energy, momentum, and angular momentum, are to some extent the sources for the gravitational field, as the gauge philosophy requires, but there are important differences. In the first place, the local transformations associated with the coordinate transformations of general relativity are general affine transformations, a much broader class than the Poincaré group and too broad in fact to describe the symmetries we see in our local environment. Closely associated with this is the fact that the quantities in general relativity that play the role of the gauge potentials, namely, the Christoffel symbols (or the affine connection in mathematical terms), are not dynamically independent gauge potentials at all, but are derived from the metric tensor, which in some way then acts the part of the gauge potential.
In the second place, the sources of the gravitational field in conventional general relativity are just the energy and momentum distributions, associated through Noether's theorem not with the whole Poincaré group, but just with a subgroup of it, namely, the translation symmetry of space and time, as if general relativity were the gauge theory of the translation symmetries alone. Many people, then, have tried to extend the gauge symmetry to the full Poincaré group (theories with torsion, in the usual terminology), which would predict forces associated with angular momentum distributions. This remains very speculative.
Finally, gravitation theory has proved extremely difficult to quantize, with the problems I have described appearing in their worst possible form, together with other problems not present in the other kinds of gauge theory.
Nonetheless, these three basic interaction theories seem to give a remarkably coherent picture of the universe as we know it. Classical gravitation theory, which is general relativity as Einstein developed it, has survived many experimental tests and fits beautifully with all our conceptions of stellar behavior and cosmology. The Glashow-Weinberg-Salam theory of the electroweak interaction has also had remarkable success in describing experimental results and, while it is extremely difficult to do any but the crudest calculations with QCD, there are many indications that it does give a correct description of the strong interactions. In Sec. VII I shall try to give a summary of the ideas presented here and some account of the directions that are being explored in trying to find a single unified theory encompassing all the fundamental interactions, a theory of everything, as some people have called it.

VII. THE FUTURE: A THEORY OF EVERYTHING?

In the preceding pages I have tried to describe some of the history of the idea of gauge symmetries and gauge fields starting with Noether's theorem (or Noether's principle, perhaps; see Sec. III) relating symmetries and conservation laws; going on to develop the logic of gauge theory and the unique way in which the gauge philosophy generates theories associated with given natural symmetries; and ending with a coherent, if still incomplete, picture in which all the basic interactions of nature are understood in terms of three fundamental gauge fields: the gluon field, the electroweak field, and the gravitational field.
In this presentation, I have greatly oversimplified the complex developments leading up to the Standard Model and I must do the same now with the multitudinous developments and efforts of recent years extending beyond that model. Two principal lines of thought have been directed toward the goal of unification. There has long been a desire on the part of scientists, stimulated especially by Einstein, to see coherence in the laws of physics, a feeling that all the laws should be part of a single pattern, consequences in a unified way of a single fundamental principle governing all of nature. The history of physics has seen many unifications and our present picture, consisting of the Standard Model plus general relativity, represents spectacular progress toward this end in that all the diverse areas of 19th- and 20th-century physics are drawn together into a very compact set of elementary constituents and interaction laws; furthermore, the interactions are all associated with gauge theories that follow the same basic principles.
It is clear, though, that the job is not finished. The dream is to unify the basic interaction fields into a single theory, presumably a gauge theory associated with some great and natural symmetry group that comprises all the symmetries of our current picture and breaks down through some kind of spontaneous symmetry breaking, like that associated with the electroweak gauge theory, into the different subsymmetries. Gravity will undoubtedly be the most difficult to incorporate into such a unified theory; thus the first step was expected to be a unification of the strong and electroweak theories into a GUT. In such a scheme it is evident that the six (presumed) flavors of quarks will be symmetrically related to the six different leptons and inevitably, on account of the symmetry breaking, will be able very weakly to transform into each other. If quarks can turn into leptons, though, it means that an otherwise stable hadron (the hadrons are the baryons and mesons, the strongly interacting particles) would ultimately decay. In fact, the only stable hadron is the proton (though the neutron can also be stable when bound in a nucleus), so the first characteristic of a GUT would be proton decay. The lifetime would have to exceed around $10^{30}$ years, which is greater than the age of our universe by a factor of around $10^{20}$, so you might expect it to be impossible to observe. However, there are many protons in the matter around us, for example, about $10^{30}$ in a ton of water, so there is in fact a real chance of seeing them. Major experimental efforts have been made to do so in a number of different laboratories, using huge vats of liquid surrounded by electronic eyes, in deep underground mines in order to minimize cosmic ray background. Thus far the results have been negative and the limits obtained on the decay rate are sufficiently low to have forced us to abandon at least the most reasonable forms of GUT. The efforts continue, but the interest in GUTs has decreased considerably. One of the untidy elements in the search for a GUT is the Higgs, the presumed quantum of the Higgs field, introduced in an ad hoc way, as you will recall, to provide a symmetry breaking mechanism for the electroweak gauge theory. Such a particle seems to have no fundamental reason of its own for existing since it seems esthetically unsatisfactory to suppose that nature invented it for that purpose alone. What
526
many hope is that the dynamics of the fermion and gauge fields will be found to produce spontaneous symmetry breaking without the help of such a device, a mechanism referred to as dynamical symmetry breaking.

Beyond the unification of the strong and electroweak forces there lies the dream of total unification of gravity into a single theory of everything. This sort of work is much more speculative than in the case of GUTs because of the extra degrees of complexity in gravitation and its failure to conform to the pattern of the other gauge theories. The major reason for these differences is that gravitation is associated with what is called a dynamical symmetry group having to do with transformations on space and time, the medium in which dynamical events take place, while the other gauge theories are associated with internal symmetry groups whose transformations involve only nondynamical degrees of freedom, the flavor and color parameters labeling the different elementary particles.

While it may turn out that the role of gravity is indeed intrinsically different from that of the other forces, there are nevertheless two lines of attack on the problem of unification that are elegant enough to stand at least some chance of being true, namely, Kaluza-Klein and supersymmetry theories. The idea in supersymmetric theories is to extend the symmetry groups of nature to include transformations that mix boson and fermion fields. This can be done without violating the conservation of fermion number by allowing the cross terms in the transformation matrices, the terms associated with boson-fermion mixing, to be what are called Grassmann elements, which are totally anticommuting analogs of ordinary c numbers. While the supersymmetry transformations never turn bosons completely into fermions, or vice versa, such theories do predict matching pairs of bosons and fermions, including even the gauge quanta (photons and photinos, gluons and gluinos, etc.). This proliferation of particles may be somewhat embarrassing, but what is startling is that a spin-2 gauge field arises very naturally in this context, with many of the right properties to be a lively candidate for the gravitational field. It is linked to a spin-3/2 gravitino field which might or might not have observable consequences.

Another very significant approach is the Kaluza-Klein idea, growing out of the idea of Kaluza (1921) and Klein (1926), that the basic structure of space-time might in fact have more than the usual four dimensions. If the additional dimensions are tightly curled up, so that you could not go more than [...] cm, say, in such a direction without returning to your starting point, then at the level where ordinary physics takes place only four dimensions would be observed. The basic symmetries of nature, however, would be realized in this higher dimensioned space-time and the basic quantum field theory would be a field theory in that higher number of dimensions. On one hand, dynamical symmetries involving the extra dimensions would appear at our level as internal symmetries, allowing for the possibility (not completely realized at this point) of relating the two types of symmetry, while, on the other hand, the problems with divergences and other ambiguities that beset quantum field theories would at least take a different form and might even prove to be soluble.

One exciting development that combines these ideas is known as ten-dimensional superstring theory. The theory is a supersymmetric gauge theory in a ten-dimensional space-time, with six of the dimensions assumed to be highly compact, as suggested above, and with the additional feature that the fundamental entities out of which matter is to be built are not point particles, but strings, tiny open or closed loops that can be thought of as dislocations in the basic structure of space and time and whose transformation symmetries are the basis for an associated gauge theory. With this precise combination of dimensions and structures it seems that many of the difficulties mentioned above might find their natural resolution, such as the canceling of divergences and anomalies in the gauge theories, the natural appearance of the gravitational field, and the natural concealment of the extra bosons and fermions generated by supersymmetry. There are many aspects of such a theory that could never be directly tested since they involve the structure of space-time and matter at a totally inaccessible level: There are those who regard the exercise as futile for this reason. One could argue, though, that if such a theory should be found to provide a consistent basis for understanding physics at the level at which we do observe the world and if there is no other way of doing this that anyone can think of, then elegance alone is a sufficient reason for treating it as true.

There seems to be little doubt now that the ultimate theory, if it is ever accurately identified, will turn out to be a gauge theory. My own feeling is that there will have to be at least one more major conceptual revolution before that final goal is achieved. While gauge theories are easy to formulate at the classical level, the process of quantizing gauge theories is quite awkward, involving either noncovariant procedures or the introduction of unphysical degrees of freedom, all of which suggests to me that we may be starting from an incorrect understanding of quantum theory itself. If the most basic theory of the universe is a quantum gauge theory, then a gauge theory should be the most natural thing (if not perhaps the only thing) that can be quantized, rather than the most awkward; indeed, you should be able to formulate a quantum gauge theory directly, without going through the intermediate stage of the classical theory. Will the ultimate theory be a gauge theory of the full group of unitary transformations on the Hilbert space of quantum theory, so that quantum states themselves will be understood locally, rather than globally?
The reports in this monograph have shown great enthusiasm and exuberance for
the unification of various interactions through the concept of gauge fields. I would
like to emphasize a point that has not yet been explicitly stated by any of the other
authors: gauge fields are deeply related to some profoundly beautiful ideas of
contemporary mathematics, ideas that are the driving forces of part of the
mathematics of the last 40 years. Recalling the relationship between physics and
mathematics in earlier periods, general relativity and Riemannian geometry,
quantum mechanics and Hilbert space, it is all too obvious that physicists may again
be zeroing in on a fundamental new secret of nature.
The mathematical development referred to above is the theory of fiber bundles.
It may appear, a priori, that this theory is quite abstract and is unrelated to the
structure of the physical world. To show that this is not true, we will start with a
simple demonstration that electromagnetism and quantum mechanics together lead
naturally to nontrivial fiber bundles. We will then trace the early history of the
gauge field concept and its generalization, emphasizing three related but different
conceptual motivations, each of which leads to a general formulation of gauge
fields.
MAGNETIC MONOPOLES AND NONTRIVIAL BUNDLES
The magnetic monopole is the magnetic charge. Though the idea of magnetic
monopoles probably was discussed in classical physics early in the history of
electricity and magnetism, modern discussions of this concept date back only to 1931,
when the important paper of Dirac pointed out that magnetic monopoles in
quantum mechanics exhibit some extra and subtle features. In particular, with the
existence of a magnetic monopole of strength $g$, electric charges and magnetic
charges must necessarily be quantized in quantum mechanics. We will give a new
derivation of this result below.
If one wants to describe the wave function of an electron in the field of a
magnetic monopole, it is necessary to find the vector potential A around the
monopole. Dirac chose a vector potential that has a string of singularities. The
necessity of such a string of singularities is obvious if we prove the following
theorem²:
Here, $\Omega_\alpha$ and $\Omega_\beta$ are the total upward magnetic fluxes through caps $\alpha$ and $\beta$, both of
which are bordered by the parallel. Subtracting these two equations, we obtain
$$0 = \Omega_\alpha - \Omega_\beta,$$
which is equal to the total flux out of the sphere, which, in turn, is equal to $4\pi g \neq 0$.
We have thus reached a contradiction.
Having proved this theorem, we observe that R is arbitrary. Thus, one con-
cludes that there must be a string(s) of singularities in the vector potential to
describe the monopole field. Yet, we know that the magnetic field around the
monopole is singularity free. This fact suggests that the string of singularities is not
a real physical difficulty. Indeed, the situation is reminiscent of the problem that
one faces when one wants to find a parametrization of the surface of the globe. The
coordinate system that we usually use, latitude and longitude, is not singularity free.
It has singularities at the north pole and at the south pole. Yet, the surface of the
globe is evidently devoid of singularities. We deal with this situation usually in the
manner illustrated in FIGURE 2. We consider a rubber sheet with nicely defined
coordinates and stretch and wrap it downward onto the globe, so that it covers more
than the northern hemisphere. Similarly, we consider another rubber sheet with
nicely defined coordinates and stretch and wrap it upward, so it covers more than
FIGURE 2. Method of parametrizing the globe.
which has no singularities in $R_b$. It is simple to prove that the curl of either of these
two potentials gives correctly the magnetic field of the monopole.
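The statement about the two potentials can be checked numerically. The sketch below assumes the standard two-patch form in which each potential has only an azimuthal component, $(A_\phi)_a = g(1-\cos\theta)/(r\sin\theta)$ and $(A_\phi)_b = -g(1+\cos\theta)/(r\sin\theta)$. These explicit forms, the value of `g`, and all function names are assumptions made for illustration, since the defining equations are not reproduced in this text. It compares the loop integral of each potential around a parallel with the flux of the monopole field $g\,\hat{r}/r^2$ through the cap above it.

```python
import numpy as np

g = 0.5  # illustrative monopole strength (assumed value)

def A_phi_a(r, theta):
    """Assumed azimuthal component of the potential used in region R_a."""
    return g * (1.0 - np.cos(theta)) / (r * np.sin(theta))

def A_phi_b(r, theta):
    """Assumed azimuthal component of the potential used in region R_b."""
    return -g * (1.0 + np.cos(theta)) / (r * np.sin(theta))

def loop_integral(A_phi, theta, r=1.0, n=4000):
    """Line integral of A around the parallel at polar angle theta."""
    dphi = 2.0 * np.pi / n
    # A_phi does not depend on phi, so all n arc elements contribute equally.
    return float(n * A_phi(r, theta) * r * np.sin(theta) * dphi)

def cap_flux(theta_max, n=4000):
    """Upward flux of B = g r_hat / r^2 through the cap 0 <= theta <= theta_max
    (midpoint rule in theta; the result is independent of the radius)."""
    edges = np.linspace(0.0, theta_max, n + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    return float(2.0 * np.pi * g * np.sum(np.sin(mid)) * (theta_max / n))

theta = np.pi / 2                      # take the parallel to be the equator
loop_a = loop_integral(A_phi_a, theta)
loop_b = loop_integral(A_phi_b, theta)
flux_north = cap_flux(theta)           # Stokes: expected to match loop_a
jump = loop_a - loop_b                 # expected to match 4*pi*g
```

Under these assumptions `loop_a` reproduces the flux through the northern cap, and `jump` is the total flux $4\pi g$ out of the sphere, which is what a difference of the form $2g\,\nabla\phi$ between the two potentials would give around a closed parallel.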
In the region of overlap, because both of the two sets of vector potentials share
the same curl, the difference between them must be curlless and therefore must be a
gradient. Indeed, a simple calculation shows
where $\phi$ is the azimuthal angle. The Schrödinger equation for an electron in the
monopole field is thus
where $\psi_a$ and $\psi_b$ are, respectively, the wave functions in the two regions. The fact
that the two vector potentials in these two equations are different by a gradient tells
us, by the well-known gauge principle, that $\psi_a$ and $\psi_b$ are related by a phase factor
transformation
or
$$2q = \text{integer}. \tag{9}$$
HILBERT SPACE OF SECTIONS
Two $\psi$'s, $\psi_a$ and $\psi_b$, in $R_a$ and $R_b$, respectively, that satisfy the condition of
transition (Equation 8) in the overlap region are called a section by the
mathematicians. We see that around a monopole, the electron wave function is a
section and not an ordinary function. We will call these functions wave sections.
Different wave sections (which belong to different energies, for example)
clearly satisfy the same condition of transition (Equation 8) with the same $q$. Thus,
we need to develop the concept of a Hilbert space of sections. To develop this
concept, we define the scalar product of two sections $\xi$, $\eta$ (for the same $q$) by
(The question of convergence at $r = 0$ and $r = \infty$ is ignored here.) Notice that in the
overlap
$$x\xi_a = S(x\xi_b).$$
Thus, $x$ is an operator in the Hilbert space of sections. Similarly, we prove that the
components of $(\mathbf{p} - e\mathbf{A})$ are operators, but those of $\mathbf{p}$ are not. Furthermore, $x$ and
$\mathbf{p} - e\mathbf{A}$ are both Hermitian.
Following Fierz, we will now attempt to construct angular momentum
operators. Define
$$\mathbf{L} = \mathbf{r} \times (\mathbf{p} - e\mathbf{A}) - \frac{q\,\mathbf{r}}{r}.$$
It is clear that $L_x$, $L_y$, and $L_z$ are Hermitian operators on the Hilbert space of
sections. The following commutation rules can be easily verified:
Equation 13, together with its consequence (Equation 14), show that $L_x$, $L_y$, and
$L_z$ are the angular momentum operators. We emphasize that neither the Hilbert
space nor these operators possess any singularities. (The singularities of $\mathbf{A}_a$ and
$\mathbf{A}_b$ are not real singularities, because they occur outside of $R_a$ and $R_b$, respectively.)
MONOPOLE HARMONICS $Y_{q,l,m}$
where $\xi$ is a section dependent only on angular coordinates $\theta$ and $\phi$. $\mathbf{L}$ operates,
then, on angular sections.
Equation 14 shows that $[L^2, L_z] = 0$. Simultaneous diagonalization produces
the familiar multiplets with eigenvalues $l(l+1)$ and $m$,
$$L^2 Y_{q,l,m} = l(l+1)\,Y_{q,l,m}, \qquad L_z Y_{q,l,m} = m\,Y_{q,l,m}, \tag{15}$$
where $l = 0, 1/2, 1, \ldots$, and for each value of $l$, $m$ ranges from $-l$ to $+l$ in integral
steps of increment. $Y_{q,l,m}$ are eigensections, which are called³ monopole
harmonics. The allowed values of $l$ and $m$ are
Each of these $l$, $m$ combinations occurs exactly once. One can choose each $Y$
normalized, so that
$$\int_0^{\pi} \sin\theta\,d\theta \int_0^{2\pi} |Y_{q,l,m}|^2\,d\phi = 1.$$
Different $Y_{q,l,m}$ (for fixed $q$) are orthogonal, a fact one easily proves in the usual
way from Equation 15.
The explicit values of $Y_{q,l,m}$ in terms of Jacobi polynomials were given in
Reference 3. They were obtained from Equation 15, in exactly the same way one
usually obtains the spherical harmonics $Y_{l,m}$. Indeed,
$$Y_{l,m} = Y_{0,l,m}.$$
The collection of $Y_{q,l,m}$ for fixed $q$ and values of $l$, $m$ given by Equation 16 form³ a
complete orthonormal set of angular sections.
Each $(Y_{q,l,m})_a$ is analytic in $R_a$; so is $(Y_{q,l,m})_b$ in $R_b$. Thus, all of the
discontinuities, cusps, and singularities in $\mathbf{A}$ and in $\psi$ are removed in a very smooth way.
$$\nabla \cdot (\nabla \times \mathbf{A}) = 0,$$
the magnetic field described by $\nabla \times \mathbf{A}$ must have continuous flux lines. Thus, its
flux lines consist of the dotted lines of FIGURE 4, plus the bundle of lines described
by the solid line, so as to make the net flux at the origin zero. Thus, $\nabla \times \mathbf{A}$ does
not correctly describe the magnetic field of the monopole, a point already
emphasized by Wentzel.
(B) For ordinary spherical harmonics, there are many important theorems, such
as the spherical harmonics addition theorem and the decomposition of products of
spherical harmonics by use of Clebsch-Gordan coefficients. These theorems can be
generalized to monopole harmonics.⁶
(C) In the approximately 40 years since Dirac's first paper on monopoles, the
subject has been beset with difficulties due to singularities. Now that we have
removed the difficulty of string singularities through the introduction of the
concept of sections, it is revealed that there is yet another difficulty, which we will
call the Lipkin-Weisberger-Peshkin difficulty. This difficulty occurs⁸ in studying
the radial wave function of a Dirac electron around a monopole (TABLE 1). It can
be removed through the introduction of a small extra magnetic moment for the
Dirac electron.
(D) It is instructive to go back to the reasoning represented in FIGURE 1 and
attempt to repeat the steps for the combined $\mathbf{A}_a$, $\mathbf{A}_b$ description of the magnetic
field. Choose the parallel to be the equator. Then,
TABLE 1. DIFFICULTIES AND METHODS OF SOLUTION FOR STUDYING THE MOTION OF A DIRAC ELECTRON IN THE FIELD OF A MAGNETIC MONOPOLE
Thus,
which is, by Equation 6, equal to the increment of $\alpha$ around the equator, that is,
$$2g(2\pi) = 4\pi g.$$
We have arrived at an identity. I have provided this simple argument because it
is exactly the gist of the proof of the famous Gauss-Bonnet-Allendoerfer-Weil-
Chern theorem and the later Chern-Weil theorem, which play seminal roles in
contemporary mathematics.
In fact, gauge fields, of which electromagnetism is the simplest example, are
conceptually identical to some mathematical concepts in fiber bundle theory.
TABLE 2 gives translations for the terminologies used by physicists, on the one
hand, and mathematicians, on the other. We notice that, in particular, Dirac's
monopole quantization (Equation 9) is identical to the mathematical concept of
classification of U(1) bundles according to the first Chern class.
The last two entries of TABLE 2 identify electromagnetism without and with
magnetic monopoles with connections on trivial and nontrivial U(1) bundles, respectively. Why
is electromagnetism without monopoles trivial? We can gain some
understanding by looking at a paper loop and a Moebius strip (FIGURE 5). If they are
cut along the dotted lines, each would break into two pieces. Looking at the
resultant pieces, we cannot differentiate between the two. The paper loop and the
Moebius strip are different only in the way the resultant pieces are put together. For
the latter, a twist of one of the resultant pieces is necessary. The difference between
a trivial and a nontrivial bundle resides only in the processes of joining: for the
nontrivial bundle, a twist is needed in the joining process. In the case of
electromagnetism, the joining process is given by Equation 7 or 8. If there is no
monopole, $S = 1$, and the bundle is trivial. If there is a monopole, $S \neq 1$, and the
bundle is nontrivial. (We may describe the nontrivial nature by saying that a twist
of phase is necessary.)
TABLE 2. TRANSLATION OF TERMINOLOGIES
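attempted to unify gravity and electromagnetism through the use of the geometric
concept of a space-time-dependent scale change. The basic idea is summarized
below.
$$
\begin{array}{lll}
 & x^\mu & x^\mu + dx^\mu \\
\text{scale} & 1 & 1 + S_\mu\,dx^\mu \\
f & f & f + (\partial f/\partial x^\mu)\,dx^\mu \\
f & f & f + [(\partial/\partial x^\mu) + S_\mu]\,f\,dx^\mu
\end{array}
$$
In the summary above, the first line indicates how the scale changes in going from a
point $x^\mu$ to a neighboring point $x^\mu + dx^\mu$ of space-time. The second line shows how
a function of space-time changes as a result of the change in argument from $x^\mu$ to
$x^\mu + dx^\mu$. Finally, if the scale change is applied to the function $f$, one obtains at
$x^\mu + dx^\mu$ the product
$$(1 + S_\mu\,dx^\mu)\,[f + (\partial f/\partial x^\mu)\,dx^\mu].$$
Expanding to first order in the small displacement gives the last line in the
summary. The increment in $f$ is, then,
$$[(\partial/\partial x^\mu) + S_\mu]\,f\,dx^\mu. \tag{18}$$
$$p_\mu \to -i\hbar\,(\partial/\partial x^\mu).$$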
In 1927, Fock¹⁰ observed that one could base quantum electrodynamics on this
operator. London¹¹ pointed out the similarity of Fock's to Weyl's earlier work.
Comparing Equations 18 and 19, Weyl's identification would be correct if one
makes the replacement
$$S_\mu \to -i\,(e/\hbar c)\,A_\mu,$$
which can be thought of as an imaginary scale change. Weyl put all of these
expressions together¹² in a remarkable paper (which also first discussed the
two-component theory of a spin-1/2 particle) in which the transformation of the
electromagnetic potential
kept the earlier terminology*† that he used in 1918-20 and called both the
transformation (Equation 20) and the associated phase change of wave functions
gauge transformations.
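Why the potential and the wave function must transform together can be seen in a short calculation. What follows is a sketch under stated assumptions, not a reproduction of the equations missing from this text: it takes the operator $D_\mu = \partial_\mu - (ie/\hbar c)A_\mu$ implied by the replacement of $S_\mu$ quoted above, and the symbols $D_\mu$ and $\alpha$ are introduced here only for illustration.

```latex
\begin{align*}
A_\mu &\to A'_\mu = A_\mu + \partial_\mu \alpha ,
\qquad
\psi \to \psi' = e^{i(e/\hbar c)\alpha}\,\psi , \\
D'_\mu \psi'
 &= \Bigl(\partial_\mu - \tfrac{ie}{\hbar c}A_\mu
      - \tfrac{ie}{\hbar c}\,\partial_\mu\alpha\Bigr)
    e^{i(e/\hbar c)\alpha}\,\psi \\
 &= e^{i(e/\hbar c)\alpha}
    \Bigl(\partial_\mu\psi + \tfrac{ie}{\hbar c}(\partial_\mu\alpha)\,\psi
      - \tfrac{ie}{\hbar c}A_\mu\psi
      - \tfrac{ie}{\hbar c}(\partial_\mu\alpha)\,\psi\Bigr) \\
 &= e^{i(e/\hbar c)\alpha}\,D_\mu\psi .
\end{align*}
```

With these conventions the gradient added to the potential is cancelled by the derivative of the phase, so $D_\mu\psi$ changes only by the same phase factor as $\psi$ itself.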
Generalization: With the discovery of many new particles after World War II,
physicists explored various couplings between the elementary particles. Many
possible couplings can be written down, and the desire to find a principle to choose
among the many possibilities was one of the motivations for an attempt to
generalize Weyl's gauge principle for electromagnetism. The point here is that for
electromagnetism, the gauge principle determines, all at once, the way in which any
particle of charge $qe$, a conserved quantity, serves as a source of the electromagnetic
field. Because the isotopic spin $I$ is also conserved, a natural question was, Does
there exist a generalized gauge principle that determines the way in which $I$ serves as
the source of a new field?
Another motivation for an attempt at generalization is the observation that the
conservation of I implies that the proton and the neutron are similar. Which to call
a proton or, indeed, which superposition of the two to call a proton, is a convention
that one can select arbitrarily (if the electromagnetic interaction is switched off). If
one requires this freedom of choice to be independent for observers at different
space-time points, that is, if one requires localized freedom of choice, one is led to a
generalization of the gauge principle.
These two motivations were, of course, intertwined and led quite naturally to the
formulation of non-Abelian gauge fields.
A third approach to a generalized gauge principle came later and is the
integral formalism of gauge fields. It starts from the observation that the gauge
principle of Weyl deals with a phase factor (Equation 20) between two neighboring
points. Along a path from space-time point $A$ to space-time point $B$, the resultant
phase factor is
$$\exp\left(-\frac{ie}{\hbar c}\int_A^B A_\mu\,dx^\mu\right),$$
which is path dependent, that is, nonintegrable. (Dirac had already discussed, in
1931, non-integrable phases for wave functions.) If one analyzes the meaning of
electromagnetism in quantum mechanics, especially through a discussion of the
Bohm-Aharonov experiment,²⁰‡ one reaches the conclusion² that
electromagnetism is the gauge invariant manifestation of a non-integrable phase factor.
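The path dependence of such a phase factor can be illustrated numerically. The sketch below is an assumption-laden toy, not the calculation of the cited references: it uses the vector potential outside an idealized thin solenoid carrying flux `flux` (the Bohm-Aharonov geometry), sets the coupling $e/\hbar c$ to 1, and all function names are hypothetical. It compares the phase factors accumulated along two paths with the same end points that pass on opposite sides of the solenoid.

```python
import numpy as np

def A_field(x, y, flux):
    """Vector potential outside an idealized thin solenoid on the z axis (assumed form)."""
    r2 = x * x + y * y
    return -flux * y / (2.0 * np.pi * r2), flux * x / (2.0 * np.pi * r2)

def phase_factor(path, flux, coupling=1.0, n=20000):
    """exp(i * coupling * integral of A . dx) along path(t), t in [0, 1], by the midpoint rule."""
    t = np.linspace(0.0, 1.0, n + 1)
    x, y = path(t)
    xm, ym = 0.5 * (x[:-1] + x[1:]), 0.5 * (y[:-1] + y[1:])
    Ax, Ay = A_field(xm, ym, flux)
    integral = np.sum(Ax * np.diff(x) + Ay * np.diff(y))
    return np.exp(1j * coupling * integral)

def upper(t):
    """Upper semicircle from (1, 0) to (-1, 0)."""
    return np.cos(np.pi * t), np.sin(np.pi * t)

def lower(t):
    """Lower semicircle from (1, 0) to (-1, 0)."""
    return np.cos(np.pi * t), -np.sin(np.pi * t)

flux = 1.3  # illustrative enclosed flux (assumed value)
ratio = phase_factor(upper, flux) / phase_factor(lower, flux)
expected = np.exp(1j * flux)  # phase set by the flux enclosed between the two paths
```

In this toy setup the field strength vanishes everywhere along both paths, yet the two phase factors differ by a factor fixed by the enclosed flux. The ratio is the phase factor around the closed loop, which is unchanged when a gradient is added to the potential.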
Once this conclusion is reached, a natural generalization is to replace a
* The idea of scale invariance, discussed in Reference 9, was developed earlier, in 1918-19,
in three papers by Weyl (submitted on May 2 and June 8, 1918 and on January 7, 1919). In the
first two of them, he used the term Massstab Invarianz (see Reference 14); in the third paper,
he settled on the term Eich Invarianz.
The English translation of Eich Invarianz was "calibration invariance" in Henry Brose's
1921 translation of the fourth edition of Weyl's book Space, Time and Matter (republished
by Dover). The translation "gauge invariance" was not used, I suspect, until after Weyl's
1929 article.¹² It appeared (probably not for the first time) in Dirac's article of 1931.
† The transformation (Equation 21) that leaves field strengths unchanged must have been
known in the nineteenth century. It did not, however, seem to have a specific name. In the
many editions of Föppl-Abraham-Becker-Sauter on electricity and magnetism, which started
in 1894, Eich or gauge was not used until the 1964 English translation Electromagnetic
Fields and Interactions, in which the term Lorentz gauge was inserted in a footnote.
‡ The experiment was performed by Chambers.
FIGURE 6. Three motivations that led to the concept of gauge fields.
however, I believe it will be a long time before the question can be definitively
answered as to exactly how strong and weak interactions are due to gauge fields.
Reflecting on how the concepts basic to gauge fields were formulated by
physicists, we see that at every step, the development was tied to the problem of the
conceptual description of the physical world. Firstly, Maxwell's equations originated
with the four fundamental experimental laws of electricity and magnetism and with
Faraday's introduction of the concepts of field and flux. Maxwell's equations and
the principles of quantum mechanics led to the idea of gauge invariance. Attempts
to generalize this idea, motivated by physical concepts of phases, symmetry, and
conservation laws, led to the theory of non-Abelian gauge fields. That non-Abelian
gauge fields are conceptually identical to ideas in the beautiful theory of fiber
bundles, developed by mathematicians without reference to the physical world, was
a great marvel to me. In 1975, I discussed my feelings with Chern, and said, "This
is both thrilling and puzzling, since you mathematicians dreamed up these concepts
out of nowhere." He immediately protested, "No, no, these concepts were not
dreamed up. They were natural and real."
REFERENCES
Norbert Straumann
Institut für Theoretische Physik der Universität Zürich-Irchel, Zürich, Switzerland
One of the major developments of twentieth-century physics has been the gradual recognition that a
common feature of the known fundamental interactions is their gauge structure. In this article the
authors review the early history of gauge theory, from Einstein's theory of gravitation to the
appearance of non-Abelian gauge theories in the fifties. The authors also review the early history of
dimensional reduction, which played an important role in the development of gauge theory. A
description is given of how, in recent times, the ideas of gauge theory and dimensional reduction have
emerged naturally in the context of string theory and noncommutative geometry.
Reviews of Modern Physics, Vol. 72, No. 1, January 2000 ©2000 The American Physical Society
FIG. 1. Key papers in the development of gauge theories.

At the time Weyl's contributions to theoretical physics were not appreciated very much, since they did not really add new physics. The attitude of the leading theoreticians was expressed with familiar bluntness in a letter by Pauli to Weyl of July 1, 1929, after he had seen a preliminary account of Weyl's work:

Before me lies the April edition of the Proc. Nat. Acad. (US). Not only does it contain an article from you under Physics but shows that you are now in a Physical Laboratory: from what I hear you have even been given a chair in Physics in America. I admire your courage; since the conclusion is inevitable that you wish to be judged, not for success in pure mathematics, but for your true but unhappy love for physics. (Translated from Pauli, 1979.)

Weyl's reinterpretation of his earlier speculative proposal had actually been suggested before by London and Fock, but it was Weyl who emphasized the role of gauge invariance as a symmetry principle from which electromagnetism can be derived. It took several decades until the importance of this symmetry principle (in its generalized form to non-Abelian gauge groups developed by Yang, Mills, and others) also became fruitful for a description of the weak and strong interactions. The mathematics of the non-Abelian generalization of Weyl's 1929 paper would have been an easy task for a mathematician of his rank, but at the time there was no motivation for this from the physics side. The known properties of the weak and strong nuclear interactions, in

ent, as the gauge principle is used as an input, but the change from a continuum to a discrete structure produces qualitatively new features. Amongst these is an interpretation of the Higgs field as a gauge potential and the emergence of a dimensional reduction that avoids the usual embarrassment concerning the fate of the extra dimensions.

A fuller account of the early history of gauge theory is given by O'Raifeartaigh (1997). There one can also find English translations of the most important papers of the early period, as well as Pauli's letters to Pais on non-Abelian Kaluza-Klein reductions. These works underlie the diagram in Fig. 1.

II. WEYL'S ATTEMPT TO UNIFY GRAVITATION AND ELECTROMAGNETISM

On the 1st of March 1918 Weyl writes in a letter to Einstein:

These days I succeeded, as I believe, to derive electricity and gravitation from a common source. . . .

Einstein's prompt reaction by postcard indicates already a physical objection, which he explained in detail shortly afterwards. Before we come to this we have to describe Weyl's theory of 1918.

A. Weyl's generalization of Riemannian geometry

Weyl's starting point was purely mathematical. He felt a certain uneasiness about Riemannian geometry, as is clearly expressed by the following sentences early in his paper:

But in Riemannian geometry described above there is contained a last element of geometry at a distance (ferngeometrisches Element), with no good reason,
(2) They must be gauge invariant, i.e., invariant with respect to the substitutions of Eq. (9) for an arbitrary smooth function $\lambda$.

Nothing is more natural to Weyl than identifying $A_\mu$ with the vector potential and $F_{\mu\nu}$ in Eq. (3) with the field strength of electromagnetism. In the absence of electromagnetic fields ($F_{\mu\nu} = 0$) the scale factor, $\exp(-\int_\gamma A)$ in Eq. (2), for length transport becomes path independent (integrable) and one can find a gauge such that $A_\mu$ vanishes for simply connected space-time regions. In this special case, it is the same situation as in general relativity.

Weyl proceeds to find an action that is generally invariant as well as gauge invariant and that would give the coupled field equations for $g$ and $A$. We do not want to enter into this, except for the following remark. In his first paper Weyl (1918) proposes what we now call the Yang-Mills action:

$$S(g, A) = -\frac{1}{4}\int \mathrm{Tr}(\Omega \wedge {*\Omega}). \tag{10}$$

Here $\Omega$ denotes the curvature form and $*\Omega$ its Hodge dual. Note that the latter is gauge invariant, i.e., independent of the choice of $g \in [g]$. In Weyl's geometry the curvature form splits as $\Omega = \hat{\Omega} + F$, where $\hat{\Omega}$ is the metric piece (Audretsch, Gähler, and Straumann, 1984). Correspondingly, the action also splits,

$$\mathrm{Tr}(\Omega \wedge {*\Omega}) = \mathrm{Tr}(\hat{\Omega} \wedge {*\hat{\Omega}}) + F \wedge {*F}. \tag{11}$$

The second term is just the Maxwell action. Weyl's theory thus contains formally all aspects of a non-Abelian gauge theory.

Weyl emphasizes, of course, that the Einstein-Hilbert action is not gauge invariant. Later work by Pauli (1919) and by Weyl himself (1918, 1922) soon led to the conclusion that the action of Eq. (10) could not be the correct one, and other possibilities were investigated (see the later editions of Space, Time, Matter).

Independent of the precise form of the action, Weyl shows that in his theory gauge invariance implies the conservation of electric charge in much the same way as general coordinate invariance leads to the conservation of energy and momentum. This beautiful connection pleased him particularly: ". . . [it] seems to me to be the strongest general argument in favour of the present theory, insofar as it is permissible to talk of justification in the context of pure speculation." The invariance principles imply five Bianchi-type identities. Correspondingly, the five conservation laws follow in two independent ways from the coupled field equations and may be termed the eliminants of the latter. These structural connections hold also in modern gauge theories.

C. Einstein's objection and reactions of other physicists

After this sketch of Weyl's theory we come to Einstein's striking counterargument, which he first communicated to Weyl by postcard (see Fig. 3). The problem is that if the idea of a nonintegrable length connection (scale factor) is correct, then the behavior of clocks would depend on their history. Consider two identical atomic clocks in adjacent world points and bring them along different world trajectories which meet again in adjacent world points. According to Eq. (2) their frequencies would then generally differ. This is in clear contradiction with empirical evidence, in particular with the existence of stable atomic spectra. Einstein therefore concludes (see Straumann, 1987):

. . . (if) one drops the connection of the ds to the measurement of distance and time, then relativity loses all its empirical basis.

Nernst shared Einstein's objection and demanded on behalf of the Berlin Academy that it be printed in a short amendment to Weyl's article. Weyl had to accept this. One of us has described elsewhere (Straumann, 1987; see also Vol. 8 of Einstein, 1987) the intense and instructive subsequent correspondence between Weyl and Einstein. As an example, let us quote from one of the last letters of Weyl to Einstein:

This [insistence] irritates me of course, because experience has proven that one can rely on your intuition; so unconvincing as your counterarguments seem to me, as I have to admit. . .
By the way, you should not believe that I was driven to introduce the linear differential form in addition to the quadratic one by physical reasons. I wanted, just to the contrary, to get rid of this methodological inconsistency (Inkonsequenz) which has been a bone of contention to me already much earlier. And then, to my surprise, I realized that it looked as if it might explain electricity. You clap your hands above your head and shout: But physics is not made this way! (Weyl to Einstein 10 December 1918).

Weyl's reply to Einstein's criticism was, generally speaking, this: The real behavior of measuring rods and clocks (atoms and atomic systems) in arbitrary electromagnetic and gravitational fields can be deduced only from a dynamical theory of matter.

Not all leading physicists reacted negatively. Einstein transmitted a very positive first reaction by Planck, and
The integrand in Eq. (10) is indeed just the expression Sommerfeld wrote enthusiastically to Weyl that there
R , g y s R a B Y s ~ d d X o h . . . ~ din
x 3 local coordinates which is was . . . hardly doubt, that you are on the correct path
used by Weyl (RepYs=the curvature tensor of the Weyl con- and not on the wrong one.
nection). In his encyclopedia article on relativity Pauli (1921)
3We adopt here the somewhat naive interpretation of energy- gave a lucid and precise presentation of Weyls theory,
momentum conservation for generally invariant theories of the but commented on Weyls point of view very critically.
older literature. At the end he states:
FIG. 3. Postcard from Einstein to Weyl 15 April 1918. From Archives of Eidgenössische Technische Hochschule, Zürich.
". . . In summary one may say that Weyl's theory has not yet contributed to getting closer to the solution of the problem of matter."

Eddington's reaction was at first very positive but he soon changed his mind and denied the physical relevance of Weyl's geometry.

The situation was later appropriately summarized by London (1927) as follows:

"In the face of such elementary experimental evidence, it must have been an unusually strong metaphysical conviction that prevented Weyl from abandoning the idea that Nature would have to make use of the beautiful geometrical possibility that was offered. He stuck to his conviction and evaded discussion of the above-mentioned contradictions through a rather unclear re-interpretation of the concept of 'real state,' which, however, robbed his theory of its immediate physical meaning and attraction."

In this remarkable paper, London suggested a reinterpretation of Weyl's principle of gauge invariance within the new quantum mechanics: The role of the metric is taken over by the wave function, and the rescaling of the metric has to be replaced by a phase change of the wave function.

In this context an astonishing early paper by Schrödinger (1922) has to be mentioned, which also used Weyl's "world geometry" and is related to Schrödinger's later invention of wave mechanics. This precursor relation was discovered by Raman and Forman (1969). [See also the discussion by C. N. Yang in Schrödinger (1987).]

Simultaneously with London, Fock (1927) arrived along a completely different line at the principle of gauge invariance in the framework of wave mechanics. His approach was similar to that of Klein, which will be discussed in detail (in Sec. IV).

The contributions of Schrödinger (1922), London (1927), and Fock (1927) are discussed in the book of O'Raifeartaigh (1997), where English translations of the original papers can also be found. Here, we concentrate on Weyl's seminal paper "Electron and Gravitation."

III. WEYL'S 1929 CLASSIC: "ELECTRON AND GRAVITATION"

Shortly before his death late in 1955, Weyl wrote for his Selecta (Weyl, 1956) a postscript to his early attempt in 1918 to construct a unified field theory. There he expressed his deep attachment to the gauge idea and adds (p. 192):

"Later the quantum-theory introduced the Schrödinger-Dirac potential $\psi$ of the electron-positron field; it carried with it an experimentally-based principle of gauge-invariance which guaranteed the conservation of charge, and connected the $\psi$ with the electromagnetic potentials $\phi_i$ in the same way that my speculative theory had connected the gravitational potentials $g_{ik}$ with the $\phi_i$, and measured the $\phi_i$ in known atomic, rather than unknown cosmological units. I have no doubt but that the correct context for the principle of gauge-invariance is here and not, as I believed in 1918, in the intertwining of electromagnetism and gravity."

This reinterpretation was developed by Weyl in one of the great papers of this century (Weyl, 1929). Weyl's
classic not only gives a very clear formulation of the gauge principle, but contains, in addition, several other important concepts and results, in particular his two-component spinor theory. The richness and scope of the paper is clearly visible from the following table of contents:

"Introduction. Relationship of General Relativity to the quantum-theoretical field equations of the spinning electron: mass, gauge-invariance, distant-parallelism. Expected modifications of the Dirac theory. - I. Two-component theory: the wave function $\psi$ has only two components. - §1. Connection between the transformation of the $\psi$ and the transformation of a normal tetrad in four-dimensional space. Asymmetry of past and future, of left and right. - §2. In General Relativity the metric at a given point is determined by a normal tetrad. Components of vectors relative to the tetrad and coordinates. Covariant differentiation of $\psi$. - §3. Generally invariant form of the Dirac action, characteristic for the wave-field of matter. - §4. The differential conservation law of energy and momentum and the symmetry of the energy-momentum tensor as a consequence of the double-invariance (1) with respect to coordinate transformations (2) with respect to rotation of the tetrad. Momentum and moment of momentum for matter. - §5. Einstein's classical theory of gravitation in the new analytic formulation. Gravitational energy. - §6. The electromagnetic field. From the arbitrariness of the gauge-factor in $\psi$ appears the necessity of introducing the electromagnetic potential. Gauge invariance and charge conservation. The space-integral of charge. The introduction of mass. Discussion and rejection of another possibility in which electromagnetism appears, not as an accompanying phenomenon of matter, but of gravitation."

The modern version of the gauge principle is already spelled out in the introduction:

"The Dirac field-equations for $\psi$ together with the Maxwell equations for the four potentials $f_p$ of the electromagnetic field have an invariance property which is formally similar to the one which I called gauge-invariance in my 1918 theory of gravitation and electromagnetism; the equations remain invariant when one makes the simultaneous replacements

$$\psi \ \text{by} \ e^{i\lambda}\psi \quad \text{and} \quad f_p \ \text{by} \ f_p - \frac{\partial\lambda}{\partial x^{p}},$$

where $\lambda$ is understood to be an arbitrary function of position in four-space. Here the factor $e/ch$, where $-e$ is the charge of the electron, $c$ is the speed of light, and $h/2\pi$ is the quantum of action, has been absorbed in $f_p$. The connection of this 'gauge invariance' to the conservation of electric charge remains untouched. But a fundamental difference, which is important to obtain agreement with observation, is that the exponent of the factor multiplying $\psi$ is not real but purely imaginary. $\psi$ now plays the role that Einstein's $ds$ played before. It seems to me that this new principle of gauge-invariance, which follows not from speculation but from experiment, tells us that the electromagnetic field is a necessary accompanying phenomenon, not of gravitation, but of the material wave-field represented by $\psi$. Since gauge-invariance involves an arbitrary function $\lambda$ it has the character of 'general' relativity and can naturally only be understood in that context."

We shall soon enter into Weyl's justification, which is, not surprisingly, strongly associated with general relativity. Before this we have to describe his incorporation of the Dirac theory into general relativity, which he achieved with the help of the tetrad formalism.

One of the reasons for adapting the Dirac theory of the spinning electron to gravitation had to do with Einstein's recent unified theory, which invoked a distant parallelism with torsion. Wigner (1929) and others had noticed a connection between this theory and the spin theory of the electron. Weyl did not like this and wanted to dispense with teleparallelism. In the introduction he says:

"I prefer not to believe in distant parallelism for a number of reasons. First my mathematical intuition objects to accepting such an artificial geometry; I find it difficult to understand the force that would keep the local tetrads at different points and in rotated positions in a rigid relationship. There are, I believe, two important physical reasons as well. The loosening of the rigid relationship between the tetrads at different points converts the gauge-factor $e^{i\lambda}$, which remains arbitrary with respect to $\psi$, from a constant to an arbitrary function of space-time. In other words, only through the loosening of the rigidity does the established gauge-invariance become understandable."

This thought is carried out in detail after Weyl has set up his two-component theory in special relativity, including a discussion of P and T invariance. He emphasizes thereby that the two-component theory excludes a linear implementation of parity and remarks: "It is only the fact that the left-right symmetry actually appears in Nature that forces us to introduce a second pair of $\psi$ components." To Weyl the mass problem is thus not relevant for this.⁴ Indeed he says: "Mass, however, is a gravitational effect; thus there is hope of finding a substitute in the theory of gravitation that would produce the required corrections."

⁴At the time it was thought by Weyl, and indeed by all physicists, that the two-component theory required a zero mass. In 1957, after the discovery of parity nonconservation, it was found that the two-component theory could be consistent with a finite mass. See Case (1957).
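The replacement rule quoted above ($\psi$ by $e^{i\lambda}\psi$ together with $f_p$ by $f_p - \partial\lambda/\partial x^{p}$) can be checked in a few lines. The following is only a minimal sketch and not Weyl's own derivation. It assumes that the potentials enter the matter equations solely through the combination $D_p = \partial_p + i f_p$; the symbol $D_p$ and this sign convention are introduced here for illustration, chosen so that they match the quoted replacement.

```latex
% Assumption (not from the source): psi couples to f_p only through
% D_p = \partial_p + i f_p, with \partial_p = \partial / \partial x^p.
\begin{align*}
D_p'\,\psi'
  &= \left(\partial_p + i f_p - i\,\partial_p\lambda\right)\left(e^{i\lambda}\psi\right) \\
  &= e^{i\lambda}\left(\partial_p\psi + i(\partial_p\lambda)\,\psi
       + i f_p\,\psi - i(\partial_p\lambda)\,\psi\right) \\
  &= e^{i\lambda}\,D_p\psi .
\end{align*}
```

Under this assumption $D_p\psi$ acquires the same phase factor as $\psi$ itself, so an equation built from $\psi$ and $D_p\psi$ that is invariant under constant phases remains invariant when $\lambda$ depends on position. For constant $\lambda$ the shift of $f_p$ vanishes, which is one way to read the remark that a position-dependent gauge factor is what makes the potential necessary.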
A. Tetrad formalism
(Indices are raised and lowered with $\eta^{\alpha\beta}$ and $\eta_{\alpha\beta}$, respectively.) They are determined (in terms of the tetrad) by the first structure equation of Cartan:

$$de^{\alpha} + \omega^{\alpha}{}_{\beta}\wedge e^{\beta} = 0. \qquad (15)$$

(For a textbook derivation see Straumann, 1984.) Under local Lorentz transformations [Eq. (13)] the connection forms transform in the same way as the gauge potential of a non-Abelian gauge theory:

$$\omega(x) \to \Lambda(x)\,\omega(x)\,\Lambda^{-1}(x) - d\Lambda(x)\,\Lambda^{-1}(x). \qquad (16)$$

The curvature forms $\Omega = (\Omega^{\alpha}{}_{\beta})$ are obtained from $\omega$ in exactly the same way as the Yang-Mills field strength from the gauge potential:

$$\Omega = d\omega + \omega\wedge\omega \qquad (17)$$

(second structure equation).

For a vector field $V$, with components $V^{\alpha}$ relative to $\{e_{\alpha}\}$, the covariant derivative $DV$ is given by

tetrad but only to the extent that they can still be multiplied by an arbitrary "gauge-factor" $e^{i\lambda}$. The transformation of the $\psi$ induced by a rotation of the tetrad is determined only up to such a factor. In special relativity one must regard this gauge-factor as a constant because here we have only a single point-independent tetrad. Not so in general relativity; every point has its own tetrad and hence its own arbitrary gauge-factor; because by the removal of the rigid connection between tetrads at different points the gauge-factor necessarily becomes an arbitrary function of position.

In this manner Weyl arrives at the gauge principle in its modern form and emphasizes "From the arbitrariness of the gauge factor in $\psi$ appears the necessity of introducing the electromagnetic potential." The first term $d\psi$ in Eq. (19) now has to be replaced by the covariant gauge derivative $(d - ieA)\psi$, and the nonintegrable scale factor (2) of the old theory is now replaced by a phase factor:
one hand, it is a consequence of the field equations for matter plus gauge invariance. On the other hand, however, it is also a consequence of the field equations for the electromagnetic field plus gauge invariance. This corresponds to an identity in the coupled system of field equations that has to exist as a result of gauge invariance. All this is now familiar to students of physics and does not need to be explained in more detail.

Much of Weyl's paper appeared also in his classic book The Theory of Groups and Quantum Mechanics (Weyl, 1981). There he mentions the transformation of his early gauge-theoretic ideas: "This principle of gauge invariance is quite analogous to that previously set up by the author, on speculative grounds, in order to arrive at a unified theory of gravitation and electricity. But I now believe that this gauge invariance does not tie together electricity and gravitation, but rather electricity and matter."

When Pauli saw the full version of Weyl's paper he became more friendly and wrote (Pauli, 1979, p. 518):

"In contrast to the nasty things I said, the essential part of my last letter has since been overtaken, particularly by your paper in Z. f. Physik. For this reason I have afterward even regretted that I wrote to you. After studying your paper I believe that I have really understood what you wanted to do (this was not the case in respect of the little note in the Proc. Nat. Acad.). First let me emphasize that side of the matter concerning which I am in full agreement with you: your incorporation of spinor theory into gravitational theory. I am as dissatisfied as you are with distant parallelism and your proposal to let the tetrads rotate independently at different space-points is a true solution."

In brackets Pauli adds:

"Here I must admit your ability in Physics. Your earlier theory with $g'_{ik} = \lambda g_{ik}$ was pure mathematics and unphysical. Einstein was justified in criticizing and scolding. Now the hour of your revenge has arrived."

Then he remarks, in connection with the mass problem,

"Your method is valid even for the massive [Dirac] case. I thereby come to the other side of the matter, namely, the unsolved difficulties of the Dirac theory (two signs of $m_0$) and the question of the 2-component theory. In my opinion these problems will not be solved by gravitation. . . the gravitational effects will always be much too small."

Many years later, Weyl summarized this early tortuous history of gauge theory in an instructive letter (Seelig, 1960) to the Swiss writer and Einstein biographer C. Seelig, which we reproduce in an English translation.

"The first attempt to develop a unified field theory of gravitation and electromagnetism dates to my first attempt in 1918, in which I added the principle of gauge invariance to that of coordinate invariance. I myself have long since abandoned this theory in favour of its correct interpretation: gauge invariance as a principle that connects electromagnetism not with gravitation but with the wave-field of the electron. Einstein was against it [the original theory] from the beginning, and this led to many discussions. I thought that I could answer his concrete objections. In the end he said 'Well, Weyl, let us leave it at that! In such a speculative manner, without any guiding physical principle, one cannot make Physics.' Today one could say that in this respect we have exchanged our points of view. Einstein believes that in this field [Gravitation and Electromagnetism] the gap between ideas and experience is so wide that only the path of mathematical speculation, whose consequences must, of course, be developed and confronted with experiment, has a chance of success. Meanwhile my own confidence in pure speculation has diminished, and I see a need for a closer connection with quantum-physics experiments, since in my opinion it is not sufficient to unify Electromagnetism and Gravity. The wave-fields of the electron and whatever other irreducible elementary particles may appear must also be included."

Independently of Weyl, Fock (1929) also incorporated the Dirac equation into general relativity using the same method. On the other hand, Tetrode (1928), Schrödinger (1932), and Bargmann (1932) reached this goal by starting with space-time-dependent $\gamma$ matrices, satisfying $\{\gamma^{\mu},\gamma^{\nu}\} = 2g^{\mu\nu}$. A somewhat later work by Infeld and van der Waerden (1932) is based on spinor analysis.

IV. THE EARLY WORK OF KALUZA AND KLEIN

Early in 1919 Einstein received a paper of Theodor Kaluza, a young mathematician (Privatdozent) and consummate linguist in Königsberg. Inspired by the work of Weyl one year earlier, he proposed another geometrical unification of gravitation and electromagnetism by extending space-time to a five-dimensional pseudo-Riemannian manifold. Einstein reacted very positively. On 21 April 1919 he writes, "The idea of achieving [a unified theory] by means of a five-dimensional cylinder world never dawned on me. . . . At first glance I like your idea enormously." A few weeks later he adds: "The formal unity of your theory is startling." For unknown reasons, Einstein submitted Kaluza's paper to the Prussian Academy after a delay of two years (Kaluza, 1921).

Kaluza was actually not the first who envisaged a five-dimensional unification. It is astonishing to note that G. Nordström had this idea already in 1914 (Nordström, 1914). We recall that Nordström had worked out in several papers (Nordström, 1912, 1913a, 1913b) a scalar theory of gravitation that was regarded by Einstein as
the only serious competitor to general relativity. (In collaboration with Fokker, Einstein gave this theory a generally covariant, conformally flat form.) Nordström started in his unification attempt with five-dimensional electrodynamics and imposed the "cylinder condition," that the fields should not depend on the fifth coordinate. Then the five-dimensional gauge potential ${}^{(5)}A$ splits as ${}^{(5)}A = A + \phi\,dx^{5}$, where $A$ is a four-dimensional gauge potential and $\phi$ is a space-time scalar field. The Maxwell field splits correspondingly, ${}^{(5)}F = F + d\phi\wedge dx^{5}$, and hence the free Maxwell Lagrangian becomes

In this manner Nordström arrived at a unification of his theory of gravity and electromagnetism. [The matter source (five-current) is decomposed correspondingly.] It seems that this early attempt left, as far as we know, no traces in the literature.

We now return to Kaluza's attempt. Like Nordström he assumes the cylinder condition. Then the five-dimensional metric tensor splits into the four-dimensional fields $g_{\mu\nu}$, $A_{\mu}$, and $\phi$. Kaluza's identification of the electromagnetic potential is not quite the right one, because he chooses it equal to $g_{\mu 5}$ (up to a constant), instead of taking the quotient $g_{\mu 5}/g_{55}$. This does not matter in his further analysis, because he considers only the linearized approximation of the field equations. Furthermore, the matter part is only studied in a nonrelativistic approximation. In particular, the five-dimensional geodesic equation is only written in this limit. Then the scalar contribution to the four-force becomes negligible and an automatic split into the usual gravitational and electromagnetic parts is obtained.

Kaluza was aware of the limitations of his analysis, but he was confident of being on the right track, as becomes evident from the final paragraph of his paper:

"In spite of all the physical and theoretical difficulties which are encountered in the above proposal it is hard to believe that the derived relationships, which could hardly be surpassed at the formal level, represent nothing more than a malicious coincidence. Should it sometime be established that the scheme is more than an empty formalism this would signify a new triumph for Einstein's General Theory of Relativity, whose suitable extension to five dimensions is our present concern."

For good reasons the role of the scalar field was unclear to him, except in the limiting situation of his analysis, where $\phi$ becomes the negative of the gravitational potential. Kaluza was, however, well aware that the scalar field could play an important role, and he makes some speculative remarks in this direction.

In the classical part of his first paper, Klein (1926a) improves on Kaluza's treatment. He assumes, however, beside the condition of cylindricity, that $g_{55}$ is a constant. Following Kaluza, we keep here the scalar field $\phi$ and write the Kaluza-Klein ansatz for the five-dimensional metric ${}^{(5)}g$ in the form

$${}^{(5)}g = \phi^{-1/3}\left(g - \phi\,\omega\otimes\omega\right), \qquad (23)$$

where $g = g_{\mu\nu}\,dx^{\mu}dx^{\nu}$ is the space-time metric and $\omega$ is a differential 1-form of the type

$$\omega = dx^{5} + \kappa A_{\mu}\,dx^{\mu}. \qquad (24)$$

Like $\phi$, $A = A_{\mu}\,dx^{\mu}$ is independent of $x^{5}$; $\kappa$ is a coupling constant to be determined. The convenience of the conformal factor $\phi^{-1/3}$ will become clear shortly.

Klein considers the subgroup of five-dimensional coordinate transformations which respect the form (23) of the d = 5 metric:

$$x^{\mu} \to x^{\mu}, \qquad x^{5} \to x^{5} + f(x^{\mu}). \qquad (25)$$

Indeed, the pull-back of ${}^{(5)}g$ is again of the form (23) with

Thus $A = A_{\mu}\,dx^{\mu}$ transforms like a gauge potential under the Abelian gauge group (25) and is therefore interpreted as the electromagnetic potential. This is further justified by the most remarkable result derived by Kaluza and Klein, often called the Kaluza-Klein miracle. It turns out that the five-dimensional Ricci scalar ${}^{(5)}R$ splits as follows:

$${}^{(5)}R = \phi^{1/3}\left(R + \frac{1}{4}\kappa^{2}\phi\,F_{\mu\nu}F^{\mu\nu} - \frac{1}{6\phi^{2}}(\nabla\phi)^{2} + \frac{1}{3}\Delta\ln\phi\right). \qquad (27)$$

For $\phi = 1$ this becomes the Lagrangian of the coupled Einstein-Maxwell system. In view of the gauge group (25), this split is actually no miracle, because no other gauge-invariant quantities can be formed.

For the development of gauge theory this dimensional reduction was particularly important, because it revealed a close connection between coordinate transformations in higher-dimensional spaces and gauge transformations in space-time.

With Klein we consider the d = 5 Einstein-Hilbert action

assuming that the higher-dimensional space is a cylinder with $0 \le x^{5} \le L = 2\pi R_{5}$. Since
Our choice of the conformal factor $\phi^{-1/3}$ in Eq. (23) was made so that the gravitational part in Eq. (30) is just the Einstein-Hilbert action, if we choose

$$\kappa^{2} = 16\pi G. \qquad (31)$$

For $\phi = 1$ a beautiful geometrical unification of gravitation and electromagnetism is obtained.

We pause by noting that nobody in the early history of Kaluza-Klein theory seems to have noticed the following inconsistency in putting $\phi = 1$ [see, however, Lichnerowicz (1995)]: The field equations for the dimensionally reduced action (30) are just the five-dimensional equations ${}^{(5)}R_{ab} = 0$ for the Kaluza-Klein ansatz (23). Among these, the $\phi$ equation, which is equivalent to ${}^{(5)}R_{55} = 0$, becomes

Dicke-type interactions has lately been investigated for instance by Damour and collaborators (Damour and Polyakov, 1994).

Since the work of Fierz (published in German, Fierz, 1956) is not widely known, we briefly describe its main point. Quoting Pauli, Fierz emphasizes that, in theories containing both tensor and scalar fields, the tensor field appearing most naturally in the action of the theory can differ from the "physical" metric by some conformal factor depending on the scalar fields. In order to decide which is the "atomic-unit" metric and thus the gravitational constant, one has to look at the coupling to matter. The "physical" metric $g_{\mu\nu}$ is the one to which matter is universally coupled (in accordance with the principle of equivalence). For instance, the action for a spin-0 massive matter field $\psi$ should take the form
and adds:

"The small value of this length together with the periodicity in the fifth dimension may perhaps be taken as a support of the theory of Kaluza in the sense that they may explain the non-appearance of the fifth dimension in ordinary experiments as the result of averaging over the fifth dimension."

Klein concludes this note with the daring speculation that the fifth dimension might have something to do with Planck's constant:

"In a former paper the writer has shown that the differential equation underlying the new quantum mechanics of Schrödinger can be derived from a wave equation of a five-dimensional space, in which $h$ does not appear originally, but is introduced in connection with the periodicity in $x^{5}$. Although incomplete, this result, together with the considerations given here, suggests that the origin of Planck's quantum may be sought just in this periodicity in the fifth dimension."

This was not the last time that such speculations have been put forward. The revival of (supersymmetric) Kaluza-Klein theories in the eighties (Appelquist, Chodos, and Freund, 1987; Kubyshin et al., 1989) led to the idea that the compact dimensions would necessarily give rise to an enormous quantum vacuum energy via the Casimir effect. There were attempts to exploit this vacuum energy in a self-consistent approach to compactification, with the hope that the size of the extra dimensions would be calculable as a pure number times the Planck length. Consequently the gauge-coupling constant would then be calculable.

Coming back to Klein we note that he would also have arrived at Eq. (39) by the dimensional reduction of his five-dimensional equation. Indeed, if the wave field $\psi(x, x^{5})$ is Fourier decomposed with respect to the periodic fifth coordinate,

one obtains for each amplitude $\psi_{n}(x)$ [for the metric (23) with $\phi = 1$] the following four-dimensional wave equation:

(45)

where $\alpha$ is the fine-structure constant and $l_{Pl}$ is the Planck length.

Equations (43) and (44) imply a serious defect of the five-dimensional theory: The (bare) masses of all charged particles ($|n| \ge 1$) are of the order of the Planck mass

$$m_{n} = n\,\frac{\sqrt{\alpha}}{2}\,m_{Pl}. \qquad (46)$$

The pioneering papers of Kaluza and Klein were taken up by many authors. For some time the "projective" theories of Veblen (1933), Hoffmann (1933), and Pauli (1933) played a prominent role. These are, however, just equivalent formulations of Kaluza's and Klein's unification of the gravitational and the electromagnetic field (Bergmann, 1942; Ludwig, 1951).

Einstein's repeated interest in five-dimensional generalizations of general relativity has been described by Bergmann (1942) and Pais (1982) and will not be discussed here.

V. KLEIN'S 1938 THEORY

The first attempt to go beyond electromagnetism and gravitation and apply Weyl's gauge principle to the nuclear forces occurred in a remarkable paper by Oskar Klein, presented at the Kazimierz Conference on New Theories in Physics (Klein, 1938). Assuming that the mesons proposed by Yukawa were vectorial, Klein proceeded to construct a Kaluza-Klein-like theory which would incorporate them. As in the original Kaluza-Klein theory he introduced only one extra dimension but his theory differed from the original in two respects:

(i) The fields were not assumed to be completely independent of the fifth coordinate $x^{5}$ but to depend on it through a factor $e^{-iex^{5}}$ where $e$ is the electric charge.

(ii) The five-dimensional metric tensor was assumed to be of the form

$$g_{\mu\nu}(x), \qquad g_{55} = 1, \qquad g_{\mu 5} = \beta\chi_{\mu}(x), \qquad (47)$$
where $g_{\mu\nu}$ was the usual four-dimensional Einstein metric, $\beta$ was a constant, and $\chi_{\mu}(x)$ was a matrix-valued field of the form

where the $\sigma$'s are the usual Pauli matrices and $A_{\mu}(x)$ is what we would now call an SU(2) gauge potential. This was a most remarkable ansatz considering that it implies a matrix-valued metric, and it is not clear what motivated Klein to make it. The reason that he multiplied the present-day SU(2) matrix by $\sigma_{3}$ is that $\sigma_{3}$ represented the charge matrix for the fields.

Having made this ansatz, Klein proceeded in the standard Kaluza-Klein manner and obtained, instead of the Einstein-Maxwell equations, a set of equations that we would now call the Einstein-Yang-Mills equations. This is a little surprising because Klein inserted only electromagnetic gauge invariance. However, one can see how the U(1) gauge invariance of electromagnetism could generalize to SU(2) gauge invariance by considering the field strengths. The SU(2) form of the field strengths corresponding to the $B_{\mu}$ and $\tilde{B}_{\mu}$ fields, namely,

$$F^{B}_{\mu\nu} = \partial_{\mu}B_{\nu} - \partial_{\nu}B_{\mu} + ie\,(A_{\mu}B_{\nu} - A_{\nu}B_{\mu}), \qquad (49)$$

$$F^{\tilde{B}}_{\mu\nu} = \partial_{\mu}\tilde{B}_{\nu} - \partial_{\nu}\tilde{B}_{\mu} - ie\,(A_{\mu}\tilde{B}_{\nu} - A_{\nu}\tilde{B}_{\mu}) \qquad (50)$$

actually follows from the electromagnetic gauge principle $\partial_{\mu} \to D_{\mu} = \partial_{\mu} + ie(1 - \sigma_{3})A_{\mu}$, given that the three vector fields belong to the same 2 × 2 matrix. The more difficult question is why the expression

$$F^{A}_{\mu\nu} = \partial_{\mu}A_{\nu} - \partial_{\nu}A_{\mu} - ie\,(B_{\mu}\tilde{B}_{\nu} - B_{\nu}\tilde{B}_{\mu})$$

for the field strength corresponding to $A_{\mu}$ contained a bilinear term when most other vector-field theories, such as the Proca theory, contained only the linear term. The reason is that the geometrical nature of the dimensional reduction meant that the usual space-time derivative $\partial_{\mu}$ was replaced by the covariant space-time derivative $\partial_{\mu} + ie(1 - \sigma_{3})\chi_{\mu}/2$, with the result that the usual curl $\partial\wedge\chi$ was replaced by $\partial_{\mu}\chi_{\nu} - \partial_{\nu}\chi_{\mu} + (ie/2)[\chi_{\mu},\chi_{\nu}]$, whose third component is just the expression for $F^{A}_{\mu\nu}$.

Being primarily interested in the application of his theory to nuclear physics, Klein immediately introduced the nucleons, treating them as an isodoublet $\psi(x)$ on which the matrix $\chi_{\mu}$ acted by multiplication. In this way he was led to field equations of the familiar SU(2) form, namely,

$$(\gamma\cdot D + M)\,\psi(x) = 0, \qquad D_{\mu} = \partial_{\mu} + \frac{ie}{2}(1 - \sigma_{3})\chi_{\mu}. \qquad (52)$$

However, although the equations of motion for the vector fields $A_{\mu}$, $B_{\mu}$, and $\tilde{B}_{\mu}$ would be immediately recognized today as those of an SU(2) gauge-invariant theory, this was not at all obvious at the time and Klein does not seem to have been aware of it. Indeed, he immediately proceeded to break the SU(2) gauge invariance by assigning ad hoc mass terms to the $B_{\mu}$ and $\tilde{B}_{\mu}$ fields.

An obvious weakness of Klein's theory is that there is only one coupling constant $\beta$, which implies that the nuclear and electromagnetic forces would be of approximately the same strength, in contradiction with experiment. Furthermore, the nuclear forces would not be charge independent, as they were known to be at the time. These weaknesses were noticed by Møller, who, at the end of the talk, objected to the theory on these grounds. Klein's answer to these objections was astonishing: this problem could easily be solved he said, because the strong interactions could be made charge independent (and the electromagnetic field separated) by introducing one more vector field $C_{\mu}$ and generalizing the 2 × 2 matrix $\chi_{\mu}$

In other words, he there and then generalized what was effectively a (broken) SU(2) gauge theory to a broken SU(2) × U(1) gauge theory. In this way, he anticipated the mathematical structure of the standard electroweak theory by 21 years!

Klein has certainly not forgotten his ambitious proposal of 1938, in contrast to what has been suspected by Gross (1995). In his invited lecture at the Berne Congress in 1955 (Klein, 1956) he came back to some main aspects of his early attempt and concluded with the statement:

"On the whole, the relation of the theory to the five-dimensional representation of gravitation and electromagnetism on the one hand and to symmetric meson theory on the other hand, through the appearance of the charge invariance group, may perhaps justify the confidence in its essential soundness."

VI. THE PAULI LETTERS TO PAIS

The next attempt to write down a gauge theory for the nuclear interactions was due to Pauli. During a discussion following a talk by Pais at the 1953 Lorentz Conference in Leiden (Pais, 1953), Pauli said:

". . . I would like to ask in this connection whether the transformation group with constant phases can be amplified in way analogous to the gauge group for electromagnetic potentials in such a way that the meson-nucleon interaction is connected with the amplified group. . ."

Stimulated by this discussion, Pauli worked on this problem and drafted a manuscript to Pais that begins with the heading (Pauli, 1999):

"Written down July 22-25, 1953, in order to see how it looks. Meson-Nucleon Interaction and Differential Geometry."

In this manuscript, Pauli generalizes the Kaluza-Klein theory to a six-dimensional space and arrives through dimensional reduction at the essentials of an SU(2) gauge theory. The extra dimensions form a two-sphere $S^{2}$ with space-time-dependent metrics on which SU(2)
operates in a space-time-dependent manner. Pauli develops first in local language the geometry of what we now call a fiber bundle with a homogeneous space as typical fiber [in his case $S^2 = SU(2)/U(1)$]. Studying the curvature of the higher-dimensional space, Pauli automatically finds, for the first time, the correct expression for the non-Abelian field strength.

Since it is somewhat difficult to understand exactly what Pauli did, we give some details, using more familiar formulations and notations.

Pauli considers the six-dimensional total space $M \times S^2$, where $S^2$ is the two-sphere on which SO(3) acts in the canonical manner. He distinguishes among the diffeomorphisms (coordinate transformations) those which leave $M$ pointwise fixed and induce space-time-dependent rotations on $S^2$:

$$(x, y) \to [x, R(x)\cdot y]. \tag{54}$$

Then Pauli postulates a metric on $M \times S^2$ that is supposed to satisfy three assumptions. These lead him to what is now called the non-Abelian Kaluza-Klein ansatz: The metric $\hat{g}$ on the total space is constructed from a space-time metric $g$, the standard metric $\gamma$ on $S^2$, and a Lie-algebra-valued 1-form,

$$A = A^a T_a, \qquad A^a = A^a_\mu\, dx^\mu, \tag{55}$$

on $M$ [$T_a$, $a = 1,2,3$, are the standard generators of the Lie algebra of SO(3)] as follows: If $K^i_a\,\partial/\partial y^i$ are the three Killing fields on $S^2$, then

$$\hat{g} = g - \gamma_{ij}\,[dy^i + K^i_a(y)A^a] \otimes [dy^j + K^j_a(y)A^a]. \tag{56}$$

In particular, the nondiagonal metric components are

$$\hat{g}_{\mu i} = A^a_\mu(x)\,\gamma_{ij}K^j_a. \tag{57}$$

Pauli does not say that the coefficients of $A^a_\mu$ in Eq. (57) are the components of the three independent Killing fields. This is, however, his result, which he formulates in terms of homogeneous coordinates for $S^2$. He determines the transformation behavior of $A^a_\mu$ under the group (54) and finds in matrix notation what he calls "the generalization of the gauge group":

$$A_\mu \to R^{-1}A_\mu R + R^{-1}\partial_\mu R. \tag{58}$$

With the help of $A_\mu$, he defines a covariant derivative, which is used to derive "field strengths" by applying a generalized curl to $A_\mu$. This is exactly the field strength that was later introduced by Yang and Mills. To our knowledge, apart from Klein's 1938 paper, it appears here for the first time. Pauli says that "this is the true physical field, the analog of the field strength" and he formulates what he considers to be his "main result":

  The vanishing of the field strength is necessary and sufficient for the $A^a_\mu(x)$ in the whole space to be transformable to zero.

It is somewhat astonishing that Pauli did not work out the Ricci scalar for $\hat{g}$ as for the Kaluza-Klein theory. One reason may be connected with his remark on the Kaluza-Klein theory in Note 23 of his relativity article (Pauli, 1958) concerning the five-dimensional curvature scalar (p. 230):

  There is, however, no justification for the particular choice of the five-dimensional curvature scalar P as integrand of the action integral, from the standpoint of the restricted group of the cylindrical metric [gauge group]. The open problem of finding such a justification seems to point to an amplification of the transformation group.

In a second letter (Pauli, 1999), Pauli also studies the dimensionally reduced Dirac equation and arrives at a mass operator that is closely related to the Dirac operator in internal space $(S^2, \gamma)$. The eigenvalues of the latter operator had been determined by him long before (Pauli, 1939). Pauli concludes with the statement: "So this leads to some rather unphysical shadow particles."

VII. YANG-MILLS THEORY

In his Hermann Weyl Centenary Lecture at the ETH (Yang, 1980), C. N. Yang commented on Weyl's remark "The principle of gauge-invariance has the character of general relativity since it contains an arbitrary function $\lambda$, and can certainly only be understood in terms of it" (Weyl, 1968) as follows:

  The quote above from Weyl's paper also contains something which is very revealing, namely, his strong association of gauge invariance with general relativity. That was, of course, natural since the idea had originated in the first place with Weyl's attempt in 1918 to unify electromagnetism with gravity. Twenty years later, when Mills and I worked on non-Abelian gauge fields, our motivation was completely divorced from general relativity and we did not appreciate that gauge fields and general relativity are somehow related. Only in the late 1960s did I recognize the structural similarity mathematically of non-Abelian gauge fields with general relativity and understand that they both were connections mathematically.

Later, in connection with Weyl's strong emphasis of the relation between gauge invariance and conservation of electric charge, Yang continues with the following instructive remarks:

  Weyl's reason, it turns out, was also one of the melodies of gauge theory that had very much appealed to me when as a graduate student I studied field theory by reading Pauli's articles. I made a number of unsuccessful attempts to generalize gauge theory beyond electromagnetism, leading finally in 1954 to a collaboration with Mills in which we developed a non-Abelian gauge theory. In [. . .] we stated our motivation as follows:

    The conservation of isotopic spin points to the existence of a fundamental invariance law similar to the conservation of electric charge. In the latter case, the electric charge serves as a source of electromagnetic field; an important concept in this case is gauge invariance, which is closely connected with (1) the equation of motion of the electromagnetic field, (2) the existence of a current density, and (3) the possible interactions between a charged field and the electromagnetic field. We have tried to generalize this concept of gauge invariance to apply to isotopic spin conservation. It turns out that a very natural generalization is possible.

  Item (2) is the melody referred to above. The other two melodies, (1) and (3), were what had become pressing in the early 1950s when so many new particles had been discovered and physicists had to understand how they interact with each other.

  I had met Weyl in 1949 when I went to the Institute for Advanced Study in Princeton as a young "member." I saw him from time to time in the next years, 1949-1955. He was very approachable, but I don't remember having discussed physics or mathematics with him at any time. His continued interest in the idea of gauge fields was not known among the physicists. Neither Oppenheimer nor Pauli ever mentioned it. I suspect they also did not tell Weyl of the 1954 papers of Mills and mine. Had they done that, or had Weyl somehow come across our paper, I imagine he would have been pleased and excited, for we had put together two things that were very close to his heart: gauge invariance and non-Abelian Lie groups.

It is indeed astonishing that during those late years neither Pauli nor Yang ever talked with Weyl about non-Abelian generalizations of gauge invariance.

With the background of Sec. VI, the following story of spring 1954 becomes more understandable. In late February, Yang was invited by Oppenheimer to return to Princeton for a few days and to give a seminar on his joint work with Mills. Here is Yang's report (Yang, 1983):

  Pauli was spending the year in Princeton, and was deeply interested in symmetries and interactions. (He had written in German a rough outline of some thoughts, which he had sent to A. Pais. Years later F. J. Dyson translated this outline into English. It started with the remark, "Written down July 22-25, 1953, in order to see how it looks," and had the title "Meson-Nucleon Interaction and Differential Geometry.") Soon after my seminar began, when I had written down on the blackboard,

  $$(\partial_\mu - i\epsilon B_\mu)\psi,$$

  Pauli asked, "What is the mass of this field $B_\mu$?" I said we did not know. Then I resumed my presentation, but soon Pauli asked the same question again. I said something to the effect that that was a very complicated problem, we had worked on it and had come to no definite conclusions. I still remember his repartee: "That is not sufficient excuse." I was so taken aback that I decided, after a few moments' hesitation to sit down. There was general embarrassment. Finally Oppenheimer said, "We should let Frank proceed." I then resumed, and Pauli did not ask any more questions during the seminar.

  I don't remember what happened at the end of the seminar. But the next day I found the following message:

    February 24, Dear Yang, I regret that you made it almost impossible for me to talk with you after the seminar. All good wishes. Sincerely yours, W. Pauli.

  I went to talk to Pauli. He said I should look up a paper by E. Schrödinger, in which there were similar mathematics.6 After I went back to Brookhaven, I looked for the paper and finally obtained a copy. It was a discussion of spacetime-dependent representations of the $\gamma_\mu$ matrices for a Dirac electron in a gravitational field. Equations in it were, on the one hand, related to equations in Riemannian geometry and, on the other, similar to the equations that Mills and I were working on. But it was many years later when I understood that these were all different cases of the mathematical theory of connections on fiber bundles.

Later Yang adds:

  I often wondered what he [Pauli] would say about the subject if he had lived into the sixties and seventies.

At another occasion (Yang, 1980) he remarked:

  I venture to say that if Weyl were to come back today, he would find that amidst the very exciting, complicated and detailed developments in both physics and mathematics, there are fundamental things that he would feel very much at home with. He had helped to create them.

Having quoted earlier letters from Pauli to Weyl, we add what Weyl said about Pauli in 1946 (Weyl, 1980):

  The mathematicians feel near to Pauli since he is distinguished among physicists by his highly developed organ for mathematics. Even so, he is a physicist; for he has to a high degree what makes the physicist; the genuine interest in the experimental facts in all their puzzling complexity. His accurate, instructive estimate of the relative weight of relevant experimental facts has been an unfailing guide for him in his theoretical investigations. Pauli combines in an exemplary way physical insight and mathematical skill.

6 E. Schrödinger, Sitzungsberichte der Preussischen Akademie der Wissenschaften (1932), p. 105.

To conclude this section, let us emphasize the main differences between general relativity and Yang-Mills theories. Mathematically, the so(1,3)-valued connection forms $\omega$ in Sec. IIIA and the Lie-algebra-valued gauge potential $A$ are on the same footing; they are both representatives of connections in (principal) fiber bundles
FIG. 4. General relativity vs Yang-Mills theory.
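
The sense in which the connection forms $\omega$ and the gauge potential $A$ are "on the same footing" can be made concrete by writing the two sets of structure equations side by side. What follows is only a sketch in standard textbook notation, not the notation of Sec. IIIA: the symbols $\Omega$ for the Riemannian curvature two-form and $F$ for the Yang-Mills field strength, and the absorption of the coupling constant into $A$, are assumptions made here for illustration.

$$
\begin{aligned}
\Omega &= d\omega + \omega\wedge\omega, &\qquad d\Omega + \omega\wedge\Omega - \Omega\wedge\omega &= 0,\\
F &= dA + A\wedge A, &\qquad dF + A\wedge F - F\wedge A &= 0 .
\end{aligned}
$$

In components the second line reads $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]$. Under a space-time-dependent change of frame or gauge $R(x)$, both connections transform inhomogeneously, in the way Pauli found in Eq. (58), while both curvatures transform homogeneously:

$$
A \to R^{-1}AR + R^{-1}dR, \qquad F \to R^{-1}FR, \qquad
\omega \to R^{-1}\omega R + R^{-1}dR, \qquad \Omega \to R^{-1}\Omega R .
$$

In particular, a connection of the pure-gauge form $A = R^{-1}dR$ has $F = 0$, which is the easy direction of the "main result" Pauli states in Sec. VI.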
appropriate to conclude with some remarks on current attempts in string theory and noncommutative geometry.
second-rank symmetric-tensor theory must be a gravitational theory if the polarization vector satisfies

the emission of the on-shell massless states. The vertices are the analogs of the ordinary Feynman diagrams in quantum field theory and take the form
Already at this stage there is a feature that does not arise in the gauge-field case: Since the vertex operator is bilinear in the field $X$ it has to be normal ordered, and it turns out that the normal ordering destroys the classical conformal invariance, unless

$$\xi^{\mu}{}_{\mu} = 0 \quad\text{and}\quad p^{\mu}\xi_{\mu\nu} = 0. \tag{67}$$

We next make the momentum-space version of an infinitesimal coordinate transformation, namely,

$$\xi_{\mu\nu} \to \xi_{\mu\nu} + \eta_\mu(p)\,p_\nu + \eta_\nu(p)\,p_\mu. \tag{68}$$

Under this transformation the vertex $V_g$ picks up an additional term of the form

In analogy with the electromagnetic case we can carry out a partial integration. However, this time the expression does not vanish completely but reduces to

where $\Delta$ denotes the two-dimensional Laplacian. On the other hand, $\Delta X(\sigma) = 0$ is just the classical field equation for $X(\sigma)$, and it can be shown that even in the quantized version it is effectively zero (Green, Schwarz, and Witten, 1987; Polchinski, 1998). Thus, thanks to the dynamics, we have invariance with respect to the transformations (68). But Eq. (67) and invariance with respect to Eq. (68) are just the conditions (61) and (62) discussed earlier for the vertex to be a gravitational field. As in the gauge-field case, the important point is that the general coordinate invariance is not imposed, but is a consequence of the conformal invariance and internal structure of the string.

The appearance of a scalar field in this context is not too surprising, since a scalar also appeared in the Kaluza-Klein reduction. What is more surprising is the appearance of an antisymmetric tensor. From the point of view of traditional local gravitational and gauge-field theory the presence of an additional antisymmetric tensor field seems at first sight to be an embarrassment. But it turns out to play an essential role in maintaining conformal invariance (cancellation of anomalies), so its presence is to be welcomed.

4. The presence of matter

Of course, the open bosonic string is not the whole story any more than pure gauge fields are the whole story in quantum field theory. One still has to introduce quantities that correspond to fermions (and possibly scalars) at the zero-mass level. There are essentially two ways to do this. The first is the Chan-Paton mechanism (Green, Schwarz and Witten, 1987; Polchinski, 1998), which dates from the days of strong-interaction string theory. In this mechanism one simply attaches charged particles to the open ends of the string. These charged particles are not otherwise associated with any string and thus the mechanism is rather ad hoc and leads to a hybrid of string and field theories. But it has the merit of introducing charged particles directly and thus emphasizing the relationship between strings and gauge fields.

The Chan-Paton mechanism has the further merit of allowing a simple generalization to the non-Abelian case. This is done by replacing the charged particles by particles belonging to the fundamental representations of compact internal symmetry groups $G$, typically quarks $q_a(x)$ and antiquarks $\bar{q}_b(x)$. The vertex operator then generalizes to one with double labels $(a,b)$ and represents non-Abelian gauge fields in much the same way that the simple bosonic string represents an Abelian gauge field.

An interesting restriction arises from the fact that since the string represents gauge fields, and gauge fields belong to the adjoint representation of the gauge group, the vertex function must belong to the adjoint representation. This implies that even at the tree level the tensor product of the fundamental group representation with itself must produce only the adjoint representation, and this restricts $G$ to be one of the classical groups SO(n), Sp(2n), and U(n). Furthermore, it is found that U(n) violates unitarity at the one-loop level, which leaves only SO(n) and Sp(2n). Finally, these groups require symmetrization and antisymmetrization in the indices $a$ and $b$ to produce only the adjoint representation, and this implies that the string is oriented (symmetric with respect to its end points). When all these conditions are satisfied it can be shown that the non-Abelian vertex corresponding to Eq. (64) is covariant with respect to non-Abelian gauge transformations corresponding to $\xi_\mu \to \xi_\mu + \eta(p)\,p_\mu$ above. But since these transformations are nonlinear the proof is more difficult than in the Abelian case.

5. Fermionic and heterotic strings: Supergravity and non-Abelian gauge theory

The Chan-Paton version of gauge string has the obvious disadvantage that the charged fields (quarks) are not an intrinsic part of the theory. A second method of introducing fermions is to place them in the string itself. This is done by replacing the kinetic term $(\partial X)^2$ by a Dirac term $\bar{\psi}\,\partial\!\!\!/\,\psi$ in the Lagrangian density. Interesting cases are those in which the number of fermion components just matches the number of bosons, so that the Lagrangian is supersymmetric. In that case the condition for quantum conformal invariance reduces from $d = 26$ to $d = 10$. An interesting case from the point of view of gauge theory and dimensional reduction is the heterotic string, in which the left-handed part forms a superstring and the right-handed part forms a bosonic string in
which 16 of the bosons are fermionized. For the heterotic string the Lagrangian in the bosonic-string path integral (63) is replaced by the Lagrangian

$$\sum_{\mu=1}^{10} \partial_\alpha X_\mu\,\partial^\alpha X^\mu \;-\; 2\sum_{\mu=1}^{10} \psi^\mu_{+}\,\partial_{-}\psi_{\mu+} \;-\; 2\sum_{A=1}^{32} \lambda^A_{-}\,\partial_{+}\lambda^A_{-}, \tag{71}$$

where the $\psi$s and $\lambda$s are Majorana-Weyl fermions and the $\lambda$s belong to a representation (labeled with $A$) of an internal symmetry group $G$. It is only through the $\lambda$s that the internal symmetry group enters. The left- and right-handed parts of the theory are conformally invariant for quite different reasons. The left-handed part of the $X$s and the left-handed fermions $\psi$ are conformally invariant, because together they form the left-handed part of a superstring (this is why the summation over the $X$s is only from 1 up to 10). The right-handed part of the $X$s and the right-handed fermions $\lambda^A$ are conformally invariant because, from the point of view of anomalies, two Majorana-Weyl fermions are equivalent to one boson and thus the system is equivalent to the right-handed part of a 26-dimensional bosonic string. (This is why the index $A$ runs from 1 to 32.) The fact that there are 32 fermions obviously puts strong restrictions on the choice of the internal symmetry group $G$.

We now examine the particle content of the theory, using the light-cone gauge, where there are no redundant fields. There are no tachyons for the left-moving fields; the first excited states are massless and take the form

$$|i\rangle_L \quad\text{and}\quad |a\rangle_L, \tag{72}$$

where the $|i\rangle_L$ for $i = 1 \ldots 8$ are the left-handed components of a massless space-time vector in the eight transverse directions in the light-cone gauge and the $|a\rangle_L$ are the components of a massless fermion in one of the two fundamental spinor representations of the same space-time SO(8) group. These states are all $G$ invariant.

The first excited states for the right-moving sector are

$$|i\rangle_R \quad\text{and}\quad \lambda^A\lambda^B|0\rangle, \tag{73}$$

where the $|i\rangle_R$ are the right-handed analogs of the $|i\rangle_L$ and the $\lambda^A\lambda^B|0\rangle$ states are massless space-time scalars. The states $|i\rangle_R$ are $G$ invariant but the states $\lambda^A\lambda^B|0\rangle$ belong to the adjoint representation of $G$ and thus it is only through these states that the internal symmetry enters at the massless level.

The physical states are obtained by tensoring the left- and right-moving states (72) and (73). On tensoring the right-handed states with the vectors in Eq. (73) we obviously obtain states that are $G$ invariant, and they turn out to be just the states that would occur in $N = 1$ supergravity. An analysis of the vertex operators, similar to that carried out above for closed bosonic strings, confirms that these fields do indeed correspond to supergravity.

6. The internal symmetry group G

From the point of view of non-Abelian gauge theory the interesting states are those belonging to the nontrivial representations of $G$, and these are the ones obtained from the tensor products of Eq. (73) with the space-time scalars $\lambda^A\lambda^B|0\rangle$. At this point one must make a choice about the internal symmetry group $G$. The simplest choice is evidently $G = SO(32)$, and it is obtained by assigning antiperiodic boundary conditions to all the fermion fields $\lambda$. (Assigning periodic boundary conditions to all of them violates the masslessness condition.) Since the product states continue to belong to the adjoint representation of SO(32), they are the natural candidates for states associated with non-Abelian gauge fields, and an analysis of the vertex operators associated with these states confirms that they do indeed correspond to SO(32) gauge fields.

In sum, the heterotic string produces both supergravity and non-Abelian gauge theory.

7. Dimensional reduction and the heterotic symmetry group $E_8 \times E_8$

A variety of other left-handed internal symmetry groups $G \subset SO(32)$ can be obtained by assigning periodic and antiperiodic boundary conditions to the fermions $\lambda^A$ of the heterotic string in a nonuniform manner. However, apart from the SO(32) case just discussed, the only assignment that satisfies unitarity at the one-loop level is an equipartition of the 32 fermions into two sets of 16, with mixed boundary conditions. This would appear, at first sight, to lead to an $SO(16) \times SO(16)$ internal symmetry and gauge group, on the same grounds as SO(32) above, but a closer analysis shows that it actually leads to a larger group, namely $E_8 \times E_8$, which actually has the same dimension (496) as SO(32). This group is quite attractive for grand unification theory, as it breaks naturally to $E_6$, which is one of the favorite grand unified theory groups.

Once we accept that $SO(16) \times SO(16)$ is a gauge group and that a rigid internal symmetry group $E_8 \times E_8$ exists, it follows immediately that $E_8 \times E_8$ must be a gauge group, because the action of the rigid generators of $E_8 \times E_8$ on the $SO(16) \times SO(16)$ gauge fields produces $E_8 \times E_8$ gauge fields.

This reduces the problem to the existence of a rigid $E_8 \times E_8$ symmetry, but, within the context of our present methods, this is a rather convoluted process. One must introduce special representations of $SO(16) \times SO(16)$, project out some of the resulting states, and construct vertices that represent the elements of the coset $(E_8 \times E_8)/[SO(16) \times SO(16)]$. Luckily there is a much more intuitive way to establish the existence of the $E_8 \times E_8$ symmetry, and, as this way provides a very nice example of dimensional reduction within string theory, we shall now sketch it.

We have already remarked that, from the point of view of Virasoro anomalies, the 32 right-handed Majorana-Weyl fermions $\lambda$ are equivalent to the right-handed parts of 16 bosons. This relationship can be carried farther by bosonizing the fermions according to $\lambda^{\pm}(\sigma) = {:}\exp[\pm i\phi_R(\sigma)]{:}$, where $\phi_R(\sigma)$ is a right-moving bosonic field, compactified so that $0 \le \phi_R(\sigma) < 2\pi$. In that
case we may regard the right-handed part of the heterotic string as originating in the right-handed part of an ordinary 26-dimensional bosonic string, in which 16 of the 26 right-moving bosonic fields $X_R(\sigma)$ have been fermionized by letting $X^a_R(\sigma) \to \phi^a(\sigma)$ for $0 \le \phi^a(\sigma) < 2\pi$ and $a = 1 \ldots 16$. Since the $X$s correspond to coordinates in the target space of the string, this is equivalent to a toroidal compactification of 16 of the target-space dimensions and thus is equivalent to a Kaluza-Klein-type dimensional reduction from 26 to 10 dimensions. It turns out that the toroidal compactification and conversion to fermions is consistent only if the lattice that defines the 16-dimensional torus is even and self-dual. But it is well known that there are only two such lattices, called $\Gamma_{16}$ and $E_8 + E_8$, and since these have automorphism groups SO(32) and $E_8 \times E_8$, respectively, one sees at once where the origin of these symmetry groups lies. The further reduction from ten to four dimensions is, of course, another question. One of the more attractive proposals is that the quotient, six-dimensional space, be a Calabi-Yau space (Green, Schwarz, and Witten, 1987; Yau, 1985), but we do not wish to pursue this question further here.

B. Gauge theory and noncommutative geometry

The recent development of noncommutative geometry by Connes (1994) has permitted the generalization of gauge-theory ideas to the case in which the standard differential manifolds (Minkowskian, Euclidean, Riemannian) become mixtures of differential and discrete manifolds. The differential operators then become mixtures of ordinary differential operators and matrices. From the point of view of the fundamental physical interactions, the interest in such a generalization of gauge theory is that the Higgs field and its potential, which are normally introduced in an ad hoc manner, appear as part of the gauge-field structure. Indeed the Higgs field emerges as the component of the gauge potential in the "discrete direction" and the Higgs potential, like the self-interaction of the gauge field, emerges from the square of the curvature. The theory also relates to Kaluza-Klein theory because the Higgs field and potential can also be regarded as coming from a dimensional reduction in which the discrete direction in the gauge group is reduced to an internal direction.

1. Simple example

To explain the idea in its simplest form we follow Connes (1994) and use as an example the simplest nontrivial case, namely, when the continuous manifold is a four-dimensional compact Riemannian manifold with gauge group U(1) and the discrete manifold consists of just two points. With respect to the new discrete (two-point) direction the zero-forms (functions) $\omega_0(x)$ are taken to be diagonal 2 × 2 matrices with ordinary scalar functions as entries:

(74)

where $\Omega_0$ denotes the space of zero-forms. The essential new feature is the introduction of a discrete component $d$ of the outer derivative d. This is defined as a self-adjoint off-diagonal matrix, i.e.,

(75)

with constant entries $k$. (More generally one could take the off-diagonal elements in $d$ to be complex-conjugate bounded operators, but that will not be necessary for our purpose.) The outer derivative of the zero-forms with respect to $d$ is obtained by commutation,

The noncommutativity enters in the fact that $d\omega_0$ does not commute with the forms in $\Omega_0$. The one-forms $\omega_1$ are taken to be off-diagonal matrices,

(77)

where the $u(x)$s are ordinary scalar functions and $\Omega_1$ denotes the space of one-forms. Note that, according to Eq. (76), the discrete component of the outer derivative maps $\Omega_0$ into $\Omega_1$. The outer derivative of a one-form with respect to $d$ is obtained by anticommutation. Thus

$$d \wedge \omega_1 \equiv \{d, \omega_1\} = [u_b(x) - u_a(x)]\,k\,I \in \Omega_0, \tag{78}$$

where $I$ is the unit 2 × 2 matrix. It is easy to check that with this definition we have $d \wedge d = 0$ on both $\Omega$-spaces.

The U(1) gauge group is a zero-form and is the direct sum of the U(1) gauge groups on the two sectors of the zero-forms. Thus it has elements of the form

$$U(x) = \begin{pmatrix} e^{i\alpha(x)} & 0 \\ 0 & e^{i\beta(x)} \end{pmatrix} \in U(1). \tag{79}$$

Its action on both $\Omega_0$ and $\Omega_1$ is by conjugation. Thus under a gauge transformation the zero-forms are invariant and the one-forms transform according to

$$\omega_1(x) \to \omega'_1(x) = U^{-1}(x)\,\omega_1(x)\,U(x). \tag{80}$$

Explicitly,

where $\lambda(x) = \beta(x) - \alpha(x)$. (81)

The discrete component of a connection takes the form

and thus resembles a Hermitian one-form. But, being a connection, it is assumed to transform with respect to $U(x)$ as

$$V(x) \to V'(x) = U^{-1}(x)\,V(x)\,U(x) + e^{-1}\,U^{-1}(x)\,dU(x), \tag{83}$$

and

where

$$\phi(x) = v(x) + c, \qquad c = \frac{k}{e}. \tag{85}$$
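
Two claims in this subsection lend themselves to a quick numerical check: that the discrete derivative squares to zero when it acts by commutation on zero-forms and by anticommutation on one-forms, and that the shift by $c = k/e$ in Eq. (85) completes the square in the curvature component $F_{dd} = d \wedge V + eV^2$ of Eqs. (90) and (91). The following is a minimal sketch in pure Python, not code from the paper. The explicit matrices are assumptions, since the displayed forms of Eqs. (74), (75), and (82) are not legible here: the zero-form is taken as diag(f_a, f_b), the discrete derivative as [[0, k], [k, 0]], and the connection as the Hermitian off-diagonal matrix [[0, v], [conj(v), 0]]. All function names are hypothetical.

```python
"""Toy check of the two-point noncommutative calculus using 2x2 matrices."""


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(s, a):
    """Multiply every entry of a 2x2 matrix by the scalar s."""
    return [[s * a[i][j] for j in range(2)] for i in range(2)]


def max_abs(a):
    """Largest absolute value among the entries of a 2x2 matrix."""
    return max(abs(x) for row in a for x in row)


def discrete_d(k):
    """Assumed form of the discrete derivative: off-diagonal, constant entries k."""
    return [[0, k], [k, 0]]


def d_zero_form(k, f_a, f_b):
    """Commutator [d, w0] for the assumed zero-form w0 = diag(f_a, f_b)."""
    d, w0 = discrete_d(k), [[f_a, 0], [0, f_b]]
    return add(matmul(d, w0), scale(-1, matmul(w0, d)))


def d_one_form(k, w1):
    """Anticommutator {d, w1} for an off-diagonal one-form w1."""
    d = discrete_d(k)
    return add(matmul(d, w1), matmul(w1, d))


def curvature_dd(k, e, v):
    """F_dd = {d, V} + e V^2 for the assumed V = [[0, v], [conj(v), 0]]."""
    V = [[0, v], [v.conjugate(), 0]]
    return add(d_one_form(k, V), scale(e, matmul(V, V)))


if __name__ == "__main__":
    k, e, v = 0.7, 1.3, 0.4 + 0.9j
    # Nilpotency: the anticommutator of d with [d, w0] should vanish.
    print("max |d(d w0)| =", max_abs(d_one_form(k, d_zero_form(k, 2.0, -1.5))))
    # Completing the square: F_dd should equal e(|phi|^2 - c^2) times the identity.
    c = k / e
    phi = v + c
    print("F_dd diagonal entry:", curvature_dd(k, e, v)[0][0])
    print("e(|phi|^2 - c^2)   :", e * (abs(phi) ** 2 - c ** 2))
```

The nilpotency follows because the square of the assumed $d$ is $k^2$ times the unit matrix, which commutes with every zero-form. The second check is the identity $k(v + v^*) + e|v|^2 = e(|v + k/e|^2 - k^2/e^2)$, which is why the shifted field $\phi$, rather than $v$, appears in the Higgs-type potential.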
The outer derivative with $D$ is formed in the same way as with $d$, namely, by commutation and anticommutation on the forms $\Omega_0$ and $\Omega_1$, respectively. From Eq. (83) it follows in the usual way that $D$ transforms covariantly with respect to the U(1) gauge group, i.e.,

$$D[\phi(x)] \to D[\phi'(x)] = U^{-1}(x)\,D[\phi(x)]\,U(x), \tag{86}$$

where

$$\phi'(x) = e^{i\lambda(x)}\phi(x). \tag{87}$$

This is consistent with the fact that $D$ acts on the gauge group by commutation,

Note that, although the component $v(x)$ of the connection does not transform covariantly with respect to U(1), the field $\phi(x)$ does. Since $\phi(x)$ is also a space-time scalar, it can therefore be identified as a Higgs field. As we shall see, the fact that $\phi(x)$ rather than $v(x)$ is identified as the Higgs field is of great importance for the Higgs potential.

Having defined the covariant derivative, we can proceed to construct the curvature. In an obvious notation this can be written as

where $F_{\mu\nu}$ is the conventional curvature and

$$F_{d\mu} = \partial_\mu V - d \wedge A_\mu + [A_\mu, V]$$

where $D_\mu$ is the conventional covariant derivative. The interesting component is $F_{dd}$, which turns out to be

$$F_{dd} = d \wedge V + eV^2. \tag{90}$$

The explicit form of Eq. (90) is easily computed to be

$$F_{dd} = \bigl(k(v + v^*) + e\,v v^*\bigr)I = e\,(|\phi|^2 - c^2)\,I. \tag{91}$$

Since it is $\phi(x)$ that must be identified as a Higgs field, the relationship between Eq. (91) and the standard U(1) Higgs potential is obvious.

Before applying the above formalism to physics, however, we have to introduce fermion fields $\Psi(x)$. These are taken to be column vectors of ordinary fermions

respectively. As might be expected from the fact that the fermions are U(1) covariant, it is the U(1)-covariant field $\phi(x)$, and not the component $v(x)$ of the connection, that couples to them in Eq. (94).

2. Application to the standard model

As has already been mentioned, the immediate physical interest of the noncommutative gauge theory lies in its application to the standard model of the fundamental interactions. The new feature is that it produces the Higgs field and its potential as natural consequences of gauge theory, in contrast to ordinary field theory in which they are introduced in an ad hoc phenomenological manner. The mechanism by which they are produced is very like that used in Kaluza-Klein reduction so, to put the noncommutative mechanism into perspective, let us first digress a little to recall the usual Kaluza-Klein mechanism.

a. The Kaluza-Klein mechanism

Consider the gauge-fermion Lagrangian density in 4 + n dimensions, namely,

where $A, B = 1 \ldots 4+n$. If we let $\mu, \nu = 0 \ldots 3$ and $r, s = 4 \ldots n$ and assume that the fields do not depend on the coordinates $x^r$, the Dirac operator and the curvature decompose into

$$F_{AB} = \cdots$$

and

$$\gamma^A D_A = \gamma^\mu D_\mu + \gamma^r A_r, \tag{94}$$

respectively, and hence the Lagrangian (95) decomposes into

$$L = -\frac{1}{4}\,\mathrm{Tr}(F_{\mu\nu})^2 - \frac{1}{2}(D_\mu A_r)^2 + \bar{\psi}\gamma^\mu D_\mu\psi + \cdots \tag{97}$$
The extra components $A_r$ of the gauge potential are space-time scalars and may therefore be identified as Higgs fields. Thus the dimensional reduction produces a standard kinetic term, a standard Yukawa term, and a potential for the Higgs fields. The problem is that the Higgs potential is not the one required for the standard model. In particular, its minimum does not force $|A_r|$ to assume the fixed nonzero value that is necessary to produce the masses of the gauge fields and leptons.

where

Since the field $\phi(x)$ is a scalar that transforms covariantly with respect to the U(1) gauge group it may be interpreted as a Higgs field. Hence, in analogy with the Kaluza-Klein mechanism, the noncommutative mechanism produces a standard kinetic term, a standard Yukawa term, and a potential for the Higgs field. The difference lies in the form of the potential, which is no longer the square of a commutator. From Eq. (91) we have

But this is just the renormalizable potential that is used to produce the spontaneous breakdown of U(1) invariance. Putting all the new contributions together, we see that the introduction of the discrete dimension and its associated gauge potential $\phi(x)$ produces exactly the extra terms

ACKNOWLEDGMENTS

We are indebted to C. N. Yang for important remarks which improved the paper. A number of people gave us positive reactions and welcome comments. We thank, in particular, D. Giulini, F. Hehl, U. Lindström, T. Schücker, and D. Vassilevich. Special thanks go to T. Damour for some pertinent remarks which led to several improvements of the printed version.

REFERENCES

Appelquist, T., A. Chodos, and P. G. O. Freund, 1987, Modern Kaluza-Klein Theories (Addison-Wesley, London).
Audretsch, J., F. Gähler, and N. Straumann, 1984, Commun. Math. Phys. 95, 41.
Bergmann, V., 1932, Sitzungsber. K.Preuss. Akad. Wiss., Phys. Nordstrom, G., 1913b, Ann. Phys. (Lcipzig) 42,533.
Math. K1. 346. Nordstrom, C., 1914, Phys. Z . 15,504.
Bergmann, P. G., 1942, An Iritroduction to the Theory orKela- OKaifearLaigh L., 1997, The Dawning of Guuge T h o ? (Prin-
tiuiry (Prenlice-Hall, New York), Chaps. XVll and XVIII. ceton University, Princeton, NJ).
Bergmann, P. G., 1968, Int. J. l h e o r . Phys. 1,25. Pais, A., 1953, Conference in Honour of H . A . Larentz, Leiden
Bleecker, D., 1981, Gauge Theory und Vurialional Principles 1953, Physica (Amsterdam) 19, 869.
(Addison-Wesley, London). Pais, A.? 1982, Subtle is fhe L o r d The Science and Life of Al-
Brans, C. H., and R. H. Dicke. 1961, Phys. Rev. 124,925. bert Einrtein (Oxford University, New Yorkj.
Cartan, E., 1928, LeGons sur la Geornitrie des Espaces de Rie- Pauli, W., 1919, Phys. Z. 20, 457.
mann, 2nd ed. (Gauthier-Villars. Paris). Pauli, W., 1921, Relativitatstheorie, Encyklopiidie der Math-
Case, C. M., 1957, Phys. Rev. 107,307. emrischen Wissenschafien (Leipzig, Teubner), Vol. 5.3. p.
Chamseddine, A. H., G . Felder, and J. Frohlich, 1993, Nucl.
539.
Phys. B 395,672.
Pauli, W.. 1Y33, Ann. Phys. (Leipzig) 18, 305.
Chandrasekharan, K., 1986, Ed., llennann Weyl, 1885-1.985
Pauli, W,, 1939, Helv. Phys. Acta 12, 147.
(Springer, New York).
Cnnncs, A., 1994, Noncommuturive Geometry (Academic, Ncw P a d , W.. 1958, Theory of Relativity (Pergamon, New York).
Yorkj. Pauli, W., 1979, Wissenschafilicher Briefwechsel, Vol. k 1919-
Darnour, T., and A. M. Polyakov, 1994: Nucl. Phys. B 423.532. 1929 (Springer, Berlin), p. 505. (Translation of the letter by L.
Dicke, R. H., 1962, Phys. Rev. 125, 2163. ORaifeartaighj.
Einstein, A., 1987, The Collected Papers of Albert Einstein, Pauli, W., 1999, Wissenschafilicher Briefwechsel, Vol. IV, Part
edited by J. Stachel, D. C. Cassidy, and R. Schulmann (Prin- I1 (Springer, Berlin), Letters 1614 and 1682.
ceton University, Princeton, NJ). Polchinski, J., 1998, String Theory, Vols. I, 11, Cambridge
Feynman, R. P., 1995, Feynmn Lectures on Gravitation, ed- Monographs on Mathematical Physics (Cambridge Univer-
ited by B. Hatfield (Addison-Wesley). sity, Cambridge, England).
Fierz, M., 1956, Helv. Phys. Acta 29, 128. Raman, V.,and P. Forman, 1969, Hist. Stud. Phys. Sci. 1, 291.
Fierz, M., 1999, private communication. Schrodinger, E., 1922, Z . Phys. 12, 13.
Fock, V., 1921, %. Phys. 39. 226. Schrodinger, E., 1932, Sitzungsber. K. Preuss. Akad. Wiss.,
Fock, V., 1929, Z. Phys. 57. 261. Phys. Math. K1. 105.
Green, M., J. Schwarz, and E. Witten, 1987, Theory of Strings Schrodinger, E., 1987, Schrodinger: Cenrenary Celebration of a
and Super.vtrings (Cambridge University, Cambridge, En- Polymulh, cdited by C . Kilmister (Camhridgc University,
gland). New YorWCamhridgc, England).
Gross, D., 1995, in The Oskar KIeiri Centenary Symposium, Schucker, T., l Y Y 7 , Geometry and Forces, in Proceedings of
cditcd by U. Lindstrom (World Scientific, Singapore), p. 94. (Re 1997 E M S Summer School on Noncommuruiivr Geometry
Hoffmann, B., 1933, Phys. Rev. 37,88. and Applications, Monsaraz and Lisbon, edited by P.
Infeld, L., and B. L. van der Waerden, 1932, Sitzungsber. K. Almeida, to appear (hep-tM9712095)
Preuss. Akad. Wiss., Phys. Math. K1. 380 and 474. Seelig, C., 1960, Albert Einstein (Europa, Zurich), p. 274.
Jordan, P., 1949, Nature (London) 164,637. Steinhardt, P. J., 1993, Class. Quantum Grav. 10, 33.
Jordan, P., 1955, Schwerkraft und Weltall. 2nd ed. (Vieweg, Straumann, N., 1984, General Relativity and Relativistic Astro-
Braunschweig). physics- Texts and Monographs in Physics (Springer, Berlin).
Kaluza, Th., 1921, Sitzungsber. K. Freuss. Akad. Wiss., Phys. Straumann. N., 1987. Phys. BI. (Germany) 43 (11). 414.
Math. KI., 966; lor an English translation see ORaifeartaigh, Tetrodc, H., 1928. Z.Phys. 50,336.
1997. Thiry, Y.R.:1948, C. R. Acad. Sci. 226, p. 216, 1881.
Klein, O., 1926~1,Z. Phys. 37, 895; for 3n English translation Thiry, Y. R.,1951, These (Univcrsiti de Psris).
scc OKaifcartaigh, 1997. Vcblen, 0.. 1933, Projektive Relativicatstheorie (Springer, Bcr-
Klein, O., 1926b, Naturc (London) 118, 516. lin).
Klein, O., 1938, 1938 Conference on. New 7heories in Physics, Wald, R. M., 1986. Phys. Rev. D 33, 3613.
held in Kasimierz, Poland; reproduced in ORaifeartaigh, Weinberg, S., 1965, Phys. Rev. 138, A988.
1991. Weyl, H., 1318, Gravitation und Elektrizitat? Sitzungsber.
Klein, O., 1956, in Proceedings of rhe Berne Congress, Helv. Deutsch. Akad. Wiss. Berlin, Klossefu ... pp. 465-480. See
Phys. Acta, Suppl. IV, 58. also H. Weyl, 1968, Gesammelfen Abhandlungen, edited by
Kubyshin, Y. A,, J. M. Mourao, G . Rudolph. and 1. P. Vo- K . Chadrasekharan (Springer, Berlin). An English translation
lobujev, 1989, Dimensional Reduction of Gauge Theories, is given in ORaifeartaigh, 1997.
Spontaneous Compactifcation and Model Building, Springer Weyl. H., 1922, Space, Time, Matter (Methuen, London, and
Lecture Notes in Physics No. 349 (Springer, New Yorkj. Dover, Yew York). Translated from the 4th German Edition.
Lichnerowicz, A,, 1955, Thkvies Rehrivisre de la Gravitation et [Raum: Zeit: Materie, 8. Aufage (Springer, Berlin, 1993)l.
de IE/ectroniagnitis,ne (Masson. Paris), Chap. 4. Weyl, H., 1929, Elektron und Gravitation. I Z . Phys. 56,330.
London, t:., 1927, %. Phys. 42,375. Wcyl, H., 1946, Memorabilia, in IIermann Weyl. edited by
Ludwig, C., 1951, Foorlschritlc der Prujolrtiven Relativiththeo- K. Chandrasekharan (Springer, Ncw York), p, 85.
rie. (Vieweg, Branunschweig). Wcyl, H., 1956, Selerfa (Birkhiuser, Boston).
Nordstrom, G., 1912, Phys. Z. W, 1126. Weyl- H.: 1968, Cesanimelte Ahhandlungen, edited by K .
Nordstrom, G., 1913a, Ann. Phys. (Leipzig) 40, 856. Chandrasekharan (Springer, Berlin), Vol. 111, p . 229.
Weyl, H., 1980, in Hermann Weyl, edited by K. Chandrasekha- in Hermann Weyl, 1885-1985, edited by K. Chandrasekharan
ran (Springer, Berlin), p. 85. (Springer, New York), p. 7.
Weyl, H., 1981, Gruppentheorie und Quantenmechanik (Wis- Yang, c. N., 1983, Selected Papers 1945-1980 with Commen-
senschaftliche Buchgesellschaft, Darmstadt), Nachdruck der tary (Freeman, Francisco), p. 525.
Yau, S. T., 1985, Compact three-dimensional Kahler Mani-
2 Aufl, Leipzig 1931. translation: Group Theory folds with zero Ricci curvature, in Symposium on Anomalies,
and Quantum Mechanics, Dover, New York (1950)l. Geometry, and Topology, edited by W. Bardeen and A.
Wigner, E., 1929, 2. Phys. 53, 592. White (World Scientific, Singapore), p. 395. See also refer-
Yang, C. N., 1980, Hermann Weyls Contribution to Physics, ences therein.
Pei-Ming Ho
pmho@phys.ntu.edu.tw
This is a brief review of the relation between string theory and usual gauge theories including
Einstein's gravity and Yang-Mills theory. In particular, we would like to explain how string theory
extends or generalizes ideas behind gauge symmetry and Einstein's general relativity. Most of the
material concerning this article can be found in [1, 2], if no further reference is provided.
1 There is gravity
Let us start with some basic facts about string theory. The first and foremost thing to say is that string theory includes gravity. More importantly, string theory contains quantum gravity, and is the only theory of quantum gravity which admits an ultraviolet-finite and unitary perturbation theory. On the other hand, there is no consistent perturbation theory for the canonical quantization of Einstein's theory of pure gravity.
Contrary to particles, for which it is extremely hard to find a well-behaved (causal, unitary, renormalizable) interaction with the gravitons, strings are almost always accompanied by gravity. As each excited state of a string can be reinterpreted as a particle, the massless spin-two excitation of a closed string is identified with the graviton. This excitation exists in all five superstring theories (type I, type IIA, type IIB, heterotic SO(32) and heterotic $E_8 \times E_8$) as well as the bosonic string theory.
A string can have (infinitely) many vibration modes. The graviton is a massless spin-2 oscillation mode of a closed string. Each oscillation mode should be matched with a certain particle in spacetime. Each particle is associated with a vertex operator defined in the 1+1 dimensional quantum field theory living on the string worldsheet. Interactions among particles are determined in string theory by correlation functions of vertex operators. A Feynman diagram for a spacetime particle scattering process corresponds to a path integral of the worldsheet theory. Unlike particle theory, where interaction vertices and propagators are built in a Lagrangian, all ingredients of Feynman diagrams are fixed in string theory by the definition of a free string.
There are two complementary approaches to see how gravitational interaction is determined in string theory. The first approach is to compute the Feynman diagram of a scattering process involving gravitons, in the flat, trivial background. From the result one can construct order by order a field theory involving the metric to reproduce the scattering amplitudes.
The second approach is to consider a string propagating in a perturbative deformation of the flat background. The spacetime metric appears in the kinetic term of the string worldsheet action. The requirement of (quantum) conformal invariance then imposes a strong constraint on the metric. The constraint involves the metric and its derivatives, and can be viewed as the equation of motion of the metric. In this approach, the equation of motion for the spacetime metric is in fact a self-consistency condition. This is a remarkable feature of string theory which is not unique to gravity. Dynamics and kinematical constraints are unified in string theory.
Of course, the results of the two approaches agree with each other. They also agree with the Einstein equation at low energies for weak coupling (Newton) constant. At high energies (compared with the energy scale of the string tension), string theory modifies Einstein's theory, but with general covariance intact.
$\Psi * (Q\Psi) + \frac{1}{3}\,\Psi * \Psi * \Psi$
where $\Psi[X(\sigma)]$ is the string wave function (a state of the string worldsheet theory in the formulation
using BRST quantization) and $Q$ is the BRST charge for conformal symmetry. The product labelled
by $*$ defines an associative algebra for the string states, which essentially tells us how to glue two
strings together into one string. This action has the gauge symmetry
We see from this formula that the string wave function is itself a gauge field for a huge gauge symmetry.
The string wave function $\Psi$ can be represented as a generic state in the Hilbert space of the string
worldsheet theory in BRST quantization. In terms of an expansion of creation operators,
$\Psi = \int d^dk\,\bigl[\phi(k) + A_\mu(k)\,\alpha^\mu_{-1} + \alpha(k)\,b_{-1}c_0 + iB_\mu(k)\,\alpha^\mu_{-2} + B_{\mu\nu}(k)\,\alpha^\mu_{-1}\alpha^\nu_{-1} + \cdots\bigr]$,
where $\alpha^\mu_{-n}$ are the bosonic operators representing fluctuations of the spacetime coordinate $X^\mu$, and $c_n$,
$b_n$ belong to the ghost sector. The coefficients $\phi(k)$, $A_\mu(k)$, $\alpha(k)$, $B_{\mu\nu}(k)$, etc. are Fourier transforms of
spacetime fields. Their gauge transformations are given by
1
sp, = -(a2-2);,
2
1
bpo = -(LJ2-2)1,
2
Obviously, $A_\mu$ is the massless spin-1 gauge field mentioned above. $B_{\mu\nu}$ is a symmetric rank-2 gauge
potential. At higher mass levels one finds gauge fields at higher ranks. Except $A_\mu$, the other gauge
fields are massive, signaling a spontaneous symmetry breakdown.
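The remark that only the spin-1 field is massless can be illustrated with the standard mass-shell condition of the open bosonic string, $\alpha' M^2 = N - 1$, where $N$ is the oscillator level. This formula is not quoted in the text and is brought in here as an assumption, as is the convention $\alpha' = 1/2$; the function names are hypothetical. A minimal sketch:

```python
from fractions import Fraction

# Assumed convention (not fixed by the text): alpha' = 1/2.
ALPHA_PRIME = Fraction(1, 2)


def open_string_mass_squared(level: int) -> Fraction:
    """Mass squared at oscillator level N, from alpha' * M^2 = N - 1."""
    if level < 0:
        raise ValueError("oscillator level must be non-negative")
    return Fraction(level - 1) / ALPHA_PRIME


def describe(level: int) -> str:
    """Classify a level as tachyonic, massless, or massive."""
    m2 = open_string_mass_squared(level)
    if m2 < 0:
        return "tachyonic"
    return "massless" if m2 == 0 else "massive"


for n in range(4):
    print(n, open_string_mass_squared(n), describe(n))
```

Under these assumptions level 1 carries the massless vector, while the rank-2 fields sit at level 2 and are massive.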
The physical meaning of the huge gauge symmetry of string theory has not yet been thoroughly explored. People also suspect that there is another huge hidden (global) symmetry which is spontaneously broken, and that one should study the high energy limit where the symmetry may be restored [6]. The 2 dimensional string theory is the best understood toy model of strings. It has the $w_\infty$ symmetry and the symmetry algebra is strong enough to determine all scattering amplitudes. It has been conjectured by Gross and his collaborators [6] that the full symmetry may uniquely determine the dynamics for higher dimensional strings as well. It is possible that the huge global symmetry has the same origin as the huge gauge symmetry of string theory, or that its existence is a necessity for self-consistency due to the higher spin gauge symmetries [7].
While the general covariance of general relativity is embedded in a bigger gauge symmetry, we do not seem to fully comprehend the physical notion which generalizes the equivalence principle. What are the gedanken experiments analogous to the elevator in free fall?
3 Geometry is induced
Before the advent of string theory, spacetime was put in by hand as a stage in which particles interact and events take place. In string theory, spacetime can be a derived notion. From the viewpoint of the string worldsheet action, the spacetime coordinates of the string are scalar fields on the worldsheet. Properties of spacetime are determined by properties of these scalar fields. Of course one can also view the spacetime coordinates of a point particle as scalar fields on the worldline. The crucial difference is that while string worldsheet theory determines the dynamics of spacetime geometry, the particle worldline theory does not.
As spacetime is a derived notion, the geometrical structure of spacetime is something that should be extracted from the theory. We have already mentioned above how the dynamics of the spacetime metric is determined in string theory, yet Riemannian geometry is not the only geometrical structure the spacetime can possess.
Noncommutative geometry is a mathematical notion that generalizes classical geometry [8]. A classical manifold is commutative, that is, the algebra of functions on the manifold is a commutative algebra. Mathematicians noticed that, although traditionally one visualizes a manifold as a collection of points (with some topology), the algebra of functions provides an alternative equivalent description to some extent. Hence it is natural to relax the definition of geometry to allow the algebra of functions to be noncommutative. Such a space no longer admits the picture of a set of points, but various geometric structures and quantities can be defined.
In a field theory, the base space can be made noncommutative by imposing nontrivial commutation
relations for the coordinates. In string theory, the commutativity of spacetime coordinates depends
on the quantization of the scalar fields on the worldsheet. In suitable backgrounds, noncommutative
geometry arises automatically.
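One common way to realize a commutation relation such as $[x, y] = i\theta$ on ordinary functions is the Moyal star product. The text does not spell out a particular realization, so the choice of the Moyal product, the constant parameter `theta`, and all function names below are assumptions made for illustration. The sketch represents a polynomial in $x$ and $y$ as a dictionary from exponent pairs to coefficients and checks that $x \star y - y \star x = i\theta$:

```python
from math import comb, factorial


def derivative(poly, var, order):
    """Differentiate `order` times in x (var=0) or y (var=1)."""
    result = dict(poly)
    for _ in range(order):
        nxt = {}
        for powers, coeff in result.items():
            p = powers[var]
            if p > 0:
                new = list(powers)
                new[var] -= 1
                key = tuple(new)
                nxt[key] = nxt.get(key, 0) + coeff * p
        result = nxt
    return result


def multiply(f, g):
    """Ordinary commutative product of two polynomials."""
    out = {}
    for (a, b), cf in f.items():
        for (c, e), cg in g.items():
            key = (a + c, b + e)
            out[key] = out.get(key, 0) + cf * cg
    return out


def add(f, g, scale=1):
    """Return f + scale * g, dropping vanishing coefficients."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + scale * c
    return {k: v for k, v in out.items() if v != 0}


def star(f, g, theta, max_order=8):
    """Moyal product f * g, summed up to `max_order` in theta.

    Term n: (i*theta/2)^n / n! * sum_k (-1)^k C(n,k)
            (dx^(n-k) dy^k f) (dx^k dy^(n-k) g).
    The series terminates for polynomials of low enough degree.
    """
    total = {}
    for n in range(max_order + 1):
        prefactor = (1j * theta / 2) ** n / factorial(n)
        for k in range(n + 1):
            left = derivative(derivative(f, 0, n - k), 1, k)
            right = derivative(derivative(g, 0, k), 1, n - k)
            sign = (-1) ** k * comb(n, k)
            total = add(total, multiply(left, right), prefactor * sign)
    return total


x = {(1, 0): 1}
y = {(0, 1): 1}
theta = 2.0
commutator = add(star(x, y, theta), star(y, x, theta), -1)
print(commutator)
```

Setting `theta = 0` reduces `star` to the ordinary product, which is the commutative limit described in the text.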
In string theory, there are solitonic objects called D-branes. They are submanifolds of spacetime on which open strings end. Turning on a background field called the NS-NS B-field, one can quantize the open string and find that the coordinates of the endpoints are noncommutative [9], with the noncommutativity depending on the B-field background. The D-brane field theory is then conveniently defined as a noncommutative field theory [10].
Another way to see the noncommutativity of a D-brane is to consider scattering amplitudes of open strings [12]. In other words, you probe the geometric property of the D-brane worldvolume using open strings. Due to the B-field background, the interactions at low energies are most conveniently described by a field theory living on a noncommutative space. With other background fields (graviphoton) turned on, the spacetime itself can also be noncommutative [13].
In addition to noncommutative geometry, there might be more exotic geometrical notions which we can learn from string theory. In general, one probes the spacetime via strings. The geometry of spacetime is an induced notion derived from decoding string interactions in a certain way. It is conceivable that there can be ambiguity in the decoding process, corresponding to the freedom of making a change of variables (field redefinition). For example, a noncommutative field theory can be reinterpreted as a commutative field theory with higher derivative interactions [11]. We will make more comments below on other ambiguities in defining spacetime properties.
Matrix models provide another way to demonstrate the idea. The BFSS model [14] and the IKKT model [15] are conjectured to be equivalent descriptions of string theory. Spacetime coordinates correspond to $N \times N$ matrices with $N \to \infty$. As matrices the coordinates are generically noncommutative. Interactions in the models would make it impossible to have large extended dimensions in spacetime if there were no supersymmetry to guarantee flat directions in the moduli space. The large scale, (roughly) commutative spacetime around us is not given to us without warrant. The choice of the effective spacetime dimension by our universe can be translated into questions about the free energy of the theory [16].
T-duality (including mirror symmetry) is another example of the ambiguity in extracting the spacetime structure from string theory. The simplest setting of T-duality is having one spatial direction compactified on a circle. Let the radius be denoted $R$. It turns out that this theory is equivalent to another string theory with a dual spatial direction compactified on a circle of radius $1/R$ in string units. The two theories dual to each other are different descriptions of the same physical system. The same physical state can have totally different descriptions in the two theories. For instance, a string winding on a circle can be matched with a Kaluza-Klein mode (momentum eigenstate) in the dual theory. But a state is always matched with another state with the same energy. Physicists trying to describe a physical process can adopt either one of the two theories and its language. They may sound very different in words, but the two theories agree when it comes to the prediction of the result of a measurement in terms of pure numbers.
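The matching of winding and momentum states can be checked on the standard mass formula for a closed bosonic string on a circle, $M^2 = n^2/R^2 + w^2R^2 + 2(N + \tilde N - 2)$ in units $\alpha' = 1$. The formula is not written out in the text and is used here as an assumption; the level-matching constraint is not enforced, and the function names are hypothetical. A minimal sketch:

```python
from fractions import Fraction


def mass_squared(n, w, n_left, n_right, radius):
    """M^2 for momentum number n, winding number w, oscillator levels
    n_left and n_right, on a circle of the given radius (alpha' = 1)."""
    momentum = (Fraction(n) / radius) ** 2
    winding = (Fraction(w) * radius) ** 2
    return momentum + winding + 2 * (n_left + n_right - 2)


def t_dual(n, w, radius):
    """Exchange momentum and winding and invert the radius."""
    return w, n, 1 / radius


radius = Fraction(3, 2)
for n, w in [(1, 0), (0, 1), (2, 1), (3, -2)]:
    dual_n, dual_w, dual_radius = t_dual(n, w, radius)
    original = mass_squared(n, w, 1, 1, radius)
    dual = mass_squared(dual_n, dual_w, 1, 1, dual_radius)
    print((n, w), original, dual, original == dual)
```

Exact rational arithmetic is used so that the equality of the two spectra is tested without rounding error.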
In general, shapes and topology of spacetime can be different between equivalent theories. Even the spacetime dimension is not independent of the choice of description/theory. Superstrings in 10 dimensions are believed to be equivalent to M-theory in 11 dimensions. Physics is not only observer-dependent, as it already is in Einstein's theory, but is now also theory-dependent. In fact, all five superstring theories are believed to be dual to one another, and also to (a quantum version of) the 11 dimensional theory of supergravity.
4 Summary
To summarize, gauge symmetry is generalized to include gauge fields of arbitrary spins in string theory. Einstein's theory of spacetime is also extended to more general geometric structures. The notion that physics is observer-dependent, which we first learned from the theory of relativity, is magnified to the notion that physics is theory-dependent. It is also believed that Yang's inspiring saying that "Symmetry dictates interaction" is fully honoured by string theory.
An obvious difference between string theory and general relativity or Yang-Mills theory is that the discovery of the latter was motivated by beautiful theoretical notions on symmetry and geometry, but the discovery of string theory was an accident. What is the symmetry principle of string theory? This is one of the most important problems of string theory.
We have so far avoided discussions on holography and the cosmological constant in this article. The holographic principle is argued to be a salient feature of quantum gravity [17]. In string theory we have seen remarkable evidence of it [18]. It is generally viewed to be of utmost importance for quantum gravity, but so far our understanding of it remains at a technical level. On the other hand, our understanding of the cosmological constant problem is not even at the technical level. It might be that the secrets of holography and the cosmological constant are deeply hidden inside string theory waiting to be discovered, and their revelation will bring us a drastic conceptual breakthrough so that string theory will no longer be called string theory.
Acknowledgment
This work is supported in part by the National Science Council and National Center for Theoretical
Sciences, Taiwan, R.O.C. and the Center for Theoretical Physics at National Taiwan University.
References
[1] M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory. Vol. 1: Introduction, Cambridge University Press; M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory. Vol. 2: Loop
[2] J. Polchinski, String theory. Vol. 1: An introduction to the bosonic string, Cambridge University Press; J. Polchinski, String theory. Vol. 2: Superstring theory and beyond, Cambridge University Press.
[3] See, for instance, J. Isberg, U. Lindstrom, B. Sundborg and G. Theodoridis, Classical and quantized tensionless strings, Nucl. Phys. B 411, 122 (1994) [arXiv:hep-th/9307108]; B. Sundborg, Stringy gravity, interacting tensionless strings and massless higher spins, Nucl. Phys. Proc. Suppl. 102, 113 (2001) [arXiv:hep-th/0103247]; E. Sezgin and P. Sundell, Massless higher spins and holography, Nucl. Phys. B 644, 303 (2002) [Erratum-ibid. B 660, 403 (2003)] [arXiv:hep-th/0205131]; C. S. Chu, P. M. Ho and F. L. Lin, Cubic string field theory in pp-wave background and background independent Moyal structure, JHEP 0209, 003 (2002) [arXiv:hep-th/0205218].
[4] See, for instance, C. Fronsdal, Massless Fields With Integer Spin, Phys. Rev. D 18, 3624 (1978); J. Fang and C. Fronsdal, Massless Fields With Half Integral Spin, Phys. Rev. D 18, 3630 (1978); M. A. Vasiliev, Progress in higher spin gauge theories, Prepared for 9th Marcel Gravitation and Relativistic Field Theories (MG 9), Rome, Italy, 2-9 Jul 2000; M. A. Vasiliev, Higher spin gauge theories in various dimensions, Fortsch. Phys. 52, 702 (2004) [arXiv:hep-th/0401177].
[5] E. Witten, Noncommutative Geometry And String Field Theory, Nucl. Phys. B 268, 253 (1986).
[6] D. J. Gross and P. Mende, Phys. Lett. B 197, 129 (1987); Nucl. Phys. B 303, 407 (1988); D. J. Gross, High energy symmetry of string theory, Phys. Rev. Lett. 60, 1229 (1988); Phil. Trans. R. Soc. Lond. A 329, 401 (1989); D. J. Gross and J. L. Manes, The high energy behavior of open string
[7] C. T. Chan, P. M. Ho and J. C. Lee, Ward identities and high-energy scattering amplitudes in
[9] C. S. Chu and P. M. Ho, Noncommutative open string and D-brane, Nucl. Phys. B 550, 151 (1999) [arXiv:hep-th/9812219].
[10] A. Connes, M. R. Douglas and A. Schwarz, Noncommutative geometry and matrix theory:
[11] N. Seiberg and E. Witten, String theory and noncommutative geometry, JHEP 9909, 032 (1999) [arXiv:hep-th/9908142].
[12] V. Schomerus, D-branes and deformation quantization, JHEP 9906, 030 (1999) [arXiv:hep-th/9903205].
[13] H. Ooguri and C. Vafa, The C-deformation of gluino and non-planar diagrams, Adv. Theor.
N = 1/2 supersymmetry, field theory and string theory, JHEP 0306, 010 (2003) [arXiv:hep-th/0305248].
[14] T. Banks, W. Fischler, S. H. Shenker and L. Susskind, M theory as a matrix model: A conjec-
[15] N. Ishibashi, H. Kawai, Y. Kitazawa and A. Tsuchiya, A large-N reduced model as superstring,
[16] H. Aoki, S. Iso, H. Kawai, Y. Kitazawa and T. Tada, Space-time structures from IIB matrix
[18] J. M. Maldacena, The large N limit of superconformal field theories and supergravity, Adv. Theor. Math. Phys. 2, 231 (1998) [Int. J. Theor. Phys. 38, 1113 (1999)] [arXiv:hep-th/9711200]; O. Aharony, S. S. Gubser, J. M. Maldacena, H. Ooguri and Y. Oz, Large N field theories, string theory and gravity, Phys. Rept. 323, 183 (2000) [arXiv:hep-th/9905111].
$-H^{-2}\tanh^2 Hr\,(d\theta^2 + \sin^2\theta\,d\varphi^2)]$,
Reviews of Modern Physics, Vol. 61, No. 1, January 1989 Copyright ©1988 The American Physical Society
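panding one. Already in 1922, Friedmann (1924) had described a class of cosmological models, with line element (in modern notation)

$d\tau^2 = dt^2 - R^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\,(d\theta^2 + \sin^2\theta\,d\varphi^2)\right]$

as

$\left[\frac{1}{R}\frac{dR}{dt}\right]_{\rm now} = H_0 = 50$-$100$ km/sec Mpc $= (\tfrac{1}{2}$-$1) \times 10^{-10}$/yr.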
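The unit conversion quoted for $H_0$ can be checked directly. The sketch below assumes rounded values for the megaparsec and the year; the constant and function names are hypothetical.

```python
# Rounded conversion factors (assumed values, not taken from the text).
KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.156e7   # seconds in one year


def hubble_per_year(h0_km_s_mpc: float) -> float:
    """Convert a Hubble constant from km/sec Mpc to inverse years."""
    per_second = h0_km_s_mpc / KM_PER_MPC
    return per_second * SECONDS_PER_YEAR


for h0 in (50.0, 100.0):
    print(h0, hubble_per_year(h0))
```

With these factors the range 50-100 km/sec Mpc corresponds to roughly one half to one times $10^{-10}$ per year, in line with the value quoted above.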
If we believe general relativity up to the Planck scale, of quantum fluctuations. As we have seen, the zero-point
then we might take A - ( ~ T G ) - ~ ,which would give energies themselves gave far too large a value for ( p ) , so
Zeldovich assumed that these were canceled by h / 8 a G ,
(p)=2-0a-4G--2=2X 10 GeV4. (3.6) leaving only higher-order effects: in particular, the gravi-
But we saw that / ( p ) + h / 8 ~ G ( is less than about tational force between the particles in the vacuum fluc-
GeV4, so the two terms here must cancel to better tuations. (In Feynman diagram terms, this corresponds
than 118 decimal places. Even if we only worry about to throwing away the one-loop vacuum graphs, but keep-
zero-point energies in quantum chromodynamics, we ing those with two loops.) Taking A3 particles of energy
would expect ( p ) to be of order A&-J16rr2, or A per unit volume gives the gravitational self-energy den-
GeV, requiring h/8aG to cancel this term to about 41 sity of order
decimal places. ( p ) .s(GAZ/A-)A=GA6 . (3.7)
Perhaps surprisingly, it was a long time before particle
physicists began seriously to worry about this problem, For no clear reason, Zeldovich took the cutoff A as 1
despite the demonstration in the Casimir effect of the GeV, which yields a density ( p) =IO - GeV4, much
reality of zero-point et~ergies.~Since the cosmological smaller than from zero-point energies themselves, but
+
upper bound on I ( p ) h/87rG 1 was vastly less than any still larger than the observational bound (3.4) on
value expected from particle theory, most particle theor- I(p)+h/8aGI by some 9 orders of magnitude. Neither
ists simply assumed that for some unknown reason this Zeldovich nor anyone else felt encouraged to pursue
quantity was zero. But cosmologists generally continued these ideas.
to keep an open mind, analyzing cosmological data in The real beginning of serious worry about the vacuum
terms of models with a possibly nonvanishing cosmologi- energy seems to date from the success of the idea of spon-
cal constant. taneous symmetry breaking in the electroweak theory.
In fact, as far as I know, the first published discussion In this theory, the scalar field potential takes the form
of the contribution of quantum fluctuations to the (withpZ>O,g>O)
effective cosmological constant was triggered by astro-
nomical observations. In the late 1960s it seemed that an y = y 0 -P2dtd+g(d+d)** (3.8)
excessively large number of quasars were being observed At its minimum this takes the value
with redshifts clustered about z = 1.95. Since l+z is the
ratio of the cosmic scale factor R ( t ) at present to its (3.9)
value at the time the light now observed was emitted, this
could be explained if the universe loitered for a while at a Apparently some theorists felt that V should vanish a t
value of R ( 1 ) equal to 1/2.95 times the present value. A d=O, which would give Vo=O, so that ( p ) would be
number of authors [Petrosian, Salpeter, and Szekeres negative definite! In the electroweak theory this would
(1967); Shklovsky (1967); Rowan-Robinson (196811 pro- give (p)---g(300 GeV)4, which even for g as small as
posed that such a loitering could be accounted for in a a* would yield I ( p ) l =lo6 GeV4, larger than the bound
model proposed by Lemake (1927, 1931). In this model on pv by a factor los3. Of course we know of no reason
there is a positive cosmological constant he, and positive why Vo or h must vanish, and it is entirely possible that
+
curvature k = 1, just as in the static Einstein model, Vo or h cancels the term -p4/4g (and higher-order
while the mass of the universe is taken close to the Ein- corrections), but this example shows vividly how un-
stein value (2.4). The scale factor R ( t ) starts at R = O natural it is to get a reasonably small effective cosmologi-
and then increases; however, when the mass density cal constant. Moreover, at early times the effective
drops to near the Einstein value (2.2), the universe temperature-dependent potential has a positive coefficient
behaves for a while like a static Einstein universe, until for 40, so the minimum then is at 0-0, where
the instability of this model takes over and the universe V ( d ) = V o . Thus, in order to get a zero cosmological
starts expanding again. In order for this idea to explain a constant today, we have to put up with an enormous
preponderance of redshifts at z = 1.95, the vacuum ener- cosmological constant at times before the electroweak
gy density pv would have to be (2.95 )3 times the present phase transition. [This is not in conflict with experiment;
nonvacuum mass density p,,. in fact, the hase transition occurs a t a temperature T of
These considerations led Zeldovich (1967) to attempt P
order p / g ,so the black-body radiation present at that
to account for a nonzero vacuum energy density in terms
4Veltman (1975) attributes this view to Linde (1974). himself
(quoted as to be published), and Dreitlein (1974). However,
3Casimir(1948)showed that quantum fluctuations in the space Lindes paper does not seem to me to take this position.
between two flat conducting plates with separation d would pro- Dreitleins paper proposed that Eq. (3.9) could give an accept-
duce a force per unit area equal to fic7r2/240d4,or 1.3OX lo-* ably small value of ( p ) , with p/t/Ti fixed by the Fermi cou-
dyncmz/d4. This was measured by Sparnaay (1957),who found pling constant of weak interactions,if p is very small, of order
a force per area of (1-4)XlO-* dyncm2/d4, when d was MeV. Veltmans paper gives experimental arguments
varied between 2 and 10 pm. against this possibility.
time has an energy density of order $\mu^4/g^2$, larger than the vacuum energy by a factor $1/g$ (Bludman and Ruderman, 1977).] At even earlier times there were other transitions, implying an even larger early value for the effective cosmological constant. This is currently regarded as a good thing; the large early cosmological constant would drive cosmic inflation, solving several of the long-standing problems of cosmological theory (Guth, 1981; Albrecht and Steinhardt, 1982; Linde, 1982). We want to explain why the effective cosmological constant is small now, not why it was always small.

Before closing this section, I want to take up a peculiar aspect of the problem of the cosmological constant. The appearance of an effective cosmological constant makes it impossible to find any solutions of the Einstein field equations in which $g_{\mu\nu}$ is the constant Minkowski term $\eta_{\mu\nu}$. That is, the original symmetry of general covariance, which is always broken by the appearance of any given metric $g_{\mu\nu}$, cannot, without fine-tuning, be broken in such a way as to preserve the subgroup of space-time translations.

This situation is unusual. Usually if a theory is invariant under some group $G$, we would not expect to have to fine-tune the parameters of the theory in order to find vacuum solutions that preserve any given subgroup $H \subset G$. For instance, in the electroweak theory, there is a finite range of parameters in which any number of doublet scalars will get vacuum expectation values that preserve a U(1) subgroup of SU(2)$\times$U(1). So why will this not work for the translational subgroup of the group of general coordinate transformations? Suppose we look for a solution of the field equations that preserves translational invariance. With all fields constant, the field equations for matter and gravity are

$$\frac{\partial L}{\partial \psi_i} = 0 , \tag{3.10}$$

$$\frac{\partial L}{\partial g_{\mu\nu}} = 0 . \tag{3.11}$$

With $N$ $\psi$'s, these are $N+6$ equations for $N+6$ unknowns, so one might expect a solution without fine-tuning. The problem is that when (3.10) is satisfied, the dependence of $L$ on $g_{\mu\nu}$ is too simple to allow a solution of (3.11). There is a GL(4) symmetry that survives as a vestige of general covariance even when we constrain the fields to be constants: under the GL(4) transformation

$$g_{\mu\nu} \rightarrow A^{\rho}{}_{\mu} A^{\sigma}{}_{\nu}\, g_{\rho\sigma} , \tag{3.12}$$

$$\psi_i \rightarrow D_{ij}(A)\,\psi_j ; \tag{3.13}$$

the Lagrangian transforms as a density,

$$L \rightarrow \mathrm{Det}A\; L . \tag{3.14}$$

When Eq. (3.10) is satisfied, this implies that $L$ transforms as in (3.14) under (3.12) alone. This has the unique solution

$$L = c\,(\mathrm{Det}\,g)^{1/2} , \tag{3.15}$$

with $c$ independent of $g_{\mu\nu}$. With this $L$, there are no solutions of Eq. (3.11), unless for some reason the coefficient $c$ vanishes when (3.10) is satisfied.

Now that the problem has been posed, we turn to its possible solution. The next five sections will describe five directions that have been taken in trying to solve the problem of the cosmological constant.

IV. SUPERSYMMETRY, SUPERGRAVITY, SUPERSTRINGS

Shortly after the development of four-dimensional globally supersymmetric field theories, Zumino (1975) pointed out that supersymmetry in these theories would, if unbroken, imply a vanishing vacuum energy. The argument is very simple: the supersymmetry generators $Q_\alpha$ satisfy an anticommutation relation

$$\{Q_\alpha , Q_\beta^\dagger\} = (\sigma_\mu)_{\alpha\beta} P^\mu , \tag{4.1}$$

where $\alpha$ and $\beta$ are two-component spin indices; $\sigma_1$, $\sigma_2$, and $\sigma_3$ are the Pauli matrices; $\sigma_0 = 1$; and $P^\mu$ is the energy-momentum 4-vector operator. If supersymmetry is unbroken, then the vacuum state $|0\rangle$ satisfies

$$Q_\alpha |0\rangle = Q_\alpha^\dagger |0\rangle = 0 , \tag{4.2}$$

and from (4.1) and (4.2) we infer that the vacuum has vanishing energy and momentum

$$\langle 0|P^\mu|0\rangle = 0 .$$

This result can also be obtained by considering the potential $V(\phi,\phi^*)$ for the chiral scalar fields $\phi^i$ of a globally supersymmetric theory:

$$V(\phi,\phi^*) = \sum_i \left|\frac{\partial W}{\partial \phi^i}\right|^2 , \tag{4.3}$$

where $W(\phi)$ is the so-called superpotential. (Gauge degrees of freedom are ignored here, but they would not change the argument.) The condition for unbroken supersymmetry is that $W$ be stationary in $\phi$, which would imply that $V$ take its minimum value,

$$\langle\rho\rangle = V_{\min} = 0 . \tag{4.4}$$

Quantum effects do not change this conclusion, because with boson-fermion symmetry, the fermion loops cancel the boson ones.

The trouble with this result is that supersymmetry is broken in the real world, and in this case either (4.1) or (4.3) shows that the vacuum energy is positive-definite. If this vacuum energy were the sole contribution to the effective cosmological constant, then the effect of supersymmetry would be to convert the problem of the cosmological constant from a crisis into a disaster.

Fortunately this is not the whole story. It is not possible to decide the value of the effective cosmological constant unless we explicitly introduce gravitation into the theory. Any globally supersymmetric theory that involves gravity is inevitably a locally supersymmetric supergravity theory. In such a theory the effective cosmological constant is given by the expectation value of the potential, but the potential is now given by (Cremmer et al., 1978, 1979; Barbieri et al., 1982; Witten and Bagger, 1982)

$$V(\phi,\phi^*) = \exp(8\pi G K)\left[ D_i W\,(\mathcal{G}^{-1})^i{}_j\,(D_j W)^* - 24\pi G\,|W|^2 \right] , \tag{4.5}$$

where $K(\phi,\phi^*)$ is a real function of both $\phi$ and $\phi^*$ known as the Kähler potential, $D_i W$ is a sort of covariant derivative

$$D_i W \equiv \frac{\partial W}{\partial \phi^i} + 8\pi G\,\frac{\partial K}{\partial \phi^i}\,W , \tag{4.6}$$

and $(\mathcal{G}^{-1})^i{}_j$ is the inverse of a metric

$$\mathcal{G}^i{}_j \equiv \frac{\partial^2 K}{\partial \phi^{i*}\,\partial \phi^{j}} . \tag{4.7}$$

The condition for unbroken supersymmetry is now $D_i W = 0$. This again yields a stationary point of the potential, but now it is one at which $V$ is generally negative. In fact, even if we fine-tuned $W$ so that there were a supersymmetric stationary point at which $W = 0$ and hence $V = 0$, such a solution would not, in general, be the state of lowest energy, though it would be stable [Coleman and de Luccia (1980), Weinberg (1982)]. It should, however, be mentioned that if there is a set of field values at which $W = 0$ and $D_i W = 0$ for all $i$ in lowest order of perturbation theory, then the theory has a supersymmetric equilibrium configuration with $V = 0$ to all orders of perturbation theory, though not necessarily beyond perturbation theory (Grisaru, Siegel, and Rocek, 1979). The same is believed to be true in superstring perturbation theory (Dine and Seiberg, 1986; Friedan, Martinec, and Shenker, 1986; Martinec, 1986; Attick, Moore, and Sen, 1987; Morozov and Perelomov, 1987).

Without fine-tuning, we can generally find a nonsupersymmetric set of scalar field values at which $V = 0$ and $D_i W \neq 0$, but this would not normally be a stationary point of $V$. Thus in supergravity the problem of the cosmological constant is no more a disaster, but just as much a crisis, as in nonsupersymmetric theories.

On the other hand, supergravity theories offer opportunities for changing the context of the cosmological constant problem, if not yet for solving it. Cremmer et al. (1983) have noted that there is a class of Kähler potentials and superpotentials that, for a broad range of most parameters, automatically yield an equilibrium scalar field configuration in which $V = 0$, even though supersymmetry is broken. Here is a somewhat generalized version: the Kähler potential is

$$K = -3\ln\left|T + T^* - h(C^a, C^{a*})\right| / 8\pi G + \tilde{K}(S^n, S^{n*}) , \tag{4.8}$$

while the superpotential is

$$W = W_1(C^a) + W_2(S^n) , \tag{4.9}$$

and $T$, $C^a$, $S^n$ are all chiral scalar fields. No constraints are placed on the functions $h(C^a, C^{a*})$, $\tilde{K}(S^n, S^{n*})$, $W_1(C^a)$, or $W_2(S^n)$, except that $h$ and $\tilde{K}$ are real, and functions all depend only on the fields indicated; in particular, the superpotential must be independent of the single chiral scalar $T$.

With these conditions the potential (4.5) takes the form

(4.10)

where $(\mathcal{N}^{-1})^a{}_b$ is the reciprocal of the matrix

$$\mathcal{N}^a{}_b \equiv \frac{\partial^2 h}{\partial C^{a*}\,\partial C^{b}} . \tag{4.11}$$

The matrices $\mathcal{N}^a{}_b$ and $\tilde{\mathcal{G}}^n{}_m$ are necessarily positive-definite, because of their role in the kinetic part of the scalar Lagrangian

(4.12)

Hence Eq. (4.10) is positive and therefore, without further fine-tuning, may be expected to have a stationary point with $V = 0$, specified by the conditions

$$\frac{\partial W}{\partial C^a} = D_n W = 0 . \tag{4.13}$$

But this is not necessarily a supersymmetric configuration, because here

$$D_a W = \frac{\partial W}{\partial C^a} + 8\pi G\,\frac{\partial K}{\partial C^a}\,W \tag{4.14}$$

and this does not necessarily vanish. (However, to have supersymmetry broken, it is essential that the superpotential actually depend on all of the chiral scalars $S^n$, because otherwise the conditions $D_n W = 0$ would require $W = 0$ and hence $D_a W = 0$.)

The superpotential $W$ depends on $C^a$ and $S^n$, but not on $T$, so the conditions (4.13) will generally fix the values of $C^a$ and $S^n$ at the minimum of $V$, while leaving $T$ undetermined. The field $T$ enters the potential only in the overall scale of the part that depends on the $C^a$, so such theories are called no-scale models. An intensive phenomenological study of these models was carried out at CERN for several years following 1983 (Ellis, Lahanas, et al., 1984; Ellis, Kounnas, et al., 1984; Barbieri et al., 1985).

Of course, these models do not solve the cosmological constant problem, because neither Eq. (4.8) nor Eq. (4.9) is dictated by any known physical principle. In particular, in order to cancel the second term in Eq. (4.5), it is essential that the coefficient of the logarithm in the first term in (4.8) be given the apparently arbitrary value $-3/8\pi G$.

It was therefore exciting when, in some of the first work on the physical implications of superstring theory, it was found that compactification of six of the ten original dimensions yielded a four-dimensional supergravity theory with Kähler potential and superpotential of the form (4.8) and (4.9). Specifically, Witten (1985) found a Kähler potential of the form (4.8), with $h$ quadratic in the $C$'s and $\tilde{K} = -\ln(S + S^*)/8\pi G$, but with a superpotential that depended solely on the $C$'s. By including nonperturbative gaugino condensation effects, Dine et al. (1985) were able to give the superpotential a dependence on $S$ (though they did not treat the dependence of the Kähler potential or superpotential on the $C$ fields). In this work, the $S$ field is a complex function (now often called r) of four-dimensional dilaton and axion fields, while the $T$ field represents the scale of the compactified six-dimensional manifold. The factor 3 in Eq. (4.8) arises in these models because one compactifies on a complex manifold with $(10-4)/2 = 3$ complex dimensions (Chang et al., 1988).

Intriguing as these results are, they have not been taken seriously (even by the original authors) as a solution of the cosmological constant problem. The trouble is that no one expects the simple structures (4.8) and (4.9) to survive beyond the lowest order of perturbation theory, because they are not protected by any symmetry that survives down to accessible energies.

Recently Moore (1987a, 1987b) has attempted a more specifically stringy attack on the problem. Early work by Rohm (1984) and Polchinski (1986) had shown that in the calculation of the vacuum energy density, the sum over zero-point energies can be converted into an integral over a complex modular parameter $\tau$. (In string theories, two-dimensional conformal symmetry makes the tree-level vacuum energy vanish.) Last year Moore pointed out that for some special compactifications there is a discrete symmetry of modular space, known as Atkin-Lehner symmetry, that makes the integral over $\tau$ vanish despite the absence of space-time supersymmetry. So far, the only examples where this occurs entail a compactification to two rather than four space-time dimensions, but it does not seem unlikely that four-dimensional examples could be found. A more serious obstacle is that the Atkin-Lehner symmetry seems irretrievably tied to one-loop order.

Indeed, it is very hard to see how any property of supergravity or superstring theory could make the effective cosmological constant sufficiently small. It is not enough that the vacuum energy density cancel in lowest order, or to all finite orders of perturbative theory; even nonperturbative effects like ordinary QCD instantons would give far too large a contribution to the effective cosmological constant if not canceled by something else. According to our modern theories, properties of elementary particles, like approximate baryon and lepton conservation, are dictated by gauge symmetries of the standard model, which survive down to accessible energies. We know of no such symmetry (aside from the unrealistic example of unbroken supersymmetry) that could keep the effective cosmological constant sufficiently small. It is conceivable that in supergravity the property of having zero effective cosmological constant does survive to low energies without any symmetry to guard it, but this would run counter to all our experience in physics.

V. ANTHROPIC CONSIDERATIONS

I now turn to a very different approach to the cosmological constant, based on what Carter (1974) has called the anthropic principle.⁵ Briefly stated, the anthropic principle has it that the world is the way it is, at least in part, because otherwise there would be no one to ask why it is the way it is. There are a number of different versions of this principle, ranging from those that are so weak as to be trivial to those that are so strong as to be absurd. Three of these versions seem worth distinguishing here.

⁵Recent discussions of the anthropic principle are given in the books by Davies (1982) and Barrow and Tipler (1986), and in articles by Carter (1983), Page (1987), and Rees (1987).

(i) In one very weak version, the anthropic principle amounts simply to the use of the fact that we are here as one more experimental datum. For instance, recall M. Goldhaber's joke that we know in our bones that the lifetime of the proton must be greater than about $10^{16}$ yr, because otherwise we would not survive the ionizing particles produced by proton decay in our own bodies. No one can argue with this version, but it does not help us to explain anything, such as why the proton lives so long. Nor does it give very useful experimental information; certainly experimental physicists (including Goldhaber) have provided us with better limits on the proton lifetime.
(ii) In one rather strong version, the anthropic principle states that the laws of nature, which are otherwise incomplete, are completed by the requirement that conditions must allow intelligent life to arise, the reason being that science (and quantum mechanics in particular) is meaningless without observers. I do not know how to reach a decision about such matters and will simply state my own view, that although science is clearly impossible without scientists, it is not clear that the universe is impossible without science.

(iii) A moderate version of the anthropic principle, sometimes known as the weak anthropic principle, amounts to an explanation of which of the various possible eras or parts of the universe we inhabit, by calculating which eras or parts of the universe we could inhabit. An example is provided by what I think is the first use of anthropic arguments in modern physics, by Dicke (1961), in response to a problem posed by Dirac (1937). In effect, Dirac had noted that a combination of fundamental constants with the dimensions of a time turns out to be roughly of the order of the present age of the universe:

$$\hbar / G c m_\pi^3 = 4.5\times10^{10}\ \mathrm{yr} . \tag{5.1}$$

[There are various other ways of writing this relation, such as replacing $m_\pi$ with various combinations of particle masses and introducing powers of $e^2/\hbar c$. Dirac's original large-number coincidence is equivalent to using Eq. (5.1) as a formula for the age of the universe, with $m_\pi$ replaced by ( 137m,m~)3= 183 MeV. In fact, there are so many different possibilities that one may doubt whether there is any coincidence that needs explaining.] Dirac reasoned that if this connection were a real one, then, since the age of the universe increases (linearly) with time, some of the constants on the left side of (5.1) must change with time. He guessed that it is $G$ that changes, like $1/t$. [Zee (1985) has applied similar arguments to the cosmological constant itself.] In response to Dirac, Dicke pointed out that the question of the age of the universe could only arise when the conditions are right for the existence of life. Specifically, the universe must be old enough so that some stars will have completed their time on the main sequence to produce the heavy elements necessary for life, and it must be young enough so that some stars would still be providing energy through nuclear reactions. Both the upper and lower bounds on the ages of the universe at which life can exist turn out to be roughly (very roughly) given by just the quantity (5.1). Hence there is no need to suppose that any of the fundamental constants vary with time to account for the rough agreement of the quantity (5.1) with the present age of the universe.

It is this weak anthropic principle that will be applied here. Its relevance arises from the fact that, in some modern cosmological models, the universe does have parts or eras in which the effective cosmological constant takes a wide variety of values. Here are some examples.

(1) The vacuum energy may depend on a scalar field vacuum expectation value that changes slowly as the universe expands, as in a model of Banks (1985).

(2) In a model of Linde (1986, 1987, 1988b), fluctuations in scalar fields produce exponentially expanding regions of the universe, within which further fluctuations produce further subuniverses, and so on. Since these subuniverses arise from fluctuations in the fields, they have differing values of various constants of nature.

(3) The universe may go through a very large number of first-order phase transitions in which bubbles of smaller vacuum energy form; within these bubbles there form further bubbles of even smaller vacuum energy, and so on. This can happen if the potential for some scalar field has a large number of small bumps, as in a model of Abbott (1985). Alternatively, the bubble walls may be elementary membranes coupled to a 3-form gauge potential $A_{\mu\nu\lambda}$, as in the work of Brown and Teitelboim (1987a, 1987b).

(4) The universe may start in a quantum state in which the cosmological constant does not have a precise value. Any measurement of the properties of the universe yields a variety of possible values for the cosmological constant, with a priori probabilities determined by the initial state (Hawking, 1987a). We will see examples of this in Secs. VII and VIII.

In models of these types, it is perfectly sensible to apply anthropic considerations to decide which era or part of the universe we could inhabit, and hence which values of the cosmological constant we could observe.

A large cosmological constant would interfere with the appearance of life in different ways, depending on the sign of $\lambda_{\rm eff}$. For a large positive $\lambda_{\rm eff}$, the universe very early enters an exponentially expanding de Sitter phase, which then lasts forever. The exponential expansion interferes with the formation of gravitational condensations, but once a clump of matter becomes gravitationally bound, its subsequent evolution is unaffected by the cosmological constant. Now, we do not know what weird forms life may take, but it is hard to imagine that it could develop at all without gravitational condensations out of an initially smooth universe. Therefore the anthropic principle makes a rather crisp prediction: $\lambda_{\rm eff}$ must be small enough to allow the formation of sufficiently large gravitational condensations (Weinberg, 1987).

This has been worked out quantitatively, but we can easily understand the main result without detailed calculations. We know that in our universe gravitational condensation had already begun at a redshift $z_c \gtrsim 4$. At this time, the energy density was greater than the present mass density $\rho_{M0}$ by a factor $(1+z_c)^3 \gtrsim 125$. A cosmological constant has little effect as long as the nonvacuum energy density is larger than $\rho_V$, so one can conclude that a vacuum energy density $\rho_V$ no larger than, say, $100\rho_{M0}$ would not be large enough to prevent gravitational condensations. [The quantitative analysis of Weinberg
A. Mass density

If, as often assumed, the universe now has negligible spatial curvature, then

$$\Omega_V + \Omega_{M0} = 1 , \tag{5.2}$$

where $\Omega_V$ and $\Omega_{M0}$ are the ratios of the vacuum energy density and the present mass density to the critical density

B. Ages

In a dust-dominated universe with $k = 0$ and $\rho_V = 0$, the age of the universe is $t_0 = 2/3H_0$. For $H_0 = 100$ km/sec Mpc, this is $7\times10^9$ yr, considerably less than the ages usually estimated for globular clusters (Renzini, 1986). On the other hand, for a dust-dominated universe with $k = 0$ and $\rho_V \neq 0$, the present age of an object that formed at a redshift $z_c$ is

(5.4)
For instance, for $z_c = 4$ and $\rho_V/\rho_{M0} = 9$ (i.e., $\Omega_{M0} = 0.1$), this gives an age $1.1H_0^{-1}$ in place of $\tfrac{2}{3}H_0^{-1}$. This is not in conflict with globular cluster ages even for Hubble constants near 100 km/sec Mpc.
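The ages quoted above can be reproduced with a short calculation. The sketch below does not transcribe Eq. (5.4); it assumes the standard closed-form age of a spatially flat universe containing only dust and vacuum energy, $t(z) = \frac{2}{3H_0\sqrt{\Omega_V}}\sinh^{-1}\!\left[\sqrt{\rho_V/\rho_{M0}}\,(1+z)^{-3/2}\right]$, and the function names and unit-conversion constants are choices made here for illustration.

```python
import math

KM_PER_MPC = 3.0857e19    # kilometres in one megaparsec (assumed conversion)
SEC_PER_YEAR = 3.156e7    # seconds in one year (assumed conversion)


def hubble_time_years(h0_km_s_mpc):
    """Hubble time 1/H0 in years, for H0 given in km/sec Mpc."""
    return KM_PER_MPC / h0_km_s_mpc / SEC_PER_YEAR


def cosmic_time(z, rho_v_over_rho_m0):
    """Time since the big bang at redshift z, in units of 1/H0,
    for a k = 0 universe with dust plus a positive vacuum energy density."""
    r = rho_v_over_rho_m0
    omega_v = r / (1.0 + r)                  # follows from Omega_V + Omega_M0 = 1
    x = math.sqrt(r) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_v)) * math.asinh(x)


def age_of_object(z_c, rho_v_over_rho_m0):
    """Present age, in units of 1/H0, of an object that formed at redshift z_c."""
    return cosmic_time(0.0, rho_v_over_rho_m0) - cosmic_time(z_c, rho_v_over_rho_m0)


t_dust = (2.0 / 3.0) * hubble_time_years(100.0)  # k = 0, rho_V = 0
t_obj = age_of_object(4.0, 9.0)                  # z_c = 4, rho_V / rho_M0 = 9

print(f"2/(3 H0) for H0 = 100 km/sec Mpc: {t_dust:.1e} yr")
print(f"age of an object formed at z_c = 4: {t_obj:.2f} / H0")
```

With these inputs the dust-only age comes out near $6.5\times10^9$ yr, which the text rounds to $7\times10^9$ yr, and the age with $\rho_V/\rho_{M0} = 9$ comes out close to $1.1H_0^{-1}$.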
These considerations of cosmic age and density have led a number of astronomers to suggest a fairly large positive cosmological constant, with $\rho_V > \rho_M$ [de Vaucouleurs (1982, 1983); Peebles (1984, 1987a, 1987b); Turner, Steigman, and Krauss (1984)]. However, there recently has appeared a strong argument against this view, which we shall now consider.

C. Number counts

Loh and Spillar (1986) have carried out a survey of numbers of galaxies as a function of redshift, subsequently analyzed by Loh (1986). For a uniformly distributed class of objects that are all bright enough to be detectable at redshifts $\leq z_{\max}$, the number of objects observed at redshift less than $z \leq z_{\max}$ in a dust-dominated universe with $k = 0$ is
Of course, in the real world there are always some objects too dim to be seen. Loh's analysis allowed for an unknown luminosity distribution, assuming only that its shape does not evolve with time. Under these assumptions, he found that the vacuum energy must be quite small: specifically,

$$\rho_V/\rho_{M0} = 0.1^{+0.2}_{-0.4} .$$

This is more than 3 orders of magnitude below the anthropic upper bound discussed earlier. If the effective cosmological constant is really this small, then we would have to conclude that the anthropic principle does not explain why it is so small. [However, there are reasons to be cautious in reaching this conclusion. Bahcall and Tremaine (1986) have recently reanalyzed the data of Loh and Spillar, using a plausible model of galaxy evolution in which the shape of the luminosity distribution
does change with time. They considered only the case $\rho_V = 0$, leaving $\Omega_{M0}$ undetermined, and found that evolution in this model could increase or decrease the inferred value of $\Omega_{M0}$ by as much as unity. Presumably it would also have a similarly large effect on the inferred value of $\rho_V/\rho_{M0}$ when $\Omega_{M0} + \Omega_V$ is constrained to be unity. In addition, the redshifts of Loh and Spillar are photometric and therefore less certain than those obtained from shifts of individual spectral lines.]

Now let us consider a cosmological constant of the other sign, $\lambda_{\rm eff} < 0$. Here the cosmological constant does not interfere with the formation of gravitational condensations. Instead (for $k = 0$ or $k = +1$), the whole universe collapses to a singularity in a finite time $T$. The anthropic constraint here is simply that the universe last long enough for the appearance of life (Barrow and Tipler, 1986), say, $T \gtrsim 0.5H_0^{-1}$, where $H_0^{-1}$ is the Hubble time in our universe. For a dust-dominated universe with $k = 0$, we have

(5.6)

matter, or the vacuum and radiation, in such a way that either $\rho_V/\rho_M$ or $\rho_V/\rho_R$ remain constant, respectively (see also Reuter and Wetterich, 1987). In order for the vacuum to transfer energy to ordinary matter in such a way that $\rho_V/\rho_M$ remains fixed, and if baryon number is conserved, then it would be necessary to create baryon-antibaryon pairs at a sufficient rate to produce a troublesome $\gamma$-ray background. Alternatively, if the vacuum transfers energy to radiation in such a way that $\rho_V/\rho_R$ remains constant, and if $\rho_V$ is comparable with the present mass density $\rho_{M0}$, then $\rho_V/\rho_R$ must be rather large, completely changing the results of cosmological nucleosynthesis.

One more possibility that was not considered by Freese et al. is that the vacuum transfers energy to radiation, avoiding the problems of baryon-antibaryon annihilation, but in such a way as to keep a fixed ratio $\rho_V/\rho_M$ rather than $\rho_V/\rho_R$. However, this also does not work. With $\rho_V = c\rho_M$ and $R^3\rho_M$ constant, Eq. (5.8) yields
Einstein field equations have a flat-space solution. the N fields JI, with N - 1 fields ua (not necessarily sca-
Of course, we do not observe such a scalar field, but for lars) and one scalar 4, in such a way that the symmetry
these purposes it can couple as weakly as we like; a weak transformation (6.5) takes the form
coupling simply implies that the equilibrium value t$o is
6gpbv=2Eg~,, S U , = O , St$=-E. (6.7)
very large. In this respect the scalar t$ is analogous to the
axion, especially in its later invisible version [Kim [To do this, we first define a transverse surface S in
(1979);Dine, Fischler, and Srednicki (1981)l. field space by an equation T(JI)=O,where T ( J I )is any
Even very weakly coupled, it is possible that the t$ field function on which X,(aT/aJI,, )fn(JI) does not vanish.
could have interesting effects, because it must have very We take u, as any set of coordinates on this ( N- 1)-
small mass. If it has any nonzero mass M + , then at ener- dimensional surface, and define JI,,(u;t$) as the solution
gies below m g we can work with an effective Lagrangian of the ordinary differential equation dJI,,/ d t $ = f , , ( J I )
in which t$ has been integrated out, and so does not ap- subject to the condition that at t$=O, JI,, is at the point
pear explicitly. But massless fields like the gravitational on S with coordinates u. The condition that S be a trans-
and electromagnetic field will still appear in this effective verse surface ensures that, at least within a finite region
Lagrangian, and their vacuum fluctuations will contrib- of field space, any point JI,, is on just one of these trajec-
ute to the effective cosmological constant. In order to tories.] This symmetry simply ensures that for constant
keep p v < GeV4, we need the scalar field adjust- fields the Lagrangian can depend on gAv and t$ only in the
ment to cancel the effect of gravitational and electromag- combination e+gA,. The general arguments of Sec. 111
netic field fluctuations down to frequencies GeV; then show that when the field equations for u are
for this purpose we must have m+ < lo- GeV. A field satisfied, the Lagrangian must take the form
this light will have a macroscopic range: W m + c 20.01 L=eW(Detg)/Lo(u) . (6.8)
cm.
Unfortunately it seems to be impossible to construct a We see that the source of t$ is the trace of the energy-
theory with one or more scalar fields having the assumed moment urn tensor
properties. This can be seen in very general terms. What --
aL - T,(Detg )I/ , (6.9)
we want is to find an equilibrium solution of the field a4
equations in which g,, and all matter fields JI,, (perhaps
~ , ~ = g ~ e ~ + L.~ ( u ) (6.10)
tensors as well as scalars) are constant in space-time. For
such constant fields the Euler-Lagrange equations are It is true that if there were a value of t$ where L is sta-
simply tionary in t$, then the trace of the Einstein field equations
would automatically be satisfied at this point, but clearly
(6.2) there is no such stationary field value (unless, of course,
we fine-tuneLo so that it vanishes at its stationary point).
-- To put this another way, since L depends only on t$ and
aL -0. (6.3) g,, only in the combination g,,, =ee4g,, (and derivatives
aJIn
of I$ and g,,), we might as well redefine the metric as g,
As we saw in Sec. 111, the problem is in satisfying the instead of g,,. Then t$ is just a scalar with only deriva-
trace of the gravitational field equation. To make a solu- tive couplings and clearly cannot help with our problem6
tion natural, we would like this trace to be a linear com- As one example of many failed attempts along this
bination of the JI,, field equations; that is, we want line, let us consider a proposal of Peccei, Sol&,and Wet-
terich (1987). They observed that the symmetry (6.5) or
(6.4) (6.7) may be broken by conformal anomalies, such as
those that produce the fl function of quantum chromo-
for all constant g,, and JI,,. This can be restated as a dynamics, in such a way that the effective Lagrangian be-
symmetry condition: for constant fields the Lagrangian comes
must be invariant under the transformation L,,= (Detg)*[eqLo(u) -tt$W, J , (6.11)
k ~ v = 2 E g ~ v S, J I n = - - ~ f n C t c ) . (6.5) where W, represents the effect of the conformal anoma-
With this condition, if we find a solution JIco of the
Euler-Lagrange equations for JI,,,
*his remark is due to Polchinski (1987).
An equation essentially equivalent to (6.11) appeared in the
(6.6) preprint version of the paper by Peccei, Sola, and Wetterich
(1987). In the published version this equation was removed, and
then the trace of the field equation for g,, is automatical- it was acknowledged that fine-tuning is still needed to make the
ly satisfied. cosmological constant vanish. However, this equation was
The problem is that under these assumptions, it is im- quoted in the meantime in a paper by Ellis, Tsamis, and
possible (without fine-tuning L)to find a solution to the Voloshin (1987),which mostly deals with the observable conse-
field equations (6.3) for the $. To see this, we replace quences of the light scalar particle in this model.
ly. The source of the $ field is now cal assumptions that later turn out to have exceptions of
great physical interest. (A famous example is the
(6.12) Coleman-Mandula theorem.) More discouraging than
any theorem is the fact that many theorists have tried to
with T the previous energy-momentum tensor (6.10). invent adjustment mechanisms t o cancel the cosmologi-
Now we can find an equilibrium solution for the field, cal constant, but without any success so far.
a t a value 4o such that
VII. CHANGING GRAVITY
4e40Lo+Wp=0 . (6.13)
A number of authors have suggested changing the
rules of classical general relativity in such a way that the
The trouble is that this is not the condition for a flat- cosmological constant appears as a constant of integra-
space solution; the Einstein equation for a constant tion, unrelated to any parameters in the action [Van der
metric is Bij ef al. (1982); Weinberg (1983); Wilczek and Zee
(1983); Buchmuller and Dragon (1988a, 1988b)l. This
(6.14) does not solve the cosmological constant problem, but it
does change it in a suggestive way.
I will describe one version of this idea, in which one
which is not the same as (6.13). The point is that just cal- maintains general covariance, but reinterprets the for-
ling the anomalous term in (6.l l ) W pdoes not make it a malism so that the determinant of the metric is not a
term in the trace of the energy-momentum tensor to dynamical field. Any theory can be written in a way that
which gpv is coupled. This result is not surprising, since is formally generally covariant, so by the usual argu-
(6.11) does not obey the symmetry (6.7). One cannot ments we can take the action for gravity and matter as
have it both ways: either we preserve the symmetry, in
which case there is no equilibrium solution for $, or we
break the symmetry, in which case such an equilibrium
solution does not imply a solution of the field equations where $ are a set of matter fields appearing in the matter
for a constant metric. (Also see Coughlan ef al., 1988; action ZM. (IM includes a possible cosmological constant
Wetterich, 1988.) term -AS G d 4 x / ~ P G . )The variational derivative of
In a slightly different version of this general class of models, we can try coupling a scalar field so that it is the curvature scalar $R$ rather than the trace of the energy-momentum tensor that directly serves as the source of the scalar field. [See, e.g., Dolgov (1982); Barr (1987); Ford (1987).] For instance, we might take the Lagrangian as

(6.15)

This has a flat-space solution with $g_{\mu\nu}=\eta_{\mu\nu}$ and $\phi=\phi_0$ (a constant), provided that

$$U(\phi_0)=\infty . \qquad (6.16)$$

However, as the above authors observed, the effective gravitational coupling in this theory is given by

$$G_{\rm eff}=\frac{G}{1+16\pi G\,U(\phi_0)}=0 . \qquad (6.17)$$

This is not much progress; we always knew that a nonzero vacuum energy does not prevent a flat-space solution if the gravitational constant is zero.

The no-go theorem proved in this section should not be regarded as closing off all hope in this direction. No-go theorems have a way of relying on apparently techni-

Eq. (7.1) with respect to the metric is

(7.2)

where, as usual, $T^{\mu\nu}$ is the variational derivative of $I_M$ with respect to $g_{\mu\nu}$. In ordinary general relativity all components of the metric are dynamical fields, so Eq. (7.2) vanishes for all $\mu,\nu$, yielding the usual Einstein field equations. However, just because we use a generally covariant formalism does not mean that we are committed to treating all components of the metric as dynamical fields. For instance, we all learn in childhood how to write the equations of Newtonian mechanics in general curvilinear spatial coordinate systems, without supposing that the 3-metric has to obey any field equations at all. In particular, if the determinant $g$ is not dynamical, then the action only has to be stationary with respect to variations in the metric that keep the determinant fixed,

$^{8}$For instance, we assumed that in the solution for flat space all fields are constant, but it might be that this solution preserves only some combination of translation and gauge invariance, in which case some gauge-noninvariant fields might vary with space-time position. (This is the case for the 3-form gauge field model discussed at the end of Sec. VII and in Sec. VIII.) Furthermore, it is possible that the foliation of field space, which allows us to replace the $\psi_n$ with $\sigma_a$ and $\phi$, does not work throughout the whole of field space.
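The argument that starts here, and that the following paragraphs carry through Eqs. (7.3)-(7.7), can be sketched in one place. This is an illustrative outline rather than part of the original text: the symbol $X^{\mu\nu}$ (shorthand for the variational derivative of the full action with respect to $g_{\mu\nu}$) is introduced only for convenience, and the sign and normalization conventions are assumed to be those in which the full field equations read $R^{\mu\nu}-\tfrac12 g^{\mu\nu}R=-8\pi G\,T^{\mu\nu}$.

```latex
% Sketch only (requires amsmath). X^{\mu\nu} is illustrative shorthand, not the paper's notation.
\begin{align*}
% 1. Variation of the determinant: fixed determinant <=> traceless variation.
\delta g &= g\,g^{\mu\nu}\,\delta g_{\mu\nu},
  \qquad \delta g = 0 \iff g^{\mu\nu}\delta g_{\mu\nu}=0 .\\
% 2. Stationarity under traceless variations fixes only the traceless part of X.
X^{\mu\nu}\delta g_{\mu\nu} &= 0 \ \text{ for all traceless } \delta g_{\mu\nu}
  \;\Longrightarrow\; X^{\mu\nu}-\tfrac14 g^{\mu\nu}X^{\lambda}{}_{\lambda}=0 .\\
% 3. For gravity plus matter this is the traceless Einstein equation (7.3).
R^{\mu\nu}-\tfrac14 g^{\mu\nu}R &=
  -8\pi G\left(T^{\mu\nu}-\tfrac14 g^{\mu\nu}T^{\lambda}{}_{\lambda}\right).\\
% 4. Divergence, using R^{\mu\nu}{}_{;\mu}=\tfrac12\partial^\nu R and T^{\mu\nu}{}_{;\mu}=0.
\tfrac12\partial^{\nu}R-\tfrac14\partial^{\nu}R &=
  \tfrac14\cdot 8\pi G\,\partial^{\nu}T^{\lambda}{}_{\lambda}
  \;\Longrightarrow\; R-8\pi G\,T^{\lambda}{}_{\lambda}=-4\Lambda \ (\text{constant}).\\
% 5. Substitute back: Einstein's equations with \Lambda as an integration constant.
R^{\mu\nu}-\tfrac12 g^{\mu\nu}R &=
  -8\pi G\,T^{\mu\nu}-\tfrac14 g^{\mu\nu}\left(R-8\pi G\,T^{\lambda}{}_{\lambda}\right)
  = -8\pi G\,T^{\mu\nu}+\Lambda\,g^{\mu\nu}.
\end{align*}
```

The last line is Eq. (7.7) with the $\Lambda g^{\mu\nu}$ term moved to the right-hand side; nothing in the action fixes the value of $\Lambda$.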
i.e., for which $g^{\mu\nu}\delta g_{\mu\nu}=0$; hence only the traceless part of (7.2) needs to vanish, yielding the field equation

$$R^{\mu\nu}-\tfrac14 g^{\mu\nu}R=-8\pi G\left(T^{\mu\nu}-\tfrac14 g^{\mu\nu}T^{\lambda}{}_{\lambda}\right) . \qquad (7.3)$$

This is just the traceless part of the Einstein field equations; these equations evidently contain less information than Einstein's, but as we shall see, not much less. Because the whole formalism is generally covariant, the energy-momentum tensor satisfies the usual conservation law

$$T^{\mu\nu}{}_{;\mu}=0 , \qquad (7.4)$$

and of course the Bianchi identities still hold,

$$\left(R^{\mu\nu}-\tfrac12 g^{\mu\nu}R\right)_{;\mu}=0 . \qquad (7.5)$$

The full Einstein field equations are automatically consistent with (7.4) and (7.5), but for the traceless part we get a nontrivial consistency condition. Taking the covariant derivative of Eq. (7.3) with respect to $x^{\mu}$ yields

$$\tfrac14\partial_{\nu}R=8\pi G\,\tfrac14\partial_{\nu}T^{\lambda}{}_{\lambda} ,$$

or, in other words, $R-8\pi G\,T^{\lambda}{}_{\lambda}$ is a constant, which we will call $-4\Lambda$:

$$R-8\pi G\,T^{\lambda}{}_{\lambda}=-4\Lambda \ \ (\text{constant}) . \qquad (7.6)$$

From (7.3) and (7.6), we obtain

$$R^{\mu\nu}-\tfrac12 g^{\mu\nu}R-\Lambda g^{\mu\nu}=-8\pi G\,T^{\mu\nu} . \qquad (7.7)$$

Thus we recover the Einstein field equations, but with a cosmological constant that has nothing to do with any terms in the action or vacuum fluctuations, arising, instead, as a mere integration constant. To put this another way, Eq. (7.3) does not involve a cosmological constant; the contribution of vacuum fluctuations automatically cancels on the right-hand side of Eq. (7.3), so this equation does have flat-space solutions in the absence of matter and radiation. The remaining problem in this formulation is: why should we choose the flat-space solutions?

Before proceeding with this theory, I should pause to mention that it is closely related to a proposal made long ago by Einstein (1919). After his formulation of general relativity and its application to cosmology, Einstein turned to the old problem of a field theory of matter. In a paper titled "Do Gravitational Fields Play an Essential Part in the Structure of the Elementary Particles of Matter?" he proposed to replace the original gravitational field equation with the equation

$$R^{\mu\nu}-\tfrac14 g^{\mu\nu}R=-8\pi G\,T^{\mu\nu} . \qquad (7.8)$$

This is consistent only if $T^{\mu\nu}$ is traceless; however, Einstein took for $T^{\mu\nu}$ not the full energy-momentum tensor of matter and radiation, but just the traceless tensor of radiation alone. This is, of course, conserved only outside matter. In such regions there is no difference between Eqs. (7.8) and (7.3), so by the same calculation as shown here, Einstein was able to recover Eq. (7.7), with $\Lambda$ a constant of integration. However, inside matter, Eq. (7.8) is different from (7.3), the difference being that the right-hand side of Eq. (7.3) includes the traceless part of the energy-momentum tensor of matter. A consequence of this difference is that in charged matter $R$ is an undetermined function, except that it is constant along world lines.

I will also take the opportunity of this pause to comment on the connection between the formulation described here and that of Zee (1985) and Buchmüller and Dragon (1988a, 1988b). These authors take as their starting point the assumption that the action is invariant not under the group of all coordinate transformations, but only under the subgroup of transformations $x^{\mu}\to x'^{\mu}$ with ${\rm Det}(\partial x'/\partial x)=1$. This is not really in conflict with the formulation presented here; the general covariance of Eq. (7.1) is achieved at the cost of introducing a metric that is partly nondynamical (just as we can make Newtonian mechanics formally Lorentz invariant by introducing a nondynamical quantity, the velocity of the reference frame). However, in giving up general covariance, one may be led to a theory with unnecessary elements. Under transformations with ${\rm Det}(\partial x'/\partial x)=1$, the determinant of the metric $g$ behaves just like any scalar field, so one can introduce arbitrary functions of $g$ here and there in the action. There is nothing wrong with this, but it is not necessary, no different from inserting a new scalar field into the theory.

Now let us return to the theory described by the field equations (7.3). In my view, the key question in deciding whether this is a plausible classical theory of gravitation is whether it can be obtained as the classical limit of any physically satisfactory quantum theory of gravitation. To help in answering this, and also to illuminate the points raised in the previous paragraph, let us look at a simple model (Teitelboim, 1982) that shares several features with the theory of gravitation studied here.

Consider a free relativistic particle, with space-time trajectory $x^{\mu}(s)$ parametrized by a variable $s$. In order for the action to be invariant under arbitrary reparametrizations $s\to s'(s)$, we must introduce an einbein $g(s)$, with transformation rule
The conditions that $I$ be stationary with respect to variations in $x^{\mu}(s)$ and $g(s)$ are, respectively,

$$\frac{dp^{\mu}}{ds}=0 , \qquad (7.11)$$

$$p^{\mu}p_{\mu}=-m^{2} , \qquad (7.12)$$

where $p_{\mu}$ is the canonical conjugate to $x^{\mu}$:

(7.13)

However, just because we choose to write the action in a reparametrization-invariant way does not necessarily mean that we must treat the einbein $g(s)$ as a dynamical quantity. If we treat $x^{\mu}(s)$, but not $g(s)$, as dynamical variables, then we obtain Eq. (7.11), but not (7.12). Of course, Eq. (7.11) implies that $p^{\mu}p_{\mu}$ is a constant [just as Eq. (7.3) implies that $R-8\pi G\,T^{\lambda}{}_{\lambda}$ is constant]. If we like, we can call this constant $-m^{2}$, but this is now a mere integration constant, unrelated to anything in the original action.

Now to quantization. The Hamiltonian here is

(7.14)

so in quantum mechanics we calculate amplitudes by the functional integral

$$A=\int[dx][dp][dg]\;\cdots$$

The einbein $g(s)$ has no canonical conjugate, and so appears here only as a Lagrange multiplier, whose integral yields a factor

$$\prod_{s}\delta\left(p^{\mu}p_{\mu}+m^{2}\right) . \qquad (7.16)$$

Presumably the classical theory in which $g$ is not dynamical would be obtained as the classical limit of a quantum theory in which we do not do a functional integral over $g(s)$, and hence do not get the factor (7.16). But then there would be nothing to keep $p^{\mu}$ timelike. This is such a trivial theory that it is hard to say that anything goes wrong physically; but we may anticipate that in less trivial theories, we need a field to serve as a Lagrange multiplier for every negative norm degree of freedom like $p^{0}$. This is the case, for instance, in string theories, where the integration over the world-sheet metric is needed to enforce the Virasoro conditions on physical states.

The quantum theory of gravitation can be put in similar terms. Using the Arnowitt-Deser-Misner (1962) formalism, we calculate amplitudes as functional integrals,

$$Z=\int[dh_{ij}][d\pi^{ij}][dN][dN^{i}]\;\cdots$$

Here $h_{ij}$, $N$, and $N^{i}$ parametrize the 4-metric, with line element given by

$$d\tau^{2}=\left(h^{-1}N^{2}-N^{i}N^{j}h_{ij}\right)dt^{2}-2h_{ij}N^{i}\,dx^{j}\,dt-h_{ij}\,dx^{i}\,dx^{j} , \qquad (7.18)$$

$$h={\rm Det}(h_{ij}) . \qquad (7.19)$$

Furthermore, $\pi^{ij}$ is the canonical conjugate to $h_{ij}$, and $\mathcal{H}$ and $\mathcal{H}_{i}$ are functions of $h_{ij}$ and $\pi^{ij}$ and their space derivatives, given by

(7.20)

$$\mathcal{H}_{i}=-2h_{ij}\nabla_{k}\pi^{jk} , \qquad (7.21)$$

where ${}^{(3)}R$ is the scalar curvature and $\nabla_{k}$ is the covariant derivative, both calculated using the 3-metric $h_{ij}$, and

$$G_{ij,kl}\equiv h_{ik}h_{jl}+h_{il}h_{jk}-h_{ij}h_{kl} . \qquad (7.22)$$

We see that $N$ and $N^{i}$ just act as Lagrange multipliers for $\mathcal{H}$ and $\mathcal{H}_{i}$, respectively. Moreover, from (7.18), we see that $N^{2}$ is just the quantity whose status is under question here, the determinant of the 4-metric$^{11}$

$$N=\left({\rm Det}\,g_{\mu\nu}\right)^{1/2} . \qquad (7.23)$$

Thus, just as the integral over the einbein $g(s)$ enforced the constraint $p^{\mu}p_{\mu}=-m^{2}$, the integral over ${\rm Det}\,g$ enforces the constraint

(7.24)

The two conditions are quite similar. Just as $\eta_{\mu\nu}$ has signature $+++-$, the quantity (7.22), viewed as a $6\times6$ matrix, has signature $+,+,+,+,+,-$. Hence the integration over ${\rm Det}\,g_{\mu\nu}$ has the effect of eliminating one negative norm degree of freedom for each $x$, $\pi^{ij}(h^{-1})_{ij}$, just as the integral over the einbein $g(s)$ allows one to eliminate the variable $p^{0}$. However, for gravity there is a "potential" term in $\mathcal{H}$, proportional to the 3-curvature, and it is not entirely clear to me that it really is necessary to constrain $\mathcal{H}$ to take a fixed value. For the present, the question of whether it is necessary to integrate over ${\rm Det}\,g_{\mu\nu}$ must be left open. [Recent work by Henneaux and Teitelboim (1988) shows that there is a sensible generally covariant quantum version of the classical theory described by Eq. (7.3).]

Before closing this section, I should note that several authors have made a rather different suggestion, which also has the effect of converting the cosmological constant from a function of parameters in the action into a constant of the motion (Aurilia et al., 1980; Witten, 1983; Henneaux and Teitelboim, 1984). They proposed adding to the action a term
with different ways of calculating the measure $[dg][d\Phi]$ (Hawking and Page, 1986). Another problem, potentially more worrisome, is that for gravity the Euclidean action (8.3) is not bounded below. Gibbons, Hawking, and Perry (1978) have proposed rotating the contour of integration for the overall scale of the 4-metric so that it runs parallel to the imaginary axis. We will not need to go into these technicalities here, because it will turn out that we only need to deal with the effective action at its equilibrium point.

A problem that is more relevant to us here has to do with the probabilistic interpretation of the wave function $\Psi$ and of Euclidean path integrals like (8.2). Hawking has proposed (1984a, 1984c) that $\exp(-S[g,\Phi])$ should be regarded as proportional to the probability of a particular metric and matter field history. It is not immediately clear what is meant by this: even supposing that we had the godlike ability to measure the gravitational and matter fields throughout space-time, it would be in a space-time of Lorentzian rather than Euclidean signature. However, since we can (sometimes) go from one signature to another by a complex coordinate transformation, it may be that a Euclidean history $g_{\mu\nu}(x)$, $\Phi(x)$ can be interpreted in terms of correlations of scalar quantities, just as if the space-time were Lorentzian. In much of Hawking's work (e.g., Hawking, 1979), these questions are avoided by using the formalism only to calculate the probability that, in the space-time history of the universe, there is a spacelike 3-surface with a given 3-metric $h_{ij}(x)$ and matter fields $\phi(x)$. For instance, with Hartle-Hawking (1983) initial conditions, we would integrate over all closed 4-manifolds that contain such a 3-surface. If this surface bisects the 4-manifold, then it can be regarded as the boundary of the two halves of the 4-manifold, and so the integral is (with some qualifications) just the square of the wave function (8.2). But questions still arise concerning the probabilistic interpretation of $\Psi$, particularly with regard to normalization. If $|\Psi[h,\phi]|^{2}$ is the probability density that there exists some 3-surface on which the 3-metric is $h_{ij}(x)$ and the matter fields are $\phi(x)$, then we would not simply want to set the functional integral of $|\Psi[h,\phi]|^{2}$ over $h_{ij}(x)$ and $\phi(x)$ equal to unity, because in this functional integral we are summing up possibilities that are not exclusive; if the universe has some $h_{ij}(x)$ and $\phi(x)$ on one 3-surface, then it may also have some other $h'_{ij}(x)$ and $\phi'(x)$ on some other 3-surface. After all, you would not expect that the probabilities that you ever in your life have flipped a coin and gotten heads, and that you ever in your life have flipped a coin and gotten tails, should add up to unity.

I would like to offer an interpretation of what is meant by treating $|\Psi[h,\phi]|^{2}$ as a probability density, which seems to me implicit in Hawking's writings (and may already be stated explicitly somewhere in the literature). As everyone has recognized, the problem has to do with the role of time in quantum gravity. [See, e.g., Hartle (1987).] The problems raised here do not arise in asymptotically flat cosmologies, because in such theories there is a natural definition of time, and we generally ask for the probabilities that the fields have certain values at a definite time. However, here time is a coordinate with no objective significance, and this coordinate time is even imaginary. As Augustine (398) warned, "I must not allow my mind to insist that time is something objective."$^{13}$ Heeding this warning, suppose we choose some time-keeping field $a(x,t)$, for instance, the trace of the energy-momentum tensor, and use its value to define a local time $a$. Each value of $a$ defines a 3-surface, on which the coordinate time $t$ is a function $t(x,a)$ defined implicitly by

$$a(x,t(x,a))=a . \qquad (8.4)$$

We are then interested in the probability that the tangential components of the metric and all matter fields other than $a(x,t)$ have specified values on this surface. Calling these quantities $b_{n}(x,t)$, we see that the probability density for the $b_{n}(x,t)$ to have the values $\beta_{n}(x)$ at local time $a$ is

$$\times\prod_{n,x}\delta\big(b_{n}(x,t(x,a))-\beta_{n}(x)\big) , \qquad (8.5)$$

with $N$ a normalization factor, determined by the condition that the total probability of finding any value for the $b_{n}(x)$ at local time $a$ should be unity:

(8.6)

[This usually makes $N$ a function of $a$, because in (8.5) and (8.6) we integrate only over matter and metric histories for which Eq. (8.4) is satisfied on some 3-surface. With some boundary conditions, this condition is automatically satisfied, and then $N$ is $a$ independent. For instance, if $M_{4}$ has two boundaries, on which $a(x)$ is required to take values $a_{1}$ and $a_{2}$, then there are 3-surfaces on which (8.4) is satisfied for all $a$ in the range $a_{1}<a<a_{2}$.] Where the surface of constant $a$ bisects the 4-space, $P_{a}[\beta]$ can be written as proportional to the square of the wave function $\Psi[a,\beta]$, but with $a$ constant in 3-space.

$^{13}$This quote is not merely a display of useless erudition. Book XI of Augustine's Confessions contains a famous discussion of the nature of time, and it seems to have become a tradition to quote from this chapter in writing about quantum cosmology. Thus Hawking (1979) quotes "What did God do before He made Heaven and Earth? I do not answer as one did merrily: He was preparing a Hell for those that ask such questions. For at no time had God not made anything, for time itself was made by God." Coleman (1988a) quotes "The past is present memory." To this, I can add one more very relevant quote: "I confess to you, Lord, that I still do not know what time is. Yet I confess too that I do know that I am saying this in time, that I have been talking about time for a long time, . . . ."
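The coin-flipping remark above can be made quantitative with a small worked example. It assumes, purely for illustration, a lifetime of $n$ independent flips of a fair coin; neither $n$ nor the fairness assumption comes from the text.

```latex
% Illustrative only (requires amsmath): n independent flips of a fair coin.
\begin{align*}
P(\text{ever heads}) &= 1-2^{-n}, \qquad P(\text{ever tails}) = 1-2^{-n},\\
P(\text{ever heads})+P(\text{ever tails}) &= 2-2^{1-n} \;>\; 1 \qquad (n\ge 2),\\
P(\text{ever heads and ever tails}) &= 1-2^{1-n}.
\end{align*}
```

For $n=2$ the two probabilities are $3/4$ each and sum to $3/2$. The excess over unity is exactly the probability of the overlap, which is the sense in which the possibilities summed in the functional integral of $|\Psi|^{2}$ are not exclusive.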
$$\times\,{\rm Im}\left(\Psi^{*}[h]\,\frac{\delta}{\delta h_{kl}(x)}\,\Psi[h]\right) . \qquad (8.8)$$

Since the beginning, it was hoped that such a conservation law could be used to construct a suitable probability density (DeWitt, 1967). Usually (8.8) is stated in a minisuperspace context, where $h_{ij}(x)$ is constrained to depend on only a finite number of parameters. Since $G_{ij,kl}h^{kl}=-h_{ij}$, it is natural to treat the overall scale of $h_{ij}$ as a sort of global time coordinate, and take as a probability density the corresponding component of the conserved current in (8.8). I wish to point out here that such a construction is not limited to any particular minisuperspace formulation, but can be carried out in the general case. Take $\Psi$ to depend on a global time

(8.9)

$$\propto\int[dA][dg][d\Phi]\,\delta\big(c(x_{1})-c\big)\exp(-S[A,g,\Phi]) . \qquad (8.13)$$

It is well known that such functional integrals can be expressed as exponentials of the effective action at its stationary point.$^{14}$ In the present case, we have

$$P(c)\propto\exp\left(-\Gamma[A_{c},g_{c},\Phi_{c}]\right) , \qquad (8.14)$$

where $\Gamma[A,g,\Phi]$ is the total action (the sum of one-particle irreducible graphs with external lines replaced with fields $A,g,\Phi$) and the subscript $c$ indicates that this quantity is to be evaluated at a point where $\Gamma$ is stationary with respect to any variations in $A_{\mu\nu\lambda}(x)$, $g_{\mu\nu}(x)$, or $\Phi(x)$ that leave $c(x_{1})=c$ fixed. Now, among all the possible stationary points of $\Gamma$, there is one that can be found knowing only the effective action relevant to large
4-manifolds. In this case, it is convenient to set all fields except $A_{\mu\nu\lambda}$ and $g_{\mu\nu}$ equal to their ($A$- and $g$-dependent) stationary values, in which case the effective action can be expanded in inverse powers of the size of the mani-

the omitted terms involving more than two derivatives of $g$ and/or $A$. As we saw in Sec. VII, the condition that this be stationary in $A_{\mu\nu\lambda}$ [for variations that keep $c(x_{1})$ fixed] is that $F^{\mu\nu\lambda\rho}$ have vanishing covariant divergence, from which it follows that $c$ in Eq. (7.27) is constant; hence

(8.16)

where

(8.17)

The condition that this be stationary in $g_{\mu\nu}$ is, of course, that $g_{\mu\nu}$ satisfy the Einstein field equations with cosmological constant $\Lambda(c)$. For any such solution, $R=-4\Lambda(c)$, so at the stationary point

(8.18)

With Hartle-Hawking boundary conditions, the solution of the Einstein equations for $\Lambda(c)>0$ is a 4-sphere of proper circumference $2\pi r$, where

(8.19)

yielding a probability density proportional to

$$\exp(-\Gamma_{\rm eff})=\exp[3\pi/G\Lambda(c)] . \qquad (8.20)$$

On the other hand, for $\Lambda(c)<0$ the solutions can be made compact by imposing periodicity conditions, but they all have $\Gamma_{\rm eff}\ge0$. Hawking's conclusion is that the probability density has an infinite peak for $\Lambda(c)\to0+$; hence, after normalizing $P$,

$$P(c)=\delta(c-c_{0}) , \qquad (8.21)$$

where $c_{0}$ is the value of $c$ (assuming there is one), for which $\Lambda(c)=0$.

It is important that the quantity $\Lambda(c)$ is the true effective cosmological constant, previously called $\Lambda_{\rm eff}$, that would be measured in gravitational phenomena at long ranges.$^{16}$ The constant $\lambda$ in Eq. (8.15) includes all effects of fields other than $g_{\mu\nu}$ and $A_{\mu\nu\lambda}$, including all quantum fluctuations. Hence the result (8.21), if valid, really does solve the cosmological constant problem.

We can check that this result is not invalidated by the terms neglected in Eq. (8.16). For a large radius $r$, the exhibited terms in (8.16) are of order $\Lambda r^{4}/G$ and $r^{2}/G$, respectively, while a term with $D>4$ derivatives would yield a contribution to $\Gamma_{\rm eff}$ of order $(mr)^{4-D}$, where $m$ is some combination of the Planck mass and elementary-particle masses. For $\Lambda(c)\lesssim m^{2}$, this shifts the size of the manifold by

$$\delta r/r=G\Lambda(c)\,[\Lambda(c)/m^{2}]^{(D-4)/2}\ll1 .$$

The change in the stationary value of the action is then

$$\delta\Gamma_{\rm eff}=[\Lambda(c)/m^{2}]^{(D-4)/2}\lesssim1 ,$$

so these higher-derivative terms have no effect on the singularity (8.20).

Coleman (1988b) does not need to introduce a 3-form gauge field $A_{\mu\nu\lambda}$; rather, in order to make the cosmological constant into a dynamical variable, he considers the effect of topological fixtures known as wormholes.$^{17}$ An explicit example of a wormhole is provided by the metric (Hawking, 1987b, 1988)

$$ds^{2}=\left(1+b^{2}/x^{\mu}x^{\mu}\right)^{2}dx^{\mu}dx^{\mu} . \qquad (8.22)$$

This appears to have a singularity at $x^{\mu}=0$, but the line element is invariant under the transformation

$$x^{\mu}\to x'^{\mu}=x^{\mu}b^{2}/x^{\nu}x^{\nu} , \qquad (8.23)$$

so the region $x^{\mu}x^{\mu}<b^{2}$ actually has the same geometry as that with $x^{\mu}x^{\mu}>b^{2}$. The space described by Eq. (8.22) therefore consists of two asymptotically flat 4-spaces, joined together at the 3-surface with $x^{\mu}x^{\mu}=b^{2}$, a 3-sphere known as a "baby universe." This 4-metric is not a solution of the classical Einstein equations (though it does have $R=0$), but this is not very relevant; the action is

$$S=3\pi b^{2}/G , \qquad (8.24)$$

so the factor $\exp(-S)$ suppresses the effects of all
wormholes except those of Planck dimensions or less, for which quantum effects are surely important. [A model with classical wormhole solutions, based on a 2-form axion, has been presented by Giddings and Strominger (1988a).]

If Planck-sized wormholes can connect asymptotically flat 4-spaces, then they can connect any 4-spaces that are large compared to the Planck scale. We are therefore led to consider contributions to the Euclidean path integral from large 4-spaces [like the 4-sphere in Hawking's (1984b) theory] connected to themselves and each other with Planck-sized wormholes. Each wormhole can be regarded as the creation and subsequent destruction of a baby universe [like the 3-sphere of proper circumference $4\pi b$ in Hawking's (1987b, 1988) wormhole model], and such baby universes may also appear as part of the boundary of the 4-manifold.

What are the effects of these wormholes and baby universes? At scales large compared with the scale of the baby universe, the creation or destruction of a baby universe can only show up through the insertion of a local operator in the path integral. The various types of baby universes can be classified according to the form of these local operators. The effect of creating and destroying arbitrary numbers of baby universes of all types can thus be expressed by adding a suitable term in the action

(8.25)

where $a_{i}$ and $a_{i}^{\dagger}$ are the annihilation and creation operators for a baby universe of type $i$, and $O_{i}(x)$ is the corresponding local operator. [This was first stated by Hawking (1987b). Creation and annihilation operators for baby universes were earlier used by Strominger (1984). For a proof of Eq. (8.25), see Coleman (1988a) and Giddings and Strominger (1988b).] The path integral over all 4-manifolds with given boundary conditions is to be calculated as

$$\int[dg][d\Phi]\,e^{-S}=\int_{\rm No}[dg][d\Phi]\,\langle B|e^{-S}|B\rangle , \qquad (8.26)$$

where No means that wormholes and baby universes are excluded, and $|B\rangle$ is a normalized baby-universe state depending on the boundary conditions. For instance, with Hartle-Hawking boundary conditions, $|B\rangle$ is the empty state

$$a_{i}|B\rangle=0 . \qquad (8.27)$$

These baby universes have an important effect even if none of them appear as part of the boundary of the 4-manifold, as would be the case for Hartle-Hawking boundary conditions. Hawking (1987b, 1988) has suggested that since the baby universes are unobservable, their effect is an effective loss of quantum coherence. [See also Hawking (1982); Teitelboim (1982); Strominger (1984); Lavrelashvili, Rubakov, and Tinyakov (1987, 1988); Giddings and Strominger (1988b). A contrary view was taken by Gross (1984).] Recently Coleman (1988a) has argued (convincingly, in my view) for a different interpretation [see also Giddings and Strominger (1988b)]. The state $|B\rangle$ in Eq. (8.26) may always be expanded in eigenstates of the operators $a_{i}+a_{i}^{\dagger}$:

$$|B\rangle=\int f_{B}(\alpha)\prod_{i}d\alpha_{i}\,|\alpha\rangle , \qquad (8.28)$$

$$(a_{i}+a_{i}^{\dagger})|\alpha\rangle=\alpha_{i}|\alpha\rangle , \qquad (8.29)$$

(8.30)

the function $f_{B}(\alpha)$ depending on the boundary conditions. For instance, for Hartle-Hawking conditions, $|B\rangle$ satisfies Eq. (8.27), and so

$$f_{B}(\alpha)=\prod_{i}\pi^{-1/4}\exp(-\alpha_{i}^{2}/2) . \qquad (8.31)$$

(With $n$ baby universes on the boundary of the 4-space, this would be multiplied with a Hermite polynomial of order $n$.) In the state $|\alpha\rangle$, the effect of the creation and annihilation of baby universes is to change the action $S$ to

$$S_{\alpha}=S+\sum_{i}\alpha_{i}\int O_{i}(x)\,d^{4}x . \qquad (8.32)$$

That is, the coupling constant multiplying each possible local term $\int O_{i}\,d^{4}x$ is changed by an amount $\alpha_{i}$. As soon as we start to make any sort of measurements, the state of the universe breaks up into an incoherent superposition of these $|\alpha\rangle$'s, each appearing with a priori probability $|f_{B}(\alpha)|^{2}$; but for each term we have an ordinary wormhole-free quantum theory, with $\alpha$-dependent action (8.32).

If all we want is to explain why the cosmological constant is not enormous, then our work is essentially done. The effective cosmological constant is a function of the $\alpha_{i}$, because among the $O_{i}$ there is a simple operator $O_{1}=\sqrt{g}$, whose coefficient contributes a term $8\pi G\alpha_{1}$ to $\Lambda$, and also because the vacuum energy $\langle\rho\rangle$ depends on the couplings of all interactions, each of which has a term proportional to one of the $\alpha_{i}$. Now, generic baby-universe states $|B\rangle$ will have components $|\alpha\rangle$ for which $\Lambda_{\rm eff}(\alpha)$ is very small, as well as others for which it is enormous. The anthropic considerations of Sec. VI tell us that any scientist who asks about the value of the cosmological constants can only be living in components $|\alpha\rangle$ for which $\Lambda_{\rm eff}$ is quite small, for otherwise galaxies and stars could never have formed (for $\Lambda_{\rm eff}>0$), or else there would not be time for life to evolve (for $\Lambda_{\rm eff}<0$).

However, it is of great interest to ask whether the effective cosmological constant is really zero, or just small enough to satisfy anthropic bounds, in which case it should show up observationally. The probability of getting any particular value of the $\alpha_{i}$, and hence of finding a value $\Lambda_{\rm eff}$, is not just given by the function $|f_{B}(\alpha)|^{2}$ arising from the boundary conditions, but is also affected by the functional integral itself.

In calculating this effect, Coleman (1988b) observed that although we are to integrate only over connected 4-manifolds, on a scale much larger than the wormhole
scale those manifolds that appear disconnected will really be connected by wormholes. Hence any sort of probability density or expectation value will contain as a factor a sum over disconnected manifolds consisting of arbitrary numbers of closed connected wormhole-free components. Just as for Feynman diagrams, this sum is the exponential of the path integral for a single closed connected wormhole-free manifold

(8.33)

where CC indicates that we include only closed connected wormhole-free manifolds, and $S_{\alpha}[g]$ is the action (8.32) with all fields other than $g_{\mu\nu}(x)$ integrated out.

The path integral in (8.33) can be evaluated by precisely the same methods as described above in connection with Hawking's (1984b) model [and used for this purpose by Coleman (1988b)]. The result is that the probability density for $\Lambda_{\rm eff}$ contains a factor (for $\Lambda_{\rm eff}>0$)

$$F=\exp\left(\exp\left[\frac{3\pi}{G\Lambda_{\rm eff}}+O(1)\right]\right) . \qquad (8.34)$$

The fact that this is now an exponential of an exponential, instead of a mere exponential, is not essential in solving the cosmological constant problem (though it is important in fixing other constants, as described at the end of this section). Either way, the probability distribution has an infinite peak at $\Lambda_{\rm eff}\to0+$, which, after normalizing so that the total probability is unity, means that $P(\alpha)$ has a factor

$$P(\alpha)\propto\delta\big(\Lambda_{\rm eff}(\alpha)\big) . \qquad (8.35)$$

In addition, as in Hawking's case, from the way that $F$ has been calculated it is clear that this $\Lambda_{\rm eff}$ is the constant that appears in the effective action for pure gravity with all high-energy fluctuations integrated out; hence it is the cosmological constant relevant to astronomical observation.

Has the cosmological constant problem been solved? Perhaps so, but there are still some things to worry about

possible that the essential singularity in $\exp(\exp[3\pi/G\Lambda(\alpha)])$ is canceled by an essential zero in the a priori probability $|f_{B}(\alpha)|^{2}$. However, this is not the case for Hartle-Hawking boundary conditions, where $|f_{B}(\alpha)|$ is a simple Gaussian. Moreover, Coleman (1988b) has shown that in his theory such an essential zero would be destroyed by almost any perturbation of the boundary conditions; instead of its being unnatural to have zero cosmological constant, it would be highly unnatural not to. Still, the problem of boundary conditions is disturbing, because it reminds us that quantum cosmology is an incomplete theory.

(3) Are wormholes real? Coleman's calculation depends on there being a clear separation between the very large 4-manifolds, for which the long-range effective action is stationary (and large and negative), and very small wormholes, whose contribution to the action is of order unity (and generally positive). Furthermore, the wormholes have been assumed to be so well separated that we can ignore their interactions (the "dilute gas" approximation). It may be possible to construct a theory in which the wormhole scale [like $b$ in Eq. (8.22)] is somewhat larger than the Planck scale, large enough to allow the wormhole metric to be calculated classically, but we would still have to ask whether this is actually the case. Hawking (1984b) does not need to worry about wormholes, but how do we know that the 3-form gauge field is real? A related question for both authors: even granting the existence of the stationary point of the action at which $\Gamma_{\rm eff}=-3\pi/\Lambda G$, how do we know that this is the dominant stationary point?

(4) What about the other terms in the effective action? For instance, suppose we include the 6-derivative term$^{19}$

$$+\,\zeta(\alpha)G(\alpha)\,R^{\mu\nu}{}_{\lambda\rho}R^{\lambda\rho}{}_{\sigma\tau}R^{\sigma\tau}{}_{\mu\nu} , \qquad (8.36)$$
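The value $\Gamma_{\rm eff}=-3\pi/\Lambda G$ quoted above, and the way a 6-derivative term of the kind just displayed shifts it, can be checked with a short calculation restricted to round 4-spheres. This is a sketch under stated assumptions, not a reproduction of Eqs. (8.16) and (8.36), whose full form is not legible here: the two-derivative action is taken to be $\frac{\Lambda}{8\pi G}\int\sqrt{g}\,d^{4}x+\frac{1}{16\pi G}\int\sqrt{g}\,R\,d^{4}x$, the extra term is taken to be $\zeta G\int\sqrt{g}\,R^{\mu\nu}{}_{\lambda\rho}R^{\lambda\rho}{}_{\sigma\tau}R^{\sigma\tau}{}_{\mu\nu}\,d^{4}x$, and the curvature sign convention is the one in which the solution has $R=-4\Lambda$.

```latex
% Sketch only (requires amsmath). Action normalizations are assumptions stated in the lead-in.
\begin{align*}
% Round 4-sphere of radius r in the convention R = -12/r^2:
V &= \tfrac{8\pi^{2}}{3}\,r^{4}, \qquad
R^{\mu\nu}{}_{\lambda\rho} = -\tfrac{1}{r^{2}}
  \left(\delta^{\mu}_{\lambda}\delta^{\nu}_{\rho}-\delta^{\mu}_{\rho}\delta^{\nu}_{\lambda}\right),
\qquad R = -\tfrac{12}{r^{2}},\\
% Cubic invariant: 48 k^3 with k = -1/r^2.
R^{\mu\nu}{}_{\lambda\rho}R^{\lambda\rho}{}_{\sigma\tau}R^{\sigma\tau}{}_{\mu\nu}
  &= -\tfrac{48}{r^{6}},\\
% Action evaluated on the sphere as a function of r:
\Gamma(r) &= \frac{\pi\Lambda}{3G}\,r^{4}-\frac{2\pi}{G}\,r^{2}-\frac{128\pi^{2}\zeta G}{r^{2}},\\
% Without the cubic term:
\zeta=0:\quad \frac{d\Gamma}{dr}=0 &\;\Longrightarrow\;
  r^{2}=\frac{3}{\Lambda}, \qquad \Gamma=-\frac{3\pi}{G\Lambda},\\
% To first order in zeta:
\zeta\neq0:\quad r^{2} &\simeq \frac{3}{\Lambda}-\frac{64\pi\zeta G^{2}\Lambda}{3}, \qquad
  \Gamma \simeq -\frac{3\pi}{G\Lambda}-\frac{128\pi^{2}\zeta G\Lambda}{3}.
\end{align*}
```

Under these assumptions, for $|G\Lambda|\ll1$ one has $\exp[\exp(-\Gamma)]\simeq\exp[\exp(3\pi/G\Lambda)]\times\exp[\tfrac{128\pi^{2}}{3}\,\zeta\,G\Lambda\exp(3\pi/G\Lambda)]$, a factor that grows with $\zeta$ when $G\Lambda\to0+$.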
the last term of Eq. (8.36)? This would completely invali- nice to be able to calculate them, because up t o now the
date the analysis of the singularity in the probability den- only really unsatisfactory feature of the quantum theory
sity P ( a ) ,and could well wipe it out. of gravitation has been the apparent arbitrariness of this
The last of these four qualms suggests some interesting infinite set of parameters.
possibilities. Suppose we do assume that for some reason
constants like cJa) in Eq. (8.36) are bounded. Then the IX. OUTLOOK
effect of wormholes is not only t o fix Ma) at zero, but
also t o fix these other constants a t their lower or upper All of the five approaches to the cosmological constant
bounds. [I think this is the correct interpretation of what problem described in Secs. IV -VIII remain interesting.
Coleman (1988b) calls the big fix.] For instance, for At present, the fifth, based on quantum cosmology, ap-
[(a) bounded and ( h ( a ) G ( a ) l < < l , the action (8.36) is pears the most promising. However, if wormholes (or 3-
stationary for a sphere of proper circumference 27rr, form gauge fields) d o produce a distribution of values for
where the cosmological constant, but without a n infinite peak a t
2 3 64[rGzh heff=O, then we will have to fall back on the anthropic
(8.37)
-h 3 principle to explain why he, is not enormously larger
than allowed by observation. Alternatively, it may be
for which the effective action takes the value some change in the theory of gravity, like that described
- 377 128cGha2 here in Sec. VII, that produces the distribution in values
(8.38)
eff Gh 3 for heP The approaches based on supersymmetry and
Thus the probability distribution $\exp[\exp(-\Gamma_{\rm eff})]$ not only has an infinite peak at $\lambda(\alpha)=0$, but also contains a factor
(8.39)
For $G\lambda \to 0$, the quantity $G\lambda\exp(3\pi/G\lambda)$ becomes infinite, so the normalized probability will have a delta function at the upper bound of $\xi(\alpha)$. All constants in the effective action for gravitation, including terms with any numbers of derivatives, can be calculated in this way,20 but they all have to be bounded as $\lambda(\alpha)G(\alpha) \to 0$ for any of this to make sense.
It may be that the bounds (if any) on parameters like $\xi(\alpha)$ arise from the details of wormhole physics, in which case these remarks are not going to be useful numerically for some time. However, there is another more exciting possibility, that there are just unitarity bounds, which could be calculated working only with low-energy effective theory itself. Of course, we are not likely to be able to measure parameters like $\xi(\alpha)$, but it would still be
20To the extent that it will become possible to calculate functions like $\lambda(\alpha)$, $G(\alpha)$, $\xi(\alpha)$, etc., in terms of the parameters in an underlying fundamental theory, such as a string theory, the location of the delta functions in F may allow us to infer something about the values of the $\alpha_i$ and of the parameters in the underlying theory. However, without such an underlying theory, it is impossible to use calculations of $\lambda$, $G$, $\xi$, etc., to infer anything about the observed parameters of some intermediate theory like the standard model. This is because, in addition to charges, masses, etc., the standard model implicitly also involves parameters $\lambda_0$, $G_0$, $\xi_0$, ... appearing in the effective action for gravitation. When we integrate out the quarks, leptons, and gauge and Higgs bosons, we obtain new values for $\lambda$, $G$, $\xi$, etc.; but these new values depend on an equal number of unknowns $\lambda_0$, $G_0$, $\xi_0$, etc., as well as on charges and masses.
adjustment mechanisms described in Secs. IV and VI seem least promising at present, but this may change.
All five approaches have one other thing in common: They show that any solution of the cosmological constant problem is likely to have a much wider impact on other areas of physics or astronomy. One does not need to explain the potential importance of supergravity and superstrings. A light scalar like that needed for adjustment mechanisms could show up macroscopically, as a fifth force. Changing gravity by making Det $g_{\mu\nu}$ not dynamical would make us rethink our quantum theories of gravitation, and wormholes might force all the constants in these theories to their outer bounds. Finally, and of greatest interest to astronomy, if it is only anthropic constraints that keep the effective cosmological constant within empirical limits, then this constant should be rather large, large enough to show up before long in astronomical observations.
Note added in proof. As might have been expected, in the time since this report was submitted for publication there have appeared a large number of preprints that follow up on various aspects of the work of Coleman (1988b) and Banks (1988). Here is a partial list: Accetta et al. (1988); Adler (1988); Fischler and Susskind (1988); Giddings and Strominger (1988c); Gilbert (1988); Grinstein and Wise (1988); Gupta and Wise (1988); Hosoya (1988); Klebanov, Susskind, and Banks (1988); Myers and Periwal (1988); Polchinski (1988); Rubakov (1988). I am not able to review all of these papers here. However, I do want to mention two further qualms, regarding Coleman's proposed solution of the cosmological constant problem, that are raised by some of these papers. First, Fischler and Susskind (1988), partly on the basis of conversations with V. Kaplunovsky, have pointed out that the exponential damping of large wormholes may be overcome by Coleman's double exponential. If this were the case, we would be confronted with closely packed wormholes of macroscopic as well as Planck scales. This would be a disaster for Coleman's proposed solution of
the cosmological constant problem, and would also indicate that we do not fully understand how to use Euclidean path integrals in quantum cosmology. Next, Polchinski (1988) has found that the Euclidean path integral over closed, connected, wormhole-free manifolds inside the exponential in (8.33) has a phase that might eliminate the peak in the probability distribution at zero cosmological constant. As pointed out here in footnote 15, when we use an effective action $\Gamma_{\rm eff}$ to evaluate such path integrals, the effective action must be taken as an input to calculations in which we include quantum fluctuations in massless particle fields with momenta up to some ultraviolet cutoff $\Lambda$. This cutoff must be taken as the same as the infrared cutoff that was used in calculating $\Gamma_{\rm eff}$ so that all fluctuations are taken into account. It was remarked in footnote 15 that $\Lambda$ must be taken very small, to avoid reintroducing a cosmological constant, but as Polchinski now remarks, no matter how small we take $\Lambda$, the integral over fluctuations in the gravitational field with momenta less than $\Lambda$ produces a phase in the integral. Since this phase appears inside the exponential in Eq. (8.33), if its real part is not positive definite there would be no exponential peak at zero cosmological constant. On the other hand, in the absence of wormholes this phase would appear as an overall factor in front of a single exponential, so it would not affect the peaking at zero cosmological constant found by Hawking (1984b).
ACKNOWLEDGMENTS
I have been greatly helped in preparing this review by conversations with many colleagues. Here is a list of a few of those to whom my thanks are especially due. Section II: G. Holton; Sec. III: E. Witten; Sec. IV: S. de Alwis, J. Polchinski, E. Witten; Sec. V: P. J. E. Peebles, P. Shapiro, E. Vishniac; Sec. VI: J. Polchinski; Sec. VII: C. Teitelboim, F. Wilczek; Sec. VIII: L. Abbott, S. Coleman, B. DeWitt, W. Fischler, S. Giddings, L. Susskind, C. Teitelboim, F. Wilczek. Of course, they take no responsibility for anything that I may have gotten wrong. Research was supported in part by the Robert A. Welch Foundation and NSF Grant No. PHY 8605978.
REFERENCES
Abbott, L., 1985, Phys. Lett. B 150, 427.
Abbott, L., 1988, Sci. Am. 258 (No. 5), 106.
Accetta, F. S., A. Chodos, F. Cooper, and B. Shao, 1988, "Fun with the wormhole calculus," Yale University Preprint No. YCTP-P20-88.
Adler, S. L., 1988, "On the Banks-Coleman-Hawking argument for the vanishing of the cosmological constant," Institute for Advanced Study Preprint No. IASSNS-HEP-88/35.
Albrecht, A., and P. J. Steinhardt, 1982, Phys. Rev. Lett. 48, 1220.
Arnowitt, R., S. Deser, and C. W. Misner, 1962, in Gravitation: An Introduction to Current Research, edited by L. Witten (Wiley, New York), p. 227.
Attick, J., G. Moore, and A. Sen, 1987, Institute for Advanced Studies preprint.
Augustine, 398, Confessions, translated by R. S. Pine-Coffin (Penguin Books, Harmondsworth, Middlesex, 1961), Book XI.
Bahcall, J., T. Piran, and S. Weinberg, 1987, Eds., Dark Matter in the Universe: 4th Jerusalem Winter School for Theoretical Physics (World Scientific, Singapore).
Bahcall, S. R., and S. Tremaine, 1988, Astrophys. J. 326, L1.
Banks, T., 1985, Nucl. Phys. B 249, 332.
Banks, T., 1988, University of California, Santa Cruz, Preprint No. SCIPP 88/09.
Banks, T., W. Fischler, and L. Susskind, 1985, Nucl. Phys. B 262, 159.
Barbieri, R., E. Cremmer, and S. Ferrara, 1985, Phys. Lett. B 163, 143.
Barbieri, R., S. Ferrara, D. V. Nanopoulos, and K. S. Stelle, 1982, Phys. Lett. B 113, 219.
Barr, S. M., 1987, Phys. Rev. D 36, 1691.
Barr, S. M., and D. Hochberg, 1988, Phys. Lett. B 211, 49.
Barrow, J. D., and F. J. Tipler, 1986, The Anthropic Cosmological Principle (Clarendon, Oxford).
Bernstein, J., and G. Feinberg, 1986, Eds., Cosmological Constants (Columbia University, New York).
Bludman, S. A., and M. Ruderman, 1977, Phys. Rev. Lett. 38, 255.
Brown, J. D., and C. Teitelboim, 1987a, Phys. Lett. B 195, 177.
Brown, J. D., and C. Teitelboim, 1987b, Nucl. Phys. B 297, 787.
Buchmüller, W., and N. Dragon, 1988a, University of Hannover Preprint No. ITP-UH 1/88.
Buchmüller, W., and N. Dragon, 1988b, Phys. Lett. B 207, 292.
Carter, B., 1974, in International Astronomical Union Symposium 63: Confrontation of Cosmological Theories with Observational Data, edited by M. S. Longair (Reidel, Dordrecht), p. 291.
Carter, B., 1983, in The Constants of Physics, Proceedings of a Royal Society Discussion Meeting, 1983, edited by W. H. McCrea and M. J. Rees (printed for The Royal Society, London, at the University Press, Cambridge), p. 137.
Casimir, H. B. G., 1948, Proc. K. Ned. Akad. Wet. 51, 635.
Chang, N.-P., D.-X. Li, and J. Perez-Mercader, 1988, Phys. Rev. Lett. 60, 882.
Coleman, S., 1988a, Nucl. Phys. B 307, 867.
Coleman, S., 1988b, "Why there is nothing rather than something: A theory of the cosmological constant," Harvard University Preprint No. HUTP-88/A022.
Coleman, S., and F. de Luccia, 1980, Phys. Rev. D 21, 3305.
Coughlan, G. D., I. Kani, G. G. Ross, and G. Segre, 1988, CERN Preprint No. TH. 5014/88.
Cremmer, E., S. Ferrara, C. Kounnas, and D. V. Nanopoulos, 1983, Phys. Lett. B 133, 61.
Cremmer, E., B. Julia, J. Scherk, S. Ferrara, L. Girardello, and P. van Nieuwenhuizen, 1978, Phys. Lett. B 79, 231.
Cremmer, E., B. Julia, J. Scherk, S. Ferrara, L. Girardello, and P. van Nieuwenhuizen, 1979, Nucl. Phys. B 147, 105.
Davies, P. C. W., 1982, The Accidental Universe (Cambridge University, Cambridge).
de Sitter, W., 1917, Mon. Not. R. Astron. Soc. 78, 3 (reprinted in Bernstein and Feinberg, 1986).
de Vaucouleurs, G., 1982, Nature (London) 299, 303.
de Vaucouleurs, G., 1983, Astrophys. J. 268, 468, Appendix B.
DeWitt, B., 1967, Phys. Rev. 160, 1113.
Dicke, R. H., 1961, Nature (London) 192, 440.
Dine, M., W. Fischler, and M. Srednicki, 1981, Phys. Lett. B 104, 199.
Dine, M., R. Rohm, N. Seiberg, and E. Witten, 1985, Phys. Lett. B 156, 55.
Dine, M., and N. Seiberg, 1986, Phys. Rev. Lett. 57, 2625.
Dirac, P. A. M., 1937, Nature (London) 139, 323.
Dolgov, A. D., 1982, in The Very Early Universe: Proceedings of the 1982 Nuffield Workshop at Cambridge, edited by G. W. Gibbons, S. W. Hawking, and S. T. C. Siklos (Cambridge University, Cambridge), p. 449.
Dreitlein, J., 1974, Phys. Rev. Lett. 34, 777.
Eddington, A. S., 1924, The Mathematical Theory of Relativity, 2nd Ed. (Cambridge University, London).
Einstein, A., 1917, Sitzungsber. Preuss. Akad. Wiss. Phys.-Math. Kl. 142 [English translation in The Principle of Relativity (Methuen, 1923, reprinted by Dover Publications), p. 177; and in Bernstein and Feinberg, 1986].
Einstein, A., 1919, Sitzungsber. Preuss. Akad. Wiss., Phys.-Math. Kl. [English translation in The Principle of Relativity (Methuen, 1923, reprinted by Dover Publications), p. 191].
Ellis, J., C. Kounnas, and D. V. Nanopoulos, 1984, Nucl. Phys. B 241, 373.
Ellis, J., A. B. Lahanas, D. V. Nanopoulos, and K. Tamvakis, 1984, Phys. Lett. B 134, 429.
Ellis, J., N. C. Tsamis, and M. Voloshin, 1987, Phys. Lett. B 194, 291.
Fischler, W., and L. Susskind, 1988, "A wormhole catastrophe," Texas Preprint No. UTTG-26-88.
Ford, L. H., 1987, Phys. Rev. D 35, 2339.
Freese, K., F. C. Adams, J. A. Frieman, and E. Mottola, 1987, Nucl. Phys. B 287, 797.
Friedan, D., E. Martinec, and S. Shenker, 1986, Nucl. Phys. B 271, 93.
Friedmann, A., 1924, Z. Phys. 21, 326 [English translation in Bernstein and Feinberg, 1986, Eds., Cosmological Constants (Columbia University, New York)].
Gibbons, G. W., S. W. Hawking, and M. J. Perry, 1978, Nucl. Phys. B 138, 141.
Giddings, S. B., and A. Strominger, 1988a, Nucl. Phys. B 306, 890.
Giddings, S. B., and A. Strominger, 1988b, Nucl. Phys. B 307, 854.
Giddings, S. B., and A. Strominger, 1988c, "Baby universes, third quantization, and the cosmological constant," Harvard Preprint No. HUTP-88/A036.
Gilbert, G., 1988, "Wormhole induced proton decay," Caltech Preprint No. CALT-68-1524.
Grinstein, B., and M. B. Wise, 1988, "Light scalars in quantum gravity," Caltech Preprint No. CALT-68-1505.
Grisaru, M. T., W. Siegel, and M. Rocek, 1979, Nucl. Phys. B 159, 429.
Gross, D. J., 1984, Nucl. Phys. B 236, 349.
Gupta, A. K., and M. B. Wise, 1988, "Comment on wormhole correlations," Caltech Preprint No. CALT-68-1520.
Guth, A. H., 1981, Phys. Rev. D 23, 347.
Hartle, J. B., 1987, in Gravitation in Astrophysics: Cargese 1986, edited by B. Carter and J. B. Hartle (Plenum, New York), p. 329.
Hartle, J. B., and S. W. Hawking, 1983, Phys. Rev. D 28, 2960.
Hawking, S. W., 1978, Nucl. Phys. B 144, 349.
Hawking, S. W., 1979, in Three Hundred Years of Gravitation, edited by S. W. Hawking and W. Israel (Cambridge University, Cambridge).
Hawking, S. W., 1982, Commun. Math. Phys. 87, 395.
Hawking, S. W., 1983, Philos. Trans. R. Soc. London, Ser. A 310, 303.
Hawking, S. W., 1984a, Nucl. Phys. B 239, 257.
Hawking, S. W., 1984b, Phys. Lett. B 134, 403.
Hawking, S. W., 1984c, in Relativity, Groups and Topology II, NATO Advanced Study Institute Session XL, Les Houches 1983, edited by B. S. DeWitt and Raymond Stora (Elsevier, Amsterdam), p. 336.
Hawking, S. W., 1987a, remarks quoted by M. Gell-Mann, Phys. Scr. T15, 202 (1987).
Hawking, S. W., 1987b, Phys. Lett. B 195, 337.
Hawking, S. W., 1988, Phys. Rev. D 37, 904.
Hawking, S., and D. Page, 1986, Nucl. Phys. B 264, 185.
Henneaux, M., and C. Teitelboim, 1984, Phys. Lett. B 143, 415.
Henneaux, M., and C. Teitelboim, 1988, "The cosmological constant and general covariance," University of Texas preprint.
Hosoya, A., 1988, "A diagrammatic derivation of Coleman's vanishing cosmological constant," Hiroshima Preprint No. RRK-88-28.
Kim, J., 1979, Phys. Rev. Lett. 43, 103.
Klebanov, I., L. Susskind, and T. Banks, 1988, "Wormholes and cosmological constant," SLAC Preprint No. SLAC-Pub.-4705.
Knapp, G. R., and J. Kormendy, 1987, Eds., Dark Matter in the Universe: I.A.U. Symposium No. 117 (Reidel, Dordrecht).
Lavrelashvili, G. V., V. A. Rubakov, and P. G. Tinyakov, 1987, Pisma Zh. Eksp. Teor. Fiz. 46, 134 [JETP Lett. 46, 167 (1987)].
Lavrelashvili, G. V., V. A. Rubakov, and P. G. Tinyakov, 1988, Nucl. Phys. B 299, 757.
Lemaître, G., 1927, Ann. Soc. Sci. Bruxelles, Ser. 1 47, 49.
Lemaître, G., 1931, Mon. Not. R. Astron. Soc. 91, 483.
Linde, A. D., 1974, Pisma Zh. Eksp. Teor. Fiz. 19, 320 [JETP Lett. 19, 183 (1974)].
Linde, A. D., 1982, Phys. Lett. B 129, 389.
Linde, A. D., 1986, Phys. Lett. B 175, 395.
Linde, A. D., 1987, Phys. Scr. T15, 169.
Linde, A. D., 1988a, Phys. Lett. B 200, 272.
Linde, A. D., 1988b, Phys. Lett. B 202, 194.
Loh, E. D., 1986, Phys. Rev. Lett. 57, 2865.
Loh, E. D., and E. J. Spillar, 1986, Astrophys. J. 303, 154.
Martinec, E., 1986, Phys. Lett. B 171, 189.
Moore, G., 1987a, Nucl. Phys. B 293, 139.
Moore, G., 1987b, Institute for Advanced Study Preprint No. IASSNS-HEP-87/59, to be published in the proceedings of the Cargese School on Nonperturbative Quantum Field Theory.
Morozov, A., and A. Perelomov, 1987, Phys. Lett. B 199, 209.
Myers, R. C., and V. Periwal, 1988, "Constants and correlations in the Coleman calculus," Santa Barbara Preprint No. NSF-ITP-88-151.
Page, D., 1987, in The World and I (in press).
Pais, A., 1982, Subtle is the Lord . . .: The Science and the Life of Albert Einstein (Oxford University, New York).
Peccei, R. D., J. Sola, and C. Wetterich, 1987, Phys. Lett. B 195, 183.
Peebles, P. J. E., 1984, Astrophys. J. 284, 439.
Peebles, P. J. E., 1987a, in Proceedings of the Summer Study on the Physics of the Superconducting Super Collider, edited by R. Donaldson and J. Marx (Division of Particles and Fields of the APS, New York).
Peebles, P. J. E., 1987b, Publ. Astron. Soc. Pac., in press.
Peebles, P. J. E., and B. Ratra, 1988, Astrophys. J. Lett. 325, L17.
Petrosian, V., E. E. Salpeter, and P. Szekeres, 1967, Astrophys. J. 147, 1222.
Polchinski, J., 1986, Commun. Math. Phys. 104, 37.
Polchinski, J., 1987, private communication.
Polchinski, J., 1988, in preparation.
Ratra, B., and P. J. E. Peebles, 1988, Phys. Rev. D 37, 3406.
Rees, M. J., 1987, New Sci. August 6, 1987, p. 43.
Renzini, A., 1986, in Galaxy Distances and Deviations from Universal Expansion, edited by B. F. Madore and R. B. Tully (Reidel, Dordrecht), p. 177.
Reuter, M., and C. Wetterich, 1987, Phys. Lett. 188, 38.
Rohm, R., 1984, Nucl. Phys. B 237, 553.
Rowan-Robinson, M., 1968, Mon. Not. R. Astron. Soc. 141, 445.
Rubakov, V. A., 1988, "On the third quantization and the cosmological constant," DESY preprint.
Shklovsky, I., 1967, Astrophys. J. 150, L1.
Slipher, V. M., 1924, table in Eddington (1924), The Mathematical Theory of Relativity, 2nd Ed. (Cambridge University, London), p. 162.
Sparnaay, M. J., 1957, Nature (London) 180, 334.
Strominger, A., 1984, Phys. Rev. Lett. 52, 1733.
Teitelboim, C., 1982, Phys. Rev. D 25, 3159.
Turner, M. S., G. Steigman, and L. M. Krauss, 1984, Phys. Rev. Lett. 52, 2090.
Van der Bij, J. J., H. Van Dam, and Y. J. Ng, 1982, Physica 116A, 307.
Veltman, M., 1975, Phys. Rev. Lett. 34, 717.
Vilenkin, A., 1986, Phys. Rev. D 33, 3560.
Vilenkin, A., 1988a, Phys. Rev. D 37, 888.
Vilenkin, A., 1988b, Tufts Preprint No. TUTP-88-3.
Weinberg, S., 1972, Gravitation and Cosmology (Wiley, New York).
Weinberg, S., 1979a, in General Relativity: An Einstein Centenary Survey, edited by S. W. Hawking and W. Israel (Cambridge University, Cambridge), p. 800.
Weinberg, S., 1979b, Physica 96A, 327.
Weinberg, S., 1982, Phys. Rev. Lett. 48, 1776.
Weinberg, S., 1983, unpublished remarks at the workshop on "Problems in Unification and Supergravity," La Jolla Institute, 1983.
Weinberg, S., 1987, Phys. Rev. Lett. 59, 2607.
Wetterich, C., 1988, Nucl. Phys. B 302, 668.
Wheeler, J. A., 1964, in Relativity, Groups and Topology, Lectures Delivered at Les Houches, 1963, edited by B. DeWitt and C. DeWitt (Gordon and Breach, New York), p. 317.
Wheeler, J. A., 1968, in Battelle Rencontres, edited by C. DeWitt and J. A. Wheeler (Benjamin, New York).
Wilczek, F., 1984, Phys. Rep. 104, 143.
Wilczek, F., 1985, in How Far Are We from the Gauge Forces: Proceedings of the 1983 Erice Conference, edited by A. Zichichi (Plenum, New York), p. 208.
Wilczek, F., and A. Zee, 1983, unpublished work quoted by Zee (1985), in High Energy Physics: Proceedings of the Annual Orbis Scientiae, edited by S. L. Mintz and A. Perlmutter (Plenum, New York).
Witten, E., 1983, in Proceedings of the 1983 Shelter Island Conference on Quantum Field Theory and the Fundamental Problems of Physics, edited by R. Jackiw, N. N. Khuri, S. Weinberg, and E. Witten (MIT, Cambridge, Massachusetts), p. 273.
Witten, E., 1985, Phys. Lett. B 155, 151.
Witten, E., and J. Bagger, 1982, Phys. Lett. B 115, 202.
Zee, A., 1985, in High Energy Physics: Proceedings of the 20th Annual Orbis Scientiae, 1983, edited by S. L. Mintz and A. Perlmutter (Plenum, New York).
Zeldovich, Ya. B., 1967, Pisma Zh. Eksp. Teor. Fiz. 6, 883 [JETP Lett. 6, 316 (1967)].
Zumino, B., 1975, Nucl. Phys. B 89, 535.
Bharat Ratra
Department of Physics, Kansas State University, Manhattan, Kansas 66506
(Published 22 April 2003)
Physics welcomes the idea that space contains energy whose gravitational effect approximates that of
Einstein's cosmological constant, $\Lambda$; today the concept is termed dark energy or quintessence. Physics
also suggests that dark energy could be dynamical, allowing for the arguably appealing picture of an
evolving dark-energy density approaching its natural value, zero, and small now because the
expanding universe is old. This would alleviate the classical problem of the curious energy scale of a
millielectron volt associated with a constant $\Lambda$. Dark energy may have been detected by recent
cosmological tests. These tests make a good scientific case for the context, in the relativistic
Friedmann-Lemaître model, in which the gravitational inverse-square law is applied to the scales of
cosmology. We have well-checked evidence that the mean mass density is not much more than
one-quarter of the critical Einstein-de Sitter value. The case for detection of dark energy is not yet as
convincing but still serious; we await more data, which may be derived from work in progress. Planned
observations may detect the evolution of the dark-energy density: a positive result would be a
considerable stimulus for attempts at understanding the microphysics of dark energy. This review
presents the basic physics and astronomy of the subject, reviews the history of ideas, assesses the state
of the observational evidence, and comments on recent developments in the search for a fundamental
theory.
560 P. J. E. Peebles and Bharat Ratra: The cosmological constant and dark energy
puzzling is that the value of the dark-energy density has to be tiny compared to what is suggested by dimensional analysis; the startling new evidence is that it may be different from the only other natural value, zero.
The main question to consider now is whether to accept the evidence for detection of dark energy. We outline the nature of the case in this section. After reviewing the basic concepts of the relativistic world model in Sec. II, and in Sec. III reviewing the history of ideas, we present in Sec. IV a more detailed assessment of the cosmological tests and the evidence for detection of $\Lambda$ or its analog in dark energy.
There is little new to report on the big issue for physics (why the dark-energy density is so small) since Weinberg's (1989) review in this journal. But there have been analyses of a simpler idea: can we imagine that the present dark-energy density is evolving, perhaps approaching zero? Models are introduced in Secs. II.C and III.E, and recent work is summarized in more detail in the Appendix. Feasible advances in cosmological tests could detect evolution of the dark-energy density, and perhaps its gravitational response to large-scale fluctuations in the mass distribution. This would substantially motivate the search for a more fundamental physics model for dark energy.
A. The issues for observational cosmology
The reader is referred to Leibundgut's (2001, Sec. 4) discussion of astrophysical hazards. Astronomers have checks for this and other issues of interpretation when considering the observations used in cosmological tests. But it takes nothing away from this careful and elegant work to note that the checks are seldom convincing, because the astronomy is complicated and what can be observed is sparse. What is more, we do not know ahead of time that the physics well tested on scales ranging from the laboratory to the Solar System survives the enormous extrapolation to cosmology.
The situation is by no means hopeless. We now have significant cross-checks from the consistency of results based on independent applications of the astronomy and of the physics of the cosmological model. If the physics or astronomy was faulty we would not expect consistency from independent lines of evidence, apart from the occasional accident and the occasional tendency to stop the analysis when it approaches the right answer. We have to demand abundant evidence of consistency, and that is starting to appear.
The case for detection of $\Lambda$ or dark energy commences with the Friedmann-Lemaître cosmological model. In this model the expansion history of the universe is determined by a set of dimensionless parameters whose sum is normalized to unity,
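The normalization condition cited in this section as Eq. (1) is not reproduced in this copy. A reconstruction, assuming that the density parameters named in the surrounding discussion (matter, radiation, dark energy, and space curvature, each evaluated at the present epoch) form the complete set, would read

\[
\Omega_{M0} + \Omega_{R0} + \Omega_{\Lambda 0} + \Omega_{K0} = 1 . \tag{1}
\]

The subscript 0 marks present-day values. This form is an assumption inferred from the symbols used in the text, not a quotation of the original display.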
or three standard deviations off the best fit, depending on the data set and analysis technique. This is an important indication, but 2 to 3 $\sigma$ is not convincing, even when we can be sure that systematic errors are under reasonable control. And we have to consider that there may be a significant systematic error from differences between distant, high-redshift, and nearby, low-redshift, supernovae.
There is a check, based on the cold-dark-matter (CDM) model3 for structure formation. The fit of the model to the observations reviewed in Sec. IV.B yields two key constraints. First, the angular power spectrum of fluctuations in the temperature of the 3-K thermal cosmic microwave background radiation across the sky indicates that $\Omega_{K0}$ is small. Second, the power spectrum of the spatial distribution of the galaxies requires $\Omega_{M0} \sim 0.25$. Similar estimates of $\Omega_{M0}$ follow from independent lines of observational evidence. The rate of gravitational lensing prefers a somewhat larger value (if $\Omega_{K0}$ is small), and some dynamical analyses of systems of galaxies prefer lower $\Omega_{M0}$. But the differences could all result from measurement uncertainties. Since $\Omega_{R0}$ in Eq. (1) is small, the conclusion is that $\Omega_{\Lambda 0}$ is large, in excellent agreement with the supernovae result.
Caution is in order, however, because this check depends on the CDM model for structure formation. We cannot see the dark matter, so we naturally assign it the simplest properties possible. Maybe it is significant that the model has observational problems with galaxy formation, as discussed in Sec. IV.A.2, or maybe these problems are only apparent, due to the complications of the astronomy. We are going to have to determine which is correct before we can have confidence in the role of the CDM model in cosmological tests. We will get a strong hint from current precision angular distribution measurements of the 3-K thermal cosmic microwave background radiation.4 If the results match precisely the prediction of the relativistic model for cosmology and the CDM model for structure formation, with parameter choices that agree with the constraints from all the other cosmological tests, there will be strong evidence that we are approaching a good approximation to reality, and the completion of the great program of cosmological tests that commenced in the 1930s. But all that is in the future.
We wish to emphasize that the advances in the empirical basis for cosmology already are very real and substantial. How firm the conclusion is depends on the issue, of course. Every competent cosmologist we know accepts as established beyond reasonable doubt that the universe is expanding and cooling in a near homogeneous and isotropic way from a hotter denser state: how else could space, which is transparent now, have been filled with radiation that has relaxed to a thermal spectrum? The debate is when the expansion commenced or became a meaningful concept. Some whose opinions and research we respect question the extrapolation of the gravitational inverse-square law, in its use in estimates of masses in galaxies and systems of galaxies, and of $\Omega_{M0}$. We agree that this law is one of the hypotheses to be tested. Our conclusion from the cosmological tests of Sec. IV is that the law passes significant, though not yet complete, tests, and that we already have a strong scientific case, resting on the abundance of cross-checks, that the matter density parameter $\Omega_{M0}$ is about one-quarter. The case for detection of $\Omega_{\Lambda 0}$ is significant too, but not yet as compelling.
For the most part the results of the cosmological tests agree wonderfully well with accepted theory. But the observational challenges to the tests are substantial: we are drawing profound conclusions from very limited information. We have to be liberal when considering ideas about what the universe is like, and conservative when accepting ideas into the established canon.
B. The opportunity for physics
Unless there is some serious and quite unexpected flaw in our understanding of the principles of physics we can be sure the zero-point energy of the electromagnetic field at laboratory wavelengths is real and measurable, as in the Casimir (1948) effect.5 Like all energy, this zero-point energy has to contribute to the source term in Einstein's gravitational field equation. If, as seems likely, the zero-point energy of the electromagnetic field is close to homogeneous and independent of the velocity of the observer, it manifests itself as a positive contribution to Einstein's $\Lambda$, or dark energy. The zero-point energies of the fermions make a negative contribution. Other contributions, perhaps including the energy densities of fields that interact only with themselves and
3The model is named after the nonbaryonic cold dark matter that is assumed to dominate the masses of galaxies in the present universe. There are more assumptions in the CDM model, of course; they are discussed in Secs. III.D and IV.A.2.
4At the time of writing the Microwave Anisotropy Probe (MAP) satellite is collecting data; the project is described in http://map.gsfc.nasa.gov/
5See Bordag, Mohideen, and Mostepanenko (2001) for a recent review. The attractive Casimir force between two parallel conducting plates results from the boundary condition that suppresses the number of modes of oscillation of the electromagnetic field between the plates, thus suppressing the energy of the system. One can understand the effect at small separation without reference to the quantum behavior of the electromagnetic field, such as in the analysis of the van der Waals interaction in quantum mechanics, by taking account of the term in the particle Hamiltonian for the Coulomb potential energy between the charged particles in the two separate neutral objects. But a more complete treatment, as discussed by Cohen-Tannoudji, Dupont-Roc, and Grynberg (1992), replaces the Coulomb interaction with the coupling of the charged particles to the electromagnetic-field operator. In this picture the van der Waals interaction is mediated by the exchange of virtual photons. With either way of looking at the Casimir effect (the perturbation of the normal modes or the exchange of virtual quanta of the unperturbed modes) the effect is the same, the suppression of the energy of the system.
gravity, might have either sign. The value of the sum suggested by dimensional analysis is much larger than what is allowed by the relativistic cosmological model. The only other natural value is $\Lambda = 0$. If $\Lambda$ really is tiny but not zero, this introduces a most stimulating though enigmatic clue to the physics yet to be discovered.
To illustrate the problem we outline an example of a contribution to $\Lambda$. The energy density in the 3-K thermal cosmic microwave background radiation, which amounts to $\Omega_{R0} \sim 5\times 10^{-5}$ in Eq. (1) (ignoring the neutrinos), peaks at wavelength $\lambda \sim 2$ mm. At this Wien peak the photon occupation number is about one-fifteenth. The zero-point energy amounts to half the energy of a photon at the given frequency. This means the zero-point energy in the electromagnetic field at wavelengths $\lambda \sim 2$ mm amounts to a contribution $\delta\Omega_{\Lambda 0} \sim 4\times 10^{-4}$ to the density parameter in $\Lambda$ or the dark energy. The sum over the modes scales as $\lambda^{-4}$ [as illustrated in Eq. (37)]. Thus a naive extrapolation to visible wavelengths determines that the contribution amounts to $\delta\Omega_{\Lambda 0} \sim 5\times 10^{10}$, already a ridiculous number.
The situation can be compared to the development of the theory of weak interactions. The Fermi pointlike interaction model is strikingly successful for a considerable range of energies, but it was clear from the start that the model fails at high energy. A fix was discussed (mediate the interaction by an intermediate boson) and eventually incorporated into the even more successful electroweak theory. General relativity and quantum mechanics are extremely successful over a considerable range of length scales, provided we agree not to use the rules of quantum mechanics to count the zero-point energy density in the vacuum, even though we know we have to count the zero-point energies in all other situations. There are thoughts on improving the situation, though they seem to be less focused than was the case for the Fermi model. Perhaps a new energy component spontaneously cancels the vacuum energy density or the new component varies slowly with position and here and there happens to cancel the vacuum energy density well enough to allow observers like us to exist. Whatever the nature of the more perfect theory, it must reproduce the successes of general relativity and quantum mechanics. That includes the method of representing the material content of the observable universe (all forms of mass and energy) by the stress-energy tensor, and the relation between the stress-energy tensor and the curvature of macroscopic spacetime. One part has to be adjusted.
The numerical values of the parameters in Eq. (1) also are enigmatic, and possibly trying to tell us something. The evidence is that the parameters have the approximate values
We have written $\Omega_{M0}$ in two parts: $\Omega_{B0}$ measures the density of the baryons we know exist and $\Omega_{DM0}$ measures the hypothetical nonbaryonic cold dark matter we need to fit the cosmological tests. The parameters $\Omega_{B0}$ and $\Omega_{DM0}$ have similar values but represent different things (baryonic and nonbaryonic matter) and $\Omega_{\Lambda 0}$, which is thought to represent something completely different, is not much larger. Also, if the parameters were measured when the universe was one-tenth its present size the time-independent $\Lambda$ parameter would contribute $\Omega_{\Lambda} \sim 0.003$. That is, we seem to have come on the scene just as $\Lambda$ has become an important factor in the expansion rate. These curiosities surely are in part accidental, but maybe in part physically significant. In particular, one might imagine that the dark-energy density represented by $\Lambda$ is rolling to its natural value, zero, but is very small now because we measure it when the universe is very old. We shall discuss efforts along this line to at least partially rationalize the situation.
C. Some explanations
We have to explain our choice of nomenclature. Basic concepts of physics say that space contains homogeneous zero-point energy, and perhaps also energy that is homogeneous or nearly so in other forms, real or effective (such as from counter terms in gravity physics, which make the net energy density cosmologically acceptable). In the literature this near homogeneous energy has been termed vacuum energy, the sum of vacuum energy and quintessence (Caldwell, Davé, and Steinhardt, 1998), and dark energy (Turner, 1999). We have adopted the last term, and we shall refer to the dark-energy density $\rho_{\Lambda}$ that manifests itself as an effective version of Einstein's cosmological constant, but one that may vary slowly with time and position.6
Our subject involves two quite different traditions, in physics and astronomy. Each has familiar notation, and familiar ideas that may be in the air but not in recent literature. Our attempt to take account of these traditions commences with the summary in Sec. II of the basic notation with brief explanations. We expect that readers will find some of these concepts trivial and others of some use, and that the useful parts will be different for different readers.
We offer in Sec. III our reading of the history of ideas on $\Lambda$ and its generalization to dark energy. This is a fascinating and we think edifying illustration of how science may advance in unexpected directions. It is relevant to an understanding of the present state of research in cosmology, because traditions inform opinions, and people have had mixed feelings about $\Lambda$ ever since Einstein (1917) introduced it 85 years ago. The concept never entirely disappeared in cosmology because a series of observations hinted at its presence, and because to some cosmologists $\Lambda$ fits the formalism too well to be ignored. The search for the physics of the vacuum, and its possible relation to $\Lambda$, has a long history too. Despite
6The dark energy should of course be distinguished from a hypothetical gas of particles with velocity dispersion large enough that the distribution is close to homogeneous.
P. J. E. Peebles and Bharat Ratra: The cosmological constant and dark energy 563
the common and strong suspicion that $\Lambda$ must be negligibly small, because any other acceptable value is absurd, all this history has made the community well prepared for the recent observational developments that argue for the detection of $\Lambda$.

Our approach in Sec. IV to the discussion of the evidence for detection of $\Lambda$, from the cosmological tests, also requires explanation. One occasionally reads that the tests will show us how the world will end. That certainly seems interesting, but it is not the main point: why should we trust an extrapolation into the indefinite future of a theory that we can at best show is a good approximation to reality? As we remarked in Sec. I.A, the purpose of the tests is to check the approximation to reality, by checking the physics and astronomy of the standard relativistic cosmological model, along with any viable alternatives that may be discovered. We take our task to be the identification of the aspects of the standard theory that enter the interpretation of the measurements and thus are or may be empirically checked or measured.

II. BASIC CONCEPTS

A. The Friedmann-Lemaître model

those who have not already thought to do so, to check that Eq. (4) is required to preserve homogeneity and isotropy.

The rate of change of the distance in Eq. (4) is the speed

$$ v = dl/dt = Hl, \qquad H = \dot{a}/a, \qquad (5) $$

where the overdot means the derivative with respect to world time $t$ and $H$ is the time-dependent Hubble parameter. When $v$ is small compared to the speed of light this is Hubble's law. The present value of $H$ is Hubble's constant, $H_0$. When needed we will use

$$ H_0 = 100h\ \mathrm{km\,s^{-1}\,Mpc^{-1}} = 67 \pm 7\ \mathrm{km\,s^{-1}\,Mpc^{-1}} = (15 \pm 2\ \mathrm{Gyr})^{-1}, \qquad (6) $$

at two standard deviations. The first equation defines the dimensionless parameter $h$.

Another measure of the expansion follows by considering the stretching of the wavelength of light received from a distant galaxy. The observed wavelength $\lambda_{\rm obs}$ of a feature in the spectrum that had wavelength $\lambda_{\rm em}$ at emission satisfies

$$ 1 + z = \lambda_{\rm obs}/\lambda_{\rm em} = a(t_{\rm obs})/a(t_{\rm em}). \qquad (7) $$
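The unit conversion behind Eq. (6) is easy to check numerically. The following is a minimal sketch, not part of the original review: the function name `hubble_time_gyr` and the two conversion constants are introduced here for illustration only. It converts a Hubble constant in km s^-1 Mpc^-1 to the Hubble time 1/H_0 in Gyr.

```python
# Sketch of the unit conversion in Eq. (6): H0 in km/s/Mpc -> 1/H0 in Gyr.
# The helper name and constants below are illustrative, not from the review.
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 1e9 Julian years


def hubble_time_gyr(h0_km_s_mpc: float) -> float:
    """Return the Hubble time 1/H0 in Gyr for H0 given in km s^-1 Mpc^-1."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC  # H0 in units of 1/s
    return 1.0 / h0_per_second / SECONDS_PER_GYR


# Central value and the two-standard-deviation range quoted in Eq. (6).
for h0 in (60.0, 67.0, 74.0):
    print(f"H0 = {h0:5.1f} km/s/Mpc  ->  1/H0 = {hubble_time_gyr(h0):5.2f} Gyr")
```

With the central value of 67 km s^-1 Mpc^-1 this gives a Hubble time between 14 and 15 Gyr, in line with the rounded value in Eq. (6).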
The gravitational constant is $G$. Here and throughout we choose units to set the velocity of light to unity. The mean mass density, $\rho(t)$, and the pressure, $p(t)$, counting all contributions including dark energy, satisfy the local energy-conservation law

$$ \dot{\rho} = -3\,\frac{\dot{a}}{a}\,(\rho + p). \qquad (9) $$

The first term on the right-hand side represents the decrease of mass density due to the expansion that more broadly disperses the matter. The $p\,dV$ work in the second term is a familiar local concept, and is meaningful in general relativity. But one should note that energy does not have a general global meaning in this theory.

The first integral of Eqs. (8) and (9) is the Friedmann equation

$$ \dot{a}^2 = \frac{8}{3}\pi G \rho a^2 + \mathrm{const}. \qquad (10) $$

It is conventional to rewrite this as the expansion-rate equation (11).

$$ ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu. \qquad (13) $$

The repeated indices are summed, and the metric tensor $g_{\mu\nu}$ is a function of position in spacetime. If $ds^2$ is positive then $ds$ is the proper (physical) time measured by an observer who moves from one event to the other; if negative, $|ds|$ is the proper distance between the events measured by an observer who is moving so the events are seen to be simultaneous.

In the flat spacetime of special relativity one can choose coordinates so the metric tensor has the Minkowskian form

$$ \eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. $$

A freely falling, inertial observer can choose locally Minkowskian coordinates, such that along the path of the observer $g_{\mu\nu} = \eta_{\mu\nu}$ and the first derivatives of $g_{\mu\nu}$ vanish.
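The energy-conservation law, Eq. (9), can be explored numerically. The sketch below is an illustration under a stated assumption rather than anything taken from the review: it assumes a constant pressure-to-density ratio `w` (so p = w * rho), a parametrization introduced here only for the example. Rewriting Eq. (9) with the expansion parameter as the independent variable gives d(rho)/da = -3 (rho + p)/a, which the hypothetical helper `integrate_density` integrates with a classical Runge-Kutta step.

```python
# Sketch: integrate Eq. (9) in the form d(rho)/da = -3 (rho + p) / a,
# assuming p = w * rho with constant w (an assumption made for this example).
def integrate_density(w: float, a_end: float = 2.0, steps: int = 2000) -> float:
    """Return rho(a_end) / rho(a = 1) using fourth-order Runge-Kutta steps."""

    def drho_da(a: float, rho: float) -> float:
        return -3.0 * (1.0 + w) * rho / a

    a, rho = 1.0, 1.0
    h = (a_end - 1.0) / steps
    for _ in range(steps):
        k1 = drho_da(a, rho)
        k2 = drho_da(a + h / 2, rho + h * k1 / 2)
        k3 = drho_da(a + h / 2, rho + h * k2 / 2)
        k4 = drho_da(a + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        a += h
    return rho


# Pressureless matter, radiation (p = rho/3), and a cosmological constant (p = -rho).
for label, w in (("matter", 0.0), ("radiation", 1.0 / 3.0), ("Lambda", -1.0)):
    print(f"{label:9s} w = {w:+.3f}  rho(a=2)/rho(a=1) = {integrate_density(w):.6f}")
```

Under this assumption the numerical result should track the analytic scaling rho proportional to a^(-3(1+w)): doubling the expansion parameter dilutes matter by a factor of 8 and radiation by a factor of 16, and leaves the density of a p = -rho component unchanged.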
$p = -\rho$: the streaming velocity loses meaning. When $c_s^2$ is negative Eq. (23) shows that the fluid is unstable, in general. But when $p = -\rho$ the vanishing divergence of $T^{\mu\nu}$ becomes the condition shown in Eq. (22), that $\rho = \langle\rho\rangle + \delta\rho$ is constant.

There are two measures of gravitational interactions with a fluid: the passive gravitational mass density determines how the fluid streaming velocity is affected by an applied gravitational field, and the active gravitational mass density determines the gravitational field produced by the fluid. When the fluid velocity is nonrelativistic the expression for the former in general relativity is $\rho + p$, as one can determine by writing out the covariant divergence of $T^{\mu\nu}$. This vanishes when $p = -\rho$, consistent with the loss of meaning of the streaming velocity. The latter is $\rho + 3p$, as one can see from Eq. (8). Thus a fluid with $p = -\rho/3$, if somehow kept homogeneous and static, would produce no gravitational field. In the model in Eqs. (19) and (21) the active gravitational mass density is negative when $\rho_\Lambda$ is positive. When this positive $\rho_\Lambda$ dominates the stress-energy tensor, $\ddot{a}$ is positive: the rate of expansion of the universe increases. In the language of Eq. (20), this cosmic repulsion is a gravitational effect of the negative active gravitational mass density, not a new force law.

The homogeneous active mass represented by $\Lambda$ changes the equation of relative motion of freely moving test particles in the nonrelativistic limit to

$$ \frac{d^2\vec{r}}{dt^2} = \vec{g} + \Omega_{\Lambda 0} H_0^2\,\vec{r}, \qquad (24) $$

where $\vec{g}$ is the relative gravitational acceleration produced by the distribution of ordinary matter. For an illustration of the size of the last term consider its effect on our motion in a nearly circular orbit around the center of the Milky Way galaxy. The Solar System is moving at speed $v_c = 220$ km s$^{-1}$ at radius $r = 8$ kpc. The ratio of the acceleration $g_\Lambda$ produced by $\Lambda$ to the total gravitational acceleration $g = v_c^2/r$ is

$$ g_\Lambda/g = \Omega_{\Lambda 0} H_0^2 r^2 / v_c^2 \sim 10^{-5}, \qquad (25) $$

a small number. Since we are near the edge of the luminous part of our galaxy, a search for the effect of $\Lambda$ on the internal dynamics of galaxies such as the Milky Way does not look promising. The precision of celestial dynamics in the Solar System is much greater, but the effect of $\Lambda$ is very much smaller; $g_\Lambda/g \sim 10^{-22}$ for the orbit of the Earth.

One can generalize Eq. (19) to a variable $\rho_\Lambda$, by taking $p_\Lambda$ to be negative but different from $-\rho_\Lambda$. But if the dynamics were that of a fluid, with pressure a function of $\rho_\Lambda$, stability would require $c_s^2 = dp_\Lambda/d\rho_\Lambda > 0$, from Eq. (23), which seems quite contrived. A viable working model for a dynamical $\rho_\Lambda$ is the dark energy of a scalar field with self-interaction potential chosen to make the variation of the field energy acceptably slow, as discussed next.

C. Inflation and dark energy

The negative active gravitational mass density associated with a positive cosmological constant is an early precursor of the inflation picture of the early universe; inflation in turn is one precursor of the idea that $\Lambda$ might generalize into evolving dark energy.

To begin, we review some aspects of causal relations between events in spacetime. Neglecting space curvature, a light ray moves a proper distance $dl = a(t)\,dx = dt$ in time interval $dt$, so the integrated coordinate displacement is

$$ x = \int dt/a(t). \qquad (26) $$

If $\Omega_{\Lambda 0} = 0$ this integral converges in the past; we see distant galaxies that at the time of observation cannot have seen us since the singular start of expansion at $a = 0$. This particle horizon problem is curious: how could distant galaxies in different directions in the sky know to look so similar? The inflation idea is that in the early universe the expansion history approximates that of de Sitter's (1917) solution to Einstein's field equation for $\Lambda > 0$ and $T_{\mu\nu} = 0$ in Eq. (20). We can choose the coordinate labels in this de Sitter spacetime so space curvature vanishes. Then Eqs. (11) and (12) show that the expansion parameter is

$$ a \propto e^{H_\Lambda t}, \qquad (27) $$

where $H_\Lambda$ is a constant. As one sees by working the integral in Eq. (26), here everyone can have seen everyone else in the past. The details need not concern us; for the following discussion two concepts are important. First, the early universe acts like an approximation to de Sitter's solution because it is dominated by a large effective cosmological constant, or dark-energy density. Second, the dark energy is modeled as that of a near homogeneous field, $\Phi$.

In this scalar field model, motivated by grand unified models of very-high-energy particle physics, the action of the real scalar field $\Phi$ (in units chosen so that Planck's constant $\hbar$ is unity) is

¹¹ Lest we contribute to a wrong problem for the student we note that a fluid with $p = -\rho/3$ held in a container would have net positive gravitational mass, from the pressure in the container walls required for support against the negative pressure of the contents. We have finessed the walls by considering a homogeneous situation. We believe Whittaker (1935) gives the first derivation of the relativistic active gravitational mass density. Whittaker also presents an example of the general proposition that the active gravitational mass of an isolated stable object is the integral of the time-time part of the stress-energy tensor in the locally Minkowskian rest frame. Misner and Putman (1959) give the general demonstration.

¹² This assumes that the particles are close enough for application of the ordinary operational definition of proper relative position. The parameters in the last term follow from Eqs. (8) and (21).
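The two order-of-magnitude estimates of the ratio in Eq. (25) can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the authors' calculation: the function name `lambda_acceleration_ratio` is introduced here, the value 0.75 for the dark-energy density parameter is an assumption (a spatially flat model with matter density parameter near 0.25), and the Earth's orbital speed of 29.8 km/s and the unit conversions are standard values supplied for the example.

```python
# Sketch of Eq. (25): g_Lambda / g = Omega_Lambda0 * H0^2 * r^2 / v^2
# for a body on a circular orbit of radius r and speed v.
KM_PER_KPC = 3.0857e16      # kilometres in one kiloparsec
KM_PER_AU = 1.495978707e8   # kilometres in one astronomical unit


def lambda_acceleration_ratio(omega_lambda0: float, h0_km_s_mpc: float,
                              r_kpc: float, v_km_s: float) -> float:
    """Ratio of the Lambda acceleration to the orbital acceleration v^2 / r."""
    h0_km_s_kpc = h0_km_s_mpc / 1000.0  # 1 Mpc = 1000 kpc
    return omega_lambda0 * (h0_km_s_kpc * r_kpc / v_km_s) ** 2


OMEGA_LAMBDA0 = 0.75  # assumed value, not taken from the text above
H0 = 67.0             # km/s/Mpc, central value of Eq. (6)

# Solar System orbit in the Milky Way: r = 8 kpc, v = 220 km/s.
milky_way = lambda_acceleration_ratio(OMEGA_LAMBDA0, H0, 8.0, 220.0)
# Earth's orbit around the Sun: r = 1 AU, v = 29.8 km/s (standard value).
earth = lambda_acceleration_ratio(OMEGA_LAMBDA0, H0, KM_PER_AU / KM_PER_KPC, 29.8)

print(f"Milky Way orbit: g_Lambda/g = {milky_way:.1e}")
print(f"Earth orbit:     g_Lambda/g = {earth:.1e}")
```

Both results land within a factor of a few of the rounded figures in the text, which is all that the order-of-magnitude statements claim.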
so the two terms on the right-hand side cancel, but the balance can be upset by redistributing the mass.¹³

Einstein did not consider the cosmological constant to be part of the stress-energy term: his form for the field equation [in the streamlined notation of Eq. (17)] is

$$ G_{\mu\nu} - 8\pi G \rho_\Lambda g_{\mu\nu} = 8\pi G\,T_{\mu\nu}. \qquad (34) $$

The left-hand side contains the metric tensor and its derivatives; a new constant of nature, $\Lambda$, appears in the addition to Einstein's original field equation. One can equally place Einstein's new term on the right-hand side of the equation, as in Eq. (20), and count $\rho_\Lambda g_{\mu\nu}$ as part of the source term in the stress-energy tensor. The distinction becomes interesting when $\rho_\Lambda$ takes part in the dynamics, and the field equation is properly written with $\rho_\Lambda$, or its generalization, as part of the stress-energy tensor. One would then be able to say that the differential equation of gravity physics has not changed from Einstein's original form; instead there is a new component in the content of the universe.

Having assumed that the universe is static, Einstein did not write down the differential equation for $a(t)$, and so did not see the instability. Friedmann (1922, 1924) found the evolving homogeneous solution, but had the misfortune to do so before the astronomy became suggestive. Slipher's measurements of the spectra of the spiral nebulae (galaxies of stars) showed most are shifted toward the red, and Eddington (1924, pp. 161 and 162) remarked that that might be a manifestation of the second, repulsive term in Eq. (24). Lemaître (1927) introduced the relation between Slipher's redshifts and a homogeneous matter-filled expanding relativistic world model. He may have been influenced by Hubble's work, which led to the publication (Hubble, 1929) of the linear redshift-distance relation [Eq. (5)]: as a graduate student at MIT Lemaître attended a lecture by Hubble.

In Lemaître's (1927) solution, the expanding universe traces asymptotically back to Einstein's static case. Lemaître then turned to what he called the primeval atom, which is now termed the Big Bang model. This solution expands from densities so large that they require some sort of quantum treatment, passes through a quasistatic approximation to Einstein's solution, and then continues expanding to de Sitter's (1917) empty space solution. To modern tastes, this loitering model requires incredibly special initial conditions, as will be discussed. Lemaître liked it because the loitering epoch allows the expansion time to be acceptably long for Hubble's (1929) estimate of $H_0$, which is an order-of-magnitude high.

The record shows Einstein never liked the $\Lambda$ term. His view of how general relativity might fit Mach's principle was disturbed by de Sitter's (1917) solution to Eq. (34) for empty space ($T_{\mu\nu} = 0$) with $\Lambda > 0$.¹⁴,¹⁵ Pais (1982, p. 288) pointed out that Einstein, in a letter to Weyl in 1923, commented on the effect of $\Lambda$ in Eq. (24): "According to De Sitter, two material points that are sufficiently far apart, continue to be accelerated and move apart. If there is no quasistatic world, then away with the cosmological term." We do not know whether at this time Einstein was influenced by Slipher's redshifts or Friedmann's expanding world model.

¹³ To help motivate the introduction of $\Lambda$, Einstein (1917) mentioned a modification of Newtonian gravity physics that could render the theory well defined when the mass distribution is homogeneous. In Einstein's example, similar to what was considered by Seeliger and Neumann in the mid-1890s, the modified field equation for the gravitational potential $\varphi$ is $\nabla^2\varphi - \lambda\varphi = 4\pi G\rho_M$. This allows the nonsingular homogeneous static solution $\varphi = -4\pi G\rho_M/\lambda$. In this example the potential for an isolated point mass is the Yukawa form, $\varphi \propto e^{-\sqrt{\lambda}\,r}/r$. Trautman (1965) pointed out that this is not the nonrelativistic limit of general relativity with the cosmological term. Rather, Eq. (24) follows from $\nabla^2\varphi = 4\pi G(\rho_M - 2\rho_\Lambda)$, where the active gravitational mass density of the $\Lambda$ term is $\rho_\Lambda + 3p_\Lambda = -2\rho_\Lambda$. Norton (1999) reviewed the history of ideas of this Seeliger-Neumann Yukawa-type potential in gravity physics.

¹⁴ North (1965) reviews the confused early history of ideas on the possible astronomical significance of de Sitter's solution for an empty universe with $\Lambda > 0$; we add a few comments regarding the physics that contributed to the discovery of the expanding world model. Suppose an observer in de Sitter's spacetime holds a string tied to a source of light, so the source stays at fixed physical distance $r < H_\Lambda^{-1}$. The source is much less massive than the observer, the gravitational frequency shift due to the observer's mass may be neglected, and the observer is moving freely. Then the observer receives light from the source shifted to the red by $\delta\lambda/\lambda = -(H_\Lambda r)^2/2$. The observed redshifts of particles moving on geodesics depend on the initial conditions. Stars in the outskirts of our galaxy are held at fixed mean distances from Earth by their motions. The mean shifts of the spectra of light from these stars include this quadratic de Sitter term as well as the much larger Doppler and ordinary gravitational shifts. The prescription for initial conditions that reproduces the linear redshift-distance relation for distant galaxies follows Weyl's (1923) principle: the world particle geodesics trace back to a near common position in the remote past, in the limiting case of the Friedmann-Lemaître model at $\Omega_{M0} \to 0$. This spatially homogeneous coordinate labeling of de Sitter's spacetime, with space sections with negative curvature, already appears in de Sitter [1917, Eq. (15)], and is repeated in Lanczos (1922). This line element is the second expression in our Eq. (15) with $a \propto \cosh H_\Lambda t$. Lemaître (1925) and Robertson (1928) present the coordinate labeling for the spatially flat case, where the line element is $ds^2 = dt^2 - e^{2H_\Lambda t}(dx^2 + dy^2 + dz^2)$ [in the choice of symbols and signature in Eqs. (15) and (27)]. Lemaître (1925) and Robertson (1928) note that particles at rest in this coordinate system present a linear redshift-distance relation, $v = H_\Lambda r$, at small $v$. Robertson (1928) estimated $H_\Lambda$, and Lemaître (1927) its analog for the Friedmann-Lemaître model, from published redshifts and Hubble's galaxy distances. Their estimates are not far off Hubble's (1929) published value.

¹⁵ To the present way of thinking the lengthy debate about the singularity in de Sitter's static solution, chronicled by North (1965), seems surprising, because de Sitter (1917) and Klein (1918) had presented de Sitter's solution as a sphere embedded in 4-plus-1-dimensional flat space, with no physical singularity.
The earliest published comments we have found on Einstein's opinion of $\Lambda$ within the evolving world model (Einstein, 1931; Einstein and de Sitter, 1932) make the point that, since not all the terms in the expansion-rate equation (11) are logically required, and the matter term surely is present and likely dominates over radiation at low redshift, a reasonable working model drops $\Omega_{K0}$ and $\Omega_{\Lambda 0}$ and ignores $\Omega_{R0}$. This simplifies the expansion-rate equation to what has come to be called the Einstein-de Sitter model,

$$ \frac{\dot{a}^2}{a^2} = \frac{8}{3}\pi G \rho_M, \qquad (35) $$

where $\rho_M$ is the mass density in nonrelativistic matter; here $\Omega_M = 8\pi G\rho_M/(3H^2)$ is unity. The left side is a measure of the kinetic energy of expansion per unit mass, and the right-hand side a measure of the negative of the gravitational potential energy. In effect, this model universe expands with escape velocity.

Einstein and de Sitter point out that Hubble's estimate of $H_0$ and de Sitter's estimate of the mean mass density in galaxies are not inconsistent with Eq. (35) (and since both quantities scale with distance in the same way, this result is not affected by the error in the distance scale that affected Hubble's initial measurement of $H_0$). But the evidence shows now that the mass density is about one-quarter of what is predicted by this equation, as we will discuss.

Einstein and de Sitter (1932) remarked that the curvature term in Eq. (11) "is essentially determinable, and an increase in the precision of the data derived from observations will enable us in the future to fix its sign and determine its value." This is happening, 70 years later. The cosmological constant term is measurable, in principle, too, and may now have been detected. But Einstein and de Sitter said only that the theory of an expanding universe with finite mean mass density "can be reached without the introduction of" $\Lambda$.

Further to this point, in the appendix of the second edition of his book, The Meaning of Relativity, Einstein (1945, p. 127) states that the "introduction of the 'cosmologic member'" (Einstein's terminology for $\Lambda$) "into the equations of gravity, though possible from the point of view of relativity, is to be rejected from the point of view of logical economy," and that if "Hubble's expansion had been discovered at the time of the creation of the general theory of relativity, the cosmologic member would never have been introduced. It seems now so much less justified to introduce such a member into the field equations, since its introduction loses its sole original justification, that of leading to a natural solution of the cosmologic problem." Einstein knew that without the cosmological constant the expansion time derived from Hubble's estimate of $H_0$ is uncomfortably short compared to estimates of the ages of the stars, and opined that that might be a problem with the star ages. The big error, the value of $H_0$, was corrected by 1960 (Sandage, 1958, 1962).

Gamow (1970, p. 44) recalls that "when I was discussing cosmological problems with Einstein, he remarked that the introduction of the cosmological term was the biggest blunder he ever made in his life." This certainly is consistent with all of Einstein's written comments that we have seen on the cosmological constant per se; we do not know whether Einstein was also referring to the missed chance to predict the evolution of the universe.

B. The development of ideas

1. Early indications of $\Lambda$

In the classic book, The Classical Theory of Fields, Landau and Lifshitz (1951, p. 338) second Einstein's opinion of the cosmological constant $\Lambda$, stating there is "no basis whatsoever" for adjustment of the theory to include this term. The empirical side of cosmology is not much mentioned in this book, however (though there is a perceptive comment on the limited empirical support for the homogeneity assumption: p. 332). In the Supplementary Notes to the English translation of his book, Theory of Relativity, Pauli (1958, p. 220) also endorses Einstein's position.

Discussions elsewhere in the literature on how one might find empirical constraints on the values of the cosmological parameters usually take account of $\Lambda$. The continued interest was at least in part driven by indications that $\Lambda$ might be needed to reconcile theory and observations. Here are three examples.

First, the expansion time is uncomfortably short if $\Lambda = 0$. Sandage's recalibration of the distance scale in the 1960s indicates $H_0 \sim 75$ km s$^{-1}$ Mpc$^{-1}$. If $\Lambda = 0$ this shows that the time of expansion from densities too high for stars to have existed is $< H_0^{-1} \simeq 13$ Gyr, maybe less than the ages of the oldest stars, then estimated to be greater than about 15 Gyr. Sandage (1961a) points out that the problem is removed by adding a positive $\Lambda$. The present estimates reviewed below (Sec. IV.B.3) are not far from these numbers, but still too uncertain for a significant case for $\Lambda$.

Second, counts of quasars as a function of redshift show a peak at $z \sim 2$, as would be produced by the loitering epoch in Lemaître's $\Lambda$ model (Petrosian, Salpeter, and Szekeres, 1967; Shklovsky, 1967; Kardashev, 1967). The peak is now well established, centered at $z \sim 2.5$ (Croom et al., 2001; Fan et al., 2001). It is usually interpreted as the evolution in the rate of violent activity in the nuclei of galaxies, though in the absence of a loitering epoch the indicated sharp variation in quasar activity with time is curious (but certainly could be a consequence of astrophysics that is not well understood).

The third example is the redshift-magnitude relation. Sandage's (1961a) analysis indicates this is a promising method of distinguishing world models. The Gunn and Oke (1975) measurement of this relation for giant elliptical galaxies, with Tinsley's (1972) correction for evolution of the star population from assumed formation at high redshift, indicates curvature away from the linear relation in the direction that, as Gunn and Tinsley
(1975) discuss, could only be produced by $\Lambda$ (within general relativity theory). The new application of the redshift-magnitude test, to type-Ia supernovae (Sec. IV.B.4), is not inconsistent with the Gunn-Oke measurement; we do not know whether this agreement of the measurements is significant, because Gunn and Oke were worried about galaxy evolution.¹⁶

2. The coincidences argument against $\Lambda$

An argument against an observationally interesting value of $\Lambda$, from our distrust of accidental coincidences, has been in the air for decades, and became very influential in the early 1980s with the introduction of the inflation scenario for the very early universe.

If the Einstein-de Sitter model in Eq. (35) were a good approximation at the present epoch, an observer measuring the mean mass density and Hubble's constant when the age of the universe was one-tenth the present value, or ten times the present age, would reach the same conclusion, that the Einstein-de Sitter model is a good approximation. That is, we would flourish at a time that is not special in the course of evolution of the universe. If, on the other hand, two or more of the terms in the expansion-rate equation (11) made substantial contributions to the present value of the expansion rate, it would mean that we are present at a special epoch, because each term in Eq. (11) varies with the expansion factor in a different way. To put this in more detail, we imagine that the physics of the very early universe, when the relativistic cosmological model became a good approximation, set the values of the cosmological parameters. The initial values of the contributions to the expansion-rate equation had to have been very different from each other, and exceedingly specially fixed, to yield two $\Omega_{i0}$'s with comparable values. This would be a most remarkable and unlikely coincidence. The multiple coincidences required for the near vanishing of $\dot{a}$ and $\ddot{a}$ at a redshift not much larger than unity makes an even stronger case against Lemaître's loitering model, with this line of argument.

The earliest published comment we have found on this point is by Bondi (1960, p. 166), in the second edition of his book Cosmology. Bondi notes the "remarkable property" of the Einstein-de Sitter model: the dimensionless parameter we now call $\Omega_M$ is independent of the time at which it is computed (since it is unity). The coincidences argument follows and extends Bondi's comment. It is presented in McCrea (1971, p. 151). When Peebles was a postdoctoral research associate, in the early 1960s, in R. H. Dicke's gravity research group, the coincidences argument was discussed, but published much later (Dicke, 1970, p. 62; Dicke and Peebles, 1979). We do not know its provenance in Dicke's group, whether from Bondi, McCrea, Dicke, or someone else. We would not be surprised to learn others had similar thoughts.

The coincidences argument is sensible but not a proof, of course. The discovery of the 3-K thermal cosmic microwave background radiation gave us a term in the expansion-rate equation that is down from the dominant one by four orders of magnitude, not such a large factor by astronomical standards. This might be counted as a first step away from the argument. From the dynamics of galaxies the evidence that $\Omega_{M0}$ is less than unity is another step (Peebles, 1984, p. 442; 1986). And yet another is the development of the evidence that the $\Lambda$ and dark-matter terms differ by only a factor of 3 [Eq. (2)]. This last piece is the most curious, but the community has come to accept it, for the most part. The precedent makes Lemaître's loitering model more socially acceptable.

A socially acceptable value of $\Lambda$ cannot be such as to make life impossible, of course.¹⁷ But perhaps the most productive interpretation of the coincidences argument is that it demands a search for a more fundamental underlying model. This is discussed further in Sec. III.E and the Appendix.

3. Vacuum energy and $\Lambda$

Another tradition to consider is the relation between $\Lambda$ and the vacuum or dark-energy density. In one approach to the motivation for the Einstein field equation, taken by McVittie (1956) and others, $\Lambda$ appears as a constant of integration (of the expression for local conservation of energy and momentum). McVittie (1956, p. 35) emphasizes that, as a constant of integration, $\Lambda$ "cannot be assigned any particular value on a priori grounds." Interesting variants of this line of thought are still under discussion (Weinberg, 1989; Unruh, 1989; and references therein).

The notion of $\Lambda$ as a constant of integration may be related to the issue of the zero point of energy. In laboratory physics one measures and computes energy differences. But the net energy matters for gravity physics, and one can imagine that $\Lambda$ represents the difference between the true energy density and the sum at which one arrives by laboratory physics. Eddington (1939) and Lemaître (1934, 1949) make this point.

¹⁶ Early measurements of the redshift-magnitude relation were meant in part to test the Steady State cosmology of Bondi and Gold (1948) and Hoyle (1948). Since Steady State cosmology assumes spacetime is independent of time its line element has to have the form of the de Sitter solution with $\Omega_{K0} = 0$ and the expansion parameter in Eq. (27). The measured curvature of the redshift-magnitude relation is in the direction predicted by Steady State cosmology. But this cosmology fails other tests discussed in Sec. IV.B.

¹⁷ If $\Lambda$ were negative and the magnitude too large there would not be enough time for the emergence of life such as ours. If $\Lambda$ were positive and too large the universe would expand too rapidly to allow galaxy formation. Our existence, which requires something resembling the Milky Way galaxy to contain and recycle heavy elements, thus provides an upper bound on the value of $\Lambda$. Such anthropic considerations are discussed by Weinberg (1987, 2001) and Vilenkin (2001), and references therein.
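The coincidences argument of Sec. III.B.2 can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not a computation from the review: it keeps only the matter and cosmological-constant terms of the expansion-rate equation (curvature and radiation are neglected), uses the standard scalings in which the matter term varies as a^-3 while the Lambda term is constant, adopts example present-day values of 0.25 and 0.75, and compares epochs by expansion factor rather than by age. The function name `omega_matter` is introduced here for the example.

```python
# Sketch: fraction of the expansion rate squared contributed by matter at
# expansion factor a (a = 1 today), keeping only matter and Lambda terms.
# Present-day parameter values are assumed for illustration.
def omega_matter(a: float, omega_m0: float, omega_lambda0: float) -> float:
    """Matter density parameter at expansion factor a in a two-term model."""
    matter_term = omega_m0 * a ** -3  # matter density dilutes as a^-3
    return matter_term / (matter_term + omega_lambda0)


for a in (0.1, 1.0, 10.0):
    einstein_de_sitter = omega_matter(a, 1.0, 0.0)  # matter only
    with_lambda = omega_matter(a, 0.25, 0.75)       # matter plus Lambda
    print(f"a = {a:5.1f}  Einstein-de Sitter: {einstein_de_sitter:.4f}"
          f"   with Lambda: {with_lambda:.4f}")
```

In the Einstein-de Sitter case the parameter stays at unity at every epoch, which is Bondi's "remarkable property". With the assumed Lambda term, the matter parameter is close to unity when the universe is ten times smaller and close to zero when it is ten times larger, so the two terms are comparable only during a narrow window around the present.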
more acute, because a natural value for $k_c$ is thought to be much larger than what Nernst or Pauli used.23

While there was occasional discussion of this issue in the middle of the 20th century (as in the quote from N. Bohr in Rugh and Zinkernagel, 2002, p. 5), the modern era begins with the paper by Zel'dovich (1967) that convinced the community to consider the possible connection between the vacuum energy density of quantum physics and Einstein's cosmological constant.24

If the physics of the vacuum looks the same to any inertial observer, its contribution to the stress-energy tensor is the same as Einstein's cosmological constant [Eq. (19)]. Lemaître (1934) notes this: "in order that absolute motion, i.e., motion relative to the vacuum, may not be detected, we must associate a pressure $p = -\rho c^2$ to the energy density $\rho c^2$ of vacuum." Gliner (1965) goes further, presenting the relation between the metric tensor and the stress-energy tensor of a vacuum that is the same to any inertial observer. But it was Zel'dovich (1968) who presented the argument clearly enough and at the right time to catch the attention of the community.

With the development of the concept of broken symmetry in the now standard model for particle physics came the idea that the expansion and cooling of the universe is accompanied by a sequence of first-order phase transitions accompanying the symmetry breaking. Each first-order transition has a latent heat that appears as a contribution to an effective time-dependent $\Lambda(t)$ or dark-energy density.25 The decrease in value of the dark-energy density at each phase transition is much larger than the acceptable present value (within relativistic cosmology); the natural presumption is that the dark energy is negligible now. This final condition seems bizarre, but the picture led to the very influential concept of inflation. We discussed the basic elements in connection with Eq. (27); we now turn to some implications.
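The step from "the vacuum looks the same to any inertial observer" to Lemaître's pressure relation can be written out in a few lines. The following is only a sketch under stated assumptions: units with $c = 1$, metric signature $(+,-,-,-)$, and the standard ideal-fluid form of the stress-energy tensor. None of these conventions is fixed by the passage itself, and the symbol $u^\mu$ is introduced here for illustration.

```latex
% Ideal fluid with energy density \rho, pressure p, and four-velocity u^\mu
% (assumed normalization u^\mu u_\mu = 1, signature (+,-,-,-), c = 1):
T_{\mu\nu} = (\rho + p)\, u_\mu u_\nu - p\, g_{\mu\nu} .

% The four-velocity u^\mu singles out a preferred rest frame. If no inertial
% observer can detect motion relative to the vacuum, the u_\mu u_\nu term
% must be absent:
\rho + p = 0
\quad\Longrightarrow\quad
p = -\rho ,
\qquad
T_{\mu\nu} = \rho\, g_{\mu\nu} .

% Local energy-momentum conservation then forces \rho to be constant:
\nabla^\mu T_{\mu\nu} = \partial_\nu \rho = 0 .
```

Restoring factors of $c$ gives $p = -\rho c^2$, the relation quoted from Lemaître (1934), and a stress-energy tensor proportional to the metric tensor with constant coefficient has the same form as a cosmological constant term in Einstein's field equation.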
advances in the evidence. Our review leads us to conclude that there is now a good scientific case arguing that the matter density parameter is $\Omega_{M0} \sim 0.25$, and a fairly good case that about three-quarters of that is not baryonic. The cases for dark energy and for the ΛCDM model are significant, too, though obscured by observational issues of whether we have an adequate picture of structure formation. But we expect that rapid advances in the observations of structure formation will soon dissipate these clouds, and, considering the record, likely reveal new clouds over the standard model for cosmology a decade from now.

A decade ago the high-energy-physics community had a well-defined challenge to show why the dark-energy density vanishes. Now there seems to be both a new challenge and clue: determine why the dark-energy density is exceedingly small but not zero. The present state of ideas can be compared with the state of research on structure formation a decade ago: in both situations there are many lines of thought but not a clear picture of which is the best direction to take. The big difference is that a decade ago we could be reasonably sure that observations in progress would guide us to a better understanding of how structure formed. Untangling the physics of dark matter and dark energy and their role in gravity physics is a much more subtle challenge, but, we hope, will be guided by advances in the exploration of the phenomenology. Perhaps in another ten years this will include detecting the evolution of the dark energy, and detecting the gravitational response of the dark-energy distribution to the large-scale mass distribution.

There may be three unrelated phenomena to deal with: dark energy, dark matter, and a vanishing sum of zero-point energies and whatever goes with them. Or the phenomena may be related. Because our only evidence of dark matter and dark energy is from their gravity, it is a natural and efficient first step to suppose that their properties are as simple as allowed by the phenomenology. However, it makes sense to watch for hints of more complex physics within the dark sector.

The past eight decades have seen steady advances in the technology used for the cosmological tests, from telescopes to computers; advances in the theoretical concepts underlying the tests; and progress through the learning curves on applying the concepts and technology. We see the results: the basis for cosmology is much firmer than it was a decade ago. And the basis surely will be a lot more solid a decade from now.

Einstein's cosmological constant, and the modern variant, dark energy, have figured in a broad range of topics under discussion in physics and astronomy, in at least some circles, for much of the past eight decades. Many of these issues undoubtedly have been discovered more than once. But in our experience such ideas tend to persist for a long time at low visibility and sometimes low fidelity. Thus the community has been very well prepared for the present evidence for detection of dark energy. And for the same reason we believe that dark energy, whether constant, or rolling toward zero, or maybe even increasing, still will be an active topic of research, in at least some circles, a decade from now, whatever the outcome of the present work on the cosmological tests. Though this much is clear, we see no basis for a prediction of whether the standard cosmology a decade from now will be a straightforward elaboration of ΛCDM, or whether there will be more substantial changes of direction.

Note added in proof

The measurements of the angular distribution of the 3-K cosmic microwave background radiation by the Wilkinson Microwave Anisotropy Probe (WMAP) were released (in Bennett et al., 2003; Spergel et al., 2003, and references therein) after this review was completed. It is appropriate to comment on how the results of this superb experiment have changed our assessment of the cosmological tests.

The discussion in Sec. IV.C led us to conclude that the case for the detection of dark energy is not as compelling as the case for dark matter, because there are fewer cross checks. WMAP changes that. The ΛCDM model gives an excellent fit to the WMAP measurements. The parameters required by this fit, including density parameters $0.19 \lesssim \Omega_{M0} \lesssim 0.35$, $0.65 \lesssim \Omega_{\Lambda 0} \lesssim 0.81$, and $-0.02 \lesssim \Omega_{K0} \lesssim 0.06$ (all at two standard deviations), are in good agreement with other constraints (as summarized in Sec. IV). In particular, a pre-WMAP survey of the constraints on $\Omega_{M0}$ from a combination of the dynamical, baryon fraction, power spectrum, weak lensing, and cluster abundance measurements indicates $0.2 \lesssim \Omega_{M0} \lesssim 0.35$ (at two standard deviations, Chen and Ratra, 2003b), in striking accord with the WMAP estimate. The fit to the WMAP measurements and the overall consistency of parameter constraints is strong evidence that the ΛCDM model is a good approximation to reality. This evidence increases the weight of parameter estimates that depend on the ΛCDM model. And the model fit to the WMAP measurements requires the presence of dark energy, provided the Hubble constant is within acceptable distance of the astronomical measurements. This is distinct from the line of argument for $\Omega_{\Lambda 0} \sim 0.7$ described in Sec. IV. It provides the wanted cross check that makes a convincing case for the detection of dark energy.

Issues remain. The ΛCDM fit to WMAP indicates the density parameter in baryons is $0.021 \lesssim \Omega_{B0} h^2 \lesssim 0.024$ (at two standard deviations), consistent with what is indicated by the standard Big Bang nucleosynthesis model and measurements of the primeval deuterium abundance [Eq. (62) and footnote 101]. To be resolved are the somewhat different estimates of $\Omega_{B0} h^2$ from the helium and lithium abundances. The temperature anisotropy autocorrelation function is consistent with zero at angular separations greater than about 60° (consistent with the earlier but less emphatic COBE result mentioned in footnote 100), and seems not likely to be consistent with the ΛCDM prediction. Maybe this is an unlikely statistical fluctuation. Or maybe it is telling us about the physics of the dark energy.

We take this opportunity to refer to Padmanabhan's (2002) recently completed review of the cosmological constant, with particular emphasis on possible resolutions of the physicists' cosmological constant problem and the physics of the resulting dark energy models.

ACKNOWLEDGMENTS

We are indebted to Pia Mukherjee, Michael Peskin, and Larry Weaver for detailed comments on drafts of this review. We thank Uwe Thumm for help in translating and discussing papers written in German. We have also benefited from discussions with Neta Bahcall, Robert Caldwell, Gang Chen, Andrea Cimatti, Mark Dickinson, Michael Dine, Masataka Fukugita, Salman Habib, David Hogg, Avi Loeb, Stacy McGaugh, Paul Schechter, Chris Smeenk, Gary Steigman, Ed Turner, Michael Turner, Jean-Philippe Uzan, David Weinberg, and Simon White. B.R. acknowledges support from NSF CAREER Grant No. AST-9875031, and P.J.E.P. acknowledges support in part from the NSF.

REFERENCES

Abazajian, K., G. M. Fuller, and M. Patel, 2001, Phys. Rev. D 64, 023501.
Abbott, L. F., and M. B. Wise, 1984, Nucl. Phys. B 244, 541.
Abell, G. O., 1958, Astrophys. J., Suppl. Ser. 3, 211.
Akhmedov, E. Kh., 2002, e-print hep-th/0204048.
Albrecht, A., and P. J. Steinhardt, 1982, Phys. Rev. Lett. 48, 1220.
Alcock, C., and B. Paczyński, 1979, Nature (London) 281, 358.
Allen, S. W., R. W. Schmidt, and A. C. Fabian, 2002, Mon. Not. R. Astron. Soc. 335, 256.
Alpher, R. A., and R. Herman, 2001, Genesis of the Big Bang (Oxford University, Oxford).
Amendola, L., 1999, Phys. Rev. D 60, 043501.
Amendola, L., 2000, Phys. Rev. D 62, 043511.
Amendola, L., and D. Tocchini-Valentini, 2002, Phys. Rev. D 66, 043528.
Aragón-Salamanca, A., C. M. Baugh, and G. Kauffmann, 1998, Mon. Not. R. Astron. Soc. 297, 427.
Arai, K., M. Hashimoto, and T. Fukui, 1987, Astron. Astrophys. 179, 17.
Armendariz-Picon, C., V. Mukhanov, and P. J. Steinhardt, 2001, Phys. Rev. D 63, 103510.
Baccigalupi, C., A. Balbi, S. Matarrese, F. Perrotta, and N. Vittorio, 2002, Phys. Rev. D 65, 063520.
Baccigalupi, C., S. Matarrese, and F. Perrotta, 2000, Phys. Rev. D 62, 123510.
Bacon, D. J., R. J. Massey, A. R. Refregier, and R. S. Ellis, 2002, e-print astro-ph/0203134.
Bahcall, J. N., and R. A. Wolf, 1968, Astrophys. J. 152, 701.
Bahcall, N. A., R. Cen, R. Davé, J. P. Ostriker, and Q. Yu, 2000, Astrophys. J. 541, 1.
Bahcall, N. A., and X. Fan, 1998, Astrophys. J. 504, 1.
Bahcall, N. A., J. P. Ostriker, S. Perlmutter, and P. J. Steinhardt, 1999, Science 284, 1481.
Bahcall, N. A., et al., 2003, Astrophys. J. 585, 182.
Banks, T., and W. Fischler, 2001, e-print hep-th/0102077.
Bardeen, J. M., P. J. Steinhardt, and M. S. Turner, 1983, Phys. Rev. D 28, 679.
Barr, S. M., and D. Hochberg, 1988, Phys. Lett. B 211, 49.
Barr, S. M., and D. Seckel, 2001, Phys. Rev. D 64, 123513.
Barreiro, T., E. J. Copeland, and N. J. Nunes, 2000, Phys. Rev. D 61, 127301.
Bartlett, J. G., and J. Silk, 1990, Astrophys. J. 353, 399.
Bartolo, N., and M. Pietroni, 2000, Phys. Rev. D 61, 023518.
Battye, R. A., M. Bucher, and D. Spergel, 1999, e-print astro-ph/9908047.
Bean, R., and A. Melchiorri, 2002, Phys. Rev. D 65, 041302.
Bennett, C. L., et al., 2003, e-print astro-ph/0302207.
Bernardeau, F., and R. Schaeffer, 1992, Astron. Astrophys. 255, 1.
Bertolami, O., 1986, Fortschr. Phys. 34, 829.
Bertolami, O., and P. J. Martins, 2000, Phys. Rev. D 61, 064007.
Binétruy, P., 1999, Phys. Rev. D 60, 063502.
Binétruy, P., and J. Silk, 2001, Phys. Rev. Lett. 87, 031102.
Birkel, M., and S. Sarkar, 1997, Astropart. Phys. 6, 197.
Blais-Ouellette, S., P. Amram, and C. Carignan, 2001, Astron. J. 121, 1952.
Blanchard, A., R. Sadat, J. G. Bartlett, and M. Le Dour, 2000, Astron. Astrophys. 362, 809.
Bludman, S. A., and M. A. Ruderman, 1977, Phys. Rev. Lett. 38, 255.
Blumenhagen, R., B. Körs, D. Lüst, and T. Ott, 2002, Nucl. Phys. B 641, 235.
Blumenthal, G. R., A. Dekel, and J. R. Primack, 1988, Astrophys. J. 326, 539.
Blumenthal, G. R., S. M. Faber, J. R. Primack, and M. J. Rees, 1984, Nature (London) 311, 517.
Bode, P., J. P. Ostriker, and N. Turok, 2001, Astrophys. J. 556, 93.
Bond, J. R., 1988, in The Early Universe, edited by W. G. Unruh and G. W. Semenoff (Reidel, Dordrecht), p. 283.
Bond, J. R., R. Crittenden, R. L. Davis, G. Efstathiou, and P. J. Steinhardt, 1994, Phys. Rev. Lett. 72, 13.
Bondi, H., 1960, Cosmology (Cambridge University, Cambridge).
Bondi, H., and T. Gold, 1948, Mon. Not. R. Astron. Soc. 108, 252.
Bordag, M., U. Mohideen, and V. M. Mostepanenko, 2001, Phys. Rep. 353, 1.
Borgani, S., P. Rosati, P. Tozzi, S. A. Stanford, P. R. Eisenhardt, C. Lidman, B. Holden, R. D. Ceca, C. Norman, and G. Squires, 2001, Astrophys. J. 561, 13.
Boughn, S. P., and R. G. Crittenden, 2001, e-print astro-ph/0111281.
Bousso, R., 2000, J. High Energy Phys. 0011, 038.
Branchini, E., W. Freudling, L. N. Da Costa, C. S. Frenk, R. Giovanelli, M. P. Haynes, J. J. Salzer, G. Wegner, and I. Zehavi, 2001, Mon. Not. R. Astron. Soc. 326, 1191.
Branchini, E., I. Zehavi, M. Plionis, and A. Dekel, 2000, Mon. Not. R. Astron. Soc. 313, 491.
Brandenberger, R. H., 2001, e-print hep-ph/0101119.
Brax, P., and J. Martin, 2000, Phys. Rev. D 61, 103502.
Brax, P., J. Martin, and A. Riazuelo, 2000, Phys. Rev. D 62, 103505.
Brax, P., J. Martin, and A. Riazuelo, 2001, Phys. Rev. D 64, 083505.
Bridle, S., R. G. Crittenden, A. Melchiorri, M. P. Hobson, R. Kneissl, and A. N. Lasenby, 2002, Mon. Not. R. Astron. Soc. 335, 1193.
Bronstein, M., 1933, Phys. Z. Sowjetunion 3, 73.
Brown, G. S., and B. M. Tinsley, 1974, Astrophys. J. 194, 555.
Buchalter, A., D. J. Helfand, R. H. Becker, and R. L. White, 1998, Astrophys. J. 494, 503.
Bucher, M., and D. N. Spergel, 1999, Phys. Rev. D 60, 043505.
Bucher, M., and N. Turok, 1995, Phys. Rev. D 52, 5538.
Burgess, C. P., P. Martineau, F. Quevedo, G. Rajesh, and R.-J. Zhang, 2002, J. High Energy Phys. 0203, 052.
Burles, S., K. M. Nollett, and M. S. Turner, 2001, Astrophys. J. Lett. 552, L1.
Burstein, D., 2000, in Cosmic Flows Workshop, edited by S. Courteau and J. Willick, Astronomical Society of the Pacific Conference Proceedings No. 201 (Astronomical Society of the Pacific, San Francisco), p. 178.
Caldwell, R. R., 2002, Phys. Lett. B 545, 23.
Caldwell, R. R., R. Davé, and P. J. Steinhardt, 1998, Phys. Rev. Lett. 80, 1582.
Canuto, V., P. J. Adams, S.-H. Hsieh, and E. Tsiang, 1977, Phys. Rev. D 16, 1643.
Canuto, V., and J. F. Lee, 1977, Phys. Lett. 72B, 281.
Carlstrom, J. E., M. Joy, L. Grego, G. Holder, W. L. Holzapfel, S. LaRoque, J. J. Mohr, and E. D. Reese, 2001, in Constructing the Universe with Clusters of Galaxies, edited by F. Durret and G. Gerbal (in press), e-print astro-ph/0103480.
Carretta, E., R. G. Gratton, G. Clementini, and F. F. Pecci, 2000, Astrophys. J. 533, 215.
Carroll, S. M., 1998, Phys. Rev. Lett. 81, 3067.
Carroll, S. M., 2001, Living Rev. Relativ. 4, 1.
Carroll, S. M., W. H. Press, and E. L. Turner, 1992, Annu. Rev. Astron. Astrophys. 30, 499.
Casimir, H. B. G., 1948, Proc. K. Ned. Akad. Wet. 51, 635.
Chaboyer, B., and L. M. Krauss, 2002, Astrophys. J. Lett. 567, L45.
Chen, B., and F.-L. Lin, 2002, Phys. Rev. D 65, 044007.
Chen, G., and B. Ratra, 2003a, Astrophys. J. 582, 586.
Chen, G., and B. Ratra, 2003b, e-print astro-ph/0302002.
Chen, X., R. J. Scherrer, and G. Steigman, 2001, Phys. Rev. D 63, 123504.
Chiba, T., 1999, Phys. Rev. D 60, 083508.
Chiba, T., T. Okabe, and M. Yamaguchi, 2000, Phys. Rev. D 62, 023511.
Chimento, L. P., and A. S. Jakubi, 1996, Int. J. Mod. Phys. D 5, 71.
Choi, K., 1999, e-print hep-ph/9912218.
Cimatti, A., et al., 2002, Astron. Astrophys. Lett. 391, L1.
Coble, K., S. Dodelson, M. Dragovan, K. Ganga, L. Knox, J. Kovac, B. Ratra, and T. Souradeep, 2003, Astrophys. J. 584, 585.
Coc, A., E. Vangioni-Flam, M. Cassé, and M. Rabiet, 2002, Phys. Rev. D 65, 043510.
Cohen-Tannoudji, C., J. Dupont-Roc, and G. Grynberg, 1992, Atom-Photon Interactions (Wiley, New York), p. 121.
Colberg, J. M., et al., 2000, Mon. Not. R. Astron. Soc. 319, 209.
Cole, S., D. H. Weinberg, C. S. Frenk, and B. Ratra, 1997, Mon. Not. R. Astron. Soc. 289, 37.
Coleman, S., and F. De Luccia, 1980, Phys. Rev. D 21, 3305.
Colley, W. N., J. R. Gott, and C. Park, 1996, Mon. Not. R. Astron. Soc. 281, L82.
Cooray, A. R., 1999, Astrophys. J. 524, 504.
Copeland, E. J., N. J. Nunes, and F. Rosati, 2000, Phys. Rev. D 62, 123503.
Corasaniti, P. S., and E. J. Copeland, 2002, Phys. Rev. D 65, 043004.
Croom, S. M., R. J. Smith, B. J. Boyle, T. Shanks, N. S. Loaring, L. Miller, and I. J. Lewis, 2001, Mon. Not. R. Astron. Soc. 322, L29.
Cyburt, R. H., B. D. Fields, and K. A. Olive, 2001, New Astron. 6, 215.
Dalal, N., K. Abazajian, E. Jenkins, and A. Manohar, 2001, Phys. Rev. Lett. 87, 141302.
Daly, R., and E. J. Guerra, 2001, e-print astro-ph/0109383.
Danese, L., G. L. Granato, L. Silva, M. Magliocchetti, and G. De Zotti, 2002, in The Mass of Galaxies at Low and High Redshift, edited by R. Bender and A. Renzini (Springer, Berlin), in press.
Dasgupta, K., C. Herdeiro, S. Hirano, and R. Kallosh, 2002, Phys. Rev. D 65, 126002.
Davé, R., D. N. Spergel, P. J. Steinhardt, and B. D. Wandelt, 2001, Astrophys. J. 547, 574.
Davis, A. C., M. Dine, and N. Seiberg, 1983, Phys. Lett. 125B, 487.
Davis, M., G. Efstathiou, C. S. Frenk, and S. D. M. White, 1985, Astrophys. J. 292, 371.
Davis, M., and P. J. E. Peebles, 1983a, Annu. Rev. Astron. Astrophys. 21, 109.
Davis, M., and P. J. E. Peebles, 1983b, Astrophys. J. 267, 465.
Davis, R. L., 1987, Phys. Rev. D 35, 3705.
de Blok, W. J. G., and A. Bosma, 2002, Astron. Astrophys. 385, 816.
de Blok, W. J. G., S. S. McGaugh, A. Bosma, and V. C. Rubin, 2001, Astrophys. J. Lett. 552, L23.
Deffayet, C., G. Dvali, and G. Gabadadze, 2002, Phys. Rev. D 65, 044023.
de la Macorra, A., and G. Piccinelli, 2000, Phys. Rev. D 61, 123503.
de la Macorra, A., and C. Stephan-Otto, 2001, Phys. Rev. Lett. 87, 271301.
de Oliveira-Costa, A., M. Tegmark, L. A. Page, and S. P. Boughn, 1998, Astrophys. J. Lett. 509, L9.
de Ritis, R., A. A. Marino, C. Rubano, and P. Scudellaro, 2000, Phys. Rev. D 62, 043506.
de Sitter, W., 1917, Mon. Not. R. Astron. Soc. 78, 3.
DeWitt, B. S., 1975, Phys. Rep. 19, 295.
Dicke, R. H., 1968, Astrophys. J. 152, 1.
Dicke, R. H., 1970, Gravitation and the Universe (American Philosophical Society, Philadelphia).
Dicke, R. H., and P. J. E. Peebles, 1979, in General Relativity, edited by S. W. Hawking and W. Israel (Cambridge University, Cambridge, England), p. 504.
Dimopoulos, K., and J. W. F. Valle, 2001, e-print astro-ph/0111417.
Dine, M., 1996, e-print hep-ph/9612389.
Di Pietro, E., and J. Demaret, 2001, Int. J. Mod. Phys. D 10, 231.
Dirac, P. A. M., 1937, Nature (London) 139, 323.
Dirac, P. A. M., 1938, Proc. R. Soc. London, Ser. A 165, 199.
Dodelson, S., et al., 2002, Astrophys. J. 572, 140.
Dodelson, S., M. Kaplinghat, and E. Stewart, 2000, Phys. Rev. Lett. 85, 5276.
Dolgov, A. D., 1983, in The Very Early Universe, edited by G. W. Gibbons, S. W. Hawking, and S. T. C. Siklos (Cambridge University, Cambridge, England), p. 449.
Dolgov, A. D., 1989, in The Quest for the Fundamental Constants in Cosmology, edited by J. Audouze and J. Tran Thanh Van (Editions Frontières, Gif-sur-Yvette), p. 227.
Doran, M., and J. Jäckel, 2002, Phys. Rev. D 66, 043519.
Doran, M., M. Lilley, J. Schwindt, and C. Wetterich, 2001, Astrophys. J. 559, 501.
Dreitlein, J., 1974, Phys. Rev. Lett. 33, 1243.
Dubinski, J., and R. G. Carlberg, 1991, Astrophys. J. 378, 496.
Durrer, R., B. Novosyadlyj, and S. Apunevych, 2003, Astrophys. J. 583, 33.
Dvali, G., and S.-H. H. Tye, 1999, Phys. Lett. B 450, 72.
Eddington, A. S., 1924, The Mathematical Theory of Relativity (Cambridge University, Cambridge, England).
Eddington, A. S., 1939, Sci. Prog. 34, 225.
Efstathiou, G., 1999, Mon. Not. R. Astron. Soc. 310, 842.
Efstathiou, G., 2002, Mon. Not. R. Astron. Soc. 332, 193.
Efstathiou, G., and J. R. Bond, 1999, Mon. Not. R. Astron. Soc. 304, 75.
Efstathiou, G., and M. J. Rees, 1988, Mon. Not. R. Astron. Soc. 230, 5P.
Efstathiou, G., W. J. Sutherland, and S. J. Maddox, 1990, Nature (London) 348, 705.
Einstein, A., 1917, Sitzungsber. K. Preuss. Akad. Wiss. 142 [English translation in The Principle of Relativity (Dover, New York, 1952), p. 177].
Einstein, A., 1931, Sitzungsber. Preuss. Akad. Wiss., Phys. Math. Kl. 235.
Einstein, A., 1945, The Meaning of Relativity (Princeton University, Princeton, NJ).
Einstein, A., and W. de Sitter, 1932, Proc. Natl. Acad. Sci. U.S.A. 18, 213.
Ellwanger, U., 2002, e-print hep-ph/0203252.
Endo, M., and T. Fukui, 1977, Gen. Relativ. Gravit. 8, 833.
Enz, C. P., 1974, in Physical Reality and Mathematical Description, edited by C. P. Enz and J. Mehra (Reidel, Dordrecht), p. 124.
Enz, C. P., and A. Thellung, 1960, Helv. Phys. Acta 33, 839.
Eriksson, M., and R. Amanullah, 2002, Phys. Rev. D 66, 023530.
Etoh, T., M. Hashimoto, K. Arai, and S. Fujimoto, 1997, Astron. Astrophys. 325, 893.
Evrard, A. E., 1989, Astrophys. J. Lett. 341, L71.
Faber, S. M., G. Wegner, D. Burstein, R. L. Davies, A. Dressler, D. Lynden-Bell, and R. J. Terlevich, 1989, Astrophys. J., Suppl. Ser. 69, 763.
Falco, E. E., C. S. Kochanek, and J. A. Muñoz, 1998, Astrophys. J. 494, 47.
Fan, X., et al., 2001, Astrophys. J. 122, 2833.
Faraoni, V., 2000, Phys. Rev. D 62, 023504.
Feldman, H. A., J. A. Frieman, J. N. Fry, and R. Scoccimarro, 2001, Phys. Rev. Lett. 86, 1434.
Ferrarese, L., and D. Merritt, 2000, Astrophys. J. Lett. 539, L9.
Ferreira, P. G., and M. Joyce, 1998, Phys. Rev. D 58, 023503.
Fields, B. D., and S. Sarkar, 2002, Phys. Rev. D 66, 010001.
Fischler, W., A. Kashani-Poor, R. McNees, and S. Paban, 2001, J. High Energy Phys. 0107, 003.
Fischler, W., B. Ratra, and L. Susskind, 1985, Nucl. Phys. B 259, 730.
Fisher, K. B., C. A. Scharf, and O. Lahav, 1994, Mon. Not. R. Astron. Soc. 266, 219.
Fixsen, D. J., E. S. Cheng, J. M. Gales, J. C. Mather, R. A. Shafer, and E. L. Wright, 1996, Astrophys. J. 473, 576.
Flores, R. A., and J. R. Primack, 1994, Astrophys. J. Lett. 427, L1.
Ford, L. H., 1987, Phys. Rev. D 35, 2339.
Fosalba, P., O. Doré, and F. R. Bouchet, 2002, Phys. Rev. D 65, 063003.
Freedman, W. L., 2002, Int. J. Mod. Phys. A 17, 51, 58.
Freedman, W. L., et al., 2001, Astrophys. J. 553, 47.
Freese, K., F. C. Adams, J. A. Frieman, and E. Mottola, 1987, Nucl. Phys. B 287, 797.
Frewin, L. A., and J. E. Lidsey, 1993, Int. J. Mod. Phys. D 2, 323.
Friedland, A., H. Murayama, and M. Perelstein, 2003, Phys. Rev. D (in press).
Friedmann, A., 1922, Z. Phys. 10, 377 [English translation in Cosmological Constants, edited by J. Bernstein and G. Feinberg (Columbia University, New York, 1986), p. 49].
Friedmann, A., 1924, Z. Phys. 21, 326 [English translation in Cosmological Constants, edited by J. Bernstein and G. Feinberg (Columbia University, New York, 1986), p. 59].
Frieman, J. A., C. T. Hill, A. Stebbins, and I. Waga, 1995, Phys. Rev. Lett. 75, 2077.
Fry, J. N., 1984, Astrophys. J. 279, 499.
Fry, J. N., 1985, Phys. Lett. 158B, 211.
Fry, J. N., 1994, Phys. Rev. Lett. 73, 215.
Fry, J. N., and E. Gaztañaga, 1993, Astrophys. J. 413, 447.
Fujii, Y., 1982, Phys. Rev. D 26, 2580.
Fujii, Y., 2000, Gravitation Cosmol. 6, 107.
Fujii, Y., and T. Nishioka, 1990, Phys. Rev. D 42, 361.
Fukugita, M., T. Futamase, and M. Kasai, 1990, Mon. Not. R. Astron. Soc. 246, 24P.
Fukugita, M., C. J. Hogan, and P. J. E. Peebles, 1998, Astrophys. J. 503, 518.
Gamow, G., 1970, My World Line (Viking, New York).
Ganga, K., B. Ratra, J. O. Gundersen, and N. Sugiyama, 1997, Astrophys. J. 484, 7.
Garcia-Bellido, J., J. Rabadán, and F. Zamora, 2002, J. High Energy Phys. 0201, 036.
Garnavich, P. M., et al., 1998, Astrophys. J. 509, 74.
Gasperini, M., 1987, Phys. Lett. B 194, 347.
Gasperini, M., F. Piazza, and G. Veneziano, 2002, Phys. Rev. D 65, 023508.
Gebhardt, K., et al., 2000, Astrophys. J. Lett. 539, L13.
Geller, M. J., and P. J. E. Peebles, 1973, Astrophys. J. 184, 329.
Georgi, H., H. R. Quinn, and S. Weinberg, 1974, Phys. Rev. Lett. 33, 451.
Gerke, B., and G. Efstathiou, 2002, Mon. Not. R. Astron. Soc. 335, 33.
Giovannini, M., E. Keihänen, and H. Kurki-Suonio, 2002, Phys. Rev. D 66, 043504.
Giudice, G. F., and R. Rattazzi, 1999, Phys. Rep. 322, 419.
Gliner, E. B., 1965, Zh. Eksp. Teor. Fiz. 49, 542 [Sov. Phys. JETP 22, 378 (1966)].
González-Díaz, P. F., 2000, Phys. Rev. D 62, 023513.
Goobar, A., and S. Perlmutter, 1995, Astrophys. J. 450, 14.
Górski, K. M., B. Ratra, R. Stompor, N. Sugiyama, and A. J. Banday, 1998, Astrophys. J., Suppl. Ser. 114, 1.
Górski, K. M., B. Ratra, N. Sugiyama, and A. J. Banday, 1995, Astrophys. J. Lett. 444, L65.
Górski, K. M., J. Silk, and N. Vittorio, 1992, Phys. Rev. Lett. 68, 733.
Gott, J. R., 1982, Nature (London) 295, 304.
Gott, J. R., 1997, in Critical Dialogs in Cosmology, edited by N. Turok (World Scientific, Singapore), p. 519.
Gott, J. R., M. S. Vogeley, S. Podariu, and B. Ratra, 2001, Astrophys. J. 549, 1.
Green, A. M., and J. E. Lidsey, 2000, Phys. Rev. D 61, 067301.
Gudmundsson, E. H., and G. Björnsson, 2002, Astrophys. J. 565, 1.
Gunn, J. E., 1967, Astrophys. J. 147, 61.
Gunn, J. E., and J. B. Oke, 1975, Astrophys. J. 195, 255.
Gunn, J. E., and B. M. Tinsley, 1975, Nature (London) 257, 454.
Gurvits, L. T., K. I. Kellermann, and S. Frey, 1999, Astron. Astrophys. 342, 378.
Guth, A. H., 1981, Phys. Rev. D 23, 347.
Guth, A. H., 1997, The Inflationary Universe (Addison-Wesley, Reading).
Guth, A. H., and S.-Y. Pi, 1982, Phys. Rev. Lett. 49, 1110.
Guyot, M., and Ya. B. Zel'dovich, 1970, Astron. Astrophys. 9, 227.
Halliwell, J. J., 1987, Phys. Lett. B 185, 341.
Halpern, M., H. P. Gush, and E. H. Wishnow, 1991, in After the First Three Minutes, edited by S. S. Holt, C. L. Bennett, and V. Trimble (AIP, New York), p. 53.
Halverson, N. W., et al., 2002, Astrophys. J. 568, 38.
Halyo, E., 2001a, J. High Energy Phys. 0110, 025.
Halyo, E., 2001b, e-print hep-ph/0105341.
Hamilton, A. J. S., and M. Tegmark, 2002, Mon. Not. R. Astron. Soc. 330, 506.
Hamilton, J.-Ch., and K. Ganga, 2001, Astron. Astrophys. 368, 760.
Harrison, E. R., 1970, Phys. Rev. D 1, 2726.
Hawking, S. W., 1982, Phys. Lett. 115B, 295.
He, X.-G., 2001, e-print astro-ph/0105005.
Hebecker, A., and C. Wetterich, 2001, Phys. Lett. B 497, 281.
Helbig, P., D. Marlow, R. Quast, P. N. Wilkinson, I. W. A. Browne, and L. V. E. Koopmans, 1999, Astron. Astrophys., Suppl. Ser. 136, 297.
Hellerman, S., N. Kaloper, and L. Susskind, 2001, J. High Energy Phys. 0106, 003.
Hiscock, W. A., 1986, Phys. Lett. 166B, 285.
Hivon, E., F. R. Bouchet, S. Colombi, and R. Juszkiewicz, 1995, Astron. Astrophys. 298, 643.
Hoekstra, H., H. K. C. Yee, and M. D. Gladders, 2002, New Astron. Rev. 46, 767.
Holden, D. J., and D. Wands, 2000, Phys. Rev. D 61, 043506.
Holtzman, J. A., 1989, Astrophys. J., Suppl. Ser. 71, 1.
Horvat, R., 1999, Mod. Phys. Lett. A 14, 2245.
Hoyle, F., 1948, Mon. Not. R. Astron. Soc. 108, 372.
Hoyle, F., 1959, in Paris Symposium on Radio Astronomy, IAU Symposium 9, edited by R. N. Bracewell (Stanford University, Stanford), p. 529.
Hoyle, F., and R. J. Tayler, 1964, Nature (London) 203, 1108.
Hradecky, V., C. Jones, R. H. Donnelly, S. G. Djorgovski, R. R. Gal, and S. C. Odewahn, 2000, Astrophys. J. 543, 521.
Hu, W., 1998, Astrophys. J. 506, 485.
Hu, W., and P. J. E. Peebles, 2000, Astrophys. J. Lett. 528, L61.
Hu, W., and N. Sugiyama, 1996, Astrophys. J. 471, 542.
Huang, J.J., 1985, Nuovo Cimento Soc. Ital. Fis., B 87B, 148.
Hubble, E., 1929, Proc. Natl. Acad. Sci. U.S.A. 15, 168.
Hubble, E., 1936, Realm of the Nebulae (Yale University, New Haven). Reprinted (Dover, New York, 1958).
Huey, G., and G. Lidsey, 2001, Phys. Lett. B 514, 217.
Huey, G., and R. Tavakol, 2002, Phys. Rev. D 65, 043504.
Humason, M. L., N. U. Mayall, and A. R. Sandage, 1956, Astron. J. 61, 97.
Huterer, D., and M. S. Turner, 2001, Phys. Rev. D 64, 123527.
Hwang, J.-c., and H. Noh, 2001, Phys. Rev. D 64, 103509.
Ikebe, Y., T. H. Reiprich, H. Böhringer, Y. Tanaka, and T. Kitayama, 2002, Astron. Astrophys. 383, 773.
Iliopoulos, J., and B. Zumino, 1974, Nucl. Phys. B 76, 310.
Israel, F. P., 1998, Astron. Astrophys. Rev. 8, 237.
Jenkins, A., C. S. Frenk, F. R. Pearce, P. A. Thomas, J. M. Colberg, S. D. M. White, H. M. P. Couchman, J. A. Peacock, G. Efstathiou, and A. H. Nelson, 1998, Astrophys. J. 499, 20.
John, M. V., and K. B. Joseph, 2000, Phys. Rev. D 61, 087304.
Johri, V. B., 2002, Class. Quantum Grav. 19, 5959.
Joyce, M., and T. Prokopec, 1998, Phys. Rev. D 57, 6022.
Juszkiewicz, R., P. G. Ferreira, H. A. Feldman, A. H. Jaffe, and M. Davis, 2000, Science 287, 109.
Kaganovich, A. B., 2001, Phys. Rev. D 63, 025022.
Kahn, F. D., and L. Woltjer, 1959, Astrophys. J. 130, 705.
Kaiser, N., 1984, Astrophys. J. Lett. Ed. 284, L9.
Kaiser, N., 1987, Mon. Not. R. Astron. Soc. 227, 1.
Kamionkowski, M., B. Ratra, D. N. Spergel, and N. Sugiyama, 1994a, Astrophys. J. Lett. 434, L1.
Kamionkowski, M., D. N. Spergel, and N. Sugiyama, 1994b, Astrophys. J. Lett. 426, L57.
Kamionkowski, M., and N. Toumbas, 1996, Phys. Rev. Lett. 77, 587.
Kardashev, N., 1967, Astrophys. J. Lett. 150, L135.
Kay, S. T., F. R. Pearce, C. S. Frenk, and A. Jenkins, 2002, Mon. Not. R. Astron. Soc. 330, 113.
Kazanas, D., 1980, Astrophys. J. Lett. Ed. 241, L59.
Kim, J. E., 2000, J. High Energy Phys. 0006, 016.
Kirzhnitz, D. A., and A. D. Linde, 1974, Zh. Eksp. Teor. Fiz. 67, 1263 [Sov. Phys. JETP 40, 628 (1975)].
Klein, F., 1918, Nachr. Ges. Wiss. Goettingen, Math.-Phys. Kl. December, 394.
Klypin, A., A. V. Kratsov, O. Valenzuela, and F. Prada, 1999, Astrophys. J. 522, 82.
Knebe, A., J. E. G. Devriendt, A. Mahmood, and J. Silk, 2002, Mon. Not. R. Astron. Soc. 329, 813.
Knox, L., and L. Page, 2000, Phys. Rev. Lett. 85, 1366.
Kofman, L. A., and A. A. Starobinsky, 1985, Pisma Astron. Zh. 11, 643 [Sov. Astron. Lett. 11, 271 (1985)].
Kogut, A., A. J. Banday, C. L. Bennett, K. M. Górski, G. Hinshaw, G. F. Smoot, and E. L. Wright, 1996, Astrophys. J. Lett. 464, L5.
Kolb, E. W., and S. Wolfram, 1980, Astrophys. J. 239, 428.
Kolda, C., and W. Lahneman, 2001, e-print hep-ph/0105300.
Kolda, C., and D. H. Lyth, 1999, Phys. Lett. B 458, 197.
Komatsu, E., B. D. Wandelt, D. N. Spergel, A. J. Banday, and K. M. Górski, 2002, Astrophys. J. 566, 19.
Kosowsky, A., 2002, in Modern Cosmology, edited by S. Bonometto, V. Gorini, and U. Moschella (IOP, Bristol), p. 219.
Kragh, H., 1996, Cosmology and Controversy (Princeton University, Princeton, NJ).
Kragh, H., 1999, in The Expanding Worlds of General Relativity, edited by H. Goenner, J. Renn, J. Ritter, and T. Sauer (Birkhäuser, Boston), p. 377.
Krauss, L. M., and B. Chaboyer, 2001, e-print astro-ph/0111597.
Kruger, A. T., and J. W. Norbury, 2000, Phys. Rev. D 61, 087303.
Kujat, J., A. M. Linn, R. J. Scherrer, and D. H. Weinberg, 2002, Astrophys. J. 572, 1.
Kyae, B., and Q. Shafi, 2002, Phys. Lett. B 526, 379.
Lahav, O., P. B. Lilje, J. R. Primack, and M. J. Rees, 1991, Mon. Not. R. Astron. Soc. 251, 128.
Lanczos, K., 1922, Phys. Z. 23, 539.
Landau, L. D., and E. M. Lifshitz, 1951, The Classical Theory of Fields (Pergamon, Oxford).
Landy, S. D., 2002, Astrophys. J. Lett. 567, L1.
Larsen, F., J. P. van der Schaar, and R. G. Leigh, 2002, J. High Energy Phys. 0204, 047.
Lau, Y.-K., 1985, Aust. J. Phys. 38, 547.
Lazarides, G., 2002, e-print hep-ph/0204294.
Lee, A. T., et al., 2001, Astrophys. J. Lett. 561, L1.
Leibundgut, B., 2001, Annu. Rev. Astron. Astrophys. 39, 67.
Lemaître, G., 1925, J. Math. Phys. (Cambridge, Mass.) 4, 188.
Lemaître, G., 1927, Ann. Soc. Sci. Bruxelles, Ser. 1 47, 49 [Mon. Not. R. Astron. Soc. 91, 483 (1931)].
Lemaître, G., 1934, Proc. Natl. Acad. Sci. U.S.A. 20, 12.
Lemaître, G., 1949, in Albert Einstein: Philosopher-Scientist, edited by P. A. Schilpp (Library of Living Philosophers, Evanston), p. 437.
Liddle, A. R., and R. J. Scherrer, 1999, Phys. Rev. D 59, 023509.
Lightman, A. P., and P. L. Schechter, 1990, Astrophys. J., Suppl. Ser. 74, 831.
Lima, J. A. S., and J. S. Alcaniz, 2001, Braz. J. Phys. 31, 583.
Lima, J. A. S., and J. S. Alcaniz, 2002, Astrophys. J. 566, 15.
Linde, A. D., 1974, Pisma Zh. Eksp. Teor. Fiz. 19, 320 [JETP Lett. 19, 183 (1974)].
Linde, A. D., 1982, Phys. Lett. 108B, 389.
Lineweaver, C., 2001, e-print astro-ph/0112381.
Loh, E. D., and E. J. Spillar, 1986, Astrophys. J. Lett. Ed. 307, L1.
Lucchin, S., and S. Matarrese, 1985a, Phys. Rev. D 32, 1316.
Lucchin, S., and S. Matarrese, 1985b, Phys. Lett. 164B, 282.
Lynden-Bell, D., 1969, Nature (London) 223, 690.
Lyth, D. H., and E. D. Stewart, 1990, Phys. Lett. B 252, 336.
Magueijo, J., and L. Smolin, 2002, Phys. Rev. Lett. 88, 190403.
Majumdar, A. S., 2001, Phys. Rev. D 64, 083503.
Mak, M. K., J. A. Belinchon, and T. Harko, 2002, Int. J. Mod. Phys. D 11, 1265.
Maldacena, J., and C. Nuñez, 2001, Int. J. Mod. Phys. A 16, 822.
Maor, I., R. Brustein, J. McMahon, and P. J. Steinhardt, 2002, Phys. Rev. D 65, 123003.
Marriage, T. A., 2002, e-print astro-ph/0203153.
Masiero, A., M. Pietroni, and F. Rosati, 2000, Phys. Rev. D 61, 023504.
Mason, B. S., et al., 2002, e-print astro-ph/0205384.
Mizuno, S., and K.-i. Maeda, 2001, Phys. Rev. D 64, 123521.
Moffat, J. N., 2001, e-print hep-th/0105017.
Molaro, P., S. A. Levshakov, M. Dessauges-Zavadsky, and S. D'Odorico, 2002, Astron. Astrophys. 381, L64.
Moore, B., 1994, Nature (London) 370, 629.
Moore, B., S. Ghigna, F. Governato, G. Lake, T. Quinn, J. Stadel, and P. Tozzi, 1999a, Astrophys. J. Lett. 524, L19.
Moore, B., T. Quinn, F. Governato, J. Stadel, and G. Lake, 1999b, Mon. Not. R. Astron. Soc. 310, 1147.
Mukherjee, P., B. Dennison, B. Ratra, J. H. Simonetti, K. Ganga, and J.-Ch. Hamilton, 2002, Astrophys. J. 579, 83.
Mukherjee, P., M. P. Hobson, and A. N. Lasenby, 2000, Mon. Not. R. Astron. Soc. 318, 1157.
Munshi, D., and Y. Wang, 2003, Astrophys. J. 583, 566.
Myung, Y. S., 2001, Mod. Phys. Lett. A 16, 1963.
Narayanan, V. K., D. N. Spergel, R. Davé, and C.-P. Ma, 2000, Astrophys. J. Lett. 543, L103.
Nernst, W., 1916, Verh. Dtsch. Phys. Ges. 18, 53.
Netterfield, C. B., et al., 2002, Astrophys. J. 571, 604.
Newman, J. A., and M. Davis, 2000, Astrophys. J. Lett. 534, L11.
Ng, S. C. C., N. J. Nunes, and F. Rosati, 2001, Phys. Rev. D 64, 083510.
Ng, S. C. C., and D. L. Wiltshire, 2001, Phys. Rev. D 63, 023503.
Nilles, H. P., 1985, in New Trends in Particle Theory, edited by L. Lusanna (World Scientific, Singapore), p. 119.
Nolan, L. A., J. S. Dunlop, R. Jimenez, and A. F. Heavens, 2001, e-print astro-ph/0103450.
Nomura, Y., T. Watari, and T. Yanagida, 2000, Phys. Lett. B 484, 103.
North, J. D., 1965, The Measure of the Universe (Oxford University, Oxford). Reprinted (Dover, New York, 1990).
Norton, J. D., 1999, in The Expanding Worlds of General Relativity, edited by H. Goenner, J. Renn, J. Ritter, and T. Sauer (Birkhäuser, Boston), p. 271.
Olson, T. S., and T. F. Jordan, 1987, Phys. Rev. D 35, 3258.
Oort, J. H., 1958, Proceedings of the 11th Solvay Conference, Structure and Evolution of the Universe (Stoops, Brussels), p. 163.
Ott, T., 2001, Phys. Rev. D 64, 023518.
Oukbir, J., and A. Blanchard, 1992, Astron. Astrophys. 262, 721.
Overduin, J. M., and F. I. Cooperstock, 1998, Phys. Rev. D 58, 043506.
Overduin, J. M., P. S. Wesson, and S. Bowyer, 1993, Astrophys. J. 404, 1.
Mathis, H.. and S. D. M. White, 2002, Mon. Not. R. Astron. Ozer, M., and M.0. Taha, 1986, Phys. Lett. B 171, 363.
SOC. 337, 1193. Padilla, N. D., M. E. Merchin, C. A. Valotto, D. F. Lambas,
Matyjasek, J., 1995+Phys. Rev. D 51, 4154. and M. A. G. Maia. 2001, Astrophys. J. 554, 873.
McCrea, W. H., 1951, Proc. R. Soc. London, Ser. A 206, 562. Padmanabhan, T.. 2002, e-print hep-th/0212290.
McCrea, W. H., 1971, Q. J. R. Astron. SOC. 12,140. Pais, A.. 1982, Subtle is the Lord ... (Oxford University, New
McVittie, G. C., 1956, General Relativity and Cosrnoiogy York) ,
(Chapman and Hall, London). Papovich, C., M. Dickinson, and H. C. Ferguson, 2002, e-print
Medved. A. J. M., 2002, Class. Quantum Grav. 19,4511. astro-ph/0201221.
Misziros. P., 1974, Astron. Astrophys. 37* 225. Park, C.-G., C. Park, B. Ratra, and M. Tegmark, 2001, Asrro-
Mitgrom, M.. 1983- Astrophys. J. 270, 365. phys. J. 556, 582.
Miller. A . D., et a!., 2002a. Astrophjs. J.? Suppl. Ser. 140, 115. Pauli, W., 1958, Theory of Relativity (Pergamon, New York).
Miller, C. J., R. C. Nichol. C. Genovese, and L.Wasserman, Reprinted (Dover, New York, 1981).
2002b, Astrophys. J. Lett. 565, L67. Pauli, W.? 1980, General Principles of Quantum Mechanics
Miralda-Escude, J., 2002, Asrrophys. J. 564. 60. (Springer, Berlin).
Misner, C. W., 1969, Phys. Rev. Lett. 22, 1071. Peacock, J. A., et a[., 2001, Nature (London) 410, 169.
Misner. C. W.* and P. Putnam, 1959, Pbys. Rev, 116. 1045. Peebles, P. J. E.. 1965, Astrophys. J. 142, 1317.
604 P. J. E. Peebles and Bharat Ratra: The cosmological constant and dark energy
Peebles, P. J. E., 1966, Astrophys. J. 146, 542. Premadi, P.,H. Martel, R. Matzner, and T. Futamase, 2001,
Peebles, P. J. E., 1971, Physical Cosmology (Princeton Univer- Astropbys. J., Suppl. Ser. 135, 7.
sity, Princeton, NJ). Primack, J., 2002, e-print astro-ph/0205391.
Peebles, F! J. E., 1980a, The Large-Scale Structure of the Uni- Pryke, C., N. W. Halverson, E. M. Leitch, J. Kovac, J. E. Carl-
verse (Princeton University, Princeton, NJ). strom, W. L. Holzapfel, and M. Dragovan, 2002, Astrophys. J.
Peebles, I? J. E., 1980b, Ann. N.Y. Acad. Sci. 336, 167. 568, 46.
Peebles, P. J. E., 1982, Astrophys. J. Lett. Ed. 263, L1. Pskovskii, Yu. P., 1977, Astron. Zh. 54, 1188 [Sov. Astron. 21,
Peebles, P. J. E., 1984, Astrophys. J. 284, 439. 675 (1977)].
Peebles, P. J. E., 1986, Nature (London) 321, 27. Quevedo, F., 1996, in Workshops of Particles and Fields and
Peebles, F! J. E., 1987, Astrophys. J. Lett. Ed. 315, L73. Phenomenology of Fundamental Interactions, edited by J. C.
Peebles, ? J. E., 1989a, in Large Scale Structure and Motions in D'Olivio, A. Fernandez, and M. A. Perez, AIP Conf. Proc.
the Universe, edited by M. Meuetti, G. Giuricin, and F. No. 359 (AIR Woodbury, NY), p. 202.
Mardirossian (Kluwer, Dordrecht), p. 119. Ratra, B., 1985, Phys. Rev. D 31, 1931.
Peebles, P. J. E., 1989b, J. R. Astron. SOC.Can. 83, 363. Ratra, B., 1989, Phys. Rev. D 40, 3939.
Peebles, P. J. E., 1993, Principles of Physical Cosmology (Prin- Ratra, B., 1991, Phys. Rev. D 43, 3802.
ceton University, Princeton). Ratra, B., 1992a, Phys. Rev. D 45, 1913.
Peebles, ? J. E., 2001, Astrophys. J. 557, 495. Ratra, B., 1992b, Astrophys. J. Lett. 391, L1.
Peebles, P. J. E., 2002, e-print astro-ph/0201015. Ratra, B., 1994, Phys. Rev. D 50,5252.
Peebles, F! J. E., R. A. Daly, and R. Juszkiewicz, 1989, Astro- Ratra, B., and ? J. E. Peebles, 1988, Phys. Rev. D 37, 3406.
phys. J. 347, 563. Ratra, B., and P. J. E. Peebles, 1994, Astrophys. J. Lett. 432,
Peebles, P. J. E., S . D. Phelps, E. J. Shaya, and R. B. Tully, L5.
2001, Astrophys. J. 554, 104. Ratra, B., and F! J. E. Peebles, 1995, Phys. Rev. D 52, 1837.
Peebles, P. J. E., and B. Ratra, 1988, Astrophys. J. Lett. Ed. Ratra, B., and A. Quillen, 1992, Mon. Not. R. Astron. SOC.259,
325, L17. 738.
Peebles, P. J. E., S . Seager, and W. Hu, 2000, Astrophys. J. Lett. Ratra, B., R. Stompor, K. Ganga, G. Rocha, N. Sugiyama, and
K. M. Gbrski, 1999, Astrophys. J. 517, 549.
539, L1.
Ratra, B., N. Sugiyama, A. J. Banday, and K. M. Gbrski, 1997,
Peebles, P: J. E., and J. Silk, 1990, Nature (London) 346,233.
Astrophys. J. 481, 22.
Peebles, P. J. E., and A. Vilenkin, 1999, Phys. Rev. D 59, Refregier, A., J. Rhodes, and E. J. Groth, 2002, Astrophys. J.
063505. Lett. 572, L131.
Peebles, P. J. E., and J. T. Yu, 1970, Astrophys. J. 162, 815. Refsdal, S., 1970, Astrophys. J. 159, 357.
Percival, W. J., et al., 2001, Mon. Not. R. Astron. SOC.327, Riess, A. G., ef al., 1998, Astrophys. J. 116, 1009.
1297. Rindler, W., 1956, Mon. Not. R. Astron. SOC.116, 662.
Percival, W. J., et al., 2002, Mon. Not. R. Astron. SOC.337, Robertson, H. P., 1928, Philos. Mag. 5, 835.
1068. Robertson, H. P, 1955, Publ. Astron. SOC.Pac. 67, 82.
Perlmutter, S., et al., 1999a, Astrophys. J. 517, 565. Rossi, G. C., and G. Veneziano, 1984, Phys. Lett. 138B, 195.
Perlmutter, S., M. S. Turner, and M. White, 1999b, Phys. Rev. Roussel, H., R. Sadat, and A. Blanchard, 2000, Astron. Astro-
Lett. 83, 670. phys. 361, 429.
Perrotta, E, C. Baccigalupi, and S . Matarrese, 2000, Phys. Rev. Rubakov, V A., M. V Sazhin, and A. V Veryaskin, 1982, Phys.
D 61, 023507. Lett. 115B, 189.
Peskin, M. E., 1997, in Fields, Strings, and Duality, edited by C. Rubano, C., and P. Scudellaro, 2002, Gen. Relativ. Gravit. 34,
Efthimiou and B. Greene (World Scientific, Singapore), p. 307.
729. Rugh, S. E., and H. Zinkernagel, 2002, Stud. Hist. Philos. Mod.
Petrosian, V., E. Salpeter, and ? Szekeres, 1967, Astrophys. J. Phys. 33, 663.
147, 1222. Sachs, R. K., and A. M. Wolfe, 1967, Astrophys. J. 147, 73.
Phillipps, S., S. P. Driver, W. J. Couch, A. Fernandez-Soto, ? Sahni, V., M. Sami, and ' ISouradeep, 2002, Phys. Rev. D 65,
D. Bristow, S . C. Odewahn, R. A. Windhorst, and K. Lan- 023518.
zetta, 2000, Mon. Not. R. Astron. SOC.319, 807. Sahni, V., and A. Starobinsky, 2000, Int. J. Mod. Phys. D 9,373.
Phillips, M. M., 1993, Astrophys. J. Lett. 413, L105. Sahni, V, and L. Wang, 2000, Phys. Rev. D 62, 103517.
Phillips, N. G., and A. Kogut, 2001, Astrophys. J. 548, 540. Sandage, A., 1958, Astropbys. J. 127, 513.
Pierpaoli, E., D. Scott, and M. White, 2001, Mon. Not. R. As- Sandage, A,, 1961a, Astrophys. J. 133,355.
iron. SOC.325, 77. Sandage, A., 1961b, The Hubble Atlas of Galaxies (Carnegie
Plionis, M., 2002, e-print astro-ph/0205166. Institution, Washington).
Podariu, S . , R. A. Daly, M. P. Mory, and B. Ratra, 2003, As- Sandage, A., 1962, in Problems of Extragalactic Research, ed-
trophys. J. 584, 577. ited by G. C. McVittie (McMillan, New York), p. 359.
Podariu, S., P. Nugent, and B. Ratra, 2001, Astrophys. J. 553, Sandage, A,, 1988, Annu. Rev. Astron. Astrophys. 26, 561.
39. Sarkar, S., 2002, e-print hep-ph/0201140.
Podariu, S., and B. Ratra, 2000, Astrophys. J. 532, 109. Sato, K., 1981a, Mon. Not. R. Astron. SOC.195, 467.
Podariu, S., and B. Ratra, 2001, Astrophys. J. 563, 28. Sato, K., 1981b, Phys. Lett. WB, 66.
Podariu, S., T. Souradeep, J. R. Gott, B. Ratra, and M. S. Sato, K., N. Terasawa, and J. Yokoyama, 1989, in The Quest for
Vogeley, 2001, Astrophys. J. 559, 9. the Fundamental Constants in Cosmology, edited by J. Au-
Polenta, G., et a[., 2002, Astrophys. J. Lett. 572, L27. douze and J. Tran Thanh Van (Editions Fronti&res,Gif-sur-
Pollock, M. D., 1980, Mon. Not. R. Astron. SOC.193, 825. Yvette), p. 193.
P. J. E. Peebles and Bharat Ratra: The cosmological constant and dark energy 605
Schindler, S., 2001, e-print astro-ph/0107028. Thomas, D., and G . Kauffmann, 1999, in Spectrophoiomelric
Sciama, D. W, 2001, Astrophys. Space Sci. 276. 151. Dating of Sfars and Galuxies, Astronomical Society of the
Scoccimarro. R., H. A. Feldman, J. N. Fry, and J. A. Frieman, Pacific Conference Proceedings No. 192, edited by I. Huberg,
2001, Astrophys. J. 546, 652. S . Heap, and R. Cornett (Astronomical Society of the Pacific,
Scott, P. E, el al., 2002. e-print astro-ph10205380. San Francisco), p. 261.
Seljak, U., 2002, Mon. Not. R. Astron. SOC.337, 769. Thuan, T. X., and Y. I. Izotov, 2002, in Matter in rhe Universe,
Sellwood, I . A,, and A. Kosowsky, 2M1, in Gas and Ga!axy edited by F. Jetzer, K. Pretzl, and R. Vcn Steiger (Kluwer,
Evolurion, Astronomical Society of the Pacific No. 240, ed- Dordrechtj, in press.
ited by J. E. Hibbard, M. Rupen, and J . H. van Gorkum Tinsley, B. M., 1972, Astrophys. J. 178, 319.
(Astronomical Society of the Pacific, San Francisco), p. 311. Totani, T., Y. Yoshii, T. Maihara, F, Iwamuro, and K. Moto-
Sen, A. A., and S. Sethi, 2002, Phys. Lett. B 532* 159. hara, 2001, Astrophys. J. 559, 592.
Shafi, Q., and C. Wetterich, 1985, Phys. Lett. 152B,51. Townsend, P K.*2001, J. High Energy Phys. 0111,042.
Shandarin, S . F., H. A. Feldman, Y. Xu. and M. Tegmark, 2002, Trager, S. C. ?S . M. Faber, G. Worthey. and J. J. Gonzilez,
Astrophys. J., Suppl. Ser. 141, 1. 2000, Astron. J. 119, 1645.
Shiu, G . , and S . H . 3. Tye, 2001, Phys. Lett. B 516, 421. Trautman, A., 1965, in Lectures an General Relativity, edited
Shklovsky. J.: 1967, Astrophys. J. Lett. 150, L1 by A. Trautrnan, F. A . E. Pirani, and H. Bondi (Prentice-Hall,
Shvartsman, V F., 1969, Pisma Zh. Eksp. Teor. Fiz. 9, 315 Englewood Cliffs, NJ),p, 230.
[JETP Lett. 9, 184 (196911. Tully, R. B., R. S . Somemilk, N. Trentham, and M. A. W.
Sigad, Y., A. Eldar, A. Dekel, M. A. Strauss, and A. Yahil. Verheijen, 2002, Astrophys. J. 569, 573.
1998, Astrophys. J. 495,516. Turner, E. L., 1990, Astrophys. J. Lett. 365,L43.
Silk, J., 1967, Nature (London) 215, 1155. Turner, M. S . , 1999, in The GaLacficHalo, Astronomical Soci-
Silk, J., 1968, Astrophys. J. 151,459. ety of the Pacific Conference Proceedings No. 165, edited by
Silk, J., and N. Vittorio, 1987*Astrophys. J. 317, 564. B. K. Gibson, T.S . Axelrod, and M. E. Putnam (Astronomi-
Singh, T. P.: and T. Padmanabhan, 1988, Int. 3. Mod. Phys. A 3, cal Society of the Pacific, San Francisco), p. 431.
1593. Turner, M. S.* and M. White, 1997- Phys. Rev. D 56, R4439.
Skordis- C., and A. Albrecht, 2002, Phys. Rev. D 66,043523. Turner. M. S.. and L. Widrow, 1988, Phys. Rev. D 37,2743.
Smith: S., 1936, Astrophys. J. 83. 23. Unruh. W. G . , 1989, Phys. Rev. D 40,1048.
Smoot, G . F., er d , 1992, Astrophys. J. Lett. 396. L1. Urefia-Lbpez, L. A., and T. Matos, 2000, Phys. Rev. D 62,
Sommer-Larsen, J., and A. Dolgov. 2001, Astrophys. J. 551, 081302.
608. Uson, J. M., and D.T. Wilkinson, 1984, Nature (London) 3 Y ,
Souradeep, T.. and B. Ratra, 2001, Astrophys. J. 560, 28. 427.
Spergel, D., and U.-L. Pen, 1997, Astrophys. I. Lett. 491, L67. Uzan, J.-P., 1999, Phys. Rev. D 59, 123510.
Spergel, D. N., and P. J. Steinhardt, 2000, Phys. Rev. Lett. 84, Uzan, J.-P., 2003, Rev. Mod. Phys. 75, 403.
3760. Uzan, J.-P,, and E Bernardeau, 2001, Phys. Rev. D 64, 083004.
Spergel, D. N., et a l , 2003, e-print asuo-ph/0302209. Uzawa, K., and J. Soda, 2001, Mod. Phys. Lett. A 16, 1089.
Spokoiny. B., 1993, Phys. Lett. B 315,40. Van Waerbeke, L., Y. Mallier, R. Pe116, U.-L. Pen. H. J. Mc-
Starobinsky, A. A,, 1982, Phys. Lett. 117J3, 175. Cracken, and B. Jain, 2002, Astron. Astrophys. 393, 369,
Starobinsky, A. A,, 1998- Gravitation Cosmol. 4, 88. Veltman, M., 1975, Phys. Rev. Lett. 34,777.
Steigman. G., 2002>private communication. Verde, L., er al., 2002, Mon. Not. R. Astron. SOC.335, 432.
Steigman, G., D. N. Schrarnm, and J. E. Gunn, 1977, Phys Viana, I? T. I?, R . C. Nichol. and A. R. Liddle, 2002, Astrophys.
Lett. 66B,202. J. Lett. 569, L75.
Steinhardt, P. J., and N. Turok, 2002, Science 296, 1436. Vilenkin, A., 1984, Phys. Rev. Lett. 53, 1016.
Steinhardt, P. I.,L. Wang, and I. Zlatev, 1999, Phys. Rev. D 59, Vilenkin. A., 2001, e-print astro-ph10106083.
123504. Vilenkin, A,, and E. l? S . Shellard, 1994, Cosmic Strings and
Stoehr. F,, S . D. M. White, G. Tormen, and V. Springel, 2002, Other TopoIogical Defecrs (Cambridge University, Cam-
Mon. Not. R. Astron. SOC.335, L84. bridge, England).
Stompor, R.. el d ,2001, Astrophys. J. Lett. 561, L7. Vishwakarma, R. G., 2001, Class. Quantum Grav. 18. 1159.
Straumann, N., 2002, e-print astro-ph10203330, Vittorio, N., and J. Silk, 1985, Astrophys. J. Lett. Ed. 297, L1.
Sugiyama, N.. and N. Gouda, 1992, Prog. Theor. Phys. 88,803. Waga, I., and J. A. Frieman, 20W, Phys. Rev. D 62, 043521.
Sunyaev. R.A,, and Ya. B. Zeldovich, 1970, Astrophys. Space Wang, L., and l? J . Steinhardt, 1998, Astrophys. J. 508, 483.
Sci. 7, 3. Wang, X.. M. Tegmark, and M. Zaldarriaga, 2002, Phys. Rev. D
Susskind, L., 1979, Php. Rev. D 20,2619. 65, 123001.
Sutherland, W., H. Tadros, G . Efstathiou, C. S. Frenk, 0 . Wang, Y.,and G. Lovelace, 2001, Astrophys. J. Lett. 562, L l l j ,
Keeble. S. Maddox, R. G. McMahon, S. Oliver. M . Rowan- Wasserman, I., 2002, Phys. Rev. D 66, 123511.
Robinson. and s. D. M. White, 1999, Mon. Not. R. Astron. Weinberg, S . , 1987, Phys. Rev. Lett. 59. 2607.
SOC. 300s. 289. Weinberg, S., 1989. Rev. Mod. Phys. 61, 1.
Tadros, H., et al., 1999, Mon. Not. R. Astron. SOC.305, 527. Weinberg, S., 2001, in Relalivisric Astrophysics, edited by J. C.
Tammann, G. A.? B. Reindl, E Thim, A. Saha: and A. Wheeler and H. Martel, ALP Conf. Proc. No. 586 (AIP,
Sandage. 2001. in A New Era in Cosmology, Astronomical Melville, NY), p. 893.
Society of the Pacific Conference Proceedings, edited by T. Weiss, N., 1987, Phys. Lett. B 197, 42.
Shanks and N. Metcalf (Astronomical Society of the Pacific, Weller, J., and A. Albrecht, 2002, Phys. Rev. D 65, 103512.
San Francisco), in press. Wetterich, C., 1988, Nucl. Phys. B 302, 668.
Tasitsiomi, A,, 2002, e-print astro-phi0205464. Wey5 H., 1923, Phys. 2. 24, 230.
606 P. J. E. Peebles and Bharat Ratra: The cosmological constant and dark energy
White, M., and C. S. Kochanek, 2001, Astrophys. J. 560,539. Yamamoto, K., M. Sasaki, and T. Tanaka, 1995, Astrophys. J.
White, S . D. M., 1992, in Clusters and Superclusters of Galax- 455,412.
ies, edited by A. C. Fabian (Kluwer, Dordrecht), p. 17. Zaldarriaga, M., D. N. Spergel, and U. Seljak, 1997, Astrophys.
White, S . D. M., J. I? Navarro, A. E. Evrard, and C. S. Frenk, J. 288, 1.
1993, Nature (London) 366,429. Zee, A,, 1980, Phys. Rev. Lett. 44, 703.
Whittaker, E. T., 1935, Proc. R. SOC. London, Ser. A 149,384. Zee, A., 1985, in High Energy Physics, edited by S . L. Mintz
Wilczek, F., 1985, in How Far Are We from the Gauge Forces, and A. Perlmutter (Plenum, New York), p. 211.
edited by A. Zichichi (Plenum, New York), p. 157. Zeldovich, Ya. B., 1964, Astron. Zh. 41,19 [Sov. Astron. 8, 13
Wilkinson, D. T., and P. J. E. Peebles, 1990, in The Cosmic (1964)i.
Microwave Background: 25 Years Later, edited by N. Man-
Zeldovich, Ya. B., 1967, Zh. Eksp. Teor. Fiz., Pisma Red. 6,
883 [JETP Lett. 6,316 (1967)l.
dolesi and N. Vittorio (Kluwer, Dordrecht), p. 17.
Zeldovich, Ya. B.,1968, Usp. Fiz. Nauk 95,209 [Sov. Phys.
Wilson, G., N. Kaiser, and G. A. Luppino, 2001, Astrophys. J.
Usp. 11,381 (1968)l.
556,601. Zeldovich, Ya. B., 1972, Mon. Not. R. Astron. SOC.160,1P.
Witten, E., 2001, in Sources and Detection of Dark Matter and Zeldovich, Ya. B., 1978, in IAU Symposium 79, The Large-
Dark Energy in the Universe, edited by D. B. Cline (Springer, Scale Structure of the Universe, edited by M. S . Longair and J.
Berlin), p. 27. Einasto (Reidel, Dordrecht), p. 409.
Worthey, G., 1994, Astrophys. J., Suppl. Ser. 95,107. Zeldovich, Ya. B., 1981, Usp. Fiz. Nauk 133,479 [Sov. Phys.
Wu, J.-H. P., et al., 2001a, Astrophys. J., Suppl. Ser. 132,1. Usp. 24, 216 (1981)l.
Wu, J.-H. P., et al., 2001b. Phys. Rev. Lett. 87,251303. Zeldovich, Ya. B., I . Yu. Kobzarev, and L. B. Okun, 1974, Zh.
Wu, X.-P., and F. Hammer, 1993, Mon. Not. R. Astron. SOC. Eksp. Teor. Fiz. 67,3 [Sov.Phys. JETP 40,1 (1975)J.
262,187. Zimdahl, W., D. J. Schwarz, A. B. Balakin, and D. Pa&,
Wyithe, J. S. B., and A. Loeb, 2002, Astrophys. J. 581,886. 2001, Phys. Rev. D 64,063501.
Yahiro, M., G. J. Mathews, K. Ichiki, T. Kajino, and M. Orito, Zumino, B., 1975, Nucl. Phys. B 89, 535.
2002, Phys. Rev. D 65,063502. Zwicky, F., 1933, Helv. Phys. Acta 26,241.
Marcel Grossmann
1878-1936
Hochschule in 1907. His doctoral thesis and his favorite subject were non-Euclidean
geometry, which, as luck would have it, paved the way for his celebrated collaboration
with Einstein later. Grossmann was a teacher of outstanding ability who trained many
mathematicians in geometry. [2]
In the meantime, his friend Einstein was attempting to obtain a position at Berne
University, so at the end of 1907 he submitted his 1905 paper on special relativity, "On
the Electrodynamics of Moving Bodies," to Berne University as an inaugural thesis. The
paper was rejected as being incomprehensible! [3] Einstein was bitterly disappointed and
gave up his dream of becoming a university professor for a while. He became interested
in a teaching position at the Technical School. Einstein wrote a letter to Grossmann for
advice and explained:
"Do not imagine that I am driven to such careerist ways by megalomania or some
other questionable passion; rather, I came to this hankering only because of an ardent
desire to be able to continue my private scientific work under less unfavorable
conditions, as you will certainly understand...."
As a professor of geometry, Grossmann organized summer courses for high
school teachers. In 1910 he became one of the founders of the Swiss Mathematical
Society. Within a year, he became Dean of the mathematics-physics section of the
Eidgenössische Technische Hochschule. As a new Dean, he made an effort to persuade
Einstein to return to the ETH. As destiny would have it, Einstein agreed in 1912: "I am
extraordinarily happy about the prospect of returning to Zurich." Around this time,
Einstein sought to formulate mathematically his ideas on the general theory of relativity;
he turned to his friend for assistance: "Grossmann, you must help me, or else I'll go
crazy!" After discussions with Grossmann, Einstein was sure that he was on the right
path. Grossmann introduced Einstein to the absolute differential calculus, started by E. B.
Christoffel (1864) and fully developed by Ricci and Levi-Civita (1901). Grossmann
facilitated Einstein's unique synthesis of mathematical and theoretical physics in what is
still today considered the most elegant and powerful theory of gravity: the General
Theory of Relativity. [4]
Pais has an interesting, lucid and detailed analysis of the Einstein-Grossmann
paper (I. Physical Part, by A. Einstein; II. Mathematical Part, by M. Grossmann): [5]
Grossmann's concluding section starts as follows. "The problem of the
formulation of the differential equations of a gravitation field draws attention to the
differential invariants ... and ... covariants of ... $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$." He then presents to
Einstein the major tensor of the future theory: "the Christoffel four-index symbol," now
better known as the Riemann-Christoffel (curvature) tensor [$R^\lambda{}_{\mu\nu\kappa} = \ldots$]. From this
tensor it is "... possible to derive a second-rank tensor of the second order [in the
derivatives of $g_{\mu\nu}$]," the Ricci tensor: $R_{\mu\nu} = R^\lambda{}_{\mu\lambda\nu}$ ....
Unfortunately, Grossmann reached an erroneous conclusion: "It turns out,
however, that in the special case of the infinitely weak, static gravitational field this
tensor does not reduce to the expression $\Delta\varphi$." Alas, the symmetry property of general
covariance and the freedom of choosing a coordinate condition were not properly
understood by the collaborators.
A crucial change of destiny for their collaboration occurred in 1913 when Planck
and Nernst came to Zurich to persuade Einstein to go to Berlin. As a result Einstein left
Zurich in March 1914. During the next year, Einstein's main endeavor in Berlin was to
really understand his own idea of general covariance and the key role the Ricci tensor
played in his theory.
A page from the draft of Einstein's landmark paper "The Foundation of the
General Theory of Relativity" was included in his collected works. [6] Einstein wrote:
"... Finally, I want to acknowledge gratefully my friend, the mathematician
Grossmann, whose help not only saved me the effort of studying the pertinent
mathematical literature, but who also helped me in my search for the field equations of
gravitation." For some reason, Einstein did not publish this page together with his
landmark paper.
Grossmann died of multiple sclerosis in 1936. Einstein wrote a letter to
Grossmann's wife to convey his heartfelt appreciation of Grossmann's kindness:
"... Our student days together come back to me. He is a model student; I untidy
and a dreamer. He on excellent terms with the teachers and grasping everything easily; I
aloof and discontented, not very popular. But we were good friends and our
conversations over iced coffee at the Metropol every few weeks belong among my nicest
memories. Then the end of the studies ... I suddenly abandoned by everyone, facing life
not knowing which way to turn. But he stood by me and through him and his father I
came to Haller in the Patent Office a few years later. In a way, this saved my life; not
that I would have died without it, but I would have been intellectually stunted."
But Einstein did not write an obituary shortly after Grossmann's death. Pais
commented: "I have a sense of regret that Einstein did not do something for which he
had often demonstrated a talent and sensitivity."
In the last year of Einstein's life (1955), he wrote of Grossmann, of their
collaboration, and how the latter had checked through the literature and soon discovered
that the mathematical problem had already been solved by Riemann, Ricci, and
Levi-Civita. Einstein also wrote: "The need to express at least once in my life my gratitude to
Marcel Grossmann gave me the courage to write this ... autobiographical sketch." [7,8]
In 1975, the International Center for Theoretical Physics, Trieste, Italy announced
the Marcel Grossmann Meeting on the Recent Progress of the Fundamentals of General
Relativity as follows:
"Marcel Grossmann was associated with Albert Einstein in elucidating the
mathematical basis of general relativity. In commemoration of his contributions, a
Marcel Grossmann meeting on the Recent Progress of the Fundamentals of General
Relativity will be held at the International Center for Theoretical Physics in Trieste
during the period 7-12 July 1975 under the directorship of Professor R. Ruffini (Institute
for Advanced Study, Princeton, N.J., USA) and cosponsored by the University of Trieste.
The theme of the Meeting will cover recent advances in the mathematical techniques of
general relativity as well as progress in the physics of relativistic field theories...."
We would like to thank Anna Revay-Grossmann and Carlo Revay for materials
regarding their grandfather Marcel Grossmann.
References
1. B. Hoffmann and H. Dukas, Albert Einstein, Creator and Rebel (The Viking Press,
1972), p. 105.
2. L. Kollros, Prof. Dr. Marcel Grossmann, 1878-1936 (Extrait des Actes de la Société
Helvétique des Sciences Naturelles, Genève 1937), pp. 325-329. A. Pais, Subtle
is the Lord... (Oxford Univ. Press, 1982), Chapter 12.
3. See ref. 1, p. 86. This is understandable: Even a thinker like Mach, whose previous
criticism of absolute space and absolute motion played a major role in paving
the way for Einstein, was to say harsh things about Einstein's special relativity.
4. R. Ruffini, Marcel Grossmann (unpublished).
5. See paper A in Chapter 2 of this volume. A. Pais, ref. 2.
6. The Collected Papers of Albert Einstein (Ed. J. Stachel, Princeton Univ. Press, 1993).
English translator Anna Beck. Vol. 5.
7. A. Einstein, in Helle Zeit, Dunkle Zeit (C. Seelig, Ed.), Europa Verlag, Zurich,
1956. [A. Einstein, Erinnerungen - Souvenirs (100 Jahre Eidgenössische
Technische Hochschule, 28 Jahrgang, 1955)].
8. A. Pais, Subtle is the Lord... (Oxford Univ. Press, 1982), p. 225.
Remembering
Robert L. Mills
A few months ago, I was saddened
to learn that Robert L. Mills had
passed away. As a fellow graduate of
Columbia College, I feel a special
bond with Mills and am therefore
submitting this memorial note.
Mills, who shared with C. N. Yang
the 1980 Rumford Premium Prize
from the American Academy of Arts
and Sciences for development of a
generalized gauge invariant field
theory, died on 27 October 1999
from prostate cancer. His passing
was a great loss to his family,
friends, and the physics community.
Robert L. Mills
Mills was born on 15 April 1927 in Englewood, New Jersey. He graduated from George
School in Pennsylvania in early 1944 and, in March, entered Columbia College in
New York. While there, he enlisted in the US Merchant Marines in the last year of
World War II; he served until 1947.
On leave from the service, he attended classes at Columbia, where his father was an
economics professor. In 1948, his senior year, Mills was a winner of the Putnam
national college mathematics contest. The mathematical ability he displayed there was
evident throughout his career as a theoretical physicist. He then studied at Cambridge
University, where he received first-class honors in the mathematical tripos and a
master's degree. Mills returned to Columbia and got his PhD in 1955 under Norman
Kroll for a thesis on radiative corrections in quantum electrodynamics.
From 1953 to 1955, Mills was a research associate at Brookhaven National Laboratory
and shared an office with Yang. During that time, they developed what is now known
as the Yang-Mills theory, a non-abelian local gauge invariant theory that would become
one of the pivotal concepts of physics. It formed the model for non-abelian gauge
theories that followed and is thus one of the bases for the standard model of elementary
particles and string theory. [2] Yang-Mills also has applications to mathematics.
From 1955 to 1956, Mills was a member of the Institute for Advanced Study in
Princeton, New Jersey. He then joined the physics department of Ohio State University
and became a full professor in 1962. He remained at OSU until his retirement in 1995.
His research was in quantum field theory, the theory of alloys, and many-body theory.
He worked with Andrew Sessler on many-body theory; later, Leon Cooper joined in the
effort. That work, besides producing papers that appeared in Physical Review and
Physical Review Letters, resulted in Mills's writing a book, Propagators for
Many-Particle Systems: An Elementary Treatment (Gordon and Breach, 1969). He later
wrote Space, Time and Quanta: An Introduction to Contemporary Physics (W. H.
Freeman, 1994). For his outstanding dedication to his students, Mills received OSU's
Rosalene Sedgwick Faculty Service Award. With his wife, Lee, he shared the OSU
International Community Service Award. He was a visiting professor at many schools
and a visiting scientist at CERN. After his retirement, he taught for a year as a Fulbright
scholar at St. Patrick's College in Ireland.
According to Sessler, Robert was even-tempered and simply a joy to work with. His
coworkers enjoyed interacting with him. A memorial piece in the 2000 OSU Physics
Department Magazine concludes with this statement: "A gentleman of unfailing good
humor and sincere and active concern for helping others, Robert Mills will be long
remembered with great respect and affection."
While preparing this letter, I couldn't help but observe Mills's devotion to his friends
and their devotion to him, as exemplified by his interaction with Yang and Cooper, both
Nobel laureates. The following comments were obtained via private communication.
"In 1953-1954, I was visiting Brookhaven and Bob was my office mate. We discussed
many things in physics, from the experimental results pouring out of the new
Cosmotron, to theoretical topics like renormalization and the Ward identity. It was in
that year that we found the very elegant and unique generalization of Maxwell's
equation. We were pleased by the beauty of the generalization, but neither of us had
anticipated its great impact on physics 20 years later.
"Bob spent one year, I think it was 1955-1956, at the Institute for Advanced Study in
Princeton and we resumed our collaboration. One fruit of that was a paper on the
overlapping divergence in the photon propagator which, however, was not written up
for publication [3] until 1966, when he and his family visited us for the summer just
after I had moved to [SUNY] Stony Brook.
"Bob was an old-fashioned man. Among all the physicists that I know, he was certainly
one of the most honest and the most sincere.
"Bob had a brilliant mind. He was very quick at grasping new ideas. I shall treasure the
memory of our intensive collaboration and of our many discussions on diverse topics
ranging from accelerator theory to the theory of computability."
-C. N. Yang
"Bob Mills and I, with Andy Sessler, wrote a paper discussing possible superfluidity of
helium-3. In it, we suggested that the electron pairing due to a phonon-mediated
electron-electron interaction could be duplicated among $^3$He atoms due to the
atom-atom potential. Although the solution in $^3$He turned out to be somewhat more
complicated, the basic idea was vindicated with the discovery, about ten years later, of
the superfluidity of $^3$He.
"Bob Mills was a talented, creative physicist. We miss him."
-Leon Cooper
I would be remiss in compiling this tribute to Mills if I didn't mention the direct or
indirect influence of Yang-Mills on some of the advances establishing the standard
model. These advances, tours de force all, illustrate the wonderful synergy of theoretical
and experimental physics and include the Glashow-Weinberg-Salam (GWS) theory, the
Glashow-Iliopoulos-Maiani (GIM) model, the successful searches for neutral currents
and for the gauge particles W and $Z^0$, the proof of the renormalization of Yang-Mills
theories, and quantum chromodynamics encompassing asymptotic freedom and quark
confinement. This is a splendid legacy indeed.
Many thanks to Lee Mills for so generously giving of her time to provide me with
information on her husband's career.
References
1. C. N. Yang, R. L. Mills, Phys. Rev. 96, 191 (1954).
2. For details on Yang-Mills in the development of gauge theory, see L. O'Raifeartaigh,
N. Straumann, Rev. Mod. Phys. 72, 1 (2000); also see L. O'Raifeartaigh, The Dawning
of Gauge Theory, Princeton U. Press, Princeton, N.J. (1997), for a reprinting of
fundamental papers in gauge theory up to 1956, with commentaries.
3. R. L. Mills, C. N. Yang, Prog. Theor. Phys. Sup. 37, 507 (1966).
4. L. N. Cooper, R. L. Mills, A. M. Sessler, Phys. Rev. 114, 1377 (1959).
Samuel L. Marateck
(marateck@cs.nyu.edu)
Courant Institute of Mathematical Sciences
New York University
New York City