FirstCourseGR Notes On Schutz2009 PDF

A detailed solution manual and guide for
Schutz's First Course in General Relativity

(Schutz, 2009)
Robert B. Scott
Department of Physics, University of Brest,
Brest, France
To whom correspondence should be addressed; E-mail: robert.scott@univ-brest.fr
February 10, 2012
Note to user.
This manual is to be used as a companion to the textbook A First Course
in General Relativity, by Bernard Schutz, 2nd edition, published 2009 by
Cambridge University Press. It will only make sense when read with Schutz's
text. Herein you'll find brief notes meant to clarify and amplify his textbook,
presented in the same order as his textbook. Most importantly, you'll find
solutions to almost all the exercises. These are presented in great detail,
with each step explained, including references to the text. You'll also find
my supplementary problems, which are meant to establish intermediate goals
necessary to solve Schutz's exercises or to amplify concepts not covered by
his exercises. Comments should be sent to robert.scott@univ-brest.fr.
Chapter 1
Special Relativity
1.1 Fundamental principles of special relativity (SR) theory
In footnote 4 on p. 3, the answer to the first question is no: the soup
is unaffected by the acceleration experienced by an astronaut in orbit. This
would appear to also cause problems for SR, since how do we know that
an observer is in an inertial frame? The acceleration cannot, necessarily,
be measured locally. And there's no special reference frame with which to
measure one's acceleration.
1.2 Definition of an inertial observer in SR
Gives a geometrical definition of an inertial reference frame, or coordinate
system.
Notes that gravity makes it impossible to construct such an inertial
coordinate system.
1.3 New units
Introduces what Misner et al. (1973) called geometric units, wherein time
is measured in distance of light travel.
They claim the motivation is that $c = 3 \times 10^8$ m/s in SI, a ridiculous
value. I disagree, since then 1/3 of a second becomes the ridiculously large
$10^5$ km! A more useful motivation comes from
velocity becoming a dimensionless parameter,
space-time diagrams having the same units on all axes, and
the world lines of light paths having unit slope.
1.4 Spacetime diagrams
Typo: Fig. 1.4: $\mathbf{v}$ is of course a vector, so one should replace this with
$v = |\mathbf{v}|$.
1.5 Construction of the coordinates used by another observer
This is an extremely important section. Unfortunately he doesn't explain
why the angle of the $\bar{x}$-axis to the $x$-axis is $\phi = \arctan(v)$, where $v = |\mathbf{v}|$
is the magnitude of the velocity of $\bar{O}$ along the $x$-axis. Rather, this
result appears in Fig. 1.5 without explanation, and is not even delegated as an
exercise for the student. The result does follow from the construction of
the $\bar{x}$-axis, but the steps involved are not trivial. Please see supplementary
problem SP.1 in section 1.15.
1.6 Invariance of the interval
This section purports to provide a proof of the invariance of the interval.
But bear in mind that it assumes that the relationship between coordinates
in different frames is linear; see the discussion before Eq. (1.2).
He reduces the relationship between the interval in one frame and another
to a function of the relative velocities of their origins, see Eq. (1.5) on p. 10,
$$\Delta\bar{s}^2 = -M_{00}\,\Delta s^2 = \phi(\mathbf{v})\,\Delta s^2.$$
To show that $\phi(\mathbf{v})$ depends only upon the magnitude of the velocity, and not
upon its direction, he considers the case of a
rod (or two events A and B at the ends of the rod) lying along the $y$-axis.
A and B are simultaneous in $O$, and he argues that they are therefore also
simultaneous in $\bar{O}$, by constructing the $\bar{y}$-axis as he did in Fig. 1.3. But
now the velocity of $\bar{O}$ is orthogonal to the constructed axis, so of course
the simultaneity of events is not changed by the coordinate transform. The
intermediate result is that the space-time interval between A and B in either
frame is just the square of the length, so their ratio is the sought-after $\phi(\mathbf{v})$.
Now the subtle point is that he then claims that this ratio cannot depend on
the direction of the velocity, because the rod is perpendicular to it and there
are no preferred directions?! So what??
I think the solution is that $\mathbf{v}$ could be in an arbitrary direction in the $x$-$z$
plane. The ratio of lengths should not depend upon this direction, because
then there would be preferred directions. But as far as I can tell, this only
shows that the direction of the component orthogonal to the $y$-axis cannot
influence $\phi(\mathbf{v})$.
1.7 Invariant hyperbolae
At the end of the section it's stated that
"The lesson of Fig. 1.12b is that the tangent to a hyperbola at any
event $\mathcal{P}$ is a line of simultaneity of the Lorentz frame whose time
axis joins $\mathcal{P}$ to the origin. If this frame has velocity $v$, the tangent
has slope $v$."
The above is stated without proof or even a hint that there's some calculation
involved. Fortunately it proceeds straightforwardly. We seek the slope
of the tangent to a hyperbola. Differentiate any timelike hyperbola,
$-t^2 + x^2 = \text{const}$, with respect to $x$, to obtain in general
$$\frac{dt}{dx} = \frac{x}{t}.$$
At some point $\mathcal{P}$ the slope of the tangent with respect to the $x$-axis is $x_p/t_p$. Now if the
$\bar{t}$-axis is chosen to go through the origin and $\mathcal{P}$, its slope with respect to the $t$-axis will
also be $x_p/t_p$, corresponding to $\tan(\phi) = v = x_p/t_p$. But we know from Fig. 1.5
that the corresponding $\bar{x}$-axis will have slope $v$ relative to the $x$-axis. That
is, the tangent is parallel to the $\bar{x}$-axis, and is therefore a line of simultaneity
for $\bar{O}$. QED.
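The tangent-slope claim can be checked numerically. The following is a minimal sketch, assuming the hyperbola is parametrized by a velocity parameter (the function names and the finite-difference step `h` are illustrative choices of mine, not anything from Schutz):

```python
import math

def hyperbola_point(rapidity, a=1.0):
    """Event (t, x) on the timelike hyperbola -t^2 + x^2 = -a^2."""
    return a * math.cosh(rapidity), a * math.sinh(rapidity)

def tangent_slope(rapidity, a=1.0, h=1e-6):
    """Central finite-difference estimate of dt/dx along the hyperbola."""
    t1, x1 = hyperbola_point(rapidity - h, a)
    t2, x2 = hyperbola_point(rapidity + h, a)
    return (t2 - t1) / (x2 - x1)

for W in (0.2, 0.8, 1.5):
    t_p, x_p = hyperbola_point(W)
    v = x_p / t_p  # velocity of the frame whose time axis passes through P
    print(f"v = {v:.6f}, tangent dt/dx = {tangent_slope(W):.6f}")
```

For each sample point the two printed numbers should agree to within rounding, which is the statement that the tangent has slope $v$.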
1.8 Particularly important results
Time dilation. This was straightforward once one uses the invariant hyperbolae.
The event $x_B$ was constructed so that it had $\bar{t} = 1$. The corresponding
event in $O$ is obtained by tracing the point back to the $t$-axis along the
hyperbola with the same interval, $\Delta s^2 = -1$,
$$-t^2 + x^2 = -1.$$
One must also note that the equation for the $\bar{t}$-axis is $t = x/v$. Substituting
this into the hyperbola,
$$-t_B^2 + x_B^2 = -1 \tag{1.1}$$
$$-t_B^2 + (t_B v)^2 = -1 \tag{1.2}$$
$$t_B = \frac{1}{\sqrt{1 - v^2}} \tag{1.3}$$
This gives Eq. (1.8).
Lorentz contraction. I still don't see how he came up with
$$x_C = \frac{l}{\sqrt{1 - v^2}}.$$
But I obtain the same end result using instead the invariance of the interval,
which in $\bar{O}$ is
$$\Delta s^2_{AC} = -\Delta\bar{t}^2 + \Delta\bar{x}^2 = -\bar{t}^2 + \bar{x}^2 = 0 + l^2 = l^2,$$
and therefore must also be $l^2$ in $O$. I also used the equation for the $\bar{x}$-axis,
$t = vx$. This was confusing at first since the units look wrong! But it's clear
when you go back to Fig. 1.5 and note that $\tan(\phi) = v$, which was obvious for
the $\bar{t}$-axis since the observer $\bar{O}$ is moving along the $x$-axis at speed $v$. That
the $\bar{x}$-axis was also inclined at the same angle was more complicated. One
also needs a third relation, which is simply $x_C - x_B = v t_C$. A little algebra
gives the Lorentz contraction:
$$x_B = l\sqrt{1 - v^2}.$$
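The "little algebra" for both results can be traced step by step in a few lines. This is only a sketch of the relations used in these notes (hyperbola, axis equations, and the world line of the rod's end); the function names and the sample values `v = 0.6`, `l = 1.0` are my own illustrative choices:

```python
import math

def time_dilation(v):
    """t_B from -t^2 + x^2 = -1 with x = v t (the tbar-axis)."""
    return 1.0 / math.sqrt(1.0 - v**2)

def contracted_length(l, v):
    """x_B from the three relations used for the Lorentz contraction."""
    x_c = l / math.sqrt(1.0 - v**2)  # from -t_C^2 + x_C^2 = l^2 with t_C = v x_C
    t_c = v * x_c                    # C lies on the xbar-axis, t = v x
    return x_c - v * t_c             # world line of the rod's end: x_C - x_B = v t_C

v, l = 0.6, 1.0
print(time_dilation(v))         # compare with 1 / sqrt(1 - v^2)
print(contracted_length(l, v))  # compare with l * sqrt(1 - v^2)
```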
1.9 Lorentz transformation
The first step, substituting the equations for the $\bar{O}$ axes, proceeds immediately
to
$$\bar{t} = \alpha(t - vx) \tag{1.4}$$
$$\bar{x} = \sigma(x - vt) \tag{1.5}$$
I had trouble seeing how $\alpha = \sigma$ follows from the path of a light ray, so I used the
invariant hyperbolae instead. Substituting (1.4) and (1.5) into the equation
for the interval from the origin gives
$$\Delta s^2 = -t^2 + x^2 = \Delta\bar{s}^2 = -\bar{t}^2 + \bar{x}^2.$$
The cross term on the RHS involving $x\,t$ must be zero, giving that
$$\alpha^2 = \sigma^2.$$
Equating either of the other terms gives the Lorentz factor,
$$\alpha^2 = \frac{1}{1 - v^2}.$$
As Schutz (2009, p. 22) points out, the positive root is selected so that the
coordinates are not inverted when $v = 0$.
In retrospect, it is clear how the path of a light ray gives $\alpha = \sigma$. Simply
note that the world line of a light ray has $x = t$ and $\bar{x} = \bar{t}$.
Substitution into (1.4) and (1.5) gives
$$\frac{\sigma}{\alpha}\left(\frac{x - vt}{t - vx}\right) = \frac{\bar{x}}{\bar{t}} = 1.$$
So $\alpha = \sigma$.
The Lorentz transformation is often said to reduce to the Galilean transformation
in the limit $v \ll 1$, but that's not strictly true. Unlike for the
Galilean transformation, in the Lorentz transformation time is affected at
large distances even for small velocities.
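A short numerical illustration of the last remark, as a sketch in geometric units ($c = 1$). The sample speed `v = 1e-4` and the distances are arbitrary values chosen for illustration:

```python
import math

def lorentz(t, x, v):
    """Lorentz transformation of (t, x) for relative velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return gamma * (t - v * x), gamma * (x - v * t)

def galilean(t, x, v):
    """Galilean transformation: time is untouched."""
    return t, x - v * t

v = 1e-4  # a small velocity in units of c
for x in (1.0, 1e4, 1e8):
    t_bar, _ = lorentz(0.0, x, v)
    t_gal, _ = galilean(0.0, x, v)
    print(f"x = {x:g}: Lorentz tbar = {t_bar:g}, Galilean tbar = {t_gal:g}")
```

The Lorentz $\bar{t}$ is approximately $-vx$, which grows without bound with distance even though $v \ll 1$, while the Galilean $\bar{t}$ stays at zero.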
1.10 Velocity composition law
1.11 Paradoxes and physical intuition
1.12 Further reading
A more thoughtful look at fundamentals, Bohm (2008).
1.13 Appendix: the twin paradox dissected
1.14 Exercises
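1.14.1 Convert to geometric units
a)
$$10\ \text{J} = 10\ \text{N m} = 10\ \text{kg m}^2/\text{s}^2 = \frac{10}{9 \times 10^{16}}\ \text{kg} = 1.11 \times 10^{-16}\ \text{kg}.$$
b)
$$100\ \text{W} = 100\ \text{J/s} = 1.11 \times 10^{-15}\ \text{kg/s} = \frac{1.11 \times 10^{-15}}{3 \times 10^{8}}\ \text{kg/m} = 0.371 \times 10^{-23}\ \text{kg/m}$$
c)
$$\hbar = 1.05 \times 10^{-34}\ \text{J s} = \frac{1.05 \times 10^{-34}\ \text{J s}}{3 \times 10^{8}\ \text{m/s}} = 0.352 \times 10^{-42}\ \text{kg m}$$
d) Car velocity [108 km/hr]
$$v = 30\ \text{m/s} = 10^{-7}$$
e) Car momentum
$$p = 30\ \text{m/s} \times 1000\ \text{kg} = 10^{-4}\ \text{kg}$$
f) Atmospheric pressure,
$$1\ \text{bar} = 10^{5}\ \text{N m}^{-2} = \frac{10^{5}\ \text{kg m s}^{-2}}{9 \times 10^{16}\ \text{m}^{4}\ \text{s}^{-2}} = 1.1 \times 10^{-12}\ \text{kg m}^{-3}$$
g) Water density
$$10^{3}\ \text{kg m}^{-3}$$
h) Luminosity flux
$$10^{6}\ \text{J s}^{-1}\ \text{cm}^{-2} = 10^{10}\ \text{J s}^{-1}\ \text{m}^{-2} = \frac{10^{10}\ \text{J s}^{-1}\ \text{m}^{-2} \times 1.11 \times 10^{-17}\ \text{kg J}^{-1}}{3 \times 10^{8}\ \text{m s}^{-1}} = 3.71 \times 10^{-16}\ \text{kg m}^{-3}$$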
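These conversions all follow one rule: every factor of seconds in the SI unit is replaced by $c$ metres. A minimal sketch of that rule (the helper `to_geometric` and its `time_power` argument are my own illustrative names, and $c$ is rounded to $3 \times 10^8$ m/s as in the hand calculations):

```python
C = 3.0e8  # speed of light in m/s, rounded as in the hand calculations

def to_geometric(value_si, time_power):
    """Convert an SI value to geometric units (c = 1).

    time_power is the exponent of seconds in the SI unit, e.g. -2 for
    J = kg m^2 s^-2. Each factor of s is replaced by C metres.
    """
    return value_si * C**time_power

print(to_geometric(10.0, -2))      # (a) 10 J      -> kg
print(to_geometric(100.0, -3))     # (b) 100 W     -> kg/m
print(to_geometric(1.05e-34, -1))  # (c) hbar      -> kg m
print(to_geometric(30.0, -1))      # (d) 30 m/s    -> dimensionless
print(to_geometric(3.0e4, -1))     # (e) 3e4 kg m/s -> kg
print(to_geometric(1.0e5, -2))     # (f) 1 bar     -> kg m^-3
print(to_geometric(1.0e10, -3))    # (h) flux      -> kg m^-3
```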
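1.14.2 Convert from natural units (c = 1) to SI units
2 (a) Velocity, $v = 10^{-2}$:
$$v = 10^{-2}\, c\ [\text{m s}^{-1}] = 3 \times 10^{6}\ [\text{m s}^{-1}]$$
2 (b) Pressure, $10^{19}$ [kg m$^{-3}$]:
$$10^{19}\ [\text{kg m}^{-3}] \times c^{2}\ [\text{m}^{2}\ \text{s}^{-2}] = 9 \times 10^{35}\ [\text{N m}^{-2}]$$
2 (c) Time, $10^{18}$ [m]:
$$\frac{10^{18}\ [\text{m}]}{c\ [\text{m s}^{-1}]} = 3.3 \times 10^{9}\ [\text{s}]$$
2 (d) Energy density, 1 [kg m$^{-3}$]:
$$1\ [\text{kg m}^{-3}] \times c^{2}\ [\text{m}^{2}\ \text{s}^{-2}] = 9 \times 10^{16}\ [\text{J m}^{-3}]$$
2 (e) Acceleration, 10 [m$^{-1}$]:
$$10\ [\text{m}^{-1}] \times c^{2}\ [\text{m}^{2}\ \text{s}^{-2}] = 9 \times 10^{17}\ [\text{m s}^{-2}]$$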
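1.14.6 Show that Eq. (1.2) contains only $M_{\alpha\beta} + M_{\beta\alpha}$ when $\alpha \neq \beta$, not $M_{\alpha\beta}$
and $M_{\beta\alpha}$ independently. Argue that this allows us to set $M_{\alpha\beta} = M_{\beta\alpha}$ without
loss of generality.
$$\Delta\bar{s}^2 = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} M_{\alpha\beta}\,(\Delta x^{\alpha})(\Delta x^{\beta})$$
Pick a pair of indices, $\alpha = \alpha^*$ and $\beta = \beta^*$ say, where $\alpha^* \neq \beta^*$, and
$\alpha^* \in \{0 \ldots 3\}$ and $\beta^* \in \{0 \ldots 3\}$. So $\Delta\bar{s}^2$ contains a term like
$$M_{\alpha^*\beta^*}\,(\Delta x^{\alpha^*})(\Delta x^{\beta^*}).$$
But $\Delta\bar{s}^2$ also contains a term like
$$M_{\beta^*\alpha^*}\,(\Delta x^{\beta^*})(\Delta x^{\alpha^*}) = M_{\beta^*\alpha^*}\,(\Delta x^{\alpha^*})(\Delta x^{\beta^*}).$$
The equality follows because of course the product does not depend upon
the order of the factors. So we can group these two terms and factor out the
$(\Delta x^{\alpha^*})(\Delta x^{\beta^*})$, leaving
$$(\Delta x^{\alpha^*})(\Delta x^{\beta^*})\,(M_{\alpha^*\beta^*} + M_{\beta^*\alpha^*}).$$
Because the off-diagonal terms always appear in pairs as above, we could
without changing the interval (and therefore without loss of generality) replace
them with their mean value
$$(M_{\alpha^*\beta^*} + M_{\beta^*\alpha^*})/2.$$
Thus the new tensor $\tilde{M}_{\alpha\beta}$ is by construction symmetric.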
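The argument can be illustrated numerically: replacing an arbitrary matrix by its symmetric part leaves the quadratic form unchanged. A minimal sketch with random entries (the seed and helper names are illustrative assumptions, not part of the exercise):

```python
import random

def quadratic_form(M, dx):
    """Sum over alpha, beta of M[alpha][beta] * dx[alpha] * dx[beta]."""
    return sum(M[a][b] * dx[a] * dx[b] for a in range(4) for b in range(4))

def symmetrize(M):
    """Replace each off-diagonal pair by its mean value."""
    return [[0.5 * (M[a][b] + M[b][a]) for b in range(4)] for a in range(4)]

random.seed(0)  # fixed seed so the run is repeatable
M = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
dx = [random.uniform(-1, 1) for _ in range(4)]

print(quadratic_form(M, dx))
print(quadratic_form(symmetrize(M), dx))  # same value up to rounding
```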
1.14.7 In the discussion leading up to Eq. (1.2), assume that the coordinates of $\bar{O}$ are given as the
following linear combinations of those of $O$:
$$\bar{t} = \alpha t + \beta x, \tag{1.6}$$
$$\bar{x} = \mu t + \nu x, \tag{1.7}$$
$$\bar{y} = a y, \tag{1.8}$$
$$\bar{z} = b z, \tag{1.9}$$
where $\alpha, \beta, \mu, \nu, a$, and $b$ may be functions of the velocity $\mathbf{v}$ of $\bar{O}$ relative to $O$,
but they do not depend on the coordinates. Find the numbers $M_{\alpha\beta}$, $\alpha, \beta = 0, \ldots, 3$,
of Eq. (1.2) in terms of $\alpha, \beta, \mu, \nu, a$, and $b$.
First note that the origins of the two coordinate systems line up, so
that $\Delta t = t$ etc. Then the result follows from straightforward substitution
of (1.6) to (1.9) into Eq. (1.1):
$$\Delta\bar{s}^2 = -\bar{t}^2 + \bar{x}^2 + \bar{y}^2 + \bar{z}^2 \tag{1.10}$$
$$= -(\alpha t + \beta x)^2 + (\mu t + \nu x)^2 + (ay)^2 + (bz)^2 \tag{1.11}$$
Grouping terms we find that $(-\alpha^2 + \mu^2)$ multiplies $t^2$, so $M_{00} = -\alpha^2 + \mu^2$.
Similarly, the term multiplying $x^2$ is $M_{11} = -\beta^2 + \nu^2$. The cross terms give
$M_{01} = M_{10} = -\alpha\beta + \mu\nu$, and the remaining diagonal terms are $M_{22} = a^2$,
$M_{33} = b^2$. Other cross terms are nil.
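1.14.8 a) Derive Eq. (1.3) from Eq. (1.2) for general $M_{\alpha\beta}$.
Start with Eq. (1.2)
$$\Delta\bar{s}^2 = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} M_{\alpha\beta}\,(\Delta x^{\alpha})(\Delta x^{\beta}).$$
Expanding the sums,
$$\Delta\bar{s}^2 = M_{00}\,\Delta t^2 + M_{0i}\,\Delta x^i \Delta t + M_{i0}\,\Delta x^i \Delta t + M_{ij}\,\Delta x^i \Delta x^j.$$
Note that $M_{i0} = M_{0i}$ (problem 6). Consider the case $\Delta s^2 = 0$, so from Eq. (1.1),
$\Delta t = \Delta r = \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$. Then,
$$\Delta\bar{s}^2 = M_{00}\,\Delta r^2 + 2M_{0i}\,\Delta x^i \Delta r + M_{ij}\,\Delta x^i \Delta x^j,$$
which is Eq. (1.3).
b) Since $\Delta\bar{s}^2 = 0$ in Eq. (1.3) for any $\Delta x^i$, replace $\Delta x^i$ by $-\Delta x^i$
in Eq. (1.3) and subtract the resulting equations from Eq. (1.3) to
establish that $M_{0i} = 0$ for $i = 1, 2, 3$.
We have set $\Delta s^2 = 0$ and it followed, based upon the universality of the
speed of light, that $\Delta\bar{s}^2 = 0$. Note that changing $\Delta x^i$ to $-\Delta x^i$ does not
change $\Delta r$ nor $\Delta s$. So that's why $\Delta\bar{s}^2 = 0$ in Eq. (1.3).
The only term in Eq. (1.3) to change sign when changing $\Delta x^i$ to $-\Delta x^i$
is the $2M_{0i}\,\Delta x^i \Delta r$ term. The final term doesn't because changing $\Delta x^i$ to
$-\Delta x^i$ also changes $\Delta x^j$ to $-\Delta x^j$; the $i$ is just a dummy index. So when we
subtract from Eq. (1.3) the following
$$\Delta\bar{s}^2 = M_{00}\,\Delta r^2 - 2M_{0i}\,\Delta x^i \Delta r + M_{ij}\,\Delta x^i \Delta x^j$$
we're left with
$$0 = 4M_{0i}\,\Delta x^i \Delta r.$$
This must be true for arbitrary $\Delta x^i$ so $M_{0i} = 0$. QED.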
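c) Derive Eq. (1.4)b
Required to show:
$$M_{ij} = -M_{00}\,\delta_{ij}, \quad (i, j = 1, 2, 3).$$
Adding to Eq. (1.3) the following
$$0 = \Delta\bar{s}^2 = M_{00}\,\Delta r^2 - 2M_{0i}\,\Delta x^i \Delta r + M_{ij}\,\Delta x^i \Delta x^j$$
gives, after dividing by 2,
$$0 = M_{00}\,\Delta r^2 + M_{ij}\,\Delta x^i \Delta x^j \tag{1.12}$$
Suppose $\Delta x = \Delta r$, $\Delta y = \Delta z = 0$. Substituting into (1.12) then gives
$M_{00} = -M_{11}$. Or, when $\Delta y = \Delta r$, $\Delta x = \Delta z = 0$, we see that $M_{00} = -M_{22}$.
Similarly, $M_{00} = -M_{33}$. To see that the off-diagonal terms are zero, note
that it's also possible that $\Delta x = \Delta y = \Delta r/\sqrt{2}$ and $\Delta z = 0$. Substitution into
(1.12), together with the diagonal results just found, gives that
$$0 = (M_{12} + M_{21})\,\Delta r^2/2 = \Delta r^2 M_{12}.$$
Similarly, $M_{13} = 0 = M_{23}$. In summary,
$$M_{ij} = -M_{00}\,\delta_{ij}, \quad (i, j = 1, 2, 3),$$
which is Eq. (1.4)b. QED.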
1.14.18 a) Show that velocity parameters add linearly,
b) apply to a specific problem
Define the velocity parameter $W$ through $w = \tanh(W)$.
We want to show that the velocity addition law,
$$w' = \frac{u + w}{1 + wu},$$
implies linear addition of velocity parameters. Simply substitute the definition
of the velocity parameter,
$$w' = \frac{\tanh(U) + \tanh(W)}{1 + \tanh(U)\tanh(W)} \tag{1.13}$$
$$= \frac{(\tanh(U) + \tanh(W))\cosh(W)\cosh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)} \tag{1.14}$$
The numerator can be written as
$$N = \sinh(W)\cosh(U) + \cosh(W)\sinh(U)$$
so that
$$w' = \frac{\sinh(W)\cosh(U) + \cosh(W)\sinh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)}.$$
The following identities are useful:
$$\begin{aligned}
\cosh(a)\cosh(b) &= \left[\frac{\exp(a) + \exp(-a)}{2}\right]\left[\frac{\exp(b) + \exp(-b)}{2}\right] \\
&= \frac{\exp(a + b) + \exp(-(a + b))}{4} + \frac{\exp(a - b) + \exp(-(a - b))}{4} \\
&= \frac{\cosh(a + b)}{2} + \frac{\cosh(a - b)}{2}
\end{aligned} \tag{1.15}$$
$$\begin{aligned}
\sinh(a)\sinh(b) &= \left[\frac{\exp(a) - \exp(-a)}{2}\right]\left[\frac{\exp(b) - \exp(-b)}{2}\right] \\
&= \frac{\exp(a + b) + \exp(-(a + b))}{4} - \frac{\exp(a - b) + \exp(-(a - b))}{4} \\
&= \frac{\cosh(a + b)}{2} - \frac{\cosh(a - b)}{2}
\end{aligned} \tag{1.16}$$
$$\begin{aligned}
\sinh(a)\cosh(b) &= \left[\frac{\exp(a) - \exp(-a)}{2}\right]\left[\frac{\exp(b) + \exp(-b)}{2}\right] \\
&= \frac{\exp(a + b) - \exp(-(a + b))}{4} + \frac{\exp(a - b) - \exp(-(a - b))}{4} \\
&= \frac{\sinh(a + b)}{2} + \frac{\sinh(a - b)}{2}
\end{aligned} \tag{1.17}$$
Using (1.15) and (1.16) the denominator above simplifies to $D = \cosh(U + W)$.
Using (1.17) the numerator simplifies to $N = \sinh(U + W)$. So,
$$w' = \tanh(U + W),$$
which reveals that we can linearly add velocity parameters, then apply tanh
to reduce the final parameter to the final velocity.
b) Velocity of the 2nd star relative to the first, $u_2 = 0.9$. The velocity of the $n$th star
relative to the $(n-1)$th is likewise 0.9. So the velocity of the $N$th star relative to the first is
$$u_N = \tanh[(N - 1)U]$$
where $0.9 = \tanh(U)$.
My answer disagrees with that given by Schutz. Where I have $N - 1$ he
has $N$. Note that my answer is correct for $N = 2$: the 2nd star relative to
the first moves at 0.9.
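Both parts can be checked numerically by composing velocities one step at a time and comparing with the velocity-parameter formula. A minimal sketch (the function names are my own; the step of 0.9 is the value from part b):

```python
import math

def compose(u, w):
    """Einstein velocity composition law."""
    return (u + w) / (1.0 + u * w)

def nth_star_velocity(n, step=0.9):
    """Velocity of star n relative to star 1 via the velocity parameter."""
    U = math.atanh(step)
    return math.tanh((n - 1) * U)

u = 0.0
for n in range(2, 6):
    u = compose(u, 0.9)  # add one more step of 0.9
    print(n, u, nth_star_velocity(n))
```

The two columns should agree for every $n$, and the $n = 2$ row reproduces 0.9, consistent with the $N - 1$ in the formula above.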
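1.14.19 a) Lorentz transformation using velocity parameter
$$\begin{aligned}
\bar{t} &= \gamma t - v\gamma x \\
\bar{x} &= -v\gamma t + \gamma x \\
\bar{y} &= y \\
\bar{z} &= z
\end{aligned} \tag{1.18}$$
Let $v = \tanh(V)$. Note that the Lorentz factor also simplifies,
$$\gamma \equiv \frac{1}{\sqrt{1 - v^2}} = \left[1 - \tanh^2(V)\right]^{-1/2} = \left[\frac{\cosh^2(V)}{\cosh^2(V) - \sinh^2(V)}\right]^{1/2} = \cosh(V) \tag{1.19}$$
We always take the positive root in the Lorentz factor so that the Lorentz
transformation reduces to the identity matrix when $v = 0$.
The final equality follows from the following identity (stated without proof
in b):
$$\begin{aligned}
\cosh^2(V) - \sinh^2(V) &= \left[\frac{\exp(V) + \exp(-V)}{2}\right]^2 - \left[\frac{\exp(V) - \exp(-V)}{2}\right]^2 \\
&= \left[\frac{\exp(2V) + \exp(-2V) + 2}{4}\right] - \left[\frac{\exp(2V) + \exp(-2V) - 2}{4}\right] \\
&= 1
\end{aligned} \tag{1.20}$$
Substituting $v = \tanh(V)$ and (1.19) into (1.18) gives the desired result,
$$\begin{aligned}
\bar{t} &= \cosh(V)\, t - \sinh(V)\, x \\
\bar{x} &= -\sinh(V)\, t + \cosh(V)\, x \\
\bar{y} &= y \\
\bar{z} &= z
\end{aligned} \tag{1.21}$$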
1.14.19 b) Invariance of the interval using velocity parameter
The given identity is derived above (1.20). Invariance of the interval follows
from straightforward substitution of (1.21).
$$\begin{aligned}
\Delta\bar{s}^2 &= -\bar{t}^2 + \bar{x}^2 + \bar{y}^2 + \bar{z}^2 \\
&= -(\cosh(V)\,t - \sinh(V)\,x)^2 + (-\sinh(V)\,t + \cosh(V)\,x)^2 + y^2 + z^2 \\
&= \Delta s^2
\end{aligned} \tag{1.22}$$
In the final equality, the cross terms cancelled directly while the squared
terms simplified with the identity (1.20).
1.14.19 c) Analogy between Lorentz transformation using velocity parameter and Euclidean coordinate transformation
Hyperbolic trigonometric functions replace regular trigonometric functions,
but the sign changes for the sine term in the Euclidean coordinate transformation
and not the sinh term of the Lorentz transformation.
The analog to the interval $\Delta s^2$ is the squared distance to the origin.
The analog to the invariant hyperbolae are circles. These could be used
to calibrate axes of the rotated Euclidean frame.
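1.14.20 Lorentz transformation in matrix form
$$\bar{\mathbf{x}} = A\,\mathbf{x}$$
where
$$\bar{\mathbf{x}} = \begin{pmatrix} \bar{t} \\ \bar{x} \\ \bar{y} \\ \bar{z} \end{pmatrix}, \quad
\mathbf{x} = \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}$$
and
$$A = \begin{pmatrix}
\cosh(V) & -\sinh(V) & 0 & 0 \\
-\sinh(V) & \cosh(V) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}$$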
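The matrix form makes the results of exercise 19 easy to verify numerically. A sketch in plain Python (the helper names `boost`, `matvec`, `matmul`, `interval` and the sample numbers are my own illustrative choices):

```python
import math

def boost(V):
    """Matrix A for a boost along x with velocity parameter V."""
    c, s = math.cosh(V), math.sinh(V)
    return [[c, -s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def interval(x):
    """Interval from the origin to the event x = (t, x, y, z)."""
    return -x[0]**2 + x[1]**2 + x[2]**2 + x[3]**2

event = [2.0, 0.5, -1.0, 3.0]
V = math.atanh(0.6)
print(interval(event), interval(matvec(boost(V), event)))  # equal up to rounding

# Two successive boosts along x combine by adding velocity parameters.
combined = matmul(boost(0.3), boost(0.4))
single = boost(0.7)
print(max(abs(combined[i][j] - single[i][j]) for i in range(4) for j in range(4)))
```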
1.14.21 a) Timelike separated events can be transformed to occur at the same point.
Without loss of generality, consider two points, one at the origin and another
at $t, x$ in inertial frame $O$. For the interval to be timelike, we require
$$\Delta s^2 = -\Delta t^2 + \Delta x^2 = -t^2 + x^2 < 0.$$
Consider another inertial frame $\bar{O}$ moving at velocity $v$ along the $x$-axis
with origins that coincide at $t = 0$. From the Lorentz transformation
$$\begin{aligned}
\bar{t} &= \gamma t - v\gamma x \\
\bar{x} &= -v\gamma t + \gamma x
\end{aligned} \tag{1.23}$$
so requiring the two events to occur at the same point in $\bar{O}$ gives
$$0 = (\Delta\bar{x})^2 = \bar{x}^2 = (-v\gamma t + \gamma x)^2.$$
We divide through by $\gamma^2 t^2$ to reduce this expression to one in a single parameter
$\xi \equiv x/t$, with $|\xi| < 1$ for the timelike interval:
$$\xi^2 - 2v\xi + v^2 = 0$$
so
$$(\xi - v)^2 = 0.$$
This has solution $\xi = v$, and this is possible for a realistic velocity boost $|v| < 1$
because $|\xi| < 1$ for the timelike interval. Q.E.D.
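A concrete instance of this result, as a sketch: pick a timelike-separated event, boost with $v = x/t$, and confirm that $\bar{x} = 0$. The event $(t, x) = (5, 3)$ and the function names are illustrative assumptions:

```python
import math

def lorentz(t, x, v):
    """Lorentz transformation of (t, x) for relative velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return gamma * (t - v * x), gamma * (x - v * t)

def boost_to_same_place(t, x):
    """Velocity of the frame in which (0, 0) and (t, x) share the same xbar."""
    if abs(x) >= abs(t):
        raise ValueError("separation is not timelike")
    return x / t

t, x = 5.0, 3.0
v = boost_to_same_place(t, x)
t_bar, x_bar = lorentz(t, x, v)
print(v, t_bar, x_bar)  # xbar vanishes; tbar**2 equals t**2 - x**2
```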
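1.15 Rob's supplementary problems
SP.1 In Fig. 1.5, explain why the angle of the $\bar{x}$-axis to the $x$-axis is
$\phi = \arctan(v)$, where $v = |\mathbf{v}|$ is the magnitude of the velocity of $\bar{O}$ along the
$x$-axis. The result follows from the construction of the $\bar{x}$-axis, but
the steps involved are not trivial.
Call the unknown angle between the $\bar{x}$-axis and the $x$-axis $\theta$. Extend
the line from $\mathcal{P}$ to $\mathcal{R}$ all the way to the $t$-axis, and call this
intersection $Q$. Draw two lines parallel to the $x$-axis, one through $\mathcal{R}$ and
where it crosses the $t$-axis, call this $\mathcal{A}$. The other through $\mathcal{P}$ and where
it crosses the $t$-axis, call this $\mathcal{P}'$. The events $\mathcal{E}, \mathcal{P}, \mathcal{R}$ form a right
triangle, with hypotenuse $\mathcal{E}\mathcal{R} = 2a$. We need the angle $\mathcal{R}\mathcal{E}\mathcal{P}$, which turns
out to be $\alpha = \pi/4 - \phi$. (Call angle $O\mathcal{R}Q$ $\beta$. Then $\beta + \phi + \pi/4 = \pi$,
and angle $\mathcal{E}\mathcal{R}\mathcal{P}$ is $\pi - \beta = \pi/4 + \phi$. It follows that $\alpha = \pi/4 - \phi$.)
So now we can compute the length $\mathcal{R}\mathcal{P} = 2a\sin(\alpha) = a\sqrt{2}(\cos(\phi) - \sin(\phi))$.
$\mathcal{A}\mathcal{R} = a\sin(\phi)$. Then $Q\mathcal{R} = \mathcal{A}\mathcal{R}/\sin(\pi/4) = a\sin(\phi)\sqrt{2}$. Summing
the two lengths, $Q\mathcal{P} = Q\mathcal{R} + \mathcal{R}\mathcal{P} = a\sqrt{2}\cos(\phi)$. We're now after
$O\mathcal{P}' = OQ - \mathcal{P}'Q$. But $OQ = O\mathcal{A} + \mathcal{A}Q$, with $\mathcal{A}Q = \mathcal{A}\mathcal{R} = a\sin(\phi)$, and
$O\mathcal{A} = a\cos(\phi)$. Also, $\mathcal{P}'Q = Q\mathcal{P}\cos(\pi/4) = a\cos(\phi)$. So $O\mathcal{P}' = a(\sin(\phi) +
\cos(\phi)) - a\cos(\phi) = a\sin(\phi)$. Note that the sought-after angle $\theta$ satisfies
$\tan(\theta) = O\mathcal{P}'/\mathcal{P}'\mathcal{P} = O\mathcal{P}'/\mathcal{P}'Q = a\sin(\phi)/(a\cos(\phi))$, so $\theta = \phi$, the desired
result.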
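The same conclusion can be reached with coordinates instead of trigonometry. The sketch below assumes the usual light-ray construction: a ray emitted from the $\bar{t}$-axis at coordinate time $-s$ and received back at $+s$ is reflected at an event on the $\bar{x}$-axis. The function name and the parameter `s` are my own, and `s` is a coordinate time in $O$ rather than the diagram length $a$:

```python
def reflection_event(v, s=1.0):
    """Event (t, x) where light emitted at E and received at R is reflected.

    E and R lie on the tbar-axis x = v t at t = -s and t = +s.
    Outgoing ray:  x - x_E = t - t_E
    Returning ray: x - x_R = -(t - t_R)
    """
    t_e, x_e = -s, -v * s
    t_r, x_r = s, v * s
    # Solve the two linear ray equations for the intersection.
    t_p = 0.5 * ((t_e + t_r) + (x_r - x_e))
    x_p = 0.5 * ((x_e + x_r) + (t_r - t_e))
    return t_p, x_p

for v in (0.1, 0.5, 0.9):
    t_p, x_p = reflection_event(v)
    print(v, t_p / x_p)  # slope of the xbar-axis relative to the x-axis
```

The printed slope equals $v$ in each case, independent of `s`, which is the statement $\tan(\theta) = v = \tan(\phi)$.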
1.16 Additional thoughts
I think it's worth mentioning that the Lorentz transformation, which is linear
by construction, transforms lines to lines. This is easily verified by substituting
the equation for a line in $O$ and confirming that it's also a line in
$\bar{O}$.
It's also worth pointing out that a tangent line to a curve in $O$ remains a
tangent line in $\bar{O}$. Of course it would be quite strange if this were not true,
but on the other hand it was not immediately obvious to me that it holds.
Chapter 2
Vector Analysis in Special
Relativity
An elementary introduction to 4-vectors, working with Lorentz transformations.
Contains lots of hand-holding about the algebra of working with vectors,
the summation convention, changing dummy indices etc.
2.1 Definition of a vector
Regarding the Einstein summation convention introduced on p. 34, Schutz
states that it applies whenever there is a repeated index, one up and one
down, in the same expression. I find this misleading because it's actually the
same term or same factor.
Buried in footnote 2 on p. 35 is an important notational point.
2.2 Vector algebra
Eq. (2.10) introduces a strange notational twist. Apparently enclosing the
vectors $\vec{e}_\alpha$ with parentheses and writing a superscript implies that we are
forming a tensor from the set of these vectors?
$$(\vec{e}_\alpha)^\beta = \delta_\alpha{}^\beta$$
There's no comment to explain this. Earlier the author explained that the
superscript notation will become clear when he introduces differential geometry.
For now I just note that the RHS is the Kronecker delta, which is a
second-rank tensor.
[Coming back to the above point after having read most of the book (all
but Chapters 9 and 12), I'm still amazed at this leap in notation. Generally
Schutz is clear and fairly careful but this is an exception. I interpret it as
follows. Enclosing the vector in parentheses, I believe, means taking the set
of components, since later when we want to write down the components of
a 2nd rank tensor as a matrix we enclose the tensor in parentheses, and the
analogous operation for a vector would be writing down its 4 components as
a row or column vector. Apparently the superscript $\beta$ then means selecting
the $\beta$th one of these components. Indeed, this interpretation is consistent
with his usage in Eq. (2.21) and later in the book, such as Eq. (5.52).]
On p. 38 he introduces the notation that putting a tensor within square
brackets [ ] gives the matrix of components. He will oscillate between
square brackets and parentheses throughout the book, sometimes in the same
paragraph!! For example, the two sentences after Eq. (2.18).
Eq. (2.18) is described as a key formula. Exercise 2.11c is to verify it.
Schutz doesn't like the terms "contravariant" and "covariant", used by
many others. $A^\alpha$ are the contravariant components of a vector (Hobson et al.,
2009).
2.3 The four-velocity
First paragraph. Not clear what "uniformly moving" means. I guess he
means not accelerating so that the inertial frame in which the particle is
at rest is not constantly changing. This also makes sense because the next
paragraph discusses the case of an accelerating particle.
2.4 The four-momentum
Typo on p. 42, in the example: $p^1 = mv(1 - v^2)^{1/2}$ should be
$p^1 = mv(1 - v^2)^{-1/2}$.
2.6 Some applications
Top of p. 48. "In the MCRF $\vec{U}$ has only a zero component . . .". It might be
better to say "only a time component", since he means zero in the sense of
the first component.
2.7 Photons
Four-momentum
Typo: Last paragraph: ". . . a photon has frequency v and . . .". This
should be ". . . a photon has frequency $\nu$ and . . .".
The derivation of Eq. (2.39) seems a bit mysterious until one realizes
that it's based on the idea that the four-momentum must transform under a
Lorentz transformation. In particular, the momentum of a photon directed
in the $x$-direction is:
$$\vec{p} \underset{O}{\to} (E, E, 0, 0)$$
If we change reference frames to $\bar{O}$ then this four-vector must transform
under a Lorentz transformation, giving:
$$\vec{p} \underset{\bar{O}}{\to} (\bar{E}, \bar{E}, 0, 0)$$
where
$$\bar{E} = \gamma E - \gamma E v = \gamma E(1 - v)$$
etc.
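A quick numerical version of this transformation, as a sketch: boost the photon four-momentum along $x$ and compare the energy ratio with $\gamma(1 - v)$. The function name and sample values are illustrative assumptions:

```python
import math

def photon_in_moving_frame(E, v):
    """Boost the photon four-momentum (E, E, 0, 0) along x by velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    p = [E, E, 0.0, 0.0]
    E_bar = gamma * (p[0] - v * p[1])   # time component
    px_bar = gamma * (p[1] - v * p[0])  # x component
    return E_bar, px_bar

E, v = 2.0, 0.6
E_bar, px_bar = photon_in_moving_frame(E, v)
print(E_bar, px_bar)                               # equal: the vector stays null
print(E_bar / E, math.sqrt((1.0 - v) / (1.0 + v)))  # the two ratios agree
```

The second line uses the identity $\gamma(1 - v) = \sqrt{(1 - v)/(1 + v)}$, which is just another way of writing the same factor.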
2.9 Exercises
2. Identify the dummy and free indices, count the equations:
a) $\alpha$ is the dummy index. One equation.
b) $\nu$ is the dummy index. $\bar{\mu}$ is the free index. Four equations.
c) $\mu$, $\lambda$ dummy indices; $\alpha$, $\gamma$ free indices; 16 equations.
d) $\mu$ and $\nu$ are free indices, and there are 16 equations. Although the indices
are repeated, they're not repeated in the same factor, and one is not
superscript.
3 Prove Eq. (2.5).
There's nothing to prove really. It follows immediately from the definition
and notation conventions. In particular, the LHS involves a sum over all
values of the dummy index in $\{0, 1, 2, 3\}$, see p. 34. The RHS merely
spells this out, with the convention that Roman indices like $i$ take all values
$i \in \{1, 2, 3\}$.
4 Practise adding components of vectors, and multiplying by a scalar.
a) $-6\vec{A} \underset{O}{\to} (-30, 6, 0, -6)$
5 a) Show that the basis vectors are linearly independent. Start with a
general linear combination $a^\alpha \vec{e}_\alpha$, set equal to the zero vector,
$$0 = a^\alpha (\vec{e}_\alpha)^\beta = a^\alpha \delta_\alpha{}^\beta.$$
Start with the first component, $\beta = 0$. The equation above is $0 = a^0 \cdot 1$,
so $a^0$ must be zero. Similarly for the other components. Since this trivial
solution is the only solution, the basis vectors must be linearly independent.
More formally, one could write this out in matrix notation:
$$\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} a^0 \\ a^1 \\ a^2 \\ a^3 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
It's a result of elementary linear algebra that this system has nontrivial
solutions only if the determinant of the matrix is zero. But the determinant is
+1. So there are no nontrivial solutions and thus the basis vectors must be
linearly independent, Q.E.D.
5 b) The given set is not linearly independent, since the linear combina-
tion (5, 3, +2, 1) gives the zero vector.
6 As in Fig. 1.5, the $\bar{t}$ and $\bar{x}$ axes are tilted at an angle $\phi$ relative to their
$O$ frame counterparts and toward the world line of the light ray $t = x$. The
basis vectors are parallel to these $\bar{O}$ axes. Here $\tan(\phi) = 0.6$. For the $\bar{\bar{O}}$ frame the
axes will be tilted even further toward $t = x$. The angle $\bar{\phi}$ of these basis vectors
can be computed as
$$\tan(\bar{\phi}) = \tanh(2\,\mathrm{arctanh}(0.6))$$
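For a feel of the numbers, the tilt angles can be evaluated directly. A small sketch (the function name is my own; the speed 0.6 is the value used in this exercise):

```python
import math

def slope_after_two_boosts(v):
    """tan of the tilt angle for a frame boosted twice by v along x."""
    return math.tanh(2.0 * math.atanh(v))

v = 0.6
print(math.degrees(math.atan(v)))                          # tilt of the first moving frame
print(slope_after_two_boosts(v), 2 * v / (1 + v * v))      # same slope, two ways
print(math.degrees(math.atan(slope_after_two_boosts(v))))  # tilt of the second moving frame
```

The second tilt is larger than the first but still below 45 degrees, since the composed speed stays below 1.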
7 a) Verify Eq. (2.10). As mentioned above, this is a strange notational
twist. If we write the basis vectors as row vectors as in Eq. (2.9), then the
set forms a matrix, and the matrix element is unity when row and column
numbers are equal, and zero otherwise, i.e. the identity matrix.
$$\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}$$
The RHS of Eq. (2.10) can of course be written as the identity matrix too,
which demonstrates the equality.
7 b) I've always thought of Eq. (2.11) as the definition of the vector, so it
seems to me a tautology, rather than something to prove. Perhaps it's worth
stating the result in words. If you use the components of the vector $\vec{A}$ to
form the linear combination of basis vectors $\vec{e}_\alpha$, i.e.
$$A^\alpha \vec{e}_\alpha,$$
then you, of course, recover the vector $\vec{A}$. In particular, for the first component
of the sum, the components $A^\alpha$ multiply all the basis vectors, but only
the first one, $\vec{e}_0$, contributes since the other basis vectors are all zero in the
first component, leaving $A^0$. Similarly for the other components.
8 a) Prove that the zero vector has the same components in all reference
frames.
This follows immediately from the use of a linear transformation to go
between reference frames. See p. 35, and Eq. (2.7) for the definition of the
general (4-) vector and the linear transformation. In other words,
$$A \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
for all matrices $A$, and the Lorentz transformation can always be written as
a matrix $A$.
8 b) Prove that if two vectors have equal components in one frame their
components are equal in all frames.
My first thought is that if their components are equal in a given frame, then
they're the same vector. By the definition of a vector, they are invariant
under coordinate transformation. So their components are equal in all other
frames. But that doesn't use 8a.
Using 8a, one could subtract the two equal vectors, giving the zero vector
in that frame. Under coordinate transformation, this difference vector
remains the zero vector. Thus their components must be equal in any other
frame.
9 There are 16 terms to write out, which is too much work. It seems
convincing enough to me to note that for each term in the sum on the LHS,
there is a corresponding term on the RHS. In general these terms look like
$$\Lambda^{\bar{\alpha}}{}_{\beta}\, A^{\beta}\, \vec{e}_{\bar{\alpha}}.$$
Of course the order of summation doesn't matter for a finite sum. Substituting
specific values for the dummy indices might make this more clear, say
$\bar{\alpha} = 0$, $\beta = 1$.
10 Prove Eq. (2.13) from
$$A^\alpha\left(\Lambda^{\bar{\beta}}{}_{\alpha}\,\vec{e}_{\bar{\beta}} - \vec{e}_\alpha\right) = 0.$$
Eq. (2.13) was:
$$\vec{e}_\alpha = \Lambda^{\bar{\beta}}{}_{\alpha}\,\vec{e}_{\bar{\beta}}.$$
Choosing any $A^\alpha$ with only one non-zero entry, like (1, 0, 0, 0), or (10, 0, 0, 0),
shows straight away that
$$\Lambda^{\bar{\beta}}{}_{0}\,\vec{e}_{\bar{\beta}} = \vec{e}_0,$$
which is Eq. (2.13) with $\alpha = 0$. Similarly choosing $A^\alpha$ as (0, 1, 0, 0), or
(0, 2, 0, 0), shows straight away
$$\Lambda^{\bar{\beta}}{}_{1}\,\vec{e}_{\bar{\beta}} = \vec{e}_1.$$
So repeating this argument gives the result for the other two basis vectors.
Perhaps more instructive is to note that this result works for more general
situations. The quantity inside the parentheses is a set of 4 different vectors
$\vec{v}_\alpha$,
$$\left(\Lambda^{\bar{\beta}}{}_{\alpha}\,\vec{e}_{\bar{\beta}} - \vec{e}_\alpha\right) = \vec{v}_\alpha.$$
Then view the components of $A^\alpha$ as the components of a linear combination
of these vectors $\vec{v}_\alpha$. Now it's clear that the RHS is not just the number zero,
but the 4-vector (0, 0, 0, 0). The linear combination of the set of $\vec{v}_\alpha$ must sum
to the zero vector for arbitrary components of the linear combination. If the
first three led to a non-zero vector,
$$\sum_{\alpha=0}^{2} A^\alpha \vec{v}_\alpha = (2, 4, 6, 8),$$
then $A^3$ would have to be chosen to bring this to zero. For example, if
$\vec{v}_3 = (1, 2, 3, 4)$ one would have to choose $A^3 = -2$. But since $A^\alpha$ was
arbitrary, choosing $A^3 = +2$ would then violate the equality. So this
means that the only way it could work is if
$$\sum_{\alpha=0}^{2} A^\alpha \vec{v}_\alpha = 0$$
and $\vec{v}_3 = 0$. One can now repeat this argument for $\sum_{\alpha=0}^{1} A^\alpha \vec{v}_\alpha$ etc. and
show that all $\vec{v}_\alpha$ are the zero 4-vector. And the result Eq. (2.13) holds.
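11 (a) Matrix of $\Lambda^{\nu}{}_{\bar{\mu}}(-v)$. Exercise 1.20 was to put the Lorentz transformation
in matrix form. Note that $\sinh(-V) = -\sinh(V)$, $\cosh(-V) =
\cosh(V)$. So we only have to change the sign of the $\sinh(V)$ elements,
$$\Lambda^{\nu}{}_{\bar{\mu}}(-v) = \begin{pmatrix}
\cosh(V) & \sinh(V) & 0 & 0 \\
\sinh(V) & \cosh(V) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}$$
where $v = \tanh(V)$.
(b) $A^{\bar{\alpha}}$ for all $\bar{\alpha}$.
$$A^{\bar{0}} = \cosh(V) A^0 - \sinh(V) A^1 \tag{2.1}$$
$$A^{\bar{1}} = -\sinh(V) A^0 + \cosh(V) A^1 \tag{2.2}$$
$$A^{\bar{2}} = A^2 \tag{2.3}$$
$$A^{\bar{3}} = A^3 \tag{2.4}$$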
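(c) Verify Eq. (2.18). Written out in matrix form Eq. (2.18) becomes,
$$\begin{pmatrix}
\cosh(V) & \sinh(V) & 0 & 0 \\
\sinh(V) & \cosh(V) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\cosh(V) & -\sinh(V) & 0 & 0 \\
-\sinh(V) & \cosh(V) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.$$
To show this it's useful to use the hyperbolic function identity,
$$\cosh^2(x) - \sinh^2(x) = 1.$$
Eq. (2.18) follows immediately from matrix multiplication. This identity
is easy to derive, and can be found at http://en.wikipedia.org/wiki/Hyperbolic_function#Similarities_to_circular_trigonometric_functions
along with other properties.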
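(d) The Lorentz transformation matrix from $\bar{O}$ to $O$ is just the matrix in
(a). Since $\bar{O}$ is moving toward increasing $x$ with velocity $v$ with respect to
$O$, then from $\bar{O}$'s point of view $O$ is moving toward increasing $x$ with velocity
$-v$.
(e) $A^{\beta}$ for all $\beta$.
$$A^0 = \cosh(V) A^{\bar{0}} + \sinh(V) A^{\bar{1}} = A^0 \tag{2.5}$$
$$A^1 = +\sinh(V) A^{\bar{0}} + \cosh(V) A^{\bar{1}} = A^1 \tag{2.6}$$
$$A^2 = A^{\bar{2}} = A^2 \tag{2.7}$$
$$A^3 = A^{\bar{3}} = A^3 \tag{2.8}$$
Here the final equality in each line follows on substituting (2.1) to (2.4).
Relation to Eq. (2.18): Multiplying the vector $\vec{A}$ on the left by the Lorentz
transformation matrix $\Lambda(v)$ gives the components in the $\bar{O}$ frame, $A^{\bar{\alpha}} =
\Lambda^{\bar{\alpha}}{}_{\beta}(v) A^{\beta}$. Multiplying this vector in turn by the Lorentz transformation
matrix $\Lambda(-v)$ should return the vector to the $O$ frame. And indeed it does,
when we use Eq. (2.18) in the final step below:
$$\Lambda^{\nu}{}_{\bar{\alpha}}(-v)\, A^{\bar{\alpha}} = A^{\nu} \tag{2.9}$$
$$\Lambda^{\nu}{}_{\bar{\alpha}}(-v)\, \Lambda^{\bar{\alpha}}{}_{\beta}(v)\, A^{\beta} = A^{\nu} \tag{2.10}$$
$$\delta^{\nu}{}_{\beta}\, A^{\beta} = A^{\nu} \tag{2.11}$$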
(f) Verify that the order of applying the transformations doesn't matter.
Physically we know this must be true. Mathematically it works out because
if we repeat (c) with the matrices in the opposite order, we get the same
result:
$$\begin{pmatrix}
\cosh(V) & -\sinh(V) & 0 & 0 \\
-\sinh(V) & \cosh(V) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\cosh(V) & \sinh(V) & 0 & 0 \\
\sinh(V) & \cosh(V) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.$$
(g) Establish that
$$\vec{e}_\alpha = \delta^{\nu}{}_{\alpha}\, \vec{e}_\nu.$$
I find this a rather strange question. From the definition of the Kronecker
delta function, Eq. (1.4c), the result is immediately obvious. Another way
to see this is that the Kronecker delta can be written as the identity matrix.
And of course, writing the vector on the RHS as a column vector, multiplying
by the identity matrix, gives back the original vector.
12 (b) Remember not to add the velocities linearly, but to use the Einstein
law of composition of velocities Eq. (1.13), or use the velocity parameters
introduced in Exercise 1.18.
(c) Note that the definition of the magnitude of the vector is analogous
to the interval introduced in Chapter 1, see Eq. (2.24).
$$\vec{A}^2 = -0^2 + (-2)^2 + 3^2 + 5^2 = 38.$$
(d) The magnitude should be independent of the reference frame, because
of the invariance of the interval.
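13
(a) The transformation of coordinates from $O$ to $\bar{\bar{O}}$ can be constructed in
two steps. First transform to $\bar{O}$,
$$A^{\bar{\mu}} = \Lambda^{\bar{\mu}}{}_{\nu}(v)\, A^{\nu}.$$
Then transform from $\bar{O}$ to $\bar{\bar{O}}$,
$$A^{\bar{\bar{\alpha}}} = \Lambda^{\bar{\bar{\alpha}}}{}_{\bar{\mu}}(v')\left(\Lambda^{\bar{\mu}}{}_{\nu}(v)\, A^{\nu}\right).$$
So the Lorentz transformation from $O$ to $\bar{\bar{O}}$ is
$$\Lambda^{\bar{\bar{\alpha}}}{}_{\bar{\mu}}(v')\, \Lambda^{\bar{\mu}}{}_{\nu}(v).$$
(b) I thought we just did show that Eq. (2.41) was the matrix product of
the two individual Lorentz transformations. Maybe he means write it out in
matrix form? I'm not sure what he's looking for.
(c) This was an important exercise for me because I learned that the
Lorentz transformation matrix does not have to be symmetric when there are
velocity components in two directions.
$$\Lambda(v')\,\Lambda(v) = \begin{pmatrix}
\gamma(v)\gamma(v') & -\gamma(v)\,v\,\gamma(v') & -\gamma(v')\,v' & 0 \\
-\gamma(v)\,v & \gamma(v) & 0 & 0 \\
-\gamma(v)\gamma(v')\,v' & \gamma(v)\,v\,\gamma(v')\,v' & \gamma(v') & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.$$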
(d) Show that the interval is invariant under the above transformation.
(e) Show that the order matters in constructing the Lorentz transforma-
tion as in (a), i.e.
(v)
(v
) ,=
(v
(v)
30
Using the example from (c), the LHS of the above would be,
LHS =
=
_
_
(v)(v
) (v)v (v)(v
)v
0
(v)v(v
) (v) (v)v(v
)v
0
(v
)v
0 (v
) 0
0 0 0 1
_
_
Comparison with the matrix in (c) shows its dierent. In fact, another
observation, not discussed by Schutz, is that one is the transpose of the
other. This can be understood because
(AB)
T
= B
T
A
T
= BA
where the nal equality holds because A and B are symmetric.
This is surprising if we think in a Galilean way. However, mathematically
we know in general that matrix multiplication is not commutative,
http://en.wikipedia.org/wiki/Matrix_multiplication#Common_properties.
Physically we know that the Lorentz transformation results in the axes
tilting toward the t = x line, as in Fig. 1.5. The order of rotations
matters. For example, rotating the globe 90° to the east about the polar
axis, then 45° clockwise about the axis through the Equator at 90° W and
90° E, puts the coordinates 0° N, 0° E where the South Indian Ocean used
to be. But performing the same rotations in the opposite order leaves the
coordinates 0° N, 0° E on the old Equatorial plane.
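The order dependence in (e), and the transpose relation between the two
products, can be checked numerically. What follows is only a minimal sketch
in plain Python, assuming the values v = 0.6 along x and v' = 0.8 along y;
the helper names (boost_x, boost_y, matmul, close) are made up for this
illustration.

```python
import math

def gamma(v):
    """Lorentz factor for speed v, in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(v):
    """Lorentz transformation to a frame moving with speed v along x."""
    g = gamma(v)
    return [[g, -v * g, 0.0, 0.0],
            [-v * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_y(v):
    """Lorentz transformation to a frame moving with speed v along y."""
    g = gamma(v)
    return [[g, 0.0, -v * g, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-v * g, 0.0, g, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def apply(m, x):
    return [sum(m[i][k] * x[k] for k in range(4)) for i in range(4)]

def interval(x):
    """Interval with signature (-, +, +, +)."""
    return -x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(4) for j in range(4))

x_then_y = matmul(boost_y(0.8), boost_x(0.6))   # the matrix of part (c)
y_then_x = matmul(boost_x(0.6), boost_y(0.8))   # the LHS of part (e)

print(close(x_then_y, y_then_x))             # False: the order matters
print(close(x_then_y, transpose(y_then_x)))  # True: transposes of each other
event = [1.0, 0.3, -0.2, 0.5]                # arbitrary sample event
print(interval(event), interval(apply(x_then_y, event)))  # equal up to rounding
```

The last line also illustrates part (d): the interval of the sample event
is unchanged by the combined transformation.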
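14 (a) $v = -3/5$ along the positive $z$ direction, i.e. a speed of 3/5
toward negative $z$. The off-diagonal term gives the direction,
$-v\gamma = 0.75$, and the diagonal term gives $\gamma = 1.25$. One can
confirm that $\gamma = 1/\sqrt{1 - v^2}$, once $v$ is found.
(b) Since it's a Lorentz transformation, the inverse should be obtained
from the Lorentz transformation from $\bar{O}$ back to $O$.
\[ \Lambda(-v) =
\begin{pmatrix}
1.25 & 0 & 0 & -0.75 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-0.75 & 0 & 0 & 1.25
\end{pmatrix} \]
And matrix multiplication confirms this is the inverse.
(c)
\[ \begin{pmatrix}
1.25 & 0 & 0 & -0.75 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-0.75 & 0 & 0 & 1.25
\end{pmatrix}
\begin{pmatrix} 1 \\ 2 \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 1.25 \\ 2 \\ 0 \\ -0.75 \end{pmatrix} \]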
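15 (a) The particle 3-velocity is $\mathbf{v} = (v, 0, 0)$. In the frame
moving with the particle, the 4-velocity is $\vec{e}_{\bar{0}}$, so
$\vec{A} \underset{\bar{O}}{\to} (1, 0, 0, 0)$. The Lorentz
transformation back to the $O$ frame is
\[ \Lambda(-v) =
\begin{pmatrix}
\gamma(v) & v\gamma(v) & 0 & 0 \\
v\gamma(v) & \gamma(v) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \]
So $\vec{A}$ in the $O$ frame has components
$\vec{A} \underset{O}{\to} (\gamma(v), v\gamma(v), 0, 0)$.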
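(b) The general particle 3-velocity is $\mathbf{v} = (u, v, w)$. Let's
start with a slightly less general 3-velocity $\mathbf{v} = (u, v, 0)$ to
make the algebra easier. One could rotate through an angle $\theta$ to a
frame where $\mathbf{v} = (|\mathbf{v}|, 0, 0)$. Here $\theta$ is such that
\[ \begin{pmatrix} u' \\ v' \end{pmatrix} =
\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix} =
\begin{pmatrix} |\mathbf{v}| \\ 0 \end{pmatrix} \]
Now we have the situation as in (a) so we can apply the Lorentz
transformation back to the $O$ frame
\[ \Lambda(-|\mathbf{v}|) =
\begin{pmatrix}
\gamma(|\mathbf{v}|) & |\mathbf{v}|\gamma(|\mathbf{v}|) & 0 & 0 \\
|\mathbf{v}|\gamma(|\mathbf{v}|) & \gamma(|\mathbf{v}|) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \]
So $\vec{A}$ in a frame moving with the $O$ frame but rotated through
$\theta$ has components
$\vec{A} \underset{O'}{\to} (\gamma(|\mathbf{v}|), |\mathbf{v}|\gamma(|\mathbf{v}|), 0, 0)$.
Finally we rotate through $-\theta$ to obtain $\vec{A}$ in the $O$ frame
\begin{align}
\vec{A} &\underset{O}{\to} (\gamma(|\mathbf{v}|),\ |\mathbf{v}|\gamma(|\mathbf{v}|)\cos\theta,\ |\mathbf{v}|\gamma(|\mathbf{v}|)\sin\theta,\ 0) \tag{2.12} \\
&= (\gamma(|\mathbf{v}|),\ u\gamma(|\mathbf{v}|),\ v\gamma(|\mathbf{v}|),\ 0) \tag{2.13}
\end{align}
Finally, there's no reason for the $z$ component to behave differently, so
we can generalize this. For general particle 3-velocity
$\mathbf{v} = (u, v, w)$, the 4-velocity is
\begin{align}
\vec{A} &\underset{O}{\to} (\gamma(|\mathbf{v}|),\ \gamma(|\mathbf{v}|)\,\mathbf{v}) \tag{2.14} \\
&= (\gamma(|\mathbf{v}|),\ u\gamma(|\mathbf{v}|),\ v\gamma(|\mathbf{v}|),\ w\gamma(|\mathbf{v}|)) \tag{2.15}
\end{align}
where
\[ |\mathbf{v}| = \sqrt{u^2 + v^2 + w^2}. \]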
(c) Starting with the 4-velocity components $U^\alpha$, one can write the
3-velocity,
\[ \mathbf{v} = (U^1/\gamma,\ U^2/\gamma,\ U^3/\gamma) \]
where $\gamma \equiv 1/\sqrt{1 - \mathbf{v}\cdot\mathbf{v}} = U^0$.
(d) Applying the above formula, if the 4-velocity is given as (2, 1, 1, 1)
then the 3-velocity is $\mathbf{v} = (1/2, 1/2, 1/2)$. Note the magnitude
of the 4-velocity is $-4 + 3 = -1$, making it a legitimate example.
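The conversions in (b) to (d) are easy to mechanize. The sketch below
assumes units with c = 1 and the signature (-, +, +, +); the function
names are illustrative only.

```python
import math

def four_velocity(u, v, w):
    """4-velocity (gamma, gamma u, gamma v, gamma w) for 3-velocity (u, v, w)."""
    speed_sq = u * u + v * v + w * w
    if speed_sq >= 1.0:
        raise ValueError("speed must be below 1, the speed of light")
    g = 1.0 / math.sqrt(1.0 - speed_sq)
    return (g, g * u, g * v, g * w)

def three_velocity(U):
    """Invert the map above using gamma = U^0, as in part (c)."""
    g = U[0]
    return (U[1] / g, U[2] / g, U[3] / g)

def norm_sq(U):
    """Magnitude U . U with signature (-, +, +, +)."""
    return -U[0] ** 2 + U[1] ** 2 + U[2] ** 2 + U[3] ** 2

U = (2.0, 1.0, 1.0, 1.0)
print(three_velocity(U))             # (0.5, 0.5, 0.5)
print(norm_sq(U))                    # -1.0
print(four_velocity(0.5, 0.5, 0.5))  # (2.0, 1.0, 1.0, 1.0)
```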
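16 A particle moves with speed $w$, say along the $\bar{x}$-axis, in a
reference frame $\bar{O}$ moving along the $x$-axis with speed $v$. We
derive Einstein's velocity addition law from a Lorentz transformation of
the particle's 4-velocity.
The particle's 4-velocity in reference frame $\bar{O}$ is
$\vec{U} \underset{\bar{O}}{\to} (\gamma(w), \gamma(w)\,w, 0, 0)$.
The Lorentz transformation from $\bar{O}$ to $O$ is
\[ \Lambda(-v) =
\begin{pmatrix}
\gamma(v) & v\gamma(v) & 0 & 0 \\
v\gamma(v) & \gamma(v) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \]
So the 4-velocity is
$\vec{U} \underset{O}{\to} (\gamma(w)\gamma(v) + vw\gamma(w)\gamma(v),\ v\gamma(v)\gamma(w) + w\gamma(v)\gamma(w),\ 0,\ 0)$.
Converting this to the 3-velocity using the formula in 15c,
\begin{align}
v_x &= \frac{U^x}{U^0} \tag{2.16} \\
&= \frac{\gamma(v)\gamma(w)(v + w)}{\gamma(v)\gamma(w)(1 + vw)} \tag{2.17} \\
&= \frac{v + w}{1 + vw} \tag{2.18}
\end{align}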
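As a numerical cross-check of (2.16) to (2.18), the sketch below boosts the
4-velocity explicitly and compares the result with the closed-form addition
law. It assumes c = 1 and motion along a single axis; the function names
are illustrative.

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def add_via_four_velocity(v, w):
    """Speed in O of a particle moving at w in a frame that moves at v."""
    # 4-velocity in the moving frame: (gamma(w), gamma(w) w)
    U0_bar, U1_bar = gamma(w), gamma(w) * w
    # Transform back to O with Lambda(-v)
    g = gamma(v)
    U0 = g * U0_bar + v * g * U1_bar
    U1 = v * g * U0_bar + g * U1_bar
    return U1 / U0

def einstein_addition(v, w):
    """Closed form, Eq. (2.18)."""
    return (v + w) / (1.0 + v * w)

for v, w in [(0.5, 0.5), (0.9, 0.9), (0.6, -0.3)]:
    print(v, w, add_via_four_velocity(v, w), einstein_addition(v, w))
```

For every pair the two columns agree, and the combined speed stays below 1
even when both inputs are close to 1.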
17 (a) Prove that any timelike vector $\vec{U}$ for which $U^0 > 0$ and
$\vec{U}\cdot\vec{U} = -1$ is the four-velocity of some world line.
The four-velocity is the $\vec{e}_{\bar{0}}$ in the MCRF. If $\vec{U}$ is
some world line's four-velocity, then there exists a Lorentz
transformation for which $\vec{U} \underset{\bar{O}}{\to} (1, 0, 0, 0)$.
Let's see if that's possible for the given vector $\vec{U}$.
The coordinate system can be rotated so that $U^\alpha = (U^0, u, 0, 0)$,
just to make the algebra simpler. Now apply an arbitrary Lorentz
transformation
\[ \begin{pmatrix}
\gamma(v) & -v\gamma(v) & 0 & 0 \\
-v\gamma(v) & \gamma(v) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} U^0 \\ u \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \]
for some $v$ and $\gamma(v)$. Thus we require
\begin{align}
1 &= \gamma\,(U^0 - v u) \tag{2.19} \\
0 &= \gamma\,(u - U^0 v). \tag{2.20}
\end{align}
But in general we require $\gamma \ge 1$, so the second equation (2.20)
requires $v = u/U^0$. We know $U^0 > 0$ (given) and it follows from the
fact that $\vec{U}$ is timelike that $U^0 > |u|$. So $|v| < 1$. Thus
$\gamma(v)$ is real and finite and, most importantly, $\gamma(v) \ge 1$,
i.e. the Lorentz transformation is possible. Does this required Lorentz
transformation also bring the time component to unity?
The algebra can get messy, but simplifies if we use the fact that
$\vec{U}\cdot\vec{U} = -1$. Eliminating $v$ in the first equation (2.19)
gives
\[ \frac{1}{\gamma(v)} = U^0 - \frac{u^2}{U^0}
 = \frac{1}{U^0}\left( (U^0)^2 - u^2 \right) = \frac{1}{U^0} \]
So
\[ \gamma = U^0 \]
This proves that the required Lorentz transformation to make
$\vec{U} = \vec{e}_{\bar{0}}$ is possible, which is enough to show that
the original 4-vector is the 4-velocity of something. Note that the
requirement that $U^0 \ge 1$ is hidden in the requirement that
$\vec{U}\cdot\vec{U} = -1$.
(b) Use this to prove that for any timelike vector $\vec{V}$ there is a
Lorentz frame in which $\vec{V}$ has zero spatial components.
The magnitude of a vector is the interval between the origin and the
coordinates of the vector. For a timelike interval the vector is timelike,
and vice versa. Timelike intervals can be transformed via a Lorentz
transformation to have zero spatial part, see Exercise 1.21. The
corresponding vector will have zero spatial components.
If you haven't done Exercise 1.21, you can construct a proof using part
17(a). We are no longer required to make the time part unity; we only
require the space part to be zero, i.e. (2.20),
$0 = \gamma(v)(u - V^0 v)$, where $u$ is now defined by $V^i V^i = u^2$.
We no longer have $V^0 > 0$, but that doesn't matter. Because it's a
timelike vector we have
\[ (V^0)^2 > V^i V^i = u^2 \]
So (2.20) implies now that
\[ |v| < 1 \]
and again, $\gamma(v) \ge 1$, i.e. the Lorentz transformation is possible.
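18 (a) Sum of two spacelike orthogonal vectors is spacelike.
By definition, orthogonal vectors have $\vec{A}\cdot\vec{B} = 0$, so
\begin{align}
(\vec{A} + \vec{B})\cdot(\vec{A} + \vec{B}) &= \vec{A}\cdot\vec{A} + \vec{B}\cdot\vec{B} + 2\vec{A}\cdot\vec{B} \tag{2.21} \\
&= \vec{A}\cdot\vec{A} + \vec{B}\cdot\vec{B} > 0. \tag{2.22}
\end{align}
Spacelike vectors have positive magnitude, $\vec{A}\cdot\vec{A} > 0$. So
$(\vec{A} + \vec{B})$ is also spacelike.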
(b) Timelike vector and null vector cannot be orthogonal.
Take a timelike vector $\vec{A}$. Let's keep the algebra simple and rotate
to a coordinate frame such that the space part of the null vector
$\vec{N}$ is all in one component,
\[ \vec{N} \underset{O}{\to} (N^0, N^1, 0, 0) = (N^0, N^0, 0, 0) \]
Because it's a null vector, $N^0 = N^1$. The timelike vector $\vec{A}$ has
unknown components in this frame, but
\[ \vec{A}\cdot\vec{N} = -A^0 N^0 + A^1 N^0 = N^0 (A^1 - A^0) \]
But if $(A^1 - A^0) = 0$ then
\[ \vec{A}\cdot\vec{A} = (A^2)^2 + (A^3)^2 \ge 0 \]
which contradicts the stipulation that $\vec{A}$ is timelike. Thus
$(A^1 - A^0) \neq 0$. Thus if $\vec{N}$ is not the zero vector then
$N^0 \neq 0$ and $\vec{A}\cdot\vec{N} \neq 0$ so they are not orthogonal.
19 Consider a uniformly accelerated particle. That is, it has a
4-acceleration $\vec{a}$ of constant magnitude
$\vec{a}\cdot\vec{a} = \alpha^2$ where $\alpha^2 \ge 0$ is a constant.
(a) Show that $\vec{a}$ has constant components in the particle's MCRF,
and that these components look like normal, Galilean, acceleration terms.
From Eq. (2.32) the 4-acceleration is always normal to the 4-velocity. But
in the MCRF $\vec{U} = \vec{e}_{\bar{0}}$. Without loss of generality we
can take $\vec{a}$ to have only one spatial component, say in the
$\bar{x}$-direction, with increasing $\bar{x}$ in the direction of the
acceleration. So
\[ \vec{a} = (a^0, a^1, 0, 0) \]
with this orientation of the spatial axes. Then Eq. (2.32) requires that
$a^0 = 0$, and the magnitude condition then gives $a^1 = \alpha$.
\[ \vec{a} \underset{\mathrm{MCRF}}{\to} (0, \alpha, 0, 0) \]
In some small amount of time, say $d\bar{t} = d\tau$, in the MCRF at point
$P$, the 4-velocity will change by
\[ d\vec{U} = d\bar{t}\,\vec{a} = d\bar{t}\,(0, \alpha, 0, 0) \]
Since the 4-velocity started at $\vec{e}_{\bar{0}}$ in the MCRF (by
definition) the velocity at the end of this small amount of time will
be simply
\[ \vec{U}(d\bar{t}) \approx (1, \alpha\,d\bar{t}, 0, 0) \]
The equality can be made arbitrarily accurate for small $d\bar{t}$,
wherein the relativistic effects are negligible. So the 3-velocity changes
during $d\bar{t}$ from zero to $\alpha\,d\bar{t}$ in the
$\bar{x}$-direction, just as in Galilean acceleration, which is what we
were required to prove. (See Exercise 15c for converting 4-velocity to
3-velocity).
19(b) Say $\alpha = 10$ m s$^{-2}$ and the particle starts from rest at
$t = 0$. Find a general expression for the speed at time $t$.
The trick is to work in the MCRF to find the change in velocity in some
small increment of time $d\bar{t}$, such that relativistic corrections are
small, as in (a) above. Then one finds the new 4-velocity in the MCRF. But
then one must transform the new 4-velocity and the time increment back to
the original frame using a Lorentz transformation.
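At some arbitrary point $P$ with coordinates $(t_p, x_p, 0, 0)$ the
particle will be moving at speed $v_p$ in the positive $x$-direction. So
applying the Lorentz transformation from the MCRF back to the original
frame we find the new velocity is just:
\[ \vec{U} \underset{O}{\to}
\begin{pmatrix}
\gamma(v_p) & +v_p\gamma(v_p) & 0 & 0 \\
+v_p\gamma(v_p) & \gamma(v_p) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} 1 \\ \alpha\,d\bar{t} \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix}
\gamma\,[1 + v_p\,\alpha\,d\bar{t}] \\
\gamma\,[v_p + \alpha\,d\bar{t}] \\
0 \\ 0
\end{pmatrix}, \]
The new 3-velocity is:
\begin{align}
u_x = \frac{U^1}{U^0}
&= \frac{\gamma\,[v_p + \alpha\,d\bar{t}]}{\gamma\,[1 + v_p\,\alpha\,d\bar{t}]}
 = \frac{[v_p + \alpha\,d\bar{t}]}{[1 + v_p\,\alpha\,d\bar{t}]} \nonumber \\
&\approx [v_p + \alpha\,d\bar{t}]\,[1 - v_p\,\alpha\,d\bar{t}] \nonumber \\
&\approx v_p + \alpha\,d\bar{t} - v_p^2\,\alpha\,d\bar{t} + O(d\bar{t}^2) \tag{2.23}
\end{align}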
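This change in velocity occurred in time increment $d\bar{t}$ in
$\bar{O}$, which corresponds to
\begin{align}
dt &= \gamma\,d\bar{t} + \gamma v_p\,d\bar{x} \nonumber \\
&= \gamma\,d\bar{t} + \gamma v_p \tfrac{1}{2}\alpha\,d\bar{t}^2 \approx \gamma\,d\bar{t} \tag{2.24}
\end{align}
The acceleration in $O$ is
\[ \frac{du_x}{dt}
 = \frac{v_p + \alpha\,d\bar{t} - v_p^2\,\alpha\,d\bar{t} - v_p}{dt}
 = \frac{\alpha\,d\bar{t} - v_p^2\,\alpha\,d\bar{t}}{\gamma\,d\bar{t}}
 = \frac{\alpha\,(1 - v_p^2)}{\gamma}
 = \frac{\alpha}{\gamma^3} \tag{2.25} \]
This differential equation can be written
\[ \gamma^3\,dv = \alpha\,dt \tag{2.26} \]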
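which can be integrated immediately
\[ \int_0^{v_p} \gamma^3\,dv = \int_0^{t_p} \alpha\,dt \]
\[ \frac{v_p}{\sqrt{1 - v_p^2}} = \alpha\,t_p \tag{2.27} \]
Solving for
\[ v_p(t) = \frac{\alpha\,t_p}{\sqrt{1 + \alpha^2 t_p^2}} \tag{2.28} \]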
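We can immediately solve for the distance traveled by
\[ \frac{dx}{dt} = v_p(t) = \frac{\alpha\,t}{\sqrt{1 + \alpha^2 t^2}} \]
\[ dx = \frac{\alpha\,t}{\sqrt{1 + \alpha^2 t^2}}\,dt \]
\[ x = \frac{1}{\alpha}\left( \sqrt{1 + \alpha^2 t^2} - 1 \right) \tag{2.29} \]
Setting $v_p = 0.999$ and
$\alpha = 10$ m s$^{-2}$ $= \frac{10}{3\times 10^8}$ s$^{-1}$ we find a
time of
\[ t_p = 6.7 \times 10^8\ \mathrm{s} \approx 21\ \mathrm{years} \]
and expressing $\alpha = 10/(3\times 10^8)^2$ m$^{-1}$
\[ x_p = \frac{(3\times 10^8)^2}{10}
 \left[ \sqrt{1 + \frac{\alpha^2\,[\mathrm{m/s^2}]^2\ t_p^2\,[\mathrm{s}]^2}{(3\times 10^8)^2}} - 1 \right]
 = 1.9 \times 10^{17}\ \mathrm{m} \]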
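The numbers for part (b) can be reproduced with a few lines of Python. This
is a sketch only: it assumes the rounded value c = 3e8 m/s used in the
text and an approximate year length, and it obtains the distance by
integrating the speed (2.28) numerically with the trapezoidal rule rather
than relying on a closed form. All names are illustrative.

```python
import math

C = 3.0e8        # speed of light in m/s (rounded, as in the text)
ALPHA = 10.0     # acceleration in the MCRF, m/s^2
YEAR = 3.156e7   # seconds per year (approximate)

def coordinate_time(v, alpha=ALPHA, c=C):
    """Time in O to reach speed v (fraction of c): alpha t = v gamma, Eq. (2.27)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return c * v * g / alpha

def speed(t, alpha=ALPHA, c=C):
    """Speed as a fraction of c at coordinate time t, Eq. (2.28)."""
    s = alpha * t / c
    return s / math.sqrt(1.0 + s * s)

def distance(t_end, steps=200000, alpha=ALPHA, c=C):
    """Trapezoidal integral of c * speed(t) from 0 to t_end, in metres."""
    dt = t_end / steps
    total = 0.5 * (speed(0.0, alpha, c) + speed(t_end, alpha, c))
    for i in range(1, steps):
        total += speed(i * dt, alpha, c)
    return c * total * dt

t_p = coordinate_time(0.999)
x_p = distance(t_p)
print(t_p, t_p / YEAR)   # roughly 6.7e8 s, about 21 years
print(x_p)               # roughly 1.9e17 m
```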
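19 (c) Find the elapsed proper time for the particle in (b).
Recall from part (a) that the time increment in the MCRF was the proper
time increment, $d\bar{t} = d\tau$, and this was related to the time
increment in the original frame via the Lorentz transformation with the
spatial part playing a negligible role:
\begin{align}
dt &= \gamma\,d\bar{t} + \gamma v_p\,d\bar{x} \nonumber \\
&= \gamma\,d\bar{t} + \gamma v_p \tfrac{1}{2}\alpha\,d\bar{t}^2 \approx \gamma\,d\bar{t} \tag{2.30}
\end{align}
So
\[ d\tau = d\bar{t} = \frac{1}{\gamma(v)}\,dt \]
We can use the expression derived above (2.26)
\begin{align}
d\tau &= \frac{1}{\gamma(v)}\,dt \nonumber \\
&= \frac{1}{\alpha}\,\gamma(v)^2\,dv \tag{2.31}
\end{align}
which can also be integrated immediately
\[ \int_0^{\tau_p} d\tau = \int_0^{v_p} \frac{1}{\alpha}\,\gamma(v)^2\,dv \]
\[ \tau_p = \frac{1}{\alpha}\,\mathrm{arctanh}(v_p) \]
\[ \tau_p(t_p) = \frac{1}{\alpha}\,\mathrm{arctanh}\left( \frac{\alpha\,t_p}{\sqrt{1 + \alpha^2 t_p^2}} \right) \tag{2.32} \]
We can solve for the proper time elapsed during the 21 years it took to
accelerate the particle to $v = 0.999$:
\[ \tau_p = \frac{c\,[\mathrm{m/s}]}{\alpha\,[\mathrm{m/s^2}]}\,\mathrm{arctanh}(0.999)
 = 1.14 \times 10^8\ [\mathrm{s}] \approx 3.6\ \mathrm{years} \]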
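The proper-time figure can be checked the same way. The sketch below
evaluates (2.32) in SI units, again assuming the rounded c = 3e8 m/s and an
approximate year length, and cross-checks it against the equivalent form
tau = (c/alpha) asinh(alpha t / c), which follows from (2.27). Names are
illustrative.

```python
import math

C = 3.0e8        # speed of light in m/s (rounded, as in the text)
ALPHA = 10.0     # acceleration in the MCRF, m/s^2
YEAR = 3.156e7   # seconds per year (approximate)

def proper_time_from_speed(v, alpha=ALPHA, c=C):
    """Eq. (2.32) in SI units: tau = (c / alpha) arctanh(v), v as a fraction of c."""
    return (c / alpha) * math.atanh(v)

def coordinate_time(v, alpha=ALPHA, c=C):
    """Eq. (2.27) in SI units: t = c v gamma / alpha."""
    return c * v / (alpha * math.sqrt(1.0 - v * v))

def proper_time_from_time(t, alpha=ALPHA, c=C):
    """Same quantity written in terms of the coordinate time t."""
    return (c / alpha) * math.asinh(alpha * t / c)

tau_p = proper_time_from_speed(0.999)
t_p = coordinate_time(0.999)
print(tau_p, tau_p / YEAR)           # roughly 1.14e8 s, about 3.6 years
print(proper_time_from_time(t_p))    # same value, up to rounding
```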
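20 The particle moves in a circle in the $x$-$y$ plane of radius $b$, in a
clockwise sense when viewed in the direction of decreasing $z$. The circle
translates along the $x$-axis at speed $a$. It's stated that
$|b\omega| < 1$, but the requirement for a realistic particle is actually
that $|a + b\omega| < 1$. The 3-velocity is computed directly by
differentiating the given equations,
$\mathbf{v} \underset{O}{\to} (\dot{x}, \dot{y}, 0)$, where
\begin{align}
\dot{x} &= a + b\omega\cos(\omega t) \tag{2.33} \\
\dot{y} &= -b\omega\sin(\omega t) \tag{2.34}
\end{align}
The 4-velocity is obtained from the 3-velocity using the formula derived
in problem 2.15b.
\begin{align}
\vec{U} &\underset{O}{\to} (\gamma(v),\ \dot{x}\gamma(v),\ \dot{y}\gamma(v),\ 0) \nonumber \\
&= \gamma(v)\,(1,\ a + b\omega\cos(\omega t),\ -b\omega\sin(\omega t),\ 0) \tag{2.35}
\end{align}
where
$v = |\mathbf{v}| = \sqrt{(a + b\omega\cos(\omega t))^2 + (b\omega\sin(\omega t))^2}
 = \sqrt{a^2 + 2ab\omega\cos(\omega t) + \omega^2 b^2}$.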
To obtain the 4-acceleration we require the 4-velocity as a function of
proper time, $\tau$, not $t$, the time in the inertial frame. But remember
that the proper time is the time measured by a clock at, say, the origin
of the MCRF. Call this frame $\bar{O}$, and then
$\bar{t} = \tau = x^{\bar{0}}$. And
$t = \Lambda^{0}{}_{\bar\alpha}(-v)\,x^{\bar\alpha}$.
For simplicity we choose the MCRF with origin at the particle location,
so $\vec{x} \underset{\bar{O}}{\to} (\tau, 0, 0, 0)$, and
$t = \gamma(v)\,\bar{t} = \gamma(v)\,\tau$. Then we obtain the
4-acceleration from the given equations in $t$ and the chain rule,
\[ \vec{a} \equiv \frac{d\vec{U}}{d\tau}
 = \frac{d\vec{U}}{dt}\frac{dt}{d\tau}
 = \gamma(v)\,\frac{d\vec{U}}{dt} \]
We now confront the question as to whether or not to let $\gamma(v)$ vary
in this derivative! The answer is yes. See my supplementary problem SP.3
in the next section, in which we derived a general expression for the
4-acceleration, see (2.53).
\[ \vec{a} = \gamma^3\,[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}]\,\vec{U}
 + \gamma^2\,(0, \ddot{x}, \ddot{y}, \ddot{z}) \tag{2.36} \]
Substituting the values for our particular problem we find:
\[ \vec{a} = \gamma^3\,[-\omega^2 ab\sin(\omega t)]\,\vec{U}
 + \gamma^2\,(0,\ -\omega^2 b\sin(\omega t),\ -\omega^2 b\cos(\omega t),\ 0) \tag{2.37} \]
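A numerical sanity check of (2.36) for this trajectory is sketched below.
It assumes the path x = a t + b sin(omega t), y = b cos(omega t), which is
what the velocities (2.33) and (2.34) integrate to, picks arbitrary sample
values for a, b, omega and t, and compares the formula with a
finite-difference estimate of gamma dU/dt. All names and sample values are
illustrative.

```python
import math

def kinematics(t, a, b, omega):
    """3-velocity and 3-acceleration for x = a t + b sin(wt), y = b cos(wt)."""
    vel = (a + b * omega * math.cos(omega * t),
           -b * omega * math.sin(omega * t),
           0.0)
    acc = (-b * omega ** 2 * math.sin(omega * t),
           -b * omega ** 2 * math.cos(omega * t),
           0.0)
    return vel, acc

def four_velocity(vel):
    g = 1.0 / math.sqrt(1.0 - sum(c * c for c in vel))
    return [g] + [g * c for c in vel]

def four_acceleration(vel, acc):
    """Eq. (2.36): gamma^3 (v . dv/dt) U + gamma^2 (0, dv/dt)."""
    U = four_velocity(vel)
    g = U[0]
    v_dot_a = sum(p * q for p, q in zip(vel, acc))
    spatial = [0.0] + [g ** 2 * c for c in acc]
    return [g ** 3 * v_dot_a * U[i] + spatial[i] for i in range(4)]

def dot(A, B):
    """Scalar product with signature (-, +, +, +)."""
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]

a, b, omega, t = 0.3, 0.5, 1.0, 0.7   # sample values with |a| + |b omega| < 1
vel, acc = kinematics(t, a, b, omega)
U = four_velocity(vel)
A = four_acceleration(vel, acc)

# Independent estimate: gamma * dU/dt from a central finite difference.
h = 1e-6
U_plus = four_velocity(kinematics(t + h, a, b, omega)[0])
U_minus = four_velocity(kinematics(t - h, a, b, omega)[0])
A_numeric = [U[0] * (p - m) / (2.0 * h) for p, m in zip(U_plus, U_minus)]

print(dot(U, A))                                       # close to zero
print(max(abs(p - q) for p, q in zip(A, A_numeric)))   # small
```

The first printed value confirms orthogonality of the 4-acceleration and
4-velocity; the second confirms that letting gamma vary in the derivative
is what reproduces (2.36).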
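21 The motion is hyperbolic in frame $O$,
\[ x^2 - t^2 = a^2\cosh^2\left(\frac{\lambda}{a}\right) - a^2\sinh^2\left(\frac{\lambda}{a}\right) = a^2 \]
and therefore hyperbolic in all reference frames,
$-\bar{t}^2 + \bar{x}^2 = a^2$. The velocity is obtained by
differentiating with respect to $\lambda$,
\[ v = \frac{dx}{dt} = \frac{dx}{d\lambda} \Big/ \frac{dt}{d\lambda} = \tanh\left(\frac{\lambda}{a}\right). \]
So we notice that $\left(\frac{\lambda}{a}\right)$ is a velocity
parameter for $v$, see problem 1.18. The Lorentz transformation to the
MCRF can be written in a simple form with the velocity parameter, see
problem 1.20:
\[ \Lambda =
\begin{pmatrix}
\cosh\left(\frac{\lambda}{a}\right) & -\sinh\left(\frac{\lambda}{a}\right) \\
-\sinh\left(\frac{\lambda}{a}\right) & \cosh\left(\frac{\lambda}{a}\right)
\end{pmatrix} \]
Thus we find the points transform to
\[ \begin{pmatrix} \bar{t}(\lambda) \\ \bar{x}(\lambda) \end{pmatrix}
=
\begin{pmatrix}
\cosh\left(\frac{\lambda}{a}\right) & -\sinh\left(\frac{\lambda}{a}\right) \\
-\sinh\left(\frac{\lambda}{a}\right) & \cosh\left(\frac{\lambda}{a}\right)
\end{pmatrix}
\begin{pmatrix} a\sinh\left(\frac{\lambda}{a}\right) \\ a\cosh\left(\frac{\lambda}{a}\right) \end{pmatrix}
=
\begin{pmatrix} 0 \\ a \end{pmatrix} \]
The particle always ends up on the $\bar{x}$-axis.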
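To show that the parameter $\lambda$ is the proper time, we show that
\[ \frac{d\bar{t}}{d\lambda} = 1 \]
for a MCRF and any $\lambda$. This is a bit subtle, because we want to
hold the Lorentz transformation fixed (so hold
$\lambda = \lambda_{\mathrm{MCRF}}$ fixed), so that the MCRF is inertial.
But we want to let $\lambda$ vary about
$\lambda = \lambda_{\mathrm{MCRF}}$ so we can take the derivative of
$\bar{t}(\lambda)$ wrt $\lambda$. I've written out this dependence
explicitly below:
\[ \bar{t}(\lambda) = \cosh\left(\frac{\lambda_{\mathrm{MCRF}}}{a}\right) a\sinh\left(\frac{\lambda}{a}\right)
 - \sinh\left(\frac{\lambda_{\mathrm{MCRF}}}{a}\right) a\cosh\left(\frac{\lambda}{a}\right) \]
Now differentiate wrt $\lambda$, and evaluate at
$\lambda = \lambda_{\mathrm{MCRF}}$ giving,
\[ \frac{d\bar{t}}{d\lambda} = \cosh^2\left(\frac{\lambda}{a}\right) - \sinh^2\left(\frac{\lambda}{a}\right) = 1 \]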
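The 4-velocity is
\[ \vec{U} \underset{O}{\to} \left( \cosh\left(\frac{\lambda}{a}\right),\ \sinh\left(\frac{\lambda}{a}\right),\ 0,\ 0 \right) \]
The 4-acceleration is easy for this problem because we have the 4-velocity
as a function of proper time!
\[ \vec{a} \equiv \frac{d\vec{U}}{d\tau} = \frac{d\vec{U}}{d\lambda}
 \underset{O}{\to} \left( \frac{1}{a}\sinh\left(\frac{\lambda}{a}\right),\ \frac{1}{a}\cosh\left(\frac{\lambda}{a}\right),\ 0,\ 0 \right) \]
We can check if it's orthogonal to the 4-velocity, as it should be.
\[ \vec{U}\cdot\vec{a} = -\frac{1}{a}\sinh\left(\frac{\lambda}{a}\right)\cosh\left(\frac{\lambda}{a}\right)
 + \frac{1}{a}\sinh\left(\frac{\lambda}{a}\right)\cosh\left(\frac{\lambda}{a}\right) = 0. \]
Is it uniformly accelerating?
\[ \vec{a}\cdot\vec{a} = -\frac{1}{a^2}\sinh^2\left(\frac{\lambda}{a}\right)
 + \frac{1}{a^2}\cosh^2\left(\frac{\lambda}{a}\right) = \frac{1}{a^2}. \]
And $a$ was given as constant, and the acceleration is always pointing in
the $\bar{x}$-direction, so it is uniformly accelerating (see definition
in problem 2.19).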
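22 (a) Given 4-momentum, $\vec{p} \underset{O}{\to} (4, 1, 1, 0)$ kg. Find:
Energy in $O$: In general
$\vec{p} \underset{O}{\to} (E, p^1, p^2, p^3)$, so $E = 4$ kg.
3-velocity in $O$: In general $m\vec{U} = \vec{p}$, where $m$ is the rest
mass and $\vec{U}$ is the 4-velocity. And the 3-velocity is related to the
4-velocity as inferred in problem 2.15b,
$U^\alpha = \Lambda^{\alpha}{}_{\bar{0}}(-|\mathbf{v}|)$, i.e.
$\vec{U} \underset{O}{\to} \gamma\,(1, u, v, w)$. So
$\vec{p} \underset{O}{\to} (m\gamma, m\gamma u, m\gamma v, m\gamma w)$,
where $\mathbf{v} \underset{O}{\to} (u, v, w)$ are the components of the
3-velocity. Note that $E = m\gamma$, and simply dividing through by $E$
gives $\mathbf{v} \underset{O}{\to} (1/4, 1/4, 0)$.
Rest mass:
\[ \gamma = \frac{1}{\sqrt{1 - \mathbf{v}\cdot\mathbf{v}}} = \frac{4}{\sqrt{14}}. \]
From which it follows from $E = m\gamma = 4$ that
$m = \sqrt{14} \approx 3.74$ kg.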
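(b) We must apply the law of conservation of 4-momentum.
\[ \vec{p}_I = \vec{p}_1 + \vec{p}_2 \underset{O}{\to} (5, 0, 1, 0)\ \mathrm{kg} \]
By conservation of 4-momentum,
\[ \vec{p}_F = \vec{p}_I = \vec{p}_3 + \vec{p}_4 + \vec{p}_5, \]
so
\[ \vec{p}_5 = \vec{p}_I - \vec{p}_3 - \vec{p}_4 \underset{O}{\to} (3, -1/2, 1, 0)\ \mathrm{kg}. \]
Now, like in part (a), we know the 4-momentum. From an analysis just like
in (a), we find the 5th particle has in this same reference frame:
$E_5 = 3$ kg, and $\mathbf{v}_5 \underset{O}{\to} (-1/6, 1/3, 0)$.
Finally, the rest mass is $m = \sqrt{31}/2 \approx 2.78$ kg.
The CM frame is found by finding the Lorentz transformation that
transforms $\vec{p}_F$ to have only a time component,
\[ \Lambda^{\bar\alpha}{}_{\beta}\,p_F^{\beta} \propto (\vec{e}_{\bar{0}})^{\bar\alpha} \]
This gives the equation for the $y$-direction,
\[ -v\gamma\,5 + \gamma = 0 \]
So the CM has 3-velocity $\mathbf{v} \underset{O}{\to} (0, 1/5, 0)$.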
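The bookkeeping in this exercise is easy to check by machine. The sketch
below uses only the totals quoted above: the initial total 4-momentum
(5, 0, 1, 0) kg and, as an assumption implied by the quoted result for the
fifth particle, the combined 4-momentum (2, 1/2, 0, 0) kg of particles 3
and 4. Function names are illustrative.

```python
import math

def rest_mass(p):
    """Rest mass from m^2 = -p . p with signature (-, +, +, +), c = 1."""
    return math.sqrt(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2)

def three_velocity(p):
    """3-velocity = spatial momentum divided by energy."""
    return tuple(c / p[0] for c in p[1:])

def subtract(p, q):
    return tuple(x - y for x, y in zip(p, q))

p_a = (4.0, 1.0, 1.0, 0.0)          # part (a), in kg
p_initial = (5.0, 0.0, 1.0, 0.0)    # p_1 + p_2, in kg
p_3_plus_4 = (2.0, 0.5, 0.0, 0.0)   # p_3 + p_4, inferred from the text

p_5 = subtract(p_initial, p_3_plus_4)

print(three_velocity(p_a), rest_mass(p_a))   # (0.25, 0.25, 0.0), about 3.74
print(p_5, three_velocity(p_5), rest_mass(p_5))
print(three_velocity(p_initial))             # (0.0, 0.2, 0.0): CM velocity
```

The CM velocity is simply the total 3-momentum divided by the total
energy, which is equivalent to the condition that the boosted spatial
components vanish.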
23 Find the energy given the 3-velocity and rest mass.
First find the 4-momentum,
$\vec{p} = m\vec{U} = m\gamma\,(1, u, v, w)$. And the energy is the
time-part of the 4-momentum,
\[ E = m\gamma \]
We can find an approximate value of $\gamma$ from the binomial series,
http://en.wikipedia.org/wiki/Binomial_series. This is just a Taylor series
about $x = 0$. Let $x = -\mathbf{v}\cdot\mathbf{v} = -v^2$, and
$\alpha = -1/2$, so we obtain,
\[ \gamma = 1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots \]
So
\[ E \approx m\left(1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots\right) \]
i.e. the rest mass, plus the classical kinetic energy, plus a correction of
order $O(v^4)$. Keeping only this leading correction, it is 1/2 the
kinetic energy when,
\[ v = \sqrt{2/3} \]
24 Show that it's impossible for a positron and an electron to annihilate
and produce a single $\gamma$ ray.
Apparently particles come and go, but 4-momentum is conserved. Line up the
coordinates such that the $x$-axis is aligned with the direction of
propagation of the $\gamma$ ray. Then conservation of 4-momentum,
\[ \vec{p}_{e^+} + \vec{p}_{e^-} = \vec{p}_\gamma, \]
gives two equations. The time part looks like conservation of energy,
\[ p^0_{e^+} + p^0_{e^-} = p^0_\gamma, \]
while the spatial part looks like traditional conservation of momentum,
\[ p^1_{e^+} + p^1_{e^-} = p^1_\gamma. \]
It's important to realize that they are not independent, since in a
reference frame wherein the electron and positron move with velocities
$v_{e^-}$ and $v_{e^+}$, we have
\begin{align}
m\,(\gamma(v_{e^+}) + \gamma(v_{e^-})) &= h\nu \tag{2.38} \\
m\,(\gamma(v_{e^+})\,v_{e^+} + \gamma(v_{e^-})\,v_{e^-}) &= h\nu, \tag{2.39}
\end{align}
where $m$ is the rest mass of the electron and positron and $\nu$ is the
frequency of the $\gamma$ ray. Subtracting (2.39) from (2.38) gives
$m\,[\gamma(v_{e^+})(1 - v_{e^+}) + \gamma(v_{e^-})(1 - v_{e^-})] = 0$,
and each term is strictly positive for speeds below 1. The only
mathematical solution is then $v_{e^-} = v_{e^+} = 1$, which is physically
impossible because of their non-zero rest mass. Nothing moves at the speed
of light, except electromagnetic radiation and possibly gravitational
waves if they exist.
It's possible to produce two $\gamma$ rays. Suppose they travel in
opposite directions with equal and opposite momentum in some frame of
reference. Then the final total 3-momentum is zero. To satisfy momentum
conservation we only require that the positron and electron have equal and
opposite momentum in the same frame of reference, so
$v_{e^+} = -v_{e^-}$ with arbitrary $v_{e^+}$, which can obviously be
satisfied.
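25 Doppler shift.
In frame $O$ the photon has 4-momentum
\[ \vec{p} \underset{O}{\to} (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0) \]
Transforming to the frame $\bar{O}$ moving at speed $v$ along the
$x$-axis, we apply the Lorentz transformation
\[ \Lambda(v) =
\begin{pmatrix}
\gamma(v) & -v\gamma(v) & 0 & 0 \\
-v\gamma(v) & \gamma(v) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \]
to obtain
\[ \vec{p} \underset{\bar{O}}{\to}
\begin{pmatrix}
\gamma h\nu - v\gamma(v)\,h\nu\cos\theta \\
-v\gamma(v)\,h\nu + \gamma(v)\,h\nu\cos\theta \\
h\nu\sin\theta \\
0
\end{pmatrix} \]
So the Doppler shift is obtained from the time component, i.e. the first
component, and can be expressed as,
\[ \frac{\bar\nu}{\nu} = \gamma(v)\,(1 - v\cos\theta) = \frac{1}{\sqrt{1 - v^2}}\,(1 - v\cos\theta) \]
as given.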
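(b)
No Doppler shift occurs when
\[ \frac{\bar\nu}{\nu} = 1 = \frac{1}{\sqrt{1 - v^2}}\,(1 - v\cos\theta) \]
So
\[ v = \frac{2\cos\theta}{1 + \cos^2\theta} \]
or
\[ \theta = \arccos\left( \frac{1 - \sqrt{1 - v^2}}{v} \right) \]
Extra questions: Does this have solutions? For $|v| \ll 1$ use the
binomial series to see that $\theta \approx \pi/2$. What's the maximum
angle of no Doppler shift? As $v \to 1$, $\theta \to 0$. Show that at
$v = 1/2$, $\theta \approx 74.5°$.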
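(c)
Eq. (2.35) is the frame-invariant expression for energy $\bar{E}$ relative
to an observer moving with velocity $\vec{U}_{\mathrm{obs}}$,
\[ -\vec{p}\cdot\vec{U}_{\mathrm{obs}} = \bar{E} \]
and Eq. (2.38) was just $E = h\nu$. This calculation ends up being exactly
the same as above, but allows one to focus on the relevant parts, i.e.
just the time component. Since
\[ \vec{U}_{\mathrm{obs}} \underset{O}{\to} (\gamma(v),\ \gamma(v)\,v,\ 0,\ 0) \]
and recall
\[ \vec{p} \underset{O}{\to} (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0) \]
so we can immediately find
\[ \bar{E} = \gamma(v)\,h\nu - v\gamma(v)\,h\nu\cos\theta, \]
which was the time component of the
$\vec{p} \underset{\bar{O}}{\to}$ found in (a) above.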
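The extra questions in (b) can be explored numerically. The following
sketch evaluates the Doppler ratio from (a) and the zero-shift angle from
(b), assuming c = 1 and an observer moving along x; the function names are
illustrative.

```python
import math

def doppler_ratio(v, theta):
    """nu_bar / nu = gamma (1 - v cos(theta)) for an observer moving at speed v."""
    return (1.0 - v * math.cos(theta)) / math.sqrt(1.0 - v * v)

def no_shift_angle(v):
    """Angle in radians at which the Doppler ratio equals 1, from part (b)."""
    return math.acos((1.0 - math.sqrt(1.0 - v * v)) / v)

theta = no_shift_angle(0.5)
print(math.degrees(theta))                   # about 74.5 degrees
print(doppler_ratio(0.5, theta))             # 1.0 up to rounding
print(math.degrees(no_shift_angle(1e-3)))    # close to 90 degrees for small v
print(math.degrees(no_shift_angle(0.9999)))  # shrinks toward 0 as v approaches 1
```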
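26 Energy required to accelerate an object with rest mass $m$ from $v$ to
$v + \delta v$ to first order in $\delta v$.
\[ E = m\gamma(v) = m\,\frac{1}{\sqrt{1 - v^2}} \]
so the change in energy is just
\[ \Delta E = m\,(\gamma(v + \delta v) - \gamma(v)). \]
When $v \ll 1$ the problem is easy. Just differentiate $\gamma$ wrt $v$ to
get the Taylor series approximation
\[ \gamma(v + \delta v) - \gamma(v) = \gamma'(v)\,\delta v + \frac{1}{2}\gamma''(v)\,\delta v^2 + \ldots \]
where
\begin{align}
\gamma' &= \frac{d\gamma}{dv} = v\gamma^3 \tag{2.40} \\
\gamma'' &= \gamma^3 + v\,3\gamma^2\gamma' \tag{2.41}
\end{align}
So
\[ \gamma(v + \delta v) - \gamma(v) = v\gamma^3\,\delta v + O(\delta v^2) \ldots \]
And so the change in energy is,
\[ \Delta E \approx m v\gamma^3\,\delta v. \]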
A subtlety arises when $v$ is not small. The coefficient $\gamma''$
becomes large relative to $\gamma'$, so ignoring the $O(\delta v^2)$ term
becomes misleading. The author should have instructed us to check this. In
particular,
\[ \frac{\gamma''}{\gamma'} = \frac{1}{v} + 3v\gamma^2 \]
When $v \ll 1$ we can replace
\[ \frac{1}{2}\gamma''\,\delta v^2 \approx \gamma'\left(\frac{\delta v}{2v}\right)\delta v \ll \gamma'(v)\,\delta v \]
since we're given that $\left(\frac{\delta v}{v}\right) \ll 1$. So we're
still justified in ignoring the 2nd term in the Taylor series. But when
$v$ is not small we need another approach.
The above argument is not formally correct when $v$ is not small because
the higher order terms in the Taylor series can no longer be ignored. Here
is one approach.
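Write $v = 1 - \epsilon$ where $0 < \epsilon \ll 1$, so we're close to the
speed of light. Use $\epsilon \ll 1$ and the Binomial series to simplify
$\gamma$,
\[ \gamma(v) = \frac{1}{\sqrt{\epsilon(2 - \epsilon)}} \approx \frac{1}{\sqrt{2\epsilon}}\,(1 + \epsilon/4), \]
and
\[ \gamma(v + \delta v) = \frac{1}{\sqrt{(\epsilon - \delta)(2 - \epsilon + \delta)}} \]
where $\delta = \delta v$. To simplify the latter we need to consider the
case where $|\delta| \ll \epsilon$. But this is not so restrictive. Then
\[ \gamma(v + \delta v) = \frac{1}{\sqrt{(\epsilon - \delta)(2 - \epsilon + \delta)}}
 \approx \frac{1}{\sqrt{2\epsilon}} \left(1 + \frac{\delta}{2\epsilon}\right)\left(1 + \frac{\epsilon - \delta}{4}\right) \]
To find the perturbation in energy we take the difference, keeping the
leading term,
\[ \gamma(v + \delta v) - \gamma(v) \approx \frac{1}{\sqrt{2\epsilon}}\,\frac{\delta}{2\epsilon} \]
It's clear that as $\epsilon \to 0$, so $v \to 1$, this difference grows
without bound.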
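A simpler and better solution: Write $v = 1 - \epsilon$ where
$0 < \epsilon \ll 1$, so we're close to the speed of light.
\[ \gamma(v) = \frac{1}{\sqrt{\epsilon(2 - \epsilon)}}. \]
Now expand this in a Taylor Series in $\epsilon$:
\[ \gamma(v + \delta v) - \gamma(v) = \frac{d\gamma}{d\epsilon}\,(-\delta) + \frac{1}{2}\frac{d^2\gamma}{d\epsilon^2}\,(-\delta)^2 + \ldots \]
And
\[ \frac{d\gamma}{d\epsilon} = -(1 - \epsilon)\,(2\epsilon)^{-3/2}\,(1 - \epsilon/2)^{-3/2}
 \approx -(1 - \epsilon)\left(1 + \frac{3}{4}\epsilon\right)(2\epsilon)^{-3/2} \]
where the approximation exploits $0 < \epsilon \ll 1$ with the Binomial
Series approximation. It's important to check the size of the 2nd
derivative relative to the first. We find, again using the Binomial
Series,
\[ \frac{d^2\gamma}{d\epsilon^2} \approx -\frac{3}{2\epsilon}\,\frac{d\gamma}{d\epsilon} \]
so we're only justified in ignoring the 2nd term if
$|\delta| \ll \epsilon$. In this case, the change in energy is
\[ \Delta E \approx m\,\frac{1}{(2\epsilon)^{3/2}}\,\delta v = m\gamma^3\,\delta v + O(\epsilon) \]
This actually agrees with the result we would have obtained from using the
simple Taylor Series above.
We're asked to show that the energy becomes infinite when $v \to 1$. This
is easily obtained by noting that $\gamma$ is finite for $0 \le v < 1$.
However,
\[ \lim_{v \to 1} \gamma(v) \to \infty. \]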
27 Increasing temperature increases the rest mass.
Object has rest mass, $m(T_0) = 10$ [kg]. Increasing temperature from
$T_0$ to $T$ by heat flux $Q = 100$ J. This must be reflected in an
increase in rest mass, since in the MCRF of the object, $U^0 = 1$ and
$mU^0 = p^0 = E$. So
\[ m(T) = m(T_0)\,[\mathrm{kg}] + Q\,[\mathrm{J}]/c^2\,[\mathrm{m^2/s^2}] = 10 + 1.1 \times 10^{-15}\ [\mathrm{kg}] \]
This problem is interesting to look at from a thermodynamics point of
view. The heat flux increases the temperature and enthalpy of the object,
which is reflected on a microscopic scale by an increase in the motion,
relative to the centre of mass of the object, of the elements (atoms or
molecules or sea of electrons depending on the material) composing the
object. This motion increases the effective mass of the elements. Say an
element has rest mass $m_i$, then when it has thermal speed $v_i$ it has
relativistic mass
\[ m_{i,\mathrm{rel}} = m_i\,\gamma(v_i). \]
I found this website, which expands on these ideas
http://en.wikipedia.org/wiki/Mass–energy_equivalence.
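28 Boring.
29
\begin{align}
\frac{d}{d\tau}(\vec{U}\cdot\vec{U}) &= \frac{d}{d\tau}\left[ -(U^0)^2 + (U^1)^2 + (U^2)^2 + (U^3)^2 \right] \tag{2.42} \\
&= -2U^0\frac{dU^0}{d\tau} + 2U^i\frac{dU^i}{d\tau} \tag{2.43} \\
&= 2\,\vec{U}\cdot\frac{d\vec{U}}{d\tau} \tag{2.44}
\end{align}
Q.E.D.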
30 Four velocity of rocket ship,
\[ \vec{U} \underset{O}{\to} (2, 1, 1, 1) \]
High-velocity cosmic ray with 4-momentum,
\[ \vec{P} \underset{O}{\to} (300, 299, 0, 0) \times 10^{-27}\ \mathrm{kg} \]
(a) Transform to MCRF of rocket ship. We know from Ex. 2.15, that for
general particle 3-velocity $\mathbf{v} = (u, v, w)$, the 4-velocity is
\begin{align}
\vec{A} &\underset{O}{\to} (\gamma(|\mathbf{v}|),\ \gamma(|\mathbf{v}|)\,\mathbf{v}) \tag{2.45} \\
&= (\gamma(|\mathbf{v}|),\ u\gamma(|\mathbf{v}|),\ v\gamma(|\mathbf{v}|),\ w\gamma(|\mathbf{v}|)) \tag{2.46}
\end{align}
where
\[ |\mathbf{v}| = \sqrt{u^2 + v^2 + w^2}. \]
Inspection of $\vec{U}$ reveals that
\[ \gamma = 2, \quad u = 1/2, \quad v = 1/2, \quad w = 1/2 \]
and $|\mathbf{v}| = \sqrt{u^2 + v^2 + w^2} = \sqrt{3}/2$. Now we need the
Lorentz transformation for a reference frame moving with 3-velocity with
more than one non-zero component. Up to this point we haven't learned
this, and I'm a bit surprised Schutz has thrown this at us now. To lead
one through the steps to construct a general Lorentz transformation, I've
created supplementary problem SP.1 in section 2.10. Here we note that we
actually only need the first row of the Lorentz transformation matrix,
since we only require $P^{\bar{0}} = \bar{E}$. This first row must be such
that it transforms $\vec{U} \underset{\bar{O}}{\to} (1, 0, 0, 0)$. Thus it
must be related to the components of $\vec{U}$ as follows:
\[ \Lambda^{\bar{0}}{}_{0} = U^0, \qquad \Lambda^{\bar{0}}{}_{i} = -U^i. \]
Applying $\Lambda^{\bar{0}}{}_{\alpha}$ to the given $\vec{P}$ gives,
$\bar{E} = 301 \times 10^{-27}$ kg in rocket ship frame.
(b)
\[ -\vec{P}\cdot\vec{U}_{\mathrm{obs}} = E_{\mathrm{obs}}
 = -10^{-27}\,[-300\ \ 299\ \ 0\ \ 0]
 \begin{pmatrix} 2 \\ 1 \\ 1 \\ 1 \end{pmatrix}
 = 301 \times 10^{-27}\ \mathrm{kg} \]
(c) Of course (b) was faster. The same computations were performed to
get the answer, but in (b) we only did the necessary computations.
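Both routes to the energy can be written out in a few lines. This is just
an illustrative sketch with energies in units of 1e-27 kg; the variable
names are made up here.

```python
def dot(A, B):
    """Scalar product with signature (-, +, +, +)."""
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]

U_rocket = (2.0, 1.0, 1.0, 1.0)
P_ray = (300.0, 299.0, 0.0, 0.0)   # in units of 1e-27 kg

# Route (b): frame-invariant expression E = -P . U_obs
E_invariant = -dot(P_ray, U_rocket)

# Route (a): first row of the Lorentz matrix, (U^0, -U^1, -U^2, -U^3)
first_row = (U_rocket[0], -U_rocket[1], -U_rocket[2], -U_rocket[3])
E_from_row = sum(r * p for r, p in zip(first_row, P_ray))

print(E_invariant, E_from_row)   # 301.0 301.0
# The same row applied to U itself gives the time component in the MCRF.
print(sum(r * u for r, u in zip(first_row, U_rocket)))   # 1.0
```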
31 Photon reflects off mirror without changing frequency $\nu$. Angle of
incidence is $\theta$.
This appears to be a straightforward application of conservation of
4-momentum, but it's fun because it gets us thinking about all 4
components. Let the mirror lie in the $y$-$z$ plane, with the photon
travelling initially in the $x$-$y$ plane, with angle $\theta$ to the
$x$-axis. Then the initial 4-momentum of the photon is written
\[ \vec{P}_i = (h\nu,\ \cos(\theta)\,h\nu,\ \sin(\theta)\,h\nu,\ 0). \]
First let's construct the 4-momentum of the reflected photon
$\vec{P}_r$. Since the photon frequency doesn't change, we know instantly
the time component,
\[ P^0_r = P^0_i = h\nu. \]
For a smooth mirror we assume (actually I'm just guessing!) that the
momentum transferred is only in the $x$-direction. So then we can also
construct the components,
\[ P^2_r = P^2_i = \sin(\theta)\,h\nu, \qquad P^3_r = P^3_i = 0 \]
Recall from Eq. (2.37) that the 4-momentum of a photon is orthogonal to
itself. This alone gives us two possibilities for
$P^1_r = \pm\cos(\theta)\,h\nu$. For the reflected photon, we choose the
minus sign. In summary,
\[ \vec{P}_r = (h\nu,\ -h\nu\cos\theta,\ h\nu\sin\theta,\ 0). \]
By conservation of 4-momentum, we see that the momentum transferred to
the mirror must be $P^1_m = 2h\nu\cos(\theta)$ in the $x$-direction. How
did the mirror acquire $x$-direction momentum without gaining energy? See
Supplementary Problem SP 2 in section 2.10.
If the photon is absorbed, then the momentum transferred to the mirror
has three components,
\[ \vec{P}_m = (E_m, P^1_m, P^2_m, 0) = (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0), \]
How did the mirror acquire the extra energy $E_m = h\nu$? See
Supplementary Problem SP 2 in section 2.10.
32 Derive the Compton scattering relationship Eq. (2.43).
Initially the total 4-momentum in the particle's initial rest frame $O$ is
\[ \vec{P} \underset{O}{\to} (h\nu_i,\ h\nu_i,\ 0,\ 0) + (m, 0, 0, 0) \]
After the scattering event,
\[ \vec{P} \underset{O}{\to} (h\nu_f,\ h\nu_f\cos\theta,\ h\nu_f\sin\theta,\ 0)
 + m\,(\gamma,\ \gamma v\cos\phi,\ \gamma v\sin\phi,\ 0) \]
where $v$ and $\phi$ are the speed and the angle of the particle's
scattered trajectory in the $x$-$y$ plane relative to the initial
direction of the incident photon. Equating the three nonzero components of
4-momentum gives 3 equations for the 3 unknowns $\nu_f$, $v$, $\phi$. In
principle one can then eliminate the other two unknowns and solve for
$\nu_f$, but I found it too tedious to do so.
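A less tedious route, sketched here as an alternative to solving the three
component equations, is to eliminate the recoiling particle using the
invariant magnitude of its 4-momentum. The symbols $\vec{p}_i$, $\vec{p}_f$
(photon before and after) and $\vec{q}_i$, $\vec{q}_f$ (particle before
and after) are introduced only for this sketch, and the last line assumes
that Eq. (2.43) has the standard Compton form.

\begin{align*}
\vec{q}_f &= \vec{p}_i + \vec{q}_i - \vec{p}_f \\
-m^2 = \vec{q}_f\cdot\vec{q}_f &= \vec{q}_i\cdot\vec{q}_i
 + 2\left( \vec{p}_i\cdot\vec{q}_i - \vec{p}_i\cdot\vec{p}_f - \vec{q}_i\cdot\vec{p}_f \right)
 \qquad (\vec{p}_i\cdot\vec{p}_i = \vec{p}_f\cdot\vec{p}_f = 0) \\
0 &= \vec{p}_i\cdot\vec{q}_i - \vec{p}_i\cdot\vec{p}_f - \vec{q}_i\cdot\vec{p}_f
 \qquad (\vec{q}_i\cdot\vec{q}_i = -m^2) \\
0 &= -m h\nu_i + h^2\nu_i\nu_f\,(1 - \cos\theta) + m h\nu_f \\
\frac{1}{\nu_f} &= \frac{1}{\nu_i} + \frac{h\,(1 - \cos\theta)}{m}
\end{align*}

The fourth line uses the components in $O$:
$\vec{q}_i \to (m, 0, 0, 0)$, $\vec{p}_i \to (h\nu_i, h\nu_i, 0, 0)$ and
$\vec{p}_f \to (h\nu_f, h\nu_f\cos\theta, h\nu_f\sin\theta, 0)$, with the
signature $(-, +, +, +)$. The unknowns $v$ and $\phi$ never appear.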
33 Compton scattering of a cosmic microwave background radiation photon
off a cosmic ray ($\equiv$ high-energy proton). What's the max frequency
of the scattered photon?
Very nice problem. At first it appears very challenging, but the extreme
differences in energy between the two particles simplifies things.
First we note that in the rest frame of the particle, Compton scattering
only reduces the frequency, and more so for less massive particles (see
also supplementary problem SP 2 below). So how can Compton scattering
increase the energy of the photon?? The increase in energy is revealed via
the Doppler shift.
The key simplification in this problem is that the Compton scattering in
the frame of the particle has very little effect on frequency.
\[ \frac{1}{h\nu_i} = 5000\ \mathrm{eV}^{-1} \gg \frac{1}{m_p} = 10^{-9}\ \mathrm{eV}^{-1}. \]
The same inequality holds for the Doppler-shifted photon energy in the
proton frame found below, where
$1/(h\bar\nu_i) = 0.25 \times 10^{-5}$ eV$^{-1}$.
So the angle of the Compton scattering has very little effect on the final
frequency in the particle's initial rest frame. So in considering the
effect of the angle, we need only consider its effect on the Doppler
shift.
Now the problem is easy. The Doppler shift in frequency is given in
general by Eq. (2.42). Obviously to maximize the frequency in the cosmic
ray frame, $\bar\nu_i$, we want the photon and cosmic ray traveling in a
line in opposite directions, i.e. $\theta = \pi$ radians, for which
Eq. (2.42) gives
\[ h\bar\nu_i = h\nu_i\,\frac{1}{\sqrt{1 - v^2}}\,(1 + v)
 \approx h\nu_i\,\frac{2}{\sqrt{1 - v^2}}
 = h\nu_i \times 2 \times 10^9 = 4 \times 10^5\ \mathrm{eV}. \]
The Doppler shift has made a tremendous increase in frequency! The
Compton scattering will make very little difference, so to maximize the
scattered frequency in the Sun's frame, choose the Compton scattering
angle to maximize the Doppler shift. That is, choose the scattering angle
to be $\pi$. Eq. (2.43) gives
\[ \frac{1}{h\bar\nu_f} = \frac{1}{h\bar\nu_i} + \frac{2}{m_p}
 = 0.25 \times 10^{-5} + 2 \times 10^{-9} \approx 0.25 \times 10^{-5}\ [\mathrm{eV}]^{-1}. \]
Compton scattering caused a negligible decrease in energy in the proton's
frame. The proton, like the mirror in problem 31, is massive enough to
cause little change in frequency of the photon in the proton's frame. See
also Supplementary problem SP 2. Now Lorentz transform back to the Sun's
frame. The photon again gains tremendously from the Doppler shift (that's
why we choose the scattering angle to be complete reflection).
\[ h\nu_f \approx h\bar\nu_f \times 2 \times 10^9 \approx 8 \times 10^{14}\ \mathrm{eV}. \]
This is a very hard $\gamma$ ray. A pair of 511 keV photons arising from
annihilation of an electron and positron are considered to be $\gamma$
rays. And this is more than a billion times more energetic than that.
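The chain Doppler shift, Compton scattering, Doppler shift can be followed
numerically. The sketch below works in eV and assumes the values implied by
the numbers above: an incident photon energy of 2e-4 eV (from
1/(h nu_i) = 5000 eV^-1), a proton rest mass of about 1e9 eV, and a Lorentz
factor of about 1e9 so that the head-on Doppler factor gamma (1 + v) is
approximately 2 gamma. The names are illustrative.

```python
import math

H_NU_I = 2.0e-4   # incident photon energy in the Sun's frame, eV
M_P = 1.0e9       # proton rest mass, eV (order of magnitude)
GAMMA = 1.0e9     # Lorentz factor of the cosmic ray (assumed)

def head_on_doppler(energy, gamma=GAMMA):
    """Head-on factor gamma (1 + v), approximated by 2 gamma since v ~ 1."""
    return 2.0 * gamma * energy

def compton(energy_in, theta, mass=M_P):
    """Eq. (2.43) in energy form: 1/E_f = 1/E_i + (1 - cos(theta)) / m."""
    return 1.0 / (1.0 / energy_in + (1.0 - math.cos(theta)) / mass)

e_in_proton_frame = head_on_doppler(H_NU_I)             # about 4e5 eV
e_out_proton_frame = compton(e_in_proton_frame, math.pi)
e_out_sun_frame = head_on_doppler(e_out_proton_frame)   # about 8e14 eV

print(e_in_proton_frame, e_out_proton_frame, e_out_sun_frame)
print(e_out_sun_frame / 511.0e3)   # ratio to a 511 keV annihilation photon
```

The Compton step lowers the energy in the proton frame by less than a
tenth of a percent, which is why only the two Doppler factors matter.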
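34 These are quite trivial. For example, expand out the dot product in
terms of components using the definition in Eq. (2.26), and use the
linearity property given by Eq. (2.8),
\begin{align}
(\alpha\vec{A})\cdot\vec{B} &= -\alpha A^0 B^0 + \alpha A^1 B^1 + \alpha A^2 B^2 + \alpha A^3 B^3 \nonumber \\
&= \alpha\,(-A^0 B^0 + A^1 B^1 + A^2 B^2 + A^3 B^3) \nonumber \\
&= \alpha\,(\vec{A}\cdot\vec{B}) \tag{2.47}
\end{align}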
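35 Show that the $\vec{e}_{\bar\beta}$ obtained from Eq. (2.15),
\[ \vec{e}_{\bar\beta} = \Lambda^{\alpha}{}_{\bar\beta}(-v)\,\vec{e}_{\alpha}, \]
obey
\[ \vec{e}_{\bar\alpha}\cdot\vec{e}_{\bar\beta} = \eta_{\bar\alpha\bar\beta}. \]
Substituting,
\[ \vec{e}_{\bar\alpha}\cdot\vec{e}_{\bar\beta}
 = \Lambda^{\mu}{}_{\bar\alpha}(-v)\,\vec{e}_{\mu} \cdot \Lambda^{\nu}{}_{\bar\beta}(-v)\,\vec{e}_{\nu} \]
The LHS is a vector expression, and it shouldn't depend upon the
orientation of the coordinate axes. So let's rotate the axes so that
$\mathbf{v}$ is oriented along the $x$-axis. Then
\[ \Lambda(-v) =
\begin{pmatrix}
\gamma & v\gamma & 0 & 0 \\
v\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \]
Note that $\Lambda$ is symmetric so we can interchange indices on one
without effect,
\[ \vec{e}_{\bar\alpha}\cdot\vec{e}_{\bar\beta}
 = \Lambda^{\mu}{}_{\bar\alpha}\,\eta_{\mu\nu}\,\Lambda^{\nu}{}_{\bar\beta} \]
For given $\bar\alpha = \bar\beta$, the RHS looks like the product of a
row of $\Lambda$ times a column of $\eta\Lambda$. It's easy to see that
the result is $-1$ for $\bar\alpha = \bar\beta = 0$ and $+1$ for
$\bar\alpha = \bar\beta > 0$. When $\bar\alpha \neq \bar\beta$, the
RHS = 0. Q.E.D.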
2.10 Rob's supplemental problems
R.1 Suppose the 4-velocity of a rocket ship is
$\vec{U} \underset{O}{\to} (2, 1, \sqrt{2}, 0)$ in some reference frame
$O$.
(a) Show that the given $\vec{U}$ is a legitimate 4-velocity. Show that
$\vec{V} \underset{O}{\to} (2, 1, 1, 0)$ is not possible.
(b) Find the 3-velocity in $O$. Hint: see Ex. 2.15. (You'll need this for
(c)).
(c) Find the matrix that rotates the spatial coordinates such that the
3-velocity has only one non-zero component, in say the
$\bar{x}$-direction. What's the matrix that rotates the 4-velocity to have
only one nonzero spatial component?
(d) Find the inverse rotation matrices for above. Hint: Think physically
and check mathematically, i.e. $R_4^{-1} R_4 = I$
(e) Find the Lorentz transformation from $O$ to the MCRF of the rocket
ship. Confirm that it has the correct effect applied to $\vec{U}$ itself.
Hint: The problem here is that we have so far only seen the Lorentz
transformation when the 3-velocity has only one non-zero component. Use
your rotation matrix from above and its inverse.
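Solution:
(a)
\[ \vec{U}\cdot\vec{U} = -2^2 + 1^2 + \sqrt{2}^2 = -1 \]
which is consistent with Eq. (2.28). On the other hand,
\[ \vec{V}\cdot\vec{V} = -2^2 + 1^2 + 1^2 = -2 \]
which is inconsistent with Eq. (2.28).
(b) See solution to Ex. 2.15:
\[ \mathbf{v} \underset{O}{\to} (1/2,\ \sqrt{2}/2,\ 0) \]
(c) Rotating anticlockwise through angle
$\theta = \arccos(1/\sqrt{3})$ aligns the $x$-axis with the 3-velocity.
This is accomplished with the matrix $R_3$,
\[ R_3 =
\begin{pmatrix}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix} \]
For the 4-velocity
\[ R_4 =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\theta & \sin\theta & 0 \\
0 & -\sin\theta & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \]
(d) To find the inverse of the rotation matrix just change the sign of the
angle!
\[ R_4^{-1} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\theta & -\sin\theta & 0 \\
0 & \sin\theta & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \]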
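(e) The Lorentz transformation for the case $(u, v, 0)$ can be built from
the above tools. Consider transforming a vector, $\vec{U}$.
\[ \bar{U} = \Lambda U = R_4^{-1}\,\Lambda(u', 0, 0)\,R_4\,U \]
where
\[ \Lambda(u', 0, 0) =
\begin{pmatrix}
\gamma(u') & -u'\gamma(u') & 0 & 0 \\
-u'\gamma(u') & \gamma(u') & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \]
and $u' = |\mathbf{v}|$ is the speed along the rotated $x$-axis. So this
defines the desired Lorentz transformation $\Lambda(u, v, 0)$,
\[ \Lambda(u, v, 0) =
\begin{pmatrix}
\gamma(|\mathbf{v}|) & -u\gamma(|\mathbf{v}|) & -v\gamma(|\mathbf{v}|) & 0 \\
-u\gamma(|\mathbf{v}|) & \gamma(|\mathbf{v}|)\cos^2\theta + \sin^2\theta & (\gamma(|\mathbf{v}|) - 1)\cos\theta\sin\theta & 0 \\
-v\gamma(|\mathbf{v}|) & (\gamma(|\mathbf{v}|) - 1)\cos\theta\sin\theta & \gamma(|\mathbf{v}|)\sin^2\theta + \cos^2\theta & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \tag{2.48} \]
where $|\mathbf{v}| = \sqrt{u^2 + v^2}$ and $\theta = \arctan(v/u)$. It's
straightforward, albeit a bit tedious, to show that
\[ \Lambda(u, v, 0)\,U =
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \]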
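The tedious check can be delegated to a short script. The sketch below
builds the transformation as the product of a rotation, an x-boost and the
inverse rotation, applies it to the 4-velocity of this problem, and
compares a few entries with the closed form (2.48). It assumes c = 1; the
function names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, x):
    return [sum(m[i][k] * x[k] for k in range(4)) for i in range(4)]

def rotation(theta):
    """R_4: rotates the spatial x-y axes through theta, leaves t and z alone."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_x(speed):
    g = 1.0 / math.sqrt(1.0 - speed * speed)
    return [[g, -speed * g, 0.0, 0.0],
            [-speed * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_xy(u, v):
    """Lambda(u, v, 0) = R_4^{-1} Lambda(|v|, 0, 0) R_4."""
    speed = math.hypot(u, v)
    theta = math.atan2(v, u)
    return matmul(rotation(-theta), matmul(boost_x(speed), rotation(theta)))

U = [2.0, 1.0, math.sqrt(2.0), 0.0]
u, v = U[1] / U[0], U[2] / U[0]      # 3-velocity from part (b)
L = boost_xy(u, v)

print(apply(L, U))                   # approximately [1, 0, 0, 0]
print(L[0][1], L[0][2], L[1][2])     # compare with -u gamma, -v gamma, (gamma-1) cos sin
```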
R.2 (a) How did the mirror in problem 2.31 acquire $x$-direction momentum
without acquiring energy when the photon was reflected?
(b) How did it acquire the energy when the photon was absorbed?
Solution:
(a) The change in 4-momentum is related to the change in 4-velocity of
a massive object,
\[ \Delta\vec{P}_m = m\,\Delta\vec{U} = m\,(\Delta\gamma,\ \Delta(\gamma u),\ 0,\ 0) = m\,(\gamma - 1,\ \gamma u,\ 0,\ 0), \]
where the 2nd equality assumes the mirror is initially at rest. Thus the
ratio of
\[ \frac{\Delta P^0_m}{\Delta P^1_m} = \frac{E_m}{(mU^1)}
 = \frac{1}{u}\left(1 - \sqrt{1 - u^2}\right) \approx \frac{u}{2}. \]
The approximation applies in the limit $u \ll 1$ using the binomial
series. So the change in energy can be arbitrarily small for a given
change in momentum if the change in velocity is correspondingly small.
This corresponds to the intuition that a more massive mirror would rebound
less for a given momentum transfer. I suspect the imposition of reflection
without change in frequency is an idealization applicable for massive
mirrors. Indeed the next problem, 2.32, covers Compton scattering, wherein
a photon reflects off a particle of mass $m$. In Eq. (2.43) we see that for
\[ m \gg h\nu_i \]
where $\nu_i$ is the incident frequency of the photon, the reflected
frequency $\nu_f \approx \nu_i$.
(b) For a massive mirror, the energy must have become mostly thermal
energy. For a less massive mirror, more of the energy would go into the
translational kinetic energy of the rebound.
R.3 Start with the expression for the 4-velocity in terms of the 3-velocity
with the components of the 3-velocity written as a function of time in an
inertial frame O:
U
O
(v)(1, x, y, z)
where
v = [v[ =
_
x
2
+ y
2
+ z
2
Show that the 4-acceleration is orthogonal to the 4-velocity by using the
change rule to derive a general expression for the 4-acceleration involving
derivatives with respect to time.
Solution:
The 4-acceleration is dened as
a
d
U
d
Eq. (2.32)
=
d
d
[(v)(1, x, y, z)]
=
dt
d
d
dt
[(v)(1, x, y, z)]
(2.49)
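One can always put the origin of the MCRF at the particle location, so that
\[ \vec x \underset{\bar O}{\to} (\tau, 0, 0, 0) \]
and thus for short increments in time,
\[ \Delta t = \Lambda^0{}_{\bar\beta}\,\Delta x^{\bar\beta} = \gamma(v)\,\Delta x^{\bar 0} = \gamma(v)\,\Delta\tau \tag{2.50} \]
and
\[ \frac{dt}{d\tau} = \gamma(v) \tag{2.51} \]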
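To resolve (2.49) we also require
\[ \frac{d}{dt}\gamma(v) = \frac{d}{dt}\left(\frac{1}{\sqrt{1 - \dot x^2 - \dot y^2 - \dot z^2}}\right) = \gamma^3\left[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z\right] \tag{2.52} \]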
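Substituting (2.51) and (2.52) into (2.49) we find,
\begin{align}
\vec a &= \gamma\left\{\gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z](1, \dot x, \dot y, \dot z) + \gamma(0, \ddot x, \ddot y, \ddot z)\right\} \\
&= \gamma\left\{\gamma^2[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z]\,\vec U + \gamma(0, \ddot x, \ddot y, \ddot z)\right\} \\
&= \gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z]\,\vec U + \gamma^2(0, \ddot x, \ddot y, \ddot z) \tag{2.53}
\end{align}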
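Now take the dot product with $\vec U$:
\begin{align}
\vec U\cdot\vec a &= \gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z]\,\vec U\cdot\vec U + \gamma^2\,\vec U\cdot(0, \ddot x, \ddot y, \ddot z) \\
&= -\gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z] + \gamma^3(1, \dot x, \dot y, \dot z)\cdot(0, \ddot x, \ddot y, \ddot z) \\
&= 0. \tag{2.54}
\end{align}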
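As a sanity check on (2.53) and (2.54), the sketch below builds the 4-velocity and 4-acceleration from Eq. (2.53) for arbitrarily chosen sample values of the 3-velocity and 3-acceleration (units with c = 1) and evaluates their Minkowski dot product. It is written in Python purely for illustration; the helper names are not from Schutz.

```python
import math

def four_velocity_and_acceleration(v, a):
    """v = (xdot, ydot, zdot), a = (xddot, yddot, zddot), with |v| < 1.
    Returns the components of U and of the 4-acceleration from Eq. (2.53)."""
    vx, vy, vz = v
    ax, ay, az = a
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    v_dot_a = vx * ax + vy * ay + vz * az
    U = (gamma, gamma * vx, gamma * vy, gamma * vz)
    # Eq. (2.53): gamma^3 [v.a] U + gamma^2 (0, xddot, yddot, zddot)
    k = gamma ** 3 * v_dot_a
    A = tuple(k * Ui + gamma ** 2 * ai
              for Ui, ai in zip(U, (0.0, ax, ay, az)))
    return U, A

def minkowski_dot(X, Y):
    """Scalar product with metric diag(-1, 1, 1, 1)."""
    return -X[0] * Y[0] + X[1] * Y[1] + X[2] * Y[2] + X[3] * Y[3]

U, A = four_velocity_and_acceleration((0.3, -0.2, 0.5), (1.0, 2.0, -0.7))
print(minkowski_dot(U, U))  # -1 up to rounding
print(minkowski_dot(U, A))  # 0 up to rounding
```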
Chapter 3
Tensor Analysis in Special
Relativity
3.1 The metric tensor
We learn that $\eta_{\alpha\beta}$ (introduced in Eq. (2.27)) is the metric tensor, and it
provides a frame-invariant way to write the scalar product of two vectors,
Eq. (3.1).
I don't see why this is "frame-invariant" when the RHS depends upon
the components, which in turn depend upon the frame. Maybe he means
the LHS? Of course the scalar product itself is frame-invariant, and perhaps
that's all he means here.
In any case, he wants to talk about tensors in general and is using the
metric tensor as a concrete example to get the discussion going.
3.2 Definition of tensors
Tensors are defined as rules for mapping N vectors to real numbers that are
linear in the arguments:
\[ \binom{0}{N} \]
An ordinary function y = f(t, x, y, z) is a rule for mapping reals onto a
real and is classed as
\[ \binom{0}{0} \]
because it takes zero vectors as its input. I find this odd because nowhere
in the definition of a tensor is there mention of the dimension of the vectors
and scalars are vectors of dimension 0.
But this is just semantics.
Aside on the usage of the term "function"
Emphasizes that a regular function y = f(x) is more generally thought
of as a rule for associating real values of x with real values of y.
So we should think of tensors as rules for associating vectors with real
scalars. For example $\mathbf{g}$ is the tensor that associates the vectors with their
dot product; for example, say $\vec A$ and $\vec B$ with the number $\vec A\cdot\vec B$.
Components of a tensor
So the components of a tensor, for a given frame, are the values of the
tensor applied to the basis vectors of that frame. This gives new insight
into Eq. (2.27) that introduced the metric tensor, for now we see the same
equation repeated here as Eq. (3.5), but now with the interpretation of the
16 values of $\eta_{\alpha\beta}$ being the components of the metric tensor in basis vectors
$\vec e_\alpha$, Eq. (2.9). And the metric tensor provides the rule associated with the dot
product.
3.3 The $\binom{0}{1}$ tensors: one-forms
Covector = covariant vector = one-form.
The concept of one-form was so confusing (to me at least) in Misner et al.
(1973) that I put their book aside and bought Schutz. Here Schutz comes
through brilliantly, helping me through this hurdle.
General properties
Typo in Eq. (3.6b).
r(

A) = p(
A)
should be
r(
A) = p(
A)
The set of all one-forms forms a vector space. The axioms of a vector
space are given in Appendix A on p. 374. BTW
. . . an abelian group, also called a commutative group, is a group
in which the result of applying the group operation to two group
elements does not depend on their order (the axiom of commu-
tativity). Abelian groups generalize the arithmetic of addition of
integers.
which I got from http://en.wikipedia.org/wiki/Abelian_group.
Components of one-forms transform in the same way as basis vectors do;
see Eq. (3.9).
A one-form is frame-independent, see p. 59.
Notation for derivatives
Result Eq. (3.20) is very fundamental, yet I found it a bit weakly
explained. Up until this point the gradient was defined only for a scalar field,
like $\phi(\vec x)$, c.f. Eq. (3.15). Suddenly, and without comment to the effect, in
Eq. (3.20) Schutz is applying the gradient operator to the component of a
vector, $x^\alpha$. One might be tempted to say, "but hold $\alpha$ fixed and then $x^\alpha$
is like a scalar field." I find that unsatisfactory because components of
vectors are very different things from scalar fields. Components change under a
change of basis, while scalars don't! In any case, Eq. (3.20) appears
out-of-hat. I now appreciate the approach of Hobson et al. (2009), who start with
manifolds and co-ordinate curves therein, then cover vector and tensor
calculus on pseudo-Riemann manifolds. Then the notion of basis vector and
basis one-form, and their connection with coordinate curves, appears more
naturally.
In exercise 34(e) we'll learn by example that the definition in Eq. (3.20)
means that the one-form bases are not necessarily just the metric applied to
the vector basis vectors.
Normal one-forms
In my opinion it is necessary to state that the normal one-form is not the
zero one-form. This was specied for instance in Problem 12.
3.5 Definition of tensors
Why distinguish one-forms from vectors
This is the crucial material missing from Misner et al. (1973). There's
also a nice connection made with row vectors and column vectors and Dirac's
bra and ket vector formulation of quantum mechanics.
Typo: Eq. (3.45) should be
\[ \vec{d}\phi \]
because it's a vector gradient.
3.6 Finally: $\binom{M}{N}$ tensors
Typo: Paragraph before Eq. (3.55), the equation below should have the RHS
in regular, not bold, face type because it's the components.
\[ \mathbf{R}(\tilde\omega^\alpha; \vec e_\beta) := R^\alpha{}_\beta \]
Similarly for Eq. (3.55), last line, the RHS should be in regular, not bold,
face type because it's the components. Also the first line of this equation
should have on the RHS not .
3.8 Differentiation of tensors
This is a very important section for later work. Unfortunately it is a bit
rushed. As discussed in my solution to problem 28 of 3.10, I find that
Eq. (3.66) is not clearly explained. Rather than saying that we deduce
Eq. (3.66) from Eq. (3.65), I'd find it more satisfying if he'd said something
like:
"We chose to define the gradient of a rank 2 tensor as in Eq. (3.66). And
in so doing, we can then obtain Eq. (3.65), which is desirable because it
appears as a straightforward generalization of Eq. (3.14)."
3.10 Exercises
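1(a). The double sum is obviously different because it includes the
off-diagonal terms $M_{\alpha\beta}$, when $\alpha \neq \beta$.
(b)
\begin{align}
A^\alpha B^\beta \eta_{\alpha\beta} &= \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} A^\alpha B^\beta \eta_{\alpha\beta} \tag{3.1} \\
&= A^0 B^0(-1) + A^1 B^1 + A^2 B^2 + A^3 B^3 \tag{3.2}
\end{align}
using $\eta_{\alpha\beta}$ defined after Eq. (2.7).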
2. To prove that the set of all one-forms is a vector space, we must show
that this set meets the axioms (1) and (2) given in Appendix A, p. 374.
Axiom (1): The sum of two one-forms must also be a one-form, which is
satisfied by Eq. (3.6a), and the order of summation doesn't matter, which is
satisfied by Eq. (3.6b) because a one-form evaluates to a real and the sum of
two reals doesn't depend upon the order. We also require a zero. The zero
one-form gives zero for any vector (see p. 60). So say $\tilde q$ is the zero one-form.
Then assuming Eq. (3.6a) and by Eq. (3.6b)
\begin{align}
\tilde s(\vec A) &= \tilde p(\vec A) + \tilde q(\vec A) \tag{3.3} \\
&= \tilde p(\vec A) + 0 \tag{3.4} \\
&= \tilde p(\vec A) \tag{3.5}
\end{align}
so $\tilde p + \tilde 0 = \tilde p$ and we have a zero. Axiom (1) is satisfied.
Axiom (2): There are four requirements to meet. Although it's not made
explicit in Eq. (3.6a), let's assume $\alpha \in \mathbb{R}$. By Eq. (3.6) it's clear that
multiplication of a one-form by a real scalar meets the requirements of Axiom 2.
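3(a). Show
\[ \tilde p(A^\alpha \vec e_\alpha) = A^\alpha\,\tilde p(\vec e_\alpha) \]
\begin{align}
\tilde p(A^\alpha \vec e_\alpha) &= \tilde p\left(\sum_{\alpha=0}^{3} A^\alpha \vec e_\alpha\right) \tag{3.6} \\
&= \sum_{\alpha=0}^{3} A^\alpha\,\tilde p(\vec e_\alpha), \quad \text{by linearity in arguments, c.f. p. 56,} \tag{3.7} \\
&= A^\alpha\,\tilde p(\vec e_\alpha) \tag{3.8}
\end{align}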
3(b). (i)
\[ \tilde p(\vec A) = A^\alpha p_\alpha = -2 + 1 + 0 + 0 = -1 \]
(ii) +2, (iii) -7, (iv) -7.
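A short sketch of these four evaluations, in Python for illustration. The component values are assumed to be those of Schutz's exercise statement, which is not reproduced in these notes: p = (-1, 1, 2, 0), A = (2, 1, 0, -1), B = (0, 2, 0, 0). The function name is hypothetical.

```python
def apply_one_form(p, A):
    """Evaluate p(A) = p_alpha A^alpha from components in the same frame."""
    return sum(p_a * A_a for p_a, A_a in zip(p, A))

# Assumed components (from the exercise statement, not from these notes).
p = (-1, 1, 2, 0)
A = (2, 1, 0, -1)
B = (0, 2, 0, 0)

A_minus_3B = tuple(a - 3 * b for a, b in zip(A, B))

result_i = apply_one_form(p, A)                              # (i)
result_ii = apply_one_form(p, B)                             # (ii)
result_iii = apply_one_form(p, A_minus_3B)                   # (iii)
result_iv = apply_one_form(p, A) - 3 * apply_one_form(p, B)  # (iv)
print(result_i, result_ii, result_iii, result_iv)  # -1 2 -7 -7
```

That (iii) and (iv) agree is just the linearity of the one-form in its argument.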
4(a). To show the vectors are linearly independent we require that there
is no non-trivial linear combination
\[ a\vec A + b\vec B + c\vec C + d\vec D = 0 \tag{3.9} \]
\[ \begin{pmatrix} \vec A & \vec B & \vec C & \vec D \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.10} \]
A non-trivial combination (i.e. not a = b = c = d = 0) requires the
determinant of the matrix with columns formed by the vectors $\vec A$ through $\vec D$ to be
zero. But the determinant is -8.
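The determinant quoted above can be reproduced with a short cofactor expansion. This is a minimal Python sketch, using the vector components that appear in the MATLAB snippet of 4(b); the function name `det` is illustrative, not a library call.

```python
def det(m):
    """Determinant by Laplace expansion along the first row.
    Fine for the small integer matrices used here."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Rows are the components of A, B, C, D in frame O.
rows = [[2, 1, 1, 0],
        [1, 2, 0, 0],
        [0, 0, 1, 1],
        [-3, 2, 0, 0]]

# Matrix with the vectors as columns (the transpose has the same determinant).
columns = [list(col) for col in zip(*rows)]
print(det(columns))  # -8
```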
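4(b). Find components of $\tilde p$ given $\tilde p(\vec A) = 1$ etc.
Using the definition of components Eq. (3.8), we can write a linear system
in the four unknown components:
\[ \begin{pmatrix} \vec A \\ \vec B \\ \vec C \\ \vec D \end{pmatrix}
\begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} =
\begin{pmatrix} \tilde p(\vec A) \\ \tilde p(\vec B) \\ \tilde p(\vec C) \\ \tilde p(\vec D) \end{pmatrix} \tag{3.11} \]
Note the matrix is written with the rows given by the vectors. This can be
solved in MATLAB as follows: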
A = [2 1 1 0; 1 2 0 0; 0 0 1 1; -3 2 0 0];
b = [1; -1; -1; 0];
x = A\b;
x =
-0.2500
-0.3750
1.8750
-2.8750
4(c). Given
\[ \vec E \underset{O}{\to} (1, 1, 0, 0) \]
we easily find that
\[ \tilde p(\vec E) = E^\alpha p_\alpha = p_0 + p_1 = -5/8 \]
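The same linear system can be solved in exact rational arithmetic, which makes the sign and value of p(E) easy to confirm. This is a minimal Python sketch using Gauss-Jordan elimination on the matrix and right-hand side from the MATLAB snippet above; `solve` is a hypothetical helper written for this example, not a library routine.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination.
    Assumes the matrix is square and non-singular."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

vectors = [[2, 1, 1, 0], [1, 2, 0, 0], [0, 0, 1, 1], [-3, 2, 0, 0]]
values = [1, -1, -1, 0]          # p(A), p(B), p(C), p(D)
p = solve(vectors, values)
print([str(x) for x in p])       # ['-1/4', '-3/8', '15/8', '-23/8']

E = (1, 1, 0, 0)
p_of_E = sum(p_a * E_a for p_a, E_a in zip(p, E))
print(p_of_E)                    # -5/8
```

The fractions match the decimal output -0.2500, -0.3750, 1.8750, -2.8750.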
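4(d). Given the values of the four one-forms, $\tilde p, \tilde q, \tilde r, \tilde s$, applied to the four
known vectors $\vec A, \vec B, \vec C, \vec D$, we can, in principle, find all components of all four
one-forms, repeating the procedure we did in 4b. And then one could write a
matrix M where the columns of M are taken from the one-form components.
If the determinant of M is zero the one-forms are linearly dependent. But
that's a lot of work.
I believe there is a simpler way to test for linear dependence. If the
one-forms are linearly dependent, then there are real numbers a, b, c, d, not
all zero, such that
\[ a\tilde p + b\tilde q + c\tilde r + d\tilde s = \tilde t = 0 \]
\[ \begin{pmatrix} \tilde p & \tilde q & \tilde r & \tilde s \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \tilde t =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.12} \]
But then
\[ \begin{pmatrix} \vec A \\ \vec B \\ \vec C \\ \vec D \end{pmatrix} \tilde t =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.13} \]
By Eq. (3.6) we have
\[ \begin{pmatrix} \vec A \\ \vec B \\ \vec C \\ \vec D \end{pmatrix} \tilde t =
\begin{pmatrix}
\tilde p(\vec A) & \tilde q(\vec A) & \tilde r(\vec A) & \tilde s(\vec A) \\
\tilde p(\vec B) & \tilde q(\vec B) & \tilde r(\vec B) & \tilde s(\vec B) \\
\tilde p(\vec C) & \tilde q(\vec C) & \tilde r(\vec C) & \tilde s(\vec C) \\
\tilde p(\vec D) & \tilde q(\vec D) & \tilde r(\vec D) & \tilde s(\vec D)
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} =
\begin{pmatrix}
1 & 0 & 2 & -1 \\
-1 & 0 & 0 & -1 \\
-1 & 1 & 0 & 0 \\
0 & -1 & 0 & 0
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.14} \]
The latter can only be true for a non-trivial (a, b, c, d) if the determinant
is zero, but
\[ \begin{vmatrix}
1 & 0 & 2 & -1 \\
-1 & 0 & 0 & -1 \\
-1 & 1 & 0 & 0 \\
0 & -1 & 0 & 0
\end{vmatrix} = -2 \tag{3.15} \]
so the one-forms must not be linearly dependent.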
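Linear independence can also be confirmed without a determinant, by checking that the matrix of values in (3.14) has full rank. This is a minimal Python sketch; the signs of the entries are assumed to be those of the exercise statement, and `rank` is an illustrative helper, not a library function.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by forward elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# Row k holds p, q, r, s evaluated on the k-th vector (A, B, C, D).
M = [[1, 0, 2, -1],
     [-1, 0, 0, -1],
     [-1, 1, 0, 0],
     [0, -1, 0, 0]]
print(rank(M))  # 4, so only a = b = c = d = 0 solves M (a, b, c, d)^T = 0
```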
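5. Justify steps from Eq. (3.10a) to Eq. (3.10d).
\begin{align}
A^{\bar\alpha} p_{\bar\alpha} &= (\Lambda^{\bar\alpha}{}_\beta A^\beta)(\Lambda^\mu{}_{\bar\alpha} p_\mu), \quad \text{by Eq. (2.7) and Eq. (3.9) respectively} \\
&= (\Lambda^\mu{}_{\bar\alpha}\Lambda^{\bar\alpha}{}_\beta)(A^\beta p_\mu), \quad \text{just rearranged the terms} \tag{3.16}
\end{align}
We claimed that the transformation for one-form components was the same as for
basis vectors, Eq. (3.9).
\begin{align}
A^{\bar\alpha} p_{\bar\alpha} &= (\Lambda^\mu{}_{\bar\alpha}\Lambda^{\bar\alpha}{}_\beta)(A^\beta p_\mu), \\
&= \delta^\mu{}_\beta (A^\beta p_\mu), \quad \text{by Eq. (2.18),} \\
&= A^\beta p_\beta, \quad \text{by the properties of the Kronecker delta.}
\end{align}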
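6. Given a basis $\{\vec e_\alpha\}$ of a frame O and a basis $\{\tilde\lambda^0, \tilde\lambda^1, \tilde\lambda^2, \tilde\lambda^3\}$ for the
space of one-forms, with
\begin{align}
\tilde\lambda^0 &\underset{O}{\to} (1, 1, 0, 0) \tag{3.17} \\
\tilde\lambda^1 &\underset{O}{\to} (1, -1, 0, 0) \tag{3.18} \\
\tilde\lambda^2 &\underset{O}{\to} (0, 0, 1, -1) \tag{3.19} \\
\tilde\lambda^3 &\underset{O}{\to} (0, 0, 1, 1) \tag{3.20}
\end{align}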
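6a. Consider an arbitrary one-form $\tilde p$ and vector $\vec A$.
\begin{align}
\tilde p(\vec A) &= p_\alpha \tilde\lambda^\alpha(\vec A) \tag{3.21} \\
&= p_\alpha \tilde\lambda^\alpha(\vec e_\beta) A^\beta \tag{3.22} \\
&= p_\alpha A^\alpha \iff \tilde\lambda^\alpha(\vec e_\beta) = \delta^\alpha{}_\beta \tag{3.23}
\end{align}
but it is clear that $\tilde\lambda^\alpha(\vec e_\beta) \neq \delta^\alpha{}_\beta$ by inspection of the given basis.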
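6b. Given $\tilde p \underset{O}{\to} (1, 1, 1, 1)$, find $l_\beta$ such that
\[ \tilde p = l_\beta \tilde\lambda^\beta. \]
We can write this as a linear system of equations to solve for $l_\beta$:
\begin{align}
\tilde\lambda^\beta(\vec e_\alpha)\, l_\beta &= \tilde p(\vec e_\alpha) \tag{3.24} \\
(\tilde\lambda^\beta)_\alpha\, l_\beta &= p_\alpha \tag{3.25}
\end{align}
where I'm using that weird notation introduced in Eq. (2.10).
This can be solved in MATLAB with
A = [1 1 0 0; 1 -1 0 0; 0 0 1 1; 0 0 -1 1];
p = [1; 1; 1; 1];   % column vector, so that A\p is well defined
l = A\p;
l =
1
0
0
1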
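As a cross-check, the sketch below confirms that l = (1, 0, 0, 1) reproduces the components of p, and that the given one-form basis is not the basis dual to the vector basis. It is a minimal Python illustration; the one-form components are read off from the matrix A used in the MATLAB snippet, and the variable names are my own.

```python
# Components of the basis one-forms lambda^0 .. lambda^3 in frame O.
lambdas = [(1, 1, 0, 0),
           (1, -1, 0, 0),
           (0, 0, 1, -1),
           (0, 0, 1, 1)]

l = (1, 0, 0, 1)

# p_alpha = l_beta (lambda^beta)_alpha
p = tuple(sum(l[b] * lambdas[b][a] for b in range(4)) for a in range(4))
print(p)  # (1, 1, 1, 1)

# lambda^beta(e_alpha) is just component alpha of lambda^beta, so the basis
# is dual to {e_alpha} only if this array equals the 4x4 identity.
identity = [tuple(int(a == b) for a in range(4)) for b in range(4)]
is_dual = lambdas == identity
print(is_dual)  # False
```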
7. The proof of Eq. (3.13), we're told on p. 61, is analogous to the
corresponding relation for basis vectors, which was given on p. 37.
Imagine that p is an arbitrary one-form, and

A an arbitrary vector. Let e
and e
be the basis vectors in frame O, and O respectively. The components

of

A in the two frames are related by:
= A

,
see Eq. (2.7) on p. 35. We seek the transformation relating the components
of the one-form in frame O and O. Lets call it T

. Because p(
A) is frame-
independent (after all, its just a scalar):
p(
A) = A
= A

p

=

p

,
=

,
= A

, just rearranging terms
At this point I believe Schutz would relabel his indices and replace with
. Im a little uncomfortable with this because my understanding is that the
sums on the two sides of the equal sign are equal, but we cannot immediately
say the individual terms are equal. We can note however that

A is arbitrary,
and therefore we can imagine the case where all components of

A are zero
but one,

A = ae
0
say, and then:
p(
A) = A
0
p
0
= A

, from above
= A
0
p

0
T

, (3.26)
And similarly, we can imagine

A = ae
1
etc. In this way we can see that
indeed, it is valid to set = .
p(
A) = A
= A

, from above
= A

, relabel with
(3.27)
and now its clear that

(3.28)
8. The basis one-form $\tilde{d}t$, viewed from the t-x plane, would be a set of
equally spaced straight lines through the t-axis. The spacing is one unit of
t between surfaces so that $\vec e_0$ would cross one surface. The equation for $\tilde{d}x$
is given in Eq. (3.20) in terms of the basis one-forms, which are written out
on the top of page 61.
The basis one-form $\tilde{d}x$, viewed from the t-x plane, would be a set of equally
spaced straight lines through the x-axis. The spacing is one unit of x
between surfaces.
9. The components of $\tilde{d}T$ are given by Eq. (3.15), and the partial
derivatives can be estimated from Fig. 3.5, at least for the x and y directions. It's
not clear what we're supposed to do for the t and z directions.
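10 (a) It's obvious that
\[ \frac{\partial x^\alpha}{\partial x^\beta} = \delta^\alpha{}_\beta \tag{3.29} \]
because when $\alpha \neq \beta$ we have terms like, say,
\[ \frac{\partial x^0}{\partial x^1} = \frac{\partial t}{\partial x} = 0 \]
because t and x are independent variables. But when $\alpha = \beta$ we have terms
like, say,
\[ \frac{\partial x^3}{\partial x^3} = \frac{\partial z}{\partial z} = 1. \]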
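10 (b)
\begin{align}
\delta^\alpha{}_\beta &= \frac{\partial x^\alpha}{\partial x^\beta}, \quad \text{from (a)} \\
&= \frac{\partial}{\partial x^\beta}\left(\Lambda^\alpha{}_{\bar\mu} x^{\bar\mu}\right), \quad \text{sub co-ordinate transform} \\
&= \Lambda^\alpha{}_{\bar\mu} \frac{\partial x^{\bar\mu}}{\partial x^\beta}, \quad \text{transform is a constant} \\
&= \Lambda^\alpha{}_{\bar\mu} \Lambda^{\bar\mu}{}_\beta, \quad \text{from Eq. (3.18)} \tag{3.30}
\end{align}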
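11
Eq. (3.14) in different notation:
\[ \frac{d\phi}{d\tau} = \phi_{,t}\frac{dt}{d\tau} + \phi_{,x}\frac{dx}{d\tau} + \phi_{,y}\frac{dy}{d\tau} + \phi_{,z}\frac{dz}{d\tau} \tag{3.31} \]
\[ \tilde{d}\phi \underset{O}{\to} (\phi_{,t}, \phi_{,x}, \phi_{,y}, \phi_{,z}) \tag{3.32} \]
\[ x^\beta{}_{,\bar\alpha} = \Lambda^\beta{}_{\bar\alpha} \tag{3.33} \]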
12 (a)
$\vec V$ is not tangent to surface S, so it must have a component in the $\vec e_x$
direction, $V^x \neq 0$.
\[ \tilde n(\vec V) = n_x V^x \tag{3.34} \]
To show this is nonzero, we must show that $n_x \neq 0$. The normal one-form
is the one-form that is zero for every vector tangent to the surface. For the
x = 0 surface in 3D space,
\[ \tilde n \underset{O}{\to} (n_x, 0, 0) \tag{3.35} \]
where $n_x \neq 0$ will serve as a non-zero normal one-form.
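12 (b)
Suppose
\begin{align}
\tilde n(\vec V) &= V^\alpha n_\alpha, \\
&= V^x n_x > 0. \tag{3.36}
\end{align}
Suppose $\vec W$ has the same sign of $W^x$ as $V^x$. Then,
\begin{align}
\tilde n(\vec W) &= W^\alpha n_\alpha, \\
&= W^x n_x > 0. \tag{3.37}
\end{align}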
12 (c)
Any normal to the surface S must have $n_y = 0$ and $n_z = 0$. To be
non-zero it requires $n_x \neq 0$. So
\[ \tilde n \underset{O}{\to} a(n_x, 0, 0) \tag{3.38} \]
where $a \neq 0$ and $a \in \mathbb{R}$ will serve also as a non-zero normal one-form.
12 (d)
To generalize the above results to a 3D surface in 4D space-time, I found it
hard to work with surfaces that are not simply one of the coordinates equals a
constant. So I suggest that we require that the surface be sufficiently smooth
that we can approximate the surface locally by a tangent plane. Then we
can also rotate the coordinates so that the tangent plane is x = 0, and then
the above immediately holds.
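13 Show that the one-form formed from the gradient of a scalar function
f is normal to surfaces of constant value of f.
Consider an arbitrary point p = (t, x, y, z) where $f(t, x, y, z) = f_p$. Now
imagine taking an infinitesimal step
\[ \Delta t, \Delta x, \Delta y, \Delta z \]
such that the change in value of f,
\[ \Delta f = \frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z = 0. \tag{3.39} \]
This ensures we don't leave the surface of constant f. So a tangent vector
to the surface of constant f is obtained from an arbitrary multiple of such a
step:
\[ \vec A \underset{O}{\to} a(\Delta t, \Delta x, \Delta y, \Delta z) \tag{3.40} \]
where $a \in \mathbb{R}$ and $a \neq 0$. The gradient one-form evaluated with such a tangent
vector is
\[ \tilde{d}f(\vec A) = a\left(\frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z\right) = 0. \tag{3.41} \]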
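A concrete instance may help. The sketch below uses a made-up scalar field, f = x^2 + y^2 + z^2 - t, which is purely an assumption chosen for illustration. It picks a vector annihilated by the gradient one-form at one point and checks that a small step along it changes f only at second order. Python is used for illustration only.

```python
def f(t, x, y, z):
    """Hypothetical scalar field used only for this illustration."""
    return x * x + y * y + z * z - t

def gradient_components(t, x, y, z):
    """Components (f_t, f_x, f_y, f_z) of the gradient one-form of f."""
    return (-1.0, 2 * x, 2 * y, 2 * z)

point = (1.0, 1.0, 2.0, 2.0)
df = gradient_components(*point)       # (-1, 2, 4, 4)

# A vector with df(tangent) = 0, i.e. tangent to the surface f = const.
tangent = (2.0, 1.0, 0.0, 0.0)
df_of_tangent = sum(c * a for c, a in zip(df, tangent))
print(df_of_tangent)                   # 0.0

# Stepping a small distance along the tangent changes f only at O(eps^2).
eps = 1e-5
shifted = tuple(p + eps * a for p, a in zip(point, tangent))
change = f(*shifted) - f(*point)
print(abs(change) < 1e-8)              # True
```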
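14 Given
\begin{align}
\tilde p &\underset{O}{\to} (1, 1, 0, 0) \\
\tilde q &\underset{O}{\to} (-1, 0, 1, 0)
\end{align}
I chose something very easy: $\vec A = \vec e_0$ and $\vec B = \vec e_1$. Then the computations are
easy because we only have one component contributing for each vector:
\begin{align}
\tilde p \otimes \tilde q(\vec A, \vec B) &= \tilde p(\vec A)\,\tilde q(\vec B) \\
&= 1 \times 0 = 0.
\end{align}
While changing the order of the vectors gives:
\begin{align}
\tilde p \otimes \tilde q(\vec B, \vec A) &= \tilde p(\vec B)\,\tilde q(\vec A) \\
&= 1 \times (-1) = -1 \\
&\neq \tilde p \otimes \tilde q(\vec A, \vec B).
\end{align}
To find the components we must input the basis vectors. Because there
are two vector inputs we have a two-dimensional (4 by 4) array of components:
\begin{align}
\tilde p \otimes \tilde q(\vec e_\alpha, \vec e_\beta) &= \tilde p(\vec e_\alpha)\,\tilde q(\vec e_\beta) \\
&= p_\alpha q_\beta.
\end{align}
Let's write these in a matrix where $\alpha$ is the row and $\beta$ is the column:
\[ (\tilde p \otimes \tilde q)_{\alpha\beta} =
\begin{pmatrix}
-1 & 0 & 1 & 0 \\
-1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}. \]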
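The component array of an outer product is easy to generate mechanically. This is a minimal Python sketch, assuming the one-form components p = (1, 1, 0, 0) and q = (-1, 0, 1, 0) from the exercise statement; `outer` is an illustrative helper name.

```python
def outer(p, q):
    """Components (p (x) q)_{alpha beta} = p_alpha q_beta."""
    return [[p_a * q_b for q_b in q] for p_a in p]

p = (1, 1, 0, 0)
q = (-1, 0, 1, 0)

pq = outer(p, q)
qp = outer(q, p)
for row in pq:
    print(row)

# Feeding in basis vectors just picks out one component:
# (p (x) q)(e_0, e_1) = pq[0][1], (p (x) q)(e_1, e_0) = pq[1][0].
print(pq[0][1], pq[1][0])   # 0 -1
print(pq == qp)             # False: the outer product does not commute
```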
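15
Supply the reasoning leading from Eq. (3.23) to Eq. (3.24).
\begin{align}
\mathbf{f} &= f_{\alpha\beta}\,\tilde\omega^{\alpha\beta}, && \text{what we mean by a basis} \\
f_{\mu\nu} &= \mathbf{f}(\vec e_\mu, \vec e_\nu), && \text{what we mean by components} \\
f_{\mu\nu} &= f_{\alpha\beta}\,\tilde\omega^{\alpha\beta}(\vec e_\mu, \vec e_\nu), && \text{sub first line into 2nd line} \\
\tilde\omega^{\alpha\beta}(\vec e_\mu, \vec e_\nu) &= \delta^\alpha{}_\mu \delta^\beta{}_\nu, && \text{solving above, verify by substitution, used Eq. (3.12)} \tag{3.42}
\end{align}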
16 (a)
It's obvious from the definition Eq. (3.69) that the $\binom{0}{2}$ tensor $\mathbf{h}_{(S)}$ is
symmetric. Just interchange the arguments $\vec A$ and $\vec B$ and you obtain the
same result because
\[ \tfrac{1}{2}\mathbf{h}(\vec A, \vec B) \quad\text{and}\quad \tfrac{1}{2}\mathbf{h}(\vec B, \vec A) \]
are just real numbers and the order of addition doesn't matter for reals.
16 (b)
Similar argument for the antisymmetric $\binom{0}{2}$ tensor $\mathbf{h}_{(A)}$.
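16 (c)
Components of the symmetric part of
\[ (\tilde p \otimes \tilde q)_{\alpha\beta} =
\begin{pmatrix}
-1 & 0 & 1 & 0 \\
-1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix} \]
are
\[ (\tilde p \otimes \tilde q)_{(S)} =
\begin{pmatrix}
-1 & -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\
-\tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\
\tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}. \]
The antisymmetric part is:
\[ (\tilde p \otimes \tilde q)_{(A)} =
\begin{pmatrix}
0 & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\
-\tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\
-\tfrac{1}{2} & -\tfrac{1}{2} & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}. \]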
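The symmetric and antisymmetric parts can be generated directly from their definitions, which is a useful guard against sign slips. This is a minimal Python sketch, assuming the components p = (1, 1, 0, 0) and q = (-1, 0, 1, 0); the helper names are illustrative.

```python
from fractions import Fraction

def symmetric_part(h):
    """h_(S) with components (h_ab + h_ba) / 2."""
    n = len(h)
    return [[Fraction(h[a][b] + h[b][a], 2) for b in range(n)]
            for a in range(n)]

def antisymmetric_part(h):
    """h_(A) with components (h_ab - h_ba) / 2."""
    n = len(h)
    return [[Fraction(h[a][b] - h[b][a], 2) for b in range(n)]
            for a in range(n)]

p = (1, 1, 0, 0)
q = (-1, 0, 1, 0)
pq = [[p_a * q_b for q_b in q] for p_a in p]

sym = symmetric_part(pq)
anti = antisymmetric_part(pq)
for row in sym:
    print([str(x) for x in row])
for row in anti:
    print([str(x) for x in row])

# The two parts must add back up to the original array.
recombined = [[sym[a][b] + anti[a][b] for b in range(4)] for a in range(4)]
print(recombined == pq)  # True
```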
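16 (d)
From the definition of an antisymmetric tensor, Eq. (3.32), we know that
\begin{align}
\mathbf{h}(\vec A, \vec B) &= -\mathbf{h}(\vec B, \vec A), \quad \forall\ \vec B, \vec A \\
\mathbf{h}(\vec A, \vec A) &= -\mathbf{h}(\vec A, \vec A), \quad \text{a special case} \\
\mathbf{h}(\vec A, \vec A) &= 0, \quad \text{solving for LHS.} \tag{3.43}
\end{align}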
16 (e)
Number of independent components of $\mathbf{h}_{(S)}$
For a general $\binom{0}{2}$ tensor there are 4 × 4 = 16 components. But for the
symmetric tensor we have Eq. (3.28)
\[ f_{\alpha\beta} = f_{\beta\alpha} \]
This means the 4 diagonal components and the 6 components above the diagonal
are independent, so there are 10 independent components, while the 6
components below the diagonal are determined by the symmetry.
For an antisymmetric tensor we have Eq. (3.33)
\[ f_{\alpha\beta} = -f_{\beta\alpha} \]
which means the diagonal elements are zero,
\[ f_{\alpha\alpha} = 0, \]
so there are only 6 independent components in total.
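The counting generalizes to n dimensions as n(n+1)/2 and n(n-1)/2. A small Python sketch that simply enumerates the index pairs; the function name is illustrative.

```python
def independent_components(n):
    """Count independent components of symmetric and antisymmetric
    two-index arrays in n dimensions by listing index pairs."""
    symmetric = sum(1 for a in range(n) for b in range(n) if a <= b)
    antisymmetric = sum(1 for a in range(n) for b in range(n) if a < b)
    return symmetric, antisymmetric

print(independent_components(4))  # (10, 6)
```

The two counts add up to 16, as they must, since a general tensor splits uniquely into its symmetric and antisymmetric parts.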
17 (a) This problem takes some time to work through. There must be
an easier way, but here is my long-winded solution!
In general,
h(
C,

A) = h
Lets treat

C as an arbitrary vector. Were told that for arbitrary vectors
A and

B, but with

B ,= 0
h( ,

A) = h( ,

B)
(3.44)
Suppose C
O
(1, 0, 0, 0). Then
h
0
B
= h
0
A
(3.45)
q(
B) = q(
A) (3.46)
The LHS of (3.45) has the form a one-form, so we wrote that explicitly in
(3.46). So far theres no restriction on q; we simply choose
=
q(
A)
q(
B)
. (3.47)
and we note the stipulation that

B ,= 0.
Now suppose C
O
(1, 1, 0, 0). Then
h
0
B
+ h
1
B
= (h
0
A
+ h
1
A
) (3.48)
h
1
B
= (h
1
A
) (3.49)
For (3.49) we used (3.45), but now is no longer a free variable, being set
by (3.47). Both (3.45) and (3.49) can be satised i
h
0
= h
1
a (3.50)
for an arbitrary a. And in general, we see simply by repeating the argument
above for dierent

C, that
h
0
= h
. (3.51)
That is, if the tensor h is written as a matrix, the rows are arbitrary scalar
constants p
of the rst row. And so,

h
= p
(3.52)
so the tensor h has the form of an outer product
h = p q. (3.53)
17 (b) Given
_
1
1
_
tensor T, show that T( ; v) is a vector.
In general, for a
_
1
1
_
tensor T,
T( p; v) = T(p
; v
) (3.54)
= T
; e
) (3.55)
= T
(3.56)
Ive used u as the basis of
_
1
1
_
tensors, and p as my one-form arguments
because I wanted to keep as the basis for the one-form. By equating (3.55)
and (3.56) we see that
u
; e
) =
(3.57)
and we identify
(e
) =
from Eq. (3.12) on p. 60, and from the duality

of one-forms and vectors we identify e
) =
, so
u
= e
. (3.58)
Now I believe we can go back and write T acting on the vector alone:
T( ; v) = T( ; v
) (3.59)
= T
( ; e
) = T
( ; e
) (3.60)
= T
(e
) e
(3.61)
= T
(3.62)
= T
. (3.63)
And now were done because the RHS is a scalar T
0
times the basis vector

e
0
etc., which is a vector!
And similar we can go back and write T acting on the one-form alone:
T( p; ) = T(p
; )
= T
; )
= T
= T
. (3.64)
Again were done here because the RHS is clearly a one-form.
18 (a) Applying the metric tensor to a vector gives the corresponding
one-form. The result is simply a change in sign of the first component, so
\[ \vec A \underset{O}{\to} (1, 0, -1, 0) \]
results in
\[ \tilde A \underset{O}{\to} (-1, 0, -1, 0). \]
Note that there is a caution on p. 69 that the metric tensor will change, but
for now we're still within Special Relativity, and the metric tensor has the
simple values given on p. 45.
(b) Applying the inverse metric tensor to a one-form gives the
corresponding vector. The inverse metric tensor is the same matrix as the metric tensor,
so again, the result is simply a change in sign of the first component, so
\[ \tilde p \underset{O}{\to} (3, 0, -1, -1) \]
results in
\[ \vec p \underset{O}{\to} (-3, 0, -1, -1). \]
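Lowering and raising an index with the Minkowski metric is a one-line contraction. This is a minimal Python sketch, assuming the sample components A = (1, 0, -1, 0) and p = (3, 0, -1, -1) from the exercise statement; `lower_index` and `raise_index` are illustrative names.

```python
# Minkowski metric; its inverse has the same components.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def lower_index(vector):
    """A_alpha = eta_{alpha beta} A^beta."""
    return tuple(sum(ETA[a][b] * vector[b] for b in range(4))
                 for a in range(4))

def raise_index(one_form):
    """p^alpha = eta^{alpha beta} p_beta (same matrix in SR)."""
    return tuple(sum(ETA[a][b] * one_form[b] for b in range(4))
                 for a in range(4))

A = (1, 0, -1, 0)
p = (3, 0, -1, -1)
print(lower_index(A))   # (-1, 0, -1, 0)
print(raise_index(p))   # (-3, 0, -1, -1)
```

Applying one operation after the other returns the original components, which is the statement that the two matrices are inverses.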
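19 (a) The inverse metric tensor $\eta^{\alpha\beta}$ was given in Eq. (3.44). It's a
matrix with the same values as the metric tensor itself, given first on p. 45.
\[ \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \tag{3.65} \]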
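19 (b) To derive the formula for the inner product of one-forms in terms of
components, Eq. (3.53), we start with the definition Eq. (3.52). This involves
only the squares. But first we must establish how to obtain the components
of the addition of two one-forms. Intuitively one might guess we just add the
components. Indeed this is so, but to establish this rigorously we start with
the definition of addition in Eq. (3.6). If $\tilde s = \tilde p + \tilde q$ then $\tilde s(\vec A) = \tilde p(\vec A) + \tilde q(\vec A)$
for all $\vec A$. Suppose
\[ \vec A \underset{O}{\to} (1, 0, 0, 0). \]
Then
\begin{align}
\tilde p(\vec A) &= p_\alpha A^\alpha = p_0 \tag{3.66} \\
\tilde q(\vec A) &= q_\alpha A^\alpha = q_0 \tag{3.67} \\
\tilde s(\vec A) &= s_\alpha A^\alpha = s_0 = p_0 + q_0. \tag{3.68}
\end{align}
Similarly for $\vec A \underset{O}{\to} (0, 1, 0, 0)$ etc. This establishes that Eq. (3.6) implies that
to add two one-forms one just adds the components. Now we can expand
$(\tilde p + \tilde q)^2$ in terms of components using Eq. (3.50).
\begin{align}
(\tilde p + \tilde q)^2 = \tilde s^2 &= \eta^{\alpha\beta} s_\alpha s_\beta, \quad \text{by Eq. (3.50)} \\
&= \eta^{\alpha\beta}(p_\alpha + q_\alpha)(p_\beta + q_\beta), \quad \text{component-wise addition of one-forms} \\
&= \eta^{\alpha\beta}(p_\alpha p_\beta + q_\alpha q_\beta + p_\alpha q_\beta + p_\beta q_\alpha), \quad \text{components are just reals} \tag{3.69}
\end{align}
Now we are finally ready to deal with the definition:
\begin{align}
\tilde p \cdot \tilde q &= \tfrac{1}{2}\left[(\tilde p + \tilde q)^2 - \tilde p^2 - \tilde q^2\right], \quad \text{by definition Eq. (3.52)} \\
&= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_\alpha p_\beta + q_\alpha q_\beta + p_\alpha q_\beta + p_\beta q_\alpha) - \tilde p^2 - \tilde q^2\right], \quad \text{by definition Eq. (3.50)} \\
&= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_\alpha p_\beta + q_\alpha q_\beta + p_\alpha q_\beta + p_\beta q_\alpha) - \eta^{\alpha\beta} p_\alpha p_\beta - \eta^{\alpha\beta} q_\alpha q_\beta\right], \quad \text{by Eq. (3.50)} \\
&= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_\alpha q_\beta + p_\beta q_\alpha)\right], \quad \text{after cancelling terms} \\
&= -p_0 q_0 + p_1 q_1 + p_2 q_2 + p_3 q_3, \quad \text{using } \eta^{\alpha\beta} \text{ from Eq. (3.44)} \tag{3.70}
\end{align}
which is Eq. (3.53).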
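The equivalence of the "squares" definition and the component formula can be spot-checked numerically. This is a minimal Python sketch with arbitrarily chosen integer components for the two one-forms (an assumption for illustration only); the function names are my own.

```python
from fractions import Fraction

ETA_INV_DIAG = (-1, 1, 1, 1)  # diagonal of the inverse Minkowski metric

def inner(p, q):
    """Component formula: -p0 q0 + p1 q1 + p2 q2 + p3 q3."""
    return sum(e * a * b for e, a, b in zip(ETA_INV_DIAG, p, q))

def inner_via_squares(p, q):
    """Definition in terms of squares: [(p + q)^2 - p^2 - q^2] / 2."""
    s = tuple(a + b for a, b in zip(p, q))
    return Fraction(inner(s, s) - inner(p, p) - inner(q, q), 2)

p = (3, 0, -1, -1)
q = (1, -1, 1, 1)
print(inner_via_squares(p, q), inner(p, q))  # -5 -5
```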
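20 Suppose we're in Euclidean 3-space in Cartesian coordinates.
(a) We want to show that
\[ A^{\bar i} = \Lambda^{\bar i}{}_j A^j \quad\text{and}\quad P_{\bar i} = \Lambda^j{}_{\bar i} P_j \tag{3.71} \]
are the same transformation since the matrix $\Lambda^{\bar i}{}_j$ is equal to the transpose
of its inverse.
The inverse transformation takes us back from the $\bar O$ frame to the O
frame, and is thus written,
\[ \left(\Lambda^{\bar i}{}_j\right)^{-1} = \Lambda^j{}_{\bar i}. \tag{3.72} \]
If the one-form components on the RHS of (3.71), $P_j$, are written as a
column matrix, then the transformation in (3.71) amounts to multiplying
the column vector by the matrix
\[ \left(\Lambda^j{}_{\bar i}\right)^T = \left[\left(\Lambda^{\bar i}{}_j\right)^{-1}\right]^T. \]
And if the original matrix is orthogonal then we're back to the same matrix.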
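(b) All we're given is that the metric tensor for Cartesian 3-space is $\delta_{ij}$, i, j =
1, 2, 3. The metric tensor is used in forming the inner product of vectors,
which we know must be frame invariant. So let's write the inner product
between two 3-space vectors in two different frames,
\[ \delta_{ij} A^i B^j = \vec A \cdot \vec B = \delta_{\bar k \bar l} A^{\bar k} B^{\bar l} = \delta_{\bar k \bar l}(\Lambda^{\bar k}{}_i A^i)(\Lambda^{\bar l}{}_j B^j) \tag{3.73} \]
and so upon cancelling the $A^i B^j$ on either side and rearranging we see that
\[ \delta_{ij} = \Lambda^{\bar k}{}_i \Lambda^{\bar l}{}_j \delta_{\bar k \bar l}, \tag{3.74} \]
as required. If one were uncomfortable with cancelling the $A^i B^j$ on either
side, one can write
\[ \delta_{ij} A^i B^j = \vec A \cdot \vec B = \delta_{\bar k \bar l} A^{\bar k} B^{\bar l} = \delta_{\bar k \bar l}(A^m \Lambda^{\bar k}{}_m)(B^n \Lambda^{\bar l}{}_n). \tag{3.75} \]
And then because $\vec A$ and $\vec B$ were arbitrary we can consider cases like
\[ \vec A \underset{O}{\to} (1, 0, 0). \]
Then there is only one non-zero $A^i$, $A^m$ component on each side, and we
can divide through by this single component. And this is true for all the
components $A^i$ and $B^i$. So again, we assert (3.74).
\begin{align}
\delta_{ij} &= \Lambda^{\bar k}{}_i \Lambda^{\bar l}{}_j \delta_{\bar k \bar l}, \quad \text{from above} \tag{3.76} \\
&= \Lambda^{\bar k}{}_i \Lambda^{\bar k}{}_j, \quad \text{after summing over } l. \tag{3.77}
\end{align}
The RHS of (3.77) is the product of a matrix by its transpose, and for this
to equal the identity matrix (i.e. the LHS), we require the matrix to be
orthogonal.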
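A rotation about the z-axis gives a concrete orthogonal transformation on which to test both claims. This is a minimal Python sketch; the rotation angle is an arbitrary choice and all helper names are illustrative.

```python
import math

def rotation_z(theta):
    """Matrix taking vector components to a frame rotated by theta about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

theta = 0.7
L = rotation_z(theta)        # transforms vector components
L_inv = rotation_z(-theta)   # the inverse transformation
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# (b): the transformation preserves delta_ij, i.e. L^T L = identity.
preserves_metric = close(matmul(transpose(L), L), identity)
# (a): one-form components transform with the transpose of the inverse,
# which for an orthogonal matrix is the same matrix as for vectors.
same_transformation = close(transpose(L_inv), L)
print(preserves_metric, same_transformation)  # True True
```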
And now I know why we never learned one-forms in undergrad, and called
the gradient of a scalar field a vector. Incidentally, it was precisely this point
that I didn't understand in reading the explanation of a one-form by Misner
et al. (1973) that made me give up on their book and try Schutz's book. I
couldn't see the difference between their description of a one-form and the
gradient (and of course there is no difference) but I also thought I "knew"
that the gradient of a scalar field was a vector. So hats off to Schutz for
making this point clear. Though perhaps if I had persisted with Misner
et al. (1973) all would have been fine in the end.
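21 (a) Starting with the t = 0 boundary and moving counter clockwise,
so next the x = 1 boundary, let's call the normal one-forms $\tilde a$, $\tilde b$, $\tilde c$, $\tilde d$. These
take on the values:
\begin{align}
\tilde a &\underset{O}{\to} (-a_0, 0), \quad a_0 > 0 \tag{3.78} \\
\tilde b &\underset{O}{\to} (0, b_1), \quad b_1 > 0 \tag{3.79} \\
\tilde c &\underset{O}{\to} (-c_0, 0), \quad c_0 < 0 \tag{3.80} \\
\tilde d &\underset{O}{\to} (0, d_1), \quad d_1 < 0. \tag{3.81}
\end{align}
To obtain the corresponding vectors we simply change the sign of the
time component:
\begin{align}
\vec a &\underset{O}{\to} (a_0, 0), \quad a_0 > 0 \tag{3.82} \\
\vec b &\underset{O}{\to} (0, b_1), \quad b_1 > 0 \tag{3.83} \\
\vec c &\underset{O}{\to} (c_0, 0), \quad c_0 < 0 \tag{3.84} \\
\vec d &\underset{O}{\to} (0, d_1), \quad d_1 < 0. \tag{3.85}
\end{align}
The normal vectors in the time direction, $\vec a$ and $\vec c$, look odd because they
appear to be pointing inward. But the metric is such,
\[ \eta = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \]
that their scalar product with vectors that point outward will be positive.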
21 (b) The first challenge is finding out what is meant by the null
boundary. I would guess it's the surface in the direction of the null vector, a null
vector being one whose inner product with itself is zero, e.g.:
\[ \vec V \underset{O}{\to} (1, 1). \]
This vector has the strange property of being orthogonal to itself, see p. 45.
The other two boundaries, x = 1 and t = 1, are easily named and so process
of elimination also points to the boundary between (1, 0) and (2, 1) as the
null boundary.
The outward normal one-form, $\tilde c$, is easily found:
\begin{align}
\tilde c(\vec V) &= 0, \quad \text{definition of normal} \tag{3.86} \\
&= c_\alpha V^\alpha \tag{3.87} \\
&= c_0 V^0 + c_1 V^1 \tag{3.88} \\
&= c_0 + c_1. \tag{3.89}
\end{align}
To ensure the outward normal, we require
\[ c_0(1) + c_1(-1) = c_0 - c_1 > 0. \tag{3.90} \]
For example,
\[ \tilde c \underset{O}{\to} (1, -1). \]
The associated vector is,
\[ \vec c \underset{O}{\to} (-1, -1). \]
22
To show that vectors form a vector space, when introduced as functions
that take one-forms as arguments, we would proceed as in section 3.3,
interchanging the roles of vectors and one-forms in Eq. (3.6). That is, we would
define the addition of vectors and multiplication of a vector by a scalar as
follows. Suppose
\begin{align}
\vec A &= \vec B + \vec C \tag{3.91} \\
\vec D &= a\vec B, \quad a \in \mathbb{R}, \tag{3.92}
\end{align}
then we require for all one-form arguments $\tilde p$,
\begin{align}
\vec A(\tilde p) &= \vec B(\tilde p) + \vec C(\tilde p) \tag{3.93} \\
\vec D(\tilde p) &= a\vec B(\tilde p), \quad a \in \mathbb{R}, \tag{3.94}
\end{align}
in analogy to Eq. (3.6b).
We need a zero, which is provided by the zero vector, the one that is zero
for any one-form argument:
\[ \vec V(\tilde q) = 0, \quad \forall\ \tilde q. \]
The two axioms of Appendix A are now clearly satisfied. (See also
problems 2 and 23.)
23 (a) Prove that the set of all $\binom{M}{N}$ tensors for fixed M and N forms a
vector space.
This is like question 2. But now we need to define what we mean by
the addition of two $\binom{M}{N}$ tensors and the multiplication of an $\binom{M}{N}$ tensor by
a scalar. Of course we are guided by Eq. (3.6). That is, we note that $\binom{M}{N}$
tensors produce real numbers that can be added like real numbers, so the
generalization of Eq. (3.6) is trivial. The tensor $\mathbf{S}$ where
\[ \mathbf{S} = \mathbf{P} + \mathbf{Q} \tag{3.95} \]
is defined to be that which gives the sum of the two values obtained by
applying the input to $\mathbf{P}$ and $\mathbf{Q}$. That is,
\begin{align}
\mathbf{S}(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) ={}& \mathbf{P}(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \\
&+ \mathbf{Q}(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \tag{3.96}
\end{align}
where I have invented the notation
\[ (\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \]
to show the M one-form inputs and N vector inputs. The choice of
one-forms first, we'll see later, gives the basis in the order Schutz gave in 23b. I
have followed the convention that superscript integers are used as indices
of different one-forms. That is, $\tilde a^1$ and $\tilde a^2$ are two different one-forms, not
components of the same one-form. Similarly, subscripts are used to denote
different vectors. In analogy with Eq. (3.6b) we can define multiplication of
an $\binom{M}{N}$ tensor by a scalar
\[ \mathbf{R} = \alpha\mathbf{P} \tag{3.97} \]
to be the tensor that, for a given input, gives just $\alpha$ times the real number
produced by supplying the input to $\mathbf{P}$:
\[ \mathbf{R}(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) = \alpha\,\mathbf{P}(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \tag{3.98} \]
The set of $\binom{M}{N}$ tensors for fixed M and N forms a vector space by the
same argument as given for question 2. Perhaps it's worth making explicit
what we mean by the zero $\binom{M}{N}$ tensor. This is the tensor that gives zero for
any input,
\[ (\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N). \]
The set of $\binom{M}{N}$ tensors, with (3.95) and (3.96), then meets axiom (1) in
Appendix A: $\binom{M}{N}$ tensors form an abelian group with the operation of addition.
23 (b) Prove that the basis for the vector space formed from the set of
all
_
M
N
_
tensors for xed M and N is the set:
e
. . . e
. . .
(3.99)
with M vectors labeled with . . . and N one-forms labeled . . . .
This is a nice question because it forces us to think about what we mean
by a basis. The answer is a straightforward generalization of the argument
for the basis of the
_
0
2
_
tensors starting at the bottom of page 66 and ending
with Eq. (3.26) on p. 67.
The notation is combersome because one needs to refer to M superscripts
and N subscripts where M and N are arbitrary. In dening the basis (3.99)
Schutz has used a series of greek letters like . . . . Ive decided to put sub-
script indices on the greek letters
1
,
2
, . . .
M
. That way I can be explicit
about how many there are. Remember that each greek letter index can take
on 4 values, e.g.
1
= 0, 1, 2, 3 corresponding to the four dimensions.
As in Eq. (3.23) we write the
_
M
N
_
tensor as a sum of components times
the basis that we seek:
R = R
1
,
2
,...,
M
1
,
2
,...,
N

1
,
2
,...,
N
1
,
2
,...,
M
(3.100)
And furthermore, the components correspond to the real values produced
by applying the tensor to arguments that are the basis one-forms and basis
vectors. So,
R
1
,
2
,...,
M
1
,
2
,...,
N
= R(
1
,
2
, . . . ,
M
; e
1
, e
2
, . . . , e
N
) (3.101)
which is the generalization of the formula given between Eq. (3.23) and
Eq. (3.24) on p. 67. Now, we simply substitute the tensor (3.100) into (3.101)
to obtain:
R
1
,
2
,...,
M
1
,
2
,...,
N
= R
1
,
2
,...,
M
1
,
2
,...,
N

1
,
2
,...,
N
1
,
2
,...,
M
(
1
,
2
, . . . ,
M
; e
1
, e
2
, . . . , e
N
)
(3.102)
This implies the analogue to Eq. (3.24),

1
,
2
,...,
N
1
,
2
,...,
M
(
1
,
2
, . . . ,
M
; e
1
, e
2
, . . . , e
N
) =
2
. . .
2
. . .
N
(3.103)
Using Eq. (3.12) we identify
1
=
1
(e
1
)
2
=
2
(e
2
)
.
.
.
N
=
N
(e
N
) (3.104)
Based upon the dualism between vectors and one-forms, we identify:
1
= e
1
(
1
)
2
= e
2
(
2
)
.
.
.
M
= e
M
(
M
) (3.105)
So,
2
. . .
M
= e
1
(
1
)e
2
(
2
) . . . e
M
(
M
). (3.106)
So focusing on just the tensor, i.e. dropping the arguments, were left with
the basis that is the analogue to Eq. (3.25),

1
,
2
,...,
N
1
,
2
,...,
M
= e
1
e
2
. . . e
2
. . .
N
(3.107)
where we have introduced the idea of an outer product of N one-forms as a
simply extension of the case when N = 2 introduced on p. 66. That is, the
outer product of N one-forms
p
1
p
2
. . . p
N
is simply the tensor that, when supplied with N vector inputs, say

A
1
,

A
2
, . . . ,

A
N
,
as arguments, produces that number that results from multiplying together
each real number that results from applying p
n
to vector argument

A
n
, i.e.
p
1
p
2
. . . p
N
(
A
1
,

A
2
, . . . ,

A
N
) = p
1
(
A
1
) p
2
(
A
2
) . . . p
N
(
A
N
)
24 (a)
(i) The denitions of the symmetric and antisymmetric tensors are given in
Eq. (3.31) and Eq. (3.34).
M
()
=
_
_
0 1 1
1
2
1 1 0 1
1 0 0
3
2
1
2
1
3
2
0
_
_
(3.108)
M
[]
=
_
_
0 0 1
1
2
0 0 0 1
1 0 0
1
2
1
2
1
1
2
0
_
_
(3.109)
(ii) Section 3.7 shows how to raise and lower indices using the metric tensor.
M
(3.110)
= M
T
(3.111)
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
_
_
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
_
_
=
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
(3.112)
(iii)
M

(3.113)
=
(3.114)
_
_
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
_
_
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
=
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
(3.115)
(iv)
M
= (3.116)
_
_
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
_
_
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
=
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
(3.117)
OR,
M
(3.118)
= M

T
(3.119)
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
_
_
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
_
_
=
_
_
0 1 0 0
1 1 0 2
2 0 0 1
1 0 2 0
_
_
. (3.120)
Of course the two dierent ways to obtain M
agree.
24 (b) Does it make sense to speak of the symmetric and antisymmetric
parts of $M^\alpha{}_\beta$? This $\binom{1}{1}$ tensor is represented by a matrix, so if it did make
sense then it would be easy to find the symmetric and antisymmetric parts!
But I would guess that it doesn't make sense, because symmetry has to do
with the interchange of the order of the arguments. For a $\binom{1}{1}$ tensor, one
argument is a vector, the other a one-form. So they cannot be interchanged.
(On the other hand, each vector has a corresponding one-form and vice versa,
so if we incorporate this into the idea of symmetry, then one could define
symmetric and antisymmetric parts.) I guess I'm sitting on the fence.
This first view is the correct one. It only makes sense to discuss symmetry
after both indices have been raised or lowered, as we've done in this example
using the metric tensor.
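24 (c)
The $\binom{2}{0}$ tensor $\eta^{\alpha\beta}$ was introduced in section 3.5 as the inverse of the
metric tensor $\eta_{\alpha\beta}$. So when we raise an index on the metric tensor, we naturally
get the identity matrix:
$$\eta^{\alpha}{}_{\beta} = \eta^{\alpha\mu}\,\eta_{\mu\beta}, \quad \text{using Eq. (3.58)} \tag{3.121}$$
$$= \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \text{using Eq. (3.44)} \tag{3.122}$$
$$= \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.123}$$
$$= \delta^{\alpha}{}_{\beta}. \tag{3.124}$$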
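The matrix product in 24 (c) can be checked mechanically. The following is only a minimal sketch, assuming the metric signature (-,+,+,+) of Eq. (3.44); the helper name `matmul` is my own and not from the text.

```python
# Components of the Minkowski metric, signature (-,+,+,+), as in Eq. (3.44).
# The inverse metric eta^{alpha beta} has the same components.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def matmul(a, b):
    """Multiply two square matrices stored as nested lists (hypothetical helper)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Raising one index of the metric: eta^{alpha mu} eta_{mu beta}.
mixed = matmul(ETA, ETA)
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
print(mixed == identity)
```

The comparison is exact because all entries are integers.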
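25 Show that $A^{\alpha\beta}B_{\alpha\beta}$ is frame invariant.
This simple-looking problem turned out to be quite tedious. Perhaps I'm
doing more than required?
[In retrospect, having now read most of the book, the solution below of
course contains much more detail than required. But it was only through
working through this sort of detail that I was able to gain confidence in
tensor calculus.]
First I believe we need to interpret $A^{\alpha\beta}B_{\alpha\beta}$ in terms of tensor calculus.
It looks like it has the form of $\mathbf{A}(\mathbf{B})$, where $\mathbf{A}$ is a $\binom{2}{0}$ tensor and $\mathbf{B}$ is a
$\binom{0}{2}$ tensor. To confirm this we write:
$$\mathbf{A} = A^{\alpha\beta}\,\vec e_\alpha \otimes \vec e_\beta, \quad \text{see problem 23 (b)} \tag{3.125}$$
$$\mathbf{B} = B_{\mu\nu}\,\tilde\omega^\mu \otimes \tilde\omega^\nu. \tag{3.126}$$
Now one applies the tensor $\mathbf{A}$ to the arguments $\mathbf{B}$,
$$\mathbf{A}(\mathbf{B}) = A^{\alpha\beta}\,\vec e_\alpha \otimes \vec e_\beta\,(B_{\mu\nu}\,\tilde\omega^\mu \otimes \tilde\omega^\nu) \tag{3.127}$$
$$= A^{\alpha\beta} B_{\mu\nu}\,\vec e_\alpha(\tilde\omega^\mu)\,\vec e_\beta(\tilde\omega^\nu) \tag{3.128}$$
$$= A^{\alpha\beta} B_{\mu\nu}\,\delta^\mu{}_\alpha\,\delta^\nu{}_\beta \tag{3.129}$$
$$= A^{\alpha\beta} B_{\alpha\beta}, \tag{3.130}$$
which confirms our interpretation.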
At this point, we expect $A^{\alpha\beta}B_{\alpha\beta} = \mathbf{A}(\mathbf{B})$ to be frame-invariant, because
it's composed of vectors and one-forms, which are frame-invariant. But we
can demonstrate this more explicitly. That is,
A = A

e

e
, (3.131)
= A
, (3.132)
= A
, (3.133)
= A
. (3.134)
And similarly for B:
B = B

, (3.135)
= B
, (3.136)
= B
, (3.137)
= B
. (3.138)
Now we should be quite convinced that $A^{\alpha\beta}B_{\alpha\beta} = \mathbf{A}(\mathbf{B})$ is frame-invariant,
because both $\mathbf{A}$ and $\mathbf{B}$ are frame-invariant. And of course we can show
this in tedious detail:
A(B) = A

e

e
(B

) (3.139)
= A

B

e

e
(

) (3.140)
= A

B

e

(

) e
(

) (3.141)
= A
),
(3.142)
= A
) e
), (3.143)
= A
, (3.144)
= A
. (3.145)
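26 $\mathbf{A}$ is an antisymmetric $\binom{2}{0}$ tensor, $\mathbf{B}$ is a symmetric $\binom{0}{2}$ tensor, $\mathbf{C}$ is
an arbitrary $\binom{0}{2}$ tensor, $\mathbf{D}$ is an arbitrary $\binom{2}{0}$ tensor. Prove:
(a) $A^{\alpha\beta}B_{\alpha\beta} = 0$. I don't think Schutz discussed the symmetry of
$\binom{2}{0}$ tensors explicitly in this chapter. But in Section 3.6 we are introduced to
$\binom{M}{0}$ tensors and assured that "All our previous discussions of $\binom{0}{2}$ tensors apply
here." So I think it's safe to use the obvious analogues of the symmetry ideas
of Section 3.4. [Yes, obviously it is.]
Because $\mathbf{A}$ is an antisymmetric $\binom{2}{0}$ tensor, it equals its own antisymmetric part:
$$A^{[\alpha\beta]} = \frac{1}{2}\left(A^{\alpha\beta} - A^{\beta\alpha}\right), \quad \text{by Eq. (3.34)},$$
$$= A^{\alpha\beta}, \tag{3.146}$$
and because $\mathbf{B}$ is a symmetric $\binom{0}{2}$ tensor, we can write it as follows
$$B_{\alpha\beta} = B_{(\alpha\beta)} = \frac{1}{2}\left(B_{\alpha\beta} + B_{\beta\alpha}\right), \quad \text{by Eq. (3.31)}. \tag{3.147}$$
Now we simply apply $\mathbf{A}$ to $\mathbf{B}$:
$$A^{\alpha\beta}B_{\alpha\beta} = A^{[\alpha\beta]}B_{(\alpha\beta)} = \frac{1}{2}\left(A^{\alpha\beta} - A^{\beta\alpha}\right)\frac{1}{2}\left(B_{\alpha\beta} + B_{\beta\alpha}\right),$$
$$= \frac{1}{4}\left(A^{\alpha\beta}B_{\alpha\beta} + A^{\alpha\beta}B_{\beta\alpha} - A^{\beta\alpha}B_{\alpha\beta} - A^{\beta\alpha}B_{\beta\alpha}\right). \tag{3.148}$$
But we can call the indices $\alpha$ and $\beta$ whatever we like, and so
$$A^{\beta\alpha}B_{\beta\alpha} = A^{\alpha\beta}B_{\alpha\beta}, \qquad A^{\beta\alpha}B_{\alpha\beta} = A^{\alpha\beta}B_{\beta\alpha} \tag{3.149}$$
just by relabelling. The four terms in (3.148) therefore cancel in pairs, and from this we see that $A^{\alpha\beta}B_{\alpha\beta} = 0$.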
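(b) Prove that: $A^{\alpha\beta}C_{\alpha\beta} = A^{\alpha\beta}C_{[\alpha\beta]}$.
$$A^{\alpha\beta}C_{\alpha\beta} = A^{\alpha\beta}\left(C_{(\alpha\beta)} + C_{[\alpha\beta]}\right), \quad \text{by Eq. (3.35)},$$
$$= A^{\alpha\beta}C_{[\alpha\beta]}, \quad \text{by part (a)!} \tag{3.150}$$
(c) Prove that: $B_{\alpha\beta}D^{\alpha\beta} = B_{\alpha\beta}D^{(\alpha\beta)}$.
$$B_{\alpha\beta}D^{\alpha\beta} = B_{\alpha\beta}\left(D^{(\alpha\beta)} + D^{[\alpha\beta]}\right), \quad \text{by Eq. (3.35)}$$
$$= B_{\alpha\beta}D^{(\alpha\beta)}, \quad \text{by part (a)!} \tag{3.151}$$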
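The identities of problem 26 are easy to spot-check numerically. Below is a minimal sketch with randomly chosen integer components; the matrices `M` and `C`, the seed, and the helper `contract` are hypothetical test scaffolding of mine, not part of Schutz's exercise. To stay in exact integer arithmetic I work with twice the symmetric and antisymmetric parts.

```python
import random

random.seed(0)  # fixed seed so the check is reproducible
N = 4


def contract(upper, lower):
    """Full contraction upper^{ab} lower_{ab} over both indices."""
    return sum(upper[a][b] * lower[a][b] for a in range(N) for b in range(N))


# Arbitrary integer components (hypothetical test data).
M = [[random.randint(-9, 9) for _ in range(N)] for _ in range(N)]
C = [[random.randint(-9, 9) for _ in range(N)] for _ in range(N)]

# Factors of 2 are kept explicit so no fractions appear.
A = [[M[a][b] - M[b][a] for b in range(N)] for a in range(N)]        # 2 * M^[ab], antisymmetric
C_sym = [[C[a][b] + C[b][a] for b in range(N)] for a in range(N)]    # 2 * C_(ab)
C_anti = [[C[a][b] - C[b][a] for b in range(N)] for a in range(N)]   # 2 * C_[ab]

part_a = contract(A, C_sym)                             # antisymmetric with symmetric
part_b = 2 * contract(A, C) == contract(A, C_anti)      # A^{ab} C_{ab} = A^{ab} C_[ab]
print(part_a, part_b)
```

A single random instance proves nothing, of course; it only guards against sign slips in the index gymnastics above.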
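27 (a) $\mathbf{A}$ is an antisymmetric $\binom{2}{0}$ tensor. Show that $A_{\alpha\beta}$ is an antisymmetric
$\binom{0}{2}$ tensor.
$$A_{\alpha\beta} = \eta_{\alpha\mu}\,\eta_{\beta\nu}\,A^{\mu\nu}, \quad \text{by Eq. (3.56)}. \tag{3.152}$$
So in matrix notation:
$$(A_{\alpha\beta}) =
\begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & a_{12} & a_{13} & a_{14}\\ -a_{12} & 0 & a_{23} & a_{24}\\ -a_{13} & -a_{23} & 0 & a_{34}\\ -a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$
$$= \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & a_{12} & a_{13} & a_{14}\\ a_{12} & 0 & a_{23} & a_{24}\\ a_{13} & -a_{23} & 0 & a_{34}\\ a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix}$$
$$= \begin{pmatrix} 0 & -a_{12} & -a_{13} & -a_{14}\\ a_{12} & 0 & a_{23} & a_{24}\\ a_{13} & -a_{23} & 0 & a_{34}\\ a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix} \tag{3.153}$$
The result equals minus its own transpose, so $A_{\alpha\beta}$ is antisymmetric.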
27 (b) Suppose V
= W
. Prove that V
= W
.
V
, by 3.39
W
. (3.154)
So
V
_
V
_
= 0. (3.155)
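28 Deduce Eq. (3.66) from Eq. (3.65).
I would start from comparing Eq. (3.65) with Eq. (3.14). For scalar
function $\phi$, we had
$$\frac{d\phi}{d\tau} = \frac{\partial\phi}{\partial t}U^t + \frac{\partial\phi}{\partial x}U^x + \frac{\partial\phi}{\partial y}U^y + \frac{\partial\phi}{\partial z}U^z, \quad \text{from Eq. (3.14)}$$
$$= \tilde d\phi(\vec U), \quad \text{just different notation.} \tag{3.156}$$
So I believe we want to write the derivative of the $\binom{1}{1}$ tensor in a similar way,
i.e. as something that takes $\vec U$ as an argument and gives the rate of change
of that $\binom{1}{1}$ tensor following the world line of $\vec U$:
$$\frac{d\mathbf{T}}{d\tau} = \nabla\mathbf{T}(\vec U), \quad \text{analogous to (3.156) above,}$$
$$= \left(T^{\alpha}{}_{\beta,\gamma}\,\tilde\omega^\beta \otimes \tilde\omega^\gamma \otimes \vec e_\alpha\right)(U^\mu \vec e_\mu), \quad \text{Eq. 3.66}$$
$$= T^{\alpha}{}_{\beta,\gamma}\,U^\mu\,\tilde\omega^\beta \otimes \vec e_\alpha\;\tilde\omega^\gamma(\vec e_\mu), \quad \text{rearranging}$$
$$= T^{\alpha}{}_{\beta,\gamma}\,U^\mu\,\tilde\omega^\beta \otimes \vec e_\alpha\;\delta^\gamma{}_\mu, \quad \text{Eq. (3.12)}$$
$$= T^{\alpha}{}_{\beta,\gamma}\,U^\gamma\,\tilde\omega^\beta \otimes \vec e_\alpha. \tag{3.157}$$
Thus we were led back to Eq. (3.65) from Eq. (3.66) using the analogy with
Eq. (3.14).
[In retrospect, the above solution doesn't seem to answer the question
posed. But I think the question is not well-posed. The solution given by
Schutz merely refers to the arbitrariness of $\vec U$. But there is more going on
here in making the step from Eq. (3.65) to Eq. (3.66), and what's involved
is the analogy with the gradient of a scalar, as discussed above. In short,
I'm not happy with either my solution or Schutz's, and I blame the
question! I discuss this more in my notes on this section above.]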
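29 Prove that tensor differentiation obeys the Leibniz (product) rule.
$$\nabla(\mathbf{A} \otimes \mathbf{B}) = (\nabla\mathbf{A}) \otimes \mathbf{B} + \mathbf{A} \otimes (\nabla\mathbf{B}).$$
We're not told what $\mathbf{A}$ and $\mathbf{B}$ are, which probably means that it doesn't
matter. For convenience, I'll assume that they are $\binom{1}{1}$ tensors:
$$\mathbf{A} = A^{\alpha}{}_{\beta}\,\tilde\omega^\beta \otimes \vec e_\alpha, \quad \text{Eq. 3.61}$$
$$\mathbf{B} = B^{\mu}{}_{\nu}\,\tilde\omega^\nu \otimes \vec e_\mu. \tag{3.158}$$
We then have
$$\mathbf{A} \otimes \mathbf{B} = A^{\alpha}{}_{\beta}\,\tilde\omega^\beta \otimes \vec e_\alpha \otimes (B^{\mu}{}_{\nu}\,\tilde\omega^\nu \otimes \vec e_\mu)$$
$$= A^{\alpha}{}_{\beta}B^{\mu}{}_{\nu}\,\tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu. \tag{3.159}$$
My strategy is to just redo what Schutz did in Section 3.8. Let's differentiate
$\mathbf{A} \otimes \mathbf{B}$ with respect to proper time, as in Eq. (3.63):
$$\frac{d(\mathbf{A} \otimes \mathbf{B})}{d\tau} = \frac{d(A^{\alpha}{}_{\beta}B^{\mu}{}_{\nu})}{d\tau}\,\tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu, \tag{3.160}$$
where we have assumed that the basis one-forms and basis vectors are uniform
in spacetime. And $d(A^{\alpha}{}_{\beta}B^{\mu}{}_{\nu})/d\tau$ is just the ordinary derivative of this
function $(A^{\alpha}{}_{\beta}B^{\mu}{}_{\nu})$ along the world line.
Because it's just the ordinary derivative, we can use the ordinary product
rule:
$$\frac{d(A^{\alpha}{}_{\beta}B^{\mu}{}_{\nu})}{d\tau} = \left(\frac{dA^{\alpha}{}_{\beta}}{d\tau}\right)B^{\mu}{}_{\nu} + A^{\alpha}{}_{\beta}\left(\frac{dB^{\mu}{}_{\nu}}{d\tau}\right). \tag{3.161}$$
Now just sub this into (3.160):
$$\frac{d(\mathbf{A} \otimes \mathbf{B})}{d\tau} = \left[\left(\frac{dA^{\alpha}{}_{\beta}}{d\tau}\right)B^{\mu}{}_{\nu} + A^{\alpha}{}_{\beta}\left(\frac{dB^{\mu}{}_{\nu}}{d\tau}\right)\right]\tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu,$$
$$= \nabla\mathbf{A}(\vec U) \otimes \mathbf{B} + \mathbf{A} \otimes \nabla\mathbf{B}(\vec U)$$
$$= \nabla(\mathbf{A} \otimes \mathbf{B})(\vec U). \tag{3.162}$$
If we wrote out $\vec U = U^\gamma \vec e_\gamma$ everywhere, we'd find that it cancels on the two
sides of the equal sign. So it's clear that:
$$\nabla(\mathbf{A} \otimes \mathbf{B}) = (\nabla\mathbf{A}) \otimes \mathbf{B} + \mathbf{A} \otimes (\nabla\mathbf{B}). \tag{3.163}$$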
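30 Given the following fields:
$$\vec U \underset{\mathcal{O}}{\to} (1 + t^2,\; t^2,\; \sqrt{2}\,t,\; 0)$$
$$\vec D \underset{\mathcal{O}}{\to} (x,\; 5xt,\; \sqrt{2}\,t,\; 0)$$
$$\rho = x^2 + t^2 - y^2 \tag{3.164}$$
(a) By Eq. (2.28) we require a four-velocity to have $\vec U \cdot \vec U = -1$. And $\vec U$
does at all points.
$$\vec U \cdot \vec U = \eta_{\alpha\beta}U^\alpha U^\beta, \quad \text{see Eq. (3.1)}$$
$$= -(1 + t^2)^2 + (t^2)^2 + (\sqrt{2}\,t)^2 + 0$$
$$= -1. \tag{3.165}$$
On the other hand, $\vec D$ cannot be a four-velocity because it doesn't meet this
property everywhere:
$$\vec D \cdot \vec D = \eta_{\alpha\beta}D^\alpha D^\beta, \quad \text{see Eq. (3.1)}$$
$$= -x^2 + (5tx)^2 + (\sqrt{2}\,t)^2 + 0$$
$$\neq -1 \text{ everywhere.} \tag{3.166}$$
And for some reason we're asked to also calculate:
$$\vec U \cdot \vec D = \eta_{\alpha\beta}U^\alpha D^\beta, \quad \text{see Eq. (3.1)}$$
$$= -(1 + t^2)x + (t^2)(5tx) + (\sqrt{2}\,t)^2 + 0$$
$$= -x - xt^2 + 5t^3x + 2t^2. \tag{3.167}$$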
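(b) Find spatial velocity given the four-velocity $\vec U$. Spatial components
of a four-velocity vector were explained on p. 42. Since $\vec U \to (\gamma, \gamma\mathbf{v})$, the
spatial velocity is $v^i = U^i/U^0$. Here,
$$\mathbf{v} \underset{\mathcal{O}}{\to} \frac{1}{1 + t^2}\,(t^2,\; \sqrt{2}\,t,\; 0)$$
At $t = 0$, $\mathbf{v} = 0$. But as $t \to \infty$, $|\mathbf{v}| \to 1$.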
(c) Find $U_\alpha$ for all $\alpha$.
$$U_\alpha = \eta_{\alpha\beta}U^\beta, \quad \text{see section 3.7.} \tag{3.168}$$
$$U_0 = -U^0 = -(1 + t^2)$$
$$U_1 = U^1 = t^2$$
$$U_2 = U^2 = \sqrt{2}\,t$$
$$U_3 = U^3 = 0.$$
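(d) Find $U^\alpha{}_{,\beta}$ for all $\alpha$, $\beta$.
$$U^\alpha{}_{,\beta} \equiv \frac{\partial U^\alpha}{\partial x^\beta}, \quad \text{see Eq. (3.19)} \tag{3.169}$$
For $\alpha = 0$:
$$U^0{}_{,0} = \frac{\partial U^0}{\partial x^0} = \frac{\partial U^0}{\partial t} = \frac{\partial(1 + t^2)}{\partial t} = 2t,$$
$$U^0{}_{,1} = \frac{\partial U^0}{\partial x^1} = \frac{\partial U^0}{\partial x} = \frac{\partial(1 + t^2)}{\partial x} = 0,$$
$$U^0{}_{,2} = \frac{\partial U^0}{\partial x^2} = \frac{\partial U^0}{\partial y} = \frac{\partial(1 + t^2)}{\partial y} = 0,$$
$$U^0{}_{,3} = \frac{\partial U^0}{\partial x^3} = \frac{\partial U^0}{\partial z} = \frac{\partial(1 + t^2)}{\partial z} = 0.$$
For $\alpha = 1$:
$$U^1{}_{,0} = \frac{\partial U^1}{\partial x^0} = \frac{\partial t^2}{\partial t} = 2t,$$
$$U^1{}_{,1} = \frac{\partial U^1}{\partial x^1} = \frac{\partial t^2}{\partial x} = 0,$$
$$U^1{}_{,2} = \frac{\partial U^1}{\partial x^2} = \frac{\partial t^2}{\partial y} = 0,$$
$$U^1{}_{,3} = \frac{\partial U^1}{\partial x^3} = \frac{\partial t^2}{\partial z} = 0.$$
For $\alpha = 2$:
$$U^2{}_{,0} = \frac{\partial U^2}{\partial x^0} = \frac{\partial(\sqrt{2}\,t)}{\partial t} = \sqrt{2},$$
$$U^2{}_{,1} = \frac{\partial U^2}{\partial x^1} = \frac{\partial(\sqrt{2}\,t)}{\partial x} = 0,$$
$$U^2{}_{,2} = \frac{\partial U^2}{\partial x^2} = \frac{\partial(\sqrt{2}\,t)}{\partial y} = 0,$$
$$U^2{}_{,3} = \frac{\partial U^2}{\partial x^3} = \frac{\partial(\sqrt{2}\,t)}{\partial z} = 0.$$
For $\alpha = 3$:
$$U^3{}_{,\beta} = \frac{\partial U^3}{\partial x^\beta} = \frac{\partial\, 0}{\partial x^\beta} = 0. \tag{3.170}$$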
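(e) Show that $U_\alpha U^\alpha{}_{,\beta} = 0$ for all $\beta$.
For $\beta = 0$:
$$U_\alpha U^\alpha{}_{,t} = -(1 + t^2)\,2t + t^2\,2t + \sqrt{2}\,t\,\sqrt{2} + 0$$
$$= 0. \tag{3.171}$$
For $\beta > 0$, $U^\alpha{}_{,\beta} = 0$. Thus, $U_\alpha U^\alpha{}_{,\beta} = 0$.
Show that $U_\alpha U^{\alpha,\beta} = 0$ for all $\beta$.
$$U_\alpha U^{\alpha,\beta} = U_\alpha\left(\eta^{\beta\gamma}U^\alpha{}_{,\gamma}\right)$$
$$= \eta^{\beta\gamma}\left(U_\alpha U^\alpha{}_{,\gamma}\right)$$
$$= 0, \quad \text{see above!} \tag{3.172}$$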
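(f) Find $D^\beta{}_{,\beta}$
$$D^\beta{}_{,\beta} = \frac{\partial D^0}{\partial t} + \frac{\partial D^1}{\partial x} + \frac{\partial D^2}{\partial y} + \frac{\partial D^3}{\partial z}$$
$$= \frac{\partial x}{\partial t} + \frac{\partial(5tx)}{\partial x} + \frac{\partial(\sqrt{2}\,t)}{\partial y} + \frac{\partial\, 0}{\partial z}$$
$$= 5t. \tag{3.173}$$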
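(g) Find $(U^\alpha D^\beta)_{,\beta}$ for all $\alpha$.
$$(U^\alpha D^\beta)_{,\beta} = U^\alpha (D^\beta{}_{,\beta}) + U^\alpha{}_{,\beta}D^\beta$$
$$= \left((1 + t^2),\; t^2,\; \sqrt{2}\,t,\; 0\right)(5t) + D^0\left(2t,\; 2t,\; \sqrt{2},\; 0\right)$$
$$= \left((1 + t^2),\; t^2,\; \sqrt{2}\,t,\; 0\right)(5t) + x\left(2t,\; 2t,\; \sqrt{2},\; 0\right)$$
$$= \left(5t(1 + t^2) + 2xt,\;\; 5t^3 + 2xt,\;\; 5\sqrt{2}\,t^2 + \sqrt{2}\,x,\;\; 0\right) \tag{3.174}$$
(Only $D^0 = x$ multiplies a non-zero derivative, because $U^\alpha{}_{,\beta}$ vanishes for $\beta > 0$.)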
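(h) Find $U_\alpha (U^\alpha D^\beta)_{,\beta}$.
$$U_\alpha (U^\alpha D^\beta)_{,\beta} = U_\alpha\left[U^\alpha{}_{,\beta}D^\beta + U^\alpha D^\beta{}_{,\beta}\right]$$
$$= U_\alpha U^\alpha{}_{,\beta}D^\beta + U_\alpha U^\alpha D^\beta{}_{,\beta}$$
$$= U_\alpha U^\alpha D^\beta{}_{,\beta}, \quad \text{using result from (e)}$$
$$= -D^\beta{}_{,\beta}, \quad \text{using result from (a) and Eq. (2.28)}$$
$$= -5t, \quad \text{using result from (f)} \tag{3.175}$$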
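(i) Find $\rho_{,\alpha}$ for all $\alpha$
$$\rho_{,\alpha} = \left(\frac{\partial\rho}{\partial t},\; \frac{\partial\rho}{\partial x},\; \frac{\partial\rho}{\partial y},\; \frac{\partial\rho}{\partial z}\right) = (2t,\; 2x,\; -2y,\; 0) \tag{3.176}$$
Find $\rho^{,\alpha}$ for all $\alpha$
$$\rho^{,\alpha} = \eta^{\alpha\beta}\rho_{,\beta} = (-2t,\; 2x,\; -2y,\; 0) \tag{3.177}$$
Interpretation:
$$\tilde d\rho \underset{\mathcal{O}}{\to} \rho_{,\alpha} \tag{3.178}$$
That is, the $\rho_{,\alpha}$ are the components of the one-form that is the gradient of
the scalar field $\rho$. The $\rho^{,\alpha}$ are the components of the associated vector.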
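(j) Find $\nabla_{\vec U}\rho$
$$\nabla_{\vec U}\rho = U^\alpha\rho_{,\alpha}, \quad \text{from Eq. (3.68)}$$
$$= 2t(1 + t^2) + 2xt^2 - 2\sqrt{2}\,yt \tag{3.179}$$
Note that this is a scalar: the products $U^\alpha\rho_{,\alpha}$ are summed over $\alpha$.
Find $\nabla_{\vec U}\vec D$
$$\nabla_{\vec U}\vec D \underset{\mathcal{O}}{\to} U^\beta D^\alpha{}_{,\beta}$$
$$= \left(U^\beta D^0{}_{,\beta},\; U^\beta D^1{}_{,\beta},\; U^\beta D^2{}_{,\beta},\; U^\beta D^3{}_{,\beta}\right)$$
$$= \left(t^2,\;\; 5x(1 + t^2) + 5t(t^2),\;\; \sqrt{2}(1 + t^2),\;\; 0\right) \tag{3.180}$$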
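Several results of problem 30 can be cross-checked with finite differences. This is a rough numerical sketch, not part of the original solution: the sample event `point`, the step size `h`, and the helper names (`dot`, `partial`, `UD_row`) are my own choices. Because the fields are low-order polynomials, central differences are accurate up to rounding error.

```python
import math

SQRT2 = math.sqrt(2.0)
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, signature (-,+,+,+)


def U(t, x, y, z):
    """Components U^alpha of problem 30."""
    return [1 + t**2, t**2, SQRT2 * t, 0.0]


def D(t, x, y, z):
    """Components D^alpha of problem 30."""
    return [x, 5 * x * t, SQRT2 * t, 0.0]


def dot(a, b):
    """Minkowski scalar product of two vectors given by their components."""
    return sum(e * p * q for e, p, q in zip(ETA, a, b))


def partial(field, point, beta, h=1e-5):
    """Central-difference estimate of d(field)/dx^beta for a list-valued field."""
    up = list(point)
    up[beta] += h
    dn = list(point)
    dn[beta] -= h
    return [(p - m) / (2 * h) for p, m in zip(field(*up), field(*dn))]


def UD_row(alpha):
    """Field whose components are U^alpha D^beta for beta = 0..3."""
    return lambda t, x, y, z: [U(t, x, y, z)[alpha] * d for d in D(t, x, y, z)]


point = (0.7, -1.3, 0.4, 2.0)  # arbitrary sample event (t, x, y, z)
t = point[0]

norm_U = dot(U(*point), U(*point))                          # part (a): expect -1
div_D = sum(partial(D, point, b)[b] for b in range(4))      # part (f): expect 5t
div_UD = [sum(partial(UD_row(a), point, b)[b] for b in range(4))
          for a in range(4)]                                # part (g)
U_lower = [e * u for e, u in zip(ETA, U(*point))]           # part (c)
contracted = sum(ul * d for ul, d in zip(U_lower, div_UD))  # part (h): expect -5t

print(norm_U, div_D - 5 * t, contracted + 5 * t)
```

The last two printed numbers are differences from the analytic results and should be zero up to rounding.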
31 There's a mistake in the initial definition of $\mathbf{P}$: We're told that $\vec U$
is a unit timelike vector. This implies its scalar product with itself is
negative, i.e.
$$U^\alpha U_\alpha < 0,$$
and of magnitude unity like a 4-velocity. So what we require for this problem is that:
$$U^\alpha U_\alpha = -1.$$
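(i) Given the definition of $\mathbf{P}$ and $\vec V_\perp$ it is then simply a matter of taking
the scalar product with $\vec U$ to show that it's orthogonal.
$$\vec U \cdot \vec V_\perp = \eta_{\alpha\gamma}U^\gamma\left(\eta^{\alpha}{}_{\beta} + U^\alpha U_\beta\right)V^\beta$$
$$= U_\alpha\left(\eta^{\alpha}{}_{\beta} + U^\alpha U_\beta\right)V^\beta, \quad \text{using Eq. (3.39)}$$
$$= U_\alpha\left(\delta^{\alpha}{}_{\beta} + U^\alpha U_\beta\right)V^\beta, \quad \text{using Eq. (3.60)}$$
$$= \left(U_\beta + U_\alpha U^\alpha U_\beta\right)V^\beta,$$
$$= \left(U_\beta - U_\beta\right)V^\beta,$$
$$= 0 \tag{3.181}$$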
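(ii) Similarly for showing that $\vec V_\perp$ is unaffected by $\mathbf{P}$.
$$P^{\alpha}{}_{\beta}V_\perp^\beta = \left(\eta^{\alpha}{}_{\beta} + U^\alpha U_\beta\right)V_\perp^\beta$$
$$= \left(\eta^{\alpha}{}_{\beta} + U^\alpha U_\beta\right)\left(\eta^{\beta}{}_{\gamma} + U^\beta U_\gamma\right)V^\gamma$$
$$= \left(\eta^{\alpha}{}_{\gamma} + U^\alpha U_\gamma + U^\alpha U_\gamma + (U_\beta U^\beta)\,U^\alpha U_\gamma\right)V^\gamma$$
$$= \left(\eta^{\alpha}{}_{\gamma} + U^\alpha U_\gamma + U^\alpha U_\gamma + (-1)\,U^\alpha U_\gamma\right)V^\gamma$$
$$= \left(\eta^{\alpha}{}_{\gamma} + U^\alpha U_\gamma\right)V^\gamma$$
$$= V_\perp^\alpha \tag{3.182}$$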
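Both properties of the projection tensor in part (a) can be confirmed with exact rational arithmetic. The sketch below assumes the definition $P^\alpha{}_\beta = \delta^\alpha{}_\beta + U^\alpha U_\beta$ used above; the particular 4-velocity (a boost with $v = 3/5$, $\gamma = 5/4$), the test vector `V`, and the helper `project` are arbitrary choices of mine.

```python
from fractions import Fraction as F

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Hypothetical sample 4-velocity: v = 3/5 along x, so gamma = 5/4.
U_up = [F(5, 4), F(3, 4), F(0), F(0)]
U_dn = [sum(ETA[a][b] * U_up[b] for b in range(4)) for a in range(4)]

# Mixed components P^a_b = delta^a_b + U^a U_b.
P = [[(1 if a == b else 0) + U_up[a] * U_dn[b] for b in range(4)]
     for a in range(4)]


def project(vec):
    """Apply P^a_b to the components of a vector."""
    return [sum(P[a][b] * vec[b] for b in range(4)) for a in range(4)]


V = [F(2), F(-1), F(3), F(7)]  # arbitrary vector
V_perp = project(V)

norm_U = sum(U_dn[a] * U_up[a] for a in range(4))           # U.U
U_dot_V_perp = sum(U_dn[a] * V_perp[a] for a in range(4))   # part (i)
idempotent = project(V_perp) == V_perp                      # part (ii)
print(norm_U, U_dot_V_perp, idempotent)
```

Using `Fraction` avoids any floating-point tolerance in the equality test of part (ii).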
(b) Show that the tensor
$$\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^\alpha q_\alpha},$$
with only restriction that $\vec q$ is not a null vector, projects orthogonally.
Based upon (a) we can guess that "projects orthogonally" means that
this tensor converts vectors into one-forms that are orthogonal to $\vec q$. (Note
that the given tensor produces a one-form from a vector input because it's a
$\binom{0}{2}$ tensor.) Much like in (a) we simply apply the given tensor to an arbitrary
vector, say $\vec s$. Here this produces a one-form. And then we apply this one-form
to $\vec q$ to show that it's zero, indicating a one-form orthogonal to $\vec q$.
$$\left[\left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^\alpha q_\alpha}\right)s^\nu\right]q^\mu = \left(\eta_{\mu\nu}q^\mu - \frac{q_\mu q^\mu\, q_\nu}{q^\alpha q_\alpha}\right)s^\nu$$
$$= \left(\eta_{\nu\mu}q^\mu - q_\nu\right)s^\nu, \quad \text{metric tensor is its own transpose,}$$
$$= (q_\nu - q_\nu)\,s^\nu,$$
$$= 0. \tag{3.183}$$
This fails when $q^\alpha q_\alpha = 0$ because then of course we have zero in the
denominator of the definition. Now the relation to (a) is clear. In (a) we
didn't need the $q^\alpha q_\alpha$ in the denominator of the 2nd term, nor the negative
sign, because it was negative unity for vector $\vec U$.
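(c) Show that
$$\mathbf{P}(\vec V_\perp, \vec W_\perp) = \mathbf{g}(\vec V_\perp, \vec W_\perp).$$
I think it is as simple as this:
$$\mathbf{P}(\vec V_\perp, \vec W_\perp) = P_{\alpha\beta}V_\perp^\alpha W_\perp^\beta,$$
$$= \left(\eta_{\alpha\beta} + U_\alpha U_\beta\right)V_\perp^\alpha W_\perp^\beta, \quad \text{substituting definition given}$$
$$= \eta_{\alpha\beta}V_\perp^\alpha W_\perp^\beta, \quad \text{using (a) (i) above,}$$
$$= \mathbf{g}(\vec V_\perp, \vec W_\perp). \tag{3.184}$$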
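32 (a) Prove the given transformation law for $\binom{0}{2}$ tensors.
$$f_{\bar\alpha\bar\beta} = \mathbf{f}(\vec e_{\bar\alpha}, \vec e_{\bar\beta})$$
$$= f_{\mu\nu}\,\tilde\omega^\mu \otimes \tilde\omega^\nu(\vec e_{\bar\alpha}, \vec e_{\bar\beta}), \quad \text{using frame-independent definition of } \mathbf{f}, \text{ Eq. (3.26)}$$
$$= f_{\mu\nu}\,\tilde\omega^\mu(\Lambda^\sigma{}_{\bar\alpha}\vec e_\sigma)\,\tilde\omega^\nu(\Lambda^\rho{}_{\bar\beta}\vec e_\rho), \quad \text{using Eq. (2.15)}$$
$$= f_{\mu\nu}\,\Lambda^\sigma{}_{\bar\alpha}\Lambda^\rho{}_{\bar\beta}\,\delta^\mu{}_\sigma\delta^\nu{}_\rho, \quad \text{using Eq. (3.24, 3.25)}$$
$$= f_{\mu\nu}\,\Lambda^\mu{}_{\bar\alpha}\Lambda^\nu{}_{\bar\beta}. \tag{3.185}$$
Let $A = (a_{ij})$ be the matrix with components $a_{ij}$, and similarly for $B$.
It's easy to see that regular matrix multiplication is
$$AB = a_{ik}b_{kj} = (a)(b).$$
So we can write
$$f_{\bar\alpha\bar\beta} = \Lambda^\mu{}_{\bar\alpha}\,f_{\mu\nu}\,\Lambda^\nu{}_{\bar\beta}, \quad \text{just rearranging}$$
$$(\bar f) = (\Lambda)^T (f)\,(\Lambda). \tag{3.186}$$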
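32 (b) Prove that the familiar Lorentz transformation associated with a
velocity boost obeys the generalization suggested.
The familiar Lorentz transformation was given on p. 22 by Eq. (1.12),
and deemed a velocity boost $v$ in the $x$-direction.
$$\bar t = \gamma t - v\gamma x$$
$$\bar x = \gamma x - v\gamma t$$
$$\bar y = y$$
$$\bar z = z$$
where $\gamma = 1/\sqrt{1 - v^2}$. Note this transformation can be written in matrix
form
$$\begin{pmatrix} \bar t\\ \bar x\\ \bar y\\ \bar z \end{pmatrix} =
\begin{pmatrix} \gamma & -v\gamma & 0 & 0\\ -v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} t\\ x\\ y\\ z \end{pmatrix}$$
Testing whether this transformation matrix meets Eq. (3.71) is just a matter
of doing the matrix multiplication. Writing Eq. (3.71) in matrix form:
$$(\bar\eta) = (\Lambda)^T(\eta)(\Lambda), \quad \text{see problem 32 (a)} \tag{3.187}$$
Substituting our familiar Lorentz transformation we get
$$\text{RHS} = (\Lambda)^T(\eta)(\Lambda),$$
$$= \begin{pmatrix} \gamma & -v\gamma & 0 & 0\\ -v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \gamma & -v\gamma & 0 & 0\\ -v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$
$$= \begin{pmatrix} \gamma & -v\gamma & 0 & 0\\ -v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -\gamma & v\gamma & 0 & 0\\ -v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$
$$= \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$
$$= \text{LHS}. \tag{3.188}$$
In the last step we used $\gamma^2(1 - v^2) = 1$.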
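The matrix multiplication of 32 (b) can be repeated with exact rational numbers. This is a minimal sketch under the assumption that Eq. (3.71) reads $(\eta) = (\Lambda)^T(\eta)(\Lambda)$ in matrix form, as used above; the speed $v = 3/5$ (so that $\gamma = 5/4$ exactly) and the helper names `matmul`, `transpose`, `boost_x` are my own illustrative choices.

```python
from fractions import Fraction as F

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Transpose a square matrix stored as nested lists."""
    return [list(row) for row in zip(*a)]


def boost_x(v, gamma):
    """Matrix of the boost in the x-direction, as read off Eq. (1.12)."""
    return [[gamma, -v * gamma, 0, 0],
            [-v * gamma, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


# v = 3/5 gives gamma = 1/sqrt(1 - 9/25) = 5/4 exactly.
L = boost_x(F(3, 5), F(5, 4))
rhs = matmul(matmul(transpose(L), ETA), L)
print(rhs == ETA)
```

Any other Pythagorean speed (for instance $v = 5/13$, $\gamma = 13/12$) works equally well and keeps the check exact.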
32 (c) Suppose $(L)$ and $(\Lambda)$ are two matrices that satisfy Eq. (3.71).
Prove that $(\Lambda)(L)$ also obeys Eq. (3.71).
This is, of course, what we would hope since each Lorentz transformation
corresponds to changing reference frames, and $(\Lambda)(L) = (N)$ corresponds to
changing twice.
$$\text{RHS} = (N)^T(\eta)(N)$$
$$= \left((\Lambda)(L)\right)^T(\eta)\left((\Lambda)(L)\right)$$
$$= (L)^T(\Lambda)^T(\eta)(\Lambda)(L)$$
$$= (L)^T\left[(\Lambda)^T(\eta)(\Lambda)\right](L)$$
$$= (L)^T(\eta)(L), \quad \text{using Eq. (3.71)}$$
$$= (\eta), \quad \text{using Eq. (3.71)}$$
$$= \text{LHS}. \tag{3.189}$$
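33 (a) Find the matrix for the identity element of the Lorentz group.
Apparently "identity element" is a very general concept, beyond just
group theory: http://en.wikipedia.org/wiki/Identity_element. For
the Lorentz group, we seek $I$ such that
$$I\,L = L$$
for all elements $L$. Clearly the $4 \times 4$ identity matrix $I$ meets this requirement.
Note that $\Lambda(v = 0) = I$.
The implicit matrix in Eq. (1.12) was
$$\begin{pmatrix} \gamma & -v\gamma & 0 & 0\\ -v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$
where $\gamma = 1/\sqrt{1 - v^2}$.
Its inverse is
$$\begin{pmatrix} \gamma & v\gamma & 0 & 0\\ v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$
which is obvious on physical grounds, and can be easily confirmed by multiplication:
$$\begin{pmatrix} \gamma & -v\gamma & 0 & 0\\ -v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \gamma & v\gamma & 0 & 0\\ v\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} \gamma^2(1 - v^2) & 0 & 0 & 0\\ 0 & \gamma^2(1 - v^2) & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.190}$$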
33 (b) Prove that the determinant of any matrix representing a Lorentz
transformation is $\pm 1$.
It's easy to show that the determinant of the Lorentz transformation associated
with the velocity boost $v$ in the $x$-direction is $+1$. But what about
velocity components in different directions? In Chapter 1 we did find the
most general Lorentz transformation for arbitrarily oriented velocity. But
it would be messy to find the determinant, and also I'm not sure if Schutz
wants us to consider this familiar Lorentz transformation, or his generalization
defined by Eq. (3.71).
Let's work with the generalization defined by Eq. (3.71).
$$|(\eta)| = |(L)^T(\eta)(L)|$$
$$= |(L)^T|\;|(\eta)|\;|(L)|$$
$$1 = |(L)|^2$$
$$|(L)| = \pm 1, \tag{3.191}$$
where we have divided by $|(\eta)| = -1$ and used properties of the determinant of matrices specified at
http://en.wikipedia.org/wiki/Determinant#Multiplicativity_and_matrix_groups.
Clearly if $|A| = 1$, and $|B| = 1$, then $|C| = |AB| = 1$, and this forms a
subgroup.
But if $|A| = -1$, and $|B| = -1$, then $|C| = |AB| = 1$. Thus $C$ is not a
member and the set of matrices with determinant of $-1$ does not form a subgroup
because it fails to meet the closure axiom. The axioms of a group are given
here http://en.wikipedia.org/wiki/Group_(mathematics).
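The closure argument can be illustrated with two concrete matrices. In this sketch I take a boost with $v = 3/5$ (determinant $+1$) and spatial inversion (determinant $-1$) as representative elements; these examples, and the helpers `det` and `satisfies_3_71`, are my own assumptions for illustration and do not come from Schutz's text.

```python
from fractions import Fraction as F

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def satisfies_3_71(m):
    """Matrix form of the condition used above: m^T eta m == eta."""
    return matmul(matmul(transpose(m), ETA), m) == ETA


boost = [[F(5, 4), F(-3, 4), 0, 0], [F(-3, 4), F(5, 4), 0, 0],
         [0, 0, 1, 0], [0, 0, 0, 1]]                       # v = 3/5, gamma = 5/4
parity = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
improper = matmul(parity, boost)                            # determinant -1

dets = [det(boost), det(parity), det(improper), det(matmul(improper, parity))]
print(dets, [satisfies_3_71(m) for m in (boost, parity, improper)])
```

The last determinant in `dets` is the product of two determinant $-1$ elements, which lands back in the determinant $+1$ set, as argued above.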
33 (c) Show that the 3D orthogonal matrices form a group.
Matrix multiplication meets the following three axioms of a group:
associativity, identity, and invertibility. We need to show that the set of 3D
orthogonal matrices is closed. That is, if $A$ and $B$ are elements of the set, is
$C = AB$ also an element?
$$C^T C = (AB)^T(AB)$$
$$= (B^T A^T)(AB)$$
$$= B^T(A^T A)B, \quad \text{associativity}$$
$$= B^T(I)B, \quad \text{orthogonality of } A$$
$$= B^T B$$
$$= I, \quad \text{orthogonality of } B$$
which implies the orthogonality of $C$. Thus the set is closed, and forms a
matrix group.
If we remove the first row and column of the L(4) matrices we see that
Eq. (3.71) becomes just the condition for orthogonal 3D matrices given in
problem 20. I believe this implies that the O(3) matrices are a subgroup of
L(4).
[Yes, this essentially agrees with Schutz's solution.]
34 Introduce variables: $u = t - x$ and $v = t + x$.
(a) Given that $\vec e_u$ connects the event $u = 1, v = 0, y = 0, z = 0$ to the origin
($u = 0, v = 0, y = 0, z = 0$). Therefore
$$u = 1 = t - x$$
$$v = 0 = t + x$$
so that $t = 1/2$ and $x = -1/2$. And thus $\vec e_u = (\vec e_t - \vec e_x)/2$.
Given that $\vec e_v$ connects the event $u = 0, v = 1, y = 0, z = 0$ to the origin
($u = 0, v = 0, y = 0, z = 0$). Therefore this event is
$$u = 0 = t - x$$
$$v = 1 = t + x$$
and can be written as $t = 1/2$ and $x = 1/2$. And thus $\vec e_v = (\vec e_t + \vec e_x)/2$.
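(b) Show that $\vec e_u, \vec e_v, \vec e_y, \vec e_z$ form a basis.
To be a basis, the vectors must be linearly independent: no one
of them can be written as a linear combination of the others. Since
$\vec e_u + \vec e_v = \vec e_t$ and $\vec e_v - \vec e_u = \vec e_x$, the original basis vectors can be
recovered from the new ones, so the four new vectors span the same space and
are linearly independent.
Also,
$$\vec e_u \cdot \vec e_y = \vec e_u \cdot \vec e_z = 0,$$
$$\vec e_v \cdot \vec e_y = \vec e_v \cdot \vec e_z = 0.$$
Note however that
$$\vec e_u \cdot \vec e_v = (\vec e_t - \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2 = \tfrac{1}{4}(\vec e_t \cdot \vec e_t - \vec e_x \cdot \vec e_x) = -\tfrac{1}{2} \neq 0, \tag{3.192}$$
so the 4 vectors form a basis, but not an orthogonal one, as parts (c) and (d) confirm.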
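(c) The metric tensor is given in terms of the basis vectors by Eq. (3.5).
The trick to finding the metric tensor is then to do the scalar product in the
basis where we know the metric tensor so that we can find the metric tensor
in the new basis.
$$\eta_{uu} = \vec e_u \cdot \vec e_u = (\vec e_t - \vec e_x)/2 \cdot (\vec e_t - \vec e_x)/2$$
$$= \tfrac{1}{4}(\vec e_t \cdot \vec e_t + \vec e_x \cdot \vec e_x)$$
$$= \tfrac{1}{4}(-1 + 1)$$
$$= 0 \tag{3.193}$$
$$\eta_{uv} = \eta_{vu} = \vec e_u \cdot \vec e_v = (\vec e_t - \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2$$
$$= \tfrac{1}{4}(\vec e_t \cdot \vec e_t - \vec e_x \cdot \vec e_x)$$
$$= \tfrac{1}{4}(-1 - 1)$$
$$= -\tfrac{1}{2} \tag{3.194}$$
$$\eta_{uy} = \vec e_u \cdot \vec e_y = (\vec e_t - \vec e_x)/2 \cdot \vec e_y = 0 \tag{3.195}$$
$$\eta_{uz} = \vec e_u \cdot \vec e_z = (\vec e_t - \vec e_x)/2 \cdot \vec e_z = 0 \tag{3.196}$$
And similarly $\eta_{vy} = \eta_{vz} = 0$.
$$\eta_{vv} = \vec e_v \cdot \vec e_v = (\vec e_t + \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2$$
$$= \tfrac{1}{4}(\vec e_t \cdot \vec e_t + \vec e_x \cdot \vec e_x)$$
$$= \tfrac{1}{4}(-1 + 1)$$
$$= 0 \tag{3.197}$$
(The cross terms vanish because $\vec e_t \cdot \vec e_x = 0$.) While of course the rest of the tensor is the same as the familiar one.
Collecting terms:
$$(\eta_{\alpha\beta}) = \begin{pmatrix} 0 & -\tfrac{1}{2} & 0 & 0\\ -\tfrac{1}{2} & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.198}$$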
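The metric components in the $(u, v, y, z)$ basis follow mechanically from the basis vectors of part (a). Here is a small sketch using exact fractions; the dictionary `basis` and the helper `dot` are my own naming, and the components are those of $\vec e_u = (\vec e_t - \vec e_x)/2$ and $\vec e_v = (\vec e_t + \vec e_x)/2$ in the $(t, x, y, z)$ basis.

```python
from fractions import Fraction as F

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
half = F(1, 2)

# Components of the new basis vectors in the (t, x, y, z) basis.
basis = {
    "u": [half, -half, 0, 0],
    "v": [half, half, 0, 0],
    "y": [0, 0, 1, 0],
    "z": [0, 0, 0, 1],
}


def dot(a, b):
    """Scalar product using the Minkowski metric in the (t, x, y, z) basis."""
    return sum(ETA[i][j] * a[i] * b[j] for i in range(4) for j in range(4))


labels = ["u", "v", "y", "z"]
metric = [[dot(basis[r], basis[c]) for c in labels] for r in labels]
for row in metric:
    print([str(entry) for entry in row])
```

The vanishing diagonal entries for $u$ and $v$ are the statement that these two basis vectors are null.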
(d) We have already shown that $\vec e_u \cdot \vec e_u = 0$, in (c) above, which means
that it is a null vector, c.f. p. 45. (Not sure why he asked us that now,
instead of before (c)!?) Similarly $\vec e_v \cdot \vec e_v = 0$, so $\vec e_v$ is also a null vector.
And we have already shown that $\vec e_u \cdot \vec e_v = -1/2 \neq 0$, so we have already
shown that $\vec e_u$ and $\vec e_v$ are not orthogonal.
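(e) Compute the four one-forms $\tilde du$, $\tilde dv$, $\mathbf{g}(\vec e_u, \;)$, and $\mathbf{g}(\vec e_v, \;)$ in terms of
$\tilde dt$, $\tilde dx$.
The formula for the gradient is given by Eq. (3.15), but we're asked here
for the answer in terms of $\tilde dt$, $\tilde dx$. So I think we're to use the basis formed
from $\vec e_t$, $\vec e_x$ etc., and since the gradient is a one-form the basis elements are
$\tilde\omega^t$, $\tilde\omega^x$, etc. Note that Eq. (3.20) lets us write these basis one-forms as
$\tilde dt = \tilde\omega^t$, $\tilde dx = \tilde\omega^x$.
The computation is trivial:
$$\tilde du = \frac{\partial u}{\partial x^\alpha}\tilde dx^\alpha$$
$$= \frac{\partial(t - x)}{\partial t}\tilde dt + \frac{\partial(t - x)}{\partial x}\tilde dx$$
$$= \tilde dt - \tilde dx. \tag{3.199}$$
And similarly for
$$\tilde dv = \frac{\partial v}{\partial x^\alpha}\tilde dx^\alpha$$
$$= \frac{\partial(t + x)}{\partial t}\tilde dt + \frac{\partial(t + x)}{\partial x}\tilde dx$$
$$= \tilde dt + \tilde dx. \tag{3.200}$$
And now we're asked for $\mathbf{g}(\vec e_u, \;)$. Look in Section 3.5 if you're having
trouble here. Working in the $(u, v, y, z)$ basis,
$$\mathbf{g}(\vec e_u, \;) = \eta_{\alpha\beta}\,\tilde\omega^\alpha \otimes \tilde\omega^\beta(\vec e_u, \;)$$
$$= \eta_{\alpha\beta}\,\tilde\omega^\alpha(\vec e_u)\,\tilde\omega^\beta(\;)$$
$$= \eta_{\alpha\beta}\,\delta^\alpha{}_u\,\tilde\omega^\beta(\;)$$
$$= \eta_{u\beta}\,\tilde\omega^\beta(\;) \tag{3.201}$$
The $u$ row of $\eta_{\alpha\beta}$ only has one non-zero element, so we might as well continue
to narrow this down:
$$\mathbf{g}(\vec e_u, \;) = \eta_{u\beta}\,\tilde\omega^\beta(\;)$$
$$= \eta_{uv}\,\tilde\omega^v(\;)$$
$$= -\tfrac{1}{2}\,\tilde\omega^v(\;)$$
$$= -\tfrac{1}{2}\,\tilde dv(\;)$$
$$= -\tfrac{1}{2}\,\tilde dt - \tfrac{1}{2}\,\tilde dx. \tag{3.202}$$
And similarly for $\mathbf{g}(\vec e_v, \;)$.
$$\mathbf{g}(\vec e_v, \;) = \eta_{\alpha\beta}\,\tilde\omega^\alpha \otimes \tilde\omega^\beta(\vec e_v, \;)$$
$$= \eta_{\alpha\beta}\,\tilde\omega^\alpha(\vec e_v)\,\tilde\omega^\beta(\;)$$
$$= \eta_{\alpha\beta}\,\delta^\alpha{}_v\,\tilde\omega^\beta(\;)$$
$$= \eta_{v\beta}\,\tilde\omega^\beta(\;)$$
$$= \eta_{vu}\,\tilde\omega^u(\;), \quad \text{only non-zero element}$$
$$= -\tfrac{1}{2}\,\tilde\omega^u(\;)$$
$$= -\tfrac{1}{2}\,\tilde du(\;)$$
$$= -\tfrac{1}{2}\,\tilde dt + \tfrac{1}{2}\,\tilde dx. \tag{3.203}$$
[Schutz stopped one line before I have above, so he left the answer in
terms of $\tilde du$ and $\tilde dv$.]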
Chapter 4
Perfect fluids in Special
Relativity
GR books tend to cover mostly the same material: SR, tensor calculus,
curvature and the Riemann tensor, Einstein's field equations, Schwarzschild
solution, etc. But not necessarily the material of this chapter on fluid
mechanics. This chapter has come in lieu of one on electricity and magnetism.
Either of these topics would give us practise on working with tensors and
developing frame-invariant equations, as we will have to do to understand
the development of the Einstein field equations. I like Schutz's choice here
because fluid mechanics seems of more general relevance to astrophysical
applications of GR. The last question of §4.10 is on electricity and magnetism,
so Schutz is tipping his hat to the other possible choice. See (Hobson et al.,
2009) for a full chapter on electricity and magnetism with the aim of
preparing the student for the development of the Einstein field equations
(but without a fluids chapter).
4.1 Fluids
The continuum hypothesis is discussed in much more detail in classical fluid
dynamics texts, see for example (Batchelor, 1967).
Traditionally the qualitative distinction between solids and fluids is that
solids can sustain a stress (a force parallel to an interface between fluid
elements) without a strain (relative movement between the elements). I'm
not sure why Schutz avoids making this distinction here.
4.2 Dust: the number-flux vector $\vec N$
The discussion at the end of this section, on p. 88, is very fundamental. It's
perhaps worth adding that Einstein arrived at GR by searching for a description
of gravity that was consistent with SR in the sense that it was written in
terms of tensors that are invariant under Lorentz transformations. One can
develop such a tensorial description of classical Electricity and Magnetism,
see problem 25 of §4.10, or (Hobson et al., 2009).
4.3 One-forms and surfaces
Number density as a timelike flux
I found this section confusing until I reached the end and realized that all he
wanted to say was that the time-component of $\vec N$ in Eq. (4.5) looks like the
spatial components [but with velocity c]. The mathematics is much clearer
than all the words in my opinion. This point is also immediately obvious to
anyone who looked at the four-velocity and realized that in the MCRF one
is moving at the speed of light in the direction of time.
Representation of a frame by a one-form
Last line of this section,
$$E = -\vec p \cdot \vec U$$
looks very confusing because, taken literally, it suggests that energy $E$ is
a scalar, i.e. a frame-independent concept! But we know that energy is
the zero-component of the four-momentum, and as such depends upon the
reference frame (as one would expect!). The resolution is to flip back to p. 48
and remind oneself that Eq. (2.35) was written with $\vec U = \vec U_{\rm obs}$
being the 4-velocity of the observer:
$$E = -\vec p \cdot \vec U_{\rm obs}$$
In short, he's being sloppy at the end of §4.3.
4.4 Dust again: the stress-energy tensor
This is great practise for developing the frame-independent view of tensor
equations: energy density must be a component of a $\binom{2}{0}$ tensor since it
transforms like $\gamma^2$. Indeed, the stress-energy tensor $\mathbf{T}$ forms the RHS of the
Einstein field equations.
4.5 General Fluids
Symmetry of $T^{\alpha\beta}$ in the MCRF
Don't take the lower indices of $F^i_1$ as indicating the covariant components.
They are just indices to give names to the forces on the different faces so he
can talk about them.
Conservation of energy-momentum
Typo bottom of p. 98: $l^2 T^{0x}(x = a)$ should be $l^2 T^{0x}(x = l)$.
4.6 Perfect fluids
No viscosity
The only matrix diagonal in all frames is a multiple of the identity matrix.
Here by all frames, he means all orientations of the spatial axes. In two spatial
dimensions this is easy to see. Just apply the rotation matrix
$$R = \begin{pmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{pmatrix}$$
Then consider the transformation of some arbitrary diagonal matrix
$$A = \begin{pmatrix} a & 0\\ 0 & b \end{pmatrix}$$
We find that in a co-ordinate system rotated by $\theta$ we have
$$R^T A\, R = \begin{pmatrix} a\cos^2\theta + b\sin^2\theta & (a - b)\cos\theta\sin\theta\\ (a - b)\cos\theta\sin\theta & b\cos^2\theta + a\sin^2\theta \end{pmatrix} \tag{4.1}$$
which is again diagonal for arbitrary $\theta$ iff $a = b$.
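The off-diagonal term of Eq. (4.1) is easy to confirm numerically. The sketch below is only illustrative: the function name `rotate_diag`, the angle, and the sample values of $a$ and $b$ are my own choices, and I assume the sign convention $R = [[\cos\theta, \sin\theta], [-\sin\theta, \cos\theta]]$.

```python
import math


def rotate_diag(a, b, theta):
    """Return R^T A R for A = diag(a, b) and R = [[cos, sin], [-sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, s], [-s, c]]
    A = [[a, 0.0], [0.0, b]]
    RT = [[R[j][i] for j in range(2)] for i in range(2)]
    AR = [[sum(A[i][k] * R[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(RT[i][k] * AR[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


theta = 0.3
unequal = rotate_diag(2.0, 5.0, theta)  # a != b: off-diagonal (a - b) cos sin appears
equal = rotate_diag(3.0, 3.0, theta)    # a == b: remains a multiple of the identity
expected_offdiag = (2.0 - 5.0) * math.cos(theta) * math.sin(theta)
print(unequal[0][1], expected_offdiag, equal[0][1])
```

With $a = b$ the rotated matrix is unchanged, which is the property Schutz needs for an isotropic pressure.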
The conservation laws
Typo: Eq. (4.52) should read:
( + p) U
i
+ p
,

i
= 0
To see that $U_{i,\beta}U^\beta$ is the definition of the four-acceleration $a_i$, we must
piece together the definition in Eq. (2.32) and the notation in Eq. (3.64),
and one must lower the index using the metric, which in special relativity is
constant and so commutes with derivatives.
It's a bit beyond the scope of this text, but Eq. (4.55) is the famous
Euler equation, which is the inviscid form of the Navier-Stokes equations of
classical fluid mechanics. As Schutz says, it's the F = ma of fluid mechanics.
4.10 Problems
1. The continuum hypothesis applies to which of these situations?
(a) Planetary motions in the solar system. The continuum hypothesis
does not apply because there are only nine planets, and they have different
orbits, periods, velocities, etc.
(b) A lava flow from a volcano is likely well suited to the continuum
approximation because the molten rock flows like a liquid. Presumably
elements can be found that are much bigger than the molecules of the minerals
but small enough to have uniform macroscopic properties like temperature,
density, etc.
(c) Traffic on a major road at rush hour is likely to be well suited to
the continuum approximation if one considers scales much larger than an
individual car so there are many cars in the element, but small enough that
speed and direction of the cars in one element is roughly constant. At rush
hour it's more likely to have bumper to bumper traffic which would force the
cars in a vicinity to travel at the same speed.
(d) Traffic at an intersection with stop signs is likely to be not well suited
to the continuum approximation. The stop signs ensure that the cars in an
element will have different speeds at different times. There is no near-uniform
element.
(e) Plasma dynamics is likely to be well suited to the continuum
approximation unless the plasma is extremely rarefied. In the latter case there might
not be sufficient collisions to bring the ions into statistical equilibrium.
2. "Flux across a surface of constant x" is often called "flux in the
x direction". This is inappropriate because it implies that "flux in the x
direction" is a component of a vector. However, the flux across a surface of
constant x is actually the result of the application of a vector to a one-form,
or vice versa:
$$\vec N(\tilde dx) = \langle \tilde n, \vec N \rangle = \langle \tilde dx, \vec N \rangle \tag{4.2}$$
This was described in Section 4.3.
3. (a) Galilean momentum is frame-dependent in a manner that relativistic
momentum is not. Galilean momentum is the ordinary 3-vector velocity,
$\mathbf{v}$, times the (frame-independent) mass $m$:
$$\mathbf{p}_g = m\mathbf{v}. \tag{4.3}$$
The velocity depends very much on the frame. It's not just the components
that change with reference frame, but the vector itself that changes.
In contrast with Galilean momentum, the relativistic momentum, $\vec p$, is a
4-vector, created by the scalar rest mass $m$ times the 4-velocity $\vec U$:
$$\vec p = m\vec U. \tag{4.4}$$
The rest mass is obviously frame-invariant. The 4-velocity is too, while its
components do depend upon reference frame. Note for instance that the
magnitude of the 4-velocity is always
$$\vec U \cdot \vec U = U^\alpha U_\alpha = -1. \tag{4.5}$$
(b) How is the above situation possible given that Galilean momentum
is an approximation to relativistic momentum? Hint: Define a Galilean
4-momentum.
The Galilean 4-momentum would look like the regular 3-momentum but
with a time component,
$$\vec p_G \underset{\mathcal{O}}{\to} m(1, v_x, v_y, v_z), \tag{4.6}$$
where $(v_x, v_y, v_z)$ is the ordinary 3-velocity in frame $\mathcal{O}$. Because $|\mathbf{v}| \ll 1$
the $\vec p_G$ vector is dominated by its first component, just the inertial mass.
Thus for instance, the magnitude of the Galilean 4-momentum is almost
frame-invariant.
4. Show that the number density of dust to an arbitrary observer with
4-velocity $\vec U_{\rm obs}$ is
$$-\vec N \cdot \vec U_{\rm obs}$$
Let's transform ourselves into the rest frame of the observer, $\bar{\mathcal{O}}$, so then
$$\vec U_{\rm obs} \underset{\bar{\mathcal{O}}}{\to} (1, 0, 0, 0)$$
Now,
$$-\vec N \cdot \vec U_{\rm obs} = -n\,\vec U \cdot \vec U_{\rm obs}$$
$$= n\,(U^{\bar 0})$$
$$= n\,\frac{1}{\sqrt{1 - v^2}} \tag{4.7}$$
where $v$ is the ordinary velocity of the dust measured in the $\bar{\mathcal{O}}$ frame. But
this is exactly what we came up with for the number density in Eq. (4.2).
Now we note that the observer's 4-velocity and $\vec N$ are frame-invariant
vectors, and hence $-\vec N \cdot \vec U_{\rm obs}$ is frame-invariant. So the result must be true in
all reference frames.
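The frame invariance claimed in problem 4 can be tested in a frame that is neither the dust's rest frame nor the observer's. In this sketch (my own example, with arbitrary numbers) the dust moves at $v = 3/5$ and the observer at $v = 5/13$ along $x$ in some lab frame; the relativistic velocity-addition formula then gives the relative speed, and hence the $\gamma$ factor the observer should see.

```python
from fractions import Fraction as F


def minkowski_dot(a, b):
    """Scalar product with signature (-,+,+,+)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))


n = F(7)                                   # rest-frame number density (arbitrary)
U_dust = [F(5, 4), F(3, 4), F(0), F(0)]    # dust: v = 3/5, gamma = 5/4
U_obs = [F(13, 12), F(5, 12), F(0), F(0)]  # observer: v = 5/13, gamma = 13/12
N = [n * u for u in U_dust]                # number-flux vector N = n U

density_seen = -minkowski_dot(N, U_obs)

# Relative speed of dust and observer from the velocity-addition formula.
v_rel = (F(3, 5) - F(5, 13)) / (1 - F(3, 5) * F(5, 13))
gamma_rel_squared = 1 / (1 - v_rel**2)

print(density_seen, v_rel, density_seen**2 == n**2 * gamma_rel_squared)
```

Comparing squares avoids taking a square root, so the whole check stays in exact rational arithmetic.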
5. Complete the proof that Eq. (4.14) [for the stress-energy tensor] defines
a tensor by arguing that it must be linear in both its arguments.
Eq. (4.14) defines the stress-energy tensor
$$\mathbf{T}(\tilde dx^\alpha, \tilde dx^\beta) = T^{\alpha\beta},$$
and $T^{\alpha\beta}$ is the flux of $\alpha$ momentum across a surface of constant $x^\beta$. Of
course we require this flux to be proportional to the area of the surface of
constant $x^\beta$. This requirement is met because $\mathbf{T}(\tilde dx^\alpha, \tilde dx^\beta)$ is linear in the
2nd argument $\tilde dx^\beta$.
Furthermore, the $\alpha$ component of the four-momentum is
$$p^\alpha = \langle \tilde dx^\alpha, \vec p \rangle, \tag{4.8}$$
which is linear in $\tilde dx^\alpha$, the first argument of $\mathbf{T}(\tilde dx^\alpha, \tilde dx^\beta)$, by the properties
of one-forms and vectors.
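6. Derive Eq. (4.19)
We only need to show that $\vec p \otimes \vec N$ has only one non-zero component in
the MCRF, namely $(T^{00})_{\rm MCRF} = nm$.
In the MCRF, say $\mathcal{O}$,
$$\vec U \underset{\mathcal{O}}{\to} (1, 0, 0, 0)$$
$$\vec N = n\vec U \underset{\mathcal{O}}{\to} (n, 0, 0, 0)$$
$$\vec p = m\vec U \underset{\mathcal{O}}{\to} (m, 0, 0, 0).$$
Thus $\mathbf{T} = \vec p \otimes \vec N$ has only one non-zero component, namely $(T^{00})_{\rm MCRF} = nm$.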
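7. Derive Eq. (4.21).
The terms in Eq. (4.21) are immediately clear from the preceding expression
for the 4-velocity in frame $\mathcal{O}$:
$$\vec U \underset{\mathcal{O}}{\to} (\gamma,\; \gamma v^x,\; \gamma v^y,\; \gamma v^z) \tag{4.9}$$
where $\gamma = \dfrac{1}{\sqrt{1 - v^2}}$, and $v^2 = \sum_i v^i v^i$.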
8.
(a) Argue that Eqs. (4.25) and (4.26) can be written as statements about
one-forms.
In the derivation of Eqs. (4.25) and (4.26), we started with a uid element
in the MCRF and considered how its energy could change by rst a nite
amount E and then we took the limit of an innitesimal change dE. If we
simply divided by the change in proper time, , then for instance Eq. (4.24)
would become
n
q

+ p
n
n
(4.10)
Now, following Schutz, we suppose that the changes are innitesimal. The
RHS becomes
lim
0

+ p
n
n
=
d
d

+ p
n
dn
d
=

d

U
+ p
n
dn

U
= nT

dS

U (4.11)
If we divide by the

U, we obtain
d
+ p
n
dn = nT

dS
(4.12)
[Solution above is dierent from Schutzs solutions, but I believe the two
are compatible.]
8. (b) Show that

q is not a gradient.
Unlike $\rho$ and $n$, there was no property of each element that we could
define by $q$. (Yes, formally one can write $q := Q/N$, but what is $Q$? Heat is
only defined in terms of transfer of energy.) I find it misleading that Schutz
writes $q := Q/N$ on p. 95.
9. Show that Eq. (4.34), when $\alpha = i$, i.e. any spatial index, expresses
Newton's 2nd law.
Let's assume we're dealing with dust, so that the stress-energy tensor is
given by Eq. (4.19).
Let's start with $\beta = 0$. Then Eq. (4.34) gives the following term:
$$T^{i0}{}_{,0} = \frac{\partial T^{i0}}{\partial x^0} = \frac{\partial T^{i0}}{\partial t} \tag{4.13}$$
which we interpret as the time rate of change of the $i$-direction momentum
density.
The remaining terms
$$T^{ij}{}_{,j} = \frac{\partial T^{ij}}{\partial x^j}$$
we directly interpret as the divergence of the flux of $i$-direction momentum.
The same terms can be written in a form familiar to classical fluid
mechanics:
$$\frac{\partial T^{i0}}{\partial t} = \frac{\partial \rho u^i}{\partial t}$$
$$\frac{\partial T^{ij}}{\partial x^j} = \frac{\partial \rho u^i u^j}{\partial x^j} \tag{4.14}$$
where on the RHS we have used the standard symbols of fluid
mechanics: $\rho$ is the mass density, $u^i$ is the fluid velocity in the $x^i$ direction.
So
$$\frac{\partial T^{i0}}{\partial t} + \frac{\partial T^{ij}}{\partial x^j} = \frac{\partial \rho u^i}{\partial t} + \frac{\partial \rho u^i u^j}{\partial x^j} = 0. \tag{4.15}$$
The RHS is the Navier-Stokes equations but without the pressure gradient
term, forcing or dissipation. This is of course a form of Newton's 2nd law
but when forces are not present. This case applies to dust only.
In a perfect fluid the stress-energy tensor is given by Eq. (4.38) and we
immediately see the possibility of a pressure gradient term.
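10. Take the limit of $|\mathbf{v}| \ll 1$, showing that Eq. (4.35) reduces to the equation
given.
Solution:
$$N^\alpha{}_{,\alpha} = 0, \quad \text{restating Eq. (4.35)}$$
$$\frac{\partial n\gamma}{\partial t} + \frac{\partial n\gamma v^i}{\partial x^i} = 0, \quad \text{using Eq. (4.5)}$$
$$\frac{\partial n}{\partial t} + \frac{\partial n v^i}{\partial x^i} = 0, \quad \gamma \to 1 \text{ when } |\mathbf{v}| \ll 1$$
where $\gamma = 1/\sqrt{1 - v^2}$.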
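11. (a) Show that the matrix $\delta_{ij}$ is unchanged when transformed by a
rotation of the spatial axes.
Why is this question in this chapter? Recall the discussion of viscosity
on p. 101. If there is no viscosity then $\mathbf{T}$ should be diagonal in any reference
frame.
First one has to remind oneself how to transform matrices under a change
of coordinates. I don't have my linear algebra book with me, but we can find
this by pretending that the matrix represents a $\binom{2}{0}$ tensor, say $\mathbf{A}$. This tensor
should be an entity that is invariant under the arbitrary choice of coordinate
system so we demand that
$$\mathbf{A} = A^{\alpha\beta}\,\vec e_\alpha \otimes \vec e_\beta, \quad \text{using bases of } \mathcal{O}\text{, see Chapter 3, section 6 for basis}$$
$$= A^{\bar\alpha\bar\beta}\,\vec e_{\bar\alpha} \otimes \vec e_{\bar\beta}, \quad \text{entity independent of coordinate system.} \tag{4.16}$$
So it's now clear that
$$A^{\bar\alpha\bar\beta} = \Lambda^{\bar\alpha}{}_{\alpha}\Lambda^{\bar\beta}{}_{\beta}A^{\alpha\beta}$$
$$(A^{\bar\alpha\bar\beta}) = (\Lambda^{\bar\alpha}{}_{\alpha})(A^{\alpha\beta})(\Lambda^{\bar\beta}{}_{\beta})^T, \quad \text{matrix version of previous line.} \tag{4.17}$$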
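Now we can address the problem by replacing $(A^{\alpha\beta})$ with the matrix
$\delta_{ij}$ and the transformation $(\Lambda^{\bar\alpha}{}_{\alpha})$ with a rotation $R^{\bar i}{}_{i}$. Let's work in 2D. The
matrix $\delta_{ij}$ in the rotated coordinate system $\bar{\mathcal{O}}$ will be
$$\delta_{\bar i\bar j} = R^{\bar i}{}_{i}\,\delta_{ij}\,R^{\bar j}{}_{j} = (R)\,I\,(R)^T$$
$$= \begin{pmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{pmatrix}$$
$$= \begin{pmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{pmatrix}$$
$$= \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}. \tag{4.18}$$
11. (b) Show that any matrix that has this property is a multiple of $\delta_{ij}$.
It is quite obvious that any multiple of the identity matrix has this
property. The converse, that only multiples of the identity have it, is what I went through in my notes above for §4.6.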
12. Derive Eq. (4.37) from Eq. (4.36).
We simply go term by term as Schutz suggested on p. 101. He has already
addressed $\alpha = \beta = 0$. Using the convention $i \in \{1, 2, 3\}$, consider the three
terms $T^{0i}$:
$$T^{0i} = (\rho + p)U^0 U^i + p\,\eta^{0i}$$
$$= (\rho + p)U^i + p\,\eta^{0i}, \quad \text{in MCRF, } U^0 = 1$$
$$= (\rho + p)U^i, \quad \eta \text{ is diagonal}$$
$$= 0, \quad \text{in MCRF, nil spatial components of } \vec U \tag{4.19}$$
Then by symmetry,
$$T^{i0} = T^{0i} = 0. \tag{4.20}$$
Finally,
$$T^{ij} = (\rho + p)U^i U^j + p\,\eta^{ij}$$
$$= p\,\eta^{ij}, \quad \text{in MCRF, nil spatial components of } \vec U$$
$$= p\,\delta^{ij}. \tag{4.21}$$
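The component-by-component argument of problem 12 can be reproduced in a few lines. This sketch assumes the perfect-fluid form $T^{\alpha\beta} = (\rho + p)U^\alpha U^\beta + p\,\eta^{\alpha\beta}$ used in the derivation above; the function name `perfect_fluid` and the sample values of $\rho$ and $p$ are hypothetical.

```python
from fractions import Fraction as F

# Inverse Minkowski metric eta^{alpha beta}, signature (-,+,+,+).
ETA_UP = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def perfect_fluid(rho, p, U):
    """T^{ab} = (rho + p) U^a U^b + p eta^{ab} for the given 4-velocity components."""
    return [[(rho + p) * U[a] * U[b] + p * ETA_UP[a][b] for b in range(4)]
            for a in range(4)]


rho, p = F(10), F(3)                   # arbitrary sample energy density and pressure
T_mcrf = perfect_fluid(rho, p, [1, 0, 0, 0])  # MCRF: U = (1, 0, 0, 0)

for row in T_mcrf:
    print([str(entry) for entry in row])
```

In the MCRF the $(\rho + p)$ term only contributes to $T^{00}$, where it combines with $p\,\eta^{00} = -p$ to leave $\rho$.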
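13. Supply reasoning behind Eq. (4.44)
We want the derivative of the dot product of the 4-velocity with itself.
This can be written
$$(U^\alpha U_\alpha)_{,\beta} = (U^\alpha U^\gamma)_{,\beta}\,\eta_{\alpha\gamma}, \quad \text{metric tensor is constant}$$
$$= (U^\alpha{}_{,\beta}U^\gamma + U^\alpha U^\gamma{}_{,\beta})\,\eta_{\alpha\gamma}, \quad \text{product rule}$$
$$= 2\,U^\alpha{}_{,\beta}U^\gamma\,\eta_{\alpha\gamma}, \quad \text{metric tensor is symmetric.} \tag{4.22}$$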
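14. Argue that Eq. (4.46) is the time component of Eq. (4.45) in the
MCRF.
Recall Eq. (4.45) was
\[ nU^\beta \left( \frac{\rho + p}{n} U^\alpha \right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta} = 0. \]
Although there are two indices, $\beta$ is just a dummy index (it appears both
upstairs and downstairs implying a sum over $\beta$). So the time component is
when $\alpha = 0$. The following argument is not correct:
\[
\begin{aligned}
0 &= nU^\beta \left( \frac{\rho + p}{n} U^0 \right)_{,\beta} + p_{,\beta}\,\eta^{0\beta} \\
  &= nU^\beta \left( \frac{\rho + p}{n} \right)_{,\beta} + p_{,\beta}\,\eta^{0\beta} && \text{since } \vec U = \vec e_0 \text{ in the MCRF} \\
  &= nU^\beta \left( \frac{\rho + p}{n} \right)_{,\beta} - \frac{\partial p}{\partial t} && \text{recall properties of } \eta \text{, see Eq. (3.44)}
\end{aligned}
\]
Although $\vec U = \vec e_0$ in the MCRF, we cannot immediately conclude that the
gradient of the time component is zero. Instead, we must first take the gradient
in Eq. (4.45), and then set $\alpha = 0$:
\[
\begin{aligned}
0 &= nU^\beta \left( \frac{\rho + p}{n} U^\alpha \right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta} && \text{Eq. (4.45)} \\
  &= nU^\beta \left[ U^\alpha \left( \frac{\rho + p}{n} \right)_{,\beta} + \left( \frac{\rho + p}{n} \right) U^\alpha{}_{,\beta} \right] + p_{,\beta}\,\eta^{\alpha\beta} && \text{expanding}
\end{aligned}
\tag{4.23}
\]
And now focus attention on the 2nd term in square brackets:
\[
\begin{aligned}
nU^\beta \left( \frac{\rho + p}{n} \right) U^\alpha{}_{,\beta} &= (\rho + p)\,U^\beta U^\alpha{}_{,\beta} \\
 &= (\rho + p)\,\nabla_{\vec U} \vec U && \text{c.f. Eq. (3.68)} \\
 &= (\rho + p)\,\frac{d\vec U}{d\tau} && \text{c.f. Eq. (3.67)} \\
 &= 0 && \text{for } t \text{ component in MCRF, c.f. p. 48}
\end{aligned}
\tag{4.24}
\]
So for the above reason, Eq. (4.45) in the time component reduces to
\[
\begin{aligned}
0 &= nU^\beta \left[ U^\alpha \left( \frac{\rho + p}{n} \right)_{,\beta} + \left( \frac{\rho + p}{n} \right) U^\alpha{}_{,\beta} \right] + p_{,\beta}\,\eta^{\alpha\beta} && \text{from above} \\
0 &= nU^\beta \left[ U^\alpha \left( \frac{\rho + p}{n} \right)_{,\beta} \right] + p_{,\beta}\,\eta^{\alpha\beta} && \text{using above} \\
0 &= nU^\beta \left[ \left( \frac{\rho + p}{n} \right)_{,\beta} \right] + p_{,\beta}\,\eta^{0\beta} && \text{time component} \\
  &= nU^\beta \left( \frac{\rho + p}{n} \right)_{,\beta} - \frac{\partial p}{\partial t} && \text{recall properties of } \eta \text{, see Eq. (3.44)}
\end{aligned}
\tag{4.25}
\]
On the other hand, Eq. (4.46) was
\[ 0 = nU^\beta U_\alpha \left( \frac{\rho + p}{n} U^\alpha \right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta} U_\alpha. \]
Consider:
\[
\begin{aligned}
U_\alpha \left( \frac{\rho + p}{n} U^\alpha \right)_{,\beta}
 &= U_\alpha U^\alpha \left( \frac{\rho + p}{n} \right)_{,\beta} + U_\alpha \frac{\rho + p}{n} U^\alpha{}_{,\beta} && \text{product rule} \\
 &= -\left( \frac{\rho + p}{n} \right)_{,\beta} + U_\alpha \frac{\rho + p}{n} U^\alpha{}_{,\beta} && \text{using Eq. (2.28)} \\
 &= -\left( \frac{\rho + p}{n} \right)_{,\beta} && \text{using Eq. (4.44), } U_\alpha U^\alpha{}_{,\beta} = 0.
\end{aligned}
\tag{4.26}
\]
So Eq. (4.46) becomes:
\[
\begin{aligned}
0 &= -nU^\beta \left( \frac{\rho + p}{n} \right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta} U_\alpha \\
  &= -nU^\beta \left( \frac{\rho + p}{n} \right)_{,\beta} + p_{,\beta}\,U^\beta \\
  &= -nU^\beta \left( \frac{\rho + p}{n} \right)_{,\beta} + \frac{\partial p}{\partial t} && \text{in MCRF} \\
  &= -\left[ nU^\beta \left( \frac{\rho + p}{n} \right)_{,\beta} - \frac{\partial p}{\partial t} \right] && \text{same as time component of Eq. (4.45), up to an overall sign.}
\end{aligned}
\tag{4.27}
\]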
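15. Derive Eq. (4.48) from Eq. (4.47).
This is just trivial manipulation to encourage the student to follow the
steps of the argument and become comfortable with the notation:
\[
\begin{aligned}
0 &= U^\beta \left[ -n \left( \frac{\rho + p}{n} \right)_{,\beta} + p_{,\beta} \right] && \text{recall Eq. (4.47)} \\
  &= U^\beta \left[ -\frac{n}{n} (\rho_{,\beta} + p_{,\beta}) + \frac{n}{n^2} (\rho + p)\, n_{,\beta} + p_{,\beta} \right] && \text{product rule and chain rule} \\
  &= U^\beta \left[ -\rho_{,\beta} + \left( \frac{\rho + p}{n} \right) n_{,\beta} \right] && \text{algebra.}
\end{aligned}
\tag{4.28}
\]
Which is Eq. (4.48), once we recognise $U^\beta f_{,\beta} = df/d\tau$.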
16. In the MCRF, $U^i = 0$. Why can't we assume $U^i{}_{,\beta} = 0$?
The analogous statement in 3D space is also true. In fluid mechanics for
instance, one can always transform the equations into a frame momentarily
co-moving with the local fluid velocity, but that doesn't mean the velocity
gradient will be zero. The 3-velocity in fluid mechanics, and 4-velocity in
SR, can depend upon space so that adjacent fluid elements have different
MCRFs.
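17. We have defined $a^\mu = U^\mu{}_{,\beta} U^\beta$. Show that in the nonrelativistic limit:
\[ a^i = \dot v^i + (\mathbf{v} \cdot \nabla) v^i = \frac{D v^i}{D t}. \]
The 4-velocity can be written in terms of the 3-velocity as
\[ U^\alpha \to [\gamma, \gamma u, \gamma v, \gamma w] \tag{4.29} \]
where
\[ \gamma = \frac{1}{\sqrt{1 - v^2}}. \]
(Recall that $v$ is assumed to be the ordinary 3-velocity divided by $c$.)
Writing $u^i$ for the spatial components $(u, v, w)$,
\[
a^i = U^i{}_{,\beta} U^\beta
    = \gamma \frac{\partial (\gamma u^i)}{\partial t} + \gamma u \frac{\partial (\gamma u^i)}{\partial x}
    + \gamma v \frac{\partial (\gamma u^i)}{\partial y} + \gamma w \frac{\partial (\gamma u^i)}{\partial z}.
\tag{4.30}
\]
In the nonrelativistic limit $v \ll 1$, so
\[ \gamma \approx 1, \]
and,
\[
a^i = \frac{\partial u^i}{\partial t} + u \frac{\partial u^i}{\partial x} + v \frac{\partial u^i}{\partial y} + w \frac{\partial u^i}{\partial z}
    = \frac{D u^i}{D t}.
\tag{4.31}
\]
Unfortunately I've used $v$ as both the y-component of the 3-velocity (standard
in fluid mechanics), and also as the magnitude of the 3-velocity in the
definition of $\gamma$.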
18. Sharpen the discussion at the end of §4.6 by showing that $-\nabla p$ is
actually the net force per unit volume on the fluid element in the MCRF.
I believe this is simply the same argument used in classical fluid mechanics.
Imagine a cube with one corner at the origin, with sides parallel to the
Cartesian coordinate axes, and of volume $\Delta x\,\Delta y\,\Delta z$. Without loss of
generality let the pressure gradient be in the y-direction. The pressure force on
the face at $y = 0$ is $p(y = 0)\,\Delta x\,\Delta z$, while the pressure force on the face at
$y = \Delta y$ is $-p(y = \Delta y)\,\Delta x\,\Delta z$. So the pressure gradient force (PGF) per unit
volume is
\[
\frac{\mathrm{PGF}}{\Delta x\,\Delta y\,\Delta z}
 = -[p(y = \Delta y) - p(y = 0)] \frac{\Delta x\,\Delta z}{\Delta x\,\Delta y\,\Delta z}
 = -\left[ \frac{p(y = \Delta y) - p(y = 0)}{\Delta y} \right].
\tag{4.32}
\]
Taking the limit $\Delta x \to 0$, $\Delta y \to 0$, $\Delta z \to 0$,
\[ \frac{\mathrm{PGF}}{dx\,dy\,dz} = -\frac{\partial p}{\partial y}. \tag{4.33} \]
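The cube argument can be illustrated numerically. In the sketch below the pressure profile `pressure(y)` is a made-up example (not from Schutz), and the function names are my own; the point is only that the net force on the two faces, divided by the volume, approaches $-\partial p/\partial y$ as the cube shrinks.

```python
def pressure(y):
    """Hypothetical pressure field varying only in y (arbitrary units)."""
    return 5.0 + 2.0 * y + 0.5 * y * y

def minus_dp_dy(y):
    """Analytic -dp/dy for the hypothetical profile above."""
    return -(2.0 + y)

def net_force_per_volume(y0, dx, dy, dz):
    """Net y-force on a small box with its lower face at y0, per unit volume."""
    force_lower_face = pressure(y0) * dx * dz        # pushes the box in +y
    force_upper_face = -pressure(y0 + dy) * dx * dz  # pushes the box in -y
    return (force_lower_face + force_upper_face) / (dx * dy * dz)

for size in (1e-1, 1e-3, 1e-5):
    print(size, net_force_per_volume(1.0, size, size, size), minus_dp_dy(1.0))
```

As the box size decreases the first number converges to the second, and the result does not depend on the face area because it cancels, exactly as in Eq. (4.32).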
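19. Starting with Eq. (4.58) prove Eq. (4.57).
Equation (4.58) contains the sum of terms like:
\[
\int \left[ V^0(t_2) - V^0(t_1) \right] dx\,dy\,dz + \int \left[ V^x(x_2) - V^x(x_1) \right] dt\,dy\,dz + \ldots
\tag{4.34}
\]
Let's start with the first term.
\[
\int \left[ V^0(t_2) - V^0(t_1) \right] dx\,dy\,dz
 = \int \frac{V^0(t_1 + \Delta t) - V^0(t_1)}{\Delta t}\,\Delta t\,dx\,dy\,dz,
\tag{4.35}
\]
where $\Delta t = t_2 - t_1$. Taking the limit $\Delta t \to 0$ (equivalently, applying the
fundamental theorem of calculus to the $t$ integral),
\[
\int \left[ V^0(t_2) - V^0(t_1) \right] dx\,dy\,dz = \int \frac{\partial V^0(t)}{\partial t}\,dt\,dx\,dy\,dz.
\tag{4.36}
\]
And similarly for the other terms. For example,
\[
\int \left[ V^x(x_2) - V^x(x_1) \right] dt\,dy\,dz = \int \frac{\partial V^x(x)}{\partial x}\,dt\,dx\,dy\,dz.
\tag{4.37}
\]
Combining these terms we obtain Eq. (4.57).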
20. (a) Show that if particles are not conserved but are generated locally
at a rate $\epsilon$ particles per unit volume per unit time in the MCRF, then the
conservation law, Eq. (4.35), becomes:
\[ N^\alpha{}_{,\alpha} = \epsilon. \]
We must essentially derive Eq. (4.35) but including the source term. We
were told just before Eq. (4.35) that the procedure was the same as for
Eq. (4.34), see p. 98. Consider a fluid element as described in Fig. 4.8 and
bottom of p. 98. The number density is $n$ in the MCRF, and by Eq. (4.42)
it is $n\gamma$ in a reference frame moving at speed $v$ relative to the fluid element,
with,
\[ \gamma = \frac{1}{\sqrt{1 - v^2}}. \]
The 4-velocity of the fluid is, by definition, $\vec U = \vec e_0$ in the MCRF and the
time component is in general $U^0 = \gamma$. Thus we can write the number of
particles in an element of volume $l^3$ as
\[ l^3 n \gamma = n l^3 U^0. \]
The rate of flow (or flux) of particles across surface 4 (c.f. Fig. 4.8) is
$l^2 n U^x(x = 0)$. (This may seem strange because we know that $U^x = 0$ in the
MCRF, but soon we're going to take a derivative that will not be zero; recall
problem 16 above.) The flux of particles across surface 2 is $l^2 n U^x(x = l)$.
Similarly, in the y-direction and z-direction the net inflow of particles is
$l^2 n U^y(y = 0) - l^2 n U^y(y = l)$ and $l^2 n U^z(z = 0) - l^2 n U^z(z = l)$ respectively.
These net inflow terms increase the number of particles in the fluid element at a
rate
\begin{align}
\frac{\partial (n l^3 U^0)}{\partial t} = l^2 \big[ & (nU^x)(x = 0) - (nU^x)(x = l) + (nU^y)(y = 0) - (nU^y)(y = l) \tag{4.38} \\
 & + (nU^z)(z = 0) - (nU^z)(z = l) \big] + \ldots \text{other terms} \tag{4.39}
\end{align}
There are other terms contributing now, unlike in deriving Eq. (4.35), because
there is also a source term giving,
\begin{align}
\frac{\partial (n l^3 U^0)}{\partial t} = l^2 \big[ & (nU^x)(x = 0) - (nU^x)(x = l) + (nU^y)(y = 0) - (nU^y)(y = l) \tag{4.40} \\
 & + (nU^z)(z = 0) - (nU^z)(z = l) \big] + l^3 \epsilon. \tag{4.41}
\end{align}
Note that this relation should be frame-invariant because $n$ is obviously frame
invariant and $\vec U$ is also frame invariant.
Note: $\epsilon$ is frame-invariant! Recall $\epsilon$ is the rate of generation of particles
per unit volume per unit time in the MCRF. In another reference frame,
there is a factor of $\gamma$ to account for the fact that the volume will be smaller,
thus tending to increase the generation rate per unit volume, but the time will
be slower by a factor $1/\gamma$. In short, time dilation cancels length contraction.
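And we can pull $l^3$ out of the derivative because it is a specified constant,
and then divide both sides by $l^3$:
\begin{align}
\frac{\partial (nU^0)}{\partial t} = & -\frac{(nU^x)(x = l) - (nU^x)(x = 0)}{l} - \frac{(nU^y)(y = l) - (nU^y)(y = 0)}{l} \tag{4.42} \\
 & - \frac{(nU^z)(z = l) - (nU^z)(z = 0)}{l} + \epsilon. \tag{4.43}
\end{align}
In the limit $l \to 0$,
\[
\frac{\partial (nU^0)}{\partial t} = -\frac{\partial (nU^x)}{\partial x} - \frac{\partial (nU^y)}{\partial y} - \frac{\partial (nU^z)}{\partial z} + \epsilon.
\tag{4.44}
\]
Or
\[ N^\alpha{}_{,\alpha} = \epsilon. \]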
20. (b) Show that if momentum and energy are not conserved (due to
interactions with external systems) then Eq. (4.34) becomes:
\[ T^{\alpha\beta}{}_{,\beta} = F^\alpha, \]
where $F^\alpha$ is the relativistic 4-vector force.
This problem is easier than 20. (a) because we follow the derivation of
Eq. (4.34) on pp. 98 and 99. Recall Eq. (4.31):
\[ \frac{\partial T^{00}}{\partial t} = -T^{0i}{}_{,i} \]
where the LHS is the time rate of change of energy per unit volume, and
the RHS is the net influx of energy per unit time per unit volume. Thus
for non-conservative systems we must add an $F^0$, which is the net rate of
energy forcing (or supply of energy from external sources and not associated
with fluxes across the boundaries of the fluid element) per unit time per unit
volume.
\[ \frac{\partial T^{00}}{\partial t} = -T^{0i}{}_{,i} + F^0. \tag{4.45} \]
I believe, unlike the source of particles, $\epsilon$, $F^0$ will be frame dependent. For
suppose it is associated with particles being generated at a rate $\epsilon$. We argued
that $\epsilon$ was frame-invariant in 20 (a). But the total energy of each particle is
$\gamma m$, where $m$ is the rest mass. So the energy source $F^0$ will increase as one
moves to a reference frame moving relative to the fluid element.
[In retrospect, of course it's frame dependent, it's a component of a 4-vector.]
Similarly for the other components. Consider
\[ \frac{\partial T^{x0}}{\partial t} = -T^{xi}{}_{,i}. \]
The LHS is the time rate of change of x-direction momentum per unit
volume, and the RHS is the net influx of x-momentum per unit time per
unit volume. Thus we must add any external forces
\[ \frac{\partial T^{x0}}{\partial t} = -T^{xi}{}_{,i} + F^x, \tag{4.46} \]
where $F^x$ is the external force on the fluid element per unit volume in the
MCRF.
21. Find the stress-energy tensor components in an inertial frame O.
(a) Group of particles all with the same 3-velocity $\mathbf{v} = \beta\,\mathbf{e}_x$ in frame O.
The rest-mass density is $\rho_0$ and we can make the continuum approximation.
Making the continuum approximation, we treat the particles as a fluid,
and because all the particles have the same velocity the fluid is dust.
The stress-energy tensor for dust was discussed in §4.4, see in particular
Eq. (4.19):
\[ \mathbf{T} = \rho\,\vec U \otimes \vec U. \]
Here the energy density in the rest frame is $\rho = \rho_0$ and the 4-velocity is
\[ \vec U \to_O (\gamma, \gamma\beta, 0, 0) \]
with $\gamma = 1/\sqrt{1 - \beta^2}$. So the stress-energy tensor is simply
\[
\mathbf{T} \to_O \rho_0
\begin{pmatrix}
\gamma^2 & \gamma^2 \beta & 0 & 0 \\
\gamma^2 \beta & \gamma^2 \beta^2 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]
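As a cross-check on 21 (a), one can build the dust tensor in two ways: directly from $\rho_0 U^\alpha U^\beta$, and by Lorentz-transforming the rest-frame tensor $\mathrm{diag}(\rho_0, 0, 0, 0)$. The sketch below does both; the function names and the sample values of $\rho_0$ and $\beta$ are assumptions made for illustration only.

```python
import math

def lorentz_gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

def dust_from_velocity(rho0, beta):
    """T^{ab} = rho0 U^a U^b with U = (gamma, gamma*beta, 0, 0)."""
    g = lorentz_gamma(beta)
    u = [g, g * beta, 0.0, 0.0]
    return [[rho0 * u[a] * u[b] for b in range(4)] for a in range(4)]

def dust_from_boost(rho0, beta):
    """Transform the rest-frame tensor diag(rho0, 0, 0, 0) into frame O."""
    g = lorentz_gamma(beta)
    # Matrix taking rest-frame components to O, where the dust moves at +beta.
    lam = [[g, g * beta, 0.0, 0.0],
           [g * beta, g, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
    rest = [[0.0] * 4 for _ in range(4)]
    rest[0][0] = rho0
    return [[sum(lam[a][m] * lam[b][n] * rest[m][n]
                 for m in range(4) for n in range(4))
             for b in range(4)]
            for a in range(4)]

t_direct = dust_from_velocity(2.0, 0.6)
t_boosted = dust_from_boost(2.0, 0.6)
for row_direct, row_boosted in zip(t_direct, t_boosted):
    print(row_direct, row_boosted)
```

Both constructions give the same matrix, with the $\gamma^2$, $\gamma^2\beta$ and $\gamma^2\beta^2$ pattern shown above.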
(b) A ring of N similar particles of mass m rotating counter-clockwise
in the x-y plane about the origin of O with angular velocity $\omega$. The radius of
the ring is $a$, and the circular cross-section of the ring has radius $\delta a \ll a$.
Ignore the force needed to keep the particles in circular orbit.
Because we can ignore the forces to keep the particles in the circular orbit,
I believe we can treat the fluid as a dust, even though the frame co-moving
with the particles is not inertial. [Yes, my solution agrees with Schutz's
solution, so this must be right.]
Let's assume that the given mass $m$ is the rest mass of each of the particles.
The relativistic mass of each particle is then $\gamma m$ measured in frame O,
where
\[ \gamma = \frac{1}{\sqrt{1 - \omega^2 a^2 / c^2}}. \]
Let's assume that $\omega a$ was given in geometric units so that $c = 1$, and
\[ \gamma = \frac{1}{\sqrt{1 - \omega^2 a^2}}. \]
The total energy is then
\[ \gamma N m. \]
And this energy is uniformly distributed over the volume of the torus so the
energy density in frame O is
\[ \rho_r = \frac{\gamma N m}{2\pi a\,\pi (\delta a)^2} = \frac{\gamma N m}{2\pi^2 a (\delta a)^2}. \]
But the particles each have speed $\omega a$ in O. To transform to the MCRF
energy density one must use Eq. (4.13),
\[ \rho = \frac{\rho_r}{\gamma^2} = \frac{N m}{2\pi^2 a (\delta a)^2 \gamma}. \]
(Here we're assuming that we can treat the fluid as dust even though there
is centripetal acceleration.)
Now all we need is the 4-velocity. In frame O, the x-component of
velocity is $-\omega a \sin(\theta) = -\omega a\,y/a = -\omega y$. The 3-velocity in frame O is
\[ \mathbf{v} = (-\omega y, \omega x, 0). \]
The 4-velocity is then
\[ \vec U \to_O (\gamma, -\gamma \omega y, \gamma \omega x, 0). \]
We can now easily form the stress-energy tensor
\[ \mathbf{T} = \rho\,\vec U \otimes \vec U \]
which in matrix form is
\[
(\mathbf{T}) = \rho
\begin{pmatrix}
\gamma^2 & -\gamma^2 \omega y & \gamma^2 \omega x & 0 \\
-\gamma^2 \omega y & \gamma^2 \omega^2 y^2 & -\gamma^2 \omega^2 x y & 0 \\
\gamma^2 \omega x & -\gamma^2 \omega^2 x y & \gamma^2 \omega^2 x^2 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]
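(c) For two such rings that are identical in every way except the sense of
rotation and wherein the particles do not interact.
We simply add the two stress-energy tensors, since the energy density is
linear in the number of particles and the volume is fixed. The 2nd stress-energy
tensor is obtained by just changing the sign of $\omega$ in the first stress-energy
tensor.
\[
\begin{aligned}
(\mathbf{T}) &= \rho
\begin{pmatrix}
\gamma^2 & -\gamma^2 \omega y & \gamma^2 \omega x & 0 \\
-\gamma^2 \omega y & \gamma^2 \omega^2 y^2 & -\gamma^2 \omega^2 x y & 0 \\
\gamma^2 \omega x & -\gamma^2 \omega^2 x y & \gamma^2 \omega^2 x^2 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
+ \rho
\begin{pmatrix}
\gamma^2 & \gamma^2 \omega y & -\gamma^2 \omega x & 0 \\
\gamma^2 \omega y & \gamma^2 \omega^2 y^2 & -\gamma^2 \omega^2 x y & 0 \\
-\gamma^2 \omega x & -\gamma^2 \omega^2 x y & \gamma^2 \omega^2 x^2 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix} \\
&= 2\rho
\begin{pmatrix}
\gamma^2 & 0 & 0 & 0 \\
0 & \gamma^2 \omega^2 y^2 & -\gamma^2 \omega^2 x y & 0 \\
0 & -\gamma^2 \omega^2 x y & \gamma^2 \omega^2 x^2 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\end{aligned}
\tag{4.47}
\]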
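The cancellation of the momentum density for the counter-rotating rings can be confirmed numerically. This is a minimal sketch with hypothetical sample values ($\rho = 1$, $\omega = 0.5$, a point on a ring of radius 1); the function names are my own.

```python
import math

def ring_stress_energy(rho, omega, x, y):
    """T^{ab} = rho U^a U^b for ring dust at (x, y); rho is the MCRF energy density."""
    a = math.hypot(x, y)                      # ring radius through this point
    gamma = 1.0 / math.sqrt(1.0 - (omega * a) ** 2)
    u = [gamma, -gamma * omega * y, gamma * omega * x, 0.0]
    return [[rho * u[i] * u[j] for j in range(4)] for i in range(4)]

def add_tensors(t1, t2):
    return [[t1[i][j] + t2[i][j] for j in range(4)] for i in range(4)]

rho, omega, x, y = 1.0, 0.5, 0.6, 0.8
total = add_tensors(ring_stress_energy(rho, omega, x, y),
                    ring_stress_energy(rho, -omega, x, y))
for row in total:
    print(row)
```

The $T^{0i}$ entries of the sum vanish while the energy density and the stresses double, in agreement with Eq. (4.47).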
22. (i) Here we must argue that a collection of noncolliding particles, like
a galaxy, with random velocities with no preferred direction in the MCRF
has a stress-energy tensor of a perfect fluid.
Here we must simply argue that heat conduction and viscosity are zero,
so that the conditions of a perfect fluid are met. Heat conduction and
momentum diffusion (viscosity) result from the net transfer of energy and
momentum, respectively, due to particle motions. In classical fluid dynamics
this results from the molecular motions having a preferred direction due to
temperature gradients or momentum gradients. But here we are assuming
a priori that there is no preferred direction in the MCRF (this must be the
same MCRF for the entire system so that there are no gradients). So there
can be no net transfer of energy or momentum due to the motion of the
particles. Hence no heat transfer by conduction or momentum transfer by
viscosity. With the conditions of a perfect fluid being met, the arguments of
§4.6 apply for the form of the stress-energy tensor.
This is interesting because the condition of no bias in any direction of the
particle velocities (in the MCRF) is a statistical condition, applying in the mean.
But if the velocities are truly randomly distributed, then we should expect
random fluctuations about the mean, implying a random heat conduction
and viscosity effect, albeit with a time mean of zero. In classical
atmosphere/ocean fluid dynamics this is only recently being considered under the
name of stochastic parameterization.
22. (ii) If all particles have the same speed $v$ and mass $m$ (let's assume
that's rest mass), express $p$ and $\rho$ as functions of $m$, $v$, and $n$.
The magnitude of the momentum of a given particle is,
\[ \gamma m v, \]
so the momentum flux in a given direction is
\[ \sum_i n v \cos(\theta_i)\,\gamma m v \cos(\theta_i) = \sum_i n \gamma m v^2 \cos^2(\theta_i). \]
We make the continuum hypothesis and approach the problem statistically.
Suppose a particle is at the origin. Because the direction of a given particle's
trajectory is random, the probability of leaving through a given piece of the
unit sphere centred at the origin is proportional to the area of the piece.
Dividing by the area of the sphere we get the fraction of the total solid angle.
For the strip of width $d\theta$ making an angle $\theta$ to the z-axis this fraction is
$2\pi \sin(\theta)\,d\theta / 4\pi = \sin(\theta)\,d\theta / 2$. So the mean value of the momentum flux
of one particle in the z-direction is
\[
\gamma m v^2 \int_0^\pi \frac{\sin(\theta) \cos^2(\theta)\,d\theta}{2}
 = \frac{\gamma m v^2}{2} \left( -\frac{1}{3} \right) \left[ \cos^3(\theta) \right]_0^\pi
 = \gamma m v^2\,\frac{1}{3}.
\tag{4.48}
\]
Multiplying by the number density gives $p = n \gamma m v^2 / 3$, while the energy
density is $\rho = n \gamma m$.
Originally I integrated only over half the sphere's area because I mistakenly
thought that only half the particles contribute to the momentum flux in a
given direction, the other half fluxing momentum in the other direction. But
this is wrong because the momentum flux goes like the square of the velocity:
a particle moving in the negative x-direction carries negative momentum
in the negative x-direction and thus results in a positive momentum flux!
This was made clear in the previous problem, see 21 (c).
For a photon gas, say the energy of each photon is $E$ and the MCRF
number density is $n$, so that
\[ \rho = nE. \]
The momentum of each photon is $E$, c.f. Eq. (2.37), and its speed is $c = 1$.
So the momentum flux of a given photon in a given direction will be
\[ \frac{E c}{3} = \frac{E}{3} \]
and the total momentum flux from all photons per unit volume will be
\[ p = n \frac{E}{3} = \frac{\rho}{3}. \]
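The angular average of $\cos^2\theta$ that produces the factor $1/3$ is easy to confirm numerically. The sketch below uses a simple midpoint rule; the function names and the sample values of $n$, $m$, $v$ are assumptions for illustration, and the final ratio is just $p/\rho = v^2/3$ for the monoenergetic gas.

```python
import math

def mean_cos_squared(steps=20000):
    """Midpoint-rule estimate of the integral of cos^2(t) sin(t)/2 over [0, pi]."""
    d_theta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * d_theta
        total += math.cos(theta) ** 2 * math.sin(theta) / 2.0 * d_theta
    return total

def lorentz_gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def pressure(n, m, v):
    """p = n * gamma * m * v^2 * <cos^2>, with c = 1."""
    return n * lorentz_gamma(v) * m * v * v * mean_cos_squared()

def energy_density(n, m, v):
    """rho = n * gamma * m, with c = 1."""
    return n * lorentz_gamma(v) * m

print(mean_cos_squared())
print(pressure(1.0, 1.0, 0.6) / energy_density(1.0, 1.0, 0.6))
```

Letting $v \to 1$ in the ratio recovers the photon-gas result $p = \rho/3$.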
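23. This question was confusing at first because $d^3x$ is not defined in the
text. Is that $dx^1 dx^2 dx^3$ or $dx^0 dx^2 dx^3$ etc.? Below I just assumed it was the
3 spatial dimensions $d^3x = dx\,dy\,dz$, and found that it makes sense.
23 (a) Prove
\[ \frac{\partial}{\partial t} \int T^{0\alpha}\,d^3x = 0 \]
\[
\begin{aligned}
0 &= T^{\alpha\beta}{}_{,\beta} && \text{Eq. (4.34)} \\
  &= T^{\beta\alpha}{}_{,\beta} && \text{symmetry} \\
  &= T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i} && \text{expanding} \\
  &= \int \left( T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i} \right) dx\,dy\,dz && \text{integrate over volume} \\
  &= \int \left( T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i} \right) d^3x && \text{change notation} \\
\int T^{0\alpha}{}_{,t}\,d^3x &= -\int T^{i\alpha}{}_{,i}\,d^3x && \text{rearrange} \\
  &= -\oint_{\partial\Omega} T^{i\alpha} n_i\,d^2x && \text{3D version of Gauss' theorem} \\
  &= 0 && \text{choosing surface } \partial\Omega \text{ in region where } T^{\alpha\beta} = 0 \\
\frac{\partial}{\partial t} \int T^{0\alpha}\,d^3x &= 0 && \text{choosing time-independent } \Omega
\end{aligned}
\tag{4.49}
\]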
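23 (b) Prove
\[ \frac{\partial^2}{\partial t^2} \int T^{00} x^i x^j\,d^3x = 2 \int T^{ij}\,d^3x \]
This is a bit trickier. I just played with the integrand such that it would
give me terms that appeared in the above. Even if you guess incorrectly,
after doing a few computations you'll learn how things work and can then
find the solution.
\[
\begin{aligned}
\frac{\partial^2}{\partial x^\alpha \partial x^\beta} \left( T^{\alpha\beta} x^\mu x^\nu \right)
 &= \frac{\partial}{\partial x^\alpha} \left( T^{\alpha\beta}{}_{,\beta}\,x^\mu x^\nu + T^{\alpha\beta} \frac{\partial (x^\mu x^\nu)}{\partial x^\beta} \right) \\
 &= \frac{\partial}{\partial x^\alpha} \left( T^{\alpha\beta} \frac{\partial (x^\mu x^\nu)}{\partial x^\beta} \right) && \text{by Eq. (4.34)} \\
 &= \frac{\partial}{\partial x^\alpha} \left( T^{\beta\alpha} \frac{\partial (x^\mu x^\nu)}{\partial x^\beta} \right) && \text{symmetry} \\
 &= T^{\beta\alpha} \frac{\partial^2 (x^\mu x^\nu)}{\partial x^\alpha \partial x^\beta} && \text{by Eq. (4.34)} \\
 &= T^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \left( \delta^\mu{}_\beta x^\nu + x^\mu \delta^\nu{}_\beta \right) && \text{product rule of elementary calculus} \\
 &= T^{\alpha\beta} \left( \delta^\mu{}_\beta \delta^\nu{}_\alpha + \delta^\mu{}_\alpha \delta^\nu{}_\beta \right) && \text{product rule of elementary calculus} \\
 &= T^{\nu\mu} + T^{\mu\nu} = 2\,T^{\mu\nu} && \text{symmetry}
\end{aligned}
\tag{4.50}
\]
Now we just restrict attention to spatial ranges of these indices: $\mu = i$
and $\nu = j$, and we integrate over a 3D spatial volume $\Omega$ fixed in time so
that temporal derivatives commute with the spatial integral (without Leibniz
terms) and large enough that it entirely encloses the region in which the
stress-energy tensor is non-zero:
\[
\begin{aligned}
\frac{\partial^2}{\partial x^\alpha \partial x^\beta} \left( T^{\alpha\beta} x^i x^j \right) &= 2\,T^{ij} && \text{restricting above} \\
\int \frac{\partial^2}{\partial x^\alpha \partial x^\beta} \left( T^{\alpha\beta} x^i x^j \right) d^3x &= \int 2\,T^{ij}\,d^3x && \text{integrating} \\
\frac{\partial^2}{\partial t^2} \int \left( T^{00} x^i x^j \right) d^3x
 + 2 \int \frac{\partial^2}{\partial t\,\partial x^k} \left( T^{0k} x^i x^j \right) d^3x
 + \int \frac{\partial^2}{\partial x^k \partial x^l} \left( T^{kl} x^i x^j \right) d^3x &= \int 2\,T^{ij}\,d^3x && \text{expanding}
\end{aligned}
\tag{4.51}
\]
The middle terms (the mixed time-space term and the purely spatial term) can be
shown to be zero because they always involve an integral over a direction for
which there is a spatial derivative, giving terms like, say for the $y$ derivative:
\[
\int \frac{\partial}{\partial y} \left[ \frac{\partial}{\partial x^l} \left( T^{ly} x^i x^j \right) \right] dx\,dy\,dz
 = \int \left[ \frac{\partial}{\partial x^l} \left( T^{ly} x^i x^j \right) \right]_{y_L}^{y_R} dx\,dz = 0
\]
where $y_R$ and $y_L$ are the y coordinates of the bounding surface on the right
and left sides of the volume $\Omega$. But on this surface $\mathbf{T} = 0$ because we choose
the surface to be outside the bounded region of non-zero $\mathbf{T}$.
Maybe I could have used Gauss' theorem again instead of the final argument
above?
In any case, we're left with just
\[ \frac{\partial^2}{\partial t^2} \int \left( T^{00} x^i x^j \right) d^3x = \int 2\,T^{ij}\,d^3x \tag{4.52} \]
which was what we had to prove.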
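23 (c) Prove
\[
\frac{\partial^2}{\partial t^2} \int T^{00} (x^i x_i)^2\,d^3x = 4 \int T^i{}_i\,x^j x_j\,d^3x + 8 \int T^{ij} x_i x_j\,d^3x
\]
The solution to this problem proceeds much like in (b) but is a bit more
complicated. The difficulty is mostly in guessing where to start, and in
particular what operator to apply to $T^{\alpha\beta}$. Again I chose something too
general and made life difficult for myself, but it soon became clear where
I should have started. These problems are actually much easier than they
might seem at first. Here is some thinking out loud that might help. We
see on the left a second derivative with respect to $t$ so we want to start with a
general double derivative of $T^{\alpha\beta}$:
\[ \frac{\partial^2 T^{\alpha\beta}}{\partial x^\alpha \partial x^\beta} \]
As in parts (a) and (b), we anticipate exploiting the fact that these derivatives
are zero for deriving the RHS. Clearly we have to multiply by some
combination of factors of $x$ and there should be four of them. Some will
apparently be eliminated somehow and somehow one of the indices of $T^{\alpha\beta}$ will
need to be lowered but I had no clue how at this point. I started very general,
with
\[ \frac{\partial^2}{\partial x^\alpha \partial x^\beta} \left( T^{\alpha\beta} x^\mu x^\nu x^\sigma x^\lambda \right) \]
But after a few computations to see how things worked, it became clear I
only needed the spatial components and that I only needed two pairs, that
is
\[ \frac{\partial^2}{\partial x^\alpha \partial x^\beta} \left( T^{\alpha\beta} x^i x_i x^j x_j \right) \]
It wasn't that tricky. It was obvious what needed to be done to get some of
the terms on the desired RHS. It also became clear how to lower the index
of $T^{\alpha\beta}$. Let's just proceed now with the computations.
\[
\begin{aligned}
\frac{\partial^2}{\partial x^\alpha \partial x^\beta} \left( T^{\alpha\beta} x^i x_i x^j x_j \right)
 &= T^{\alpha\beta} \frac{\partial^2}{\partial x^\alpha \partial x^\beta} \left( x^i x_i x^j x_j \right) && \text{Eq. (4.34) and symmetry} \\
 &= T^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \left[ \frac{\partial}{\partial x^\beta} \left( x^i x^j \delta_{ki} x^k \delta_{lj} x^l \right) \right] && \text{raise indices so we can differentiate} \\
 &= T^{\alpha\beta} \delta_{ki} \delta_{lj} \frac{\partial}{\partial x^\alpha} \left[ \frac{\partial}{\partial x^\beta} \left( x^i x^j x^k x^l \right) \right] && \text{metric is uniform in space} \\
 &= T^{\alpha\beta} \delta_{ki} \delta_{lj} \frac{\partial}{\partial x^\alpha} \left( x^i x^j x^k \delta^l{}_\beta + x^i x^j \delta^k{}_\beta x^l + x^i \delta^j{}_\beta x^k x^l + \delta^i{}_\beta x^j x^k x^l \right) && \text{product rule} \\
 &= T^{\alpha\beta} \delta_{ki} \delta_{lj} \Big\{ \delta^l{}_\beta \left[ x^i x^j \delta^k{}_\alpha + x^i \delta^j{}_\alpha x^k + \delta^i{}_\alpha x^j x^k \right]
    + \delta^k{}_\beta \left[ x^i x^j \delta^l{}_\alpha + x^i \delta^j{}_\alpha x^l + \delta^i{}_\alpha x^j x^l \right] \\
 &\qquad + \delta^j{}_\beta \left[ x^i x^k \delta^l{}_\alpha + x^i \delta^k{}_\alpha x^l + \delta^i{}_\alpha x^k x^l \right]
    + \delta^i{}_\beta \left[ x^j x^k \delta^l{}_\alpha + x^j \delta^k{}_\alpha x^l + \delta^j{}_\alpha x^k x^l \right] \Big\}
\end{aligned}
\tag{4.53}
\]
Four of these twelve terms are characterized by having the indices of the two
remaining $x$ factors, say $x^i x^k$ for the first one, both within the same metric
term, e.g. $\delta_{ki}$, as opposed to spread across two metric terms. These are the
terms containing $x^i x^k$ or $x^j x^l$. Thus it's only possible to lower one of
the two indices. It doesn't matter which one we lower (you'll see in a second
because of the symmetry of $T^{\alpha\beta}$ we can get what we want in the end either
way). So these 4 terms become
\[ 4\,T^i{}_i\,x^j x_j \]
where we have relabelled dummy indices. And the remaining 8 terms
become
\[ 8\,T^{ij} x_i x_j \]
After integrating over all space this gives the RHS. The LHS follows as it did
in (b).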
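The identities of 23 (b) and (c) can be sanity-checked on a toy system that is not in Schutz's exercise: a handful of free point particles. For such particles (an assumption of this sketch) $T^{00}$ and $T^{ij}$ are sums of delta functions weighted by $\gamma m$ and $\gamma m v^i v^j$, so the spatial integrals reduce to sums over particles and the time derivative can be taken by finite differences. All names and particle data below are hypothetical.

```python
import math

# Hypothetical free particles: (rest mass, initial position, velocity), c = 1.
PARTICLES = [
    (1.0, (0.2, -0.1, 0.4), (0.3, 0.1, -0.2)),
    (2.5, (-0.5, 0.3, 0.0), (-0.1, 0.4, 0.2)),
]

def energy(m, v):
    """Particle energy gamma*m."""
    return m / math.sqrt(1.0 - sum(c * c for c in v))

def position(x0, v, t):
    return [x0[k] + v[k] * t for k in range(3)]

def moment_b(t, i, j):
    """Integral of T^00 x^i x^j over space."""
    return sum(energy(m, v) * position(x0, v, t)[i] * position(x0, v, t)[j]
               for m, x0, v in PARTICLES)

def stress_b(i, j):
    """Integral of T^ij over space."""
    return sum(energy(m, v) * v[i] * v[j] for m, x0, v in PARTICLES)

def moment_c(t):
    """Integral of T^00 (x^i x_i)^2 over space."""
    return sum(energy(m, v) * sum(c * c for c in position(x0, v, t)) ** 2
               for m, x0, v in PARTICLES)

def rhs_c(t):
    """4 * integral of T^i_i x^j x_j  +  8 * integral of T^ij x_i x_j."""
    total = 0.0
    for m, x0, v in PARTICLES:
        x = position(x0, v, t)
        r2 = sum(c * c for c in x)
        v2 = sum(c * c for c in v)
        xv = sum(x[k] * v[k] for k in range(3))
        total += energy(m, v) * (4.0 * v2 * r2 + 8.0 * xv * xv)
    return total

def second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

print(second_derivative(lambda t: moment_b(t, 0, 1), 0.7), 2.0 * stress_b(0, 1))
print(second_derivative(moment_c, 0.7), rhs_c(0.7))
```

In each printed pair the two numbers agree to within the finite-difference error, which is at least consistent with the counting of 4 and 8 terms above.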
24. (a) Show that in the rest frame O of a star of constant luminosity L
(total energy radiated per second), the stress-energy tensor of the radiation
from the star at the event $(t, x, 0, 0)$ has components $T^{00} = T^{0x} = T^{x0} =
T^{xx} = L/(4\pi x^2)$. The star sits at the origin.
Assume the star emits radiation isotropically. So a sphere of radius $x$
centred at the origin has radiation flowing out of it at a rate of $L$. The
surface area of the sphere is $4\pi x^2$. The flux will be evenly distributed over
the surface of the sphere by the assumption of isotropy. Thus the flux per unit
time per unit area is everywhere of magnitude $L/(4\pi x^2)$. And in particular
at event $(t, x, 0, 0)$ it is also $T^{0x} = L/(4\pi x^2)$.
In time period $\Delta t$ the energy flow out of the sphere will be $L\,\Delta t$ and this
energy will fill a spherical shell of volume $(4\pi x^2)\,c\,\Delta t = (4\pi x^2)\,\Delta t$, since $c = 1$
in geometric units. Thus the energy density at a distance of $x$ from the origin
will be $T^{00} = L\,\Delta t/(4\pi x^2\,\Delta t) = L/(4\pi x^2)$.
By the symmetry properties of $\mathbf{T}$, we know that in general $T^{0x} = T^{x0}$.
Finally, the energy flux is photon flux, say $F_p$, times the energy per photon,
$h\nu$,
\[ T^{0x} = F_p\,h\nu \]
And the momentum flux will be the photon flux times the momentum per
photon, $h/\lambda$,
\[ T^{xx} = F_p \frac{h}{\lambda} = F_p \frac{h\nu}{c} = T^{0x} \]
because again $c = 1$ in geometric units, c.f. p. 49.
24. (b) By drawing the world lines of photons emitted and absorbed at
an event, I can only guess that the definition of a null vector that separates
the emission and reception of the radiation is a null vector in the direction
of the radiation that passes through the event and points in the direction of
the emitted radiation and opposite the received radiation. We are given that
\[ \vec X \to_O (x, x, 0, 0) \]
is such a vector for the event $(x, x, 0, 0)$. Recall the radiation came from
the origin of O wherein the star is at rest, see (a). I don't see that there's
anything to prove. I guess we're just supposed to learn the above definition
that wasn't explicitly provided.
To show that the stress-energy tensor has the given frame-invariant form,
\[ \mathbf{T} = \frac{L}{4\pi} \frac{\vec X \otimes \vec X}{(\vec U_s \cdot \vec X)^4}, \]
we must show that it is indeed frame-invariant, and that it reproduces the
results of (a) in the rest frame of the star. Vectors are frame-invariant, and so
their outer product will form a $\binom{2}{0}$ tensor that is also frame-invariant. The
radiation emitted per second, the luminosity, will depend upon reference frame
(as we will have to discuss in part (c)), so we must assume that $L$ is the
luminosity in the rest frame, even though this was not stated explicitly. Finally
the denominator has an inner product of two four-vectors, and is also
frame-invariant. So the given expression for $\mathbf{T}$ is frame-invariant.
In the rest frame it is easy to verify that the expression given produces the
results of (a):
\[
T^{00} = \frac{L}{4\pi} \frac{X^0 X^0}{(\vec U_s \cdot \vec X)^4} = \frac{L}{4\pi} \frac{x\,x}{(-1 \cdot x)^4} = \frac{L}{4\pi x^2}.
\tag{4.54}
\]
And
\[
T^{0x} = \frac{L}{4\pi} \frac{X^0 X^1}{(\vec U_s \cdot \vec X)^4} = \frac{L}{4\pi} \frac{x\,x}{(-1 \cdot x)^4} = \frac{L}{4\pi x^2},
\tag{4.55}
\]
and
\[
T^{xx} = \frac{L}{4\pi} \frac{X^1 X^1}{(\vec U_s \cdot \vec X)^4} = \frac{L}{4\pi} \frac{x\,x}{(-1 \cdot x)^4} = \frac{L}{4\pi x^2}.
\tag{4.56}
\]
Furthermore,
\[
T^{0y} = \frac{L}{4\pi} \frac{X^0 X^2}{(\vec U_s \cdot \vec X)^4} = \frac{L}{4\pi} \frac{x \cdot 0}{(-1 \cdot x)^4} = 0,
\tag{4.57}
\]
and similarly for the remaining terms.
24. (c) Find $T^{\bar 0 \bar x}$ in a frame $\bar O$ moving at speed $v$ along the x-axis of O.
Applying the Lorentz transformation to $\mathbf{T}$ we first transform $\vec X$:
\[
\vec X \to_{\bar O} (\gamma (1 - v) x, \gamma (1 - v) x, 0, 0) = (R, R, 0, 0),
\tag{4.58}
\]
where $\gamma = 1/\sqrt{1 - v^2}$, see p. 38, and we have found
\[ R = \gamma (1 - v) x = \sqrt{\frac{1 - v}{1 + v}}\,x. \]
The 4-velocity of the star in the Earth's frame of reference is
\[ \vec U_s \to_{\bar O} (\gamma, -\gamma v, 0, 0) \]
which gives for $\vec U_s \cdot \vec X = -\gamma R (1 + v)$. Using the frame-invariant expression
for the stress-energy tensor we find in the Earth's frame of reference:
\[
T^{\bar 0 \bar x} = \frac{L}{4\pi} \frac{X^{\bar 0} X^{\bar 1}}{(\vec U_s \cdot \vec X)^4}
 = \frac{L}{4\pi} \frac{R^2}{(\gamma R (1 + v))^4}
 = \frac{L}{4\pi R^2} \frac{(1 - v)^2}{(1 + v)^2}.
\tag{4.59}
\]
Interpretation
Consider the emission of a photon from the star as event A at $(0, 0, 0, 0)$ in
frame O, see Fig. 4.1. The photon travels along the x-axis, which is parallel
to the $\bar x$-axis. Consider the absorption of the photon in a detector as event B
at $(x, x, 0, 0)$ in frame O. Suppose that frame $\bar O$ coincides with frame
O at event A, so both frames observe event A at their origins. During the
time period between A and B, $\Delta t = x - 0 = x$, the reference frame $\bar O$ has
traveled a distance $v\,\Delta t$ in the x-direction. So under a Galilean transformation
the event B at $(x, x, 0, 0)$ in O would occur at $\bar x = x - vt = x(1 - v)$ in the
frame $\bar O$ moving at speed $v$ in the positive x-direction. Due to relativistic
effects, this becomes $\bar x = \gamma x (1 - v) = R$. This is different from Lorentz
contraction or time dilation because it's neither a ruler nor a clock's reading
that is being transformed between Lorentz frames but the spatial part of a
light path.
From the point of view of $\bar O$ the light emitted at event A spreads out in
a spherical shell centred at the origin of $\bar O$ despite the fact that the star was
moving when the light was emitted. In principle, this makes things simple!
$T^{\bar 0 \bar x}$ is the flux of energy across a surface of constant $\bar x = R$. But note that the
luminosity varies with reference frame. Suppose the star emits $A$ photons
per second of mean energy $h\nu$ such that $L = A h\nu$ in O, the MCRF of the
star. The emission rate is a clock of sorts, and moving clocks run slowly, so in
frame $\bar O$, the emission rate will be $A/\gamma$. But the radiation will be red-shifted
by the relativistic Doppler effect,
\[ \frac{\bar\nu}{\nu} = \gamma (1 - v) \]
see Eq. (2.39) on p. 49. Thus the luminosity observed in $\bar O$ will be reduced:
\[ \bar L = \frac{A}{\gamma}\,h\nu\,\gamma (1 - v) = L (1 - v). \]
Based on the above argument I would have anticipated
\[ T^{\bar 0 \bar x} = \frac{L}{4\pi R^2} (1 - v) \]
but this is wrong by a factor of $(1 - v)/(1 + v)^2$!
A more accurate but only partial answer is to note from the frame-invariant
expression given in 24 (b) that the components of $\mathbf{T}$ only
change because of the $\vec X \otimes \vec X$ term in the numerator, since the remaining
factors are frame-invariant. For all four non-zero terms,
\[ (\vec X \otimes \vec X)^{\bar\alpha \bar\beta} \to_{\bar O} R^2, \quad (\bar\alpha, \bar\beta) \in \{\bar 0, \bar x\}. \]
And $R/x = \gamma (1 - v) = \sqrt{\frac{1 - v}{1 + v}}$, which implies that the magnitude of the
components of $\mathbf{T}$ change like
\[ \frac{T^{\bar 0 \bar x}}{T^{0x}} = \frac{R^2}{x^2} = \frac{1 - v}{1 + v}. \]
The same result can of course be obtained directly from the Lorentz
transformation. Tensors are geometric objects and thus invariant under changes of
reference frame, and the Lorentz transformation tells us how the components
change. Applying this to $\mathbf{T}$ we find
\[ T^{\bar\alpha \bar\beta} = \Lambda^{\bar\alpha}{}_{\mu} \Lambda^{\bar\beta}{}_{\nu} T^{\mu\nu} \]
which also gives
\[ \frac{T^{\bar 0 \bar x}}{T^{0x}} = \frac{R^2}{x^2} = \frac{1 - v}{1 + v}. \]
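The results of 24 (b) and (c) can be checked by evaluating the frame-invariant expression in both frames. The sketch below is illustrative only: the helper names and the sample values of $L$, $x$ and $v$ are my own choices, and the metric signature $(-,+,+,+)$ is assumed as in Schutz.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def boost(vec, v):
    """Components of a 4-vector in a frame moving at speed v along +x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g * (vec[0] - v * vec[1]), g * (vec[1] - v * vec[0]), vec[2], vec[3])

def stress_energy(lum, u_star, x_null):
    """T^{ab} = (L / 4 pi) X^a X^b / (U_s . X)^4."""
    denom = minkowski_dot(u_star, x_null) ** 4
    return [[lum / (4.0 * math.pi) * x_null[a] * x_null[b] / denom
             for b in range(4)] for a in range(4)]

lum, x, v = 1.0, 2.0, 0.5
u_rest = (1.0, 0.0, 0.0, 0.0)   # star at rest in O
x_rest = (x, x, 0.0, 0.0)       # null vector from emission to reception

t_rest = stress_energy(lum, u_rest, x_rest)
t_moving = stress_energy(lum, boost(u_rest, v), boost(x_rest, v))

print(t_rest[0][0], lum / (4.0 * math.pi * x * x))
print(t_moving[0][1] / t_rest[0][1], (1.0 - v) / (1.0 + v))
```

The first pair confirms $T^{00} = L/(4\pi x^2)$ in the rest frame and the second pair reproduces the ratio $(1 - v)/(1 + v)$.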
25. This problem is apparently not required to understand the remainder
of the book so I've put it on hold.
Figure 4.1: Events A and B seen in the star's reference frame ($x$-$t$ axes) and
a reference frame co-moving with Earth but with origin coinciding with the star
at event A ($\bar x$-$\bar t$ axes), when radiation was emitted that reaches Earth in time
$\Delta t = x$ at event B.
Chapter 5
Preface to curvature
5.1 On the relation of gravitation to curvature
Bad choice of symbols in Eq. (5.1): h is Planck's constant when multiplied by
frequency and h is the height of the tower when multiplied by g.
5.2 Tensor calculus in polar coordinates
How did he get Eq. (5.28b) for the magnitudes of the one-form bases? Use
Eq. (3.51) but adapted to 2D Cartesian space instead of Minkowski space,
\[ \tilde p^2 = p_1^2 + p_2^2 \tag{5.1} \]
So e.g.
\[ |\tilde d r| = \sqrt{\cos^2\theta + \sin^2\theta} = 1 \quad \text{using Eq. (5.27) and (5.1)} \]
Some typos:
Eq. (5.28a), double == is just a typo.
Typo middle p. 129, just before Eq. (5.54), the final index should be a
superscript:
V
=
V
pp. 130 and 131, two separate equations labeled Eq. (5.64).
5.3 Christoffel symbols and the metric
Clarification: On p. 132, before Eq. (5.70), the superscripted index on the LHS
of
V
;
= g
;
looks odd. But it is correct. It follows because we're in Cartesian coordinates
wherein $g_{\alpha\beta} = \delta_{\alpha\beta}$.
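5.4 Noncoordinate bases
Verifying Eq. (5.78):
First we use Eqs. (5.22, 5.23) for $\vec e_r$ and $\vec e_\theta$, and Eq. (5.76) to obtain:
\[
\begin{aligned}
\vec e_{\hat r} &= \cos\theta\,\vec e_x + \sin\theta\,\vec e_y, \\
\vec e_{\hat\theta} &= -\sin\theta\,\vec e_x + \cos\theta\,\vec e_y,
\end{aligned}
\tag{5.2}
\]
and we use Eqs. (5.26, 5.27) for $\tilde d r$ and $\tilde d \theta$, and Eq. (5.77) to obtain:
\[
\begin{aligned}
\tilde d \hat r &= \cos\theta\,\tilde d x + \sin\theta\,\tilde d y, \\
\tilde d \hat\theta &= -\sin\theta\,\tilde d x + \cos\theta\,\tilde d y.
\end{aligned}
\tag{5.3}
\]
Taking the various dot products of basis vectors we find:
\[
\begin{aligned}
\vec e_{\hat r} \cdot \vec e_{\hat r} &= \cos^2\theta + \sin^2\theta = 1, \\
\vec e_{\hat r} \cdot \vec e_{\hat\theta} &= -\sin\theta \cos\theta + \sin\theta \cos\theta = 0, \\
\vec e_{\hat\theta} \cdot \vec e_{\hat\theta} &= \cos^2\theta + \sin^2\theta = 1.
\end{aligned}
\tag{5.4}
\]
Doing the same for the one-form bases gives the analogous result. If one is
uncomfortable taking dot products of one-form bases, recall Eq. (3.47) and
Eq. (3.52).
\[
\begin{aligned}
\tilde d \hat r \cdot \tilde d \hat r &= (\cos\theta\,\tilde d x + \sin\theta\,\tilde d y) \cdot (\cos\theta\,\tilde d x + \sin\theta\,\tilde d y), \\
 &= \cos^2\theta\,\tilde d x \cdot \tilde d x + \sin^2\theta\,\tilde d y \cdot \tilde d y + 2 \cos\theta \sin\theta\,\tilde d x \cdot \tilde d y, \\
 &= \cos^2\theta + \sin^2\theta = 1.
\end{aligned}
\tag{5.5}
\]
Eq. (5.84) reduces to,
\[ \frac{1}{\sqrt{x^2 + y^2}} \neq 0. \]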
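The orthonormality of the hatted basis is simple to confirm for arbitrary angles. The following sketch (function names are hypothetical) builds the unit basis vectors in Cartesian components and also shows, for contrast, that the coordinate basis vector $\vec e_\theta$ has length $r$.

```python
import math

def unit_basis(theta):
    """Cartesian components of the orthonormal polar basis vectors."""
    e_r_hat = (math.cos(theta), math.sin(theta))
    e_theta_hat = (-math.sin(theta), math.cos(theta))
    return e_r_hat, e_theta_hat

def coordinate_e_theta(r, theta):
    """Cartesian components of the coordinate basis vector e_theta."""
    return (-r * math.sin(theta), r * math.cos(theta))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

for theta in (0.0, 0.4, 2.0, 5.5):
    e_r_hat, e_theta_hat = unit_basis(theta)
    print(theta, dot(e_r_hat, e_r_hat), dot(e_r_hat, e_theta_hat),
          dot(e_theta_hat, e_theta_hat))

e_theta = coordinate_e_theta(3.0, 0.4)
print(math.sqrt(dot(e_theta, e_theta)))
```

The printed dot products are 1, 0, 1 up to rounding at every angle, while the coordinate basis vector at $r = 3$ has length 3.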
5.8 Exercises
1. Repeat the argument that led to Eq. (5.1) under more realistic assumptions:
suppose a fraction $\epsilon$ of the kinetic energy of the mass at the bottom
can be converted into a photon and sent back up, the remaining energy staying
at ground level in a useful form. Devise a perpetual motion engine if
Eq. (5.1) is violated.
Taken literally, $\epsilon$ is the fraction of the kinetic energy of the mass at the
bottom, not the fraction of the total energy. The remaining energy then
means the rest mass energy plus $(1 - \epsilon)$ of the kinetic energy. Somehow I
doubt that's what Schutz meant because it makes it needlessly complicated
but that's what he said, so we'll go with that interpretation. Conceptually
it doesn't really matter since we're supposed to just see that the Einstein
thought experiment carries forward the same message even when inefficiencies
are introduced.
Let's introduce an index $i$ to keep track of the iterations of the mass
falling and photon propagating to the top of the tower. Say we start with
mass $m_0$ at the top, it falls gaining kinetic energy
\[ \frac{m_0 g h}{c^2} = m_0 g h \]
where the constants are in geometric units so that $c = 1$ and $gh$ is
dimensionless. For Earth conditions of course $gh \ll 1$. Of this kinetic energy only
a fraction $\epsilon$ is available for generating the photon at the bottom of the tower,
\[ \epsilon m_0 g h = 2\pi\hbar\nu_0, \]
while the remaining energy is accumulated, apparently in useful form, at the
base of the tower:
\[ \bar m_0 = [(1 - \epsilon) g h + 1]\,m_0. \]
The key assumption is that the radiation is unaffected by the gravitational
field (in violation of Eq. (5.1)), yielding a photon at the top of the tower
of the same energy as at the bottom:
\[ \epsilon m_0 g h = 2\pi\hbar\nu_0. \]
Now this is converted into mass
\[ m_1 = 2\pi\hbar\nu_0 = \epsilon m_0 g h. \]
This yields kinetic energy at the base of the tower:
\[ m_1 g h = \epsilon m_0 (g h)^2, \]
of which the fraction $\epsilon$ is available for the 2nd photon:
\[ \epsilon m_1 g h = 2\pi\hbar\nu_1 = \epsilon^2 m_0 (g h)^2. \]
The remaining energy accumulates at the bottom:
\[ \bar m_1 = [(1 - \epsilon) g h + 1]\,m_1 = [(1 - \epsilon) g h + 1]\,\epsilon m_0 g h. \]
The photon energy at the top of the tower is again taken as the total energy
in the photon at the base of the tower, yielding a new mass at the top of the
tower:
\[ m_2 = 2\pi\hbar\nu_1 = \epsilon^2 m_0 (g h)^2. \]
At the bottom we generate another photon:
\[ 2\pi\hbar\nu_2 = \epsilon^3 m_0 (g h)^3, \]
and accumulate more mass:
\[ \bar m_2 = [(1 - \epsilon) g h + 1]\,m_2 = [(1 - \epsilon) g h + 1]\,(\epsilon g h)^2 m_0. \]
The process will repeat indefinitely. After $n + 1$ iterations we have accumulated
this much mass at the base of the tower:
\[ \sum_{i=0}^{n} \bar m_i = \sum_{i=0}^{n} a r^i, \]
with $a = [(1 - \epsilon) g h + 1]\,m_0$ and $r = \epsilon g h$. As $n \to \infty$, the accumulated mass
approaches
\[ M = \frac{a}{1 - r} = \frac{[(1 - \epsilon) g h + 1]\,m_0}{1 - \epsilon g h}, \]
see Boas (1983, Eq. (1.8)) for the sum of an infinite geometric series.
Assuming Earth-like values, $gh \ll 1$, and
\[ M \approx [(1 - \epsilon) g h + 1]\,m_0 (1 + \epsilon g h) \approx m_0 (1 + g h). \]
The accumulated mass is not much more than the starting mass so the process
is not an efficient way to create energy. However, we gained something for
nothing and generated an infinite process. Clearly something is wrong, and
in particular, it was the violation of Eq. (5.1) describing the gravitational
redshift. Einstein's simple thought experiment is robust to the inclusion of
inefficiencies.
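The bookkeeping of the perpetual motion engine can be replayed step by step and compared with the closed-form geometric sum. This is a small sketch under the same (deliberately wrong) assumption that the photon does not redshift; the function names and the sample values of $gh$ and $\epsilon$ are my own.

```python
def accumulated_mass(m0, gh, eps, iterations):
    """Iterate the drop/photon cycle, assuming photons do NOT redshift (c = 1)."""
    total_at_base = 0.0
    mass_at_top = m0
    for _ in range(iterations):
        kinetic = mass_at_top * gh
        # Rest mass plus the unused part of the kinetic energy stays at the base.
        total_at_base += mass_at_top + (1.0 - eps) * kinetic
        # The photon carries eps * kinetic to the top, where it becomes mass.
        mass_at_top = eps * kinetic
    return total_at_base

def closed_form(m0, gh, eps):
    """Sum of the infinite geometric series a / (1 - r)."""
    a = ((1.0 - eps) * gh + 1.0) * m0
    r = eps * gh
    return a / (1.0 - r)

m0, gh, eps = 1.0, 1e-3, 0.5
print(accumulated_mass(m0, gh, eps, 50), closed_form(m0, gh, eps), m0 * (1.0 + gh))
```

The iterated total matches $a/(1 - r)$ and exceeds the starting mass $m_0$ by about $m_0\,gh$, which is the something-for-nothing gain discussed above.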
2. A uniform external gravitational field would contribute to a uniform
acceleration for the solid Earth and its fluid envelopes, engendering no
relative motion between them.
3. (a) Consider the coordinate transformation $(x, y) \to (\xi, \eta)$ with $\xi = x$
and $\eta = 1$. Note that $\partial\eta/\partial x = 0$ and $\partial\eta/\partial y = 0$. This violates Eq. (5.6),
implying that this coordinate transformation is not good. In fact this same
example was worked out on p. 118, complete with an example of a distinct
pair of $(x, y)$ points having the same $(\xi, \eta)$ coordinates.
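3. (b) Are the following coordinate transformations good ones? Compute
the Jacobian and list any points where the transformations fail.
(i) $\xi = (x^2 + y^2)^{1/2}$, $\eta = \arctan(y/x)$. This is of course Eq. (5.3), the polar
coordinate transformation.
\[
\frac{\partial\xi}{\partial x} = \frac{x}{\sqrt{x^2 + y^2}}, \quad
\frac{\partial\xi}{\partial y} = \frac{y}{\sqrt{x^2 + y^2}}, \quad
\frac{\partial\eta}{\partial x} = \frac{-y}{x^2 + y^2}, \quad
\frac{\partial\eta}{\partial y} = \frac{x}{x^2 + y^2}.
\tag{5.6}
\]
The determinant is $1/\sqrt{x^2 + y^2}$ so I believe the only problem is at the
origin, where $r = 0$ and the derivatives above are undefined.
(ii) $\xi = \ln(x)$, $\eta = y$.
\[
\frac{\partial\xi}{\partial x} = \frac{1}{x}, \quad
\frac{\partial\xi}{\partial y} = 0, \quad
\frac{\partial\eta}{\partial x} = 0, \quad
\frac{\partial\eta}{\partial y} = 1.
\tag{5.7}
\]
The determinant is $1/x$ so the transformation fails on the line $x = 0$ and of
course for $x \le 0$, where $\xi = \ln(x)$ is undefined.
(iii) $\xi = \arctan(y/x)$, $\eta = (x^2 + y^2)^{-1/2}$.
\[
\frac{\partial\xi}{\partial x} = \frac{-y}{x^2 + y^2}, \quad
\frac{\partial\xi}{\partial y} = \frac{x}{x^2 + y^2}, \quad
\frac{\partial\eta}{\partial x} = \frac{-x}{(x^2 + y^2)^{3/2}}, \quad
\frac{\partial\eta}{\partial y} = \frac{-y}{(x^2 + y^2)^{3/2}}.
\tag{5.8}
\]
The determinant is $1/(x^2 + y^2)^{3/2}$ so I believe the only problems are at
the origin, where the derivatives above are undefined, and as $x^2 + y^2 \to \infty$,
where the determinant tends to zero.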
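The three Jacobian determinants can be checked against a finite-difference estimate at a sample point. This is a sketch only: the function names are hypothetical, the test point $(1, 2)$ is arbitrary, and `math.atan(y / x)` is used so the check is restricted to $x > 0$.

```python
import math

def jacobian_det(transform, x, y, h=1e-6):
    """Central-difference estimate of the determinant d(xi, eta)/d(x, y)."""
    xi_px, eta_px = transform(x + h, y)
    xi_mx, eta_mx = transform(x - h, y)
    xi_py, eta_py = transform(x, y + h)
    xi_my, eta_my = transform(x, y - h)
    xi_x, eta_x = (xi_px - xi_mx) / (2 * h), (eta_px - eta_mx) / (2 * h)
    xi_y, eta_y = (xi_py - xi_my) / (2 * h), (eta_py - eta_my) / (2 * h)
    return xi_x * eta_y - xi_y * eta_x

def polar(x, y):                      # case (i)
    return math.hypot(x, y), math.atan(y / x)

def log_x(x, y):                      # case (ii)
    return math.log(x), y

def angle_and_inverse_radius(x, y):   # case (iii)
    return math.atan(y / x), 1.0 / math.hypot(x, y)

x, y = 1.0, 2.0
r = math.hypot(x, y)
print(jacobian_det(polar, x, y), 1.0 / r)
print(jacobian_det(log_x, x, y), 1.0 / x)
print(jacobian_det(angle_and_inverse_radius, x, y), 1.0 / r ** 3)
```

Each numerical determinant agrees with the analytic expression quoted above, including the sign for case (iii).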
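4. A curve is defined by $x = f(\lambda)$, $y = g(\lambda)$, $0 \le \lambda \le 1$. Show that the
tangent vector $(dx/d\lambda, dy/d\lambda)$ does actually lie tangent to the curve.
The slope of the tangent to the curve is
\[
\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta\lambda \to 0} \frac{\Delta y/\Delta\lambda}{\Delta x/\Delta\lambda} = \frac{dy/d\lambda}{dx/d\lambda},
\tag{5.9}
\]
which is also the slope of the tangent vector.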
5. Sketch the curves, compare paths, nd tangent vectors when param-
eter is nil. The computations in this exercise are a bit trivial but still I
found it instructive. If one likes this sort of approach, one might like the text
by Faber (1983). The plots can be found in the accompanying Maple
TM
le
schutz2009_ch5.mw.
(a)
x = sin
y = cos (5.10)
This is a unit circle centred at the origin. When = 0 the tangent vector is
at (0, 1), points in the xdirection and has unit length.
(b)
$$x = \cos(2\pi t^2), \qquad y = \sin(2\pi t^2 + \pi). \tag{5.11}$$
This is a unit circle centred at the origin, as in (a).
The tangent vector is a bit subtle.
$$\dot x = -\sin(2\pi t^2)\,4\pi t, \qquad \dot y = \cos(2\pi t^2 + \pi)\,4\pi t. \tag{5.12}$$
When $t = 0$ the tangent vector is the zero vector. But strangely we can still identify its direction! The angle $\theta$ of the tangent vector to the $x$-axis is
$$\cot\theta = \frac{\dot x}{\dot y} = \frac{-\sin(2\pi t^2)}{\cos(2\pi t^2 + \pi)}.$$
When $t = 0$ it's clear that $\theta = \pm\pi/2$. And we can decide the sign as follows. Let $t = \epsilon$, a small and positive parameter. Then $x$ is close to unity but slightly less, and $y$ is close to zero but negative. Taking $\epsilon \to 0$ moves the point to $(1, 0)$, so the tangent vector points toward $y = -\infty$. That is, $\theta = -\pi/2$.
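The limiting direction can also be seen numerically. This is a small sketch under my own naming (`tangent`, `direction`) that evaluates the tangent vector of curve (b) at a few small positive parameter values and converts it to an angle with `math.atan2`.

```python
import math

def tangent(t):
    """Tangent (dx/dt, dy/dt) of x = cos(2 pi t^2), y = sin(2 pi t^2 + pi)."""
    dx = -math.sin(2 * math.pi * t**2) * 4 * math.pi * t
    dy = math.cos(2 * math.pi * t**2 + math.pi) * 4 * math.pi * t
    return dx, dy

def direction(t):
    """Angle of the tangent vector to the x-axis, in radians."""
    dx, dy = tangent(t)
    return math.atan2(dy, dx)

# The magnitude shrinks to zero as t -> 0+, but the angle settles down.
angles = [direction(t) for t in (1e-1, 1e-2, 1e-3)]
zero_tangent = tangent(0.0)
```

The angles approach $-\pi/2$ as $t \to 0^+$, even though the vector itself vanishes at $t = 0$.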
(c)
$$x = s, \qquad y = s + 4. \tag{5.13}$$
The path is a straight line with $y$ intercept at $y = 4$. The tangent vector is uniform, $(1, 1)$.
(d)
$$x = s^2, \qquad y = -(s - 2)(s + 2) = 4 - s^2. \tag{5.14}$$
The path is a straight line with $y$ intercept at $y = 4$, as in (c), but now we're restricted to $x \ge 0$. The tangent vector is not uniform but depends upon $s$:
$$\dot x = 2s, \qquad \dot y = -2s. \tag{5.15}$$
As in (b) the tangent vector is the zero vector at $s = 0$, but again we can still define its direction. In this case the direction is uniform (although the magnitude is not): $(1, -1)$ or $\theta = -\pi/4$ for $s > 0$ (it reverses for $s < 0$).
(e)
$$x = \mu, \qquad y = 1. \tag{5.16}$$
The path is a straight line with $y$ intercept at $y = 1$. The tangent vector is uniform:
$$\dot x = 1, \qquad \dot y = 0. \tag{5.17}$$
6. Justify the basis vectors and one-forms in Fig. 5.5.
The $\vec e_r$ vectors should point away from the origin, and be of the same length regardless of position because $|\vec e_r| = 1$, c.f. Eq. (5.28b).
The $\vec e_\theta$ vectors should be orthogonal to $\vec e_r$, and point in the anticlockwise direction about the origin. The length should increase linearly with distance from the origin, c.f. Eq. (5.28a).
The $\tilde{\mathrm d}r$ basis should be surfaces tangent to a curve of constant $r$, so orthogonal to $\vec e_r$, which point away from the origin. The amplitude (indicated by the spacing of the lines) should be constant regardless of position because $|\tilde{\mathrm d}r| = 1$, c.f. Eq. (5.28b) or (5.1) in my notes of §5.3.
The $\tilde{\mathrm d}\theta$ basis should be surfaces tangent to a curve of constant $\theta$, so orthogonal to $\vec e_\theta$. The amplitude (weaker amplitude indicated by greater spacing of the lines) should decrease with distance from the origin, $|\tilde{\mathrm d}\theta| = r^{-1}$, c.f. Eq. (5.28b).
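7. Let primed indices indicate polar coordinates and unprimed Cartesian. Find $\Lambda^{\alpha'}{}_{\beta}$ and $\Lambda^{\mu}{}_{\nu'}$.
Let's start with $\Lambda^{\mu}{}_{\nu'}$. Using Eq. (5.3) for the coordinates $(x, y)$ in terms of polar coordinate variables $(r, \theta)$, we calculate the terms of the transformation given in Eq. (5.13):
$$(\Lambda^{\mu}{}_{\nu'}) = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}. \tag{5.18}$$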
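Slightly more awkward is $\Lambda^{\alpha'}{}_{\beta}$. Use Eq. (5.8) with $\xi = r$ and $\eta = \theta$. Note for the second row we must differentiate arctan:
$$\Lambda^{\theta}{}_{x} = \frac{\partial\theta}{\partial x} = \frac{\partial \arctan(y/x)}{\partial x}. \tag{5.19}$$
I found it simpler to write $\tan\theta = y/x$ and differentiate both sides with respect to $x$:
$$\frac{\partial \tan\theta}{\partial x} = \frac{d\tan\theta}{d\theta}\,\frac{\partial\theta}{\partial x} = \frac{1}{\cos^2\theta}\,\frac{\partial\theta}{\partial x}. \tag{5.20}$$
Now we solve for
$$\frac{\partial\theta}{\partial x} = \cos^2\theta\,\frac{\partial\tan\theta}{\partial x} = \cos^2\theta\,\frac{\partial (y/x)}{\partial x} = \cos^2\theta\left(\frac{-y}{x^2}\right) = \frac{-y}{x^2+y^2}. \tag{5.21}$$
And similarly for
$$\frac{\partial\theta}{\partial y} = \cos^2\theta\,\frac{\partial\tan\theta}{\partial y} = \frac{x}{x^2+y^2}. \tag{5.22}$$
Finally we arrive at:
$$(\Lambda^{\alpha'}{}_{\beta}) = \begin{pmatrix} \dfrac{x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} \\[2ex] \dfrac{-y}{x^2+y^2} & \dfrac{x}{x^2+y^2} \end{pmatrix}. \tag{5.23}$$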
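As a check we can multiply the two transformations together to confirm that they are indeed a pair of inverses. To do this we must choose a common set of variables. Let's use $(r, \theta)$. It's straightforward to convert
$$(\Lambda^{\alpha'}{}_{\beta}) = \begin{pmatrix} \dfrac{x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} \\[2ex] \dfrac{-y}{x^2+y^2} & \dfrac{x}{x^2+y^2} \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}. \tag{5.24}$$
And then we find their product gives the identity matrix,
$$(\Lambda^{\alpha'}{}_{\beta})(\Lambda^{\beta}{}_{\nu'}) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{5.25}$$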
8. (a) Let $f = x^2 + y^2 + 2xy$, and consider two vectors in Cartesian coordinates,
$$\vec V \underset{c}{\to} (x^2 + 3y,\; y^2 + 3x), \qquad \vec W \underset{c}{\to} (1, 1). \tag{5.26}$$
Find $f$ as a function of $r$ and $\theta$, and find the components of the two vectors on the polar basis.
From Eq. (5.3) we find immediately that
$$f = r^2 + 2r^2\cos\theta\sin\theta, \quad \text{by simple substitution.} \tag{5.27}$$
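For the vectors, we need to express the Cartesian components in terms of $(r, \theta)$, and we need the transformation matrix $\Lambda^{\alpha'}{}_{\beta}$, where prime refers to the polar coordinates and unprimed the Cartesian, as in problem 7. The former is
$$\vec V \underset{c}{\to} (x^2 + 3y,\; y^2 + 3x) \underset{c}{\to} (r^2\cos^2\theta + 3r\sin\theta,\; r^2\sin^2\theta + 3r\cos\theta), \quad \text{by substitution.} \tag{5.28}$$
These are still Cartesian components, but expressed as a function of $(r, \theta)$. Using the transformation matrix from problem 7 we have
$$\vec V \underset{p}{\to} \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}\begin{pmatrix} r^2\cos^2\theta + 3r\sin\theta \\ r^2\sin^2\theta + 3r\cos\theta \end{pmatrix} = \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.29}$$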
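And similarly for $\vec W$, again using the transformation matrix from problem 7, we have simply:
$$\vec W \underset{p}{\to} \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \sin\theta + \cos\theta \\ -\dfrac{\sin\theta}{r} + \dfrac{\cos\theta}{r} \end{pmatrix}. \tag{5.30}$$
The components of the simple vector $\vec W$ are a function of $(r, \theta)$ because the basis vectors change with position, c.f. Eqs. (5.22) & (5.23).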
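8. (b) Find the components of the one-form $\tilde{\mathrm d}f$ in Cartesian coordinates.
Solution: Using Eq. (5.10) we find
$$\tilde{\mathrm d}f \underset{c}{\to} \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \underset{c}{\to} (2x + 2y,\; 2x + 2y). \tag{5.31}$$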
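(i) Find the components of the one-form $\tilde{\mathrm d}f$ in polar coordinates by direct computation.
We use result (5.27) above and Eq. (5.10),
$$\tilde{\mathrm d}f \underset{p}{\to} \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial\theta}\right) \underset{p}{\to} \left(2r + 4r\cos\theta\sin\theta,\; 2r^2(\cos^2\theta - \sin^2\theta)\right). \tag{5.32}$$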
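(ii) Find the components of the one-form $\tilde{\mathrm d}f$ in polar coordinates by transforming the Cartesian components.
We use of course Eq. (5.14) in general to relate the polar and Cartesian components of a one-form. The result (5.31) above gives the Cartesian components, and the transformation in general is Eq. (5.13) and was found above in result (5.18).
$$\begin{aligned}
(\tilde{\mathrm d}f)_{\alpha'} &= \Lambda^{\beta}{}_{\alpha'}\,(\tilde{\mathrm d}f)_{\beta}, && \text{Eq. (5.14)} \\
&= \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)\begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} \\[1.5ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} \end{pmatrix} \\
&= (2x + 2y,\; 2x + 2y)\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \\
&= \left(2r(\cos\theta + \sin\theta),\; 2r(\cos\theta + \sin\theta)\right)\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \\
&= \left(2r + 4r\cos\theta\sin\theta,\; 2r^2(\cos^2\theta - \sin^2\theta)\right),
\end{aligned} \tag{5.33}$$
which agrees, of course, with the result (5.32).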
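The agreement of the two routes in part (b) can be confirmed numerically. In this sketch the function names, the finite-difference step and the test point are all assumptions of mine; route (i) differentiates $f(r, \theta)$ directly and route (ii) contracts the Cartesian components with the matrix of result (5.18).

```python
import math

def f_polar(r, theta):
    """f = x^2 + y^2 + 2xy written in polar coordinates, result (5.27)."""
    return r**2 + 2 * r**2 * math.cos(theta) * math.sin(theta)

def df_direct(r, theta, h=1e-6):
    """(df/dr, df/dtheta) estimated by central differences."""
    d_r = (f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h)
    d_theta = (f_polar(r, theta + h) - f_polar(r, theta - h)) / (2 * h)
    return d_r, d_theta

def df_transformed(r, theta):
    """Cartesian components (2x + 2y, 2x + 2y) times d(x, y)/d(r, theta)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = r * c, r * s
    fx = fy = 2 * x + 2 * y
    return fx * c + fy * s, fx * (-r * s) + fy * (r * c)

r, theta = 1.5, 0.4  # arbitrary test point
direct = df_direct(r, theta)
transformed = df_transformed(r, theta)
```

Both tuples agree, to finite-difference accuracy, with the closed form in (5.32).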
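8. (c) (i) Find the components of the one-forms $\tilde V$ and $\tilde W$ in polar coordinates using the metric tensor.
The metric tensor in polar coordinates was given in Eq. (5.32). We found $\vec V$ in polar coordinates in problem 8(a), see result (5.29).
$$\begin{aligned}
V_{\alpha'} &= g_{\alpha'\beta'}V^{\beta'} && \text{c.f. Eq. (3.39)} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}\begin{pmatrix} V^r \\ V^\theta \end{pmatrix} = (V^r,\; r^2 V^\theta) \\
&= \left(r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta,\;\; -r^3\cos^2\theta\sin\theta + r^3\sin^2\theta\cos\theta + 3r^2(\cos^2\theta - \sin^2\theta)\right).
\end{aligned} \tag{5.34}$$
For $\tilde W$ in polar coordinates we use result (5.30):
$$W_{\alpha'} = g_{\alpha'\beta'}W^{\beta'} \;\text{(c.f. Eq. (3.39))} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}\begin{pmatrix} W^r \\ W^\theta \end{pmatrix} = \left(\sin\theta + \cos\theta,\; -r\sin\theta + r\cos\theta\right). \tag{5.35}$$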
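8. (c) (ii) Find the components of the one-forms $\tilde V$ and $\tilde W$ in polar coordinates using the transformation from Cartesian components.
Because the metric tensor in Cartesian components corresponds to the identity matrix, the components of the one-forms are identical to those of the vectors:
$$\begin{aligned}
V_{\alpha} &= g_{\alpha\beta}V^{\beta} && \text{c.f. Eq. (3.39)} \\
&= \delta_{\alpha\beta}V^{\beta}, && \text{used Eq. (5.29)} \\
&\underset{c}{\to} V^{\alpha}, && \text{also explained p. 131 bottom,}
\end{aligned} \tag{5.36}$$
so we immediately have the components of $\tilde V$ and $\tilde W$ in Cartesian components. We merely have to transform these to polar coordinates.
$$\begin{aligned}
V_{\beta'} &= \Lambda^{\alpha}{}_{\beta'}V_{\alpha} && \text{Eq. (5.14) or Eq. (3.16)} \\
&= (r^2\cos^2\theta + 3r\sin\theta,\; r^2\sin^2\theta + 3r\cos\theta)\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & +r\cos\theta \end{pmatrix} \\
&= \left(r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta,\;\; r^3(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r^2(\cos^2\theta - \sin^2\theta)\right),
\end{aligned} \tag{5.37}$$
which agrees with result (5.34). And similarly for the other one-form:
$$W_{\beta'} = \Lambda^{\alpha}{}_{\beta'}W_{\alpha} = (1, 1)\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & +r\cos\theta \end{pmatrix} = \left(\cos\theta + \sin\theta,\; r(\cos\theta - \sin\theta)\right), \tag{5.38}$$
which agrees with the result (5.35).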
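As a cross-check of 8(c), the two routes to the one-form components of $\tilde V$ can be compared numerically. This is only a sketch; the function names and test point are hypothetical, and the polar metric is taken to be diag$(1, r^2)$ as used above.

```python
import math

def v_cartesian(r, theta):
    """Cartesian components (x^2 + 3y, y^2 + 3x) evaluated at polar (r, theta)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x * x + 3 * y, y * y + 3 * x

def lower_with_metric(r, theta):
    """Route (i): transform the vector to polar components, then apply g = diag(1, r^2)."""
    c, s = math.cos(theta), math.sin(theta)
    vx, vy = v_cartesian(r, theta)
    v_r = c * vx + s * vy
    v_theta = (-s * vx + c * vy) / r
    return v_r, r**2 * v_theta

def transform_one_form(r, theta):
    """Route (ii): Cartesian one-form components equal the vector components;
    contract them with d(x, y)/d(r, theta)."""
    c, s = math.cos(theta), math.sin(theta)
    vx, vy = v_cartesian(r, theta)
    return vx * c + vy * s, vx * (-r * s) + vy * (r * c)

r, theta = 1.8, 1.1  # arbitrary test point
route_i = lower_with_metric(r, theta)
route_ii = transform_one_form(r, theta)
```

The two tuples coincide up to rounding, mirroring the agreement between (5.34) and (5.37).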
9. Draw a figure to explain Eqs. (5.38a, b).
Recall the Eqs. (5.38a, b):
$$\frac{\partial \vec e_\theta}{\partial r} = \frac{1}{r}\,\vec e_\theta, \qquad \frac{\partial \vec e_\theta}{\partial\theta} = -r\,\vec e_r.$$
We see that changing $r$ does not change the direction of the polar coordinate basis vectors. But $\vec e_\theta$ does change in magnitude since it must increase in length as one moves further from the origin, albeit relatively more slowly the farther one is from the origin, see Fig. 5.1.
Changing $\theta$ on the other hand does change the orientation of the basis vectors. Increasing $\theta$ when one is in the first quadrant for instance results in $\vec e_\theta$ pointing more toward the $-x$-direction, see Fig. 5.2.
[Figure 5.1 (axes $x$, $y$; points A and B at different $r$): Moving from point A to B we see the basis vector $\vec e_\theta$ increases in length but doesn't change direction. For larger $r$ the relative change is smaller. Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.]
10. Prove that $\nabla\vec V$ defined in Eq. (5.52) is a $\binom{1}{1}$ tensor.
First a note on terminology. One might complain that the RHS below
$$(\nabla\vec V)^{\alpha}{}_{\beta} = V^{\alpha}{}_{;\beta} \tag{5.39}$$
is not a tensor because it lacks a basis. In Chapter 3, Schutz was careful to include the basis functions but here they were lost. While this complaint is valid, Schutz is following tradition here by calling the components of a tensor a tensor for short. Hobson et al. (2009) were explicit about using such an abbreviation and noted that it is commonly done. In Schutz's partial defence here, in Eq. (5.52) he enclosed the tensor on the LHS in parentheses and pulled off the $\alpha$ and $\beta$ components explicitly; recall the discussion in my notes on chapter 2, §2.2, of this notation introduced without explanation.
Going back to the subsection "The covariant derivative" of §5.3, if we take the gradient of Eq. (5.43) instead of just the derivative, then instead of Eq. (5.46) we obtain:
$$\tilde{\mathrm d}x^{\beta} \otimes \frac{\partial\vec V}{\partial x^{\beta}} = \tilde{\mathrm d}x^{\beta} \otimes \left(\frac{\partial V^{\alpha}}{\partial x^{\beta}}\,\vec e_{\alpha} + V^{\alpha}\frac{\partial \vec e_{\alpha}}{\partial x^{\beta}}\right). \tag{5.40}$$
[Figure 5.2 (axes $x$, $y$; points A and B at different $\theta$): Moving from point A to B we see the basis vector $\vec e_\theta$ changes direction, in this case pointing more toward the $-x$-direction, but doesn't change in length. Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.]
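In other words, I've just included the one-form basis. Continuing the development on p. 128 leads up to Eq. (5.51), but with the one-form basis:
$$\tilde{\mathrm d}x^{\beta} \otimes \frac{\partial\vec V}{\partial x^{\beta}} = \tilde{\mathrm d}x^{\beta} \otimes V^{\alpha}{}_{;\beta}\,\vec e_{\alpha} = V^{\alpha}{}_{;\beta}\;\tilde{\mathrm d}x^{\beta} \otimes \vec e_{\alpha}. \tag{5.41}$$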
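To completely prove this is a $\binom{1}{1}$ tensor would include showing that it is invariant under a change of basis. This is easy if we restrict ourselves to Cartesian bases, for then we don't have to deal with the Christoffel symbol. Consider frame $\bar O$, related to frame $O$ as follows:
$$V^{\bar\alpha} = \Lambda^{\bar\alpha}{}_{\beta}V^{\beta}, \qquad \vec e_{\bar\alpha} = \Lambda^{\beta}{}_{\bar\alpha}\,\vec e_{\beta}, \qquad \tilde{\mathrm d}x^{\bar\alpha} = \Lambda^{\bar\alpha}{}_{\beta}\,\tilde{\mathrm d}x^{\beta}, \qquad \frac{\partial}{\partial x^{\bar\alpha}} = \Lambda^{\beta}{}_{\bar\alpha}\frac{\partial}{\partial x^{\beta}}, \tag{5.42}$$
where overbar indices indicate components of the $\bar O$ frame, and those without overbar indicate components of the $O$ frame regardless of whether or not they have primes. We simply substitute these into the expression for the gradient of the velocity in the frame $\bar O$; because both bases are Cartesian, the $\Lambda$ are constants and pass through the derivative:
$$\begin{aligned}
\tilde{\mathrm d}x^{\bar\beta} \otimes \frac{\partial\vec V}{\partial x^{\bar\beta}} &= \tilde{\mathrm d}x^{\bar\beta} \otimes \left(\frac{\partial V^{\bar\alpha}}{\partial x^{\bar\beta}}\,\vec e_{\bar\alpha}\right), \\
&= \Lambda^{\bar\beta}{}_{\mu}\,\tilde{\mathrm d}x^{\mu} \otimes \left(\Lambda^{\nu}{}_{\bar\beta}\,\frac{\partial (\Lambda^{\bar\alpha}{}_{\sigma}V^{\sigma})}{\partial x^{\nu}}\,\Lambda^{\gamma}{}_{\bar\alpha}\,\vec e_{\gamma}\right), \\
&= \delta^{\nu}{}_{\mu}\,\delta^{\gamma}{}_{\sigma}\;\tilde{\mathrm d}x^{\mu} \otimes \left(\frac{\partial V^{\sigma}}{\partial x^{\nu}}\,\vec e_{\gamma}\right), \\
&= \tilde{\mathrm d}x^{\mu} \otimes \left(\frac{\partial V^{\gamma}}{\partial x^{\mu}}\,\vec e_{\gamma}\right).
\end{aligned} \tag{5.43}$$
Voilà! It transforms properly such that it is frame invariant, at least for frames with a Cartesian basis.
[Schutz's solution focuses on showing linearity.]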
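11. For $\vec V$ from Exer. 8 above, find:
(a) $V^{\alpha}{}_{,\beta}$ in Cartesian coordinates.
$$(V^{\alpha}{}_{,\beta}) \underset{c}{\to} \begin{pmatrix} \dfrac{\partial V^x}{\partial x} & \dfrac{\partial V^x}{\partial y} \\[1.5ex] \dfrac{\partial V^y}{\partial x} & \dfrac{\partial V^y}{\partial y} \end{pmatrix} = \begin{pmatrix} 2x & 3 \\ 3 & 2y \end{pmatrix} = \begin{pmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{pmatrix}. \tag{5.44}$$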
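11. (b) The transformation to polar coordinates.
Transformation of vectors between different coordinates was explained in §2.2 and §5.2. Perhaps it was not thoroughly explained how to transform tensors. This exercise then is very important for clarifying that. The key ingredient is that one needs to apply a transformation matrix for each index or rank of the tensor: Eq. (5.8) for the superscript indices (what other books call the contravariant components) and Eq. (5.13) for the subscript indices (what other books call the covariant components).
$$(V^{\mu'}{}_{;\nu'}) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}\begin{pmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{pmatrix}\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \tag{5.45}$$
where
$$\begin{aligned}
A &= 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta \\
B &= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos 2\theta \\
C &= 2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \frac{3}{r}\cos 2\theta \\
D &= 2r(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3\sin 2\theta
\end{aligned} \tag{5.46}$$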
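11. (c) $V^{\mu'}{}_{;\nu'}$ directly in polar coordinates.
First we need the velocity field in polar coordinates. We use the result (5.29) found in problem 8(a) above, which was:
$$\vec V \underset{p}{\to} \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}\begin{pmatrix} r^2\cos^2\theta + 3r\sin\theta \\ r^2\sin^2\theta + 3r\cos\theta \end{pmatrix} = \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 3r\sin 2\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3\cos 2\theta \end{pmatrix}. \tag{5.47}$$
From Eq. (5.50), the velocity gradient has two parts. The first part, due to the gradient of the components:
$$(V^{\alpha'}{}_{,\beta'}) = \begin{pmatrix} \dfrac{\partial V^r}{\partial r} & \dfrac{\partial V^r}{\partial\theta} \\[1.5ex] \dfrac{\partial V^\theta}{\partial r} & \dfrac{\partial V^\theta}{\partial\theta} \end{pmatrix} = \begin{pmatrix} A & E \\ F & G \end{pmatrix} \tag{5.48}$$
where
$$\begin{aligned}
A &= 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta, \quad \text{as in (b) above} \\
E &= 3r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 6r\cos 2\theta \\
F &= -\cos^2\theta\sin\theta + \sin^2\theta\cos\theta \\
G &= r(2\sin^2\theta\cos\theta + 2\cos^2\theta\sin\theta - \cos^3\theta - \sin^3\theta) - 6\sin 2\theta
\end{aligned} \tag{5.49}$$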
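The second part is due to the gradient in basis vectors. Using Eq. (5.45) for the Christoffel symbols:
$$(V^{\mu'}\Gamma^{\alpha'}{}_{\mu'\beta'}) = \begin{pmatrix} 0 & -rV^\theta \\[0.5ex] \dfrac{V^\theta}{r} & \dfrac{V^r}{r} \end{pmatrix} = \begin{pmatrix} 0 & r^2(\cos^2\theta\sin\theta - \sin^2\theta\cos\theta) - 3r\cos 2\theta \\[0.5ex] (-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \dfrac{3\cos 2\theta}{r} & r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta \end{pmatrix} \tag{5.50}$$
Combining these two we obtain
$$(V^{\alpha'}{}_{;\beta'}) \underset{p}{\to} \tag{5.51}$$
$$\begin{pmatrix} 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta & 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos 2\theta \\[0.5ex] 2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \dfrac{3\cos 2\theta}{r} & r(2\sin^2\theta\cos\theta + 2\cos^2\theta\sin\theta) - 3\sin 2\theta \end{pmatrix} \tag{5.52}$$
And voilà, it agrees with the much simpler calculation in Cartesian coordinates, transformed to polar (5.45).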
11. (d) The divergence using results from (a).
The divergence should be frame-independent and is the trace of the matrix of the covariant derivative of the vector. Using the covariant derivative in Cartesian coordinates from (a) above:
$$V^{\alpha}{}_{,\alpha} = 2r(\cos\theta + \sin\theta). \tag{5.53}$$
11. (e) The divergence using results from (b).
$$\begin{aligned}
V^{\alpha'}{}_{;\alpha'} &= 2r(\cos^3\theta + \sin^3\theta + \cos^2\theta\sin\theta + \sin^2\theta\cos\theta), \\
&= 2r[\cos^2\theta(\cos\theta + \sin\theta) + \sin^2\theta(\sin\theta + \cos\theta)], \\
&= 2r(\cos\theta + \sin\theta).
\end{aligned} \tag{5.54}$$
And of course this agrees with the result (5.53) obtained in (d).
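11. (f) The divergence using Eq. (5.56).
Recall from (5.29) that we had:
$$\vec V \underset{p}{\to} \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.55}$$
So applying Eq. (5.56) we get
$$\nabla\cdot\vec V = \frac{1}{r}\frac{\partial}{\partial r}[rV^r] + \frac{\partial V^\theta}{\partial\theta}$$
$$= \frac{1}{r}\frac{\partial}{\partial r}\left[r\left(r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta\right)\right] + \frac{\partial}{\partial\theta}\left[-r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta)\right] \tag{5.56}$$
$$= \left[3r(\cos^3\theta + \sin^3\theta) + 12\sin\theta\cos\theta\right] + \left[2r\cos\theta\sin^2\theta - r\cos^3\theta + 2r\sin\theta\cos^2\theta - r\sin^3\theta - 6\sin(2\theta)\right] \tag{5.57}$$
$$= 2r(\cos\theta + \sin\theta). \tag{5.58}$$
And of course this also agrees with the result (5.53) obtained in (d).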
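The frame independence of the divergence can be spot-checked numerically. The sketch below is my own illustration: it applies the polar divergence formula $\frac{1}{r}\partial_r(rV^r) + \partial_\theta V^\theta$ with central differences to the polar components (5.29), and compares against the Cartesian answer $2x + 2y$. The names and the test point are assumptions.

```python
import math

def v_polar(r, theta):
    """Polar components (V^r, V^theta) from result (5.29)."""
    c, s = math.cos(theta), math.sin(theta)
    v_r = r**2 * (c**3 + s**3) + 6 * r * s * c
    v_theta = -r * c**2 * s + r * s**2 * c + 3 * (c**2 - s**2)
    return v_r, v_theta

def divergence_polar(r, theta, h=1e-6):
    """(1/r) d(r V^r)/dr + d(V^theta)/dtheta, by central differences."""
    d_r = ((r + h) * v_polar(r + h, theta)[0]
           - (r - h) * v_polar(r - h, theta)[0]) / (2 * h)
    d_theta = (v_polar(r, theta + h)[1] - v_polar(r, theta - h)[1]) / (2 * h)
    return d_r / r + d_theta

r, theta = 1.3, 0.7  # arbitrary test point
div_polar = divergence_polar(r, theta)
div_cartesian = 2 * r * (math.cos(theta) + math.sin(theta))  # 2x + 2y
```

The two numbers agree to finite-difference accuracy, as (5.53), (5.54) and (5.58) require.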
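12. Given $\tilde p$ whose components in Cartesian coordinates are the same as $\vec V$ in Exers. 8(a) and 11.
(a) Find $p_{\alpha,\beta}$ in Cartesian coordinates.
$p_{\alpha,\beta} = \partial p_\alpha/\partial x^\beta$ and we've done all these calculations in Exer. 11(a).
$$(p_{\alpha,\beta}) \underset{c}{\to} \begin{pmatrix} \dfrac{\partial p_x}{\partial x} & \dfrac{\partial p_x}{\partial y} \\[1.5ex] \dfrac{\partial p_y}{\partial x} & \dfrac{\partial p_y}{\partial y} \end{pmatrix} = \begin{pmatrix} 2x & 3 \\ 3 & 2y \end{pmatrix} = \begin{pmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{pmatrix}. \tag{5.59}$$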
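(b) Transform $p_{\alpha,\beta}$ from Cartesian coordinates to polar coordinates, $p_{\mu';\nu'}$.
Note the different notation now for the derivative: because we're using curvilinear coordinates there are derivatives of the basis vectors too, so we use the semicolon instead of the comma. Now we get something different from Exer. 11(b) because the transformation is different for one-forms than for vectors:
$$p_{\mu';\nu'} = \Lambda^{\alpha}{}_{\mu'}\Lambda^{\beta}{}_{\nu'}\,p_{\alpha,\beta}$$
Instead of using matrices, let's try using tensors as follows.
$$\begin{aligned}
p_{r;r} &= \Lambda^{\alpha}{}_{r}\Lambda^{\beta}{}_{r}\,p_{\alpha,\beta} \\
&= \frac{\partial x}{\partial r}\frac{\partial x}{\partial r}\,p_{x,x} + \frac{\partial x}{\partial r}\frac{\partial y}{\partial r}\,(p_{x,y} + p_{y,x}) + \frac{\partial y}{\partial r}\frac{\partial y}{\partial r}\,p_{y,y} \\
&= \left(\frac{\partial x}{\partial r}\right)^2 2x + \frac{\partial x}{\partial r}\frac{\partial y}{\partial r}\,(3 + 3) + \left(\frac{\partial y}{\partial r}\right)^2 2y \\
&= \left(\frac{\partial x}{\partial r}\right)^2 2r\cos\theta + \frac{\partial x}{\partial r}\frac{\partial y}{\partial r}\,(6) + \left(\frac{\partial y}{\partial r}\right)^2 2r\sin\theta \\
&= (\cos\theta)^2\,2r\cos\theta + \cos\theta\sin\theta\,(6) + (\sin\theta)^2\,2r\sin\theta \\
&= 2r(\cos^3\theta + \sin^3\theta) + 6\cos\theta\sin\theta \\
&= V^{r}{}_{;r} \quad \text{see result (5.46) above}
\end{aligned} \tag{5.60}$$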
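And
$$\begin{aligned}
p_{\theta;r} &= \Lambda^{\alpha}{}_{\theta}\Lambda^{\beta}{}_{r}\,p_{\alpha,\beta} \\
&= \frac{\partial x}{\partial\theta}\frac{\partial x}{\partial r}\,p_{x,x} + \frac{\partial x}{\partial\theta}\frac{\partial y}{\partial r}\,p_{x,y} + \frac{\partial y}{\partial\theta}\frac{\partial x}{\partial r}\,p_{y,x} + \frac{\partial y}{\partial\theta}\frac{\partial y}{\partial r}\,p_{y,y} \\
&= \frac{\partial x}{\partial\theta}\frac{\partial x}{\partial r}\,2r\cos\theta + \frac{\partial x}{\partial\theta}\frac{\partial y}{\partial r}\,3 + \frac{\partial y}{\partial\theta}\frac{\partial x}{\partial r}\,3 + \frac{\partial y}{\partial\theta}\frac{\partial y}{\partial r}\,2r\sin\theta \\
&= [-r\sin\theta\cos\theta]\,2r\cos\theta + [-r\sin^2\theta]\,3 + [r\cos^2\theta]\,3 + [r\sin\theta\cos\theta]\,2r\sin\theta \\
&= r^2\left\{2[-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] + \frac{3}{r}\cos(2\theta)\right\} \\
&= r^2\,V^{\theta}{}_{;r}
\end{aligned} \tag{5.61}$$
We can save some time by noting that, because $p_{\alpha,\beta} = p_{\beta,\alpha}$,
$$p_{r;\theta} = \Lambda^{\alpha}{}_{r}\Lambda^{\beta}{}_{\theta}\,p_{\alpha,\beta} = p_{\theta;r} \tag{5.62}$$
And finally,
$$\begin{aligned}
p_{\theta;\theta} &= \Lambda^{\alpha}{}_{\theta}\Lambda^{\beta}{}_{\theta}\,p_{\alpha,\beta} \\
&= \left(\frac{\partial x}{\partial\theta}\right)^2 p_{x,x} + \frac{\partial x}{\partial\theta}\frac{\partial y}{\partial\theta}\,(p_{x,y} + p_{y,x}) + \left(\frac{\partial y}{\partial\theta}\right)^2 p_{y,y} \\
&= \left(\frac{\partial x}{\partial\theta}\right)^2 2r\cos\theta + \frac{\partial x}{\partial\theta}\frac{\partial y}{\partial\theta}\,(6) + \left(\frac{\partial y}{\partial\theta}\right)^2 2r\sin\theta \\
&= (-r\sin\theta)^2\,2r\cos\theta + (-r^2\cos\theta\sin\theta)\,6 + (r\cos\theta)^2\,2r\sin\theta \\
&= 2r^3(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3r^2\sin(2\theta) \\
&= r^2\,V^{\theta}{}_{;\theta}
\end{aligned} \tag{5.63}$$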
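12. (c) Obtain $p_{\mu';\nu'}$ directly in polar coordinates.
We need $p_{\alpha'}$ in polar coordinates. We can use the results we obtained in Exer. 8(c). Because $\vec V$ from Exer. 8 had the same components as our one-form $\tilde p$ when both were in Cartesian components, $\vec V$ must be the vector dual of $\tilde p$. This works because $g_{\alpha\beta} = \delta_{\alpha\beta}$ for Cartesian coordinates. So $p_{\alpha'} = V_{\alpha'}$ and recalling result (5.34) from above,
$$\begin{aligned}
p_r &= r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\
p_\theta &= -r^3\cos^2\theta\sin\theta + r^3\sin^2\theta\cos\theta + 3r^2(\cos^2\theta - \sin^2\theta)
\end{aligned} \tag{5.64}$$
Let's separate out the 4 terms in Eq. (5.63) and work on them one at a time.
$$\begin{aligned}
p_{r;r} &= p_{r,r} - p_{\mu}\Gamma^{\mu}{}_{rr} && \text{Eq. (5.63)} \\
&= p_{r,r}, && \text{Eq. (5.45) gives } \Gamma^{\mu}{}_{rr} = 0 \\
&= \frac{\partial}{\partial r}\left(r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta\right) \\
&= 2r(\cos^3\theta + \sin^3\theta) + 3\sin(2\theta)
\end{aligned} \tag{5.65}$$
as in result (5.60) above.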
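$$\begin{aligned}
p_{r;\theta} &= p_{r,\theta} - p_{\mu}\Gamma^{\mu}{}_{r\theta} && \text{Eq. (5.63)} \\
&= p_{r,\theta} - p_{\theta}\Gamma^{\theta}{}_{r\theta}, && \text{only one non-zero Christoffel symbol from Eq. (5.45)} \\
&= \frac{\partial}{\partial\theta}\left(r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta\right) - p_{\theta}\left(\frac{1}{r}\right) \\
&= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos(2\theta), && \text{several terms canceled}
\end{aligned} \tag{5.66}$$
You can save yourself some work by noting that order doesn't matter for partial derivatives: $p_{\theta,r} = p_{r,\theta}$ and the latter was already calculated above. And the Christoffel symbol cannot make a difference because of the symmetry (Eq. (5.74)). Thus $p_{\theta;r} = p_{r;\theta}$:
$$\begin{aligned}
p_{\theta;r} &= p_{\theta,r} - p_{\mu}\Gamma^{\mu}{}_{\theta r} && \text{Eq. (5.63)} \\
&= p_{r,\theta} - p_{\theta}\Gamma^{\theta}{}_{\theta r} \\
&= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos(2\theta) \\
&= p_{r;\theta}
\end{aligned} \tag{5.67}$$
Finally
$$\begin{aligned}
p_{\theta;\theta} &= p_{\theta,\theta} - p_{\mu}\Gamma^{\mu}{}_{\theta\theta} && \text{Eq. (5.63)} \\
&= p_{\theta,\theta} - p_{r}\Gamma^{r}{}_{\theta\theta} \\
&= \frac{\partial}{\partial\theta}(p_\theta) - p_r\,(-r), && \text{since } \Gamma^{r}{}_{\theta\theta} = -r \\
&= 2r^3(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3r^2\sin(2\theta), && \text{several terms canceled}
\end{aligned} \tag{5.68}$$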
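13. Show that one could have obtained the results in Exer. 12(b) by lowering the index using the metric.
$$p_{\alpha';\beta'} = g_{\alpha'\mu'}V^{\mu'}{}_{;\beta'}$$
Recall in Exer. 12(b) we found
$$p_{r;r} = \Lambda^{\alpha}{}_{r}\Lambda^{\beta}{}_{r}\,p_{\alpha,\beta} = 2r(\cos^3\theta + \sin^3\theta) + 6\cos\theta\sin\theta = V^{r}{}_{;r} \tag{5.69}$$
But
$$p_{r;r} = g_{r\alpha}V^{\alpha}{}_{;r} = g_{rr}V^{r}{}_{;r} = V^{r}{}_{;r}, \quad \text{used Eq. (5.31)} \tag{5.70}$$
$$p_{\theta;r} = \Lambda^{\alpha}{}_{\theta}\Lambda^{\beta}{}_{r}\,p_{\alpha,\beta} = r^2\left\{2[-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] + \frac{3}{r}\cos(2\theta)\right\} = r^2\,V^{\theta}{}_{;r} \tag{5.71}$$
But
$$p_{\theta;r} = g_{\theta\alpha}V^{\alpha}{}_{;r} = g_{\theta\theta}V^{\theta}{}_{;r} = r^2\,V^{\theta}{}_{;r}, \quad \text{used Eq. (5.31)} \tag{5.72}$$
And
$$\begin{aligned}
p_{r;\theta} &= g_{r\alpha}V^{\alpha}{}_{;\theta} = g_{rr}V^{r}{}_{;\theta} = V^{r}{}_{;\theta}, && \text{used Eq. (5.31)} \\
&= r^2\,V^{\theta}{}_{;r}, && \text{used result (5.46)}
\end{aligned} \tag{5.73}$$
$$p_{\theta;\theta} = 2r^3[\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] - 3r^2\sin(2\theta) = r^2\,V^{\theta}{}_{;\theta} \tag{5.74}$$
But
$$p_{\theta;\theta} = g_{\theta\alpha}V^{\alpha}{}_{;\theta} = g_{\theta\theta}V^{\theta}{}_{;\theta} = r^2\,V^{\theta}{}_{;\theta}, \quad \text{used Eq. (5.31)} \tag{5.75}$$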
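14. Given a $\binom{2}{0}$ tensor $\mathbf A$ with components in polar coordinates:
$$(A^{\mu\nu}) = \begin{pmatrix} r^2 & r\sin\theta \\ r\cos\theta & \tan\theta \end{pmatrix}$$
find the components of $\nabla\mathbf A$.
From Eq. (5.65) there are three contributions:
$$A^{\mu\nu}{}_{;\beta} = A^{\mu\nu}{}_{,\beta} + A^{\mu\alpha}\Gamma^{\nu}{}_{\alpha\beta} + A^{\alpha\nu}\Gamma^{\mu}{}_{\alpha\beta}$$
The three contributions are respectively:
$$\begin{array}{l|ccc|l}
A^{\mu\nu}{}_{;\beta} & A^{\mu\nu}{}_{,\beta} & A^{\mu\alpha}\Gamma^{\nu}{}_{\alpha\beta} & A^{\alpha\nu}\Gamma^{\mu}{}_{\alpha\beta} & \text{total} \\ \hline
A^{rr}{}_{;r} & 2r & 0 & 0 & 2r \\
A^{rr}{}_{;\theta} & 0 & -r^2\sin\theta & -r^2\cos\theta & -r^2(\sin\theta + \cos\theta) \\
A^{r\theta}{}_{;r} & \sin\theta & \sin\theta & 0 & 2\sin\theta \\
A^{r\theta}{}_{;\theta} & r\cos\theta & r & -r\tan\theta & r(1 + \cos\theta - \tan\theta) \\
A^{\theta r}{}_{;r} & \cos\theta & 0 & \cos\theta & 2\cos\theta \\
A^{\theta r}{}_{;\theta} & -r\sin\theta & -r\tan\theta & r & r(1 - \sin\theta - \tan\theta) \\
A^{\theta\theta}{}_{;r} & 0 & r^{-1}\tan\theta & r^{-1}\tan\theta & 2r^{-1}\tan\theta \\
A^{\theta\theta}{}_{;\theta} & \sec^2\theta & \cos\theta & \sin\theta & \sec^2\theta + \cos\theta + \sin\theta
\end{array}$$
The above solution was verified with Maple™, please see accompanying file schutz2009_ch5.mw.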
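For readers without Maple, the table can be checked with a short script. The following is only a sketch of one way to do it, not the author's Maple worksheet: index 0 stands for $r$ and 1 for $\theta$, the polar Christoffel symbols of Eq. (5.45) are typed in by hand, and partial derivatives are approximated by central differences. All names are hypothetical.

```python
import math

R, THETA = 0, 1  # index labels for the polar coordinates

def christoffel(r):
    """gamma[mu][alpha][beta] = Gamma^mu_{alpha beta} for polar coordinates."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[R][THETA][THETA] = -r
    gamma[THETA][R][THETA] = gamma[THETA][THETA][R] = 1.0 / r
    return gamma

def tensor_a(r, theta):
    """Components A^{mu nu} given in the exercise."""
    return [[r**2, r * math.sin(theta)],
            [r * math.cos(theta), math.tan(theta)]]

def covariant_derivative(r, theta, h=1e-6):
    """nabla[mu][nu][beta] = A^{mu nu}_{;beta}."""
    gamma = christoffel(r)
    a = tensor_a(r, theta)
    plus = [tensor_a(r + h, theta), tensor_a(r, theta + h)]
    minus = [tensor_a(r - h, theta), tensor_a(r, theta - h)]
    nabla = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for mu in range(2):
        for nu in range(2):
            for beta in range(2):
                partial = (plus[beta][mu][nu] - minus[beta][mu][nu]) / (2 * h)
                connection = sum(a[mu][al] * gamma[nu][al][beta]
                                 + a[al][nu] * gamma[mu][al][beta]
                                 for al in range(2))
                nabla[mu][nu][beta] = partial + connection
    return nabla

r, theta = 2.0, 0.5  # arbitrary test point with tan(theta) finite
nabla_a = covariant_derivative(r, theta)
```

For example, `nabla_a[R][R][THETA]` should reproduce $-r^2(\sin\theta + \cos\theta)$ and `nabla_a[THETA][THETA][THETA]` should reproduce $\sec^2\theta + \cos\theta + \sin\theta$ at the chosen point.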
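15. Given the uniform vector in polar coordinates $V^r = 1$ and $V^\theta = 0$, which points radially from the origin, find
$$V^{\alpha}{}_{;\beta;\gamma}$$
In principle this is quite straightforward, but there are several places one might slip up. First I think it's a good idea to write down the general expression, and then substitute the given vector field. Write
$$T^{\alpha}{}_{\beta} \equiv V^{\alpha}{}_{;\beta} = V^{\alpha}{}_{,\beta} + V^{\mu}\Gamma^{\alpha}{}_{\mu\beta} \qquad \text{Eq. (5.64), the 2nd one that is} \tag{5.76}$$
Don't substitute the given vector at this point because we're still going to take another derivative:
$$V^{\alpha}{}_{;\beta;\gamma} \equiv T^{\alpha}{}_{\beta;\gamma} = T^{\alpha}{}_{\beta,\gamma} + T^{\mu}{}_{\beta}\Gamma^{\alpha}{}_{\mu\gamma} - T^{\alpha}{}_{\mu}\Gamma^{\mu}{}_{\beta\gamma} \qquad \text{Eq. (5.66)} \tag{5.77}$$
Now it's straightforward substitution. The problem simplifies tremendously because there are only three nonzero components of the Christoffel symbol, c.f. Eq. (5.45):
$$\Gamma^{\theta}{}_{r\theta} = \Gamma^{\theta}{}_{\theta r} = \frac{1}{r}, \qquad \Gamma^{r}{}_{\theta\theta} = -r \tag{5.78}$$
To help you debug, I've listed the contributions of the three terms of (5.77) separately and in order (omitting terms that vanish):
$$\begin{aligned}
V^{\theta}{}_{;\theta;r} &= -\frac{1}{r^2} + \frac{1}{r^2} - \frac{1}{r^2} = -\frac{1}{r^2} \\
V^{\theta}{}_{;r;\theta} &= -\frac{1}{r^2} \quad \text{(third term only)} \\
V^{r}{}_{;\theta;\theta} &= -1 \quad \text{(second term only)}
\end{aligned} \tag{5.79}$$
The remaining components vanish.
The above solution was verified with Maple™, please see accompanying file schutz2009_ch5.mw.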
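The second covariant derivative can also be checked with a few lines of code. This sketch is my own and not the Maple worksheet: it assumes the polar Christoffel symbols of Eq. (5.45), the usual rule for differentiating a $\binom{1}{1}$ tensor (one positive connection term for the upper index, one negative for the lower), index 0 for $r$ and 1 for $\theta$, and central differences for the partial derivatives.

```python
R, THETA = 0, 1  # index labels for the polar coordinates

def christoffel(r):
    """gamma[mu][alpha][beta] = Gamma^mu_{alpha beta} for polar coordinates."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[R][THETA][THETA] = -r
    gamma[THETA][R][THETA] = gamma[THETA][THETA][R] = 1.0 / r
    return gamma

def first_derivative(r, theta):
    """T[al][be] = V^al_{;be} for the field with constant components V = (1, 0)."""
    v = [1.0, 0.0]
    gamma = christoffel(r)
    # The partial derivatives of constant components vanish,
    # so only the connection term V^mu Gamma^al_{mu be} survives.
    return [[sum(v[mu] * gamma[al][mu][be] for mu in range(2))
             for be in range(2)] for al in range(2)]

def second_derivative(r, theta, h=1e-5):
    """S[al][be][ga] = V^al_{;be;ga}."""
    gamma = christoffel(r)
    t = first_derivative(r, theta)
    plus = [first_derivative(r + h, theta), first_derivative(r, theta + h)]
    minus = [first_derivative(r - h, theta), first_derivative(r, theta - h)]
    s = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for al in range(2):
        for be in range(2):
            for ga in range(2):
                partial = (plus[ga][al][be] - minus[ga][al][be]) / (2 * h)
                upper = sum(t[mu][be] * gamma[al][mu][ga] for mu in range(2))
                lower = sum(t[al][mu] * gamma[mu][be][ga] for mu in range(2))
                s[al][be][ga] = partial + upper - lower
    return s

r = 2.0  # arbitrary radius; the result does not depend on theta
second = second_derivative(r, 0.3)
```

At $r = 2$ the three non-zero entries are `second[THETA][THETA][R]` and `second[THETA][R][THETA]`, both close to $-1/r^2 = -0.25$, and `second[R][THETA][THETA]`, close to $-1$.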
16. Fill in steps between Eq. (5.74) and Eq. (5.75).
There are no steps to fill in! He has explained every step in detail. See Exer. 20 for a problem that forces one to thoroughly understand each of these steps.
17. Discover how $V^{\beta}{}_{,\alpha}$ transforms under a change of coordinates. Do the same for $V^{\mu}\Gamma^{\beta}{}_{\mu\alpha}$.
I've created SP.3 and SP.4 as an alternative to this problem. They carry the same message but in a more straightforward way that follows naturally from what we did in Chapter 2 for vectors.
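18. Verify Eq. (5.78).
Recall Eq. (5.78) gave
$$\vec e_{\hat\alpha}\cdot\vec e_{\hat\beta} \equiv g_{\hat\alpha\hat\beta} = \delta_{\hat\alpha\hat\beta}, \qquad \tilde\omega^{\hat\alpha}\cdot\tilde\omega^{\hat\beta} \equiv g^{\hat\alpha\hat\beta} = \delta^{\hat\alpha\hat\beta} \tag{5.80}$$
So in the first line there are two equalities to verify. Recall how we obtain the components of any tensor, Eq. (3.21), and the metric tensor in particular, Eq. (3.5), consistent with:
$$\vec e_{\hat\alpha}\cdot\vec e_{\hat\beta} \equiv g_{\hat\alpha\hat\beta}$$
For the second equality of the first line, we have four terms to verify.
$$\begin{aligned}
\vec e_{\hat r}\cdot\vec e_{\hat r} &= \vec e_r\cdot\vec e_r, && \text{substituted Eq. (5.76)} \\
&= g_{rr}, && \text{Eq. (3.5)} \\
&= 1, && \text{Eq. (5.31a)}
\end{aligned} \tag{5.81}$$
$$\begin{aligned}
\vec e_{\hat r}\cdot\vec e_{\hat\theta} &= \vec e_r\cdot\frac{1}{r}\vec e_\theta, && \text{substituted Eq. (5.76)} \\
&= \frac{1}{r}\,g_{r\theta}, && \text{Eq. (3.5)} \\
&= 0, && \text{Eq. (5.31b)} \\
&= \vec e_{\hat\theta}\cdot\vec e_{\hat r}, && \text{order of dot product above doesn't matter.}
\end{aligned} \tag{5.82}$$
$$\begin{aligned}
\vec e_{\hat\theta}\cdot\vec e_{\hat\theta} &= \frac{1}{r}\vec e_\theta\cdot\frac{1}{r}\vec e_\theta, && \text{substituted Eq. (5.76)} \\
&= \frac{1}{r^2}\,g_{\theta\theta}, && \text{Eq. (3.5)} \\
&= 1, && \text{Eq. (5.31a)}
\end{aligned} \tag{5.83}$$
So in the second line there are two equalities to verify. Recall how we obtain the components of any tensor, Eq. (3.21), consistent with:
$$\tilde\omega^{\hat\alpha}\cdot\tilde\omega^{\hat\beta} \equiv g^{\hat\alpha\hat\beta}$$
For the second equality of the second line, we have four terms to verify.
$$\begin{aligned}
\tilde\omega^{\hat r}\cdot\tilde\omega^{\hat r} &= \tilde{\mathrm d}r\cdot\tilde{\mathrm d}r, && \text{substituted Eq. (5.77)} \\
&= g^{rr}, && \text{see p. 124} \\
&= 1, && \text{Eq. (5.34)}
\end{aligned} \tag{5.84}$$
$$\begin{aligned}
\tilde\omega^{\hat r}\cdot\tilde\omega^{\hat\theta} &= \tilde{\mathrm d}r\cdot r\,\tilde{\mathrm d}\theta, && \text{substituted Eq. (5.77)} \\
&= r\,g^{r\theta}, && \text{see p. 124} \\
&= 0, && \text{Eq. (5.34)} \\
&= \tilde\omega^{\hat\theta}\cdot\tilde\omega^{\hat r}, && \text{order of dot product above doesn't matter.}
\end{aligned} \tag{5.85}$$
$$\begin{aligned}
\tilde\omega^{\hat\theta}\cdot\tilde\omega^{\hat\theta} &= r\,\tilde{\mathrm d}\theta\cdot r\,\tilde{\mathrm d}\theta, && \text{substituted Eq. (5.77)} \\
&= r^2\,g^{\theta\theta}, && \text{see p. 124} \\
&= 1, && \text{Eq. (5.34)}
\end{aligned} \tag{5.86}$$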
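19. Repeating the argument from Eq. (5.81) to Eq. (5.84) using $\tilde{\mathrm d}r$ and $\tilde{\mathrm d}\theta$ leads to the conclusion that these are a coordinate basis.
We simply repeat the argument but instead of substituting Eq. (5.77) we use $\tilde{\mathrm d}r$ and $\tilde{\mathrm d}\theta$. We find the 2nd line of Eq. (5.81) changes:
$$\begin{aligned}
\tilde{\mathrm d}r &= \cos\theta\,\tilde{\mathrm d}x + \sin\theta\,\tilde{\mathrm d}y, && \text{used Eq. (5.27)} \\
\tilde{\mathrm d}\theta &= -\frac{1}{r}\sin\theta\,\tilde{\mathrm d}x + \frac{1}{r}\cos\theta\,\tilde{\mathrm d}y, && \text{used Eq. (5.26)}
\end{aligned} \tag{5.87}$$
So now instead of Eq. (5.82) we get
$$\frac{\partial\eta}{\partial x} = -\frac{1}{r}\sin\theta, \qquad \frac{\partial\eta}{\partial y} = \frac{1}{r}\cos\theta \tag{5.88}$$
So instead of Eq. (5.83) we have factors of $1/r$ on both sides,
$$\begin{aligned}
\frac{\partial^2\eta}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left(-\frac{1}{r}\sin\theta\right) = \frac{\partial}{\partial y}\left(\frac{-y}{x^2+y^2}\right) \\
&= \frac{-1}{x^2+y^2} + \frac{2y^2}{(x^2+y^2)^2}, && \text{chain rule} \\
&= -\frac{x^2+y^2}{(x^2+y^2)^2} + \frac{2y^2}{(x^2+y^2)^2} \\
&= \frac{-x^2+y^2}{(x^2+y^2)^2}
\end{aligned} \tag{5.89}$$
On the other hand,
$$\begin{aligned}
\frac{\partial^2\eta}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left(\frac{1}{r}\cos\theta\right) = \frac{\partial}{\partial x}\left(\frac{x}{x^2+y^2}\right) \\
&= \frac{1}{x^2+y^2} - \frac{2x^2}{(x^2+y^2)^2} \\
&= \frac{-x^2+y^2}{(x^2+y^2)^2} \\
&= \frac{\partial^2\eta}{\partial y\,\partial x}
\end{aligned} \tag{5.90}$$
Thus the basis is consistent with a coordinate basis. See SP.5 to complete the proof that $\tilde{\mathrm d}r$ and $\tilde{\mathrm d}\theta$ are a coordinate basis.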
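20. For a noncoordinate basis $\{\vec e_\mu\}$, define
$$\nabla_{\vec e_\mu}\vec e_\nu - \nabla_{\vec e_\nu}\vec e_\mu \equiv c^{\alpha}{}_{\mu\nu}\,\vec e_\alpha$$
and use this in place of Eq. (5.74) to generalize Eq. (5.75).
The Christoffel symbol [of the second kind] arises from the derivative of the basis vectors, as in Eq. (5.44). For coordinate bases it is symmetric on the two lower indices, Eq. (5.74), and this was used to derive Eq. (5.75). Now we must derive the generalization of Eq. (5.75) appropriate for noncoordinate bases.
It is clear that $c^{\alpha}{}_{\mu\nu}$ so defined is antisymmetric on the two lower indices. It is convenient to introduce a new symbol for the connection coefficient appropriate for the noncoordinate bases as follows:
$$\nabla_{\vec e_\mu}\vec e_\nu \equiv \tilde\Gamma^{\alpha}{}_{\nu\mu}\,\vec e_\alpha = \left(\Gamma^{\alpha}{}_{\nu\mu} + \frac{1}{2}c^{\alpha}{}_{\mu\nu}\right)\vec e_\alpha \tag{5.91}$$
For each $\alpha$, $\tilde\Gamma^{\alpha}{}_{\nu\mu}$ is a $4\times 4$ matrix and it can be uniquely decomposed into its symmetric and antisymmetric parts. The $c^{\alpha}{}_{\mu\nu}$ is then identified with twice the antisymmetric part and the regular Christoffel symbol with the symmetric part. We then replace $\Gamma \to \tilde\Gamma$ in Eq. (5.72) and repeat the steps on p. 134:
$$\begin{aligned}
g_{\alpha\beta,\mu} &= \tilde\Gamma^{\nu}{}_{\alpha\mu}\,g_{\nu\beta} + \tilde\Gamma^{\nu}{}_{\beta\mu}\,g_{\alpha\nu} \\
g_{\alpha\mu,\beta} &= \tilde\Gamma^{\nu}{}_{\alpha\beta}\,g_{\nu\mu} + \tilde\Gamma^{\nu}{}_{\mu\beta}\,g_{\alpha\nu} \\
-g_{\beta\mu,\alpha} &= -\tilde\Gamma^{\nu}{}_{\beta\alpha}\,g_{\nu\mu} - \tilde\Gamma^{\nu}{}_{\mu\alpha}\,g_{\beta\nu}
\end{aligned} \tag{5.92}$$
Simply add both sides, grouping terms with common factors of $g$, remembering that we can still exploit the symmetry of $g$. We get of course the same as Schutz on p. 134 but with $\tilde\Gamma$:
$$g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} = (\tilde\Gamma^{\nu}{}_{\alpha\mu} - \tilde\Gamma^{\nu}{}_{\mu\alpha})\,g_{\nu\beta} + (\tilde\Gamma^{\nu}{}_{\alpha\beta} - \tilde\Gamma^{\nu}{}_{\beta\alpha})\,g_{\nu\mu} + (\tilde\Gamma^{\nu}{}_{\beta\mu} + \tilde\Gamma^{\nu}{}_{\mu\beta})\,g_{\alpha\nu} \tag{5.93}$$
Now we exploit the fact that $\tilde\Gamma$ has been represented in terms of its symmetric and antisymmetric parts. That is,
$$(\tilde\Gamma^{\nu}{}_{\alpha\mu} - \tilde\Gamma^{\nu}{}_{\mu\alpha}) = c^{\nu}{}_{\mu\alpha}, \qquad (\tilde\Gamma^{\nu}{}_{\alpha\beta} - \tilde\Gamma^{\nu}{}_{\beta\alpha}) = c^{\nu}{}_{\beta\alpha}, \qquad (\tilde\Gamma^{\nu}{}_{\beta\mu} + \tilde\Gamma^{\nu}{}_{\mu\beta}) = 2\Gamma^{\nu}{}_{\beta\mu} \tag{5.94}$$
which gives us
$$\begin{aligned}
g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} &= c^{\nu}{}_{\mu\alpha}\,g_{\nu\beta} + c^{\nu}{}_{\beta\alpha}\,g_{\nu\mu} + 2\Gamma^{\nu}{}_{\beta\mu}\,g_{\alpha\nu} \\
&= c_{\beta\mu\alpha} + c_{\mu\beta\alpha} + 2\Gamma_{\alpha\beta\mu}, && \text{lowered index}
\end{aligned} \tag{5.95}$$
We're almost there! Multiply by $(1/2)\,g^{\alpha\gamma}$ and solve for the Christoffel symbol:
$$\begin{aligned}
\Gamma^{\gamma}{}_{\beta\mu} &= \frac{1}{2}g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} - c^{\nu}{}_{\mu\alpha}\,g_{\nu\beta} - c^{\nu}{}_{\beta\alpha}\,g_{\nu\mu}\right) \\
&= \frac{1}{2}g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} - c_{\beta\mu\alpha} - c_{\mu\beta\alpha}\right)
\end{aligned} \tag{5.96}$$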
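21a. Hold $\lambda$ fixed and let $a$ vary in Eq. (5.96), showing that these coordinate curves are orthogonal to the world lines (coordinate curves obtained with $a$ fixed and $\lambda$ varying).
Let $\vec A$ be the tangent to the curve with fixed $\lambda$ and varying $a$:
$$\vec A = \vec e_0\,\frac{\partial t(a, \lambda)}{\partial a} + \vec e_x\,\frac{\partial x(a, \lambda)}{\partial a} = \vec e_0\sinh(\lambda) + \vec e_x\cosh(\lambda) \tag{5.97}$$
Let $\vec B$ be the tangent to the curve with $a$ fixed and varying $\lambda$ (i.e. the world lines):
$$\vec B = \vec e_0\,\frac{\partial t(a, \lambda)}{\partial\lambda} + \vec e_x\,\frac{\partial x(a, \lambda)}{\partial\lambda} = \vec e_0\,a\cosh(\lambda) + \vec e_x\,a\sinh(\lambda) \tag{5.98}$$
Now it's easy to show that
$$\vec A\cdot\vec B = -a\sinh(\lambda)\cosh(\lambda) + a\sinh(\lambda)\cosh(\lambda) = 0 \tag{5.99}$$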
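The orthogonality is easy to confirm numerically. In this sketch the function names and sample values are mine, and the dot product is the 1+1 Minkowski one with signature $(-, +)$ used in (5.99).

```python
import math

def minkowski_dot(u, v):
    """Dot product of (t, x) components with signature (-, +)."""
    return -u[0] * v[0] + u[1] * v[1]

def tangent_fixed_lambda(a, lam):
    """d(t, x)/da for t = a sinh(lam), x = a cosh(lam), as in (5.97)."""
    return (math.sinh(lam), math.cosh(lam))

def tangent_fixed_a(a, lam):
    """d(t, x)/dlam along a world line, as in (5.98)."""
    return (a * math.cosh(lam), a * math.sinh(lam))

# A handful of arbitrary (a, lambda) pairs.
samples = [(a, lam) for a in (0.5, 1.0, 3.0) for lam in (-1.0, 0.0, 2.0)]
dots = [minkowski_dot(tangent_fixed_lambda(a, lam), tangent_fixed_a(a, lam))
        for a, lam in samples]
```

Every entry of `dots` vanishes up to rounding. With a Euclidean dot product the same vectors would not be orthogonal, so the minus sign in the metric is doing the work here.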
21b. Show that Eq. (5.96) defines a transformation from coordinates $(t, x)$ to coordinates $(\lambda, a)$ that form an orthogonal coordinate system. Draw these coordinates and show they only cover half of the original $t$–$x$ plane. Apparently there are two disjoint quadrants separated by $|t| = |x|$.
The transformation matrix is
$$(\Lambda^{\alpha}{}_{\beta'}) = \begin{pmatrix} \dfrac{\partial t}{\partial\lambda} & \dfrac{\partial t}{\partial a} \\[1.5ex] \dfrac{\partial x}{\partial\lambda} & \dfrac{\partial x}{\partial a} \end{pmatrix} = \begin{pmatrix} a\cosh(\lambda) & \sinh(\lambda) \\ a\sinh(\lambda) & \cosh(\lambda) \end{pmatrix} \tag{5.100}$$
And the determinant of this transformation matrix is
$$\det(\Lambda^{\alpha}{}_{\beta'}) = a$$
so for $a \ne 0$ this is a legitimate transformation.
The basis vectors will be
$$\vec e_{\beta'} = \Lambda^{\alpha}{}_{\beta'}\,\vec e_{\alpha}$$
so that we have already found the new basis vectors in terms of the old:
$$\vec e_a = \vec A, \qquad \vec e_\lambda = \vec B \tag{5.101}$$
and shown they were orthogonal in part (a) above.
"Plotting the coordinates" sounds vague, but certainly the easiest thing to do is plot the curves in the $t$–$x$ plane obtained by holding one of the $\lambda$–$a$ pair fixed and varying the other. These are coordinate curves. Before plotting these coordinate curves (see Fig. 5.3) it's useful to reflect on what they will look like. Say $a = 1$, and $\lambda \ll -1$, then
$$\begin{aligned}
t(1, \lambda) &= 1\sinh(\lambda) = \frac{\exp(\lambda) - \exp(-\lambda)}{2} \approx -\frac{\exp(-\lambda)}{2} \\
x(1, \lambda) &= 1\cosh(\lambda) = \frac{\exp(\lambda) + \exp(-\lambda)}{2} \approx +\frac{\exp(-\lambda)}{2}
\end{aligned} \tag{5.102}$$
so the curve approaches the straight line $t = -x$ in the limit $\lambda \to -\infty$. And the family of curves approach this same limit regardless of the value of $a$ as long as it's finite. (We'll discuss $a < 0$ in a minute. For now think of $a > 0$.) At $\lambda = 0$ we have $t = 0$ and $x = a$. So for various $a > 0$ we have a family of curves in the 4th quadrant ($t \le 0$ and $x > 0$).
For $\lambda > 0$ the curves in the 1st quadrant are the reflection about the $x$-axis at $t = 0$ of the curves we just described in the 4th quadrant. In particular, all curves with $a > 0$ approach $t = x$ as $\lambda \to +\infty$.
For $a < 0$ we have a family of curves in the 2nd and 3rd quadrants that are the reflection about the $t$-axis at $x = 0$ of the curves we just described in the 1st and 4th quadrants respectively. Thus we see that the regions between $t = x$ and $t = -x$ with $|t| > |x|$ are not parameterized by $(\lambda, a)$.
[Figure 5.3 (axes $x$, $t$): Coordinate curves for Eq. (5.96). Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.]
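21c. Metric tensor and Christoffel symbols.
We know the metric is $\eta_{\alpha\beta}$ in $t$–$x$ coordinates. So we transform this metric to $\lambda$–$a$ coordinates as we did in §5.2:
$$g_{\alpha'\beta'} = \frac{\partial x^{\mu}}{\partial x^{\alpha'}}\frac{\partial x^{\nu}}{\partial x^{\beta'}}\,\eta_{\mu\nu} \tag{5.103}$$
Let's do this one component at a time. There are only 3 to check since $g_{\alpha'\beta'} = g_{\beta'\alpha'}$:
$$\begin{aligned}
g_{0'0'} \equiv g_{\lambda\lambda} &= \frac{\partial x^{\mu}}{\partial\lambda}\frac{\partial x^{\nu}}{\partial\lambda}\,\eta_{\mu\nu}, && \text{just rewriting} \\
&= \left(\frac{\partial t}{\partial\lambda}\right)^2\eta_{00} + \left(\frac{\partial x}{\partial\lambda}\right)^2\eta_{11} \\
&= \left(\frac{\partial\, a\sinh\lambda}{\partial\lambda}\right)^2(-1) + \left(\frac{\partial\, a\cosh\lambda}{\partial\lambda}\right)^2(+1) \\
&= (a\cosh\lambda)^2(-1) + (a\sinh\lambda)^2(+1) \\
&= -a^2
\end{aligned} \tag{5.104}$$
And the off-diagonal term:
$$\begin{aligned}
g_{0'1'} \equiv g_{\lambda a} &= \frac{\partial x^{\mu}}{\partial\lambda}\frac{\partial x^{\nu}}{\partial a}\,\eta_{\mu\nu}, && \text{just rewriting} \\
&= \frac{\partial t}{\partial\lambda}\frac{\partial t}{\partial a}\,\eta_{00} + \frac{\partial x}{\partial\lambda}\frac{\partial x}{\partial a}\,\eta_{11} \\
&= \left(\frac{\partial\, a\sinh\lambda}{\partial\lambda}\right)\left(\frac{\partial\, a\sinh\lambda}{\partial a}\right)(-1) + \left(\frac{\partial\, a\cosh\lambda}{\partial\lambda}\right)\left(\frac{\partial\, a\cosh\lambda}{\partial a}\right)(+1) \\
&= a\cosh\lambda\sinh\lambda\,(-1) + a\sinh\lambda\cosh\lambda\,(+1) \\
&= 0 \\
&= g_{a\lambda}, && \text{symmetry of the metric tensor.}
\end{aligned} \tag{5.105}$$
The final component:
$$\begin{aligned}
g_{1'1'} \equiv g_{aa} &= \frac{\partial x^{\mu}}{\partial a}\frac{\partial x^{\nu}}{\partial a}\,\eta_{\mu\nu}, && \text{just rewriting} \\
&= \frac{\partial t}{\partial a}\frac{\partial t}{\partial a}\,\eta_{00} + \frac{\partial x}{\partial a}\frac{\partial x}{\partial a}\,\eta_{11} \\
&= \left(\frac{\partial\, a\sinh\lambda}{\partial a}\right)^2(-1) + \left(\frac{\partial\, a\cosh\lambda}{\partial a}\right)^2(+1) \\
&= (\sinh\lambda)^2(-1) + (\cosh\lambda)^2(+1) \\
&= 1
\end{aligned} \tag{5.106}$$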
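There are only $2 \times 3 = 6$ components of the Christoffel symbol to compute. We use Eq. (5.75) since we're in a coordinate basis. We'll need the inverse metric tensor
$$(g^{\alpha'\beta'}) = \begin{pmatrix} -a^2 & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} -a^{-2} & 0 \\ 0 & 1 \end{pmatrix} \tag{5.107}$$
$$\Gamma^{\gamma}{}_{\beta\mu} = \frac{1}{2}g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha}\right) \tag{5.108}$$
The metric depends only upon $a$ since nowhere does $\lambda$ appear in our components of $g$ above. So we immediately conclude
$$\Gamma^{\lambda}{}_{\lambda\lambda} = 0. \tag{5.109}$$
$$\begin{aligned}
\Gamma^{\lambda}{}_{\lambda a} &= \frac{1}{2}g^{\alpha\lambda}\left(g_{\alpha\lambda,a} + g_{\alpha a,\lambda} - g_{\lambda a,\alpha}\right) \\
&= \frac{1}{2}g^{\lambda\lambda}\left(g_{\lambda\lambda,a} + g_{\lambda a,\lambda} - g_{\lambda a,\lambda}\right), && \text{diagonal metric} \\
&= \frac{1}{2}g^{\lambda\lambda}\left(g_{\lambda\lambda,a}\right), && \text{diagonal metric} \\
&= \frac{1}{2}\left(\frac{-1}{a^2}\right)\frac{\partial(-a^2)}{\partial a} = \frac{1}{a} \\
&= \Gamma^{\lambda}{}_{a\lambda} && \text{Eq. (5.74)}
\end{aligned} \tag{5.110}$$
$$\begin{aligned}
\Gamma^{\lambda}{}_{aa} &= \frac{1}{2}g^{\alpha\lambda}\left(g_{\alpha a,a} + g_{\alpha a,a} - g_{aa,\alpha}\right) \\
&= \frac{1}{2}g^{\lambda\lambda}\left(g_{\lambda a,a} + g_{\lambda a,a} - g_{aa,\lambda}\right) \\
&= \frac{1}{2}g^{\lambda\lambda}\left(-g_{aa,\lambda}\right), && \text{diagonal metric} \\
&= 0.
\end{aligned} \tag{5.111}$$
$$\begin{aligned}
\Gamma^{a}{}_{\lambda\lambda} &= \frac{1}{2}g^{\alpha a}\left(g_{\alpha\lambda,\lambda} + g_{\alpha\lambda,\lambda} - g_{\lambda\lambda,\alpha}\right) \\
&= \frac{1}{2}g^{aa}\left(g_{a\lambda,\lambda} + g_{a\lambda,\lambda} - g_{\lambda\lambda,a}\right) \\
&= \frac{1}{2}g^{aa}\left(-g_{\lambda\lambda,a}\right), && \text{diagonal metric} \\
&= -\frac{1}{2}(+1)\frac{\partial(-a^2)}{\partial a} = a
\end{aligned} \tag{5.112}$$
$$\begin{aligned}
\Gamma^{a}{}_{\lambda a} &= \frac{1}{2}g^{\alpha a}\left(g_{\alpha\lambda,a} + g_{\alpha a,\lambda} - g_{\lambda a,\alpha}\right) \\
&= \frac{1}{2}g^{aa}\left(g_{a\lambda,a} + g_{aa,\lambda} - g_{\lambda a,a}\right) \\
&= \frac{1}{2}g^{aa}\left(+g_{aa,\lambda}\right) \\
&= 0.
\end{aligned} \tag{5.113}$$
$$\begin{aligned}
\Gamma^{a}{}_{aa} &= \frac{1}{2}g^{\alpha a}\left(g_{\alpha a,a} + g_{\alpha a,a} - g_{aa,\alpha}\right) \\
&= \frac{1}{2}g^{aa}\left(g_{aa,a} + g_{aa,a} - g_{aa,a}\right) \\
&= 0.
\end{aligned} \tag{5.114}$$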
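The six Christoffel symbols can be recomputed mechanically from the metric. The following sketch is my own illustration of Eq. (5.108): index 0 stands for $\lambda$ and 1 for $a$, the metric diag$(-a^2, 1)$ and its inverse are typed in from (5.104) to (5.107), and the metric derivatives are taken by central differences. The function names and sample point are assumptions.

```python
LAM, A = 0, 1  # index labels for the (lambda, a) coordinates

def metric(lam, a):
    """g_{alpha beta} = diag(-a^2, 1), results (5.104)-(5.106)."""
    return [[-a**2, 0.0], [0.0, 1.0]]

def inverse_metric(lam, a):
    """g^{alpha beta}, result (5.107)."""
    return [[-1.0 / a**2, 0.0], [0.0, 1.0]]

def metric_derivative(lam, a, h=1e-6):
    """dg[mu][al][be] = d g_{al be} / d x^mu by central differences."""
    plus = [metric(lam + h, a), metric(lam, a + h)]
    minus = [metric(lam - h, a), metric(lam, a - h)]
    return [[[(plus[mu][al][be] - minus[mu][al][be]) / (2 * h)
              for be in range(2)] for al in range(2)] for mu in range(2)]

def christoffel(lam, a):
    """gamma[gam][be][mu] = 0.5 g^{al gam} (g_{al be,mu} + g_{al mu,be} - g_{be mu,al})."""
    g_inv = inverse_metric(lam, a)
    dg = metric_derivative(lam, a)
    return [[[0.5 * sum(g_inv[al][gam]
                        * (dg[mu][al][be] + dg[be][al][mu] - dg[al][be][mu])
                        for al in range(2))
              for mu in range(2)] for be in range(2)] for gam in range(2)]

a0 = 1.7  # arbitrary non-zero value of a
gamma = christoffel(0.3, a0)
```

The only non-zero entries should be `gamma[LAM][LAM][A]` and `gamma[LAM][A][LAM]`, both close to $1/a$, and `gamma[A][LAM][LAM]`, close to $a$.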
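22. Show that if
$$U^{\alpha}\nabla_{\alpha}V^{\beta} = W^{\beta}$$
then
$$U^{\alpha}\nabla_{\alpha}V_{\beta} = W_{\beta}$$
We simply use the metric tensor to lower the index on both sides of the equals sign in the first expression:
$$\begin{aligned}
g_{\beta\gamma}\,U^{\alpha}\nabla_{\alpha}V^{\beta} &= g_{\beta\gamma}\,W^{\beta} \\
U^{\alpha}\nabla_{\alpha}(g_{\beta\gamma}V^{\beta}) &= g_{\beta\gamma}\,W^{\beta} && \text{metric commutes with covariant derivative, Eq. (5.71)} \\
U^{\alpha}\nabla_{\alpha}V_{\gamma} &= W_{\gamma} && \text{lower the index}
\end{aligned} \tag{5.115}$$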
SP.1 Eq. (5.28b) states that the magnitude of the one-form basis
$$|\tilde{\mathrm d}\theta| = \frac{1}{r}$$
while Eq. (5.28a) implies that
$$|\vec e_\theta| = r$$
Does this contradict Eq. (3.47) wherein the magnitude of a one-form was stated to be the same as its associated vector? Hint: The answer is of course no. Work through problem 34 of §3.10, which might help.
Solution: The answer is of course no, there is no contradiction. The one-form bases are not simply the associated one-forms of the vector bases. This was stated explicitly by Schutz in §3.3, p. 61. See also the next supplementary problem SP.2.
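SP.2 Find the one-form associated with vector $\vec e_\alpha$ for a fixed $\alpha$.
Solution: The one-form associated with a vector $\vec A$ is
$$\tilde A = \mathbf g(\vec A,\ )$$
as stated just before Eq. (5.67), and first introduced in §3.5. We simply substitute $\vec A = A^{\alpha}\vec e_\alpha$, with $A^{\alpha} = 1$ for fixed $\alpha$, into the above expression:
$$\begin{aligned}
\tilde A &= \mathbf g(\vec A,\ ) \\
&= \mathbf g(A^{\alpha}\vec e_\alpha,\ ) \\
&= A^{\alpha}\,\mathbf g(\vec e_\alpha, \vec e_\beta)\,\tilde\omega^{\beta}(\ ) \\
&= A^{\alpha}\,g_{\alpha\beta}\,\tilde\omega^{\beta}(\ ) \\
&= g_{\alpha\beta}\,\tilde\omega^{\beta}(\ ), && \text{used } A^{\alpha} = 1
\end{aligned} \tag{5.116}$$
So we see that only if the metric is diagonal is $g_{\alpha\alpha}\tilde\omega^{\alpha}$ the one-form associated with $\vec e_\alpha$, for a fixed $\alpha$, and furthermore only if $g_{\alpha\alpha} = 1$ is $\tilde\omega^{\alpha}$ the one-form associated with $\vec e_\alpha$.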
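SP.3 Write the identity matrix as the product of two transformations and take the partial derivative $\partial/\partial x^{\mu}$ to show that
$$\Lambda^{\alpha'}{}_{\sigma,\mu}\,\Lambda^{\nu}{}_{\alpha'} + \Lambda^{\alpha'}{}_{\sigma}\,\Lambda^{\nu}{}_{\alpha',\mu} = 0$$
Solution:
$$\begin{aligned}
\Lambda^{\alpha'}{}_{\sigma}\,\Lambda^{\nu}{}_{\alpha'} &= \delta^{\nu}{}_{\sigma} && \text{a transform and its inverse transform} \\
\Lambda^{\alpha'}{}_{\sigma,\mu}\,\Lambda^{\nu}{}_{\alpha'} + \Lambda^{\alpha'}{}_{\sigma}\,\Lambda^{\nu}{}_{\alpha',\mu} &= 0 && \text{LHS: product rule, RHS: identity matrix constant}
\end{aligned} \tag{5.117}$$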
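SP.4 Multiply Eq. (5.43) by the one-form basis $\tilde\omega^{\beta} = \tilde{\mathrm d}x^{\beta}$ so that it's a proper tensor equation. Then show that it transforms as we would hope, that is like a $\binom{1}{1}$ tensor. This supplementary problem is meant to be easier than Schutz's problem 17 but to carry the same message. Hint: Go back to Eq. (3.10) and remind yourself how we showed that $\tilde p$ was invariant under a change of coordinates. Furthermore, you'll need the result from SP.3.
Solution: Multiplying Eq. (5.43) by the one-form basis $\tilde\omega^{\beta} = \tilde{\mathrm d}x^{\beta}$ gives,
$$\tilde{\mathrm d}x^{\beta} \otimes \frac{\partial}{\partial x^{\beta}}(V^{\alpha}\vec e_\alpha) = \tilde{\mathrm d}x^{\beta} \otimes \left(\frac{\partial V^{\alpha}}{\partial x^{\beta}}\,\vec e_\alpha + V^{\alpha}\frac{\partial\vec e_\alpha}{\partial x^{\beta}}\right)$$
In Eq. (3.10) we showed that one-forms are invariant under a change of coordinates (or rather we assumed they were invariant and found the appropriate transformation). However you interpret it, we do the same manipulations here but for a mixed rank 2 tensor. To be frame invariant, we require
$$\tilde{\mathrm d}x^{\beta} \otimes \frac{\partial}{\partial x^{\beta}}(V^{\alpha}\vec e_\alpha) = \tilde{\mathrm d}x^{\beta'} \otimes \frac{\partial}{\partial x^{\beta'}}(V^{\alpha'}\vec e_{\alpha'})$$
where, in keeping with Chapter 5 notation, the primes indicate a different coordinate system. Now let's expand the RHS, writing it in terms of known transformations from things in the unprimed frame. Our strategy will be to obtain terms on the RHS like $V^{\alpha}$ and $\tilde{\mathrm d}x^{\beta}$ that we can cancel with terms on the LHS.
$$\begin{aligned}
\tilde{\mathrm d}x^{\beta} &\otimes \frac{\partial}{\partial x^{\beta}}(V^{\alpha}\vec e_\alpha) \\
&= \tilde{\mathrm d}x^{\beta'} \otimes \frac{\partial}{\partial x^{\beta'}}(V^{\alpha'}\vec e_{\alpha'}) && \text{assumed it's frame invariant} \\
&= \tilde{\mathrm d}x^{\beta'} \otimes \left(\frac{\partial V^{\alpha'}}{\partial x^{\beta'}}\,\vec e_{\alpha'} + V^{\alpha'}\frac{\partial\vec e_{\alpha'}}{\partial x^{\beta'}}\right) && \text{product rule (as in Eq. (5.43))} \\
&= (\Lambda^{\beta'}{}_{\mu}\,\tilde{\mathrm d}x^{\mu}) \otimes \left(\frac{\partial V^{\alpha'}}{\partial x^{\beta'}}\,(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu) + V^{\alpha'}\frac{\partial}{\partial x^{\beta'}}(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu)\right) && \text{unprimed bases} \\
&= (\Lambda^{\beta'}{}_{\mu}\,\tilde{\mathrm d}x^{\mu}) \otimes \left(\frac{\partial}{\partial x^{\beta'}}(\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\,(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu) + (\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\frac{\partial}{\partial x^{\beta'}}(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu)\right) && \text{unprimed components} \\
&= (\Lambda^{\beta'}{}_{\mu}\,\tilde{\mathrm d}x^{\mu}) \otimes \left(\left[\Lambda^{\gamma}{}_{\beta'}\frac{\partial}{\partial x^{\gamma}}\right](\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\,(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu) + (\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\left[\Lambda^{\gamma}{}_{\beta'}\frac{\partial}{\partial x^{\gamma}}\right](\Lambda^{\nu}{}_{\alpha'}\vec e_\nu)\right) && \text{chain rule} \\
&= \delta^{\gamma}{}_{\mu}\,\tilde{\mathrm d}x^{\mu} \otimes \left(\frac{\partial}{\partial x^{\gamma}}(\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\,(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu) + (\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\frac{\partial}{\partial x^{\gamma}}(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu)\right) && \text{inverses} \\
&= \tilde{\mathrm d}x^{\mu} \otimes \left(\frac{\partial}{\partial x^{\mu}}(\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\,(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu) + (\Lambda^{\alpha'}{}_{\sigma}V^{\sigma})\frac{\partial}{\partial x^{\mu}}(\Lambda^{\nu}{}_{\alpha'}\vec e_\nu)\right) && \text{simplified} \\
&= \tilde{\mathrm d}x^{\mu} \otimes \left(\left[\Lambda^{\alpha'}{}_{\sigma,\mu}V^{\sigma} + \Lambda^{\alpha'}{}_{\sigma}V^{\sigma}{}_{,\mu}\right]\Lambda^{\nu}{}_{\alpha'}\vec e_\nu + \Lambda^{\alpha'}{}_{\sigma}V^{\sigma}\left[\Lambda^{\nu}{}_{\alpha',\mu}\vec e_\nu + \Lambda^{\nu}{}_{\alpha'}\vec e_{\nu,\mu}\right]\right) && \text{product rule} \\
&= \tilde{\mathrm d}x^{\mu} \otimes \left(\Lambda^{\alpha'}{}_{\sigma}\Lambda^{\nu}{}_{\alpha'}\,V^{\sigma}{}_{,\mu}\,\vec e_\nu + \Lambda^{\alpha'}{}_{\sigma}\Lambda^{\nu}{}_{\alpha'}\,V^{\sigma}\,\vec e_{\nu,\mu}\right) && \text{used SP.3} \\
&= \tilde{\mathrm d}x^{\mu} \otimes \left(\delta^{\nu}{}_{\sigma}\,V^{\sigma}{}_{,\mu}\,\vec e_\nu + \delta^{\nu}{}_{\sigma}\,V^{\sigma}\,\vec e_{\nu,\mu}\right) && \text{inverse transforms} \\
&= \tilde{\mathrm d}x^{\mu} \otimes \left(V^{\nu}{}_{,\mu}\,\vec e_\nu + V^{\nu}\,\vec e_{\nu,\mu}\right) && \text{simplified}
\end{aligned} \tag{5.118}$$
And if we changed dummy indices, we'd be back where we started. So applying the expected transformation rules we found the covariant derivative of a vector does transform as we expected, like a $\binom{1}{1}$ tensor. And notice that, because of the product rule of elementary differential calculus, the covariant derivative involved two terms. Neither of these terms transformed like a tensor on its own, because of the troublesome derivatives of the transformations, terms like $\Lambda^{\alpha'}{}_{\sigma,\mu}$ above. But these two terms cancelled when we used the result of SP.3. This was the point of Schutz's problem 17.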
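SP.5 In Exer. 19 we showed that $\tilde{\mathrm d}r$ and $\tilde{\mathrm d}\theta$ are consistent with a coordinate basis. Complete the proof that $\tilde{\mathrm d}r$ and $\tilde{\mathrm d}\theta$ are a coordinate basis.
Solution: One must repeat the argument for $\xi$ also. Analogous to Eq. (5.82) we obtain:
$$\frac{\partial\xi}{\partial x} = \cos\theta, \qquad \frac{\partial\xi}{\partial y} = \sin\theta \tag{5.119}$$
Find the common mixed partial derivative,
$$\frac{\partial^2\xi}{\partial y\,\partial x} = \frac{\partial}{\partial y}(\cos\theta) = \frac{\partial}{\partial y}\left(\frac{x}{\sqrt{x^2+y^2}}\right) = \frac{-xy}{(x^2+y^2)^{3/2}}, \quad \text{chain rule} \tag{5.120}$$
And
$$\frac{\partial^2\xi}{\partial x\,\partial y} = \frac{\partial}{\partial x}(\sin\theta) = \frac{\partial}{\partial x}\left(\frac{y}{\sqrt{x^2+y^2}}\right) = \frac{-xy}{(x^2+y^2)^{3/2}} = \frac{\partial^2\xi}{\partial y\,\partial x}, \quad \text{chain rule} \tag{5.121}$$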
Chapter 6
Curved Manifolds
6.3 Covariant differentiation
Typo in Eqs. (6.30) and (6.31). Colon and period should be semicolon and comma.
Typo just before Eq. (6.36), reference should be to Eq. (5.52), not (5.53).
6.5 The Curvature Tensor
Typo in Eq. (6.67), the rst upper index should be not :
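$$R^{\alpha}{}_{\beta\mu\nu} = \frac{1}{2}g^{\alpha\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} - g_{\beta\nu,\sigma\mu} + g_{\beta\mu,\sigma\nu}\right). \tag{6.1}$$
This will be derived in Exer. 17.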
6.6 Bianchi identities: Ricci and Einstein tensors
Typo in Eq. (6.91), the should be an , i.e.
R
= R
6.9 Exercises
It might help to tackle my supplementary problems first, see §6.10 below.
1. Decide if the following sets are manifolds and say why. If there are exceptional points at which the sets are not manifolds, give them:
We are given only an intuitive explanation of manifolds in §6.1, where we're told that a manifold is a space that can be continuously parameterized; that there is a smooth mapping from points of the manifold to a Euclidean space of the same dimension. So I believe such an intuitive explanation is all that is required here.
(a) Phase space of Hamiltonian mechanics, the space of the canonical coordinates and momenta $p_i$ and $q^i$;
I believe this is a typo and should read ". . . the space of the canonical coordinates i.e. momenta $p_i$ and $q^i$", since in Hamiltonian mechanics the canonical coordinates are momenta $p_i$ and $q^i$, see http://en.wikipedia.org/wiki/Canonical_coordinates.
The answer is yes of course, since the phase space is a manifold. The intuitive reason is that the generalized momentum $p_i$ and generalized coordinates $q^i$ are parameterized by time, and on physical grounds this parameterization must be continuous, otherwise the particles would jump instantly in position or accelerate infinitely.
(b) The interior of a circle of unit radius in two-dimensional Euclidean space.
Yes, because again this can be parameterized by putting the centre of the circle at the origin, say, and parameterizing each point by its distance from the centre and the angle of its radius vector to the $x$-axis.
(c) The set of permutations of n objects.
Im not completely clear on what the objects are nor what permutations
of them are, but I believe the answer is No, because I dont see how sep-
arate objects, potentially with a space between them, can be continuously
parameterized.
(d) The subset of Euclidean space of two dimensions (coordinates $x$ and $y$) which is a solution of
$$x\,y\,(x^2 + y^2 - 1) = 0.$$
The solution is the union of the $x$-axis (i.e. all $x$ with $y = 0$), the $y$-axis, and the unit circle centred at the origin. Because these intersect, one could parameterize the set continuously. But it's not differentiable everywhere, so it's not a differential manifold.
2. Of the manifolds in Exer. 1, on which is it customary to use a metric, and what is that metric? On which would a metric not normally be defined, and why?
(b) The natural coordinates would be polar coordinates, since the domain is then easily defined: $r < 1$ and $0 \le \theta < 2\pi$. The metric tensor was given in §5.2, c.f. Eq. (5.32).
3. (a) Show that given a diagonal matrix $D$ one can always find a matrix $R$ such that $R^T D R$ is also diagonal, with the same elements as $D$ but in ascending order.
Let's call the reordered matrix $\tilde{D}$,
$$\tilde{D} = R^T D R,$$
and the elements of the matrices $d_{ij}$ for the elements of $D$, etc. Then
$$\tilde{d}_{kl} = (r^T)_{ki}\, d_{ij}\, r_{jl} = r_{ik}\, d_{ij}\, r_{jl} = r_{ik}\, d_{ii}\, r_{il} \quad (\text{summed over } i). \tag{6.2}$$
Suppose we want to move the diagonal element $d_{II}$ into slot $K$ for given (fixed) $I$ and $K$, i.e. we want
$$\tilde{d}_{KK} = d_{II}.$$
We simply choose $r_{IK} = 1$ and $r_{iK} = 0\ \forall\, i \neq I$. But are the off-diagonal terms still zero? Yes, this is guaranteed because when $k \neq l$ we cannot have both $r_{ik} \neq 0$ and $r_{il} \neq 0$, since that would correspond to moving the diagonal element $d_{ii}$ into two different slots $\tilde{d}_{kk}$ and $\tilde{d}_{ll}$. That's not allowed because each diagonal element can only be moved into one slot.
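3. (b) Show that the diagonal elements can be scaled such that they are either $-1$, $0$, or $+1$ using another matrix $N$ as follows: $N^T \tilde{D} N$.
As we found above, the elements of the rescaled matrix $\hat{D} = N^T \tilde{D} N$ will be
$$\hat{d}_{kl} = (n^T)_{ki}\, \tilde{d}_{ij}\, n_{jl} = n_{ik}\, \tilde{d}_{ij}\, n_{jl} = n_{ik}\, \tilde{d}_{ii}\, n_{il} \quad (\text{summed over } i). \tag{6.3}$$
Suppose $\tilde{d}_{KK} \neq 0$ and we want the diagonal element $\hat{d}_{KK}$ for given (fixed) $K$ to be $-1$ or $+1$, i.e. we want
$$\hat{d}_{KK} = 1 \cdot \mathrm{sign}(\tilde{d}_{KK}).$$
We simply choose $n_{KK} = 1/\sqrt{|\tilde{d}_{KK}|}$ and $n_{iK} = 0\ \forall\, i \neq K$.
But if $\tilde{d}_{KK} = 0$ we cannot do this, and must choose $n_{KK} = a$, where $a$ is any nonzero number, and $n_{iK} = 0\ \forall\, i \neq K$. We then end up with
$$\hat{d}_{KK} = 0.$$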
3. (c) Show that none of the diagonal elements can be zero for the inverse of $A$ to exist.
The equation for the inverse is trivial for a diagonal matrix because one can find one equation with one unknown for each element of the inverse:
$$D D^{-1} = I. \tag{6.4}$$
So for element $(D^{-1})_{ii}$ we have
$$(D^{-1})_{ii} = 1/(D)_{ii},$$
and for off-diagonal elements
$$(D^{-1})_{ij} = 0/(D)_{ii} = 0.$$
So the inverse of a diagonal matrix is also diagonal, with elements equal to the inverses of the original elements. When the original matrix has zero for one or more diagonal elements, the inverse doesn't exist because finding it would involve dividing by zero.
3. (d) Show that the metric of Eq. (6.2) can always be found.
As we are reminded in the text (p. 145), the metric tensor $g$ is symmetric by definition (for example, if it is defined from the dot product of two vectors, the order of the vectors does not matter). So as stated in Exer. 3, this implies that $(g)$ can be diagonalized. Furthermore $(g)$ must have an inverse (for the mapping from vectors to one-forms to be invertible). Then the results (a) through (c) show that we can reduce $(g)$ to a matrix with either $-1$ or $+1$ on the diagonal. Since we choose $(g)$ to have one negative eigenvalue, we end up with one $-1$ on the diagonal, and remaining entries $+1$.
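The constructions in parts (a) and (b) can be tried out numerically. The following is only a minimal sketch in plain Python, not part of Schutz's text: the function names (`sort_and_rescale`, `matmul`, `transpose`) and the sample diagonal are illustrative assumptions. It builds $R$ with $r_{IK} = 1$ to move $d_{II}$ into slot $K$, then builds $N$ with $n_{KK} = 1/\sqrt{|\tilde d_{KK}|}$ (or an arbitrary nonzero value, here 1, when $\tilde d_{KK} = 0$).

```python
import math


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sort_and_rescale(diag):
    """Return (R^T D R, N^T (R^T D R) N) for the diagonal matrix D = diag(diag)."""
    n = len(diag)
    D = [[diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

    # Part (a): order[K] = I means "move d_II into slot K", i.e. r_IK = 1.
    order = sorted(range(n), key=lambda i: diag[i])
    R = [[0.0] * n for _ in range(n)]
    for K, I in enumerate(order):
        R[I][K] = 1.0
    D_sorted = matmul(matmul(transpose(R), D), R)

    # Part (b): n_KK = 1/sqrt(|d_KK|), or any nonzero number if d_KK = 0.
    N = [[0.0] * n for _ in range(n)]
    for K in range(n):
        d = D_sorted[K][K]
        N[K][K] = 1.0 / math.sqrt(abs(d)) if d != 0.0 else 1.0
    D_canon = matmul(matmul(transpose(N), D_sorted), N)
    return D_sorted, D_canon


D_sorted, D_canon = sort_and_rescale([4.0, -9.0, 0.0, 0.25])
print([D_sorted[k][k] for k in range(4)])
print([round(D_canon[k][k], 12) for k in range(4)])
```

The zero entry survives the rescaling, which is the situation part (c) rules out for an invertible matrix.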
4. Prove the following results used in the proof of the local flatness theorem in §6.2:
(a) $\partial^2 x^{\alpha}/\partial x^{\gamma'}\partial x^{\mu'}\big|_0$ has 40 independent values.
Consider first the operator $\partial^2/\partial x^{\gamma'}\partial x^{\mu'}\big|_0$. It is represented by a $4\times4$ symmetric (because partial differentiation commutes) matrix, and has 10 independent elements (4 on the diagonal and 6 in the upper triangle). For each element of this differential operator, there are 4 independent coordinates it can act on, $x^{\alpha}$. Hence there are $4\times10 = 40$ degrees of freedom in $\partial^2 x^{\alpha}/\partial x^{\gamma'}\partial x^{\mu'}\big|_0$.
(b) $\partial^3 x^{\alpha}/\partial x^{\lambda'}\partial x^{\mu'}\partial x^{\nu'}\big|_0$ has 80 independent values.
Again the problem is to omit the elements made redundant by the fact that the partial derivatives commute. Again, start with just the differential operator, which here is $\partial^3/\partial x^{\lambda'}\partial x^{\mu'}\partial x^{\nu'}\big|_0$. There are 4 diagonal elements. There are $4\times3 = 12$ ways to choose two of the derivatives the same and one different (like $\lambda' = \mu' \neq \nu'$; it's not necessary to also consider $\lambda' \neq \mu' = \nu'$ because we've accounted for these elements already, since the order of the derivatives does not matter). And finally there are the elements where $\lambda' \neq \mu' \neq \nu' \neq \lambda'$. Here it is easiest to count them by noting that for any given choice there is only one unused index value, so there are 4 such elements. Adding these three types of terms we get $4 + 12 + 4 = 20$. Again, for each element of this differential operator, there are 4 independent coordinates it can act on, $x^{\alpha}$. Hence there are $4\times20 = 80$ degrees of freedom.
(c) $g_{\alpha\beta,\gamma'\mu'}\big|_0$ has 100 independent values.
This one is easy. The differential operator has 10 degrees of freedom, see part (a) above. And this is applied to the metric, which is a symmetric $4\times4$ tensor and hence has 10 independent values. Thus there are $10\times10 = 100$ independent values in total.
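The three counts can be cross-checked by brute-force enumeration. This is a minimal sketch, assuming only that an unordered set of derivative indices corresponds to a combination with replacement; the helper name `symmetric_index_sets` is made up for illustration.

```python
from itertools import combinations_with_replacement


def symmetric_index_sets(order, dim=4):
    """Number of distinct partial-derivative operators of the given order
    in `dim` dimensions, treating the derivative indices as unordered."""
    return sum(1 for _ in combinations_with_replacement(range(dim), order))


second = symmetric_index_sets(2)  # independent second-derivative operators
third = symmetric_index_sets(3)   # independent third-derivative operators

counts = {
    "d2x/dx'dx'": 4 * second,          # part (a): 4 coordinates x^alpha
    "d3x/dx'dx'dx'": 4 * third,        # part (b)
    "g_{ab,c'd'}": second * second,    # part (c): symmetric metric times symmetric operator
}
print(second, third, counts)
```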
5. (a) Prove that $\Gamma^{\mu}{}_{\alpha\beta} = \Gamma^{\mu}{}_{\beta\alpha}$ in any coordinate system in a curved Riemannian space.
This exercise is so important one really must do it. By the local flatness theorem, c.f. §6.2, on a general Riemannian manifold there is a local inertial (Lorentz) reference frame. In a Lorentz frame spacetime is flat and one can construct a reference frame with basis vectors that do not change with position, so the Christoffel symbols are zero. This is all one needs to reproduce the argument of §5.4 leading to Eq. (5.74).
5. (b) Use this to prove that Eq. (6.32) can be derived in the same manner as in flat space.
The same principle as in (a), i.e. the local flatness theorem, is involved in deriving Eq. (6.31) (which is identical to Eq. (5.71)). And Eq. (6.32) is identical to Eq. (5.75). The argument leading to Eq. (5.75) can be repeated in curved Riemannian space because it used Eqs. (5.71), (5.72), and (5.74). Eq. (5.72) is valid in curved space. Eq. (5.74) was proved above in (a). Eq. (5.71) was given.
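6. Prove the first term in Eq. (6.37) vanishes.
Recall Eq. (6.37),
$$\Gamma^{\alpha}{}_{\mu\alpha} = \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta}\right) + \tfrac{1}{2}\, g^{\alpha\beta} g_{\alpha\beta,\mu}.$$
As pointed out in the text, we only need to prove that
$$\left(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta}\right)$$
is antisymmetric in $\alpha$ and $\beta$. Then we can use the result of Exer. 26(a) of §3.10 to show that the term vanishes. First we note that $g_{\mu\alpha} = g_{\alpha\mu}$ because of the symmetry of the metric tensor, so
$$\left(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta}\right) = \left(g_{\beta\mu,\alpha} - g_{\alpha\mu,\beta}\right).$$
Then we note that for each $\mu$, the RHS is antisymmetric in $\alpha$ and $\beta$ since obviously
$$\left(g_{\beta\mu,\alpha} - g_{\alpha\mu,\beta}\right) = -\left(g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha}\right).$$
And for each $\mu$, this term antisymmetric in $\alpha$ and $\beta$ will be multiplied by the inverse metric tensor $g^{\alpha\beta}$, which is obviously symmetric in $\alpha$ and $\beta$. Thus, by the result of Exer. 26(a) of §3.10, the first term vanishes.
7. This problem is similar to 8.16 on page 222 of Misner et al. (1973).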
7. (a) Give the definition of the determinant of a matrix $A$ in terms of cofactors of elements.
7. (b) Differentiate the determinant of an arbitrary $2\times2$ matrix and show that it satisfies Eq. (6.39).
7. (c) Generalize Eq. (6.39) (by induction or otherwise) to arbitrary $n\times n$ matrices.
8. Fill in the missing algebra leading to Eqs. (6.40) and (6.42).
I find it easiest to work backwards. That is, start with Eq. (6.40) and differentiate using the chain rule:
$$\Gamma^{\alpha}{}_{\mu\alpha} = \frac{(\sqrt{-g})_{,\mu}}{\sqrt{-g}} = \frac{1}{2}\,\frac{(-g)_{,\mu}}{(\sqrt{-g})^{2}} = \frac{1}{2}\,\frac{g_{,\mu}}{g}. \tag{6.5}$$
Substitution of Eq. (6.39) leads directly to Eq. (6.38).
And for the second part, again I find it easiest to work backwards. That is, start with Eq. (6.42) and differentiate using the product rule and chain rule:
$$V^{\alpha}{}_{;\alpha} = \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\, V^{\alpha}\right)_{,\alpha} = \frac{\sqrt{-g}}{\sqrt{-g}}\, V^{\alpha}{}_{,\alpha} + \frac{V^{\alpha}}{\sqrt{-g}}\left(\sqrt{-g}\right)_{,\alpha} = V^{\alpha}{}_{,\alpha} + \frac{V^{\alpha}}{\sqrt{-g}}\left(\sqrt{-g}\right)_{,\alpha}, \tag{6.6}$$
which is Eq. (6.41). And Eq. (6.41) was obtained directly by substitution of Eq. (6.40) into Eq. (6.36).
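9. Show that Eq. (6.42) leads to Eq. (5.56).
This amounts to showing that the general formula for the divergence of a velocity field is consistent with the special case derived in §5.3 for polar coordinates. We start with Eq. (5.32) for the metric in polar coordinates. The determinant is simply
$$g = \det(g_{\alpha\beta}) = r^2.$$
For a positive-definite metric $\sqrt{-g}$ is replaced by $\sqrt{g}$. Substituting into Eq. (6.42) we find
$$V^{\alpha}{}_{;\alpha} = \frac{1}{\sqrt{g}}\left(\sqrt{g}\, V^{\alpha}\right)_{,\alpha} = V^{\alpha}{}_{,\alpha} + \frac{V^{\alpha}}{\sqrt{g}}\left(\sqrt{g}\right)_{,\alpha}$$
$$= V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + \frac{V^{r}}{\sqrt{r^2}}\left(\sqrt{r^2}\right)_{,r} = V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + \frac{V^{r}}{r}, \tag{6.7}$$
consistent with the first line of Eq. (5.56).
Find the divergence formula for the metric given in Eq. (6.19), i.e. that for spherical polar coordinates.
From Eq. (6.19), the determinant is
$$g = r^4 \sin^2\theta,$$
whose square root has two nonzero gradient components,
$$\frac{\partial\sqrt{r^4\sin^2\theta}}{\partial r} = \frac{2r^3\sin^2\theta}{\sqrt{r^4\sin^2\theta}}, \qquad \frac{\partial\sqrt{r^4\sin^2\theta}}{\partial\theta} = \frac{r^4\sin\theta\cos\theta}{\sqrt{r^4\sin^2\theta}}. \tag{6.8}$$
Substituting into Eq. (6.42) we find
$$V^{\alpha}{}_{;\alpha} = V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + V^{\phi}{}_{,\phi} + \frac{V^{r}}{\sqrt{r^4\sin^2\theta}}\,\frac{2r^3\sin^2\theta}{\sqrt{r^4\sin^2\theta}} + \frac{V^{\theta}}{\sqrt{r^4\sin^2\theta}}\,\frac{r^4\sin\theta\cos\theta}{\sqrt{r^4\sin^2\theta}}$$
$$= V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + V^{\phi}{}_{,\phi} + \frac{2V^{r}}{r} + \frac{V^{\theta}}{\tan\theta}. \tag{6.9}$$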
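As a sanity check on (6.9), one can compare the compact form of Eq. (6.42) with the expanded form numerically. The sketch below is an illustration only: the test field `V`, the sample point, and the step size `h` are arbitrary assumptions, and derivatives are approximated by central finite differences.

```python
import math


def V(r, th, ph):
    """Arbitrary smooth test field (V^r, V^theta, V^phi); purely illustrative."""
    return (r**2 * math.sin(th), r * math.cos(th), r * math.sin(ph))


def sqrt_g(r, th):
    """Square root of the metric determinant for spherical polars, Eq. (6.19)."""
    return r**2 * math.sin(th)


def partial(f, x, i, h=1e-5):
    """Central-difference approximation to the partial derivative of f w.r.t. x[i]."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(*xp) - f(*xm)) / (2 * h)


def div_compact(x):
    """(1/sqrt g) (sqrt g V^alpha)_{,alpha}, the form of Eq. (6.42)."""
    total = 0.0
    for i in range(3):
        flux = lambda r, th, ph, i=i: sqrt_g(r, th) * V(r, th, ph)[i]
        total += partial(flux, x, i)
    return total / sqrt_g(x[0], x[1])


def div_expanded(x):
    """V^r_{,r} + V^theta_{,theta} + V^phi_{,phi} + 2 V^r / r + V^theta / tan(theta)."""
    r, th, ph = x
    total = 0.0
    for i in range(3):
        comp = lambda r_, th_, ph_, i=i: V(r_, th_, ph_)[i]
        total += partial(comp, x, i)
    Vr, Vth, _ = V(r, th, ph)
    return total + 2 * Vr / r + Vth / math.tan(th)


point = (1.3, 0.7, 0.4)
print(div_compact(point), div_expanded(point))
```

The two printed numbers should agree to within the finite-difference error.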
10. Consider a triangle made up of great circles on a sphere intersecting at points A, B, and C, as in Fig. 6.3 but with B not necessarily on the pole. Show that the amount by which a vector is rotated by parallel transport around such a triangle equals the excess of the sum of the angles over $180^\circ = \pi$ rad.
This is a good problem because it forces us to think carefully through all the steps of the first subsection of §6.4. We strongly need a diagram to do this one. Let's use that of Fig. 6.3. The only constraint given was that the sides of the triangle form great circles on the sphere. Without loss of generality we can take A and C to be on the equator. We can label the interior angles of the triangle as follows: $\angle CAB$ is the angle at A between the great circle through CA and that through AB, and similarly for the other two. As in §6.4, let's start with a vector at A that is parallel to the equator. The angle from the vector to the geodesic through AB is $\angle CAB$. This angle doesn't change as we move to B because the vector is parallel-transported. Now imagine looking down on the point B. The angle between the extension of geodesic AB and the vector is still $\angle CAB$. Let $\alpha$ be the angle between the vector at B and geodesic BC. Then
$$\angle ABC + \alpha + \angle CAB = \pi \text{ rad}.$$
I believe this is the only tricky part of this problem. It stems from the fact that the intersection of two great circles forms four angles that add to $2\pi$ rad, with those on one side of a great circle adding to $\pi$ rad. I take this as visually obvious (unfortunately I can't offer any proof of this). Moving from B to C the angle $\alpha$ doesn't change. At C, let $\delta$ be the angle between the vector and AC, i.e. the equator. There,
$$\angle BCA = \alpha + \delta.$$
Combining these two equations and solving for $\delta$ gives
$$\delta = \angle CAB + \angle ABC + \angle BCA - \pi.$$
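11. When is the gradient of the velocity field zero everywhere?
$$V^{\alpha}{}_{;\beta} = 0 = V^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta} V^{\mu} \tag{6.10}$$
(a) A necessary condition for the solution of (6.10) is the integrability condition for (6.10), which follows from the commuting of partial derivatives. We are to use the fact that $V^{\alpha}{}_{,\nu\beta} = V^{\alpha}{}_{,\beta\nu}$ to derive the equation:
$$\left(\Gamma^{\alpha}{}_{\mu\beta,\nu} - \Gamma^{\alpha}{}_{\mu\nu,\beta}\right) V^{\mu} = \left(\Gamma^{\alpha}{}_{\mu\beta}\Gamma^{\mu}{}_{\sigma\nu} - \Gamma^{\alpha}{}_{\mu\nu}\Gamma^{\mu}{}_{\sigma\beta}\right) V^{\sigma}. \tag{6.11}$$
This problem is fairly straightforward once one thinks about "what would I do to (6.10) to obtain a term like . . . in (6.11)". Obviously to obtain a term like $\Gamma^{\alpha}{}_{\mu\beta,\nu}$ one must take $\partial/\partial x^{\nu}$ of (6.10). To get rid of the resulting $V^{\alpha}{}_{,\beta\nu}$ one simply does this twice (once with $\beta$ and $\nu$ interchanged) and subtracts the two results, using the commuting property of partial differentiation, as Schutz indicated. But then one must also deal with terms like $\Gamma^{\alpha}{}_{\mu\beta} V^{\mu}{}_{,\nu}$. For these one uses (6.10) again (without differentiating).
(b) By relabeling indices, work this into the form given. Actually the form given has a typo and should be:
$$\left(\Gamma^{\alpha}{}_{\mu\beta,\nu} - \Gamma^{\alpha}{}_{\mu\nu,\beta} + \Gamma^{\alpha}{}_{\sigma\nu}\Gamma^{\sigma}{}_{\mu\beta} - \Gamma^{\alpha}{}_{\sigma\beta}\Gamma^{\sigma}{}_{\mu\nu}\right) V^{\mu} = 0. \tag{6.12}$$
This problem is straightforward. The first two terms in (6.12) and (6.11) are identical. Based on the sign of the 3rd term in (6.12), it's obviously the final term in (6.11), moved to the left-hand side. So we must replace $\sigma \to \mu$ there. That leaves the final term in (6.12) as the 3rd in (6.11). So we need to also replace $\mu \to \sigma$, i.e. the dummy indices $\mu$ and $\sigma$ are interchanged in both product terms.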
12. Prove that Eq. (6.52) defines a new affine parameter.
Using $\phi = a\lambda + b$ defined in Eq. (6.52) to parameterize the curve, we can find the equation for the geodesics by using Eq. (6.51) and the chain rule. So the operator
$$\frac{d}{d\phi}\left(\frac{d}{d\phi}\right) = \left(\frac{d\lambda}{d\phi}\right)^{2} \frac{d}{d\lambda}\left(\frac{d}{d\lambda}\right) = \left(\frac{1}{a}\right)^{2} \frac{d}{d\lambda}\left(\frac{d}{d\lambda}\right). \tag{6.13}$$
That means we can rewrite Eq. (6.51) in terms of $\phi$ by simply dividing Eq. (6.51) by $a^2$, which is constant. Because the right-hand side of Eq. (6.51) is zero, this doesn't change the form of the equation at all, as indicated on p. 157 in the equation below Eq. (6.52).
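13. (a) Show that if $\vec{A}$ and $\vec{B}$ are parallel-transported along a curve, then $g(\vec{A}, \vec{B}) = \vec{A}\cdot\vec{B}$ is constant on the curve.
A vector that is parallel-transported along a curve is moved in the direction of the tangent to the curve without rotating or changing its length. From this notion it seems obvious that the dot product of two vectors that were parallel-transported along a curve would not change. To demonstrate this mathematically, one could take the derivative along the curve (parameterized by $\lambda$) of the dot product:
$$\frac{d\, g(\vec{A}, \vec{B})}{d\lambda} = \frac{d}{d\lambda}\left(g_{\alpha\beta} A^{\alpha} B^{\beta}\right) = A^{\alpha} B^{\beta} \frac{d}{d\lambda}\left(g_{\alpha\beta}\right) + g_{\alpha\beta} B^{\beta} \frac{d}{d\lambda}\left(A^{\alpha}\right) + g_{\alpha\beta} A^{\alpha} \frac{d}{d\lambda}\left(B^{\beta}\right). \tag{6.14}$$
Here $d/d\lambda$ is understood as the covariant derivative along the curve, $U^{\gamma}\nabla_{\gamma}$, which reduces to the ordinary derivative when acting on the scalar $g(\vec{A},\vec{B})$ and obeys the product rule. All three derivatives on the right are zero. The first term is the derivative of the metric along the curve and is zero as a consequence of the local flatness theorem:
$$\frac{d}{d\lambda}\left(g_{\alpha\beta}\right) = U^{\gamma} g_{\alpha\beta;\gamma} = 0, \quad \text{c.f. Eq. (6.31).} \tag{6.15}$$
The 2nd and 3rd terms are the derivatives of the vectors along the curve. These are zero because the vectors were assumed to be parallel-transported along the curve, c.f. Eq. (6.47).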
13. (b) Conclude from the results of (a) that if a geodesic is spacelike (or timelike or null) at some point, it is necessarily spacelike (or timelike or null) at all points.
Vectors were defined as spacelike (or timelike or null) if their magnitude was $> 0$ ($< 0$, $= 0$), c.f. §2.5 on p. 44. A geodesic is of course not a vector, but it does have a tangent vector at each point along the curve that gives the linear approximation to the displacement along the curve at the point, per unit of the parameter that parameterizes the curve. So it would be reasonable to call a geodesic spacelike at a point if its tangent vector $\vec{U}$ were of positive magnitude at that point,
$$\vec{U}\cdot\vec{U} = g_{\alpha\beta} U^{\alpha} U^{\beta} > 0.$$
Can this change as one moves along the curve? The geodesic is, by definition, the curve that parallel-transports its own tangent vector. But from (a) we have that any two vectors that are parallel-transported along any curve keep the same dot product. So the tangent vector, dotted with itself, does not change as it is parallel-transported along the geodesic.
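14. Proper distance along a curve whose tangent is $\vec{V}$ is given by Eq. (6.8). Show that if the curve is a geodesic, then proper length is an affine parameter. (Use results of Exer. 13.)
$$l = \int_{\lambda_0}^{\lambda_1} |\vec{V}\cdot\vec{V}|^{1/2}\, d\lambda, \quad \text{Eq. (6.8)}$$
$$= \int_{\lambda_0}^{\lambda_1} |\vec{U}\cdot\vec{U}|^{1/2}\, d\lambda, \quad \text{where } \vec{U} \text{ is the tangent vector to the geodesic}$$
$$= |\vec{U}\cdot\vec{U}|^{1/2} \int_{\lambda_0}^{\lambda_1} d\lambda, \quad \text{using Exer. 13(b).} \tag{6.16}$$
Thus the proper distance along the curve has the form
$$l(\lambda) = |\vec{U}\cdot\vec{U}|^{1/2} (\lambda - \lambda_0) = |\vec{U}\cdot\vec{U}|^{1/2}\,\lambda - \lambda_0\, |\vec{U}\cdot\vec{U}|^{1/2} = a\lambda + b, \tag{6.17}$$
where $a = |\vec{U}\cdot\vec{U}|^{1/2}$ is constant (c.f. Exer. 13(a)), and $b = -\lambda_0\, a$. This is the same form as Eq. (6.52), which was (hopefully) shown to be an affine parameter in Exer. 12.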
15. Use Exers. 13 and 14 to prove that the proper length of a geodesic between two points is unchanged to first order by small changes in the curve that do not change its endpoints.
This looks like a nice problem because it implies that geodesics are extrema in distance between two fixed points.
16. (a) Derive Eqs. (6.59) and (6.60) from Eq. (6.58).
This problem makes sure we're following the argument. Eq. (6.59) is found simply by expanding the integrand in a Taylor series (about point A) and keeping only the constant term and the term in the first derivative. The constant terms cancel in both cases, leaving only the first-derivative terms, which gives Eq. (6.59).
Treating the integrands in Eq. (6.59) as constants (in keeping with the Taylor series expansions), the integrals can be performed, giving Eq. (6.60) immediately.
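16. (b) Provide algebra needed to justify Eq. (6.61).
Again a simple problem to make sure we're following the argument. Let's start with the first term in Eq. (6.60).
$$-\frac{\partial}{\partial x^{1}}\left(\Gamma^{\alpha}{}_{\mu 2} V^{\mu}\right) = -V^{\mu} \frac{\partial}{\partial x^{1}}\left(\Gamma^{\alpha}{}_{\mu 2}\right) - \Gamma^{\alpha}{}_{\mu 2} \frac{\partial}{\partial x^{1}}\left(V^{\mu}\right)$$
$$= -V^{\mu}\, \Gamma^{\alpha}{}_{\mu 2,1} - \Gamma^{\alpha}{}_{\mu 2}\left(-\Gamma^{\mu}{}_{\nu 1} V^{\nu}\right), \quad \text{using Eq. (6.53)}$$
$$= \left[-\Gamma^{\alpha}{}_{\mu 2,1} + \Gamma^{\alpha}{}_{\nu 2}\Gamma^{\nu}{}_{\mu 1}\right] V^{\mu}, \tag{6.18}$$
where we let $\mu \to \nu$ and $\nu \to \mu$ in the 2nd term so that we can pull out the $V^{\mu}$. This is allowed because they are dummy (repeated) indices.
The 2nd term in Eq. (6.60) is dealt with in exactly the same way:
$$\frac{\partial}{\partial x^{2}}\left(\Gamma^{\alpha}{}_{\mu 1} V^{\mu}\right) = V^{\mu} \frac{\partial}{\partial x^{2}}\left(\Gamma^{\alpha}{}_{\mu 1}\right) + \Gamma^{\alpha}{}_{\mu 1} \frac{\partial}{\partial x^{2}}\left(V^{\mu}\right)$$
$$= V^{\mu}\, \Gamma^{\alpha}{}_{\mu 1,2} + \Gamma^{\alpha}{}_{\mu 1}\left(-\Gamma^{\mu}{}_{\nu 2} V^{\nu}\right), \quad \text{using Eq. (6.53)}$$
$$= \left[\Gamma^{\alpha}{}_{\mu 1,2} - \Gamma^{\alpha}{}_{\nu 1}\Gamma^{\nu}{}_{\mu 2}\right] V^{\mu}, \tag{6.19}$$
where we let $\mu \to \nu$ and $\nu \to \mu$ in the 2nd term so that we can pull out the $V^{\mu}$.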
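17. (a) Prove that Eq. (6.5) implies that
$$g^{\alpha\beta}{}_{,\mu}(\mathcal{P}) = 0.$$
Because the metric tensor applied to its inverse gives the identity matrix, which is of course a constant, we have
$$g^{\alpha\mu} g_{\mu\beta} = \delta^{\alpha}{}_{\beta},$$
$$\left(g^{\alpha\mu} g_{\mu\beta}\right)_{,\gamma} = \delta^{\alpha}{}_{\beta,\gamma} = 0,$$
$$g^{\alpha\mu}{}_{,\gamma}\, g_{\mu\beta} + g^{\alpha\mu} g_{\mu\beta,\gamma} = 0,$$
$$g^{\alpha\mu}{}_{,\gamma}\, g_{\mu\beta} = 0. \tag{6.20}$$
And now for the tricky bit. In general $g_{\mu\beta}$ is a general tensor and there could in principle be several non-zero terms in each column that cancel to produce zero when multiplied by $g^{\alpha\mu}{}_{,\gamma}$. So I believe one must argue as follows. One can always choose one's basis such that $g_{\mu\beta} = \eta_{\mu\beta}$, c.f. Eq. (6.2). But then there is only one non-zero term in each column. For instance, let $\beta = 0$:
$$g_{\mu 0}\, g^{\alpha\mu}{}_{,\gamma} = -1\, g^{\alpha 0}{}_{,\gamma} = 0 \;\Longrightarrow\; g^{\alpha 0}{}_{,\gamma} = 0, \quad \forall\, \alpha, \gamma. \tag{6.21}$$
The same argument of course applies to all the other values of $\beta$, and thus
$$g^{\alpha\beta}{}_{,\gamma} = 0, \quad \forall\, \alpha, \beta, \gamma.$$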
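17. (b) Use results of (a) to establish Eq. (6.64).
A nice easy problem, perhaps the point being to show us what use the results in (a) can be put to. Starting with Eq. (6.32),
$$\Gamma^{\alpha}{}_{\mu\nu} = \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right), \tag{6.22}$$
we simply differentiate with respect to $x^{\sigma}$,
$$\Gamma^{\alpha}{}_{\mu\nu,\sigma} = \tfrac{1}{2}\left[g^{\alpha\beta}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right)\right]_{,\sigma}$$
$$= \tfrac{1}{2}\, g^{\alpha\beta}{}_{,\sigma}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right) + \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right)$$
$$= \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right). \tag{6.23}$$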
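17. (c) Fill in step needed to establish Eq. (6.68).
We start with Eq. (6.63). Because we're in a local inertial frame at point $\mathcal{P}$, the Christoffel symbols vanish at $\mathcal{P}$. For the derivative of the Christoffel symbol, we substitute from Eq. (6.64), for the first term making the (simultaneous) substitutions
$$\beta \to \sigma, \qquad \mu \to \beta, \qquad \sigma \to \mu. \tag{6.24}$$
For the 2nd term we simply interchange $\mu$ and $\nu$ in the first term and change the sign, giving first Eq. (6.65), and then, after cancelling the first and fourth terms using Eq. (6.66), arriving at Eq. (6.67):
$$R^{\alpha}{}_{\beta\mu\nu} = \tfrac{1}{2}\, g^{\alpha\sigma}\left(g_{\sigma\beta,\nu\mu} + g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\beta,\mu\nu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}\right)$$
$$= \tfrac{1}{2}\, g^{\alpha\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}\right)$$
$$= \tfrac{1}{2}\, g^{\alpha\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right), \quad \text{just changed order.} \tag{6.25}$$
Finally we must lower the index,
$$R_{\alpha\beta\mu\nu} = g_{\alpha\lambda} R^{\lambda}{}_{\beta\mu\nu} = g_{\alpha\lambda}\, \tfrac{1}{2}\, g^{\lambda\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right)$$
$$= \delta^{\sigma}{}_{\alpha}\, \tfrac{1}{2}\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right)$$
$$= \tfrac{1}{2}\left(g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}\right), \tag{6.26}$$
which is Eq. (6.68).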
18. (a) Derive Eqs. (6.69) and (6.70) from Eq. (6.68).
This question involves trivial (and not so trivial) index manipulation. In Eq. (6.68) one notices that changing the order of $\alpha$ and $\beta$ changes the sign. This is clear because the first term, $g_{\alpha\nu,\beta\mu}$, is the negative of the last term, $-g_{\beta\nu,\alpha\mu}$, but with $\alpha$ and $\beta$ in the opposite order. And similarly for the two middle terms. This observation gives the first equality of Eq. (6.69),
$$R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu}.$$
An analogous observation gives the 2nd equality in Eq. (6.69). That is, the first two terms of Eq. (6.68) are of opposite sign and have the second pair of indices, $\mu$ and $\nu$, in the opposite order. Similarly for the 3rd and 4th terms. This gives the second equality of Eq. (6.69),
$$R_{\alpha\beta\mu\nu} = -R_{\alpha\beta\nu\mu}.$$
The 3rd and final equality in Eq. (6.69) is not so immediately clear. I found a proof using the following somewhat tedious procedure. Use Eq. (6.68) to find the expression for $R_{\mu\nu\alpha\beta}$. That is, simply use Eq. (6.68) with the following substitutions:
$$\alpha \to \mu, \qquad \beta \to \nu, \qquad \mu \to \alpha, \qquad \nu \to \beta.$$
One finds
$$R_{\mu\nu\alpha\beta} = \tfrac{1}{2}\left(g_{\mu\beta,\nu\alpha} - g_{\mu\alpha,\nu\beta} + g_{\nu\alpha,\mu\beta} - g_{\nu\beta,\mu\alpha}\right). \tag{6.27}$$
The most direct way to proceed from here is to use the facts that $g$ is always symmetric and that derivatives commute. So we can interchange the order of the two indices before the comma, and of the two after the comma, without changing the result. Then one notices that the 3rd term in (6.27) above is the same as the first term in Eq. (6.68),
$$g_{\nu\alpha,\mu\beta} = g_{\alpha\nu,\beta\mu}. \tag{6.28}$$
And furthermore, the 2nd term in (6.27) above is the same as the 2nd term in Eq. (6.68); the 1st term in (6.27) above is the same as the 3rd term in Eq. (6.68); the 4th term in (6.27) above is the same as the 4th term in Eq. (6.68). This gives the 3rd equality in Eq. (6.69).
Proving Eq. (6.70) is also rather straightforward, albeit rather messy (at least for the brute-force method I came up with). I simply wrote down the expressions for all three terms in Eq. (6.70), $R_{\alpha\beta\mu\nu}$, $R_{\alpha\nu\beta\mu}$ and $R_{\alpha\mu\nu\beta}$, using Eq. (6.68):
$$2R_{\alpha\beta\mu\nu} = +g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu} \tag{6.29}$$
$$2R_{\alpha\nu\beta\mu} = +g_{\alpha\mu,\nu\beta} - g_{\alpha\beta,\nu\mu} + g_{\nu\beta,\alpha\mu} - g_{\nu\mu,\alpha\beta} \tag{6.30}$$
$$2R_{\alpha\mu\nu\beta} = +g_{\alpha\beta,\mu\nu} - g_{\alpha\nu,\mu\beta} + g_{\mu\nu,\alpha\beta} - g_{\mu\beta,\alpha\nu} \tag{6.31}$$
Recognizing again that the order of the first pair and of the last pair of indices in terms like $g_{\alpha\nu,\beta\mu}$ doesn't matter, because the metric is symmetric and partial derivatives commute, we can cancel all the terms: each term in (6.29) to (6.31) cancels against exactly one term of opposite sign in one of the other two lines.
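The index bookkeeping above can be checked mechanically. The following is a minimal sketch, not taken from Schutz: it fills the second derivatives $g_{\alpha\beta,\mu\nu}$ with random integers (symmetric within each index pair only), builds $2R_{\alpha\beta\mu\nu}$ from Eq. (6.68), and tests Eqs. (6.69) and (6.70) exactly. The function names and the random seed are illustrative assumptions.

```python
import random
from itertools import product


def random_second_derivs(n=4, seed=0):
    """Return a function g2(a, b, m, k) modelling g_{ab,mk}: random integers,
    symmetric in (a, b) and in (m, k), with no other symmetry imposed."""
    rng = random.Random(seed)
    table = {}
    for a, b, m, k in product(range(n), repeat=4):
        key = (min(a, b), max(a, b), min(m, k), max(m, k))
        if key not in table:
            table[key] = rng.randint(-9, 9)
    return lambda a, b, m, k: table[(min(a, b), max(a, b), min(m, k), max(m, k))]


def check_symmetries(n=4, seed=0):
    """True if 2R built from Eq. (6.68) satisfies Eqs. (6.69) and (6.70)."""
    g2 = random_second_derivs(n, seed)

    def two_R(a, b, m, k):
        # Eq. (6.68) multiplied by 2, so that everything stays an integer.
        return g2(a, k, b, m) - g2(a, m, b, k) + g2(b, m, a, k) - g2(b, k, a, m)

    for a, b, m, k in product(range(n), repeat=4):
        r = two_R(a, b, m, k)
        if r != -two_R(b, a, m, k):      # first equality of (6.69)
            return False
        if r != -two_R(a, b, k, m):      # second equality of (6.69)
            return False
        if r != two_R(m, k, a, b):       # third equality of (6.69)
            return False
        if r + two_R(a, k, b, m) + two_R(a, m, k, b) != 0:  # cyclic identity (6.70)
            return False
    return True


print(check_symmetries())
```

Integer entries are used so that the comparisons are exact rather than subject to floating-point round-off.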
18. (b) Show that Eq. (6.69) reduces the number of independent components of $R_{\alpha\beta\mu\nu}$ to 21.
If you're having trouble here, perhaps it's worth having a go at Exer. 35, since that problem involves writing down the independent elements for a specific example, perhaps forcing you to think about it in a more concrete or different way. Of course, Exer. 35 would be much easier if you solved Exer. 18 first! If you're still stuck, here's my solution below.
First of all one must notice that terms of the form $R_{\alpha\alpha\mu\nu} = 0$, and of course also $R_{\alpha\beta\mu\mu} = 0$. Speaking from personal experience, one can waste quite some time if one doesn't appreciate this from the start! But once one imposes this, the problem simplifies tremendously. For now there are only six independent choices for the first pair $\alpha, \beta$ (where order doesn't matter because of Eq. (6.69)). One can establish this a number of ways. For instance, there are 4 ways of choosing $\alpha$ and then only 3 ways of choosing $\beta \neq \alpha$, giving $4 \times 3 = 12$ pairs. But we have counted them twice because we counted $\alpha = 1, \beta = 2$ separately from $\alpha = 2, \beta = 1$, etc., so we must divide this by two to get $12/2 = 6$ independent pairs. Similarly for the second pair, but let's be careful in choosing these because we're going to have to impose the symmetry given by the last equality in Eq. (6.69). Let's consider first those pairs $\mu, \nu$ where of course we require $\mu \neq \nu$, but also we impose that the pair $\mu, \nu$ differs from the pair $\alpha, \beta$. Thus for any given pair $\alpha, \beta$ we have only 5 pairs $\mu, \nu$. This counts twice the permutations $R_{\alpha\beta\mu\nu}$ and $R_{\mu\nu\alpha\beta}$, so we must divide by 2 to get $6 \times 5/2 = 15$ independent elements. To this we must add the six elements whose pair $\mu, \nu$ is not different from the pair $\alpha, \beta$, i.e. $\mu = \alpha$ or $\beta$ and $\nu = \beta$ or $\alpha$. This gives a total of $15 + 6 = 21$ independent elements of $R_{\alpha\beta\mu\nu}$ accounting for Eq. (6.69) only. (We account for Eq. (6.70) in the next problem.)
18. (c) Show that Eq. (6.70),
$$R_{\alpha\beta\mu\nu} + R_{\alpha\nu\beta\mu} + R_{\alpha\mu\nu\beta} = 0,$$
imposes only one restriction that is independent of Eq. (6.69), and thus reduces the number of independent components of $R_{\alpha\beta\mu\nu}$ to 20.
My solution is, as is often the case, rather longwinded. If you find a neater solution, please let me know! First let's establish that all the indices must be different. I do this by simply considering all the cases. Consider $\alpha = \beta$, and then $R_{\alpha\alpha\mu\nu} = 0$, which implies, from Eq. (6.70),
$$R_{\alpha\alpha\mu\nu} + R_{\alpha\nu\alpha\mu} + R_{\alpha\mu\nu\alpha} = 0,$$
$$R_{\alpha\nu\alpha\mu} + R_{\alpha\mu\nu\alpha} = 0,$$
$$R_{\alpha\nu\alpha\mu} = -R_{\alpha\mu\nu\alpha}. \tag{6.32}$$
But this final result is a special case of Eq. (6.69), so it is not a new result, and does not provide any constraints on $R_{\alpha\beta\mu\nu}$ beyond those of Eq. (6.69). We proceed like this, next considering $\alpha = \mu$:
$$R_{\alpha\beta\alpha\nu} + R_{\alpha\nu\beta\alpha} + R_{\alpha\alpha\nu\beta} = 0,$$
$$R_{\alpha\beta\alpha\nu} + R_{\alpha\nu\beta\alpha} = 0,$$
$$R_{\alpha\beta\alpha\nu} = -R_{\alpha\nu\beta\alpha}. \tag{6.33}$$
This last equality was also implied by Eq. (6.69) as a special case. Unfortunately we have a few more cases to consider: $\alpha = \nu$, and separately $\beta = \mu$, and separately $\beta = \nu$ (and similarly $\mu = \nu$). They all reduce to special cases of Eq. (6.69). So we conclude that the only new information in Eq. (6.70) must come from the case when the indices are all unique. There is only one set of 4 indices all different, 0, 1, 2, 3, and these can form 3 unique pairs of pairs:

    alpha, beta    mu, nu
    0, 1           2, 3
    0, 2           1, 3
    0, 3           2, 1        (6.34)

where the order of $\alpha, \beta$ does not matter, nor does the order of $\mu, \nu$. And notice also we don't distinguish between $\alpha, \beta = 0, 1$ while $\mu, \nu = 2, 3$ and the case where $\alpha, \beta = 2, 3$ while $\mu, \nu = 0, 1$, because the elements must have the same value due to the last equality in Eq. (6.69). So of the 21 elements of $R_{\alpha\beta\mu\nu}$ described in (b) above, there are 3 such that the indices are all different. But these three elements obey the relation Eq. (6.70), which can be written, assuming all the indices $\alpha, \beta, \mu, \nu$ are unique, for example as
$$R_{0123} + R_{0312} + R_{0231} = 0.$$
It is not necessary to distinguish this equation from others that can be obtained by simple applications of the rules in Eq. (6.69). This equation then imposes one constraint, independently of Eq. (6.69), and thus reduces the number of independent elements of $R_{\alpha\beta\mu\nu}$ from the 21 we found in (b) to 20.
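The counting argument of parts (b) and (c) can be repeated by enumeration. This is a minimal sketch under the assumptions made above (antisymmetric index pairs, symmetry under exchange of the two pairs, and one cyclic constraint per set of four distinct index values). The closed form $n^2(n^2-1)/12$ used as a cross-check is the standard result for $n$ dimensions and is not taken from this section; the function name is illustrative.

```python
from itertools import combinations


def riemann_component_count(n=4):
    """Return (antisymmetric pairs, count after Eq. (6.69), count after Eq. (6.70))."""
    pairs = list(combinations(range(n), 2))          # independent (alpha, beta) with alpha < beta
    after_669 = len(pairs) * (len(pairs) + 1) // 2   # symmetric "matrix" indexed by two pairs
    cyclic = len(list(combinations(range(n), 4)))    # one constraint per set of 4 distinct indices
    return len(pairs), after_669, after_669 - cyclic


for n in (2, 3, 4):
    print(n, riemann_component_count(n), n * n * (n * n - 1) // 12)
```

For $n = 2$ the count is 1, which is the single independent component used in Exer. 19.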

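19. Prove that $R^{\alpha}{}_{\beta\mu\nu} = 0$ for polar coordinates in the Euclidean plane.
First we find the number of independent components of $R_{\alpha\beta\mu\nu}$ in two dimensions. [Refer to Exer. 18(b) for the case of 4-dimensional space.] There is only one degree of freedom associated with the first pair of indices because
$$R_{r\theta\mu\nu} = -R_{\theta r\mu\nu} \quad\text{and}\quad R_{\alpha\alpha\mu\nu} = 0.$$
And similarly only one degree of freedom associated with the last two indices since
$$R_{\alpha\beta r\theta} = -R_{\alpha\beta\theta r} \quad\text{and}\quad R_{\alpha\beta\mu\mu} = 0.$$
It's not possible to use the cyclic identity to reduce this any further, since applying the cyclic identity we discover something that was true by the other symmetry relations, that is:
$$R_{r\theta r\theta} + R_{r\theta\theta r} + R_{rr\theta\theta} = 0, \quad \text{by cyclic identity, Eq. (6.70)}$$
$$R_{r\theta r\theta} - R_{r\theta r\theta} = 0, \quad \text{symmetry relations, Eq. (6.69).} \tag{6.35}$$
So there is only one independent value to compute. Let's compute
$$R^{r}{}_{\theta r\theta} = \Gamma^{r}{}_{\theta\theta,r} - \Gamma^{r}{}_{\theta r,\theta} + \Gamma^{r}{}_{\sigma r}\Gamma^{\sigma}{}_{\theta\theta} - \Gamma^{r}{}_{\sigma\theta}\Gamma^{\sigma}{}_{\theta r}, \quad \text{Eq. (6.63)}$$
$$= \Gamma^{r}{}_{\theta\theta,r} + \Gamma^{r}{}_{rr}\Gamma^{r}{}_{\theta\theta} - \Gamma^{r}{}_{\theta\theta}\Gamma^{\theta}{}_{\theta r}, \quad \text{Eq. (5.45)}$$
$$= \Gamma^{r}{}_{\theta\theta,r} - \Gamma^{r}{}_{\theta\theta}\Gamma^{\theta}{}_{\theta r}, \quad \text{Eq. (5.45)}$$
$$= \frac{\partial(-r)}{\partial r} - \frac{1}{r}(-r), \quad \text{Eq. (5.45)}$$
$$= 0. \tag{6.36}$$
And this is of course what we expect since (despite the polar coordinates) we're in Euclidean space, which is flat, and Eq. (6.71) tells us we must have zero for the Riemann tensor.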
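The computation in (6.36) can also be checked numerically from Eq. (6.63). In the sketch below the Christoffel symbols of polar coordinates are entered by hand from Eq. (5.45), and their derivatives are approximated by central differences. The unit-sphere Christoffel symbols are an added contrast case, not taken from this exercise, for which the standard result $R^{\theta}{}_{\phi\theta\phi} = \sin^2\theta$ should come out; all function names and sample points are illustrative assumptions.

```python
import math


def riemann(gamma, x, a, b, c, d, h=1e-5):
    """R^a_{bcd} from Eq. (6.63):
    Gamma^a_{bd,c} - Gamma^a_{bc,d} + Gamma^a_{ec} Gamma^e_{bd} - Gamma^a_{ed} Gamma^e_{bc}.
    `gamma(x)` must return G with G[a][b][c] = Gamma^a_{bc} at the point x."""
    n = len(x)

    def dgamma(i, j, k, wrt):
        # Central difference of Gamma^i_{jk} with respect to coordinate `wrt`.
        xp, xm = list(x), list(x)
        xp[wrt] += h
        xm[wrt] -= h
        return (gamma(xp)[i][j][k] - gamma(xm)[i][j][k]) / (2 * h)

    G = gamma(x)
    value = dgamma(a, b, d, c) - dgamma(a, b, c, d)
    for e in range(n):
        value += G[a][e][c] * G[e][b][d] - G[a][e][d] * G[e][b][c]
    return value


def gamma_polar(x):
    """Christoffel symbols of plane polar coordinates (r, theta), Eq. (5.45)."""
    r, _theta = x
    G = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    G[0][1][1] = -r                    # Gamma^r_{theta theta}
    G[1][0][1] = G[1][1][0] = 1.0 / r  # Gamma^theta_{r theta}
    return G


def gamma_unit_sphere(x):
    """Christoffel symbols of the unit sphere (theta, phi); contrast case only."""
    theta, _phi = x
    G = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)             # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)  # Gamma^phi_{theta phi}
    return G


print(riemann(gamma_polar, [2.0, 0.6], 0, 1, 0, 1))
print(riemann(gamma_unit_sphere, [0.8, 0.3], 0, 1, 0, 1), math.sin(0.8) ** 2)
```

The first value should vanish to within finite-difference error, while the second pair of numbers should agree, showing that the routine does detect curvature when it is present.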
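20. Fill in the algebra necessary to establish Eq. (6.73).
The covariant derivative of a vector is
$$\nabla_{\beta} V^{\mu} = V^{\mu}{}_{;\beta} = V^{\mu}{}_{,\beta} + \Gamma^{\mu}{}_{\nu\beta} V^{\nu}, \quad \text{see Eq. (6.33).} \tag{6.37}$$
Now we simply apply the gradient operator another time,
$$\nabla_{\alpha}\left(\nabla_{\beta} V^{\mu}\right) = \left(V^{\mu}{}_{;\beta}\right)_{;\alpha} = \left(V^{\mu}{}_{;\beta}\right)_{,\alpha} + \Gamma^{\mu}{}_{\sigma\alpha} V^{\sigma}{}_{;\beta} - \Gamma^{\sigma}{}_{\beta\alpha} V^{\mu}{}_{;\sigma}, \quad \text{see Eq. (6.34)}$$
$$= \left(V^{\mu}{}_{;\beta}\right)_{,\alpha}, \tag{6.38}$$
where the last step used the fact that the Christoffel symbols are zero in local inertial coordinates. But their gradients are not, so
$$\nabla_{\alpha}\left(\nabla_{\beta} V^{\mu}\right) = \left(V^{\mu}{}_{;\beta}\right)_{,\alpha} = V^{\mu}{}_{,\beta\alpha} + \left(\Gamma^{\mu}{}_{\nu\beta} V^{\nu}\right)_{,\alpha} = V^{\mu}{}_{,\beta\alpha} + \Gamma^{\mu}{}_{\nu\beta,\alpha} V^{\nu}. \tag{6.39}$$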
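21. Following Eq. (6.78) it was claimed that one could generalize Eq. (6.77) for the commutator of the covariant derivative of a vector to a tensor with Eq. (6.78). Each index got a Riemann tensor and the sign was always positive, even when the index was a lower index. In parentheses it was claimed that this must be the case because the metric tensor $g$ is unaffected by the covariant derivative.
This is a great problem. The result Eq. (6.78) seems to contradict Eq. (6.34). I cannot find the explanation Schutz is looking for. All I've managed so far on this is to derive Eq. (6.78) (see my supplementary problem SP2 in section 6.10 below) and to show that Eq. (6.34) is at least consistent in the following sense.
$$V_{\alpha;\beta} = \left(g_{\alpha\mu} V^{\mu}\right)_{;\beta} = g_{\alpha\mu;\beta} V^{\mu} + g_{\alpha\mu} V^{\mu}{}_{;\beta}$$
$$= \left(g_{\alpha\mu,\beta} - \Gamma^{\nu}{}_{\alpha\beta}\, g_{\nu\mu} - \Gamma^{\nu}{}_{\mu\beta}\, g_{\alpha\nu}\right) V^{\mu} + g_{\alpha\mu} V^{\mu}{}_{;\beta}$$
$$= \left(g_{\alpha\mu,\beta} - \Gamma^{\nu}{}_{\alpha\beta}\, g_{\nu\mu} - \Gamma^{\nu}{}_{\mu\beta}\, g_{\alpha\nu}\right) V^{\mu} + g_{\alpha\mu}\left(V^{\mu}{}_{,\beta} + \Gamma^{\mu}{}_{\nu\beta} V^{\nu}\right). \tag{6.40}$$
Let's relabel the dummy indices so that it's clearer which terms cancel:
$$V_{\alpha;\beta} = g_{\alpha\mu,\beta} V^{\mu} + g_{\alpha\mu} V^{\mu}{}_{,\beta} - \Gamma^{\nu}{}_{\alpha\beta} V_{\nu} - g_{\alpha\nu}\Gamma^{\nu}{}_{\mu\beta} V^{\mu} + g_{\alpha\nu}\Gamma^{\nu}{}_{\mu\beta} V^{\mu}$$
$$= \left(g_{\alpha\mu} V^{\mu}\right)_{,\beta} - \Gamma^{\nu}{}_{\alpha\beta} V_{\nu} = V_{\alpha,\beta} - \Gamma^{\nu}{}_{\alpha\beta} V_{\nu}, \tag{6.41}$$
which is consistent with Eq. (6.34). Of course we used Eq. (6.34) in obtaining this (to expand $g_{\alpha\mu;\beta}$), so it's not a proof. But it is a check on internal consistency.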
22. Establish Eqs. (6.84), (6.85), and (6.86).
23. Prove Eq. (6.88).
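Eq. (6.88) gives the partial derivative of the Riemann curvature tensor. As suggested in §6.6, we can find it by starting with the definition of the Riemann curvature tensor given in Eq. (6.63).
$$R_{\alpha\beta\mu\nu,\lambda} = \left(g_{\alpha\sigma} R^{\sigma}{}_{\beta\mu\nu}\right)_{,\lambda} = g_{\alpha\sigma} R^{\sigma}{}_{\beta\mu\nu,\lambda}, \quad \text{using Eq. (6.5)}$$
$$= g_{\alpha\sigma}\left(\Gamma^{\sigma}{}_{\beta\nu,\mu\lambda} - \Gamma^{\sigma}{}_{\beta\mu,\nu\lambda} + \left(\Gamma^{\sigma}{}_{\rho\mu}\Gamma^{\rho}{}_{\beta\nu} - \Gamma^{\sigma}{}_{\rho\nu}\Gamma^{\rho}{}_{\beta\mu}\right)_{,\lambda}\right)$$
$$= g_{\alpha\sigma}\left(\Gamma^{\sigma}{}_{\beta\nu,\mu\lambda} - \Gamma^{\sigma}{}_{\beta\mu,\nu\lambda}\right), \quad \text{in a local inertial frame.} \tag{6.42}$$
We now use Eq. (6.32), which is purportedly correct in any coordinate system, to write the Christoffel symbols in terms of the metric tensor. We must differentiate Eq. (6.32) with respect to $x^{\sigma}$:
$$\Gamma^{\alpha}{}_{\mu\nu,\sigma} = \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right) + \tfrac{1}{2}\, g^{\alpha\beta}{}_{,\sigma}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right). \tag{6.43}$$
We might note that to arrive at Eq. (6.64), we would eliminate the terms that are zero by Eq. (6.5) and the results of Exer. 17(a), i.e. $g^{\alpha\beta}{}_{,\sigma} = 0$, leading to:
$$\Gamma^{\alpha}{}_{\mu\nu,\sigma} = \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right). \tag{6.44}$$
But since we are going to differentiate this again, to be on the safe side, let's keep the terms that we eliminated from (6.43) because of $g^{\alpha\beta}{}_{,\sigma} = 0$. Now differentiate (6.43) with respect to $x^{\lambda}$ to arrive at:
$$\Gamma^{\alpha}{}_{\mu\nu,\sigma\lambda} = \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma\lambda} + g_{\beta\nu,\mu\sigma\lambda} - g_{\mu\nu,\beta\sigma\lambda}\right) + \tfrac{1}{2}\, g^{\alpha\beta}{}_{,\lambda}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right)$$
$$\quad + \tfrac{1}{2}\, g^{\alpha\beta}{}_{,\sigma}\left(g_{\beta\mu,\nu\lambda} + g_{\beta\nu,\mu\lambda} - g_{\mu\nu,\beta\lambda}\right) + \tfrac{1}{2}\, g^{\alpha\beta}{}_{,\sigma\lambda}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right)$$
$$= \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma\lambda} + g_{\beta\nu,\mu\sigma\lambda} - g_{\mu\nu,\beta\sigma\lambda}\right). \tag{6.45}$$
The last three terms of the first expression didn't contribute because each has a factor containing only a first derivative of the metric or of its inverse, which vanishes. So being careful didn't amount to anything here; we could have differentiated Eq. (6.64) right away. Armed with this 2nd derivative of the Christoffel symbol, we can substitute it into (6.42), giving:
$$R_{\alpha\beta\mu\nu,\lambda} = g_{\alpha\sigma}\, \tfrac{1}{2}\, g^{\sigma\rho}\left[g_{\rho\beta,\nu\mu\lambda} + g_{\rho\nu,\beta\mu\lambda} - g_{\beta\nu,\rho\mu\lambda} - \left(g_{\rho\beta,\mu\nu\lambda} + g_{\rho\mu,\beta\nu\lambda} - g_{\beta\mu,\rho\nu\lambda}\right)\right]$$
$$= \tfrac{1}{2}\, \delta^{\rho}{}_{\alpha}\left[g_{\rho\nu,\beta\mu\lambda} - g_{\beta\nu,\rho\mu\lambda} - \left(g_{\rho\mu,\beta\nu\lambda} - g_{\beta\mu,\rho\nu\lambda}\right)\right]$$
$$= \tfrac{1}{2}\left[g_{\alpha\nu,\beta\mu\lambda} - g_{\beta\nu,\alpha\mu\lambda} - \left(g_{\alpha\mu,\beta\nu\lambda} - g_{\beta\mu,\alpha\nu\lambda}\right)\right]$$
$$= \tfrac{1}{2}\left[g_{\alpha\nu,\beta\mu\lambda} - g_{\beta\nu,\alpha\mu\lambda} - g_{\alpha\mu,\beta\nu\lambda} + g_{\beta\mu,\alpha\nu\lambda}\right]. \tag{6.46}$$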
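24. Establish Eq. (6.89) from Eq. (6.88).
The solution follows from straightforward substitution of Eq. (6.88) into Eq. (6.89). For the 2nd term, make the substitutions:
$$\mu \to \lambda, \qquad \nu \to \mu, \qquad \lambda \to \nu. \tag{6.47}$$
For the 3rd term, make the substitutions:
$$\mu \to \nu, \qquad \nu \to \lambda, \qquad \lambda \to \mu. \tag{6.48}$$
Matching up terms that cancel is simply a matter of finding the terms with the same first two indices, since $g$ is symmetric. These pairs of terms will cancel because the remaining indices (the 3 after the comma) will necessarily correspond, and again order doesn't matter because partial derivatives commute.
$$2R_{\alpha\beta\mu\nu,\lambda} + 2R_{\alpha\beta\lambda\mu,\nu} + 2R_{\alpha\beta\nu\lambda,\mu} = \left[g_{\alpha\nu,\beta\mu\lambda} - g_{\beta\nu,\alpha\mu\lambda} - g_{\alpha\mu,\beta\nu\lambda} + g_{\beta\mu,\alpha\nu\lambda}\right]$$
$$\quad + \left[g_{\alpha\mu,\beta\lambda\nu} - g_{\beta\mu,\alpha\lambda\nu} - g_{\alpha\lambda,\beta\mu\nu} + g_{\beta\lambda,\alpha\mu\nu}\right]$$
$$\quad + \left[g_{\alpha\lambda,\beta\nu\mu} - g_{\beta\lambda,\alpha\nu\mu} - g_{\alpha\nu,\beta\lambda\mu} + g_{\beta\nu,\alpha\lambda\mu}\right]$$
$$= 0. \tag{6.49}$$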
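25. (a) Prove that the Ricci tensor $R_{\alpha\beta}$ is the only independent contraction of $R^{\alpha}{}_{\beta\mu\nu}$, since all others are multiples of it [or they are zero, as pointed out in the text].
First we need the result that
$$R^{\alpha}{}_{\beta\mu\nu} = -R_{\beta}{}^{\alpha}{}_{\mu\nu},$$
as was proven in supplementary problem SP.4. It follows that
$$R^{\alpha}{}_{\alpha\mu\nu} = -R_{\alpha}{}^{\alpha}{}_{\mu\nu} = 0, \quad \forall\, \mu, \nu,$$
since zero is the only number equal to its own negative. Similarly,
$$R_{\alpha\beta}{}^{\mu}{}_{\mu} = 0, \quad \forall\, \alpha, \beta.$$
It remains to consider $R_{\alpha}{}^{\mu}{}_{\beta\mu}$, $R_{\alpha}{}^{\mu}{}_{\mu\beta}$, $R^{\mu}{}_{\alpha\beta\mu}$. These candidates were identified by stepping through the possibilities systematically: 1st and 2nd, 1st and 3rd, 1st and 4th (that's it for those involving the first index), 2nd and 3rd (1st and 2nd already considered), 2nd and 4th, 3rd and 4th. That's all.
$$R_{\alpha}{}^{\mu}{}_{\beta\mu} = g^{\mu\sigma} R_{\alpha\sigma\beta\mu} = g^{\mu\sigma} R_{\sigma\alpha\mu\beta}, \quad \text{by Eq. (6.69)}$$
$$= R^{\mu}{}_{\alpha\mu\beta} = R_{\alpha\beta}, \quad \text{i.e. the Ricci tensor.} \tag{6.50}$$
$$R_{\alpha}{}^{\mu}{}_{\mu\beta} = -R^{\mu}{}_{\alpha\mu\beta}, \quad \text{using results of SP.4}$$
$$= -R_{\alpha\beta}. \tag{6.51}$$
$$R^{\mu}{}_{\alpha\beta\mu} = -R_{\alpha}{}^{\mu}{}_{\beta\mu}, \quad \text{using results of SP.4}$$
$$= -R^{\mu}{}_{\alpha\mu\beta} = -R_{\alpha\beta}. \tag{6.52}$$
25. (b) Show the Ricci tensor is symmetric.
$$R_{\alpha\beta} = g^{\mu\sigma} R_{\sigma\alpha\mu\beta} = g^{\mu\sigma} R_{\mu\beta\sigma\alpha}, \quad \text{by Eq. (6.69)}$$
$$= R^{\sigma}{}_{\beta\sigma\alpha} = R_{\beta\alpha}. \tag{6.53}$$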
26. Use Exer. 17(a) to prove Eq. (6.94).
In Exer. 17(a) we proved that $g^{\alpha\beta}{}_{,\mu}(\mathcal{P}) = 0$ at some event $\mathcal{P}$. If we choose a local inertial reference frame, then $\Gamma^{\alpha}{}_{\mu\nu}(\mathcal{P}) = 0$, so
$$g^{\alpha\beta}{}_{,\mu}(\mathcal{P}) = g^{\alpha\beta}{}_{;\mu}(\mathcal{P}) = 0,$$
just as in the un-numbered equation on p. 151 between Eqs. (6.30) and (6.31). The 2nd equality is a valid tensor equation, which is then valid in all reference frames. So, just as for Eq. (6.31), we also have:
$$g^{\alpha\beta}{}_{;\mu}(\mathcal{P}) = 0, \quad \text{in any basis.}$$
27. Fill in the steps needed to establish Eqs. (6.95), (6.97), and (6.99).
Eq. (6.95) is the 2nd term in Eq. (6.93)
g
R
;
= g
R
;
= (g
)
;
using Eq. (6.94)
= (g
)
;
using Eq. (6.69)
= (R
)
;
= (R
)
;
= R
;
(6.54)
Although not asked for this, note that the rst and 3rd terms in Eq. (6.63)
follow immediately from multiplication by the inverse metric tensor, so we
have established Eq. (6.63) as well.
To establish Eq. (6.67) we need Eq. (6.66), which is the contraction of
Eq. (6.63) with the inverse metric:
0 = g
[R
;
R
;
+ R
;
]
= R
;
+ g
[R
;
+ R
;
] used Eqs. (6.92), (6.94)
= R
;
R
;
+ g
;
relabled (6.55)
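The 3rd term is more involved. I found it easiest to work backwards, starting at the result in Eq. (6.96):
\begin{align*}
-R^{\mu}{}_{\lambda;\mu} &= -g^{\mu\alpha} R_{\alpha\lambda;\mu} && \text{used Eqs. (6.92), (6.94)} \\
&= -g^{\mu\alpha} R^{\nu}{}_{\alpha\nu\lambda;\mu} && \text{used Eq. (6.91)} \\
&= -g^{\mu\alpha} g^{\nu\beta} R_{\beta\alpha\nu\lambda;\mu} \\
&= g^{\mu\alpha} g^{\nu\beta} R_{\alpha\beta\nu\lambda;\mu} && \text{used Eq. (6.69)} \\
&= g^{\beta\nu} R^{\mu}{}_{\beta\nu\lambda;\mu} \tag{6.56}
\end{align*}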
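Substituting this for the 3rd term in (6.57) gives Eq. (6.96):
\[ 0 = R_{;\lambda} - R^{\mu}{}_{\lambda;\mu} - R^{\mu}{}_{\lambda;\mu} \tag{6.57} \]
Multiplying by $-1$ gives:
\[ 2 R^{\mu}{}_{\lambda;\mu} - R_{;\lambda} = 0 \tag{6.58} \]
For all values of $\mu$ and $\lambda$, the Ricci scalar $R$ is a scalar. So we can write,
\[ R_{;\lambda} = (\delta^{\mu}{}_{\lambda} R)_{;\mu} = \delta^{\mu}{}_{\lambda} R_{;\mu} \tag{6.59} \]
Substitution into (6.60) gives Eq. (6.97):
\[ (2 R^{\mu}{}_{\lambda} - \delta^{\mu}{}_{\lambda} R)_{;\mu} = 0 \tag{6.60} \]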
The first equality in Eq. (6.98) is a definition (for some reason he's changed notation from $:=$ to $\equiv$, both being standard nomenclature). The 2nd equality follows from the symmetry of $R^{\alpha\beta}$ as follows:
R
= R
= R
c.f. Eq. (6.91)

= R
symmetry of (inverse) metric tensor

= R
(6.61)
Working backwards again, replace in Eq. (6.99) and multiply by 2g
giving:
2g
;
= 2g
;
symmetry of G
= 2g
(R
1
2
g
R)
;
using def. in Eq. (6.98)
= (2R
R)
;
= (2R
R)
;
(6.62)
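28.(a) Derive Eq. (6.19) by using the usual coordinate transformation from Cartesian to spherical polars. [On the one hand these problems working with familiar coordinates like spherical polar may help build physical intuition. But on the other hand, you lose generality by working with a specific coordinate system.]
First one needs the coordinate transformation equations:
\begin{align*}
x &= r \sin\theta \cos\phi \\
y &= r \sin\theta \sin\phi \\
z &= r \cos\theta \tag{6.63}
\end{align*}
We'll need the transformation matrix, $\partial x^{i}/\partial x^{i'}$, where the $x^{i'}$ refer to the spherical-polar coordinates $(r, \theta, \phi)$ in that order.
\[
\begin{pmatrix}
\dfrac{\partial x}{\partial r} = \sin\theta\cos\phi & \dfrac{\partial x}{\partial \theta} = r\cos\theta\cos\phi & \dfrac{\partial x}{\partial \phi} = -r\sin\theta\sin\phi \\[1ex]
\dfrac{\partial y}{\partial r} = \sin\theta\sin\phi & \dfrac{\partial y}{\partial \theta} = r\cos\theta\sin\phi & \dfrac{\partial y}{\partial \phi} = r\sin\theta\cos\phi \\[1ex]
\dfrac{\partial z}{\partial r} = \cos\theta & \dfrac{\partial z}{\partial \theta} = -r\sin\theta & \dfrac{\partial z}{\partial \phi} = 0
\end{pmatrix}
\]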
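The metric tensor for Euclidean space in Cartesian coordinates is simply
\[ g_{ij} = \delta_{ij} \qquad \text{Eq. (5.29)} \]
which is immediately clear from considering the dot product of the Cartesian coordinate basis vectors in 3D Euclidean space. So we can obtain the metric of 3D Euclidean space in spherical-polar coordinates through the transformation
\[ g_{i'j'} = \left(\frac{\partial x^{i}}{\partial x^{i'}}\right)\left(\frac{\partial x^{j}}{\partial x^{j'}}\right) g_{ij} = \left(\frac{\partial x^{i}}{\partial x^{i'}}\right)\left(\frac{\partial x^{i}}{\partial x^{j'}}\right) \tag{6.64} \]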
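\begin{align*}
g_{rr} &= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{j}}{\partial r}\right) g_{ij} = \left(\frac{\partial x^{i}}{\partial r}\right)^{2}, \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial r}\right)^{2} + \left(\frac{\partial y}{\partial r}\right)^{2} + \left(\frac{\partial z}{\partial r}\right)^{2} \\
&= (\sin\theta\cos\phi)^{2} + (\sin\theta\sin\phi)^{2} + (\cos\theta)^{2} \\
&= 1. \tag{6.65}
\end{align*}
\begin{align*}
g_{r\theta} &= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{j}}{\partial \theta}\right) g_{ij} = \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{i}}{\partial \theta}\right) \quad \text{with summation over } i \\
&= \frac{\partial x}{\partial r}\frac{\partial x}{\partial \theta} + \frac{\partial y}{\partial r}\frac{\partial y}{\partial \theta} + \frac{\partial z}{\partial r}\frac{\partial z}{\partial \theta} \\
&= (\sin\theta\cos\phi)(r\cos\theta\cos\phi) + (\sin\theta\sin\phi)(r\cos\theta\sin\phi) + (\cos\theta)(-r\sin\theta) \\
&= 0. \tag{6.66}
\end{align*}
\begin{align*}
g_{r\phi} &= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{j}}{\partial \phi}\right) g_{ij} = \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{i}}{\partial \phi}\right) \\
&= \frac{\partial x}{\partial r}\frac{\partial x}{\partial \phi} + \frac{\partial y}{\partial r}\frac{\partial y}{\partial \phi} + \frac{\partial z}{\partial r}\frac{\partial z}{\partial \phi} \\
&= (\sin\theta\cos\phi)(-r\sin\theta\sin\phi) + (\sin\theta\sin\phi)(r\sin\theta\cos\phi) + (\cos\theta)(0) \\
&= 0. \tag{6.67}
\end{align*}
\[ g_{\theta r} = g_{r\theta}, \quad \text{all metrics are symmetric} \tag{6.68} \]
\begin{align*}
g_{\theta\theta} &= \left(\frac{\partial x^{i}}{\partial \theta}\right)\left(\frac{\partial x^{j}}{\partial \theta}\right) g_{ij} = \left(\frac{\partial x^{i}}{\partial \theta}\right)^{2} \\
&= \left(\frac{\partial x}{\partial \theta}\right)^{2} + \left(\frac{\partial y}{\partial \theta}\right)^{2} + \left(\frac{\partial z}{\partial \theta}\right)^{2} \\
&= (r\cos\theta\cos\phi)^{2} + (r\cos\theta\sin\phi)^{2} + (-r\sin\theta)^{2} \\
&= r^{2}. \tag{6.69}
\end{align*}
\begin{align*}
g_{\theta\phi} &= \left(\frac{\partial x^{i}}{\partial \theta}\right)\left(\frac{\partial x^{j}}{\partial \phi}\right) g_{ij} = \left(\frac{\partial x^{i}}{\partial \theta}\right)\left(\frac{\partial x^{i}}{\partial \phi}\right) \\
&= \frac{\partial x}{\partial \theta}\frac{\partial x}{\partial \phi} + \frac{\partial y}{\partial \theta}\frac{\partial y}{\partial \phi} + \frac{\partial z}{\partial \theta}\frac{\partial z}{\partial \phi} \\
&= (r\cos\theta\cos\phi)(-r\sin\theta\sin\phi) + (r\cos\theta\sin\phi)(r\sin\theta\cos\phi) + 0 \\
&= 0. \tag{6.70}
\end{align*}
\[ g_{\phi r} = g_{r\phi}, \qquad g_{\phi\theta} = g_{\theta\phi} \]
\begin{align*}
g_{\phi\phi} &= \left(\frac{\partial x^{i}}{\partial \phi}\right)\left(\frac{\partial x^{j}}{\partial \phi}\right) g_{ij} = \left(\frac{\partial x^{i}}{\partial \phi}\right)^{2} \\
&= \left(\frac{\partial x}{\partial \phi}\right)^{2} + \left(\frac{\partial y}{\partial \phi}\right)^{2} + \left(\frac{\partial z}{\partial \phi}\right)^{2} \\
&= (-r\sin\theta\sin\phi)^{2} + (r\sin\theta\cos\phi)^{2} + 0 \\
&= r^{2}\sin^{2}\theta. \tag{6.73}
\end{align*}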
The above metric is consistent with that presented in matrix form in
Eq. (6.19).
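The component-by-component calculation above can also be checked numerically. The following is only a sketch, not part of Schutz's text: it assumes a central-difference estimate of the transformation matrix (the helper names `cartesian`, `jacobian` and `spherical_metric` and the sample point are my own choices) and then forms the sum over $i$ in Eq. (6.64).

```python
import math

def cartesian(r, theta, phi):
    """Cartesian (x, y, z) for spherical-polar (r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def jacobian(q, h=1e-6):
    """Central-difference estimate; J[j][i] = d x^i / d q^j."""
    J = []
    for j in range(3):
        up, dn = list(q), list(q)
        up[j] += h
        dn[j] -= h
        xu, xd = cartesian(*up), cartesian(*dn)
        J.append([(a - b) / (2 * h) for a, b in zip(xu, xd)])
    return J

def spherical_metric(q):
    """g_{i'j'} = sum_i (dx^i/dq^{i'}) (dx^i/dq^{j'}), using g_ij = delta_ij."""
    J = jacobian(q)
    return [[sum(J[a][i] * J[b][i] for i in range(3)) for b in range(3)]
            for a in range(3)]

# Arbitrary sample point (an assumption, any generic point will do).
r, theta, phi = 2.0, 0.7, 1.9
g = spherical_metric((r, theta, phi))
expected = [[1.0, 0.0, 0.0],
            [0.0, r**2, 0.0],
            [0.0, 0.0, (r * math.sin(theta))**2]]
```

The numerical `g` should agree with `expected`, i.e. with diag$(1, r^{2}, r^{2}\sin^{2}\theta)$, to within the finite-difference error.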
28.(b) Deduce from Eq. (6.19) that the metric of the surface of a sphere of radius r has components
\[ g_{\theta\theta} = r^{2}, \quad g_{\phi\phi} = r^{2}\sin^{2}\theta, \quad g_{\theta\phi} = 0 \]
in spherical coordinates.
On the surface of a sphere, the variable r is held fixed at the radius so $dr = 0$ and the line element becomes
\[ dl^{2} = r^{2}\, d\theta^{2} + r^{2}\sin^{2}\theta\, d\phi^{2} \]
consistent with the metric given:
\[ g_{\theta\theta} = r^{2}, \quad g_{\phi\phi} = r^{2}\sin^{2}\theta, \quad g_{\theta\phi} = 0 \]
28.(c) Find the components of $g^{\alpha\beta}$ for the sphere.
Recall from elementary linear algebra, or the results of Exer. 3(c), the inverse of a diagonal matrix is just the inverse of the diagonal elements
\[ (D_{ij})^{-1} = \left(\frac{1}{D_{ij}}\right) \quad \text{when } i = j \text{ and 0 otherwise} \]
Applying that here we find
\[ g^{\theta\theta} = r^{-2}, \quad g^{\phi\phi} = r^{-2}\sin^{-2}\theta, \quad g^{\theta\phi} = 0 \]
29. Find the Riemann curvature tensor on the surface of a sphere of radius r = 1 in polar coordinates. There's only one independent component, $R_{\theta\phi\theta\phi}$, see Exer. 18(b) for explanation.
Even though there's only one component, and we have the metric from Exer. 28, this question still involves a lot of work. I suggest we keep r as a variable since it's no extra effort and it gains us a more general result. Let's break it into three steps.
(i) Find the basis vectors $\vec{e}_{\theta}$ and $\vec{e}_{\phi}$.
30. Boring.
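31. Show that covariant differentiation obeys the usual product rule, e.g.
\[ (V^{\alpha\beta} W_{\beta\gamma})_{;\mu} = V^{\alpha\beta}{}_{;\mu} W_{\beta\gamma} + V^{\alpha\beta} W_{\beta\gamma;\mu} \]
Hint: Use a locally inertial frame.
In a locally inertial frame, the Christoffel symbols vanish and covariant derivatives equal partial derivatives, so
\begin{align*}
(V^{\alpha\beta} W_{\beta\gamma})_{;\mu} &= (V^{\alpha\beta} W_{\beta\gamma})_{,\mu} \\
&= \frac{\partial}{\partial x^{\mu}} \Big( \sum_{\beta} V^{\alpha\beta} W_{\beta\gamma} \Big) && \text{from here forward suspend usual summation convention} \\
&= \sum_{\beta} \frac{\partial}{\partial x^{\mu}} \big( V^{\alpha\beta} W_{\beta\gamma} \big) \\
&= \sum_{\beta} \Big( \frac{\partial V^{\alpha\beta}}{\partial x^{\mu}} W_{\beta\gamma} + V^{\alpha\beta} \frac{\partial W_{\beta\gamma}}{\partial x^{\mu}} \Big) \\
&= \sum_{\beta} \big( V^{\alpha\beta}{}_{,\mu} W_{\beta\gamma} + V^{\alpha\beta} W_{\beta\gamma,\mu} \big) \\
&= V^{\alpha\beta}{}_{,\mu} W_{\beta\gamma} + V^{\alpha\beta} W_{\beta\gamma,\mu} && \text{reinvoke usual summation convention} \\
&= V^{\alpha\beta}{}_{;\mu} W_{\beta\gamma} + V^{\alpha\beta} W_{\beta\gamma;\mu} && \text{because we're in a locally inertial frame.} \tag{6.74}
\end{align*}
The last equality is a valid tensor equation, valid in all reference frames.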
32. A 4D manifold has coordinates (u, v, w, p) in which the metric has components
\[ g_{uv} = 1, \qquad g_{ww} = 1, \qquad g_{pp} = 1 \tag{6.75} \]
and all other independent components vanishing (emphasis my own).
(a) Show that the manifold is flat and the signature is +2.
By §6.7 point (6), we have that a flat space has Riemann curvature identically zero, $R^{\alpha}{}_{\beta\mu\nu} = 0$. The Riemann curvature tensor was expressed in terms of the metric in Eq. (6.68). The important point is that it depends only upon 2nd derivatives of the metric. But the metric here is given as a constant, independent of event or point on the manifold. So the derivatives vanish and the Riemann curvature tensor is identically zero, $R^{\alpha}{}_{\beta\mu\nu} = 0$.
To find the signature we only need to count the number of positive and negative eigenvalues. But since in (b) we will need to diagonalize the metric tensor, we might as well do it here. Then it's trivial to find the eigenvalues. Simply by playing around, I found the following transformation worked:
\[
H = \begin{pmatrix}
-\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\
+\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
Then
\[ H^{T} (g) H = (\eta) \tag{6.76} \]
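A quick way to confirm a candidate transformation is to multiply the matrices out. The sketch below is my own check, under the assumption that $H$ has rows ordered $(u, v, w, p)$ and columns ordered $(t, x, y, z)$ with first block $[[-\sqrt{2}/2, -\sqrt{2}/2], [+\sqrt{2}/2, -\sqrt{2}/2]]$; this is one valid choice, and other sign choices also work. The helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

a = math.sqrt(2) / 2

# Metric in (u, v, w, p): g_uv = g_vu = g_ww = g_pp = 1, all else zero.
g_uvwp = [[0, 1, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 1]]

# Assumed transformation matrix (one valid choice of signs).
H = [[-a, -a, 0, 0],
     [ a, -a, 0, 0],
     [ 0,  0, 1, 0],
     [ 0,  0, 0, 1]]

eta = matmul(transpose(H), matmul(g_uvwp, H))

# Signature = (number of positive) - (number of negative) diagonal entries.
signature = sum(1 if eta[i][i] > 0 else -1 for i in range(4))
```

With this choice `eta` comes out as diag$(-1, 1, 1, 1)$ up to rounding, so `signature` is $+2$.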
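(b) Find the transformation to the usual coordinates t, x, y, z.
We found the transformation matrix in (a) to be:
\[
H = \begin{pmatrix}
-\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\
+\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
This means that
\[ g_{\alpha\beta} = \Lambda^{\mu'}{}_{\alpha} \Lambda^{\nu'}{}_{\beta}\, g_{\mu'\nu'} \]
where $\alpha$, $\beta$ are indices on the coordinates in (t, x, y, z) and $\mu'$, $\nu'$ indices on the coordinates in (u, v, w, p) and say $u^{0'} = u$, $u^{1'} = v$, . . . So,
\[ H = (\Lambda^{\mu'}{}_{\alpha}) = \frac{\partial u^{\mu'}}{\partial x^{\alpha}} \]
Thus, integrating these derivatives finally we arrive at:
\begin{align*}
u &= -\frac{\sqrt{2}}{2}\, t - \frac{\sqrt{2}}{2}\, x, \\
v &= +\frac{\sqrt{2}}{2}\, t - \frac{\sqrt{2}}{2}\, x, \\
w &= y, \\
p &= z. \tag{6.77}
\end{align*}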
33. A 3-sphere is the 3D surface in a 4D Euclidean space (coordinates x, y, z, w), given by the equation
\[ x^{2} + y^{2} + z^{2} + w^{2} = r^{2} \]
where r is the radius of the three-sphere.
(a) Define new coordinates $(r, \chi, \theta, \phi)$ by the equations
\begin{align*}
w &= r\cos(\chi), \\
z &= r\sin(\chi)\cos(\theta), \\
y &= r\sin(\chi)\sin(\theta)\sin(\phi), \\
x &= r\sin(\chi)\sin(\theta)\cos(\phi), \tag{6.78}
\end{align*}
Show that $(\chi, \theta, \phi)$ are coordinates for the sphere.
If we simply substitute these equations (6.78) into the equation for the 3-sphere, we find that the equation is satisfied for fixed r for all values of $(\chi, \theta, \phi)$. So these coordinates can vary and we stay on the 3-sphere. But to show that these are truly coordinates, I believe we must also show that the transformation defined by (6.78) is not singular, c.f. Eq. (5.6). After a whole lot of algebra, I found the determinant of
\[
\det \begin{pmatrix}
\dfrac{\partial w}{\partial r} & \dfrac{\partial w}{\partial \chi} & \dots & \\[1ex]
\dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial \chi} & \dots & \\[1ex]
\dfrac{\partial y}{\partial r} & \dots & & \\[1ex]
\dfrac{\partial x}{\partial r} & \dots & & \dfrac{\partial x}{\partial \phi}
\end{pmatrix}
= -r^{3}\sin^{2}(\chi)\sin(\theta).
\]
So just as in spherical-polar coordinates there are singular points (where $\sin\chi = 0$ or $\sin\theta = 0$), but the transformation is generally non-singular.
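Since the algebra for the $4 \times 4$ determinant is lengthy, a numerical spot check is reassuring. The following sketch is my own, not Schutz's: it assumes rows ordered $(w, z, y, x)$ and columns ordered $(r, \chi, \theta, \phi)$, estimates the Jacobian by central differences, and compares the magnitude of its determinant with $r^{3}\sin^{2}\chi\sin\theta$. The function names and the sample point are hypothetical.

```python
import math

def embed(r, chi, theta, phi):
    """(w, z, y, x) as functions of (r, chi, theta, phi), Eq. (6.78)."""
    return (r * math.cos(chi),
            r * math.sin(chi) * math.cos(theta),
            r * math.sin(chi) * math.sin(theta) * math.sin(phi),
            r * math.sin(chi) * math.sin(theta) * math.cos(phi))

def jacobian(q, h=1e-6):
    """J[i][j] = d(embed_i)/d(q_j) by central differences."""
    n = len(q)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, dn = list(q), list(q)
        up[j] += h
        dn[j] -= h
        fu, fd = embed(*up), embed(*dn)
        for i in range(n):
            J[i][j] = (fu[i] - fd[i]) / (2 * h)
    return J

def det(M):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

q = (1.5, 0.8, 1.1, 0.4)          # arbitrary generic point (assumption)
numeric = det(jacobian(q))
r, chi, theta, _ = q
analytic = r**3 * math.sin(chi)**2 * math.sin(theta)
```

The overall sign of `numeric` depends on the ordering of rows and columns; only its vanishing at $\sin\chi = 0$ or $\sin\theta = 0$ matters for the argument.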
(b) Find the metric tensor of the three-sphere of radius r in these coordinates. (Use method of Exer. 28).
The off-diagonal terms of the metric tensor are zero if the basis vectors are orthogonal. In spherical coordinates this was obvious but it's not a priori obvious for the three-sphere (at least for me). So I simply calculated all the terms of the metric tensor. There are only 6 independent terms (because of symmetry, $g_{\bar\alpha\bar\beta} = g_{\bar\beta\bar\alpha}$). Let's use an overbar to indicate indices on the bases in $\chi$, $\theta$, $\phi$, with
\begin{align*}
x^{\bar 1} &= \chi, \\
x^{\bar 2} &= \theta, \\
x^{\bar 3} &= \phi. \tag{6.79}
\end{align*}
And indices without overbar to indicate the original coordinates in x, y, z, w. Then in general
\[ g_{\bar\alpha\bar\beta} = \Lambda^{\mu}{}_{\bar\alpha} \Lambda^{\nu}{}_{\bar\beta}\, g_{\mu\nu} \tag{6.80} \]
where
\[ \Lambda^{\mu}{}_{\bar\alpha} = \frac{\partial x^{\mu}}{\partial x^{\bar\alpha}} \tag{6.81} \]
The calculus is tedious but straightforward. For instance,
g
1
1
= g
= g
xx
_
x
_
2
+ g
yy
_
y
_
2
+ g
zz
_
z
_
2
+ g
ww
_
w
_
2
= r
2
sin
2
cos
2
sin
2
+ r
2
sin
2
cos
2
cos
2
+ r
2
sin
2
sin
2
= r
2
sin
2
. (6.82)
The only question for me was what's the metric tensor in the 4D space? Turns out to get the right answer one must assume
\begin{align*}
g_{\mu\nu} &= +1 \quad \text{if } \mu = \nu \\
&= 0 \quad \text{if } \mu \neq \nu \tag{6.83}
\end{align*}
In a similar manner one can easily show the off-diagonal terms are zero.
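34. Establish the following identities for a general metric tensor in a general coordinate system. Eqs. (6.39) and (6.40) are sometimes helpful.
In my humble opinion, these problems provide some nice practise with tensor calculus, but are not essential for understanding the material of Chapter 6.
(a)
\[ \Gamma^{\mu}{}_{\mu\nu} = \tfrac{1}{2} (\ln|g|)_{,\nu} \]
cf. Exer. 5.
\begin{align*}
\Gamma^{\mu}{}_{\mu\nu} &= \tfrac{1}{2}\, g^{\alpha\beta} g_{\alpha\beta,\nu} && \text{cf. Eq. (6.38)} \\
&= \tfrac{1}{2}\, \frac{g_{,\nu}}{g} && \text{using Eq. (6.39)} \\
&= \tfrac{1}{2} (\ln|g|)_{,\nu} && \text{chain rule.} \tag{6.84}
\end{align*}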
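Identity (a) can be spot-checked on the 2-sphere of Exer. 28. This is only an illustrative sketch: it assumes the standard 2-sphere Christoffel symbols $\Gamma^{\theta}{}_{\theta\theta} = 0$ and $\Gamma^{\phi}{}_{\phi\theta} = \cot\theta$ (not derived in this section), so that $\Gamma^{\mu}{}_{\mu\theta} = \cot\theta$, and compares this with a finite-difference derivative of $\tfrac{1}{2}\ln|g|$ where $g = r^{4}\sin^{2}\theta$. The function names are my own.

```python
import math

def log_abs_det_g(r, theta):
    """ln|g| for the 2-sphere metric g = diag(r^2, r^2 sin^2 theta)."""
    return math.log(r**2 * (r * math.sin(theta))**2)

def contracted_christoffel_theta(theta):
    """Gamma^mu_{mu theta} = Gamma^theta_{theta theta} + Gamma^phi_{phi theta}
    = 0 + cot(theta), using the assumed standard 2-sphere symbols."""
    return math.cos(theta) / math.sin(theta)

def half_dlog_det(r, theta, h=1e-6):
    """(1/2) d(ln|g|)/d(theta) by a central difference."""
    return 0.5 * (log_abs_det_g(r, theta + h) - log_abs_det_g(r, theta - h)) / (2 * h)

lhs = contracted_christoffel_theta(0.7)
rhs = half_dlog_det(2.0, 0.7)
```

The two numbers agree to within the finite-difference error, independently of the value chosen for r.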
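(b) Personally I found this by far the hardest of these 5 identities. If you have trouble with this one, try (d) first.
\[ g^{\alpha\beta} \Gamma^{\mu}{}_{\alpha\beta} = -\frac{\big(g^{\mu\nu}\sqrt{-g}\big)_{,\nu}}{\sqrt{-g}} \]
Expand the RHS using the product rule of differential calculus,
\begin{align*}
-\frac{\big(g^{\mu\nu}\sqrt{-g}\big)_{,\nu}}{\sqrt{-g}} &= -g^{\mu\nu}\frac{(\sqrt{-g})_{,\nu}}{\sqrt{-g}} - g^{\mu\nu}{}_{,\nu} \\
&= -g^{\mu\nu}\Gamma^{\alpha}{}_{\alpha\nu} - g^{\mu\nu}{}_{,\nu} && \text{using Eq. (6.40)} \\
&= -g^{\mu\nu}\,\tfrac{1}{2}\, g^{\alpha\beta} g_{\alpha\beta,\nu} - g^{\mu\nu}{}_{,\nu} && \text{using Eq. (6.38)} \tag{6.85}
\end{align*}
Expand the LHS using Eq. (6.32):
\begin{align*}
g^{\alpha\beta} \Gamma^{\mu}{}_{\alpha\beta} &= g^{\alpha\beta} \big[\tfrac{1}{2}\, g^{\mu\nu} (g_{\nu\alpha,\beta} + g_{\nu\beta,\alpha} - g_{\alpha\beta,\nu})\big] \\
&= -g^{\alpha\beta}\,\tfrac{1}{2}\, g^{\mu\nu} g_{\alpha\beta,\nu} + g^{\alpha\beta}\,\tfrac{1}{2}\, g^{\mu\nu} (g_{\nu\alpha,\beta} + g_{\nu\beta,\alpha}) && \text{rearranging.} \tag{6.86}
\end{align*}
Subtracting these two results we find it remains to prove only that
\[ -g^{\mu\nu}{}_{,\nu} = g^{\alpha\beta}\,\tfrac{1}{2}\, g^{\mu\nu} (g_{\nu\alpha,\beta} + g_{\nu\beta,\alpha}) \tag{6.87} \]
This follows easily using (result from (d)):
\[ g^{\alpha\beta}\,\tfrac{1}{2}\, g^{\mu\nu} g_{\nu\alpha,\beta} = -g^{\alpha\beta}\,\tfrac{1}{2}\, g^{\mu\nu}{}_{,\beta}\, g_{\nu\alpha} = -\delta^{\beta}{}_{\nu}\,\tfrac{1}{2}\, g^{\mu\nu}{}_{,\beta} = -\tfrac{1}{2}\, g^{\mu\nu}{}_{,\nu}. \tag{6.88} \]
Similarly for the other term, giving the result we required.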
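(c) Suppose $F^{\mu\nu}$ is antisymmetric.
\[ F^{\mu\nu}{}_{;\nu} = \frac{\big(F^{\mu\nu}\sqrt{-g}\big)_{,\nu}}{\sqrt{-g}} \]
Expand the RHS using the product rule of differential calculus,
\begin{align*}
\frac{\big(F^{\mu\nu}\sqrt{-g}\big)_{,\nu}}{\sqrt{-g}} &= F^{\mu\nu}\frac{(\sqrt{-g})_{,\nu}}{\sqrt{-g}} + F^{\mu\nu}{}_{,\nu} \\
&= F^{\mu\nu}\Gamma^{\alpha}{}_{\alpha\nu} + F^{\mu\nu}{}_{,\nu} && \text{using Eq. (6.40)} \tag{6.89}
\end{align*}
Expand the LHS using Eq. (6.35):
\begin{align*}
F^{\mu\nu}{}_{;\nu} &= F^{\mu\nu}{}_{,\nu} + F^{\mu\alpha}\Gamma^{\nu}{}_{\alpha\nu} + F^{\alpha\nu}\Gamma^{\mu}{}_{\alpha\nu} \\
&= F^{\mu\nu}{}_{,\nu} + F^{\mu\nu}\Gamma^{\alpha}{}_{\alpha\nu} && \text{relabelling dummy indices.} \tag{6.90}
\end{align*}
The 3rd term vanished because $F^{\alpha\nu}$ is antisymmetric (given) and $\Gamma^{\mu}{}_{\alpha\nu}$ is symmetric in $\alpha$ and $\nu$, and they are contracted on these indices (vanishes by the results of Exer. 26 of §3.10).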
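(d) Prove that
\[ g^{\alpha\beta} g_{\beta\mu,\nu} = -g^{\alpha\beta}{}_{,\nu}\, g_{\beta\mu} \tag{6.91} \]
in all bases.
Solution:
\[ g^{\alpha\beta} g_{\beta\mu} = \delta^{\alpha}{}_{\mu} \]
in all bases because the inverse metric tensor is, by definition, the inverse of the metric tensor. Simply differentiate this formula:
\begin{align*}
(g^{\alpha\beta} g_{\beta\mu})_{,\nu} &= \delta^{\alpha}{}_{\mu,\nu} \\
g^{\alpha\beta} g_{\beta\mu,\nu} + g^{\alpha\beta}{}_{,\nu}\, g_{\beta\mu} &= 0 \\
g^{\alpha\beta} g_{\beta\mu,\nu} &= -g^{\alpha\beta}{}_{,\nu}\, g_{\beta\mu}. \tag{6.92}
\end{align*}
(e) Prove
\[ g^{\mu\nu}{}_{,\alpha} = -\Gamma^{\mu}{}_{\beta\alpha}\, g^{\beta\nu} - \Gamma^{\nu}{}_{\beta\alpha}\, g^{\mu\beta} \]
This is a one-liner that follows immediately from Eq. (6.31) and (6.35).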
35. Given the line element
\[ ds^{2} = -\exp(2\Phi(r))\, dt^{2} + \exp(2\Lambda(r))\, dr^{2} + r^{2}\, d\theta^{2} + r^{2}\sin^{2}\theta\, d\phi^{2} \]
find the Riemann curvature tensor.
I found this problem instructive on several levels. It should be done after
(or concurrently with) Exer. 18, for it helps clarify the symmetry relations
and the implied reduction in degrees of freedom of the Riemann curvature
tensor. It also helps reveal how much information is packed in the line element
equation!
Later we'll learn that the given form of the metric leads to the Schwarzschild metric which represents the simplest solution of the Einstein equations. It is therefore extremely useful to know.
(i) The coordinates are $t, r, \theta, \phi$. This is clear because these form the differential variables of the line element. The metric tensor is
\[
(g_{\alpha\beta}) = \begin{pmatrix}
-\exp(2\Phi) & 0 & 0 & 0 \\
0 & \exp(2\Lambda) & 0 & 0 \\
0 & 0 & r^{2} & 0 \\
0 & 0 & 0 & r^{2}\sin^{2}\theta
\end{pmatrix}
\]
A fair question is why is the metric tensor diagonal? The answer is that there are no cross-terms in the line element,
\[ dl = |g_{\alpha\beta}\, dx^{\alpha} dx^{\beta}|^{1/2} \]
which was given just before Eq. (6.6).
(ii) The inverse of the metric tensor:
\[
(g^{\alpha\beta}) = \begin{pmatrix}
-\exp(-2\Phi) & 0 & 0 & 0 \\
0 & \exp(-2\Lambda) & 0 & 0 \\
0 & 0 & r^{-2} & 0 \\
0 & 0 & 0 & r^{-2}\sin^{-2}\theta
\end{pmatrix}
\]
See Exer. 3 for computing the inverse of a diagonal matrix.
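The Christoffel symbols can be computed from the metric tensor using Eq. (6.32). One needs the first derivatives of the metric tensor (a prime denotes $d/dr$):
\begin{align*}
g_{tt,r} &= \frac{\partial}{\partial r}[-\exp(2\Phi)] = -2\exp(2\Phi)\,\Phi', \\
g_{rr,r} &= \frac{\partial}{\partial r}[\exp(2\Lambda)] = 2\exp(2\Lambda)\,\Lambda', \\
g_{\theta\theta,r} &= \frac{\partial}{\partial r}[r^{2}] = 2r, \\
g_{\phi\phi,r} &= \frac{\partial}{\partial r}[r^{2}\sin^{2}(\theta)] = 2r\sin^{2}(\theta), \\
g_{\phi\phi,\theta} &= \frac{\partial}{\partial \theta}[r^{2}\sin^{2}(\theta)] = 2r^{2}\sin(\theta)\cos(\theta) = r^{2}\sin(2\theta). \tag{6.93}
\end{align*}
All other first derivatives of the metric tensor are zero.
Here are the nonzero Christoffel symbols:
\begin{align*}
\Gamma^{0}{}_{01} &= \Phi' \\
\Gamma^{1}{}_{00} &= \Phi' \exp(2\Phi)\exp(-2\Lambda) \\
\Gamma^{1}{}_{11} &= \Lambda' \\
\Gamma^{1}{}_{22} &= -\exp(-2\Lambda)\, r \\
\Gamma^{1}{}_{33} &= -\exp(-2\Lambda)\, r\sin^{2}(\theta) \\
\Gamma^{2}{}_{12} &= \frac{1}{r} \\
\Gamma^{2}{}_{33} &= -\sin(\theta)\cos(\theta) \\
\Gamma^{3}{}_{13} &= \frac{1}{r} \\
\Gamma^{3}{}_{23} &= \frac{\cos(\theta)}{\sin(\theta)} \tag{6.94}
\end{align*}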
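The list of Christoffel symbols can be cross-checked numerically. What follows is a sketch under stated assumptions, not the method used in the text: the functions `Phi` and `Lam` are arbitrary smooth choices made purely for the test (they are not the Schwarzschild solution), the metric derivatives are taken by central differences, and Eq. (6.32) is specialised to a diagonal metric.

```python
import math

def Phi(r):
    return 0.1 * r**2      # hypothetical choice, only for this check

def Lam(r):
    return 0.05 * r        # hypothetical choice, only for this check

def metric(x):
    """Diagonal of g_{alpha beta} at x = (t, r, theta, phi)."""
    t, r, th, ph = x
    return [-math.exp(2 * Phi(r)), math.exp(2 * Lam(r)),
            r**2, (r * math.sin(th))**2]

def dmetric(x, h=1e-5):
    """dg[a][c] = d g_{aa} / d x^c by central differences."""
    dg = [[0.0] * 4 for _ in range(4)]
    for c in range(4):
        up, dn = list(x), list(x)
        up[c] += h
        dn[c] -= h
        gu, gd = metric(up), metric(dn)
        for a in range(4):
            dg[a][c] = (gu[a] - gd[a]) / (2 * h)
    return dg

def christoffel(x):
    """G[a][b][c] = Gamma^a_{bc} from Eq. (6.32) for a diagonal metric."""
    g, dg = metric(x), dmetric(x)
    G = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for a in range(4):
        for b in range(4):
            for c in range(4):
                # g_{ab,c} + g_{ac,b} - g_{bc,a}; only diagonal g is nonzero
                s = 0.0
                if a == b:
                    s += dg[a][c]
                if a == c:
                    s += dg[a][b]
                if b == c:
                    s -= dg[b][a]
                G[a][b][c] = 0.5 * s / g[a]
    return G

x0 = (0.0, 2.0, 0.9, 0.3)   # arbitrary event (assumption)
G = christoffel(x0)
```

For example, `G[1][2][2]` should reproduce $-r\exp(-2\Lambda)$ and `G[3][2][3]` should reproduce $\cos\theta/\sin\theta$ at the chosen event, with indices 0 to 3 standing for $t, r, \theta, \phi$.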
(iii) Deciding the 20 terms of $R_{\alpha\beta\mu\nu}$ to calculate. This is not as simple as it might sound. Here it helps tremendously if one has solved Exer. 18. Recall $R_{\alpha\alpha\mu\nu} = 0$, as does $R_{\alpha\beta\mu\mu}$, because of Eq. (6.69) (see Exer. 18).
We organize the terms as recommended in the hint for Exer. 18: we choose pairs of $\alpha \neq \beta$ (there are 6 of them accounting for the fact that order doesn't matter), and similarly there are 6 pairs of $\mu \neq \nu$. These would give $6 \times 6 = 36$ elements, but, because of the symmetry $R_{\alpha\beta\mu\nu} = R_{\mu\nu\alpha\beta}$, we must divide the number of off-diagonal elements by two to get $5 \times 6/2 + 6 = 21$. (We'll deal with the reduction to 20 by Eq. (6.70) in a minute.)
I found it too difficult to attempt to write down these terms immediately based on the above prescription. Instead it was much easier to write down all $6 \times 6 = 36$ terms and eliminate the lower diagonal:
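\[
\begin{array}{cccccc}
R_{trtr} & R_{trt\theta} & R_{trt\phi} & R_{trr\theta} & R_{trr\phi} & R_{tr\theta\phi} \\
R_{t\theta tr} & R_{t\theta t\theta} & R_{t\theta t\phi} & R_{t\theta r\theta} & R_{t\theta r\phi} & R_{t\theta\theta\phi} \\
R_{t\phi tr} & R_{t\phi t\theta} & R_{t\phi t\phi} & R_{t\phi r\theta} & R_{t\phi r\phi} & R_{t\phi\theta\phi} \\
R_{r\theta tr} & R_{r\theta t\theta} & R_{r\theta t\phi} & R_{r\theta r\theta} & R_{r\theta r\phi} & R_{r\theta\theta\phi} \\
R_{r\phi tr} & R_{r\phi t\theta} & R_{r\phi t\phi} & R_{r\phi r\theta} & R_{r\phi r\phi} & R_{r\phi\theta\phi} \\
R_{\theta\phi tr} & R_{\theta\phi t\theta} & R_{\theta\phi t\phi} & R_{\theta\phi r\theta} & R_{\theta\phi r\phi} & R_{\theta\phi\theta\phi}
\end{array}
\tag{6.95}
\]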
Note that we're not writing them down randomly. Instead, we step the 2nd pair of indices, i.e. $\mu\nu$, systematically by increasing $\nu$ the most rapidly with increasing column, $\mu$ more slowly with increasing column. Similarly we increase the first pair of indices, i.e. $\alpha\beta$, with row, and $\beta$ more rapidly than $\alpha$. These were arbitrary choices of course, but having a system and sticking to it makes it easy.
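Recall only the upper diagonal is necessary to determine the tensor because of symmetry $R_{\alpha\beta\mu\nu} = R_{\mu\nu\alpha\beta}$:
\[
\begin{array}{cccccc}
R_{trtr} & R_{trt\theta} & R_{trt\phi} & R_{trr\theta} & R_{trr\phi} & R_{tr\theta\phi} \\
 & R_{t\theta t\theta} & R_{t\theta t\phi} & R_{t\theta r\theta} & R_{t\theta r\phi} & R_{t\theta\theta\phi} \\
 & & R_{t\phi t\phi} & R_{t\phi r\theta} & R_{t\phi r\phi} & R_{t\phi\theta\phi} \\
 & & & R_{r\theta r\theta} & R_{r\theta r\phi} & R_{r\theta\theta\phi} \\
 & & & & R_{r\phi r\phi} & R_{r\phi\theta\phi} \\
 & & & & & R_{\theta\phi\theta\phi}
\end{array}
\tag{6.96}
\]
Now we must also impose the condition in Eq. (6.70), which we explained in Exer. 18 only applies to the case when none of the indices are equal. There are three such terms: $R_{tr\theta\phi}$, $R_{t\theta r\phi}$ and $R_{t\phi r\theta}$. One of these can be determined from the other two.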
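Let's evaluate a few of these in full detail. It's important to use Eq. (6.63), which is true in all coordinate bases, and not Eq. (6.68), which is only true in a local inertial frame. From Eq. (6.63) and the Christoffel symbols above (6.94) we find for
\begin{align*}
R_{trtr} &= g_{tt} R^{t}{}_{rtr} \\
&= -\exp(2\Phi) \big[ -\Gamma^{0}{}_{01,r} + \Gamma^{0}{}_{10}\Gamma^{1}{}_{11} - \Gamma^{0}{}_{01}\Gamma^{0}{}_{01} \big] \\
&= \exp(2\Phi) \big[ (\Phi')^{2} + \Phi'' - \Phi'\Lambda' \big] \tag{6.97}
\end{align*}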
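It's important to note that if one were to use Eq. (6.68):
\[ R_{\alpha\beta\mu\nu} = \tfrac{1}{2} (g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}) \]
one would miss the cross term $-\Phi'\Lambda'$:
\begin{align*}
R_{trtr} &= \tfrac{1}{2} (g_{tr,rt} - g_{tt,rr} + g_{rt,tr} - g_{rr,tt}) \\
&= \tfrac{1}{2} \big[ 0 - \big( -4\exp(2\Phi)(\Phi')^{2} - 2\exp(2\Phi)\Phi'' \big) + 0 - 0 \big] \\
&= 2\exp(2\Phi)(\Phi')^{2} + \exp(2\Phi)\Phi'' \tag{6.98}
\end{align*}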
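I'm finding my answers disagree with those provided by Schutz, so it's important to work these out in more detail. For the next one, $R_{t\theta t\theta}$, I found,
\begin{align*}
R_{t\theta t\theta} &= g_{tt} \big[ \Gamma^{0}{}_{22,t} - \Gamma^{0}{}_{20,\theta} + \Gamma^{0}{}_{\sigma 0}\Gamma^{\sigma}{}_{22} - \Gamma^{0}{}_{\sigma 2}\Gamma^{\sigma}{}_{20} \big] \\
&= g_{tt} \big[ \Gamma^{0}{}_{\sigma 0}\Gamma^{\sigma}{}_{22} \big] \\
&= g_{tt} \big[ \Gamma^{0}{}_{10}\Gamma^{1}{}_{22} \big] \\
&= -\exp(2\Phi) \big[ \Phi' (-r\exp(-2\Lambda)) \big] \\
&= +r\,\Phi' \exp(2\Phi - 2\Lambda) \tag{6.99}
\end{align*}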
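After a lot of algebra one finds that only the diagonal elements of (6.96) are nonzero:
\[
(R_{\alpha\beta\mu\nu}) = \begin{pmatrix}
R_{trtr} & 0 & 0 & 0 & 0 & 0 \\
0 & R_{t\theta t\theta} & 0 & 0 & 0 & 0 \\
0 & 0 & R_{t\phi t\phi} & 0 & 0 & 0 \\
0 & 0 & 0 & R_{r\theta r\theta} & 0 & 0 \\
0 & 0 & 0 & 0 & R_{r\phi r\phi} & 0 \\
0 & 0 & 0 & 0 & 0 & R_{\theta\phi\theta\phi}
\end{pmatrix}
\tag{6.100}
\]
and the other $256 - 36 = 220$ terms are determined by symmetry relations in Eq. (6.69). These 6 nonzero terms are
\begin{align*}
R_{trtr} &= \exp(2\Phi) \big[ (\Phi')^{2} + \Phi'' - \Phi'\Lambda' \big] \\
R_{t\theta t\theta} &= r\,\Phi' \exp(2\Phi)\exp(-2\Lambda) \\
R_{t\phi t\phi} &= \sin^{2}(\theta)\, r\,\Phi' \exp(2\Phi)\exp(-2\Lambda) \\
R_{r\theta r\theta} &= r\,\Lambda' \\
R_{r\phi r\phi} &= r\,\Lambda' \sin^{2}(\theta) \\
R_{\theta\phi\theta\phi} &= -r^{2} \big[ \cos^{2}(\theta) - 1 + \exp(-2\Lambda) - \cos^{2}(\theta)\exp(-2\Lambda) \big] \\
&= r^{2}\sin^{2}(\theta)\,(1 - \exp(-2\Lambda)) \tag{6.101}
\end{align*}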
36. A four-dimensional manifold has coordinates (t, x, y, z) and a line element:
\[ ds^{2} = -(1 + 2\phi)\, dt^{2} + (1 - 2\phi)(dx^{2} + dy^{2} + dz^{2}), \]
with $|\phi(t, x, y, z)| \ll 1$ everywhere. At any point $\mathcal{P}$ with coordinates $(t_{0}, x_{0}, y_{0}, z_{0})$, find a coordinate transformation to a locally inertial coordinate system, to first order in $\phi$. At what rate does such a frame accelerate with respect to the original coordinates, again to first order in $\phi$?
I've noticed in the other chapters that the questions near the end of the Exercises section generally anticipate results we'll need later. This is probably of that genre since I don't immediately see the point of this (i.e. it's probably important). (Indeed this metric will reappear in Chapter 7, Eq. (7.8).) First we need the metric, which we can find as we did in Exer. 35. Here the metric is
\[
(g_{\alpha\beta}) = \begin{pmatrix}
-(1 + 2\phi) & 0 & 0 & 0 \\
0 & (1 - 2\phi) & 0 & 0 \\
0 & 0 & (1 - 2\phi) & 0 \\
0 & 0 & 0 & (1 - 2\phi)
\end{pmatrix}
\]
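So we seek a transformation $\Lambda^{\alpha}{}_{\alpha'}$ such that,
\[ \Lambda^{\alpha}{}_{\alpha'} \Lambda^{\beta}{}_{\beta'}\, g_{\alpha\beta}(\mathcal{P}) = \eta_{\alpha'\beta'} \]
By inspection (which was aided by solving Exer. 3), we see that:
\[
(\Lambda^{\alpha}{}_{\alpha'}) = \begin{pmatrix}
(1 + 2\phi_{P})^{-1/2} & 0 & 0 & 0 \\
0 & (1 - 2\phi_{P})^{-1/2} & 0 & 0 \\
0 & 0 & (1 - 2\phi_{P})^{-1/2} & 0 \\
0 & 0 & 0 & (1 - 2\phi_{P})^{-1/2}
\end{pmatrix}
\]
where $\phi_{P} = \phi(\mathcal{P}) = \phi(t_{0}, x_{0}, y_{0}, z_{0})$. The question specifically asks for the coordinate transformation to a locally inertial coordinate system, to first order in $\phi$. I guess that means Schutz wants us to approximate this transformation, perhaps using the binomial theorem:
\[ (1 + 2\phi_{P})^{-1/2} \approx (1 - \phi_{P}) + O(\phi_{P}^{2}). \]
I'm not sure because I don't see the point of this. In any case one would get:
\[
(\Lambda^{\alpha}{}_{\alpha'}) \approx \begin{pmatrix}
(1 - \phi_{P}) & 0 & 0 & 0 \\
0 & (1 + \phi_{P}) & 0 & 0 \\
0 & 0 & (1 + \phi_{P}) & 0 \\
0 & 0 & 0 & (1 + \phi_{P})
\end{pmatrix}
\]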
I confess I don't know what is meant by the rate such a frame accelerates with respect to the original coordinates. But it is clear that this transformation works exactly only at the event $\mathcal{P}$, and for the binomial approximation it only applies approximately there as well. And one can write down how the metric would depart from the local inertial metric, $\eta$, by simply applying the approximation:
\begin{align*}
(g_{\alpha'\beta'}) &\approx \mathrm{diag}\big( -(1 + 2\phi)(1 - \phi_{P})^{2},\ (1 - 2\phi)(1 + \phi_{P})^{2},\ (1 - 2\phi)(1 + \phi_{P})^{2},\ (1 - 2\phi)(1 + \phi_{P})^{2} \big) \\
&\approx \mathrm{diag}\big( -(1 + 2\phi)(1 - 2\phi_{P}),\ (1 - 2\phi)(1 + 2\phi_{P}),\ (1 - 2\phi)(1 + 2\phi_{P}),\ (1 - 2\phi)(1 + 2\phi_{P}) \big) \\
&\approx \mathrm{diag}\big( -(1 + 2(\phi - \phi_{P})),\ (1 - 2(\phi - \phi_{P})),\ (1 - 2(\phi - \phi_{P})),\ (1 - 2(\phi - \phi_{P})) \big). \tag{6.102}
\end{align*}
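37. (a) Proper volume of a 2D manifold is usually called proper area. Using the metric in Exer. 28, integrate Eq. (6.18) to find the proper area of a sphere of radius r.
\begin{align*}
\iint dx^{1}\, dx^{2} &= \int_{0}^{\pi}\!\!\int_{0}^{2\pi} g^{1/2}\, d\phi\, d\theta, && \text{changed sign of } \det|g|, \text{ cf. Eq. (6.19)} \\
&= \int_{0}^{\pi}\!\!\int_{0}^{2\pi} (r^{2}\, r^{2}\sin^{2}\theta)^{1/2}\, d\phi\, d\theta \\
&= \int_{0}^{\pi}\!\!\int_{0}^{2\pi} (r^{2}\sin\theta)\, d\phi\, d\theta \\
&= r^{2}\, 2\pi \int_{0}^{\pi} \sin\theta\, d\theta \\
&= r^{2}\, 2\pi\, [-\cos\theta]_{0}^{\pi} \\
&= 4\pi r^{2}. \tag{6.103}
\end{align*}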
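The same integral can be approximated numerically as a sanity check. This is a minimal sketch, assuming a simple midpoint rule in $\theta$ (the $\phi$ integral just contributes a factor $2\pi$ since the integrand $r^{2}\sin\theta$ does not depend on $\phi$); the function name and step count are my own.

```python
import math

def sphere_area(r, n_theta=2000):
    """Midpoint-rule estimate of the integral of sqrt(det g) = r^2 sin(theta)
    over theta in (0, pi) and phi in (0, 2 pi)."""
    d_theta = math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        total += r**2 * math.sin(theta) * d_theta
    return 2 * math.pi * total   # integrand is independent of phi

area = sphere_area(3.0)
exact = 4 * math.pi * 3.0**2
```

The estimate `area` should approach `exact` $= 4\pi r^{2}$ as `n_theta` grows.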
37. (b) Analogous problem for Exer. 33, the three-sphere.
I don't see that we learn anything new here.
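39. Defines the Lie bracket in Eq. (6.100):
\[ [\vec{U}, \vec{V}]^{\alpha} = U^{\beta}\nabla_{\beta} V^{\alpha} - V^{\beta}\nabla_{\beta} U^{\alpha} \]
(a) That,
\[ [\vec{U}, \vec{V}]^{\alpha} = -[\vec{V}, \vec{U}]^{\alpha} \]
follows immediately from the definition in Eq. (6.100).
Show that,
\[ [\vec{U}, \vec{V}]^{\alpha} = U^{\beta} V^{\alpha}{}_{,\beta} - V^{\beta} U^{\alpha}{}_{,\beta} \]
We start with the definition of Eq. (6.100):
\begin{align*}
[\vec{U}, \vec{V}]^{\alpha} &= U^{\beta}\nabla_{\beta} V^{\alpha} - V^{\beta}\nabla_{\beta} U^{\alpha} \\
&= U^{\beta} V^{\alpha}{}_{;\beta} - V^{\beta} U^{\alpha}{}_{;\beta} && \text{notation change only} \\
&= U^{\beta} (V^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta} V^{\mu}) - V^{\beta} (U^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta} U^{\mu}) && \text{used Eq. (6.33)} \\
&= U^{\beta} V^{\alpha}{}_{,\beta} - V^{\beta} U^{\alpha}{}_{,\beta} + U^{\beta}\Gamma^{\alpha}{}_{\mu\beta} V^{\mu} - V^{\beta}\Gamma^{\alpha}{}_{\mu\beta} U^{\mu} && \text{rearranging only} \tag{6.104}
\end{align*}
where the first two terms match what we are required to prove. So we only need to show that the last two terms vanish.
\begin{align*}
U^{\beta}\Gamma^{\alpha}{}_{\mu\beta} V^{\mu} - V^{\beta}\Gamma^{\alpha}{}_{\mu\beta} U^{\mu} &= U^{\mu}\Gamma^{\alpha}{}_{\beta\mu} V^{\beta} - V^{\beta}\Gamma^{\alpha}{}_{\mu\beta} U^{\mu} && \text{relabelled dummy indices on first term} \\
&= 0, \tag{6.105}
\end{align*}
because in any coordinate system $\Gamma^{\alpha}{}_{\beta\mu} = \Gamma^{\alpha}{}_{\mu\beta}$, see Exer. 5.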
SP.1 What are the numerical values of the elements of the Riemann curvature tensor, $R_{\alpha\alpha\mu\nu}$ and $R_{\alpha\beta\mu\mu}$, with summation not implied? (Hint: Think about the implications of the symmetry relations contained in Eq. (6.69).)
I recommend doing this problem before attempting Exerc. 18(b) in §6.9.
Solution: $R_{\alpha\alpha\mu\nu} = 0$ and $R_{\alpha\beta\mu\mu} = 0$ because we must have
\[ R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu} = -R_{\alpha\beta\nu\mu} \]
For the case where $\alpha = \beta$ (or $\mu = \nu$) we must have that this element has a numerical value equal to its own negative. The only number equal to its own negative is zero.
SP.2 Generalize Eq. (6.73) to the case of a $\binom{1}{1}$ tensor, $F^{\mu}{}_{\nu}$.
Solution: This is a straightforward generalization of the argument leading to Eq. (6.73). See also the solution to Schutz's Exerc. 20.
The covariant derivative of a
_
1
1
_
tensor can be inferred from Eqs. (6.34)
and (6.35).
= F
;
= F
,
+
(6.106)
Now we simply apply the gradient operator another time, initially for com-
pactness without expanding F
) = (F
;
)
;
= (F
;
)
,
+
;
= (F
;
)
,
(6.107)
where the last step used the fact that Christoel symbols are zero in local
inertial coordinates. But their gradients are not, so
) = (F
;
)
,
= F
,
+ (
)
,
using (6.106) above
= F
,
+
,
F
,
F
(6.108)
which generalizes Eq. (6.73).
SP.3 Derive Eq. (6.78) in a manner analogous to the derivation of Eq. (6.77). Use the results of SP.2 above. I posed this problem because I couldn't find the solution to Exerc. 21, wherein one is to explain the positive signs in Eq. (6.78).
Solution: Using results (6.108) from SP2 above, we start by changing the
order of the derivatives by changing the order of the indices and . Taking
the dierence between the two derivatives we obtain (terms in red cancel)
[
]F
)
= F
,
+
,
F
,
F
(F
,
+
,
F
,
F
)
= (
,
)F
,
)F
collecting common factors

(6.109)
Use Eq. (6.63) but in reference frame where the Christoel symbols all vanish,
[
]F
= R
(6.110)
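SP.4 Do the symmetry relations in Eq. (6.69) apply also when an index is raised? Prove that $R^{\alpha}{}_{\beta\mu\nu} = -R_{\beta}{}^{\alpha}{}_{\mu\nu}$. This result will be useful for Exerc. 25(a).
Yes.
\begin{align*}
R^{\alpha}{}_{\beta\mu\nu} &= g^{\alpha\sigma} R_{\sigma\beta\mu\nu} \\
&= -g^{\alpha\sigma} R_{\beta\sigma\mu\nu}, && \text{by Eq. (6.69)} \\
&= -R_{\beta}{}^{\alpha}{}_{\mu\nu}. \tag{6.111}
\end{align*}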
SP.5 What's the Ricci tensor and Ricci scalar of the metric of Exer. 35
(which turns out to be of the form of the Schwarzschild metric)?
SP.6 Why is it generally important to use Eq. (6.63) for the computation
of the Riemann tensor components and not the one in terms of the metric,
given by Eq. (6.68)?
Chapter 7
Physics in curved spacetime
7.2 Physics in slightly curved spacetimes
Eq. (7.14) should be
\[ \Gamma^{0}{}_{00} \approx (1 - 2\phi)\,\phi_{,t} = \phi_{,t} + O(\phi^{2}). \]
p. 177, $[g^{\alpha\beta}]$ is apparently the matrix associated with the inverse metric tensor. In earlier chapters the notation $(g^{\alpha\beta})$ was used, an annoying inconsistency.
7.6 Exercises
It might help to tackle my supplementary problems first, see §7.7 below.
1. (i) If Eq. (7.3) were the correct generalization of Eq. (7.1) to a curved spacetime, how would you interpret it? (ii) What would happen to the number of particles in a comoving volume of fluid, as time evolves? (iii) In principle, can we distinguish experimentally between Eqs. (7.2) and (7.3)?
(i) Recall the hypothetical Eq. (7.3) was
\[ (nU^{\alpha})_{;\alpha} = q R \]
where q was some constant and R the Ricci scalar. Based on kinematics, this equation states that, for $q > 0$, there is a source of particles for positively curved space, $R > 0$. This was shown in Exer. 20(a) of §4.10, which showed that the divergence of the 4-vector $\vec{N} = n\vec{U}$ was the rate of generation of particles per unit volume. In Chapter 4 we were working in flat spacetime, but on a curved manifold the divergence would include terms from the rate of change of the basis vectors:
N
,
=
= (nU
)
;
using denitions Eq. (4.1) and Eq. (4.4)
(7.1)
If q R were negative there would be a sink of particles. The fact that, by Eq. (7.1)
\[ (nU^{\alpha})_{,\alpha} = 0 \]
while
\[ (nU^{\alpha})_{;\alpha} = q R \neq 0 \]
would mean, at least mathematically, that the nonzero source results from the derivatives of the basis vectors:
\begin{align*}
(nU^{\alpha})_{;\alpha} = N^{\alpha}{}_{;\alpha} &= N^{\alpha}{}_{,\alpha} + \Gamma^{\alpha}{}_{\mu\alpha} N^{\mu} && \text{Eq. (6.36)} \\
&= \Gamma^{\alpha}{}_{\mu\alpha} N^{\mu} && \text{using Eq. (7.1)} \\
&= N^{\mu} (\ln(\sqrt{-g}))_{,\mu} && \text{using Eq. (6.41)} \\
&= q R && \text{by the hypothetical Eq. (7.3).} \tag{7.2}
\end{align*}
I suppose physically we could interpret this as a source of particles, per unit volume per unit time, (q R), resulting from the flux of particles $N^{\mu}$ up the gradient of the natural logarithm of the magnitude of the determinant of the metric, whatever that would mean! I'm not sure how to understand or interpret this more completely. Of course this would be admittedly strange because it would mean that there would be a source of particles in some frames but not in the locally inertial frame (where by Eq. (6.5) we had vanishing first derivatives of all components of the metric).
(ii) What would happen to the number of particles in a co-moving volume of fluid, as time evolves?
Let's recall the solution to Exer. 20 of §4.10, where we derived (4.44):
\[ \frac{\partial (nU^{0})}{\partial t} = -\frac{\partial (nU^{x})}{\partial x} - \frac{\partial (nU^{y})}{\partial y} - \frac{\partial (nU^{z})}{\partial z} + \varepsilon. \]
In the co-moving frame, $U^{0}$ is always unity, such that:
\[ \frac{\partial n}{\partial t} = -\frac{\partial (nU^{x})}{\partial x} - \frac{\partial (nU^{y})}{\partial y} - \frac{\partial (nU^{z})}{\partial z} + \varepsilon. \tag{7.3} \]
How the number of particles per unit volume n evolves in time depends upon the spatial convergence of the number flux $N^{i}{}_{;i}$ and the source term, $\varepsilon = qR$.
(iii) Could we ever distinguish experimentally between Eqs. (7.2) and (7.3)?
Recall Eq. (7.2) had
\[ (nU^{\alpha})_{;\alpha} = 0. \]
So yes, I believe one could, in principle, measure the terms on the RHS of (7.3) and thereby distinguish between Eqs. (7.2) and (7.3).
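2. To first order in $\phi$, compute $g^{\alpha\beta}$ for [the line element given by] Eq. (7.8).
See Exers. 35 and 36 from §6.9 for how to calculate the metric from the line element. Here the metric is
\[
(g_{\alpha\beta}) = \begin{pmatrix}
-(1 + 2\phi) & 0 & 0 & 0 \\
0 & (1 - 2\phi) & 0 & 0 \\
0 & 0 & (1 - 2\phi) & 0 \\
0 & 0 & 0 & (1 - 2\phi)
\end{pmatrix}
\]
and in fact, it's exactly the same as in Exer. 36 of §6.9. Now we are asked for the inverse, which because it's diagonal follows immediately as the inverse of the terms on the diagonal of the metric:
\begin{align*}
(g^{\alpha\beta}) &= \begin{pmatrix}
-(1 + 2\phi)^{-1} & 0 & 0 & 0 \\
0 & (1 - 2\phi)^{-1} & 0 & 0 \\
0 & 0 & (1 - 2\phi)^{-1} & 0 \\
0 & 0 & 0 & (1 - 2\phi)^{-1}
\end{pmatrix} \\
&\approx \begin{pmatrix}
-(1 - 2\phi) & 0 & 0 & 0 \\
0 & (1 + 2\phi) & 0 & 0 \\
0 & 0 & (1 + 2\phi) & 0 \\
0 & 0 & 0 & (1 + 2\phi)
\end{pmatrix} \tag{7.4}
\end{align*}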
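To see that the last step really is accurate to first order, one can compare the exact and approximate inverses for small $\phi$. This is just an illustrative sketch with hypothetical function names; it treats $\phi$ as a number at a single event and checks that the error scales like $\phi^{2}$.

```python
def exact_inverse(phi):
    """Diagonal of the exact inverse of
    g = diag(-(1 + 2 phi), 1 - 2 phi, 1 - 2 phi, 1 - 2 phi)."""
    return [-1 / (1 + 2 * phi)] + [1 / (1 - 2 * phi)] * 3

def first_order_inverse(phi):
    """Diagonal of the first-order approximation in Eq. (7.4)."""
    return [-(1 - 2 * phi)] + [1 + 2 * phi] * 3

def max_error(phi):
    """Largest absolute difference between exact and approximate entries."""
    return max(abs(e - a)
               for e, a in zip(exact_inverse(phi), first_order_inverse(phi)))

err_small = max_error(1e-3)
err_large = max_error(1e-2)
ratio = err_large / err_small   # close to 100 if the error is O(phi^2)
```

Increasing $\phi$ by a factor of 10 increases the error by roughly a factor of 100, as expected for a neglected term of order $\phi^{2}$.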
3. Calculate all the Christoffel symbols for the metric given by Eq. (7.8), to first order in $\phi(t, x, y, z)$.
This question requires tons of algebra yet I found I didn't learn anything. On the other hand, given the importance of this metric, perhaps a complete set of Christoffel symbols will come in handy later. First let's count the number of independent Christoffel symbols, $\Gamma^{\gamma}{}_{\beta\mu}$, to calculate. For each $\gamma$ there are only ten independent terms because $\Gamma^{\gamma}{}_{\beta\mu} = \Gamma^{\gamma}{}_{\mu\beta}$ in any coordinate basis. Given the metric and inverse metric, see Exer. 2, we can calculate the Christoffel symbols using Eq. (6.32), $\Gamma^{\gamma}{}_{\beta\mu} = \tfrac{1}{2} g^{\alpha\gamma}(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha})$. The calculation simplifies tremendously because $(g^{\alpha\beta})$ is diagonal. Thus we need only consider the $\alpha = \gamma$ contribution in Eq. (6.32).
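\begin{align*}
\Gamma^{0}{}_{00} = \Gamma^{t}{}_{tt} &= \tfrac{1}{2} g^{tt} g_{tt,t} = (1 - 2\phi)\,\phi_{,t} \tag{7.5} \\
\Gamma^{0}{}_{01} = \Gamma^{t}{}_{tx} &= \tfrac{1}{2} g^{tt} g_{tt,x} = (1 - 2\phi)\,\phi_{,x} \tag{7.6} \\
\Gamma^{0}{}_{02} = \Gamma^{t}{}_{ty} &= \tfrac{1}{2} g^{tt} g_{tt,y} = (1 - 2\phi)\,\phi_{,y} \tag{7.7} \\
\Gamma^{0}{}_{03} = \Gamma^{t}{}_{tz} &= \tfrac{1}{2} g^{tt} g_{tt,z} = (1 - 2\phi)\,\phi_{,z} \tag{7.8} \\
\Gamma^{0}{}_{10} = \Gamma^{0}{}_{01} = \Gamma^{t}{}_{tx} &= (1 - 2\phi)\,\phi_{,x} \tag{7.9}
\end{align*}
but let's not bother with redundant ones any more.
\begin{align*}
\Gamma^{0}{}_{11} = \Gamma^{t}{}_{xx} &= \tfrac{1}{2} g^{tt} (-g_{xx,t}) = -(1 - 2\phi)\,\phi_{,t} \tag{7.10} \\
\Gamma^{0}{}_{12} = \Gamma^{t}{}_{xy} &= \tfrac{1}{2} g^{tt} (0) = 0. \tag{7.11}
\end{align*}
and in general when all the indices are different, the result is nil.
\begin{align*}
\Gamma^{0}{}_{13} = \Gamma^{t}{}_{xz} &= 0. \tag{7.12} \\
\Gamma^{0}{}_{22} = \Gamma^{t}{}_{yy} &= \tfrac{1}{2} g^{tt} (-g_{yy,t}) = -(1 - 2\phi)\,\phi_{,t} \tag{7.13} \\
\Gamma^{0}{}_{23} = \Gamma^{t}{}_{yz} &= 0 \tag{7.14} \\
\Gamma^{0}{}_{33} = \Gamma^{t}{}_{zz} &= \tfrac{1}{2} g^{tt} (-g_{zz,t}) = -(1 - 2\phi)\,\phi_{,t} \tag{7.15} \\
\Gamma^{1}{}_{00} = \Gamma^{x}{}_{tt} &= \tfrac{1}{2} g^{xx} (-g_{tt,x}) = (1 + 2\phi)\,\phi_{,x} \tag{7.16} \\
\Gamma^{1}{}_{01} = \Gamma^{x}{}_{tx} &= \tfrac{1}{2} g^{xx} (g_{xx,t}) = -(1 + 2\phi)\,\phi_{,t} \tag{7.17} \\
\Gamma^{1}{}_{02} = \Gamma^{x}{}_{ty} &= 0. \tag{7.18} \\
\Gamma^{1}{}_{03} = \Gamma^{x}{}_{tz} &= 0. \tag{7.19} \\
\Gamma^{1}{}_{11} = \Gamma^{x}{}_{xx} &= \tfrac{1}{2} g^{xx} (g_{xx,x}) = -(1 + 2\phi)\,\phi_{,x} \tag{7.20} \\
\Gamma^{1}{}_{12} = \Gamma^{x}{}_{xy} &= \tfrac{1}{2} g^{xx} (g_{xx,y}) = -(1 + 2\phi)\,\phi_{,y} \tag{7.21} \\
\Gamma^{1}{}_{13} = \Gamma^{x}{}_{xz} &= \tfrac{1}{2} g^{xx} (g_{xx,z}) = -(1 + 2\phi)\,\phi_{,z} \tag{7.22} \\
\Gamma^{1}{}_{22} = \Gamma^{x}{}_{yy} &= \tfrac{1}{2} g^{xx} (-g_{yy,x}) = (1 + 2\phi)\,\phi_{,x} \tag{7.23} \\
\Gamma^{1}{}_{23} = \Gamma^{x}{}_{yz} &= 0. \tag{7.24} \\
\Gamma^{1}{}_{33} = \Gamma^{x}{}_{zz} &= \tfrac{1}{2} g^{xx} (-g_{zz,x}) = (1 + 2\phi)\,\phi_{,x} \tag{7.25} \\
\Gamma^{2}{}_{00} = \Gamma^{y}{}_{tt} &= \tfrac{1}{2} g^{yy} (-g_{tt,y}) = (1 + 2\phi)\,\phi_{,y} \tag{7.26} \\
\Gamma^{2}{}_{01} = \Gamma^{y}{}_{tx} &= 0. \tag{7.27} \\
\Gamma^{2}{}_{02} = \Gamma^{y}{}_{ty} &= \tfrac{1}{2} g^{yy} (g_{yy,t}) = -(1 + 2\phi)\,\phi_{,t} \tag{7.28} \\
\Gamma^{2}{}_{03} = \Gamma^{y}{}_{tz} &= 0. \tag{7.29} \\
\Gamma^{2}{}_{11} = \Gamma^{y}{}_{xx} &= \tfrac{1}{2} g^{yy} (-g_{xx,y}) = (1 + 2\phi)\,\phi_{,y} \tag{7.30} \\
\Gamma^{2}{}_{12} = \Gamma^{y}{}_{xy} &= \tfrac{1}{2} g^{yy} (g_{yy,x}) = -(1 + 2\phi)\,\phi_{,x} \tag{7.31} \\
\Gamma^{2}{}_{13} = \Gamma^{y}{}_{xz} &= 0. \tag{7.32} \\
\Gamma^{2}{}_{22} = \Gamma^{y}{}_{yy} &= \tfrac{1}{2} g^{yy} (g_{yy,y}) = -(1 + 2\phi)\,\phi_{,y} \tag{7.33} \\
\Gamma^{2}{}_{23} = \Gamma^{y}{}_{yz} &= \tfrac{1}{2} g^{yy} (g_{yy,z}) = -(1 + 2\phi)\,\phi_{,z} \tag{7.34} \\
\Gamma^{2}{}_{33} = \Gamma^{y}{}_{zz} &= \tfrac{1}{2} g^{yy} (-g_{zz,y}) = (1 + 2\phi)\,\phi_{,y} \tag{7.35} \\
\Gamma^{3}{}_{00} = \Gamma^{z}{}_{tt} &= \tfrac{1}{2} g^{zz} (-g_{tt,z}) = (1 + 2\phi)\,\phi_{,z} \tag{7.36} \\
\Gamma^{3}{}_{01} = \Gamma^{z}{}_{tx} &= 0. \tag{7.37} \\
\Gamma^{3}{}_{02} = \Gamma^{z}{}_{ty} &= 0. \tag{7.38} \\
\Gamma^{3}{}_{03} = \Gamma^{z}{}_{tz} &= \tfrac{1}{2} g^{zz} (g_{zz,t}) = -(1 + 2\phi)\,\phi_{,t} \tag{7.39} \\
\Gamma^{3}{}_{11} = \Gamma^{z}{}_{xx} &= \tfrac{1}{2} g^{zz} (-g_{xx,z}) = (1 + 2\phi)\,\phi_{,z} \tag{7.40} \\
\Gamma^{3}{}_{12} = \Gamma^{z}{}_{xy} &= 0. \tag{7.41} \\
\Gamma^{3}{}_{13} = \Gamma^{z}{}_{xz} &= \tfrac{1}{2} g^{zz} (g_{zz,x}) = -(1 + 2\phi)\,\phi_{,x} \tag{7.42} \\
\Gamma^{3}{}_{22} = \Gamma^{z}{}_{yy} &= \tfrac{1}{2} g^{zz} (-g_{yy,z}) = (1 + 2\phi)\,\phi_{,z} \tag{7.43} \\
\Gamma^{3}{}_{23} = \Gamma^{z}{}_{yz} &= \tfrac{1}{2} g^{zz} (g_{zz,y}) = -(1 + 2\phi)\,\phi_{,y} \tag{7.44} \\
\Gamma^{3}{}_{33} = \Gamma^{z}{}_{zz} &= \tfrac{1}{2} g^{zz} (g_{zz,z}) = -(1 + 2\phi)\,\phi_{,z} \tag{7.45}
\end{align*}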
4. Verify that the results of Eqs. (7.15) and (7.24) depended only on $g_{00}$; the form of $g_{xx}$ doesn't affect them, as long as it is $(1 + O(\phi))$.
It's not clear where we're expected to start. I'll go back to Eq. (7.9), since this is clearly the equation for a geodesic.
Let $\lambda = \tau/m$, where $\tau$ is proper time and m is the rest mass. Let $\vec{x}$ be the position of the particle, so the tangent vector
\[ \vec{U} = \frac{d\vec{x}}{d\tau} \]
is also the four-velocity of the particle. The momentum $\vec{p}$ is
\[ \vec{p} = m\vec{U} = m\frac{d\vec{x}}{d\tau} = \frac{d\vec{x}}{d\lambda}. \tag{7.46} \]
And thus the momentum is the tangent vector to the same path parameterized by $\lambda$. So the particle's path also satisfies Eq. (7.10), the equation for a geodesic written in terms of its tangent vector, cf. Eq. (6.49).
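To justify Eq. (7.11) from Eq. (7.10), we write out Eq. (7.10) using the notation used in Eq. (6.47):
\begin{align*}
\nabla_{\vec{p}}\, \vec{p} = \frac{d\vec{p}}{d\lambda} &= \frac{d}{d\lambda}(p^{\alpha}\vec{e}_{\alpha}) \\
&= \vec{e}_{\alpha}\frac{d}{d\lambda}p^{\alpha} + p^{\alpha}\frac{d}{d\lambda}\vec{e}_{\alpha} \\
&= \vec{e}_{\alpha}\frac{d}{d\lambda}p^{\alpha} + p^{\alpha}\frac{\partial \vec{e}_{\alpha}}{\partial x^{\beta}}\frac{dx^{\beta}}{d\lambda} \\
&= \vec{e}_{\alpha}\frac{d}{d\lambda}p^{\alpha} + p^{\alpha}\,\Gamma^{\mu}{}_{\alpha\beta}\,\vec{e}_{\mu}\frac{dx^{\beta}}{d\lambda} && \text{using definition of Christoffel symbol, cf. Eq. (5.44)} \\
&= \vec{e}_{\alpha}\frac{d}{d\lambda}p^{\alpha} + p^{\alpha}\,\Gamma^{\mu}{}_{\alpha\beta}\,\vec{e}_{\mu}\, p^{\beta} && \text{using (7.46)} \\
&= \vec{e}_{\alpha}\left( \frac{d}{d\lambda}p^{\alpha} + p^{\mu}\,\Gamma^{\alpha}{}_{\mu\beta}\, p^{\beta} \right) && \text{relabelling dummy indices to allow factoring} \\
&= 0. \tag{7.47}
\end{align*}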
The quantity in parentheses is a scalar for each $\alpha$ and the basis vectors $\vec{e}_{\alpha}$ are all orthogonal, since we're working in an orthogonal basis (since the line element in Eq. (7.8) does not contain off-diagonal terms; see discussion in solution of Exer. 35 of §6.9.) That implies that each component of Eq. (7.10) must vanish (something that was not immediately obvious). So we can consider just the $\alpha = 0$ (i.e. time) component, which gives Eq. (7.11).
Recall the 4-momentum is
\[ \vec{p} \underset{\mathcal{O}}{\to} (E, p^{1}, p^{2}, p^{3}) = (\gamma m, \gamma m v^{1}, \gamma m v^{2}, \gamma m v^{3}) \]
where $v^{i}$ is the $i$th component of the 3-velocity and
\[ \gamma \equiv \frac{1}{\sqrt{1 - v^{2}/c^{2}}}. \]
When the magnitude of the three-velocity $|v| \ll c = 1$, we have $p^{0} \gg p^{i}$, and clearly Eq. (7.11) simplifies to Eq. (7.12). The $\Gamma^{0}{}_{00}$ for this metric was computed in Exer. 3 and agrees with that given just before Eq. (7.14). But I would get, instead of Eq. (7.14),
\[ \Gamma^{0}{}_{00} \approx (1 - 2\phi)\,\phi_{,t} = \phi_{,t} + O(\phi^{2}). \]
Furthermore, $p^{0} = E$, the total energy,
\[ E = \gamma m = m\frac{1}{\sqrt{1 - v^{2}}} \approx m, \]
see bottom p. 42. Substitution into Eq. (7.11) gives Eq. (7.15).
5. (a) For a perfect fluid, verify that the spatial components of Eq. (7.6) in the Newtonian limit reduce to the Euler equations (Eq. 7.38) for the metric Eq. (7.8).
The stress-energy tensor $T^{\alpha\beta}$ for a perfect fluid was given in Eq. (7.7); the only modification from that given by Eq. (4.37) in Chapter 4 on fluids in SR is the more general metric tensor. Substituting this into Eq. (7.6) we get:
\begin{align*}
T^{\alpha\beta}{}_{;\beta} &= [(\rho + p)U^{\alpha}U^{\beta}]_{;\beta} + [p\, g^{\alpha\beta}]_{;\beta} = 0 \\
&= [(\rho + p)U^{\alpha}U^{\beta}]_{;\beta} + p_{;\beta}\, g^{\alpha\beta} && \text{because } g^{\alpha\beta}{}_{;\beta} = 0 \\
&\approx [\rho\, U^{\alpha}U^{\beta}]_{;\beta} + p_{,\beta}\, g^{\alpha\beta} && \text{because } p \text{ is a scalar} \tag{7.48}
\end{align*}
and because $p \ll \rho$. This follows because the pressure arises from the random motion of the particles, which provides them with negligible kinetic energy relative to the rest mass energy in the non-relativistic limit. Furthermore, $\rho \approx \rho_{0} = nm$, again because the rest mass dominates the energy in the non-relativistic limit. (See Table 4.1 for a definition of symbols.)
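\begin{align*}
T^{\alpha\beta}{}_{;\beta} &\approx m[nU^{\alpha}U^{\beta}]_{;\beta} + p_{,\beta}\, g^{\alpha\beta} \\
&= mnU^{\beta}[U^{\alpha}]_{;\beta} + p_{,\beta}\, g^{\alpha\beta} && \text{using Eq. (7.2)} \\
&= mnU^{\beta}[U^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta}U^{\mu}] + p_{,\beta}\, g^{\alpha\beta} && \text{using Eq. (6.33)} \\
&= \rho_{0}U^{\beta}[U^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta}U^{\mu}] + p_{,\beta}\, g^{\alpha\beta} \tag{7.49}
\end{align*}
where $mn = \rho_{0}$, the rest mass density. Let's now restrict attention to the $\alpha = i$ (spatial) components.
\[ T^{i\beta}{}_{;\beta} = \rho_{0}U^{\beta}[U^{i}{}_{,\beta} + \Gamma^{i}{}_{\mu\beta}U^{\mu}] + p_{,\beta}\, g^{i\beta} \tag{7.50} \]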
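Let's look at the terms in (7.50) one at a time. The first term is the time derivative, in particular the Eulerian part when $\beta = 0$ and the advective part when $\beta = j > 0$ (by convention). To see this, expand the first term when $\beta = 0$:
\[ \rho_{0}U^{0}U^{i}{}_{,0} = \rho_{0}U^{0}U^{i}{}_{,t} = \rho_{0}\gamma^{2}v^{i}{}_{,t} \tag{7.51} \]
where $\gamma = 1/\sqrt{1 - v^{2}}$, $v^{i}$ is the $i$th component of the 3-velocity and v is its magnitude.
\begin{align*}
\rho_{0}\gamma^{2}v^{i}{}_{,t} &\approx \rho_{0}v^{i}{}_{,t} && \text{Newtonian limit} \\
&= \rho_{0}\frac{\partial \mathbf{v}}{\partial t} && \text{changing to vector notation.} \tag{7.52}
\end{align*}
And the advective part corresponds to this term with $\beta = j$:
\begin{align*}
\rho_{0}\gamma^{2}v^{j}v^{i}{}_{,j} &\approx \rho_{0}v^{j}v^{i}{}_{,j} && \text{Newtonian limit} \\
&= \rho_{0}(\mathbf{v}\cdot\nabla)\mathbf{v} && \text{changing to vector notation.} \tag{7.53}
\end{align*}
The next term in (7.50) contains the Christoffel symbol and can be written:
\begin{align*}
\rho_{0}\Gamma^{i}{}_{\mu\beta}U^{\mu}U^{\beta} &\approx \rho_{0}\Gamma^{i}{}_{00}U^{0}U^{0} && \text{in Newtonian limit } U^{0} \gg U^{i} \\
&= \rho_{0}\phi_{,i}U^{0}U^{0} && \text{Eq. (7.23)} \\
&= \rho_{0}\phi_{,i}\gamma^{2} \\
&\approx \rho_{0}\phi_{,i} && \text{in Newtonian limit} \\
&= \rho_{0}\nabla\phi \tag{7.54}
\end{align*}
which is the gravitational force per unit volume of fluid. The final term in (7.50) is the pressure gradient force per unit volume,
\begin{align*}
p_{,\beta}\, g^{i\beta} &= p_{,i}(1 - 2\phi)^{-1} && \text{Eq. (7.20)} \\
&\approx p_{,i}(1 + 2\phi) && \text{because } |\phi| \ll 1 \\
&\approx p_{,i} && \text{because } |\phi| \ll 1 \\
&= \nabla p. \tag{7.55}
\end{align*}
And that's the Euler equation.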
5. (b) Examine the time component of Eq. (7.6) under the Newtonian limit, and interpret each term.
We go back to (7.49) and restrict attention to the $\alpha = 0$ (time) component:
T
;
=
0
U
[U
,
+
] + p
,
g
T
0
;
=
0
U
[U
0
,
+
0
] + p
,
g
0
(7.56)
Lets look at the terms in (7.56) one-at-a-time. The rst term is the time
derivative. To see this, expand the rst term when = 0:
0
U
0
U
0
,0
=
0
U
0
U
0
,t
=
0

,t

0
t
_
1
2
v
2
_
(7.57)
which is the Eulerian time rate of change of the kinetic energy density. The final approximation comes about by expanding $\gamma$ and using the binomial approximation that applies in the Newtonian limit:
=
1
1 v
2
1 +
v
2
2
. (7.58)
And the advective part corresponds to this term with = j:
0
U
j
U
0
,j
=
0
v
j
,j

0
v
j
_
1
2
v
2
_
,j
Newtonian limit
=
0
(v )
_
1
2
v
2
_
changing to vector notation, (7.59)
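which is the advection of the kinetic energy per unit volume. The next term
in (7.56) contains the Christoffel symbol and can be written (with the convention $v^0 \equiv 1$ in the last line):
$$\begin{aligned}
\rho_0 \Gamma^0{}_{\mu\beta} U^\mu U^\beta &\approx \rho_0 \Gamma^0{}_{\mu 0} U^0 U^\mu && \text{in Newtonian limit } U^0 \gg U^i \\
&= \rho_0 U^0 U^\mu \tfrac{1}{2} g^{0\nu} \left( g_{\nu\mu,0} + g_{\nu 0,\mu} - g_{\mu 0,\nu} \right) \\
&= \rho_0 U^0 U^\mu \tfrac{1}{2} g^{00} \left( g_{0\mu,0} + g_{00,\mu} - g_{\mu 0,0} \right) && \text{(inverse) metric is diagonal} \\
&= \rho_0 U^0 U^\mu \tfrac{1}{2} g^{00} \left( g_{00,\mu} \right) \\
&= \rho_0 U^0 U^\mu \tfrac{1}{2} \left[ -(1 + 2\phi)^{-1} \right] \left[ -(1 + 2\phi)_{,\mu} \right] \\
&\approx \rho_0 U^0 U^\mu (1 - 2\phi)\, \phi_{,\mu} && \text{binomial approximation} \\
&\approx \rho_0 U^0 U^\mu \phi_{,\mu} \\
&= \rho_0 \gamma^2 v^\mu \phi_{,\mu}
\end{aligned} \tag{7.60}$$
When $\mu = 0$ this gives the tidal forcing:
$$\rho_0 \Gamma^0{}_{00} U^0 U^0 \approx \rho_0 \phi_{,t} \tag{7.61}$$
When $\mu = i$ this gives the work done by potential forces (or change in
potential energy):
$$\rho_0 \Gamma^0{}_{i0} U^i U^0 \approx \rho_0 v^i \phi_{,i} \tag{7.62}$$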
The final term in (7.56) involves the pressure,
$$\begin{aligned}
p_{,\beta}\, g^{\beta 0} &= p_{,0}\, g^{00} && \text{because the inverse metric is diagonal} \\
&= p_{,t} \left[ -(1 + 2\phi)^{-1} \right] && \text{Eq. (7.8), take the inverse of diagonal matrix} \\
&\approx p_{,t} \left[ -(1 - 2\phi) \right] && \text{binomial approximation} \\
&\approx -p_{,t}
\end{aligned} \tag{7.63}$$
This looks like a change from internal energy to kinetic energy, but I'm confused
because this should result from divergence of the velocity (compression/expansion
of fluid parcels working against ambient pressure).
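5. (c) Derive the relativistic hydrostatic balance, given by Eq. (7.40),
from Eq. (7.6). [I recommend you try my supplementary problem R1 above
before tackling this one.]
We start again with the divergence of the stress-energy tensor given above
in (7.48):
$$\begin{aligned}
T^{\alpha\beta}{}_{;\beta} &= \left[ (\rho + p) U^\alpha U^\beta \right]_{;\beta} + \left[ p\, g^{\alpha\beta} \right]_{;\beta} = 0 \\
&= \left[ (\rho + p) U^\alpha \right] U^\beta{}_{;\beta} + \left[ (\rho + p) U^\beta \right] U^\alpha{}_{;\beta} + \left[ U^\alpha U^\beta \right] (\rho + p)_{;\beta} + p_{;\beta}\, g^{\alpha\beta} + p\, g^{\alpha\beta}{}_{;\beta}
\end{aligned} \tag{7.64}$$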
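Let's step through the terms of (7.64) starting with the last, applying the
static condition when appropriate. The final term vanishes:
$$p\, g^{\alpha\beta}{}_{;\beta} = 0$$
as we have used many times, see § 6.9 Exer. 17(a) and Eq. (6.31). The 4th
term is:
$$\begin{aligned}
p_{;\beta}\, g^{\alpha\beta} &= p_{,\beta}\, g^{\alpha\beta} && \text{because } p \text{ is a scalar,} \\
&= p_{,i}\, g^{\alpha i} && \text{static metric.}
\end{aligned} \tag{7.65}$$
The 3rd term vanishes:
$$\begin{aligned}
\left[ U^\alpha U^\beta \right] (\rho + p)_{;\beta} &= \left[ U^\alpha U^\beta \right] (\rho + p)_{,\beta} && \text{because } \rho + p \text{ is a scalar} \\
&= \left[ U^\alpha U^0 \right] (\rho + p)_{,0} && \text{static fluid} \\
&= \left[ U^\alpha U^0 \right] (\rho + p)_{,t} \\
&= 0 && \text{static fluid.}
\end{aligned} \tag{7.66}$$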
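The 2nd term is
$$\begin{aligned}
\left[ (\rho + p) U^\beta \right] U^\alpha{}_{;\beta} &= \left[ (\rho + p) U^0 \right] U^\alpha{}_{;0} && \text{static fluid} \\
&= \left[ (\rho + p) U^0 \right] \left( U^\alpha{}_{,t} + \Gamma^\alpha{}_{\mu t} U^\mu \right) && \text{Eq. (6.33)} \\
&= (\rho + p)\, U^0\, \Gamma^\alpha{}_{\mu t} U^\mu && \text{static fluid} \\
&= (\rho + p)\, U^0\, \Gamma^\alpha{}_{00} U^0 && \text{static fluid}
\end{aligned} \tag{7.67}$$
The first term vanishes:
$$\begin{aligned}
(\rho + p)\, U^\alpha U^\beta{}_{;\beta} &= (\rho + p)\, U^\alpha \left[ U^\beta{}_{,\beta} + U^\beta \left( \ln\sqrt{-g} \right)_{,\beta} \right] \\
&= (\rho + p)\, U^\alpha \left[ U^0{}_{,t} + U^0 \left( \ln\sqrt{-g} \right)_{,t} \right] && \text{static fluid} \\
&= 0 && \text{static fluid}
\end{aligned} \tag{7.68}$$
So there's a (relativistic hydrostatic) balance between the 4th and 2nd terms:
$$(\rho + p)\, U^0\, \Gamma^\alpha{}_{00} U^0 + p_{,i}\, g^{\alpha i} = 0 \tag{7.69}$$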
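Now let's simplify the Christoffel symbol,
$$\begin{aligned}
\Gamma^\alpha{}_{00} &= \tfrac{1}{2} g^{\alpha\beta} \left( -g_{00,\beta} \right) && \text{using Eq. (5.75) and static metric} \\
&= -\tfrac{1}{2} g^{\alpha i} \left( g_{00,i} \right) && \text{static metric.}
\end{aligned} \tag{7.70}$$
Substitution in our hydrostatic balance equation gives:
$$\begin{aligned}
0 &= (\rho + p)\, U^0\, \Gamma^\alpha{}_{00} U^0 + p_{,i}\, g^{\alpha i} && \text{recall above} \\
&= -(\rho + p)\, \tfrac{1}{2} g^{\alpha i} \left( g_{00,i} \right) U^0 U^0 + p_{,i}\, g^{\alpha i}
\end{aligned} \tag{7.71}$$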
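And now for the tricky bit! Recall we learned in SR that the four-velocity
of a stationary particle was the speed of light in the direction of time, c.f.
§ 2.2, so that
$$\vec{U} \cdot \vec{U} = g_{\alpha\beta} U^\alpha U^\beta = U^0 U^0 g_{00} = 1 \cdot 1 \cdot (-1) = -1, \quad \text{Eq. (2.28)}. \tag{7.72}$$
Now in GR the metric has changed, but do we keep the magnitude
$\vec{U} \cdot \vec{U} = -1$ and change the components of $U^\alpha$ accordingly? Yes, we do.
[Personally I think this wasn't explained clearly by Schutz and it's one of my
few criticisms of this text.] For the static fluid this means $U^0 U^0 = -1/g_{00}$.
Once one sees this small but important step, the result follows easily. One
either considers a single spatial component of $\alpha$ or factors $g^{\alpha i}$ out:
$$\begin{aligned}
0 &= -(\rho + p)\, \tfrac{1}{2} g^{\alpha i} \left( g_{00,i} \right) U^0 U^0 + p_{,i}\, g^{\alpha i} \\
&= -(\rho + p)\, \tfrac{1}{2} g^{\alpha i} \left( g_{00,i} \right) \left( \frac{-1}{g_{00}} \right) + p_{,i}\, g^{\alpha i} \\
&= g^{\alpha i} \left[ (\rho + p)\, \tfrac{1}{2} \left( g_{00,i} \right) \left( \frac{1}{g_{00}} \right) + p_{,i} \right] \\
\Rightarrow \quad 0 &= (\rho + p)\, \tfrac{1}{2} \left( g_{00,i} \right) \left( \frac{1}{g_{00}} \right) + p_{,i} \\
&= (\rho + p)\, \tfrac{1}{2} \left[ \ln(-g_{00}) \right]_{,i} + p_{,i} && \text{chain rule of differential calculus}
\end{aligned} \tag{7.73}$$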
5. (d) There appears to be a close relationship between $-g_{00}$ and $\exp(2\phi)$, where
$\phi$ is the Newtonian potential. Show that Eq. (7.8) and Exer. 4 are consistent
with this.
First we have to make sense of "there is a close relation". Let's check
if they are approximately equal in some situations (such as the Newtonian
limit). So then the term
$$\tfrac{1}{2} \left[ \ln(-g_{00}) \right]_{,i} = \tfrac{1}{2} \left[ \ln(\exp(2\phi)) \right]_{,i} = \phi_{,i}$$
is the Newtonian gravitational force per unit mass that appears in the
Newtonian fluid hydrostatic balance (except that we then have to ignore $p \ll \rho$).
So yes, approximate equality (in some situations) appears to be what the
"relation" is.
Let's assume that the Newtonian potential is small, $|\phi| \ll 1$ in
nondimensional units. Then
$$\begin{aligned}
\exp(2\phi) &= 1 + 2\phi + O(\phi^2) && \text{Taylor series about } \phi = 0 \\
&\approx 1 + 2\phi && \text{if } |\phi| \ll 1 \\
&= -g_{00} && \text{as in Eq. (7.8).}
\end{aligned} \tag{7.74}$$
And in Exer. 4 we were required to show that in the Newtonian limit
of Eq. (7.10), which was Eq. (7.15) and Eq. (7.24), there was only a
dependence on $g_{00}$, and not the other components of the metric. These equations
correspond to the energy and momentum of a particle in a time-varying
gravitational field. In the Newtonian limit we expect classical mechanics to
apply, of course, from which we know that the energy depends upon the tidal
forcing (time variation of the gravitational potential), and the momentum is
altered by the gravitational force, which is the gradient of the gravitational
potential. So $-g_{00} \approx \exp(2\phi)$ is consistent with this.
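To get a feel for how good the approximation in (7.74) is, here is a small Python sketch (my own illustration, not part of Schutz's text). The function name `relative_gap` and the sample potentials are assumptions chosen for illustration; the last value is roughly the Sun's surface potential in geometrized units.

```python
import math

def relative_gap(phi: float) -> float:
    """Relative difference between exp(2*phi) and its linearisation 1 + 2*phi."""
    exact = math.exp(2.0 * phi)
    linear = 1.0 + 2.0 * phi
    return abs(exact - linear) / exact

# Illustrative (assumed) dimensionless potentials, strongest first.
for phi in (-1e-1, -1e-3, -2.12e-6):
    gap = relative_gap(phi)
    # The leading neglected Taylor term is 2*phi**2, so the gap should track it.
    print(f"phi = {phi:+.2e}  relative gap = {gap:.3e}  2*phi^2 = {2 * phi * phi:.3e}")
```

The gap shrinks like $2\phi^2$, which is why $-g_{00}$ and $\exp(2\phi)$ are indistinguishable at first order in the weak-field limit.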
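6. Deduce Eq. (7.25) from Eq. (7.10).
Eq. (7.10) was the equation for a geodesic,
$$\begin{aligned}
\nabla_{\vec{p}}\, (\vec{p}) &= 0 \\
g_{\beta\alpha} \left[ \nabla_{\vec{p}}\, (\vec{p}) \right]^\alpha &= 0 \\
&= g_{\beta\alpha} \left[ p^\mu p^\alpha{}_{;\mu} \right] \\
&= p^\mu \left[ g_{\beta\alpha} p^\alpha \right]_{;\mu} && \text{using Eq. (6.31)} \\
&= p^\mu p_{\beta;\mu}
\end{aligned} \tag{7.75}$$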
7. We're given expressions for the line elements corresponding to four
different metrics (i) . . . (iv). (i) is the Minkowski metric.
(a) For each metric, find as many conserved components $p_\alpha$ of a freely
falling particle's four-momentum as possible.
(i) All four components of $\vec{p}$, i.e. $p_\alpha$, are conserved.
(ii) The conserved components of $\vec{p}$ are:
$$p_t, \quad p_\phi \tag{7.76}$$
(iii) The conserved components of $\vec{p}$ are:
$$p_t, \quad p_\phi \tag{7.77}$$
(iv) The conserved components of $\vec{p}$ are:
$$p_\phi \tag{7.78}$$
(b) Use the results of Exer. 28 from § 6.9 to transform the Minkowski
metric (i) to the form
$$ds^2 = -dt^2 + dr^2 + r^2 \left( d\theta^2 + \sin^2\theta\, d\phi^2 \right)$$
Use this to argue that (ii) and (iv) are spherically symmetric. Does this
increase the number of conserved components of $p_\alpha$?
Recall Exer. 28 from § 6.9 gave us the metric on the surface of a sphere
in 3D Cartesian space. But the 3 spatial coordinates of Minkowski space
look like those of 3D Cartesian space. So we can represent them in spherical
polar coordinates with the same metric, while keeping the time component.
I suppose one also needs to know that $dr^2$ gives the radial contribution to
the line element in spherical polar coordinates, yet this was not provided in
Exer. 28.
I'm not sure about the correct answer to the remainder of this question,
but I'll have a stab at it.
It's clear for the Schwarzschild metric (ii) that for fixed r (so on the surface
of a 3-sphere manifold in this 4-space), the Schwarzschild metric has the
same form (to within constants) as the metric for Minkowski space written in
spherical coordinates. We know all components of $p_\alpha$ are conserved for the
Minkowski metric. This indicates that linear combinations of components
of $p_\alpha$ on the surface of the 3-sphere manifold in the Schwarzschild space are
conserved. There is a linear transformation that converts this metric to the
Minkowski metric everywhere on the 3-sphere.
For the Robertson-Walker metric (iv) for fixed r and t, the metric has the
same form to within constants as the Minkowski metric written in spherical
coordinates. So again this indicates that linear combinations of components
of $p_\alpha$ on this manifold would be conserved.
(c) For metrics (i), (ii), (iii), and (iv), a geodesic that begins tangent to
the equatorial plane stays on the equatorial plane (i.e. starts with $\theta = \pi/2$
and $p^\theta = 0$ and keeps $\theta = \pi/2$ and $p^\theta = 0$). For cases (i), (ii), and (iii), use
the equation $\vec{p} \cdot \vec{p} = -m^2$ to solve for $p^r$ in terms of m, other conserved
quantities, and known functions of position.
For (i),
$$\begin{aligned}
-m^2 &= \vec{p} \cdot \vec{p} = g_{\alpha\beta}\, p^\alpha p^\beta \\
&= -(p^t)^2 + (p^r)^2 + r^2 \left[ (p^\theta)^2 + \sin^2\theta\, (p^\phi)^2 \right] && \text{writing out terms with metric (i)} \\
&= -(p^t)^2 + (p^r)^2 + r^2 (p^\phi)^2 && \text{tangent to equatorial plane} \\
p^r &= \pm\sqrt{(p^t)^2 - r^2 (p^\phi)^2 - m^2}
\end{aligned} \tag{7.79}$$
We're almost there, but remember it's $p_t$ and $p_\phi$ that are conserved. So we
need to relate $p^t$ and $p^\phi$ to them to fulfill the requirement of "other conserved
quantities". Because $(g_{\alpha\beta})$ is diagonal for metrics (i) and (ii), a single
factor relates each component:
$$\begin{aligned}
p_t &= p^\alpha g_{\alpha t} = p^t g_{tt} = p^t (-1) \\
p_\phi &= p^\alpha g_{\alpha\phi} = p^\phi g_{\phi\phi} = p^\phi r^2 \sin^2\theta = p^\phi r^2
\end{aligned} \tag{7.80}$$
Substitution in the above equation gives:
$$p^r = \pm\sqrt{(p_t)^2 - (p_\phi)^2 / r^2 - m^2} \tag{7.81}$$
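For (ii),
$$\begin{aligned}
-m^2 &= g_{\alpha\beta}\, p^\alpha p^\beta \\
&= -\left[ 1 - 2M/r \right] (p^t)^2 + \frac{1}{1 - 2M/r} (p^r)^2 + r^2 \left[ (p^\theta)^2 + \sin^2\theta\, (p^\phi)^2 \right] && \text{expanding} \\
&= -\left[ 1 - 2M/r \right] (p^t)^2 + \frac{1}{1 - 2M/r} (p^r)^2 + r^2 (p^\phi)^2 && \text{tangent to equatorial plane} \\
p^r &= \pm\sqrt{1 - 2M/r}\, \sqrt{\left[ 1 - 2M/r \right] (p^t)^2 - r^2 (p^\phi)^2 - m^2}
\end{aligned} \tag{7.82}$$
Again we must find the corresponding conserved quantities:
$$\begin{aligned}
p_t &= p^t g_{tt} = p^t (-1) \left[ 1 - 2M/r \right] \\
p_\phi &= p^\phi g_{\phi\phi} = p^\phi r^2
\end{aligned} \tag{7.83}$$
Substitution in the above equation gives:
$$p^r = \pm\sqrt{1 - 2M/r}\, \sqrt{(p_t)^2 / \left[ 1 - 2M/r \right] - (p_\phi)^2 / r^2 - m^2} \tag{7.84}$$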
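For (iii), it's instructive to note immediately that on the equatorial plane
$\rho^2 = r^2 + a^2 \cos^2\theta = r^2$, so that
$$\begin{aligned}
-m^2 &= g_{\alpha\beta}\, p^\alpha p^\beta \\
&= -\frac{\Delta - a^2}{r^2} (p^t)^2 - \frac{4aM}{r} (p^t p^\phi) + \frac{(r^2 + a^2)^2 - a^2 \Delta}{r^2} (p^\phi)^2 + \frac{r^2}{\Delta} (p^r)^2
\end{aligned} \tag{7.85}$$
The above equation involves known functions of position, and $p^t$ and $p^\phi$, which
are not conserved. Because the Kerr metric is not diagonal, it's more involved
to find the corresponding conserved quantities:
$$\begin{aligned}
p_t &= p^\alpha g_{\alpha t} = g_{tt}\, p^t + g_{t\phi}\, p^\phi \\
p_\phi &= p^\alpha g_{\alpha\phi} = g_{\phi\phi}\, p^\phi + g_{t\phi}\, p^t
\end{aligned} \tag{7.86}$$
We solve this $2 \times 2$ system for $p^t$ and $p^\phi$ to find:
$$\begin{aligned}
p^t &= -p_t \frac{g_{\phi\phi}}{g_{t\phi}^2 - g_{tt}\, g_{\phi\phi}} + p_\phi \frac{g_{t\phi}}{g_{t\phi}^2 - g_{tt}\, g_{\phi\phi}} \\
p^\phi &= p_t \frac{g_{t\phi}}{g_{t\phi}^2 - g_{tt}\, g_{\phi\phi}} - p_\phi \frac{g_{tt}}{g_{t\phi}^2 - g_{tt}\, g_{\phi\phi}}
\end{aligned} \tag{7.87}$$
Substitution in the above equation would give the required equation,
but it would be very messy. We simply note that we have met the requirement
in principle because $g_{tt}$, $g_{t\phi}$, $g_{\phi\phi}$ are known functions of position.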
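Since the algebra for the Kerr case is easy to get wrong, here is a small numerical round-trip check in Python (my own sketch, not from Schutz). It lowers a sample $(p^t, p^\phi)$ with the equatorial Kerr metric and then raises the result again with the solution of the $2 \times 2$ system. The function names and the sample values of M, a, r and the momenta are assumptions for illustration only.

```python
def kerr_equatorial_metric(r, M=1.0, a=0.5):
    """Return (g_tt, g_tphi, g_phiphi) of the Kerr metric on the equatorial plane."""
    delta = r * r - 2.0 * M * r + a * a
    g_tt = -(delta - a * a) / (r * r)
    g_tphi = -2.0 * a * M / r            # half of the dt dphi coefficient -4aM/r
    g_phiphi = ((r * r + a * a) ** 2 - a * a * delta) / (r * r)
    return g_tt, g_tphi, g_phiphi

def lower(pt_up, pphi_up, g_tt, g_tphi, g_phiphi):
    """Covariant (conserved) components from the contravariant ones."""
    return g_tt * pt_up + g_tphi * pphi_up, g_phiphi * pphi_up + g_tphi * pt_up

def raise_(pt_dn, pphi_dn, g_tt, g_tphi, g_phiphi):
    """Invert the 2x2 system for the contravariant components."""
    den = g_tphi ** 2 - g_tt * g_phiphi
    pt_up = -pt_dn * g_phiphi / den + pphi_dn * g_tphi / den
    pphi_up = pt_dn * g_tphi / den - pphi_dn * g_tt / den
    return pt_up, pphi_up

g = kerr_equatorial_metric(r=10.0)
down = lower(1.2, 0.03, *g)       # sample (assumed) contravariant momenta
print(raise_(*down, *g))          # should recover the starting values
```

Recovering the starting values confirms the signs in the inversion, in particular the leading minus sign on the $p_t$ term of $p^t$.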

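(d) For (iv), spherical symmetry implies that if a geodesic begins with
$p^\theta = p^\phi = 0$, these remain zero. Use this to show from Eq. (7.29) that when
$k = 0$, $p_r$ is conserved.
When $k = 0$ the Robertson-Walker metric simplifies to:
$$(g_{\alpha\beta}) = \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & R^2(t) & 0 & 0 \\
0 & 0 & r^2 R^2(t) & 0 \\
0 & 0 & 0 & r^2 \sin^2\theta\, R^2(t)
\end{pmatrix}$$
Writing out Eq. (7.29) for $p_r$ and this metric we get:
$$\begin{aligned}
m \frac{d}{d\tau} p_\beta &= \tfrac{1}{2} g_{\nu\alpha,\beta}\, p^\nu p^\alpha && \text{Eq. (7.29)} \\
m \frac{d}{d\tau} p_r &= \tfrac{1}{2} g_{\nu\alpha,r}\, p^\nu p^\alpha \\
&= \tfrac{1}{2} \left[ g_{tt,r} (p^t)^2 + g_{rr,r} (p^r)^2 + g_{\theta\theta,r} (p^\theta)^2 + g_{\phi\phi,r} (p^\phi)^2 \right] \\
&= \tfrac{1}{2} \left[ g_{\theta\theta,r} (p^\theta)^2 + g_{\phi\phi,r} (p^\phi)^2 \right] && \text{clear from metric} \\
&= 0 && \text{clear from given initial conditions and spherical symmetry.}
\end{aligned} \tag{7.88}$$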
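8. Suppose that in some coordinate system the components of the metric
$g_{\alpha\beta}$ are independent of some coordinate $x^\mu$.
(a) Show that the conservation law $T^\nu{}_{\mu;\nu} = 0$ for any stress-energy tensor
becomes that given by Eq. (7.41).
If it's not obvious where this conservation law came from, have a go at
supplementary problem SP2 above.
Let's start with expanding Eq. (7.41) to see what's really there.
$$\begin{aligned}
0 &= \frac{1}{\sqrt{-g}} \left( \sqrt{-g}\, T^\nu{}_\mu \right)_{,\nu} \\
&= T^\nu{}_{\mu,\nu} + T^\nu{}_\mu \frac{(\sqrt{-g})_{,\nu}}{\sqrt{-g}} && \text{product rule of differential calculus} \\
&= T^\nu{}_{\mu,\nu} + T^\nu{}_\mu \Gamma^\alpha{}_{\nu\alpha} && \text{Eq. (6.40).}
\end{aligned} \tag{7.89}$$
This now looks somewhat like the given conservation law, for if we expand
it using Eqs. (6.34) and (6.35), we obtain
$$T^\nu{}_{\mu;\nu} = T^\nu{}_{\mu,\nu} + T^\alpha{}_\mu \Gamma^\nu{}_{\alpha\nu} - T^\nu{}_\alpha \Gamma^\alpha{}_{\mu\nu} \tag{7.90}$$
Comparing (7.89) and (7.90) we see that we merely have to show that under
the given conditions on the metric, the final term in (7.90) vanishes. Let's
start with the most general expression relating the Christoffel symbol to the
metric tensor, and then apply the restriction given here:
$$\begin{aligned}
T^\nu{}_\alpha \Gamma^\alpha{}_{\mu\nu} &= T^\nu{}_\alpha\, \tfrac{1}{2} g^{\alpha\beta} \left[ g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta} \right] && \text{Eq. (6.32)} \\
&= T^\nu{}_\alpha\, \tfrac{1}{2} g^{\alpha\beta} \left[ g_{\beta\mu,\nu} - g_{\mu\nu,\beta} \right] && \text{when } g_{\alpha\beta} \text{ independent of } x^\mu \\
&= T^{\nu\beta}\, \tfrac{1}{2} \left[ g_{\beta\mu,\nu} - g_{\mu\nu,\beta} \right] && \text{just raised index} \\
&= 0,
\end{aligned} \tag{7.91}$$
because $T^{\nu\beta}$ is symmetric on $\nu, \beta$ while the quantity in [ ] is antisymmetric on
these indices, cf. Exer. 26 of § 3.10.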
(b) Suppose that in these coordinates T
,= 0 only in some bounded

region of each spacelike hypersurface x
0
=const. Show that Eq. (7.41) implies
_
x
0
=const.
T
d
3
x
is independent of time x
0
, if
is the unit normal to the hypersurface.

I believe this is a typo and should be written with = 0, i.e.:
_
x
0
=const.
T
0
g
0
d
3
x
Starting with Eq. (7.41) we integrate both sides, using the proper volume
element, $\sqrt{-g}\, d^4x$ (c.f. Eq. (6.18)):
$$\begin{aligned}
\int \frac{1}{\sqrt{-g}} \left( \sqrt{-g}\, T^\nu{}_\mu \right)_{,\nu} \sqrt{-g}\, d^4x &= 0 \\
\int \left( \sqrt{-g}\, T^\nu{}_\mu \right)_{,\nu} d^4x &= 0 \\
\oint \left( \sqrt{-g}\, T^\nu{}_\mu \right) n_\nu\, d^3S &= 0 && \text{Eq. (6.44)}
\end{aligned} \tag{7.92}$$
I found it helpful at this point to picture using Gauss's law in a simple setting
of 2D Cartesian space, with some field being nonzero only in some region
of finite extent. In this simple case one would choose the bounding limits to
be straight lines $x = x_1$ and $x = x_2$ and $y = y_1$ and $y = y_2$ that lie outside
the region of nonzero field. By analogy, we expand the LHS of (7.92)
SP 1. Recall we learned in SR that the four-velocity of a stationary particle
was the speed of light in the direction of time, c.f. § 2.2, so that
$$\vec{U} \cdot \vec{U} = g_{\alpha\beta} U^\alpha U^\beta = U^0 U^0 g_{00} = 1 \cdot 1 \cdot (-1) = -1, \quad \text{Eq. (2.28)}. \tag{7.93}$$
Now in GR the metric has changed, but do we keep the magnitude
$\vec{U} \cdot \vec{U} = -1$ and change the components of $U^\alpha$ accordingly? I recommend
you do this problem before tackling Exer. 5(c).
Yes, we do. [Personally I think this wasn't explained clearly by Schutz up
to this point in the text and it's one of my few criticisms.] See Eq. (10.18).
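SP 2. Show that $T^\nu{}_{\mu;\nu} = 0$ can be derived from the conservation law
Eq. (7.6).
We simply multiply Eq. (7.6) by the metric tensor to lower the index.
$$\begin{aligned}
0 &= T^{\alpha\nu}{}_{;\nu} && \text{Eq. (7.6)} \\
&= T^{\nu\alpha}{}_{;\nu} && \text{symmetry of the stress-energy tensor} \\
0 &= g_{\alpha\mu}\, T^{\nu\alpha}{}_{;\nu} && \text{multiplying both sides by } g_{\alpha\mu} \\
&= \left( g_{\alpha\mu}\, T^{\nu\alpha} \right)_{;\nu} && \text{using Eq. (6.31)} \\
&= T^\nu{}_{\mu;\nu}
\end{aligned} \tag{7.94}$$
et voilà!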
Chapter 8
The Einstein Field Equations
8.1 Purpose and justification of the field equations
Re-reading this long after Chapter 7, I found it strange that he was justifying
Eq. (7.6) based upon the Einstein equivalence principle, see discussion
between Eqs. (8.4) and (8.5) on p. 185. But re-reading Chapter 7 it is clear that
he's just referring to the application of the "comma goes to semi-colon" rule.
That is, Eq. (7.5), the conservation of four-momentum in SR, generalizes (by
the Einstein equivalence principle) to Eq. (7.6).
8.2 Einstein's equations
8.3 Einstein's equations for weak gravitational fields
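It's perhaps helpful to go through the derivation of Eq. (8.22).
$$\begin{aligned}
g_{\alpha'\beta'} &= \Lambda^\mu{}_{\alpha'} \Lambda^\nu{}_{\beta'}\, g_{\mu\nu} && \text{transformation of a 2nd rank tensor, like Eq. (8.16)} \\
&= \Lambda^\mu{}_{\alpha'} \Lambda^\nu{}_{\beta'} \left( \eta_{\mu\nu} + h_{\mu\nu} \right) && \text{substitute Eq. (8.12)} \\
&= \left[ \delta^\mu{}_\alpha - \xi^\mu{}_{,\alpha} \right] \left[ \delta^\nu{}_\beta - \xi^\nu{}_{,\beta} \right] \left( \eta_{\mu\nu} + h_{\mu\nu} \right) && \text{substitute Eq. (8.21)} \\
&= \eta_{\alpha\beta} + h_{\alpha\beta} - \xi^\mu{}_{,\alpha}\, \eta_{\mu\beta} - \xi^\nu{}_{,\beta}\, \eta_{\alpha\nu} && \text{keeping just the largest terms} \\
&= \eta_{\alpha\beta} + h_{\alpha\beta} - \xi_{\beta,\alpha} - \xi_{\alpha,\beta}
\end{aligned} \tag{8.1}$$
which gives Eq. (8.22). It's tempting to justify the final step above by the
metric tensor lowering the indices $\mu$ and $\nu$. But note that Schutz is careful
not to say this, but instead says that we define $\xi_\alpha \equiv \eta_{\alpha\beta}\, \xi^\beta$,
see his Eq. (8.23). I guess this is because $\eta_{\alpha\beta}$ is not the full metric, but only
the dominant piece of it.
When $g_{\alpha\beta}$ is given by Eq. (8.12) then Eq. (8.25) follows immediately
from Eq. (6.67), which is $R_{\alpha\beta\mu\nu}$ in a local inertial frame. All one has to do is
substitute Eq. (8.12) and lower the index. Note the typo in Eq. (6.67).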
8.6 Exercises
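1. Show that Eq. (8.2) is a solution to Eq. (8.1) using the method suggested.
Gauss' law in 3D space is
$$\int_\Omega \nabla \cdot (\nabla\phi)\, dV = \oint_{\partial\Omega} \mathbf{n} \cdot (\nabla\phi)\, dA$$
where $\Omega$ is a volume, dV a differential volume element, $\partial\Omega$ is the surface
bounding $\Omega$, $\mathbf{n}$ is the outward pointing unit normal vector, and dA is an area
element on the bounding surface.
As suggested, we consider a point particle at the origin of a spherical
co-ordinate system. Then the left-hand integral is trivial:
$$\begin{aligned}
\int_\Omega \nabla \cdot (\nabla\phi)\, dV &= \int_\Omega 4\pi G \rho\, dV, && \text{using Eq. (8.1)} \\
&= 4\pi G \int_\Omega \rho\, dV, \\
&= 4\pi G m,
\end{aligned} \tag{8.2}$$
where m is the mass of the particle. We have assumed that the particle is in
a vacuum so that $\rho$ is only due to this particle.
The right-hand side gives,
$$\begin{aligned}
\oint_{\partial\Omega} \mathbf{n} \cdot (\nabla\phi)\, dA &= \oint_{\partial\Omega} \frac{d\phi}{dr}\, dA, && \text{using surface of sphere for } \partial\Omega \\
&= \frac{d\phi}{dr}\, 4\pi R^2,
\end{aligned} \tag{8.3}$$
where R is the radius of the sphere and by spherical symmetry the integrand
is constant on the surface of the sphere. Combining these two results gives
the 1st order ODE:
$$\frac{d\phi}{dr} = \frac{Gm}{r^2}$$
To solve this we integrate both sides, imposing the BC $\phi(\infty) = 0$,
$$\begin{aligned}
\int_R^\infty \frac{d\phi}{dr}\, dr &= \int_R^\infty \frac{Gm}{r^2}\, dr \\
\left[ \phi \right]_R^\infty &= \left[ -\frac{Gm}{r} \right]_R^\infty \\
\phi(R) &= -\frac{Gm}{R}
\end{aligned} \tag{8.4}$$
which is Eq. (8.2).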
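2 (a). Derive the two given conversion factors.
One simply plugs in the given constants with values in SI units to calculate
the numerical value. One should also convince oneself that the units are
correct. The SI values and units were given in Table 8.1. Personally I find it
easy to remember the formulae that the constants are used in and figure out
the units from there.
$$\begin{aligned}
\frac{G_{SI}\, [\mathrm{m^3\, kg^{-1}\, s^{-2}}]}{c_{SI}^2\, [\mathrm{m\, s^{-1}}]^2}
&= \frac{G_{SI}}{c_{SI}^2} \frac{[\mathrm{m^3\, kg^{-1}\, s^{-2}}]}{[\mathrm{m^2\, s^{-2}}]}
= \frac{G_{SI}}{c_{SI}^2} \frac{[\mathrm{m}]}{[\mathrm{kg}]} \\
&= \frac{6.674 \times 10^{-11}}{(2.998 \times 10^8)^2}\, [\mathrm{m\, kg^{-1}}] \\
&= 7.425 \times 10^{-28}\, [\mathrm{m\, kg^{-1}}]
\end{aligned} \tag{8.5}$$
$$\begin{aligned}
\frac{c_{SI}^5\, [\mathrm{m\, s^{-1}}]^5}{G_{SI}\, [\mathrm{m^3\, kg^{-1}\, s^{-2}}]}
&= \frac{c_{SI}^5}{G_{SI}} \frac{[\mathrm{m^5\, s^{-5}}]}{[\mathrm{m^3\, kg^{-1}\, s^{-2}}]}
= \frac{c_{SI}^5}{G_{SI}} \frac{[\mathrm{m^2\, kg}]}{[\mathrm{s^3}]} \\
&= \frac{(2.998 \times 10^8)^5}{6.674 \times 10^{-11}}\, [\mathrm{N\, m\, s^{-1}}] \\
&= 3.629 \times 10^{52}\, [\mathrm{J\, s^{-1}}]
\end{aligned} \tag{8.6}$$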
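For readers without Maple, the two factors in (8.5) and (8.6) can be reproduced with a few lines of Python. This is only a sketch: the function names are my own, and the constants are the rounded SI values used above.

```python
G_SI = 6.674e-11   # gravitational constant [m^3 kg^-1 s^-2]
C_SI = 2.998e8     # speed of light [m s^-1]

def mass_to_length_factor(G: float = G_SI, c: float = C_SI) -> float:
    """G/c^2 in [m kg^-1]: multiply a mass in kg by this to get metres."""
    return G / c**2

def luminosity_unit(G: float = G_SI, c: float = C_SI) -> float:
    """c^5/G in [J s^-1]: divide an SI luminosity by this to make it unitless."""
    return c**5 / G

print(f"G/c^2 = {mass_to_length_factor():.3e} m/kg")
print(f"c^5/G = {luminosity_unit():.3e} J/s")
```

As a usage example, multiplying a solar mass of about $1.99 \times 10^{30}$ kg (an assumed round value) by the first factor gives roughly 1.5 km, the geometrized solar mass.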
2 (b). Derive the constants in Table 8 in geometrized units.
Here I'll just get you started. The numerical computations were
performed in the corresponding Maple(TM) script for Chapter 8.
Table 8.1: Conversion between SI and geometrized units
Constant      Geometrized units   Value in terms of constants in SI
$c$           unitless            $c_{SI}\, (c_{SI})^{-1}$
$G$           unitless            $G_{SI}\, G_{SI}^{-1}$
$\hbar$       m$^2$               $\hbar_{SI}\, G_{SI}\, c_{SI}^{-3}$
$m_e$         m                   $m_{e\,SI}\, G_{SI}\, c_{SI}^{-2}$
$M_\odot$     m                   $M_{\odot\,SI}\, G_{SI}\, c_{SI}^{-2}$
$L_\odot$     unitless            $L_{\odot\,SI}\, G_{SI}\, c_{SI}^{-5}$
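2 (c) Express the following in geometrized units.
The trick is again to find the right combination of c and G that gives the
right units. By "right units" we mean that you're not allowed to have [kg]
or [s] or units derived from these, like [N]. But you are allowed to have [m].
(i) Density of a neutron star. Since $[\mathrm{kg\, m^{-3}}][\mathrm{m\, kg^{-1}}] = [\mathrm{m^{-2}}]$,
$$\rho_{SI}\, G_{SI}\, c_{SI}^{-2} \quad [\mathrm{m^{-2}}] \tag{8.7}$$
(ii) Pressure in a neutron star. Since $[\mathrm{kg\, m^{-1}\, s^{-2}}][\mathrm{m^3\, kg^{-1}\, s^{-2}}][\mathrm{m^{-4}\, s^4}] = [\mathrm{m^{-2}}]$,
$$p_{SI}\, G_{SI}\, c_{SI}^{-4} \quad [\mathrm{m^{-2}}] \tag{8.8}$$
(iii) Acceleration.
$$g_{SI}\, c_{SI}^{-2} \quad [\mathrm{m^{-1}}] \tag{8.9}$$
(iv) Luminosity of a neutron star.
$$L_{SI}\, G_{SI}\, c_{SI}^{-5} \quad [\text{unitless}] \tag{8.10}$$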
2 (d). Find the Planck length, mass, and time all in SI units.
Now the goal is to find the right combination of c and G and $\hbar$ such that
the quantity has dimensions of length [m], mass [kg], and time [s] respectively.
Computations were performed in the accompanying Maple(TM) worksheet for
Chapter 8.
(i) Planck length.
$$L_P = \left( \hbar_{SI}\, G_{SI}\, c_{SI}^{-3} \right)^{1/2} [\mathrm{m}] = 1.62 \times 10^{-35}\, [\mathrm{m}] \tag{8.11}$$
(ii) Planck mass.
$$M_P = \left( \hbar_{SI}\, G_{SI}^{-1}\, c_{SI} \right)^{1/2} [\mathrm{kg}] = 2.18 \times 10^{-8}\, [\mathrm{kg}] \tag{8.12}$$
(iii) Planck time.
$$T_P = \left( \hbar_{SI}\, G_{SI}\, c_{SI}^{-5} \right)^{1/2} [\mathrm{s}] = 5.39 \times 10^{-44}\, [\mathrm{s}] \tag{8.13}$$
Now we are to compare these three scales with elementary particles.
Elementary particles are considered point particles, with no inherent radius
(Perkins, 2009). Experiments in high-energy particle physics explore scales
down to $10^{-17}$ m, or 100 times smaller than the charge radius of a proton
(Perkins, 2009, p. 3), which is still much much larger than the Planck scale.
The heavier leptons, i.e. the muon and the tauon, are unstable and decay
with mean lifetimes of $2.2 \times 10^{-6}$ and $2.9 \times 10^{-13}$ [s], much much longer than
the Planck time. In contrast, the Planck mass appears to be rather large,
much heavier than the electron.
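The three Planck scales above are quick to reproduce in Python as a cross-check of the Maple worksheet. This is a sketch under assumptions: the function name `planck_scales` is mine, and the value of $\hbar$ is a rounded SI value that I am supplying, not one quoted in this manual.

```python
import math

G_SI = 6.674e-11      # [m^3 kg^-1 s^-2]
C_SI = 2.998e8        # [m s^-1]
HBAR_SI = 1.0546e-34  # [J s], assumed rounded value

def planck_scales(hbar: float = HBAR_SI, G: float = G_SI, c: float = C_SI):
    """Return (length [m], mass [kg], time [s]) built from hbar, G and c."""
    length = math.sqrt(hbar * G / c**3)
    mass = math.sqrt(hbar * c / G)
    time = math.sqrt(hbar * G / c**5)
    return length, mass, time

L_P, M_P, T_P = planck_scales()
print(f"L_P = {L_P:.3e} m, M_P = {M_P:.3e} kg, T_P = {T_P:.3e} s")
```

Note that $T_P = L_P / c$, which is a handy consistency check on the exponents of c.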
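3 (a). Calculate the following in geometrized units:
(i) Newtonian potential of Sun at the surface of the Sun.
$$\phi = -\frac{GM}{r} = -\frac{1 \times 1476\ \mathrm{m}}{6.96 \times 10^8\ \mathrm{m}} = -2.12 \times 10^{-6} \tag{8.14}$$
(ii) Newtonian potential of Sun at the radius of the Earth's orbit.
$$\phi = -\frac{GM}{r} = -\frac{1 \times 1476\ \mathrm{m}}{1.496 \times 10^{11}\ \mathrm{m}} = -9.87 \times 10^{-9} \tag{8.15}$$
(iii) Newtonian potential of Earth at the surface of the Earth.
$$\phi = -\frac{GM}{r} = -\frac{1 \times 4.434 \times 10^{-3}\ \mathrm{m}}{6.371 \times 10^6\ \mathrm{m}} = -6.96 \times 10^{-10} \tag{8.16}$$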
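The three potentials above are simple ratios of lengths once the masses are in geometrized units, so they are easy to tabulate. Here is a minimal Python sketch using the same numbers as in (8.14) to (8.16); the function and variable names are my own.

```python
def newtonian_potential(mass_m: float, radius_m: float) -> float:
    """phi = -M/r with G = 1 and the mass expressed in metres (dimensionless)."""
    return -mass_m / radius_m

M_SUN = 1476.0           # geometrized solar mass [m]
M_EARTH = 4.434e-3       # geometrized Earth mass [m]
R_SUN = 6.96e8           # solar radius [m]
R_ORBIT = 1.496e11       # radius of Earth's orbit [m]
R_EARTH = 6.371e6        # Earth radius [m]

phi_sun_surface = newtonian_potential(M_SUN, R_SUN)
phi_sun_at_earth = newtonian_potential(M_SUN, R_ORBIT)
phi_earth_surface = newtonian_potential(M_EARTH, R_EARTH)
print(phi_sun_surface, phi_sun_at_earth, phi_earth_surface)
```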
3 (b). Why is the Sun's potential at the Earth's radius greater than that
of the Earth's own potential there, yet we feel the Earth's gravitational pull
more?
The force of gravity per unit mass is determined by the gradient of the
gravitational potential. While the Sun's potential is larger at the surface of
the Earth, its potential gradient is much less than that of the Earth's own
gravitational potential.
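To put numbers on this, the sketch below compares the magnitudes of the potentials and of their gradients $M/r^2$ at the Earth's surface, in geometrized units. It is my own illustration using the values from part (a); the names are hypothetical.

```python
M_SUN, M_EARTH = 1476.0, 4.434e-3      # geometrized masses [m]
R_ORBIT, R_EARTH = 1.496e11, 6.371e6   # distances to each body's centre [m]

def potential_magnitude(mass_m: float, r_m: float) -> float:
    """|phi| = M/r (dimensionless)."""
    return mass_m / r_m

def gradient_magnitude(mass_m: float, r_m: float) -> float:
    """|d(phi)/dr| = M/r^2, the force per unit mass [m^-1]."""
    return mass_m / r_m**2

potential_ratio = potential_magnitude(M_SUN, R_ORBIT) / potential_magnitude(M_EARTH, R_EARTH)
gradient_ratio = gradient_magnitude(M_EARTH, R_EARTH) / gradient_magnitude(M_SUN, R_ORBIT)
print(f"Sun/Earth potential ratio : {potential_ratio:.1f}")
print(f"Earth/Sun gradient ratio  : {gradient_ratio:.0f}")
```

So the Sun's potential at the Earth is deeper by roughly an order of magnitude, but the Earth's pull at its own surface is stronger by more than three orders of magnitude.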
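3 (c). Show that a circular orbit around a body of mass M has an
orbital velocity, in Newtonian theory, of $v^2 = -\phi$, where $\phi$ is the Newtonian
potential.
The centripetal acceleration of a body in a circular orbit of radius r is
$$\mathbf{\Omega} \times (\mathbf{\Omega} \times r\, \mathbf{e}_r) = -\frac{v^2}{r}\, \mathbf{e}_r \tag{8.17}$$
as can be found in elementary texts on mechanics (or see Lesson 2 in my
course notes http://stockage.univ-brest.fr/~scott/GFD1/2012/index_gfd1.html).
Equating this with the gravitational acceleration $-\nabla\phi = -(M/r^2)\, \mathbf{e}_r$ (with $G = 1$)
gives $v^2 = M/r = -\phi$.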
4 (a). Let A be an $n \times n$ matrix whose entries are all very small, $|A_{i,j}| \ll 1/n$,
and let I be the unit matrix. Show that
$$(I + A)^{-1} = I - A + A^2 - A^3 + \dots$$
by first showing that
(i) the series on the RHS converges absolutely.
(ii) (I + A) times the RHS gives I.
(i) A series $\sum_{p=0}^{\infty} a_p$ is absolutely convergent if
$$\sum_{p=0}^{\infty} |a_p| < \infty,$$
see http://en.wikipedia.org/wiki/Absolute_convergence.
Let's say that the largest term is,
$$\max |A_{i,j}| = \epsilon\, \frac{1}{n}, \quad \epsilon \ll 1,$$
where the maximum is over all i and j. Then consider an arbitrary term of
the RHS,
$$R_{i,j} = \sum_{p=0}^{\infty} a_p,$$
with $a_0 = 1$ from I. The next contribution from the series, p = 1, is $-A_{i,j}$,
which makes contribution
$$|a_1| = |-A_{i,j}| \le \epsilon\, \frac{1}{n}$$
The next term in the series $a_2$ has magnitude,
$$|a_2| = |A_{i,k} A_{k,j}| \le n \left( \epsilon\, \frac{1}{n} \right)^2 = \epsilon^2\, \frac{1}{n}$$
The next term in the series $a_3$ has magnitude,
$$|a_3| = |A_{i,k} A_{k,l} A_{l,j}| \le n \cdot n \left( \epsilon\, \frac{1}{n} \right)^3 = \epsilon^3\, \frac{1}{n}$$
So we see that
$$|a_p| \le \epsilon^p\, \frac{1}{n}$$
which is 1/n times the terms of the geometric series from 1 to $\infty$ with $|\epsilon| < 1$, for which
the sum is finite. Thus
$$\sum_{p=0}^{\infty} |a_p| \le 1 + \frac{1}{n} \frac{\epsilon}{1 - \epsilon}$$
which proves that the RHS is absolutely convergent.
(ii) The next step is actually easier. Multiplying the RHS by (I + A) gives two
sets of terms. The first set comes from I and is of course just the RHS again.
The next set of terms is like the RHS but each term has A to one higher
power. But because the terms alternate in sign, the second set cancels all
the terms in the first set but the I!
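A quick numerical experiment makes both parts of the argument concrete. The sketch below (my own illustration, pure Python so it needs no extra libraries) builds the partial sums of $I - A + A^2 - \dots$ for a small test matrix and checks how close (I + A) times the partial sum is to I. The matrix, the value of $\epsilon$ and all function names are assumptions for the demo.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_partial_sum(A, terms):
    """Sum of the first `terms` terms of I - A + A^2 - A^3 + ..."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for p in range(1, terms):
        power = matmul(power, A)                 # power is now A^p
        sign = -1.0 if p % 2 else 1.0
        total = [[total[i][j] + sign * power[i][j] for j in range(n)]
                 for i in range(n)]
    return total

def max_abs_residual(A, terms):
    """Largest entry of |(I + A) S - I| where S is the partial sum."""
    n = len(A)
    I = identity(n)
    I_plus_A = [[I[i][j] + A[i][j] for j in range(n)] for i in range(n)]
    product = matmul(I_plus_A, neumann_partial_sum(A, terms))
    return max(abs(product[i][j] - I[i][j]) for i in range(n) for j in range(n))

n, eps = 3, 0.1
A = [[(eps / n) * (-1) ** (i + j) for j in range(n)] for i in range(n)]  # |A_ij| = eps/n
for terms in (2, 5, 10):
    print(terms, max_abs_residual(A, terms))
```

The residual after N terms is the size of $A^N$, so for this test matrix it falls like $\epsilon^N / n$, in line with the bound on $|a_p|$ found in part (i).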
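4 (b). Use results from (a) to establish Eq. (8.21) from Eq. (8.20).
First we must identify Eq. (8.20) with the matrix equation:
$$\begin{aligned}
\left( \Lambda^{\alpha'}{}_\beta \right) &= I + A \\
\left( \delta^\alpha{}_\beta \right) &= I \\
\left( \xi^\alpha{}_{,\beta} \right) &= A
\end{aligned} \tag{8.18}$$
And of course we know $\Lambda^{\alpha'}{}_\beta$ and $\Lambda^\beta{}_{\alpha'}$ are inverse transforms. Alternatively
it's also clear from basic calculus that (Dirac, 1996):
$$\frac{\partial x^{\alpha'}}{\partial x^\beta} \frac{\partial x^\beta}{\partial x^{\gamma'}} = \delta^\alpha{}_\gamma$$
So then Eq. (8.21) can be written in matrix form:
$$\left( \Lambda^\beta{}_{\alpha'} \right) = (I + A)^{-1} = I - A + O(A^2) \tag{8.19}$$
and of course this is a straightforward application of the result in exercise 4
(a).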
Chapter 10
Spherical solutions for stars
10.2 Static spherically symmetric spacetimes
10.2.1 The metric
10.2.2 Physical interpretation of metric terms
Typo p. 259, just before Eq. (10.11), has $\vec{U} \cdot \vec{U} = 1$ but of course this should
be
$$\vec{U} \cdot \vec{U} = -1$$
c.f. Eq. (2.28) or Eq. (10.18). It really is just a typo. He wants to say
$$\begin{aligned}
\vec{U} \cdot \vec{U} &= -1 \\
&= g_{\alpha\beta} U^\alpha U^\beta \\
&= U^0 U^0 g_{00} && \text{because she's in the MCRF} \\
&= U^0 U^0 \left( -e^{2\Phi} \right) && \text{using metric term from Eq. (10.7)}
\end{aligned} \tag{10.1}$$
which implies, as Schutz concludes, $U^0 = \exp(-\Phi)$.
10.7 Realistic stars and gravitational collapse
White dwarfs
Typo: p. 273 ". . . the Fermi momentum rises (Eq. (10.5))". I believe
this should be Eq. (10.75), which shows that the Fermi momentum $p_f$ increases
like $n^{1/3}$ where n is the number density.
10.9 Exercises
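4. Calculate the diagonal components of the Einstein tensor, c.f. Eqs. (10.14
. . . 10.17) for the spherically symmetric static metric Eq. (10.7). For this we
can use the results of Problem 35 § 6.9 wherein we found the Riemann tensor
for a metric of this form. However, I found my Riemann tensor disagreed
with that of Schutz's solutions. For that reason I'll show my intermediate
results as well.
First the Ricci tensor obtained by contracting the Riemann tensor (see
Eq. 6.91):
$$\begin{aligned}
R_{tt} = R^\alpha{}_{t\alpha t} &= g^{rr} R_{rtrt} + g^{\theta\theta} R_{\theta t\theta t} + g^{\phi\phi} R_{\phi t\phi t} \\
&= \exp(-2\Lambda)\exp(2\Phi) \left[ (\Phi')^2 + \Phi'' - \Phi'\Lambda' \right] + r^{-2}\, r\Phi' \exp(2\Phi)\exp(-2\Lambda) \\
&\quad + r^{-2} \sin^{-2}(\theta)\, \sin^2(\theta)\, r\Phi' \exp(2\Phi)\exp(-2\Lambda) \\
&= \exp(2\Phi - 2\Lambda) \left[ (\Phi')^2 + \Phi'' - \Phi'\Lambda' + \frac{2\Phi'}{r} \right]
\end{aligned} \tag{10.2}$$
$$\begin{aligned}
R_{rr} = R^\alpha{}_{r\alpha r} &= g^{tt} R_{trtr} + g^{\theta\theta} R_{\theta r\theta r} + g^{\phi\phi} R_{\phi r\phi r} \\
&= -\exp(-2\Phi)\exp(2\Phi) \left[ (\Phi')^2 + \Phi'' - \Phi'\Lambda' \right] + r^{-2}\, r\Lambda' + r^{-2} \sin^{-2}(\theta)\, r\Lambda' \sin^2(\theta) \\
&= -(\Phi')^2 - \Phi'' + \Phi'\Lambda' + \frac{2\Lambda'}{r}
\end{aligned} \tag{10.3}$$
$$\begin{aligned}
R_{\theta\theta} = R^\alpha{}_{\theta\alpha\theta} &= g^{tt} R_{t\theta t\theta} + g^{rr} R_{r\theta r\theta} + g^{\phi\phi} R_{\phi\theta\phi\theta} \\
&= -\exp(-2\Phi) \left[ r\Phi' \exp(2\Phi)\exp(-2\Lambda) \right] + \exp(-2\Lambda) \left[ r\Lambda' \right] \\
&\quad + r^{-2} \sin^{-2}(\theta) \left[ r^2 \sin^2(\theta) \left( 1 - \exp(-2\Lambda) \right) \right] \\
&= -r\Phi' \exp(-2\Lambda) + r\Lambda' \exp(-2\Lambda) + \left( 1 - \exp(-2\Lambda) \right)
\end{aligned} \tag{10.4}$$
$$\begin{aligned}
R_{\phi\phi} = R^\alpha{}_{\phi\alpha\phi} &= g^{tt} R_{t\phi t\phi} + g^{rr} R_{r\phi r\phi} + g^{\theta\theta} R_{\theta\phi\theta\phi} \\
&= -\exp(-2\Phi) \left[ \sin^2(\theta)\, r\Phi' \exp(2\Phi)\exp(-2\Lambda) \right] + \exp(-2\Lambda) \left[ r\Lambda' \sin^2(\theta) \right] \\
&\quad + r^{-2} \left[ r^2 \sin^2(\theta) \left( 1 - \exp(-2\Lambda) \right) \right] \\
&= \exp(-2\Lambda) \sin^2(\theta) \left[ -r\Phi' + \exp(2\Lambda) - 1 + r\Lambda' \right]
\end{aligned} \tag{10.5}$$
Second the Ricci scalar obtained by contracting the Ricci tensor (see
Eq. 6.92):
$$R = g^{\alpha\beta} R_{\alpha\beta} = 2\exp(-2\Lambda) \left[ -\frac{2\Phi'}{r} + \frac{2\Lambda'}{r} - (\Phi')^2 - \Phi'' - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' \right] \tag{10.6}$$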
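Finally we obtain the Einstein tensor via Eq. (6.98). But notice that for
Eqs. (10.14 . . . 10.17) we want the covariant components. So we use
$$G_{\alpha\beta} = R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} R \tag{10.7}$$
We find:
$$\begin{aligned}
G_{00} &= R_{tt} - \tfrac{1}{2} g_{00} R \\
&= \exp(2\Phi - 2\Lambda) \left[ (\Phi')^2 + \Phi'' - \Phi'\Lambda' + \frac{2\Phi'}{r} \right] \\
&\quad - \tfrac{1}{2} \left[ -\exp(2\Phi) \right] 2\exp(-2\Lambda) \left[ -\frac{2\Phi'}{r} + \frac{2\Lambda'}{r} - (\Phi')^2 - \Phi'' - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' \right] \\
&= \exp(2\Phi - 2\Lambda) \left[ \frac{2\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} \right] \\
&= \frac{1}{r^2} \exp(2\Phi) \frac{d}{dr} \left[ r \left( 1 - \exp(-2\Lambda) \right) \right], && \text{the form of Eq. (10.14).}
\end{aligned} \tag{10.8}$$
$$\begin{aligned}
G_{11} &= R_{rr} - \tfrac{1}{2} g_{rr} R \\
&= -(\Phi')^2 - \Phi'' + \Phi'\Lambda' + \frac{2\Lambda'}{r} \\
&\quad - \tfrac{1}{2} \exp(2\Lambda)\, 2\exp(-2\Lambda) \left[ -\frac{2\Phi'}{r} + \frac{2\Lambda'}{r} - (\Phi')^2 - \Phi'' - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' \right] \\
&= -\frac{\exp(2\Lambda)}{r^2} \left[ 1 - \exp(-2\Lambda) \right] + \frac{2\Phi'}{r}, && \text{the form of Eq. (10.15).}
\end{aligned} \tag{10.9}$$
$$\begin{aligned}
G_{22} &= R_{\theta\theta} - \tfrac{1}{2} g_{\theta\theta} R \\
&= -r\Phi' \exp(-2\Lambda) + r\Lambda' \exp(-2\Lambda) + \left( 1 - \exp(-2\Lambda) \right) \\
&\quad - \tfrac{1}{2} r^2\, 2\exp(-2\Lambda) \left[ -\frac{2\Phi'}{r} + \frac{2\Lambda'}{r} - (\Phi')^2 - \Phi'' - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' \right] \\
&= r^2 \exp(-2\Lambda) \left[ \Phi'' + (\Phi')^2 + \frac{\Phi'}{r} - \Phi'\Lambda' - \frac{\Lambda'}{r} \right], && \text{the form of Eq. (10.16).}
\end{aligned} \tag{10.10}$$
And finally for the $\phi\phi$ component we need not do any further computations if
we note the relationship with the corresponding $\theta\theta$ components of the Ricci
tensor and metric tensor:
$$\begin{aligned}
G_{33} &= R_{\phi\phi} - \tfrac{1}{2} g_{\phi\phi} R \\
&= \sin^2(\theta) R_{\theta\theta} - \tfrac{1}{2} \sin^2(\theta)\, g_{\theta\theta} R \\
&= \sin^2(\theta)\, G_{22}.
\end{aligned} \tag{10.11}$$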
SP. 1
Chapter 11
Schwarzschild geometry and
black holes
11.1 Trajectories in the Schwarzschild space-time
Conserved quantities
Eqs. (11.5) and (11.6) look strange at first, since they appear to contradict
the definitions of p. 42. Why is $E = -p_0$ here, yet $E = p^0$ on p. 42? Why is
$L = p_\phi$? I found it instructive to note that $p^\alpha = mU^\alpha = m\, dx^\alpha/d\tau$, and $x^\alpha$
can be a coordinate like $\phi$ in spherical coordinates. And $d\phi/d\tau$ is an angular
velocity, so it doesn't have units of velocity. But $p_\phi = r^2 \sin^2\theta\, m\, d\phi/d\tau$, which
includes the velocity $r \sin\theta\, d\phi/d\tau$ and the momentum arm length $r \sin\theta$.
We can also just accept $-p_0 = E$ and $p_\phi = L$ as convenient definitions of
quantities that are conserved in the Schwarzschild metric, and certainly E is
"energy like" and L is "angular momentum like", which is how I guess
Schutz meant us to take them.
Types of orbits
Typo: p. 286, first sentence of page, too many "(".
Regarding Eq. (11.19), Schutz refers to "the minimum radius of a particle's
circular orbit". But it is important to note that this is the minimum only
for the stable, larger orbit obtained from the positive root in Eq. (11.17). For
the unstable, smaller orbit obtained from the negative root in Eq. (11.17),
r = 6M is the maximum radius. In the latter case, the minimum is obtained
by taking the limit $\tilde{L} \to \infty$, which gives
$$r \to 3M$$
See also the solution to Exercise 4 in § 11.7.
Perihelion shift
Typo: p. 288, 3rd line, 'These opposite of "peri" is "ap"' should be 'The
opposite of "peri" is "ap"'.
Typo: p. 290, line above Eq. (11.42), "Each orbit take . . ." should be "Each
orbit takes . . .".
Typo: p. 291, last line of 1st paragraph, "predict" should be "predicts".
[I sent these notes to Schutz in Dec. 2011 and he didn't reply, so I'm no
longer bothering with cosmetic typos.]
Ergoregion
Typo in Eq. (11.78), the should be a t obviously, since it came from the
previous line.
11.7 Exercises
1.
(a). A particle or photon in an orbit [careful, it's not necessarily a closed
orbit] in the Schwarzschild metric with a certain E and L, at a radius $r \gg M$.
Show that if space-time were really flat, the particle [or photon] would travel
on a straight line that would pass a distance $b \equiv L/\sqrt{E^2 - m^2}$ from the
centre of coordinates r = 0. This ratio b is called the impact parameter.
(b). Show also that photon orbits that follow Eq. (11.12) depend only
on b.
(a) If we rotate the coordinate system such that $\theta = \pi/2$ then $p^\theta = 0$
always. For a particle or a photon $p^\alpha p_\alpha = -m^2$ where m is the rest mass,
which is nil for a photon, c.f. Eqs. (2.33) and (2.40). At the point where r is
a minimum
$$U^r = \frac{dr}{d\tau} = 0$$
because it's an extremum of the particle path. The Schwarzschild metric is
diagonal so that this implies $p_r = 0$ at r = b. So we have only two
non-zero components of the momentum:
$$\begin{aligned}
-m^2 &= p^\alpha p_\alpha && \text{in general} \\
&= p^t p_t + p^\phi p_\phi \\
&= g^{0\alpha} p_\alpha p_t + g^{\phi\alpha} p_\alpha p_\phi \\
&= g^{00} p_t p_t + g^{\phi\phi} p_\phi p_\phi && \text{diagonal metric} \\
&= -\left( 1 - \frac{2M}{r} \right)^{-1} E^2 + r^{-2} L^2, && \text{used Eqs. (11.5), (11.6)} \\
&\approx -E^2 + r^{-2} L^2 && \text{flat space approximation} \\
&= -E^2 + b^{-2} L^2 && r = b
\end{aligned} \tag{11.1}$$
Solve for b and choose the positive root since b is the spatial distance:
$$b = \frac{L}{\sqrt{E^2 - m^2}}.$$
(b) For a photon m = 0, Eq. (2.40),
$$b = \frac{L}{E}.$$
Note that we can rewrite Eq. (11.12) by absorbing the E into the parameter
$\lambda$, defining for instance
$$\lambda' = E\lambda$$
which is consistent with Eq. (6.52), so we retain an affine parameter and the
paths are still geodesics,
$$\begin{aligned}
\left( \frac{dr}{d\lambda} \right)^2 &= E^2 - \left( 1 - \frac{2M}{r} \right) \frac{L^2}{r^2} \\
&= E^2 - \left( 1 - \frac{2M}{r} \right) \frac{b^2 E^2}{r^2} && \text{substituted } L = bE \\
&= E^2 \left[ 1 - \left( 1 - \frac{2M}{r} \right) \frac{b^2}{r^2} \right] \\
\left( \frac{dr}{d\lambda'} \right)^2 &= 1 - \left( 1 - \frac{2M}{r} \right) \frac{b^2}{r^2}
\end{aligned} \tag{11.2}$$
so besides M, the only parameter is b.
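2. Prove Eqs. (11.17) and (11.18).
For the particle orbit, start with
$$0 = \frac{d}{dr} \left[ \left( 1 - \frac{2M}{r} \right) \left( 1 + \frac{\tilde{L}^2}{r^2} \right) \right] \tag{11.3}$$
If one does the obvious thing, and simply integrates this, one runs into
trouble. One can show that the integration constant is $\tilde{E}^2$. But then one has
a cubic to solve. Even knowing the solution, it's not obvious that it is a
solution to the cubic.
However, if one first differentiates then everything works out easily, with
only a quadratic to solve,
$$\begin{aligned}
0 &= \frac{d}{dr} \left[ \left( 1 - \frac{2M}{r} \right) \left( 1 + \frac{\tilde{L}^2}{r^2} \right) \right] \\
&= \frac{2M}{r^2} \left( 1 + \frac{\tilde{L}^2}{r^2} \right) + \left( 1 - \frac{2M}{r} \right) \left( -2 \frac{\tilde{L}^2}{r^3} \right) \\
\Rightarrow \quad 0 &= r^2 - r \frac{\tilde{L}^2}{M} + 3\tilde{L}^2 && \text{after multiplying by } \frac{r^4}{2M}
\end{aligned} \tag{11.4}$$
The quadratic formula gives the solution:
$$r = \frac{\tilde{L}^2}{2M} \left[ 1 \pm \sqrt{1 - \frac{12M^2}{\tilde{L}^2}} \right]$$
For the photon orbit, start with
$$0 = \frac{d}{dr} \left[ \left( 1 - \frac{2M}{r} \right) \left( \frac{L^2}{r^2} \right) \right] \tag{11.5}$$
Again, it's much easier, perhaps essential, to start by differentiating,
$$\begin{aligned}
0 &= \frac{d}{dr} \left[ \left( 1 - \frac{2M}{r} \right) \left( \frac{L^2}{r^2} \right) \right] \\
&= \frac{2M}{r^2} \left( \frac{L^2}{r^2} \right) + \left( 1 - \frac{2M}{r} \right) \left( -2 \frac{L^2}{r^3} \right) \\
\Rightarrow \quad r &= 3M && \text{after dividing by } L^2
\end{aligned} \tag{11.6}$$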
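It's reassuring to check these roots numerically. The Python sketch below (my own, with hypothetical function names) evaluates the two roots of the quadratic for a given $\tilde{L}$, and the analytic derivatives written out in (11.4) and (11.6), so one can confirm that the derivative vanishes at the claimed radii. Units are M = 1 unless stated.

```python
import math

def circular_orbit_radii(L_tilde: float, M: float = 1.0):
    """Inner (unstable) and outer (stable) circular orbit radii, or None if no real root."""
    disc = 1.0 - 12.0 * M**2 / L_tilde**2
    if disc < 0.0:
        return None
    pref = L_tilde**2 / (2.0 * M)
    root = math.sqrt(disc)
    return pref * (1.0 - root), pref * (1.0 + root)

def particle_dV2_dr(r: float, L_tilde: float, M: float = 1.0) -> float:
    """d/dr of (1 - 2M/r)(1 + L^2/r^2), as expanded in (11.4)."""
    return (2.0 * M / r**2) * (1.0 + L_tilde**2 / r**2) \
        - (1.0 - 2.0 * M / r) * (2.0 * L_tilde**2 / r**3)

def photon_dV2_dr(r: float, L: float = 1.0, M: float = 1.0) -> float:
    """d/dr of (1 - 2M/r)(L^2/r^2), as expanded in (11.6)."""
    return (2.0 * M / r**2) * (L**2 / r**2) - (1.0 - 2.0 * M / r) * (2.0 * L**2 / r**3)

r_in, r_out = circular_orbit_radii(4.0)
print(r_in, r_out, particle_dV2_dr(r_in, 4.0), particle_dV2_dr(r_out, 4.0))
print(circular_orbit_radii(1000.0)[0])   # inner root for large L_tilde
print(photon_dV2_dr(3.0))
```

For large $\tilde{L}$ the inner root approaches 3M, consistent with my note on § 11.1 about the minimum radius of the unstable orbit.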
4. What kind of orbits are possible outside a star of radius (a) R = 2.5M,
(b) R = 4M, (c) R = 10M?
See the last paragraph of § 11.1, subsection "Types of orbits", where
Schutz points out that if the radius of the star exceeds the radius of the
orbit, the orbit is not possible, simply because the star is in the way. In my
notes above I qualify in what sense $r_{MIN} = 6M$ is a minimum in Eq. (11.19).
The plot below is useful for interpreting this question.
(a) Note that R < 3M, which is the photon's circular orbit radius, so a
circular, unstable, photon orbit is possible. This is point A in Fig. 11.2.
Note that $R < 6M = r_{MIN}$ so a circular, unstable, particle orbit is
possible, like point A in Fig. 11.1. Because R < 3M, which is the minimum
unstable orbit radius (see my notes on section 11.1), the unstable orbit is
allowed for all $\tilde{L}$.
Also the larger, stable, circular orbit is allowed for all $\tilde{L}$, point B in
Fig. 11.1.
(b) Note that R > 3M, which is the photon's circular orbit radius, so a
circular, unstable, photon orbit is not possible.
Note that $R = 4M < 6M = r_{MIN}$ so a circular, unstable, particle orbit is
possible, like point A in Fig. 11.1. Because R > 3M, which is the minimum
unstable orbit radius (see my notes on section 11.1), the unstable orbit is
only possible for a finite range of $\tilde{L}$, see my Fig. 11.1 above.
The larger, stable, circular orbit is allowed for all $\tilde{L}$, point B in Fig. 11.1.
(c) Note that R = 10M > 3M, which is the photon's circular orbit radius,
so a circular, unstable, photon orbit is not possible.
Figure 11.1: Radius of circular orbit r vs. $\tilde{L}$ for M = 1 from Eq. (11.17).
Green line from taking the positive root, red line from taking the negative
root.
Note that R > 6M, which is the particle's maximum unstable circular
orbit radius, so a circular, unstable, particle orbit is not possible for any $\tilde{L}$.
The larger, stable, circular orbit is allowed for a range of $\tilde{L}$. Let $\tilde{L}$ become
large in Eq. (11.17) and it's clear that the larger orbit can be at arbitrarily
large r, and so be such that r > R = 10M, so this orbit must exist. See also the
green line in my Fig. 11.1 above.
5 (a). Find the radius $R_{0.01}$ at which $-g_{00}$ differs from the "Newtonian"
value $1 - 2M/R$ by only 1%.
This question doesn't make sense because
$$-g_{00} = 1 - \frac{2M}{R}$$
in the Schwarzschild metric without approximation. Please see my
supplementary problem below.
5 (b). How many normal [Sun-like] stars can fit in the region between
$R_{0.01}$ and the radius 2M?
This question does make sense, once one answers 5(a). It is a simple
geometry question, no tricks.
11.8 Rob's Supplementary Problems
SP. 1 Recall from Exercise 5(d) of § 7.6 that $-g_{00}$ is closely related to
$\exp(2\phi)$, where $\phi$ is the Newtonian potential for a similar [non-relativistic]
situation. This was derived for a hydrostatic fluid in Exercise 5(d), but use
this fact here to find at what distance R from a black hole of mass $10^6 M_\odot$
the Schwarzschild $-g_{00}$ differs from the Newtonian potential by 1%.
Solution: The Schwarzschild metric has, precisely,
$$-g_{00} = 1 - \frac{2M}{R}$$
with G = 1 of course. When the weak gravity limit applies, the Newtonian
potential $\phi = -\frac{M}{R}$ for a similar [non-relativistic] situation obeys $|\phi| \ll 1$ and
we have
$$-g_{00} \approx \exp(2\phi), \quad \text{Exercise 5(d) of § 7.6.} \tag{11.7}$$
Use a Taylor series about $\phi = 0$ to approximate the exponential function,
$$\begin{aligned}
-g_{00} &\approx \exp(2\phi) \\
&\approx 1 + 2\phi + 2\phi^2 + \tfrac{4}{3}\phi^3 + \dots \\
1 + 2\phi &\approx 1 + 2\phi + 2\phi^2 + \tfrac{4}{3}\phi^3 + \dots
\end{aligned} \tag{11.8}$$
For $|\phi| \ll 1$ the LHS differs from the RHS by approximately $2\phi^2$. Setting this
difference to 1% of $-g_{00} \approx 1$ we solve for the radius when $M = 10^6 M_\odot$,
$$2\phi^2 = 2 \left( \frac{M}{R} \right)^2 = \frac{1}{100}$$
so
$$R = 10\sqrt{2}\, M = 14.1 \times 10^6 \times 1.5\ \mathrm{km}, \quad \text{see Table 8.1}$$
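To turn this into a number, here is a short Python sketch (my own illustration). It solves $2(M/R)^2 = \text{tolerance}$ for R and converts to kilometres. The function name is hypothetical and the geometrized solar mass of 1.476 km is the value implied by the 1476 m used in Exer. 3 of § 8.6.

```python
import math

M_SUN_KM = 1.476   # geometrized solar mass [km]

def radius_for_tolerance(mass_solar: float, tolerance: float = 0.01) -> float:
    """Radius [km] where 2*(M/R)^2 equals `tolerance`, i.e. R = M*sqrt(2/tolerance)."""
    M_km = mass_solar * M_SUN_KM
    return M_km * math.sqrt(2.0 / tolerance)

R_km = radius_for_tolerance(1.0e6)
print(f"R = {R_km:.3e} km = {R_km / (1.0e6 * M_SUN_KM):.2f} M")
```

With these inputs R is about $2.1 \times 10^7$ km, roughly 14 times the geometrized mass and so about 7 Schwarzschild radii.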
Chapter 12
Cosmology
12.2 Cosmological kinematics: observing the
expanding universe
Three types of universe
Typo: in Eq. (12.15), missing square on d.
Eq. (12.21) uses $z$ for redshift. This was defined in Eq. (10.12) and also
in Eq. (12.37).
12.6 Exercises
1. Use the metric of the 2-sphere to prove the statement associated with
Fig. 12.1 that the rate of increase of the distance between any two points
as the sphere expands (as measured on the sphere) is proportional to the
distance between them.
The metric of the 2-sphere is apparent from the line element
$$ dl^2 = r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) $$
Here we assume that $r = r(t)$ and pick two points on the sphere, $p$ and $q$.
We can always rotate the coordinates such that the two points are on the
equator, $\theta = \pi/2$, so the distance is
$$ \int dl = l = \int r(t)\, d\phi = r(t)\,(\phi_p - \phi_q) $$
Differentiating with respect to time one obtains the rate of increase of their
separation
$$ \dot{l} = \dot{r}\,(\phi_p - \phi_q) = l\,\frac{\dot{r}}{r} \qquad (12.1) $$
So for a given $r$ and rate of change of $r$, the rate of change of the
distance $l$ is proportional to $l$.
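Result (12.1) can be illustrated numerically. The sketch below assumes a made-up expansion history $r(t) = 1 + t^2/2$, chosen purely for illustration, and uses a central finite difference for the time derivatives. The ratio $\dot{l}/l$ should be the same for every pair of points and equal to $\dot{r}/r$.

```python
def radius(t):
    # Hypothetical expansion history, chosen only for illustration.
    return 1.0 + 0.5 * t**2


def separation(t, dphi):
    # Distance along the equator between two points dphi apart in longitude.
    return radius(t) * dphi


def rate(f, t, h=1e-6):
    # Central finite-difference approximation to df/dt.
    return (f(t + h) - f(t - h)) / (2.0 * h)


t = 2.0
hubble_like = rate(radius, t) / radius(t)  # r_dot / r

ratios = []
for dphi in (0.1, 0.5, 1.0, 2.0):
    l = separation(t, dphi)
    l_dot = rate(lambda s: separation(s, dphi), t)
    ratios.append(l_dot / l)  # should not depend on dphi

print(hubble_like, ratios)
```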
2. Find the parsec in meters given the radius of the Earth's orbit as about
$10^{11}$ m and the definition of the parsec, which is the parallax of 1 second
of arc.
This question is of course quite trivial. But it's nice to go through it once,
since one then feels more comfortable with the parsec and gains appreciation
of the astronomers who measure positions of stars to within an arc second 6
months apart.
The distance $l$ from the Sun to the star is
$$ l = \frac{r}{\sin(1\,[\text{arc second}])}
\approx \frac{r}{\frac{\pi}{180}\,[\text{rad per degree}] \times \frac{1}{60}\,[\text{degree per minute}] \times \frac{1}{60}\,[\text{minute per second}]}
= \frac{10^{11} \times 180 \times 60 \times 60}{\pi}\ \text{m} \qquad (12.2) $$
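Evaluating (12.2) numerically, as a short Python sketch using the rounded orbital radius of $10^{11}$ m given in the exercise (variable names are illustrative):

```python
import math

r_orbit = 1.0e11  # m, rounded Earth-orbit radius given in the exercise

# One second of arc expressed in radians.
arcsec = math.pi / (180.0 * 60.0 * 60.0)

parsec_exact = r_orbit / math.sin(arcsec)  # l = r / sin(1 arc second)
parsec_small_angle = r_orbit / arcsec      # small-angle form, sin(x) ~ x

print(f"{parsec_small_angle:.3e} m")
```

With this rounded radius the result is about $2.06 \times 10^{16}$ m, and the small-angle approximation is accurate to far better than the precision of the input.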
3. Newtonian cosmology.
3(a). Apply Newton's law of gravity to the study of cosmology by showing
that the general solution of $\nabla^2\Phi = 4\pi\rho$ for constant $\rho$ is a
quadratic polynomial in Cartesian coordinates that is not necessarily isotropic.
Newton's law is a Poisson equation with constant coefficient. By inspection
it is immediately clear that
$$ \Phi = \Phi_0\,(a^2 x^2 + b^2 y^2 + c^2 z^2) $$
is a solution with $\Phi_0$ some constant and
$$ 2(a^2 + b^2 + c^2) = \frac{4\pi\rho}{\Phi_0} $$
But to this one can add an arbitrary solution to Laplace's equation, so our
solution is by no means a completely general solution! It is clear that it is
not isotropic because we do not require $a = b = c$.
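A quick numerical check that the quadratic potential satisfies the Poisson equation while having unequal second derivatives. This is only a sketch: the density and the coefficients $a, b, c$ are arbitrary illustrative values, and the derivatives are taken by central finite differences.

```python
import math

RHO = 1.0                  # constant density, arbitrary illustrative value
A, B, C = 1.0, 2.0, 0.5    # hypothetical, deliberately unequal coefficients
# Fix Phi_0 from the constraint 2 (a^2 + b^2 + c^2) = 4 pi rho / Phi_0.
PHI0 = 4.0 * math.pi * RHO / (2.0 * (A**2 + B**2 + C**2))


def phi(x, y, z):
    return PHI0 * (A**2 * x**2 + B**2 * y**2 + C**2 * z**2)


def second_derivative(f, point, axis, h=1e-3):
    # Central second difference along one Cartesian axis.
    lo, hi = list(point), list(point)
    lo[axis] -= h
    hi[axis] += h
    return (f(*hi) - 2.0 * f(*point) + f(*lo)) / h**2


point = (0.3, -1.2, 0.7)
d2 = [second_derivative(phi, point, axis) for axis in range(3)]
laplacian = sum(d2)

print("Phi_xx, Phi_yy, Phi_zz =", d2)
print("Laplacian =", laplacian, " 4 pi rho =", 4.0 * math.pi * RHO)
```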
3 (b) Show that if the universe consists of a region where $\rho$ is constant,
outside of which there is a vacuum, then, if the boundary is not spherical,
the field will not be isotropic. The field will show significant deviations from
sphericity throughout the interior, even at the centre.
Suppose we stretch our Cartesian co-ordinates such that:
$$ x' = ax; \quad y' = by; \quad z' = cz $$
Then our general solution is
$$ \Phi = \Phi_0\,(x'^2 + y'^2 + z'^2) = \Phi_0\, r'^2 $$
and is spherically symmetric in this stretched coordinate system. Outside
the region of nonzero $\rho$, so in the vacuum, we know the solution must be
$$ \Phi \propto -\frac{4}{3}\pi\rho\,\frac{1}{r'}, \quad \text{see Eq. (8.2) and Exercise 1 of §8.6} $$
And this solution must match the solution $\Phi_0 r'^2$ at the boundary. But this
proves that the boundary must be a surface of constant $r' = R'$, for otherwise
it would be impossible to match the interior and exterior solutions. And this
gives us the result we required. For the boundary is affecting the shape of the
solution everywhere, including the whole interior and even its central point
$r' = 0$. And if the boundary $r' = R'$ is not a true sphere (i.e. not a sphere
in the original unstretched coordinates, so that $a \neq b \neq c$) then the
anisotropy is apparent, after transforming back to the original coordinates,
for instance in the second derivatives:
$$ \Phi_{,xx} = \Phi_0\, 2a^2 \neq \Phi_{,yy} = \Phi_0\, 2b^2 \neq \Phi_{,zz} = \Phi_0\, 2c^2 \qquad (12.3) $$
3 (c) Show that an experiment done locally could determine the shape
of the boundary.
One could measure the gravitational acceleration on a test particle in 3
orthogonal directions as a function of position. Under the assumption of
$\rho =$ constant [or possibly correcting for local inhomogeneities; not sure
how feasible that would be in practice] one would obtain
$$ \Phi_{,xx} = \Phi_0\, 2a^2 $$
from the $x$-direction gradient of the $x$-direction gravitational acceleration,
and similarly for the other two directions. This gives the relative sizes of
$a, b, c$ which would determine the shape of an ellipsoidal boundary.
For a boundary shape more general than ellipsoidal, it is not immediately
clear how to proceed. The problem lies in our solution in terms of quadratic
terms in $x, y, z$ not being truly a general solution. I guess the mathematics
will become complicated rather quickly and it's not clear it's worth the effort.
For we have already shown that the boundary effects of an ellipsoidal shaped
Newtonian universe would in principle be observable throughout the interior,
and clearly we don't observe such effects.
4. Show that if $h_{ij}(t_1) \neq f(t_1, t_0)\, h_{ij}(t_0)$ for all $i$ and $j$
in Eq. (12.3), then distances between galaxies would increase anisotropically:
the Hubble law would have to be written as
$$ v^i = H^i{}_j\, x^j $$
for a matrix $H^i{}_j$ not proportional to the identity.
We replace
$$ h_{ij}(t_1) = f(t_1, t_0)\, h_{ij}(t_0) $$
by a more general expression
$$ h_{ij}(t_1) = (a^i{}_j)^2(t_1, t_0)\, h_{ij}(t_0) $$
with no summation over the double $i$ or $j$ indices on the RHS. Consider a
galaxy at some distance $l_0$ at $t = t_0$. We can orient our Cartesian
coordinates such that we are at the centre and the galaxy is in the
$x$-direction.
$$ \begin{aligned}
dl^2 &= h_{ij}(t_0)\, dx^i dx^j = h_{xx}\,(dx)^2 \\
dl &= \sqrt{h_{xx}(t_0)}\; dx \\
\int dl &= \int \sqrt{h_{xx}(t_0)}\; dx = l_0
\end{aligned} \qquad (12.4) $$
Then at a later time $t_1$
$$ \begin{aligned}
dl^2 &= h_{ij}(t_1)\, dx^i dx^j = (a^i{}_j)^2\, h_{ij}(t_0)\, dx^i dx^j = (a^x{}_x)^2\, h_{xx}\,(dx)^2 \\
dl &= a^x{}_x \sqrt{h_{xx}(t_0)}\; dx \\
\int dl &= a^x{}_x \int \sqrt{h_{xx}(t_0)}\; dx = a^x{}_x\, l_0
\end{aligned} \qquad (12.5) $$
So the rate of change of position of this galaxy observed at Earth is
$$ \dot{l} = v = \dot{a}^x{}_x\, l_0 \qquad (12.6) $$
But if our $(a^i{}_j)^2$ is not just a constant times $\delta^i{}_j$ then we
measure a different Hubble parameter in each of the 3 orthogonal directions:
$$ \begin{aligned}
v^x &= \dot{a}^x{}_x\, l^x_0 \\
v^y &= \dot{a}^y{}_y\, l^y_0 \\
v^z &= \dot{a}^z{}_z\, l^z_0
\end{aligned} \qquad (12.7) $$
And performing a rotation to orient our Cartesian coordinates to an arbitrary
direction we obtain the generalized Hubble law
$$ v^j = H^j{}_i\, x^i $$
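To make the last step concrete, here is a small Python sketch. It assumes three made-up expansion rates along the principal axes (playing the role of the $\dot{a}^x{}_x$, $\dot{a}^y{}_y$, $\dot{a}^z{}_z$ above) and rotates the axes by 30 degrees about $z$. The resulting $H^j{}_i$ is no longer diagonal, so a galaxy on the new $x$-axis acquires a transverse velocity component. All names and numbers are illustrative assumptions.

```python
import math

# Hypothetical expansion rates along the three principal axes (arbitrary units).
H_PRINCIPAL = (1.0, 1.5, 2.0)


def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(3)) for i in range(3)]


diag = [[H_PRINCIPAL[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
rot = rotation_z(math.radians(30.0))
hubble = matmul(matmul(rot, diag), transpose(rot))  # H = R D R^T

x = [1.0, 0.0, 0.0]     # galaxy at unit distance along the rotated x-axis
v = matvec(hubble, x)   # v^j = H^j_i x^i

print("H =", hubble)
print("v =", v)          # v has a y-component, so it is not parallel to x
```

If the three principal rates were equal, the rotation would leave the matrix proportional to the identity and the usual isotropic Hubble law would be recovered.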
6. (a) Prove the statement leading to Eq. (12.8), that we can deduce $G_{ij}$
of our three-spaces by setting $\Phi = 0$ in Eq. (10.15)-(10.17).
I'm going to be a bit deconstructionist for this question since I don't like how
it's posed. Recall that the Einstein tensor is defined in Eq. (6.98), which,
after lowering the indices, is
$$ G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2}\, g_{\alpha\beta}\, R $$
So we want the spatial part of that, $G_{ij}$. But this involves the temporal
dimension through the Riemann tensor. Recall the Ricci tensor was defined
by Eq. (6.91), which after fixing the typo, is
$$ R_{\alpha\beta} = R^\mu{}_{\alpha\mu\beta} $$
So even when we restrict ourselves to $G_{ij}$ we have contributions from the
temporal dimension through $\mu = 0$ in the contraction of the Riemann tensor.
So I think it's better to say we want to ensure that the metric in Eq. (12.6)
is homogeneous at a given point in time, $t = t_0$. Then $R^2(t_0)$ is a
constant, and without loss of generality it can be rescaled to $R^2 = 1$:
$$ ds^2 = -dt^2 + R^2(t)\, h_{ij}\, dx^i dx^j = -dt^2 + \left(e^{2\Lambda(r)}\, dr^2 + r^2 d\Omega^2\right) $$
If we set $\Phi = 0$ in Eq. (10.7) then we obtain the same metric. So of course
it will have the same Riemann tensor and all that follows (Ricci tensor and
scalar and Einstein tensor).
6. (b) Derive Eq. (12.9).
To obtain Eq. (12.9) we substitute Eq. (12.8) for the diagonal components
of the Einstein tensor into the equation for the trace,
$$ G = G_{ij}\, g^{ij} $$
and the rest is just algebra. There's nothing to derive.
7. Show the metric in Eq. (12.7) is only flat at $r = 0$ if $A = 0$ in
Eq. (12.11).
For the metric to be flat we require the Riemann tensor to be zero, see
Eq. (6.71),
$$ R^\alpha{}_{\beta\mu\nu} = 0 $$
Careful, it's not correct to show that
$$ R_{\alpha\beta\mu\nu} = 0 $$
I've learned from experience one can waste hours working with this incorrect
equation! Our metric was given by Eq. (12.7):
$$ dl^2 = \exp(2\Lambda(r))\, dr^2 + r^2 d\Omega^2 $$
which has the spatial part the same as metric Eq. (10.7). For the Riemann
tensor, we can pull off the spatial part without fear of contamination from
the temporal part of the metric since our metric is diagonal and there is no
contraction involved in the definition of the Riemann tensor
R
= v
;
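Using the results from Exer. 35 of §6.9, we see there are only three non-zero
terms to consider (and the ones obtainable from these 3 by symmetry
relations), see result (6.101),
$$ \begin{aligned}
R_{r\theta r\theta} &= r\Lambda' \\
R_{r\phi r\phi} &= r\Lambda' \sin^2(\theta) \\
R_{\theta\phi\theta\phi} &= r^2 \sin^2(\theta)\,\left(1 - \exp(-2\Lambda)\right)
\end{aligned} \qquad (12.8) $$
We'll be using Eq. (12.11)
$$ g_{rr} = \exp(2\Lambda) = \frac{1}{1 + \frac{1}{3}\kappa r^2 - \frac{A}{r}} $$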
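Consider case $A \neq 0$. Then
$$ \begin{aligned}
R^r{}_{\theta r\theta} &= g^{r\alpha} R_{\alpha\theta r\theta} = \frac{1}{g_{rr}}\, R_{r\theta r\theta}, && \text{for diagonal metric} \\
&= \frac{1}{g_{rr}}\, r\Lambda', && \text{substitute from result (12.8)} \\
&= \frac{r}{2\,(g_{rr})^2}\, \frac{dg_{rr}}{dr}, && \text{elementary operations} \\
&= -\frac{1}{2}\left(\frac{2}{3}\kappa r^2 + \frac{A}{r}\right)
\end{aligned} \qquad (12.9) $$
which is nonzero (in fact singular) as $r \to 0$.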
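On the other hand, if $A = 0$ then $R^r{}_{\theta r\theta} = 0$ at $r = 0$. But
we still need to check the other two non-zero components of the Riemann tensor
when $A = 0$ to confirm that space is flat there. Note that
$$ R^r{}_{\phi r\phi} = R^r{}_{\theta r\theta}\, \sin^2(\theta) = 0 \text{ at } r = 0. $$
$$ \begin{aligned}
R^\theta{}_{r\theta r} &= g^{\theta\alpha} R_{\alpha r\theta r} = \frac{1}{g_{\theta\theta}}\, R_{\theta r\theta r} = \frac{1}{g_{\theta\theta}}\, R_{r\theta r\theta}, && \text{symmetry} \\
&= \frac{1}{r^2}\, r\Lambda', && \text{substitute from result (12.8)} \\
&= -\frac{\kappa}{3 + \kappa r^2}, && \text{elementary operations}
\end{aligned} \qquad (12.10) $$
$$ \begin{aligned}
R^\theta{}_{\phi\theta\phi} &= g^{\theta\alpha} R_{\alpha\phi\theta\phi} = \frac{1}{g_{\theta\theta}}\, R_{\theta\phi\theta\phi} \\
&= \frac{1}{r^2}\, r^2 \sin^2(\theta)\,\left(1 - \exp(-2\Lambda)\right), && \text{substitute from result (12.8)} \\
&= 0 \text{ at } r = 0.
\end{aligned} \qquad (12.11) $$
$$ \begin{aligned}
R^\phi{}_{\theta\phi\theta} &= g^{\phi\alpha} R_{\alpha\theta\phi\theta} = \frac{1}{g_{\phi\phi}}\, R_{\phi\theta\phi\theta} = \frac{1}{g_{\phi\phi}}\, R_{\theta\phi\theta\phi}, && \text{symmetry} \\
&= \frac{1}{r^2 \sin^2\theta}\, R_{\theta\phi\theta\phi} = \frac{1}{\sin^2\theta}\, R^\theta{}_{\phi\theta\phi} = 0 \text{ at } r = 0.
\end{aligned} \qquad (12.12) $$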
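The closed form (12.9) can be checked against a direct numerical evaluation of $(1/g_{rr})\, r\Lambda'$. The Python sketch below assumes illustrative values $\kappa = 0.3$ and $A = 0.1$ (not taken from the text) and differentiates $\Lambda = \frac{1}{2}\ln g_{rr}$ by central finite differences.

```python
import math


def g_rr(r, kappa, a_const):
    # Eq. (12.11): g_rr = exp(2 Lambda) = 1 / (1 + kappa r^2 / 3 - A / r)
    return 1.0 / (1.0 + kappa * r**2 / 3.0 - a_const / r)


def lam(r, kappa, a_const):
    # Lambda(r) recovered from g_rr = exp(2 Lambda).
    return 0.5 * math.log(g_rr(r, kappa, a_const))


def lam_prime(r, kappa, a_const, h=1e-6):
    # Central finite-difference approximation to dLambda/dr.
    return (lam(r + h, kappa, a_const) - lam(r - h, kappa, a_const)) / (2.0 * h)


def riemann_numeric(r, kappa, a_const):
    # R^r_{theta r theta} = (1 / g_rr) * r * Lambda'
    return r * lam_prime(r, kappa, a_const) / g_rr(r, kappa, a_const)


def riemann_closed_form(r, kappa, a_const):
    # Result (12.9): -(1/2) (2 kappa r^2 / 3 + A / r)
    return -0.5 * (2.0 * kappa * r**2 / 3.0 + a_const / r)


kappa, a_const = 0.3, 0.1  # hypothetical illustrative values
for r in (0.5, 1.0):
    print(r, riemann_numeric(r, kappa, a_const), riemann_closed_form(r, kappa, a_const))

# Behaviour near the origin: finite (and vanishing) only when A = 0.
small_r = 1.0e-3
near_origin_flat = riemann_closed_form(small_r, kappa, 0.0)
near_origin_singular = riemann_closed_form(small_r, kappa, a_const)
print(near_origin_flat, near_origin_singular)
```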
8. Find the coordinate transform leading to Eq. (12.19).
It's immediately clear by inspection of the term in $d\Omega^2$ that
$$ r = \sinh\chi $$
is the transform. To confirm this works we substitute it in Eq. (12.13) when
$k = -1$. We'll use the identity
$$ \cosh^2\chi - \sinh^2\chi = 1 $$
and
$$ d\sinh\chi = \cosh\chi\; d\chi $$
These identities all follow easily from the definitions of the hyperbolic
functions, http://en.wikipedia.org/wiki/Hyperbolic_function. Omitting
the temporal part we find
$$ \begin{aligned}
dl^2 &= R^2(t_0)\left[\frac{dr^2}{1 + r^2} + r^2 d\Omega^2\right], && \text{spatial part of Eq. (12.13)} \\
&= R^2(t_0)\left[\frac{\cosh^2(\chi)\, d\chi^2}{1 + \sinh^2(\chi)} + \sinh^2(\chi)\, d\Omega^2\right], && \text{sub transform} \\
&= R^2(t_0)\left[d\chi^2 + \sinh^2(\chi)\, d\Omega^2\right], && \text{use hyperbolic identities}
\end{aligned} \qquad (12.13) $$
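The substitution can also be verified numerically. In the sketch below (function names are illustrative), the coefficient multiplying $d\chi^2$ after substituting $r = \sinh\chi$ into $dr^2/(1 + r^2)$ is evaluated with a finite-difference derivative; it should equal 1 for every $\chi$.

```python
import math


def radial_coefficient(r):
    # Coefficient of dr^2 in the spatial part of Eq. (12.13) with k = -1.
    return 1.0 / (1.0 + r**2)


def chi_coefficient(chi, h=1e-6):
    # With r = sinh(chi): dr^2 / (1 + r^2) = (dr/dchi)^2 / (1 + r^2) * dchi^2.
    r = math.sinh(chi)
    dr_dchi = (math.sinh(chi + h) - math.sinh(chi - h)) / (2.0 * h)
    return radial_coefficient(r) * dr_dchi**2


coefficients = [chi_coefficient(chi) for chi in (0.1, 1.0, 3.0)]
print(coefficients)  # each entry should be very close to 1
```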
SP. 1 Let's get a feel for the order of magnitude of the terms in the expanding
universe.
SP. 1(a) At the current rate of expansion of the universe, how long will it
take for a meter stick to double in size? For the universe to double in size?
SP. 1(b) Plate tectonics is causing the Atlantic ocean to spread at a rate
of about 25 mm/year, see http://en.wikipedia.org/wiki/Mid-ocean_ridge.
So stationed in Brest, France one observes New York City to recede at
v = 25 mm/year. Assuming Hubble's law applies locally, compare
this to the rate at which New York recedes from Brest due to expansion of
the universe.
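A rough order-of-magnitude sketch for SP. 1 in Python. Everything numerical here is an assumption rather than a value from these notes: a Hubble parameter of 70 km/s/Mpc, a Brest to New York distance of roughly 5400 km, and, for part (a), a recession speed held fixed at its present value $H_0 l_0$, so that the doubling time is $1/H_0$ whatever the initial length.

```python
# All values below are assumed for illustration.
H0_KM_S_MPC = 70.0            # assumed present-day Hubble parameter
MPC_IN_M = 3.0857e22          # metres per megaparsec
SECONDS_PER_YEAR = 3.156e7
BREST_TO_NEW_YORK_M = 5.4e6   # assumed rough great-circle distance

h0_per_second = H0_KM_S_MPC * 1.0e3 / MPC_IN_M
h0_per_year = h0_per_second * SECONDS_PER_YEAR

# SP. 1(a): at a fixed speed v = H0 * l0, the length doubles after 1/H0.
doubling_time_years = 1.0 / h0_per_year

# SP. 1(b): Hubble-law recession speed over the Brest to New York distance.
hubble_recession_mm_per_year = h0_per_year * BREST_TO_NEW_YORK_M * 1.0e3
tectonic_mm_per_year = 25.0   # spreading rate quoted in the problem

print(f"doubling time    ~ {doubling_time_years:.2e} years")
print(f"Hubble recession ~ {hubble_recession_mm_per_year:.2f} mm/year")
print(f"tectonic / Hubble ~ {tectonic_mm_per_year / hubble_recession_mm_per_year:.0f}")
```

Under these assumptions the Hubble-law recession over that distance is a fraction of a millimetre per year, well below the tectonic spreading rate.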
Bibliography
Batchelor, G. K., 1967: An Introduction to Fluid Dynamics. Cambridge University Press, 615 pp.
Boas, M. L., 1983: Mathematical methods in the physical sciences. John Wiley and Sons, 793 pp.
Dirac, P. A. M., 1996: The General Theory of Relativity. Princeton University Press. 71 pp.
Faber, R. L., 1983: Differential geometry and relativity theory: An Introduction. Marcel Dekker. 255 + X pp.
Hobson, M., G. Efstathiou, and A. Lasenby, 2009: General Relativity: An introduction for physicists. Cambridge. 572 + XVIII pp.
Misner, C. W., K. S. Thorne, and J. A. Wheeler, 1973: Gravitation. W. H. Freeman and company. 1279 + XXVI pp.
Perkins, D., 2009: Particle Astrophysics. Oxford University Press. 2nd ed., 339 + IX pp.
Schutz, B., 2009: A first course in General Relativity. Cambridge University Press. 2nd ed., 393 + XV pp.