Consider a vector x in the plane, expressed in terms of its components x^1, x^2 with respect to an orthonormal basis {e_1, e_2} (as shown in Fig. 1.1):

x = x^1 e_1 + x^2 e_2 = x^i e_i .    (1.1)

[Fig. 1.1: The vector x and its components x^1, x^2 with respect to the orthonormal frame {e_1, e_2}, with origin O.]
[Fig. 1.2: The same vector x referred to the rotated frame {e'_1, e'_2}.]
Notice that in (1.1) we have used a superscript (upper index) for components
and a subscript (lower index) for basis vectors, and repeated indices (one upper
and one lower) are understood to be summed over (from 1 to the dimension
of the vector space under consideration, which is two in the present case). This
convenient notational device, called the Einstein summation convention, will be
used routinely in this book from now on, and its elegance will hopefully become
apparent as we proceed. (Repeated indices that are not meant to be summed
over will be explicitly noted, while repeated summation indices that are both
up or both down will be accompanied by an explicit summation sign.) The
orthonormal basis {e_1, e_2} is usually called a reference (coordinate) frame in physics.
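As a simple illustration of the convention, for n = 2 an expression with one repeated and one free index unpacks as

a^i_j x^j = a^i_1 x^1 + a^i_2 x^2    (i = 1, 2 free; j summed),

while a fully contracted expression such as δ^i_j a^j_i = a^1_1 + a^2_2 carries no free indices and is a single number (a scalar).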
Now consider the same vector x with different components x'^i with respect to a rotated orthonormal frame {e'_1, e'_2}, rotated from {e_1, e_2} by an angle θ in the positive (anticlockwise) sense (as shown in Fig. 1.2):

x = x'^i e'_i .    (1.2)

Elementary trigonometry applied to Fig. 1.2 gives

x'^1 = x^1 cos θ + x^2 sin θ ,   x'^2 = −x^1 sin θ + x^2 cos θ .    (1.3)
Equation (1.3) can also be obtained by working with the basis vectors, instead of the components, directly. It is evident from Fig. 1.2 that

e'_1 = (cos θ) e_1 + (sin θ) e_2 ,   e'_2 = −(sin θ) e_1 + (cos θ) e_2 .    (1.4)

Substituting (1.4) in (1.2) we obtain

x = x'^1 e'_1 + x'^2 e'_2 = (x'^1 cos θ − x'^2 sin θ) e_1 + (x'^1 sin θ + x'^2 cos θ) e_2 .    (1.5)

Comparison of the coefficients of e_1 and e_2 on both sides of (1.5) [with x = x^1 e_1 + x^2 e_2] immediately yields (1.3).
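As a quick check of (1.3) and (1.4), set θ = π/2, so that e'_1 = e_2 and e'_2 = −e_1. Then (1.3) gives x'^1 = x^2 and x'^2 = −x^1, which is just what one reads off geometrically: the component of x along e'_1 = e_2 is x^2, and its component along e'_2 = −e_1 is −x^1.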
Let us now write (1.3) and (1.4) in matrix notation. Equations (1.3) can be written as the single matrix equation

(x'^1, x'^2) = (x^1, x^2) \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} ,    (1.6)
while Eqs. (1.4) can be written as

\begin{pmatrix} e'_1 \\ e'_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} .    (1.7)

The matrix appearing in (1.7) may be written as

A = (a^j_i) = \begin{pmatrix} a^1_1 & a^2_1 \\ a^1_2 & a^2_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} ,    (1.8)
where the lower index is the row index and the upper index is the column index.
(This index convention for matrices will be adopted in this book unless otherwise specified, and we warn the reader that, occasionally, it may lead to confusion.)
Equations (1.6) and (1.7) can then be compactly written using the Einstein
summation convention as
x^i = a^i_j x'^j ,    (1.9)

e'_i = a^j_i e_j .    (1.10)
Note again that repeated pairs of indices, one upper and one lower, are summed
over.
Equations (1.9) and (1.10) are equivalent, in the sense that either one is a consequence of the other. By way of illustrating the usefulness of the index notation, we again derive (1.9) from (1.10) algebraically as follows:

x = x'^j e'_j = x'^j a^i_j e_i = x^i e_i .    (1.11)

The last equality implies (1.9).
In matrix notation, (1.9) and (1.10) can be written respectively as (note the order in which the matrices occur in the matrix products):

x = x' A ,    (1.12)

e' = A e ,    (1.13)

where A is the matrix in (1.8), e, e' are the column matrices in (1.7), and x, x' are the row matrices in (1.6). The above equations imply that

x' = x A^{-1} ,    (1.14)

e = A^{-1} e' .    (1.15)
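These matrix relations are easy to check numerically. The following sketch (plain NumPy; the variable names are our own, not the text's) builds the rotation matrix A of (1.8) for an arbitrary angle and verifies (1.12)-(1.14) together with the orthogonality property discussed below:

    import numpy as np

    theta = 0.7  # an arbitrary rotation angle
    c, s = np.cos(theta), np.sin(theta)

    # A = (a^j_i), lower index = row, upper index = column, as in (1.8)
    A = np.array([[c,  s],
                  [-s, c]])

    x = np.array([3.0, -2.0])          # row matrix (x^1, x^2)
    xp = x @ np.linalg.inv(A)          # x' = x A^{-1}, Eq. (1.14)

    # Eq. (1.12): x = x' A
    assert np.allclose(x, xp @ A)

    # Components in the rotated frame, cf. (1.3)
    assert np.allclose(xp, [x[0]*c + x[1]*s, -x[0]*s + x[1]*c])

    # Orthogonality: A A^T = 1 [cf. (1.16) below]
    assert np.allclose(A @ A.T, np.eye(2))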
v^i = a^i_j v'^j ,  or equivalently  v'^i = (a^{-1})^i_j v^j     (components: contravariant)
e'_i = a^j_i e_j ,  or equivalently  e_i = (a^{-1})^j_i e'_j     (basis vectors: covariant)

Table 1.1

Note also that the rotation matrix A of (1.8) is an orthogonal matrix: it satisfies

A^{-1} = A^T ,  that is,  A A^T = A^T A = 1 .    (1.16)
In all the indexed quantities introduced above, upper indices are called contravariant indices and lower indices are called covariant indices. These are distinguished because tensorial objects [see the discussion in the paragraph following Eq. (1.31) below] involving upper and lower indices transform differently under a change of basis, as evident from Table 1.1 for the case of vectors.
At this point it will be useful to recall the abstract concept of linear vector spaces. One speaks of a vector space V over a field F. In physics applications, the algebraic field is usually the field of real numbers R or the field of complex numbers C, corresponding to so-called real vector spaces and complex vector spaces, respectively.
Definition 1.1. A vector space V over an algebraic field F is a set of objects v ∈ V for which an internal commutative binary operation, called addition (denoted by +), and an external binary operation, called scalar multiplication, are defined such that

(a) v + w ∈ V , for all v, w ∈ V ;    (1.17)

(b) a x ∈ V , a(x + y) = a x + a y , (a + b) x = a x + b x , a(b x) = (ab) x , 1 x = x ,
    for all a, b ∈ R ; x, y ∈ V .    (1.18)
Let us now verify the orthogonality property (1.16). A rotation preserves the length of any vector: the squared norm

v² = Σ_j v^j v^j    (1.19)

must be the same whether computed from the components v^j or from the rotated components v'^j. Using v^i = a^i_j v'^j [cf. (1.9)], this requires

Σ_j v'^j v'^j = Σ_i v^i v^i = Σ_i a^i_j a^i_k v'^j v'^k .    (1.20)
Comparing the leftmost expression with the rightmost we see that (A A^T)_{jk} = δ_{jk}, or A A^T = 1. Property (1.16) then follows.
[Fig. 1.3: A right-handed frame {e_1, e_2, e_3} and a left-handed frame {e'_1, e'_2, e'_3}, each with origin O.]
Taking the determinant of A A^T = 1 we find

(det A)² = 1 ,  or  det A = ±1 ,    (1.21)

where we have used the fact that

det(A^T) = det(A) .    (1.22)

Orthogonal transformations with det A = +1 (rotations) preserve the handedness (orientation) of a frame, while those with det A = −1 interchange right-handed and left-handed frames (cf. Fig. 1.3).

More generally, consider a linear transformation A on an n-dimensional vector space V, which sends a vector x to the vector

x' = A(x) .    (1.23)

With respect to a particular choice of basis {e_i}, the action of A is completely determined by its action on the basis vectors:

A(e_i) = a^j_i e_j ,    (1.24)
where the a^j_i are scalars in the field of real numbers R. Similar to (1.8), the quantities a^j_i, i = 1, …, n, j = 1, …, n, can be displayed as an n × n matrix:
(a^j_i) = \begin{pmatrix} a^1_1 & a^2_1 & \cdots & a^n_1 \\ a^1_2 & a^2_2 & \cdots & a^n_2 \\ \cdots & \cdots & \cdots & \cdots \\ a^1_n & a^2_n & \cdots & a^n_n \end{pmatrix} .    (1.25)
The action of A on an arbitrary vector x = x^i e_i then follows from linearity:

A(x) = x^i A(e_i) = x^i a^j_i e_j = x^j a^i_j e_i ,    (1.26)

where in the last equality we have performed the interchange (i ↔ j), since both i and j are dummy indices that are summed over. Equation (1.26) then implies

x'^i = a^i_j x^j ,    (1.27)
which is formally the same as (1.9). Thus we have shown explicitly how the
matrix representation of a linear transformation depends on the choice of a
basis set.
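To make (1.24)-(1.27) concrete, here is a small sketch (our own illustration, with made-up numbers) that builds the matrix (a^j_i) of a linear map from its action on the basis vectors and checks (1.27):

    import numpy as np

    # Images of the basis vectors under a linear map A on R^2 (made-up numbers):
    # A(e_1) = 2 e_1 + 1 e_2,  A(e_2) = 0 e_1 + 3 e_2.  Cf. (1.24).
    Ae1 = np.array([2.0, 1.0])
    Ae2 = np.array([0.0, 3.0])

    # Book convention: (a^j_i) has the lower index i as row, upper j as column,
    # so row i is just the coefficient list of A(e_i).
    M = np.vstack([Ae1, Ae2])

    # Eq. (1.27): x'^i = a^i_j x^j; with row vectors this reads x' = x M.
    x = np.array([1.0, 1.0])
    xp = x @ M
    assert np.allclose(xp, 1.0*Ae1 + 1.0*Ae2)   # A(e_1 + e_2) = A(e_1) + A(e_2)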
Let us now investigate the transformation properties of the matrix representation of a linear transformation A under a change of basis, which we write in terms of the new basis vectors {e'_j} as

e_i = s^j_i e'_j ,  that is,  e'_i = (s^{-1})^j_i e_j .    (1.28)

The required matrix A' = (a'^j_i) is given by [cf. (1.24)]

A(e'_i) = a'^j_i e'_j .    (1.29)

Indeed,

A(e'_i) = A((s^{-1})^l_i e_l) = (s^{-1})^l_i A(e_l) = (s^{-1})^l_i a^k_l e_k = (s^{-1})^l_i a^k_l s^j_k e'_j ,

where in the second equality we have used the linearity property of A, in the third equality, Eq. (1.24), and in the fourth equality, Eq. (1.28). Comparison with (1.29) gives the desired result:

a'^j_i = (s^{-1})^l_i a^k_l s^j_k ,    (1.30)
or, in matrix notation,

A' = S^{-1} A S ,    (1.31)

where A' = (a'^j_i), A = (a^j_i) and S = (s^j_i) are the matrices with the indicated matrix elements.
The transformation of matrices A → A' given above is called a similarity transformation. These transformations are of great importance in physics. Two matrices related by an invertible matrix S as in (1.31) are said to be similar. Note that in (1.31), the matrix S itself may be viewed as representing a linear (invertible) transformation on V (S : V → V, using the same symbol for the operator as for its matrix representation), and the equation as a relation between linear transformations. The equation (1.31) may also be represented succinctly by the following so-called commutative diagram (used frequently in the mathematics literature):
             A
    V ---------------> V
    |                  ^
  S |                  | S^{-1}
    v                  |
    V ---------------> V
             A'
The way to read the diagram to arrive at the result (1.31) is as follows. Instead of using the operator A to directly effect the linear transformation (from V to V) as represented by the top line (from left to right) of the diagram, one can instead start from the top left corner, use the operator S first on V (going down on the left line), followed by the operator A' (going from left to right on the bottom line), then finally use the operator S^{-1} (going up on the right line).
According to (1.30) the upper index of a matrix (a^j_i) transforms like the upper index of a contravariant vector v^i (cf. Table 1.1), and the lower index of (a^j_i) transforms like the lower index of a basis vector e_i (cf. Table 1.1 also). In general one can consider a multi-indexed object

T^{i_1 … i_r}_{j_1 … j_s}

(with r upper indices and s lower indices) which transforms under a change of basis (or equivalently, a change of coordinate frames) e_i = s^j_i e'_j (see Table 1.1) according to the following rule [compare carefully with (1.30)]:
(T')^{i_1 … i_r}_{j_1 … j_s} = s^{i_1}_{k_1} ⋯ s^{i_r}_{k_r} (s^{-1})^{l_1}_{j_1} ⋯ (s^{-1})^{l_s}_{j_s} T^{k_1 … k_r}_{l_1 … l_s} ,    (1.32)
with r sums over the repeated indices k_1, …, k_r and s sums over the repeated indices l_1, …, l_s. Thus a matrix (a^j_i) which transforms as (1.30) is a (1,1)-type tensor, and a vector v^i is a (1,0)-type tensor. (r,s)-type tensors with r ≠ 0 and s ≠ 0 are called tensors of mixed type. A (0,0)-type tensor is a scalar. The term covariant means that the transformation is the same as that of the basis vectors: e'_i = (s^{-1})^j_i e_j (see Table 1.1), while contravariant means that the indexed quantity transforms according to the inverse of the transformation of the basis vectors. Common examples of tensors occurring in physics are the inertia tensor I^{ij} in rigid body dynamics (see Chapter 20 and Problem 1.5 below), the electromagnetic field tensor F^{μν} in Maxwell electrodynamics, and the stress-energy tensor T^{μν} in the theory of relativity (the latter two behaving as tensors under Lorentz transformations, which preserve the lengths of 4-vectors in 4-dimensional spacetime).
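The rule (1.32) is mechanical to implement. The sketch below (our own illustration, using NumPy's einsum) transforms a (1,1)-type tensor and confirms that the result agrees with the matrix form (1.31):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 3

    # Storage convention as in the text: lower index = row, upper index = column.
    T = rng.normal(size=(n, n))       # T[l, k] = T^k_l, a (1,1)-type tensor
    S = rng.normal(size=(n, n))       # S[k, i] = s^i_k (invertible with probability 1)
    S_inv = np.linalg.inv(S)          # S_inv[j, l] = (s^{-1})^l_j

    # Eq. (1.32) with r = s = 1:  (T')^i_j = s^i_k (s^{-1})^l_j T^k_l
    Tp = np.einsum('ki,jl,lk->ji', S, S_inv, T)

    # Matrix form, Eq. (1.31): A' = S^{-1} A S
    assert np.allclose(Tp, S_inv @ T @ S)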
Let us step back to look at the mathematical genesis of the concept of tensors more closely. We warn the reader that the somewhat abstract and seemingly convoluted development over the next several paragraphs [until Eq. (1.40)] may appear to be needlessly pedantic at this point for the purpose of physics applications. Our justification is that it will prepare the ground for developments of key concepts and miscellaneous applications in subsequent chapters.

Our objective is to construct a vector space out of two arbitrary vector spaces, by devising a tensor product between them, or tensoring them together, so to speak. The first step is to introduce the concept of dual spaces.
Instead of linear operators A : V → V, let us consider a linear function f : V → R. Recall that [cf. (1.18)] linearity means that, in this case, f(a v + b w) = a f(v) + b f(w), for all v, w ∈ V and all a, b ∈ R. Writing v = v^i e_i (with respect to a particular basis), we have, by linearity, f(v^i e_i) = v^i f(e_i). Let f(e_i) ≡ a_i ∈ R. (Notice the position of the covariant index i: a subscript.) Then we can write (verify it!) f = a_i (e*)^i, where (e*)^i is a particular linear function on V whose action is completely specified by (e*)^i (e_j) = δ^i_j (the Kronecker delta symbol, defined to be equal to 1 if i = j and equal to zero if i ≠ j). If V is n-dimensional, the numbers a_i, i = 1, …, n, can be considered as components of the vector f in an n-dimensional vector space V*, called the dual space to V, with respect to the basis {(e*)^1, …, (e*)^n}. Thus we have the following definition.
Definition 1.2. The dual space to a vector space V, denoted V*, is the vector space of all linear functions f : V → R on V.
The basis {(e*)^i} of V* is called the dual basis to the basis {e_i} of V, and vice versa. Since dim(V*) = dim(V), V* is isomorphic to V, as follows from the general fact that all finite dimensional vector spaces of the same dimension are isomorphic to each other. Thus the vector spaces V and V* are in fact dual to each other, or we can write (V*)* = V. Under the change of basis e_i = s^j_i e'_j we have

a'_i ≡ f(e'_i) = f((s^{-1})^j_i e_j) = (s^{-1})^j_i f(e_j) = (s^{-1})^j_i a_j .    (1.33)

This confirms the fact that the (lower) index in a_i is a covariant index. So vectors in V* are (0,1)-type tensors.
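As a simple illustration, let V = R² with basis {e_1, e_2}, and let f ∈ V* be the linear function f(v) = 3v^1 + 2v^2. Then a_1 = f(e_1) = 3 and a_2 = f(e_2) = 2, so f = 3(e*)^1 + 2(e*)^2; indeed, a_i (e*)^i (v^j e_j) = a_i v^j δ^i_j = a_i v^i = 3v^1 + 2v^2, as required.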
Now let us observe that the set of all linear operators from V to V, denoted by L(V; V), is a vector space if we define the vector addition and scalar multiplication by (A + B)(v) = A(v) + B(v) and (aA)(v) = a A(v), respectively, where A, B ∈ L(V; V), v ∈ V and a ∈ R. As we demonstrated earlier, if V is n-dimensional, each A ∈ L(V; V) is represented by an n × n real matrix (a^j_i). So there is an isomorphic relationship between L(V; V) and the set of all n × n real matrices M(n, R). On the other hand, M(n, R) is also isomorphic to the vector space of all bilinear maps from V* × V to R, denoted by L(V*, V; R), where bilinearity means linearity in each argument. Indeed, given any A ∈ L(V*, V; R), w* = w_i (e*)^i ∈ V* and v = v^i e_i ∈ V, we have, by bilinearity,

A(w*, v) = w_j v^i A((e*)^j, e_i) .    (1.34)

So the action of any A ∈ L(V*, V; R) is completely specified by the n² numbers A((e*)^j, e_i) ≡ a^j_i, that is, the n × n matrix (a^j_i), and we have the following vector space isomorphisms:

L(V; V) ≅ L(V*, V; R) ≅ M(n, R) .    (1.35)
There is yet another way to view the above vector space. For v ∈ V and w* ∈ V*, consider an element v ⊗ w* ∈ L(V*, V; R) defined by

(v ⊗ w*)(x*, y) = x*(v) · w*(y) ≡ ⟨x*, v⟩ ⟨w*, y⟩ ,    (1.36)

where the notation x*(v) ≡ ⟨x*, v⟩ = ⟨v, x*⟩ indicates that the pairing between any x* ∈ V* and v ∈ V is symmetrical. Denote by V ⊗ V* the vector space consisting of all finite sums of the form v ⊗ w*. It can be shown readily that the operation ⊗ is bilinear, that is, linear in both arguments. Thus v ⊗ w* = v^i w_j e_i ⊗ (e*)^j, and so V ⊗ V* is spanned by the basis vectors e_i ⊗ (e*)^j. It follows that any A ∈ V ⊗ V* can be written as the linear combination A = a^j_i e_j ⊗ (e*)^i. In other words, V ⊗ V* is also isomorphic to the set of all n × n real matrices M(n, R). Combining the above results we have the following strengthening of the isomorphism scheme of (1.35):

L(V; V) ≅ L(V*, V; R) ≅ V ⊗ V* ≅ M(n, R) .    (1.37)

The vector space V ⊗ V* is called the tensor product of the vector spaces V and V*. Note that the ordering of the arguments in the binary operation ⊗ is important, and note the complementary appearances of V and V* in the second and third entries in the above equation.
It is readily seen that the notion of the tensor product as developed above can be generalized to that between any two vector spaces V and W (which may be of different dimensions). In fact, analogous to (1.37), we have the following definition:

Definition 1.3. The tensor product of two vector spaces V and W, denoted V ⊗ W, is the vector space of bilinear functions from V* × W* to R, or real bilinear forms on V* × W*:

V ⊗ W ≅ L(V*, W*; R) .    (1.38)
Note again the duality of the vector spaces appearing on both sides of the equation. We note that L(V*, W*; R) can also be identified with the dual of the space of real bilinear forms on V × W, that is, L(V*, W*; R) ≅ L*(V, W; R) (the reader can verify this easily).
By extension we can define the tensor product between any number of (different) vector spaces. An (r, s)-tensor [with components that transform according to (1.32)] can then be considered as an element in the tensor product space

V^r_s ≡ V ⊗ ⋯ ⊗ V (r times) ⊗ V* ⊗ ⋯ ⊗ V* (s times) .    (1.39)

Such a tensor T ∈ V^r_s can be expanded as

T = T^{i_1 … i_r}_{j_1 … j_s} e_{i_1} ⊗ ⋯ ⊗ e_{i_r} ⊗ (e*)^{j_1} ⊗ ⋯ ⊗ (e*)^{j_s} ,    (1.40)

where the components transform according to (1.32) under the change of basis in V given by e_i = s^j_i e'_j [and the associated change in V* given by (e*)^i = (s^{-1})^i_j (e'*)^j]. Note again the use of the Einstein summation convention.
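For instance, for r = s = 1, a (1,1)-type tensor T = T^i_j e_i ⊗ (e*)^j acts on a pair (w*, v) ∈ V* × V as T(w*, v) = T^i_j ⟨w*, e_i⟩ ⟨(e*)^j, v⟩ = T^i_j w_i v^j, recovering the bilinear-form picture of (1.34).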
After the above flight of abstraction we will return to more familiar ground for most of the remainder of this chapter, the special case of physical 3-dimensional Euclidean space, and review the notions of scalar products and cross products in this space. We will see that, while the definition of the scalar product as furnished in (1.41) below is in fact valid for any dimension n, that of the cross product given by (1.45) is specific to n = 3 only (although in the next chapter we will explore the important generalization of the concept to higher dimensions). The scalar (inner) product of two vectors u and v (in the same vector space) expressed in terms of components (with respect to a certain choice of basis) is defined by

(u, v) = u · v ≡ δ_{ij} u^i v^j = Σ_i u^i v^i = u^i v_i ,    (1.41)
where the Kronecker delta symbol δ_{ij} (defined similarly to δ^i_j above) is considered to be a (0,2)-type tensor, and can be regarded as a metric in Euclidean space R³. Loosely speaking, a metric in a space is a (0,2)-type non-degenerate, symmetric tensor field which gives a prescription for measuring lengths and angles in that space. [For a definition of non-degeneracy, see the discussion immediately above Eq. (1.65).] For example, the norm (length) of a vector v is defined to be [recall (1.19)]

|v| ≡ √(v, v) = √( Σ_i (v^i)² ) .    (1.42)
It is readily shown, by steps similar to those in (1.20), that the scalar product
of two vectors is invariant under orthogonal transformations.
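A quick numerical sanity check of this invariance (a sketch of our own, not part of the text):

    import numpy as np

    rng = np.random.default_rng(1)

    # A random 3x3 rotation matrix, built from a QR decomposition.
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]          # enforce det = +1 (a proper rotation)

    u, v = rng.normal(size=3), rng.normal(size=3)

    # The scalar product (1.41) is unchanged when both vectors are rotated.
    assert np.isclose(u @ v, (Q @ u) @ (Q @ v))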
Referring to Fig. 1.4, in which the basis vector e_1 is chosen to lie along the direction of v, the angle θ between two vectors u and v is given in terms of the scalar product by

u · v = (u cos θ, u sin θ) · (v, 0) = u v cos θ ,    (1.43)

where u and v also denote the lengths of the respective vectors.

[Fig. 1.4: The vectors u and v in the plane they span, with e_1 chosen along v and θ the angle between u and v.]
The vector (cross) product is defined with the help of the Levi-Civita symbol ε^i_{jk}, specified by

ε^i_{jk} = +1 if (i, j, k) is an even permutation of (1, 2, 3); −1 if (i, j, k) is an odd permutation of (1, 2, 3); 0 otherwise (that is, if any two indices are equal) ,    (1.44)

in terms of which the i-th component of the cross product of A and B is

(A × B)^i ≡ ε^i_{jk} A^j B^k .    (1.45)
The cross product gives a Lie algebra structure to the vector space R³, since this product satisfies the so-called Jacobi identity:

A × (B × C) + B × (C × A) + C × (A × B) = 0 .    (1.46)
The metric δ_{ij} and its inverse δ^{ij} can be used to raise and lower indices of tensors in R³. For example,

ε_{klm} = δ_{kn} ε^n_{lm} .    (1.47)

In fact, because of the special properties of the Kronecker delta, raising and lowering leaves the numerical values unchanged, so one can use it to raise and lower indices with impunity:

ε_{ijk} = ε^{ijk} = ε^i_{jk} = … .    (1.48)
Problem 1.4 Verify the following tensorial properties under orthogonal transformations for the Kronecker deltas and the Levi-Civita tensor:

(a) δ^{ij} transforms as a (2,0)-type tensor;
(b) δ_{ij} transforms as a (0,2)-type tensor;
(c) δ^i_j transforms as a (1,1)-type tensor;
(d) ε^i_{jk} [as defined in (1.44)] transforms as a (1,2)-type tensor; ε_{ijk} [as defined in (1.47)] transforms as a (0,3)-type tensor.
Problem 1.6 Use the definition of the cross product [(1.45)] to show that

A × B = −B × A .    (1.49)

It also follows from (1.45) that the basis vectors of a right-handed frame satisfy

e_1 × e_2 = e_3 ,   e_2 × e_3 = e_1 ,   e_3 × e_1 = e_2 .    (1.50)
The Levi-Civita tensor satisfies the following very useful contraction property:

ε_{kij} ε_{klm} = δ_{il} δ_{jm} − δ_{im} δ_{jl} .    (1.51)
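One can verify (1.51) by brute force; the following sketch (our own) builds ε as a NumPy array and checks the identity for all values of the free indices:

    import numpy as np
    from itertools import permutations

    # Build the Levi-Civita symbol as a 3x3x3 array.
    eps = np.zeros((3, 3, 3))
    for i, j, k in permutations(range(3)):
        # sign of the permutation: +1 for even, -1 for odd
        eps[i, j, k] = np.sign((j - i) * (k - i) * (k - j))

    # Contraction property (1.51): sum over k of eps[k,i,j] * eps[k,l,m]
    lhs = np.einsum('kij,klm->iljm', eps, eps)
    delta = np.eye(3)
    rhs = (np.einsum('il,jm->iljm', delta, delta)
           - np.einsum('im,jl->iljm', delta, delta))
    assert np.allclose(lhs, rhs)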
The Levi-Civita symbol also provides compact expressions for determinants. The determinant of an n × n matrix (a^j_i) can be written as a sum over the permutation group S_n:

det(a^j_i) = Σ_{σ ∈ S_n} (sgn σ) a^1_{σ(1)} ⋯ a^n_{σ(n)} = Σ_{σ ∈ S_n} (sgn σ) a^{σ(1)}_1 ⋯ a^{σ(n)}_n ,    (1.52)

or, in terms of the n-dimensional Levi-Civita symbol,

det(a^j_i) = ε_{i_1 … i_n} a^{i_1}_1 ⋯ a^{i_n}_n .    (1.53)
The cross product (1.45) can thus be written as a formal determinant:

A × B = \begin{vmatrix} e_1 & e_2 & e_3 \\ A^1 & A^2 & A^3 \\ B^1 & B^2 & B^3 \end{vmatrix} .    (1.54)
Problem 1.8 Verify the contraction property of the Levi-Civita tensor (1.51) by using the defining properties of the tensor given by (1.44) and the properties of the Kronecker delta.
We can use the contraction property of the Levi-Civita tensor (1.51) and the definition of the scalar product (1.41) to calculate the magnitude of the cross product as follows:

(A × B) · (A × B) = Σ_i (A × B)^i (A × B)^i = Σ_i ε^i_{jk} ε^i_{mn} A^j B^k A^m B^n
  = (δ_{jm} δ_{kn} − δ_{jn} δ_{km}) A^j B^k A^m B^n
  = A^j A_j B_n B^n − (A^j B_j)(A^k B^k) = A² B² − (A · B)²
  = A² B² − A² B² cos² θ = A² B² sin² θ .    (1.55)
Hence

|A × B| = A B sin θ ,    (1.56)
where θ is the angle between the vectors A and B. Furthermore, using the properties of the Levi-Civita tensor (1.44), we see that

(A × B) · A = Σ_i ε^i_{jk} A^j B^k A^i = ε_{ijk} A^i A^j B^k = 0 ,    (1.57)

since ε_{ijk} is antisymmetric under i ↔ j while the product A^i A^j is symmetric.
Similarly,

(A × B) · B = 0 .    (1.58)
Thus the cross product of two vectors is always perpendicular to the plane defined by the two vectors. The right-hand rule for the direction of the cross product follows from (1.50).
As a further example in the use of the contraction property of the Levi-Civita tensor we will prove the following very useful identity, known as the bac minus cab identity in vector algebra:

A × (B × C) = B (A · C) − C (A · B) .    (1.59)
[Fig. 1.5]
Indeed, we have

(A × (B × C))^i = ε^i_{jk} A^j (B × C)^k = ε^i_{jk} ε^k_{lm} A^j B^l C^m = ε^k_{ij} ε^k_{lm} A^j B^l C^m
  = (δ_{il} δ_{jm} − δ_{im} δ_{jl}) A^j B^l C^m
  = B^i (A_m C^m) − C^i (A_l B^l) = (B (A · C) − C (A · B))^i .    (1.60)
Problem 1.9 Use the properties of the Levi-Civita tensor to show that

A · (B × C) = B · (C × A) = C · (A × B) ,    (1.61)

and that this scalar triple product can be written as the determinant

A · (B × C) = \begin{vmatrix} A^1 & A^2 & A^3 \\ B^1 & B^2 & B^3 \\ C^1 & C^2 & C^3 \end{vmatrix} .    (1.62)
Problem 1.10 Use the bac minus cab identity (1.59) to show that

(A × B) × (C × D) = C (D · (A × B)) − D (C · (A × B))
                  = B (A · (C × D)) − A (B · (C × D)) .    (1.63)
Problem 1.11 Use the bac minus cab identity and (1.61) to show that

(A × B) · (C × D) = (A · C)(B · D) − (A · D)(B · C) .    (1.64)
The notion of the scalar product generalizes to that of an inner product ( , ) on an arbitrary vector space V: a symmetric, bilinear map V × V → R. The inner product is said to be non-degenerate if

(v, w) = 0 for all w ∈ V implies v = 0 .    (1.65)
We now assert that the condition of nondegeneracy of the inner product is equivalent to the fact that the inner product induces a special (canonical) bijective [one-to-one (injective) and onto (surjective)] mapping between V and V*. Define a linear map G : V → V* by

G(v)(w) = (v, w) ,    (1.66)
for all w ∈ V. Nondegeneracy of the inner product then guarantees that

G(v) = 0 ⟹ v = 0 .    (1.67)
This says that the map G is injective. The fact that it is also surjective follows from the fact that V and V*, as finite-dimensional vector spaces of the same dimension, are isomorphic. So G is a bijection. Conversely, suppose G is bijective, so it is certainly injective. This means that Ker G = 0; in other words, G(v) = 0 ⟹ v = 0. It follows from (1.66) that G(v)(w) = (v, w) = 0 for all w ∈ V implies v = 0; in other words, the inner product is nondegenerate. This establishes the equivalence referred to in the sentence following Eq. (1.65). The linear bijective map G : V → V* defined by (1.66) is called the conjugate isomorphism between V and V*.
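In components, G is simply the metric acting on v: G(v)_j = g_{ij} v^i, and nondegeneracy is the invertibility of the matrix (g_{ij}). A small sketch of our own:

    import numpy as np

    # A symmetric, nondegenerate metric on R^3 (Euclidean, with a stretch).
    g = np.diag([1.0, 2.0, 3.0])
    assert np.allclose(g, g.T) and np.linalg.det(g) != 0

    def G(v):
        """The conjugate isomorphism: v -> the covector w |-> (v, w)."""
        return g @ v            # components G(v)_j = g_{ij} v^i

    v = np.array([1.0, -1.0, 2.0])
    w = np.array([0.5, 0.0, 1.0])
    assert np.isclose(G(v) @ w, v @ g @ w)       # G(v)(w) = (v, w), Eq. (1.66)

    # Nondegeneracy <=> G invertible: recover v from the covector G(v).
    assert np.allclose(np.linalg.solve(g, G(v)), v)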
In Hamiltonian mechanics, we will need antisymmetric, nondegenerate, bilinear 2-forms instead of symmetric ones. These are called symplectic forms. They will play a crucial role from Chapter 36 onwards.