
**Kees Dullemond & Kasper Peeters**

© 1991-2010

This booklet contains an explanation about tensor calculus for students of physics and engineering with a basic knowledge of linear algebra. The focus lies mainly on acquiring an understanding of the principles and ideas underlying the concept of ‘tensor’. We have not pursued mathematical strictness and pureness, but instead emphasise practical use (for a more mathematically pure résumé, please see the bibliography). Although tensors are applied in a very broad range of physics and mathematics, this booklet focuses on the application in special and general relativity.

We are indebted to all people who read earlier versions of this manuscript and gave useful comments, in particular G. Bäuerle (University of Amsterdam) and C. Dullemond Sr. (University of Nijmegen).

The original version of this booklet, in Dutch, appeared on October 28th, 1991. A major update followed on September 26th, 1995. This version is a re-typeset English translation made in 2008/2010.

Copyright © 1991-2010 Kees Dullemond & Kasper Peeters.

1 The index notation

2 Bases, co- and contravariant vectors
2.1 Intuitive approach
2.2 Mathematical approach

3 Introduction to tensors
3.1 The new inner product and the first tensor
3.2 Creating tensors from vectors

4 Tensors, definitions and properties
4.1 Definition of a tensor
4.2 Symmetry and antisymmetry
4.3 Contraction of indices
4.4 Tensors as geometrical objects
4.5 Tensors as operators

5 The metric tensor and the new inner product
5.1 The metric as a measuring rod
5.2 Properties of the metric tensor
5.3 Co versus contra

6 Tensor calculus
6.1 The ‘covariance’ of equations
6.2 Addition of tensors
6.3 Tensor products
6.4 First order derivatives: non-covariant version
6.5 Rot, cross-products and the permutation symbol

7 Covariant derivatives
7.1 Vectors in curved coordinates
7.2 The covariant derivative of a vector/tensor field

A Tensors in special relativity
B Geometrical representation

C Exercises
C.1 Index notation
C.2 Co-vectors
C.3 Introduction to tensors
C.4 Tensors, general
C.5 Metric tensor
C.6 Tensor calculus


1 The index notation

Before we start with the main topic of this booklet, tensors, we will first introduce a new notation for vectors and matrices, and their algebraic manipulations: the index notation. It will prove to be much more powerful than the standard vector notation. To clarify this we will translate all well-known vector and matrix manipulations (addition, multiplication and so on) to index notation.

Let us take a manifold (= space) with dimension $n$. We will denote the components of a vector $\vec{v}$ with the numbers $v_1, \dots, v_n$. If one modifies the vector basis, in which the components $v_1, \dots, v_n$ of vector $\vec{v}$ are expressed, then these components will change, too. Such a transformation can be written using a matrix $A$, of which the columns can be regarded as the old basis vectors $\vec{e}_1, \dots, \vec{e}_n$ expressed in the new basis $\vec{e}_1', \dots, \vec{e}_n'$,

$$\begin{pmatrix} v_1' \\ \vdots \\ v_n' \end{pmatrix} = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} \qquad (1.1)$$

Note that the first index of $A$ denotes the row and the second index the column. In the next chapter we will say more about the transformation of vectors.

According to the rules of matrix multiplication the above equation means:

$$\begin{aligned} v_1' &= A_{11} v_1 + A_{12} v_2 + \cdots + A_{1n} v_n \,, \\ &\;\;\vdots \\ v_n' &= A_{n1} v_1 + A_{n2} v_2 + \cdots + A_{nn} v_n \,, \end{aligned} \qquad (1.2)$$

or equivalently,

$$\begin{aligned} v_1' &= \sum_{\nu=1}^{n} A_{1\nu} v_\nu \,, \\ &\;\;\vdots \\ v_n' &= \sum_{\nu=1}^{n} A_{n\nu} v_\nu \,, \end{aligned} \qquad (1.3)$$

or even shorter,

$$v_\mu' = \sum_{\nu=1}^{n} A_{\mu\nu} v_\nu \qquad (\forall \mu \in \mathbb{N} \mid 1 \leq \mu \leq n) \,. \qquad (1.4)$$

In this formula we have put the essence of matrix multiplication. The index $\nu$ is a dummy index and $\mu$ is a running index. The names of these indices, in this case $\mu$ and $\nu$, are chosen arbitrarily. They could equally well have been called $\alpha$ and $\beta$:

$$v_\alpha' = \sum_{\beta=1}^{n} A_{\alpha\beta} v_\beta \qquad (\forall \alpha \in \mathbb{N} \mid 1 \leq \alpha \leq n) \,. \qquad (1.5)$$

Usually the conditions for $\mu$ (in Eq. 1.4) or $\alpha$ (in Eq. 1.5) are not explicitly stated because they are obvious from the context. The following statements are therefore equivalent:

$$\begin{aligned} \vec{v} = \vec{y} \quad &\Leftrightarrow \quad v_\mu = y_\mu \quad \Leftrightarrow \quad v_\alpha = y_\alpha \,, \\ \vec{v} = A \vec{y} \quad &\Leftrightarrow \quad v_\mu = \sum_{\nu=1}^{n} A_{\mu\nu} y_\nu \quad \Leftrightarrow \quad v_\nu = \sum_{\mu=1}^{n} A_{\nu\mu} y_\mu \,. \end{aligned} \qquad (1.6)$$

This index notation is also applicable to other manipulations, for instance the inner product. Take two vectors $\vec{v}$ and $\vec{w}$, then we define the inner product as

$$\vec{v} \cdot \vec{w} := v_1 w_1 + \cdots + v_n w_n = \sum_{\mu=1}^{n} v_\mu w_\mu \,. \qquad (1.7)$$

(We will return extensively to the inner product. Here it serves just as an example of the power of the index notation.) In addition to these kinds of manipulations, one can also just take the sum of matrices and of vectors:

$$C = A + B \quad \Leftrightarrow \quad C_{\mu\nu} = A_{\mu\nu} + B_{\mu\nu} \,, \qquad \vec{z} = \vec{v} + \vec{w} \quad \Leftrightarrow \quad z_\alpha = v_\alpha + w_\alpha \qquad (1.8)$$

or their difference,

$$C = A - B \quad \Leftrightarrow \quad C_{\mu\nu} = A_{\mu\nu} - B_{\mu\nu} \,, \qquad \vec{z} = \vec{v} - \vec{w} \quad \Leftrightarrow \quad z_\alpha = v_\alpha - w_\alpha \qquad (1.9)$$
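For readers who like to experiment, the manipulations of Eqs. (1.4)–(1.9) can also be checked numerically. The following is a minimal sketch (assuming Python with NumPy; the matrix and vectors are arbitrary illustrations):

```python
# A minimal numeric sketch of Eqs. (1.4)-(1.9): matrix-vector products,
# inner products and sums, written once with explicit summation loops
# over the dummy index and once in vectorised form.
import numpy as np

n = 3
A = np.array([[1., 2., 0.],
              [0., 1., 3.],
              [4., 0., 1.]])
v = np.array([1., 2., 3.])
w = np.array([4., 5., 6.])

# Eq. (1.4): v'_mu = sum_nu A_{mu nu} v_nu, with an explicit dummy index nu
v_prime = np.array([sum(A[mu, nu] * v[nu] for nu in range(n))
                    for mu in range(n)])
assert np.allclose(v_prime, A @ v)          # same as matrix multiplication

# Eq. (1.7): inner product s = sum_mu v_mu w_mu
s = sum(v[mu] * w[mu] for mu in range(n))
assert np.isclose(s, v @ w)

# Eqs. (1.8)/(1.9): addition and difference are component-wise, no summation
z = v + w                                   # z_alpha = v_alpha + w_alpha
```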

◮ Exercises 1 to 6 of Section C.1.

From the exercises it should have become clear that the summation symbols $\sum$ can always be put at the start of the formula and that their order is irrelevant. We can therefore in principle omit these summation symbols, if we make clear in advance over which indices we perform a summation, for instance by putting them after the formula,

$$\sum_{\nu=1}^{n} A_{\mu\nu} v_\nu \;\rightarrow\; A_{\mu\nu} v_\nu \;\{\nu\} \,, \qquad \sum_{\beta=1}^{n} \sum_{\gamma=1}^{n} A_{\alpha\beta} B_{\beta\gamma} C_{\gamma\delta} \;\rightarrow\; A_{\alpha\beta} B_{\beta\gamma} C_{\gamma\delta} \;\{\beta, \gamma\} \qquad (1.10)$$

From the exercises one can already suspect that

• almost never is a summation performed over an index if that index only appears once in a product,
• almost always a summation is performed over an index that appears twice in a product,
• an index appears almost never more than twice in a product.


When one uses index notation in everyday routine, it soon becomes irritating to denote explicitly over which indices the summation is performed. From experience (see the above three points) one knows over which indices the summations are performed, so one will soon have the idea to introduce the convention that, unless explicitly stated otherwise:

• a summation is assumed over all indices that appear twice in a product, and
• no summation is assumed over indices that appear only once.

From now on we will write all our formulae in index notation with this particular convention, which is called the Einstein summation convention. For a more detailed look at index notation with the summation convention we refer to [4]. We will thus from now on rewrite

$$\sum_{\nu=1}^{n} A_{\mu\nu} v_\nu \;\rightarrow\; A_{\mu\nu} v_\nu \,, \qquad \sum_{\beta=1}^{n} \sum_{\gamma=1}^{n} A_{\alpha\beta} B_{\beta\gamma} C_{\gamma\delta} \;\rightarrow\; A_{\alpha\beta} B_{\beta\gamma} C_{\gamma\delta} \,. \qquad (1.11)$$
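In practice this convention maps directly onto, for instance, NumPy's `einsum` function, which sums over every index letter that appears twice in a product — a small sketch (assuming NumPy):

```python
# The summation convention of Eq. (1.11): np.einsum sums over every
# repeated index in a product, so the summation signs can be dropped.
import numpy as np

n = 3
A, B, C = (np.random.rand(n, n) for _ in range(3))
v = np.random.rand(n)

# A_{mu nu} v_nu  -> summation over the repeated index nu
lhs = np.einsum('mn,n->m', A, v)
assert np.allclose(lhs, A @ v)

# A_{alpha beta} B_{beta gamma} C_{gamma delta} -> sum over beta and gamma
lhs = np.einsum('ab,bc,cd->ad', A, B, C)
assert np.allclose(lhs, A @ B @ C)
```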

◮ Exercises 7 to 10 of Section C.1.


2 Bases, co- and contravariant vectors

In this chapter we introduce a new kind of vector (the ‘covector’), one that will be essential for the rest of this booklet. To get used to this new concept we will first show in an intuitive way how one can imagine this new kind of vector. After that we will follow a more mathematical approach.

2.1. Intuitive approach

We can map the space around us using a coordinate system. Let us assume that we use a linear coordinate system, so that we can use linear algebra to describe it. Physical objects (represented, for example, with an arrow-vector) can then be described in terms of the basis vectors belonging to the coordinate system (there are some hidden difficulties here, but we will ignore these for the moment). In this section we will see what happens when we choose another set of basis vectors, i.e. what happens upon a basis transformation.

In a description with coordinates we must be fully aware that the coordinates (i.e. the numbers) themselves have no meaning. Only with the corresponding basis vectors (which span the coordinate system) do these numbers acquire meaning. It is important to realize that the object one describes is independent of the coordinate system (i.e. the set of basis vectors) one chooses. Or in other words: an arrow does not change meaning when described in another coordinate system.

Let us write down such a basis transformation,

$$\begin{aligned} \vec{e}_1\,' &= a_{11} \vec{e}_1 + a_{12} \vec{e}_2 \,, \\ \vec{e}_2\,' &= a_{21} \vec{e}_1 + a_{22} \vec{e}_2 \,. \end{aligned} \qquad (2.1)$$

This could be regarded as a kind of multiplication of a ‘vector’ with a matrix, as long as we take for the components of this ‘vector’ the basis vectors. If we describe the matrix elements with words, one would get something like:

$$\begin{pmatrix} \vec{e}_1\,' \\ \vec{e}_2\,' \end{pmatrix} = \begin{pmatrix} \text{projection of } \vec{e}_1\,' \text{ onto } \vec{e}_1 & \text{projection of } \vec{e}_1\,' \text{ onto } \vec{e}_2 \\ \text{projection of } \vec{e}_2\,' \text{ onto } \vec{e}_1 & \text{projection of } \vec{e}_2\,' \text{ onto } \vec{e}_2 \end{pmatrix} \begin{pmatrix} \vec{e}_1 \\ \vec{e}_2 \end{pmatrix} \,. \qquad (2.2)$$

Note that these basis vector-columns are not vectors, but just a very useful way to write things down.

[Figure 2.1: The behaviour of the transformation of the components of a vector under the transformation of a basis vector: $\vec{e}_1\,' = \tfrac{1}{2}\vec{e}_1 \;\rightarrow\; v_1' = 2 v_1$.]

We can also look at what happens with the components of a vector if we use a different set of basis vectors. From linear algebra we know that the transformation matrix can be constructed by putting the old basis vectors expressed in the new basis in the columns of the matrix. In words,

$$\begin{pmatrix} v_1' \\ v_2' \end{pmatrix} = \begin{pmatrix} \text{projection of } \vec{e}_1 \text{ onto } \vec{e}_1\,' & \text{projection of } \vec{e}_2 \text{ onto } \vec{e}_1\,' \\ \text{projection of } \vec{e}_1 \text{ onto } \vec{e}_2\,' & \text{projection of } \vec{e}_2 \text{ onto } \vec{e}_2\,' \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \,. \qquad (2.3)$$

It is clear that the matrices of Eq. (2.2) and Eq. (2.3) are not the same.

We now want to compare the basis-transformation matrix of Eq. (2.2) with the coordinate-transformation matrix of Eq. (2.3). To do this we replace all the primed elements in the matrix of Eq. (2.3) by non-primed elements and vice versa. Comparison with the matrix in Eq. (2.2) shows that we also have to transpose the matrix. So if we call the matrix of Eq. (2.2) $\Lambda$, then Eq. (2.3) is equivalent to:

$$\vec{v}\,' = (\Lambda^{-1})^T \vec{v} \,. \qquad (2.4)$$

The normal vectors are called ‘contravariant vectors’, because they transform contrary to the basis vector columns. That there must be a different behaviour is also intuitively clear: if we describe an ‘arrow’ by coordinates, and we then modify the basis vectors, then the coordinates must clearly change in the opposite way to make sure that the coordinates times the basis vectors produce the same physical ‘arrow’ (see Fig. 2.1).
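A small numerical sketch of this opposite behaviour (assuming NumPy; the basis transformation is the one from the caption of Fig. 2.1):

```python
# Halving a basis vector doubles the corresponding component:
# components transform with (Lambda^{-1})^T, Eq. (2.4).
import numpy as np

Lam = np.diag([0.5, 1.0])        # e_1' = (1/2) e_1, e_2' = e_2
v   = np.array([0.4, 0.8])       # components in the old basis

v_prime = np.linalg.inv(Lam).T @ v
print(v_prime)                   # [0.8 0.8]: v_1 doubled, v_2 unchanged
```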

In view of these two opposite transformation properties, we could now attempt to construct objects that, contrary to normal vectors, transform the same as the basis vector columns. In the simple case in which, for example, the basis vector $\vec{e}_1\,'$ transforms into $\tfrac{1}{2}\vec{e}_1$, the coordinate of this object must then also become $\tfrac{1}{2}$ times as large. This is precisely what happens to the coordinates of a gradient of a scalar function! The reason is that such a gradient is the difference of the function per unit distance in the direction of the basis vector. When this ‘unit’ suddenly shrinks (i.e. if the basis vector shrinks) this means that the gradient must shrink too (see Fig. 2.2 for a one-dimensional example). A ‘gradient’, which we so far have always regarded as a true vector, will from now on be called a ‘covariant vector’ or ‘covector’: it transforms in the same way as the basis vector columns.

The reason that gradients have usually been treated as ordinary vectors is that, if the coordinate transformation transforms one cartesian coordinate system into the other (or in other words: one orthonormal basis into the other), then the matrices $\Lambda$ and $(\Lambda^{-1})^T$ are the same.

◮ Exercises 1, 2 of Section C.2.

As long as one transforms only between orthonormal bases, there is no difference between contravariant vectors and covariant vectors. However, it is not always possible in practice to restrict oneself to such bases. When doing vector mathematics in curved coordinate systems (like polar coordinates, for example), one is often forced to use non-orthonormal bases. And in special relativity one is fundamentally forced to distinguish between co- and contravariant vectors.

[Figure 2.2: Basis vector $\vec{e}_1\,' = \tfrac{1}{2}\vec{e}_1 \;\rightarrow\; \nabla f\,' = \tfrac{1}{2} \nabla f$.]

2.2. Mathematical approach

Now that we have a notion of the difference between the transformation of a vector and the transformation of a gradient, let us have a more mathematical look at this. Consider an $n$-dimensional manifold with coordinates $x_1, x_2, \dots, x_n$. We define the gradient of a function $f(x_1, x_2, \dots, x_n)$,

$$(\nabla f)_\mu := \frac{\partial f}{\partial x_\mu} \,. \qquad (2.5)$$

The difference in transformation will now be demonstrated using the simplest of transformations: a homogeneous linear transformation (we did this in the previous section already, since we described all transformations with matrices). In general a coordinate transformation can also be non-homogeneous linear (e.g. a translation), but we will not go into this here.

Suppose we have a vector field $\vec{v} = \vec{v}(x)$ defined on this manifold. Let us perform a homogeneous linear transformation of the coordinates:

$$x_\mu' = A_{\mu\nu} x_\nu \,. \qquad (2.6)$$

In this case not only the coordinates $x_\mu$ change (and therefore the dependence of $\vec{v}$ on the coordinates), but also the components of the vectors,

$$v_\mu'(x) = A_{\mu\nu} v_\nu(x) \,, \qquad (2.7)$$

where $A$ is the same matrix as in Eq. (2.6) (this may look trivial, but it is useful to check it! Also note that we take as transformation matrix the matrix that describes the transformation of the vector components, whereas in the previous section we took for $\Lambda$ the matrix that describes the transformation of the basis vectors. So $A$ is equal to $(\Lambda^{-1})^T$).

Now take the function $f(x_1, x_2, \dots, x_n)$ and the gradient $w_\alpha$ at a point $P$, in the following way,

$$w_\alpha = \frac{\partial f}{\partial x_\alpha} \,, \qquad (2.8)$$

and in the new coordinate system as

$$w_\alpha' = \frac{\partial f}{\partial x_\alpha'} \,. \qquad (2.9)$$


It now follows (using the chain rule) that

$$\frac{\partial f}{\partial x_1'} = \frac{\partial f}{\partial x_1} \frac{\partial x_1}{\partial x_1'} + \frac{\partial f}{\partial x_2} \frac{\partial x_2}{\partial x_1'} + \dots + \frac{\partial f}{\partial x_n} \frac{\partial x_n}{\partial x_1'} \,. \qquad (2.10)$$

That is,

$$\frac{\partial f}{\partial x_\mu'} = \left( \frac{\partial x_\nu}{\partial x_\mu'} \right) \frac{\partial f}{\partial x_\nu} \,, \qquad (2.11)$$

$$w_\mu' = \left( \frac{\partial x_\nu}{\partial x_\mu'} \right) w_\nu \,. \qquad (2.12)$$

This describes how a gradient transforms. One can regard the partial derivative $\partial x_\nu / \partial x_\mu'$ as the matrix $(A^{-1})^T$, where $A$ is defined as in Eq. (2.6). To see this we first take the inverse of Eq. (2.6):

$$x_\mu = (A^{-1})_{\mu\nu} x_\nu' \,. \qquad (2.13)$$

Now take the derivative,

$$\frac{\partial x_\mu}{\partial x_\alpha'} = \frac{\partial \left( (A^{-1})_{\mu\nu} x_\nu' \right)}{\partial x_\alpha'} = (A^{-1})_{\mu\nu} \frac{\partial x_\nu'}{\partial x_\alpha'} + \frac{\partial (A^{-1})_{\mu\nu}}{\partial x_\alpha'} x_\nu' \,. \qquad (2.14)$$

Because in this case $A$ does not depend on $x_\alpha'$ the last term on the right-hand side vanishes. Moreover, we have that

$$\frac{\partial x_\nu'}{\partial x_\alpha'} = \delta_{\nu\alpha} \,, \qquad \delta_{\nu\alpha} = \begin{cases} 1 & \text{when } \nu = \alpha \,, \\ 0 & \text{when } \nu \neq \alpha \,. \end{cases} \qquad (2.15)$$

Therefore, what remains is

$$\frac{\partial x_\mu}{\partial x_\alpha'} = (A^{-1})_{\mu\nu} \delta_{\nu\alpha} = (A^{-1})_{\mu\alpha} \,. \qquad (2.16)$$

With Eq. (2.12) this yields for the transformation of a gradient

$$w_\mu' = (A^{-1})^T_{\mu\nu} w_\nu \,. \qquad (2.17)$$

The indices are now in the correct position to put this in matrix form,

$$\vec{w}\,' = (A^{-1})^T \vec{w} \,. \qquad (2.18)$$

(We again note that the matrix $A$ used here denotes the coordinate transformation from the coordinates $x$ to the coordinates $x'$.)
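A quick numerical check of these two transformation rules (a sketch assuming NumPy; the matrix is a random invertible one):

```python
# Under x' = A x, vector components pick up A (Eq. 2.7) while gradient
# components pick up (A^{-1})^T (Eq. 2.18); their contraction is invariant.
import numpy as np
rng = np.random.default_rng(0)

A = rng.normal(size=(3, 3))          # a generic (invertible) transformation
v = rng.normal(size=3)               # contravariant components
w = rng.normal(size=3)               # gradient (covariant) components

v_p = A @ v                          # Eq. (2.7)
w_p = np.linalg.inv(A).T @ w         # Eq. (2.18)
assert np.isclose(v @ w, v_p @ w_p)  # the inner product is invariant
```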

We have shown here the difference in the transformation properties of normal vectors (‘arrows’) and gradients. Normal vectors we call from now on contravariant vectors (though we usually simply call them vectors) and gradients we call covariant vectors (or covectors or one-forms). It should be noted that not every covariant vector field can be constructed as a gradient of a scalar function. A gradient has the property that $\nabla \times (\nabla f) = 0$, while not all covector fields may have this property. To distinguish vectors from covectors we will denote vectors with an arrow ($\vec{v}$) and covectors with a tilde ($\tilde{w}$). To make further distinction between contravariant and covariant vectors we will put the contravariant indices (i.e. the indices of contravariant vectors) as superscripts and the covariant indices (i.e. the indices of covariant vectors) as subscripts,

$$y^\alpha : \text{contravariant vector} \,, \qquad w_\alpha : \text{covariant vector, or covector} \,.$$


In practice it will turn out to be very useful to also introduce this convention for matrices. Without further argumentation (this will be given later) we note that the matrix $A$ can be written as:

$$A : \; A^\mu{}_\nu \,. \qquad (2.19)$$

The transposed version of this matrix is:

$$A^T : \; A_\nu{}^\mu \,. \qquad (2.20)$$

With this convention the transformation rules for vectors resp. covectors become

$$v'^\mu = A^\mu{}_\nu v^\nu \,, \qquad (2.21)$$

$$w'_\mu = (A^{-1})^T{}_\mu{}^\nu w_\nu = (A^{-1})^\nu{}_\mu w_\nu \,. \qquad (2.22)$$

The delta $\delta$ of Eq. (2.15) also gets a matrix form,

$$\delta_{\mu\nu} \;\rightarrow\; \delta^\mu{}_\nu \,. \qquad (2.23)$$

This is called the ‘Kronecker delta’. It simply has the ‘function’ of ‘renaming’ an index:

$$\delta^\mu{}_\nu \, y^\nu = y^\mu \,. \qquad (2.24)$$


3 Introduction to tensors

Tensor calculus is a technique that can be regarded as a follow-up on linear algebra. It is a generalisation of classical linear algebra. In classical linear algebra one deals with vectors and matrices. Tensors are generalisations of vectors and matrices, as we will see in this chapter.

In Section 3.1 we will see in an example how a tensor can naturally arise. In Section 3.2 we will re-analyse the essential step of Section 3.1, to get a better understanding.

3.1. The new inner product and the first tensor

The inner product is very important in physics. Let us consider an example. In classical mechanics it is true that the ‘work’ that is done when an object is moved equals the inner product of the force acting on the object and the displacement vector $\vec{x}$,

$$W = \langle \vec{F}, \vec{x} \rangle \,. \qquad (3.1)$$

The work $W$ must of course be independent of the coordinate system in which the vectors $\vec{F}$ and $\vec{x}$ are expressed. The inner product as we know it,

$$s = \langle \vec{a}, \vec{b} \rangle = a^\mu b^\mu \qquad \text{(old definition)} \qquad (3.2)$$

does not have this property in general,

$$s' = \langle \vec{a}\,', \vec{b}\,' \rangle = A^\mu{}_\alpha a^\alpha \, A^\mu{}_\beta b^\beta = (A^T)_\alpha{}^\mu A^\mu{}_\beta \, a^\alpha b^\beta \,, \qquad (3.3)$$

where $A$ is the transformation matrix. Only if $A^{-1}$ equals $A^T$ (i.e. if we are dealing with orthonormal transformations) will $s$ not change. The matrices will then together form the Kronecker delta $\delta_{\beta\alpha}$. It appears as if the inner product only describes the physics correctly in a special kind of coordinate system: a system which according to our human perception is ‘rectangular’, and has physical units, i.e. a distance of 1 in coordinate $x^1$ means indeed 1 meter in the $x^1$-direction. An orthonormal transformation produces again such a rectangular ‘physical’ coordinate system. If one has so far always employed such special coordinates anyway, this inner product has always worked properly.

However, as we already explained in the previous chapter, it is not always guaranteed that one can use such special coordinate systems (polar coordinates are an example in which the local orthonormal basis of vectors is not the coordinate basis).


The inner product between a vector $\vec{x}$ and a covector $\tilde{y}$, however, is invariant under all transformations,

$$s = x^\mu y_\mu \,, \qquad (3.4)$$

because for all $A$ one can write

$$s' = x'^\mu y'_\mu = A^\mu{}_\alpha x^\alpha (A^{-1})^\beta{}_\mu y_\beta = (A^{-1})^\beta{}_\mu A^\mu{}_\alpha x^\alpha y_\beta = \delta^\beta{}_\alpha x^\alpha y_\beta = s \,. \qquad (3.5)$$
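Both statements are easy to verify numerically — a sketch (assuming NumPy, with an arbitrary non-orthonormal $A$):

```python
# The old inner product of Eq. (3.2) changes under a non-orthonormal
# transformation, while the vector-covector pairing of Eqs. (3.4)-(3.5)
# does not.
import numpy as np
rng = np.random.default_rng(1)

A = rng.normal(size=(3, 3))                  # generic (non-orthonormal) A
x, y = rng.normal(size=3), rng.normal(size=3)

# old definition: both factors transform contravariantly
s_old  = x @ y
s_old2 = (A @ x) @ (A @ y)
print(np.isclose(s_old, s_old2))             # False in general

# new pairing: y transforms as a covector
s_new  = x @ y
s_new2 = (A @ x) @ (np.linalg.inv(A).T @ y)
print(np.isclose(s_new, s_new2))             # True
```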

With the help of this inner product we can introduce a new inner product between two contravariant vectors which also has this invariance property. To do this we introduce a covector $w_\mu$ and define the inner product between $x^\mu$ and $y^\nu$ with respect to this covector $w_\mu$ in the following way (we will introduce a better definition later):

$$s = w_\mu w_\nu \, x^\mu y^\nu \qquad \text{(first attempt)} \qquad (3.6)$$

(Warning: later it will become clear that this definition is not quite useful, but at least it will bring us on the right track toward finding an invariant inner product between two contravariant vectors.) The inner product $s$ will now obviously transform correctly, because it is made out of two invariant ones,

$$\begin{aligned} s' &= (A^{-1})^\mu{}_\alpha w_\mu \, (A^{-1})^\nu{}_\beta w_\nu \, A^\alpha{}_\rho x^\rho \, A^\beta{}_\sigma y^\sigma \\ &= (A^{-1})^\mu{}_\alpha A^\alpha{}_\rho \, (A^{-1})^\nu{}_\beta A^\beta{}_\sigma \, w_\mu w_\nu x^\rho y^\sigma \\ &= \delta^\mu{}_\rho \delta^\nu{}_\sigma \, w_\mu w_\nu x^\rho y^\sigma = w_\mu w_\nu x^\mu y^\nu = s \,. \end{aligned} \qquad (3.7)$$

We have now produced an invariant ‘inner product’ for contravariant vectors by using a covariant vector $w_\mu$ as a measure of length. However, this covector appears twice in the formula. One can also rearrange these factors in the following way,

$$s = (w_\mu w_\nu) x^\mu y^\nu = \begin{pmatrix} x^1 & x^2 & x^3 \end{pmatrix} \begin{pmatrix} w_1 w_1 & w_1 w_2 & w_1 w_3 \\ w_2 w_1 & w_2 w_2 & w_2 w_3 \\ w_3 w_1 & w_3 w_2 & w_3 w_3 \end{pmatrix} \begin{pmatrix} y^1 \\ y^2 \\ y^3 \end{pmatrix} \,. \qquad (3.8)$$

In this way the two appearances of the covector $w$ are combined into one object: some kind of product of $w$ with itself. It is some kind of matrix, since it is a collection of numbers labeled with indices $\mu$ and $\nu$. Let us call this object $g$,

$$g = \begin{pmatrix} w_1 w_1 & w_1 w_2 & w_1 w_3 \\ w_2 w_1 & w_2 w_2 & w_2 w_3 \\ w_3 w_1 & w_3 w_2 & w_3 w_3 \end{pmatrix} = \begin{pmatrix} g_{11} & g_{12} & g_{13} \\ g_{21} & g_{22} & g_{23} \\ g_{31} & g_{32} & g_{33} \end{pmatrix} \,. \qquad (3.9)$$

Instead of using a covector $w_\mu$ in relation to which we define the inner product, we can also directly define the object $g$: that is more direct. So, we define the inner product with respect to the object $g$ as:

$$s = g_{\mu\nu} x^\mu y^\nu \qquad \text{(new definition)} \qquad (3.10)$$

Now we must make sure that the object $g$ is chosen such that our new inner product reproduces the old one if we choose an orthonormal coordinate system. So, with Eq. (3.8) in an orthonormal system one should have

$$s = g_{\mu\nu} x^\mu y^\nu = \begin{pmatrix} x^1 & x^2 & x^3 \end{pmatrix} \begin{pmatrix} g_{11} & g_{12} & g_{13} \\ g_{21} & g_{22} & g_{23} \\ g_{31} & g_{32} & g_{33} \end{pmatrix} \begin{pmatrix} y^1 \\ y^2 \\ y^3 \end{pmatrix} = x^1 y^1 + x^2 y^2 + x^3 y^3 \quad \text{in an orthonormal system!} \qquad (3.11)$$


To achieve this, $g$ must become, in an orthonormal system, something like a unit matrix:

$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{in an orthonormal system!} \qquad (3.12)$$

One can see that one cannot produce this set of numbers according to Eq. (3.9). This means that the definition of the inner product according to Eq. (3.6) has to be rejected (hence the warning that was written there). Instead we have to start directly from Eq. (3.10). We no longer regard $g$ as built out of two covectors, but regard it as a matrix-like set of numbers in itself.
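One way to see this concretely: any object of the form $w_\mu w_\nu$ is, as a matrix, an outer product and therefore has rank at most one, whereas $\mathrm{diag}(1,1,1)$ has rank three. A sketch (assuming NumPy; the covector is an arbitrary illustration):

```python
# Why Eq. (3.9) cannot reproduce Eq. (3.12): a matrix of the form
# w_mu w_nu has rank at most one, while diag(1,1,1) has rank three.
import numpy as np

w = np.array([2., -1., 3.])
g_attempt = np.outer(w, w)                 # (g)_{mu nu} = w_mu w_nu
print(np.linalg.matrix_rank(g_attempt))    # 1
print(np.linalg.matrix_rank(np.eye(3)))    # 3
```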

However, it does not have the transformation properties of a classical matrix. Remember that the matrix $A$ of the previous chapter had one index up and one index down: $A^\mu{}_\nu$, indicating that it has mixed contra- and covariant transformation properties. The new object $g_{\mu\nu}$, however, has both indices down: it transforms in both indices in a covariant way, like the $w_\mu w_\nu$ which we initially used for our inner product. This curious object, which looks like a matrix, but does not transform as one, is an example of a tensor. A matrix is also a tensor, as are vectors and covectors. Matrices, vectors and covectors are special cases of the more general class of objects called ‘tensors’. The object $g_{\mu\nu}$ is a kind of tensor that is neither a matrix nor a vector or covector. It is a new kind of object for which only tensor mathematics has a proper description.

The object $g$ is called a metric, and we will study its properties later in more detail: in Chapter 5.

With this last step we now have a complete description of the new inner product between contravariant vectors that behaves properly, in that it remains invariant under any linear transformation, and that it reproduces the old inner product when we work in an orthonormal basis. So, returning to our original problem,

$$W = \langle \vec{F}, \vec{x} \rangle = g_{\mu\nu} F^\mu x^\nu \qquad \text{(general formula)} \,. \qquad (3.13)$$

In this section we have in fact put forward two new concepts: the new inner product and the concept of a ‘tensor’. We will cover both concepts more deeply: the tensors in Section 3.2 and Chapter 4, and the inner product in Chapter 5.

3.2. Creating tensors from vectors

In the previous section we have seen how one can produce a tensor out of two covectors. In this section we will revisit this procedure from a slightly different angle.

Let us look at products between matrices and vectors, like we did in the previous chapters. One starts with an object with two indices and therefore $n^2$ components (the matrix) and an object with one index and therefore $n$ components (the vector). Together they have $n^2 + n$ components. After multiplication one is left with one object with one index and $n$ components (a vector). One has therefore lost $(n^2 + n) - n = n^2$ numbers: they are ‘summed away’. A similar story holds for the inner product between a covector and a vector. One starts with $2n$ components and ends up with one number,

$$s = x^\mu y_\mu = x^1 y_1 + x^2 y_2 + x^3 y_3 \,. \qquad (3.14)$$

Summation therefore reduces the number of components.

In standard multiplication procedures from classical linear algebra such a summation usually takes place: for matrix multiplications as well as for inner products. In index notation this is denoted with paired indices using the summation convention. However, the index notation also allows us to multiply vectors and covectors without pairing up the indices, and therefore without summation. The object one thus obtains does not have fewer components, but more:

$$s^\mu{}_\nu := x^\mu y_\nu = \begin{pmatrix} x^1 y_1 & x^1 y_2 & x^1 y_3 \\ x^2 y_1 & x^2 y_2 & x^2 y_3 \\ x^3 y_1 & x^3 y_2 & x^3 y_3 \end{pmatrix} \,. \qquad (3.15)$$

We now did not produce one number (as we would have if we had replaced $\nu$ with $\mu$ in the above formula) but instead an ordered set of numbers labelled with the indices $\mu$ and $\nu$. So if we take the example $\vec{x} = (1, 3, 5)$ and $\tilde{y} = (4, 7, 6)$, then the tensor components of $s^\mu{}_\nu$ are, for example: $s^2{}_3 = x^2 y_3 = 3 \cdot 6 = 18$ and $s^1{}_1 = x^1 y_1 = 1 \cdot 4 = 4$, and so on.
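This outer product is easy to reproduce (a sketch assuming NumPy, with the same numbers as in the text; note that Python indices start at 0 while the text counts from 1):

```python
# The outer product of Eq. (3.15) with x = (1, 3, 5) and y = (4, 7, 6).
import numpy as np

x = np.array([1., 3., 5.])
y = np.array([4., 7., 6.])

s = np.einsum('m,n->mn', x, y)   # s^mu_nu = x^mu y_nu, no summation
print(s[1, 2])                   # s^2_3 = 3 * 6 = 18 (1-based in the text)
print(s[0, 0])                   # s^1_1 = 1 * 4 = 4
```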

This is the kind of ‘tensor’ object that this booklet is about. However, this object still looks very much like a matrix, since a matrix is also nothing more or less than a set of numbers labeled with two indices. To check if this is a true matrix or something else, we need to see how it transforms,

$$s'^\alpha{}_\beta = x'^\alpha y'_\beta = A^\alpha{}_\mu x^\mu (A^{-1})^\nu{}_\beta y_\nu = A^\alpha{}_\mu (A^{-1})^\nu{}_\beta (x^\mu y_\nu) = A^\alpha{}_\mu (A^{-1})^\nu{}_\beta \, s^\mu{}_\nu \,. \qquad (3.16)$$

◮ Exercise 2 of Section C.3.

If we compare the transformation in Eq. (3.16) with that of a true matrix of exercise 2, we see that the tensor we constructed is indeed an ordinary matrix. But if instead we use two covectors,

$$t_{\mu\nu} = x_\mu y_\nu = \begin{pmatrix} x_1 y_1 & x_1 y_2 & x_1 y_3 \\ x_2 y_1 & x_2 y_2 & x_2 y_3 \\ x_3 y_1 & x_3 y_2 & x_3 y_3 \end{pmatrix} \,, \qquad (3.17)$$

then we get a tensor with different transformation properties,

$$t'_{\alpha\beta} = x'_\alpha y'_\beta = (A^{-1})^\mu{}_\alpha x_\mu (A^{-1})^\nu{}_\beta y_\nu = (A^{-1})^\mu{}_\alpha (A^{-1})^\nu{}_\beta (x_\mu y_\nu) = (A^{-1})^\mu{}_\alpha (A^{-1})^\nu{}_\beta \, t_{\mu\nu} \,. \qquad (3.18)$$

The difference with $s^\mu{}_\nu$ lies in the first matrix of the transformation equation. For $s$ it is the transformation matrix for contravariant vectors, while for $t$ it is the transformation matrix for covariant vectors. The tensor $t$ is clearly not a matrix, so we indeed created something new here. The $g$ tensor of the previous section is of the same type as $t$.

The beauty of tensors is that they can have an arbitrary number of indices. One can also produce, for instance, a tensor with 3 indices,

$$A_{\alpha\beta\gamma} = x_\alpha y_\beta z_\gamma \,. \qquad (3.19)$$

This is an ordered set of numbers labeled with three indices. It can be visualized as a kind of ‘super-matrix’ in 3 dimensions (see Fig. 3.1).

These are tensors of rank 3, as opposed to tensors of rank 0 (scalars), rank 1 (vectors and covectors) and rank 2 (matrices and the other kinds of tensors we introduced so far). We can distinguish between the contravariant rank and the covariant rank. Clearly $A_{\alpha\beta\gamma}$ is a tensor of covariant rank 3 and contravariant rank 0. Its total rank is 3. One can also produce tensors of, for instance, contravariant rank 2 and covariant rank 3 (i.e. total rank 5): $B^{\alpha\beta}{}_{\mu\nu\phi}$. A similar tensor, $C^\alpha{}_{\mu\nu\phi}{}^\beta$, is also of contravariant rank 2 and covariant rank 3. Typically, when tensor mathematics is applied, the meaning of each index has been defined beforehand: the first index means this, the second means that, etc. As long as this is well-defined, one can have co- and contravariant indices in any order. However, since it usually looks better (though this is a matter of taste) to have the contravariant indices first and the covariant ones last, usually the meaning of the indices is chosen in such a way that this is accommodated. This is just a matter of how one chooses to assign meaning to each of the indices.

[Figure 3.1: A tensor of rank 3, visualized as a $3 \times 3 \times 3$ block of components $A_{111}, A_{112}, \dots, A_{333}$.]

Again it must be clear that although a multiplication (without summation!) of $n$ vectors and $m$ covectors produces a tensor of rank $n + m$, not every tensor of rank $n + m$ can be constructed as such a product. Tensors are much more general than these simple products of vectors and covectors. It is therefore important to step away from this picture of combining vectors and covectors into a tensor, and to consider this construction as nothing more than a simple example.


4 Tensors, definitions and properties

Now that we have a first idea of what tensors are, it is time for a more formal description of these objects.

We begin with a formal definition of a tensor in Section 4.1. Then, Sections 4.2 and 4.3 give some important mathematical properties of tensors. Section 4.4 gives an alternative notation for tensors, often used in the literature. And finally, in Section 4.5, we take a somewhat different view, considering tensors as operators.

4.1. Definition of a tensor

‘Definition’ of a tensor: An $(N, M)$-tensor at a given point in space can be described by a set of numbers with $N + M$ indices which transforms, upon coordinate transformation given by the matrix $A$, in the following way:

$$t'^{\alpha_1 \dots \alpha_N}{}_{\beta_1 \dots \beta_M} = A^{\alpha_1}{}_{\mu_1} \cdots A^{\alpha_N}{}_{\mu_N} \, (A^{-1})^{\nu_1}{}_{\beta_1} \cdots (A^{-1})^{\nu_M}{}_{\beta_M} \; t^{\mu_1 \dots \mu_N}{}_{\nu_1 \dots \nu_M} \qquad (4.1)$$

An $(N, M)$-tensor in a three-dimensional manifold therefore has $3^{(N+M)}$ components. It is contravariant in $N$ components and covariant in $M$ components. Tensors of the type

$$t^\alpha{}_\beta{}^\gamma \qquad (4.2)$$

are of course not excluded, as they can be constructed from the above kind of tensors by rearrangement of indices (like the transposition of a matrix as in Eq. 2.20).

Matrices (2 indices), vectors and covectors (1 index) and scalars (0 indices) are therefore also tensors, where the latter transform as $s' = s$.
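For a $(1,1)$-tensor, the transformation rule (4.1) reduces to the familiar matrix rule $t' = A\,t\,A^{-1}$ — a numerical sketch (assuming NumPy; matrix and tensor are arbitrary):

```python
# Eq. (4.1) for an (N, M) = (1, 1) tensor: one factor of A for the
# contravariant index, one factor of A^{-1} for the covariant one.
import numpy as np
rng = np.random.default_rng(2)

A    = rng.normal(size=(3, 3))
Ainv = np.linalg.inv(A)
t    = rng.normal(size=(3, 3))               # t^mu_nu

# t'^alpha_beta = A^alpha_mu (A^{-1})^nu_beta t^mu_nu
t_prime = np.einsum('am,nb,mn->ab', A, Ainv, t)
assert np.allclose(t_prime, A @ t @ Ainv)    # the familiar matrix rule
```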

4.2. Symmetry and antisymmetry

In practice it often happens that tensors display a certain amount of symmetry, like what we know from matrices. Such symmetries have a strong effect on the properties of these tensors. Often many of these properties or even tensor equations can be derived solely on the basis of these symmetries.

A tensor $t$ is called symmetric in the indices $\mu$ and $\nu$ if the components are equal upon exchange of the index values. So, for a 2nd-rank contravariant tensor,

$$t^{\mu\nu} = t^{\nu\mu} \qquad \text{symmetric (2,0)-tensor} \,. \qquad (4.3)$$


A tensor $t$ is called anti-symmetric in the indices $\mu$ and $\nu$ if the components are equal-but-opposite upon exchange of the index values. So, for a 2nd-rank contravariant tensor,

$$t^{\mu\nu} = -t^{\nu\mu} \qquad \text{anti-symmetric (2,0)-tensor} \,. \qquad (4.4)$$

It is not useful to speak of symmetry or anti-symmetry in a pair of indices that are not of the same type (co- or contravariant). The properties of symmetry only remain invariant upon basis transformation if the indices are of the same type.
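That symmetry survives a transformation when both indices are of the same type can be checked numerically — a sketch for a $(0,2)$-tensor (assuming NumPy; matrices are arbitrary):

```python
# Symmetry as in Eqs. (4.3)-(4.4) is preserved when both indices
# transform the same way: for a (0,2)-tensor t_{mu nu}, the transformed
# t' = (A^{-1})^T t A^{-1} is again (anti)symmetric.
import numpy as np
rng = np.random.default_rng(3)

A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))
t_sym  = B + B.T                     # a symmetric (0,2)-tensor
t_anti = B - B.T                     # an antisymmetric one

Ainv = np.linalg.inv(A)
t_sym_p  = Ainv.T @ t_sym  @ Ainv    # both indices transform covariantly
t_anti_p = Ainv.T @ t_anti @ Ainv
assert np.allclose(t_sym_p,  t_sym_p.T)
assert np.allclose(t_anti_p, -t_anti_p.T)
```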

◮ Exercises 2 and 3 of Section C.4.

4.3. Contraction of indices

With tensors of at least one covariant and at least one contravariant index we can define a kind of ‘internal inner product’. In the simplest case this goes as

$$t^\alpha{}_\alpha \,, \qquad (4.5)$$

where, as usual, we sum over $\alpha$. This is the trace of the matrix $t^\alpha{}_\beta$. This construction, too, is invariant under basis transformations. In classical linear algebra this property of the trace of a matrix was also known.

◮ Exercise 3 of Section C.4.

One can also perform such a contraction of indices with tensors of higher rank, but then some uncontracted indices remain, e.g.

$$t^{\alpha\beta}{}_\alpha = v^\beta \,. \qquad (4.6)$$

In this way we can convert a tensor of type $(N, M)$ into a tensor of type $(N-1, M-1)$. Of course, this contraction procedure causes information to get lost, since after the contraction we have fewer components.

Note that contraction can only take place between one contravariant index and one covariant index. A contraction like $t^{\alpha\alpha}$ is not invariant under coordinate transformation, and we should therefore reject such constructions. In fact, the summations of the summation convention only happen over a covariant and a contravariant index, be it a contraction of a tensor (like $t^{\mu\alpha}{}_\alpha$) or an inner product between two tensors (like $t^{\mu\alpha} y_{\alpha\nu}$).
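A sketch of the invariance of the contraction $t^\alpha{}_\alpha$ under the mixed transformation of a $(1,1)$-tensor (assuming NumPy):

```python
# The contraction t^alpha_alpha of Eq. (4.5): the trace of a (1,1)-tensor
# survives the mixed transformation t' = A t A^{-1}.
import numpy as np
rng = np.random.default_rng(4)

A = rng.normal(size=(3, 3))
t = rng.normal(size=(3, 3))                       # t^alpha_beta

t_prime = A @ t @ np.linalg.inv(A)
assert np.isclose(np.einsum('aa->', t),           # t^alpha_alpha
                  np.einsum('aa->', t_prime))
```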

4.4. Tensors as geometrical objects

Vectors can be seen as columns of numbers, but also as arrows. One can perform calculations with these arrows if one regards them as linear combinations of ‘basis arrows’,

$$\vec{v} = \sum_\mu v^\mu \vec{e}_\mu \,. \qquad (4.7)$$

We would like to do something like this also with covectors and eventually, of course, with all tensors. For covectors it amounts to constructing a set of ‘unit covectors’ that serve as a basis for the covectors. We write

$$\tilde{w} = \sum_\mu w_\mu \tilde{e}^\mu \,. \qquad (4.8)$$

Note: we denote the basis vectors and basis covectors with indices, but we do not mean the components of these basis (co-)vectors (after all: in which basis would those be?). Instead we label the basis (co-)vectors themselves with these indices. This way of writing was already introduced in Section 2.1. In spite of the fact that these indices therefore have a slightly different meaning, we still use the usual summation conventions for these indices. Note that we mark the covariant basis vectors with an upper index and the contravariant basis vectors with a lower index. This may sound counter-intuitive (‘did we not decide to use upper indices for contravariant vectors?’) but this is precisely what we mean by the ‘different meaning of the indices’ here: this time they label the vectors and do not denote their components.

The next step is to express the basis of covectors in the basis of vectors. To do this, let us remember that the inner product between a vector and a covector is always invariant under transformations, independent of the chosen basis:

$$\vec{v} \cdot \tilde{w} = v^\alpha w_\alpha \,. \qquad (4.9)$$

We now substitute Eq. (4.7) and Eq. (4.8) into the left-hand side of the equation,

$$\vec{v} \cdot \tilde{w} = v^\alpha \vec{e}_\alpha \cdot w_\beta \tilde{e}^\beta \,. \qquad (4.10)$$

If Eq. (4.9) and Eq. (4.10) have to remain consistent, it automatically follows that

$$\tilde{e}^\beta \cdot \vec{e}_\alpha = \delta^\beta{}_\alpha \,. \qquad (4.11)$$

On the left-hand side one has the inner product between a covector $\tilde{e}^\beta$ and a vector $\vec{e}_\alpha$. This time the inner product is written in abstract form (i.e. not written out in components like we did before). The indices $\alpha$ and $\beta$ simply denote which of the basis covectors ($\beta$) is contracted with which of the basis vectors ($\alpha$). A geometrical representation of vectors, covectors and the geometric meaning of their inner product is given in appendix B.

Eq. (4.11) is the (implicit) definition of the co-basis in terms of the basis. This equation does not give the co-basis explicitly, but it defines it uniquely nevertheless. This co-basis is called the ‘dual basis’. By arranging the basis covectors in columns as in Section 2.1, one can show that they transform as

This co-basis is called the ‘dual basis’. By arranging the basis covectors in columns

as in Section 2.1, one can show that they transform as

˜ e

′α

= A

α

β

˜ e

β

, (4.12)

when we choose A (that was equal to (Λ

−1

)

T

) in such a way that

e

′

α

= ((A

−1

)

T

)

α

β

e

β

. (4.13)

In other words: basis-covector columns transform as vectors, in contrast to basis-vector columns, which transform as covectors.

The description of vectors and covectors as geometric objects is now complete. We can now also express tensors in such an abstract way (again we refer here to the mathematical description; a geometric graphical depiction of tensors is given in appendix B). We then express these tensors in terms of ‘basis tensors’. These can be constructed from the basis vectors and basis covectors we have constructed above,

$$t = t^{\mu\nu}{}_\rho \; \vec{e}_\mu \otimes \vec{e}_\nu \otimes \tilde{e}^\rho \,. \qquad (4.14)$$

The operator $\otimes$ is the ‘tensor outer product’. Combining tensors (in this case the basis (co-)vectors) with such an outer product means that the rank of the resulting tensor is the sum of the ranks of the combined tensors. It is the geometric (abstract) version of the ‘outer product’ we introduced in Section 3.2 to create tensors. The operator $\otimes$ is not commutative,

$$\vec{a} \otimes \vec{b} \neq \vec{b} \otimes \vec{a} \,. \qquad (4.15)$$


4.5. Tensors as operators

Let us revisit the new inner product with $\vec{v}$ a vector and $\tilde{w}$ a covector,

$$w_\alpha v^\alpha = s \,. \qquad (4.16)$$

If we drop index notation and use the usual symbols we have

$$\tilde{w} \cdot \vec{v} = s \,. \qquad (4.17)$$

This could also be written as

$$\tilde{w}(\vec{v}) = s \,, \qquad (4.18)$$

where the brackets mean that the covector $\tilde{w}$ acts on the vector $\vec{v}$. In this way, $\tilde{w}$ is an operator (a function) which ‘eats’ a vector and produces a scalar. Therefore, written in the usual way to denote maps from one set to another,

$$\tilde{w} : \mathbb{R}^n \rightarrow \mathbb{R} \,. \qquad (4.19)$$

A covector is then called a linear ‘function of direction’: the result of the operation (i.e. the resulting scalar) is linearly dependent on the input (the vector), which is a directional object. We can also regard it the other way around,

$$\vec{v}(\tilde{w}) = s \,, \qquad (4.20)$$

where the vector is now the operator and the covector the argument. To prevent confusion we write Eq. (4.19) from now on in a somewhat different way,

$$\vec{v} : \mathbb{R}^{*3} \rightarrow \mathbb{R} \,. \qquad (4.21)$$

An arbitrary tensor of rank $(N, M)$ can also be regarded in this way. The tensor takes as input a tensor of rank $(M, N)$, or the product of more than one tensor of lower rank, such that the number of contravariant indices equals $M$ and the number of covariant indices equals $N$. After contraction of all indices we obtain a scalar. Example (for a $(2,1)$-tensor),

$$t^{\alpha\beta}{}_\gamma \, a_\alpha b_\beta c^\gamma = s \,. \qquad (4.22)$$

The tensor $t$ can be regarded as a function of 2 covectors ($a$ and $b$) and 1 vector ($c$), which produces a real number (scalar). The function is linearly dependent on its input, so that we can call this a ‘multilinear function of direction’. This somewhat complex nomenclature for tensors is mainly found in older literature.

Of course the fact that tensors can be regarded to some degree as functions or operators does not mean that it is always useful to do so. In most applications tensors are considered as objects in themselves, and not as functions.


5 The metric tensor and the new inner product

In this chapter we will go deeper into the topic of the new inner product. The new inner product, and the metric tensor $g$ associated with it, is of great importance to nearly all applications of tensor mathematics in non-cartesian coordinate systems and/or curved manifolds. It allows us to produce mathematical and physical formulae that are invariant under coordinate transformations of any kind.

In Section 5.1 we will give an outline of the role that is played by the metric tensor $g$ in geometry. Section 5.2 covers the properties of this metric tensor. Finally, Section 5.3 describes the procedure of raising or lowering an index using the metric tensor.

5.1. The metric as a measuring rod

If one wants to describe a physical system, then one typically wants to do this with numbers, because they are exact and one can put them on paper. To be able to do this one must first define a coordinate system in which the measurements can be done. One could in principle choose the coordinate system in any way one likes, but often one prefers to use an orthonormal coordinate system, because this is particularly practical for measuring distances using the law of Pythagoras.

However, often it is not quite practical to use such cartesian coordinates. For systems with (nearly) cylindrical symmetry or (nearly) spherical symmetry, for instance, it is much more practical to use cylindrical or polar coordinates, even though one could use cartesian ones. In some applications it is even impossible to use orthonormal coordinates, for instance in the case of mapping the surface of the Earth. In a typical projection of the Earth with meridians vertically and parallels horizontally, the northern countries are stretched enormously and the north/south pole is a line instead of a point. Measuring distances (or the length of a vector) on such a map with Pythagoras would produce wrong results.

What one has to do to get an impression of ‘true’ distances, sizes and proportions is to draw at various places on the Earth-map a ‘unit circle’ that represents, say, 100 km in each direction (if we wish to measure distances in units of 100 km). At the equator these are presumably circles, but as one goes toward the pole they tend to become ellipses: they are circles that are stretched in the horizontal (east-west) direction. At the north pole such a ‘unit circle’ will be stretched infinitely in the horizontal direction. There is a strong relation between this unit circle and the metric $g$. In appendix B one can see that indeed a metric can be graphically represented with a unit circle (in 2-D) or a unit sphere (in 3-D).

So how does one measure the distance between two points on such a curved manifold? In the example of the Earth's projection one sees that $g$ is different at different latitudes. So if points A and B have different latitudes, there is no unique $g$ that we can use to measure the distance. Indeed, there is no objective distance that one can give. What one can do is to define a path from point A to point B and measure the distance along this path in little steps. At each step the distance travelled is so small that one can take the $g$ at the middle of the step and get a reasonably good estimate of the length $ds$ of that step. This is given by

$$ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu \qquad \text{with} \quad g_{\mu\nu} = g_{\mu\nu}(x) \,. \qquad (5.1)$$

By integrating $ds$ all the way from A to B one thus obtains the distance along this path. Perhaps the ‘true’ distance is then the distance along the path which yields the shortest distance between these points.

What we have done here is to chop the path into such small bits that the curvature of the Earth is negligible. Within each step $g$ changes so little that one can use the linear expressions for length and distance using the local metric $g$.

Let us take a more concrete example and measure distances in 3-D using polar coordinates. From geometric considerations we know that the length $dl$ of an infinitesimally small vector at position $(r, \theta, \phi)$ in polar coordinates satisfies

$$dl^2 = s = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 \,. \qquad (5.2)$$

We could also write this as

$$dl^2 = s = \begin{pmatrix} dx^1 & dx^2 & dx^3 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2\theta \end{pmatrix} \begin{pmatrix} dx^1 \\ dx^2 \\ dx^3 \end{pmatrix} = g_{\mu\nu} \, dx^\mu dx^\nu \,. \qquad (5.3)$$

All the information about the concept of ‘length’ or ‘distance’ is hidden in the metric tensor $g$. The metric tensor field $g(x)$ is usually called the ‘metric of the manifold’.
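The step-by-step procedure described above can be written down directly — a sketch (assuming NumPy) that integrates $ds$ along a quarter circle of radius 2 in the equatorial plane, where the exact answer is $\pi$:

```python
# Eq. (5.3) in practice: integrate ds along a parametrised path in polar
# coordinates (r, theta, phi) by summing ds^2 = g_{mu nu}(x) dx^mu dx^nu
# over many small steps, evaluating g at the middle of each step.
import numpy as np

def metric(r, theta):
    """g_{mu nu} of Eq. (5.3) at the point (r, theta, phi)."""
    return np.diag([1.0, r**2, r**2 * np.sin(theta)**2])

# quarter circle: r = 2, theta = pi/2, phi running from 0 to pi/2;
# expected length = 2 * (pi/2) = pi
t  = np.linspace(0.0, 1.0, 10001)
x  = np.stack([2*np.ones_like(t), np.pi/2*np.ones_like(t), np.pi/2*t], axis=1)
dx = np.diff(x, axis=0)
mid = 0.5 * (x[1:] + x[:-1])                      # evaluate g mid-step

ds = [np.sqrt(d @ metric(r, th) @ d) for (r, th, _), d in zip(mid, dx)]
print(sum(ds))                                    # ~3.14159...
```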

5.2. Properties of the metric tensor

The metric tensor $g$ has some important properties. First of all, it is symmetric. This can be seen in various ways. One of them is that one can always find a coordinate system in which, locally, the metric tensor has the shape of a unit ‘matrix’, which is symmetric. Symmetry of a tensor is conserved upon any transformation (see Section 4.2). Another way to see it is that the inner product of two vectors should not depend on the order of these vectors: $g_{\mu\nu} v^\mu w^\nu = g_{\mu\nu} v^\nu w^\mu \equiv g_{\nu\mu} v^\mu w^\nu$ (in the last step we renamed the running indices).

Every symmetric second-rank covariant tensor can be transformed into diagonal form in which the diagonal elements are either $1$, $0$ or $-1$. For the metric tensor they will then all be $1$: in an orthonormal coordinate system. In fact, an orthonormal coordinate system is defined as the coordinate system in which $g_{\mu\nu} = \mathrm{diag}(1, 1, 1)$. A normal metric can always be put into this form. However, mathematically one could also conceive a manifold that has a metric that can be brought into the following form: $g_{\mu\nu} = \mathrm{diag}(1, 1, -1)$. This produces funny results for some vectors. For instance, the vector $(0, 0, 1)$ would have an imaginary length, and the vector $(0, 1, 1)$ has length zero. These are rather pathological properties and appear to be only interesting to mathematicians. A real metric, after all, should always produce positive definite lengths for vectors unequal to the zero vector. However, as one can see in appendix A, in special relativity a metric will be introduced that can be brought to the form $\mathrm{diag}(-1, 1, 1, 1)$. For now, however, we will assume that the metric is positive definite (i.e. can be brought to the form $g_{\mu\nu} = \mathrm{diag}(1, 1, 1)$ through a coordinate transformation).

It should be noted that a metric of signature $g_{\mu\nu} = \mathrm{diag}(1, 1, -1)$ can never be brought into the form $g_{\mu\nu} = \mathrm{diag}(1, 1, 1)$ by coordinate transformation, and vice versa. This signature of the metric is therefore an invariant. Normal metrics have signature $\mathrm{diag}(1, 1, 1)$. This does not mean that they are $g_{\mu\nu} = \mathrm{diag}(1, 1, 1)$, but they can be brought into this form with a suitable coordinate transformation.

◮ Exercise 1 of Section C.5.

So what about the inner product between two covectors? Just as with the vectors we use a kind of metric tensor, but this time of contravariant nature: $g^{\mu\nu}$. The inner product then becomes

$$s = g^{\mu\nu} x_\mu y_\nu \,. \qquad (5.4)$$

The distinction between these two metric tensors is simply made by upper resp. lower indices. The properties we have derived for $g_{\mu\nu}$ of course also hold for $g^{\mu\nu}$. And both metrics are intimately related: once $g_{\mu\nu}$ is given, $g^{\mu\nu}$ can be constructed and vice versa. In a sense they are two realisations of the same metric object $g$. The easiest way to find $g^{\mu\nu}$ from $g_{\mu\nu}$ is to find a coordinate system in which $g_{\mu\nu} = \mathrm{diag}(1, 1, 1)$: then $g^{\mu\nu}$ is also equal to $\mathrm{diag}(1, 1, 1)$. But in the next section we will derive a more elegant relation between the two forms of the metric tensor.

5.3. Co versus contra

Before we got introduced to covectors we presumably always used orthonormal coordinate systems. In such coordinate systems we never encountered the difference between co- and contravariant vectors. A gradient was simply a vector, like any other kind of vector. An example is the electric field. One can regard an electric field as a gradient,

$$\tilde{E} = -\nabla V \quad \Leftrightarrow \quad E_\mu = -\frac{\partial V}{\partial x^\mu} \,. \qquad (5.5)$$

On the other hand it is clearly also an arrow-like object, related to a force,

$$\vec{E} = \frac{1}{q} \vec{F} = \frac{1}{q} m \vec{a} \quad \Leftrightarrow \quad E^\mu = \frac{1}{q} F^\mu = \frac{1}{q} m a^\mu \,. \qquad (5.6)$$

In an orthonormal system one can interchange $\vec{E}$ and $\tilde{E}$ (or equivalently $E^\mu$ and $E_\mu$) without punishment. But as soon as we go to a non-orthonormal system, we will encounter problems. Then one is suddenly forced to distinguish between the two. So the question becomes: if one has a potential field $V$, how does one obtain the contravariant $\vec{E}$? Perhaps the most obvious solution is: first produce the covariant tensor $\tilde{E}$, then transform to an orthonormal system, then switch from covector form to contravariant vector form (in this system their components are identical, at least for a positive definite metric) and then transform back to the original coordinate system. This is possible, but very time-consuming and unwieldy. A better method is to use the metric tensor:

$$E^\mu = g^{\mu\nu} E_\nu \,. \qquad (5.7)$$

This is a proper tensor product, as the contraction takes place over an upper and a lower index. The object $E^\mu$, which was created from $E_\nu$, is now contravariant. One can see that this vector must indeed be the contravariant version of $E_\nu$ by taking an orthonormal coordinate system: there $g^{\mu\nu} = \mathrm{diag}(1, 1, 1)$, i.e. $g^{\mu\nu}$ is some kind of unit ‘matrix’. This is the same as saying that in the orthonormal basis $E^\mu = E_\nu$, which is indeed true (again, only for a metric that can be diagonalised to $\mathrm{diag}(1, 1, 1)$, which for classical physics is always the case). However, in contrast to the formula $E^\mu = E_\nu$, the formula Eq. (5.7) is also valid after transformation to any coordinate system,

system,

(A

−1

)

µ

α

E

µ

= (A

−1

)

µ

α

(A

−1

)

ν

β

g

µν

A

β

σ

E

σ

= (A

−1

)

µ

α

δ

ν

σ

g

µν

E

σ

= (A

−1

)

µ

α

g

µν

E

ν

,

(5.8)

(here we introduced $\sigma$ to avoid having four $\nu$ symbols, which would conflict with the summation convention). If we multiply on both sides with $A^\alpha{}_\rho$ we obtain exactly (with the index $\rho$ of course replaced by $\mu$) Eq. (5.7).

One can of course also do the reverse,

$$E_\mu = g_{\mu\nu} E^\nu \,. \qquad (5.9)$$

If we first lower an index with $g_{\nu\rho}$, and then raise it again with $g^{\mu\nu}$, then we must arrive back at the original vector. This results in a relation between $g_{\mu\nu}$ and $g^{\mu\nu}$,

$$E^\mu = g^{\mu\nu} E_\nu = g^{\mu\nu} g_{\nu\rho} E^\rho = \delta^\mu{}_\rho E^\rho = E^\mu \quad \Rightarrow \quad g^{\alpha\nu} g_{\nu\beta} = \delta^\alpha{}_\beta \,. \qquad (5.10)$$

With this relation we have now defined the contravariant version of the metric tensor in terms of its covariant version.

We call the conversion of a contravariant index into a covariant one “lowering an index”, while the reverse is “raising an index”. The vector and its covariant version are each other's “dual”.
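A compact numerical sketch of raising and lowering (assuming NumPy), using the polar-coordinate metric of Eq. (5.3) at a fixed point; Eq. (5.10) says that $g^{\mu\nu}$ is simply the matrix inverse of $g_{\mu\nu}$:

```python
# Raising and lowering indices, Eqs. (5.7)-(5.10): with the metric of
# Eq. (5.3), g^{mu nu} is the matrix inverse of g_{mu nu}, and lowering
# followed by raising returns the original vector.
import numpy as np

r, theta = 2.0, np.pi / 3
g_lo = np.diag([1.0, r**2, r**2 * np.sin(theta)**2])   # g_{mu nu}
g_up = np.linalg.inv(g_lo)                             # g^{mu nu}, Eq. (5.10)

E_up = np.array([1.0, 0.5, -2.0])     # contravariant components E^nu
E_lo = g_lo @ E_up                    # E_mu = g_{mu nu} E^nu, Eq. (5.9)
assert np.allclose(g_up @ E_lo, E_up) # raising brings the vector back
```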


6 Tensor calculus

Now that we are familiar with the concept of ‘tensor’ we need to go a bit deeper into the calculus that one can do with these objects. This is not entirely trivial. The index notation makes it possible to write all kinds of manipulations in an easy way, but also makes it all too easy to write down expressions that have no meaning or have bad properties. We will need to investigate where these pitfalls may arise.

We will also start thinking about differentiation of tensors. Here, too, the rules of index notation are rather logical, but here, too, there are problems looming. These problems have to do with the fact that ordinary derivatives of tensors no longer transform as tensors. This is a problem that has far-reaching consequences and we will deal with it in Chapter 7. For now we will merely make a mental note of this.

At the end of this chapter we will introduce another special kind of tensor, the Levi-Civita tensor, which allows us to write down cross-products in index notation. Without this tensor the cross-product cannot be expressed in index notation.

6.1. The ‘covariance’ of equations

In chapter 3 we saw that the old inner product was not a good definition. It said that the work equals the (old) inner product between $\vec{F}$ and $\vec{x}$. However, while the work is a physical notion, and thereby invariant under coordinate transformation, the inner product was not invariant. In other words: the equation was only valid in a preferred coordinate system. After the introduction of the new inner product the equation suddenly kept its validity in any coordinate system. We call this ‘universal validity’ of the equation ‘covariance’. Be careful not to confuse this with ‘covariant vector’: it is a rather unpractical fate that the word ‘covariant’ has two meanings in tensor calculus: that of the type of index (or vector) on the one hand and the ‘universal validity’ of expressions on the other hand.

The following expressions and equations are therefore covariant,

$$x_\mu = y_\mu \,, \qquad (6.1)$$

$$x^\mu = y^\mu \,, \qquad (6.2)$$

because both sides transform in the same way. The following expression is not covariant:

$$x_\mu = y^\mu \,, \qquad (6.3)$$

because the left-hand side transforms as a covector (i.e. with $(A^{-1})^T$) and the right-hand side as a contravariant vector (i.e. with $A$). If this equation holds by chance in one coordinate system, then it will likely not hold in another coordinate system. Since we are only interested in vectors and covectors as geometric objects and not in their components in some arbitrary coordinate system, we must conclude that covectors cannot be set equal to contravariant vectors.

The same as above we can say about the following equation,

$$t_{\mu\nu} = s^\mu{}_\nu \,. \qquad (6.4)$$

This expression is also not covariant. The tensors on both sides of the equals sign are of different type.

If we choose to stay in orthonormal coordinates then we can drop the distinction between co- and contravariant vectors and we do not need to check if our expressions or equations are covariant. In that case we put all indices as subscripts, to avoid giving the impression that we distinguish between co- and contravariant vectors.

6.2. Addition of tensors

Two tensors can only be added if they have the same rank: one cannot add a vector to a matrix. Addition of two equal-rank tensors is only possible if they are of the same type. So

$$v^\mu + w_\mu \qquad (6.5)$$

is not a tensor. (The fact that the index $\mu$ occurs twice here does not mean that it needs to be summed over, since according to the summation convention one only sums over equal indices occurring in a product, not in a sum. See chapter 1.) If one adds two equal-type tensors but with unequal indices one gets a meaningless expression,

$$v^\mu + w^\nu \,. \qquad (6.6)$$

If tensors are of the same type, one can add them. One should then choose the indices to be the same,

$$x^{\mu\nu} + y^{\mu\nu} \,, \qquad (6.7)$$

or

$$x^{\alpha\beta} + y^{\alpha\beta} \,. \qquad (6.8)$$

Both the above expressions are exactly the same. We just chose $\mu$ and $\nu$ as the indices in the first expression and $\alpha$ and $\beta$ in the second. This is merely a matter of choice.

Now what about the following expression:

t^α_β^γ + r^αγ_β .   (6.9)

Both tensors are of the same rank and have the same number of upper and lower indices, but the order of the indices is different. This is a rather peculiar construction, since typically (as mentioned before) one assigns meaning to each index before one starts working with tensors. Evidently the meaning assigned to index 2 is different in the two cases: in the case of t^α_β^γ it is a covariant index, while in the case of r^αγ_β it is a contravariant index. But this expression, though highly confusing (and therefore not recommended), is not formally wrong. One can see this by assuming that r^αγ_β = u^α v^γ w_β. It is formally true that u^α v^γ w_β = u^α w_β v^γ, because multiplication is commutative. The problem is that a tensor only has meaning for the user if one can, beforehand, assign meaning to each of the indices (i.e. "the first index means this, the second index means that, ..."). This meaning gets muddled if one mixes the indices up. Fortunately, in most applications of tensors the meaning of the indices will be clear from the context. Often symmetries of the indices will help in avoiding confusion.


6.3. Tensor products

The most important property of the product between two tensors is:

The result of a product between tensors is again a tensor if in each summation

the summation takes place over one upper index and one lower index.

According to this rule, the following objects are therefore not tensors, and we will forbid them from now on:

x^µ T^µν ,   h^α_β^γ t^αβ_γ .   (6.10)

The following products are tensors,

x_µ T^µν ,   h^α_β^γ t_α^β_γ ,   (6.11)

and the following expression is a covariant expression:

t^µν = a^µρ_σ b^νσ_ρ .   (6.12)

If we nevertheless want to sum over two covariant indices or two contravariant indices, then this is possible with the use of the metric tensor. For instance, an inner product between t^µν and v^ν could look like:

w^µ = g_αβ t^µα v^β .   (6.13)

The metric tensor is then used as a kind of ‘glue’ between two indices over which

one could normally not sum because they are of the same kind.
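A minimal sketch of Eq. (6.13) in code (assuming Python with numpy; the metric and tensor components below are arbitrary example data):

import numpy as np

n = 3
g = np.eye(n)                         # metric components g_{alpha beta} (orthonormal case)
t = np.random.rand(n, n)              # components t^{mu alpha}
v = np.random.rand(n)                 # components v^beta

w = np.einsum('ab,ma,b->m', g, t, v)  # w^mu = g_{alpha beta} t^{mu alpha} v^beta
print(np.allclose(w, t @ v))          # True: with g = identity this reduces to t v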

6.4. First order derivatives: non-covariant version

Up to now we have mainly concerned ourselves with the properties of individual tensors. In differential geometry one usually uses tensor fields, where the tensor depends on the location x. One can take the derivative of such tensor fields, and these derivatives can also be denoted with index notation. We will see in Chapter 7 that in curved coordinates, or on curved manifolds, these derivatives do not behave as tensors and are not very physically meaningful. But for non-curved coordinates on a flat manifold this problem does not arise. In this section we will therefore, temporarily, avoid this problem by assuming non-curved coordinates on a flat manifold.

We start with the gradient of a scalar field, which we already know,

(grad f)_µ = ∂f/∂x^µ .   (6.14)

In a similar fashion we can now define the 'gradient' of a vector field:

(grad v)^µ_ν = ∂v^µ/∂x^ν ,   (6.15)

or of a tensor field,

(grad t)^µν_α = ∂t^µν/∂x^α .   (6.16)

As long as we employ only linear transformations, the above objects are all tensors.

Now let us introduce an often-used notation,

v^µ_,ν := ∂_ν v^µ := ∂v^µ/∂x^ν .   (6.17)


For completeness we also introduce

v^µ,ν := ∂^ν v^µ := g^νρ ∂v^µ/∂x^ρ .   (6.18)

With index notation we can also define the divergence of a vector field (again under the assumption of non-curved coordinates),

∇·v = ∂v^ρ/∂x^ρ = ∂_ρ v^ρ = v^ρ_,ρ .   (6.19)

Note, just to avoid confusion: the symbol ∇ will be used in the next chapter for a more sophisticated kind of derivative. We can also define the divergence of a tensor,

∂T^αβ/∂x^β = ∂_β T^αβ = T^αβ_,β .   (6.20)
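A quick illustration of Eq. (6.19) (a sketch assuming Python with sympy; the vector field below is an arbitrary example):

import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
coords = [x1, x2, x3]
v = [x1**2 * x2, sp.sin(x3), x1 * x3]   # example components v^mu

# the divergence as the contracted derivative v^rho_{,rho}
div_v = sum(sp.diff(v[rho], coords[rho]) for rho in range(3))
print(sp.simplify(div_v))               # 2*x1*x2 + x1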

6.5. Rot, cross-products and the permutation symbol

A useful set of numbers, used for instance in electrodynamics, is the permutation symbol ǫ. It is a set of numbers with three indices if we work in 3-dimensional space, or with n indices if we work in n-dimensional space.

The object ǫ does not transform entirely like a tensor (though almost), and it is

therefore not considered a true tensor. In fact it is a tensor density (or pseudo tensor),

which is a near cousin of the tensor family. Tensor densities transform as tensors, but

are additionally multiplied by a power of the Jacobian determinant of the transfor-

mation matrix. However, to avoid these complexities, we will in this section assume

an orthonormal basis, and forget for the moment about covariance.

The ǫ symbol is completely anti-symmetric,

ǫ_ijk = −ǫ_jik = −ǫ_ikj = −ǫ_kji .   (6.21)

If two of the three indices have the same value, then the above equation clearly yields zero. The only non-zero elements of ǫ are therefore those for which no two indices are equal. We usually define

ǫ_123 = 1 ,   (6.22)

and with Eq. (6.21) all the other non-zero elements follow: each interchange of two indices yields a factor −1. As a summary,

ǫ_ijk =   1 if ijk is an even permutation of 123,
         −1 if ijk is an odd permutation of 123,
          0 if two indices are equal.   (6.23)

This pseudo-tensor is often called the Levi-Civita pseudo-tensor.

The contraction between two epsilons yields a useful identity,

ǫ_αµν ǫ_αρσ = δ_µρ δ_νσ − δ_µσ δ_νρ .   (6.24)

From this it follows that

ǫ_αβν ǫ_αβσ = 2δ_νσ ,   (6.25)

and also

ǫ_αβγ ǫ_αβγ = 6 .   (6.26)
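These identities are easy to verify numerically. A sketch, assuming Python with numpy (the indices run from 0 to 2 instead of 1 to 3):

import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1    # even permutations
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1   # odd permutations

delta = np.eye(3)
lhs = np.einsum('amn,ars->mnrs', eps, eps)
rhs = np.einsum('mr,ns->mnrs', delta, delta) - np.einsum('ms,nr->mnrs', delta, delta)
print(np.allclose(lhs, rhs))                                       # Eq. (6.24): True
print(np.allclose(np.einsum('abn,abs->ns', eps, eps), 2 * delta))  # Eq. (6.25): True
print(np.einsum('abc,abc->', eps, eps))                            # Eq. (6.26): 6.0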


With the epsilon object we can express the cross-product between two vectors a and b,

c = a × b   →   c_i = ǫ_ijk a_j b_k .   (6.27)

For the first component of c one therefore has

c_1 = ǫ_1jk a_j b_k = ǫ_123 a_2 b_3 + ǫ_132 a_3 b_2 = a_2 b_3 − b_2 a_3 ,   (6.28)

which is indeed the familiar ﬁrst component of the cross-product between a and b.
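Eq. (6.27) can be checked directly in code (a sketch assuming Python with numpy; 0-based indices, arbitrary example vectors):

import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
c = np.einsum('ijk,j,k->i', eps, a, b)   # c_i = eps_{ijk} a_j b_k
print(np.allclose(c, np.cross(a, b)))    # True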

This notation has advantages if we, for instance, want to compute the divergence of a rotation (curl). Translated into index notation this becomes

c = ∇·(∇×a) = ∇_i (ǫ_ijk ∇_j a_k) = ǫ_ijk ∂_i ∂_j a_k ,   (6.29)

because

∇ = ( ∂/∂x_1 , ∂/∂x_2 , ∂/∂x_3 )   and therefore   ∇_i = ∂/∂x_i := ∂_i .   (6.30)

On the right-hand side of Eq. (6.29) one sees the complete contraction between a set that is symmetric in the indices i and j (namely ∂_i ∂_j) and a set that is antisymmetric in the indices i and j (namely ǫ_ijk). From the exercise below we will see that such a contraction always yields zero, so we have proven that the divergence of a rotation is zero.

◮ Exercise 4 of Section C.4.
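The fact used here, that a full contraction of a symmetric index pair with an antisymmetric one vanishes, can also be checked numerically (a sketch assuming Python with numpy; the random symmetric array stands in for ∂_i ∂_j a_k):

import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

S = np.random.rand(3, 3)
S = S + S.T                                            # symmetric in i and j
print(np.allclose(np.einsum('ijk,ij->k', eps, S), 0))  # True: the contraction vanishes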

One can also define another kind of rotation: the generalised rotation of a covector,

(rot w̃)_αβ = ∂_α w_β − ∂_β w_α .   (6.31)

This transforms as a tensor (in non-curved coordinates/space) and is anti-symmetric.

7 Covariant derivatives

One of the (many) reasons why tensor calculus is so useful is that it allows a proper description of physically meaningful derivatives of vectors. We have not mentioned this before, but taking a derivative of a vector (or of any tensor, for that matter) is a non-trivial thing if one wants to formulate it in a covariant way. In a rectangular orthonormal coordinate system this is usually not an issue. But when the coordinates are curved (as in polar coordinates, for example) it becomes a big issue. It is even more of an issue when the manifold itself is curved (as in general relativity), but that is beyond the scope of this booklet – even though the techniques we discuss in this chapter are equally applicable to curved manifolds.

We want to construct a derivative of a vector in such a way that it makes physical sense even in the case of curved coordinates. One may ask: why is there a problem? We will show this by a simple example in Section 7.1, where we will also show how one defines vectors in curved coordinates by using a local set of basis vectors derived from the coordinate system. In Section 7.2 we will introduce, without proof, the mathematical formula for the covariant derivative, and thereby introduce the so-called Christoffel symbol. In Section ?? we will return to the polar coordinates to demonstrate how this machinery works in practice.

This chapter will, however, take quite large steps. The reason is that the concepts of covariant derivatives are usually described in detail in introductory literature on general relativity and Riemannian geometry, which is in principle well beyond the scope of this booklet. However, since tensors find much of their application in problems in which differential operators are used, we wish to at least briefly address the concept of a covariant derivative. For further details we refer to the literature listed in the bibliography.

7.1. Vectors in curved coordinates

Deﬁning vectors in curvilinear coordinates is not trivial. Since the coordinates are

curved (maybe even the manifold itself is curved, like the surface of the Earth), there

is no meaningful global set of basis vectors in which one can express the vectors. If

a vector or tensor ﬁeld is given, then at each location this vector or tensor could

be decomposed into the local basis (co-)vectors. This basis may be different from

location to location, but we always assume that the changes are smooth.

In principle there are two useful sets of basis vectors one can choose:

1. A local orthonormal set of basis vectors: if the coordinates are locally perpendicular to each other, then it is possible to choose the basis vectors such that they point along these coordinates (though they are not necessarily normalized to the coordinate units).

2. A local coordinate basis: the covector basis can be seen as unit steps in each of the coordinate directions: ẽ^µ = dx^µ. The contravariant basis vectors are then e_µ = ∂/∂x^µ. Typically this coordinate basis is not orthonormal.

So far in the entire booklet we have implicitly assumed a local coordinate basis (remember that we used the words coordinate transformation on an equal footing with basis transformation). For tensor mathematics this basis is usually the easiest to use. But in realistic applications (like in Chapter ??) a local orthonormal basis has more physical meaning. In the application of Chapter ?? we will first convert everything to the coordinate basis in order to do the tensor math, and then, once the final answer is there, convert back to the local orthonormal basis, which is physically more intuitive. But since the tensor mathematics works best for the local coordinate basis, we will assume this basis for the rest of this chapter.

Now let us take the example of circular coordinates in two dimensions. We can express the usual x- and y-coordinates in terms of r and φ as x = r cos φ and y = r sin φ. In the r-direction the coordinate basis vector ∂/∂r can act as a normal basis vector. In the φ-direction, however, the coordinate basis vector ∂/∂φ has a different length at different radii; but (1/r) ∂/∂φ is again a normalized (unit) vector. The set of basis vectors (∂/∂r, ∂/∂φ) is the local coordinate basis, while the set (∂/∂r, (1/r) ∂/∂φ) is a local orthonormal basis, which happens to be parallel to the coordinates.

Now let us use the coordinate basis, and take a vector field that, at every location, when expressed in this local basis, takes the form (1, 0). This is shown in Fig. XXXX. Expressed in cartesian coordinates (for which a global basis is possible) this would be the vector field (x/√(x^2 + y^2), y/√(x^2 + y^2)). Clearly this vector field is not constant. In cartesian coordinates one would write, for instance,

∂/∂x ( x/√(x^2 + y^2) , y/√(x^2 + y^2) ) = ( y^2/(x^2 + y^2)^{3/2} , −xy/(x^2 + y^2)^{3/2} ) ,   (7.1)

which is clearly non-zero. However, if we switch to circular coordinates and take the derivative with respect to, for instance, φ, then we obtain zero,

∂/∂φ (1, 0) = (0, 0) .   (7.2)

In the circular coordinate system it looks as if the vector does not change in space,

but in reality it does. This is the problem one encounters with derivatives of tensors

in curved coordinates. Going to the local orthonormal basis does not help. One has

to deﬁne a new, ‘covariant’ form of the derivative.

7.2. The covariant derivative of a vector/tensor ﬁeld

In this section we will introduce the covariant derivative of a vector field. We will do so without proof; for a more detailed discussion we refer to the literature. We define ∂_µ to be the ordinary coordinate derivative ∂/∂x^µ, and ∇_µ to be the new covariant derivative. The covariant derivative of a vector field v^µ is defined as

∇_µ v^α = ∂_µ v^α + Γ^α_µν v^ν ,   (7.3)

where Γ^α_µν is an object called the Christoffel symbol. The Christoffel symbol is not a tensor, because it contains all the information about the curvature of the coordinate system and can therefore be transformed entirely to zero if the coordinates are straightened. Nevertheless we treat it like any ordinary tensor in terms of the index notation.

The Christoffel symbol can be computed from the metric g_µν (and its companion g^αβ) in the following way,

Γ^α_µν = (1/2) g^αβ ( ∂g_βν/∂x^µ + ∂g_βµ/∂x^ν − ∂g_µν/∂x^β ) ≡ (1/2) g^αβ ( g_βν,µ + g_βµ,ν − g_µν,β ) .   (7.4)
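As a sketch of how Eq. (7.4) works in practice (assuming Python with sympy; the function below is a direct transcription of Eq. (7.4), not code from the text), one can compute the Christoffel symbols for the circular coordinates of Section 7.1, where g_µν = diag(1, r^2), and then apply Eq. (7.3) to the field (1, 0) of that section:

import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
coords = [r, phi]
g = sp.Matrix([[1, 0], [0, r**2]])   # metric g_{mu nu} of circular coordinates
ginv = g.inv()                       # its companion g^{alpha beta}

def christoffel(a, m, n):
    # Eq. (7.4): Gamma^a_{mn} = (1/2) g^{ab} (g_{bn,m} + g_{bm,n} - g_{mn,b})
    return sp.Rational(1, 2) * sum(
        ginv[a, b] * (sp.diff(g[b, n], coords[m])
                      + sp.diff(g[b, m], coords[n])
                      - sp.diff(g[m, n], coords[b]))
        for b in range(2))

Gamma = [[[sp.simplify(christoffel(a, m, n)) for n in range(2)]
          for m in range(2)] for a in range(2)]
print(Gamma[0][1][1])   # -r   (Gamma^r_{phi phi})
print(Gamma[1][0][1])   # 1/r  (Gamma^phi_{r phi})

# Eq. (7.3) applied to the field v = (1, 0) of Section 7.1:
v = [sp.Integer(1), sp.Integer(0)]
nabla_v = [[sp.simplify(sp.diff(v[a], coords[m])
                        + sum(Gamma[a][m][n] * v[n] for n in range(2)))
            for a in range(2)] for m in range(2)]
print(nabla_v)          # [[0, 0], [0, 1/r]] -> the field really does change with phi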

The covariant derivative of a covector can also be defined with this symbol,

∇_µ w_α = ∂_µ w_α − Γ^ν_µα w_ν .   (7.5)

The covariant derivative of a tensor t^αβ is then

∇_µ t^αβ = ∂_µ t^αβ + Γ^α_µσ t^σβ + Γ^β_µσ t^ασ ,   (7.6)

and of a tensor t^α_β,

∇_µ t^α_β = ∂_µ t^α_β + Γ^α_µσ t^σ_β − Γ^σ_µβ t^α_σ .   (7.7)

◮ NOW MAKE EXERCISES TO SHOW WHAT HAPPENS WITH THE ABOVE EXAMPLE

From the exercise we can see that the above recipe indeed gives the correct an-

swer even in curved coordinates. We can also prove that the covariant derivative of

the metric itself is always zero.

◮ ANOTHER EXERCISE

The covariant derivative ∇_µ produces, as its name says, covariant expressions. Therefore ∇_α t^µν_γ is a perfectly valid tensor. We can also contract indices: ∇_α t^αν_γ, or with help of the metric: g^αγ ∇_α t^µν_γ. Since ∇_µ g_αβ = 0 (as we saw in the above exercise) we can always bring g_αβ and/or g^αβ inside or outside the ∇_µ operator. We can therefore write

g^αγ ∇_α t^µν_γ = ∇_α (t^µν_γ g^αγ) = ∇_α t^µνα .   (7.8)

We can also define:

∇^α = g^αβ ∇_β .   (7.9)

A Tensors in special relativity

Although tensors are mostly useful in general relativity, they are also quite conve-

nient in special relativity. Moreover, when we express the formulas using tensors,

they expose the structure of the theory much better than the formulas without ten-

sors do. We will start here at the point when four-vectors are introduced into special

relativity.

From the simple thought experiments with moving trains, meant to give a visual

understanding of the Lorentz transformations, it has become clear that 3-dimensional

space and 1-dimensional time are not separate entities, but rather should be seen as

one. In matrix form one found for a pure Lorentz transformation

( x^0′ )   (  γ   −γβ   0   0 ) ( x^0 )
( x^1′ ) = ( −γβ    γ   0   0 ) ( x^1 )
( x^2′ )   (  0     0   1   0 ) ( x^2 )
( x^3′ )   (  0     0   0   1 ) ( x^3 ) ,   (1.1)

with γ = 1/√(1 − v^2/c^2) and β = v/c. Moreover, we found that the inner product now satisfies the somewhat strange formula

⟨a, b⟩ = −a^0 b^0 + a^1 b^1 + a^2 b^2 + a^3 b^3 .   (1.2)

With the knowledge which we now have about the 'metric tensor', this formula can be understood much better. Namely, we know that the inner product of two vectors does not really exist (at least not in a form which is independent of the coordinate system). Rather, we should be talking about the inner product of a vector with a covector. We can obtain one from the other by raising or lowering an index, making use of the metric tensor,

a_α = g_αβ a^β .   (1.3)

Eq. (1.2) should thus really be viewed as

⟨a, b̃⟩_g = a^µ b_µ = g_µν a^µ b^ν ,   (1.4)

where b̃ is a covector. From the fact that inner products such as (1.2) are invariant under Lorentz transformations we now find the components of the metric tensor,

        ( −1   0   0   0 )
g_µν =  (  0   1   0   0 )
        (  0   0   1   0 )
        (  0   0   0   1 ) .   (1.5)


Minkowski space, which is the space in which the rules of special relativity hold, is

thus a space which has a metric which is different from the metric in an ordinary

Euclidean space. In order to do computations in this space, we do not have to in-

troduce any strange inner products. Rather, we simply apply our new knowledge

about inner products, which automatically leads us to the modiﬁed metric tensor.
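A small numerical check (assuming Python with numpy) that the boost (1.1) indeed leaves the inner product (1.4), built with the metric (1.5), invariant; the velocity and vectors are arbitrary examples:

import numpy as np

v = 0.6                                   # example velocity, in units with c = 1
gamma = 1.0 / np.sqrt(1.0 - v**2)
L = np.array([[ gamma,   -gamma*v, 0, 0],
              [-gamma*v,  gamma,   0, 0],
              [ 0,        0,       1, 0],
              [ 0,        0,       0, 1]])  # the pure Lorentz transformation (1.1)
g = np.diag([-1.0, 1.0, 1.0, 1.0])          # the Minkowski metric (1.5)

print(np.allclose(L.T @ g @ L, g))          # True: the boost preserves the metric

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([5.0, 6.0, 7.0, 8.0])
print(a @ g @ b, (L @ a) @ g @ (L @ b))     # the same number in both frames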

Let us end with a small related remark. We have seen in Chapter 5 that the metric can also be used to describe curved spaces. The Minkowski metric which we have encountered in the present section can likewise be generalised to describe curved space-time. This concept is at the basis of the general theory of relativity, in which gravity is explained as a consequence of the curvature of space-time. A good introduction to this theory is given in [5].

B Geometrical representation

The tensors as we have seen them in the main text may appear rather abstract. How-

ever, it is possible to visualise their meaning quite explicitly, in terms of geometrical

objects. A simple example of this is the arrow-vector. An arrow-vector is effectively

just a set of numbers that transforms in a particular way under a basis transforma-

tion. However, it is useful to think about it in terms of an arrow, so as to visualise

its meaning.

We can make a similar geometrical visualisation of a covector, as well as for all

other tensors. The present appendix contains a list of these geometrical representa-

tions. As we go along we will also describe how various manipulations with tensors

(such as additions and inner products) can be translated to this geometrical picture.

None of this is necessary in order to be able to compute with tensors, but it may

help in understanding such computations.

• arrow-vector
An arrow-vector is represented by (how else) an arrow in an n-dimensional space. The components of the vector (i.e. the numbers v^µ) are obtained by determining the projections of the arrow onto the axes. See figure B.1.

Figure B.1: Geometrical representation of a vector, and the determination of its components by projection onto the axes.

• co-vector
A co-vector is closely related to the gradient of a function f(x), so it makes sense to look for a geometrical representation which is somehow connected to gradients. A useful related concept is that of contour lines (curves of equal height, in two dimensions) or contour surfaces (in three dimensions). Now consider two consecutive contour lines. The intersection they make with a neighbouring x^2 = constant line is equal to the increase of x^1 necessary to increase f(x) by 1, if the other coordinate is kept constant. So in two dimensions,

intersection at x^1-axis = ( ∂x^1/∂f )_{x^2 constant} = ( ∂f/∂x^1 )^{−1}_{x^2 constant} .   (2.1)

Therefore, if we take the inverse of this intersection, we again obtain the first component of the gradient.

Given the above, it makes sense to use, for the generic representation of a co-vector, two consecutive lines (or, in three dimensions, planes), such that the intersections with the axes are equal to the components of the co-vector. Note that one should also keep track of the order of the lines or planes. In the representation sketched in figure B.2 we have drawn a flexible type of arrow. This is not an arrow-vector as above, but is only meant to indicate the order of the surfaces. We could equally well have labelled the surfaces with different numbers or colours.

Figure B.2: Geometrical representation of a co-vector, and the determination of its components by computing the inverse of the intersection with the axes.

• anti-symmetric contravariant second-order tensor t^µν
An anti-symmetric contravariant second-order tensor can be made from two vectors according to

t^µν = v^µ w^ν − w^µ v^ν .   (2.2)

This is a bit like an outer product. The order of the two vectors is clearly of relevance. We can visualise this object as a surface with a given area and orientation. The components of the tensor can be obtained by measuring the projection of the surface onto the various coordinate surfaces. A negative component is obtained when the orientation of the surface associated to the tensor is negative. Let us illustrate this with an example: if the area of the projection onto the x^1, x^2 surface is equal to 2, then the t^12 component equals 2 and the t^21 component equals −2. See figures B.3 and B.4.

Figure B.3: Representation of an anti-symmetric contravariant 2nd-order tensor.

• anti-symmetric covariant second-order tensor t_µν
This object can again be seen as a kind of outer product of two vectors, or rather two co-vectors,

t_µν = v_µ w_ν − w_µ v_ν .   (2.3)

Figure B.4: Determination of the components of an anti-symmetric contravariant second-order tensor.

We can visualise this object as a kind of tube with an orientation. The compo-

nents can be found by computing the intersections of this tube with all coor-

dinate surfaces, and inverting these numbers. Again one has to keep track of

the orientation of the intersections. See ﬁgure B.5.

Figure B.5: Geometrical representation of an anti-symmetric covariant second-order tensor.

• addition of vectors
Addition of vectors is of course a well-known procedure: simply translate one arrow along the other until its tail reaches the other's tip; the translated arrow then points to the tip of the sum-arrow. See figure B.6.

Figure B.6: Geometrical representation of the addition of vectors.

• addition of co-vectors
The addition of co-vectors is somewhat more complicated. Instead of trying to describe this in words, it is easier to simply give the visual representation of this addition. See figure B.7 for a two-dimensional representation.

• inner product between a vector and a co-vector
The inner product of a vector and a co-vector is the ratio of the length of the vector to the length of the piece of the vector cut off by the two surfaces of the co-vector. See figure B.8 for a three-dimensional representation.

Figure B.7: Geometrical representation of the addition of co-vectors.

Figure B.8: Representation of the inner product between a vector and a co-vector.

• second-order symmetric covariant tensor g_µν
The metric tensor belongs to this class. It therefore makes sense to look for a representation which is related to the inner product. A suitable one is to consider the set of all points which have a unit distance to a given point x_0,

{ x ∈ R^3 | g_µν x^µ x^ν = 1 } .   (2.4)

If the metric is position-dependent, g = g(x), then the point x_0 is the point at which the corresponding metric g_µν is evaluated. Now it is also clear what the connection is to the ellipses of chapter 5. See figure B.9 for a three-dimensional representation: an ellipsoid.

Figure B.9: Geometrical representation of a second-order symmetric covariant tensor.

It is considerably more complicated to make a geometrical representation of a second-order symmetric contravariant tensor, so we will omit that case.

• turning an arrow-vector into a co-vector
The procedure of converting a vector to a co-vector can also be illustrated geometrically; see figure B.10 for a two-dimensional representation. The two long slanted lines are tangent to the circle, and the bottom horizontal line goes through the center of the circle. This representation fails when the arrow-vector is shorter than the radius of the circle.

Figure B.10: Geometrical representation of turning an arrow-vector into a co-vector.

C Exercises

C.1. Index notation

1. A, B and C are matrices. Assume that

A = BC

Write out this matrix multiplication using index notation.

2. A and B are matrices and x is a position vector. Show that

∑_{ν=1}^{n} A_µν ( ∑_{α=1}^{n} B_να x_α ) = ∑_{ν=1}^{n} ∑_{α=1}^{n} A_µν B_να x_α
                                          = ∑_{α=1}^{n} ∑_{ν=1}^{n} A_µν B_να x_α
                                          = ∑_{α=1}^{n} ( ∑_{ν=1}^{n} A_µν B_να ) x_α .

3. Which of the following statements are true?

(a) The summation signs in an expression can always be moved to the far left, without changing the meaning of the expression.
(b) If all summation signs are on the far left of an expression, you can exchange their order without changing the meaning of the expression.
(c) If all summation signs are on the far left of an expression, you cannot just change the order of the variables in the expression, because this changes the order in which the matrices are multiplied, and generically AB ≠ BA.
(d) A_µν = (A^T)_νµ
(e) A_µν = (A^T)_µν

4. A, B, C, D and E are matrices. Write out the following matrix multiplications using index notation (with all summation signs grouped together).

(a) A = B(C + D)
(b) A = BCD
(c) A = BCDE

5. Assume you have three vectors x, y and z which satisfy the following relations,

y = Bx ,
z = Ay .

Write these relations using index notation. Now write down the relation between z and x using index notation.

6. Write in matrix form:

D_βν = ∑_{µ=1}^{n} ∑_{α=1}^{n} A_µν B_αµ C_αβ .

7. Try a couple of the previous exercises by making use of the summation con-

vention.

8. Write as a matrix multiplication:

(a) D_αβ = A_αµ B_µν C_βν
(b) D_αβ = A_αµ B_βγ C_µγ
(c) D_αβ = A_αγ ( B_γβ + C_γβ )

9. Consider a vector field in an n-dimensional space, v(x). We perform a coordinate transformation,

x′ = Ax ,

where A is a transformation matrix. Show that

v′ = Av ,

where the matrix A is the same matrix as in the first equation.

10. For a transformation we have

x′ = Ax .

This corresponds to

x′_µ = ∑_{ν=1}^{n} A_µν x_ν .

Can you understand the expression

∑_{ν=1}^{n} x_ν A_µν = ?

And how can you construct the matrix multiplication equivalent of

∑_{µ=1}^{n} x_µ A_µν = ?

(note the position of the indices).


C.2. Co-vectors

1. The matrix for a rotation over an angle φ in the x-y plane is given by

Λ = (  cos φ   sin φ )
    ( −sin φ   cos φ ) .

Compute the inverse of this matrix (either by replacing φ by −φ or by matrix inversion). Show that (Λ^−1)^T is equal to Λ.

2. In Cartesian coordinate systems the basis vectors are orthogonal to each other,

e_1 · e_2 = 0 .

(This is a somewhat misleading notation. You might be tempted to interpret this as the inner product of two basis vectors, in which the components of the basis vectors are expressed in the basis itself. That would of course always yield a trivial zero.)

If we transform from one such Cartesian coordinate system to another one,

e_1′ = Λ e_1   and   e_2′ = Λ e_2 ,

(where Λ is the transformation matrix) then this relation should of course continue to hold,

e_1′ · e_2′ = 0 .

Insert the transformation rule given above and show that, for transformations between two Cartesian systems, the transformation matrix Λ is equal to (Λ^−1)^T. These are thus orthogonal transformations.

C.3. Introduction to tensors

1. With the help of various mathematical operators you can make new tensors out of old ones. For instance

w^µ = t^µν v_ν ,

in which t^µν and v_ν are tensors.

(a) Show that in this case w^µ is also a tensor (that is, show that it transforms correctly under a basis transformation).
(b) Show that w^µ is not a tensor if we make it according to

w^µ = t^µν v^ν .

2. A matrix can be viewed as an object which transforms one vector into another one. We have seen that this can be a normal transformation or a basis transformation. The question is now: does a normal transformation matrix itself transform upon a change of basis? And how? Write this out in normal matrix notation, with S the matrix which transforms from the original system to the primed system, and A the matrix which is to be transformed.

Hint:

y = Ax ,

and after a basis (or coordinate) transformation

Sy = S(Ax) ,

but there is also a matrix A′ which satisfies

Sy = A′(Sx) .


3. Show that, if you start with a number of tensors and construct a new object

out of them, and use the summation convention (that is, only sum over one

upper and one lower index, never over two indices at the same position), the

new object is again a tensor.

C.4. Tensors, general

1. You have seen that there are four types of 2nd-order tensors. How many 3rd

order tensors are there?

2. Take an anti-symmetric tensor t_µν. Show that the property of anti-symmetry is preserved under a basis transformation. Do the same thing for a symmetric tensor. This shows that symmetry and anti-symmetry are fundamental properties of tensors.

3. Show, by writing out the transformation of a (1, 1)-tensor, that it makes no sense to speak of symmetry or anti-symmetry of a tensor in indices that are not of the same kind. (That is, show that this property is usually lost after a transformation.)

4. Given a second-order contravariant symmetric tensor t and a second-order covariant antisymmetric tensor r, show that the double contraction t^αβ r_αβ always equals zero.

5. The following tensor is defined from two vectors:

t^µν = v^µ w^ν − w^µ v^ν .

Show that it is antisymmetric.

6. How does a Kronecker delta transform? The Kronecker delta is symmetric; why is the property of symmetry preserved here, while it was proven above that symmetry in an upper and a lower index is not preserved?

7. In exercise 2 of section C.3 we have seen how a matrix A transforms. Use this to show how the matrix A^T transforms. Use the two equations thus found to show that the object g from the text transforms neither as a matrix nor as the transpose of a matrix.

8. Take two tensors, s^µν and t_µν, and construct a product of the two that

(a) has no free indices.
(b) has two free indices.
(c) has four free indices.

9. How does a 3rd-order fully covariant tensor transform? Can this be written in matrix form?

10. One can specify a non-linear coordinate transformation by expressing the new coordinates as functions of the old coordinates:

x′^1 = f′^1(x^1, x^2, ..., x^n)
 ⋮
x′^n = f′^n(x^1, x^2, ..., x^n)

These functions can now be Taylor-expanded in their n coordinates around the point 0. Make such a Taylor expansion.

Hint: in deriving the Taylor series of an ordinary function one starts from a simple power series,

f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + ...

By repeatedly differentiating this equation one can determine the coefficients a_0, ..., a_n. Do something similar in this case.

C.5. Metric tensor

1. Show that it is impossible to transform a metric tensor that has both 1's and −1's on its main diagonal into a form with only 1's on the diagonal.

2. The new inner product, with the metric tensor, is invariant under coordinate transformations:

g′_µν x′^µ x′^ν = g_µν x^µ x^ν .

In the treatment of Special Relativity, however, we also often encounter the following equation:

g_µν x′^µ x′^ν = g_µν x^µ x^ν

(i.e. without a prime on g). Explain why we may use this equation in all cases where g is the Minkowski metric and we work exclusively in Lorentz frames.

C.6. Tensor calculus

1. Prove that the following equations are covariant:

z_αβ = x_βα ,
z_αβ = x_αβ + y_βα .

2. Suppose you have carried out two calculations,

x_α = . . .   and   y_β = . . .

How can you add these two vectors, even though the indices of x and y are different?

3. Why is it not necessary, in

(grad v)^µ_ν = ∂v^µ/∂x^ν ,

to put the indices (on the left-hand side of the equation) in a definite order?

Bibliography

[1] Bäuerle G. G. A. (1981) Syllabus 'Algemene Relativiteitstheorie I' (UvA)
[2] Dullemond C. (1986) Syllabus 'Algemene Relativiteitstheorie' (KUN)
[3] Kay D. C. (1988) 'Tensor Calculus' (Schaum's Outline Series) (McGraw-Hill)
[4] Schouten J. A. (1951) 'Tensor Analysis for Physicists' (Oxford)
[5] Schutz B. F. (1985) 'A First Course in General Relativity' (Cambridge)
[6] Takens R. J. (1985) Syllabus 'Hemelmechanica' (UvA)


2 Creating tensors from vectors 20 .3.

2 and 4. Tensors of the type tα β γ (4.. in the literature often used.. (4.α N β1 . We begin with a formal deﬁnition of a tensor in Section 4. 4.4 gives an alternative. ( A−1 )νM β M tµ1 . we take a somewhat different view. Such symmetries have a strong effect on the properties of these tensors. So.2) are of course not excluded. Deﬁnition of a tensor ‘Deﬁnition’ of a tensor: An ( N.. . . where the latter transforms as s′ = s. 2.0)-tensor . notation of tensors. upon coordinate transformation given by the matrix A. It is contravariant in N components and covariant in M components. Symmetry and antisymmetry In practice it often happens that tensors display a certain amount of symmetry. for a 2nd rank contravariant tensor.1. M )-tensor in a three-dimensional manifold therefore has 3( N + M) components. A tensor t is called symmetric in the indices µ and ν if the components are equal upon exchange of the index-values.νM (4.5. as they can be constructed from the above kind of tensors by rearrangement of indices (like the transposition of a matrix as in Eq. Sections 4. Then.1) An ( N.β M = Aα1 µ1 . in Section 4.. deﬁnitions and properties 4 Now that we have a ﬁrst idea of what tensors are. Often many of these properties or even tensor equations can be derived solely on the basis of these symmetries.2. M )-tensor at a given point in space can be described by a set of numbers with N + M indices which transforms. Aα N µ N ( A−1 )ν1 β1 . in the following way: t′α1 ..20).3 give some important mathematical properties of tensors.. 4. Matrices (2 indices). like what we know from matrices. Section 4.µ N ν1 . vectors and covectors (1 index) and scalars (0 indices) are therefore also tensors. tµν = tνµ symmetric (2...Tensors. considering tensors as operators. it is time for a more formal description of these objects.3) 21 . . And ﬁnally. .1.

Of course.0)-tensor .8) Note: we denote the basis vectors and basis covectors with indices. (4. 4. Contraction of indices With tensors of at least one covariant and at least one contravariant index we can deﬁne a kind of ‘internal inner product’. we sum over α. since after the contraction we have fewer components. e.4 Tensors as geometrical objects A tensor t is called anti-symmetric in the indices µ and ν if the components are equalbut-opposite upon exchange of the index-values. (4.4. A contraction like tαα is not invariant under coordinate transformation.g.4. but also as arrows.4. tµν = −tνµ anti-symmetric (2. as usual. (4. One can perform calculations with these arrows if one regards them as linear combinations of ‘basis arrows’. (4.or contravariant). Also this construction is invariant under basis transformations. ◮ Exercise 3 of Section C. but we do not mean the components of these basis (co-)vectors (after all: in which basis would that 22 . One can also perform such a contraction of indices with tensors of higher rank. The properties of symmetry only remain invariant upon basis transformation if the indices are of the same type. be it a contraction of a tensor (like tµα α ) or an inner product between two tensors (like tµα yαν ).7) µ We would like to do something like this also with covectors and eventually of course with all tensors. tαβ α = v β . So. Tensors as geometrical objects Vectors can be seen as columns of numbers. for a 2nd rank contravariant tensor. the summations of the summation convention only happen over a covariant and a contravariant index. but then some uncontracted indices remain. ◮ Exercises 2 and 3 of Section C. 4. In the simplest case this goes as. µ (4. In fact. For covectors it amounts to constructing a set of ‘unit covectors’ that serve as a basis for the covectors.3. Note that contraction can only take place between one contravariant index and one covariant index. this contraction procedure causes information to get lost.6) In this way we can convert a tensor of type ( N.4) It is not useful to speak of symmetry or anti-symmetry in a pair of indices that are not of the same type (co.5) where.4. We write ˜ ˜ w = ∑ wµ eµ . M − 1). and we should therefore reject such constructions. v = ∑ v µ eµ . In classical linear algebra this property of the trace of a matrix was also known. This is the trace of the matrix tα β . tα α . M ) into a tensor of type ( N − 1.

To do this. when we choose A (that was equal to (Λ−1 ) ) in such a way that e′ = (( A−1 ) T )α e β . (4. The indices α and β simply denote which of the basis covectors (β) is contracted with which of the basis vectors (α). This time the inner product is written in abstract form (i. but it deﬁnes it nevertheless uniquely. (4. In spite of the fact that these indices have therefore a slightly different meaning. Note that we mark the co-variant basis vectors with an upper index and the contra-variant basis-vectors with a lower index. This way of writing was already introduced in Section 2.9) and Eq.2 to create tensors. in contrast to basisvector-columns. This may sound counter-intuitive (‘did we not decide to use upper indices for contra-variant vectors?’) but this is precisely what we mean with the ‘different meaning of the indices’ here: this time they label the vectors and do not denote their components. By arranging the basis covectors in columns as in Section 2. not written out in components like we did before). one can show that they transform as ˜ ˜ e′α = Aα β e β . (4. This equation does not give the co-basis explicitly.10) have to remain consistent. We then express these tensors in terms of ‘basis tensors’. (4. Combining tensors (in this case the basis (co-)vectors) with such an outer product means that the rank of the resulting tensor is the sum of the ranks of the combined tensors. (4.14) The operator ⊗ is the ‘tensor outer product’. (4. This co-basis is called the ‘dual basis’. The operator ⊗ is not commutative. α β T (4.15) 23 . These can be constructed from the basis vectors and basis covectors we have constructed above. Eq.4 Tensors as geometrical objects be?).11) is the (implicit) deﬁnition of the co-basis in terms of the basis.12) (4. A geometrical representation of vectors. (4.8) into the left hand side of the equation.9) We now substitute Eq. The description of vectors and covectors as geometric objects is now complete. we still use the usual conventions of summation for these indices. ˜ e β · eα = δ β α .e. It is the geometric (abstract) version of the ‘outer product’ we introduced in Section 3. We can now also express tensors in such an abstract way (again here we refer to the mathematical description. Instead we label the basis (co-)vectors themselves with these indices. (4.10) If Eq. which transform as covectors. (4. covectors and the geometric meaning of their inner product is given in appendix B. The next step is to express the basis of covectors in the basis of vectors. a⊗b = b⊗a.7) and Eq.1.13) In other words: basis-covector-columns transform as vectors. let us remember that the inner product between a vector and a covector is always invariant under transformations.1. it automatically follows that (4.4.11) ˜ On the left hand side one has the inner product between a covector e β and a vector eα . a geometric graphical depiction of tensors is given in appendix B). ˜ t = tµν ρ eµ ⊗ eν ⊗ eρ . independent of the chosen basis: ˜ v · w = vα wα ˜ ˜ v · w = v α eα · w β e β .

After contraction of all indices we obtain a scalar. In most applications tensors are considered as objects themselves. wα vα = s . w is an operator (a function) which ‘eats’ a vector and produces a scalar. We can also regard it the other way around. 24 . The tensor has as input a tensor of rank ( M. (4. Of course the fact that tensors can be regarded to some degree as functions or operators does not mean that it is always useful to do so. This could also be written as ˜ w(v) = s . or the product of more than one tensors of lower rank. (4. ˜ v(w) = s .21) An arbitrary tensor of rank ( N. ˜ wv = s .5.e. which is a directional object.19) (4.16) A covector is then called a linear ‘function of direction’: the result of the operation (i. (4. To prevent confusion we write Eq. 1)-tensor). (4.17) (4. written in the usual way to denote maps from one set to another.4.22) The tensor t can be regarded as a function of 2 covectors (a an b) and 1 vector (c). The function is linearly dependent on its input. M ) can also be regarded in this way. Example (for a (2. the resulting scalar) is linearly dependent on the input (the vector). N ).20) where the vector is now the operator and the covector the argument. (4. Tensors as operators Let us revisit the new inner product with v a vector and w a covector. v: R∗ 3 → R . ˜ w: Rn → R . Therefore.5 Tensors as operators 4. which produces a real number (scalar). tαβ γ aα b β cγ = s . so that we can call this a ‘multilinear function of direction’.18) ˜ ˜ where the brackets mean that the covector w acts on the vector v. and not as functions. In this way. This somewhat complex nomenclature for tensors is mainly found in older literature.19) from now on in a somewhat different way. (4. such that the number of contravariant indices equals M and the number of covariant indices equals N. If we drop index notation and we use the usual symbols we have.

The metric tensor and the new inner product 5 In this chapter we will go deeper into the topic of the new inner product. say. At the equator these are presumably circles.1 we will give an outline of the role that is played by the metric tensor g in geometry. To be able to do this one must ﬁrst deﬁne a coordinate system in which the measurements can be done. then one typically wants to do this with numbers. What one has to do to get an impression of ‘true’ distances. Measuring distances (or the length of a vector) on such a map with Pythagoras would produce wrong results. even though one could use cartesian ones. Section 5. 5. However. It allows us to produce mathematical and physical formulae that are invariant under coordinate transformations of any kind. Section 5. is of great importance to nearly all applications of tensor mathematics in non-cartesian coordinate systems and/or curved manifolds. In some applications it is even impossible to use orthonormal coordinates. sizes and proportions is to draw on various places on the Earth-map a ‘unit circle’ that represents. but as one goes toward the pole they tend to become ellipses: they are circles that are stretched in horizonal (east-west) direction. because they are exact and one can put them on paper. At the north pole such a ‘unit circle’ will be stretched inﬁnitely in horizontal direction. In Section 5.2 covers the properties of this metric tensor. 100 km in each direction (if we wish to measure distances in units of 100 km). often it is not quite practical to use such cartesian coordinates. In a typical projection of the Earth with meridians vertically and parallels horizontally. One could in principle take the coordinate system in any way one likes. The new inner product. for instance.3 describes the procedure of raising or lowering an index using the metric tensor. for instance in the case of mapping of the surface of the Earth. the northern countries are stretched enormously and the north/south pole is a line instead of a point. but often one prefers to use an orthonormal coordinate system because this is particularly practical for measuring distances using the law of Pythagoras. and the metric tensor g associated with it. In appendix B one can see that indeed a metric can be graphically represented with 25 . it is much more practical to use cylindrical or polar coordinates. Finally. For systems with (nearly) cylindrical symmetry or (nearly) spherical symmetry. There is a strong relation between this unit circle and the metric g. The metric as a measuring rod If one wants to describe a physical system.1.

At each step the distance travelled is so small that one can take the g at the middle of the step and get a reasonably good estimate of the length ds of that step. φ) in polar coordinates satisﬁes dl 2 = s = dr2 + r2 dθ 2 + r2 sin2 θdφ2 . So if points A and B have different lattitudes. Every symmetric second rank covariant tensor can be transformed into diagonal form in which the diagonal elements are either 1. A real metric. Another way to see it is that the inner product of two vectors should not depend on the order of these vectors: gµν vµ wν = gµν vν wµ ≡ gνµ vµ wν (in the last step we renamed the running indices). mathematically one could also conceive a manifold that has a metric that can be brought into the following form: gµν = diag(1.3) (5. This can be seen in various ways. For the metric tensor they will then all be 1: in an orthonormal coordinate system. Indeed. r2 sin2 θ dx3 (5. (5. there is no unique g that we can use to measure the distance. an orthonormal coordinate system is deﬁned as the coordinate system in which gµν = diag(1. From geometric considerations we know that the length dl of an inﬁnitesimally small vector at position (r. Properties of the metric tensor The metric tensor g has some important properties. What we have done here is to chop the path into such small bits that the curvature of the Earth is negligible. 1. For instance.2) All the information about the concept of ‘length’ or ‘distance’ is hidden in the metric tensor g. Symmetry of a tensor is conserved upon any transformation (see Section 4.5. after all. Perhaps the ‘true’ distance is then the distance along the path which yields the shortest distance between these points. This produces funny results for some vectors. However. However. it is symmetric. locally. The metric tensor ﬁeld g( x) is usually called the ‘metric of the manifold’. 1).2). which is symmetric.1) By integrating ds all the way from A to B one thus obtains the distance along this path.2. These are rather pathological properties and appear to be only interesting to mathematicians.2 Properties of the metric tensor a unit circle (in 2-D) or a unit sphere (in 3-D). θ. In fact. We could also write this as dl 2 = s = dx1 dx2 dx3 1 0 0 r2 0 0 1 0 dx 0 dx2 = gµν dx µ dx ν . Let us take a bit more concrete example and measure distances in 3-D using polar coordinates. A normal metric can always be put into this form. and the vector (0. should always produce positive deﬁnite lengths for vectors unequal to the zero-vector. the vector (0. What what can do is to deﬁne a path from point A to point B and measure the distance along this path in little steps. −1). 1. One of them is that one can always ﬁnd a coordinate system in which. as one can see in appendix A. First of all. 1) would have an imaginary length. 0. This is given by ds2 = gµν dx µ dx ν with gµν = gµν ( x ) . in special relativity a metric will be introduced that can be brought to 26 .0 or −1. 1. So how does one measure the distance between two points on such a curved manifold? In the example of the Earth’s projection one sees that g is different at different lattitudes. the metric tensor has the shape of a unit ‘matrix’. Within each step g changes so little that one can use the linear expressions for length and distance using the local metric g. 5. 1) has length zero. there is no objective distance that one can give.

which was created from Eν is now contravariant.4) The distinction between these two metric tensors is simply made by upper resp.3 Co versus contra the form diag(−1. at least for a positive deﬁnite metric) and then transform back to the original coordinate system. related to a force. And both metrics are intimitely related: once gµν is given. It should be noted that a metric of signature gµν = diag(1. 1). 1. For now. 1. however. 1). how does one obtain the contravariant E? Perhaps the most obvious solution is: ﬁrst produce the covariant ˜ tensor E. An example is the electric ﬁeld. In such coordinate systems we never encountered the difference between co. −1) can never be brought into the form gµν = diag(1. So what about the inner product between two co-vectors? Just as with the vectors we use a kind of metric tensor. then transform to an orthonormal system. but this time of contravariant nature: gµν . 1): then gµν is also equal to diag(1. can be brought to the form gµν = diag(1. This does not mean that they are gµν = diag(1. But in the next section we will derive a more elegant relation between the two forms of the metric tensor. A better method is to use the metric tensor: Eµ = gµν Eν . we will encounter problems. Normal metrics have signature diag(1. 1). 1). But as soon as we go to a non-orthonormal system. and vice versa. 1). gµν is some kind of unit 27 . lower indices.7) This is a proper tensor product. but they can be brought into this form with a suitable coordinate transformation. as the contraction takes place over an upper and a lower index. In a sense they are two realisations of the same metric object g. The easiest way to ﬁnd gµν from gµν is to ﬁnd a coordinate system in which gµν = diag(1. 1. 1.3. 1.5) E = −∇V ⇔ Eµ = − µ . One can regard an electric ﬁeld as a gradient. 1. The inner product then becomes s = gµν xµ yν . 5. ∂x On the other hand it is clearly also an arrow-like object. 1) by coordinate transformation. 1. q q (5. E= 1 1 F = ma q q ⇔ Eµ = 1 µ 1 F = maµ . gµν can be constructed and vice versa. (5.e. 1. we will assume that the metric is positive deﬁnite (i. then switch from co-vector form to contravariant vector form (in this system their components are identical. This signature of the metric is therefore an invariant.5. Then one is suddenly forced to distinguish between the two. So the question becomes: if one has a potential ﬁeld V. ◮ Exercise 1 of Section C. like any kind of vector. but very time-consuming and unwieldy. ∂V ˜ (5. The object Eµ .6) ˜ In an orthonormal system one can interchange R and R (or equivalently Eµ and Eµ ) without punishment.and contra-variant vectors. 1) through a coordinate transformation). i. 1. A gradient was simply a vector. (5. This is possible. 1.e.5. Co versus contra Before we got introduced to co-vectors we presumably always used orthonormal coordinate systems. The properties we have derived for gµν of course also hold for gµν . One can see that this vector must indeed be the contravariant version of Eν by taking an orthonormal coordinate system: here gµν = diag(1.

3 Co versus contra ‘matrix’. (5. the formula Eq. One can of course also do the reverse. 1. then we must arrive back with the original vector. The vector and its covariant version are each others “dual”.9) If we ﬁrst lower an index with gνρ . This results in a relation between gµν en gµν . However.10) With this relation we have now deﬁned the contravariant version of the metric tensor in terms of its covariant version. This is the same as saying that in the orthonormal basis Eµ = Eν . only for a metric that can be diagonalised to diag(1.7) is now also valid after transformation to any coordinate system.8) (here we introduced σ to avoid having four ν symbols. (5. and then raise it again with gµν . in contrast to the formula Eµ = Eν . (5.7). while the reverse is “raising an index”. which for classical physics is always the case). (5. Eµ = gµν Eν = gµν gνρ Eρ = δµ ρ Eρ = Eµ ⇒ gαν gνβ = δα β . (5. ( A−1 )µ α Eµ = ( A−1 )µ α ( A−1 )ν β gµν A β σ Eσ = ( A−1 )µ α δν σ gµν Eσ = ( A−1 )µ α gµν Eν . If we multiply on both sides with Aα ρ we obtain exactly (the index ρ can be of course replaced by µ) Eq. Eµ = gµν Eν . which would conﬂict with the summation convention). which is indeed true (again. 28 . We call the conversion of a contravariant index into a covariant one “lowering an index”.5. 1).

and thereby invariant under coordinate transformation. It said that the work equals the (old) inner product between F en x. We call this ‘universal validity’ of the equation: ‘covariance. After the introduction of the new inner product the equation suddenly kept its validity in any coordinate system.e. with ( A−1 ) T )) and the righthand-side as a contravariant vector (i. For now we will merely make a mental note of this. 6. We will need to investigate where these pitfalls may arise. The following expression is not covariant: xµ = yµ .3) because the left-hand-side transforms as a covector (i. The index notation makes it possible to write all kinds of manipulations in an easy way. with A). This is not entirely trivial. (6. the Levi-Civita tensor. The following expressions and equations are therefore covariant.Tensor calculus 6 Now that we are familiar with the concept of ‘tensor’ we need to go a bit deeper in the calculus that one can do with these objects. (6. which allows us to write down cross-products in index notation. If this equation holds by chance in 29 .2) because both sides transform in the same way. Also here the rules of index notation are rather logical. Be careful not to confuse this with ‘covariant vector’: it is a rather unpractical fate that the word ‘covariant’ has two meanings in tensor calculus: that of the type of index (or vector) on the one hand and the ‘universal validity’ of expressions on the other hand. but makes it also all too easy to write down expressions that have no meaning or have bad properties. The ‘covariance’ of equations In chapter 3 we saw that the old inner product was not a good deﬁnition. We will also start thinking about differentiation of tensors. but also here there are problems looming. However. This is a problem that has far-reaching consequences and we will deal with them in Chapter 7. At the end of this chapter we will introduce another special kind of tensor. while the work is a physical notion. xµ = yµ .1. Without this tensor the cross-product cannot be expressed in index notation.e. These problems have to do with the fact that ordinary derivatives of tensors do no longer transform as tensors. In other words: the equation was only valid in a prefered coordinate system. the inner product was not invariant.1) (6. x µ = yµ .

It is formally true that uα vγ w β = uα w β vγ . have the same number of upper and lower indices. but their order is different. (The fact that the index µ occurs double here does not mean that they need to be summed over. Evidently the meaning assigned to index 2 is different in the two cases.9) Both tensors are of same rank.e. But this expression. Often symmetries of the indices will help in avoiding confusion. Addition of tensors Two tensors can only be added if they have the same rank: one cannot add a vector to a matrix.. because multiplication is commutative. Since we are only interested in vectors and covectors are geometric objects and not in their components in some arbitrary coordinate system. since according to the summation convention one only sums over equal indices occuring in a product. assign meaning to each of the indices (i.”). (6. since typically (as mentioned before) one assigns meaning to each index before one starts working with tensors. If we choose to stay in orthonormal coordinates then we can drop the distinction between co. (6. This is a rather peculiar contruction. The problem is. One should then choose the indices to be the same.7) or x αβ + yαβ .5) is not a tensor. We just chose µ and ν as the indices in the ﬁrst expression and α and β in the second. second index means that. (6. one can add them. (6. then it will likely not hold in another coordinate system. Fortunately. This is merely a matter of choice.4) This expression is also not covariant. that a tensor has only meaning for the user if one can.and contravariant vectors. 6. The same as above we can say about the following equation. Now what about the following expression: tα β γ + r αγ β . If one adds two equal type tensors but with unequal indices one gets a meaningless expression. to avoid giving the impression that we distinguish between co. we must conclude that covectors cannot be set equal to contravariant vectors. tµν = sµ ν . See chapter 1). (6. vµ + wν .2.2 Addition of tensors one coordinate system. since in the case of tα β γ it is a covariant index while in the case of r αγ β it is a contravariant index.6) If tensors are of the same type. beforehand.8) Both the above expressions are exactly the same. The tensors on both sides of the equal sign are of different type. In that case we put all indices as subscript. x µν + yµν . So vµ + wµ (6.and contra-variant vectors and we do not need to check if our expressions or equations are covariant. not in a sum.. One can see this if one assumes that r αγ β = uα vγ w β . Addition of two equal-rank tensors is only possible if they are of the same type. This will be confused if one mixes the indices up. 30 .6. though highly confusing (and therefore not recommendable) is not formally wrong. in most applications of tensors the meaning of the indices will be clear from the context. ”ﬁrst index means this.

(6. which we already know. (grad t)µνα = vµ . ∂x ν (6. We will see in Chapter 7 that in curved coordinates or in curved manifols. ∂x ν (6.4.ν := ∂ν vµ := ∂vµ . the following objects are therefore not tensors. 6. The following products are tensors. (6.10) x µ T µν .17) 31 . (grad f )µ = ∂f . In this section we will therefore.14) In a similar fashion we can now deﬁne the ‘gradient’ of a vector ﬁeld: (grad v)µ ν = or of a tensor-ﬁeld. and we will forbid them from now on: (6.3. But for non-curved coordinates on a ﬂat manifold this problem does not arise. (6. One can now take the derivative of such tensor ﬁelds and these derivatives can also be denoted with index notation.11) and the following expression is a covariant expression: tµν = aµρ σ bνσ ρ . ∂x µ (6. hα β γ tα β γ . Tensor products The most important property of the product between two tensors is: The result of a product between tensors is again a tensor if in each summation the summation takes place over one upper index and one lower index. where the tensor depends on the location given by x.6.15) ∂tµν . xµ T µν hα β γ tα β γ . First order derivatives: non-covariant version Up to now we have mainly concerned ourselves with the properties of individual tensors. avoid this problem by assuming non-curved coordinates on a ﬂat manifold.13) The metric tensor is then used as a kind of ‘glue’ between two indices over which one could normally not sum because they are of the same kind. We start with the gradient of a scalar ﬁeld. ∂vµ . In differential geometry one usually uses tensor ﬁelds. Now let us introduce an often used notation. According to this rule. (6. temporarily. the above objects are all tensors. these derivatives will not behave as tensors and are not very physically meaningful. For instance: an inner product between tµν en vν could look like: wµ = gαβ tµα v β .16) ∂x α As long as we employ only linear transformations. then this is possible with the use of the metric tensor.12) If we nevertheless want to sum over two covariant indices or two contravariant indices.4 First order derivatives: non-covariant version 6.

just to avoid confusion: the symbol ∇ will be used in the next chapter for a more sophisticated kind of derivative. ǫαµν ǫαρσ = δµρ δνσ − δµσ δνρ . The contraction between two epsilons yields a useful identity.25) (6. or n indices if we work in n-dimensional space. In fact it is a tensor density (or pseudo tensor). we will in this section assume an orthonormal basis. The only elements of ǫ that are non-zero are those for which none of the indices is equal to another.21) all the other non-zero elements follow. The object ǫ does not transform entirely like a tensor (though almost). Rot. and forget for the moment about covariance.ν := ∂ν vµ := gνρ ∂vµ . (6. ∂T αβ = ∂ β T αβ = T αβ . The ǫ symbol is completely anti-symmetric. if two indices are equal. Any permutation of indices yields a −1. However. (6. As a summary. if we work in a 3-dimensional space. ∂x β (6. It is a set of numbers with three indices. cross-products and the permutation symbol A useful set of numbers. ǫijk = −ǫ jik = −ǫikj = −ǫkji . 32 (6. (6. if ijk is an odd permutation of 123.20) 6. which is a near cousin of the tensor family. cross-products and the permutation symbol For completeness we also introduce vµ.18) With index notation we can also deﬁne the divergence of a vector ﬁeld (again all under the assumption of non-curved coordinates). ∇·v = ∂vρ = ∂ρ vρ = vρ .22) and with Eq. We usually deﬁne ǫ123 = 1 .ρ .23) This pseudo-tensor is often called the Levi-Civita pseudo-tensor.26) . used for instance in electrodynamics.5. ǫijk 1 = −1 0 if ijk is an even permutation of 123. ∂x ρ (6.β . is the permutation symbol ǫ.5 Rot. and also ǫαβγ ǫαβγ = 6 . to avoid these complexities.21) If two of the three indices have then same value. We can also deﬁne the divergence of a tensor. From this it follows that ǫαβν ǫαβσ = 2δνσ . then the above equation clearly yields zero.24) (6. ∂x ρ (6. Tensor densities transform as tensors.19) Note. and it is therefore not considered a true tensor. but are additionally multiplied by a power of the Jacobian determinant of the transformation matrix.6. (6.

When we translate to index notation this becomes c = ∇ · (∇ × a) = ∇i (ǫijk ∇ j ak ) = ǫijk ∂i ∂ j ak . cross-products and the permutation symbol With the epsilon object we can express the cross-product between two vectors a and b. ∂xi (6. want to compute the divergence of a rotation. because ∂ ∂x1 ∂ ∂x2 ∂ ∂x3 (6. One can also deﬁne another kind of rotation: the generalised rotation of a covector.31) This transforms as a tensor (in non-curved coordinates/space) and is anti-symmetric. ˜ (rot w)αβ = ∂α w β − ∂ β wα . (6. From the exercise below we will see that such a contraction always yields zero. c = a × b → ci = ǫijk a j bk . 33 .29) one sees the complete contraction between a set that is symmetric in the indices i and j (∂i ∂ j ) and a set that is antisymmetric in the indices i and j (ǫijk ).5 Rot. for instance. (6. (6.29) ∇= and therefore ∇i = ∂ : = ∂i .6. so that we have proven that a divergence of a rotation is zero.4.28) which is indeed the familiar ﬁrst component of the cross-product between a and b.27) For the ﬁrst component of c one therefore has c1 = ǫ1jk a j bk = ǫ123 a2 b3 + ǫ132 a3 b2 = a2 b3 − b2 a3 . (6. This notation has advantages if we.30) On the right-hand-side of Eq. ◮ Exercise 4 of Section C.

5 Rot.6. cross-products and the permutation symbol 34 .

we wish to at least brieﬂy address the concept of covariant derivative. like the surface of the Earth). but we always assume that the changes are smooth. there is no meaningful global set of basis vectors in which one can express the vectors. for example) then this becomes a big issue. Since the coordinates are curved (maybe even the manifold itself is curved. In this section we will also show how one deﬁnes vectors in curved coordinates by using a local set of basis vectors derived from the coordinate system.1. however. One may ask: why is there a problem? We will show this by a simple example in Section ??. then at each location this vector or tensor could be decomposed into the local basis (co-)vectors. In Section ?? we will return to the polar coordinates to demonstrate how this machinery works in practice. The reason is that the concepts of covariant derivatives are usually described in detail in introductory literature on general relativity and Riemannian geometry. then it is possible to choose 35 . In Section ?? we will introduce without proof the mathematical formula for a covariant derivative. For further details we refer to *************** LITERATURE LIST ************** 7. We want to construct a derivative of a vector in such a way that it makes physical sense even in the case of curved coordinates. take quite large steps. A local orthonormal set of basis vectors: If the coordinates are such that they are locally perpendicular with respect to each other. As usual. If a vector or tensor ﬁeld is given.Covariant derivatives 7 One of the (many) reasons why tensor calculus is so useful is that it allows a proper description of physically meaningful derivatives of vectors. In principle there are two useful set of basis vectors one can choose: 1. but taking a derivative of a vector (or of any tensor for that matter) is a non-trivial thing if one wants to formulate it in a covariant way. in a rectangular orthonormal coordinate system this is usually not an issue. However. This basis may be different from location to location. In principle this is well beyond the scope of this booklet. Vectors in curved coordinates Deﬁning vectors in curvilinear coordinates is not trivial. since tensors ﬁnd much of their application in problems in which differential operators are used. but this is beyond the scope of this booklet – even though the techniques we will discuss in this chapter will be equally well applicable for curved manifolds. We have not mentioned this before. But when the coordinates are curved (like in polar coordinates. It is even more of an issue when the manifold is curved (like in general relativity). This chapter will. and introduce thereby the so-called Christoffel symbol.

at every location. which is physically more intuitive. takes the form (1. The r set of basis vectors (∂/∂r. A local coordinate basis: The covector basis can be seen as unit steps in each of ˜ the directions: eµ = dx µ . Typically this coordinate basis is not orthonormal. 0). for a more detailed discussion we refer to the literature. ‘covariant’ form of the derivative. XXXX. The contravariant basis vectors are then eµ = ∂/∂x µ . and then.3) where Γα is an object called the Christoffel symbol. (7. while the set of basis vectors (∂/∂r. φ.7. we will assume this basis for the rest of this chapter. y/ x2 + y2 ). 1 ∂/∂φ) is a local orthonormal basis. if we switch to circular coordinates. µν (7. But since the tensor mathematics works best for the local coordinate basis. 2. Clearly this vector ﬁeld is not constant. for instance. 1 ∂/∂φ is again a normalized (unit) vector. The covariant derivative of a vector ﬁeld vµ is deﬁned as ∇µ v α = ∂µ v α + Γα v ν . for instance. once the ﬁnal answer is there.2) In the circular coordinate system it looks as if the vector does not change in space. We can express the usual x. In cartesian coordinates one would write. which happens to be parallel to r the coordinates. the coordinate basis vector ∂/∂φ has a different length at different radii. then we obtain zero. and take the derivative with respect to. ∂ ∂φ 1 0 = 0 0 . The covariant derivative of a vector/tensor ﬁeld In this section we will introduce the covariant derivative of a vector ﬁeld. For tensor mathematics this basis is usually the easiest to use. The Christoffel symbol is not µν a tensor because it contains all the information about the curvature of the coordinate system and can therefore be transformed entirely to zero if the coordinates are 36 . ∂/∂φ) is the local coordinate basis. (7.2 The covariant derivative of a vector/tensor ﬁeld the basis vectors such that they point along these coordinates (though not necessarily are normalized the same as the coordinate units). We will do so without proof. we will convert back to the local orthonormal basis. We deﬁne ∂µ to be the ordinary coordinate-derivative ∂/∂x µ . when expressed in this local basis. In r-direction the coordinate basis vector ∂/∂r can act as a normal basis vector. But in realistic applications (like in Chapter ??) a local orthonormal basis has more physical meaning. However.and y-coordinates in terms of r and φ as: x = r cos φ and y = r sin φ.1) which is clearly non-zero. Again. Now let us use the coordinate basis. Let us take a vector ﬁeld that. but in reality it does. 7. however. Expressed in cartesian coordinates (for which a global basis is possible) this would be a vector ﬁeld ( x/ x2 + y2 . One has to deﬁne a new.2. This is the problem one encounters with derivatives of tensors in curved coordinates. This is shown in Fig. In the application of Chapter ?? we will ﬁrst convert everything to the coordinate basis in order to do the tensor math. ∂ ∂x x/ y/ x 2 + y2 x 2 + y2 = y2 /( x 2 + y2 )3/2 − xy/( x2 + y2 )3/2 . In the φ-direction. However. Going to the local orthonormal basis does not help. Now let us take the example of circular coordinates in two dimensions. and ∇µ to be the new covariant derivative. So far in the entire booklet we have implicitly assumed a local coordinate basis (remember that we used the words coordinate transformation on equal footing as basis transformation).

7. µσ and of a tensor tα β .6) (7. ∇µ w α = ∂µ w α − Γν w ν . µσ µβ NOW MAKE EXERCISES TO SHOW WHAT HAPPENS WITH THE ABOVE EXAMPLE ◮ From the exercise we can see that the above recipe indeed gives the correct answer even in curved coordinates. or with help of the metric: g αγ ∇α tµν γ .β ) .9) ∇α = gαβ ∇ β 37 . We can therefore write gαγ ∇α tµν γ = ∇α (tµν γ gαγ ) = ∇α tµνα We can also deﬁne: (7. ◮ ANOTHER EXERCISE The covariant derivative ∇µ produces.ν − gµν. Therefore ∇α tµν γ is a perfectly valid tensor. µα The covariant derivative of a tensor tαβ is then β (7.5) ∇µ tαβ = ∂µ tαβ + Γα tσβ + Γµσ tασ .4) The covariant derivative of a covector can also be deﬁned with this symbol.2 The covariant derivative of a vector/tensor ﬁeld straightened. Nevertheless we treat it as any ordinary tensor in terms of the index notation. as its name says.7) ∇µ t α β = ∂µ t α β + Γα t σ β − Γσ t α σ .µ + g βµ. We can also prove that the covariant derivative of the metric itself is always zero. Since ∇µ gαβ = 0 (as we saw in the above exercise) we can always bring the gαβ and/or gαβ inside or outside the ∇µ operator. Γα = µν 1 αβ g 2 ∂g βν ∂g βµ ∂gµν µ + ∂x ν − ∂x ∂x β ≡ 1 αβ g ( g βν. We can also contract indices: ∇α tαν γ . (7. covariant expressions. 2 (7.8) (7. The Christoffel symbol can be computed from the metric gµν (and its companion gαβ ) in the following way.

7.2 The covariant derivative of a vector/tensor ﬁeld 38 .

From the simple thought experiments with moving trains. We can obtain one from the other by raising or lowering an index. they are also quite convenient in special relativity. we should be talking about the inner product of a vector with a covector. it has become clear that 3-dimensional space and 1-dimensional time are not separate entities.3) Eq.4) ˜ where b is a covector. this formula can be understood much better. (1.2) should thus really be viewed as a. aα = gαβ a β . but rather should be seen as one. Moreover. From the fact that inner products such as (1. We will start here at the point when four-vectors are introduced into special relativity. 0 0 0 1 39 . meant to give a visual understanding of the Lorentz transformations. 0 x 2 1 x3 With the knowledge which we now have about the ‘metric tensor’. Namely. Rather. when we express the formulas using tensors.2) are invariant under Lorentz transformations we now ﬁnd the components of the metric tensor. (1. −1 0 0 0 0 1 0 0 gµν = (1. making use of the metric tensor. we found that the inner product now c satisﬁes the somewhat strange formula a. we know that the inner product of two vectors does not really exist (at least not in a form which is independent of the coordinate system). Moreover. b = − a0 b0 + a1 b1 + a2 b2 + a3 b3 . b g = aµ bµ = gµν aµ bν .5) 0 0 1 0 . they expose the structure of the theory much better than the formulas without tensors do.Tensors in special relativity A (1.1) Although tensors are mostly useful in general relativity. (1. In matrix form one found for a pure Lorentz transformation ′ x0 γ 1′ x −γβ 2′ = 0 x 0 3′ x −γβ γ 0 0 0 0 1 0 √ with γ = 1/ 1 − v2 /c2 and β = v .2) 0 x 0 0 x 1 . (1.

We have seen in Chapter (5) that the metric can also be used to describe curved spaces. we do not have to introduce any strange inner products. which is the space in which the rules of special relativity hold. Rather. is thus a space which has a metric which is different from the metric in an ordinary Euclidean space.APPENDIX A. 40 . The Minkowski metric which we have encountered in the present section can be used to describe curved space-time. we simply apply our new knowledge about inner products. In order to do computations in this space. TENSORS IN SPECIAL RELATIVITY Minkowski space. gravity is explained as a consequence of the curvature of space-time. Let us end with a small related remark. In that theory. This concept is at the basis of the general theory of relativity. A good introduction to this theory is given in [5]. which automatically leads us to the modiﬁed metric tensor.

e. the numbers vµ ) are obtained by determining the projections of the arrow onto the axes. The intersection they make with a neighbouring x2 = constant lijne is equal to the increase of x1 necessary to 41 .1: Geometrical representation of a vector.Geometrical representation B The tensors as we have seen them in the main text may appear rather abstract. As we go along we will also describe how various manipulations with tensors (such as additions and inner products) can be translated to this geometrical picture. • co-vector A co-vectors is closely related to the gradient of a function f ( x). • arrow-vector An arrow-vector is represented by (how else) an arrow in an n-dimensional space. as well as for all other tensors. However. so it makes sense to look for a geometrical representation which is somehow connected to gradients. it is useful to think about it in terms of an arrow. However.1. The present appendix contains a list of these geometrical representations. The components of the vector (i. A simple example of this is the arrow-vector. See ﬁgure B. Figure B. in two dimensions) or contour surfaces (in three dimensions). it is possible to visualise their meaning quite explicitly. but it may help in understanding such computations. An arrow-vector is effectively just a set of numbers that transforms in a particular way under a basis transformation. in terms of geometrical objects. None of this is necessary in order to be able to compute with tensors. and the determination of its components by projection onto the axes. Now consider two consecutive contour lines. A useful related concept is that of contour lines (curves of equal height. We can make a similar geometrical visualisation of a covector. so as to visualise its meaning.

asafsnede Figure B. Note that one should also keep track of the order of the lines or planes. two consecutive lines (or. We can visualise this object as a surface with a given area and orientation.4. tµν = vµ wν − wµ vν . in three dimensions.APPENDIX B. Given the above. • anti-symmetric contravariant second-order tensor tµν This is a bit like an outer product. In the representation sketched in ﬁgure B. This is not an arrow-vector as above. or rather two co-vectors. The order of the two vectors is clearly of relevance. intersection at x1 -axis = ∂x1 ∂f = x 2 constant ∂f ∂x1 −1 x 2 constant (2.2: Geometrical representation of a co-vector. • anti-symmetric covariant second order tensor tµν This object can again be seen as a kind of outer product of two vectors. Let us illustrate this with an example: if the surface of the projection on the x1 . but only meant to indicate the order of the surfaces. it makes sense to use.2) 1 2 2 Figure B.1) Therefore. if we take the inverse of this intersection. GEOMETRICAL REPRESENTATION increase f ( x ) by 1. See ﬁgures B. x2 surface is equal to 2. if the other coordinate is kept constant.3: Representation of an anti-symmetric contravariant 2nd-order tensor. then the t12 component equals 2 and the t21 component equals −2. So in two dimensions.2 we have drawn a ﬂexible type of arrow.3) 42 . such that the intersections with the axes are equal to the components of the co-vector. (2. planes). we again obtain the ﬁrst component of the gradient. We could equally well have labelled the surfaces with different numbers or colours. The components of the tensor can be obtained by measuring the projection of the surface onto the various coordinate surfaces. and the determination of its components by computing the inverse of the intersection with the axes. for the generic representation of a co-vector. 1 An anti-symmetric contravariant second order tensor can be made from two vectors according to tµν = vµ wν − wµ vν .3 and B. (2. A negative component is obtained when the orientation of the surface assocatiated to the tensor is negative.

6: Geometrical representation of the addition of vectors. This translated arrow now points to the sum-arrow. Figure B. Again one has to keep track of the orientation of the intersections. • addition of co-vectors The addition of covectors is somewhat more complicated. Instead of trying to describe this in words. GEOMETRICAL REPRESENTATION Figure B. See ﬁgure B. 43 • inner product between a vector and a co-vector .APPENDIX B.4: Determination of the components of an anti-symmetric contravariant second order tensor. The inner product of a vector and a co-vector is the ratio of the length of the intersection which the co-vector has with the vector and the length of the vector. We can visualise this object as a kind of tube with an orientation. • addition of vectors Addition of vectors is of course a well-known procedure. The components can be found by computing the intersections of this tube with all coordinate surfaces. Simply translate one arrow along the other one until the top.5: Geometrical representation of an anti-symmetric covariant second order tensor.8 for a three-dimensional representation. See ﬁgure for a two-dimensional representation. See ﬁgure B. 2 1 2 1 Figure B.6.5. it is easier to simply give the visual representation of this addition. and inverting these numbers. See ﬁgure B.

Figure B.8: Representation of the inner prdouct between a vector and a co-vector. It therefore makes sense to look for a representation which is related to the inner product. so we will omit that case. A suitable one is to consider the set of all points which have a unit distance to a given point x0 . • second order symmetric co-variant tensor gµν The metric tensor belongs to this class. g = g( x). see ﬁgure B. • turning an arrow-vector into a co-vector The procedure of converting a vector to a co-vector can also be illustrated geometrically. It is considerably more complicated to make a geometrical representation of a second order symmetric contravariant tensor.7: Geometrical representation of the addition of co-vectors.9 for a three-dimensional representation: an ellipsoid.4) If the metric is position-dependent. Now it is also clear what is the connection to the ellipses of chapter 5. The two 44 . GEOMETRICAL REPRESENTATION v w v+w Figure B.APPENDIX B. See ﬁgure B.10 for a two-dimensional representation. { x ∈ R3 | gµν x µ x ν = 1} . then the point x0 is equal to the point at which the corresponding metric gµν is evaluated. (2. Figure B.9: Geometrical representation of a second order symmetric covariant tensor.

APPENDIX B. Figure B. GEOMETRICAL REPRESENTATION long slanted lines are tangent to the circle.10: Geometrical representation of turning an arrow-vector into a co-vector. 45 . and the bottom horizontal line goes through the center of the circle. This representation fails when the arrowvector is shorter than the radius of the circle.

APPENDIX B. GEOMETRICAL REPRESENTATION 46 .

Show that C ∑ Aµν ( ∑ Bνα xα ) = ∑ ∑ ( Aµν Bνα xα ) ν =1 α =1 n n ν =1 α =1 n n n n = = ∑ ∑ ( Aµν Bνα xα ) α =1 ν =1 ∑ ( ∑ ( Aµν Bνα )xα ) α =1 ν =1 n n 3. C.Exercises C.1. A and B are matrices and x is a position vector. Assume that A = BC Write out this matrix multiplication using index notation. (d) Aµν = ( A T )νµ (e) Aµν = ( A T )µν 4. Which of the following statements are true? (a) The summation signs in an expression can always be moved to the far left. A. 2. you cannot just change the order of the variables in the expression. B. because this changes the order in which matrices are multiplied. without changing the meaning of the expression. and generically AB = BA. A. you can exchange their order without changing the meaning of the expression. D and E are matrices. Index notation 1. (c) If all summation signs are on the far left of an expression. B and C are matrices. (b) If all summation signs are on the far left of an expression. Write out the following matrix multiplications using index notation (with all summation signs grouped together). (a) A = B(C + D ) (b) A = BCD 47 .

Write as a matrix multiplication: (a) Dαβ = Aαµ Bµν Cβν (b) Dαβ = Aαµ Bβγ Cµγ (c) Dαβ = Aαγ ( Bγβ + Cγβ ) 9. x′ = Ax Show that A is a transformation matrix. 8. 6.1 Index notation (c) A = BCDE 5. ∑ Aµν xν . y and z which satisfy the following relations. Now write down the relation between z and x using index notation. 10. n µ1 α=1 7. Try a couple of the previous exercises by making use of the summation convention. v( x) . Consider a vector ﬁeld in an n-dimensional space. y = Bx z = Ay Write these relations using index notation. where the matrix A is the same matrix as in the ﬁrst equation. 48 . Assume you have three vectors x. For a transformation we have This corresponds to ′ xµ = x′ = Ax . v′ = Av . Write in matrix form D βν = ∑ n ∑ Aµν Bαµ Cαβ .C. ν =1 n Can you understand the expression ∑ xν Aµν =? ν =1 n And how can you construct the matrix multiplication equivalent of ∑ xµ Aµν =? µ =1 n (note the position of the indices). We perform a coordinate transformation.

C. 2. The matrix for a rotation over an angle φ in the x-y plane is given by Λ= cos φ sin φ − sin φ cos φ . The question is now: does a normal transformation matrix transform too upon a change of basis. (a) Show that in this case wµ is also a tensor (that is. in which the components of the basis vectors are expressed in the basis itself. Insert the transformation rule given above and show that. With the help of various mathematical operators you can make new tensors out of old ones. These are thus orthogonal transformations. the tranformation matrix Λ is equal to (Λ−1 ) T . e1 · e2 = 0 . And how? Write this out in normal matrix notation. and after a basis (or coordinate) transformation Ay = S( A x ) . Compute the inverse of this matrix (either by replacing φ by −φ or by matrix inversion). but there is also a matrix A′ which satisﬁes Sy = A′ (S x ) . Co-vectors 1. In Cartesian coordinate systems the basis vectors are orthogonal to each other. Introduction to tensors 1. in which tµν and vν are tensors. For instance wµ = tµν vν . 2. C. Hint: y = Ax . (where Λ is the transformation matrix) then this relation should of course continue to hold. We have seen that this can be a normal transformation or a basis transformation. 49 . You might be tempted to interpret this as the inner product of two basis vectors. for transformations between two Cartesian systems. e1 ′ = Λ e1 & e2 ′ = Λ e2 . and A the matrix which is to be transformed.3 Introduction to tensors C. e1 ′ · e2 ′ = 0 . A matrix can be viewed as an object which transforms one vector into another one. show that it transforms correctly under a basis transformation). If we transform from one such Cartesian coordinate system to another one. (b) Show that wµ is not a tensor if we make it according to wµ = tµν vν . That would of course always yield a trivial zero).3. (This is a somewhat misleading notation.2. with S the matrix which transforms from the original system to the primed system. Show that (Λ−1 ) T is equal to Λ.

Show that the property of anti-symmetry is preserved under a basis transformation. 9. Hoe transformeert een 3e orde geheel covariante tensor? Kan dit in matrixvorm geschreven worden? 10. x n ) f ′n ( x1 .C. Show that. Tensoren.4 Tensoren. algemeen 3. You have seen that there are four types of 2nd-order tensors. x 2 . waarom blijft de eigenschap van symmetrie hier wel behouden. 3. . maar ook niet als de getransponeerde van een matrix. Neem twee tensoren: sµν en tµν . if you start with a number of tensors and construct a new object out of them. Je kunt een niet-lineaire coordinatentransformatie geven door de nieuwe coordinaten ¨ ¨ uit te drukken als een functie van de oude coordinaten: ¨ x ′1 x ′n 50 = . .4. 5. 6. Laat aan de hand daarvan zien hoe de matrix A T transformeert. only sum over one upper and one lower index. Toon aan dat de dubbele contractie tαβ rαβ altijd gelijk is aan nul. 1)-tensor. op te schrijven. . = f ′1 ( x 1 . algemeen 1. Gegeven een tweede orde contravariante symmetrische tensor t en een tweede orde covariante antisymmetrische tensor r. (Laat dus zien dat die eigenschap na transformatie meestal verloren is). C. We hebben in opgave 2 in paragraaf C. terwijl toch is bewezen dat de symmetrie in een boven en onderindex niet behouden blijft? 7. (b) twee vrije indices heeft. Show. 4. 8. . Take an anti-symmetric tensor tµν . never over two indices at the same position). by writing out the transformation of a (1. x n ) . Maak een product van die twee dat (a) geen vrije indices heeft. Toon aan de hand van de twee gevonden vergelijkingen aan dat het object g uit de tekst niet transformeert als een matrix. x2 . Do the same thing for a symmetric tensor. How many 3rd order tensors are there? 2. the new object is again a tensor. . This shows that symmetry and anti-symmetry are fundamental properties of tensors. . . (c) vier vrije indices heeft. . and use the summation convention (that is. Hoe transformeert een kronecker-delta? De kronecker-delta is symmetrisch. dat het geen zin heeft om te spreken van symmetrie of anti-symmetrie van een tensor in indices die niet van hetzelfde soort zijn.3 gezien hoe een matrix A transformeert. De vogende tensor is gedeﬁnieerd uit twee vectoren: tµν = v µ wν − wν vµ Laat zien dat deze antisymmetrisch is.

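Returning to exercise 6: the essential trick is to relabel the dummy indices and then use both symmetry properties. A sketch:
$$t^{\alpha\beta} r_{\alpha\beta} = t^{\beta\alpha} r_{\beta\alpha} = \bigl(t^{\alpha\beta}\bigr)\bigl(-r_{\alpha\beta}\bigr) = -\,t^{\alpha\beta} r_{\alpha\beta} \quad\Longrightarrow\quad t^{\alpha\beta} r_{\alpha\beta} = 0.$$
The first step merely renames the summation indices; the second uses $t^{\beta\alpha} = t^{\alpha\beta}$ and $r_{\beta\alpha} = -r_{\alpha\beta}$.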
C.5 The new inner product and the metric tensor

1. Show that the inner product is invariant under coordinate transformations:
$$g'_{\mu\nu}\, x'^\mu x'^\nu = g_{\mu\nu}\, x^\mu x^\nu.$$
In the treatment of Special Relativity, however, we also frequently encounter the following equation:
$$g_{\mu\nu}\, x'^\mu x'^\nu = g_{\mu\nu}\, x^\mu x^\nu$$
(that is, without a prime on $g$). Explain why we may use this equation in all cases where $g$ is the Minkowski metric and we work exclusively in Lorentz frames.

2. Suppose you have carried out two calculations, with results $x_\alpha = \ldots$ and $y^\beta = \ldots$, so that the indices of $x$ and $y$ are of different type. How can you, with the help of the metric tensor, nevertheless add these two vectors? (See the sketch after these exercises.)

3. Show that it is impossible to transform a metric tensor which has both $1$'s and $-1$'s on its main diagonal into a form in which only $1$'s are present.
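A sketch for exercise 2, assuming the index-raising convention from the text: first bring both vectors to the same index type, then add component-wise,
$$x^\beta = g^{\beta\alpha} x_\alpha, \qquad\text{after which}\qquad x^\beta + y^\beta$$
is a well-defined sum of two contravariant vectors (lowering $y^\beta$ instead works equally well).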

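For exercise 3, one possible starting point (a hint rather than a full proof) is the transformation rule of the metric,
$$g'_{\mu\nu} = \Lambda_\mu{}^\alpha \Lambda_\nu{}^\beta\, g_{\alpha\beta},$$
which in matrix language sandwiches $g$ between $\Lambda$ and its transpose; the question is then what an invertible $\Lambda$ can and cannot change about the signs on the diagonal.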
C.6 Tensor calculus

1. Prove that the following equations are covariant:
$$z_{\alpha\beta} = x_{\beta\alpha}, \qquad z_{\alpha\beta} = x_{\alpha\beta} + y_{\beta\alpha}.$$
(A sketch for the first equation follows below.)

2. Why is it not necessary, in
$$(\mathrm{grad}\, v)^\mu{}_\nu = \frac{\partial v^\mu}{\partial x^\nu},$$
to put the indices on the left-hand side of the equation in a fixed horizontal order?
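Finally, a sketch of covariance for the first equation of exercise 1, assuming $x$ and $z$ transform as covariant 2nd-order tensors:
$$z'_{\alpha\beta} = \Lambda_\alpha{}^\mu \Lambda_\beta{}^\nu\, z_{\mu\nu} = \Lambda_\alpha{}^\mu \Lambda_\beta{}^\nu\, x_{\nu\mu} = x'_{\beta\alpha},$$
so the relation keeps exactly the same form in the primed coordinates; this form-invariance is what the 'covariance' of an equation means.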

