Classical Mechanics

Undergraduate Lecture Notes in Physics
Reinhard Hentschke
Classical
Mechanics
Including an Introduction to the
Theory of Elasticity
More information about this series at http://www.springer.com/series/8917

Reinhard Hentschke
School of Mathematics and Natural Sciences
Bergische Universität
Wuppertal
Germany
ISSN 2192-4791 ISSN 2192-4805 (electronic)

ISBN 978-3-319-48709-0 ISBN 978-3-319-48710-6 (eBook)
DOI 10.1007/978-3-319-48710-6
Library of Congress Control Number: 2016958492
© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature

The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This textbook on classical mechanics is intended for physics students who
encounter the subject as a part of their undergraduate curriculum in theoretical
physics.
Chapter 1, Mathematical Background, reviews the mathematical ‘tool chest’ of
classical mechanics. Emphasis is placed on the practical application of each tool,
making its usefulness to the subject as transparent as possible. Readers who are
thoroughly familiar with the material may skip this chapter. But even then it is
probably a good idea to at least briefly look over the different sections, focussing
particularly on the problems and examples, where the mathematical ‘tools’ are
applied in physics-related contexts. Chapter 2, Laws of Mechanics, provides a
general overview. It is intended as a guide rail for the beginner leading the way
through the basic equations and concepts of mechanics. Chapter 3, Least Action
Principle for one Coordinate, introduces Lagrangian mechanics in simplified
fashion. According to the author’s experience most undergraduate physics students
do find it difficult to interrelate the standard ingredients of undergraduate theoretical
physics, i.e. Newton’s equations of motion in mechanics, Maxwell’s equations in
electrodynamics or Schrödinger’s equation in quantum mechanics. This, to me,
makes it worthwhile to introduce the least action principle, as a unifying concept in
physics, at this early stage. Together the three chapters constitute the introductory
part of the present text.
The following five chapters comprise topics central to analytical
mechanics. First, in Chap. 4, The Least Action Principle, the least action principle is
reintroduced and discussed from a more general perspective. This includes the
relation between conservation laws and symmetries, the description of motion in
accelerated coordinate systems or dynamic stability. Chapter 5, Integration of the
Equations of Motion, is devoted mainly to two-body problems including celestial
mechanics and scattering. The next chapter, Small Oscillations, focuses on oscillations.
This encompasses the standard harmonic oscillator including dissipation
and external forces, dispersion relations of harmonic chains, normal mode analysis
and related aspects. Chapter 7, Motion of Rigid Bodies, discusses the motion of
rigid bodies. The moment of inertia tensor is defined and its meaning in different
coordinate systems is discussed in detail. The equations of motion for rigid bodies
are introduced including their representation in terms of the Euler equations using
Euler angles and quaternions. The final chapter of the central part, Canonical
Mechanics, introduces Hamiltonian dynamics and Hamilton-Jacobi theory.
The subsequent chapters address topics outside the standard content of classical
mechanics. Chapter 9, Many-Body Systems, focuses on the mechanics of many-body
systems. This includes the numerical solution of the equations of motion, in particular
the Molecular Dynamics simulation technique, as well as the foundations of
statistical mechanics contrasting the approaches due to Boltzmann and Gibbs. The
chapter concludes with a brief discussion of the transition to chaos. Chapter 10,
Theory of Elasticity, presents an introduction to the theory of elasticity. This subject,
which is hardly ever touched upon in the current physics curriculum, possesses a
wide spectrum of applications. These range from the microscopic physics of cells to
the design of precision instrumentation or the macroscopic mechanics of materials.
In my opinion the theory of elasticity does deserve increased attention. The aim of
this chapter is to provide the students with a basic introduction to this subject and the
background knowledge on which to expand as necessary using more advanced
literature. At the beginning, strain and stress tensors, the free energy of an isotropic
elastic body and its attendant equilibrium conditions are derived and their uses are
illustrated in a series of examples. Because in practice most problems in elasticity
require numerical solution and because the method is of general use, Chap. 10
includes a section on the finite element method. The final section of the chapter
summarizes the basic application of mechanics to viscoelasticity. This section also
highlights that friction or, more generally, the dissipation of energy in mechanical
systems is still not very well understood and continues to be a field of active
research.
Throughout the book the reader will encounter three types of highlighted
examples. Most of them are solved problems. I find it important to supply the reader
with opportunities for exercising newly acquired concepts without getting stuck and
thus losing interest. Occasionally a problem recurs, when it is useful to compare
different solution approaches. Then there are ordinary examples designed to practice
a new concept during reading. The only exceptions to this rule are the more
elaborate numerical Mathematica examples in Chap. 10. In addition, a certain
number of advanced examples serve the purpose to either explain a difficult point or
relate the current material to other areas in physics. This can be an extension of
classical mechanics to quantum mechanics, via the so-called quasi-classical
approximation, or the strengthening of rubber materials through filler nanoparticles.
These examples, even though they can be omitted on a first reading, are intended to
supply additional motivation for the context in which they are embedded.
Selected headings are accompanied by a raised symbol, which provides a rough
guidance for the materials selection. The meaning of the symbols is as follows:
• †: Material of particular importance to the beginner.
• ‡: The content should be included according to time and necessity.
• no symbol: Materials important for more advanced students.
• ~: The chapter on the theory of elasticity certainly requires considerable extra
time. This extra time may not always be affordable. Thus, it is a ‘matter of the
heart’ how much effort one is willing to spend on the various sections comprising
this chapter.
It is very likely that despite my effort to the contrary, this text will contain errors.
On my website (http://constanze.materials.uni-wuppertal.de) readers can find a
continuously updated list of corrections.
Wuppertal, Germany Reinhard Hentschke

Contents
1 Mathematical Tools
1.1 Coordinates†
1.2 Vectors†
1.3 Matrices†
1.4 Derivatives and Integrals†
1.5 Complex Numbers†
Reference
2 Laws of Mechanics
2.1 An Overview†
2.2 Two Examples in Newtonian Mechanics†
References
3 Least Action Principle for One Coordinate
3.1 Euler–Lagrange Equation for One Coordinate†
3.2 Two Simple Examples†
3.3 The Meaning of the Least Action Principle†
Reference
4 Principle of Least Action
4.1 Lagrangian for a System of Point Masses
4.2 Conserved Quantities
4.3 Lagrangians in Accelerated Systems
4.4 An Application in Theoretical Chemistry
References
5 Integrating the Equations of Motion
5.1 One-Dimensional Motion†
5.2 Two-Body Central Force Motion†
5.3 Scattering‡
References
6 Small Oscillations
6.1 One-Dimensional Motion†
6.2 Normal Mode Analysis
References
7 Rigid Body Motion
7.1 Moment of Inertia Tensor and Angular Momentum†
7.2 Equations of Motion for a Rigid Body
7.3 Static Contact Between Rigid Bodies†
8 Canonical Mechanics
8.1 Hamilton’s Equations of Motion
8.2 Hamilton–Jacobi Theory
9 Many-Particle Mechanics
9.1 Numerical Solution of the Equations of Motion†
9.2 Molecular Dynamics Simulation
9.3 From Mechanics to Statistical Mechanics‡
9.4 Classification of Dynamical Systems
9.5 Roads to Chaos
References
10 Basic Equations of the Theory of Elasticity~
10.1 Strain and Stress Tensors
10.2 Free Energy
10.3 Examples
10.4 Finite Element Method
10.5 Dynamic Mechanical Analysis of Viscoelastic Materials
References
Appendix A: Identities and Units
Appendix B: Mathematica MD in the NVE-Ensemble
Index
Chapter 1
Mathematical Tools
The following is a compilation of essential mathematical tools needed in theoretical
mechanics. A transparent connection between the tool and its application is favored
over mathematical rigor. One important item, differential equations, is postponed to
later in the book, where the subject is discussed in the context of specific problems.
A good source of additional mathematical support is [1].
1.1 Coordinates†
A point on a straight line or coordinate axis is represented by a number defining its
position relative to some origin on the same line (cf. Fig. 1.1). A point on a plane
requires two numbers or coordinates to specify its position. In the case of a point
in space three numbers are needed and so on. It is important to pay attention to the
handedness of the respective coordinate system. Here we use right-handed systems,
i.e. the thumb of the right hand indicates the x-direction, the index finger points along
the y-direction, and the middle finger represents the z-direction.
Sometimes it is useful to replace the cartesian coordinates in Fig. 1.1 by so-called
symmetry adapted coordinates. Important examples are polar coordinates in two
dimensions and cylindrical or spherical coordinates in three dimensions. Polar coordinates
(see Fig. 1.2) are related to the cartesian coordinates x and y via
\[
\begin{aligned}
x &= r \cos\phi \\
y &= r \sin\phi
\end{aligned}
\tag{1.1}
\]
($r = \sqrt{x^2 + y^2}$). The quantity r is the distance of the point P from the coordinate
system’s origin. φ is the angle between the line connecting the origin with P and the
x-axis. Cylindrical (see Fig. 1.3) and spherical coordinates (see Fig. 1.4) are related
to the cartesian coordinates x, y, and z via
© Springer International Publishing AG 2017 1
R. Hentschke, Classical Mechanics, Undergraduate Lecture Notes in Physics,
DOI 10.1007/978-3-319-48710-6_1
Fig. 1.1 Cartesian coordinates
Fig. 1.2 Polar coordinates. Every P is uniquely identified by one r- and one φ-value (0 ≤ φ ≤ 2π)
Fig. 1.3 Cylindrical coordinates. Every P is uniquely identified by one ρ-, one φ-, and one z-value (0 ≤ φ ≤ 2π)
Fig. 1.4 Spherical coordinates. Every P is uniquely identified by one r-, one φ-, and one θ-value (0 ≤ φ ≤ 2π, 0 ≤ θ ≤ π)
\[
\begin{aligned}
x &= \rho \cos\phi \\
y &= \rho \sin\phi \\
z &= z
\end{aligned}
\tag{1.2}
\]
and
\[
\begin{aligned}
x &= r \cos\phi \sin\theta \\
y &= r \sin\phi \sin\theta \\
z &= r \cos\theta ,
\end{aligned}
\tag{1.3}
\]
respectively. The quantity ρ is the perpendicular separation of the point P from the
z-axis. The meaning of φ is the same as in the case of polar coordinates. In the case
of spherical coordinates r is the length of the line connecting the origin with P. Here
φ is the angle between the perpendicular projection of this line onto the x-y-plane
and the x-axis. The second angle, θ, is the angle between the aforementioned line
and the z-axis.
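The conversions (1.1)-(1.3) can be tried out numerically. The following is a minimal Python sketch and not part of the original text; the function names are illustrative assumptions, and the inverse transformations (which the text does not write out) assume that φ is mapped into the interval [0, 2π).

```python
import math


def polar_to_cartesian(r, phi):
    """Eq. (1.1): polar (r, phi) -> cartesian (x, y)."""
    return r * math.cos(phi), r * math.sin(phi)


def cartesian_to_polar(x, y):
    """Inverse of Eq. (1.1); phi is mapped into [0, 2*pi)."""
    return math.hypot(x, y), math.atan2(y, x) % (2.0 * math.pi)


def cylindrical_to_cartesian(rho, phi, z):
    """Eq. (1.2): cylindrical (rho, phi, z) -> cartesian (x, y, z)."""
    return rho * math.cos(phi), rho * math.sin(phi), z


def spherical_to_cartesian(r, phi, theta):
    """Eq. (1.3): theta is measured from the z-axis, phi from the x-axis."""
    sin_theta = math.sin(theta)
    return (r * math.cos(phi) * sin_theta,
            r * math.sin(phi) * sin_theta,
            r * math.cos(theta))


def cartesian_to_spherical(x, y, z):
    """Inverse of Eq. (1.3); the angles are undefined at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("angles are undefined at the origin")
    # Clamp against rounding so that acos never receives |argument| > 1.
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return r, phi, theta
```

A round trip such as `cartesian_to_spherical(*spherical_to_cartesian(2.0, 0.3, 1.1))` should reproduce the input up to rounding errors, which is a quick way to check the angle conventions.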
1.2 Vectors†
Vectors in physics describe all quantities possessing magnitude as well as orientation.

Examples are velocity and force. These vectors are represented by arrows in space
whose length is the magnitude of the physical quantity. The orientation of the vector
is the orientation of the physical quantity. Numerically the vector is represented by
its components. Here the components are the perpendicular projections of the vector
onto the axes of the coordinate system. An example is shown in Fig. 1.5. The vector is
$\vec{a}$ and its components are the $a_i$ ($i = 1, 2, 3$). The common mathematical expression
for $\vec{a}$ in terms of its components is
\[
\vec{a} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_d \end{pmatrix} . \tag{1.4}
\]
In this case d is the space dimension, which may be different from d = 3. Especially
in three dimensions we often use the form
\[
\vec{a} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} , \tag{1.5}
\]
where x, y, and z are the components in a right-handed rectangular coordinate system.

Fig. 1.5 The vector $\vec{a}$ and its components in three dimensions

Vectors may be associated with discrete points in space, like the force on a bridge
bearing. In other cases variable vectors are associated with every point in space. This
is called a vector field. An example of a vector field is the displacement field within
a column due to the earth’s gravitation as shown in Fig. 1.6.
Vector algebra† :
The most important vector operations are defined as follows:

• Multiplication of a vector by a scalar (a number) c:
\[
c\,\vec{a} = c \begin{pmatrix} a_1 \\ a_2 \\ \vdots \end{pmatrix}
= \begin{pmatrix} c a_1 \\ c a_2 \\ \vdots \end{pmatrix} . \tag{1.6}
\]

Fig. 1.6 Deformation of a square column in the earth’s gravitational field. The arrows show the magnitude and direction of the displacement of volume elements inside the column relative to their position in the absence of the gravitational field

• Adding and subtracting vectors (cf. Fig. 1.7):
\[
\vec{a} + \vec{b} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \end{pmatrix}
+ \begin{pmatrix} b_1 \\ b_2 \\ \vdots \end{pmatrix}
= \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \end{pmatrix}
\]
\[
\vec{a} - \vec{b} = \vec{a} + (-1)\,\vec{b}
= \begin{pmatrix} a_1 - b_1 \\ a_2 - b_2 \\ \vdots \end{pmatrix} .
\]
• Magnitude of a vector: The magnitude $|\vec{a}|$ of the vector $\vec{a}$ simply is the length of
its arrow, which follows via Pythagoras’s theorem:
\[
|\vec{a}| \equiv a \equiv \sqrt{a_1^2 + a_2^2 + \cdots} = \sqrt{\sum_{i=1}^{d} a_i^2} . \tag{1.7}
\]

Fig. 1.7 Adding and subtracting vectors

• Scalar or dot product: The scalar or dot product, as the name suggests, yields a
scalar. Its definition is:
\[
\vec{a} \cdot \vec{b} = \sum_{i=1}^{d} a_i b_i \equiv a_i b_i . \tag{1.8}
\]
Notice that $a_i b_i$ on the right, without the summation symbol, is a shorthand notation,
which is called summation convention. If in the following the same index appears
twice in a product, it automatically means that the summation convention is applied.
In three-dimensional space the definition (1.8) implies
\[
\vec{a} \cdot \vec{b} = a b \cos\gamma \tag{1.9}
\]
(cf. Problem 1). Here γ is the angle between the vectors $\vec{a}$ and $\vec{b}$.
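• Problem 1 - Scalar Product: Show that
\[
\vec{a} \cdot \vec{b} = |\vec{a}|\,|\vec{b}| \cos(\vec{a}, \vec{b}) .
\]
Solution: Based on the following sketch (in which h is the perpendicular from the tip
of $\vec{b}$ onto $\vec{a}$, dividing $\vec{a}$ into the segments q and p) we may write
\[
\vec{d} = \vec{b} - \vec{a} \quad \text{or} \quad d^2 = b^2 + a^2 - 2\,\vec{a} \cdot \vec{b} . \tag{1.10}
\]
Application of Pythagoras’s theorem yields
\[
d^2 = h^2 + p^2 \quad \text{and} \quad b^2 = h^2 + q^2 .
\]
In addition
\[
a^2 = (q + p)^2 = q^2 + p^2 + 2qp .
\]
The combination of the last three equations yields
\[
q (q + p) = \vec{a} \cdot \vec{b} .
\]
Inserting $q = |\vec{b}| \cos\gamma$ and $q + p = |\vec{a}|$ completes the proof.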
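The relations (1.7)-(1.10) are easy to check numerically for specific vectors. The following Python sketch is an illustration added here, not the author's code; the helper names `dot`, `norm`, `angle` and `law_of_cosines_sides` are assumptions chosen for this example.

```python
import math


def dot(a, b):
    """Scalar product, Eq. (1.8): sum over a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def norm(a):
    """Magnitude of a vector, Eq. (1.7)."""
    return math.sqrt(dot(a, a))


def angle(a, b):
    """Angle gamma between a and b, obtained by solving Eq. (1.9)."""
    cos_gamma = dot(a, b) / (norm(a) * norm(b))
    # Clamp against rounding errors before calling acos.
    return math.acos(max(-1.0, min(1.0, cos_gamma)))


def law_of_cosines_sides(a, b):
    """Return both sides of Eq. (1.10) for d = b - a."""
    d = tuple(bi - ai for ai, bi in zip(a, b))
    return dot(d, d), dot(b, b) + dot(a, a) - 2.0 * dot(a, b)
```

For example, `angle((3.0, 0.0, 0.0), (1.0, 1.0, 0.0))` gives π/4 up to rounding, and the two values returned by `law_of_cosines_sides` agree for any pair of vectors of equal dimension.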
• Vector or cross product: We limit our discussion to the case d = 3. Then the cross
product is defined via
\[
\vec{a} \times \vec{b} \equiv \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} \tag{1.11}
\]
(notice: $\vec{a} \times \vec{b} = -\vec{b} \times \vec{a}$). The result of the cross product again is a vector.

The motivation of the scalar product is the projection of one vector onto another.
But what is the motivation for this seemingly complicated new definition? We combine
the two products and calculate the quantity $(\vec{a} \times \vec{b}) \cdot \vec{a}$. The result is
\[
(\vec{a} \times \vec{b}) \cdot \vec{a} = a_2 b_3 a_1 + a_3 b_1 a_2 + a_1 b_2 a_3 - a_3 b_2 a_1 - a_1 b_3 a_2 - a_2 b_1 a_3 = 0 .
\]
The same happens in the case of $(\vec{a} \times \vec{b}) \cdot \vec{b}$. Thus $\vec{a} \times \vec{b}$ is perpendicular to $\vec{a}$ as
well as to $\vec{b}$. Because the relative orientation of two vectors does not change under
rotation of the coordinate system, it is sufficient to continue with the special case
when $\vec{a}$ is along the x-axis and $\vec{b}$ lies in the x-y-plane:
\[
\begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ 0 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ a b_2 \end{pmatrix} .
\]
We do find that $\vec{a}$, $\vec{b}$, and $\vec{a} \times \vec{b}$ span a right-handed system of axes. With
$a b_2 = a b \sin\gamma$ follows
\[
|\vec{a} \times \vec{b}| = a b\, |\sin\gamma| . \tag{1.12}
\]
The magnitude of $\vec{a} \times \vec{b}$ therefore is the shaded area in Fig. 1.8. Straightforward
continuation leads to the conclusion that the magnitude of
\[
(\vec{a} \times \vec{b}) \cdot \vec{c} \tag{1.13}
\]
is the volume of the skewed box also shown in Fig. 1.8. By explicit multiplication
we find
\[
(\vec{a} \times \vec{b}) \cdot \vec{c} = (\vec{b} \times \vec{c}) \cdot \vec{a} = (\vec{c} \times \vec{a}) \cdot \vec{b} . \tag{1.14}
\]
Notice that this triple product is not a true scalar. If we change the signs on all axes,
i.e. x → −x, y → −y and z → −z, then the triple product will also change its sign
contrary to a true scalar.

Fig. 1.8 Two vectors $\vec{a}$ and $\vec{b}$ spanning the shaded area in the x-y-plane. Together with the vector $\vec{c}$ they define the edges of a skewed box indicated by the dotted lines
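The properties of the cross product stated above, i.e. perpendicularity, the cyclic symmetry (1.14), and the sign change of the triple product under inversion of all axes, can be verified for concrete numbers. The Python sketch below is an added illustration under the assumption of three-component tuples; the names `cross` and `triple` are chosen here and do not appear in the text.

```python
def dot(a, b):
    """Scalar product of two vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Cross product in three dimensions, Eq. (1.11)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triple(a, b, c):
    """Triple product (a x b) . c, Eq. (1.13)."""
    return dot(cross(a, b), c)


def invert(a):
    """Components of a after the sign change x -> -x, y -> -y, z -> -z."""
    return tuple(-ai for ai in a)
```

With these helpers, `dot(cross(a, b), a)` and `dot(cross(a, b), b)` vanish up to rounding, `triple(a, b, c)`, `triple(b, c, a)` and `triple(c, a, b)` coincide, and `triple(invert(a), invert(b), invert(c))` equals `-triple(a, b, c)`.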
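• Problem 2 - Vectors and Lattices: The cartesian basis vectors of a particular
two-dimensional lattice are
\[
\vec{a}_1 = a \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
\quad \text{and} \quad
\vec{a}_2 = a \begin{pmatrix} 1/2 \\ \sqrt{3}/2 \\ 0 \end{pmatrix} ,
\]
where a is the so-called lattice constant.
(a) Calculate the area spanned by the two vectors $\vec{a}_1$ and $\vec{a}_2$.
(b) Sketch the lattice spanned by $\vec{a}_1$ and $\vec{a}_2$, i.e. the sketch should contain
the lattice points defined by the vectors $i\,\vec{a}_1 + j\,\vec{a}_2$, where i and j are integer
numbers.
(c) Calculate the radii of the three nearest neighbor shells around an arbitrary
node of the lattice spanned by the vectors
\[
\vec{g}_1 = 2\pi\, \frac{\vec{a}_2 \times \vec{e}_3}{\vec{a}_1 \cdot (\vec{a}_2 \times \vec{e}_3)}
\quad \text{and} \quad
\vec{g}_2 = 2\pi\, \frac{\vec{e}_3 \times \vec{a}_1}{\vec{a}_1 \cdot (\vec{a}_2 \times \vec{e}_3)}
\]
(in units of 1/a). Note that $\vec{e}_3$ is a unit vector, i.e. $|\vec{e}_3| = 1$, perpendicular to
the plane defined by $\vec{a}_1$ and $\vec{a}_2$.
Solution: (a) The area is
\[
|\vec{a}_1 \times \vec{a}_2|
= \left| \begin{pmatrix} a_{1,y} a_{2,z} - a_{1,z} a_{2,y} \\ a_{1,z} a_{2,x} - a_{1,x} a_{2,z} \\ a_{1,x} a_{2,y} - a_{1,y} a_{2,x} \end{pmatrix} \right|
= \left| \begin{pmatrix} 0 \\ 0 \\ \sqrt{3}\, a^2 / 2 \end{pmatrix} \right|
= \frac{\sqrt{3}}{2}\, a^2 .
\]
(b) The vector $\vec{a}_1$ lies on the x-axis. The magnitude of $\vec{a}_2 / a$ is given by
\[
\frac{|\vec{a}_2|}{a} = \sqrt{\frac{1}{4} + \frac{3}{4}} = 1 .
\]
The angle φ between the x-axis and $\vec{a}_2$ follows from
\[
\frac{1}{2} = 1 \cdot \cos\phi , \quad \text{i.e.} \quad \phi = 60^\circ .
\]
The upper part of the following sketch shows the triangular lattice produced
by the linear combination $n\,\vec{a}_1 + m\,\vec{a}_2$ ($n, m = 0, \pm 1, \pm 2, \ldots$).
(c) Our starting point is
\[
\vec{a}_2 \times \vec{e}_3 = a \begin{pmatrix} \sqrt{3}/2 \\ -1/2 \\ 0 \end{pmatrix}
\]
with $\vec{e}_3 = (0, 0, 1)$. Thus
\[
\vec{a}_1 \cdot (\vec{a}_2 \times \vec{e}_3) = \frac{\sqrt{3}}{2}\, a^2
\quad \text{and} \quad
\vec{e}_3 \times \vec{a}_1 = a \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} .
\]
The final result is
\[
\vec{g}_1 = \frac{2\pi}{a} \begin{pmatrix} 1 \\ -1/\sqrt{3} \\ 0 \end{pmatrix}
\quad \text{and} \quad
\vec{g}_2 = \frac{2\pi}{a} \begin{pmatrix} 0 \\ 2/\sqrt{3} \\ 0 \end{pmatrix} .
\]
The radii are defined via
\[
d = |n\,\vec{g}_1 + m\,\vec{g}_2| \quad (n, m: \text{integer numbers}) ,
\]
i.e.
\[
d = \frac{4\pi}{\sqrt{3}\,a} \sqrt{n^2 - nm + m^2} .
\]
Inserting different values for n and m we obtain the radii
\[
d|_{1.} = \frac{4\pi}{\sqrt{3}\,a} , \qquad d|_{2.} = \frac{4\pi}{a} , \qquad d|_{3.} = \frac{8\pi}{\sqrt{3}\,a} .
\]
The lattice spanned by $\vec{g}_1$ and $\vec{g}_2$ is shown in the bottom part of the above
sketch. The radii of the two circles are $d|_{1.}$ and $d|_{2.}$.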
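The results of Problem 2 can be cross-checked by brute force. The following Python sketch is an added illustration, not the author's solution method: it sets the lattice constant to a = 1 (an assumption made for convenience), builds $\vec{g}_1$ and $\vec{g}_2$ from their definitions, and collects the smallest distinct distances $|n\,\vec{g}_1 + m\,\vec{g}_2|$ for small integers n and m. The function names and the search range `n_max` are hypothetical choices.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def reciprocal_basis(a1, a2, e3=(0.0, 0.0, 1.0)):
    """g1 and g2 as defined in part (c) of Problem 2."""
    factor = 2.0 * math.pi / dot(a1, cross(a2, e3))
    g1 = tuple(factor * x for x in cross(a2, e3))
    g2 = tuple(factor * x for x in cross(e3, a1))
    return g1, g2


def shell_radii(g1, g2, count=3, n_max=4, tol=1e-9):
    """Smallest `count` distinct values of |n*g1 + m*g2|, (n, m) != (0, 0)."""
    lengths = sorted(
        norm(tuple(n * u + m * v for u, v in zip(g1, g2)))
        for n in range(-n_max, n_max + 1)
        for m in range(-n_max, n_max + 1)
        if (n, m) != (0, 0)
    )
    radii = []
    for length in lengths:
        # Treat lengths closer than `tol` as belonging to the same shell.
        if not radii or length - radii[-1] > tol:
            radii.append(length)
    return radii[:count]


LATTICE_CONSTANT = 1.0  # assumption: lengths are measured in units of a
a1 = (LATTICE_CONSTANT, 0.0, 0.0)
a2 = (0.5 * LATTICE_CONSTANT, 0.5 * math.sqrt(3.0) * LATTICE_CONSTANT, 0.0)
area = norm(cross(a1, a2))
g1, g2 = reciprocal_basis(a1, a2)
radii = shell_radii(g1, g2)
```

Under these assumptions `area` should agree with $\sqrt{3}/2$ and `radii` with $4\pi/\sqrt{3}$, $4\pi$ and $8\pi/\sqrt{3}$, in line with parts (a) and (c).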
• Cross product and rotation: The following is another and perhaps the most important
motivation for introducing the cross product from the point of view of physics. The cross
product can be used to describe the rotation of a vector with respect to an axis
as illustrated in Fig. 1.9. We choose an arbitrary rotation axis through an arbitrary
origin. Subsequently we carry out an infinitesimal rotation with respect to this axis.
The rotation of $\vec{r}$ generates a new vector $\vec{r} + \delta\vec{r}$. From Fig. 1.9 we obtain
\[
\delta\vec{r} = r \sin\theta\, \delta\phi\, \vec{e}_{\delta r} \overset{(1.12)}{=} \delta\vec{\phi} \times \vec{r} .
\]

Fig. 1.9 Infinitesimal rotation

Here $\vec{e}_{\delta r}$ is a unit vector parallel to $\delta\vec{r}$, and $\delta\vec{\phi}$ is a vector parallel to the axis of
rotation. Its magnitude is δφ and its orientation is given by the right-hand-rule, i.e.
the right hand thumb shows the vector’s orientation whereas the fingers indicate the
direction of rotation. After dividing this equation by δt we obtain
\[
\dot{\vec{r}} = \dot{\vec{\phi}} \times \vec{r} .
\]
$\dot{\vec{r}}$ is the velocity of a point at $\vec{r}$ due to the angular velocity $\dot{\vec{\phi}}$ of the rotation. Commonly
one uses the greek letter ω for the angular velocity, i.e. $\vec{\omega} \equiv \dot{\vec{\phi}}$.
It is important to realize that the equation
\[
\delta\vec{a} = \delta\vec{\phi} \times \vec{a} \tag{1.15}
\]
holds for every vector $\vec{a}$, even if its origin is not located on the axis of rotation.
This we can show as follows: Let $\vec{b} = \vec{a} + \vec{c}$, where the origin of $\vec{c}$ is on the axis of
rotation. Therefore also the origin of $\vec{b}$ is on the axis of rotation. Thus
$\delta\vec{a} = \delta\vec{b} - \delta\vec{c} = \delta\vec{\phi} \times (\vec{b} - \vec{c}) = \delta\vec{\phi} \times \vec{a}$, which proves the point.
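The statement that $\vec{r} + \delta\vec{\phi} \times \vec{r}$ describes a rotation only for infinitesimal angles can be made tangible by comparing it with an exact finite rotation. The Python sketch below is an added illustration that assumes, for simplicity, a rotation about the z-axis; the function names are hypothetical. The deviation between the two results should shrink quadratically with the angle.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_about_z(r, angle):
    """Exact rotation of r by `angle` about the z-axis (right-hand rule)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * r[0] - s * r[1], s * r[0] + c * r[1], r[2])


def first_order_rotation(r, delta_phi_vec):
    """r + delta_phi x r, the infinitesimal-rotation formula of the text."""
    dr = cross(delta_phi_vec, r)
    return tuple(ri + dri for ri, dri in zip(r, dr))


def max_deviation(r, delta_phi):
    """Largest component-wise difference between exact and first-order result."""
    exact = rotate_about_z(r, delta_phi)
    approx = first_order_rotation(r, (0.0, 0.0, delta_phi))
    return max(abs(e - a) for e, a in zip(exact, approx))
```

Reducing `delta_phi` by a factor of ten in `max_deviation((1.0, 2.0, 3.0), delta_phi)` should reduce the deviation by roughly a factor of one hundred, as expected for an error of second order in δφ.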
1.3 Matrices†
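Suppose we want to convert the components of the vector $\vec{a}$ from one coordinate
system to another. Figure 1.10 is an illustration in two dimensions. Here the primed
coordinate system is rotated relative to the unprimed system. In the unprimed system
the components of $\vec{a}$ are given by
\[
\begin{aligned}
a_1 &= a \cos(\varphi + \phi) = a \left( \cos\varphi \cos\phi - \sin\varphi \sin\phi \right) \\
a_2 &= a \sin(\varphi + \phi) = a \left( \sin\varphi \cos\phi + \cos\varphi \sin\phi \right) .
\end{aligned}
\tag{1.16}
\]

Fig. 1.10 Vector $\vec{a}$ in two rotated coordinate systems

In the primed system we have instead
\[
\begin{aligned}
a'_1 &= a \cos\phi \\
a'_2 &= a \sin\phi .
\end{aligned}
\tag{1.17}
\]
The attendant conversion relations are
\[
\begin{aligned}
a_1 &= \cos\varphi\, a'_1 - \sin\varphi\, a'_2 \\
a_2 &= \sin\varphi\, a'_1 + \cos\varphi\, a'_2
\end{aligned}
\tag{1.18}
\]
or
\[
\begin{aligned}
a'_1 &= \cos\varphi\, a_1 + \sin\varphi\, a_2 \\
a'_2 &= -\sin\varphi\, a_1 + \cos\varphi\, a_2 .
\end{aligned}
\tag{1.19}
\]
Using the definitions $D_{11} = \cos\varphi$, $D_{12} = \sin\varphi$, $D_{21} = -\sin\varphi$, and $D_{22} = \cos\varphi$,
this can be expressed as
\[
\begin{aligned}
a'_1 &= D_{11} a_1 + D_{12} a_2 \\
a'_2 &= D_{21} a_1 + D_{22} a_2 .
\end{aligned}
\tag{1.20}
\]
Another way of writing this is
\[
\begin{pmatrix} a'_1 \\ a'_2 \end{pmatrix}
= \begin{pmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{pmatrix}
\cdot \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
\tag{1.21}
\]
or
\[
\vec{a}' = \mathbf{D} \cdot \vec{a} \tag{1.22}
\]
with
\[
\mathbf{D} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} . \tag{1.23}
\]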
The quantity $\mathbf{D}$ is a matrix - in this case a 2 × 2 matrix. Matrix $\mathbf{D}$ is an example of
a square matrix, on which we concentrate in the following. Equation (1.22) shows
how the multiplication of an n × n matrix by an n-component vector yields a new
n-component vector. In the present case n = 2. But the generalization is obvious -
particularly when we write (1.22) in terms of components:
\[
a'_i = \sum_{j=1}^{2} D_{ij}\, a_j . \tag{1.24}
\]
The summation convention introduced above is very useful here, i.e. (1.24) may be
expressed via
\[
a'_i = D_{ij}\, a_j . \tag{1.25}
\]
Analogous to (1.22) and (1.24) we may also write
\[
\vec{a} = \mathbf{D}^{-1} \cdot \vec{a}' , \tag{1.26}
\]
where
\[
\mathbf{D}^{-1} = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} . \tag{1.27}
\]
Notice that here −1 is not a mathematical operation but part of the name for this new
matrix.
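Equations (1.22)-(1.27) translate directly into a few lines of code. The following Python sketch is an added illustration that represents matrices as nested lists; the function names are assumptions made for this example.

```python
import math


def rotation_matrix(varphi):
    """The matrix D of Eq. (1.23)."""
    c, s = math.cos(varphi), math.sin(varphi)
    return [[c, s], [-s, c]]


def inverse_rotation_matrix(varphi):
    """The matrix D^{-1} of Eq. (1.27)."""
    c, s = math.cos(varphi), math.sin(varphi)
    return [[c, -s], [s, c]]


def mat_vec(matrix, vector):
    """Component form of Eq. (1.24): a'_i = sum_j D_ij a_j."""
    return [sum(row[j] * vector[j] for j in range(len(vector)))
            for row in matrix]
```

As a usage example, a vector with unprimed components (1, 0) has the primed components `mat_vec(rotation_matrix(math.pi / 2), [1.0, 0.0])`, i.e. (0, −1) up to rounding, if the primed system is rotated by 90°. Applying `inverse_rotation_matrix` with the same angle recovers the unprimed components, in accordance with (1.26).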
Matrix Algebra† :
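• Addition of matrices: $\mathbf{A}$ and $\mathbf{B}$ are two 2 × 2 matrices. $c_1$ and $c_2$ are scalars. Then
we may write
\[
c_1 \mathbf{A} + c_2 \mathbf{B} = \begin{pmatrix} c_1 A_{11} + c_2 B_{11} & c_1 A_{12} + c_2 B_{12} \\ c_1 A_{21} + c_2 B_{21} & c_1 A_{22} + c_2 B_{22} \end{pmatrix} . \tag{1.28}
\]
The generalization to n > 2 is easy.

• Matrix multiplication: The multiplication of $\mathbf{A}$ and $\mathbf{B}$, i.e. $\mathbf{A} \cdot \mathbf{B} = \mathbf{C}$, is defined
via
\[
A_{ik} B_{kj} = C_{ij} \tag{1.29}
\]
(summation convention!). It is important to note that in general $\mathbf{A} \cdot \mathbf{B} \neq \mathbf{B} \cdot \mathbf{A}$, i.e.
the commutative law for multiplication of matrices is not satisfied in general.
If we apply the above definition (1.29) to the product $\mathbf{D} \cdot \mathbf{D}^{-1}$, then we obtain
\[
D_{ik} D^{-1}_{kj} = \begin{pmatrix} \cos^2\varphi + \sin^2\varphi & -\cos\varphi \sin\varphi + \sin\varphi \cos\varphi \\ -\sin\varphi \cos\varphi + \cos\varphi \sin\varphi & \sin^2\varphi + \cos^2\varphi \end{pmatrix} , \tag{1.30}
\]
i.e.
\[
D_{ik} D^{-1}_{kj} = \delta_{ij} \tag{1.31}
\]
with
\[
\delta_{ij} = \begin{cases} 1 & (i = j) \\ 0 & (i \neq j) \end{cases} . \tag{1.32}
\]
In matrix form (1.31) becomes
\[
\mathbf{D} \cdot \mathbf{D}^{-1} = \mathbf{I} . \tag{1.33}
\]
Here $\mathbf{I}$ is the so-called unit or identity matrix. Multiplication of a vector or a matrix
with the unit matrix does not change either one, i.e. $\mathbf{A} \cdot \mathbf{B} = \mathbf{A} \cdot \mathbf{I} \cdot \mathbf{B}$ and $\mathbf{A} \cdot \mathbf{I} \cdot \vec{a} = \mathbf{A} \cdot \vec{a}$.
Because of (1.33) the matrix $\mathbf{D}^{-1}$ is the inverse matrix of $\mathbf{D}$ and vice versa.
In general $\mathbf{S} \cdot \mathbf{S}^{-1} = \mathbf{I}$ and therefore $\mathbf{S}^{-1} \cdot \mathbf{S} = \mathbf{I}$, i.e.
\[
\mathbf{S} \cdot \mathbf{S}^{-1} = \mathbf{S}^{-1} \cdot \mathbf{S} = \mathbf{I} . \tag{1.34}
\]
This follows via $\mathbf{S} = \mathbf{I} \cdot \mathbf{S} = (\mathbf{S} \cdot \mathbf{S}^{-1}) \cdot \mathbf{S} = \mathbf{S} \cdot (\mathbf{S}^{-1} \cdot \mathbf{S})$.

We remark that the inverse of the above matrix $\mathbf{D}$ can be obtained using (1.31).
The four elements $D^{-1}_{kj}$ are the unknown quantities in a linear system of likewise
four equations.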
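• Transpose of a matrix: The transpose $\mathbf{A}^T$ of an n × n matrix $\mathbf{A}$ is obtained by
exchanging rows and columns of $\mathbf{A}$, i.e.
\[
A_{ij} = A^T_{ji} . \tag{1.35}
\]
Applying this operation to the product of two matrices yields
\[
(\mathbf{A} \cdot \mathbf{B})^T = \mathbf{B}^T \cdot \mathbf{A}^T \quad \text{or} \quad \mathbf{A} \cdot \mathbf{B} = (\mathbf{B}^T \cdot \mathbf{A}^T)^T . \tag{1.36}
\]
Using components we can prove this statement:
\[
(AB)_{ij} = A_{ik} B_{kj} = B_{kj} A_{ik} = (B^T)_{jk} (A^T)_{ki} = ((B^T A^T)^T)_{ij} . \tag{1.37}
\]
• Trace: Another important matrix operation is defined via
\[
Tr(\mathbf{A}) = A_{ii} . \tag{1.38}
\]
Here $Tr(\mathbf{A})$ is the trace of matrix $\mathbf{A}$, which simply is the sum over all elements along
the principal diagonal of $\mathbf{A}$. The trace allows the cyclic permutation of matrices, e.g.
\[
Tr(\mathbf{A} \cdot \mathbf{B} \cdot \mathbf{C}) = Tr(\mathbf{B} \cdot \mathbf{C} \cdot \mathbf{A}) = Tr(\mathbf{C} \cdot \mathbf{A} \cdot \mathbf{B}) . \tag{1.39}
\]
This can be shown using components, i.e. $Tr(\mathbf{A} \cdot \mathbf{B} \cdot \mathbf{C}) = A_{ik} B_{kl} C_{li} = B_{kl} C_{li} A_{ik} = Tr(\mathbf{B} \cdot \mathbf{C} \cdot \mathbf{A})$ etc.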
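The rules (1.29), (1.36) and (1.39) can be checked with small integer matrices, for which the arithmetic is exact. The Python sketch below is an added illustration using nested lists; the helper names and the three example matrices are arbitrary choices made here.

```python
def mat_mul(a, b):
    """Matrix product, Eq. (1.29): C_ij = sum_k A_ik B_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Exchange rows and columns, Eq. (1.35)."""
    return [list(column) for column in zip(*a)]


def trace(a):
    """Sum over the principal diagonal, Eq. (1.38)."""
    return sum(a[i][i] for i in range(len(a)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

commutes = mat_mul(A, B) == mat_mul(B, A)
transpose_rule = transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
cyclic_traces = (trace(mat_mul(mat_mul(A, B), C)),
                 trace(mat_mul(mat_mul(B, C), A)),
                 trace(mat_mul(mat_mul(C, A), B)))
non_cyclic_trace = trace(mat_mul(mat_mul(A, C), B))
```

For these matrices `commutes` is false while `transpose_rule` is true, the three entries of `cyclic_traces` coincide, and `non_cyclic_trace` differs from them. The last observation illustrates that (1.39) permits cyclic permutations only, not arbitrary reorderings.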
• Eigenvalues and eigenvectors: Consider the following equation
\[
\mathbf{A}\, \vec{e}^{\,(1)} = \lambda^{(1)}\, \vec{e}^{\,(1)} \tag{1.40}
\]
or expressed in terms of components
\[
A_{jk}\, e^{(1)}_k = \lambda^{(1)}\, e^{(1)}_j . \tag{1.41}
\]
Here the multiplication of the n × n matrix $\mathbf{A}$ with the vector $\vec{e}^{\,(1)}$ yields a number,
$\lambda^{(1)}$, times the same vector. $\lambda^{(1)}$ is called eigenvalue of $\mathbf{A}$. The vector $\vec{e}^{\,(1)}$ is the
attendant eigenvector. Suppose we have n eigenvalues $\lambda^{(l)}$ and eigenvectors $\vec{e}^{\,(l)}$.
Then we can write down the following generalized form of (1.41):
\[
A_{jk} S_{kl} = \lambda^{(l)} S_{jl} . \tag{1.42}
\]
Here the eigenvectors $\vec{e}^{\,(l)}$ do form the columns of the matrix $\mathbf{S}$. Notice that the
summation convention does not apply to the index (l). This equation is multiplied
from the left side with the inverse of $\mathbf{S}$:
\[
S^{-1}_{ij} A_{jk} S_{kl} = \lambda^{(l)} S^{-1}_{ij} S_{jl} . \tag{1.43}
\]
If the matrix $\mathbf{S}$ is constructed from orthonormal eigenvectors, i.e. $\vec{e}^{\,(p)} \cdot \vec{e}^{\,(q)} = \delta_{pq}$
$\forall\, p, q$, then $S^{-1}_{ij} S_{jl} = \delta_{il}$. Thus we find
\[
S^{-1}_{ij} A_{jk} S_{kl} = \lambda^{(l)} \delta_{il} \tag{1.44}
\]
or in matrix form (and for n = 2)
\[
\mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S} = \begin{pmatrix} \lambda^{(1)} & 0 \\ 0 & \lambda^{(2)} \end{pmatrix} . \tag{1.45}
\]
On the right side of this equation we obtain a diagonal matrix, i.e. all matrix elements
outside the principal diagonal are zero. Here the diagonal elements are the eigenvalues
of A. This procedure, which is called diagonalization, is used repeatedly throughout
this text.
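• Example - Eigenvalues and Eigenvectors: Obtain the eigenvalues and eigenvectors of
$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} .$$
Special method - In this special case we can obtain the solution by looking for the angle ϕ for which $D^{-1} \cdot A \cdot D$, with D from (1.23), is diagonal:
$$D^{-1} \cdot A \cdot D = \begin{pmatrix} -\sin(2\varphi) & \cos(2\varphi) \\ \cos(2\varphi) & \sin(2\varphi) \end{pmatrix} \overset{\varphi=\pi/4}{=} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$$
and
$$D = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} \overset{\varphi=\pi/4}{=} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} .$$
Here the first column vector of D(ϕ = π/4) is the eigenvector belonging to the eigenvalue −1. The second column vector is the eigenvector belonging to the eigenvalue 1.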
General method - We start from the eigenvalue equation

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix} . \tag{1.46}$$
An equivalent form is
$$\begin{pmatrix} -\lambda & 1 \\ 1 & -\lambda \end{pmatrix} \cdot \begin{pmatrix} x \\ y \end{pmatrix} = 0 . \tag{1.47}$$
In order for this equation to possess solutions different from zero, the determinant of the matrix must vanish. The determinant of a 2 × 2 matrix is computed according to
$$\det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21} \tag{1.48}$$
(*). Thus we must require
$$\lambda^2 - 1 = 0 . \tag{1.49}$$
The resulting eigenvalues are λ = ±1. Inserting λ = 1 into (1.47) yields x = y. Choosing x = 1 we obtain the eigenvector (1, 1). After normalization the final result is, as before, $(1/\sqrt{2}, 1/\sqrt{2})$. Inserting λ = −1 into (1.47) yields x = −y. Again we choose x = 1 and obtain the eigenvector (1, −1) or, in normalized form, $(1/\sqrt{2}, -1/\sqrt{2})$. All in all the result is identical to the above. However, this second method is more general.
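The general method can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `eigen_symmetric_2x2` is an assumption made for illustration, and the code only handles symmetric 2 × 2 matrices with a non-zero off-diagonal element.

```python
import math

def eigen_symmetric_2x2(a, b, d):
    """Eigenpairs of the symmetric matrix [[a, b], [b, d]], assuming b != 0."""
    # det(A - lam*1) = 0 gives lam^2 - (a + d)*lam + (a*d - b*b) = 0.
    mean = 0.5 * (a + d)
    root = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    pairs = []
    for lam in (mean - root, mean + root):
        # The first row of (A - lam*1) v = 0 is solved by v = (b, lam - a).
        x, y = b, lam - a
        norm = math.hypot(x, y)
        pairs.append((lam, (x / norm, y / norm)))
    return pairs

# The matrix of the example: a = d = 0, b = 1.
for lam, vec in eigen_symmetric_2x2(0.0, 1.0, 0.0):
    print(lam, vec)
```

For the matrix of the example this should reproduce the eigenvalues ∓1 with the normalized eigenvectors found above.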
(*) The general rule for the evaluation of determinants is
$$\det \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix} = \sum_{\text{permutations of } k,l,\dots,r} \pm\, a_{1k} a_{2l} \dots a_{nr} .$$
Here the sign is (+) when the number of pair permutations necessary to obtain this particular sequence of the indices k, l, . . . , r is even and (−) otherwise. We start from k = 1, l = 2, . . . , r = n, which itself is (+).
1.4 Derivatives and Integrals†
Differentiation Formulas† :
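We consider the function
$$f(x) = x^{\nu} , \tag{1.50}$$
where we assume that ν is an integer. The approximate slope of f(x) at x is given by
$$\frac{\delta f}{\delta x} = \frac{f(x + \delta x) - f(x)}{\delta x} , \tag{1.51}$$
when δx is small. In the limit δx → 0 the new function
$$f'(x) \equiv \frac{df}{dx} = \lim_{\delta x \to 0} \frac{\delta f}{\delta x} \tag{1.52}$$
is called the derivative of f. The notation $f'(x)$, where the prime indicates the first derivative with respect to x, can be applied to higher derivatives as well, i.e. derivatives of derivatives: $\frac{d}{dx}\left(\frac{d}{dx} f(x)\right) \equiv \frac{d^2}{dx^2} f(x)$. Thus $\frac{d^2}{dx^2} f(x) \equiv f''(x)$.
Returning to (1.50) we obtain
$$\frac{\delta f}{\delta x} = \frac{(x + \delta x)^{\nu} - x^{\nu}}{\delta x} = \frac{x^{\nu} \left(1 + \frac{\delta x}{x}\right)^{\nu} - x^{\nu}}{\delta x} . \tag{1.53}$$
Now we make use of
$$\left(1 + \frac{\delta x}{x}\right)^{\nu} \approx 1 + \nu \frac{\delta x}{x} . \tag{1.54}$$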
This follows if we carry out the multiplications, neglecting all terms containing higher powers of δx. The idea is that δx is small and therefore $\delta x \gg \delta x^k$ if k > 1. Inserting (1.54) into (1.53) yields
$$f'(x) = \nu x^{\nu - 1} . \tag{1.55}$$
It is reasonable to assume that (1.55) remains valid even if ν is not an integer. This then implies that the differentiation formula (1.55) is almost all we need, because most functions in physics, at least locally, can be approximated in terms of a power series, i.e.
$$\sum_{\nu} c_{\nu} x^{\nu} ,$$
to which we apply (1.55) term by term.

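Formula (1.51) leads to two important differentiation rules. The first is the product rule:
$$\begin{aligned} \frac{d}{dx}\, g(x) f(x) &\approx \frac{g(x + \delta x) f(x + \delta x) - g(x) f(x)}{\delta x} \\ &\approx \frac{1}{\delta x} \Big[ \big( g'(x)\, \delta x + g(x) \big) \big( f'(x)\, \delta x + f(x) \big) - g(x) f(x) \Big] \\ &\approx \frac{1}{\delta x} \Big[ g' f\, \delta x + g f'\, \delta x + O(\delta x^2) \Big] \\ &= g'(x) f(x) + g(x) f'(x) \end{aligned} \tag{1.56}$$
for δx → 0. The notation $O(\delta x^2)$ means that there are additional terms containing $\delta x^2$ as well as possibly higher powers of δx. However, here these terms do not contribute to the final result. The second rule is the chain rule:
$$\begin{aligned} \frac{d}{dx}\, f(g(x)) &\approx \frac{f(g(x + \delta x)) - f(g(x))}{\delta x} \\ &\approx \frac{1}{\delta x} \Big[ f\Big( g(x) + \frac{dg}{dx}\, \delta x \Big) - f(g(x)) \Big] \\ &\approx \frac{1}{\delta x} \Big[ f(g(x)) + \frac{df}{dg}\, \delta g - f(g(x)) \Big] \\ &= \frac{df}{dg} \frac{dg}{dx} \end{aligned} \tag{1.57}$$
for δx → 0, where $\delta g = (dg/dx)\, \delta x$.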
An important function in physics is the exponential or e-function
$$f(x) = e^x \quad (\equiv \exp[x]) \tag{1.58}$$
(e = 2.718 . . . ). Based on
$$e^{-x} = \left(e^x\right)^{-1} = \frac{1}{e^x}$$
we conclude that $e^x$ is positive in the entire range −∞ < x < ∞. The derivative of the e-function follows via
$$\frac{d}{dx} e^x = \lim_{\delta x \to 0} \frac{e^{x + \delta x} - e^x}{\delta x} = e^x \lim_{\delta x \to 0} \frac{e^{\delta x} - 1}{\delta x} .$$
We decide to evaluate the expression $\lim_{\delta x \to 0} (\dots)$ using a calculator, which produces the list of numbers compiled in Table 1.1.
Thus we conclude
$$\frac{d}{dx} e^x = e^x . \tag{1.59}$$
Remark: What if there is no e-button on our calculator? In this case we can use (1.59)
itself to define and calculate the unknown number e (cf. below).
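The calculator experiment is easy to repeat on a computer. The following short sketch (the function name `exp_difference_quotient` is chosen here for illustration only) evaluates the quotient that multiplies $e^x$ in the derivation of (1.59) for decreasing δx.

```python
import math

def exp_difference_quotient(dx):
    """Return (e^dx - 1)/dx, the factor multiplying e^x in the derivative."""
    return (math.exp(dx) - 1.0) / dx

for dx in (0.1, 0.01, 0.001):
    print(dx, exp_difference_quotient(dx))
```

The printed values should approach 1 as δx decreases, in line with the conclusion drawn from the calculator values.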
The inverse of the e-function is the ln-function (natural logarithm):
$$\ln e^x = x \quad \text{or} \quad e^{\ln x} = x . \tag{1.60}$$
Thus
$$\ln x^n \overset{x = e^y}{=} \ln e^{ny} = ny \ln(e) = n \ln x . \tag{1.61}$$
The derivative of the ln-function follows via
$$1 = \frac{d}{dx}\, x = \frac{d}{dx}\, e^{\ln x} \overset{(1.57)}{=} e^{\ln x}\, \frac{d \ln x}{dx} ,$$
i.e.
$$\frac{d}{dx} \ln x = \frac{1}{x} . \tag{1.62}$$

Table 1.1 Evaluation of $\lim_{\delta x \to 0} (\dots)$ using a calculator

δx       $(e^{\delta x} - 1)/\delta x$
0.1      1.052
0.01     1.005
0.001    1.001
↓        ↓
0        1

The special meaning of the e-function in physics is due to the important formula
$$\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x . \tag{1.63}$$
The validity of this equation can be shown as follows:
$$\ln \left(1 + \frac{x}{n}\right)^n = n \ln \left(1 + \frac{x}{n}\right) \overset{x/n \ll 1}{\approx} n \left\{ \underbrace{\ln 1}_{=0} + \underbrace{\left. \frac{d \ln x}{dx} \right|_{x=1}}_{=1} \frac{x}{n} \right\} = x .$$
The content of the curly brackets will become much clearer in the context of (1.69). The notation $\left. df(x)/dx \right|_{x = x_o}$, abbreviated $\left. df(x)/dx \right|_{x_o}$, means that the value $x = x_o$
is inserted into the derivative. Notice that (1.63) can be used to calculate e = 2.718...
numerically (for x = 1). Even better than (1.63) are the inequalities $\left(1 + \frac{x}{n+1}\right)^n < \left(1 + \frac{x}{n}\right)^n < \left(1 + \frac{x}{n}\right)^{n+1}$, because they allow one to estimate the accuracy of the numerical value for e.
The differentiation rules for sin x and cos x follow from their representation in terms of Euler's formula,¹
$$e^{\pm i x} = \cos x \pm i \sin x , \tag{1.64}$$
which we prove in an example on p. 29. Solving Euler's formula for sin x and cos x we obtain
$$\cos x = \frac{1}{2} \left( e^{ix} + e^{-ix} \right) \tag{1.65}$$
$$\sin x = \frac{1}{2i} \left( e^{ix} - e^{-ix} \right) . \tag{1.66}$$
Here $i = \sqrt{-1}$, $i^2 = -1$ and in particular $i^{-1} = -i$. Using (1.59) in combination with the chain rule yields
$$\frac{d}{dx} \cos x = -\sin x \tag{1.67}$$
$$\frac{d}{dx} \sin x = \cos x . \tag{1.68}$$
1 Euler, Leonhard, Swiss mathematician and physicist, *Basel 15.4.1707, †St. Petersburg 18.9.1783;
he made numerous important contributions to mathematics, physics, and astronomy.

• Example - Derivative of arcsin(x) in the Range −1 ≤ x ≤ 1: The following calculation illustrates how to differentiate inverse trigonometric functions using arcsin(x) as an example:
$$\frac{d \arcsin x}{dx} \overset{x = \sin y}{=} \frac{dy}{d \sin y} \overset{*}{=} \left( \frac{d \sin y}{dy} \right)^{-1} = \frac{1}{\cos y} \overset{**}{=} \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}$$
* Instead of the slope we may calculate the inverse slope and subsequently invert the result (at least locally). ** $\cos^2 y + \sin^2 y = 1$.
A simple application of (1.51) and (1.52), which we shall use frequently, is
$$f(x + \delta x) \approx f(x) + \delta x\, f'(x) . \tag{1.69}$$
The accuracy of this approximation increases as δx gets smaller, provided that f(x) is a reasonably smooth function. In particular we may apply (1.69) to the numerical calculation of the roots of f(x). If for instance $x_o \equiv x + \delta x$ is one of these roots, i.e. $f(x_o) = 0$, then $0 \approx f(x) + \delta x\, f'(x)$ or
$$x_o \approx x - \frac{f(x)}{f'(x)} . \tag{1.70}$$
This means that $x_o$ is approximated by the x-intercept of a straight line with the slope $f'(x)$ calculated at x, a value not too far from $x_o$. Iteration of this approximation, i.e.
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} , \tag{1.71}$$
generates a sequence of $x_n$-values, which in most cases quickly converges to $x_o$.

This procedure sometimes is called Newton’s method. Consider for instance the
root of f (x) = ln x − 1, which is e. Starting from x1 = 0.5 and using xn+1 =
xn − xn (ln xn − 1) we obtain the successive values 1.34657, 2.29246, 2.68304,
2.71805, 2.71828, 2.71828, . . . . In general we must choose a starting value not
too far from the root. This is especially true if f (x) possesses more than one root.
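The iteration (1.71) for this example can be written in a few lines. This is a sketch for illustration; the helper name `newton` and its argument list are assumptions, not notation from the text.

```python
import math

def newton(f, fprime, x, steps):
    """Iterate x -> x - f(x)/f'(x) and return the successive values."""
    values = []
    for _ in range(steps):
        x = x - f(x) / fprime(x)
        values.append(x)
    return values

# f(x) = ln x - 1 with f'(x) = 1/x, starting from x = 0.5.
values = newton(lambda x: math.log(x) - 1.0, lambda x: 1.0 / x, 0.5, 6)
for value in values:
    print(round(value, 5))
```

With the starting value 0.5 the first iterate is 1.34657, and the sequence settles at e = 2.71828... after a few steps, as quoted above.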
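But there is more we can do with (1.69). Consider the equation
$$f(x + \delta x) = \lim_{n \to \infty} \left( 1 + \frac{\delta x}{n} \frac{d}{dx} \right)^n f(x) . \tag{1.72}$$
This may be verified step-by-step starting with
$$\left( 1 + \frac{\delta x}{n} \frac{d}{dx} \right)^n f(x) = \left( 1 + \frac{\delta x}{n} \frac{d}{dx} \right)^{n-1} \underbrace{\left[ f(x) + \frac{\delta x}{n} \left. \frac{df}{dx} \right|_{x} \right]}_{= f\left(x + \frac{\delta x}{n}\right) + O\left((\delta x/n)^2\right)}$$
and continuing via
$$\begin{aligned} \left( 1 + \frac{\delta x}{n} \frac{d}{dx} \right)^n f(x) &= \left( 1 + \frac{\delta x}{n} \frac{d}{dx} \right)^{n-2} \underbrace{\left[ f\left(x + \frac{\delta x}{n}\right) + \frac{\delta x}{n} \left. \frac{df}{dx} \right|_{x + \frac{\delta x}{n}} \right]}_{= f\left(x + \frac{2 \delta x}{n}\right) + O\left(2 (\delta x/n)^2\right)} \\ &\;\;\vdots \\ &= f(x + \delta x) + O\left(n (\delta x/n)^2\right) . \end{aligned}$$
This proves (1.72), because $O(\delta x^2 / n) \to 0$ as n → ∞. Now we make use of the binomial theorem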

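$$(a + b)^n = \binom{n}{0} a^n b^0 + \binom{n}{1} a^{n-1} b^1 + \dots + \binom{n}{n} a^0 b^n ,$$
where
$$\binom{n}{i} \equiv \frac{n!}{i! (n - i)!} \tag{1.73}$$
(0! = 1 and n! = 1 · 2 · 3 · · · n). Application to (1.63) for instance yields the series expansion of the e-function²
$$\begin{aligned} e^x &= \underbrace{\binom{n}{0}}_{=1} + \underbrace{\binom{n}{1}}_{=n} \frac{x}{n} + \underbrace{\binom{n}{2}}_{= \frac{n(n-1)}{2!}} \left( \frac{x}{n} \right)^2 + \cdots \\ &\overset{n \to \infty}{=} 1 + x + \frac{1}{2!} x^2 + \cdots \end{aligned} \tag{1.74}$$
However, this is just a special case of the Taylor series expansion of f(x) at x, which follows by combining the binomial theorem with (1.72):
$$f(x + \delta x) = f(x) + \delta x \frac{d}{dx} f(x) + \frac{1}{2!} \left( \delta x \frac{d}{dx} \right)^2 f(x) + \cdots . \tag{1.75}$$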
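A short numerical comparison illustrates how the series compares with the limit formula (1.63). The sketch below uses two helper functions, `exp_series` and `compound`, whose names are introduced here only for illustration.

```python
import math

def exp_series(x, terms):
    """Partial sum 1 + x + x^2/2! + ... with the given number of terms."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)   # turns x^k/k! into x^(k+1)/(k+1)!
    return total

def compound(x, n):
    """The finite-n expression (1 + x/n)^n."""
    return (1.0 + x / n) ** n

print(exp_series(1.0, 20), compound(1.0, 10**6), math.e)
```

A handful of series terms already gives e to many digits, whereas (1 + x/n)^n approaches the limit only slowly as n grows.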
2 Notice that the expansion does satisfy the derivation rule (1.59).

In mechanics we shall have to deal with functions of more than just one variable, e.g. f(x, y, z). How can we differentiate or expand these functions, and what is the meaning of these derivatives and expansions? The quantity
$$df(x, y, z) = \frac{\partial f(x, y, z)}{\partial x}\, dx + \frac{\partial f(x, y, z)}{\partial y}\, dy + \frac{\partial f(x, y, z)}{\partial z}\, dz \tag{1.76}$$
is called the total differential of f(x, y, z). Here ∂/∂x, ∂/∂y, and ∂/∂z are so-called partial derivatives. The operation ∂/∂x for instance is the derivative with respect to the variable x only, treating the other variables as constants. The same applies to ∂/∂y and ∂/∂z. Notice that the right side of the definition (1.76) can be expressed using the scalar product, i.e.
$$df(x, y, z) = \vec{\nabla} f(x, y, z) \cdot d\vec{r} . \tag{1.77}$$
The differential operator $\vec{\nabla}$ is called the nabla or gradient operator, i.e. $\vec{\nabla} f(x, y, z)$ is the gradient of f(x, y, z). Here
$$\vec{\nabla} \equiv \begin{pmatrix} \partial_x \\ \partial_y \\ \partial_z \end{pmatrix} \quad \text{and} \quad d\vec{r} = \begin{pmatrix} dx \\ dy \\ dz \end{pmatrix} . \tag{1.78}$$
Notice that $\partial_x$ etc. are convenient abbreviations for ∂/∂x etc., which we shall use repeatedly.
Example - The Meaning of the Gradient: What is the meaning of the gra-
dient? Maps often contain information on the terrain’s topography expressed
by lines of equal height. The height (above sea level) is a function h(x, y),
where x and y are coordinates. A step forward along a line of equal height
mathematically means dh(x, y) = 0 and (1.77) becomes

$$0 = \begin{pmatrix} \partial_x h \\ \partial_y h \end{pmatrix} \cdot \begin{pmatrix} dx \\ dy \end{pmatrix} .$$
The first vector on the right side of this equation is the gradient in two dimen-
sions. The second vector is the ‘step’ expressed in terms of its components.
According to (1.9) the equation states that the gradient is perpendicular to the
direction of the step. In other words, the gradient vector is perpendicular to the
line of equal height. This means it is pointing either straight down the slope
or straight up. In any case, the gradient is parallel to the direction along which
h(x, y) changes the most. This is a general conclusion, which is true for all
other functions as well. It is worth noting that there is a method for locating
minima in the topography of functions of more than one variable. The name
of the method, which is derived from this particular property of the gradient,
is ‘method of steepest descent’.
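The idea behind the method of steepest descent can be sketched in a few lines: repeatedly take a small step against the gradient. The height profile `valley`, the step size, and the number of iterations below are hypothetical choices made for illustration, and the gradient is estimated numerically rather than analytically.

```python
def gradient(h, x, y, eps=1e-6):
    """Central-difference estimate of (dh/dx, dh/dy)."""
    gx = (h(x + eps, y) - h(x - eps, y)) / (2 * eps)
    gy = (h(x, y + eps) - h(x, y - eps)) / (2 * eps)
    return gx, gy

def steepest_descent(h, x, y, step=0.1, iterations=200):
    """Walk downhill by moving a small distance against the gradient."""
    for _ in range(iterations):
        gx, gy = gradient(h, x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

def valley(x, y):
    # Hypothetical height profile with a single minimum at (1, -0.5).
    return (x - 1.0) ** 2 + 2.0 * (y + 0.5) ** 2

print(steepest_descent(valley, 3.0, 2.0))
```

For this simple profile the walk ends very close to the minimum; for less benign landscapes the step size would need more care, and the walk may end in a local minimum.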
At first glance the introduction of a partial derivative may seem somewhat of

a nuisance. However, the variables or coordinates x, y, and z may each depend
on another variable quantity like time t. Thus instead of f (x, y, z) we now have
f (x (t) , y (t) , z (t) , t). The total derivative of f (x (t) , y (t) , z (t) , t) with respect
to t is
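$$\frac{d}{dt} f(x(t), y(t), z(t), t) = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt} . \tag{1.79}$$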
The first term on the right side is due to the time dependent change of the shape of
f itself. The other three terms are the changes of f due to the time dependence of
the variables x, y, and z.
Example - Total and Partial Derivatives: The following example illustrates

what this means. Let n(x(t), t) be the number of cyclists in a race (along
the x-axis) in the vicinity of point x(t) and at time t. We assume that there
are many participants in the race and therefore n(x(t), t) approximately is a
smooth function. The total change of n(x(t), t) during the time δt is

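$$\delta n(x(t), t) = \frac{dn(x(t), t)}{dt}\, \delta t = \left[ \frac{\partial n(x(t), t)}{\partial t} + \frac{\partial n(x(t), t)}{\partial x} \frac{dx(t)}{dt} \right] \delta t . \tag{1.80}$$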
If for the moment we take the point of view of a spectator (stationary

observer), who watches the race from a fixed position x(t) = x along the road,
then the above equation becomes
$$\delta n(x, t) = \frac{dn(x, t)}{dt}\, \delta t = \frac{\partial n(x, t)}{\partial t}\, \delta t . \tag{1.81}$$
There is no difference between the total and the partial derivative now. We
observe δn(x, t) cyclists passing our observation point x during the time inter-
val δt.
On the other hand, if we travel alongside the cyclists on a motorbike (moving
observer) our position, x(t), does depend on time. In this case we observe a
different δn given by the full (1.80). The first term in brackets is the change of
n per unit time in our vicinity as we travel along. Initially for instance we may
have moved in a group of fifty cyclists in our vicinity. After some time this
number has dropped to forty, because of some falling behind and others moving
ahead. The second term in brackets is due to our own velocity, d x(t)/dt, at
which we travel a small distance d x. On that distance the number of cyclists
in our vicinity changes in addition to the change described by the first term.
Remark: We are at a point where it is not too difficult to generalize the Taylor series
expansion (1.75) to the case of more than one variable:
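$$f(\vec{r} + \delta \vec{r}) = f(\vec{r}) + \left( \delta \vec{r} \cdot \vec{\nabla} \right) f(\vec{r}) + \frac{1}{2!} \left( \delta \vec{r} \cdot \vec{\nabla} \right)^2 f(\vec{r}) + \cdots . \tag{1.82}$$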
• Example - Surface Elements Expressed in Different Coordinates: In this

example we apply the differentiation rules, which we have learned thus far,
to the conversion of surface elements from one set of coordinates to another.
Points 1–4 in the sketch indicate the corners of a surface element,
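$$\vec{e}_z \cdot \left( (\vec{r}_2 - \vec{r}_1) \times (\vec{r}_4 - \vec{r}_1) \right) ,$$
in the x-y-plane, where
$$\vec{e}_z = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} .$$
Instead of using x and y to express the vectors $\vec{r}_i$ (i = 1, 2, . . .), we may use the coordinates u and v:
$$\vec{r}_1 = \begin{pmatrix} x(u, v) \\ y(u, v) \\ 0 \end{pmatrix}$$
$$\vec{r}_2 = \begin{pmatrix} x(u + \delta u, v) \\ y(u + \delta u, v) \\ 0 \end{pmatrix} \approx \begin{pmatrix} x(u, v) + \frac{\partial x(u, v)}{\partial u} \delta u \\ y(u, v) + \frac{\partial y(u, v)}{\partial u} \delta u \\ 0 \end{pmatrix}$$
$$\vec{r}_4 = \begin{pmatrix} x(u, v + \delta v) \\ y(u, v + \delta v) \\ 0 \end{pmatrix} \approx \begin{pmatrix} x(u, v) + \frac{\partial x(u, v)}{\partial v} \delta v \\ y(u, v) + \frac{\partial y(u, v)}{\partial v} \delta v \\ 0 \end{pmatrix} .$$
Thus we have
$$\vec{r}_2 - \vec{r}_1 \approx \begin{pmatrix} \frac{\partial x}{\partial u} \delta u \\ \frac{\partial y}{\partial u} \delta u \\ 0 \end{pmatrix} \quad \text{and} \quad \vec{r}_4 - \vec{r}_1 \approx \begin{pmatrix} \frac{\partial x}{\partial v} \delta v \\ \frac{\partial y}{\partial v} \delta v \\ 0 \end{pmatrix} .$$
The surface element expressed in the new coordinates is
$$\vec{e}_z \cdot \left( (\vec{r}_2 - \vec{r}_1) \times (\vec{r}_4 - \vec{r}_1) \right) = \left( \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \frac{\partial x}{\partial v} \right) \delta u\, \delta v \equiv \frac{\partial (x, y)}{\partial (u, v)}\, \delta u\, \delta v .$$
The quantity ∂(x, y)/∂(u, v) is the Jacobian of the transformation. In particular, if the vector $\vec{r}_2 - \vec{r}_1$ is along the x-axis, whereas $\vec{r}_4 - \vec{r}_1$ is along the y-axis, we find
$$\delta x\, \delta y = \left| \frac{\partial (x, y)}{\partial (u, v)} \right| \delta u\, \delta v , \tag{1.83}$$
where
$$\frac{\partial (x, y)}{\partial (u, v)} = \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \frac{\partial x}{\partial v} . \tag{1.84}$$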
Notice that the area of the surface element is always positive. This is why we
use the magnitude of the Jacobian.
The generalization of this equation to three dimensions, i.e. to volume
elements, is

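$$\delta x\, \delta y\, \delta z = \left| \frac{\partial (x, y, z)}{\partial (u, v, w)} \right| \delta u\, \delta v\, \delta w , \tag{1.85}$$
where now
$$\begin{aligned} \frac{\partial (x, y, z)}{\partial (u, v, w)} = {} & \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} \frac{\partial z}{\partial w} + \frac{\partial y}{\partial u} \frac{\partial z}{\partial v} \frac{\partial x}{\partial w} + \frac{\partial z}{\partial u} \frac{\partial x}{\partial v} \frac{\partial y}{\partial w} \\ & - \frac{\partial z}{\partial u} \frac{\partial y}{\partial v} \frac{\partial x}{\partial w} - \frac{\partial x}{\partial u} \frac{\partial z}{\partial v} \frac{\partial y}{\partial w} - \frac{\partial y}{\partial u} \frac{\partial x}{\partial v} \frac{\partial z}{\partial w} . \end{aligned} \tag{1.86}$$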
• Problem 3 - Line, Surface, and Volume Elements:
(a) The square of the length of a short path, ds, in cartesian coordinates is given by $ds^2 = dx^2 + dy^2 + dz^2$. What is ds expressed in polar, cylindrical, and spherical coordinates?
(b) Transform the cartesian surface element dx dy to polar coordinates.
(c) Calculate the cartesian volume element dx dy dz in cylindrical and spherical coordinates.
Solution: (a) Using polar coordinates, x = r cos φ and y = r sin φ, the differentials become
$$dx = \cos\phi\, dr - r \sin\phi\, d\phi \quad \text{and} \quad dy = \sin\phi\, dr + r \cos\phi\, d\phi .$$
Calculating the squares and adding them up yields
$$ds^2 = dx^2 + dy^2 = dr^2 + r^2 d\phi^2 . \tag{1.87}$$
Analogously we obtain for cylindrical coordinates (based on (1.2))
$$ds^2 = d\rho^2 + \rho^2 d\phi^2 + dz^2 , \tag{1.88}$$
and for spherical coordinates (based on (1.3))
$$ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2 . \tag{1.89}$$
(b) In polar coordinates the surface element is calculated via
$$\frac{\partial (x, y)}{\partial (r, \phi)} = \frac{\partial x}{\partial r} \frac{\partial y}{\partial \phi} - \frac{\partial y}{\partial r} \frac{\partial x}{\partial \phi} = r \cos^2\phi + r \sin^2\phi = r ,$$
i.e.
$$dx\, dy = r\, dr\, d\phi . \tag{1.90}$$
(c) Using (1.85) and (1.86) we obtain in the case of cylindrical coordinates
$$dx\, dy\, dz = \rho\, d\rho\, d\phi\, dz \tag{1.91}$$
and for spherical coordinates
$$dx\, dy\, dz = r^2 \sin\theta\, dr\, d\phi\, d\theta . \tag{1.92}$$
Notice that the quantities
$$df_{cyl} = \rho\, d\phi\, dz \tag{1.93}$$
and
$$df_{sph} = r^2 \sin\theta\, d\phi\, d\theta \tag{1.94}$$
are surface elements on a cylinder with radius ρ and on the surface of a sphere with radius r, respectively.
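The Jacobian of the polar transformation can also be checked numerically. The following sketch estimates the partial derivatives by central differences; the function names and the step width h are illustrative assumptions.

```python
import math

def jacobian_2d(x_of, y_of, u, v, h=1e-6):
    """Estimate d(x, y)/d(u, v) with central differences."""
    dx_du = (x_of(u + h, v) - x_of(u - h, v)) / (2 * h)
    dx_dv = (x_of(u, v + h) - x_of(u, v - h)) / (2 * h)
    dy_du = (y_of(u + h, v) - y_of(u - h, v)) / (2 * h)
    dy_dv = (y_of(u, v + h) - y_of(u, v - h)) / (2 * h)
    return dx_du * dy_dv - dy_du * dx_dv

def polar_x(r, phi):
    return r * math.cos(phi)

def polar_y(r, phi):
    return r * math.sin(phi)

# At r = 2 the estimate should be close to r itself, independent of phi.
print(jacobian_2d(polar_x, polar_y, 2.0, 0.7))
```

The same helper can be applied to any other pair of functions x(u, v) and y(u, v).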
• Example - A Derivation of Euler’s Formula: In this example we apply

most of what we have discussed thus far.
We start by defining the unit vector
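$$\vec{r} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} = \begin{pmatrix} \cos\phi \\ \sin\phi \\ 0 \end{pmatrix} \tag{1.95}$$
and the function
$$z(\vec{r}) = x + i y = \cos\phi + i \sin\phi . \tag{1.96}$$
Note that $i = \sqrt{-1}$.
Next we calculate the change of $z(\vec{r})$ in response to a small change of the angle φ in the x-y-plane. Using (1.77), (1.15), and (1.14) we obtain
$$\delta z(\vec{r}) \overset{(1.77)}{=} \vec{\nabla} z \cdot \delta \vec{r} \overset{(1.15)}{=} \vec{\nabla} z \cdot \left( \delta \vec{\phi} \times \vec{r} \right) \overset{(1.14)}{=} \delta \vec{\phi} \cdot \left( \vec{r} \times \vec{\nabla} z \right) .$$
Again the vector $\delta \vec{\phi}$ possesses the magnitude δφ. Its orientation is that of the z-axis (not to be confused with the function z!). From
$$\vec{\nabla} z = \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix}$$
follows
$$\vec{r} \times \vec{\nabla} z = \begin{pmatrix} 0 \\ 0 \\ i x - y \end{pmatrix} = i \begin{pmatrix} 0 \\ 0 \\ z \end{pmatrix} ,$$
i.e.
$$\delta z(\vec{r}) = i\, \delta\phi\, z(\vec{r})$$
or
$$z(\vec{r}) + \delta z(\vec{r}) = (1 + i\, \delta\phi)\, z(\vec{r}) .$$
Let us assume that the original z is $z_0 = 1$. Subsequently we apply the above formula n times, with δφ = φ/n, to obtain $z(\vec{r})$ as our final result. Mathematically this means
$$\begin{aligned} z(\vec{r}) &= \underbrace{\left( 1 + i \frac{\phi}{n} \right) \left( 1 + i \frac{\phi}{n} \right) \dots \left( 1 + i \frac{\phi}{n} \right)}_{n \text{ times}} \underbrace{z_0}_{=1} \\ &= \left( 1 + i \frac{\phi}{n} \right)^n \overset{(1.63)}{\approx} e^{i\phi} . \end{aligned} \tag{1.97}$$
We do see that the combination of (1.96) and (1.97) yields
$$\cos\phi + i \sin\phi = e^{i\phi} . \tag{1.98}$$
The formula with the minus sign,
$$\cos\phi - i \sin\phi = e^{-i\phi} , \tag{1.99}$$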
follows for a rotation in the opposite direction.
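The repeated application of the factor (1 + iφ/n) can be tried out directly with complex arithmetic. The sketch below is illustrative only; the function name `rotate_in_steps` and the chosen values of n are assumptions.

```python
import cmath

def rotate_in_steps(phi, n):
    """Apply z -> (1 + i*phi/n) z a total of n times, starting from z0 = 1."""
    z = complex(1.0, 0.0)
    factor = complex(1.0, phi / n)
    for _ in range(n):
        z *= factor
    return z

phi = 1.0
for n in (10, 1000, 100000):
    # Distance between the n-step product and exp(i*phi).
    print(n, abs(rotate_in_steps(phi, n) - cmath.exp(1j * phi)))
```

The printed distance should shrink roughly in proportion to 1/n, which is the numerical counterpart of the limit n → ∞ in (1.97).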
Integration† :
Integration is the limit of a sum in which the number of terms tends to infinity while the width δx of each term tends to zero:
$$\lim_{\substack{n \to \infty \\ \delta x \to 0}} \sum_{i=1}^{n} \delta x\, f'(x_i) \equiv \int_{x_A}^{x_B} dx\, f'(x) \overset{(*)}{=} f(x_B) - f(x_A) \tag{1.100}$$
(here $f' = df/dx$). Every term $\delta x\, f'(x_i)$ is a thin slice, between $x_i = a + (i - 1) \delta x$ and $x_{i+1} = a + i\, \delta x$ (with $a = x_A$ and $a + n\, \delta x = b = x_B$), of the area between the x-axis and the function $f'(x)$. Hence the integral is the total area under the curve described by $f'(x)$ between the integration limits $x_A$ and $x_B$. Notice that δx > 0. But the function $f'(x)$, and therefore the result of the integration, can be either positive or negative.
Not quite as obvious is the equal sign indicated by (∗). We can show the validity of (∗) as follows:
$$\begin{aligned} \delta x\, f'(a) &\approx f(a + \delta x) - f(a) \\ \delta x\, f'(a + \delta x) &\approx f(a + 2 \delta x) - f(a + \delta x) \\ \delta x\, f'(a + 2 \delta x) &\approx f(a + 3 \delta x) - f(a + 2 \delta x) \\ \dots &\approx \dots \\ \delta x\, f'(b - 2 \delta x) &\approx f(b - \delta x) - f(b - 2 \delta x) \\ \delta x\, f'(b - \delta x) &\approx f(b) - f(b - \delta x) . \end{aligned}$$
Adding the terms on the left yields the sum in (1.100). Adding the right sides yields f(b) − f(a), because all intermediate terms cancel in pairs.
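The sum of thin slices can be evaluated directly on a computer. The following sketch uses the cosine as an example integrand, whose antiderivative is the sine; the function name `riemann_sum` and the number of slices are assumptions made for illustration.

```python
import math

def riemann_sum(fprime, a, b, n):
    """Sum of n slices dx * f'(x_i) with x_i = a + (i - 1) dx, i = 1..n."""
    dx = (b - a) / n
    return sum(dx * fprime(a + i * dx) for i in range(n))

# f'(x) = cos x, so the sum should approach f(b) - f(a) = sin(1) - sin(0).
print(riemann_sum(math.cos, 0.0, 1.0, 100000), math.sin(1.0) - math.sin(0.0))
```

Increasing n brings the sum closer to the difference of the antiderivative at the two limits.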
In the following we discuss a number of rules or rather ‘tricks’, which help to
calculate integrals of the form
$$F(x_B) - F(x_A) = \int_{x_A}^{x_B} dx\, f(x) , \tag{1.101}$$
where again
$$\frac{dF(x)}{dx} = f(x) . \tag{1.102}$$
Notice that f(x) is called the integrand. For instance, if f(x) = x then $F(x) = x^2/2$, because it fulfills (1.102). We neglect the constant, which can be added to F(x), because it cancels when we calculate $F(x_B) - F(x_A)$. Another example is $f(x) = e^{-x}$. In this case $F(x) = -e^{-x}$. In general, however, it is not always easy to find F(x) for a given f(x). Contrary to differentiation, which merely requires adhering to a fixed set of rules, integration is a matter of experience and practice.
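One helpful trick is partial integration, which follows directly from the product rule of differentiation. Integrating (1.56) yields
$$\int_{x_A}^{x_B} \frac{d}{dx} \big( g(x) f(x) \big)\, dx = \int_{x_A}^{x_B} g'(x) f(x)\, dx + \int_{x_A}^{x_B} g(x) f'(x)\, dx .$$
With the definition
$$g(x_B) f(x_B) - g(x_A) f(x_A) \equiv g(x) f(x) \Big|_{x_A}^{x_B}$$
we find immediately
$$g(x) f(x) \Big|_{x_A}^{x_B} = \int_{x_A}^{x_B} g'(x) f(x)\, dx + \int_{x_A}^{x_B} g(x) f'(x)\, dx$$
or
$$\int_{x_A}^{x_B} g(x) f'(x)\, dx = g(x) f(x) \Big|_{x_A}^{x_B} - \int_{x_A}^{x_B} g'(x) f(x)\, dx . \tag{1.103}$$
Consider for example
$$\int_0^{\infty} x e^{-x}\, dx , \tag{1.104}$$
i.e.
$$\begin{aligned} \int_0^{\infty} x e^{-x}\, dx &= \underbrace{x \left( -e^{-x} \right) \Big|_0^{\infty}}_{=0} - \int_0^{\infty} \left( -e^{-x} \right) dx \\ &= \int_0^{\infty} e^{-x}\, dx \\ &= -e^{-x} \Big|_0^{\infty} = 1 . \end{aligned}$$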
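Another trick uses parameter differentiation to simplify the integrand. Assume an integrand f(x, λ), where λ is a parameter. Then the following is valid:
$$\frac{d}{d\lambda} \int_{x_A}^{x_B} f(x, \lambda)\, dx = \int_{x_A}^{x_B} \frac{d}{d\lambda} f(x, \lambda)\, dx . \tag{1.105}$$
Again we apply this trick to (1.104), i.e.
$$\begin{aligned} \int_0^{\infty} x e^{-x}\, dx &= \left. \int_0^{\infty} x e^{-\lambda x}\, dx \right|_{\lambda=1} \\ &= - \left. \frac{d}{d\lambda} \int_0^{\infty} e^{-\lambda x}\, dx \right|_{\lambda=1} \\ &= - \left. \frac{d}{d\lambda} \left( \frac{1}{-\lambda} e^{-\lambda x} \Big|_0^{\infty} \right) \right|_{\lambda=1} \\ &= - \left. \frac{d}{d\lambda} \frac{1}{\lambda} \right|_{\lambda=1} = \left. \frac{1}{\lambda^2} \right|_{\lambda=1} \\ &= 1 . \end{aligned}$$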
Notice that in this example we introduce λ into an integrand, which originally does
not contain the parameter.
A third approach achieves simplification by transformation to new coordinates. This we demonstrate using the so-called Gaussian integral³:
$$\int_0^{\infty} e^{-x^2}\, dx .$$
We begin by extending the integration over the entire first quadrant of the x-y-plane via
$$\left( \int_0^{\infty} e^{-x^2}\, dx \right)^2 = \int_0^{\infty} e^{-x^2}\, dx \int_0^{\infty} e^{-y^2}\, dy = \int_0^{\infty} \int_0^{\infty} e^{-(x^2 + y^2)}\, dx\, dy .$$
Now we carry out the transformation to polar coordinates using (1.1). The transformation of the cartesian surface element dx dy follows according to (1.90), i.e. dx dy = r dr dφ. Thus we have
$$\left( \int_0^{\infty} e^{-x^2}\, dx \right)^2 = \int_0^{\infty} r e^{-r^2}\, dr \int_0^{\pi/2} d\phi .$$
Notice the new integration limits of the first quadrant in polar coordinates. Notice also the extra factor of r in the r-integral as compared to the original x-integral. Due to the extra r it now is easy to find a function whose derivative is the integrand:
$$\begin{aligned} \int_0^{\infty} e^{-x^2}\, dx &= \sqrt{ \int_0^{\infty} r e^{-r^2}\, dr \int_0^{\pi/2} d\phi } \\ &= \sqrt{ \frac{\pi}{2} \int_0^{\infty} r e^{-r^2}\, dr } \\ &= \sqrt{ - \frac{\pi}{4}\, e^{-r^2} \Big|_0^{\infty} } \\ &= \frac{\sqrt{\pi}}{2} . \end{aligned}$$
In addition, the symmetry of the integrand yields
$$\int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi} .$$
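The value of the Gaussian integral can be cross-checked by brute-force numerical integration. The sketch below uses a simple midpoint rule; the cut-off at x = 10 and the number of slices are assumptions chosen so that the neglected tail is far below the rounding error.

```python
import math

def midpoint_integral(f, a, b, n):
    """Midpoint-rule estimate of the integral of f from a to b with n slices."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def gaussian(x):
    return math.exp(-x * x)

# Integral over the positive half-line, truncated at x = 10.
half_line = midpoint_integral(gaussian, 0.0, 10.0, 10000)
print(half_line, math.sqrt(math.pi) / 2)
```

Both printed numbers should agree to many digits, supporting the analytic result obtained via polar coordinates.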
Without going through the details we mention that this example can be extended to include integrals of the form
$$\int_0^{\infty} e^{-a x^2 \pm b x}\, dx$$
(note: a > b > 0). Here the transformation or substitution $z = \sqrt{a}\, x \mp \sqrt{c}$ with $c = \frac{1}{a} (b/2)^2$ and $dz = \sqrt{a}\, dx$ is used to convert the integrand to Gaussian form. This trick is called completing the square.
3 Carl Friedrich Gauss, 1777–1855, made outstanding contributions to mathematics, physics as well as astronomy.
Because we are so close and because we shall need them, we briefly want to mention surface and volume integrals. Notice the following intermediate result of the Gaussian integral:
$$\int_0^{\infty} dx \int_0^{\infty} dy\, \cdots = \int_0^{\infty} r\, dr \int_0^{\pi/2} d\phi\, \cdots . \tag{1.106}$$
Here . . . replaces $\exp[-x^2 - y^2]$. We modify the integrations by reducing the upper integration bounds on the left side to R. In addition we require $x^2 + y^2 \le R^2$. For the sake of simplicity we also set $\exp[-x^2 - y^2]$ equal to one. Thus we integrate, which now means summation of columns with cross section dx dy and unit height, within a quarter-circle in the x-y-plane:
$$\underbrace{\int_0^R \int_0^R dx\, dy}_{x^2 + y^2 \le R^2} = \int_0^R r\, dr \int_0^{\pi/2} d\phi = \frac{1}{2} R^2\, \frac{\pi}{2} . \tag{1.107}$$
Notice also that due to our choice of the integration limits, we benefit greatly from the use of polar coordinates. The two integrals on the right side of the equation are independent and easy to evaluate. The result, $\pi R^2 / 4$, is the area of the quarter-circle.
We carry this one step further and consider the integral
$$\int_{\text{sphere of radius } R} dV = \underbrace{\int_{-R}^{R} \int_{-R}^{R} \int_{-R}^{R} dx\, dy\, dz}_{x^2 + y^2 + z^2 \le R^2} . \tag{1.108}$$
Here dV = dx dy dz is the volume of a small cube, i.e. a volume element, and thus we add up all the volume elements fitting into a sphere with radius R. This can be done in cartesian coordinates. But spherical coordinates are simpler, i.e.
$$\int_{\text{sphere of radius } R} dV = \int_0^R \int_0^{2\pi} \int_0^{\pi} \underbrace{\left| \frac{\partial (x, y, z)}{\partial (r, \phi, \theta)} \right|}_{\overset{(1.92)}{=}\, r^2 \sin\theta} dr\, d\phi\, d\theta = \frac{1}{3} R^3\, (2\pi)\, 2 . \tag{1.109}$$
The result is the volume of the sphere, $V = (4\pi/3) R^3$.

Equation (1.101) is easily generalized to include vector fields of one variable:
$$\vec{F}(x_B) - \vec{F}(x_A) = \int_{x_A}^{x_B} dx\, \vec{f}(x) . \tag{1.110}$$
The integration is carried out for each component separately.

We can go one step further by considering the following type of integral,
$$\int_{\text{path from A to B}} d\vec{r} \cdot \vec{f}(\vec{r}) , \tag{1.111}$$
where A and B are two points in space. What does this mean? We have to sum over the scalar product $d\vec{r} \cdot \vec{f}(\vec{r})$ at (infinitely) many points along a particular path leading from A to B. Notice that $d\vec{r}$ is an (infinitesimally) short line element connecting two neighboring points along the path. The total sum is the line integral from A to B along this particular path. In general the result will be different if we choose another path. An example from mechanics is the work, W, done by some force, $\vec{F}$, along a path:
$$W = \int_{\text{path from A to B}} d\vec{r} \cdot \vec{F}(\vec{r}) .$$
A specific example is Sisyphus pushing a boulder up an incline. The incline has an angle α with the horizontal, whereas the force of gravity, $\vec{F}_g$, is acting vertically. Sisyphus has to overcome the component of $\vec{F}_g$ parallel to the incline. Here the line element is pointing up parallel to the incline. Thus the scalar product is $d\vec{r} \cdot \vec{F} = dr\, F_g \cos\left( \frac{\pi}{2} + \alpha \right) = -dr\, F_g \sin\alpha$. If the length of the path on the incline is s, then the force of gravitation (!) does the following amount of work:
$$W_{\text{path I}} = - \int_{\text{incline}} dr\, F_g \sin\alpha = -s F_g \sin\alpha$$
(neglecting the boulder's sliding friction). Notice that even though Sisyphus is pushing the boulder, it is the force of gravitation which appears in the integral. Sisyphus overcomes the component of the force of gravitation parallel to the incline by an equal but opposite force⁴ and therefore his work is $-W_{\text{path I}}$. Alternatively, Sisyphus can push the boulder horizontally along the base of the incline and then hoist it up vertically to the same final position as before. In this case the gravitational force does the work
$$\begin{aligned} W_{\text{path II}} &= \underbrace{\int_{\text{horizontal}} d\vec{r} \cdot \vec{F}}_{=0 \text{ because } d\vec{r} \perp \vec{F} = \vec{F}_g} + \int_{\text{vertical}} d\vec{r} \cdot \vec{F} \\ &= \int_{\text{vertical}} d\vec{r} \cdot \vec{F}_g = -s F_g \sin\alpha . \end{aligned}$$
4 Strictly speaking 'equal but opposite' would mean that nothing happens. The situation is one of static equilibrium. We assume however that Sisyphus pushed just hard enough to overcome the force of gravity by a 'negligible' amount without causing 'noticeable acceleration'.
Again friction is neglected. We observe that in our special example both paths lead
to the same result. Mathematically we express this via
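$$\oint d\vec{r} \cdot \vec{F}(\vec{r}) = \int_{\text{path I}} d\vec{r} \cdot \vec{F}(\vec{r}) - \int_{\text{path II}} d\vec{r} \cdot \vec{F}(\vec{r}) = 0 . \tag{1.112}$$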
path I path II
The circle in the integral symbol means that the total path is closed. Notice that the
minus sign in front of the integral over path II means that this path is traversed in the
opposite direction compared to before.
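The path independence found for the two routes can be reproduced by summing the scalar products numerically. In the sketch below the magnitude of the gravitational force, the angle, and the path length (`F_G`, `ALPHA`, `S`) are hypothetical values, and the helper `work_along` is an assumption made for illustration.

```python
import math

def work_along(points, force, n=1000):
    """Sum force . dr over n short steps on each straight segment of a polyline."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for i in range(n):
            # Evaluate the force at the midpoint of each short line element.
            fx, fy = force(x0 + (i + 0.5) * dx, y0 + (i + 0.5) * dy)
            total += fx * dx + fy * dy
    return total

F_G, ALPHA, S = 10.0, math.radians(30.0), 2.0   # hypothetical values

def gravity(x, y):
    return 0.0, -F_G          # constant force pointing vertically downwards

top = (S * math.cos(ALPHA), S * math.sin(ALPHA))
work_path_1 = work_along([(0.0, 0.0), top], gravity)                   # up the incline
work_path_2 = work_along([(0.0, 0.0), (top[0], 0.0), top], gravity)    # along the base, then up
print(work_path_1, work_path_2, -S * F_G * math.sin(ALPHA))
```

Both sums should agree with $-s F_g \sin\alpha$. For a force field that is not conservative the two numbers would in general differ.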
1.5 Complex Numbers†
A complex number has the representation
$$z = a + i b , \tag{1.113}$$
where
$$i = \sqrt{-1} \quad \text{or} \quad i^2 = -1$$
while a and b both are real numbers. The complex number z is comparable to a vector in the x-y-plane. The so-called real part, Re(z) = a, of z is plotted along the x-axis and the imaginary part, Im(z) = b, of z is plotted along the y-axis on which the ordinary divisions are replaced by i1, i2, i3, . . . (see Fig. 1.11).
As in the case of two vectors the comparison $z_1 > z_2$ is not well defined. Well defined however is $|z_1| > |z_2|$ or $|z_1| < |z_2|$, where |z| is the length of the arrow in Fig. 1.11, i.e. the magnitude of a complex number is
$$|z| = \sqrt{a^2 + b^2} = \sqrt{(a + i b)(a - i b)} = \sqrt{z \bar{z}} . \tag{1.114}$$
$\bar{z} = a - i b$ is called the complex conjugate of z = a + ib. Instead of the notation $\bar{z}$ some authors use $z^*$. In particular the sum $z + \bar{z}$ is always real, whereas the difference $z - \bar{z}$ is always imaginary.
Fig. 1.11 Representing the complex number z by a vector (horizontal axis: real axis, with a marked; vertical axis: imaginary axis, with ib marked; the arrow points to z = a + ib)
Some rules:
• adding complex numbers:
$$(a + i b) + (c + i d) = (a + c) + i (b + d)$$
• subtracting complex numbers:
$$(a + i b) - (c + i d) = (a - c) + i (b - d)$$
• multiplying complex numbers:
$$(a + i b)(c + i d) = a c + i a d + i b c + i^2 b d = (a c - b d) + i (a d + b c)$$
• dividing complex numbers:
$$\frac{a + i b}{c + i d} = \frac{(a + i b)(c - i d)}{(c + i d)(c - i d)} = \frac{a c + b d}{c^2 + d^2} + i\, \frac{b c - a d}{c^2 + d^2}$$
Notice that the first two operations, unlike the next two, do not mix real and imaginary parts.
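Introducing the angle φ (cf. Fig. 1.11) via
$$\cos\phi = \frac{a}{|z|} \quad \text{and} \quad \sin\phi = \frac{b}{|z|}$$
yields, together with Euler's formula (1.64), a second representation of z:
$$z = |z| \left( \cos\phi + i \sin\phi \right) = |z|\, e^{i\phi} . \tag{1.115}$$
Thus
$$z_1 \cdot z_2 = |z_1| |z_2|\, e^{i(\phi_1 + \phi_2)}$$
or
$$\frac{z_1}{z_2} = \frac{|z_1|}{|z_2|}\, e^{i(\phi_1 - \phi_2)} .$$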
Chapter 2
Laws of Mechanics
Starting with gravitation, this chapter provides an overview of much of what is to come. This includes Newton's equation of motion and quantities like momentum, angular momentum, torque, and energy as well as their conservation laws. Overall the chapter is meant to guide the beginner through the basic equations and concepts of mechanics.
2.1 An Overview†
Newton's law of gravitation,¹
$$m \ddot{\vec{r}} = -G\, \frac{m M}{r^2}\, \frac{\vec{r}}{r} , \tag{2.1}$$
is the theoretical basis for an unpretentious view of earth’s role among the other
bodies of the solar system. It describes the motion of the planets and their moons
based on a universal, attractive force, on the right hand side of the equation, acting
between the celestial bodies. Here m and M are the masses of two bodies separated
by the distance r. The quantity
$$G = (6.673 \pm 0.001) \cdot 10^{-11}\ \mathrm{N\, m^2\, kg^{-2}} \tag{2.2}$$
is the gravitational constant. The position vector r = (x, y, z) joins the masses m and
M. It is extending from the origin, the position of the mass M, to the position of m.
1 Sir Isaac Newton, 1643–1727. His fame as founder of classical theoretical physics is largely due to his book of 1687, Philosophiae naturalis principia mathematica (Mathematical principles of natural philosophy). In this work he formulates his laws of motion as well as the law of gravitation, which he had discovered in 1666. In addition to these and other seminal contributions to the development of theoretical physics, he also made important contributions to mathematics.
DOI 10.1007/978-3-319-48710-6_2

The quantity
$$\ddot{\vec{r}} = \frac{d^2}{dt^2}\, \vec{r} = \begin{pmatrix} d^2 x / dt^2 \\ d^2 y / dt^2 \\ d^2 z / dt^2 \end{pmatrix} \tag{2.3}$$
is the acceleration of m, i.e. the variation of m's velocity,
$$\dot{\vec{r}} = \frac{d}{dt}\, \vec{r} = \begin{pmatrix} dx / dt \\ dy / dt \\ dz / dt \end{pmatrix} , \tag{2.4}$$
as a function of time, due to the force of gravitation between m and M expressed on

the right hand side of (2.1). Notice that (2.1) is symmetric, i.e.
$$M \ddot{\vec{r}} = -G\, \frac{M m}{r^2}\, \frac{\vec{r}}{r} , \tag{2.5}$$
is equally correct, except that here m occupies the origin and $\vec{r}$ is the position vector of M.
Gravitation is one of the four currently known so-called fundamental interactions. The other
three interactions can be described in the unifying framework of the standard model
of particle physics.2 Gravitation, however, is special and not very well understood. It
is by far the weakest interaction on the atomic scale. And yet it becomes the dominant
one between macroscopic bodies over large distances. This is because gravitational
interactions are strictly positive, attractive, and their range is infinite.3
Let us return to Newton’s law of gravitation. If m is the earth’s mass and M is the
mass of the sun, how can we account for the gravitational effects of the moon and
the other large bodies in the solar system and possibly beyond? The mathematical
answer is
$$m \ddot{\vec{r}} = G m \sum_j \frac{M_j}{|\vec{r}_j - \vec{r}|^2}\, \frac{\vec{r}_j - \vec{r}}{|\vec{r}_j - \vec{r}|} . \tag{2.6}$$
2 There are interactions or forces, which one encounters on a daily basis, that would not be considered fundamental in this framework. One example is the attractive interaction between a piece of adhesive tape and the smooth surface to which it is stuck. The (molecular) interaction laws used to describe adhesion can themselves be derived from theories built on the basis of the aforementioned fundamental interactions. Nevertheless it is not always easy to exclude forces from the list of 'fundamental interactions'. A stretched rubber band exerts a force that tends to reduce the strain. This force is due to the reduction of the 'conformation entropy' of the macromolecules or polymers in the deformed rubber. It is not so simple to decide whether or not 'entropic forces' are fundamental. And entropy, a key quantity in many-body theory, of course is an important contribution to the aforementioned molecular forces as well.
3 We shall see what exactly this means.
Here r is the position vector of the earth in a suitable coordinate system. The vector
rj − r joins the earth with the (celestial) body j possessing the mass Mj . This extension
of (2.1) states that gravitation is simply additive.4
There is a conceptual problem here as you may have noticed. What do we mean
by separation of two masses? Do we mean the distance between their centers or do
we mean the distance between their surfaces or some other distance?
Let’s pretend earth is the size of a pinhead with a diameter of 3 mm. Then the sun
is a ball roughly 33 cm across some 35 m away. Here the precise meaning of distance between the two
masses may not be very important. The situation changes completely if we consider
a satellite in an orbit near the earth’s surface. Even more difficult is the situation if
we consider the motion of a pendulum directly on the earth’s surface. What is the
proper mass-to-mass separation in these cases? One can show (cf. below) that two
radially symmetric mass distributions, possessing the total masses m and M, each
feels attracted to the other with a force of magnitude
$$ F_g(r) = G\, \frac{mM}{r^2} . \tag{2.7} $$
The proper r is the distance separating the midpoints of the two mass distributions.5
Advanced Example: We want to show that the last statement is true. Suppose
a large mass is cut up into many volume elements each contributing a small
increment δmj to the total mass. Thus each volume element (or mass element)
contributes to the total gravitational force F g (r ) at position r given by
$$ \vec F_g(\vec r) = G m \sum_j \frac{\delta m_j}{|\vec r_j - \vec r|^2}\, \frac{\vec r_j - \vec r}{|\vec r_j - \vec r|} \tag{2.8} $$
(cf. (2.6)). Momentarily we assume that we measure this force via a point mass
m located at position r . A point mass is a mathematical approximation, which
assumes that the entire mass is concentrated in a point.
The sum is inconvenient to deal with. Therefore we decide to convert it into
an integral, i.e.

$$ \sum_j \delta m_j = \int_{V_b} \rho(\vec r\,')\, dV' . $$
4 The intermolecular forces mentioned above usually are not additive. This means that the
force between any two molecules does depend on the positions and orientations of other molecules
in their proximity.
5 In examples involving the earth’s gravitation, we shall always approximate earth as a radially
symmetric mass distribution.

Here $V_b$ is the total volume of the large mass or body, and $\rho(\vec r\,')$ is the mass
density or mass distribution inside the large body at the position $\vec r\,'$. Notice
that the two sides of this equation indeed are equal, because we obtain the
total mass in both cases. On the right side we of course assume that the mass
increments can be arbitrarily small, i.e. $\lim_{\delta m_j \to 0} \delta m_j/\delta V = \rho(\vec r\,')$, where $\vec r\,'$
is the position of the small volume $\delta V$ containing $\delta m_j$.
Thus we have

$$ \vec F_g(\vec r) = G m \int_{V_b} \frac{\rho(\vec r\,')}{|\vec r\,' - \vec r|^2}\, \frac{\vec r\,' - \vec r}{|\vec r\,' - \vec r|}\, dV' . \tag{2.9} $$
In general this integral is difficult to solve. However, if the distribution $\rho(\vec r\,')$ is
radially symmetric, i.e. $\rho(\vec r\,') = \rho(r')$, then we can achieve great simplification.
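First we note that
$$ \vec\nabla_{\vec r} \cdot \vec F_g = 0 \tag{2.10} $$
outside $V_b$. We apply Gauss’ theorem or the divergence theorem (cf. Appendix A) to this
equation, i.e.
$$ 0 = \int_V \vec\nabla_{\vec r} \cdot \vec F_g\, dV = \oint_{\partial V} \vec F_g \cdot d\vec f . \tag{2.11} $$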
The integration volume, V , indicated by the darker shaded area in the sketch, is
the volume between two concentric spherical shells. Both shells as well as the
volume Vb , shown as the small black circle, are centered on the same origin.
Each of the two shells completely includes Vb . From the radial symmetry of
the problem it is clear that m experiences a force pulling it straight towards the
origin, i.e. the center of the mass distribution, regardless of where m is located
(momentarily outside $V_b$). This means that the magnitude of $\vec F_g$ is constant on
each of the two shells, i.e. $F_g = F_g(R)$ on the outer shell with radius $R$ and
$F_g = F_g(R')$ on the inner shell with radius $R'$. In addition, on the outer shell
$\vec F_g \cdot d\vec f < 0$, because according to Gauss’ theorem $d\vec f$ must point perpendicularly
away from the integration volume’s surface, i.e. radially outward, whereas $\vec F_g$ points
towards the origin. On the inner shell’s surface $\vec F_g \cdot d\vec f > 0$, because there $d\vec f$
points towards the origin as well, so that $\vec F_g$ and $d\vec f$ are parallel. Equation (2.11)
therefore yields
$$ 0 = -4\pi R^2 F_g(R) + 4\pi R'^2 F_g(R') . \tag{2.12} $$
So far we have only stated that $R > R'$ and that both radii are larger than
the radius of the mass distribution. Otherwise we may choose arbitrary and
independent values for $R$ and $R'$. From this and (2.12) we conclude
$$ R^2 F_g(R) = \text{constant} . \tag{2.13} $$
We determine the constant by considering the limit $R \to \infty$. Notice that in
this limit the mass distribution reduces to a point mass and thus
$$ \vec F_g(\vec r) = -\frac{G m M}{r^2}\, \frac{\vec r}{r} , \tag{2.14} $$
where $M = \int_{V_b} \rho(\vec r\,')\, dV'$. This means that a radially symmetric mass distribution,
$\rho(r')$, on its outside, gives rise to the same gravitational field as a point
mass $M$ (Here by gravitational field we mean the right side of (2.14) without the
point mass $m$.). By extension of this reasoning we conclude that two radially
symmetric mass distributions, which do not overlap, possess the same gravi-
tational interaction as their corresponding point masses located at the center
of the respective mass distribution.
Before we continue, we want to discuss the gravitational force inside a
radially symmetric mass distribution. First we remove the black circle, i.e. the
mass distribution, from the above sketch and insert instead a thin spherical
shell, uniformly covered with mass, between the two spherical shells defin-
ing the integration volume. Again, all shells are centred on the same origin.
Equation (2.11) still does apply. Except now, because of the symmetry of the
problem, Fg (0) = 0. From (2.12) we conclude that Fg = 0 everywhere inside
the mass-covered shell. This is Newton’s first theorem, whereas the statement
following (2.14) is known as Newton’s second theorem.
If we measure the force of gravitation ‘tunnelling’ through a radially sym-
metric mass distribution, ρ(r), with a point mass along a straight line through
the distribution’s center, what do we find? Outside the mass distribution we
measure $F_g(r) \propto M r^{-2}$, where $r$ is the distance to the center of $\rho(r)$ and
$M$ is the distribution’s total mass. Inside the mass distribution we measure
$F_g(r) \propto M(r)\, r^{-2}$ instead, where $M(r)$ is the mass enclosed in a sphere with
radius $r$. As we have just shown, there is no contribution from beyond $r$.
If the mass distribution is uniform, i.e. $\rho(r) = \rho$, then $M(r) = 4\pi\rho r^3/3$ and
in this case $F_g(r) \propto \rho r$. Because the force inside the mass distribution must
continuously tie on to the force on the outside given by (2.14), we find
$$ \vec F_g(\vec r) = -\frac{G m M}{R^3}\, r\, \frac{\vec r}{r} , \tag{2.15} $$
when $r \le R$, where $R$ now denotes the radius of the mass distribution.
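The two branches (2.14) and (2.15) can be combined into one piecewise function. The sketch below is an illustration only; the name `sphere_force`, the sign convention (negative means directed towards the center), and the rounded default value of G are assumptions made for the example.

```python
def sphere_force(r, R, M, m, G=6.674e-11):
    """Signed radial force on a point mass m at distance r > 0 from the center
    of a uniform sphere with radius R and total mass M.
    A negative value means the force is directed towards the center."""
    if r <= R:
        return -G * m * M * r / R**3   # inside, (2.15): grows linearly with r
    return -G * m * M / r**2           # outside, (2.14): point-mass law

if __name__ == "__main__":
    # Arbitrary example values: the two branches meet at r = R.
    R, M, m = 2.0, 3.0, 1.0
    for r in (0.5 * R, R, 2.0 * R):
        print(r, sphere_force(r, R, M, m))
```

Halving r inside the sphere halves the force, whereas doubling r outside reduces it to one quarter.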
Remark 1: Equations (2.12) and (2.13) may convey the impression that they
‘prove’ the $r^{-2}$-dependence of the gravitational force. This is not true. The
$r^{-2}$-dependence already enters through (2.10)!
Remark 2: What would have been the result of this calculation in two dimen-
sions? A point mass m remains a point mass. But the uniform spherical mass
distribution is replaced by a uniform disk. The result is shown in the next
figure. The solid line is the result for the sphere according to (2.14) and
(2.15). The dashed line is the corresponding force experienced by a point
mass located in the plane of the disk, possessing the same radius and the
same mass as the sphere, at the distance r from its center. A negative force
here means that the force is directed towards the center. The two results are
qualitatively similar. Notice that the magnitude of the force in the case of the
disk is larger, but it does not diverge at r = R. In particular, the two geome-
tries yield the same forces in the limits of small and large r. This is easy to
understand in the limit r → ∞, because in this limit both geometries reduce
to points. The limit r → 0 is more subtle. This is because in the case of the
disk (2.11) no longer applies. In contrast to the three-dimensional case, the
point mass, located at r from the center but inside the disk, does experience
a net force from the mass located between r and R. This force contribution
is oriented radially away from the disk’s center. Together with the net force
due to mass elements between 0 and r, which is directed towards the cen-
ter, it yields the same slope as the force in the spherical mass distribution.
The disk problem may seem academic. But it has applications in the dynam-
ics of disk-like galaxies [1]. The stars in these galaxies rotate around the galac-
tic center with velocities v(r), where r is the radial distance from the center.
$v(r)$ versus $r$ is called the rotation curve. Measurements of rotation curves show
that $v$ becomes more or less constant at large distances from the galactic center.
This, when it was discovered, was very surprising. We shall learn that a
(point) mass, rotating around a (spherical) central mass, experiences a centrifugal
force proportional to $v^2/r$, which must be balanced by the gravitational
attraction $\propto 1/r^2$. Thus $v^2/r \sim 1/r^2$ and therefore $v(r) \sim 1/\sqrt{r}$ (The symbol
$\sim$ means that we neglect everything except the (dominant) $r$-dependencies on
the two sides of the underlying equation.). The above figure shows that this is
true for the disk as well, provided $r$ is sufficiently large. The observation of
numerous rotation curves, however, does not show a decrease of $v(r)$ when $r$
exceeds the radius of the visible matter in the galaxies. Therefore there must
be additional so-called dark matter distributed in a special way and extending
far beyond the visible galaxy. It is not difficult to estimate the dark matter
distribution, provided it is radially symmetric, in the $r$-range in which $v(r)$ is
constant. Above we have remarked that the force of gravitation on a point mass
(or star in this case) moving inside a radially symmetric (dark matter) mass
distribution is proportional to $M(r)\, r^{-2}$. Here $M(r)$ is the total mass inside
the mass distribution from its center out to $r$, where the point mass is. Thus
$v^2/r \sim M(r)\, r^{-2}$. Because we want $v$ to not depend on $r$, we obtain $M(r) \sim r$
or $\rho(r) \sim r^{-2}$, where $\rho(r)$ is the radial (dark matter) mass density. Notice that
we have neglected the ordinary mass of the galaxy. The nature of dark matter,
which thus far betrays its existence only through its gravitation, currently is
one of the great mysteries in science.
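The scaling argument can be checked with a few lines of code. This is a sketch under stated assumptions: we write the balance as $v^2/r = G M(r)/r^2$, use units in which G = 1, and the two mass profiles `point_mass` and `linear_mass_profile` are hypothetical examples, not fits to any galaxy.

```python
import math

def circular_speed(r, enclosed_mass, G=1.0):
    """Speed v(r) on a circular orbit of radius r from v^2 / r = G M(r) / r^2."""
    return math.sqrt(G * enclosed_mass(r) / r)

def point_mass(r):
    """All mass concentrated at the center: M(r) = M = 1."""
    return 1.0

def linear_mass_profile(r):
    """Enclosed mass M(r) ~ r, which corresponds to rho(r) ~ r^-2."""
    return 2.0 * r

if __name__ == "__main__":
    # v falls off like 1/sqrt(r) for the point mass, but stays flat for M(r) ~ r.
    for r in (1.0, 4.0, 16.0):
        print(r, circular_speed(r, point_mass), circular_speed(r, linear_mass_profile))
```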
We may generalize (2.1) by replacing the gravitational force by a different type

of force acting on the point mass m, i.e.
$$ m\ddot{\vec r} = \vec F . \tag{2.16} $$
This is the usual form of Newton’s second law. An example of a force different from
gravitation is the electrostatic force exerted on a charge, bound to the mass m, in an
electric field. Another example is the elastic force due to a stretched spring attached
to m.
It is not very difficult to solve (2.16) numerically. This means we can calculate
the future position of m, provided we do have some information about m at an earlier
time. First we apply the Taylor series expansion (1.75) to $\vec r(t)$ or, for simplicity, just
to its x-component:
$$ x(t+\delta t) = x(t) + \dot x(t)\,\delta t + \frac{1}{2}\,\ddot x(t)\,\delta t^2 + \cdots . \tag{2.17} $$
Here $\delta t$ is a small timestep. Thus, approximately we may write
$$ x(t+\delta t) \approx x(t) + \dot x(t)\,\delta t + \frac{1}{2}\,\frac{F_x(t)}{m}\,\delta t^2 , \tag{2.18} $$
where we have replaced the acceleration $\ddot x$ using (2.16). Analogous equations follow
for $y(t+\delta t)$ and $z(t+\delta t)$. Provided we do know $\vec r(t)$ and $\dot{\vec r}(t)$, we can compute
$\vec r(t+\delta t)$ for any possible force $\vec F(\vec r(t), \dot{\vec r}(t))$. Having thus obtained $\vec r(t+\delta t)$ the
future velocity can be estimated via
$$ \dot{\vec r}(t+\delta t) \approx \frac{\vec r(t+\delta t) - \vec r(t)}{\delta t} . \tag{2.19} $$
Iterating the last two equations yields the trajectory of m for all future t. The basic
prerequisite of course is that we know the initial conditions. Here this means that we
know the position and the velocity at time t = 0, the starting time of our trajectory.
Because the quality of our approximations does depend on the size of the timestep,
we can improve the numerical accuracy by choosing smaller δt.
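A minimal sketch of such an iteration for one Cartesian component is given below. The position update is (2.18). For the velocity the sketch uses the first-order update $\dot x(t+\delta t) \approx \dot x(t) + (F_x/m)\,\delta t$ rather than the difference quotient (2.19); this substitution is our choice for the example, made because inserting (2.18) into (2.19) returns $\dot x(t) + \frac{1}{2}(F_x/m)\,\delta t$, i.e. only half of the velocity change accumulated during the step. The names `integrate` and `force` as well as the numerical values are hypothetical.

```python
def integrate(x0, v0, force, m, dt, n_steps):
    """Step x(t) and v(t) forward in time for one Cartesian component.

    force(x, v) returns F_x. The position update is (2.18).
    Assumption of this sketch: the velocity is advanced with
    v(t + dt) = v(t) + (F_x / m) dt instead of the quotient (2.19).
    Returns a list of (t, x, v) tuples including the initial condition.
    """
    x, v = x0, v0
    trajectory = [(0.0, x, v)]
    for n in range(1, n_steps + 1):
        a = force(x, v) / m                      # acceleration from (2.16)
        x_new = x + v * dt + 0.5 * a * dt * dt   # position update, (2.18)
        v = v + a * dt                           # first-order velocity update
        x = x_new
        trajectory.append((n * dt, x, v))
    return trajectory

if __name__ == "__main__":
    # Free fall from rest; g, m and the time step are example values.
    g, m = 9.81, 1.0
    t, x, v = integrate(0.0, 0.0, lambda x, v: -m * g, m, 0.01, 100)[-1]
    print(t, x, v)
```

For a constant force this scheme reproduces the exact trajectory. For position-dependent forces the error shrinks as the timestep is reduced, in line with the remark above.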
Equation (2.6) is a special case of the following equation

$$ m_i \ddot{\vec r}_i = \vec F_{i1} + \vec F_{i2} + \vec F_{i3} + \cdots \equiv \sum_{j(\neq i)} \vec F_{ij} . \tag{2.20} $$
Here F ii , which is the interaction of mass mi with itself, is excluded from the sum.
Notice that the notation F ij suggests that the total force acting on i always is a sum
of pair forces between i and other masses j. This is true in the case of gravitation. It is
also true in the many mechanical systems, which we are going to study. But in general
we must be more careful. The electric forces between molecules, for instance, are
not pairwise additive. This means that the force between two molecules does depend
on the distribution of other molecules in the vicinity. We do not want to elaborate on
these ‘polarization effects’, but we should keep the point in mind.
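Using the definition
$$ \vec p_i \equiv m_i \frac{d}{dt}\vec r_i = m_i \dot{\vec r}_i , \tag{2.21} $$
where $\vec p_i$ is called the momentum, (2.20) can be expressed as
$$ \dot{\vec p}_i = \sum_{j(\neq i)} \vec F_{ij} . \tag{2.22} $$
If the forces on the right side are zero, this means
$$ \dot{\vec p}_i = 0 \tag{2.23} $$
or
$$ \vec p_i = \text{const} . \tag{2.24} $$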
The two equations state that mi neither alters the magnitude nor the direction of its
momentum if there are no forces acting on it. This fact is called Newton’s first law.
At this point we should discuss the meaning of m or mi appearing in (2.16)–(2.24).
In principle every mass occupying a finite volume of space, we call this a
mass distribution, has to be cut up into very small or infinitesimal mass elements δm
or δmi . This means that m or mi in the aforementioned equations must be understood
as δm or δmi .
But how do we reconcile this concept of δmi with the known fact that matter
consists of atoms? The answer is that we partition a larger mass into tiny elements
according to a mere mathematical procedure, which assumes that matter is indeed
continuous rather than discrete. This approximation is a very good one for almost all
applications of mechanics.
A generalization of (2.20) based on this approximation is
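$$ \rho(\vec r)\, \ddot{\vec r}(\vec r) = \vec f(\vec r) . \tag{2.25} $$
Here $\rho(\vec r)$ is the mass density $\delta m_i/\delta V$ inside a volume element $\delta V$ at position $\vec r$ (or
$\vec r_i$) and $\vec f(\vec r)$ is the attendant force density $\vec F_i/\delta V$. $\vec F_i$ or $\vec F(\vec r)$ is the total force
exerted on the mass element.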
Experience has taught us that (2.16) or (2.25) yield a very accurate description
of the motion of mass elements on all familiar length scales, i.e. from $10^{-6}$ m
(microscope) to $10^{11}$ m (telescope). However, there are no sharp limits beyond which
Newtonian mechanics no longer applies. Somewhere on the atomic scale
($< 10^{-9}$ m) quantities like position and momentum are no longer ‘independent’. The
effect on the validity of classical or Newtonian mechanics depends on various physical
parameters like mass of the particles, temperature, and density. A completely
different description of the dynamics of particles is needed. This description is quantum
mechanics. On the other side of the above range of lengths classical mechanics
also gets into trouble. But again not immediately. Notice that $\delta m_i$ feels forces
instantaneously, independent of the distance of the source of the force. This instantaneous
action at a distance is built into Newton’s equation of motion. Notice also that according
to Newton’s law of gravitation (2.1) the force on the right side becomes infinite in
the limit $r \to 0$. We may almost guess that the quantity $GM/r$ is an important parameter,
which, if it becomes sufficiently large, signals trouble for classical mechanics.
Again, within our solar system the attendant deviations from classical mechanics are
difficult to measure, which means they are small. They can be calculated within
Einstein’s6 general theory of relativity.
6 Einstein, Albert, physicist, *Ulm 14.3.1879, †Princeton (New Jersey) 18.4.1955. Einstein’s
outstanding contributions to physics are numerous. He is perhaps best known for his theories of
special and general relativity, which have advanced our understanding of nature on the microscopic
as well as on the scale of the entire universe.
We now carry out the summation over i in (2.22), i.e.
$$ \sum_i \dot{\vec p}_i = \sum_{i,j(\neq i)} \vec F_{ij} \equiv \vec F . \tag{2.26} $$
Here $\vec F$ is the total force, whereas the left side is the time derivative of the total
momentum,
$$ \vec P = \sum_i \vec p_i , \tag{2.27} $$
of all mass elements. Thus, the final result is the simple equation
$$ \dot{\vec P} = \vec F . \tag{2.28} $$
Static mechanical equilibrium of a system, which for instance every building
must satisfy (excluding vibrational motion), means that $\vec p_i = 0\ \forall\, i$, i.e. $\dot{\vec P} = 0$, and
therefore
$$ \vec F = 0 . \tag{2.29} $$
If a building does not satisfy this equation, it will start to move - and most likely
collapse.
Let’s look at (2.28) more closely. According to Newton’s third law (actio equals
reactio)
$$ \vec F_{ij} = -\vec F_{ji} . \tag{2.30} $$
According to (2.26) this means that $\vec F = 0$ is satisfied, unless there are external forces
$\vec F_{ext,k}$ in addition to the internal forces $\vec F_{ij}$ between the mass elements, i.e.
$$ \vec F_{ext,1} + \vec F_{ext,2} + \cdots = \vec F_{ext} . $$
In this case the total force becomes
$$ \vec F = \vec F_{ext} . \tag{2.31} $$
Systems for which $\vec F_{ext} = 0$ are isolated. In an isolated system the total momentum
is therefore conserved, i.e.
$$ \frac{d}{dt}\vec P = 0 \quad\text{or}\quad \vec P = \text{const} . \tag{2.32} $$
Every distribution of mass elements has a special reference point - the center of
mass defined via
$$ \vec R \equiv \frac{\sum_i m_i \vec r_i}{\sum_i m_i} . \tag{2.33} $$
Using the mass density $\rho(\vec r)$ this definition becomes
$$ \vec R \equiv \frac{\int_V \rho(\vec r)\, \vec r\, dV}{\int_V \rho(\vec r)\, dV} . \tag{2.34} $$
The integration volume includes the entire mass distribution. From this we obtain
for the center of mass velocity
$$ \dot{\vec R} = \frac{\sum_i m_i \dot{\vec r}_i}{\sum_i m_i} = \frac{\vec P}{m} , \tag{2.35} $$
where $m$ is the total mass.

Notice that the center of mass velocity of an isolated system is constant, i.e.
$\dot{\vec R} = \text{constant}$. This is a special case of Newton’s first law. Notice also that (2.28)
may be rewritten as
$$ m\ddot{\vec R} = \vec F . \tag{2.36} $$
In general the three equations for the vector components in (2.36) do not fully describe
the motion or even the static equilibrium of a mechanical system. Why is this? Notice
that according to this equation we merely describe the translational motion of the
center of mass neglecting the motion of the individual mass elements mi .
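For a finite set of point masses the center of mass (2.33), its velocity (2.35) and the total momentum (2.27) are plain weighted sums. The helper names `weighted_mean` and `total_momentum` and the two-particle numbers below are assumptions made for this short sketch.

```python
def weighted_mean(masses, vectors):
    """Mass-weighted mean of 3-vectors: (2.33) for positions, (2.35) for velocities."""
    m = sum(masses)
    return tuple(sum(mi * v[k] for mi, v in zip(masses, vectors)) / m
                 for k in range(3))

def total_momentum(masses, velocities):
    """Total momentum P according to (2.27)."""
    return tuple(sum(mi * v[k] for mi, v in zip(masses, velocities))
                 for k in range(3))

if __name__ == "__main__":
    # Hypothetical two-particle system.
    masses = [1.0, 3.0]
    positions = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    velocities = [(3.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
    print(weighted_mean(masses, positions))    # center of mass R
    print(weighted_mean(masses, velocities))   # center of mass velocity, equals P / m
    print(total_momentum(masses, velocities))  # total momentum P
```

In this example the individual masses move although the total momentum and hence the center of mass velocity vanish, which illustrates why (2.36) alone does not describe the motion of the mass elements.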
There is an important special case, the so-called rigid body approximation, which
allows us to describe the complete motion of a body using (2.36) plus three additional
equations. The rigid body approximation assumes that the distance between any
pair of mass elements is fixed, i.e. $|\vec r_i - \vec r_j| = \text{constant}\ \forall\, i, j$. If we throw a rock, then
the rock is a rigid body. If we throw a water balloon instead, then the rigid body
approximation is not very good.
But let’s put the rigid body approximation aside for the moment and return to
Newton’s equation of motion for a mass element or point mass mi , i.e.

$$ m_i \ddot{\vec r}_i = \vec F_i \equiv \sum_{j(\neq i)} \vec F_{ij} + \vec F_{ext,i} . \tag{2.37} $$
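What happens to this equation in the case of a (small) rotation of $\vec r$ with respect to
a fixed axis? Of course, $m_i$ does not change but $\ddot{\vec r}_i$ becomes $\ddot{\vec r}_i + \delta\ddot{\vec r}_i$ and $\vec F_i$ changes
into $\vec F_i + \delta\vec F_i$. This means that (2.37) becomes
$$ m_i (\ddot{\vec r}_i + \delta\ddot{\vec r}_i) = \vec F_i + \delta\vec F_i . $$
Subtracting (2.37) from this equation yields
$$ m_i\, \delta\ddot{\vec r}_i = \delta\vec F_i . $$
In the chapter Mathematical Tools we had discussed this type of rotation and how
to write it as a cross product, i.e.
$$ m_i\, \delta\vec\phi \times \ddot{\vec r}_i = \delta\vec\phi \times \vec F_i . $$
Here $\delta\vec\phi$ is a vector parallel to the axis of rotation, whose magnitude is the small
rotation angle $\delta\phi$. We can get rid of $\delta\vec\phi$ by scalar multiplication of the equation with
$\vec r_i$, i.e.
$$ m_i (\delta\vec\phi \times \ddot{\vec r}_i) \cdot \vec r_i = (\delta\vec\phi \times \vec F_i) \cdot \vec r_i , $$
and subsequent cyclic rotation, i.e.
$$ m_i (\vec r_i \times \ddot{\vec r}_i) \cdot \delta\vec\phi = (\vec r_i \times \vec F_i) \cdot \delta\vec\phi . $$
At this point we can omit $\delta\vec\phi$ on the two sides of the equation.7 In addition we make
use of $\vec r_i \times \ddot{\vec r}_i = \frac{d}{dt}(\vec r_i \times \dot{\vec r}_i)$. The final result is
$$ \frac{d}{dt}(\vec r_i \times \vec p_i) = \vec r_i \times \vec F_i . $$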
Again we may sum over i on both sides of the equation, which yields
$$ \dot{\vec L} = \vec N , \tag{2.38} $$
where
$$ \vec L = \sum_i \vec r_i \times \vec p_i \tag{2.39} $$
and
$$ \vec N = \sum_i \vec r_i \times \vec F_i . \tag{2.40} $$
The quantity $\vec L$ is the total angular momentum, whereas $\vec L_i = \vec r_i \times \vec p_i$ are the individual
angular momenta of the $m_i$ relative to the origin of $\vec r_i$. $\vec N$ on the other hand is the total
torque. Again the $\vec N_i = \vec r_i \times \vec F_i$ are the individual torques relative to the origin of the
$\vec r_i$. Static mechanical equilibrium here means
$$ \vec N = 0 . \tag{2.41} $$
Together with (2.29) this equation governs the static equilibrium of buildings or other
static mechanical systems.
7 Notice that $\delta\vec\phi$ is arbitrary. This means that $[m_i (\vec r_i \times \ddot{\vec r}_i) - \vec r_i \times \vec F_i] \cdot \delta\vec\phi = 0$
implies $[\ldots] = 0$.
The motion of a rigid body, as defined above, is completely described by (2.28)
and (2.38). In Chap. 7 we shall return to this problem.
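Let’s again consider the isolated system for which $\vec F_{ext} = 0$. Thus we can rewrite
(2.40) as follows:
$$
\begin{aligned}
\vec N &= \sum_i \vec r_i \times \vec F_i \\
&= \sum_{i,j(j\neq i)} \vec r_i \times \vec F_{ij} \\
&= \frac{1}{2}\Big( \sum_{i,j(j\neq i)} \vec r_i \times \vec F_{ij} + \sum_{i,j(j\neq i)} \vec r_j \times \vec F_{ji} \Big) \\
&= \frac{1}{2}\Big( \sum_{i,j(j\neq i)} \vec r_i \times \vec F_{ij} - \sum_{i,j(j\neq i)} \vec r_j \times \vec F_{ij} \Big) \\
&= \frac{1}{2} \sum_{i,j(j\neq i)} (\vec r_i - \vec r_j) \times \vec F_{ij} \\
&= 0 .
\end{aligned}
\tag{2.42}
$$
The last equality follows because the vector connecting i and j, i.e. $\vec r_i - \vec r_j$, is either
parallel or anti-parallel to the force $\vec F_{ij}$ (assuming central forces). This means that
the total angular momentum is conserved, i.e.
$$ \frac{d}{dt}\vec L = 0 \quad\text{or}\quad \vec L = \text{constant} . \tag{2.43} $$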
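The cancellation leading to (2.42) can be verified numerically for a concrete central pair force. The sketch below uses gravitational pair forces in units with G = 1; the function names and the three-body configuration are hypothetical. It sums all pair forces and all torques $\vec r_i \times \vec F_{ij}$ and should return vectors that vanish up to rounding errors.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pair_force(ri, rj, mi, mj, G=1.0):
    """Gravitational force on i due to j; it points along r_j - r_i (central force)."""
    d = [rj[k] - ri[k] for k in range(3)]
    dist = math.sqrt(sum(c * c for c in d))
    return tuple(G * mi * mj * c / dist**3 for c in d)

def total_force_and_torque(masses, positions):
    """Sum of all internal pair forces, cf. (2.26), and of all torques, cf. (2.40)."""
    F = [0.0, 0.0, 0.0]
    N = [0.0, 0.0, 0.0]
    for i, ri in enumerate(positions):
        for j, rj in enumerate(positions):
            if i == j:
                continue                      # F_ii is excluded from the sum
            Fij = pair_force(ri, rj, masses[i], masses[j])
            tau = cross(ri, Fij)
            for k in range(3):
                F[k] += Fij[k]
                N[k] += tau[k]
    return tuple(F), tuple(N)

if __name__ == "__main__":
    masses = [1.0, 2.0, 3.0]
    positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.3, 2.0, -1.0)]
    print(total_force_and_torque(masses, positions))
```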
Another quantity, which is conserved in an isolated system, follows via a straightforward
manipulation of Newton’s equations of motion, $m_i \ddot{\vec r}_i = \vec F_i$. Multiplication
by $\dot{\vec r}_i$ and summation over $i$ yields
$$ \frac{d}{dt} \sum_i \frac{1}{2} m_i \dot{\vec r}_i^{\,2} = -\sum_i \vec\nabla_{\vec r_i} U \cdot \dot{\vec r}_i = -\frac{d}{dt} U . \tag{2.44} $$
The quantity
$$ E = \sum_i \frac{1}{2} m_i \dot{\vec r}_i^{\,2} + U , \tag{2.45} $$
is the total energy of the system, where the first term is the total kinetic energy and
the second term is the total potential energy. Thus we find that the total energy is
conserved as well, i.e.
$$ \frac{d}{dt} E = 0 \quad\text{or}\quad E = \text{const} . \tag{2.46} $$
Here we assume that the force can be expressed in terms of the negative gradient of
a function U, the potential energy, of all coordinates:
$$ \vec F = -\vec\nabla U . \tag{2.47} $$
However, it is important to note that the total energy of an isolated system is conserved
independent of this assumption. Energy conservation, which is also the basis of the
first law of thermodynamics, is an undisputed empirical fact.
Classical mechanics consists mainly of applications of Newton’s equations of
motion to systems of one or two point masses. Systems consisting of many point
masses are limited to certain special cases like vibrations and the motion of rigid bod-
ies. The differential equations occurring in these contexts usually can be simplified
due to the three conservation laws mentioned thus far, i.e.
$$ \dot E = 0 , \qquad \dot{\vec P} = 0 , \qquad \dot{\vec L} = 0 . \tag{2.48} $$
We will show that these conservation laws are intimately related to specific sym-
metries. Conservation of energy is tied to the fact that the result of an experiment does
not depend on whether it is carried out today or next year. This is called homogeneity
in time. Conservation of momentum follows from the invariance of our experimental
results to translation, i.e. it does not matter whether the experimental apparatus is moved
from its current position to the next room. Conservation of angular momentum fol-
lows when the orientation of the experimental apparatus is rotated without altering
the experimental results. In technical terms this means that space is homogeneous
and isotropic.
However, we will not discuss the origin or the nature of the various forces in
this text. What is the origin of gravitation and how does its mathematical form as
expressed in (2.1) arise? Do all forces possess the property of pairwise additivity,
which we did use in (2.6)?8 Why is gravitation a fundamental force, whereas the
force exerted by the ends of a stretched elastic thread is not? Likewise we will not
talk about the nature of mass or why it possesses the property called inertia.
At this point we turn our attention to two example applications of (2.28) and
(2.38) before we then exercise our newly acquired skills by solving a number of
problems.
8 Cf. our discussion on p. 46.

2.2 Two Examples in Newtonian Mechanics†
• Example - Mathematical Pendulum: A mathematical pendulum consists

of a point mass, m, suspended by a massless thread of length l in a gravitational
field (see Fig. 2.1). According to (2.28) the equation of motion of the mass is
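$$ m\ddot{\vec r} = \vec F_g + \vec T . \tag{2.49} $$
The gravitational force acting on the point mass is $\vec F_g$, whereas $\vec T$ is the tension
in the thread. On the left hand side of the equation we use polar coordinates:
$$ \vec r = l \begin{pmatrix} \sin\phi \\ \cos\phi \end{pmatrix} $$
$$ \dot{\vec r} = l\dot\phi \begin{pmatrix} \cos\phi \\ -\sin\phi \end{pmatrix} $$
$$ \ddot{\vec r} = -l\ddot\phi\, \underbrace{\begin{pmatrix} -\cos\phi \\ \sin\phi \end{pmatrix}}_{\equiv\, \vec e_\perp} - l\dot\phi^2\, \underbrace{\begin{pmatrix} \sin\phi \\ \cos\phi \end{pmatrix}}_{\equiv\, \vec e_\parallel} $$
($l$ = constant). It is useful to express $\ddot{\vec r}$ in terms of the orthogonal unit vectors
$\vec e_\perp$ and $\vec e_\parallel$. Notice that the same works also for $\vec F_g$, i.e.
$$ \vec F_g = \vec F_{g\perp} + \vec F_{g\parallel} = F_{g\perp}\, \vec e_\perp + F_{g\parallel}\, \vec e_\parallel . $$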
Fig. 2.1 Mathematical pendulum of constant length, l, including all relevant forces
Thus we obtain
$$ -ml\ddot\phi\, \vec e_\perp - ml\dot\phi^2\, \vec e_\parallel = F_{g\perp}\, \vec e_\perp + F_{g\parallel}\, \vec e_\parallel - T\, \vec e_\parallel . $$
Here $\vec F_g$ is given by the right hand side of (2.1). Applied to the present
problem, the position vector in this equation, $\vec r$, points from the center of the
earth to the pendulum’s mass, $m$. To very good approximation the magnitude
of this vector is given by the radius of the earth, $r_E$. Therefore $\vec F_g = m\vec g$, where
$$ \vec g \approx \frac{G m_E}{r_E^2}\, \vec e_y . \tag{2.50} $$
Here $\vec e_y$ is a unit vector pointing towards the center of the earth. This value
of $g$ is slightly larger than the standard value 9.81 m s$^{-2}$, which is an average
including the effect of the earth’s rotation ($g$-values for different latitudes can
be found on page 14–12 in [2].).
Because $\vec e_\perp$ and $\vec e_\parallel$ are perpendicular to each other, we obtain the independent
equations
$$ -ml\ddot\phi = mg\sin\phi \tag{2.51} $$
and
$$ -ml\dot\phi^2 = mg\cos\phi - T . \tag{2.52} $$
Equation (2.51) is the final equation of motion for the pendulum’s mass:
$$ \ddot\phi(t) = -\frac{g}{l}\sin\phi(t) . \tag{2.53} $$
This is a second order differential equation [3]. Its solution is much easier if
we require that the amplitude, and therefore $\phi(t)$, is small:
$$ \phi(t) = \delta\phi(t) . $$
We now expand $\sin\delta\phi(t)$ according to (1.69), $f(\delta x) \approx f(0) + f'(0)\,\delta x$, i.e.
$$ \sin\delta\phi(t) \approx \underbrace{\sin(0)}_{=0} + \underbrace{\cos(0)}_{=1}\, \delta\phi(t) . $$
From (2.53) follows
$$ \frac{d^2}{dt^2}\,\delta\phi(t) \approx -\frac{g}{l}\,\delta\phi(t) . \tag{2.54} $$
The general solution of this linear homogeneous differential equation can be
written as
$$ \delta\phi(t) = c_1 \sin\omega t + c_2 \cos\omega t , $$
where the quantities $c_1$ and $c_2$ are unknown constants. Inserting this into (2.54)
yields
$$ -c_1\omega^2 \sin\omega t - c_2\omega^2 \cos\omega t = -\frac{g}{l}\,(c_1 \sin\omega t + c_2 \cos\omega t) $$
and thus
$$ \omega = \sqrt{g/l} . $$
$c_1$ and $c_2$ follow from the initial conditions. If we choose $\delta\phi(t=0) = \delta\phi_0$
and $\delta\dot\phi(t=0) = \omega_0$, i.e. $c_1 = \omega_0/\omega$ and $c_2 = \delta\phi_0$, then
$$ \delta\phi(t) = \frac{\omega_0}{\omega}\sin\omega t + \delta\phi_0 \cos\omega t . \tag{2.55} $$
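Remark 1: Alternatively we can insert the solution ansatz $\delta\phi(t) = \delta\tilde\phi_0 \sin(\omega t + \epsilon)$.
The identity $\sin(\omega t + \epsilon) = \sin\omega t \cos\epsilon + \cos\omega t \sin\epsilon$ then leads
to
$$ \delta\phi(t) = (\delta\tilde\phi_0 \cos\epsilon) \sin\omega t + (\delta\tilde\phi_0 \sin\epsilon) \cos\omega t . $$
Comparison with (2.55) yields
$$ \delta\tilde\phi_0 = \sqrt{\delta\phi_0^2 + \Big(\frac{\omega_0}{\omega}\Big)^2} \quad\text{and}\quad \tan\epsilon = \delta\phi_0\, \frac{\omega}{\omega_0} . $$
If we choose for instance $\omega_0 = 0$ (vanishing initial velocity), then
$$ \epsilon = \frac{\pi}{2} \quad\text{and}\quad \delta\tilde\phi_0 = \delta\phi_0 . $$
The solution now becomes
$$ \delta\phi(t) = \delta\phi_0 \sin\Big(\omega t + \frac{\pi}{2}\Big) = \delta\phi_0 \cos\omega t . $$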
Remark 2: In problem 9 (Elastic Pendulum) we obtain the equation
$$ \delta\ddot\phi + \frac{k}{m}\,\delta\phi = g . $$
This is an inhomogeneous differential equation, because $g \neq 0$. The general
solution is the sum of the general solution of the homogeneous differential
equation ($g = 0$), as discussed above, plus a special solution of the inhomogeneous
differential equation, i.e.
$$ \delta\phi(t) = \underbrace{\delta\phi_{hom}(t)}_{\text{as before}} + \delta\phi_{inh}(t) . \tag{2.56} $$
Notice that $\delta\phi_{inh}(t) = \text{const}$ solves the inhomogeneous equation if const $= mg/k$.
Let us return to (2.52), i.e.
$$ T = mg\cos\phi + ml\dot\phi^2 , \tag{2.57} $$
from which we obtain the tension. The first term on the right is the contribution
due to the force of gravitation, whereas the second term is the centrifugal force
contribution due to the rotational motion of the mass.
Remark 3: Notice that in this example we have used the equality of inertial mass
and gravitational mass. Already Newton had carried out pendulum experiments
to check this equality, which later became the basis of Einstein’s theory of
general relativity (equivalence principle).
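The pendulum results can be explored numerically. The following sketch evaluates the small-amplitude solution (2.55) and the tension (2.57), and integrates the full equation of motion (2.53) for comparison. The semi-implicit Euler step used in `integrate_pendulum`, the default values g = 9.81 m s$^{-2}$ and l = 1 m, and all function names are assumptions made for this example.

```python
import math

def small_angle(t, phi0, omega0, g=9.81, l=1.0):
    """Small-amplitude solution (2.55) for initial angle phi0 and angular velocity omega0."""
    omega = math.sqrt(g / l)
    return omega0 / omega * math.sin(omega * t) + phi0 * math.cos(omega * t)

def tension(phi, phidot, m, g=9.81, l=1.0):
    """Thread tension according to (2.57)."""
    return m * g * math.cos(phi) + m * l * phidot**2

def integrate_pendulum(phi0, omega0, t_end, dt=1e-4, g=9.81, l=1.0):
    """Integrate (2.53) up to t_end with a semi-implicit Euler step
    (a simple scheme chosen for this sketch). Returns (phi, phidot)."""
    phi, phidot = phi0, omega0
    for _ in range(round(t_end / dt)):
        phidot += -(g / l) * math.sin(phi) * dt   # angular acceleration from (2.53)
        phi += phidot * dt
    return phi, phidot

if __name__ == "__main__":
    # For a small initial angle both descriptions should nearly agree.
    phi_num, phidot_num = integrate_pendulum(0.05, 0.0, 1.0)
    print(phi_num, small_angle(1.0, 0.05, 0.0))
    print(tension(phi_num, phidot_num, m=1.0))
```

Increasing the initial angle makes the difference between the two results grow, which marks the limit of the approximation behind (2.54).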
• Example - Static Equilibrium of a Ladder: In this second example we

consider the static equilibrium of the ladder depicted in Fig. 2.2. Its length and
mass are l and m, respectively. In order to prevent the ladder from sliding on
the frictionless floor, its lower end is tied to the wall using a rope. The angle
between the ladder and the wall is α. We want to determine the unknown forces
fB and fC as well as the rope’s tension, T .
Fig. 2.2 A ladder of length l is leaning against a wall
Because this is a static situation, we use (2.28) and (2.38) in the form of
(2.29) and (2.41). We write
$$ \vec F = \sum_i \vec f_i = 0 \tag{2.58} $$
and
$$ \vec N = \sum_i \vec r_i \times \vec f_i = 0 . \tag{2.59} $$
First we convince ourselves that the force of gravity acting on the ladder can
be treated as a force, which only acts at one distinct position along the ladder.
This is because
$$ \sum_j m_j \vec g = m\vec g = \vec f , $$
where $m_j$ are mass segments along the ladder. Notice that we do not yet know
at which point $\vec f$ is acting. This we shall know when we calculate the total
torque. Equation (2.58) now becomes
$$ \vec f + \vec f_B + \vec f_C + \vec T = 0 $$
or, expressed in terms of components,
$$ f_C \cos\alpha - T = 0 $$
and
$$ -f + f_B + f_C \sin\alpha = 0 . $$
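There are three unknown forces in two equations. Thus we also need (2.59).
First we select an origin, which is arbitrary. We show this via $\vec r_i = \vec r_i\,' + \vec a$, where
$\vec a$ is a constant vector, i.e.
$$ \sum_i \vec r_i \times \vec f_i = \sum_i (\vec r_i\,' + \vec a) \times \vec f_i = \sum_i \vec r_i\,' \times \vec f_i + \vec a \times \underbrace{\sum_i \vec f_i}_{=0\ \text{by (2.58)}} . $$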
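A useful choice is the center of mass of the ladder, i.e.
$$ \vec R = \frac{\sum_j m_j \vec r_j}{\sum_j m_j} . $$
Why is this a good choice? Notice that gravity’s contribution to the total torque
is
$$ \sum_j \vec r_j \times m_j \vec g = \Big( \sum_j m_j \vec r_j \Big) \times \vec g = \vec R \times \vec f . $$
If $\vec R$ is the origin, i.e. $\vec R = 0$, then this contribution to the torque vanishes. Thus
we find
$$ \vec r_B \times \vec f_B + \vec r_C \times \vec f_C + \vec r_T \times \vec T = 0 . $$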
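Here $\vec r_B$, $\vec r_C$ and $\vec r_T$ are the positions, relative to the center of mass, at which the
respective forces act on the ladder. The last equation can be rewritten in terms
of the magnitudes of the vectors, which all lie inside a common plane, i.e.
$$ \underbrace{|\vec r_B \times \vec f_B|}_{=\frac{1}{2} l f_B \sin\alpha} - \underbrace{|\vec r_C \times \vec f_C|}_{= s f_C} - \underbrace{|\vec r_T \times \vec T|}_{=\frac{1}{2} l T \cos\alpha} = 0 . $$
The length $s$ follows via
$$ \Big( s + \frac{1}{2} l \Big)^2 = h^2 + \Big( s + \frac{1}{2} l \Big)^2 \sin^2\alpha , $$
i.e.
$$ s = \frac{h}{\cos\alpha} - \frac{l}{2} . $$
The necessary third equation thus becomes
$$ \frac{1}{2}\, l f_B \sin\alpha - \Big( \frac{h}{\cos\alpha} - \frac{l}{2} \Big) f_C - \frac{1}{2}\, l T \cos\alpha = 0 . $$
In conjunction with the two previous equations we now can determine the
unknown forces:
$$ f_B = f - f_C \sin\alpha , \qquad f_C = f\, \frac{l \sin(2\alpha)}{4h} , \qquad T = f_C \cos\alpha . $$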
The forces fB and fC are called reaction forces. They are equal and opposite in
the ladder and the wall, respectively.
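The closed-form result can be checked by inserting it back into the two force equations and the torque equation. The sketch below does this; the function names `ladder_forces` and `residuals` and the numerical values (weight in newton, lengths in metre, angle in radian) are hypothetical.

```python
import math

def ladder_forces(f, l, h, alpha):
    """Return (f_B, f_C, T) for weight f, ladder length l, height h and angle alpha (radian)."""
    f_C = f * l * math.sin(2.0 * alpha) / (4.0 * h)
    f_B = f - f_C * math.sin(alpha)
    T = f_C * math.cos(alpha)
    return f_B, f_C, T

def residuals(f, l, h, alpha, f_B, f_C, T):
    """Left sides of the two force equations and of the torque equation.
    All three vanish in static equilibrium."""
    s = h / math.cos(alpha) - l / 2.0
    return (f_C * math.cos(alpha) - T,
            -f + f_B + f_C * math.sin(alpha),
            0.5 * l * f_B * math.sin(alpha) - s * f_C - 0.5 * l * T * math.cos(alpha))

if __name__ == "__main__":
    f, l, h, alpha = 200.0, 4.0, 3.0, 0.5
    forces = ladder_forces(f, l, h, alpha)
    print(forces)
    print(residuals(f, l, h, alpha, *forces))
```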
• Problem 4 - Force – Two Centers: The Lennard–Jones (LJ) potential,
$$ u_{LJ}(r_{ij}) = 4\epsilon \Big[ \Big( \frac{\sigma}{r_{ij}} \Big)^{12} - \Big( \frac{\sigma}{r_{ij}} \Big)^{6} \Big] , \tag{2.60} $$
is a simple model, which approximately describes the interaction between two
noble gas atoms or small molecules like nitrogen or methane (in their gaseous
or liquid states).
When the two particles are far apart, they do attract each other. At small
separations the interaction force is repulsive. The $-r^{-6}$-attraction in the potential
is the leading term of the quantum mechanical interaction between small
neutral molecules, i.e. neutral molecules universally attract each other at large
distances. The $r^{-12}$-repulsion is a convenient description of the excluded volume
interaction between molecules. However, there is no deeper justification
for its mathematical form.
Here the atoms or molecules are located at $\vec r_i$ and $\vec r_j$, respectively, and
$r_{ij} = |\vec r_i - \vec r_j|$. The quantities $\epsilon$ and $\sigma$ are positive parameters characterizing
the atoms or molecules.
We know already that the force $\vec F_i$, which $i$ experiences due to $j$, is given
by $\vec F_i = -\vec\nabla_{\vec r_i} u(r_{ij}) = -\big( \frac{d}{dx_i}, \frac{d}{dy_i}, \frac{d}{dz_i} \big)\, u(r_{ij})$.
(a) Calculate $\vec F_{LJ,i}$.
(b) Sketch $u_{LJ}(r)/\epsilon$ and $\sigma F_{LJ}/\epsilon$ (notice: $\vec F_{LJ,i} = F_{LJ} \cdot (\vec r_{ij}/r_{ij})$) as functions
of $r/\sigma$.
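Solution: (a) The following figure shows the two centers or atoms i and j
including their position vectors $\vec r_i$ and $\vec r_j$. The third vector is $\vec r_{ij} = \vec r_i - \vec r_j$. The
force on center i due to the second center j follows via
$$ \vec F_i = -\vec\nabla_{\vec r_i} u(r_{ij}) = -\begin{pmatrix} d/dx_i \\ d/dy_i \\ d/dz_i \end{pmatrix} u(r_{ij}) = -\begin{pmatrix} dr_{ij}/dx_i \\ dr_{ij}/dy_i \\ dr_{ij}/dz_i \end{pmatrix} \frac{du(r_{ij})}{dr_{ij}} . $$
Using
$$ \frac{dr_{ij}}{dx_i} = \frac{d}{dx_i} \sqrt{(\vec r_i - \vec r_j)^2} = \frac{1}{2}\, \frac{1}{r_{ij}}\, 2x_{ij} = \frac{x_{ij}}{r_{ij}} $$
together with $dr_{ij}/dy_i = y_{ij}/r_{ij}$ and $dr_{ij}/dz_i = z_{ij}/r_{ij}$ yields
$$ \vec F_{LJ,i} = \underbrace{24\, \frac{\epsilon}{\sigma} \Big[ 2 \Big( \frac{\sigma}{r_{ij}} \Big)^{13} - \Big( \frac{\sigma}{r_{ij}} \Big)^{7} \Big]}_{\equiv\, F_{LJ}}\, \frac{\vec r_{ij}}{r_{ij}} . \tag{2.61} $$
In the opposite case, which is the force on center j exerted by center i, we have
$$ \vec F_{LJ,j} = -\vec F_{LJ,i} = F_{LJ}\, \frac{\vec r_{ji}}{r_{ij}} . $$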
(b) The solid line shown in the second sketch is the LJ potential in units of
$\epsilon$ versus $r/\sigma$. The dashed line is $\sigma F_{LJ}/\epsilon$. The force is attractive,
i.e. negative, for $r > 2^{1/6}\sigma$ and becomes repulsive, i.e. positive, when
$r < 2^{1/6}\sigma$, implying that $\sigma$, albeit roughly, corresponds to the diameter of
the atom or molecule.
Remark: You may wonder how to describe the interaction between different
types of atoms or molecules, e.g. in a mixture, with the LJ potential. In this case
$\epsilon = \epsilon_{ij}$ and $\sigma = \sigma_{ij}$, i.e. each different type of pair requires
separate parameters $\epsilon$ and $\sigma$. A simple mixing rule, which often is a reasonable
approximation, is
\[ \epsilon_{ij} = \sqrt{\epsilon_{ii}\epsilon_{jj}} \quad\text{and}\quad \sigma_{ij} = \frac{1}{2}\left(\sigma_{ii} + \sigma_{jj}\right) . \tag{2.62} \]
The notation $\epsilon_{ii}$ and $\sigma_{ii}$ means that the two interaction partners are identical.
The same applies to $\epsilon_{jj}$ and $\sigma_{jj}$, of course. Additional detail regarding this point can be
found in books on computer simulations of gases and liquids like [4].
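The result (2.61) and the mixing rule (2.62) can be checked numerically. The following Python sketch is an illustration only; the function names and the default parameter values (energy parameter `eps` and length parameter `sigma` both set to 1) are assumptions made for this example.

```python
import math

def lj_potential(r, eps=1.0, sigma=1.0):
    """LJ potential 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def lj_force(r, eps=1.0, sigma=1.0):
    """Scalar force F_LJ = -du/dr as in (2.61); positive means repulsive."""
    sr = sigma / r
    return 24.0 * eps / sigma * (2.0 * sr ** 13 - sr ** 7)

def mix(eps_ii, eps_jj, sigma_ii, sigma_jj):
    """Mixing rule (2.62): geometric mean for eps, arithmetic mean for sigma."""
    return math.sqrt(eps_ii * eps_jj), 0.5 * (sigma_ii + sigma_jj)

r_min = 2.0 ** (1.0 / 6.0)   # separation at which the force changes sign
print(lj_force(r_min))       # close to zero
print(lj_potential(r_min))   # close to -eps
```

Comparing `lj_force` with a finite-difference derivative of `lj_potential` is a quick way to catch sign or exponent errors in a hand calculation.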
• Problem 5 - Force – Three Centers: Here i, j, and k are three masses. The
masses i and k are separated by the distance rik . They are joined by a harmonic
spring, i.e. a spring whose extension or compression is proportional to the
restoring force opposing the respective deformation. Likewise, masses j and
k, separated by the distance rjk , are also coupled by a harmonic spring. The
respective spring constants, k, are equal and the springs both have the same
equilibrium length bo .
Write down the total potential energy of this system of spring-coupled
masses. Subsequently calculate the forces F i , F j , and F k acting on each mass
in response to a deformation of the system.
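Solution: The total potential energy is given by
\[ U = \frac{1}{2}k\left(r_{ik} - b_o\right)^2 + \frac{1}{2}k\left(r_{jk} - b_o\right)^2 . \tag{2.63} \]
The force felt by mass i is
\[ \vec F_i = -\vec\nabla_i \frac{1}{2}k\left(r_{ik} - b_o\right)^2 = -k\left(r_{ik} - b_o\right)\frac{\vec r_{ik}}{r_{ik}} . \tag{2.64} \]
Notice $\vec\nabla_i \equiv \vec\nabla_{\vec r_i}$, i.e. the derivatives are with respect to the position of particle
i.
Using the symmetry of the system yields
\[ \vec F_j = -k\left(r_{jk} - b_o\right)\frac{\vec r_{jk}}{r_{jk}} . \tag{2.65} \]
In addition we make use of the equilibrium condition $\sum_{l=1}^{3}\vec F_l = 0$. Hence
\[ \vec F_k = -\vec F_i - \vec F_j . \tag{2.66} \]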
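A minimal Python sketch of (2.63) to (2.66) follows. It assumes $\vec r_{ik} = \vec r_i - \vec r_k$ and $\vec r_{jk} = \vec r_j - \vec r_k$; the helper names and the numerical values are hypothetical and serve only as an illustration.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def potential(ri, rj, rk, k=1.0, b0=1.0):
    """Total potential energy (2.63) of the two springs i-k and j-k."""
    return (0.5 * k * (norm(sub(ri, rk)) - b0) ** 2
            + 0.5 * k * (norm(sub(rj, rk)) - b0) ** 2)

def forces(ri, rj, rk, k=1.0, b0=1.0):
    """Forces (2.64), (2.65) and (2.66) on the masses i, j and k."""
    rik, rjk = sub(ri, rk), sub(rj, rk)
    dik, djk = norm(rik), norm(rjk)
    Fi = [-k * (dik - b0) * x / dik for x in rik]
    Fj = [-k * (djk - b0) * x / djk for x in rjk]
    Fk = [-(a + b) for a, b in zip(Fi, Fj)]
    return Fi, Fj, Fk

Fi, Fj, Fk = forces([1.3, 0.2, -0.1], [-0.4, 1.1, 0.3], [0.1, -0.2, 0.05], k=2.0)
print(Fi, Fj, Fk)
```

The force on k obtained from (2.66) can be compared with the negative numerical gradient of `potential` with respect to the position of k.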
• Problem 6 - Breaking Covalent Bonds: A simple mechanical model for a
chemical bond is the Morse potential, i.e.
\[ u_{Morse}(r) = D_o\left(1 - e^{-a(r - r_o)}\right)^2 . \tag{2.67} \]
The quantity $D_o$ is the depth of the potential well, $r_o$ is the position of the
potential minimum, i.e. the bond length, and a is a constant controlling the
width of the potential well (make a sketch of the Morse potential!).
(a) Calculate the force $F_z$, beyond which a linear chain of n Morse bonds
breaks. The force is applied to the ends of the chain.
(b) Show that the work necessary to destroy the chain, $E_z$, is given by
$E_z = -nD_o/4$. Compare this to the energy of a single bond at the same critical
strain. Hint: $E_z = n\int_{r_o}^{r_z} F(r)\,dr$, where $F(r) = -du_{Morse}(r)/dr$ and $r_z$ follows
via $F_z = F(r_z)$.
Solution: The sketch shows the Morse potential for $ar_o = 1$ (solid line) and
$ar_o = 2$ (dashed line). Notice that in the vicinity of $r_o$ the potential is harmonic,
i.e.
\[ u_{Morse}(r) \approx D_o a^2 (r - r_o)^2 . \tag{2.68} \]
(a) First we look for the maximum tension, $F_z$, which a bond can support.
This means
\[ -\left.\frac{\partial^2 u_{Morse}(r)}{\partial r^2}\right|_{r = r_z (> r_o)} = 0 , \tag{2.69} \]
where $r_z$ is the bond length for which $F(r) = F_z$. The result is
\[ r_z = r_o + \frac{\ln 2}{a} \quad\text{and}\quad F_z = -\frac{aD_o}{2} . \tag{2.70} \]
The minus sign indicates that, referring to the sketch, the force is acting towards
the left. In the case of a chain consisting of n Morse bonds, each bond experiences
the same tension. In particular $F_z$ is still given by this formula.
(b) To good approximation (*) the work required to rupture the bond is given
by
\[ E_z \approx n\int_{r_o}^{r_z} F(r)\,dr = -n\left(u_{Morse}(r_z) - u_{Morse}(r_o)\right) = -n\frac{D_o}{4} . \tag{2.71} \]
The minus sign here means that the work must be done against the force F(r).
(*) Our considerations are somewhat rough. What happens to the energy
stored in all other bonds when a certain bond is broken? Well, some additional
energy must be invested to widen the gap in the chain. The remainder of the
stored energy eventually is lost, i.e. it is dissipated. The respective amounts of
energy do depend on n.
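The values in (2.70) and the single-bond energy entering (2.71) can be confirmed with a short numerical scan. The Python sketch below is an illustration under assumed parameter values ($D_o = 1$, $a = 2$, $r_o = 1$); the function names are not from the text.

```python
import math

def morse(r, D=1.0, a=1.0, r0=1.0):
    """Morse potential (2.67)."""
    return D * (1.0 - math.exp(-a * (r - r0))) ** 2

def morse_force(r, D=1.0, a=1.0, r0=1.0):
    """F(r) = -du/dr for the Morse potential."""
    e = math.exp(-a * (r - r0))
    return -2.0 * D * a * (1.0 - e) * e

D, a, r0 = 1.0, 2.0, 1.0
rz = r0 + math.log(2.0) / a                   # analytic result (2.70)

# Brute-force search for the most negative force on a fine grid r > r0.
grid = [r0 + i * 1e-4 for i in range(1, 30000)]
r_num = min(grid, key=lambda r: morse_force(r, D, a, r0))

print(rz, r_num)                              # should agree to grid accuracy
print(morse_force(rz, D, a, r0), -a * D / 2)  # F_z
print(morse(rz, D, a, r0), D / 4)             # energy per bond at r_z
```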
Remark 1: A calculation of this type can be applied to estimate the tensile

strength of high strength polymer fibers. The polymers in these fibers are long
chain molecules, which are densely packed in the cross section of the fiber and
whose backbone is oriented parallel to the fiber. However, the estimate is quite
rough and overestimates the fiber’s tensile strength. This is because failure of
the fibers strongly couples to defects on their surfaces.
Remark 2: The problem offers an important insight. The amount of energy

which a material can absorb, without being damaged, is greater if the energy
is distributed over a larger volume.
• Problem 7 - Valence Angle Potential: Three centers i, j, and k share the
following harmonic interaction potential (cf. the sketch):
\[ u(\phi) = \frac{k^\phi}{2}\left(\phi - \phi_o\right)^2 \tag{2.72} \]
(a) The lengths $r_{ik}$ and $r_{jk}$ as well as the parameters $k^\phi$ and $\phi_o$ are constant.
Only the angle $\phi$ is variable. Show that the force $\vec F_i^\phi$, which acts on i, can be
written as
\[ \vec F_i^\phi = -k^\phi\left(\phi - \phi_o\right)\frac{\vec r_{ik}\times\left(\vec r_{ik}\times\vec r_{jk}\right)}{r_{ik}^2\left|\vec r_{ik}\times\vec r_{jk}\right|} . \tag{2.73} \]
(b) Obtain analogous expressions for $\vec F_j^\phi$ and $\vec F_k^\phi$.
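Solution: (a) We begin by expanding the right hand side of (2.72) to first order.
The result is
\[ \delta u = k^\phi\left(\phi - \phi_o\right)\delta\phi . \tag{2.74} \]
Via $\cos\phi = \frac{\vec r_{ik}}{r_{ik}}\cdot\frac{\vec r_{jk}}{r_{jk}}$ (see (1.9)) follows
\[ -\sin\phi\,\delta\phi = \delta\vec r_i\cdot\vec\nabla_{\vec r_i}\left(\frac{\vec r_{ik}}{r_{ik}}\cdot\frac{\vec r_{jk}}{r_{jk}}\right) \tag{2.75} \]
and subsequently
\[ \delta\vec r_i\cdot\vec\nabla_{\vec r_i}(\dots) = \frac{\delta\vec r_i\cdot\vec r_{jk}}{r_{ik}r_{jk}} - \frac{\delta\vec r_i\cdot\vec r_{ik}\,(\vec r_{ik}\cdot\vec r_{jk})}{r_{ik}^3 r_{jk}} . \tag{2.76} \]
Using $\sin\phi = \sqrt{1 - \cos^2\phi} = \sqrt{1 - \left(\frac{\vec r_{ik}}{r_{ik}}\cdot\frac{\vec r_{jk}}{r_{jk}}\right)^2}$ we find
\[ \delta\phi = -\frac{\delta\vec r_i\cdot\vec r_{jk}\,r_{ik}^2 - \delta\vec r_i\cdot\vec r_{ik}\,(\vec r_{ik}\cdot\vec r_{jk})}{r_{ik}^2\sqrt{r_{ik}^2 r_{jk}^2 - (\vec r_{ik}\cdot\vec r_{jk})^2}} . \tag{2.77} \]
Inserting this expression into (2.74) and comparing with $\delta u = -\vec F_i^\phi\cdot\delta\vec r_i$ yields
\[ \vec F_i^\phi = k^\phi\left(\phi - \phi_o\right)\frac{\vec r_{jk}\,r_{ik}^2 - \vec r_{ik}\,(\vec r_{ik}\cdot\vec r_{jk})}{r_{ik}^2\sqrt{r_{ik}^2 r_{jk}^2 - (\vec r_{ik}\cdot\vec r_{jk})^2}} . \tag{2.78} \]
We can transform this expression into the desired form by applying the identities
$\vec a\times(\vec b\times\vec c) = \vec b(\vec a\cdot\vec c) - \vec c(\vec a\cdot\vec b)$
and $(\vec a\times\vec b)^2 = a^2b^2 - (\vec a\cdot\vec b)^2$ to the
numerator and denominator, respectively.
(b) The force on j, i.e. $\vec F_j^\phi$, simply requires interchanging the indices i and
j in (2.73). The last of the three forces, $\vec F_k^\phi$, is obtained using the equilibrium
condition
\[ \vec F_i^\phi + \vec F_j^\phi + \vec F_k^\phi = 0 . \tag{2.79} \]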
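Formula (2.73) can be tested against a numerical gradient of (2.72). The Python sketch below assumes $\vec r_{ik} = \vec r_i - \vec r_k$ and $\vec r_{jk} = \vec r_j - \vec r_k$; the helper names, the equilibrium angle of 109.5 degrees and the coordinates are hypothetical choices for illustration.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

PHI0 = math.radians(109.5)   # assumed equilibrium angle

def angle(ri, rj, rk):
    rik, rjk = sub(ri, rk), sub(rj, rk)
    return math.acos(dot(rik, rjk) / (norm(rik) * norm(rjk)))

def u(ri, rj, rk, kphi=1.0, phi0=PHI0):
    """Valence angle potential (2.72)."""
    return 0.5 * kphi * (angle(ri, rj, rk) - phi0) ** 2

def force_i(ri, rj, rk, kphi=1.0, phi0=PHI0):
    """Force on center i according to (2.73)."""
    rik, rjk = sub(ri, rk), sub(rj, rk)
    c = cross(rik, rjk)
    pref = -kphi * (angle(ri, rj, rk) - phi0) / (dot(rik, rik) * norm(c))
    return [pref * x for x in cross(rik, c)]

print(force_i([1.0, 0.2, -0.1], [-0.3, 1.1, 0.4], [0.0, 0.0, 0.0]))
```

Because the force (2.73) is perpendicular to $\vec r_{ik}$, it only changes the angle and leaves the length $r_{ik}$ unchanged to first order.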
Remark: The preceding four problems, in particular problems 5 and 7, are not without
practical application. These forces do occur in so called empirical force fields used
in molecular modeling. Molecular modeling encompasses different computational
techniques and approaches used to predict the properties of molecular materials or
biological and pharmaceutical systems. One of these techniques, Molecular Dynam-
ics, is discussed in Chap. 9.
• Problem 8 - Ballistic Trajectory: A ball (point mass) is given an initial

velocity v. The angle between v and the x-axis, parallel to the surface of the
earth, is α. The initial height of the ball is h. Which angle αmax yields the
largest x-distance, i.e. the distance at which the ball hits the ground?
Solution: The equations of motion along the two coordinate axes (cf. the
sketch) are
\[ x(t) = vt\cos\alpha \quad\text{and}\quad y(t) = h + vt\sin\alpha - \frac{g}{2}t^2 . \]
It is useful to introduce the dimensionless quantities $x \to x/h$, $y \to y/h$,
$t \to vt/h$, and $z = \sin\alpha$. Hence the above equations become
\[ x = \sqrt{1 - z^2}\,t , \qquad y = 1 + zt - \frac{1}{p}t^2 \qquad\text{with}\qquad \frac{1}{p} = \frac{gh}{2v^2} . \]
According to the y-equation the condition $y = 0$ yields
$t_\pm = pz/2 \pm \sqrt{(pz/2)^2 + p}$ and thus
\[ x = \frac{p}{2}\sqrt{1 - s}\left(\sqrt{s} + \sqrt{s + 4/p}\right) , \tag{2.80} \]
where we use the +-sign in $t_\pm$ as well as $s = z^2$. It is convenient to use
$d\ln x/ds = 0$ in order to calculate the optimal angle, i.e. the angle yielding
the maximum distance. A short calculation produces
\[ s = \left(2 + \frac{4}{p}\right)^{-1} \tag{2.81} \]
or
\[ \alpha_{max} = \arcsin\frac{1}{\sqrt{2 + 2gh/v^2}} . \tag{2.82} \]
There are two interesting limits: (i) $gh/v^2 \to \infty$, i.e. $\alpha_{max} = 0^\circ$, and (ii)
$gh/v^2 \to 0$, i.e. $\alpha_{max} = 45^\circ$.
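The optimal angle (2.82) can be compared with a brute-force search over launch angles. The Python sketch below is illustrative; the values of v, h and g as well as the function names are assumptions.

```python
import math

def landing_distance(alpha, v, h, g=9.81):
    """x-distance at which the ball launched at angle alpha hits y = 0."""
    vx, vy = v * math.cos(alpha), v * math.sin(alpha)
    t_hit = (vy + math.sqrt(vy * vy + 2.0 * g * h)) / g   # positive root of y(t) = 0
    return vx * t_hit

def alpha_max(v, h, g=9.81):
    """Optimal launch angle according to (2.82)."""
    return math.asin(1.0 / math.sqrt(2.0 + 2.0 * g * h / v ** 2))

v, h = 10.0, 2.0
grid = [i * (math.pi / 2) / 100000 for i in range(100001)]
alpha_num = max(grid, key=lambda a: landing_distance(a, v, h))

print(math.degrees(alpha_num), math.degrees(alpha_max(v, h)))
print(math.degrees(alpha_max(v, 0.0)))   # limit (ii): 45 degrees
```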
• Problem 9 - Elastic Pendulum (Newton): We consider a mathematical
pendulum in polar coordinates $(\rho, \phi)$ as shown by the sketch. An additional
feature is the variable length of the pendulum. The mass is connected to an
elastic thread with the potential energy $\frac{1}{2}k(\rho - l)^2$, where k is the spring
constant and l is the equilibrium length.
(a) Make a sketch showing the pendulum including all relevant forces. Write
down Newton’s equations of motion for ρ and φ. Hint: Follow the approach
we have used in the case of the rigid mathematical pendulum.
(b) Use the assumptions ρ = l + δρ and φ = δφ, where δρ and δφ are small.
Insert these assumptions into the above equations of motion and omit all terms
which are non-linear in the small quantities. The result will be independent
equations of motion for δρ and δφ.
(c) Solve the equations of motion for δρ and δφ.
Solution: (a) The sketch is essentially identical to Fig. 2.1. In polar coordinates
the position vector of the mass is given by
\[ \vec r = \rho\begin{pmatrix}\sin\phi \\ \cos\phi\end{pmatrix} . \]
The origin is the point (hook) from which the pendulum is suspended. The
y-direction is parallel to the gravitational force on the mass. Differentiating
twice with respect to time yields the acceleration
\[ \ddot{\vec r} = -\left(\rho\ddot\phi + 2\dot\rho\dot\phi\right)\vec e_\perp + \left(\ddot\rho - \rho\dot\phi^2\right)\vec e_\parallel . \]
The unit vectors $\vec e_\parallel$ and $\vec e_\perp$ are the same as in Fig. 2.1. With
\[ \vec F_g = mg\sin\phi\,\vec e_\perp + mg\cos\phi\,\vec e_\parallel \quad\text{and}\quad \vec T = -k(\rho - l)\,\vec e_\parallel \]
follow the equations of motion
\[ \ddot\rho - \rho\dot\phi^2 + \frac{k}{m}(\rho - l) - g\cos\phi = 0 \tag{2.83} \]
and
\[ \ddot\phi + 2\frac{\dot\rho\dot\phi}{\rho} + g\frac{\sin\phi}{\rho} = 0 . \tag{2.84} \]
When $\rho = l = const$ we obtain agreement with (2.52) and (2.53). Notice that
the term $k(\rho - l)$ requires some caution. It does not simply vanish in the limit
$\rho \to l$, which means an increase of the thread's stiffness, i.e. from $\rho \to l$
follows $k \to \infty$ and thus $k(\rho - l) \to T \neq 0$.
(b) Assuming $\phi = \delta\phi$ and $\rho = l + \delta\rho$ in (2.83) yields
\[ \delta\ddot\rho + \frac{k}{m}\delta\rho - g \approx 0 . \tag{2.85} \]
Second and higher order terms in the small quantities are omitted. Analogously
we find from (2.84)
\[ \delta\ddot\phi + \frac{g}{l}\delta\phi \approx 0 . \tag{2.86} \]
Here we use $(l + \delta\rho)^{-1} \approx l^{-1}(1 - \delta\rho/l)$.
(c) In this approximation the two equations are not coupled. Equation (2.86),
in particular, is identical to (2.54). And we already know the solution of the
latter equation. The solution of (2.85) is the sum of the general solution of
the homogeneous equation $\delta\ddot\rho + (k/m)\delta\rho = 0$, which again we can copy, and
the special solution, $\delta\rho_{inh}$, of (2.85). An easy guess is
\[ \delta\rho_{inh} = \frac{mg}{k} . \]
References
1. J. Binney, S. Tremaine, Galactic Dynamics (Princeton University Press, Princeton, 2008)

2. D.R. Lide (ed.), Handbook of Chemistry and Physics (CRC Press, Boca Raton, 2003)
New York, 1971)
4. M.P. Allen, D.J. Tildesley, Computer Simulation of Simple Liquids (Clarendon Press, Oxford,
1990)
Chapter 3
Least Action Principle for One Coordinate
The least action principle unifies most of physics, e.g. Newton’s equations of motion
in mechanics, Maxwell’s equations in electrodynamics, Schrödinger’s equation in
quantum mechanics etc. Here this concept is introduced in a simplified form, which
nevertheless allows us to demonstrate its power ranging from the simple mathematical
pendulum to special relativity.
3.1 Euler–Lagrange Equation for One Coordinate†
Suppose we repeat the ballistic trajectory experiment (cf. problem 8) several times.
If we do not change the initial conditions we expect the ball to follow the exact same
trajectory in every experiment. Alternative paths apparently are not allowed. Can
we use this observation to construct a description for the dynamical development
of mechanical systems without using the forces right from the start as in Newton’s
approach? The answer is yes.
Let’s assume a function L, which depends on all quantities characterizing the
system at different times t. In the case of the ballistic trajectory these quantities are
the coordinates x(t) and y(t) as well as their time derivatives, $\dot x(t)$ and $\dot y(t)$. Notice
that the phrase ‘all quantities characterizing the system at different times t’ implies
that the sought-after method is not limited to mechanics problems, as we shall show
in several examples, once we have developed it. In particular, t can be a parameter.
Here we try to keep matters simple and we focus on just one variable, q(t), and its
time derivative,
\[ \dot q(t) = \frac{d}{dt}q(t) . \]
DOI 10.1007/978-3-319-48710-6_3
Hence
\[ L = L\left(q(t), \dot q(t), t\right) , \tag{3.1} \]
where we include an explicit dependence of L on t.

But how can we translate the existence of a special or optimal trajectory into
mathematics? We decide to measure the ‘quality’ of a trajectory by a single scalar
quantity S defined via the integral
\[ S = \int_{t_1}^{t_2} L\left(q(t), \dot q(t), t\right) dt \tag{3.2} \]
from time t1 to time t2 . If we evaluate S for different trajectories {qi (t), q̇i (t)}, where
i = 1, 2, 3 as shown in Fig. 3.1, we do obtain attendant values Si . Notice that we
require all trajectories to coincide at t = t1 and t = t2 as shown in the figure. In the
case of our ballistic trajectory experiment this means that the initial conditions are
identical for all trajectories. In addition the ball hits the ground at t = t2 always at the
same spot and with the same velocity. But between t1 and t2 different trajectories are
allowed. Increasing the number of i-values leads to the S versus i curve shown in the
lower panel of Fig. 3.1. In particular we hope that the one special trajectory singled
out by nature corresponds to a special value of S. In the figure this is the minimum
of S. But more generally we assume that the optimal trajectory corresponds to an
extremum of S. Thus we proceed according to the following two-step plan. First
we calculate the difference δS for two trajectories deviating only slightly from each
other, i.e.
\[ \delta S = \int_{t_1}^{t_2} L\left(q(t) + \delta q(t), \dot q(t) + \delta\dot q(t), t\right) dt - \int_{t_1}^{t_2} L\left(q(t), \dot q(t), t\right) dt . \]
Fig. 3.1 Three alternative trajectories (upper panel: $\dot q$ versus q between $q(t_1)$ and $q(t_2)$; lower panel: S versus path index). In this case the extremum corresponds to trajectory 2
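Here $\delta q(t_1) = \delta q(t_2) = 0$ as well as $\delta\dot q(t_1) = \delta\dot q(t_2) = 0$. Because $\delta q(t)$ and
$\delta\dot q(t)$ are small, we can expand L in these quantities (to first order), i.e.
\[ L\left(q + \delta q, \dot q + \delta\dot q, t\right) \approx L\left(q, \dot q, t\right) + \frac{\partial L}{\partial q}\delta q + \frac{\partial L}{\partial\dot q}\delta\dot q , \]
where we have omitted the time argument. The result is
\[ \delta S = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q}\delta q + \frac{\partial L}{\partial\dot q}\delta\dot q\right) dt . \]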
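Notice that the quantity $\delta\dot q$, which again is a difference, can be written as
\[ \delta\dot q = \frac{d}{dt}\delta q , \tag{3.3} \]
i.e. the difference of derivatives $\delta\dot q$ is equal to the derivative of the difference $\delta q$. At
this point it is useful to integrate by parts, which yields
\[ \delta S = \left[\frac{\partial L}{\partial\dot q}\delta q\right]_{t_1}^{t_2} + \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right)\delta q\,dt . \tag{3.4} \]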
The first term vanishes, because δq (t1 ) = δq (t2 ) = 0. Otherwise δq is arbitrary.

In a second step we use the condition
\[ \delta S = 0 , \tag{3.5} \]
which is satisfied only when we compare S for two trajectories at an extremum (cf.
Fig. 3.1). The fact that $\delta q$ is arbitrary, except at the endpoints, implies
\[ \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q} = 0 , \tag{3.6} \]
i.e. the integrand in the second term in (3.4) must vanish. Equation (3.6) is called
the Euler–Lagrange equation. It is a differential equation, which yields q(t), provided
we do know $L = L(q(t), \dot q(t), t)$. The unknown function $L(q(t), \dot q(t), t)$ is
called the Lagrangian¹ and S is the action.
Before we try to find the explicit form of L, insert this L into (3.6), and then
solve the differential equation, we want to interject a comment. Again, we stress
that this approach, which is called the principle of least action, is quite general. For
instance, we shall study an example in which there are many q(t) corresponding to
time-dependent charges inside molecules. Equation (3.6) allows us to calculate these
charges depending on their interactions with each other. In another example q(t)
is replaced by y(x), where the latter is a curve above the x-axis. In this case we
want to maximize the area enclosed between y and the x-axis in a given x-interval.
The principle of least action is suited for these and numerous other problems, when
a certain special function must be distinguished from other alternative functions. In
addition, the least action principle transcends all of physics. One example we study
explicitly is special relativity. Another interesting topic, which of course we do not
study here, is quantum mechanics. In quantum mechanics the action, i.e. (3.2), is
important as well. But unlike in classical mechanics there is not a single physical
trajectory a particle must follow according to the condition (3.5). A quantum particle
follows all alternative trajectories or paths! However, the alternatives are not all
equally important. We cannot present the rules here according to which the paths
must be weighted, but the paths close to the one satisfying (3.5) usually possess greater
weight (cf. [1] and references therein).
1 Lagrange, Joseph Louis de, Italian mathematician and astronomer, *Turin 25.1.1736, †Paris
10.4.1813; one of his many contributions is the development of variational calculus.
Now let’s return to the Lagrangian. In order to find the specific form of L we first
consider a point mass in the absence of a force, i.e. we consider a free point mass.
We conclude that in this case the position of the point mass in space, i.e. q, should
not matter. In particular L should not depend on q. According to (3.6) this means
\[ \frac{d}{dt}\frac{\partial L}{\partial\dot q} = 0 . \tag{3.7} \]
In addition L should not depend on the direction in which the point mass is moving.
A simple form consistent with this requirement is
\[ L = L\left(\dot q^2\right) \]
or more explicitly
\[ L\left(\dot q^2\right) = m_q\dot q^2 . \tag{3.8} \]
Here $m_q$ is a constant. Equation (3.7) therefore yields
\[ \dot q = const . \]
This is Newton’s first law (cf. (2.24)). We can account for a force acting on the point
mass by adding the function $-U(q)$ to L, i.e.
\[ L(q, \dot q) = \frac{1}{2}m_q\dot q^2 - U(q) . \tag{3.9} \]
At this point the minus sign is convention. Inserting this equation into (3.6) yields
\[ m_q\ddot q = -\frac{\partial U(q)}{\partial q} . \tag{3.10} \]
We obtain Newton’s equation of motion. If q is a cartesian coordinate, then $m_q$ is
the mass of the point mass. The right hand side, i.e. the negative derivative of the
potential energy U(q), is the force (cf. (2.47)).
The least action principle (3.5) in conjunction with some guesswork, which always
accompanies general principles, yields a new approach to the calculation of the
motion of masses. The following two examples are intended to familiarize us with
the new tool.
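The statement that the physical trajectory makes S extremal can be made tangible numerically. The Python sketch below is an illustration, not a proof: it assumes the Lagrangian (3.9) with $U = mgy$ (vertical motion in the gravitational field), discretizes the action (3.2) on a time grid, and compares the Newtonian path with paths that are perturbed between fixed endpoints. All names and values are hypothetical.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action (3.2) for L = m*v^2/2 - m*g*y."""
    S = 0.0
    for y0, y1 in zip(path[:-1], path[1:]):
        v = (y1 - y0) / dt                      # velocity on the interval
        S += (0.5 * m * v * v - m * g * 0.5 * (y0 + y1)) * dt
    return S

T, n, g = 1.0, 1000, 9.81
dt = T / n
ts = [k * dt for k in range(n + 1)]

# Newtonian path with y(0) = y(T) = 0.
true_path = [0.5 * g * t * (T - t) for t in ts]

def perturbed(eps):
    """Alternative path; the perturbation vanishes at both endpoints."""
    return [y + eps * math.sin(math.pi * t / T) for y, t in zip(true_path, ts)]

for eps in (-0.1, 0.0, 0.1):
    print(eps, action(perturbed(eps), dt))
```

For this Lagrangian the Newtonian path yields the smallest of the three values, i.e. the extremum is a minimum.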
3.2 Two Simple Examples†
• Example - Mathematical Pendulum Revisited: Because we have studied
the mathematical pendulum before (cf. Fig. 2.1), it is a good example to test
our new formalism. The Lagrangian of the pendulum is
\[ L = \frac{1}{2}mv^2 - U(\phi) . \]
Here v is the velocity of the mass, which we obtain via the time derivative of
its position vector:
\[ \vec v = \frac{d}{dt}\vec r = \frac{d}{dt}\begin{pmatrix} l\sin\phi \\ l\cos\phi \end{pmatrix} = l\dot\phi\begin{pmatrix}\cos\phi \\ -\sin\phi\end{pmatrix} , \]
i.e.
\[ v = |\vec v| = l\dot\phi . \]
The potential energy is given by
\[ U(\phi) = mgh = mgl\left(1 - \cos\phi\right) \]
(cf. the Sisyphus example at the end of Sect. 1.3 (where $h = s\sin\alpha$)). Notice
that the equation of motion derived from (3.6) does not change when a constant
is added to L. Thus $U(\phi) = -mgl\cos\phi$ is possible as well. Our final
Lagrangian is given by
\[ L(\phi, \dot\phi) = \frac{1}{2}ml^2\dot\phi^2 + mgl\cos\phi , \]
which, if we insert it in (3.6), yields
\[ \frac{\partial L}{\partial\phi} = -mgl\sin\phi \]
and
\[ \frac{d}{dt}p_\phi \equiv \frac{d}{dt}\frac{\partial L}{\partial\dot\phi} = \frac{d}{dt}ml^2\dot\phi = ml^2\ddot\phi . \tag{3.11} \]
Combination of the last two equations according to (3.6) again yields the equation
of motion (2.51) for $\phi(t)$.
Remark: In (3.11) $p_\phi \equiv \partial L/\partial\dot\phi$ defines the generalized momentum, $p_\phi$. It is
important to note that usually the generalized momentum is not the same as
the product of mass times velocity, i.e. $mv$ or $m\vec v$. Here for instance $v = l\dot\phi$
and therefore $mv = ml\dot\phi \neq p_\phi$. Analogously, the derivative $\partial L/\partial\phi$ is called
generalized force.
• Example - Oscillator in the Gravitational Field: We study a mass m
suspended by a massless harmonic spring in the earth’s gravitational field
oscillating along the field’s orientation. The Lagrangian is
\[ L(x, \dot x) = \frac{1}{2}m\dot x^2 - U(x) . \]
It is convenient to use the cartesian coordinate x(t) to describe the position
of the mass. The potential energy, U(x), is the sum of the energy of position,
$mgx$, and the elastic energy of the spring, $\frac{1}{2}k(x - x_0)^2$. The quantity k is the
spring constant and $x_0$ is the equilibrium position of the mass in the absence
of gravitation, i.e.
\[ U(x) = mgx + \frac{1}{2}k\left(x - x_0\right)^2 . \]
This time the application of (3.6) yields the equation of motion
\[ \ddot x = -g - \frac{k}{m}\left(x - x_0\right) . \]
3.3 The Meaning of the Least Action Principle†
Symmetries and the Least Action Principle:
Within the above formalism it is easy to relate symmetries to attendant conserved
quantities. Consider for instance time. Because we derive the equations of motion
from the Lagrangian and because the equations of motion are not different tomorrow
in comparison to today, unless we purposely alter the conditions of our experiment,
we expect L to possess no explicit time dependence, i.e. $\partial L/\partial t = 0$. In this case the
total time derivative of L is
\[ \frac{dL}{dt} = \frac{\partial L}{\partial\dot q}\frac{d\dot q}{dt} + \frac{\partial L}{\partial q}\frac{dq}{dt} \]
(cf. (1.79)). According to (3.6) we find
\[ \frac{dL}{dt} = \frac{\partial L}{\partial\dot q}\ddot q + \left(\frac{d}{dt}\frac{\partial L}{\partial\dot q}\right)\dot q = \frac{d}{dt}\left(\frac{\partial L}{\partial\dot q}\dot q\right) \]
viz.
\[ \frac{d}{dt}\left(\dot q\frac{\partial L}{\partial\dot q} - L\right) = 0 . \]
Thus, the quantity
\[ E \equiv \dot q\frac{\partial L}{\partial\dot q} - L \tag{3.12} \]
does not change with time.

But what is the meaning of E? Inserting $L = \frac{1}{2}m_q\dot q^2 - U(q)$ into (3.12) yields
\[ E = \frac{1}{2}m_q\dot q^2 + U(q) . \]
This means that E is the sum of the kinetic energy, $\frac{1}{2}m_q\dot q^2$, and the potential energy,
U(q), i.e. E is the total energy. Notice that this does not prove the conservation of
energy. However, it relates the symmetry of time invariance to the conservation of
energy.
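The constancy of E in (3.12) can be observed along a numerically integrated trajectory. The Python sketch below uses the pendulum Lagrangian of Sect. 3.2 as an example; the parameter values, the step size and the choice of a fourth-order Runge-Kutta integrator are assumptions made for illustration.

```python
import math

m, l, g = 1.0, 1.0, 9.81

def lagrangian(phi, phidot):
    """Pendulum Lagrangian L = m*l^2*phidot^2/2 + m*g*l*cos(phi)."""
    return 0.5 * m * l ** 2 * phidot ** 2 + m * g * l * math.cos(phi)

def energy(phi, phidot, h=1e-4):
    """E = qdot * dL/dqdot - L, cf. (3.12); derivative by central difference."""
    dL = (lagrangian(phi, phidot + h) - lagrangian(phi, phidot - h)) / (2 * h)
    return phidot * dL - lagrangian(phi, phidot)

def rhs(phi, phidot):
    """Equation of motion following from (3.6) for this Lagrangian."""
    return phidot, -(g / l) * math.sin(phi)

def rk4_step(phi, phidot, dt):
    k1 = rhs(phi, phidot)
    k2 = rhs(phi + 0.5 * dt * k1[0], phidot + 0.5 * dt * k1[1])
    k3 = rhs(phi + 0.5 * dt * k2[0], phidot + 0.5 * dt * k2[1])
    k4 = rhs(phi + dt * k3[0], phidot + dt * k3[1])
    phi += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    phidot += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return phi, phidot

phi, phidot = 1.0, 0.0
E_start = energy(phi, phidot)
for _ in range(5000):
    phi, phidot = rk4_step(phi, phidot, 1e-3)
print(E_start, energy(phi, phidot))   # nearly identical values
```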
We shall explore other symmetries below. Now we make a short excursion into
the special theory of relativity, in order to illustrate the power of the least action
principle in a different area.
Special Relativity and the Least Action Principle:
We observe a mass moving in one dimension. The position of the mass is x at
time t, i.e. x(t). We can choose a different frame of reference, however, in which
the position of the mass is $x'(t')$. Here the primed reference frame moves with
the velocity w relative to the unprimed reference frame. According to our normal
perception, the unprimed and the primed coordinates should be related via the Galilei
transformation²:
\[ x' = x - wt \quad\text{and}\quad t' = t . \tag{3.13} \]
Time is the same in both coordinate frames, and the velocity of the mass measured
in the primed frame is equal to the sum of $-w$ plus the velocity of the mass in the
unprimed frame, i.e. $v' = v - w$.
• Problem 10 - Galilei Transformation: Show that Newton’s equation of
motion of the one-dimensional harmonic oscillator,
\[ m\frac{d^2x'}{dt'^2} = -\frac{d}{dx'}u(x') , \tag{3.14} \]
where
\[ u(x') = \frac{k}{2}\left(x' - x'_o\right)^2 , \]
is invariant under a Galilei transformation when $w = const$.
Solution: We start on the left hand side of (3.14):
\[ m\frac{d^2x'}{dt'^2} = m\frac{d}{dt'}\frac{dx - w\,dt}{dt'} \overset{dt' = dt}{=} m\frac{d^2x}{dt^2} - m\underbrace{\frac{dw}{dt}}_{=0} . \tag{3.15} \]
Now comes the right hand side, i.e.
\[ -\frac{d}{dx'}u(x') = -k\left(x' - x'_o\right) = -k\left(x - x_o\right) . \tag{3.16} \]
Notice that $x'_o = x_o - wt$. Therefore $x' - x'_o = x - x_o$.
2 Galilei, Galileo, Italian mathematician, physicist and philosopher, *Pisa 15.2.1564, †Arcetri (today
a part of Florence) 8.1.1642.

Our time perception is special, mostly because of the irreversibility of the many
coupled chemical processes characteristic of life. But for a point mass there is no
real difference between time and space. Let us consider a generalization of the above
Galilei transformation, which treats x and t (almost) symmetrically, i.e.
\[ x' = \gamma\left(x - wt\right) , \qquad t' = \alpha\left(t - \beta x\right) . \tag{3.17} \]
‘Almost’ means that w in the first equation, in the most general case, should be
replaced by a function of w, analogous to the unknown functions $\gamma = \gamma(w)$,
$\alpha = \alpha(w)$, and $\beta = \beta(w)$. But how can we determine these unknown functions? The idea
is to find another quantity $ds$ or rather $ds^2$, which depends on $\gamma$, $\alpha$, and $\beta$, and which,
by definition, does not change under the above transformation, i.e. $ds'^2 = ds^2$. This
quantity is
\[ ds^2 = c^2dt^2 - dx^2 . \tag{3.18} \]
Here c is a constant, with no special meaning yet, whose dimension is that of a
velocity. If there was a plus sign instead of a minus sign in (3.18), then ds would just
be the distance between two closely spaced points in the x-ct-plane invariant under
the transformation (3.17). This is what we know from ordinary rotations, for instance.
However, it turns out that only the minus sign yields physically meaningful results.³
One special case is $dx = 0$, i.e. $ds = c\,dt$. This is a point mass at rest, and $ds/c$ is
the time between two closely spaced positions of the point mass in the x-ct-plane or
spacetime. The other extreme is $ds/c = 0$ or $c^2dt^2 = dx^2$, corresponding to motion
with the speed c. $ds/c$ is also called proper time, i.e. the proper time is zero in this
case.
But let’s just explore the consequences of the above invariance condition, i.e. we
begin with
\[ ds'^2 = c^2dt'^2 - dx'^2 = c^2dt^2 - dx^2 = ds^2 . \tag{3.19} \]
According to the new transformation equations (3.17) we may write
\[ -dx'^2 = \gamma^2\left(-dx^2 + 2w\,dx\,dt - w^2dt^2\right) \]
\[ c^2dt'^2 = c^2\alpha^2\left(dt^2 - 2\beta\,dt\,dx + \beta^2dx^2\right) . \]
In order for (3.19) to be satisfied, the following must hold
\[ c^2 = c^2\alpha^2 - \gamma^2w^2 \]
\[ 0 = -2c^2\alpha^2\beta + 2\gamma^2w \]
\[ -1 = c^2\alpha^2\beta^2 - \gamma^2 . \]
The solution of this system of equations is
\[ \gamma = \left(1 - \frac{w^2}{c^2}\right)^{-1/2} , \qquad \alpha = \gamma , \qquad \beta = \frac{w}{c^2} , \tag{3.20} \]
and the above transformation (3.17) turns into the so called Lorentz transformation⁴
of special relativity:
\[ x' = \gamma\left(x - wt\right) \quad\text{and}\quad t' = \gamma\left(t - \frac{w}{c^2}x\right) . \tag{3.21} \]
3 The above transformation also corresponds to a rotation, but different from the ones that we had
discussed in the first chapter of this book. We return to this point at the end of this section.
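The invariance of $ds^2$ under (3.21), and its violation under the Galilei transformation (3.13), can be checked with a few lines of Python. The sketch below is illustrative; it uses units in which c = 1 by default, and the function names and sample values are assumptions.

```python
import math

def lorentz(x, t, w, c=1.0):
    """Lorentz transformation (3.21)."""
    gamma = 1.0 / math.sqrt(1.0 - (w / c) ** 2)
    return gamma * (x - w * t), gamma * (t - w * x / c ** 2)

def galilei(x, t, w):
    """Galilei transformation (3.13)."""
    return x - w * t, t

def interval2(dx, dt, c=1.0):
    """ds^2 = c^2 dt^2 - dx^2, cf. (3.18)."""
    return c ** 2 * dt ** 2 - dx ** 2

dx, dt, w = 0.2, 1.0, 0.6
print(interval2(dx, dt))                    # original value
print(interval2(*lorentz(dx, dt, w)))       # unchanged
print(interval2(*galilei(dx, dt, w)))       # changed
```

Because both transformations are linear, applying them to the coordinate differences dx and dt is equivalent to transforming the coordinates themselves.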
At this point we return to the least action principle. We want to write down an
action for a relativistic free particle moving in two-dimensional spacetime. The only
quantity, which we know thus far, that has something to do with a path is ds. So let’s
try
\[ S \propto -\int_1^2 ds = -\int_{t_1}^{t_2}\frac{ds}{dt}dt = -c\int_{t_1}^{t_2}\sqrt{1 - \frac{v^2}{c^2}}\,dt , \tag{3.22} \]
where 1 and 2 are two points in spacetime and $v = dx/dt$. The minus sign, at this
point, is just a convention. Let’s work out $\delta S = S(v + \delta v) - S(v)$, i.e.
\[ \delta S \propto \int_{t_1}^{t_2}dt\,\frac{v}{c^2}\frac{\delta v}{\sqrt{1 - \frac{v^2}{c^2}}} + \int_{t_1}^{t_2}dt\,\frac{1}{2c^2}\frac{\delta v^2}{\left(1 - \frac{v^2}{c^2}\right)^{3/2}} + O(\delta v^3) . \tag{3.23} \]
As before, cf. (3.3), we may write $\delta v = d\delta x/dt$ and then use partial integration to
rewrite the first integral, i.e.
\[ \delta S \propto -\int_{t_1}^{t_2}dt\,\delta x\,\frac{d}{dt}\left(\frac{v}{c^2}\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\right) + \int_{t_1}^{t_2}dt\,\frac{1}{2c^2}\frac{\delta v^2}{\left(1 - \frac{v^2}{c^2}\right)^{3/2}} + O(\delta v^3) . \tag{3.24} \]
4 Lorentz, Hendrik Antoon, Dutch physicist, *Arnheim 18.7.1853, †Haarlem 4.2.1928; Nobel Prize
in physics 1902 together with P. Zeeman.
Because $\delta x$ is arbitrary,⁵ the first integral vanishes only if the integrand vanishes, i.e.
\[ \frac{d}{dt}\left(\frac{v}{c^2}\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\right) = \frac{1}{c^2}\left(1 - \frac{v^2}{c^2}\right)^{-3/2}\frac{dv}{dt} = 0 . \tag{3.25} \]
This means that v must be constant, i.e. the path of a free particle in our
two-dimensional spacetime, for which the action has an extremum, is a straight line
connecting the points 1 and 2. In addition, the second integral is positive ($v < c$!),
which means that the extremum is a minimum of S, provided that the as yet unspecified
proportionality constant is positive.
According to (3.22) (cf. (3.2)) the Lagrangian in the present case, up to a
constant, is
\[ L \propto -c\sqrt{1 - \frac{v^2}{c^2}} . \]
The proportionality constant follows if we consider the limit of small velocity, i.e.
$v \ll c$, via
\[ L \propto -c\sqrt{1 - \frac{v^2}{c^2}} \approx -c\left(1 - \frac{1}{2}\frac{v^2}{c^2}\right) . \]
In order for the second term to be equal to $\frac{1}{2}mv^2$, the proportionality constant must
be $mc$, i.e.
\[ L = -mc^2\sqrt{1 - \frac{v^2}{c^2}} . \tag{3.26} \]
First we calculate the relativistic momentum of the point mass, i.e.
\[ p = \frac{\partial L}{\partial v} = m\gamma v . \tag{3.27} \]
The factor $m\gamma$ is the relativistic mass. The latter approaches infinity as $v \to c$!
Another interesting quantity is the energy given by
\[ E = v\frac{\partial L}{\partial v} - L = m\gamma v^2 + m\gamma^{-1}c^2 = m\gamma c^2 . \tag{3.28} \]
In particular for $v \to 0$ we obtain the famous formula
\[ E_0 = mc^2 \tag{3.29} \]
for the rest energy.
5 The only requirement is that the velocity along an alternative path is less than c.
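The results (3.27) to (3.29) follow from (3.26) by differentiation, which can also be done numerically. The Python sketch below is an illustration in units with m = c = 1 by default; the function names and the finite-difference step are assumptions.

```python
import math

def lagrangian(v, m=1.0, c=1.0):
    """Relativistic free-particle Lagrangian (3.26)."""
    return -m * c ** 2 * math.sqrt(1.0 - v ** 2 / c ** 2)

def momentum(v, m=1.0, c=1.0, h=1e-6):
    """p = dL/dv, cf. (3.27), via a central difference."""
    return (lagrangian(v + h, m, c) - lagrangian(v - h, m, c)) / (2 * h)

def energy(v, m=1.0, c=1.0):
    """E = v dL/dv - L, cf. (3.28)."""
    return v * momentum(v, m, c) - lagrangian(v, m, c)

v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v ** 2)
print(momentum(v), gamma * v)   # m*gamma*v
print(energy(v), gamma)         # m*gamma*c^2
print(energy(0.0))              # rest energy m*c^2
```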

Originally we had introduced c as some constant velocity, which ensures that ct
and x do have the same units. Now c acquires physical meaning. It is a limiting
velocity, which no mass can surpass! For instance, $\gamma$ for $v > c$ becomes complex.
For $v = c$ the relativistic mass $m\gamma$ becomes infinite. Finally, there is the addition of
velocities, which is different from the simple addition we had used previously (cf.
(3.13)). According to (3.21) we have
\[ \frac{dx'}{dt'} = \frac{dx - w\,dt}{dt - \frac{w}{c^2}dx} . \]
With $v' = dx'/dt'$ and $v = dx/dt$ this becomes
\[ v' = \frac{v - w}{1 - \frac{wv}{c^2}} \quad\text{or}\quad v = \frac{v' + w}{1 + \frac{wv'}{c^2}} . \tag{3.30} \]
If now $v' = c$, then (3.30) yields
\[ v = c , \tag{3.31} \]
even though $w \neq 0$ (cf. Fig. 3.2). This means that addition of velocities cannot lead
to a velocity larger than c. We emphasize that we do not yet know that c is the velocity
of light! This connection will be made in the context of the theory of electricity and
magnetism.
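A short Python sketch of the velocity-addition formula (3.30) follows. It is an illustration in units with c = 1 by default; the function names are hypothetical.

```python
def add_velocities(v_prime, w, c=1.0):
    """Velocity v in K from velocity v' in K', second form of (3.30)."""
    return (v_prime + w) / (1.0 + w * v_prime / c ** 2)

def to_primed(v, w, c=1.0):
    """Velocity v' in K' from velocity v in K, first form of (3.30)."""
    return (v - w) / (1.0 - w * v / c ** 2)

w = 0.9
for v_prime in (0.1, 0.5, 0.9, 1.0):
    naive = v_prime + w                      # Galilean addition
    print(v_prime, naive, add_velocities(v_prime, w))
```

The naive sum exceeds c for the larger values of v', whereas the relativistic result never does; for v' = c the result is c, in line with (3.31).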
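Remark - Lorentz transformation as ‘rotation’: Fig. 1.10 shows a vector relative to
two coordinate systems. The primed system is rotated counterclockwise by the angle
$\varphi$ relative to the unprimed system. Using the notation x and y instead of $a_1$ and $a_2$,
and $x'$ and $y'$ instead of $a'_1$ and $a'_2$, we have
\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}\cdot\begin{pmatrix} x \\ y \end{pmatrix} . \tag{3.32} \]
Inserting $x = r\cos\theta$ and $y = r\sin\theta$, i.e. polar coordinates, yields
\[ r'\begin{pmatrix}\cos\theta' \\ \sin\theta'\end{pmatrix} = r\begin{pmatrix}\cos\varphi\cos\theta + \sin\varphi\sin\theta \\ -\sin\varphi\cos\theta + \cos\varphi\sin\theta\end{pmatrix} = r\begin{pmatrix}\cos(\theta - \varphi) \\ \sin(\theta - \varphi)\end{pmatrix} . \tag{3.33} \]
Thus $r' = r$ and $\theta' = \theta - \varphi\,(= \phi)$.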

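Fig. 3.2 Top: Reference frame K' is moving relative to reference frame K with the velocity w. A runner moves with velocity v' relative to the origin of K' in the same direction. Relative to the origin of K the runner's velocity is v. Bottom: v plotted versus v' according to (3.30) when w = 0.9c. The straight line is the naive addition of velocities, v = v' + w. Deviations from the relativistic result increase when w approaches c
Now we repeat this for the Lorentz transformation (3.21):
\[ \begin{pmatrix} x' \\ ct' \end{pmatrix} = \gamma\begin{pmatrix} 1 & -w/c \\ -w/c & 1 \end{pmatrix}\cdot\begin{pmatrix} x \\ ct \end{pmatrix} . \tag{3.34} \]
Instead of $\sin\alpha$ and $\cos\alpha$, however, we use the hyperbolic functions
\[ \sinh\alpha = \frac{1}{2}\left(e^{\alpha} - e^{-\alpha}\right) \quad\text{and}\quad \cosh\alpha = \frac{1}{2}\left(e^{\alpha} + e^{-\alpha}\right) , \tag{3.35} \]
which satisfy the identity $\cosh^2\alpha - \sinh^2\alpha = 1$. Now we let $x = r\cosh\alpha$ and
$ct = r\sinh\alpha$. Thus
\[ r'\begin{pmatrix}\cosh\alpha' \\ \sinh\alpha'\end{pmatrix} = \begin{pmatrix}\cosh\omega & \sinh\omega \\ \sinh\omega & \cosh\omega\end{pmatrix}\cdot r\begin{pmatrix}\cosh\alpha \\ \sinh\alpha\end{pmatrix} , \tag{3.36} \]
where $\cosh\omega \equiv \gamma$ and $\sinh\omega \equiv -\gamma w/c$. Notice that the identity
$\cosh^2\omega - \sinh^2\omega = 1$ is satisfied, i.e.
$\cosh^2\omega - \sinh^2\omega = \gamma^2 - \gamma^2w^2/c^2 = \gamma^2(1 - w^2/c^2) = 1$. Finally,
\[ r'\begin{pmatrix}\cosh\alpha' \\ \sinh\alpha'\end{pmatrix} = r\begin{pmatrix}\cosh\omega\cosh\alpha + \sinh\omega\sinh\alpha \\ \sinh\omega\cosh\alpha + \cosh\omega\sinh\alpha\end{pmatrix} = r\begin{pmatrix}\cosh(\omega + \alpha) \\ \sinh(\omega + \alpha)\end{pmatrix} . \tag{3.37} \]
Thus $r' = r$ and $\alpha' = \omega + \alpha$.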
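The hyperbolic 'rotation' can be checked numerically. The Python sketch below, in units with c = 1 by default, builds the matrix of (3.36) from the relative velocity of the two frames and confirms that the hyperbolic angle is simply shifted by $\omega$. The function names and sample values are assumptions.

```python
import math

def rapidity(w, c=1.0):
    """omega with cosh(omega) = gamma and sinh(omega) = -gamma*w/c."""
    return -math.atanh(w / c)

def boost(omega):
    """Matrix of the hyperbolic 'rotation' in (3.36)."""
    ch, sh = math.cosh(omega), math.sinh(omega)
    return [[ch, sh], [sh, ch]]

def apply(mat, vec):
    return [mat[0][0] * vec[0] + mat[0][1] * vec[1],
            mat[1][0] * vec[0] + mat[1][1] * vec[1]]

w, r, alpha = 0.6, 2.0, 0.3
omega = rapidity(w)
x, ct = r * math.cosh(alpha), r * math.sinh(alpha)
x_p, ct_p = apply(boost(omega), [x, ct])

print(x_p, r * math.cosh(omega + alpha))    # same value
print(ct_p, r * math.sinh(omega + alpha))   # same value
```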
Remark - the invariance of ds revisited: Our previous derivation of the Lorentz
transformation was based on the invariance of ds as defined in (3.18). Here we
derive the Lorentz transformation from more general principles. The underlying idea
is that the laws of nature are the same for observers in different inertial reference
frames.⁶ In particular we shall assume that space and time are homogeneous, i.e. the
spacetime origin of a coordinate system is arbitrary (R1); space is isotropic, i.e. all
space directions are equivalent (R2).
Fig. 3.3 System K' moves relative to system K with velocity w
Figure 3.3 depicts the inertial frames, originally introduced in the context of the
Galilei transformations, moving relative to each other with the constant velocity w
along the x-direction. Similar to (3.17) we start from a rather general form of the
transformation equations, i.e.
(i) $K \to K'$:
\[ \begin{aligned} x' &= \gamma(w)\left(x - wt\right) \\ y' &= \alpha(w)y \\ z' &= \alpha(w)z \\ t' &= \mu(w)t + \epsilon(w)x . \end{aligned} \]
Notice that the notation of the coefficients here differs from (3.17). We also include
the two other space axes. From R2 follows
(ii) $K' \to K$:
\[ \begin{aligned} x &= \gamma(-w)\left(x' + wt'\right) \\ y &= \alpha(-w)y' \\ z &= \alpha(-w)z' \\ t &= \mu(-w)t' + \epsilon(-w)x' . \end{aligned} \]
A third set of equations is obtained via inversion of w, x, and $x'$ (as well as y and
$y'$ in order to maintain right-handed coordinate systems), i.e.
(iii)
\[ \begin{aligned} -x' &= -\gamma(-w)\left(x - wt\right) \\ -y' &= -\alpha(-w)y \\ z' &= \alpha(-w)z \\ t' &= \mu(-w)t - \epsilon(-w)x . \end{aligned} \]
From the comparison of (i) and (iii) follows
(iv)
\[ \begin{aligned} \gamma(w) &= \gamma(-w) \\ \alpha(w) &= \alpha(-w) \\ \mu(w) &= \mu(-w) \\ \epsilon(w) &= -\epsilon(-w) . \end{aligned} \]
6 In an inertial frame a free point mass at rest will remain so in the future.
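Inserting (ii) into (i) together with (iv) yields
\[ \begin{aligned} x' &= \gamma(w)\left\{\gamma(-w)\left(x' + wt'\right) - w\left[\mu(-w)t' + \epsilon(-w)x'\right]\right\} \\ &\overset{(iv)}{=} \underbrace{\left[\gamma^2(w) + w\gamma(w)\epsilon(w)\right]}_{=1}x' + \underbrace{w\gamma(w)\left(\gamma(w) - \mu(w)\right)}_{=0}t' \end{aligned} \]
as well as
\[ y' = \alpha(w)\alpha(-w)y' \overset{(iv)}{=} \underbrace{\alpha^2(w)}_{=1}y' . \]
Notice that $\alpha^2(w) = 1$ implies $\alpha(w) = \pm 1$. The sign is +, because $\lim_{w\to 0}\alpha(w) = 1$.
In $\gamma^2(w) + w\gamma(w)\epsilon(w) = 1$ we introduce the definition
$\epsilon(w) \equiv -w\gamma(w)/\eta^2(w)$,⁷ which leads to
\[ \gamma(w) = \frac{1}{\sqrt{1 - \frac{w^2}{\eta^2(w)}}} . \]
Thus
\[ x' = \gamma(w)\left(x - wt\right) \tag{3.38} \]
\[ y' = y \tag{3.39} \]
\[ z' = z \tag{3.40} \]
\[ t' = \gamma(w)\left(t - \frac{wx}{\eta^2(w)}\right) . \tag{3.41} \]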
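We obtain additional information regarding $\gamma(w)$ if we recognize that
$K \xrightarrow{w_1} K' \xrightarrow{w_2} K''$ is equivalent to $K \xrightarrow{w} K''$, i.e.
\[
\begin{aligned}
x'' &= \gamma(w_2)\,(x' - w_2 t')\\
t'' &= \gamma(w_2)\left(t' - \frac{w_2 x'}{\eta^2(w_2)}\right)
\end{aligned}
\]
or
\[
\begin{aligned}
x'' &= \gamma(w_2)\left[\gamma(w_1)\,(x - w_1 t) - w_2\,\gamma(w_1)\left(t - \frac{w_1 x}{\eta^2(w_1)}\right)\right]\\
t'' &= \ldots
\end{aligned}
\]
is equivalent to
\[
\begin{aligned}
x'' &= \gamma(w)\,(x - wt)\\
t'' &= \ldots\; .
\end{aligned}
\]
7 This definition satisfies the condition $\epsilon(w) = -\epsilon(-w)$. The minus sign also
leads to a meaningful velocity-addition formula.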
This yields the equations

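\[
\begin{aligned}
\gamma(w) &= \gamma(w_1)\,\gamma(w_2)\left(1 + \frac{w_1 w_2}{\eta^2(w_1)}\right)\\
-w\,\gamma(w) &= -\gamma(w_1)\,\gamma(w_2)\,(w_1 + w_2)
\end{aligned}
\]
and therefore
\[
w = \frac{w_1 + w_2}{1 + \dfrac{w_1 w_2}{\eta^2(w_1)}} .
\]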
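The above special velocity-addition formula is sensible only if $\eta = \text{const}$ (i.e.
$\eta = c$, the velocity of light (cf. p. 80)). Notice in particular that for $w_2 = c$ and
arbitrary $w_1$ the combined velocity is
\[
w = \frac{w_1 + c}{1 + \dfrac{w_1 c}{c^2}} = c\,\frac{w_1 + c}{w_1 + c} = c .
\]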
Thus we find that γ(w) is given by (3.20), i.e.
\[
\gamma(w) = \frac{1}{\sqrt{1 - \dfrac{w^2}{c^2}}} .
\]
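The two relations obtained from the successive transformations can be checked numerically. The following Python sketch is an illustration only; the function names and the chosen velocities are arbitrary assumptions, not part of the derivation.

```python
import math

C = 299_792_458.0  # velocity of light in m/s

def gamma(w, c=C):
    """Lorentz factor gamma(w) = 1/sqrt(1 - w^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (w / c) ** 2)

def add_velocities(w1, w2, c=C):
    """Velocity w equivalent to two successive boosts w1 and w2 along x."""
    return (w1 + w2) / (1.0 + w1 * w2 / c**2)

w1, w2 = 0.6 * C, 0.7 * C          # arbitrary example velocities
w = add_velocities(w1, w2)

# gamma(w) = gamma(w1) gamma(w2) (1 + w1 w2 / c^2) with eta = c
lhs = gamma(w)
rhs = gamma(w1) * gamma(w2) * (1.0 + w1 * w2 / C**2)
print(w / C, lhs, rhs)

# Composition with c yields c again
print(add_velocities(w1, C) / C)
```

The combined velocity stays below c although w1 + w2 exceeds c, and both sides of the relation for γ(w) agree to rounding accuracy.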
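Finally, we can generalize the Lorentz transformation by not requiring that $\vec{w}$ is
along the $x$-axis. Let the position vector in $K$ be given by
$\vec{r} = \vec{r}_\perp + \vec{r}_\parallel$ with $\vec{r}_\perp \perp \vec{w}$ and
$\vec{r}_\parallel \parallel \vec{w}$. It follows
\[
\begin{aligned}
\vec{r}'_\parallel &= \gamma(w)\left(\vec{r}_\parallel - \vec{w}\,t\right)\\
\vec{r}'_\perp &= \vec{r}_\perp\\
t' &= \gamma(w)\left(t - \frac{\vec{w}\cdot\vec{r}_\parallel}{c^2}\right) .
\end{aligned}
\]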
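Using $\vec{r}_\perp = \vec{r} - \dfrac{(\vec{r}\cdot\vec{w})\,\vec{w}}{w^2}$ and
$\vec{r}_\parallel = \dfrac{(\vec{r}\cdot\vec{w})\,\vec{w}}{w^2}$ we obtain for
$\vec{r}' = \vec{r}'_\parallel + \vec{r}'_\perp$
\begin{align}
\vec{r}' &= \vec{r} + (\gamma(w) - 1)\,\frac{(\vec{r}\cdot\vec{w})\,\vec{w}}{w^2} - \gamma(w)\,\vec{w}\,t \tag{3.42}\\
t' &= \gamma(w)\left(t - \frac{\vec{w}\cdot\vec{r}}{c^2}\right) , \tag{3.43}
\end{align}
where we have made use of $\vec{r}_\perp\cdot\vec{w} = 0$.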

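Similarly we can generalize the transformation of velocities, i.e. the velocity-addition
formula. The inertial frame of reference $K'$ moves with velocity $\vec{w}$ relative
to $K$. The respective velocities in the two reference frames are $\vec{v} = d\vec{r}/dt$ and
$\vec{v}' = d\vec{r}'/dt'$. Based on (3.42) we have
\[
\vec{v}' = \frac{d\vec{r}}{\gamma(w)\left(dt - \frac{\vec{w}\cdot d\vec{r}}{c^2}\right)}
 + \frac{\gamma(w) - 1}{\gamma(w)}\,\frac{(d\vec{r}\cdot\vec{w})\,\vec{w}}{w^2\left(dt - \frac{\vec{w}\cdot d\vec{r}}{c^2}\right)}
 - \frac{\vec{w}\,dt}{dt - \frac{\vec{w}\cdot d\vec{r}}{c^2}} ,
\]
i.e.
\[
\vec{v}' = \frac{1}{1 - \frac{\vec{w}\cdot\vec{v}}{c^2}}
\left[\frac{\vec{v}}{\gamma(w)} + \left(1 - \frac{1}{\gamma(w)}\right)\frac{(\vec{v}\cdot\vec{w})\,\vec{w}}{w^2} - \vec{w}\right] . \tag{3.44}
\]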
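The inverse of this relation is given by
\[
\vec{v}_\parallel = \frac{\vec{v}'_\parallel + \vec{w}}{1 + \frac{\vec{w}\cdot\vec{v}'}{c^2}} \tag{3.45}
\]
and
\[
\vec{v}_\perp = \frac{1}{\gamma(w)}\,\frac{\vec{v}'_\perp}{1 + \frac{\vec{w}\cdot\vec{v}'}{c^2}} . \tag{3.46}
\]
Here $\parallel$ and $\perp$ refer to the direction of $\vec{w}$.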
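The pair (3.44) and (3.45), (3.46) can be tested numerically by transforming a velocity to K' and back. The Python sketch below is an illustration under assumptions: units with c = 1, arbitrary example vectors, and helper names (`transform_velocity`, `inverse_velocity`) that are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(w, c):
    return 1.0 / math.sqrt(1.0 - dot(w, w) / c**2)

def transform_velocity(v, w, c=1.0):
    """Velocity v' in K' (moving with w relative to K), Eq. (3.44)."""
    g, vw, w2 = gamma(w, c), dot(v, w), dot(w, w)
    denom = 1.0 - vw / c**2
    return tuple((vi / g + (1.0 - 1.0 / g) * vw * wi / w2 - wi) / denom
                 for vi, wi in zip(v, w))

def inverse_velocity(vp, w, c=1.0):
    """Velocity v in K, assembled from Eqs. (3.45) and (3.46)."""
    g, vpw, w2 = gamma(w, c), dot(vp, w), dot(w, w)
    denom = 1.0 + vpw / c**2
    vp_par = tuple(vpw * wi / w2 for wi in w)            # part parallel to w
    vp_perp = tuple(a - b for a, b in zip(vp, vp_par))   # part perpendicular to w
    return tuple((par + wi) / denom + perp / (g * denom)
                 for par, perp, wi in zip(vp_par, vp_perp, w))

v = (0.3, -0.2, 0.5)    # velocity in K (units of c)
w = (0.4, 0.1, -0.3)    # velocity of K' relative to K (units of c)
v_prime = transform_velocity(v, w)
v_back = inverse_velocity(v_prime, w)
print(v_prime)
print(v_back)           # should reproduce v
```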

• Problem 11 - Velocity-Addition Formula: The inertial frame $K'$ moves
relative to the inertial frame $K$ with the constant velocity $\vec{w}$. Above we have
derived the velocity transformation equation (3.44). Show that the inverse relation
is indeed given by (3.45) and (3.46).
Solution: Multiplication of (3.44) with $\vec{w}$ yields
\[
\left(1 - \frac{\vec{w}\cdot\vec{v}}{c^2}\right)\vec{v}'\cdot\vec{w}
 = \frac{1}{\gamma(w)}\,\vec{v}\cdot\vec{w} + \left(1 - \frac{1}{\gamma(w)}\right)\vec{v}\cdot\vec{w} - w^2 , \tag{3.47}
\]
i.e.
\[
\vec{v}\cdot\vec{w} = \frac{\vec{v}'\cdot\vec{w} + w^2}{1 + \frac{\vec{w}\cdot\vec{v}'}{c^2}} , \tag{3.48}
\]
which is equivalent to (3.45).

From (3.44) we obtain directly
\[
\vec{v}'_\perp = \frac{1}{1 - \frac{\vec{w}\cdot\vec{v}}{c^2}}\,\frac{\vec{v}_\perp}{\gamma(w)} . \tag{3.49}
\]
After insertion of (3.48) into this equation, we find (3.46) following some easy
manipulations.
• Problem 12 - Invariance of ds: By explicitly inserting the Lorentz
transformation, i.e. (3.42) and (3.43), show the invariance of
$ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2$.
Solution: Our starting point is
\[
ds'^2 = c^2 dt'^2 - d\vec{r}'^{\,2} . \tag{3.50}
\]
We insert
\[
d\vec{r}' = d\vec{r} + (\gamma(w) - 1)\,\frac{(d\vec{r}\cdot\vec{w})\,\vec{w}}{w^2} - \gamma(w)\,\vec{w}\,dt \tag{3.51}
\]
according to (3.42) and
\[
dt' = \gamma(w)\left(dt - \frac{\vec{w}\cdot d\vec{r}}{c^2}\right) \tag{3.52}
\]
according to (3.43). After carrying out the square we find the desired result,
i.e.
\[
ds'^2 = c^2 dt'^2 - d\vec{r}'^{\,2} = c^2 dt^2 - d\vec{r}^{\,2} = ds^2 . \tag{3.53}
\]
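Before carrying out the algebra it can be reassuring to see the invariance numerically. The following Python sketch applies (3.51) and (3.52) to one arbitrary set of differentials; units with c = 1 and all numerical values are assumptions made for illustration.

```python
import math

def boost(dr, dt, w, c=1.0):
    """Apply Eqs. (3.51) and (3.52) to the differentials (dr, dt)."""
    w2 = sum(wi * wi for wi in w)
    g = 1.0 / math.sqrt(1.0 - w2 / c**2)
    drw = sum(a * b for a, b in zip(dr, w))
    dr_p = tuple(a + (g - 1.0) * drw * wi / w2 - g * wi * dt
                 for a, wi in zip(dr, w))
    dt_p = g * (dt - drw / c**2)
    return dr_p, dt_p

def interval2(dr, dt, c=1.0):
    """ds^2 = c^2 dt^2 - dr^2."""
    return c**2 * dt**2 - sum(a * a for a in dr)

dr, dt = (0.2, -0.7, 0.4), 1.3     # arbitrary differentials in K
w = (0.5, 0.3, -0.2)               # arbitrary velocity of K' (units of c)
dr_p, dt_p = boost(dr, dt, w)
print(interval2(dr, dt), interval2(dr_p, dt_p))   # both values should agree
```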
Reference
1. R. Hentschke, A Short Introduction to Quantum Theory, Lecture Notes 2016.
http://constanze.materials.uni-wuppertal.de/Englishindex.html
Chapter 4
Principle of Least Action
In the following we study the principle of least action more thoroughly. This includes
equations of motion in systems of point masses, the relation between conservation
laws and attendant symmetries, and motion in accelerated systems. We even show
how the least action principle can be applied to problems outside of mechanics.
4.1 Lagrangian for a System of Point Masses
Our aim is the derivation of the equations of motion for many interacting point masses,
generalizing the previous discussion, which focused on a single point mass.
The position of a point mass $i$ is described by its position vector $\vec{r}_i = (x_i, y_i, z_i)$.
$x_i$, $y_i$, and $z_i$ are cartesian coordinates. Occasionally it is better or more convenient
to use symmetry-adapted coordinates. Examples include polar, spherical or cylindrical
coordinates. We shall call such coordinates generalized coordinates, $q_1, q_2,
\ldots, q_s$, provided they describe the system of interest completely.$^{1}$ The quantity $s$
denotes the degrees of freedom of the system.$^{2}$ In the simplest case $s = 3N$, where
$N$ is the number of point masses. However, the mere specification of the coordinates
$q_j$ ($j = 1, \ldots, s$) at time $t$, i.e. $q_j(t)$, does not yet allow the calculation of
the positions of the point masses at another time $t + \Delta t$. This requires, as experience
shows, the additional specification of the generalized velocities $\dot{q}_j \equiv dq_j/dt$
($j = 1, \ldots, s$). Only on the basis of $\{q_j(t), \dot{q}_j(t)\}_{j=1,\ldots,s}$ are we able to compute
the trajectories of the point masses.
1 Cartesian case: $q_1 \equiv x_1$, $q_2 \equiv y_1$, $q_3 \equiv z_1$, $q_4 \equiv x_2$, ….
2 More generally we can identify the number of degrees of freedom with the number of generalized
coordinates necessary for defining the position and orientation of a body in space. However, we
emphasize that later in statistical mechanics the meaning of degree of freedom is defined somewhat
differently, including also the attendant momentum components.
DOI 10.1007/978-3-319-48710-6_4
Euler–Lagrange Equations of Motion:
The generalization of the action (3.2) to the present situation is

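\[
S = \int_{t_1}^{t_2} \mathcal{L}\left(q_1(t), \ldots, q_s(t), \dot{q}_1(t), \ldots, \dot{q}_s(t), t\right) dt . \tag{4.1}
\]
We proceed analogously to the previous section. The variation of $S$, i.e. $\delta S$, is
\[
\delta S = \int_{t_1}^{t_2} \mathcal{L}\left(q_1 + \delta q_1, \ldots, q_s + \delta q_s, \dot{q}_1 + \delta\dot{q}_1, \ldots, \dot{q}_s + \delta\dot{q}_s, t\right) dt
 - \int_{t_1}^{t_2} \mathcal{L}\left(q_1, \ldots, q_s, \dot{q}_1, \ldots, \dot{q}_s, t\right) dt .
\]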
Expansion of the first integrand (cf. (1.76)) yields
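\[
\mathcal{L}\left(q_1 + \delta q_1, \ldots, q_s + \delta q_s, \dot{q}_1 + \delta\dot{q}_1, \ldots, \dot{q}_s + \delta\dot{q}_s, t\right)
 \approx \mathcal{L}\left(q_1, \ldots, q_s, \dot{q}_1, \ldots, \dot{q}_s, t\right)
 + \sum_{j=1}^{s}\left(\frac{\partial\mathcal{L}}{\partial q_j}\,\delta q_j + \frac{\partial\mathcal{L}}{\partial\dot{q}_j}\,\delta\dot{q}_j\right) .
\]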
Following the same steps as before in the case of a single coordinate, we obtain the
Euler–Lagrange equations,
\[
\frac{\partial\mathcal{L}}{\partial q_j} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_j} = 0 . \tag{4.2}
\]
Notice that each $\delta q_j$ can vary independently.
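The meaning of δS = 0 can be illustrated numerically for a single coordinate. The following Python sketch is an illustration only: it assumes the Lagrangian of a point mass in a homogeneous gravitational field, L = (m/2)ẏ² − mgy, a simple midpoint discretization of the action, and an arbitrary perturbation that vanishes at both end points.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action for L = m/2 * ydot^2 - m*g*y (midpoint rule)."""
    s = 0.0
    for y0, y1 in zip(path[:-1], path[1:]):
        ydot = (y1 - y0) / dt
        ymid = 0.5 * (y0 + y1)
        s += (0.5 * m * ydot**2 - m * g * ymid) * dt
    return s

T, n, g = 1.0, 1000, 9.81
dt = T / n
times = [k * dt for k in range(n + 1)]
# Solution of the Euler-Lagrange equation yddot = -g with y(0) = y(T) = 0
exact = [0.5 * g * t * (T - t) for t in times]

def perturbed(eps):
    # the perturbation vanishes at both end points, as required
    return [y + eps * math.sin(math.pi * t / T) for y, t in zip(exact, times)]

s0 = action(exact, dt)
for eps in (-0.1, -0.01, 0.01, 0.1):
    # the difference is positive and grows like eps^2 (no term linear in eps)
    print(eps, action(perturbed(eps), dt) - s0)
```

For this Lagrangian the stationary path is a true minimum of the action; the change of S is quadratic in the size of the perturbation.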

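Remark: The requirement $\delta S = 0$ does not determine $\mathcal{L}$ completely. Replacing
$\mathcal{L}(q, \dot{q}, t)$ by $\mathcal{L}'(q, \dot{q}, t) = \mathcal{L}(q, \dot{q}, t) + \frac{d}{dt} f(q, t)$
(here for the sake of simplicity for one coordinate only) yields
\[
S' = \int_{t_1}^{t_2} \mathcal{L}'(q, \dot{q}, t)\,dt = \int_{t_1}^{t_2} \mathcal{L}(q, \dot{q}, t)\,dt
 + \underbrace{\int_{t_1}^{t_2} \frac{df}{dt}\,dt}_{=\,\text{const}} .
\]
This means $\delta S' = \delta S = 0$, i.e. both $\mathcal{L}$ and $\mathcal{L}'$ do satisfy (4.2). In addition we may
multiply $\mathcal{L}$ by a constant without altering (4.2).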
Terminology of constraints:
Frequently it is necessary to include constraints, which the solutions of the
equations of motion must satisfy. An example is the motion of a pendulum. The
mass moves according to the force of gravity, but its path is constrained by the thread
from which it is suspended.
There are different types of constraints. Here we distinguish
– holonomic constraints: These constraints are of the form
\[
f(q_1, q_2, \ldots, t) = 0 . \tag{4.3}
\]
An example is a rigid body, which is defined by constraining the distances
$r_{ij} = |\vec{r}_i - \vec{r}_j|$ between all of its mass elements to certain constant values, i.e.
$r_{ij}^2 - c_{ij} = 0$. Another example is a particle constrained to move in a plane. Each
holonomic constraint reduces the number of independent coordinates by one, i.e.
$s = 3N - Z$, where $Z$ is the number of holonomic constraints.
– nonholonomic constraints: As perhaps expected, these do not have the form of
(4.3). If a point mass slides down the surface of a sphere and finally falls off, then
$r^2 - a^2 \geq 0$, where $r$ is the distance from the center of the sphere, whose radius is
$a$, to the point mass.
In addition, constraints explicitly dependent on time are called rheonomic, otherwise
they are called scleronomic.
Euler–Lagrange Equations Including Constraints:
We consider a single point mass and assume that it is subject to the nonholonomic
constraint
G (q, q̇, t) = c. (4.4)
This constraint may be included using the method of Lagrange multipliers (cf. [1]).
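Instead of varying (4.1), we calculate the variation of
\[
\int_{t_1}^{t_2} \Big[\mathcal{L}(q, \dot{q}, t) + \lambda\big(G(q, \dot{q}, t) - c\big)\Big]\,dt .
\]
The result is
\[
\left(\frac{\partial}{\partial q} - \frac{d}{dt}\frac{\partial}{\partial\dot{q}}\right)\Big(\mathcal{L}(q, \dot{q}, t) + \lambda\,G(q, \dot{q}, t)\Big) = 0 . \tag{4.5}
\]
The undetermined Lagrange multiplier $\lambda$ is calculated via the additional condition (4.4).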

Remark: In order to make this approach more transparent we consider the following
example. We seek the minimum of the function f (x, y) of x and y under the condition
φ(x, y) = 0. First we have
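\[
0 = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial x} .
\]
Notice that the constraint implies $y = y(x)$. In addition
\[
0 = \frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{\partial y}{\partial x} .
\]
Combination of the two equations yields
\[
0 = \frac{\partial f}{\partial x} + \lambda\,\frac{\partial\phi}{\partial x} ,
\]
where
\[
\lambda = -\frac{\partial f}{\partial y}\Big/\frac{\partial\phi}{\partial y} .
\]
Thus $0 = \frac{\partial f}{\partial y} + \lambda\,\frac{\partial\phi}{\partial y}$ and therefore
\[
\delta(f + \lambda\phi) = 0 .
\]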
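A small numerical example may make the remark concrete. The Python sketch below is an illustration under assumptions: the function f = x² + y² and the constraint φ = x + y − 1 = 0 are hypothetical choices, and the derivatives are approximated by central finite differences.

```python
def f(x, y):
    return x**2 + y**2          # function to be minimized (hypothetical example)

def phi(x, y):
    return x + y - 1.0          # constraint phi(x, y) = 0

def partial(func, x, y, var, h=1e-6):
    """Central finite-difference approximation of a partial derivative."""
    if var == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

# Minimize f along the constraint y(x) = 1 - x by a simple scan over x.
xs = [k / 10000 for k in range(10001)]
x_min = min(xs, key=lambda x: f(x, 1.0 - x))
y_min = 1.0 - x_min

# lambda = -(df/dy)/(dphi/dy), evaluated at the constrained minimum
lam = -partial(f, x_min, y_min, "y") / partial(phi, x_min, y_min, "y")
res_x = partial(f, x_min, y_min, "x") + lam * partial(phi, x_min, y_min, "x")
res_y = partial(f, x_min, y_min, "y") + lam * partial(phi, x_min, y_min, "y")
print(x_min, y_min, lam, res_x, res_y)
```

At the constrained minimum both components of the gradient of f + λφ vanish, in accordance with δ(f + λφ) = 0.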
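If instead of just a single constraint there are $n$ simultaneous constraints, then
each requires its individual Lagrange multiplier, i.e.
\[
\left(\frac{\partial}{\partial q} - \frac{d}{dt}\frac{\partial}{\partial\dot{q}}\right)\left(\mathcal{L}(q, \dot{q}, t) + \sum_{\nu=1}^{n}\lambda_\nu\,G_\nu(q, \dot{q}, t)\right) = 0 . \tag{4.6}
\]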
The generalization to systems of point masses is straightforward.
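• Example - Mathematical Pendulum: We apply the above concept to the
familiar mathematical pendulum (cf. Fig. 2.1). The Lagrangian in cartesian
coordinates is given by
\[
\mathcal{L}(\vec{r}, \dot{\vec{r}}) = \frac{1}{2}\,m\left(\dot{x}^2 + \dot{y}^2\right) - mg\,(l - y) .
\]
The scleronomic constraint is
\[
G(\vec{r}) \equiv \sqrt{x^2 + y^2} = l . \tag{4.7}
\]
Application of (4.6) with $\lambda$ replaced by $-\lambda$ yields
\[
m\ddot{x} = -\lambda\,\frac{x}{l} \quad\text{and}\quad m\ddot{y} = -\lambda\,\frac{y}{l} + mg .
\]
All in all, including the constraint, these are three equations for $x(t)$, $y(t)$,
and $\lambda$.
We assume the following initial conditions
\[
\begin{aligned}
x(t = 0) &= 0 &\qquad \dot{x}(t = 0) &= \sqrt{\frac{2E}{m}}\\
y(t = 0) &= l &\qquad \dot{y}(t = 0) &= 0 ,
\end{aligned}
\]
where $E$ is the total energy of the pendulum. Notice that $E = \frac{1}{2} m\dot{x}^2$, when
the mass passes the lowest point. The solutions are
\[
x(t) = x_0 \sin(\omega t + \varphi_x) \quad\text{and}\quad y(t) = y_0 \sin(\omega t + \varphi_y) + \frac{mgl}{\lambda} ,
\]
where $\omega = \sqrt{\lambda/(ml)}$. Here $mgl/\lambda$ is a special solution of the inhomogeneous
equation (cf. Sect. 2.2). The initial conditions yield
\[
\begin{aligned}
\varphi_x &= 0 &\qquad \omega x_0 &= \sqrt{\frac{2E}{m}}\\
y_0 + \frac{mgl}{\lambda} &= l &\qquad \varphi_y &= \frac{\pi}{2} .
\end{aligned}
\]
Hence
\[
\begin{aligned}
x(t) &= \sqrt{\frac{2El}{\lambda}}\,\sin\left(\sqrt{\frac{\lambda}{ml}}\,t\right)\\
y(t) &= \frac{mgl}{\lambda} + \left(l - \frac{mgl}{\lambda}\right)\cos\left(\sqrt{\frac{\lambda}{ml}}\,t\right) .
\end{aligned}
\]
In principle $\lambda$ can be obtained by inserting these equations into (4.7). Because
this is somewhat complicated, once again we consider the limit of small amplitude
$\delta\phi(t)$. In this case $y(t) \approx l$ and thus $\lambda \approx mg$. From $x(t) \approx l\,\delta\phi(t)$ we
obtain the equation
\[
\delta\phi(t) \approx \sqrt{\frac{2E}{mgl}}\,\sin\left(\sqrt{\frac{g}{l}}\,t\right) .
\]
This coincides with our previous solution of the mathematical pendulum
expressed in (2.55) for the same initial conditions ($\delta\phi_o = 0$ and
$\omega_o = \sqrt{2E/(ml^2)}$).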
• Problem 13 - Variation Subject to a Constraint: We consider the curve
$y(x)$ (cf. the sketch), which intersects the $x$-axis at $x_1$ and $x_2$. The curve has a
fixed length $l$. What is the shape of $y(x)$ so that the area under the curve, $A$,
has a maximum, i.e.
\[
\int_{x_1}^{x_2} y(x)\,dx = \text{maximum} .
\]
[Sketch: curve $y(x)$ above the $x$-axis between $x_1$ and $x_2$, enclosing the area $A$.]
The constraint is
\[
\int_{\partial A} ds = l .
\]
Write down the attendant Euler–Lagrange equation including the constraint.
Show that the equation for a circle, $y(x) = \sqrt{r^2 - x^2}$, satisfies this differential
equation. What is the value of the Lagrange multiplier?
Solution: We start from
\[
ds^2 = dx^2 + dy^2 = dx^2\left(1 + \frac{dy^2}{dx^2}\right) \quad\text{or}\quad ds = dx\,\sqrt{1 + y'^2} ,
\]
wherein $y' = dy/dx$. Analogous to (4.5) we can write
\[
\left(\frac{\partial}{\partial y} - \frac{d}{dx}\frac{\partial}{\partial y'}\right)\left(y(x) + \lambda\sqrt{1 + y'^2}\right) = 0 .
\]
Thus
\[
\frac{d}{dx}\,\frac{y'}{\sqrt{1 + y'^2}} = \frac{1}{\lambda} .
\]
Using $y(x) = \sqrt{r^2 - x^2}$ yields
\[
\frac{y'}{\sqrt{1 + y'^2}} = -\frac{x}{r} ,
\]
i.e. for $\lambda = -r$ the above differential equation is satisfied.
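The result of Problem 13 can be cross-checked with finite differences. The following Python sketch is an illustration only; the radius and the step sizes are arbitrary assumptions.

```python
import math

r = 2.0  # radius of the circle (arbitrary example value)

def y(x):
    return math.sqrt(r**2 - x**2)

def yprime(x, h=1e-6):
    """Central finite-difference approximation of dy/dx."""
    return (y(x + h) - y(x - h)) / (2 * h)

def g(x):
    """The quantity y'/sqrt(1 + y'^2), whose x-derivative must equal 1/lambda."""
    yp = yprime(x)
    return yp / math.sqrt(1.0 + yp**2)

for x in (-1.0, 0.0, 0.5, 1.5):
    dg_dx = (g(x + 1e-4) - g(x - 1e-4)) / 2e-4
    # columns: x, y'/sqrt(1+y'^2), -x/r, its derivative, 1/lambda = -1/r
    print(x, g(x), -x / r, dg_dx, -1.0 / r)
```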

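• Problem 14 - Motion on an Incline: A point mass slides down a frictionless
incline (sketch). Write down the Lagrangian and describe the incline as a
constraint expressed in terms of the coordinates $x$ and $y$. Using the Lagrange
multiplier method, derive the equation of motion of the point mass and calculate
the solutions $x(t)$ and $y(t)$.
[Sketch: point mass on the incline with the gravitational force $F_g$; vertical axis $y$.]
Solution: The Lagrangian and the constraint are given by
\[
\mathcal{L} = \frac{1}{2}\,m\left(\dot{x}^2 + \dot{y}^2\right) - mgy
\]
and
\[
y = x\tan\alpha ,
\]
respectively. Thus we have
\[
\begin{aligned}
\left(\frac{\partial}{\partial x} - \frac{d}{dt}\frac{\partial}{\partial\dot{x}}\right)\big(\mathcal{L} + \lambda(y - x\tan\alpha)\big) &= 0\\
\left(\frac{\partial}{\partial y} - \frac{d}{dt}\frac{\partial}{\partial\dot{y}}\right)\big(\mathcal{L} + \lambda(y - x\tan\alpha)\big) &= 0 .
\end{aligned}
\]
The resulting equations of motion are
\[
-m\ddot{x} - \lambda\tan\alpha = 0 \quad\text{and}\quad -m\ddot{y} - mg + \lambda = 0 .
\]
We multiply the first equation by $\tan\alpha$ and use $\ddot{y} = \ddot{x}\tan\alpha$ (constraint).
Subtracting the second equation yields an expression for the Lagrange multiplier:
\[
\lambda = mg\cos^2\alpha .
\]
Upon insertion of this result into the original equations of motion we find
\[
\ddot{x} = -g\sin\alpha\cos\alpha \quad\text{and}\quad \ddot{y} = -g\sin^2\alpha .
\]
The solutions are
\[
\begin{aligned}
x(t) &= x(0) + \dot{x}(0)\,t - \frac{1}{2}\,g t^2 \sin\alpha\cos\alpha\\
y(t) &= y(0) + \dot{y}(0)\,t - \frac{1}{2}\,g t^2 \sin^2\alpha .
\end{aligned}
\]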
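The solution of Problem 14 is easily verified by inserting it into the equations of motion. The Python sketch below does this for arbitrary example values of m and α (assumptions made for illustration) and confirms that the trajectory stays on the incline.

```python
import math

m, g, alpha = 2.0, 9.81, math.radians(30.0)  # arbitrary example values

lam = m * g * math.cos(alpha) ** 2           # Lagrange multiplier from the solution
ax = -g * math.sin(alpha) * math.cos(alpha)  # x-acceleration
ay = -g * math.sin(alpha) ** 2               # y-acceleration

# Residuals of the two equations of motion and of the differentiated constraint
res_x = -m * ax - lam * math.tan(alpha)
res_y = -m * ay - m * g + lam
res_c = ay - ax * math.tan(alpha)

def position(t, x0=1.0, vx0=0.0):
    """x(t), y(t) for a start on the incline; y(0) and ydot(0) follow from y = x tan(alpha)."""
    x = x0 + vx0 * t + 0.5 * ax * t**2
    y = x0 * math.tan(alpha) + vx0 * math.tan(alpha) * t + 0.5 * ay * t**2
    return x, y

print(res_x, res_y, res_c, position(0.3))
# the magnitude of the acceleration is g sin(alpha), as expected for an incline
print(math.hypot(ax, ay), g * math.sin(alpha))
```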
Law of Inertia:
In the following we study the form of the Lagrangian in more detail. We assume
that (empty) space is isotropic and homogeneous, i.e. no direction is special and all
positions in space are equivalent. Under these conditions the state of motion of a
point mass should not depend on position, $\vec{r}$, or time, $t$. Due to the isotropy of space
$\mathcal{L}$ should depend on the magnitude of the velocity of the mass only, which, as we
pointed out before, suggests $\mathcal{L} = \mathcal{L}(v^2)$. According to (4.2)$^{3}$
\[
\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\vec{v}} = 0 , \tag{4.8}
\]
or $\partial\mathcal{L}/\partial\vec{v} = \text{const}$. This is easily satisfied via
\[
\vec{v} = \text{const} ,
\]
which once again is Newton’s first law (cf. (2.24)).

L of a Free Point Mass:
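We consider two inertial coordinate systems moving within the above space with
constant relative velocity $\vec{w}$. If $\vec{v}$ is the velocity of a point mass in the first coordinate
system, then we calculate the velocity of the same point mass in the second coordinate
system via $\vec{v}' = \vec{v} - \vec{w}$. The Lagrangians in the two reference frames are related
according to
\[
\mathcal{L}' = \mathcal{L}(v'^2) = \mathcal{L}\left(v^2 - 2\,\vec{v}\cdot\vec{w} + w^2\right)
 \approx \mathcal{L}(v^2) - \frac{\partial\mathcal{L}}{\partial v^2}\,2\,\vec{v}\cdot\vec{w} , \tag{4.9}
\]
where we assume $w \ll v$.
The laws of motion must be the same in the two reference frames. This means that
$\mathcal{L}(v^2)$ and $\mathcal{L}(v'^2)$ are equivalent, i.e. the resulting equations of motion are identical.
Therefore the two Lagrangians can differ only by the total time derivative of a function
depending on the coordinates and time (cf. above), i.e.
$\frac{\partial\mathcal{L}}{\partial v^2}\,2\,\vec{v}\cdot\vec{w} = \frac{d}{dt} f(\vec{r}, t)$. With
$\frac{\partial\mathcal{L}}{\partial v^2} = \text{const}$ we have indeed
$f(\vec{r}, t) = \frac{\partial\mathcal{L}}{\partial v^2}\,2\,\vec{r}\cdot\vec{w}$.
Notice that this does not work if $\frac{\partial\mathcal{L}}{\partial v^2}$ still is a function of $v$.
Thus
\[
\mathcal{L} = \frac{m}{2}\,v^2 , \tag{4.10}
\]
where $m/2$ is a positive constant.$^{4}$ $m$ of course is the mass.
3 Note: the meaning of $\partial/\partial\vec{v}$ is $(\partial/\partial v_x, \partial/\partial v_y, \partial/\partial v_z)$. In addition,
$\partial/\partial\vec{r}$ is the gradient $\vec{\nabla} = (\partial/\partial x, \partial/\partial y, \partial/\partial z)$.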
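Remark 1: Equation (4.10) implies for an arbitrary $\vec{w}$ that
\begin{align}
\mathcal{L}' &= \frac{m}{2}\,v'^2 \notag\\
&= \frac{m}{2}\,(\vec{v} - \vec{w})^2 \notag\\
&= \frac{m}{2}\,v^2 - m\,\vec{v}\cdot\vec{w} + \frac{m}{2}\,w^2 \notag\\
&= \mathcal{L} + \frac{d}{dt}\left(-m\,\vec{r}\cdot\vec{w} + \frac{m}{2}\,w^2 t\right) . \tag{4.11}
\end{align}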
However, two such Lagrangians are equivalent in the framework of the least action
principle. Here this means that they are invariant with respect to the change from one
frame of reference to the other.
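Remark 2: Notice that
\[
v^2 = \left(\frac{ds}{dt}\right)^2 = \frac{(ds)^2}{(dt)^2} .
\]
Here $ds$ is an ordinary line element in three-dimensional space. The following is $ds^2$
in cartesian coordinates,
\[
ds^2 = dx^2 + dy^2 + dz^2 , \tag{4.12}
\]
in cylindrical coordinates,
\[
ds^2 = d\rho^2 + \rho^2 d\phi^2 + dz^2 , \tag{4.13}
\]
and in spherical coordinates,
\[
ds^2 = dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2 . \tag{4.14}
\]
Hence
\[
\mathcal{L} = \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) \tag{4.15}
\]
or
\[
\mathcal{L} = \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) \tag{4.16}
\]
or
\[
\mathcal{L} = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) \tag{4.17}
\]
for a free point mass.
4 In the case of a negative $m$, $S$ would decrease without bound by increasing the velocity (see also
page 78).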
• Problem 15 - Elastic Pendulum (Euler–Lagrange): We revisit the elastic
pendulum introduced in Problem 9.
(a) Write down the expressions for the kinetic and potential energies of the
pendulum. Notice that the zero point of the potential energy is arbitrary. Also
write down the resulting expression for the Lagrangian.
(b) From the Euler–Lagrange equations derive the equations of motion for $\rho$
and $\phi$. Convert these equations into their dimensionless form. Use $l$ as the unit
of length and $\tau_l = \sqrt{l/g}$ as the unit of time (Notice that $\tau_m = \sqrt{m/k}$ also is a
typical time!).
(c) Make the transition to the rigid pendulum by introducing the constraint
$\rho = l$. Use the Lagrange multiplier method to obtain the resulting equations
of motion. Give the explicit expression for the Lagrange multiplier, $\lambda$. Solve
the equation of motion for small $\phi$, i.e. $\sin\phi \approx \phi$.
Solution: (a) Equation (4.16) immediately yields
\[
K = \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2\right) . \tag{4.18}
\]
The potential energy is given by
\[
U = -mg\rho\cos\phi + \frac{1}{2}\,k\,(\rho - l)^2 + U_o , \tag{4.19}
\]
where $U_o$ is a constant, which here has no particular meaning. The resulting
Lagrangian is
\[
\mathcal{L} = \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2\right) + mg\rho\cos\phi - \frac{1}{2}\,k\,(l - \rho)^2 . \tag{4.20}
\]
(b) The Euler–Lagrange equations are
\[
\frac{\partial\mathcal{L}}{\partial\rho} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{\rho}} = 0
\quad\text{and}\quad
\frac{\partial\mathcal{L}}{\partial\phi} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{\phi}} = 0 . \tag{4.21}
\]
Insertion of the Lagrangian (4.20) yields the equations of motion already
known to us from Problem 9 (cf. (2.83) and (2.84)).
In dimensionless form the two equations are given by
\[
\ddot{\rho}^* - \rho^*\dot{\phi}^2 + \left(\frac{\tau_l}{\tau_m}\right)^2(\rho^* - 1) - \cos\phi = 0 \tag{4.22}
\]
and
\[
\ddot{\phi} + 2\,\frac{\dot{\rho}^*\dot{\phi}}{\rho^*} + \frac{\sin\phi}{\rho^*} = 0 . \tag{4.23}
\]
Here, i.e. in this part of the problem, $\rho^* = \rho/l$ and the dot denotes $\tau_l\,d/dt$. The only
remaining parameter, distinguishing different systems, is $\tau_l/\tau_m$. Notice that
the dimensionless equations of motion are particularly suited for a numerical
solution of this problem.
(c) The Euler–Lagrange equation for $\rho$ including the constraint $\rho = l$ is
\[
\left(\frac{\partial}{\partial\rho} - \frac{d}{dt}\frac{\partial}{\partial\dot{\rho}}\right)\big(\mathcal{L} + \lambda(\rho - l)\big) = 0 . \tag{4.24}
\]
From this equation we obtain, with $\lambda$ replaced by $-\lambda$ as in the example of the
mathematical pendulum,
\[
\ddot{\rho} - \rho\dot{\phi}^2 + \frac{k}{m}\,(\rho - l) - g\cos\phi + \frac{\lambda}{m} = 0 . \tag{4.25}
\]
Using $\rho = l$, i.e. $\ddot{\rho} = 0$, yields
\[
\lambda = mg\cos\phi + ml\dot{\phi}^2 . \tag{4.26}
\]
The Lagrange multiplier, $\lambda$, is the constraining force (or tension), which must
be applied in order to keep the length of the pendulum constant (cf. (2.57)).
The equation of motion for $\phi$ follows immediately from part (b), i.e. from
(4.23):
\[
\ddot{\phi} + \frac{g}{l}\,\sin\phi = 0 . \tag{4.27}
\]
The detailed solution of this differential equation was discussed in the example
beginning on page 53.
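As noted in part (b), the dimensionless equations (4.22) and (4.23) lend themselves to a numerical solution. The following Python sketch is one possible way to do this, not the method of the text: it assumes a classical fourth-order Runge-Kutta scheme, an arbitrary value of (τ_l/τ_m)², and arbitrary initial conditions, and it monitors the dimensionless energy E/(mgl) as a consistency check.

```python
import math

def rhs(state, kappa):
    """Right-hand side of (4.22), (4.23); kappa = (tau_l/tau_m)^2."""
    rho, phi, rho_dot, phi_dot = state
    rho_dd = rho * phi_dot**2 - kappa * (rho - 1.0) + math.cos(phi)
    phi_dd = -(2.0 * rho_dot * phi_dot + math.sin(phi)) / rho
    return (rho_dot, phi_dot, rho_dd, phi_dd)

def rk4_step(state, h, kappa):
    """One classical fourth-order Runge-Kutta step."""
    def shift(s, k, f):
        return tuple(a + f * b for a, b in zip(s, k))
    k1 = rhs(state, kappa)
    k2 = rhs(shift(state, k1, h / 2), kappa)
    k3 = rhs(shift(state, k2, h / 2), kappa)
    k4 = rhs(shift(state, k3, h), kappa)
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, kappa):
    """Dimensionless energy E/(m g l), up to the arbitrary constant U_o."""
    rho, phi, rho_dot, phi_dot = state
    return (0.5 * (rho_dot**2 + rho**2 * phi_dot**2)
            - rho * math.cos(phi) + 0.5 * kappa * (rho - 1.0) ** 2)

kappa = 10.0                       # (tau_l/tau_m)^2, arbitrary example value
state = (1.1, 0.5, 0.0, 0.0)       # rho*, phi, and their dimensionless rates
e0 = energy(state, kappa)
for _ in range(10000):             # integrate up to t/tau_l = 10
    state = rk4_step(state, 1e-3, kappa)
print(state, energy(state, kappa) - e0)
```

The energy difference printed at the end should be tiny compared with the energy itself; a visible drift would indicate a time step that is too large.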
L in Systems of Point Masses:
The basis for the mechanics of point masses are Newton’s equations of motion:
\[
m_i\,\dot{\vec{v}}_i = \vec{F}_i . \tag{4.28}
\]
The force $\vec{F}_i$ acting on point mass $i$, which in general includes contributions from
all other point masses, is given by
\[
\vec{F}_i = -\frac{\partial U}{\partial\vec{r}_i} . \tag{4.29}
\]
Here $U$ is the potential energy of the system. Provided the positions and momenta
of all $i$ are known at some time $t = 0$, the integration of (4.28) yields the entire
trajectory of each point mass, past and future.
Consider the following expansion of the position vector, i.e.
\[
\vec{r}_i(t + \Delta t) \approx \vec{r}_i(t) + \dot{\vec{r}}_i(t)\,\Delta t + \frac{1}{2}\,\ddot{\vec{r}}_i(t)\,\Delta t^2 . \tag{4.30}
\]
According to (4.28) the acceleration is given via $\ddot{\vec{r}}_i(t) = \vec{F}_i(t)/m_i$. If
the force is known, then (4.30) can be employed to numerically calculate, using a
suitable timestep $\Delta t$, the path of $i$ in space. We can improve the precision of the
above algorithm considerably, if we add the series expansions of $\vec{r}_i(t + \Delta t)$ and
$\vec{r}_i(t - \Delta t)$ at time $t$. The result is
\[
\vec{r}_i(t + \Delta t) \approx 2\,\vec{r}_i(t) - \vec{r}_i(t - \Delta t) + \frac{1}{m_i}\,\vec{F}_i(t)\,\Delta t^2 . \tag{4.31}
\]
This algorithm is better than the previous one, because its error is $O(\Delta t^4)$ (why?),
whereas before the error was $O(\Delta t^3)$. In this case we need to know two initial
positions, e.g. $\vec{r}_i(0)$ and $\vec{r}_i(-\Delta t)$, in order to be able to calculate all following
positions. Note also that this algorithm, like (4.28), is invariant with respect to time
reversal. We shall return to the numerical solution of the equations of motion later
in this book.
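The difference between (4.30) and (4.31) shows up clearly in a simple test. The Python sketch below is an illustration under assumptions: a one-dimensional harmonic force with m = k = 1 serves as test case, and, because (4.30) alone does not say how the velocity is advanced, the simple update v → v + aΔt is added here as an assumption.

```python
import math

def force(x, k=1.0):
    return -k * x   # harmonic force, chosen only as a test case

def taylor_path(x0, v0, dt, n, m=1.0):
    """Repeated use of (4.30); the velocity update v += a*dt is an added assumption."""
    x, v = x0, v0
    for _ in range(n):
        a = force(x) / m
        x, v = x + v * dt + 0.5 * a * dt**2, v + a * dt
    return x

def verlet_path(x_prev, x0, dt, n, m=1.0):
    """Repeated use of (4.31), starting from the two positions x(-dt) and x(0)."""
    x_old, x = x_prev, x0
    for _ in range(n):
        x_old, x = x, 2.0 * x - x_old + force(x) / m * dt**2
    return x

t_end = 10.0
exact = math.cos(t_end)                   # exact solution for x(0) = 1, v(0) = 0
errors = {}
for n in (1000, 2000):
    dt = t_end / n
    err_taylor = abs(taylor_path(1.0, 0.0, dt, n) - exact)
    err_verlet = abs(verlet_path(math.cos(-dt), 1.0, dt, n) - exact)
    errors[n] = (err_taylor, err_verlet)
    print(dt, err_taylor, err_verlet)
```

In this test the accumulated error of (4.31) is much smaller than that of the simple scheme, and halving Δt reduces it by roughly a factor of four.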
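We have already shown that Newton’s equations of motion can be obtained via
the Euler–Lagrange equations, i.e.
$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\vec{v}_i} = \frac{\partial\mathcal{L}}{\partial\vec{r}_i}$, setting
\[
\mathcal{L} = K - U = \sum_i \frac{m_i v_i^2}{2} - U . \tag{4.32}
\]
Notice that $U$ here is assumed to be independent of the velocities, i.e.
\[
U = U\left(\vec{r}_1(t), \vec{r}_2(t), \vec{r}_3(t), \ldots\right) . \tag{4.33}
\]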
One particular consequence is the instantaneous action at a distance. This of course

is in contrast to the theory of relativity. Only in the limit of slow moving masses can
this inconsistency be neglected. On the other hand, Newton’s classical theory has
been used for the past three hundred years, for instance to predict planetary motion
with high precision, and continues to be useful in the present time of interplanetary
space probes.
Why then have we introduced the least action principle at this early stage? In
particular because a number of additional assumptions are necessary to obtain the
equations of motion (4.28). However, the least action principle is very general, which
makes it the guiding principle throughout much of physics. We had a glimpse of its
power, when we discussed the action of a free relativistic particle.
Remark: In the previous sections we have used cartesian instead of generalized
coordinates qi . Expressed in terms of generalized coordinates the Lagrangian follows
according to
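\[
x_i = f_i(q_1, q_2, \ldots, q_s) ,
\]
i.e.
\[
\dot{x}_i = \sum_k \frac{\partial f_i}{\partial q_k}\,\dot{q}_k . \tag{4.34}
\]
Thus
\[
\mathcal{L} = \frac{1}{2}\sum_{l,m} a_{lm}(\{q\})\,\dot{q}_l\,\dot{q}_m - U(\{q\}) . \tag{4.35}
\]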
Work:
The integral
\[
W = \int_C d\vec{r}\cdot\vec{F}_i \tag{4.36}
\]
is the work done on a point mass $i$ by the force $\vec{F}_i$ (cf. Sect. 1.4). Here $d\vec{r}$ is a line
element oriented in the direction of motion of the point mass along its path $C$. Using
(4.29) we have
\[
W = -\int_C d\vec{r}\cdot\frac{\partial U}{\partial\vec{r}} = -\int_C dU = U_{\text{beginning}} - U_{\text{end}} . \tag{4.37}
\]
Notice that W = 0 in the case of a closed path C. Forces for which this is true, i.e.
forces which can be derived from a potential U , are called conservative forces.
4.2 Conserved Quantities
Energy Conservation:
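In an isolated system, i.e. a region in space decoupled from any outside influence,
we expect that $\mathcal{L}$ does not depend on time explicitly (cf. Sect. 3.3). The equations of
motion derived from $\mathcal{L}$ are the same once and for all. This is called homogeneity of
time. Therefore
\[
\frac{d\mathcal{L}}{dt} = \sum_j \frac{\partial\mathcal{L}}{\partial q_j}\,\dot{q}_j + \sum_j \frac{\partial\mathcal{L}}{\partial\dot{q}_j}\,\ddot{q}_j
 + \underbrace{\frac{\partial\mathcal{L}}{\partial t}}_{=0}
 \overset{(4.2)}{=} \frac{d}{dt}\sum_j \dot{q}_j\,\frac{\partial\mathcal{L}}{\partial\dot{q}_j} ,
\]
or
\[
\frac{d}{dt}\left(\sum_j \dot{q}_j\,\frac{\partial\mathcal{L}}{\partial\dot{q}_j} - \mathcal{L}\right) = 0 .
\]
This means that the energy, $E$, defined via
\[
E = \sum_j \dot{q}_j\,\frac{\partial\mathcal{L}}{\partial\dot{q}_j} - \mathcal{L} , \tag{4.38}
\]
is a conserved quantity in an isolated system. Because the kinetic energy is quadratic
in the velocities, we have
\[
E = K + U . \tag{4.39}
\]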
Momentum Conservation:
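The above Lagrangian should not depend on the space origin either. Mathematically
this is expressed via
\[
0 = \delta\mathcal{L} = \sum_i \frac{\partial\mathcal{L}}{\partial\vec{r}_i}\cdot\delta\vec{r} ,
\]
where $\delta\vec{r}$ is a small uniform translation. Thus $\sum_i \frac{\partial\mathcal{L}}{\partial\vec{r}_i} = 0$. Using
$\frac{\partial\mathcal{L}}{\partial\vec{r}_i} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\vec{v}_i} = 0$
we find that the total momentum of our system,
\[
\vec{P} = \sum_i \frac{\partial\mathcal{L}}{\partial\vec{v}_i} = \sum_i m_i\vec{v}_i , \tag{4.40}
\]
is conserved. The quantities $\vec{p}_i = m_i\vec{v}_i$ are the momenta of the individual point
masses. Generalized momenta are defined via
\[
p_j = \frac{\partial\mathcal{L}}{\partial\dot{q}_j} . \tag{4.41}
\]
Remark: Using $\frac{\partial\mathcal{L}}{\partial\vec{r}_i} = -\frac{\partial U}{\partial\vec{r}_i}$ in conjunction with (4.29) yields
\[
\sum_i \vec{F}_i = 0 . \tag{4.42}
\]
The sum over all forces within an isolated system is zero. In the special case $i = 1, 2$
this is Newton’s 3rd law (actio = reactio). Analogous to (4.41)
\[
F_j = \frac{\partial\mathcal{L}}{\partial q_j} \tag{4.43}
\]
defines generalized forces. Thus we can express the Euler–Lagrange equations via
\[
\dot{p}_j = F_j . \tag{4.44}
\]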
Conservation of Angular Momentum:
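We define an arbitrary axis within our isolated system and carry out an (infinitesimal)
rotation with respect to the axis (cf. Fig. 1.9). Due to the isotropy of space the
Lagrangian is not affected by the rotation or mathematically speaking:
\[
\delta\mathcal{L} = \sum_i\left(\frac{\partial\mathcal{L}}{\partial\vec{r}_i}\cdot\delta\vec{r}_i + \frac{\partial\mathcal{L}}{\partial\vec{v}_i}\cdot\delta\vec{v}_i\right)
 = \sum_i\left(\dot{\vec{p}}_i\cdot\delta\vec{r}_i + \vec{p}_i\cdot\delta\vec{v}_i\right) = 0 . \tag{4.45}
\]
Notice that $\delta\vec{r}$ can be expressed as
\[
\delta\vec{r} = \delta\vec{\phi}\times\vec{r} . \tag{4.46}
\]
The same applies to $\delta\vec{v}$, i.e.
\[
\delta\vec{v} = \delta\vec{\phi}\times\vec{v} . \tag{4.47}
\]
Remember that according to (1.15) every vector $\vec{a}$ subject to an infinitesimal
rotation, $\delta\vec{\phi}$, undergoes the change $\delta\vec{a} = \delta\vec{\phi}\times\vec{a}$. The rate of change is given by
$\frac{\delta\vec{a}}{\delta t} = \frac{\delta\vec{\phi}}{\delta t}\times\vec{a}$ or
$\left.\frac{d\vec{a}}{dt}\right|_I = \vec{\omega}\times\vec{a}$. The index $I$ is a reminder that this is a change with
respect to an inertial frame of reference. If $\vec{a}$ itself has an additional time dependence,
i.e. $\left.\frac{d\vec{a}}{dt}\right|_R \neq 0$, within the rotating reference frame $R$, then
\[
\left.\frac{d\vec{a}}{dt}\right|_I = \left.\frac{d\vec{a}}{dt}\right|_R + \vec{\omega}\times\vec{a} . \tag{4.48}
\]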
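The combination of the three equations (4.45)–(4.47) yields
\[
\begin{aligned}
0 &= \sum_i\left(\dot{\vec{p}}_i\cdot\left(\delta\vec{\phi}\times\vec{r}_i\right) + \vec{p}_i\cdot\left(\delta\vec{\phi}\times\vec{v}_i\right)\right)\\
&= \delta\vec{\phi}\cdot\sum_i\left(\vec{r}_i\times\dot{\vec{p}}_i + \vec{v}_i\times\vec{p}_i\right)\\
&= \delta\vec{\phi}\cdot\frac{d}{dt}\sum_i \vec{r}_i\times\vec{p}_i .
\end{aligned}
\]
The quantity
\[
\vec{L} = \sum_i \vec{r}_i\times\vec{p}_i \tag{4.49}
\]
is the total angular momentum of our system and thus
\[
\frac{d\vec{L}}{dt} = 0 . \tag{4.50}
\]
Via
\[
\vec{L}_i = \vec{r}_i\times\vec{p}_i \tag{4.51}
\]
we define the angular momenta of the individual point masses with respect to the
axis of rotation.
Notice the important specification "with respect to the axis of rotation". Replacement
of $\vec{r}_i$ by $\vec{r}'_i = \vec{r}_i + \vec{a}$, where $\vec{a}$ is an arbitrary translation, yields
\[
\vec{L}'_i = \vec{r}'_i\times\vec{p}_i = \vec{L}_i + \vec{a}\times\vec{p}_i \tag{4.52}
\]
or
\[
\vec{L}' = \vec{L} + \vec{a}\times\vec{P} . \tag{4.53}
\]
Only if the total momentum of the system vanishes is $\vec{L}$ not affected by a shift of
the axis of rotation.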
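Relation (4.53) can be confirmed with a few lines of code. The Python sketch below uses three point masses with arbitrary example positions and momenta (assumptions made for illustration) and compares the shifted angular momentum with L + a × P.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def total_angular_momentum(positions, momenta):
    """Total angular momentum, the sum of r_i x p_i as in (4.49)."""
    total = (0.0, 0.0, 0.0)
    for r, p in zip(positions, momenta):
        total = add(total, cross(r, p))
    return total

# Arbitrary example data for three point masses
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0)]
momenta = [(0.0, 1.0, 0.0), (2.0, 0.0, -1.0), (1.0, 1.0, 1.0)]
a = (0.5, -2.0, 1.5)   # translation of the reference point

P = (0.0, 0.0, 0.0)    # total momentum
for p in momenta:
    P = add(P, p)

L_old = total_angular_momentum(positions, momenta)
L_new = total_angular_momentum([add(r, a) for r in positions], momenta)
print(L_new, add(L_old, cross(a, P)))   # both vectors should agree
```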
Remark 1: Notice that
\[
L_z = \sum_i \frac{\partial\mathcal{L}}{\partial\dot{\phi}_i} \tag{4.54}
\]
is the projection of the angular momentum onto a $z$-axis, where $\phi_i$ are angles of
rotation with respect to this axis. Proof: In cylindrical coordinates $x_i = \rho_i\cos\phi_i$,
$y_i = \rho_i\sin\phi_i$ and thus $L_z = \sum_i m_i\,(x_i\dot{y}_i - y_i\dot{x}_i) = \sum_i m_i\rho_i^2\dot{\phi}_i$. On the other hand
$\mathcal{L} = \frac{1}{2}\sum_i m_i\,(\dot{\rho}_i^2 + \rho_i^2\dot{\phi}_i^2 + \dot{z}_i^2) - U$ (cf. (4.16)).
Remark 2: Consider a system possessing a symmetry axis. The projection of the
system’s angular momentum onto this axis ($z$-axis) is time independent, i.e. it is a
conserved quantity (Why?$^{5}$).
Remark 3: How is the system angular momentum in one inertial frame of reference
related to the same quantity in a second inertial frame of reference? We have
\[
\vec{L} = \vec{L}' + m\,(\vec{R}\times\vec{w}) , \tag{4.55}
\]
where $m = \sum_i m_i$ and $\vec{w} = \vec{v}_i - \vec{v}'_i$. In addition,
\[
\vec{R} = \frac{\sum_i m_i\vec{r}_i}{\sum_i m_i} \tag{4.56}
\]
is the center of mass of the system. Justification:
$\vec{L} = \sum_i m_i\,(\vec{r}_i\times\vec{v}_i) = \sum_i m_i\,\vec{r}_i\times\vec{v}'_i + \sum_i m_i\,(\vec{r}_i\times\vec{w})$.
Equation (4.55) can also be written as
\[
\vec{L} = \vec{L}' + \vec{R}\times\vec{P} , \tag{4.57}
\]
i.e. the angular momentum $\vec{L}$ is the sum of $\vec{L}'$, the angular momentum in the rest
frame of the center of mass, plus the angular momentum of the center of mass itself.
More Conserved Quantities
The above conserved quantities are of special importance, because they are inti-
mately tied to the symmetries of time and space. However, there are other conserved
quantities, whose significance is less obvious.
Why are there additional conserved quantities and what is their meaning? We
consider the solution to the equations of motion in an isolated system. The latter is
characterized by $j = 1, 2, \ldots, s$ coordinates, $q_j$, and attendant momenta, $p_j$. These
solutions do contain $2s$ constants (initial conditions) $C_i$. One of these constants is the
time origin $t_o$. The solutions of the equations of motion therefore can be expressed
as
\[
q_j = q_j\left(t + t_o, C_1, C_2, \ldots, C_{2s-1}\right) \tag{4.58}
\]
and
\[
\dot{q}_j = \dot{q}_j\left(t + t_o, C_1, C_2, \ldots, C_{2s-1}\right) , \tag{4.59}
\]
i.e. there are $2s - 1$ remaining conserved quantities also called constants of motion.
We consider a brief example - the one-dimensional oscillator (cf. Sect. 3.2; here
$q = x - x_0$ is the displacement from the rest position). We have
\[
\dot{q} = \frac{p}{m} \quad\text{and}\quad q = -\frac{\dot{p}}{k} \quad\text{or}\quad \ddot{q} = -\frac{k}{m}\,q ,
\]
where $m$ is the mass and $k$ is the spring constant. The general solution is
\[
q = q_0 \sin\left(\sqrt{\frac{k}{m}}\,(t + t_0)\right)
\]
(cf. Sect. 2.2; notice that the mathematical pendulum in the limit of small amplitude
is equivalent to the present oscillator problem). Here $s = 1$ and $q_0$ is the only constant
of motion corresponding to the total energy $E$, i.e. from $E = \frac{1}{2} m\dot{q}^2 + \frac{1}{2} k q^2$ follows
$q_0 = \sqrt{2E/k}$.
Next we study two coupled one-dimensional oscillators (hook-spring-mass-spring-mass).
This time the solution is
\[
q_i = q_{i,0} \sin\left(\omega\,(t + t_{i,0})\right) ,
\]
where $i = 1, 2$ and $\omega = \sqrt{\frac{1}{2}\left(3 \pm \sqrt{5}\right)\frac{k}{m}}$. If we identify $t_{1,0}$ with $t_0$, then three
constants of motion remain ($2s - 1 = 3$). Only one can be replaced by $E$ using again
the conservation of total energy,
$E = \frac{m}{2}\left(\dot{q}_1^2 + \dot{q}_2^2\right) + \frac{k}{2}\left(q_1^2 + (q_1 - q_2)^2\right) = \text{const}$.
5 In this case $\mathcal{L}$ does not depend on $\phi_i$ and application of (4.2) in conjunction with (4.54)
proves the claim.
In Chap. 9 we shall discuss many particle systems and also return to the role of
the constants of motion in these systems.
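The two frequencies of the coupled oscillators can be checked directly. The following Python sketch is an illustration only; it assumes the equations of motion that follow from the energy expression given above and arbitrary example values for k and m.

```python
import math

k, m = 3.0, 2.0   # arbitrary example values

# Equations of motion: m q1'' = -k q1 - k (q1 - q2),  m q2'' = -k (q2 - q1).
# The ansatz q_i = A_i sin(omega t) leads to u^2 - 3 u + 1 = 0 with u = m omega^2 / k.
modes = []
for sign in (+1.0, -1.0):
    u = 0.5 * (3.0 + sign * math.sqrt(5.0))
    omega = math.sqrt(u * k / m)
    a1 = 1.0
    a2 = (2.0 - u) * a1            # amplitude ratio from the first equation
    # residuals of both equations of motion, divided by sin(omega t)
    res1 = -m * omega**2 * a1 + k * a1 + k * (a1 - a2)
    res2 = -m * omega**2 * a2 + k * (a2 - a1)
    modes.append((omega, a2 / a1, res1, res2))
    print(modes[-1])
```

Both residuals vanish for each sign, i.e. both frequencies belong to solutions of the coupled equations; in the slow mode the masses move in phase, in the fast mode in opposite directions.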
4.3 Lagrangians in Accelerated Systems
We compare the motion of a particle in the following three reference frames: the
inertial frame $K$ (rest frame); a second reference frame $K'$ moving with the translational
velocity $\vec{w}(t)$ relative to $K$; the third reference frame, $K_R$, whose origin
coincides with the origin of $K'$, rotates with an angular velocity $\vec{\omega}(t)$. The
Lagrangians in these frames of reference are $\mathcal{L}$, $\mathcal{L}'$, and $\mathcal{L}_R$, respectively.
$K$-frame: If $\vec{v}$ is the velocity of a particle in the $K$-frame, then
\[
\mathcal{L} = \frac{m}{2}\,v^2 - U .
\]
$K'$-frame: With $\vec{v} = \vec{v}' + \vec{w}(t)$ follows
\[
\mathcal{L}' = \frac{m}{2}\,v'^2 + m\,\vec{v}'\cdot\vec{w} + \frac{m}{2}\,w^2 - U .
\]
The term $m w^2/2$ may be omitted. According to the remark following (4.2) this does
not alter the equation of motion, because $w(t)^2$ can be expressed as the derivative
with respect to time of a function $f(t)$. The same reasoning applies in the case of
\[
m\,\vec{v}'\cdot\vec{w} = \frac{d}{dt}\left(m\,\vec{r}'\cdot\vec{w}\right) - m\,\vec{r}'\cdot\frac{d\vec{w}}{dt} ,
\]
i.e. $d/dt\,(\ldots)$ can be omitted as well. We obtain
\[
\mathcal{L}' = \frac{m v'^2}{2} - m\,\vec{r}'\cdot\dot{\vec{w}} - U \tag{4.60}
\]
and thus the equation of motion
\[
m\,\dot{\vec{v}}' = -\frac{\partial U}{\partial\vec{r}'} - m\,\dot{\vec{w}} . \tag{4.61}
\]
Notice that changing reference frames from $K$ to $K'$ is equivalent to introducing a
homogeneous force field $-m\,\dot{\vec{w}}$.
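$K_R$-frame: According to (4.48) we have
\[
\vec{v}' = \vec{v}_R + \vec{\omega}\times\vec{r}_R . \tag{4.62}
\]
Inserting this into (4.60) yields
\[
\mathcal{L}_R = \frac{m}{2}\,v_R^2 + m\,\vec{v}_R\cdot\left(\vec{\omega}\times\vec{r}_R\right) + \frac{m}{2}\left(\vec{\omega}\times\vec{r}_R\right)^2 - m\,\dot{\vec{w}}\cdot\vec{r}_R - U . \tag{4.63}
\]
In order to obtain the derivatives $\partial\mathcal{L}_R/\partial\vec{v}_R$ and $\partial\mathcal{L}_R/\partial\vec{r}_R$ we calculate $d\mathcal{L}_R$:
\[
\begin{aligned}
d\mathcal{L}_R &= m\,\vec{v}_R\cdot d\vec{v}_R + m\left(\vec{\omega}\times\vec{r}_R\right)\cdot d\vec{v}_R + m\,\vec{v}_R\cdot\left(\vec{\omega}\times d\vec{r}_R\right)\\
&\quad + m\left(\vec{\omega}\times\vec{r}_R\right)\cdot\left(\vec{\omega}\times d\vec{r}_R\right) - m\,\dot{\vec{w}}\cdot d\vec{r}_R - \frac{\partial U}{\partial\vec{r}_R}\cdot d\vec{r}_R\\
&= m\,\vec{v}_R\cdot d\vec{v}_R + m\left(\vec{\omega}\times\vec{r}_R\right)\cdot d\vec{v}_R + m\left(\vec{v}_R\times\vec{\omega}\right)\cdot d\vec{r}_R\\
&\quad + m\left(\left(\vec{\omega}\times\vec{r}_R\right)\times\vec{\omega}\right)\cdot d\vec{r}_R - m\,\dot{\vec{w}}\cdot d\vec{r}_R - \frac{\partial U}{\partial\vec{r}_R}\cdot d\vec{r}_R .
\end{aligned}
\]
Hence
\[
\frac{\partial\mathcal{L}_R}{\partial\vec{v}_R} = m\,\vec{v}_R + m\,\vec{\omega}\times\vec{r}_R
\]
and
\[
\frac{\partial\mathcal{L}_R}{\partial\vec{r}_R} = m\,\vec{v}_R\times\vec{\omega} + m\left(\vec{\omega}\times\vec{r}_R\right)\times\vec{\omega} - m\,\dot{\vec{w}} - \frac{\partial U}{\partial\vec{r}_R} .
\]
Inserting this into the Euler–Lagrange equation yields the equation of motion within
the rotating reference frame:
\[
\dot{\vec{v}}_R = -\frac{1}{m}\frac{\partial U}{\partial\vec{r}_R} - \dot{\vec{w}} + \left(\vec{r}_R\times\dot{\vec{\omega}}\right) + 2\left(\vec{v}_R\times\vec{\omega}\right) + \vec{\omega}\times\left(\vec{r}_R\times\vec{\omega}\right) . \tag{4.64}
\]
Notice that the last two terms are present also in case of a uniform rotation. After
multiplication by $m$ they are called Coriolis force$^{6}$ and centrifugal force, respectively.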
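The last two terms of (4.64) can be checked with a free particle. The Python sketch below is an illustration under assumptions: U = 0, no translational acceleration, a constant rotation around the z-axis, arbitrary numerical values, and derivatives approximated by central finite differences. A straight line in K is expressed in the rotating frame, and its acceleration there is compared with the sum of the Coriolis and centrifugal terms.

```python
import math

OMEGA = (0.0, 0.0, 0.7)   # constant angular velocity along z (example value)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def r_rot(t):
    """Position of a free particle, expressed in the axes of the rotating frame."""
    x, y, z = 1.0 + 0.3 * t, -0.5 + 0.2 * t, 0.1 * t   # straight line in K
    c, s = math.cos(OMEGA[2] * t), math.sin(OMEGA[2] * t)
    return (c * x + s * y, -s * x + c * y, z)

def derivative(func, t, h=1e-4):
    """Central finite-difference derivative of a vector-valued function."""
    return tuple((a - b) / (2 * h) for a, b in zip(func(t + h), func(t - h)))

def v_rot(t):
    return derivative(r_rot, t)

t = 1.3
acc_numeric = derivative(v_rot, t)
coriolis = tuple(2.0 * c for c in cross(v_rot(t), OMEGA))     # 2 (v_R x omega)
centrifugal = cross(OMEGA, cross(r_rot(t), OMEGA))            # omega x (r_R x omega)
acc_formula = tuple(a + b for a, b in zip(coriolis, centrifugal))
print(acc_numeric)
print(acc_formula)
```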
• Example - Rotating Pendulum: The following sketch shows the mass $m$,
rotating on a thread of length $l$ in the earth’s gravitational field. The constant
angular velocity is $\vec{\omega}$. We want to calculate the angle $\alpha$.
[Sketch: rotating pendulum, mass $m$ on a thread of length $l$; the weight $mg$ is indicated.]
Several terms in (4.64) vanish from the point of view of a co-rotating
observer, who finds
\[
0 = -\frac{\partial U}{\partial\vec{r}_R} + m\,\vec{\omega}\times\left(\vec{r}_R\times\vec{\omega}\right)
\]
(the prime is omitted here). Thus
\[
0 = \vec{T} + m\vec{g} + m\,\vec{\omega}\times\left(\vec{r}_R\times\vec{\omega}\right) , \tag{4.65}
\]
where $\vec{T}$ is the tension in the thread, or
\[
0 = -T\,\vec{e}_\parallel + mg\sin\alpha\;\vec{e}_\perp + mg\cos\alpha\;\vec{e}_\parallel
 + m\omega^2 l\left(\sin^2\alpha\;\vec{e}_\parallel - \sin\alpha\cos\alpha\;\vec{e}_\perp\right) .
\]
Here we employ the same projections along the basis vectors $\vec{e}_\parallel$ and $\vec{e}_\perp$ as
before in Fig. 2.1. Comparing the coefficients of $\vec{e}_\perp$ yields
\[
0 = mg\sin\alpha - m\omega^2 l\sin\alpha\cos\alpha , \tag{4.66}
\]
i.e.
\[
\alpha = \arccos\left(\frac{g}{l\omega^2}\right) \tag{4.67}
\]
for $\omega^2 \geq g/l$ (cf. the following figure).
[Figure: $\arccos\left(g/(l\omega^2)\right)$ plotted versus $g/(l\omega^2)$.]
In the limit $\omega \to \infty$ we obtain $\alpha = \pi/2$ as expected. In the limit $\omega = 0$, on
the other hand, we must consider (4.66), which immediately yields $\sin\alpha = 0$
or $\alpha = 0$. Notice that this solution of (4.66) holds for all $\omega^2 < g/l$, because then only
$\alpha = 0$ yields a stable equilibrium. The latter means that the pendulum, in response
to a small perturbation, oscillates around the solution $\alpha = 0$. If $\omega^2 > g/l$
however, this solution is no longer stable. A small perturbation is sufficient to
drive the pendulum towards the new stable solution (4.67). A general analysis
of stability is discussed in Sect. 4.4.
6 Coriolis, Gaspard Gustave de *1792, †1843; French physicist.
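The two branches of the equilibrium angle are conveniently combined in a small function. The following Python sketch is an illustration only; the function name and the numerical values are assumptions, and the residual of (4.66), divided by m, serves as a check.

```python
import math

def equilibrium_angle(omega, l, g=9.81):
    """Stable equilibrium angle alpha of the rotating pendulum, cf. (4.66), (4.67)."""
    x = g / (l * omega**2) if omega != 0.0 else math.inf
    if x >= 1.0:
        return 0.0              # omega^2 <= g/l: only alpha = 0 is stable
    return math.acos(x)         # omega^2 > g/l: alpha = arccos(g/(l omega^2))

l = 1.0
results = []
for omega in (0.0, 2.0, math.sqrt(9.81), 4.0, 10.0, 1000.0):
    alpha = equilibrium_angle(omega, l)
    # residual of (4.66) divided by m
    residual = 9.81 * math.sin(alpha) - omega**2 * l * math.sin(alpha) * math.cos(alpha)
    results.append((omega, alpha, residual))
    print(omega, alpha, residual)
```

For slow rotation the angle remains zero; beyond ω² = g/l it rises continuously and approaches π/2 for very fast rotation.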
• Example - Rotating Double-Pendulum: The sketch shows the rotating pen-

dulum of the previous example, which now includes a second mass suspended
by a thread of length d attached to the first mass. The two masses, m, are
identical. The orientation of the second thread is characterized by the angle β
relative to an axis defined by the first thread.
Analogous to (4.65) we now have the following two equations:
0 = T + T + m g + m ω
× (
r1 × ω)
(4.68)
and
0 = −T + m g + m ω
× (
r2 × ω)
. (4.69)
Here T is the tension in the second thread and ri is the position of mass i, when
the suspension point of the pendulum is the origin. The approach is the same
as before. We obtain four equations corresponding to the projections along the
above orthogonal unit vectors. The unknown quantities are the two tensions
and the angles α and β.
We simplify matters by concentrating on small α. In this limit we have
\begin{align*}
-t + t' \cos\beta + x &\approx 0 \\
-t' \sin\beta + \alpha (x - 1) &\approx 0 \\
-t'\,\frac{1}{y} \cos\beta + \frac{x}{y} + \alpha \sin\beta &\approx 0 \\
(t' - y) \sin\beta + \alpha (x - 1 - y \cos\beta) &\approx 0,
\end{align*}
where $t = T/(ml\omega^2)$, $t' = T'/(ml\omega^2)$, $x = g/(l\omega^2)$, and $y = d/l$. The solution for β is
\[ \beta \approx -\arccos\left(\frac{1 - 2x - \sqrt{1 - 12x(1 - x) + 4y^2}}{2y}\right). \tag{4.70} \]
The second sketch shows β as function of $x = g/(l\omega^2)$ for $y = d/l = 0.05$ (dashed line) as well as for $y = 0.1$ (solid line).
The meaning of negative β-values is illustrated in the third sketch. This solution
only exists below a d/l-dependent value of g/(lω 2 ) (i.e. sufficiently large
rotation frequencies), where the magnitude of β increases with increasing
rotation frequency.
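The solution (4.70) can be tested by inserting it back into the four small-α equations. The sketch below is an illustration only: the function names are hypothetical, and the closed expressions used for t', α and t are obtained here by eliminating unknowns from the four equations, which is an assumption about the intermediate steps rather than the author's stated route.

```python
import math

def double_pendulum_small_alpha(x, y):
    """Return (beta, alpha, t, t_prime) for x = g/(l w^2), y = d/l."""
    disc = 1.0 - 12.0 * x * (1.0 - x) + 4.0 * y * y
    if disc < 0.0:
        raise ValueError("no real solution for these x, y")
    c = (1.0 - 2.0 * x - math.sqrt(disc)) / (2.0 * y)    # cos(beta), Eq. (4.70)
    if abs(c) > 1.0:
        raise ValueError("solution does not exist (|cos beta| > 1)")
    beta = -math.acos(c)
    s = math.sin(beta)
    t_prime = y * (x - 1.0) / (2.0 * (x - 1.0) - y * c)  # from equations 2 and 4
    alpha = t_prime * s / (x - 1.0)                       # from equation 2
    t = t_prime * c + x                                   # from equation 1
    return beta, alpha, t, t_prime

def residuals(x, y):
    """Left-hand sides of the four small-alpha equations."""
    beta, alpha, t, tp = double_pendulum_small_alpha(x, y)
    c, s = math.cos(beta), math.sin(beta)
    return (-t + tp * c + x,
            -tp * s + alpha * (x - 1.0),
            -tp * c / y + x / y + alpha * s,
            (tp - y) * s + alpha * (x - 1.0 - y * c))

if __name__ == "__main__":
    print(double_pendulum_small_alpha(0.02, 0.1))
    print(residuals(0.02, 0.1))
```

For y = 0.1 the function raises an error once x exceeds a value close to 0.05, which reflects the statement that the solution exists only for sufficiently large rotation frequencies.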
• Problem 16 - Rotating Disk: The sketch shows a disk of radius R, rotating with a constant angular velocity ω. A point mass m can slide without friction on the disk. The mass is attached to the middle of a spring, which in turn is attached to the center of the disk as well as to its rim. The spring constant is k. Write down a general formula for $r_R(\omega)$ based on (4.64).
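Solution: In equilibrium we have
\[ \dot{\vec v}_R = \dot{\vec\omega} = \vec v_R = 0. \tag{4.71} \]
This means that (4.64) reduces to
\[ 0 = -\frac{\partial U}{\partial \vec r_R} + m\,\bigl(\vec\omega \times (\vec r_R \times \vec\omega)\bigr), \tag{4.72} \]
where we have omitted the primes. Using
\[ U = 2\,\frac{k}{2}\left(r_R(\omega) - \frac{R}{2}\right)^2 \tag{4.73} \]
yields
\[ -\frac{\partial U}{\partial \vec r_R} = -2k\left(r_R(\omega) - \frac{R}{2}\right)\frac{\vec r_R}{r_R}. \tag{4.74} \]
Thus
\[ 0 = -2k\left(r_R(\omega) - \frac{R}{2}\right) + m\omega^2 r_R(\omega), \tag{4.75} \]
i.e.
\[ r_R = \left(1 - \frac{m\omega^2}{2k}\right)^{-1}\frac{R}{2}. \tag{4.76} \]
In the limit ω = 0 we obtain $r_R = R/2$, as expected, whereas for $\omega = \sqrt{k/m}$ the mass reaches the radius of the disk, i.e. $r_R = R$.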
• Problem 17 - Freely Falling Mass: The center of a spherical mass distribution with radius R coincides with the origin of an inertial frame (I) as shown in the sketch. The sphere rotates around the z-axis of this coordinate system with the angular velocity $\vec\omega(t)$. The origin of a second coordinate system (R), which is attached to the sphere, is located at the position $\vec R$ on the surface. The $z_R$-axis of this coordinate system is parallel to $\vec R$. Its $x_R$-axis is parallel to $\dot{\vec R}$.
[Sketch: inertial axes x, y, z; the vector $\vec R$ from the origin to the surface; the attached axes $x_R$, $y_R$, $z_R$; and the position $\vec r_R$ of the point mass.]
(a) We consider a point mass at $\vec R + \vec r_R$, falling freely in the gravitational field of the mass distribution (earth!). What is the position vector, $\vec r_R(t)$, of the point mass in the R-system?
(b) We assume that the point mass is released from an initial height h = 400 m above the earth’s surface, i.e. $\vec r_R(0) = (0, 0, h)$. Its initial velocity is v = 100 m/s along the $y_R$-axis, i.e. $\dot{\vec r}_R(0) = (0, v, 0)$. What is the displacement of its point of contact on the surface of the earth due to the rotation? Assume that the angle between the $z_R$-axis and the z-axis is θ = 39°.
Solution: (a) Again we use (4.64) as our starting point. In the present case $\vec w = 0$ and therefore $\dot{\vec w} = 0$. Likewise $\dot{\vec\omega} = 0$. In addition we must replace $\vec r_R$ in (4.64) by $\vec R + \vec r_R$. Because $\vec R$ is a constant vector in the primed coordinate system, we have $\dot{\vec R} = 0$, i.e. $\vec v_R = \dot{\vec r}_R$ as well as $\dot{\vec v}_R = \ddot{\vec r}_R$. Here and in the following the time derivatives refer to the rotating system. Thus (4.64) becomes
\[ \ddot{\vec r}_R = \vec g_s + 2\dot{\vec r}_R \times \vec\omega + \vec\omega \times \bigl((\vec R + \vec r_R) \times \vec\omega\bigr). \tag{4.77} \]
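Using $\vec g_s$ implies that we remain close to the earth’s surface. The meaning of the index s will be explained shortly. Now we compare the relative importance of the various terms:
\[ g_s \approx 10\ \mathrm{m/s^2}, \quad \dot r_R \approx 10^2\ \mathrm{m/s}, \quad \omega \approx \frac{2\pi}{24\,\mathrm{h}} \approx 10^{-4}\ \mathrm{s^{-1}}, \quad R \approx 10^7\ \mathrm{m}, \quad r_R \approx 10^3\ \mathrm{m}, \]
i.e.
\[ \ddot{\vec r}_R = \underbrace{\underbrace{\vec g_s}_{\approx 10} + \underbrace{\vec\omega \times (\vec R \times \vec\omega)}_{\approx 10^{-1}}}_{= \vec g} + \underbrace{2\dot{\vec r}_R \times \vec\omega}_{\approx 10^{-2}} + \underbrace{\vec\omega \times (\vec r_R \times \vec\omega)}_{\approx 10^{-5}\ \mathrm{m/s^2}}. \]
We do notice that the local value of g is subject to a small correction due to the (rotation) term $\vec\omega \times (\vec R \times \vec\omega)$. Notice also that the estimate of $\dot r_R$ is based on the terminal velocity of a freely falling body when the initial height is 1000 m. Based on these considerations we neglect the last term and obtain
\[ \ddot{\vec r}_R \approx \vec g + 2\dot{\vec r}_R \times \vec\omega. \tag{4.78} \]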
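Integration with respect to time yields
\[ \dot{\vec r}_R \approx \vec g\,t + 2\vec r_R \times \vec\omega + \vec c_o. \tag{4.79} \]
Next we insert the trial solution
\[ \vec r_R = \frac{1}{2}\vec g\,t^2 + \vec c_o t + \delta\vec r(t), \tag{4.80} \]
which leads to
\[ \delta\dot{\vec r} = \vec g\,t^2 \times \vec\omega + 2\vec c_o t \times \vec\omega + 2\,\delta\vec r \times \vec\omega. \tag{4.81} \]
Now we omit the last term, because it is small. Finally, the integration with respect to time yields
\[ \vec r_R \approx \frac{1}{2}\vec g\,t^2 + \vec c_o t + \left(\frac{1}{3}\vec g\,t^3 + \vec c_o t^2\right) \times \vec\omega + \vec c_1. \tag{4.82} \]
The two constant vectors, $\vec c_o$ and $\vec c_1$, are the velocity and the position of the falling mass at time t = 0, respectively.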
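(b) We start by expressing (4.82) in terms of its components:
\begin{align*}
x_R(t) &\approx (\dots)_{y_R}\,\omega_{z_R} - (\dots)_{z_R}\,\omega_{y_R} \\
y_R(t) &\approx vt + (\dots)_{z_R}\,\omega_{x_R} - (\dots)_{x_R}\,\omega_{z_R} \\
z_R(t) &\approx -\frac{1}{2}gt^2 + (\dots)_{x_R}\,\omega_{y_R} - (\dots)_{y_R}\,\omega_{x_R} + h.
\end{align*}
Here $(\dots)$ stands for $(1/3)\,\vec g\,t^3 + \vec c_o t^2$. In particular
\[ (\dots)_{x_R} = 0, \qquad (\dots)_{y_R} = vt^2, \qquad (\dots)_{z_R} = -\frac{1}{3}gt^3 \]
and
\[ \omega_{x_R} = 0, \qquad \omega_{y_R} = \omega\sin\theta, \qquad \omega_{z_R} = \omega\cos\theta. \]
The desired displacement of the point of contact due to the earth’s rotation is
\begin{align*}
\Delta x_R(t_o) &\approx v t_o^2\,\omega\cos\theta + \frac{1}{3}g t_o^3\,\omega\sin\theta \approx 0.57\ \mathrm{m} \\
\Delta y_R(t_o) &\approx 0\ \mathrm{m},
\end{align*}
where the time for the mass to reach the ground, $t_o \approx \sqrt{2h/g} \approx 9.03$ s, follows via $z_R(t_o) = 0$.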
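The numbers quoted in part (b) can be reproduced with a few lines. This is a minimal sketch; the function name is hypothetical, and g = 9.81 m/s² together with a rotation period of 86400 s are assumed values that the text itself does not state explicitly.

```python
import math

def rotation_displacement(h, v, theta_deg, g=9.81, omega=2.0 * math.pi / 86400.0):
    """Fall time t_o and x_R-displacement due to the rotation, from Eq. (4.82)."""
    t_o = math.sqrt(2.0 * h / g)                  # z_R(t_o) = 0 to leading order
    theta = math.radians(theta_deg)
    dx = (v * t_o**2 * omega * math.cos(theta)
          + g * t_o**3 * omega * math.sin(theta) / 3.0)
    return t_o, dx

if __name__ == "__main__":
    t_o, dx = rotation_displacement(h=400.0, v=100.0, theta_deg=39.0)
    print(t_o, dx)
```

The first contribution stems from the initial velocity along the $y_R$-axis, the second from the vertical fall itself.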
• Advanced Example - Foucault’s Pendulum: Léon Foucault was a French physicist, *Paris 18.9.1819, †Paris 11.2.1868, who, among other important experiments, carried out the one discussed in this example, which studies the motion of the mathematical pendulum within a rotating frame of reference. The overall setup is illustrated in the sketch. Here ...
[Sketch: the earth with the axes $x'_R$, $y'_R$, $z'_R$ at the latitude of the pendulum, the vectors $\vec R$ and $\vec P$, and the angle $\theta_E$.]
• ... $\vec R$ is a vector from the center of the earth to the suspension point of the pendulum.
• ... $\vec P$ is a vector of length l from the suspension point to the point mass.
• ... α is the angle between the aforementioned vectors.
• ... ϕ is the angle between the projection of $\vec P$ onto the $x_R$-$y_R$-plane and the $x_R$-axis.
• ... $\theta_E$ is the angle between the earth’s axis of rotation and $\vec R$.
The motion of a point mass relative to the $x_R$-$y_R$-$z_R$-system is described by (4.64). Notice that in the sketch we have moved the origin of this coordinate system to the surface of the earth. Its real origin, however, is the center of the earth. Notice also that we do not need the prime in this example, which therefore will be omitted from here on.
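In the present case $\dot{\vec w} = \dot{\vec\omega} = 0$, i.e.
\[ m\dot{\vec v}_R = \vec F_g + \vec T + 2m\,\vec v_R \times \vec\omega + m\,\vec\omega \times (\vec r_R \times \vec\omega). \]
In addition
\[ \vec r_R = \vec R + \vec P = \vec R + l\,\vec e_\parallel, \]
where
\[ \vec R = R\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \quad\text{and}\quad \vec e_\parallel = \begin{pmatrix} \cos\varphi\sin\alpha \\ \sin\varphi\sin\alpha \\ -\cos\alpha \end{pmatrix}. \]
The unit vectors $\vec e_\parallel$ and $\vec e_\perp$, introduced below, have the same meaning as in Fig. 2.1. The velocity of the point mass is
\[ \vec v_R = \dot{\vec r}_R = l\,\dot{\vec e}_\parallel = l\left(\dot\varphi\sin\alpha\,\vec e_{\dot\varphi} - \dot\alpha\,\vec e_\perp\right), \]
where
\[ \vec e_\perp = -\begin{pmatrix} \cos\varphi\cos\alpha \\ \sin\varphi\cos\alpha \\ \sin\alpha \end{pmatrix}. \]
The acceleration of the point mass is given by $\dot{\vec v}_R = l\,\ddot{\vec e}_\parallel$, i.e.
\[ \dot{\vec v}_R = l\left[(\ddot\varphi\sin\alpha + 2\dot\alpha\dot\varphi\cos\alpha)\,\vec e_{\dot\varphi} - \dot\varphi^2\sin\alpha\,\vec e_\varphi - \ddot\alpha\,\vec e_\perp - \dot\alpha^2\,\vec e_\parallel\right]. \tag{4.83} \]
Here and above
\[ \vec e_{\dot\varphi} = \begin{pmatrix} -\sin\varphi \\ \cos\varphi \\ 0 \end{pmatrix} \quad\text{and}\quad \vec e_\varphi = \begin{pmatrix} \cos\varphi \\ \sin\varphi \\ 0 \end{pmatrix}. \]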
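Now we consider the forces acting on the pendulum. First of all there is gravitation and tension, i.e.
\[ \vec F_g + \vec T \approx mg\cos\alpha\,\vec e_\parallel + mg\sin\alpha\,\vec e_\perp - T\,\vec e_\parallel. \tag{4.84} \]
Notice that we model the earth as a perfectly homogeneous sphere. Next are the contributions due to the Coriolis force, i.e.
\begin{align*}
2m\,\vec v_R \times \vec\omega &= 2ml\,\dot{\vec e}_\parallel \times \omega\begin{pmatrix} 0 \\ \sin\theta_E \\ \cos\theta_E \end{pmatrix} \\
&= 2m\omega l\left[\dot\varphi\sin\alpha\begin{pmatrix} \cos\varphi\cos\theta_E \\ \sin\varphi\cos\theta_E \\ -\sin\varphi\sin\theta_E \end{pmatrix} + \dot\alpha\begin{pmatrix} \sin\varphi\cos\alpha\cos\theta_E - \sin\alpha\sin\theta_E \\ -\cos\varphi\cos\alpha\cos\theta_E \\ \cos\varphi\cos\alpha\sin\theta_E \end{pmatrix}\right], \tag{4.85}
\end{align*}
and the centrifugal force, i.e.
\[ m\,\vec\omega \times (\vec r_R \times \vec\omega) = m\omega^2 l\begin{pmatrix} \cos\varphi\sin\alpha \\ \kappa\cos\theta_E \\ -\kappa\sin\theta_E \end{pmatrix}. \tag{4.86} \]
The quantity κ is given by $\kappa = \sin\varphi\sin\alpha\cos\theta_E - (R/l - \cos\alpha)\sin\theta_E$.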
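At this point we are able to write down the final equation of motion for the pendulum. However, first we note that the oscillation of the pendulum, expressed via α(t), is fast compared to the rotation of the plane of the pendulum, expressed via ϕ(t), i.e.
\[ \dot\alpha \sim \omega_0 = \sqrt{g/l} \quad\text{or}\quad \ddot\alpha \sim \omega_0^2 = g/l \]
compared to
\[ \dot\varphi \sim \omega = 2\pi/(24\,\mathrm{h}) \quad\text{or}\quad \ddot\varphi \sim \omega^2. \]
Using l = 1 m we obtain
\[ \frac{\omega}{\omega_0} \approx 2\cdot 10^{-5}. \]
In addition $R \gg l$, i.e. $R \pm l \approx R$, and therefore
\[ \frac{\omega^2 R}{\omega_0^2\, l} \approx 3\cdot 10^{-3}. \]
Comparison of $m\dot{\vec v}_R$ (4.83) to the contributions (4.84)–(4.86) yields
\[ -\ddot\alpha\,\vec e_\perp - \dot\alpha^2\,\vec e_\parallel \approx \omega_0^2\cos\alpha\,\vec e_\parallel + \omega_0^2\sin\alpha\,\vec e_\perp - \frac{T}{ml}\,\vec e_\parallel. \]
This is the equation of motion for the case of the mathematical pendulum, which we do know already (cf. Sect. 2.2). Thus
\begin{align*}
\ddot\alpha &\approx -\omega_0^2\sin\alpha \approx -\omega_0^2\alpha \quad (\alpha\ \text{small}) \\
\dot\alpha^2 &\approx -\omega_0^2\cos\alpha + \frac{T}{ml} \approx -\omega_0^2 + \frac{T}{ml}.
\end{align*}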
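As far as ϕ(t) is concerned, we are interested in the $x_R$-$y_R$-plane only. Assuming sin α ≈ α and cos α ≈ 1 (small amplitude), we find
\begin{align*}
(\ddot\varphi\alpha + 2\dot\varphi\dot\alpha + 2\omega\dot\alpha\cos\theta_E)\begin{pmatrix} -\sin\varphi \\ \cos\varphi \end{pmatrix}
&\approx \omega^2\begin{pmatrix} \alpha\cos\varphi \\ -(R/l)\cos\theta_E\sin\theta_E \end{pmatrix}
+ \left(\dot\varphi^2\alpha + 2\omega\dot\varphi\alpha\cos\theta_E\right)\begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix} \\
&\quad - 2\omega\alpha\dot\alpha\begin{pmatrix} -\sin\theta_E \\ 0 \end{pmatrix},
\end{align*}
where we have used
\[ 0 \approx \underbrace{\frac{g}{l}\alpha\begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix} - \frac{g}{l}\alpha\begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix}}_{=0} - \left(\frac{T}{ml}\alpha + \ddot\alpha - \dot\alpha^2\alpha\right)\begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix}. \]
Focussing on the first component and on terms O(ω), we obtain
\[ (\dot\varphi + \omega\cos\theta_E)\cdot(-\sin\varphi) \approx 0, \]
i.e. (if sin ϕ ≠ 0)
\[ \dot\varphi \approx -\omega\cos\theta_E. \]
Notice that the origin of the $\cos\theta_E$-term is the Coriolis force. Using $\dot\varphi = \Delta\varphi/\Delta t$ and $\Delta t = 24$ h yields
\[ \Delta\varphi_{24\mathrm{h}} \approx -2\pi\cos\theta_E. \]
The Foucault pendulum in our physics department (Wuppertal, Germany) undergoes a $\Delta\varphi \approx 281^\circ$ rotation every 24 hours, i.e.
\[ \theta_E \approx \arccos\frac{281}{360} = 38.7^\circ. \]
The attendant latitude is
\[ 90^\circ - \theta_E \approx 51.3^\circ\ \mathrm{N}. \]
This shows that the motion of the Foucault pendulum is intimately coupled to the rotation of the earth and its spherical shape.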
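The relation between the daily rotation of the pendulum plane and the latitude is easily evaluated. A minimal sketch follows; the function names are hypothetical and angles are handled in degrees for convenience.

```python
import math

def daily_rotation_deg(theta_E_deg):
    """Rotation of the pendulum plane in 24 h, i.e. -360 deg * cos(theta_E)."""
    return -360.0 * math.cos(math.radians(theta_E_deg))

def colatitude_deg(rotation_deg):
    """Invert the relation above; only the magnitude of the rotation enters."""
    return math.degrees(math.acos(abs(rotation_deg) / 360.0))

if __name__ == "__main__":
    theta_E = colatitude_deg(281.0)
    print(theta_E, 90.0 - theta_E)      # colatitude and latitude in degrees
    print(daily_rotation_deg(0.0))      # at the pole: one full turn per day
```

At the equator ($\theta_E = 90^\circ$) the same function returns a vanishing rotation, as expected from $\dot\varphi \approx -\omega\cos\theta_E$.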
4.4 An Application in Theoretical Chemistry
By now we are used to mechanical systems described in terms of a Lagrangian,

which itself depends on the coordinates, q j , and their time derivatives. But there are
applications of the formalism discussed thus far, in which some of the q j have an
entirely different meaning as illustrated by the following example.
Molecules do consist of atoms, which, according to their different electronegativities, may possess a variable positive or negative net charge, a so-called partial charge. Consider for instance the water molecule. The oxygen atom, crudely speaking, has a negative charge, because it ‘pulls’ on the electrons of the two hydrogens. The latter thus appear to possess positive net charges, which of course cancel the negative net charge on the oxygen due to the overall neutrality of the molecule. These partial charges, denoted here by $e_j$, are not constant and they are not integer multiples of the elementary charge. Their values vary depending on the ‘chemical environment’ of the atoms, e.g. $e_j$ does depend on other $e_k$ in its vicinity.
We consider a molecule with N atoms. Their positions, which depend on time, are given by $\vec r_i(t)$. In addition, each atom possesses a ‘charge coordinate’, $e_i(t)$. The Lagrangian of the molecule is given by
\[ L = \sum_{i=1}^{N}\frac{m_i}{2}\vec v_i^{\,2} + \sum_{i=1}^{N}\frac{m_e}{2}\dot e_i^2 - U(\{e_i\}, \{\vec r_i\}). \tag{4.87} \]
The second term is the ‘kinetic energy’ of the $e_i$, where $m_e$ is a ‘mass’ parameter. The U-term includes not only the ‘mechanical’ potential energy but also describes the interaction of the partial charges. Here, however, we are not interested in its explicit form. An additional constraint, i.e.
\[ \sum_{i=1}^{N} e_i = 0, \tag{4.88} \]
ensures that the molecule remains neutral all the time. Based on (4.6) we obtain the following equations of motion:
\[ m_i\ddot{\vec r}_i = -\frac{\partial U}{\partial \vec r_i} \tag{4.89} \]
\[ m_e\ddot e_i = -\frac{\partial U}{\partial e_i} + \lambda. \tag{4.90} \]
The Lagrange multiplier λ follows via summation of (4.90) with respect to i, i.e.
\[ \lambda = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial U}{\partial e_i} + \frac{m_e}{N}\underbrace{\frac{d^2}{dt^2}\sum_{i=1}^{N} e_i}_{=0} = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial U}{\partial e_i}. \tag{4.91} \]
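But do these equations of motion make sense? Do the partial charges $e_i$, when the atomic positions are kept fixed, oscillate around finite equilibrium values, $\bar e_i$? We check this by setting $e_i = \bar e_i + \delta e_i$ in (4.90). Using $\ddot{\bar e}_i = 0$ and the series expansion
\[ U = \bar U + \sum_{j=1}^{N}\left.\frac{\partial U}{\partial e_j}\right|_{\bar e_j}\delta e_j + \frac{1}{2}\sum_{j,k=1}^{N}\left.\frac{\partial^2 U}{\partial e_j\partial e_k}\right|_{\bar e_j,\bar e_k}\delta e_j\,\delta e_k + \cdots \tag{4.92} \]
we find
\[ m_e\,\delta\ddot e_i = -\left.\frac{\partial U}{\partial e_i}\right|_{\bar e_i} - \sum_{j=1}^{N}\left.\frac{\partial^2 U}{\partial e_i\partial e_j}\right|_{\bar e_i,\bar e_j}\delta e_j + \frac{1}{N}\sum_{k=1}^{N}\left.\frac{\partial U}{\partial e_k}\right|_{\bar e_k} + \frac{1}{N}\sum_{k,j=1}^{N}\left.\frac{\partial^2 U}{\partial e_k\partial e_j}\right|_{\bar e_k,\bar e_j}\delta e_j + \cdots. \tag{4.93} \]
Notice that the sum of the first and the third term must vanish if the $\bar e_i$ are solutions of (4.93). Thus we have, using matrix notation,
\[ \delta\ddot{\vec e} = \mathbf{M}\cdot\delta\vec e \tag{4.94} \]
with
\[ \delta\vec e = (\delta e_1, \dots, \delta e_N) \]
and
\[ M_{ij} = \frac{1}{m_e}\left[-\left.\frac{\partial^2 U}{\partial e_i\partial e_j}\right|_{\bar e_i,\bar e_j} + \frac{1}{N}\sum_{k=1}^{N}\left.\frac{\partial^2 U}{\partial e_k\partial e_j}\right|_{\bar e_k,\bar e_j}\right]. \]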
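We proceed by looking for a matrix $\mathbf{S}$, which diagonalizes the matrix $\mathbf{M}$, i.e. $\mathbf{S}^{-1}\cdot\mathbf{M}\cdot\mathbf{S} = \mathbf{\Lambda}$ and $\mathbf{S}\cdot\mathbf{S}^{-1} = \mathbf{I}$. The diagonal matrix $\mathbf{\Lambda}$ contains the eigenvalues, $\lambda_i$, (not to be confused with the Lagrange multiplier!) of $\mathbf{M}$. Multiplication of (4.94) by $\mathbf{S}^{-1}$ from the left, together with the insertion of $\mathbf{S}\cdot\mathbf{S}^{-1} = \mathbf{I}$ between $\mathbf{M}$ and $\delta\vec e$, yields
\[ \delta\ddot{\vec e}\,' = \mathbf{\Lambda}\cdot\delta\vec e\,', \]
where $\delta\vec e\,' = \mathbf{S}^{-1}\cdot\delta\vec e$. Using the solution ansatz
\[ \delta\vec e\,' = \begin{pmatrix} \delta e_1^{(+)} e^{\sqrt{\lambda_1}\,t} + \delta e_1^{(-)} e^{-\sqrt{\lambda_1}\,t} \\ \vdots \\ \delta e_N^{(+)} e^{\sqrt{\lambda_N}\,t} + \delta e_N^{(-)} e^{-\sqrt{\lambda_N}\,t} \end{pmatrix}, \]
we conclude that the system becomes unstable, i.e. $\delta\vec e$ grows without bound, if at least for one charge $\mathrm{Re}(\sqrt{\lambda_i}) \ne 0$.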
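The criterion can be tried out on the smallest possible case, N = 2, with a purely quadratic U. The sketch below is an illustration under stated assumptions: the Hessian entries J[i][j] (standing for the second derivatives of U at the equilibrium charges), the value of m_e and all function names are invented for this example and do not stem from a concrete force field.

```python
import cmath

def charge_matrix(J, m_e):
    """Matrix M of Eq. (4.94) for a Hessian J[i][j] = d^2U/(de_i de_j)."""
    n = len(J)
    col_mean = [sum(J[k][j] for k in range(n)) / n for j in range(n)]
    return [[-(J[i][j] - col_mean[j]) / m_e for j in range(n)] for i in range(n)]

def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr / 4.0 - det)
    return tr / 2.0 + root, tr / 2.0 - root

def is_unstable(eigs, tol=1e-12):
    """True if Re(sqrt(lambda_i)) differs from zero for at least one eigenvalue."""
    return any(abs(cmath.sqrt(lam).real) > tol for lam in eigs)

if __name__ == "__main__":
    for J in ([[2.0, 0.5], [0.5, 1.0]], [[-2.0, 0.5], [0.5, -1.0]]):
        eigs = eigenvalues_2x2(charge_matrix(J, m_e=1.0))
        print(eigs, is_unstable(eigs))
```

One eigenvalue always vanishes in this two-charge case. It belongs to a uniform shift of both charges, which the neutrality constraint (4.88) excludes, so that the sign of the remaining eigenvalue decides between oscillation and runaway.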
However, here we do not want to pursue this for concrete U, which would distract us from our main subject. The interested reader is referred to S.W. Rick et al. [2]. But there are two points worth emphasizing. The principle of least action is a general one and extends beyond mechanical systems. The stability analysis used here is quite general too and also extends beyond this particular example.
References

New York, 1971)
2. S.W. Rick, S.J. Stuart, B.J. Berne, Dynamical fluctuating charge force fields. J. Chem. Phys.
101, 6141 (1994)
Chapter 5
Integrating the Equations of Motion
This and the following chapter compile most of the standard problems in classical
mechanics. After a brief discussion of one-dimensional motion, we turn to the two-
body problem.
5.1 One-Dimensional Motion†
The Lagrangian in the case of one-dimensional motion expressed in terms of the cartesian coordinate x is
\[ L = \frac{m}{2}\dot x^2 - U(x). \tag{5.1} \]
Notice that the potential energy in principle can depend on time as well. The resulting equation of motion is
\[ m\ddot x = -\frac{dU(x)}{dx}. \tag{5.2} \]
For a given U(x) and with some skill this differential equation can be solved either analytically or numerically.
The following approach uses energy conservation, i.e.
\[ \frac{m}{2}\dot x^2 + U(x) = E. \tag{5.3} \]
Solving for $\dot x$ we obtain
\[ \dot x = \sqrt{\frac{2}{m}\,[E - U(x)]}. \tag{5.4} \]
DOI 10.1007/978-3-319-48710-6_5
This first order differential equation is simpler and provides additional physical insight. Its integration yields
\[ x(t) = x(0) + \int_0^t dt'\,\sqrt{\frac{2}{m}\,[E - U(x)]} \tag{5.5} \]
or
\[ t = \sqrt{\frac{m}{2}}\int\frac{dx}{\sqrt{E - U(x)}} + \mathrm{const}. \tag{5.6} \]
Both solutions require
\[ E \ge U(x). \tag{5.7} \]
The condition is fulfilled in the intervals $I_1 = [x_A, x_B]$ and $I_2 = [x_C, \infty)$ in Fig. 5.1. Within $I_1$ the motion is periodic between the turning points $x_A$ and $x_B$. The period, T, follows from (5.6) via
\[ T(E) = \sqrt{2m}\int_{x_A(E)}^{x_B(E)}\frac{dx}{\sqrt{E - U(x)}} \tag{5.8} \]
(period = $x_A \to x_B$ + $x_B \to x_A$). The motion within $I_2$, on the other hand, could be a reflection at $x_C$ of a particle originally traveling in the negative x-direction.
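Equation (5.8) lends itself to numerical evaluation once the inverse square root singularities at the turning points are dealt with. The sketch below assumes the substitution x = c + h sin θ, with c and h the center and half-width of the interval, combined with a midpoint rule; both the substitution and the function names are choices made for this illustration.

```python
import math

def period(U, E, x_a, x_b, m=1.0, n=2000):
    """Period T(E) from Eq. (5.8) for turning points x_a < x_b."""
    c, h = 0.5 * (x_a + x_b), 0.5 * (x_b - x_a)
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta   # midpoints avoid the end points
        x = c + h * math.sin(theta)
        total += h * math.cos(theta) / math.sqrt(E - U(x))
    return math.sqrt(2.0 * m) * total * d_theta

if __name__ == "__main__":
    k, amplitude = 4.0, 1.5
    E = 0.5 * k * amplitude**2
    # harmonic test potential; the expected period is 2 pi sqrt(m/k)
    print(period(lambda x: 0.5 * k * x * x, E, -amplitude, amplitude))
```

For the harmonic potential the transformed integrand is constant, which makes this case a convenient check. Other potentials require the turning points $x_A(E)$ and $x_B(E)$ as input.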
Fig. 5.1 One-dimensional energy surface U(x); the positions $x_A$, $x_B$ and $x_C$ are marked on the x-axis
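• Example - Period of the Mathematical Pendulum: The energy of the pendulum expressed in polar coordinates is
\[ E = \frac{ml^2\dot\phi^2}{2} + mgl\,(1 - \cos\phi). \tag{5.9} \]
Here m is the mass of the pendulum, l is its length, g is the gravitational acceleration, and φ is the angular displacement of the pendulum (cf. the sketch).
[Sketch: pendulum of length l with the gravitational force $\vec F_g$ acting on the mass.]
According to (5.8) we have
\begin{align*}
T &= 2\sqrt{\frac{2}{gl}}\int_0^{\phi_{max}}\frac{l\,d\phi}{\sqrt{\cos\phi - \cos\phi_{max}}} \\
&\approx 2\sqrt{\frac{4l}{g}}\int_0^{\phi_{max}}\frac{d\phi}{\sqrt{\phi_{max}^2 - \phi^2}} \qquad (\phi_{max}\ \text{small}) \\
&= 2\pi\sqrt{\frac{l}{g}},
\end{align*}
where $E = mgl\,(1 - \cos\phi_{max})$ (note: $\int dx/\sqrt{1 - x^2} = \arcsin x$).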
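How quickly the small-amplitude result deteriorates can be seen by evaluating the first line of the above equation without approximation. The sketch below does not integrate directly. It uses the known representation of this integral as a complete elliptic integral, evaluated via the arithmetic-geometric mean (AGM), which is a technique brought in here for illustration and not used in the text; the function names are hypothetical.

```python
import math

def agm(a, b, tol=1e-14):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > tol * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def pendulum_period(phi_max, l=1.0, g=9.81):
    """Exact period: T = 2 pi sqrt(l/g) / AGM(1, cos(phi_max / 2))."""
    return 2.0 * math.pi * math.sqrt(l / g) / agm(1.0, math.cos(0.5 * phi_max))

if __name__ == "__main__":
    T0 = 2.0 * math.pi * math.sqrt(1.0 / 9.81)      # small-amplitude period
    for deg in (1, 10, 45, 90):
        print(deg, pendulum_period(math.radians(deg)) / T0)
```

The printed ratio stays close to one for small amplitudes and grows as $\phi_{max}$ increases.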
5.2 Two-Body Central Force Motion†
Transformation of the Two-Body Problem† :
Figure 5.2 illustrates the situation. There are two (point) masses $m_1$ and $m_2$ separated by the distance $r_{12}$. Their interaction potential is $U(r_{12})$. It is useful to replace the position vectors $\vec r_1$ and $\vec r_2$ by the new vectors $\vec r_{12}$ and $\vec R$. Here $\vec R$ is the position of the center of mass. From
\[ \vec r_{12} = \vec r_1 - \vec r_2 \quad\text{and}\quad m\vec R = m_1\vec r_1 + m_2\vec r_2, \]
where $m = m_1 + m_2$, follows
\[ \vec r_1 = \frac{m_2}{m}\vec r_{12} + \vec R \quad\text{and}\quad \vec r_2 = -\frac{m_1}{m}\vec r_{12} + \vec R. \tag{5.10} \]
Fig. 5.2 Illustration of the two-body problem, where $m_1$ and $m_2$ are point masses
We work out $\dot{\vec r}_1^{\,2}$ and $\dot{\vec r}_2^{\,2}$ and insert the results into the Lagrangian of the two-body problem, i.e.
\[ L = \frac{m_1}{2}\dot{\vec r}_1^{\,2} + \frac{m_2}{2}\dot{\vec r}_2^{\,2} - U(r_{12}). \tag{5.11} \]
The new Lagrangian is
\[ L = \frac{\mu}{2}\dot{\vec r}_{12}^{\,2} + \frac{1}{2}m\dot{\vec R}^2 - U(r_{12}), \tag{5.12} \]
where
\[ \mu = \frac{m_1 m_2}{m_1 + m_2} \tag{5.13} \]
is the reduced mass.
Remark: Notice that in the limit $m_1 \gg m_2$ the reduced mass becomes
\[ \mu \approx m_2\left(1 - \frac{m_2}{m_1}\right) \approx m_2. \]
This approximation is a good one for all planets in our solar system in comparison to the sun.
Using the Euler–Lagrange equation
\[ \frac{d}{dt}\frac{\partial L}{\partial\dot{\vec R}} - \frac{\partial L}{\partial\vec R} = 0 \]
we obtain
\[ \frac{d}{dt}\,m\dot{\vec R} = 0. \]
This means that the center of mass moves freely at constant velocity. Selecting the center of mass as the new origin yields the simpler Lagrangian
\[ L = \frac{\mu}{2}\dot{\vec r}_{12}^{\,2} - U(r_{12}), \tag{5.14} \]
which reduces the two-body problem to an effective one-body problem.
Central Force Problem† :
Equation (5.14) is the Lagrangian of a central force problem. The central force, which depends on $r_{12}$ only, is given by
\[ \vec F = -\frac{\partial U(r)}{\partial\vec r} = -\frac{dU}{dr}\frac{\vec r}{r}. \tag{5.15} \]
Here and in the following we replace $\vec r_{12}$ by $\vec r$.
Remark 1: Notice that the motion of the (reduced) mass is confined to a plane. This is because $\vec L = \mathrm{const}$ (see the second remark in the context of angular momentum conservation) implies that the orientation of the angular momentum does not change. Therefore the orbital plane as defined by $\vec r$ and $\vec p$ remains unaltered.
Remark 2: It follows from the previous remark that we can use polar coordinates to describe the motion, i.e.
\[ L = \frac{\mu}{2}\left(\dot r^2 + r^2\dot\phi^2\right) - U(r). \tag{5.16} \]
Thus
\[ 0 = \frac{\partial L}{\partial\phi} = \frac{d}{dt}\frac{\partial L}{\partial\dot\phi}, \]
i.e. for the generalized momentum we find
\[ p_\phi = \mu r^2\dot\phi = L_z = L = \mathrm{const} \tag{5.17} \]
(cf. (4.54)).
Remark 3: Equation (5.17) has a simple geometric meaning. The quantity $\frac{1}{2}r^2 d\phi$ is the area, dA, swept out by $\vec r$ in the angular interval dφ. In the context of (4.54) we had shown that $L_z = \mu(\vec r \times \dot{\vec r})_z = \mu r^2\dot\phi$. Because $|\vec r \times d\vec r| = 2\,dA$ and $L_z = L$, we have $L = \mu|\vec r \times \dot{\vec r}| = 2\mu\dot A$ and thus
\[ L = 2\mu\frac{dA}{dt}. \tag{5.18} \]
Conservation of angular momentum implies that the area swept out by $\vec r$ per unit time is constant as well. This is known as Kepler’s second law.1
We now return to the solution of the central force problem, using, as before in the one-dimensional case, energy conservation, i.e.
\begin{align*}
E &= \frac{\mu}{2}\left(\dot r^2 + r^2\dot\phi^2\right) + U(r) \\
&\overset{(5.17)}{=} \frac{\mu}{2}\dot r^2 + \frac{L^2}{2\mu r^2} + U(r) \\
&= \frac{\mu}{2}\dot r^2 + U_{eff}(r), \tag{5.19}
\end{align*}
where the effective potential is given by
\[ U_{eff}(r) = \frac{L^2}{2\mu r^2} + U(r). \tag{5.20} \]
The first term in (5.20) is called centrifugal potential. We solve (5.19) for $\dot r$, i.e.
\[ \dot r = \sqrt{\frac{2}{\mu}\,[E - U(r)] - \frac{L^2}{\mu^2 r^2}}. \tag{5.21} \]
Subsequent separation of variables (e.g. [1]) yields
\[ t = \int dr\left(\frac{2}{\mu}\,[E - U(r)] - \frac{L^2}{\mu^2 r^2}\right)^{-1/2} + \mathrm{const}. \tag{5.22} \]
1 Kepler, Johannes, German mathematician and astronomer, *27.12.1571 Weil der Stadt, †15.11.1630 Regensburg; he discovered the laws of planetary motion.
We can rewrite this in terms of the angle φ via (5.17), i.e. $d\phi = \frac{L}{\mu r^2}\,dt$,
\[ \phi = \int dr\,\frac{L}{r^2}\left(2\mu\,[E - U(r)] - \frac{L^2}{r^2}\right)^{-1/2} + \mathrm{const}. \tag{5.23} \]
Equations (5.22) and (5.23) are general solutions of the central force two-body problem. The integration limits $r_{min}$ and $r_{max}$ are given by
\[ E - U_{eff} = 0 \tag{5.24} \]
($\dot r = 0$). If this equation possesses a single solution, $r_{min}$, only, the trajectory is infinite. This means that the mass approaches from infinity and recedes towards infinity. Otherwise the solution must lie within the ring-shaped area defined by $r_{min}$ and $r_{max}$. A possible path is depicted in Fig. 5.3.
1/r -Potential† :
In the following we study the 1/r-interaction, i.e.
\[ U(r) = -\frac{\alpha}{r}. \tag{5.25} \]
Fig. 5.3 Trajectory forming a rosette. Here $\Delta\phi = \int_{r_{min}}^{r_{max}} \dots$. A closed path requires either $U(r) \propto r^{-1}$ or $U(r) \propto r^2$ (Bertrand’s theorem)
This potential is common to two fundamental interactions - gravitation2 and Coulomb interaction.3
• Example - Free Fall: We consider two radially symmetric mass distributions, m and M, falling towards each other due to their gravitational attraction. In this case α = GmM. Initially both masses are at rest. Their initial center of mass separation is R + h. At the time when they collide this distance is R. The time interval separating the two situations is given by (5.22), i.e.
\[ t = \int_R^{R+h}\frac{dr}{\sqrt{\frac{2}{\mu}\,[U(R+h) - U(r)]}}. \tag{5.26} \]
Here E = U(R + h) and L = 0. The substitution x = (R + h)/r − 1 yields
\[ t = \sqrt{\frac{(R+h)^3}{2G(M+m)}}\int_0^{h/R}\frac{dx}{(x+1)^2\sqrt{x}}. \tag{5.27} \]
We are interested in the case when $h \ll R$ and thus
\[ t \approx \sqrt{\frac{R^3}{2G(M+m)}}\int_0^{h/R}\frac{dx}{\sqrt{x}} = \sqrt{\frac{2R^2 h}{G(m+M)}}. \tag{5.28} \]
Let’s compare this result to the ballistic trajectory problem (Problem 8). In the special case of a freely falling mass with zero initial velocity, i.e. v = 0, we obtain
\[ t = \sqrt{\frac{2h}{g}} \overset{g = GM/R^2}{=} \sqrt{\frac{2R^2 h}{GM}}. \tag{5.29} \]
The formula agrees with the result of Problem 17 (Freely Falling Mass), if we neglect the additional complication due to the earth’s rotation. The difference in the masses, i.e. M + m in the present case and M in the two aforementioned problems, is due to our previous ‘asymmetric point of view’. In the Problems 8 and 17 the earth’s center of mass is considered to be at rest at all times. However, because M, the mass of the earth, is so much larger than the second mass, this point of view is justified.
2 This is the so-called Kepler problem.
3 Coulomb, Charles Augustin de, French physicist and engineer, *Angoulême 14.6.1736, †Paris 23.8.1806.
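The quality of the approximation (5.28) can be judged by evaluating (5.27) in full. With $x = u^2$ the integral becomes $\int 2\,du/(1+u^2)^2 = u/(1+u^2) + \arctan u$, a standard antiderivative that is used in the sketch below. The numerical values of G and of the earth's mass and radius are assumed textbook values, and the function names are hypothetical.

```python
import math

G = 6.674e-11          # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
M_EARTH = 5.972e24     # kg (assumed value)
R_EARTH = 6.371e6      # m  (assumed value)

def fall_time_exact(h, R=R_EARTH, M=M_EARTH, m=0.0):
    """Eq. (5.27), integrated in closed form via x = u^2."""
    u = math.sqrt(h / R)
    prefactor = math.sqrt((R + h)**3 / (2.0 * G * (M + m)))
    return prefactor * (u / (1.0 + u * u) + math.atan(u))

def fall_time_approx(h, R=R_EARTH, M=M_EARTH, m=0.0):
    """Eq. (5.28), valid for h << R."""
    return math.sqrt(2.0 * R * R * h / (G * (M + m)))

if __name__ == "__main__":
    for h in (400.0, 4.0e4, 4.0e6):
        print(h, fall_time_exact(h), fall_time_approx(h))
```

For h = 400 m both expressions agree to a few parts in $10^5$, whereas at heights comparable to R the difference becomes substantial.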
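The trajectories associated with (5.25) follow from (5.23):
\[ \phi = \int dr\,\frac{L}{r^2}\left(2\mu\left(E + \frac{\alpha}{r}\right) - \frac{L^2}{r^2}\right)^{-1/2} + \mathrm{const}. \tag{5.30} \]
The substitution x = 1/r yields
\[ \phi = -L\int dx\left(2\mu\,(E + \alpha x) - L^2 x^2\right)^{-1/2} + \mathrm{const}. \tag{5.31} \]
Next we complete the square under the square root, i.e.
\begin{align*}
-L^2 x^2 + 2\mu\alpha x + 2\mu E &= L^2\left[-\left(x - \frac{\mu\alpha}{L^2}\right)^2 + \frac{\mu^2\alpha^2}{L^4} + \frac{2\mu E}{L^2}\right] \\
&= L^2\left(\frac{\mu^2\alpha^2}{L^4} + \frac{2\mu E}{L^2}\right)\left(1 - y^2\right),
\end{align*}
where $y = \left(x - \frac{\mu\alpha}{L^2}\right)\left(\frac{\mu^2\alpha^2}{L^4} + \frac{2\mu E}{L^2}\right)^{-1/2}$. Thus
\[ \phi = -\int\frac{dy}{\sqrt{1 - y^2}} + \mathrm{const} = \arccos y + \mathrm{const}. \]
Notice that
\[ \frac{d}{dy}\arccos y \overset{y = \cos z}{=} \frac{dz}{d\cos z} = -\frac{1}{\sin z} = -\frac{1}{\sqrt{1 - \cos^2 z}} = -\frac{1}{\sqrt{1 - y^2}}. \]
The final solution is
\[ \phi = \arccos\left(\frac{\frac{1}{r} - \frac{\mu\alpha}{L^2}}{\sqrt{\frac{\mu^2\alpha^2}{L^4} + \frac{2\mu E}{L^2}}}\right). \tag{5.32} \]
Notice also that we have chosen the initial position so that the constant is zero. With
\[ q = \frac{L^2}{\mu\alpha} \quad\text{and}\quad e = \sqrt{1 + \frac{2EL^2}{\mu\alpha^2}} \tag{5.33} \]
we obtain
– for α > 0, i.e. q > 0 (attraction):
\[ r = \frac{q}{1 + e\cos\phi}. \tag{5.34} \]
If the eccentricity e < 1 (E < 0) this equation yields ellipses, whereas for e ≥ 1 (E ≥ 0) the results are hyperbolas, with a parabola in the borderline case e = 1 (cf. Fig. 5.4). The smallest separation, $r_{min}$, in either case is given by
\[ r_{min} = \frac{q}{1 + e}. \tag{5.35} \]
Remark: Applying this to our solar system, we recognize that the planets move on ellipses with a common focal point. This is Kepler’s first law.
– for α < 0, i.e. q < 0 (repulsion):
\[ r = \frac{q}{1 - e\cos\phi}. \tag{5.36} \]
In this case only hyperbolas are obtained (cf. Fig. 5.4), i.e. e > 1 in order for r to be positive.
Fig. 5.4 Examples for different paths (the curves in the figure are labelled with eccentricities 0.5 and 2). The right focal point of the ellipse is the origin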
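For given E and L the orbit parameters follow directly from (5.33). The sketch below, with hypothetical function names and dimensionless sample values, also checks that the extremal radii of (5.34) are indeed the roots of (5.24), i.e. $E = U_{eff}$ there.

```python
import math

def orbit_parameters(E, L, mu=1.0, alpha=1.0):
    """Return (q, e) according to Eq. (5.33)."""
    arg = 1.0 + 2.0 * E * L * L / (mu * alpha * alpha)
    if arg < 0.0:
        raise ValueError("E lies below the minimum of the effective potential")
    return L * L / (mu * alpha), math.sqrt(arg)

def radius(phi, q, e):
    """Eq. (5.34), attractive case."""
    return q / (1.0 + e * math.cos(phi))

def u_eff(r, L, mu=1.0, alpha=1.0):
    """Effective potential (5.20) with U = -alpha/r."""
    return L * L / (2.0 * mu * r * r) - alpha / r

if __name__ == "__main__":
    E, L = -0.3, 1.0
    q, e = orbit_parameters(E, L)
    r_min, r_max = radius(0.0, q, e), radius(math.pi, q, e)
    print(q, e, r_min, r_max, u_eff(r_min, L), u_eff(r_max, L))
```

The last two printed numbers coincide with E, which confirms that $\dot r = 0$ at both turning points of the ellipse.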
• Problem 18 - Ellipses:
(a) An ellipse is a curve in a plane which surrounds two focal points such
that the sum of the distances to the two focal points is constant for every point
on the curve. Use this condition to derive the equation of an ellipse in cartesian
coordinates.
(b) Let the right focal point be the origin and convert the above equation to
polar coordinates.
Solution: (a) We construct the ellipse so that both focal points are on the x-axis. The y-axis intersects with the x-axis in the middle between the two focal points (cf. the sketch). Using the above prescription we have $f_1 + f_2 = 2a$. Here the $f_i$ (i = 1, 2) are the distances of the focal points $F_i$ to a point P on the ellipse and 2a is a constant. According to Pythagoras’ theorem
\[ f_1^2 = Y^2 + (l + X)^2 \quad\text{and}\quad f_2^2 = Y^2 + (l - X)^2. \tag{5.37} \]
Subtracting the second equation from the first one yields
\[ f_1^2 - f_2^2 = (f_1 + f_2)(f_1 - f_2) = 4lX. \tag{5.38} \]
Employing the condition $f_1 + f_2 = 2a$ or $f_2 = 2a - f_1$ we obtain
\[ f_1 = a + \frac{l}{a}X, \tag{5.39} \]
which in conjunction with (5.37) leads to
\[ \frac{X^2}{a^2} + \frac{Y^2}{a^2 - l^2} = 1. \tag{5.40} \]
The substitution $b^2 = a^2 - l^2$ yields the standard mathematical form of an ellipse in cartesian coordinates. Setting X or Y equal to zero we observe that, according to the above sketch, a is the semi-major axis and b is the semi-minor axis of the ellipse.
[Sketch: ellipse with the focal points $F_1$ and $F_2$ on the X-axis, the Y-axis through the center, and a point P on the curve.]
(b) In the second sketch the right focus is the intersection of the x- and the y-axis. Thus $r = f_2 = 2a - f_1$ and $x = X - l$. Equation (5.39) immediately yields
\[ r = \frac{q}{1 + e\cos\phi}, \tag{5.41} \]
where $q = a(1 - e^2)$ and $e = l/a$.
[Sketch: the same ellipse with the origin of the x-y-system at the right focal point $F_2$ and the distance r from $F_2$ to P.]
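Both representations can be cross-checked numerically: points generated from the polar form (5.41) should satisfy the defining property $f_1 + f_2 = 2a$ as well as the cartesian form (5.40). A minimal sketch with hypothetical function names:

```python
import math

def ellipse_point(phi, a, e):
    """Point of r = q/(1 + e cos phi) in the centered (X, Y) coordinates."""
    l = e * a                          # distance of the focal points from the center
    q = a * (1.0 - e * e)
    r = q / (1.0 + e * math.cos(phi))
    # the origin of the polar system is the right focal point F2 at X = l
    return r * math.cos(phi) + l, r * math.sin(phi)

def focal_distance_sum(phi, a, e):
    """f1 + f2 for the point belonging to the angle phi."""
    X, Y = ellipse_point(phi, a, e)
    l = e * a
    return math.hypot(X + l, Y) + math.hypot(X - l, Y)

if __name__ == "__main__":
    a, e = 2.0, 0.5
    for phi in (0.0, 1.0, 2.0, 3.0):
        print(phi, focal_distance_sum(phi, a, e))   # compare with 2a
```

The printed sums equal 2a up to rounding, independent of φ.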
• Problem 19 - Kepler Circular Orbit: Consider a particle on a Kepler circular orbit, i.e. $U(r) = -\alpha/r$ and α > 0. Calculate the kinetic energy, K, as well as the potential energy, U, each expressed in terms of the total energy, E.
Solution: We use
\[ \frac{d}{dt}\,\vec r\cdot\vec p = 2K + \vec r\cdot\vec F. \tag{5.42} \]
On a circle $\vec r \perp \vec p$, i.e. $\vec r\cdot\vec p = 0$ and thus
\[ 0 = 2K + \vec r\cdot\vec F. \tag{5.43} \]
With
\[ \vec F = -\vec\nabla U = -\alpha\frac{\vec r}{r^3} \tag{5.44} \]
follows
\[ \vec r\cdot\vec F = U. \tag{5.45} \]
Using E = K + U we find
\[ K = -E \quad\text{and}\quad U = 2E. \tag{5.46} \]
• Problem 20 - Virial Theorem: For a particle with mass μ, moving on a Kepler elliptical orbit, calculate the average kinetic energy, $\bar K = \frac{1}{T}\int_0^T K\,dt$, and the average potential energy, $\bar U = \frac{1}{T}\int_0^T U\,dt$. The quantity T is the time for one complete cycle. The calculation of $\bar K$ should be based on the average of $\frac{d}{dt}(\vec r\cdot\vec p)$. This then leads to the virial theorem ($2\bar K = -\overline{\vec r\cdot\vec F}$).
Solution: Notice first that (5.43) does not hold in the present case, because $\vec r$ and $\vec p$ almost never are perpendicular to each other. However, we can work out (5.42) based on the time for one complete cycle, i.e.
\[ 0 = 2\bar K + \frac{1}{T}\int_0^T dt\,\vec r\cdot\vec F = 2\bar K + \overline{\vec r\cdot\vec F}, \tag{5.47} \]
where we use $\int_0^T dt\,\frac{d}{dt}\,\vec r\cdot\vec p = \vec r\cdot\vec p\,\big|_0^T = 0$. In addition as before
\[ \vec r\cdot\vec F = -\vec r\cdot\vec\nabla U = -\frac{\alpha}{r} = U \tag{5.48} \]
and thus
\[ 2\bar K = -\bar U. \tag{5.49} \]
The average potential energy is given by
\[ \bar U = \frac{1}{T}\int_0^T dt\,U \overset{(5.17)}{=} 2\,\frac{1}{T}\int_0^\pi d\phi\,\frac{\mu r^2}{L}\,U(r) = -\frac{2\mu\alpha}{TL}\int_0^\pi d\phi\,r(\phi). \tag{5.50} \]
Using r(φ) from (5.34) and e < 1 as well as
\[ \int_0^\pi\frac{d\phi}{1 - e\cos\phi} = \frac{\pi}{\sqrt{1 - e^2}} \tag{5.51} \]
yields
\[ \bar U = -\frac{2\pi\mu\alpha}{TL}\frac{q}{\sqrt{1 - e^2}}. \tag{5.52} \]
The area theorem (5.18), i.e. Kepler’s second law, yields
\[ T = \frac{2\mu}{L}A. \tag{5.53} \]
The area of the ellipse is A = πab, where the semi-axes, according to the calculation in Problem 18, are given by $a = q/(1 - e^2) \overset{(5.33)}{=} \alpha/(-2E)$ and $b = q/\sqrt{1 - e^2}$. Inserting this into (5.52) gives
\[ \bar U = 2E. \tag{5.54} \]
All in all our final result is
\[ \bar K = -E \quad\text{and}\quad \bar U = 2E. \tag{5.55} \]
Contrary to the case of the circular orbit, the relations (5.46) are valid only if we replace K and U by their average values. The virial theorem will be discussed in more detail on p. 283.
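The time averages can also be obtained by brute force, replacing dt by $(\mu r^2/L)\,d\phi$ according to (5.17) and summing over one revolution. The sketch below uses dimensionless sample values and a midpoint rule; the function name and all numbers are assumptions made for illustration.

```python
import math

def orbit_averages(E, L, mu=1.0, alpha=1.0, n=2000):
    """Return (T, K_bar, U_bar) for a bound Kepler orbit with E < 0."""
    q = L * L / (mu * alpha)
    e = math.sqrt(1.0 + 2.0 * E * L * L / (mu * alpha * alpha))
    d_phi = 2.0 * math.pi / n
    T = 0.0
    U_int = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        r = q / (1.0 + e * math.cos(phi))       # Eq. (5.34)
        dt = mu * r * r / L * d_phi             # Eq. (5.17)
        T += dt
        U_int += -alpha / r * dt
    U_bar = U_int / T
    return T, E - U_bar, U_bar                  # K = E - U at every instant

if __name__ == "__main__":
    E = -0.3
    T, K_bar, U_bar = orbit_averages(E, L=1.0)
    a = 1.0 / (-2.0 * E)                        # semi-major axis for alpha = 1
    print(K_bar / E, U_bar / E, T * T / (4.0 * math.pi**2 * a**3))
```

The three printed ratios should be close to −1, 2 and 1. The last one anticipates the relation between the period and the semi-major axis.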
Remark: It is not difficult to rewrite (5.53) into
\[ T^2 = 4\pi^2 a^3\,\frac{\mu}{\alpha}. \tag{5.56} \]
The quantity a is the length of the semi-major axis. Applied to a planet P in our solar system $\mu \approx m_P$, where $m_P$ is the mass of the planet, which is much smaller than the solar mass. In addition $\alpha \propto m_P$, i.e. the ratio μ/α to good approximation does not depend on the planet’s mass. This implies, again to good approximation,
\[ \frac{T_1^2}{T_2^2} = \frac{a_1^3}{a_2^3}, \tag{5.57} \]
where the indices indicate two arbitrary planets. Equation (5.57) is Kepler’s third law.
• Problem 21 - Path for $U = \alpha r^2$: What is the path r = r(φ) when the potential energy is given by $U = \alpha r^2$ (α > 0)? Also calculate the period T in the special case of a circular orbit.
Solution: We use (5.23). The substitution $x = r^{-2}$ ($dr = -(1/2)\,x^{-3/2}\,dx$) yields
\[ \phi = -\frac{L}{2}\int dx\left(2\mu\,(Ex - \alpha) - L^2 x^2\right)^{-1/2}. \tag{5.58} \]
Replacing −α by E and E by α leads to an equation, which, except for a factor 1/2, agrees with (5.31). Thus we can adopt the solution of this latter equation in the present case, i.e.
\[ \phi = \frac{1}{2}\arccos\left(\frac{\frac{1}{r^2} - \frac{\mu E}{L^2}}{\sqrt{\frac{\mu^2 E^2}{L^4} - \frac{2\mu\alpha}{L^2}}}\right) \tag{5.59} \]
(cf. (5.32)). Using
\[ a = \frac{L^2}{\mu E} \quad\text{and}\quad b = \sqrt{1 - \frac{2\alpha L^2}{\mu E^2}}, \tag{5.60} \]
we obtain
\[ r = \sqrt{\frac{a}{1 + b\cos(2\phi)}}. \tag{5.61} \]
And therefore
\[ r_{min} = \sqrt{\frac{a}{1 + b}} \quad\text{and}\quad r_{max} = \sqrt{\frac{a}{1 - b}}. \tag{5.62} \]
The period follows via Kepler’s second law according to (5.53). In the case of a circular orbit we have b = 0, which results in $A = \pi a = \pi L^2/(\mu E)$. In conjunction with $2\alpha L^2 = \mu E^2$, again because b = 0, follows
\[ T = 2\pi\sqrt{\frac{\mu}{2\alpha}}. \tag{5.63} \]
• Problem 22 - Laplace–Runge–Lenz Vector: An additional conserved quantity in the case of $U(r) = -\alpha/r$ is the Laplace–Runge–Lenz vector given by
\[ \vec A = \vec p \times \vec L - \mu\alpha\frac{\vec r}{r}. \]
Show that $d\vec A/dt = 0$. Calculate the magnitude of $\vec A$ expressed in terms of the eccentricity, e, and the orientation of $\vec A$ relative to the orbital plane.
Remark: In the center of mass coordinate system the position of the reduced mass is determined by 3 spatial coordinates. Subtraction of the time origin leaves 2s − 1 = 2 · 3 − 1 = 5 constants of motion. On the other hand, there are 7 conserved quantities, i.e. $\vec L$, E, and apparently $\vec A$. This implies that there must be two additional equations reducing the latter number to 5. One of them is the sought after relation A = A(e(E, L)) and the other one is the orientation of $\vec A$ relative to $\vec L$.
Solution: First we show that
\[ \frac{d\vec A}{dt} = \mu\frac{d}{dt}\left(\vec v \times \vec L - \alpha\frac{\vec r}{r}\right) = 0. \tag{5.64} \]
Using $\vec L = \mu(\vec r \times \vec v)$, $\dot{\vec L} = 0$ as well as the identity $\vec a \times (\vec b \times \vec c) = \vec b\,(\vec a\cdot\vec c) - \vec c\,(\vec a\cdot\vec b)$ yields
\[ \frac{d}{dt}\left(\vec v \times \vec L - \alpha\frac{\vec r}{r}\right) = \mu\left[\vec r\,(\dot{\vec v}\cdot\vec v) - \vec v\,(\dot{\vec v}\cdot\vec r)\right] - \alpha\frac{\vec v}{r} + \alpha\frac{\vec r\,(\vec v\cdot\vec r)}{r^3}. \tag{5.65} \]
Inserting the equation of motion, $\dot{\vec v} = -(\alpha/\mu)\,\vec r/r^3$, completes the first part of the problem.
Next we consider the orientation and the magnitude of $\vec A$. Notice that $\vec A\cdot\vec L = 0$. Because $\vec L$ is perpendicular to the orbital plane ($\vec L\cdot\vec r = 0$), $\vec A$ must lie in this plane. The orientation of $\vec A$ follows via
\[ \vec A\cdot\vec r = Ar\cos\phi = \underbrace{\vec r\cdot(\vec p \times \vec L)}_{= L^2} - \mu\alpha r, \tag{5.66} \]
i.e.
\[ r = \frac{L^2/(\mu\alpha)}{1 + \frac{A}{\mu\alpha}\cos\phi}. \tag{5.67} \]
Comparison with (5.34) shows that $\vec A$ is parallel to the x-axis. Its magnitude is related to the eccentricity via
\[ e = \frac{A}{\mu\alpha}. \tag{5.68} \]
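The relation (5.68) holds at every point of the orbit, which can be verified for an arbitrary state of the reduced mass in the orbital plane. The sketch below works with two-dimensional vectors, so that $\vec L$ has a z-component only; the function names and the sample state are hypothetical.

```python
import math

def lrl_vector(x, y, vx, vy, mu=1.0, alpha=1.0):
    """In-plane components of A = p x L - mu alpha r/r, plus L_z."""
    r = math.hypot(x, y)
    px, py = mu * vx, mu * vy
    Lz = x * py - y * px
    # p x L with L = (0, 0, Lz) equals (py Lz, -px Lz, 0)
    Ax = py * Lz - mu * alpha * x / r
    Ay = -px * Lz - mu * alpha * y / r
    return Ax, Ay, Lz

def eccentricity(x, y, vx, vy, mu=1.0, alpha=1.0):
    """Eccentricity according to Eq. (5.33)."""
    r = math.hypot(x, y)
    E = 0.5 * mu * (vx * vx + vy * vy) - alpha / r
    Lz = mu * (x * vy - y * vx)
    return math.sqrt(1.0 + 2.0 * E * Lz * Lz / (mu * alpha * alpha))

if __name__ == "__main__":
    state = (1.0, 0.5, -0.2, 0.9)          # x, y, vx, vy in arbitrary units
    Ax, Ay, Lz = lrl_vector(*state)
    print(math.hypot(Ax, Ay), eccentricity(*state))
```

Both printed numbers agree for μ = α = 1, in accordance with (5.68).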
• Problem 23 - Perihelion Precession: A homogeneous distribution of dust particles within the solar system leads to an additional gravitational force, $\vec F = -m_{earth}C\,\vec r$, pulling the earth towards the sun (why? - see p. 43). Here C is a constant proportional to the gravitational constant times the dust density and $\vec r$ is a vector extending from the sun (origin) to the earth. The force due to the dust cloud gives rise to a small change, δU, of $U(r) = -\alpha/r$. The path of the earth no longer is a closed ellipse but an open rosette. Calculate the angular velocity, $\omega = \Delta\phi/T$, of the rotation of the major axis of earth’s ellipse to lowest order in C.
Solution: Our starting point is (5.23), i.e.
\[ 2\pi + \Delta\phi = 2\int_{r_{min}}^{r_{max}} dr\,\frac{L}{r^2}\left(2\mu\,[E - U(r)] - \frac{L^2}{r^2}\right)^{-1/2}. \tag{5.69} \]
The quantity Δφ/T, where T is the rotation period, is the sought angular velocity ω. The potential energy is
\[ U(r) = -\frac{\alpha}{r} + \delta U(r). \tag{5.70} \]
$\delta U(r) = \mu C r^2/2$ yields the additional gravitational force, i.e. $\vec F = -m_{earth}C\,\vec r \approx -\mu C\,\vec r = -\vec\nabla\delta U(r)$. Series expansion of (5.69) including the leading order in δU yields
\begin{align*}
2\pi + \Delta\phi &\approx 2\int_{r_{min}}^{r_{max}} dr\,\frac{L}{r^2}\left(2\mu\left(E + \frac{\alpha}{r}\right) - \frac{L^2}{r^2}\right)^{-1/2} \\
&\quad + 2\int_{r_{min}}^{r_{max}} dr\,\frac{L}{r^2}\left(2\mu\left(E + \frac{\alpha}{r}\right) - \frac{L^2}{r^2}\right)^{-3/2}\mu\,\delta U(r). \tag{5.71}
\end{align*}
The first integral is equal to π, because $U(r) = -\alpha/r$ by itself results in a closed path for which Δφ = 0. Thus
\[ \Delta\phi \approx \mu^2 LC\int_{r_{min}}^{r_{max}} dr\left(2\mu\left(E + \frac{\alpha}{r}\right) - \frac{L^2}{r^2}\right)^{-3/2}. \tag{5.72} \]
We transform this integral via
\begin{align*}
\Delta\phi &\approx \mu^2 C\,\frac{d}{dL}\int_{r_{min}}^{r_{max}} dr\,r^2\left(2\mu\left(E + \frac{\alpha}{r}\right) - \frac{L^2}{r^2}\right)^{-1/2} \\
&\overset{(5.23)}{=} \mu^2 C\,\frac{d}{dL}\left(\frac{1}{L}\int_0^\pi d\phi\,r^4(\phi)\right) \tag{5.73} \\
&\overset{(5.34)}{=} \mu^2 C\,\frac{d}{dL}\left(\frac{q^4}{L}\int_0^\pi\frac{d\phi}{(1 + e\cos\phi)^4}\right).
\end{align*}
The resulting integral we may look up in a suitable table (e.g. [2]), i.e.
\[ \int_0^\pi\frac{d\phi}{(1 + e\cos\phi)^4} = \frac{\pi}{2}\,\frac{2 + 3e^2}{\left(1 - e^2\right)^{7/2}}, \tag{5.74} \]
or we may decide to do it ourselves, i.e.
\[ \int_0^\pi\frac{d\phi}{(1 + e\cos\phi)^4} = -\frac{1}{6}\left.\frac{d^3}{da^3}\int_0^\pi\frac{d\phi}{a + e\cos\phi}\right|_{a=1}. \tag{5.75} \]
The subsequent substitution x = tan(φ/2) leads to a much simpler integral. Using $q = L^2/(\mu\alpha)$ and $e^2 = 1 + 2EL^2/(\mu\alpha^2)$ (cf. (5.33)) we obtain
\[ \Delta\phi \approx \frac{3\pi}{4}\,C\,\frac{L}{E^3}\sqrt{-\frac{E\mu\alpha^2}{2}}. \tag{5.76} \]
Notice that E < 0. The time T for the earth to complete one full ellipse is given by (5.53). Of course, in the present case the earth’s path is not an ellipse. However, here this is a higher order effect, which we can neglect. Using again (5.33) gives
\[ T = \frac{\pi}{-E}\sqrt{\frac{\mu\alpha^2}{-2E}} \tag{5.77} \]
and thus
\[ \omega = \frac{\Delta\phi}{T} \approx \frac{3}{4}\,C\,\frac{L}{E}. \tag{5.78} \]
The negative sign of ω indicates a clockwise rotation.
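The tabulated integral (5.74) is readily confirmed numerically. The sketch below compares a midpoint rule with the closed form; the function names and the sample eccentricities are assumptions for illustration.

```python
import math

def integral_numeric(e, n=2000):
    """Midpoint rule for the integral of (1 + e cos phi)^-4 over [0, pi]."""
    h = math.pi / n
    return sum(h / (1.0 + e * math.cos((i + 0.5) * h))**4 for i in range(n))

def integral_closed(e):
    """Right-hand side of Eq. (5.74)."""
    return 0.5 * math.pi * (2.0 + 3.0 * e * e) / (1.0 - e * e)**3.5

if __name__ == "__main__":
    for e in (0.0, 0.0167, 0.3, 0.6):
        print(e, integral_numeric(e), integral_closed(e))
```

Because the integrand is smooth and periodic, the midpoint rule converges very quickly here, so that a modest number of points suffices.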
5.3 Scattering‡
Scattering in the Center of Mass Frame‡ :
Scattering corresponds to the case when (5.24) possesses a single solution only (cf.
Fig. 5.5). The angle θ in Fig. 5.6,
θ = 2ϕ0 − π , (5.79)
is the scattering angle. ϕ0 is given by

Fig. 5.5 Effective potential

if U = − αr Ueff(r)
<0
>0
0
Fig. 5.6 Illustration of the

scattering of two particles, r
possessing attractive
interaction, within their rmin x
center of mass system. The
thick line is the path swept
out by the distance vector, r,
of the two particles. The path
follows via (5.34) with
E ≥ 0, i.e. e ≥ 1. Notice
also the rotational symmetry x
with respect to the dotted line
∞ −1/2
L L2
ϕ0 = dr 2 2μ [E − U (r )] − 2 . (5.80)
rmin r r
Notice that the scattering angle includes information on the interaction potential,
U (r ).
Scattering experiments measure the deflection of many particles expressed in
terms of the scattering cross section
$$d\sigma = \frac{dN}{n} . \tag{5.81}$$
The quantity d N is the number of scattered particles detected in the range between
θ and θ + dθ. n is the number of incident particles per unit area.
We want to relate θ, the scattering angle, to the scattering cross section (5.81)
using (5.80). Our starting point is
d N = 2πρdρn ,
where 2πρdρ is the area of a ring with radius ρ and width dρ. The center of the ring
is located on a straight line, asymptotically parallel to the direction of approach of
the incoming (effective) particle, passing through the center of mass. Notice that the
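plane of the ring is perpendicular to the aforementioned line. The quantity ρ = ρ(θ) is
the impact parameter. We find
$$d\sigma = 2\pi\rho\, d\rho = 2\pi\rho(\theta) \left| \frac{d\rho(\theta)}{d\theta} \right| d\theta . \tag{5.82}$$
Usually the scattering angle is replaced by the solid angle element $d\Omega = 2\pi \sin\theta\, d\theta$, i.e.
$$d\sigma = \frac{\rho(\theta)}{\sin\theta} \left| \frac{d\rho(\theta)}{d\theta} \right| d\Omega . \tag{5.83}$$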
Rutherford Scattering‡:
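In the following we focus specifically on $U(r) = -\alpha/r$, which is called Rutherford scattering.4 According to (5.32)
$$\varphi_0 = \phi(\infty) - \phi(r_{min}) = \arccos\left( -\frac{\pm 1}{\sqrt{1 + \left(\mu v_\infty^2 \rho/\alpha\right)^2}} \right) \tag{5.84}$$
(+ : α > 0; − : α < 0), where we have used $E = \frac{1}{2}\mu v_\infty^2$ as well as $L = \mu v_\infty \rho$. Here $v_\infty$ is the relative velocity of the particles at infinite separation. From (5.84) we immediately obtain
$$\rho^2 = \frac{\alpha^2}{\mu^2 v_\infty^4} \tan^2\varphi_0 . \tag{5.85}$$
4 Rutherford, Ernest, Lord Rutherford of Nelson (since 1931), British physicist, *Brightwater (New Zealand) 30.8.1871, †Cambridge 19.10.1937; probably the most influential experimental physicist of his time in the area of nuclear research. He received the Nobel Prize in chemistry in 1908.
With
$$\varphi_0 = \frac{\pi}{2} + \frac{\theta}{2} \tag{5.86}$$
follows
$$\left| \rho \frac{d\rho}{d\theta} \right| = \frac{\alpha^2}{2\mu^2 v_\infty^4} \left| \frac{\sin\left(\frac{\pi}{2} + \frac{\theta}{2}\right)}{\cos^3\left(\frac{\pi}{2} + \frac{\theta}{2}\right)} \right| = \frac{\alpha^2}{2\mu^2 v_\infty^4} \frac{\cos(\theta/2)}{\sin^3(\theta/2)} . \tag{5.87}$$
Using the identity $\sin\theta = 2\sin(\theta/2)\cos(\theta/2)$ leads to
$$d\sigma = \frac{\alpha^2}{4\mu^2 v_\infty^4} \frac{d\Omega}{\sin^4(\theta/2)} . \tag{5.88}$$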
This is Rutherford's scattering formula. Notice that it does not depend on the sign of α! We emphasize that the equation as it stands applies in the center of mass reference frame. Below we shall show how to transform (5.88) to the laboratory reference frame.
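As an illustration, the step from the impact parameter (5.85) to the cross section (5.88) can be verified numerically via (5.83). This is only a sketch under stated assumptions: the function names, the sample parameter values, and the central-difference step `h` are hypothetical choices, not taken from the text.

```python
import math

def impact_parameter(theta, alpha, mu, v_inf):
    """rho(theta) from (5.85) with phi_0 = pi/2 + theta/2, cf. (5.86)."""
    return abs(alpha) / (mu * v_inf**2) * abs(math.tan(0.5 * math.pi + 0.5 * theta))

def rutherford_dsigma_domega(theta, alpha, mu, v_inf):
    """d(sigma)/d(Omega) according to Rutherford's formula (5.88)."""
    return alpha**2 / (4.0 * mu**2 * v_inf**4 * math.sin(0.5 * theta) ** 4)

def dsigma_domega_from_rho(theta, alpha, mu, v_inf, h=1e-6):
    """Same quantity via (5.83), with a central difference for d(rho)/d(theta)."""
    drho = (impact_parameter(theta + h, alpha, mu, v_inf)
            - impact_parameter(theta - h, alpha, mu, v_inf)) / (2.0 * h)
    return impact_parameter(theta, alpha, mu, v_inf) * abs(drho) / math.sin(theta)

# attractive and repulsive interaction give the same cross section
print(rutherford_dsigma_domega(1.0, 2.0, 1.5, 0.7),
      rutherford_dsigma_domega(1.0, -2.0, 1.5, 0.7))
```

Both routes should give the same number for any scattering angle strictly between 0 and π.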
Changing Reference Frames:
In the following we consider elastic collisions between particles in different

frames of reference. From the point of view of mechanics there is no real differ-
ence between scattering and collisions. The latter expression emphasizes the particle
nature, whereas the former expression is the more general one, which is used outside
of classical mechanics as well.
We are not interested in the details of the immediate interactions. Instead we are
interested in the relation of the particle momenta before and after the collision, when
the particle separation is large compared to the range of their interaction, i.e. when
particles behave as free particles.
A collision is called elastic if it does not alter the internal energy of the colliding
partners. The elastic collision does not lead to permanent deformation or dissipation
of energy, i.e. heating of the environment. In the following we consider binary col-
lisions, where (S) denotes the center of mass system and (L) denotes the laboratory
frame. Unprimed quantities refer to the time before the collision, primed quantities
refer to the time after the collision.
Center of mass frame: The center of mass is at rest at the origin. Therefore (4.56) yields
$$m_1 \mathbf{r}_1(S) + m_2 \mathbf{r}_2(S) = 0 , \tag{5.89}$$
where $\mathbf{r}_1(S)$ and $\mathbf{r}_2(S)$ are the position vectors of the colliding particles. With
$$\mathbf{r}(S) = \mathbf{r}_1(S) - \mathbf{r}_2(S) \tag{5.90}$$
follows
$$\mathbf{r}_1(S) = \frac{m_2}{m} \mathbf{r}(S) \quad \text{and} \quad \mathbf{r}_2(S) = -\frac{m_1}{m} \mathbf{r}(S) \tag{5.91}$$
(cf. footnote 5) and thus
$$\mathbf{v}_1(S) = \frac{m_2}{m} \mathbf{v} \quad \text{and} \quad \mathbf{v}_2(S) = -\frac{m_1}{m} \mathbf{v} \tag{5.92}$$
($m = m_1 + m_2$). These are the particle velocities before the collision expressed in terms of the relative velocity $\mathbf{v}$. Notice that $\mathbf{v}$ and $\mathbf{r}$ are the same in both reference frames, which move relative to each other with the constant velocity $\mathbf{w}$.
The mathematical description of the collision process does not depend on the direction of time. Application of this to (5.92) yields
$$\mathbf{v}_1'(S) = \frac{m_2}{m} \mathbf{v}' \quad \text{and} \quad \mathbf{v}_2'(S) = -\frac{m_1}{m} \mathbf{v}' . \tag{5.93}$$
Conservation of the kinetic energy (elastic collision) implies $|\mathbf{v}'| = |\mathbf{v}|$ or
$$\mathbf{v}' = v\, \mathbf{n} , \tag{5.94}$$
where $\mathbf{n}$ is a unit vector in the direction of $\mathbf{v}'$. However, nothing can be said about this direction.
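Laboratory frame: The corresponding velocities in the laboratory frame are
$$\mathbf{v}_1(L) = \mathbf{v}_1(S) + \mathbf{w} \quad \text{and} \quad \mathbf{v}_2(L) = \mathbf{v}_2(S) + \mathbf{w}$$
$$\mathbf{v}_1'(L) = \mathbf{v}_1'(S) + \mathbf{w} \quad \text{and} \quad \mathbf{v}_2'(L) = \mathbf{v}_2'(S) + \mathbf{w} .$$
Here $\mathbf{w} = \dot{\mathbf{R}} = (m_1 \mathbf{v}_1(L) + m_2 \mathbf{v}_2(L))/m$ according to (5.10). In addition, due to (5.93) and (5.94),
$$\mathbf{v}_1'(L) = \frac{m_2}{m} v\, \mathbf{n} + \mathbf{w} \quad \text{and} \quad \mathbf{v}_2'(L) = -\frac{m_1}{m} v\, \mathbf{n} + \mathbf{w} . \tag{5.95}$$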
Finally, multiplication of (5.95) with m 1 and m 2 leads to the desired relations between
the momenta before and after the collision:
$$\mathbf{p}_1'(L) = \mu v\, \mathbf{n} + \frac{\mu}{m_2} \left( \mathbf{p}_1(L) + \mathbf{p}_2(L) \right) \tag{5.96}$$
and
$$\mathbf{p}_2'(L) = -\mu v\, \mathbf{n} + \frac{\mu}{m_1} \left( \mathbf{p}_1(L) + \mathbf{p}_2(L) \right) . \tag{5.97}$$
Fig. 5.7 Collision geometry for the case $\mathbf{p}_2(L) = 0$. Here θ is the scattering angle in the center of mass system, because $\mathbf{p}_1(L)$ is parallel to the orientation of the relative velocity before the collision. $\theta_1$ and $\theta_2$ are angles in the laboratory frame. Notice that $|\mu v\, \mathbf{n}| = |\frac{\mu}{m_1} \mathbf{p}_1(L)|$, because $|\mathbf{v}'| = |\mathbf{v}| = v$
5 This follows also directly from (5.10), because $\mathbf{r}_{12} = \mathbf{r}(S)$ and $\mathbf{R} = 0$ in the center of mass frame.
Beyond this point additional information can be gained only for special cases. We consider $\mathbf{p}_2(L) = 0$, i.e. particle 2 is at rest prior to the collision. This implies $\mathbf{p}_1 = m_1 \mathbf{v}$ and we are able to illustrate both (5.96) and (5.97) graphically as shown in Fig. 5.7. According to the figure we have
$$\tan\theta_1 = \frac{m_2 \sin\theta}{m_1 + m_2 \cos\theta} \tag{5.98}$$
(cf. footnote 6) as well as
$$\theta_2 = \frac{\pi - \theta}{2} , \tag{5.99}$$
relating the center of mass to the laboratory frame.
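The relations (5.98) and (5.99) translate directly into a small helper. The sketch below is illustrative only; the function name is hypothetical, and the use of `atan2` (to keep the correct quadrant when $m_1 + m_2\cos\theta$ is negative) is a design choice rather than something stated in the text.

```python
import math

def lab_angles(theta, m1, m2):
    """Return (theta1, theta2) in the laboratory frame for the CM angle theta.

    theta1 follows from (5.98), theta2 from (5.99); particle 2 is
    assumed to be at rest before the collision.
    """
    theta1 = math.atan2(m2 * math.sin(theta), m1 + m2 * math.cos(theta))
    theta2 = 0.5 * (math.pi - theta)
    return theta1, theta2

# equal masses: the two particles separate at a right angle
t1, t2 = lab_angles(1.2, 1.0, 1.0)
print(t1 + t2, 0.5 * math.pi)
```

For equal masses (5.98) reduces to $\theta_1 = \theta/2$, so that $\theta_1 + \theta_2 = \pi/2$.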
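In the case of Rutherford's formula (5.88) the scattering cross section of the particles initially at rest (particle 2) follows via (5.99)
$$\begin{aligned}
d\sigma_2 &= 2\pi \sin(\pi - 2\theta_2) \frac{\alpha^2}{4\mu^2 v_\infty^4} \frac{2\, d\theta_2}{\sin^4\left(\frac{\pi}{2} - \theta_2\right)} \\
&= 2\pi \sin(2\theta_2) \frac{\alpha^2}{2\mu^2 v_\infty^4} \frac{d\theta_2}{\cos^4\theta_2} \\
&= \left( \frac{\alpha}{\mu v_\infty^2} \right)^2 \frac{d\Omega_2}{\cos^3\theta_2}
\end{aligned} \tag{5.100}$$
(cf. footnote 7). Notice that $v_\infty \equiv v$. The corresponding expression for the particles initially not at rest is more complicated.
6 Equation (5.98) follows via (5.96), i.e.
$$p_1'(L) \cos\theta_1 = \mu v \cos\theta + \frac{\mu}{m_2} p_1(L) \quad \text{and} \quad p_1'(L) \sin\theta_1 = \mu v \sin\theta .$$
Next we eliminate $p_1'(L)$ and make use of $p_1(L) = m_1 v$.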
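Remark: It is interesting to also consider the energy distribution of the particles 2 (assuming they are initially at rest), i.e. we are looking for $d\sigma_2(\epsilon_2'(L))$ instead of $d\sigma_2(\theta_2)$. We have $\epsilon_2'(L) = \frac{1}{2} m_2 v_2'^2(L) = \frac{p_2'^2(L)}{2m_2}$. Using (5.96) as well as (5.97) yields
$$\begin{aligned}
\epsilon_2'(L) &= \frac{1}{2m_2} \left( \mu^2 v^2 + \frac{\mu^2}{m_1^2} p_1^2(L) - 2\mu^2 \frac{v}{m_1} p_1(L) \cos\theta \right) \\
&= \frac{\mu^2 v^2}{m_2} \left( 1 - \cos\left( 2\, \frac{\theta}{2} \right) \right) \\
&= 2 \frac{\mu^2 v^2}{m_2} \sin^2\frac{\theta}{2} ,
\end{aligned} \tag{5.101}$$
where $\cos(2\alpha) = \cos^2\alpha - \sin^2\alpha$. Thus we find
$$d\epsilon_2'(L) = 2 \frac{\mu^2 v^2}{m_2} \underbrace{2 \sin\frac{\theta}{2} \cos\frac{\theta}{2}}_{=\sin\theta} \frac{1}{2} d\theta = \frac{1}{2\pi} \frac{\mu^2 v^2}{m_2} d\Omega . \tag{5.102}$$
In conjunction with (5.88) we finally obtain
$$d\sigma_2 = 2\pi \frac{\alpha^2}{m_2 v_\infty^2} \frac{d\epsilon_2'(L)}{\epsilon_2'^2(L)} . \tag{5.103}$$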
7 Again we have used the double-angle identity, here in the form $\sin(2\theta_2) = 2 \sin\theta_2 \cos\theta_2$.
• Problem 24 - Inelastic Collision: An electron possessing the initial velocity $v_o$ collides with an atom at rest. The mass of the electron is m and that of the atom is M. The atom is excited into a higher energy level. The energy difference to the lower level is W. What is the minimum initial velocity of the electron in this case?
Solution: The equations for energy and momentum conservation are
$$\frac{1}{2} m v_o^2 = \frac{1}{2} m v^2 + \frac{1}{2} M V^2 + W \tag{5.104}$$
and
$$m v_o = m v + M V . \tag{5.105}$$
Inserting $v = v(v_o, V)$ from the second equation into the first, we obtain
$$V_{1,2} = \frac{v_o}{1 + M/m} \left( 1 \pm \sqrt{1 - \frac{2W}{\mu v_o^2}} \right) . \tag{5.106}$$
Here μ is the reduced mass. Because V must be real, we have
$$v_o \geq \sqrt{\frac{2W}{\mu}} . \tag{5.107}$$
Let's consider the limiting cases $M \gg m$ and $M = m$. In the first case $\mu \approx m$ or $v_o \geq \sqrt{2W/m}$. In the second case $\mu = m/2$ and thus $v_o \geq \sqrt{4W/m}$.
• Problem 25 - Hard Sphere Scattering: Hard spheres of mass $m_1$ and radius $R_1$ are scattered off other hard spheres at rest. The latter spheres possess the mass $m_2$ and the radius $R_2$. Calculate the differential cross section $d\sigma_2(\theta_2)$, where $\theta_2$ is the scattering angle in the laboratory frame, as well as the full cross section, $\sigma_2$ (interpret your result). Also calculate the ratio between the laboratory energies, $\epsilon_1'(L)/\epsilon_1(L)$, for the 1-spheres in leading order for small $\theta_1$. Here $\epsilon_1(L)$ is the energy before the collision and $\epsilon_1'(L)$ is the same quantity after the collision.
Solution: According to the following sketch
[Sketch: the two spheres with radii $R_1$ and $R_2$ at contact]
$$(R_1 + R_2) \sin\varphi_o = \rho . \tag{5.108}$$
Using the impact parameter from this equation in conjunction with (5.86) in (5.82) leads to
$$d\sigma = \pi (R_1 + R_2)^2 \cos\frac{\theta}{2} \sin\frac{\theta}{2}\, d\theta . \tag{5.109}$$
Using again (5.99) to replace θ by $\theta_2$ we obtain the differential scattering cross section in the laboratory frame, i.e.
$$d\sigma_2 = 2\pi (R_1 + R_2)^2 \sin\theta_2 \cos\theta_2\, d\theta_2 . \tag{5.110}$$
The sign in this case is chosen so that the following integration is from small to large values of $\theta_2$. The full cross section is given by
$$\sigma_2 = 2\pi (R_1 + R_2)^2 \underbrace{\int_0^{\pi/2} \sin\theta_2 \cos\theta_2\, d\theta_2}_{=1/2} = \pi (R_1 + R_2)^2 . \tag{5.111}$$
Notice that $\sigma_2$ is the area of a disk with radius $R_1 + R_2$, i.e. the midpoints of the two spheres must approach at least this close for a deflection to occur.
We now turn to the sought after ratio $\epsilon_1'(L)/\epsilon_1(L)$ of the energies of the 1-spheres in the laboratory frame. Via (5.96) follows
$$\epsilon_1'(L) = \frac{p_1'^2}{2m_1} \overset{(5.96)}{=} \frac{\mu^2 v^2}{2m_1} \left( 1 + \frac{p_1^2(L)}{m_2^2 v^2} + 2 \frac{p_1(L)}{m_2 v} \cos\theta \right) \tag{5.112}$$
and with $p_1 = m_1 v$
$$\epsilon_1'(L) = \frac{\mu^2 v^2}{2m_1} \left( 1 + \left( \frac{m_1}{m_2} \right)^2 + 2 \frac{m_1}{m_2} \cos\theta \right) . \tag{5.113}$$
The relation between $\theta_1$, the angle of deflection of the 1-spheres in the laboratory frame, and θ follows according to (5.98), i.e.
$$\tan\theta_1 = \frac{m_2 \sin\theta}{m_1 + m_2 \cos\theta} = \frac{m_2}{m} \theta + O(\theta^3) \tag{5.114}$$
or $\theta/\theta_1 \approx m/m_2$. Thus
$$\epsilon_1'(L) \approx \frac{\mu^2 v^2}{2m_1} \left( \left( \frac{m_1}{\mu} \right)^2 - \frac{m_1}{m_2} \frac{m^2}{m_2^2} \theta_1^2 \right) \tag{5.115}$$
or with $\epsilon_1(L) = m_1 v^2/2$
$$\frac{\epsilon_1'(L)}{\epsilon_1(L)} \approx 1 - \frac{m_1}{m_2} \theta_1^2 . \tag{5.116}$$
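The small-angle result for the energy ratio in Problem 25 can be compared with the exact expression (5.113). The following sketch is illustrative only; the function names and the sample masses are assumptions, and `atan2` is used for the laboratory angle (5.98) as a numerical convenience.

```python
import math

def lab_energy_ratio(theta, m1, m2):
    """Exact ratio of the 1-sphere energies after/before, from (5.113)."""
    mu = m1 * m2 / (m1 + m2)
    r = m1 / m2
    return (mu / m1) ** 2 * (1.0 + r * r + 2.0 * r * math.cos(theta))

def lab_angle_1(theta, m1, m2):
    """Laboratory deflection angle theta1 of the 1-spheres, from (5.98)."""
    return math.atan2(m2 * math.sin(theta), m1 + m2 * math.cos(theta))

m1, m2, theta = 1.0, 3.0, 0.01          # hypothetical sample values
theta1 = lab_angle_1(theta, m1, m2)
print(lab_energy_ratio(theta, m1, m2), 1.0 - m1 / m2 * theta1**2)
```

The difference between the two printed numbers is of fourth order in the angle, as expected for a leading-order expansion.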
The Slingshot-Effect‡:
An interesting space flight application of scattering theory is the slingshot-effect

or gravity assist, which we want to discuss next, albeit very much simplified.
On October 15, 1997 NASA sent the space probe Cassini8 on its 6.7 year flight
to Saturn.9 A Titan/Centaur booster rocket accelerated the spacecraft with a mass of
5700 kg to a velocity of 4 km/s relative to earth. However, Saturn is rather high up in
the gravitational potential of the sun. In order for a space vehicle to reach Saturn from
an orbit around earth, it must at least have a velocity of 10 km/s. Cassini’s flight plan
therefore included four planetary flybys, designed to provide additional velocity -
Venus, Venus again, earth, and finally Jupiter (VVEJ flightpath). During every flyby Cassini was accelerated by the planets, which revolve at velocities ranging from 13 km/s (Jupiter) to 35 km/s (Venus) around the sun. The first flyby (Venus) on April 26, 1998 supplied an extra velocity of about 7 km/s; the third (earth) on August 17, 1999 added another 5.5 km/s. The fourth and last flyby (Jupiter) on December 30, 2000 produced an additional 2 km/s. Cassini reached Saturn on July 1, 2004. The estimated fuel reduction due to the four maneuvers was at least 75 tons.
8 Cassini, Giovanni Domenico, French astronomer, *Perinaldo (near Nice) 8.6.1625, †Paris 14.9.1712.
9 see http://saturn.jpl.nasa.gov/index.cfm.
Fig. 5.8 Extreme variant of the slingshot-maneuver
An extreme variant of the maneuver is depicted in Fig. 5.8. The spacecraft is
approaching a planet head-on with velocity v. The planet moves in the direction of
the spacecraft with velocity V (both velocities are defined relative to a sun-based
reference frame). We imagine a very sharp turn around the planet, after which the
spacecraft is heading in the opposite direction. This is akin to a head-on elastic
collision. An observer on the planet watches the spacecraft disappear with the velocity
V + v following the collision. However, the planet possesses the velocity V relative
to the sun. Thus, relative to the sun, the velocity of the spacecraft after the collision
is 2V + v.
Let us consider this in more detail. Conservation of kinetic energy as well as momentum requires
$$M V^2 + m v^2 = M V'^2 + m v'^2$$
and
$$-M V + m v = M V' + m v' .$$
Solving with respect to $v'$ yields
$$v' = -\frac{(1 - q) v + 2V}{1 + q} , \tag{5.117}$$
where $q = m/M$. Because q almost vanishes (the mass of the spacecraft is so much smaller than the mass of the planet), (5.117) is simplified to $v' \approx -(v + 2V)$.
Fig. 5.9 Top as seen by an observer on the planet; bottom as seen by an observer at rest in the sun-based reference frame
Planetary flybys are not head-on collisions of course. But the same laws also
apply in the case of less extreme deflections. Here we assume that the planet moves
along the x-axis. The y-axis is perpendicular to the ecliptic plane. The craft initially
moves with the velocity v relative to the sun-based reference frame. Its path is at an
angle θ relative to the x-axis. Figure 5.9 shows this; its top panel in the rest frame
of the planet and its bottom panel in the sun-based reference frame. The spacecraft’s
velocity components in the sun-based system long before the encounter with the
planet are
$$v_x = v \cos\theta \quad \text{and} \quad v_y = v \sin\theta .$$
Long after the flyby, according to our above discussion, we have
$$v_x' \approx -(v \cos\theta + 2V) \quad \text{and} \quad v_y' \approx v \sin\theta . \tag{5.118}$$
This yields
$$v' \approx (v + 2V) \sqrt{1 - \frac{4Vv(1 - \cos\theta)}{(v + 2V)^2}} , \tag{5.119}$$
the magnitude of the velocity of the spacecraft after the 'collision'. If for instance the initial velocities of the spacecraft and the planet are the same, then the above relation (5.119) reduces to
$$v' \approx v \sqrt{5 + 4\cos\theta} .$$
For θ = 0 we have $v' \approx 3v$. This is the above head-on collision. On the other hand, if θ = π the result is $v' \approx v$. In this case the spacecraft and the planet move in the same direction having the same velocity. Somewhat more realistic is the approach of the spacecraft almost perpendicular to the path of the planet followed by a sharp curve right behind the planet. In this case the spacecraft is deflected in the direction of the path of the planet, where the angle is given by the above formula and its final velocity is $\sqrt{5}$ times its initial velocity.
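The slingshot relations can be explored with a few lines of code. This is a minimal sketch under the simplifications of the text (one-dimensional elastic encounter, then $q \to 0$ for the oblique flyby); the function names are hypothetical.

```python
import math

def head_on_final_velocity(v, V, q):
    """Spacecraft velocity after the head-on encounter, (5.117); q = m/M."""
    return -((1.0 - q) * v + 2.0 * V) / (1.0 + q)

def flyby_final_speed(v, V, theta):
    """Speed after the flyby for q -> 0, built from the components (5.118)."""
    vx = -(v * math.cos(theta) + 2.0 * V)
    vy = v * math.sin(theta)
    return math.hypot(vx, vy)

def flyby_final_speed_formula(v, V, theta):
    """Closed form (5.119) for the same speed."""
    return (v + 2.0 * V) * math.sqrt(
        1.0 - 4.0 * V * v * (1.0 - math.cos(theta)) / (v + 2.0 * V) ** 2)

# equal speeds of spacecraft and planet, perpendicular approach
print(flyby_final_speed(1.0, 1.0, 0.5 * math.pi), math.sqrt(5.0))
```

Evaluating `head_on_final_velocity` with a small but finite `q` shows how little the finite spacecraft mass changes the idealized result $-(v + 2V)$.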
References
1. M.R. Spiegel, Advanced Mathematics - Schaum's Outline Series in Mathematics. McGraw-Hill (1971)
2. I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products. Academic Press (1980)
Chapter 6
Small Oscillations
Oscillatory or vibrational motion has applications in all of physics as well as in engi-

neering. We begin with the one-dimensional harmonic oscillator, including friction
and external excitation, and continue to study wave propagation along linear chains,
which eventually leads us to normal mode analysis.
6.1 One-Dimensional Motion†
We assume that the potential energy, $U(q)$, where q is a generalized coordinate, possesses a local minimum at $q_o$. Expanding $U(q)$ in the vicinity of this minimum yields
$$U(q) = U(q_o) + \underbrace{\frac{\partial U}{\partial q}\bigg|_{q_o}}_{=0} (q - q_o) + \frac{1}{2} \frac{\partial^2 U}{\partial q^2}\bigg|_{q_o} (q - q_o)^2 + O((q - q_o)^3) . \tag{6.1}$$
The first term on the right hand side is a constant. The next term vanishes at qo . If we
are interested in the motion close to qo , then the third term is sufficient, i.e. we do
not need higher order terms. This is the classical harmonic oscillator. If higher order
terms are included, then the oscillator becomes an anharmonic oscillator, assuming
the motion remains bounded within the potential well.
Harmonic Oscillator† :
DOI 10.1007/978-3-319-48710-6_6
Fig. 6.1 The harmonic oscillator displaced by x(t) from its rest position
In this section we focus on the one-dimensional harmonic oscillator with mass m and spring constant $k = \partial^2 U(x)/\partial x^2 |_o > 0$ as shown in Fig. 6.1. Its equation of motion is
m ẍ(t) = −k x(t) , (6.2)
where x(t) is the displacement of the mass from the rest position at time t. The same
result is obtained based on the Lagrangian
$$L = \frac{m}{2} \dot{x}^2 - \frac{k}{2} x^2 . \tag{6.3}$$
Slightly rewritten (6.2) becomes
$$\ddot{x} + \omega_o^2 x = 0 , \tag{6.4}$$
where $\omega_o = \sqrt{k/m}$. The general solution, as for the pendulum when its amplitude is small, is
$$x(t) = c_1 \cos\omega_o t + c_2 \sin\omega_o t \tag{6.5}$$
or
$$x(t) = a \cos(\omega_o t + \alpha) . \tag{6.6}$$
Here $a = \sqrt{c_1^2 + c_2^2}$ and $\tan\alpha = -c_2/c_1$ (cf. Sect. 2.2). The quantity a is the amplitude, α is the phase of the oscillation, and $\omega_o$ is its frequency. The energy of the oscillator is
$$E = \frac{m}{2} \dot{x}^2 + \frac{k}{2} x^2 = \frac{m}{2} \left( \dot{x}^2 + \omega_o^2 x^2 \right) = \frac{1}{2} m \omega_o^2 a^2 . \tag{6.7}$$
Harmonic Oscillator Including Friction† :
Next we include an additional friction force,
f R = −ζ ẋ , (6.8)
assumed to be proportional to the velocity. Here ζ is a friction coefficient. In this

case (6.4) becomes
ẍ + 2λẋ + ωo2 x = 0 , (6.9)
where 2λ = ζ/m.
This may look reasonable. But do we really understand what we are doing? Notice that (6.8) is not a general law. It is a simple approximation, which applies to certain situations only. One example is the slow motion of a sphere in a gas or a liquid (Stokes' law1).2 Another example is viscous loss in elastomers during dynamic deformation (cf. Sect. 10.7). Notice also that the equation of motion (6.9) is obtained by simple addition of the friction force. But what is the Lagrangian in this case? The answer is that we do not have an expression for it. Thus far we consider a single mass only. This particle or point mass loses energy to an environment for which we do not have a suitable description. We study an open system, which dissipates energy dependent on its trajectory, i.e. the above friction force is not conservative. Mathematically this means $\oint \mathbf{f}_R \cdot d\mathbf{s} \neq 0$. The friction force cannot be derived from a potential.3 We can however write down an 'extended' Euler–Lagrange equation in which such non-conservative forces, $\mathbf{f}$, are included by simply adding them:
$$\frac{\partial L}{\partial \mathbf{r}} - \frac{d}{dt} \frac{\partial L}{\partial \dot{\mathbf{r}}} + \mathbf{f} = 0 . \tag{6.10}$$
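The solution of (6.9) follows via the ansatz $x = e^{rt}$. This leads to the characteristic equation $r^2 + 2\lambda r + \omega_o^2 = 0$ possessing the roots $r_{1,2} = -\lambda \pm \sqrt{\lambda^2 - \omega_o^2}$. Thus, the general solution is given by
$$x(t) = c_1 e^{r_1 t} + c_2 e^{r_2 t} , \tag{6.11}$$
where we must distinguish the three cases $\lambda < \omega_o$, $\lambda > \omega_o$, and $\lambda = \omega_o$.
$\lambda < \omega_o$: We obtain the damped oscillation
$$x(t) = a e^{-\lambda t} \cos\left( \sqrt{\omega_o^2 - \lambda^2}\, t + \alpha \right) . \tag{6.12}$$
$\lambda > \omega_o$: In this case the motion is not periodic, i.e.
$$x(t) = c_1 e^{-\left( \lambda - \sqrt{\lambda^2 - \omega_o^2} \right) t} + c_2 e^{-\left( \lambda + \sqrt{\lambda^2 - \omega_o^2} \right) t} . \tag{6.13}$$
$\lambda = \omega_o$: We now have
$$x(t) = (c_1 + c_2 t) e^{-\lambda t} . \tag{6.14}$$
Notice that in this case the ansatz $x = e^{rt}$ alone is not sufficient, because the two roots coincide.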
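The three cases can be collected in one function that evaluates the solution of (6.9) for given initial conditions. The sketch below is one possible way to do this and not the text's own procedure: the function name and the initial-value parametrization ($x(0) = x_0$, $\dot{x}(0) = v_0$) are assumptions, and complex roots are handled with `cmath` so that the cases $\lambda < \omega_o$ and $\lambda > \omega_o$ share the form (6.11).

```python
import cmath
import math

def damped_solution(t, lam, omega_o, x0, v0):
    """x(t) solving (6.9) with x(0) = x0 and dx/dt(0) = v0.

    Uses (6.11) when the roots r_1 and r_2 differ and (6.14) when
    lam == omega_o (critical damping).
    """
    if lam == omega_o:
        # (6.14): c1 = x0 and c2 = v0 + lam * x0
        return (x0 + (v0 + lam * x0) * t) * math.exp(-lam * t)
    s = cmath.sqrt(lam**2 - omega_o**2)   # purely imaginary if lam < omega_o
    r1, r2 = -lam + s, -lam - s
    c1 = (v0 - r2 * x0) / (r1 - r2)
    c2 = x0 - c1
    return (c1 * cmath.exp(r1 * t) + c2 * cmath.exp(r2 * t)).real

# weak damping, strong damping, and the critical case at t = 2
for lam in (0.1, 2.0, 1.0):
    print(lam, damped_solution(2.0, lam, 1.0, 1.0, 0.0))
```

For $\lambda < \omega_o$ the real part of the complex expression reproduces the damped oscillation (6.12).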
1 Stokes, Sir George Gabriel, British mathematician and physicist, *Skreen (County Sligo, Ireland) 13.8.1819, †Cambridge 1.2.1903.
2 A justification for (6.8) in a special case is given on p. 276.
3 Notice that if $\mathbf{F} = -\partial U/\partial \mathbf{r} \equiv -\nabla U$, then $\oint \mathbf{F} \cdot d\mathbf{s} = -\oint \nabla U \cdot d\mathbf{s} = -\oint dU = 0$.
Fig. 6.2 Linear chain consisting of harmonic springs joining different masses
Linear Chain‡ :
We consider an infinite chain of harmonically coupled masses in one dimension

shown in Fig. 6.2. The horizontal lines, which join the point masses (solid circles),
are identical harmonic springs. Their spring constant is C. Notice that the chain
consists of identical elements, each containing two masses M and m. The quantities
u s and vs are the displacements of the attendant masses from their rest position in
the corresponding element s.
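We derive the equations of motion for the masses based on the Lagrangian of the chain given by
$$L = \sum_s \left( \frac{M \dot{u}_s^2}{2} + \frac{m \dot{v}_s^2}{2} \right) - \sum_s \frac{C}{2} \left( \left[ v_{s-1} - u_s \right]^2 + \left[ u_s - v_s \right]^2 \right) .$$
The attendant equations of motion follow via
$$\frac{d}{dt} \frac{\partial L}{\partial \dot{u}_s} = \frac{\partial L}{\partial u_s} \quad \text{and} \quad \frac{d}{dt} \frac{\partial L}{\partial \dot{v}_s} = \frac{\partial L}{\partial v_s} , \tag{6.15}$$
i.e.
$$M \ddot{u}_s = C (v_{s-1} - u_s) - C (u_s - v_s)$$
and
$$m \ddot{v}_s = C (u_s - v_s) - C (v_s - u_{s+1}) .$$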
This is a coupled system of differential equations, which we solve using the following
ansatz:
u s = u exp [i (sak − ωt)] and vs = v exp [i (sak − ωt)] . (6.16)
The quantity a is the length of a chain element, k is the wavenumber, and ω is the
frequency of the oscillation multiplied by 2π.
But what is the physical meaning of this ansatz? Using Euler’s formulas we have
exp [i (sak − ωt)] = cos (sak − ωt) + i sin (. . . ) .

Momentarily we concentrate on the real part only and we set $k = 2\pi/\lambda$ and $\omega = 2\pi/T$. Hence
$$\cos(sak - \omega t) = \cos\left[ 2\pi \left( \frac{sa}{\lambda} - \frac{t}{T} \right) \right] .$$
If we consider this expression at a fixed time t = 0,4 then
$$\cos(sak - \omega t) \overset{t=0}{=} \cos\left( 2\pi \frac{sa}{\lambda} \right) .$$
This tells us that λ is the wavelength of the periodic displacements of the masses along the chain, i.e. masses spaced $sa = n\lambda$ apart, where $n = 0, \pm 1, \pm 2, \ldots$, do have the same displacement from their respective rest positions. Analogously we select a fixed position along the chain, for instance the element s = 0. In this case
$$\cos(sak - \omega t) \overset{s=0}{=} \cos\left( 2\pi \frac{t}{T} \right) .$$
This means that at times t = lT , where l = 0, ±1, ±2, . . . , a mass within our element
s does have the same reoccurring displacement. T is the period of the time-dependent
displacement for this mass. Finally, we choose to move along the chain with the
constant velocity λ/T (along the positive x-axis). What do we observe? In this case
sa/λ − t/T = 0, i.e. we do observe that the displacement right next to us remains
constant.
All in all the above ansatz describes a wave, in terms of the displacements of the masses, which moves at constant velocity λ/T within the chain along the positive direction. The reason why we use a complex expression to describe the wave,5 instead of just cos(...) or sin(...), is the greater mathematical convenience - as we shall see.
The mathematical form of the ansatz (6.16), at first glance, is unexpected. On the basis of the previous discussion we would expect
$$u_s = c_{u,1} \cos(sak \pm \omega t) + c_{u,2} \sin(\ldots) = u \exp[i(sak \pm \omega t)] + \bar{u} \exp[-i(\ldots)]$$
with
$$u \equiv \frac{c_{u,1} - i c_{u,2}}{2}$$
and an analogous expression for $v_s$. Our ansatz only includes the first terms of the general form, which is the sum of (6.16) and its complex conjugate. In addition, we merely consider the positive sign of ±ω, i.e. we consider waves traveling in positive direction only. However, keeping this in mind (6.16) is fully sufficient for what follows.
4 This means we take a snapshot, showing the oscillating chain at time t = 0.
5 Everything we just discussed also applies to the sin-part.
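Returning to the problem at hand, we insert (6.16) into the equations of motion and obtain
$$\begin{aligned}
-\omega^2 M u &= C v e^{-iak} - 2Cu + Cv \\
-\omega^2 m v &= C u - 2Cv + C u e^{iak}
\end{aligned} \tag{6.17}$$
or in vector notation
$$\begin{pmatrix} -\omega^2 M + 2C & -C \left( 1 + e^{-iak} \right) \\ -C \left( 1 + e^{iak} \right) & -\omega^2 m + 2C \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = 0 .$$
One solution is u = v = 0. However, the physically meaningful solution follows via the condition (cf. the example in Sect. 1.3)
$$\det \begin{pmatrix} -\omega^2 M + 2C & -C \left( 1 + e^{-iak} \right) \\ -C \left( 1 + e^{iak} \right) & -\omega^2 m + 2C \end{pmatrix} = 0 .$$
Thus
$$\left( -\omega^2 M + 2C \right) \left( -\omega^2 m + 2C \right) - C^2 \left( 2 + e^{iak} + e^{-iak} \right) = 0$$
(cf. (1.48)), i.e.
$$\left( \omega^2 \right)^2 - 2C \left( \frac{1}{m} + \frac{1}{M} \right) \omega^2 + 2 \frac{C^2}{mM} \left( 1 - \cos(ak) \right) = 0 . \tag{6.18}$$
The solutions, or in this case dispersion relations, are
$$\omega_{1,2} = \left\{ \frac{C}{m} \left[ 1 + \frac{m}{M} \pm \left( 1 + \frac{m^2}{M^2} + 2 \frac{m}{M} \cos(ak) \right)^{1/2} \right] \right\}^{1/2} . \tag{6.19}$$
Notice that the index 1 indicates the plus sign, whereas the index 2 indicates the minus sign.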
But what do these dispersion relations, depicted in Fig. 6.3, mean? The origin, i.e.
ak/π = 0, is the limit of infinite wavelength. The other limit, i.e. ak/π = 1, corre-
sponds to the wavelength λ = 2a. Shorter wavelengths do not make sense (why?).
Notice that the velocity of the wave, λ/T = ω/k, depends on k and thus on the
wavelength - along two distinct ω-branches.
Fig. 6.3 Dispersion relations from (6.19) for m/M = 0.5, plotted versus ak/π. Top: $\sqrt{m/C}\, \omega_1$; bottom: $\sqrt{m/C}\, \omega_2$
We can insert the solutions for ω into (6.17) in order to obtain the attendant complex amplitudes u and v,6 i.e. the type of oscillation or specific mode of oscillation. Here we consider the simple limit ak/π = 0 only. Addition of the equations in (6.17) yields
$$\omega^2 (M u + m v) = 0 .$$
Because $\omega_2 = 0$, we do not learn anything about the relation of $\omega_2$ and the amplitudes in this limit. But we also have $\omega_1 \neq 0$ and thus
$$u = -\frac{m}{M} v$$
for the $\omega_1$-branch in this limit. This means that the displacements of the masses m and M are opposite when the wavelengths are long compared to the segment size, a. Notice that the same relation is true for $\bar{u}$ and $\bar{v}$ (cf. above) and therefore also for the real constants ($c_{u,i} = -(m/M) c_{v,i}$ with i = 1, 2).
6 The displacements $u_s$ and $v_s$ are real of course.
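The two branches of (6.19) are easy to tabulate. The following sketch is illustrative; the function name and the sample values C = m = 1, M = 2 (i.e. m/M = 0.5, as in Fig. 6.3) are assumptions chosen for the example.

```python
import math

def chain_frequencies(ak, C, m, M):
    """Return (omega_1, omega_2) from (6.19) for a given value of a*k."""
    r = m / M
    root = math.sqrt(1.0 + r * r + 2.0 * r * math.cos(ak))
    omega1_sq = C / m * (1.0 + r + root)
    omega2_sq = C / m * (1.0 + r - root)
    # guard against a tiny negative round-off value at ak = 0
    return math.sqrt(omega1_sq), math.sqrt(max(omega2_sq, 0.0))

# tabulate both branches between ak = 0 and ak = pi
for i in range(6):
    ak = i * math.pi / 5
    print(round(ak / math.pi, 1), chain_frequencies(ak, 1.0, 1.0, 2.0))
```

At ak = π the two branches take the values $\sqrt{2C/m}$ and $\sqrt{2C/M}$, i.e. there is a gap of frequencies that neither branch can realize.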
• Problem 26 - Linear Chain (Newton): We consider the infinite one-dimensional chain in the following sketch. The chain consists of identical segments of length a. The black dots are identical masses, M. This time the spring constants, $C_1$ and $C_2$, are different. $u_s$ and $v_s$ are the displacements of the respective masses within segment s from their rest positions.
[Sketch: chain segments s−1, s, s+1 of length a with the displacements $u_{s+1}$, $v_{s+1}$ and the alternating springs $C_1$, $C_2$]
Write down the equations of motion for $u_s$ and $v_s$. Insert the ansatz (6.16). Obtain the two solutions for $\omega^2$, i.e. $\omega^2 = \omega^2(k)$, and the explicit form of the dispersion relations depicted in the second sketch for a special case.
Solution: Newton's equations of motion are
$$M \frac{d^2 u_s}{dt^2} = C_1 (v_s - u_s) + C_2 (v_{s-1} - u_s) \tag{6.20}$$
$$M \frac{d^2 v_s}{dt^2} = C_1 (u_s - v_s) + C_2 (u_{s+1} - v_s) . \tag{6.21}$$
Analogous to the chain problem in Fig. 6.2 we obtain
$$\omega^2 = \frac{C_1 + C_2}{M} \pm \frac{1}{M} \left( C_1^2 + C_2^2 + 2 C_1 C_2 \cos(ak) \right)^{1/2} . \tag{6.22}$$
In the special case $C = C_1 = C_2$ and $M = m$ this agrees with our previous result (6.19).
In the limit $ka \ll 1$ we have
$$\omega^2 = \begin{cases} \dfrac{2(C_1 + C_2)}{M} - \dfrac{C_1 C_2}{2(C_1 + C_2) M} (ak)^2 + O\left( a^4 k^4 \right) & (+) \\[2ex] \dfrac{C_1 C_2}{2(C_1 + C_2) M} (ak)^2 + O\left( a^4 k^4 \right) & (-) \end{cases} , \tag{6.23}$$
while in the limit $ka = \pi$
$$\omega^2 = \begin{cases} \dfrac{C_1 + C_2}{M} + \dfrac{|C_1 - C_2|}{M} + O\left( (ka - \pi)^2 \right) & (+) \\[2ex] \dfrac{C_1 + C_2}{M} - \dfrac{|C_1 - C_2|}{M} + O\left( (ka - \pi)^2 \right) & (-) \end{cases} . \tag{6.24}$$
All in all this yields the dispersion relations shown in the above sketch.
Remark 1: Because the propagation velocity of the waves is given by $c = \nu\lambda = \omega/k$ (phase velocity), the above result means that there are two types of waves, which for the same k-value have different propagation velocities. Notice also that there are frequencies, i.e. values of ω, which cannot be realized by either of the two wave types.
Remark 2 - amplitude ratio when k = 0: The equations of motion yield
$$-M \omega^2 u = C_1 (v - u) + C_2 (v - u) \tag{6.25}$$
$$-M \omega^2 v = -C_1 (v - u) - C_2 (v - u) . \tag{6.26}$$
Adding the two equations gives
$$-M \omega^2 (u + v) = 0 \tag{6.27}$$
and thus, for the (+)-branch, $u/v = -1$.
• Problem 27 - Linear Chain (Euler–Lagrange): Repeat the previous problem on the basis of the chain's Lagrangian, i.e. derive the equations of motion for $u_s$ and $v_s$ from the Euler–Lagrange equations.
Solution: The Lagrangian is given by
$$L = \sum_s \left[ \frac{1}{2} M \dot{u}_s^2 + \frac{1}{2} M \dot{v}_s^2 - \frac{1}{2} C_1 (u_s - v_s)^2 - \frac{1}{2} C_2 (v_{s-1} - u_s)^2 \right] . \tag{6.28}$$
Using (6.15) we immediately obtain the equations of motion (6.20) and (6.21).
• Problem 28 - Two-Dimensional Double-Pendulum: The sketch shows a double-pendulum moving in the x−y-plane. The two threads of length l are assumed to be massless, i.e. the entire mass of the pendulum is concentrated in the two point masses $m_1$ and $m_2$.
[Sketch: double-pendulum with the angles $\phi_1$ and $\phi_2$, the threads of length l, the masses $m_1$ and $m_2$, the gravitational force $F_g$, and the y-axis]
(a) Make a sketch including all relevant forces. Based on this sketch write down Newton's equations of motion for the angles $\phi_1$ and $\phi_2$. Hint: Extend the approach used to solve the pendulum problem on p. 53. This leads to four equations. Eliminate the string tensions and obtain the equations of motion. Notice: $\sin\phi_1 \sin\phi_2 + \cos\phi_1 \cos\phi_2 = \cos(\phi_1 - \phi_2)$ and $\sin\phi_1 \cos\phi_2 - \cos\phi_1 \sin\phi_2 = \sin(\phi_1 - \phi_2)$.
(b) Derive the same equations of motion using the Lagrangian of the double-pendulum.
(c) Solve the equations of motion in the limit of small angles $\phi_1$ and $\phi_2$. Calculate the pendulum frequencies and the amplitude ratios. Sketch the two oscillation types. Hint: Expand the exact equations of motion in terms of the angular variables, neglecting quadratic and higher order terms. Also neglect terms like $\phi_i \dot{\phi}_j^2$ (i, j = 1, 2) (why?). The resulting coupled system can be solved analogous to the linear chain problem.
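Solution: (a) The following sketch shows the double-pendulum including the forces acting on the masses. Also included are the relevant unit vectors.
[Sketch: double-pendulum with the thread tensions $T_1$, $T_2$ and $-T_2$, the gravitational forces $F_{g,1}$ and $F_{g,2}$, and the unit vectors $\mathbf{e}_{\parallel,1}$, $\mathbf{e}_{\perp,1}$, $\mathbf{e}_{\parallel,2}$, $\mathbf{e}_{\perp,2}$]
Analogous to the mathematical pendulum on p. 53 we obtain the following equation of motion,
$$-m_1 l \ddot{\phi}_1 \mathbf{e}_{\perp,1} - m_1 l \dot{\phi}_1^2 \mathbf{e}_{\parallel,1} = F_{g,1}^{\perp} \mathbf{e}_{\perp,1} + F_{g,1}^{\parallel} \mathbf{e}_{\parallel,1} - T_1 \mathbf{e}_{\parallel,1} + T_2 \mathbf{e}_{\parallel,2} , \tag{6.29}$$
for $m_1$ at $\mathbf{r}_1 = l(\sin\phi_1, \cos\phi_1)$. In the case of the mass $m_2$ the position vector is $\mathbf{r}_2 = l(\sin\phi_1, \cos\phi_1) + l(\sin\phi_2, \cos\phi_2)$ and thus
$$-m_2 l \ddot{\phi}_1 \mathbf{e}_{\perp,1} - m_2 l \dot{\phi}_1^2 \mathbf{e}_{\parallel,1} - m_2 l \ddot{\phi}_2 \mathbf{e}_{\perp,2} - m_2 l \dot{\phi}_2^2 \mathbf{e}_{\parallel,2} = F_{g,2}^{\perp} \mathbf{e}_{\perp,2} + F_{g,2}^{\parallel} \mathbf{e}_{\parallel,2} - T_2 \mathbf{e}_{\parallel,2} . \tag{6.30}$$
We do need both equations, component by component, in a common coordinate system. Therefore we express $\mathbf{e}_{\parallel,2}$ and $\mathbf{e}_{\perp,2}$ in terms of $\mathbf{e}_{\parallel,1}$ and $\mathbf{e}_{\perp,1}$. First we have
$$\mathbf{e}_{\parallel,2} = (\mathbf{e}_{\parallel,2} \cdot \mathbf{e}_{\perp,1}) \mathbf{e}_{\perp,1} + (\mathbf{e}_{\parallel,2} \cdot \mathbf{e}_{\parallel,1}) \mathbf{e}_{\parallel,1} = a\, \mathbf{e}_{\perp,1} + b\, \mathbf{e}_{\parallel,1} , \tag{6.31}$$
where $a = \sin(\phi_1 - \phi_2)$ and $b = \cos(\phi_1 - \phi_2)$. Secondly
$$\mathbf{e}_{\perp,2} = (\mathbf{e}_{\perp,2} \cdot \mathbf{e}_{\perp,1}) \mathbf{e}_{\perp,1} + (\mathbf{e}_{\perp,2} \cdot \mathbf{e}_{\parallel,1}) \mathbf{e}_{\parallel,1} = b\, \mathbf{e}_{\perp,1} - a\, \mathbf{e}_{\parallel,1} . \tag{6.32}$$
Applied to (6.29) and (6.30) we obtain the four component equations
$$-m_1 l \ddot{\phi}_1 = m_1 g \sin\phi_1 + T_2 a \tag{6.33}$$
$$-m_1 l \dot{\phi}_1^2 = m_1 g \cos\phi_1 - T_1 + T_2 b \tag{6.34}$$
$$-m_2 l \ddot{\phi}_1 - m_2 l \ddot{\phi}_2 b - m_2 l \dot{\phi}_2^2 a = m_2 g b \sin\phi_2 + m_2 g a \cos\phi_2 - T_2 a \tag{6.35}$$
$$-m_2 l \dot{\phi}_1^2 + m_2 l \ddot{\phi}_2 a - m_2 l \dot{\phi}_2^2 b = -m_2 g a \sin\phi_2 + m_2 g b \cos\phi_2 - T_2 b . \tag{6.36}$$
Addition of (6.33) and (6.35) yields the differential equation
$$\ddot{\phi}_1 + \frac{m_2}{m_1 + m_2} \left[ \ddot{\phi}_2 \cos(\phi_1 - \phi_2) + \dot{\phi}_2^2 \sin(\phi_1 - \phi_2) \right] = -\frac{g}{l} \sin\phi_1 . \tag{6.37}$$
A second differential equation, which also does not contain the as yet unknown tensions, is obtained by subtracting (6.35) multiplied by b from (6.36) multiplied by a:
$$\ddot{\phi}_2 + \ddot{\phi}_1 \cos(\phi_1 - \phi_2) - \dot{\phi}_1^2 \sin(\phi_1 - \phi_2) = -\frac{g}{l} \sin\phi_2 . \tag{6.38}$$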
(b) The kinetic energy of the double-pendulum is
$$K = \frac{1}{2} m_1 \dot{\mathbf{r}}_1^2 + \frac{1}{2} m_2 \dot{\mathbf{r}}_2^2 . \tag{6.39}$$
Inserting the time derivatives of the position vectors from part (a) leads to
$$K = \frac{1}{2} m_1 l^2 \dot{\phi}_1^2 + \frac{1}{2} m_2 l^2 \left( \dot{\phi}_1^2 + \dot{\phi}_2^2 + 2 \dot{\phi}_1 \dot{\phi}_2 \cos(\phi_1 - \phi_2) \right) . \tag{6.40}$$
The potential energy expressed in terms of the angles is given by
$$U = m_1 g l (1 - \cos\phi_1) + m_2 g l \left[ (1 - \cos\phi_1) + (1 - \cos\phi_2) \right] . \tag{6.41}$$
Using $L = K - U$ and $d/dt\, (\partial L/\partial \dot{\phi}_i) = \partial L/\partial \phi_i$ (i = 1, 2) again yields the above equations of motion (6.37) and (6.38).
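(c) We assume small amplitudes, i.e. $\cos(\phi_1 - \phi_2) \approx 1$ and $\sin(\phi_1 - \phi_2) \approx \phi_1 - \phi_2$ as well as $\sin\phi_1 \approx \phi_1$. Thus (6.37) and (6.38) become
$$\ddot{\phi}_1 + \frac{m_2}{m_1 + m_2} \left[ \ddot{\phi}_2 + \dot{\phi}_2^2 (\phi_1 - \phi_2) \right] = -\frac{g}{l} \phi_1 \tag{6.42}$$
and
$$\ddot{\phi}_2 + \ddot{\phi}_1 - \dot{\phi}_1^2 (\phi_1 - \phi_2) = -\frac{g}{l} \phi_2 , \tag{6.43}$$
respectively. In addition we neglect the terms $\dot{\phi}_i^2 (\phi_1 - \phi_2)$, because they are products of small quantities. All in all we obtain the coupled system
$$\ddot{\phi}_1 + \frac{g}{l} \phi_1 + \frac{m_2}{m_1 + m_2} \ddot{\phi}_2 = 0 \tag{6.44}$$
and
$$\ddot{\phi}_2 + \frac{g}{l} \phi_2 + \ddot{\phi}_1 = 0 . \tag{6.45}$$
Inserting the ansatz
$$\phi_1(t) = u e^{i\omega t} \quad \text{and} \quad \phi_2(t) = v e^{i\omega t} \tag{6.46}$$
(cf. (6.16)) yields the following system of equations:
$$\begin{pmatrix} -\omega^2 + g/l & -\omega^2 m_2/(m_1 + m_2) \\ -\omega^2 & -\omega^2 + g/l \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = 0 . \tag{6.47}$$
The condition det(...) = 0 yields the eigenfrequencies
$$\omega_\pm^2 = \frac{g/l}{1 \pm \sqrt{m_2/(m_1 + m_2)}} . \tag{6.48}$$
Inserting $\omega_\pm^2$ into (6.47) yields the amplitude ratio
$$\frac{u}{v} = \pm \sqrt{m_2/(m_1 + m_2)} . \tag{6.49}$$
We conclude that the frequencies $\omega_\pm$ belong to two different types of oscillations (cf. the following sketch). In one case the amplitudes possess the same sign; in the other case the signs are opposite.
[Sketch: the two oscillation types, labeled + Mode and - Mode]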
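The small-angle result of Problem 28 can be checked by inserting the eigenfrequencies (6.48) and amplitude ratios (6.49) back into the linearized system (6.44) and (6.45). The code below is a sketch for illustration; the function names, the default value of g, and the sample masses are assumptions.

```python
import math

def double_pendulum_modes(m1, m2, l, g=9.81):
    """Return [(omega, u_over_v), ...] for the '+' and '-' modes.

    omega follows from (6.48), the amplitude ratio u/v from (6.49).
    """
    kappa = math.sqrt(m2 / (m1 + m2))
    modes = []
    for sign in (+1.0, -1.0):
        omega = math.sqrt((g / l) / (1.0 + sign * kappa))
        modes.append((omega, sign * kappa))
    return modes

def linear_residuals(omega, u, v, m1, m2, l, g=9.81):
    """Both rows of (6.47), i.e. (6.44) and (6.45) with the ansatz (6.46)."""
    w2 = omega**2
    row1 = (-w2 + g / l) * u - w2 * m2 / (m1 + m2) * v
    row2 = -w2 * u + (-w2 + g / l) * v
    return row1, row2

for omega, ratio in double_pendulum_modes(1.0, 0.5, 1.0):
    print(omega, ratio, linear_residuals(omega, ratio, 1.0, 1.0, 0.5, 1.0))
```

Both residuals should vanish up to round-off for each mode; the mode with equal signs of u and v has the lower frequency.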
Lattice Vibrations and Speed of Sound‡ :
We consider a two-dimensional square mesh with masses M located at its nodes,

which can move perpendicular to the plane of the mesh only. The perpendicular
displacement of the mass located on node l, m is u lm . We assume harmonic coupling
of nearest neighbor masses. The attendant spring constant is C. With this the equation
of motion of the mass on node l, m is given by
∂ 2 u lm
M = Flm .
∂t 2
Flm , given by

Flm = C u l−1,m − u l,m + C u l+1,m − u l,m

+ C u l,m+1 − u l,m + C u l,m−1 − u l,m

= C u l+1,m + u l−1,m − 2u l,l + u l,m+1 + u m−1,l − 2u l,m ,
is a sum of four terms due to the fourfold coordination of the mesh. Analogous to
(6.16) we insert

u lm = u o exp i (lkl a + mkm a − ωt) ,
where a is the mesh size, i.e. the nearest neighbor distance between the masses at
their equilibrium positions. The result is
− ω 2 Mu lm =

Cu lm e−ikl a − 1 + Cu lm eikl a − 1

+ Cu lm eikm a − 1 + Cu lm e−ikm a − 1 .
Using 2 cos x = ei x + e−i x we obtain the dispersion relation
ω 2 M = 2C [2 − cos (kl a) − cos (km a)] .
In the long wavelength limit ($a \ll \lambda$ or $ka \ll 1$, with $k^2 \equiv k_l^2 + k_m^2$) this yields
$$\omega^2 M \approx 2C\left[2 - 1 + \frac{k_l^2a^2}{2} - 1 + \frac{k_m^2a^2}{2}\right] = Ca^2\left(k_l^2 + k_m^2\right) ,$$
i.e.
$$\omega = \sqrt{\frac{Ca^2}{M}}\,k .$$
Assuming the validity of this expression in three dimensions, i.e. our system now consists of an infinite stack of meshes along the third dimension, we may estimate the transversal sound velocity, $c_t$, and compare it to experimental values. Hence
$$c_t = \frac{\omega}{k} = \sqrt{\frac{Ca^2}{M}} = \sqrt{\frac{C/a}{\rho}} .$$
Here $\rho = M/a^3$ is the three-dimensional mass density. The quantity $C/a$ has the unit Pa = N/m². It describes the 'elastic stiffness' of the system in response to 'shear-like' displacement of the masses in the third dimension. A related measurable quantity, which we discuss in much detail in the context of the theory of elasticity, is the shear modulus, μ, which also has the unit N/m². Thus we replace $C/a$ with μ, which yields
$$c_t = \sqrt{\frac{\mu}{\rho}} . \tag{6.50}$$
Looking up experimental values for ρ and μ in the case of aluminum we find $\rho = 2.7\cdot 10^3$ kg/m³ and $\mu = 2.6\cdot 10^{10}$ N/m² (e.g. [1]). The resulting value,
$$c_t \approx 3\cdot 10^3\ \text{m/s} ,$$
is in very good agreement with the experimental number [1]. In Chap. 10 we shall
deal with wave propagation in (isotropic) elastic media more precisely and in detail.
Notice that (6.50) agrees with (10.101) in the limit of constant volume.
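The dispersion relation, its long wavelength limit, and the estimate (6.50) can be evaluated directly. The sketch below uses the aluminum values quoted above; the function names and the dimensionless test values for a, C, M, and the wave vector components are assumptions made for illustration.

```python
import math

def omega_exact(kl, km, a, C, M):
    """Dispersion relation: omega^2 M = 2C [2 - cos(kl a) - cos(km a)]."""
    return math.sqrt(2.0 * C / M * (2.0 - math.cos(kl * a) - math.cos(km * a)))

def omega_long_wavelength(kl, km, a, C, M):
    """Long wavelength limit: omega = sqrt(C a^2 / M) k with k^2 = kl^2 + km^2."""
    return math.sqrt(C * a * a / M) * math.hypot(kl, km)

def transverse_sound_speed(mu, rho):
    """Eq. (6.50): c_t = sqrt(mu / rho)."""
    return math.sqrt(mu / rho)

# Aluminum values quoted in the text (SI units).
c_t = transverse_sound_speed(2.6e10, 2.7e3)

# Assumed dimensionless test case with ka << 1.
w_exact = omega_exact(0.01, 0.02, 1.0, 1.0, 1.0)
w_limit = omega_long_wavelength(0.01, 0.02, 1.0, 1.0, 1.0)
relative_difference = abs(w_exact - w_limit) / w_limit
```

With these inputs `c_t` comes out close to 3.1·10³ m/s, consistent with the rounded value in the text, and the two frequencies agree to within a small fraction of a percent.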
Driven Harmonic Oscillator with Friction† :
We return to the equation of motion (6.9), to which we now add a periodic force
$$f = b\cos\omega t , \tag{6.51}$$
i.e.
$$\ddot x + 2\lambda\dot x + \omega_o^2 x = \frac{b}{m}\cos\omega t . \tag{6.52}$$
The general solution is the sum of (6.12), assuming that $\lambda < \omega_o$, plus a special solution of the inhomogeneous differential equation. Again it is convenient to use complex numbers ($z(t) = x(t) + iy(t)$). This means that instead of (6.52) we consider
$$\ddot z + 2\lambda\dot z + \omega_o^2 z = \frac{b}{m}\exp[i\omega t] . \tag{6.53}$$
Because this is a linear differential equation, the real part of z(t) automatically is the
sought after solution of (6.52).
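We can guess a special solution based on the idea that in the case of strong coupling the oscillator will closely follow the driving force. Thus we try
$$z_{inh}(t) = Be^{i\omega t} . \tag{6.54}$$
Inserting (6.54) into the equation of motion yields
$$B = \frac{b}{m\left(\omega_o^2 - \omega^2 + 2i\lambda\omega\right)} = \frac{b}{m\left[\left(\omega_o^2 - \omega^2\right)^2 + 4\lambda^2\omega^2\right]^{1/2}}\, e^{-i\arctan[2\lambda\omega/(\omega_o^2 - \omega^2)]} .$$
The full solution is the sum of (6.12) plus the real part of (6.54), i.e.
$$x(t) = ae^{-\lambda t}\cos\left(\sqrt{\omega_o^2 - \lambda^2}\,t + \alpha\right) + \frac{b}{m\left[\left(\omega_o^2 - \omega^2\right)^2 + 4\lambda^2\omega^2\right]^{1/2}}\cos\left(\omega t - \arctan[\dots]\right) . \tag{6.55}$$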
Notice that the first term in (6.55) vanishes after a certain transient time. This is illustrated in Fig. 6.4 (with a = 1, α = 0, λ = 0.1, ωo = 1.5, ω = 1.0, and b/m = 1.0). The solid line is the full solution, whereas the dashed line is the second term in (6.55) only. Notice also that the factor multiplying the cos-function possesses a maximum close to where the driving force's frequency, ω, is equal to the oscillator's own frequency, ωo. This means that the oscillator is in resonance with the external force. As the friction decreases, the maximum increases and shifts closer to ωo, where it finally diverges in the limit λ → 0.
Fig. 6.4 Transient response x(t) of a driven oscillator with friction
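The complex amplitude B and the position of the resonance maximum can be explored numerically. The sketch below uses the parameter values of Fig. 6.4; the function names and the simple grid search for the maximum are assumptions made for illustration.

```python
import cmath
import math

def complex_amplitude(omega, omega0, lam, b_over_m):
    """B from inserting the ansatz (6.54) into (6.53)."""
    return b_over_m / (omega0**2 - omega**2 + 2j * lam * omega)

def steady_state(t, omega, omega0, lam, b_over_m):
    """Second term of (6.55): real part of B exp(i omega t)."""
    return (complex_amplitude(omega, omega0, lam, b_over_m) * cmath.exp(1j * omega * t)).real

def residual(t, omega, omega0, lam, b_over_m):
    """Left-hand side minus right-hand side of (6.52) for the steady-state term."""
    z = complex_amplitude(omega, omega0, lam, b_over_m) * cmath.exp(1j * omega * t)
    lhs = (-omega**2 + 2j * lam * omega + omega0**2) * z  # z'' + 2 lam z' + omega0^2 z
    return lhs.real - b_over_m * math.cos(omega * t)

def resonance_frequency(omega0, lam, b_over_m, step=1e-3):
    """Grid search for the driving frequency that maximizes |B|."""
    grid = [i * step for i in range(1, int(2 * omega0 / step))]
    return max(grid, key=lambda w: abs(complex_amplitude(w, omega0, lam, b_over_m)))

omega0, lam, b_over_m = 1.5, 0.1, 1.0   # values used for Fig. 6.4
omega_peak = resonance_frequency(omega0, lam, b_over_m)
```

For these values the maximum lies slightly below ωo; repeating the search with a smaller λ moves it closer to ωo.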
• Problem 29 - Where an Oscillator Spends Its Time:

(a) Calculate the probability, p(x)δx, for finding the one-dimensional harmonic oscillator in the interval δx at the displacement x. Sketch your result, i.e. the probability density, p(x), in the interval (−a, a), where a is the oscillator's amplitude.
(b) Determine an approximate solution to the anharmonic oscillator given by
$$\ddot x + \omega_o^2 x + \lambda x^2 = 0 .$$
Consider the quantity λ (not a friction coefficient!) as being small. Start from $x(t) = A\cos(\omega_o t + \delta) + \lambda x_1(t)$. Obtain $x_1(t)$ by considering terms linear in λ only. Hints: (i) $2\cos^2(z) = 1 + \cos(2z)$; (ii) Use the ansatz $x_{1,i}(t) = C + D\cos(2\omega_o t + \delta')$ for the inhomogeneous differential equation.
Solution: (a) We write
$$p(x)\delta x = 2\frac{\delta t}{T} . \tag{6.56}$$
Here T is the period of the oscillator and 2δt is the time it spends passing through δx at the position x during one full oscillation. It is useful to rewrite (6.56), i.e.
$$p(x)\delta x = 2\frac{\delta t}{T}\frac{\delta x}{\delta x} = 2\frac{\omega_o}{2\pi}\frac{1}{|\dot x|}\delta x . \tag{6.57}$$
Here $\delta x/\delta t = |\dot x(t)| = a\omega_o|\sin(\omega_o t)|$ and $2\pi/T = \omega_o$. With $x(t) = a\cos(\omega_o t)$ follows
$$p(x)\delta x = \frac{\omega_o}{\pi}\frac{1}{a\omega_o|\sin(\arccos(x/a))|}\delta x \tag{6.58}$$
or via $\sin y = \sqrt{1 - \cos^2 y}$ ($0 < y < \pi$)
$$p(x) = \frac{1}{\pi a\sqrt{1 - (x/a)^2}} . \tag{6.59}$$
The following graph shows ap(x) versus x/a. As perhaps expected we find that the probability density diverges at the turning points.
Notice that $\int_{-a}^{a} dx\, p(x) = 1$.
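(b) Inserting the suggested ansatz we find
$$\ddot x_1(t) + \omega_o^2 x_1(t) \approx -A^2\underbrace{\cos^2(\omega_o t + \delta)}_{=\frac{1}{2} + \frac{1}{2}\cos(2\omega_o t + 2\delta)} \tag{6.60}$$
to leading order in λ. Now we use the second ansatz for the solution of the inhomogeneous differential equation, which yields
$$-4\omega_o^2 D\cos(2\omega_o t + \delta') + \omega_o^2 C + \omega_o^2 D\cos(2\omega_o t + \delta') \approx -\frac{A^2}{2} - \frac{A^2}{2}\cos(2\omega_o t + 2\delta) .$$
Hence
$$C = -\frac{A^2}{2\omega_o^2} \qquad D = \frac{A^2}{6\omega_o^2} \qquad \delta' = 2\delta . \tag{6.61}$$
The general solution, to first order in λ, is
$$x(t) \approx A\cos(\omega_o t + \delta) + \lambda A\cos(\omega_o t + \delta) - \lambda\frac{A^2}{2\omega_o^2} + \lambda\frac{A^2}{6\omega_o^2}\cos(2\omega_o t + 2\delta)$$
or, after cleaning up the expression,
$$x(t) \approx A(1 + \lambda)\cos(\omega_o t + \delta) - \lambda\frac{A^2}{3\omega_o^2}\left[1 + \sin^2(\omega_o t + \delta)\right] . \tag{6.62}$$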
The following graph compares this approximation for λ = 0.3 (dashed line)
to the exact solution for the same λ (solid line). Also included is the solution for
λ = 0 (dotted line). Here we have A = 1 and δ = 0. Apparently the approxi-
mation is not bad when ωo t is not too large.
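The algebra leading from (6.60) to (6.62) can be checked numerically. The sketch below verifies that the particular solution with the constants (6.61) satisfies the linearized equation (6.60) and that the compact form (6.62) agrees with the uncondensed expression; the function names and the test values are assumptions made for illustration.

```python
import math

def first_order_solution(t, A, omega0, delta, lam):
    """Sum of the zeroth-order term and lam * x1(t), with C and D from (6.61)."""
    theta = omega0 * t + delta
    C = -A**2 / (2.0 * omega0**2)
    D = A**2 / (6.0 * omega0**2)
    return A * math.cos(theta) + lam * (A * math.cos(theta) + C + D * math.cos(2.0 * theta))

def compact_form(t, A, omega0, delta, lam):
    """Eq. (6.62)."""
    theta = omega0 * t + delta
    return (A * (1.0 + lam) * math.cos(theta)
            - lam * A**2 / (3.0 * omega0**2) * (1.0 + math.sin(theta)**2))

def linearized_residual(t, A, omega0, delta):
    """x1'' + omega0^2 x1 + A^2 cos^2(theta) for x1 = C + D cos(2 theta); should vanish."""
    theta = omega0 * t + delta
    C = -A**2 / (2.0 * omega0**2)
    D = A**2 / (6.0 * omega0**2)
    x1 = C + D * math.cos(2.0 * theta)
    x1_ddot = -4.0 * omega0**2 * D * math.cos(2.0 * theta)
    return x1_ddot + omega0**2 * x1 + A**2 * math.cos(theta)**2

# Assumed test values; A = 1, delta = 0, lam = 0.3 are those of the graph, omega0 is arbitrary.
times = [0.0, 0.4, 1.3, 2.7]
max_residual = max(abs(linearized_residual(t, 1.0, 1.2, 0.0)) for t in times)
max_mismatch = max(abs(first_order_solution(t, 1.0, 1.2, 0.0, 0.3)
                       - compact_form(t, 1.0, 1.2, 0.0, 0.3)) for t in times)
```

Both `max_residual` and `max_mismatch` vanish up to rounding errors. The check concerns the algebra only; it does not measure how well (6.62) approximates the exact solution.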
Dissipation Function‡ :
In its equilibrium state, i.e. after the initial transient phase, the driven oscillator’s
energy is constant. This requires that the oscillator absorbs a certain amount of energy
per unit time, I . We want to calculate I .
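For this purpose we introduce the dissipation function, Q, defined via
$$Q = \frac{1}{2}\zeta\dot x^2 . \tag{6.63}$$
Notice that $f_R = -\partial Q/\partial\dot x$. Using (4.38) we find
$$\frac{dE}{dt} = \frac{d}{dt}\left(\dot x\frac{\partial L}{\partial\dot x} - L\right) = \dot x\left(\frac{d}{dt}\frac{\partial L}{\partial\dot x} - \frac{\partial L}{\partial x}\right) \overset{(6.10)}{=} -\dot x\frac{\partial Q}{\partial\dot x} = -2Q , \tag{6.64}$$
i.e. friction removes energy from the oscillator at the rate 2Q.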
This means that
I = 2 Q̄ , (6.65)
where the bar indicates a time-average over one full period of the oscillator.
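According to (6.55)
$$I(\omega) = \frac{\lambda\omega^2 b^2}{m\left[\left(\omega_o^2 - \omega^2\right)^2 + 4\lambda^2\omega^2\right]} , \tag{6.66}$$
where we use $\overline{\sin^2(\dots)} = 1/2$ (cf. footnote 7).

Remark: The same result follows also via the straightforward integration $\oint f(t)\,dx(t) = \int_0^{2\pi/\omega} f(t)\dot x(t)\,dt$, where f(t) is the external force (6.51).
It is worth noting that
$$I(\omega) = \frac{\zeta\omega^2}{2}x_o^2 \qquad (\zeta = 2m\lambda) . \tag{6.67}$$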
The quantity xo is the equilibrium amplitude according to (6.55). The energy absorption therefore depends on the squares of the amplitude, xo, and the frequency, ω.
In Sect. 10.7 we shall return to this subject, which is of interest in many technical applications. One such application is the design of automobile tires. An important aspect is the decrease of energy dissipation in the tire tread material during normal driving conditions (notice that the overwhelming contribution to the rolling resistance of a tire is from deformations within the tire material) and its increase in other situations, like braking.
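The absorbed power (6.66) can be compared with a direct time average of f(t)ẋ(t) over one period of the driving force, as suggested in the remark above. The following sketch assumes the steady-state term of (6.55); the function names, the midpoint rule, and the use of `atan2` to pick the phase branch are assumptions made for illustration.

```python
import math

def absorbed_power_formula(omega, omega0, lam, b, m):
    """Eq. (6.66)."""
    return lam * omega**2 * b**2 / (m * ((omega0**2 - omega**2)**2 + 4.0 * lam**2 * omega**2))

def absorbed_power_numeric(omega, omega0, lam, b, m, steps=2000):
    """Time average of f(t) * dx/dt over one period, using the steady-state term of (6.55)."""
    denom = math.sqrt((omega0**2 - omega**2)**2 + 4.0 * lam**2 * omega**2)
    x0 = b / (m * denom)                                      # equilibrium amplitude
    psi = math.atan2(2.0 * lam * omega, omega0**2 - omega**2)  # phase lag
    period = 2.0 * math.pi / omega
    dt = period / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt                                    # midpoint rule
        force = b * math.cos(omega * t)
        velocity = -omega * x0 * math.sin(omega * t - psi)
        total += force * velocity * dt
    return total / period

# Parameter values of Fig. 6.4, with b = m = 1 assumed.
I_formula = absorbed_power_formula(1.0, 1.5, 0.1, 1.0, 1.0)
I_numeric = absorbed_power_numeric(1.0, 1.5, 0.1, 1.0, 1.0)
```

The two numbers agree to many digits, both below and above ωo.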
Euler–Lagrange Formalism Applied to a String Under Tension:
7 Notice that $\overline{\sin^2(\dots)} = \frac{\omega}{2\pi}\int_0^{2\pi/\omega}\sin^2(\omega t + c)\,dt$, where the quantity c is independent of t. The substitution $x = \omega t + c$ yields
$$\overline{\sin^2(\dots)} = \frac{1}{2\pi}\int_c^{2\pi + c}\sin^2 x\,dx = \frac{1}{2}\,\frac{1}{2\pi}\int_c^{2\pi + c}\underbrace{\left(\sin^2 x + \cos^2 x\right)}_{=1}dx = \frac{1}{2} .$$
Fig. 6.5 Vibrating string with its ends fixed
Figure 6.5 shows an elastic string under tension with its ends fixed. The quantity u(x, t) is the vertical displacement of the string relative to its equilibrium at position x and at time t. First we want to construct the Lagrangian of the string expressed in terms of $\dot u(x, t)$ and u(x, t). The kinetic energy of the string is given by
$$K = \frac{1}{2}\int_0^L \rho\,\dot u^2\,dx .$$
Here $\frac{1}{2}\rho\,dx\,\dot u^2(x, t)$ is the kinetic energy of a mass element ρ dx, where ρ is the mass density per unit length. The quantity L is the direct end-to-end distance of the string.
The potential energy requires the local extension of the string, which is ds − dx. Here ds is a line element along the string (cf. Fig. 6.5) and dx is the length of the same line element when the string is not strained at all. Thus
$$\begin{aligned} ds - dx &= \sqrt{dx^2 + du^2} - dx \\ &= dx\left(\sqrt{1 + \frac{du^2}{dx^2}} - 1\right) \\ &\approx dx\left(1 + \frac{1}{2}\left(\frac{du}{dx}\right)^2 - 1\right) \\ &= \frac{1}{2}\left(\frac{du}{dx}\right)^2 dx . \end{aligned}$$
Multiplying the local extension by the tension and integrating along the string, the potential energy of the string is given by
$$U = \frac{T}{2}\int_0^L\left(\frac{du}{dx}\right)^2 dx , \tag{6.68}$$
where T is the string tension. The resulting Lagrangian is
$$L = \frac{1}{2}\int_0^L dx\left[\rho\,\dot u^2(x) - T\left(\frac{du(x)}{dx}\right)^2\right] .$$
The quantity $\frac{1}{2}[\dots]$ is a simple example for a Lagrangian density.

Now we want to derive an equation for u(x, t) via the least action principle. The action is
$$S = \frac{1}{2}\int_{t_1}^{t_2}dt\int_0^L dx\left[\rho\,\dot u^2 - T\left(\frac{du}{dx}\right)^2\right]$$
and the variation of S with respect to u is given by
$$\begin{aligned} \delta S &= \int_{t_1}^{t_2}dt\int_0^L dx\left[\rho\,\dot u\frac{d}{dt}\delta u - T\frac{du}{dx}\frac{d}{dx}\delta u\right] \\ &\overset{p.i.}{=} -\int_{t_1}^{t_2}dt\int_0^L dx\left[\rho\,\ddot u - T\frac{d^2u}{dx^2}\right]\delta u + \int_0^L dx\,\rho\,\dot u\,\delta u\,\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2}dt\,T\frac{du}{dx}\delta u\,\Big|_0^L \end{aligned}$$
(p.i.: partial integration). The last two terms vanish. The first of the two terms vanishes because δu = 0 at the times $t_1$ and $t_2$. At these times the string is constrained to a certain shape. The second of the two terms vanishes because δu(0, t) = δu(L, t) = 0. The condition $\delta_u S = 0$ therefore implies the wave equation of the string,
$$\ddot u - c^2\frac{d^2u}{dx^2} = 0 ,$$
where $c = \sqrt{T/\rho}$. We may use the method of separation of variables, i.e. $u(x, t) = u_x(x)u_t(t)$, to obtain the solution of this differential equation. However, here our sole interest was the derivation of the wave equation itself.
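As a plausibility check of the wave equation, one can insert a product solution and evaluate the residual with finite differences. The standing wave used below, sin(nπx/L) cos(nπct/L), is an assumed example of a separable solution that vanishes at both ends; it is not derived in the text, and the function names and unit parameter values are likewise assumptions.

```python
import math

def standing_wave(x, t, n, L, c):
    """Assumed separable solution u = sin(n pi x / L) cos(n pi c t / L)."""
    k = n * math.pi / L
    return math.sin(k * x) * math.cos(c * k * t)

def wave_equation_residual(x, t, n, L, c, h=1e-4):
    """Central-difference estimate of u_tt - c^2 u_xx at (x, t)."""
    u0 = standing_wave(x, t, n, L, c)
    u_tt = (standing_wave(x, t + h, n, L, c) - 2.0 * u0 + standing_wave(x, t - h, n, L, c)) / h**2
    u_xx = (standing_wave(x + h, t, n, L, c) - 2.0 * u0 + standing_wave(x - h, t, n, L, c)) / h**2
    return u_tt - c**2 * u_xx

# Assumed unit values for tension, mass per unit length, and string length.
T, rho, L = 1.0, 1.0, 1.0
c = math.sqrt(T / rho)
res = wave_equation_residual(0.3, 0.2, 2, L, c)
```

The residual is zero up to the discretization error of the difference quotients, and the assumed solution respects the fixed ends at x = 0 and x = L.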
• Example - Tension in a Power Line: This is an application of the above

formalism - albeit in a static situation. The sketch shows a power line supported
by utility poles. We assume that the displacement u(x) is entirely due to the
elastic deformation of the cable due to its own weight. In the following ρ is
the mass of the cable per unit length (This of course is a crude assumption.
In addition we neglect thermal effects.). We want to calculate the tension, T ,
along the cable.
[Sketch: power line of span L between two utility poles, with displacement u(x) at position x]
We consider the displacement of the cable at position x, i.e. u(x), measured upward. Previously the (elastic) potential energy was given by (6.68), whereas here the potential energy due to gravity,
$$U_{pot} = \rho g\int_0^L u(x)\,dx ,$$
is an additional contribution. The Lagrangian becomes
$$L = \frac{1}{2}\int_0^L dx\left[\rho\,\dot u^2(x) - T\left(\frac{du(x)}{dx}\right)^2 - 2\rho g\,u(x)\right] .$$
Using $\delta_u S = 0$ we now find
$$\rho\,\ddot u - T\frac{d^2u}{dx^2} + \rho g = 0 .$$
In the static case this is
$$T\frac{d^2u}{dx^2} = \rho g .$$
Inserting $u(x) = C(L - x)x$ we find for C
$$C = -\frac{\rho g}{2T}$$
or
$$u(x) = -\frac{\rho g}{2T}(L - x)x .$$
We calculate T by considering the elastic elongation ΔL of the entire cable between two poles, i.e.
$$\frac{T}{A} = E\frac{\Delta L}{L} .$$
Here A is the cross section of the cable, assumed to be homogeneous, and E is its elastic or Young's modulus (cf. Sect. 10.2). For ΔL we obtain
$$\Delta L = \int_{cable}ds - \int_0^L dx \approx \int_0^L\frac{1}{2}\left(\frac{du}{dx}\right)^2 dx = \frac{\rho^2 g^2 L^3}{24T^2} ,$$
i.e.
$$T = EA\frac{\rho^2 g^2 L^2}{24T^2} \tag{6.69}$$
or
$$T = \left(\frac{EA\rho^2 g^2 L^2}{24}\right)^{1/3} .$$
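To get a feeling for the numbers, the result for T can be evaluated and checked for self-consistency against the elongation integral. In the sketch below the symbol E for Young's modulus, the function names, and all numerical values (roughly an aluminum cable) are assumptions made for illustration; they are not taken from the text.

```python
import math

def cable_tension(E, A, rho, L, g=9.81):
    """T = (E A rho^2 g^2 L^2 / 24)^(1/3), the solution of (6.69)."""
    return (E * A * rho**2 * g**2 * L**2 / 24.0) ** (1.0 / 3.0)

def sag_profile(x, T, rho, L, g=9.81):
    """Static displacement u(x) = -(rho g / 2T) (L - x) x."""
    return -rho * g / (2.0 * T) * (L - x) * x

def elongation_numeric(T, rho, L, g=9.81, steps=10000):
    """Midpoint-rule estimate of the integral of (1/2) (du/dx)^2 over the span."""
    dx = L / steps
    total = 0.0
    for n in range(steps):
        x = (n + 0.5) * dx
        slope = -rho * g / (2.0 * T) * (L - 2.0 * x)   # du/dx
        total += 0.5 * slope**2 * dx
    return total

# Assumed values: E in Pa, A in m^2, rho in kg/m, L in m.
E, A, rho, L = 7.0e10, 1.0e-4, 0.27, 50.0
T = cable_tension(E, A, rho, L)
delta_L = elongation_numeric(T, rho, L)
mid_span_sag = sag_profile(L / 2.0, T, rho, L)
```

The tension computed from the cube root reproduces T = E A ΔL / L when ΔL is obtained by numerical integration, which confirms the closed form of the elongation integral.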
6.2 Normal Mode Analysis
Thus far we have dealt mainly with the motion of one or two point masses. One par-
ticular exception were waves along one-dimensional chains of harmonically coupled
masses. Here we consider a similar type of problem.
The positions of the point masses, $m_i$, in our system are $\vec r_i$ ($i = 1, \dots, N$). The system's potential energy is $U(\vec r_1, \dots, \vec r_N)$. We assume that $U(\vec r_1, \dots, \vec r_N)$ possesses a local minimum at $\vec r_{10}, \vec r_{20}, \dots, \vec r_{N0}$. As in the case of (6.1), we expand the potential energy at the minimum, i.e.
$$U(\vec r_1, \dots, \vec r_N) = U(\vec r_{10}, \dots, \vec r_{N0}) + \frac{1}{2}\delta\vec R^T\cdot\mathbf F\cdot\delta\vec R + \dots , \tag{6.70}$$
where
$$F_{\alpha\beta} = \left.\frac{\partial^2 U}{\partial x_\alpha\partial x_\beta}\right|_{\vec r_{10}, \dots, \vec r_{N0}} \tag{6.71}$$
(α, β = 1, ..., 3N) and
$$\delta\vec R^T = \left(x_1 - x_{10},\, y_1 - y_{10},\, z_1 - z_{10},\, x_2 - x_{20},\, \dots,\, z_N - z_{N0}\right) . \tag{6.72}$$
Notice that the linear term is zero (cf. our discussion of (6.1)).
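We introduce the following coordinate transformation:
$$\delta\vec R = \mathbf M^{-1/2}\cdot\mathbf L\cdot\vec Q . \tag{6.73}$$
Here $\mathbf M^{-1/2}$ denotes a matrix whose elements are $M^{-1/2}_{\alpha\beta} = m_i^{-1/2}\delta_{\alpha\beta}$, where $m_i$ is the mass of the particle to which the coordinate α belongs ($\delta_{\alpha\beta} = 1$ if α = β and zero otherwise). In addition, the column vectors $\vec L_\alpha$, which define the matrix $\mathbf L$, satisfy the eigenvalue equation
$$(\mathbf M^{-1/2})^T\cdot\mathbf F\cdot\mathbf M^{-1/2}\cdot\vec L_\alpha = \lambda_\alpha\vec L_\alpha . \tag{6.74}$$
Using (6.73) in conjunction with (6.74) one can show (cf. Problem 30), that the kinetic and the potential energy of the system may be expressed as
$$\delta K = \frac{1}{2}\delta\dot{\vec R}^T\cdot\mathbf M\cdot\delta\dot{\vec R} = \frac{1}{2}\sum_\alpha\dot Q_\alpha^2 \tag{6.75}$$
$$\delta U = \frac{1}{2}\delta\vec R^T\cdot\mathbf F\cdot\delta\vec R = \frac{1}{2}\sum_\alpha\lambda_\alpha Q_\alpha^2 . \tag{6.76}$$
Notice that δK and δU refer to the part of the total energy of the system, which is due to small oscillations within the above local minimum of U.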
The significance of the coordinate transformation (6.73) is that the resulting total (vibration) energy is a sum over 3N independent one-dimensional harmonic oscillators! The new coordinates, $Q_\alpha$, are the so-called normal coordinates. Each of the oscillators satisfies
$$\ddot Q_\alpha + \lambda_\alpha Q_\alpha = 0 \qquad (\alpha = 1, \dots, 3N) \tag{6.77}$$
(cf. (6.5)). This follows via the Euler–Lagrange equations $\frac{d}{dt}\frac{\partial\,\delta K}{\partial\dot Q_\alpha} + \frac{\partial\,\delta U}{\partial Q_\alpha} = 0$. Using the ansatz $Q_\alpha(t) \propto \cos(\omega_\alpha t)$ we find
$$\omega_\alpha = \lambda_\alpha^{1/2} , \tag{6.78}$$
i.e. the eigenvalues computed via (6.74) yield the so-called normal mode frequencies. In general a normal mode describes the coupled oscillation of several or even all masses in the system.
In order to obtain the 'shape' of the oscillation, i.e. what we observe when the system is moving according to a particular normal mode, expressed in terms of the cartesian coordinates of the masses, we must insert the solution $Q_\alpha(t)$ of (6.77) into (6.73). But what does this mean, given that there is no special mode-index in (6.73)? Let us assume we are interested in mode κ. The attendant $\delta\vec R$ is $\delta\vec R_\kappa$, which follows via
$$\delta\vec R_\kappa = \mathbf M^{-1/2}\cdot\mathbf L\cdot\vec Q_\kappa , \tag{6.79}$$
wherein $\vec Q_\kappa$ is given by
$$\vec Q_\kappa = (\dots, Q_\kappa, \dots) . \tag{6.80}$$
Here $Q_\kappa$ is the solution of (6.77) and ... indicates zeros, i.e. only the κth entry is different from zero. The mode's frequency is given by (6.78) with α = κ. The difference between (6.73) and (6.79) is that the former includes all normal modes simultaneously. However, if we want to visualize just one single normal mode vibration, based on the cartesian coordinates of the masses, then we must use (6.79).
• Problem 30 - Mathematical Transformations in Normal Mode Analysis:

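(a) Show
$$\vec Q^T\cdot\mathbf L^T\cdot(\mathbf M^{-1/2})^T\cdot\mathbf F\cdot\mathbf M^{-1/2}\cdot\mathbf L\cdot\vec Q = \sum_{\alpha=1}^{3N}\lambda_\alpha Q_\alpha^2$$
based on the eigenvalue equation $(\mathbf M^{-1/2})^T\cdot\mathbf F\cdot\mathbf M^{-1/2}\cdot\vec L_\alpha = \lambda_\alpha\vec L_\alpha$. Notice that $\mathbf L^T\cdot\mathbf L = \mathbf I$, where $\mathbf I$ is the unit matrix.
(b) Prove the orthogonality of the $\vec L_\alpha$, i.e. show that $\mathbf L^T\cdot\mathbf L = \mathbf I$, assuming $\lambda_\alpha \neq \lambda_\beta$ for $\alpha \neq \beta$.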
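Solution: (a) Starting from (6.70) we write
$$\begin{aligned} \delta\vec R^T\cdot\mathbf F\cdot\delta\vec R &\overset{(6.73)}{=} \left(\mathbf M^{-1/2}\cdot\mathbf L\cdot\vec Q\right)^T\cdot\mathbf F\cdot\mathbf M^{-1/2}\cdot\mathbf L\cdot\vec Q \\ &= \vec Q^T\cdot\mathbf L^T\cdot(\mathbf M^{-1/2})^T\cdot\mathbf F\cdot\mathbf M^{-1/2}\cdot\mathbf L\cdot\vec Q \\ &= Q^T_\alpha L^T_{\alpha\beta}(\dots)_{\beta\gamma}L_{\gamma\delta}Q_\delta . \end{aligned}$$
Here and in the following we use the summation convention. Now we apply
$$(\dots)_{\beta\gamma}L_{\gamma\delta} = \lambda_\delta L_{\beta\delta} .$$
Notice that δ here is the column index, which also distinguishes the eigenvalues. All in all we have
$$\delta\vec R^T\cdot\mathbf F\cdot\delta\vec R = \lambda_\delta Q^T_\alpha L^T_{\alpha\beta}L_{\beta\delta}Q_\delta .$$
Using $L^T_{\alpha\beta}L_{\beta\delta} = \delta_{\alpha\delta}$ (cf. below) we find
$$\delta\vec R^T\cdot\mathbf F\cdot\delta\vec R = \lambda_\alpha Q^T_\alpha Q_\alpha = \sum_\alpha\lambda_\alpha Q_\alpha^2 .$$
(b) Using the definition $\mathbf H \equiv (\mathbf M^{-1/2})^T\cdot\mathbf F\cdot\mathbf M^{-1/2}$, we obtain
$$\lambda_\alpha\vec L_\beta^T\cdot\vec L_\alpha = \vec L_\beta^T\cdot\mathbf H\cdot\vec L_\alpha = \vec L_\alpha^T\cdot\mathbf H^T\cdot\vec L_\beta \overset{*}{=} \vec L_\alpha^T\cdot\mathbf H\cdot\vec L_\beta = \lambda_\beta\vec L_\alpha^T\cdot\vec L_\beta .$$
Via the symmetry of $\mathbf H$ (*) as well as with $\lambda_\alpha \neq \lambda_\beta$ ($\alpha \neq \beta$) we have
$$\vec L_\alpha^T\cdot\vec L_\beta = 0 .$$
Normalization of the eigenvectors yields $\mathbf L^T\cdot\mathbf L = \mathbf I$.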
• Problem 31 - Normal Modes of Carbon Dioxide: The sketch shows the

chain from Fig. 6.2. Consider the isolated chain segment in the box, which is
our model of the CO2 molecule (M = O and m = C). The bonds formerly
connecting the segment to the rest of the chain are ignored.
[Sketch: linear chain of masses M and m coupled by springs with constant C; displacements u and v; cell indices s-1, s, s+1]
(a) Using normal mode analysis, calculate the frequencies of the stretch modes for this model of CO2. The modes themselves are depicted in the second sketch below. The arrows indicate the direction of the momentary velocity of the respective atoms. Notice that CO2 possesses two bending modes as well, which we do not consider here. The experimental wavenumbers of the stretch modes are 1340 and 2349 cm⁻¹ (wavenumbers, $\nu_\alpha$, and frequencies, $\omega_\alpha$, are related via $\nu_\alpha = \omega_\alpha/(2\pi c)$, where c is the velocity of light). Compare the ratio of the two experimental wavenumbers to your result. Which wavenumber belongs to which mode in the sketch?
(b) Solve the problem using the method discussed on p. 158.

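Solution: (a) We consider the chain segment in the box. The potential energy of the three masses is given by
$$U = \frac{1}{2}C(u_1 - u_2)^2 + \frac{1}{2}C(u_2 - u_3)^2 . \tag{6.81}$$
Now we calculate the F-matrix ($F_{ij} = \partial^2U/\partial u_i\partial u_j$), i.e.
$$\mathbf F = C\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} . \tag{6.82}$$
In conjunction with
$$\mathbf M^{-1/2} = \begin{pmatrix} M^{-1/2} & 0 & 0 \\ 0 & m^{-1/2} & 0 \\ 0 & 0 & M^{-1/2} \end{pmatrix} \tag{6.83}$$
we have
$$(\mathbf M^{-1/2})^T\cdot\mathbf F\cdot\mathbf M^{-1/2} = C\begin{pmatrix} \frac{1}{M} & -\frac{1}{\sqrt{mM}} & 0 \\ -\frac{1}{\sqrt{mM}} & \frac{2}{m} & -\frac{1}{\sqrt{mM}} \\ 0 & -\frac{1}{\sqrt{mM}} & \frac{1}{M} \end{pmatrix} . \tag{6.84}$$
The eigenvalues of this matrix are
$$\lambda_1 = 0 \qquad \lambda_2 = \frac{C}{M} \qquad \lambda_3 = \frac{C(m + 2M)}{mM} . \tag{6.85}$$
The first eigenvalue corresponds to frequency zero, which here means uniform translation of the CO2-molecule. In the present context we therefore ignore $\lambda_1$. The wavenumbers, $\nu_\alpha = \lambda_\alpha^{1/2}/(2\pi c)$, based on the two remaining eigenvalues are
$$\nu_2 = \frac{1}{2\pi c}\sqrt{\frac{C}{M}} \qquad \nu_3 = \frac{1}{2\pi c}\sqrt{\frac{C(m + 2M)}{mM}} . \tag{6.86}$$
Their ratio, which is independent of C, is
$$\frac{\nu_3}{\nu_2} = \sqrt{1 + 2\frac{M}{m}} \approx 1.92 \tag{6.87}$$
(notice: M/m ≈ 16/12). The same ratio based on the experimental wavenumbers is 2349 cm⁻¹/1340 cm⁻¹ ≈ 1.75. The 10%-difference is not too bad.
But which wavenumber belongs to which mode? We take a look at the eigenvectors belonging to $\nu_2$ and $\nu_3$:
$$\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 \\ -2\sqrt{M/m} \\ 1 \end{pmatrix} \tag{6.88}$$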
In the case of the first eigenvector, belonging to ν2 , we recognise the mode

where the carbon atom in the center is at rest and the oxygen atoms move in
opposite directions. This is the upper one of the two modes in the sketch. The
third eigenvector corresponds to the asymmetric oscillation, i.e. the bottom
mode in the sketch. Notice that we do not need the full (6.79) - just looking at
L 2 and L 3 is sufficient.
(b) An alternate route uses Newton's equation of motion, i.e.
$$\begin{aligned} M\ddot u_s &= -C(u_s - v_s) \\ m\ddot v_s &= C(u_s - v_s) - C(v_s - u_{s+1}) \\ M\ddot u_{s+1} &= C(v_s - u_{s+1}) . \end{aligned}$$
Inserting the ansatz (6.16) yields
$$\begin{aligned} -M\omega^2 u &= -Cu + Cv \\ -m\omega^2 v &= Cu - 2Cv + Cue^{iak} \\ -M\omega^2 ue^{iak} &= Cv - Cue^{iak} . \end{aligned}$$
We require that u and v are real. Thus exp[iak] = ±1. In the case exp[iak] = 1 follows
$$\begin{pmatrix} -M\omega^2 + C & -C \\ -2C & -m\omega^2 + 2C \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = 0 . \tag{6.89}$$
From det(...) = 0 we have $\omega^2 = 0$ (translation) as well as $\omega^2 = C(m + 2M)/(mM)$. The second solution is the above result for $\lambda_3$. In the other case, exp[iak] = −1, the equations of motion yield v = 0 and $-M\omega^2 u = -Cu$, i.e. $\omega^2 = C/M$. This is the above result for $\lambda_2$. Here v = 0 again means that the carbon atom is at rest relative to the oxygens.
Remark: Notice that CO2 is a linear molecule. It has N = 3 atoms and thus
3N − 5 = 4 normal modes. We must subtract 5 from 3N , because there are
three directions of translation and two possible rotations (why only two?). This
means we neglect two bending modes.
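The eigenvalues (6.85) and eigenvectors (6.88) can be verified by inserting them into the eigenvalue equation for the matrix (6.84). The sketch below does this without a linear algebra library; the function names, the unit spring constant, and the translation eigenvector (√M, √m, √M) are additions made for illustration.

```python
import math

def mass_weighted_matrix(C, M, m):
    """Matrix (6.84) for the O-C-O chain segment."""
    s = 1.0 / math.sqrt(m * M)
    return [[C / M, -C * s, 0.0],
            [-C * s, 2.0 * C / m, -C * s],
            [0.0, -C * s, C / M]]

def matvec(H, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(H[i][j] * v[j] for j in range(3)) for i in range(3)]

def eigen_residual(H, lam, v):
    """Largest component of H v - lam v; zero for an exact eigenpair."""
    Hv = matvec(H, v)
    return max(abs(Hv[i] - lam * v[i]) for i in range(3))

C, M, m = 1.0, 16.0, 12.0   # C in arbitrary units; masses in atomic mass units
H = mass_weighted_matrix(C, M, m)
modes = [
    (0.0, [math.sqrt(M), math.sqrt(m), math.sqrt(M)]),             # uniform translation
    (C / M, [-1.0, 0.0, 1.0]),                                      # symmetric stretch
    (C * (m + 2.0 * M) / (m * M), [1.0, -2.0 * math.sqrt(M / m), 1.0]),  # asymmetric stretch
]
residuals = [eigen_residual(H, lam, v) for lam, v in modes]
ratio_model = math.sqrt(1.0 + 2.0 * M / m)   # Eq. (6.87)
ratio_experiment = 2349.0 / 1340.0
```

All three residuals vanish up to rounding, and the two ratios reproduce the values 1.92 and 1.75 quoted in the solution.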
Normal Modes of the Water Molecule:
Normal mode frequencies can be used to identify atomic groups in molecules,

like C–H, O–H, etc., and thus the molecules themselves. An important experimental
technique in this context is infrared absorption spectroscopy. If a suitable potential
function is available to describe the vibrations within a molecule or molecular system,
then we can compare the positions of the experimental absorption lines with a normal
mode calculation based on this potential.
It turns out that we can construct suitable potential functions, so called empirical
force fields, based on simple mechanical models (molecular mechanics) - two of
which were discussed in the Problems 5 and 7. In fact we can combine these two
expressions into the following potential function for an isolated water molecule:
$$U = \frac{K_b}{2}\left(\delta b_1^2 + \delta b_2^2\right) + \frac{K_\varphi}{2}\delta\varphi^2 . \tag{6.90}$$
The quantities δb1 and δb2 are the deviations of the two OH-bond lengths from
their equilibrium values. δφ is the analogous deviation for the HOH-angle. Our
goal is the determination of the parameters K b and K φ from the measured normal
mode wavenumbers of water (in the gas phase where the molecules are more or less
isolated).
The first step is the calculation of the F-matrix, i.e. the matrix of the second
derivatives of U with respect to the cartesian coordinates of the water molecule. This
means that we must convert from the internal coordinates used to express U in (6.90)
to cartesian coordinates.
Figure 6.6 shows the relation between internal and cartesian coordinates of the water molecule. Notice that a two-dimensional coordinate system is sufficient, because the molecular vibrations are confined to a plane. The $\delta\vec R_i$ (i = 1, 2, 3), the atomic displacements relative to the respective equilibrium positions, are assumed to be small. In order to calculate $\delta b_1$ and $\delta b_2$, we need the projections, i.e. the components of the $\delta\vec R_i$ parallel ($\delta R_i^\parallel$) and perpendicular ($\delta R_i^\perp$) to the chemical bonds. Thus
$$\delta b_1 = \left(\delta R_1^\parallel - \delta R_2^\parallel\right)_1 \tag{6.91}$$
$$\delta b_2 = \left(\delta R_3^\parallel - \delta R_2^\parallel\right)_2 \tag{6.92}$$
$$\delta\varphi = \frac{1}{b}\left(\delta R_1^\perp - \delta R_2^\perp\right)_1 + \frac{1}{b}\left(\delta R_3^\perp - \delta R_2^\perp\right)_2 . \tag{6.93}$$
Fig. 6.6 Relation between internal and cartesian coordinates of the water molecule

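Notice that $(\dots)_1$ and $(\dots)_2$ indicate that the projections are relative to bonds 1 and 2, respectively. The quantity b is the equilibrium bond length, i.e. the bond length in the undeformed molecule. Conversion of these projections into the cartesian coordinates in Fig. 6.6 is accomplished via the transformation $\vec x' = \mathbf D\cdot\vec x$, where
$$\mathbf D = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} .$$
Here $\vec x'$ is the vector $\vec x$ expressed in terms of the coordinates in a frame, which is rotated counterclockwise by the angle $\phi$. If the x-axis of the original system is parallel to bond 1, then in this system $\delta\vec R_i = (\delta R_i^\parallel, \delta R_i^\perp)$ (i = 1, 2). A rotation by $\phi = -(\pi/2 - \varphi/2)$ maps bond 1 onto the x-axis of the x-y-system in Fig. 6.6. Notice that the x- and y-components of $\delta\vec R_i$ in this rotated system are given by $(\delta x_i, \delta y_i) = \mathbf D\cdot\delta\vec R_i = \mathbf D\cdot(\delta R_i^\parallel, \delta R_i^\perp)$. Hence the sought after relation is $(\delta R_i^\parallel, \delta R_i^\perp) = \mathbf D^{-1}\cdot(\delta x_i, \delta y_i)$, where
$$\mathbf D^{-1} = \begin{pmatrix} \sin[\varphi/2] & \cos[\varphi/2] \\ -\cos[\varphi/2] & \sin[\varphi/2] \end{pmatrix} .$$
It follows that
$$\begin{aligned} \delta R_i^\parallel &= \delta x_i\sin[\varphi/2] + \delta y_i\cos[\varphi/2] \\ \delta R_i^\perp &= -\delta x_i\cos[\varphi/2] + \delta y_i\sin[\varphi/2] \end{aligned} \tag{6.94}$$
for the projections of $\delta\vec R_1$ and $\delta\vec R_2$ onto bond 1. The projections of $\delta\vec R_3$ and $\delta\vec R_2$ onto bond 2 we obtain by the inversion of (6.94) with respect to the y-axis, i.e. $\delta x_i$ is replaced by $-\delta x_i$. Equations (6.91) through (6.93) thus become
$$\begin{aligned} \delta b_1 &= (\delta x_1 - \delta x_2)\sin[\varphi/2] + (\delta y_1 - \delta y_2)\cos[\varphi/2] \\ \delta b_2 &= -(\delta x_3 - \delta x_2)\sin[\varphi/2] + (\delta y_3 - \delta y_2)\cos[\varphi/2] \\ \delta\varphi &= \frac{1}{b}\left\{(-\delta x_1 + \delta x_2)\cos[\varphi/2] + (\delta y_1 - \delta y_2)\sin[\varphi/2]\right\} \\ &\quad + \frac{1}{b}\left\{(\delta x_3 - \delta x_2)\cos[\varphi/2] + (\delta y_3 - \delta y_2)\sin[\varphi/2]\right\} . \end{aligned}$$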
Now the F-matrix can be constructed. We insert the above equations into (6.90) and calculate the matrix elements $F_{\alpha\beta} = \partial^2U/\partial\delta x_\alpha\partial\delta x_\beta$. Here $\delta x_\alpha = \delta x_i$ with α = i and $\delta x_\alpha = \delta y_i$ with α = i + 3. All in all we obtain a 6 × 6 matrix. This matrix must be multiplied from both sides by the matrix $\mathbf M^{-1/2}$, whose only non-zero elements are $m_H^{-1/2}$, $m_O^{-1/2}$, $m_H^{-1/2}$, $m_H^{-1/2}$, $m_O^{-1/2}$, and $m_H^{-1/2}$ along the diagonal. Here $m_H$ is the mass of the hydrogen atom and $m_O$ is the oxygen mass. We obtain the normal mode frequencies via $\nu_\alpha = \sqrt{\lambda_\alpha}/(2\pi)$, where $\lambda_\alpha$ are the eigenvalues of the matrix $\mathbf M^{-1/2}\cdot\mathbf F\cdot\mathbf M^{-1/2}$. Three of the six eigenvalues are zero, corresponding to the center of mass translation of the molecule in the x-y-plane and the uniform rotation with respect to the center of mass in the same plane. The remaining eigenvalues yield the wavenumbers $\tilde\nu_\alpha = \sqrt{\lambda_\alpha}/(2\pi c)$, which, depending on the numerical values for $K_b$ and $K_\varphi$, should agree with the experiment. The following Mathematica-program performs the necessary calculations:
(* potential energy of the water molecule; here x[[i]] = δx_i and x[[i+3]] = δy_i for i = 1, 2, 3 *)
x = {x1, x2, x3, y1, y2, y3};
db1 = (x[[1]] - x[[2]]) Sin[φ/2] + (x[[4]] - x[[5]]) Cos[φ/2];
db2 = -(x[[3]] - x[[2]]) Sin[φ/2] + (x[[6]] - x[[5]]) Cos[φ/2];
dφ = 1/b ((-x[[1]] + x[[2]]) Cos[φ/2] + (x[[4]] - x[[5]]) Sin[φ/2]) +
     1/b ((x[[3]] - x[[2]]) Cos[φ/2] + (x[[6]] - x[[5]]) Sin[φ/2]);
u = Kb/2 (db1^2 + db2^2) + Kφ/2 dφ^2;
(* calculation of the F-matrix *)
F = Table[D[u, x[[α]], x[[β]]], {α, 6}, {β, 6}];
(* calculation of the M^(-1/2)-matrix; atom 2 is the oxygen *)
M = Inverse[DiagonalMatrix[{mH^(1/2), mO^(1/2), mH^(1/2), mH^(1/2), mO^(1/2), mH^(1/2)}]];
(* solution of the eigenvalue problem (frequency = Sqrt[ev]/(2π)) *)
ev = Simplify[Eigenvalues[M.F.M]]
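The non-vanishing eigenvalues are:
$$\lambda_{1/2} = \frac{\zeta \pm \left[\zeta^2 - 8m_O(2m_H + m_O)\,b^2K_bK_\varphi\right]^{1/2}}{2b^2m_Hm_O}$$
$$\zeta = (m_H + m_O)\left(b^2K_b + 2K_\varphi\right) + m_H\left(b^2K_b - 2K_\varphi\right)\cos\varphi$$
$$\lambda_3 = K_b\frac{m_H(1 - \cos\varphi) + m_O}{m_Hm_O} .$$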
The above order corresponds to the order of the normal modes in Fig. 6.7. The explicit calculation of the depicted oscillations, the analog of which was discussed in the previous problem on carbon dioxide in one dimension, is omitted here. In addition to the eigenvalues we need the eigenvectors, which yield the cartesian displacements of the atoms according to (6.79).
Fig. 6.7 Normal mode vibrations of the water molecule
In order to determine $K_b$ and $K_\varphi$, we simultaneously solve two of the three equations $\tilde\nu_\alpha^e = \sqrt{\lambda_\alpha}/(2\pi c)$ (for instance using the FindRoot-routine of Mathematica). The quantities $\tilde\nu_\alpha^e$ are the experimental wavenumbers and $\lambda_\alpha$ are obtained from the above equations. We can check the consistency of the solution via the, thus far unused, third equation. Suitable values for $K_b$ and $K_\varphi$ as well as the other parameters can be obtained from the parameter tables of the AMBER molecular modeling software package [2] (or some other molecular modeling software package). Here $K_b = 768.7$ J m⁻², $b^{-2}K_\varphi = 70.9$ J rad⁻² m⁻², $b = 0.96\cdot 10^{-10}$ m, and $\varphi = \pi(104.5/180)$ rad for the equilibrium valence angle. The attendant wavenumbers are compiled in Table 6.1. Notice that their deviations from the experimental values are between 6 and 31 wavenumbers. We can try to modify the AMBER-parameter values for $K_b$ and $K_\varphi$, which yields almost exact agreement with the experimental wavenumbers - but only in two out of three. If we include the HOH-angle, φ, as an additional adjustable parameter, we are able to match all three experimental wavenumbers using $K_b = 766.4$ J m⁻², $b^{-2}K_\varphi = 69.3$ J rad⁻² m⁻², and φ = 118.6°. However, φ now deviates quite strongly from its literature value of 104.5°. In order to avoid this kind of problem, force fields sometimes contain cross terms; here this would be a b − b cross term (cf. footnote 8). For more information the interested reader is referred to the somewhat old but still very useful [3].

Table 6.1 Experimental wavenumbers ν̃e (in cm⁻¹) and the attendant deviations Δν̃ of the theoretical values. A: AMBER-values; A1: additional parameter adjustment using λ1 and λ2 (based on the AMBER-values); A2: same using λ2 and λ3 instead; Ab: parameter adjustment using all three eigenvalues, based on the A1-parameters, when (6.90) is enhanced by adding a bond-bond cross term. The resulting parameter values are listed in the bottom part of the table

                            ν̃e      Δν̃(A)    Δν̃(A1)   Δν̃(A2)   Δν̃(Ab)
                            3652    −31      0        −50      0
                            1595    −6       0        0        0
                            3756    19       51       0        0
Kb / (J/m²)                 –       768.7    755.8    776.6    766.2
Kφ/b² / (J/(rad² m²))       –       70.9     70.3     70.3     70.3
Kbb / (J/m²)                –       –        –        –        −20.9
Remark: In particular for large molecules it is not possible to obtain the λα analyt-
ically. In this case the F-matrix and the eigenvalues as well as the eigenvectors of
(M−1/2 )T · F · M−1/2 have to be calculated numerically.
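The closed-form eigenvalues given above can be evaluated for the AMBER parameters quoted in the text. The sketch below is an illustration only: the function names, the atomic masses (1.008 u and 15.999 u), and the physical constants are assumptions that the text does not state, so the last digit of the wavenumbers depends on them.

```python
import math

ATOMIC_MASS_UNIT = 1.66053907e-27   # kg (assumed constant)
SPEED_OF_LIGHT = 2.99792458e10      # cm/s, so that wavenumbers come out in 1/cm

def water_eigenvalues(Kb, Kphi, b, phi, mH, mO):
    """Non-vanishing eigenvalues lambda_1, lambda_2, lambda_3 as given in the text."""
    zeta = (mH + mO) * (b**2 * Kb + 2.0 * Kphi) + mH * (b**2 * Kb - 2.0 * Kphi) * math.cos(phi)
    root = math.sqrt(zeta**2 - 8.0 * mO * (2.0 * mH + mO) * b**2 * Kb * Kphi)
    norm = 2.0 * b**2 * mH * mO
    lam1 = (zeta + root) / norm
    lam2 = (zeta - root) / norm
    lam3 = Kb * (mH * (1.0 - math.cos(phi)) + mO) / (mH * mO)
    return lam1, lam2, lam3

def wavenumber(lam):
    """Wavenumber sqrt(lambda) / (2 pi c) in 1/cm."""
    return math.sqrt(lam) / (2.0 * math.pi * SPEED_OF_LIGHT)

# AMBER parameters quoted in the text (SI units).
b = 0.96e-10
Kb = 768.7
Kphi = 70.9 * b**2
phi = math.radians(104.5)
mH = 1.008 * ATOMIC_MASS_UNIT    # assumed hydrogen mass
mO = 15.999 * ATOMIC_MASS_UNIT   # assumed oxygen mass

wavenumbers = [wavenumber(lam) for lam in water_eigenvalues(Kb, Kphi, b, phi, mH, mO)]
experimental = [3652.0, 1595.0, 3756.0]
deviations = [e - w for e, w in zip(experimental, wavenumbers)]
```

With these assumptions the deviations (experimental minus theoretical wavenumber) come out close to the A column of Table 6.1.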
References
1. D.R. Lide (ed.), Handbook of Chemistry and Physics (CRC Press, Boca Raton, 2003)
2. W.D. Cornell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, D.M. Ferguson, D.C. Spellmeyer,
T. Fox, J.W. Caldwell, P.A. Kollman, A second generation force field for the simulation of
proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 117, 5179 (1995)
3. U. Burkert, N.L. Allinger, Molecular Mechanics, vol. 177, ACS Monograph (American Chem-
ical Society, Washington D.C., 1982)
8 E.g., K bb δb δb .
2 1 2
Chapter 7
Rigid Body Motion
The motion of extended objects requires a new set of tools and some new methods
as well. We begin with the introduction of the moment of inertia tensor and develop
the relations combining it with the angular velocity of a rigid body and its kinetic
energy or angular momentum. We then derive Euler's equations and learn how to use them.
7.1 Moment of Inertia Tensor and Angular Momentum†
A rigid body consists of point masses $m_i$, satisfying the following constraint: the distance between any two of the point masses is constant, i.e. $r_{ij} = \text{const}\ \forall\, i, j$. Even though real materials never satisfy this constraint, not even in their solid state, it often is a very reasonable approximation. Because point masses are rather inconvenient to use, we introduce the mass density $\rho(\vec r)$, which generally depends on position. The center of mass as well as the orientation of the rigid body in space can be described by three cartesian coordinates and three angles.
Angular Velocity†:
Figure 7.1 shows a rigid body in an inertial xyz-reference frame. The second x'y'z'-coordinate frame is tightly attached to the rigid body. Notice that $\vec r = \vec R + \vec r'$, i.e. $d\vec r = d\vec R + d\vec\varphi\times\vec r'$ (cf. (1.15)). Hence
$$\vec v = \vec V + \vec\omega\times\vec r' , \tag{7.1}$$
where $\vec V = \dot{\vec R}$ and $\vec\omega = \dot{\vec\varphi}$. This means that the velocity of the point P in the laboratory frame is $\vec V$ plus a contribution due to rotation of the rigid body.

DOI 10.1007/978-3-319-48710-6_7
Fig. 7.1 Top: laboratory frame (unprimed) and rigid body with attached frame (primed). Bottom: relation between vectors mentioned in the text
Notice that $\vec R$ is not necessarily the center of mass. In addition, $\vec R$ does not have to be located on the rotation axis either! Proof: The bottom part in Fig. 7.1 illustrates the situation. In this case the vector $\vec R$ points to a position on the rotation axis, whereas for $\vec R_s$ this is not the case. According to the figure we have
$$\vec r = \vec R + \vec r'' \quad\text{or}\quad \vec v = \vec V + \vec\omega\times\vec r''$$
$$\vec R_s = \vec R + \vec a \quad\text{or}\quad \vec V_s = \vec V + \vec\omega\times\vec a .$$
Using $\vec r'' = \vec a + \vec r'$ the first equation yields $\vec v = \vec V + \vec\omega\times\vec a + \vec\omega\times\vec r'$. Insertion of $\vec V$ from the second equation yields $\vec v = \vec V_s - \vec\omega\times\vec a + \vec\omega\times\vec a + \vec\omega\times\vec r'$ and thus
$$\vec v = \vec V_s + \vec\omega\times\vec r' .$$
Moment of Inertia Tensor† :

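The kinetic energy of the rigid body is given by
$$K = \sum_i\frac{m_i\vec v_i^{\,2}}{2} \overset{(7.1)}{=} \sum_i\frac{m_i}{2}\left(\vec V + \vec\omega\times\vec r_i\right)^2 . \tag{7.2}$$
Here and in the following calculation, including (7.15), $\vec r_i$ refers to the body-fixed reference frame, whose origin is the rigid body's center of mass. Notice that we omit the prime shown in Fig. 7.1 (top). $\vec V$ now is the center of mass velocity relative to the laboratory frame. However, as mentioned before, the center of mass position need not be on the instantaneous axis of rotation. Hence
$$K = \frac{1}{2}mV^2 + \left(\vec V\times\vec\omega\right)\cdot\underbrace{\sum_i m_i\vec r_i}_{=0\,(*)} + \sum_i\frac{m_i}{2}\left[\omega^2r_i^2 - \left(\vec\omega\cdot\vec r_i\right)^2\right] \tag{7.3}$$
(*: because center of mass is the origin!). Thus
$$K = \frac{1}{2}mV^2 + K_{rot} \tag{7.4}$$
with
$$K_{rot} = \frac{1}{2}\sum_{\alpha,\beta}I_{s,\alpha\beta}\,\omega_\alpha\omega_\beta \tag{7.5}$$
and
$$I_{s,\alpha\beta} = \sum_i m_i\Big(\underbrace{\sum_\nu x_{\nu,i}^2}_{=r_i^2}\delta_{\alpha\beta} - x_{\alpha,i}x_{\beta,i}\Big) \tag{7.6}$$
as well as
$$\delta_{\alpha\beta} = \begin{cases} 1 & \text{if } \alpha = \beta \\ 0 & \text{otherwise} \end{cases} .$$
Here α = 1 or β = 1 indicate the x-component, α = 2 or β = 2 indicate the y-component, and α = 3 or β = 3 indicate the z-component. The quantities $I_{s,\alpha\beta}$ are the components of the moment of inertia tensor, $\mathbf I_s$. Notice that $\mathbf I_s$ is symmetric. The continuum version of $\mathbf I_s$ in terms of its components is
$$I_{s,\alpha\beta} = \int\rho(\vec r)\Big(\sum_\nu x_\nu^2\,\delta_{\alpha\beta} - x_\alpha x_\beta\Big)d^3r \tag{7.7}$$
($d^3r = dx_1dx_2dx_3$). Notice also that we may employ the summation convention to simplify $\sum_\nu x_\nu^2$ or $\sum_{\alpha,\beta}I_{s,\alpha\beta}\omega_\alpha\omega_\beta$, i.e. $x_\nu^2$ or $I_{s,\alpha\beta}\omega_\alpha\omega_\beta$ replace the explicit summations.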
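For a small set of point masses the tensor (7.6) and the rotational energy (7.5) are easily evaluated. The following sketch compares (7.5) with the direct sum of the kinetic energies ½ m_i (ω × r_i)²; the function names, the masses, the positions, and the angular velocity are arbitrary assumptions chosen for illustration.

```python
def inertia_tensor(masses, positions):
    """Eq. (7.6): I_ab = sum_i m_i (r_i^2 delta_ab - x_a,i x_b,i)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(x * x for x in r)
        for a in range(3):
            for b in range(3):
                tensor[a][b] += m * ((r2 if a == b else 0.0) - r[a] * r[b])
    return tensor

def rotational_energy(tensor, w):
    """Eq. (7.5): K_rot = (1/2) sum_ab I_ab w_a w_b."""
    return 0.5 * sum(tensor[a][b] * w[a] * w[b] for a in range(3) for b in range(3))

def cross(a, b):
    """Vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def direct_rotational_energy(masses, positions, w):
    """Sum over (1/2) m_i |w x r_i|^2."""
    return sum(0.5 * m * sum(c * c for c in cross(w, r)) for m, r in zip(masses, positions))

# Assumed example: four point masses, shifted so that the center of mass is the origin.
masses = [1.0, 2.0, 3.0, 4.0]
raw = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]]
total = sum(masses)
com = [sum(m * r[k] for m, r in zip(masses, raw)) / total for k in range(3)]
positions = [[r[k] - com[k] for k in range(3)] for r in raw]

omega = [0.3, -1.2, 0.5]
I_s = inertia_tensor(masses, positions)
K_tensor = rotational_energy(I_s, omega)
K_direct = direct_rotational_energy(masses, positions, omega)
```

Both routes give the same rotational energy, and the computed tensor is symmetric, as noted above.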
• Example - Moment of Inertia Tensor of a Uniform Sphere: We want to

calculate the components of the moment of inertia tensor for a uniform or
homogenous sphere relative to its center of mass. The sphere of radius R has
a constant mass density, ρ. The first component is Is,11 , also denoted as Is,x x ,
i.e.

Is,x x = ρ r2 − x2 dV .
Vsphere
In spherical coordinates we have d V = r 2 dr dφdθ sin θ (cf. (1.92)) and thus

π
R 2π
Is,x x = ρ drr 4
dφ dθ 1 − cos2 φ sin2 θ sin θ .
0 0 0
The r -integration, which is independent from the angular integrations, yields
1 5
R .
5
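The results of the subsequent φ-integrations are
$$\int_0^{2\pi} d\phi = 2\pi$$
and
$$\int_0^{2\pi} d\phi\, \cos^2\phi\,\sin^2\theta = \frac{1}{2}\sin^2\theta \int_0^{2\pi} d\phi\, \underbrace{\left(\sin^2\phi + \cos^2\phi\right)}_{=1} = \pi\sin^2\theta .$$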
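Our preliminary result is
$$I_{s,xx} = \frac{\pi\rho}{5} R^5 \int_0^{\pi} d\theta \left(2 - \sin^2\theta\right)\sin\theta .$$
Now we use the substitution x = cos θ, i.e. dx = − sin θ dθ, which leads to
$$\int_0^{\pi} d\theta \cdots = -\int_1^{-1} dx \left(1 + x^2\right) = \int_{-1}^{1} dx \left(1 + x^2\right) = \frac{8}{3} .$$
Hence the final result is given by
$$I_{s,xx} = \frac{8\pi}{15}\rho R^5 .$$
Using $\rho = m/\left(\frac{4\pi}{3} R^3\right)$ we may replace the density by the mass, m, of the sphere,
which yields
$$I_{s,xx} = \frac{2}{5} m R^2 .$$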
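The components $I_{s,yy} = I_{s,zz} = I_{s,xx}$ (or $I_{s,22} = I_{s,33} = I_{s,11}$) and $I_{s,\alpha\beta} = 0$
for α ≠ β follow analogously. All in all we have
$$\mathbf I_s = \begin{pmatrix} \frac{2}{5} m R^2 & 0 & 0 \\ 0 & \frac{2}{5} m R^2 & 0 \\ 0 & 0 & \frac{2}{5} m R^2 \end{pmatrix} . \tag{7.8}$$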
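The result $I_{s,xx} = \frac{2}{5} m R^2$ can be cross-checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the integral above with a simple midpoint rule in spherical coordinates. The function name `sphere_ixx`, the grid size and the sample values of m and R are illustrative assumptions.

```python
import math


def sphere_ixx(mass, radius, n=40):
    """Midpoint-rule estimate of I_xx = rho * int (r^2 - x^2) dV for a uniform sphere."""
    rho = mass / (4.0 / 3.0 * math.pi * radius ** 3)  # constant mass density
    dr, dtheta, dphi = radius / n, math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            for k in range(n):
                phi = (k + 0.5) * dphi
                x = r * math.sin(theta) * math.cos(phi)
                # integrand (r^2 - x^2) times the volume element r^2 sin(theta)
                total += (r * r - x * x) * r * r * math.sin(theta)
    return rho * total * dr * dtheta * dphi


# Hypothetical sample values (not from the text)
m, R = 2.0, 1.5
numeric = sphere_ixx(m, R)
exact = 0.4 * m * R ** 2  # (2/5) m R^2
print(numeric, exact)
```

The two printed numbers should agree up to the discretization error of the midpoint rule, which shrinks as `n` grows.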
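The off-diagonal tensor components generally are not zero. On the other hand,
the moment of inertia tensor, $\mathbf I_s$, is symmetric. Therefore we can always find a
coordinate system, via rotation with respect to the origin, in which the off-diagonal
elements vanish. This is called diagonalization of the moment of inertia tensor. The
coordinate axes of this new reference system are the principal axes of inertia. In this case
(7.5) becomes
$$\begin{aligned} K_{rot} &= \frac{1}{2} I_{s,\alpha\beta}\,\omega_\alpha\omega_\beta \\ &= \frac{1}{2}\,\vec\omega^{T}\cdot\mathbf I_s\cdot\vec\omega \\ &= \frac{1}{2}\,\vec\omega^{T}\cdot\mathbf D^{-1}\mathbf D\cdot\mathbf I_s\cdot\mathbf D^{-1}\mathbf D\cdot\vec\omega \\ &= \frac{1}{2}\,\vec\omega'^{\,T}\cdot\mathbf I'_s\cdot\vec\omega' . \end{aligned} \tag{7.9}$$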
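Diagonalization is accomplished via the rotation matrix $\mathbf D$, which satisfies $\mathbf D^{-1}\mathbf D = \mathbf I$,
where $\mathbf I$ is the unit or identity matrix, i.e.
$$\mathbf I'_s = \mathbf D\cdot\mathbf I_s\cdot\mathbf D^{-1} = \begin{pmatrix} I'_{s,1} & 0 & 0 \\ 0 & I'_{s,2} & 0 \\ 0 & 0 & I'_{s,3} \end{pmatrix} . \tag{7.10}$$
The quantities $I'_{s,1}$, $I'_{s,2}$, and $I'_{s,3}$, i.e. the eigenvalues of $\mathbf I_s$, are the moments of
inertia with respect to the principal axes of the body (principal moments of inertia).
The angular velocity in this new (rotated) reference frame is
$$\vec\omega' = \mathbf D\cdot\vec\omega . \tag{7.11}$$
Hence
$$K_{rot} = \frac{1}{2}\left(I'_{s,1}\,\omega_1'^{\,2} + I'_{s,2}\,\omega_2'^{\,2} + I'_{s,3}\,\omega_3'^{\,2}\right) . \tag{7.12}$$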

Remark 1: Notice the following definitions - (i) asymmetric top $I'_{s,1} \neq I'_{s,2} \neq I'_{s,3}$,
(ii) symmetric top $I'_{s,1} = I'_{s,2} \neq I'_{s,3}$, and (iii) spherical top $I'_{s,1} = I'_{s,2} = I'_{s,3}$.
Remark 2: Below we discuss the moment of inertia tensor in coordinate systems,
where the center of mass is not located at the origin. As before it is possible to
diagonalize the moment of inertia tensor. The resulting principal moments of inertia
and attendant principal axes generally are different.
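Angular Momentum†:
We consider the rigid body’s angular momentum, $\vec L_s$, with respect to its center of
mass. Using $\vec L_s = \sum_i m_i\left(\vec r_i\times\vec v_i\right)$ and $\vec v_i = \vec\omega\times\vec r_i$ we have
$$\vec L_s = \sum_i m_i\,\vec r_i\times\left(\vec\omega\times\vec r_i\right) = \sum_i m_i\left(r_i^2\,\vec\omega - \vec r_i\left(\vec r_i\cdot\vec\omega\right)\right) , \tag{7.13}$$
i.e.
$$L_{s,\alpha} = \sum_i m_i\left(x_{l,i}^2\,\delta_{\alpha\beta} - x_{\alpha,i}\,x_{\beta,i}\right)\omega_\beta \overset{\text{Eq. (7.6)}}{=} I_{s,\alpha\beta}\,\omega_\beta . \tag{7.14}$$
In particular we find that $K_{rot}$, defined in (7.5), is given by
$$K_{rot} = \frac{1}{2}\,\vec L_s\cdot\vec\omega . \tag{7.15}$$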
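The relations (7.13)–(7.15) can be verified for a small set of point masses. The sketch below is an illustration under stated assumptions, not part of the original text: the masses, positions and angular velocity are arbitrary sample data, and the helper names are hypothetical. It compares the angular momentum obtained directly from the double cross product with the tensor form, and the kinetic energy of the individual masses with one half of the scalar product of angular momentum and angular velocity.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def inertia_tensor(masses, positions):
    """I_ab = sum_i m_i (r_i^2 delta_ab - x_a,i x_b,i), cf. (7.6)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = dot(r, r)
        for a in range(3):
            for b in range(3):
                tensor[a][b] += m * ((r2 if a == b else 0.0) - r[a] * r[b])
    return tensor


def angular_momentum_direct(masses, positions, omega):
    """L = sum_i m_i r_i x (omega x r_i), cf. (7.13)."""
    total = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        term = cross(r, cross(omega, r))
        total = [t + m * c for t, c in zip(total, term)]
    return total


def mat_vec(tensor, v):
    return [dot(row, v) for row in tensor]


# Hypothetical sample data (not from the text)
masses = [1.0, 2.0, 0.5]
positions = [(1.0, 0.0, 0.5), (-0.5, 1.0, 0.0), (0.0, -2.0, 1.0)]
omega = (0.3, -0.2, 1.1)

L_direct = angular_momentum_direct(masses, positions, omega)
L_tensor = mat_vec(inertia_tensor(masses, positions), omega)   # (7.14)
K_direct = sum(0.5 * m * dot(cross(omega, r), cross(omega, r))
               for m, r in zip(masses, positions))
K_from_L = 0.5 * dot(L_tensor, omega)                          # (7.15)
print(L_direct, L_tensor, K_direct, K_from_L)
```

The identities are purely algebraic, so the agreement holds for any choice of sample data in which every mass moves with velocity ω × r.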
Moment of Inertia Tensor in a Shifted Coordinate System† :

Fig. 7.2 Precession of the free symmetric top
Thus far the center of mass of the rigid body did coincide with the origin of the
coordinate system in which the moment of inertia tensor was calculated. Is this a
necessity?
The answer is no. Equation (7.1) remains valid if $\vec V = 0$, i.e. $\vec R = \text{const}$. Take
for instance a clock pendulum. Its suspension point is at rest. The center of mass,
however, generally is quite far from the suspension point, which itself is on the axis
of rotation. If we repeat the steps following (7.1) setting $\vec V = 0$, then we obtain
the same expressions as before, i.e. (7.4), (7.5),¹ (7.6), (7.7) and (7.9)–(7.15) as well
as later (7.62) remain valid without the index s. Thus, the moment of inertia tensor
(without the index s) is still calculated in the body-fixed coordinate system. At the
same time the center of mass is not the origin, but the origin is at rest.
As an example we consider the free symmetric top, which here is a homogeneous
cone of height h with mass m. Its radius at the base is a. The cone, shown in Fig. 7.2,
is supported at its tip while rotating. It is this rotation we want to study.
The vector L is the total angular momentum, which is parallel to the z-axis of
the laboratory frame. We choose a coordinate system defined by the axes x, y, and
z, as indicated in the figure (the y-axis is not shown explicitly). The origin of this
coordinate system, which is attached to the cone, and that of the laboratory frame
do coincide. The cone’s center of mass is located some distance from its tip on the
z-axis. A calculation shows that the moment of inertia tensor is diagonal in this
coordinate system. Here we do not want to go through the entire calculation.
Rather, we determine only the following three tensor elements:
$$I_{zz} = m\,\frac{\int_0^h dz \int_0^{za/h} ds\, s \int_0^{2\pi} d\phi\,(..)}{\int_0^h dz \int_0^{za/h} ds\, s \int_0^{2\pi} d\phi} \overset{(..)=s^2}{=} \frac{3}{10}\, m a^2 . \tag{7.16}$$
$I_{zz}$ is the moment of inertia with respect to the z-axis. We use cylindrical coordinates,
where s is the perpendicular distance of the volume or mass element from the z-axis.
z on the other hand is the z-position of the volume or mass element. Finally, φ is the
angle between the projection of the line from the origin to the volume element onto
the x-y-plane and the x-axis. The mass density, ρ, is expressed via m/V. The volume
of the cone, V, not to be confused with the above velocity $\vec V$, is the denominator of
the above expression. Analogously we find
$$I_{xz} \overset{(..)=-zs\cos\phi}{=} 0 \tag{7.17}$$
and
$$I_{xx} \overset{(..)=s^2+z^2-s^2\cos^2\phi}{=} \frac{3}{20}\, m\left(a^2 + 4h^2\right) , \tag{7.18}$$
the moment of inertia with respect to the x-axis. The calculation of the other tensor
elements, omitted here, is quite similar and proves the above assertion.²
¹ $K_{rot}$ now is the total kinetic energy.
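The three tensor elements, and the center of mass position quoted in (7.19), can be reproduced by direct numerical integration over the cone. The following sketch is an illustration only: it assumes a cone with its tip at the origin and its axis along z, uses a midpoint rule in the cylindrical coordinates (z, s, φ), and the function name, grid sizes and sample values are hypothetical.

```python
import math


def cone_moments(m, a, h, n=80, n_phi=16):
    """Midpoint-rule estimates of I_zz, I_xx, I_xz and the center of mass height
    for a homogeneous cone (tip at the origin, axis along z)."""
    vol = izz = ixx = ixz = z_weighted = 0.0
    dz = h / n
    dphi = 2.0 * math.pi / n_phi
    for i in range(n):
        z = (i + 0.5) * dz
        s_max = z * a / h            # cone radius at height z
        ds = s_max / n
        for j in range(n):
            s = (j + 0.5) * ds
            for k in range(n_phi):
                phi = (k + 0.5) * dphi
                x = s * math.cos(phi)
                dv = s * ds * dz * dphi          # cylindrical volume element
                vol += dv
                izz += s * s * dv                # (..) = s^2
                ixx += (s * s + z * z - x * x) * dv
                ixz += -z * x * dv               # (..) = -z s cos(phi)
                z_weighted += z * dv
    rho = m / vol
    return rho * izz, rho * ixx, rho * ixz, z_weighted / vol


# Hypothetical sample values (not from the text)
m, a, h = 1.0, 0.5, 2.0
I_zz, I_xx, I_xz, z_cm = cone_moments(m, a, h)
print(I_zz, 0.3 * m * a * a)                      # compare with (7.16)
print(I_xx, 0.15 * m * (a * a + 4.0 * h * h))     # compare with (7.18)
print(I_xz, z_cm, 0.75 * h)                       # compare with (7.17) and (7.19)
```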
In the following we require that $\vec L$ lies in the instantaneous x-z-plane, which yields
$$\begin{pmatrix} L_x \\ 0 \\ L_z \end{pmatrix} = \begin{pmatrix} I_{xx} & 0 & 0 \\ 0 & I_{yy} & 0 \\ 0 & 0 & I_{zz} \end{pmatrix}\begin{pmatrix} \omega_x \\ 0 \\ \omega_z \end{pmatrix} .$$
In particular we have $\omega_y = 0$, because $L_y = 0$. The vector $\vec\omega$ can now be written as
the sum of a contribution $\vec\omega_K$ along the z-direction (rotation of the top relative to this
principal axis) plus a contribution $\vec\omega_{Pr}$ along the direction of $\vec L$ (rotation or rather
precession of the top relative to an axis parallel to $\vec L$). Notice that $\omega_{Pr} = \omega_x/\sin\theta$,
where θ is the angle between $\vec L$ and the z-axis (cf. Fig. 7.2), i.e.
$$\omega_{Pr} = \frac{L_x}{I_{xx}\sin\theta} = \frac{L}{I_{xx}} \tag{7.20}$$
is the magnitude of the angular velocity of the precession of $\vec\omega$ around $\vec L$. Hence we
obtain a rotation of the projection of the z-axis in the x-y-plane of the laboratory
frame.
Steiner’s Theorem† :
The simple shape of the above rigid body and the special choice of the coordinate
origin, coinciding with the tip of the cone, simplify the calculation of the moment of
inertia tensor. But what if the cone is transfixed along its z-axis by a thin (massless)
needle as shown in Fig. 7.3? Now the top is supported by the tip of the needle, which
is the new origin. The distance, d, from the origin to the tip of the cone is not equal
to zero but has some finite value (cf. Fig. 7.3).
² Notice that the distance, R, of the cone’s tip from its center of mass is given by
$$R = \frac{\int_0^h dz \int_0^{za/h} ds\, s \int_0^{2\pi} d\phi\,(..)}{\int_0^h dz \int_0^{za/h} ds\, s \int_0^{2\pi} d\phi} \overset{(..)=z}{=} \frac{3}{4}\, h . \tag{7.19}$$
Fig. 7.3 Free symmetric top supported on a needle along its symmetry axis
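Nevertheless, we may still calculate the moment of inertia tensor using (7.7):
$$I_{\alpha\beta} = \int_V \rho\left(\vec r_{new}^{\,2}\,\delta_{\alpha\beta} - x_{new,\alpha}\,x_{new,\beta}\right) d^3r_{new} .$$
Here $\vec r_{new}$ is a vector in the coordinate system shown in Fig. 7.3. Expressing $\vec r_{new}$ via
$$\vec r_{new} = \vec r_{old} + \vec d , \tag{7.21}$$
where $\vec r_{old}$ is the position vector in our previous example and $\vec d = (0, 0, d)$, we
obtain
$$I_{\alpha\beta} = \int_V \rho\left(\left(\vec r_{old} + \vec d\right)^2\delta_{\alpha\beta} - \left(x_{old,\alpha} + d_\alpha\right)\left(x_{old,\beta} + d_\beta\right)\right) d^3r_{old} .$$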
This equation does not simplify matters - with one exception. Instead of (7.21) we
use
$$\vec r_{new} = \vec r + \vec R . \tag{7.22}$$
Here $\vec R$ is the vector from the origin to the center of mass of the cone, and $\vec r$ is the
vector from the center of mass to the volume element. In this case the calculation
usually is much simpler. Because
$$\int_{V_{rb}} \rho\, x_\alpha\, d^3r = 0$$
(α = 1, 2, 3), we have
$$I_{\alpha\beta} = m_{rb}\left(R^2\,\delta_{\alpha\beta} - X_\alpha X_\beta\right) + I_{s,\alpha\beta} . \tag{7.23}$$
The quantities $V_{rb}$ and $m_{rb}$ are the volume and the mass of the rigid body, respectively,
whereas $X_\alpha$ and $X_\beta$ are components of $\vec R$. In addition, $I_{s,\alpha\beta}$ denotes the components
of the moment of inertia tensor relative to the center of mass.
Applying this formula to the spinning top in Fig. 7.3 we obtain
$$\mathbf I = \begin{pmatrix} m_{rb} R^2 + I_{s,xx} & 0 & 0 \\ 0 & m_{rb} R^2 + I_{s,yy} & 0 \\ 0 & 0 & I_{s,zz} \end{pmatrix} .$$
Because the center of mass is located on the z-axis, we may use our previous result
(7.16), i.e. $I_{s,zz} = \frac{3}{10}\, m_{rb}\, a^2$. The other two components of the moment of inertia
tensor we look up in one of numerous tables listing the moments of inertia for simple
bodies: $I_{s,xx} = I_{s,yy} = \frac{3}{80}\, m_{rb}\left(4a^2 + h^2\right)$.³
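Using (7.23) we now have
$$\begin{aligned} I_{\alpha\beta}\,\omega_\alpha\omega_\beta &= \left[m_{rb}\left(R^2\,\delta_{\alpha\beta} - X_\alpha X_\beta\right) + I_{s,\alpha\beta}\right]\omega_\alpha\omega_\beta \\ &= m_{rb}\left(\vec\omega\times\vec R\right)^2 + I_{s,\alpha\beta}\,\omega_\alpha\omega_\beta \\ &= m_{rb}\, R_\perp^2\,\omega^2 + I_{s,\alpha\beta}\,\omega_\alpha\omega_\beta . \end{aligned}$$
Here $R_\perp$ is the perpendicular distance of the center of mass from the axis of rotation.
This equation becomes even simpler if the axis of rotation is parallel to the x-axis
($\vec\omega = (\omega, 0, 0)$), i.e.
$$I_{\alpha\beta}\,\omega_\alpha\omega_\beta = \left(m_{rb}\, R_\perp^2 + I_{s,xx}\right)\omega^2$$
and
$$K_{Rot} = \frac{1}{2}\left(m_{rb}\, R_\perp^2 + I_{s,xx}\right)\omega^2 . \tag{7.25}$$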
2
The fact that
$$I = m_{rb}\, R_\perp^2 + I_{s,xx} \tag{7.26}$$
is the moment of inertia with respect to a fixed axis, the x-axis in this case, is known
as Steiner’s theorem or parallel axis theorem. Notice that (7.26) is a special case of
(7.23).
³ In the special case R = 3h/4 (cf. (7.19)), when the origin coincides with the tip of the cone, we
obtain exactly this result from (7.18), i.e.
$$\frac{3}{20}\, m_{rb}\left(a^2 + 4h^2\right) = m_{rb}\left(\frac{3}{4} h\right)^2 + I_{s,xx} . \tag{7.24}$$
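Equation (7.23) translates directly into a short routine. The sketch below is a hedged illustration, not part of the original text: `shift_inertia` is a hypothetical helper name and the numerical values are arbitrary. It shifts the center of mass tensor of the cone to the tip and compares the result with (7.18) and (7.16).

```python
def shift_inertia(inertia_cm, mass, R):
    """I_ab = m (R^2 delta_ab - X_a X_b) + I_s,ab, cf. (7.23).
    R is the vector from the new origin to the center of mass."""
    R2 = sum(x * x for x in R)
    return [[inertia_cm[a][b] + mass * ((R2 if a == b else 0.0) - R[a] * R[b])
             for b in range(3)]
            for a in range(3)]


# Hypothetical sample values for the homogeneous cone (not from the text)
m, a, h = 1.3, 0.4, 1.1
I_perp_cm = 3.0 / 80.0 * m * (4.0 * a * a + h * h)   # tabulated I_s,xx = I_s,yy
I_axis = 3.0 / 10.0 * m * a * a                      # I_s,zz from (7.16)
I_cm = [[I_perp_cm, 0.0, 0.0],
        [0.0, I_perp_cm, 0.0],
        [0.0, 0.0, I_axis]]

# Origin at the tip: the center of mass sits at (0, 0, 3h/4), cf. (7.19)
I_tip = shift_inertia(I_cm, m, (0.0, 0.0, 0.75 * h))
I_xx_expected = 3.0 / 20.0 * m * (a * a + 4.0 * h * h)   # (7.18)
print(I_tip[0][0], I_xx_expected, I_tip[2][2], I_axis)
```

Because the shift is along the z-axis, only the xx- and yy-components change, which is the content of (7.24).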
• Problem 32 - Moment of Inertia Tensor: Calculate the principal axes and
the attendant principal moments of inertia for the following mass distributions.
(a) An infinitely thin isosceles triangle possessing a uniform mass density
(ρ = σδ(z)). The length of the two equal sides is a and α = 45° is the angle
between them. Here the reference point is the center of mass.
(b) Three identical point masses, located at (a, 0, 0), (0, a, 2a), and (0, 2a, a),
whose combined mass is m. The reference point is the origin.
Solution: (a) The following sketch shows the triangle, whose particular position
and orientation in the x-y-plane simplifies the calculation. The components
of the moment of inertia tensor of the triangle in this coordinate system
are
$$I_{\alpha\beta} = m\left(R^2\,\delta_{\alpha\beta} - X_\alpha X_\beta\right) + I_{s,\alpha\beta} \tag{7.27}$$
(cf. (7.23)). The quantities $X_\alpha$ and $X_\beta$ are the cartesian center of mass components.
The center of mass itself is indicated by the black circle in the sketch.
In addition $R^2 = X_1^2 + X_2^2 + X_3^2$, m is the total mass of the triangle, and $I_{s,\alpha\beta}$
are the components of the moment of inertia tensor relative to the center of
mass. Using the area of the triangle,
$$A = a\cos\frac{\alpha}{2}\; a\sin\frac{\alpha}{2} = \frac{a^2}{2^{3/2}} , \tag{7.28}$$
and its mass,
$$m = \int dV\,\rho = \sigma\int_A dx\,dy = \sigma A , \tag{7.29}$$
we first calculate the cartesian components of the center of mass, i.e.
$$X_1 = \frac{1}{m}\int dV\, x\,\rho = \frac{\sigma}{m}\int_A x\,dx\,dy = \frac{1}{A}\int_0^{a\cos(\alpha/2)} dx\, x \int_{-x\tan(\alpha/2)}^{x\tan(\alpha/2)} dy . \tag{7.30}$$
The result of the integrations is
$$X_1 = \frac{2}{3}\, a\cos\frac{\alpha}{2} . \tag{7.31}$$
Using the symmetry of the problem immediately yields
$$X_2 = X_3 = 0 . \tag{7.32}$$
Now we can calculate the elements of the moment of inertia tensor in the
coordinate system of our choice. Using (7.27) we obtain the elements relative
to the center of mass. For instance:
$$I_{xx} = \sigma\int_A dx\,dy\left(x^2 + y^2 - x^2\right) = \sigma\int_0^{a\cos(\alpha/2)} dx \int_{-x\tan(\alpha/2)}^{x\tan(\alpha/2)} dy\, y^2 . \tag{7.33}$$
We find
$$I_{xx} = \frac{1}{12}\left(1 - \frac{1}{\sqrt 2}\right) m a^2 \tag{7.34}$$
and
$$I_{s,xx} = I_{xx} - 0 = \frac{1}{12}\left(1 - \frac{1}{\sqrt 2}\right) m a^2 . \tag{7.35}$$
Analogously we obtain the other tensor components, i.e.
$$\mathbf I_s = m a^2 \begin{pmatrix} \frac{1}{12}\left(1 - \frac{1}{\sqrt 2}\right) & 0 & 0 \\ 0 & \frac{1}{36}\left(1 + \frac{1}{\sqrt 2}\right) & 0 \\ 0 & 0 & \frac{1}{9}\left(1 - \frac{1}{2\sqrt 2}\right) \end{pmatrix} . \tag{7.36}$$
We notice that our coordinate system is a good choice. After its origin is shifted
to the center of mass, the coordinate axes become the principal axes, because
$\mathbf I_s$ is diagonal.
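(b) Our starting point is (7.6). One by one we obtain
$$\begin{aligned} I_{xx} &= \frac{m}{3}\left(a^2 + a^2 + 4a^2 + 4a^2 + a^2 - a^2\right) = \frac{10}{3}\, m a^2 \\ I_{yy} &= \frac{m}{3}\left(11a^2 - (a^2 + 4a^2)\right) = \frac{6}{3}\, m a^2 \\ I_{zz} &= \frac{m}{3}\left(11a^2 - (4a^2 + a^2)\right) = \frac{6}{3}\, m a^2 \\ I_{xy} &= \frac{m}{3}\left(0 - 0\right) = 0 \\ I_{xz} &= \frac{m}{3}\left(0 - 0\right) = 0 \\ I_{yz} &= \frac{m}{3}\left(0 - (2a^2 + 2a^2)\right) = -\frac{4}{3}\, m a^2 , \end{aligned}$$
i.e.
$$\mathbf I = \frac{1}{3}\, m a^2 \begin{pmatrix} 10 & 0 & 0 \\ 0 & 6 & -4 \\ 0 & -4 & 6 \end{pmatrix} . \tag{7.37}$$
The first principal axis is defined by the unit vector $\vec e_x = (1, 0, 0)$ along the
x-direction, because
$$\mathbf I\cdot\vec e_x = I_{xx}\,\vec e_x . \tag{7.38}$$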
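The other two principal axes we find via diagonalization of the submatrix
$\begin{pmatrix} 6 & -4 \\ -4 & 6 \end{pmatrix}$, i.e.
$$\mathbf D^{-1}\cdot\begin{pmatrix} 6 & -4 \\ -4 & 6 \end{pmatrix}\cdot\mathbf D = \begin{pmatrix} c_1 & 0 \\ 0 & c_2 \end{pmatrix} , \tag{7.39}$$
where the rotation matrix, $\mathbf D$, is
$$\mathbf D = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \tag{7.40}$$
(notice: $\mathbf D^{-1} = \mathbf D^{T}$). Hence
$$\begin{pmatrix} 6 + 4\sin(2\phi) & -4\cos(2\phi) \\ -4\cos(2\phi) & 6 - 4\sin(2\phi) \end{pmatrix} = \begin{pmatrix} c_1 & 0 \\ 0 & c_2 \end{pmatrix} . \tag{7.41}$$
This means that $2\phi = \pm\frac{\pi}{2}, \pm\frac{3\pi}{2}, \ldots$; for φ = π/4 we have $c_1 = 10$ and $c_2 = 2$.
Notice that φ = π/4 corresponds to a rotation of the coordinate system with
respect to the x-axis by π/4. The new y- and z-axes are oriented along the unit
vectors $\vec e'_y = (0, 1/\sqrt 2, 1/\sqrt 2)$ and $\vec e'_z = (0, -1/\sqrt 2, 1/\sqrt 2)$, respectively.
The attendant principal moments of inertia are given by
$$\mathbf I\cdot\vec e'_y = \frac{2}{3}\, m a^2\,\vec e'_y \tag{7.42}$$
$$\mathbf I\cdot\vec e'_z = \frac{10}{3}\, m a^2\,\vec e'_z . \tag{7.43}$$
The following figure shows the three point masses joined by straight lines
within a cube. The lengths of the cube’s edges are 2a. The center of mass
can be seen in the middle of the triangle. The vectors located at the origin are
parallel to $\vec e_x$, $\vec e'_y$ and $\vec e'_z$, respectively.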
Remark: What is the result if the reference point is the center of mass
instead of the origin? The principal moments of inertia in this case are
$(1/3,\, 11/9,\, 14/9)\, m a^2$. The attendant principal axes are defined by the vectors
$(-\sqrt{2/11},\, 3/\sqrt{22},\, 3/\sqrt{22})$, $(0,\, -1/\sqrt 2,\, 1/\sqrt 2)$, and $(3/\sqrt{11},\, 1/\sqrt{11},\, 1/\sqrt{11})$.
These vectors may look complicated. But looking at them in the figure, where
they are shown located on the center of mass, we notice that they appear according
to our expectation. ‘Expectation’ refers to the result of the geometrically
similar mass distribution in part (a).
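The results of part (b) and of the remark can be checked by building the tensor from (7.6) and testing each stated axis as an eigenvector. This is a minimal sketch with hypothetical helper names; the values m = 3 and a = 1 are arbitrary choices.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def inertia_tensor(masses, positions):
    """I_ab = sum_i m_i (r_i^2 delta_ab - x_a,i x_b,i), cf. (7.6)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = dot(r, r)
        for a in range(3):
            for b in range(3):
                tensor[a][b] += m * ((r2 if a == b else 0.0) - r[a] * r[b])
    return tensor


def is_eigenpair(tensor, vec, value, tol=1e-9):
    """True if tensor . vec equals value * vec within the tolerance."""
    image = [dot(row, vec) for row in tensor]
    return all(abs(image[k] - value * vec[k]) < tol for k in range(3))


# Hypothetical sample values (not from the text)
m, a = 3.0, 1.0
masses = [m / 3.0] * 3
positions = [(a, 0.0, 0.0), (0.0, a, 2.0 * a), (0.0, 2.0 * a, a)]
unit = m * a * a
s2, s11, s22 = math.sqrt(2.0), math.sqrt(11.0), math.sqrt(22.0)

# Reference point: origin, cf. (7.37), (7.38), (7.42), (7.43)
I_origin = inertia_tensor(masses, positions)
origin_pairs = [((1.0, 0.0, 0.0), 10.0 / 3.0 * unit),
                ((0.0, 1.0 / s2, 1.0 / s2), 2.0 / 3.0 * unit),
                ((0.0, -1.0 / s2, 1.0 / s2), 10.0 / 3.0 * unit)]

# Reference point: center of mass (the remark)
com = [sum(mi * r[k] for mi, r in zip(masses, positions)) / m for k in range(3)]
relative = [tuple(r[k] - com[k] for k in range(3)) for r in positions]
I_cm = inertia_tensor(masses, relative)
cm_pairs = [((-2.0 / s22, 3.0 / s22, 3.0 / s22), 1.0 / 3.0 * unit),
            ((0.0, -1.0 / s2, 1.0 / s2), 11.0 / 9.0 * unit),
            ((3.0 / s11, 1.0 / s11, 1.0 / s11), 14.0 / 9.0 * unit)]

print(all(is_eigenpair(I_origin, v, lam) for v, lam in origin_pairs))
print(all(is_eigenpair(I_cm, v, lam) for v, lam in cm_pairs))
```

Note that $-2/\sqrt{22} = -\sqrt{2/11}$, so the first center of mass axis in the code is the same vector as in the remark.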
• Problem 33 - General Pendulum: The sketch shows a pendulum whose
shape is rather arbitrary. The suspension point also is the origin of the laboratory
frame (hollow circle). The center of mass of the pendulum is indicated by the
vector $\vec R$ (solid circle). Notice that $\vec R \perp \vec\omega$ is not required.
Based on $\vec L = \sum_i \vec r_i\times\vec p_i$, where the sum is over all mass elements i of the
pendulum, show that the angular momentum relative to the instantaneous axis
of rotation is given by
$$\vec L = \mathbf I\cdot\vec\omega .$$
The quantity $\vec\omega$ is the angular velocity of the pendulum with respect to the axis
of rotation and $\mathbf I$ is given by
$$\mathbf I = \mathbf I_{sP} + \mathbf I_s .$$
The first term is the moment of inertia tensor relative to the axis of rotation of
a point mass M, equal to the total mass of the pendulum, located at the center
of mass. The second term is the moment of inertia tensor relative to the center
of mass of the pendulum.
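Solution: We write
$$\vec L = \sum_i m_i\,\vec r_i\times\vec v_i \overset{(7.1)}{=} \sum_i m_i\left(\vec R + \vec r_i'\right)\times\left(\vec V + \left(\vec\omega\times\vec r_i'\right)\right) . \tag{7.44}$$
The two terms $\sum_i m_i\,\vec r_i'\times\vec V$ and $\sum_i m_i\,\vec R\times(\vec\omega\times\vec r_i')$ vanish, because $\sum_i m_i\,\vec r_i' = 0$.
Hence
$$\vec L = M\vec R\times\vec V + \sum_i m_i\,\vec r_i'\times\left(\vec\omega\times\vec r_i'\right) . \tag{7.45}$$
Using $\vec V = \vec\omega\times\vec R$ in conjunction with the third vector identity in Appendix
A, this becomes
$$\vec L = M\left(R^2\,\vec\omega - \vec R\,(\vec R\cdot\vec\omega)\right) + \sum_i m_i\left(\vec\omega\, r_i'^{\,2} - \vec r_i'\,(\vec r_i'\cdot\vec\omega)\right) . \tag{7.46}$$
Expressed in terms of components we have
$$L_\alpha = \left[M\left(R^2\,\delta_{\alpha\beta} - X_\alpha X_\beta\right) + \sum_i m_i\left(r_i'^{\,2}\,\delta_{\alpha\beta} - x'_{i,\alpha}\,x'_{i,\beta}\right)\right]\omega_\beta , \tag{7.47}$$
which shows that we have attained the above objective. We also recognize
(7.23).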
• Problem 34 - A Cone Rolling on a Horizontal Plane: A homogeneous cone

characterized by its mass, m, the height, h, and the cone angle, α, is rolling on
its side as shown in the sketch below. The period of its rotation on a horizontal
plane is T .
(a) Calculate the kinetic energy, K , of the cone expressed in terms of the
above quantities.
(b) Also calculate the total angular momentum, L, relative to a fixed coordi-
nate system. The origin of the coordinate system coincides with the tip of the
cone and its y-z-plane coincides with the plane on which the cone is rolling.
Solution: (a) We begin with a sketch of the cone on the plane (cf. below).
The upper part of the sketch shows the cone on its side. The tip of the cone
is located at the origin of the laboratory xyz-coordinate system. The y- and
z-axes of the latter define the plane. A second, body-fixed, x′y′z′-coordinate
system is also shown. Its origin coincides with the center of mass of the cone.
The position of the center of mass in the laboratory system is indicated by the
vector $\vec R$.
At this instant the y- and the y′-axis are parallel. The lower part of the sketch
shows the attendant position of the cone’s center of mass, looking along the
negative direction of the z-axis, indicated by $\vec r_s$. Notice that $\vec r_s$ is perpendicular
to the y-z-plane. After a short time interval, δt, the cone’s center of mass has
moved by $\delta\vec r_s$ to the left. Strictly speaking we consider the limit δt → 0, i.e.
$\lim_{\delta t\to 0}\delta\phi/\delta t = d\phi/dt = \omega$. Notice that the instantaneous axis of rotation
coincides with the z-axis. Notice also that $\vec\omega$ is parallel to this axis. Hence
$$\vec\omega = \begin{pmatrix} 0 \\ 0 \\ \omega \end{pmatrix} \tag{7.48}$$
in the laboratory frame. The instantaneous velocity of the cone’s center of
mass, $\vec V$, is given by
$$\vec V = \vec\omega\times\vec r_s . \tag{7.49}$$
Using $\vec r_s = (r_s, 0, 0)$ and $r_s = R\sin(\alpha/2)$ as well as R = 3h/4 (cf. (7.19))
yields
$$V = \frac{3}{4}\,\omega h\sin(\alpha/2) , \tag{7.50}$$
which is the constant magnitude of the center of mass velocity. Notice that ω
is not equal to 2π/T, the angular velocity of the cone’s axis in the y-z-plane.
ω describes the rotation of $\vec r_s$. Therefore
$$\frac{T}{2\pi/\omega} = \frac{R\cos(\alpha/2)}{r_s} = \frac{1}{\tan(\alpha/2)} . \tag{7.51}$$
Thus, V expressed in terms of T and the cone’s geometry parameters is given
by
$$V = \frac{3\pi}{2}\,\frac{h}{T}\cos(\alpha/2) . \tag{7.52}$$
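We now employ (7.4) and (7.9) for the calculation of the total kinetic energy
of the rolling cone. The first term in (7.4) essentially is taken care of by (7.52).
We therefore turn to $K_{rot}$ according to (7.9), i.e.
$$K_{rot} = \frac{1}{2}\,\vec\omega^{T}\cdot\mathbf I_s\cdot\vec\omega = \frac{1}{2}\,\vec\omega^{T}\cdot\mathbf D^{-1}\mathbf D\cdot\mathbf I_s\cdot\mathbf D^{-1}\mathbf D\cdot\vec\omega . \tag{7.53}$$
We may choose the rotation matrix, $\mathbf D$, so that $\mathbf D\cdot\mathbf I_s\cdot\mathbf D^{-1}$ is particularly simple
or so that this is true for $\mathbf D\cdot\vec\omega$. In the present case we decide to work in the
primed system, because in this frame of reference $\mathbf I_s$ is diagonal already, i.e.
$$\mathbf I'_s = \mathbf D\cdot\mathbf I_s\cdot\mathbf D^{-1} = \begin{pmatrix} I'_{s,1} & 0 & 0 \\ 0 & I'_{s,2} & 0 \\ 0 & 0 & I'_{s,3} \end{pmatrix} . \tag{7.54}$$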
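The tensor elements are $I'_{s,1} = I'_{s,2} = \frac{3}{80}\, m\left(4a^2 + h^2\right)$, calculated via (7.18) by
subtracting $m R^2$ according to Steiner’s theorem, and $I'_{s,3} = \frac{3}{10}\, m a^2$, according
to (7.16). At this point we need merely the components of $\vec\omega$ in the primed
reference frame, which follow via
$$\vec\omega' = \mathbf D\cdot\vec\omega = \begin{pmatrix} \cos(\alpha/2) & 0 & -\sin(\alpha/2) \\ 0 & 1 & 0 \\ \sin(\alpha/2) & 0 & \cos(\alpha/2) \end{pmatrix}\cdot\begin{pmatrix} 0 \\ 0 \\ \omega \end{pmatrix} = \omega\begin{pmatrix} -\sin(\alpha/2) \\ 0 \\ \cos(\alpha/2) \end{pmatrix} . \tag{7.55}$$
Straightforward algebra yields
$$K_{rot} = \frac{1}{2}\,\vec\omega'^{\,T}\cdot\mathbf I'_s\cdot\vec\omega' = \frac{3\pi^2 m h^2}{80\, T^2}\left(5\cos\alpha + 13\right) \tag{7.56}$$
and for the total kinetic energy
$$K = \frac{1}{2}\, m V^2 + \frac{1}{2}\,\vec\omega'^{\,T}\cdot\mathbf I'_s\cdot\vec\omega' = \frac{3\pi^2 m h^2}{20\, T^2}\left(5\cos\alpha + 7\right) . \tag{7.57}$$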
(b) We want to calculate the total angular momentum in the xyz-frame. As in
part (a), the primed system is more convenient to use. However, we translate
the origin of the primed frame (no rotation) into the origin of the xyz-frame. This
shifted reference frame we call the x″y″z″-frame. The sought-after total angular
momentum, with components $L_\beta = \sum_\alpha I_{\alpha\beta}\,\omega_\alpha$, is given by
$$\vec L = \mathbf I\cdot\vec\omega = \mathbf D^{-1}\mathbf D\cdot\mathbf I\cdot\mathbf D^{-1}\mathbf D\cdot\vec\omega = \mathbf D^{-1}\cdot\left(\mathbf I''\cdot\vec\omega''\right) . \tag{7.58}$$
The rotation matrix $\mathbf D$ is the same as above. According to (7.16) and (7.18)
$$\mathbf I'' = \mathbf D\cdot\mathbf I\cdot\mathbf D^{-1} = \begin{pmatrix} \frac{3}{20}\, m\left(a^2 + 4h^2\right) & 0 & 0 \\ 0 & \frac{3}{20}\, m\left(a^2 + 4h^2\right) & 0 \\ 0 & 0 & \frac{3}{10}\, m a^2 \end{pmatrix} . \tag{7.59}$$
Notice that the cone’s moment of inertia tensor in the new x″y″z″-reference
frame is diagonal. In addition, $\vec\omega'' = \vec\omega'$. Another straightforward calculation,
inserting (7.55) and (7.59) into (7.58), yields
$$\vec L = \frac{3\pi m h^2}{20\, T}\begin{pmatrix} -\left(5\cos\alpha + 3\right) \\ 0 \\ 5\sin\alpha + 2\tan\frac{\alpha}{2} \end{pmatrix} . \tag{7.60}$$
Finally we can check our result. Because K, the total rotation energy, does
not depend on the particular coordinate system, the following expression,
$$K = \frac{1}{2}\,\vec L\cdot\vec\omega , \tag{7.61}$$
must agree with the right hand side in (7.57). An explicit calculation shows
that this is indeed the case.
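The chain of steps in this solution can be followed numerically. The sketch below is an illustration under stated assumptions rather than part of the original text: it takes α as the full opening angle of the cone (so that the base radius is a = h tan(α/2)), uses arbitrary sample values, and the function name `rolling_cone` is hypothetical. It evaluates (7.50)–(7.59) step by step and compares with the closed forms (7.56), (7.57) and the check (7.61).

```python
import math


def rolling_cone(m, h, alpha, T):
    """Kinetic energies and lab-frame angular momentum of the rolling cone."""
    half = 0.5 * alpha
    s, c = math.sin(half), math.cos(half)
    a = h * math.tan(half)                          # base radius (assumption: alpha = full opening angle)
    omega = 2.0 * math.pi / (T * math.tan(half))    # from (7.51)
    V = 0.75 * omega * h * s                        # (7.50)
    w_body = (-omega * s, 0.0, omega * c)           # (7.55)
    I_perp_cm = 3.0 / 80.0 * m * (4.0 * a * a + h * h)
    I_axis = 3.0 / 10.0 * m * a * a
    K_rot = 0.5 * (I_perp_cm * w_body[0] ** 2 + I_axis * w_body[2] ** 2)
    K = 0.5 * m * V * V + K_rot
    I_perp_tip = 3.0 / 20.0 * m * (a * a + 4.0 * h * h)   # (7.59)
    L_body = (I_perp_tip * w_body[0], 0.0, I_axis * w_body[2])
    # rotate back to the laboratory frame with D^{-1} = D^T
    L_lab = (c * L_body[0] + s * L_body[2], 0.0, -s * L_body[0] + c * L_body[2])
    return K_rot, K, L_lab, omega


# Hypothetical sample values (not from the text)
m, h, alpha, T = 1.2, 0.8, math.radians(40.0), 3.0
K_rot, K, L_lab, omega = rolling_cone(m, h, alpha, T)

K_rot_closed = 3.0 * math.pi ** 2 * m * h * h / (80.0 * T * T) * (5.0 * math.cos(alpha) + 13.0)
K_closed = 3.0 * math.pi ** 2 * m * h * h / (20.0 * T * T) * (5.0 * math.cos(alpha) + 7.0)
K_from_L = 0.5 * L_lab[2] * omega   # (7.61) with omega = (0, 0, omega)
print(K_rot, K_rot_closed, K, K_closed, K_from_L)
```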
Lagrangian of the Rigid Body:
From (7.4) and (7.5) we obtain
$$\mathcal L = \frac{m \vec V^2}{2} + \frac{1}{2}\, I_{s,\alpha\beta}\,\omega_\alpha\omega_\beta - U \tag{7.62}$$
$$\phantom{\mathcal L} = \frac{m \vec V^2}{2} + \frac{1}{2}\,\vec L_s\cdot\vec\omega - U , \tag{7.63}$$
the Lagrangian of the rigid body. Notice, however, that the validity of this form of
the Lagrangian requires that at least one of the following conditions is satisfied: (i)
the origin of the body-fixed frame of reference coincides with the center of mass; (ii)
$\vec V = 0$ in a shifted frame of reference!
Remark: If in particular the axes of the coordinate system are the principal axes, then
$\vec L \parallel \vec\omega$ for the spherical top. This is true also when the axis of rotation coincides with
one of the principal axes.
• Problem 35 - Clock Pendulum: We consider a clock pendulum consisting
of a pendulum weight at one end of a massless rod of length l. The other end
of the rod coincides with the suspension point. The weight is a homogeneous
cylindrical disk of mass M. The radius of this disk is a.
Write down the pendulum’s equation of motion for small amplitudes based
on the Lagrangian of the pendulum. Calculate the relevant principal moment
of inertia of the pendulum and give an expression for the pendulum frequency
in terms of the quantities g, a, and l.
Solution: The pendulum’s kinetic energy is given by
$$K = \frac{1}{2}\,\vec L\cdot\vec\omega = \frac{1}{2}\left(M l^2 + I_s\right)\omega^2 \tag{7.64}$$
(cf. (7.25)). $I_s$ is the weight’s moment of inertia relative to the cylinder axis
and ω is the angular velocity of the rod. The potential energy is
$$U = M g l\,(1 - \cos\phi) , \tag{7.65}$$
where $\dot\phi \equiv \omega$. Notice that φ is the angle between the vertical and the pendulum.
Hence
$$\mathcal L = \frac{1}{2}\left(M l^2 + I_s\right)\dot\phi^2 - M g l\,(1 - \cos\phi) . \tag{7.66}$$
The equation of motion follows via $(d/dt)\,\partial\mathcal L/\partial\dot\phi - \partial\mathcal L/\partial\phi = 0$, i.e.
$$\ddot\phi \approx -\frac{M g l}{M l^2 + I_s}\,\phi \tag{7.67}$$
(with sin φ ≈ φ).

The cylinder’s moment of inertia with respect to its axis (through the center
of mass) is
$$I_s = \frac{M}{V}\int_V d\phi\, dz\, \rho\, d\rho\;\rho^2 = \frac{1}{2}\, M a^2 . \tag{7.68}$$
The quantity V is the cylinder volume, ρ is the perpendicular distance of the
volume element dV = dφ dz ρ dρ to the cylinder axis, and z is the position of
dV along this axis, which here is part of the z-axis.
Together with (7.67) we find for the frequency of the pendulum
$$\nu = \frac{1}{2\pi}\sqrt{\frac{g}{l}}\left(1 + \frac{a^2}{2 l^2}\right)^{-1/2} \tag{7.69}$$
(cf. the frequency of the mathematical pendulum in (2.55); notice: ω = 2πν
in (2.55) should not be confused with the magnitude of the time-dependent
angular velocity ω = ω(t) in (7.64)).
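As a small numerical illustration of (7.67)–(7.69), the sketch below evaluates the pendulum frequency once from the closed form (7.69) and once directly from the coefficient in the equation of motion (7.67). The function names and the sample values for g, l, a and M are assumptions made for this example only.

```python
import math


def pendulum_frequency(g, l, a):
    """Closed form (7.69): frequency of the clock pendulum with a disk of radius a."""
    return math.sqrt(g / l) / (2.0 * math.pi) / math.sqrt(1.0 + a * a / (2.0 * l * l))


def frequency_from_equation_of_motion(g, l, a, M):
    """Frequency from (7.67) with I_s = M a^2 / 2, cf. (7.68)."""
    I_s = 0.5 * M * a * a
    return math.sqrt(M * g * l / (M * l * l + I_s)) / (2.0 * math.pi)


# Hypothetical sample values (not from the text)
g, l, a, M = 9.81, 1.0, 0.1, 2.0
nu_closed = pendulum_frequency(g, l, a)
nu_direct = frequency_from_equation_of_motion(g, l, a, M)
nu_point_mass = math.sqrt(g / l) / (2.0 * math.pi)   # mathematical pendulum limit a -> 0
print(nu_closed, nu_direct, nu_point_mass)
```

The mass M cancels, and the finite disk radius lowers the frequency slightly compared with the mathematical pendulum.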
7.2 Equations of Motion for a Rigid Body
The position and orientation of a rigid body are characterized by six degrees of
freedom. We therefore expect to find six corresponding equations of motion. The
latter describe the change of the total momentum, $\vec P$, as well as the change of the
total angular momentum, $\vec L$, with time.
$\vec P$: From
$$\vec P = \sum_i \vec p_i$$
follows
$$\dot{\vec P} = \frac{d}{dt}\sum_i \vec p_i = \sum_i \vec f_i .$$
Hence
$$\dot{\vec P} = \vec F . \tag{7.70}$$
The quantity $\vec F$ is the sum over all forces $\vec f_i$ acting on all the mass elements i in the
rigid body. This also includes internal forces (e.g. bonds). The change of the potential
energy due to a uniform translation $\delta\vec R$ is given by
$$\delta U = \sum_i \frac{\partial U}{\partial\vec r_i}\cdot\delta\vec R = -\delta\vec R\cdot\vec F ,$$
i.e.
$$\vec F = -\frac{\partial U}{\partial\vec R} .$$

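$\vec L$: We have
$$\dot{\vec L} = \frac{d}{dt}\sum_i \vec r_i\times\vec p_i = \underbrace{\sum_i \dot{\vec r}_i\times\vec p_i}_{=0} + \sum_i \vec r_i\times\dot{\vec p}_i = \sum_i \vec r_i\times\vec f_i = \sum_i \vec N_i .$$
Hence
$$\dot{\vec L} = \vec N . \tag{7.71}$$
The quantity $\vec N$ is the total torque and $\vec N_i$ is the torque on i only.
Note that both the sum over the internal forces and the sum over the attendant
torques are zero. This follows from $\dot{\vec P} = 0$ and $\dot{\vec L} = 0$, in the absence of external
forces.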
An interesting special case is that of a uniform force field. Here we use the
gravitational field as an example. The total force on a body is given by $\vec F = \vec g\sum_i m_i$,
where the $m_i$ are the masses of the above mass elements (or volume elements). The
total torque is given by $\vec N = \sum_i m_i\,\vec r_i\times\vec g$. Using $\vec R = \sum_i m_i\,\vec r_i/\sum_i m_i$ we find
$$\vec N = \vec R\times\vec F ,$$
i.e. the total force acting on the body in principle acts at a single point $\vec R$,
which here is the center of mass.⁴
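Remark: We may obtain (7.71) via
$$\frac{d}{dt}\frac{\partial\mathcal L}{\partial\vec\omega} = \frac{\partial\mathcal L}{\partial\vec\phi} .$$
The quantities $\vec\phi$ and $\vec\omega = \dot{\vec\phi}$ are the instantaneous generalized coordinates and velocities.
This really means that $\vec\phi$ corresponds to a small angle of rotation $\delta\vec\phi$, oriented
along the instantaneous axis of rotation.
For $\partial\mathcal L/\partial\vec\omega$ we find according to (7.63)
$$\frac{\partial\mathcal L}{\partial\omega_\alpha} = I_{\alpha\beta}\,\omega_\beta = L_\alpha .$$
In addition
$$\delta U = -\sum_i \vec f_i\cdot\delta\vec r_i = -\sum_i \vec f_i\cdot\left(\delta\vec\phi\times\vec r_i\right) = -\delta\vec\phi\cdot\sum_i \vec r_i\times\vec f_i = -\vec N\cdot\delta\vec\phi$$
or
$$\vec N = -\frac{\partial U}{\partial\vec\phi} . \tag{7.72}$$
We want to study two examples.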
• Example - Baseball: A (rod-shaped) bat of length l, having the mass m, is
hit by a ball perpendicular to its axis. The result is an instantaneous rotation
of the bat with respect to a point along its axis, whose distance from the point
of impact is d (cf. the sketch; arrow: point of impact; open circle: center of
rotation; solid circle: center of mass). Calculate d expressed via the quantities
r and l. What is the value of r, when the center of rotation coincides with the
end of the bat? Remark: This is when the hitter doesn’t feel the bat’s recoil.
⁴ In the case of an electric field the center of mass is replaced by the center of charge, i.e. $m_i$ is
replaced by $q_i$, the corresponding charge.
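Using both (7.70) and (7.71) yields
$$m\,(d - r)\,\dot\omega = f$$
and
$$\left(m\,(d - r)^2 + I_s\right)\dot\omega = d\, f .$$
Here r is the distance between the point of impact and the center of mass, so that
the center of mass is located a distance d − r from the center of rotation. $\vec P$ in (7.70)
is the center of mass momentum of the bat. $I_s = \frac{1}{12}\, m l^2$ is
the moment of inertia of the bat with respect to a perpendicular axis through
its center of mass. Note that we make use of Steiner’s theorem in the second
equation.
Dividing both equations yields
$$m\,(d - r)^2 + I_s = m\, d\,(d - r) ,$$
i.e.
$$d = r + \frac{1}{12}\,\frac{l^2}{r} .$$
Notice that if d = r + l/2, i.e. when r = l/6, there is no recoil.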
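The result can be checked by inserting it back into the two equations of motion. The sketch below is an illustration with hypothetical function names and arbitrary sample values; it assumes, as in the equations above, that r is the distance between the point of impact and the center of mass of a thin uniform rod.

```python
def rotation_center_distance(r, l):
    """d = r + l^2 / (12 r): distance from the point of impact to the center of rotation."""
    return r + l * l / (12.0 * r)


def residual_of_torque_equation(m, l, r, f):
    """Insert d into both equations of motion and return the mismatch of the second one."""
    d = rotation_center_distance(r, l)
    I_s = m * l * l / 12.0
    omega_dot = f / (m * (d - r))                 # from m (d - r) omega_dot = f
    return (m * (d - r) ** 2 + I_s) * omega_dot - d * f


# Hypothetical sample values (not from the text)
m, l, f = 0.9, 1.0, 50.0
r_sweet = l / 6.0
d_sweet = rotation_center_distance(r_sweet, l)
print(d_sweet, r_sweet + 0.5 * l)                 # center of rotation at the end of the bat
print(residual_of_torque_equation(m, l, 0.25, f))
```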
• Problem 36 - The Falling Chimney: An explosive charge is detonated near

the base of a brick chimney, which then starts falling as shown in the sequence
of photographs. In the second picture from the left the chimney shows a crack
at some distance above its base, even though there is no apparent damage
caused by the explosion this high up along the chimney.
Try to explain this phenomenon based on the laws governing the motion of
rigid bodies. Estimate the height above the base, where the chimney starts
to break up. Assume that the diameter of the chimney is constant and small
compared to its height, L.
Solution: In the following sketch a mass element, δm, is located at the position
r along the chimney. The equation of motion of the mass element expressed
in terms of α, the instantaneous angle between the vertical axis and r, is
$$\delta m\, r\,\ddot\alpha = \delta m\, g\sin\alpha + \delta f(r) . \tag{7.73}$$
The first term on the right hand side is the gravitational force component
perpendicular to the chimney. The unknown force contribution δ f (r ) arises
due to the mortar, binding the mass element to the rest of the chimney.
Let us assume that there is no ‘rest of the chimney’. Instead there is just the
mass element supported on a massless thin rod at the same position r. In this
case α(t) follows via (7.73) with δ f (r ) = 0, which yields α̈ = (g/r ) sin α.
This shows that the angular acceleration of the falling rod does depend on the
distance r of the mass element from its base. The higher up along the rod the
mass element is located, the smaller α̈ becomes. Carrying this over to the real
chimney, we conclude that the upper part of the chimney tends to slow down
the lower part due to the mortar between the bricks - unless the attendant torque
causes the chimney to break.
We consider an arbitrary point A along the chimney. Every mass element
above it contributes a certain torque $\delta N_A(r)$ with respect to a (rotation) axis
through A. $\delta N_A(r)$ is given by
$$\delta N_A(r) = (r - r_A)\,\delta f = (r - r_A)\left(\delta m\, r\,\ddot\alpha - \delta m\, g\sin\alpha\right) . \tag{7.74}$$
Here $r_A$ is the distance from A to the base. Notice that by using (7.73) we have
eliminated the unknown force δf(r). The total torque with respect to A is the
sum over all $\delta N_A(r)$ for which $r > r_A$, i.e.
$$N_A = \frac{m}{L}\int_{r_A}^{L} dr\,(r - r_A)\left(r\,\ddot\alpha - g\sin\alpha\right) , \tag{7.75}$$
where δm is replaced by (m/L)δr. The quantity m is the total mass of the
chimney.
We find the unknown angular acceleration $\ddot\alpha$ via $\dot{\vec L} = \vec N$. Here $\vec L$ and $\vec N$
are the total angular momentum and the total torque relative to the base of the
chimney, respectively. Thus
$$I\,\ddot\alpha = m g\,\frac{L}{2}\sin\alpha . \tag{7.76}$$
The quantity $I = m L^2/3$ is the total moment of inertia of a (very) thin chimney
relative to its base. This means that we can replace $\ddot\alpha$ in (7.75) by known
quantities. An easy calculation yields
$$N_A = \frac{m}{L}\, g\sin\alpha\int_{r_A}^{L} dr\left(\frac{3r}{2L} - 1\right)(r - r_A) = \frac{m g\sin\alpha}{4}\left(1 - \frac{r_A}{L}\right)^2 r_A . \tag{7.77}$$
This shows that the torque $N_A$ increases with increasing α. $N_A$ also has a
maximum when A is at $r_A^{max}$. From $dN_A(r_A)/dr_A|_{r_A^{max}} = 0$ we obtain
$$r_A^{max} = \frac{L}{3} . \tag{7.78}$$
If the chimney begins to break due to the torque(s) $N_A$, then this is likely to
happen at 1/3 of its height. Of course, we have used simplifying assumptions
regarding the shape of the chimney, which basically has been reduced to a thin
rod. However, the comparison with the above series of pictures confirms that
our result is reasonable even for a real chimney.
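The torque (7.75) and the position of its maximum can be reproduced numerically. The following sketch is an illustration, not part of the original text: it integrates (7.75) with a midpoint rule, using the angular acceleration from (7.76), and then scans candidate positions $r_A$ on a uniform grid. Function names, grid sizes and the sample values are assumptions.

```python
import math


def torque_numeric(r_a, length, mass, g, alpha, n=400):
    """Midpoint-rule evaluation of (7.75) with the angular acceleration from (7.76)."""
    alpha_dd = 1.5 * g * math.sin(alpha) / length
    dr = (length - r_a) / n
    total = 0.0
    for i in range(n):
        r = r_a + (i + 0.5) * dr
        total += (r - r_a) * (r * alpha_dd - g * math.sin(alpha))
    return mass / length * total * dr


def torque_closed(r_a, length, mass, g, alpha):
    """Closed form (7.77)."""
    return 0.25 * mass * g * math.sin(alpha) * (1.0 - r_a / length) ** 2 * r_a


# Hypothetical sample values (not from the text)
L_ch, m_ch, g, alpha = 30.0, 5.0e4, 9.81, math.radians(20.0)
grid = [k * L_ch / 300.0 for k in range(301)]
best = max(grid, key=lambda r_a: torque_numeric(r_a, L_ch, m_ch, g, alpha))
print(best, L_ch / 3.0)
print(torque_numeric(10.0, L_ch, m_ch, g, alpha), torque_closed(10.0, L_ch, m_ch, g, alpha))
```

The scan only locates the maximum to within the grid spacing; a finer grid or a root search on the derivative would sharpen it.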
• Example - Rolling without Sliding: A uniform sphere of mass m is rolling
without sliding on a plane. The motion is due to the torque $\vec N$ and the external
force $\vec F$ (cf. the sketch). What do the equations of motion look like?
From (7.70) follows
$$\frac{d\vec P}{dt} = m\,\frac{d\vec V}{dt} = \vec F + \vec R . \tag{7.79}$$
And from (7.71) in conjunction with (7.63) follows
$$\frac{d\vec L}{dt} = \mathbf I\cdot\dot{\vec\omega} = \vec N + \vec r\times\vec R . \tag{7.80}$$
The quantity $\vec R$ is the force due to the plane, acting at the point of contact. In
addition $\mathbf I$ is diagonal with $I_1 = I_2 = I_3$ and thus $\mathbf I\cdot\dot{\vec\omega} = I_1\,\dot{\vec\omega}$.
Mathematically rolling without sliding means
$$\vec V - \vec r\times\vec\omega = 0 , \tag{7.81}$$
where $\vec V$ is the velocity of the sphere’s center. In the case of sliding motion
$$\vec V - \vec r\times\vec\omega = \vec v_{slip} .$$
Using $\vec r = -\vec n\, a$ we find $\dot{\vec V} - \vec r\times\dot{\vec\omega} = 0$ from (7.81). Inserting this into
(7.79), the resulting equation in conjunction with (7.80) yields
$$\frac{I_1}{a m}\left(\vec F + \vec R\right) = \vec N\times\vec n - a\,\vec R + a\,\vec n\left(\vec n\cdot\vec R\right) .$$
We can now determine the components of $\vec R$ and insert them into (7.79). This
immediately yields the sought-after equations of motion for the center of mass,
i.e. $dV_x/dt = \ldots$ and $dV_y/dt = \ldots$, where the x-y-plane is the plane on which
the sphere moves. The equations of motion of the angular velocity components
follow via (7.81) ($d\omega_x/dt$ and $d\omega_y/dt$) as well as from (7.80) ($d\omega_z/dt$).
• Problem 37 - A Linear Molecule: We consider a linear (model) molecule
for which $\vec N = \sum_\nu \vec r_\nu\times\vec f_\nu = \vec s\times\sum_\nu d_\nu\,\vec f_\nu = \vec s\times\vec G$. Here $\vec s$ is a unit vector
parallel to the molecular axis and $d_\nu$ is the distance from the νth interaction
site (e.g. an atom), subject to the force $\vec f_\nu$, to the molecule’s center of mass.
Show that $\ddot{\vec s}$ is given by
$$\ddot{\vec s} = \tilde{\vec G} - \vec s\left(\vec s\cdot\tilde{\vec G} + \dot{\vec s}^{\,2}\right) , \tag{7.82}$$
where $\tilde{\vec G} = I^{-1}\vec G$. I is one of the two identical, non-vanishing principal
moments of inertia of the molecule.
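Solution: We start from
$$\dot{\vec s} \overset{(1.15)}{=} \vec\omega\times\vec s . \tag{7.83}$$
In terms of the individual components this becomes (cf. Appendix A (Cross
Product))
$$\dot s_i = \epsilon_{ijk}\,\omega_j s_k . \tag{7.84}$$
Here and in the following we use the summation convention. Differentiating
again with respect to time yields
$$\ddot s_i = \epsilon_{ijk}\Big(\dot\omega_j s_k + \omega_j\underbrace{\dot s_k}_{=\epsilon_{klm}\omega_l s_m}\Big) = \epsilon_{ijk}\,\dot\omega_j s_k + \epsilon_{ijk}\,\epsilon_{klm}\,\omega_j\omega_l s_m . \tag{7.85}$$
We now apply the identity $\epsilon_{ijk}\,\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$ (cf. Appendix A):
$$\ddot s_i = \epsilon_{ijk}\,\dot\omega_j s_k + \left(\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}\right)\omega_j\omega_l s_m = \epsilon_{ijk}\,\dot\omega_j s_k + \omega_i\left(\vec\omega\cdot\vec s\right) - s_i\,\vec\omega^{\,2} . \tag{7.86}$$
Taking the square of (7.83) and using $\vec s^{\,2} = 1$ we find $\dot{\vec s}^{\,2} + (\vec\omega\cdot\vec s)^2 = \vec\omega^{\,2}$ and
thus
$$\ddot s_i = \epsilon_{ijk}\,\dot\omega_j s_k - s_i\,\dot{\vec s}^{\,2} - s_i\left(\vec\omega\cdot\vec s\right)^2 + \omega_i\left(\vec\omega\cdot\vec s\right) . \tag{7.87}$$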
This equation does not refer to any particular coordinate system. Now,
however, we consider the coordinate system defined by the principal moments
of inertia. We require $\vec s$ to be parallel to the z-axis. Because there is no torque
which causes the molecule to rotate with respect to this axis ($\vec N \perp \vec s$), we
can set the z-component of the angular velocity, $\vec\omega$, equal to zero within this
coordinate system. Thus $\vec\omega\cdot\vec s = 0$. In addition, we have $L_x = I\omega_x$ and
$L_y = I\omega_y$ or $N_x = \dot L_x = I\dot\omega_x$ and $N_y = \dot L_y = I\dot\omega_y$. Because $N_z = 0$,
we may write $N_z = I\dot\omega_z$, even though I is not the moment of inertia for a
rotation with respect to the z-axis. Notice that I is the principal moment of
inertia relative to the x- as well as the y-axis. This means that (7.87) becomes
$$\ddot s_i = \epsilon_{ijk}\, I^{-1} N_j s_k - s_i\,\dot{\vec s}^{\,2} . \tag{7.88}$$
Using $N_j = \epsilon_{jlm}\, s_l G_m$ and after another application of the above identity we
obtain the desired result, i.e. (7.82).
Remark: The stepwise numerical solution of (7.82), at every integration step,
yields the new orientation of the linear molecule relative to the instantaneous (!)
coordinate system defined by the principal axes before the step. This requires
that the components of $\vec G$ in this coordinate system must be known. In the
following we shall study the equations of motion for rigid bodies, i.e. the
motion of rigid bodies under the influence of given forces, systematically.
Euler’s Equations:
Again we consider (7.70) and (7.71), which describe how $\vec{P}$ and $\vec{L}$ change
with time in their coordinate system. Let's calculate the respective time derivatives,
$d'/dt$, within a body-fixed coordinate system, which is defined by the principal axes
of the rigid body of interest. According to (4.48) we have
\[ \frac{d\vec{P}}{dt} = \frac{d'\vec{P}}{dt} + \vec{\omega} \times \vec{P} = \vec{F} \tag{7.89} \]
and
\[ \frac{d\vec{L}}{dt} = \frac{d'\vec{L}}{dt} + \vec{\omega} \times \vec{L} = \vec{N} . \tag{7.90} \]
Now we express the same equations in terms of their components:
\[ m \left( \frac{dV_1}{dt} + \omega_2 V_3 - \omega_3 V_2 \right) = F_1 \tag{7.91} \]
\[ m \left( \frac{dV_2}{dt} + \omega_3 V_1 - \omega_1 V_3 \right) = F_2 \tag{7.92} \]
\[ m \left( \frac{dV_3}{dt} + \omega_1 V_2 - \omega_2 V_1 \right) = F_3 . \tag{7.93} \]
We may omit the prime as long as we keep in mind that we work entirely within
the rotating frame! Notice that m is the total mass ($m\vec{V} = \vec{P}$). Within the rotating
frame the following holds: $L_1 = I_1\omega_1$ etc. and, according to (7.90), we find
\[ I_1 \frac{d\omega_1}{dt} + (I_3 - I_2)\, \omega_2 \omega_3 = N_1 \tag{7.94} \]
\[ I_2 \frac{d\omega_2}{dt} + (I_1 - I_3)\, \omega_3 \omega_1 = N_2 \tag{7.95} \]
\[ I_3 \frac{d\omega_3}{dt} + (I_2 - I_1)\, \omega_1 \omega_2 = N_3 . \tag{7.96} \]
Equations (7.91)–(7.96) are Euler’s equations.
As an example we consider Euler’s equations applied to the free symmetric top.
From $\vec{F} = \vec{N} = 0$ and $I_1 = I_2$ follows $d\omega_3/dt = 0$, i.e. $\omega_3 = \mathrm{const}$. Therefore
\[ \dot{\omega}_1 = -\omega\, \omega_2 , \qquad \dot{\omega}_2 = \omega\, \omega_1 \]
with
\[ \omega = \frac{I_3 - I_1}{I_1}\, \omega_3 . \tag{7.97} \]
The solution of this system is
\[ \omega_1 = c \cos(\omega t) , \qquad \omega_2 = c \sin(\omega t) . \]
The projection of the angular velocity onto the 1-2-plane, defined by the attendant
principal axes, therefore also rotates with the angular velocity ω.
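The closed-form solution above can be checked numerically. The following is only a minimal sketch: the function name `integrate_free_symmetric_top`, the Runge-Kutta scheme, and the numerical values are illustrative assumptions and do not appear in the text.

```python
import math

def integrate_free_symmetric_top(i1, i3, w0, dt, n_steps):
    """Integrate (7.94)-(7.96) for N = 0 and I1 = I2 with a classical RK4 scheme."""
    def rhs(w):
        w1, w2, w3 = w
        return (
            -(i3 - i1) * w2 * w3 / i1,  # (7.94) with I2 = I1 and N1 = 0
            -(i1 - i3) * w3 * w1 / i1,  # (7.95) with I2 = I1 and N2 = 0
            0.0,                        # (7.96) with I1 = I2 and N3 = 0
        )

    def shifted(w, k, factor):
        return tuple(w[i] + factor * k[i] for i in range(3))

    w = tuple(w0)
    for _ in range(n_steps):
        k1 = rhs(w)
        k2 = rhs(shifted(w, k1, 0.5 * dt))
        k3 = rhs(shifted(w, k2, 0.5 * dt))
        k4 = rhs(shifted(w, k3, dt))
        w = tuple(
            w[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
            for i in range(3)
        )
    return w

# Illustrative values (assumed): I1 = 1, I3 = 0.5, omega(0) = (0.3, 0, 2).
i1, i3 = 1.0, 0.5
w_final = integrate_free_symmetric_top(i1, i3, (0.3, 0.0, 2.0), 1e-3, 1000)
omega = (i3 - i1) / i1 * 2.0  # (7.97)
print(w_final[0], 0.3 * math.cos(omega * 1.0))
print(w_final[1], 0.3 * math.sin(omega * 1.0))
```

With $\omega_2(0) = 0$ the constant c equals $\omega_1(0)$, so the two printed pairs should agree closely.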
Euler Angles:
Aside from the position of the center of mass of a rigid body in the laboratory
frame, we need to describe its orientation. Here we introduce the Euler angles for this
purpose. Figure 7.4 shows two coordinate systems possessing a common origin. The
unprimed system (x, y, z) is the laboratory frame, while the primed system (x', y', z')
rotates (with the rigid body). The angles ϕ, θ, and ψ are the Euler angles:
\[ 0 \le \varphi \le 2\pi , \qquad 0 \le \theta \le \pi , \qquad 0 \le \psi \le 2\pi . \tag{7.98} \]
The transition from the unprimed to the primed frame is defined by the following
rotations in their respective order: (i) rotation with respect to the z-axis by the angle
ϕ; (ii) rotation with respect to the new (intermediate) x-axis by the angle θ; (iii)
rotation with respect to the z'-axis by the angle ψ. Mathematically these rotations
are described via the following matrices:
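\[ D_\varphi = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{7.99} \]
\[ D_\theta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{7.100} \]
\[ D_\psi = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{7.101} \]
[Fig. 7.4 Euler angles: sketch showing the laboratory axes x and z, the rotated axes x', y', and z', and the intermediate x-axis.]
The vector $\vec{A}$ in the laboratory is transformed into the vector $\vec{A}'$ in the rotated frame
via
\[ \vec{A}' = D \cdot \vec{A} \tag{7.102} \]
where
\[ D = D_\psi \cdot D_\theta \cdot D_\varphi , \tag{7.103} \]
i.e.
\[ D = \begin{pmatrix}
\cos\varphi\cos\psi - \sin\varphi\cos\theta\sin\psi & \sin\varphi\cos\psi + \cos\varphi\cos\theta\sin\psi & \sin\theta\sin\psi \\
-\sin\varphi\cos\theta\cos\psi - \cos\varphi\sin\psi & \cos\varphi\cos\theta\cos\psi - \sin\varphi\sin\psi & \sin\theta\cos\psi \\
\sin\varphi\sin\theta & -\cos\varphi\sin\theta & \cos\theta
\end{pmatrix} . \]
The inverse transformation is given by
\[ \vec{A} = D^{-1} \cdot \vec{A}' , \tag{7.104} \]
where
\[ D^{-1} = \begin{pmatrix}
\cos\varphi\cos\psi - \sin\varphi\cos\theta\sin\psi & -\sin\varphi\cos\theta\cos\psi - \cos\varphi\sin\psi & \sin\varphi\sin\theta \\
\sin\varphi\cos\psi + \cos\varphi\cos\theta\sin\psi & \cos\varphi\cos\theta\cos\psi - \sin\varphi\sin\psi & -\cos\varphi\sin\theta \\
\sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta
\end{pmatrix} . \]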
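As an example we calculate the components of the angular velocity,
$\vec{\omega} = \dot{\vec{\varphi}} + \dot{\vec{\theta}} + \dot{\vec{\psi}}$, in the rotating frame expressed in terms of
$\dot{\varphi}$, $\dot{\theta}$, and $\dot{\psi}$.
The orientation of these vectors we can infer from the above sketch:
In the case of $\dot{\vec{\varphi}}$ we have
\[ \dot{\varphi}_x = \dot{\varphi} \sin\theta \sin\psi , \qquad \dot{\varphi}_y = \dot{\varphi} \sin\theta \cos\psi , \qquad \dot{\varphi}_z = \dot{\varphi} \cos\theta . \]
$\dot{\vec{\theta}}$ follows via
\[ \dot{\theta}_x = \dot{\theta} \cos\psi , \qquad \dot{\theta}_y = -\dot{\theta} \sin\psi , \qquad \dot{\theta}_z = 0 . \]
$\dot{\vec{\psi}}$ is given by
\[ \dot{\psi}_x = 0 , \qquad \dot{\psi}_y = 0 , \qquad \dot{\psi}_z = \dot{\psi} . \]
Adding these components we obtain
\[ \omega_1 = \dot{\varphi} \sin\theta \sin\psi + \dot{\theta} \cos\psi \tag{7.105} \]
\[ \omega_2 = \dot{\varphi} \sin\theta \cos\psi - \dot{\theta} \sin\psi \tag{7.106} \]
\[ \omega_3 = \dot{\varphi} \cos\theta + \dot{\psi} , \tag{7.107} \]
where $\omega_1 \equiv \omega_x$, $\omega_2 \equiv \omega_y$, and $\omega_3 \equiv \omega_z$.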
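The rotation matrix (7.103) and the angular velocity components (7.105)–(7.107) can be cross-checked numerically. The sketch below assumes NumPy; the helper names and the finite-difference check (which uses the fact that $\dot{D} D^T$ is the antisymmetric matrix built from the body-frame components of $-\vec{\omega}$) are illustrative additions, not part of the text.

```python
import numpy as np

def rotation_matrix(phi, theta, psi):
    """D = D_psi . D_theta . D_phi, cf. (7.99)-(7.103)."""
    d_phi = np.array([[np.cos(phi), np.sin(phi), 0.0],
                      [-np.sin(phi), np.cos(phi), 0.0],
                      [0.0, 0.0, 1.0]])
    d_theta = np.array([[1.0, 0.0, 0.0],
                        [0.0, np.cos(theta), np.sin(theta)],
                        [0.0, -np.sin(theta), np.cos(theta)]])
    d_psi = np.array([[np.cos(psi), np.sin(psi), 0.0],
                      [-np.sin(psi), np.cos(psi), 0.0],
                      [0.0, 0.0, 1.0]])
    return d_psi @ d_theta @ d_phi

def omega_body(theta, psi, dphi, dtheta, dpsi):
    """Body-frame angular velocity from (7.105)-(7.107)."""
    return np.array([
        dphi * np.sin(theta) * np.sin(psi) + dtheta * np.cos(psi),
        dphi * np.sin(theta) * np.cos(psi) - dtheta * np.sin(psi),
        dphi * np.cos(theta) + dpsi,
    ])

def omega_from_matrix(angles, rates, eps=1e-6):
    """Central-difference estimate of omega from (dD/dt) . D^T."""
    angles = np.asarray(angles, dtype=float)
    rates = np.asarray(rates, dtype=float)
    d_dot = (rotation_matrix(*(angles + eps * rates))
             - rotation_matrix(*(angles - eps * rates))) / (2.0 * eps)
    m = d_dot @ rotation_matrix(*angles).T
    return np.array([m[1, 2], m[2, 0], m[0, 1]])

angles = (0.7, 1.1, -0.4)   # (phi, theta, psi), arbitrary test values
rates = (0.3, -0.2, 0.5)    # (dphi/dt, dtheta/dt, dpsi/dt), arbitrary test values
print(omega_body(angles[1], angles[2], *rates))
print(omega_from_matrix(angles, rates))
```

Both printed vectors should agree to within the finite-difference error.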

Remark: We want to apply the Euler angles to our previous results, (7.20) and (7.97),
for the free symmetric top. We choose the z-axis of the laboratory frame (cf. Fig. 7.4)
to be parallel to the constant angular momentum vector (cf. Fig. 7.2). Then we choose
ψ = π/2 according to Fig. 7.2. Using (7.105)–(7.107) we obtain
\[ L_1 = I_1 \omega_1 = I_1 \dot{\varphi} \sin\theta \]
\[ L_2 = I_2 \omega_2 = -I_1 \dot{\theta} \]
\[ L_3 = I_3 \omega_3 = I_3 (\dot{\varphi} \cos\theta + \dot{\psi}) . \]
On the other hand we have also
\[ L_1 = L \sin\theta \]
\[ L_2 = 0 \]
\[ L_3 = L \cos\theta . \]
Equating the right hand sides in these systems yields
\[ I_1 \dot{\varphi} = L \]
\[ \dot{\theta} = 0 \]
\[ I_3 (\dot{\varphi} \cos\theta + \dot{\psi}) = L \cos\theta . \]
With $\omega_{Pr} = \dot{\varphi}$ the first equation corresponds to (7.20). The second equation yields
the information that the tilt angle of the top in Fig. 7.2 is constant. The third equation
describes the angular velocity of the top’s rotation with respect to its own axis. In
addition we may transform the third equation into
\[ -\dot{\psi} = -L \cos\theta \left( \frac{1}{I_3} - \frac{1}{I_1} \right) = \frac{I_3 - I_1}{I_1}\, \omega_3 . \]
Notice that $\omega = -\dot{\psi}$ yields (7.97)!
• Advanced Example - Quaternions: For our understanding of the above
transformations between laboratory and rotating frame the Euler angles are
preferable. But for computational purposes quaternions are superior. The
definitions of the quaternions, $q_0$, $q_1$, $q_2$, and $q_3$, in terms of the Euler angles
are
\[ q_0 = \cos\frac{\theta}{2} \cos\frac{\varphi + \psi}{2} \tag{7.108} \]
\[ q_1 = \sin\frac{\theta}{2} \cos\frac{\varphi - \psi}{2} \tag{7.109} \]
\[ q_2 = \sin\frac{\theta}{2} \sin\frac{\varphi - \psi}{2} \tag{7.110} \]
\[ q_3 = \cos\frac{\theta}{2} \sin\frac{\varphi + \psi}{2} . \tag{7.111} \]
The $q_i$ satisfy
\[ \sum_{i=0}^{3} q_i^2 = 1 . \tag{7.112} \]
The matrices $D$ and $D^{-1}$, needed in the transformations (7.102) and (7.104),
can be expressed in terms of the quaternions. With a little work we find
\[ D = \begin{pmatrix}
q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1 q_2 + q_0 q_3) & 2(q_1 q_3 - q_0 q_2) \\
2(q_1 q_2 - q_0 q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_0 q_1 + q_2 q_3) \\
2(q_0 q_2 + q_1 q_3) & 2(q_2 q_3 - q_0 q_1) & q_0^2 - q_1^2 - q_2^2 + q_3^2
\end{pmatrix} \]
as well as
\[ D^{-1} = \begin{pmatrix}
q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1 q_2 - q_0 q_3) & 2(q_0 q_2 + q_1 q_3) \\
2(q_1 q_2 + q_0 q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_2 q_3 - q_0 q_1) \\
2(q_1 q_3 - q_0 q_2) & 2(q_0 q_1 + q_2 q_3) & q_0^2 - q_1^2 - q_2^2 + q_3^2
\end{pmatrix} . \]
In addition one can show
\[ \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \\ 0 \end{pmatrix}
   = W \cdot \begin{pmatrix} \dot{q}_1 \\ \dot{q}_2 \\ \dot{q}_3 \\ \dot{q}_0 \end{pmatrix} , \tag{7.113} \]
where
\[ W = 2 \begin{pmatrix}
q_0 & q_3 & -q_2 & -q_1 \\
-q_3 & q_0 & q_1 & -q_2 \\
q_2 & -q_1 & q_0 & -q_3 \\
q_1 & q_2 & q_3 & q_0
\end{pmatrix} . \tag{7.114} \]
The first three lines in (7.113) correspond to (7.105)–(7.107) in quaternion
representation. The last line in (7.113) is the derivative of the normalization
relation (7.112).
In the following we shall need the inverse form of (7.113), i.e.
\[ \begin{pmatrix} \dot{q}_1 \\ \dot{q}_2 \\ \dot{q}_3 \\ \dot{q}_0 \end{pmatrix}
   = W^{-1} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \\ 0 \end{pmatrix} \tag{7.115} \]
with
\[ W^{-1} = \frac{1}{2} \begin{pmatrix}
q_0 & -q_3 & q_2 & q_1 \\
q_3 & q_0 & -q_1 & q_2 \\
-q_2 & q_1 & q_0 & q_3 \\
-q_1 & -q_2 & -q_3 & q_0
\end{pmatrix} . \tag{7.116} \]
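The relations (7.108)–(7.116) lend themselves to a quick numerical consistency check. The following sketch assumes NumPy; all function names are hypothetical and chosen here for illustration only.

```python
import numpy as np

def euler_to_quaternion(phi, theta, psi):
    """(q0, q1, q2, q3) from the Euler angles, cf. (7.108)-(7.111)."""
    return (np.cos(theta / 2) * np.cos((phi + psi) / 2),
            np.sin(theta / 2) * np.cos((phi - psi) / 2),
            np.sin(theta / 2) * np.sin((phi - psi) / 2),
            np.cos(theta / 2) * np.sin((phi + psi) / 2))

def rotation_from_quaternion(q0, q1, q2, q3):
    """Matrix D expressed through the quaternions (display following (7.112))."""
    return np.array([
        [q0**2 + q1**2 - q2**2 - q3**2, 2 * (q1 * q2 + q0 * q3), 2 * (q1 * q3 - q0 * q2)],
        [2 * (q1 * q2 - q0 * q3), q0**2 - q1**2 + q2**2 - q3**2, 2 * (q0 * q1 + q2 * q3)],
        [2 * (q0 * q2 + q1 * q3), 2 * (q2 * q3 - q0 * q1), q0**2 - q1**2 - q2**2 + q3**2],
    ])

def rotation_from_euler(phi, theta, psi):
    """Matrix D = D_psi . D_theta . D_phi, cf. (7.103)."""
    def rot_z(a):
        return np.array([[np.cos(a), np.sin(a), 0.0],
                         [-np.sin(a), np.cos(a), 0.0],
                         [0.0, 0.0, 1.0]])
    rot_x = np.array([[1.0, 0.0, 0.0],
                      [0.0, np.cos(theta), np.sin(theta)],
                      [0.0, -np.sin(theta), np.cos(theta)]])
    return rot_z(psi) @ rot_x @ rot_z(phi)

def w_matrix(q0, q1, q2, q3):
    """W from (7.114)."""
    return 2.0 * np.array([[q0, q3, -q2, -q1],
                           [-q3, q0, q1, -q2],
                           [q2, -q1, q0, -q3],
                           [q1, q2, q3, q0]])

def w_inverse(q0, q1, q2, q3):
    """W^{-1} from (7.116)."""
    return 0.5 * np.array([[q0, -q3, q2, q1],
                           [q3, q0, -q1, q2],
                           [-q2, q1, q0, q3],
                           [-q1, -q2, -q3, q0]])

q = euler_to_quaternion(0.7, 1.1, -0.4)  # arbitrary test angles
print(sum(qi**2 for qi in q))                                                 # (7.112)
print(np.abs(rotation_from_quaternion(*q) - rotation_from_euler(0.7, 1.1, -0.4)).max())
print(np.abs(w_matrix(*q) @ w_inverse(*q) - np.eye(4)).max())
```

The first number should be one, the other two should vanish up to rounding.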
We practice using quaternions by applying them to the motion of the symmetric
top depicted below, which moves under the influence of the gravitational
field $\vec{g} = (0, 0, -g)$. We want to calculate the path of the center of mass of
the top, whose position is described by the vector $\vec{R}$.
The coordinate system in the sketch is the laboratory reference frame. The
rotating frame, moving with the top, is not shown explicitly. However, the
z'-axis of this frame is parallel to $\vec{R}$. This means that in the primed reference
frame the center of mass is at
\[ \vec{r}\,' = \left( 0, 0, \tfrac{3}{4} h \right) . \tag{7.117} \]
The quantity h is the distance from the base to the tip of the cone.
We need the torque relative to the body-fixed coordinate frame, i.e.
\[ \vec{N}' = \vec{r}\,' \times D \cdot \begin{pmatrix} 0 \\ 0 \\ -mg \end{pmatrix}
   = \frac{3}{2}\, mgh \begin{pmatrix} q_0 q_1 + q_2 q_3 \\ q_0 q_2 - q_1 q_3 \\ 0 \end{pmatrix} . \tag{7.118} \]
Here (0, 0, −mg) is the weight of the top in the laboratory frame, where m is the
mass of the top and g is the gravitational acceleration. This vector is transformed
into the body-fixed coordinate frame via D. We can then insert $\vec{N}'$ into (7.94)–(7.96)
(notice: $\vec{N}$ in these equations refers to the body-fixed frame even though we have
omitted the prime!), which yields
\[ \begin{pmatrix} \dot{\omega}_1 \\ \dot{\omega}_2 \\ \dot{\omega}_3 \end{pmatrix}
   = \begin{pmatrix} \left( N_1 + (I_2 - I_3)\, \omega_2 \omega_3 \right) / I_1 \\
                     \left( N_2 + (I_3 - I_1)\, \omega_3 \omega_1 \right) / I_2 \\
                     0 \end{pmatrix} . \tag{7.119} \]
The principal moments of inertia are $I_1 = I_2 = (3/20)\, m (a^2 + 4h^2)$, and
$I_3 = (3/10)\, m a^2$. Here a is the radius of the cone’s base.
The four equations in (7.115) and the three equations in (7.119) form a
complete system of first order differential equations in terms of $q_1(t)$, $q_2(t)$,
$q_3(t)$, $q_0(t)$ as well as $\omega_1(t)$, $\omega_2(t)$, and $\omega_3(t)$. Its numerical solution is not too
difficult. The reverse transformation of $\vec{r}\,'$ from the body-fixed frame back to
the laboratory frame is accomplished via
\[ \vec{R} = D^{-1} \cdot \vec{r}\,' = \frac{3}{2}\, h \begin{pmatrix} q_0 q_2 + q_1 q_3 \\ -q_0 q_1 + q_2 q_3 \\ (q_0^2 - q_1^2 - q_2^2 + q_3^2)/2 \end{pmatrix} . \tag{7.120} \]
The following figure shows the path of $\vec{R}(t)/h$ for a rather flat top for which
a/h = 1.7 (the mass cancels). In this case the initial values, i.e. at time t = 0,
are ϕ = θ = π/3, ψ = 0 and $\omega_1 = \omega_2 = 0.5$, $\omega_3 = 15$.
Remark 1: A simple integrator, which may be used here, is obtained by
subtracting the following two series expansions
\[ x(t \pm \Delta t) = x(t) \pm \dot{x}(t)\, \Delta t + \frac{1}{2} \ddot{x}(t)\, \Delta t^2 \pm O(\Delta t^3) . \tag{7.121} \]
Thus
\[ x(t + \Delta t) = x(t - \Delta t) + 2 \dot{x}(t)\, \Delta t + O(\Delta t^3) . \tag{7.122} \]
Here x(t) represents each of the quantities $q_1(t)$, $q_2(t)$, $q_3(t)$, $q_0(t)$ as well as
$\omega_1(t)$, $\omega_2(t)$, and $\omega_3(t)$. Notice that the initial value at time $t - \Delta t$ may be
estimated using $x(t - \Delta t) \approx x(t) - \dot{x}(t)\, \Delta t$. In this example the integration
timestep is $\Delta t = 0.001$ (in units of $\sqrt{h/g}$).
Remark 2: The quaternions apparently circumvent the use of trigonometric
functions and avoid possible divergences during matrix inversions.
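The procedure of this example, i.e. (7.115) and (7.119) advanced with the integrator (7.122), can be sketched as follows. This is a minimal illustration, not the author's program: it assumes NumPy, units with m = g = h = 1, and hypothetical function names; the figure of the text is not reproduced.

```python
import numpy as np

def euler_to_quaternion(phi, theta, psi):
    """Quaternions ordered as (q1, q2, q3, q0), cf. (7.108)-(7.111)."""
    q0 = np.cos(theta / 2) * np.cos((phi + psi) / 2)
    q1 = np.sin(theta / 2) * np.cos((phi - psi) / 2)
    q2 = np.sin(theta / 2) * np.sin((phi - psi) / 2)
    q3 = np.cos(theta / 2) * np.sin((phi + psi) / 2)
    return np.array([q1, q2, q3, q0])

def derivatives(state, i1, i3):
    """Right-hand sides of (7.115) and (7.119) in units m = g = h = 1."""
    q1, q2, q3, q0, w1, w2, w3 = state
    dq1 = 0.5 * (q0 * w1 - q3 * w2 + q2 * w3)
    dq2 = 0.5 * (q3 * w1 + q0 * w2 - q1 * w3)
    dq3 = 0.5 * (-q2 * w1 + q1 * w2 + q0 * w3)
    dq0 = 0.5 * (-q1 * w1 - q2 * w2 - q3 * w3)
    n1 = 1.5 * (q0 * q1 + q2 * q3)           # torque components from (7.118)
    n2 = 1.5 * (q0 * q2 - q1 * q3)
    dw1 = (n1 + (i1 - i3) * w2 * w3) / i1    # (7.119) with I2 = I1
    dw2 = (n2 + (i3 - i1) * w3 * w1) / i1
    return np.array([dq1, dq2, dq3, dq0, dw1, dw2, 0.0])

def center_of_mass(state):
    """R/h in the laboratory frame, cf. (7.120)."""
    q1, q2, q3, q0 = state[:4]
    return 1.5 * np.array([q0 * q2 + q1 * q3,
                           -q0 * q1 + q2 * q3,
                           0.5 * (q0**2 - q1**2 - q2**2 + q3**2)])

def simulate_top(n_steps, dt=0.001, a_over_h=1.7):
    """Advance the top with the two-step scheme (7.122)."""
    i1 = 3.0 / 20.0 * (a_over_h**2 + 4.0)
    i3 = 3.0 / 10.0 * a_over_h**2
    current = np.concatenate([euler_to_quaternion(np.pi / 3, np.pi / 3, 0.0),
                              [0.5, 0.5, 15.0]])
    previous = current - dt * derivatives(current, i1, i3)  # estimate at t - dt
    path = [center_of_mass(current)]
    for _ in range(n_steps):
        new = previous + 2.0 * dt * derivatives(current, i1, i3)
        previous, current = current, new
        path.append(center_of_mass(current))
    return np.array(path), current

path, final_state = simulate_top(2000)
print(path[0], path[-1])
```

Because the scheme does not enforce (7.112) exactly, monitoring $\sum_i q_i^2$ (or equivalently $|\vec{R}|/h = 3/4$) is a simple way to judge the accuracy of a run.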
7.3 Static Contact Between Rigid Bodies†
The conditions for the static stability of individual rigid bodies or assemblies of rigid
bodies are
\[ \vec{F} = \sum_i \vec{f}_i = 0 \tag{7.123} \]
and
\[ \vec{N} = \sum_i \vec{r}_i \times \vec{f}_i = 0 . \tag{7.124} \]
The $\vec{f}_i$ are the forces acting at the positions $\vec{r}_i$.5 Already on p. 56 we had studied one
application of these equations.
• Problem 38 - Stability of an Arch: Homogeneous rectangular blocks of
length b are used to construct an arch across a gorge of width h (cf. the sketch).
When the arch is completed the two topmost blocks are touching face-to-face,
i.e. 2x = h. However, we require that the two blocks do not exert force on
each other. How many blocks do we need when h = 1.7b?
Solution: A stack of two blocks remains stable as long as the following
condition is satisfied (cf. the second sketch):
\[ x_s = \frac{1}{2} \left( x_s(1) + x_s(2) \right) \le \frac{b}{2} . \tag{7.125} \]
Notice that the origin of the horizontal axis does not coincide with the edge of
the gorge. The quantity xs is the horizontal position of the common center of
mass of the two blocks (open circle in the second sketch). If xs is located to
the right of the edge of the gorge, then the resulting torque relative to the edge
causes the blocks to fall. The individual centers of mass of the blocks are at
xs (i) (i = 1, 2) (solid circles). Our two-block semiarch extends a distance x
beyond the edge of the gorge. x is given by
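\[ x = x_s(1) + \left( x_s(2) - x_s(1) \right) = x_s(2) . \tag{7.126} \]
5 Notice that this does not depend on the position of the origin:
$\vec{N} = \sum_i \vec{r}_i \times \vec{f}_i = \sum_i \vec{r}\,'_i \times \vec{f}_i + \sum_i \vec{a} \times \vec{f}_i$, where
$\sum_i \vec{a} \times \vec{f}_i = \vec{a} \times \vec{F} = 0$ ($\vec{r}_i = \vec{r}\,'_i + \vec{a}$).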
The combination of (7.125) and (7.126) yields
\[ x \le b - x_s(1) . \tag{7.127} \]
There is a second axis of rotation, edge 2, as shown in the sketch. In order
to prevent the second block from rotating with respect to this axis, we must
require
\[ x_s(2) - x_s(1) \le \frac{b}{2} . \tag{7.128} \]
Again we use (7.126) to obtain the following condition in addition to (7.127)
\[ x \le \frac{b}{2} + x_s(1) . \tag{7.129} \]
The largest x, which one can achieve with two blocks, follows from the relations
(7.127) and (7.129) in the case of the equal sign, i.e.
\[ x_{max} = \frac{3}{4}\, b . \tag{7.130} \]
Because $2 x_{max} < h = 1.7b$, we discover that two blocks, i.e. an arch consisting
of a total of four blocks, are not sufficient.
Thus we add a third block as shown in the following sketch. Relative to
edge 1 we have
\[ x_s = \frac{1}{3} \left( x_s(1) + x_s(2) + x_s(3) \right) \le \frac{b}{2} \tag{7.131} \]
\[ x = x_s(3) \tag{7.132} \]
\[ x \le \frac{3}{2}\, b - x_s(1) - x_s(2) . \tag{7.133} \]
With respect to edge 2 we find
\[ \frac{1}{2} \left( x_s(3) + x_s(2) \right) - x_s(1) \le \frac{b}{2} \tag{7.134} \]
and thus, using (7.132),
\[ x \le b + 2 x_s(1) - x_s(2) . \tag{7.135} \]
Finally, edge 3 yields
\[ x_s(3) - x_s(2) \le \frac{b}{2} \tag{7.136} \]
and
\[ x \le \frac{b}{2} + x_s(2) . \tag{7.137} \]
This time the system of equations (!) (7.133), (7.135), and (7.137) yields
\[ x_{max} = \frac{11}{12}\, b , \tag{7.138} \]
i.e. $2 x_{max} \approx 1.83b$. This means that we can use six blocks to build an arch
spanning the required width. Question: Assuming we have infinitely many
blocks, what is the maximum possible h?
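The three-block result can be reproduced by solving (7.133), (7.135), and (7.137) with the equal sign as a linear system. The sketch below assumes NumPy; the function names are hypothetical. The second function is an observation rather than something derived in the text: the partial sums $(b/2)\sum_{k=1}^{n} 1/k$ happen to reproduce (7.130) and (7.138), which hints at how the overhang grows with the number of blocks.

```python
from fractions import Fraction

import numpy as np

def max_overhang_three_blocks(b=1.0):
    """Solve (7.133), (7.135), (7.137) as equalities for (x, xs(1), xs(2))."""
    # x + xs(1) + xs(2) = 3b/2 ;  x - 2 xs(1) + xs(2) = b ;  x - xs(2) = b/2
    coefficients = np.array([[1.0, 1.0, 1.0],
                             [1.0, -2.0, 1.0],
                             [1.0, 0.0, -1.0]])
    right_hand_side = np.array([1.5 * b, b, 0.5 * b])
    return np.linalg.solve(coefficients, right_hand_side)

def harmonic_overhang(n_blocks, b=Fraction(1)):
    """(b/2) * (1 + 1/2 + ... + 1/n); an assumed pattern, checked here for n = 2, 3."""
    return b / 2 * sum(Fraction(1, k) for k in range(1, n_blocks + 1))

x_max, xs1, xs2 = max_overhang_three_blocks()
print(x_max, xs1, xs2)
print(harmonic_overhang(2), harmonic_overhang(3))
```

For b = 1 the linear system should return $x_{max} = 11/12$ together with $x_s(1) = 1/6$ and $x_s(2) = 5/12$.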
d’Alembert’s Principle†:
Figure 7.5 depicts a mechanical system consisting of two cable drums mounted
tightly side-by-side on the same axis. There are two masses $m_1$ and $m_2$ attached to
the ends of the two cables. The system is in equilibrium. This means that, neglecting
friction, there is no work required to lower mass $m_1$, which in turn lifts mass $m_2$, and
vice versa. Mathematically this can be expressed via
\[ \sum_i \vec{f}_i \cdot \delta\vec{r}_i = 0 \tag{7.139} \]
or
\[ m_1 g\, \delta r_1 - m_2 g\, \delta r_2 = 0 . \]
With $\delta r_i = R_i\, \delta\phi$, where δφ is the rotation angle, follows
\[ m_1 R_1 = m_2 R_2 . \]
[Fig. 7.5 Atwood machine: two cable drums of radii R1 and R2 on a common axis, carrying the masses m1 and m2.]
Equation (7.139) is applicable quite generally to static mechanical systems. The $\delta\vec{r}_i$
are displacements, sometimes called virtual displacements, compatible with possible
constraints. Notice that the $\vec{f}_i$ include two types of forces, i.e. constraints, $\vec{f}_i^{(c)}$, and
others, $\vec{f}_i^{(nc)}$, i.e. $\vec{f}_i = \vec{f}_i^{(c)} + \vec{f}_i^{(nc)}$. Because $\delta\vec{r}_i \perp \vec{f}_i^{(c)}$, we have
\[ \sum_i \vec{f}_i^{(nc)} \cdot \delta\vec{r}_i = 0 . \tag{7.140} \]
This approach to the solution of static problems can be extended to dynamical
problems, in which case one has
\[ \sum_i \vec{f}_i^{(nc)} \cdot \delta\vec{r}_i = \sum_i \dot{\vec{p}}_i \cdot \delta\vec{r}_i . \tag{7.141} \]
Equation (7.141) is known as d’Alembert’s principle.6

D’Alembert’s principle does not yield completely new insights. The above
equilibrium condition $m_1 R_1 = m_2 R_2$, for instance, we can obtain from (7.124) also. On
the other hand, the scalar product $\vec{f}_i \cdot \delta\vec{r}_i$ includes possible constraints right from the
start.
Remark: One may derive the Euler–Lagrange equation from d’Alembert’s principle.
6 Alembert, Jean-Baptiste Le Rond d’, French mathematician, philosopher and physicist, *Paris
16.11.1717, †Paris 29.10.1783.
• Problem 39 - D’Alembert’s Principle: Use d’Alembert’s principle to
calculate the acceleration $\ddot{x}(t)$ for the above Atwood’s machine. Neglect the inertia
of the cable drums.
Solution: In this case (7.141) yields
\[ -m_1 g R_1\, \delta\phi + m_2 g R_2\, \delta\phi = m_1 \dot{\omega} R_1^2\, \delta\phi + m_2 \dot{\omega} R_2^2\, \delta\phi . \tag{7.142} \]
Via
\[ \ddot{x}(t) = \frac{d}{dt} \frac{\delta r_2}{\delta t} = R_2 \frac{d}{dt} \frac{\delta\phi}{\delta t} = R_2\, \dot{\omega} \tag{7.143} \]
and insertion of $\dot{\omega}$ from (7.143) into (7.142) we obtain
\[ \ddot{x}(t) = R_2\, g\, \frac{-m_1 R_1 + m_2 R_2}{m_1 R_1^2 + m_2 R_2^2} . \tag{7.144} \]
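As a quick plausibility check of (7.144), the following sketch evaluates the acceleration for two special cases. The function name `drum_acceleration` and the numerical values are illustrative assumptions.

```python
def drum_acceleration(m1, m2, r1, r2, g=9.81):
    """Acceleration of mass m2 according to (7.144)."""
    return r2 * g * (-m1 * r1 + m2 * r2) / (m1 * r1**2 + m2 * r2**2)

# Equilibrium case m1 R1 = m2 R2: the acceleration vanishes.
print(drum_acceleration(m1=2.0, m2=1.0, r1=0.1, r2=0.2))

# Equal radii: (7.144) reduces to the familiar (m2 - m1) g / (m1 + m2).
print(drum_acceleration(m1=1.0, m2=3.0, r1=0.1, r2=0.1), (3.0 - 1.0) / (3.0 + 1.0) * 9.81)
```

The first case recovers the static condition $m_1 R_1 = m_2 R_2$ found above; the second shows that for $R_1 = R_2$ the drum radius drops out.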
Chapter 8
Canonical Mechanics
This chapter introduces an approach to mechanics, which is particularly useful in

two other areas of physics - quantum mechanics and statistical mechanics.
8.1 Hamilton’s Equations of Motion
In Chaps. 3 and 4 we discussed the conservation of energy in isolated systems
based on the idea that the Lagrangian should not explicitly depend on time. We
obtained the formulas (3.12) and (4.38) expressing the total energy, E, in terms of
the Lagrangian and the product of the generalized velocities times the generalized
momenta. Perhaps it is a good idea to use these expressions more generally and
define the new function, the so-called Hamiltonian,1 via
\[ H = \sum_j p_j \dot{q}_j - L . \tag{8.1} \]
The Hamiltonian coincides with (4.38) when we use (4.41) to replace the derivatives
of L with respect to the $\dot{q}_j$.
The Lagrangian depends on the coordinates and their attendant velocities. What
does the Hamiltonian depend on? We find the answer by working out the total
differential, i.e.
\[ dH = \sum_j p_j\, d\dot{q}_j + \sum_j \dot{q}_j\, dp_j - dL . \]
1 Hamilton, Sir (since 1835) William Rowan, Irish mathematician and physicist, *Dublin 4.8.1805,
†Dunsink (near Dublin) 2.9.1865.
DOI 10.1007/978-3-319-48710-6_8
The total differential of L is given by
\[ dL = \sum_j \frac{\partial L}{\partial q_j}\, dq_j + \sum_j \frac{\partial L}{\partial \dot{q}_j}\, d\dot{q}_j
   \overset{(4.41),(4.43)}{=} \sum_j \dot{p}_j\, dq_j + \sum_j p_j\, d\dot{q}_j . \]
Hence
\[ dH = -\sum_j \dot{p}_j\, dq_j + \sum_j \dot{q}_j\, dp_j , \tag{8.2} \]
which shows that H depends on the generalized momentum components, $p_j$, and the
generalized coordinates, $q_j$. This means that the new, but equivalent, description of
a mechanical system through H instead of L replaces the $\dot{q}_j$ by the $p_j$.
An alternative set of equations of motion follows directly from (8.2), i.e.
\[ \dot{q}_j = \frac{\partial H}{\partial p_j} \tag{8.3} \]
and
\[ \dot{p}_j = -\frac{\partial H}{\partial q_j} . \tag{8.4} \]
These are Hamilton’s equations of motion. Because of their simplicity, they are
sometimes called canonical equations or Hamilton’s canonical equations.
But what have we really achieved by introducing H? Perhaps not all that much
from the point of view of mechanics. Because H is the energy of a system, we can
construct it by adding the potential energy to the kinetic energy instead of subtracting
it as in the case of L. In addition, there are cases when the above equations of motion
are indeed more convenient. However, it is the central role of energy in quantum and
in statistical mechanics, which causes H to assume center stage when these subjects
are introduced - at least initially.
Remark: Notice that (8.1) is a so-called Legendre transformation.2 In the present
case this transformation replaces the variables $\dot{q}_j$ by the new variables $p_j$. In order for
this to work, the derivative $p_j = \partial L/\partial \dot{q}_j$, i.e. the generalized momentum as defined
already in (4.41), must exist. Legendre transformations are very useful if one wants
to replace an ‘inconvenient’ variable by a ‘convenient’ one. Frequent use of Legendre
transformations is made in thermodynamics. For instance, it may be more convenient
(and safer) to work at constant pressure instead of at constant volume.
2 Legendre, Adrien Marie, French mathematician, *Paris 18.9.1752, †Paris 10.1.1833.
Let’s look at a number of simple examples for Hamiltonians:
(a) In cartesian coordinates the Hamiltonian of a point mass is given by
\[ H = \frac{1}{2m}\, \vec{p}^{\,2} + U(\vec{r}) . \tag{8.5} \]
(b) The same function expressed in spherical coordinates is
\[ H = \frac{1}{2m} \left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right) + U(r, \theta, \phi) . \tag{8.6} \]
This follows from (4.17) using $p_r = \partial L/\partial \dot{r}$, $p_\theta = \partial L/\partial \dot{\theta}$, and $p_\phi = \partial L/\partial \dot{\phi}$.
(c) The Hamiltonian of the free symmetric top (cf. p. 194) can be calculated starting
from the Lagrangian $L = \frac{1}{2} \sum_{i=1}^{3} I_i \omega_i^2$ ($I = I_1 = I_2$, $I_3 \neq 0$). We have $p_\phi =
\partial L/\partial \dot{\phi}$, $p_\theta = \partial L/\partial \dot{\theta}$, and $p_\psi = \partial L/\partial \dot{\psi}$, where φ, θ, and ψ are Euler’s angles. The
angular velocities are given by (7.105)–(7.107). The result is
\[ H = \frac{(p_\phi \csc\theta - p_\psi \cot\theta)^2 + p_\theta^2}{2I} + \frac{p_\psi^2}{2 I_3} . \tag{8.7} \]
In the case when $I_3 = 0$ the Hamiltonian becomes
\[ H = \frac{p_\phi^2 \csc^2\theta + p_\theta^2}{2I} . \tag{8.8} \]
Remark: The expressions (8.7) and (8.8) are useful in the context of gases and liquids
of small molecules. To good approximation small molecules, e.g., water or carbon
dioxide, behave like rigid bodies. Under not too severe conditions, molecular vibra-
tions, which follow from normal mode analysis, are quite independent of translational
and rotational motion.
• Example - Mathematical Pendulum: Once again we use the mathematical
pendulum in Fig. 2.1 as an example. Its Lagrangian is given by
\[ L(\phi, \dot{\phi}) = \frac{1}{2}\, m l^2 \dot{\phi}^2 - m g l (1 - \cos\phi) . \]
The momentum belonging to the coordinate φ, i.e. the momentum conjugate
to φ, follows according to
\[ p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m l^2 \dot{\phi} . \]
Therefore the Hamiltonian becomes
\[ H(\phi, p_\phi) = \frac{p_\phi^2}{2 m l^2} + m g l (1 - \cos\phi) . \]
We find the equations of motion via (8.3), i.e.
\[ \dot{\phi} = \frac{p_\phi}{m l^2} , \]
and via (8.4), i.e.
\[ \dot{p}_\phi = -m g l \sin\phi . \]
Calculating the time derivative of the first equation and inserting $\dot{p}_\phi$ from the
second equation yields the already known differential equation for φ(t) (cf.
(2.51)).
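Hamilton's equations for the pendulum are easy to integrate numerically. The sketch below uses a kick-drift-kick (leapfrog) splitting, which is a choice made here and not a method discussed in the text; function names and parameter values are likewise assumptions.

```python
import math

def pendulum_hamiltonian(phi, p, m=1.0, l=1.0, g=9.81):
    """H(phi, p_phi) of the mathematical pendulum."""
    return p**2 / (2.0 * m * l**2) + m * g * l * (1.0 - math.cos(phi))

def integrate_pendulum(phi0, p0, n_steps, dt=1e-3, m=1.0, l=1.0, g=9.81):
    """Kick-drift-kick integration of Hamilton's equations (8.3) and (8.4)."""
    phi, p = phi0, p0
    for _ in range(n_steps):
        p -= 0.5 * dt * m * g * l * math.sin(phi)   # half step of (8.4)
        phi += dt * p / (m * l**2)                  # full step of (8.3)
        p -= 0.5 * dt * m * g * l * math.sin(phi)   # second half step of (8.4)
    return phi, p

phi_end, p_end = integrate_pendulum(1.0, 0.0, 10000)
print(pendulum_hamiltonian(1.0, 0.0), pendulum_hamiltonian(phi_end, p_end))
```

The two printed energies should agree closely, in line with the conservation of H when it does not depend on time explicitly.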
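The total time derivative of H is
\[ \frac{dH}{dt} = \frac{\partial H}{\partial t} + \sum_j \frac{\partial H}{\partial q_j}\, \dot{q}_j + \sum_j \frac{\partial H}{\partial p_j}\, \dot{p}_j
   \overset{(8.3),(8.4)}{=} \frac{\partial H}{\partial t} . \tag{8.9} \]
This means that if H does not depend on time explicitly, then dH/dt = 0 (cf. energy
conservation).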
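We assume that both L and H, aside from depending on the $q_j$ and the $\dot{q}_j$ or the
$q_j$ and the $p_j$, also depend on the parameter λ. In this case
\[ dL = \sum_j \dot{p}_j\, dq_j + \sum_j p_j\, d\dot{q}_j + \frac{\partial L}{\partial \lambda}\, d\lambda \]
and
\[ dH = -\sum_j \dot{p}_j\, dq_j + \sum_j \dot{q}_j\, dp_j + \frac{\partial H}{\partial \lambda}\, d\lambda . \]
Addition of these two equations and using (8.1) yields the useful result
\[ \left( \frac{\partial L}{\partial \lambda} \right)_{q_j, \dot{q}_j} = -\left( \frac{\partial H}{\partial \lambda} \right)_{q_j, p_j} . \tag{8.10} \]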
• Problem 40 - Helical Track (Newton): The sketch shows a uniform cylinder
of radius a and mass $m_{cyl}$, rotating freely with respect to its axis.
A channel spirals down the cylinder’s surface along a helical path. Inside the
channel, whose pitch angle is α, the point mass m slides without friction.
Initially m is at rest at the upper end of the channel. It then starts its way down
along the track under the influence of gravity.
(a) Write down Newton’s equation of motion for the position vector, $\vec{r}(t)$, of the
point mass. Use a space-fixed laboratory frame of reference, whose origin is
the point where the axis emerges from the top of the cylinder. The z-axis of this
frame points downward along the cylinder’s axis. Hint: There are (initially)
unknown forces, for which we need additional equations in part (b). Consider
the angular momenta occurring in the system.
(b) Obtain the solution for the following initial conditions: $\vec{r}(0) = (a, 0, 0)$
and $\dot{\vec{r}}(0) = (0, 0, 0)$.
Solution: (a) The second sketch shows m and the relevant forces acting on it.
Here $\vec{F}_g$ is the force of gravity, $\vec{F}_B$ is the force exerted by the bottom of the
channel, and $\vec{F}_R$ is a radial force, pointing towards the axis of the cylinder,
exerted by the channel’s sidewall. Hence
\[ m \ddot{\vec{r}}(t) = \vec{F}_g + \vec{F}_B + \vec{F}_R . \tag{8.11} \]
Notice that the tilt angle of $\vec{F}_B$ relative to the x−y-plane is α. Notice also that
$\vec{F}_R$ has no z-component.
Expressing $\vec{r}(t)$ in cylindrical coordinates yields
\[ \vec{r}(t) = \begin{pmatrix} a \cos\phi(t) \\ a \sin\phi(t) \\ z(t) \end{pmatrix} . \tag{8.12} \]
Thus, combining (8.11) and (8.12) we obtain
\[ m \begin{pmatrix} -a \dot{\phi}^2 \cos\phi - a \ddot{\phi} \sin\phi \\ -a \dot{\phi}^2 \sin\phi + a \ddot{\phi} \cos\phi \\ \ddot{z} \end{pmatrix}
   = \begin{pmatrix} 0 \\ 0 \\ mg \end{pmatrix}
   + F_B \begin{pmatrix} -\sin\alpha \sin\phi \\ \sin\alpha \cos\phi \\ -\cos\alpha \end{pmatrix}
   + F_R \begin{pmatrix} -\cos\phi \\ -\sin\phi \\ 0 \end{pmatrix} . \tag{8.13} \]
(b) We begin with the derivation of an equation that allows us to eliminate $F_B$. This
requires the z-component of the angular momentum of the mass relative to the
cylinder’s axis:
\[ \dot{L}_{m,z} = N_z = \left[ \vec{r} \times (\vec{F}_g + \vec{F}_B + \vec{F}_R) \right]_z = a F_B \sin\alpha . \tag{8.14} \]
Using $L_{m,z} = m a^2 \dot{\phi}$ gives
\[ F_B = \frac{m a}{\sin\alpha}\, \ddot{\phi} . \tag{8.15} \]
Inserting this into the z-component of the equation of motion (8.13) yields
\[ \ddot{z} = g - \frac{a}{\tan\alpha}\, \ddot{\phi} . \tag{8.16} \]
At this point our goal is to express z in terms of φ. Considering the pitch of
the track we might be tempted to assume $z(t) = \phi(t)\, a \tan\alpha$. This means that
the height of the mass is reduced by $\phi(t)\, a \tan\alpha$ per angle φ. However, this
neglects the opposite rotation of the cylinder due to the force $-\vec{F}_B$ exerted by
the mass. This rotation by itself leads to an angle $\phi_{cyl}$ opposite to φ. Hence,
the true z-position of the mass is
\[ z(t) = \left( \phi(t) - \phi_{cyl}(t) \right) a \tan\alpha . \tag{8.17} \]
Therefore (8.16) becomes
\[ (\ddot{\phi} - \ddot{\phi}_{cyl})\, a \tan\alpha = g - \frac{a}{\tan\alpha}\, \ddot{\phi} . \tag{8.18} \]
Once again we do need an extra relation - this time between φ and $\phi_{cyl}$.
We obtain the desired equation from the conservation of the total angular
momentum with respect to the z-axis. We also remind ourselves that both the
point mass and the cylinder initially are at rest. Thus
\[ 0 = L_{m,z} + L_{cyl,z} = m a^2 \dot{\phi} + I_{zz} \dot{\phi}_{cyl} , \tag{8.19} \]
where $I_{zz} = m_{cyl}\, a^2/2$, and therefore
\[ \ddot{\phi}_{cyl} = -2\, \frac{m}{m_{cyl}}\, \ddot{\phi} . \tag{8.20} \]
The combination of (8.18) and (8.20) provides us with the final equation
for the time-development of φ:
\[ \ddot{\phi}(t) = \frac{g}{a}\, \frac{\tan\alpha}{1 + x \tan^2\alpha} \qquad (x = 1 + 2m/m_{cyl}) . \tag{8.21} \]
Simple integration yields
\[ \phi(t) = \frac{g t^2}{2a}\, \frac{\tan\alpha}{1 + x \tan^2\alpha} . \tag{8.22} \]
Using $\ddot{z}(t) = \ddot{\phi}(t)\, x\, a \tan\alpha$, which follows from (8.17) and (8.20), we obtain for the
vertical direction
\[ z(t) = \frac{g t^2}{2}\, \frac{x \tan^2\alpha}{1 + x \tan^2\alpha} . \tag{8.23} \]
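A short numerical check helps to confirm that the accelerations obtained from (8.20) and (8.21) are consistent with (8.16), (8.17), and energy conservation. The function name and the parameter values below are illustrative assumptions.

```python
import math

def helix_accelerations(alpha, m, m_cyl, a, g=9.81):
    """Angular and vertical accelerations of the point mass and the cylinder."""
    x = 1.0 + 2.0 * m / m_cyl
    tan_a = math.tan(alpha)
    phi_dd = (g / a) * tan_a / (1.0 + x * tan_a**2)   # (8.21)
    phi_cyl_dd = -2.0 * m / m_cyl * phi_dd            # (8.20)
    z_dd = (phi_dd - phi_cyl_dd) * a * tan_a          # second time derivative of (8.17)
    return phi_dd, phi_cyl_dd, z_dd

alpha, m, m_cyl, a, g = 0.4, 0.2, 1.5, 0.3, 9.81
phi_dd, phi_cyl_dd, z_dd = helix_accelerations(alpha, m, m_cyl, a, g)

# Consistency with (8.16): z'' = g - a phi'' / tan(alpha)
print(z_dd, g - a * phi_dd / math.tan(alpha))

# Energy balance for motion from rest: kinetic energy gained equals m g z,
# which for constant accelerations reduces to the following identity.
kinetic = m * (a**2 * phi_dd**2 + z_dd**2) + 0.5 * m_cyl * a**2 * phi_cyl_dd**2
print(kinetic, m * g * z_dd)
```

For a nearly vertical channel (α close to π/2) the vertical acceleration should approach g, i.e. free fall.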
• Problem 41 - Helical Track (Hamilton): Here we tackle the previous
problem using Hamilton’s equations of motion.
(a) Derive H for the entire system including both the cylinder and the point
mass.
(b) Solve Hamilton’s equations of motion.
Solution: (a) We begin with the potential energy, i.e.
\[ U = -m g z \overset{(8.17)}{=} -m g (\phi - \phi_{cyl})\, a \tan\alpha . \tag{8.24} \]
The kinetic energy,
\[ K = K_m + K_{cyl} , \tag{8.25} \]
has two contributions. The first one is due to m, i.e.
\[ K_m = \frac{m}{2} \left( a^2 \dot{\phi}^2 + \dot{z}^2 \right) = \frac{m a^2}{2} \left( \dot{\phi}^2 + (\dot{\phi} - \dot{\phi}_{cyl})^2 \tan^2\alpha \right) . \tag{8.26} \]
The second contribution is the kinetic energy of the rotating cylinder given by
\[ K_{cyl} = \frac{1}{2}\, I_{zz} \dot{\phi}_{cyl}^2 = \frac{1}{4}\, m_{cyl}\, a^2 \dot{\phi}_{cyl}^2 . \tag{8.27} \]
In addition we make use of the relation
\[ \phi_{cyl} = -2\, \frac{m}{m_{cyl}}\, \phi \tag{8.28} \]
derived in the previous problem based on angular momentum conservation.
This allows expressing the system’s Lagrangian in terms of $\dot{\phi}$ and φ:
\[ L = K - U = \frac{1}{2}\, m a^2 x (1 + x \tan^2\alpha)\, \dot{\phi}^2(t) + m g\, \phi(t)\, a x \tan\alpha \tag{8.29} \]
($x = 1 + 2m/m_{cyl}$).
For the Hamiltonian we need the generalized momentum:
\[ p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m a^2 x (1 + x \tan^2\alpha)\, \dot{\phi} . \tag{8.30} \]
Using
\[ \dot{\phi} = \frac{p_\phi}{m a^2 x (1 + x \tan^2\alpha)} \tag{8.31} \]
we find
\[ H = K + U = \frac{p_\phi^2(t)}{2 m a^2 x (1 + x \tan^2\alpha)} - m g\, \phi(t)\, a x \tan\alpha . \tag{8.32} \]
(b) This time we obtain the equation of motion for φ via (8.3) and (8.4),
i.e.
\[ \dot{p}_\phi = -\frac{\partial H}{\partial \phi} = m g a x \tan\alpha \tag{8.33} \]
and
\[ \dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m a^2 x (1 + x \tan^2\alpha)} . \tag{8.34} \]
Differentiating the last equation with respect to time and inserting $\dot{p}_\phi$ from
(8.33) yields the sought after result (8.21). The solution, as before, is given by
(8.22) and (8.23).
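• Problem 42 - Elastic Pendulum (Hamilton): Already in two previous
problems, 9 and 15 on pp. 66 and 98, we have studied the elastic pendulum by
other methods. Here we want to obtain Hamilton’s equations of motion for this
system. We use $U(\phi = 0, \rho_o) = 0$, where $\rho_o$ is the solution of $\partial U/\partial\rho|_{\phi=0} = 0$.
Solution: Kinetic and potential energy are given by
\[ K = \frac{m}{2} \left( \dot{\rho}^2 + \rho^2 \dot{\phi}^2 \right) \tag{8.35} \]
and
\[ U = -m g \rho \cos\phi + \frac{1}{2}\, k (\rho - l)^2 + U_o , \tag{8.36} \]
respectively (cf. (4.18) and (4.19)). We determine $U_o$ from
\[ 0 = \left. \frac{dU(\phi = 0, \rho)}{d\rho} \right|_{\rho_o} = -m g + k (\rho_o - l) , \tag{8.37} \]
i.e.
\[ \rho_o = l + \frac{m g}{k} \tag{8.38} \]
and
\[ U_o = m g \left( l + \frac{m g}{k} \right) - \frac{1}{2}\, k \left( \frac{m g}{k} \right)^2 = m g \left( l + \frac{m g}{2k} \right) . \tag{8.39} \]
The attendant Lagrangian is
\[ L = \frac{m}{2} \left( \dot{\rho}^2 + \rho^2 \dot{\phi}^2 \right) + m g \rho \cos\phi - \frac{1}{2}\, k (\rho - l)^2 - m g \left( l + \frac{m g}{2k} \right) \tag{8.40} \]
(cf. (4.20)). Now we calculate the generalized momenta, i.e.
\[ p_\rho = \frac{\partial L}{\partial \dot{\rho}} = m \dot{\rho} \quad \text{and} \quad p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m \rho^2 \dot{\phi} . \tag{8.41} \]
Subsequently we find the Hamiltonian H = K + U:
\[ H = \frac{1}{2m} \left( p_\rho^2 + \frac{p_\phi^2}{\rho^2} \right) + \frac{1}{2}\, k (l - \rho)^2 - m g \rho \cos\phi + m g \left( l + \frac{1}{2} \frac{m g}{k} \right) . \tag{8.42} \]
The resulting equations of motion follow according to
\[ \dot{\rho} = \frac{\partial H}{\partial p_\rho} = \frac{p_\rho}{m} \quad \text{and} \quad \dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m \rho^2} \tag{8.43} \]
as well as
\[ \dot{p}_\rho = -\frac{\partial H}{\partial \rho} = \frac{p_\phi^2}{m \rho^3} - k (\rho - l) + m g \cos\phi \tag{8.44} \]
and
\[ \dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -m g \rho \sin\phi . \tag{8.45} \]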
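The equations of motion (8.43)–(8.45) can be integrated directly. The sketch below uses a classical fourth-order Runge-Kutta step; the scheme, the function names, and the parameter values are assumptions made for illustration.

```python
import math

def hamiltonian(state, m, k, l, g):
    """H from (8.42) for state = (rho, phi, p_rho, p_phi)."""
    rho, phi, p_rho, p_phi = state
    u_o = m * g * (l + m * g / (2.0 * k))
    return ((p_rho**2 + p_phi**2 / rho**2) / (2.0 * m)
            + 0.5 * k * (l - rho)**2 - m * g * rho * math.cos(phi) + u_o)

def equations_of_motion(state, m, k, l, g):
    """Right-hand sides of (8.43)-(8.45)."""
    rho, phi, p_rho, p_phi = state
    return (p_rho / m,
            p_phi / (m * rho**2),
            p_phi**2 / (m * rho**3) - k * (rho - l) + m * g * math.cos(phi),
            -m * g * rho * math.sin(phi))

def rk4_step(state, dt, m, k, l, g):
    """One classical Runge-Kutta step."""
    def shifted(k_vec, factor):
        return tuple(s + factor * kv for s, kv in zip(state, k_vec))
    k1 = equations_of_motion(state, m, k, l, g)
    k2 = equations_of_motion(shifted(k1, 0.5 * dt), m, k, l, g)
    k3 = equations_of_motion(shifted(k2, 0.5 * dt), m, k, l, g)
    k4 = equations_of_motion(shifted(k3, dt), m, k, l, g)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

m, k, l, g = 1.0, 50.0, 1.0, 9.81        # illustrative parameters
rho_o = l + m * g / k                     # (8.38)
state = (rho_o, 0.5, 0.0, 0.0)            # released at rest from phi = 0.5
energy_start = hamiltonian(state, m, k, l, g)
for _ in range(5000):
    state = rk4_step(state, 1e-3, m, k, l, g)
print(energy_start, hamiltonian(state, m, k, l, g))
```

Since H does not depend on time explicitly, the two printed values should coincide to high accuracy; at the rest position $(\rho_o, \phi = 0)$ the choice of $U_o$ makes H vanish.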
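• Problem 43 - Hamilton-to-Lagrange and Back: Consider the Hamiltonian
\[ H = \frac{p^2}{2\alpha} - b q p\, e^{-\alpha t} + \frac{b\alpha}{2}\, q^2 e^{-\alpha t} \left( \alpha + b\, e^{-\alpha t} \right) + \frac{k q^2}{2} . \]
Here α, b, and k are constants.
(a) Find the attendant Lagrangian.
(b) Find an equivalent Lagrangian, which does not depend on time explicitly.
Which H belongs to this second L?
Solution: (a) We begin by calculating the following quantities
\[ \dot{q} = \frac{\partial H}{\partial p} = \frac{p}{\alpha} - b q\, e^{-\alpha t} \tag{8.46} \]
\[ p\, \frac{\partial H}{\partial p} = \frac{p^2}{\alpha} - b q p\, e^{-\alpha t} \tag{8.47} \]
\[ p\, \frac{\partial H}{\partial p} - H = \frac{p^2}{2\alpha} - \frac{b\alpha}{2}\, q^2 \left( \alpha + b\, e^{-\alpha t} \right) e^{-\alpha t} - \frac{k q^2}{2} . \tag{8.48} \]
Using (8.46) the momentum, p, expressed via $\dot{q}$ is given by
\[ p = \alpha \left( \dot{q} + b q\, e^{-\alpha t} \right) . \tag{8.49} \]
Inserting p into (8.48) yields the Lagrangian
\[ L = \frac{\alpha}{2}\, \dot{q}^2 - \frac{k q^2}{2} + b q \alpha \left( \dot{q} - \frac{\alpha q}{2} \right) e^{-\alpha t} . \tag{8.50} \]
(b) Looking at
\[ \frac{d}{dt} \left( q^2 e^{-\alpha t} \right) = 2 q \left( \dot{q} - \frac{\alpha q}{2} \right) e^{-\alpha t} \tag{8.51} \]
we notice that the last term of (8.50) can be expressed as a total derivative with
respect to time. In a remark on p. 90 we have shown that such a term does not
alter the equations of motion. Hence
\[ L = \frac{\alpha}{2}\, \dot{q}^2 - \frac{k q^2}{2} \tag{8.52} \]
is the equivalent Lagrangian. Its attendant Hamiltonian follows via (8.1),
when we replace $\dot{q}$ according to
\[ p = \frac{\partial L}{\partial \dot{q}} = \alpha \dot{q} , \tag{8.53} \]
where p is the generalized momentum. The result is
\[ H = \frac{p^2}{2\alpha} + \frac{k q^2}{2} . \tag{8.54} \]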
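That both Hamiltonians describe the same motion q(t) can also be seen numerically. The sketch below integrates Hamilton's equations for the time-dependent H of the problem statement and for (8.54), starting from the same q(0) and $\dot{q}(0)$; the constants, the Runge-Kutta scheme, and all names are illustrative assumptions.

```python
import math

ALPHA, B, K = 2.0, 0.5, 3.0   # illustrative constants, not taken from the text

def rhs_time_dependent(t, y):
    """Hamilton's equations for the explicitly time-dependent H of Problem 43."""
    q, p = y
    e = math.exp(-ALPHA * t)
    dq = p / ALPHA - B * q * e                                   # (8.46)
    dp = B * p * e - B * ALPHA * q * e * (ALPHA + B * e) - K * q  # -dH/dq
    return (dq, dp)

def rhs_time_independent(t, y):
    """Hamilton's equations for H = p^2 / (2 alpha) + k q^2 / 2, cf. (8.54)."""
    q, p = y
    return (p / ALPHA, -K * q)

def rk4(rhs, y0, t_end, n_steps):
    """Classical Runge-Kutta integration from t = 0 to t_end."""
    t, y = 0.0, tuple(y0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = rhs(t, y)
        k2 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k1)))
        k3 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k2)))
        k4 = rhs(t + dt, tuple(a + dt * b for a, b in zip(y, k3)))
        y = tuple(a + dt / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
        t += dt
    return y

# q(0) = 1 and dq/dt(0) = 0; the momenta differ according to (8.49) and (8.53).
q_a, _ = rk4(rhs_time_dependent, (1.0, ALPHA * B), 2.0, 2000)
q_b, _ = rk4(rhs_time_independent, (1.0, 0.0), 2.0, 2000)
print(q_a, q_b, math.cos(math.sqrt(K / ALPHA) * 2.0))
```

All three printed numbers should agree, since both descriptions lead to $\alpha \ddot{q} = -k q$ while the conjugate momenta differ by $\alpha b q\, e^{-\alpha t}$.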
Poisson Brackets:
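The total derivative of a function $f(q_i, p_i, t)$ with respect to time can be expressed
as
\[ \frac{df}{dt} = \frac{\partial f}{\partial t} + \{H, f\} . \]
The curly brackets are called Poisson brackets.3 They are given by
\[ \{H, f\} = \sum_i \left( \frac{\partial H}{\partial p_i} \frac{\partial f}{\partial q_i} - \frac{\partial H}{\partial q_i} \frac{\partial f}{\partial p_i} \right)
   \overset{(8.3),(8.4)}{=} \sum_i \left( \dot{q}_i \frac{\partial f}{\partial q_i} + \dot{p}_i \frac{\partial f}{\partial p_i} \right) . \tag{8.55} \]
Analogously we have in the case of two arbitrary functions f and g
\[ \{f, g\} = \sum_i \left( \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial q_i} - \frac{\partial f}{\partial q_i} \frac{\partial g}{\partial p_i} \right) , \tag{8.56} \]
which is the general definition of the Poisson brackets. Following from this definition
are the rules
\[ \{f, g\} = -\{g, f\} \tag{8.57} \]
\[ \{f, c\} = 0 \quad (\text{with } c = \text{const}) \tag{8.58} \]
\[ \{f_1 + f_2, g\} = \{f_1, g\} + \{f_2, g\} \tag{8.59} \]
\[ \{f_1 f_2, g\} = f_1 \{f_2, g\} + f_2 \{f_1, g\} \tag{8.60} \]
\[ \frac{\partial}{\partial t} \{f, g\} = \left\{ \frac{\partial f}{\partial t}, g \right\} + \left\{ f, \frac{\partial g}{\partial t} \right\} \tag{8.61} \]
\[ \{f, q_k\} = \frac{\partial f}{\partial p_k} \tag{8.62} \]
\[ \{f, p_k\} = -\frac{\partial f}{\partial q_k} \tag{8.63} \]
\[ \{q_i, q_k\} = 0 , \qquad \{p_i, p_k\} = 0 , \qquad \{p_i, q_k\} = \delta_{ik} . \tag{8.64} \]
3 Poisson, Siméon Denis, French mathematician and physicist, *Pithiviers (Département Loiret)
21.6.1781, †Paris 25.4.1840.
In addition there is the Jacobi identity4
\[ \{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0 \tag{8.65} \]
and the Poisson theorem, i.e.
\[ \{f, g\} = \text{const} , \tag{8.66} \]
if f and g are constants of motion, i.e. df/dt = 0 and dg/dt = 0.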

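As an example we work out the Poisson bracket of a particle’s angular momentum
components $L_x$ and $L_y$:
\[ \{L_x, L_y\} = \sum_{\alpha=1}^{3} \left[ \frac{\partial (y p_z - z p_y)}{\partial p_\alpha} \frac{\partial (z p_x - x p_z)}{\partial x_\alpha}
   - \frac{\partial (y p_z - z p_y)}{\partial x_\alpha} \frac{\partial (z p_x - x p_z)}{\partial p_\alpha} \right] \]
\[ = \sum_{\alpha=1}^{3} \left[ (y \delta_{z\alpha} - z \delta_{y\alpha})(\delta_{z\alpha} p_x - \delta_{x\alpha} p_z)
   - (\delta_{y\alpha} p_z - \delta_{z\alpha} p_y)(z \delta_{x\alpha} - x \delta_{z\alpha}) \right] \]
\[ = y p_x - x p_y = -L_z . \]
Analogously we find $\{L_y, L_z\} = -L_x$ as well as $\{L_z, L_x\} = -L_y$. We shall encounter
the Poisson brackets again as a central ingredient of quantum mechanics.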
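The definition (8.56), including its sign convention, can be checked numerically. The sketch below evaluates the partial derivatives by central differences; this numerical approach and all function names are assumptions made here for illustration.

```python
def partial(func, q, p, which, i, eps=1e-5):
    """Central difference of func(q, p) with respect to q[i] or p[i]."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if which == "q":
        q_plus[i] += eps
        q_minus[i] -= eps
    else:
        p_plus[i] += eps
        p_minus[i] -= eps
    return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2.0 * eps)

def poisson_bracket(f, g, q, p):
    """{f, g} in the convention of (8.56)."""
    return sum(partial(f, q, p, "p", i) * partial(g, q, p, "q", i)
               - partial(f, q, p, "q", i) * partial(g, q, p, "p", i)
               for i in range(len(q)))

def l_x(q, p):
    return q[1] * p[2] - q[2] * p[1]

def l_y(q, p):
    return q[2] * p[0] - q[0] * p[2]

def l_z(q, p):
    return q[0] * p[1] - q[1] * p[0]

q = [0.3, -1.2, 0.7]   # arbitrary test point (x, y, z)
p = [1.1, 0.4, -0.9]   # arbitrary test momenta (px, py, pz)
print(poisson_bracket(l_x, l_y, q, p), -l_z(q, p))
print(poisson_bracket(lambda q_, p_: p_[0], lambda q_, p_: q_[0], q, p))  # {p_x, x}
```

The first line should show two equal numbers; the second should be close to one, cf. (8.64).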
Canonical Transformations:
A canonical transformation of the old coordinates and generalized momenta, $q_i$
and $p_i$, to the new ones, $Q_j = Q_j(q_k, p_k, t)$ and $P_j = P_j(q_k, p_k, t)$, does not alter
Hamilton’s equations, i.e.
\[ \dot{Q}_j = \frac{\partial H'}{\partial P_j} \quad \text{and} \quad \dot{P}_j = -\frac{\partial H'}{\partial Q_j} . \tag{8.67} \]
The prime indicates that $H'$ is the Hamiltonian in terms of the $Q_j$ and the $P_j$. An
example of a canonical transformation is the transformation to normal coordinates
discussed in Chap. 6.
Remark: The transformation $Q_j = Q_j(q_k, p_k, t)$ and $P_j = P_j(q_k, p_k, t)$ expresses
both $Q_j$ and $P_j$ in terms of the $q_k$ as well as in terms of the $p_k$. This means that the
original distinction between coordinates and momenta has disappeared. An example
is the simple canonical transformation Q = p and P = −q. Generally $q_k$ and $p_k$ as
well as $Q_j$ and $P_j$ are called canonically conjugate variables.
4 Jacobi, Carl Gustav Jacob, German mathematician, *Potsdam 10.12.1804, †Berlin 18.2.1851.
8.2 Hamilton–Jacobi Theory
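According to the action’s definition, its total time derivative along a path is given by
\[ \frac{dS}{dt} = L \tag{8.68} \]
or, if we consider S to be a function of the coordinates and time,
\[ \frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_j \frac{\partial S}{\partial q_j}\, \dot{q}_j . \tag{8.69} \]
In order to better understand the meaning of this, we calculate the variation of S, i.e.
\[ \delta S = \left[ \frac{\partial L}{\partial \dot{q}}\, \delta q \right]_{t_1}^{t_2}
   + \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} \right) \delta q\, dt \]
(for one degree of freedom). If we consider the stationary path and only vary its
endpoint, $q(t_2)$, then we find
\[ \delta S = \left. \frac{\partial L}{\partial \dot{q}}\, \delta q \right|_{t = t_2} = p\, \delta q \]
or more generally
\[ \delta S = \sum_j p_j\, \delta q_j . \]
Thus
\[ \frac{\partial S}{\partial q_j} = p_j . \tag{8.70} \]
Inserting (8.68) and (8.70) into (8.69) yields
\[ \frac{\partial S}{\partial t} = L - \sum_j p_j \dot{q}_j = -H(q_1, \ldots, p_1, \ldots, t) . \tag{8.71} \]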
Replacing the generalized momenta in (8.71) via (8.70), we obtain the so-called
Hamilton–Jacobi differential equation:
\[ \frac{\partial S}{\partial t} + H\left( q_1, \ldots, q_s, \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_s}, t \right) = 0 . \tag{8.72} \]
This is a first order partial differential equation, which, in addition to the Euler–
Lagrange and Hamilton’s equations of motion, provides a third formal method to
derive and solve equations of motion.
Again we use the harmonic oscillator as an example:
\[ H = \frac{p^2}{2m} + \frac{k}{2}\, q^2 . \]
The attendant Hamilton–Jacobi differential equation is
\[ \frac{\partial S}{\partial t} + \frac{1}{2m} \left( \frac{\partial S}{\partial q} \right)^2 + \frac{k}{2}\, q^2 = 0 . \]
We try the ansatz $S = S_t + S_q$. Here $S_t$ depends solely on time and $S_q$ depends on
position. The result is
\[ -\dot{S}_t = \frac{1}{2m} \left( \frac{dS_q}{dq} \right)^2 + \frac{k}{2}\, q^2 = c , \]
where c is a constant. We find
\[ S_t = -c t , \]
and
\[ \frac{dS_q}{dq} = \sqrt{2mc - mk q^2} . \]
Hence
\[ S = \int \sqrt{2mc - mk q^2}\; dq - c t . \tag{8.73} \]
But how can we use (8.73) to find the solution q(t)? In order to understand the
general approach, we must take a short break from our example.
Generator of a Canonical Transformation:
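Using (8.69) in combination with (8.70) and (8.72) yields
\[ dS = \sum_j p_j\, dq_j - H\, dt \]
or
\[ S = \int \left( \sum_j p_j\, dq_j - H\, dt \right) . \tag{8.74} \]
Equation (8.74) is valid also for the canonically transformed variables $Q_j$ and $P_j$. This
means that δS = 0 implies
\[ \delta \int \left( \sum_j p_j\, dq_j - H\, dt \right) = 0 \]
as well as
\[ \delta \int \left( \sum_j P_j\, dQ_j - H'\, dt \right) = 0 . \]
Thus we have
\[ \sum_j p_j\, dq_j - H\, dt = \sum_j P_j\, dQ_j - H'\, dt + dF , \]
where dF is the total differential of a function of the coordinates, the momenta, and
time, and therefore
\[ dF = \sum_j p_j\, dq_j - \sum_j P_j\, dQ_j + \left( H' - H \right) dt . \tag{8.75} \]
Consequently we have
\[ p_j = \frac{\partial F}{\partial q_j} , \qquad P_j = -\frac{\partial F}{\partial Q_j} , \qquad H' = H + \frac{\partial F}{\partial t} . \tag{8.76} \]
The function F is called generator of the canonical transformation.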

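We now return to the above example, setting $F = S$, $c = Q$, and $C = P$. From
(8.76) follows $H' = 0$ (cf. (8.72)), $C = -\partial S/\partial c$, and $p = \partial S/\partial q$ (cf. (8.70)). $H' = 0$ yields
$\partial H'/\partial C = \dot c = 0$ and $-\partial H'/\partial c = \dot C = 0$, i.e. $c$ and $C$ are constants, and from
(8.73) we obtain
\[ -C = \sqrt{\frac{m}{k}} \underbrace{\int \frac{dq}{\sqrt{\frac{2c}{k} - q^2}}}_{= -\arccos\left( \sqrt{\frac{k}{2c}}\, q \right)} - t \]
or
\[ q = \sqrt{\frac{2c}{k}} \cos\left( \sqrt{\frac{k}{m}}\, (t + C) \right) . \]
With $\omega_0 = \sqrt{k/m}$, $\alpha = \sqrt{k/m}\, C$, and $a = \sqrt{2c/k}$ this is identical to (6.6). Here $C$ corresponds
to an initial time and $c = E$ (cf. (6.7)). Notice that energy and time are
conjugate variables.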
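The result can be checked numerically. The following is a minimal sketch in Python, not part of the original derivation; the function names and the parameter values used in the check are illustrative assumptions. It evaluates $q(t)$ from the formula above, forms $p = m\dot q$ analytically, and returns the Hamiltonian, which should equal the constant $c$ at all times. It also allows one to verify $p^2 = 2mc - mkq^2$, i.e. $p = \partial S/\partial q$ from (8.73) up to sign.

```python
import math

def q_of_t(t, c, C, m=1.0, k=1.0):
    """Coordinate q(t) = sqrt(2c/k) * cos(sqrt(k/m) * (t + C))."""
    return math.sqrt(2.0 * c / k) * math.cos(math.sqrt(k / m) * (t + C))

def p_of_t(t, c, C, m=1.0, k=1.0):
    """Momentum p = m dq/dt, obtained by differentiating q_of_t analytically."""
    w0 = math.sqrt(k / m)
    return -m * w0 * math.sqrt(2.0 * c / k) * math.sin(w0 * (t + C))

def hamiltonian(q, p, m=1.0, k=1.0):
    """H = p^2/(2m) + (k/2) q^2 of the harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q
```

Evaluating `hamiltonian(q_of_t(t, c, C), p_of_t(t, c, C))` for several times `t` should reproduce `c` to rounding accuracy, which illustrates the identification $c = E$.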
The beginner possibly perceives this section as academic. Its special meaning
becomes much clearer in the context of quantum mechanics. The attendant key-
words are Sommerfeld–Wilson quantization, quasi-classical approximation or path
integration.
Advanced Example: In early quantum mechanics the atom was comparable

to the solar system on the nanoscale, with the nucleus corresponding to the
sun and the electrons corresponding to the planets. Even though this picture is
not correct, it was useful for visualizing the increase or decrease of an atom’s
energy in terms of electrons changing orbits. By merely postulating discrete
orbits for the electrons, Niels Bohr5 was the first to achieve significant progress
in the understanding of the interaction of radiation with atoms.
According to this interpretation of electronic orbits, a discrete amount of
energy, hν, consisting of electromagnetic radiation with the frequency ν is
either absorbed or emitted by the atom. The quantity
\[ h = 6.6261 \cdot 10^{-34}\ \text{J s} \]
is Planck’s constant.6 We can assign a number, i.e. n = 1, 2, 3, . . . , to every

orbit, starting closest to the nucleus. The equation
\[ \Delta E = h\nu\, \Delta n \tag{8.77} \]
then describes the change in energy, $\Delta E$, of an atom, when an electron changes
from orbit $m = n \pm 1$ to orbit $n$, i.e. $\Delta n = m - n$ (Even though the concept of
orbits is abandoned, the numbers, albeit based on the more general concept of
quantum states, persist.).
Assuming that an electron ‘hops’ in n − 1 steps from the orbit closest to
the nucleus to the nth orbit, then, if n is large, the following form of (8.77) is
appropriate:
\[ \int_0^E \frac{dE'}{\nu(E')} = hn . \tag{8.78} \]
Here $E$ is the sum over all discrete increments $\Delta E$. Notice that the approximation
of the sum by an integral gets better and better as $n$ gets larger. Notice
also that in the limit of very large n the electron has become an elementary
charge on a macroscopic orbit, which must obey the laws of classical physics.
This is called the principle of correspondence.
Using canonical mechanics the left hand side of (8.78) can be transformed
as follows:
\[ \int_0^E \frac{dE'}{\nu(E')} = \oint p\, dq . \tag{8.79} \]
The integral on the right is calculated along a path in the p−q-plane, along
which the energy is equal to E.
We can derive (8.79) by considering the action S = S (q, t) as a function of
the generalized coordinate q and time. Based on this S we carry out a Legendre
transformation to the function S̃ = S̃ (q, H), where H is the Hamiltonian, i.e.
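\[ d\tilde S = d(Ht) + dS = t\, dH + H\, dt + \underbrace{\frac{\partial S}{\partial q}}_{=p} dq + \underbrace{\frac{\partial S}{\partial t}}_{=-H} dt = p\, dq + t\, dH . \]
Applying $H = E = \text{const}$ we integrate this equation between $q_o$ and $q$, which
yields
\[ \tilde S(q, E) - \tilde S(q_o, E) = \int_{q_o}^{q} p\, dq . \]
Taking the derivative with respect to $E$ leads to
\[ \underbrace{\frac{\partial \tilde S(q, E)}{\partial E}}_{=t} - \underbrace{\frac{\partial \tilde S(q_o, E)}{\partial E}}_{=t_o} = \frac{\partial}{\partial E} \int_{q_o}^{q} p\, dq . \]
Note that $t$ and $t_o$ correspond to the coordinates $q = q(t)$ and $q_o = q(t_o)$. If
we integrate over the entire period of the (periodic) motion, the result is
\[ T(E) = \frac{\partial}{\partial E} \oint p\, dq . \]
Here $T(E) = 1/\nu(E)$ is the time of one cycle when the energy is $E$. A final
integration of this equation from 0 to $E$ yields the desired equation:
\[ \int_0^E T(E')\, dE' = \oint p\, dq . \tag{8.80} \]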
Let’s apply this to the one-dimensional harmonic oscillator, i.e. $E = p^2/(2m) + (k/2)x^2$. Because the period of the oscillator is given by $T = 2\pi\sqrt{m/k}$, we obtain
\[ \int_0^E T(E')\, dE' = T \int_0^E dE' = 2\pi \sqrt{\frac{m}{k}}\, E . \tag{8.81} \]
Now we look at the right hand side of (8.80), i.e.
\[ \oint p\, dq = 2 \int_{-\sqrt{2E/k}}^{\sqrt{2E/k}} dx\, \sqrt{2m \left( E - \frac{k}{2} x^2 \right)} = 2 \sqrt{2mE}\, \sqrt{\frac{2E}{k}} \underbrace{\int_{-1}^{1} dz\, \sqrt{1 - z^2}}_{=\pi/2} = 2\pi \sqrt{\frac{m}{k}}\, E . \tag{8.82} \]
The quantity $\sqrt{2E/k}$ is the amplitude of the oscillator. Thus, both results,
(8.81) and (8.82), indeed are identical.
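The agreement of (8.81) and (8.82) can also be confirmed numerically. The sketch below is an illustration only; the function names, the midpoint rule, and the number of integration points are assumptions made for this example. It evaluates the closed-orbit integral of $p\,dq$ for the harmonic oscillator and compares it to $2\pi\sqrt{m/k}\,E$.

```python
import math

def action_integral(E, m=1.0, k=1.0, n=100_000):
    """Closed-orbit integral of p dq for E = p^2/(2m) + (k/2) x^2 (midpoint rule)."""
    a = math.sqrt(2.0 * E / k)            # amplitude, turning points at -a and +a
    dx = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * dx
        # p(x) on the upper branch; max() guards against tiny negative rounding errors
        total += math.sqrt(max(0.0, 2.0 * m * (E - 0.5 * k * x * x))) * dx
    return 2.0 * total                    # upper plus lower branch of the orbit

def period_times_energy(E, m=1.0, k=1.0):
    """Left hand side of (8.80) for the oscillator, i.e. 2 pi sqrt(m/k) E."""
    return 2.0 * math.pi * math.sqrt(m / k) * E
```

Both functions should return the same number up to the discretization error of the quadrature, for any positive choice of `E`, `m`, and `k`.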
Combination of (8.78) and (8.79) yields the useful formula
\[ \oint p\, dq = hn . \tag{8.83} \]
For instance, in the case $p_\varphi = L_z = L = \text{const}$ (cf. (5.17)) we find
\[ L = \frac{h}{2\pi}\, n . \tag{8.84} \]
This is the well known angular momentum quantization in early quantum
mechanics.
We note that the combination of the formulas (8.82) and (8.83) immediately
yields
\[ \Delta E = \hbar\omega , \tag{8.85} \]
where $\Delta n = 1$, $\omega = \sqrt{k/m}$, and $\hbar = h/(2\pi)$. Equation (8.85) tells us that a
harmonic oscillator does not alter its energy continuously. Instead the energy
changes by discrete energy quanta of size $\hbar\omega$ (But why do we not notice this
in the lab? Answer: + ). This is something new, because we have progressed
beyond the above model of the atom, as considered by Bohr. In courses on
classical electromagnetism we shall learn that the energy of an electromagnetic
field inside a cavity can be expressed as a sum over infinitely many terms, which
all look like the energy of a one-dimensional harmonic oscillator. According
to (8.85) we conclude that the energy inside the cavity consists of discrete
‘packages’ of energy - photons. In order to solve the black-body radiation
problem, i.e. the calculation of the radiation spectrum emanating from a warm
body, Planck boldly made the assumption that energy comes in small packages
and was able to get the right answer. This seminal discovery finally led to a
completely new understanding of the laws of nature based on quantum theory.
+ : Suppose a mass of 1 kg is suspended by a spring in the earth’s gravitational
field. The weight of the mass stretches the spring by 10 cm. The spring constant
is $k = 98.1$ kg s$^{-2}$, so that $\omega = \sqrt{k/m} \approx 9.9$ s$^{-1}$. Thus $\Delta E = \hbar\omega \approx 10^{-33}$ J. If the same mass were dropped from
a height of 10 cm onto the lab floor, the attendant change of potential energy
would exceed this number by a factor of roughly $10^{33}$. We conclude that $\Delta E$
is too small to be noticed when the oscillator is macroscopic.
5 Bohr, Niels Henrik David, Danish physicist, *Copenhagen 7.10.1885, †Copenhagen
18.11.1962; he was one of the great pioneers of atomic theory and received the Nobel Prize
in physics in 1922 for his contributions to the development and application of quantum
mechanics.
6 Planck, Max, German physicist, *Kiel 23.4.1858, †Göttingen 4.10.1947; his introduction
of energy quantization to solve the black-body radiation problem in 1900 is considered the
beginning of quantum theory. He received the Nobel Prize in physics in 1918.
Chapter 9
Many-Particle Mechanics
Thus far we have focused on examples allowing an analytical solution of the equation(s)
of motion. However, already our first mechanics problem, the mathematical
pendulum, required the assumption that the amplitude is small, in order for us to arrive
at a reasonably simple differential equation. In fact, for most problems a numerical
solution is attempted first and sometimes is the only feasible option. In the following
we study a numerical method, which is useful for solving problems involving few
variables, e.g. the displacement of a one-dimensional oscillator. Subsequently we
discuss a technique, the Molecular Dynamics simulation technique, which can be
used to solve large numbers of coupled equations of motion.
9.1 Numerical Solution of the Equations of Motion†
A standard numerical method when the number of degrees of freedom is small is

the Runge–Kutta method (RK). In the case of an ordinary first order differential
equation,
\[ \frac{dy}{dx} = f(x, y) , \tag{9.1} \]
we can use the following RK3 algorithm:
\[ y(x_{n+1}) = y(x_n) + \frac{h}{2} \Big[ f(x_n, y_n) + f(x_{n+1}, y_n + f_n h) \Big] + O(h^3) . \tag{9.2} \]
The quantity $h$ is the step width and $x_n = nh + x_0$, where $x_0$ is the initial $x$-value. In
addition $y_n = y(x_n)$ and $f_n = f(x_n, y_n)$.

DOI 10.1007/978-3-319-48710-6_9
We obtain (9.2) via the following series expansion:
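\[ y(x + h) = y(x) + y'(x)\, h + \frac{1}{2} y''(x)\, h^2 + O(h^3) = y(x) + f(x, y)\, h + \frac{1}{2} \frac{d}{dx} f(x, y)\, h^2 + O(h^3) . \]
Using $\frac{d}{dx} f(x, y) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial x} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} f$ one finds
\[ y(x + h) = y(x) + \frac{1}{2} f(x, y)\, h + \frac{1}{2} h \underbrace{\left[ f(x, y) + \frac{\partial f}{\partial x} h + \frac{\partial f}{\partial y} f h \right]}_{= f(x + h,\, y + fh) + O(h^2)} + O(h^3) . \]
The number in RK3 indicates the order of the algorithm. In the present case
all terms proportional to $h^3$ or higher powers of $h$ are omitted - this is what $O(h^3)$ tells us.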
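A single step of (9.2) translates into a few lines of code. The following Python sketch is one possible implementation; the names `rk3_step` and `integrate` and the test equation $y' = y$ are choices made for this illustration. Because the error per step is $O(h^3)$, the accumulated error after a fixed integration interval should drop by roughly a factor of four when $h$ is halved.

```python
import math

def rk3_step(f, x, y, h):
    """One step of (9.2): y + (h/2) * [f(x, y) + f(x + h, y + f(x, y) * h)]."""
    fn = f(x, y)
    return y + 0.5 * h * (fn + f(x + h, y + fn * h))

def integrate(f, x0, y0, h, steps):
    """Apply rk3_step repeatedly, starting from y(x0) = y0."""
    x, y = x0, y0
    for _ in range(steps):
        y = rk3_step(f, x, y, h)
        x += h
    return y

def exponential_error(h, steps):
    """Error for y' = y, y(0) = 1 at x = h * steps, compared with exp(x)."""
    return abs(integrate(lambda x, y: y, 0.0, 1.0, h, steps) - math.exp(h * steps))
```

For example, `exponential_error(0.01, 100)` and `exponential_error(0.005, 200)` both measure the error at $x = 1$ and can be compared directly.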
The algorithm can be applied to ordinary differential equations of arbitrary order
m as well. For this purpose the mth order differential equation is transformed into a
system of first order differential equations:

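\[ y^{(m)} = f\left( x, y, y^{(1)}, \dots, y^{(m-1)} \right) . \tag{9.3} \]
With $y_0 \equiv y$, $y_1 \equiv y^{(1)}$, $y_2 \equiv y^{(2)}$, …, $y_m = y^{(m)}$ follows
\[ \begin{aligned} y_0' &= y_1 \\ &\;\;\vdots \\ y_{m-2}' &= y_{m-1} \\ y_{m-1}' &= f(x, y_0, y_1, \dots, y_{m-1}) . \end{aligned} \tag{9.4} \]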
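The following algorithm, given here without proof, is the RK4 applicable to
differential equations of the type $y'' = f(x, y)$:
\[ y'_{n+1} = y'_n + \frac{1}{6} k_1 + \frac{2}{3} k_2 + \frac{1}{6} k_3 \tag{9.5} \]
\[ y_{n+1} = y_n + h \left\{ y'_n + \frac{1}{6} (k_1 + 2k_2) \right\} + O(h^4) \tag{9.6} \]
\[ k_1 = h f(x_n, y_n) \]
\[ k_2 = h f\left( x_n + \frac{h}{2},\; y_n + \frac{h}{2} y'_n + \frac{h}{8} k_1 \right) \]
\[ k_3 = h f\left( x_n + h,\; y_n + h y'_n + \frac{h}{2} k_2 \right) \]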
(cf. Formula (25.5.22) in [1]). Notice that here the index n indicates the nth integration
step.
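As an illustration, the RK4 scheme (9.5) and (9.6) for $y'' = f(x, y)$ can be coded as follows. This is a sketch in Python; the function names and the test problem $y'' = -y$ with $y(0) = 1$, $y'(0) = 0$ (exact solution $\cos x$) are assumptions made for the example.

```python
import math

def rk4_second_order_step(f, x, y, yp, h):
    """One step of (9.5) and (9.6) for y'' = f(x, y); yp denotes y'."""
    k1 = h * f(x, y)
    k2 = h * f(x + 0.5 * h, y + 0.5 * h * yp + 0.125 * h * k1)
    k3 = h * f(x + h, y + h * yp + 0.5 * h * k2)
    y_new = y + h * (yp + (k1 + 2.0 * k2) / 6.0)            # (9.6)
    yp_new = yp + k1 / 6.0 + 2.0 * k2 / 3.0 + k3 / 6.0      # (9.5)
    return y_new, yp_new

def cosine_error(h=0.01, x_end=1.0):
    """Error at x_end for y'' = -y, y(0) = 1, y'(0) = 0, compared with cos(x_end)."""
    x, y, yp = 0.0, 1.0, 0.0
    for _ in range(round(x_end / h)):
        y, yp = rk4_second_order_step(lambda x_, y_: -y_, x, y, yp, h)
        x += h
    return abs(y - math.cos(x_end))
```

Calling `cosine_error()` with different step widths gives a quick impression of how fast the error shrinks with $h$.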
• Problem 44 - A Numerical Integrator:

Based on the RK3-algorithm in (9.2), which applies to $y' = f(x, y)$, construct
another Runge–Kutta algorithm ($O(h^3)$), which can be used directly to solve
differential equations of the type $y'' = f(x, y, y')$. The algorithm should be of
the form $y'(x_{n+1}) = \dots + O(h^3)$ and $y(x_{n+1}) = \dots + O(h^3)$.
Solution: We begin by rewriting $y'' = f(x, y, y')$ into a first order system and
define
\[ y_o \equiv y \quad \text{and} \quad y_1 \equiv y' . \]
Hence
\[ y_o' = y_1 \tag{9.7} \]
\[ y_1' = f(x, y_o, y_1) . \tag{9.8} \]
Now the RK3 of (9.2) is applied to both equations, which yields
\[ y_1(x_{n+1}) \approx y_1(x_n) + \frac{h}{2} \Big[ f(x_n, y_o(x_n), y_1(x_n)) + f\big( x_{n+1},\, y_o(x_n) + y_1(x_n) h,\, y_1(x_n) + h f(x_n, y_o(x_n), y_1(x_n)) \big) \Big] \tag{9.9} \]
as well as
\[ y_o(x_{n+1}) \approx y_o(x_n) + \frac{h}{2} \Big[ y_1(x_n) + y_1(x_{n+1}) \Big] . \tag{9.10} \]
Finally we arrive at
\[ y'(x_{n+1}) \approx y'(x_n) + \frac{h}{2} \Big[ f_n + f(x_{n+1},\, y_n + h y'_n,\, y'_n + h f_n) \Big] \tag{9.11} \]
\[ y(x_{n+1}) \approx y(x_n) + \frac{h}{2} \Big[ y'(x_n) + y'(x_{n+1}) \Big] \tag{9.12} \]
(with $f_n \equiv f(x_n, y_n, y'_n)$). An example application is discussed in Problem 45.
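The two update formulas (9.11) and (9.12) can be sketched in Python as shown below. The function names and the damped-free test case $y'' = -y$ are illustrative assumptions, not part of the problem text.

```python
import math

def rk3_second_order_step(f, x, y, yp, h):
    """One step of (9.11) and (9.12) for y'' = f(x, y, y'); yp denotes y'."""
    fn = f(x, y, yp)
    yp_new = yp + 0.5 * h * (fn + f(x + h, y + h * yp, yp + h * fn))   # (9.11)
    y_new = y + 0.5 * h * (yp + yp_new)                                 # (9.12)
    return y_new, yp_new

def oscillator_error(h=0.001, x_end=1.0):
    """Error at x_end for y'' = -y, y(0) = 1, y'(0) = 0, compared with cos(x_end)."""
    x, y, yp = 0.0, 1.0, 0.0
    for _ in range(round(x_end / h)):
        y, yp = rk3_second_order_step(lambda x_, y_, yp_: -y_, x, y, yp, h)
        x += h
    return abs(y - math.cos(x_end))
```

Note that (9.12) needs the already updated derivative, so the order of the two lines inside the step function matters.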

• Problem 45 - Trajectories of the Elastic Pendulum:

(a) Obtain the dimensionless form of Hamilton’s equations in Problem 42.
The resulting four coupled first order differential equations depend on the
parameter $K = (\tau_l/\tau_m)^2$ (cf. Problem 13 (b); $\tau_l = \sqrt{l/g}$ and $\tau_m = \sqrt{m/k}$).
What is the dimensionless form of the Hamiltonian?
(b) In the special case $K \to \infty$ (rigid pendulum) only two equations remain.
Obtain these equations and write a computer program, which calculates,
based on the algorithm of Problem 44, the quantities $\varphi(t)$ and $p_\varphi(t)$. Plot the
phase space trajectories corresponding to the five initial conditions $\varphi(t = 0) = \mp\pi$; $p_\varphi(t = 0) = \pm 0.001$, $\varphi(t = 0) = \pi/5$; $p_\varphi(t = 0) = 0$, and $\varphi(t = 0) = \mp\pi$;
$p_\varphi(t = 0) = \pm 1$.
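Solution: (a) We start from the two equations in (8.43):
\[ \dot\rho = \frac{p_\rho}{m} \quad \text{and} \quad \dot\varphi = \frac{p_\varphi}{m\rho^2} . \tag{9.13} \]
Expressing these equations via the dimensionless quantities $\rho^* = \rho/l$, $t^* = t/\tau$, $p^*_\rho$, and $p^*_\varphi$ we obtain
\[ m \frac{l}{\tau} \frac{d\rho^*}{dt^*} = p_\rho = m \frac{l}{\tau}\, p^*_\rho \quad \text{and} \quad m \frac{l^2}{\tau} \frac{d\varphi}{dt^*} = \frac{p_\varphi}{\rho^{*2}} = m \frac{l^2}{\tau} \frac{p^*_\varphi}{\rho^{*2}} , \tag{9.14} \]
i.e.
\[ \frac{d\rho^*}{dt^*} = p^*_\rho \quad \text{and} \quad \frac{d\varphi}{dt^*} = \frac{p^*_\varphi}{\rho^{*2}} . \tag{9.15} \]
Notice that $\tau$ is a ‘typical’ time, which we are free to define in terms of other
quantities (cf. below). Analogously we rewrite the next equation in Problem
42, (8.44), i.e.
\[ \frac{ml}{\tau^2} \frac{dp^*_\rho}{dt^*} = \frac{m^2 l^4}{m l^3 \tau^2} \frac{p^{*2}_\varphi}{\rho^{*3}} - kl(\rho^* - 1) + mg \cos\varphi \tag{9.16} \]
or
\[ \frac{dp^*_\rho}{dt^*} = \frac{p^{*2}_\varphi}{\rho^{*3}} - \tau^2 \frac{k}{m} (\rho^* - 1) + \tau^2 \frac{g}{l} \cos\varphi . \tag{9.17} \]
Now we define $\tau^2 \equiv \tau_l^2 = l/g$ and, in addition, $\tau_m^2 \equiv m/k$. Hence (9.17)
becomes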
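\[ \frac{dp^*_\rho}{dt^*} = \frac{p^{*2}_\varphi}{\rho^{*3}} - \left( \frac{\tau_l}{\tau_m} \right)^2 (\rho^* - 1) + \cos\varphi . \tag{9.18} \]
The final equation in Problem 42, which we must transform, is (8.45). In this
case we obtain
\[ \frac{dp^*_\varphi}{dt^*} = -\rho^* \sin\varphi . \tag{9.19} \]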
The equations in (9.15), (9.18), and (9.19) form a coupled system of non-linear
first order differential equations. The dynamics of the system depends on the
single parameter $K \equiv (\tau_l/\tau_m)^2$. Notice that $K$ is the squared ratio of the periods of the
mathematical pendulum and the harmonic oscillator, respectively.
The dimensionless Hamiltonian is given by
\[ H^* = \frac{1}{2} \left( p^{*2}_\rho + \frac{p^{*2}_\varphi}{\rho^{*2}} \right) + \frac{1}{2} K (\rho^* - 1)^2 - \rho^* \cos\varphi + 1 + K . \tag{9.20} \]
Remark: If one intends to work with an equation on a computer, not just in

this particular problem but as a general rule, one should always implement the
equation in its dimensionless form.
(b) In the following we only use dimensionless quantities and therefore we
omit the asterisk. The limit K → ∞ implies ρ → 1, because otherwise the
right hand side of the ṗρ -equation diverges. Consequently we obtain
φ̇ = pφ and ṗφ = − sin φ (9.21)
in this limit. Notice that the ρ∗ → 1-limit of (9.18) already is discussed in

Problem 13.
We now apply the RK3-algorithm of Problem 44 to this system. In the
present case yo ≡ φ and y1 ≡ pφ (cf. (9.7) and (9.8)). Thus, according to (9.9)
and (9.10),
\[ p_{\varphi,n+1} = p_{\varphi,n} + \frac{\Delta t}{2} \Big[ -\sin\varphi_n - \sin(\varphi_n + p_{\varphi,n} \Delta t) \Big] \tag{9.22} \]
and
\[ \varphi_{n+1} = \varphi_n + \frac{\Delta t}{2} \Big[ p_{\varphi,n} + p_{\varphi,n+1} \Big] . \tag{9.23} \]
The indices $n$ and $n + 1$ indicate the times $t_n$ and $t_{n+1}$, respectively. The result
of this algorithm is shown in Fig. 9.1 for the five specified initial conditions
(timestep: $\Delta t = 0.01$), of which each yields one of the curves. Try to assign
the initial conditions to the individual trajectories.
Fig. 9.1 Phase space trajectories of the pendulum
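A possible program for part (b) is sketched below in Python. It implements (9.22) and (9.23) and collects the phase space points; plotting is left to the reader. The function names, the number of steps, and the reading of the sign pairs in the initial conditions (upper signs together, lower signs together) are assumptions of this sketch. The function `energy` uses the dimensionless Hamiltonian for $\rho = 1$ and $p_\rho = 0$ without its additive constant and serves only as a consistency check.

```python
import math

def pendulum_step(phi, p, dt):
    """One step of (9.22) and (9.23) for the rigid pendulum."""
    p_new = p + 0.5 * dt * (-math.sin(phi) - math.sin(phi + p * dt))   # (9.22)
    phi_new = phi + 0.5 * dt * (p + p_new)                              # (9.23)
    return phi_new, p_new

def trajectory(phi0, p0, dt=0.01, steps=2000):
    """List of (phi, p_phi) points starting from the given initial condition."""
    points = [(phi0, p0)]
    phi, p = phi0, p0
    for _ in range(steps):
        phi, p = pendulum_step(phi, p, dt)
        points.append((phi, p))
    return points

def energy(phi, p):
    """Dimensionless energy p^2/2 - cos(phi), additive constant omitted."""
    return 0.5 * p * p - math.cos(phi)

# The five initial conditions (phi(0), p_phi(0)) of Problem 45 (b).
INITIAL_CONDITIONS = [
    (-math.pi, 0.001), (math.pi, -0.001),
    (math.pi / 5, 0.0),
    (-math.pi, 1.0), (math.pi, -1.0),
]
```

Plotting the second against the first entry of each point returned by `trajectory(phi0, p0)` for all entries of `INITIAL_CONDITIONS` gives the phase space portrait. Along each trajectory `energy` should remain nearly constant, which is a simple test of the timestep.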
Remark: How about the solution of the original system (9.15), (9.18), and
(9.19)? Aside from having to derive a new algorithm, there is a bigger problem
here. Generally the dynamical behavior of non-linear systems is complex. We
may obtain different results, even for minute changes of the parameter values or
initial conditions. In Sect. 9.4 we shall return to this point, discussing a simpler
but nevertheless instructive example for a non-linear iteration algorithm.
9.2 Molecular Dynamics Simulation
The Runge–Kutta method is not very well suited when we deal with computationally
expensive problems involving many point masses, e.g. $N \sim 100$ or even $N \sim 10^6$,
where $N$ is the number of point masses (or particles)! But what type of system are
we talking about? Think about a gas or a liquid. If we were able to follow the path
of a molecule in a gas or a liquid, it would look rather irregular because of the
collisions with the other molecules. Even though the path of the individual molecule
is irrelevant for the interesting quantities like temperature, pressure, density, transport
coefficients, . . . , we can use the collective information of many individual trajectories
to calculate the aforementioned macroscopic quantities of interest. This in fact is done
in statistical mechanics. Another type of system is the solar system. We might be
interested in predicting the positions of the planets and their moons relative to one
another and to the sun. Again, depending on the amount of detail, N may be rather
large. In fact, we can apply the same numerical solution method to both types of
system, even though their relevant scales are vastly different.
The following is a brief introduction to the Molecular Dynamics simulation tech-
nique applied to simple gases and liquids [2]. ‘Simulation’ means that we do not
carry out a certain experiment in a real laboratory but on a computer instead.
Simulation Boxes:
Fig. 9.2 Left primary simulation box in the center at the beginning of a simulation embedded in
a lattice of its periodic images. Right after a certain number of simulation steps all but one of the
real, i.e. red, particles have left the original simulation box. The original density, if the blue image
particles are included in the count, has not changed
Imagine 18 g of water - roughly 2.6 cm × 2.6 cm × 2.6 cm in terms of volume. How
many molecules does this much water contain? About $6 \cdot 10^{23}$. There is no computer
yet that can handle this many molecules. We in fact will deal with somewhere around
100 molecules or particles. Nevertheless, we have to trick them into ‘thinking’ that
they are $6 \cdot 10^{23}$. This big a system is a bulk system - a system in which the surfaces
do not affect the properties.
The red particles in Fig. 9.2 are models of real molecules stored in a computer’s
memory. Each particle is the center of a circle with radius $r_{cut}$. A particle does not
‘see’ things outside its circle directly. This means that two particles interact directly
only if they are inside each other’s circles.
The blue particles in Fig. 9.2 are not real molecules. They are not stored any-
where on the computer. They have well defined positions, however, because they
are periodic images of the red particles. The periodicity arises because of the central
square. The latter defines a lattice and every lattice cell contains exactly the same
particles at the exact same positions. In some cases the particles are real (red), but in
most cases they are merely images of real particles (blue). The volume of the central
square, or (primary) simulation box, determines the (number) density of particles,
i.e. molecules. As long as the range of interaction (circle) is less, ideally much less,
than the size of the simulation box, a particle should not notice the small size of its
system. In practice this is not exactly true and we must expect so-called finite size
effects.
When we calculate the path of a red particle in space, we do this based on all
interactions with particles inside its interaction radius, rcut , which we call cutoff
radius. These particles can be red or blue! In the right panel of Fig. 9.2 the last red
particle in the central box interacts with only the blue diamond. The real red diamond,
however, is a long way off. Since we only store the position of real, i.e. red, particles,
we must use the position of the red diamond to determine the distance between the
red star in the central box and the blue diamond in its cutoff radius. This is done,
here for the x-distance, by the following line of computer code:
\[ x_{ij}^{min} = x_{ij} - L\, \mathrm{Round}[x_{ij}/L] \tag{9.24} \]
The quantity $x_{ij}$ is given by $x_{ij} = x_i - x_j$, where $x_i$ is the $x$-coordinate of the red star
and $x_j$ is the $x$-coordinate of the red diamond. $L$ is the length of the primary simulation
box or cell in $x$-direction. Round[a] returns a rounded to the nearest integer. No matter
where the red diamond really is, the magnitude of $x_{ij}^{min}$ is the $x$-distance separating
the red star from the nearest image of the red diamond. Thus, when we calculate the
potential energy of a real particle $i$ in a system, we do this as follows. First we find
every particle, $j$ ($\neq i$), the real $j$ or one of its images, inside the cutoff radius of $i$
using (9.24). That is the condition $r_{cut} > r_{ij}^{min}$, with
\[ (r_{ij}^{min})^2 = (x_{ij}^{min})^2 + (y_{ij}^{min})^2 + (z_{ij}^{min})^2 , \tag{9.25} \]
must be satisfied. Subsequently we compute the pair-potential energies based on
these distances $r_{ij}^{min}$. Likewise we proceed in the case of forces. Equation (9.24) is
also known as minimum image convention. Using the minimum image convention the
real particles are free to move wherever they like, while the density remains constant.
The simulation box mimics a bulk system. One last note. Some simulations require
that the density is variable, i.e. the pressure is constant instead of the volume. We
can handle constant pressure by varying L and still use (9.24) as described.
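In a program the minimum image convention (9.24) and the distance (9.25) may look as follows. This is a Python sketch; the function names are chosen for this illustration, and Python's built-in `round` plays the role of Round[ ]. (For arguments exactly halfway between two integers `round` picks the even neighbor; both images are then equally close, so the magnitude of the result is unaffected.)

```python
import math

def minimum_image(xij, L):
    """Equation (9.24): component of the separation to the nearest image."""
    return xij - L * round(xij / L)

def minimum_image_distance(ri, rj, L):
    """Equation (9.25): distance between particle i and the nearest image of j.

    ri and rj are sequences of Cartesian coordinates; L is the box length,
    assumed here to be the same in all directions.
    """
    d2 = 0.0
    for a, b in zip(ri, rj):
        d = minimum_image(a - b, L)
        d2 += d * d
    return math.sqrt(d2)
```

For a box of length `L = 10`, two particles at `x = 9.5` and `x = 0.5` have `xij = 9`, but `minimum_image(9.0, 10.0)` returns `-1.0`: the nearest image of the second particle sits just across the box boundary.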
Error Calculation:
There are many types of errors. Some errors are simply mistakes. Others are due
to numerical inaccuracies of the simulation algorithm. Still others have to do with
insufficient equilibration. Here we focus on statistical errors.
Every simulation algorithm produces a long sequence of numbers. One such
sequence may be the x-position of a particle, another one the potential energy or
the temperature. Whatever it is, we call this quantity $A$ and its values $A_i$. The sample
average of $A$ is
\[ \bar A = \frac{1}{K} \sum_{i=1}^{K} A_i \tag{9.26} \]
and
\[ s_A^2 = \overline{A^2} - \bar A^2 \tag{9.27} \]
is the sample variance. Figure 9.3 depicts a mock series of data points. After an
equilibration phase, which may look different from the more or less monotonous
increase shown here, the data points finally form a ‘plateau’. Only the data in this
plateau are used for analysis.
Fig. 9.3 Mock series of data points $A_i$ including an initial equilibration phase followed by equilibrium data. The dashed line separates the data acquired during equilibration from the data used for analysis. The horizontal line is the equilibrium average $\bar A$
Using the central limit theorem (cf. below) one can estimate the likelihood that
the true average value of $A$, i.e. $\langle A \rangle$, satisfies
\[ \bar A - \frac{s_A}{\sqrt{n}} \le \langle A \rangle \le \bar A + \frac{s_A}{\sqrt{n}} . \tag{9.28} \]
This likelihood is 68%. Notice that $n$ is the number of independent $A_i$-values in the
original sample. When presenting sample averages computed from simulated data,
it is useful to include the standard error $\pm s_A/\sqrt{n}$. This is done either in the form of
an error bar, when the sample average, $\bar A$, is a data point in a graph, or, when $\bar A$ is a
number, in the form $\bar A \pm s_A/\sqrt{n}$.
The number of independent values n in a series of stored simulation data can be
determined from the auto-correlation function of A, i.e.
\[ C_A(k) = \frac{\sum_{i=1}^{K-k} (A_i - \bar A)(A_{i+k} - \bar A)}{\sum_{i=1}^{K} (A_i - \bar A)^2} . \tag{9.29} \]
Two examples for $C_A(k)$ are shown in Fig. 9.4. In this particular case $t = \Delta t\, k$, where
$\Delta t = 0.001$ is the timestep in a Molecular Dynamics simulation (cf. below). The two
curves, labeled P and T , are auto-correlation functions for pressure and temperature
obtained in a simulation. Important parameters characterising the simulated system
are the particle number density, ρ = N/V = 0.15, where V is the volume of the
simulation box, the particle number, N = 108, the cutoff radius, rcut = 3, and T =
2.59 (in Lennard-Jones units as explained below). Both auto-correlation functions
have decayed to zero at t ≈ 2, which means k = kc ≈ 2000. This value, i.e. k =
kc , beyond which CA (k) becomes zero -within small fluctuations- can be used to
determine n via
n = K/kc . (9.30)
Notice that in general kc depends on the choice of simulation parameters.
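The quantities (9.26), (9.27), (9.29), and the standard error based on (9.30) are easily computed from a stored data series. The Python sketch below shows one way to do this; the function names are assumptions of this example, and the decay lag `kc` is expected to be read off from the auto-correlation function by the user.

```python
import math

def sample_mean(a):
    """Equation (9.26)."""
    return sum(a) / len(a)

def sample_variance(a):
    """Equation (9.27): mean of the squares minus the square of the mean."""
    mean = sample_mean(a)
    return sum(x * x for x in a) / len(a) - mean * mean

def autocorrelation(a, k):
    """Equation (9.29) for lag k (0 <= k < len(a))."""
    mean = sample_mean(a)
    num = sum((a[i] - mean) * (a[i + k] - mean) for i in range(len(a) - k))
    den = sum((x - mean) ** 2 for x in a)
    return num / den

def standard_error(a, kc):
    """s_A / sqrt(n) with n = K / kc independent values, cf. (9.28) and (9.30)."""
    n = len(a) / kc
    return math.sqrt(sample_variance(a)) / math.sqrt(n)
```

A typical workflow is to evaluate `autocorrelation(a, k)` for increasing `k`, identify the lag `kc` beyond which it fluctuates around zero, and then report `sample_mean(a)` together with `standard_error(a, kc)`.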

Fig. 9.4 Examples of auto-correlation functions obtained for pressure and temperature data extracted from Molecular Dynamics simulations of a Lennard-Jones gas. Main panel: $C(t)$ versus $t$ (logarithmic $t$-axis) for the curves labeled $P$ and $T$. Inset: $C_T(t)$ versus $t$ for $\Delta t = 0.01$ and $\Delta t = 0.001$
The auto-correlation function also allows one to spot certain types of systematic errors,
i.e. ‘drifts’ in the data. The inset in Fig. 9.4 shows $C_T$ for two independent Molecular
Dynamics simulations of different precision. The $C_T$ obtained for the longer timestep,
$\Delta t = 0.01$, does not decay to zero. Due to numerical error the temperature does not
fluctuate around a constant value, which gives rise to a non-vanishing correlation.
Remark: What is the justification for the above statement that the true average is
inside the bounds specified in (9.28) with 68% likelihood? We can show this using
an important mathematical theorem - the central limit theorem. This theorem states
that if the $A_i$ are independent random variables with average $\langle A \rangle$ and variance $\sigma_A^2$, then the new random
variable,
\[ S_n = \frac{\sum_{i=1}^{n} A_i - n \langle A \rangle}{\sigma_A \sqrt{n}} , \tag{9.31} \]
possesses the probability density
\[ f(S_n) = \frac{1}{\sqrt{2\pi}} \exp[-S_n^2/2] , \tag{9.32} \]
in the limit $n \to \infty$ (details can be found for instance in [3]). However, this remains
valid to very good approximation even if $n$ is not very large. The probability, $p(-1 \le S_n \le 1)$, for finding an $S_n$-value in the range between $-1$ and $1$ is $p(-1 \le S_n \le 1) = \int_{-1}^{1} f(S_n)\, dS_n \approx 0.68$. Thus
\[ p\left( -1 \le \frac{\sum_{i=1}^{n} A_i - n \langle A \rangle}{\sigma_A \sqrt{n}} \le 1 \right) \approx 0.68 \tag{9.33} \]
or
\[ p\left( \langle A \rangle - \sigma_A/\sqrt{n} \le \bar A \le \langle A \rangle + \sigma_A/\sqrt{n} \right) \approx 0.68 . \tag{9.34} \]
Replacing $\sigma_A$ by the standard deviation of the sample, $s_A$, which only introduces a
small error, completes our justification of the above statement.
Lennard-Jones Interactions:
This short introduction to MD focuses on simple gases and liquids. ‘Simple’
means that the interactions between the molecules, or atoms in the case of noble
gases, may be described via simple potential functions like the Lennard-Jones (LJ)
potential discussed in Problem 4 (p. 59). The two parameters $\epsilon$ and $\sigma$ are a typical
energy and a characteristic linear dimension of the molecules, respectively. Here we
deal with pure systems only, i.e. all pairs of molecules in our simulations have the
same $\epsilon$ and $\sigma$. This makes it convenient and useful to measure all energies in units
of $\epsilon$ and all lengths in units of $\sigma$. Thus, we do not use (2.60) as it stands but rather
\[ u(r) = 4 \left( r^{-12} - r^{-6} \right) , \tag{9.35} \]
i.e. your MD simulation program should never contain $\epsilon$ and/or $\sigma$ explicitly!

At first glance this may seem as if we study just one very particular system. This
is correct, but there is what is called the law of corresponding states. This law is not
a strict law. It rather is an approximation - albeit a very good one for many fluids of
small molecules. It means that if we have done all simulations in these units, the LJ
units which we discuss in more detail below, then we can, if we know $\epsilon$ and $\sigma$ for a
particular kind of molecule, map our LJ results onto the real system. In particular,
this allows the comparison of the simulation results to corresponding experiments
for this molecular system.
Above we have introduced the cutoff radius, rcut . Interactions of molecules beyond
rcut will be neglected. Looking at the LJ potential in Problem 4, this seems to be
reasonable if for instance rcut /σ ≥ 3. But how much of the total (potential) energy
do we actually neglect? The answer is

\[ U_{lrc} = \sum_{i<j,\; r_{ij}^{min} > r_{cut}} u_{LJ}(r_{ij}) \approx \frac{N(N-1)}{2} \frac{1}{V} \int_{r_{cut}}^{\infty} 4\pi r^2\, dr\, u_{LJ}(r) \approx -\frac{8}{3} \pi \rho\, \sigma^3 \left( \frac{\sigma}{r_{cut}} \right)^3 \epsilon N . \tag{9.36} \]
In the last step $N - 1 \approx N$ is used and the small contribution of the $r^{-12}$ term is neglected.
You can understand this formula better if you set $u_{LJ}(r) = \epsilon$, i.e. all particle pairs
have the same potential energy independent of their separation. In this case the
integral is equal to $V$, except for a small hole with the radius $r_{cut}$, which cancels the
factor $1/V$. The remaining number, $N(N - 1)/2$, is the number of distinct pairs of
particles. Finally, $u_{LJ}(r)$ ($\neq \epsilon$) accounts for the distance dependence of the interaction
between pairs. This so-called long range correction should be small compared to the
total potential energy in your system. In LJ systems $r_{cut}/\sigma = 3$ usually meets this
condition.
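The quality of the last approximation in (9.36) is easy to check numerically. The following Python sketch compares the integral expression with the closed formula; the function names, the midpoint rule, and the finite upper integration limit `r_max` are assumptions made for this illustration.

```python
import math

def u_lj(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair potential 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def u_lrc_numeric(N, rho, r_cut, eps=1.0, sigma=1.0, r_max=200.0, n=200_000):
    """Middle expression of (9.36); the integral is truncated at r_max."""
    V = N / rho
    dr = (r_max - r_cut) / n
    integral = 0.0
    for i in range(n):
        r = r_cut + (i + 0.5) * dr          # midpoint of the i-th interval
        integral += 4.0 * math.pi * r * r * u_lj(r, eps, sigma) * dr
    return 0.5 * N * (N - 1) / V * integral

def u_lrc_estimate(N, rho, r_cut, eps=1.0, sigma=1.0):
    """Right hand side of (9.36)."""
    return -(8.0 / 3.0) * math.pi * rho * sigma ** 3 * (sigma / r_cut) ** 3 * eps * N
```

For the parameters quoted above ($N = 108$, $\rho = 0.15$, $r_{cut} = 3$ in LJ units) both functions should agree to within about one percent; the remaining difference stems from $N - 1 \approx N$ and the neglected $r^{-12}$ term.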
Fig. 9.5 Relation between

inter-particle potential and
the ordering imposed by the
shaded particle on its
neighborhood. The validity
of (9.36) is based on the
assumption that the density
variations beyond rcut are
negligible. Notice that
rcut /σ = 3 essentially means
that this should be true after
the third neighbor shell
If necessary you can add Ulrc to your potential energy as a correction. However be
careful. You may think that if this is always possible, you can make rcut really small,
e.g. rcut /σ ≈ 1, and thereby reduce the computational effort. This is not correct! The
first ≈ in (9.36) means that we can neglect structural ordering beyond $r_{cut}$. Structural
ordering is caused by every particle’s presence, because it imposes a distance
constraint on its neighbors and, depending on density, on the next-nearest neighbors
as well. A pictorial illustration of these spatial correlations is shown in Fig. 9.5. In
order for (9.36) to work, $r_{cut}$ must be sufficiently large, i.e. the presence of the central
particle, relative to which we measure $r_{cut}$, no longer influences the position of
particles beyond $r_{cut}$.
Notice that our approach amounts to an approximation of the interactions between
particles in a simulation by pairwise interactions. The total interaction energy, for
instance, is

\[ U = \sum_{i<j,\; r_{ij}^{min} \le r_{cut}} u_{ij} + U_{lrc} . \tag{9.37} \]
Similarly the total force on particle $i$ is given by
\[ \mathbf{F}_i = \sum_{j (\neq i),\; r_{ij}^{min} \le r_{cut}} \mathbf{f}_{ij} . \tag{9.38} \]
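The pair sums (9.37) and (9.38) can be sketched in Python as follows. The code works in LJ units with the potential (9.35), applies the minimum image convention (9.24) in a cubic box, and obtains the pair force by differentiating (9.35) with respect to the position of particle $i$. The function name and the data layout (a list of coordinate triples) are assumptions of this sketch, and the long range correction $U_{lrc}$ is not included.

```python
import math

def lj_energy_and_forces(positions, L, r_cut=3.0):
    """Pair energy (9.37) without U_lrc and forces (9.38) in LJ units.

    positions: list of (x, y, z) coordinates of the real particles.
    L: edge length of the cubic simulation box.
    """
    N = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(N)]
    energy = 0.0
    for i in range(N - 1):
        for j in range(i + 1, N):
            d = [positions[i][a] - positions[j][a] for a in range(3)]
            d = [x - L * round(x / L) for x in d]            # (9.24)
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2]     # (9.25)
            if r2 > r_cut * r_cut:
                continue                                      # outside the cutoff
            inv6 = 1.0 / r2 ** 3                              # r^-6
            energy += 4.0 * (inv6 * inv6 - inv6)              # (9.35)
            # -(du/dr) / r = 24 (2 r^-14 - r^-8); times d gives the force on i
            f_over_r = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2
            for a in range(3):
                forces[i][a] += f_over_r * d[a]
                forces[j][a] -= f_over_r * d[a]               # actio = reactio
    return energy, forces
```

The double loop visits every pair once, which is adequate for the small particle numbers considered here. For two particles separated by $2^{1/6}$ the function should return the energy $-1$ and vanishing forces, because this separation is the minimum of (9.35).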
Comparing to the Real World:

As we already have mentioned, energy is measured in units of $\epsilon$ and length is
measured in units of $\sigma$. In the following we intend to solve Newton’s equation of
motion for every particle $i$ in the system, i.e.
\[ m \frac{d^2 \mathbf{r}_i}{dt^2} = \mathbf{F}_i , \tag{9.39} \]
where $m$ is the mass of the particles. We now convert this equation to its dimensionless
form (cf. Problem 45), which we then can implement on the computer:
\[ \frac{m\sigma}{\tau^2} \frac{d^2 \mathbf{r}_{LJ,i}}{dt_{LJ}^2} = \frac{\epsilon}{\sigma}\, \mathbf{F}_{LJ,i} . \tag{9.40} \]
Setting
\[ \tau = \sqrt{\frac{m\sigma^2}{\epsilon}} , \tag{9.41} \]
we find the desired equation of motion
\[ \frac{d^2 \mathbf{r}_{LJ,i}}{dt_{LJ}^2} = \mathbf{F}_{LJ,i} . \tag{9.42} \]
The quantity τ , as defined by (9.41), is the unit of time in our simulations. Notice
that the index LJ here replaces the asterisk in Problem 45, i.e. quantities with this
index are dimensionless.
There is one important quantity, which we have not yet encountered. This quantity
is the temperature, T . Temperature, on the molecular level, is related to the kinetic
energy of the molecules (cf. below). The quantity kB T , where kB is Boltzmann’s
constant, is the attendant typical thermal energy. Because we measure energy in
units of $\epsilon$, we can define the dimensionless temperature
\[ T_{LJ} = \frac{k_B T}{\epsilon} . \tag{9.43} \]
Simpler but analogously we can write
\[ \rho_{LJ} = \sigma^3 \rho . \tag{9.44} \]
If ρ = N/V is the number density of molecules, then ρLJ is an attendant dimension-

less number density.
Let us assume that someone supplies us with special values for temperature and
number density, i.e. $T_{c,CH_4} = 190.6$ K and $\rho_{c,CH_4} = 0.00612$ Å$^{-3}$ - the position of the
critical point of methane (CH$_4$) in a temperature-density-phase diagram of methane.
Someone else, who already has completed a simulation study of the same phase
diagram for particles interacting via (9.35), supplies us with his location of the LJ
critical point close to $T_{c,LJ} = 1.32$ and $\rho_{c,LJ} = 0.3$ (cf. Sect. 6 of [4]). We can then
use (9.43) and (9.44), i.e. $T_{c,LJ} = k_B T_{c,CH_4}/\epsilon_{CH_4}$ and $\rho_{c,LJ} = \sigma_{CH_4}^3 \rho_{c,CH_4}$, to extract
$\epsilon_{CH_4}/k_B \approx 141$ K and $\sigma_{CH_4} \approx 3.7$ Å. In the case of a different molecule, i.e. molecule
X, we obtain $\epsilon_X$ and $\sigma_X$ instead. We do not have to use the critical point. Alternatively
we can use other special points in a phase diagram or different experimental
data altogether. However, the resulting parameter values will differ to some extent,
because the Lennard-Jones potential is just a simple approximation of molecular
interaction.
Table 9.1 Conversion to and from LJ systems
Real quantity (in units of ε, σ, τ)    LJ quantity
$U/\epsilon$                            $U_{LJ}$
$k_B T/\epsilon$                        $T_{LJ}$
$\sigma^3 \rho$                         $\rho_{LJ}$
$\sigma F/\epsilon$                     $F_{LJ}$
$t/\tau$                                $t_{LJ}$
In order to get a feeling for the magnitude of $\tau$, we again consider a methane
molecule. Its mass is $m_{CH_4} = 16$ amu (atomic mass units). Inserting this value and
the above values for $\epsilon$ and $\sigma$ into (9.41) we find $\tau_{CH_4} \approx 1.4 \cdot 10^{-12}$ s! During this time
a methane molecule in a gas, whose temperature is $T = 300$ K, travels the average
distance $\sqrt{k_B T/m_{CH_4}}\, \tau_{CH_4} \approx \sigma_{CH_4}$. Notice that the square root is the thermal velocity.
During a collision of two methane molecules, we must calculate the forces between
them in steps much smaller than $\tau_{CH_4}$. This is because the potential energy, and
thus the force of interaction, may be very different, even when the intermolecular
distance changes by only a fraction of $\sigma$ (cf. Problem 4). A reasonable timestep in an
MD simulation of an LJ system is $0.001\tau$. Thus, we not only have few particles in our
simulation boxes, we can follow their dynamics only for a very brief time interval
by macroscopic standards.
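The conversion between LJ and real units via (9.41), (9.43), and (9.44) is simple arithmetic, which the following Python sketch carries out for the methane example. The function names are hypothetical, and the numerical values of Boltzmann's constant and the atomic mass unit are standard SI values inserted by hand for this illustration.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant in J/K
AMU = 1.66053907e-27     # atomic mass unit in kg
ANGSTROM = 1.0e-10       # 1 Angstrom in m

def lj_parameters_from_critical_point(T_c, rho_c, T_c_lj=1.32, rho_c_lj=0.3):
    """Return (eps/k_B in K, sigma in Angstrom) from (9.43) and (9.44).

    T_c in K and rho_c in 1/Angstrom^3 are the experimental critical values;
    T_c_lj and rho_c_lj are the LJ critical values quoted in the text.
    """
    eps_over_kB = T_c / T_c_lj
    sigma = (rho_c_lj / rho_c) ** (1.0 / 3.0)
    return eps_over_kB, sigma

def lj_time_unit(mass_amu, sigma_angstrom, eps_over_kB):
    """Equation (9.41): tau = sqrt(m sigma^2 / eps) in seconds."""
    m = mass_amu * AMU
    sigma = sigma_angstrom * ANGSTROM
    eps = eps_over_kB * K_B
    return math.sqrt(m * sigma * sigma / eps)
```

Calling `lj_parameters_from_critical_point(190.6, 0.00612)` and passing the result together with the mass 16 amu to `lj_time_unit` yields values close to those given in the text, with $\tau$ on the order of a picosecond.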
From now on we no longer use the index LJ to indicate (dimensionless) LJ quan-
tities. Unless explicitly stated otherwise all quantities in the remainder of this section
are LJ quantities! (Table 9.1).
Theoretical Background:
In an MD simulation we numerically solve Newton’s equations of motion for
a many-particle system. Particles may be atoms or molecules. This is remarkable,
because we may have heard already that this scale is governed by quantum mechanics.
We are dealing with gases and liquids, for which classical mechanics works quite
well under the condition
\[ \rho^{-1/3} \gg \Lambda_T \tag{9.45} \]
(cf. Sect. 5.2 in [5]). Here $\rho = N/V$ is the number density and $\Lambda_T = \sqrt{h^2/(2\pi m k_B T)}$, where $h$ is Planck’s constant, $m$ is the particle mass, and $k_B$ is Boltzmann’s
constant, is the thermal wavelength. Notice that (9.45) is one of the exceptions
from our above rule regarding LJ quantities. Notice also that $\Lambda_T \approx 17.5$ Å$/\sqrt{mT}$,
where we use atomic mass units and Kelvin for the temperature. Using again methane
as an example, we obtain $\Lambda_T \approx 0.32$ Å close to its critical point. The average center
of mass separation, i.e. $\rho^{-1/3}$, between methane molecules at the critical point is
≈5.5 Å. This means that in the vicinity of the critical point the above condition is
satisfied.
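The numbers in this estimate can be reproduced with a few lines of Python. The sketch below is an illustration; the function names are hypothetical and the values of the physical constants are standard SI values inserted by hand.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant in J s
K_B = 1.380649e-23          # Boltzmann constant in J/K
AMU = 1.66053907e-27        # atomic mass unit in kg

def thermal_wavelength_angstrom(mass_amu, T):
    """Thermal wavelength h / sqrt(2 pi m k_B T) in Angstrom (m in amu, T in K)."""
    lam = H_PLANCK / math.sqrt(2.0 * math.pi * mass_amu * AMU * K_B * T)
    return lam * 1.0e10

def mean_separation_angstrom(rho_per_A3):
    """Average particle separation rho^(-1/3) for a number density in 1/Angstrom^3."""
    return rho_per_A3 ** (-1.0 / 3.0)
```

`thermal_wavelength_angstrom(1.0, 1.0)` gives the prefactor of the rule of thumb quoted above. For methane at its critical point, `thermal_wavelength_angstrom(16.0, 190.6)` is more than an order of magnitude smaller than `mean_separation_angstrom(0.00612)`.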
The result of an MD simulation is the so-called trajectory, i.e. a file containing the
particle coordinates, $q_i$, and momenta, $p_i$, collected over $k = 1$ to $K$ timesteps, $\Delta t$:
\[ \{ q_i(t_k), p_i(t_k) \}_{i=1,\,k=1}^{3N,\,K} . \tag{9.46} \]
With this information we can estimate the (time) average of quantity $A$, i.e.
\[ \langle A \rangle = \lim_{t \to \infty} \frac{1}{t} \int_0^t dt'\, A(\{q(t'), p(t')\}, t') , \tag{9.47} \]
which may depend on either the coordinates or the momenta or on both (cf. (9.26)).
Implementation:
A simple integration algorithm for (9.42) can be constructed as follows. We start
with the series expansions of the position vector of particle i, $\vec r_i(t)$, and its velocity,
$\vec v_i(t)$:
\[ \vec r_i(t+\Delta t) = \vec r_i(t) + \Delta t\,\vec v_i(t) + \frac{1}{2}\Delta t^2\,\vec F_i(t) + O(\Delta t^3) \tag{9.48} \]
\[ \vec v_i\Big(t+\frac{\Delta t}{2}\Big) = \vec v_i(t) + \frac{\Delta t}{2}\,\vec F_i(t) + O(\Delta t^2) \tag{9.49} \]
\[ \vec v_i(t+\Delta t) = \vec v_i\Big(t+\frac{\Delta t}{2}\Big) + \frac{\Delta t}{2}\underbrace{\vec F_i\Big(t+\frac{\Delta t}{2}\Big)}_{=\vec F_i(t+\Delta t)+O(\Delta t)} + O(\Delta t^2) . \tag{9.50} \]
Adding the last two equations, we obtain the final algorithm:
\[ \vec r_i(t+\Delta t) \approx \vec r_i(t) + \Delta t\,\vec v_i(t) + \frac{1}{2}\Delta t^2\,\vec F_i(t) \tag{9.51} \]
\[ \vec v_i(t+\Delta t) \approx \vec v_i(t) + \frac{\Delta t}{2}\Big[\vec F_i(t+\Delta t) + \vec F_i(t)\Big] . \tag{9.52} \]
The first line advances the position; we shall call its implementation MOVER.
The second line advances the velocity; we shall call its implementation
MOVEV. By repeating MOVER and MOVEV we are able to collect the system’s
trajectory. The entries in our trajectory list are separated by the timestep $\Delta t$. A large
$\Delta t$ allows us to follow the system’s dynamics over a longer time. However, a large $\Delta t$
also means a large numerical error (cf. (9.48)–(9.50)). Thus, the actual timestep is a
compromise, allowing a long trajectory with ‘acceptable’ error. Usually it is a good
idea to vary the timestep and compare the attendant simulation results. A reasonable
timestep for LJ-simulations is $\Delta t = 0.001$ (cf. above).
Notice that we need only one force evaluation per timestep $\Delta t$. The x-component
of the force is given by
\[ F_{i,x} = -\frac{d}{dx_i}\sum_{j(\neq i)=1}^{N} u_{ij} = \sum_{j(\neq i)=1}^{N} f_{ij,x} . \tag{9.53} \]
A straightforward derivative yields
\[ f_{ij,x} = 24\left[2\left(r_{ij}^{min}\right)^{-13} - \left(r_{ij}^{min}\right)^{-7}\right]\frac{x_{ij}^{min}}{r_{ij}^{min}} . \tag{9.54} \]
Notice that this is the x-component of the force on particle i exerted by the real
particle j or its nearest image (the force curve in the figure on p. 60 corresponds
to $24\,(2r^{-13} - r^{-7})$). If for instance particle i is located at the origin and j (real or
image) is far off along the positive x-axis, then $f_{ij,x}$ is positive, i.e. particle i is pulled
towards j. In order to work out the other two components of the force we merely
replace x by y and z.
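The pair force (9.54) together with the nearest-image rule can be sketched as follows. The function names, the cubic box of edge length `box`, and the sign convention $x_{ij} = x_i - x_j$ are our assumptions; the latter is chosen so that the sign discussion above comes out as stated.

```python
import math

def minimum_image(d, box):
    """Map a coordinate difference d onto its nearest periodic image (cubic box)."""
    return d - box * round(d / box)

def lj_pair_force(ri, rj, box, rcut):
    """Force on particle i exerted by j (or j's nearest image) in LJ units.

    Based on u(r) = 4 (r^-12 - r^-6), i.e. f = 24 (2 r^-13 - r^-7) * r_vec / r,
    where r_vec = ri - rj (assumed sign convention).
    """
    d = [minimum_image(a - b, box) for a, b in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in d))
    if r >= rcut:
        return [0.0, 0.0, 0.0]           # beyond the cutoff: no interaction
    magnitude = 24.0 * (2.0 * r ** -13 - r ** -7)
    return [magnitude * c / r for c in d]

# Particle i at the origin, j far off along the positive x-axis: i is pulled towards j.
f_far = lj_pair_force([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], box=10.0, rcut=3.0)
print("f_ij,x =", f_far[0])
```

A production code would avoid the square root and the repeated powers, but the form above stays close to (9.54).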
NVE MD Code:
The core of an MD program looks like this:

Main
  . . . generate initial configuration . . .
  FORCE(k = 1)
  Do k = 2, NSTEP
    MOVER(k − 1)
    FORCE(k)
    MOVEV(k − 1)
  End k
  . . . output . . .

The index k is the timestep index. The program integrates the equations of motion
for NSTEP timesteps. The three subroutines FORCE, MOVER, and MOVEV do the
following:

FORCE(k)
  Fi = 0 for all i (U = 0)
  Do i = 1, N − 1
    Do j = i + 1, N
      If rijmin < rcut Then
        Fi += fij
        Fj += −fij
        (U += uij)
      End If
    End j
  End i

MOVER(k)
  Do i = 1, N
    ri(k + 1) = ri(k) + Δt vi(k) + (1/2) Δt² Fi(k)
  End i

MOVEV(k)
  Do i = 1, N
    vi(k + 1) = vi(k) + (1/2) Δt [Fi(k + 1) + Fi(k)]
  End i

Notice that += means that the quantity on the left is incremented by the quantity
on the right, which is why the forces are set to zero at the beginning of FORCE.
Notice also that the above is a shorthand notation omitting vectors. MD
does not use the potential directly, this is why the calculation of the total potential
energy U is optional.
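The loop structure above translates almost line by line into Python. The following is a minimal sketch, not the program from the appendix: the function `velocity_verlet` and its arguments are our own names, unit masses are assumed (as in LJ units), and a harmonic force stands in for the LJ force so that the result can be compared with the exact solution.

```python
import math

def velocity_verlet(r, v, force, dt, nstep):
    """Integrate unit-mass equations of motion; returns a list of (r, v) snapshots."""
    f = force(r)                                  # FORCE(k = 1)
    trajectory = [(list(r), list(v))]
    for _ in range(nstep):
        # MOVER: r(k+1) = r(k) + dt v(k) + dt^2/2 F(k)
        r = [x + dt * u + 0.5 * dt * dt * a for x, u, a in zip(r, v, f)]
        f_new = force(r)                          # FORCE(k+1), one evaluation per step
        # MOVEV: v(k+1) = v(k) + dt/2 [F(k+1) + F(k)]
        v = [u + 0.5 * dt * (a_new + a) for u, a_new, a in zip(v, f_new, f)]
        f = f_new
        trajectory.append((list(r), list(v)))
    return trajectory

def harmonic_force(r):
    """Toy force F = -r (assumed test case, not the LJ force)."""
    return [-x for x in r]

dt, nstep = 0.001, 10000
traj = velocity_verlet([1.0], [0.0], harmonic_force, dt, nstep)
energies = [0.5 * v[0] ** 2 + 0.5 * r[0] ** 2 for r, v in traj]
energy_drift = max(abs(e - energies[0]) for e in energies)
exact_position = math.cos(dt * nstep)             # exact solution for r(0)=1, v(0)=0
print("max energy deviation:", energy_drift)
```

Repeating the run with a larger `dt` shows how the energy error grows, which is the compromise discussed above.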
The line . . . generate initial configuration . . . requires some discussion. Initially
the particles should be placed on the nodes of a (cubic) lattice satisfying the required
density. This is superior to random placement, because the latter may create overlap-
ping particles or close contacts. After the random assignment of the initial velocities,
it may be necessary to reset the center of mass velocity,
\[ \vec v_{CM} = \frac{1}{N}\sum_{i=1}^{N} \vec v_i , \tag{9.55} \]
to zero. This is done in the single loop
Do i = 1, N
vi = vi − vCM
End i
Otherwise the unphysical translation of the center of mass can cause problems, e.g.
a wrong temperature.
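A possible realization of the step . . . generate initial configuration . . . is sketched below. The function names, the uniform distribution used for the random velocities, and the choice of 27 particles at density 0.8 are illustrative assumptions only.

```python
import random

def simple_cubic_lattice(n_side, density):
    """Place n_side**3 particles on a simple cubic lattice at the given number density."""
    n = n_side ** 3
    box = (n / density) ** (1.0 / 3.0)      # box edge length from rho = N / V
    a = box / n_side                        # lattice constant
    positions = [[(ix + 0.5) * a, (iy + 0.5) * a, (iz + 0.5) * a]
                 for ix in range(n_side)
                 for iy in range(n_side)
                 for iz in range(n_side)]
    return positions, box

def random_velocities(n, scale, rng):
    """Random initial velocities with the center of mass velocity (9.55) reset to zero."""
    v = [[rng.uniform(-scale, scale) for _ in range(3)] for _ in range(n)]
    v_cm = [sum(vi[c] for vi in v) / n for c in range(3)]
    return [[vi[c] - v_cm[c] for c in range(3)] for vi in v]

positions, box = simple_cubic_lattice(n_side=3, density=0.8)
velocities = random_velocities(len(positions), scale=1.0, rng=random.Random(0))
v_cm_after = [sum(vi[c] for vi in velocities) / len(velocities) for c in range(3)]
print("N =", len(positions), "box =", box, "v_CM =", v_cm_after)
```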
An example MD program is included in Appendix B. The first figure shows the
initial particle positions on a simple cubic lattice inside the primary simulation box.
The next figure includes the particles’ paths in space (for selected particles only). The
density in this case is rather high, which leads to numerous collisions and attendant
direction changes. The third and last figure shows the instantaneous temperature in
the system.
Adjusting Temperature:
The particle system modelled by the aforementioned code is supplied with a

certain amount of energy in the program part called . . . generate initial configuration
. . . . The particles are assigned random initial velocities, which means that they
possess kinetic energy. Due to their initial positions in close proximity to each other
they also possess potential energy. The total energy, i.e. E = K + U, remains constant
(sidestepping the issue of numerical errors), even though K and U are variable. We
therefore model what is called an NVE system, i.e. the particle number, N, the
volume, V , and the total energy, E, are constant.
As already mentioned, the system will undergo a transient period called equilibration.
Afterwards the instantaneous temperature,
\[ T(k) = \frac{1}{3N}\sum_{i=1}^{N} v_i^2(k) , \tag{9.56} \]
fluctuates around a constant average value, $\bar T$, as shown in the last figure in the
appendix. The sample average $\bar T$ is an estimate of the equilibrium temperature in the
system (cf. (9.28)). Notice that the above equation is based on $K(k) = \frac{3}{2}NT(k)$ (cf.
(9.88)).
Our current approach of supplying a more or less unknown initial energy to the
system makes it difficult to adjust its temperature to a particular value. A simple
method for temperature adjustment is the heat-flux approach. We assume that the
heat flux, $J_Q$, leads to the following change of the instantaneous kinetic energy
between the timesteps k and k + 1:
\[ J_Q = \frac{\Delta Q}{\Delta t} = \frac{1}{\Delta t}\sum_{i=1}^{N}\frac{1}{2}v_i^2(k)\,(\lambda^2-1) = \frac{1}{\Delta t}\,\frac{3}{2}NT(k)\,(\lambda^2-1) . \tag{9.57} \]
This $\Delta Q$ corresponds to the velocity rescaling
\[ \vec v_i(k) \to \lambda\,\vec v_i(k) \tag{9.58} \]
following every timestep and for every particle. λ is slightly larger or smaller than
one, depending on whether heat is flowing into or leaving the system.
If $J_Q$ depends linearly on the difference between the instantaneous temperature,
T(k), and the target temperature, $T_B$, i.e.
\[ J_Q = \alpha_T\,(T_B - T(k)) , \tag{9.59} \]
where $\alpha_T$ is a constant, then the combination of (9.57) and (9.59) yields
\[ \lambda = \sqrt{1 + \frac{2\Delta t}{\tau_T}\Big(\frac{T_B}{T(k)} - 1\Big)} \approx 1 + \frac{\Delta t}{\tau_T}\Big(\frac{T_B}{T(k)} - 1\Big) . \tag{9.60} \]
Here $\tau_T^{-1} = \alpha_T/(3N)$ is another constant. $\Delta t/\tau_T$ should be small, so that the continuous
rescaling of the velocities (9.58) leads to a gradual approach of T(k) towards $T_B$.
When the system’s temperature has reached the desired value, the velocity scaling
is either turned off (NVE system!) or continued using a large value of $\tau_T$, in order to
perturb the system as little as possible.
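The effect of the rescaling (9.58) with λ from (9.60) can be tried out in isolation. In the sketch below there is no dynamics at all; we merely rescale a set of random velocities repeatedly. The function names and all numerical values (100 particles, target temperature 2, Δt = 0.001, τ_T = 0.1) are assumptions made for illustration.

```python
import math
import random

def instantaneous_temperature(v):
    """T(k) = (1/3N) sum_i v_i^2 in LJ units, cf. (9.56)."""
    return sum(c * c for vi in v for c in vi) / (3.0 * len(v))

def rescale(v, t_target, dt, tau):
    """Apply (9.58) once, using the square-root form of lambda in (9.60)."""
    t_now = instantaneous_temperature(v)
    lam = math.sqrt(1.0 + (2.0 * dt / tau) * (t_target / t_now - 1.0))
    return [[lam * c for c in vi] for vi in v]

rng = random.Random(1)
v = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(100)]   # T close to 1
t_start = instantaneous_temperature(v)
v_once = rescale(v, t_target=2.0, dt=0.001, tau=0.1)
t_once = instantaneous_temperature(v_once)
for _ in range(5000):                    # repeated rescaling only, no MOVER/MOVEV
    v = rescale(v, t_target=2.0, dt=0.001, tau=0.1)
t_final = instantaneous_temperature(v)
print("T before:", t_start, "after one step:", t_once, "after 5000 steps:", t_final)
```

Since the kinetic energy scales with λ², one rescaling step changes the temperature by (2Δt/τ_T)(T_B − T(k)), i.e. T(k) relaxes exponentially towards T_B.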
9.3 From Mechanics to Statistical Mechanics‡
In this section we focus on the problems that arise when classical mechanics is
applied to systems1 composed of many particles. The trajectory of an individual
particle loses its immediate importance. It is replaced by a statistical analysis of the
trajectory of the entire system in the framework of statistical mechanics. This subject
cannot be covered in one chapter. Here we want to concentrate on selected concepts
relating classical to statistical mechanics in order to prepare the reader for a more
exhaustive introduction to many-body theory.
Phase Space‡:
A momentary state or microstate in canonical mechanics is described by a point
in phase space, i.e. by all 3N coordinates, q1(t), q2(t), …, q3N(t), and their attendant
momenta, p1(t), p2(t), …, p3N(t). The point’s position in phase space depends on
time, t, sweeping out a phase space trajectory. Figure 9.1 shows trajectories of the
mathematical pendulum, depicted in Fig. 2.1, for different initial conditions. Notice
that the trajectories do not intersect,2 because fixing the initial positions and momenta
uniquely determines the motion. Special cases are the (unstable) fixed points at
…, −π, π, ….
Time Averages‡:
As mentioned above, in a many-body or many-particle system the individual
particle loses its special significance. Important are the properties characterizing
the entire system, like its energy, pressure, density, etc. All of these quantities are
obtained by integrations over the phase space.
1 A system is a large box containing a (uniform) mass distribution. Large means that boundary
effects can be neglected. Systems are distinguished according to whether they are isolated systems
(no interaction with the outside world whatsoever), closed systems (energy exchange across the
system’s boundaries is possible) or open systems (energy and mass exchange is possible).
2 However, they may be closed (periodic motion).
Let’s consider the quantity $A(q_1(t), q_2(t), \ldots, q_{3N}(t), p_1(t), p_2(t), \ldots, p_{3N}(t), t)$.
An example is the pressure of a gas at time t. What we measure, however, is an
average, $\bar A(T)$, of this quantity over a time interval of length T, i.e.
\[ \bar A(T) = \frac{1}{T}\int_0^T dt\, A(q_1(t), q_2(t), \ldots, p_{3N}(t), t) . \tag{9.61} \]
In the following we assume that A does not depend on time explicitly. In addition the
system is an isolated system at equilibrium. We are interested in the limit T → ∞,
which allows us to define the time average3
\[ \langle A \rangle_t \equiv \lim_{T\to\infty} \bar A(T) . \tag{9.62} \]
The question to ask is the following: Does this formula allow the calculation
of (macroscopic) equilibrium quantities based on the microscopic coordinates and
momenta in a certain mechanical system? We notice immediately that the application
of (9.62) to systems containing on the order of $10^{23}$ particles is difficult. Only when
we use Molecular Dynamics computer simulations, described in the previous section,
can we calculate Ā for moderate values of N. The exact limit T → ∞, however, is
unattainable.
Aside from the technical problem posed by the integration in (9.62), there is
another conceptual difficulty. The phase space trajectories of the mathematical pen-
dulum in Fig. 9.1 do depend on initial conditions. This means that every attendant
$\langle A \rangle$ also depends on initial conditions. We conjecture that the same might be true
for a macroscopic volume filled with a gas. Thus, the gas pressure might be a func-
tion of the initial conditions as well. Experimentally one finds that this is not true.
Exceptions are systems distinguished by different total energies. Other constants of
motion, not related to the fundamental symmetries of space and time, apparently do
not have an effect! And, returning to the mathematical pendulum, this also applies
to the trajectories in Fig. 9.1. Nevertheless this is a subtle point.
The computational difficulties imposed by $\langle A \rangle$ were greatly reduced by the work of
J.W. Gibbs4 at the end of the 19th century. Even though his method merely bypasses
the second of the above points, it quickly became the foundation of many-body
theory. However, before we introduce Gibbs’ approach to the calculation of $\langle A \rangle$, we
briefly focus on the very different ‘mechanical’ approach of L. Boltzmann,5 which
basically tackles the second point head on.
3 The existence of the right hand side is the subject of Birkhoff’s theorem (cf. [6]).
4 Gibbs, Josiah Willard, American mathematician and physical chemist, *New Haven (Connecticut)
11.2.1839, †New Haven 28.4.1903; he is one of the fathers of modern statistical thermodynamics.
Boltzmann’s Picture‡:
Boltzmann’s conceptual basis is the probability, $f(\vec r, \vec v)\,d^3r\,d^3v$, for finding a molecule
in a dilute gas inside a volume element at the position $\vec r$ possessing a velocity
$\vec v$. He abandons the impossible integration of the equations of motion of all particles
inside a macroscopic system in favor of their statistical description. When the
probability density, $f(\vec r, \vec v)$, is known, then $\langle A \rangle$ may be calculated via
\[ \langle A \rangle_B = \frac{\int A(\vec r, \vec v)\, f(\vec r, \vec v)\, d^3r\, d^3v}{\int f(\vec r, \vec v)\, d^3r\, d^3v} . \tag{9.63} \]
Here $A(\vec r, \vec v)$ is the quantity A expressed in terms of its values at the positions $\vec r$ and $\vec v$
in the respective spaces of position and velocity. Phase or Γ-space is replaced by the
lower dimensional μ-space. Notice that the index B indicates that $\langle A \rangle$ is computed
using Boltzmann’s method. We introduce this distinction, because we want to return
to the question whether $\langle A \rangle_B$ is equal to the time average in (9.62).
In general $f = f(\vec r, \vec v, t)$, i.e. there is an explicit dependence on time, which follows
according to the so-called Boltzmann equation:
\[ \Big(\frac{\partial}{\partial t} + \vec v\cdot\vec\nabla_{\vec r} + \dot{\vec v}\cdot\vec\nabla_{\vec v}\Big) f(\vec r, \vec v, t) = \Big(\frac{\partial f}{\partial t}\Big)_{coll} \tag{9.64} \]
The left hand side is the total derivative of $f(\vec r, \vec v, t)$ with respect to time, whereas
the right hand side describes the reason for this dependence in terms of collisions of
the gas particles. Here we do not discuss this term in detail.6 Instead we make the
following remarks:
• All collisions obey energy and momentum conservation.
• The probability for two molecules, possessing the respective velocities $\vec v_1$ and $\vec v_2$,
to be inside the same volume element at $\vec r$ is the product
\[ f(\vec r, \vec v_1, t)\,d^3r\,d^3v_1\; f(\vec r, \vec v_2, t)\,d^3r\,d^3v_2 . \tag{9.65} \]
Thus, neglecting correlations, which is also called ‘molecular chaos’, destroys
every effect the initial conditions might have.7
5 Boltzmann, Ludwig, Austrian physicist, *Vienna 20.2.1844, †Duino (today Duino–Aurisina, near
Trieste) 5.9.1906; he laid the foundation of statistical mechanics. There is a nice quote in Kerson
Huang’s book on this topic which states: [His] H-theorem opened the door to an understanding of
the macroscopic world on the basis of molecular dynamics.
6 E.g. K. Huang (1963) Statistical Mechanics. Chap. 3.
Based on these conditions Boltzmann was able to obtain the inequality
\[ \frac{dH(t)}{dt} \le 0 \tag{9.66} \]
(which is valid on average!), wherein the H-function is given by8
\[ H(t) = \int d^3v\, f(\vec v, t)\,\ln f(\vec v, t) . \tag{9.67} \]
Relation (9.66) is the famous H-theorem. −H(t), as you will learn, essentially is
the entropy of the gas. The H-theorem is the microscopic version of the second law
of thermodynamics on the molecular level! In the long time limit $f(\vec v, t)$ approaches
the so-called Maxwell distribution9:
\[ f(\vec v) \propto \exp\Big[-\frac{1}{2}\beta m v^2\Big] . \tag{9.68} \]
Here m is the particle mass and
\[ \beta = \frac{1}{k_B T} , \tag{9.69} \]
wherein $k_B$ is Boltzmann’s constant and T is temperature.10

Figure 9.6 shows an example based on an NVE Molecular Dynamics simulation
of a gas containing 108 Lennard-Jones particles (cf. p. 59). The particles’ initial
positions as well as their initial velocities are random. In Fig. 9.6 the H-function is
plotted versus time. Notice that (9.66) is satisfied ‘on average’. This means that (9.66)
is violated by short-lived fluctuations only. The second panel in this figure depicts the
velocity distribution function f (v) calculated from the simulation trajectory after the
H-function has bottomed out. Due to the rather small number of particles we observe
significant scatter of the data. Nevertheless, the solid line is the limiting distribution
in (9.68) using the temperature determined from the equilibrium kinetic energy via
(9.56). The reader is encouraged to repeat this calculation with the MD program in
the appendix.
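Before repeating the full calculation with the MD program, it may help to see how an H-function can be estimated from a list of velocities. The sketch below is not the procedure used for Fig. 9.6; it is a simple histogram estimate for a single velocity component, with m = k_B = 1 assumed. The helper name `h_function`, the bin width, and the sample sizes are our choices. For a Maxwell distribution in one dimension the exact value is H = −(1/2) ln(2πeT), and a non-equilibrium distribution with the same kinetic energy (here a uniform one) should give a larger H.

```python
import math
import random

def h_function(samples, bin_width=0.2):
    """Histogram estimate of H = integral f(v) ln f(v) dv for one velocity component."""
    counts = {}
    for v in samples:
        b = math.floor(v / bin_width)
        counts[b] = counts.get(b, 0) + 1
    n = len(samples)
    h = 0.0
    for c in counts.values():
        f = c / (n * bin_width)               # normalized density inside this bin
        h += f * math.log(f) * bin_width
    return h

rng = random.Random(2)
temperature = 1.0                             # LJ units, m = k_B = 1 (assumed)
n = 20000
maxwell = [rng.gauss(0.0, math.sqrt(temperature)) for _ in range(n)]
a = math.sqrt(3.0 * temperature)              # uniform on [-a, a] has variance T
uniform = [rng.uniform(-a, a) for _ in range(n)]

h_maxwell = h_function(maxwell)
h_uniform = h_function(uniform)
h_exact = -0.5 * math.log(2.0 * math.pi * math.e * temperature)
print("H (Maxwell, sampled):", h_maxwell, " exact:", h_exact)
print("H (uniform, same kinetic energy):", h_uniform)
```

Applied to successive snapshots of an MD trajectory, the same estimator yields a curve of the kind shown in the left panel of Fig. 9.6.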
7 How this can be rationalized despite the deterministic nature of the equations of motion is
discussed in the next section.
8 The irrelevant positions in the argument of f are omitted.
9 Maxwell, James Clerk, British physicist, *Edinburgh 13.6.1831, †Cambridge 5.11.1879; he was
one of the most influential contributors to the development of modern physics.
10 Not to be confused with time.
[Figure 9.6: left panel, H(t) versus time; right panel, $f_i(v_i)$ versus $v_i$]
Fig. 9.6 NVE Molecular Dynamics simulation of a Lennard-Jones gas. All results are in LJ units.
Left: time development of the H-function; right: equilibrium velocity distribution function, f(v),
for the different velocity components, $v_i$ (i = 1, 2, 3). The three types of symbols correspond to
the three spatial directions. The solid line is the Maxwell distribution using the simulated average
temperature based on (9.56)
• Example - Ideal Gas Law (1): The following example illustrates the
calculation of a macroscopic observable in μ-space. The observable here is the
pressure, P, exerted by a dilute gas on the walls of its container. We assume
that the gas molecules interact so weakly that we can neglect their interaction
altogether. This gas is called an ideal gas.
The pressure is defined via P = F/A, where F is the average force exerted on
the wall area, A. A single particle of mass m, colliding with a wall perpendicular to the
x-direction, contributes the amount
\[ f_x = \frac{\Delta p_x}{\Delta t} = \frac{2mv_x}{\Delta t} \]
to the instantaneous force F(t). The quantity $\Delta p_x = 2mv_x$ is the momentum
transfer during the collision occurring in a short time $\Delta t$. During this time
a particle, approaching the wall, traverses a layer parallel to the wall whose
thickness is $\Delta x = \Delta t\,v_x/2$. Inserting this into the above equation yields
\[ f_x = \frac{mv_x^2}{\Delta x} . \]
Hence the total force becomes
\[ F = \int \frac{mv_x^2}{\Delta x}\, n f(v_x)\, dv_x . \]
Here n is the number of particles in the above layer and $f(v_x)$ is the normalized
velocity probability distribution of the x-components. Using $\rho = n/(A\cdot\Delta x)$,
the particle number density in the gas, together with (9.68) yields
\[ \begin{aligned}
P &= m\rho\,\frac{\int_{-\infty}^{\infty} e^{-\beta\frac{m}{2}v_x^2}\, v_x^2\, dv_x}{\int_{-\infty}^{\infty} e^{-\beta\frac{m}{2}v_x^2}\, dv_x} \\
  &= -2\rho\,\frac{d}{d\beta}\ln\int_{-\infty}^{\infty} e^{-\beta\frac{m}{2}v_x^2}\, dv_x \overset{z = v_x\sqrt{\beta m/2}}{=} \rho\,\frac{d}{d\beta}\ln\beta \\
  &= \rho\,\frac{1}{\beta} .
\end{aligned} \]
This is the well known ideal gas law,
\[ PV = Nk_BT , \tag{9.70} \]
derived on the basis of molecular collisions with the container walls.
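The Gaussian integrals in this example are easily checked numerically. The following sketch evaluates the ratio of integrals with a midpoint rule; the function name, the integration range of twelve standard deviations, and the parameter values are assumptions chosen for illustration.

```python
import math

def maxwell_pressure(rho, beta, m, n=20000):
    """P = m rho <v_x^2>, averaged with the weight exp(-beta m v_x^2 / 2) (midpoint rule)."""
    sigma = 1.0 / math.sqrt(beta * m)         # width of the velocity distribution
    vmax = 12.0 * sigma                       # the weight is negligible beyond this
    dv = 2.0 * vmax / n
    num = den = 0.0
    for k in range(n):
        v = -vmax + (k + 0.5) * dv
        w = math.exp(-0.5 * beta * m * v * v)
        num += w * v * v * dv
        den += w * dv
    return m * rho * num / den

rho, beta, m = 0.5, 2.0, 3.0                  # arbitrary test values
p_numeric = maxwell_pressure(rho, beta, m)
print("numerical P:", p_numeric, " rho/beta:", rho / beta)
```

The result does not depend on m, in agreement with P = ρ/β.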
Remark: On p. 150 we have discussed the so called slingshot-effect, focussing on

the momentum transfer between two bodies whose masses are very different. Thus
we may ask what happens to an ideal gas, which is compressed by a piston inside an
insulated cylinder? In this case the molecules collide with a much heavier body, i.e.
the moving piston, from which they receive additional kinetic energy.
The situation is depicted in Fig. 9.7. The piston moves with the velocity $V_p$.
This means that for gas molecules colliding with the piston the momentum transfer
is no longer $\Delta p_x = 2mv_x$ but $\Delta p_x = 2m(v_x + V_p)$ instead (cf. (5.118)). We assume
that the mass of the molecule can be neglected compared to the piston’s mass. Thus
$f_x$ acquires an additional contribution, $f_{x,f} = 2mV_p/\Delta t$. We also assume that $v_x \gg V_p$
and therefore $\Delta t \approx 2\Delta x/v_x$ (as before). Overall the piston experiences an additional
friction force
\[ F_f = A_p\, m\rho\,\frac{\int_0^{\infty} e^{-\beta\frac{m}{2}v_x^2}\, v_x\, dv_x}{\int_0^{\infty} e^{-\beta\frac{m}{2}v_x^2}\, dv_x}\, V_p = A_p\,\rho\,\sqrt{\frac{2mk_BT}{\pi}}\; V_p . \tag{9.71} \]
Fig. 9.7 An ideal gas compressed by a moving piston
Here $A_p$ is the cross section of the cylinder. The work done on the gas, due to this
effect, when the piston has moved a distance δs, is given by
\[ -\delta s\,F_f = -\rho\,\sqrt{\frac{2mk_BT}{\pi}}\; V_p\,\delta V . \tag{9.72} \]
Compared to the attendant work calculated based on the ideal gas law in (9.70), i.e.
$-\rho k_BT\,\delta V$, this contribution is $O(V_p/v)$ only, where v is the average velocity of the
gas molecules.
We conclude this remark with a comment. Already in the context of Molecular
Dynamics simulation we have seen that the average velocity of the molecules in a gas
increases with increasing temperature. Because the above effect increases the velocity
of the gas particles, we may conclude that in the limit of a very slow compression, i.e.
Vp ≈ 0, there is no increase of the gas temperature. This is incorrect however. A slow
compression of an isolated gas, an adiabatic compression, increases the temperature
of the gas. The opposite is true for a slow expansion of the isolated gas, an adiabatic
expansion, which leads to a cooling of the gas. The key quantity responsible here is
the entropy (change), which is addressed elsewhere (e.g. [4]).
Liouville’s Theorem:
In the following we briefly discuss a useful theorem. Imagine a set of infinitely
many independent but otherwise identical mechanical systems.11 Each of them, at
a particular moment, corresponds to a point in phase space. Inside a certain phase
space volume element these points occur with the density ρ(t) at time t. The density
depends on position, i.e.
\[ \rho = \rho(q_1, q_2, \ldots, q_{3N}, p_1, p_2, \ldots, p_{3N}, t) , \tag{9.73} \]
and satisfies the following useful theorem:
\[ \frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \sum_j\Big(\frac{\partial\rho}{\partial q_j}\dot q_j + \frac{\partial\rho}{\partial p_j}\dot p_j\Big) = 0 \]
(j = 1, …, 3N) or, using Poisson brackets,
\[ \frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \{H, \rho\} = 0 . \tag{9.74} \]
This is Liouville’s theorem.12 Liouville’s theorem does not uniquely determine the
phase space density but constrains the options!
11 The insulated systems are ‘different’ only with respect to (most of) their initial conditions. However,
they do share the same energy (hyper)surface.
Remark 1: Notice that dρ/dt is the density change measured by an observer traveling
alongside the phase space ‘fluid’. The derivative ∂ρ/∂t, on the other hand, refers to
a fixed position (cf. p. 25).
Remark 2: Sometimes it is useful to extend phase space, i.e. aside from the coor-
dinates and momenta of the mechanical system additional generalized coordinates
and their conjugate momenta are introduced. Despite this the continuity equation
introduced below as well as (9.74) remain valid. This then can be employed to obtain
the equations of motion of the newly introduced variables (e.g. [7]).
For the sake of simplicity the following justification of (9.74) is limited to the
case ρ = ρ(q, p, t). But the generalization to (9.73) is (almost) obvious. The
two-dimensional volume element at the position (q, p) is given by dqdp. At the position
(q, p) the density, ρ, changes according to $\partial\rho/\partial t\,|_{q,p}$. This change is due to phase
space points entering or leaving the volume element. The net change of phase space
points in the volume element during the time dt due to their motion along the
q-direction is
\[ \Big[(\rho\dot q)\big|_{q-\frac{dq}{2}} - (\rho\dot q)\big|_{q+\frac{dq}{2}}\Big]\, dp\,dt . \]
The two terms in the brackets account for the two boundaries of the volume element
along q. For instance, if [...] > 0, then the number of phase space points in the volume
element is increased. The analogous net change along the p-direction is given by
\[ \Big[(\rho\dot p)\big|_{p-\frac{dp}{2}} - (\rho\dot p)\big|_{p+\frac{dp}{2}}\Big]\, dq\,dt . \]
The sum of the two contributions (per dqdpdt) is
\[ -\frac{d}{dq}(\rho\dot q) - \frac{d}{dp}(\rho\dot p) . \]
Setting this equal to ∂ρ/∂t yields the continuity equation13
\[ \frac{d}{dq}(\rho\dot q) + \frac{d}{dp}(\rho\dot p) + \frac{\partial\rho}{\partial t} = 0 \tag{9.75} \]
or
\[ \underbrace{\dot q\,\frac{d\rho}{dq} + \dot p\,\frac{d\rho}{dp} + \frac{\partial\rho}{\partial t}}_{=d\rho/dt} + \rho\,\Big(\frac{d\dot q}{dq} + \frac{d\dot p}{dp}\Big) = 0 . \]
Applying Hamilton’s equations, i.e. $\dot q = \partial H/\partial p$ and $\dot p = -\partial H/\partial q$, immediately
shows that (...) = 0, which proves (9.74) in this simplified case.
12 Liouville, Joseph, French mathematician, *Saint-Omer (Départment Pas-de-Calais) 24.3.1809,
†Paris 8.9.1882.
13 If a quantity, which here happens to be the number of phase space points, satisfies the continuity
equation, it means that a change of the amount of this quantity in a certain region of space is entirely
due to the quantity flowing in or out of the region. In other words, there is no ‘production’ or
‘destruction’ of this quantity inside this region. Examples of processes causing ‘production’ or
‘destruction’ are chemical reactions.
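We may look at Liouville’s theorem from another perspective, Boltzmann’s
perspective. We want to compute the change of a phase space volume element with
time, i.e.
\[ d\Gamma(t) \equiv dq_1(t)\,dq_2(t)\ldots dq_{3N}(t)\,dp_1(t)\,dp_2(t)\ldots dp_{3N}(t) . \tag{9.76} \]
Here the phase space volume element travels with a phase space point. As before,
for the sake of simplicity, we study
\[ d\Gamma(t) = dq(t)\,dp(t) . \]
We want to calculate dΓ expressed by the new coordinates Q = Q(q, p) and P =
P(q, p). According to the example on p. 27 we have
\[ dQ\,dP = \begin{vmatrix} \dfrac{\partial Q}{\partial q} & \dfrac{\partial Q}{\partial p} \\[2mm] \dfrac{\partial P}{\partial q} & \dfrac{\partial P}{\partial p} \end{vmatrix}\, dq\,dp . \]
Now we define the special transformation
\[ Q = q(t+\delta t) \approx q(t) + \dot q(t)\,\delta t \quad\text{and}\quad P = p(t+\delta t) \approx p(t) + \dot p(t)\,\delta t , \]
which yields
\[ \begin{aligned}
dq(t+\delta t)\,dp(t+\delta t) &= \Big(1 + \frac{\partial\dot q}{\partial q}\,\delta t\Big)\Big(1 + \frac{\partial\dot p}{\partial p}\,\delta t\Big)\, dq(t)\,dp(t) + O(\delta t^2) \\
 &= \Big[1 + \Big(\frac{\partial\dot q}{\partial q} + \frac{\partial\dot p}{\partial p}\Big)\,\delta t\Big]\, dq(t)\,dp(t) + O(\delta t^2) .
\end{aligned} \]
Iterating this leads to
\[ dq(t')\,dp(t') = \Big[1 + \Big(\frac{\partial\dot q}{\partial q} + \frac{\partial\dot p}{\partial p}\Big)\frac{t'-t}{n}\Big]^n dq(t)\,dp(t) + O(\delta t^2) , \]
where $n\,\delta t = t' - t$. In the limit n → ∞, δt → 0 (cf. (1.63)) this becomes
\[ d\Gamma(t') = \exp\Big[\underbrace{\Big(\frac{\partial\dot q}{\partial q} + \frac{\partial\dot p}{\partial p}\Big)}_{\overset{(8.3),(8.4)}{=}\,0}\,(t'-t)\Big]\, dq(t)\,dp(t) = d\Gamma(t) = const , \tag{9.77} \]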
i.e. phase space points behave like particles in an incompressible liquid. Figure 9.8
is an approximate illustration of this invariance, showing a number of phase space
points of the mathematical pendulum, plotted every δt. The timestep is sufficiently
small and the initial conditions are sufficiently close to demonstrate the migration
and deformation of a phase space volume element.
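The invariance (9.77) can also be tested numerically for the mathematical pendulum. In the sketch below we assume units in which H = p²/2 − cos q, advance the phase space point with the algorithm (9.51), (9.52), and estimate the Jacobian determinant of the map (q, p) → (Q, P) by central differences. The function names, the starting point, and the step sizes are our choices for illustration.

```python
import math

def flow(q, p, t, dt=0.001):
    """Advance the pendulum, H = p^2/2 - cos q (assumed units), by the time t."""
    nstep = int(round(t / dt))
    f = -math.sin(q)
    for _ in range(nstep):
        q = q + dt * p + 0.5 * dt * dt * f        # position update, cf. (9.51)
        f_new = -math.sin(q)
        p = p + 0.5 * dt * (f + f_new)            # momentum update, cf. (9.52)
        f = f_new
    return q, p

def jacobian_determinant(q, p, t, eps=1e-5):
    """Central-difference estimate of det d(Q, P)/d(q, p) for the map (q, p) -> (Q, P)."""
    qa, pa = flow(q + eps, p, t)
    qb, pb = flow(q - eps, p, t)
    qc, pc = flow(q, p + eps, t)
    qd, pd = flow(q, p - eps, t)
    dQ_dq, dP_dq = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    dQ_dp, dP_dp = (qc - qd) / (2 * eps), (pc - pd) / (2 * eps)
    return dQ_dq * dP_dp - dQ_dp * dP_dq

det = jacobian_determinant(q=1.0, p=0.5, t=5.0)
print("Jacobian determinant after t = 5:", det)
```

A determinant of one means that the volume element dq dp keeps its size while it migrates and deforms, which is the content of Fig. 9.8.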
Gibbs’ Picture:
A point (q1 , q2 , . . . , q3N , p1 , p2 , . . . , p3N ) in phase space observed at a particular

time defines a microstate of the system. In order to apply this concept to the cal-
culation of averages approximating macroscopic quantities, we imagine, as already
before, an infinite number of macroscopically identical systems. Each system
contributes a point in phase space at any one time. In statistical mechanics this imaginary
construction is an ensemble.
The basis of Gibbs’ picture is the presumption that all microstates of an isolated
system are equally probable. If E is the energy of each isolated or micro-canonical
system in a (micro-canonical) ensemble, then the above presumption implies that
the density, ρ, of phase space points on an energy (hyper)surface, defined via
$E = H(q_1, q_2, \ldots, q_{3N}, p_1, p_2, \ldots, p_{3N})$, is constant, i.e.
\[ \rho = const . \tag{9.78} \]
Note that this ρ is consistent with Liouville’s theorem (9.74).
Fig. 9.8 Motion of a volume element in a two-dimensional phase space
If, on the other hand, the individual systems are not isolated but merely closed,
i.e. they do exchange energy (heat) with a heat bath surrounding them, which is the
same for all systems, then ρ is no longer constant. It can be shown that in this case
ρ ∝ exp (−βH) (9.79)
(e.g. [4]). The attendant ensemble is called the canonical ensemble and its phase space
density again satisfies Liouville’s theorem (9.74) (try to show this).
The calculation of equilibrium averages according to Gibbs is based on ensemble
averages. As before in the case of (9.63) in μ-space, we can write
\[ \langle A \rangle_G = \frac{\int d\Gamma\, A(q_1, \ldots, p_1, \ldots)\,\rho(q_1, \ldots, p_1, \ldots)}{\int d\Gamma\,\rho(q_1, \ldots, p_1, \ldots)} . \tag{9.80} \]
The index G indicates that the average is calculated according to Gibbs. Of course,
at some point we must deal with the question whether or not the different averages,
i.e. $\langle A \rangle_G$, $\langle A \rangle_B$, and $\langle A \rangle_t$, produce the same results.
In the micro-canonical case this formula becomes
\[ \langle A \rangle_{G,micro} = \frac{\int_{E=H} d\Gamma\, A}{\int_{E=H} d\Gamma} . \tag{9.81} \]
Here the integration is over the entire energy surface E = H. In the canonical case
the attendant average is given by
\[ \langle A \rangle_{G,can} = \frac{\int d\Gamma\, A\,\exp(-\beta H)}{\int d\Gamma\,\exp(-\beta H)} . \tag{9.82} \]
These integrations are over the entire phase space including all energies. You will
learn in statistical mechanics that in the so-called thermodynamic limit
\[ \langle A \rangle_G = \langle A \rangle_{G,micro} = \langle A \rangle_{G,can} , \tag{9.83} \]
i.e. different ensembles yield the same averages. Thermodynamic limit means that
the number of particles in a system becomes infinite, whereas its density remains
constant. The resulting system is a bulk system and the shape or material of its con-
tainment should not influence the properties of its content. This may sound obvious,
but it is not. For instance, if the interaction between particles has a long or even infinite
range, e.g. the Coulomb interaction between charges or the gravitational interaction
between masses, then this becomes a difficult concept.
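• Example - Ideal Gas Law (2): In this example we derive the ideal gas
law along a different route. We begin by inserting the quantity $A = q_j\,\partial H/\partial q_j$ into
(9.82), which yields
\[ \begin{aligned}
\Big\langle q_j\frac{\partial H}{\partial q_j}\Big\rangle &= \frac{\int d\Gamma\, q_j\frac{\partial H}{\partial q_j}\exp[-\beta H]}{\int d\Gamma\,\exp[-\beta H]} \\
 &\overset{p.i.}{=} \frac{\int d\Gamma'\,\Big\{\Big[-\frac{1}{\beta}\,q_j\exp[-\beta H]\Big]_{q_j^{min}}^{q_j^{max}} + \frac{1}{\beta}\int\exp[-\beta H]\,dq_j\Big\}}{\int d\Gamma\,\exp[-\beta H]} .
\end{aligned} \]
Notice that the prime means that the coordinate $q_j$ is excluded from the integration.
The first term in square brackets vanishes. This is because we consider a
gas confined to a container, i.e. the potential energy at $q_j^{max}$ and $q_j^{min}$, where the
walls are, becomes infinite. Notice also that p.i. stands for partial integration.
Hence
\[ \Big\langle q_j\frac{\partial H}{\partial q_j}\Big\rangle = k_BT \tag{9.84} \]
or in the case of N gas particles
\[ \Big\langle\sum_{j=1}^{3N} q_j\frac{\partial H}{\partial q_j}\Big\rangle = 3Nk_BT . \tag{9.85} \]
We can rewrite the left hand side of the last equation as follows:
\[ \Big\langle\sum_{j=1}^{3N} q_j\frac{\partial H}{\partial q_j}\Big\rangle = -\Big\langle\sum_{j=1}^{3N} q_j\dot p_j\Big\rangle = -\Big\langle\sum_{i=1}^{N}\vec r_i\cdot\vec F_i\Big\rangle . \tag{9.86} \]
Here $\vec F_i$ is the total force on particle i at $\vec r_i$. For an ideal gas this force vanishes
inside the container. Only very close to the container walls do we have $\vec F_i \neq 0$,
i.e.
\[ \begin{aligned}
-\Big\langle\sum_{i=1}^{N}\vec r_i\cdot\vec F_i\Big\rangle &= P\oint_A dA\,(\vec n\cdot\vec r) \\
 &\overset{\text{Gauss theorem}}{=} P\int_V dV\,\vec\nabla\cdot\vec r \\
 &= 3PV . \qquad (9.87)
\end{aligned} \]
Here P is the gas pressure, $d\vec A = \vec n\,dA$ is a surface element on the
container surface oriented towards the outside, and V is the container’s volume.
Combining (9.85)–(9.87) again yields the ideal gas law:
\[ PV = Nk_BT . \]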
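Remark 1: We can repeat the analogous calculation replacing $q_j\,\partial H/\partial q_j$ by
$p_j\,\partial H/\partial p_j$. Instead of (9.85) we now obtain
\[ \Big\langle\sum_{j=1}^{3N} p_j\frac{\partial H}{\partial p_j}\Big\rangle = \Big\langle\sum_{j=1}^{3N} p_j\dot q_j\Big\rangle = 2\langle K\rangle = 3Nk_BT . \]
Here $\langle K\rangle$ is the average kinetic energy of the gas, i.e. the equation
\[ \langle K\rangle = \frac{3}{2}Nk_BT \tag{9.88} \]
allows the calculation of the macroscopic temperature based on the microscopic
particle velocities (cf. (9.56)).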
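Both (9.84) and its momentum analogue can be checked by direct numerical integration for a single degree of freedom. In the sketch below the confining walls are replaced by an assumed quartic potential c q⁴, for which the boundary term of the partial integration vanishes as well; the helper name `canonical_average` and all parameter values are illustrative choices.

```python
import math

def canonical_average(observable, energy, beta, xmax, n=20000):
    """<A> = int A exp(-beta E) dx / int exp(-beta E) dx for one variable (midpoint rule)."""
    dx = 2.0 * xmax / n
    num = den = 0.0
    for k in range(n):
        x = -xmax + (k + 0.5) * dx
        w = math.exp(-beta * energy(x))
        num += observable(x) * w
        den += w
    return num / den

k_b_t = 0.7                      # arbitrary value of k_B T
beta = 1.0 / k_b_t
m, c = 2.0, 1.5                  # mass and strength of the assumed potential c q^4

# p dH/dp = p^2 / m for the kinetic energy p^2 / (2 m)
p_term = canonical_average(lambda p: p * p / m, lambda p: p * p / (2.0 * m), beta, xmax=20.0)
# q dH/dq = 4 c q^4 for the potential energy c q^4
q_term = canonical_average(lambda q: 4.0 * c * q ** 4, lambda q: c * q ** 4, beta, xmax=5.0)
print("<p dH/dp> =", p_term, " <q dH/dq> =", q_term, " k_B T =", k_b_t)
```

Both averages reproduce k_B T, even though the potential is not quadratic.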
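Remark 2: The obvious equality of $\langle p_j\,\partial H/\partial p_j\rangle$ and $\langle q_j\,\partial H/\partial q_j\rangle$ implies the
virial theorem, i.e.
\[ 2\langle K\rangle = -\Big\langle\sum_{i=1}^{N}\vec r_i\cdot\vec F_i\Big\rangle . \tag{9.89} \]
Notice that this also holds when the system is not an ideal gas!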
An important specialization is the following. We assume that N particles interact
via pairwise forces, i.e. $\vec F_i = \sum_{j=1}^{N}\vec f_{ij}$. This allows us to transform the right hand side
of (9.89) via
\[ \begin{aligned}
\sum_{i=1}^{N}\vec r_i\cdot\vec F_i &= \sum_{i,j=1}^{N}\vec r_i\cdot\vec f_{ij} \\
 &= \frac{1}{2}\Big(\sum_{i,j=1}^{N}\vec r_i\cdot\vec f_{ij} + \sum_{j,i=1}^{N}\vec r_j\cdot\vec f_{ji}\Big) \\
 &\overset{\vec f_{ji}=-\vec f_{ij}}{=} \frac{1}{2}\Big(\sum_{i,j=1}^{N}\vec r_i\cdot\vec f_{ij} - \sum_{j,i=1}^{N}\vec r_j\cdot\vec f_{ij}\Big) \\
 &= \frac{1}{2}\sum_{i,j=1}^{N}\vec r_{ij}\cdot\vec f_{ij} \\
 &= -\frac{1}{2}\sum_{i,j=1}^{N}\vec r_{ij}\cdot\vec\nabla_i u_{ij} \\
 &= -\frac{1}{2}\sum_{i,j=1}^{N} r_{ij}\frac{du_{ij}}{dr_{ij}} . \qquad (9.90)
\end{aligned} \]
In the next to last step we also assume that the force $\vec f_{ij}$ can be expressed as the negative
gradient of the potential $u_{ij}$. The last step applies to central forces, i.e. $u_{ij} = u(r_{ij})$.
And finally we consider $u_{ij}$ to be a homogeneous function of order n, which means
\[ u(\lambda r_{ij}) = \lambda^n u(r_{ij}) . \tag{9.91} \]
This is not unusual. Prominent examples are gravitation and the Coulomb interaction
between charges. In both cases $u(r_{ij}) \propto r_{ij}^{\,n}$ with n = −1. The validity of (9.91) in
these cases is easily checked. Differentiating (9.91) with respect to the parameter λ
yields
\[ \frac{du(\lambda r_{ij})}{d\lambda} = r_{ij}\frac{du(\lambda r_{ij})}{d(\lambda r_{ij})} = n\lambda^{n-1}u(r_{ij}) . \tag{9.92} \]
Inserting λ = 1 leads to
\[ r_{ij}\frac{du(r_{ij})}{dr_{ij}} = n\,u(r_{ij}) . \tag{9.93} \]
If we insert this equation into (9.90) and the result again into (9.89), then we find the
following important result:
\[ 2\langle K\rangle = n\langle U\rangle . \tag{9.94} \]
Here $U = (1/2)\sum_{i,j=1}^{N} u_{ij}$ is the total potential energy of a system of N particles.
Equation (9.94) has many practical uses. For instance, it allows us to estimate the kinetic
energy and thus the temperature of a particle gas after its gravitational collapse into
one compact body (e.g. [8]14).
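Equation (9.94) is stated for ensemble averages. A closely related check, which is not part of the derivation above, uses the time average over one period of a bound Kepler orbit, where n = −1 and therefore 2⟨K⟩ = −⟨U⟩ is expected. The sketch assumes units with GM = m = 1, a start at the apoapsis r = 1 with tangential velocity 0.8, and integration with the algorithm (9.51), (9.52); all names and values are illustrative.

```python
import math

def kepler_virial(v0=0.8, nstep=20000):
    """Time averages of K and U over one period of a bound Kepler orbit (G M = m = 1)."""
    x, y, vx, vy = 1.0, 0.0, 0.0, v0
    energy = 0.5 * v0 * v0 - 1.0              # total energy, negative for a bound orbit
    a = -1.0 / (2.0 * energy)                 # semi-major axis
    period = 2.0 * math.pi * a ** 1.5         # Kepler's third law
    dt = period / nstep
    r3 = (x * x + y * y) ** 1.5
    fx, fy = -x / r3, -y / r3
    k_sum = u_sum = 0.0
    for _ in range(nstep):
        k_sum += 0.5 * (vx * vx + vy * vy)
        u_sum += -1.0 / math.hypot(x, y)
        x += dt * vx + 0.5 * dt * dt * fx     # position update, cf. (9.51)
        y += dt * vy + 0.5 * dt * dt * fy
        r3 = (x * x + y * y) ** 1.5
        fx_new, fy_new = -x / r3, -y / r3
        vx += 0.5 * dt * (fx + fx_new)        # velocity update, cf. (9.52)
        vy += 0.5 * dt * (fy + fy_new)
        fx, fy = fx_new, fy_new
    return k_sum / nstep, u_sum / nstep

k_avg, u_avg = kepler_virial()
print("2<K> =", 2.0 * k_avg, "  n<U> =", -u_avg)
```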
9.4 Classification of Dynamical Systems
A central issue in Gibbs’ approach is the assumed equality of the ensemble average
$\langle A \rangle_{G,micro}$ to the attendant time average in (9.62). This assumption is called the ergodic
hypothesis. Where is the problem? Well, we cannot be sure that the trajectory of a
mechanical system traverses all (the important) regions of phase space. The ensemble
construction, on the other hand, by its very definition includes all of phase space.
Or in other words: how do deterministic equations of motion yield ‘random’ phase
space trajectories?
14 In the case of gravitation the potential energy of a mass density, $\rho(\vec r)$, is given by
$U = -(G/2)\int_V d^3r\,d^3r'\,\rho(\vec r)\,|\vec r - \vec r\,'|^{-1}\rho(\vec r\,')$. The volume V encloses the mass distribution. Thus
$U = gGM^2V^{-1/3}$, where for a uniform mass distribution g is a geometry dependent factor.
[Figure 9.9: tree diagram with the labels: dynamical systems; reoccurrent trajectories, non-reoccurrent
trajectories; Hamiltonian systems, non-Hamiltonian systems; integrable systems, non-integrable systems;
periodic, quasi periodic, non-ergodic, ergodic, mixing; stable trajectory (trajectory is confined to
a torus in phase space), unstable trajectory (trajectory covers the entire energy hypersurface)]
Fig. 9.9 Classification of dynamical systems
Before we address this point, we want to briefly discuss the classification scheme
for dynamical systems depicted in Fig. 9.9. First we distinguish dynamical systems
according to whether or not their trajectories are reoccurring. Most comets belong
to the second category. An exception is Halley’s comet, which follows a reoccurring
trajectory. A theorem of Poincaré15 (Poincaré’s recurrence theorem) states that a
point moving in a finite phase space is going to pass arbitrarily close to every
accessible system configuration, and to do so arbitrarily often. However, even for
systems containing few particles the recurrence times are astronomically large.
Depending on whether or not the motion is determined by a Hamiltonian, one
distinguishes between Hamiltonian and non-Hamiltonian systems. Here we are
interested in Hamiltonian systems. Examples of the second type are dissipative
systems, in which friction occurs. In (closed) Hamiltonian systems we have
H = const. Hamiltonian systems are divided into integrable systems, i.e. systems for
which an analytic solution exists, and non-integrable systems. In integrable systems
the number of degrees of freedom equals the number of independent constants of motion.
Integrable systems are divided into systems whose motion is periodic and
those whose motion is quasi-periodic. Periodic means that the phase space trajectory
‘intersects’ itself after a finite time. Quasi-periodic motion, on the other hand, we
illustrate via the following example. A system consists of two independent harmonic
oscillators. The attendant phase space has four dimensions. The fact that the energy
of the first oscillator, E1, is constant constrains the system’s trajectory to a certain
volume in phase space. The additional fact that the energy of the second oscillator,
E2, is also constant constrains the system’s trajectory to a closed surface in phase
space. If the ratio of the oscillator frequencies is a rational number, then this yields
a periodic motion. Otherwise the resulting trajectory does not cross itself but covers
the aforementioned surface densely. This is called quasi-periodic.
15 Poincaré, Jules Henri, French physicist and mathematician, *Nancy 29.4.1854, †Paris 17.7.1912.
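The difference between periodic and quasi-periodic motion of the two oscillators can be made visible numerically. The following is a minimal sketch, not taken from the text: it records the phase of the second oscillator each time the first one completes a period, once for the frequency ratio 3/2 and once for the ratio √2. The function names, the two ratios, and the number of periods are illustrative assumptions.

```python
import math

def section_phases(ratio, n_periods):
    """Phase of oscillator 2 (in units of 2*pi), sampled each time
    oscillator 1 completes a full period; ratio = omega_2 / omega_1."""
    return [(k * ratio) % 1.0 for k in range(n_periods)]

def count_distinct(phases, tol=1e-6):
    """Count the sampled phases, identifying values that agree to about tol."""
    return len({round(p / tol) for p in phases})

def largest_gap(phases):
    """Largest arc of the circle (in units of 2*pi) not visited by any sample."""
    s = sorted(phases)
    gaps = [b - a for a, b in zip(s, s[1:])]
    gaps.append(1.0 - s[-1] + s[0])  # arc across the wrap-around point
    return max(gaps)

rational = section_phases(3 / 2, 1000)            # commensurate frequencies
irrational = section_phases(math.sqrt(2), 1000)   # incommensurate frequencies

print(count_distinct(rational), largest_gap(rational))
print(count_distinct(irrational), largest_gap(irrational))
```

For the rational ratio only two phases ever occur, so the trajectory closes on itself. For the irrational ratio every sample is new and the largest unvisited arc keeps shrinking as more periods are included, which is the numerical signature of a trajectory that covers the surface densely.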
The additional distinction between non-integrable systems in Fig. 9.9 is based
on the concept of stability, i.e. how a phase space trajectory is affected by a small
perturbation. In order for a system to evolve from a non-equilibrium state towards
equilibrium, its phase space trajectory must be unstable with respect to small
perturbations. What this means is illustrated in Fig. 9.10. A trajectory is perturbed slightly
at time t = 0, i.e. the attendant phase space point is shifted by δ(0). The figure
shows different possible reactions to the perturbation in terms of the difference, δ(t),
between the unperturbed and the perturbed trajectory at some later time. From top
to bottom we have the following options:
$$ \delta(t) = \begin{cases} c\,\delta(0) & \text{stable} \\ c\,t\,\delta(0) & \text{quasi-periodic} \\ \delta(0)\,e^{\lambda t} & \text{unstable} \end{cases} \tag{9.95} $$
Two conditions must be satisfied in order for an isolated system to evolve from
non-equilibrium towards equilibrium. First, the entire energy surface must be accessible.
Second, the motion has to be ‘mixing’, which means ‘at least unstable’, so that arbitrarily
small segments of the energy surface are accessed. Systems not satisfying these
conditions are called non-mixing. A quasi-periodic motion is therefore non-mixing:
even though the perturbed trajectory deviates from the original trajectory,
it remains confined to a limited region of the energy surface. The above conditions
are the foundation of the aforementioned ergodic hypothesis.
Fig. 9.10 Pictorial comparison between unperturbed and perturbed trajectories versus time. [Three panels, each showing an unperturbed and a perturbed trajectory that are initially separated by δ(0); horizontal axis: time t.]
9.5 Roads to Chaos
We return to the original question: How can deterministic equations of motion lead
to ‘random’ phase space trajectories? This is a difficult question, to which we do not
have a simple answer. However, we want to discuss two examples, which, despite
their simplicity, illustrate the basic effect.
The first example is not a mechanical system. It is a non-linear iteration relation
called the logistic map, i.e.
$$ x_{i+1} = 4 r x_i (1 - x_i) . \tag{9.96} $$
Starting from an initial value $x_0 \in [0, 1]$, the iteration relation (9.96) generates the
subsequent $x_i$-values $x_1, x_2, x_3, \dots$ (also $\in [0, 1]$). The outcome depends on the
parameter r. Two examples are shown in Fig. 9.11. Here subsequent $x_i$-values are given
by the position of the horizontal (vertical) arrows on the vertical (horizontal) axis. For
instance, for r < 0.25 the sequence of x-values converges on 0, i.e. 0 is an attractive
or stable fixed point. If r is greater than 0.25 the behavior changes. The second panel
in Fig. 9.11 illustrates this for r = 0.6. A still larger r-value, shown in Fig. 9.12, no
longer leads to a single fixed point but to a stable 2-cycle instead. If we continue to
increase r, the result is a rather complex but self-similar plot of limit cycle values
versus r, generated via successive bifurcations. But this is not of immediate interest
to us. We are interested in the effect of a small perturbation of $x_0$ on the subsequent
$x_i$-values.
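The three regimes mentioned above (r = 0.1, 0.6, and 0.8) can be reproduced by iterating (9.96) directly. The sketch below is a minimal illustration; the function name, the starting value x0 = 0.1, and the number of discarded transient steps are assumptions chosen here, not values prescribed by the text.

```python
def logistic_orbit(r, x0, n_transient=1000, n_keep=4):
    """Iterate x -> 4 r x (1 - x), discard n_transient steps,
    and return the following n_keep values."""
    x = x0
    for _ in range(n_transient):
        x = 4.0 * r * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = 4.0 * r * x * (1.0 - x)
        orbit.append(x)
    return orbit

for r in (0.1, 0.6, 0.8):
    print(r, logistic_orbit(r, x0=0.1))
```

For r = 0.1 the values decay to zero, for r = 0.6 they settle on the single value 1 − 1/(4r), which solves x = 4rx(1 − x), and for r = 0.8 they alternate between two values.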
Fig. 9.11 Graphical iteration of the logistic map. Left r = 0.1; right r = 0.6. Open circles x0 = 0.1;
solid circles fixed points
Fig. 9.12 Graphical iteration of the logistic map using r = 0.8. Solid circles limiting two-cycle
Fig. 9.13 Lyapunov exponent, λ, versus r
We measure the effect using the value of λ defined via
$$ |\delta x_n| = |\delta x_0| \exp[n\lambda] , \tag{9.97} $$
where $\delta x_i$ is the deviation between the perturbed and the original sequence at step i, and
λ is called the Lyapunov exponent. Notice that a positive λ means that a small
perturbation leads to ever increasing deviations of the original sequence (or
trajectory) from the perturbed one. A negative value, on the other hand, signals stability,
i.e. the initial deviation quickly diminishes. A useful formula for λ is
$$ \lambda = \lim_{n\to\infty} n^{-1} \sum_{i=0}^{n-1} \ln |4r(1 - 2x_i)| . \tag{9.98} $$
It follows by rewriting (9.97) as
$$ \frac{|\delta x_n|}{|\delta x_{n-1}|}\,\frac{|\delta x_{n-1}|}{|\delta x_{n-2}|}\cdots\frac{|\delta x_1|}{|\delta x_0|} = \exp[n\lambda] . \tag{9.99} $$
Each of the factors on the left we replace using the differential of (9.96), i.e.
$\delta x_{i+1} = 4r\,\delta x_i (1 - 2x_i)$. The result is shown in Fig. 9.13. Notice that λ < 0 for r = 0.1, 0.6,
and 0.8, the r-values in the previous figures. But there are other regions in which
λ > 0, i.e. the ‘trajectory’ is not stable. In these parameter regions every independent
run, on one specific computer, will generate the same $x_i$-values if $x_0$ is the same. But
even a tiny change, for a sufficiently long sequence, yields a completely different
result. This is called deterministic chaos.
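Formula (9.98) translates almost literally into a short program. The sketch below approximates the limit by a finite sum; the starting value, the number of discarded transient steps, the length of the sum, and the small floor inside the logarithm (which only guards against log(0)) are assumptions made for this illustration.

```python
import math

def lyapunov(r, x0=0.3, n_transient=1000, n=20000):
    """Finite-n estimate of the Lyapunov exponent of the logistic map,
    following (9.98): the average of ln|4 r (1 - 2 x_i)| along the orbit."""
    x = x0
    for _ in range(n_transient):          # let the orbit settle first
        x = 4.0 * r * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        slope = abs(4.0 * r * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # floor avoids log(0)
        x = 4.0 * r * x * (1.0 - x)
    return total / n

for r in (0.6, 0.8, 0.95):
    print(r, lyapunov(r))
```

For r = 0.6 and r = 0.8 the estimate is negative, in line with the stable fixed point and the stable 2-cycle discussed above; for r = 0.95 it comes out positive, i.e. this r-value lies in one of the unstable parameter regions.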
We can apply the same idea to a Molecular Dynamics simulation. Figure 9.14
shows Molecular Dynamics simulation trajectories of one specific particle in a gas
generated in different runs on the same computer. The only difference between the
runs is a small initial displacement of this particle along one coordinate direction.
As before in the case of the simpler logistic map, we observe deterministic ‘chaotic’
behavior.
Fig. 9.14 Trajectories of one specific particle in different MD runs on the same computer. The
only difference between the runs is a small initial displacement of this particle along one coordinate
direction
References
1. M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1972)
2. M.P. Allen, D.J. Tildesley, Computer Simulation of Simple Liquids (Clarendon Press, Oxford,
1990)
3. E. Kreyszig, Introductory Mathematical Statistics: Principles and Methods (Wiley, New York,
1970)
4. R. Hentschke, Thermodynamics (Springer, Berlin, 2013)
5. R. Hentschke, A Short Introduction to Quantum Theory. Lecture Notes. http://constanze.materials.uni-wuppertal.de/Englishindex.html
6. A.I. Khinchin, Mathematical Foundations of Statistical Mechanics (Dover, New York, 1949)
7. W.G. Hoover, Computational Statistical Mechanics (Elsevier, Amsterdam, 1991)
8. J. Binney, S. Tremaine, Galactic Dynamics (Princeton University Press, Princeton, 2008)
Chapter 10
Basic Equations of the Theory of Elasticity♥
There is no such thing as a rigid body. When mechanics is applied to the stability
and dynamics of buildings, machinery, vehicles, or parts thereof, one must be able to
deal with deformation. In this chapter we focus on elastic deformations, i.e. defor-
mations which disappear completely when the forces causing them are no longer
present. However, a completely reversible deformation is an approximate concept,
like the rigid body.1 A deformation remaining largely present, after the responsible
forces are ‘turned off’, is called a plastic deformation. Also not considered here is
fracture mechanics (e.g. [1]), i.e. the development of cracks and the general failure
of materials. Nevertheless, the concepts discussed in the following should allow the
reader to approach these areas with greater ease.
Much of the theory of elasticity was developed during the second half of the
18th and the first half of the 19th century (see for instance the historical notes in
[2, 3]). In more recent times numerical approaches were developed, which allow
the response of complicated elastic structures to be calculated with high precision. The
most important technique, from an industrial perspective, certainly is the finite
element method. Here we present its basic principles. The final section of this chapter
is devoted, again at the level of basic principles, to the mechanical analysis of
viscoelastic materials.
1 Even though some of the things we encounter every day do come very close to being ideally elastic,
like car tires.

DOI 10.1007/978-3-319-48710-6_10
10.1 Strain and Stress Tensors
Derivation of the Strain Tensor u:
Figure 10.1 shows a piece of material compressed by a force (or load). Here $\vec r$ is
the position of a point in the undeformed body. Notice that the theory of elasticity is
a continuum theory. This means that ‘a point’ really is a region in space containing
many atoms. The vector $\vec r\,'$ indicates the position of the same point or region in space
in the deformed body. The difference,
$$ \vec u = \vec r\,' - \vec r , \tag{10.1} $$
is the displacement. The displacement, $\vec u = \vec u(\vec r)$, is a function of position. Even
though we want to calculate $\vec u(\vec r)$, en route to this goal another quantity is more
important.
The distance between two (infinitesimally close) points, which before the
deformation are located at $\vec r$ and $\vec r + d\vec r$, respectively, is given by
$$ dr = \sqrt{dx_1^2 + dx_2^2 + dx_3^2} . $$
Employing the summation convention, which we shall use a lot in the following, this
is
$$ dr = \sqrt{dx_i^2} . $$
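Fig. 10.1 [Sketch of a body loaded by a force F; the displacement vector u connects the position r in the undeformed body to the position r′ in the deformed body.]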
After the deformation the above distance becomes
$$ dr' = \sqrt{dx_1'^2 + dx_2'^2 + dx_3'^2} , $$
i.e.
$$ dr' = \sqrt{dx_i'^2} . $$
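Using (10.1), i.e. $d\vec u = d\vec r\,' - d\vec r$, we find
$$ \begin{aligned}
dr'^2 &= (dx_i + du_i)^2 \\
&= \left(dx_i + \frac{\partial u_i}{\partial x_k}\,dx_k\right)^2 \\
&= dx_i^2 + \underbrace{2\,\frac{\partial u_i}{\partial x_k}\,dx_i\,dx_k}_{i \leftrightarrow k} + \underbrace{\frac{\partial u_i}{\partial x_k}\frac{\partial u_i}{\partial x_l}\,dx_k\,dx_l}_{i \leftrightarrow l} \\
&= dr^2 + 2\left[\frac{\partial u_k}{\partial x_i}\,dx_i\,dx_k + \frac{1}{2}\,\frac{\partial u_l}{\partial x_k}\frac{\partial u_l}{\partial x_i}\,dx_i\,dx_k\right] \\
&= dr^2 + 2\cdot\frac{1}{2}\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i} + \frac{\partial u_l}{\partial x_k}\frac{\partial u_l}{\partial x_i}\right) dx_i\,dx_k ,
\end{aligned} $$
i.e.
$$ dr'^2 = dr^2 + 2u_{ik}\,dx_i\,dx_k . \tag{10.2} $$
The quantities
$$ u_{ik} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i} + \frac{\partial u_l}{\partial x_k}\frac{\partial u_l}{\partial x_i}\right) \tag{10.3} $$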
are the elements of the strain tensor, u. The following is a list of selected important
properties of u:
(i) The strain tensor is symmetric, i.e. u ik = u ki .
(ii) Thus it may be diagonalized at every point, i.e.
$$ dr'^2 = (\delta_{ik} + 2u_{ik})\,dx_i\,dx_k = \left(1 + 2u^{(1)}\right)dx_1^2 + \left(1 + 2u^{(2)}\right)dx_2^2 + \left(1 + 2u^{(3)}\right)dx_3^2 . $$
The indices 1, 2, and 3 correspond to the principal axes. The quantities $u^{(i)}$ are the
attendant principal values or eigenvalues of u. Therefore (no summation over i)
$$ dx_i' = \sqrt{1 + 2u^{(i)}}\;dx_i . \tag{10.4} $$
The overall local distance change can be expressed in terms of independent
contributions along the principal axes.
(iii) In general, the relative distance changes are small, i.e.
$$ \frac{dx_i' - dx_i}{dx_i} \ll 1 . \tag{10.5} $$
In conjunction with (10.4) this leads to
$$ \sqrt{1 + 2u^{(i)}} - 1 \approx u^{(i)} \ll 1 . \tag{10.6} $$
Notice that there is an important difference between displacement and strain as
illustrated in Fig. 10.2. The displacement, $\vec u(\vec r)$, can be large, e.g. the oscillation
amplitude of the top floor of a tall building swaying in a storm, whereas the strain,
in the above sense, remains small. Thus, if $\partial u_i/\partial x_k$ is small, then we can use the
approximation
$$ u_{ik} \approx \frac{1}{2}\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right) \tag{10.7} $$
instead of the exact but complicated form of $u_{ik}$ in (10.3).
Fig. 10.2 Difference between displacement, u(r), and strain (dl versus dl′). A large displacement does not imply a large strain
(iv) The relative local volume change due to a deformation is given by $(dV' - dV)/dV$. Here
$$ dV' = dx_1'\,dx_2'\,dx_3' $$
is the volume element in the deformed state and
$$ dV = dx_1\,dx_2\,dx_3 $$
is the volume element in the undeformed state. The $x_i$ are coordinates with respect
to the principal axes. Using (10.4) we have
$$ \begin{aligned}
dV' &= \sqrt{1 + 2u^{(1)}}\,\sqrt{1 + 2u^{(2)}}\,\sqrt{1 + 2u^{(3)}}\;dx_1\,dx_2\,dx_3 \\
&\approx \left(1 + u^{(1)}\right)\left(1 + u^{(2)}\right)\left(1 + u^{(3)}\right) dV \\
&\approx \Bigl(1 + \underbrace{u^{(1)} + u^{(2)} + u^{(3)}}_{=Tr(u)}\Bigr)\,dV .
\end{aligned} $$
The quantity Tr(u) is the trace of u. Because the trace is independent of the coordinate
system this yields
$$ dV' = (1 + u_{ii})\,dV \tag{10.8} $$
and the relative local volume change becomes
$$ \frac{dV' - dV}{dV} = u_{ii} = Tr(u) . \tag{10.9} $$
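Properties (iii) and (iv) can be checked numerically for displacement fields with a constant gradient. The sketch below is an illustration under stated assumptions: the helper names, the matrix G with entries G[i][k] = ∂u_i/∂x_k, the rotation angle 0.5, and the dilation factor 10⁻³ are all chosen here and do not appear in the text. It evaluates the exact strain (10.3) and the linearized strain (10.7) for a rigid rotation (large displacement, no deformation) and for a small uniform dilation.

```python
import math

def exact_strain(G):
    """Strain tensor (10.3) for a displacement gradient G[i][k] = du_i/dx_k."""
    return [[0.5 * (G[i][k] + G[k][i] + sum(G[l][k] * G[l][i] for l in range(3)))
             for k in range(3)] for i in range(3)]

def linear_strain(G):
    """Linearized strain tensor (10.7)."""
    return [[0.5 * (G[i][k] + G[k][i]) for k in range(3)] for i in range(3)]

def trace(u):
    return sum(u[i][i] for i in range(3))

def rotation_gradient(theta):
    """Gradient of u = (R - 1) r for a rigid rotation by theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c - 1.0, -s, 0.0], [s, c - 1.0, 0.0], [0.0, 0.0, 0.0]]

# Rigid rotation: the exact strain vanishes, the linearized one does not,
# because the displacement gradient is not small for a finite angle.
G_rot = rotation_gradient(0.5)
print(exact_strain(G_rot)[0][0], linear_strain(G_rot)[0][0])

# Small uniform dilation u = eps * r: the trace approximates the volume change (10.9).
eps = 1.0e-3
G_dil = [[eps if i == k else 0.0 for k in range(3)] for i in range(3)]
print(trace(linear_strain(G_dil)), (1.0 + eps) ** 3 - 1.0)
```

The rotation example underlines the remark that a large displacement does not imply a large strain, and it shows why (10.7) should only be used when the derivatives ∂u_i/∂x_k themselves are small.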
• Problem 46 - Strain Tensor in Cylindrical and Spherical Coordinates:
Express the elements of the strain tensor in (a) cylindrical and (b) spherical
coordinates.
Solution: (a) The infinitesimal distance between two points inside an elastic
body after a deformation is given by the square root of $dr'^2 = dr^2 + 2u_{ik}\,dx_i\,dx_k$
(cf. (10.2)), where $dr^2$ is the square of the same distance in the undeformed state. The
distance between two points should not depend on our choice of coordinates.
Thus we conclude that the quantity $u_{ik}\,dx_i\,dx_k$ is the same whether the indices i
and k refer to cartesian components or to the cylindrical coordinates r, φ, and z.
We use this idea to work out the components of the strain tensor in cylindrical
coordinates, i.e. $u_{ik}\,dx_i\,dx_k = u_{rr}\,dr^2 + u_{r\phi}\,dr\,r\,d\phi + u_{rz}\,dr\,dz + \dots$.
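We start by expressing the displacement vector via
$$ \vec u = \vec e_r (\vec e_r \cdot \vec u) + \vec e_\phi (\vec e_\phi \cdot \vec u) + \vec e_z (\vec e_z \cdot \vec u) = \vec e_r u_r + \vec e_\phi u_\phi + \vec e_z u_z $$
(cf. appendix). The orthogonal unit vectors are
$$ \vec e_r = \begin{pmatrix} \cos\phi \\ \sin\phi \\ 0 \end{pmatrix} \qquad \vec e_\phi = \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix} \qquad \vec e_z = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} . $$
In addition we use
$$ d\vec r = \vec e_r\,dr + \vec e_\phi\,r\,d\phi + \vec e_z\,dz , \qquad \vec\nabla = \vec e_r\,\partial_r + \vec e_\phi\,\frac{1}{r}\,\partial_\phi + \vec e_z\,\partial_z , $$
i.e. $d\vec r$ and the cartesian gradient $\vec\nabla$ expressed in cylindrical coordinates. Thus
$$ u_{ik}\,dx_i\,dx_k = d\vec r \cdot (d\vec r \cdot \vec\nabla)\,\vec u . \tag{10.10} $$
In order to evaluate this expression efficiently, we express $\vec u$, $d\vec r$, and $\vec\nabla$ via
$$ \vec u = \tilde D \cdot \begin{pmatrix} u_r \\ u_\phi \\ u_z \end{pmatrix} \qquad d\vec r = \tilde D \cdot \begin{pmatrix} dr \\ r\,d\phi \\ dz \end{pmatrix} \qquad \vec\nabla = \tilde D \cdot \begin{pmatrix} \partial_r \\ \frac{1}{r}\partial_\phi \\ \partial_z \end{pmatrix} , $$
where
$$ \tilde D = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} . $$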
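Insertion of the above into (10.10) yields
$$ u_{ik}\,dx_i\,dx_k = \tilde D \cdot \begin{pmatrix} dr \\ r\,d\phi \\ dz \end{pmatrix} \cdot \left[ \tilde D \cdot \begin{pmatrix} dr \\ r\,d\phi \\ dz \end{pmatrix} \cdot \tilde D \cdot \begin{pmatrix} \partial_r \\ \frac{1}{r}\partial_\phi \\ \partial_z \end{pmatrix} \right] \tilde D \cdot \begin{pmatrix} u_r \\ u_\phi \\ u_z \end{pmatrix} , $$
where the term in square brackets equals
$$ \begin{pmatrix} dr \\ r\,d\phi \\ dz \end{pmatrix}^{T} \cdot \underbrace{\tilde D^{T} \cdot \tilde D}_{=I} \cdot \begin{pmatrix} \partial_r \\ \frac{1}{r}\partial_\phi \\ \partial_z \end{pmatrix} $$
and I is the 3 × 3 unit matrix. Thus
$$ u_{ik}\,dx_i\,dx_k = \tilde D \cdot \begin{pmatrix} dr \\ r\,d\phi \\ dz \end{pmatrix} \cdot \left( dr\,\partial_r + d\phi\,\partial_\phi + dz\,\partial_z \right) \tilde D \cdot \begin{pmatrix} u_r \\ u_\phi \\ u_z \end{pmatrix} . $$
The next step is
$$ \begin{aligned}
u_{ik}\,dx_i\,dx_k ={}& \begin{pmatrix} dr \\ r\,d\phi \\ dz \end{pmatrix}^{T} \cdot \tilde D^{T} \cdot \tilde D \left( dr\,\partial_r + d\phi\,\partial_\phi + dz\,\partial_z \right) \cdot \begin{pmatrix} u_r \\ u_\phi \\ u_z \end{pmatrix} \\
&+ \begin{pmatrix} dr \\ r\,d\phi \\ dz \end{pmatrix}^{T} d\phi \cdot \tilde D^{T} \cdot \partial_\phi \tilde D \cdot \begin{pmatrix} u_r \\ u_\phi \\ u_z \end{pmatrix} .
\end{aligned} $$
A short calculation yields
$$ \tilde D^{T} \cdot \partial_\phi \tilde D = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} . $$
Carrying out the last few multiplications results in
$$ \begin{aligned}
u_{ik}\,dx_i\,dx_k ={}& dr^2\,\partial_r u_r + dr\,d\phi\,\partial_\phi u_r + dr\,dz\,\partial_z u_r \\
&+ r\,d\phi\,dr\,\partial_r u_\phi + r\,d\phi^2\,\partial_\phi u_\phi + r\,d\phi\,dz\,\partial_z u_\phi \\
&+ dz\,dr\,\partial_r u_z + dz\,d\phi\,\partial_\phi u_z + dz^2\,\partial_z u_z \\
&- d\phi\,dr\,u_\phi + r\,d\phi^2\,u_r ,
\end{aligned} $$
from which we find
$$ \begin{aligned}
u_{rr} &= \frac{\partial u_r}{\partial r} , \\
u_{\phi\phi} &= \frac{1}{r}\frac{\partial u_\phi}{\partial \phi} + \frac{u_r}{r} , \\
u_{zz} &= \frac{\partial u_z}{\partial z} , \\
2u_{\phi z} &= \frac{1}{r}\frac{\partial u_z}{\partial \phi} + \frac{\partial u_\phi}{\partial z} , \\
2u_{rz} &= \frac{\partial u_r}{\partial z} + \frac{\partial u_z}{\partial r} , \\
2u_{r\phi} &= \frac{\partial u_\phi}{\partial r} - \frac{u_\phi}{r} + \frac{1}{r}\frac{\partial u_r}{\partial \phi} .
\end{aligned} $$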
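(b) Again our starting point is (10.10). However this time
$$ \vec u = \tilde D \cdot \begin{pmatrix} u_r \\ u_\phi \\ u_\theta \end{pmatrix} \qquad d\vec r = \tilde D \cdot \begin{pmatrix} dr \\ r \sin\theta\,d\phi \\ r\,d\theta \end{pmatrix} \qquad \vec\nabla = \tilde D \cdot \begin{pmatrix} \partial_r \\ \frac{1}{r \sin\theta}\partial_\phi \\ \frac{1}{r}\partial_\theta \end{pmatrix} $$
and
$$ \tilde D = \begin{pmatrix} \cos\phi \sin\theta & -\sin\phi & \cos\phi \cos\theta \\ \sin\phi \sin\theta & \cos\phi & \sin\phi \cos\theta \\ \cos\theta & 0 & -\sin\theta \end{pmatrix} . \tag{10.11} $$
A calculation analogous to the previous one yields
$$ \begin{aligned}
u_{rr} &= \frac{\partial u_r}{\partial r} , \\
u_{\theta\theta} &= \frac{1}{r}\frac{\partial u_\theta}{\partial \theta} + \frac{u_r}{r} , \\
u_{\phi\phi} &= \frac{1}{r \sin\theta}\frac{\partial u_\phi}{\partial \phi} + \frac{u_\theta}{r}\cot\theta + \frac{u_r}{r} , \\
2u_{\theta\phi} &= \frac{1}{r}\left(\frac{\partial u_\phi}{\partial \theta} - u_\phi \cot\theta\right) + \frac{1}{r \sin\theta}\frac{\partial u_\theta}{\partial \phi} , \\
2u_{r\theta} &= \frac{\partial u_\theta}{\partial r} - \frac{u_\theta}{r} + \frac{1}{r}\frac{\partial u_r}{\partial \theta} , \\
2u_{\phi r} &= \frac{1}{r \sin\theta}\frac{\partial u_r}{\partial \phi} + \frac{\partial u_\phi}{\partial r} - \frac{u_\phi}{r} .
\end{aligned} $$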
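A quick plausibility check of the cylindrical results in part (a) uses two simple displacement fields. Both fields, as well as the symbols ω (a small rotation angle) and a (a small expansion factor), are chosen here for illustration and are not part of the original problem.

```latex
% Illustrative check of the cylindrical strain components; \omega and a are assumed symbols.
% (1) Small rigid rotation about the z-axis: u_r = 0, u_\phi = \omega r, u_z = 0
u_{rr} = u_{\phi\phi} = u_{zz} = 0 , \qquad
2u_{r\phi} = \frac{\partial (\omega r)}{\partial r} - \frac{\omega r}{r} = 0 , \qquad
u_{\phi z} = u_{rz} = 0 .
% (2) Uniform radial expansion: u_r = a r, u_\phi = u_z = 0
u_{rr} = a , \qquad
u_{\phi\phi} = \frac{u_r}{r} = a , \qquad
u_{zz} = 0 , \qquad
u_{ii} = 2a .
```

The rigid rotation produces no strain, as it should, and this only works because of the term −u_φ/r in 2u_{rφ}. For the radial expansion the trace 2a agrees with (10.9): the cross-sectional area grows by the factor (1 + a)² ≈ 1 + 2a while the length along z is unchanged.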
Derivation of the Stress Tensor σ:
The deformation of a body gives rise to internal forces, which act to restore the
body to its original equilibrium state. On the macroscopic scale these forces possess
a very short range.
Here we consider a certain volume V of the elastic body diced up into volume
elements (cf. Fig. 10.3). The net force acting on the entire volume is given by
$$ \int_V \vec f\,dV $$
or in terms of components
$$ \int_V f_i\,dV . $$
Fig. 10.3 Partial volume of an elastic body divided into volume elements
Notice that f is a force density, i.e. a force per volume. We introduce the stress tensor
via the following definition:
$$ f_i = \frac{\partial \sigma_{ik}}{\partial x_k} . \tag{10.12} $$
But what does this mean? Figure 10.4 shows a volume element in a gas column
parallel to a uniform gravitational field. The mass inside the volume element is δm.
The force of gravity acting on the volume element points in the negative z-direction.
The pressure, P, at the bottom of the volume element exceeds the pressure at its top
by the amount δP, i.e.
$$ \delta P = \frac{\delta m}{A}\,g = \rho g\,\delta h . \tag{10.13} $$
Here ρ = δm/(Aδh), where δh is the height of the volume element. Now we assume
that i = k = 3 in (10.12) and that there are no other components except the 3- or
z-component, which means we can write
δσ33 = f 3 δx3 or δσzz = f z δz . (10.14)
Because $f_z = -\rho g$ and δz = δh, we conclude $\delta\sigma_{zz} = -\delta P$. In this particular
example, the force density $f_z$ is due to a pressure or stress difference between top and
bottom of the volume element. But generally there may be additional contributions
to $f_z$, due to stress differences between the left and the right faces of the volume
element or between its front and back faces. In these cases the orientation of the faces
is orthogonal to the orientation of the attendant force on the volume element. This
is called shear. In the present example, shear may occur due to a position dependent
gas velocity, varying along the x- and/or y-direction. Roughly speaking, stress is like
(negative) pressure but with an orientation, which is not necessarily parallel to the
orientation of the surface on which it acts. The different combinations of net force
direction and orientation of the volume element’s faces are illustrated in Fig. 10.5.
Fig. 10.4 A volume element in a gas column
Fig. 10.5 Orientation of the forces and their attendant surface elements. [Labels: shear force in i-direction acting on k-face; normal force acting on k-face.]
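Notice that the right hand side of (10.12) is a divergence, which means that we
can apply Gauss’ integral theorem (cf. the appendix), i.e.
$$ \int_V \vec\nabla \cdot \vec K\,dV = \oint_A \vec K \cdot d\vec A \qquad \text{or} \qquad \int_V \frac{\partial K_k}{\partial x_k}\,dV = \oint_A K_k\,dA_k . $$
Here A is the surface of V and $d\vec A$ is a surface element oriented away from the
volume. Hence
$$ \int_V f_i\,dV = \int_V \frac{\partial \sigma_{ik}}{\partial x_k}\,dV = \oint_A \sigma_{ik}\,dA_k . \tag{10.15} $$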
This means that the total force on the volume is due to stresses at its surface. But
what about gravity? Gravity is an external force. Our f is the density of internal
forces, i.e. the forces transmitted due to face-to-face contact of the volume elements.
Elastic equilibrium means that these internal forces compensate each other. Because
(10.15) follows from our above definition of the stress tensor in (10.12), this means
that the definition already includes the elastic equilibrium of the internal forces.
Remark: Because we consider the elastic body in its deformed state, we should use
the primed coordinates. However, using the unprimed coordinates instead causes a
small error of higher order only.
The equilibrium conditions of the deformed elastic body are
$$ \frac{\partial \sigma_{ik}}{\partial x_k} = 0 . \tag{10.16} $$
As we already pointed out, there may be an additional force, like gravitation, acting
on all volume elements throughout the body. In the case of gravitation (10.16) must
be replaced by
$$ \frac{\partial \sigma_{ik}}{\partial x_k} + \rho g_i = 0 . \tag{10.17} $$
The quantity ρ is the mass density of the body and $\vec g$ is the gravitational acceleration.
Contrary to volume forces like gravitation, surface forces are included via boundary
conditions. In the case of an external surface force $\vec P$ (per area) we have
$$ P_i\,dA - \sigma_{ik}\,dA_k = 0 . $$
Here $P_i$ is the i-component of $\vec P$ and $-\sigma_{ik}\,dA_k$ is the attendant component of the
surface stress along the negative i-direction. Using the surface unit vector $\vec n$ pointing
away from the surface, this equation becomes
$$ \sigma_{ik} n_k = P_i , \tag{10.18} $$
which applies everywhere on the surface of the elastic body.
10.2 Free Energy
The above equilibrium conditions alone do not tell us much about the shape of the
deformed elastic body. We need to relate the strain tensor to the stress tensor and
vice versa. In addition, every elastic body is made of a material or, more often, of a
combination of materials. As we shall see, the sought-after relation between the stress
and the strain tensor depends on parameters, so-called elastic constants, specific
to each material. Moreover, the elastic constants are not constant. They depend
on, for instance, temperature. In the case of a dynamic deformation they also depend
on the characteristic deformation time, e.g. the frequency when the deformation is
cyclic. Calculating material parameters requires an approach on the molecular level,
which we do not deal with in this book.2
Thermodynamic Theory of Deformations:
The theory of elasticity is only one part of a comprehensive theory of macroscopic
matter: thermodynamics. Thermodynamics is a phenomenological theory,
i.e. its foundation consists of a set of empirical laws, which, thus far, have passed
every experimental test (e.g. [4]). The mathematical expression of the first law of
thermodynamics, energy conservation, is given by
d E = δq + δw . (10.19)
Here dE is a differential change of the internal energy of a system. This change is
the combination of δq, a certain amount of heat energy given off or absorbed by the
system, and δw, work done on or by the system. Using d on the left and δ on the right
highlights the fact that E is a state function, whereas q and w are not state functions.
The significance of this is that the two terms on the right, individually, do depend on
the process responsible for the changes. Together, however, they add up to a process
independent change of the internal energy.
We do not know much about δq, but we do know something about δw. In fact, we
may object to the claim that w is not a state function on the basis of the Sisyphus
example on p. 35 (specifically (1.112)) or, more generally, based on (4.37). However,
in the Sisyphus example we neglect friction. Friction reshuffles part of the invested
work into heat. If we neglect such dissipative processes, which in general we cannot,
then δw can be replaced by dw, i.e. we deal with reversible work. But no matter how
we divide up energy, its total amount is conserved.
The second ingredient we need is the second law of thermodynamics. Its
mathematical expression is the Clausius inequality,3 i.e.
$$ dS \ge \frac{\delta q}{T} . \tag{10.20} $$
The quantity on the left is a differential change of the so-called entropy, S, of the
system. The quantity T on the right is temperature. Entropy was mentioned briefly
in the context of Boltzmann’s H-theorem (9.66).
In a completely isolated system δq = 0. Every process occurring inside the system
according to (10.20) satisfies $dS \ge 0$ or $S_{after} \ge S_{before}$. Here before and after refer
to an earlier time and to a later time during the process, respectively. This is what
we observe in Fig. 9.6 if we replace H by −H, except for fluctuations.4 The second
law accounts for the fact that (complex) processes, without outside interference,
distinguish the future from the past (arrow of time). We have obtained a feeling for
this when we discussed the approach to chaos in the previous chapter.
2 This approach is statistical mechanics.
3 Clausius, Rudolf Julius Emanuel, German physicist, *Köslin (now Koszalin) 2.1.1822,
†Bonn 24.8.1888.
4 Had we done the simulation based on a larger system, the fluctuations would have been much
smaller.
Combination of (10.19) and (10.20) yields
$$ dE - T\,dS \le \delta w . \tag{10.21} $$
The left hand side of (10.21) describes a differential change of the quantity
$$ F = E - TS , \tag{10.22} $$
which is called the free energy, at constant temperature. If the temperature is not
constant we have
$$ dF = dE - d(TS) = dE - T\,dS - S\,dT . \tag{10.23} $$
Combining this equation with (10.21) yields
$$ dF \le -S\,dT + \delta w , \tag{10.24} $$
or, in the limit of a very slow process infinitely close to equilibrium,
$$ dF = -S\,dT + dw . \tag{10.25} $$
These are important results. In particular, (10.24) tells us that, if we keep the
temperature constant, i.e. dT = 0, and make sure the system is not doing or receiving
work, δw = 0, the free energy develops towards a minimum! The fact that the free
energy ‘wants’ to be as small as possible can be employed quite practically. For
instance, if we develop a model of the free energy, which depends on an unknown
parameter λ, then we can obtain this parameter via the condition dF/dλ = 0, where
of course we must make sure that the solution corresponds to the minimum.
This can be generalized. Imagine someone suggests two possible deformed states
of an elastic body at a given temperature. By comparing the free energies of the two
structures, we can decide which one is realized. We can even include work. Imagine
we must find the shape of a bridge on which a truck is parked. The force of gravity
acting on the truck leads to a deformation of the bridge, which amounts to a certain
work, w, (neglecting dissipation effects). The solution, i.e. the shape of the deformed
bridge, follows via (10.24) according to
δu (F − w) = 0 . (10.26)
The expression on the left is the variation of the content of the bracket with respect
to the displacement field u, describing the shape of the elastic body. Even though the
foundations are different, the mathematical analogy to the derivation of the Euler-
Lagrange equations is evident. Notice that (10.26) is an alternative to the above
equilibrium conditions, cf. (10.16)–(10.18), which we may use also to obtain the
displacement field u, provided we know how to relate the stress to the strain tensor.
We shall return to (10.26) in an example on p. 327 and when we discuss the finite
element method. Here we continue to pursue our original goal, which is the relation
between the stress and the strain tensor.
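The key to the solution of this problem is (10.25), because we can calculate dw,
which for the moment is the work done by the internal forces only. The internal
forces cause the small displacements $\delta u_i$ (cf. footnote 5), which leads to the total work
$$ \begin{aligned}
\int f_i\,\delta u_i\,dV &= \int \frac{\partial \sigma_{ik}}{\partial x_k}\,\delta u_i\,dV \\
&\overset{\text{Gauss' theorem}}{=} \underbrace{\oint \sigma_{ik}\,\delta u_i\,dA_k}_{=0\,(*)} - \int \sigma_{ik}\,\frac{\partial\,\delta u_i}{\partial x_k}\,dV \\
&\overset{**}{=} -\frac{1}{2}\int \sigma_{ik}\,\delta\underbrace{\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right)}_{\approx 2u_{ik}} dV \\
&= -\int \sigma_{ik}\,\delta u_{ik}\,dV
\end{aligned} $$
(*: The surface at infinity is undeformed. **: see footnote 6). Thus, the work done by the
internal forces per unit volume is $-\sigma_{ik}\,\delta u_{ik}$. This means that the attendant change of
E is $-(-\sigma_{ik}\,\delta u_{ik})$ and therefore
$$ d\hat F = -\hat S\,dT + \sigma_{ik}\,du_{ik} . \tag{10.27} $$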
Here and in the following $\hat F$ and $\hat S$ are densities of the respective quantities. According
to (10.27)
$$ \sigma_{ik} = \left(\frac{\partial \hat F}{\partial u_{ik}}\right)_T . \tag{10.28} $$
5 Note that δ in the following just means a ‘small change’ and does not refer to the displacement
not being a state function. In fact, here we assume that the work is reversible.
6 Here we use the symmetry of the stress tensor, i.e. $\sigma_{ik} = \sigma_{ki}$. This can be shown as follows: The
torque on a particular volume of the body is given by
$$ \begin{aligned}
\int (f_i x_k - f_k x_i)\,dV &= \int \left(\frac{\partial \sigma_{il}}{\partial x_l}\,x_k - \frac{\partial \sigma_{kl}}{\partial x_l}\,x_i\right) dV \\
&= \int \frac{\partial(\sigma_{il} x_k - \sigma_{kl} x_i)}{\partial x_l}\,dV - \int (\sigma_{il}\delta_{kl} - \sigma_{kl}\delta_{il})\,dV \\
&= \oint (\sigma_{il} x_k - \sigma_{kl} x_i)\,dA_l - \int (\sigma_{ik} - \sigma_{ki})\,dV .
\end{aligned} $$
The volume integral must vanish, implying that $\sigma_{ik} = \sigma_{ki}$, because the torque must be due to the
forces acting on the surface.
Free Energy of the Isotropic Elastic Body:
At this point we try to guess a plausible form of the free energy in terms of the
$u_{ik}$, because then (10.28) is the desired link between the stress and the strain tensor.
The idea is to expand the free energy density, $\hat F$, in terms of the small elements of
the strain tensor, $u_{ik}$. The reference free energy density is the one of the undeformed
elastic body, $\hat F_o(T)$.
First we consider possible terms linear in the $u_{ik}$. In this case (10.28) yields
$(\partial \hat F/\partial u_{ik})_T \neq 0$ in the limit $u_{ik} \to 0$. Thus $\sigma_{ik} \neq 0$ even though the elastic body is in its
undeformed state. This cannot be and we conclude that there are no such terms in
the free energy density.
Notice that the free energy density is a scalar. Therefore the terms in the expansion
must be scalars also. An attendant form of the free energy density including the
second-order terms is
$$ \hat F = \hat F_o(T) + \frac{\lambda_{ijkl}}{2}\,u_{ij}\,u_{kl} . \tag{10.29} $$
Here the $\lambda_{ijkl}$ are the elastic constants. The number of independent elastic constants
depends on the symmetry of the material. There can be as many as 21 elastic constants,
but we are interested in isotropic materials for which (10.29) reduces to
$$ \hat F = \hat F_o(T) + \frac{\lambda}{2}\,u_{ii}^2 + \mu\,u_{ik}^2 \tag{10.30} $$
(cf. [5]). The two coefficients, λ and μ, are called Lamé coefficients. Notice that $u_{ii}^2$
is the square of the trace of the strain tensor and $u_{ik}^2$ is the sum over the squared
tensor elements. Examples for isotropic materials are rubber or metals like copper
or aluminum. Because of (10.30) we have
$$ \frac{\partial(\hat F - \hat F_o)}{\partial u_{ik}}\,u_{ik} = 2(\hat F - \hat F_o) . $$
Using (10.28) this yields
$$ \hat F = \hat F_o + \frac{\sigma_{ik}\,u_{ik}}{2} . \tag{10.31} $$
It is possible and sensible to rewrite the free energy density, i.e. $\frac{\lambda}{2}u_{ii}^2 + \mu u_{ik}^2$ can
be expressed by a term describing a shear deformation plus a term describing a pure
dilatation. Shear deformation means that the volume of the elastic body remains
constant. Dilatation means that the volume is changed but not the shape of the elastic
body (cf. Fig. 10.6).
Fig. 10.6 Examples illustrating a pure dilatation (left) and a pure shear deformation (right)
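First we reexpress $u_{ik}$ via
$$ u_{ik} = \underbrace{\left(u_{ik} - \frac{1}{3}\delta_{ik}u_{ll}\right)}_{(*)} + \frac{1}{3}\delta_{ik}u_{ll} . \tag{10.32} $$
Notice that the sum over the diagonal elements of (*) is zero. This term does not
contribute to a volume change, which is given by $u_{ii}$ instead (cf. (10.8)), and thus
corresponds to a mere shear deformation. Using (10.32) we find for $u_{ik}^2$
$$ \begin{aligned}
u_{ik}^2 &= (\dots)^2 + \frac{1}{9}\underbrace{\delta_{ik}\delta_{ik}}_{=\delta_{ii}=3}u_{ll}^2 + \underbrace{\frac{2}{3}\underbrace{u_{ik}\delta_{ik}}_{=u_{ll}}u_{ll} - \frac{2}{9}\underbrace{\delta_{ik}\delta_{ik}}_{=3}u_{ll}^2}_{=0} \\
&= (\dots)^2 + \frac{1}{3}u_{ll}^2 ,
\end{aligned} $$
where (…) abbreviates the term (*), and thus
$$ \hat F = \hat F_o + \mu\left(u_{ik} - \frac{1}{3}\delta_{ik}u_{ll}\right)^2 + \frac{1}{2}\kappa\,u_{ll}^2 . \tag{10.33} $$
The coefficient μ is called shear modulus and
$$ \kappa = \lambda + \frac{2}{3}\mu \tag{10.34} $$
is the compression modulus. Notice that μ > 0 and κ > 0. If either one of the
conditions is violated, the material can lower its free energy spontaneously and indefinitely
either by a shear deformation or a dilatation or both.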

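Calculating $\sigma_{ik}$ from $\sigma_{ik} = (\partial \hat F/\partial u_{ik})_T$:
According to (10.33)
$$ d\hat F = \kappa\,u_{ll}\,du_{ll} + 2\mu\,(\dots)\,d(\dots) , $$
where (…) again abbreviates $u_{ik} - \frac{1}{3}\delta_{ik}u_{ll}$. Via $(\dots)\,\delta_{ik} = 0$ we obtain
$$ d\hat F = \left[\kappa\,u_{ll}\,\delta_{ik} + 2\mu\left(u_{ik} - \frac{1}{3}\delta_{ik}u_{ll}\right)\right] du_{ik} $$
and
$$ \sigma_{ik} = \kappa\,u_{ll}\,\delta_{ik} + 2\mu\left(u_{ik} - \frac{1}{3}\delta_{ik}u_{ll}\right) . \tag{10.35} $$
Equation (10.35) allows us to determine $\sigma_{ik}$ from the $u_{ik}$ in the case of an isotropic
elastic body.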
Strain Tensor Elements, u ik , Expressed in Terms of the σik :
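From (10.35) follows
$$ \sigma_{ll} = 3\kappa\,u_{ll} \qquad \text{or} \qquad u_{ll} = \frac{1}{3\kappa}\,\sigma_{ll} \tag{10.36} $$
and thus
$$ \sigma_{ik} = \frac{1}{3}\sigma_{ll}\,\delta_{ik} + 2\mu\left(u_{ik} - \frac{1}{9\kappa}\,\sigma_{ll}\,\delta_{ik}\right) \tag{10.37} $$
or
$$ u_{ik} = \frac{\sigma_{ll}\,\delta_{ik}}{9\kappa} + \frac{1}{2\mu}\left(\sigma_{ik} - \frac{1}{3}\sigma_{ll}\,\delta_{ik}\right) . \tag{10.38} $$
Notice that the strain tensor is a linear function of the stress tensor, i.e. the
deformations are proportional to the applied forces (Hooke’s law7).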
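The relations (10.35) and (10.38) are inverse to each other, and the two expressions (10.30) and (10.31) for the free energy density must agree. The sketch below checks both statements numerically. The function names, the values κ = 5 and μ = 2 (arbitrary units), and the sample strain tensor are illustrative assumptions and not data from the text.

```python
def delta(i, k):
    """Kronecker delta."""
    return 1.0 if i == k else 0.0

def trace(t):
    return t[0][0] + t[1][1] + t[2][2]

def stress_from_strain(u, kappa, mu):
    """Stress tensor of the isotropic body, equation (10.35)."""
    tr = trace(u)
    return [[kappa * tr * delta(i, k) + 2.0 * mu * (u[i][k] - delta(i, k) * tr / 3.0)
             for k in range(3)] for i in range(3)]

def strain_from_stress(s, kappa, mu):
    """Strain tensor expressed through the stress tensor, equation (10.38)."""
    tr = trace(s)
    return [[tr * delta(i, k) / (9.0 * kappa) + (s[i][k] - delta(i, k) * tr / 3.0) / (2.0 * mu)
             for k in range(3)] for i in range(3)]

def energy_lame(u, kappa, mu):
    """Free energy density minus its reference value, from (10.30) with (10.34)."""
    lam = kappa - 2.0 * mu / 3.0
    squares = sum(u[i][k] ** 2 for i in range(3) for k in range(3))
    return 0.5 * lam * trace(u) ** 2 + mu * squares

def energy_from_stress(u, kappa, mu):
    """Free energy density minus its reference value, from (10.31)."""
    s = stress_from_strain(u, kappa, mu)
    return 0.5 * sum(s[i][k] * u[i][k] for i in range(3) for k in range(3))

kappa, mu = 5.0, 2.0  # illustrative moduli in arbitrary units
u = [[1.0e-3, 2.0e-4, 0.0],
     [2.0e-4, -5.0e-4, 1.0e-4],
     [0.0, 1.0e-4, 3.0e-4]]  # a small symmetric sample strain

sigma = stress_from_strain(u, kappa, mu)
u_back = strain_from_stress(sigma, kappa, mu)
print(u_back)
print(energy_lame(u, kappa, mu), energy_from_stress(u, kappa, mu))
```

Up to rounding errors, u_back reproduces the input strain and the two energy values coincide.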
Remark: In the case of hydrostatic, i.e. uniform, compression the force acting on a
surface element is $-\delta P\,dA_i$. Here δP is a small hydrostatic pressure. The minus sign indicates
that the compression force is oriented opposite to the orientation of the surface
element. This force, if expressed in terms of the stress tensor, is given by $\sigma_{ik}\,dA_k$,
i.e. $\sigma_{ik}\,dA_k = -\delta P\,dA_i = -\delta P\,\delta_{ik}\,dA_k$. Hence
$$ \sigma_{ik} = -\delta P\,\delta_{ik} \tag{10.39} $$
(cf. our example on p. 299). In addition we have according to (10.36)
$$ u_{ii} = -\frac{\delta P}{\kappa} \qquad \text{or} \qquad \frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_T = -\frac{1}{\kappa} . $$
The quantity $\kappa^{-1}$ is the isothermal compressibility.
7 Hooke, Robert, English scientist, *Freshwater (Isle of Wight) 18.7.1635, †London 3.3.1703.
Fig. 10.7 Elongation of a cylindrical rod. Notice: $u_{zz}(z) = \partial u_z(z)/\partial z \approx (u_z(z_2) - u_z(z_1))/(z_2 - z_1)$ with $z \approx z_1 \approx z_2$
Elastic Modulus and Poisson’s Number:
In the following we study the simple elongation of a cylindrical rod along the z-
direction. The deformation is caused by a uniform force per area, (0, 0, P), applied
to the upper end of the rod while its other end’s z-position is kept fixed (cf. Fig. 10.7).
We assume that the strain, $u_{ik}$, inside the rod is constant. The equilibrium condition
(10.18) applied to the cylinder surface yields (see footnote 8)
\[
\sigma_{xx}n_x + \sigma_{xy}n_y = 0 , \qquad
\sigma_{yx}n_x + \sigma_{yy}n_y = 0 , \qquad
\sigma_{zx}n_x + \sigma_{zy}n_y = 0 .
\]
We may shift the surface unit vector $\vec n$ to an arbitrary point on the cylinder's surface.
This implies that all of the above stress tensor components vanish. On the top and bottom
surfaces, however, we have
\[
\sigma_{zz} = P .
\]
8 Notation: $\sigma_{xx}$, $\sigma_{xy}$, etc. are equivalent to $\sigma_{11}$, $\sigma_{12}$, etc. The same is true for $u_{xx}$ etc., i.e. $u_{xx} = u_{11}$
etc. In addition $\delta_{ii} = 3$ but $\delta_{11} = 1$.
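Based on (10.38) this yields
\[
u_{xx} = u_{yy} = -\frac{1}{3}\left(\frac{1}{2\mu} - \frac{1}{3\kappa}\right)P
\]
as well as
\[
u_{zz} = \frac{1}{3}\left(\frac{1}{\mu} + \frac{1}{3\kappa}\right)P .
\]
If $i \neq j$ then $u_{ij} = 0$.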
The equation
\[
E = \frac{P}{u_{zz}} \tag{10.40}
\]
defines the elastic or Young's modulus, $E$. Thus
\[
E = \frac{9\kappa\mu}{3\kappa + \mu} . \tag{10.41}
\]
Notice that $u_{xx}$ and $u_{yy}$ determine the relative lateral contraction. Figure 10.8 illustrates
the lateral contraction of the rod in the x-z-plane. The negative lateral strain
divided by the longitudinal strain defines Poisson's number, i.e.
\[
\nu = -\frac{u_{xx}}{u_{zz}} , \tag{10.42}
\]
where
\[
\nu = \frac{1}{2}\,\frac{3\kappa - 2\mu}{3\kappa + \mu} . \tag{10.43}
\]
An important special case is a volume conserving deformation, i.e. $u_{xx} + u_{yy} + u_{zz} = 0$
(cf. (10.9)). Using $u_{xx} = u_{yy}$ we have $u_{xx} = -u_{zz}/2$ and therefore $\nu = 1/2$.
Generally, because κ, μ > 0 (cf. above), Poisson's number obeys
\[
\underbrace{-1}_{(\kappa = 0)} \le \nu \le \underbrace{\tfrac{1}{2}}_{(\mu = 0)} .
\]
ν < 0 implies that the above rod expands laterally instead of contracting. Usually
this doesn't happen. However, Fig. 10.9 shows a pictorial illustration of a (molecular)
mechanism, which would produce a negative ν.
Fig. 10.8 Lateral contraction of the above rod. Notice: $u_{xx}(x) = \partial u_x(x)/\partial x \approx (u_x(x_2) - u_x(x_1))/(x_2 - x_1) < 0$ with $x \approx x_1 \approx x_2$
Fig. 10.9 A cartoon of how a negative Poisson's number can be realized
Table 10.1 Selected elastic constants (a dash marks an entry not listed)

Material            E (10^4 MPa)    ν        μ (10^4 MPa)    κ (10^4 MPa)
Aluminum            7.1             0.34     2.6             7.3
Lead                1.6             0.44     0.57            4.3
Ice (at −4 °C)      0.96            0.33     0.36            0.98
Iron                21.1            0.28     8.2             17
Glass, technical    5…10            0.25     -               -
Glass, quartz       7.5             0.17     3.2             3.8
Gold                7.8             0.42     2.7             17
Rubber              6 · 10^−4       ≈0.5     -               0.24
Acrylic glass       0.32            0.35     0.12            0.36
Porcelain           5.8             0.23     2.4             3.6
Steel (1C)          21              0.28     8.0             16
It often happens that we must convert κ and μ to $E$ and ν. The attendant equations
are
\[
\mu = \frac{E}{2(1+\nu)} \quad\text{and}\quad \kappa = \frac{E}{3(1-2\nu)} . \tag{10.44}
\]
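The conversions (10.41), (10.43) and (10.44) can be wrapped into two small helper functions. This is a sketch only; the function names are hypothetical, and the sample input is the aluminum row of Table 10.1 (κ = 7.3 and μ = 2.6 in units of 10^4 MPa).

```python
def young_poisson(kappa, mu):
    """Return (E, nu) from (kappa, mu) via (10.41) and (10.43)."""
    young = 9.0 * kappa * mu / (3.0 * kappa + mu)
    poisson = 0.5 * (3.0 * kappa - 2.0 * mu) / (3.0 * kappa + mu)
    return young, poisson


def shear_bulk(young, poisson):
    """Return (mu, kappa) from (E, nu) via (10.44)."""
    mu = young / (2.0 * (1.0 + poisson))
    kappa = young / (3.0 * (1.0 - 2.0 * poisson))
    return mu, kappa


# Aluminum row of Table 10.1, units of 10^4 MPa.
E_AL, NU_AL = young_poisson(7.3, 2.6)
MU_BACK, KAPPA_BACK = shear_bulk(E_AL, NU_AL)
```

For these inputs the sketch gives E close to 7.0 and ν close to 0.34, in reasonable agreement with the tabulated 7.1 and 0.34, given that the table entries are rounded.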
Note also that the isotropic free energy density in terms of $E$ and ν is given by
\[
\hat F = \hat F_o + \frac{E}{2(1+\nu)}\left(u_{ik}^2 + \frac{\nu}{1-2\nu}u_{ll}^2\right) . \tag{10.45}
\]
Table 10.1 compiles selected values for the elastic modulus, $E$, Poisson's number,
ν, the shear modulus, μ, and the compression modulus, κ, of isotropic materials.
Remark: (i) The numbers in Table 10.1 may vary depending on the composition of the
materials. (ii) Rubber is a very special material (cf. the last section of this chapter).
The 'typical' $E$-value listed in the table applies under ordinary conditions, i.e. roughly
at room temperature and low deformation frequencies. The huge difference between
the elastic and the compression modulus indicates the unusual molecular mechanism
underlying rubber elasticity. The origin of the latter is the conformational entropy
reduction when a polymer is stretched.
Finally, it is important to note that the definitions (10.40) and (10.42) are applicable
even if a material is not isotropic, but (10.41) and (10.43) are no longer valid.
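Useful Formulas for Isotropic Elastic Bodies:
\[
\sigma_{ik} = \frac{E}{1+\nu}\left(u_{ik} + \frac{\nu}{1-2\nu}u_{ll}\delta_{ik}\right) \tag{10.46}
\]
\[
u_{ik} = \frac{1}{E}\left([1+\nu]\,\sigma_{ik} - \nu\sigma_{ll}\delta_{ik}\right) \tag{10.47}
\]
or explicitly
\begin{align}
\sigma_{xx} &= \frac{E}{(1+\nu)(1-2\nu)}\left([1-\nu]\,u_{xx} + \nu\,(u_{yy} + u_{zz})\right) \tag{10.48}\\
\sigma_{yy} &= \frac{E}{(1+\nu)(1-2\nu)}\left([1-\nu]\,u_{yy} + \nu\,(u_{xx} + u_{zz})\right) \tag{10.49}\\
\sigma_{zz} &= \frac{E}{(1+\nu)(1-2\nu)}\left([1-\nu]\,u_{zz} + \nu\,(u_{xx} + u_{yy})\right) \tag{10.50}\\
\sigma_{xy} &= \frac{E}{1+\nu}\,u_{xy} \tag{10.51}\\
\sigma_{xz} &= \frac{E}{1+\nu}\,u_{xz} \tag{10.52}\\
\sigma_{yz} &= \frac{E}{1+\nu}\,u_{yz} \tag{10.53}
\end{align}
and
\begin{align}
u_{xx} &= \frac{1}{E}\left(\sigma_{xx} - \nu\,(\sigma_{yy} + \sigma_{zz})\right) \tag{10.54}\\
u_{yy} &= \frac{1}{E}\left(\sigma_{yy} - \nu\,(\sigma_{xx} + \sigma_{zz})\right) \tag{10.55}\\
u_{zz} &= \frac{1}{E}\left(\sigma_{zz} - \nu\,(\sigma_{xx} + \sigma_{yy})\right) \tag{10.56}\\
u_{xy} &= \frac{1+\nu}{E}\,\sigma_{xy} \tag{10.57}\\
u_{xz} &= \frac{1+\nu}{E}\,\sigma_{xz} \tag{10.58}\\
u_{yz} &= \frac{1+\nu}{E}\,\sigma_{yz} \tag{10.59}
\end{align}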
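Equilibrium Conditions for Isotropic Elastic Bodies:
We are now ready to return to the above equilibrium conditions. Inserting (10.46)
into (10.17) yields
\[
\frac{E}{1+\nu}\frac{\partial u_{ik}}{\partial x_k} + \frac{E\nu}{(1+\nu)(1-2\nu)}\frac{\partial u_{ll}}{\partial x_i} + \rho g_i = 0 .
\]
Using (10.7) we obtain a partial differential equation for the displacement field, i.e.
\begin{align*}
&\frac{E}{2(1+\nu)}\frac{\partial^2 u_i}{\partial x_k^2} + \frac{E}{2(1+\nu)}\frac{\partial^2 u_k}{\partial x_k\partial x_i} + \frac{E\nu}{(1+\nu)(1-2\nu)}\frac{\partial^2 u_k}{\partial x_i\partial x_k} + \rho g_i \\
&\quad = \frac{E}{2(1+\nu)}\frac{\partial^2 u_i}{\partial x_k^2} + \frac{E\,(1 - 2\nu + 2\nu)}{2(1+\nu)(1-2\nu)}\frac{\partial^2 u_k}{\partial x_i\partial x_k} + \rho g_i = 0
\end{align*}
or after some clean up work
\[
\frac{E}{2(1+\nu)}\frac{\partial^2 u_i}{\partial x_k^2} + \frac{E}{2(1+\nu)(1-2\nu)}\frac{\partial^2 u_k}{\partial x_i\partial x_k} + \rho g_i = 0 . \tag{10.60}
\]
In vector notation this equation becomes
\[
\Delta\vec u + \frac{1}{1-2\nu}\vec\nabla\left(\vec\nabla\cdot\vec u\right) = \underbrace{-\rho\vec g\,\frac{2(1+\nu)}{E}}_{(*)} \tag{10.61}
\]
(*: gravitation may be replaced by other volume forces as appropriate). In the case
of surface forces we have . . . = 0 instead, where the forces enter via the boundary
conditions.
Remark: The result (10.61) is obtained on the basis of the approximation (10.7). This
means it also is an approximation!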
10.3 Examples
Deformation of a Square Column in the Gravitational Field:
The left side of Fig. 10.10 depicts a square column supported by a solid foundation.
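The column is deformed under its own weight. Notice that the force of gravity acts
along the negative z-direction. We want to calculate the attendant displacement field
$\vec u(\vec r)$.
Fig. 10.10 Deformation of a square column by its own weight
In the present case the equilibrium conditions (10.17) yield
\[
\frac{\partial\sigma_{xk}}{\partial x_k} = \frac{\partial\sigma_{yk}}{\partial x_k} = 0 \quad\text{and}\quad \frac{\partial\sigma_{zk}}{\partial x_k} = \rho g .
\]
Next we consider the side faces of the column (cf. Fig. 10.10). From $\sigma_{ik}n_k = 0$ we
obtain $\sigma_{xx} = 0$, $\sigma_{yx} = 0$, $\sigma_{zx} = 0$, $\sigma_{xy} = 0$, $\sigma_{yy} = 0$, and $\sigma_{zy} = 0$, i.e. $\sigma_{ik} = 0\ \forall\ i, k$
with the exception of $\sigma_{zz}$! At the top of the column we have $\sigma_{xz} = 0$, $\sigma_{yz} = 0$, and
$\sigma_{zz} = 0$.
A solution satisfying these boundary conditions is $\sigma_{zz} = -\rho g\,(l - z)$. The remaining
$\sigma_{ik}$ are zero. Using (10.54) through (10.59) yields
\[
u_{xx} = \frac{\nu}{E}\rho g\,(l - z) , \qquad u_{yy} = \frac{\nu}{E}\rho g\,(l - z) , \qquad u_{zz} = -\frac{1}{E}\rho g\,(l - z) ,
\]
and $u_{xy} = u_{xz} = u_{yz} = 0$. Now we employ (10.7) (notice that
$\partial u_x/\partial y + \partial u_y/\partial x = 0$, $\partial u_x/\partial z + \partial u_z/\partial x = 0$, and $\partial u_y/\partial z + \partial u_z/\partial y = 0$), which leads to
\begin{align*}
\frac{\partial u_x}{\partial x} &= \frac{\nu}{E}\rho g\,(l - z) &&\Rightarrow& u_x &= \frac{\nu}{E}\rho g\,(l - z)\,x + \underbrace{g_x(y, z)}_{=0} \\
\frac{\partial u_y}{\partial y} &= \frac{\nu}{E}\rho g\,(l - z) &&\Rightarrow& u_y &= \frac{\nu}{E}\rho g\,(l - z)\,y + \underbrace{g_y(x, z)}_{=0} \\
\frac{\partial u_z}{\partial z} &= -\frac{1}{E}\rho g\,(l - z) &&\Rightarrow& u_z &= \frac{1}{E}\rho g\,\frac{1}{2}\left[(l - z)^2 - l^2\right] + f(x, y) \\
 & & & & &= -\frac{1}{2E}\rho g\left[l^2 - (l - z)^2\right] + f(x, y) .
\end{align*}
With
\[
\frac{\partial u_z}{\partial x} = -\frac{\partial u_x}{\partial z} = \frac{\nu}{E}\rho g\,x
\]
and
\[
\frac{\partial u_z}{\partial y} = -\frac{\partial u_y}{\partial z} = \frac{\nu}{E}\rho g\,y
\]
we obtain finally
\begin{align*}
u_z &= -\frac{1}{2E}\rho g\left[l^2 - (l - z)^2\right] + \frac{\nu}{2E}\rho g\left(x^2 + y^2\right) \\
 &= -\frac{\rho g}{2E}\left\{l^2 - (l - z)^2 - \nu\left(x^2 + y^2\right)\right\} .
\end{align*}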
Figure 10.11 shows the displacement field of the column. Notice that we use reduced
components, i.e.
\begin{align*}
\frac{E}{\nu\rho g}\,u_x &= (l - z)\,x \\
\frac{E}{\nu\rho g}\,u_y &= (l - z)\,y \\
\frac{E}{\nu\rho g}\,u_z &= -\frac{1}{2\nu}\left\{l^2 - (l - z)^2 - \nu\left(x^2 + y^2\right)\right\} .
\end{align*}
Fig. 10.11 Top: Displacement field of an elastic square column due to its own weight. Bottom: Mathematica-program
Notice that our solution, which satisfies $u_z(z = 0) = 0$ for $x = y = 0$ only, is
not valid in the immediate vicinity of the foundation. This is because we have not
specified or dealt with the non-trivial boundary conditions at the interface between
column and foundation (e.g. is gliding possible or not).
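The displacement field of the column can be checked against the strain it was derived from. The sketch below evaluates the three displacement components and estimates $u_{zz} = \partial u_z/\partial z$ by a central difference, which should reproduce $-\rho g\,(l - z)/E$. Function names, the step size and the numerical parameter values are assumptions chosen for illustration.

```python
def column_displacement(x, y, z, l, E, nu, rho_g):
    """Displacement (u_x, u_y, u_z) of the column under its own weight."""
    ux = nu / E * rho_g * (l - z) * x
    uy = nu / E * rho_g * (l - z) * y
    uz = -rho_g / (2.0 * E) * (l**2 - (l - z)**2 - nu * (x**2 + y**2))
    return ux, uy, uz


def strain_zz(x, y, z, l, E, nu, rho_g, step=1.0e-3):
    """Central-difference estimate of u_zz = d u_z / d z."""
    upper = column_displacement(x, y, z + step, l, E, nu, rho_g)[2]
    lower = column_displacement(x, y, z - step, l, E, nu, rho_g)[2]
    return (upper - lower) / (2.0 * step)


# Illustrative parameters in arbitrary consistent units (assumption).
PARAMS = dict(l=10.0, E=1.0, nu=0.3, rho_g=0.01)
u_origin = column_displacement(0.0, 0.0, 0.0, **PARAMS)
uzz_numeric = strain_zz(0.5, -0.2, 4.0, **PARAMS)
uzz_exact = -PARAMS["rho_g"] * (PARAMS["l"] - 4.0) / PARAMS["E"]
```

Because $u_z$ is quadratic in z, the central difference is exact up to rounding, so the two values of $u_{zz}$ should agree to high precision.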
Spherical Shell Subject to Uniform Pressure:
Fig. 10.12 Spherical shell whose inner radius is $R_1$ and whose outer radius is $R_2$
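Figure 10.12 depicts a spherical shell. Inside the shell the pressure is $P_1$, whereas
on its outside the pressure is $P_2$. We want to calculate the deformation of the shell
under these conditions. For symmetry reasons the displacement field, $\vec u$, must be
radially symmetric and dependent on r only. Therefore we employ spherical coordinates
in a coordinate system whose origin is the center of the shell. In addition we
use (10.61) in the following form:
\[
(1 - 2\nu)\,\Delta\vec u + \vec\nabla\left(\vec\nabla\cdot\vec u\right) = 0 .
\]
Using the vector identity $\vec\nabla(\vec\nabla\cdot\vec u) - \vec\nabla\times(\vec\nabla\times\vec u) \equiv \Delta\vec u$ this becomes
\[
-(1 - 2\nu)\,\vec\nabla\times\left(\vec\nabla\times\vec u\right) + 2(1 - \nu)\,\vec\nabla\left(\vec\nabla\cdot\vec u\right) = 0 .
\]
Symmetry requires $\vec u = u(r)\,\vec n$ (with $\vec n = \vec r/r$), which yields
\[
\vec\nabla\times\vec u = \vec\nabla\times\left(u(r)\,\vec n\right)
 = u(r)\underbrace{\vec\nabla\times\vec n}_{=0} - \vec n\times\underbrace{\vec\nabla u(r)}_{=\frac{du(r)}{dr}\vec\nabla r = \frac{du(r)}{dr}\vec n} = 0 .
\]
Hence
\[
\vec\nabla\left(\vec\nabla\cdot\vec u\right) = 0 \quad\text{or}\quad \vec\nabla\cdot\vec u = c \equiv 3a .
\]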
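In spherical coordinates $\vec\nabla = \vec e_r\,\partial/\partial r + \vec e_\phi\,(r\sin\theta)^{-1}\,\partial/\partial\phi + \vec e_\theta\,r^{-1}\,\partial/\partial\theta$, where
the $\vec e_r = |\partial\vec r/\partial r|^{-1}(\partial\vec r/\partial r)$, $\vec e_\phi = \dots$ are the respective orthogonal unit vectors in
the same coordinates (cf. Problem 46). Therefore the right equation yields
\[
\left(\vec e_r\frac{\partial}{\partial r} + \vec e_\phi\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi} + \vec e_\theta\frac{1}{r}\frac{\partial}{\partial\theta}\right)\cdot\left(\vec e_r u_r + \vec e_\phi\underbrace{u_\phi}_{=0} + \vec e_\theta\underbrace{u_\theta}_{=0}\right) = 3a ,
\]
i.e.
\[
3a = \frac{\partial u_r}{\partial r} + \frac{1}{r\sin\theta}\underbrace{\vec e_\phi\cdot\frac{\partial\vec e_r}{\partial\phi}}_{=\sin\theta}u_r + \frac{1}{r}\underbrace{\vec e_\theta\cdot\frac{\partial\vec e_r}{\partial\theta}}_{=1}u_r ,
\]
where
\[
\vec e_\phi\cdot\frac{\partial\vec e_r}{\partial\phi} = \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix}\cdot\begin{pmatrix} -\sin\theta\sin\phi \\ \sin\theta\cos\phi \\ 0 \end{pmatrix} = \sin\theta
\quad\text{and}\quad
\vec e_\theta\cdot\frac{\partial\vec e_r}{\partial\theta} = \begin{pmatrix} \cos\theta\cos\phi \\ \cos\theta\sin\phi \\ -\sin\theta \end{pmatrix}\cdot\begin{pmatrix} \cos\theta\cos\phi \\ \cos\theta\sin\phi \\ -\sin\theta \end{pmatrix} = 1 .
\]
Hence
\[
3a = \frac{\partial u_r}{\partial r} + \frac{2}{r}u_r .
\]
This differential equation possesses the solution $u_r = ar + b/r^2$ (check: $3a = a - 2b/r^3 + 2a + 2b/r^3$).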
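The next step is the calculation of the constants. We write down the components
of the strain tensor (cf. Problem 46):
\begin{align*}
u_{rr} &= \frac{\partial u_r}{\partial r} = a - \frac{2b}{r^3} \\
u_{\phi\phi} &= \frac{u_r}{r} = a + \frac{b}{r^3} \\
u_{\theta\theta} &= \frac{u_r}{r} = a + \frac{b}{r^3} .
\end{align*}
The remaining components vanish. According to (10.46) we have
\begin{align*}
\sigma_{rr} &= \frac{E}{1+\nu}\left(u_{rr} + \frac{\nu}{1-2\nu}\left(u_{rr} + 2u_{\theta\theta}\right)\right) \\
 &= \frac{E}{(1+\nu)(1-2\nu)}\left[(1-\nu)\,u_{rr} + 2\nu u_{\theta\theta}\right] \\
 &= \frac{E}{(1+\nu)(1-2\nu)}\left[(1-\nu)\left(a - \frac{2b}{r^3}\right) + 2\nu\left(a + \frac{b}{r^3}\right)\right] \\
 &= \frac{E}{1-2\nu}\,a - \frac{2E}{1+\nu}\,\frac{b}{r^3} .
\end{align*}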
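Using (10.18) we include the boundary conditions, i.e.
\[
\sigma_{rr} = -P_1 \ \text{at}\ r = R_1 \quad\text{and}\quad \sigma_{rr} = -P_2 \ \text{at}\ r = R_2 .
\]
Thus
\[
a = \frac{P_1R_1^3 - P_2R_2^3}{R_2^3 - R_1^3}\,\frac{1-2\nu}{E} \quad\text{and}\quad b = \frac{R_1^3R_2^3\,(P_1 - P_2)}{R_2^3 - R_1^3}\,\frac{1+\nu}{2E} . \tag{10.62}
\]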
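The constants a and b of (10.62) can be verified by inserting them back into the radial stress. The following sketch does this numerically. The function names and the parameter values are illustrative assumptions; the tangential stress is not written out in the text at this point and is obtained here from (10.46) with $u_{\phi\phi} = u_{\theta\theta}$, which gives $\sigma_{\theta\theta} = Ea/(1-2\nu) + Eb/((1+\nu)r^3)$.

```python
def shell_constants(P1, P2, R1, R2, E, nu):
    """Integration constants a and b of u_r = a*r + b/r**2, cf. (10.62)."""
    denom = R2**3 - R1**3
    a = (P1 * R1**3 - P2 * R2**3) / denom * (1.0 - 2.0 * nu) / E
    b = R1**3 * R2**3 * (P1 - P2) / denom * (1.0 + nu) / (2.0 * E)
    return a, b


def sigma_rr(r, a, b, E, nu):
    """Radial stress inside the shell."""
    return E * a / (1.0 - 2.0 * nu) - 2.0 * E * b / ((1.0 + nu) * r**3)


def sigma_tt(r, a, b, E, nu):
    """Tangential stress, derived from (10.46) (assumption, see lead-in)."""
    return E * a / (1.0 - 2.0 * nu) + E * b / ((1.0 + nu) * r**3)


# Illustrative values in arbitrary consistent units (assumption).
E_MOD, NU = 21.0, 0.28
A, B = shell_constants(P1=2.0, P2=0.5, R1=1.0, R2=1.5, E=E_MOD, nu=NU)
inner = sigma_rr(1.0, A, B, E_MOD, NU)   # should equal -P1
outer = sigma_rr(1.5, A, B, E_MOD, NU)   # should equal -P2
```

Choosing a very large outer radius with $P_1 = 0$ turns the same functions into a numerical model of a cavity in an extended medium.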
Let's study two special cases:
(i) In the case of a thin shell of thickness $h = R_2 - R_1 \ll R$ and $P_2 = 0$ as well as
$P_1 = P$ we find
\[
u_r = \frac{1-\nu}{E}\,\frac{PR^2}{2h} \tag{10.63}
\]
and
\[
\sigma_{\theta\theta} = \sigma_{\phi\phi} = \frac{PR}{2h} \quad\text{and}\quad \bar\sigma_{rr} = \frac{P}{2} .
\]
The bar means that the radial stress is an average across the shell's thickness.
Remark: Also interesting is the case $P_1 = P_2 = P$ for which $a = -P\,(1-2\nu)/E$ and $b = 0$.
Therefore $u_r = -P\,\frac{1-2\nu}{E}\,r$, i.e. the shell shrinks proportional to P.
(ii) The opposite extreme is a spherical cavity embedded in an infinite elastic medium
(see Fig. 10.13), i.e. $R_1 = R$ and $R_2 = \infty$. We assume that the pressure in the cavity
is $P_1 = 0$, whereas the pressure outside the medium is $P_2 = P$.
Here
\begin{align*}
\sigma_{rr} &= -P\left(1 - \frac{R^3}{r^3}\right) \\
\sigma_{\theta\theta} = \sigma_{\phi\phi} &= -P\left(1 + \frac{R^3}{2r^3}\right) ,
\end{align*}
i.e. on the surface of the cavity the tangential stress components $\sigma_{\theta\theta} = \sigma_{\phi\phi} = -3P/2$
exceed the pressure at infinity considerably. This means that fracture of the medium
in response to the outside pressure is likely to originate at the surface of a cavity.
Fig. 10.13 A spherical cavity of radius R embedded in an infinite elastic medium
Filler Particle Inside a Stretched Matrix:
The spherical cavity of the previous example here is replaced by a solid particle.
Instead of a uniform compression, the medium or matrix surrounding the particle
is subject to a unidirectional tensile stress, $\sigma_{zz}^{(o)}$, along the z-direction. Without the
particle this is analogous to the above deformation of the column in the gravitational
field - only in this case gravitation is replaced by $\sigma_{zz}^{(o)}$. Thus we obtain the attendant
displacement components by simply replacing $\rho g l$ by $-\sigma_{zz}^{(o)}$ in combination with
$l \to \infty$, i.e.
\[
u_x^{(o)} = -\frac{\nu}{E}\sigma_{zz}^{(o)}x \qquad u_y^{(o)} = -\frac{\nu}{E}\sigma_{zz}^{(o)}y \qquad u_z^{(o)} = \frac{1}{E}\sigma_{zz}^{(o)}z . \tag{10.64}
\]
These are displacements with respect to an origin at the center of the elastic medium.
Now we include the solid, i.e. non-deformable, spherical particle. We
assume that the surface of the particle is tightly connected to the matrix material,
i.e. the displacement components vanish on the particle's surface. At large distances
from the particle we expect that the displacement components are given by (10.64).
Hence
\[
\vec u = \vec u^{(o)} + \vec u^{(1)} . \tag{10.65}
\]

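In addition we assume that $\vec u^{(1)}$ is linear in $\sigma_{ik}^{(o)}$ and that it can be expressed in terms
of derivatives of $r^n$ ($n = 0, \pm1, \pm2, \dots$), where r is the radial distance from the
origin (cf. Problem 12 in Chap. 1 of [5]). The most general form of this solution is
\[
u_i^{(1)} = \sum_n\left(c_{o,n}\,\sigma_{kk}^{(o)}\partial_i + c_{1,n}\,\sigma_{ik}^{(o)}\partial_k + c_{2,n}\,\sigma_{kl}^{(o)}\partial_i\partial_k\partial_l\right)r^n \tag{10.66}
\]
(where we use the summation convention and $\partial_i \equiv \partial/\partial x_i$). All $c_{j,n}$ belonging to
terms causing $u_i^{(1)}$ to not vanish at infinity must be equal to zero. In addition $u_i^{(1)}$
must satisfy the equilibrium conditions
\[
(1 - 2\nu)\,\Delta\vec u + \vec\nabla\left(\vec\nabla\cdot\vec u\right) = 0 \tag{10.67}
\]
and it must vanish on the particle's surface, i.e.
\[
\vec u(r = 1) = 0 . \tag{10.68}
\]
Here and in the following we set the particle radius equal to unity. We can see from
(10.46) that $\sigma_{ik}^{(o)} = 0$ (the only exception is $\sigma_{zz}^{(o)}$). Thus (10.66) leads to
\begin{align*}
u_x^{(1)} &= \sigma_{zz}^{(o)}\sum_n\left(c_{o,n} + c_{2,n}\,\partial_z^2\right)\partial_x r^n \\
u_y^{(1)} &= \sigma_{zz}^{(o)}\sum_n\left(c_{o,n} + c_{2,n}\,\partial_z^2\right)\partial_y r^n \\
u_z^{(1)} &= \sigma_{zz}^{(o)}\sum_n\left(c_{o,n} + c_{1,n} + c_{2,n}\,\partial_z^2\right)\partial_z r^n .
\end{align*}
Using a certain amount of foresight we continue considering terms with $n = \pm1$
only. The above equations then become
\begin{align*}
\frac{u_x^{(1)}}{\sigma_{zz}^{(o)}} &= \left(c_{o,-1} + c_{2,-1}\,\partial_z^2\right)\partial_x\frac{1}{r} + c_{2,1}\,\partial_z^2\partial_x r \\
\frac{u_y^{(1)}}{\sigma_{zz}^{(o)}} &= \left(c_{o,-1} + c_{2,-1}\,\partial_z^2\right)\partial_y\frac{1}{r} + c_{2,1}\,\partial_z^2\partial_y r \\
\frac{u_z^{(1)}}{\sigma_{zz}^{(o)}} &= \left(c_{o,-1} + c_{1,-1} + c_{2,-1}\,\partial_z^2\right)\partial_z\frac{1}{r} + c_{2,1}\,\partial_z^3 r .
\end{align*}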
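Thus $c_{o,1} = c_{1,1} = 0$, because otherwise $\vec u$ does not vanish at infinity. Inserting these
equations into the equilibrium conditions (10.67) shows that they are a solution
only if
\[
c_{1,-1} + 4(1 - \nu)\,c_{2,1} = 0 . \tag{10.69}
\]
Additional equations for the constants follow via the boundary conditions (10.68):
\begin{align}
c_{o,-1} &= \frac{(1 - 2\nu)(1 - 5\nu)}{2E\,(4 - 5\nu)} \tag{10.70}\\
c_{1,-1} &= \frac{5(1 - \nu)(1 + \nu)}{E\,(4 - 5\nu)} \tag{10.71}\\
c_{2,-1} &= -\frac{1 + \nu}{4E\,(4 - 5\nu)} \tag{10.72}\\
c_{2,1} &= -\frac{5(1 + \nu)}{4E\,(4 - 5\nu)} \tag{10.73}
\end{align}
(evaluated using Mathematica). The final result is
\begin{align}
u_x &= -\frac{\nu}{E}\sigma_{zz}^{(o)}\,x\left[1 - \frac{1}{r^3} - \frac{b}{\nu r^3}\left(1 - \frac{1}{r^2}\right)\left(1 - 5\frac{z^2}{r^2}\right)\right] \tag{10.74}\\
u_y &= -\frac{\nu}{E}\sigma_{zz}^{(o)}\,y\left[1 - \frac{1}{r^3} - \frac{b}{\nu r^3}\left(1 - \frac{1}{r^2}\right)\left(1 - 5\frac{z^2}{r^2}\right)\right] \tag{10.75}\\
u_z &= \frac{1}{E}\sigma_{zz}^{(o)}\,z\left[1 - \frac{1}{r^3} + \frac{b}{r^3}\left(1 - \frac{1}{r^2}\right)\left(3 - 5\frac{z^2}{r^2}\right)\right] , \tag{10.76}
\end{align}
where
\[
b = \frac{3}{4}\,\frac{1 + \nu}{4 - 5\nu} . \tag{10.77}
\]
Fig. 10.14 A spherical filler particle embedded in an elastic medium for which ν = 1/2. The medium is subject to a tensile stress along the z-direction
Figure 10.14 depicts the displacement field in the vicinity of the particle.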
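Two properties of the displacement field (10.74) through (10.76) are easy to test numerically: it vanishes on the surface of the particle (r = 1), and far away from the particle it approaches the undisturbed field (10.64). The sketch below implements the formulas as reconstructed above; the function names and the parameter values are assumptions made for illustration.

```python
import math


def filler_displacement(x, y, z, sigma, E, nu):
    """Displacement (10.74)-(10.76) around a rigid unit sphere at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    b = 0.75 * (1.0 + nu) / (4.0 - 5.0 * nu)       # Eq. (10.77)
    monopole = 1.0 - 1.0 / r**3
    shell = (1.0 - 1.0 / r**2) / r**3
    cos2 = (z / r) ** 2
    lateral = monopole - b / nu * shell * (1.0 - 5.0 * cos2)
    axial = monopole + b * shell * (3.0 - 5.0 * cos2)
    ux = -nu / E * sigma * x * lateral
    uy = -nu / E * sigma * y * lateral
    uz = sigma / E * z * axial
    return ux, uy, uz


def undisturbed_displacement(x, y, z, sigma, E, nu):
    """Displacement (10.64) of the matrix without the particle."""
    return (-nu / E * sigma * x, -nu / E * sigma * y, sigma / E * z)


# Illustrative parameters (assumption): incompressible matrix, unit stress.
SIGMA, E_MOD, NU = 1.0, 1.0, 0.5
on_surface = filler_displacement(0.6, 0.0, 0.8, SIGMA, E_MOD, NU)
far_away = filler_displacement(300.0, 400.0, 1200.0, SIGMA, E_MOD, NU)
far_reference = undisturbed_displacement(300.0, 400.0, 1200.0, SIGMA, E_MOD, NU)
```

Note that the expression divides by ν, so this sketch is not meant to be evaluated at ν = 0.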
It is interesting to calculate the components of the stress tensor at the interface
between the particle and the medium. Here we use (10.46) with ν = 1/2, i.e. the
matrix possesses a constant volume. A material, which satisfies this constraint to very
good approximation, is rubber (cf. the parameter table on p. 311). We employ the
formulas in Problem 46, in order to obtain the stress tensor components in spherical
coordinates. The displacement components $u_r$, $u_\phi$, and $u_\theta$ follow via
\[
u_r = u_x\,\vec e_x\cdot\vec e_r + u_y\,\vec e_y\cdot\vec e_r + u_z\,\vec e_z\cdot\vec e_r \tag{10.78}
\]
and the analogous equations for $u_\phi$ and $u_\theta$. The $\vec e_i$ are unit vectors in the respective
coordinates (e.g. $\vec e_r = (\cos\phi\sin\theta, \sin\phi\sin\theta, \cos\theta)$, $\vec e_\phi = (-\sin\phi, \cos\phi, 0)$,
$\vec e_\theta = (\cos\phi\cos\theta, \sin\phi\cos\theta, -\sin\theta)$). The stress components on the particle surface are
\[
\sigma_{rr} = \sigma_{\phi\phi} = \sigma_{\theta\theta} = -\frac{1}{2}\left(1 - 5\cos^2\theta\right)\sigma_{zz}^{(o)}
\]
as well as
\[
\sigma_{r\phi} = \sigma_{\theta\phi} = 0 \qquad \sigma_{r\theta} = -\frac{5}{4}\sin(2\theta)\,\sigma_{zz}^{(o)} .
\]
The radial and tangential stresses are highest for z = ±1, i.e. at the poles these stresses
are twice as large as the external tensile stress. In addition the result is independent of
the sphere's radius. The first two of the shear stresses in the second line are zero (also
for r > 1), whereas the shear stress $\sigma_{r\theta}$, acting in radial direction, is different from
zero and also exceeds the outside tensile stress in the vicinity of the particle's equator.
Advanced Example: Tire rubber is a complex material. For instance, it must

be both elastic and strong. The necessary strength is due to two steps in the
manufacturing process. One is vulcanization, which adds chemical cross links
in the form of sulfur bridges between the polymer chains of the rubber. The
other is the addition of filler (carbon black or silica nanoparticles). In principle,
modern or active fillers increase the local cross link density by offering addi-
tional anchor sites for the polymer chains on the particle surfaces. The number
of these anchor sites should exceed the sulfur cross links, which the particle
replaces by its presence. This means that filler particles must be very small
(typical radii R = 10–20 nm), because their surface grows as R 2 , whereas their
volume grows as R 3 , i.e. large particles displace more cross links in comparison
to the surface anchor sites which they contribute.
In the following, based on the previous calculation, we derive an expression
for the enhancement of the (rubber) shear modulus if a small amount of filler
particles is added. Even though the resulting Einstein–Smallwood equation,
i.e.
\[
\mu^* = \mu\left(1 + \frac{5}{2}\phi\right) , \tag{10.79}
\]
is not a really satisfactory description, it remains the starting point for more
elaborate models of rubber reinforcement. Here μ* is the shear modulus of
the filled rubber, whereas μ is the shear modulus of the unfilled rubber. The
quantity φ is the volume fraction, i.e. the ratio of the volume occupied by filler
particles to the total volume of the rubber material. This equation describes
the limit of very small filler concentration, i.e. φ ≪ 1. Einstein derived this
equation in the context of the viscosity of liquids, where the shear modulus
is replaced by the viscosity coefficient (Annalen der Physik 19, 289 (1906);
34, 591 (1911)). The particular appeal of this equation is that in the limit of
dilute spherical particles tightly bonded to an elastic matrix it is an exact result.
However, the filler concentration in modern tires is rather large and the size
and shape of the nanoparticles matters. Notice that the theory of elasticity is a
continuum theory, which does not include effects related to molecular structure
like the above polymer adsorption sites.
First we calculate the excess free energy, $F_{ex}$, i.e. the elastic free energy
contribution due to the embedded sphere in the previous example. Using the
free energy density (10.30) and ν = 1/2 we obtain
\[
F_{ex} = \int_V dV\left(\frac{\lambda}{2}\left(u_{ii}^{(1)}\right)^2 + \mu\left(u_{ik}^{(1)}\right)^2\right) = \frac{v_{sphere}}{4E}\left(\sigma_{zz}^{(o)}\right)^2 . \tag{10.80}
\]
Here $v_{sphere}$ is the volume of the sphere. Next we assume that the matrix contains
N such spheres. However, the average distance between the spheres is
sufficiently large so that their displacement fields do not interfere significantly.
In this case
\[
\frac{F_{ex}}{V} = \frac{\left(\sigma_{zz}^{(o)}\right)^2}{2E}\,\frac{\phi}{2} ,
\]
where $\phi = Nv_{sphere}/V$. If we now add the elastic free energy inside the volume
V in the absence of the embedded spheres, then the total elastic free energy is
given by
\[
\frac{F}{V} = \frac{\left(\sigma_{zz}^{(o)}\right)^2}{2E}\left(1 + \frac{\phi}{2}\right) . \tag{10.81}
\]
The second ingredient to the Einstein–Smallwood equation is a calculation
of the above F/V along a different route. Suppose we consider the filled
rubber to be a uniformly elastic material possessing the elastic modulus $E^*$. In
this case
\begin{align}
u_x &= -\frac{\nu}{E}\sigma_{zz}^{(o)}\,x\,(1 - \phi) \tag{10.82}\\
u_y &= -\frac{\nu}{E}\sigma_{zz}^{(o)}\,y\,(1 - \phi) \tag{10.83}\\
u_z &= \frac{1}{E}\sigma_{zz}^{(o)}\,z\,(1 - \phi) . \tag{10.84}
\end{align}
But where do the factors (1 − φ) come from and why doesn't $E$ have an asterisk?
We imagine the entire material to be divided into volume elements. Every
volume element contributes a small part to the displacement at a distant point
(x, y, z). The exceptions are volume elements inside one of the filler particles,
which cannot be deformed. Their overall volume fraction is φ. This reduces the
displacements by a factor (1 − φ). Notice also that outside the filler particles
the elastic modulus is $E$ and not $E^*$. If we use these strain components and
ν = 1/2 to calculate F/V, then we obtain
\[
\frac{F}{V} = \frac{\left(\sigma_{zz}^{(o)}\right)^2}{2E^2}\,E^*\,(1 - \phi)^2 . \tag{10.85}
\]
Equating the right sides of (10.81) and (10.85) yields
\[
\frac{\left(\sigma_{zz}^{(o)}\right)^2}{2E}\left(1 + \frac{\phi}{2}\right) = \frac{\left(\sigma_{zz}^{(o)}\right)^2}{2E^2}\,E^*\,(1 - \phi)^2
\]
and thus
\[
E^* = E\,\frac{1 + \phi/2}{(1 - \phi)^2} \approx E\left(1 + \frac{5}{2}\phi\right) .
\]
With $\mu = E/3$ and $\mu^* = E^*/3$ follows the desired result.
What we have just described can be found in a research article by H.M.
Smallwood [6]. A comprehensive exposition is given by T.A. Vilgis, G. Heinrich,
M. Klüppel in [7].
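To see how quickly the linear law (10.79) departs from the unexpanded ratio $(1 + \phi/2)/(1 - \phi)^2$ obtained above, both can be tabulated. This is a small sketch with hypothetical function names; expanding the ratio to second order gives $1 + \frac{5}{2}\phi + 4\phi^2 + \dots$, so the difference should grow roughly as $4\phi^2$.

```python
def modulus_ratio_unexpanded(phi):
    """E*/E = (1 + phi/2) / (1 - phi)**2, before linearization."""
    return (1.0 + 0.5 * phi) / (1.0 - phi) ** 2


def modulus_ratio_linear(phi):
    """Einstein-Smallwood ratio 1 + (5/2) phi, cf. (10.79)."""
    return 1.0 + 2.5 * phi


# Difference between the two expressions for a few small volume fractions.
DIFFERENCES = {phi: modulus_ratio_unexpanded(phi) - modulus_ratio_linear(phi)
               for phi in (0.001, 0.01, 0.05, 0.1)}
```

Keep in mind that the derivation itself assumes non-interacting spheres, so neither expression should be trusted at large φ.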
Free Energy of a Bent Plate:
We consider a thin and slightly bent plate whose thickness is h (cf. Fig. 10.15).
The displacement vector of the neutral surface is
\[
\vec u^{(0)} = \begin{pmatrix} 0 \\ 0 \\ \zeta(x, y) \end{pmatrix} \tag{10.86}
\]
We make the following simplifying assumptions:
(i) The internal stresses are much larger than the external forces $P_i$ (see footnote 9). Using (10.18)
with $P_i = 0$ this yields
\[
\sigma_{ik}n_k = 0 .
\]
(ii) Because $\vec n$ is parallel to the z-direction, with sufficient accuracy, we have
\[
\sigma_{xz} = \sigma_{yz} = \sigma_{zz} = 0 . \tag{10.87}
\]
(iii) In addition, because h is so small, (10.87) is satisfied everywhere inside the
plate. Using $u_{ik} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right)$ …
…and (10.52) (see footnote 10) yields
\[
\frac{\partial u_x}{\partial z} = -\frac{\partial u_z}{\partial x} \approx -\frac{\partial\zeta}{\partial x} \quad\Rightarrow\quad u_x \approx -z\frac{\partial\zeta}{\partial x}
\]
…and (10.53) yields
\[
\frac{\partial u_y}{\partial z} = -\frac{\partial u_z}{\partial y} \approx -\frac{\partial\zeta}{\partial y} \quad\Rightarrow\quad u_y \approx -z\frac{\partial\zeta}{\partial y}
\]
…and (10.50) yields
\[
u_{zz} = -\frac{\nu}{1-\nu}\left(u_{xx} + u_{yy}\right) \quad\Rightarrow\quad u_z = \frac{\nu}{1-\nu}\,\frac{z^2}{2}\left(\frac{\partial^2\zeta}{\partial x^2} + \frac{\partial^2\zeta}{\partial y^2}\right) .
\]
Fig. 10.15 Bent plate. The neutral surface is defined by the cross-over from compression to tension
9 If this is difficult to understand, you should think of a tightrope walker. The tension on the rope
must be large compared to the weight force exerted by the walker in order to prevent slack.
10 The integration constant is zero, because $u_x = u_y = 0$ at z = 0 (neutral surface!).
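The remaining components $u_{ik}$ are
\begin{align*}
u_{xx} &= \frac{\partial u_x}{\partial x} = -z\frac{\partial^2\zeta}{\partial x^2} \\
u_{yy} &= \frac{\partial u_y}{\partial y} = -z\frac{\partial^2\zeta}{\partial y^2} \\
u_{zz} &= \frac{\nu}{1-\nu}\,z\left(\frac{\partial^2\zeta}{\partial x^2} + \frac{\partial^2\zeta}{\partial y^2}\right) \\
u_{xy} &= -\frac{z}{2}\left(\frac{\partial^2\zeta}{\partial y\partial x} + \frac{\partial^2\zeta}{\partial x\partial y}\right) = -z\frac{\partial^2\zeta}{\partial x\partial y} \\
u_{xz} &= \frac{1}{2}\left(\frac{\partial u_x}{\partial z} + \frac{\partial u_z}{\partial x}\right) \approx 0 \quad\text{cf. above} \\
u_{yz} &= \frac{1}{2}\left(\frac{\partial u_y}{\partial z} + \frac{\partial u_z}{\partial y}\right) \approx 0 \quad\text{cf. above}
\end{align*}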

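The free energy density, $\hat F = \hat F_o + \frac{E}{2(1+\nu)}\left(u_{ik}^2 + \frac{\nu}{1-2\nu}u_{ll}^2\right)$, becomes
\begin{align*}
\hat F &= \hat F_o + \frac{E}{2(1+\nu)}\Big[u_{xx}^2 + u_{yy}^2 + u_{zz}^2 + 2u_{xy}^2 \\
 &\qquad + \frac{\nu}{1-2\nu}\left(u_{xx}^2 + u_{yy}^2 + u_{zz}^2 + 2u_{xx}u_{yy} + 2u_{xx}u_{zz} + 2u_{yy}u_{zz}\right)\Big] \\
 &= \hat F_o + \frac{E}{2(1+\nu)}\Big[\frac{1-\nu}{1-2\nu}\left(u_{xx}^2 + u_{yy}^2 + u_{zz}^2\right) + 2u_{xy}^2 \\
 &\qquad + \frac{2\nu}{1-2\nu}\left(u_{xx}u_{yy} + u_{xx}u_{zz} + u_{yy}u_{zz}\right)\Big] \\
 &= \hat F_o + \frac{E}{2(1+\nu)}\Bigg[\frac{1-\nu}{1-2\nu}\,z^2\left\{\left(\frac{\partial^2\zeta}{\partial x^2}\right)^2 + \left(\frac{\partial^2\zeta}{\partial y^2}\right)^2 + \frac{\nu^2}{(1-\nu)^2}\left(\frac{\partial^2\zeta}{\partial x^2} + \frac{\partial^2\zeta}{\partial y^2}\right)^2\right\} \\
 &\qquad + 2z^2\left(\frac{\partial^2\zeta}{\partial x\partial y}\right)^2 + \frac{2\nu}{1-2\nu}\,z^2\left(\frac{\partial^2\zeta}{\partial x^2}\frac{\partial^2\zeta}{\partial y^2} - \frac{\nu}{1-\nu}\left(\frac{\partial^2\zeta}{\partial x^2} + \frac{\partial^2\zeta}{\partial y^2}\right)^2\right)\Bigg] .
\end{align*}
Collecting terms and integrating over the plate's volume yields the free energy of the
entire plate:
\[
F_{Pl} = \frac{Eh^3}{24\left(1 - \nu^2\right)}\int dx\,dy\left[\left(\frac{\partial^2\zeta}{\partial x^2} + \frac{\partial^2\zeta}{\partial y^2}\right)^2 + 2(1 - \nu)\left\{\left(\frac{\partial^2\zeta}{\partial x\partial y}\right)^2 - \frac{\partial^2\zeta}{\partial x^2}\frac{\partial^2\zeta}{\partial y^2}\right\}\right] . \tag{10.88}
\]
Notice that the second term can be eliminated via partial integration when the surface
terms vanish (e.g. ζ = 0 at the plate's rim).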
Euler Buckling:
Figure 10.16 shows a thin plate between the jaws of a vice, which apply the force
T to the plate. The question is: At what force, i.e. $T = T_{crit}$, does the plate begin to
buckle? The special geometry as well as $1 - \nu^2 \approx 1$ simplify the problem. Using
ζ = ζ(x) we obtain from (10.88) the simpler expression
\[
F_{Pl} = \frac{Eh^3}{24}\int dx\,dy\left(\frac{\partial^2\zeta}{\partial x^2}\right)^2 .
\]
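The work done bending the plate is $w_T$ (cf. the sketch in Fig. 10.16). From
\[
ds - dx = \sqrt{dx^2 + (d\zeta(x))^2} - dx \approx dx\left[1 + \frac{1}{2}\left(\frac{\partial\zeta(x)}{\partial x}\right)^2 - 1\right]
\]
follows
\[
w_T = \frac{1}{2}T\int dx\left(\frac{\partial\zeta}{\partial x}\right)^2 = \frac{1}{2}\sigma\int_V dV\left(\frac{\partial\zeta}{\partial x}\right)^2 .
\]
Here V is the plate's volume and σ is the stress defined via T divided by the cross-sectional
area of the plate. At the onset of buckling the variation of $F_{Pl} - w_T$ with respect to ζ
vanishes, i.e.
\[
\delta_\zeta\left(F_{Pl} - w_T\right) = \delta_\zeta\int\left[\frac{E}{2}z^2\left(\frac{\partial^2\zeta}{\partial x^2}\right)^2 - \frac{\sigma}{2}\left(\frac{\partial\zeta}{\partial x}\right)^2\right]dV = 0 . \tag{10.89}
\]
Notice: $\frac{1}{h}\int_{-h/2}^{h/2}z^2\,dz = \frac{1}{12}h^2 \equiv I$. Hence
\[
\delta_\zeta\,\frac{1}{2}\int\left[EI\left(\frac{\partial^2\zeta}{\partial x^2}\right)^2 - \sigma\left(\frac{\partial\zeta}{\partial x}\right)^2\right]dx\,dy = 0 . \tag{10.90}
\]
The variation, i.e.
\[
\int\left[EI\,\frac{\partial^2\zeta}{\partial x^2}\frac{\partial^2}{\partial x^2}\delta\zeta - \sigma\,\frac{\partial\zeta}{\partial x}\frac{\partial}{\partial x}\delta\zeta\right]dx\,dy = 0 ,
\]
followed by a partial integration (δζ(x) = 0 at ±L/2) yields the differential equation
\[
EI\,\frac{\partial^4\zeta}{\partial x^4} + \sigma\,\frac{\partial^2\zeta}{\partial x^2} = 0 .
\]
Inserting $\zeta(x) = \zeta_0\sin(qx)$ leads to
\[
EIq^4 - \sigma q^2 = 0 \quad\text{or}\quad \sigma_{crit} = EIq_{min}^2 .
\]
Fig. 10.16 Top left: Thin plate subject to an external stress. Lower right: Explanatory sketch in the context of bending work
Fig. 10.17 Boundary conditions of the bent plate: clamped (i = 1) and not clamped (i = 1/2)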
Notice that q = 0 corresponds to the flat plate, whereas the bent plate is characterized
by q > 0. The quantity $\sigma_{crit}$ is the smallest stress causing a bending instability.
Notice that $q_{min}$ must satisfy certain boundary conditions. Two different boundary
conditions are shown in Fig. 10.17. On the left side (i = 1) the lower and the upper
edge of the plate are clamped, whereas on the right (i = 1/2) this is not the case.
Thus we write q = 2π/λ, where λ is the wavelength of the bending instability. In
the first case (i = 1) we have λ = L, whereas in the second case (i = 1/2) λ = 2L.
We may combine both boundary conditions by writing $q_{min} = 2\pi i/L$. Inserting this
in the upper equation for $\sigma_{crit}$ yields
\[
\sigma_{crit} = EI\,\frac{4\pi^2i^2}{L^2} .
\]
Notice that $\sigma_{crit} \propto L^{-2}$, i.e. plates with a large L can support less load - as we know
from experience. This phenomenon is known as Euler buckling. Here we have used
a plate for convenience, because then the problem can be solved in two dimensions.
The phenomenon as such also occurs in the case of beams or columns, but then it is
more complicated to deal with (e.g. [8]).
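The scaling of the critical stress with the plate length and with the boundary condition can be illustrated with a few lines of code. The sketch below evaluates $\sigma_{crit} = EI\,4\pi^2i^2/L^2$ with $I = h^2/12$; the function name and the sample numbers are assumptions for illustration only.

```python
import math


def euler_critical_stress(E, h, L, i=1.0):
    """Critical stress E*I*(2*pi*i/L)**2 with I = h**2/12.

    i = 1 corresponds to clamped edges, i = 1/2 to edges that are not clamped.
    """
    moment = h**2 / 12.0
    return E * moment * (2.0 * math.pi * i / L) ** 2


# Illustrative plate (assumption): E in MPa, h and L in the same length unit.
SIGMA_CLAMPED = euler_critical_stress(E=2.1e5, h=1.0, L=100.0, i=1.0)
SIGMA_FREE = euler_critical_stress(E=2.1e5, h=1.0, L=100.0, i=0.5)
SIGMA_LONG = euler_critical_stress(E=2.1e5, h=1.0, L=200.0, i=1.0)
```

Doubling L reduces the critical stress by a factor of four, and releasing the clamped edges (i = 1/2 instead of i = 1) has the same effect.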
• Problem 47 - Euler Buckling of a Plate Supported by an Elastic Matrix:
We assume that the plate of the previous example is embedded in an elastic
matrix, which we 'simulate' by replacing the integral in (10.90) via
\[
\int\left[\frac{EI}{2}\left(\frac{\partial^2\zeta}{\partial x^2}\right)^2 + \frac{K}{2}\zeta^2 - \frac{\sigma}{2}\left(\frac{\partial\zeta}{\partial x}\right)^2\right]dx\,dy .
\]
The quantity K describes the stiffness of the matrix material. What is $\sigma_{crit}$ in
this case?
Solution: This time the variation yields
\[
\int\left[EI\,\frac{\partial^2\zeta}{\partial x^2}\frac{\partial^2}{\partial x^2}\delta\zeta + K\zeta\,\delta\zeta - \sigma\,\frac{\partial\zeta}{\partial x}\frac{\partial}{\partial x}\delta\zeta\right]dx\,dy = 0 .
\]
Partial integration (δζ = 0 at x = ±L/2) leads to
\[
EI\,\frac{\partial^4\zeta}{\partial x^4} + K\zeta + \sigma\,\frac{\partial^2\zeta}{\partial x^2} = 0 . \tag{10.91}
\]
Again we insert $\zeta(x) = \zeta_o\sin(qx)$, which yields
\[
EIq^4 + K - \sigma q^2 = 0 , \tag{10.92}
\]
i.e.
\[
\sigma_{crit} = \min_q\left\{EIq^2 + \frac{K}{q^2}\right\} = \min_n\left\{EI\left(\frac{2\pi i}{L}\right)^2n^2 + K\left(\frac{L}{2\pi i}\right)^2\frac{1}{n^2}\right\} \tag{10.93}
\]
(n = 1, 2, . . . ). For small n (small K) we have to determine the minimum
numerically. If n is large (large K), n as well as q can be treated as continuous
variables and we determine $q_{min}$ as before, i.e.
\[
q_{min} \approx \left(\frac{K}{EI}\right)^{1/4} \tag{10.94}
\]
and
\[
\sigma_{crit} \approx 2\sqrt{EIK} . \tag{10.95}
\]
Notice that the presence of the elastic medium eliminates the dependence of
$\sigma_{crit}$ on L. Instead of a single buckle we now obtain a wave-like deformation
as shown in the following sketch.
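The numerical minimization over n in (10.93) and its continuum limit (10.95) can be compared directly. In the sketch below the product EI is passed as a single parameter; the function names, the search range for n and the parameter values are illustrative assumptions.

```python
import math


def sigma_of_n(n, EI, K, L, i=1.0):
    """Bracket of (10.93) for mode number n, with q = 2*pi*i*n/L."""
    q = 2.0 * math.pi * i * n / L
    return EI * q**2 + K / q**2


def sigma_crit_discrete(EI, K, L, i=1.0, n_max=10000):
    """Minimum of (10.93) over the integers n = 1, ..., n_max."""
    return min(sigma_of_n(n, EI, K, L, i) for n in range(1, n_max + 1))


def sigma_crit_continuum(EI, K):
    """Continuum approximation (10.95)."""
    return 2.0 * math.sqrt(EI * K)


# Stiff matrix (assumption): many half waves fit into the plate length.
STIFF_DISCRETE = sigma_crit_discrete(EI=1.0, K=1.0e8, L=1.0)
STIFF_CONTINUUM = sigma_crit_continuum(EI=1.0, K=1.0e8)
# No matrix at all: the result reduces to the Euler value EI*(2*pi/L)**2.
FREE_PLATE = sigma_crit_discrete(EI=1.0, K=0.0, L=1.0)
```

For the stiff matrix the discrete minimum lies slightly above the continuum value, because the optimal wave number $(K/EI)^{1/4}$ generally does not coincide with an allowed $q = 2\pi in/L$.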
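Elastic Waves in Isotropic Media:
According to (10.12)
\[
\rho\ddot u_i = \frac{\partial\sigma_{ik}}{\partial x_k} , \tag{10.96}
\]
where ρ is the mass density. Using (10.60) we obtain for isotropic media
\[
\rho\ddot{\vec u} = \frac{E}{2(1+\nu)}\,\Delta\vec u + \frac{E}{2(1+\nu)(1-2\nu)}\,\vec\nabla\left(\vec\nabla\cdot\vec u\right) . \tag{10.97}
\]
Let's assume a plane wave in x-direction, i.e. $\vec u = \vec u(x, t)$. In this case
\[
\rho\ddot{\vec u} = \frac{E}{2(1+\nu)}\frac{\partial^2}{\partial x^2}\vec u + \frac{E}{2(1+\nu)(1-2\nu)}\begin{pmatrix} \partial^2u_x/\partial x^2 \\ 0 \\ 0 \end{pmatrix} .
\]
Hence
\[
\rho\ddot u_x = \frac{E}{2(1+\nu)}\left(1 + \frac{1}{1-2\nu}\right)\frac{\partial^2u_x}{\partial x^2}
\]
or
\[
\frac{\partial^2u_x}{\partial x^2} - \frac{1}{c_l^2}\frac{\partial^2u_x}{\partial t^2} = 0 . \tag{10.98}
\]
The quantity $c_l$ is the longitudinal wave or sound velocity in the medium, i.e.
\[
c_l = \sqrt{\frac{E\,(1-\nu)}{\rho\,(1+\nu)(1-2\nu)}} . \tag{10.99}
\]
In the y-direction we have
\[
\rho\ddot u_y = \frac{E}{2(1+\nu)}\frac{\partial^2}{\partial x^2}u_y
\]
or
\[
\frac{\partial^2u_y}{\partial x^2} - \frac{1}{c_t^2}\frac{\partial^2u_y}{\partial t^2} = 0 . \tag{10.100}
\]
Here $c_t$ is the transversal wave or sound velocity given by
\[
c_t = \sqrt{\frac{E}{2\rho\,(1+\nu)}} . \tag{10.101}
\]
The same is true for $u_z$. Notice: $\frac{c_l}{c_t} = \sqrt{2\,\frac{1-\nu}{1-2\nu}}$, i.e. $c_l > c_t\sqrt{4/3}$ with $-1 \le \nu \le \frac{1}{2}$.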
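As a numerical illustration of (10.99) and (10.101), the following sketch evaluates both sound velocities. The elastic constants are taken from the steel row of Table 10.1 (E = 21 · 10^4 MPa, ν = 0.28); the mass density of 7850 kg/m³ is an assumed typical value for steel and is not given in the text. The function names are hypothetical.

```python
import math


def longitudinal_velocity(E, nu, rho):
    """c_l according to (10.99); E in Pa, rho in kg/m^3, result in m/s."""
    return math.sqrt(E * (1.0 - nu) / (rho * (1.0 + nu) * (1.0 - 2.0 * nu)))


def transversal_velocity(E, nu, rho):
    """c_t according to (10.101); same units as above."""
    return math.sqrt(E / (2.0 * rho * (1.0 + nu)))


E_STEEL = 21.0e4 * 1.0e6      # 21 * 10^4 MPa converted to Pa (Table 10.1)
NU_STEEL = 0.28               # Table 10.1
RHO_STEEL = 7850.0            # kg/m^3, assumed typical value (not from the text)

C_L = longitudinal_velocity(E_STEEL, NU_STEEL, RHO_STEEL)
C_T = transversal_velocity(E_STEEL, NU_STEEL, RHO_STEEL)
RATIO = C_L / C_T
```

With these inputs both velocities come out at several kilometres per second, and their ratio depends on ν only, as stated above.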
10.4 Finite Element Method
The finite element method or FEM is the numerical method of choice, when elastic
deformations or vibrations of complicated structural components due to external
forces need to be calculated. However, FEM is not limited to elasticity problems. It
is a general numerical method for the solution of partial differential equations with
complicated boundary conditions in complex geometries. In the following we outline
the basics of FEM via a series of examples.
Suppose we are interested in a solution of the wave equations (10.98) and (10.100).
Before we focus on the numerical approach, we want to find an analytical solution,
which later can be compared to our numerical result. The partial differential equations
(10.98) and (10.100) can be solved using the separation ansatz $u(t, x) = T(t)\,\psi(x)$ (see footnote 11),
i.e.
\[
\frac{1}{c^2}\frac{\partial_t^2T(t)}{T(t)} = \frac{\partial_x^2\psi(x)}{\psi(x)} = -\lambda . \tag{10.102}
\]
The quantity λ is an as yet unknown constant. Apparently it is sufficient to concentrate
on just one of the two resulting ordinary differential equations. Here we continue
with the ψ(x)-equation, i.e.
\[
\left(\partial_x^2 + \lambda\right)\psi(x) = 0 . \tag{10.103}
\]
Its solution is assumed to satisfy the boundary conditions
\[
\psi(0) = \psi(1) = 0
\]
on the unit interval. The general solution is given by
\[
\psi(x) = A\sin\left(\sqrt\lambda\,x\right) + B\cos\left(\sqrt\lambda\,x\right) .
\]
Employing the boundary conditions we obtain two equations, i.e.
\[
\begin{pmatrix} 0 & 1 \\ \sin\sqrt\lambda & \cos\sqrt\lambda \end{pmatrix}\begin{pmatrix} A \\ B \end{pmatrix} = 0 .
\]
The possible λ-values follow via det(. . .) = 0. Hence
\[
-\sin\sqrt\lambda = 0 \quad\text{or}\quad \sqrt\lambda = n\pi ,
\]
where n = 1, 2, 3, . . .. The attendant solution of the ψ(x)-equation is
\[
\psi_n(x) = \sqrt2\,\sin(n\pi x) . \tag{10.104}
\]
Notice that the amplitude satisfies the (arbitrary) normalization condition
$\int_0^1\psi_n^2(x)\,dx = 1$.
Now that we know the analytical solution, we want to study the same problem
numerically using a precursor to FEM - the so called Ritz method.
11 In the following we omit the indices x and y as well as l and t.
numerically using a precursor to FEM - the so called Ritz method.
Ritz Method:
The solution of (10.103) also satisfies $\delta_\psi I = 0$, where I is given by
\[
I = \frac{1}{2}\int_0^1dx\left\{\left[\partial_x\psi(x)\right]^2 - \lambda\psi^2(x)\right\} , \tag{10.105}
\]
and vice versa. Proof:
\begin{align*}
\delta I &= \int_0^1dx\left\{\partial_x\psi(x)\,\partial_x\delta\psi(x) - \lambda\psi(x)\,\delta\psi(x)\right\} \\
 &\overset{p.i.}{=} \underbrace{\Big[\partial_x\psi(x)\,\delta\psi(x)\Big]_0^1}_{=0} - \int_0^1dx\left\{\partial_x^2\psi(x) + \lambda\psi(x)\right\}\delta\psi(x)
\end{align*}
Notice that $\delta_\psi I = 0$ requires the bracket under the integral to vanish, which in turn
requires ψ(x) to satisfy (10.103).
The Ritz approximation of the exact solution is a combination of linearly independent
functions
\[
\phi_1(x) , \ \phi_2(x) , \ \dots
\]
Even though the $\phi_i$ are quite arbitrary, the quality of the approximation depends
strongly on their resemblance to the exact solution(s). In the present case we choose
\[
\phi_1(x) = x\,(1 - x) \qquad \phi_2(x) = x\,(1 - x)(1 - 2x) .
\]
Thus, our trial solution is a linear combination of $\phi_1$ and $\phi_2$, i.e.
\[
\psi(x) \approx c_1\,x\,(1 - x) + c_2\,x\,(1 - x)(1 - 2x) ,
\]
for which we evaluate $I = I(c_1, c_2; \lambda)$ and subsequently $\delta I = (\partial I/\partial c_1)\,\delta c_1 + (\partial I/\partial c_2)\,\delta c_2$.
The condition δI = 0 yields the two equations $\partial I/\partial c_1 = 0$ and
$\partial I/\partial c_2 = 0$, i.e.
\[
\begin{pmatrix} \frac{2}{3} - \frac{1}{15}\lambda & 0 \\ 0 & \frac{2}{5} - \frac{1}{105}\lambda \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 0 .
\]
The two λ-values, which follow from det(. . .) = 0, are
\begin{align*}
\lambda_1 &= 10 \qquad \left(\text{exact}: \pi^2 \approx 9.87\right) \\
\lambda_2 &= 42 \qquad \left(\text{exact}: 4\pi^2 \approx 39.5\right) .
\end{align*}
An obvious problem of the Ritz approach is the limited number of λ-values, depending
on the number of $\phi_i$ included in the linear combination. Our toy problem possesses
an infinite number of eigenvalues or λ-values, whereas the approximate solution only
yields two of them - with decreasing precision.
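The two Ritz eigenvalues can be reproduced with exact rational arithmetic, because all integrals involve polynomials only. The sketch below represents each trial function by its list of coefficients and computes the matrices $\int_0^1\phi_i'\phi_j'\,dx$ and $\int_0^1\phi_i\phi_j\,dx$. The helper names are hypothetical, and the final step uses the fact that both matrices turn out to be diagonal for this particular basis, which is checked rather than assumed in general.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:]


def poly_int01(p):
    """Exact integral of the polynomial over the unit interval."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(p))


PHI1 = [0, 1, -1]        # x (1 - x)          = x - x^2
PHI2 = [0, 1, -3, 2]     # x (1 - x)(1 - 2x)  = x - 3x^2 + 2x^3


def ritz_matrices(basis):
    """Return (stiffness, mass) matrices of the Ritz eigenvalue problem."""
    stiff = [[poly_int01(poly_mul(poly_diff(f), poly_diff(g))) for g in basis]
             for f in basis]
    mass = [[poly_int01(poly_mul(f, g)) for g in basis] for f in basis]
    return stiff, mass


STIFF, MASS = ritz_matrices([PHI1, PHI2])
# Off-diagonal elements vanish here, so the eigenvalue problem decouples.
assert STIFF[0][1] == 0 and MASS[0][1] == 0
EIGENVALUES = [STIFF[i][i] / MASS[i][i] for i in range(2)]
```

The ratios of the diagonal elements are the λ-values 10 and 42 quoted above. For a basis with non-vanishing off-diagonal elements one would have to solve the full generalized eigenvalue problem instead.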
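In order to obtain the approximate eigenfunctions properly normalized, we include the latter
condition via the Lagrange multiplier method. Instead of the original I we use
\[
I_n = \frac{1}{2}\int_0^1dx\left\{\left[\partial_x\psi_n(x)\right]^2 - \lambda_n\psi_n^2(x) - h_n\left(\psi_n^2(x) - 1\right)\right\} . \tag{10.106}
\]
The normalization, $\int_0^1dx\,\psi_n^2(x) = 1$, is taken care of by the $h_n$-term, where $h_n$ is the
Lagrange multiplier. The equations for the coefficients $c_{1,n}$, $c_{2,n}$ and the Lagrange
multiplier $h_n$ follow via

for $\lambda_1 = 10$:
\[
0 = \frac{\partial I_n}{\partial c_1} = -\frac{1}{15}c_1h , \qquad
0 = \frac{\partial I_n}{\partial c_2} = \left(\frac{32}{105} - \frac{h}{105}\right)c_2 , \qquad
1 = \int_0^1dx\,\psi_n^2(x) = \frac{c_1^2}{30} + \frac{c_2^2}{210}
\]
with the solution $c_1 = \sqrt{30}$, $c_2 = 0$, $h = 0$;

for $\lambda_2 = 42$:
\[
0 = \frac{\partial I_n}{\partial c_1} = -\left(\frac{32}{15} + \frac{h}{15}\right)c_1 , \qquad
0 = \frac{\partial I_n}{\partial c_2} = -\frac{1}{105}hc_2 , \qquad
1 = \int_0^1dx\,\psi_n^2(x) = \frac{c_1^2}{30} + \frac{c_2^2}{210}
\]
with the solution $c_1 = 0$, $c_2 = \sqrt{210}$, $h = 0$.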
The following Mathematica-program contains the full solution and the attendant
graphs (the results are given as comments):

(* Ritz method: the program calculates the first eigenvalues and
   eigenfunctions of {(d/dx)^2 + λ} ψ(x) = 0 *)

(* eigenvalues *)
φ1 = x (1 - x);
φ2 = x (1 - x) (1 - 2 x);
Plot[{φ1, φ2}, {x, 0, 1}, PlotStyle -> Black, AxesLabel -> {"x", "φ"}]
ψ = c1 φ1 + c2 φ2;
eq1 = Integrate[D[D[ψ, x]^2 - λ ψ^2, c1], {x, 0, 1}]   (* 2 c1/3 - λ c1/15 *)
eq2 = Integrate[D[D[ψ, x]^2 - λ ψ^2, c2], {x, 0, 1}]   (* 2 c2/5 - λ c2/105 *)
Solve[eq1 eq2 == 0, λ]                                 (* {{λ -> 10}, {λ -> 42}} *)

(* calculation of the attendant eigenfunctions *)
λ = 10;
eq1 = Integrate[D[D[ψ, x]^2 - λ ψ^2 - h (ψ^2 - 1), c1], {x, 0, 1}]   (* -h c1/15 *)
eq2 = Integrate[D[D[ψ, x]^2 - λ ψ^2 - h (ψ^2 - 1), c2], {x, 0, 1}]   (* 32 c2/105 - h c2/105 *)
eq3 = Integrate[ψ^2 - 1, {x, 0, 1}]                                  (* -1 + c1^2/30 + c2^2/210 *)
Solve[{eq1 == 0, eq2 == 0, eq3 == 0}, {c1, c2, h}]
(* {{h -> 0, c1 -> -Sqrt[30], c2 -> 0}, {h -> 0, c1 -> Sqrt[30], c2 -> 0},
    {h -> 32, c2 -> -Sqrt[210], c1 -> 0}, {h -> 32, c2 -> Sqrt[210], c1 -> 0}} *)

(* comparison to the 1st exact eigenfunction *)
Plot[{Sqrt[30] φ1 - 0 φ2, Sqrt[2] Sin[Pi x]}, {x, 0, 1},
  PlotStyle -> {{Black}, {Black, Dashed}}, AxesLabel -> {"x", "φ"}]

λ = 42;
eq1 = Integrate[D[D[ψ, x]^2 - λ ψ^2 - h (ψ^2 - 1), c1], {x, 0, 1}]   (* -32 c1/15 - h c1/15 *)
eq2 = Integrate[D[D[ψ, x]^2 - λ ψ^2 - h (ψ^2 - 1), c2], {x, 0, 1}]   (* -h c2/105 *)
eq3 = Integrate[ψ^2 - 1, {x, 0, 1}]                                  (* -1 + c1^2/30 + c2^2/210 *)
Solve[{eq1 == 0, eq2 == 0, eq3 == 0}, {c1, c2, h}]
(* {{h -> -32, c1 -> -Sqrt[30], c2 -> 0}, {h -> -32, c1 -> Sqrt[30], c2 -> 0},
    {h -> 0, c2 -> -Sqrt[210], c1 -> 0}, {h -> 0, c2 -> Sqrt[210], c1 -> 0}} *)

(* comparison to the 2nd exact eigenfunction *)
Plot[{0 φ1 + Sqrt[210] φ2, Sqrt[2] Sin[2 Pi x]}, {x, 0, 1},
  PlotStyle -> {{Black}, {Black, Dashed}}, AxesLabel -> {"x", "φ"}]
Depending on the number and quality of the φ-functions the Ritz method yields
good approximations to the exact solutions. However, the φ-functions generally are
difficult to guess, which means that we cannot systematically improve the results.
Finite Element Method:
The φ-functions are approximations to the exact solution over the entire interval
of interest, i.e. the interval (0, 1) in the above example. This makes it hard to guess
them. If instead the interval is very narrow, it obviously is much easier to approximate
the shape of the solution.
Consequently we begin by dividing the interval (0, 1) into m bins or elements
of width h as shown in Fig. 10.18. Within each element, in the simplest case, the
solution can be approximated by a linear function, i.e.
\[
\psi^{(l)}(\xi) = c_1^{(l)} + c_2^{(l)} \xi . \tag{10.107}
\]
Here l is the element index. Notice that element l is mapped onto a new interval (0, 1)
along a new coordinate axis. The constants $c_1^{(l)}$ and $c_2^{(l)}$ can be expressed in terms of
the solution on the element boundaries, i.e. $\psi_1^{(l)} = \psi^{(l)}(0)$ and $\psi_2^{(l)} = \psi^{(l)}(1)$:
\[
\left. \begin{aligned} \psi_1^{(l)} &= c_1^{(l)} \\ \psi_2^{(l)} &= c_1^{(l)} + c_2^{(l)} \end{aligned} \right\}
\;\rightarrow\;
\begin{aligned} c_1^{(l)} &= \psi_1^{(l)} \\ c_2^{(l)} &= -\psi_1^{(l)} + \psi_2^{(l)} \end{aligned}
\]
\[
\psi^{(l)}(\xi) = \psi_1^{(l)} + \left( -\psi_1^{(l)} + \psi_2^{(l)} \right) \xi = \psi_1^{(l)} (1 - \xi) + \psi_2^{(l)} \xi .
\]
Fig. 10.18 The original x-interval of interest is divided into narrow elements of width h. Subsequently each element is mapped onto the interval (0, 1) on the ξ-axis
This may be written as
\[
\psi^{(l)}(\xi) = \sum_{i=1}^{i_{max}} \psi_i^{(l)} N_i(\xi) , \tag{10.108}
\]
where in the present case $i_{max} = 2$. The $N_i(\xi)$, the so-called shape functions, are
given by
\[
N_1 = 1 - \xi \quad \text{and} \quad N_2 = \xi . \tag{10.109}
\]
It is easy and sometimes useful or even necessary to use more complex shape functions. For instance, we can express $\psi^{(l)}(\xi)$ by a second order polynomial, i.e.
\[
\psi^{(l)}(\xi) = c_1^{(l)} + c_2^{(l)} \xi + c_3^{(l)} \xi^2 .
\]
In this case $\psi_1^{(l)} = \psi^{(l)}(0)$, $\psi_2^{(l)} = \psi^{(l)}(1/2)$, and $\psi_3^{(l)} = \psi^{(l)}(1)$. A simple calculation yields12
\[
N_1 = (1 - 2\xi)(1 - \xi) \qquad N_2 = 4\xi(1 - \xi) \qquad N_3 = -\xi(1 - 2\xi) . \tag{10.110}
\]
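The defining property of the quadratic shape functions is easy to check numerically: each N_i equals one at its own node and zero at the other two nodes, and the three functions add up to one everywhere. The short sketch below does this with exact fractions; the function name `shape_functions` is a hypothetical helper, not taken from the text.

```python
from fractions import Fraction

def shape_functions(xi):
    """Quadratic shape functions (10.110) at the local coordinate xi."""
    return [(1 - 2 * xi) * (1 - xi), 4 * xi * (1 - xi), -xi * (1 - 2 * xi)]

# Nodes of the quadratic element: xi = 0, 1/2, 1
nodes = [Fraction(0), Fraction(1, 2), Fraction(1)]

# Row j holds (N1, N2, N3) evaluated at node j; this should be the unit matrix.
table = [shape_functions(x) for x in nodes]

# The shape functions sum to one at any point (here an arbitrary xi = 3/7).
partition = sum(shape_functions(Fraction(3, 7)))
print(table, partition)
```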
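We now replace $\psi(x)$ in (10.105) with (10.108), which yields
\[
I = \frac{1}{2} \sum_{l=1}^{m} \int_{(l-1)h}^{lh} dx\, \psi_i^{(l)} \left\{ \partial_x N_i^{(l)}(x)\, \partial_x N_j^{(l)}(x) - \lambda N_i^{(l)}(x) N_j^{(l)}(x) \right\} \psi_j^{(l)} . \tag{10.111}
\]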
12 Show this (Fig. 10.18).

Notice the following points:
1. We use the summation convention in the cases of i and j.

2. x and ξ are related via $x = (l - 1) h + \xi h$ and thus $dx = h\, d\xi$ as well as $\frac{d}{dx} = \frac{1}{h} \frac{d}{d\xi}$.
3. Even though $N_i$ is the same on every element, we use the notation $N_i^{(l)}$, where l
is the element index.
4. Notice that $\psi_{i_{max}}^{(l-1)} = \psi_1^{(l)}$.
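Thus (10.111) becomes
\[
I = \sum_{l=1}^{m} \psi_i^{(l)} \left[ I_1^{(l)}(i, j) - \lambda I_2^{(l)}(i, j) \right] \psi_j^{(l)} . \tag{10.112}
\]
The components of the so-called element matrices, $I_1$ and $I_2$, are given by
\[
I_1^{(l)}(i, j) = \frac{1}{2h} \int_0^1 d\xi\, \partial_\xi N_i^{(l)}(\xi)\, \partial_\xi N_j^{(l)}(\xi)
\]
and
\[
I_2^{(l)}(i, j) = \frac{h}{2} \int_0^1 d\xi\, N_i^{(l)}(\xi) N_j^{(l)}(\xi) .
\]
In the following we use $I_1$ and $I_2$ evaluated with the simple linear shape functions
(10.109), i.e.
\[
I_1^{(l)} = \frac{1}{2h} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
\]
and
\[
I_2^{(l)} = \frac{h}{2} \begin{pmatrix} 1/3 & 1/6 \\ 1/6 & 1/3 \end{pmatrix} .
\]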
Notice that the matrix elements do not depend on the element index l.
At this point we can express (10.112) in the form
\[
I = \begin{pmatrix} \psi_1^{(1)} \\ \psi_2^{(1)} \\ \psi_1^{(2)} \\ \psi_2^{(2)} \\ \vdots \end{pmatrix}^{T}
\cdot
\begin{pmatrix} I_1^{(1)} - \lambda I_2^{(1)} & & \\ & I_1^{(2)} - \lambda I_2^{(2)} & \\ & & \ddots \end{pmatrix}
\cdot
\begin{pmatrix} \psi_1^{(1)} \\ \psi_2^{(1)} \\ \psi_1^{(2)} \\ \psi_2^{(2)} \\ \vdots \end{pmatrix} .
\]
Fig. 10.19 Elimination of redundant nodes in one dimension. The squares correspond to $I_1^{(l)} - \lambda I_2^{(l)}$
The $I_1^{(l)} - \lambda I_2^{(l)}$ are 2 × 2 matrices (cf. above) along the diagonal of a 2m × 2m
matrix. All of its remaining elements are zero. The vectors contain the 2m ψ-values
contributed by the m elements.
Because of the above equality of ψ across element boundaries, i.e. $\psi_2^{(l)} = \psi_1^{(l+1)}$,
every second ψ-value is redundant and must be eliminated. For our one-dimensional
problem this is easy, i.e. we eliminate $\psi_1^{(l)}$ ∀ l > 1 from the two vectors. Then
we reduce the 2m × 2m matrix accordingly. This is done by shifting adjacent 2 × 2
matrices along the diagonal so that the 22-element of the upper matrix coincides with
the 11-element of the lower matrix (see Fig. 10.19). The sum of the superimposed
entries becomes the new entry. This procedure works for one-dimensional problems.
In two or three dimensions the elimination step obviously is more complicated. After
the elimination step (10.112) is given by
\[
I = \boldsymbol{\psi}^{T} \cdot (S_1 - \lambda S_2) \cdot \boldsymbol{\psi} \tag{10.113}
\]
with
\[
\boldsymbol{\psi} = \left( \psi_1^{(1)}, \psi_2^{(1)}, \psi_2^{(2)}, \psi_2^{(3)}, \psi_2^{(4)} \right) .
\]
Notice that here we use m = 4 elements (h = 1/4). The attendant 5 × 5 matrices,
$S_1$ and $S_2$, are
\[
S_1 = \frac{1}{2h} \begin{pmatrix} 1 & -1 & & & 0 \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & -1 & 2 & -1 \\ 0 & & & -1 & 1 \end{pmatrix}
\]
and
\[
S_2 = \frac{h}{2} \begin{pmatrix} 1/3 & 1/6 & & & 0 \\ 1/6 & 2/3 & 1/6 & & \\ & 1/6 & 2/3 & 1/6 & \\ & & 1/6 & 2/3 & 1/6 \\ 0 & & & 1/6 & 1/3 \end{pmatrix} .
\]
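The shifting and superimposing of the 2 × 2 element matrices can be written as a short loop. The sketch below is only an illustration under the assumptions of the text (linear shape functions, m = 4, h = 1/4); the function name `assemble` is hypothetical. It uses exact fractions so that the entries can be compared directly with $S_1$ and $S_2$ above.

```python
from fractions import Fraction

h = Fraction(1, 4)   # element width
m = 4                # number of elements

# Element matrices for the linear shape functions N1 = 1 - xi, N2 = xi
I1 = [[1 / (2 * h), -1 / (2 * h)],
      [-1 / (2 * h), 1 / (2 * h)]]
I2 = [[h / 2 * Fraction(1, 3), h / 2 * Fraction(1, 6)],
      [h / 2 * Fraction(1, 6), h / 2 * Fraction(1, 3)]]

def assemble(elem, m):
    """Overlap m copies of a 2x2 element matrix along the diagonal.

    The 22-entry of element n lands on the 11-entry of element n + 1,
    and superimposed entries are added.
    """
    S = [[Fraction(0)] * (m + 1) for _ in range(m + 1)]
    for n in range(m):
        for k in range(2):
            for l in range(2):
                S[n + k][n + l] += elem[k][l]
    return S

S1 = assemble(I1, m)
S2 = assemble(I2, m)
for row in S1:
    print([str(entry) for entry in row])
```

The same loop structure carries over to other one-dimensional meshes; in two or three dimensions a node-numbering table would be needed instead of the simple offset n.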
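As in the Ritz method we are now ready to carry out the variation of (10.113)
with respect to the $\psi_i^{(l)}$, which yields
\[
(S_1 - \lambda S_2) \cdot \boldsymbol{\psi} = 0 \quad \text{or} \quad \left( S_2^{-1} \cdot S_1 - \lambda \mathbf{I} \right) \cdot \boldsymbol{\psi} = 0 , \tag{10.114}
\]
corresponding to $\delta_\psi I = 0$. The matrix $\mathbf{I}$ is a unit matrix.
In the very last step we include the boundary conditions, i.e. the solution must
vanish at x = 0 and x = 1. Hence
\[
\boldsymbol{\psi} = \big( 0, \underbrace{\psi_2^{(1)}, \psi_2^{(2)}, \psi_2^{(3)}}_{\equiv\, \boldsymbol{\psi}^{BC}}, 0 \big) .
\]
The size of the problem is reduced again, and we obtain
\[
\left( (S_2^{BC})^{-1} \cdot S_1^{BC} - \lambda \mathbf{I} \right) \cdot \boldsymbol{\psi}^{BC} = 0 , \tag{10.115}
\]
where
\[
S_1^{BC} = \frac{1}{2h} \begin{pmatrix} 2 & -1 & \\ -1 & 2 & -1 \\ & -1 & 2 \end{pmatrix}
\quad \text{and} \quad
S_2^{BC} = \frac{h}{2} \begin{pmatrix} 2/3 & 1/6 & \\ 1/6 & 2/3 & 1/6 \\ & 1/6 & 2/3 \end{pmatrix} .
\]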
The numerical solution of the eigenvalue matrix equation (10.115) supplies us with
the three lowest eigenvalues including the attendant eigenfunctions (cf. the follow-
ing Mathematica-program). However, in contrast to the Ritz method, now we can
easily increase the number of elements, m, which yields additional eigenvalues and
eigenfunctions and improves the overall agreement with the exact solution.
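For the uniform mesh used here the reduced eigenvalue problem can also be checked without a numerical eigenvalue solver. The sketch below rests on an assumption that is not stated in the text: the discrete eigenvectors are sampled sine functions, $\psi_j \propto \sin(k\pi j h)$, which gives the closed form $\lambda_k = (6/h^2)(1 - \cos k\pi h)/(2 + \cos k\pi h)$. The function `residual` verifies this assumption row by row against $S_1^{BC}$ and $S_2^{BC}$; both function names are hypothetical.

```python
import math

def fem_eigenvalue(k, m):
    """k-th eigenvalue for m linear elements, assuming sine eigenvectors."""
    h = 1.0 / m
    c = math.cos(k * math.pi * h)
    return 6.0 / h**2 * (1.0 - c) / (2.0 + c)

def residual(k, m):
    """Largest entry of (S1 - lambda S2) v for v_j = sin(k pi j h)."""
    h = 1.0 / m
    lam = fem_eigenvalue(k, m)
    v = [math.sin(k * math.pi * j * h) for j in range(m + 1)]  # v[0] = v[m] = 0
    worst = 0.0
    for j in range(1, m):
        s1 = (-v[j - 1] + 2 * v[j] - v[j + 1]) / (2 * h)           # row of S1
        s2 = (v[j - 1] / 6 + 2 * v[j] / 3 + v[j + 1] / 6) * h / 2  # row of S2
        worst = max(worst, abs(s1 - lam * s2))
    return worst

for k in (1, 2, 3):
    lam = fem_eigenvalue(k, 4)
    print(k, lam, lam / (k * math.pi) ** 2, residual(k, 4))
```

Calling `fem_eigenvalue(k, 10)` instead shows how the ratio to the exact value $(k\pi)^2$ moves towards one when the mesh is refined.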
"Finite Element Method: This program determines the eigenvalues and eigenfunctions of {(d/dx)^2+λ}ψ(x)=0 on (0,1).
nv : linear shape functions
h : element size
m : number of elements (m h = 1)
I1, I2 : element matrices
S1, S2 : band matrices";
h = 0.25;
m = Round[1/h];
nv = {1 - ξ, ξ};
I1 = Table[Integrate[1/(2 h) D[nv[[i]], ξ] D[nv[[j]], ξ], {ξ, 0, 1}], {i, 2}, {j, 2}];
I2 = Table[Integrate[h/2 nv[[i]] nv[[j]], {ξ, 0, 1}], {i, 2}, {j, 2}];
MatrixForm[I1]
(* Out: {{2., -2.}, {-2., 2.}} *)
MatrixForm[I2]
(* Out: {{0.0416667, 0.0208333}, {0.0208333, 0.0416667}} *)
"generation of zero matrices";
S1 = Table[0, {i, m + 1}, {j, m + 1}];
S2 = Table[0, {i, m + 1}, {j, m + 1}];
"generation of the band matrices S1 and S2";
Do[Do[Do[S1[[n + k, n + l]] = S1[[n + k, n + l]] + I1[[k, l]], {l, 1, 2, 1}], {k, 1, 2, 1}],
 {n, 0, m - 1, 1}];
Do[Do[Do[S2[[n + k, n + l]] = S2[[n + k, n + l]] + I2[[k, l]], {l, 1, 2, 1}], {k, 1, 2, 1}],
 {n, 0, m - 1, 1}];
MatrixForm[S1]
(* Out: {{2., -2., 0, 0, 0}, {-2., 4., -2., 0, 0}, {0, -2., 4., -2., 0},
         {0, 0, -2., 4., -2.}, {0, 0, 0, -2., 2.}} *)
MatrixForm[S2]
(* Out: {{0.0416667, 0.0208333, 0, 0, 0}, {0.0208333, 0.0833333, 0.0208333, 0, 0},
         {0, 0.0208333, 0.0833333, 0.0208333, 0}, {0, 0, 0.0208333, 0.0833333, 0.0208333},
         {0, 0, 0, 0.0208333, 0.0416667}} *)
"including the boundary conditions";
S1RB = Table[0, {i, 1, m - 1}, {j, 1, m - 1}];
Do[Do[S1RB[[i, j]] = S1[[i + 1, j + 1]], {i, 1, m - 1}], {j, 1, m - 1}];
MatrixForm[S1RB]
(* Out: {{4., -2., 0}, {-2., 4., -2.}, {0, -2., 4.}} *)
S2RB = Table[0, {i, 1, m - 1}, {j, 1, m - 1}];
Do[Do[S2RB[[i, j]] = S2[[i + 1, j + 1]], {i, 1, m - 1}], {j, 1, m - 1}];
MatrixForm[S2RB]
(* Out: {{0.0833333, 0.0208333, 0}, {0.0208333, 0.0833333, 0.0208333}, {0, 0.0208333, 0.0833333}} *)
λ = Eigenvalues[Inverse[N[S2RB]].N[S1RB]]
(* Out: {126.756, 48., 10.3866} *)
ListPlot[Table[Sort[λ][[i]]/(Pi i)^2, {i, 1, Length[λ]}], PlotRange -> {0.9, 1.5},
 PlotStyle -> {PointSize[0.03], Black},
 AxesLabel -> {"eigenvalue#", "λ/λ(exact)"}]
(* Out: plot of λ/λ(exact) versus eigenvalue number *)
ψ = Eigenvectors[Inverse[N[S2RB]].N[S1RB]][[m - 1]]
(* Out: {0.5, 0.707107, 0.5} *)
"normalization";
ListInterpolation[Flatten[{0, ψ, 0}]];
norm = Sqrt[NIntegrate[%[x]^2, {x, 1, m + 1}]/m];
ListPlot[ψ/norm, PlotRange -> {0, 2.5}, PlotStyle -> {PointSize[0.03], Black}];
Plot[Sqrt[2] Sin[(Pi/m) x], {x, 0, m}, PlotStyle -> Black];
Show[%%, %, AxesLabel -> {"node", "ψ"}]
(* Out: plot of the normalized node values (points) and the exact eigenfunction (line) versus node *)
Instead of h = 0.25 and thus m = 4 it is not difficult to repeat this calculation
with a smaller h, which better describes the short wavelengths (cf. Fig. 10.20). Even
though the computational effort increases, in principle it is easy to improve the
approximate solutions within the FEM.
Fig. 10.20 The same three eigenvalues as before using h = 0.1 and thus m = 10 (λ/λ(exact) versus eigenvalue number)
Remark: How can we accommodate an additional term u(x)ψ(x), where u(x) is an
arbitrary function of x, in the differential equation (10.103)? Analogous to (10.108)
we express u(x) via the shape functions, i.e.
\[
u^{(l)}(\xi) = \sum_{i=1}^{i_{max}} u_i^{(l)} N_i(\xi) . \tag{10.116}
\]
The curly brackets in (10.111) now contain the additional term
\[
u_k^{(l)} N_k(\xi) N_i(\xi) N_j(\xi) ,
\]
where we use the summation convention.

Everything we have discussed thus far can be generalized to higher dimensions.
In two dimensions, for instance, we can express ψ (on the lth element) via
\[
\psi^{(l)}(\xi, \eta) = c_1 + c_2 \xi + c_3 \eta + c_4 \xi^2 + c_5 \xi\eta + c_6 \eta^2 + c_7 \xi^2 \eta + c_8 \xi \eta^2 .
\]
The relation between the primary x–y-plane and the secondary ξ–η-plane is illustrated in
Fig. 10.21. The element shown here in the x–y-plane is only one of many possible
elements. Analogous to the one-dimensional case we express the $c_i$ in terms of the
ψ-values according to
\[
\begin{pmatrix} \psi(0, 0) \\ \psi(\tfrac{1}{2}, 0) \\ \psi(1, 0) \\ \psi(1, \tfrac{1}{2}) \\ \vdots \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & \tfrac{1}{2} & 0 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \\ c_8 \end{pmatrix} .
\]
This yields
\[
\psi(\xi, \eta) = \sum_{i=1}^{8} \psi_i N_i(\xi, \eta) , \tag{10.117}
\]
Fig. 10.21 Nodes in the case of the quadratic ansatz in two dimensions (nodes P1 to P8 of one element in the x–y-plane and in the ξ–η-plane)
Fig. 10.22 Example of common nodes in two dimensions (elements I to IV)
with
N1 (ξ, η) = (1 − ξ)(1 − η)(1 − 2ξ − 2η)

N2 (ξ, η) = −ξ(1 − η)(1 − 2ξ + 2η)
N3 (ξ, η) = −ξη(3 − 2ξ − 2η)
N4 (ξ, η) = −η(1 − ξ)(1 + 2ξ − 2η)
N5 (ξ, η) = 4ξ(1 − ξ)(1 − η)
N6 (ξ, η) = 4ξη(1 − η)
N7 (ξ, η) = 4ξη(1 − ξ)
N8 (ξ, η) = 4η(1 − ξ)(1 − η) .
The remaining calculation is the same as before. Only the elimination step is
significantly more complicated. Notice that neighboring elements now share several
common ψ-values (cf. Fig. 10.22).
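As in one dimension, the eight shape functions can be checked against their defining property. The sketch below assumes the node ordering suggested by the N_i themselves: corners P1 to P4 at (0,0), (1,0), (1,1), (0,1), followed by the edge midpoints P5 to P8 at (1/2,0), (1,1/2), (1/2,1), (0,1/2). This ordering is an assumption of the sketch; the helper name `shape_functions` is hypothetical.

```python
from fractions import Fraction as F

def shape_functions(xi, eta):
    """The eight shape functions N1..N8 listed above."""
    return [
        (1 - xi) * (1 - eta) * (1 - 2 * xi - 2 * eta),
        -xi * (1 - eta) * (1 - 2 * xi + 2 * eta),
        -xi * eta * (3 - 2 * xi - 2 * eta),
        -eta * (1 - xi) * (1 + 2 * xi - 2 * eta),
        4 * xi * (1 - xi) * (1 - eta),
        4 * xi * eta * (1 - eta),
        4 * xi * eta * (1 - xi),
        4 * eta * (1 - xi) * (1 - eta),
    ]

half = F(1, 2)
# Assumed node order: four corners, then four edge midpoints
nodes = [(F(0), F(0)), (F(1), F(0)), (F(1), F(1)), (F(0), F(1)),
         (half, F(0)), (F(1), half), (half, F(1)), (F(0), half)]

# N_i must be 1 at node i and 0 at every other node.
kronecker = all(
    shape_functions(*nodes[j])[i] == (1 if i == j else 0)
    for i in range(8) for j in range(8)
)

# The shape functions sum to one, checked here at the element center.
center_sum = sum(shape_functions(half, half))
print(kronecker, center_sum)
```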
The FEM Method Applied to Problems in the Theory of Elasticity:
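The following is an illustration of the FEM in the context of a problem in the
theory of elasticity. In this case
\[
I = \frac{1}{2} \int_V \sigma_{ik} u_{ik}\, dV - \sum_{i=1}^{m} \vec{F}_i(\vec{r}_i) \cdot \vec{u}_i - \int_V \vec{p} \cdot \vec{u}\, dV - \int_{\partial V} \vec{q} \cdot \vec{u}\, dA \tag{10.118}
\]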
(cf. (10.26)). The first term on the right is the elastic body’s free energy, Fel . All other
terms describe different types of work done on the elastic body. The first of the work
terms corresponds to forces, Fi , acting on the elastic body at discrete positions, ri .
The ui are the displacements at these positions. The second work term is a continuum
version of the previous one, where p is the force density field and u is the attendant
displacement field. The third term describes the work done by a continuous load
distribution on the surface of the elastic body. As in the case of Euler buckling (cf.
p. 327) we obtain the resulting displacement field via the condition
δu I = 0 ,
which we can solve using the Finite Element Method.

We want to demonstrate this using the example depicted in Fig. 10.23 (cf. [9]). A
thin plate is clamped on one side, whereas the other side is supported on two rollers.
A continuous load, q, is distributed between the two rollers as indicated. Gravitation
is neglected. Hence (10.118) becomes
\[
I = \frac{E I}{2} \int_0^L dx \left( \frac{\partial^2 \zeta}{\partial x^2} \right)^2 - \int_0^L dx\, q(x) \zeta(x) . \tag{10.119}
\]
Notice that we use the expression (10.88) for the elastic free energy of a thin plate
with ζ = ζ(x) and the definition $I = d^3 b/(12(1 - \nu^2))$. The load is described via
\[
q(x) = \begin{cases} 0 & 0 \le x < L/2 \\ -q & L/2 \le x \le L \end{cases} .
\]
Fig. 10.23 A plate is clamped on one side and supported by two rollers on the other side. The
length of the two elements is L/2 = 1 m. The width of the plate is b and its thickness is d. The load between the rollers is 12 kN/m
Fig. 10.24 Elements of length h along the x-axis
In this example we divide the plate into two elements as shown in Fig. 10.23.
The simplest shape functions correspond to a linear approximation of ζ(x) on each
element, i.e.
\[
\zeta^{(l)}(\xi) = c_1^{(l)} + c_2^{(l)} \xi . \tag{10.120}
\]
However, in the present example linear shape functions are not sufficient. This is
because aside from the ζ-values at the element boundaries we also need the derivatives
$\zeta'(x) \equiv d\zeta(x)/dx$ (cf. Fig. 10.24). This means we must include two more orders, i.e.
\[
\zeta^{(l)}(\xi) = c_1^{(l)} + c_2^{(l)} \xi + c_3^{(l)} \xi^2 + c_4^{(l)} \xi^3 . \tag{10.121}
\]
Hence13
\[
\zeta^{(l)}(\xi) = \zeta_1^{(l)} N_1(\xi) + h \zeta_2'^{(l)} N_2(\xi) + \zeta_3^{(l)} N_3(\xi) + h \zeta_4'^{(l)} N_4(\xi) , \tag{10.122}
\]
with
\[
\begin{aligned}
N_1 &= (1 + 2\xi)(1 - \xi)^2 \\
N_2 &= \xi (1 - \xi)^2 \\
N_3 &= (3 - 2\xi) \xi^2 \\
N_4 &= -(1 - \xi) \xi^2 .
\end{aligned}
\]
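Inserting this into (10.119) we obtain
\[
I \approx \begin{pmatrix} \boldsymbol{\zeta}^{(1)} \\ \boldsymbol{\zeta}^{(2)} \end{pmatrix}^{T}
\cdot \begin{pmatrix} \frac{1}{2} I_1^{(1)} & 0 \\ 0 & \frac{1}{2} I_1^{(2)} \end{pmatrix}
\cdot \begin{pmatrix} \boldsymbol{\zeta}^{(1)} \\ \boldsymbol{\zeta}^{(2)} \end{pmatrix}
- \begin{pmatrix} \boldsymbol{f}^{(1)} \\ \boldsymbol{f}^{(2)} \end{pmatrix}^{T}
\cdot \begin{pmatrix} \boldsymbol{\zeta}^{(1)} \\ \boldsymbol{\zeta}^{(2)} \end{pmatrix} , \tag{10.123}
\]
13 The h-factors follow according to the chain rule, i.e. $d\zeta(x)/d\xi = (d\zeta(x)/dx)(dx/d\xi) = \zeta' h$.
where
\[
\boldsymbol{\zeta}^{(l)} = \left( \zeta_1^{(l)}, \zeta_2'^{(l)}, \zeta_3^{(l)}, \zeta_4'^{(l)} \right) \tag{10.124}
\]
and
\[
I_1^{(l)} = \frac{E I}{h^3} \begin{pmatrix}
12 & 6h & -12 & 6h \\
6h & 4h^2 & -6h & 2h^2 \\
-12 & -6h & 12 & -6h \\
6h & 2h^2 & -6h & 4h^2
\end{pmatrix} .
\]
The element load vector is given by
\[
\boldsymbol{f}^{(l)} = -q^{(l)} \left( \frac{h}{2}, \frac{h^2}{12}, \frac{h}{2}, -\frac{h^2}{12} \right) , \tag{10.125}
\]
where $q^{(1)} = 0$ and $q^{(2)} = q$.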
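The numerical entries of the element matrix and of the element load vector follow from integrals over the cubic shape functions. The sketch below recomputes the dimensionless parts, i.e. $\int_0^1 N_i'' N_j''\, d\xi$ and $\int_0^1 N_i\, d\xi$, with exact fractions; the powers of h and the prefactor E I are left out and have to be restored as in the formulas above. The helper names are hypothetical, and polynomials are stored as ascending coefficient lists.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_diff(p):
    """Derivative of a polynomial."""
    return [k * a for k, a in enumerate(p)][1:] or [Fraction(0)]

def integrate01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(Fraction(a) / (k + 1) for k, a in enumerate(p))

# Cubic shape functions N1..N4, expanded in powers of xi
N = [[Fraction(c) for c in coeffs] for coeffs in (
    (1, 0, -3, 2),    # (1 + 2 xi)(1 - xi)^2
    (0, 1, -2, 1),    # xi (1 - xi)^2
    (0, 0, 3, -2),    # (3 - 2 xi) xi^2
    (0, 0, -1, 1),    # -(1 - xi) xi^2
)]

second = [poly_diff(poly_diff(p)) for p in N]
stiffness = [[integrate01(poly_mul(a, b)) for b in second] for a in second]
load = [integrate01(p) for p in N]

for row in stiffness:
    print([str(entry) for entry in row])
print([str(entry) for entry in load])
```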

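In the next step we identify the redundancies, i.e.
\[
\zeta_3^{(1)} \equiv \zeta_1^{(2)} , \qquad \zeta_4'^{(1)} \equiv \zeta_2'^{(2)} ,
\]
and specify the boundary conditions, i.e.
\[
\zeta_1^{(1)} = 0 , \qquad \zeta_2'^{(1)} = 0 , \qquad \zeta_3^{(1)} = 0 , \qquad \zeta_3^{(2)} = 0 .
\]
Notice that the only remaining undetermined quantities are the two slopes $\zeta_4'^{(1)}$ and
$\zeta_4'^{(2)}$.
After removing the redundancies as before and including the boundary conditions,
$\delta_\zeta I = 0$ yields the simple result
\[
\frac{E I}{h} \begin{pmatrix} 8 & 2 \\ 2 & 4 \end{pmatrix}
\begin{pmatrix} \zeta_4'^{(1)} \\ \zeta_4'^{(2)} \end{pmatrix}
\approx \begin{pmatrix} -\frac{q h^2}{12} \\ \frac{q h^2}{12} \end{pmatrix} .
\]
Let's insert some numbers: $E = 2 \cdot 10^{11}$ N m$^{-2}$ (steel), $I = 4 \cdot 10^{-6}$ m$^4$, h = 1 m, $q = 12 \cdot 10^{3}$ N m$^{-1}$. The result is $\zeta_4'^{(1)} \approx -2.68 \cdot 10^{-4}$ and $\zeta_4'^{(2)} \approx 4.46 \cdot 10^{-4}$. This
means that the steel plate's deformations are less than 1 mm. The entire calculation
is included in the following Mathematica-program: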
"Clamped plate: Example 8.1. from Introduction to Finite Elements in Engineering by T.R. Chandrupatla, A.D. Belegundu";
"nv: cubic shape functions based on the displacements u and their derivatives at the element boundaries";
"h: element width"; "m: number of elements"; m = 2;
"ℰ: elastic modulus";
"J: moment of inertia relative to the neutral plane";
"-q: force density (negative because the force acts in the negative direction)";
Clear[h, ℰ, ν, J, p, c1, c2, c3, c4];
"shape functions on 0<=ξ<=1";
ζ = c1 + c2 ξ + c3 ξ^2 + c4 ξ^3;
dζ = D[ζ, ξ];
s = Flatten[Solve[{ζ1 == (ζ /. ξ -> 0),
    ζ2 == (dζ /. ξ -> 0),
    ζ3 == (ζ /. ξ -> 1),
    ζ4 == (dζ /. ξ -> 1)}, {c1, c2, c3, c4}]];
c1 = c1 /. s[[1]];
c2 = c2 /. s[[2]];
c3 = c3 /. s[[3]];
c4 = c4 /. s[[4]];
nv = {Factor[Coefficient[ζ, ζ1]], h Factor[Coefficient[ζ, ζ2]],
  Factor[Coefficient[ζ, ζ3]], h Factor[Coefficient[ζ, ζ4]]}
(* Out: {(-1 + ξ)^2 (1 + 2 ξ), h (-1 + ξ)^2 ξ, -ξ^2 (-3 + 2 ξ), h (-1 + ξ) ξ^2} *)
"element matrix - contribution of elastic free energy";
IE = ℰ J/(h^3) Table[Integrate[D[nv[[i]], {ξ, 2}] D[nv[[j]], {ξ, 2}], {ξ, 0, 1}],
    {i, 4}, {j, 4}];
MatrixForm[IE]
(* Out: ℰ J times
   {{12/h^3, 6/h^2, -12/h^3, 6/h^2}, {6/h^2, 4/h, -6/h^2, 2/h},
    {-12/h^3, -6/h^2, 12/h^3, -6/h^2}, {6/h^2, 2/h, -6/h^2, 4/h}} *)
"element vector describing the continuous load distribution";
LE = h Table[Integrate[nv[[i]], {ξ, 0, 1}], {i, 4}];
MatrixForm[LE]
(* Out: {h/2, h^2/12, h/2, -h^2/12} *)
"generation of reduced band matrix SR";
Inull = {{0, 0}, {0, 0}};
I11 = Drop[IE, {3, 4}, {3, 4}];
I12 = Drop[IE, {3, 4}, {1, 2}];
I21 = Drop[IE, {1, 2}, {3, 4}];
I22 = Drop[IE, {1, 2}, {1, 2}];
Table[If[i == j && i == 1, I11,
   If[i == j && i == m + 1, I22,
    If[i == j, I11 + I22, If[i + 1 == j, I12, If[i - 1 == j, I21, Inull]]]]],
  {i, 1, m + 1}, {j, 1, m + 1}];
SR = ArrayFlatten[%];
"boundary conditions";
Drop[SR, {1, 3}, {1, 3}];
SR = Drop[%, {2}, {2}];
MatrixForm[SR]
(* Out: ℰ J times {{8/h, 2/h}, {2/h, 4/h}} *)
"generation of load vector LV";
LV = {};
Do[If[i <= m/2, LV = Append[LV, {0, 0, 0, 0}], LV = Append[LV, LE]],
  {i, 1, m}]; LV = Flatten[-q LV];
"generation of reduced load vector LVR";
"the first node contributes both of its entries, LV[[1]] and LV[[2]]";
LVR = {LV[[1]], LV[[2]]};
Do[LVR = Append[LVR, {LV[[i]] + LV[[i + 2]], LV[[i + 1]] + LV[[i + 3]]}],
  {i, 3, Length[LV] - 2, 4}];
LVR = Append[LVR, LV[[Length[LV] - 1]]];
LVR = Flatten[Append[LVR, LV[[Length[LV]]]]];
Drop[LVR, {1, 3}];
LVR = Drop[%, {2}];
MatrixForm[LVR]
(* Out: {-h^2 q/12, h^2 q/12} *)
"calculation of the slopes at the roller positions";
sol = LinearSolve[SR, LVR];
"parameter values";
J = 4 10^(-6);
ℰ = 2 10^11; h = 1; q = 12 10^3;
N[sol]
(* Out: {-0.000267857, 0.000446429} *)
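The reduced 2 × 2 system for the two slopes is small enough to be solved by Cramer's rule. The following sketch uses the parameter values quoted in the text; the variable names (`E`, `J`, `slope1`, `slope2`) are chosen here for illustration only.

```python
# Parameter values quoted in the text (SI units)
E = 2e11    # elastic modulus in N/m^2 (steel)
J = 4e-6    # moment of inertia in m^4
h = 1.0     # element width in m
q = 12e3    # load in N/m

# (E J / h) [[8, 2], [2, 4]] . (slope1, slope2) = (-q h^2/12, q h^2/12)
a11, a12 = 8 * E * J / h, 2 * E * J / h
a21, a22 = 2 * E * J / h, 4 * E * J / h
b1, b2 = -q * h**2 / 12, q * h**2 / 12

# Cramer's rule for the 2x2 system
det = a11 * a22 - a12 * a21
slope1 = (b1 * a22 - a12 * b2) / det
slope2 = (a11 * b2 - a21 * b1) / det
print(slope1, slope2)
```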
Whereas the above example is designed to be analytically solvable (even though
a Mathematica-implementation is provided too), the following and final example
requires a computer program. We consider a similar setup (cf. Fig. 10.25). The
acrylic glass plate is clamped on one side and is deformed by its own weight. The
measurements of the plate are L = 1 m, b = 1 m, and d = 0.02 m. The other parameters are
$E = 0.32 \cdot 10^{10}$ N m$^{-2}$, $I = 0.76 \cdot 10^{-6}$ m$^4$, and $p = 0.02 \cdot 1160 \cdot 9.81$
N m$^{-1}$. This problem is solved by the following Mathematica-program. The figure
included with the program shows the deformation, ζ(x), of the plate versus x.
Fig. 10.25 Acrylic glass plate clamped on one side (g indicates the direction of gravity)
"Deformation of a 1 m x 1 m x 2 cm acrylic glass plate, which is clamped on one side, in the gravitational field:";
"IE and LE are taken from the previous program";
Clear[h, ℰ, J, p];
"number of elements"; m = 16;
"reduced band matrix SR is generated";
Inull = {{0, 0}, {0, 0}};
I11 = Drop[IE, {3, 4}, {3, 4}];
I12 = Drop[IE, {3, 4}, {1, 2}];
I21 = Drop[IE, {1, 2}, {3, 4}];
I22 = Drop[IE, {1, 2}, {1, 2}];
Table[If[i == j && i == 1, I11,
   If[i == j && i == m + 1, I22,
    If[i == j, I11 + I22, If[i + 1 == j, I12, If[i - 1 == j, I21, Inull]]]]],
  {i, 1, m + 1}, {j, 1, m + 1}];
SR = ArrayFlatten[%];
SR = Drop[SR, {1, 2}, {1, 2}];
"load vector LV is generated";
LV = {}; Do[LV = Append[LV, LE], {i, 1, m}]; LV = Flatten[-p LV];
"reduced load vector LVR is generated";
"the first node contributes both of its entries, which are dropped again below";
LVR = {LV[[1]], LV[[2]]};
Do[LVR = Append[LVR, {LV[[i]] + LV[[i + 2]], LV[[i + 1]] + LV[[i + 3]]}],
  {i, 3, Length[LV] - 2, 4}];
LVR = Append[LVR, LV[[Length[LV] - 1]]];
LVR = Flatten[Append[LVR, LV[[Length[LV]]]]];
LVR = Drop[LVR, {1, 2}];
"calculation of displacements and slopes at the element boundaries";
sol = LinearSolve[SR, LVR];
J = 0.76 10^(-6);
ℰ = 0.32 10^10;
h = 1/m;
p = 0.02 1160 9.81;
data = N[sol, 10]; "displacement ζ(x) for m=16 elements";
Table[{h (i + 1)/2, 1000 data[[i]]}, {i, 1, Length[data], 2}];
ListPlot[%, PlotJoined -> True, AxesLabel -> {"x[m]", "ζ(x)[mm]"},
 PlotStyle -> Black]
(* Out: plot of the displacement ζ(x) in mm versus x in m *)
"displacement of free edge in mm";
1000 data[[Length[data] - 1]]
(* Out: -11.6978 *)
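The displacement of the free edge can be compared with a closed-form result that is not derived in the text: for a beam clamped at one end under a uniform load p, standard beam theory gives the tip deflection $p L^4/(8 E I)$. The sketch below evaluates this expression with the parameter values quoted above. The use of this formula and the variable names are assumptions of the sketch.

```python
# Parameter values quoted in the text (SI units)
E = 0.32e10             # elastic modulus in N/m^2
J = 0.76e-6             # moment of inertia in m^4
L = 1.0                 # plate length in m
p = 0.02 * 1160 * 9.81  # weight per unit length in N/m

# Tip deflection of a uniformly loaded cantilever (standard beam theory);
# the minus sign indicates a displacement in the direction of gravity.
tip_mm = -1000 * p * L**4 / (8 * E * J)
print(tip_mm)
```

The agreement with the FEM value indicates that 16 cubic elements already resolve this load case very well.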
10.5 Dynamic Mechanical Analysis of Viscoelastic Materials
In Sect. 6.1 we had studied the damped harmonic oscillator, which also included
a short discussion of dissipation (cf. p. 172), i.e. the transformation of mechanical
energy into heat. This may even be desirable in certain technical applications, e.g.
damping elements preventing vibrations from spreading to sensitive parts of a sys-
tem. Usually damping elements are made of elastomers. Elastomers are cross-linked
polymers or (linear) macromolecules. Cross-linking can be achieved chemically,
most importantly by a process called vulcanization, but it is always present due to
physical entanglement of the polymers. Not every cross-linked polymer system is an
elastomer however. Elasticity depends on the cross-link density and also on the type
of polymer. Elastomer elasticity is due to the conformation entropy of the polymer
segments between cross-links. If this is the dominant part of the free energy change
during a deformation then the material is an elastomer - or rubber. Elastomers have
a wide range of uses. If an elastomer-based component fails it can mean a leaking

faucet or it can mean a major disaster (the Space Shuttle Challenger explosion was
caused by an O-ring used under inappropriate conditions).
The mechanical behavior of elastomers is very complex. It is viscoelastic. Toy
stores usually sell Silly Putty, which is based on silicone polymer. It can flow like a
liquid, albeit slowly. But if rolled into a ball and dropped onto the floor it bounces and
appears perfectly elastic. The elastic modulus of an elastomer depends on frequency
and temperature, among other parameters, and can vary by two to three orders of mag-
nitude. In addition, most elastomer materials are nanocomposites containing large
amounts of filler and other chemicals. The elastomer or rubber industry, which is a
large industry, subjects their products to elaborate mechanical testing called Dynamic
Mechanical Analysis (DMA) [10, 11]. This section is devoted to some of the basic
concepts underlying these tests. DMA has much in common with the (periodically)
driven harmonic oscillator. The spring is replaced by a suitable elastomer specimen.
A machine subjects the material to a cyclic stress and records its response (strain).
This can be done for different temperatures, frequencies, and strain amplitudes.
We start our discussion of DMA by introducing the following scalar relations
between (shear) stress, σ, and (shear) strain, u, as well as shear rate, u̇:
\[
\sigma_\mu = \mu u_\mu \tag{10.126}
\]
and
\[
\sigma_\eta = \eta \dot{u}_\eta . \tag{10.127}
\]
Equation (10.126) corresponds to the dependence of the stress tensor on the strain
tensor expressed in (10.37) if $i \neq k$, $\sigma_{ik} \equiv \sigma$, and $u_{ik} \approx (1/2) \partial u_i/\partial x_k \equiv u/2$,
i.e. the strain perpendicular to the shear direction can be neglected. Equation
(10.127) describes the dissipative contribution, where η is the viscosity coefficient.
Figure 10.26 shows symbolic representations of (10.126) and (10.127): the spring
symbol represents the former and the dashpot symbol the latter.
These basic elements can be combined into simple models describing linear vis-
coelastic material behavior. One possible combination, the so called Kelvin–Voigt
model, is depicted in Fig. 10.27a. Its mathematical representation is
Fig. 10.26 Symbolic elements representing elastic and viscous contributions
Fig. 10.27 a Kelvin–Voigt model; b Maxwell model; c Zener model
\[
\sigma = \sigma_\mu + \sigma_\eta \qquad u = u_\mu = u_\eta . \tag{10.128}
\]
The quantity σ is the total stress and u is the total strain. Notice that due to the parallel
arrangement of the two basic elements in the Kelvin–Voigt model, σ is the sum of
the stresses, σμ and ση , contributed by the respective branches. The strain u, on the
other hand, is identical to the strain, u μ and u η , in each individual branch. Using
(10.126) and (10.127) we obtain for the Kelvin–Voigt model
(a) : σ = μu + η u̇ . (10.129)
Our long-time solution of the driven harmonic oscillator (cf. (6.55)) shows a phase
shift between the driving force and the oscillator’s response. Here we try an analogous
ansatz, i.e.
σ = σo sin(ωt + δ) u = u o sin(ωt) . (10.130)
As in the case of the harmonic oscillator the phase shift, δ, is tied to dissipation or
friction.14 Inserting this into (10.129) and using the identity
sin(ωt + δ) = cos(δ) sin(ωt) + sin(δ) cos(ωt)
yields
14 You may wonder why we use (10.130) instead of σ = σo cos(ωt) and u = u o cos(ωt − δ), which
is closer to the solution we had obtained for the forced oscillator on p. 169. The present expressions
for σ and u follow from a simple shift of the time origin. They are more convenient in the present
context. The various results, however, do not depend on which form we choose.
\[
\mu' \equiv \frac{\sigma_o}{u_o} \cos\delta = \mu \tag{10.131}
\]
\[
\mu'' \equiv \frac{\sigma_o}{u_o} \sin\delta = \omega\eta \tag{10.132}
\]
\[
\tan\delta = \tau_{KV}\, \omega \qquad (\tau_{KV} = \eta/\mu) . \tag{10.133}
\]
The quantity $\tau_{KV}$ is a relaxation time. In order to see this we set σ = 0 in (10.129).
Separation of variables then leads to $u(t) = u(0) \exp[-t/\tau_{KV}]$, which describes the
strain relaxation after the stress is 'turned off'.
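The Kelvin–Voigt results can be reproduced with complex arithmetic: inserting a complex ansatz $\tilde\sigma \propto e^{i(\omega t + \delta)}$, $\tilde u \propto e^{i\omega t}$ into (10.129) gives $\tilde\sigma/\tilde u = \mu + i\omega\eta$, whose real and imaginary parts are μ' and μ''. The sketch below evaluates this for arbitrary illustrative numbers; the function name and the parameter values are not from the text.

```python
def kelvin_voigt_moduli(mu, eta, omega):
    """Return (mu', mu'', tan delta) of the Kelvin-Voigt model."""
    mu_star = complex(mu, omega * eta)   # sigma/u for the complex ansatz
    return mu_star.real, mu_star.imag, mu_star.imag / mu_star.real

# Illustrative values in arbitrary units
storage, loss, tan_delta = kelvin_voigt_moduli(mu=2.0, eta=0.5, omega=3.0)
print(storage, loss, tan_delta)
```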
An analogous result follows for the driven oscillator if we consider (6.52) in
the limit m → 0. Notice that 2λ = ζ/m (cf. (6.9)) and $\omega_o^2 = k/m$ (cf. (6.4)). This
corresponds to omitting ẍ in (6.52), i.e. all forces are in equilibrium. In this limit the
phase shift of the long-time solution, $\arctan[2\lambda\omega/(\omega_o^2 - \omega^2)]$, in (6.55) is
\[
\tan\delta = \lim_{m \to 0} \frac{2\lambda\omega}{\omega_o^2 - \omega^2} = \frac{\zeta}{k}\, \omega . \tag{10.134}
\]
Here ζ corresponds to the viscosity coefficient, η, and k corresponds to the shear
modulus, μ.
The quantities μ', μ'', and tan δ are of special interest. Their measurement yields
information on the frequency dependence of the dynamic mechanical properties of
a viscoelastic material.
The real ansatz (10.130) can be replaced by its complex version:
\[
\tilde{\sigma} = \sigma_o e^{i(\omega t + \delta)} \qquad \tilde{u} = u_o e^{i\omega t} . \tag{10.135}
\]
It is common practice to define the complex modulus $\mu^* = \tilde{\sigma}/\tilde{u}$. Insertion of (10.135)
into (10.129) yields
\[
\mu^* = \mu' + i\mu'' , \tag{10.136}
\]
i.e. μ' and μ'' are the real and imaginary parts of the complex modulus. The quantity
μ' is called storage modulus and μ'' is called loss modulus. Looking at (10.131)
and (10.132) this notation makes sense. μ' is equal to μ, which describes elasticity.
μ'' on the other hand is proportional to η, which is linked to dissipative processes.
We can show this more explicitly and independent of the present model. The work
required during one stress-strain cycle (per unit volume) is given by
\[
w = \oint \sigma\, du = \int_0^{2\pi/\omega} \sigma \dot{u}\, dt . \tag{10.137}
\]
Inserting (10.130) yields
\[
w = \pi u_o \sigma_o \sin\delta \overset{(10.132)}{=} \pi \mu'' u_o^2 , \tag{10.138}
\]
i.e. only μ'' contributes to w. Figure 10.28 illustrates w in terms of parametric
plots of (10.130) in the σ-u-plane for different phase shifts, δ. Notice that the area
enclosed by the curves is the respective w. In particular, δ = 0 yields w = 0 whereas
δ = π/2 yields the maximum w. The behavior observed when δ > 0 is called hysteresis.
Fig. 10.28 Hysteresis
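The cycle integral (10.137) can be evaluated numerically for the ansatz (10.130) and compared with the closed form (10.138). The sketch below uses a simple midpoint rule; the function name `cycle_work`, the number of steps and the numerical values are illustrative choices.

```python
import math

def cycle_work(u0, sigma0, delta, omega=1.0, steps=20000):
    """Integrate sigma * du/dt over one period with the midpoint rule."""
    period = 2 * math.pi / omega
    dt = period / steps
    w = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        sigma = sigma0 * math.sin(omega * t + delta)   # stress, (10.130)
        u_dot = u0 * omega * math.cos(omega * t)       # time derivative of the strain
        w += sigma * u_dot * dt
    return w

u0, sigma0, delta = 0.1, 2.0, 0.3
numeric = cycle_work(u0, sigma0, delta)
closed_form = math.pi * u0 * sigma0 * math.sin(delta)
print(numeric, closed_form)
```

For δ = 0 the numerical integral vanishes up to rounding errors, in line with the statement that a purely elastic response dissipates no work.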
Remark: We can relate the dissipation function discussed on p. 172 to w, i.e.
I(ω) 2π/ω = wV. Here 2π/ω is the oscillation period and V is an appropriate volume, because w is work per volume (notice: $\omega_o \gg \omega$).
The Kelvin–Voigt model is only one of several simple combinations of the basic
elements and, in addition, is not applicable in the full frequency range. Our next
model is the so-called Maxwell model (model (b) in Fig. 10.27). This model arranges
the two basic elements in series. Its mathematical description is
\[
\sigma = \sigma_\mu = \sigma_\eta \qquad u = u_\mu + u_\eta . \tag{10.139}
\]
Using (10.126) and (10.127) we obtain
\[
\text{(b)}: \quad \sigma + \frac{\eta}{\mu} \dot{\sigma} = \eta \dot{u} . \tag{10.140}
\]
The same ansatz as before leads to
\[
\mu'/\mu = \frac{\tau_M^2 \omega^2}{\tau_M^2 \omega^2 + 1} \tag{10.141}
\]
\[
\mu''/\mu = \frac{\tau_M \omega}{\tau_M^2 \omega^2 + 1} \tag{10.142}
\]
\[
\tan\delta = \frac{1}{\tau_M \omega} , \tag{10.143}
\]
where $\tau_M = \eta/\mu$. We postpone a discussion of the Maxwell model and turn to another
model, the Zener model (model (c) in Fig. 10.27).
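With the complex ansatz, (10.140) becomes $\tilde\sigma (1 + i\omega\tau_M) = i\omega\eta\, \tilde u$, so that the complex modulus of the Maxwell model is $\mu^* = i\omega\eta/(1 + i\omega\tau_M)$. The sketch below compares its real and imaginary parts with (10.141) through (10.143) for illustrative parameter values; the function name is hypothetical.

```python
def maxwell_moduli(mu, eta, omega):
    """Return (mu', mu'') of the Maxwell model from the complex modulus."""
    tau = eta / mu
    mu_star = 1j * omega * eta / (1 + 1j * omega * tau)
    return mu_star.real, mu_star.imag

# Illustrative values in arbitrary units
mu, eta, omega = 2.0, 0.5, 3.0
x = (eta / mu) * omega                       # tau_M * omega
storage, loss = maxwell_moduli(mu, eta, omega)

storage_formula = mu * x**2 / (x**2 + 1)     # (10.141)
loss_formula = mu * x / (x**2 + 1)           # (10.142)
tan_delta_formula = 1 / x                    # (10.143)
print(storage, loss, loss / storage)
```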
The mathematical description of the Zener model is slightly more complex com-
pared to the other two models:
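\[
\begin{aligned}
\sigma &= \sigma_1 + \sigma_M \\
\sigma_M &= \sigma_2 = \sigma_\eta \\
u &= u_1 = u_M \\
u_M &= u_2 + u_\eta .
\end{aligned}
\]
The indices 1 and 2 refer to $\mu_1$ and $\mu_2$. Again we employ (10.126) and (10.127),
which here leads to
\[
\text{(c)}: \quad \sigma + \frac{\eta}{\mu_2} \dot{\sigma} = \mu_1 u + \eta \left( 1 + \frac{\mu_1}{\mu_2} \right) \dot{u} . \tag{10.144}
\]
As before the relation between stress and strain is linear and the above ansatz (real
or complex) yields
\[
\mu'/\mu_1 = \frac{\tau_2^2 \omega^2/\theta + 1}{\tau_2^2 \omega^2 + 1} \tag{10.145}
\]
\[
\mu''/\mu_2 = \frac{\tau_2 \omega}{\tau_2^2 \omega^2 + 1} \tag{10.146}
\]
\[
\tan\delta = \frac{1 - \theta}{\theta}\, \frac{\tau_2 \omega}{\tau_2^2 \omega^2/\theta + 1} , \tag{10.147}
\]
where $\tau_2 = \eta/\mu_2$ and $\theta = \mu_1/(\mu_1 + \mu_2)$. The quantity $\tau_2$ again describes a relaxation
time. If the instantaneous strain $u(t = 0) = u_o$ is held constant, i.e. $\dot{u}(t) = 0$ for
t > 0, then
\[
\sigma(t) = \left( \mu_1 + \mu_2 e^{-t/\tau_2} \right) u_o . \tag{10.148}
\]
Question: What is the justification for $\sigma(t = 0) = (\mu_1 + \mu_2) u_o$?15
15 The strain occurs instantaneously at t = 0. The dashpot cannot follow as quickly, i.e. only the
μ-elements contribute to the answer.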
Let's study the low and high frequency limits of (10.145) through (10.147).
The limit of small ω leads to the Kelvin–Voigt model (with $\mu_1 = \mu$). In the opposite
limit of large ω we obtain $\mu' \approx \mu_1 + \mu_2$, $\mu'' \approx \mu_2/(\tau_2 \omega)$, and $\tan\delta \approx (\mu_2/(\mu_1 + \mu_2))/(\tau_2 \omega)$. The special case $\mu_1 \ll \mu_2$ yields the Maxwell model in the same limit
(with $\mu_2 = \mu$). Figure 10.29 shows the various results of the Zener model for $\mu_1 = \mu_2 = \mu$. The dashed lines are the leading contributions in the above two limits. In
order to gauge the value of our simple models we must relate their results to actual
measurements.
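The Zener results can be checked in the same way as for the two simpler models. In the sketch below the complex modulus is written as the sum of the spring $\mu_1$ and a Maxwell branch, $\mu^* = \mu_1 + i\omega\eta/(1 + i\omega\tau_2)$, and compared with (10.145) through (10.147). In addition, setting the derivative of (10.147) with respect to $\tau_2\omega$ to zero places the maximum of tan δ at $\tau_2\omega = \sqrt{\theta}$ with the value $(1 - \theta)/(2\sqrt{\theta})$; this step is not spelled out in the text. Function names and parameter values are illustrative.

```python
import math

def zener_moduli(mu1, mu2, eta, omega):
    """Return (mu', mu'') of the Zener model from the complex modulus."""
    tau2 = eta / mu2
    mu_star = mu1 + 1j * omega * eta / (1 + 1j * omega * tau2)
    return mu_star.real, mu_star.imag

def zener_tan_delta(mu1, mu2, eta, omega):
    """tan delta according to (10.147)."""
    theta = mu1 / (mu1 + mu2)
    x = (eta / mu2) * omega
    return (1 - theta) / theta * x / (x**2 / theta + 1)

# Illustrative values with mu1 = mu2, as in Fig. 10.29
mu1, mu2, eta = 1.0, 1.0, 1.0
theta = mu1 / (mu1 + mu2)
omega = 0.7
x = (eta / mu2) * omega

storage, loss = zener_moduli(mu1, mu2, eta, omega)
storage_formula = mu1 * (x**2 / theta + 1) / (x**2 + 1)   # (10.145)
loss_formula = mu2 * x / (x**2 + 1)                       # (10.146)

# Maximum of tan delta at tau2 * omega = sqrt(theta)
omega_max = math.sqrt(theta) * mu2 / eta
tan_delta_max = zener_tan_delta(mu1, mu2, eta, omega_max)
print(storage, loss, tan_delta_max)
```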
Figure 10.30 shows such measurements in comparison to the Zener model. Notice
that the values for μ1 and μ2 follow by fitting the theoretical storage modulus to the
experimental results in the respective limits at low and high frequencies. The value
of τ2 ≈ 10−7 s is obtained by adjusting the inflection point of the theoretical storage
modulus to the data. Despite its simplicity the Zener model provides an overall correct
qualitative description of the data.
Before we explore possible improvements of the Zener model, we should make
sure that we do understand why it works the way it works. After all, the model com-
bines three simple basic elements, i.e. two springs plus a dashpot, into a reasonable
description of the mechanical properties of a complex system.
Figure 10.31 shows a number of alternative combinations of the basic models, i.e.
(a) (Maxwell: M) and (b) (Kelvin–Voigt: KV) depicted in Fig. 10.27. If we investigate
the mechanical behavior of the new models, then the upper row behaves very much
like the Maxwell model, i.e. tan δ ∼ ω −1 , whereas the lower row behaves according to
the Kelvin–Voigt model, i.e. tan δ ∼ ω. None of the models reproduces the maximum
of tan δ versus ω. Only the Zener model (sometimes also called Poynting-Thomson
relaxation model) and its dual partner, both depicted in Fig. 10.32, yield qualitative
agreement with the experiment.16 Notice that interchanging springs and dashpots in
the two models does not yield useful results or insights.
We can understand our above results as follows. At low frequencies an increase
of tan δ is observed. This means that the friction element, i.e. the dashpot, must be
able to follow the excitation. This is built into the Kelvin–Voigt model, because the
amplitudes in the two branches are strictly coupled. The decrease of tan δ at high
frequencies, on the other hand, requires the decoupling of the friction elements from
the excitation. This is built into the Maxwell model, where the spring can take over
the strain from the friction element. The Zener model incorporates both behaviors.17
Remark 1: We have mentioned temperature as an important factor influencing the
mechanical behavior of elastomers. But none of the above models depends
on temperature! It turns out that temperature, T , and excitation frequency, ω, are
closely connected. This connection is described by an empirical principle called
time temperature superposition. What does this mean? Figure 10.30 covers a very
16 The two models can be converted into one another via $\mu_1^R = \mu_1^K \mu_2^K/(\mu_1^K + \mu_2^K)$, $\mu_2^R = (\mu_2^K)^2/(\mu_1^K + \mu_2^K)$ and $\eta^R = (\mu_2^K/(\mu_1^K + \mu_2^K))^2 \eta^K$. The index R indicates the relaxation or Zener
model, the index K indicates its dual partner.
17 This also explains why none of the models in Fig. 10.31 describes the entire frequency range.
Fig. 10.29 Results of the Zener model (panels on logarithmic horizontal axes from 0.01 to 100, including tan δ)
Fig. 10.30 Dynamic moduli of the Zener model (lines) in comparison to measured data (squares: storage modulus; circles: loss modulus) obtained for a highly cross-linked polyisoprene rubber versus strain frequency (data reproduced with the permission of Continental Reifen Deutschland)
Fig. 10.31 Simple combination models
Fig. 10.32 Top: Zener or Poynting-Thomson relaxation model; bottom: dual partner
wide frequency range. However, this is not how the data were obtained. The data
were obtained by measuring the dynamic moduli at a fixed temperature, T1 , in a
comparatively small frequency interval. Then a second analogous measurement is
carried out at a different temperature, T2 . This is repeated several times. In the end one
has collected data for the dynamic moduli in the same frequency interval at a series of
temperatures, i.e. T1 , T2 , …. Subsequently, the individual data sets, each belonging to
one particular temperature, are shifted parallel to the frequency axis.18 The shifting
18 Low temperatures corresponds to high frequencies and vice versa. Notice that log ω ∼ 1/T .
continues until one continuous and smooth master curve is obtained. Only one of the
data sets, obtained at temperature Tr , remains within the original frequency interval.
Tr then is the temperature of the master curve. The data in Fig. 10.30 are examples
of master curves.19 The physical justification of what we have just described is not
trivial. The interested reader is referred to [7].
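The shifting procedure of Remark 1 can be illustrated with synthetic data. The sketch below is an assumption-laden toy, not the experimental procedure. It generates a loss modulus from a single Maxwell arm, lets the relaxation time depend on temperature through a hypothetical Arrhenius-like law (so that log ω ∼ 1/T), and applies the known shift factor, whereas in practice the shift is determined by overlaying the measured data sets. All function names and parameter values are made up for illustration.

```python
import math

T_REF = 300.0   # hypothetical reference temperature T_r of the master curve

def loss_modulus(omega, tau, mu2=1.5):
    """Loss modulus of a Maxwell arm (spring mu2 in series with a dashpot)."""
    y = omega * tau
    return mu2 * y / (1.0 + y * y)

def relaxation_time(T, tau_ref=1.0, B=5000.0):
    """Hypothetical Arrhenius-like law: log(tau) depends linearly on 1/T."""
    return tau_ref * math.exp(B * (1.0 / T - 1.0 / T_REF))

def measure(T, omegas):
    """Synthetic 'measurement' in a narrow frequency window at temperature T."""
    tau = relaxation_time(T)
    return [(w, loss_modulus(w, tau)) for w in omegas]

def shift_to_reference(T, data):
    """Shift one data set parallel to the (logarithmic) frequency axis."""
    a_T = relaxation_time(T) / relaxation_time(T_REF)   # shift factor
    return [(w * a_T, g) for w, g in data]

if __name__ == "__main__":
    window = [10.0 ** (k / 4.0) for k in range(-4, 5)]   # 0.1 ... 10
    master = []
    for T in (270.0, 285.0, 300.0, 315.0, 330.0):
        master += shift_to_reference(T, measure(T, window))
    master.sort()
    print(f"master curve spans omega = {master[0][0]:.3g} ... {master[-1][0]:.3g}")
```

The data set taken at the reference temperature stays in the original window, while the low-temperature sets move to higher reduced frequencies and the high-temperature sets to lower ones.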
Another parameter we have mentioned is the strain amplitude. There is no ampli-
tude dependence of the dynamic moduli predicted by the above models. This is
because they are linear. Most elastomer materials contain large amounts of filler.
In the case of automobile tires the fillers are carbon black and/or silica nanopar-
ticles. The resulting materials are highly non-linear and their dynamic moduli are
strongly dependent on the strain amplitude (Payne effect). Again the interested reader
is referred to [7].
Remark 2: We have discussed the above simple models and we do understand why
they describe the experimental data or not. But we do not understand the molecular
mechanisms behind the different combinations of springs and dashpots. This is still
an active field of research.
Special Example: We conclude this section with a look at static friction or stiction.
One of the simplest experiments, which is part of every basic mechanics
laboratory, is the measurement of the coefficient of static friction, $\mu_f$. Usually
a smooth rectangular brick-shaped block is put on an incline. Subsequently
the angle of the incline is increased until the block begins to move. From this
angle the force $F_f$ is calculated, acting on the block parallel to the incline, as
well as the attendant force $F_n$, acting normal to the incline. The coefficient of
static friction follows via the simple law
$$F_f = \mu_f F_n . \tag{10.149}$$
This law, which is better called an approximation, works very well. It also
applies to the sliding motion of the body, where the coefficient of kinetic
friction is smaller than the static one. Remarkably, $F_f$ does not depend on the
(macroscopic or apparent) contact area, A, between the body and the surface
on which it rests or slides. Equation (10.149) usually is called Amontons' law,
even though a number of people who performed systematic studies of friction
have contributed to its development; the first seems to have been Leonardo da
Vinci (1452–1519) (see [12, 13]). The experiment is easy to
perform, but the theoretical foundation of (10.149) is far from simple!
In fact, friction, in particular kinetic friction, is an area of active research.
Let’s attempt a simple ‘derivation’ of (10.149) in the context of this section.
The sketch depicts the interface between an elastomer and a solid surface (e.g.
sandpaper). The shaded elastomer, to which the normal force $F_n$ is applied
from above, is pushed down on the solid surface. Asperities on the surface
indent the elastomer material. The depth of indentation is h and the size of
the asperities is κ. The average nearest-neighbor distance between them is λ.
This of course is a highly idealized model of a rough surface. In addition to
$F_n$ there is a shear stress σ. For sufficiently large σ, i.e. $\sigma > \sigma_{crit}$, we expect
the elastomer to begin sliding parallel to the interface. Actually, it does not
begin to slide smoothly but instead starts with a jolt in the direction of the
parallel force.
[Sketch: elastomer pressed onto a rough solid surface by the normal force $F_n$]
19 Looking closely at the experimental data you may be able to spot a couple of ‘junctions’ between
shifted data sets.
The observation that the body starts sliding with a jolt suggests treating the
problem analogously to Euler buckling on p. 327. We estimate the elastic energy
due to the presence of an asperity via
$$F_{el} \approx \frac{1}{2}\int d^3x\, \mu u^2 \approx \frac{1}{2}\kappa^3 \mu \left(\frac{h}{\kappa}\right)^2 , \tag{10.150}$$
where we have assumed u ∼ h/κ. The work done by the normal force $F_n$ per
asperity on the other hand is
$$w_{F_n} \approx (F_n/N)\, h . \tag{10.151}$$
Here N is the number of asperities on the surface A.

We may obtain the unknown indentation h by making use of the second law
of thermodynamics via (10.26). In the presence of the work $w_{F_n}$ this yields
$$0 = \delta_h (F_{el} - w_{F_n}) \sim \kappa^3 \mu \frac{h}{\kappa^2} - \frac{F_n}{N} \tag{10.152}$$
(cf. (10.89)), i.e.
$$h \sim \frac{1}{\mu\kappa}\frac{F_n}{N} . \tag{10.153}$$
Inserting this result into (10.150) we obtain
$$F_{el} \sim \frac{1}{\mu\kappa}\left(\frac{F_n}{N}\right)^2 . \tag{10.154}$$
Remark: The linear relation between h and Fn in (10.153) is different from

the relation between the analogous quantities in Hertz’s contact theory of
macroscopic elastic bodies (cf. the discussion of this aspect in the review
article by B.N.J. Persson et al. [14]).
We expect the jolt to occur, i.e. $\sigma = \sigma_{crit}$, when the elastic energy due to
shear in a slab of height κ and lateral size $\lambda^2$, i.e.
$$\frac{1}{2}\int d^3x\, \frac{\sigma_{crit}^2}{\mu} \sim \kappa\lambda^2 \frac{\sigma_{crit}^2}{\mu} , \tag{10.155}$$
becomes equal to $F_{el}$. Thus we find
$$\frac{1}{\mu\kappa}\left(\frac{F_n}{N}\right)^2 \sim \kappa\lambda^2 \frac{\sigma_{crit}^2}{\mu} . \tag{10.156}$$
Using $\sigma_{crit} = F_f/A$ and $A \sim \lambda^2 N$ we obtain
$$F_f \sim \frac{\lambda}{\kappa} F_n , \tag{10.157}$$
i.e. we obtain Amontons' law. The parameter λ/κ characterizes the surface
roughness, whereas the parameter μ characterizing the elastomer's elasticity
has dropped out of the result.
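The step from (10.156) to (10.157) is short, but it is worth writing out because it shows explicitly where μ and N cancel. The following lines only rearrange the relations already stated above and introduce no new assumptions.

```latex
\begin{align*}
\frac{1}{\mu\kappa}\left(\frac{F_n}{N}\right)^2 \sim \kappa\lambda^2\,\frac{\sigma_{crit}^2}{\mu}
\quad &\Rightarrow\quad
\sigma_{crit}^2 \sim \frac{F_n^2}{\kappa^2\lambda^2 N^2}
\quad\Rightarrow\quad
\sigma_{crit} \sim \frac{F_n}{\kappa\lambda N} , \\
F_f = \sigma_{crit}\, A \sim \frac{F_n}{\kappa\lambda N}\,\lambda^2 N
\quad &\Rightarrow\quad
F_f \sim \frac{\lambda}{\kappa}\, F_n .
\end{align*}
```

The modulus μ cancels because both energies scale as 1/μ, and N cancels because the apparent area A grows in proportion to the number of asperities.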
Again we emphasize that our reasoning is rather crude. Essentially we
make intuitive use of so-called scaling arguments relating the physical quantities
determining the effect. Aside from the simple surface structure, we assume
linear elasticity. What is also neglected is the interaction of the two surfaces on
the atomic and molecular level. This interaction would enter our calculation
in the form of surface work involving the relevant surface tensions. When this
contribution, which is adhesion, becomes comparable to the elastic deforma-
tion considered thus far, the friction force will depend on the contact area. If
we want to include kinetic friction as well, then we encounter the problem
of dynamic loss, which we already discussed in the framework of the above
simple models of viscoelastic systems (see for instance [15]).
References
1. T.L. Anderson, Fracture Mechanics: Fundamentals and Applications (CRC Press, Boca Raton,
1991)
2. A.E.H. Love, A Treatise on the Mathematical Theory of Elasticity (Dover Publications, New
York, 1944)
3. I.S. Sokolnikoff, Mathematical Theory of Elasticity (McGraw-Hill, New York City, 1956)
4. R. Hentschke, Thermodynamics (Springer, New York, 2013)
5. L.D. Landau, E.M. Lifshitz, Theory of Elasticity (Pergamon Press, New York, 1970)
6. H.M. Smallwood, Limiting law of the reinforcement of rubber. J. Appl. Phys. 15, 758 (1944)
7. T.A. Vilgis, G. Heinrich, M. Klüppel, Reinforcement of Polymer Nano-Composites (Cambridge
University Press, Cambridge, 2009)
8. S.P. Timoshenko, J.M. Gere, Theory of Elastic Stability (McGraw-Hill, New York City, 1963)
9. T.R. Chandrupatla, A.D. Belegundu, Introduction to Finite Elements in Engineering (Pearson,
New York City, 2002)
10. K.P. Menard, Dynamic Mechanical Analysis - A Practical Introduction (CRC Press, Boca
Raton, 2008)
11. R.G.C. Arridge, Mechanics of Polymers (Clarendon Press, Oxford, 1975)
12. H.W. Kummer, Unified Theory of Rubber and Tire Friction (The Pennsylvania State
University College of Engineering, 1966)
13. J. Gao, W.D. Luedtke, D. Gourdon, M. Ruths, J.N. Israelachvili, U. Landman, Frictional forces
and Amontons' law: from the molecular to the macroscopic scale. J. Phys. Chem. B 108, 3410
(2004)
14. B.N.J. Persson, O. Albohr, U. Tartaglino, A.I. Volokitin, E. Tosatti, On the nature of surface
roughness with application to contact mechanics, sealing, rubber friction and adhesion. J. Phys.:
Condens. Matter 17, R1 (2005)
15. R.H. Smith, Analyzing Friction in the Design of Rubber Products and Their Paired Surfaces
(CRC Press, Boca Raton, 2008)
Appendix A
Identities and Units
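Cartesian Unit Vectors: $\vec e_1, \vec e_2, \vec e_3$; $\vec e_i \cdot \vec e_j = \delta_{ij}$ with $\delta_{11} = \delta_{22} = \delta_{33} = 1$ and
$\delta_{12} = \delta_{21} = \delta_{13} = \delta_{31} = \delta_{23} = \delta_{32} = 0$, but $\delta_{ii} = 3$ (summation convention)!
Scalar Product: $\vec a \cdot \vec b = a_i \vec e_i \cdot b_j \vec e_j = a_i b_j \delta_{ij} = a_i b_i$
Vector Product:
$$\vec a \times \vec b = \begin{vmatrix} \vec e_1 & \vec e_2 & \vec e_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}$$
$$\vec e_j \times \vec e_k \equiv \vec e_i\, \varepsilon_{ijk}$$
$$\varepsilon_{ijk} = \begin{cases} 1 & ijk = 123,\ 231,\ 312 \\ -1 & ijk = 132,\ 213,\ 321 \\ 0 & \text{otherwise} \end{cases}$$
$\varepsilon_{ijk}$ is antisymmetric in all indices.
$$\vec a \times \vec b = a_j b_k\, \vec e_j \times \vec e_k = \vec e_i\, \varepsilon_{ijk} a_j b_k$$
Useful:
$$\varepsilon_{ijk}\varepsilon_{kmn} = \delta_{im}\delta_{jn} - \delta_{in}\delta_{jm}$$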
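The contraction identity for the antisymmetric symbol is easy to verify by brute force, which also makes the summation convention concrete. The sketch below uses only the Python standard library; the function names are illustrative and not taken from the text.

```python
def levi_civita(i, j, k):
    """epsilon_ijk for indices in {1, 2, 3}: +1 for cyclic, -1 for anticyclic
    permutations of 123, and 0 if two indices coincide."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def cross(a, b):
    """(a x b)_i = epsilon_ijk a_j b_k with the sums over j and k written out."""
    idx = (1, 2, 3)
    return [sum(levi_civita(i, j, k) * a[j - 1] * b[k - 1]
                for j in idx for k in idx)
            for i in idx]

def contraction_identity_holds():
    """Check epsilon_ijk epsilon_kmn = delta_im delta_jn - delta_in delta_jm
    for all 81 combinations of the free indices i, j, m, n."""
    idx = (1, 2, 3)
    for i in idx:
        for j in idx:
            for m in idx:
                for n in idx:
                    lhs = sum(levi_civita(i, j, k) * levi_civita(k, m, n)
                              for k in idx)
                    rhs = delta(i, m) * delta(j, n) - delta(i, n) * delta(j, m)
                    if lhs != rhs:
                        return False
    return True

if __name__ == "__main__":
    print(contraction_identity_holds())
    print(cross([1, 0, 0], [0, 1, 0]))
```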
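Vector Identities:
$$(\alpha \vec a) \times \vec b = \vec a \times (\alpha \vec b) = \alpha(\vec a \times \vec b)$$
$$\vec a \cdot (\vec b \times \vec c) = \vec b \cdot (\vec c \times \vec a) = \vec c \cdot (\vec a \times \vec b)$$
$$\vec a \times (\vec b \times \vec c) = \vec b(\vec a \cdot \vec c) - \vec c(\vec a \cdot \vec b)$$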
DOI 10.1007/978-3-319-48710-6
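$$(\vec a \times \vec b) \cdot (\vec c \times \vec d) = (\vec a \cdot \vec c)(\vec b \cdot \vec d) - (\vec a \cdot \vec d)(\vec b \cdot \vec c)$$
$$\vec\nabla \times (\vec\nabla\varphi) = 0$$
$$\vec\nabla \cdot (\vec\nabla \times \vec a) = 0$$
$$\vec\nabla \times (\vec\nabla \times \vec a) = \vec\nabla(\vec\nabla \cdot \vec a) - \vec\nabla^2 \vec a$$
$$\vec\nabla \cdot (\varphi \vec a) = \vec a \cdot \vec\nabla\varphi + \varphi \vec\nabla \cdot \vec a$$
$$\vec\nabla \times (\varphi \vec a) = \vec\nabla\varphi \times \vec a + \varphi \vec\nabla \times \vec a$$
$$\vec\nabla(\vec a \cdot \vec b) = (\vec a \cdot \vec\nabla)\vec b + (\vec b \cdot \vec\nabla)\vec a + \vec a \times (\vec\nabla \times \vec b) + \vec b \times (\vec\nabla \times \vec a)$$
$$\vec\nabla \cdot (\vec a \times \vec b) = \vec b \cdot (\vec\nabla \times \vec a) - \vec a \cdot (\vec\nabla \times \vec b)$$
$$\vec\nabla \times (\vec a \times \vec b) = \vec a(\vec\nabla \cdot \vec b) - \vec b(\vec\nabla \cdot \vec a) + (\vec b \cdot \vec\nabla)\vec a - (\vec a \cdot \vec\nabla)\vec b$$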
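Taylor Expansion:
$$\varphi(\vec r_o + \delta\vec r) = \varphi(\vec r_o) + (\delta\vec r \cdot \vec\nabla_{\vec r})\,\varphi(\vec r)\Big|_{\vec r_o} + \frac{1}{2}(\delta\vec r \cdot \vec\nabla_{\vec r})(\delta\vec r \cdot \vec\nabla_{\vec r})\,\varphi(\vec r)\Big|_{\vec r_o} + \cdots$$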
Gauss’ Theorem: Integral theorem of Gauss. Let ∂V be a closed surface enclosing
the volume V. Then
$$\int_V \vec\nabla \cdot \vec A\, dV = \oint_{\partial V} \vec A \cdot d\vec f .$$
The surface element, $d\vec f$, points outward, away from the enclosed volume. For a proof see for instance
M.R. Spiegel [1].
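Generalized Coordinates: Let $\vec r = \vec r(u, v, w)$, where u, v, w are generalized coordinates.
Using the unit vectors
$$\vec e_u = \left|\frac{\partial \vec r}{\partial u}\right|^{-1} \frac{\partial \vec r}{\partial u} \qquad \vec e_v = \left|\frac{\partial \vec r}{\partial v}\right|^{-1} \frac{\partial \vec r}{\partial v} \qquad \vec e_w = \left|\frac{\partial \vec r}{\partial w}\right|^{-1} \frac{\partial \vec r}{\partial w}$$
in these coordinates, requiring them to be orthogonal, we define a coordinate system.
Hence
$$\vec r = \vec e_u r_u + \vec e_v r_v + \vec e_w r_w$$
$$d\vec r = \begin{pmatrix} dx \\ dy \\ dz \end{pmatrix} = \vec e_u \left|\frac{\partial \vec r}{\partial u}\right| du + \vec e_v \left|\frac{\partial \vec r}{\partial v}\right| dv + \vec e_w \left|\frac{\partial \vec r}{\partial w}\right| dw$$
$$\vec\nabla = \begin{pmatrix} \partial_x \\ \partial_y \\ \partial_z \end{pmatrix} = \vec e_u \left|\frac{\partial \vec r}{\partial u}\right|^{-1} \partial_u + \vec e_v \left|\frac{\partial \vec r}{\partial v}\right|^{-1} \partial_v + \vec e_w \left|\frac{\partial \vec r}{\partial w}\right|^{-1} \partial_w .$$
The last relation follows from
$$\partial_u = \frac{\partial \vec r}{\partial u} \cdot \partial_{\vec r} \equiv \frac{\partial \vec r}{\partial u} \cdot \vec\nabla ,$$
i.e.
$$\left|\frac{\partial \vec r}{\partial u}\right|^{-1} \partial_u = \vec e_u \cdot \vec\nabla$$
(analogous for v and w). The desired result follows via
$$\vec\nabla = \vec e_u(\vec e_u \cdot \vec\nabla) + \vec e_v(\vec e_v \cdot \vec\nabla) + \vec e_w(\vec e_w \cdot \vec\nabla) .$$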
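Special Cases:
(a) Cylindrical Coordinates:
$$\vec r = \begin{pmatrix} r\cos\phi \\ r\sin\phi \\ z \end{pmatrix}$$
$$\vec e_r = \begin{pmatrix} \cos\phi \\ \sin\phi \\ 0 \end{pmatrix} \qquad \vec e_\phi = \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix} \qquad \vec e_z = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$
$$\left|\frac{\partial \vec r}{\partial r}\right| = 1 \qquad \left|\frac{\partial \vec r}{\partial \phi}\right| = r \qquad \left|\frac{\partial \vec r}{\partial z}\right| = 1$$
$$d\vec r = \vec e_r\, dr + \vec e_\phi\, r\, d\phi + \vec e_z\, dz$$
$$\vec\nabla = \vec e_r\, \partial_r + \vec e_\phi\, \frac{1}{r}\partial_\phi + \vec e_z\, \partial_z$$
$$\vec\nabla^2 = \frac{1}{r}\partial_r (r\, \partial_r) + \frac{1}{r^2}\partial_\phi^2 + \partial_z^2$$
(b) Spherical Coordinates:
$$\vec r = \begin{pmatrix} r\cos\phi\sin\theta \\ r\sin\phi\sin\theta \\ r\cos\theta \end{pmatrix}$$
$$\vec e_r = \begin{pmatrix} \cos\phi\sin\theta \\ \sin\phi\sin\theta \\ \cos\theta \end{pmatrix} \qquad \vec e_\phi = \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix} \qquad \vec e_\theta = \begin{pmatrix} \cos\phi\cos\theta \\ \sin\phi\cos\theta \\ -\sin\theta \end{pmatrix}$$
$$\left|\frac{\partial \vec r}{\partial r}\right| = 1 \qquad \left|\frac{\partial \vec r}{\partial \phi}\right| = r\sin\theta \qquad \left|\frac{\partial \vec r}{\partial \theta}\right| = r$$
$$d\vec r = \vec e_r\, dr + \vec e_\phi\, r\sin\theta\, d\phi + \vec e_\theta\, r\, d\theta$$
$$\vec\nabla = \vec e_r\, \partial_r + \vec e_\phi\, \frac{1}{r\sin\theta}\partial_\phi + \vec e_\theta\, \frac{1}{r}\partial_\theta$$
$$\vec\nabla^2 = \underbrace{\frac{1}{r^2}\partial_r \left(r^2 \partial_r\right)}_{= \frac{1}{r}\partial_r^2 (r \ldots)} + \underbrace{\frac{1}{r^2\sin^2\theta}\partial_\phi^2 + \frac{1}{r^2\sin\theta}\partial_\theta (\sin\theta\, \partial_\theta)}_{\equiv \vec\nabla^2_{\phi,\theta}}$$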
Units:
length                      m
time                        s
mass                        kg
force                       N = kg m s⁻²
work, energy                J = N m
pressure                    Pa = N m⁻²
elastic moduli              N m⁻²
Constants (MKS):
velocity of light           c = 2.99792458 · 10⁸ m s⁻¹
atomic mass unit            m_a = 1.66055 · 10⁻²⁷ kg
gravitational constant      G = 6.673 · 10⁻¹¹ N m² kg⁻²
mass of the earth           m_E = 5.977 · 10²⁴ kg
radius of the earth         r_E = 6.37 · 10⁶ m
gravitational acceleration  g = 9.81 m s⁻²
Boltzmann’s constant        k_B = 1.380658 · 10⁻²³ J K⁻¹
Reference
New York, 1971)
Appendix B
Mathematica MD in the NVE-Ensemble
"A simple Molecular Dynamics program for

Lennard-Jones particles (using LJ units).
It realizes the NVE ensemble, i.e. the
number of particles, N, the volume, V,
and the energy of the system, E, are
constant.";
"The program INIT is used to initialize

various quantities. It also sets up the
initial particle positions and assigns
the initial random velocities to the
particles";
"Individual key pieces: MOVER - advances

the positions, MOVEV - advances the
velocities, FORCE - calculates the
forces felt by the individual particles
at their current positions";
"INIT";
"set parameters:";
"number of particles - (too) small because we use
Mathematica!!";
n = 3*3*3;
"number of timesteps";
NSTEP = 100000;
"maximum magnitude of initial random velocity component";
vmax = 2.7;
"cut-off radius for the forces";
rcut = 3;
"the primary simulation box volume is V=L^3";
L = 3;
"timestep";
t = 0.001;
"generate initial coordinates on a cubic lattice";

"initialize coordinate arrays";
x = Table[0, {i, 1, n}, {k, 1, NSTEP}];
y = Table[0, {i, 1, n}, {k, 1, NSTEP}];
z = Table[0, {i, 1, n}, {k, 1, NSTEP}];
"calculate particle coordinates on cubic lattice";
i = 0; max = n^(1/3); Do[
i+=1;
x[[i, 1]] = ii;
y[[i, 1]] = jj;
z[[i, 1]] = kk,
{ii, 0, max −1},
{jj, 0, max −1},
{kk, 0, max −1}];
"display particles on cubic lattice and box boundaries";
g1 =
Graphics3D[
{Table[{PointSize[Large],
Point[{x[[i, 1]], y[[i, 1]], z[[i, 1]]}]}, {i, 1, n}],
Line[{{0, 0, 0}, {L , 0, 0}, {L , L , 0}, {0, L , 0}, {0, 0, 0},
{0, 0, L}, {L , 0, L}, {L , L , L}, {0, L , L}, {0, 0, L},
{0, L , L}, {0, L , 0}, {L , L , 0}, {L , L , L}, {L , 0, L},
{L , 0, 0}}]}, Boxed → False]
"generate random velocity components";

vx = Table[0, {i, 1, n}, {k, 1, NSTEP}];
vy = Table[0, {i, 1, n}, {k, 1, NSTEP}];
vz = Table[0, {i, 1, n}, {k, 1, NSTEP}];
Do[
vx[[i, 1]] = vmax(2Random[Real, 1] − 1);
vy[[i, 1]] = vmax(2Random[Real, 1] − 1);
vz[[i, 1]] = vmax(2Random[Real, 1] − 1),
{i, 1, n}];
"subtract center of mass velocity";
vxcm = Sum[vx[[i, 1]], {i, 1, n}]/n
vycm = Sum[vy[[i, 1]], {i, 1, n}]/n
vzcm = Sum[vz[[i, 1]], {i, 1, n}]/n
Do[
vx[[i, 1]] = vx[[i, 1]] − vxcm;
vy[[i, 1]] = vy[[i, 1]] − vycm;
vz[[i, 1]] = vz[[i, 1]] − vzcm,
{i, 1, n}];
"check this";
vxcm = Sum[vx[[i, 1]], {i, 1, n}]/n
vycm = Sum[vy[[i, 1]], {i, 1, n}]/n
vzcm = Sum[vz[[i, 1]], {i, 1, n}]/n
"initialize force array";
fx = Table[0, {i, 1, n}, {k, 1, NSTEP}];
fy = Table[0, {i, 1, n}, {k, 1, NSTEP}];
fz = Table[0, {i, 1, n}, {k, 1, NSTEP}];
0.480694
−0.11963
−0.193074
1.0691036533427433`*^-16
-4.11193712824132`*^-17
0.
"NVE - MD for LJ particles";
Timing[
"FORCE (k) ";
k = 1;
Do[
"minimum image convention";
xmin = (x[[i, k]] - x[[j, k]]) - L Round[(x[[i, k]] - x[[j, k]])/L];
ymin = (y[[i, k]] - y[[j, k]]) - L Round[(y[[i, k]] - y[[j, k]])/L];
zmin = (z[[i, k]] - z[[j, k]]) - L Round[(z[[i, k]] - z[[j, k]])/L];
rmin2 = xmin^2 + ymin^2 + zmin^2;
If[rmin2 < rcut^2,
{f = 48/rmin2^7 - 24/rmin2^4;
fx[[i, k]]+= f xmin;

fy[[i, k]]+= f ymin;
fz[[i, k]]+= f zmin;
fx[[ j, k]]+= − f xmin;
fy[[ j, k]]+= − f ymin;
fz[[ j, k]]+= − f zmin}],
{i, 1, n − 1}, { j, i + 1, n}];
"main loop of MD";
Do[
"MOVER (k-1)";
Do[
x[[i, k]] = x[[i, k - 1]] + t vx[[i, k - 1]] +
(t^2/2) fx[[i, k - 1]];
y[[i, k]] = y[[i, k - 1]] + t vy[[i, k - 1]] +
(t^2/2) fy[[i, k - 1]];
z[[i, k]] = z[[i, k - 1]] + t vz[[i, k - 1]] +
(t^2/2) fz[[i, k - 1]],
{i, 1, n}];
"FORCE (k) ";

Do[
"minimum image convention";
xmin = (x[[i, k]] - x[[j, k]]) - L Round[(x[[i, k]] - x[[j, k]])/L];
ymin = (y[[i, k]] - y[[j, k]]) - L Round[(y[[i, k]] - y[[j, k]])/L];
zmin = (z[[i, k]] - z[[j, k]]) - L Round[(z[[i, k]] - z[[j, k]])/L];
rmin2 = xmin^2 + ymin^2 + zmin^2;
If[rmin2 < rcut^2,
{f = 48/rmin2^7 - 24/rmin2^4;
fx[[i, k]]+= f xmin;
fy[[i, k]]+= f ymin;
fz[[i, k]]+= f zmin;
fx[[ j, k]]+= − f xmin;
fy[[ j, k]]+= − f ymin;
fz[[ j, k]]+= − f zmin}],
{i, 1, n − 1}, { j, i + 1, n}];
"MOVEV (k-1)";
Do[
vx[[i, k]] = vx[[i, k - 1]] + (t/2) (fx[[i, k]] + fx[[i, k - 1]]);
vy[[i, k]] = vy[[i, k - 1]] + (t/2) (fy[[i, k]] + fy[[i, k - 1]]);
vz[[i, k]] = vz[[i, k - 1]] + (t/2) (fz[[i, k]] + fz[[i, k - 1]]),
{i, 1, n}],
{k, 2, NSTEP}]]
{1204.27, Null}
"pictorial representation of selected particle's path

including the initial lattice";
g2 =
Graphics3D[
{Red, Point[Table[{x[[1, k]], y[[1, k]], z[[1, k]]},
{k, 2, NSTEP}]],
{Green, Point[Table[{x[[8, k]], y[[8, k]], z[[8, k]]},
{k, 2, NSTEP}]]},
{Blue, Point[Table[{x[[16, k]], y[[16, k]], z[[16, k]]},
{k, 2, NSTEP}]]},
{Magenta, Point[Table[{x[[24, k]], y[[24, k]], z[[24, k]]},
{k, 2, NSTEP}]]}}, Boxed → False];
Show[g1, g2]
"instantaneous temperature vs. time";

Table[
{t k, Sum[(vx[[i, k]]^2 + vy[[i, k]]^2 + vz[[i, k]]^2), {i, 1, n}]/
(3 n)}, {k, 2, NSTEP}];
ListPlot[%, Joined → True, PlotRange → {1, 4}, PlotStyle → Black,
AxesLabel → {"time", "T"}]
[Plot: instantaneous temperature T versus time, for times from 0 to 100]
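For readers without Mathematica, the three key pieces of the program (MOVER, FORCE, MOVEV) can be sketched in a few lines of Python. This is a hedged re-implementation under stated assumptions and not the author's code. It uses the same velocity Verlet scheme, minimum image convention and Lennard-Jones units, but it does not store the trajectories, and it additionally records the total energy as a check of the NVE property. The function names `forces` and `run`, the random seed and the short default run length are assumptions made for illustration.

```python
import random

def forces(pos, L, rcut):
    """FORCE: Lennard-Jones forces and potential energy (LJ units),
    using the minimum image convention in a cubic box of edge L."""
    n = len(pos)
    f = [[0.0, 0.0, 0.0] for _ in range(n)]
    epot = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = [pos[i][c] - pos[j][c] for c in range(3)]
            d = [dc - L * round(dc / L) for dc in d]      # nearest periodic image
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2
            if r2 < rcut ** 2:
                inv6 = 1.0 / r2 ** 3
                fac = (48.0 * inv6 * inv6 - 24.0 * inv6) / r2   # 48/r^14 - 24/r^8
                epot += 4.0 * (inv6 * inv6 - inv6)
                for c in range(3):
                    f[i][c] += fac * d[c]
                    f[j][c] -= fac * d[c]
    return f, epot

def run(nsteps=100, dt=0.001, L=3.0, rcut=3.0, vmax=2.7, seed=1):
    """Velocity Verlet for 27 particles started on a cubic lattice.
    Returns the total energy at every step and the final velocities."""
    rng = random.Random(seed)
    pos = [[float(a), float(b), float(c)]
           for a in range(3) for b in range(3) for c in range(3)]
    vel = [[vmax * (2.0 * rng.random() - 1.0) for _ in range(3)] for _ in pos]
    n = len(pos)
    for c in range(3):                       # subtract center of mass velocity
        vcm = sum(v[c] for v in vel) / n
        for v in vel:
            v[c] -= vcm
    f, epot = forces(pos, L, rcut)
    energies = []
    for _ in range(nsteps):
        ekin = 0.5 * sum(vc * vc for v in vel for vc in v)
        energies.append(ekin + epot)
        for i in range(n):                   # MOVER: advance positions
            for c in range(3):
                pos[i][c] += dt * vel[i][c] + 0.5 * dt * dt * f[i][c]
        f_new, epot = forces(pos, L, rcut)   # FORCE at the new positions
        for i in range(n):                   # MOVEV: advance velocities
            for c in range(3):
                vel[i][c] += 0.5 * dt * (f[i][c] + f_new[i][c])
        f = f_new
    return energies, vel

if __name__ == "__main__":
    energies, _ = run()
    print(f"energy drift over the run: {max(energies) - min(energies):.2e}")
```

Because the forces are pairwise and antisymmetric, the total momentum stays at zero up to rounding, and the total energy should fluctuate only weakly for a timestep of this size.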
Index
A Chaos
Acceleration, 40, 100 deterministic, 288
Action, 71 Chemical bond, 62
Adhesion, 363 Clausius inequality, 302
Adiabatic Collision
compression, 277 elastic, 144
expansion, 277 Complex number, 36
Angular momentum, 50, 104 complex conjugate, 36
Arrow of time, 303 imaginary part, 36
Auto-correlation function, 261 magnitude, 36
Automobile tire, 323 real part, 36
Average Constant of motion, 106
ensemble, 281 Constraints, 91
time, 272 holonomic, 91
nonholonomic, 91
rheonomic, 91
B scleronomic, 91
Baseball, 211 Continuity equation, 278
Bending instability, 329 Coordinate system
Bertrand’s theorem, 129 right-handed, 1
Binomial theorem, 23 Coordinates
Birkhoff’s theorem, 272 cartesian, 1
Black-body radiation, 252 cylindrical, 1
Boltzmann’s generalized, 89
equation, 273 internal, 183
picture, 273 normal, 178
polar, 1
spherical, 1
C symmetry adapted, 1
Canonical Critical point, 265
equations, 234 Cutoff radius, 259
transformation, 245
generator, 248
Canonically conjugate variables, 246 D
Center of mass, 49, 105 D’Alembert’s principle, 230
velocity, 49 Dark matter, 45
Central limit theorem, 261, 262 Dashpot, 353
Deformation buckling, 329, 362

elastic, 291 equations, 218
plastic, 291 formula, 21, 29
Degree of freedom, 89 Euler–Lagrange equations of motion, 71, 90
Derivative, 18
chain rule, 19
gradient, 24 F
method of steepest descent, 24 Filler
partial, 24 nanoparticles, 323
product rule, 19, 31 Finite Element Method
total, 25 elasticity and, 346
total differential, 24 element matrix, 339
Determinant elements, 337
definition, 17 shape function, 338
Jacobian, 27 Finite size effects, 259
Differential equation Fix point
homogeneous, 56 stable, 287
inhomogeneous, 55 Force, 35, 100
Dispersion relation, 160 central, 127
Displacement, 292 centrifugal, 45, 56, 108
Dissipation, 63, 352 conservative, 101, 157
Divergence theorem, 42 Coriolis, 108
Dynamic Mechanical Analysis, 353 friction, 276
generalized, 74
reaction, 58
surface, 301
E
volume, 301
Eccentricity, 132
Fracture, 320
e-function, 20
mechanics, 291
series expansion, 23 Free energy, 303
special meaning, 21 Friction
Einstein-Smallwood equation, 323 Amontons' law, 361
Elastic coefficient, 156, 361
constants, 301, 305 force, 156
waves, 331 static, 361
Elastomer, 352
Electronegativity, 119
Ellipse, 133 G
Empirical force field, 65, 183
Γ-space, 273
Energy, 52 Gas
conservation, 75 ideal, 275, 283
internal, 302 Gauss’ theorem, 42
kinetic, 52 Gibbs’ picture, 280
potential, 52, 100 Gradient operator, 24
vibration, 178 Gravitational
Ensemble, 280 constant, 39
canonical, 281 field, 43
micro-canonical, 280 Gravity assist, 150
Entropy, 40, 274, 302
Equilibration, 261
Equivalence principle, 56 H
Ergodic hypothesis, 284, 286 Hamiltonian, 233
Euler Hamilton–Jacobi differential equation, 247
angles, 219 Hamilton’s equations, 234
Harmonic, 61 density, 175

Heat bath, 281 Lamé coefficient, 305
Homogeneity Lattice constant, 9
in space, 52 Law of corresponding states, 263
in time, 52 Legendre transformation, 234, 250
Homogeneous function, 284 Lennard-Jones
Hooke’s law, 307 particle gas, 274
H-theorem, 274, 302 potential, 59
Hysteresis, 356 Line element, 28
Liouville’s theorem, 278
ln-function, 20
I Logistic map, 287
Ideal gas law, 276, 282 Lorentz transformation, 81
Impact parameter, 143 Lyapunov exponent, 288
Inertia, 52
Inertial reference frame, 82
Infrared spectroscopy, 183 M
Initial conditions, 46
Mass
Instantaneous action at a distance, 47
density, 42
Integration, 30
reduced, 126
and work, 35
Master curve, 361
along a path, 35
Mathematical, vi
completing the square, 34
Matrix
coordinate transformation, 33
addition, 14
Gaussian integral, 33
commutative law, 14
integrand, 31
definition, 13
limits, 31
determinant, 17
parameter method, 32
diagonal, 16
partial, 31
eigenvalue, 16
substitution, 34
eigenvector, 16
surface, 34
vector field, 35 inverse, 15
volume, 34 multiplication, 14
Integrator, 225 trace, 15
Isothermal compressibility, 308 transpose, 15
Isotropy, 52 unit or identity, 15
Maxwell distribution, 274
Maxwell’s equations, 69
J Mechanical equilibrium
Jacobi identity, 245 static, 48
Microstate, 271, 280
equal probability of, 280
K Minimum image convention, 260
Kepler’s Mixing, 286
first law, 132 rule, 60
problem, 130 Modulus
second law, 128 complex, 355
third law, 137 compression, 306
elastic, 177, 309, 324
loss, 355
L shear, 168, 306
Lagrange storage, 355
multiplier, 91, 334 Young’s, 177, 309
Lagrangian, 71 Molecular chaos, 274
Molecular Dynamics simulation, 258, 272 general, 203

Molecular modeling, 65 mathematical, 53, 73, 92, 125, 235, 271
Moment of inertia tensor, 192 rotating, 108
Momentum, 46 Perihelion precession, 139
conjugate, 235 Phase, 156
free relativistic particle, 79 space, 271
generalized, 74, 103, 234 trajectory, 271
Morse potential, 62 Photon, 252
μ-space, 273 Planck’s constant, 249
Poincaré’s recurrence theorem, 285
Point mass, 41
N
Poisson
Nanocomposite, 353
brackets, 244
Neutral surface, 325
’s number, 309
Newton’s
theorem, 245
equations of motion, 100
first law, 47, 49, 72, 96 Polarization, 46
first theorem, 43 Polymer, 63, 311
law of gravitation, 39 Position vector, 89
method, 22 Potential
second law, 45 centrifugal, 128
second theorem, 43 effective, 128
third law, 48, 103 Precession, 196
Non-mixing, 286 Principal
Normal mode, 178 axes, 194
axes of inertia, 193
moments of inertia, 194
O Principle of correspondence, 250
Oscillation Principle of least action, 71
damped, 157 Proper time, 77
Oscillator Pythagoras’s theorem, 5
amplitude, 156
anharmonic, 155, 170
damped, 156
dissipation, 172 Q
driven, 169 Quantum mechanics, 47, 69, 72, 233, 234,
frequency, 156 245, 249
harmonic, 76, 155, 247, 251 Quantum theory, 252
in gravitational field, 74 Quaternions, 221
one-dimensional, 106
position distribution, 170
resonance, 170
R
Relaxation time, 355
P Rest energy, 80
Pairwise additivity, 46 Right-hand-rule, 12
Parallel axis theorem, 199 Rigid body, 49, 51
Partial charge, 119 Ritz method, 333
Payne effect, 361 Rolling resistance, 173
Pendulum Rotation curve, 45
Foucault’s, 115 Rubber, 323
clock, 208 polyisoprene, 360
double, 163 reinforcement, 323
elastic, 66, 98, 241, 256 Runge–Kutta method, 253
S generalization, 25
Sample Temperature, 265, 270, 274, 283, 302
average, 260 Tensile strength, 63
variance, 260 Theory of relativity
Scalar, 4 general, 47
true, 9 special, 75
Scaling argument, 363 Thermal wavelength, 267
Scattering Thermodynamic limit, 281
angle, 141 Thermodynamics, 302
cross section, 142 first law, 52, 302
Rutherford, 143 second law, 274, 302, 362
Shear rate, 353 Time
Silly Putty, 353 homogeneity of, 102
Simulation box Time temperature superposition, 358
image, 259 Top
primary, 259 asymmetric, 194
Slingshot-effect, 150 free symmetric, 195
Sound velocity spherical, 194
longitudinal, 332 symmetric, 194, 223, 235
transversal, 332 Torque, 51, 210
Space Trajectory

-, 273 stability, 286
μ-, 273 Transformation
Spacetime, 77 Galilei, 76
Spring constant, 61 Lorentz, 78
Standard model, 40 Turning point, 124
State function, 302
Static equilibrium, 51
Statistical mechanics, 89, 233, 234, 258, 271, V
302 Vector, 3
Steiner’s theorem, 199 basis, 9
Stokes’ law, 157 cross product, 7
Strain tensor, 293 cross product and rotation, 11
cylindrical coordinates, 295 field, 4
spherical coordinates, 295 Laplace-Runge-Lenz, 138
Stress tensor, 299 magnitude, 5
String scalar product, 6
wave equation, 175 unit, 9
Summation convention, 6, 14 Velocity
Surface element, 26, 28 addition of, 80
conversion, 26 angular, 12
Surface tension, 363 sound, 167
Symmetry Velocity of sound
time invariance, 75 transversal, 168
System Velocity-addition formula, 80, 84, 85
bulk, 281 Virial theorem, 135, 283
closed, 271 Virtual displacements, 230
isolated, 48, 271 Viscoelasticity, 353
open, 157, 271 Kelvin-Voigt model, 354
Maxwell model, 354
Poynting-Thomson model, 358
T Zener model, 354
Taylor series expansion, 23 Viscosity, 353
coefficient, 323 W
Volume element, 27, 28 Wavenumber, 158, 180
conversion, 26 Work, 101
Vulcanization, 323, 352 reversible, 302

