Mattias Flygare
Physics
C-level thesis
Date: 2012-05-30
Supervisor: Jürgen Fuchs
Examiner: Claes Uggla
Acknowledgements
My biggest thanks to my supervisor Jürgen Fuchs, always interested and always
interesting! I would also like to thank Jonatan Andersson for our long discussions
and all your helpful comments. Thanks also to David, Sofie, Jessica and Joakim
for your willingness to listen and help. Last but not least, thank you to my wife
Anna-Lena and my two sons Sixten and Alexander!
Contents
1 Introduction
2 Mathematical tools
2.1 Linear and affine functions
2.2 Projection
2.3 Manifolds
3 D'Alembert's principle
4 Constraints
4.1 Holonomic constraints
5.4 Interpretation
5.7 Curvature
1 Introduction
In the late 17th century, the German mathematician Gottfried Leibniz introduced the
concept of vis viva, the living force, to be the mass times the square of the velocity, $mv^2$.
This is today referred to as kinetic energy (differing only by the unimportant factor
2)[11]. He also replaced Isaac Newton's concept of force with the work of the force, later
replaced by an even more basic quantity, the work function. This marks the starting
point of a new branch of mechanics, often called analytical mechanics, where the vector
quantities of force and momentum are no longer important, but instead these two scalar
quantities, the kinetic energy T and the work function U completely describe the motion
of the system.
Through the work of Leonhard Euler, Joseph-Louis Lagrange and others in the second
half of the 18th century, a new formalism was developed, called Lagrangian mechanics.
Euler and Lagrange were the first to discover and formulate the principle of least action,
stating that given a start point and an end point, assuming the energy of the system is
conserved, a time-integral of the kinetic energy over any possible path between the start
and end points has a stationary value, and that nature will choose the path with the
smallest of these values. Thus, to find the actual motion of the system one only needed
to find the minimum value for this integral. This was achieved by using the calculus of
variations.
In 1834 and 1835, William Rowan Hamilton published two papers titled "On a General
Method in Dynamics", where he was able to generalize the principle of least action to
include nonconservative systems. The integrand in the action integral was no longer the
kinetic energy alone, but the difference between kinetic energy T and potential energy V,
the most fundamental quantity in Lagrangian mechanics, later named the Lagrangian,
\[ L = T - V. \tag{1.1} \]
This variational principle was named Hamilton's principle. Numerous other formulations
of the principle of least work were also developed (by Lagrange, Euler, Jacobi) but they
are all similar and all originate from the principle of least action. These principles
all lead to the equations of motion in the Lagrangian formalism, the Euler-Lagrange
equations
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} = 0, \tag{1.2} \]
for any generalized coordinate $q^j$, where $j = 1, ..., n$ and $n$ is the total number of coordinates.
There is however another way to arrive at these equations, and that is d'Alembert's
principle. It does not use a least action principle at all, but instead starts with Newton's second law and applies the
principle of virtual work, stating that a system is in equilibrium if and only if the total
virtual work (the work done by a force along a spatial displacement when time is kept
fixed) of all applied forces vanishes. The principle thus assumes that all the work done by
constraint forces cancel. The new addition by d'Alembert (startlingly simple and still
very important) to this principle was to take the right hand side of Newton's second
law, the mass times the acceleration, and move it to the left hand side and treat it as
if it were a force, called the force of inertia. In this way, the principle of virtual work
could be extended to include systems with non-trivial dynamics and not only static
systems.
The d'Alembert principle also leads to the Euler-Lagrange equations and it is actually
the most general principle of them all, and indeed it can quite easily be mathematically
reformulated as Hamilton's principle. Hamilton's principle (and the same goes for all
the mentioned least action principles) is thus equivalent to d'Alembert's principle in all
applications where both can be applied [11]. The latter half of the previous statement is
very important but sometimes forgotten. To get a quick appreciation of how the two
the kinetic energy and the work function. The forces that give rise to a virtual work
1. Monogenic. Forces that are derivable from a scalar function U called the work function.
2. Polygenic. Forces that are not monogenic, but for which it is still relevant to talk
about the virtual work done by the force (such as frictional forces).
Conservative forces are now the special case when U does not depend on time explicitly.
Potential forces are in turn the special case of conservative forces in which the work
function depends neither on time explicitly nor on the coordinate velocities. The negative
principle, in areas such as optimal control and when dealing with systems of curvilinear
coordinates, although it turns out that Hamilton's principle is invalid when dealing
with polygenic forces while d'Alembert's still works. To see what this means, let us also
be constraints that are expressed in terms of coordinate velocities that are not
There is also a connection between the two above classifications: holonomic constraints
give rise to monogenic forces and nonholonomic constraints give rise to polygenic
forces [11]. From this we can immediately deduce the important fact that Hamilton's
principle is applicable only to systems subject to holonomic constraints and that for
Probably due to the elegance and wide areas of use of Hamilton's principle on systems
with holonomic constraints, some confusion has sometimes arisen when nonholonomic
systems are to be treated. One way to do this properly was firmly established
already in 1900 in [10], but still today one can find books for students in analytical
mechanics, as pointed out by [1] and [6], where the topic is treated in a careless manner,
such as when generality is claimed for results applying only to holonomic constraints,
or the author may simply have misunderstood how the treatment of nonholonomic con-
Although there still is some confusion remaining, there are numerous publications
where the importance of using the d'Alembert principle is stressed and the main differences
are pointed out. One of the most active contemporary authors within the field
is A.M. Bloch (for example in [1], [12], [3] or [2]), but there are also other authors (for
example W.S. Koon and J.E. Marsden in [9] and M.R. Flannery in [6]).
The main purpose of this text is to create a reasonably straight path through the
process of obtaining equations of motion for systems with both holonomic and nonholonomic
constraints, starting with the d'Alembert principle, deriving the Euler-Lagrange
equations and then applying constraints, both holonomic and nonholonomic. By doing
this in a very straightforward and transparent way (the ambition was to include
almost all equations and calculations) it becomes fairly easy to pinpoint the differences
between the two types of constraints, and how they appear explicitly in the equations
A large part of the more recent literature in the field is written, not in terms of local
coordinates, but in terms of differential geometry. Since this text is intended for last
year undergraduate level and up, one of the secondary purposes of the text was to open the
door (but never completely walk through it) to describing mechanical systems in the
Finally the formulas developed are put to use on the mechanical system of a rolling
disc (such as a coin) on a horizontal plane, a good example of how rolling constraints
lead to linear nonholonomic constraint equations. The example has (of course) been
treated before in a similar way, but is included to illustrate the process of obtaining
the equations of motion by the process developed in the foregoing sections and it also
By pure necessity, and sometimes to avoid confusing the reader more than necessary,
there are some restrictions to generality in the text. To begin with, we only treat systems
where there are no nonpotential forces acting, which in this context means that
we only consider forces that arise from constraints. We also only treat systems where
the constraints do not depend on time explicitly (so-called scleronomic systems, as
opposed to rheonomic systems, whose constraints depend on time explicitly). The nonholonomic
constraints we treat are all linear in the velocities, disregarding both constraints
that also have a term depending only on the coordinates (affine constraints) and also
completely nonlinear constraints.
Many of these restrictions can quite easily be lifted so that the results can be extended
in generality. In these cases the reader will be referred to external literature where this is
treated. In other cases, such as not treating completely nonlinear constraints, it is simply
outside the scope of either principle mentioned in this text to treat such constraints.
This text draws heavily from the above mentioned authors and follows closely their
adding comments or conclusions that help tie together all loose ends. While no single
detail in this text is completely new, its main contribution is to gather all these loose
ends and tie them together in a way that cannot be located in the literature and that
hopefully provides the reader with new or refreshed knowledge and understanding of
Lagrangian mechanics.
2 Mathematical tools
In this section, definitions and explanations of some of the mathematical tools and
concepts used in this text are given. Some clarifying examples are also included to make
some of the more abstract concepts a bit more concrete. The section can be read straight
To read more about these mathematical tools and concepts, the reader is referred to
2. $f(aX) = a f(X)$, $X \in \mathbb{R}^n$, $a \in \mathbb{R}$.
a translation.
2.2 Projection
Definition 3. A projection P is a linear transformation
\[ P : \mathbb{R}^n \to \mathbb{R}^n \tag{2.2} \]
result is the same vector. P is thus said to be idempotent, which is to say that
\[ P^2 = P. \tag{2.4} \]
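The idempotency property can be checked numerically. The following is a minimal sketch, assuming as an illustration the orthogonal projection of the plane onto the line y = x; the matrix `P`, the test vector `v` and the helper names are chosen here and do not come from the text.

```python
# Orthogonal projection of R^2 onto the line y = x (illustrative choice).
P = [[0.5, 0.5],
     [0.5, 0.5]]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Apply the matrix m to the vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

P_squared = matmul(P, P)   # should reproduce P itself
v = [3.0, 1.0]
once = apply(P, v)         # first projection moves v onto the line
twice = apply(P, once)     # projecting again changes nothing
```

Projecting a second time leaves the vector unchanged, which is exactly the content of the relation $P^2 = P$.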
2.3 Manifolds
To describe the positional state of a given mechanical system we find a minimum number
n of generalized coordinates, not necessarily coordinates in position space but any quantity
that describes the position of the system. All possible values of these coordinates
make up a set of points Q. This set may not resemble Euclidean space as a whole, but
locally (on a small enough scale) there exists a homeomorphism from Q to $\mathbb{R}^n$. All these
local charts to $\mathbb{R}^n$ overlap smoothly with each other so that the whole set of points can
Definition 4. A differentiable manifold of dimension n is a set Q together with a
finite or countably infinite set of subsets $U \subset Q$ and with one-to-one mappings $\varphi :
U \to \mathbb{R}^n$ such that
A simple illustration is to see the manifold as the earth and each chart as one of the
pages of an atlas. Although two different pages may be scaled and rotated differently,
in the areas where they overlap it is easy to find points that correspond to each other.
manifold of dimension one is the circle $S^1$. The generalized coordinate is usually the
angle, which varies from 0 to $2\pi$. This example also illustrates the need for more than
one chart to describe the manifold, since the coordinate values 0 and $2\pi$ describe the
a useful manifold to describe a spherical pendulum. Here the two generalized coordinates
would typically be the usual spherical coordinates, with the azimuthal angle ranging from 0 to $2\pi$
and the polar angle from 0 to $\pi$. Also in this example we have several values of coordinates describing
These examples of some possible manifolds are included to provide a better feeling
that
Now assume a map $\varphi$ from an open subset containing q to $\mathbb{R}^n$ so that $\varphi(c(t))$ and $\varphi(d(t))$
describe curves on the chart corresponding to the curves on the manifold. The curves c
\[ \left.\frac{d^i}{dt^i}\varphi(c(t))\right|_{t=0} = \left.\frac{d^i}{dt^i}\varphi(d(t))\right|_{t=0}, \quad i = 1, ..., N. \tag{2.6} \]
In this text we only consider N = 1. N th-order tangency is useful for instance when
considering constraints given in higher order time derivatives of the coordinates. Further
information can be found on the topic in [5, p. 566-570] and [4, p. 2785-2787].
In short, the tangent vector is the time derivative of the curve once it is mapped to
local coordinates.
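First-order tangency can be illustrated numerically on the circle. The sketch below assumes the angle as the local chart near the point (1, 0) and two illustrative curves `c` and `d` through that point; all names, and the use of a central difference in place of the exact derivative, are assumptions made for this example.

```python
import math

def chart(point):
    """Local chart on the circle near (1, 0): map a point (x, y) to its angle."""
    x, y = point
    return math.atan2(y, x)

def c(t):
    """Curve on the unit circle passing through (1, 0) at t = 0."""
    return (math.cos(t), math.sin(t))

def d(t):
    """A different curve through the same point with the same velocity at t = 0."""
    angle = math.sin(t)
    return (math.cos(angle), math.sin(angle))

def tangent(curve, h=1e-5):
    """Central-difference estimate of d/dt chart(curve(t)) at t = 0."""
    return (chart(curve(h)) - chart(curve(-h))) / (2 * h)

tangent_c = tangent(c)
tangent_d = tangent(d)
```

Both estimates are close to 1, so in this chart the two curves share a tangent vector at t = 0 even though they differ away from that instant.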
If we take the span of all the tangent vectors of all possible curves through a point
$q \in Q$ we get the tangent space of Q at the point q, denoted $T_qQ$, which is the vector
space $\mathbb{R}^n$.
If we now take all points in Q and bundle them together with all the attached tangent
spaces, we get the tangent bundle T Q. The tangent bundle is thus a 2n-dimensional
manifold that is suited to describe a system completely, and for example, since the
that point. The tangent bundle would then be the circle itself with lines coming out
of each point. This can be visualized as an infinite cylinder with the manifold at the
bottom and the cylinder itself being made up by all the tangent spaces.
Figure 2: The tangent lines of a circle are placed over the point of tangency in a
non-intersecting manner. All tangent lines are finally joined smoothly to make up the cylinder.
The tangent space to a point lying on a sphere is a tangent plane. The bundle in
Mathematically, $TQ$, $Q$ and the $T_qQ$'s form a fibre bundle, with each tangent space
$T_qQ$ called a fibre. In section 2.5 we explore this concept further.
2.5 Fibre bundles
A fibre bundle consists of four pieces of data, three topological spaces, such as manifolds,
\[ \pi : T \to B, \tag{2.8} \]
where B is called the base space, T is called the total space and F is called the fibre.
The map $\pi$ is called the bundle projection of the bundle, which locally is a continuous
surjection which projects any element of T to B. For every $q \in T$ there exists an open
surjection for every $q \in Q$. The bundle projection is thus also often called the submersion
or in words, the set of points in T whose projection through $\pi$ equals $r \in B$. This set
is called the fibre at r and when $\pi^{-1}(r)$ is homeomorphic to F for all $r \in B$, the total
space, the base space, the fibre and the submersion together form the fibre bundle.
Fibre bundles where the spaces B, T and F are vector spaces are called vector bundles.
To see that the tangent bundle $TQ$ is a fibre bundle, we call $TQ$ the total space and
\[ \pi : TQ \to Q \tag{2.10} \]
line $\mathbb{R}$, so that the total space is simply $S^1 \times \mathbb{R}$. An example of a nontrivial fibre bundle
has the same base space, the circle $S^1$, but instead of having the fibres make up a cylinder
we have them make up a Möbius strip, a sort of twisted cylinder such that going one
revolution around the strip reverses the sign. This fibre bundle locally resembles the
product space $S^1 \times \mathbb{R}$ but differs from it when considering what lies above the entire
circle [14]. Both these examples can be seen in figure 3.
Figure 3: To the left a trivial fibre bundle involving a cylinder and to the right a nontrivial
bundle involving a Möbius strip.
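\[ \omega_q : \underbrace{T_qQ \times \cdots \times T_qQ}_{k\ \text{factors}} \to \mathbb{R}, \tag{2.11} \]
where $\omega_q$ is linear in each argument.
\[ \omega_q = \sum_{i=1}^{n} f_i(q)\, dq^i \tag{2.12} \]
for any $q \in Q$ and some functions $f_i$, where $dq^i$ are the exterior derivatives (section 2.8)
of each coordinate.
\[ \omega_q = \sum_{i=1}^{n}\sum_{j=1}^{n} g_{ij}(q)\, dq^i \wedge dq^j \tag{2.14} \]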
for any $q \in Q$ and some anti-symmetric matrix of functions $g_{ij}$. The symbol $\wedge$ denotes
\[ \alpha \wedge \beta = (-1)^{kl}\, \beta \wedge \alpha. \tag{2.15} \]
2.8 Exterior derivative
The exterior derivative of a function $f(q^1, ..., q^n)$ is given in local coordinates by
\[ df = \sum_{i=1}^{n} \frac{\partial f}{\partial q^i}\, dq^i. \tag{2.17} \]
If a function is considered to be a differential zero-form, then for any k-form $\omega$, with k
a natural number, the exterior derivative is
is the sum
\[ d\omega_q = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial f_i}{\partial q^j}\, dq^j \wedge dq^i. \tag{2.20} \]
Comparing with (2.14) we see that the exterior derivative of a one-form is a two-form.
natural number, a (k + 1)-form that is the exterior derivative of a k-form is called exact
and a k-form whose exterior derivative is zero is called closed. It can also be shown that
all exact k-forms are also closed.
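As a concrete check of the last statement, here is a short worked example in two coordinates, written as x and y purely for readability. The function f and the one-form in the final line are illustrative choices and are not taken from the text.

```latex
% Hypothetical zero-form, chosen only for illustration
f(x, y) = x^2 y
% Its exterior derivative is an exact one-form
df = 2xy\,dx + x^2\,dy
% Taking the exterior derivative once more, with dx \wedge dx = dy \wedge dy = 0
d(df) = \frac{\partial (2xy)}{\partial y}\,dy \wedge dx
      + \frac{\partial (x^2)}{\partial x}\,dx \wedge dy
      = 2x\,dy \wedge dx + 2x\,dx \wedge dy = 0
% A one-form that is not closed, and therefore cannot be exact
\omega = x\,dy, \qquad d\omega = dx \wedge dy \neq 0
```

The cancellation in the third step relies only on the equality of mixed partial derivatives and the antisymmetry of the wedge product.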
Definition 5. The Jacobi-Lie bracket (sometimes called the Lie bracket) [X, Y] of
two vector fields X and Y acting on a function f is then defined by
\[ [X, Y](f) := X(Y(f)) - Y(X(f)). \tag{2.21} \]
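In local coordinates the definition reduces to the standard component formula $[X, Y]^k = \sum_j (X^j \partial Y^k/\partial q^j - Y^j \partial X^k/\partial q^j)$, because the second derivatives of f cancel. The sketch below evaluates this formula with central differences; the function names and the two example fields on a three-dimensional coordinate space (x, y, s) are assumptions made for illustration.

```python
def lie_bracket(X, Y, q, h=1e-5):
    """Components of [X, Y] at q: sum_j (X^j dY^k/dq^j - Y^j dX^k/dq^j)."""
    n = len(q)
    Xq, Yq = X(q), Y(q)

    def partial(F, j):
        """Central-difference derivative of the vector field F along q^j."""
        plus, minus = list(q), list(q)
        plus[j] += h
        minus[j] -= h
        Fp, Fm = F(plus), F(minus)
        return [(Fp[k] - Fm[k]) / (2 * h) for k in range(n)]

    dX = [partial(X, j) for j in range(n)]   # dX[j][k] = dX^k/dq^j
    dY = [partial(Y, j) for j in range(n)]   # dY[j][k] = dY^k/dq^j
    return [sum(Xq[j] * dY[j][k] - Yq[j] * dX[j][k] for j in range(n))
            for k in range(n)]

# Hypothetical example fields: X = d/dx and Y = d/dy - x d/ds.
def X(q):
    return [1.0, 0.0, 0.0]

def Y(q):
    return [0.0, 1.0, -q[0]]

bracket = lie_bracket(X, Y, [0.3, -1.2, 2.0])
```

For these two fields the bracket points along the negative s-direction, so it does not lie in the span of X and Y. Non-integrable velocity constraints behave in the same way.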
3 D'Alembert's principle
In this section the equations of motion for a mechanical system with n independent
generalized coordinates will be derived using the d'Alembert principle, similarly to how
it is done in [1].
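Starting from Newton's second law and splitting up the forces into applied forces $\mathbf{F}_i$
and constraint forces $\mathbf{f}^c_i$ according to
\[ \mathbf{F}_i + \mathbf{f}^c_i = \dot{\mathbf{p}}_i, \tag{3.1} \]
we take the sum of the virtual work (the work done by a force upon a virtual displacement
$\delta\mathbf{r}_i$, a displacement only in the coordinate but not in time, all consistent with the forces
\[ \sum_i (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i + \sum_i \mathbf{f}^c_i \cdot \delta\mathbf{r}_i = 0. \tag{3.2} \]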
Now we make our first assumption, which is that the virtual work of all constraint forces
vanishes (the virtual displacements must all satisfy any constraint equations). This is
true for many (but far from all) mechanical systems, such as rigid bodies. By applying
\[ \sum_i (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0. \tag{3.3} \]
This is called d'Alembert's principle. Since the $\delta\mathbf{r}_i$ are, in general, not independent of
with n independent $q^j$'s. The indices of q are given as an upper index to conform with
mathematical conventions and are not to be confused with powers of q. Introducing T
as the kinetic energy and V as the potential energy we can now define the Lagrangian
\[ L(q, \dot{q}; t) = T - V, \tag{3.5} \]
\[ \sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} - F_j^{(NP)} \right] \delta q^j = 0, \tag{3.6} \]
where $F_j^{(NP)}$ are all applied nonpotential forces acting on the system. For simplicity we
\[ F_j^{(NP)} = 0, \quad j = 1, ..., n, \tag{3.7} \]
When all n coordinates in (3.8) are independent of each other, the terms of the sum
in (3.8) must all equal zero individually, so the n equations needed to describe the
motion of the system are immediately obtained from (3.8). These are the Euler-Lagrange
equations,
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} = 0, \quad j = 1, ..., n. \tag{3.9} \]
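To see the Euler-Lagrange equations at work in the simplest possible setting, consider a single generalized coordinate x describing a mass m attached to a spring with stiffness k. This system is an illustrative assumption and is not one of the examples treated in the text.

```latex
% Illustrative Lagrangian of a one-dimensional harmonic oscillator
L(x, \dot{x}) = \tfrac{1}{2} m \dot{x}^2 - \tfrac{1}{2} k x^2
% The two ingredients of the Euler-Lagrange equation
\frac{\partial L}{\partial \dot{x}} = m \dot{x}, \qquad
\frac{\partial L}{\partial x} = -k x
% Inserting them gives the familiar equation of motion
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x}
  = m \ddot{x} + k x = 0
```

With only one coordinate and no constraints, the single equation obtained in this way is Newton's second law for the spring force.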
4 Constraints
In section 3 we saw that (3.8) gives us (3.9) when all coordinates are independent, or
In some cases it might be most practical to use these equations to eliminate all
dependent variables in the Lagrangian directly, and thus end up with a system of $n - p$
independent variables which can then be treated by (3.8).
In other cases it may be difficult to perform such eliminations, or the forces produced
by these constraints might be of interest. In such cases the constraints can be
incorporated into (3.8) and even be included in the Lagrangian itself. By d'Alembert's
principle, all the virtual displacements $\delta q$ in (3.8) must satisfy the constraint equations.
\[ \sum_{j=1}^{n} \frac{\partial f^a}{\partial q^j}\, \delta q^j = 0. \tag{4.4} \]
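\[ \lambda_a = \lambda_a(t), \tag{4.5} \]
which are called Lagrange multipliers, we may subtract the sum of all variations of the
constraints from the left hand side of (3.8) to obtain
\[ \sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} - \sum_{a=1}^{p} \lambda_a(t) \frac{\partial f^a}{\partial q^j}(q) \right] \delta q^j = 0. \tag{4.6} \]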
\[ s^1, s^2, ..., s^p \tag{4.7} \]
1 The constraint functions may also be dependent on time explicitly but for simplicity we disregard
this case. The following results may be extended to such cases, see [6].
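and choose the functions $\lambda_a$ such that
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^b} - \frac{\partial L}{\partial s^b} - \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial s^b} = 0, \quad b = 1, ..., p. \tag{4.9} \]
\[ \sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} - \frac{\partial L}{\partial r^\alpha} - \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha} \right] \delta r^\alpha = 0 \tag{4.10} \]
where the $\delta r^\alpha$'s are all independent and arbitrary. Each term must therefore individually
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha}, \quad \alpha = 1, ..., n - p. \tag{4.11} \]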
Together with the original constraint equations, expressed in the new labelling of coordinates, as
we now have n independent equations that completely describe the motion of the system.
In addition we also get, as a bonus, the constraint forces given by
\[ \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha}, \quad \alpha = 1, ..., n - p. \tag{4.13} \]
This also enables us, since the $f^a$'s are independent of $\dot{q}$, to introduce the modified
Lagrangian $\tilde{L}$, defined by
\[ \tilde{L} := L + \sum_{a=1}^{p} \lambda_a f^a \tag{4.14} \]
This indeed shows that the holonomic constraint equations can be added to the Lagrangian
with Lagrange multipliers, after which (4.15) can be used to get the constrained equations
and what this shows is that it does not matter whether we impose the constraints before
or after we take variations of the Lagrangian. This is however only true for holonomic
constraints, as we shall see in the next section.
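A standard illustration of the multiplier method for a holonomic constraint is the planar pendulum in Cartesian coordinates. The system below, with rod length l, mass m and a single multiplier written as λ, is an assumption made for this example and is not worked out in the text.

```latex
% Illustrative Lagrangian and holonomic constraint (rod of fixed length l)
L = \tfrac{1}{2} m (\dot{x}^2 + \dot{y}^2) - m g y, \qquad
f(x, y) = x^2 + y^2 - l^2 = 0
% Euler-Lagrange equations with the multiplier term on the right hand side
m \ddot{x} = \lambda \frac{\partial f}{\partial x} = 2 \lambda x, \qquad
m \ddot{y} + m g = \lambda \frac{\partial f}{\partial y} = 2 \lambda y
% The same equations follow from the modified Lagrangian L + \lambda f,
% since f does not depend on the velocities
```

The right hand sides, $(2\lambda x, 2\lambda y)$, form a vector along the rod, which is the constraint force. Together with the constraint itself this gives three equations for the three unknowns x, y and λ.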
time are present is well known and fairly straightforward. However, in some systems the
constraints are not expressible only in terms of coordinates, but also with coordinate
and the equations
In general there is no guarantee that constraints of this type are treatable with the
d'Alembert principle. In particular, when the $g^a$'s are nonlinear in the velocities we
cannot take variations of the constraints that are linear in the $\delta q$'s. In many mechanical
systems, however, the constraints only depend on position or are linear in velocities.
\[ g^a(q, \dot{q}) = \sum_{i=1}^{n} A^a_i(q)\, \dot{q}^i, \tag{4.18} \]
commonly occurring, for example in systems with rolling constraints. We again disregard
explicit time dependence and now also disregard constraints with an additional term
involving only the coordinates (affine constraints, see section 2.1). The following results
\[ g^a(q, \dot{q})\, dt = \sum_{i=1}^{n} A^a_i(q)\, dq^i, \tag{4.19} \]
\[ \sum_{i=1}^{n} A^a_i\, \delta q^i = 0, \quad a = 1, ..., p. \tag{4.20} \]
\[ \sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} - \sum_{a=1}^{p} \lambda_a A^a_j \right] \delta q^j = 0. \tag{4.21} \]
and proceed as in the holonomic case, denoting dependent and independent coordinates
with (4.7) and (4.8) and choosing the $\lambda_a$'s appropriately to get
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a A^a_\alpha, \quad \alpha = 1, ..., n - p, \tag{4.22} \]
In (4.22) the functions $A^a_\alpha$ have been reordered in the lower index so that
Equation (4.22) gives the correct equations of motion in terms of Lagrange multipliers,
and we may now ask whether it is possible to include the constraint functions
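in the Lagrangian, thus imposing the constraints before taking the variation, to form an
\[ \tilde{L} = L + \sum_{a=1}^{p} \mu_a g^a, \tag{4.25} \]
with $\mu_a$ instead of $\lambda_a$ to distinguish these functions from the ones used previously. Now
we get
\[ \frac{\partial \tilde{L}}{\partial \dot{q}^j} = \frac{\partial L}{\partial \dot{q}^j} + \sum_{a=1}^{p} \mu_a A^a_j, \tag{4.26} \]
with time derivative
\[ \frac{d}{dt}\frac{\partial \tilde{L}}{\partial \dot{q}^j} = \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} + \sum_{a=1}^{p} \left( \dot{\mu}_a A^a_j + \mu_a \sum_{i=1}^{n} \frac{\partial A^a_j}{\partial q^i}\, \dot{q}^i \right), \tag{4.27} \]
and
\[ \frac{\partial \tilde{L}}{\partial q^j} = \frac{\partial L}{\partial q^j} + \sum_{a=1}^{p} \sum_{i=1}^{n} \mu_a \frac{\partial A^a_i}{\partial q^j}\, \dot{q}^i. \tag{4.28} \]
\[ \sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} + \sum_{a=1}^{p} \dot{\mu}_a A^a_j + \sum_{a=1}^{p} \sum_{i=1}^{n} \mu_a \left( \frac{\partial A^a_j}{\partial q^i} - \frac{\partial A^a_i}{\partial q^j} \right) \dot{q}^i \right] \delta q^j = 0. \tag{4.29} \]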
needs to vanish in order to obtain the correct equations of motion (4.22). Since this
procedure works for holonomic constraints it might be expected that if the constraint
functions $g^a$ are simply the total time derivatives of some other functions depending
only on the coordinates, in other words, the constraints are really holonomic (see footnote 2), then
this term is zero for every j. It turns out that this is indeed the case, as shown in [6].
Generally these terms are not zero however, which shows that this procedure is generally
an incorrect approach.
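The extra term is built from the antisymmetric combination $\partial A_j/\partial q^i - \partial A_i/\partial q^j$ of the constraint coefficients. The sketch below compares this combination for two single constraints on coordinates (x, y, θ): one whose coefficients are the gradient of a function, and the knife-edge constraint $\dot{x}\sin\theta - \dot{y}\cos\theta = 0$. Both example constraints, the evaluation point and all function names are assumptions made for illustration.

```python
import math

def curl_matrix(A, q, h=1e-5):
    """Return C[j][i] = dA_j/dq^i - dA_i/dq^j at q, by central differences."""
    n = len(q)

    def dA(i):
        """Derivatives dA_j/dq^i for all j, along the coordinate q^i."""
        plus, minus = list(q), list(q)
        plus[i] += h
        minus[i] -= h
        Ap, Am = A(plus), A(minus)
        return [(Ap[j] - Am[j]) / (2 * h) for j in range(n)]

    grad = [dA(i) for i in range(n)]   # grad[i][j] = dA_j/dq^i
    return [[grad[i][j] - grad[j][i] for i in range(n)] for j in range(n)]

def holonomic(q):
    """Coefficients A_i = df/dq^i for the illustrative function f = x^2 + y^2."""
    x, y, theta = q
    return [2 * x, 2 * y, 0.0]

def knife_edge(q):
    """Coefficients of the constraint sin(theta) dx/dt - cos(theta) dy/dt = 0."""
    x, y, theta = q
    return [math.sin(theta), -math.cos(theta), 0.0]

q0 = [0.5, -0.2, 0.4]
largest_holonomic = max(abs(c) for row in curl_matrix(holonomic, q0) for c in row)
largest_knife_edge = max(abs(c) for row in curl_matrix(knife_edge, q0) for c in row)
```

For the gradient coefficients every entry vanishes up to rounding, while the knife-edge coefficients give entries of size cos θ and sin θ. This matches the statement that the extra term disappears when the constraint functions are total time derivatives but not in general.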
The failure of obtaining the correct equations of motion by including them in the
not work when dealing with nonholonomic constraints, since Hamilton's principle offers
no option to impose the constraints after taking the variation. In fact, imposing the
constraints after taking the variations on the Lagrangian is crucial for nonholonomic
mechanical systems and shows that imposing constraints before taking the variation,
2 This kind of constraint (where the linear velocity constraints are integrable, so that they can be
expressed in terms of only coordinates) is sometimes called semi-holonomic.
however useful in its proper context, is allowed only for holonomic systems. This proper
context has been pointed out by several authors on the subject ([1, p. 208], [3, p. 325])
to be optimal control systems, whereas the correct approach for mechanics is the one
5 Structure of the equations of motion
We now explore in more detail the structure of the equations determined by (4.21) of
nonholonomic systems with constraints linear in the velocities. The following passage
follows closely the path of [3, p. 325-326] but with added explanations and equations.
\[ q^1, q^2, ..., q^n \tag{5.1} \]
tion D on Q that is given by a linear subset of the tangent space (section 2.4) $T_qQ$ at
each point $q \in Q$.
Assume there are p constraint equations and that there is a choice of coordinates (see footnote 3)
such that $\dot{s}$ represents the dependent generalized velocities and $\dot{r}$ the independent ones.
The constraints can now be described locally by the p equations
\[ \dot{s}^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \dot{r}^\alpha, \quad a = 1, ..., p. \tag{5.3} \]
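\[ \sum_{j=1}^{n} \left( \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} \right) \delta q^j = 0. \tag{5.4} \]
Now we impose the constraints. Because of (5.3), and the d'Alembert principle, the
\[ \delta s^a + \sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \delta r^\alpha = 0, \quad a = 1, ..., p. \tag{5.5} \]
\[ \sum_{\alpha=1}^{n-p} \left( \frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} - \frac{\partial L}{\partial r^\alpha} \right) \delta r^\alpha + \sum_{a=1}^{p} \left( \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^a} - \frac{\partial L}{\partial s^a} \right) \left( -\sum_{\alpha=1}^{n-p} A^a_\alpha\, \delta r^\alpha \right) = 0. \tag{5.7} \]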
3 According to [1, p. 217] this choice of coordinates is always possible.
Rearranging, we get
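\[ \sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} - \frac{\partial L}{\partial r^\alpha} - \sum_{a=1}^{p} A^a_\alpha \left( \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^a} - \frac{\partial L}{\partial s^a} \right) \right] \delta r^\alpha = 0. \tag{5.8} \]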
This is one way to eliminate the Lagrange multipliers and a comparison with equation
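\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^a} - \frac{\partial L}{\partial s^a} = \lambda_a, \quad a = 1, ..., p. \tag{5.10} \]
Together with the constraint equations
\[ \dot{s}^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \dot{r}^\alpha, \quad a = 1, ..., p \tag{5.11} \]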
Lagrangian
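\[ \frac{\partial L}{\partial r^\alpha} = \frac{\partial L_c}{\partial r^\alpha} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\beta}{\partial r^\alpha}\, \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b}, \quad \alpha = 1, ..., n - p \tag{5.13} \]
\[ \frac{\partial L}{\partial s^a} = \frac{\partial L_c}{\partial s^a} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\beta}{\partial s^a}\, \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b}, \quad a = 1, ..., p \tag{5.14} \]
and
\[ \frac{\partial L}{\partial \dot{r}^\alpha} = \frac{\partial L_c}{\partial \dot{r}^\alpha} + \sum_{b=1}^{p} A^b_\alpha \frac{\partial L}{\partial \dot{s}^b}, \quad \alpha = 1, ..., n - p. \tag{5.15} \]
The time derivative of (5.15) is
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} = \frac{d}{dt}\frac{\partial L_c}{\partial \dot{r}^\alpha} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \left[ \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right] \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b} + \sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^a}. \tag{5.16} \]
Inserting (5.13), (5.14) and (5.16) into (5.9), the left hand side becomes
\[ \begin{aligned} & \frac{d}{dt}\frac{\partial L_c}{\partial \dot{r}^\alpha} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \left[ \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right] \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b} \\ & + \sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^a} - \frac{\partial L_c}{\partial r^\alpha} - \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\beta}{\partial r^\alpha}\, \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b} \end{aligned} \tag{5.17} \]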
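and the right hand side becomes
\[ \sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^a} - \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} - \sum_{a=1}^{p} \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a}\, \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b}. \tag{5.18} \]
The term $\sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot{s}^a}$ cancels out and by grouping all terms with $\dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b}$
we write
\[ \begin{aligned} & \frac{d}{dt}\frac{\partial L_c}{\partial \dot{r}^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} = \\ & = -\sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \left[ \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} \right) + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \right] \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b} \end{aligned} \tag{5.19} \]
or
\[ \frac{d}{dt}\frac{\partial L_c}{\partial \dot{r}^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} = -\sum_{b=1}^{p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b}, \quad \alpha = 1, ..., n - p \tag{5.20} \]
where
\[ B^b_{\alpha\beta} := \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} \right) + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right). \tag{5.21} \]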
5.4 Interpretation
In section 5.3, the equations of motion were rewritten in terms of the constrained La-
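\[ \omega^b = ds^b + \sum_{\alpha=1}^{n-p} A^b_\alpha(r, s)\, dr^\alpha, \quad b = 1, ..., p. \tag{5.22} \]
Taking the exterior derivative (section 2.8) of the $\omega^b$'s, we get the two-forms
\[ d\omega^b = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\alpha}{\partial r^\beta}\, dr^\beta \wedge dr^\alpha + \sum_{\alpha=1}^{n-p} \sum_{a=1}^{p} \frac{\partial A^b_\alpha}{\partial s^a}\, ds^a \wedge dr^\alpha. \tag{5.23} \]
\[ ds^b = -\sum_{\alpha=1}^{n-p} A^b_\alpha(r, s)\, dr^\alpha \tag{5.24} \]
Now, using the fact that the wedge product is alternating bilinear (section 2.7) we have
\[ dr^\beta \wedge dr^\alpha (\dot{r}, \delta r) = \dot{r}^\beta\, \delta r^\alpha - \dot{r}^\alpha\, \delta r^\beta \tag{5.26} \]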
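and thus
\[ \begin{aligned} d\omega^b(\dot{r}, \delta r) = {} & \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \dot{r}^\beta\, \delta r^\alpha \\ & - \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \dot{r}^\alpha\, \delta r^\beta. \end{aligned} \tag{5.27} \]
Changing the role of the summation variables $\alpha$ and $\beta$ in the second term we get
\[ d\omega^b(\dot{r}, \delta r) = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left[ \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} \right) + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \right] \dot{r}^\beta\, \delta r^\alpha \tag{5.28} \]
or
\[ d\omega^b(\dot{r}, \delta r) = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, \dot{r}^\beta\, \delta r^\alpha, \quad b = 1, ..., p. \tag{5.29} \]
\[ \delta L_c = \sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L_c}{\partial \dot{r}^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} \right] \delta r^\alpha = -\sum_{b=1}^{p} \frac{\partial L}{\partial \dot{s}^b}\, d\omega^b(\dot{r}, \delta r). \tag{5.30} \]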
This equation isolates the effects of the constraints and also shows that when $d\omega = 0$ the
right hand side of (5.29) vanishes. This is the case when the constraints are integrable
(holonomic), that is when the one-form $\omega$ is closed (section 2.8), once again confirming
what can be seen in equation (4.29). This equation thus also provides an alternate way
to obtain the equations of motion for holonomic systems, especially when the constraints
are initially given as functions linear in velocities, by substituting the constraints into
the Lagrangian and setting the variation of Lc to zero [1]. In cases where the constraints
truly are nonholonomic however, there are additional forces generated by the constraints,
given by the right hand side of (5.30). In sections 5.6 and 5.7 we shall explore how these
defined as
\[ p_k := \frac{\partial L}{\partial \dot{r}^k}. \tag{5.31} \]
This also means that for a holonomic system the equations of motion are given by (4.11)
\[ \dot{p}_\alpha - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha}, \quad \alpha = 1, ..., n - p. \tag{5.32} \]
Now, if the Lagrangian L along with all the constraint equations f is independent of a
certain coordinate $r^k$, then $r^k$ is said to be cyclic. For such a coordinate, (5.32) becomes
\[ \dot{p}_k = 0, \tag{5.33} \]
which means that
\[ p_k = \text{constant}, \tag{5.34} \]
there are several other examples (for instance in [8, p. 568]) where similar statements
imply that this conservation law always holds. However, this is only true for holonomic
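\[ p_k := \frac{\partial L_c}{\partial \dot{r}^k}, \tag{5.35} \]
so when inserting this into (5.20) we get
\[ \dot{p}_\alpha - \frac{\partial L_c}{\partial r^\alpha} = -\sum_{b=1}^{p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, \dot{r}^\beta \frac{\partial L}{\partial \dot{s}^b} - \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a}, \quad \alpha = 1, ..., n - p. \tag{5.36} \]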
and we can clearly see, even with this last assumption, that the conjugate momentum of a
cyclic variable is generally not conserved for nonholonomic systems. Further discussion
The idea of an Ehresmann connection (section 2.10) is to think of the tangent space
$T_qQ$ at the point $q \in Q$ as the sum of a horizontal part and a vertical part, where the
horizontal part of the space is exactly the subspace of $T_qQ$ defined by the constraints.
where the $\dot{r}^\alpha$'s and $\dot{s}^a$'s are to be interpreted only as a notation for the values of the
elements of u and are not to be confused with the time derivatives of the elements of
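We now define a fibre bundle (section 2.5) with total space Q and base space that
we will call R, that is locally the space of the r-variables. Thus we have a submersion
\[ \begin{pmatrix} r^1 \\ r^2 \\ \vdots \\ r^{n-p} \\ s^1 \\ s^2 \\ \vdots \\ s^p \end{pmatrix} \mapsto \begin{pmatrix} r^1 \\ r^2 \\ \vdots \\ r^{n-p} \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{5.40} \]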
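The tangent space of the base space at each point is then the space
\[ T_qR = \mathrm{span}\left\{ \frac{\partial}{\partial r^1}, \frac{\partial}{\partial r^2}, ..., \frac{\partial}{\partial r^{n-p}} \right\}. \tag{5.41} \]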
The projection from $T_qQ$ to $T_qR$ is called $T_q\pi$ and the kernel of $T_q\pi$ is what we call the
The space $T_qR$ is now a possible choice of horizontal space in order to have the tangent
space $T_qQ$ be a sum of the vertical and horizontal parts. This choice of horizontal space
does not, however, fulfil our second wish, that the horizontal space also is the subspace
\[ A : T_qQ \to V_q \tag{5.43} \]
by
\[ A_q(u) := \sum_{b=1}^{p} \omega^b(u)\, \frac{\partial}{\partial s^b}. \tag{5.44} \]
\[ A_q(u) = \sum_{b=1}^{p} \left( \dot{s}^b + \sum_{\alpha=1}^{n-p} A^b_\alpha\, \dot{r}^\alpha \right) \frac{\partial}{\partial s^b}. \tag{5.45} \]
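Written in the basis B, the resulting vector is
\[ a := A_q(u) = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \dot{s}^1 + \sum_{\alpha=1}^{n-p} A^1_\alpha\, \dot{r}^\alpha \\ \dot{s}^2 + \sum_{\alpha=1}^{n-p} A^2_\alpha\, \dot{r}^\alpha \\ \vdots \\ \dot{s}^p + \sum_{\alpha=1}^{n-p} A^p_\alpha\, \dot{r}^\alpha \end{pmatrix} \tag{5.46} \]
so that
\[ A_q(a) = \sum_{b=1}^{p} \left( \dot{s}^b + \sum_{\alpha=1}^{n-p} A^b_\alpha\, \dot{r}^\alpha \right) \frac{\partial}{\partial s^b} = a. \tag{5.47} \]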
\[ \dot{s}^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha\, \dot{r}^\alpha, \quad a = 1, ..., p. \tag{5.49} \]
This means that even if $T_qQ$ is the total space for velocity vectors at a point q, any
allowed velocity lies in the subspace $H_q$, determined by the Ehresmann connection that
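and the horizontal projection, written $u_H$, is given by
\[ \begin{pmatrix} \dot{r}^1 \\ \dot{r}^2 \\ \vdots \\ \dot{r}^{n-p} \\ \dot{s}^1 \\ \dot{s}^2 \\ \vdots \\ \dot{s}^p \end{pmatrix} \mapsto \begin{pmatrix} \dot{r}^1 \\ \dot{r}^2 \\ \vdots \\ \dot{r}^{n-p} \\ -\sum_{\alpha=1}^{n-p} A^1_\alpha\, \dot{r}^\alpha \\ -\sum_{\alpha=1}^{n-p} A^2_\alpha\, \dot{r}^\alpha \\ \vdots \\ -\sum_{\alpha=1}^{n-p} A^p_\alpha\, \dot{r}^\alpha \end{pmatrix}. \tag{5.52} \]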
From this it is easily checked that adding up the two projections gives us back the
5.7 Curvature
The curvature B of the connection A, with respect to two vector fields X and Y on Q,
is dened as
where the bracket on the right hand side of (5.54) is the Jacobi-Lie bracket (section 2.9).
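\[ X = \begin{pmatrix} X^1 \\ X^2 \\ \vdots \\ X^n \end{pmatrix} = \begin{pmatrix} \dot{r}_x^1 \\ \vdots \\ \dot{r}_x^{n-p} \\ \dot{s}_x^1 \\ \vdots \\ \dot{s}_x^p \end{pmatrix}, \quad Y = \begin{pmatrix} Y^1 \\ Y^2 \\ \vdots \\ Y^n \end{pmatrix} = \begin{pmatrix} \dot{r}_y^1 \\ \vdots \\ \dot{r}_y^{n-p} \\ \dot{s}_y^1 \\ \vdots \\ \dot{s}_y^p \end{pmatrix}. \tag{5.55} \]
The horizontal projections of the vector fields, written in the basis B as before, are then
\[ X_H = \sum_{\alpha=1}^{n-p} \dot{r}_x^\alpha \frac{\partial}{\partial r^\alpha} - \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \dot{r}_x^\alpha \frac{\partial}{\partial s^a} \tag{5.56} \]
and
\[ Y_H = \sum_{\alpha=1}^{n-p} \dot{r}_y^\alpha \frac{\partial}{\partial r^\alpha} - \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \dot{r}_y^\alpha \frac{\partial}{\partial s^a}. \tag{5.57} \]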
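Let $X_H^k$ and $Y_H^k$ denote the k'th row of $X_H$ and $Y_H$ respectively. Now we have
\[ \begin{aligned} [X_H, Y_H]^k = {} & \sum_{\alpha=1}^{n-p} \dot{r}_x^\alpha \frac{\partial Y_H^k}{\partial r^\alpha} - \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha\, \dot{r}_x^\alpha \frac{\partial Y_H^k}{\partial s^a} - \sum_{\alpha=1}^{n-p} \dot{r}_y^\alpha \frac{\partial X_H^k}{\partial r^\alpha} \\ & + \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha\, \dot{r}_y^\alpha \frac{\partial X_H^k}{\partial s^a}, \quad k = 1, ..., n. \end{aligned} \tag{5.58} \]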
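\[ \frac{\partial Y_H^k}{\partial s^a} = \begin{cases} 0, & \text{for } k = 1, ..., n - p \\ -\sum_{\beta=1}^{n-p} \frac{\partial A^{k-(n-p)}_\beta}{\partial s^a}\, \dot{r}_y^\beta, & \text{for } k = (n - p + 1), ..., n, \end{cases} \tag{5.60} \]
\[ \frac{\partial X_H^k}{\partial r^\alpha} = \begin{cases} 0, & \text{for } k = 1, ..., n - p \\ -\sum_{\beta=1}^{n-p} \frac{\partial A^{k-(n-p)}_\beta}{\partial r^\alpha}\, \dot{r}_x^\beta, & \text{for } k = (n - p + 1), ..., n \end{cases} \tag{5.61} \]
and
\[ \frac{\partial X_H^k}{\partial s^a} = \begin{cases} 0, & \text{for } k = 1, ..., n - p \\ -\sum_{\beta=1}^{n-p} \frac{\partial A^{k-(n-p)}_\beta}{\partial s^a}\, \dot{r}_x^\beta, & \text{for } k = (n - p + 1), ..., n. \end{cases} \tag{5.62} \]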
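$$ [X_H, Y_H] = \sum_{b=1}^{p}\sum_{\alpha=1}^{n-p}\sum_{\beta=1}^{n-p}\left[\frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} + \sum_{a=1}^{p}\left(A^a_\alpha\,\frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta\,\frac{\partial A^b_\alpha}{\partial s^a}\right)\right] r_x^\alpha\,r_y^\beta\,\frac{\partial}{\partial s^b} \tag{5.63} $$
or
$$ [X_H, Y_H] = \sum_{b=1}^{p}\sum_{\alpha=1}^{n-p}\sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\,r_x^\alpha\,r_y^\beta\,\frac{\partial}{\partial s^b}. \tag{5.64} $$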
$$ [X_H, Y_H] = \sum_{b=1}^{p}\sum_{\alpha=1}^{n-p}\sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\,X^\alpha\,Y^\beta\,\frac{\partial}{\partial s^b}. \tag{5.65} $$
Since this vector field is already in the vertical space $V_q$, and since A is a projection to
$$ d\omega^b(X_H, Y_H) = \sum_{\alpha=1}^{n-p}\sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\,X^\alpha\,Y^\beta, \quad b = 1, \dots, p. \tag{5.67} $$
The conclusion is that the curvature B of the Ehresmann connection A, acting on two
and that it can be seen as a measure of the failure of the constraints to be integrable [1,
$$ B(X, Y)^b = \sum_{\alpha=1}^{n-p}\sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\,X^\alpha\,Y^\beta, \quad b = 1, \dots, p, \tag{5.69} $$
with the coefficients $B^b_{\alpha\beta}$ given by (5.21).
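The coefficient formula can be evaluated symbolically. The sketch below assumes that the coefficients have the form restated in (7.10), that is $B^b_{\alpha\beta} = \partial A^b_\alpha/\partial r^\beta - \partial A^b_\beta/\partial r^\alpha + \sum_a (A^a_\alpha\,\partial A^b_\beta/\partial s^a - A^a_\beta\,\partial A^b_\alpha/\partial s^a)$. The function name and the two toy connections are hypothetical and chosen only to contrast a nonintegrable constraint with an integrable one.

```python
import sympy as sp


def curvature_coefficients(A, r, s):
    """Return B[b][alpha][beta] for a connection matrix A[b, alpha] (illustrative helper).

    A is a sympy Matrix of shape (p, n - p); r and s are lists of coordinate symbols.
    """
    p, m = len(s), len(r)
    B = [[[sp.Integer(0)] * m for _ in range(m)] for _ in range(p)]
    for b in range(p):
        for al in range(m):
            for be in range(m):
                expr = sp.diff(A[b, al], r[be]) - sp.diff(A[b, be], r[al])
                for a in range(p):
                    expr += (A[a, al] * sp.diff(A[b, be], s[a])
                             - A[a, be] * sp.diff(A[b, al], s[a]))
                B[b][al][be] = sp.simplify(expr)
    return B


r1, r2, s1 = sp.symbols("r1 r2 s1", real=True)

# Toy constraint s1_dot = -r2 * r1_dot: not the time derivative of any function.
B_nonint = curvature_coefficients(sp.Matrix([[r2, 0]]), [r1, r2], [s1])

# Toy constraint s1_dot = -(r2 * r1_dot + r1 * r2_dot), i.e. s1 + r1 * r2 = const.
B_int = curvature_coefficients(sp.Matrix([[r2, r1]]), [r1, r2], [s1])
```

In the first case the coefficients $B^1_{12} = 1$ and $B^1_{21} = -1$ are nonzero, while in the second case all coefficients vanish, in line with the reading of the curvature as a measure of nonintegrability.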
6 An example: The falling disc
Now we try to make use of the method developed in the previous sections in an example, a
plane. A similar, but not identical, model was explored as early as 1892 by Vierkandt
and assume that the disc is not allowed to slip as it rolls and rotates on the surface.
for these types of systems, but for our purpose in this text it is deemed to be realistic
X, Y, Z. (6.1)
The rotation of the disc in its rolling direction measured counter clockwise by the angle
, (6.2)
the rotation about the disc's z -axis measured from the x-axis counter clockwise by
, (6.3)
. (6.4)
The kinetic energy of the system is
1 1
T = m(X 2 + Y 2 + Z 2 ) + (I1 2 + I2 2 + I3 2 ) (6.5)
2 2
with I1 , I2 and I3 the moments of inertia of the disc about each corresponding axis from
$$ I_1 = \frac{1}{2} m R^2, \tag{6.6} $$
corresponding to the change in rotation measured by ,
$$ I_2 = \frac{1}{4} m R^2, \tag{6.7} $$
corresponding to , and
$$ I_3 = \frac{1}{4} m R^2, \tag{6.8} $$
corresponding to the angle . A more detailed description of how these values can be
calculated can be found in any book on basic mechanics, such as [13, pp. 663–690].
V = mgZ, (6.9)
x, y (6.11)
for this purpose, the coordinates X, Y and Z can now be expressed as functions of these
new coordinates and the angles and . We now have
and
Z = R cos . (6.14)
What this actually means is that the position of the centre of mass is known as soon
as we know the point of contact, the heading angle and the inclination of the disc.
These constraints are all holonomic so we can eliminate them directly by inserting the
equations into the Lagrangian. To be able to do this we also need the time derivatives,
X = x R cos sin + sin cos , (6.15)
Y = y R sin sin cos cos , (6.16)
and
Z = R sin() (6.17)
We also calculate
X 2 = x2 2xR cos sin + sin cos +
+ R2 cos2 sin2 2 + 2 cos sin cos sin + sin2 cos2 2 , (6.18)
Y 2 = y 2 2yR sin sin cos cos +
+ R2 sin2 sin2 2 2 cos sin cos sin + cos2 cos2 2 (6.19)
and
Z 2 = R2 sin2 2 . (6.20)
Then we have
X 2 + Y 2 + Z 2 = x2 + y 2 + R2 2 + R2 sin2 2
h i
2R sin (x cos + y sin ) + cos (x sin y cos ) . (6.21)
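The algebra leading to (6.21) can be checked symbolically. The angle symbols did not survive in this copy of the text, so the sketch below uses the assumed names `phi` for the heading angle and `psi` for the tilt angle, and assumes the centre of mass position $X = x - R\sin\varphi\sin\psi$, $Y = y + R\cos\varphi\sin\psi$, $Z = R\cos\psi$, which reproduces the structure of the time derivatives (6.15) to (6.17). The velocity symbols `xd`, `yd`, `phid`, `psid` are likewise names chosen for this example.

```python
import sympy as sp

# Coordinates and their velocities as independent symbols (assumed names).
x, y, phi, psi, R = sp.symbols("x y phi psi R", real=True)
xd, yd, phid, psid = sp.symbols("xd yd phid psid", real=True)

coords = [x, y, phi, psi]
vels = [xd, yd, phid, psid]


def time_derivative(f):
    """Chain rule: df/dt = sum over coordinates of (df/dq) * q_dot."""
    return sum(sp.diff(f, q) * v for q, v in zip(coords, vels))


# Assumed centre of mass position in terms of contact point, heading and tilt.
X = x - R * sp.sin(phi) * sp.sin(psi)
Y = y + R * sp.cos(phi) * sp.sin(psi)
Z = R * sp.cos(psi)

lhs = sum(time_derivative(c) ** 2 for c in (X, Y, Z))

# Right hand side with the structure of (6.21).
rhs = (xd**2 + yd**2 + R**2 * psid**2 + R**2 * sp.sin(psi)**2 * phid**2
       - 2 * R * (sp.sin(psi) * phid * (xd * sp.cos(phi) + yd * sp.sin(phi))
                  + sp.cos(psi) * psid * (xd * sp.sin(phi) - yd * sp.cos(phi))))

difference = sp.simplify(sp.expand(lhs - rhs))
```

Under these assumptions `difference` simplifies to zero, so the squared speed of the centre of mass has the stated form.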
5R2 2 R2 2
m 2 1
L= x + y 2 + + R2 + sin2 2 + 2gR cos F , (6.22)
2 4 4 2
where h i
F = 2R sin (x cos + y sin ) + cos (x sin y cos ) . (6.23)
$$ Q = \mathbb{R}^2 \times S^1 \times S^1 \times S^1_+ \tag{6.24} $$
where
and $S^1$ is the one-sphere, the set of all points on a circle, and it is therefore the natural
space for each of the angles and . The set of points $S^1_+$ is defined as
and is the set of points of the upper half of a circle, which is to be used as the space for
the angle .
x = R cos (6.27)
and
y = R sin . (6.28)
To be able to write these constraints in the language of the approach we have developed, we let
and
r = [, , ]T (6.30)
and
1 A11
B12 = = (R cos ) = R sin , (6.35)
2 A21
B12 = = (R sin ) = R cos , (6.36)
1 A11
B21 = = (R cos ) = R sin (6.37)
and
2 A21
B21 = = (R sin ) = R cos , (6.38)
with all other B 's zero.
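These coefficients can be reproduced with a short symbolic computation. The angle symbols are missing in this copy, so the sketch assumes the names `theta` for the rolling angle, `phi` for the heading angle and `psi` for the tilt angle, with $r = [\theta, \varphi, \psi]^T$ and $s = [x, y]^T$. It also assumes that the rolling constraints read $\dot x = R\dot\theta\cos\varphi$ and $\dot y = R\dot\theta\sin\varphi$, so that in the convention $\dot s^a = -\sum_\alpha A^a_\alpha \dot r^\alpha$ the only nonzero connection coefficients are $A^1_1 = -R\cos\varphi$ and $A^2_1 = -R\sin\varphi$.

```python
import sympy as sp

theta, phi, psi, R = sp.symbols("theta phi psi R", real=True)
r = [theta, phi, psi]  # assumed ordering: rolling, heading, tilt

# Connection coefficients A[a, alpha]; rows a = 1, 2 (x, y), columns alpha = 1, 2, 3.
A = sp.Matrix([
    [-R * sp.cos(phi), 0, 0],
    [-R * sp.sin(phi), 0, 0],
])


def B(b, alpha, beta):
    """Curvature coefficient for a connection that does not depend on s (0-based indices)."""
    return sp.diff(A[b, alpha], r[beta]) - sp.diff(A[b, beta], r[alpha])


# Collect the nonzero coefficients, keyed by 1-based indices (b, alpha, beta).
nonzero = {}
for b in range(2):
    for alpha in range(3):
        for beta in range(3):
            value = sp.simplify(B(b, alpha, beta))
            if value != 0:
                nonzero[(b + 1, alpha + 1, beta + 1)] = value
```

With these assumptions exactly four coefficients are nonzero, with index combinations (1,1,2), (2,1,2), (1,2,1) and (2,2,1), and each is antisymmetric in its two lower indices.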
Remembering that the constrained Lagrangian is the Lagrangian where all velocities
fulfil the nonholonomic constraint equations, we insert (6.27) and (6.28) into the
Lagrangian. First we see that when inserting the constraints into the function F we get
h i
F = 2R2 sin (cos2 + sin2 ) + cos (cos sin sin cos ) =
and
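$$ \frac{\partial L_c}{\partial s^2} = \frac{\partial L_c}{\partial y} = 0, \tag{6.43} $$
equation (5.20) becomes
$$ \frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{b=1}^{2}\sum_{\beta=1}^{3} B^b_{\alpha\beta}\,\dot r^\beta\,\frac{\partial L}{\partial \dot s^b} = 0, \quad \alpha = 1, 2, 3 \tag{6.44} $$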
with
[r1 , r2 , r3 ]T = [, , ]T (6.45)
and
L
= mx mR sin cos + cos sin (6.52)
x
and
L
= my mR sin sin cos cos . (6.53)
y
Now we can calculate
1 L 2 L
B12 + B12 = mR(xsin ycos) mR2 cos (6.54)
x y
and
1 L 2 L
B21 + B21 = mR2 cos mR(xsin ycos). (6.55)
x y
Inserting into (6.44) gives us
3
mR2 sin 2 cos + mR (x sin y cos ) = 0, (6.56)
2
2 1 2
mR + sin + sin(2) sin mR (x sin y cos ) = 0, (6.57)
4
and
2 5 1 2 g
mR sin(2) + cos sin = 0. (6.58)
4 2 R
Using the constraint equations (6.27) and (6.28) we see that
2h i
sin + 2 cos = 0, (6.60)
3
sin(2) sin
+ 1 2
= 0, (6.61)
4 + sin
and
4 1 g
sin(2)2 cos + sin = 0. (6.62)
5 2 R
Equations (6.60), (6.61) and (6.62) give the complete motion of the system.
of and ), so we can check whether their associated momenta are conserved, with the
and
so here we see that the conjugate momenta are not conserved. This is consistent with
we can set = 0 and = 0, so that the rolling velocity and turning velocity are both
= 0, (6.65)
which tells us that we either have a constant direction of rolling or a constant tilt
angle . If we assume the direction is constant, equation (6.62) then gives the solution
4g
sin = 0. (6.66)
5R
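The instability implied by (6.66) can be illustrated numerically. The sketch below writes the equation as $\ddot\psi = \frac{4g}{5R}\sin\psi$, where `psi` is an assumed name for the tilt angle and the sign is chosen so that the acceleration has the same sign as the angle. The integrator, the parameter values and the function names are choices made for this example only.

```python
import math


def tilt_acceleration(psi, g, R):
    """Right hand side of psi_ddot = (4 g / (5 R)) * sin(psi)."""
    return 4.0 * g / (5.0 * R) * math.sin(psi)


def integrate_tilt(psi0, psidot0, g, R, t_end, dt=1e-3):
    """Integrate the tilt equation with a classical fourth-order Runge-Kutta scheme."""
    def f(state):
        psi, v = state
        return (v, tilt_acceleration(psi, g, R))

    state = (psi0, psidot0)
    for _ in range(int(round(t_end / dt))):
        k1 = f(state)
        k2 = f((state[0] + 0.5 * dt * k1[0], state[1] + 0.5 * dt * k1[1]))
        k3 = f((state[0] + 0.5 * dt * k2[0], state[1] + 0.5 * dt * k2[1]))
        k4 = f((state[0] + dt * k3[0], state[1] + dt * k3[1]))
        state = (
            state[0] + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            state[1] + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
        )
    return state


g, R = 9.81, 0.1      # illustrative values: m/s^2 and a disc radius of 0.1 m
psi0, t_end = 0.01, 0.2

psi_end, psidot_end = integrate_tilt(psi0, 0.0, g, R, t_end)

# Small-angle comparison: psi(t) is close to psi0 * cosh(sqrt(4 g / (5 R)) * t).
psi_linear = psi0 * math.cosh(math.sqrt(4.0 * g / (5.0 * R)) * t_end)

# Starting exactly upright and at rest, the disc stays upright.
upright = integrate_tilt(0.0, 0.0, g, R, t_end)
```

A disc released at rest with a small tilt moves away from the upright position at a rate close to the exponential growth of the linearized equation, while the exactly upright state stays fixed.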
Looking at equation (6.66) we see that the acceleration of the tilt angle will always
have the same sign as the angle itself. It follows that the only (unstable) equilibrium
On the other hand, if we assume the tilt angle is constant, then equation (6.62)
s
4g sin2
, (6.68)
R cos
which gives the lower limit of the rolling velocity as a function of the constant value
of .
Another example is to set = 0, so that the tilt angle is constant. Here, equation
1
+ sin2
4
= . (6.70)
sin
3
sin2 = , (6.71)
4
which of course has no real solutions. Thus we must assume that
= = 0, (6.72)
In summary we can say that constant velocities for rolling and turning give possible
solutions with the tilt angle either constant or nonconstant, but if we start from an
assumption of constant tilt angle, then we only have solutions where both turning and
In both of these examples there are possible solutions where the equations (6.63) and
(6.64) predict conserved conjugate momenta for both and/or , which is also what we
7 Results and conclusion
We started from Newton's second law and used d'Alembert's principle to arrive at
the equation
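$$ \sum_{j=1}^{n}\left(\frac{d}{dt}\frac{\partial L}{\partial \dot q^j} - \frac{\partial L}{\partial q^j}\right)\delta q^j = 0. \tag{7.1} $$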
We argued that when all n generalized coordinates are independent of each other,
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q^j} - \frac{\partial L}{\partial q^j} = 0, \quad j = 1, \dots, n. \tag{7.2} $$
We also saw that when some of the coordinates are dependent on each other, their
us the equations
X f a p
d L L
= a , r = 1, ..., n p. (7.3)
dt r r a=1
r
We also showed that, when the constraints are holonomic, they can be added to the
imposed before taking the variations to the Lagrangian, consistent with the approach
Next we showed that this procedure fails when applied to nonholonomic constraints,
and that the proper equations of motion are obtained only by imposing the constraints
after taking the variations, as can only be done by d'Alembert's principle. The equations
$$ g^a(q, \dot q) := \sum_{i=1}^{n} A^a_i(q)\,\dot q^i = 0, \tag{7.4} $$
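$$ \frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a A^a_\alpha, \quad \alpha = 1, \dots, n-p. \tag{7.5} $$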
In particular, we showed that when the constraint equations were written in the form
$$ \dot s^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\,\dot r^\alpha, \quad a = 1, \dots, p, \tag{7.6} $$
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a} = \lambda_a, \quad a = 1, \dots, p, \tag{7.7} $$
so that the equations of motion become
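$$ \frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p}\left(\frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a}\right) A^a_\alpha, \quad \alpha = 1, \dots, n-p. \tag{7.8} $$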
Further, we introduced the concept of the constrained Lagrangian $L_c$, which is the
Lagrangian defined only on the horizontal space of TQ, or more explicitly, defined only
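$$ \frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha\,\frac{\partial L_c}{\partial s^a} = -\sum_{b=1}^{p}\sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\,\dot r^\beta\,\frac{\partial L}{\partial \dot s^b}, \quad \alpha = 1, \dots, n-p \tag{7.9} $$
where
$$ B^b_{\alpha\beta} := \left(\frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} + \sum_{a=1}^{p}\left(A^a_\alpha\,\frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta\,\frac{\partial A^b_\alpha}{\partial s^a}\right)\right). \tag{7.10} $$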
By rewriting the right hand side to be in terms of the exterior derivative of the two
form , which was the constraint function, we could show that the results were consistent
with holonomic constraints and that these extra terms involving $B^b_{\alpha\beta}$, generated
gent space Tq Q of each point qQ and it is a powerful way to describe the constraints,
We also showed that the conjugate momentum of a cyclic variable is generally not
Finally we showed how to use some of the above mentioned results in an example.
The equations of motion for a coin, rolling without slipping on a plane surface, were
calculated to be
2h i
sin + 2 cos = 0, (7.11)
3
sin(2) sin
+ 1 2
= 0, (7.12)
4 + sin
and
4 1 2 g
sin(2) cos + sin = 0. (7.13)
5 2 R
The equations obtained were also consistent with the earlier conclusions that conju-
In conclusion it should also be said that, while being a fairly instructive example
showing how to use the developed equations, the example of the falling disc that was
considered in section 6 is arguably not very hard to solve even without the help of the
above formulas, simply with the use of Newtonian mechanics. There are of course other
examples when the Newton approach becomes too cumbersome or impractical and where
the above approach would offer more of an advantage. More advantages still are to be
gained by developing the geometrical approach even further than was done in this text,
References
[1] A.M. Bloch. Nonholonomic Mechanics and Control. Interdisciplinary applied mathematics: Systems and control. Springer, 2003.
chanical systems with symmetry. Archive for Rational Mechanics and Analysis,
136:21–99, 1996.
[3] A.M. Bloch, J.E. Marsden, and D.V. Zenkov. Nonholonomic dynamics. Notices of
the AMS, 52(3):324–333, 2005.
[4] H. Cendra, A. Ibort, M. de León, and D. Martín de Diego. A generalization of
2002.
[8] J.V. José and E.J. Saletan. Classical Dynamics: A Contemporary Approach. Cam-
[9] W.S. Koon and J.E. Marsden. The hamiltonian and lagrangian approaches to the
1997.
Problems der rollenden Bewegung, über die Theorie dieser Bewegung, und ins-
[14] R. Penrose. The Road to Reality: A Complete Guide to the Laws of the Universe.
Vintage Series. Vintage Books, 2007.
[15] R. Sjamaar. Manifolds and differential forms. Lecture notes from Cornell University,
2006.
[16] A. Vierkandt. Dritter Abschnitt: Das Rollen und Gleiten einer ebenen Fläche,
[18] D.V. Zenkov, A.M. Bloch, and J.E. Marsden. The energy-momentum method