Nonholonomic Form

Faculty of Technology and Science
Mattias Flygare
Holonomic versus nonholonomic constraints
Physics
C-level thesis
Date: 2012-05-30
Supervisor: Jürgen Fuchs
Examiner: Claes Uggla
Karlstads universitet 651 88 Karlstad

Abstract
Courses in analytical mechanics for undergraduate students are often limited
to treatment of holonomic constraints, which are constraints on coordinates. The
concept of nonholonomic constraints, constraints on velocities, is usually only
mentioned briefly and it is easy to get a wrong idea of what they are and how to treat
them. This text explains and compares the methods of deriving the Euler-Lagrange
equations and the consequences when imposing different kinds of constraints. One
way to properly treat both holonomic and nonholonomic constraints is given,
pinpointing the difficulties and common errors. Along the way, the treatment in local
coordinates is also put in more modern terms, in the language of differential
geometry, which is the language most commonly used in modern texts on the subject.
Acknowledgements
My biggest thanks to my supervisor Jürgen Fuchs, always interested and always
interesting! I would also like to thank Jonatan Andersson for our long discussions
and all your helpful comments. Thanks also to David, Sofie, Jessica and Joakim
for your willingness to listen and help. Last but not least, thank you to my wife
Anna-Lena and my two sons Sixten and Alexander!
Contents
1 Introduction
2 Mathematical tools
2.1 Linear and affine functions
2.2 Projection
2.3 Manifolds
2.4 Tangent vectors, spaces
2.5 Fibre bundles
2.6 Differential k-forms
2.7 Wedge product between differential forms
2.8 Exterior derivative
2.9 Jacobi-Lie bracket
2.10 Ehresmann connection
3 D'Alembert's principle
4 Constraints
4.1 Holonomic constraints
4.2 Nonholonomic constraints
5 Structure of the equations of motion
5.1 Imposing the constraints
5.2 Obtaining the equations of motion
5.3 Constrained Lagrangian
5.4 Interpretation
5.5 Conservation laws
5.6 An Ehresmann connection
5.7 Curvature
6 An example: The falling disc
6.1 Writing the Lagrangian
6.2 Change of coordinates
6.3 Nonholonomic constraints
6.4 Equations of motion
6.5 Interpretation and stable solutions
7 Results and conclusion
1 Introduction
In the late 17th century, the German mathematician Gottfried Leibniz introduced the
concept of vis viva, the living force, to be the mass times the square of the velocity, $mv^2$.
This is today referred to as kinetic energy (differing only by the unimportant factor
2) [11]. He also replaced Isaac Newton's concept of force with the work of the force, later
replaced by an even more basic quantity, the work function. This marks the starting
point of a new branch of mechanics, often called analytical mechanics, where the vector
quantities of force and momentum are no longer important, but instead these two scalar
quantities, the kinetic energy T and the work function U completely describe the motion
of the system.
Through the work of Leonhard Euler, Joseph-Louis Lagrange and others in the second
half of the 18th century, a new formalism was developed, called Lagrangian mechanics.
Euler and Lagrange were the first to discover and formulate the principle of least action,
stating that given a start point and an end point, assuming the energy of the system is
conserved, a time-integral of the kinetic energy over any possible path between the start
and end points has a stationary value, and that nature will choose the path with the
smallest of these values. Thus, to find the actual motion of the system one only needed
to find the minimum value for this integral. This was achieved by using calculus of
variations, a field of mathematics that deals with maximizing or minimizing functionals,
analogous to finding extremal points of functions in ordinary calculus.
In 1834 and 1835, William Rowan Hamilton published two papers titled On a General
Method in Dynamics where he was able to generalize the principle of least action to
include nonconservative systems. The integrand in the action integral was no longer the
kinetic energy alone, but the difference between kinetic energy T and potential energy V,
the most fundamental quantity in Lagrangian mechanics, later named the Lagrangian,
$$L = T - V. \tag{1.1}$$
This variational principle was named Hamilton's principle. Numerous other formulations
of the principle of least work were also developed (by Lagrange, Euler, Jacobi) but they
are all similar and all originate from the principle of least action. These principles
all lead to the equations of motion in the Lagrangian formalism, the Euler-Lagrange
equations
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} = 0, \tag{1.2}$$
for any generalized coordinate $q^j$, where $j = 1, \dots, n$ and $n$ is the total number of
coordinates.
There is however another way to arrive at these equations, and that is d'Alembert's
principle, sometimes called the Lagrange-d'Alembert principle, introduced in 1743 in Traité
de Dynamique by Jean-Baptiste le Rond d'Alembert. This principle does not rely on the
least action principle at all, but instead starts with Newton's second law and applies the
principle of virtual work, stating that a system is in equilibrium if and only if the total
virtual work (the work done by a force along a spatial displacement when time is kept
fixed) of all applied forces vanishes. The principle thus assumes that all the work done by
constraint forces cancels. The new addition by d'Alembert (startlingly simple and still
very important) to this principle was to take the right hand side of Newton's second
law, the mass times the acceleration, and move it to the left hand side and treat it as
if it were a force, called the force of inertia. In this way, the principle of virtual work
could be extended to include systems with non-trivial dynamics and not only static
systems.
The d'Alembert principle also leads to the Euler-Lagrange equations and it is actually
the most general principle of them all, and indeed it can quite easily be mathematically
reformulated as Hamilton's principle. Hamilton's principle (and the same goes for all
the mentioned least action principles) is thus equivalent to d'Alembert's principle in all
applications where both can be applied [11]. The latter half of the previous statement is
very important but sometimes forgotten. To get a quick appreciation of how the two
principles differ we first need to collect some more information.
As previously mentioned, in analytical mechanics, the quantities of importance are
the kinetic energy and the work function. The forces that give rise to a virtual work
are commonly classified as conservative or nonconservative, or sometimes as potential
or nonpotential. A more general classification, and a terminology used in [11], divides
the forces into two categories:
1. Monogenic. Forces that are derivable from a scalar function U called the work
function, which may depend on the coordinates, the coordinate velocities and time explicitly.
2. Polygenic. Forces that are not monogenic, but for which it is still relevant to talk
about the virtual work done by the force (such as frictional forces).
Conservative forces are now the special case when U does not depend on time explicitly.
Potential forces are then the special case of conservative forces but when the work
function neither depends on time explicitly, nor the coordinate velocities. The negative
of such a work function is what we normally call a scalar potential.
Hamilton's variational principle actually has many advantages over d'Alembert's
principle, in areas such as optimal control and when dealing with systems of curvilinear
coordinates, although it turns out that Hamilton's principle is invalid when dealing
with polygenic forces while d'Alembert's still works. To see what this means, let us also
classify forces of constraint into two categories:
1. Holonomic constraints. Integrable constraints, meaning that given some
constraints depending on time-derivatives of coordinates, these constraints can be
integrated so as to express the constraints in only the coordinates themselves. The
terminology was first introduced by Heinrich Hertz in 1894.
2. Nonholonomic constraints. Constraints that are not holonomic. These might
be constraints that are expressed in terms of coordinate velocities that are not
derivable from coordinate constraints (thereby unintegrable) or constraints not
given as an equation at all.
There is also a connection between the two above classifications: holonomic
constraints give rise to monogenic forces and nonholonomic constraints give rise to polygenic
forces [11]. From this we can immediately deduce the important fact that Hamilton's
principle is applicable only to systems subject to holonomic constraints and that for
treatment of nonholonomic constraints only d'Alembert's principle is applicable.
Probably due to the elegance and wide areas of use of Hamilton's principle on systems
with holonomic constraints, some confusion has arisen about how nonholonomic
systems are to be treated. One way to do this properly was firmly established
already in 1900 in [10], but still today one can find books for students in analytical
mechanics, as pointed out by [1] and [6], where the topic is treated in a careless manner,
such as when generality is claimed for results applying only to holonomic constraints,
or the author may simply have misunderstood how the treatment of nonholonomic
constraints differs from that of holonomic ones.
Although there still is some confusion remaining, there are numerous publications
where the importance of using the d'Alembert principle is stressed and the main
differences are pointed out. One of the most active contemporary authors within the field
is A.M. Bloch (for example in [1], [12], [3] or [2]), but there are also other authors (for
example W.S. Koon and J.E. Marsden in [9] and M.R. Flannery in [6]).
The main purpose of this text is to create a reasonably straight path through the
process of obtaining equations of motion for systems with both holonomic and
nonholonomic constraints, starting with the d'Alembert principle, deriving the Euler-Lagrange
equations and then applying constraints, both holonomic and nonholonomic. By
doing this in a very straightforward and transparent way (the ambition was to include
almost all equations and calculations) it becomes fairly easy to pinpoint the differences
between the two types of constraints, and how they appear explicitly in the equations
when imposing the constraints.
A large part of the more recent literature in the field is written, not in terms of local
coordinates, but in terms of differential geometry. Since this text is intended for last
year undergraduate level and up, one of the secondary purposes of the text was to open the
door (but never completely walk through it) to describing mechanical systems in the
coordinate free language of geometry.
Finally the formulas developed are put to use on the mechanical system of a rolling
disc (such as a coin) on a horizontal plane, a good example of how rolling constraints
lead to linear nonholonomic constraint equations. The example has (of course) been
treated before in a similar way, but is included to illustrate the process of obtaining
the equations of motion by the process developed in the foregoing sections and it also
demonstrates some of the differences between holonomic and nonholonomic systems.
By pure necessity, and sometimes to avoid confusing the reader more than necessary,
there are some restrictions to generality in the text. To begin with we only treat
systems where there are no nonpotential forces acting, which in this context means that
the only non-potential forces we consider are those that arise from constraints. We also only treat systems where
the constraints do not depend on time explicitly (so called scleronomic systems, as
opposed to rheonomic systems, where the constraints depend on time explicitly). The
nonholonomic constraints we treat are all linear in the velocities, disregarding both constraints
that also have a term depending only on the coordinates (affine constraints) and also
completely nonlinear constraints.
Many of these restrictions can quite easily be lifted so that the results can be extended
in generality. In these cases the reader will be referred to external literature where this is
treated. In other cases, such as not treating completely nonlinear constraints, it is simply
outside the scope of either principle mentioned in this text to treat such constraints.
This text draws heavily from the above mentioned authors and follows closely their
arguments and derivations, sometimes filling in missing equations and sometimes
adding comments or conclusions that help tie together all loose ends. While no single
detail in this text is completely new, its main contribution is to gather all these loose
ends and tie them together in a way that cannot be located in the literature and that
hopefully provides the reader with new or refreshed knowledge and understanding of
Lagrangian mechanics.
2 Mathematical tools
In this section, definitions and explanations of some of the mathematical tools and
concepts used in this text are given. Some clarifying examples are also included to make
some of the more abstract concepts a bit more concrete. The section can be read straight
through, used as a reference or skipped altogether depending on the reader.
To read more about these mathematical tools and concepts, the reader is referred to
for instance [1] and [15].
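2.1 Linear and affine functions

Definition 1. A function $f : \mathbb{R}^n \to \mathbb{R}^m$ is said to be linear if
1. $f(\mathbf{X} + \mathbf{Y}) = f(\mathbf{X}) + f(\mathbf{Y})$, $\forall\, \mathbf{X}, \mathbf{Y} \in \mathbb{R}^n$,
2. $f(a\mathbf{X}) = a f(\mathbf{X})$, $\forall\, \mathbf{X} \in \mathbb{R}^n$, $a \in \mathbb{R}$.
Definition 2. A function $g : \mathbb{R}^n \to \mathbb{R}^m$ is said to be affine if there exists a linear
function $f : \mathbb{R}^n \to \mathbb{R}^m$ and a vector $\mathbf{b} \in \mathbb{R}^m$ such that
$$g(\mathbf{X}) = f(\mathbf{X}) + \mathbf{b}, \quad \forall\, \mathbf{X} \in \mathbb{R}^n. \tag{2.1}$$
In plainer language, an affine transformation is a linear transformation followed by
a translation.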
2.2 Projection
Definition 3. A projection $P$ is a linear transformation
$$P : \mathbb{R}^n \to \mathbb{R}^n \tag{2.2}$$
such that if $\mathbf{u} \in \mathbb{R}^n$ and $\mathbf{v} = P(\mathbf{u})$ then
$$P(\mathbf{v}) = \mathbf{v}. \tag{2.3}$$
In other words, if $P$ is applied to a vector that is already transformed by $P$, the
result is the same vector. $P$ is thus said to be idempotent, which is to say that
$$P^2 = P. \tag{2.4}$$
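The idempotence property can be checked numerically. The following is a minimal sketch, assuming a hypothetical 2x2 matrix `P` chosen only for illustration (an oblique projection onto the first coordinate axis); the helper names `matmul` and `apply` are not taken from the text.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply(m, v):
    """Apply the matrix m to the vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


# Illustrative projection: maps (x, y) to (x + y, 0).
P = [[1, 1],
     [0, 0]]

u = [3, 4]
v = apply(P, u)        # v = P(u)
v_again = apply(P, v)  # P(v), which should equal v by (2.3)
P_squared = matmul(P, P)
```

Here `v_again` equals `v` and `P_squared` equals `P`, which are the statements (2.3) and (2.4) for this particular matrix.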
2.3 Manifolds
To describe the positional state of a given mechanical system we find a minimum number
n of generalized coordinates, not necessarily coordinates in position space but any
quantity that describes the position of the system. All possible values of these coordinates
make up a set of points Q. This set may not resemble Euclidean space as a whole, but
locally (on a small enough scale) there exists a homeomorphism from Q to $\mathbb{R}^n$. All these
local charts to $\mathbb{R}^n$ overlap smoothly with each other so that the whole set of points can
be described in terms of these charts. Q is then called a differentiable manifold. More
explicitly, the definition is:
Definition 4. A differentiable manifold of dimension n is a set Q together with a
finite or countably infinite set of subsets $U_\alpha \subset Q$ and with one-to-one mappings
$\varphi_\alpha : U_\alpha \to \mathbb{R}^n$ such that
1. The union of all $U_\alpha$ is Q,
2. If $I = U_\alpha \cap U_\beta$ is a nonempty intersection of $U_\alpha$ and $U_\beta$, then there exists a smooth
function $\varphi_{\alpha\beta} : \varphi_\alpha(I) \to \varphi_\beta(I)$.
Figure 1: An example of a manifold with mappings to local coordinates
A simple illustration is to see the manifold as the earth and each chart as one of the
pages of an atlas. Although two different pages may be scaled and rotated differently,
in the areas where they overlap it is easy to find points that correspond to each other.
Trivial cases of manifolds are when Q is $\mathbb{R}^n$ itself. An example of a non-trivial
manifold of dimension one is the circle $S^1$. The generalized coordinate is usually an
angle, which varies from 0 to $2\pi$. This example also illustrates the need for more than
one chart to describe the manifold, since the coordinate values 0 and $2\pi$ describe the
same point on the circle.
Another example of a manifold is the two dimensional sphere $S^2$, illustrated by figure
1, which consists of all the points lying on the surface of a three dimensional ball, which is
a useful manifold to describe a spherical pendulum. Here the two generalized coordinates
would typically be the usual spherical coordinates $\varphi$ and $\theta$, with $\varphi$ ranging from 0 to $2\pi$
and $\theta$ from 0 to $\pi$. Also in this example we have several values of coordinates describing
the same point.
These examples of some possible manifolds are included to provide a better feeling
for what manifolds are.
2.4 Tangent vectors, spaces

Let c(t) and d(t) be smooth curves on the manifold Q that go through a point $q \in Q$ so
that
$$c(0) = d(0) = q. \tag{2.5}$$
Now assume a map $\varphi$ from an open subset containing q to $\mathbb{R}^n$ so that $\varphi(c(t))$ and $\varphi(d(t))$
describe curves on the chart corresponding to the curves on the manifold. The curves c
and d are said to be N-tangent at q if
$$\left.\frac{d^i}{dt^i}\varphi(c(t))\right|_{t=0} = \left.\frac{d^i}{dt^i}\varphi(d(t))\right|_{t=0}, \quad i = 1, \dots, N. \tag{2.6}$$
In this text we only consider N = 1. Nth-order tangency is useful for instance when
considering constraints given in higher order time derivatives of the coordinates. Further
information on the topic can be found in [5, p. 566-570] and [4, p. 2785-2787].
Now we define the tangent vector $\mathbf{u}$ to a manifold Q of dimension n at a point q to
be an n-dimensional vector with the components given by
$$u^i = \left.\frac{d}{dt}\big(\varphi(c(t))\big)^i\right|_{t=0}, \quad i = 1, \dots, n. \tag{2.7}$$
In short, the tangent vector is the time derivative of the curve once it is mapped to
local coordinates.
If we take the span of all the tangent vectors of all possible curves through a point
$q \in Q$ we get the tangent space of Q at the point q, denoted $T_qQ$, which is the vector
space $\mathbb{R}^n$.
If we now take all points in Q and bundle them together with all the attached tangent
spaces, we get the tangent bundle $TQ$. The tangent bundle is thus a 2n-dimensional
manifold that is suited to describe a system completely. For example, since the
Lagrangian L is a function of both position and velocity, it is defined on $TQ$.

The tangent space to a point on a circle is simply the tangent line to the circle through
that point. The tangent bundle would then be the circle itself with lines coming out
of each point. This can be visualized as an infinite cylinder with the manifold at the
bottom and the cylinder itself being made up by all the tangent spaces.
Figure 2: The tangent lines of a circle are placed over the point of tangency in a
non-intersecting manner. All tangent lines are finally joined smoothly to make up the
cylinder.
The tangent space to a point lying on a sphere is a tangent plane. The bundle in
this case has four dimensions so it is hard to visualize.
Mathematically, $TQ$, $Q$ and the $T_qQ$'s form a fibre bundle, with each tangent space
$T_qQ$ called a fibre. In section 2.5 we explore this concept further.
2.5 Fibre bundles
A fibre bundle consists of four pieces of data: three topological spaces, such as manifolds,
B, T, F and a continuous surjection
$$\pi : T \to B, \tag{2.8}$$
where B is called the base space, T is called the total space and F is called the fibre.
The map $\pi$ is called the bundle projection of the bundle, which locally is a continuous
surjection which projects any element of T to B. For every $q \in T$ there exists an open
neighbourhood $U \subset B$ of the projection $\pi(q)$ such that the inverse image $\pi^{-1}(U)$ is
homeomorphic to the product space $U \times F$. The bundle projection $\pi_q$ at each point
$q \in Q$ should also be a submersion, that is to say that the differential $d\pi_q$ should be a
surjection for every $q \in Q$. The bundle projection is thus also often called the submersion
of the fibre bundle.
We can think of B as a subspace of T and $\pi$ as a projection to only the base variables.
Since $\pi$ is not a bijection there is no inverse, but we can still speak of the fibre, defined
for any element $r \in B$ as
$$\pi^{-1}(r) = \{q \in T : \pi(q) = r\}, \tag{2.9}$$
or in words, the set of points in T whose projection through $\pi$ equals $r \in B$. This set
is called the fibre at r and when $\pi^{-1}(r)$ is homeomorphic to F for all $r \in B$, the total
space, the base space, the fibre and the submersion together form the fibre bundle.
Fibre bundles where the spaces B, T and F are vector spaces are called vector bundles.
A bundle where $T = B \times F$ is called a trivial bundle.
To see that the tangent bundle $TQ$ is a fibre bundle, we call $TQ$ the total space and
the configuration manifold Q the base space and define a submersion
$$\pi : TQ \to Q \tag{2.10}$$
that locally projects each element in $TQ$ onto the corresponding $q \in Q$.

An example of a trivial bundle is again the circle $S^1$ with each fibre a copy of the real
line $\mathbb{R}$, so that the total space is simply $S^1 \times \mathbb{R}$. An example of a nontrivial fibre bundle
is the same base space, the circle $S^1$, but instead of having the fibres make up a cylinder
we have them make up a Möbius strip, a sort of twisted cylinder such that going one
revolution around the strip reverses the sign. This fibre bundle locally resembles the
product space $S^1 \times \mathbb{R}$ but differs from it when considering what lies above the entire
circle [14]. Both these examples can be seen in figure 3.
Figure 3: To the left a trivial fibre bundle involving a cylinder and to the right a nontrivial
bundle involving a Möbius strip.
2.6 Differential k-forms

A k-form $\omega$ on a differentiable manifold Q is a smooth map from the product space
$TQ \times \cdots \times TQ$, with k factors of $TQ$, to $\mathbb{R}$, which at any point $q \in Q$ defines an alternating
multilinear map $\omega_q$ to $\mathbb{R}$,
$$\omega_q : \underbrace{T_qQ \times \cdots \times T_qQ}_{k\ \text{factors}} \to \mathbb{R}, \tag{2.11}$$
where $\omega_q$ is linear in each argument.
For example, a one-form can locally be expressed as
$$\omega_q = \sum_{i=1}^{n} f_i(q)\, dq^i \tag{2.12}$$
for any $q \in Q$ and some functions $f_i$, where $dq^i$ are the exterior derivatives (section 2.8)
of each coordinate.
A two-form $\omega_q$ is alternating bilinear (or anti-symmetric). This means that
$$\omega_q(\mathbf{X}, \mathbf{Y}) = -\omega_q(\mathbf{Y}, \mathbf{X}), \quad \forall\, \mathbf{X}, \mathbf{Y} \in T_qQ. \tag{2.13}$$
It can locally be expressed as
$$\omega_q = \sum_{i=1}^{n}\sum_{j=1}^{n} g_{ij}(q)\, dq^i \wedge dq^j \tag{2.14}$$
for any $q \in Q$ and some anti-symmetric matrix of functions $g_{ij}$. The symbol $\wedge$ denotes
the wedge product which will be explained in section 2.7.
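2.7 Wedge product between differential forms

The wedge product of a k-form $\alpha$ and an l-form $\beta$ is a (k + l)-form denoted $\alpha \wedge \beta$ and
has the property
$$\alpha \wedge \beta = (-1)^{kl}\, \beta \wedge \alpha. \tag{2.15}$$
In particular, the wedge product between two one-forms $\alpha$ and $\beta$ is a two-form
whose value at a point $q \in Q$ is an alternating bilinear form defined by
$$(\alpha \wedge \beta)_q(\mathbf{X}, \mathbf{Y}) = \alpha(\mathbf{X})\beta(\mathbf{Y}) - \alpha(\mathbf{Y})\beta(\mathbf{X}). \tag{2.16}$$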
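To make (2.16) concrete, the sketch below represents a one-form at a fixed point by its list of components and builds the wedge product of two one-forms as a function of two tangent vectors. The helper names `one_form` and `wedge` and the sample vectors are illustrative assumptions, not notation from the text.

```python
def one_form(coeffs):
    """Return the one-form with the given components, acting on a tangent vector."""
    def alpha(X):
        return sum(c * x for c, x in zip(coeffs, X))
    return alpha


def wedge(alpha, beta):
    """Wedge product of two one-forms, evaluated as in (2.16)."""
    def two_form(X, Y):
        return alpha(X) * beta(Y) - alpha(Y) * beta(X)
    return two_form


# Coordinate one-forms dx and dy on R^3 (illustrative choice).
dx = one_form([1, 0, 0])
dy = one_form([0, 1, 0])

X = [2, 3, 5]
Y = [7, 11, 13]

dx_dy = wedge(dx, dy)
value = dx_dy(X, Y)          # 2*11 - 7*3
swapped = dx_dy(Y, X)        # sign flips when the arguments are exchanged
self_wedge = wedge(dx, dx)(X, Y)
```

Exchanging the arguments flips the sign of the result, and the wedge product of a one-form with itself vanishes, in line with (2.13) and (2.15).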
2.8 Exterior derivative
The exterior derivative of a function $f(q^1, \dots, q^n)$ is given in local coordinates by
$$df = \sum_{i=1}^{n} \frac{\partial f}{\partial q^i}\, dq^i. \tag{2.17}$$
If a function is considered to be a differential zero-form, then for any k-form $\alpha$, with k
a natural number, and any differential form $\beta$, the exterior derivative satisfies
$$d(\alpha \wedge \beta) = (d\alpha) \wedge \beta + (-1)^k (\alpha \wedge d\beta). \tag{2.18}$$
In local coordinates, the exterior derivative of a one-form expressed as
$$\omega_q = \sum_{i=1}^{n} f_i(q)\, dq^i, \tag{2.19}$$
is the sum
$$d\omega_q = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial f_i}{\partial q^j}\, dq^j \wedge dq^i. \tag{2.20}$$
Comparing with (2.14) we see that the exterior derivative of a one-form is a two-form.
In the same way, the exterior derivative of a zero-form is a one-form. In general, if k is a
natural number, a (k + 1)-form that is the exterior derivative of a k-form is called exact
and a k-form whose exterior derivative is zero is called closed. It can also be shown that
all exact k-forms are also closed.
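As a small worked example of (2.17) and (2.20), consider the plane with coordinates $(x, y)$. The function $f = xy$ and the one-form $\omega = x\,dy$ are chosen here purely for illustration and do not appear elsewhere in the text.

```latex
\begin{align*}
  % An exact one-form is closed: apply (2.17), then (2.20).
  f &= xy, & df &= y\,dx + x\,dy, \\
  & & d(df) &= \frac{\partial y}{\partial y}\,dy \wedge dx
             + \frac{\partial x}{\partial x}\,dx \wedge dy
             = dy \wedge dx + dx \wedge dy = 0, \\
  % A one-form that is not closed, and therefore not exact.
  \omega &= x\,dy, & d\omega &= \frac{\partial x}{\partial x}\,dx \wedge dy
             = dx \wedge dy \neq 0 .
\end{align*}
```

The first computation illustrates that an exact form is closed. The second shows a one-form that cannot be written as $df$ for any function $f$, which is the kind of obstruction that reappears when asking whether a velocity constraint can be integrated.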
2.9 Jacobi-Lie bracket

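Any vector field on a manifold Q can be regarded as a differential operator acting on
smooth functions on Q. The Jacobi-Lie bracket is then defined as follows:
Definition 5. The Jacobi-Lie bracket (sometimes called the Lie bracket) $[\mathbf{X}, \mathbf{Y}]$ of
two vector fields $\mathbf{X}$ and $\mathbf{Y}$ acting on a function $f$ is defined by
$$[\mathbf{X}, \mathbf{Y}](f) := \mathbf{X}(\mathbf{Y}(f)) - \mathbf{Y}(\mathbf{X}(f)). \tag{2.21}$$
In local coordinates, if $\{q^1, q^2, \dots, q^n\}$ are coordinates, and
$\{\frac{\partial}{\partial q^1}, \frac{\partial}{\partial q^2}, \dots, \frac{\partial}{\partial q^n}\}$ denotes
the basis for vector fields, then
$$[\mathbf{X}, \mathbf{Y}] := \sum_{i=1}^{n}\sum_{j=1}^{n} \left( X^j \frac{\partial Y^i}{\partial q^j} - Y^j \frac{\partial X^i}{\partial q^j} \right) \frac{\partial}{\partial q^i}. \tag{2.22}$$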
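The coordinate formula (2.22) can be evaluated numerically at a point. The sketch below is only an illustration: the function `lie_bracket`, the use of central differences for the partial derivatives, and the two sample vector fields on $\mathbb{R}^3$ are assumptions made here, not part of the text.

```python
def lie_bracket(X, Y, q, h=1e-5):
    """Approximate the components of [X, Y] at the point q using (2.22).

    X and Y map a list of coordinates to a list of components.
    Partial derivatives are approximated by central differences.
    """
    n = len(q)

    def partial(F, i, j):
        # Approximates d F^i / d q^j at q.
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += h
        q_minus[j] -= h
        return (F(q_plus)[i] - F(q_minus)[i]) / (2 * h)

    Xq, Yq = X(q), Y(q)
    return [
        sum(Xq[j] * partial(Y, i, j) - Yq[j] * partial(X, i, j) for j in range(n))
        for i in range(n)
    ]


def X(q):
    # X = d/dx
    return [1.0, 0.0, 0.0]


def Y(q):
    # Y = d/dy + x d/dz
    return [0.0, 1.0, q[0]]


bracket = lie_bracket(X, Y, [0.3, -1.2, 2.0])
```

For these two fields the bracket is $\partial/\partial z$, a direction that is not a linear combination of $\mathbf{X}$ and $\mathbf{Y}$ themselves.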
2.10 Ehresmann connection

Definition 6. An Ehresmann connection A is a vector-valued one-form on Q that
satisfies:
1. A is vertical valued: $A_q : T_qQ \to V_q$ is a linear map for each point $q \in Q$.
2. A is a projection: $A(\mathbf{u}) = \mathbf{u}$, $\forall\, \mathbf{u} \in V_q$.
$V_q$ is called the vertical space, which is a subspace of $T_qQ$, and it is determined by the
fibre bundle submersion $\pi$. If $T_q\pi$ is the projection from $T_qQ$ to the tangent space of the
base space of a specific fibre bundle, then $V_q$ is the kernel of $T_q\pi$.
3 D'Alembert's principle
In this section the equations of motion for a mechanical system with n independent
generalized coordinates will be derived using the d'Alembert principle, similarly to how
it is done in [1].
Starting from Newton's second law and splitting up the forces into applied forces $\mathbf{F}_i$
and constraint forces $\mathbf{f}_{ci}$ according to
$$\mathbf{F}_i + \mathbf{f}_{ci} = \dot{\mathbf{p}}_i, \tag{3.1}$$
we take the sum of the virtual work (the work done by a force upon a virtual displacement
$\delta\mathbf{r}_i$, a displacement only in the coordinates but not in time, consistent with the forces
and constraints at the time of the displacement)
$$\sum_i (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i + \sum_i \mathbf{f}_{ci} \cdot \delta\mathbf{r}_i = 0. \tag{3.2}$$
Now we make our first assumption, which is that the virtual work of all constraint forces
vanishes (the virtual displacements must all satisfy any constraint equations). This is
true for many (but far from all) mechanical systems, such as rigid bodies. By applying
this assumption to (3.2) we arrive at
$$\sum_i (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0. \tag{3.3}$$
This is called d'Alembert's principle. Since the $\delta\mathbf{r}_i$ are, in general, not independent of
each other we switch to the generalized coordinates by the transformation
$$\mathbf{r}_i = \mathbf{r}_i(q^1, q^2, \dots, q^n; t) \tag{3.4}$$
with n independent $q^j$'s. The indices of q are given as an upper index to conform with
mathematical conventions and are not to be confused with powers of q. Introducing T
as the kinetic energy and V as the potential energy we can now define the Lagrangian
$$L(q, \dot{q}; t) = T - V, \tag{3.5}$$
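which now leads us to the following equation, equivalent to (3.3):
$$\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} - F_j^{(NP)} \right] \delta q^j = 0, \tag{3.6}$$
where $F_j^{(NP)}$ are all applied nonpotential forces acting on the system. For simplicity we
will hereafter assume that
$$F_j^{(NP)} = 0, \quad j = 1, \dots, n, \tag{3.7}$$
so that (3.6) becomes
$$\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} \right] \delta q^j = 0. \tag{3.8}$$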
When all n coordinates in (3.8) are independent of each other, the terms of the sum
in (3.8) must all equal zero individually, so the n equations needed to describe the
motion of the system are immediately obtained from (3.8). These are the Euler-Lagrange
equations,
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} = 0, \quad j = 1, \dots, n. \tag{3.9}$$
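As a quick check of how the Euler-Lagrange equations are used, consider a single coordinate $q$ and a mass on a spring. The mass $m$ and the spring constant $k$ are illustrative symbols introduced only for this example.

```latex
\begin{align*}
  L &= \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2, \\
  \frac{\partial L}{\partial \dot{q}} &= m \dot{q}, \qquad
  \frac{\partial L}{\partial q} = -k q, \\
  \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q}
    &= m \ddot{q} + k q = 0 .
\end{align*}
```

The result is the familiar equation of motion of the harmonic oscillator, obtained without ever writing down a force vector.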
4 Constraints
In section 3 we saw that (3.8) gives us (3.9) when all coordinates are independent, or
unconstrained. In this section we explore how to treat systems with constraints.
4.1 Holonomic constraints

We now consider systems where some of the coordinates are dependent on some of the
others (holonomic constraints), the dependence being described by some functions$^1$
$$f^a = f^a(q), \quad a = 1, \dots, p, \quad p < n, \tag{4.1}$$
and the equations
$$f^a(q) = 0, \quad a = 1, \dots, p. \tag{4.2}$$
In some cases it might be most practical to use these equations to eliminate all
dependent variables in the Lagrangian directly, and thus end up with a system of $n - p$
independent variables which can then be treated by (3.8).
In other cases it may be difficult to perform such eliminations, or the forces
produced by these constraints might be of interest. In such cases the constraints can be
incorporated into (3.8) and even be included in the Lagrangian itself. By d'Alembert's
principle, all the virtual displacements $\delta q$ in (3.8) must satisfy the constraint equations.
In practice, this condition can be written as an equation by first differentiating (4.2),
giving the p equations
$$df^a = \sum_{j=1}^{n} \frac{\partial f^a}{\partial q^j}\, dq^j = 0, \tag{4.3}$$
so when taking a virtual displacement (time is kept fixed) we get
$$\sum_{j=1}^{n} \frac{\partial f^a}{\partial q^j}\, \delta q^j = 0. \tag{4.4}$$
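After multiplying each of these sums with a function
$$\lambda_a = \lambda_a(t), \tag{4.5}$$
which are called Lagrange multipliers, we may subtract the sum of all variations of the
constraints from the left hand side of (3.8) to obtain
$$\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^j} - \frac{\partial L}{\partial q^j} - \sum_{a=1}^{p} \lambda_a(t) \frac{\partial f^a}{\partial q^j}(q) \right] \delta q^j = 0. \tag{4.6}$$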
Now we denote the dependent coordinates with
$$s^1, s^2, \dots, s^p \tag{4.7}$$
and the independent ones with
$$r^1, r^2, \dots, r^{n-p} \tag{4.8}$$
and choose the functions $\lambda_a$ such that
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{s}^b} - \frac{\partial L}{\partial s^b} - \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial s^b} = 0, \quad b = 1, \dots, p. \tag{4.9}$$
1 The constraint functions may also depend on time explicitly, but for simplicity we disregard
this case. The following results may be extended to such cases, see [6].
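Equation (4.6) now reduces to
$$\sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} - \frac{\partial L}{\partial r^\alpha} - \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha} \right] \delta r^\alpha = 0 \tag{4.10}$$
where the $\delta r^\alpha$'s are all independent and arbitrary. Each term must therefore individually
vanish, which gives
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{r}^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha}, \quad \alpha = 1, \dots, n - p. \tag{4.11}$$
Together with the original constraint equations, expressed in the new labelling of
coordinates, as
$$f^a(s^1, s^2, \dots, s^p, r^1, r^2, \dots, r^{n-p}) = 0 \tag{4.12}$$
we now have n independent equations that completely describe the motion of the system.
In addition we also get, as a bonus, the constraint forces given by
$$\sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha}, \quad \alpha = 1, \dots, n - p. \tag{4.13}$$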
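This also enables us, since the $f^a$'s are independent of $\dot{q}$, to introduce the modified
Lagrangian $\tilde{L}$, defined by
$$\tilde{L} := L + \sum_{a=1}^{p} \lambda_a f^a \tag{4.14}$$
so that (4.6) becomes
$$\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial \tilde{L}}{\partial \dot{q}^j} - \frac{\partial \tilde{L}}{\partial q^j} \right] \delta q^j = 0. \tag{4.15}$$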
This indeed shows that the holonomic constraint equations can be added to the
Lagrangian with Lagrange multipliers, after which (4.15) gives the constrained equations
of motion. Equation (4.15) is sometimes referred to as d'Alembert's generalized principle,
and what this shows is that it does not matter whether we impose the constraints before
or after we take variations of the Lagrangian. This is however only true for holonomic
constraints, as we shall see in the next section.
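A standard illustration of the multiplier method is the planar pendulum treated in Cartesian coordinates. The setup below is an assumption made for this example only: coordinates $(x, y)$ with $y$ measured upwards, a point mass $m$, rod length $\ell$, gravitational acceleration $g$, and a single constraint ($n = 2$, $p = 1$) with multiplier $\lambda$.

```latex
\begin{align*}
  L &= \tfrac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) - m g y,
  \qquad f = x^2 + y^2 - \ell^2 = 0, \\
  \tilde{L} &= L + \lambda \left( x^2 + y^2 - \ell^2 \right), \\
  \frac{d}{dt}\frac{\partial \tilde{L}}{\partial \dot{x}} - \frac{\partial \tilde{L}}{\partial x}
    &= m \ddot{x} - 2 \lambda x = 0, \\
  \frac{d}{dt}\frac{\partial \tilde{L}}{\partial \dot{y}} - \frac{\partial \tilde{L}}{\partial y}
    &= m \ddot{y} + m g - 2 \lambda y = 0 .
\end{align*}
```

The terms $2\lambda x$ and $2\lambda y$ are the components of the constraint force, which points along the rod. Together with $f = 0$ these are three equations for the three unknowns $x$, $y$ and $\lambda$.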
4.2 Nonholonomic constraints

The treatment of constraint equations in which only the generalized coordinates and
time are present is well known and fairly straightforward. However, in some systems the
constraints are not expressible only in terms of coordinates, but also with coordinate
velocities (nonholonomic constraints). Consider constraints that can be expressed by p
functions of the form
$$g^a = g^a(q, \dot{q}; t), \quad a = 1, \dots, p, \quad p < n, \tag{4.16}$$
and the equations
$$g^a(q, \dot{q}; t) = 0, \quad a = 1, \dots, p. \tag{4.17}$$
In general there is no guarantee that constraints of this type are treatable with the
d'Alembert principle. In particular, when the $g^a$'s are nonlinear in the velocities we
cannot take variations of the constraints that are linear in the $\delta q$'s. In many mechanical
systems, however, the constraints only depend on position or are linear in velocities.
Therefore we now focus on the interesting special case where
$$g^a(q, \dot{q}) = \sum_{i=1}^{n} A^a_i(q)\, \dot{q}^i, \tag{4.18}$$
commonly occurring, for example in systems with rolling constraints. We again disregard
explicit time dependence and now also disregard constraints with an additional term
involving only the coordinates (affine constraints, see section 2.1). The following results
can be extended to include also affine constraints, as seen in [1, p. 219-220].
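Multiplying (4.18) with dt we get
$$
g^a(q, \dot q)\, dt = \sum_{i=1}^{n} A^a_i(q)\, dq^i, \tag{4.19}
$$
and, when taking a virtual displacement (that is, when $dt = 0$ and $dq^i$ becomes $\delta q^i$), in accordance with the d'Alembert principle we have
$$
\sum_{i=1}^{n} A^a_i\, \delta q^i = 0, \qquad a = 1, ..., p. \tag{4.20}
$$
Multiplying with $\lambda_a$ as before, we may now write
$$
\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot q^j} - \frac{\partial L}{\partial q^j} - \sum_{a=1}^{p} \lambda_a A^a_j \right] \delta q^j = 0 \tag{4.21}
$$
and proceed as in the holonomic case, denoting dependent and independent coordinates with (4.7) and (4.8) and choosing the $\lambda_a$'s appropriately to get
$$
\frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a A^a_\alpha, \qquad \alpha = 1, ..., n-p, \tag{4.22}
$$
which is the analogue of equation (4.11) for holonomic constraints.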
In (4.22) the functions $A^a_\alpha$ have been reordered in the lower index so that
$$
A^a_1, A^a_2, ..., A^a_{n-p} \tag{4.23}
$$
correspond to the independent coordinate velocities that were relabelled as
$$
\dot r^1, \dot r^2, ..., \dot r^{n-p}. \tag{4.24}
$$
Equation (4.22) gives the correct equations of motion in terms of Lagrange multipliers, and we may now ask whether it is possible to include the constraint functions in the Lagrangian, thus imposing the constraints before taking the variation, to form an equivalent generalized principle as we saw was possible in the holonomic case.
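As before, we take the modified Lagrangian to be
$$
\tilde L = L + \sum_{a=1}^{p} \mu_a g^a, \tag{4.25}
$$
with $\mu_a$ instead of $\lambda_a$ to distinguish these functions from the ones used previously. Now we get
$$
\frac{\partial \tilde L}{\partial \dot q^j} = \frac{\partial L}{\partial \dot q^j} + \sum_{a=1}^{p} \mu_a A^a_j, \tag{4.26}
$$
with time derivative
$$
\frac{d}{dt}\frac{\partial \tilde L}{\partial \dot q^j} = \frac{d}{dt}\frac{\partial L}{\partial \dot q^j} + \sum_{a=1}^{p} \left( \dot\mu_a A^a_j + \mu_a \sum_{i=1}^{n} \frac{\partial A^a_j}{\partial q^i}\, \dot q^i \right), \tag{4.27}
$$
and
$$
\frac{\partial \tilde L}{\partial q^j} = \frac{\partial L}{\partial q^j} + \sum_{a=1}^{p} \sum_{i=1}^{n} \mu_a \frac{\partial A^a_i}{\partial q^j}\, \dot q^i. \tag{4.28}
$$
Using this $\tilde L$ in (4.15), combined with (4.27) and (4.28), gives us
$$
\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot q^j} - \frac{\partial L}{\partial q^j} + \sum_{a=1}^{p} \dot\mu_a A^a_j + \sum_{a=1}^{p} \sum_{i=1}^{n} \mu_a \left( \frac{\partial A^a_j}{\partial q^i} - \frac{\partial A^a_i}{\partial q^j} \right) \dot q^i \right] \delta q^j = 0. \tag{4.29}
$$
Comparing (4.29) with (4.21) it becomes apparent that
$$
\lambda_a(t) = -\dot\mu_a(t) \tag{4.30}
$$
and that the term
$$
\sum_{a=1}^{p} \sum_{i=1}^{n} \mu_a \left( \frac{\partial A^a_j}{\partial q^i} - \frac{\partial A^a_i}{\partial q^j} \right) \dot q^i \tag{4.31}
$$
needs to vanish in order to obtain the correct equations of motion (4.22). Since this procedure works for holonomic constraints it might be expected that if the constraint functions $g^a$ are simply the total time derivatives of some other functions depending only on the coordinates, in other words, if the constraints are really holonomic,² then this term is zero for every j. It turns out that this is indeed the case, as shown in [6]. Generally these terms are not zero however, which shows that this procedure is generally an incorrect approach.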
The failure to obtain the correct equations of motion by including the constraints in the Lagrangian also provides an explanation as to why Hamilton's variational principle does not work when dealing with nonholonomic constraints, since Hamilton's principle offers no option to impose the constraints after taking the variation. In fact, imposing the constraints after taking the variations of the Lagrangian is crucial for nonholonomic mechanical systems, and shows that imposing constraints before taking the variation, however useful in its proper context, is allowed only for holonomic systems. This proper context has been pointed out by several authors on the subject ([1, p. 208], [3, p. 325]) to be optimal control systems, whereas the correct approach for mechanics is the one given by equations (4.18) through (4.22).

2 Constraints of this kind (where the linear velocity constraints are integrable, so that they can be expressed in terms of only coordinates) are sometimes called semi-holonomic.
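The role of the antisymmetric combination of derivatives of the constraint coefficients can be illustrated numerically. The following is only a minimal sketch under stated assumptions: it uses a single constraint (p = 1), the coordinates q = (x, y, θ, φ) of a vertical rolling disc with the constraint ẋ − R cos(φ) θ̇ = 0, and a hypothetical holonomic function f(q) = x sin φ chosen purely for comparison. The function names and the value of R are illustrative and do not come from the text. If the antisymmetric part vanishes for all index pairs, the extra term in the modified equations vanishes as well.

```python
import math

R = 1.0  # disc radius (assumed value, for illustration only)

def rolling_coeffs(q):
    # Coefficients A_i(q) of the rolling constraint
    #   x' - R cos(phi) theta' = 0,  with q = (x, y, theta, phi).
    x, y, theta, phi = q
    return [1.0, 0.0, -R * math.cos(phi), 0.0]

def exact_coeffs(q):
    # Coefficients A_i = df/dq^i for the hypothetical holonomic
    # function f(q) = x sin(phi); the constraint is then d f / dt = 0.
    x, y, theta, phi = q
    return [math.sin(phi), 0.0, 0.0, x * math.cos(phi)]

def partial(coeffs, q, component, wrt, h=1e-6):
    # Central-difference estimate of dA_component / dq^wrt at the point q.
    qp, qm = list(q), list(q)
    qp[wrt] += h
    qm[wrt] -= h
    return (coeffs(qp)[component] - coeffs(qm)[component]) / (2.0 * h)

def max_antisymmetric_part(coeffs, q):
    # Largest |dA_j/dq^i - dA_i/dq^j| over all index pairs (i, j).
    n = len(q)
    return max(
        abs(partial(coeffs, q, j, i) - partial(coeffs, q, i, j))
        for i in range(n)
        for j in range(n)
    )

q0 = [0.4, -1.2, 0.3, 0.7]
print(max_antisymmetric_part(exact_coeffs, q0))    # numerically zero
print(max_antisymmetric_part(rolling_coeffs, q0))  # close to R * sin(0.7)
```

For the integrable constraint the antisymmetric part is zero up to finite-difference error, while for the rolling constraint it equals R sin φ, so the extra term cannot be expected to vanish.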
5 Structure of the equations of motion
We now explore in more detail the structure of the equations determined by (4.21) of
nonholonomic systems with constraints linear in the velocities. The following passage
follows closely the path of [3, p. 325-326] but with added explanations and equations.
5.1 Imposing the constraints

We consider a configuration manifold (section 2.3) Q with local coordinates
$$
q^1, q^2, ..., q^n \tag{5.1}
$$
constrained by a nonintegrable (nonholonomic) constraint distribution D on Q, linear in the velocities, that is given by a linear subspace of the tangent space (section 2.4) $T_qQ$ at each point $q \in Q$.
Assume there are p constraint equations and that there is a choice of coordinates³
$$
q = (r, s), \qquad (r \in \mathbb{R}^{n-p},\ s \in \mathbb{R}^{p}) \tag{5.2}
$$
such that $\dot s$ represents the dependent generalized velocities and $\dot r$ the independent ones. The constraints can now be described locally by the p equations
$$
\dot s^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \dot r^\alpha, \qquad a = 1, ..., p. \tag{5.3}
$$
5.2 Obtaining the equations of motion

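After taking the variations of the curves in configuration space as before, we get
$$
\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot q^j} - \frac{\partial L}{\partial q^j} \right] \delta q^j = 0. \tag{5.4}
$$
Now we impose the constraints. Because of (5.3) and the d'Alembert principle, the variations must satisfy the equations
$$
\delta s^a + \sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \delta r^\alpha = 0, \qquad a = 1, ..., p. \tag{5.5}
$$
Substituting q with r and s in (5.4),
$$
\sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} \right] \delta r^\alpha + \sum_{a=1}^{p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a} \right] \delta s^a = 0, \tag{5.6}
$$
and combining with (5.5),
$$
\sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} \right] \delta r^\alpha + \sum_{a=1}^{p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a} \right] \left( -\sum_{\alpha=1}^{n-p} A^a_\alpha\, \delta r^\alpha \right) = 0. \tag{5.7}
$$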
3 According to [1, p. 217] this choice of coordinates is always possible.
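Rearranging, we get
$$
\sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} - \sum_{a=1}^{p} A^a_\alpha \left( \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a} \right) \right] \delta r^\alpha = 0. \tag{5.8}
$$
Since all the $\delta r^\alpha$ are independent we can write
$$
\frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} A^a_\alpha \left( \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a} \right), \qquad \alpha = 1, ..., n-p. \tag{5.9}
$$
This is one way to eliminate the Lagrange multipliers, and a comparison with equation (4.22) shows us that the multipliers here are given by
$$
\lambda_a = \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a}, \qquad a = 1, ..., p. \tag{5.10}
$$
Together with the constraint equations
$$
\dot s^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \dot r^\alpha, \qquad a = 1, ..., p, \tag{5.11}
$$
we now have n equations that are the complete equations of motion.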
5.3 Constrained Lagrangian

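Now we eliminate the $\dot s$ dependence in the Lagrangian L, thus getting a constrained Lagrangian
$$
L_c(r, s, \dot r) := L(r, s, \dot r, \dot s(r, s, \dot r)), \tag{5.12}
$$
where the components of $\dot s(r, s, \dot r)$ are given by (5.11).

Then we have
$$
\frac{\partial L}{\partial r^\alpha} = \frac{\partial L_c}{\partial r^\alpha} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\beta}{\partial r^\alpha}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b}, \qquad \alpha = 1, ..., n-p, \tag{5.13}
$$
$$
\frac{\partial L}{\partial s^a} = \frac{\partial L_c}{\partial s^a} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\beta}{\partial s^a}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b}, \qquad a = 1, ..., p, \tag{5.14}
$$
and
$$
\frac{\partial L}{\partial \dot r^\alpha} = \frac{\partial L_c}{\partial \dot r^\alpha} + \sum_{b=1}^{p} A^b_\alpha \frac{\partial L}{\partial \dot s^b}, \qquad \alpha = 1, ..., n-p. \tag{5.15}
$$
The time derivative of (5.15) is
$$
\frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} = \frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \left[ \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right] \dot r^\beta\, \frac{\partial L}{\partial \dot s^b} + \sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot s^a}. \tag{5.16}
$$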
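Inserting (5.13), (5.14) and (5.16) into (5.9), the left hand side becomes
$$
\frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} + \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \left[ \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right] \dot r^\beta\, \frac{\partial L}{\partial \dot s^b} + \sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L_c}{\partial r^\alpha} - \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\beta}{\partial r^\alpha}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b} \tag{5.17}
$$
and the right hand side becomes
$$
\sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} - \sum_{a=1}^{p} \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b}. \tag{5.18}
$$
The term $\sum_{a=1}^{p} A^a_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot s^a}$ cancels out, and by grouping all terms with $\dot r^\beta \frac{\partial L}{\partial \dot s^b}$ we write
$$
\frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} = -\sum_{b=1}^{p} \sum_{\beta=1}^{n-p} \left[ \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} \right) + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \right] \dot r^\beta\, \frac{\partial L}{\partial \dot s^b} \tag{5.19}
$$
or
$$
\frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} = -\sum_{b=1}^{p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b}, \qquad \alpha = 1, ..., n-p, \tag{5.20}
$$
where
$$
B^b_{\alpha\beta} := \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} \right) + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right). \tag{5.21}
$$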
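The coefficients in (5.21) are straightforward to evaluate numerically for a given set of constraint functions. The following is a minimal sketch, assuming that the constraint coefficients are supplied as a hypothetical Python callable `A(r, s)` returning a nested list indexed as `A[b][alpha]`, and that central differences are accurate enough for the purpose. The helper names and the test case (coefficients of the form −R cos and −R sin of the second r-coordinate, as for a rolling disc) are illustrative assumptions.

```python
import math

def _partials(A, r, s, which, k, h):
    # Central-difference derivative of every A[b][alpha] with respect to
    # r[k] (which == "r") or s[k] (which == "s").
    rp, rm, sp, sm = list(r), list(r), list(s), list(s)
    if which == "r":
        rp[k] += h
        rm[k] -= h
    else:
        sp[k] += h
        sm[k] -= h
    Ap, Am = A(rp, sp), A(rm, sm)
    return [[(Ap[b][al] - Am[b][al]) / (2.0 * h) for al in range(len(r))]
            for b in range(len(s))]

def curvature_coefficients(A, r, s, h=1e-6):
    # Returns B[b][alpha][beta] as defined in (5.21).
    p, m = len(s), len(r)
    A0 = A(r, s)
    dr = [_partials(A, r, s, "r", k, h) for k in range(m)]  # dr[k][b][al] = dA^b_al/dr^k
    ds = [_partials(A, r, s, "s", k, h) for k in range(p)]  # ds[a][b][al] = dA^b_al/ds^a
    B = [[[0.0] * m for _ in range(m)] for _ in range(p)]
    for b in range(p):
        for al in range(m):
            for be in range(m):
                value = dr[be][b][al] - dr[al][b][be]
                for a in range(p):
                    value += A0[a][al] * ds[a][b][be] - A0[a][be] * ds[a][b][al]
                B[b][al][be] = value
    return B

def disc_A(r, s, R=1.0):
    # Illustrative coefficients: only the first column is nonzero and it
    # depends on the second r-coordinate (a heading angle).
    heading = r[1]
    return [[-R * math.cos(heading), 0.0, 0.0],
            [-R * math.sin(heading), 0.0, 0.0]]

B = curvature_coefficients(disc_A, [0.3, 0.7, 0.2], [0.0, 0.0])
print(B[0][0][1], B[1][0][1])  # approximately sin(0.7) and -cos(0.7)
```

The result is antisymmetric in the two lower indices, as the definition requires.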
5.4 Interpretation
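In section 5.3, the equations of motion were rewritten in terms of the constrained Lagrangian $L_c$. To be able to make further interpretations we rewrite them once more, following the general path of [9, p. 24].
By writing the constraint equations as p one-forms (section 2.6) we have
$$
\omega^b = ds^b + \sum_{\alpha=1}^{n-p} A^b_\alpha(r, s)\, dr^\alpha, \qquad b = 1, ..., p. \tag{5.22}
$$
Taking the exterior derivative (section 2.8) of the $\omega^b$'s, we get the two-forms
$$
d\omega^b = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \frac{\partial A^b_\alpha}{\partial r^\beta}\, dr^\beta \wedge dr^\alpha + \sum_{\alpha=1}^{n-p} \sum_{a=1}^{p} \frac{\partial A^b_\alpha}{\partial s^a}\, ds^a \wedge dr^\alpha. \tag{5.23}
$$
Using the constraints in the form
$$
ds^b = -\sum_{\beta=1}^{n-p} A^b_\beta(r, s)\, dr^\beta \tag{5.24}
$$
we can eliminate ds:
$$
d\omega^b = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) dr^\beta \wedge dr^\alpha. \tag{5.25}
$$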
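Now, using the fact that the wedge product is alternating bilinear (section 2.7) we have
$$
dr^\beta \wedge dr^\alpha\, (\dot r, \delta r) = \dot r^\beta\, \delta r^\alpha - \dot r^\alpha\, \delta r^\beta \tag{5.26}
$$
and thus
$$
d\omega^b(\dot r, \delta r) = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \dot r^\beta\, \delta r^\alpha - \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \sum_{a=1}^{p} A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \dot r^\alpha\, \delta r^\beta. \tag{5.27}
$$
Interchanging the roles of the summation variables $\alpha$ and $\beta$ in the second term we get
$$
d\omega^b(\dot r, \delta r) = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left[ \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \right] \dot r^\beta\, \delta r^\alpha \tag{5.28}
$$
or
$$
d\omega^b(\dot r, \delta r) = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, \dot r^\beta\, \delta r^\alpha, \qquad b = 1, ..., p. \tag{5.29}
$$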
So the variation of the constrained Lagrangian $L_c$ can be written as
$$
\delta L_c = \sum_{\alpha=1}^{n-p} \left[ \frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} \right] \delta r^\alpha = -\sum_{b=1}^{p} \frac{\partial L}{\partial \dot s^b}\, d\omega^b(\dot r, \delta r), \tag{5.30}
$$
where the sign on the right hand side follows from combining (5.20) with (5.29). This equation isolates the effects of the constraints and also shows that when $d\omega^b = 0$ the right hand side of (5.29) vanishes. This is the case when the constraints are integrable (holonomic), that is when the one-form $\omega^b$ is closed (section 2.8), once again confirming what can be seen in equation (4.29). This equation thus also provides an alternate way to obtain the equations of motion for holonomic systems, especially when the constraints are initially given as functions linear in velocities, by substituting the constraints into the Lagrangian and setting the variation of $L_c$ to zero [1]. In cases where the constraints truly are nonholonomic however, there are additional forces generated by the constraints, given by the right hand side of (5.30). In sections 5.6 and 5.7 we shall explore how these terms can be treated in terms of geometry.
5.5 Conservation laws

It is customary to define an associated momentum (often called the conjugate momentum) to each generalized coordinate. The conjugate momentum of a coordinate $r^k$ is defined as
$$
p_k := \frac{\partial L}{\partial \dot r^k}. \tag{5.31}
$$
This also means that for a holonomic system the equations of motion (4.11) take the following form in terms of the conjugate momenta:
$$
\dot p_\alpha = \frac{\partial L}{\partial r^\alpha} + \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha}, \qquad \alpha = 1, ..., n-p. \tag{5.32}
$$
Now, if the Lagrangian L along with all the constraint functions $f^a$ is independent of a certain coordinate $r^k$, then $r^k$ is said to be cyclic. For such a coordinate, (5.32) becomes
$$
\dot p_k = 0, \tag{5.33}
$$
which means that
$$
p_k = \text{constant}, \tag{5.34}
$$
or, in other words, $p_k$ is conserved. Indeed, the statement that the generalized momentum conjugate to a cyclic coordinate is conserved is presented in [7, p. 56] as a general conservation theorem, and there are several other examples (for instance in [8, p. 568]) where similar statements imply that this conservation law always holds. However, this is only true for holonomic constraints, as we shall see below.
If we instead look at the equations of motion that we derived for nonholonomic constraints of the form (5.3), the conjugate momentum is now
$$
p_k := \frac{\partial L_c}{\partial \dot r^k}, \tag{5.35}
$$
so when inserting this into (5.20) we get
$$
\dot p_\alpha = \frac{\partial L_c}{\partial r^\alpha} - \sum_{b=1}^{p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b} - \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a}, \qquad \alpha = 1, ..., n-p. \tag{5.36}
$$
In this context, if $r^k$ is cyclic, then all the quantities $L$, $L_c$, $B^b_{\alpha\beta}$ and all the constraint functions $A^a_\alpha$ are independent of $r^k$. If we also assume, as is often the case, that the dependent coordinates s are also cyclic, then (5.36) becomes
$$
\dot p_k = -\sum_{b=1}^{p} \sum_{\beta=1}^{n-p} B^b_{k\beta}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b}, \tag{5.37}
$$
and we can clearly see, even with this last assumption, that the conjugate momentum of a cyclic variable is generally not conserved for nonholonomic systems. Further discussion of this fact can be found for instance in [17] and [3].
5.6 An Ehresmann connection

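We now try to formulate the results derived so far in section 5 in geometrical terms. The idea of an Ehresmann connection (section 2.10) is to think of the tangent space $T_qQ$ at the point $q \in Q$ as the sum of a horizontal part and a vertical part, where the horizontal part of the space is exactly the subspace of $T_qQ$ defined by the constraints. Using our previous notation of r's and s's, the basis of $T_qQ$ is
$$
B = \left\{ \frac{\partial}{\partial r^1}, \frac{\partial}{\partial r^2}, ..., \frac{\partial}{\partial r^{n-p}}, \frac{\partial}{\partial s^1}, \frac{\partial}{\partial s^2}, ..., \frac{\partial}{\partial s^p} \right\} \tag{5.38}
$$
and any vector $u \in T_qQ$ can be written as
$$
u = \begin{pmatrix} \dot r^1 \\ \dot r^2 \\ \vdots \\ \dot r^{n-p} \\ \dot s^1 \\ \dot s^2 \\ \vdots \\ \dot s^p \end{pmatrix} = \sum_{\alpha=1}^{n-p} \dot r^\alpha \frac{\partial}{\partial r^\alpha} + \sum_{a=1}^{p} \dot s^a \frac{\partial}{\partial s^a}, \tag{5.39}
$$
where the $\dot r^\alpha$'s and $\dot s^a$'s are to be interpreted only as a notation for the values of the elements of u and are not to be confused with the time derivatives of the elements of any specific vector in the corresponding position space.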
We now define a fibre bundle (section 2.5) with total space Q and a base space that we will call R, which is locally the space of the r-variables. Thus we have a submersion $\pi$ that locally is the projection
$$
\begin{pmatrix} r^1 \\ r^2 \\ \vdots \\ r^{n-p} \\ s^1 \\ s^2 \\ \vdots \\ s^p \end{pmatrix} \mapsto \begin{pmatrix} r^1 \\ r^2 \\ \vdots \\ r^{n-p} \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{5.40}
$$
The tangent space of the base space at each point is then the space
$$
T_qR = \operatorname{span} \left\{ \frac{\partial}{\partial r^1}, \frac{\partial}{\partial r^2}, ..., \frac{\partial}{\partial r^{n-p}} \right\}. \tag{5.41}
$$
The projection from $T_qQ$ to $T_qR$ is called $T_q\pi$ and the kernel of $T_q\pi$ is what we call the vertical space, defined as
$$
V_q := \operatorname{span} \left\{ \frac{\partial}{\partial s^1}, \frac{\partial}{\partial s^2}, ..., \frac{\partial}{\partial s^p} \right\}. \tag{5.42}
$$
The space $T_qR$ is now a possible choice of horizontal space in order to have the tangent space $T_qQ$ be a sum of the vertical and horizontal parts. This choice of horizontal space does not, however, fulfil our second wish, that the horizontal space also is the subspace of $T_qQ$ defined by our constraints. Instead we define the map
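$$
A : T_qQ \to V_q \tag{5.43}
$$
by
$$
A_q(u) := \sum_{b=1}^{p} \omega^b(u)\, \frac{\partial}{\partial s^b}. \tag{5.44}
$$
Inserting (5.22) in (5.44) we get
$$
A_q(u) = \sum_{b=1}^{p} \left( \dot s^b + \sum_{\alpha=1}^{n-p} A^b_\alpha\, \dot r^\alpha \right) \frac{\partial}{\partial s^b}. \tag{5.45}
$$
Written in the basis B, the resulting vector is
$$
a := A_q(u) = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \dot s^1 + \sum_{\alpha=1}^{n-p} A^1_\alpha\, \dot r^\alpha \\ \dot s^2 + \sum_{\alpha=1}^{n-p} A^2_\alpha\, \dot r^\alpha \\ \vdots \\ \dot s^p + \sum_{\alpha=1}^{n-p} A^p_\alpha\, \dot r^\alpha \end{pmatrix} \tag{5.46}
$$
so that
$$
A_q(a) = \sum_{b=1}^{p} \left( \dot s^b + \sum_{\alpha=1}^{n-p} A^b_\alpha\, \dot r^\alpha \right) \frac{\partial}{\partial s^b} = a. \tag{5.47}
$$
$A_q$ is a projection (section 2.2) from $T_qQ$ to the vertical space $V_q$, or in other words, $A_q$ is an Ehresmann connection.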
The horizontal space $H_q$ is now given by the kernel of $A_q$,
$$
H_q = \ker(A_q) = \{ u \in T_qQ : A_q(u) = 0 \}, \tag{5.48}
$$
or more explicitly, $H_q$ is spanned by all vectors in $T_qQ$ that satisfy
$$
\dot s^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha\, \dot r^\alpha, \qquad a = 1, ..., p. \tag{5.49}
$$
This means that even if $T_qQ$ is the total space for velocity vectors at a point q, any allowed velocity lies in the subspace $H_q$, determined by the Ehresmann connection that was defined using the constraint equations.
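To sum up, the vertical projection of a vector $u \in T_qQ$, written
$$
u_V := A(u), \tag{5.50}
$$
is given by the map
$$
\begin{pmatrix} \dot r^1 \\ \vdots \\ \dot r^{n-p} \\ \dot s^1 \\ \vdots \\ \dot s^p \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \dot s^1 + \sum_{\alpha=1}^{n-p} A^1_\alpha\, \dot r^\alpha \\ \vdots \\ \dot s^p + \sum_{\alpha=1}^{n-p} A^p_\alpha\, \dot r^\alpha \end{pmatrix} \tag{5.51}
$$
and the horizontal projection, written $u_H$, is given by
$$
\begin{pmatrix} \dot r^1 \\ \vdots \\ \dot r^{n-p} \\ \dot s^1 \\ \vdots \\ \dot s^p \end{pmatrix} \mapsto \begin{pmatrix} \dot r^1 \\ \vdots \\ \dot r^{n-p} \\ -\sum_{\alpha=1}^{n-p} A^1_\alpha\, \dot r^\alpha \\ \vdots \\ -\sum_{\alpha=1}^{n-p} A^p_\alpha\, \dot r^\alpha \end{pmatrix}. \tag{5.52}
$$
From this it is easily checked that adding up the two projections gives us back the original vector and that the constrained Lagrangian can be written as
$$
L_c(r, s, \dot r) = L(q, \dot q_H). \tag{5.53}
$$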
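The statements about the two projections can be checked with a few lines of code. This is only an illustrative sketch: the coefficient matrix is passed in as a plain nested list `A[b][alpha]` already evaluated at the point q, the tangent vector is split into its r-part and s-part, and all function names and numerical values are assumptions made for the example.

```python
import math

def vertical_part(A, r_dot, s_dot):
    # Vertical projection: zero r-components, and s-components
    # s_dot^b + sum_alpha A^b_alpha r_dot^alpha.
    vert = [sd + sum(row[al] * r_dot[al] for al in range(len(r_dot)))
            for sd, row in zip(s_dot, A)]
    return [0.0] * len(r_dot), vert

def horizontal_part(A, r_dot, s_dot):
    # Horizontal projection: r-components unchanged, and s-components
    # fixed by the constraints, -sum_alpha A^b_alpha r_dot^alpha.
    horiz = [-sum(row[al] * r_dot[al] for al in range(len(r_dot))) for row in A]
    return list(r_dot), horiz

# Illustrative values: p = 2 constraints, n - p = 3 independent velocities.
A = [[-math.cos(0.7), 0.0, 0.0],
     [-math.sin(0.7), 0.0, 0.0]]
r_dot, s_dot = [2.0, 0.5, -0.3], [1.0, -1.0]

vr, vs = vertical_part(A, r_dot, s_dot)
hr, hs = horizontal_part(A, r_dot, s_dot)

# u_H + u_V reproduces u, A(u_H) = 0, and A(A(u)) = A(u).
print([a + b for a, b in zip(hr, vr)], [a + b for a, b in zip(hs, vs)])
print(vertical_part(A, hr, hs)[1])
print(vertical_part(A, vr, vs)[1], vs)
```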
5.7 Curvature
The curvature B of the connection A, with respect to two vector fields X and Y on Q,
is dened as
B(X, Y) := A([XH , YH ]) (5.54)
where the bracket on the right hand side of (5.54) is the Jacobi-Lie bracket (section 2.9).
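In local coordinates, let X and Y be described by
$$
X = \begin{pmatrix} X^1 \\ X^2 \\ \vdots \\ X^n \end{pmatrix} = \begin{pmatrix} r_x^1 \\ \vdots \\ r_x^{n-p} \\ s_x^1 \\ \vdots \\ s_x^p \end{pmatrix}, \qquad Y = \begin{pmatrix} Y^1 \\ Y^2 \\ \vdots \\ Y^n \end{pmatrix} = \begin{pmatrix} r_y^1 \\ \vdots \\ r_y^{n-p} \\ s_y^1 \\ \vdots \\ s_y^p \end{pmatrix}. \tag{5.55}
$$
The horizontal projections of the vector fields, written in the basis B as before, are then
$$
X_H = \sum_{\alpha=1}^{n-p} r_x^\alpha \frac{\partial}{\partial r^\alpha} - \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, r_x^\alpha \frac{\partial}{\partial s^a} \tag{5.56}
$$
and
$$
Y_H = \sum_{\alpha=1}^{n-p} r_y^\alpha \frac{\partial}{\partial r^\alpha} - \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, r_y^\alpha \frac{\partial}{\partial s^a}. \tag{5.57}
$$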
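Let $X_H^k$ and $Y_H^k$ denote the k'th row of $X_H$ and $Y_H$ respectively. Now we have
$$
[X_H, Y_H]^k = \sum_{\alpha=1}^{n-p} r_x^\alpha \frac{\partial Y_H^k}{\partial r^\alpha} - \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha\, r_x^\alpha \frac{\partial Y_H^k}{\partial s^a} - \sum_{\alpha=1}^{n-p} r_y^\alpha \frac{\partial X_H^k}{\partial r^\alpha} + \sum_{a=1}^{p} \sum_{\alpha=1}^{n-p} A^a_\alpha\, r_y^\alpha \frac{\partial X_H^k}{\partial s^a}, \qquad k = 1, ..., n. \tag{5.58}
$$
The derivatives in (5.58) are
$$
\frac{\partial Y_H^k}{\partial r^\alpha} = \begin{cases} 0, & \text{for } k = 1, ..., n-p \\ -\sum_{\beta=1}^{n-p} \dfrac{\partial A^{k-(n-p)}_\beta}{\partial r^\alpha}\, r_y^\beta, & \text{for } k = (n-p+1), ..., n, \end{cases} \tag{5.59}
$$
$$
\frac{\partial Y_H^k}{\partial s^a} = \begin{cases} 0, & \text{for } k = 1, ..., n-p \\ -\sum_{\beta=1}^{n-p} \dfrac{\partial A^{k-(n-p)}_\beta}{\partial s^a}\, r_y^\beta, & \text{for } k = (n-p+1), ..., n, \end{cases} \tag{5.60}
$$
$$
\frac{\partial X_H^k}{\partial r^\alpha} = \begin{cases} 0, & \text{for } k = 1, ..., n-p \\ -\sum_{\beta=1}^{n-p} \dfrac{\partial A^{k-(n-p)}_\beta}{\partial r^\alpha}\, r_x^\beta, & \text{for } k = (n-p+1), ..., n \end{cases} \tag{5.61}
$$
and
$$
\frac{\partial X_H^k}{\partial s^a} = \begin{cases} 0, & \text{for } k = 1, ..., n-p \\ -\sum_{\beta=1}^{n-p} \dfrac{\partial A^{k-(n-p)}_\beta}{\partial s^a}\, r_x^\beta, & \text{for } k = (n-p+1), ..., n. \end{cases} \tag{5.62}
$$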
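Since $[X_H, Y_H]^k = 0$ for all $k = 1, ..., n-p$ we write $b = k - (n-p)$ so that
$$
[X_H, Y_H] = \sum_{b=1}^{p} \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} \left[ \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right) \right] r_x^\alpha\, r_y^\beta\, \frac{\partial}{\partial s^b} \tag{5.63}
$$
or
$$
[X_H, Y_H] = \sum_{b=1}^{p} \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, r_x^\alpha\, r_y^\beta\, \frac{\partial}{\partial s^b}. \tag{5.64}
$$
Recalling that $r_x^\alpha = X^\alpha$ and $r_y^\beta = Y^\beta$, we finally get
$$
[X_H, Y_H] = \sum_{b=1}^{p} \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, X^\alpha\, Y^\beta\, \frac{\partial}{\partial s^b}. \tag{5.65}
$$
Since this vector field is already in the vertical space $V_q$, and since A is a projection to $V_q$, we also know that
$$
A([X_H, Y_H]) = [X_H, Y_H]. \tag{5.66}
$$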
Also, equations (5.25) and (5.26) imply that
np
X np
d (XH , YH ) =
X
b b
B X Y , b = 1, ..., p. (5.67)
=1 =1
The conclusion is that the curvature B of the Ehresmann connection A, acting on two
arbitrary vectors X and Y on Q, can be written as

p

B(X, Y) = d b (XH , YH )
X
(5.68)
sb
b=1
and that it can be seen as a measure of the failure of the constraints to be integrable [1,
p. 108]. The local expression for curvature is then given by
$$
B(X, Y)^b = \sum_{\alpha=1}^{n-p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, X^\alpha\, Y^\beta, \qquad b = 1, ..., p, \tag{5.69}
$$
with the coefficients $B^b_{\alpha\beta}$ given by (5.21).
6 An example: The falling disc
We now apply the method developed in the previous sections to an example: a model of a homogeneous disc, such as a coin, of negligible thickness rolling on a horizontal plane. A similar, but not identical, model was explored as early as 1892 by Vierkandt in [16] and further, for example, in [1] and [18].
6.1 Writing the Lagrangian

We assume a homogeneous disc of radius R and mass m. We neglect all frictional forces
and assume that the disc is not allowed to slip as it rolls and rotates on the surface.
Rolling without slipping is of course an unrealistic idealization that is commonly adopted
for these types of systems, but for our purpose in this text it is deemed to be realistic
enough. The disc's centre of mass is positioned at the coordinates
X, Y, Z. (6.1)
The rotation of the disc in its rolling direction is measured counter clockwise by the angle
$$
\theta, \tag{6.2}
$$
the rotation about the disc's z-axis, measured from the x-axis counter clockwise, by
$$
\phi, \tag{6.3}
$$
and the inclination angle by
$$
\psi. \tag{6.4}
$$
These generalized coordinates are illustrated in figure 4.
Figure 4: A rolling disc on a flat surface
The kinetic energy of the system is
$$
T = \frac{1}{2} m (\dot X^2 + \dot Y^2 + \dot Z^2) + \frac{1}{2} (I_1 \dot\theta^2 + I_2 \dot\phi^2 + I_3 \dot\psi^2) \tag{6.5}
$$
with $I_1$, $I_2$ and $I_3$ the moments of inertia of the disc about each corresponding axis from where we measure the rotation, here given by
$$
I_1 = \frac{1}{2} m R^2, \tag{6.6}
$$
corresponding to the change in rotation measured by the rolling angle $\theta$,
$$
I_2 = \frac{1}{4} m R^2, \tag{6.7}
$$
corresponding to the heading angle $\phi$, and
$$
I_3 = \frac{1}{4} m R^2, \tag{6.8}
$$
corresponding to the inclination angle $\psi$. A more detailed description of how these values can be calculated can be found in any book on basic mechanics, such as [13, p. 663-690].
The potential energy is
$$
V = mgZ, \tag{6.9}
$$
so that we obtain the Lagrangian on TQ:
$$
L = \frac{m}{2} \left[ \dot X^2 + \dot Y^2 + \dot Z^2 + \frac{1}{2} R^2 \dot\theta^2 + \frac{1}{4} R^2 \dot\phi^2 + \frac{1}{4} R^2 \dot\psi^2 - 2gZ \right]. \tag{6.10}
$$
6.2 Change of coordinates

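Since rolling without slipping is better expressed in terms of the coordinates of the point of contact between the disc and the surface, we introduce
$$
x, y \tag{6.11}
$$
for this purpose. The coordinates X, Y and Z can now be expressed as functions of these new coordinates, the heading angle $\phi$ and the inclination angle $\psi$. We now have
$$
X = x - R \sin\phi \sin\psi, \tag{6.12}
$$
$$
Y = y + R \cos\phi \sin\psi \tag{6.13}
$$
and
$$
Z = R \cos\psi. \tag{6.14}
$$
What this actually means is that the position of the centre of mass is known as soon as we know the point of contact, the heading angle and the inclination of the disc. These constraints are all holonomic so we can eliminate them directly by inserting the equations into the Lagrangian. To be able to do this we also need the time derivatives,
$$
\dot X = \dot x - R \left( \dot\phi \cos\phi \sin\psi + \dot\psi \sin\phi \cos\psi \right), \tag{6.15}
$$
$$
\dot Y = \dot y - R \left( \dot\phi \sin\phi \sin\psi - \dot\psi \cos\phi \cos\psi \right), \tag{6.16}
$$
and
$$
\dot Z = -R \sin(\psi)\, \dot\psi. \tag{6.17}
$$
We also calculate
$$
\dot X^2 = \dot x^2 - 2 \dot x R \left( \dot\phi \cos\phi \sin\psi + \dot\psi \sin\phi \cos\psi \right) + R^2 \left( \cos^2\phi \sin^2\psi\, \dot\phi^2 + 2 \cos\phi \sin\phi \cos\psi \sin\psi\, \dot\phi \dot\psi + \sin^2\phi \cos^2\psi\, \dot\psi^2 \right), \tag{6.18}
$$
$$
\dot Y^2 = \dot y^2 - 2 \dot y R \left( \dot\phi \sin\phi \sin\psi - \dot\psi \cos\phi \cos\psi \right) + R^2 \left( \sin^2\phi \sin^2\psi\, \dot\phi^2 - 2 \cos\phi \sin\phi \cos\psi \sin\psi\, \dot\phi \dot\psi + \cos^2\phi \cos^2\psi\, \dot\psi^2 \right) \tag{6.19}
$$
and
$$
\dot Z^2 = R^2 \sin^2\psi\, \dot\psi^2. \tag{6.20}
$$
Then we have
$$
\dot X^2 + \dot Y^2 + \dot Z^2 = \dot x^2 + \dot y^2 + R^2 \dot\psi^2 + R^2 \sin^2\psi\, \dot\phi^2 - 2R \left[ \dot\phi \sin\psi\, (\dot x \cos\phi + \dot y \sin\phi) + \dot\psi \cos\psi\, (\dot x \sin\phi - \dot y \cos\phi) \right]. \tag{6.21}
$$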
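Now insert (6.14) and (6.21) into the Lagrangian to get
$$
L = \frac{m}{2} \left[ \dot x^2 + \dot y^2 + \frac{5R^2}{4} \dot\psi^2 + R^2 \left( \frac{1}{4} + \sin^2\psi \right) \dot\phi^2 + \frac{R^2}{2} \dot\theta^2 - 2gR \cos\psi - F \right], \tag{6.22}
$$
where
$$
F = 2R \left[ \dot\phi \sin\psi\, (\dot x \cos\phi + \dot y \sin\phi) + \dot\psi \cos\psi\, (\dot x \sin\phi - \dot y \cos\phi) \right]. \tag{6.23}
$$
In the language of geometry we now have the configuration manifold
$$
Q = \mathbb{R}^2 \times S^1 \times S^1 \times S^1_+ \tag{6.24}
$$
where
$$
[x, y]^T \in \mathbb{R}^2 \tag{6.25}
$$
and $S^1$ is the one-sphere, the set of all points on a circle, and it is therefore the natural space for each of the angles $\theta$ and $\phi$. The set of points $S^1_+$ is defined as
$$
S^1_+ := \{ x, y \in \mathbb{R} \mid x^2 + y^2 = 1,\ y \geq 0 \}, \tag{6.26}
$$
and is the set of points of the upper half of a circle, which is to be used as the space for the inclination angle $\psi$.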
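The algebra leading to the expression for the squared speed of the centre of mass is easy to get wrong, so a numerical spot check is useful. The sketch below writes φ for the heading angle and ψ for the inclination angle (labels assumed here), takes the centre of mass at X = x − R sin φ sin ψ, Y = y + R cos φ sin ψ, Z = R cos ψ, and compares the directly squared velocity components with the collected expression involving F. Function names and the sample values are illustrative only.

```python
import math

def speed_squared_direct(R, phi, psi, x_dot, y_dot, phi_dot, psi_dot):
    # Square and add the velocity components of the centre of mass.
    X_dot = x_dot - R * (phi_dot * math.cos(phi) * math.sin(psi)
                         + psi_dot * math.sin(phi) * math.cos(psi))
    Y_dot = y_dot - R * (phi_dot * math.sin(phi) * math.sin(psi)
                         - psi_dot * math.cos(phi) * math.cos(psi))
    Z_dot = -R * math.sin(psi) * psi_dot
    return X_dot ** 2 + Y_dot ** 2 + Z_dot ** 2

def speed_squared_collected(R, phi, psi, x_dot, y_dot, phi_dot, psi_dot):
    # Collected form: x'^2 + y'^2 + R^2 psi'^2 + R^2 sin^2(psi) phi'^2 - F.
    F = 2.0 * R * (
        phi_dot * math.sin(psi) * (x_dot * math.cos(phi) + y_dot * math.sin(phi))
        + psi_dot * math.cos(psi) * (x_dot * math.sin(phi) - y_dot * math.cos(phi))
    )
    return (x_dot ** 2 + y_dot ** 2 + R ** 2 * psi_dot ** 2
            + R ** 2 * math.sin(psi) ** 2 * phi_dot ** 2 - F)

sample = (0.8, 0.7, 0.3, 1.1, -0.4, 2.0, -1.5)  # R, phi, psi, x', y', phi', psi'
print(speed_squared_direct(*sample), speed_squared_collected(*sample))
```

Both functions return the same value up to rounding, for any choice of the sample values.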
6.3 Nonholonomic constraints

The condition of rolling without slipping can be described by the equations
$$
\dot x = R \dot\theta \cos\phi \tag{6.27}
$$
and
$$
\dot y = R \dot\theta \sin\phi. \tag{6.28}
$$
To be able to write these constraints in the language of the approach we have developed, we let
$$
s = [x, y]^T \tag{6.29}
$$
and
$$
r = [\theta, \phi, \psi]^T \tag{6.30}
$$
so that (6.27) and (6.28) are written as
$$
\dot s^1 = R \cos(r^2)\, \dot r^1 \tag{6.31}
$$
and
$$
\dot s^2 = R \sin(r^2)\, \dot r^1. \tag{6.32}
$$
This is exactly the form of (5.3) with
$$
A^1_1(r, s) = -R \cos\phi, \tag{6.33}
$$
$$
A^2_1(r, s) = -R \sin\phi \tag{6.34}
$$
and with all the other A's zero.
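Now we can use (5.21) to calculate the curvature terms:
$$
B^1_{12} = \frac{\partial A^1_1}{\partial \phi} = \frac{\partial}{\partial \phi} (-R \cos\phi) = R \sin\phi, \tag{6.35}
$$
$$
B^2_{12} = \frac{\partial A^2_1}{\partial \phi} = \frac{\partial}{\partial \phi} (-R \sin\phi) = -R \cos\phi, \tag{6.36}
$$
$$
B^1_{21} = -\frac{\partial A^1_1}{\partial \phi} = -\frac{\partial}{\partial \phi} (-R \cos\phi) = -R \sin\phi \tag{6.37}
$$
and
$$
B^2_{21} = -\frac{\partial A^2_1}{\partial \phi} = -\frac{\partial}{\partial \phi} (-R \sin\phi) = R \cos\phi, \tag{6.38}
$$
with all other B's zero.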
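Remembering that the constrained Lagrangian is the Lagrangian where all velocities fulfil the nonholonomic constraint equations, we insert (6.27) and (6.28) into the Lagrangian. First we see that when inserting the constraints into the function F we get
$$
F = 2R^2 \dot\theta \left[ \dot\phi \sin\psi\, (\cos^2\phi + \sin^2\phi) + \dot\psi \cos\psi\, (\cos\phi \sin\phi - \sin\phi \cos\phi) \right] = 2R^2 \dot\theta \dot\phi \sin\psi, \tag{6.39}
$$
and then we see that
$$
\dot x^2 + \dot y^2 = R^2 \cos^2(\phi)\, \dot\theta^2 + R^2 \sin^2(\phi)\, \dot\theta^2 = R^2 \dot\theta^2 \tag{6.40}
$$
so the constrained Lagrangian becomes
$$
L_c = L(q, \dot q_H) = mR^2 \left[ \frac{3}{4} \dot\theta^2 + \left( \frac{1}{8} + \frac{1}{2} \sin^2\psi \right) \dot\phi^2 - \sin\psi\, \dot\theta \dot\phi + \frac{5}{8} \dot\psi^2 - \frac{g}{R} \cos\psi \right]. \tag{6.41}
$$
Since
$$
\frac{\partial L_c}{\partial s^1} = \frac{\partial L_c}{\partial x} = 0 \tag{6.42}
$$
and
$$
\frac{\partial L_c}{\partial s^2} = \frac{\partial L_c}{\partial y} = 0, \tag{6.43}
$$
equation (5.20) becomes
$$
\frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{b=1}^{2} \sum_{\beta=1}^{3} B^b_{\alpha\beta}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b} = 0, \qquad \alpha = 1, 2, 3 \tag{6.44}
$$
with
$$
[r^1, r^2, r^3]^T = [\theta, \phi, \psi]^T \tag{6.45}
$$
and
$$
[s^1, s^2]^T = [x, y]^T. \tag{6.46}
$$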
6.4 Equations of motion

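Now we calculate the necessary derivatives to insert into (6.44):
$$
\dot P_\theta = \frac{d}{dt}\frac{\partial L_c}{\partial \dot\theta} = mR^2 \left( \frac{3}{2} \ddot\theta - \sin\psi\, \ddot\phi - \cos\psi\, \dot\psi \dot\phi \right), \tag{6.47}
$$
$$
\dot P_\phi = \frac{d}{dt}\frac{\partial L_c}{\partial \dot\phi} = mR^2 \left[ \left( \frac{1}{4} + \sin^2\psi \right) \ddot\phi + \sin(2\psi)\, \dot\psi \dot\phi - \sin\psi\, \ddot\theta - \cos\psi\, \dot\psi \dot\theta \right], \tag{6.48}
$$
$$
\dot P_\psi = \frac{d}{dt}\frac{\partial L_c}{\partial \dot\psi} = \frac{5}{4} mR^2 \ddot\psi, \tag{6.49}
$$
$$
\frac{\partial L_c}{\partial \psi} = mR^2 \left( \frac{1}{2} \sin(2\psi)\, \dot\phi^2 - \cos\psi\, \dot\theta \dot\phi + \frac{g}{R} \sin\psi \right) \tag{6.50}
$$
and
$$
\frac{\partial L_c}{\partial \theta} = \frac{\partial L_c}{\partial \phi} = 0. \tag{6.51}
$$
We also need
$$
\frac{\partial L}{\partial \dot x} = m \dot x - mR \left( \dot\phi \sin\psi \cos\phi + \dot\psi \cos\psi \sin\phi \right) \tag{6.52}
$$
and
$$
\frac{\partial L}{\partial \dot y} = m \dot y - mR \left( \dot\phi \sin\psi \sin\phi - \dot\psi \cos\psi \cos\phi \right). \tag{6.53}
$$
Now we can calculate
$$
B^1_{12} \frac{\partial L}{\partial \dot x} + B^2_{12} \frac{\partial L}{\partial \dot y} = mR (\dot x \sin\phi - \dot y \cos\phi) - mR^2 \dot\psi \cos\psi \tag{6.54}
$$
and
$$
B^1_{21} \frac{\partial L}{\partial \dot x} + B^2_{21} \frac{\partial L}{\partial \dot y} = mR^2 \dot\psi \cos\psi - mR (\dot x \sin\phi - \dot y \cos\phi). \tag{6.55}
$$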
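Inserting into (6.44) gives us
$$
mR^2 \left( \frac{3}{2} \ddot\theta - \sin\psi\, \ddot\phi - 2 \cos\psi\, \dot\psi \dot\phi \right) + mR\, \dot\phi\, (\dot x \sin\phi - \dot y \cos\phi) = 0, \tag{6.56}
$$
$$
mR^2 \left[ \left( \frac{1}{4} + \sin^2\psi \right) \ddot\phi + \sin(2\psi)\, \dot\psi \dot\phi - \sin\psi\, \ddot\theta \right] - mR\, \dot\theta\, (\dot x \sin\phi - \dot y \cos\phi) = 0, \tag{6.57}
$$
and
$$
mR^2 \left( \frac{5}{4} \ddot\psi - \frac{1}{2} \sin(2\psi)\, \dot\phi^2 + \cos\psi\, \dot\theta \dot\phi - \frac{g}{R} \sin\psi \right) = 0. \tag{6.58}
$$
Using the constraint equations (6.27) and (6.28) we see that
$$
(\dot x \sin\phi - \dot y \cos\phi) = R \dot\theta\, (\cos\phi \sin\phi - \sin\phi \cos\phi) = 0, \tag{6.59}
$$
so now we arrive at the final equations of motion
$$
\ddot\theta - \frac{2}{3} \left[ \sin\psi\, \ddot\phi + 2 \cos\psi\, \dot\psi \dot\phi \right] = 0, \tag{6.60}
$$
$$
\ddot\phi + \frac{\sin(2\psi)\, \dot\psi \dot\phi - \sin\psi\, \ddot\theta}{\frac{1}{4} + \sin^2\psi} = 0, \tag{6.61}
$$
and
$$
\ddot\psi - \frac{4}{5} \left[ \frac{1}{2} \sin(2\psi)\, \dot\phi^2 - \cos\psi\, \dot\theta \dot\phi + \frac{g}{R} \sin\psi \right] = 0. \tag{6.62}
$$
Equations (6.60), (6.61) and (6.62) give the complete motion of the system.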
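The equations of motion can also be explored numerically. The following is a rough sketch rather than a validated solver: it writes θ for the rolling angle, φ for the heading angle and ψ for the tilt angle (labels assumed here), solves the two coupled equations for the accelerations of θ and φ by eliminating one of them, and advances the state with a classical fourth-order Runge-Kutta step. The function names, the value of g/R and the initial state are illustrative assumptions. As a consistency check it monitors the energy belonging to the constrained Lagrangian (divided by mR²), which these equations should keep constant.

```python
import math

def accelerations(state, g_over_r):
    # state = (theta, phi, psi, theta_dot, phi_dot, psi_dot)
    _, _, psi, th_d, ph_d, ps_d = state
    s, c = math.sin(psi), math.cos(psi)
    # Eliminating theta_ddot from the two coupled equations gives
    #   phi_ddot * (1/4 + sin^2(psi)/3) = -(2/3) sin(psi) cos(psi) psi_dot phi_dot.
    ph_dd = -(2.0 / 3.0) * s * c * ps_d * ph_d / (0.25 + s * s / 3.0)
    th_dd = (2.0 / 3.0) * (s * ph_dd + 2.0 * c * ps_d * ph_d)
    ps_dd = 0.8 * (0.5 * math.sin(2.0 * psi) * ph_d ** 2
                   - c * th_d * ph_d + g_over_r * s)
    return th_dd, ph_dd, ps_dd

def derivative(state, g_over_r):
    # First-order form: d/dt (angles, rates) = (rates, accelerations).
    return list(state[3:]) + list(accelerations(state, g_over_r))

def rk4_step(state, dt, g_over_r):
    def shifted(k, factor):
        return [x + factor * kx for x, kx in zip(state, k)]
    k1 = derivative(state, g_over_r)
    k2 = derivative(shifted(k1, dt / 2.0), g_over_r)
    k3 = derivative(shifted(k2, dt / 2.0), g_over_r)
    k4 = derivative(shifted(k3, dt), g_over_r)
    return [x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]

def energy(state, g_over_r):
    # Kinetic plus potential energy of the constrained system, per m R^2.
    _, _, psi, th_d, ph_d, ps_d = state
    s, c = math.sin(psi), math.cos(psi)
    return (0.75 * th_d ** 2 + (0.125 + 0.5 * s * s) * ph_d ** 2
            - s * th_d * ph_d + 0.625 * ps_d ** 2 + g_over_r * c)

def integrate(state, dt, steps, g_over_r):
    for _ in range(steps):
        state = rk4_step(state, dt, g_over_r)
    return state

g_over_r = 9.81  # assumed: R = 1 m, g = 9.81 m/s^2
start = [0.0, 0.0, 0.1, 3.0, 0.5, 0.0]
end = integrate(start, 1e-3, 500, g_over_r)
print(energy(start, g_over_r), energy(end, g_over_r))
```

A fixed-step Runge-Kutta scheme is used only to keep the sketch self-contained; for longer simulations an adaptive integrator would be a more sensible choice.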
6.5 Interpretation and stable solutions

In this example $\theta$ is a cyclic coordinate in the sense of section 5.5, and $L_c$ is also independent of $\phi$, x and y, so that equation (5.37) holds for both $\theta$ and $\phi$ and we can check whether their associated momenta are conserved. We have
$$
\dot P_\theta = mR^2 \dot\psi \dot\phi \cos\psi \tag{6.63}
$$
and
$$
\dot P_\phi = -mR^2 \dot\psi \dot\theta \cos\psi \tag{6.64}
$$
so here we see that the conjugate momenta are not conserved. This is consistent with the statements made in section 5.5.
It is interesting to explore some special solutions to these equations. For instance we can set $\ddot\theta = 0$ and $\ddot\phi = 0$, so that the rolling velocity and turning velocity are both constant. Both equations (6.60) and (6.61) now give
$$
\dot\psi \dot\phi = 0, \tag{6.65}
$$
which tells us that we either have a constant direction of rolling ($\dot\phi = 0$) or a constant tilt angle $\psi$. If we assume the direction is constant, equation (6.62) then gives the solution for $\psi$ as the second order nonlinear differential equation
$$
\ddot\psi - \frac{4g}{5R} \sin\psi = 0. \tag{6.66}
$$
Looking at equation (6.66) we see that the acceleration of the tilt angle will always have the same sign as the angle itself. It follows that the only equilibrium solution in this case is $\psi = 0$, and that it is unstable.
On the other hand, if we assume the tilt angle is constant, then equation (6.62) provides an equation for the possible constant values of $\dot\theta$ and $\dot\phi$,
$$
\dot\phi = \frac{\dot\theta}{2 \sin\psi} \pm \sqrt{ \frac{\dot\theta^2}{4 \sin^2\psi} - \frac{g}{R \cos\psi} }, \tag{6.67}
$$
which in turn gives the inequality
$$
|\dot\theta| \geq \sqrt{ \frac{4g \sin^2\psi}{R \cos\psi} }, \tag{6.68}
$$
which gives the lower limit of the rolling velocity as a function of the constant value of $\psi$.
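The steady solutions with constant tilt can be tabulated with a short script. In this sketch, θ̇ denotes the rolling rate, φ̇ the turning rate and ψ the constant tilt angle (labels assumed here), with 0 < ψ < π/2. The function names and the numerical values are hypothetical and serve only to illustrate the formulas.

```python
import math

def min_rolling_rate(psi, g_over_r):
    # Lower limit on |theta_dot| for a steady solution at tilt psi.
    return math.sqrt(4.0 * g_over_r * math.sin(psi) ** 2 / math.cos(psi))

def steady_turn_rates(theta_dot, psi, g_over_r):
    # Both roots for phi_dot at constant tilt, or an empty tuple when the
    # rolling rate is below the lower limit.
    disc = theta_dot ** 2 / (4.0 * math.sin(psi) ** 2) - g_over_r / math.cos(psi)
    if disc < 0.0:
        return ()
    base = theta_dot / (2.0 * math.sin(psi))
    root = math.sqrt(disc)
    return (base - root, base + root)

def tilt_residual(theta_dot, phi_dot, psi, g_over_r):
    # Bracket of the tilt equation; it must vanish for constant tilt.
    return (0.5 * math.sin(2.0 * psi) * phi_dot ** 2
            - math.cos(psi) * theta_dot * phi_dot
            + g_over_r * math.sin(psi))

g_over_r, psi = 9.81, 0.3            # assumed values
limit = min_rolling_rate(psi, g_over_r)
rates = steady_turn_rates(1.5 * limit, psi, g_over_r)
print(limit, rates)
print([tilt_residual(1.5 * limit, rate, psi, g_over_r) for rate in rates])
```

Below the limiting rolling rate the function returns no solutions, which reflects the inequality stated above.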
Another example is to set $\dot\psi = 0$, so that the tilt angle is constant. Here, equation (6.60) shows that
$$
\ddot\theta = \frac{2}{3} \sin\psi\, \ddot\phi \tag{6.69}
$$
while equation (6.61) gives us
$$
\ddot\theta = \frac{\frac{1}{4} + \sin^2\psi}{\sin\psi}\, \ddot\phi. \tag{6.70}
$$
This can only be true if both $\ddot\theta$ and $\ddot\phi$ are zero or if
$$
\sin^2\psi = -\frac{3}{4}, \tag{6.71}
$$
which of course has no real solutions. Thus we must assume that
$$
\ddot\theta = \ddot\phi = 0, \tag{6.72}
$$
so now we are back to our first case again, with $\psi$ constant.
In summary we can say that constant velocities for rolling and turning give possible solutions with the tilt angle either constant or nonconstant, but if we start from an assumption of constant tilt angle, then we only have solutions where both turning and rolling velocities are constant.
In both of these examples there are possible solutions where the equations (6.63) and (6.64) predict conserved conjugate momenta for $\theta$ and/or $\phi$, which is also what we would expect for this model of a rolling disc.
7 Results and conclusion
In this section the main results and conclusions will be reviewed.
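We started from Newton's second law and used d'Alembert's principle to arrive at the equation
$$
\sum_{j=1}^{n} \left[ \frac{d}{dt}\frac{\partial L}{\partial \dot q^j} - \frac{\partial L}{\partial q^j} \right] \delta q^j = 0. \tag{7.1}
$$
We argued that when all n generalized coordinates are independent of each other, we get the Euler-Lagrange equations
$$
\frac{d}{dt}\frac{\partial L}{\partial \dot q^j} - \frac{\partial L}{\partial q^j} = 0, \qquad j = 1, ..., n. \tag{7.2}
$$
We also saw that when some of the coordinates depend on each other, their dependency expressed by p functions $f^a$, the constraints can be included in the Euler-Lagrange equations, when labelling the (n − p) independent variables with $r^\alpha$, to give us the equations
$$
\frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a \frac{\partial f^a}{\partial r^\alpha}, \qquad \alpha = 1, ..., n-p. \tag{7.3}
$$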
We also showed that, when the constraints are holonomic, they can be added to the Lagrangian to form a modified Lagrangian $\tilde L$ so that the constraints in effect can be imposed before taking the variations of the Lagrangian, consistent with the approach for dealing with constraints via Hamilton's principle.
Next we showed that this procedure fails when applied to nonholonomic constraints, and that the proper equations of motion are only obtained by imposing the constraints after taking the variations, as can only be done by d'Alembert's principle. The equations obtained when using Lagrange multipliers on linear nonholonomic constraints given by
$$
g^a(q, \dot q) := \sum_{i=1}^{n} A^a_i(q)\, \dot q^i = 0 \tag{7.4}
$$
are the equations
$$
\frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \lambda_a A^a_\alpha, \qquad \alpha = 1, ..., n-p. \tag{7.5}
$$
In particular, we showed that when the constraint equations are written in the form
$$
\dot s^a = -\sum_{\alpha=1}^{n-p} A^a_\alpha(r, s)\, \dot r^\alpha, \qquad a = 1, ..., p, \tag{7.6}
$$
the Lagrange multipliers are
$$
\lambda_a = \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a}, \qquad a = 1, ..., p, \tag{7.7}
$$
so that the equations of motion become
$$
\frac{d}{dt}\frac{\partial L}{\partial \dot r^\alpha} - \frac{\partial L}{\partial r^\alpha} = \sum_{a=1}^{p} \left( \frac{d}{dt}\frac{\partial L}{\partial \dot s^a} - \frac{\partial L}{\partial s^a} \right) A^a_\alpha, \qquad \alpha = 1, ..., n-p. \tag{7.8}
$$
Further, we introduced the concept of the constrained Lagrangian $L_c$, which is the Lagrangian defined only on the horizontal space of TQ, or more explicitly, defined only on the subset of TQ defined by the constraint equations. In terms of the constrained Lagrangian we derived the equations of motion to be
$$
\frac{d}{dt}\frac{\partial L_c}{\partial \dot r^\alpha} - \frac{\partial L_c}{\partial r^\alpha} + \sum_{a=1}^{p} A^a_\alpha \frac{\partial L_c}{\partial s^a} = -\sum_{b=1}^{p} \sum_{\beta=1}^{n-p} B^b_{\alpha\beta}\, \dot r^\beta\, \frac{\partial L}{\partial \dot s^b}, \qquad \alpha = 1, ..., n-p, \tag{7.9}
$$
where
$$
B^b_{\alpha\beta} := \left( \frac{\partial A^b_\alpha}{\partial r^\beta} - \frac{\partial A^b_\beta}{\partial r^\alpha} \right) + \sum_{a=1}^{p} \left( A^a_\alpha \frac{\partial A^b_\beta}{\partial s^a} - A^a_\beta \frac{\partial A^b_\alpha}{\partial s^a} \right). \tag{7.10}
$$
By rewriting the right hand side in terms of the exterior derivative of the one-forms $\omega^b$, which express the constraints, we could show that the results were consistent with holonomic constraints and that the extra terms involving $B^b_{\alpha\beta}$, generated by the nonholonomic constraint forces, could be seen, from a geometric perspective, as curvature terms for a specific choice of connection.
This connection, called an Ehresmann connection, is a projection defined on the tangent space $T_qQ$ of each point $q \in Q$ and it is a powerful way to describe the constraints without using local coordinates.
We also showed that the conjugate momentum of a cyclic variable is generally not conserved for nonholonomic systems, in contrast to the holonomic case.
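Finally we showed how to use some of the above mentioned results in an example. The equations of motion for a coin, rolling without slipping on a plane surface, with rolling angle $\theta$, heading angle $\phi$ and tilt angle $\psi$, were calculated to be
$$
\ddot\theta - \frac{2}{3} \left[ \sin\psi\, \ddot\phi + 2 \cos\psi\, \dot\psi \dot\phi \right] = 0, \tag{7.11}
$$
$$
\ddot\phi + \frac{\sin(2\psi)\, \dot\psi \dot\phi - \sin\psi\, \ddot\theta}{\frac{1}{4} + \sin^2\psi} = 0, \tag{7.12}
$$
and
$$
\ddot\psi - \frac{4}{5} \left[ \frac{1}{2} \sin(2\psi)\, \dot\phi^2 - \cos\psi\, \dot\theta \dot\phi + \frac{g}{R} \sin\psi \right] = 0. \tag{7.13}
$$
The equations obtained were also consistent with the earlier conclusion that conjugate momenta are normally not conserved for nonholonomic constraints.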
In conclusion it should also be said that, while being a fairly instructive example
showing how to use the developed equations, the example of the falling disc that was
considered in section 6 is arguably not very hard to solve even without the help of the
above formulas, simply with the use of Newtonian mechanics. There are of course other
examples when the Newton approach becomes too cumbersome or impractical and where
the above approach would oer more of an advantage. More advantages still are to be
gained by developing the geometrical approach even further than was done in this text,
many of which are explored in the source literature.
References
[1] A.M. Bloch. Nonholonomic Mechanics and Control. Interdisciplinary applied mathematics: Systems and control. Springer, 2003.
[2] A.M. Bloch, P. Krishnaprasad, J. Marsden, and R. Murray. Nonholonomic mechanical systems with symmetry. Archive for Rational Mechanics and Analysis, 136:21-99, 1996.
[3] A.M. Bloch, J.E. Marsden, and D.V. Zenkov. Nonholonomic dynamics. Notices of the AMS, 52(3):324-333, 2005.
[4] H. Cendra, A. Ibort, M. de León, and D. Martín de Diego. A generalization of Chetaev's principle for a class of higher order nonholonomic constraints. Journal of Mathematical Physics, 45(7):2785-2801, 2004.
[5] M. Crampin, W. Sarlet, and F. Cantrijn. Higher-order differential equations and higher-order lagrangian mechanics. Mathematical Proceedings of the Cambridge Philosophical Society, 99(03):565-587, 1986.
[6] M.R. Flannery. The enigma of nonholonomic constraints. American Journal of Physics, 73(3):265-272, 2005.
[7] H. Goldstein, C.P. Poole, and J.L. Safko. Classical Mechanics. Addison Wesley, 2002.
[8] J.V. José and E.J. Saletan. Classical Dynamics: A Contemporary Approach. Cambridge University Press, 1998.
[9] W.S. Koon and J.E. Marsden. The hamiltonian and lagrangian approaches to the dynamics of nonholonomic systems. Reports on Mathematical Physics, 40:21-62, 1997.
[10] D. J. Korteweg. Über eine ziemlich verbreitete unrichtige Behandlungsweise eines Problems der rollenden Bewegung, über die Theorie dieser Bewegung, und insbesondere über kleine rollende Schwingungen um eine Gleichgewichtslage. Nieuw archief voor wiskunde, 24:130-161, 1900.
[11] C. Lanczos. The Variational Principles of Mechanics. Mathematical expositions. Dover Publications, 1970.
[12] C-M. Marle. Various approaches to conservative and nonconservative nonholonomic systems. Reports on Mathematical Physics, 42:211-229, 1998.
[13] J.L. Meriam and L.G. Kraige. Engineering Mechanics: Dynamics. Engineering Mechanics. Wiley, 2008.
[14] R. Penrose. The Road to Reality: A Complete Guide to the Laws of the Universe. Vintage Series. Vintage Books, 2007.
[15] R. Sjamaar. Manifolds and differential forms. Lecture notes from Cornell University, 2006.
[16] A. Vierkandt. Dritter Abschnitt: Das Rollen und Gleiten einer ebenen Fläche, insbesondere einer homogenen Kreisscheibe, auf der Horizontalebene unter dem Einfluss der Schwere. Monatshefte für Mathematik, 3:117-134, 1892.
[17] D. V. Zenkov. Linear conservation laws of nonholonomic systems with symmetry. Discrete and Continuous Dynamical Systems, supplementary volume:963-972, 2003.
[18] D.V. Zenkov, A.M. Bloch, and J.E. Marsden. The energy-momentum method for the stability of non-holonomic systems. Dynamics and Stability of Systems, 13(2):123-165, 1998.

or nonpotential. A more general classication and a terminology used in [11], divides

the forces into two categories:

function, of the coordinates, the coordinate velocities and time explicitly.

of such a work function is what we normally call a scalar potential.

Hamilton's variational principle actually has many advantages over d'Alembert's

classify forces of constraint into two categories:

1. Holonomic constraints. Integrable constraints, meaning that given some con-

straints depending on time-derivatives of coordinates, these constraints can be

integrated as to express the constraints in only the coordinates themselves, a ter-

minology rst introduced by Heinrich Hertz in 1894.

2. Nonholonomic constraints. Constraints that are not holonomic. These might

derivable from coordinate constraints (thereby unintegrable) or constraints not

given as an equation at all.

treatment of nonholonomic constraints only d'Alembert's principle is applicable.

straints diers from holonomic.

when imposing the constraints.

coordinate free language of geometry.

demonstrates some of the dierences between holonomic and nonholonomic systems.

argumentations and derivations, sometimes lling in missing equations and sometimes

through, used as a reference or skipped altogether depending on the reader.

for instance [1] and [15].

2.1 Linear and ane functions

Denition 2. A function g : Rn Rm is said to be ane if there exists a linear

In plainer language, an ane transformation is a linear transformation followed by

such that if u Rn and v = P (u) then

In other words, if P is applied to a vector that is already transformed by P, the

be described in terms of these charts. Q is then called a dierentiable manifold. More

explicitly, the denition is:

1. The union of all U is Q,

Figure 1: An example of a manifold with mappings to local coordinates

Trivial cases of manifolds are when Q is Rn itself. An example of a non-trivial

same point on the circle.

Another example of a manifold is the two dimensional sphere S2 , illustrated by gure

the same point.

for what manifolds are.

2.6 Dierential k-forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.7 Wedge product between dierential forms . . . . . . . . . . . . . . . . . 13

variations, a eld of mathematics that deals with maximizing or minimizing functionals,

analogous to nding extremal points of functions in ordinary calculus.

principles dier we rst need to collect some more information.

are commonly classied as conservative or nonconservative, or sometimes as potential

or nonpotential. A more general classication and a terminology used in [11], divides

minology rst introduced by Heinrich Hertz in 1894.

straints diers from holonomic.

demonstrates some of the dierences between holonomic and nonholonomic systems.

argumentations and derivations, sometimes lling in missing equations and sometimes

2.1 Linear and ane functions

Denition 2. A function g : Rn Rm is said to be ane if there exists a linear

In plainer language, an ane transformation is a linear transformation followed by

be described in terms of these charts. Q is then called a dierentiable manifold. More

explicitly, the denition is:

Another example of a manifold is the two dimensional sphere S2 , illustrated by gure

Now we dene the tangent vector u to a manifold Q of dimension n at a point q to

Lagrangian L is a function of both position and velocity, it is dened on T Q.

of the bre bundle.

the conguration manifold Q the base space and dene a submersion

2.6 Dierential k-forms

2.7 Wedge product between dierential forms

whose value at a point qQ is an alternating bilinear form dened by

smooth functions on Q. The Jacobi-Lie bracket is then dened as follows;

In practice, this condition can be written as an equation by rst dierentiating (4.2),

so when taking a virtual displacement (time is kept xed) we get

can be extended to include also ane constraints, as seen in [1, p. 219-220].