Introduction.
Suppose y(x) is defined on the interval [a, b] and so defines a curve on the (x, y) plane.
Now suppose

I = ∫_a^b F(y, y′, x) dx   (1)

with y′ the derivative of y(x). The value of this integral will depend on the choice of the
function y and the basic problem of the calculus of variations is to find the form of the
function which makes the value of the integral a minimum or maximum (most commonly
a minimum).
The sort of question which gives rise to this kind of problem is exemplified by the
“Brachistochrone” problem, solved by Newton and the Bernoullis (the name comes from
the Greek for “shortest time”). This considers a particle sliding down a smooth curve
under the action of gravity and poses the question as to what curve minimises the time for
the particle to slide between fixed points A and B.
Clearly the time will need to be found by calculating the speed at each point then
integrating along the curve.
Other examples arise in various areas of physics in which the basic laws can be stated in
terms of variational principles. For example, in optics Fermat's principle says that the
path of a light ray between two points is such as to minimise the time of travel between
the two points.
First recall the condition under which an ordinary function y(x) has an extremum. If we
expand in a Taylor series

y(x + δx) = y(x) + δx y′(x) + (1/2) δx² y″(x) + ……
then the condition is that the term proportional to δx must vanish, so that if the second
derivative is non-zero the difference between y(x + δx) and y(x) will always have the
same sign for small δx . The same principle applies to our present problem.
What we do is consider a small change in the function y(x) , replacing it with
y(x) + η(x) . (Note that all the functions we introduce are assumed to have appropriate
properties of differentiability etc, without particular comment being made.) We then
produce a change in the integral, which can be expanded in powers of η . We demand
that the term proportional to η vanishes.
Substituting into (1) we get

δI = ∫_a^b [ (∂F/∂y) η + (∂F/∂y′) η′ ] dx .   (2)

Integrating the second term by parts, and using the fact that η vanishes at the fixed end points, gives

δI = ∫_a^b [ ∂F/∂y − d/dx (∂F/∂y′) ] η dx .   (3)

Since η is an arbitrary function, the term proportional to η can only vanish if the integrand is identically zero, giving the Euler-Lagrange equation

∂F/∂y − d/dx (∂F/∂y′) = 0 .   (4)

[Figure: the unperturbed curve (full line) and the perturbed curve (dotted line), joining the same end points at x = a and x = b.]
Examples.
(a) Find the curve which gives the shortest distance between two points on a plane.
dl = √(dx² + dy²) = √(1 + y′²) dx

so we want to minimise

∫_a^b √(1 + y′²) dx
(where a and b are the x-coordinates of the points of interest).
The integrand is independent of y so we just get

d/dx [ ∂/∂y′ √(1 + y′²) ] = 0

giving

y′ / √(1 + y′²) = const ,

or y′ = const. As expected this just gives a straight line y = mx + c with the constants
fixed by the positions of the end points.
(b) Find the extremal of

∫_a^b (y + y′²) dx .
This is a fairly simple, artificial example, but it illustrates a more general point. Note that
we could easily find a first integral and reduce the problem to a first order DE. The
existence of a first integral like this turns out to be a general property of the Euler-
Lagrange equation whenever the integral has no explicit dependence on x .
Under these circumstances, if we multiply the E-L equation by y′ we get

y′ d/dx (∂F/∂y′) − y′ ∂F/∂y = 0

or

d/dx [ y′ ∂F/∂y′ ] − y″ ∂F/∂y′ − y′ ∂F/∂y = 0 .

Since F does not contain x explicitly, the last two terms combine to give −dF/dx, the total
derivative. So, we get the first integral

y′ ∂F/∂y′ − F = const .   (5)
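As a quick numerical illustration (a sketch, not part of the original text): for F = y′² + y² the first integral (5) gives y′² − y² = const, since the Euler-Lagrange equation in this case is y″ = y. The check below integrates y″ = y with an RK4 step and confirms the conserved quantity.

```python
# Sketch: verify the first integral (5) for F = y'^2 + y^2 (no explicit x).
# The E-L equation is y'' = y, and y' dF/dy' - F = y'^2 - y^2 should be
# constant along any solution.
def rk4_step(y, yp, h):
    # one RK4 step for the system y' = yp, yp' = y  (i.e. y'' = y)
    def f(y, yp):
        return (yp, y)
    k1 = f(y, yp)
    k2 = f(y + 0.5*h*k1[0], yp + 0.5*h*k1[1])
    k3 = f(y + 0.5*h*k2[0], yp + 0.5*h*k2[1])
    k4 = f(y + h*k3[0], yp + h*k3[1])
    return (y + h*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6,
            yp + h*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6)

y, yp = 0.3, 1.7           # arbitrary initial data
c0 = yp**2 - y**2          # value of the first integral at x = 0
for _ in range(1000):      # integrate to x = 1
    y, yp = rk4_step(y, yp, 0.001)
assert abs((yp**2 - y**2) - c0) < 1e-9
```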
Since the curve passes through the origin, K = 0 . The value of b is determined by the
condition that the curve passes through B.
Suppose F = F(y1, y1′, y2, y2′, y3, y3′, ……) with each yi = yi(x) and again we are looking
for an extremum of ∫_a^b F dx. The analysis proceeds as before, replacing each yi with
yi + ηi. Since each ηi can be chosen independently, we must let the coefficient of each in
the integrand vanish. We end up with a system of Euler-Lagrange equations

∂F/∂yi = d/dx (∂F/∂yi′) .   (6)
It has, of course, been assumed that the end points are fixed, as before.
Example: Find the curve which minimises

∫_0^1 (y′² + z′² + y²) dx

and which joins the points (0, 0, 0) and (1, 1, 1).
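The example is not worked through here, but (assuming the integrand is y′² + z′² + y², as reconstructed above) the Euler-Lagrange equations (6) give y″ = y and z″ = 0, so y = sinh x / sinh 1 and z = x fit the end points. A numerical sketch confirming this is a minimum:

```python
import math

# Sketch, assuming the integrand is y'^2 + z'^2 + y^2.  The E-L equations
# give y'' = y and z'' = 0, so y = sinh(x)/sinh(1), z = x joins (0,0,0) to
# (1,1,1).  Any perturbation vanishing at the ends should increase the
# integral; the exact minimum value is coth(1) + 1.
N = 2000
xs = [i / N for i in range(N + 1)]

def integral(y, z):
    # per-cell trapezoid/midpoint rule with difference-quotient derivatives
    total = 0.0
    for i in range(N):
        h = xs[i+1] - xs[i]
        yp = (y[i+1] - y[i]) / h
        zp = (z[i+1] - z[i]) / h
        ymid = 0.5 * (y[i] + y[i+1])
        total += (yp**2 + zp**2 + ymid**2) * h
    return total

y0 = [math.sinh(x) / math.sinh(1) for x in xs]
z0 = list(xs)
base = integral(y0, z0)
exact = 1 / math.tanh(1) + 1      # ∫(y'^2 + y^2) = coth 1 and ∫ z'^2 = 1
assert abs(base - exact) < 1e-3
for eps in (0.2, -0.2):
    ypert = [y0[i] + eps * math.sin(math.pi * xs[i]) for i in range(N + 1)]
    assert integral(ypert, z0) > base
```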
Hamilton’s Principle

The equations of motion of a mechanical system can be derived from the condition that the action

∫_{t0}^{t1} L dt

be stationary, where L = T − V is the Lagrangian of the system, the difference between its kinetic and potential energies.
[Figure: a double pendulum, with a bob of mass m suspended by a rod of length a at angle θ to the vertical, and a second bob of mass M suspended from it by a rod of length b at angle φ.]

The height of the top bob above its equilibrium position is a(1 − cos θ) and the height of
the lower bob above its equilibrium is a(1 − cos θ) + b(1 − cos φ). So

V = mga(1 − cos θ) + Mg[ a(1 − cos θ) + b(1 − cos φ) ]
The horizontal component of velocity of the top bob is a θ̇ cos θ and the vertical
component a θ̇ sin θ. For the lower bob the corresponding components are
a θ̇ cos θ + b φ̇ cos φ and a θ̇ sin θ + b φ̇ sin φ and so

T = (1/2) m a² θ̇² + (1/2) M [ (a θ̇ cos θ + b φ̇ cos φ)² + (a θ̇ sin θ + b φ̇ sin φ)² ]
  = (1/2) m a² θ̇² + (1/2) M [ a² θ̇² + b² φ̇² + 2ab θ̇ φ̇ cos(θ − φ) ]
From Lagrange’s equations we then get

d/dt [ (m + M) a² θ̇ + M a b φ̇ cos(θ − φ) ] + M a b θ̇ φ̇ sin(θ − φ) + (m + M) g a sin θ = 0

d/dt [ M b² φ̇ + M a b θ̇ cos(θ − φ) ] − M a b θ̇ φ̇ sin(θ − φ) + M g b sin φ = 0 .
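These equations can be checked numerically: writing the cross terms with cos(θ − φ), expanding the time derivatives, solving the pair as a linear system for θ̈ and φ̈ and integrating, the total energy T + V should be conserved. A minimal sketch (all parameter values are arbitrary choices for the test):

```python
import math

# Sketch: integrate the double-pendulum equations (cross terms written with
# cos(theta - phi)) and check that E = T + V is conserved.
m, M, a, b, g = 1.0, 2.0, 1.0, 1.5, 9.81   # arbitrary test values

def accel(th, ph, thd, phd):
    # Expanded equations of motion as a 2x2 linear system for (th'', ph''):
    # (m+M)a^2 th'' + M a b c ph'' = -M a b phd^2 s - (m+M) g a sin(th)
    # M a b c th'' + M b^2 ph''   =  M a b thd^2 s - M g b sin(ph)
    c, s = math.cos(th - ph), math.sin(th - ph)
    a11, a12 = (m + M) * a * a, M * a * b * c
    a21, a22 = M * a * b * c, M * b * b
    r1 = -M * a * b * phd**2 * s - (m + M) * g * a * math.sin(th)
    r2 = M * a * b * thd**2 * s - M * g * b * math.sin(ph)
    det = a11 * a22 - a12 * a21
    return ((r1 * a22 - r2 * a12) / det, (r2 * a11 - r1 * a21) / det)

def energy(th, ph, thd, phd):
    T = (0.5 * (m + M) * a * a * thd**2 + 0.5 * M * b * b * phd**2
         + M * a * b * thd * phd * math.cos(th - ph))
    V = (m * g * a * (1 - math.cos(th))
         + M * g * (a * (1 - math.cos(th)) + b * (1 - math.cos(ph))))
    return T + V

def deriv(s):
    th, ph, thd, phd = s
    tdd, pdd = accel(th, ph, thd, phd)
    return (thd, phd, tdd, pdd)

state = [0.7, -0.4, 0.0, 0.0]     # theta, phi, theta-dot, phi-dot
E0 = energy(*state)
dt = 1e-4
for _ in range(20000):            # RK4 to t = 2
    k1 = deriv(state)
    k2 = deriv([state[i] + 0.5*dt*k1[i] for i in range(4)])
    k3 = deriv([state[i] + 0.5*dt*k2[i] for i in range(4)])
    k4 = deriv([state[i] + dt*k3[i] for i in range(4)])
    state = [state[i] + dt*(k1[i] + 2*k2[i] + 2*k3[i] + k4[i])/6
             for i in range(4)]
assert abs(energy(*state) - E0) < 1e-6
```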
Recall that to find the extremum of a function of several variables with constraints
imposed we use Lagrange’s method of undetermined multipliers. An exact analogy holds
in the case of calculus of variations. Suppose we want to find the extremum of
I = ∫_a^b F(y, y′, x) dx

subject to the condition

J = ∫_a^b G(y, y′, x) dx = const.

Just as for ordinary functions, we then introduce a multiplier λ and apply the Euler-Lagrange equation to the combination F − λG.
Example: A heavy chain with constant mass/unit length is suspended between two points.
What curve does it take up in equilibrium?

The equilibrium curve minimises the potential energy of the chain. Taking the y axis
vertical, this means that we minimise ∫ y dl, where the element of length is given by
dl = √(1 + y′²) dx. There is also the constraint that the total length is fixed, so we
must minimise

∫_a^b y √(1 + y′²) dx

subject to

∫_a^b √(1 + y′²) dx = const.
Since the integrand (y − λ)√(1 + y′²) has no explicit dependence on x, we can use the
first integral (5), from which we get

y′ = √( ((y − λ)/k)² − 1 ) .

If we make the substitution y − λ = k cosh z this gives k z′ = 1, so that z = (x + c)/k and we
obtain

y = λ + k cosh( (x + c)/k ) .
The three constants k, c and λ are obtained from the coordinates of the end points and the
length. This curve is called a catenary.
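A quick check (a sketch with arbitrary test values of k, c and λ) that the catenary satisfies the first-order equation y′ = √(((y − λ)/k)² − 1) derived above, on the branch where y′ ≥ 0:

```python
import math

# Sketch: check that y = lam + k*cosh((x + c)/k) satisfies
# y' = sqrt(((y - lam)/k)^2 - 1), on the rising branch x + c >= 0.
# k, c, lam are arbitrary test values.
k, c, lam = 2.0, 0.3, -1.0
for x in [0.0, 0.5, 1.0, 2.0]:
    y = lam + k * math.cosh((x + c) / k)
    yp = math.sinh((x + c) / k)               # dy/dx
    rhs = math.sqrt(((y - lam) / k)**2 - 1)
    assert abs(yp - rhs) < 1e-12
```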
If there is more than one constraint then we introduce more than one multiplier.
Example: In statistical mechanics, the distribution of energy of a system of particles is
described by a probability distribution function f (E) . In equilibrium, theory says that
this distribution should be such as to maximise the function
−∫_0^∞ f log f dE

subject to the conditions

∫_0^∞ f(E) dE = 1 ,   ∫_0^∞ E f(E) dE = E0 .
The first of these is just the standard condition on a probability function. The second,
with E0 a given constant, says that the average energy per particle, or equivalently the
total energy of the system, is fixed.
The Euler-Lagrange equation with these constraints is

∂/∂f ( −f log f − λf − µEf ) = 0

with λ and µ the two multipliers corresponding to the two constraints. This yields

f = C e^(−µE)

where C is a constant into which λ has been incorporated. Using the two constraints
gives µ = 1/E0, C = 1/E0. This is the Boltzmann distribution and E0 is proportional to
the temperature of the system.
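The values µ = 1/E0 and C = 1/E0 can be confirmed numerically by evaluating the two constraint integrals for f = C e^(−µE) (a sketch; E0 = 2.5 is an arbitrary test value):

```python
import math

# Sketch: with mu = 1/E0 and C = 1/E0, f(E) = C*exp(-mu*E) should satisfy
# the constraints  ∫ f dE = 1  and  ∫ E f dE = E0.
E0 = 2.5                              # arbitrary test value
mu, C = 1.0 / E0, 1.0 / E0
n, top = 200000, 60.0 * E0            # truncate the infinite range
h = top / n
norm = sum(C * math.exp(-mu * (i + 0.5) * h) for i in range(n)) * h
mean = sum((i + 0.5) * h * C * math.exp(-mu * (i + 0.5) * h)
           for i in range(n)) * h
assert abs(norm - 1.0) < 1e-4         # probability normalisation
assert abs(mean - E0) < 1e-4          # fixed mean energy
```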
The isoperimetric problem - find the shape which has maximum area for a given
perimeter.

If the closed curve is traced out by parametric equations x(t), y(t) for t0 ≤ t ≤ t1, the enclosed area is

A = (1/2) ∫_{t0}^{t1} (x ẏ − y ẋ) dt

while the perimeter is

L = ∫_{t0}^{t1} (ẋ² + ẏ²)^{1/2} dt .

So, we construct the Euler-Lagrange equations for the two variables x and y from the
function

φ(x, y, ẋ, ẏ) = (1/2)(x ẏ − y ẋ) − λ (ẋ² + ẏ²)^{1/2} .
These equations are

d/dt [ −y/2 − λ ẋ/(ẋ² + ẏ²)^{1/2} ] − ẏ/2 = 0

d/dt [ x/2 − λ ẏ/(ẋ² + ẏ²)^{1/2} ] + ẋ/2 = 0 .

These can be integrated immediately to give

y + λ ẋ/(ẋ² + ẏ²)^{1/2} = A

x − λ ẏ/(ẋ² + ẏ²)^{1/2} = B
Multiplying the first of these by ẏ and the second by ẋ and adding gives

ẋ(x − B) + ẏ(y − A) = 0 .

This has the integral

(x − B)² + (y − A)² = const.

so that the required curve is a circle.
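The circle can be verified directly against the two integrated equations; parametrising a circle of radius R about (B, A) and taking λ = R, both equations are satisfied identically (a sketch with arbitrary test values):

```python
import math

# Sketch: a circle of radius R about (B, A) satisfies the integrated
# equations  y + lam*xd/speed = A  and  x - lam*yd/speed = B  with lam = R.
A, B, R = 1.0, -2.0, 3.0      # arbitrary test values
lam = R
for t in [0.0, 0.7, 2.0, 4.5]:
    x, y = B + R * math.cos(t), A + R * math.sin(t)
    xd, yd = -R * math.sin(t), R * math.cos(t)   # velocity components
    speed = math.hypot(xd, yd)                   # (xd^2 + yd^2)^(1/2)
    assert abs(y + lam * xd / speed - A) < 1e-12
    assert abs(x - lam * yd / speed - B) < 1e-12
```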
Geodesics
If G(x,y,z) = 0 defines a surface in three dimensional space, then the geodesics on this
surface are the curves which produce the shortest distance between points on the surface.
So, as we have seen, the geodesics on a plane are just straight lines. We can cast the
problem of finding geodesics on a surface into a variational problem with a constraint as
follows. If x = x(t),y = y(t),z = z(t) are parametric equations for a curve on the
surface, then along any curve on the surface
∫_{t0}^{t1} G(x(t), y(t), z(t)) dt = 0 .   (8)

Since the element of length along a curve is √(ẋ² + ẏ² + ż²) dt, the problem is to minimise

∫_{t0}^{t1} √(ẋ² + ẏ² + ż²) dt

subject to (8). Introducing a multiplier λ, the Euler-Lagrange equation for x is

d/dt (∂F/∂ẋ) + λ ∂G/∂x = 0

plus similar equations for y and z, with F = √(ẋ² + ẏ² + ż²).
An important application is to eigenvalue problems of Sturm-Liouville type,

d/dx ( p(x) y′ ) + q(x) y + λ r(x) y = 0 .

If we seek an extremum of

I = ∫_0^1 { p(x) y′² − q(x) y² } dx

subject to the condition that

J = ∫_0^1 r(x) y² dx = const.

and the given boundary conditions on y, then we obtain the above equation from the
Euler-Lagrange equations and the method of multipliers.
Euler-Lagrange equations and the method of multipliers. The lowest possible eigenvalue
is then the minimum possible value of I/J. Since J is constrained to be constant this is just
equivalent to the problem of minimising I subject to J being constant. The standard
approach to this leads back to the DE and we may appear to be going round in circles.
The usefulness of this approach is that if we use any trial function y(x) then the resulting
value of λ is greater than or equal to the minimum possible, so we obtain an upper bound on
the lowest eigenvalue. With a choice of y which is a reasonable approximation to the
solution we can get a good estimate.
Example: Use this technique to find an estimate of the lowest eigenvalue of the problem
y″ + λy = 0 ,   y(0) = y(1) = 0 .

This is, of course, a problem to which we know the solution, namely that the eigenvalues
are given by λ = n²π², so the lowest is π² = 9.8696 (corresponding to the solution
y = sin(πx)). We want a trial function with the required end values, and preferably one
which is easily integrated (though numerical integration is readily done with a package
like MAPLE). Let us take y = x(1 − x). Then p(x) = 1, q(x) = 0, r(x) = 1 and so
I = ∫_0^1 (1 − 2x)² dx = 1/3

J = ∫_0^1 x²(1 − x)² dx = 1/30
giving an upper bound of 10, which is actually a fairly good approximation to the lowest
eigenvalue.
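The two integrals and the resulting bound are easy to confirm numerically (a sketch using a simple midpoint rule):

```python
import math

# Sketch: Rayleigh-quotient bound I/J for the trial function y = x(1 - x),
# computed by the midpoint rule, compared with the exact eigenvalue pi^2.
n = 100000
h = 1.0 / n
I = sum((1 - 2 * (i + 0.5) * h)**2 for i in range(n)) * h              # ∫ y'^2
J = sum(((i + 0.5) * h * (1 - (i + 0.5) * h))**2 for i in range(n)) * h  # ∫ y^2
bound = I / J
assert abs(I - 1/3) < 1e-6
assert abs(J - 1/30) < 1e-6
assert abs(bound - 10.0) < 1e-4
assert bound >= math.pi**2        # an upper bound on the lowest eigenvalue
```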
Variants on the boundary conditions are possible, for example in the following.
Suppose we relax the condition that the values of y be fixed at the end points but instead
assume that they are allowed to vary freely. Then, following the procedure which led to
Eq. (3) we obtain, as well as the term in (3), an extra contribution
[ (∂F/∂y′) η ]_{x=b} − [ (∂F/∂y′) η ]_{x=a} .
Since the extremum, if it exists, must be the extreme value for whatever end points turn
out to be suitable, the integral term must vanish as before; otherwise a slightly greater or
smaller value for the integral could be obtained by taking a different curve with the same
end points. The extra boundary term must also vanish which, since η is arbitrary at the end
points, means that

∂F/∂y′ = 0

at both end points.
As a simple example we can consider the problem of minimising the length of a curve
joining the lines x = 0 and x = 1, without fixing y at the end points. Here

F = √(1 + y′²)

and the solution of the E-L equation is a straight line as before. The extra conditions
yield

y′ / √(1 + y′²) = 0

at both ends. The derivative must then be zero everywhere, so we arrive at the expected
result that the shortest curve joining x = 0 and x = 1 is a straight line parallel to the x
axis.
Another variant of the problem is to consider a case where the end points are constrained
to lie on a given curve. For simplicity, let us assume that the lower end point is fixed
while the upper end point has to lie on the curve y = g(x). Suppose that the
extremum has its upper limit at x = b, while the lower limit x = a is fixed. Then, if y is
replaced with y + η as before, there is a change in the upper limit of the integral to
b + ∆x, say.
[Figure: the extremal curve y(x) from x = a to x = b, and the varied curve y + η, whose upper end point moves along y = g(x) to x = b + ∆x.]

Carrying the expansion through, including the contribution from the shift of the upper limit, leads to the end point condition

[ F + (g′ − y′) ∂F/∂y′ ]_{x=b} = 0 .   (9)
Example: Find the curve connecting the origin to a curve y = g(x) and which has the
shortest length.
Here, as we have seen before, F = √(1 + y′²) and the solution of the E-L equation gives a
straight line. For this case, the condition (9), which must be satisfied where the straight
line meets the given curve, reduces to g′y′ = −1, implying that the straight line giving
the shortest distance to the curve must be orthogonal to the curve at the point of
intersection.
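This orthogonality condition can be checked numerically for a sample curve (a sketch; the curve g(x) = 2 − x²/2 is an arbitrary test choice, not from the original text):

```python
import math

# Sketch: minimise the distance from the origin to a point on the test curve
# g(x) = 2 - x^2/2 by brute-force search, and check that the minimising
# straight segment satisfies g'(x) * y'(x) = -1 (orthogonality).
def g(x):  return 2 - x * x / 2
def gp(x): return -x

xs = [i * 1e-4 for i in range(1, 30000)]
best = min(xs, key=lambda x: math.hypot(x, g(x)))
slope_of_segment = g(best) / best        # slope of the line from the origin
assert abs(gp(best) * slope_of_segment + 1) < 1e-2
```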
The basic ideas discussed here can be extended in various ways, for example to
integrands which involve higher derivatives of y or to problems which involve
minimising a multiple integral over some given domain.
Further Reading
Calculus of Variations R Weinstock
Variational Calculus in Science and Engineering M J Forray
An Introduction to the Calculus of Variations L A Pars