1 Introduction
We start with the definition of a Sturm-Liouville boundary value problem (S-L BVP). It
is the second-order differential equation given by
\[ Lu(x) = \lambda u(x), \qquad L = \frac{1}{r(x)}\left(-\frac{d}{dx}\,p(x)\,\frac{d}{dx} + q(x)\right) \tag{1} \]
subject to boundary conditions at the endpoints determined by two angles α and β:
\[ \cos(\alpha)\,u(a) - \sin(\alpha)\,p(a)u'(a) = 0, \qquad \cos(\beta)\,u(b) - \sin(\beta)\,p(b)u'(b) = 0 \]
• For an angle equal to 0 we have a boundary condition that only depends on u
(Dirichlet boundary condition).
• For an angle equal to π/2 we have a boundary condition that only depends on u′
(Neumann boundary condition).
• For angles not equivalent to 0 or π/2 mod π, we have a mixed boundary condition (Robin
boundary condition).
Note that (1) is written as an eigenvalue problem for the linear operator L. Indeed, as
part of solving the problem, we will have to determine the values of λ that allow solutions.
We shall call these the eigenvalues of (1), and the associated solutions u shall be called
the eigenfunctions of (1).
These problems are important in several areas of physics and engineering. Perhaps the
most widespread place in which they appear is when the method of separation of variables
is applied to solve partial differential equations subject to boundary conditions. When
successful, this method splits a PDE into several ODEs, and for many of these PDEs (e.g.
heat equation, wave equation, Laplace’s equation) we end up with problems of the form
of (1), at least when the geometry of the boundary is simple enough.
Example: One of the simplest examples of a S-L BVP is given by taking [a, b] = [0, π],
p = r = 1, q = 0, α = β = 0. We are left with:
\[ -\frac{d^2 u}{dx^2} = \lambda u \]
subject to u(0) = u(π) = 0.
Let us restrict our attention to real solutions (therefore, to real eigenvalues as well).
This is a second order, homogeneous ODE of constant coefficients. Its solutions depend
on the sign of λ :
• For λ > 0, solutions are sinusoidal
• For λ = 0, solutions are linear
• For λ < 0, solutions are exponential
It is easy to check that the only way to satisfy the boundary conditions is with sinusoidal
functions. Therefore, we have λ > 0, and setting √λ = ω we get:
\[ u(x) = C\cos(\omega x) + D\sin(\omega x) \]
Plugging in the boundary conditions, we see:
0 = C cos 0 + D sin 0 ⇒ C = 0
and, assuming that u is not identically zero,
0 = D sin(πω) ⇒ ω = n ∈ ℕ
Let us discard the trivial eigenfunction sin(0x). The eigenfunctions are of the form
u_n(x) = D sin((n + 1)x) and the eigenvalues are (n + 1)², n ∈ ℕ.
The example above will serve to illustrate several important properties of S-L problems
throughout our exploration.
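As a quick sanity check (this sketch is my own addition, not part of the original text), the eigenvalues (n + 1)² of the example can be recovered numerically by discretizing −u″ = λu with second-order finite differences; the grid size below is an arbitrary choice.

```python
import numpy as np

# Discretize -u'' = lambda * u on [0, pi] with u(0) = u(pi) = 0 using
# second-order central differences on N interior grid points.
N = 800                              # arbitrary resolution for this sketch
h = np.pi / (N + 1)                  # grid spacing
A = (np.diag(2.0 * np.ones(N))
     - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h**2

# The matrix is symmetric, so use eigvalsh; sort ascending.
eigvals = np.sort(np.linalg.eigvalsh(A))
print(eigvals[:4])                   # close to [1, 4, 9, 16] = (n+1)^2
```

The discretization matrix is symmetric, mirroring the symmetry of L, and its low eigenvalues converge to the true ones at rate O(h²).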
Definition 2.1. Let X be a Banach space (in our context, this will be a space of functions).
A linear operator A on X is said to be compact if for any bounded sequence (f_n) there is a
subsequence (f_{n_k}) such that (A(f_{n_k})) is convergent.
The following theorem is a standard result, found in any functional analysis textbook.
Theorem 2.1 (Spectral theorem for compact, symmetric operators). Let H be a Hilbert
space with inner product ⟨·, ·⟩ and let A be a bounded, compact operator on H. Assume,
furthermore, that A is symmetric, that is, ⟨Af, g⟩ = ⟨f, Ag⟩ for all f, g ∈ H. Then A has
an orthonormal sequence of eigenvectors (u_j), with real eigenvalues tending to zero, such
that every f in the closure of Range(A) can be expanded as
\[ f = \sum_{j=0}^{N} \langle u_j, f \rangle u_j \]
(Note this implies that if Range(A) is dense, then the eigenvectors form an orthonormal
basis.)
It wouldn’t make much sense to try to apply this theorem to the operator L right
away, given that we saw that there is at least one case (the example above) in which the
sequence of eigenvalues tends to infinity, rather than zero!
However, precisely for this reason, one might be inclined to look at the inverse of the
operator L, whose eigenvalues are the reciprocals of those of L and thus tend to zero.
This shall be our strategy.
Proof. We sketch the proof, omitting computational details.
i. By using integration by parts twice, it is easy to show the Lagrange identity:
\[ \langle Lf, g \rangle - \langle f, Lg \rangle = W_b(f, g) - W_a(f, g) \]
where W_x(f, g) denotes the Wronskian at x of the functions f and g, modified by multi-
plying by p(x):
\[ W_x(f, g) = p(x)\bigl(f(x)g'(x) - f'(x)g(x)\bigr) \]
ii. To find the inverse of L − zI, we want to solve the non-homogeneous system (L −
zI)f = g, for f and g ∈ H₀. Use variation of parameters on the system above to do
so, noting that the homogeneous differential equation associated to (L − zI)f = g is
precisely (1) with eigenvalue z.
iii. Let us denote R_L(z) = (L − zI)⁻¹. The solution we just obtained can be written in
the form:
\[ f(x) = R_L(z)g(x) = \int_a^b G(z, x, t)\, g(t)\, r(t)\, dt \]
where G denotes the Green's function:
\[ G(z, x, t) = \frac{1}{W(u_b(z), u_a(z))} \begin{cases} u_b(z, x)\, u_a(z, t), & x \ge t \\ u_b(z, t)\, u_a(z, x), & x \le t \end{cases} \]
and u_a(z, ·) and u_b(z, ·) denote any two solutions of the homogeneous system (L−zI)f =
0 adapted to the boundary conditions.
iv. We now have R_L(z) in the form of an integral operator (on a bounded set) against
the measure induced by the density function r, where its kernel G is continuous. The
methods of functional analysis guarantee that any operator of this form is bounded
and compact. (To show compactness, one takes a bounded sequence f_n in H₀ and shows
that R_L(z)f_n is an equicontinuous sequence of functions. The Arzelà–Ascoli
theorem is then applied to guarantee the existence of a convergent subsequence.)
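To make step iii concrete, here is a minimal numerical sketch for the elementary example (my own illustration; the grid size is arbitrary). At z = 0 the adapted homogeneous solutions are u_a(x) = x and u_b(x) = π − x, the modified Wronskian is W(u_b, u_a) = π, and applying the resulting integral operator to sin should return sin, since sin is an eigenfunction of L with eigenvalue 1.

```python
import numpy as np

# Green's function of L = -d^2/dx^2 on [0, pi], Dirichlet conditions, z = 0.
# Adapted homogeneous solutions: u_a(x) = x (vanishes at 0),
# u_b(x) = pi - x (vanishes at pi); modified Wronskian W(u_b, u_a) = pi.
def G(x, t):
    return np.where(x >= t, (np.pi - x) * t, (np.pi - t) * x) / np.pi

# Apply R_L(0)g(x) = integral of G(x, t) g(t) r(t) dt (here r = 1)
# by trapezoidal quadrature on a uniform grid.
M = 2000
t = np.linspace(0.0, np.pi, M + 1)
w = np.full(M + 1, t[1] - t[0])      # trapezoid weights
w[0] *= 0.5
w[-1] *= 0.5

g = np.sin(t)
f = G(t[:, None], t[None, :]) @ (g * w)

# Since L sin = sin, we expect R_L(0) sin = sin up to quadrature error.
err = np.max(np.abs(f - np.sin(t)))
print(err)
```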
Theorem 2.3. The regular Sturm-Liouville problem has a countable number of discrete,
simple eigenvalues E_n which accumulate only at ∞. The corresponding normalized eigen-
functions u_n can be chosen real-valued and form an orthonormal basis for H₀.
With this we finish our use of functional analysis to study (1). We have shed consid-
erable light on the nature of its eigenvalues. Now we turn our attention to the behavior
of its eigenfunctions, by virtue of our next tool.
Given a nontrivial solution u of (1), we define its Prüfer variables ρ_u and θ_u by:
\[ u(x) = \rho_u(x)\sin(\theta_u(x)), \qquad p(x)u'(x) = \rho_u(x)\cos(\theta_u(x)) \tag{4} \]
It is natural to consider p(x)u′(x) instead of just u′(x) if we notice the form of the
operator L in (1), which contains the derivative of p(x)u′(x).
For these coordinates to be uniquely defined, we shall assume WLOG that ρ_u never
vanishes (note that if ρ_u(x₀) = 0 we would have u(x₀) = u′(x₀) = 0, and by uniqueness
we would be talking of the identically zero function). More importantly, the range of θ_u
is not necessarily [0, 2π): it will be assumed large enough to guarantee θ_u is a continuous
function (and doesn't jump from 2π back to 0) as the point (p(x)u′(x), u(x)) winds
about the origin.
Let us apply the change of variables to (1). For clarity, we shall omit the variable x in
our functions. We get:
\[ \frac{1}{r}\left(-\frac{d}{dx}\,pu' + qu\right) = \lambda u \;\Rightarrow\; \frac{d}{dx}\,pu' = -(\lambda r - q)u \tag{5} \]
Recall that in general polar coordinates for (X, Y) we have the formulas:
\[ \rho' = \frac{XX' + YY'}{\rho}, \qquad \theta' = \frac{Y'X - X'Y}{\rho^2} \]
In our case X = pu′, Y = u, and with help of (5) above we get:
\[ \rho_u' = \rho_u\left(\frac{1}{p} + q - \lambda r\right)\frac{\sin(2\theta_u)}{2} \tag{6} \]
\[ \theta_u' = \frac{\cos^2(\theta_u)}{p} + (\lambda r - q)\sin^2(\theta_u) \tag{7} \]
As is usual with this kind of substitution, we have exchanged a second-order ODE
for a system of two first-order ODEs. However, a remarkable feature of this new system
is that the equation for θ_u′ does not involve ρ_u, and so we have a first-order ODE which
describes the Prüfer angle on its own! Also note that the equation for ρ_u′ is separable once θ_u is known.
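In the elementary example (p = r = 1, q = 0), equation (7) becomes θ′ = cos²(θ) + λ sin²(θ), which we can integrate directly. The following sketch (my own, with an arbitrarily chosen RK4 step count) checks that for λ = (n + 1)² the Prüfer angle advances from 0 at x = 0 to (n + 1)π at x = π, matching the Dirichlet conditions.

```python
import numpy as np

# Integrate the Pruefer angle equation (7) for the example problem:
# theta' = cos^2(theta) + lam * sin^2(theta), theta(0) = 0.
def pruefer_angle(lam, x_end=np.pi, steps=20000):
    """Classical RK4 integration of the (autonomous) Pruefer angle ODE."""
    f = lambda th: np.cos(th)**2 + lam * np.sin(th)**2
    h = x_end / steps
    th = 0.0
    for _ in range(steps):
        k1 = f(th)
        k2 = f(th + h * k1 / 2)
        k3 = f(th + h * k2 / 2)
        k4 = f(th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return th

for n in range(3):
    lam = (n + 1) ** 2                       # eigenvalue of the example
    print(pruefer_angle(lam) / np.pi)        # approximately n + 1
```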
3.2 Oscillatory nature of the eigenfunctions
The study of Prüfer angles alone is enough for us to deduce interesting properties of the
eigenfunctions of (1). Let us start by studying the sign of θ_u′ in (7).
We know that 1/p > 0. Also, since r and q are continuous on [a, b], they are bounded,
which implies that for a sufficiently large λ (which we know we can find, thanks to the
results in section 2) we have λr − q > 0. In fact, due to continuity both
of these coefficients are bounded away from zero, and so from the shape of (7) we see
there is a positive constant K such that θ_u′ > K. As such, from (4) we see any u with
a large enough eigenvalue must have oscillatory behavior, since the polar angle has a
minimum, positive rate of increase.
Furthermore, θu0 will be larger at some points the bigger λ is. This indicates that these
eigenfunctions will oscillate faster as λ increases!
Now, if the eigenvalue λ is not large enough, we cannot guarantee that θ_u is increasing.
But we can note the following: at a zero of u, the Prüfer angle is always increasing, as we
can note from (7) and from u(x) = ρ_u(x) sin(θ_u(x)):
\[ u(x_0) = 0 \iff \theta_u(x_0) \equiv 0 \pmod{\pi} \;\Rightarrow\; \theta_u'(x_0) = \frac{1}{p(x_0)} > 0 \]
This leads us to:
Lemma 3.1. Regardless of the magnitude of λ, a Prüfer angle can cross integer multiples
of π only from below. This means that, between consecutive zeros, the Prüfer angle must
increase by exactly π units.
Proof. i. It is known that if θ₁(c) ≥ θ₀(c) and θ₁′(x) − f(x, θ₁(x)) ≥ θ₀′(x) − f(x, θ₀(x))
for all x ∈ [c, b), then θ₁(x) ≥ θ₀(x) also holds in [c, b), with the inequality staying strict
once the angles differ, for any fixed function f(x, y) which is locally Lipschitz continuous
with respect to y, uniformly in x.
In this case, pick
\[ f(x, y) = \frac{\cos^2(y)}{p_0(x)} + (\lambda_0 r_0(x) - q_0(x))\sin^2(y) \]
It is easy to check the required uniform Lipschitz condition from the fact that 1/p₀(x)
and λ₀r₀(x) − q₀(x) are bounded on [a, b].
With this f, θ₀′(x) − f(x, θ₀(x)) = 0, whereas θ₁′(x) − f(x, θ₁(x)) is given by:
\[ \left(\frac{1}{p_1} - \frac{1}{p_0}\right)\cos^2(\theta_1) + \bigl((\lambda_1 r_1 - q_1) - (\lambda_0 r_0 - q_0)\bigr)\sin^2(\theta_1) \]
which is non-negative under the comparison hypotheses on the coefficients.
Suppose that at each of the endpoints c and d we have:
\[ W(u_0, u_1) = 0 \;\text{ or }\; u_0 = 0 \tag{8} \]
(where W denotes the Wronskian.)
If the functions u₀ and u₁ are linearly independent on (c, d) (i.e., neither is a constant
multiple of the other), then u₁ has at least one zero in (c, d).
Proof. Let us assume, without loss of generality, that the Prüfer angles at c are in [0, π).
(They can be taken to lie in [−π, π) as any polar angle, and if needed, we reverse the signs
of u₀ or u₁ to bring the polar angle into the desired interval.)
The Wronskian W(u₀, u₁) is readily calculated to give:
\[ W(u_0, u_1) = \rho_{u_0}\,\rho_{u_1}\,\sin(\theta_0 - \theta_1) \]
so condition (8) at x = c forces θ₁(c) ≥ θ₀(c), and the comparison argument of part i
then gives θ₁(x) ≥ θ₀(x) up to x = d.
But note this means θ₁(d) > π. Indeed, applying the condition (8) at x = d (and
similar reasoning to that of above for x = c) we get either of two cases:
• θ₀(d) ≡ θ₁(d) mod π: by virtue of θ₀(d) < θ₁(d), we must have θ₁(d) ≥ θ₀(d) + π > π.
• θ₀(d) ≡ 0 mod π: we can't have θ₀(d) ≤ 0 because θ₀(c) ∈ [0, π) and Prüfer angles
increase between zeros as noted in Lemma 3.1. Therefore θ₀(d) ≥ π, which implies
θ₁(d) > π.
Finally, by the intermediate value theorem in [c, d], we find an x∗ such that θ₁(x∗) = π;
such an x∗ is a zero of u₁, as desired.
By keeping the coefficients p, q and r constant, and just increasing the eigenvalues, we
conclude:
Corollary 3.2.1. The eigenfunctions u_n of (1), sorted according to the size of the
eigenvalues, have exactly n zeros in (a, b). (Note this is in agreement with the elementary
example we did at the beginning on [0, π], in which the eigenfunctions are sin((n + 1)x).)
Knowing the exact number of zeros, along with the corollary we got above, allows us
to be much more specific about the relative position of the zeros of the eigenfunctions:
Corollary 3.2.2. Let u_n be the eigenfunctions of (1), sorted according to the size of the
eigenvalues. Then the zeros of u_{n+1} interlace the zeros of u_n. This means, if x_{n,j} are
the zeros of u_n inside (a, b), then:
\[ a < x_{n+1,1} < x_{n,1} < x_{n+1,2} < \dots < x_{n+1,n+1} < b \]
A quick analysis shows this is the only possible behavior if consecutive zeros of u_{n+1}
must always have a zero of u_n in between.
Figure 1: Illustration of the interlacing of zeros for the eigenfunctions u₃ = sin(4x) and u₄ =
sin(5x), for the elementary example −u′′ = λu.
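For the elementary example the interlacing can be checked by hand, since the interior zeros of u_n(x) = sin((n + 1)x) are x_{n,j} = jπ/(n + 1); the sketch below (my own addition) merges the zeros of u₃ and u₄ in the pattern predicted by Corollary 3.2.2 and verifies the merged sequence is increasing.

```python
import numpy as np

# Interior zeros of u_n(x) = sin((n+1)x) on (0, pi): x_{n,j} = j*pi/(n+1).
n = 3
zeros_n = np.array([j * np.pi / (n + 1) for j in range(1, n + 1)])    # u_3
zeros_n1 = np.array([j * np.pi / (n + 2) for j in range(1, n + 2)])   # u_4

# Predicted pattern: x_{n+1,1} < x_{n,1} < x_{n+1,2} < ... < x_{n+1,n+1}
merged = np.empty(2 * n + 1)
merged[0::2] = zeros_n1
merged[1::2] = zeros_n
print(np.all(np.diff(merged) > 0))   # True: the zeros interlace
```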