
Mathematical Methods

in Quantum Mechanics
With Applications
to Schrödinger Operators
Gerald Teschl
Note: The AMS has granted the permission to post this online edition!
This version is for personal online use only! If you like this book and want
to support the idea of online versions, please consider buying this book:
http://www.ams.org/bookstore-getitem?item=gsm-99
Graduate Studies
in Mathematics
Volume 99
American Mathematical Society
Providence, Rhode Island
Editorial Board
David Cox (Chair) Steven G. Krantz
Rafe Mazzeo Martin Scharlemann
2000 Mathematics Subject Classification. 81-01, 81Qxx, 46-01, 34Bxx, 47B25
Abstract. This book provides a self-contained introduction to mathematical methods in quantum
mechanics (spectral theory) with applications to Schrödinger operators. The first part covers
mathematical foundations of quantum mechanics from self-adjointness, the spectral theorem,
quantum dynamics (including Stone's and the RAGE theorem) to perturbation theory for self-adjoint
operators.
The second part starts with a detailed study of the free Schrödinger operator respectively
position, momentum and angular momentum operators. Then we develop Weyl–Titchmarsh theory
for Sturm–Liouville operators and apply it to spherically symmetric problems, in particular
to the hydrogen atom. Next we investigate self-adjointness of atomic Schrödinger operators and
their essential spectrum, in particular the HVZ theorem. Finally we have a look at scattering
theory and prove asymptotic completeness in the short range case.
For additional information and updates on this book, visit:
http://www.ams.org/bookpages/gsm-99/
Typeset by LaTeX and Makeindex. Version: February 17, 2009.
Library of Congress Cataloging-in-Publication Data
Teschl, Gerald, 1970–
Mathematical methods in quantum mechanics : with applications to Schrödinger operators
/ Gerald Teschl.
p. cm. (Graduate Studies in Mathematics ; v. 99)
Includes bibliographical references and index.
ISBN 978-0-8218-4660-5 (alk. paper)
1. Schrödinger operators. 2. Quantum theory – Mathematics. I. Title.
QC174.17.S3T47 2009 2008045437
515'.724 dc22
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting
for them, are permitted to make fair use of the material, such as to copy a chapter for use
in teaching or research. Permission is granted to quote brief passages from this publication in
reviews, provided the customary acknowledgement of the source is given.
Republication, systematic copying, or multiple reproduction of any material in this pub-
lication (including abstracts) is permitted only under license from the American Mathematical
Society. Requests for such permissions should be addressed to the Assistant to the Publisher,
American Mathematical Society, P.O. Box 6248, Providence, Rhode Island 02940-6248. Requests
can also be made by e-mail to reprint-permission@ams.org.
© 2009 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights
except those granted to the United States Government.
To Susanne, Simon, and Jakob
Contents
Preface xi
Part 0. Preliminaries
Chapter 0. A first look at Banach and Hilbert spaces 3
0.1. Warm up: Metric and topological spaces 3
0.2. The Banach space of continuous functions 12
0.3. The geometry of Hilbert spaces 16
0.4. Completeness 22
0.5. Bounded operators 22
0.6. Lebesgue L^p spaces 25
0.7. Appendix: The uniform boundedness principle 32
Part 1. Mathematical Foundations of Quantum Mechanics
Chapter 1. Hilbert spaces 37
1.1. Hilbert spaces 37
1.2. Orthonormal bases 39
1.3. The projection theorem and the Riesz lemma 43
1.4. Orthogonal sums and tensor products 45
1.5. The C* algebra of bounded linear operators 47
1.6. Weak and strong convergence 49
1.7. Appendix: The Stone–Weierstraß theorem 51
Chapter 2. Self-adjointness and spectrum 55
2.1. Some quantum mechanics 55
2.2. Self-adjoint operators 58
2.3. Quadratic forms and the Friedrichs extension 67
2.4. Resolvents and spectra 73
2.5. Orthogonal sums of operators 79
2.6. Self-adjoint extensions 81
2.7. Appendix: Absolutely continuous functions 84
Chapter 3. The spectral theorem 87
3.1. The spectral theorem 87
3.2. More on Borel measures 99
3.3. Spectral types 104
3.4. Appendix: The Herglotz theorem 107
Chapter 4. Applications of the spectral theorem 111
4.1. Integral formulas 111
4.2. Commuting operators 115
4.3. The min-max theorem 117
4.4. Estimating eigenspaces 119
4.5. Tensor products of operators 120
Chapter 5. Quantum dynamics 123
5.1. The time evolution and Stone's theorem 123
5.2. The RAGE theorem 126
5.3. The Trotter product formula 131
Chapter 6. Perturbation theory for self-adjoint operators 133
6.1. Relatively bounded operators and the Kato–Rellich theorem 133
6.2. More on compact operators 136
6.3. HilbertSchmidt and trace class operators 139
6.4. Relatively compact operators and Weyl's theorem 145
6.5. Relatively form bounded operators and the KLMN theorem 149
6.6. Strong and norm resolvent convergence 153
Part 2. Schrödinger Operators
Chapter 7. The free Schrödinger operator 161
7.1. The Fourier transform 161
7.2. The free Schrödinger operator 167
7.3. The time evolution in the free case 169
7.4. The resolvent and Green's function 171
Chapter 8. Algebraic methods 173
8.1. Position and momentum 173
8.2. Angular momentum 175
8.3. The harmonic oscillator 178
8.4. Abstract commutation 179
Chapter 9. One-dimensional Schrödinger operators 181
9.1. Sturm–Liouville operators 181
9.2. Weyl's limit circle, limit point alternative 187
9.3. Spectral transformations I 195
9.4. Inverse spectral theory 202
9.5. Absolutely continuous spectrum 206
9.6. Spectral transformations II 209
9.7. The spectra of one-dimensional Schrödinger operators 214
Chapter 10. One-particle Schrödinger operators 221
10.1. Self-adjointness and spectrum 221
10.2. The hydrogen atom 222
10.3. Angular momentum 225
10.4. The eigenvalues of the hydrogen atom 229
10.5. Nondegeneracy of the ground state 235
Chapter 11. Atomic Schrödinger operators 239
11.1. Self-adjointness 239
11.2. The HVZ theorem 242
Chapter 12. Scattering theory 247
12.1. Abstract theory 247
12.2. Incoming and outgoing states 250
12.3. Schrödinger operators with short range potentials 253
Part 3. Appendix
Appendix A. Almost everything about Lebesgue integration 259
A.1. Borel measures in a nut shell 259
A.2. Extending a premeasure to a measure 263
A.3. Measurable functions 268
A.4. The Lebesgue integral 270
A.5. Product measures 275
A.6. Vague convergence of measures 278
A.7. Decomposition of measures 280
A.8. Derivatives of measures 282
Bibliographical notes 289
Bibliography 293
Glossary of notation 297
Index 301
Preface
Overview
The present text was written for my course Schrödinger Operators held
at the University of Vienna in winter 1999, summer 2002, summer 2005,
and winter 2007. It gives a brief but rather self-contained introduction
to the mathematical methods of quantum mechanics with a view towards
applications to Schrödinger operators. The applications presented are highly
selective and many important and interesting items are not touched upon.
Part 1 is a stripped down introduction to spectral theory of unbounded
operators where I try to introduce only those topics which are needed for
the applications later on. This has the advantage that you will (hopefully)
not get drowned in results which are never used again before you get to
the applications. In particular, I am not trying to present an encyclopedic
reference. Nevertheless I still feel that the first part should provide a solid
background covering many important results which are usually taken for
granted in more advanced books and research papers.
My approach is built around the spectral theorem as the central object.
Hence I try to get to it as quickly as possible. Moreover, I do not take the
detour over bounded operators but I go straight for the unbounded case.
In addition, existence of spectral measures is established via the Herglotz
theorem rather than the Riesz representation theorem since this approach
paves the way for an investigation of spectral types via boundary values of
the resolvent as the spectral parameter approaches the real line.
Part 2 starts with the free Schrödinger equation and computes the
free resolvent and time evolution. In addition, I discuss position, momen-
tum, and angular momentum operators via algebraic methods. This is
usually found in any physics textbook on quantum mechanics, with the
only difference that I include some technical details which are typically
not found there. Then there is an introduction to one-dimensional mod-
els (Sturm–Liouville operators) including generalized eigenfunction expan-
sions (Weyl–Titchmarsh theory) and subordinacy theory from Gilbert and
Pearson. These results are applied to compute the spectrum of the hy-
drogen atom, where again I try to provide some mathematical details not
found in physics textbooks. Further topics are nondegeneracy of the ground
state, spectra of atoms (the HVZ theorem), and scattering theory (the Enß
method).
Prerequisites
I assume some previous experience with Hilbert spaces and bounded
linear operators which should be covered in any basic course on functional
analysis. However, while this assumption is reasonable for mathematics
students, it might not always be for physics students. For this reason there
is a preliminary chapter reviewing all necessary results (including proofs).
In addition, there is an appendix (again with proofs) providing all necessary
results from measure theory.
Literature
The present book is highly influenced by the four volumes of Reed and
Simon [40][43] (see also [14]) and by the book by Weidmann [60] (an
extended version of which has recently appeared in two volumes [62], [63],
however, only in German). Other books with a similar scope are for example
[14], [15], [21], [23], [39], [48], and [55]. For those who want to know more
about the physical aspects, I can recommend the classical book by Thirring
[58] and the visual guides by Thaller [56], [57]. Further information can be
found in the bibliographical notes at the end.
Reader's guide
There is some intentional overlap between Chapter 0, Chapter 1, and
Chapter 2. Hence, provided you have the necessary background, you can
start reading in Chapter 1 or even Chapter 2. Chapters 2 and 3 are key
chapters and you should study them in detail (except for Section 2.6 which
can be skipped on first reading). Chapter 4 should give you an idea of how
the spectral theorem is used. You should have a look at (e.g.) the first
section and you can come back to the remaining ones as needed. Chapter 5
contains two key results from quantum dynamics: Stone's theorem and the
RAGE theorem. In particular the RAGE theorem shows the connections
between long time behavior and spectral types. Finally, Chapter 6 is again
of central importance and should be studied in detail.
The chapters in the second part are mostly independent of each other
except for Chapter 7, which is a prerequisite for all others except for Chap-
ter 9.
If you are interested in one-dimensional models (Sturm–Liouville equa-
tions), Chapter 9 is all you need.
If you are interested in atoms, read Chapter 7, Chapter 10, and Chap-
ter 11. In particular, you can skip the separation of variables method (Sec-
tions 10.3 and 10.4, which require Chapter 9) for computing the eigenvalues of
the hydrogen atom, if you are happy with the fact that there are countably
many which accumulate at the bottom of the continuous spectrum.
If you are interested in scattering theory, read Chapter 7, the first two
sections of Chapter 10, and Chapter 12. Chapter 5 is one of the key prereq-
uisites in this case.
Updates
The AMS is hosting a web page for this book at
http://www.ams.org/bookpages/gsm-99/
where updates, corrections, and other material may be found, including a
link to material on my own web site:
http://www.mat.univie.ac.at/~gerald/ftp/book-schroe/
Acknowledgments
I would like to thank Volker Enß for making his lecture notes [18] avail-
able to me. Many colleagues and students have made useful suggestions and
pointed out mistakes in earlier drafts of this book, in particular: Kerstin
Ammann, Jörg Arnberger, Chris Davis, Fritz Gesztesy, Maria Hoffmann-
Ostenhof, Zhenyou Huang, Helge Krüger, Katrin Grunert, Wang Lanning,
Daniel Lenz, Christine Pfeuffer, Roland Möws, Arnold L. Neidhardt, Harald
Rindler, Johannes Temme, Karl Unterkofler, Joachim Weidmann, and Rudi
Weikard.
If you also find an error or if you have comments or suggestions
(no matter how small), please let me know.
I have been supported by the Austrian Science Fund (FWF) during much
of this writing, most recently under grant Y330.
Gerald Teschl
Vienna, Austria
January 2009
Gerald Teschl
Fakultät für Mathematik
Nordbergstraße 15
Universität Wien
1090 Wien, Austria
E-mail: Gerald.Teschl@univie.ac.at
URL: http://www.mat.univie.ac.at/~gerald/
Part 0
Preliminaries
Chapter 0
A first look at Banach
and Hilbert spaces
I assume that the reader has some basic familiarity with measure theory and functional
analysis. For convenience, some facts needed from Banach and L^p spaces are
reviewed in this chapter. A crash course in measure theory can be found in
Appendix A. If you feel comfortable with terms like Lebesgue L^p spaces, Banach
space, or bounded linear operator, you can skip this entire chapter. However, you
might want to at least browse through it to refresh your memory.
0.1. Warm up: Metric and topological spaces
Before we begin, I want to recall some basic facts from metric and topological
spaces. I presume that you are familiar with these topics from your calculus
course. As a general reference I can warmly recommend Kelley's classical
book [26].
A metric space is a space X together with a distance function d :
X × X → R such that
(i) d(x, y) ≥ 0,
(ii) d(x, y) = 0 if and only if x = y,
(iii) d(x, y) = d(y, x),
(iv) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
If (ii) does not hold, d is called a semi-metric. Moreover, it is straight-
forward to see the inverse triangle inequality (Problem 0.1)
|d(x, y) - d(z, y)| ≤ d(x, z). (0.1)
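Although the text proceeds abstractly, the four axioms and (0.1) are easy to spot-check numerically. The following sketch (helper names are ad hoc; a sanity check on randomly sampled points of R^3, not a proof) tests them for the Euclidean metric:

```python
import itertools
import random

def euclidean(x, y):
    """Euclidean distance on R^n."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

random.seed(0)
points = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(15)]

for x, y, z in itertools.product(points, repeat=3):
    assert euclidean(x, y) >= 0                           # axiom (i)
    assert (euclidean(x, y) == 0) == (x == y)             # axiom (ii), generic points
    assert euclidean(x, y) == euclidean(y, x)             # axiom (iii)
    assert euclidean(x, z) <= euclidean(x, y) + euclidean(y, z) + 1e-12  # axiom (iv)
    # inverse triangle inequality (0.1)
    assert abs(euclidean(x, y) - euclidean(z, y)) <= euclidean(x, z) + 1e-12
```

The small tolerance 1e-12 only absorbs floating point rounding; the inequalities themselves hold exactly.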
Example. Euclidean space R^n together with d(x, y) = (∑_{k=1}^n (x_k - y_k)^2)^{1/2}
is a metric space and so is C^n together with d(x, y) = (∑_{k=1}^n |x_k - y_k|^2)^{1/2}.
The set
B_r(x) = {y ∈ X | d(x, y) < r} (0.2)
is called an open ball around x with radius r > 0. A point x of some set
U is called an interior point of U if U contains some ball around x. If x is
an interior point of U, then U is also called a neighborhood of x. A point
x is called a limit point of U if (B_r(x)\{x}) ∩ U ≠ ∅ for every ball around
x. Note that a limit point x need not lie in U, but U must contain points
arbitrarily close to x.
Example. Consider R with the usual metric and let U = (-1, 1). Then
every point x ∈ U is an interior point of U. The points ±1 are limit points
of U.
A set consisting only of interior points is called open. The family of
open sets O satisfies the properties
(i) ∅, X ∈ O,
(ii) O_1, O_2 ∈ O implies O_1 ∩ O_2 ∈ O,
(iii) {O_α} ⊆ O implies ⋃_α O_α ∈ O.
That is, O is closed under finite intersections and arbitrary unions.
In general, a space X together with a family of sets O, the open sets,
satisfying (i)-(iii) is called a topological space. The notions of interior
point, limit point, and neighborhood carry over to topological spaces if we
replace open ball by open set.
There are usually different choices for the topology. Two usually not
very interesting examples are the trivial topology O = {∅, X} and the
discrete topology O = P(X) (the power set of X). Given two topologies
O_1 and O_2 on X, O_1 is called weaker (or coarser) than O_2 if and only if
O_1 ⊆ O_2.
Example. Note that different metrics can give rise to the same topology.
For example, we can equip R^n (or C^n) with the Euclidean distance d(x, y)
as before or we could also use
d̃(x, y) = ∑_{k=1}^n |x_k - y_k|. (0.3)
Then
(1/√n) ∑_{k=1}^n |x_k| ≤ (∑_{k=1}^n |x_k|^2)^{1/2} ≤ ∑_{k=1}^n |x_k| (0.4)
shows B_{r/√n}(x) ⊆ B̃_r(x) ⊆ B_r(x), where B, B̃ are balls computed using d,
d̃, respectively.
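The norm inequality (0.4), and hence the resulting ball inclusions, can be checked on random samples. The sketch below (illustrative only; helper names are ad hoc) compares the two metrics:

```python
import random

def d(x, y):
    """Euclidean metric on R^n (the metric d of the text)."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def dt(x, y):
    """The metric d~ of (0.3)."""
    return sum(abs(a - b) for a, b in zip(x, y))

random.seed(1)
n = 5
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(n)]
    y = [random.uniform(-1, 1) for _ in range(n)]
    # (0.4) applied to the difference vector: d~/sqrt(n) <= d <= d~,
    # which is exactly what yields B_{r/sqrt(n)} inside B~_r inside B_r
    assert dt(x, y) / n ** 0.5 <= d(x, y) + 1e-12
    assert d(x, y) <= dt(x, y) + 1e-12
```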
Example. We can always replace a metric d by the bounded metric
d̃(x, y) = d(x, y)/(1 + d(x, y)) (0.5)
without changing the topology.
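That (0.5) is again a metric rests on the map t ↦ t/(1+t) being increasing and subadditive on [0, ∞). A quick numerical illustration (a spot check on the real line, not a proof; names are ad hoc):

```python
import random

def bounded(d):
    """The bounded metric d/(1+d) of (0.5); t -> t/(1+t) is increasing
    and subadditive, which is why the triangle inequality survives."""
    return lambda x, y: d(x, y) / (1 + d(x, y))

db = bounded(lambda x, y: abs(x - y))  # built from the usual metric on R

random.seed(2)
for _ in range(1000):
    x, y, z = (random.uniform(-100, 100) for _ in range(3))
    assert db(x, y) < 1                              # bounded by 1
    assert db(x, z) <= db(x, y) + db(y, z) + 1e-12   # triangle inequality survives
```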
Every subspace Y of a topological space X becomes a topological space
of its own if we call O ⊆ Y open if there is some open set Õ ⊆ X such that
O = Õ ∩ Y (induced topology).
Example. The set (0, 1] ⊆ R is not open in the topology of X = R, but it is
open in the induced topology when considered as a subset of Y = [-1, 1].
A family of open sets B ⊆ O is called a base for the topology if for each
x and each neighborhood U(x), there is some set O ∈ B with x ∈ O ⊆ U(x).
Since an open set O is a neighborhood of every one of its points, it can be
written as O = ⋃_{Õ∈B, Õ⊆O} Õ and we have
Lemma 0.1. If B ⊆ O is a base for the topology, then every open set can
be written as a union of elements from B.
If there exists a countable base, then X is called second countable.
Example. By construction the open balls B_{1/n}(x) are a base for the topol-
ogy in a metric space. In the case of R^n (or C^n) it even suffices to take balls
with rational center, and hence R^n (and C^n) is second countable.
A topological space is called a Hausdorff space if for two different
points there are always two disjoint neighborhoods.
Example. Any metric space is a Hausdorff space: Given two different
points x and y, the balls B_{d/2}(x) and B_{d/2}(y), where d = d(x, y) > 0, are
disjoint neighborhoods (a semi-metric space will not be Hausdorff).
The complement of an open set is called a closed set. It follows from
de Morgan's rules that the family of closed sets C satisfies
(i) ∅, X ∈ C,
(ii) C_1, C_2 ∈ C implies C_1 ∪ C_2 ∈ C,
(iii) {C_α} ⊆ C implies ⋂_α C_α ∈ C.
That is, closed sets are closed under finite unions and arbitrary intersections.
The smallest closed set containing a given set U is called the closure
Ū = ⋂_{C∈C, U⊆C} C, (0.6)
and the largest open set contained in a given set U is called the interior
U° = ⋃_{O∈O, O⊆U} O. (0.7)
We can define interior and limit points as before by replacing the word
ball by open set. Then it is straightforward to check
Lemma 0.2. Let X be a topological space. Then the interior of U is the
set of all interior points of U and the closure of U is the union of U with
all limit points of U.
A sequence (x_n)_{n=1}^∞ ⊆ X is said to converge to some point x ∈ X if
d(x, x_n) → 0. We write lim_{n→∞} x_n = x as usual in this case. Clearly the
limit is unique if it exists (this is not true for a semi-metric).
Every convergent sequence is a Cauchy sequence; that is, for every
ε > 0 there is some N ∈ N such that
d(x_n, x_m) ≤ ε, n, m ≥ N. (0.8)
If the converse is also true, that is, if every Cauchy sequence has a limit,
then X is called complete.
Example. Both R^n and C^n are complete metric spaces.
A point x is clearly a limit point of U if and only if there is some sequence
x_n ∈ U\{x} converging to x. Hence
Lemma 0.3. A closed subset of a complete metric space is again a complete
metric space.
Note that convergence can also be equivalently formulated in topological
terms: A sequence x_n converges to x if and only if for every neighborhood
U of x there is some N ∈ N such that x_n ∈ U for n ≥ N. In a Hausdorff
space the limit is unique.
A set U is called dense if its closure is all of X, that is, if Ū = X. A
metric space is called separable if it contains a countable dense set. Note
that X is separable if and only if it is second countable as a topological
space.
Lemma 0.4. Let X be a separable metric space. Every subset of X is again
separable.
Proof. Let A = {x_n}_{n∈N} be a dense set in X. The only problem is that
A ∩ Y might contain no elements at all. However, some elements of A must
be at least arbitrarily close: Let J ⊆ N^2 be the set of all pairs (n, m) for
which B_{1/m}(x_n) ∩ Y ≠ ∅ and choose some y_{n,m} ∈ B_{1/m}(x_n) ∩ Y for all
(n, m) ∈ J. Then B = {y_{n,m}}_{(n,m)∈J} ⊆ Y is countable. To see that B is
dense, choose y ∈ Y. Then there is some sequence x_{n_k} with d(x_{n_k}, y) < 1/k.
Hence (n_k, k) ∈ J and d(y_{n_k,k}, y) ≤ d(y_{n_k,k}, x_{n_k}) + d(x_{n_k}, y) ≤ 2/k → 0. □
A function between metric spaces X and Y is called continuous at a
point x ∈ X if for every ε > 0 we can find a δ > 0 such that
d_Y(f(x), f(y)) ≤ ε if d_X(x, y) < δ. (0.9)
If f is continuous at every point, it is called continuous.
Lemma 0.5. Let X, Y be metric spaces and f : X → Y. The following are
equivalent:
(i) f is continuous at x (i.e., (0.9) holds).
(ii) f(x_n) → f(x) whenever x_n → x.
(iii) For every neighborhood V of f(x), f^{-1}(V) is a neighborhood of x.
Proof. (i) ⇒ (ii) is obvious. (ii) ⇒ (iii): If (iii) does not hold, there is
a neighborhood V of f(x) such that B_δ(x) ⊄ f^{-1}(V) for every δ. Hence
we can choose a sequence x_n ∈ B_{1/n}(x) such that x_n ∉ f^{-1}(V). Thus
x_n → x but f(x_n) does not converge to f(x). (iii) ⇒ (i): Choose V = B_ε(f(x)) and observe
that by (iii), B_δ(x) ⊆ f^{-1}(V) for some δ. □
The last item implies that f is continuous if and only if the inverse image
of every open (closed) set is again open (closed).
Note: In a topological space, (iii) is used as the definition for continuity.
However, in general (ii) and (iii) will no longer be equivalent unless one uses
generalized sequences, so-called nets, where the index set N is replaced by
arbitrary directed sets.
The support of a function f : X → C^n is the closure of all points x for
which f(x) does not vanish; that is,
supp(f) = {x ∈ X | f(x) ≠ 0}. (0.10)
If X and Y are metric spaces, then X × Y together with
d((x_1, y_1), (x_2, y_2)) = d_X(x_1, x_2) + d_Y(y_1, y_2) (0.11)
is a metric space. A sequence (x_n, y_n) converges to (x, y) if and only if
x_n → x and y_n → y. In particular, the projections onto the first (x, y) ↦ x,
respectively, onto the second (x, y) ↦ y, coordinate are continuous.
In particular, by the inverse triangle inequality (0.1),
|d(x_n, y_n) - d(x, y)| ≤ d(x_n, x) + d(y_n, y), (0.12)
we see that d : X × X → R is continuous.
Example. If we consider R × R, we do not get the Euclidean distance of
R^2 unless we modify (0.11) as follows:
d̃((x_1, y_1), (x_2, y_2)) = (d_X(x_1, x_2)^2 + d_Y(y_1, y_2)^2)^{1/2}. (0.13)
As noted in our previous example, the topology (and thus also conver-
gence/continuity) is independent of this choice.
If X and Y are just topological spaces, the product topology is defined
by calling O ⊆ X × Y open if for every point (x, y) ∈ O there are open
neighborhoods U of x and V of y such that U × V ⊆ O. In the case of
metric spaces this clearly agrees with the topology defined via the product
metric (0.11).
A cover of a set Y ⊆ X is a family of sets {U_α} such that Y ⊆ ⋃_α U_α.
A cover is called open if all U_α are open. Any subset of {U_α} which still
covers Y is called a subcover.
Lemma 0.6 (Lindelöf). If X is second countable, then every open cover
has a countable subcover.
Proof. Let {U_α} be an open cover for Y and let B be a countable base.
Since every U_α can be written as a union of elements from B, the set of all
B ∈ B which satisfy B ⊆ U_α for some α form a countable open cover for Y.
Moreover, for every B_n in this set we can find an α_n such that B_n ⊆ U_{α_n}.
By construction, {U_{α_n}} is a countable subcover. □
A subset K ⊆ X is called compact if every open cover has a finite
subcover.
Lemma 0.7. A topological space is compact if and only if it has the finite
intersection property: The intersection of a family of closed sets is empty
if and only if the intersection of some finite subfamily is empty.
Proof. By taking complements, to every family of open sets there is a cor-
responding family of closed sets and vice versa. Moreover, the open sets
are a cover if and only if the corresponding closed sets have empty intersec-
tion. □
A subset K ⊆ X is called sequentially compact if every sequence has
a convergent subsequence.
Lemma 0.8. Let X be a topological space.
(i) The continuous image of a compact set is compact.
(ii) Every closed subset of a compact set is compact.
(iii) If X is Hausdorff, any compact set is closed.
(iv) The product of finitely many compact sets is compact.
(v) A compact set is also sequentially compact.
Proof. (i) Observe that if {O_α} is an open cover for f(Y), then {f^{-1}(O_α)}
is one for Y.
(ii) Let {O_α} be an open cover for the closed subset Y. Then {O_α} ∪
{X\Y} is an open cover for X.
(iii) Let Y ⊆ X be compact. We show that X\Y is open. Fix x ∈ X\Y
(if Y = X, there is nothing to do). By the definition of Hausdorff, for
every y ∈ Y there are disjoint neighborhoods V(y) of y and U_y(x) of x. By
compactness of Y, there are y_1, . . . , y_n such that the V(y_j) cover Y. But
then U(x) = ⋂_{j=1}^n U_{y_j}(x) is a neighborhood of x which does not intersect
Y.
(iv) Let {O_α} be an open cover for X × Y. For every (x, y) ∈ X × Y
there is some α(x, y) such that (x, y) ∈ O_{α(x,y)}. By definition of the product
topology there is some open rectangle U(x, y) × V(x, y) ⊆ O_{α(x,y)}. Hence for
fixed x, {V(x, y)}_{y∈Y} is an open cover of Y. Hence there are finitely many
points y_k(x) such that the V(x, y_k(x)) cover Y. Set U(x) = ⋂_k U(x, y_k(x)).
Since finite intersections of open sets are open, {U(x)}_{x∈X} is an open cover
and there are finitely many points x_j such that the U(x_j) cover X. By
construction, the U(x_j) × V(x_j, y_k(x_j)) ⊆ O_{α(x_j,y_k(x_j))} cover X × Y.
(v) Let x_n be a sequence which has no convergent subsequence. Then
K = {x_n} has no limit points and is hence compact by (ii). For every n
there is a ball B_{ε_n}(x_n) which contains only finitely many elements of K.
However, finitely many suffice to cover K, a contradiction. □
In a metric space compact and sequentially compact are equivalent.
Lemma 0.9. Let X be a metric space. Then a subset is compact if and only
if it is sequentially compact.
Proof. By item (v) of the previous lemma it suffices to show that X is
compact if it is sequentially compact.
First of all, note that every cover of open balls with fixed radius ε > 0
has a finite subcover since, if this were false, we could construct a sequence
x_n ∈ X \ ⋃_{m=1}^{n-1} B_ε(x_m) such that d(x_n, x_m) > ε for m < n.
In particular, we are done if we can show that for every open cover
{O_α} there is some ε > 0 such that for every x we have B_ε(x) ⊆ O_α for
some α = α(x). Indeed, choosing {x_k}_{k=1}^n such that B_ε(x_k) is a cover, we
have that O_{α(x_k)} is a cover as well.
So it remains to show that there is such an ε. If there were none, for
every ε > 0 there must be an x such that B_ε(x) ⊄ O_α for every α. Choose
ε = 1/n and pick a corresponding x_n. Since X is sequentially compact, it is no
restriction to assume x_n converges (after maybe passing to a subsequence).
Let x = lim x_n. Then x lies in some O_α and hence B_ε(x) ⊆ O_α. But choosing
n so large that 1/n < ε/2 and d(x_n, x) < ε/2, we have B_{1/n}(x_n) ⊆ B_ε(x) ⊆ O_α,
contradicting our assumption. □
Please also recall the Heine–Borel theorem:
Theorem 0.10 (Heine–Borel). In R^n (or C^n) a set is compact if and only
if it is bounded and closed.
Proof. By Lemma 0.8 (ii) and (iii) it suffices to show that a closed interval
I ⊆ R is compact. Moreover, by Lemma 0.9 it suffices to show that
every sequence in I = [a, b] has a convergent subsequence. Let x_n be our
sequence and divide I = [a, (a+b)/2] ∪ [(a+b)/2, b]. Then at least one of these two
intervals, call it I_1, contains infinitely many elements of our sequence. Let
y_1 = x_{n_1} be the first one. Subdivide I_1 and pick y_2 = x_{n_2}, with n_2 > n_1 as
before. Proceeding like this, we obtain a Cauchy sequence y_n (note that by
construction I_{n+1} ⊆ I_n and hence |y_n - y_m| ≤ (b-a)/2^n for m ≥ n). □
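The bisection argument in this proof is effectively an algorithm. The sketch below extracts such a subsequence from the bounded sequence x_n = sin n; a finite search window stands in for the "infinitely many elements" test, so this is a heuristic illustration, not a proof, and all names are ad hoc:

```python
import math

def bisection_subsequence(x, a, b, steps, window=20000):
    """Follow the proof of Theorem 0.10: repeatedly halve [a, b], keep a half
    that still contains terms of the tail (judged on a finite window), and
    pick the next term of the sequence lying inside it."""
    picked, last = [], -1
    for _ in range(steps):
        mid = (a + b) / 2
        tail = [n for n in range(last + 1, last + 1 + window) if a <= x(n) <= b]
        if any(x(n) <= mid for n in tail):   # left half still captures tail terms
            b = mid
        else:
            a = mid
        last = next(n for n in tail if a <= x(n) <= b)
        picked.append(last)
    return picked

idx = bisection_subsequence(math.sin, -1.0, 1.0, steps=10)
vals = [math.sin(n) for n in idx]
assert all(i < j for i, j in zip(idx, idx[1:]))   # a genuine subsequence
assert abs(vals[-1] - vals[-2]) <= 2 / 2 ** 9     # trapped in nested halves
```

The last assertion mirrors the estimate |y_n - y_m| ≤ (b-a)/2^n: consecutive picks lie in a common interval of length 2/2^9.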
A topological space is called locally compact if every point has a com-
pact neighborhood.
Example. R^n is locally compact.
The distance between a point x ∈ X and a subset Y ⊆ X is
dist(x, Y) = inf_{y∈Y} d(x, y). (0.14)
Note that x is a limit point of Y if and only if dist(x, Y) = 0.
Lemma 0.11. Let X be a metric space. Then
|dist(x, Y) - dist(z, Y)| ≤ d(x, z). (0.15)
In particular, x ↦ dist(x, Y) is continuous.
Proof. Taking the infimum in the triangle inequality d(x, y) ≤ d(x, z) +
d(z, y) shows dist(x, Y) ≤ d(x, z) + dist(z, Y). Hence dist(x, Y) - dist(z, Y) ≤
d(x, z). Interchanging x and z shows dist(z, Y) - dist(x, Y) ≤ d(x, z). □
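The Lipschitz estimate (0.15) is easy to observe numerically. A minimal sketch (for the usual metric on R and a finite sample set Y, so the infimum is a minimum; names are ad hoc):

```python
import random

def dist(x, Y):
    """dist(x, Y) = inf over y in Y of d(x, y), as in (0.14), for the usual
    metric on R and a finite set Y (so the infimum is a minimum)."""
    return min(abs(x - y) for y in Y)

Y = [0.0, 1.0, 4.0]
random.seed(3)
for _ in range(1000):
    x, z = random.uniform(-10, 10), random.uniform(-10, 10)
    # the Lipschitz estimate (0.15)
    assert abs(dist(x, Y) - dist(z, Y)) <= abs(x - z) + 1e-12
```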
Lemma 0.12 (Urysohn). Suppose C_1 and C_2 are disjoint closed subsets of
a metric space X. Then there is a continuous function f : X → [0, 1] such
that f is zero on C_2 and one on C_1.
If X is locally compact and C_1 is compact, one can choose f with compact
support.
Proof. To prove the first claim, set f(x) = dist(x, C_2)/(dist(x, C_1) + dist(x, C_2)). For the
second claim, observe that there is an open set O such that Ō is compact
and C_1 ⊆ O ⊆ Ō ⊆ X\C_2. In fact, for every x ∈ C_1, there is a ball B_ε(x) such
that B̄_ε(x) is compact and B̄_ε(x) ⊆ X\C_2. Since C_1 is compact, finitely
many of them cover C_1 and we can choose the union of those balls to be O.
Now replace C_2 by X\O. □
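The quotient in the proof can be computed directly. The sketch below (with two hypothetical closed intervals of R as C_1 and C_2, and ad hoc distance helpers) evaluates f = dist(·, C_2)/(dist(·, C_1) + dist(·, C_2)), which is 1 on C_1 and 0 on C_2, and is continuous because the sets are disjoint and closed, so the denominator never vanishes:

```python
def urysohn(dist_C1, dist_C2):
    """The quotient from the proof of Lemma 0.12:
    f = dist(., C2) / (dist(., C1) + dist(., C2))."""
    def f(x):
        return dist_C2(x) / (dist_C1(x) + dist_C2(x))
    return f

# sample sets: C1 = [-1, 0] and C2 = [2, 3], disjoint closed subsets of R
d1 = lambda x: max(0.0, -1 - x, x - 0)   # dist(x, [-1, 0])
d2 = lambda x: max(0.0, 2 - x, x - 3)    # dist(x, [2, 3])
f = urysohn(d1, d2)
assert f(-0.5) == 1.0 and f(2.5) == 0.0  # one on C1, zero on C2
assert 0.0 <= f(1.0) <= 1.0              # interpolates in between
```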
Note that Urysohn's lemma implies that a metric space is normal; that
is, for any two disjoint closed sets C_1 and C_2, there are disjoint open sets
O_1 and O_2 such that C_j ⊆ O_j, j = 1, 2. In fact, choose f as in Urysohn's
lemma and set O_1 = f^{-1}([0, 1/2)), respectively, O_2 = f^{-1}((1/2, 1]).
Lemma 0.13. Let X be a locally compact metric space. Suppose K is
a compact set and {O_j}_{j=1}^n an open cover. Then there is a partition of
unity for K subordinate to this cover; that is, there are continuous functions
h_j : X → [0, 1] such that h_j has compact support contained in O_j and
∑_{j=1}^n h_j(x) ≤ 1 (0.16)
with equality for x ∈ K.
Proof. For every x ∈ K there is some ε and some j such that B_ε(x) ⊆ O_j.
By compactness of K, finitely many of these balls cover K. Let K_j be the
union of those balls which lie inside O_j. By Urysohn's lemma there are
functions g_j : X → [0, 1] such that g_j = 1 on K_j and g_j = 0 on X\O_j. Now
set
h_j = g_j ∏_{k=1}^{j-1} (1 - g_k). (0.17)
Then h_j : X → [0, 1] has compact support contained in O_j and
∑_{j=1}^n h_j(x) = 1 - ∏_{j=1}^n (1 - g_j(x)) (0.18)
shows that the sum is one for x ∈ K, since x ∈ K_j for some j implies
g_j(x) = 1 and causes the product to vanish. □
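The telescoping identity (0.18) behind this proof can be seen concretely. The sketch below builds (0.17) from two hypothetical bump functions on R (piecewise linear, values in [0, 1]; all names ad hoc) and checks (0.16) and (0.18) on a grid:

```python
def partition_of_unity(gs):
    """Build h_j = g_j * prod_{k<j} (1 - g_k), as in (0.17), from bump
    functions g_j with values in [0, 1]."""
    def make(j):
        def hj(x):
            val = gs[j](x)
            for k in range(j):
                val *= 1 - gs[k](x)
            return val
        return hj
    return [make(j) for j in range(len(gs))]

# two overlapping bumps: g1 is 1 on [-1,1], g2 is 1 on [1,3]
g1 = lambda x: max(0.0, min(1.0, 2 - abs(x)))
g2 = lambda x: max(0.0, min(1.0, 2 - abs(x - 2)))
h1, h2 = partition_of_unity([g1, g2])

for x in [i / 10 for i in range(-30, 60)]:
    s = h1(x) + h2(x)
    assert s <= 1 + 1e-12                                      # (0.16)
    assert abs(s - (1 - (1 - g1(x)) * (1 - g2(x)))) < 1e-12    # (0.18)
    if -1 <= x <= 3:   # on K = [-1, 3] some g_j equals 1, so the sum is 1
        assert abs(s - 1) < 1e-12
```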
Problem 0.1. Show that |d(x, y) - d(z, y)| ≤ d(x, z).
Problem 0.2. Show the quadrangle inequality |d(x, y) - d(x′, y′)| ≤
d(x, x′) + d(y, y′).
Problem 0.3. Let X be some space together with a sequence of distance
functions d_n, n ∈ N. Show that
d(x, y) = ∑_{n=1}^∞ (1/2^n) d_n(x, y)/(1 + d_n(x, y))
is again a distance function.
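A numerical experiment can lend confidence before attempting the proof of Problem 0.3. The sketch below truncates the series to two hypothetical metrics on R (the series itself converges since each term is at most 2^-n) and spot-checks the metric axioms; this is a sanity check, not a solution:

```python
import random

def combine(metrics):
    """The candidate metric d = sum_n 2^-n d_n/(1+d_n) from Problem 0.3,
    truncated to a finite list of metrics for this numerical check."""
    def d(x, y):
        return sum(dn(x, y) / (1 + dn(x, y)) / 2 ** (n + 1)
                   for n, dn in enumerate(metrics))
    return d

d = combine([lambda x, y: abs(x - y), lambda x, y: min(1.0, abs(x - y))])

random.seed(4)
for _ in range(1000):
    x, y, z = (random.uniform(-10, 10) for _ in range(3))
    assert d(x, y) == d(y, x)                        # symmetry
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-12      # triangle inequality
assert d(1.0, 1.0) == 0.0
```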
Problem 0.4. Show that the closure operation is idempotent: the closure of Ū is again Ū.
Problem 0.5. Let U ⊆ V be subsets of a metric space X. Show that if U
is dense in V and V is dense in X, then U is dense in X.
Problem 0.6. Show that any open set O ⊆ R can be written as a countable
union of disjoint intervals. (Hint: Let {I_α} be the set of all maximal subin-
tervals of O; that is, I_α ⊆ O and there is no other subinterval of O which
contains I_α. Then this is a cover of disjoint intervals which has a countable
subcover.)
0.2. The Banach space of continuous functions
Now let us have a first look at Banach spaces by investigating the set of
continuous functions C(I) on a compact interval I = [a, b] ⊆ R. Since we
want to handle complex models, we will always consider complex-valued
functions!
One way of declaring a distance, well-known from calculus, is the max-
imum norm:
‖f - g‖_∞ = max_{x∈I} |f(x) - g(x)|. (0.19)
It is not hard to see that with this definition C(I) becomes a normed linear
space:
A normed linear space X is a vector space X over C (or R) with a
real-valued function (the norm) ‖·‖ such that
• ‖f‖ ≥ 0 for all f ∈ X and ‖f‖ = 0 if and only if f = 0,
• ‖αf‖ = |α| ‖f‖ for all α ∈ C and f ∈ X, and
• ‖f + g‖ ≤ ‖f‖ + ‖g‖ for all f, g ∈ X (triangle inequality).
From the triangle inequality we also get the inverse triangle inequal-
ity (Problem 0.7)
|‖f‖ - ‖g‖| ≤ ‖f - g‖. (0.20)
Once we have a norm, we have a distance d(f, g) = ‖f - g‖ and hence we
know when a sequence of vectors f_n converges to a vector f. We will write
f_n → f or lim_{n→∞} f_n = f, as usual, in this case. Moreover, a mapping
F : X → Y between two normed spaces is called continuous if f_n → f
implies F(f_n) → F(f). In fact, it is not hard to see that the norm, vector
addition, and multiplication by scalars are continuous (Problem 0.8).
In addition to the concept of convergence we have also the concept of
a Cauchy sequence and hence the concept of completeness: A normed
0.2. The Banach space of continuous functions 13
space is called complete if every Cauchy sequence has a limit. A complete
normed space is called a Banach space.
Example. The space $\ell^1(\mathbb{N})$ of all sequences $a = (a_j)_{j=1}^\infty$ for which the norm
$$\|a\|_1 = \sum_{j=1}^\infty |a_j| \tag{0.21}$$
is finite is a Banach space.

To show this, we need to verify three things: (i) $\ell^1(\mathbb{N})$ is a vector space that is closed under addition and scalar multiplication, (ii) $\|.\|_1$ satisfies the three requirements for a norm, and (iii) $\ell^1(\mathbb{N})$ is complete.

First of all observe
$$\sum_{j=1}^k |a_j + b_j| \le \sum_{j=1}^k |a_j| + \sum_{j=1}^k |b_j| \le \|a\|_1 + \|b\|_1 \tag{0.22}$$
for any finite $k$. Letting $k \to \infty$, we conclude that $\ell^1(\mathbb{N})$ is closed under addition and that the triangle inequality holds. That $\ell^1(\mathbb{N})$ is closed under scalar multiplication and the two other properties of a norm are straightforward. It remains to show that $\ell^1(\mathbb{N})$ is complete. Let $a^n = (a^n_j)_{j=1}^\infty$ be a Cauchy sequence; that is, for given $\varepsilon > 0$ we can find an $N_\varepsilon$ such that $\|a^m - a^n\|_1 \le \varepsilon$ for $m, n \ge N_\varepsilon$. This implies in particular $|a^m_j - a^n_j| \le \varepsilon$ for any fixed $j$. Thus $a^n_j$ is a Cauchy sequence for fixed $j$ and by completeness of $\mathbb{C}$ has a limit: $\lim_{n\to\infty} a^n_j = a_j$. Now consider
$$\sum_{j=1}^k |a^m_j - a^n_j| \le \varepsilon \tag{0.23}$$
and take $m \to \infty$:
$$\sum_{j=1}^k |a_j - a^n_j| \le \varepsilon. \tag{0.24}$$
Since this holds for any finite $k$, we even have $\|a - a^n\|_1 \le \varepsilon$. Hence $(a - a^n) \in \ell^1(\mathbb{N})$ and since $a^n \in \ell^1(\mathbb{N})$, we finally conclude $a = a^n + (a - a^n) \in \ell^1(\mathbb{N})$.
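The completeness argument above can be mimicked numerically on truncated sequences. A minimal sketch in plain Python; the Cauchy sequence $a^n_j = (1 + 1/n)\,2^{-j}$ and the truncation length are illustrative choices, not from the text. It checks the triangle inequality (0.22) and the convergence estimate (0.24).

```python
# Numerical sanity check of the l^1 estimates above, on truncated sequences
# of length J. The sequences a^n are an illustrative (hypothetical) choice.
J = 200  # truncation length; tails beyond J are negligible here

def norm1(a):
    # l^1 norm of a (finite) sequence: sum of absolute values
    return sum(abs(x) for x in a)

def a_n(n):
    # the Cauchy sequence a^n_j = (1 + 1/n) * 2^{-j}, j = 1..J
    return [(1 + 1/n) * 2**-j for j in range(1, J + 1)]

a_limit = [2**-j for j in range(1, J + 1)]  # componentwise limit a_j = 2^{-j}

# triangle inequality (0.22): ||a + b||_1 <= ||a||_1 + ||b||_1
a, b = a_n(3), a_n(5)
lhs = norm1([x + y for x, y in zip(a, b)])
rhs = norm1(a) + norm1(b)

# Cauchy property, and convergence to the componentwise limit (0.24)
dist_cauchy = norm1([x - y for x, y in zip(a_n(100), a_n(200))])
dist_limit = norm1([x - y for x, y in zip(a_n(100), a_limit)])
```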
Example. The space $\ell^\infty(\mathbb{N})$ of all bounded sequences $a = (a_j)_{j=1}^\infty$ together with the norm
$$\|a\|_\infty = \sup_{j \in \mathbb{N}} |a_j| \tag{0.25}$$
is a Banach space (Problem 0.10).
Now what about convergence in the space $C(I)$? A sequence of functions $f_n(x)$ converges to $f$ if and only if
$$\lim_{n\to\infty} \|f - f_n\| = \lim_{n\to\infty} \sup_{x \in I} |f_n(x) - f(x)| = 0. \tag{0.26}$$
That is, in the language of real analysis, $f_n$ converges uniformly to $f$. Now let us look at the case where $f_n$ is only a Cauchy sequence. Then $f_n(x)$ is clearly a Cauchy sequence of real numbers for any fixed $x \in I$. In particular, by completeness of $\mathbb{C}$, there is a limit $f(x)$ for each $x$. Thus we get a limiting function $f(x)$. Moreover, letting $m \to \infty$ in
$$|f_m(x) - f_n(x)| \le \varepsilon, \qquad m, n > N_\varepsilon, \; x \in I, \tag{0.27}$$
we see
$$|f(x) - f_n(x)| \le \varepsilon, \qquad n > N_\varepsilon, \; x \in I; \tag{0.28}$$
that is, $f_n(x)$ converges uniformly to $f(x)$. However, up to this point we do not know whether it is in our vector space $C(I)$ or not, that is, whether it is continuous or not. Fortunately, there is a well-known result from real analysis which tells us that the uniform limit of continuous functions is again continuous. Hence $f(x) \in C(I)$ and thus every Cauchy sequence in $C(I)$ converges. Or, in other words,

Theorem 0.14. $C(I)$ with the maximum norm is a Banach space.
Next we want to know if there is a countable basis for $C(I)$. We will call a set of vectors $u_n \in X$ linearly independent if every finite subset is and we will call a countable set of linearly independent vectors $\{u_n\}_{n=1}^N \subseteq X$ a Schauder basis if every element $f \in X$ can be uniquely written as a countable linear combination of the basis elements:
$$f = \sum_{n=1}^N c_n u_n, \qquad c_n = c_n(f) \in \mathbb{C}, \tag{0.29}$$
where the sum has to be understood as a limit if $N = \infty$. In this case the span $\mathrm{span}\{u_n\}$ (the set of all finite linear combinations) of $\{u_n\}$ is dense in $X$. A set whose span is dense is called total and if we have a countable total set, we also have a countable dense set (consider only linear combinations with rational coefficients; show this). A normed linear space containing a countable dense set is called separable.
Example. The Banach space $\ell^1(\mathbb{N})$ is separable. In fact, the set of vectors $\delta^n$, with $\delta^n_n = 1$ and $\delta^n_m = 0$, $n \ne m$, is total: Let $a = (a_j)_{j=1}^\infty \in \ell^1(\mathbb{N})$ be given and set $a^n = \sum_{j=1}^n a_j \delta^j$. Then
$$\|a - a^n\|_1 = \sum_{j=n+1}^\infty |a_j| \to 0 \tag{0.30}$$
since $a^n_j = a_j$ for $1 \le j \le n$ and $a^n_j = 0$ for $j > n$.
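The density argument can be illustrated numerically: the error of the finite approximation $a^n$ is exactly the tail sum (0.30). A minimal sketch, with the summable sequence $a_j = 1/j^2$ chosen purely for illustration:

```python
# Approximating a in l^1 by finite linear combinations a^n = sum_{j<=n} a_j delta^j.
# Illustrative choice a_j = 1/j^2 (summable); the tail sum is the error (0.30).
J = 10000  # a long truncation stands in for the full sequence

a = [1.0 / j**2 for j in range(1, J + 1)]

def tail_norm(a, n):
    # ||a - a^n||_1 = sum_{j > n} |a_j|, since a^n agrees with a up to index n
    return sum(abs(x) for x in a[n:])

errors = [tail_norm(a, n) for n in (10, 100, 1000)]
```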
Luckily this is also the case for $C(I)$:

Theorem 0.15 (Weierstraß). Let $I$ be a compact interval. Then the set of polynomials is dense in $C(I)$.
Proof. Let $f(x) \in C(I)$ be given. By considering $f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$ instead, it is no loss to assume that $f$ vanishes at the boundary points. Moreover, without restriction we only consider $I = [-\frac{1}{2}, \frac{1}{2}]$ (why?).

Now the claim follows from the lemma below using
$$u_n(x) = \frac{1}{I_n} (1 - x^2)^n,$$
where
$$I_n = \int_{-1}^1 (1 - x^2)^n dx = \frac{n}{n+1} \int_{-1}^1 (1 - x)^{n-1} (1 + x)^{n+1} dx = \cdots = \frac{n!}{(n+1) \cdots (2n+1)} 2^{2n+1} = \frac{n!}{\frac{1}{2}(\frac{1}{2} + 1) \cdots (\frac{1}{2} + n)} = \frac{\sqrt{\pi}\, \Gamma(1 + n)}{\Gamma(\frac{3}{2} + n)} = \sqrt{\frac{\pi}{n}} \big(1 + O(\tfrac{1}{n})\big).$$
In the last step we have used $\Gamma(\frac{1}{2}) = \sqrt{\pi}$ [1, (6.1.8)] and the asymptotics follow from Stirling's formula [1, (6.1.37)].
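The kernel $u_n(x) = (1 - x^2)^n / I_n$ from the proof can be tried out numerically. The sketch below replaces the integrals by Riemann sums on a grid; the test function $f(x) = \cos(\pi x)$, which vanishes at $\pm\frac{1}{2}$, and the grid sizes are illustrative choices. It computes the uniform error of $f_n$ and shows it shrink as $n$ grows.

```python
# Landau-kernel approximation: u_n(x) = (1 - x^2)^n / I_n on [-1, 1],
# convolved against f on [-1/2, 1/2]. Grid-based Riemann sums only.
import math

def landau_approx(f, n, m=2001):
    # return max_x |f_n(x) - f(x)| with f_n(x) = int u_n(x - y) f(y) dy
    I_n = math.sqrt(math.pi) * math.gamma(n + 1) / math.gamma(n + 1.5)
    ys = [-0.5 + j / (m - 1) for j in range(m)]  # grid on [-1/2, 1/2]
    dy = 1.0 / (m - 1)
    err = 0.0
    for x in [-0.5 + j / 100 for j in range(101)]:
        fn = sum((1 - (x - y) ** 2) ** n / I_n * f(y) * dy for y in ys)
        err = max(err, abs(fn - f(x)))
    return err

f = lambda x: math.cos(math.pi * x)  # vanishes at x = +-1/2
err_small_n = landau_approx(f, 4)
err_large_n = landau_approx(f, 64)
```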
Lemma 0.16 (Smoothing). Let $u_n(x)$ be a sequence of nonnegative continuous functions on $[-1, 1]$ such that
$$\int_{|x| \le 1} u_n(x) dx = 1 \quad \text{and} \quad \int_{\delta \le |x| \le 1} u_n(x) dx \to 0, \qquad \delta > 0. \tag{0.31}$$
(In other words, $u_n$ has mass one and concentrates near $x = 0$ as $n \to \infty$.)

Then for every $f \in C[-\frac{1}{2}, \frac{1}{2}]$ which vanishes at the endpoints, $f(-\frac{1}{2}) = f(\frac{1}{2}) = 0$, we have that
$$f_n(x) = \int_{-1/2}^{1/2} u_n(x - y) f(y) dy \tag{0.32}$$
converges uniformly to $f(x)$.
Proof. Since $f$ is uniformly continuous, for given $\varepsilon$ we can find a $\delta$ (independent of $x$) such that $|f(x) - f(y)| \le \varepsilon$ whenever $|x - y| \le \delta$. Moreover, we can choose $n$ such that $\int_{\delta \le |y| \le 1} u_n(y) dy \le \varepsilon$. Now abbreviate $M = \max\{1, \|f\|_\infty\}$ and note
$$\Big| f(x) - \int_{-1/2}^{1/2} u_n(x - y) f(x) dy \Big| = |f(x)| \, \Big| 1 - \int_{-1/2}^{1/2} u_n(x - y) dy \Big| \le \varepsilon M.$$
In fact, either the distance of $x$ to one of the boundary points $\pm\frac{1}{2}$ is smaller than $\delta$ and hence $|f(x)| \le \varepsilon$ or otherwise the difference between one and the integral is smaller than $\varepsilon$.

Using this, we have
$$\begin{aligned}
|f_n(x) - f(x)| &\le \int_{-1/2}^{1/2} u_n(x - y) |f(y) - f(x)| dy + \varepsilon M \\
&\le \int_{|y| \le 1/2,\, |x - y| \le \delta} u_n(x - y) |f(y) - f(x)| dy + \int_{|y| \le 1/2,\, |x - y| \ge \delta} u_n(x - y) |f(y) - f(x)| dy + \varepsilon M \\
&= \varepsilon + 2M\varepsilon + \varepsilon M = (1 + 3M)\varepsilon,
\end{aligned} \tag{0.33}$$
which proves the claim.

Note that $f_n$ will be as smooth as $u_n$, hence the title smoothing lemma. The same idea is used to approximate noncontinuous functions by smooth ones (of course the convergence will no longer be uniform in this case).
Corollary 0.17. $C(I)$ is separable.

However, $\ell^\infty(\mathbb{N})$ is not separable (Problem 0.11)!
Problem 0.7. Show that $|\|f\| - \|g\|| \le \|f - g\|$.

Problem 0.8. Let $X$ be a Banach space. Show that the norm, vector addition, and multiplication by scalars are continuous. That is, if $f_n \to f$, $g_n \to g$, and $\alpha_n \to \alpha$, then $\|f_n\| \to \|f\|$, $f_n + g_n \to f + g$, and $\alpha_n g_n \to \alpha g$.

Problem 0.9. Let $X$ be a Banach space. Show that $\sum_{j=1}^\infty \|f_j\| < \infty$ implies that
$$\sum_{j=1}^\infty f_j = \lim_{n\to\infty} \sum_{j=1}^n f_j$$
exists. The series is called absolutely convergent in this case.

Problem 0.10. Show that $\ell^\infty(\mathbb{N})$ is a Banach space.

Problem 0.11. Show that $\ell^\infty(\mathbb{N})$ is not separable. (Hint: Consider sequences which take only the value one and zero. How many are there? What is the distance between two such sequences?)
0.3. The geometry of Hilbert spaces

So it looks like $C(I)$ has all the properties we want. However, there is still one thing missing: How should we define orthogonality in $C(I)$? In Euclidean space, two vectors are called orthogonal if their scalar product vanishes, so we would need a scalar product:

Suppose $\mathfrak{H}$ is a vector space. A map $\langle ., .. \rangle : \mathfrak{H} \times \mathfrak{H} \to \mathbb{C}$ is called a sesquilinear form if it is conjugate linear in the first argument and linear in the second; that is,
$$\begin{aligned}
\langle \alpha_1 f_1 + \alpha_2 f_2, g \rangle &= \alpha_1^* \langle f_1, g \rangle + \alpha_2^* \langle f_2, g \rangle, \\
\langle f, \alpha_1 g_1 + \alpha_2 g_2 \rangle &= \alpha_1 \langle f, g_1 \rangle + \alpha_2 \langle f, g_2 \rangle,
\end{aligned} \qquad \alpha_1, \alpha_2 \in \mathbb{C}, \tag{0.34}$$
where $*$ denotes complex conjugation. A sesquilinear form satisfying the requirements

(i) $\langle f, f \rangle > 0$ for $f \ne 0$ (positive definite),
(ii) $\langle f, g \rangle = \langle g, f \rangle^*$ (symmetry)

is called an inner product or scalar product. Associated with every scalar product is a norm
$$\|f\| = \sqrt{\langle f, f \rangle}. \tag{0.35}$$
The pair $(\mathfrak{H}, \langle ., .. \rangle)$ is called an inner product space. If $\mathfrak{H}$ is complete, it is called a Hilbert space.
Example. Clearly $\mathbb{C}^n$ with the usual scalar product
$$\langle a, b \rangle = \sum_{j=1}^n a_j^* b_j \tag{0.36}$$
is a (finite dimensional) Hilbert space.

Example. A somewhat more interesting example is the Hilbert space $\ell^2(\mathbb{N})$, that is, the set of all sequences
$$\Big\{ (a_j)_{j=1}^\infty \,\Big|\, \sum_{j=1}^\infty |a_j|^2 < \infty \Big\} \tag{0.37}$$
with scalar product
$$\langle a, b \rangle = \sum_{j=1}^\infty a_j^* b_j. \tag{0.38}$$
(Show that this is in fact a separable Hilbert space; see Problem 0.13.)
Of course I still owe you a proof for the claim that $\sqrt{\langle f, f \rangle}$ is indeed a norm. Only the triangle inequality is nontrivial, which will follow from the Cauchy–Schwarz inequality below.

A vector $f \in \mathfrak{H}$ is called normalized or a unit vector if $\|f\| = 1$. Two vectors $f, g \in \mathfrak{H}$ are called orthogonal or perpendicular ($f \perp g$) if $\langle f, g \rangle = 0$ and parallel if one is a multiple of the other.

If $f$ and $g$ are orthogonal, we have the Pythagorean theorem:
$$\|f + g\|^2 = \|f\|^2 + \|g\|^2, \qquad f \perp g, \tag{0.39}$$
which is one line of computation.
Suppose $u$ is a unit vector. Then the projection of $f$ in the direction of $u$ is given by
$$f_\parallel = \langle u, f \rangle u \tag{0.40}$$
and $f_\perp$ defined via
$$f_\perp = f - \langle u, f \rangle u \tag{0.41}$$
is perpendicular to $u$ since $\langle u, f_\perp \rangle = \langle u, f - \langle u, f \rangle u \rangle = \langle u, f \rangle - \langle u, f \rangle \langle u, u \rangle = 0$.

[Figure: decomposition of $f$ into the component $f_\parallel$ parallel to $u$ and the component $f_\perp$ perpendicular to $u$.]

Taking any other vector $\alpha u$ parallel to $u$, it is easy to see
$$\|f - \alpha u\|^2 = \|f_\perp + (f_\parallel - \alpha u)\|^2 = \|f_\perp\|^2 + |\langle u, f \rangle - \alpha|^2 \tag{0.42}$$
and hence $f_\parallel = \langle u, f \rangle u$ is the unique vector parallel to $u$ which is closest to $f$.
As a first consequence we obtain the Cauchy–Schwarz–Bunjakowski inequality:

Theorem 0.18 (Cauchy–Schwarz–Bunjakowski). Let $\mathfrak{H}_0$ be an inner product space. Then for every $f, g \in \mathfrak{H}_0$ we have
$$|\langle f, g \rangle| \le \|f\| \, \|g\| \tag{0.43}$$
with equality if and only if $f$ and $g$ are parallel.

Proof. It suffices to prove the case $\|g\| = 1$. But then the claim follows from $\|f\|^2 = |\langle g, f \rangle|^2 + \|f_\perp\|^2$.
Note that the Cauchy–Schwarz inequality entails that the scalar product is continuous in both variables; that is, if $f_n \to f$ and $g_n \to g$, we have $\langle f_n, g_n \rangle \to \langle f, g \rangle$.

As another consequence we infer that the map $\|.\|$ is indeed a norm. In fact,
$$\|f + g\|^2 = \|f\|^2 + \langle f, g \rangle + \langle g, f \rangle + \|g\|^2 \le (\|f\| + \|g\|)^2. \tag{0.44}$$

But let us return to $C(I)$. Can we find a scalar product which has the maximum norm as associated norm? Unfortunately the answer is no! The reason is that the maximum norm does not satisfy the parallelogram law (Problem 0.17).
Theorem 0.19 (Jordan–von Neumann). A norm is associated with a scalar product if and only if the parallelogram law
$$\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2 \tag{0.45}$$
holds.

In this case the scalar product can be recovered from its norm by virtue of the polarization identity
$$\langle f, g \rangle = \frac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + \mathrm{i} \|f - \mathrm{i} g\|^2 - \mathrm{i} \|f + \mathrm{i} g\|^2 \right). \tag{0.46}$$
Proof. If an inner product space is given, verification of the parallelogram law and the polarization identity is straightforward (Problem 0.14).

To show the converse, we define
$$s(f, g) = \frac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + \mathrm{i} \|f - \mathrm{i} g\|^2 - \mathrm{i} \|f + \mathrm{i} g\|^2 \right).$$
Then $s(f, f) = \|f\|^2$ and $s(f, g) = s(g, f)^*$ are straightforward to check. Moreover, another straightforward computation using the parallelogram law shows
$$s(f, g) + s(f, h) = 2 s\big(f, \tfrac{g + h}{2}\big).$$
Now choosing $h = 0$ (and using $s(f, 0) = 0$) shows $s(f, g) = 2 s(f, \tfrac{g}{2})$ and thus $s(f, g) + s(f, h) = s(f, g + h)$. Furthermore, by induction we infer $\tfrac{m}{2^n} s(f, g) = s(f, \tfrac{m}{2^n} g)$; that is, $\alpha s(f, g) = s(f, \alpha g)$ for every positive rational $\alpha$. By continuity (check this!) this holds for all $\alpha > 0$ and $s(f, -g) = -s(f, g)$, respectively, $s(f, \mathrm{i} g) = \mathrm{i}\, s(f, g)$, finishes the proof.

Note that the parallelogram law and the polarization identity even hold for sesquilinear forms (Problem 0.14).
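Both directions of the theorem can be checked numerically: the polarization identity (0.46) recovers the $\ell^2$ scalar product from its norm alone, while the maximum norm violates the parallelogram law (0.45) and therefore cannot come from any scalar product (cf. Problem 0.17). A minimal sketch with illustrative vectors:

```python
# Polarization identity (0.46) for the l^2 scalar product, and failure of the
# parallelogram law (0.45) for the max norm. The vectors are arbitrary choices.
def inner(a, b):
    # <a, b> = sum a_j^* b_j (conjugate linear in the first argument)
    return sum(x.conjugate() * y for x, y in zip(a, b))

def norm2(a):
    return abs(inner(a, a)) ** 0.5

def polarization(a, b):
    # (0.46), using only the norm; 1j plays the role of i
    n = norm2
    add = lambda u, v, c=1: [x + c * y for x, y in zip(u, v)]
    return 0.25 * (n(add(a, b)) ** 2 - n(add(a, b, -1)) ** 2
                   + 1j * n(add(a, b, -1j)) ** 2 - 1j * n(add(a, b, 1j)) ** 2)

f, g = [1 + 2j, -1j, 0.5], [2 - 1j, 3, -1 + 1j]
pol_error = abs(polarization(f, g) - inner(f, g))

# parallelogram law fails for the max norm: take f = (1, 0), g = (0, 1)
mx = lambda a: max(abs(x) for x in a)
lhs = mx([1, 1]) ** 2 + mx([1, -1]) ** 2        # ||f+g||^2 + ||f-g||^2 = 2
rhs = 2 * mx([1, 0]) ** 2 + 2 * mx([0, 1]) ** 2  # 2||f||^2 + 2||g||^2 = 4
```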
But how do we define a scalar product on $C(I)$? One possibility is
$$\langle f, g \rangle = \int_a^b f^*(x) g(x) dx. \tag{0.47}$$
The corresponding inner product space is denoted by $L^2_{cont}(I)$. Note that we have
$$\|f\| \le \sqrt{|b - a|}\, \|f\|_\infty \tag{0.48}$$
and hence the maximum norm is stronger than the $L^2_{cont}$ norm.
Suppose we have two norms $\|.\|_1$ and $\|.\|_2$ on a space $X$. Then $\|.\|_2$ is said to be stronger than $\|.\|_1$ if there is a constant $m > 0$ such that
$$\|f\|_1 \le m \|f\|_2. \tag{0.49}$$
It is straightforward to check the following.

Lemma 0.20. If $\|.\|_2$ is stronger than $\|.\|_1$, then any $\|.\|_2$ Cauchy sequence is also a $\|.\|_1$ Cauchy sequence.

Hence if a function $F : X \to Y$ is continuous in $(X, \|.\|_1)$, it is also continuous in $(X, \|.\|_2)$ and if a set is dense in $(X, \|.\|_2)$, it is also dense in $(X, \|.\|_1)$.
In particular, $L^2_{cont}$ is separable. But is it also complete? Unfortunately the answer is no:

Example. Take $I = [0, 2]$ and define
$$f_n(x) = \begin{cases} 0, & 0 \le x \le 1 - \frac{1}{n}, \\ 1 + n(x - 1), & 1 - \frac{1}{n} \le x \le 1, \\ 1, & 1 \le x \le 2. \end{cases} \tag{0.50}$$
Then $f_n(x)$ is a Cauchy sequence in $L^2_{cont}$, but there is no limit in $L^2_{cont}$! Clearly the limit should be the step function which is $0$ for $0 \le x < 1$ and $1$ for $1 \le x \le 2$, but this step function is discontinuous (Problem 0.18)!
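The claims of this example are easy to check numerically. The sketch below approximates the $L^2_{cont}$ norms by Riemann sums (grid sizes are an arbitrary choice): the distance of $f_n$ to the step function tends to zero, so the sequence is Cauchy, yet the would-be limit is not continuous.

```python
# The ramp functions f_n of (0.50) on [0, 2]: L^2-Cauchy, converging (in L^2)
# to a discontinuous step function. Norms via Riemann sums.
def f(n, x):
    if x <= 1 - 1 / n:
        return 0.0
    if x <= 1:
        return 1 + n * (x - 1)
    return 1.0

step = lambda x: 0.0 if x < 1 else 1.0

def l2_dist(g, h, samples=20000):
    # Riemann-sum approximation of ||g - h||_2 on [0, 2]
    dx = 2.0 / samples
    return sum((g(j * dx) - h(j * dx)) ** 2 * dx for j in range(samples)) ** 0.5

cauchy_gap = l2_dist(lambda x: f(50, x), lambda x: f(100, x))
gaps = [l2_dist(lambda x, n=n: f(n, x), step) for n in (5, 50, 500)]
```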
This shows that in infinite dimensional spaces different norms will give rise to different convergent sequences! In fact, the key to solving problems in infinite dimensional spaces is often finding the right norm! This is something which cannot happen in the finite dimensional case.

Theorem 0.21. If $X$ is a finite dimensional space, then all norms are equivalent. That is, for any two given norms $\|.\|_1$ and $\|.\|_2$, there are constants $m_1$ and $m_2$ such that
$$\frac{1}{m_2} \|f\|_1 \le \|f\|_2 \le m_1 \|f\|_1. \tag{0.51}$$
Proof. Clearly we can choose a basis $u_j$, $1 \le j \le n$, and assume that $\|.\|_2$ is the usual Euclidean norm, $\|\sum_j \alpha_j u_j\|_2^2 = \sum_j |\alpha_j|^2$. Let $f = \sum_j \alpha_j u_j$. Then by the triangle and Cauchy–Schwarz inequalities
$$\|f\|_1 \le \sum_j |\alpha_j| \|u_j\|_1 \le \sqrt{\sum_j \|u_j\|_1^2}\, \|f\|_2$$
and we can choose $m_2 = \sqrt{\sum_j \|u_j\|_1^2}$.

In particular, if $f_n$ is convergent with respect to $\|.\|_2$, it is also convergent with respect to $\|.\|_1$. Thus $\|.\|_1$ is continuous with respect to $\|.\|_2$ and attains its minimum $m > 0$ on the unit sphere (which is compact by the Heine–Borel theorem). Now choose $m_1 = 1/m$.
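For a concrete instance of the theorem, the sketch below samples random vectors in $\mathbb{R}^5$ and confirms the standard comparison $\|f\|_2 \le \|f\|_1 \le \sqrt{5}\, \|f\|_2$ between the Euclidean norm and the $\ell^1$ norm; the dimension, seed, and sample count are arbitrary choices.

```python
# Equivalence of the l^1 and Euclidean norms on R^5: the ratio ||f||_1/||f||_2
# stays within the fixed bounds 1 <= ratio <= sqrt(5).
import random

random.seed(0)
dim = 5
ratios = []
for _ in range(1000):
    v = [random.uniform(-1, 1) for _ in range(dim)]
    n1 = sum(abs(x) for x in v)
    n2 = sum(x * x for x in v) ** 0.5
    if n2 > 0:
        ratios.append(n1 / n2)

lo, hi = min(ratios), max(ratios)
```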
Problem 0.12. Show that the norm in a Hilbert space satisfies $\|f + g\| = \|f\| + \|g\|$ if and only if $f = \alpha g$, $\alpha \ge 0$, or $g = 0$.
Problem 0.13. Show that $\ell^2(\mathbb{N})$ is a separable Hilbert space.

Problem 0.14. Suppose $\mathfrak{Q}$ is a vector space. Let $s(f, g)$ be a sesquilinear form on $\mathfrak{Q}$ and $q(f) = s(f, f)$ the associated quadratic form. Prove the parallelogram law
$$q(f + g) + q(f - g) = 2 q(f) + 2 q(g) \tag{0.52}$$
and the polarization identity
$$s(f, g) = \frac{1}{4} \left( q(f + g) - q(f - g) + \mathrm{i}\, q(f - \mathrm{i} g) - \mathrm{i}\, q(f + \mathrm{i} g) \right). \tag{0.53}$$
Conversely, show that any quadratic form $q(f) : \mathfrak{Q} \to \mathbb{R}$ satisfying $q(\alpha f) = |\alpha|^2 q(f)$ and the parallelogram law gives rise to a sesquilinear form via the polarization identity.

Problem 0.15. A sesquilinear form is called bounded if
$$\|s\| = \sup_{\|f\| = \|g\| = 1} |s(f, g)|$$
is finite. Similarly, the associated quadratic form $q$ is bounded if
$$\|q\| = \sup_{\|f\| = 1} |q(f)|$$
is finite. Show
$$\|q\| \le \|s\| \le 2 \|q\|.$$
(Hint: Use the parallelogram law and the polarization identity from the previous problem.)

Problem 0.16. Suppose $\mathfrak{Q}$ is a vector space. Let $s(f, g)$ be a sesquilinear form on $\mathfrak{Q}$ and $q(f) = s(f, f)$ the associated quadratic form. Show that the Cauchy–Schwarz inequality
$$|s(f, g)| \le q(f)^{1/2} q(g)^{1/2} \tag{0.54}$$
holds if $q(f) \ge 0$.
(Hint: Consider $0 \le q(f + \alpha g) = q(f) + 2 \operatorname{Re}(\alpha\, s(f, g)) + |\alpha|^2 q(g)$ and choose $\alpha = t\, s(f, g)^* / |s(f, g)|$ with $t \in \mathbb{R}$.)

Problem 0.17. Show that the maximum norm (on $C[0, 1]$) does not satisfy the parallelogram law.

Problem 0.18. Prove the claims made about $f_n$, defined in (0.50), in the last example.
0.4. Completeness

Since $L^2_{cont}$ is not complete, how can we obtain a Hilbert space from it? Well, the answer is simple: take the completion.

If $X$ is an (incomplete) normed space, consider the set of all Cauchy sequences $\tilde{X}$. Call two Cauchy sequences equivalent if their difference converges to zero and denote by $\bar{X}$ the set of all equivalence classes. It is easy to see that $\bar{X}$ (and $\tilde{X}$) inherit the vector space structure from $X$. Moreover,

Lemma 0.22. If $x_n$ is a Cauchy sequence, then $\|x_n\|$ converges.

Consequently the norm of a Cauchy sequence $(x_n)_{n=1}^\infty$ can be defined by $\|(x_n)_{n=1}^\infty\| = \lim_{n\to\infty} \|x_n\|$ and is independent of the equivalence class (show this!). Thus $\bar{X}$ is a normed space ($\tilde{X}$ is not! Why?).

Theorem 0.23. $\bar{X}$ is a Banach space containing $X$ as a dense subspace if we identify $x \in X$ with the equivalence class of all sequences converging to $x$.

Proof. (Outline) It remains to show that $\bar{X}$ is complete. Let $\xi_n = [(x_{n,j})_{j=1}^\infty]$ be a Cauchy sequence in $\bar{X}$. Then it is not hard to see that $\xi = [(x_{j,j})_{j=1}^\infty]$ is its limit.

Let me remark that the completion $\bar{X}$ is unique. More precisely, any other complete space which contains $X$ as a dense subset is isomorphic to $\bar{X}$. This can for example be seen by showing that the identity map on $X$ has a unique extension to $\bar{X}$ (compare Theorem 0.26 below).

In particular, it is no restriction to assume that a normed linear space or an inner product space is complete. However, in the important case of $L^2_{cont}$ it is somewhat inconvenient to work with equivalence classes of Cauchy sequences and hence we will give a different characterization using the Lebesgue integral later.
0.5. Bounded operators

A linear map $A$ between two normed spaces $X$ and $Y$ will be called a (linear) operator
$$A : \mathfrak{D}(A) \subseteq X \to Y. \tag{0.55}$$
The linear subspace $\mathfrak{D}(A)$ on which $A$ is defined is called the domain of $A$ and is usually required to be dense. The kernel
$$\operatorname{Ker}(A) = \{f \in \mathfrak{D}(A) \,|\, Af = 0\} \tag{0.56}$$
and range
$$\operatorname{Ran}(A) = \{Af \,|\, f \in \mathfrak{D}(A)\} = A \mathfrak{D}(A) \tag{0.57}$$
are defined as usual. The operator $A$ is called bounded if the operator norm
$$\|A\| = \sup_{\|f\|_X = 1} \|Af\|_Y \tag{0.58}$$
is finite.
The set of all bounded linear operators from $X$ to $Y$ is denoted by $\mathcal{L}(X, Y)$. If $X = Y$, we write $\mathcal{L}(X, X) = \mathcal{L}(X)$.

Theorem 0.24. The space $\mathcal{L}(X, Y)$ together with the operator norm (0.58) is a normed space. It is a Banach space if $Y$ is.

Proof. That (0.58) is indeed a norm is straightforward. If $Y$ is complete and $A_n$ is a Cauchy sequence of operators, then $A_n f$ converges to an element $g$ for every $f$. Define a new operator $A$ via $Af = g$. By continuity of the vector operations, $A$ is linear and by continuity of the norm, $\|Af\| = \lim_{n\to\infty} \|A_n f\| \le (\lim_{n\to\infty} \|A_n\|) \|f\|$, it is bounded. Furthermore, given $\varepsilon > 0$ there is some $N$ such that $\|A_n - A_m\| \le \varepsilon$ for $n, m \ge N$ and thus $\|A_n f - A_m f\| \le \varepsilon \|f\|$. Taking the limit $m \to \infty$, we see $\|A_n f - A f\| \le \varepsilon \|f\|$; that is, $A_n \to A$.
By construction, a bounded operator is Lipschitz continuous,
$$\|Af\|_Y \le \|A\| \|f\|_X, \tag{0.59}$$
and hence continuous. The converse is also true:

Theorem 0.25. An operator $A$ is bounded if and only if it is continuous.

Proof. Suppose $A$ is continuous but not bounded. Then there is a sequence of unit vectors $u_n$ such that $\|A u_n\| \ge n$. Then $f_n = \frac{1}{n} u_n$ converges to $0$ but $\|A f_n\| \ge 1$ does not converge to $0$.
Moreover, if $A$ is bounded and densely defined, it is no restriction to assume that it is defined on all of $X$.

Theorem 0.26 (B.L.T. theorem). Let $A \in \mathcal{L}(X, Y)$ and let $Y$ be a Banach space. If $\mathfrak{D}(A)$ is dense, there is a unique (continuous) extension of $A$ to $X$ which has the same norm.

Proof. Since a bounded operator maps Cauchy sequences to Cauchy sequences, this extension can only be given by
$$Af = \lim_{n\to\infty} A f_n, \qquad f_n \in \mathfrak{D}(A), \quad f \in X.$$
To show that this definition is independent of the sequence $f_n \to f$, let $g_n \to f$ be a second sequence and observe
$$\|A f_n - A g_n\| = \|A(f_n - g_n)\| \le \|A\| \|f_n - g_n\| \to 0.$$
From continuity of vector addition and scalar multiplication it follows that our extension is linear. Finally, from continuity of the norm we conclude that the norm does not increase.
An operator in $\mathcal{L}(X, \mathbb{C})$ is called a bounded linear functional and the space $X^* = \mathcal{L}(X, \mathbb{C})$ is called the dual space of $X$. A sequence $f_n$ is said to converge weakly, $f_n \rightharpoonup f$, if $\ell(f_n) \to \ell(f)$ for every $\ell \in X^*$.

The Banach space of bounded linear operators $\mathcal{L}(X)$ even has a multiplication given by composition. Clearly this multiplication satisfies
$$(A + B) C = A C + B C, \quad A (B + C) = A B + A C, \qquad A, B, C \in \mathcal{L}(X) \tag{0.60}$$
and
$$(A B) C = A (B C), \quad \alpha (A B) = (\alpha A) B = A (\alpha B), \qquad \alpha \in \mathbb{C}. \tag{0.61}$$
Moreover, it is easy to see that we have
$$\|A B\| \le \|A\| \|B\|. \tag{0.62}$$
However, note that our multiplication is not commutative (unless $X$ is one-dimensional). We even have an identity, the identity operator $\mathbb{I}$, satisfying $\|\mathbb{I}\| = 1$.

A Banach space together with a multiplication satisfying the above requirements is called a Banach algebra. In particular, note that (0.62) ensures that multiplication is continuous (Problem 0.22).
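Submultiplicativity (0.62) can be observed directly for operators on $\mathbb{C}^4$, that is, for matrices, where numpy's matrix 2-norm is exactly the operator norm induced by the Euclidean vector norm. A minimal sketch with random matrices (the dimension and seed are arbitrary choices):

```python
# ||AB|| <= ||A|| ||B|| for random complex 4x4 matrices; np.linalg.norm(., 2)
# returns the largest singular value, i.e. the Euclidean operator norm.
import numpy as np

rng = np.random.default_rng(0)
violations = 0
for _ in range(100):
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    nA = np.linalg.norm(A, 2)
    nB = np.linalg.norm(B, 2)
    nAB = np.linalg.norm(A @ B, 2)
    if nAB > nA * nB * (1 + 1e-12):  # tolerance for floating point error
        violations += 1
```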
Problem 0.19. Show that the integral operator
$$(Kf)(x) = \int_0^1 K(x, y) f(y) dy,$$
where $K(x, y) \in C([0, 1] \times [0, 1])$, defined on $\mathfrak{D}(K) = C[0, 1]$ is a bounded operator both in $X = C[0, 1]$ (max norm) and $X = L^2_{cont}(0, 1)$.

Problem 0.20. Show that the differential operator $A = \frac{d}{dx}$ defined on $\mathfrak{D}(A) = C^1[0, 1] \subset C[0, 1]$ is an unbounded operator.

Problem 0.21. Show that $\|A B\| \le \|A\| \|B\|$ for every $A, B \in \mathcal{L}(X)$.

Problem 0.22. Show that the multiplication in a Banach algebra $X$ is continuous: $x_n \to x$ and $y_n \to y$ imply $x_n y_n \to x y$.

Problem 0.23. Let
$$f(z) = \sum_{j=0}^\infty f_j z^j, \qquad |z| < R,$$
be a convergent power series with convergence radius $R > 0$. Suppose $A$ is a bounded operator with $\|A\| < R$. Show that
$$f(A) = \sum_{j=0}^\infty f_j A^j$$
exists and defines a bounded linear operator (cf. Problem 0.9).
0.6. Lebesgue $L^p$ spaces

For this section some basic facts about the Lebesgue integral are required. The necessary background can be found in Appendix A. To begin with, Sections A.1, A.3, and A.4 will be sufficient.

We fix some $\sigma$-finite measure space $(X, \Sigma, \mu)$ and denote by $\mathcal{L}^p(X, d\mu)$, $1 \le p$, the set of all complex-valued measurable functions for which
$$\|f\|_p = \left( \int_X |f|^p d\mu \right)^{1/p} \tag{0.63}$$
is finite. First of all note that $\mathcal{L}^p(X, d\mu)$ is a linear space, since $|f + g|^p \le 2^p \max(|f|, |g|)^p \le 2^p \max(|f|^p, |g|^p) \le 2^p (|f|^p + |g|^p)$. Of course our hope is that $\mathcal{L}^p(X, d\mu)$ is a Banach space. However, there is a small technical problem (recall that a property is said to hold almost everywhere if the set where it fails to hold is contained in a set of measure zero):
Lemma 0.27. Let $f$ be measurable. Then
$$\int_X |f|^p d\mu = 0 \tag{0.64}$$
if and only if $f(x) = 0$ almost everywhere with respect to $\mu$.

Proof. Observe that we have $A = \{x \,|\, f(x) \ne 0\} = \bigcup_n A_n$, where $A_n = \{x \,|\, |f(x)| > \frac{1}{n}\}$. If $\int |f|^p d\mu = 0$, we must have $\mu(A_n) = 0$ for every $n$ and hence $\mu(A) = \lim_{n\to\infty} \mu(A_n) = 0$. The converse is obvious.

Note that the proof also shows that if $f$ is not $0$ almost everywhere, there is an $\varepsilon > 0$ such that $\mu(\{x \,|\, |f(x)| \ge \varepsilon\}) > 0$.

Example. Let $\lambda$ be the Lebesgue measure on $\mathbb{R}$. Then the characteristic function of the rationals $\chi_{\mathbb{Q}}$ is zero a.e. (with respect to $\lambda$).

Let $\Theta$ be the Dirac measure centered at $0$. Then $f(x) = 0$ a.e. (with respect to $\Theta$) if and only if $f(0) = 0$.

Thus $\|f\|_p = 0$ only implies $f(x) = 0$ for almost every $x$, but not for all! Hence $\|.\|_p$ is not a norm on $\mathcal{L}^p(X, d\mu)$. The way out of this misery is to identify functions which are equal almost everywhere: Let
$$\mathcal{N}(X, d\mu) = \{f \,|\, f(x) = 0 \; \mu\text{-almost everywhere}\}. \tag{0.65}$$
Then $\mathcal{N}(X, d\mu)$ is a linear subspace of $\mathcal{L}^p(X, d\mu)$ and we can consider the quotient space
$$L^p(X, d\mu) = \mathcal{L}^p(X, d\mu) / \mathcal{N}(X, d\mu). \tag{0.66}$$
If $d\mu$ is the Lebesgue measure on $X \subseteq \mathbb{R}^n$, we simply write $L^p(X)$. Observe that $\|f\|_p$ is well-defined on $L^p(X, d\mu)$.

Even though the elements of $L^p(X, d\mu)$ are, strictly speaking, equivalence classes of functions, we will still call them functions for notational convenience. However, note that for $f \in L^p(X, d\mu)$ the value $f(x)$ is not well-defined (unless there is a continuous representative and different continuous functions are in different equivalence classes, e.g., in the case of Lebesgue measure).

With this modification we are back in business since $L^p(X, d\mu)$ turns out to be a Banach space. We will show this in the following sections.

But before that let us also define $L^\infty(X, d\mu)$. It should be the set of bounded measurable functions $B(X)$ together with the sup norm. The only problem is that if we want to identify functions equal almost everywhere, the supremum is no longer independent of the equivalence class. The solution is the essential supremum
$$\|f\|_\infty = \inf\{C \,|\, \mu(\{x \,|\, |f(x)| > C\}) = 0\}. \tag{0.67}$$
That is, $C$ is an essential bound if $|f(x)| \le C$ almost everywhere and the essential supremum is the infimum over all essential bounds.

Example. If $\lambda$ is the Lebesgue measure, then the essential sup of $\chi_{\mathbb{Q}}$ with respect to $\lambda$ is $0$. If $\Theta$ is the Dirac measure centered at $0$, then the essential sup of $\chi_{\mathbb{Q}}$ with respect to $\Theta$ is $1$ (since $\chi_{\mathbb{Q}}(0) = 1$, and $x = 0$ is the only point which counts for $\Theta$).

As before we set
$$L^\infty(X, d\mu) = B(X) / \mathcal{N}(X, d\mu) \tag{0.68}$$
and observe that $\|f\|_\infty$ is independent of the equivalence class.

If you wonder where the $\infty$ comes from, have a look at Problem 0.24.
As a preparation for proving that $L^p$ is a Banach space, we will need Hölder's inequality, which plays a central role in the theory of $L^p$ spaces. In particular, it will imply Minkowski's inequality, which is just the triangle inequality for $L^p$.

Theorem 0.28 (Hölder's inequality). Let $p$ and $q$ be dual indices; that is,
$$\frac{1}{p} + \frac{1}{q} = 1 \tag{0.69}$$
with $1 \le p \le \infty$. If $f \in L^p(X, d\mu)$ and $g \in L^q(X, d\mu)$, then $f g \in L^1(X, d\mu)$ and
$$\|f g\|_1 \le \|f\|_p \|g\|_q. \tag{0.70}$$

Proof. The case $p = 1$, $q = \infty$ (respectively $p = \infty$, $q = 1$) follows directly from the properties of the integral and hence it remains to consider $1 < p, q < \infty$.

First of all it is no restriction to assume $\|f\|_p = \|g\|_q = 1$. Then, using the elementary inequality (Problem 0.25)
$$a^{1/p} b^{1/q} \le \frac{1}{p} a + \frac{1}{q} b, \qquad a, b \ge 0, \tag{0.71}$$
with $a = |f|^p$ and $b = |g|^q$ and integrating over $X$ gives
$$\int_X |f g| d\mu \le \frac{1}{p} \int_X |f|^p d\mu + \frac{1}{q} \int_X |g|^q d\mu = 1$$
and finishes the proof.
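For the counting measure on $\{1, \dots, N\}$ the integrals become sums, so both Young's inequality (0.71) and Hölder's inequality (0.70) can be spot-checked numerically; the dual pair $p = 3$, $q = 3/2$ and the random data below are illustrative choices.

```python
# Spot checks of Young's inequality (0.71) and Hölder's inequality (0.70)
# for sums (counting measure), with dual indices p = 3, q = 3/2.
import random

random.seed(1)
p, q = 3.0, 1.5  # 1/p + 1/q = 1

# Young: a^{1/p} b^{1/q} <= a/p + b/q for a, b >= 0
young_ok = all(
    (a ** (1 / p)) * (b ** (1 / q)) <= a / p + b / q + 1e-12
    for a in (0.0, 0.3, 1.0, 7.5) for b in (0.0, 0.2, 2.0, 9.0)
)

# Hölder for sums: ||fg||_1 <= ||f||_p ||g||_q
f = [random.uniform(-1, 1) for _ in range(50)]
g = [random.uniform(-1, 1) for _ in range(50)]
lhs = sum(abs(x * y) for x, y in zip(f, g))
rhs = (sum(abs(x) ** p for x in f)) ** (1 / p) * \
      (sum(abs(y) ** q for y in g)) ** (1 / q)
```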
As a consequence we also get

Theorem 0.29 (Minkowski's inequality). Let $f, g \in L^p(X, d\mu)$. Then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{0.72}$$

Proof. Since the cases $p = 1, \infty$ are straightforward, we only consider $1 < p < \infty$. Using $|f + g|^p \le |f| \, |f + g|^{p-1} + |g| \, |f + g|^{p-1}$, we obtain from Hölder's inequality (note $(p - 1) q = p$)
$$\|f + g\|_p^p \le \|f\|_p \|(f + g)^{p-1}\|_q + \|g\|_p \|(f + g)^{p-1}\|_q = (\|f\|_p + \|g\|_p) \|f + g\|_p^{p-1}.$$

This shows that $L^p(X, d\mu)$ is a normed linear space. Finally it remains to show that $L^p(X, d\mu)$ is complete.
Theorem 0.30. The space $L^p(X, d\mu)$ is a Banach space.

Proof. Suppose $f_n$ is a Cauchy sequence. It suffices to show that some subsequence converges (show this). Hence we can drop some terms such that
$$\|f_{n+1} - f_n\|_p \le \frac{1}{2^n}.$$
Now consider $g_n = f_n - f_{n-1}$ (set $f_0 = 0$). Then
$$G(x) = \sum_{k=1}^\infty |g_k(x)|$$
is in $L^p$. This follows from
$$\Big\| \sum_{k=1}^n |g_k| \Big\|_p \le \sum_{k=1}^n \|g_k\|_p \le \|f_1\|_p + \sum_{k=1}^\infty \frac{1}{2^k} = \|f_1\|_p + 1$$
using the monotone convergence theorem. In particular, $G(x) < \infty$ almost everywhere and the sum
$$\sum_{n=1}^\infty g_n(x) = \lim_{n\to\infty} f_n(x)$$
is absolutely convergent for those $x$. Now let $f(x)$ be this limit. Since $|f(x) - f_n(x)|^p$ converges to zero almost everywhere and $|f(x) - f_n(x)|^p \le 2^p G(x)^p \in L^1$, dominated convergence shows $\|f - f_n\|_p \to 0$.
In particular, in the proof of the last theorem we have seen:

Corollary 0.31. If $\|f_n - f\|_p \to 0$, then there is a subsequence which converges pointwise almost everywhere.

Note that the statement is not true in general without passing to a subsequence (Problem 0.28).

Using Hölder's inequality, we can also identify a class of bounded operators in $L^p$.
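The sequence behind Problem 0.28 (often called the typewriter sequence, a name not used in the text) can be sketched explicitly: $f_n = \chi_{I_{m,k}}$ with $n = 2^m + k$ satisfies $\|f_n\|_p \to 0$, yet at any fixed $x$ it takes the value $1$ infinitely often, once for every $m$, so there is no pointwise convergence.

```python
# The "typewriter" sequence of Problem 0.28: f_n = chi_{I_{m,k}}, n = 2^m + k.
# ||f_n||_p -> 0, but every fixed x is hit by one interval per generation m.
def interval(n):
    # decompose n = 2^m + k with 0 <= k < 2^m and return I_{m,k}
    m = n.bit_length() - 1
    k = n - 2 ** m
    return k * 2.0 ** -m, (k + 1) * 2.0 ** -m

def lp_norm(n, p=2):
    a, b = interval(n)
    return (b - a) ** (1 / p)  # ||chi_I||_p = |I|^{1/p}

norms = [lp_norm(n) for n in range(1, 2 ** 12)]
x = 0.3  # an illustrative (non-dyadic) point
hits = [n for n in range(1, 2 ** 12) if interval(n)[0] <= x < interval(n)[1]]
```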
Lemma 0.32 (Schur criterion). Consider $L^p(X, d\mu)$ and $L^p(Y, d\nu)$ and let $\frac{1}{p} + \frac{1}{q} = 1$. Suppose that $K(x, y)$ is measurable and there are measurable functions $K_1(x, y)$, $K_2(x, y)$ such that $|K(x, y)| \le K_1(x, y) K_2(x, y)$ and
$$\|K_1(x, .)\|_q \le C_1, \qquad \|K_2(., y)\|_p \le C_2 \tag{0.73}$$
for $\mu$-almost every $x$, respectively, for $\nu$-almost every $y$. Then the operator $K : L^p(Y, d\nu) \to L^p(X, d\mu)$ defined by
$$(K f)(x) = \int_Y K(x, y) f(y) d\nu(y) \tag{0.74}$$
for $\mu$-almost every $x$ is bounded with $\|K\| \le C_1 C_2$.

Proof. Choose $f \in L^p(Y, d\nu)$. By Fubini's theorem $\int_Y |K(x, y) f(y)| d\nu(y)$ is measurable and by Hölder's inequality we have
$$\begin{aligned}
\int_Y |K(x, y) f(y)| d\nu(y) &\le \int_Y K_1(x, y) K_2(x, y) |f(y)| d\nu(y) \\
&\le \left( \int_Y K_1(x, y)^q d\nu(y) \right)^{1/q} \left( \int_Y |K_2(x, y) f(y)|^p d\nu(y) \right)^{1/p} \\
&\le C_1 \left( \int_Y |K_2(x, y) f(y)|^p d\nu(y) \right)^{1/p}
\end{aligned}$$
(if $K_2(x, .) f(.) \notin L^p(Y, d\nu)$, the inequality is trivially true). Now take this inequality to the $p$th power and integrate with respect to $x$ using Fubini:
$$\int_X \left( \int_Y |K(x, y) f(y)| d\nu(y) \right)^p d\mu(x) \le C_1^p \int_X \int_Y |K_2(x, y) f(y)|^p d\nu(y) d\mu(x) = C_1^p \int_Y \int_X |K_2(x, y) f(y)|^p d\mu(x) d\nu(y) \le C_1^p C_2^p \|f\|_p^p.$$
Hence $\int_Y |K(x, y) f(y)| d\nu(y) \in L^p(X, d\mu)$ and in particular it is finite for $\mu$-almost every $x$. Thus $K(x, .) f(.)$ is integrable for $\mu$-almost every $x$ and $\int_Y K(x, y) f(y) d\nu(y)$ is measurable.
It even turns out that $L^p$ is separable.

Lemma 0.33. Suppose $X$ is a second countable topological space (i.e., it has a countable basis) and $\mu$ is a regular Borel measure. Then $L^p(X, d\mu)$, $1 \le p < \infty$, is separable.

Proof. The set of all characteristic functions $\chi_A(x)$ with $A \in \Sigma$ and $\mu(A) < \infty$ is total by construction of the integral. Now our strategy is as follows: Using outer regularity, we can restrict $A$ to open sets and using the existence of a countable base, we can restrict $A$ to open sets from this base.

Fix $A$. By outer regularity, there is a decreasing sequence of open sets $O_n$ such that $\mu(O_n) \to \mu(A)$. Since $\mu(A) < \infty$, it is no restriction to assume $\mu(O_n) < \infty$, and thus $\mu(O_n \setminus A) = \mu(O_n) - \mu(A) \to 0$. Now dominated convergence implies $\|\chi_A - \chi_{O_n}\|_p \to 0$. Thus the set of all characteristic functions $\chi_O(x)$ with $O$ open and $\mu(O) < \infty$ is total. Finally let $\mathcal{B}$ be a countable basis for the topology. Then, every open set $O$ can be written as $O = \bigcup_{j=1}^\infty \tilde{O}_j$ with $\tilde{O}_j \in \mathcal{B}$. Moreover, by considering the set of all finite unions of elements from $\mathcal{B}$, it is no restriction to assume $\bigcup_{j=1}^n \tilde{O}_j \in \mathcal{B}$. Hence there is an increasing sequence $\tilde{O}_n \nearrow O$ with $\tilde{O}_n \in \mathcal{B}$. By monotone convergence, $\|\chi_O - \chi_{\tilde{O}_n}\|_p \to 0$ and hence the set of all characteristic functions $\chi_{\tilde{O}}$ with $\tilde{O} \in \mathcal{B}$ is total.
To finish this chapter, let us show that continuous functions are dense in $L^p$.

Theorem 0.34. Let $X$ be a locally compact metric space and let $\mu$ be a $\sigma$-finite regular Borel measure. Then the set $C_c(X)$ of continuous functions with compact support is dense in $L^p(X, d\mu)$, $1 \le p < \infty$.

Proof. As in the previous proof the set of all characteristic functions $\chi_K(x)$ with $K$ compact is total (using inner regularity). Hence it suffices to show that $\chi_K(x)$ can be approximated by continuous functions. By outer regularity there is an open set $O \supseteq K$ such that $\mu(O \setminus K) \le \varepsilon$. By Urysohn's lemma (Lemma 0.12) there is a continuous function $f_\varepsilon$ which is $1$ on $K$ and $0$ outside $O$. Since
$$\int_X |\chi_K - f_\varepsilon|^p d\mu = \int_{O \setminus K} |f_\varepsilon|^p d\mu \le \mu(O \setminus K) \le \varepsilon,$$
we have $\|f_\varepsilon - \chi_K\|_p \to 0$ and we are done.
If $X$ is some subset of $\mathbb{R}^n$, we can do even better. A nonnegative function $u \in C_c^\infty(\mathbb{R}^n)$ is called a mollifier if
$$\int_{\mathbb{R}^n} u(x) dx = 1. \tag{0.75}$$
The standard mollifier is $u(x) = \exp(\frac{1}{|x|^2 - 1})$ for $|x| < 1$ and $u(x) = 0$ otherwise.

If we scale a mollifier according to $u_k(x) = k^n u(k x)$ such that its mass is preserved ($\|u_k\|_1 = 1$) and it concentrates more and more around the origin, we have the following result (Problem 0.29):

Lemma 0.35. Let $u$ be a mollifier in $\mathbb{R}^n$ and set $u_k(x) = k^n u(k x)$. Then for any (uniformly) continuous function $f : \mathbb{R}^n \to \mathbb{C}$ we have that
$$f_k(x) = \int_{\mathbb{R}^n} u_k(x - y) f(y) dy \tag{0.76}$$
is in $C^\infty(\mathbb{R}^n)$ and converges to $f$ (uniformly).
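A one-dimensional numerical sketch of the lemma, with the standard mollifier discretized by Riemann sums; the test function $f(x) = |\sin x|$ (continuous but not smooth at $0$) and the grid sizes are illustrative choices.

```python
# Mollifying a continuous function with the scaled standard mollifier
# u_k(x) = k u(kx): the uniform error shrinks as k grows.
import math

def u(x):
    # standard mollifier (up to normalization), supported in (-1, 1)
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1 else 0.0

# normalize so that the (discretized) integral of u is 1
h = 1e-3
grid = [i * h for i in range(-1000, 1001)]
Z = sum(u(t) * h for t in grid)

def mollify(f, k, x):
    # f_k(x) = int u_k(x - y) f(y) dy; substituting t = k(y - x) gives
    # f_k(x) = int u(t) f(x + t/k) dt (u is even), discretized below
    return sum(u(t) / Z * f(x + t / k) * h for t in grid)

f = lambda x: abs(math.sin(x))  # continuous but not smooth at 0
xs = [j / 20 - 1 for j in range(41)]  # sample points in [-1, 1]
errs = [max(abs(mollify(f, k, x) - f(x)) for x in xs) for k in (4, 32)]
```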
Now we are ready to prove

Theorem 0.36. If $X \subseteq \mathbb{R}^n$ and $\mu$ is a regular Borel measure, then the set $C_c^\infty(X)$ of all smooth functions with compact support is dense in $L^p(X, d\mu)$, $1 \le p < \infty$.

Proof. By our previous result it suffices to show that any continuous function $f(x)$ with compact support can be approximated by smooth ones. By setting $f(x) = 0$ for $x \notin X$, it is no restriction to assume $X = \mathbb{R}^n$. Now choose a mollifier $u$ and observe that $f_k$ has compact support (since $f$ has). Moreover, since $f$ has compact support, it is uniformly continuous and $f_k \to f$ uniformly. But this implies $f_k \to f$ in $L^p$.
We say that $f \in L^p_{loc}(X)$ if $f \in L^p(K)$ for any compact subset $K \subset X$.

Lemma 0.37. Suppose $f \in L^1_{loc}(\mathbb{R}^n)$. Then
$$\int_{\mathbb{R}^n} \varphi(x) f(x) dx = 0, \qquad \varphi \in C_c^\infty(\mathbb{R}^n), \tag{0.77}$$
if and only if $f(x) = 0$ (a.e.).

Proof. First of all we claim that for any bounded function $g$ with compact support $K$, there is a sequence of functions $\varphi_n \in C_c^\infty(\mathbb{R}^n)$ with support in $K$ which converges pointwise to $g$ such that $\|\varphi_n\|_\infty \le \|g\|_\infty$.

To see this, take a sequence of continuous functions $\varphi_n$ with support in $K$ which converges to $g$ in $L^1$. To make sure that $\|\varphi_n\|_\infty \le \|g\|_\infty$, just set it equal to $\|g\|_\infty$ whenever $\varphi_n > \|g\|_\infty$ and equal to $-\|g\|_\infty$ whenever $\varphi_n < -\|g\|_\infty$ (show that the resulting sequence still converges). Finally use (0.76) to make $\varphi_n$ smooth (note that this operation does not change the range) and extract a pointwise convergent subsequence.

Now let $K$ be some compact set and choose $g = \operatorname{sign}(f)^* \chi_K$. Then
$$\int_K |f| dx = \int_K f \operatorname{sign}(f)^* dx = \lim_{n\to\infty} \int f \varphi_n dx = 0,$$
which shows $f = 0$ for a.e. $x \in K$. Since $K$ is arbitrary, we are done.
Problem 0.24. Suppose $\mu(X) < \infty$. Show that
$$\lim_{p\to\infty} \|f\|_p = \|f\|_\infty$$
for any bounded measurable function.

Problem 0.25. Prove (0.71). (Hint: Take logarithms on both sides.)

Problem 0.26. Show the following generalization of Hölder's inequality:
$$\|f g\|_r \le \|f\|_p \|g\|_q, \tag{0.78}$$
where $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$.

Problem 0.27 (Lyapunov inequality). Let $0 < \theta < 1$. Show that if $f \in L^{p_1} \cap L^{p_2}$, then $f \in L^p$ and
$$\|f\|_p \le \|f\|_{p_1}^{\theta} \|f\|_{p_2}^{1-\theta}, \tag{0.79}$$
where $\frac{1}{p} = \frac{\theta}{p_1} + \frac{1-\theta}{p_2}$.
Problem 0.28. Find a sequence $f_n$ which converges to $0$ in $L^p([0,1], dx)$ but for which $f_n(x) \to 0$ for a.e. $x \in [0,1]$ does not hold. (Hint: Every $n \in \mathbb{N}$ can be uniquely written as $n = 2^m + k$ with $0 \le m$ and $0 \le k < 2^m$. Now consider the characteristic functions of the intervals $I_{m,k} = [k 2^{-m}, (k+1) 2^{-m}]$.)

Problem 0.29. Prove Lemma 0.35. (Hint: To show that $f_k$ is smooth, use Problems A.7 and A.8.)

Problem 0.30. Construct a function $f \in L^p(0,1)$ which has a singularity at every rational number in $[0,1]$. (Hint: Start with the function $f_0(x) = |x|^{-\alpha}$ which has a single pole at $0$. Then $f_j(x) = f_0(x - x_j)$ has a pole at $x_j$.)
0.7. Appendix: The uniform boundedness principle
Recall that the interior of a set is the largest open subset (that is, the union
of all open subsets). A set is called nowhere dense if its closure has empty
interior. The key to several important theorems about Banach spaces is the
observation that a Banach space cannot be the countable union of nowhere
dense sets.
Theorem 0.38 (Baire category theorem). Let X be a complete metric space.
Then X cannot be the countable union of nowhere dense sets.
Proof. Suppose $X = \bigcup_{n=1}^\infty X_n$. We can assume that the sets $X_n$ are closed and none of them contains a ball; that is, $X \setminus X_n$ is open and nonempty for every $n$. We will construct a Cauchy sequence $x_n$ which stays away from all $X_n$.

Since $X \setminus X_1$ is open and nonempty, there is a closed ball $B_{r_1}(x_1) \subseteq X \setminus X_1$. Reducing $r_1$ a little, we can even assume $\overline{B_{r_1}(x_1)} \subseteq X \setminus X_1$. Moreover, since $X_2$ cannot contain $B_{r_1}(x_1)$, there is some $x_2 \in B_{r_1}(x_1)$ that is not in $X_2$. Since $B_{r_1}(x_1) \cap (X \setminus X_2)$ is open, there is a closed ball $\overline{B_{r_2}(x_2)} \subseteq B_{r_1}(x_1) \cap (X \setminus X_2)$. Proceeding by induction, we obtain a sequence of balls such that
$$\overline{B_{r_n}(x_n)} \subseteq B_{r_{n-1}}(x_{n-1}) \cap (X \setminus X_n).$$
Now observe that in every step we can choose $r_n$ as small as we please; hence without loss of generality $r_n \to 0$. Since by construction $x_n \in B_{r_N}(x_N)$ for $n \ge N$, we conclude that $x_n$ is Cauchy and converges to some point $x \in X$. But $x \in \overline{B_{r_n}(x_n)} \subseteq X \setminus X_n$ for every $n$, contradicting our assumption that the $X_n$ cover $X$. □

(Sets which can be written as the countable union of nowhere dense sets are said to be of first category. All other sets are second category. Hence we have the name category theorem.)
In other words, if $X_n \subseteq X$ is a sequence of closed subsets which cover $X$, at least one $X_n$ contains a ball of radius $\varepsilon > 0$.
Now we come to the first important consequence, the uniform boundedness principle.

Theorem 0.39 (Banach–Steinhaus). Let $X$ be a Banach space and $Y$ some normed linear space. Let $\{A_\alpha\} \subseteq L(X, Y)$ be a family of bounded operators. Suppose $\|A_\alpha x\| \le C(x)$ is bounded for fixed $x \in X$. Then $\|A_\alpha\| \le C$ is uniformly bounded.
Proof. Let
$$X_n = \{x \mid \|A_\alpha x\| \le n \text{ for all } \alpha\} = \bigcap_\alpha \{x \mid \|A_\alpha x\| \le n\}.$$
Then $\bigcup_n X_n = X$ by assumption. Moreover, by continuity of $A_\alpha$ and the norm, each $X_n$ is an intersection of closed sets and hence closed. By Baire's theorem at least one contains a ball of positive radius: $B_\varepsilon(x_0) \subseteq X_n$. Now observe
$$\|A_\alpha y\| \le \|A_\alpha (y + x_0)\| + \|A_\alpha x_0\| \le n + \|A_\alpha x_0\|$$
for $\|y\| < \varepsilon$. Setting $y = \varepsilon \frac{x}{\|x\|}$, we obtain
$$\|A_\alpha x\| \le \frac{n + C(x_0)}{\varepsilon} \|x\|$$
for any $x$. □
Part 1
Mathematical Foundations of Quantum Mechanics

Chapter 1
Hilbert spaces
The phase space in classical mechanics is the Euclidean space $\mathbb{R}^{2n}$ (for the $n$ position and $n$ momentum coordinates). In quantum mechanics the phase space is always a Hilbert space $H$. Hence the geometry of Hilbert spaces stands at the outset of our investigations.
1.1. Hilbert spaces
Suppose $H$ is a vector space. A map $\langle \cdot, \cdot \rangle : H \times H \to \mathbb{C}$ is called a sesquilinear form if it is conjugate linear in the first argument and linear in the second. A positive definite sesquilinear form is called an inner product or scalar product. Associated with every scalar product is a norm
$$\|\psi\| = \sqrt{\langle \psi, \psi \rangle}. \tag{1.1}$$
The triangle inequality follows from the Cauchy–Schwarz–Bunjakowski inequality:
$$|\langle \psi, \varphi \rangle| \le \|\psi\|\,\|\varphi\| \tag{1.2}$$
with equality if and only if $\psi$ and $\varphi$ are parallel.
If H is complete with respect to the above norm, it is called a Hilbert
space. It is no restriction to assume that H is complete since one can easily
replace it by its completion.
Example. The space $L^2(M, d\mu)$ is a Hilbert space with scalar product given by
$$\langle f, g \rangle = \int_M f(x)^* g(x)\,d\mu(x). \tag{1.3}$$
Similarly, the set of all square summable sequences $\ell^2(\mathbb{N})$ is a Hilbert space with scalar product
$$\langle f, g \rangle = \sum_{j \in \mathbb{N}} f_j^* g_j. \tag{1.4}$$
(Note that the second example is a special case of the first one; take $M = \mathbb{R}$ and $\mu$ a sum of Dirac measures.) ⋄
A vector $\psi \in H$ is called normalized or a unit vector if $\|\psi\| = 1$. Two vectors $\psi, \varphi \in H$ are called orthogonal or perpendicular ($\psi \perp \varphi$) if $\langle \psi, \varphi \rangle = 0$ and parallel if one is a multiple of the other.

If $\psi$ and $\varphi$ are orthogonal, we have the Pythagorean theorem:
$$\|\psi + \varphi\|^2 = \|\psi\|^2 + \|\varphi\|^2, \qquad \psi \perp \varphi, \tag{1.5}$$
which is one line of computation.

Suppose $\varphi$ is a unit vector. Then the projection of $\psi$ in the direction of $\varphi$ is given by
$$\psi_\parallel = \langle \varphi, \psi \rangle \varphi \tag{1.6}$$
and $\psi_\perp$ defined via
$$\psi_\perp = \psi - \langle \varphi, \psi \rangle \varphi \tag{1.7}$$
is perpendicular to $\varphi$.

These results can also be generalized to more than one vector. A set of vectors $\{\varphi_j\}$ is called an orthonormal set (ONS) if $\langle \varphi_j, \varphi_k \rangle = 0$ for $j \ne k$ and $\langle \varphi_j, \varphi_j \rangle = 1$.
Lemma 1.1. Suppose $\{\varphi_j\}_{j=0}^n$ is an orthonormal set. Then every $\psi \in H$ can be written as
$$\psi = \psi_\parallel + \psi_\perp, \qquad \psi_\parallel = \sum_{j=0}^n \langle \varphi_j, \psi \rangle \varphi_j, \tag{1.8}$$
where $\psi_\parallel$ and $\psi_\perp$ are orthogonal. Moreover, $\langle \varphi_j, \psi_\perp \rangle = 0$ for all $0 \le j \le n$. In particular,
$$\|\psi\|^2 = \sum_{j=0}^n |\langle \varphi_j, \psi \rangle|^2 + \|\psi_\perp\|^2. \tag{1.9}$$
Moreover, every $\hat\psi$ in the span of $\{\varphi_j\}_{j=0}^n$ satisfies
$$\|\psi - \hat\psi\| \ge \|\psi_\perp\| \tag{1.10}$$
with equality holding if and only if $\hat\psi = \psi_\parallel$. In other words, $\psi_\parallel$ is uniquely characterized as the vector in the span of $\{\varphi_j\}_{j=0}^n$ closest to $\psi$.
Proof. A straightforward calculation shows $\langle \varphi_j, \psi - \psi_\parallel \rangle = 0$ and hence $\psi_\parallel$ and $\psi_\perp = \psi - \psi_\parallel$ are orthogonal. The formula for the norm follows by applying (1.5) iteratively.

Now, fix a vector
$$\hat\psi = \sum_{j=0}^n c_j \varphi_j$$
in the span of $\{\varphi_j\}_{j=0}^n$. Then one computes
$$\|\psi - \hat\psi\|^2 = \|\psi_\parallel + \psi_\perp - \hat\psi\|^2 = \|\psi_\perp\|^2 + \|\psi_\parallel - \hat\psi\|^2 = \|\psi_\perp\|^2 + \sum_{j=0}^n |c_j - \langle \varphi_j, \psi \rangle|^2,$$
from which the last claim follows. □
From (1.9) we obtain Bessel's inequality
$$\sum_{j=0}^n |\langle \varphi_j, \psi \rangle|^2 \le \|\psi\|^2 \tag{1.11}$$
with equality holding if and only if $\psi$ lies in the span of $\{\varphi_j\}_{j=0}^n$.
Recall that a scalar product can be recovered from its norm by virtue of the polarization identity
$$\langle \varphi, \psi \rangle = \frac{1}{4} \left( \|\varphi + \psi\|^2 - \|\varphi - \psi\|^2 + \mathrm{i}\|\varphi - \mathrm{i}\psi\|^2 - \mathrm{i}\|\varphi + \mathrm{i}\psi\|^2 \right). \tag{1.12}$$
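The signs in (1.12) depend on the convention that the scalar product is conjugate linear in the first argument. A quick numerical sanity check of the identity under that convention (an illustrative sketch only; `np.vdot` conjugates its first argument, matching the convention used here):

```python
import numpy as np

rng = np.random.default_rng(0)
phi = rng.standard_normal(5) + 1j * rng.standard_normal(5)
psi = rng.standard_normal(5) + 1j * rng.standard_normal(5)

def inner(a, b):
    # scalar product, conjugate linear in the first argument
    return np.vdot(a, b)

def norm2(a):
    return inner(a, a).real

lhs = inner(phi, psi)
rhs = 0.25 * (norm2(phi + psi) - norm2(phi - psi)
              + 1j * norm2(phi - 1j * psi) - 1j * norm2(phi + 1j * psi))
assert np.allclose(lhs, rhs)  # (1.12) holds up to roundoff
```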
A bijective linear operator $U \in L(H_1, H_2)$ is called unitary if $U$ preserves scalar products:
$$\langle U\varphi, U\psi \rangle_2 = \langle \varphi, \psi \rangle_1, \qquad \varphi, \psi \in H_1. \tag{1.13}$$
By the polarization identity this is the case if and only if $U$ preserves norms: $\|U\psi\|_2 = \|\psi\|_1$ for all $\psi \in H_1$. The two Hilbert spaces $H_1$ and $H_2$ are called unitarily equivalent in this case.
Problem 1.1. The operator
$$S : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N}), \qquad (a_1, a_2, a_3, \dots) \mapsto (0, a_1, a_2, \dots)$$
satisfies $\|Sa\| = \|a\|$. Is it unitary?
1.2. Orthonormal bases
Of course, since we cannot assume $H$ to be a finite dimensional vector space, we need to generalize Lemma 1.1 to arbitrary orthonormal sets $\{\varphi_j\}_{j \in J}$. We start by assuming that $J$ is countable. Then Bessel's inequality (1.11) shows that
$$\sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2 \tag{1.14}$$
converges absolutely. Moreover, for any finite subset $K \subset J$ we have
$$\Big\| \sum_{j \in K} \langle \varphi_j, \psi \rangle \varphi_j \Big\|^2 = \sum_{j \in K} |\langle \varphi_j, \psi \rangle|^2 \tag{1.15}$$
by the Pythagorean theorem and thus $\sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j$ is Cauchy if and only if $\sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2$ is. Now let $J$ be arbitrary. Again, Bessel's inequality shows that for any given $\varepsilon > 0$ there are at most finitely many $j$ for which $|\langle \varphi_j, \psi \rangle| \ge \varepsilon$. Hence there are at most countably many $j$ for which $|\langle \varphi_j, \psi \rangle| > 0$. Thus it follows that
$$\sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2 \tag{1.16}$$
is well-defined and so is
$$\sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j. \tag{1.17}$$
In particular, by continuity of the scalar product we see that Lemma 1.1 holds for arbitrary orthonormal sets without modifications.
Theorem 1.2. Suppose $\{\varphi_j\}_{j \in J}$ is an orthonormal set. Then every $\psi \in H$ can be written as
$$\psi = \psi_\parallel + \psi_\perp, \qquad \psi_\parallel = \sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j, \tag{1.18}$$
where $\psi_\parallel$ and $\psi_\perp$ are orthogonal. Moreover, $\langle \varphi_j, \psi_\perp \rangle = 0$ for all $j \in J$. In particular,
$$\|\psi\|^2 = \sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2 + \|\psi_\perp\|^2. \tag{1.19}$$
Moreover, every $\hat\psi$ in the span of $\{\varphi_j\}_{j \in J}$ satisfies
$$\|\psi - \hat\psi\| \ge \|\psi_\perp\| \tag{1.20}$$
with equality holding if and only if $\hat\psi = \psi_\parallel$. In other words, $\psi_\parallel$ is uniquely characterized as the vector in the span of $\{\varphi_j\}_{j \in J}$ closest to $\psi$.

Note that from Bessel's inequality (which of course still holds) it follows that the map $\psi \mapsto \psi_\parallel$ is continuous.

An orthonormal set which is not a proper subset of any other orthonormal set is called an orthonormal basis (ONB) due to the following result:
Theorem 1.3. For an orthonormal set $\{\varphi_j\}_{j \in J}$ the following conditions are equivalent:

(i) $\{\varphi_j\}_{j \in J}$ is a maximal orthonormal set.
(ii) For every vector $\psi \in H$ we have
$$\psi = \sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j. \tag{1.21}$$
(iii) For every vector $\psi \in H$ we have
$$\|\psi\|^2 = \sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2. \tag{1.22}$$
(iv) $\langle \varphi_j, \psi \rangle = 0$ for all $j \in J$ implies $\psi = 0$.

Proof. We will use the notation from Theorem 1.2.

(i) ⇒ (ii): If $\psi_\perp \ne 0$, then we can normalize $\psi_\perp$ to obtain a unit vector $\tilde\psi_\perp$ which is orthogonal to all vectors $\varphi_j$. But then $\{\varphi_j\}_{j \in J} \cup \{\tilde\psi_\perp\}$ would be a larger orthonormal set, contradicting the maximality of $\{\varphi_j\}_{j \in J}$.

(ii) ⇒ (iii): This follows since (ii) implies $\psi_\perp = 0$.

(iii) ⇒ (iv): If $\langle \varphi_j, \psi \rangle = 0$ for all $j \in J$, we conclude $\|\psi\|^2 = 0$ and hence $\psi = 0$.

(iv) ⇒ (i): If $\{\varphi_j\}_{j \in J}$ were not maximal, there would be a unit vector $\psi$ such that $\{\varphi_j\}_{j \in J} \cup \{\psi\}$ is a larger orthonormal set. But $\langle \varphi_j, \psi \rangle = 0$ for all $j \in J$ implies $\psi = 0$ by (iv), a contradiction. □

Since $\psi \mapsto \psi_\parallel$ is continuous, it suffices to check conditions (ii) and (iii) on a dense set.
Example. The set of functions
$$\varphi_n(x) = \frac{1}{\sqrt{2\pi}}\, \mathrm{e}^{\mathrm{i} n x}, \qquad n \in \mathbb{Z}, \tag{1.23}$$
forms an orthonormal basis for $H = L^2(0, 2\pi)$. The corresponding orthogonal expansion is just the ordinary Fourier series (Problem 1.20). ⋄
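The orthonormality in (1.23) can be checked numerically; the sketch below (an illustration, not part of the text) uses the fact that on a uniform periodic grid the Riemann sum for $\int_0^{2\pi} \mathrm{e}^{\mathrm{i}(n-m)x}\,dx$ is exact (up to roundoff) for the small frequencies sampled here:

```python
import numpy as np

# sample phi_n(x) = e^{inx}/sqrt(2*pi) on a uniform grid of (0, 2*pi)
N = 512
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dx = 2.0 * np.pi / N

def phi(n):
    return np.exp(1j * n * x) / np.sqrt(2.0 * np.pi)

# Gram matrix of <phi_m, phi_n> for |m|, |n| <= 3, via the Riemann sum
gram = np.array([[np.vdot(phi(m), phi(n)) * dx for n in range(-3, 4)]
                 for m in range(-3, 4)])
assert np.allclose(gram, np.eye(7))  # orthonormal set, as claimed
```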
A Hilbert space is separable if and only if there is a countable orthonormal basis. In fact, if $H$ is separable, then there exists a countable total set $\{\psi_j\}_{j=0}^N$. Here $N \in \mathbb{N}$ if $H$ is finite dimensional and $N = \infty$ otherwise. After throwing away some vectors, we can assume that $\psi_{n+1}$ cannot be expressed as a linear combination of the vectors $\psi_0, \dots, \psi_n$. Now we can construct an orthonormal basis as follows: We begin by normalizing $\psi_0$,
$$\varphi_0 = \frac{\psi_0}{\|\psi_0\|}. \tag{1.24}$$
Next we take $\psi_1$ and remove the component parallel to $\varphi_0$ and normalize again:
$$\varphi_1 = \frac{\psi_1 - \langle \varphi_0, \psi_1 \rangle \varphi_0}{\|\psi_1 - \langle \varphi_0, \psi_1 \rangle \varphi_0\|}. \tag{1.25}$$
Proceeding like this, we define recursively
$$\varphi_n = \frac{\psi_n - \sum_{j=0}^{n-1} \langle \varphi_j, \psi_n \rangle \varphi_j}{\|\psi_n - \sum_{j=0}^{n-1} \langle \varphi_j, \psi_n \rangle \varphi_j\|}. \tag{1.26}$$
This procedure is known as Gram–Schmidt orthogonalization. Hence we obtain an orthonormal set $\{\varphi_j\}_{j=0}^N$ such that $\mathrm{span}\{\varphi_j\}_{j=0}^n = \mathrm{span}\{\psi_j\}_{j=0}^n$ for any finite $n$ and thus also for $N$ (if $N = \infty$). Since $\{\psi_j\}_{j=0}^N$ is total, so is $\{\varphi_j\}_{j=0}^N$. Now suppose there is some $\psi = \psi_\parallel + \psi_\perp \in H$ for which $\psi_\perp \ne 0$. Since $\{\varphi_j\}_{j=0}^N$ is total, we can find a $\hat\psi$ in its span such that $\|\psi - \hat\psi\| < \|\psi_\perp\|$, contradicting (1.20). Hence we infer that $\{\varphi_j\}_{j=0}^N$ is an orthonormal basis.
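The recursion (1.24)–(1.26) translates directly into code. The following NumPy sketch (illustrative only; the random vectors in $\mathbb{C}^4$ stand in for the total set $\{\psi_j\}$) implements it verbatim and checks that the output is an orthonormal set:

```python
import numpy as np

def gram_schmidt(psis):
    """Orthonormalize linearly independent vectors via (1.24)-(1.26)."""
    phis = []
    for psi in psis:
        # subtract the components parallel to the previously built phi_j
        v = psi - sum(np.vdot(phi, psi) * phi for phi in phis)
        phis.append(v / np.linalg.norm(v))
    return phis

rng = np.random.default_rng(1)
psis = [rng.standard_normal(4) + 1j * rng.standard_normal(4) for _ in range(3)]
phis = gram_schmidt(psis)

# the result satisfies <phi_j, phi_k> = delta_{jk}
gram = np.array([[np.vdot(p, q) for q in phis] for p in phis])
assert np.allclose(gram, np.eye(3))
```

In floating point this "classical" variant can lose orthogonality for nearly dependent inputs; numerical libraries prefer a modified variant or a QR factorization, but the mathematics is the same.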
Theorem 1.4. Every separable Hilbert space has a countable orthonormal
basis.
Example. In $L^2(-1, 1)$ we can orthogonalize the polynomials $f_n(x) = x^n$. The resulting polynomials are up to a normalization equal to the Legendre polynomials
$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \frac{3 x^2 - 1}{2}, \quad \dots \tag{1.27}$$
(which are normalized such that $P_n(1) = 1$). ⋄
In fact, if there is one countable basis, then it follows that any other basis is countable as well.
Theorem 1.5. If $H$ is separable, then every orthonormal basis is countable.

Proof. We know that there is at least one countable orthonormal basis $\{\varphi_j\}_{j \in J}$. Now let $\{\phi_k\}_{k \in K}$ be a second basis and consider the set $K_j = \{k \in K \mid \langle \phi_k, \varphi_j \rangle \ne 0\}$. Since these are the expansion coefficients of $\varphi_j$ with respect to $\{\phi_k\}_{k \in K}$, this set is countable. Hence the set $\tilde K = \bigcup_{j \in J} K_j$ is countable as well. But $k \in K \setminus \tilde K$ implies $\phi_k = 0$ and hence $\tilde K = K$. □
We will assume all Hilbert spaces to be separable.

In particular, it can be shown that $L^2(M, d\mu)$ is separable. Moreover, it turns out that, up to unitary equivalence, there is only one (separable) infinite dimensional Hilbert space:

Let $H$ be an infinite dimensional Hilbert space and let $\{\varphi_j\}_{j \in \mathbb{N}}$ be any orthonormal basis. Then the map $U : H \to \ell^2(\mathbb{N})$, $\psi \mapsto (\langle \varphi_j, \psi \rangle)_{j \in \mathbb{N}}$ is unitary (by Theorem 1.3 (iii)). In particular,

Theorem 1.6. Any separable infinite dimensional Hilbert space is unitarily equivalent to $\ell^2(\mathbb{N})$.
Let me remark that if $H$ is not separable, there still exists an orthonormal basis. However, the proof requires Zorn's lemma: The collection of all orthonormal sets in $H$ can be partially ordered by inclusion. Moreover, any linearly ordered chain has an upper bound (the union of all sets in the chain). Hence Zorn's lemma implies the existence of a maximal element, that is, an orthonormal basis.
Problem 1.2. Let $\{\varphi_j\}$ be some orthonormal basis. Show that a bounded linear operator $A$ is uniquely determined by its matrix elements $A_{jk} = \langle \varphi_j, A \varphi_k \rangle$ with respect to this basis.

Problem 1.3. Show that $L(H)$ is not separable if $H$ is infinite dimensional.
1.3. The projection theorem and the Riesz lemma
Let $M \subseteq H$ be a subset. Then $M^\perp = \{\psi \mid \langle \varphi, \psi \rangle = 0,\ \forall \varphi \in M\}$ is called the orthogonal complement of $M$. By continuity of the scalar product it follows that $M^\perp$ is a closed linear subspace and by linearity that $(\mathrm{span}(M))^\perp = M^\perp$. For example we have $H^\perp = \{0\}$ since any vector in $H^\perp$ must be in particular orthogonal to all vectors in some orthonormal basis.

Theorem 1.7 (Projection theorem). Let $M$ be a closed linear subspace of a Hilbert space $H$. Then every $\psi \in H$ can be uniquely written as $\psi = \psi_\parallel + \psi_\perp$ with $\psi_\parallel \in M$ and $\psi_\perp \in M^\perp$. One writes
$$M \oplus M^\perp = H \tag{1.28}$$
in this situation.

Proof. Since $M$ is closed, it is a Hilbert space and has an orthonormal basis $\{\varphi_j\}_{j \in J}$. Hence the result follows from Theorem 1.2. □
In other words, to every $\psi \in H$ we can assign a unique vector $\psi_\parallel$ which is the vector in $M$ closest to $\psi$. The rest, $\psi - \psi_\parallel$, lies in $M^\perp$. The operator $P_M \psi = \psi_\parallel$ is called the orthogonal projection corresponding to $M$. Note that we have
$$P_M^2 = P_M \quad \text{and} \quad \langle P_M \psi, \varphi \rangle = \langle \psi, P_M \varphi \rangle \tag{1.29}$$
since $\langle P_M \psi, \varphi \rangle = \langle \psi_\parallel, \varphi_\parallel \rangle = \langle \psi, P_M \varphi \rangle$. Clearly we have $P_{M^\perp} \psi = \psi - P_M \psi = \psi_\perp$. Furthermore, (1.29) uniquely characterizes orthogonal projections (Problem 1.6).

Moreover, we see that the vectors in a closed subspace $M$ are precisely those which are orthogonal to all vectors in $M^\perp$; that is, $M^{\perp\perp} = M$. If $M$ is an arbitrary subset, we have at least
$$M^{\perp\perp} = \overline{\mathrm{span}(M)}. \tag{1.30}$$
Note that by $H^\perp = \{0\}$ we see that $M^\perp = \{0\}$ if and only if $M$ is total.
Finally we turn to linear functionals, that is, to operators $\ell : H \to \mathbb{C}$. By the Cauchy–Schwarz inequality we know that $\ell_\varphi : \psi \mapsto \langle \varphi, \psi \rangle$ is a bounded linear functional (with norm $\|\varphi\|$). It turns out that in a Hilbert space every bounded linear functional can be written in this way.

Theorem 1.8 (Riesz lemma). Suppose $\ell$ is a bounded linear functional on a Hilbert space $H$. Then there is a unique vector $\varphi \in H$ such that $\ell(\psi) = \langle \varphi, \psi \rangle$ for all $\psi \in H$. In other words, a Hilbert space is equivalent to its own dual space $H^* = H$.

Proof. If $\ell \equiv 0$, we can choose $\varphi = 0$. Otherwise $\mathrm{Ker}(\ell) = \{\psi \mid \ell(\psi) = 0\}$ is a proper subspace and we can find a unit vector $\tilde\varphi \in \mathrm{Ker}(\ell)^\perp$. For every $\psi \in H$ we have $\ell(\psi)\tilde\varphi - \ell(\tilde\varphi)\psi \in \mathrm{Ker}(\ell)$ and hence
$$0 = \langle \tilde\varphi, \ell(\psi)\tilde\varphi - \ell(\tilde\varphi)\psi \rangle = \ell(\psi) - \ell(\tilde\varphi) \langle \tilde\varphi, \psi \rangle.$$
In other words, we can choose $\varphi = \ell(\tilde\varphi)^* \tilde\varphi$. To see uniqueness, let $\varphi_1, \varphi_2$ be two such vectors. Then
$$\langle \varphi_1 - \varphi_2, \psi \rangle = \langle \varphi_1, \psi \rangle - \langle \varphi_2, \psi \rangle = \ell(\psi) - \ell(\psi) = 0$$
for any $\psi \in H$, which shows $\varphi_1 - \varphi_2 \in H^\perp = \{0\}$. □
The following easy consequence is left as an exercise.

Corollary 1.9. Suppose $s$ is a bounded sesquilinear form; that is,
$$|s(\psi, \varphi)| \le C \|\psi\|\,\|\varphi\|. \tag{1.31}$$
Then there is a unique bounded operator $A$ such that
$$s(\psi, \varphi) = \langle A\psi, \varphi \rangle. \tag{1.32}$$
Moreover, $\|A\| \le C$.

Note that by the polarization identity (Problem 0.14), $A$ is already uniquely determined by its quadratic form $q_A(\psi) = \langle \psi, A\psi \rangle$.
Problem 1.4. Suppose $U : H \to H$ is unitary and $M \subseteq H$. Show that $U M^\perp = (U M)^\perp$.

Problem 1.5. Show that an orthogonal projection $P_M \ne 0$ has norm one.

Problem 1.6. Suppose $P \in L(H)$ satisfies
$$P^2 = P \quad \text{and} \quad \langle P\psi, \varphi \rangle = \langle \psi, P\varphi \rangle$$
and set $M = \mathrm{Ran}(P)$. Show

• $P\psi = \psi$ for $\psi \in M$ and $M$ is closed,
• $\varphi \in M^\perp$ implies $P\varphi \in M^\perp$ and thus $P\varphi = 0$,

and conclude $P = P_M$.
Problem 1.7. Let $P_1, P_2$ be two orthogonal projections. Show that $P_1 \le P_2$ (that is, $\langle \psi, P_1 \psi \rangle \le \langle \psi, P_2 \psi \rangle$) if and only if $\mathrm{Ran}(P_1) \subseteq \mathrm{Ran}(P_2)$. Show in this case that the two projections commute (that is, $P_1 P_2 = P_2 P_1$) and that $P_2 - P_1$ is also a projection. (Hints: $\|P_j \psi\| = \|\psi\|$ if and only if $P_j \psi = \psi$ and $\mathrm{Ran}(P_1) \subseteq \mathrm{Ran}(P_2)$ if and only if $P_2 P_1 = P_1$.)

Problem 1.8. Show that $P : L^2(\mathbb{R}) \to L^2(\mathbb{R})$, $f(x) \mapsto \frac{1}{2}(f(x) + f(-x))$ is a projection. Compute its range and kernel.

Problem 1.9. Prove Corollary 1.9.

Problem 1.10. Consider the sesquilinear form
$$B(f, g) = \int_0^1 \left( \int_0^x f(t)^*\,dt \right) \left( \int_0^x g(t)\,dt \right) dx$$
in $L^2(0, 1)$. Show that it is bounded and find the corresponding operator $A$. (Hint: Partial integration.)
1.4. Orthogonal sums and tensor products
Given two Hilbert spaces $H_1$ and $H_2$, we define their orthogonal sum $H_1 \oplus H_2$ to be the set of all pairs $(\psi_1, \psi_2) \in H_1 \times H_2$ together with the scalar product
$$\langle (\varphi_1, \varphi_2), (\psi_1, \psi_2) \rangle = \langle \varphi_1, \psi_1 \rangle_1 + \langle \varphi_2, \psi_2 \rangle_2. \tag{1.33}$$
It is left as an exercise to verify that $H_1 \oplus H_2$ is again a Hilbert space. Moreover, $H_1$ can be identified with $\{(\psi_1, 0) \mid \psi_1 \in H_1\}$ and we can regard $H_1$ as a subspace of $H_1 \oplus H_2$, and similarly for $H_2$. It is also customary to write $\psi_1 + \psi_2$ instead of $(\psi_1, \psi_2)$.

More generally, let $H_j$, $j \in \mathbb{N}$, be a countable collection of Hilbert spaces and define
$$\bigoplus_{j=1}^\infty H_j = \Big\{ \sum_{j=1}^\infty \psi_j \,\Big|\, \psi_j \in H_j,\ \sum_{j=1}^\infty \|\psi_j\|_j^2 < \infty \Big\}, \tag{1.34}$$
which becomes a Hilbert space with the scalar product
$$\Big\langle \sum_{j=1}^\infty \varphi_j, \sum_{j=1}^\infty \psi_j \Big\rangle = \sum_{j=1}^\infty \langle \varphi_j, \psi_j \rangle_j. \tag{1.35}$$

Example. $\bigoplus_{j=1}^\infty \mathbb{C} = \ell^2(\mathbb{N})$. ⋄
Similarly, if $H$ and $\tilde H$ are two Hilbert spaces, we define their tensor product as follows: The elements should be products $\psi \otimes \tilde\psi$ of elements $\psi \in H$ and $\tilde\psi \in \tilde H$. Hence we start with the set of all finite linear combinations of elements of $H \times \tilde H$:
$$\mathcal{F}(H, \tilde H) = \Big\{ \sum_{j=1}^n \alpha_j (\psi_j, \tilde\psi_j) \,\Big|\, (\psi_j, \tilde\psi_j) \in H \times \tilde H,\ \alpha_j \in \mathbb{C} \Big\}. \tag{1.36}$$
Since we want $(\psi_1 + \psi_2) \otimes \tilde\psi = \psi_1 \otimes \tilde\psi + \psi_2 \otimes \tilde\psi$, $\psi \otimes (\tilde\psi_1 + \tilde\psi_2) = \psi \otimes \tilde\psi_1 + \psi \otimes \tilde\psi_2$, and $(\alpha\psi) \otimes \tilde\psi = \psi \otimes (\alpha\tilde\psi)$, we consider $\mathcal{F}(H, \tilde H) / \mathcal{N}(H, \tilde H)$, where
$$\mathcal{N}(H, \tilde H) = \mathrm{span}\Big\{ \sum_{j,k=1}^n \alpha_j \beta_k (\psi_j, \tilde\psi_k) - \Big( \sum_{j=1}^n \alpha_j \psi_j, \sum_{k=1}^n \beta_k \tilde\psi_k \Big) \Big\} \tag{1.37}$$
and write $\psi \otimes \tilde\psi$ for the equivalence class of $(\psi, \tilde\psi)$.

Next we define
$$\langle \psi \otimes \tilde\psi, \varphi \otimes \tilde\varphi \rangle = \langle \psi, \varphi \rangle \langle \tilde\psi, \tilde\varphi \rangle, \tag{1.38}$$
which extends to a sesquilinear form on $\mathcal{F}(H, \tilde H)/\mathcal{N}(H, \tilde H)$. To show that we obtain a scalar product, we need to ensure positivity. Let $\psi = \sum_i \psi_i \otimes \tilde\psi_i \ne 0$ and pick orthonormal bases $\varphi_j$, $\tilde\varphi_k$ for $\mathrm{span}\{\psi_i\}$, $\mathrm{span}\{\tilde\psi_i\}$, respectively. Then
$$\psi = \sum_{j,k} \alpha_{jk}\, \varphi_j \otimes \tilde\varphi_k, \qquad \alpha_{jk} = \sum_i \langle \varphi_j, \psi_i \rangle \langle \tilde\varphi_k, \tilde\psi_i \rangle, \tag{1.39}$$
and we compute
$$\langle \psi, \psi \rangle = \sum_{j,k} |\alpha_{jk}|^2 > 0. \tag{1.40}$$
The completion of $\mathcal{F}(H, \tilde H)/\mathcal{N}(H, \tilde H)$ with respect to the induced norm is called the tensor product $H \otimes \tilde H$ of $H$ and $\tilde H$.

Lemma 1.10. If $\varphi_j$, $\tilde\varphi_k$ are orthonormal bases for $H$, $\tilde H$, respectively, then $\varphi_j \otimes \tilde\varphi_k$ is an orthonormal basis for $H \otimes \tilde H$.

Proof. That $\varphi_j \otimes \tilde\varphi_k$ is an orthonormal set is immediate from (1.38). Moreover, since $\mathrm{span}\{\varphi_j\}$, $\mathrm{span}\{\tilde\varphi_k\}$ are dense in $H$, $\tilde H$, respectively, it is easy to see that $\varphi_j \otimes \tilde\varphi_k$ is dense in $\mathcal{F}(H, \tilde H)/\mathcal{N}(H, \tilde H)$. But the latter is dense in $H \otimes \tilde H$. □
Example. We have $H \otimes \mathbb{C}^n = H^n$. ⋄

Example. Let $(M, d\mu)$ and $(\tilde M, d\tilde\mu)$ be two measure spaces. Then we have $L^2(M, d\mu) \otimes L^2(\tilde M, d\tilde\mu) = L^2(M \times \tilde M, d\mu \times d\tilde\mu)$.

Clearly we have $L^2(M, d\mu) \otimes L^2(\tilde M, d\tilde\mu) \subseteq L^2(M \times \tilde M, d\mu \times d\tilde\mu)$. Now take an orthonormal basis $\varphi_j \otimes \tilde\varphi_k$ for $L^2(M, d\mu) \otimes L^2(\tilde M, d\tilde\mu)$ as in our previous lemma. Then
$$\int_M \int_{\tilde M} (\varphi_j(x) \tilde\varphi_k(y))^* f(x, y)\, d\mu(x)\, d\tilde\mu(y) = 0 \tag{1.41}$$
implies
$$\int_M \varphi_j(x)^* f_k(x)\, d\mu(x) = 0, \qquad f_k(x) = \int_{\tilde M} \tilde\varphi_k(y)^* f(x, y)\, d\tilde\mu(y), \tag{1.42}$$
and hence $f_k(x) = 0$ for $\mu$-a.e. $x$. But this implies $f(x, y) = 0$ for $\mu$-a.e. $x$ and $\tilde\mu$-a.e. $y$ and thus $f = 0$. Hence $\varphi_j \otimes \tilde\varphi_k$ is a basis for $L^2(M \times \tilde M, d\mu \times d\tilde\mu)$ and equality follows. ⋄
It is straightforward to extend the tensor product to any finite number of Hilbert spaces. We even note
$$\Big( \bigoplus_{j=1}^\infty H_j \Big) \otimes H = \bigoplus_{j=1}^\infty (H_j \otimes H), \tag{1.43}$$
where equality has to be understood in the sense that both spaces are unitarily equivalent by virtue of the identification
$$\Big( \sum_{j=1}^\infty \psi_j \Big) \otimes \psi = \sum_{j=1}^\infty \psi_j \otimes \psi. \tag{1.44}$$

Problem 1.11. Show that $\psi \otimes \tilde\psi = 0$ if and only if $\psi = 0$ or $\tilde\psi = 0$.

Problem 1.12. We have $\psi \otimes \tilde\psi = \varphi \otimes \tilde\varphi \ne 0$ if and only if there is some $\alpha \in \mathbb{C} \setminus \{0\}$ such that $\psi = \alpha\varphi$ and $\tilde\psi = \alpha^{-1}\tilde\varphi$.

Problem 1.13. Show (1.43).
1.5. The C* algebra of bounded linear operators
We start by introducing a conjugation for operators on a Hilbert space $H$. Let $A \in L(H)$. Then the adjoint operator is defined via
$$\langle \varphi, A^* \psi \rangle = \langle A\varphi, \psi \rangle \tag{1.45}$$
(compare Corollary 1.9).

Example. If $H = \mathbb{C}^n$ and $A = (a_{jk})_{1 \le j,k \le n}$, then $A^* = (a_{kj}^*)_{1 \le j,k \le n}$. ⋄
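In the matrix case the defining relation (1.45) is just the familiar identity for the conjugate transpose; a quick numerical sketch (illustrative only, with random vectors and matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_star = A.conj().T  # the matrix (a*_{kj}): conjugate transpose

phi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# defining relation (1.45): <phi, A* psi> = <A phi, psi>
# (np.vdot conjugates its first argument, matching the convention here)
assert np.allclose(np.vdot(phi, A_star @ psi), np.vdot(A @ phi, psi))
```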
Lemma 1.11. Let $A, B \in L(H)$ and $\alpha \in \mathbb{C}$. Then

(i) $(A + B)^* = A^* + B^*$, $(\alpha A)^* = \alpha^* A^*$,
(ii) $A^{**} = A$,
(iii) $(AB)^* = B^* A^*$,
(iv) $\|A\| = \|A^*\|$ and $\|A\|^2 = \|A^* A\| = \|A A^*\|$.

Proof. (i) and (ii) are obvious. (iii) follows from $\langle \varphi, (AB)\psi \rangle = \langle A^*\varphi, B\psi \rangle = \langle B^* A^* \varphi, \psi \rangle$. (iv) follows from
$$\|A^*\| = \sup_{\|\varphi\| = \|\psi\| = 1} |\langle \psi, A^* \varphi \rangle| = \sup_{\|\varphi\| = \|\psi\| = 1} |\langle A\psi, \varphi \rangle| = \|A\|$$
and
$$\|A^* A\| = \sup_{\|\varphi\| = \|\psi\| = 1} |\langle \psi, A^* A \varphi \rangle| = \sup_{\|\varphi\| = \|\psi\| = 1} |\langle A\psi, A\varphi \rangle| = \sup_{\|\varphi\| = 1} \|A\varphi\|^2 = \|A\|^2,$$
where we have used $\|\varphi\| = \sup_{\|\psi\| = 1} |\langle \psi, \varphi \rangle|$. □

As a consequence of $\|A^*\| = \|A\|$ observe that taking the adjoint is continuous.
In general, a Banach algebra $\mathcal{A}$ together with an involution
$$(a + b)^* = a^* + b^*, \quad (\alpha a)^* = \alpha^* a^*, \quad a^{**} = a, \quad (ab)^* = b^* a^* \tag{1.46}$$
satisfying
$$\|a\|^2 = \|a^* a\| \tag{1.47}$$
is called a C* algebra. The element $a^*$ is called the adjoint of $a$. Note that $\|a^*\| = \|a\|$ follows from (1.47) and $\|a a^*\| \le \|a\|\,\|a^*\|$.

Any subalgebra which is also closed under involution is called a *-subalgebra. An ideal is a subspace $\mathcal{I} \subseteq \mathcal{A}$ such that $a \in \mathcal{I}$, $b \in \mathcal{A}$ imply $ab \in \mathcal{I}$ and $ba \in \mathcal{I}$. If it is closed under the adjoint map, it is called a *-ideal. Note that if there is an identity $e$, we have $e^* = e$ and hence $(a^{-1})^* = (a^*)^{-1}$ (show this).
Example. The continuous functions $C(I)$ together with complex conjugation form a commutative C* algebra. ⋄

An element $a \in \mathcal{A}$ is called normal if $a a^* = a^* a$, self-adjoint if $a = a^*$, unitary if $a a^* = a^* a = \mathbb{I}$, an (orthogonal) projection if $a = a^* = a^2$, and positive if $a = b b^*$ for some $b \in \mathcal{A}$. Clearly both self-adjoint and unitary elements are normal.

Problem 1.14. Let $A \in L(H)$. Show that $A$ is normal if and only if
$$\|A\psi\| = \|A^*\psi\|, \qquad \forall \psi \in H.$$
(Hint: Problem 0.14.)
(Hint: Problem 0.14.)
Problem 1.15. Show that U : H H is unitary if and only if U
1
= U

.
Problem 1.16. Compute the adjoint of
S :
2
(N)
2
(N), (a
1
, a
2
, a
3
, . . . ) (0, a
1
, a
2
, . . . ).
1.6. Weak and strong convergence
Sometimes a weaker notion of convergence is useful: We say that $\psi_n$ converges weakly to $\psi$ and write
$$\text{w-}\lim_{n\to\infty} \psi_n = \psi \quad \text{or} \quad \psi_n \rightharpoonup \psi \tag{1.48}$$
if $\langle \varphi, \psi_n \rangle \to \langle \varphi, \psi \rangle$ for every $\varphi \in H$ (show that a weak limit is unique).

Example. Let $\varphi_n$ be an (infinite) orthonormal set. Then $\langle \psi, \varphi_n \rangle \to 0$ for every $\psi$ since these are just the expansion coefficients of $\psi$. ($\varphi_n$ does not converge to $0$, since $\|\varphi_n\| = 1$.) ⋄

Clearly $\psi_n \to \psi$ implies $\psi_n \rightharpoonup \psi$ and hence this notion of convergence is indeed weaker. Moreover, the weak limit is unique, since $\langle \varphi, \psi_n \rangle \to \langle \varphi, \psi \rangle$ and $\langle \varphi, \psi_n \rangle \to \langle \varphi, \tilde\psi \rangle$ imply $\langle \varphi, \psi - \tilde\psi \rangle = 0$. A sequence $\psi_n$ is called a weak Cauchy sequence if $\langle \varphi, \psi_n \rangle$ is Cauchy for every $\varphi \in H$.
Lemma 1.12. Let $H$ be a Hilbert space.

(i) $\psi_n \rightharpoonup \psi$ implies $\|\psi\| \le \liminf \|\psi_n\|$.
(ii) Every weak Cauchy sequence $\psi_n$ is bounded: $\|\psi_n\| \le C$.
(iii) Every weak Cauchy sequence converges weakly.
(iv) For a weakly convergent sequence $\psi_n \rightharpoonup \psi$ we have $\psi_n \to \psi$ if and only if $\limsup \|\psi_n\| \le \|\psi\|$.

Proof. (i) Observe
$$\|\psi\|^2 = \langle \psi, \psi \rangle = \liminf \langle \psi, \psi_n \rangle \le \|\psi\| \liminf \|\psi_n\|.$$
(ii) For every $\varphi$ we have that $|\langle \varphi, \psi_n \rangle| \le C(\varphi)$ is bounded. Hence by the uniform boundedness principle we have $\|\psi_n\| = \|\langle \psi_n, \cdot \rangle\| \le C$.
(iii) Let $\varphi_m$ be an orthonormal basis and define $c_m = \lim_{n\to\infty} \langle \varphi_m, \psi_n \rangle$. Then $\psi = \sum_m c_m \varphi_m$ is the desired limit.
(iv) By (i) we have $\lim \|\psi_n\| = \|\psi\|$ and hence
$$\|\psi - \psi_n\|^2 = \|\psi\|^2 - 2\,\mathrm{Re}(\langle \psi, \psi_n \rangle) + \|\psi_n\|^2 \to 0.$$
The converse is straightforward. □
Clearly an orthonormal basis does not have a norm convergent subsequence. Hence the unit ball in an infinite dimensional Hilbert space is never compact. However, we can at least extract weakly convergent subsequences:

Lemma 1.13. Let $H$ be a Hilbert space. Every bounded sequence $\psi_n$ has a weakly convergent subsequence.

Proof. Let $\varphi_k$ be an orthonormal basis. Then by the usual diagonal sequence argument we can find a subsequence $\psi_{n_m}$ such that $\langle \varphi_k, \psi_{n_m} \rangle$ converges for all $k$. Since $\psi_n$ is bounded, $\langle \varphi, \psi_{n_m} \rangle$ converges for every $\varphi \in H$ and hence $\psi_{n_m}$ is a weak Cauchy sequence. □
Finally, let me remark that similar concepts can be introduced for operators. This is of particular importance for the case of unbounded operators, where convergence in the operator norm makes no sense at all.

A sequence of operators $A_n$ is said to converge strongly to $A$,
$$\text{s-}\lim_{n\to\infty} A_n = A \quad :\Leftrightarrow \quad A_n \psi \to A\psi \quad \forall \psi \in D(A) \subseteq D(A_n). \tag{1.49}$$
It is said to converge weakly to $A$,
$$\text{w-}\lim_{n\to\infty} A_n = A \quad :\Leftrightarrow \quad A_n \psi \rightharpoonup A\psi \quad \forall \psi \in D(A) \subseteq D(A_n). \tag{1.50}$$
Clearly norm convergence implies strong convergence and strong convergence implies weak convergence.
Clearly norm convergence implies strong convergence and strong conver-
gence implies weak convergence.
Example. Consider the operator S
n
L(
2
(N)) which shifts a sequence n
places to the left, that is,
S
n
(x
1
, x
2
, . . . ) = (x
n+1
, x
n+2
, . . . ), (1.51)
and the operator S

n
L(
2
(N)) which shifts a sequence n places to the right
and lls up the rst n places with zeros, that is,
S

n
(x
1
, x
2
, . . . ) = (0, . . . , 0
. .
n places
, x
1
, x
2
, . . . ). (1.52)
Then S
n
converges to zero strongly but not in norm (since |S
n
| = 1) and
S

n
converges weakly to zero (since , S

n
) = S
n
, )) but not strongly
(since |S

n
| = ||) .
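The two behaviors are easy to see numerically on a finitely supported element of $\ell^2(\mathbb{N})$ (an illustrative sketch only; the truncation to 100 nonzero entries is an arbitrary choice that makes $\|S_n x\|$ exactly zero for large $n$):

```python
import numpy as np

def shift_left(x, n):   # S_n: drop the first n entries
    return x[n:]

def shift_right(x, n):  # S_n^*: prepend n zeros
    return np.concatenate([np.zeros(n), x])

# a fixed element of l^2(N), truncated to finitely many nonzero entries
x = 1.0 / np.arange(1, 101)

norms_left = [np.linalg.norm(shift_left(x, n)) for n in (0, 10, 50, 100)]
norms_right = [np.linalg.norm(shift_right(x, n)) for n in (0, 10, 50, 100)]

# ||S_n x|| -> 0: pointwise (strong) convergence of S_n to zero ...
assert norms_left[-1] == 0.0
# ... while ||S_n^* x|| = ||x|| for every n: S_n^* does not converge strongly
assert np.allclose(norms_right, norms_left[0])
```

Note that no single matrix norm detects this: each $S_n$ has operator norm $1$, which is why strong convergence here does not imply norm convergence.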
Note that this example also shows that taking adjoints is not continuous with respect to strong convergence! If $A_n \xrightarrow{s} A$, we only have
$$\langle \varphi, A_n^* \psi \rangle = \langle A_n \varphi, \psi \rangle \to \langle A\varphi, \psi \rangle = \langle \varphi, A^* \psi \rangle \tag{1.53}$$
and hence $A_n^* \rightharpoonup A^*$ in general. However, if $A_n$ and $A$ are normal, we have
$$\|(A_n - A)^* \psi\| = \|(A_n - A)\psi\| \tag{1.54}$$
and hence $A_n^* \xrightarrow{s} A^*$ in this case. Thus at least for normal operators taking adjoints is continuous with respect to strong convergence.
Lemma 1.14. Suppose $A_n$ is a sequence of bounded operators.

(i) $\text{s-}\lim_{n\to\infty} A_n = A$ implies $\|A\| \le \liminf_n \|A_n\|$.
(ii) Every strong Cauchy sequence $A_n$ is bounded: $\|A_n\| \le C$.
(iii) If $A_n \psi \to A\psi$ for $\psi$ in some dense set and $\|A_n\| \le C$, then $\text{s-}\lim_{n\to\infty} A_n = A$.

The same result holds if strong convergence is replaced by weak convergence.

Proof. (i) follows from
$$\|A\psi\| = \lim_{n\to\infty} \|A_n \psi\| \le \liminf_n \|A_n\|$$
for every $\psi$ with $\|\psi\| = 1$.
(ii) follows as in Lemma 1.12 (ii).
(iii) Just use
$$\|A_n \psi - A\psi\| \le \|A_n \psi - A_n \varphi\| + \|A_n \varphi - A\varphi\| + \|A\varphi - A\psi\| \le 2C \|\psi - \varphi\| + \|A_n \varphi - A\varphi\|$$
and choose $\varphi$ in the dense subspace such that $\|\psi - \varphi\| \le \frac{\varepsilon}{4C}$ and $n$ large such that $\|A_n \varphi - A\varphi\| \le \frac{\varepsilon}{2}$.

The case of weak convergence is left as an exercise. (Hint: (2.14).) □
Problem 1.17. Suppose $\psi_n \to \psi$ and $\varphi_n \rightharpoonup \varphi$. Then $\langle \psi_n, \varphi_n \rangle \to \langle \psi, \varphi \rangle$.

Problem 1.18. Let $\{\varphi_j\}_{j=1}^\infty$ be some orthonormal basis. Show that $\psi_n \rightharpoonup \psi$ if and only if $\psi_n$ is bounded and $\langle \varphi_j, \psi_n \rangle \to \langle \varphi_j, \psi \rangle$ for every $j$. Show that this is wrong without the boundedness assumption.

Problem 1.19. A subspace $M \subseteq H$ is closed if and only if every weak Cauchy sequence in $M$ has a limit in $M$. (Hint: $M = M^{\perp\perp}$.)
1.7. Appendix: The Stone–Weierstraß theorem

In the case of a self-adjoint operator, the spectral theorem will show that the closed *-subalgebra generated by this operator is isomorphic to the C* algebra of continuous functions $C(K)$ over some compact set. Hence it is important to be able to identify dense sets:
Theorem 1.15 (Stone–Weierstraß, real version). Suppose $K$ is a compact set and let $C(K, \mathbb{R})$ be the Banach algebra of continuous functions (with the sup norm).

If $F \subseteq C(K, \mathbb{R})$ contains the identity $1$ and separates points (i.e., for every $x_1 \ne x_2$ there is some function $f \in F$ such that $f(x_1) \ne f(x_2)$), then the algebra generated by $F$ is dense.

Proof. Denote by $A$ the algebra generated by $F$. Note that if $f \in A$, we have $|f| \in \overline{A}$: By the Weierstraß approximation theorem (Theorem 0.15) there is a polynomial $p_n(t)$ such that $\big| |t| - p_n(t) \big| < \frac{1}{n}$ for $t \in f(K)$ and hence $p_n(f) \to |f|$.

In particular, if $f, g$ are in $\overline{A}$, we also have
$$\max\{f, g\} = \frac{(f + g) + |f - g|}{2}, \qquad \min\{f, g\} = \frac{(f + g) - |f - g|}{2}$$
in $\overline{A}$.
Now fix $f \in C(K, \mathbb{R})$. We need to find some $f^\varepsilon \in \overline{A}$ with $\|f - f^\varepsilon\|_\infty < \varepsilon$.

First of all, since $A$ separates points, observe that for given $y, z \in K$ there is a function $f_{y,z} \in A$ such that $f_{y,z}(y) = f(y)$ and $f_{y,z}(z) = f(z)$ (show this). Next, for every $y \in K$ there is a neighborhood $U(y)$ such that
$$f_{y,z}(x) > f(x) - \varepsilon, \qquad x \in U(y),$$
and since $K$ is compact, finitely many, say $U(y_1), \dots, U(y_j)$, cover $K$. Then
$$f_z = \max\{f_{y_1,z}, \dots, f_{y_j,z}\} \in \overline{A}$$
and satisfies $f_z > f - \varepsilon$ by construction. Since $f_z(z) = f(z)$ for every $z \in K$, there is a neighborhood $V(z)$ such that
$$f_z(x) < f(x) + \varepsilon, \qquad x \in V(z),$$
and a corresponding finite cover $V(z_1), \dots, V(z_k)$. Now
$$f^\varepsilon = \min\{f_{z_1}, \dots, f_{z_k}\} \in \overline{A}$$
satisfies $f^\varepsilon < f + \varepsilon$. Since $f - \varepsilon < f_{z_l}$ for every $l$, we also have $f - \varepsilon < f^\varepsilon$, and we have found the required function. □
Theorem 1.16 (Stone–Weierstraß). Suppose $K$ is a compact set and let $C(K)$ be the C* algebra of continuous functions (with the sup norm).

If $F \subseteq C(K)$ contains the identity $1$ and separates points, then the *-subalgebra generated by $F$ is dense.

Proof. Just observe that $\tilde F = \{\mathrm{Re}(f), \mathrm{Im}(f) \mid f \in F\}$ satisfies the assumption of the real version. Hence any real-valued continuous function can be approximated by elements from $\tilde F$; in particular, this holds for the real and imaginary parts of any given complex-valued function. □

Note that the additional requirement of being closed under complex conjugation is crucial: The functions holomorphic on the unit ball and continuous on the boundary separate points, but they are not dense (since the uniform limit of holomorphic functions is again holomorphic).
Corollary 1.17. Suppose $K$ is a compact set and let $C(K)$ be the C* algebra of continuous functions (with the sup norm).

If $F \subseteq C(K)$ separates points, then the closure of the *-subalgebra generated by $F$ is either $C(K)$ or $\{f \in C(K) \mid f(t_0) = 0\}$ for some $t_0 \in K$.

Proof. There are two possibilities: either all $f \in F$ vanish at one point $t_0 \in K$ (there can be at most one such point since $F$ separates points) or there is no such point. If there is no such point, we can proceed as in the proof of the Stone–Weierstraß theorem to show that the identity can be approximated by elements in $A$ (note that to show $|f| \in \overline{A}$ if $f \in A$, we do not need the identity, since $p_n$ can be chosen to contain no constant term). If there is such a $t_0$, the identity is clearly missing from $A$. However, adding the identity to $A$, we get $A + \mathbb{C} = C(K)$ and it is easy to see that $\overline{A} = \{f \in C(K) \mid f(t_0) = 0\}$. □
Problem 1.20. Show that the functions $\varphi_n(x) = \frac{1}{\sqrt{2\pi}} \mathrm{e}^{\mathrm{i} n x}$, $n \in \mathbb{Z}$, form an orthonormal basis for $H = L^2(0, 2\pi)$.

Problem 1.21. Let $k \in \mathbb{N}$ and $I \subseteq \mathbb{R}$. Show that the *-subalgebra generated by $f_{z_0}(t) = \frac{1}{(t - z_0)^k}$ for one $z_0 \in \mathbb{C}$ is dense in the C* algebra $C_\infty(I)$ of continuous functions vanishing at infinity

• for $I = \mathbb{R}$ if $z_0 \in \mathbb{C} \setminus \mathbb{R}$ and $k = 1, 2$,
• for $I = [a, \infty)$ if $z_0 \in (-\infty, a)$ and any $k$,
• for $I = (-\infty, a] \cup [b, \infty)$ if $z_0 \in (a, b)$ and $k$ odd.

(Hint: Add $\infty$ to $\mathbb{R}$ to make it compact.)
Chapter 2
Self-adjointness and spectrum
2.1. Some quantum mechanics
In quantum mechanics, a single particle living in $\mathbb{R}^3$ is described by a complex-valued function (the wave function)
$$\psi(x, t), \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R}, \tag{2.1}$$
where $x$ corresponds to a point in space and $t$ corresponds to time. The quantity $\rho_t(x) = |\psi(x, t)|^2$ is interpreted as the probability density of the particle at the time $t$. In particular, $\psi$ must be normalized according to
$$\int_{\mathbb{R}^3} |\psi(x, t)|^2\, d^3 x = 1, \qquad t \in \mathbb{R}. \tag{2.2}$$
The location x of the particle is a quantity which can be observed (i.e.,
measured) and is hence called observable. Due to our probabilistic inter-
pretation, it is also a random variable whose expectation is given by
E

(x) =
_
R
3
x[(x, t)[
2
d
3
x. (2.3)
In a real life setting, it will not be possible to measure x directly and one will
only be able to measure certain functions of x. For example, it is possible to
check whether the particle is inside a certain area of space (e.g., inside a
detector). The corresponding observable is the characteristic function

(x)
of this set. In particular, the number
E

) =
_
R
3

(x)[(x, t)[
2
d
3
x =
_

[(x, t)[
2
d
3
x (2.4)
55
56 2. Self-adjointness and spectrum
corresponds to the probability of finding the particle inside $\Omega \subseteq \mathbb{R}^3$. An important point to observe is that, in contradistinction to classical mechanics, the particle is no longer localized at a certain point. In particular, the mean-square deviation (or variance) $\Delta_\psi(x)^2 = E_\psi(x^2) - E_\psi(x)^2$ is always nonzero.
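For concreteness, here is a small numerical sketch (mine, not the book's) of (2.2) and (2.3) in one dimension: a normalized Gaussian wave packet centered at $a$ has unit total probability, expectation $E_\psi(x) = a$, and a strictly positive variance, as just discussed.

```python
# Riemann-sum approximation of the normalization (2.2), the expectation
# (2.3), and the variance for a 1-d Gaussian wave packet centered at a.
import numpy as np

a = 1.5
x = np.linspace(-20.0, 20.0, 200001)
dx = x[1] - x[0]
psi = np.pi ** (-0.25) * np.exp(-(x - a) ** 2 / 2)   # normalized in L^2(R)

norm = np.sum(np.abs(psi) ** 2) * dx                 # should be 1 by (2.2)
expectation = np.sum(x * np.abs(psi) ** 2) * dx      # should equal a
variance = np.sum(x ** 2 * np.abs(psi) ** 2) * dx - expectation ** 2

assert abs(norm - 1) < 1e-6
assert abs(expectation - a) < 1e-6
assert variance > 0          # the particle is never localized at a point
```

For this particular packet the variance works out to $1/2$, illustrating that $\Delta_\psi(x)^2$ cannot vanish.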
In general, the configuration space (or phase space) of a quantum system is a (complex) Hilbert space $H$ and the possible states of this system are represented by the elements $\psi$ having norm one, $\|\psi\| = 1$.

An observable $a$ corresponds to a linear operator $A$ in this Hilbert space and its expectation, if the system is in the state $\psi$, is given by the real number

    $E_\psi(A) = \langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle$,  (2.5)
where $\langle \cdot, \cdot \rangle$ denotes the scalar product of $H$. Similarly, the mean-square deviation is given by

    $\Delta_\psi(A)^2 = E_\psi(A^2) - E_\psi(A)^2 = \|(A - E_\psi(A))\psi\|^2$.  (2.6)

Note that $\Delta_\psi(A)$ vanishes if and only if $\psi$ is an eigenstate corresponding to the eigenvalue $E_\psi(A)$; that is, $A\psi = E_\psi(A)\psi$.
From a physical point of view, (2.5) should make sense for any $\psi \in H$. However, this is not in the cards as our simple example of one particle already shows. In fact, the reader is invited to find a square integrable function $\psi(x)$ for which $x\psi(x)$ is no longer square integrable. The deeper reason behind this nuisance is that $E_\psi(x)$ can attain arbitrarily large values if the particle is not confined to a finite domain, which renders the corresponding operator unbounded. But unbounded operators cannot be defined on the entire Hilbert space in a natural way by the closed graph theorem (Theorem 2.8 below).

Hence, $A$ will only be defined on a subset $D(A) \subseteq H$ called the domain of $A$. Since we want $A$ to be defined for at least most states, we require $D(A)$ to be dense.
However, it should be noted that there is no general prescription for how to find the operator corresponding to a given observable.
Now let us turn to the time evolution of such a quantum mechanical system. Given an initial state $\psi(0)$ of the system, there should be a unique $\psi(t)$ representing the state of the system at time $t \in \mathbb{R}$. We will write

    $\psi(t) = U(t)\psi(0)$.  (2.7)

Moreover, it follows from physical experiments that superposition of states holds; that is, $U(t)(\alpha_1 \psi_1(0) + \alpha_2 \psi_2(0)) = \alpha_1 \psi_1(t) + \alpha_2 \psi_2(t)$ ($|\alpha_1|^2 + |\alpha_2|^2 = 1$). In other words, $U(t)$ should be a linear operator. Moreover, since $\psi(t)$
is a state (i.e., $\|\psi(t)\| = 1$), we have

    $\|U(t)\psi\| = \|\psi\|$.  (2.8)

Such operators are called unitary. Next, since we have assumed uniqueness of solutions to the initial value problem, we must have

    $U(0) = I, \quad U(t + s) = U(t)U(s)$.  (2.9)
A family of unitary operators $U(t)$ having this property is called a one-parameter unitary group. In addition, it is natural to assume that this group is strongly continuous; that is,

    $\lim_{t \to t_0} U(t)\psi = U(t_0)\psi, \quad \psi \in H$.  (2.10)

Each such group has an infinitesimal generator defined by

    $H\psi = \lim_{t \to 0} \frac{i}{t}(U(t)\psi - \psi), \quad D(H) = \{\psi \in H \mid \lim_{t \to 0} \frac{1}{t}(U(t)\psi - \psi) \text{ exists}\}$.  (2.11)

This operator is called the Hamiltonian and corresponds to the energy of the system. If $\psi(0) \in D(H)$, then $\psi(t)$ is a solution of the Schrödinger equation (in suitable units)

    $i \frac{d}{dt}\psi(t) = H\psi(t)$.  (2.12)

This equation will be the main subject of our course.
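The structure above can be illustrated in a finite-dimensional toy model (my own sketch, not part of the text): for a Hermitian matrix $H$, the family $U(t) = e^{-itH}$ is unitary, satisfies the group law (2.9), and $H$ is recovered as the generator (2.11).

```python
# Finite-dimensional illustration: U(t) = exp(-i t H) computed via the
# spectral decomposition of a Hermitian matrix H.  We check unitarity
# (2.8), the group law (2.9), and the generator limit (2.11).
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = (M + M.conj().T) / 2                 # Hermitian "Hamiltonian"
w, V = np.linalg.eigh(H)                 # H = V diag(w) V^*

def U(t):
    # exp(-i t H) = V diag(exp(-i t w)) V^*
    return (V * np.exp(-1j * t * w)) @ V.conj().T

psi = rng.standard_normal(4) + 1j * rng.standard_normal(4)

assert np.allclose(np.linalg.norm(U(0.7) @ psi), np.linalg.norm(psi))  # (2.8)
assert np.allclose(U(0.0), np.eye(4))                                  # (2.9)
assert np.allclose(U(0.3) @ U(0.4), U(0.7))                            # (2.9)

# generator (2.11): i/t (U(t)psi - psi) -> H psi as t -> 0
t = 1e-6
assert np.allclose(1j / t * (U(t) @ psi - psi), H @ psi, atol=1e-4)
```

In infinite dimensions the difference-quotient limit only exists on the dense domain $D(H)$, which is exactly the subtlety the text goes on to address.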
In summary, we have the following axioms of quantum mechanics.

Axiom 1. The configuration space of a quantum system is a complex separable Hilbert space $H$ and the possible states of this system are represented by the elements of $H$ which have norm one.

Axiom 2. Each observable $a$ corresponds to a linear operator $A$ defined maximally on a dense subset $D(A)$. Moreover, the operator corresponding to a polynomial $P_n(a) = \sum_{j=0}^{n} \alpha_j a^j$, $\alpha_j \in \mathbb{R}$, is $P_n(A) = \sum_{j=0}^{n} \alpha_j A^j$, $D(P_n(A)) = D(A^n) = \{\psi \in D(A) \mid A\psi \in D(A^{n-1})\}$ ($A^0 = I$).

Axiom 3. The expectation value for a measurement of $a$, when the system is in the state $\psi \in D(A)$, is given by (2.5), which must be real for all $\psi \in D(A)$.

Axiom 4. The time evolution is given by a strongly continuous one-parameter unitary group $U(t)$. The generator of this group corresponds to the energy of the system.
In the following sections we will try to draw some mathematical consequences from these assumptions:
First we will see that Axioms 2 and 3 imply that observables correspond to self-adjoint operators. Hence these operators play a central role
in quantum mechanics and we will derive some of their basic properties. Another crucial role is played by the set of all possible expectation values for the measurement of $a$, which is connected with the spectrum $\sigma(A)$ of the corresponding operator $A$.
The problem of defining functions of an observable will lead us to the spectral theorem (in the next chapter), which generalizes the diagonalization of symmetric matrices.
Axiom 4 will be the topic of Chapter 5.
2.2. Self-adjoint operators
Let $H$ be a (complex separable) Hilbert space. A linear operator is a linear mapping

    $A : D(A) \to H$,  (2.13)

where $D(A)$ is a linear subspace of $H$, called the domain of $A$. It is called bounded if the operator norm

    $\|A\| = \sup_{\|\psi\|=1} \|A\psi\| = \sup_{\|\varphi\|=\|\psi\|=1} |\langle \varphi, A\psi \rangle|$  (2.14)

is finite. The second equality follows since equality in $|\langle \varphi, A\psi \rangle| \le \|\varphi\|\,\|A\psi\|$ is attained when $\varphi = z A\psi$ for some $z \in \mathbb{C}$. If $A$ is bounded, it is no restriction to assume $D(A) = H$ and we will always do so. The Banach space of all bounded linear operators is denoted by $L(H)$. Products of (unbounded) operators are defined naturally; that is, $AB\psi = A(B\psi)$ for $\psi \in D(AB) = \{\psi \in D(B) \mid B\psi \in D(A)\}$.
The expression $\langle \psi, A\psi \rangle$ encountered in the previous section is called the quadratic form,

    $q_A(\psi) = \langle \psi, A\psi \rangle, \quad \psi \in D(A)$,  (2.15)

associated to $A$. An operator can be reconstructed from its quadratic form via the polarization identity

    $\langle \varphi, A\psi \rangle = \frac{1}{4}\left( q_A(\varphi + \psi) - q_A(\varphi - \psi) + i\, q_A(\varphi - i\psi) - i\, q_A(\varphi + i\psi) \right)$.  (2.16)
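The identity (2.16) is easy to test numerically in a finite-dimensional setting (my own check, not from the text); note that the inner product here is conjugate-linear in the first argument, matching `np.vdot`.

```python
# Verify the polarization identity (2.16): the quadratic form q_A alone
# determines <phi, A psi> for an arbitrary (not necessarily symmetric)
# complex matrix A.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
phi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)

q = lambda v: np.vdot(v, A @ v)          # q_A(v) = <v, A v>

lhs = np.vdot(phi, A @ psi)
rhs = (q(phi + psi) - q(phi - psi)
       + 1j * q(phi - 1j * psi) - 1j * q(phi + 1j * psi)) / 4
assert np.allclose(lhs, rhs)
```

With the opposite convention (linear in the first argument) the $\pm i\psi$ terms would have to be swapped, which is why the convention matters.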
A densely defined linear operator $A$ is called symmetric (or hermitian) if

    $\langle \varphi, A\psi \rangle = \langle A\varphi, \psi \rangle, \quad \psi, \varphi \in D(A)$.  (2.17)

The justification for this definition is provided by the following

Lemma 2.1. A densely defined operator $A$ is symmetric if and only if the corresponding quadratic form is real-valued.
Proof. Clearly (2.17) implies that $\mathrm{Im}(q_A(\psi)) = 0$. Conversely, taking the imaginary part of the identity

    $q_A(\psi + i\varphi) = q_A(\psi) + q_A(\varphi) + i(\langle \psi, A\varphi \rangle - \langle \varphi, A\psi \rangle)$

shows $\mathrm{Re}\langle A\psi, \varphi \rangle = \mathrm{Re}\langle \psi, A\varphi \rangle$. Replacing $\varphi$ by $i\varphi$ in this last equation shows $\mathrm{Im}\langle A\psi, \varphi \rangle = \mathrm{Im}\langle \psi, A\varphi \rangle$ and finishes the proof. □

In other words, a densely defined operator $A$ is symmetric if and only if

    $\langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle, \quad \psi \in D(A)$.  (2.18)
This already narrows the class of admissible operators to the class of symmetric operators by Axiom 3. Next, let us tackle the issue of the correct domain.

By Axiom 2, $A$ should be defined maximally; that is, if $\tilde{A}$ is another symmetric operator such that $A \subseteq \tilde{A}$, then $A = \tilde{A}$. Here we write $A \subseteq \tilde{A}$ if $D(A) \subseteq D(\tilde{A})$ and $A\psi = \tilde{A}\psi$ for all $\psi \in D(A)$. The operator $\tilde{A}$ is called an extension of $A$ in this case. In addition, we write $A = \tilde{A}$ if both $\tilde{A} \subseteq A$ and $A \subseteq \tilde{A}$ hold.
The adjoint operator $A^*$ of a densely defined linear operator $A$ is defined by

    $D(A^*) = \{\psi \in H \mid \exists \tilde{\psi} \in H : \langle \psi, A\varphi \rangle = \langle \tilde{\psi}, \varphi \rangle, \ \forall \varphi \in D(A)\}$,
    $A^*\psi = \tilde{\psi}$.  (2.19)
The requirement that $D(A)$ be dense implies that $A^*$ is well-defined. However, note that $D(A^*)$ might not be dense in general. In fact, it might contain no vectors other than $0$.
Clearly we have $(\alpha A)^* = \alpha^* A^*$ for $\alpha \in \mathbb{C}$ and $(A + B)^* \supseteq A^* + B^*$ provided $D(A + B) = D(A) \cap D(B)$ is dense. However, equality will not hold in general unless one operator is bounded (Problem 2.2).
For later use, note that (Problem 2.4)

    $\mathrm{Ker}(A^*) = \mathrm{Ran}(A)^\perp$.  (2.20)
For symmetric operators we clearly have $A \subseteq A^*$. If, in addition, $A = A^*$ holds, then $A$ is called self-adjoint. Our goal is to show that observables correspond to self-adjoint operators. This is for example true in the case of the position operator $x$, which is a special case of a multiplication operator.
Example. (Multiplication operator) Consider the multiplication operator

    $(Af)(x) = A(x)f(x), \quad D(A) = \{f \in L^2(\mathbb{R}^n, d\mu) \mid Af \in L^2(\mathbb{R}^n, d\mu)\}$,  (2.21)

given by multiplication with the measurable function $A : \mathbb{R}^n \to \mathbb{C}$. First of all note that $D(A)$ is dense. In fact, consider $\Omega_n = \{x \in \mathbb{R}^n \mid |A(x)| \le n\} \nearrow \mathbb{R}^n$. Then, for every $f \in L^2(\mathbb{R}^n, d\mu)$ the function $f_n = \chi_{\Omega_n} f \in D(A)$ converges to $f$ as $n \to \infty$ by dominated convergence.

Next, let us compute the adjoint of $A$. Performing a formal computation, we have for $h, f \in D(A)$ that

    $\langle h, Af \rangle = \int h(x)^* A(x) f(x) \, d\mu(x) = \int (A(x)^* h(x))^* f(x) \, d\mu(x) = \langle \tilde{A}h, f \rangle$,  (2.22)

where $\tilde{A}$ is multiplication by $A(x)^*$,

    $(\tilde{A}f)(x) = A(x)^* f(x), \quad D(\tilde{A}) = \{f \in L^2(\mathbb{R}^n, d\mu) \mid \tilde{A}f \in L^2(\mathbb{R}^n, d\mu)\}$.  (2.23)

Note $D(\tilde{A}) = D(A)$. At first sight this seems to show that the adjoint of $A$ is $\tilde{A}$. But for our calculation we had to assume $h \in D(A)$ and there might be some functions in $D(A^*)$ which do not satisfy this requirement! In particular, our calculation only shows $\tilde{A} \subseteq A^*$. To show that equality holds, we need to work a little harder:

If $h \in D(A^*)$, there is some $g \in L^2(\mathbb{R}^n, d\mu)$ such that

    $\int h(x)^* A(x) f(x) \, d\mu(x) = \int g(x)^* f(x) \, d\mu(x), \quad f \in D(A)$,  (2.24)

and thus

    $\int (h(x) A(x)^* - g(x))^* f(x) \, d\mu(x) = 0, \quad f \in D(A)$.  (2.25)

In particular,

    $\int \chi_{\Omega_n}(x) (h(x) A(x)^* - g(x))^* f(x) \, d\mu(x) = 0, \quad f \in L^2(\mathbb{R}^n, d\mu)$,  (2.26)

which shows that $\chi_{\Omega_n}(h(x) A(x)^* - g(x))^* \in L^2(\mathbb{R}^n, d\mu)$ vanishes. Since $n$ is arbitrary, we even have $h(x)A(x)^* = g(x) \in L^2(\mathbb{R}^n, d\mu)$ and thus $A^*$ is multiplication by $A(x)^*$ and $D(A^*) = D(A)$.

In particular, $A$ is self-adjoint if $A$ is real-valued. In the general case we have at least $\|Af\| = \|A^* f\|$ for all $f \in D(A) = D(A^*)$. Such operators are called normal. ⋄
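A finite-dimensional analogue (my own illustration, not the book's) makes this concrete: multiplication by finitely many values $A(x_j)$ is a diagonal matrix, its adjoint is the conjugate diagonal matrix, and the two have equal norms on every vector.

```python
# Diagonal-matrix analogue of the multiplication operator: the adjoint is
# multiplication by the complex conjugate, ||A f|| = ||A* f|| holds, and
# A commutes with A* (normality, cf. Problem 2.7 below).
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal(6) + 1j * rng.standard_normal(6)   # values A(x_j)
A = np.diag(a)
A_star = A.conj().T                                        # the adjoint

assert np.allclose(A_star, np.diag(a.conj()))              # conjugate values

f = rng.standard_normal(6) + 1j * rng.standard_normal(6)
assert np.isclose(np.linalg.norm(A @ f), np.linalg.norm(A_star @ f))
assert np.allclose(A @ A_star, A_star @ A)                 # normal
```

The domain subtleties discussed above are of course invisible in finite dimensions, where every operator is bounded and everywhere defined.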
Now note that

    $A \subseteq B \;\Rightarrow\; B^* \subseteq A^*$;  (2.27)

that is, increasing the domain of $A$ implies decreasing the domain of $A^*$. Thus there is no point in trying to extend the domain of a self-adjoint operator further. In fact, if $A$ is self-adjoint and $B$ is a symmetric extension, we infer $A \subseteq B \subseteq B^* \subseteq A^* = A$ implying $A = B$.

Corollary 2.2. Self-adjoint operators are maximal; that is, they do not have any symmetric extensions.
Furthermore, if $A^*$ is densely defined (which is the case if $A$ is symmetric), we can consider $A^{**}$. From the definition (2.19) it is clear that $A \subseteq A^{**}$ and thus $A^{**}$ is an extension of $A$. This extension is closely related to extending a linear subspace $M$ via $M^{\perp\perp} = \overline{M}$ (as we will see a bit later) and thus is called the closure $\overline{A} = A^{**}$ of $A$.

If $A$ is symmetric, we have $A \subseteq A^*$ and hence $\overline{A} = A^{**} \subseteq A^*$; that is, $\overline{A}$ lies between $A$ and $A^*$. Moreover, $\langle \psi, A^*\varphi \rangle = \langle \overline{A}\psi, \varphi \rangle$ for all $\psi \in D(\overline{A})$, $\varphi \in D(A^*)$ implies that $\overline{A}$ is symmetric since $A^*\psi = \overline{A}\psi$ for $\psi \in D(\overline{A})$.
Example. (Differential operator) Take $H = L^2(0, 2\pi)$.

(i) Consider the operator

    $A_0 f = -i \frac{d}{dx} f, \quad D(A_0) = \{f \in C^1[0, 2\pi] \mid f(0) = f(2\pi) = 0\}$.  (2.28)

That $A_0$ is symmetric can be shown by a simple integration by parts (do this). Note that the boundary conditions $f(0) = f(2\pi) = 0$ are chosen such that the boundary terms occurring from integration by parts vanish. However, this will also follow once we have computed $A_0^*$. If $g \in D(A_0^*)$, we must have

    $\int_0^{2\pi} g(x)^* (-i f'(x)) \, dx = \int_0^{2\pi} \tilde{g}(x)^* f(x) \, dx$  (2.29)

for some $\tilde{g} \in L^2(0, 2\pi)$. Integration by parts (cf. (2.116)) shows

    $\int_0^{2\pi} f'(x) \left( g(x) - i \int_0^x \tilde{g}(t) \, dt \right)^* dx = 0$.  (2.30)

In fact, this formula holds for $\tilde{g} \in C[0, 2\pi]$. Since the set of continuous functions is dense, the general case $\tilde{g} \in L^2(0, 2\pi)$ follows by approximating $\tilde{g}$ with continuous functions and taking limits on both sides using dominated convergence.

Hence $g(x) - i \int_0^x \tilde{g}(t) \, dt \in \{f' \mid f \in D(A_0)\}^\perp$. But $\{f' \mid f \in D(A_0)\} = \{h \in C[0, 2\pi] \mid \int_0^{2\pi} h(t) \, dt = 0\}$ (show this) implying $g(x) = g(0) + i \int_0^x \tilde{g}(t) \, dt$ since $\{f' \mid f \in D(A_0)\}^\perp = \{h \in H \mid \langle 1, h \rangle = 0\}^\perp = \{1\}^{\perp\perp} = \mathrm{span}\{1\}$. Thus $g \in AC[0, 2\pi]$, where

    $AC[a, b] = \{f \in C[a, b] \mid f(x) = f(a) + \int_a^x g(t) \, dt, \ g \in L^1(a, b)\}$  (2.31)

denotes the set of all absolutely continuous functions (see Section 2.7). In summary, $g \in D(A_0^*)$ implies $g \in AC[0, 2\pi]$ and $A_0^* g = \tilde{g} = -i g'$. Conversely, for every $g \in H^1(0, 2\pi) = \{f \in AC[0, 2\pi] \mid f' \in L^2(0, 2\pi)\}$, (2.29) holds with $\tilde{g} = -i g'$ and we conclude

    $A_0^* f = -i \frac{d}{dx} f, \quad D(A_0^*) = H^1(0, 2\pi)$.  (2.32)

In particular, $A_0$ is symmetric but not self-adjoint. Since $\overline{A_0} = A_0^{**} \subseteq A_0^*$, we can use integration by parts to compute

    $0 = \langle g, A_0 f \rangle - \langle A_0^* g, f \rangle = i(f(0)g(0)^* - f(2\pi)g(2\pi)^*)$  (2.33)

and since the boundary values of $g \in D(A_0^*)$ can be prescribed arbitrarily, we must have $f(0) = f(2\pi) = 0$. Thus

    $\overline{A_0} f = -i \frac{d}{dx} f, \quad D(\overline{A_0}) = \{f \in D(A_0^*) \mid f(0) = f(2\pi) = 0\}$.  (2.34)

(ii) Now let us take

    $A f = -i \frac{d}{dx} f, \quad D(A) = \{f \in C^1[0, 2\pi] \mid f(0) = f(2\pi)\}$,  (2.35)

which is clearly an extension of $A_0$. Thus $A^* \subseteq A_0^*$ and we compute

    $0 = \langle g, A f \rangle - \langle A^* g, f \rangle = i f(0)(g(0)^* - g(2\pi)^*)$.  (2.36)

Since this must hold for all $f \in D(A)$, we conclude $g(0) = g(2\pi)$ and

    $A^* f = -i \frac{d}{dx} f, \quad D(A^*) = \{f \in H^1(0, 2\pi) \mid f(0) = f(2\pi)\}$.  (2.37)

Similarly, as before, $\overline{A} = A^*$ and thus $\overline{A}$ is self-adjoint. ⋄
One might suspect that there is no big difference between the two symmetric operators $A_0$ and $A$ from the previous example, since they coincide on a dense set of vectors. However, quite the opposite is true: For example, the first operator $A_0$ has no eigenvectors at all (i.e., solutions of the equation $A_0 \psi = z\psi$, $z \in \mathbb{C}$) whereas the second one has an orthonormal basis of eigenvectors!
Example. Compute the eigenvectors of $A_0$ and $A$ from the previous example.

(i) By definition, an eigenvector is a (nonzero) solution of $A_0 u = z u$, $z \in \mathbb{C}$, that is, a solution of the ordinary differential equation

    $-i\, u'(x) = z\, u(x)$  (2.38)

satisfying the boundary conditions $u(0) = u(2\pi) = 0$ (since we must have $u \in D(A_0)$). The general solution of the differential equation is $u(x) = u(0)e^{izx}$ and the boundary conditions imply $u(x) = 0$. Hence there are no eigenvectors.

(ii) Now we look for solutions of $Au = zu$, that is, the same differential equation as before, but now subject to the boundary condition $u(0) = u(2\pi)$. Again the general solution is $u(x) = u(0)e^{izx}$ and the boundary condition requires $u(0) = u(0)e^{2\pi i z}$. Thus there are two possibilities. Either $u(0) = 0$ (which is of no use for us) or $z \in \mathbb{Z}$. In particular, we see that all eigenvectors are given by

    $u_n(x) = \frac{1}{\sqrt{2\pi}} e^{inx}, \quad n \in \mathbb{Z}$,  (2.39)

which are well known to form an orthonormal basis. ⋄
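A discretization makes the contrast vivid (my own numerical sketch, not from the text): approximating $-i\,d/dx$ on a periodic grid gives a Hermitian matrix whose low-lying eigenvalues approach the integers $n$ from (2.39).

```python
# Central-difference discretization of -i d/dx on (0, 2*pi) with periodic
# boundary conditions.  The matrix is Hermitian and its small eigenvalues
# approximate the integers n of (2.39), mimicking the eigenbasis of A.
import numpy as np

N = 512
h = 2 * np.pi / N
D = np.diag(np.ones(N - 1), 1) - np.diag(np.ones(N - 1), -1)
D[0, -1], D[-1, 0] = -1.0, 1.0            # periodic wrap-around
P = -1j * D / (2 * h)                     # discrete momentum operator

assert np.allclose(P, P.conj().T)         # Hermitian (discrete self-adjoint)
ev = np.sort(np.linalg.eigvalsh(P))

# exact eigenvalues are sin(k h)/h, which for small k approximate k
for n in (-2, -1, 0, 1, 2):
    assert np.min(np.abs(ev - n)) < 1e-3
```

With the Dirichlet-type truncation (dropping the wrap-around entries) the matrix fails to be Hermitian, mirroring the fact that $A_0$ is only symmetric, not self-adjoint.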
We will see a bit later that this is a consequence of self-adjointness of $A$. Hence it will be important to know whether a given operator is self-adjoint or not. Our example shows that symmetry is easy to check (in the case of differential operators it usually boils down to integration by parts), but computing the adjoint of an operator is a nontrivial job even in simple situations. However, we will learn soon that self-adjointness is a much stronger property than symmetry, justifying the additional effort needed to prove it.

On the other hand, if a given symmetric operator $A$ turns out not to be self-adjoint, this raises the question of self-adjoint extensions. Two cases need to be distinguished. If $\overline{A}$ is self-adjoint, then there is only one self-adjoint extension (if $B$ is another one, we have $\overline{A} \subseteq B$ and hence $\overline{A} = B$ by Corollary 2.2). In this case $A$ is called essentially self-adjoint and $D(A)$ is called a core for $\overline{A}$. Otherwise there might be more than one self-adjoint extension or none at all. This situation is more delicate and will be investigated in Section 2.6.

Since we have seen that computing $A^*$ is not always easy, a criterion for self-adjointness not involving $A^*$ will be useful.

Lemma 2.3. Let $A$ be symmetric such that $\mathrm{Ran}(A + z) = \mathrm{Ran}(A + z^*) = H$ for one $z \in \mathbb{C}$. Then $A$ is self-adjoint.
Proof. Let $\psi \in D(A^*)$ and $A^*\psi = \tilde{\psi}$. Since $\mathrm{Ran}(A + z^*) = H$, there is a $\vartheta \in D(A)$ such that $(A + z^*)\vartheta = \tilde{\psi} + z^*\psi$. Now we compute

    $\langle \psi, (A+z)\varphi \rangle = \langle \tilde{\psi} + z^*\psi, \varphi \rangle = \langle (A+z^*)\vartheta, \varphi \rangle = \langle \vartheta, (A+z)\varphi \rangle, \quad \varphi \in D(A)$,

and hence $\psi = \vartheta \in D(A)$ since $\mathrm{Ran}(A + z) = H$. □
To proceed further, we will need more information on the closure of an operator. We will use a different approach which avoids the use of the adjoint operator. We will establish equivalence with our original definition in Lemma 2.4.

The simplest way of extending an operator $A$ is to take the closure of its graph $\Gamma(A) = \{(\psi, A\psi) \mid \psi \in D(A)\} \subseteq H^2$. That is, if $(\psi_n, A\psi_n) \to (\psi, \tilde{\psi})$, we might try to define $\overline{A}\psi = \tilde{\psi}$. For $\overline{A}$ to be well-defined, we need that $(\psi_n, A\psi_n) \to (0, \tilde{\psi})$ implies $\tilde{\psi} = 0$. In this case $A$ is called closable and the unique operator $\overline{A}$ which satisfies $\Gamma(\overline{A}) = \overline{\Gamma(A)}$ is called the closure of $A$. Clearly, $A$ is called closed if $A = \overline{A}$, which is the case if and only if the graph of $A$ is closed. Equivalently, $A$ is closed if and only if $\Gamma(A)$ equipped with the graph norm $\|(\psi, A\psi)\|^2_{\Gamma(A)} = \|\psi\|^2 + \|A\psi\|^2$ is a Hilbert space (i.e., closed). By construction, $\overline{A}$ is the smallest closed extension of $A$.

Example. Suppose $A$ is bounded. Then the closure was already computed in Theorem 0.26. In particular, $D(\overline{A}) = \overline{D(A)}$ and a bounded operator is closed if and only if its domain is closed. ⋄
Example. Consider again the differential operator $A_0$ from (2.28) and let us compute the closure without the use of the adjoint operator.

Let $f \in D(\overline{A_0})$ and let $f_n \in D(A_0)$ be a sequence such that $f_n \to f$, $A_0 f_n \to -i g$. Then $f_n' \to g$ and hence $f(x) = \int_0^x g(t) \, dt$. Thus $f \in AC[0, 2\pi]$ and $f(0) = 0$. Moreover $f(2\pi) = \lim_{n \to \infty} \int_0^{2\pi} f_n'(t) \, dt = 0$. Conversely, any such $f$ can be approximated by functions in $D(A_0)$ (show this). ⋄
Example. Consider again the multiplication operator by $A(x)$ in $L^2(\mathbb{R}^n, d\mu)$ but now defined on functions with compact support, that is,

    $D(A_0) = \{f \in D(A) \mid \mathrm{supp}(f) \text{ is compact}\}$.  (2.40)

Then its closure is given by $\overline{A_0} = A$. In particular, $A_0$ is essentially self-adjoint and $D(A_0)$ is a core for $A$.

To prove $\overline{A_0} = A$, let some $f \in D(A)$ be given and consider $f_n = \chi_{\{x \mid |x| \le n\}} f$. Then $f_n \in D(A_0)$ and $f_n(x) \to f(x)$ as well as $A(x)f_n(x) \to A(x)f(x)$ in $L^2(\mathbb{R}^n, d\mu)$ by dominated convergence. Thus $D(\overline{A_0}) \supseteq D(A)$ and since $A$ is closed, we even get equality. ⋄
Example. Consider the multiplication operator $A(x) = x$ in $L^2(\mathbb{R})$ defined on

    $D(A_0) = \{f \in D(A) \mid \int_{\mathbb{R}} f(x) \, dx = 0\}$.  (2.41)

Then $A_0$ is closed. Hence $D(A_0)$ is not a core for $A$.

To show that $A_0$ is closed, suppose there is a sequence $f_n(x) \to f(x)$ such that $x f_n(x) \to g(x)$. Since $A$ is closed, we necessarily have $f \in D(A)$ and $g(x) = x f(x)$. But then

    $0 = \lim_{n \to \infty} \int_{\mathbb{R}} f_n(x) \, dx = \lim_{n \to \infty} \int_{\mathbb{R}} \frac{1}{1 + |x|}(f_n(x) + \mathrm{sign}(x)\, x f_n(x)) \, dx$
    $\quad = \int_{\mathbb{R}} \frac{1}{1 + |x|}(f(x) + \mathrm{sign}(x)\, g(x)) \, dx = \int_{\mathbb{R}} f(x) \, dx$,  (2.42)

which shows $f \in D(A_0)$. ⋄
Next, let us collect a few important results.
Lemma 2.4. Suppose $A$ is a densely defined operator.
(i) $A^*$ is closed.
(ii) $A$ is closable if and only if $D(A^*)$ is dense and $\overline{A} = A^{**}$, respectively, $(\overline{A})^* = A^*$, in this case.
(iii) If $A$ is injective and $\mathrm{Ran}(A)$ is dense, then $(A^*)^{-1} = (A^{-1})^*$. If $A$ is closable and $\overline{A}$ is injective, then $\overline{A}^{-1} = \overline{A^{-1}}$.
Proof. Let us consider the following two unitary operators from $H^2$ to itself:

    $U(\varphi, \psi) = (\psi, -\varphi), \quad V(\varphi, \psi) = (\psi, \varphi)$.

(i) From

    $\Gamma(A^*) = \{(\varphi, \tilde{\varphi}) \in H^2 \mid \langle \varphi, A\psi \rangle = \langle \tilde{\varphi}, \psi \rangle, \ \forall \psi \in D(A)\}$
    $\quad = \{(\varphi, \tilde{\varphi}) \in H^2 \mid \langle (\varphi, \tilde{\varphi}), (\tilde{\psi}, -\psi) \rangle_{H^2} = 0, \ \forall (\psi, \tilde{\psi}) \in \Gamma(A)\}$
    $\quad = (U\Gamma(A))^\perp$  (2.43)

we conclude that $A^*$ is closed.

(ii) Similarly, using $U\Gamma^\perp = (U\Gamma)^\perp$ (Problem 1.4), by

    $\overline{\Gamma(A)} = \Gamma(A)^{\perp\perp} = (U\Gamma(A^*))^\perp = \{(\psi, \tilde{\psi}) \mid \langle \psi, A^*\varphi \rangle - \langle \tilde{\psi}, \varphi \rangle = 0, \ \forall \varphi \in D(A^*)\}$,

we see that $(0, \tilde{\psi}) \in \overline{\Gamma(A)}$ if and only if $\tilde{\psi} \in D(A^*)^\perp$. Hence $A$ is closable if and only if $D(A^*)$ is dense. In this case, equation (2.43) also shows $(\overline{A})^* = A^*$. Moreover, replacing $A$ by $A^*$ in (2.43) and comparing with the last formula shows $A^{**} = \overline{A}$.

(iii) Next note that (provided $A$ is injective)

    $\Gamma(A^{-1}) = V\Gamma(A)$.

Hence if $\mathrm{Ran}(A)$ is dense, then $\mathrm{Ker}(A^*) = \mathrm{Ran}(A)^\perp = \{0\}$ and

    $\Gamma((A^*)^{-1}) = V\Gamma(A^*) = VU\Gamma(A)^\perp = UV\Gamma(A)^\perp = U(V\Gamma(A))^\perp$

shows that $(A^*)^{-1} = (A^{-1})^*$. Similarly, if $A$ is closable and $\overline{A}$ is injective, then $\overline{A}^{-1} = \overline{A^{-1}}$ by

    $\Gamma(\overline{A}^{-1}) = V\Gamma(\overline{A}) = V\overline{\Gamma(A)} = \overline{\Gamma(A^{-1})}$. □
Corollary 2.5. If $A$ is self-adjoint and injective, then $A^{-1}$ is also self-adjoint.

Proof. Equation (2.20) in the case $A = A^*$ implies $\mathrm{Ran}(A)^\perp = \mathrm{Ker}(A) = \{0\}$ and hence (iii) is applicable. □
If $A$ is densely defined and bounded, we clearly have $D(A^*) = H$ and by Corollary 1.9, $A^* \in L(H)$. In particular, since $A = A^{**}$, we obtain

Theorem 2.6. We have $A \in L(H)$ if and only if $A^* \in L(H)$.
Now we can also generalize Lemma 2.3 to the case of essentially self-adjoint operators.

Lemma 2.7. A symmetric operator $A$ is essentially self-adjoint if and only if one of the following conditions holds for one $z \in \mathbb{C}\setminus\mathbb{R}$:
• $\overline{\mathrm{Ran}(A + z)} = \overline{\mathrm{Ran}(A + z^*)} = H$,
• $\mathrm{Ker}(A^* + z) = \mathrm{Ker}(A^* + z^*) = \{0\}$.
If $A$ is nonnegative, that is, $\langle \psi, A\psi \rangle \ge 0$ for all $\psi \in D(A)$, we can also admit $z \in (0, \infty)$.
Proof. First of all note that by (2.20) the two conditions are equivalent. By taking the closure of $A$, it is no restriction to assume that $A$ is closed. Let $z = x + iy$. From

    $\|(A+z)\psi\|^2 = \|(A+x)\psi + iy\psi\|^2 = \|(A+x)\psi\|^2 + y^2\|\psi\|^2 \ge y^2 \|\psi\|^2$,  (2.44)

we infer that $\mathrm{Ker}(A + z) = \{0\}$ and hence $(A+z)^{-1}$ exists. Moreover, setting $\psi = (A+z)^{-1}\varphi$ ($y \ne 0$) shows $\|(A+z)^{-1}\| \le |y|^{-1}$. Hence $(A+z)^{-1}$ is bounded and closed. Since it is densely defined by assumption, its domain $\mathrm{Ran}(A+z)$ must be equal to $H$. Replacing $z$ by $z^*$, we see $\mathrm{Ran}(A+z^*) = H$ and applying Lemma 2.3 shows that $A$ is self-adjoint. Conversely, if $A = A^*$, the above calculation shows $\mathrm{Ker}(A^* + z) = \{0\}$, which finishes the case $z \in \mathbb{C}\setminus\mathbb{R}$.

The argument for the nonnegative case with $z = \lambda \in (0, \infty)$ is similar using $\lambda\|\psi\|^2 \le \langle \psi, (A+\lambda)\psi \rangle \le \|\psi\| \, \|(A+\lambda)\psi\|$, which shows $\|(A+\lambda)^{-1}\| \le \lambda^{-1}$, $\lambda > 0$. □
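The key estimate (2.44) is easy to confirm numerically in finite dimensions (my own check, not from the text): for a symmetric matrix the resolvent at a nonreal point is bounded by the reciprocal of the imaginary part.

```python
# Check the resolvent bound from (2.44): for a real symmetric matrix A
# and z = x + iy with y != 0, the operator norm of (A + z)^{-1} is at
# most 1/|y| (and in fact 1/dist(-z, spectrum of A)).
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                          # real symmetric (self-adjoint)
z = 0.4 + 0.25j

R = np.linalg.inv(A + z * np.eye(5))       # resolvent (A + z)^{-1}
norm = np.linalg.norm(R, 2)                # operator (spectral) norm

assert norm <= 1 / abs(z.imag) + 1e-12
```

The bound is generally far from sharp unless $-z$ sits close to the spectrum, but it never fails, which is exactly what makes nonreal $z$ so useful in the proof above.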
In addition, we can also prove the closed graph theorem which shows that an unbounded closed operator cannot be defined on the entire Hilbert space.

Theorem 2.8 (Closed graph). Let $H_1$ and $H_2$ be two Hilbert spaces and $A : H_1 \to H_2$ an operator defined on all of $H_1$. Then $A$ is bounded if and only if $\Gamma(A)$ is closed.

Proof. If $A$ is bounded, then it is easy to see that $\Gamma(A)$ is closed. So let us assume that $\Gamma(A)$ is closed. Then $A^*$ is well-defined and for all unit vectors $\varphi \in D(A^*)$ we have that the linear functional $\ell_\varphi(\psi) = \langle A^*\varphi, \psi \rangle$ is pointwise bounded, that is,

    $|\ell_\varphi(\psi)| = |\langle \varphi, A\psi \rangle| \le \|A\psi\|$.

Hence by the uniform boundedness principle there is a constant $C$ such that $\|\ell_\varphi\| = \|A^*\varphi\| \le C$. That is, $A^*$ is bounded and so is $A = A^{**}$. □
Note that since symmetric operators are closable, they are automatically closed if they are defined on the entire Hilbert space.

Theorem 2.9 (Hellinger–Toeplitz). A symmetric operator defined on the entire Hilbert space is bounded.
Problem 2.1 (Jacobi operator). Let $a$ and $b$ be some real-valued sequences in $\ell^\infty(\mathbb{Z})$. Consider the operator

    $J f_n = a_n f_{n+1} + a_{n-1} f_{n-1} + b_n f_n, \quad f \in \ell^2(\mathbb{Z})$.

Show that $J$ is a bounded self-adjoint operator.
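A finite section of such an operator is a tridiagonal matrix, and the expected bound $\|J\| \le 2\sup_n|a_n| + \sup_n|b_n|$ can be checked numerically (my own sketch, intended as a plausibility check rather than a solution of the problem):

```python
# Finite truncation of the Jacobi operator of Problem 2.1: a symmetric
# tridiagonal matrix with off-diagonal entries a_n and diagonal b_n.
# Its operator norm is bounded by 2*sup|a_n| + sup|b_n|.
import numpy as np

rng = np.random.default_rng(4)
N = 50
a = rng.uniform(-1, 1, N - 1)             # a_n, real and bounded
b = rng.uniform(-1, 1, N)                 # b_n, real and bounded
J = np.diag(b) + np.diag(a, 1) + np.diag(a, -1)

assert np.allclose(J, J.T)                # symmetric, hence self-adjoint
assert np.linalg.norm(J, 2) <= 2 * np.abs(a).max() + np.abs(b).max() + 1e-12
```

The bound follows from the triangle inequality applied to the diagonal part and the two weighted shifts, which is also the idea behind the full proof on $\ell^2(\mathbb{Z})$.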
Problem 2.2. Show that $(\alpha A)^* = \alpha^* A^*$ and $(A + B)^* \supseteq A^* + B^*$ (where $D(A^* + B^*) = D(A^*) \cap D(B^*)$) with equality if one operator is bounded. Give an example where equality does not hold.
Problem 2.3. Suppose $AB$ is densely defined. Show that $(AB)^* \supseteq B^* A^*$. Moreover, if $B$ is bounded, then $(BA)^* = A^* B^*$.
Problem 2.4. Show (2.20).
Problem 2.5. An operator is called normal if $\|A\psi\| = \|A^*\psi\|$ for all $\psi \in D(A) = D(A^*)$.
Show that if $A$ is normal, so is $A + z$ for any $z \in \mathbb{C}$.
Problem 2.6. Show that normal operators are closed. (Hint: $A^*$ is closed.)
Problem 2.7. Show that a bounded operator $A$ is normal if and only if $AA^* = A^*A$.
Problem 2.8. Show that the kernel of a closed operator is closed.
Problem 2.9. Show that if A is closed and B bounded, then AB is closed.
2.3. Quadratic forms and the Friedrichs extension
Finally we want to draw some further consequences of Axiom 2 and show that observables correspond to self-adjoint operators. Since self-adjoint operators are already maximal, the difficult part remaining is to show that an observable has at least one self-adjoint extension. There is a good way of doing this for nonnegative operators and hence we will consider this case first.
An operator is called nonnegative (resp. positive) if $\langle \psi, A\psi \rangle \ge 0$ (resp. $> 0$ for $\psi \ne 0$) for all $\psi \in D(A)$. If $A$ is positive, the map $(\varphi, \psi) \mapsto \langle \varphi, A\psi \rangle$ is a scalar product. However, there might be sequences which are Cauchy with respect to this scalar product but not with respect to our original one. To avoid this, we introduce the scalar product

    $\langle \varphi, \psi \rangle_A = \langle \varphi, (A+1)\psi \rangle, \quad A \ge 0$,  (2.45)

defined on $D(A)$, which satisfies $\|\psi\| \le \|\psi\|_A$. Let $H_A$ be the completion of $D(A)$ with respect to the above scalar product. We claim that $H_A$ can be regarded as a subspace of $H$; that is, $D(A) \subseteq H_A \subseteq H$.

If $(\psi_n)$ is a Cauchy sequence in $D(A)$, then it is also Cauchy in $H$ (since $\|\psi\| \le \|\psi\|_A$ by assumption) and hence we can identify the limit in $H_A$ with the limit of $(\psi_n)$ regarded as a sequence in $H$. For this identification to be unique, we need to show that if $(\psi_n) \subseteq D(A)$ is a Cauchy sequence in $H_A$ such that $\|\psi_n\| \to 0$, then $\|\psi_n\|_A \to 0$. This follows from

    $\|\psi_n\|_A^2 = \langle \psi_n, \psi_n - \psi_m \rangle_A + \langle \psi_n, \psi_m \rangle_A \le \|\psi_n\|_A \|\psi_n - \psi_m\|_A + \|\psi_n\| \, \|(A+1)\psi_m\|$  (2.46)

since the right-hand side can be made arbitrarily small choosing $m, n$ large.

Clearly the quadratic form $q_A$ can be extended to every $\psi \in H_A$ by setting

    $q_A(\psi) = \langle \psi, \psi \rangle_A - \|\psi\|^2, \quad \psi \in Q(A) = H_A$.  (2.47)

The set $Q(A)$ is also called the form domain of $A$.
Example. (Multiplication operator) Let $A$ be multiplication by $A(x) \ge 0$ in $L^2(\mathbb{R}^n, d\mu)$. Then

    $Q(A) = D(A^{1/2}) = \{f \in L^2(\mathbb{R}^n, d\mu) \mid A^{1/2} f \in L^2(\mathbb{R}^n, d\mu)\}$  (2.48)

and

    $q_A(f) = \int_{\mathbb{R}^n} A(x) |f(x)|^2 \, d\mu(x)$  (2.49)

(show this). ⋄
Now we come to our extension result. Note that $A + 1$ is injective and the best we can hope for is that for a nonnegative extension $\tilde{A}$, the operator $\tilde{A} + 1$ is a bijection from $D(\tilde{A})$ onto $H$.

Lemma 2.10. Suppose $A$ is a nonnegative operator. Then there is a nonnegative extension $\tilde{A}$ such that $\mathrm{Ran}(\tilde{A} + 1) = H$.
Proof. Let us define an operator $\tilde{A}$ by

    $D(\tilde{A}) = \{\psi \in H_A \mid \exists \tilde{\psi} \in H : \langle \varphi, \psi \rangle_A = \langle \varphi, \tilde{\psi} \rangle, \ \forall \varphi \in H_A\}$,
    $\tilde{A}\psi = \tilde{\psi} - \psi$.

Since $H_A$ is dense, $\tilde{\psi}$ is well-defined. Moreover, it is straightforward to see that $\tilde{A}$ is a nonnegative extension of $A$.

It is also not hard to see that $\mathrm{Ran}(\tilde{A} + 1) = H$. Indeed, for any $\tilde{\psi} \in H$, $\varphi \mapsto \langle \varphi, \tilde{\psi} \rangle$ is a bounded linear functional on $H_A$. Hence there is an element $\psi \in H_A$ such that $\langle \varphi, \tilde{\psi} \rangle = \langle \varphi, \psi \rangle_A$ for all $\varphi \in H_A$. By the definition of $\tilde{A}$, $(\tilde{A} + 1)\psi = \tilde{\psi}$ and hence $\tilde{A} + 1$ is onto. □
Example. Let us take $H = L^2(0, \pi)$ and consider the operator

    $A f = -\frac{d^2}{dx^2} f, \quad D(A) = \{f \in C^2[0, \pi] \mid f(0) = f(\pi) = 0\}$,  (2.50)

which corresponds to the one-dimensional model of a particle confined to a box.

(i) First of all, using integration by parts twice, it is straightforward to check that $A$ is symmetric:

    $\int_0^\pi g(x)^* (-f'')(x) \, dx = \int_0^\pi g'(x)^* f'(x) \, dx = \int_0^\pi (-g'')(x)^* f(x) \, dx$.  (2.51)

Note that the boundary conditions $f(0) = f(\pi) = 0$ are chosen such that the boundary terms occurring from integration by parts vanish. Moreover, the same calculation also shows that $A$ is positive:

    $\int_0^\pi f(x)^* (-f'')(x) \, dx = \int_0^\pi |f'(x)|^2 \, dx > 0, \quad f \ne 0$.  (2.52)
(ii) Next let us show $H_A = \{f \in H^1(0, \pi) \mid f(0) = f(\pi) = 0\}$. In fact, since

    $\langle g, f \rangle_A = \int_0^\pi \left( g'(x)^* f'(x) + g(x)^* f(x) \right) dx$,  (2.53)

we see that $f_n$ is Cauchy in $H_A$ if and only if both $f_n$ and $f_n'$ are Cauchy in $L^2(0, \pi)$. Thus $f_n \to f$ and $f_n' \to g$ in $L^2(0, \pi)$ and $f_n(x) = \int_0^x f_n'(t) \, dt$ implies $f(x) = \int_0^x g(t) \, dt$. Thus $f \in AC[0, \pi]$. Moreover, $f(0) = 0$ is obvious and from $0 = f_n(\pi) = \int_0^\pi f_n'(t) \, dt$ we have $f(\pi) = \lim_{n \to \infty} \int_0^\pi f_n'(t) \, dt = 0$.

So we have $H_A \subseteq \{f \in H^1(0, \pi) \mid f(0) = f(\pi) = 0\}$. To see the converse, approximate $f'$ by smooth functions $g_n$. Using $g_n - \frac{1}{\pi}\int_0^\pi g_n(t) \, dt$ instead of $g_n$, it is no restriction to assume $\int_0^\pi g_n(t) \, dt = 0$. Now define $f_n(x) = \int_0^x g_n(t) \, dt$ and note $f_n \in D(A) \to f$.
(iii) Finally, let us compute the extension $\tilde{A}$. We have $f \in D(\tilde{A})$ if for all $g \in H_A$ there is an $\tilde{f}$ such that $\langle g, f \rangle_A = \langle g, \tilde{f} \rangle$. That is,

    $\int_0^\pi g'(x)^* f'(x) \, dx = \int_0^\pi g(x)^* (\tilde{f}(x) - f(x)) \, dx$.  (2.54)

Integration by parts on the right-hand side shows

    $\int_0^\pi g'(x)^* f'(x) \, dx = -\int_0^\pi g'(x)^* \int_0^x (\tilde{f}(t) - f(t)) \, dt \, dx$  (2.55)

or equivalently

    $\int_0^\pi g'(x)^* \left( f'(x) + \int_0^x (\tilde{f}(t) - f(t)) \, dt \right) dx = 0$.  (2.56)

Now observe $\{g' \mid g \in H_A\} = \{h \in H \mid \int_0^\pi h(t) \, dt = 0\} = \{1\}^\perp$ and thus $f'(x) + \int_0^x (\tilde{f}(t) - f(t)) \, dt \in \{1\}^{\perp\perp} = \mathrm{span}\{1\}$. So we see $f \in H^2(0, \pi) = \{f \in AC[0, \pi] \mid f' \in H^1(0, \pi)\}$ and $\tilde{A}f = -f''$. The converse is easy and hence

    $\tilde{A} f = -\frac{d^2}{dx^2} f, \quad D(\tilde{A}) = \{f \in H^2[0, \pi] \mid f(0) = f(\pi) = 0\}$.  (2.57)
⋄
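The operator (2.57) is the Dirichlet Laplacian on $(0, \pi)$, whose eigenfunctions $\sin(nx)$ have eigenvalues $n^2$; a standard finite-difference discretization reproduces these numerically (my own illustration, not part of the text).

```python
# Finite-difference approximation of -d^2/dx^2 on (0, pi) with Dirichlet
# boundary conditions f(0) = f(pi) = 0.  The low-lying eigenvalues of the
# second-difference matrix converge to n^2, matching sin(n x).
import numpy as np

N = 500
h = np.pi / (N + 1)                       # interior grid spacing
T = (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
ev = np.sort(np.linalg.eigvalsh(T))

for n in (1, 2, 3):
    assert abs(ev[n - 1] - n**2) < 1e-2   # eigenvalues approach n^2
```

The exact discrete eigenvalues are $(4/h^2)\sin^2(nh/2) = n^2(1 + O(h^2))$, so the agreement improves quadratically with the grid resolution.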
Now let us apply this result to operators $A$ corresponding to observables. Since $A$ will, in general, not satisfy the assumptions of our lemma, we will consider $A^2$ instead, which has a symmetric extension $\widetilde{A^2}$ with $\mathrm{Ran}(\widetilde{A^2} + 1) = H$. By our requirement for observables, $A^2$ is maximally defined and hence is equal to this extension. In other words, $\mathrm{Ran}(A^2 + 1) = H$. Moreover, for any $\varphi \in H$ there is a $\psi \in D(A^2)$ such that

    $(A - i)(A + i)\psi = (A + i)(A - i)\psi = \varphi$  (2.58)

and since $(A \pm i)\psi \in D(A)$, we infer $\mathrm{Ran}(A \pm i) = H$. As an immediate consequence we obtain

Corollary 2.11. Observables correspond to self-adjoint operators.
But there is another important consequence of the results which is worthwhile mentioning. A symmetric operator is called semi-bounded, respectively, bounded from below, if

    $q_A(\psi) = \langle \psi, A\psi \rangle \ge \gamma \|\psi\|^2, \quad \gamma \in \mathbb{R}$.  (2.59)

We will write $A \ge \gamma$ for short.

Theorem 2.12 (Friedrichs extension). Let $A$ be a symmetric operator which is bounded from below by $\gamma$. Then there is a self-adjoint extension $\tilde{A}$ which is also bounded from below by $\gamma$ and which satisfies $D(\tilde{A}) \subseteq H_{A-\gamma}$.
Moreover, $\tilde{A}$ is the only self-adjoint extension with $D(\tilde{A}) \subseteq H_{A-\gamma}$.
Proof. If we replace $A$ by $A - \gamma$, then existence follows from Lemma 2.10. To see uniqueness, let $\hat{A}$ be another self-adjoint extension with $D(\hat{A}) \subseteq H_A$. Choose $\varphi \in D(A)$ and $\psi \in D(\hat{A})$. Then

    $\langle \varphi, (\hat{A} + 1)\psi \rangle = \langle (\hat{A} + 1)\varphi, \psi \rangle = \langle (A + 1)\varphi, \psi \rangle = \langle \varphi, \psi \rangle_A$

and by continuity we even get $\langle \varphi, (\hat{A} + 1)\psi \rangle = \langle \varphi, \psi \rangle_A$ for every $\varphi \in H_A$. Hence by the definition of $\tilde{A}$ we have $\psi \in D(\tilde{A})$ and $\tilde{A}\psi = \hat{A}\psi$; that is, $\hat{A} \subseteq \tilde{A}$. But self-adjoint operators are maximal by Corollary 2.2 and thus $\hat{A} = \tilde{A}$. □

Clearly $Q(A) = H_{A-\gamma}$ and $q_A$ can be defined for semi-bounded operators as before by using $\|\psi\|_{A-\gamma}^2 = \langle \psi, (A - \gamma)\psi \rangle + \|\psi\|^2$.
In many physical applications, the converse of this result is also of importance: given a quadratic form $q$, when is there a corresponding operator $A$ such that $q = q_A$?

So let $q : Q \to \mathbb{C}$ be a densely defined quadratic form corresponding to a sesquilinear form $s : Q \times Q \to \mathbb{C}$; that is, $q(\psi) = s(\psi, \psi)$. As with a scalar product, $s$ can be recovered from $q$ via the polarization identity (cf. Problem 0.14). Furthermore, as in Lemma 2.1 one can show that $s$ is symmetric, $s(\varphi, \psi) = s(\psi, \varphi)^*$, if and only if $q$ is real-valued. In this case $q$ will be called hermitian.

A hermitian form $q$ is called nonnegative if $q(\psi) \ge 0$ and semi-bounded if $q(\psi) \ge \gamma\|\psi\|^2$ for some $\gamma \in \mathbb{R}$. As before we can associate a norm $\|\psi\|_q^2 = q(\psi) + (1 - \gamma)\|\psi\|^2$ with any semi-bounded $q$ and look at the completion $H_q$ of $Q$ with respect to this norm. However, since we are not assuming that $q$ is stemming from a semi-bounded operator, we do not know whether $H_q$ can be regarded as a subspace of $H$! Hence we will call $q$ closable if for every Cauchy sequence $\psi_n \in Q$ with respect to $\|.\|_q$, $\|\psi_n\| \to 0$ implies $\|\psi_n\|_q \to 0$. In this case we have $H_q \subseteq H$ and we call the extension of $q$ to $H_q$ the closure of $q$. In particular, we will call $q$ closed if $Q = H_q$.
Example. Let $H = L^2(0, 1)$. Then

    $q(f) = |f(c)|^2, \quad f \in C[0, 1], \ c \in [0, 1]$,

is a well-defined nonnegative form. However, let $f_n(x) = \max(0, 1 - n|x - c|)$. Then $f_n$ is a Cauchy sequence with respect to $\|.\|_q$ such that $\|f_n\| \to 0$ but $\|f_n\|_q \to 1$. Hence $q$ is not closable and hence also not associated with a nonnegative operator. Formally, one can interpret $q$ as the quadratic form of the multiplication operator with the delta distribution at $x = c$. Exercise: Show $H_q = H \oplus \mathbb{C}$. ⋄
From our previous considerations we already know that the quadratic form $q_A$ of a semi-bounded operator $A$ is closable and its closure is associated with a self-adjoint operator. It turns out that the converse is also true (compare also Corollary 1.9 for the case of bounded operators):

Theorem 2.13. To every closed semi-bounded quadratic form $q$ there corresponds a unique self-adjoint operator $A$ such that $Q = Q(A)$ and $q = q_A$.
If $s$ is the sesquilinear form corresponding to $q$ and $\langle \varphi, \psi \rangle_q = s(\varphi, \psi) + (1-\gamma)\langle \varphi, \psi \rangle$ the scalar product of $H_q$, then $A$ is given by

    $D(A) = \{\psi \in H_q \mid \exists \tilde{\psi} \in H : \langle \varphi, \psi \rangle_q = \langle \varphi, \tilde{\psi} \rangle, \ \forall \varphi \in H_q\}$,
    $A\psi = \tilde{\psi} - (1 - \gamma)\psi$.  (2.60)
Proof. Since $H_q$ is dense, $\tilde{\psi}$ and hence $A$ is well-defined. Moreover, replacing $q$ by $q(.) - \gamma\|.\|^2$ and $A$ by $A - \gamma$, it is no restriction to assume $\gamma = 0$. As in the proof of Lemma 2.10 it follows that $A$ is a nonnegative operator, $\|(A+1)\psi\| \ge \|\psi\|$, with $\mathrm{Ran}(A + 1) = H$. In particular, $(A+1)^{-1}$ exists and is bounded. Furthermore, for every $\tilde{\psi}_j \in H$ we can find $\psi_j \in D(A)$ such that $\tilde{\psi}_j = (A + 1)\psi_j$. Finally,

    $\langle (A+1)^{-1}\tilde{\psi}_1, \tilde{\psi}_2 \rangle = \langle \psi_1, (A+1)\psi_2 \rangle = \langle \psi_1, \psi_2 \rangle_q = (\langle \psi_2, \psi_1 \rangle_q)^*$
    $\quad = \langle \psi_2, (A+1)\psi_1 \rangle^* = \langle (A+1)\psi_1, \psi_2 \rangle = \langle \tilde{\psi}_1, (A+1)^{-1}\tilde{\psi}_2 \rangle$

shows that $(A+1)^{-1}$ is self-adjoint and so is $A + 1$ by Corollary 2.5. □
Any subspace $\tilde{Q} \subseteq Q(A)$ which is dense with respect to $\|.\|_A$ is called a form core of $A$ and uniquely determines $A$.
Example. We have already seen that the operator

    $A f = -\frac{d^2}{dx^2} f, \quad D(A) = \{f \in H^2[0, \pi] \mid f(0) = f(\pi) = 0\}$  (2.61)

is associated with the closed form

    $q_A(f) = \int_0^\pi |f'(x)|^2 \, dx, \quad Q(A) = \{f \in H^1[0, \pi] \mid f(0) = f(\pi) = 0\}$.  (2.62)

However, this quadratic form even makes sense on the larger form domain $Q = H^1[0, \pi]$. What is the corresponding self-adjoint operator? (See Problem 2.13.) ⋄
A hermitian form $q$ is called bounded if $|q(\psi)| \le C\|\psi\|^2$, and we call
$$ \|q\| = \sup_{\|\psi\|=1} |q(\psi)| \qquad (2.63) $$
the norm of $q$. In this case the norm $\|.\|_q$ is equivalent to $\|.\|$. Hence $\mathcal{H}_q = \mathcal{H}$ and the corresponding operator is bounded by the Hellinger–Toeplitz theorem (Theorem 2.9). In fact, the operator norm is equal to the norm of $q$ (see also Problem 0.15):

Lemma 2.14. A semi-bounded form $q$ is bounded if and only if the associated operator $A$ is. Moreover, in this case
$$ \|q\| = \|A\|. \qquad (2.64) $$
Proof. Using the polarization identity and the parallelogram law (Problem 0.14), we infer $2\operatorname{Re}\langle\varphi, A\psi\rangle \le (\|\varphi\|^2 + \|\psi\|^2)\sup_{\|\eta\|=1}|\langle\eta, A\eta\rangle|$ and choosing $\varphi = \|A\psi\|^{-1}A\psi$ shows $\|A\| \le \|q\|$. The converse is easy. $\square$

As a consequence we see that for symmetric operators we have
$$ \|A\| = \sup_{\|\psi\|=1}|\langle\psi, A\psi\rangle|, \qquad (2.65) $$
generalizing (2.14) in this case.
Problem 2.10. Let $A$ be invertible. Show $A > 0$ if and only if $A^{-1} > 0$.

Problem 2.11. Let $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in H^2(0,\pi) \mid f(0) = f(\pi) = 0\}$, and let $\psi(x) = \frac{1}{2\sqrt{\pi}}x(\pi - x)$. Find the error in the following argument: Since $A$ is symmetric, we have $1 = \langle A\psi, A\psi\rangle = \langle\psi, A^2\psi\rangle = 0$.

Problem 2.12. Suppose $A$ is a closed operator. Show that $A^*A$ (with $D(A^*A) = \{\psi \in D(A) \mid A\psi \in D(A^*)\}$) is self-adjoint. Show $Q(A^*A) = D(A)$. (Hint: $A^*A \ge 0$.)

Problem 2.13. Suppose $A_0$ can be written as $A_0 = S^*S$. Show that the Friedrichs extension is given by $A = S^*\overline{S}$.
Use this to compute the Friedrichs extension of $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in C^2(0,\pi) \mid f(0) = f(\pi) = 0\}$. Compute also the self-adjoint operator $SS^*$ and its form domain.

Problem 2.14. Use the previous problem to compute the Friedrichs extension $A$ of $A_0 = -\frac{d^2}{dx^2}$, $D(A_0) = C_c^\infty(\mathbb{R})$. Show that $Q(A) = H^1(\mathbb{R})$ and $D(A) = H^2(\mathbb{R})$. (Hint: Section 2.7.)

Problem 2.15. Let $A$ be self-adjoint. Suppose $D \subseteq D(A)$ is a core. Then $D$ is also a form core.

Problem 2.16. Show that (2.65) is wrong if $A$ is not symmetric.
2.4. Resolvents and spectra

Let $A$ be a (densely defined) closed operator. The resolvent set of $A$ is defined by
$$ \rho(A) = \{z \in \mathbb{C} \mid (A - z)^{-1} \in \mathcal{L}(\mathcal{H})\}. \qquad (2.66) $$
More precisely, $z \in \rho(A)$ if and only if $(A - z) : D(A) \to \mathcal{H}$ is bijective and its inverse is bounded. By the closed graph theorem (Theorem 2.8), it suffices to check that $A - z$ is bijective. The complement of the resolvent set is called the spectrum
$$ \sigma(A) = \mathbb{C} \setminus \rho(A) \qquad (2.67) $$
of $A$. In particular, $z \in \sigma(A)$ if $A - z$ has a nontrivial kernel. A vector $\psi \in \operatorname{Ker}(A - z)$ is called an eigenvector and $z$ is called an eigenvalue in this case.
The function
$$ R_A : \rho(A) \to \mathcal{L}(\mathcal{H}), \qquad z \mapsto (A - z)^{-1} \qquad (2.68) $$
is called the resolvent of $A$. Note the convenient formula
$$ R_A(z)^* = ((A - z)^{-1})^* = ((A - z)^*)^{-1} = (A^* - z^*)^{-1} = R_{A^*}(z^*). \qquad (2.69) $$
In particular,
$$ \rho(A^*) = \rho(A)^*. \qquad (2.70) $$
Example. (Multiplication operator) Consider again the multiplication operator
$$ (Af)(x) = A(x)f(x), \qquad D(A) = \{f \in L^2(\mathbb{R}^n, d\mu) \mid Af \in L^2(\mathbb{R}^n, d\mu)\}, \qquad (2.71) $$
given by multiplication with the measurable function $A : \mathbb{R}^n \to \mathbb{C}$. Clearly $(A - z)^{-1}$ is given by the multiplication operator
$$ (A - z)^{-1}f(x) = \frac{1}{A(x) - z}f(x), \qquad D((A - z)^{-1}) = \{f \in L^2(\mathbb{R}^n, d\mu) \mid \tfrac{1}{A - z}f \in L^2(\mathbb{R}^n, d\mu)\} \qquad (2.72) $$
whenever this operator is bounded. But $\|(A - z)^{-1}\| = \|\frac{1}{A - z}\|_\infty \le \varepsilon^{-1}$ is equivalent to $\mu(\{x \mid |A(x) - z| < \varepsilon\}) = 0$ and hence
$$ \rho(A) = \{z \in \mathbb{C} \mid \exists\varepsilon > 0:\ \mu(\{x \mid |A(x) - z| < \varepsilon\}) = 0\}. \qquad (2.73) $$
The spectrum
$$ \sigma(A) = \{z \in \mathbb{C} \mid \forall\varepsilon > 0:\ \mu(\{x \mid |A(x) - z| < \varepsilon\}) > 0\} \qquad (2.74) $$
is also known as the essential range of $A(x)$. Moreover, $z$ is an eigenvalue of $A$ if $\mu(A^{-1}(\{z\})) > 0$ and $\chi_{A^{-1}(\{z\})}$ is a corresponding eigenfunction in this case.
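In the discrete toy case $\mathcal{H} = \ell^2(\{1,\dots,n\})$ with counting measure, a multiplication operator is a diagonal matrix and every singleton carries positive measure, so the essential range is simply the set of attained values. A small numerical sanity check (a sketch; the concrete values are arbitrary):

```python
import numpy as np

# Multiplication by A(x) on l^2 with counting measure: a diagonal matrix.
# Every point carries positive measure, so the essential range is the
# set of attained values.
values = np.array([2.0, -1.0, 2.0, 0.5, 3.0])
A = np.diag(values)

spectrum = np.linalg.eigvalsh(A)
assert set(np.round(spectrum, 12)) == set(values)

# For z away from the essential range the resolvent is bounded, with
# norm 1/dist(z, sigma(A)), as in (2.73).
z = 5.0
resolvent_norm = np.linalg.norm(np.linalg.inv(A - z * np.eye(5)), 2)
assert np.isclose(resolvent_norm, 1 / np.min(np.abs(values - z)))
```
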
Example. (Differential operator) Consider again the differential operator
$$ Af = -i\frac{d}{dx}f, \qquad D(A) = \{f \in AC[0, 2\pi] \mid f' \in L^2,\ f(0) = f(2\pi)\} \qquad (2.75) $$
in $L^2(0, 2\pi)$. We already know that the eigenvalues of $A$ are the integers and that the corresponding normalized eigenfunctions
$$ u_n(x) = \frac{1}{\sqrt{2\pi}}e^{inx} \qquad (2.76) $$
form an orthonormal basis.

To compute the resolvent, we must find the solution of the corresponding inhomogeneous equation $-if'(x) - zf(x) = g(x)$. By the variation of constants formula the solution is given by (this can also be easily verified directly)
$$ f(x) = f(0)e^{izx} + i\int_0^x e^{iz(x-t)}g(t)\,dt. \qquad (2.77) $$
Since $f$ must lie in the domain of $A$, we must have $f(0) = f(2\pi)$ which gives
$$ f(0) = \frac{i}{e^{-2\pi iz} - 1}\int_0^{2\pi} e^{-izt}g(t)\,dt, \qquad z \in \mathbb{C}\setminus\mathbb{Z}. \qquad (2.78) $$
(Since $z \in \mathbb{Z}$ are the eigenvalues, the inverse cannot exist in this case.) Hence
$$ (A - z)^{-1}g(x) = \int_0^{2\pi} G(z, x, t)g(t)\,dt, \qquad (2.79) $$
where
$$ G(z, x, t) = e^{iz(x-t)}\begin{cases} \frac{i}{e^{-2\pi iz} - 1}, & t > x,\\ \frac{i}{1 - e^{2\pi iz}}, & t < x, \end{cases} \qquad z \in \mathbb{C}\setminus\mathbb{Z}. \qquad (2.80) $$
In particular, $\sigma(A) = \mathbb{Z}$.
If $z, z' \in \rho(A)$, we have the first resolvent formula
$$ R_A(z) - R_A(z') = (z - z')R_A(z)R_A(z') = (z - z')R_A(z')R_A(z). \qquad (2.81) $$
In fact,
$$ (A - z)^{-1} - (z - z')(A - z)^{-1}(A - z')^{-1} = (A - z)^{-1}\bigl(1 - (z - A + A - z')(A - z')^{-1}\bigr) = (A - z')^{-1}, \qquad (2.82) $$
which proves the first equality. The second follows after interchanging $z$ and $z'$. Now fix $z' = z_0$ and use (2.81) recursively to obtain
$$ R_A(z) = \sum_{j=0}^{n}(z - z_0)^j R_A(z_0)^{j+1} + (z - z_0)^{n+1}R_A(z_0)^{n+1}R_A(z). \qquad (2.83) $$
The sequence of bounded operators
$$ R_n = \sum_{j=0}^{n}(z - z_0)^j R_A(z_0)^{j+1} \qquad (2.84) $$
converges to a bounded operator if $|z - z_0| < \|R_A(z_0)\|^{-1}$ and clearly we expect $z \in \rho(A)$ and $R_n \to R_A(z)$ in this case. Let $R_\infty = \lim_{n\to\infty}R_n$ and set $\varphi_n = R_n\psi$, $\varphi = R_\infty\psi$ for some $\psi \in \mathcal{H}$. Then a quick calculation shows
$$ AR_n\psi = (A - z_0)R_n\psi + z_0\varphi_n = \psi + (z - z_0)\varphi_{n-1} + z_0\varphi_n. \qquad (2.85) $$
Hence $(\varphi_n, A\varphi_n) \to (\varphi, \psi + z\varphi)$ shows $\varphi \in D(A)$ (since $A$ is closed) and $(A - z)R_\infty\psi = \psi$. Similarly, for $\psi \in D(A)$,
$$ R_nA\psi = \psi + (z - z_0)\varphi_{n-1} + z_0\varphi_n \qquad (2.86) $$
and hence $R_\infty(A - z)\psi = \psi$ after taking the limit. Thus $R_\infty = R_A(z)$ as anticipated.
If $A$ is bounded, a similar argument verifies the Neumann series for the resolvent
$$ R_A(z) = -\sum_{j=0}^{n-1}\frac{A^j}{z^{j+1}} + \frac{1}{z^n}A^nR_A(z) = -\sum_{j=0}^{\infty}\frac{A^j}{z^{j+1}}, \qquad |z| > \|A\|. \qquad (2.87) $$
In summary we have proved the following:

Theorem 2.15. The resolvent set $\rho(A)$ is open and $R_A : \rho(A) \to \mathcal{L}(\mathcal{H})$ is holomorphic; that is, it has an absolutely convergent power series expansion around every point $z_0 \in \rho(A)$. In addition,
$$ \|R_A(z)\| \ge \operatorname{dist}(z, \sigma(A))^{-1} \qquad (2.88) $$
and if $A$ is bounded, we have $\{z \in \mathbb{C} \mid |z| > \|A\|\} \subseteq \rho(A)$.
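Both the first resolvent formula (2.81) and the Neumann series (2.87) are easy to verify numerically for a matrix; the following sketch (matrix, seed, and truncation order are arbitrary choices) does so:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A = (A + A.T) / 2                 # any bounded operator works; symmetric for convenience
z = 2.0 * np.linalg.norm(A, 2)    # |z| > ||A|| guarantees convergence of (2.87)

R = np.linalg.inv(A - z * np.eye(4))                       # resolvent R_A(z)
S = sum(-np.linalg.matrix_power(A, j) / z**(j + 1) for j in range(60))
assert np.allclose(R, S)                                   # Neumann series (2.87)

# first resolvent formula (2.81)
w = 3.0 * np.linalg.norm(A, 2)
Rw = np.linalg.inv(A - w * np.eye(4))
assert np.allclose(R - Rw, (z - w) * R @ Rw)
```
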
As a consequence we obtain the useful

Lemma 2.16. We have $z \in \sigma(A)$ if there is a sequence $\psi_n \in D(A)$ such that $\|\psi_n\| = 1$ and $\|(A - z)\psi_n\| \to 0$. If $z$ is a boundary point of $\rho(A)$, then the converse is also true. Such a sequence is called a Weyl sequence.

Proof. Let $\psi_n$ be a Weyl sequence. Then $z \in \rho(A)$ is impossible by $1 = \|\psi_n\| = \|R_A(z)(A - z)\psi_n\| \le \|R_A(z)\|\,\|(A - z)\psi_n\| \to 0$. Conversely, by (2.88) there is a sequence $z_n \to z$ and corresponding vectors $\varphi_n \in \mathcal{H}$ such that $\|R_A(z_n)\varphi_n\|\,\|\varphi_n\|^{-1} \to \infty$. Let $\psi_n = R_A(z_n)\varphi_n$ and rescale $\varphi_n$ such that $\|\psi_n\| = 1$. Then $\|\varphi_n\| \to 0$ and hence
$$ \|(A - z)\psi_n\| = \|\varphi_n + (z_n - z)\psi_n\| \le \|\varphi_n\| + |z - z_n| \to 0 $$
shows that $\psi_n$ is a Weyl sequence. $\square$
Let us also note the following spectral mapping result.

Lemma 2.17. Suppose $A$ is injective. Then
$$ \sigma(A^{-1})\setminus\{0\} = (\sigma(A)\setminus\{0\})^{-1}. \qquad (2.89) $$
In addition, we have $A\psi = z\psi$ if and only if $A^{-1}\psi = z^{-1}\psi$.

Proof. Suppose $z \in \rho(A)\setminus\{0\}$. Then we claim
$$ R_{A^{-1}}(z^{-1}) = -zAR_A(z) = -z - z^2R_A(z). $$
In fact, the right-hand side is a bounded operator from $\mathcal{H} \to \operatorname{Ran}(A) = D(A^{-1})$ and
$$ (A^{-1} - z^{-1})(-zAR_A(z))\varphi = (-z + A)R_A(z)\varphi = \varphi, \qquad \varphi \in \mathcal{H}. $$
Conversely, if $\psi \in D(A^{-1}) = \operatorname{Ran}(A)$, we have $\psi = A\varphi$ and hence
$$ (-zAR_A(z))(A^{-1} - z^{-1})\psi = AR_A(z)((A - z)\varphi) = A\varphi = \psi. $$
Thus $z^{-1} \in \rho(A^{-1})$. The rest follows after interchanging the roles of $A$ and $A^{-1}$. $\square$
Next, let us characterize the spectra of self-adjoint operators.

Theorem 2.18. Let $A$ be symmetric. Then $A$ is self-adjoint if and only if $\sigma(A) \subseteq \mathbb{R}$, and $(A - E) \ge 0$, $E \in \mathbb{R}$, if and only if $\sigma(A) \subseteq [E, \infty)$. Moreover, $\|R_A(z)\| \le |\operatorname{Im}(z)|^{-1}$ and, if $(A - E) \ge 0$, $\|R_A(\lambda)\| \le |\lambda - E|^{-1}$, $\lambda < E$.

Proof. If $\sigma(A) \subseteq \mathbb{R}$, then $\operatorname{Ran}(A + z) = \mathcal{H}$, $z \in \mathbb{C}\setminus\mathbb{R}$, and hence $A$ is self-adjoint by Lemma 2.7. Conversely, if $A$ is self-adjoint (resp. $A \ge E$), then $R_A(z)$ exists for $z \in \mathbb{C}\setminus\mathbb{R}$ (resp. $z \in \mathbb{C}\setminus[E, \infty)$) and satisfies the given estimates as has been shown in the proof of Lemma 2.7. $\square$
In particular, we obtain (show this!)

Theorem 2.19. Let $A$ be self-adjoint. Then
$$ \inf\sigma(A) = \inf_{\psi \in D(A),\,\|\psi\|=1}\langle\psi, A\psi\rangle \qquad (2.90) $$
and
$$ \sup\sigma(A) = \sup_{\psi \in D(A),\,\|\psi\|=1}\langle\psi, A\psi\rangle. \qquad (2.91) $$
For the eigenvalues and corresponding eigenfunctions we have

Lemma 2.20. Let $A$ be symmetric. Then all eigenvalues are real and eigenvectors corresponding to different eigenvalues are orthogonal.

Proof. If $A\psi_j = \lambda_j\psi_j$, $j = 1, 2$, we have
$$ \lambda_1\|\psi_1\|^2 = \langle\psi_1, \lambda_1\psi_1\rangle = \langle\psi_1, A\psi_1\rangle = \langle A\psi_1, \psi_1\rangle = \langle\lambda_1\psi_1, \psi_1\rangle = \lambda_1^*\|\psi_1\|^2 $$
and
$$ (\lambda_1 - \lambda_2)\langle\psi_1, \psi_2\rangle = \langle A\psi_1, \psi_2\rangle - \langle\psi_1, A\psi_2\rangle = 0, $$
finishing the proof. $\square$
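For a finite Hermitian matrix, Lemma 2.20 (and the variational description of $\inf\sigma(A)$ from Theorem 2.19) can be checked directly; a minimal sketch (the matrix is an arbitrary random example):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = (B + B.conj().T) / 2                   # a symmetric (Hermitian) matrix

eigvals, eigvecs = np.linalg.eigh(A)
assert np.allclose(eigvals.imag, 0)                        # eigenvalues are real
assert np.allclose(eigvecs.conj().T @ eigvecs, np.eye(5))  # eigenvectors orthonormal

# Theorem 2.19: inf of the spectrum as an extremal expectation value
psi = eigvecs[:, 0]
assert np.isclose(psi.conj() @ A @ psi, eigvals.min())
```
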
The result does not imply that two linearly independent eigenfunctions to the same eigenvalue are orthogonal. However, it is no restriction to assume that they are since we can use Gram–Schmidt to find an orthonormal basis for $\operatorname{Ker}(A - \lambda)$. If $\mathcal{H}$ is finite dimensional, we can always find an orthonormal basis of eigenvectors. In the infinite dimensional case this is no longer true in general. However, if there is an orthonormal basis of eigenvectors, then $A$ is essentially self-adjoint.

Theorem 2.21. Suppose $A$ is a symmetric operator which has an orthonormal basis of eigenfunctions $\{\varphi_j\}$. Then $A$ is essentially self-adjoint. In particular, it is essentially self-adjoint on $\operatorname{span}\{\varphi_j\}$.

Proof. Consider the set of all finite linear combinations $\psi = \sum_{j=0}^n c_j\varphi_j$ which is dense in $\mathcal{H}$. Then $\tilde\psi = \sum_{j=0}^n \frac{c_j}{\lambda_j \pm i}\varphi_j \in D(A)$ and $(A \pm i)\tilde\psi = \psi$ shows that $\operatorname{Ran}(A \pm i)$ is dense. $\square$
Similarly, we can characterize the spectra of unitary operators. Recall that a bijection $U$ is called unitary if $\langle U\psi, U\psi\rangle = \langle\psi, U^*U\psi\rangle = \langle\psi, \psi\rangle$. Thus $U$ is unitary if and only if
$$ U^* = U^{-1}. \qquad (2.92) $$

Theorem 2.22. Let $U$ be unitary. Then $\sigma(U) \subseteq \{z \in \mathbb{C} \mid |z| = 1\}$. All eigenvalues have modulus one and eigenvectors corresponding to different eigenvalues are orthogonal.

Proof. Since $\|U\| \le 1$, we have $\sigma(U) \subseteq \{z \in \mathbb{C} \mid |z| \le 1\}$. Moreover, $U^{-1}$ is also unitary and hence $\sigma(U) \subseteq \{z \in \mathbb{C} \mid |z| \ge 1\}$ by Lemma 2.17. If $U\psi_j = z_j\psi_j$, $j = 1, 2$, we have
$$ (z_1 - z_2)\langle\psi_1, \psi_2\rangle = \langle U^*\psi_1, \psi_2\rangle - \langle\psi_1, U\psi_2\rangle = 0 $$
since $U\psi = z\psi$ implies $U^*\psi = U^{-1}\psi = z^{-1}\psi = z^*\psi$. $\square$
Problem 2.17. Suppose $A$ is closed and $B$ bounded:
• Show that $I + B$ has a bounded inverse if $\|B\| < 1$.
• Suppose $A$ has a bounded inverse. Then so does $A + B$ if $\|B\| < \|A^{-1}\|^{-1}$.

Problem 2.18. What is the spectrum of an orthogonal projection?

Problem 2.19. Compute the resolvent of $Af = f'$, $D(A) = \{f \in H^1[0,1] \mid f(0) = 0\}$ and show that unbounded operators can have empty spectrum.

Problem 2.20. Compute the eigenvalues and eigenvectors of $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in H^2(0,\pi) \mid f(0) = f(\pi) = 0\}$. Compute the resolvent of $A$.

Problem 2.21. Find a Weyl sequence for the self-adjoint operator $A = -\frac{d^2}{dx^2}$, $D(A) = H^2(\mathbb{R})$, for $z \in (0, \infty)$. What is $\sigma(A)$? (Hint: Cut off the solutions of $-u''(x) = z\,u(x)$ outside a finite ball.)

Problem 2.22. Suppose $A = \overline{A_0}$. If $\psi_n \in D(A)$ is a Weyl sequence for $z \in \sigma(A)$, then there is also one with $\tilde\psi_n \in D(A_0)$.

Problem 2.23. Suppose $A$ is bounded. Show that the spectra of $AA^*$ and $A^*A$ coincide away from $0$ by showing
$$ R_{AA^*}(z) = \frac{1}{z}\bigl(AR_{A^*A}(z)A^* - 1\bigr), \qquad R_{A^*A}(z) = \frac{1}{z}\bigl(A^*R_{AA^*}(z)A - 1\bigr). \qquad (2.93) $$
2.5. Orthogonal sums of operators

Let $\mathcal{H}_j$, $j = 1, 2$, be two given Hilbert spaces and let $A_j : D(A_j) \to \mathcal{H}_j$ be two given operators. Setting $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, we can define an operator
$$ A = A_1 \oplus A_2, \qquad D(A) = D(A_1) \oplus D(A_2) \qquad (2.94) $$
by setting $A(\psi_1 + \psi_2) = A_1\psi_1 + A_2\psi_2$ for $\psi_j \in D(A_j)$. Clearly $A$ is closed, (essentially) self-adjoint, etc., if and only if both $A_1$ and $A_2$ are. The same considerations apply to countable orthogonal sums. Let $\mathcal{H} = \bigoplus_j \mathcal{H}_j$ and set
$$ A = \bigoplus_j A_j, \qquad D(A) = \{\psi \in \bigoplus_j D(A_j) \mid A\psi \in \mathcal{H}\}. \qquad (2.95) $$
Then we have

Theorem 2.23. Suppose $A_j$ are self-adjoint operators on $\mathcal{H}_j$. Then $A = \bigoplus_j A_j$ is self-adjoint and
$$ R_A(z) = \bigoplus_j R_{A_j}(z), \qquad z \in \rho(A) = \mathbb{C}\setminus\sigma(A), \qquad (2.96) $$
where
$$ \sigma(A) = \overline{\bigcup_j \sigma(A_j)} \qquad (2.97) $$
(the closure can be omitted if there are only finitely many terms).
Proof. Fix $z \notin \overline{\bigcup_j \sigma(A_j)}$ and let $\varepsilon = |\operatorname{Im}(z)|$. Then, by Theorem 2.18, $\|R_{A_j}(z)\| \le \varepsilon^{-1}$ and so $R(z) = \bigoplus_j R_{A_j}(z)$ is a bounded operator with $\|R(z)\| \le \varepsilon^{-1}$ (cf. Problem 2.26). It is straightforward to check that $R(z)$ is in fact the resolvent of $A$ and thus $\sigma(A) \subseteq \mathbb{R}$. In particular, $A$ is self-adjoint by Theorem 2.18. To see that $\sigma(A) \subseteq \overline{\bigcup_j \sigma(A_j)}$, note that the above argument can be repeated with $\varepsilon = \operatorname{dist}(z, \overline{\bigcup_j \sigma(A_j)}) > 0$, which will follow from the spectral theorem (Problem 3.5) to be proven in the next chapter. Conversely, if $z \in \sigma(A_j)$, there is a corresponding Weyl sequence $\psi_n \in D(A_j) \subseteq D(A)$ and hence $z \in \sigma(A)$. $\square$
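For finitely many summands the theorem reduces to the familiar fact that the spectrum of a block-diagonal matrix is the union of the spectra of its blocks; a short sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
A1 = rng.standard_normal((3, 3)); A1 = (A1 + A1.T) / 2
A2 = rng.standard_normal((2, 2)); A2 = (A2 + A2.T) / 2
# the orthogonal sum A1 (+) A2 as a block-diagonal matrix
A = np.block([[A1, np.zeros((3, 2))], [np.zeros((2, 3)), A2]])

spec = np.sort(np.linalg.eigvalsh(A))
union = np.sort(np.concatenate([np.linalg.eigvalsh(A1), np.linalg.eigvalsh(A2)]))
assert np.allclose(spec, union)   # sigma(A) = sigma(A1) U sigma(A2)
```
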
Conversely, given an operator $A$, it might be useful to write $A$ as an orthogonal sum and investigate each part separately.

Let $\mathcal{H}_1 \subseteq \mathcal{H}$ be a closed subspace and let $P_1$ be the corresponding projector. We say that $\mathcal{H}_1$ reduces the operator $A$ if $P_1A \subseteq AP_1$. Note that this is equivalent to $P_1D(A) \subseteq D(A)$ and $AP_1\psi = P_1A\psi$ for $\psi \in D(A)$. Moreover, if we set $\mathcal{H}_2 = \mathcal{H}_1^\perp$, we have $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$ and $P_2 = 1 - P_1$ reduces $A$ as well.
Lemma 2.24. Suppose $\mathcal{H} = \bigoplus_j \mathcal{H}_j$ where each $\mathcal{H}_j$ reduces $A$. Then $A = \bigoplus_j A_j$, where
$$ A_j\psi = A\psi, \qquad D(A_j) = P_jD(A) \subseteq D(A). \qquad (2.98) $$
If $A$ is closable, then $\mathcal{H}_j$ also reduces $\overline{A}$ and
$$ \overline{A} = \bigoplus_j \overline{A_j}. \qquad (2.99) $$

Proof. As already noted, $P_jD(A) \subseteq D(A)$ and thus every $\psi \in D(A)$ can be written as $\psi = \sum_j P_j\psi$; that is, $D(A) = \bigoplus_j D(A_j)$. Moreover, if $\psi \in D(A_j)$, we have $A\psi = AP_j\psi = P_jA\psi \in \mathcal{H}_j$ and thus $A_j : D(A_j) \to \mathcal{H}_j$ which proves the first claim.

Now let us turn to the second claim. Suppose $\psi \in D(\overline{A})$. Then there is a sequence $\psi_n \in D(A)$ such that $\psi_n \to \psi$ and $A\psi_n \to \overline{A}\psi$. Thus $P_j\psi_n \to P_j\psi$ and $AP_j\psi_n = P_jA\psi_n \to P_j\overline{A}\psi$ which shows $P_j\psi \in D(\overline{A})$ and $P_j\overline{A}\psi = \overline{A}P_j\psi$; that is, $\mathcal{H}_j$ reduces $\overline{A}$. Moreover, this argument also shows $P_jD(\overline{A}) \subseteq D(\overline{A_j})$ and the converse follows analogously. $\square$
If $A$ is self-adjoint, then $\mathcal{H}_1$ reduces $A$ if $P_1D(A) \subseteq D(A)$ and $AP_1\psi \in \mathcal{H}_1$ for every $\psi \in D(A)$. In fact, if $\psi \in D(A)$, we can write $\psi = \psi_1 \oplus \psi_2$, with $P_2 = 1 - P_1$ and $\psi_j = P_j\psi \in D(A)$. Since $AP_1\psi = A\psi_1$ and $P_1A\psi = P_1A\psi_1 + P_1A\psi_2 = A\psi_1 + P_1A\psi_2$, we need to show $P_1A\psi_2 = 0$. But this follows since
$$ \langle\varphi, P_1A\psi_2\rangle = \langle AP_1\varphi, \psi_2\rangle = 0 \qquad (2.100) $$
for every $\varphi \in D(A)$.
Problem 2.24. Show $(\bigoplus_j A_j)^* = \bigoplus_j A_j^*$.

Problem 2.25. Show that $A$ defined in (2.95) is closed if and only if all $A_j$ are.

Problem 2.26. Show that for $A$ defined in (2.95), we have $\|A\| = \sup_j \|A_j\|$.
2.6. Self-adjoint extensions

It is safe to skip this entire section on first reading.

In many physical applications a symmetric operator is given. If this operator turns out to be essentially self-adjoint, there is a unique self-adjoint extension and everything is fine. However, if it is not, it is important to find out if there are self-adjoint extensions at all (for physical problems there better be) and to classify them.

In Section 2.2 we saw that $A$ is essentially self-adjoint if $\operatorname{Ker}(A^* - z) = \operatorname{Ker}(A^* - z^*) = \{0\}$ for one $z \in \mathbb{C}\setminus\mathbb{R}$. Hence self-adjointness is related to the dimension of these spaces and one calls the numbers
$$ d_\pm(A) = \dim K_\pm, \qquad K_\pm = \operatorname{Ran}(A \pm i)^\perp = \operatorname{Ker}(A^* \mp i), \qquad (2.101) $$
defect indices of $A$ (we have chosen $z = i$ for simplicity; any other $z \in \mathbb{C}\setminus\mathbb{R}$ would be as good). If $d_-(A) = d_+(A) = 0$, there is one self-adjoint extension of $A$, namely $\overline{A}$. But what happens in the general case? Is there more than one extension, or maybe none at all? These questions can be answered by virtue of the Cayley transform
$$ V = (A - i)(A + i)^{-1} : \operatorname{Ran}(A + i) \to \operatorname{Ran}(A - i). \qquad (2.102) $$
Theorem 2.25. The Cayley transform is a bijection from the set of all symmetric operators $A$ to the set of all isometric operators $V$ (i.e., $\|V\varphi\| = \|\varphi\|$ for all $\varphi \in D(V)$) for which $\operatorname{Ran}(1 - V)$ is dense.

Proof. Since $A$ is symmetric, we have $\|(A \pm i)\psi\|^2 = \|A\psi\|^2 + \|\psi\|^2$ for all $\psi \in D(A)$ by a straightforward computation. Thus for every $\varphi = (A + i)\psi \in D(V) = \operatorname{Ran}(A + i)$ we have
$$ \|V\varphi\| = \|(A - i)\psi\| = \|(A + i)\psi\| = \|\varphi\|. $$
Next observe
$$ 1 \pm V = ((A + i) \pm (A - i))(A + i)^{-1} = \begin{cases} 2A(A + i)^{-1},\\ 2i(A + i)^{-1}, \end{cases} $$
which shows that $\operatorname{Ran}(1 - V) = D(A)$ is dense and
$$ A = i(1 + V)(1 - V)^{-1}. $$
Conversely, let $V$ be given and use the last equation to define $A$. Since $V$ is isometric, we have $\langle(1 \pm V)\varphi, (1 \mp V)\varphi\rangle = \pm 2i\operatorname{Im}\langle V\varphi, \varphi\rangle$ for all $\varphi \in D(V)$ by a straightforward computation. Thus for every $\psi = (1 - V)\varphi \in D(A) = \operatorname{Ran}(1 - V)$ we have
$$ \langle A\psi, \psi\rangle = -i\langle(1 + V)\varphi, (1 - V)\varphi\rangle = i\langle(1 - V)\varphi, (1 + V)\varphi\rangle = \langle\psi, A\psi\rangle; $$
that is, $A$ is symmetric. Finally observe
$$ A \pm i = i((1 + V) \pm (1 - V))(1 - V)^{-1} = \begin{cases} 2i(1 - V)^{-1},\\ 2iV(1 - V)^{-1}, \end{cases} $$
which shows that $A$ is the Cayley transform of $V$ and finishes the proof. $\square$
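For a (bounded) self-adjoint matrix both the Cayley transform and its inverse can be computed explicitly; the following sketch checks that $V$ is unitary and that $A$ is recovered:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2   # self-adjoint, defect indices (0,0)
I = np.eye(4)

V = (A - 1j * I) @ np.linalg.inv(A + 1j * I)         # Cayley transform (2.102)
assert np.allclose(V.conj().T @ V, I)                # V is unitary

A_back = 1j * (I + V) @ np.linalg.inv(I - V)         # inverse Cayley transform
assert np.allclose(A_back, A)
```
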
Thus $A$ is self-adjoint if and only if its Cayley transform $V$ is unitary. Moreover, finding a self-adjoint extension of $A$ is equivalent to finding a unitary extension of $V$ and this in turn is equivalent to (taking the closure and) finding a unitary operator from $D(V)^\perp$ to $\operatorname{Ran}(V)^\perp$. This is possible if and only if both spaces have the same dimension, that is, if and only if $d_+(A) = d_-(A)$.
Theorem 2.26. A symmetric operator has self-adjoint extensions if and only if its defect indices are equal.

In this case let $A_1$ be a self-adjoint extension and $V_1$ its Cayley transform. Then
$$ D(A_1) = D(A) + (1 - V_1)K_+ = \{\psi + \varphi_+ - V_1\varphi_+ \mid \psi \in D(A),\ \varphi_+ \in K_+\} \qquad (2.103) $$
and
$$ A_1(\psi + \varphi_+ - V_1\varphi_+) = A\psi + i\varphi_+ + iV_1\varphi_+. \qquad (2.104) $$
Moreover,
$$ (A_1 + i)^{-1} = (A + i)^{-1} \oplus \frac{-i}{2}\sum_j \langle\varphi_j^+, .\rangle(\varphi_j^+ - \varphi_j^-), \qquad (2.105) $$
where $\{\varphi_j^+\}$ is an orthonormal basis for $K_+$ and $\varphi_j^- = V_1\varphi_j^+$.
Proof. From the proof of the previous theorem we know that $D(A_1) = \operatorname{Ran}(1 - V_1) = \operatorname{Ran}(1 - V) + (1 - V_1)K_+ = D(A) + (1 - V_1)K_+$. Moreover,
$$ A_1(\psi + \varphi_+ - V_1\varphi_+) = A\psi + i(1 + V_1)(1 - V_1)^{-1}(1 - V_1)\varphi_+ = A\psi + i(1 + V_1)\varphi_+. $$
Similarly, $\operatorname{Ran}(A_1 \pm i) = \operatorname{Ran}(A \pm i) \oplus K_\pm$ and $(A_1 + i)^{-1} = \frac{-i}{2}(1 - V_1)$, respectively, $(A_1 - i)^{-1} = \frac{i}{2}(1 - V_1^{-1})$, on $K_\pm$. $\square$
Note that instead of $z = i$ we could use $V(z) = (A + z^*)(A + z)^{-1}$ for any $z \in \mathbb{C}\setminus\mathbb{R}$. We remark that in this case one can show that the defect indices are independent of $z \in \mathbb{C}_+ = \{z \in \mathbb{C} \mid \operatorname{Im}(z) > 0\}$.
Example. Recall the operator $A = -i\frac{d}{dx}$, $D(A) = \{f \in H^1(0, 2\pi) \mid f(0) = f(2\pi) = 0\}$ with adjoint $A^* = -i\frac{d}{dx}$, $D(A^*) = H^1(0, 2\pi)$.

Clearly
$$ K_\pm = \operatorname{span}\{e^{\mp x}\} \qquad (2.106) $$
is one-dimensional and hence all unitary maps are of the form
$$ V_\theta e^{-x} = e^{i\theta}e^{x - 2\pi}, \qquad \theta \in [0, 2\pi). \qquad (2.107) $$
The functions in the domain of the corresponding operator $A_\theta$ are given by
$$ f_\theta(x) = f(x) + \beta(e^{-x} - e^{i\theta}e^{x - 2\pi}), \qquad f \in D(A),\ \beta \in \mathbb{C}. \qquad (2.108) $$
In particular, $f_\theta$ satisfies
$$ f_\theta(2\pi) = e^{i\tilde\theta}f_\theta(0), \qquad e^{i\tilde\theta} = \frac{e^{-2\pi} - e^{i\theta}}{1 - e^{i\theta}e^{-2\pi}}, \qquad (2.109) $$
and thus we have
$$ D(A_\theta) = \{f \in H^1(0, 2\pi) \mid f(2\pi) = e^{i\tilde\theta}f(0)\}. \qquad (2.110) $$
Concerning closures, we can combine the fact that a bounded operator is closed if and only if its domain is closed with item (iii) from Lemma 2.4 to obtain

Lemma 2.27. The following items are equivalent.
• $A$ is closed.
• $D(V) = \operatorname{Ran}(A + i)$ is closed.
• $\operatorname{Ran}(V) = \operatorname{Ran}(A - i)$ is closed.
• $V$ is closed.
Next, we give a useful criterion for the existence of self-adjoint extensions. A conjugate linear map $C : \mathcal{H} \to \mathcal{H}$ is called a conjugation if it satisfies $C^2 = I$ and $\langle C\psi, C\varphi\rangle = \langle\varphi, \psi\rangle$. The prototypical example is, of course, complex conjugation $C\psi = \psi^*$. An operator $A$ is called C-real if
$$ CD(A) \subseteq D(A), \quad\text{and}\quad AC\psi = CA\psi, \quad \psi \in D(A). \qquad (2.111) $$
Note that in this case $CD(A) = D(A)$, since $D(A) = C^2D(A) \subseteq CD(A)$.

Theorem 2.28. Suppose the symmetric operator $A$ is C-real. Then its defect indices are equal.

Proof. Let $\{\varphi_j\}$ be an orthonormal set in $\operatorname{Ran}(A + i)^\perp$. Then $\{C\varphi_j\}$ is an orthonormal set in $\operatorname{Ran}(A - i)^\perp$. Hence $\{\varphi_j\}$ is an orthonormal basis for $\operatorname{Ran}(A + i)^\perp$ if and only if $\{C\varphi_j\}$ is an orthonormal basis for $\operatorname{Ran}(A - i)^\perp$. Hence the two spaces have the same dimension. $\square$
Finally, we note the following useful formula for the difference of resolvents of self-adjoint extensions.

Lemma 2.29. If $A_j$, $j = 1, 2$, are self-adjoint extensions of $A$ and if $\{\varphi_j(z)\}$ is an orthonormal basis for $\operatorname{Ker}(A^* - z)$, then
$$ (A_1 - z)^{-1} - (A_2 - z)^{-1} = \sum_{j,k}(\alpha^1_{jk}(z) - \alpha^2_{jk}(z))\langle\varphi_j(z^*), .\rangle\varphi_k(z), \qquad (2.112) $$
where
$$ \alpha^l_{jk}(z) = \langle\varphi_k(z), (A_l - z)^{-1}\varphi_j(z^*)\rangle. \qquad (2.113) $$
Proof. First observe that $((A_1 - z)^{-1} - (A_2 - z)^{-1})\varphi$ is zero for every $\varphi \in \operatorname{Ran}(A - z)$. Hence it suffices to consider vectors of the form $\varphi = \sum_j \langle\varphi_j(z^*), \varphi\rangle\varphi_j(z^*) \in \operatorname{Ran}(A - z)^\perp = \operatorname{Ker}(A^* - z^*)$. Hence we have
$$ (A_1 - z)^{-1} - (A_2 - z)^{-1} = \sum_j \langle\varphi_j(z^*), .\rangle\psi_j(z), $$
where
$$ \psi_j(z) = \bigl((A_1 - z)^{-1} - (A_2 - z)^{-1}\bigr)\varphi_j(z^*). $$
Now computing the adjoint once using $((A_l - z)^{-1})^* = (A_l - z^*)^{-1}$ and once using $(\sum_j \langle\tilde\psi_j, .\rangle\psi_j)^* = \sum_j \langle\psi_j, .\rangle\tilde\psi_j$, we obtain
$$ \sum_j \langle\psi_j(z), .\rangle\varphi_j(z^*) = \sum_j \langle\varphi_j(z), .\rangle\psi_j(z^*). $$
Replacing $z$ by $z^*$ and evaluating at $\varphi_k(z^*)$ implies
$$ \psi_k(z) = \sum_j \langle\psi_j(z^*), \varphi_k(z^*)\rangle\varphi_j(z) = \sum_j (\alpha^1_{kj}(z) - \alpha^2_{kj}(z))\varphi_j(z) $$
and finishes the proof. $\square$
Problem 2.27. Compute the defect indices of
$$ A_0 = i\frac{d}{dx}, \qquad D(A_0) = C_c^\infty((0, \infty)). $$
Can you give a self-adjoint extension of $A_0$?

Problem 2.28. Let $A_1$ be a self-adjoint extension of $A$ and suppose $\psi \in \operatorname{Ker}(A^* - z_0)$. Show that $\psi(z) = \psi + (z - z_0)(A_1 - z)^{-1}\psi \in \operatorname{Ker}(A^* - z)$.
2.7. Appendix: Absolutely continuous functions

Let $(a, b) \subseteq \mathbb{R}$ be some interval. We denote by
$$ AC(a, b) = \{f \in C(a, b) \mid f(x) = f(c) + \int_c^x g(t)\,dt,\ c \in (a, b),\ g \in L^1_{loc}(a, b)\} \qquad (2.114) $$
the set of all absolutely continuous functions. That is, $f$ is absolutely continuous if and only if it can be written as the integral of some locally integrable function. Note that $AC(a, b)$ is a vector space.

By Corollary A.36, $f(x) = f(c) + \int_c^x g(t)\,dt$ is differentiable a.e. (with respect to Lebesgue measure) and $f'(x) = g(x)$. In particular, $g$ is determined uniquely a.e.

If $[a, b]$ is a compact interval, we set
$$ AC[a, b] = \{f \in AC(a, b) \mid g \in L^1(a, b)\} \subseteq C[a, b]. \qquad (2.115) $$
If $f, g \in AC[a, b]$, we have the formula of partial integration (Problem 2.29)
$$ \int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx, \qquad (2.116) $$
which also implies that the product rule holds for absolutely continuous functions.
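Formula (2.116) is easy to test numerically for smooth representatives; a sketch using a composite trapezoidal rule (the functions $\sin x$ and $x^2$ are arbitrary choices):

```python
import numpy as np

def trap(y, x):
    # composite trapezoidal rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

a, b = 0.0, 1.0
x = np.linspace(a, b, 200001)
f, fp = np.sin(x), np.cos(x)    # f and its derivative f'
g, gp = x**2, 2 * x             # g and g'

# integration by parts (2.116)
lhs = trap(f * gp, x)
rhs = np.sin(b) * b**2 - np.sin(a) * a**2 - trap(fp * g, x)
assert abs(lhs - rhs) < 1e-9
```
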
We set
$$ H^m(a, b) = \{f \in L^2(a, b) \mid f^{(j)} \in AC(a, b),\ f^{(j+1)} \in L^2(a, b),\ 0 \le j \le m - 1\}. \qquad (2.117) $$
Then we have

Lemma 2.30. Suppose $f \in H^m(a, b)$, $m \ge 1$. Then $f$ is bounded and $\lim_{x\downarrow a} f^{(j)}(x)$, respectively, $\lim_{x\uparrow b} f^{(j)}(x)$, exists for $0 \le j \le m - 1$. Moreover, the limit is zero if the endpoint is infinite.

Proof. If the endpoint is finite, then $f^{(j+1)}$ is integrable near this endpoint and hence the claim follows. If the endpoint is infinite, note that
$$ |f^{(j)}(x)|^2 = |f^{(j)}(c)|^2 + 2\int_c^x \operatorname{Re}\bigl(f^{(j)}(t)^* f^{(j+1)}(t)\bigr)\,dt $$
shows that the limit exists (dominated convergence). Since $f^{(j)}$ is square integrable, the limit must be zero. $\square$

Let me remark that it suffices to check that the function plus the highest derivative are in $L^2$; the lower derivatives are then automatically in $L^2$. That is,
$$ H^m(a, b) = \{f \in L^2(a, b) \mid f^{(j)} \in AC(a, b),\ 0 \le j \le m - 1,\ f^{(m)} \in L^2(a, b)\}. \qquad (2.118) $$
For a finite endpoint this is straightforward. For an infinite endpoint this can also be shown directly, but it is much easier to use the Fourier transform (compare Section 7.1).
Problem 2.29. Show (2.116). (Hint: Fubini.)

Problem 2.30. A function $u \in L^1(0, 1)$ is called weakly differentiable if for some $v \in L^1(0, 1)$ we have
$$ \int_0^1 v(x)\varphi(x)\,dx = -\int_0^1 u(x)\varphi'(x)\,dx $$
for all test functions $\varphi \in C_c^\infty(0, 1)$. Show that $u$ is weakly differentiable if and only if $u$ is absolutely continuous and $u' = v$ in this case. (Hint: You will need that $\int_0^1 u(t)\varphi'(t)\,dt = 0$ for all $\varphi \in C_c^\infty(0, 1)$ if and only if $u$ is constant. To see this, choose some $\varphi_0 \in C_c^\infty(0, 1)$ with $I(\varphi_0) = \int_0^1 \varphi_0(t)\,dt = 1$. Then invoke Lemma 0.37 and use that every $\varphi \in C_c^\infty(0, 1)$ can be written as
$$ \varphi(t) = \Phi'(t) + I(\varphi)\varphi_0(t) \quad\text{with}\quad \Phi(t) = \int_0^t \varphi(s)\,ds - I(\varphi)\int_0^t \varphi_0(s)\,ds.) $$
Problem 2.31. Show that $H^1(a, b)$ together with the norm
$$ \|f\|_{2,1}^2 = \int_a^b |f(t)|^2\,dt + \int_a^b |f'(t)|^2\,dt $$
is a Hilbert space.

Problem 2.32. What is the closure of $C_0^\infty(a, b)$ in $H^1(a, b)$? (Hint: Start with the case where $(a, b)$ is finite.)

Problem 2.33. Show that if $f \in AC(a, b)$ and $f' \in L^p(a, b)$, then $f$ is Hölder continuous:
$$ |f(x) - f(y)| \le \|f'\|_p\,|x - y|^{1 - \frac{1}{p}}. $$
Chapter 3
The spectral theorem
The time evolution of a quantum mechanical system is governed by the Schrödinger equation
$$ i\frac{d}{dt}\psi(t) = H\psi(t). \qquad (3.1) $$
If $\mathcal{H} = \mathbb{C}^n$ and $H$ is hence a matrix, this system of ordinary differential equations is solved by the matrix exponential
$$ \psi(t) = \exp(-itH)\psi(0). \qquad (3.2) $$
This matrix exponential can be defined by a convergent power series
$$ \exp(-itH) = \sum_{n=0}^\infty \frac{(-it)^n}{n!}H^n. \qquad (3.3) $$
For this approach the boundedness of $H$ is crucial, which might not be the case for a quantum system. However, the best way to compute the matrix exponential and to understand the underlying dynamics is to diagonalize $H$. But how do we diagonalize a self-adjoint operator? The answer is known as the spectral theorem.
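For a matrix both routes — the truncated power series (3.3) and diagonalization — can be compared directly; a sketch (the Hamiltonian and the time $t$ are arbitrary choices):

```python
import math
import numpy as np

rng = np.random.default_rng(5)
H = rng.standard_normal((4, 4)); H = (H + H.T) / 2    # self-adjoint "Hamiltonian"
t = 0.7

# power series (3.3), truncated at 40 terms
U_series = sum((-1j * t)**n / math.factorial(n) * np.linalg.matrix_power(H, n)
               for n in range(40))

# diagonalization: H = Q diag(E) Q^*  =>  exp(-itH) = Q diag(exp(-itE)) Q^*
E, Q = np.linalg.eigh(H)
U_diag = Q @ np.diag(np.exp(-1j * t * E)) @ Q.conj().T

assert np.allclose(U_series, U_diag)
assert np.allclose(U_diag.conj().T @ U_diag, np.eye(4))   # time evolution is unitary
```
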
3.1. The spectral theorem
In this section we want to address the problem of defining functions of a self-adjoint operator $A$ in a natural way, that is, such that
$$ (f+g)(A) = f(A) + g(A), \qquad (fg)(A) = f(A)g(A), \qquad (f^*)(A) = f(A)^*. \qquad (3.4) $$
As long as $f$ and $g$ are polynomials, no problems arise. If we want to extend this definition to a larger class of functions, we will need to perform some limiting procedure. Hence we could consider convergent power series or equip the space of polynomials on the spectrum with the sup norm. In both cases this only works if the operator $A$ is bounded. To overcome this limitation, we will use characteristic functions $\chi_\Omega(A)$ instead of powers $A^j$. Since $\chi_\Omega(\lambda)^2 = \chi_\Omega(\lambda)$, the corresponding operators should be orthogonal projections. Moreover, we should also have $\chi_{\mathbb{R}}(A) = I$ and $\chi_\Omega(A) = \sum_{j=1}^n \chi_{\Omega_j}(A)$ for any finite union $\Omega = \bigcup_{j=1}^n \Omega_j$ of disjoint sets. The only remaining problem is of course the definition of $\chi_\Omega(A)$. However, we will defer this problem and begin by developing a functional calculus for a family of characteristic functions $\chi_\Omega(A)$.
Denote the Borel sigma algebra of $\mathbb{R}$ by $\mathfrak{B}$. A projection-valued measure is a map
$$ P : \mathfrak{B} \to \mathcal{L}(\mathcal{H}), \qquad \Omega \mapsto P(\Omega), \qquad (3.5) $$
from the Borel sets to the set of orthogonal projections, that is, $P(\Omega)^* = P(\Omega)$ and $P(\Omega)^2 = P(\Omega)$, such that the following two conditions hold:
(i) $P(\mathbb{R}) = I$.
(ii) If $\Omega = \bigcup_n \Omega_n$ with $\Omega_n \cap \Omega_m = \emptyset$ for $n \ne m$, then $\sum_n P(\Omega_n)\psi = P(\Omega)\psi$ for every $\psi \in \mathcal{H}$ (strong $\sigma$-additivity).

Note that we require strong convergence, $\sum_n P(\Omega_n)\psi = P(\Omega)\psi$, rather than norm convergence, $\sum_n P(\Omega_n) = P(\Omega)$. In fact, norm convergence does not even hold in the simplest case where $\mathcal{H} = L^2(I)$ and $P(\Omega) = \chi_\Omega$ (multiplication operator), since for a multiplication operator the norm is just the sup norm of the function. Furthermore, it even suffices to require weak convergence, since w-$\lim P_n = P$ for some orthogonal projections implies s-$\lim P_n = P$ by $\langle\psi, P_n\psi\rangle = \langle\psi, P_n^2\psi\rangle = \langle P_n\psi, P_n\psi\rangle = \|P_n\psi\|^2$ together with Lemma 1.12 (iv).
Example. Let $\mathcal{H} = \mathbb{C}^n$ and let $A \in GL(n)$ be some symmetric matrix. Let $\lambda_1, \dots, \lambda_m$ be its (distinct) eigenvalues and let $P_j$ be the projections onto the corresponding eigenspaces. Then
$$ P_A(\Omega) = \sum_{\{j \mid \lambda_j \in \Omega\}} P_j \qquad (3.6) $$
is a projection-valued measure.

Example. Let $\mathcal{H} = L^2(\mathbb{R})$ and let $f$ be a real-valued measurable function. Then
$$ P(\Omega) = \chi_{f^{-1}(\Omega)} \qquad (3.7) $$
is a projection-valued measure (Problem 3.3).
It is straightforward to verify that any projection-valued measure satisfies
$$ P(\emptyset) = 0, \qquad P(\mathbb{R}\setminus\Omega) = I - P(\Omega), \qquad (3.8) $$
and
$$ P(\Omega_1 \cup \Omega_2) + P(\Omega_1 \cap \Omega_2) = P(\Omega_1) + P(\Omega_2). \qquad (3.9) $$
Moreover, we also have
$$ P(\Omega_1)P(\Omega_2) = P(\Omega_1 \cap \Omega_2). \qquad (3.10) $$
Indeed, first suppose $\Omega_1 \cap \Omega_2 = \emptyset$. Then, taking the square of (3.9), we infer
$$ P(\Omega_1)P(\Omega_2) + P(\Omega_2)P(\Omega_1) = 0. \qquad (3.11) $$
Multiplying this equation from the right by $P(\Omega_2)$ shows that $P(\Omega_1)P(\Omega_2) = -P(\Omega_2)P(\Omega_1)P(\Omega_2)$ is self-adjoint and thus $P(\Omega_1)P(\Omega_2) = P(\Omega_2)P(\Omega_1) = 0$. For the general case $\Omega_1 \cap \Omega_2 \ne \emptyset$ we now have
$$ P(\Omega_1)P(\Omega_2) = (P(\Omega_1\setminus\Omega_2) + P(\Omega_1\cap\Omega_2))(P(\Omega_2\setminus\Omega_1) + P(\Omega_1\cap\Omega_2)) = P(\Omega_1\cap\Omega_2) \qquad (3.12) $$
as stated.
Moreover, a projection-valued measure is monotone, that is,
$$ \Omega_1 \subseteq \Omega_2 \ \Rightarrow\ P(\Omega_1) \le P(\Omega_2), \qquad (3.13) $$
in the sense that $\langle\psi, P(\Omega_1)\psi\rangle \le \langle\psi, P(\Omega_2)\psi\rangle$ or equivalently $\operatorname{Ran}(P(\Omega_1)) \subseteq \operatorname{Ran}(P(\Omega_2))$ (cf. Problem 1.7). As a useful consequence note that $P(\Omega_2) = 0$ implies $P(\Omega_1) = 0$ for every subset $\Omega_1 \subseteq \Omega_2$.

To every projection-valued measure there corresponds a resolution of the identity
$$ P(\lambda) = P((-\infty, \lambda]) \qquad (3.14) $$
which has the properties (Problem 3.4):
(i) $P(\lambda)$ is an orthogonal projection.
(ii) $P(\lambda_1) \le P(\lambda_2)$ for $\lambda_1 \le \lambda_2$.
(iii) s-$\lim_{\lambda_n\downarrow\lambda} P(\lambda_n) = P(\lambda)$ (strong right continuity).
(iv) s-$\lim_{\lambda\to-\infty} P(\lambda) = 0$ and s-$\lim_{\lambda\to+\infty} P(\lambda) = I$.
As before, strong right continuity is equivalent to weak right continuity.

Picking $\psi \in \mathcal{H}$, we obtain a finite Borel measure $\mu_\psi(\Omega) = \langle\psi, P(\Omega)\psi\rangle = \|P(\Omega)\psi\|^2$ with $\mu_\psi(\mathbb{R}) = \|\psi\|^2 < \infty$. The corresponding distribution function is given by $\mu_\psi(\lambda) = \langle\psi, P(\lambda)\psi\rangle$ and since for every distribution function there is a unique Borel measure (Theorem A.2), for every resolution of the identity there is a unique projection-valued measure.

Using the polarization identity (2.16), we also have the complex Borel measures
$$ \mu_{\varphi,\psi}(\Omega) = \langle\varphi, P(\Omega)\psi\rangle = \frac{1}{4}\bigl(\mu_{\varphi+\psi}(\Omega) - \mu_{\varphi-\psi}(\Omega) + i\,\mu_{\varphi-i\psi}(\Omega) - i\,\mu_{\varphi+i\psi}(\Omega)\bigr). \qquad (3.15) $$
Note also that, by Cauchy–Schwarz, $|\mu_{\varphi,\psi}(\Omega)| \le \|\varphi\|\,\|\psi\|$.
Now let us turn to integration with respect to our projection-valued measure. For any simple function $f = \sum_{j=1}^n \alpha_j\chi_{\Omega_j}$ (where $\Omega_j = f^{-1}(\alpha_j)$) we set
$$ P(f) \equiv \int_{\mathbb{R}} f(\lambda)\,dP(\lambda) = \sum_{j=1}^n \alpha_j P(\Omega_j). \qquad (3.16) $$
In particular, $P(\chi_\Omega) = P(\Omega)$. Then $\langle\varphi, P(f)\psi\rangle = \sum_j \alpha_j\,\mu_{\varphi,\psi}(\Omega_j)$ shows
$$ \langle\varphi, P(f)\psi\rangle = \int_{\mathbb{R}} f(\lambda)\,d\mu_{\varphi,\psi}(\lambda) \qquad (3.17) $$
and, by linearity of the integral, the operator $P$ is a linear map from the set of simple functions into the set of bounded linear operators on $\mathcal{H}$. Moreover, $\|P(f)\psi\|^2 = \sum_j |\alpha_j|^2\mu_\psi(\Omega_j)$ (the sets $\Omega_j$ are disjoint) shows
$$ \|P(f)\psi\|^2 = \int_{\mathbb{R}} |f(\lambda)|^2\,d\mu_\psi(\lambda). \qquad (3.18) $$
Equipping the set of simple functions with the sup norm, we infer
$$ \|P(f)\psi\| \le \|f\|_\infty\|\psi\|, \qquad (3.19) $$
which implies that $P$ has norm one. Since the simple functions are dense in the Banach space of bounded Borel functions $B(\mathbb{R})$, there is a unique extension of $P$ to a bounded linear operator $P : B(\mathbb{R}) \to \mathcal{L}(\mathcal{H})$ (whose norm is one) from the bounded Borel functions on $\mathbb{R}$ (with sup norm) to the set of bounded linear operators on $\mathcal{H}$. In particular, (3.17) and (3.18) remain true.
There is some additional structure behind this extension. Recall that the set $\mathcal{L}(\mathcal{H})$ of all bounded linear mappings on $\mathcal{H}$ forms a $C^*$ algebra. A $C^*$ algebra homomorphism $\phi$ is a linear map between two $C^*$ algebras which respects both the multiplication and the adjoint; that is, $\phi(ab) = \phi(a)\phi(b)$ and $\phi(a^*) = \phi(a)^*$.

Theorem 3.1. Let $P(\Omega)$ be a projection-valued measure on $\mathcal{H}$. Then the operator
$$ P : B(\mathbb{R}) \to \mathcal{L}(\mathcal{H}), \qquad f \mapsto \int_{\mathbb{R}} f(\lambda)\,dP(\lambda) \qquad (3.20) $$
is a $C^*$ algebra homomorphism with norm one such that
$$ \langle P(g)\varphi, P(f)\psi\rangle = \int_{\mathbb{R}} g^*(\lambda)f(\lambda)\,d\mu_{\varphi,\psi}(\lambda). \qquad (3.21) $$
In addition, if $f_n(\lambda) \to f(\lambda)$ pointwise and if the sequence $\sup_{\lambda\in\mathbb{R}}|f_n(\lambda)|$ is bounded, then $P(f_n) \to P(f)$ strongly.

Proof. The properties $P(1) = I$, $P(f^*) = P(f)^*$, and $P(fg) = P(f)P(g)$ are straightforward for simple functions $f$. For general $f$ they follow from continuity. Hence $P$ is a $C^*$ algebra homomorphism.

Equation (3.21) is a consequence of $\langle P(g)\varphi, P(f)\psi\rangle = \langle\varphi, P(g^*f)\psi\rangle$.

The last claim follows from the dominated convergence theorem and (3.18). $\square$
As a consequence of (3.21), observe
$$ \mu_{P(g)\varphi, P(f)\psi}(\Omega) = \langle P(g)\varphi, P(\Omega)P(f)\psi\rangle = \int_\Omega g^*(\lambda)f(\lambda)\,d\mu_{\varphi,\psi}(\lambda), \qquad (3.22) $$
which implies
$$ d\mu_{P(g)\varphi, P(f)\psi} = g^*f\,d\mu_{\varphi,\psi}. \qquad (3.23) $$

Example. Let $\mathcal{H} = \mathbb{C}^n$ and $A = A^* \in GL(n)$, respectively $P_A$, as in the previous example. Then
$$ P_A(f) = \sum_{j=1}^m f(\lambda_j)P_j. \qquad (3.24) $$
In particular, $P_A(f) = A$ for $f(\lambda) = \lambda$.
Next we want to define this operator for unbounded Borel functions. Since we expect the resulting operator to be unbounded, we need a suitable domain first. Motivated by (3.18), we set
$$ \mathfrak{D}_f = \{\psi \in \mathcal{H} \mid \int_{\mathbb{R}} |f(\lambda)|^2\,d\mu_\psi(\lambda) < \infty\}. \qquad (3.25) $$
This is clearly a linear subspace of $\mathcal{H}$ since $\mu_{\alpha\psi}(\Omega) = |\alpha|^2\mu_\psi(\Omega)$ and since $\mu_{\varphi+\psi}(\Omega) = \|P(\Omega)(\varphi + \psi)\|^2 \le 2(\|P(\Omega)\varphi\|^2 + \|P(\Omega)\psi\|^2) = 2(\mu_\varphi(\Omega) + \mu_\psi(\Omega))$ (by the triangle inequality).

For every $\psi \in \mathfrak{D}_f$, the sequence of bounded Borel functions
$$ f_n = \chi_{\Omega_n}f, \qquad \Omega_n = \{\lambda \mid |f(\lambda)| \le n\}, \qquad (3.26) $$
is a Cauchy sequence converging to $f$ in the sense of $L^2(\mathbb{R}, d\mu_\psi)$. Hence, by virtue of (3.18), the vectors $\psi_n = P(f_n)\psi$ form a Cauchy sequence in $\mathcal{H}$ and we can define
$$ P(f)\psi = \lim_{n\to\infty} P(f_n)\psi, \qquad \psi \in \mathfrak{D}_f. \qquad (3.27) $$
By construction, $P(f)$ is a linear operator such that (3.18) holds. Since $f \in L^1(\mathbb{R}, d\mu_\psi)$ ($\mu_\psi$ is finite), (3.17) also remains true at least for $\varphi = \psi$.

In addition, $\mathfrak{D}_f$ is dense. Indeed, let $\Omega_n$ be defined as in (3.26) and abbreviate $\psi_n = P(\chi_{\Omega_n})\psi$. Now observe that $d\mu_{\psi_n} = \chi_{\Omega_n}\,d\mu_\psi$ and hence $\psi_n \in \mathfrak{D}_f$. Moreover, $\psi_n \to \psi$ by (3.18) since $\chi_{\Omega_n} \to 1$ in $L^2(\mathbb{R}, d\mu_\psi)$.

The operator $P(f)$ has some additional properties. One calls an unbounded operator $A$ normal if $D(A) = D(A^*)$ and $\|A\psi\| = \|A^*\psi\|$ for all $\psi \in D(A)$. Note that normal operators are closed since the graph norms on $D(A) = D(A^*)$ are identical.
Theorem 3.2. For any Borel function $f$, the operator
$$ P(f) \equiv \int_{\mathbb{R}} f(\lambda)\,dP(\lambda), \qquad D(P(f)) = \mathfrak{D}_f, \qquad (3.28) $$
is normal and satisfies
$$ P(f)^* = P(f^*). \qquad (3.29) $$

Proof. Let $f$ be given and define $f_n$, $\Omega_n$ as above. Since (3.29) holds for $f_n$ by our previous theorem, we get
$$ \langle\varphi, P(f)\psi\rangle = \langle P(f^*)\varphi, \psi\rangle $$
for any $\varphi, \psi \in \mathfrak{D}_f = \mathfrak{D}_{f^*}$ by continuity. Thus it remains to show that $D(P(f)^*) \subseteq \mathfrak{D}_f$. If $\psi \in D(P(f)^*)$, we have $\langle\psi, P(f)\varphi\rangle = \langle\tilde\psi, \varphi\rangle$ for all $\varphi \in \mathfrak{D}_f$ by definition. By construction of $P(f)$ we have $P(f_n) = P(f)P(\chi_{\Omega_n})$ and thus
$$ \langle P(f_n^*)\psi, \varphi\rangle = \langle\psi, P(f_n)\varphi\rangle = \langle\psi, P(f)P(\chi_{\Omega_n})\varphi\rangle = \langle P(\chi_{\Omega_n})\tilde\psi, \varphi\rangle $$
for any $\varphi \in \mathcal{H}$ shows $P(f_n^*)\psi = P(\chi_{\Omega_n})\tilde\psi$. This proves existence of the limit
$$ \lim_{n\to\infty}\int_{\mathbb{R}} |f_n|^2\,d\mu_\psi = \lim_{n\to\infty}\|P(f_n^*)\psi\|^2 = \lim_{n\to\infty}\|P(\chi_{\Omega_n})\tilde\psi\|^2 = \|\tilde\psi\|^2, $$
which by monotone convergence implies $f \in L^2(\mathbb{R}, d\mu_\psi)$; that is, $\psi \in \mathfrak{D}_f$.

That $P(f)$ is normal follows from (3.18), which implies $\|P(f)\psi\|^2 = \|P(f^*)\psi\|^2 = \int_{\mathbb{R}}|f(\lambda)|^2\,d\mu_\psi$. $\square$
These considerations seem to indicate some kind of correspondence between the operators $P(f)$ in $H$ and $f$ in $L^2(\mathbb{R}, d\mu_\psi)$. Recall that $U : H \to \tilde H$ is called unitary if it is a bijection which preserves norms, $\|U\psi\| = \|\psi\|$ (and hence scalar products). The operators $A$ in $H$ and $\tilde A$ in $\tilde H$ are said to be unitarily equivalent if

$$U A = \tilde A U, \qquad U D(A) = D(\tilde A). \qquad (3.30)$$

Clearly, $A$ is self-adjoint if and only if $\tilde A$ is and $\sigma(A) = \sigma(\tilde A)$.

Now let us return to our original problem and consider the subspace

$$H_\psi = \overline{\{P(g)\psi \,|\, g \in L^2(\mathbb{R}, d\mu_\psi)\}} \subseteq H. \qquad (3.31)$$

Note that $H_\psi$ is closed since $L^2$ is and $\psi_n = P(g_n)\psi$ converges in $H$ if and only if $g_n$ converges in $L^2$. It even turns out that we can restrict $P(f)$ to $H_\psi$ (see Section 2.5).

Lemma 3.3. The subspace $H_\psi$ reduces $P(f)$; that is, $P_\psi P(f) \subseteq P(f) P_\psi$. Here $P_\psi$ is the projection onto $H_\psi$.
Proof. First suppose $f$ is bounded. Any $\varphi \in H$ can be decomposed as $\varphi = P(g)\psi + \varphi_\perp$. Moreover, $\langle P(h)\psi, P(f)\varphi_\perp \rangle = \langle P(f^* h)\psi, \varphi_\perp \rangle = 0$ for every bounded function $h$ implies $P(f)\varphi_\perp \in H_\psi^\perp$. Hence $P_\psi P(f)\varphi = P_\psi P(f) P(g)\psi = P(f) P_\psi \varphi$, which by definition says that $H_\psi$ reduces $P(f)$.

If $f$ is unbounded, we consider $f_n = f \chi_{\Omega_n}$ as before. Then, for every $\varphi \in D_f$, $P(f_n) P_\psi \varphi = P_\psi P(f_n)\varphi$. Letting $n \to \infty$, we have $P(\chi_{\Omega_n}) P_\psi \varphi \to P_\psi \varphi$ and $P(f_n) P_\psi \varphi = P_\psi P(f_n)\varphi \to P_\psi P(f)\varphi$. Finally, closedness of $P(f)$ implies $P_\psi \varphi \in D_f$ and $P(f) P_\psi \varphi = P_\psi P(f)\varphi$. $\square$
In particular we can decompose $P(f) = P(f)\big|_{H_\psi} \oplus P(f)\big|_{H_\psi^\perp}$. Note that

$$P_\psi D_f = D_f \cap H_\psi = \{P(g)\psi \,|\, g, fg \in L^2(\mathbb{R}, d\mu_\psi)\} \qquad (3.32)$$

and $P(f) P(g)\psi = P(f g)\psi \in H_\psi$ in this case.

By (3.18), the relation

$$U_\psi(P(f)\psi) = f \qquad (3.33)$$

defines a unique unitary operator $U_\psi : H_\psi \to L^2(\mathbb{R}, d\mu_\psi)$ such that

$$U_\psi P(f)\big|_{H_\psi} = f U_\psi, \qquad (3.34)$$

where $f$ is identified with its corresponding multiplication operator. Moreover, if $f$ is unbounded, we have $U_\psi(D_f \cap H_\psi) = D(f) = \{g \in L^2(\mathbb{R}, d\mu_\psi) \,|\, fg \in L^2(\mathbb{R}, d\mu_\psi)\}$ (since $\varphi = P(f)\psi$ implies $d\mu_\varphi = |f|^2 d\mu_\psi$) and the above equation still holds.
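A minimal finite-dimensional sketch of this canonical form (not from the text; the matrix is a made-up example): for a Hermitian matrix, the unitary built from an orthonormal eigenbasis turns the operator into a diagonal one, i.e. multiplication by $\lambda$.

```python
# Hermitian A with eigenpairs (1, (1,-1)/sqrt(2)) and (3, (1,1)/sqrt(2)).
# The rows of U are the orthonormal eigenvectors, so U A U^T is diagonal.
import math

A = [[2.0, 1.0], [1.0, 2.0]]
s = 1 / math.sqrt(2)
U = [[s, -s], [s, s]]                 # rows = orthonormal eigenvectors

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Ut = [[U[j][i] for j in range(2)] for i in range(2)]   # U^T = U^{-1}
D = matmul(matmul(U, A), Ut)          # diagonal: multiplication by the eigenvalues
```

Here $D$ comes out as $\mathrm{diag}(1, 3)$, the multiplication-operator form of $A$.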
The vector $\psi$ is called cyclic if $H_\psi = H$ and in this case our picture is complete. Otherwise we need to extend this approach. A set $\{\psi_j\}_{j\in J}$ ($J$ some index set) is called a set of spectral vectors if $\|\psi_j\| = 1$ and $H_{\psi_i} \perp H_{\psi_j}$ for all $i \ne j$. A set of spectral vectors is called a spectral basis if $\bigoplus_j H_{\psi_j} = H$. Luckily a spectral basis always exists:

Lemma 3.4. For every projection-valued measure $P$, there is an (at most countable) spectral basis $\{\psi_n\}$ such that

$$H = \bigoplus_n H_{\psi_n} \qquad (3.35)$$

and a corresponding unitary operator

$$U = \bigoplus_n U_{\psi_n} : H \to \bigoplus_n L^2(\mathbb{R}, d\mu_{\psi_n}) \qquad (3.36)$$

such that for any Borel function $f$,

$$U P(f) = f U, \qquad U D_f = D(f). \qquad (3.37)$$
Proof. It suffices to show that a spectral basis exists. This can be easily done using a Gram–Schmidt type construction. First of all observe that if $\{\psi_j\}_{j\in J}$ is a spectral set and $\psi \perp H_{\psi_j}$ for all $j$, we have $H_\psi \perp H_{\psi_j}$ for all $j$. Indeed, $\psi \perp H_{\psi_j}$ implies $P(g)\psi \perp H_{\psi_j}$ for every bounded function $g$ since $\langle P(g)\psi, P(f)\psi_j \rangle = \langle \psi, P(g^* f)\psi_j \rangle = 0$. But $P(g)\psi$ with $g$ bounded is dense in $H_\psi$ implying $H_\psi \perp H_{\psi_j}$.

Now start with some total set $\{\tilde\psi_j\}$. Normalize $\tilde\psi_1$ and choose this to be $\psi_1$. Move to the first $\tilde\psi_j$ which is not in $H_{\psi_1}$, project to the orthogonal complement of $H_{\psi_1}$ and normalize it. Choose the result to be $\psi_2$. Proceeding like this, we get a set of spectral vectors $\{\psi_j\}$ such that $\mathrm{span}\{\tilde\psi_j\} \subseteq \bigoplus_j H_{\psi_j}$. Hence $H = \overline{\mathrm{span}\{\tilde\psi_j\}} \subseteq \bigoplus_j H_{\psi_j}$. $\square$
It is important to observe that the cardinality of a spectral basis is not well-defined (in contradistinction to the cardinality of an ordinary basis of the Hilbert space). However, it can be at most equal to the cardinality of an ordinary basis. In particular, since $H$ is separable, it is at most countable. The minimal cardinality of a spectral basis is called the spectral multiplicity of $P$. If the spectral multiplicity is one, the spectrum is called simple.

Example. Let $H = \mathbb{C}^2$ and $A = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ and consider the associated projection-valued measure $P_A(\Omega)$ as before. Then $\psi_1 = (1, 0)$ and $\psi_2 = (0, 1)$ are a spectral basis. However, $\psi = (1, 1)$ is cyclic and hence the spectrum of $A$ is simple. If $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, there is no cyclic vector (why?) and hence the spectral multiplicity is two. $\diamond$
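The example above can be checked numerically (a sketch, not part of the text): in $\mathbb{C}^2$ a vector $\psi$ is cyclic for a Hermitian $A$ exactly when $\{\psi, A\psi\}$ is linearly independent.

```python
# is_cyclic tests whether psi and A psi span C^2, via the determinant of the
# 2x2 matrix with these two vectors as columns.
def is_cyclic(A, psi):
    Apsi = [A[0][0]*psi[0] + A[0][1]*psi[1],
            A[1][0]*psi[0] + A[1][1]*psi[1]]
    det = psi[0]*Apsi[1] - psi[1]*Apsi[0]
    return abs(det) > 1e-12

A1 = [[0.0, 0.0], [0.0, 1.0]]        # diag(0,1): simple spectrum
A2 = [[1.0, 0.0], [0.0, 1.0]]        # identity: spectral multiplicity two
assert is_cyclic(A1, [1.0, 1.0])     # (1,1) is cyclic for diag(0,1)
assert not is_cyclic(A2, [1.0, 1.0]) # no vector is cyclic for the identity
```

For the identity, $A\psi = \psi$ for every $\psi$, so the determinant always vanishes; this is why no cyclic vector exists.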
Using this canonical form of projection-valued measures, it is straightforward to prove

Lemma 3.5. Let $f, g$ be Borel functions and $\alpha, \beta \in \mathbb{C}$. Then we have

$$\alpha P(f) + \beta P(g) \subseteq P(\alpha f + \beta g), \qquad D(\alpha P(f) + \beta P(g)) = D_{|f|+|g|} \qquad (3.38)$$

and

$$P(f) P(g) \subseteq P(f g), \qquad D(P(f) P(g)) = D_g \cap D_{f g}. \qquad (3.39)$$

Now observe that to every projection-valued measure $P$ we can assign a self-adjoint operator $A = \int_{\mathbb{R}} \lambda\, dP(\lambda)$. The question is whether we can invert this map. To do this, we consider the resolvent $R_A(z) = \int_{\mathbb{R}} (\lambda - z)^{-1}\, dP(\lambda)$. From (3.17) the corresponding quadratic form is given by

$$F_\psi(z) = \langle \psi, R_A(z)\psi \rangle = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\mu_\psi(\lambda), \qquad (3.40)$$
which is known as the Borel transform of the measure $\mu_\psi$. By

$$\mathrm{Im}(F_\psi(z)) = \mathrm{Im}(z) \int_{\mathbb{R}} \frac{1}{|\lambda - z|^2}\, d\mu_\psi(\lambda), \qquad (3.41)$$

we infer that $F_\psi(z)$ is a holomorphic map from the upper half plane into itself. Such functions are called Herglotz or Nevanlinna functions (see Section 3.4). Moreover, the measure $\mu_\psi$ can be reconstructed from $F_\psi(z)$ by the Stieltjes inversion formula

$$\mu_\psi(\lambda) = \lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} \frac{1}{\pi} \int_{-\infty}^{\lambda+\delta} \mathrm{Im}(F_\psi(t + \mathrm{i}\varepsilon))\, dt. \qquad (3.42)$$

(The limit with respect to $\delta$ is only here to ensure right continuity of $\mu_\psi(\lambda)$.)

Conversely, if $F_\psi(z)$ is a Herglotz function satisfying $|F_\psi(z)| \le \frac{M}{\mathrm{Im}(z)}$, then it is the Borel transform of a unique measure $\mu_\psi$ (given by the Stieltjes inversion formula) satisfying $\mu_\psi(\mathbb{R}) \le M$.
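A quick numerical sanity check (a sketch with a made-up discrete measure, not from the text): the Borel transform $F(z) = \sum_j w_j/(\lambda_j - z)$ of a finite positive measure maps the upper half plane into itself and obeys the growth bound $|F(z)| \le \mu(\mathbb{R})/\mathrm{Im}(z)$.

```python
# Discrete measure with weights w_j at points lambda_j; total mass mu(R) = 1.
pts, wts = [-1.0, 0.5, 2.0], [0.2, 0.3, 0.5]

def F(z):
    return sum(w / (l - z) for l, w in zip(pts, wts))

z = 0.3 + 0.7j
assert F(z).imag > 0                     # Herglotz: C+ is mapped into C+
assert abs(F(z)) <= sum(wts) / z.imag    # growth bound M / Im(z)
```

Both properties follow term by term from $\mathrm{Im}\,\frac{1}{\lambda - z} = \frac{\mathrm{Im}(z)}{|\lambda - z|^2}$ and $|\lambda - z| \ge \mathrm{Im}(z)$.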
So let $A$ be a given self-adjoint operator and consider the expectation of the resolvent of $A$,

$$F_\psi(z) = \langle \psi, R_A(z)\psi \rangle. \qquad (3.43)$$

This function is holomorphic for $z \in \rho(A)$ and satisfies

$$F_\psi(z^*) = F_\psi(z)^* \quad\text{and}\quad |F_\psi(z)| \le \frac{\|\psi\|^2}{\mathrm{Im}(z)} \qquad (3.44)$$

(see (2.69) and Theorem 2.18). Moreover, the first resolvent formula (2.81) shows that it maps the upper half plane to itself:

$$\mathrm{Im}(F_\psi(z)) = \mathrm{Im}(z) \|R_A(z)\psi\|^2; \qquad (3.45)$$

that is, it is a Herglotz function. So by our above remarks, there is a corresponding measure $\mu_\psi(\lambda)$ given by the Stieltjes inversion formula. It is called the spectral measure corresponding to $\psi$.
More generally, by polarization, for each $\varphi, \psi \in H$ we can find a corresponding complex measure $\mu_{\varphi,\psi}$ such that

$$\langle \varphi, R_A(z)\psi \rangle = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\mu_{\varphi,\psi}(\lambda). \qquad (3.46)$$

The measure $\mu_{\varphi,\psi}$ is conjugate linear in $\varphi$ and linear in $\psi$. Moreover, a comparison with our previous considerations begs us to define a family of operators via the sesquilinear forms

$$s_\Omega(\varphi, \psi) = \int_{\mathbb{R}} \chi_\Omega(\lambda)\, d\mu_{\varphi,\psi}(\lambda). \qquad (3.47)$$

Since the associated quadratic form is nonnegative, $q_\Omega(\psi) = s_\Omega(\psi, \psi) = \mu_\psi(\Omega) \ge 0$, the Cauchy–Schwarz inequality for sesquilinear forms (Problem 0.16) implies

$$|s_\Omega(\varphi, \psi)| \le q_\Omega(\varphi)^{1/2} q_\Omega(\psi)^{1/2} = \mu_\varphi(\Omega)^{1/2} \mu_\psi(\Omega)^{1/2} \le \mu_\varphi(\mathbb{R})^{1/2} \mu_\psi(\mathbb{R})^{1/2} \le \|\varphi\|\, \|\psi\|.$$

Hence Corollary 1.9 implies that there is indeed a family of nonnegative ($0 \le \langle \psi, P_A(\Omega)\psi \rangle \le 1$) and hence self-adjoint operators $P_A(\Omega)$ such that

$$\langle \varphi, P_A(\Omega)\psi \rangle = \int_{\mathbb{R}} \chi_\Omega(\lambda)\, d\mu_{\varphi,\psi}(\lambda). \qquad (3.48)$$
Lemma 3.6. The family of operators $P_A(\Omega)$ forms a projection-valued measure.

Proof. We first show $P_A(\Omega_1) P_A(\Omega_2) = P_A(\Omega_1 \cap \Omega_2)$ in two steps. First observe (using the first resolvent formula (2.81))

$$\int_{\mathbb{R}} \frac{1}{\lambda - \tilde z}\, d\mu_{R_A(z^*)\varphi, \psi}(\lambda) = \langle R_A(z^*)\varphi, R_A(\tilde z)\psi \rangle = \langle \varphi, R_A(z) R_A(\tilde z)\psi \rangle$$
$$= \frac{1}{z - \tilde z} \big( \langle \varphi, R_A(z)\psi \rangle - \langle \varphi, R_A(\tilde z)\psi \rangle \big)$$
$$= \frac{1}{z - \tilde z} \int_{\mathbb{R}} \Big( \frac{1}{\lambda - z} - \frac{1}{\lambda - \tilde z} \Big)\, d\mu_{\varphi,\psi}(\lambda) = \int_{\mathbb{R}} \frac{1}{\lambda - \tilde z}\, \frac{d\mu_{\varphi,\psi}(\lambda)}{\lambda - z}$$

implying $d\mu_{R_A(z^*)\varphi, \psi}(\lambda) = (\lambda - z)^{-1}\, d\mu_{\varphi,\psi}(\lambda)$ by Problem 3.21. Secondly we compute

$$\int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\mu_{\varphi, P_A(\Omega)\psi}(\lambda) = \langle \varphi, R_A(z) P_A(\Omega)\psi \rangle = \langle R_A(z^*)\varphi, P_A(\Omega)\psi \rangle$$
$$= \int_{\mathbb{R}} \chi_\Omega(\lambda)\, d\mu_{R_A(z^*)\varphi, \psi}(\lambda) = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, \chi_\Omega(\lambda)\, d\mu_{\varphi,\psi}(\lambda)$$

implying $d\mu_{\varphi, P_A(\Omega)\psi}(\lambda) = \chi_\Omega(\lambda)\, d\mu_{\varphi,\psi}(\lambda)$. Equivalently we have

$$\langle \varphi, P_A(\Omega_1) P_A(\Omega_2)\psi \rangle = \langle \varphi, P_A(\Omega_1 \cap \Omega_2)\psi \rangle$$

since $\chi_{\Omega_1} \chi_{\Omega_2} = \chi_{\Omega_1 \cap \Omega_2}$. In particular, choosing $\Omega_1 = \Omega_2$, we see that $P_A(\Omega_1)$ is a projector.

To see $P_A(\mathbb{R}) = \mathbb{I}$, let $\psi \in \mathrm{Ker}(P_A(\mathbb{R}))$. Then $0 = \langle \psi, P_A(\mathbb{R})\psi \rangle = \mu_\psi(\mathbb{R})$ implies $\langle \psi, R_A(z)\psi \rangle = 0$ which implies $\psi = 0$.

Now let $\Omega = \bigcup_{n=1}^\infty \Omega_n$ with $\Omega_n \cap \Omega_m = \emptyset$ for $n \ne m$. Then

$$\sum_{j=1}^n \langle \psi, P_A(\Omega_j)\psi \rangle = \sum_{j=1}^n \mu_\psi(\Omega_j) \to \langle \psi, P_A(\Omega)\psi \rangle = \mu_\psi(\Omega)$$

by $\sigma$-additivity of $\mu_\psi$. Hence $P_A$ is weakly $\sigma$-additive which implies strong $\sigma$-additivity, as pointed out earlier. $\square$
Now we can prove the spectral theorem for self-adjoint operators.

Theorem 3.7 (Spectral theorem). To every self-adjoint operator $A$ there corresponds a unique projection-valued measure $P_A$ such that

$$A = \int_{\mathbb{R}} \lambda\, dP_A(\lambda). \qquad (3.49)$$

Proof. Existence has already been established. Moreover, Lemma 3.5 shows that $P_A((\lambda - z)^{-1}) = R_A(z)$, $z \in \mathbb{C}\setminus\mathbb{R}$. Since the measures $\mu_{\varphi,\psi}$ are uniquely determined by the resolvent and the projection-valued measure is uniquely determined by the measures $\mu_{\varphi,\psi}$, we are done. $\square$
The quadratic form of $A$ is given by

$$q_A(\psi) = \int_{\mathbb{R}} \lambda\, d\mu_\psi(\lambda) \qquad (3.50)$$

and can be defined for every $\psi$ in the form domain

$$Q(A) = D(|A|^{1/2}) = \{\psi \in H \,|\, \int_{\mathbb{R}} |\lambda|\, d\mu_\psi(\lambda) < \infty\} \qquad (3.51)$$

(which is larger than the domain $D(A) = \{\psi \in H \,|\, \int_{\mathbb{R}} \lambda^2\, d\mu_\psi(\lambda) < \infty\}$). This extends our previous definition for nonnegative operators.

Note that if $A$ and $\tilde A$ are unitarily equivalent as in (3.30), then $U R_A(z) = R_{\tilde A}(z) U$ and hence

$$d\mu_\psi = d\tilde\mu_{U\psi}. \qquad (3.52)$$

In particular, we have $U P_A(f) = P_{\tilde A}(f) U$, $U D(P_A(f)) = D(P_{\tilde A}(f))$.
Finally, let us give a characterization of the spectrum of $A$ in terms of the associated projectors.

Theorem 3.8. The spectrum of $A$ is given by

$$\sigma(A) = \{\lambda \in \mathbb{R} \,|\, P_A((\lambda - \varepsilon, \lambda + \varepsilon)) \ne 0 \text{ for all } \varepsilon > 0\}. \qquad (3.53)$$

Proof. Let $\Omega_n = (\lambda_0 - \frac{1}{n}, \lambda_0 + \frac{1}{n})$. Suppose $P_A(\Omega_n) \ne 0$. Then we can find a $\psi_n \in P_A(\Omega_n)H$ with $\|\psi_n\| = 1$. Since

$$\|(A - \lambda_0)\psi_n\|^2 = \|(A - \lambda_0) P_A(\Omega_n)\psi_n\|^2 = \int_{\mathbb{R}} (\lambda - \lambda_0)^2\, \chi_{\Omega_n}(\lambda)\, d\mu_{\psi_n}(\lambda) \le \frac{1}{n^2},$$

we conclude $\lambda_0 \in \sigma(A)$ by Lemma 2.16.

Conversely, if $P_A((\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)) = 0$, set

$$f_\varepsilon(\lambda) = \chi_{\mathbb{R}\setminus(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)}(\lambda)\, (\lambda - \lambda_0)^{-1}.$$

Then

$$(A - \lambda_0) P_A(f_\varepsilon) = P_A\big((\lambda - \lambda_0) f_\varepsilon(\lambda)\big) = P_A\big(\mathbb{R}\setminus(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)\big) = \mathbb{I}.$$

Similarly $P_A(f_\varepsilon)(A - \lambda_0) = \mathbb{I}\big|_{D(A)}$ and hence $\lambda_0 \in \rho(A)$. $\square$

In particular, $P_A((\lambda_1, \lambda_2)) = 0$ if and only if $(\lambda_1, \lambda_2) \subseteq \rho(A)$.
Corollary 3.9. We have

$$P_A(\sigma(A)) = \mathbb{I} \quad\text{and}\quad P_A(\mathbb{R}\setminus\sigma(A)) = 0. \qquad (3.54)$$

Proof. For every $\lambda \in \mathbb{R}\setminus\sigma(A)$ there is some open interval $I_\lambda$ with $P_A(I_\lambda) = 0$. These intervals form an open cover for $\mathbb{R}\setminus\sigma(A)$ and there is a countable subcover $J_n$. Setting $\Omega_n = J_n \setminus \bigcup_{m<n} J_m$, we have disjoint Borel sets which cover $\mathbb{R}\setminus\sigma(A)$ and satisfy $P_A(\Omega_n) = 0$. Finally, strong $\sigma$-additivity shows $P_A(\mathbb{R}\setminus\sigma(A))\psi = \sum_n P_A(\Omega_n)\psi = 0$. $\square$
Consequently,

$$P_A(f) = P_A(\sigma(A)) P_A(f) = P_A(\chi_{\sigma(A)} f). \qquad (3.55)$$

In other words, $P_A(f)$ is not affected by the values of $f$ on $\mathbb{R}\setminus\sigma(A)$!

It is clearly more intuitive to write $P_A(f) = f(A)$ and we will do so from now on. This notation is justified by the elementary observation

$$P_A\Big( \sum_{j=0}^n \alpha_j \lambda^j \Big) = \sum_{j=0}^n \alpha_j A^j. \qquad (3.56)$$

Moreover, this also shows that if $A$ is bounded and $f(A)$ can be defined via a convergent power series, then this agrees with our present definition by Theorem 3.1.
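A finite-dimensional sketch of (3.56) (a made-up matrix, not from the text): applying a polynomial through the functional calculus, i.e. $f(\lambda_1) P_1 + f(\lambda_2) P_2$ with the eigenprojections, agrees with the naive matrix polynomial.

```python
# A = [[2,1],[1,2]] has eigenpairs (1, (1,-1)/sqrt(2)) and (3, (1,1)/sqrt(2));
# P1, P2 below are the corresponding orthogonal eigenprojections.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2.0, 1.0], [1.0, 2.0]]
f = lambda l: l * l - 3 * l + 1          # f(lambda) = lambda^2 - 3 lambda + 1

# spectral side: f(A) = f(1) P1 + f(3) P2
P1 = [[0.5, -0.5], [-0.5, 0.5]]
P2 = [[0.5, 0.5], [0.5, 0.5]]
spec = [[f(1) * P1[i][j] + f(3) * P2[i][j] for j in range(2)] for i in range(2)]

# polynomial side: A^2 - 3A + I, computed directly
A2 = matmul(A, A)
poly = [[A2[i][j] - 3 * A[i][j] + (1 if i == j else 0) for j in range(2)]
        for i in range(2)]

assert all(abs(spec[i][j] - poly[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```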
Problem 3.1. Show that a self-adjoint operator $P$ is a projection if and only if $\sigma(P) \subseteq \{0, 1\}$.

Problem 3.2. Consider the parity operator $\Pi : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$, $\psi(x) \mapsto \psi(-x)$. Show that $\Pi$ is self-adjoint. Compute its spectrum $\sigma(\Pi)$ and the corresponding projection-valued measure $P_\Pi$.

Problem 3.3. Show that (3.7) is a projection-valued measure. What is the corresponding operator?

Problem 3.4. Show that $P(\Omega)$ defined in (3.14) satisfies properties (i)–(iv) stated there.

Problem 3.5. Show that for a self-adjoint operator $A$ we have $\|R_A(z)\| = \mathrm{dist}(z, \sigma(A))^{-1}$.

Problem 3.6. Suppose $A$ is self-adjoint and $\|B - z_0\| \le r$. Show that $\sigma(A + B) \subseteq \sigma(A) + B_r(z_0)$, where $B_r(z_0)$ is the ball of radius $r$ around $z_0$. (Hint: Problem 2.17.)

Problem 3.7. Show that for a self-adjoint operator $A$ we have $\|A R_A(z)\| \le \frac{|z|}{\mathrm{Im}(z)}$. Find some $A$ for which equality is attained.

Conclude that for every $\psi \in H$ we have

$$\lim_{z \to \infty} \|A R_A(z)\psi\| = 0, \qquad (3.57)$$

where the limit is taken in any sector $|\mathrm{Re}(z)| \le \alpha |\mathrm{Im}(z)|$, $\alpha > 0$.

Problem 3.8. Suppose $A$ is self-adjoint. Show that, if $\psi \in D(A^n)$, then

$$R_A(z)\psi = -\sum_{j=0}^{n} \frac{A^j \psi}{z^{j+1}} + O\Big( \frac{\|A^n \psi\|}{|z|^n\, \mathrm{Im}(z)} \Big), \quad\text{as } z \to \infty. \qquad (3.58)$$

(Hint: Proceed as in (2.87) and use the previous problem.)

Problem 3.9. Let $\lambda_0$ be an eigenvalue and $\psi$ a corresponding normalized eigenvector. Compute $\mu_\psi$.

Problem 3.10. Show that $\lambda_0$ is an eigenvalue if and only if $P(\{\lambda_0\}) \ne 0$. Show that $\mathrm{Ran}(P(\{\lambda_0\}))$ is the corresponding eigenspace in this case.

Problem 3.11 (Polar decomposition). Let $A$ be a closed operator and set $|A| = \sqrt{A^* A}$ (recall that, by Problem 2.12, $A^* A$ is self-adjoint and $Q(A^* A) = D(A)$). Show that

$$\| |A|\psi \| = \|A\psi\|.$$

Conclude that $\mathrm{Ker}(A) = \mathrm{Ker}(|A|) = \mathrm{Ran}(|A|)^\perp$ and that

$$U : \varphi \mapsto \begin{cases} A\psi & \text{if } \varphi = |A|\psi \in \mathrm{Ran}(|A|), \\ 0 & \text{if } \varphi \in \mathrm{Ker}(|A|) \end{cases}$$

extends to a well-defined partial isometry; that is, $U : \mathrm{Ker}(U)^\perp \to \mathrm{Ran}(U)$ is unitary, where $\mathrm{Ker}(U) = \mathrm{Ker}(A)$ and $\mathrm{Ran}(U) = \mathrm{Ker}(A^*)^\perp$.

In particular, we have the polar decomposition

$$A = U|A|.$$

Problem 3.12. Compute $|A| = \sqrt{A^* A}$ for the rank one operator $A = \langle \varphi, \cdot \rangle \psi$. Compute $\sqrt{A A^*}$ also.
3.2. More on Borel measures
Section 3.1 showed that in order to understand self-adjoint operators, one needs to understand multiplication operators on $L^2(\mathbb{R}, d\mu)$, where $d\mu$ is a finite Borel measure. This is the purpose of the present section.

The set of all growth points, that is,

$$\sigma(\mu) = \{\lambda \in \mathbb{R} \,|\, \mu((\lambda - \varepsilon, \lambda + \varepsilon)) > 0 \text{ for all } \varepsilon > 0\}, \qquad (3.59)$$

is called the spectrum of $\mu$. The same proof as for Corollary 3.9 shows that the spectrum $\sigma = \sigma(\mu)$ is a support for $\mu$; that is, $\mu(\mathbb{R}\setminus\sigma) = 0$.

In the previous section we have already seen that the Borel transform of $\mu$,

$$F(z) = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\mu(\lambda), \qquad (3.60)$$

plays an important role.
Theorem 3.10. The Borel transform of a finite Borel measure is a Herglotz function. It is holomorphic in $\mathbb{C}\setminus\sigma(\mu)$ and satisfies

$$F(z^*) = F(z)^*, \qquad |F(z)| \le \frac{\mu(\mathbb{R})}{\mathrm{Im}(z)}, \quad z \in \mathbb{C}_+. \qquad (3.61)$$

Proof. First of all note

$$\mathrm{Im}(F(z)) = \int_{\mathbb{R}} \mathrm{Im}\Big( \frac{1}{\lambda - z} \Big)\, d\mu(\lambda) = \mathrm{Im}(z) \int_{\mathbb{R}} \frac{d\mu(\lambda)}{|\lambda - z|^2},$$

which shows that $F$ maps $\mathbb{C}_+$ to $\mathbb{C}_+$. Moreover, $F(z^*) = F(z)^*$ is obvious and

$$|F(z)| \le \int_{\mathbb{R}} \frac{d\mu(\lambda)}{|\lambda - z|} \le \frac{1}{\mathrm{Im}(z)} \int_{\mathbb{R}} d\mu(\lambda)$$

establishes the bound. Moreover, since $\mu(\mathbb{R}\setminus\sigma) = 0$, we have

$$F(z) = \int_\sigma \frac{1}{\lambda - z}\, d\mu(\lambda),$$

which together with the bound

$$\frac{1}{|\lambda - z|} \le \frac{1}{\mathrm{dist}(z, \sigma)}$$

allows the application of the dominated convergence theorem to conclude that $F$ is continuous on $\mathbb{C}\setminus\sigma$. To show that $F$ is holomorphic in $\mathbb{C}\setminus\sigma$, by Morera's theorem, it suffices to check $\oint_\Gamma F(z)\, dz = 0$ for every triangle $\Gamma \subset \mathbb{C}\setminus\sigma$. Since $(\lambda - z)^{-1}$ is bounded for $(\lambda, z) \in \sigma \times \Gamma$, this follows from $\oint_\Gamma (\lambda - z)^{-1}\, dz = 0$ by using Fubini,

$$\oint_\Gamma F(z)\, dz = \oint_\Gamma \int_{\mathbb{R}} (\lambda - z)^{-1}\, d\mu(\lambda)\, dz = \int_{\mathbb{R}} \oint_\Gamma (\lambda - z)^{-1}\, dz\, d\mu(\lambda) = 0. \qquad \square$$

Note that $F$ cannot be holomorphically extended to a larger domain. In fact, if $F$ is holomorphic in a neighborhood of some $\lambda \in \mathbb{R}$, then $F(\lambda) = F(\lambda^*) = F(\lambda)^*$ implies $\mathrm{Im}(F(\lambda)) = 0$ and the Stieltjes inversion formula (Theorem 3.21) shows that $\lambda \in \mathbb{R}\setminus\sigma(\mu)$.

Associated with this measure is the operator

$$A f(\lambda) = \lambda f(\lambda), \qquad D(A) = \{f \in L^2(\mathbb{R}, d\mu) \,|\, \lambda f(\lambda) \in L^2(\mathbb{R}, d\mu)\}. \qquad (3.62)$$

By Theorem 3.8 the spectrum of $A$ is precisely the spectrum of $\mu$; that is,

$$\sigma(A) = \sigma(\mu). \qquad (3.63)$$
Note that $1 \in L^2(\mathbb{R}, d\mu)$ is a cyclic vector for $A$ and that

$$d\mu_{g,f}(\lambda) = g(\lambda)^* f(\lambda)\, d\mu(\lambda). \qquad (3.64)$$

Now what can we say about the function $f(A)$ (which is precisely the multiplication operator by $f$) of $A$? We are only interested in the case where $f$ is real-valued. Introduce the measure

$$(f_\star \mu)(\Omega) = \mu(f^{-1}(\Omega)). \qquad (3.65)$$

Then

$$\int_{\mathbb{R}} g(\lambda)\, d(f_\star \mu)(\lambda) = \int_{\mathbb{R}} g(f(\lambda))\, d\mu(\lambda). \qquad (3.66)$$

In fact, it suffices to check this formula for simple functions $g$, which follows since $\chi_\Omega \circ f = \chi_{f^{-1}(\Omega)}$. In particular, we have

$$P_{f(A)}(\Omega) = \chi_{f^{-1}(\Omega)}. \qquad (3.67)$$

It is tempting to conjecture that $f(A)$ is unitarily equivalent to multiplication by $\lambda$ in $L^2(\mathbb{R}, d(f_\star \mu))$ via the map

$$L^2(\mathbb{R}, d(f_\star \mu)) \to L^2(\mathbb{R}, d\mu), \qquad g \mapsto g \circ f. \qquad (3.68)$$

However, this map is only unitary if its range is $L^2(\mathbb{R}, d\mu)$.
Lemma 3.11. Suppose $f$ is injective. Then

$$U : L^2(\mathbb{R}, d\mu) \to L^2(\mathbb{R}, d(f_\star \mu)), \qquad g \mapsto g \circ f^{-1} \qquad (3.69)$$

is a unitary map such that $U f(\lambda) = \lambda$.
Example. Let $f(\lambda) = \lambda^2$. Then $(g \circ f)(\lambda) = g(\lambda^2)$ and the range of the above map is given by the symmetric functions. Note that we can still get a unitary map $L^2(\mathbb{R}, d(f_\star \mu)) \oplus L^2(\mathbb{R}, d(f_\star \mu)) \to L^2(\mathbb{R}, d\mu)$, $(g_1, g_2) \mapsto g_1(\lambda^2) + g_2(\lambda^2)\big( \chi_{(0,\infty)}(\lambda) - \chi_{(0,\infty)}(-\lambda) \big)$. $\diamond$
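A numerical sketch of the change-of-variables formula (3.66) (a made-up discrete measure, not from the text): integrating $g$ against the pushforward $f_\star \mu$ equals integrating $g \circ f$ against $\mu$; note how the weights at $\pm 1$ merge under $f(\lambda) = \lambda^2$.

```python
# Discrete measure mu with weights wts at points pts; f(lambda) = lambda^2.
pts, wts = [-2.0, -1.0, 1.0, 3.0], [0.1, 0.2, 0.3, 0.4]
f = lambda l: l * l
g = lambda l: 1.0 / (1.0 + l)

# Build the pushforward: (f_star mu)({y}) = mu(f^{-1}({y})).
push = {}
for l, w in zip(pts, wts):
    push[f(l)] = push.get(f(l), 0.0) + w

lhs = sum(w * g(y) for y, w in push.items())          # int g d(f_star mu)
rhs = sum(w * g(f(l)) for l, w in zip(pts, wts))      # int g(f) d(mu)
assert abs(lhs - rhs) < 1e-12
```

The pushforward places weight $0.2 + 0.3 = 0.5$ at $y = 1$, reflecting that $f$ is not injective there.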
Lemma 3.12. Let $f$ be real-valued. The spectrum of $f(A)$ is given by

$$\sigma(f(A)) = \sigma(f_\star \mu). \qquad (3.70)$$

In particular,

$$\sigma(f(A)) \subseteq \overline{f(\sigma(A))}, \qquad (3.71)$$

where equality holds if $f$ is continuous and the closure can be dropped if, in addition, $\sigma(A)$ is bounded (i.e., compact).

Proof. The first formula follows by comparing

$$\sigma(f_\star \mu) = \{\lambda \in \mathbb{R} \,|\, \mu\big(f^{-1}(\lambda - \varepsilon, \lambda + \varepsilon)\big) > 0 \text{ for all } \varepsilon > 0\}$$

with (2.74). If $f$ is continuous, $f^{-1}((f(\lambda) - \varepsilon, f(\lambda) + \varepsilon))$ contains an open interval around $\lambda$ and hence $f(\lambda) \in \sigma(f(A))$ if $\lambda \in \sigma(A)$. If, in addition, $\sigma(A)$ is compact, then $f(\sigma(A))$ is compact and hence closed. $\square$
Whether two operators with simple spectrum are unitarily equivalent can be read off from the corresponding measures:

Lemma 3.13. Let $A_1$, $A_2$ be self-adjoint operators with simple spectrum and corresponding spectral measures $\mu_1$ and $\mu_2$ of cyclic vectors. Then $A_1$ and $A_2$ are unitarily equivalent if and only if $\mu_1$ and $\mu_2$ are mutually absolutely continuous.

Proof. Without restriction we can assume that $A_j$ is multiplication by $\lambda$ in $L^2(\mathbb{R}, d\mu_j)$. Let $U : L^2(\mathbb{R}, d\mu_1) \to L^2(\mathbb{R}, d\mu_2)$ be a unitary map such that $U A_1 = A_2 U$. Then we also have $U f(A_1) = f(A_2) U$ for any bounded Borel function and hence

$$U f(\lambda) = U f(\lambda) \cdot 1 = f(\lambda)\, U(1)(\lambda)$$

and thus $U$ is multiplication by $u(\lambda) = U(1)(\lambda)$. Moreover, since $U$ is unitary, we have

$$\mu_1(\Omega) = \int_{\mathbb{R}} |\chi_\Omega|^2\, d\mu_1 = \int_{\mathbb{R}} |u\, \chi_\Omega|^2\, d\mu_2 = \int_\Omega |u|^2\, d\mu_2;$$

that is, $d\mu_1 = |u|^2\, d\mu_2$. Reversing the roles of $A_1$ and $A_2$, we obtain $d\mu_2 = |v|^2\, d\mu_1$, where $v = U^{-1} 1$.

The converse is left as an exercise (Problem 3.17). $\square$
Next we recall the unique decomposition of $\mu$ with respect to Lebesgue measure,

$$d\mu = d\mu_{ac} + d\mu_s, \qquad (3.72)$$

where $\mu_{ac}$ is absolutely continuous with respect to Lebesgue measure (i.e., we have $\mu_{ac}(B) = 0$ for all $B$ with Lebesgue measure zero) and $\mu_s$ is singular with respect to Lebesgue measure (i.e., $\mu_s$ is supported, $\mu_s(\mathbb{R}\setminus B) = 0$, on a set $B$ with Lebesgue measure zero). The singular part $\mu_s$ can be further decomposed into a (singularly) continuous and a pure point part,

$$d\mu_s = d\mu_{sc} + d\mu_{pp}, \qquad (3.73)$$

where $\mu_{sc}$ is continuous on $\mathbb{R}$ and $\mu_{pp}$ is a step function. Since the measures $d\mu_{ac}$, $d\mu_{sc}$, and $d\mu_{pp}$ are mutually singular, they have mutually disjoint supports $M_{ac}$, $M_{sc}$, and $M_{pp}$. Note that these sets are not unique. We will choose them such that $M_{pp}$ is the set of all jumps of $\mu(\lambda)$ and such that $M_{sc}$ has Lebesgue measure zero.

To the sets $M_{ac}$, $M_{sc}$, and $M_{pp}$ correspond projectors $P_{ac} = \chi_{M_{ac}}(A)$, $P_{sc} = \chi_{M_{sc}}(A)$, and $P_{pp} = \chi_{M_{pp}}(A)$ satisfying $P_{ac} + P_{sc} + P_{pp} = \mathbb{I}$.
In other words, we have a corresponding direct sum decomposition of both our Hilbert space

$$L^2(\mathbb{R}, d\mu) = L^2(\mathbb{R}, d\mu_{ac}) \oplus L^2(\mathbb{R}, d\mu_{sc}) \oplus L^2(\mathbb{R}, d\mu_{pp}) \qquad (3.74)$$

and our operator

$$A = (A P_{ac}) \oplus (A P_{sc}) \oplus (A P_{pp}). \qquad (3.75)$$

The corresponding spectra, $\sigma_{ac}(A) = \sigma(\mu_{ac})$, $\sigma_{sc}(A) = \sigma(\mu_{sc})$, and $\sigma_{pp}(A) = \sigma(\mu_{pp})$ are called the absolutely continuous, singularly continuous, and pure point spectrum of $A$, respectively.

It is important to observe that $\sigma_{pp}(A)$ is in general not equal to the set of eigenvalues

$$\sigma_p(A) = \{\lambda \in \mathbb{R} \,|\, \lambda \text{ is an eigenvalue of } A\} \qquad (3.76)$$

since we only have $\sigma_{pp}(A) = \overline{\sigma_p(A)}$.
Example. Let $H = \ell^2(\mathbb{N})$ and let $A$ be given by $A\delta_n = \frac{1}{n}\delta_n$, where $\delta_n$ is the sequence which is 1 at the $n$'th place and zero otherwise (that is, $A$ is a diagonal matrix with diagonal elements $\frac{1}{n}$). Then $\sigma_p(A) = \{\frac{1}{n} \,|\, n \in \mathbb{N}\}$ but $\sigma(A) = \sigma_{pp}(A) = \overline{\sigma_p(A)} = \sigma_p(A) \cup \{0\}$. To see this, just observe that $\delta_n$ is the eigenvector corresponding to the eigenvalue $\frac{1}{n}$ and for $z \notin \sigma(A)$ we have $R_A(z)\delta_n = \frac{n}{1 - nz}\delta_n$. At $z = 0$ this formula still gives the inverse of $A$, but it is unbounded and hence $0 \in \sigma(A)$ but $0 \notin \sigma_p(A)$. Since a continuous measure cannot live on a single point and hence also not on a countable set, we have $\sigma_{ac}(A) = \sigma_{sc}(A) = \emptyset$. $\diamond$
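A small numerical sketch of this example (not from the text): on the truncation to the first $N$ coordinates, $A = \mathrm{diag}(1, \frac12, \dots, \frac1N)$ is invertible, but the norm of the inverse equals $N$ and blows up as $N$ grows, which is exactly why $0$ belongs to the spectrum of the full operator without being an eigenvalue.

```python
# Operator norm of the inverse of the truncated diagonal operator:
# diag(1, 1/2, ..., 1/N)^{-1} = diag(1, 2, ..., N), whose norm is N.
def inverse_norm(N):
    diag = [1.0 / n for n in range(1, N + 1)]
    return max(1.0 / d for d in diag)

assert abs(inverse_norm(10) - 10.0) < 1e-9
assert abs(inverse_norm(1000) - 1000.0) < 1e-6
```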
Example. An example with purely absolutely continuous spectrum is given by taking $\mu$ to be the Lebesgue measure. An example with purely singularly continuous spectrum is given by taking $\mu$ to be the Cantor measure. $\diamond$
Finally, we show how the spectrum can be read off from the boundary values of $\mathrm{Im}(F)$ towards the real line. We define the following sets:

$$M_{ac} = \{\lambda \,|\, 0 < \limsup_{\varepsilon \downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) < \infty\},$$
$$M_s = \{\lambda \,|\, \limsup_{\varepsilon \downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) = \infty\}, \qquad (3.77)$$
$$M = M_{ac} \cup M_s = \{\lambda \,|\, 0 < \limsup_{\varepsilon \downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon))\}.$$

Then, by Theorem 3.23 we conclude that these sets are minimal supports for $\mu_{ac}$, $\mu_s$, and $\mu$, respectively. In fact, by Theorem 3.23 we could even restrict ourselves to values of $\lambda$, where the $\limsup$ is a $\lim$ (finite or infinite).

Lemma 3.14. The spectrum of $\mu$ is given by

$$\sigma(\mu) = \overline{M}, \qquad M = \{\lambda \,|\, 0 < \liminf_{\varepsilon \downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon))\}. \qquad (3.78)$$
Proof. First observe that $F$ is real holomorphic near $\lambda \notin \sigma(\mu)$ and hence $\mathrm{Im}(F(\lambda)) = 0$ in this case. Thus $M \subseteq \sigma(\mu)$ and since $\sigma(\mu)$ is closed, we even have $\overline{M} \subseteq \sigma(\mu)$. To see the converse, note that by Theorem 3.23, the set $M$ is a support for $\mu$. Thus, if $\lambda \in \sigma(\mu)$, then

$$0 < \mu((\lambda - \varepsilon, \lambda + \varepsilon)) = \mu((\lambda - \varepsilon, \lambda + \varepsilon) \cap M)$$

for all $\varepsilon > 0$ and we can find a sequence $\lambda_n \in (\lambda - 1/n, \lambda + 1/n) \cap M$ converging to $\lambda$ from inside $M$. This shows the remaining part $\sigma(\mu) \subseteq \overline{M}$. $\square$

To recover $\sigma(\mu_{ac})$ from $M_{ac}$, we need the essential closure of a Borel set $N \subseteq \mathbb{R}$,

$$\overline{N}^{ess} = \{\lambda \in \mathbb{R} \,|\, |(\lambda - \varepsilon, \lambda + \varepsilon) \cap N| > 0 \text{ for all } \varepsilon > 0\}. \qquad (3.79)$$

Note that $\overline{N}^{ess}$ is closed, whereas, in contradistinction to the ordinary closure, we might have $N \not\subseteq \overline{N}^{ess}$ (e.g., any isolated point of $N$ will disappear).
Lemma 3.15. The absolutely continuous spectrum of $\mu$ is given by

$$\sigma(\mu_{ac}) = \overline{M_{ac}}^{ess}. \qquad (3.80)$$

Proof. We use that $0 < \mu_{ac}((\lambda - \varepsilon, \lambda + \varepsilon)) = \mu_{ac}((\lambda - \varepsilon, \lambda + \varepsilon) \cap M_{ac})$ is equivalent to $|(\lambda - \varepsilon, \lambda + \varepsilon) \cap M_{ac}| > 0$. One direction follows from the definition of absolute continuity and the other from minimality of $M_{ac}$. $\square$
Problem 3.13. Construct a multiplication operator on $L^2(\mathbb{R})$ which has dense point spectrum.

Problem 3.14. Let $\lambda$ be Lebesgue measure on $\mathbb{R}$. Show that if $f \in AC(\mathbb{R})$ with $f' > 0$, then

$$d(f_\star \lambda) = \frac{1}{f'(f^{-1}(\lambda))}\, d\lambda.$$

Problem 3.15. Let $d\mu(\lambda) = \chi_{[0,1]}(\lambda)\, d\lambda$ and $f(\lambda) = \chi_{(-\infty, t]}(\lambda)$, $t \in \mathbb{R}$. Compute $f_\star \mu$.

Problem 3.16. Let $A$ be the multiplication operator by the Cantor function in $L^2(0, 1)$. Compute the spectrum of $A$. Determine the spectral types.

Problem 3.17. Show the missing direction in the proof of Lemma 3.13.

Problem 3.18. Show $\overline{N}^{ess} \subseteq \overline{N}$.
3.3. Spectral types
Our next aim is to transfer the results of the previous section to arbitrary self-adjoint operators $A$ using Lemma 3.4. To this end, we will need a spectral measure which contains the information from all measures in a spectral basis. This will be the case if there is a vector $\psi$ such that for every $\varphi \in H$ its spectral measure $\mu_\varphi$ is absolutely continuous with respect to $\mu_\psi$. Such a vector will be called a maximal spectral vector of $A$ and $\mu_\psi$ will be called a maximal spectral measure of $A$.
Lemma 3.16. For every self-adjoint operator $A$ there is a maximal spectral vector.

Proof. Let $\{\psi_j\}_{j\in J}$ be a spectral basis and choose nonzero numbers $\varepsilon_j$ with $\sum_{j\in J} |\varepsilon_j|^2 = 1$. Then I claim that

$$\psi = \sum_{j\in J} \varepsilon_j \psi_j$$

is a maximal spectral vector. Let $\varphi$ be given. Then we can write it as $\varphi = \sum_j f_j(A)\psi_j$ and hence $d\mu_\varphi = \sum_j |f_j|^2\, d\mu_{\psi_j}$. But $\mu_\psi(\Omega) = \sum_j |\varepsilon_j|^2 \mu_{\psi_j}(\Omega) = 0$ implies $\mu_{\psi_j}(\Omega) = 0$ for every $j \in J$ and thus $\mu_\varphi(\Omega) = 0$. $\square$
A set $\{\psi_j\}$ of spectral vectors is called ordered if $\psi_k$ is a maximal spectral vector for $A$ restricted to $\big( \bigoplus_{j=1}^{k-1} H_{\psi_j} \big)^\perp$. As in the unordered case one can show

Theorem 3.17. For every self-adjoint operator there is an ordered spectral basis.

Observe that if $\{\psi_j\}$ is an ordered spectral basis, then $\mu_{\psi_{j+1}}$ is absolutely continuous with respect to $\mu_{\psi_j}$.
If $\mu$ is a maximal spectral measure, we have $\sigma(A) = \sigma(\mu)$ and the following generalization of Lemma 3.12 holds.

Theorem 3.18 (Spectral mapping). Let $\mu$ be a maximal spectral measure and let $f$ be real-valued. Then the spectrum of $f(A)$ is given by

$$\sigma(f(A)) = \{\lambda \in \mathbb{R} \,|\, \mu\big(f^{-1}(\lambda - \varepsilon, \lambda + \varepsilon)\big) > 0 \text{ for all } \varepsilon > 0\}. \qquad (3.81)$$

In particular,

$$\sigma(f(A)) \subseteq \overline{f(\sigma(A))}, \qquad (3.82)$$

where equality holds if $f$ is continuous and the closure can be dropped if, in addition, $\sigma(A)$ is bounded.
Next, we want to introduce the splitting (3.74) for arbitrary self-adjoint operators $A$. It is tempting to pick a spectral basis and treat each summand in the direct sum separately. However, since it is not clear that this approach is independent of the spectral basis chosen, we use the more sophisticated definition

$$H_{ac} = \{\psi \in H \,|\, \mu_\psi \text{ is absolutely continuous}\},$$
$$H_{sc} = \{\psi \in H \,|\, \mu_\psi \text{ is singularly continuous}\},$$
$$H_{pp} = \{\psi \in H \,|\, \mu_\psi \text{ is pure point}\}. \qquad (3.83)$$
Lemma 3.19. We have

$$H = H_{ac} \oplus H_{sc} \oplus H_{pp}. \qquad (3.84)$$

There are Borel sets $M_{xx}$ such that the projector onto $H_{xx}$ is given by $P_{xx} = \chi_{M_{xx}}(A)$, $xx \in \{ac, sc, pp\}$. In particular, the subspaces $H_{xx}$ reduce $A$. For the sets $M_{xx}$ one can choose the corresponding supports of some maximal spectral measure $\mu$.
Proof. We will use the unitary operator $U$ of Lemma 3.4. Pick $\varphi \in H$ and write $\varphi = \sum_n \varphi_n$ with $\varphi_n \in H_{\psi_n}$. Let $f_n = U\varphi_n$. Then, by construction of the unitary operator $U$, $\varphi_n = f_n(A)\psi_n$ and hence $d\mu_{\varphi_n} = |f_n|^2\, d\mu_{\psi_n}$. Moreover, since the subspaces $H_{\psi_n}$ are orthogonal, we have

$$d\mu_\varphi = \sum_n |f_n|^2\, d\mu_{\psi_n}$$

and hence

$$d\mu_{\varphi, xx} = \sum_n |f_n|^2\, d\mu_{\psi_n, xx}, \qquad xx \in \{ac, sc, pp\}.$$

This shows

$$U H_{xx} = \bigoplus_n L^2(\mathbb{R}, d\mu_{\psi_n, xx}), \qquad xx \in \{ac, sc, pp\},$$

and reduces our problem to the considerations of the previous section.

Furthermore, note that if $\mu$ is a maximal spectral measure, then every support for $\mu_{xx}$ is also a support for $\mu_{\varphi, xx}$ for any $\varphi \in H$. $\square$
The absolutely continuous, singularly continuous, and pure point spectrum of $A$ are defined as

$$\sigma_{ac}(A) = \sigma(A\big|_{H_{ac}}), \qquad \sigma_{sc}(A) = \sigma(A\big|_{H_{sc}}), \qquad \sigma_{pp}(A) = \sigma(A\big|_{H_{pp}}), \qquad (3.85)$$

respectively. If $\mu$ is a maximal spectral measure, we have $\sigma_{ac}(A) = \sigma(\mu_{ac})$, $\sigma_{sc}(A) = \sigma(\mu_{sc})$, and $\sigma_{pp}(A) = \sigma(\mu_{pp})$.

If $A$ and $\tilde A$ are unitarily equivalent via $U$, then so are $A\big|_{H_{xx}}$ and $\tilde A\big|_{\tilde H_{xx}}$ by (3.52). In particular, $\sigma_{xx}(A) = \sigma_{xx}(\tilde A)$.
Problem 3.19. Compute $\sigma(A)$, $\sigma_{ac}(A)$, $\sigma_{sc}(A)$, and $\sigma_{pp}(A)$ for the multiplication operator $A = \frac{1}{1+x^2}$ in $L^2(\mathbb{R})$. What is its spectral multiplicity?
3.4. Appendix: The Herglotz theorem

Let $\mathbb{C}_\pm = \{z \in \mathbb{C} \,|\, \pm\mathrm{Im}(z) > 0\}$ be the upper, respectively, lower, half plane. A holomorphic function $F : \mathbb{C}_+ \to \mathbb{C}_+$ mapping the upper half plane to itself is called a Herglotz function. We can define $F$ on $\mathbb{C}_-$ using $F(z^*) = F(z)^*$.

In Theorem 3.10 we have seen that the Borel transform of a finite measure is a Herglotz function satisfying a growth estimate. It turns out that the converse is also true.
Theorem 3.20 (Herglotz representation). Suppose $F$ is a Herglotz function satisfying

$$|F(z)| \le \frac{M}{\mathrm{Im}(z)}, \qquad z \in \mathbb{C}_+. \qquad (3.86)$$

Then there is a Borel measure $\mu$, satisfying $\mu(\mathbb{R}) \le M$, such that $F$ is the Borel transform of $\mu$.
Proof. We abbreviate $F(z) = v(z) + \mathrm{i}\, w(z)$ and $z = x + \mathrm{i} y$. Next we choose a contour

$$\Gamma = \{x + \mathrm{i}\varepsilon + \lambda \,|\, \lambda \in [-R, R]\} \cup \{x + \mathrm{i}\varepsilon + R \mathrm{e}^{\mathrm{i}\varphi} \,|\, \varphi \in [0, \pi]\}$$

and note that $z$ lies inside $\Gamma$ and $z^* + 2\mathrm{i}\varepsilon$ lies outside $\Gamma$ if $0 < \varepsilon < y < R$. Hence we have by Cauchy's formula

$$F(z) = \frac{1}{2\pi\mathrm{i}} \oint_\Gamma \Big( \frac{1}{\zeta - z} - \frac{1}{\zeta - z^* - 2\mathrm{i}\varepsilon} \Big) F(\zeta)\, d\zeta.$$

Inserting the explicit form of $\Gamma$, we see

$$F(z) = \frac{1}{\pi} \int_{-R}^{R} \frac{y - \varepsilon}{\lambda^2 + (y - \varepsilon)^2}\, F(x + \mathrm{i}\varepsilon + \lambda)\, d\lambda$$
$$+ \frac{\mathrm{i}}{\pi} \int_0^\pi \frac{y - \varepsilon}{R^2 \mathrm{e}^{2\mathrm{i}\varphi} + (y - \varepsilon)^2}\, F(x + \mathrm{i}\varepsilon + R\mathrm{e}^{\mathrm{i}\varphi})\, R\mathrm{e}^{\mathrm{i}\varphi}\, d\varphi.$$

The integral over the semi-circle vanishes as $R \to \infty$ and hence we obtain

$$F(z) = \frac{1}{\pi} \int_{\mathbb{R}} \frac{y - \varepsilon}{(\lambda - x)^2 + (y - \varepsilon)^2}\, F(\lambda + \mathrm{i}\varepsilon)\, d\lambda$$

and taking imaginary parts,

$$w(z) = \int_{\mathbb{R}} \phi_\varepsilon(\lambda)\, w_\varepsilon(\lambda)\, d\lambda,$$

where $\phi_\varepsilon(\lambda) = (y - \varepsilon)/((\lambda - x)^2 + (y - \varepsilon)^2)$ and $w_\varepsilon(\lambda) = w(\lambda + \mathrm{i}\varepsilon)/\pi$. Letting $y \to \infty$, we infer from our bound

$$\int_{\mathbb{R}} w_\varepsilon(\lambda)\, d\lambda \le M.$$

In particular, since $|\phi_\varepsilon(\lambda) - \phi_0(\lambda)| \le \mathrm{const}\cdot\varepsilon$, we have

$$w(z) = \lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}} \phi_0(\lambda)\, d\mu_\varepsilon(\lambda),$$

where $\mu_\varepsilon(\lambda) = \int_{-\infty}^\lambda w_\varepsilon(x)\, dx$. Since $\mu_\varepsilon(\mathbb{R}) \le M$, Lemma A.26 implies that there is a subsequence which converges vaguely to some measure $\mu$. Moreover, by Lemma A.27 we even have

$$w(z) = \int_{\mathbb{R}} \phi_0(\lambda)\, d\mu(\lambda).$$

Now $F(z)$ and $\int_{\mathbb{R}} (\lambda - z)^{-1}\, d\mu(\lambda)$ have the same imaginary part and thus they only differ by a real constant. By our bound this constant must be zero. $\square$
Observe

$$\mathrm{Im}(F(z)) = \mathrm{Im}(z) \int_{\mathbb{R}} \frac{d\mu(\lambda)}{|\lambda - z|^2} \qquad (3.87)$$

and

$$\lim_{\lambda \to \infty} \lambda\, \mathrm{Im}(F(\mathrm{i}\lambda)) = \mu(\mathbb{R}). \qquad (3.88)$$
Theorem 3.21. Let $F$ be the Borel transform of some finite Borel measure $\mu$. Then the measure $\mu$ is unique and can be reconstructed via the Stieltjes inversion formula

$$\frac{1}{2} \big( \mu((\lambda_1, \lambda_2)) + \mu([\lambda_1, \lambda_2]) \big) = \lim_{\varepsilon \downarrow 0} \frac{1}{\pi} \int_{\lambda_1}^{\lambda_2} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon))\, d\lambda. \qquad (3.89)$$
Proof. By Fubini we have

$$\frac{1}{\pi} \int_{\lambda_1}^{\lambda_2} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon))\, d\lambda = \frac{1}{\pi} \int_{\lambda_1}^{\lambda_2} \int_{\mathbb{R}} \frac{\varepsilon}{(x - \lambda)^2 + \varepsilon^2}\, d\mu(x)\, d\lambda$$
$$= \int_{\mathbb{R}} \frac{1}{\pi} \int_{\lambda_1}^{\lambda_2} \frac{\varepsilon}{(x - \lambda)^2 + \varepsilon^2}\, d\lambda\, d\mu(x),$$

where

$$\frac{1}{\pi} \int_{\lambda_1}^{\lambda_2} \frac{\varepsilon}{(x - \lambda)^2 + \varepsilon^2}\, d\lambda = \frac{1}{\pi} \Big( \arctan\Big( \frac{\lambda_2 - x}{\varepsilon} \Big) - \arctan\Big( \frac{\lambda_1 - x}{\varepsilon} \Big) \Big)$$
$$\to \frac{1}{2} \big( \chi_{[\lambda_1, \lambda_2]}(x) + \chi_{(\lambda_1, \lambda_2)}(x) \big)$$

pointwise. Hence the result follows from the dominated convergence theorem since $0 \le \frac{1}{\pi} \big( \arctan( \frac{\lambda_2 - x}{\varepsilon} ) - \arctan( \frac{\lambda_1 - x}{\varepsilon} ) \big) \le 1$. $\square$
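A numerical sketch of the inversion formula (a made-up two-point measure, not from the text): the mass of an interval is recovered from the boundary values of the Borel transform.

```python
# mu has weight 0.4 at 0 and 0.6 at 2; F(z) = sum_j w_j/(lambda_j - z).
# We approximate (1/pi) int_{l1}^{l2} Im F(l + i*eps) dl by a midpoint sum.
import math

pts, wts = [0.0, 2.0], [0.4, 0.6]

def imF(l, eps):
    return sum(w * eps / ((p - l) ** 2 + eps ** 2) for p, w in zip(pts, wts))

def mass(l1, l2, eps, n=20000):
    h = (l2 - l1) / n
    return sum(imF(l1 + (k + 0.5) * h, eps) for k in range(n)) * h / math.pi

# the interval (-1, 1) contains only the point 0, carrying weight 0.4
approx = mass(-1.0, 1.0, 1e-3)
assert abs(approx - 0.4) < 0.01
```

With smaller $\varepsilon$ (and a correspondingly finer grid) the approximation improves, matching the $\varepsilon \downarrow 0$ limit in (3.89).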
Furthermore, the Radon–Nikodym derivative of $\mu$ can be obtained from the boundary values of $F$.
Theorem 3.22. Let $\mu$ be a finite Borel measure and $F$ its Borel transform. Then

$$(\underline{D}\mu)(\lambda) \le \liminf_{\varepsilon \downarrow 0} \frac{1}{\pi} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) \le \limsup_{\varepsilon \downarrow 0} \frac{1}{\pi} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) \le (\overline{D}\mu)(\lambda). \qquad (3.90)$$
Proof. We need to estimate

$$\mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) = \int_{\mathbb{R}} K_\varepsilon(\lambda - t)\, d\mu(t), \qquad K_\varepsilon(t) = \frac{\varepsilon}{t^2 + \varepsilon^2}.$$

We first split the integral into two parts:

$$\mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) = \int_{I_\delta} K_\varepsilon(\lambda - t)\, d\mu(t) + \int_{\mathbb{R}\setminus I_\delta} K_\varepsilon(\lambda - t)\, d\mu(t), \qquad I_\delta = (\lambda - \delta, \lambda + \delta).$$

Clearly the second part can be estimated by

$$\int_{\mathbb{R}\setminus I_\delta} K_\varepsilon(\lambda - t)\, d\mu(t) \le K_\varepsilon(\delta)\, \mu(\mathbb{R}).$$

To estimate the first part, we integrate

$$-K_\varepsilon'(s)\, ds\, d\mu(t)$$

over the triangle $\{(s, t) \,|\, \lambda - s < t < \lambda + s,\ 0 < s < \delta\} = \{(s, t) \,|\, \lambda - \delta < t < \lambda + \delta,\ |t - \lambda| < s < \delta\}$ and obtain

$$\int_0^\delta \mu(I_s)\big( -K_\varepsilon'(s) \big)\, ds = \int_{I_\delta} \big( K_\varepsilon(t - \lambda) - K_\varepsilon(\delta) \big)\, d\mu(t).$$

Now suppose there are constants $c$ and $C$ such that $c \le \frac{\mu(I_s)}{2s} \le C$, $0 \le s \le \delta$. Then

$$2c \arctan\Big( \frac{\delta}{\varepsilon} \Big) \le \int_{I_\delta} K_\varepsilon(t - \lambda)\, d\mu(t) \le 2C \arctan\Big( \frac{\delta}{\varepsilon} \Big)$$

since

$$\delta K_\varepsilon(\delta) + \int_0^\delta s\big( -K_\varepsilon'(s) \big)\, ds = \arctan\Big( \frac{\delta}{\varepsilon} \Big).$$

Thus the claim follows combining both estimates. $\square$
As a consequence of Theorem A.37 and Theorem A.38 we obtain (cf. also Lemma A.39)

Theorem 3.23. Let $\mu$ be a finite Borel measure and $F$ its Borel transform. Then the limit

$$\mathrm{Im}(F(\lambda)) = \lim_{\varepsilon \downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) \qquad (3.91)$$

exists a.e. with respect to both $\mu$ and Lebesgue measure (finite or infinite) and

$$(D\mu)(\lambda) = \frac{1}{\pi} \mathrm{Im}(F(\lambda)) \qquad (3.92)$$

whenever $(D\mu)(\lambda)$ exists.
Moreover, the set $\{\lambda \,|\, \mathrm{Im}(F(\lambda)) = \infty\}$ is a support for the singularly continuous part and $\{\lambda \,|\, 0 < \mathrm{Im}(F(\lambda)) < \infty\}$ is a minimal support for the absolutely continuous part.

In particular,

Corollary 3.24. The measure $\mu$ is purely absolutely continuous on $I$ if $\limsup_{\varepsilon \downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) < \infty$ for all $\lambda \in I$.

The limit of the real part can be computed as well.

Corollary 3.25. The limit

$$\lim_{\varepsilon \downarrow 0} F(\lambda + \mathrm{i}\varepsilon) \qquad (3.93)$$

exists a.e. with respect to both $\mu$ and Lebesgue measure. It is finite a.e. with respect to Lebesgue measure.
Proof. If $F(z)$ is a Herglotz function, then so is $\sqrt{F(z)}$. Moreover, $\sqrt{F(z)}$ has values in the first quadrant; that is, both $\mathrm{Re}(\sqrt{F(z)})$ and $\mathrm{Im}(\sqrt{F(z)})$ are positive for $z \in \mathbb{C}_+$. Hence both $\sqrt{F(z)}$ and $\mathrm{i}\sqrt{F(z)}$ are Herglotz functions and by Theorem 3.23 both $\lim_{\varepsilon \downarrow 0} \mathrm{Re}(\sqrt{F(\lambda + \mathrm{i}\varepsilon)})$ and $\lim_{\varepsilon \downarrow 0} \mathrm{Im}(\sqrt{F(\lambda + \mathrm{i}\varepsilon)})$ exist and are finite a.e. with respect to Lebesgue measure. By taking squares, the same is true for $F(z)$ and hence $\lim_{\varepsilon \downarrow 0} F(\lambda + \mathrm{i}\varepsilon)$ exists and is finite a.e. with respect to Lebesgue measure. Since $\lim_{\varepsilon \downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) = \infty$ implies $\lim_{\varepsilon \downarrow 0} F(\lambda + \mathrm{i}\varepsilon) = \infty$, the result follows. $\square$
Problem 3.20. Find all rational Herglotz functions $F : \mathbb{C} \to \mathbb{C}$ satisfying $F(z^*) = F(z)^*$ and $\lim_{|z|\to\infty} |z F(z)| = M < \infty$. What can you say about the zeros of $F$?

Problem 3.21. A complex measure $d\mu$ is a measure which can be written as a complex linear combination of positive measures $d\mu_j$:

$$d\mu = d\mu_1 - d\mu_2 + \mathrm{i}(d\mu_3 - d\mu_4).$$

Let

$$F(z) = \int_{\mathbb{R}} \frac{d\mu(\lambda)}{\lambda - z}$$

be the Borel transform of a complex measure. Show that $\mu$ is uniquely determined by $F$ via the Stieltjes inversion formula

$$\frac{1}{2} \big( \mu((\lambda_1, \lambda_2)) + \mu([\lambda_1, \lambda_2]) \big) = \lim_{\varepsilon \downarrow 0} \frac{1}{2\pi\mathrm{i}} \int_{\lambda_1}^{\lambda_2} \big( F(\lambda + \mathrm{i}\varepsilon) - F(\lambda - \mathrm{i}\varepsilon) \big)\, d\lambda.$$

Problem 3.22. Compute the Borel transform of the complex measure given by $d\mu(\lambda) = \frac{d\lambda}{(\lambda - \mathrm{i})^2}$.
Chapter 4

Applications of the spectral theorem

This chapter can be mostly skipped on first reading. You might want to have a look at the first section and then come back to the remaining ones later.

Now let us show how the spectral theorem can be used. We will give a few typical applications:

First we will derive an operator-valued version of the Stieltjes inversion formula. To do this, we need to show how to integrate a family of functions of $A$ with respect to a parameter. Moreover, we will show that these integrals can be evaluated by computing the corresponding integrals of the complex-valued functions.

Secondly we will consider commuting operators and show how certain facts, which are known to hold for the resolvent of an operator $A$, can be established for a larger class of functions.

Then we will show how the eigenvalues below the essential spectrum and the dimension of $\mathrm{Ran}\, P_A(\Omega)$ can be estimated using the quadratic form.

Finally, we will investigate tensor products of operators.
4.1. Integral formulas
We begin with the first task by having a closer look at the projections $P_A(\Omega)$. They project onto subspaces corresponding to expectation values in the set $\Omega$. In particular, the number

$$\langle \psi, \chi_\Omega(A)\psi \rangle \qquad (4.1)$$

is the probability for a measurement of $a$ to lie in $\Omega$. In addition, we have

$$\langle \psi, A\psi \rangle = \int_\Omega \lambda\, d\mu_\psi(\lambda) \in \mathrm{hull}(\Omega), \qquad \psi \in P_A(\Omega)H,\ \|\psi\| = 1, \qquad (4.2)$$

where $\mathrm{hull}(\Omega)$ is the convex hull of $\Omega$.

The space $\mathrm{Ran}\, \chi_{\{\lambda_0\}}(A)$ is called the eigenspace corresponding to $\lambda_0$ since we have

$$\langle \psi, A\psi \rangle = \int_{\mathbb{R}} \lambda\, \chi_{\{\lambda_0\}}(\lambda)\, d\mu_\psi(\lambda) = \lambda_0 \int_{\mathbb{R}} d\mu_\psi(\lambda) = \lambda_0 \langle \psi, \psi \rangle \qquad (4.3)$$

and hence $A\psi = \lambda_0 \psi$ for all $\psi \in \mathrm{Ran}\, \chi_{\{\lambda_0\}}(A)$. The dimension of the eigenspace is called the multiplicity of the eigenvalue.

Moreover, since

$$\lim_{\varepsilon \downarrow 0} \frac{-\mathrm{i}\varepsilon}{\lambda - \lambda_0 - \mathrm{i}\varepsilon} = \chi_{\{\lambda_0\}}(\lambda), \qquad (4.4)$$

we infer from Theorem 3.1 that

$$\lim_{\varepsilon \downarrow 0} -\mathrm{i}\varepsilon\, R_A(\lambda_0 + \mathrm{i}\varepsilon)\psi = \chi_{\{\lambda_0\}}(A)\psi. \qquad (4.5)$$

Similarly, we can obtain an operator-valued version of the Stieltjes inversion formula. But first we need to recall a few facts from integration in Banach spaces.
We will consider the case of mappings $f : I \to X$ where $I = [t_0, t_1] \subset \mathbb{R}$ is a compact interval and $X$ is a Banach space. As before, a function $f : I \to X$ is called simple if the image of $f$ is finite, $f(I) = \{x_i\}_{i=1}^n$, and if each inverse image $f^{-1}(x_i)$, $1 \le i \le n$, is a Borel set. The set of simple functions $S(I, X)$ forms a linear space and can be equipped with the sup norm

$$\|f\|_\infty = \sup_{t \in I} \|f(t)\|. \qquad (4.6)$$

The corresponding Banach space obtained after completion is called the set of regulated functions $R(I, X)$.

Observe that $C(I, X) \subset R(I, X)$. In fact, consider the simple function $f_n = \sum_{i=0}^{n-1} f(s_i)\, \chi_{[s_i, s_{i+1})}$, where $s_i = t_0 + i\, \frac{t_1 - t_0}{n}$. Since $f \in C(I, X)$ is uniformly continuous, we infer that $f_n$ converges uniformly to $f$.

For $f \in S(I, X)$ we can define a linear map $\int : S(I, X) \to X$ by

$$\int_I f(t)\, dt = \sum_{i=1}^n x_i\, |f^{-1}(x_i)|, \qquad (4.7)$$

where $|\Omega|$ denotes the Lebesgue measure of $\Omega$. This map satisfies

$$\Big\| \int_I f(t)\, dt \Big\| \le \|f\|_\infty (t_1 - t_0) \qquad (4.8)$$
4.1. Integral formulas 113
and hence it can be extended uniquely to a linear map
_
: R(I, X) X
with the same norm (t
1
t
0
) by Theorem 0.26. We even have
|
_
I
f(t)dt|
_
I
|f(t)|dt, (4.9)
which clearly holds for f S(I, X) and thus for all f R(I, X) by conti-
nuity. In addition, if X

is a continuous linear functional, then


(
_
I
f(t)dt) =
_
I
(f(t))dt, f R(I, X). (4.10)
In particular, if A(t) R(I, L(H)), then
__
I
A(t)dt
_
=
_
I
(A(t))dt. (4.11)
If I = R, we say that f : I X is integrable if f R([r, r], X) for all
r > 0 and if |f(t)| is integrable. In this case we can set
_
R
f(t)dt = lim
r
_
[r,r]
f(t)dt (4.12)
and (4.9) and (4.10) still hold.
We will use the standard notation $\int_{t_2}^{t_3}f(s)\,ds=\int_I\chi_{(t_2,t_3)}(s)f(s)\,ds$ and $\int_{t_3}^{t_2}f(s)\,ds=-\int_{t_2}^{t_3}f(s)\,ds$.

We write $f\in C^1(I,X)$ if

$$\frac{d}{dt}f(t)=\lim_{\varepsilon\to0}\frac{f(t+\varepsilon)-f(t)}{\varepsilon} \tag{4.13}$$

exists for all $t\in I$. In particular, if $f\in C(I,X)$, then $F(t)=\int_{t_0}^t f(s)\,ds\in C^1(I,X)$ and $dF/dt=f$ as can be seen from

$$\|F(t+\varepsilon)-F(t)-\varepsilon f(t)\|=\Big\|\int_t^{t+\varepsilon}(f(s)-f(t))\,ds\Big\|\le|\varepsilon|\sup_{s\in[t,t+\varepsilon]}\|f(s)-f(t)\|. \tag{4.14}$$

The important facts for us are the following two results.

Lemma 4.1. Suppose $f:I\times\mathbb R\to\mathbb C$ is a bounded Borel function and set $F(\lambda)=\int_I f(t,\lambda)\,dt$. Let $A$ be self-adjoint. Then $f(t,A)\in R(I,\mathfrak L(\mathcal H))$ and

$$F(A)=\int_I f(t,A)\,dt,\quad\text{respectively,}\quad F(A)\psi=\int_I f(t,A)\psi\,dt. \tag{4.15}$$

Proof. That $f(t,A)\in R(I,\mathfrak L(\mathcal H))$ follows from the spectral theorem, since it is no restriction to assume that $A$ is multiplication by $\lambda$ in some $L^2$ space. We compute

$$\Big\langle\varphi,\Big(\int_I f(t,A)\,dt\Big)\psi\Big\rangle=\int_I\langle\varphi,f(t,A)\psi\rangle\,dt
=\int_I\int_{\mathbb R}f(t,\lambda)\,d\mu_{\varphi,\psi}(\lambda)\,dt
=\int_{\mathbb R}\int_I f(t,\lambda)\,dt\,d\mu_{\varphi,\psi}(\lambda)
=\int_{\mathbb R}F(\lambda)\,d\mu_{\varphi,\psi}(\lambda)=\langle\varphi,F(A)\psi\rangle$$

by Fubini's theorem and hence the first claim follows. □

Lemma 4.2. Suppose $f:\mathbb R\to\mathfrak L(\mathcal H)$ is integrable and $A\in\mathfrak L(\mathcal H)$. Then

$$A\int_{\mathbb R}f(t)\,dt=\int_{\mathbb R}Af(t)\,dt,\quad\text{respectively,}\quad\int_{\mathbb R}f(t)\,dt\,A=\int_{\mathbb R}f(t)A\,dt. \tag{4.16}$$

Proof. It suffices to prove the case where $f$ is simple and of compact support. But for such functions the claim is straightforward. □
Now we can prove an operator-valued version of the Stieltjes inversion formula.

Theorem 4.3 (Stone's formula). Let $A$ be self-adjoint. Then

$$\frac{1}{2\pi\mathrm i}\int_{\lambda_1}^{\lambda_2}\big(R_A(\lambda+\mathrm i\varepsilon)-R_A(\lambda-\mathrm i\varepsilon)\big)\,d\lambda\ \overset{s}{\to}\ \frac12\big(P_A([\lambda_1,\lambda_2])+P_A((\lambda_1,\lambda_2))\big) \tag{4.17}$$

strongly as $\varepsilon\downarrow0$.

Proof. By

$$\frac{1}{2\pi\mathrm i}\int_{\lambda_1}^{\lambda_2}\Big(\frac{1}{x-\lambda-\mathrm i\varepsilon}-\frac{1}{x-\lambda+\mathrm i\varepsilon}\Big)d\lambda
=\frac1\pi\int_{\lambda_1}^{\lambda_2}\frac{\varepsilon}{(x-\lambda)^2+\varepsilon^2}\,d\lambda
=\frac1\pi\Big(\arctan\Big(\frac{\lambda_2-x}{\varepsilon}\Big)-\arctan\Big(\frac{\lambda_1-x}{\varepsilon}\Big)\Big)
\to\frac12\big(\chi_{[\lambda_1,\lambda_2]}(x)+\chi_{(\lambda_1,\lambda_2)}(x)\big)$$

the result follows combining the last part of Theorem 3.1 with Lemma 4.1. □

Note that by using the first resolvent formula, Stone's formula can also be written in the form

$$\Big\langle\psi,\frac12\big(P_A([\lambda_1,\lambda_2])+P_A((\lambda_1,\lambda_2))\big)\psi\Big\rangle
=\lim_{\varepsilon\downarrow0}\frac1\pi\int_{\lambda_1}^{\lambda_2}\mathrm{Im}\,\langle\psi,R_A(\lambda+\mathrm i\varepsilon)\psi\rangle\,d\lambda
=\lim_{\varepsilon\downarrow0}\frac{\varepsilon}{\pi}\int_{\lambda_1}^{\lambda_2}\|R_A(\lambda+\mathrm i\varepsilon)\psi\|^2\,d\lambda. \tag{4.18}$$

Problem 4.1. Let $\Gamma$ be a differentiable Jordan curve in $\rho(A)$. Show (for $\Gamma$ positively oriented)

$$\chi_\Omega(A)=-\frac{1}{2\pi\mathrm i}\oint_\Gamma R_A(z)\,dz,$$

where $\Omega$ is the intersection of the interior of $\Gamma$ with $\mathbb R$.
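To see Stone's formula in action, here is a small numerical sketch (not from the book): for a real symmetric matrix, the $\varepsilon$-regularized integral on the left of (4.17) is compared with the spectral projection onto the eigenvalues inside $(\lambda_1,\lambda_2)$. The $3\times3$ matrix, the value $\varepsilon=0.01$, and the tolerance are illustrative choices.

```python
import numpy as np

# symmetric test matrix with eigenvalues 3 - sqrt(3), 3, 3 + sqrt(3)
A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])
evals, V = np.linalg.eigh(A)

def resolvent(z):
    # R_A(z) = (A - z)^{-1}
    return np.linalg.inv(A - z * np.eye(3))

# left-hand side of (4.17) for small eps > 0, by a trapezoidal sum
l1, l2, eps = 2.0, 4.0, 1e-2
lam = np.linspace(l1, l2, 2001)
vals = np.array([resolvent(l + 1j * eps) - resolvent(l - 1j * eps)
                 for l in lam])
integral = (vals[:-1] + vals[1:]).sum(axis=0) / 2 * (lam[1] - lam[0])
lhs = integral / (2j * np.pi)

# right-hand side: projection onto eigenvalues in (l1, l2); since no
# eigenvalue sits on an endpoint, [l1, l2] and (l1, l2) agree here
mask = (evals > l1) & (evals < l2)
P = V[:, mask] @ V[:, mask].T
err = np.linalg.norm(lhs - P)   # small but nonzero for finite eps
```

The error decays as $\varepsilon\downarrow0$ (up to the quadrature resolution), in agreement with the strong limit in (4.17).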
4.2. Commuting operators

Now we come to commuting operators. As a preparation we can now prove

Lemma 4.4. Let $K\subseteq\mathbb R$ be closed and let $C_\infty(K)$ be the set of all continuous functions on $K$ which vanish at $\infty$ (if $K$ is unbounded) with the sup norm. The $*$-subalgebra generated by the function

$$\lambda\mapsto\frac{1}{\lambda-z} \tag{4.19}$$

for one $z\in\mathbb C\setminus K$ is dense in $C_\infty(K)$.

Proof. If $K$ is compact, the claim follows directly from the complex Stone–Weierstraß theorem since $(\lambda_1-z)^{-1}=(\lambda_2-z)^{-1}$ implies $\lambda_1=\lambda_2$. Otherwise, replace $K$ by $\tilde K=K\cup\{\infty\}$, which is compact, and set $(\infty-z)^{-1}=0$. Then we can again apply the complex Stone–Weierstraß theorem to conclude that our $*$-subalgebra is equal to $\{f\in C(\tilde K)\,|\,f(\infty)=0\}$, which is equivalent to $C_\infty(K)$. □

We say that two bounded operators $A$, $B$ commute if

$$[A,B]=AB-BA=0. \tag{4.20}$$

If $A$ or $B$ is unbounded, we soon run into trouble with this definition since the above expression might not even make sense for any nonzero vector (e.g., take $B=\langle\varphi,\cdot\rangle\psi$ with $\psi\notin D(A)$). To avoid this nuisance, we will replace $A$ by a bounded function of $A$. A good candidate is the resolvent. Hence if $A$ is self-adjoint and $B$ is bounded, we will say that $A$ and $B$ commute if

$$[R_A(z),B]=[R_A(z^*),B]=0 \tag{4.21}$$

for one $z\in\rho(A)$.
Lemma 4.5. Suppose $A$ is self-adjoint and commutes with the bounded operator $B$. Then

$$[f(A),B]=0 \tag{4.22}$$

for any bounded Borel function $f$. If $f$ is unbounded, the claim holds for any $\psi\in D(f(A))$ in the sense that $Bf(A)\subseteq f(A)B$.

Proof. Equation (4.21) tells us that (4.22) holds for any $f$ in the $*$-subalgebra generated by $R_A(z)$. Since this subalgebra is dense in $C_\infty(\sigma(A))$, the claim follows for all such $f\in C_\infty(\sigma(A))$. Next fix $\psi\in\mathcal H$ and let $f$ be bounded. Choose a sequence $f_n\in C_\infty(\sigma(A))$ converging to $f$ in $L^2(\mathbb R,d\mu_\psi)$. Then

$$Bf(A)\psi=\lim_{n\to\infty}Bf_n(A)\psi=\lim_{n\to\infty}f_n(A)B\psi=f(A)B\psi.$$

If $f$ is unbounded, let $\psi\in D(f(A))$ and choose $f_n$ as in (3.26). Then

$$f(A)B\psi=\lim_{n\to\infty}f_n(A)B\psi=\lim_{n\to\infty}Bf_n(A)\psi$$

shows $f\in L^2(\mathbb R,d\mu_{B\psi})$ (i.e., $B\psi\in D(f(A))$) and $f(A)B\psi=Bf(A)\psi$. □

In the special case where $B$ is an orthogonal projection, we obtain

Corollary 4.6. Let $A$ be self-adjoint and $\mathcal H_1$ a closed subspace with corresponding projector $P_1$. Then $\mathcal H_1$ reduces $A$ if and only if $P_1$ and $A$ commute.

Furthermore, note

Corollary 4.7. If $A$ is self-adjoint and bounded, then (4.21) holds if and only if (4.20) holds.

Proof. Since $\sigma(A)$ is compact, we have $\lambda\in C_\infty(\sigma(A))$ and hence (4.20) follows from (4.22) by our lemma. Conversely, since $B$ commutes with any polynomial of $A$, the claim follows from the Neumann series. □

As another consequence we obtain

Theorem 4.8. Suppose $A$ is self-adjoint and has simple spectrum. A bounded operator $B$ commutes with $A$ if and only if $B=f(A)$ for some bounded Borel function.

Proof. Let $\psi$ be a cyclic vector for $A$. By our unitary equivalence it is no restriction to assume $\mathcal H=L^2(\mathbb R,d\mu_\psi)$. Then

$$Bg(\lambda)=Bg(\lambda)\cdot1=g(\lambda)(B1)(\lambda)$$

since $B$ commutes with the multiplication operator $g(\lambda)$. Hence $B$ is multiplication by $f(\lambda)=(B1)(\lambda)$. □

The assumption that the spectrum of $A$ is simple is crucial as the example $A=\mathbb I$ shows. Note also that the functions $\exp(-\mathrm itA)$ can also be used instead of resolvents.

Lemma 4.9. Suppose $A$ is self-adjoint and $B$ is bounded. Then $B$ commutes with $A$ if and only if

$$[\mathrm e^{-\mathrm iAt},B]=0 \tag{4.23}$$

for all $t\in\mathbb R$.

Proof. It suffices to show $[\hat f(A),B]=0$ for $f\in\mathcal S(\mathbb R)$, since these functions are dense in $C_\infty(\mathbb R)$ by the complex Stone–Weierstraß theorem. Here $\hat f$ denotes the Fourier transform of $f$; see Section 7.1. But for such $f$ we have

$$[\hat f(A),B]=\frac{1}{\sqrt{2\pi}}\Big[\int_{\mathbb R}f(t)\mathrm e^{-\mathrm iAt}\,dt,B\Big]=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}f(t)[\mathrm e^{-\mathrm iAt},B]\,dt=0$$

by Lemma 4.2. □

The extension to the case where $B$ is self-adjoint and unbounded is straightforward. We say that $A$ and $B$ commute in this case if

$$[R_A(z_1),R_B(z_2)]=[R_A(z_1^*),R_B(z_2)]=0 \tag{4.24}$$

for one $z_1\in\rho(A)$ and one $z_2\in\rho(B)$ (the claim for $z_2^*$ follows by taking adjoints). From our above analysis it follows that this is equivalent to

$$[\mathrm e^{-\mathrm iAt},\mathrm e^{-\mathrm iBs}]=0,\qquad t,s\in\mathbb R, \tag{4.25}$$

respectively,

$$[f(A),g(B)]=0 \tag{4.26}$$

for arbitrary bounded Borel functions $f$ and $g$.

Problem 4.2. Let $A$ and $B$ be self-adjoint. Show that $A$ and $B$ commute if and only if the corresponding spectral projections $P_A(\Omega)$ and $P_B(\Omega)$ commute for every Borel set $\Omega$. In particular, $\mathrm{Ran}(P_B(\Omega))$ reduces $A$ and vice versa.

Problem 4.3. Let $A$ and $B$ be self-adjoint operators with pure point spectrum. Show that $A$ and $B$ commute if and only if they have a common orthonormal basis of eigenfunctions.
4.3. The min-max theorem

In many applications a self-adjoint operator has a number of eigenvalues below the bottom of the essential spectrum. The essential spectrum is obtained from the spectrum by removing all discrete eigenvalues with finite multiplicity (we will have a closer look at this in Section 6.2). In general there is no way of computing the lowest eigenvalues and their corresponding eigenfunctions explicitly. However, one often has some idea about how the eigenfunctions might approximately look.

So suppose we have a normalized function $\psi_1$ which is an approximation for the eigenfunction $\varphi_1$ of the lowest eigenvalue $E_1$. Then by Theorem 2.19 we know that

$$\langle\psi_1,A\psi_1\rangle\ge\langle\varphi_1,A\varphi_1\rangle=E_1. \tag{4.27}$$

If we add some free parameters to $\psi_1$, one can optimize them and obtain quite good upper bounds for the first eigenvalue.

But is there also something one can say about the next eigenvalues? Suppose we know the first eigenfunction $\varphi_1$. Then we can restrict $A$ to the orthogonal complement of $\varphi_1$ and proceed as before: $E_2$ will be the infimum over all expectations restricted to this subspace. If we restrict to the orthogonal complement of an approximating eigenfunction $\psi_1$, there will still be a component in the direction of $\varphi_1$ left and hence the infimum of the expectations will be lower than $E_2$. Thus the optimal choice $\psi_1=\varphi_1$ will give the maximal value $E_2$.

More precisely, let $\{\varphi_j\}_{j=1}^N$ be an orthonormal basis for the space spanned by the eigenfunctions corresponding to eigenvalues below the essential spectrum. Here the essential spectrum $\sigma_{ess}(A)$ is given by precisely those values in the spectrum which are not isolated eigenvalues of finite multiplicity (see Section 6.2). Assume they satisfy $(A-E_j)\varphi_j=0$, where $E_j\le E_{j+1}$ are the eigenvalues (counted according to their multiplicity). If the number of eigenvalues $N$ is finite, we set $E_j=\inf\sigma_{ess}(A)$ for $j>N$ and choose $\varphi_j$ orthonormal such that $\|(A-E_j)\varphi_j\|\le\varepsilon$.

Define

$$U(\psi_1,\ldots,\psi_n)=\{\psi\in D(A)\,|\,\|\psi\|=1,\ \psi\in\mathrm{span}\{\psi_1,\ldots,\psi_n\}^\perp\}. \tag{4.28}$$

(i) We have

$$\inf_{\psi\in U(\psi_1,\ldots,\psi_{n-1})}\langle\psi,A\psi\rangle\le E_n+O(\varepsilon). \tag{4.29}$$

In fact, set $\psi=\sum_{j=1}^n\alpha_j\varphi_j$ and choose $\alpha_j$ such that $\psi\in U(\psi_1,\ldots,\psi_{n-1})$. Then

$$\langle\psi,A\psi\rangle=\sum_{j=1}^n|\alpha_j|^2E_j+O(\varepsilon)\le E_n+O(\varepsilon) \tag{4.30}$$

and the claim follows.

(ii) We have

$$\inf_{\psi\in U(\varphi_1,\ldots,\varphi_{n-1})}\langle\psi,A\psi\rangle\ge E_n-O(\varepsilon). \tag{4.31}$$

In fact, set $\psi_j=\varphi_j$.

Since $\varepsilon$ can be chosen arbitrarily small, we have proven the following.
Theorem 4.10 (Min-max). Let $A$ be self-adjoint and let $E_1\le E_2\le E_3\le\cdots$ be the eigenvalues of $A$ below the essential spectrum, respectively, the infimum of the essential spectrum, once there are no more eigenvalues left. Then

$$E_n=\sup_{\psi_1,\ldots,\psi_{n-1}}\ \inf_{\psi\in U(\psi_1,\ldots,\psi_{n-1})}\langle\psi,A\psi\rangle. \tag{4.32}$$

Clearly the same result holds if $D(A)$ is replaced by the quadratic form domain $Q(A)$ in the definition of $U$. In addition, as long as $E_n$ is an eigenvalue, the sup and inf are in fact max and min, explaining the name.

Corollary 4.11. Suppose $A$ and $B$ are self-adjoint operators with $A\ge B$ (i.e., $A-B\ge0$). Then $E_n(A)\ge E_n(B)$.

Problem 4.4. Suppose $A$, $A_n$ are bounded and $A_n\to A$. Then $E_k(A_n)\to E_k(A)$. (Hint: $\|A-A_n\|\le\varepsilon$ is equivalent to $A-\varepsilon\le A_n\le A+\varepsilon$.)
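In finite dimensions (4.32) is the Courant–Fischer theorem, and both the upper bound (4.27) and the monotonicity of Corollary 4.11 are easy to test numerically. A minimal sketch (not from the book; the random matrices and trial vector are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
X = rng.standard_normal((n, n))
B = (X + X.T) / 2            # a symmetric "operator" B
C = rng.standard_normal((n, n))
A = B + C @ C.T              # A = B + (positive semidefinite), so A >= B

EA = np.linalg.eigvalsh(A)   # eigenvalues in ascending order E_1 <= E_2 <= ...
EB = np.linalg.eigvalsh(B)

# Corollary 4.11: A >= B implies E_n(A) >= E_n(B) for every n
mono_ok = bool(np.all(EA >= EB - 1e-9))

# (4.27): every normalized trial vector gives an upper bound for E_1
psi = rng.standard_normal(n)
psi /= np.linalg.norm(psi)
rayleigh = psi @ B @ psi
upper_ok = bool(rayleigh >= EB[0] - 1e-9)
```

Both flags are guaranteed to be true by the theorems; the tolerances only absorb floating-point roundoff.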
4.4. Estimating eigenspaces

Next, we show that the dimension of the range of $P_A(\Omega)$ can be estimated if we have some functions which lie approximately in this space.

Theorem 4.12. Suppose $A$ is a self-adjoint operator and $\psi_j$, $1\le j\le k$, are linearly independent elements of $\mathcal H$.

(i) Let $\lambda\in\mathbb R$, $\psi_j\in Q(A)$. If

$$\langle\psi,A\psi\rangle<\lambda\|\psi\|^2 \tag{4.33}$$

for any nonzero linear combination $\psi=\sum_{j=1}^k c_j\psi_j$, then

$$\dim\mathrm{Ran}\,P_A((-\infty,\lambda))\ge k. \tag{4.34}$$

Similarly, $\langle\psi,A\psi\rangle>\lambda\|\psi\|^2$ implies $\dim\mathrm{Ran}\,P_A((\lambda,\infty))\ge k$.

(ii) Let $\lambda_1<\lambda_2$, $\psi_j\in D(A)$. If

$$\Big\|\Big(A-\frac{\lambda_2+\lambda_1}{2}\Big)\psi\Big\|<\frac{\lambda_2-\lambda_1}{2}\|\psi\| \tag{4.35}$$

for any nonzero linear combination $\psi=\sum_{j=1}^k c_j\psi_j$, then

$$\dim\mathrm{Ran}\,P_A((\lambda_1,\lambda_2))\ge k. \tag{4.36}$$

Proof. (i) Let $M=\mathrm{span}\{\psi_j\}\subseteq\mathcal H$. We claim $\dim P_A((-\infty,\lambda))M=\dim M=k$. For this it suffices to show $\mathrm{Ker}\,P_A((-\infty,\lambda))|_M=\{0\}$. Suppose $P_A((-\infty,\lambda))\psi=0$, $\psi\ne0$. Then we see that for any nonzero linear combination

$$\langle\psi,A\psi\rangle=\int_{\mathbb R}\lambda'\,d\mu_\psi(\lambda')=\int_{[\lambda,\infty)}\lambda'\,d\mu_\psi(\lambda')\ge\lambda\int_{[\lambda,\infty)}d\mu_\psi(\lambda')=\lambda\|\psi\|^2.$$

This contradicts our assumption (4.33).

(ii) This is just the previous case (i) applied to $(A-(\lambda_2+\lambda_1)/2)^2$ with $\lambda=(\lambda_2-\lambda_1)^2/4$. □
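Part (ii) can be probed on a matrix: if trial vectors satisfy (4.35) for every nonzero combination, the interval must contain at least as many eigenvalues. In the sketch below (illustrative, not from the book) the worst ratio $\|(A-m)\psi\|/\|\psi\|$ over all combinations $\psi=\Psi c$ is the square root of the largest generalized eigenvalue of a pair of Gram-type matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
X = rng.standard_normal((n, n))
A = (X + X.T) / 2
evals, V = np.linalg.eigh(A)

# interval (l1, l2) containing exactly the eigenvalues evals[3], evals[4]
l1 = (evals[2] + evals[3]) / 2
l2 = (evals[4] + evals[5]) / 2
m, r = (l1 + l2) / 2, (l2 - l1) / 2

# two slightly perturbed eigenvectors as trial functions
Psi = V[:, 3:5] + 1e-3 * rng.standard_normal((n, 2))

# worst ||(A - m) psi|| / ||psi|| over nonzero psi = Psi @ c
R = (A - m * np.eye(n)) @ Psi
M, G = R.T @ R, Psi.T @ Psi
worst = np.sqrt(np.max(np.linalg.eigvals(np.linalg.solve(G, M)).real))

count = int(np.sum((evals > l1) & (evals < l2)))
# Theorem 4.12 (ii): worst < r forces count >= 2
```

Here `count` is 2 by construction, consistent with the bound whenever the hypothesis `worst < r` holds.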
Another useful estimate is

Theorem 4.13 (Temple's inequality). Let $\lambda_1<\lambda_2$ and $\psi\in D(A)$ with $\|\psi\|=1$ such that

$$\lambda=\langle\psi,A\psi\rangle\in(\lambda_1,\lambda_2). \tag{4.37}$$

If there is one isolated eigenvalue $E$ between $\lambda_1$ and $\lambda_2$, that is, $\sigma(A)\cap(\lambda_1,\lambda_2)=\{E\}$, then

$$\lambda-\frac{\|(A-\lambda)\psi\|^2}{\lambda_2-\lambda}\le E\le\lambda+\frac{\|(A-\lambda)\psi\|^2}{\lambda-\lambda_1}. \tag{4.38}$$

Proof. First of all we can assume $\lambda=0$ if we replace $A$ by $A-\lambda$. To prove the first inequality, observe that by assumption $(E,\lambda_2)\subseteq\rho(A)$ and hence the spectral theorem implies $(A-\lambda_2)(A-E)\ge0$. Thus $\langle\psi,(A-\lambda_2)(A-E)\psi\rangle=\|A\psi\|^2+\lambda_2E\ge0$ and the first inequality follows after dividing by $\lambda_2>0$. Similarly, $(A-\lambda_1)(A-E)\ge0$ implies the second inequality. □

Note that the last inequality only provides additional information if $\|(A-\lambda)\psi\|^2\le(\lambda_2-\lambda)(\lambda-\lambda_1)$.

A typical application is if $E=E_0$ is the lowest eigenvalue. In this case any normalized trial function $\psi$ will give the bound $E_0\le\langle\psi,A\psi\rangle$. If, in addition, we also have some estimate $\lambda_2\le E_1$ for the second eigenvalue $E_1$, then Temple's inequality can give a bound from below. For $\lambda_1$ we can choose any value $\lambda_1<E_0$; in fact, if we let $\lambda_1\to-\infty$, we just recover the bound we already know.
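A concrete numerical check of this application (not from the book): a small matrix with a weak off-diagonal coupling, a trial vector close to the ground state, and a value $\lambda_2$ chosen between $\lambda$ and the second eigenvalue.

```python
import numpy as np

# the coupling mixes only the first two basis vectors
A = np.diag([0., 1., 2., 3.])
A[0, 1] = A[1, 0] = 0.1
E = np.linalg.eigvalsh(A)          # E[0] ~ -0.0099, E[1] ~ 1.0099

psi = np.array([1., 0.1, 0., 0.])  # trial vector near the ground state
psi /= np.linalg.norm(psi)

lam = psi @ A @ psi                           # upper bound for E[0]
resid2 = np.linalg.norm((A - lam * np.eye(4)) @ psi) ** 2

lam2 = 0.9                                    # any lam2 with lam < lam2 <= E[1] works
lower = lam - resid2 / (lam2 - lam)           # Temple's lower bound from (4.38)
```

Both bounds of (4.38) then bracket the true ground-state eigenvalue: `lower <= E[0] <= lam`.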
4.5. Tensor products of operators

Recall the definition of the tensor product of Hilbert spaces from Section 1.4. Suppose $A_j$, $1\le j\le n$, are (essentially) self-adjoint operators on $\mathcal H_j$. For every monomial $\lambda_1^{n_1}\cdots\lambda_n^{n_n}$ we can define

$$(A_1^{n_1}\otimes\cdots\otimes A_n^{n_n})\,\psi_1\otimes\cdots\otimes\psi_n=(A_1^{n_1}\psi_1)\otimes\cdots\otimes(A_n^{n_n}\psi_n),\qquad\psi_j\in D(A_j^{n_j}), \tag{4.39}$$

and extend this definition by linearity to the span of all such functions (check that this definition is well-defined by showing that the corresponding operator on $\mathcal F(\mathcal H_1,\ldots,\mathcal H_n)$ vanishes on $\mathcal N(\mathcal H_1,\ldots,\mathcal H_n)$). Hence for every polynomial $P(\lambda_1,\ldots,\lambda_n)$ of degree $N$ we obtain an operator

$$P(A_1,\ldots,A_n)\,\psi_1\otimes\cdots\otimes\psi_n,\qquad\psi_j\in D(A_j^N), \tag{4.40}$$

defined on the set

$$\mathfrak D=\mathrm{span}\{\psi_1\otimes\cdots\otimes\psi_n\,|\,\psi_j\in D(A_j^N)\}. \tag{4.41}$$

Moreover, if $P$ is real-valued, then the operator $P(A_1,\ldots,A_n)$ on $\mathfrak D$ is symmetric and we can consider its closure, which will again be denoted by $P(A_1,\ldots,A_n)$.

Theorem 4.14. Suppose $A_j$, $1\le j\le n$, are self-adjoint operators on $\mathcal H_j$ and let $P(\lambda_1,\ldots,\lambda_n)$ be a real-valued polynomial and define $P(A_1,\ldots,A_n)$ as above.

Then $P(A_1,\ldots,A_n)$ is self-adjoint and its spectrum is the closure of the range of $P$ on the product of the spectra of the $A_j$; that is,

$$\sigma(P(A_1,\ldots,A_n))=\overline{P(\sigma(A_1),\ldots,\sigma(A_n))}. \tag{4.42}$$

Proof. By the spectral theorem it is no restriction to assume that $A_j$ is multiplication by $\lambda_j$ on $L^2(\mathbb R,d\mu_j)$ and $P(A_1,\ldots,A_n)$ is hence multiplication by $P(\lambda_1,\ldots,\lambda_n)$ on $L^2(\mathbb R^n,d\mu_1\times\cdots\times d\mu_n)$. Since $\mathfrak D$ contains the set of all functions $\psi_1(\lambda_1)\cdots\psi_n(\lambda_n)$ for which $\psi_j\in L^2_c(\mathbb R,d\mu_j)$, it follows that the domain of the closure of $P$ contains $L^2_c(\mathbb R^n,d\mu_1\times\cdots\times d\mu_n)$. Hence $P$ is the maximally defined multiplication operator by $P(\lambda_1,\ldots,\lambda_n)$, which is self-adjoint.

Now let $\lambda=P(\lambda_1,\ldots,\lambda_n)$ with $\lambda_j\in\sigma(A_j)$. Then there exist Weyl sequences $\psi_{j,k}\in D(A_j^N)$ with $(A_j-\lambda_j)\psi_{j,k}\to0$ as $k\to\infty$. Consequently, $(P-\lambda)\psi_k\to0$, where $\psi_k=\psi_{1,k}\otimes\cdots\otimes\psi_{n,k}$, and hence $\lambda\in\sigma(P)$. Conversely, if $\lambda\notin\overline{P(\sigma(A_1),\ldots,\sigma(A_n))}$, then $|P(\lambda_1,\ldots,\lambda_n)-\lambda|\ge\varepsilon$ for a.e. $\lambda_j$ with respect to $\mu_j$ and hence $(P-\lambda)^{-1}$ exists and is bounded; that is, $\lambda\in\rho(P)$. □

The two main cases of interest are $A_1\otimes A_2$, in which case

$$\sigma(A_1\otimes A_2)=\overline{\sigma(A_1)\cdot\sigma(A_2)}=\overline{\{\lambda_1\lambda_2\,|\,\lambda_j\in\sigma(A_j)\}}, \tag{4.43}$$

and $A_1\otimes\mathbb I+\mathbb I\otimes A_2$, in which case

$$\sigma(A_1\otimes\mathbb I+\mathbb I\otimes A_2)=\overline{\sigma(A_1)+\sigma(A_2)}=\overline{\{\lambda_1+\lambda_2\,|\,\lambda_j\in\sigma(A_j)\}}. \tag{4.44}$$

Problem 4.5. Show that the closure can be omitted in (4.44) if at least one operator is bounded and in (4.43) if both operators are bounded.
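In finite dimensions the tensor product is the Kronecker product, and (4.43), (4.44) (with no closure needed) can be verified directly; a small sketch, not from the book:

```python
import numpy as np

# finite-dimensional check of (4.43) and (4.44) via Kronecker products
A1 = np.diag([0., 1., 2.])
A2 = np.diag([5., 7.])
I1, I2 = np.eye(3), np.eye(2)

sum_op = np.kron(A1, I2) + np.kron(I1, A2)   # A1 (x) I + I (x) A2
prod_op = np.kron(A1, A2)                    # A1 (x) A2

sums = sorted(a + b for a in [0, 1, 2] for b in [5, 7])
prods = sorted(a * b for a in [0, 1, 2] for b in [5, 7])

sum_ok = np.allclose(np.linalg.eigvalsh(sum_op), sums)
prod_ok = np.allclose(np.linalg.eigvalsh(prod_op), prods)
```

The eigenvalues of the Kronecker sum are exactly all pairwise sums $\lambda_1+\lambda_2$, and those of the Kronecker product all pairwise products, matching (4.44) and (4.43).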
Chapter 5

Quantum dynamics

As in the finite dimensional case, the solution of the Schrödinger equation

$$\mathrm i\frac{d}{dt}\psi(t)=H\psi(t) \tag{5.1}$$

is given by

$$\psi(t)=\exp(-\mathrm itH)\psi(0). \tag{5.2}$$

A detailed investigation of this formula will be our first task. Moreover, in the finite dimensional case the dynamics is understood once the eigenvalues are known and the same is true in our case once we know the spectrum. Note that, like any Hamiltonian system from classical mechanics, our system is not hyperbolic (i.e., the spectrum is not away from the real axis) and hence simple results such as "all solutions tend to the equilibrium position" cannot be expected.

5.1. The time evolution and Stone's theorem

In this section we want to have a look at the initial value problem associated with the Schrödinger equation (2.12) in the Hilbert space $\mathcal H$. If $\mathcal H$ is one-dimensional (and hence $A$ is a real number), the solution is given by

$$\psi(t)=\mathrm e^{-\mathrm itA}\psi(0). \tag{5.3}$$

Our hope is that this formula also applies in the general case and that we can reconstruct a one-parameter unitary group $U(t)$ from its generator $A$ (compare (2.11)) via $U(t)=\exp(-\mathrm itA)$. We first investigate the family of operators $\exp(-\mathrm itA)$.

Theorem 5.1. Let $A$ be self-adjoint and let $U(t)=\exp(-\mathrm itA)$.

(i) $U(t)$ is a strongly continuous one-parameter unitary group.

(ii) The limit $\lim_{t\to0}\frac1t(U(t)\psi-\psi)$ exists if and only if $\psi\in D(A)$, in which case $\lim_{t\to0}\frac1t(U(t)\psi-\psi)=-\mathrm iA\psi$.

(iii) $U(t)D(A)=D(A)$ and $AU(t)=U(t)A$.
Proof. The group property (i) follows directly from Theorem 3.1 and the corresponding statements for the function $\exp(-\mathrm it\lambda)$. To prove strong continuity, observe that

$$\lim_{t\to t_0}\|\mathrm e^{-\mathrm itA}\psi-\mathrm e^{-\mathrm it_0A}\psi\|^2
=\lim_{t\to t_0}\int_{\mathbb R}|\mathrm e^{-\mathrm it\lambda}-\mathrm e^{-\mathrm it_0\lambda}|^2\,d\mu_\psi(\lambda)
=\int_{\mathbb R}\lim_{t\to t_0}|\mathrm e^{-\mathrm it\lambda}-\mathrm e^{-\mathrm it_0\lambda}|^2\,d\mu_\psi(\lambda)=0$$

by the dominated convergence theorem.

Similarly, if $\psi\in D(A)$, we obtain

$$\lim_{t\to0}\Big\|\frac1t(\mathrm e^{-\mathrm itA}\psi-\psi)+\mathrm iA\psi\Big\|^2
=\lim_{t\to0}\int_{\mathbb R}\Big|\frac1t(\mathrm e^{-\mathrm it\lambda}-1)+\mathrm i\lambda\Big|^2\,d\mu_\psi(\lambda)=0$$

since $|\mathrm e^{-\mathrm it\lambda}-1|\le|t\lambda|$. Now let $\tilde A$ be the generator defined as in (2.11). Then $\tilde A$ is a symmetric extension of $A$ since we have

$$\langle\varphi,\tilde A\psi\rangle=\lim_{t\to0}\Big\langle\varphi,\frac{\mathrm i}{t}(U(t)-1)\psi\Big\rangle=\lim_{t\to0}\Big\langle\frac{\mathrm i}{-t}(U(-t)-1)\varphi,\psi\Big\rangle=\langle\tilde A\varphi,\psi\rangle$$

and hence $\tilde A=A$ by Corollary 2.2. This settles (ii).

To see (iii), replace $\psi\to U(s)\psi$ in (ii). □
For our original problem this implies that formula (5.3) is indeed the solution to the initial value problem of the Schrödinger equation. Moreover,

$$\langle U(t)\psi,AU(t)\psi\rangle=\langle U(t)\psi,U(t)A\psi\rangle=\langle\psi,A\psi\rangle \tag{5.4}$$

shows that the expectations of $A$ are time independent. This corresponds to conservation of energy.

On the other hand, the generator of the time evolution of a quantum mechanical system should always be a self-adjoint operator since it corresponds to an observable (energy). Moreover, there should be a one-to-one correspondence between the unitary group and its generator. This is ensured by Stone's theorem.

Theorem 5.2 (Stone). Let $U(t)$ be a weakly continuous one-parameter unitary group. Then its generator $A$ is self-adjoint and $U(t)=\exp(-\mathrm itA)$.

Proof. First of all observe that weak continuity together with item (iv) of Lemma 1.12 shows that $U(t)$ is in fact strongly continuous.

Next we show that $A$ is densely defined. Pick $\psi\in\mathcal H$ and set

$$\psi_\tau=\int_0^\tau U(t)\psi\,dt$$

(the integral is defined as in Section 4.1), implying $\lim_{\tau\to0}\tau^{-1}\psi_\tau=\psi$. Moreover,

$$\frac1t(U(t)\psi_\tau-\psi_\tau)
=\frac1t\int_t^{\tau+t}U(s)\psi\,ds-\frac1t\int_0^\tau U(s)\psi\,ds
=\frac1t\int_\tau^{\tau+t}U(s)\psi\,ds-\frac1t\int_0^t U(s)\psi\,ds
=\frac1tU(\tau)\int_0^t U(s)\psi\,ds-\frac1t\int_0^t U(s)\psi\,ds\ \to\ U(\tau)\psi-\psi$$

as $t\to0$ shows $\psi_\tau\in D(A)$. As in the proof of the previous theorem, we can show that $A$ is symmetric and that $U(t)D(A)=D(A)$.

Next, let us prove that $A$ is essentially self-adjoint. By Lemma 2.7 it suffices to prove $\mathrm{Ker}(A^*-z^*)=\{0\}$ for $z\in\mathbb C\setminus\mathbb R$. Suppose $A^*\psi=z^*\psi$. Then for each $\varphi\in D(A)$ we have

$$\frac{d}{dt}\langle\psi,U(t)\varphi\rangle=\langle\psi,-\mathrm iAU(t)\varphi\rangle=-\mathrm i\langle A^*\psi,U(t)\varphi\rangle=-\mathrm iz\langle\psi,U(t)\varphi\rangle$$

and hence $\langle\psi,U(t)\varphi\rangle=\exp(-\mathrm izt)\langle\psi,\varphi\rangle$. Since the left-hand side is bounded for all $t\in\mathbb R$ and the exponential on the right-hand side is not, we must have $\langle\psi,\varphi\rangle=0$ implying $\psi=0$ since $D(A)$ is dense.

So $A$ is essentially self-adjoint and we can introduce $V(t)=\exp(-\mathrm it\overline A)$. We are done if we can show $U(t)=V(t)$.

Let $\psi\in D(A)$ and abbreviate $\psi(t)=(U(t)-V(t))\psi$. Then

$$\lim_{s\to0}\frac{\psi(t+s)-\psi(t)}{s}=-\mathrm i\overline A\psi(t)$$

and hence $\frac{d}{dt}\|\psi(t)\|^2=2\,\mathrm{Re}\langle\psi(t),-\mathrm i\overline A\psi(t)\rangle=0$. Since $\psi(0)=0$, we have $\psi(t)=0$ and hence $U(t)$ and $V(t)$ coincide on $D(A)$. Furthermore, since $D(A)$ is dense, we have $U(t)=V(t)$ by continuity. □

As an immediate consequence of the proof we also note the following useful criterion.

Corollary 5.3. Suppose $D\subseteq D(A)$ is dense and invariant under $U(t)$. Then $A$ is essentially self-adjoint on $D$.

Proof. As in the above proof it follows that $\langle\psi,\varphi\rangle=0$ for any $\varphi\in D$ and $\psi\in\mathrm{Ker}(A^*-z^*)$. □

Note that by Lemma 4.9 two strongly continuous one-parameter groups commute,

$$[\mathrm e^{-\mathrm itA},\mathrm e^{-\mathrm isB}]=0, \tag{5.5}$$

if and only if the generators commute.

Clearly, for a physicist, one of the goals must be to understand the time evolution of a quantum mechanical system. We have seen that the time evolution is generated by a self-adjoint operator, the Hamiltonian, and is given by a linear first order differential equation, the Schrödinger equation. To understand the dynamics of such a first order differential equation, one must understand the spectrum of the generator. Some general tools for this endeavor will be provided in the following sections.

Problem 5.1. Let $\mathcal H=L^2(0,2\pi)$ and consider the one-parameter unitary group given by $U(t)f(x)=f(x-t \operatorname{mod} 2\pi)$. What is the generator of $U$?
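For a Hermitian matrix the group $U(t)=\exp(-\mathrm itA)$ can be formed directly from the spectral decomposition, and the group law, unitarity, and conservation of energy (5.4) are easy to verify. A small NumPy sketch (illustrative, not from the book):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (X + X.conj().T) / 2                    # self-adjoint generator

E, V = np.linalg.eigh(A)
def U(t):                                   # U(t) = exp(-itA) via A = V E V*
    return (V * np.exp(-1j * t * E)) @ V.conj().T

t = 0.7
Ut = U(t)
unitary_ok = np.allclose(Ut @ Ut.conj().T, np.eye(n))
group_ok = np.allclose(U(0.3) @ U(0.4), Ut)     # U(s)U(t) = U(s+t)

psi0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
psi0 /= np.linalg.norm(psi0)
psit = Ut @ psi0
# (5.4): the expectation of A is time independent
energy_ok = np.isclose((psit.conj() @ A @ psit).real,
                       (psi0.conj() @ A @ psi0).real)
```

All three checks hold up to floating-point roundoff, mirroring items (i) and (5.4) above.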
5.2. The RAGE theorem

Now, let us discuss why the decomposition of the spectrum introduced in Section 3.3 is of physical relevance. Let $\|\varphi\|=\|\psi\|=1$. The vector $\langle\varphi,\psi\rangle\varphi$ is the projection of $\psi$ onto the (one-dimensional) subspace spanned by $\varphi$. Hence $|\langle\varphi,\psi\rangle|^2$ can be viewed as the part of $\psi$ which is in the state $\varphi$. The first question one might raise is, how does

$$|\langle\varphi,U(t)\psi\rangle|^2,\qquad U(t)=\mathrm e^{-\mathrm itA}, \tag{5.6}$$

behave as $t\to\infty$? By the spectral theorem,

$$\hat\mu_{\varphi,\psi}(t)=\langle\varphi,U(t)\psi\rangle=\int_{\mathbb R}\mathrm e^{-\mathrm it\lambda}\,d\mu_{\varphi,\psi}(\lambda) \tag{5.7}$$

is the Fourier transform of the measure $\mu_{\varphi,\psi}$. Thus our question is answered by Wiener's theorem.

Theorem 5.4 (Wiener). Let $\mu$ be a finite complex Borel measure on $\mathbb R$ and let

$$\hat\mu(t)=\int_{\mathbb R}\mathrm e^{-\mathrm it\lambda}\,d\mu(\lambda) \tag{5.8}$$

be its Fourier transform. Then the Cesàro time average of $\hat\mu(t)$ has the limit

$$\lim_{T\to\infty}\frac1T\int_0^T|\hat\mu(t)|^2\,dt=\sum_{\lambda\in\mathbb R}|\mu(\{\lambda\})|^2, \tag{5.9}$$

where the sum on the right-hand side is finite.

Proof. By Fubini we have

$$\frac1T\int_0^T|\hat\mu(t)|^2\,dt=\frac1T\int_0^T\int_{\mathbb R}\int_{\mathbb R}\mathrm e^{-\mathrm i(x-y)t}\,d\mu(x)\,d\mu^*(y)\,dt
=\int_{\mathbb R}\int_{\mathbb R}\Big(\frac1T\int_0^T\mathrm e^{-\mathrm i(x-y)t}\,dt\Big)d\mu(x)\,d\mu^*(y).$$

The function in parentheses is bounded by one and converges pointwise to $\chi_{\{0\}}(x-y)$ as $T\to\infty$. Thus, by the dominated convergence theorem, the limit of the above expression is given by

$$\int_{\mathbb R}\int_{\mathbb R}\chi_{\{0\}}(x-y)\,d\mu(x)\,d\mu^*(y)=\int_{\mathbb R}\mu(\{y\})\,d\mu^*(y)=\sum_{y\in\mathbb R}|\mu(\{y\})|^2,$$

which finishes the proof. □
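Wiener's theorem is easy to watch numerically for a pure point measure $\mu=\sum_j c_j\delta_{\lambda_j}$, where $\hat\mu(t)=\sum_j c_j\mathrm e^{-\mathrm it\lambda_j}$ and (5.9) predicts the Cesàro average $\sum_j|c_j|^2$. A sketch (not from the book; the atoms, weights, and time grid are ad hoc):

```python
import numpy as np

# point measure mu = sum_j c_j * delta_{lam_j}
lams = np.array([0.0, 1.0, np.sqrt(2.0)])   # incommensurate atoms
cs = np.array([0.5, 0.3, 0.2])

def mu_hat(t):
    # Fourier transform (5.8) of the point measure
    return np.sum(cs * np.exp(-1j * np.outer(t, lams)), axis=1)

T = 10_000.0
t = np.linspace(0.0, T, 400_001)
avg = np.mean(np.abs(mu_hat(t)) ** 2)       # ~ (1/T) int_0^T |mu_hat|^2 dt
target = np.sum(cs ** 2)                    # right-hand side of (5.9) = 0.38
```

The cross terms $\mathrm e^{-\mathrm it(\lambda_j-\lambda_k)}$ average out at rate $O(1/T)$, so `avg` approaches `target` as $T$ grows.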
To apply this result to our situation, observe that the subspaces $\mathcal H_{ac}$, $\mathcal H_{sc}$, and $\mathcal H_{pp}$ are invariant with respect to time evolution since $P^{xx}U(t)=\chi_{M_{xx}}(A)\exp(-\mathrm itA)=\exp(-\mathrm itA)\chi_{M_{xx}}(A)=U(t)P^{xx}$, $xx\in\{ac,sc,pp\}$. Moreover, if $\psi\in\mathcal H_{xx}$, we have $P^{xx}\psi=\psi$, which shows $\langle\varphi,f(A)\psi\rangle=\langle\varphi,P^{xx}f(A)\psi\rangle=\langle P^{xx}\varphi,f(A)\psi\rangle$ implying $d\mu_{\varphi,\psi}=d\mu_{P^{xx}\varphi,\psi}$. Thus if $\mu_\psi$ is ac, sc, or pp, so is $\mu_{\varphi,\psi}$ for every $\varphi\in\mathcal H$.

That is, if $\psi\in\mathcal H_c=\mathcal H_{ac}\oplus\mathcal H_{sc}$, then the Cesàro mean of $\langle\varphi,U(t)\psi\rangle$ tends to zero. In other words, the average of the probability of finding the system in any prescribed state tends to zero if we start in the continuous subspace $\mathcal H_c$ of $A$.

If $\psi\in\mathcal H_{ac}$, then $d\mu_{\varphi,\psi}$ is absolutely continuous with respect to Lebesgue measure and thus $\hat\mu_{\varphi,\psi}(t)$ is continuous and tends to zero as $|t|\to\infty$. In fact, this follows from the Riemann–Lebesgue lemma (see Lemma 7.6 below).

Now we want to draw some additional consequences from Wiener's theorem. This will eventually yield a dynamical characterization of the continuous and pure point spectrum due to Ruelle, Amrein, Georgescu, and Enß. But first we need a few definitions.

An operator $K\in\mathfrak L(\mathcal H)$ is called a finite rank operator if its range is finite dimensional. The dimension

$$\mathrm{rank}(K)=\dim\mathrm{Ran}(K)$$

is called the rank of $K$. If $\{\psi_j\}_{j=1}^n$ is an orthonormal basis for $\mathrm{Ran}(K)$, we have

$$K\psi=\sum_{j=1}^n\langle\psi_j,K\psi\rangle\psi_j=\sum_{j=1}^n\langle\varphi_j,\psi\rangle\psi_j, \tag{5.10}$$

where $\varphi_j=K^*\psi_j$. The elements $\varphi_j$ are linearly independent since $\mathrm{Ran}(K)=\mathrm{Ker}(K^*)^\perp$. Hence every finite rank operator is of the form (5.10). In addition, the adjoint of $K$ is also finite rank and is given by

$$K^*\psi=\sum_{j=1}^n\langle\psi_j,\psi\rangle\varphi_j. \tag{5.11}$$

The closure of the set of all finite rank operators in $\mathfrak L(\mathcal H)$ is called the set of compact operators $C(\mathcal H)$. It is straightforward to verify (Problem 5.2)

Lemma 5.5. The set of all compact operators $C(\mathcal H)$ is a closed $*$-ideal in $\mathfrak L(\mathcal H)$.

There is also a weaker version of compactness which is useful for us. The operator $K$ is called relatively compact with respect to $A$ if

$$KR_A(z)\in C(\mathcal H) \tag{5.12}$$

for one $z\in\rho(A)$. By the first resolvent formula this then follows for all $z\in\rho(A)$. In particular we have $D(A)\subseteq D(K)$.
Now let us return to our original problem.

Theorem 5.6. Let $A$ be self-adjoint and suppose $K$ is relatively compact. Then

$$\lim_{T\to\infty}\frac1T\int_0^T\|K\mathrm e^{-\mathrm itA}P^c\psi\|^2\,dt=0\quad\text{and}\quad\lim_{t\to\infty}\|K\mathrm e^{-\mathrm itA}P^{ac}\psi\|=0 \tag{5.13}$$

for every $\psi\in D(A)$. If, in addition, $K$ is bounded, then the result holds for any $\psi\in\mathcal H$.

Proof. Let $\psi\in\mathcal H_c$, respectively, $\psi\in\mathcal H_{ac}$, and drop the projectors. Then, if $K$ is a rank one operator (i.e., $K=\langle\varphi_1,\cdot\rangle\varphi_2$), the claim follows from Wiener's theorem, respectively, the Riemann–Lebesgue lemma. Hence it holds for any finite rank operator $K$.

If $K$ is compact, there is a sequence $K_n$ of finite rank operators such that $\|K-K_n\|\le1/n$ and hence

$$\|K\mathrm e^{-\mathrm itA}\psi\|\le\|K_n\mathrm e^{-\mathrm itA}\psi\|+\frac1n\|\psi\|.$$

Thus the claim holds for any compact operator $K$.

If $\psi\in D(A)$, we can set $\psi=(A+\mathrm i)^{-1}\varphi$, where $\varphi\in\mathcal H_c$ if and only if $\psi\in\mathcal H_c$ (since $\mathcal H_c$ reduces $A$). Since $K(A+\mathrm i)^{-1}$ is compact by assumption, the claim can be reduced to the previous situation. If $K$ is also bounded, we can find a sequence $\psi_n\in D(A)$ such that $\|\psi-\psi_n\|\le1/n$ and hence

$$\|K\mathrm e^{-\mathrm itA}\psi\|\le\|K\mathrm e^{-\mathrm itA}\psi_n\|+\frac1n\|K\|,$$

concluding the proof. □
With the help of this result we can now prove an abstract version of the RAGE theorem.

Theorem 5.7 (RAGE). Let $A$ be self-adjoint. Suppose $K_n\in\mathfrak L(\mathcal H)$ is a sequence of relatively compact operators which converges strongly to the identity. Then

$$\mathcal H_c=\Big\{\psi\in\mathcal H\,\Big|\,\lim_{n\to\infty}\lim_{T\to\infty}\frac1T\int_0^T\|K_n\mathrm e^{-\mathrm itA}\psi\|\,dt=0\Big\},$$
$$\mathcal H_{pp}=\Big\{\psi\in\mathcal H\,\Big|\,\lim_{n\to\infty}\sup_{t\ge0}\|(\mathbb I-K_n)\mathrm e^{-\mathrm itA}\psi\|=0\Big\}. \tag{5.14}$$

Proof. Abbreviate $\psi(t)=\exp(-\mathrm itA)\psi$. We begin with the first equation. Let $\psi\in\mathcal H_c$. Then

$$\frac1T\int_0^T\|K_n\psi(t)\|\,dt\le\Big(\frac1T\int_0^T\|K_n\psi(t)\|^2\,dt\Big)^{1/2}\to0$$

by Cauchy–Schwarz and the previous theorem. Conversely, if $\psi\notin\mathcal H_c$, we can write $\psi=\psi_c+\psi_{pp}$. By our previous estimate it suffices to show $\|K_n\psi_{pp}(t)\|\ge\varepsilon>0$ for $n$ large. In fact, we even claim

$$\lim_{n\to\infty}\sup_{t\ge0}\|K_n\psi_{pp}(t)-\psi_{pp}(t)\|=0. \tag{5.15}$$

By the spectral theorem, we can write $\psi_{pp}(t)=\sum_j\alpha_j(t)\psi_j$, where the $\psi_j$ are orthonormal eigenfunctions and $\alpha_j(t)=\exp(-\mathrm it\lambda_j)\alpha_j$. Truncate this expansion after $N$ terms. Then this part converges uniformly to the desired limit by strong convergence of $K_n$. Moreover, by Lemma 1.14 we have $\|K_n\|\le M$, and hence the error can be made arbitrarily small by choosing $N$ large.

Now let us turn to the second equation. If $\psi\in\mathcal H_{pp}$, the claim follows by (5.15). Conversely, if $\psi\notin\mathcal H_{pp}$, we can write $\psi=\psi_c+\psi_{pp}$ and by our previous estimate it suffices to show that $\|(\mathbb I-K_n)\psi_c(t)\|$ does not tend to $0$ as $n\to\infty$. If it did, we would have

$$0=\lim_{T\to\infty}\frac1T\int_0^T\|(\mathbb I-K_n)\psi_c(t)\|^2\,dt
\ge\|\psi_c(t)\|^2-\lim_{T\to\infty}\frac1T\int_0^T\|K_n\psi_c(t)\|^2\,dt=\|\psi_c(t)\|^2,$$

a contradiction. □

In summary, regularity properties of spectral measures are related to the long time behavior of the corresponding quantum mechanical system. However, a more detailed investigation of this topic is beyond the scope of this manuscript. For a survey containing several recent results, see [28].
It is often convenient to treat the observables as time dependent rather than the states. We set

$$K(t)=\mathrm e^{\mathrm itA}K\mathrm e^{-\mathrm itA} \tag{5.16}$$

and note

$$\langle\psi(t),K\psi(t)\rangle=\langle\psi,K(t)\psi\rangle,\qquad\psi(t)=\mathrm e^{-\mathrm itA}\psi. \tag{5.17}$$

This point of view is often referred to as the Heisenberg picture in the physics literature. If $K$ is unbounded, we will assume $D(A)\subseteq D(K)$ such that the above equations make sense at least for $\psi\in D(A)$. The main interest is the behavior of $K(t)$ for large $t$. The strong limits are called asymptotic observables if they exist.

Theorem 5.8. Suppose $A$ is self-adjoint and $K$ is relatively compact. Then

$$\lim_{T\to\infty}\frac1T\int_0^T\mathrm e^{\mathrm itA}K\mathrm e^{-\mathrm itA}\psi\,dt=\sum_{\lambda\in\sigma_p(A)}P_A(\{\lambda\})KP_A(\{\lambda\})\psi,\qquad\psi\in D(A). \tag{5.18}$$

If $K$ is in addition bounded, the result holds for any $\psi\in\mathcal H$.

Proof. We will assume that $K$ is bounded. To obtain the general result, use the same trick as before and replace $K$ by $KR_A(z)$. Write $\psi=\psi_c+\psi_{pp}$. Then

$$\lim_{T\to\infty}\frac1T\Big\|\int_0^TK(t)\psi_c\,dt\Big\|\le\lim_{T\to\infty}\frac1T\int_0^T\|K(t)\psi_c\|\,dt=0$$

by Theorem 5.6. As in the proof of the previous theorem we can write $\psi_{pp}=\sum_j\alpha_j\psi_j$ and hence

$$\sum_j\alpha_j\frac1T\int_0^TK(t)\psi_j\,dt=\sum_j\alpha_j\Big(\frac1T\int_0^T\mathrm e^{\mathrm it(A-\lambda_j)}\,dt\Big)K\psi_j.$$

As in the proof of Wiener's theorem, we see that the operator in parentheses tends to $P_A(\{\lambda_j\})$ strongly as $T\to\infty$. Since this operator is also bounded by $1$ for all $T$, we can interchange the limit with the summation and the claim follows. □

We also note the following corollary.

Corollary 5.9. Under the same assumptions as in the RAGE theorem we have

$$\lim_{n\to\infty}\lim_{T\to\infty}\frac1T\int_0^T\mathrm e^{\mathrm itA}K_n\mathrm e^{-\mathrm itA}\psi\,dt=P^{pp}\psi, \tag{5.19}$$

respectively,

$$\lim_{n\to\infty}\lim_{T\to\infty}\frac1T\int_0^T\mathrm e^{\mathrm itA}(\mathbb I-K_n)\mathrm e^{-\mathrm itA}\psi\,dt=P^c\psi. \tag{5.20}$$

Problem 5.2. Prove Lemma 5.5.

Problem 5.3. Prove Corollary 5.9.
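Theorem 5.8 can be watched on a matrix with simple, well-separated eigenvalues: the time-averaged Heisenberg observable converges to the part of $K$ that is diagonal with respect to the spectral projections. A sketch (not from the book; the fixed spectrum, time grid, and tolerance are ad hoc choices):

```python
import numpy as np

rng = np.random.default_rng(5)
Evals = np.array([0., 1., 2.5, 4.])          # fixed, well-separated spectrum
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag(Evals) @ Q.T                 # A = sum_j E_j P_j
K = rng.standard_normal((4, 4))

E, V = np.linalg.eigh(A)
def U(t):
    return (V * np.exp(-1j * t * E)) @ V.conj().T

# Riemann sum for (1/T) int_0^T e^{itA} K e^{-itA} dt
T, m = 2000.0, 10_001
avg = np.zeros((4, 4), dtype=complex)
for t in np.linspace(0.0, T, m):
    Ut = U(t)
    avg += Ut.conj().T @ K @ Ut
avg /= m

# limit (5.18): sum_j P_j K P_j = diagonal part of K in the eigenbasis
expected = V @ np.diag(np.diag(V.conj().T @ K @ V)) @ V.conj().T
err = np.linalg.norm(avg - expected)
```

The off-diagonal (oscillating) parts of $K$ average out at rate $O(1/T)$, so `err` shrinks as $T$ grows.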
5.3. The Trotter product formula

In many situations the operator is of the form $A+B$, where $\mathrm e^{-\mathrm itA}$ and $\mathrm e^{-\mathrm itB}$ can be computed explicitly. Since $A$ and $B$ will not commute in general, we cannot obtain $\mathrm e^{-\mathrm it(A+B)}$ from $\mathrm e^{-\mathrm itA}\mathrm e^{-\mathrm itB}$. However, we at least have

Theorem 5.10 (Trotter product formula). Suppose $A$, $B$, and $A+B$ are self-adjoint. Then

$$\mathrm e^{-\mathrm it(A+B)}=\operatorname*{s-lim}_{n\to\infty}\Big(\mathrm e^{-\mathrm i\frac tnA}\mathrm e^{-\mathrm i\frac tnB}\Big)^n. \tag{5.21}$$

Proof. First of all note that we have

$$\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}\big)^n-\mathrm e^{-\mathrm it(A+B)}
=\sum_{j=0}^{n-1}\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}\big)^{n-1-j}\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}-\mathrm e^{-\mathrm i\tau(A+B)}\big)\big(\mathrm e^{-\mathrm i\tau(A+B)}\big)^j,$$

where $\tau=\frac tn$, and hence

$$\|\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}\big)^n\psi-\mathrm e^{-\mathrm it(A+B)}\psi\|\le|t|\max_{|s|\le|t|}F_\tau(s),$$

where

$$F_\tau(s)=\Big\|\frac1\tau\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}-\mathrm e^{-\mathrm i\tau(A+B)}\big)\mathrm e^{-\mathrm is(A+B)}\psi\Big\|.$$

Now for $\psi\in D(A+B)=D(A)\cap D(B)$ we have

$$\frac1\tau\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}-\mathrm e^{-\mathrm i\tau(A+B)}\big)\psi\to-\mathrm iA\psi-\mathrm iB\psi+\mathrm i(A+B)\psi=0$$

as $\tau\to0$. So $\lim_{\tau\to0}F_\tau(s)=0$ at least pointwise, but we need this uniformly with respect to $s\in[-|t|,|t|]$.

Pointwise convergence implies

$$\Big\|\frac1\tau\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}-\mathrm e^{-\mathrm i\tau(A+B)}\big)\psi\Big\|\le C(\psi)$$

and, since $D(A+B)$ is a Hilbert space when equipped with the graph norm $\|\psi\|_{\Gamma(A+B)}^2=\|\psi\|^2+\|(A+B)\psi\|^2$, we can invoke the uniform boundedness principle to obtain

$$\Big\|\frac1\tau\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}-\mathrm e^{-\mathrm i\tau(A+B)}\big)\psi\Big\|\le C\|\psi\|_{\Gamma(A+B)}.$$

Now

$$|F_\tau(s)-F_\tau(r)|\le\Big\|\frac1\tau\big(\mathrm e^{-\mathrm i\tau A}\mathrm e^{-\mathrm i\tau B}-\mathrm e^{-\mathrm i\tau(A+B)}\big)\big(\mathrm e^{-\mathrm is(A+B)}-\mathrm e^{-\mathrm ir(A+B)}\big)\psi\Big\|
\le C\|\big(\mathrm e^{-\mathrm is(A+B)}-\mathrm e^{-\mathrm ir(A+B)}\big)\psi\|_{\Gamma(A+B)}$$

shows that $F_\tau(\cdot)$ is uniformly continuous and the claim follows by a standard $\varepsilon/2$ argument. □

If the operators are semi-bounded from below, the same proof shows

Theorem 5.11 (Trotter product formula). Suppose $A$, $B$, and $A+B$ are self-adjoint and semi-bounded from below. Then

$$\mathrm e^{-t(A+B)}=\operatorname*{s-lim}_{n\to\infty}\Big(\mathrm e^{-\frac tnA}\mathrm e^{-\frac tnB}\Big)^n,\qquad t\ge0. \tag{5.22}$$

Problem 5.4. Prove Theorem 5.11.
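For matrices, (5.21) is easy to observe directly: the Trotter error decays like $O(1/n)$. A minimal sketch (not from the book) with two non-commuting $2\times2$ matrices:

```python
import numpy as np

# two self-adjoint matrices that do not commute
A = np.array([[0., 1.], [1., 0.]])     # Pauli sigma_x
B = np.array([[1., 0.], [0., -1.]])    # Pauli sigma_z

def U(M, t):                           # exp(-itM) via the spectral theorem
    E, V = np.linalg.eigh(M)
    return (V * np.exp(-1j * t * E)) @ V.conj().T

t = 1.0
exact = U(A + B, t)

def trotter(n):
    step = U(A, t / n) @ U(B, t / n)
    return np.linalg.matrix_power(step, n)

err_10 = np.linalg.norm(trotter(10) - exact)
err_100 = np.linalg.norm(trotter(100) - exact)
```

Since the leading error term is proportional to $t^2\|[A,B]\|/n$, `err_100` is roughly ten times smaller than `err_10`.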
Chapter 6

Perturbation theory for self-adjoint operators

The Hamiltonian of a quantum mechanical system is usually the sum of the kinetic energy $H_0$ (free Schrödinger operator) plus an operator $V$ corresponding to the potential energy. Since $H_0$ is easy to investigate, one usually tries to consider $V$ as a perturbation of $H_0$. This will only work if $V$ is small with respect to $H_0$. Hence we study such perturbations of self-adjoint operators next.

6.1. Relatively bounded operators and the Kato–Rellich theorem

An operator $B$ is called $A$ bounded or relatively bounded with respect to $A$ if $D(A)\subseteq D(B)$ and if there are constants $a,b\ge0$ such that

$$\|B\psi\|\le a\|A\psi\|+b\|\psi\|,\qquad\psi\in D(A). \tag{6.1}$$

The infimum of all constants $a$ for which a corresponding $b$ exists such that (6.1) holds is called the $A$-bound of $B$.

The triangle inequality implies

Lemma 6.1. Suppose $B_j$, $j=1,2$, are $A$ bounded with respective $A$-bounds $a_j$, $j=1,2$. Then $\alpha_1B_1+\alpha_2B_2$ is also $A$ bounded with $A$-bound less than $|\alpha_1|a_1+|\alpha_2|a_2$. In particular, the set of all $A$ bounded operators forms a linear space.

There are also the following equivalent characterizations:
Lemma 6.2. Suppose $A$ is closed and $B$ is closable. Then the following are equivalent:

(i) $B$ is $A$ bounded.
(ii) $D(A)\subseteq D(B)$.
(iii) $BR_A(z)$ is bounded for one (and hence for all) $z\in\rho(A)$.

Moreover, the $A$-bound of $B$ is no larger than $\inf_{z\in\rho(A)}\|BR_A(z)\|$.

Proof. (i) $\Rightarrow$ (ii) is true by definition. (ii) $\Rightarrow$ (iii) since $BR_A(z)$ is a closed (Problem 2.9) operator defined on all of $\mathcal H$ and hence bounded by the closed graph theorem (Theorem 2.8). To see (iii) $\Rightarrow$ (i), let $\psi\in D(A)$. Then

$$\|B\psi\|=\|BR_A(z)(A-z)\psi\|\le a\|(A-z)\psi\|\le a\|A\psi\|+(a|z|)\|\psi\|,$$

where $a=\|BR_A(z)\|$. Finally, note that if $BR_A(z)$ is bounded for one $z\in\rho(A)$, it is bounded for all $z\in\rho(A)$ by the first resolvent formula. □

Example. Let $A$ be the self-adjoint operator $A=-\frac{d^2}{dx^2}$, $D(A)=\{f\in H^2[0,1]\,|\,f(0)=f(1)=0\}$, in the Hilbert space $L^2(0,1)$. If we want to add a potential represented by a multiplication operator with a real-valued (measurable) function $q$, then $q$ will be relatively bounded if $q\in L^2(0,1)$: Indeed, since all functions in $D(A)$ are continuous on $[0,1]$ and hence bounded, we clearly have $D(A)\subseteq D(q)$ in this case.
We are mainly interested in the situation where A is self-adjoint and B
is symmetric. Hence we will restrict our attention to this case.
Lemma 6.3. Suppose $A$ is self-adjoint and $B$ relatively bounded. The $A$-bound
of $B$ is given by
$$ \lim_{\lambda\to\infty} \|BR_A(\pm\mathrm{i}\lambda)\|. \tag{6.2} $$
If $A$ is bounded from below, we can also replace $\pm\mathrm{i}\lambda$ by $-\lambda$.

Proof. Let $\varphi = R_A(\pm\mathrm{i}\lambda)\psi$, $\lambda > 0$, and let $a_\infty$ be the $A$-bound of $B$. Then
(use the spectral theorem to estimate the norms)
$$ \|BR_A(\pm\mathrm{i}\lambda)\psi\| \le a\|AR_A(\pm\mathrm{i}\lambda)\psi\| + b\|R_A(\pm\mathrm{i}\lambda)\psi\| \le \Big(a + \frac{b}{\lambda}\Big)\|\psi\|. $$
Hence $\limsup_{\lambda\to\infty} \|BR_A(\pm\mathrm{i}\lambda)\| \le a_\infty$ which, together with the inequality $a_\infty \le
\inf_\lambda \|BR_A(\pm\mathrm{i}\lambda)\|$ from the previous lemma, proves the claim.

The case where $A$ is bounded from below by $\gamma$ is similar, using
$$ \|BR_A(-\lambda)\psi\| \le \Big( a \max\Big(1, \frac{|\gamma|}{\gamma+\lambda}\Big) + \frac{b}{\gamma+\lambda} \Big) \|\psi\|, $$
for $-\lambda < \gamma$. □
Now we will show the basic perturbation result due to Kato and Rellich.
Theorem 6.4 (Kato–Rellich). Suppose $A$ is (essentially) self-adjoint and
$B$ is symmetric with $A$-bound less than one. Then $A + B$, $D(A + B) =
D(A)$, is (essentially) self-adjoint. If $A$ is essentially self-adjoint, we have
$D(\overline{A}) \subseteq D(\overline{B})$ and $\overline{A+B} = \overline{A} + \overline{B}$.

If $A$ is bounded from below by $\gamma$, then $A+B$ is bounded from below by
$$ \gamma - \max\Big( a|\gamma| + b, \frac{b}{1-a} \Big). \tag{6.3} $$

Proof. Since $D(A) \subseteq D(B)$ and $D(A) \subseteq D(A+B)$ by (6.1), we can assume
that $A$ is closed (i.e., self-adjoint). It suffices to show that $\mathrm{Ran}(A+B\pm\mathrm{i}\lambda) =
\mathfrak{H}$. By the above lemma we can find a $\lambda > 0$ such that $\|BR_A(\mp\mathrm{i}\lambda)\| < 1$.
Hence $-1 \in \rho(BR_A(\mp\mathrm{i}\lambda))$ and thus $I + BR_A(\mp\mathrm{i}\lambda)$ is onto. Thus
$$ (A + B \pm \mathrm{i}\lambda) = (I + BR_A(\mp\mathrm{i}\lambda))(A \pm \mathrm{i}\lambda) $$
is onto and the proof of the first part is complete.

If $A$ is bounded from below, we can replace $\pm\mathrm{i}\lambda$ by $-\lambda$ and the above
equation shows that $R_{A+B}$ exists for $\lambda$ sufficiently large. By the proof of
the previous lemma we can choose $-\lambda < \gamma - \max(a|\gamma|+b, \frac{b}{1-a})$. □

Example. In our previous example we have seen that $q \in L^2(0,1)$ is relatively
bounded by checking $D(A) \subseteq D(q)$. However, working a bit harder
(Problem 6.2), one can even show that the relative bound is $0$ and hence
$A + q$ is self-adjoint by the Kato–Rellich theorem. ⋄
Finally, let us show that there is also a connection between the resolvents.
Lemma 6.5. Suppose $A$ and $B$ are closed and $D(A) \subseteq D(B)$. Then we
have the second resolvent formula
$$ R_{A+B}(z) - R_A(z) = -R_A(z) B R_{A+B}(z) = -R_{A+B}(z) B R_A(z) \tag{6.4} $$
for $z \in \rho(A) \cap \rho(A+B)$.

Proof. We compute
$$ R_{A+B}(z) + R_A(z) B R_{A+B}(z) = R_A(z)(A + B - z) R_{A+B}(z) = R_A(z). $$
The second identity is similar. □
Problem 6.1. Show that (6.1) implies
$$ \|B\psi\|^2 \le \tilde a^2 \|A\psi\|^2 + \tilde b^2 \|\psi\|^2 $$
with $\tilde a = a(1 + \varepsilon^2)$ and $\tilde b = b(1 + \varepsilon^{-2})$ for any $\varepsilon > 0$. Conversely, show that
this inequality implies (6.1) with $a = \tilde a$ and $b = \tilde b$.
Problem 6.2. Let $A$ be the self-adjoint operator $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in
H^2[0,1] \mid f(0) = f(1) = 0\}$, in the Hilbert space $L^2(0,1)$ and $q \in L^2(0,1)$.
Show that for every $f \in D(A)$ we have
$$ \|f\|_\infty^2 \le \frac{\varepsilon}{2}\|f''\|^2 + \frac{1}{2\varepsilon}\|f\|^2 $$
for any $\varepsilon > 0$. Conclude that the relative bound of $q$ with respect to $A$ is
zero. (Hint: $|f(x)|^2 \le \big(\int_0^1 |f'(t)|\,dt\big)^2 \le \int_0^1 |f'(t)|^2\,dt = -\int_0^1 f(t)^* f''(t)\,dt$.)

Problem 6.3. Let $A$ be as in the previous example. Show that $q$ is relatively
bounded if and only if $x(1-x)q(x) \in L^2(0,1)$.

Problem 6.4. Compute the resolvent of $A + \alpha\langle\varphi, .\rangle\varphi$. (Hint: Show
$$ (I + \alpha\langle\varphi, .\rangle\varphi)^{-1} = I - \frac{\alpha}{1 + \alpha\langle\varphi,\varphi\rangle} \langle\varphi, .\rangle\varphi $$
and use the second resolvent formula.)
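The inversion formula in the hint of Problem 6.4 is a rank-one analog of the Sherman–Morrison formula and is easy to verify numerically. A minimal check (assuming NumPy; the vector $\varphi$ and the coupling $\alpha$ are arbitrary test data):

```python
import numpy as np

# Check of (I + α⟨φ,·⟩φ)⁻¹ = I − α/(1+α⟨φ,φ⟩) ⟨φ,·⟩φ for a random φ.
rng = np.random.default_rng(1)
n, alpha = 5, 0.7
phi = rng.standard_normal(n) + 1j * rng.standard_normal(n)

P = np.outer(phi, phi.conj())            # the map ψ ↦ ⟨φ,ψ⟩φ
lhs = np.linalg.inv(np.eye(n) + alpha * P)
rhs = np.eye(n) - alpha / (1 + alpha * np.vdot(phi, phi)) * P

assert np.allclose(lhs, rhs)
```

Combining this with the second resolvent formula then yields the resolvent of the rank-one perturbation, as the hint suggests.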
6.2. More on compact operators

Recall from Section 5.2 that we have introduced the set of compact operators
$C(\mathfrak{H})$ as the closure of the set of all finite rank operators in $\mathfrak{L}(\mathfrak{H})$. Before we
can proceed, we need to establish some further results for such operators.
We begin by investigating the spectrum of self-adjoint compact operators
and show that the spectral theorem takes a particularly simple form in this
case.
Theorem 6.6 (Spectral theorem for compact operators). Suppose the operator
$K$ is self-adjoint and compact. Then the spectrum of $K$ consists of
an at most countable number of eigenvalues which can only cluster at $0$.
Moreover, the eigenspace to each nonzero eigenvalue is finite dimensional.

In addition, we have
$$ K = \sum_{\alpha \in \sigma(K)} \alpha\, P_K(\{\alpha\}). \tag{6.5} $$

Proof. It suffices to show $\mathrm{rank}(P_K((\alpha - \varepsilon, \alpha + \varepsilon))) < \infty$ for $0 < \varepsilon < |\alpha|$.
Let $K_n$ be a sequence of finite rank operators such that $\|K - K_n\| \le 1/n$. If
$\mathrm{Ran}\,P_K((\alpha-\varepsilon, \alpha+\varepsilon))$ is infinite dimensional, we can find a vector $\psi_n$ in this
range such that $\|\psi_n\| = 1$ and $K_n\psi_n = 0$. But this yields a contradiction
since
$$ \frac{1}{n} \ge |\langle \psi_n, (K - K_n)\psi_n \rangle| = |\langle \psi_n, K\psi_n \rangle| \ge |\alpha| - \varepsilon > 0 $$
by (4.2). □
As a consequence we obtain the canonical form of a general compact
operator.
Theorem 6.7 (Canonical form of compact operators). Let $K$ be compact.
There exist orthonormal sets $\{\hat\varphi_j\}$, $\{\varphi_j\}$ and positive numbers $s_j = s_j(K)$
such that
$$ K = \sum_j s_j \langle \varphi_j, .\rangle \hat\varphi_j, \qquad K^* = \sum_j s_j \langle \hat\varphi_j, .\rangle \varphi_j. \tag{6.6} $$
Note $K\varphi_j = s_j \hat\varphi_j$ and $K^*\hat\varphi_j = s_j \varphi_j$, and hence $K^*K\varphi_j = s_j^2 \varphi_j$ and $KK^*\hat\varphi_j = s_j^2 \hat\varphi_j$.

The numbers $s_j(K)^2 > 0$ are the nonzero eigenvalues of $KK^*$, respectively,
$K^*K$ (counted with multiplicity) and $s_j(K) = s_j(K^*) = s_j$ are called
singular values of $K$. There are either finitely many singular values (if $K$
is finite rank) or they converge to zero.

Proof. By Lemma 5.5, $K^*K$ is compact and hence Theorem 6.6 applies.
Let $\{\varphi_j\}$ be an orthonormal basis of eigenvectors for $P_{K^*K}((0,\infty))\mathfrak{H}$ and let
$s_j^2$ be the eigenvalue corresponding to $\varphi_j$. Then, for any $\psi \in \mathfrak{H}$ we can write
$$ \psi = \sum_j \langle \varphi_j, \psi \rangle \varphi_j + \tilde\psi $$
with $\tilde\psi \in \mathrm{Ker}(K^*K) = \mathrm{Ker}(K)$. Then
$$ K\psi = \sum_j s_j \langle \varphi_j, \psi \rangle \hat\varphi_j, $$
where $\hat\varphi_j = s_j^{-1} K\varphi_j$, since $\|K\tilde\psi\|^2 = \langle \tilde\psi, K^*K\tilde\psi \rangle = 0$. By $\langle \hat\varphi_j, \hat\varphi_k \rangle =
(s_j s_k)^{-1} \langle K\varphi_j, K\varphi_k \rangle = (s_j s_k)^{-1} \langle K^*K\varphi_j, \varphi_k \rangle = s_j s_k^{-1} \langle \varphi_j, \varphi_k \rangle$ we see that
the $\hat\varphi_j$ are orthonormal and the formula for $K^*$ follows by taking the
adjoint of the formula for $K$ (Problem 6.5). □

If $K$ is self-adjoint, then $\varphi_j = \sigma_j \hat\varphi_j$, $\sigma_j^2 = 1$, are the eigenvectors of $K$
and $\sigma_j s_j$ are the corresponding eigenvalues.

Moreover, note that we have
$$ \|K\| = \max_j s_j(K). \tag{6.7} $$
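In finite dimensions the canonical form (6.6) is the singular value decomposition, and both (6.7) and the description of $s_j(K)^2$ as the eigenvalues of $K^*K$ can be checked directly. A sketch assuming NumPy (the matrix is arbitrary test data):

```python
import numpy as np

# Singular values of a matrix K: square roots of the eigenvalues of K*K;
# the operator norm equals the largest singular value, cf. (6.7).
rng = np.random.default_rng(2)
K = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

s = np.linalg.svd(K, compute_uv=False)        # s_1 >= s_2 >= ... > 0
eig = np.sort(np.linalg.eigvalsh(K.conj().T @ K))[::-1]

assert np.allclose(s**2, eig)                 # s_j(K)^2 = eigenvalues of K*K
assert np.isclose(np.linalg.norm(K, 2), s[0]) # ||K|| = max_j s_j(K)
```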
Finally, let me remark that there are a number of other equivalent definitions
for compact operators.

Lemma 6.8. For $K \in \mathfrak{L}(\mathfrak{H})$ the following statements are equivalent:

(i) $K$ is compact.

(i′) $K^*$ is compact.

(ii) $A_n \in \mathfrak{L}(\mathfrak{H})$ and $A_n \xrightarrow{s} A$ strongly implies $A_n K \to AK$.

(iii) $\psi_n \rightharpoonup \psi$ weakly implies $K\psi_n \to K\psi$ in norm.

(iv) $\psi_n$ bounded implies that $K\psi_n$ has a (norm) convergent subsequence.
Proof. (i) $\Leftrightarrow$ (i′). This is immediate from Theorem 6.7.

(i) $\Rightarrow$ (ii). Translating $A_n \to A_n - A$, it is no restriction to assume
$A = 0$. Since $\|A_n\| \le M$, it suffices to consider the case where $K$ is finite
rank. Then (by (6.6) and Cauchy–Schwarz)
$$ \|A_n K\|^2 = \sup_{\|\psi\|=1} \Big\| \sum_{j=1}^N s_j \langle \varphi_j, \psi\rangle A_n\hat\varphi_j \Big\|^2 \le \sup_{\|\psi\|=1} \sum_{j=1}^N s_j |\langle \varphi_j, \psi\rangle|^2 \sum_{j=1}^N s_j \|A_n\hat\varphi_j\|^2 \le s_1 \sum_{j=1}^N s_j \|A_n\hat\varphi_j\|^2 \to 0. $$

(ii) $\Rightarrow$ (iii). Again, replace $\psi_n \to \psi_n - \psi$ and assume $\psi = 0$. Choose
$A_n = \langle \psi_n, .\rangle\varphi$, $\|\varphi\| = 1$. Then $\|K\psi_n\| = \|A_n K^*\| \to 0$.

(iii) $\Rightarrow$ (iv). If $\psi_n$ is bounded, it has a weakly convergent subsequence
by Lemma 1.13. Now apply (iii) to this subsequence.

(iv) $\Rightarrow$ (i). Let $\{\varphi_j\}$ be an orthonormal basis and set
$$ K_n = \sum_{j=1}^n \langle \varphi_j, .\rangle K\varphi_j. $$
Then
$$ \gamma_n = \|K - K_n\| = \sup_{\psi \in \mathrm{span}\{\varphi_j\}_{j \ge n},\ \|\psi\|=1} \|K\psi\| $$
is a decreasing sequence tending to a limit $\varepsilon \ge 0$. Moreover, we can find
a sequence of unit vectors $\psi_n \in \mathrm{span}\{\varphi_j\}_{j \ge n}$ for which $\|K\psi_n\| \ge \varepsilon$. By
assumption, $K\psi_n$ has a convergent subsequence which, since $\psi_n$ converges
weakly to $0$, converges to $0$. Hence $\varepsilon$ must be $0$ and we are done. □
The last condition explains the name compact. Moreover, note that one
cannot replace $A_n K \to AK$ by $K A_n \to KA$ in (ii) unless one additionally
requires $A_n$ to be normal (then this follows by taking adjoints; recall
that only for normal operators is taking adjoints continuous with respect
to strong convergence). Without the requirement that $A_n$ be normal, the
claim is wrong as the following example shows.

Example. Let $\mathfrak{H} = \ell^2(\mathbb{N})$ and let $A_n$ be the operator which shifts each
sequence $n$ places to the left and let $K = \langle \delta_1, .\rangle \delta_1$, where $\delta_1 = (1, 0, \ldots)$.
Then $\operatorname*{s-lim} A_n = 0$ but $\|KA_n\| = 1$. ⋄
Problem 6.5. Deduce the formula for $K^*$ from the one for $K$ in (6.6).

Problem 6.6. Show that it suffices to check conditions (iii) and (iv) from
Lemma 6.8 on a dense subset.
6.3. Hilbert–Schmidt and trace class operators

Among the compact operators two special classes are of particular importance.
The first ones are integral operators
$$ K\psi(x) = \int_M K(x, y)\psi(y)\,d\mu(y), \qquad \psi \in L^2(M, d\mu), \tag{6.8} $$
where $K(x, y) \in L^2(M \times M, d\mu \otimes d\mu)$. Such an operator is called a Hilbert–Schmidt
operator. Using Cauchy–Schwarz,
$$ \int_M |K\psi(x)|^2 d\mu(x) = \int_M \Big( \int_M |K(x,y)\psi(y)|\, d\mu(y) \Big)^2 d\mu(x) $$
$$ \le \int_M \Big( \int_M |K(x,y)|^2 d\mu(y) \Big) \Big( \int_M |\psi(y)|^2 d\mu(y) \Big) d\mu(x) $$
$$ = \Big( \int_M \int_M |K(x,y)|^2 d\mu(y)\, d\mu(x) \Big) \Big( \int_M |\psi(y)|^2 d\mu(y) \Big), \tag{6.9} $$
we see that $K$ is bounded. Next, pick an orthonormal basis $\varphi_j(x)$ for
$L^2(M, d\mu)$. Then, by Lemma 1.10, $\varphi_i(x)\varphi_j(y)$ is an orthonormal basis for
$L^2(M \times M, d\mu \otimes d\mu)$ and
$$ K(x, y) = \sum_{i,j} c_{i,j}\, \varphi_i(x) \varphi_j(y), \qquad c_{i,j} = \langle \varphi_i, K\varphi_j^* \rangle, \tag{6.10} $$
where
$$ \sum_{i,j} |c_{i,j}|^2 = \int_M \int_M |K(x,y)|^2 d\mu(y)\, d\mu(x) < \infty. \tag{6.11} $$
In particular,
$$ K\psi(x) = \sum_{i,j} c_{i,j} \langle \varphi_j^*, \psi \rangle \varphi_i(x) \tag{6.12} $$
shows that $K$ can be approximated by finite rank operators (take finitely
many terms in the sum) and is hence compact.
Using (6.6), we can also give a different characterization of Hilbert–Schmidt
operators.

Lemma 6.9. If $\mathfrak{H} = L^2(M, d\mu)$, then a compact operator $K$ is Hilbert–Schmidt
if and only if $\sum_j s_j(K)^2 < \infty$ and
$$ \sum_j s_j(K)^2 = \int_M \int_M |K(x,y)|^2 d\mu(x)\,d\mu(y), \tag{6.13} $$
in this case.

Proof. If $K$ is compact, we can define approximating finite rank operators
$K_n$ by considering only finitely many terms in (6.6):
$$ K_n = \sum_{j=1}^n s_j \langle \varphi_j, .\rangle \hat\varphi_j. $$
Then $K_n$ has the kernel $K_n(x, y) = \sum_{j=1}^n s_j \varphi_j(y)^* \hat\varphi_j(x)$ and
$$ \int_M \int_M |K_n(x,y)|^2 d\mu(x)\,d\mu(y) = \sum_{j=1}^n s_j(K)^2. $$
Now if one side converges, so does the other and, in particular, (6.13) holds
in this case. □
Hence we will call a compact operator Hilbert–Schmidt if its singular
values satisfy
$$ \sum_j s_j(K)^2 < \infty. \tag{6.14} $$
By our lemma this coincides with our previous definition if $\mathfrak{H} = L^2(M, d\mu)$.

Since every Hilbert space is isomorphic to some $L^2(M, d\mu)$, we see that
the Hilbert–Schmidt operators together with the norm
$$ \|K\|_2 = \Big( \sum_j s_j(K)^2 \Big)^{1/2} \tag{6.15} $$
form a Hilbert space (isomorphic to $L^2(M \times M, d\mu \otimes d\mu)$). Note that $\|K\|_2 =
\|K^*\|_2$ (since $s_j(K) = s_j(K^*)$). There is another useful characterization for
identifying Hilbert–Schmidt operators:
Lemma 6.10. A compact operator $K$ is Hilbert–Schmidt if and only if
$$ \sum_n \|K\psi_n\|^2 < \infty \tag{6.16} $$
for some orthonormal basis and
$$ \sum_n \|K\psi_n\|^2 = \|K\|_2^2 \tag{6.17} $$
for any orthonormal basis in this case.

Proof. This follows from
$$ \sum_n \|K\psi_n\|^2 = \sum_{n,j} |\langle \hat\varphi_j, K\psi_n \rangle|^2 = \sum_{n,j} |\langle K^*\hat\varphi_j, \psi_n \rangle|^2 = \sum_j \|K^*\hat\varphi_j\|^2 = \sum_j s_j(K)^2. \qquad \Box $$
Corollary 6.11. The set of Hilbert–Schmidt operators forms a $*$-ideal in
$\mathfrak{L}(\mathfrak{H})$ and
$$ \|KA\|_2 \le \|A\|\,\|K\|_2, \quad \text{respectively}, \quad \|AK\|_2 \le \|A\|\,\|K\|_2. \tag{6.18} $$

Proof. Let $K$ be Hilbert–Schmidt and $A$ bounded. Then $AK$ is compact
and
$$ \|AK\|_2^2 = \sum_n \|AK\psi_n\|^2 \le \|A\|^2 \sum_n \|K\psi_n\|^2 = \|A\|^2 \|K\|_2^2. $$
For $KA$ just consider adjoints. □
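For matrices $\|K\|_2$ is the Frobenius norm, so Lemma 6.10 can be tested directly: the sum $\sum_n \|K\psi_n\|^2$ is the same for every orthonormal basis and equals $\sum_j s_j(K)^2$. A sketch assuming NumPy (the matrices are arbitrary test data):

```python
import numpy as np

# Lemma 6.10 for matrices: basis independence of the Hilbert-Schmidt norm.
rng = np.random.default_rng(3)
K = rng.standard_normal((5, 5))

# Standard basis psi_n = e_n: K e_n is the n-th column of K.
hs_standard = sum(np.linalg.norm(K[:, n])**2 for n in range(5))

# A random orthonormal basis from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
hs_rotated = sum(np.linalg.norm(K @ Q[:, n])**2 for n in range(5))

# Sum of squared singular values, i.e. ||K||_2^2 from (6.15).
hs_singular = np.sum(np.linalg.svd(K, compute_uv=False)**2)

assert np.isclose(hs_standard, hs_rotated)
assert np.isclose(hs_standard, hs_singular)
```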
This approach can be generalized by defining
$$ \|K\|_p = \Big( \sum_j s_j(K)^p \Big)^{1/p} \tag{6.19} $$
plus corresponding spaces
$$ \mathcal{J}_p(\mathfrak{H}) = \{ K \in C(\mathfrak{H}) \mid \|K\|_p < \infty \}, \tag{6.20} $$
which are known as Schatten $p$-classes. Note that by (6.7)
$$ \|K\| \le \|K\|_p \tag{6.21} $$
and that by $s_j(K) = s_j(K^*)$ we have
$$ \|K\|_p = \|K^*\|_p. \tag{6.22} $$

Lemma 6.12. The spaces $\mathcal{J}_p(\mathfrak{H})$ together with the norm $\|.\|_p$ are Banach
spaces. Moreover,
$$ \|K\|_p = \sup\Big\{ \Big( \sum_j |\langle \psi_j, K\varphi_j \rangle|^p \Big)^{1/p} \;\Big|\; \{\psi_j\}, \{\varphi_j\} \text{ ONS} \Big\}, \tag{6.23} $$
where the sup is taken over all orthonormal sets.
Proof. The hard part is to prove (6.23): Choose $q$ such that $\frac{1}{p} + \frac{1}{q} = 1$ and
use Hölder's inequality to obtain (with $s_j|...|^2 = (s_j^p|...|^2)^{1/p}(|...|^2)^{1/q}$)
$$ \sum_j s_j |\langle \psi_n, \varphi_j \rangle|^2 \le \Big( \sum_j s_j^p |\langle \psi_n, \varphi_j \rangle|^2 \Big)^{1/p} \Big( \sum_j |\langle \psi_n, \varphi_j \rangle|^2 \Big)^{1/q} \le \Big( \sum_j s_j^p |\langle \psi_n, \varphi_j \rangle|^2 \Big)^{1/p}. $$
Clearly the analogous inequality holds for $\hat\varphi_j$, $\psi_n$. Now using Cauchy–Schwarz,
we have
$$ |\langle \psi_n, K\varphi_n \rangle|^p = \Big| \sum_j s_j^{1/2} \langle \psi_n, \hat\varphi_j \rangle\, s_j^{1/2} \langle \varphi_j, \varphi_n \rangle \Big|^p \le \Big( \sum_j s_j^p |\langle \psi_n, \hat\varphi_j \rangle|^2 \Big)^{1/2} \Big( \sum_j s_j^p |\langle \varphi_j, \varphi_n \rangle|^2 \Big)^{1/2}. $$
Summing over $n$, a second appeal to Cauchy–Schwarz and interchanging the
order of summation finally gives
$$ \sum_n |\langle \psi_n, K\varphi_n \rangle|^p \le \Big( \sum_{n,j} s_j^p |\langle \psi_n, \hat\varphi_j \rangle|^2 \Big)^{1/2} \Big( \sum_{n,j} s_j^p |\langle \varphi_j, \varphi_n \rangle|^2 \Big)^{1/2} \le \Big( \sum_j s_j^p \Big)^{1/2} \Big( \sum_j s_j^p \Big)^{1/2} = \sum_j s_j^p. $$
Since equality is attained for $\psi_n = \hat\varphi_n$ and $\varphi_n$ as in (6.6), equation (6.23) holds.

Now the rest is straightforward. From
$$ \Big( \sum_j |\langle \psi_j, (K_1+K_2)\varphi_j \rangle|^p \Big)^{1/p} \le \Big( \sum_j |\langle \psi_j, K_1\varphi_j \rangle|^p \Big)^{1/p} + \Big( \sum_j |\langle \psi_j, K_2\varphi_j \rangle|^p \Big)^{1/p} \le \|K_1\|_p + \|K_2\|_p $$
we infer that $\mathcal{J}_p(\mathfrak{H})$ is a vector space and that the triangle inequality holds. The other
requirements for a norm are obvious and it remains to check completeness.
If $K_n$ is a Cauchy sequence with respect to $\|.\|_p$, it is also a Cauchy sequence
with respect to $\|.\|$ (by $\|K\| \le \|K\|_p$). Since $C(\mathfrak{H})$ is closed, there is a compact
$K$ with $\|K - K_n\| \to 0$ and by $\|K_n\|_p \le C$ we have
$$ \Big( \sum_j |\langle \psi_j, K\varphi_j \rangle|^p \Big)^{1/p} \le C $$
for any finite ONS. Since the right-hand side is independent of the ONS
(and in particular of the number of vectors), $K$ is in $\mathcal{J}_p(\mathfrak{H})$. □
The two most important cases are $p = 1$ and $p = 2$: $\mathcal{J}_2(\mathfrak{H})$ is the space
of Hilbert–Schmidt operators investigated in the previous section and $\mathcal{J}_1(\mathfrak{H})$
is the space of trace class operators. Since Hilbert–Schmidt operators are
easy to identify, it is important to relate $\mathcal{J}_1(\mathfrak{H})$ with $\mathcal{J}_2(\mathfrak{H})$:

Lemma 6.13. An operator is trace class if and only if it can be written as
the product of two Hilbert–Schmidt operators, $K = K_1 K_2$, and in this case
we have
$$ \|K\|_1 \le \|K_1\|_2\, \|K_2\|_2. \tag{6.24} $$
Proof. By Cauchy–Schwarz we have
$$ \sum_n |\langle \varphi_n, K\psi_n \rangle| = \sum_n |\langle K_1^*\varphi_n, K_2\psi_n \rangle| \le \Big( \sum_n \|K_1^*\varphi_n\|^2 \sum_n \|K_2\psi_n\|^2 \Big)^{1/2} = \|K_1\|_2 \|K_2\|_2 $$
and hence $K = K_1 K_2$ is trace class if both $K_1$ and $K_2$ are Hilbert–Schmidt
operators. To see the converse, let $K$ be given by (6.6) and choose $K_1 =
\sum_j \sqrt{s_j(K)}\, \langle \hat\varphi_j, .\rangle \hat\varphi_j$, respectively, $K_2 = \sum_j \sqrt{s_j(K)}\, \langle \varphi_j, .\rangle \hat\varphi_j$. □
Corollary 6.14. The set of trace class operators forms a $*$-ideal in $\mathfrak{L}(\mathfrak{H})$
and
$$ \|KA\|_1 \le \|A\|\,\|K\|_1, \quad \text{respectively}, \quad \|AK\|_1 \le \|A\|\,\|K\|_1. \tag{6.25} $$

Proof. Write $K = K_1 K_2$ with $K_1, K_2$ Hilbert–Schmidt and use Corollary 6.11. □
Now we can also explain the name trace class:
Lemma 6.15. If $K$ is trace class, then for any orthonormal basis $\{\varphi_n\}$ the
trace
$$ \mathrm{tr}(K) = \sum_n \langle \varphi_n, K\varphi_n \rangle \tag{6.26} $$
is finite and independent of the orthonormal basis.
Proof. Let $\{\psi_n\}$ be another ONB. If we write $K = K_1 K_2$ with $K_1, K_2$
Hilbert–Schmidt, we have
$$ \sum_n \langle \varphi_n, K_1 K_2 \varphi_n \rangle = \sum_n \langle K_1^*\varphi_n, K_2\varphi_n \rangle = \sum_{n,m} \langle K_1^*\varphi_n, \psi_m \rangle \langle \psi_m, K_2\varphi_n \rangle $$
$$ = \sum_{m,n} \langle K_2^*\psi_m, \varphi_n \rangle \langle \varphi_n, K_1\psi_m \rangle = \sum_m \langle K_2^*\psi_m, K_1\psi_m \rangle = \sum_m \langle \psi_m, K_2 K_1 \psi_m \rangle. $$
Hence the trace is independent of the ONB and we even have $\mathrm{tr}(K_1 K_2) =
\mathrm{tr}(K_2 K_1)$. □

Clearly for self-adjoint trace class operators, the trace is the sum over
all eigenvalues (counted with their multiplicity). To see this, one just has to
choose the orthonormal basis to consist of eigenfunctions. This is even true
for all trace class operators and is known as the Lidskij trace theorem (see [44]
or [20] for an easy to read introduction).
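For matrices, Lemma 6.15 and the identity $\mathrm{tr}(K_1K_2) = \mathrm{tr}(K_2K_1)$ obtained in its proof can be checked directly. A sketch assuming NumPy (all matrices are arbitrary test data):

```python
import numpy as np

# The trace sum ⟨ψ_n, Kψ_n⟩ is independent of the orthonormal basis, and
# tr(K1 K2) = tr(K2 K1).
rng = np.random.default_rng(4)
K1 = rng.standard_normal((5, 5))
K2 = rng.standard_normal((5, 5))
K = K1 @ K2

tr_standard = np.trace(K)                         # basis psi_n = e_n
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # columns = random ONB
tr_rotated = sum(Q[:, n] @ K @ Q[:, n] for n in range(5))

assert np.isclose(tr_standard, tr_rotated)
assert np.isclose(np.trace(K1 @ K2), np.trace(K2 @ K1))
```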
Finally we note the following elementary properties of the trace:
Lemma 6.16. Suppose $K, K_1, K_2$ are trace class and $A$ is bounded.

(i) The trace is linear.

(ii) $\mathrm{tr}(K^*) = \mathrm{tr}(K)^*$.

(iii) If $K_1 \le K_2$, then $\mathrm{tr}(K_1) \le \mathrm{tr}(K_2)$.

(iv) $\mathrm{tr}(AK) = \mathrm{tr}(KA)$.

Proof. (i) and (ii) are straightforward. (iii) follows from $K_1 \le K_2$ if and
only if $\langle \varphi, K_1\varphi \rangle \le \langle \varphi, K_2\varphi \rangle$ for every $\varphi \in \mathfrak{H}$. (iv) By Problem 6.7 and (i), it
is no restriction to assume that $A$ is unitary. Let $\{\varphi_n\}$ be some ONB and
note that $\{\psi_n = A\varphi_n\}$ is also an ONB. Then
$$ \mathrm{tr}(AK) = \sum_n \langle \psi_n, AK\psi_n \rangle = \sum_n \langle A\varphi_n, AKA\varphi_n \rangle = \sum_n \langle \varphi_n, KA\varphi_n \rangle = \mathrm{tr}(KA) $$
and the claim follows. □
Problem 6.7. Show that every bounded operator can be written as a linear
combination of two self-adjoint operators. Furthermore, show that every
bounded self-adjoint operator can be written as a linear combination of two
unitary operators. (Hint: $x \pm \mathrm{i}\sqrt{1 - x^2}$ has absolute value one for $x \in [-1, 1]$.)

Problem 6.8. Let $\mathfrak{H} = \ell^2(\mathbb{N})$ and let $A$ be multiplication by a sequence
$a(n)$. Show that $A \in \mathcal{J}_p(\ell^2(\mathbb{N}))$ if and only if $a \in \ell^p(\mathbb{N})$. Furthermore, show
that $\|A\|_p = \|a\|_p$ in this case.
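A finite truncation of Problem 6.8 can be checked numerically: for a diagonal (multiplication) operator the singular values are $|a(n)|$, so the Schatten norm and the $\ell^p$ norm of the sequence agree. A sketch assuming NumPy (the sequence is arbitrary test data; this is an illustration, not a solution of the problem):

```python
import numpy as np

# Multiplication by a finite sequence a(n), realized as a diagonal matrix.
a = np.array([3.0, -1.5, 0.5, 0.25])
A = np.diag(a)

s = np.linalg.svd(A, compute_uv=False)   # singular values are |a(n)|, sorted

for p in (1, 2, 3):
    schatten = np.sum(s**p) ** (1 / p)            # ||A||_p from (6.19)
    ellp = np.sum(np.abs(a)**p) ** (1 / p)        # l^p norm of the sequence
    assert np.isclose(schatten, ellp)
```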
Problem 6.9. Show that $A \ge 0$ is trace class if (6.26) is finite for one (and
hence all) ONB. (Hint: $A$ is self-adjoint (why?) and $A = \sqrt{A}\sqrt{A}$.)

Problem 6.10. Show that for an orthogonal projection $P$ we have
$$ \dim \mathrm{Ran}(P) = \mathrm{tr}(P), $$
where we set $\mathrm{tr}(P) = \infty$ if (6.26) is infinite (for one and hence all ONB by
the previous problem).

Problem 6.11. Show that for $K \in C(\mathfrak{H})$ we have
$$ |K| = \sum_j s_j \langle \varphi_j, .\rangle \varphi_j, $$
where $|K| = \sqrt{K^*K}$. Conclude that
$$ \|K\|_p = \big( \mathrm{tr}(|K|^p) \big)^{1/p}. $$
Problem 6.12. Show that $K : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$, $f(n) \mapsto \sum_{j\in\mathbb{N}} k(n+j)f(j)$, is
Hilbert–Schmidt with $\|K\|_2 \le \|c\|_1$ if $|k(n)| \le c(n)$, where $c(n)$ is decreasing
and summable.
6.4. Relatively compact operators and Weyl's theorem

In the previous section we have seen that the sum of a self-adjoint operator
and a symmetric operator is again self-adjoint if the perturbing operator is
small. In this section we want to study the influence of perturbations on
the spectrum. Our hope is that at least some parts of the spectrum remain
invariant.

We introduce some notation first. The discrete spectrum $\sigma_d(A)$ is the
set of all eigenvalues which are discrete points of the spectrum and whose
corresponding eigenspace is finite dimensional. The complement of the discrete
spectrum is called the essential spectrum $\sigma_{ess}(A) = \sigma(A) \setminus \sigma_d(A)$. If
$A$ is self-adjoint, we might equivalently set
$$ \sigma_d(A) = \{ \lambda \in \sigma_p(A) \mid \mathrm{rank}(P_A((\lambda - \varepsilon, \lambda + \varepsilon))) < \infty \text{ for some } \varepsilon > 0 \}, \tag{6.27} $$
respectively,
$$ \sigma_{ess}(A) = \{ \lambda \in \mathbb{R} \mid \mathrm{rank}(P_A((\lambda - \varepsilon, \lambda + \varepsilon))) = \infty \text{ for all } \varepsilon > 0 \}. \tag{6.28} $$

Example. For a self-adjoint compact operator $K$ we have by Theorem 6.6
that
$$ \sigma_{ess}(K) \subseteq \{0\}, \tag{6.29} $$
where equality holds if and only if $\mathfrak{H}$ is infinite dimensional. ⋄

Let $A$ be self-adjoint. Note that if we add a multiple of the identity to
$A$, we shift the entire spectrum. Hence, in general, we cannot expect a (relatively)
bounded perturbation to leave any part of the spectrum invariant.
Next, if $\lambda_0$ is in the discrete spectrum, we can easily remove this eigenvalue
with a finite rank perturbation of arbitrarily small norm. In fact, consider
$$ A + \varepsilon P_A(\{\lambda_0\}). \tag{6.30} $$
Hence our only hope is that the remainder, namely the essential spectrum,
is stable under finite rank perturbations. To show this, we first need a good
criterion for a point to be in the essential spectrum of $A$.

Lemma 6.17 (Weyl criterion). A point $\lambda$ is in the essential spectrum of
a self-adjoint operator $A$ if and only if there is a sequence $\psi_n$ such that
$\|\psi_n\| = 1$, $\psi_n$ converges weakly to $0$, and $\|(A - \lambda)\psi_n\| \to 0$. Moreover, the
sequence can be chosen orthonormal. Such a sequence is called a singular
Weyl sequence.
Proof. Let $\psi_n$ be a singular Weyl sequence for the point $\lambda_0$. By Lemma 2.16
we have $\lambda_0 \in \sigma(A)$ and hence it suffices to show $\lambda_0 \notin \sigma_d(A)$. If $\lambda_0 \in \sigma_d(A)$,
we can find an $\varepsilon > 0$ such that $P_\varepsilon = P_A((\lambda_0 - \varepsilon, \lambda_0 + \varepsilon))$ is finite rank.
Consider $\tilde\psi_n = P_\varepsilon\psi_n$. Clearly $\|(A - \lambda_0)\tilde\psi_n\| = \|P_\varepsilon(A - \lambda_0)\psi_n\| \le \|(A -
\lambda_0)\psi_n\| \to 0$ and Lemma 6.8 (iii) implies $\tilde\psi_n \to 0$. However,
$$ \|\psi_n - \tilde\psi_n\|^2 = \int_{\mathbb{R}\setminus(\lambda_0-\varepsilon,\lambda_0+\varepsilon)} d\mu_{\psi_n}(\lambda) \le \frac{1}{\varepsilon^2} \int_{\mathbb{R}\setminus(\lambda_0-\varepsilon,\lambda_0+\varepsilon)} (\lambda - \lambda_0)^2\, d\mu_{\psi_n}(\lambda) \le \frac{1}{\varepsilon^2} \|(A - \lambda_0)\psi_n\|^2 $$
and hence $\|\tilde\psi_n\| \to 1$, a contradiction.

Conversely, if $\lambda_0 \in \sigma_{ess}(A)$, consider $P_n = P_A([\lambda_0 - \frac{1}{n}, \lambda_0 - \frac{1}{n+1}) \cup (\lambda_0 +
\frac{1}{n+1}, \lambda_0 + \frac{1}{n}])$. Then $\mathrm{rank}(P_{n_j}) > 0$ for an infinite subsequence $n_j$. Now pick
$\psi_j \in \mathrm{Ran}\,P_{n_j}$. □
Now let $K$ be a self-adjoint compact operator and $\psi_n$ a singular Weyl
sequence for $A$. Then $\psi_n$ converges weakly to zero and hence
$$ \|(A + K - \lambda)\psi_n\| \le \|(A - \lambda)\psi_n\| + \|K\psi_n\| \to 0 \tag{6.31} $$
since $\|(A - \lambda)\psi_n\| \to 0$ by assumption and $\|K\psi_n\| \to 0$ by Lemma 6.8 (iii).
Hence $\sigma_{ess}(A) \subseteq \sigma_{ess}(A + K)$. Reversing the roles of $A + K$ and $A$ shows
$\sigma_{ess}(A+K) = \sigma_{ess}(A)$. In particular, note that $A$ and $A+K$ have the same
singular Weyl sequences.

Since we have shown that we can remove any point in the discrete spectrum
by a self-adjoint finite rank operator, we obtain the following equivalent
characterization of the essential spectrum.

Lemma 6.18. The essential spectrum of a self-adjoint operator $A$ is precisely
the part which is invariant under compact perturbations. In particular,
$$ \sigma_{ess}(A) = \bigcap_{K \in C(\mathfrak{H}),\, K^* = K} \sigma(A + K). \tag{6.32} $$
There is even a larger class of operators under which the essential spectrum
is invariant.

Theorem 6.19 (Weyl). Suppose $A$ and $B$ are self-adjoint operators. If
$$ R_A(z) - R_B(z) \in C(\mathfrak{H}) \tag{6.33} $$
for one $z \in \rho(A) \cap \rho(B)$, then
$$ \sigma_{ess}(A) = \sigma_{ess}(B). \tag{6.34} $$
Proof. In fact, suppose $\lambda \in \sigma_{ess}(A)$ and let $\psi_n$ be a corresponding singular
Weyl sequence. Then
$$ \Big( R_A(z) - \frac{1}{\lambda - z} \Big)\psi_n = \frac{R_A(z)}{\lambda - z}(\lambda - A)\psi_n $$
and thus $\|(R_A(z) - \frac{1}{\lambda - z})\psi_n\| \to 0$. Moreover, by our assumption we also have
$\|(R_B(z) - \frac{1}{\lambda - z})\psi_n\| \to 0$ and thus $\|(B - \lambda)\varphi_n\| \to 0$, where $\varphi_n = R_B(z)\psi_n$.
Since
$$ \lim_{n\to\infty} \|\varphi_n\| = \lim_{n\to\infty} \|R_A(z)\psi_n\| = |\lambda - z|^{-1} \ne 0 $$
(since $\|(R_A(z) - \frac{1}{\lambda - z})\psi_n\| = \|\frac{1}{\lambda - z} R_A(z)(A - \lambda)\psi_n\| \to 0$), we obtain a
singular Weyl sequence for $B$, showing $\lambda \in \sigma_{ess}(B)$. Now interchange the
roles of $A$ and $B$. □

As a first consequence note the following result:

Theorem 6.20. Suppose $A$ is symmetric with equal finite defect indices.
Then all self-adjoint extensions have the same essential spectrum.

Proof. By Lemma 2.29 the resolvent difference of two self-adjoint extensions
is a finite rank operator if the defect indices are finite. □
In addition, the following result is of interest.

Lemma 6.21. Suppose
$$ R_A(z) - R_B(z) \in C(\mathfrak{H}) \tag{6.35} $$
for one $z \in \rho(A) \cap \rho(B)$. Then this holds for all $z \in \rho(A) \cap \rho(B)$. In addition,
if $A$ and $B$ are self-adjoint, then
$$ f(A) - f(B) \in C(\mathfrak{H}) \tag{6.36} $$
for all $f \in C_\infty(\mathbb{R})$.

Proof. If the condition holds for one $z$, it holds for all since we have (using
both resolvent formulas)
$$ R_A(z') - R_B(z') = (1 - (z - z')R_B(z'))(R_A(z) - R_B(z))(1 - (z - z')R_A(z')). $$
Let $A$ and $B$ be self-adjoint. The set of all functions $f$ for which the
claim holds is a closed $*$-subalgebra of $C_\infty(\mathbb{R})$ (with sup norm). Hence the
claim follows from Lemma 4.4. □
Remember that we have called $K$ relatively compact with respect to
$A$ if $KR_A(z)$ is compact (for one and hence for all $z$) and note that the
resolvent difference $R_{A+K}(z) - R_A(z)$ is compact if $K$ is relatively compact.
In particular, Theorem 6.19 applies if $B = A + K$, where $K$ is relatively
compact.

For later use observe that the set of all operators which are relatively
compact with respect to $A$ forms a linear space (since compact operators
do) and relatively compact operators have $A$-bound zero.
Lemma 6.22. Let $A$ be self-adjoint and suppose $K$ is relatively compact
with respect to $A$. Then the $A$-bound of $K$ is zero.

Proof. Write
$$ KR_A(\mathrm{i}\lambda) = (KR_A(-\mathrm{i}))\big((A + \mathrm{i})R_A(\mathrm{i}\lambda)\big) $$
and observe that the first operator is compact and the second is normal
and converges strongly to $0$ as $\lambda \to \infty$ (cf. Problem 3.7). Hence the claim follows from
Lemma 6.3 and the discussion after Lemma 6.8 (since $R_A$ is normal). □
In addition, note the following result which is a straightforward consequence
of the second resolvent formula.

Lemma 6.23. Suppose $A$ is self-adjoint and $B$ is symmetric with $A$-bound
less than one. If $K$ is relatively compact with respect to $A$, then it is also
relatively compact with respect to $A + B$.

Proof. Since $B$ is $A$ bounded with $A$-bound less than one, we can choose a
$z \in \mathbb{C}$ such that $\|BR_A(z)\| < 1$ and hence
$$ BR_{A+B}(z) = BR_A(z)(I + BR_A(z))^{-1} \tag{6.37} $$
shows that $B$ is also $A+B$ bounded and the result follows from
$$ KR_{A+B}(z) = KR_A(z)(I - BR_{A+B}(z)) \tag{6.38} $$
since $KR_A(z)$ is compact and $BR_{A+B}(z)$ is bounded. □
Problem 6.13. Let $A$ and $B$ be self-adjoint operators. Suppose $B$ is relatively
bounded with respect to $A$ and $A + B$ is self-adjoint. Show that if
$|B|^{1/2} R_A(z)$ is Hilbert–Schmidt for one $z \in \rho(A)$, then this is true for all
$z \in \rho(A)$. Moreover, $|B|^{1/2} R_{A+B}(z)$ is also Hilbert–Schmidt and $R_{A+B}(z) -
R_A(z)$ is trace class.

Problem 6.14. Show that $A = -\frac{d^2}{dx^2} + q(x)$, $D(A) = H^2(\mathbb{R})$, is self-adjoint
if $q \in L^\infty(\mathbb{R})$. Show that if $-u''(x) + q(x)u(x) = zu(x)$ has a solution for
which $u$ and $u'$ are bounded near $+\infty$ (or $-\infty$) but $u$ is not square integrable
near $+\infty$ (or $-\infty$), then $z \in \sigma_{ess}(A)$. (Hint: Use $u$ to construct a Weyl
sequence by restricting it to a compact set. Now modify your construction
to get a singular Weyl sequence by observing that functions with disjoint
support are orthogonal.)
6.5. Relatively form bounded operators and the KLMN theorem

In Section 6.1 we have considered the case where the operators $A$ and $B$
have a common domain on which the operator sum is well-defined. In this
section we want to look at the case where this is no longer possible, but where
it is still possible to add the corresponding quadratic forms. Under suitable
conditions this form sum will give rise to an operator via Theorem 2.13.

Example. Let $A$ be the self-adjoint operator $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in
H^2[0,1] \mid f(0) = f(1) = 0\}$, in the Hilbert space $L^2(0,1)$. If we want to
add a potential represented by a multiplication operator with a real-valued
(measurable) function $q$, then we already have seen that $q$ will be relatively
bounded if $q \in L^2(0,1)$. Hence, if $q \notin L^2(0,1)$, we are out of luck with the
theory developed so far. On the other hand, if we look at the corresponding
quadratic forms, we have $Q(A) = \{f \in H^1[0,1] \mid f(0) = f(1) = 0\}$ and
$Q(q) = D(|q|^{1/2})$. Thus we see that $Q(A) \subseteq Q(q)$ if $q \in L^1(0,1)$.

In summary, the operators can be added if $q \in L^2(0,1)$ while the forms
can be added under the less restrictive condition $q \in L^1(0,1)$.

Finally, note that in some drastic cases, there might even be no way to
define the operator sum: Let $x_j$ be an enumeration of the rational numbers
in $(0,1)$ and set
$$ q(x) = \sum_{j=1}^\infty \frac{1}{2^j \sqrt{|x - x_j|}}, $$
where the sum is to be understood as a limit in $L^1(0,1)$. Then $q$ gives
rise to a self-adjoint multiplication operator in $L^2(0,1)$. However, note that
$D(A) \cap D(q) = \{0\}$! In fact, let $f \in D(A) \cap D(q)$. Then $f$ is continuous
and $q(x)f(x) \in L^2(0,1)$. Now suppose $f(x_j) \ne 0$ for some rational number
$x_j \in (0,1)$. Then by continuity $|f(x)| \ge \varepsilon$ for $x \in (x_j - \delta, x_j + \delta)$ and
$q(x)|f(x)| \ge \varepsilon\, 2^{-j} |x - x_j|^{-1/2}$ for $x \in (x_j - \delta, x_j + \delta)$, which shows that
$q(x)f(x) \notin L^2(0,1)$ and hence $f$ must vanish at every rational point. By
continuity, we conclude $f = 0$. ⋄

Recall from Section 2.3 that every closed semi-bounded form $q = q_A$
corresponds to a self-adjoint operator $A$ (Theorem 2.13).

Given a self-adjoint operator $A$ and a (hermitian) form $q : Q \to \mathbb{R}$
with $Q(A) \subseteq Q$, we call $q$ relatively form bounded with respect to $q_A$ if
there are constants $a, b \ge 0$ such that
$$ |q(\psi)| \le a\, q_A(\psi) + b\|\psi\|^2, \qquad \psi \in Q(A). \tag{6.39} $$
The infimum of all possible $a$ is called the form bound of $q$ with respect
to $q_A$.
Note that we do not require that $q$ is associated with some self-adjoint
operator (though it will be in most cases).

Example. Let $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in H^2[0,1] \mid f(0) = f(1) = 0\}$. Then
$$ q(f) = |f(c)|^2, \qquad f \in H^1[0,1],\ c \in (0,1), $$
is a well-defined nonnegative form. Formally, one can interpret $q$ as the
quadratic form of the multiplication operator with the delta distribution at
$x = c$. But for $f \in Q(A) = \{f \in H^1[0,1] \mid f(0) = f(1) = 0\}$ we have by
Cauchy–Schwarz
$$ |f(c)|^2 = 2\,\mathrm{Re}\int_0^c f(t)^* f'(t)\,dt \le 2\int_0^1 |f(t)^* f'(t)|\,dt \le \varepsilon\|f'\|^2 + \frac{1}{\varepsilon}\|f\|^2. $$
Consequently $q$ is relatively bounded with bound $0$ and hence $q_A + q$ gives
rise to a well-defined operator as we will show in the next theorem. ⋄

The following result is the analog of the Kato–Rellich theorem and is
due to Kato, Lions, Lax, Milgram, and Nelson.

Theorem 6.24 (KLMN). Suppose $q_A : Q(A) \to \mathbb{R}$ is a semi-bounded closed
hermitian form with $q_A \ge \gamma$ and $q$ a relatively bounded hermitian form with relative bound
less than one. Then $q_A + q$ defined on $Q(A)$ is closed and hence gives rise to
a semi-bounded self-adjoint operator. Explicitly we have $q_A + q \ge (1-a)\gamma - b$.

Proof. A straightforward estimate shows $q_A(\psi) + q(\psi) \ge (1 - a)q_A(\psi) -
b\|\psi\|^2 \ge ((1 - a)\gamma - b)\|\psi\|^2$; that is, $q_A + q$ is semi-bounded. Moreover, by
$$ q_A(\psi) \le \frac{1}{1 - a}\big( |q_A(\psi) + q(\psi)| + b\|\psi\|^2 \big) $$
we see that the norms $\|.\|_{q_A}$ and $\|.\|_{q_A+q}$ are equivalent. Hence $q_A + q$ is
closed and the result follows from Theorem 2.13. □
In the investigation of the spectrum of the operator A + B a key role
is played by the second resolvent formula. In our present case we have the
following analog.
Theorem 6.25. Suppose $A \ge 0$ is self-adjoint and let $q$ be a hermitian
form with $Q(A) \subseteq Q(q)$. Then the hermitian form
$$ q(R_A(-\lambda)^{1/2}\psi), \qquad \psi \in \mathfrak{H}, \tag{6.40} $$
corresponds to a bounded operator $C_q(\lambda)$ with $\|C_q(\lambda)\| \le a$ for $\lambda > \frac{b}{a}$ if
and only if $q$ is relatively form bounded with constants $a$ and $b$.

In particular, the form bound is given by
$$ \lim_{\lambda\to\infty} \|C_q(\lambda)\|. \tag{6.41} $$
Moreover, if $a < 1$, then
$$ R_{q_A+q}(-\lambda) = R_A(-\lambda)^{1/2} (1 + C_q(\lambda))^{-1} R_A(-\lambda)^{1/2}. \tag{6.42} $$
Here $R_{q_A+q}(z)$ is the resolvent of the self-adjoint operator corresponding to
$q_A + q$.
Proof. We will abbreviate $C = C_q(\lambda)$ and $R_A^{1/2} = R_A(-\lambda)^{1/2}$. If $q$ is form
bounded, we have for $\lambda > \frac{b}{a}$ that
$$ |q(R_A^{1/2}\psi)| \le a\, q_A(R_A^{1/2}\psi) + b\|R_A^{1/2}\psi\|^2 = a \langle \psi, (A + \tfrac{b}{a}) R_A(-\lambda)\psi \rangle \le a\|\psi\|^2 $$
and hence $q(R_A^{1/2}\psi)$ corresponds to a bounded operator $C$. The converse is
similar.

If $a < 1$, then $(1 + C)^{-1}$ is a well-defined bounded operator and so is
$R = R_A^{1/2}(1 + C)^{-1} R_A^{1/2}$. To see that $R$ is the inverse of $A_1 + \lambda$, where $A_1$
is the operator associated with $q_A + q$, take $\varphi = R_A^{1/2}\tilde\varphi \in Q(A)$ and $\psi \in \mathfrak{H}$.
Then
$$ s_{A_1+\lambda}(\varphi, R\psi) = s_{A+\lambda}(\varphi, R\psi) + s_q(\varphi, R\psi) = \langle \tilde\varphi, (1 + C)^{-1} R_A^{1/2}\psi \rangle + \langle \tilde\varphi, C(1 + C)^{-1} R_A^{1/2}\psi \rangle = \langle \varphi, \psi \rangle. $$
Taking $\varphi \in D(A_1) \subseteq Q(A)$, we see $\langle (A_1 + \lambda)\varphi, R\psi \rangle = \langle \varphi, \psi \rangle$ and thus
$R = R_{A_1}(-\lambda)$ (Problem 6.15). □
Furthermore, we can define $C_q(z)$ for all $z \in \rho(A)$ using
$$ C_q(z) = \big( (A+\lambda)^{1/2} R_A(z)^{1/2} \big)^* C_q(\lambda)\, (A+\lambda)^{1/2} R_A(z)^{1/2}. \tag{6.43} $$
We will call $q$ relatively form compact if the operator $C_q(z)$ is compact for
one and hence all $z \in \rho(A)$. As in the case of relatively compact operators
we have

Lemma 6.26. Suppose $A \ge 0$ is self-adjoint and let $q$ be a hermitian
form. If $q$ is relatively form compact with respect to $q_A$, then its relative
form bound is $0$ and the resolvents of $q_A + q$ and $q_A$ differ by a compact
operator.

In particular, by Weyl's theorem, the operators associated with $q_A$ and
$q_A + q$ have the same essential spectrum.

Proof. Fix $\lambda_0 > \frac{b}{a}$ and let $\lambda \ge \lambda_0$. Consider the operator $D(\lambda) =
(A + \lambda_0)^{1/2} R_A(-\lambda)^{1/2}$ and note that $D(\lambda)$ is a bounded self-adjoint operator
with $\|D(\lambda)\| \le 1$. Moreover, $D(\lambda)$ converges strongly to $0$ as $\lambda \to \infty$ (cf.
Problem 3.7). Hence $\|D(\lambda)C(\lambda_0)\| \to 0$ by Lemma 6.8 and the same is
true for $C(\lambda) = D(\lambda)C(\lambda_0)D(\lambda)$. So the relative bound is zero by (6.41).
Finally, the resolvent difference is compact by (6.42) since $(1 + C)^{-1} =
1 - C(1 + C)^{-1}$. □
Corollary 6.27. Suppose $A \ge 0$ is self-adjoint and let $q_1, q_2$ be hermitian
forms. If $q_1$ is relatively bounded with bound less than one and $q_2$ is relatively
compact, then the resolvent difference of $q_A + q_1 + q_2$ and $q_A + q_1$ is compact.
In particular, the operators associated with $q_A + q_1$ and $q_A + q_1 + q_2$ have the
same essential spectrum.

Proof. Just observe that $C_{q_1+q_2} = C_{q_1} + C_{q_2}$ and $(1 + C_{q_1} + C_{q_2})^{-1} =
(1 + C_{q_1})^{-1} - (1 + C_{q_1})^{-1} C_{q_2} (1 + C_{q_1} + C_{q_2})^{-1}$. □
Finally we turn to the special case where $q = q_B$ for some self-adjoint
operator $B$. In this case we have
$$ C_B(z) = \big( |B|^{1/2} R_A(z)^{1/2} \big)^* \mathrm{sign}(B) |B|^{1/2} R_A(z)^{1/2} \tag{6.44} $$
and hence
$$ \|C_B(z)\| \le \big\| |B|^{1/2} R_A(z)^{1/2} \big\|^2 \tag{6.45} $$
with equality if $B \ge 0$. Thus the following result is not too surprising.
Lemma 6.28. Suppose $A \ge 0$ and $B$ is self-adjoint. Then the following
are equivalent:

(i) $B$ is $A$ form bounded.

(ii) $Q(A) \subseteq Q(B)$.

(iii) $|B|^{1/2} R_A(z)^{1/2}$ is bounded for one (and hence for all) $z \in \rho(A)$.

Proof. (i) $\Rightarrow$ (ii) is true by definition. (ii) $\Rightarrow$ (iii) since $|B|^{1/2} R_A(z)^{1/2}$
is a closed (Problem 2.9) operator defined on all of $\mathfrak{H}$ and hence bounded
by the closed graph theorem (Theorem 2.8). To see (iii) $\Rightarrow$ (i), observe
$|B|^{1/2} R_A(z)^{1/2} = |B|^{1/2} R_A(z_0)^{1/2} (A - z_0)^{1/2} R_A(z)^{1/2}$, which shows that
$|B|^{1/2} R_A(z)^{1/2}$ is bounded for all $z \in \rho(A)$ if it is bounded for one $z_0 \in
\rho(A)$. But then (6.45) shows that (i) holds. □
Clearly $C_B(\lambda)$ will be compact if $|B|^{1/2} R_A(z)^{1/2}$ is compact. However,
since $R_A(z)^{1/2}$ might be hard to compute, we provide the following more
handy criterion.

Lemma 6.29. Suppose $A \ge 0$ and $B$ is self-adjoint where $B$ is relatively
form bounded with bound less than one. Then the resolvent difference
$R_{A+B}(z) - R_A(z)$ is compact if $|B|^{1/2} R_A(z)$ is compact and trace class if
$|B|^{1/2} R_A(z)$ is Hilbert–Schmidt.
Proof. Abbreviate $R_A = R_A(-\lambda)$, $B_1 = |B|^{1/2}$, $B_2 = \operatorname{sign}(B)|B|^{1/2}$. Then we have
$$(1 + C_B)^{-1} = 1 - (B_1 R_A^{1/2})^* (1 + \tilde C_B)^{-1} B_2 R_A^{1/2}, \quad \text{where } \tilde C_B = B_2 R_A^{1/2} (B_1 R_A^{1/2})^*.$$
Hence
$$R_{A+B} - R_A = -(B_1 R_A)^* (1 + \tilde C_B)^{-1} B_2 R_A$$
and the claim follows. $\square$
Moreover, the second resolvent formula still holds when interpreted suitably:

Lemma 6.30. Suppose $A \ge 0$ and $B$ is self-adjoint. If $Q(A) \subseteq Q(B)$ and $q_A + q_B$ is a closed semi-bounded form, then
$$R_{A+B}(z) = R_A(z) - \big(|B|^{1/2} R_{A+B}(z^*)\big)^* \operatorname{sign}(B)\, |B|^{1/2} R_A(z)$$
$$\phantom{R_{A+B}(z)} = R_A(z) - \big(|B|^{1/2} R_A(z^*)\big)^* \operatorname{sign}(B)\, |B|^{1/2} R_{A+B}(z) \qquad (6.46)$$
for $z \in \rho(A) \cap \rho(A+B)$. Here $A+B$ is the self-adjoint operator associated with the form $q_A + q_B$.
Proof. Let $\varphi \in D(A+B)$ and $\psi \in H$. Denote the right-hand side in (6.46) by $R(z)$ and abbreviate $R = R(z)$, $R_A = R_A(z)$, $B_1 = |B|^{1/2}$, $B_2 = \operatorname{sign}(B)|B|^{1/2}$. Then, using $s_{A+B-z}(\varphi, \psi) = \langle (A+B-z^*)\varphi, \psi\rangle$,
$$s_{A+B-z}(\varphi, R\psi) = s_{A+B-z}(\varphi, R_A\psi) - \big\langle B_1 R_{A+B}(z^*)(A+B-z^*)\varphi,\, B_2 R_A \psi\big\rangle$$
$$= s_{A+B-z}(\varphi, R_A\psi) - s_B(\varphi, R_A\psi) = s_{A-z}(\varphi, R_A\psi) = \langle \varphi, \psi\rangle.$$
Thus $R = R_{A+B}(z)$ (Problem 6.15). The second equality follows after exchanging the roles of $A$ and $A+B$. $\square$
It can be shown using abstract interpolation techniques that if $B$ is relatively bounded with respect to $A$, then it is also relatively form bounded. In particular, if $B$ is relatively bounded, then $B R_A(z)$ is bounded and it is not hard to check that (6.46) coincides with (6.4). Consequently $A + B$ defined as an operator sum is the same as $A + B$ defined as a form sum.

Problem 6.15. Suppose $A$ is closed and $R$ is bounded. Show that $R = R_A(z)$ if and only if $\langle (A - z)^* \psi, R\varphi\rangle = \langle \psi, \varphi\rangle$ for all $\psi \in D(A^*)$, $\varphi \in H$.
Problem 6.16. Let $q$ be relatively form bounded with constants $a$ and $b$. Show that $C_q(\lambda)$ satisfies $\|C(\lambda)\| \le \max(a, \frac{b}{\lambda})$ for $\lambda > 0$. Furthermore, show that $\|C(\lambda)\|$ decreases as $\lambda \to \infty$.
6.6. Strong and norm resolvent convergence

Suppose $A_n$ and $A$ are self-adjoint operators. We say that $A_n$ converges to $A$ in the norm, respectively, strong resolvent sense, if
$$\lim_{n\to\infty} R_{A_n}(z) = R_A(z), \quad \text{respectively,} \quad \operatorname*{s-lim}_{n\to\infty} R_{A_n}(z) = R_A(z), \qquad (6.47)$$
for one $z \in \Gamma = \mathbb{C}\setminus\Sigma$, $\Sigma = \sigma(A) \cup \bigcup_n \sigma(A_n)$. In fact, in the case of strong resolvent convergence it will be convenient to include the case where $A_n$ is only defined on some subspace $H_n \subseteq H$, where we require $P_n \to 1$ strongly for the orthogonal projection $P_n$ onto $H_n$. In this case $R_{A_n}(z)$ (respectively, any other function of $A_n$) has to be understood as $R_{A_n}(z) P_n$. (This generalization will produce nothing new in the norm case, since $P_n \to 1$ in norm implies $P_n = 1$ for sufficiently large $n$.)
Using the Stone–Weierstraß theorem, we obtain as a first consequence

Theorem 6.31. Suppose $A_n$ converges to $A$ in the norm resolvent sense. Then $f(A_n)$ converges to $f(A)$ in norm for any bounded continuous function $f : \Sigma \to \mathbb{C}$ with $\lim_{\lambda\to-\infty} f(\lambda) = \lim_{\lambda\to\infty} f(\lambda)$.

If $A_n$ converges to $A$ in the strong resolvent sense, then $f(A_n)$ converges to $f(A)$ strongly for any bounded continuous function $f : \Sigma \to \mathbb{C}$.
Proof. The set of functions for which the claim holds clearly forms a $*$-subalgebra (since resolvents are normal, taking adjoints is continuous even with respect to strong convergence) and since it contains $f(\lambda) = 1$ and $f(\lambda) = \frac{1}{\lambda - z_0}$, this $*$-subalgebra is dense by the Stone–Weierstraß theorem (cf. Problem 1.21). The usual $\frac{\varepsilon}{3}$ argument shows that this $*$-subalgebra is also closed.

It remains to show the strong resolvent case for arbitrary bounded continuous functions. Let $\chi_m$ be a compactly supported continuous function ($0 \le \chi_m \le 1$) which is one on the interval $[-m, m]$. Then $\chi_m(A_n) \to \chi_m(A)$ and $f(A_n)\chi_m(A_n) \to f(A)\chi_m(A)$ strongly by the first part and hence
$$\|(f(A_n) - f(A))\psi\| \le \|f(A_n)\| \, \|(1 - \chi_m(A))\psi\| + \|f(A_n)\| \, \|(\chi_m(A) - \chi_m(A_n))\psi\|$$
$$\quad + \|(f(A_n)\chi_m(A_n) - f(A)\chi_m(A))\psi\| + \|f(A)\| \, \|(1 - \chi_m(A))\psi\|$$
can be made arbitrarily small since $\|f(\cdot)\| \le \|f\|_\infty$ and $\chi_m(\cdot) \to 1$ strongly as $m \to \infty$ by Theorem 3.1. $\square$
As a consequence, note that the point $z \in \Gamma$ is of no importance. That is,

Corollary 6.32. Suppose $A_n$ converges to $A$ in the norm or strong resolvent sense for one $z_0 \in \Gamma$. Then this holds for all $z \in \Gamma$.

Also,

Corollary 6.33. Suppose $A_n$ converges to $A$ in the strong resolvent sense. Then
$$\mathrm{e}^{-\mathrm{i}tA_n} \to \mathrm{e}^{-\mathrm{i}tA} \ \text{strongly}, \quad t \in \mathbb{R}, \qquad (6.48)$$
and if all operators are semi-bounded by the same bound,
$$\mathrm{e}^{-tA_n} \to \mathrm{e}^{-tA} \ \text{strongly}, \quad t \ge 0. \qquad (6.49)$$
Next we need some good criteria to check for norm, respectively, strong, resolvent convergence.

Lemma 6.34. Let $A_n$, $A$ be self-adjoint operators with $D(A_n) = D(A)$. Then $A_n$ converges to $A$ in the norm resolvent sense if there are sequences $a_n$ and $b_n$ converging to zero such that
$$\|(A_n - A)\psi\| \le a_n \|\psi\| + b_n \|A\psi\|, \quad \psi \in D(A) = D(A_n). \qquad (6.50)$$

Proof. From the second resolvent formula
$$R_{A_n}(z) - R_A(z) = R_{A_n}(z)(A - A_n) R_A(z),$$
we infer
$$\|(R_{A_n}(\mathrm{i}) - R_A(\mathrm{i}))\psi\| \le \|R_{A_n}(\mathrm{i})\| \big( a_n \|R_A(\mathrm{i})\psi\| + b_n \|A R_A(\mathrm{i})\psi\| \big) \le (a_n + b_n) \|\psi\|$$
and hence $\|R_{A_n}(\mathrm{i}) - R_A(\mathrm{i})\| \le a_n + b_n \to 0$. $\square$
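The estimate in the proof can be watched numerically. The following sketch (the matrices are ad hoc, not from the book) takes $A_n = A + B/n$ with $B$ bounded symmetric, so that (6.50) holds with $a_n = \|B\|/n$, $b_n = 0$, and checks $\|R_{A_n}(\mathrm{i}) - R_A(\mathrm{i})\| \le a_n$, using $\|R(\mathrm{i})\| \le 1$ for self-adjoint operators.

```python
import numpy as np

# Finite-dimensional illustration of Lemma 6.34 (matrices are ad hoc):
# for A_n = A + B/n we may take a_n = ||B||/n, b_n = 0 in (6.50), and then
# ||R_{A_n}(i) - R_A(i)|| <= a_n, since ||R(i)|| <= 1 for self-adjoint operators.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
B = (B + B.T) / 2                      # bounded symmetric perturbation
A = np.diag(np.arange(6.0))
I = np.eye(6)
R = np.linalg.inv(A - 1j * I)          # R_A(i)

results = []
for n in (10, 100, 1000):
    Rn = np.linalg.inv(A + B / n - 1j * I)
    results.append((np.linalg.norm(Rn - R, 2), np.linalg.norm(B, 2) / n))
print(results)
```

The error is seen to decay like $1/n$, i.e. at the rate $a_n + b_n$ from the lemma.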
In particular, norm convergence implies norm resolvent convergence:

Corollary 6.35. Let $A_n$, $A$ be bounded self-adjoint operators with $A_n \to A$. Then $A_n$ converges to $A$ in the norm resolvent sense.

Similarly, if no domain problems get in the way, strong convergence implies strong resolvent convergence:

Lemma 6.36. Let $A_n$, $A$ be self-adjoint operators. Then $A_n$ converges to $A$ in the strong resolvent sense if there is a core $D_0$ of $A$ such that for any $\psi \in D_0$ we have $P_n \psi \in D(A_n)$ for $n$ sufficiently large and $A_n P_n \psi \to A\psi$.

Proof. We begin with the case $H_n = H$. Using the second resolvent formula, we have
$$\|(R_{A_n}(\mathrm{i}) - R_A(\mathrm{i}))\psi\| \le \|(A - A_n) R_A(\mathrm{i})\psi\| \to 0$$
for $\psi \in (A - \mathrm{i}) D_0$, which is dense, since $D_0$ is a core. The rest follows from Lemma 1.14.

If $H_n \subset H$, we can consider $\tilde A_n = A_n \oplus 0$ and conclude $R_{\tilde A_n}(\mathrm{i}) \to R_A(\mathrm{i})$ strongly from the first case. By $R_{\tilde A_n}(\mathrm{i}) = R_{A_n}(\mathrm{i}) P_n + \mathrm{i}(1 - P_n)$ the same is true for $R_{A_n}(\mathrm{i}) P_n$ since $1 - P_n \to 0$ strongly by assumption. $\square$
If you wonder why we did not define weak resolvent convergence, here is the answer: it is equivalent to strong resolvent convergence.

Lemma 6.37. Suppose $\operatorname*{w-lim}_{n\to\infty} R_{A_n}(z) = R_A(z)$ for some $z \in \Gamma$. Then $\operatorname*{s-lim}_{n\to\infty} R_{A_n}(z) = R_A(z)$ also.

Proof. By $R_{A_n}(z) \rightharpoonup R_A(z)$ we also have $R_{A_n}(z)^* \rightharpoonup R_A(z)^*$ and thus by the first resolvent formula
$$\|R_{A_n}(z)\psi\|^2 - \|R_A(z)\psi\|^2 = \langle \psi, R_{A_n}(z^*) R_{A_n}(z)\psi - R_A(z^*) R_A(z)\psi\rangle$$
$$= \frac{1}{z - z^*} \big\langle \psi, \big(R_{A_n}(z) - R_{A_n}(z^*) - R_A(z) + R_A(z^*)\big)\psi\big\rangle \to 0.$$
Together with $R_{A_n}(z)\psi \rightharpoonup R_A(z)\psi$ we have $R_{A_n}(z)\psi \to R_A(z)\psi$ by virtue of Lemma 1.12 (iv). $\square$
Now what can we say about the spectrum?

Theorem 6.38. Let $A_n$ and $A$ be self-adjoint operators. If $A_n$ converges to $A$ in the strong resolvent sense, we have $\sigma(A) \subseteq \lim_{n\to\infty} \sigma(A_n)$. If $A_n$ converges to $A$ in the norm resolvent sense, we have $\sigma(A) = \lim_{n\to\infty} \sigma(A_n)$.

Proof. Suppose the first claim were incorrect. Then we can find a $\lambda \in \sigma(A)$ and some $\varepsilon > 0$ such that $\sigma(A_n) \cap (\lambda - \varepsilon, \lambda + \varepsilon) = \emptyset$. Choose a bounded continuous function $f$ which is one on $(\lambda - \frac{\varepsilon}{2}, \lambda + \frac{\varepsilon}{2})$ and which vanishes outside $(\lambda - \varepsilon, \lambda + \varepsilon)$. Then $f(A_n) = 0$ and hence $f(A)\psi = \lim f(A_n)\psi = 0$ for every $\psi$. On the other hand, since $\lambda \in \sigma(A)$, there is a nonzero $\psi \in \operatorname{Ran} P_A((\lambda - \frac{\varepsilon}{2}, \lambda + \frac{\varepsilon}{2}))$ implying $f(A)\psi = \psi$, a contradiction.

To see the second claim, recall that the norm of $R_A(z)$ is just one over the distance from the spectrum. In particular, $\lambda \notin \sigma(A)$ if and only if $\|R_A(\lambda + \mathrm{i})\| < 1$. So $\lambda \notin \sigma(A)$ implies $\|R_A(\lambda + \mathrm{i})\| < 1$, which implies $\|R_{A_n}(\lambda + \mathrm{i})\| < 1$ for $n$ sufficiently large, which implies $\lambda \notin \sigma(A_n)$ for $n$ sufficiently large. $\square$
Example. Note that the spectrum can contract if we only have convergence in the strong resolvent sense: Let $A_n$ be multiplication by $\frac{1}{n} x$ in $L^2(\mathbb{R})$. Then $A_n$ converges to $0$ in the strong resolvent sense, but $\sigma(A_n) = \mathbb{R}$ and $\sigma(0) = \{0\}$.
Lemma 6.39. Suppose $A_n$ converges in the strong resolvent sense to $A$. If $P_A(\{\lambda\}) = 0$, then
$$\operatorname*{s-lim}_{n\to\infty} P_{A_n}((-\infty, \lambda)) = \operatorname*{s-lim}_{n\to\infty} P_{A_n}((-\infty, \lambda]) = P_A((-\infty, \lambda)) = P_A((-\infty, \lambda]). \qquad (6.51)$$

Proof. By Theorem 6.31 the spectral measures $\mu_{n,\psi}$ corresponding to $A_n$ converge vaguely to those of $A$. Hence $\|P_{A_n}(\Omega)\psi\|^2 = \mu_{n,\psi}(\Omega)$ together with Lemma A.25 implies the claim. $\square$
Using $P((\lambda_0, \lambda_1)) = P((-\infty, \lambda_1)) - P((-\infty, \lambda_0])$, we also obtain the following.

Corollary 6.40. Suppose $A_n$ converges in the strong resolvent sense to $A$. If $P_A(\{\lambda_0\}) = P_A(\{\lambda_1\}) = 0$, then
$$\operatorname*{s-lim}_{n\to\infty} P_{A_n}((\lambda_0, \lambda_1)) = \operatorname*{s-lim}_{n\to\infty} P_{A_n}([\lambda_0, \lambda_1]) = P_A((\lambda_0, \lambda_1)) = P_A([\lambda_0, \lambda_1]). \qquad (6.52)$$
Example. The following example shows that the requirement $P_A(\{\lambda\}) = 0$ is crucial, even if we have bounded operators and norm convergence. In fact, let $H = \mathbb{C}^2$ and
$$A_n = \frac{1}{n} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (6.53)$$
Then $A_n \to 0$ and
$$P_{A_n}((-\infty, 0)) = P_{A_n}((-\infty, 0]) = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad (6.54)$$
but $P_0((-\infty, 0)) = 0$ and $P_0((-\infty, 0]) = 1$.
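This $2\times 2$ example is easy to reproduce numerically; the following sketch (not from the book) computes the spectral projections via an eigendecomposition.

```python
import numpy as np

# Numerical version of the 2x2 example (6.53)-(6.54): the spectral
# projections of A_n = diag(1, -1)/n do not converge to those of the
# limit 0, because 0 is an eigenvalue of the limit.
def proj_leq(A, lam):
    # spectral projection P_A((-inf, lam]) of a Hermitian matrix
    w, V = np.linalg.eigh(A)
    keep = w <= lam
    return (V[:, keep] @ V[:, keep].conj().T).real

n = 10**6
P_n = proj_leq(np.diag([1.0, -1.0]) / n, 0.0)
P_limit = proj_leq(np.zeros((2, 2)), 0.0)
print(P_n)       # [[0, 0], [0, 1]] for every n
print(P_limit)   # the identity
```

However large $n$ is taken, $P_n$ stays at $\operatorname{diag}(0,1)$ while the limit operator has projection $1$.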
Problem 6.17. Show that for self-adjoint operators, strong resolvent convergence is equivalent to convergence with respect to the metric
$$d(A, B) = \sum_{n\in\mathbb{N}} \frac{1}{2^n} \|(R_A(\mathrm{i}) - R_B(\mathrm{i}))\varphi_n\|, \qquad (6.55)$$
where $\{\varphi_n\}_{n\in\mathbb{N}}$ is some (fixed) ONB.

Problem 6.18 (Weak convergence of spectral measures). Suppose $A_n \to A$ in the strong resolvent sense and let $\mu_{n,\psi}$, $\mu_\psi$ be the corresponding spectral measures. Show that
$$\int f(\lambda)\, d\mu_{n,\psi}(\lambda) \to \int f(\lambda)\, d\mu_\psi(\lambda) \qquad (6.56)$$
for every bounded continuous $f$. Give a counterexample when $f$ is not continuous.
Part 2

Schrödinger Operators

Chapter 7

The free Schrödinger operator
7.1. The Fourier transform

We first review some basic facts concerning the Fourier transform which will be needed in the following section.

Let $C^\infty(\mathbb{R}^n)$ be the set of all complex-valued functions which have partial derivatives of arbitrary order. For $f \in C^\infty(\mathbb{R}^n)$ and $\alpha \in \mathbb{N}_0^n$ we set
$$\partial_\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \quad x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \quad |\alpha| = \alpha_1 + \cdots + \alpha_n. \qquad (7.1)$$
An element $\alpha \in \mathbb{N}_0^n$ is called a multi-index and $|\alpha|$ is called its order.

Recall the Schwartz space
$$\mathcal{S}(\mathbb{R}^n) = \{ f \in C^\infty(\mathbb{R}^n) \,|\, \sup_x |x^\alpha (\partial_\beta f)(x)| < \infty, \ \alpha, \beta \in \mathbb{N}_0^n \} \qquad (7.2)$$
which is dense in $L^2(\mathbb{R}^n)$ (since $C_c^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ is). Note that if $f \in \mathcal{S}(\mathbb{R}^n)$, then the same is true for $x^\alpha f(x)$ and $(\partial_\alpha f)(x)$ for any multi-index $\alpha$. For $f \in \mathcal{S}(\mathbb{R}^n)$ we define
$$\mathcal{F}(f)(p) \equiv \hat f(p) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i} p x} f(x)\, d^n x. \qquad (7.3)$$
Then,

Lemma 7.1. The Fourier transform maps the Schwartz space into itself, $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$. Furthermore, for any multi-index $\alpha \in \mathbb{N}_0^n$ and any $f \in \mathcal{S}(\mathbb{R}^n)$ we have
$$(\partial_\alpha f)^\wedge(p) = (\mathrm{i} p)^\alpha \hat f(p), \qquad (x^\alpha f(x))^\wedge(p) = \mathrm{i}^{|\alpha|}\, \partial_\alpha \hat f(p). \qquad (7.4)$$
Proof. First of all, by integration by parts, we see
$$\Big(\frac{\partial}{\partial x_j} f(x)\Big)^\wedge(p) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i} p x}\, \frac{\partial}{\partial x_j} f(x)\, d^n x = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \Big(-\frac{\partial}{\partial x_j} \mathrm{e}^{-\mathrm{i} p x}\Big) f(x)\, d^n x = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{i} p_j\, \mathrm{e}^{-\mathrm{i} p x} f(x)\, d^n x = \mathrm{i} p_j \hat f(p).$$
So the first formula follows by induction.

Similarly, the second formula follows from induction using
$$(x_j f(x))^\wedge(p) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} x_j \mathrm{e}^{-\mathrm{i} p x} f(x)\, d^n x = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \Big(\mathrm{i} \frac{\partial}{\partial p_j} \mathrm{e}^{-\mathrm{i} p x}\Big) f(x)\, d^n x = \mathrm{i} \frac{\partial}{\partial p_j} \hat f(p),$$
where interchanging the derivative and integral is permissible by Problem A.8. In particular, $\hat f(p)$ is differentiable.

To see that $\hat f \in \mathcal{S}(\mathbb{R}^n)$ if $f \in \mathcal{S}(\mathbb{R}^n)$, we begin with the observation that $\hat f$ is bounded; in fact, $\|\hat f\|_\infty \le (2\pi)^{-n/2} \|f\|_1$. But then $p^\alpha (\partial_\beta \hat f)(p) = \mathrm{i}^{-|\alpha|-|\beta|} (\partial_\alpha (x^\beta f(x)))^\wedge(p)$ is bounded since $\partial_\alpha (x^\beta f(x)) \in \mathcal{S}(\mathbb{R}^n)$ if $f \in \mathcal{S}(\mathbb{R}^n)$. $\square$
Hence we will sometimes write $p f(x)$ for $-\mathrm{i}\partial f(x)$, where $\partial = (\frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n})$ is the gradient.
Two more simple properties are left as an exercise.

Lemma 7.2. Let $f \in \mathcal{S}(\mathbb{R}^n)$. Then
$$(f(x + a))^\wedge(p) = \mathrm{e}^{\mathrm{i} a p} \hat f(p), \quad a \in \mathbb{R}^n, \qquad (7.5)$$
$$(f(\lambda x))^\wedge(p) = \frac{1}{\lambda^n} \hat f\Big(\frac{p}{\lambda}\Big), \quad \lambda > 0. \qquad (7.6)$$
Next, we want to compute the inverse of the Fourier transform. For this the following lemma will be needed.

Lemma 7.3. We have $\mathrm{e}^{-z x^2/2} \in \mathcal{S}(\mathbb{R}^n)$ for $\operatorname{Re}(z) > 0$ and
$$\mathcal{F}\big(\mathrm{e}^{-z x^2/2}\big)(p) = \frac{1}{z^{n/2}}\, \mathrm{e}^{-p^2/(2z)}. \qquad (7.7)$$
Here $z^{n/2}$ has to be understood as $(\sqrt z)^n$, where the branch cut of the root is chosen along the negative real axis.

Proof. Due to the product structure of the exponential, one can treat each coordinate separately, reducing the problem to the case $n = 1$.

Let $\phi_z(x) = \exp(-z x^2/2)$. Then $\phi_z'(x) + z x\, \phi_z(x) = 0$ and hence $\mathrm{i}(p\, \hat\phi_z(p) + z\, \hat\phi_z'(p)) = 0$. Thus $\hat\phi_z(p) = c\,\phi_{1/z}(p)$ and (Problem 7.1)
$$c = \hat\phi_z(0) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \exp(-z x^2/2)\, dx = \frac{1}{\sqrt z}$$
at least for $z > 0$. However, since the integral is holomorphic for $\operatorname{Re}(z) > 0$ by Problem A.10, this holds for all $z$ with $\operatorname{Re}(z) > 0$ if we choose the branch cut of the root along the negative real axis. $\square$
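Formula (7.7) is easy to test numerically. The following sketch (grid sizes are ad hoc, not from the book) approximates (7.3) in dimension $n = 1$ by a Riemann sum for a complex $z$ with $\operatorname{Re}(z) > 0$ and compares it with the closed form, using the principal branch of the square root.

```python
import numpy as np

# Numerical check of Lemma 7.3 in n = 1: the transform (7.3) of
# e^{-z x^2/2} equals z^{-1/2} e^{-p^2/(2z)} (principal branch).
z = 1.5 + 0.8j
x = np.linspace(-40.0, 40.0, 400001)
dx = x[1] - x[0]
f = np.exp(-z * x**2 / 2)

def fhat(p):
    # Riemann-sum version of (7.3); the integrand is negligible beyond |x| = 40
    return np.sum(np.exp(-1j * p * x) * f) * dx / np.sqrt(2 * np.pi)

ps = np.array([0.0, 0.7, -1.3])
numeric = np.array([fhat(p) for p in ps])
closed = np.exp(-ps**2 / (2 * z)) / np.sqrt(z)
err = np.max(np.abs(numeric - closed))
print(err)
```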
Now we can show

Theorem 7.4. The Fourier transform $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is a bijection. Its inverse is given by
$$\mathcal{F}^{-1}(g)(x) \equiv \check g(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{\mathrm{i} p x} g(p)\, d^n p. \qquad (7.8)$$
We have $\mathcal{F}^2(f)(x) = f(-x)$ and thus $\mathcal{F}^4 = 1$.

Proof. Abbreviate $\phi_\varepsilon(x) = \exp(-\varepsilon x^2/2)$. By dominated convergence we have
$$(\hat f(p))^\vee(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{\mathrm{i} p x} \hat f(p)\, d^n p = \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \phi_\varepsilon(p) \mathrm{e}^{\mathrm{i} p x} \hat f(p)\, d^n p = \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{n}} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \phi_\varepsilon(p) \mathrm{e}^{\mathrm{i} p x} f(y) \mathrm{e}^{-\mathrm{i} p y}\, d^n y\, d^n p,$$
and, invoking Fubini and Lemma 7.2, we further see
$$= \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \big(\phi_\varepsilon(p) \mathrm{e}^{\mathrm{i} p x}\big)^\wedge(y) f(y)\, d^n y = \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \frac{1}{\varepsilon^{n/2}} \phi_{1/\varepsilon}(y - x) f(y)\, d^n y = \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \phi_1(z) f(x + \sqrt\varepsilon\, z)\, d^n z = f(x),$$
which finishes the proof, where we used the change of coordinates $z = \frac{y - x}{\sqrt\varepsilon}$ and again dominated convergence in the last two steps. $\square$
From Fubini's theorem we also obtain Parseval's identity
$$\int_{\mathbb{R}^n} |\hat f(p)|^2\, d^n p = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(x)^* \hat f(p)\, \mathrm{e}^{\mathrm{i} p x}\, d^n p\, d^n x = \int_{\mathbb{R}^n} |f(x)|^2\, d^n x \qquad (7.9)$$
for $f \in \mathcal{S}(\mathbb{R}^n)$. Thus, by Theorem 0.26, we can extend $\mathcal{F}$ to $L^2(\mathbb{R}^n)$ by setting
$$\hat f(p) = \lim_{R\to\infty} \frac{1}{(2\pi)^{n/2}} \int_{|x|\le R} \mathrm{e}^{-\mathrm{i} p x} f(x)\, d^n x, \qquad (7.10)$$
where the limit is to be understood in $L^2(\mathbb{R}^n)$ (Problem 7.5). If $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, we can omit the limit (why?) and $\hat f$ is still given by (7.3).
Theorem 7.5. The Fourier transform $\mathcal{F}$ extends to a unitary operator $\mathcal{F} : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$. Its spectrum is given by
$$\sigma(\mathcal{F}) = \{ z \in \mathbb{C} \,|\, z^4 = 1 \} = \{1, -1, \mathrm{i}, -\mathrm{i}\}. \qquad (7.11)$$

Proof. As already noted, $\mathcal{F}$ extends uniquely to a bounded operator on $L^2(\mathbb{R}^n)$. Moreover, the same is true for $\mathcal{F}^{-1}$. Since Parseval's identity remains valid by continuity of the norm, this extension is a unitary operator.

It remains to compute the spectrum. In fact, if $\psi_n$ is a Weyl sequence, then $(\mathcal{F}^2 + z^2)(\mathcal{F} + z)(\mathcal{F} - z)\psi_n = (\mathcal{F}^4 - z^4)\psi_n = (1 - z^4)\psi_n \to 0$ implies $z^4 = 1$. Hence $\sigma(\mathcal{F}) \subseteq \{z \in \mathbb{C} \,|\, z^4 = 1\}$. We defer the proof for equality to Section 8.3, where we will explicitly compute an orthonormal basis of eigenfunctions. $\square$
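There is a finite-dimensional analogue of this spectral picture (a sketch of my own, not the book's argument): the unitary $N \times N$ DFT matrix also satisfies $F^4 = 1$, so its eigenvalues lie in $\{1, -1, \mathrm{i}, -\mathrm{i}\}$, mirroring (7.11).

```python
import numpy as np

# Discrete analogue of sigma(F) = {1, -1, i, -i}: the unitary DFT matrix
# satisfies F^2 = (reflection), hence F^4 = identity, so every eigenvalue
# is a fourth root of unity.
N = 16
F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # unitary DFT matrix
F4 = np.linalg.matrix_power(F, 4)
w = np.linalg.eigvals(F)
print(np.allclose(F4, np.eye(N)), np.allclose(w**4, 1.0))
```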
Lemma 7.1 also allows us to extend differentiation to a larger class. Let us introduce the Sobolev space
$$H^r(\mathbb{R}^n) = \{ f \in L^2(\mathbb{R}^n) \,|\, |p|^r \hat f(p) \in L^2(\mathbb{R}^n) \}. \qquad (7.12)$$
Then, every function in $H^r(\mathbb{R}^n)$ has partial derivatives up to order $r$, which are defined via
$$\partial_\alpha f = \big((\mathrm{i} p)^\alpha \hat f(p)\big)^\vee, \quad f \in H^r(\mathbb{R}^n), \ |\alpha| \le r. \qquad (7.13)$$
By Lemma 7.1 this definition coincides with the usual one for every $f \in \mathcal{S}(\mathbb{R}^n)$ and we have
$$\int_{\mathbb{R}^n} g(x)(\partial_\alpha f)(x)\, d^n x = \langle g^*, \partial_\alpha f\rangle = \langle \hat g(p)^*, (\mathrm{i} p)^\alpha \hat f(p)\rangle = (-1)^{|\alpha|} \langle ((\mathrm{i} p)^\alpha \hat g(p))^*, \hat f(p)\rangle = (-1)^{|\alpha|} \langle (\partial_\alpha g)^*, f\rangle = (-1)^{|\alpha|} \int_{\mathbb{R}^n} (\partial_\alpha g)(x) f(x)\, d^n x \qquad (7.14)$$
for $f, g \in H^r(\mathbb{R}^n)$. Furthermore, recall that a function $h \in L^1_{loc}(\mathbb{R}^n)$ satisfying
$$\int_{\mathbb{R}^n} \varphi(x) h(x)\, d^n x = (-1)^{|\alpha|} \int_{\mathbb{R}^n} (\partial_\alpha \varphi)(x) f(x)\, d^n x, \quad \varphi \in C_c^\infty(\mathbb{R}^n), \qquad (7.15)$$
is also called the weak derivative or the derivative in the sense of distributions of $f$ (by Lemma 0.37 such a function is unique if it exists). Hence, choosing $g = \varphi$ in (7.14), we see that $H^r(\mathbb{R}^n)$ is the set of all functions having partial derivatives (in the sense of distributions) up to order $r$, which are in $L^2(\mathbb{R}^n)$.
Finally, we note that on $L^1(\mathbb{R}^n)$ we have

Lemma 7.6 (Riemann–Lebesgue). Let $C_\infty(\mathbb{R}^n)$ denote the Banach space of all continuous functions $f : \mathbb{R}^n \to \mathbb{C}$ which vanish at $\infty$, equipped with the sup norm. Then the Fourier transform is a bounded injective map from $L^1(\mathbb{R}^n)$ into $C_\infty(\mathbb{R}^n)$ satisfying
$$\|\hat f\|_\infty \le (2\pi)^{-n/2} \|f\|_1. \qquad (7.16)$$

Proof. Clearly we have $\hat f \in C_\infty(\mathbb{R}^n)$ if $f \in \mathcal{S}(\mathbb{R}^n)$. Moreover, since $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$, the estimate
$$\sup_p |\hat f(p)| \le \frac{1}{(2\pi)^{n/2}} \sup_p \int_{\mathbb{R}^n} |\mathrm{e}^{-\mathrm{i} p x} f(x)|\, d^n x = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} |f(x)|\, d^n x$$
shows that the Fourier transform extends to a continuous map from $L^1(\mathbb{R}^n)$ into $C_\infty(\mathbb{R}^n)$.

To see that the Fourier transform is injective, suppose $\hat f = 0$. Then Fubini implies
$$0 = \int_{\mathbb{R}^n} \varphi(x) \hat f(x)\, d^n x = \int_{\mathbb{R}^n} \hat\varphi(x) f(x)\, d^n x$$
for every $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Hence Lemma 0.37 implies $f = 0$. $\square$

Note that $\mathcal{F} : L^1(\mathbb{R}^n) \to C_\infty(\mathbb{R}^n)$ is not onto (cf. Problem 7.7).
Another useful property is the convolution formula.

Lemma 7.7. The convolution
$$(f * g)(x) = \int_{\mathbb{R}^n} f(y) g(x - y)\, d^n y = \int_{\mathbb{R}^n} f(x - y) g(y)\, d^n y \qquad (7.17)$$
of two functions $f, g \in L^1(\mathbb{R}^n)$ is again in $L^1(\mathbb{R}^n)$ and we have Young's inequality
$$\|f * g\|_1 \le \|f\|_1 \|g\|_1. \qquad (7.18)$$
Moreover, its Fourier transform is given by
$$(f * g)^\wedge(p) = (2\pi)^{n/2}\, \hat f(p)\, \hat g(p). \qquad (7.19)$$

Proof. The fact that $f * g$ is in $L^1$ together with Young's inequality follows by applying Fubini's theorem to $h(x, y) = f(x - y) g(y)$. For the last claim we compute
$$(f * g)^\wedge(p) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i} p x} \int_{\mathbb{R}^n} f(y) g(x - y)\, d^n y\, d^n x = \int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i} p y} f(y)\, \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i} p (x - y)} g(x - y)\, d^n x\, d^n y = \int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i} p y} f(y)\, \hat g(p)\, d^n y = (2\pi)^{n/2}\, \hat f(p)\, \hat g(p),$$
where we have again used Fubini's theorem. $\square$

In other words, $L^1(\mathbb{R}^n)$ together with convolution as a product is a Banach algebra (without identity). For the case of convolution on $L^2(\mathbb{R}^n)$ see Problem 7.9.
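The convolution formula (7.19) can be tested numerically; the following sketch (grids are ad hoc, not from the book) uses a discrete convolution and a Riemann-sum version of (7.3) for two one-dimensional Gaussians.

```python
import numpy as np

# Numerical check of (7.19) in n = 1: (f*g)^(p) = sqrt(2 pi) fhat(p) ghat(p).
x = np.linspace(-30.0, 30.0, 60001)
dx = x[1] - x[0]
f = np.exp(-x**2)
g = np.exp(-2.0 * (x - 1.0)**2)

conv = np.convolve(f, g) * dx           # samples of (f*g)(s) for s in [-60, 60]
xc = np.linspace(-60.0, 60.0, conv.size)

def hat(h, grid, p):
    # Riemann-sum version of the transform (7.3)
    return np.sum(np.exp(-1j * p * grid) * h) * (grid[1] - grid[0]) / np.sqrt(2 * np.pi)

err = max(abs(hat(conv, xc, p) - np.sqrt(2 * np.pi) * hat(f, x, p) * hat(g, x, p))
          for p in (0.0, 0.6, -1.1))
print(err)
```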
Problem 7.1. Show that $\int_{\mathbb{R}} \exp(-x^2/2)\, dx = \sqrt{2\pi}$. (Hint: Square the integral and evaluate it using polar coordinates.)

Problem 7.2. Compute the Fourier transform of the following functions $f : \mathbb{R} \to \mathbb{C}$:
(i) $f(x) = \chi_{(-1,1)}(x)$. (ii) $f(p) = \frac{1}{p^2 + k^2}$, $\operatorname{Re}(k) > 0$.

Problem 7.3. Suppose $f(x) \in L^1(\mathbb{R})$ and $g(x) = -\mathrm{i} x f(x) \in L^1(\mathbb{R})$. Then $\hat f$ is differentiable and $\hat f' = \hat g$.

Problem 7.4. A function $f : \mathbb{R}^n \to \mathbb{C}$ is called spherically symmetric if it is invariant under rotations; that is, $f(Ox) = f(x)$ for all $O \in SO(\mathbb{R}^n)$ (equivalently, $f$ depends only on the distance to the origin $|x|$). Show that the Fourier transform of a spherically symmetric function is again spherically symmetric.

Problem 7.5. Show (7.10). (Hint: First suppose $f$ has compact support. Then there is a sequence of functions $f_n \in C_c^\infty(\mathbb{R}^n)$ converging to $f$ in $L^2$. The support of these functions can be chosen inside a fixed set and hence this sequence also converges to $f$ in $L^1$. Thus (7.10) follows for $f \in L^2$ with compact support. To remove this restriction, use that the projection onto a ball with radius $R$ converges strongly to the identity as $R \to \infty$.)

Problem 7.6. Show that $C_\infty(\mathbb{R}^n)$ is indeed a Banach space. Show that $\mathcal{S}(\mathbb{R}^n)$ is dense.

Problem 7.7. Show that $\mathcal{F} : L^1(\mathbb{R}^n) \to C_\infty(\mathbb{R}^n)$ is not onto as follows:
(i) The range of $\mathcal{F}$ is dense.
(ii) $\mathcal{F}$ is onto if and only if it has a bounded inverse.
(iii) $\mathcal{F}$ has no bounded inverse.
(Hint for (iii): Suppose $\varphi$ is smooth with compact support in $(0, 1)$ and set $f_m(x) = \sum_{k=1}^{m} \mathrm{e}^{\mathrm{i} k x} \varphi(x - k)$. Then $\|f_m\|_1 = m \|\varphi\|_1$ and $\|\hat f_m\|_\infty \le \mathrm{const}$ since $\varphi \in \mathcal{S}(\mathbb{R})$ and hence $|\hat\varphi(p)| \le \mathrm{const}\,(1 + |p|)^{-2}$.)
Problem 7.8. Show that the convolution of two $\mathcal{S}(\mathbb{R}^n)$ functions is in $\mathcal{S}(\mathbb{R}^n)$.

Problem 7.9. Show that the convolution of two $L^2(\mathbb{R}^n)$ functions is in $C_\infty(\mathbb{R}^n)$ and we have $\|f * g\|_\infty \le \|f\|_2 \|g\|_2$.

Problem 7.10 (Wiener). Suppose $f \in L^2(\mathbb{R}^n)$. Then the set $\{f(x + a) \,|\, a \in \mathbb{R}^n\}$ is total in $L^2(\mathbb{R}^n)$ if and only if $\hat f(p) \ne 0$ a.e. (Hint: Use Lemma 7.2 and the fact that a subspace is total if and only if its orthogonal complement is zero.)

Problem 7.11. Suppose $f(x)\mathrm{e}^{k|x|} \in L^1(\mathbb{R})$ for some $k > 0$. Then $\hat f(p)$ has an analytic extension to the strip $|\operatorname{Im}(p)| < k$.
7.2. The free Schrödinger operator

In Section 2.1 we have seen that the Hilbert space corresponding to one particle in $\mathbb{R}^3$ is $L^2(\mathbb{R}^3)$. More generally, the Hilbert space for $N$ particles in $\mathbb{R}^d$ is $L^2(\mathbb{R}^n)$, $n = Nd$. The corresponding nonrelativistic Hamilton operator, if the particles do not interact, is given by
$$H_0 = -\Delta, \qquad (7.20)$$
where $\Delta$ is the Laplace operator
$$\Delta = \sum_{j=1}^{n} \frac{\partial^2}{\partial x_j^2}. \qquad (7.21)$$
Here we have chosen units such that all relevant physical constants disappear; that is, $\hbar = 1$ and the mass of the particles is equal to $m = \frac12$. Be aware that some authors prefer to use $m = 1$; that is, $H_0 = -\frac12 \Delta$.

Our first task is to find a good domain such that $H_0$ is a self-adjoint operator.

By Lemma 7.1 we have that
$$-\Delta\psi(x) = \big(p^2 \hat\psi(p)\big)^\vee(x), \quad \psi \in H^2(\mathbb{R}^n), \qquad (7.22)$$
and hence the operator
$$H_0 = -\Delta, \quad D(H_0) = H^2(\mathbb{R}^n), \qquad (7.23)$$
is unitarily equivalent to the maximally defined multiplication operator
$$(\mathcal{F} H_0 \mathcal{F}^{-1})\varphi(p) = p^2 \varphi(p), \quad D(p^2) = \{\varphi \in L^2(\mathbb{R}^n) \,|\, p^2 \varphi(p) \in L^2(\mathbb{R}^n)\}. \qquad (7.24)$$
Theorem 7.8. The free Schrödinger operator $H_0$ is self-adjoint and its spectrum is characterized by
$$\sigma(H_0) = \sigma_{ac}(H_0) = [0, \infty), \qquad \sigma_{sc}(H_0) = \sigma_{pp}(H_0) = \emptyset. \qquad (7.25)$$

Proof. It suffices to show that $d\mu_\psi$ is purely absolutely continuous for every $\psi$. First observe that
$$\langle \psi, R_{H_0}(z)\psi\rangle = \langle \hat\psi, R_{p^2}(z)\hat\psi\rangle = \int_{\mathbb{R}^n} \frac{|\hat\psi(p)|^2}{p^2 - z}\, d^n p = \int_{\mathbb{R}} \frac{1}{r^2 - z}\, d\tilde\mu_\psi(r),$$
where
$$d\tilde\mu_\psi(r) = \chi_{[0,\infty)}(r)\, r^{n-1} \Big( \int_{S^{n-1}} |\hat\psi(r\omega)|^2\, d^{n-1}\omega \Big) dr.$$
Hence, after a change of coordinates, we have
$$\langle \psi, R_{H_0}(z)\psi\rangle = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\mu_\psi(\lambda),$$
where
$$d\mu_\psi(\lambda) = \frac12\, \chi_{[0,\infty)}(\lambda)\, \lambda^{n/2-1} \Big( \int_{S^{n-1}} |\hat\psi(\sqrt\lambda\, \omega)|^2\, d^{n-1}\omega \Big) d\lambda,$$
proving the claim. $\square$
Finally, we note that the compactly supported smooth functions are a core for $H_0$.

Lemma 7.9. The set $C_c^\infty(\mathbb{R}^n) = \{ f \in \mathcal{S}(\mathbb{R}^n) \,|\, \operatorname{supp}(f) \text{ is compact} \}$ is a core for $H_0$.

Proof. It is not hard to see that $\mathcal{S}(\mathbb{R}^n)$ is a core (Problem 7.12) and hence it suffices to show that the closure of $H_0|_{C_c^\infty(\mathbb{R}^n)}$ contains $H_0|_{\mathcal{S}(\mathbb{R}^n)}$. To see this, let $\varphi(x) \in C_c^\infty(\mathbb{R}^n)$ which is one for $|x| \le 1$ and vanishes for $|x| \ge 2$. Set $\varphi_n(x) = \varphi(\frac{1}{n} x)$. Then $\psi_n(x) = \varphi_n(x)\psi(x)$ is in $C_c^\infty(\mathbb{R}^n)$ for every $\psi \in \mathcal{S}(\mathbb{R}^n)$ and $\psi_n \to \psi$, respectively, $\Delta\psi_n \to \Delta\psi$. $\square$
Note also that the quadratic form of $H_0$ is given by
$$q_{H_0}(\psi) = \sum_{j=1}^{n} \int_{\mathbb{R}^n} |\partial_j \psi(x)|^2\, d^n x, \quad Q(H_0) = H^1(\mathbb{R}^n). \qquad (7.26)$$
Problem 7.12. Show that $\mathcal{S}(\mathbb{R}^n)$ is a core for $H_0$. (Hint: Show that the closure of $H_0|_{\mathcal{S}(\mathbb{R}^n)}$ contains $H_0$.)

Problem 7.13. Show that $\{\psi \in \mathcal{S}(\mathbb{R}) \,|\, \psi(0) = 0\}$ is dense but not a core for $H_0 = -\frac{d^2}{dx^2}$.
7.3. The time evolution in the free case

Now let us look at the time evolution. We have
$$\mathrm{e}^{-\mathrm{i}t H_0}\psi(x) = \mathcal{F}^{-1} \mathrm{e}^{-\mathrm{i}t p^2} \hat\psi(p). \qquad (7.27)$$
The right-hand side is a product and hence our operator should be expressible as an integral operator via the convolution formula. However, since $\mathrm{e}^{-\mathrm{i}t p^2}$ is not in $L^2$, a more careful analysis is needed.

Consider
$$f_\varepsilon(p^2) = \mathrm{e}^{-(\mathrm{i}t + \varepsilon)p^2}, \quad \varepsilon > 0. \qquad (7.28)$$
Then $f_\varepsilon(H_0)\psi \to \mathrm{e}^{-\mathrm{i}t H_0}\psi$ by Theorem 3.1. Moreover, by Lemma 7.3 and the convolution formula we have
$$f_\varepsilon(H_0)\psi(x) = \frac{1}{(4\pi(\mathrm{i}t + \varepsilon))^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{-\frac{|x-y|^2}{4(\mathrm{i}t+\varepsilon)}}\, \psi(y)\, d^n y \qquad (7.29)$$
and hence
$$\mathrm{e}^{-\mathrm{i}t H_0}\psi(x) = \frac{1}{(4\pi \mathrm{i} t)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{\mathrm{i}\frac{|x-y|^2}{4t}}\, \psi(y)\, d^n y \qquad (7.30)$$
for $t \ne 0$ and $\psi \in L^1 \cap L^2$. For general $\psi \in L^2$ the integral has to be understood as a limit.

Using this explicit form, it is not hard to draw some immediate consequences. For example, if $\psi \in L^2(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$, then $\psi(t) \in C(\mathbb{R}^n)$ for $t \ne 0$ (use dominated convergence and continuity of the exponential) and satisfies
$$\|\psi(t)\|_\infty \le \frac{1}{|4\pi t|^{n/2}}\, \|\psi(0)\|_1. \qquad (7.31)$$
Thus we have spreading of wave functions in this case. Moreover, it is even possible to determine the asymptotic form of the wave function for large $t$ as follows. Observe
$$\mathrm{e}^{-\mathrm{i}t H_0}\psi(x) = \frac{\mathrm{e}^{\mathrm{i}\frac{x^2}{4t}}}{(4\pi \mathrm{i} t)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{\mathrm{i}\frac{y^2}{4t}}\, \psi(y)\, \mathrm{e}^{-\mathrm{i}\frac{x y}{2t}}\, d^n y = \Big(\frac{1}{2\mathrm{i}t}\Big)^{n/2} \mathrm{e}^{\mathrm{i}\frac{x^2}{4t}} \Big(\mathrm{e}^{\mathrm{i}\frac{y^2}{4t}}\psi(y)\Big)^\wedge\Big(\frac{x}{2t}\Big). \qquad (7.32)$$
Moreover, since $\exp(\mathrm{i}\frac{y^2}{4t})\psi(y) \to \psi(y)$ in $L^2$ as $|t| \to \infty$ (dominated convergence), we obtain

Lemma 7.10. For any $\psi \in L^2(\mathbb{R}^n)$ we have
$$\mathrm{e}^{-\mathrm{i}t H_0}\psi(x) - \Big(\frac{1}{2\mathrm{i}t}\Big)^{n/2} \mathrm{e}^{\mathrm{i}\frac{x^2}{4t}}\, \hat\psi\Big(\frac{x}{2t}\Big) \to 0 \qquad (7.33)$$
in $L^2$ as $|t| \to \infty$.
Note that this result is not too surprising from a physical point of view. In fact, if a classical particle starts at a point $x(0) = x_0$ with velocity $v = 2p$ (recall that we use units where the mass is $m = \frac12$), then we will find it at $x = x_0 + 2pt$ at time $t$. Dividing by $2t$, we get $\frac{x}{2t} = p + \frac{x_0}{2t} \to p$ for large $t$. Hence the probability distribution for finding a particle at a point $x$ at time $t$ should approach the probability distribution for the momentum at $p = \frac{x}{2t}$; that is, $|\psi(x, t)|^2 d^n x = |\hat\psi(\frac{x}{2t})|^2 \frac{d^n x}{(2t)^n}$. This could also be stated as follows: The probability of finding the particle in a region $\Omega \subseteq \mathbb{R}^n$ is asymptotically for $|t| \to \infty$ equal to the probability of finding the momentum of the particle in $\frac{1}{2t}\Omega$.
Next we want to apply the RAGE theorem in order to show that for any initial condition, a particle will escape to infinity.

Lemma 7.11. Let $g(x)$ be the multiplication operator by $g$ and let $f(p)$ be the operator given by $f(p)\psi(x) = \mathcal{F}^{-1}\big(f(p)\hat\psi(p)\big)(x)$. Denote by $L^\infty_\infty(\mathbb{R}^n)$ the bounded Borel functions which vanish at infinity. Then
$$f(p) g(x) \quad \text{and} \quad g(x) f(p) \qquad (7.34)$$
are compact if $f, g \in L^\infty_\infty(\mathbb{R}^n)$ and (extend to) Hilbert–Schmidt operators if $f, g \in L^2(\mathbb{R}^n)$.

Proof. By symmetry it suffices to consider $g(x) f(p)$. Let $f, g \in L^2$. Then
$$g(x) f(p)\psi(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} g(x) \check f(x - y) \psi(y)\, d^n y$$
shows that $g(x) f(p)$ is Hilbert–Schmidt since $g(x)\check f(x - y) \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$.

If $f, g$ are bounded, then the functions $f_R(p) = \chi_{\{p \,|\, p^2 \le R\}}(p) f(p)$ and $g_R(x) = \chi_{\{x \,|\, x^2 \le R\}}(x) g(x)$ are in $L^2$. Thus $g_R(x) f_R(p)$ is compact and by
$$\|g(x) f(p) - g_R(x) f_R(p)\| \le \|g\|_\infty \|f - f_R\|_\infty + \|g - g_R\|_\infty \|f_R\|_\infty$$
it tends to $g(x) f(p)$ in norm since $f, g$ vanish at infinity. $\square$

In particular, this lemma implies that
$$\chi_\Omega (H_0 + \mathrm{i})^{-1} \qquad (7.35)$$
is compact if $\Omega \subseteq \mathbb{R}^n$ is bounded and hence
$$\lim_{t\to\infty} \|\chi_\Omega \mathrm{e}^{-\mathrm{i}t H_0}\psi\|^2 = 0 \qquad (7.36)$$
for any $\psi \in L^2(\mathbb{R}^n)$ and any bounded subset $\Omega$ of $\mathbb{R}^n$. In other words, the particle will eventually escape to infinity since the probability of finding the particle in any bounded set tends to zero. (If $\psi \in L^1(\mathbb{R}^n)$, this of course also follows from (7.31).)
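The escape to infinity in (7.36) is easy to see numerically. The following sketch (grid parameters are ad hoc, not from the book) evolves a normalized Gaussian and tracks the probability of remaining in the bounded set $\Omega = [-5, 5]$.

```python
import numpy as np

# Illustration of (7.36): under the free evolution the probability of
# finding the particle in Omega = [-5, 5] tends to zero.
N, L = 2**16, 2000.0
x = (np.arange(N) - N // 2) * (L / N)
p = 2 * np.pi * np.fft.fftfreq(N, d=L / N)
psi0 = np.pi**-0.25 * np.exp(-x**2 / 2)      # normalized Gaussian

def prob_in_omega(t):
    psi_t = np.fft.ifft(np.exp(-1j * t * p**2) * np.fft.fft(psi0))
    return float(np.sum(np.abs(psi_t[np.abs(x) <= 5])**2) * (L / N))

P = [prob_in_omega(t) for t in (0.0, 5.0, 20.0)]
print(P)  # decreasing toward zero
```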
7.4. The resolvent and Green's function

Now let us compute the resolvent of $H_0$. We will try to use an approach similar to that for the time evolution in the previous section. However, since it is highly nontrivial to compute the inverse Fourier transform of $\exp(-\varepsilon p^2)(p^2 - z)^{-1}$ directly, we will use a small ruse.

Note that
$$R_{H_0}(z) = \int_0^\infty \mathrm{e}^{z t}\, \mathrm{e}^{-t H_0}\, dt, \quad \operatorname{Re}(z) < 0, \qquad (7.37)$$
by Lemma 4.1. Moreover,
$$\mathrm{e}^{-t H_0}\psi(x) = \frac{1}{(4\pi t)^{n/2}} \int_{\mathbb{R}^n} \mathrm{e}^{-\frac{|x-y|^2}{4t}}\, \psi(y)\, d^n y, \quad t > 0, \qquad (7.38)$$
by the same analysis as in the previous section. Hence, by Fubini, we have
$$R_{H_0}(z)\psi(x) = \int_{\mathbb{R}^n} G_0(z, |x - y|)\, \psi(y)\, d^n y, \qquad (7.39)$$
where
$$G_0(z, r) = \int_0^\infty \frac{1}{(4\pi t)^{n/2}}\, \mathrm{e}^{-\frac{r^2}{4t} + z t}\, dt, \quad r > 0, \ \operatorname{Re}(z) < 0. \qquad (7.40)$$
The function $G_0(z, r)$ is called Green's function of $H_0$. The integral can be evaluated in terms of modified Bessel functions of the second kind as follows: First of all it suffices to consider $z < 0$ since the remaining values will follow by analytic continuation. Then, making the substitution $t = \frac{r}{2\sqrt{-z}} \mathrm{e}^s$, we obtain
$$\int_0^\infty \frac{1}{(4\pi t)^{n/2}}\, \mathrm{e}^{-\frac{r^2}{4t} + z t}\, dt = \frac{1}{4\pi} \Big(\frac{\sqrt{-z}}{2\pi r}\Big)^{\frac n2 - 1} \int_{-\infty}^{\infty} \mathrm{e}^{-\nu s}\, \mathrm{e}^{-x \cosh(s)}\, ds = \frac{1}{2\pi} \Big(\frac{\sqrt{-z}}{2\pi r}\Big)^{\frac n2 - 1} \int_0^\infty \cosh(\nu s)\, \mathrm{e}^{-x \cosh(s)}\, ds, \qquad (7.41)$$
where we have abbreviated $x = \sqrt{-z}\, r$ and $\nu = \frac n2 - 1$. But the last integral is given by the modified Bessel function $K_\nu(x)$ (see [1, (9.6.24)]) and thus
$$G_0(z, r) = \frac{1}{2\pi} \Big(\frac{\sqrt{-z}}{2\pi r}\Big)^{\frac n2 - 1} K_{\frac n2 - 1}(\sqrt{-z}\, r). \qquad (7.42)$$
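The representation (7.40) can be checked numerically against the elementary closed forms stated below in (7.47) and (7.48). The following sketch (quadrature grid is ad hoc, not from the book) evaluates (7.40) by a Riemann sum after the substitution $t = \mathrm{e}^s$, for a real $z < 0$.

```python
import numpy as np

# Numerical check of (7.40): compare with G_0 = e^{-sqrt(-z) r}/(2 sqrt(-z))
# for n = 1 and G_0 = e^{-sqrt(-z) r}/(4 pi r) for n = 3, i.e. (7.47), (7.48).
z, r = -2.0, 1.3
s = np.linspace(-30.0, 30.0, 600001)
ds = s[1] - s[0]
t = np.exp(s)

results = {}
for n in (1, 3):
    # substitution t = e^s turns (7.40) into an integral over the whole line
    integrand = (4 * np.pi * t)**(-n / 2) * np.exp(-r**2 / (4 * t) + z * t) * t
    results[n] = np.sum(integrand) * ds

closed = {1: np.exp(-np.sqrt(-z) * r) / (2 * np.sqrt(-z)),
          3: np.exp(-np.sqrt(-z) * r) / (4 * np.pi * r)}
print(results, closed)
```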
Note $K_\nu(x) = K_{-\nu}(x)$ and $K_\nu(x) > 0$ for $\nu, x \in \mathbb{R}$. The functions $K_\nu(x)$ satisfy the differential equation (see [1, (9.6.1)])
$$\Big(\frac{d^2}{dx^2} + \frac{1}{x}\frac{d}{dx} - 1 - \frac{\nu^2}{x^2}\Big) K_\nu(x) = 0 \qquad (7.43)$$
and have the asymptotics (see [1, (9.6.8) and (9.6.9)])
$$K_\nu(x) = \begin{cases} \frac{\Gamma(\nu)}{2} \big(\frac x2\big)^{-\nu} + O(x^{-\nu+2}), & \nu > 0,\\ -\log(\frac x2) + O(1), & \nu = 0, \end{cases} \qquad (7.44)$$
for $|x| \to 0$ and (see [1, (9.7.2)])
$$K_\nu(x) = \sqrt{\frac{\pi}{2x}}\, \mathrm{e}^{-x} \big(1 + O(x^{-1})\big) \qquad (7.45)$$
for $|x| \to \infty$. For more information see for example [1] or [59]. In particular, $G_0(z, r)$ has an analytic continuation for $z \in \mathbb{C}\setminus[0, \infty) = \rho(H_0)$. Hence we can define the right-hand side of (7.39) for all $z \in \rho(H_0)$ such that
$$\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \varphi(x)\, G_0(z, |x - y|)\, \psi(y)\, d^n y\, d^n x \qquad (7.46)$$
is analytic for $z \in \rho(H_0)$ and $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$ (by Morera's theorem). Since it is equal to $\langle \varphi, R_{H_0}(z)\psi\rangle$ for $\operatorname{Re}(z) < 0$, it is equal to this function for all $z \in \rho(H_0)$, since both functions are analytic in this domain. In particular, (7.39) holds for all $z \in \rho(H_0)$.

If $n$ is odd, we have the case of spherical Bessel functions which can be expressed in terms of elementary functions. For example, we have
$$G_0(z, r) = \frac{1}{2\sqrt{-z}}\, \mathrm{e}^{-\sqrt{-z}\, r}, \quad n = 1, \qquad (7.47)$$
and
$$G_0(z, r) = \frac{1}{4\pi r}\, \mathrm{e}^{-\sqrt{-z}\, r}, \quad n = 3. \qquad (7.48)$$

Problem 7.14. Verify (7.39) directly in the case $n = 1$.
Chapter 8
Algebraic methods
8.1. Position and momentum
Apart from the Hamiltonian $H_0$, which corresponds to the kinetic energy, there are several other important observables associated with a single particle in three dimensions. Using the commutation relations between these observables, many important consequences about them can be derived.
First consider the one-parameter unitary group
$$(U_j(t)\psi)(x) = \mathrm{e}^{-\mathrm{i}t x_j}\psi(x), \quad 1 \le j \le 3. \qquad (8.1)$$
For $\psi \in \mathcal{S}(\mathbb{R}^3)$ we compute
$$\lim_{t\to 0} \mathrm{i}\, \frac{\mathrm{e}^{-\mathrm{i}t x_j}\psi(x) - \psi(x)}{t} = x_j \psi(x) \qquad (8.2)$$
and hence the generator is the multiplication operator by the $j$th coordinate function. By Corollary 5.3 it is essentially self-adjoint on $\mathcal{S}(\mathbb{R}^3)$. It is customary to combine all three operators into one vector-valued operator $x$, which is known as the position operator. Moreover, it is not hard to see that the spectrum of $x_j$ is purely absolutely continuous and given by $\sigma(x_j) = \mathbb{R}$. In fact, let $\{\varphi_n(x)\}_n$ be an orthonormal basis for $L^2(\mathbb{R})$. Then $\varphi_i(x_1)\varphi_j(x_2)\varphi_k(x_3)$ is an orthonormal basis for $L^2(\mathbb{R}^3)$ and $x_1$ can be written as an orthogonal sum of operators restricted to the subspaces spanned by $\varphi_j(x_2)\varphi_k(x_3)$. Each subspace is unitarily equivalent to $L^2(\mathbb{R})$, on which $x_1$ acts as multiplication by the identity function. Hence the claim follows (or use Theorem 4.14).
Next, consider the one-parameter unitary group of translations
$$(U_j(t)\psi)(x) = \psi(x - t e_j), \quad 1 \le j \le 3, \qquad (8.3)$$
where $e_j$ is the unit vector in the $j$th coordinate direction. For $\psi \in \mathcal{S}(\mathbb{R}^3)$ we compute
$$\lim_{t\to 0} \mathrm{i}\,\frac{\psi(x - t e_j) - \psi(x)}{t} = \frac{1}{\mathrm{i}} \frac{\partial}{\partial x_j}\psi(x) \qquad (8.4)$$
and hence the generator is $p_j = \frac{1}{\mathrm{i}} \frac{\partial}{\partial x_j}$. Again it is essentially self-adjoint on $\mathcal{S}(\mathbb{R}^3)$. Moreover, since it is unitarily equivalent to $x_j$ by virtue of the Fourier transform, we conclude that the spectrum of $p_j$ is again purely absolutely continuous and given by $\sigma(p_j) = \mathbb{R}$. The operator $p$ is known as the momentum operator. Note that since
$$[H_0, p_j]\psi(x) = 0, \quad \psi \in \mathcal{S}(\mathbb{R}^3), \qquad (8.5)$$
we have
$$\frac{d}{dt}\langle \psi(t), p_j \psi(t)\rangle = 0, \quad \psi(t) = \mathrm{e}^{-\mathrm{i}tH_0}\psi(0) \in \mathcal{S}(\mathbb{R}^3); \qquad (8.6)$$
that is, the momentum is a conserved quantity for the free motion. More generally we have

Theorem 8.1 (Noether). Suppose $A$ is a self-adjoint operator which commutes with a self-adjoint operator $H$. Then $D(A)$ is invariant under $\mathrm{e}^{-\mathrm{i}tH}$, that is, $\mathrm{e}^{-\mathrm{i}tH} D(A) = D(A)$, and $A$ is a conserved quantity, that is,
$$\langle \psi(t), A\psi(t)\rangle = \langle \psi(0), A\psi(0)\rangle, \quad \psi(t) = \mathrm{e}^{-\mathrm{i}tH}\psi(0) \in D(A). \qquad (8.7)$$

Proof. By the second part of Lemma 4.5 (with $f(\lambda) = \lambda$ and $B = \mathrm{e}^{-\mathrm{i}tH}$) we see that $\psi \in D(A) = D(\mathrm{e}^{-\mathrm{i}tH} A) \subseteq D(A \mathrm{e}^{-\mathrm{i}tH}) = \{\psi \,|\, \mathrm{e}^{-\mathrm{i}tH}\psi \in D(A)\}$, which implies $\mathrm{e}^{-\mathrm{i}tH} D(A) \subseteq D(A)$, and $[\mathrm{e}^{-\mathrm{i}tH}, A]\psi = 0$ for $\psi \in D(A)$. $\square$
Similarly one has
$$\mathrm{i}[p_j, x_k]\psi(x) = \delta_{jk}\psi(x), \quad \psi \in \mathcal{S}(\mathbb{R}^3), \qquad (8.8)$$
which is known as the Weyl relations. In terms of the corresponding unitary groups, they read
$$\mathrm{e}^{\mathrm{i}s p_j}\, \mathrm{e}^{\mathrm{i}t x_k} = \mathrm{e}^{\mathrm{i}s t \delta_{jk}}\, \mathrm{e}^{\mathrm{i}t x_k}\, \mathrm{e}^{\mathrm{i}s p_j}. \qquad (8.9)$$
The Weyl relations also imply that the mean-square deviation of position and momentum cannot be made arbitrarily small simultaneously:

Theorem 8.2 (Heisenberg Uncertainty Principle). Suppose $A$ and $B$ are two symmetric operators. Then for any $\psi \in D(AB) \cap D(BA)$ we have
$$\Delta_\psi(A)\,\Delta_\psi(B) \ge \frac{1}{2} |E_\psi([A, B])| \qquad (8.10)$$
with equality if
$$(B - E_\psi(B))\psi = \mathrm{i}\lambda (A - E_\psi(A))\psi, \quad \lambda \in \mathbb{R}\setminus\{0\}, \qquad (8.11)$$
or if $\psi$ is an eigenstate of $A$ or $B$.
Proof. Let us fix $\psi \in D(AB) \cap D(BA)$ and abbreviate
$$\hat A = A - E_\psi(A), \qquad \hat B = B - E_\psi(B).$$
Then $\Delta_\psi(A) = \|\hat A \psi\|$, $\Delta_\psi(B) = \|\hat B \psi\|$ and hence by Cauchy–Schwarz
$$|\langle \hat A \psi, \hat B \psi\rangle| \le \Delta_\psi(A)\,\Delta_\psi(B).$$
Now note that
$$\hat A \hat B = \frac{1}{2}\{\hat A, \hat B\} + \frac{1}{2}[A, B], \qquad \{\hat A, \hat B\} = \hat A \hat B + \hat B \hat A,$$
where $\{\hat A, \hat B\}$ and $\mathrm{i}[A, B]$ are symmetric. So
$$|\langle \hat A\psi, \hat B\psi\rangle|^2 = |\langle \psi, \hat A \hat B \psi\rangle|^2 = \Big|\frac{1}{2}\langle \psi, \{\hat A, \hat B\}\psi\rangle\Big|^2 + \Big|\frac{1}{2}\langle \psi, [A, B]\psi\rangle\Big|^2,$$
which proves (8.10).

To have equality if $\psi$ is not an eigenstate, we need $\hat B \psi = z \hat A \psi$ for equality in Cauchy–Schwarz and $\langle \psi, \{\hat A, \hat B\}\psi\rangle = 0$. Inserting the first into the second requirement gives $0 = (z + z^*)\|\hat A\psi\|^2$ and shows $\operatorname{Re}(z) = 0$. $\square$
In the case of position and momentum we have ($\|\psi\| = 1$)
$$\Delta_\psi(p_j)\,\Delta_\psi(x_k) \ge \frac{\delta_{jk}}{2} \qquad (8.12)$$
and the minimum is attained for the Gaussian wave packets
$$\psi(x) = \Big(\frac{\lambda}{\pi}\Big)^{n/4} \mathrm{e}^{-\frac{\lambda}{2}|x - x_0|^2 + \mathrm{i} p_0 x}, \qquad (8.13)$$
which satisfy $E_\psi(x) = x_0$ and $E_\psi(p) = p_0$, respectively, $\Delta_\psi(p_j)^2 = \frac{\lambda}{2}$ and $\Delta_\psi(x_k)^2 = \frac{1}{2\lambda}$.
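The saturation of (8.12) by the packet (8.13) can be verified numerically; the discretization below is ad hoc (not from the book) and uses the FFT grid for the momentum-space density.

```python
import numpy as np

# Check that the Gaussian wave packet (8.13) with n = 1, lambda = 2,
# x0 = 1, p0 = 3 attains the minimum Delta(x) Delta(p) = 1/2.
lam, x0, p0 = 2.0, 1.0, 3.0
N, L = 2**13, 80.0
dx = L / N
x = (np.arange(N) - N // 2) * dx
psi = (lam / np.pi)**0.25 * np.exp(-lam / 2 * (x - x0)**2 + 1j * p0 * x)

w = np.abs(psi)**2 * dx                 # position density (normalized)
Ex = np.sum(x * w)
var_x = np.sum((x - Ex)**2 * w)

p = 2 * np.pi * np.fft.fftfreq(N, d=dx)
wp = np.abs(np.fft.fft(psi))**2
wp = wp / wp.sum()                      # momentum density on the FFT grid
Ep = np.sum(p * wp)
var_p = np.sum((p - Ep)**2 * wp)

product = np.sqrt(var_x * var_p)
print(Ex, Ep, product)  # approximately x0, p0, 1/2
```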
Problem 8.1. Check that (8.13) realizes the minimum.
8.2. Angular momentum

Now consider the one-parameter unitary group of rotations
$$(U_j(t)\psi)(x) = \psi(M_j(t)x), \quad 1 \le j \le 3, \qquad (8.14)$$
where $M_j(t)$ is the matrix of rotation around $e_j$ by an angle of $t$. For $\psi \in \mathcal{S}(\mathbb{R}^3)$ we compute
$$\lim_{t\to 0} \mathrm{i}\,\frac{\psi(M_i(t)x) - \psi(x)}{t} = \sum_{j,k=1}^{3} \varepsilon_{ijk}\, x_j p_k\, \psi(x), \qquad (8.15)$$
where
$$\varepsilon_{ijk} = \begin{cases} 1 & \text{if } ijk \text{ is an even permutation of } 123,\\ -1 & \text{if } ijk \text{ is an odd permutation of } 123,\\ 0 & \text{otherwise.} \end{cases} \qquad (8.16)$$
Again one combines the three components into one vector-valued operator $L = x \wedge p$, which is known as the angular momentum operator. Since $\mathrm{e}^{2\pi\mathrm{i} L_j} = \mathbb{I}$, we see that the spectrum is a subset of $\mathbb{Z}$. In particular, the continuous spectrum is empty. We will show below that we have $\sigma(L_j) = \mathbb{Z}$.
Note that since
$$[H_0, L_j]\psi(x) = 0, \qquad \psi \in \mathcal{S}(\mathbb{R}^3), \tag{8.17}$$
we again have
$$\frac{d}{dt}\langle \psi(t), L_j \psi(t)\rangle = 0, \qquad \psi(t) = \mathrm{e}^{-\mathrm{i} t H_0}\psi(0) \in \mathcal{S}(\mathbb{R}^3); \tag{8.18}$$
that is, the angular momentum is a conserved quantity for the free motion as well.
Moreover, we even have
$$[L_i, K_j]\psi(x) = \mathrm{i} \sum_{k=1}^{3} \varepsilon_{ijk} K_k \psi(x), \qquad \psi \in \mathcal{S}(\mathbb{R}^3),\quad K_j \in \{L_j, p_j, x_j\}, \tag{8.19}$$
and these algebraic commutation relations are often used to derive information on the point spectra of these operators. In this respect the domain
$$D = \mathrm{span}\{x^\alpha \mathrm{e}^{-\frac{x^2}{2}} \,|\, \alpha \in \mathbb{N}_0^n\} \subset \mathcal{S}(\mathbb{R}^n) \tag{8.20}$$
is often used. It has the nice property that the finite dimensional subspaces
$$D_k = \mathrm{span}\{x^\alpha \mathrm{e}^{-\frac{x^2}{2}} \,|\, |\alpha| \le k\} \tag{8.21}$$
are invariant under $L_j$ (and hence they reduce $L_j$).

Lemma 8.3. The subspace $D \subseteq L^2(\mathbb{R}^n)$ defined in (8.20) is dense.
Proof. By Lemma 1.10 it suffices to consider the case $n = 1$. Suppose $\langle \psi, \varphi\rangle = 0$ for every $\varphi \in D$. Then
$$\frac{1}{\sqrt{2\pi}} \int \psi(x)^* \mathrm{e}^{-\frac{x^2}{2}} \sum_{j=0}^{k} \frac{(\mathrm{i} t x)^j}{j!}\, dx = 0$$
for any finite $k$ and hence also in the limit $k \to \infty$ by the dominated convergence theorem. But the limit is the Fourier transform of $\psi(x)^* \mathrm{e}^{-\frac{x^2}{2}}$, which shows that this function is zero. Hence $\psi(x) = 0$. $\square$
Since $D$ is invariant under the unitary groups generated by $L_j$, the operators $L_j$ are essentially self-adjoint on $D$ by Corollary 5.3.
Introducing $L^2 = L_1^2 + L_2^2 + L_3^2$, it is straightforward to check
$$[L^2, L_j]\psi(x) = 0, \qquad \psi \in \mathcal{S}(\mathbb{R}^3). \tag{8.22}$$
Moreover, $D_k$ is invariant under $L^2$ and $L_3$ and hence $D_k$ reduces $L^2$ and $L_3$. In particular, $L^2$ and $L_3$ are given by finite matrices on $D_k$. Now let $H_m = \mathrm{Ker}(L_3 - m)$ and denote by $P_k$ the projector onto $D_k$. Since $L^2$ and $L_3$ commute on $D_k$, the space $P_k H_m$ is invariant under $L^2$, which shows that we can choose an orthonormal basis consisting of eigenfunctions of $L^2$ for $P_k H_m$. Increasing $k$, we get an orthonormal set of simultaneous eigenfunctions whose span is equal to $D$. Hence there is an orthonormal basis of simultaneous eigenfunctions of $L^2$ and $L_3$.
Now let us try to draw some further consequences by using the commutation relations (8.19). (All commutation relations below hold for $\psi \in \mathcal{S}(\mathbb{R}^3)$.) Denote by $H_{l,m}$ the set of all functions in $D$ satisfying
$$L_3\psi = m\psi, \qquad L^2\psi = l(l+1)\psi. \tag{8.23}$$
By $L^2 \ge 0$ and $\sigma(L_3) \subseteq \mathbb{Z}$ we can restrict our attention to the case $l \ge 0$ and $m \in \mathbb{Z}$.
First introduce two new operators
$$L_\pm = L_1 \pm \mathrm{i} L_2, \qquad [L_3, L_\pm] = \pm L_\pm. \tag{8.24}$$
Then, for every $\psi \in H_{l,m}$ we have
$$L_3(L_\pm\psi) = (m \pm 1)(L_\pm\psi), \qquad L^2(L_\pm\psi) = l(l+1)(L_\pm\psi); \tag{8.25}$$
that is, $L_\pm H_{l,m} \subseteq H_{l,m\pm 1}$. Moreover, since
$$L^2 = L_3^2 \mp L_3 + L_\pm L_\mp, \tag{8.26}$$
we obtain
$$\|L_\mp\psi\|^2 = \langle \psi, L_\pm L_\mp \psi\rangle = \big(l(l+1) - m(m \mp 1)\big)\|\psi\|^2 \tag{8.27}$$
for every $\psi \in H_{l,m}$. If $\psi \ne 0$, we must have $l(l+1) - m(m \mp 1) \ge 0$, which shows $H_{l,m} = \{0\}$ for $|m| > l$. Moreover, $L_\mp : H_{l,m} \to H_{l,m\mp 1}$ is injective unless $|m| = l$. Hence we must have $H_{l,m} = \{0\}$ for $l \notin \mathbb{N}_0$.

Up to this point we know $\sigma(L^2) \subseteq \{l(l+1)\,|\, l \in \mathbb{N}_0\}$, $\sigma(L_3) \subseteq \mathbb{Z}$. In order to show that equality holds in both cases, we need to show that $H_{l,m} \ne \{0\}$ for $l \in \mathbb{N}_0$, $m = -l, -l+1, \dots, l-1, l$. First of all we observe
$$\psi_{0,0}(x) = \frac{1}{\pi^{3/4}}\, \mathrm{e}^{-\frac{x^2}{2}} \in H_{0,0}. \tag{8.28}$$
Next, we note that (8.19) implies
$$[L_3, x_\pm] = \pm x_\pm, \qquad x_\pm = x_1 \pm \mathrm{i} x_2,$$
$$[L_\pm, x_\pm] = 0, \qquad [L_\pm, x_\mp] = \pm 2 x_3,$$
$$[L^2, x_\pm] = 2 x_\pm (1 \pm L_3) \mp 2 x_3 L_\pm. \tag{8.29}$$
Hence, if $\psi \in H_{l,l}$, then $(x_1 \pm \mathrm{i} x_2)\psi \in H_{l\pm 1, l\pm 1}$. Thus
$$\psi_{l,l}(x) = \frac{1}{\sqrt{l!}}\, (x_1 + \mathrm{i} x_2)^l\, \psi_{0,0}(x) \in H_{l,l}, \tag{8.30}$$
respectively,
$$\psi_{l,m}(x) = \sqrt{\frac{(l+m)!}{(l-m)!\,(2l)!}}\; L_-^{\,l-m}\, \psi_{l,l}(x) \in H_{l,m}. \tag{8.31}$$
The constants are chosen such that $\|\psi_{l,m}\| = 1$.
In summary,

Theorem 8.4. There exists an orthonormal basis of simultaneous eigenvectors for the operators $L^2$ and $L_j$. Moreover, their spectra are given by
$$\sigma(L^2) = \{l(l+1)\,|\, l \in \mathbb{N}_0\}, \qquad \sigma(L_3) = \mathbb{Z}. \tag{8.32}$$

We will give an alternate derivation of this result in Section 10.3.
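The ladder-operator algebra above can be illustrated with finite matrices. The following sketch (an illustration, not the function-space construction of the text) represents $L_3$ and $L_\pm$ on a $(2l+1)$-dimensional eigenspace with basis indexed by $m = -l, \dots, l$, using the coefficients from (8.27), and verifies the identity $L_3^2 - L_3 + L_+ L_- = l(l+1)$ of (8.26).

```python
import math

def check(l):
    dim = 2 * l + 1
    ms = list(range(-l, l + 1))
    # Lp[i][j] = <m_i| L+ |m_j>, nonzero only for m_i = m_j + 1, and
    # similarly Lm for L-; the entries are the square roots from (8.27).
    Lp = [[0.0] * dim for _ in range(dim)]
    Lm = [[0.0] * dim for _ in range(dim)]
    for j, m in enumerate(ms):
        if m < l:
            Lp[j + 1][j] = math.sqrt(l * (l + 1) - m * (m + 1))
        if m > -l:
            Lm[j - 1][j] = math.sqrt(l * (l + 1) - m * (m - 1))
    # (L3^2 - L3 + L+ L-) acting on the basis vector |m_j> is diagonal
    for j, m in enumerate(ms):
        LpLm = sum(Lp[j][k] * Lm[k][j] for k in range(dim))
        assert abs((m * m - m + LpLm) - l * (l + 1)) < 1e-12
    return True

print(all(check(l) for l in range(5)))  # True
```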
8.3. The harmonic oscillator
Finally, let us consider another important model whose algebraic structure is similar to that of the angular momentum, the harmonic oscillator
$$H = H_0 + \omega^2 x^2, \qquad \omega > 0. \tag{8.33}$$
We will choose as domain
$$D(H) = D = \mathrm{span}\{x^\alpha \mathrm{e}^{-\frac{x^2}{2}} \,|\, \alpha \in \mathbb{N}_0^3\} \subseteq L^2(\mathbb{R}^3) \tag{8.34}$$
from our previous section.
We will first consider the one-dimensional case. Introducing
$$A_\pm = \frac{1}{\sqrt{2}}\left(\sqrt{\omega}\, x \mp \frac{1}{\sqrt{\omega}}\frac{d}{dx}\right), \qquad D(A_\pm) = D, \tag{8.35}$$
we have
$$[A_-, A_+] = 1 \tag{8.36}$$
and
$$H\psi = \omega(2N + 1)\psi, \qquad N = A_+ A_-, \quad D(N) = D, \tag{8.37}$$
for any function $\psi \in D$. In particular, note that $D$ is invariant under $A_\pm$.

Moreover, since
$$[N, A_\pm] = \pm A_\pm, \tag{8.38}$$
we see that $N\psi = n\psi$ implies $N A_\pm \psi = (n \pm 1) A_\pm \psi$. Moreover, $\|A_+\psi\|^2 = \langle \psi, A_- A_+ \psi\rangle = (n+1)\|\psi\|^2$, respectively, $\|A_-\psi\|^2 = n\|\psi\|^2$, in this case and hence we conclude that $\sigma_p(N) \subseteq \mathbb{N}_0$.
If $N\psi_0 = 0$, then we must have $A_-\psi_0 = 0$ and the normalized solution of this last equation is given by
$$\psi_0(x) = \Big(\frac{\omega}{\pi}\Big)^{1/4} \mathrm{e}^{-\frac{\omega x^2}{2}} \in D. \tag{8.39}$$
Hence
$$\psi_n(x) = \frac{1}{\sqrt{n!}}\, A_+^n\, \psi_0(x) \tag{8.40}$$
is a normalized eigenfunction of $N$ corresponding to the eigenvalue $n$. Moreover, since
$$\psi_n(x) = \frac{1}{\sqrt{2^n n!}} \Big(\frac{\omega}{\pi}\Big)^{1/4} H_n(\sqrt{\omega}\, x)\, \mathrm{e}^{-\frac{\omega x^2}{2}}, \tag{8.41}$$
where $H_n(x)$ is a polynomial of degree $n$ given by
$$H_n(x) = \mathrm{e}^{\frac{x^2}{2}} \left(x - \frac{d}{dx}\right)^n \mathrm{e}^{-\frac{x^2}{2}} = (-1)^n \mathrm{e}^{x^2} \frac{d^n}{dx^n} \mathrm{e}^{-x^2}, \tag{8.42}$$
we conclude $\mathrm{span}\{\psi_n\} = D$. The polynomials $H_n(x)$ are called Hermite polynomials.
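The polynomials of (8.42) satisfy the classical three-term recurrence $H_0 = 1$, $H_1(x) = 2x$, $H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x)$; the following sketch uses it to check the standard orthogonality relation $\int H_m H_n \mathrm{e}^{-x^2} dx = \sqrt{\pi}\, 2^n n!\, \delta_{mn}$ numerically (a fact consistent with, though not stated in, the text above).

```python
import math

def hermite(n, x):
    # physicists' Hermite polynomials via the recurrence
    h0, h1 = 1.0, 2.0 * x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, 2.0 * x * h1 - 2.0 * k * h0
    return h1

def inner(m, n, pts=4000, a=-8.0, b=8.0):
    # trapezoidal rule for int H_m H_n exp(-x^2) dx; the Gaussian
    # weight makes the truncation to [-8, 8] harmless for small m, n
    h = (b - a) / pts
    s = 0.0
    for i in range(pts + 1):
        x = a + i * h
        w = h / 2 if i in (0, pts) else h
        s += w * hermite(m, x) * hermite(n, x) * math.exp(-x * x)
    return s

for m in range(4):
    for n in range(4):
        expected = math.sqrt(math.pi) * 2**n * math.factorial(n) if m == n else 0.0
        assert abs(inner(m, n) - expected) < 1e-6
print("Hermite orthogonality verified")
```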
In summary,

Theorem 8.5. The harmonic oscillator $H$ is essentially self-adjoint on $D$ and has an orthonormal basis of eigenfunctions
$$\psi_{n_1,n_2,n_3}(x) = \psi_{n_1}(x_1)\,\psi_{n_2}(x_2)\,\psi_{n_3}(x_3), \tag{8.43}$$
with $\psi_{n_j}(x_j)$ from (8.41). The spectrum is given by
$$\sigma(H) = \{(2n+3)\omega \,|\, n \in \mathbb{N}_0\}. \tag{8.44}$$
Finally, there is also a close connection with the Fourier transformation. Without restriction we choose $\omega = 1$ and consider only the one-dimensional case. Then it is easy to verify that $H$ commutes with the Fourier transformation,
$$\mathcal{F} H = H \mathcal{F}, \tag{8.45}$$
on $D$. Moreover, by $\mathcal{F} A_\pm = (\mp\mathrm{i}) A_\pm \mathcal{F}$ we even infer
$$\mathcal{F}\psi_n = \frac{1}{\sqrt{n!}}\, \mathcal{F} A_+^n \psi_0 = \frac{(-\mathrm{i})^n}{\sqrt{n!}}\, A_+^n \mathcal{F}\psi_0 = (-\mathrm{i})^n \psi_n, \tag{8.46}$$
since $\mathcal{F}\psi_0 = \psi_0$ by Lemma 7.3. In particular,
$$\sigma(\mathcal{F}) = \{z \in \mathbb{C}\,|\, z^4 = 1\}. \tag{8.47}$$
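Relation (8.46) can be tested numerically. The sketch below (with $\omega = 1$) evaluates $\mathcal{F}\psi(p) = (2\pi)^{-1/2}\int \mathrm{e}^{-\mathrm{i} p x}\psi(x)\,dx$ by quadrature and compares it with $(-\mathrm{i})^n \psi_n(p)$ at a few sample points.

```python
import cmath, math

def hermite(n, x):
    h0, h1 = 1.0, 2.0 * x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, 2.0 * x * h1 - 2.0 * k * h0
    return h1

def psi(n, x):
    # oscillator eigenfunction (8.41) with omega = 1
    c = math.pi**-0.25 / math.sqrt(2**n * math.factorial(n))
    return c * hermite(n, x) * math.exp(-x * x / 2)

def fourier(n, p, pts=2000, a=-10.0, b=10.0):
    # (2 pi)^(-1/2) int e^{-ipx} psi_n(x) dx by the trapezoidal rule
    h = (b - a) / pts
    s = 0j
    for i in range(pts + 1):
        x = a + i * h
        w = h / 2 if i in (0, pts) else h
        s += w * cmath.exp(-1j * p * x) * psi(n, x)
    return s / math.sqrt(2 * math.pi)

for n in range(4):
    for p in (0.3, 1.1):
        assert abs(fourier(n, p) - (-1j) ** n * psi(n, p)) < 1e-8
print("F psi_n = (-i)^n psi_n verified")
```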
8.4. Abstract commutation
The considerations of the previous section can be generalized as follows. First of all, the starting point was a factorization of $H$ according to $H = A^* A$ (note that $A_\pm$ from the previous section are adjoint to each other when restricted to $D$). Then it turned out that commuting both operators just corresponds to a shift of $H$; that is, $A A^* = H + c$. Hence one could exploit the close spectral relation of $A^* A$ and $A A^*$ to compute both the eigenvalues and eigenvectors.
More generally, let $A$ be a closed operator and recall that $H_0 = A^* A$ is a self-adjoint operator (cf. Problem 2.12) with $\mathrm{Ker}(H_0) = \mathrm{Ker}(A)$. Similarly, $H_1 = A A^*$ is a self-adjoint operator with $\mathrm{Ker}(H_1) = \mathrm{Ker}(A^*)$.

Theorem 8.6. Let $A$ be a closed operator. The operators $H_0 = A^* A\big|_{\mathrm{Ker}(A)^\perp}$ and $H_1 = A A^*\big|_{\mathrm{Ker}(A^*)^\perp}$ are unitarily equivalent.

If $H_0\psi_0 = E_0\psi_0$, $\psi_0 \in D(H_0)$, then $\psi_1 = A\psi_0 \in D(H_1)$ with $H_1\psi_1 = E_0\psi_1$ and $\|\psi_1\| = \sqrt{E_0}\,\|\psi_0\|$. Moreover,
$$R_{H_1}(z) = \frac{1}{z}\big(A R_{H_0}(z) A^* - 1\big), \qquad R_{H_0}(z) = \frac{1}{z}\big(A^* R_{H_1}(z) A - 1\big). \tag{8.48}$$
Proof. Introducing $|A| = H_0^{1/2}$, we have the polar decomposition (Problem 3.11)
$$A = U|A|,$$
where
$$U : \mathrm{Ker}(A)^\perp \to \mathrm{Ker}(A^*)^\perp$$
is unitary. Taking adjoints, we have (Problem 2.3)
$$A^* = |A|\, U^*$$
and thus $H_1 = A A^* = U|A|\,|A|U^* = U H_0 U^*$ shows the claimed unitary equivalence.

The claims about the eigenvalues are straightforward (for the norm note $A\psi_0 = \sqrt{E_0}\, U\psi_0$). To see the connection between the resolvents, abbreviate $P_1 = P_{H_1}(\{0\})$. Then
$$R_{H_1}(z) = R_{H_1}(z)(1 - P_1) - \frac{1}{z} P_1 = U R_{H_0}(z) U^* - \frac{1}{z} P_1$$
$$= \frac{1}{z}\Big(U\big(|H_0|^{1/2} R_{H_0}(z) |H_0|^{1/2} - 1\big)U^* - P_1\Big)$$
$$= \frac{1}{z}\big(A R_{H_0}(z) A^* - (1 - P_1) - P_1\big) = \frac{1}{z}\big(A R_{H_0}(z) A^* - 1\big),$$
where we have used $U U^* = 1 - P_1$. $\square$
We will use this result to compute the eigenvalues and eigenfunctions of
the hydrogen atom in Section 10.4. In the physics literature this approach
is also known as supersymmetric quantum mechanics.
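The finite-dimensional shadow of Theorem 8.6 is easy to test: for any matrix $A$, the nonzero eigenvalues of $A^* A$ and $A A^*$ agree with multiplicity, and the extra eigenvalues of the larger product are zeros coming from $\mathrm{Ker}(A^*)$. A minimal sketch (using a real random matrix for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))       # generically rank 3

ev0 = np.linalg.eigvalsh(A.T @ A)     # spectrum of A^* A (3 values)
ev1 = np.linalg.eigvalsh(A @ A.T)     # spectrum of A A^* (5 values)

nonzero0 = np.sort(ev0[ev0 > 1e-10])
nonzero1 = np.sort(ev1[ev1 > 1e-10])
assert np.allclose(nonzero0, nonzero1)
print(len(ev1) - len(nonzero1))       # dimension of Ker(A A^*): 2
```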
Problem 8.2. Show that $H_0 = -\frac{d^2}{dx^2} + q$ can formally (i.e., ignoring domains) be written as $H_0 = A A^*$, where $A = \frac{d}{dx} + \phi$, if the differential equation $-\psi'' + q\psi = 0$ has a positive solution $\psi$. Compute $H_1 = A^* A$. (Hint: $\phi = \frac{\psi'}{\psi}$.)
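As an illustration of Problem 8.2 (my choice of example, not the text's), take $\psi(x) = \cosh(x)$, a positive solution of $-\psi'' + q\psi = 0$ with $q = 1$. Then $\phi = \psi'/\psi = \tanh(x)$, the potential of $H_0 = AA^*$ is $\phi' + \phi^2 = q$, and the potential of $H_1 = A^*A$ is $\phi^2 - \phi' = 1 - 2/\cosh(x)^2$.

```python
import math

# phi = tanh, phi' = 1/cosh^2; check the two factorization potentials
for x in (-2.0, -0.5, 0.0, 1.3):
    phi = math.tanh(x)
    dphi = 1 / math.cosh(x) ** 2
    assert abs((dphi + phi**2) - 1.0) < 1e-12                       # q of H0
    assert abs((phi**2 - dphi) - (1 - 2 / math.cosh(x)**2)) < 1e-12  # q of H1
print("factorization potentials check out for psi = cosh")
```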
Problem 8.3. Take $H_0 = -\frac{d^2}{dx^2} + \lambda$, $\lambda > 0$, and compute $H_1$. What about domains?
Chapter 9
One-dimensional
Schrödinger operators
9.1. Sturm–Liouville operators

In this section we want to illustrate some of the results obtained thus far by investigating a specific example, the Sturm–Liouville equation
$$\tau f(x) = \frac{1}{r(x)}\left(-\frac{d}{dx}\, p(x) \frac{d}{dx} f(x) + q(x) f(x)\right), \qquad f, pf' \in AC_{\mathrm{loc}}(I). \tag{9.1}$$
The case $p = r = 1$ can be viewed as the model of a particle in one dimension in the external potential $q$. Moreover, the case of a particle in three dimensions can in some situations be reduced to the investigation of Sturm–Liouville equations. In particular, we will see how this works when explicitly solving the hydrogen atom.

The suitable Hilbert space is
$$L^2((a,b), r(x)dx), \qquad \langle f, g\rangle = \int_a^b f(x)^* g(x)\, r(x)dx, \tag{9.2}$$
where $I = (a,b) \subseteq \mathbb{R}$ is an arbitrary open interval.

We require

(i) $p^{-1} \in L^1_{\mathrm{loc}}(I)$, positive,
(ii) $q \in L^1_{\mathrm{loc}}(I)$, real-valued,
(iii) $r \in L^1_{\mathrm{loc}}(I)$, positive.
If $a$ is finite and if $p^{-1}, q, r \in L^1((a,c))$ ($c \in I$), then the Sturm–Liouville equation (9.1) is called regular at $a$. Similarly for $b$. If it is regular at both $a$ and $b$, it is called regular.

The maximal domain of definition for $\tau$ in $L^2(I, r\,dx)$ is given by
$$D(\tau) = \{f \in L^2(I, r\,dx)\,|\, f, pf' \in AC_{\mathrm{loc}}(I),\ \tau f \in L^2(I, r\,dx)\}. \tag{9.3}$$
It is not clear that $D(\tau)$ is dense unless (e.g.) $p \in AC_{\mathrm{loc}}(I)$, $p', q \in L^2_{\mathrm{loc}}(I)$, $r^{-1} \in L^\infty_{\mathrm{loc}}(I)$ since $C_0^\infty(I) \subseteq D(\tau)$ in this case. We will defer the general case to Lemma 9.4 below.
Since we are interested in self-adjoint operators $H$ associated with (9.1), we perform a little calculation. Using integration by parts (twice), we obtain the Lagrange identity ($a < c < d < b$)
$$\int_c^d g^* (\tau f)\, r\,dy = W_d(g^*, f) - W_c(g^*, f) + \int_c^d (\tau g)^* f\, r\,dy \tag{9.4}$$
for $f, g, pf', pg' \in AC_{\mathrm{loc}}(I)$, where
$$W_x(f_1, f_2) = \big(p(f_1 f_2' - f_1' f_2)\big)(x) \tag{9.5}$$
is called the modified Wronskian.

Equation (9.4) also shows that the Wronskian of two solutions of $\tau u = zu$ is constant,
$$W_x(u_1, u_2) = W(u_1, u_2), \qquad \tau u_{1,2} = z u_{1,2}. \tag{9.6}$$
Moreover, it is nonzero if and only if $u_1$ and $u_2$ are linearly independent (compare Theorem 9.1 below).

If we choose $f, g \in D(\tau)$ in (9.4), then we can take the limits $c \to a$ and $d \to b$, which results in
$$\langle g, \tau f\rangle = W_b(g^*, f) - W_a(g^*, f) + \langle \tau g, f\rangle, \qquad f, g \in D(\tau). \tag{9.7}$$
Here $W_{a,b}(g^*, f)$ has to be understood as a limit.
Finally, we recall the following well-known result from ordinary differential equations.

Theorem 9.1. Suppose $rg \in L^1_{\mathrm{loc}}(I)$. Then there exists a unique solution $f, pf' \in AC_{\mathrm{loc}}(I)$ of the differential equation
$$(\tau - z) f = g, \qquad z \in \mathbb{C}, \tag{9.8}$$
satisfying the initial condition
$$f(c) = \alpha, \quad (pf')(c) = \beta, \qquad \alpha, \beta \in \mathbb{C},\ c \in I. \tag{9.9}$$
In addition, $f$ is entire with respect to $z$.
Proof. Introducing
$$u = \begin{pmatrix} f \\ pf' \end{pmatrix}, \qquad v = \begin{pmatrix} 0 \\ rg \end{pmatrix},$$
we can rewrite (9.8) as the linear first order system
$$u' - A u = v, \qquad A(x) = \begin{pmatrix} 0 & p^{-1}(x) \\ q(x) - z\, r(x) & 0 \end{pmatrix}.$$
Integrating with respect to $x$, we see that this system is equivalent to the Volterra integral equation
$$u - K u = w, \qquad (K u)(x) = \int_c^x A(y) u(y)\,dy, \quad w(x) = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \int_c^x v(y)\,dy.$$
We will choose some $d \in (c,b)$ and consider the integral operator $K$ in the Banach space $C([c,d])$. Then for any $h \in C([c,d])$ and $x \in [c,d]$ we have the estimate
$$|K^n(h)(x)| \le \frac{a_1(x)^n}{n!}\, \|h\|, \qquad a_1(x) = \int_c^x a(y)\,dy, \quad a(x) = \|A(x)\|,$$
which follows from induction:
$$|K^{n+1}(h)(x)| = \left|\int_c^x A(y) K^n(h)(y)\,dy\right| \le \int_c^x a(y)\, |K^n(h)(y)|\,dy \le \|h\| \int_c^x a(y)\, \frac{a_1(y)^n}{n!}\,dy = \frac{a_1(x)^{n+1}}{(n+1)!}\, \|h\|.$$
Hence the unique solution of our integral equation is given by the Neumann series (show this)
$$u(x) = \sum_{n=0}^{\infty} K^n(w)(x).$$
To see that the solution $u(x)$ is entire with respect to $z$, note that the partial sums are entire (in fact polynomial) in $z$ and hence so is the limit by uniform convergence with respect to $z$ in compact sets. An analogous argument for $d \in (a,c)$ finishes the proof. $\square$

Note that $f$, $pf'$ can be extended continuously to a regular endpoint.
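The Neumann series of the proof is also a practical numerical scheme. The following sketch (my discretization, with $p = r = 1$, $q = 0$, $z = 1$, $g = 0$, so $A = \begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}$) iterates $u \leftarrow w + Ku$ with a trapezoidal rule for $K$; the fixed point is $u = (f, f')$ with $f'' = -f$, $f(0) = 0$, $f'(0) = 1$, i.e. $f = \sin$.

```python
import math

n, d = 2000, math.pi          # grid on [0, pi]
h = d / n
xs = [i * h for i in range(n + 1)]
w = [(0.0, 1.0)] * (n + 1)    # constant data (alpha, beta) = (0, 1)

def K(u):
    # (Ku)(x) = int_0^x A(y) u(y) dy with A(y)(f, g) = (g, -f)
    out, s1, s2 = [(0.0, 0.0)], 0.0, 0.0
    for i in range(1, n + 1):
        a1, a2 = u[i - 1][1], -u[i - 1][0]
        b1, b2 = u[i][1], -u[i][0]
        s1 += h * (a1 + b1) / 2
        s2 += h * (a2 + b2) / 2
        out.append((s1, s2))
    return out

u = w
for _ in range(60):           # Picard iteration; the series converges like 1/n!
    Ku = K(u)
    u = [(w[i][0] + Ku[i][0], w[i][1] + Ku[i][1]) for i in range(n + 1)]

err = max(abs(u[i][0] - math.sin(xs[i])) for i in range(n + 1))
print(err)  # small: dominated by the trapezoidal discretization error
```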
Lemma 9.2. Suppose $u_1, u_2$ are two solutions of $(\tau - z)u = 0$ which satisfy $W(u_1, u_2) = 1$. Then any other solution of (9.8) can be written as ($\alpha, \beta \in \mathbb{C}$)
$$f(x) = u_1(x)\left(\alpha + \int_c^x u_2\, g\, r\,dy\right) + u_2(x)\left(\beta - \int_c^x u_1\, g\, r\,dy\right),$$
$$f'(x) = u_1'(x)\left(\alpha + \int_c^x u_2\, g\, r\,dy\right) + u_2'(x)\left(\beta - \int_c^x u_1\, g\, r\,dy\right). \tag{9.10}$$
Note that the constants $\alpha, \beta$ coincide with those from Theorem 9.1 if $u_1(c) = (p u_2')(c) = 1$ and $(p u_1')(c) = u_2(c) = 0$.
Proof. It suffices to check $\tau f - z f = g$. Differentiating the first equation of (9.10) gives the second. Next we compute
$$(p f')' = (p u_1')'\left(\alpha + \int u_2\, g\, r\,dy\right) + (p u_2')'\left(\beta - \int u_1\, g\, r\,dy\right) - W(u_1, u_2)\, g\, r$$
$$= (q - z r)\, u_1 \left(\alpha + \int u_2\, g\, r\,dy\right) + (q - z r)\, u_2 \left(\beta - \int u_1\, g\, r\,dy\right) - g\, r = (q - z r) f - g\, r,$$
which proves the claim. $\square$
Now we want to obtain a symmetric operator and hence we choose
$$A_0 f = \tau f, \qquad D(A_0) = D(\tau) \cap AC_c(I), \tag{9.11}$$
where $AC_c(I)$ denotes the functions in $AC(I)$ with compact support. This definition clearly ensures that the Wronskian of two such functions vanishes on the boundary, implying that $A_0$ is symmetric by virtue of (9.7). Our first task is to compute the closure of $A_0$ and its adjoint. For this the following elementary fact will be needed.
Lemma 9.3. Suppose $V$ is a vector space and $l, l_1, \dots, l_n$ are linear functionals (defined on all of $V$) such that $\bigcap_{j=1}^n \mathrm{Ker}(l_j) \subseteq \mathrm{Ker}(l)$. Then $l = \sum_{j=1}^n \alpha_j l_j$ for some constants $\alpha_j \in \mathbb{C}$.

Proof. First of all it is no restriction to assume that the functionals $l_j$ are linearly independent. Then the map $L : V \to \mathbb{C}^n$, $f \mapsto (l_1(f), \dots, l_n(f))$, is surjective (since $x \in \mathrm{Ran}(L)^\perp$ implies $\sum_{j=1}^n x_j^* l_j(f) = 0$ for all $f$). Hence there are vectors $f_k \in V$ such that $l_j(f_k) = 0$ for $j \ne k$ and $l_j(f_j) = 1$. Then $f - \sum_{j=1}^n l_j(f) f_j \in \bigcap_{j=1}^n \mathrm{Ker}(l_j)$ and hence $l(f) - \sum_{j=1}^n l_j(f)\, l(f_j) = 0$. Thus we can choose $\alpha_j = l(f_j)$. $\square$
Now we are ready to prove

Lemma 9.4. The operator $A_0$ is densely defined and its closure is given by
$$\overline{A_0} f = \tau f, \qquad D(\overline{A_0}) = \{f \in D(\tau)\,|\, W_a(f,g) = W_b(f,g) = 0,\ \forall g \in D(\tau)\}. \tag{9.12}$$
Its adjoint is given by
$$A_0^* f = \tau f, \qquad D(A_0^*) = D(\tau). \tag{9.13}$$
Proof. We start by computing $A_0^*$ and ignore the fact that we do not know whether $D(A_0)$ is dense for now.

By (9.7) we have $D(\tau) \subseteq D(A_0^*)$ and it remains to show $D(A_0^*) \subseteq D(\tau)$. If $h \in D(A_0^*)$, we must have
$$\langle h, A_0 f\rangle = \langle k, f\rangle, \qquad f \in D(A_0),$$
for some $k \in L^2(I, r\,dx)$. Using (9.10), we can find an $\tilde{h}$ such that $\tau\tilde{h} = k$ and from integration by parts we obtain
$$\int_a^b (h(x) - \tilde{h}(x))^*\, (\tau f)(x)\, r(x)dx = 0, \qquad f \in D(A_0). \tag{9.14}$$
Clearly we expect that $h - \tilde{h}$ will be a solution of $\tau u = 0$ and to prove this, we will invoke Lemma 9.3. Therefore we consider the linear functionals
$$l(g) = \int_a^b (h(x) - \tilde{h}(x))^*\, g(x)\, r(x)dx, \qquad l_j(g) = \int_a^b u_j(x)^*\, g(x)\, r(x)dx,$$
on $L^2_c(I, r\,dx)$, where $u_j$ are two solutions of $\tau u = 0$ with $W(u_1, u_2) \ne 0$. Then we have $\mathrm{Ker}(l_1) \cap \mathrm{Ker}(l_2) \subseteq \mathrm{Ker}(l)$. In fact, if $g \in \mathrm{Ker}(l_1) \cap \mathrm{Ker}(l_2)$, then
$$f(x) = u_1(x) \int_a^x u_2(y)\, g(y)\, r(y)dy + u_2(x) \int_x^b u_1(y)\, g(y)\, r(y)dy$$
is in $D(A_0)$ and $\tau f = g \in \mathrm{Ker}(l)$ by (9.14). Now Lemma 9.3 implies
$$\int_a^b \big(h(x) - \tilde{h}(x) + \alpha_1 u_1(x) + \alpha_2 u_2(x)\big)^*\, g(x)\, r(x)dx = 0, \qquad g \in L^2_c(I, r\,dx),$$
and hence $h = \tilde{h} + \alpha_1 u_1 + \alpha_2 u_2 \in D(\tau)$.

Now what if $D(A_0)$ were not dense? Then there would be some freedom in the choice of $k$ since we could always add a component in $D(A_0)^\perp$. So suppose we have two choices $k_1 \ne k_2$. Then by the above calculation, there are corresponding functions $\tilde{h}_1$ and $\tilde{h}_2$ such that $h = \tilde{h}_1 + \alpha_{1,1} u_1 + \alpha_{1,2} u_2 = \tilde{h}_2 + \alpha_{2,1} u_1 + \alpha_{2,2} u_2$. In particular, $\tilde{h}_1 - \tilde{h}_2$ is in the kernel of $\tau$ and hence $k_1 = \tau \tilde{h}_1 = \tau \tilde{h}_2 = k_2$, a contradiction to our assumption.

Next we turn to $\overline{A_0}$. Denote the set on the right-hand side of (9.12) by $D$. Then we have $D \subseteq D(A_0^{**}) = D(\overline{A_0})$ by (9.7). Conversely, since $\overline{A_0} \subseteq A_0^*$, we can use (9.7) to conclude
$$W_a(f, h) + W_b(f, h) = 0, \qquad f \in D(\overline{A_0}),\ h \in D(A_0^*).$$
Now replace $h$ by an $\tilde{h} \in D(A_0^*)$ which coincides with $h$ near $a$ and vanishes identically near $b$ (Problem 9.1). Then $W_a(f, h) = W_a(f, \tilde{h}) + W_b(f, \tilde{h}) = 0$. Finally, $W_b(f, h) = -W_a(f, h) = 0$ shows $f \in D$. $\square$
Example. If $\tau$ is regular at $a$, then $W_a(f, g) = 0$ for all $g \in D(\tau)$ if and only if $f(a) = (pf')(a) = 0$. This follows since we can prescribe the values of $g(a)$, $(pg')(a)$ for $g \in D(\tau)$ arbitrarily.
This result shows that any self-adjoint extension of $A_0$ must lie between $\overline{A_0}$ and $A_0^*$. Moreover, self-adjointness seems to be related to the Wronskian of two functions at the boundary. Hence we collect a few properties first.
Lemma 9.5. Suppose $v \in D(\tau)$ with $W_a(v^*, v) = 0$ and suppose there is an $\hat{f} \in D(\tau)$ with $W_a(v^*, \hat{f}) \ne 0$. Then, for $f, g \in D(\tau)$, we have
$$W_a(v, f) = 0 \quad\Leftrightarrow\quad W_a(v, f^*) = 0 \tag{9.15}$$
and
$$W_a(v, f) = W_a(v, g) = 0 \quad\Rightarrow\quad W_a(g^*, f) = 0. \tag{9.16}$$

Proof. For all $f_1, \dots, f_4 \in D(\tau)$ we have the Plücker identity
$$W_x(f_1, f_2) W_x(f_3, f_4) + W_x(f_1, f_3) W_x(f_4, f_2) + W_x(f_1, f_4) W_x(f_2, f_3) = 0, \tag{9.17}$$
which remains valid in the limit $x \to a$. Choosing $f_1 = v$, $f_2 = f$, $f_3 = v^*$, $f_4 = \hat{f}$, we infer (9.15). Choosing $f_1 = f$, $f_2 = g^*$, $f_3 = v$, $f_4 = \hat{f}$, we infer (9.16). $\square$
Problem 9.1. Given $\alpha, \beta, \gamma, \delta$, show that there is a function $f$ in $D(\tau)$ restricted to $[c,d] \subseteq (a,b)$ such that $f(c) = \alpha$, $(pf')(c) = \beta$ and $f(d) = \gamma$, $(pf')(d) = \delta$. (Hint: Lemma 9.2.)

Problem 9.2. Let $A_0 = -\frac{d^2}{dx^2}$, $D(A_0) = \{f \in H^2[0,1]\,|\, f(0) = f(1) = 0\}$ and $B = q$, $D(B) = \{f \in L^2(0,1)\,|\, qf \in L^2(0,1)\}$. Find a $q \in L^1(0,1)$ such that $D(A_0) \cap D(B) = \{0\}$. (Hint: Problem 0.30.)
Problem 9.3. Let $\phi \in L^1_{\mathrm{loc}}(I)$. Define
$$A_\pm = \mp\frac{d}{dx} + \phi, \qquad D(A_\pm) = \{f \in L^2(I)\,|\, f \in AC(I),\ \mp f' + \phi f \in L^2(I)\}$$
and $A_{0,\pm} = A_\pm\big|_{D(A_\pm) \cap AC_c(I)}$. Show $A_{0,\pm}^* = A_\mp$ and
$$D(\overline{A_{0,\pm}}) = \{f \in D(A_\pm)\,|\, \lim_{x \to a,b} f(x) g(x) = 0,\ \forall g \in D(A_\mp)\}.$$
In particular, show that the limits above exist.
Problem 9.4 (Liouville normal form). Show that every Sturm–Liouville equation can be transformed into one with $r = p = 1$ as follows: Show that the transformation $U : L^2((a,b), r\,dx) \to L^2(0,c)$, $c = \int_a^b \sqrt{\frac{r(t)}{p(t)}}\,dt$, defined via $u(x) \mapsto v(y)$, where
$$y(x) = \int_a^x \sqrt{\frac{r(t)}{p(t)}}\,dt, \qquad v(y) = \sqrt[4]{r(x(y))\, p(x(y))}\; u(x(y)),$$
is unitary. Moreover, if $p, r, p', r' \in AC(a,b)$, then
$$-(p u')' + q u = r \lambda u$$
transforms into
$$-v'' + Q v = \lambda v,$$
where
$$Q = q - \frac{(pr)^{1/4}}{r} \Big(p \big((pr)^{-1/4}\big)'\Big)'.$$
9.2. Weyl's limit circle, limit point alternative
Inspired by Lemma 9.5, we make the following definition: We call $\tau$ limit circle (l.c.) at $a$ if there is a $v \in D(\tau)$ with $W_a(v^*, v) = 0$ such that $W_a(v, f) \ne 0$ for at least one $f \in D(\tau)$. Otherwise $\tau$ is called limit point (l.p.) at $a$ and similarly for $b$.

Example. If $\tau$ is regular at $a$, it is limit circle at $a$. Since
$$W_a(v, f) = (pf')(a)\, v(a) - (pv')(a)\, f(a), \tag{9.18}$$
any real-valued $v$ with $(v(a), (pv')(a)) \ne (0,0)$ works.

Note that if $W_a(f, v) \ne 0$, then $W_a(f, \mathrm{Re}(v)) \ne 0$ or $W_a(f, \mathrm{Im}(v)) \ne 0$. Hence it is no restriction to assume that $v$ is real and $W_a(v^*, v) = 0$ is trivially satisfied in this case. In particular, $\tau$ is limit point if and only if $W_a(f, g) = 0$ for all $f, g \in D(\tau)$.
Theorem 9.6. If $\tau$ is l.c. at $a$, then let $v \in D(\tau)$ with $W_a(v^*, v) = 0$ and $W_a(v, f) \ne 0$ for some $f \in D(\tau)$. Similarly, if $\tau$ is l.c. at $b$, let $w$ be an analogous function. Then the operator
$$A : D(A) \to L^2(I, r\,dx), \qquad f \mapsto \tau f, \tag{9.19}$$
with
$$D(A) = \{f \in D(\tau)\,|\, W_a(v, f) = 0 \text{ if l.c. at } a,\ W_b(w, f) = 0 \text{ if l.c. at } b\} \tag{9.20}$$
is self-adjoint. Moreover, the set
$$D_1 = \{f \in D(\tau)\,|\, \exists x_0 \in I: \forall x \in (a, x_0),\ W_x(v, f) = 0;\ \exists x_1 \in I: \forall x \in (x_1, b),\ W_x(w, f) = 0\} \tag{9.21}$$
is a core for $A$.
Proof. By Lemma 9.5, $A$ is symmetric and hence $A \subseteq A^* \subseteq A_0^*$. Let $g \in D(A^*)$. As in the computation of $\overline{A_0}$ we conclude $W_a(f, g) = W_b(f, g) = 0$ for all $f \in D(A)$. Moreover, we can choose $f$ such that it coincides with $v$ near $a$ and hence $W_a(v, g) = 0$. Similarly $W_b(w, g) = 0$; that is, $g \in D(A)$.

To see that $D_1$ is a core, let $A_1$ be the corresponding operator and observe that the argument from above, with $A_1$ in place of $A$, shows $A_1^* = A$. $\square$

The name limit circle, respectively, limit point, stems from the original approach of Weyl, who considered the set of solutions of $\tau u = zu$, $z \in \mathbb{C}\setminus\mathbb{R}$, which satisfy $W_x(u^*, u) = 0$. They can be shown to lie on a circle which converges to a circle, respectively, a point, as $x \to a$ or $x \to b$ (see Problem 9.9).
Before proceeding, let us shed some light on the number of possible boundary conditions. Suppose $\tau$ is l.c. at $a$ and let $u_1, u_2$ be two solutions of $\tau u = 0$ with $W(u_1, u_2) = 1$. Abbreviate
$$BC_x^j(f) = W_x(u_j, f), \qquad f \in D(\tau). \tag{9.22}$$
Let $v$ be as in Theorem 9.6. Then, using Lemma 9.5, it is not hard to see that
$$W_a(v, f) = 0 \quad\Leftrightarrow\quad \cos(\alpha)\, BC_a^1(f) - \sin(\alpha)\, BC_a^2(f) = 0, \tag{9.23}$$
where $\tan(\alpha) = \frac{BC_a^1(v)}{BC_a^2(v)}$. Hence all possible boundary conditions can be parametrized by $\alpha \in [0, \pi)$. If $\tau$ is regular at $a$ and if we choose $u_1(a) = (p u_2')(a) = 1$ and $(p u_1')(a) = u_2(a) = 0$, then
$$BC_a^1(f) = f(a), \qquad BC_a^2(f) = (pf')(a), \tag{9.24}$$
and the boundary condition takes the simple form
$$\sin(\alpha)\,(pf')(a) - \cos(\alpha)\, f(a) = 0. \tag{9.25}$$
The most common choice of $\alpha = 0$ is known as the Dirichlet boundary condition $f(a) = 0$. The choice $\alpha = \pi/2$ is known as the Neumann boundary condition $(pf')(a) = 0$.
Finally, note that if $\tau$ is l.c. at both $a$ and $b$, then Theorem 9.6 does not give all possible self-adjoint extensions. For example, one could also choose
$$BC_a^1(f) = \mathrm{e}^{\mathrm{i}\varphi}\, BC_b^1(f), \qquad BC_a^2(f) = \mathrm{e}^{\mathrm{i}\varphi}\, BC_b^2(f). \tag{9.26}$$
The case $\varphi = 0$ gives rise to periodic boundary conditions in the regular case.
Next we want to compute the resolvent of $A$.

Lemma 9.7. Suppose $z \in \rho(A)$. Then there exists a solution $u_a(z,x)$ of $(\tau - z)u = 0$ which is in $L^2((a,c), r\,dx)$ and which satisfies the boundary condition at $a$ if $\tau$ is l.c. at $a$. Similarly, there exists a solution $u_b(z,x)$ with the analogous properties near $b$.

The resolvent of $A$ is given by
$$(A - z)^{-1} g(x) = \int_a^b G(z, x, y)\, g(y)\, r(y)dy, \tag{9.27}$$
where
$$G(z, x, y) = \frac{1}{W(u_b(z), u_a(z))} \begin{cases} u_b(z,x)\, u_a(z,y), & y \le x,\\ u_a(z,x)\, u_b(z,y), & y \ge x. \end{cases} \tag{9.28}$$
Proof. Let $g \in L^2_c(I, r\,dx)$ be real-valued and consider $f = (A - z)^{-1} g \in D(A)$. Since $(\tau - z)f = 0$ near $a$, respectively $b$, we obtain $u_a(z,x)$ by setting it equal to $f$ near $a$ and using the differential equation to extend it to the rest of $I$. Similarly we obtain $u_b$. The only problem is that $u_a$ or $u_b$ might be identically zero. Hence we need to show that this can be avoided by choosing $g$ properly.

Fix $z$ and let $g$ be supported in $(c,d) \subset I$. Since $(\tau - z)f = g$, we have
$$f(x) = u_1(x)\left(\alpha + \int_a^x u_2\, g\, r\,dy\right) + u_2(x)\left(\beta + \int_x^b u_1\, g\, r\,dy\right). \tag{9.29}$$
Near $a$ ($x < c$) we have $f(x) = \alpha u_1(x) + \tilde{\beta} u_2(x)$ and near $b$ ($x > d$) we have $f(x) = \tilde{\alpha} u_1(x) + \beta u_2(x)$, where $\tilde{\alpha} = \alpha + \int_a^b u_2\, g\, r\,dy$ and $\tilde{\beta} = \beta + \int_a^b u_1\, g\, r\,dy$. If $f$ vanishes identically near both $a$ and $b$, we must have $\alpha = \beta = \tilde{\alpha} = \tilde{\beta} = 0$ and thus $\alpha = \beta = 0$ and $\int_a^b u_j(y)\, g(y)\, r(y)dy = 0$, $j = 1,2$. This case can be avoided by choosing a suitable $g$ and hence there is at least one solution, say $u_b(z)$.

Now choose $u_1 = u_b$ and consider the behavior near $b$. If $u_2$ is not square integrable on $(d,b)$, we must have $\beta = 0$ since $\beta u_2 = f - \tilde{\alpha} u_b$ is. If $u_2$ is square integrable, we can find two functions in $D(\tau)$ which coincide with $u_b$ and $u_2$ near $b$. Since $W(u_b, u_2) = 1$, we see that $\tau$ is l.c. at $b$ and hence $0 = W_b(u_b, f) = W_b(u_b, \tilde{\alpha} u_b + \beta u_2) = \beta$. Thus $\beta = 0$ in both cases and we have
$$f(x) = u_b(x)\left(\alpha + \int_a^x u_2\, g\, r\,dy\right) + u_2(x) \int_x^b u_b\, g\, r\,dy.$$
Now choosing $g$ such that $\int_a^b u_b\, g\, r\,dy \ne 0$, we infer the existence of $u_a(z)$. Choosing $u_2 = u_a$ and arguing as before, we see $\alpha = 0$ and hence
$$f(x) = u_b(x) \int_a^x u_a(y)\, g(y)\, r(y)dy + u_a(x) \int_x^b u_b(y)\, g(y)\, r(y)dy = \int_a^b G(z, x, y)\, g(y)\, r(y)dy$$
for any $g \in L^2_c(I, r\,dx)$. Since this set is dense, the claim follows. $\square$
Example. If $\tau$ is regular at $a$ with a boundary condition as in the previous example, we can choose $u_a(z,x)$ to be the solution corresponding to the initial conditions $(u_a(z,a), (p u_a')(z,a)) = (\sin(\alpha), \cos(\alpha))$. In particular, $u_a(z,x)$ exists for all $z \in \mathbb{C}$.

If $\tau$ is regular at both $a$ and $b$, there is a corresponding solution $u_b(z,x)$, again for all $z$. So the only values of $z$ for which $(A - z)^{-1}$ does not exist must be those with $W(u_b(z), u_a(z)) = 0$. However, in this case $u_a(z,x)$ and $u_b(z,x)$ are linearly dependent and $u_a(z,x) = \gamma\, u_b(z,x)$ satisfies both boundary conditions. That is, $z$ is an eigenvalue in this case.

In particular, regular operators have pure point spectrum. We will see in Theorem 9.10 below that this holds for any operator which is l.c. at both endpoints.
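For a concrete regular example (my illustration, with $\tau = -\frac{d^2}{dx^2}$ on $(0, \pi)$ and Dirichlet conditions at both ends), one can take $u_a(z,x) = \sin(\sqrt{z}\, x)/\sqrt{z}$ and $u_b(z,x) = \sin(\sqrt{z}\,(\pi - x))/\sqrt{z}$, so $W(u_b(z), u_a(z)) = \sin(\sqrt{z}\,\pi)/\sqrt{z}$ vanishes exactly at the eigenvalues $z = n^2$, the points where the resolvent (9.27) fails to exist.

```python
import math

def wronskian(z, x=0.3):
    # modified Wronskian p(u_b u_a' - u_b' u_a) with p = 1; it does not
    # depend on the evaluation point x, cf. (9.6)
    k = math.sqrt(z)
    ua, dua = math.sin(k * x) / k, math.cos(k * x)
    ub, dub = math.sin(k * (math.pi - x)) / k, -math.cos(k * (math.pi - x))
    return ub * dua - dub * ua

for n in (1, 2, 3):
    assert abs(wronskian(n * n)) < 1e-9          # eigenvalue: W vanishes
    assert abs(wronskian(n * n + 0.5)) > 1e-3    # not an eigenvalue
print("zeros of W(u_b, u_a) match the Dirichlet eigenvalues n^2")
```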
In the previous example $u_a(z,x)$ is holomorphic with respect to $z$ and satisfies $u_a(z,x)^* = u_a(z^*, x)$ (since it corresponds to real initial conditions and our differential equation has real coefficients). In general we have:

Lemma 9.8. Suppose $z \in \rho(A)$. Then $u_a(z,x)$ from the previous lemma can be chosen locally holomorphic with respect to $z$ such that
$$u_a(z,x)^* = u_a(z^*, x) \tag{9.30}$$
and similarly for $u_b(z,x)$.
Proof. Since this is a local property near $a$, we can assume $b$ is regular and choose $u_b(z,x)$ such that $(u_b(z,b), (p u_b')(z,b)) = (\sin(\beta), \cos(\beta))$ as in the example above. In addition, choose a second solution $v_b(z,x)$ such that $(v_b(z,b), (p v_b')(z,b)) = (-\cos(\beta), \sin(\beta))$ and observe $W(u_b(z), v_b(z)) = 1$. If $z \in \rho(A)$, $z$ is no eigenvalue and hence $u_a(z,x)$ cannot be a multiple of $u_b(z,x)$. Thus we can set
$$u_a(z,x) = v_b(z,x) + m(z)\, u_b(z,x)$$
and it remains to show that $m(z)$ is holomorphic with $m(z)^* = m(z^*)$.

Choosing $h$ with compact support in $(a,c)$ and $g$ with support in $(c,b)$, we have
$$\langle h, (A - z)^{-1} g\rangle = \langle h, u_a(z)\rangle \langle g^*, u_b(z)\rangle = \big(\langle h, v_b(z)\rangle + m(z)\, \langle h, u_b(z)\rangle\big) \langle g^*, u_b(z)\rangle$$
(with a slight abuse of notation since $u_b$, $v_b$ might not be square integrable). Choosing (real-valued) functions $h$ and $g$ such that $\langle h, u_b(z)\rangle \langle g^*, u_b(z)\rangle \ne 0$, we can solve for $m(z)$:
$$m(z) = \frac{\langle h, (A - z)^{-1} g\rangle - \langle h, v_b(z)\rangle \langle g^*, u_b(z)\rangle}{\langle h, u_b(z)\rangle \langle g^*, u_b(z)\rangle}.$$
This finishes the proof. $\square$
Example. We already know that $\tau = -\frac{d^2}{dx^2}$ on $I = (-\infty, \infty)$ gives rise to the free Schrödinger operator $H_0$. Furthermore,
$$u_\pm(z, x) = \mathrm{e}^{\mp\sqrt{-z}\, x}, \qquad z \in \mathbb{C}, \tag{9.31}$$
are two linearly independent solutions (for $z \ne 0$) and since $\mathrm{Re}(\sqrt{-z}) > 0$ for $z \in \mathbb{C}\setminus[0,\infty)$, there is precisely one solution (up to a constant multiple) which is square integrable near $\pm\infty$, namely $u_\pm$. In particular, the only choice for $u_a$ is $u_-$ and for $u_b$ is $u_+$ and we get
$$G(z, x, y) = \frac{1}{2\sqrt{-z}}\, \mathrm{e}^{-\sqrt{-z}\, |x - y|}, \tag{9.32}$$
which we already found in Section 7.4.
If, as in the previous example, there is only one square integrable solution, there is no choice for $G(z,x,y)$. But since different boundary conditions must give rise to different resolvents, there is no room for boundary conditions in this case. This indicates a connection between our l.c., l.p. distinction and square integrability of solutions.
Theorem 9.9 (Weyl alternative). The operator $\tau$ is l.c. at $a$ if and only if for one $z_0 \in \mathbb{C}$ all solutions of $(\tau - z_0)u = 0$ are square integrable near $a$. This then holds for all $z \in \mathbb{C}$ and similarly for $b$.
Proof. If all solutions are square integrable near $a$, $\tau$ is l.c. at $a$ since the Wronskian of two linearly independent solutions does not vanish.

Conversely, take two functions $v, \tilde{v} \in D(\tau)$ with $W_a(v, \tilde{v}) \ne 0$. By considering real and imaginary parts, it is no restriction to assume that $v$ and $\tilde{v}$ are real-valued. Thus they give rise to two different self-adjoint operators $A$ and $\tilde{A}$ (choose any fixed $w$ for the other endpoint). Let $u_a$ and $\tilde{u}_a$ be the corresponding solutions from above. Then $W(u_a, \tilde{u}_a) \ne 0$ (since otherwise $A = \tilde{A}$ by Lemma 9.5) and thus there are two linearly independent solutions which are square integrable near $a$. Since any other solution can be written as a linear combination of those two, every solution is square integrable near $a$.

It remains to show that all solutions of $(\tau - z)u = 0$ for all $z \in \mathbb{C}$ are square integrable near $a$ if $\tau$ is l.c. at $a$. In fact, the above argument ensures this for every $z \in \rho(A) \cap \rho(\tilde{A})$, that is, at least for all $z \in \mathbb{C}\setminus\mathbb{R}$.

Suppose $(\tau - z)u = 0$ and choose two linearly independent solutions $u_j$, $j = 1,2$, of $(\tau - z_0)u = 0$ with $W(u_1, u_2) = 1$. Using $(\tau - z_0)u = (z - z_0)u$ and (9.10), we have ($a < c < x < b$)
$$u(x) = \alpha u_1(x) + \beta u_2(x) + (z - z_0) \int_c^x \big(u_1(x) u_2(y) - u_1(y) u_2(x)\big)\, u(y)\, r(y)\,dy.$$
Since $u_j \in L^2((c,b), r\,dx)$, we can find a constant $M \ge 0$ such that
$$\int_c^b |u_{1,2}(y)|^2\, r(y)\,dy \le M.$$
Now choose $c$ close to $b$ such that $|z - z_0| M^2 \le \frac{1}{4}$. Next, estimating the integral using Cauchy–Schwarz gives
$$\left|\int_c^x \big(u_1(x) u_2(y) - u_1(y) u_2(x)\big)\, u(y)\, r(y)\,dy\right|^2 \le \int_c^x |u_1(x) u_2(y) - u_1(y) u_2(x)|^2\, r(y)\,dy \int_c^x |u(y)|^2\, r(y)\,dy$$
$$\le M \big(|u_1(x)| + |u_2(x)|\big)^2 \int_c^x |u(y)|^2\, r(y)\,dy$$
and hence
$$\int_c^x |u(y)|^2\, r(y)\,dy \le (|\alpha|^2 + |\beta|^2) M + 2|z - z_0| M^2 \int_c^x |u(y)|^2\, r(y)\,dy \le (|\alpha|^2 + |\beta|^2) M + \frac{1}{2} \int_c^x |u(y)|^2\, r(y)\,dy.$$
Thus
$$\int_c^x |u(y)|^2\, r(y)\,dy \le 2(|\alpha|^2 + |\beta|^2) M$$
and since $u \in AC_{\mathrm{loc}}(I)$, we have $u \in L^2((c,b), r\,dx)$ for every $c \in (a,b)$. $\square$
Now we turn to the investigation of the spectrum of $A$. If $\tau$ is l.c. at both endpoints, then the spectrum of $A$ is very simple.

Theorem 9.10. If $\tau$ is l.c. at both endpoints, then the resolvent is a Hilbert–Schmidt operator; that is,
$$\int_a^b \int_a^b |G(z, x, y)|^2\, r(y)dy\, r(x)dx < \infty. \tag{9.33}$$
In particular, the spectrum of any self-adjoint extension is purely discrete and the eigenfunctions (which are simple) form an orthonormal basis.
Proof. This follows from the estimate
$$\int_a^b \left(\int_a^x |u_b(x)\, u_a(y)|^2\, r(y)dy + \int_x^b |u_b(y)\, u_a(x)|^2\, r(y)dy\right) r(x)dx \le 2 \int_a^b |u_a(y)|^2\, r(y)dy \int_a^b |u_b(y)|^2\, r(y)dy,$$
which shows that the resolvent is Hilbert–Schmidt and hence compact.

Note that all eigenvalues are simple. If $\tau$ is l.p. at one endpoint, this is clear, since there is at most one solution of $(\tau - \lambda)u = 0$ which is square integrable near this endpoint. If $\tau$ is l.c., this also follows since the fact that two solutions of $(\tau - \lambda)u = 0$ satisfy the same boundary condition implies that their Wronskian vanishes. $\square$
9.2. Weyls limit circle, limit point alternative 193
If is not l.c., the situation is more complicated and we can only say
something about the essential spectrum.
Theorem 9.11. All self-adjoint extensions have the same essential spec-
trum. Moreover, if A
ac
and A
cb
are self-adjoint extensions of restricted to
(a, c) and (c, b) (for any c I), then

ess
(A) =
ess
(A
ac
)
ess
(A
cb
). (9.34)
Proof. Since ( i)u = 0 has two linearly independent solutions, the defect
indices are at most two (they are zero if is l.p. at both endpoints, one if
is l.c. at one and l.p. at the other endpoint, and two if is l.c. at both
endpoints). Hence the rst claim follows from Theorem 6.20.
For the second claim restrict to the functions with compact support
in (a, c) (c, d). Then, this operator is the orthogonal sum of the operators
A
0,ac
and A
0,cb
. Hence the same is true for the adjoints and hence the defect
indices of A
0,ac
A
0,cb
are at most four. Now note that A and A
ac
A
cb
are both self-adjoint extensions of this operator. Thus the second claim also
follows from Theorem 6.20.
In particular, this result implies that for the essential spectrum only the behaviour near the endpoints $a$ and $b$ is relevant.

Another useful result to determine if $q$ is relatively compact is the following:

Lemma 9.12. Suppose $k \in L^2_{\mathrm{loc}}((a,b), r\,dx)$. Then $k R_A(z)$ is Hilbert–Schmidt if and only if
$$\|k R_A(z)\|_2^2 = \frac{1}{\mathrm{Im}(z)} \int_a^b |k(x)|^2\, \mathrm{Im}(G(z, x, x))\, r(x)dx \tag{9.35}$$
is finite.
Proof. From the first resolvent formula we have
$$G(z, x, y) - G(z', x, y) = (z - z') \int_a^b G(z, x, t)\, G(z', t, y)\, r(t)dt.$$
Setting $x = y$ and $z' = z^*$, we obtain
$$\mathrm{Im}(G(z, x, x)) = \mathrm{Im}(z) \int_a^b |G(z, x, t)|^2\, r(t)dt. \tag{9.36}$$
Using this last formula to compute the Hilbert–Schmidt norm proves the lemma. $\square$
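Identity (9.36) can be checked in closed form against the free resolvent (9.32): with $k = \sqrt{-z}$, $\mathrm{Re}\,k > 0$, one has $G(z,x,x) = \frac{1}{2k}$ and $\int |G(z,x,t)|^2\,dt = \frac{1}{4|k|^2\,\mathrm{Re}\,k}$. The following sketch verifies the resulting scalar identity for a few $z$.

```python
import cmath

# Im G(z, x, x) = Im(z) * int |G(z, x, t)|^2 dt for the free resolvent
for z in (1j, -2 + 1j, 0.5 + 3j):
    k = cmath.sqrt(-z)              # principal branch; Re k > 0 here
    lhs = (1 / (2 * k)).imag        # Im G(z, x, x)
    rhs = z.imag / (4 * abs(k) ** 2 * k.real)
    assert abs(lhs - rhs) < 1e-12
print("(9.36) holds for the free resolvent")
```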
Problem 9.5. Compute the spectrum and the resolvent of $\tau = -\frac{d^2}{dx^2}$, $I = (0, \infty)$ defined on $D(A) = \{f \in D(\tau)\,|\, f(0) = 0\}$.
Problem 9.6. Suppose $\tau$ is given on $(a, \infty)$, where $a$ is a regular endpoint. Suppose there are two solutions $u_\pm$ of $\tau u = zu$ satisfying $r(x)^{1/2} |u_\pm(x)| \le C \mathrm{e}^{\mp\alpha x}$ for some $C, \alpha > 0$. Then $z$ is not in the essential spectrum of any self-adjoint operator corresponding to $\tau$. (Hint: You can take any self-adjoint extension, say the one for which $u_a = u_-$ and $u_b = u_+$. Write down what you expect the resolvent to be and show that it is a bounded operator by comparison with the resolvent from the previous problem.)
Problem 9.7. Suppose $a$ is regular and $\lim_{x \to b} q(x)/r(x) = \infty$. Show that $\sigma_{\mathrm{ess}}(A) = \emptyset$ for every self-adjoint extension. (Hint: Fix some positive constant $n$ and choose $c \in (a,b)$ such that $q(x)/r(x) \ge n$ in $(c,b)$ and use Theorem 9.11.)
Problem 9.8 (Approximation by regular operators). Fix functions $v, w \in D(\tau)$ as in Theorem 9.6. Pick $I_m = (c_m, d_m)$ with $c_m \downarrow a$, $d_m \uparrow b$ and define
$$A_m : D(A_m) \to L^2(I_m, r\,dx), \qquad f \mapsto \tau f,$$
where
$$D(A_m) = \{f \in L^2(I_m, r\,dx)\,|\, f, pf' \in AC(I_m),\ \tau f \in L^2(I_m, r\,dx),\ W_{c_m}(v, f) = W_{d_m}(w, f) = 0\}.$$
Then $A_m$ converges to $A$ in the strong resolvent sense as $m \to \infty$. (Hint: Lemma 6.36.)
Problem 9.9 (Weyl circles). Fix $z \in \mathbb{C}\setminus\mathbb{R}$ and $c \in (a,b)$. Introduce
\[ [u]_x = \frac{W_x(u, u^*)}{z - z^*} \in \mathbb{R} \]
and use (9.4) to show that
\[ [u]_x = [u]_c + \int_c^x |u(y)|^2\, r(y)\,dy, \qquad (\tau - z)u = 0. \]
Hence $[u]_x$ is increasing and the limit $\lim_{x\to b} [u]_x$ exists if and only if $u \in L^2((c,b), r\,dx)$.

Let $u_{1,2}$ be two solutions of $(\tau - z)u = 0$ which satisfy $[u_1]_c = [u_2]_c = 0$ and $W(u_1, u_2) = 1$. Then, all (nonzero) solutions $u$ of $(\tau - z)u = 0$ which satisfy $[u]_b \le 0$ can be written as
\[ u = u_2 + m\,u_1, \qquad m \in \mathbb{C}, \]
up to a complex multiple (note $[u_1]_x > 0$ for $x > c$).

Show that
\[ [u_2 + m\,u_1]_x = [u_1]_x \big( |m - M(x)|^2 - R(x)^2 \big), \]
where
\[ M(x) = -\frac{W_x(u_2, u_1^*)}{W_x(u_1, u_1^*)} \]
and
\[ R(x)^2 = \frac{|W_x(u_2, u_1^*)|^2 - W_x(u_2, u_2^*)\, W_x(u_1, u_1^*)}{\big( |z - z^*|\, [u_1]_x \big)^2} = \big( |z - z^*|\, [u_1]_x \big)^{-2}. \]
Hence the numbers $m$ for which $[u]_x = 0$ lie on a circle which either converges to a circle (if $\lim_{x\to b} R(x) > 0$) or to a point (if $\lim_{x\to b} R(x) = 0$) as $x \to b$. Show that $\tau$ is l.c. at $b$ in the first case and l.p. in the second case.
9.3. Spectral transformations I
In this section we want to provide some fundamental tools for investigating
the spectra of Sturm–Liouville operators and, at the same time, give some
nice illustrations of the spectral theorem.
Example. Consider again $\tau = -\frac{d^2}{dx^2}$ on $I = (-\infty,\infty)$. From Section 7.2 we know that the Fourier transform maps the associated operator $H_0$ to the multiplication operator with $p^2$ in $L^2(\mathbb{R})$. To get multiplication by $\lambda$, as in the spectral theorem, we set $p = \sqrt{\lambda}$ and split the Fourier integral into a positive and negative part, that is,
\[ (Uf)(\lambda) = \frac{1}{\sqrt{2\pi}} \begin{pmatrix} \int_{\mathbb{R}} e^{i\sqrt{\lambda}\,x} f(x)\,dx \\ \int_{\mathbb{R}} e^{-i\sqrt{\lambda}\,x} f(x)\,dx \end{pmatrix}, \qquad \lambda \in \sigma(H_0) = [0,\infty). \tag{9.37} \]
Then
\[ U:\ L^2(\mathbb{R}) \to \bigoplus_{j=1}^{2} L^2\Big(\mathbb{R},\ \chi_{[0,\infty)}(\lambda) \frac{d\lambda}{2\sqrt{\lambda}}\Big) \tag{9.38} \]
is the spectral transformation whose existence is guaranteed by the spectral theorem (Lemma 3.4). Note, however, that the measure is not finite. This can be easily fixed if we replace $\exp(\pm i\sqrt{\lambda}\,x)$ by $\chi_{[0,\infty)}(\lambda)\exp(\pm i\sqrt{\lambda}\,x)$. $\diamond$

Note that in the previous example the kernel $e^{\pm i\sqrt{\lambda}\,x}$ of the integral transform $U$ is just a pair of linearly independent solutions of the underlying differential equation (though no eigenfunctions, since they are not square integrable).
More generally, if
\[ U:\ L^2(I, r\,dx) \to L^2(\mathbb{R}, d\mu), \qquad f(x) \mapsto \int_I u(\lambda,x) f(x)\, r(x)\,dx \tag{9.39} \]
is an integral transformation which maps a self-adjoint Sturm–Liouville operator $A$ to multiplication by $\lambda$, then its kernel $u(\lambda,x)$ is a solution of the underlying differential equation. This formally follows from $UAf = \lambda Uf$, which implies
\[ 0 = \int_I u(\lambda,x)\,\big((\tau-\lambda)f\big)(x)\, r(x)\,dx = \int_I \big((\tau-\lambda)u(\lambda,.)\big)(x)\, f(x)\, r(x)\,dx \tag{9.40} \]
and hence $(\tau-\lambda)u(\lambda,.) = 0$.
Lemma 9.13. Suppose
\[ U:\ L^2(I, r\,dx) \to \bigoplus_{j=1}^{k} L^2(\mathbb{R}, d\mu_j) \tag{9.41} \]
is a spectral mapping as in Lemma 3.4. Then $U$ is of the form
\[ (Uf)(\lambda) = \int_a^b u(\lambda,x) f(x)\, r(x)\,dx, \tag{9.42} \]
where $u(\lambda,x) = (u_1(\lambda,x), \dots, u_k(\lambda,x))$ is measurable and for a.e. $\lambda$ (with respect to $\mu_j$) each $u_j(\lambda,.)$ is a solution of $\tau u_j = \lambda u_j$ which satisfies the boundary conditions of $A$ (if any). Here the integral has to be understood as $\int_a^b dx = \lim_{c\downarrow a,\, d\uparrow b} \int_c^d dx$ with limit taken in $\bigoplus_j L^2(\mathbb{R}, d\mu_j)$.

The inverse is given by
\[ (U^{-1}F)(x) = \sum_{j=1}^{k} \int_{\mathbb{R}} u_j(\lambda,x) F_j(\lambda)\, d\mu_j(\lambda). \tag{9.43} \]
Again the integrals have to be understood as $\int_{\mathbb{R}} d\mu_j = \lim_{R\to\infty} \int_{-R}^{R} d\mu_j$ with limits taken in $L^2(I, r\,dx)$.

If the spectral measures are ordered, then the solutions $u_j(\lambda)$, $1 \le j \le l$, are linearly independent for a.e. $\lambda$ with respect to $\mu_l$. In particular, for ordered spectral measures we always have $k \le 2$ and even $k = 1$ if $\tau$ is l.c. at one endpoint.
Proof. Using $U_j R_A(z) = \frac{1}{\lambda - z} U_j$, we have
\[ (U_j f)(\lambda) = (\lambda - z)\, U_j \int_a^b G(z,x,y) f(y)\, r(y)\,dy. \]
If we restrict $R_A(z)$ to a compact interval $[c,d] \subset (a,b)$, then $R_A(z)\chi_{[c,d]}$ is Hilbert–Schmidt since $G(z,x,y)\chi_{[c,d]}(y)$ is square integrable over $(a,b)\times(a,b)$. Hence $U_j \chi_{[c,d]} = (\lambda - z) U_j R_A(z)\chi_{[c,d]}$ is Hilbert–Schmidt as well and by Lemma 6.9 there is a corresponding kernel $u_j^{[c,d]}(\lambda, y)$ such that
\[ (U_j \chi_{[c,d]} f)(\lambda) = \int_a^b u_j^{[c,d]}(\lambda,x) f(x)\, r(x)\,dx. \]
Now take a larger compact interval $[\hat c, \hat d] \supseteq [c,d]$. Then the kernels coincide on $[c,d]$, $u_j^{[c,d]}(\lambda,.) = u_j^{[\hat c,\hat d]}(\lambda,.)\chi_{[c,d]}$, since we have $U_j \chi_{[c,d]} = U_j \chi_{[\hat c,\hat d]} \chi_{[c,d]}$. In particular, there is a kernel $u_j(\lambda,x)$ such that
\[ (U_j f)(\lambda) = \int_a^b u_j(\lambda,x) f(x)\, r(x)\,dx \]
for every $f$ with compact support in $(a,b)$. Since functions with compact support are dense and $U_j$ is continuous, this formula holds for any $f$ provided the integral is understood as the corresponding limit.
Using the fact that $U$ is unitary, $\langle F, Ug\rangle = \langle U^{-1}F, g\rangle$, we see
\[ \sum_j \int_{\mathbb{R}} F_j(\lambda)^* \left( \int_a^b u_j(\lambda,x) g(x)\, r(x)\,dx \right) d\mu_j(\lambda) = \int_a^b (U^{-1}F)(x)^* g(x)\, r(x)\,dx. \]
Interchanging integrals on the right-hand side (which is permitted at least for $g$, $F$ with compact support), the formula for the inverse follows.
Next, from $U_j Af = \lambda U_j f$ we have
\[ \int_a^b u_j(\lambda,x) (\tau f)(x)\, r(x)\,dx = \lambda \int_a^b u_j(\lambda,x) f(x)\, r(x)\,dx \]
for a.e. $\lambda$ and every $f \in D(A_0)$. Restricting everything to $[c,d] \subset (a,b)$, the above equation implies $u_j(\lambda,.)|_{[c,d]} \in D(A_{cd,0}^*)$ and $A_{cd,0}^*\, u_j(\lambda,.)|_{[c,d]} = \lambda\, u_j(\lambda,.)|_{[c,d]}$. In particular, $u_j(\lambda,.)$ is a solution of $\tau u_j = \lambda u_j$. Moreover, if $\tau$ is l.c. near $a$, we can choose $c = a$ and allow all $f \in D(\tau)$ which satisfy the boundary condition at $a$ and vanish identically near $b$.
Finally, assume the $\mu_j$ are ordered and fix $l \le k$. Suppose
\[ \sum_{j=1}^{l} c_j(\lambda) u_j(\lambda,x) = 0. \]
Then we have
\[ \sum_{j=1}^{l} c_j(\lambda) F_j(\lambda) = 0, \qquad F_j = U_j f, \]
for every $f$. Since $U$ is surjective, we can prescribe $F_j$ arbitrarily on $\sigma(\mu_l)$, e.g., $F_j(\lambda) = 1$ for $j = j_0$ and $F_j(\lambda) = 0$ otherwise, which shows $c_{j_0}(\lambda) = 0$. Hence the solutions $u_j(\lambda,x)$, $1 \le j \le l$, are linearly independent for $\lambda \in \sigma(\mu_l)$, which shows $k \le 2$ since there are at most two linearly independent solutions. If $\tau$ is l.c. and $u_j(\lambda,x)$ must satisfy the boundary condition, there is only one linearly independent solution and thus $k = 1$. $\square$
Note that since we can replace $u_j(\lambda,x)$ by $\gamma_j(\lambda)u_j(\lambda,x)$ where $|\gamma_j(\lambda)| = 1$, it is no restriction to assume that $u_j(\lambda,x)$ is real-valued.
For simplicity we will only pursue the case where one endpoint, say $a$, is regular. The general case can often be reduced to this case and will be postponed until Section 9.6.

We choose a boundary condition
\[ \cos(\alpha)f(a) - \sin(\alpha)p(a)f'(a) = 0 \tag{9.44} \]
and introduce two solutions $s(z,x)$ and $c(z,x)$ of $\tau u = zu$ satisfying the initial conditions
\[ s(z,a) = \sin(\alpha), \quad p(a)s'(z,a) = \cos(\alpha), \]
\[ c(z,a) = \cos(\alpha), \quad p(a)c'(z,a) = -\sin(\alpha). \tag{9.45} \]
Note that $s(z,x)$ is the solution which satisfies the boundary condition at $a$; that is, we can choose $u_a(z,x) = s(z,x)$. In fact, if $\tau$ is not regular at $a$ but only l.c., everything below remains valid if one chooses $s(z,x)$ to be a solution satisfying the boundary condition at $a$ and $c(z,x)$ a linearly independent solution with $W(c(z), s(z)) = 1$.
Moreover, in our previous lemma we have $u_1(\lambda,x) = \gamma_a(\lambda)s(\lambda,x)$ and using the rescaling $d\rho(\lambda) = |\gamma_a(\lambda)|^2 d\mu_a(\lambda)$ and $(U_1 f)(\lambda) = \gamma_a(\lambda)(Uf)(\lambda)$, we obtain a unitary map
\[ U:\ L^2(I, r\,dx) \to L^2(\mathbb{R}, d\rho), \qquad (Uf)(\lambda) = \int_a^b s(\lambda,x) f(x)\, r(x)\,dx \tag{9.46} \]
with inverse
\[ (U^{-1}F)(x) = \int_{\mathbb{R}} s(\lambda,x) F(\lambda)\, d\rho(\lambda). \tag{9.47} \]
Note, however, that while this rescaling gets rid of the unknown factor $\gamma_a(\lambda)$, it destroys the normalization of the measure $\rho$. For $\mu_1$ we know $\mu_1(\mathbb{R}) \le 1$ (if the corresponding vector is normalized), but $\rho$ might not even be bounded! In fact, it turns out that $\rho$ is indeed unbounded.
So up to this point we have our spectral transformation $U$ which maps $A$ to multiplication by $\lambda$, but we know nothing about the measure $\rho$. Furthermore, the measure $\rho$ is the object of desire since it contains all the spectral information of $A$. So our next aim must be to compute $\rho$. If $A$ has only pure point spectrum (i.e., only eigenvalues), this is straightforward as the following example shows.
Example. Suppose $E \in \sigma_p(A)$ is an eigenvalue. Then $s(E,x)$ is the corresponding eigenfunction and the same is true for $S_E(\lambda) = (Us(E))(\lambda)$. In particular, $\chi_{\{E\}}(A)s(E,x) = s(E,x)$ shows $S_E(\lambda) = (U\chi_{\{E\}}(A)s(E))(\lambda) = \chi_{\{E\}}(\lambda)S_E(\lambda)$; that is,
\[ S_E(\lambda) = \begin{cases} \|s(E)\|^2, & \lambda = E, \\ 0, & \lambda \ne E. \end{cases} \tag{9.48} \]
Moreover, since $U$ is unitary, we have
\[ \|s(E)\|^2 = \int_a^b s(E,x)^2\, r(x)\,dx = \int_{\mathbb{R}} S_E(\lambda)^2\, d\rho(\lambda) = \|s(E)\|^4 \rho(\{E\}); \tag{9.49} \]
that is, $\rho(\{E\}) = \|s(E)\|^{-2}$. In particular, if $A$ has pure point spectrum (e.g., if $\tau$ is limit circle at both endpoints), we have
\[ d\rho(\lambda) = \sum_{j=1}^{\infty} \frac{1}{\|s(E_j)\|^2}\, d\Theta(\lambda - E_j), \qquad \sigma_p(A) = \{E_j\}_{j=1}^{\infty}, \tag{9.50} \]
where $d\Theta$ is the Dirac measure centered at $0$. For arbitrary $A$, the above formula holds at least for the pure point part $\rho_{pp}$. $\diamond$
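The weights $\rho(\{E_j\}) = \|s(E_j)\|^{-2}$ from (9.49) can be tested numerically. The sketch below uses the hypothetical explicit case $\tau = -d^2/dx^2$ on $(0,\pi)$ with Dirichlet conditions ($\alpha = 0$, so $s(E,x) = \sin(\sqrt{E}\,x)/\sqrt{E}$, eigenvalues $E_n = n^2$, $\|s(E_n)\|^2 = \pi/(2n^2)$) and verifies Parseval's identity $\|f\|^2 = \sum_n |(Uf)(E_n)|^2 \rho(\{E_n\})$ for a sample $f$ (the test function and quadrature parameters are arbitrary choices):

```python
import math

# Midpoint rule on [a, b]; accuracy is ample for the smooth integrands below.
def midpoint(g, a, b, n=20000):
    h = (b - a) / n
    return sum(g(a + (j + 0.5) * h) for j in range(n)) * h

f = lambda x: x * (math.pi - x)          # sample function on (0, pi)
norm2 = midpoint(lambda x: f(x) ** 2, 0.0, math.pi)   # ||f||^2

# Parseval sum with the weights rho({E_n}) = ||s(E_n)||^{-2} = 2 n^2 / pi.
parseval = 0.0
for n in range(1, 61):   # the omitted tail is O(n^{-6}), negligible here
    F_n = midpoint(lambda x: math.sin(n * x) / n * f(x), 0.0, math.pi)  # (Uf)(E_n)
    parseval += F_n ** 2 * (2 * n ** 2 / math.pi)

gap = abs(norm2 - parseval)
```

The agreement confirms that (9.50) with these weights indeed makes $U$ unitary in this example.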
In the general case we have to work a bit harder. Since $c(z,x)$ and $s(z,x)$ are linearly independent solutions,
\[ W(c(z), s(z)) = 1, \tag{9.51} \]
we can write $u_b(z,x) = \gamma_b(z)\big(c(z,x) + m_b(z)s(z,x)\big)$, where
\[ m_b(z) = \frac{\cos(\alpha)p(a)u_b'(z,a) + \sin(\alpha)u_b(z,a)}{\cos(\alpha)u_b(z,a) - \sin(\alpha)p(a)u_b'(z,a)}, \qquad z \in \rho(A), \tag{9.52} \]
is known as the Weyl–Titchmarsh m-function. Note that $m_b(z)$ is holomorphic in $\rho(A)$ and that
\[ m_b(z)^* = m_b(z^*) \tag{9.53} \]
since the same is true for $u_b(z,x)$ (the denominator in (9.52) only vanishes if $u_b(z,x)$ satisfies the boundary condition at $a$, that is, if $z$ is an eigenvalue). Moreover, the constant $\gamma_b(z)$ is of no importance and can be chosen equal to one,
\[ u_b(z,x) = c(z,x) + m_b(z)s(z,x). \tag{9.54} \]
Lemma 9.14. The Weyl m-function is a Herglotz function and satisfies
\[ \mathrm{Im}(m_b(z)) = \mathrm{Im}(z) \int_a^b |u_b(z,x)|^2\, r(x)\,dx, \tag{9.55} \]
where $u_b(z,x)$ is normalized as in (9.54).
Proof. Given two solutions $u(x)$, $v(x)$ of $\tau u = zu$, $\tau v = \hat z v$, respectively, it is straightforward to check
\[ (z - \hat z) \int_a^x u(y)v(y)\, r(y)\,dy = W_x(u,v) - W_a(u,v) \]
(clearly it is true for $x = a$; now differentiate with respect to $x$). Now choose $u(x) = u_b(z,x)$ and $v(x) = u_b(z,x)^* = u_b(z^*,x)$,
\[ 2i\,\mathrm{Im}(z) \int_a^x |u_b(z,y)|^2\, r(y)\,dy = W_x(u_b(z), u_b(z)^*) + 2i\,\mathrm{Im}(m_b(z)), \]
and observe that $W_x(u_b, u_b^*)$ vanishes as $x \to b$, since both $u_b$ and $u_b^*$ are in $D(\tau)$ near $b$. $\square$
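Formula (9.55) can be verified in closed form in the hypothetical explicit case $\tau = -d^2/dx^2$ on $(0,\infty)$ with $\alpha = 0$: there $u_b(z,x) = e^{-\sqrt{-z}\,x}$ is normalized as in (9.54) (since $u_b(z,0) = 1 = c(z,0)$), $m_b(z) = -\sqrt{-z}$, and $\int_0^\infty |u_b(z,x)|^2 dx = (2\,\mathrm{Re}\sqrt{-z})^{-1}$; the sketch below checks the resulting identity at a few sample points:

```python
import cmath

# Check Im(m_b(z)) = Im(z) * ||u_b(z)||^2 for m_b(z) = -sqrt(-z),
# ||u_b(z)||^2 = 1/(2 Re(sqrt(-z))). This is an exact algebraic identity:
# writing w = sqrt(-z) gives Im(z) = -2 Re(w) Im(w), so both sides equal -Im(w).
def gap(z):
    w = cmath.sqrt(-z)                  # Re(w) > 0 for the principal branch
    lhs = (-w).imag                     # Im(m_b(z))
    rhs = z.imag / (2 * w.real)         # Im(z) * ||u_b(z)||^2
    return abs(lhs - rhs)

errs = [gap(z) for z in (1j, 2 + 1j, -3 + 0.5j, 0.3 + 4j)]
```

The errors are at machine precision, as expected for an exact identity.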
Lemma 9.15. Let
\[ G(z,x,y) = \begin{cases} s(z,x)u_b(z,y), & y \ge x, \\ s(z,y)u_b(z,x), & y \le x, \end{cases} \tag{9.56} \]
be the Green function of $A$. Then
\[ (UG(z,x,.))(\lambda) = \frac{s(\lambda,x)}{\lambda - z} \quad\text{and}\quad (U p(x)\partial_x G(z,x,.))(\lambda) = \frac{p(x)s'(\lambda,x)}{\lambda - z} \tag{9.57} \]
for every $x \in (a,b)$ and every $z \in \rho(A)$.
Proof. First of all note that $G(z,x,.) \in L^2((a,b), r\,dx)$ for every $x \in (a,b)$ and $z \in \rho(A)$. Moreover, from $R_A(z)f = U^{-1} \frac{1}{\lambda - z} Uf$ we have
\[ \int_a^b G(z,x,y)f(y)\, r(y)\,dy = \int_{\mathbb{R}} \frac{s(\lambda,x)F(\lambda)}{\lambda - z}\, d\rho(\lambda), \tag{9.58} \]
where $F = Uf$. Here equality is to be understood in $L^2$, that is, for a.e. $x$. However, the left-hand side is continuous with respect to $x$ and so is the right-hand side, at least if $F$ has compact support. Since this set is dense, the first equality follows. Similarly, the second follows after differentiating (9.58) with respect to $x$. $\square$
Corollary 9.16. We have
\[ (U u_b(z))(\lambda) = \frac{1}{\lambda - z}, \tag{9.59} \]
where $u_b(z,x)$ is normalized as in (9.54).

Proof. Choosing $x = a$ in the lemma, we obtain the claim from the first identity if $\sin(\alpha) \ne 0$ and from the second if $\cos(\alpha) \ne 0$. $\square$
Now combining Lemma 9.14 and Corollary 9.16, we infer from unitarity of $U$ that
\[ \mathrm{Im}(m_b(z)) = \mathrm{Im}(z) \int_a^b |u_b(z,x)|^2\, r(x)\,dx = \mathrm{Im}(z) \int_{\mathbb{R}} \frac{1}{|\lambda - z|^2}\, d\rho(\lambda) \tag{9.60} \]
and since a holomorphic function is determined up to a real constant by its imaginary part, we obtain
Theorem 9.17. The Weyl m-function is given by
\[ m_b(z) = d + \int_{\mathbb{R}} \left( \frac{1}{\lambda - z} - \frac{\lambda}{1+\lambda^2} \right) d\rho(\lambda), \qquad d \in \mathbb{R}, \tag{9.61} \]
and
\[ d = \mathrm{Re}(m_b(i)), \qquad \int_{\mathbb{R}} \frac{1}{1+\lambda^2}\, d\rho(\lambda) = \mathrm{Im}(m_b(i)) < \infty. \tag{9.62} \]
Moreover, $\rho$ is given by the Stieltjes inversion formula
\[ \rho(\lambda) = \lim_{\delta\downarrow 0} \lim_{\varepsilon\downarrow 0} \frac{1}{\pi} \int_{\delta}^{\lambda+\delta} \mathrm{Im}(m_b(\mu + i\varepsilon))\, d\mu, \tag{9.63} \]
where
\[ \mathrm{Im}(m_b(\lambda + i\varepsilon)) = \varepsilon \int_a^b |u_b(\lambda + i\varepsilon, x)|^2\, r(x)\,dx. \tag{9.64} \]
Proof. Choosing $z = i$ in (9.60) shows (9.62) and hence the right-hand side of (9.61) is a well-defined holomorphic function in $\mathbb{C}\setminus\mathbb{R}$. By
\[ \mathrm{Im}\left( \frac{1}{\lambda - z} - \frac{\lambda}{1+\lambda^2} \right) = \frac{\mathrm{Im}(z)}{|\lambda - z|^2} \]
its imaginary part coincides with that of $m_b(z)$ and hence equality follows. The Stieltjes inversion formula follows as in the case where the measure is bounded. $\square$
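The inversion formula (9.63) can be illustrated numerically, assuming the explicit m-function $m_b(z) = -\sqrt{-z}$ of $\tau = -d^2/dx^2$ on $(0,\infty)$ with a Dirichlet condition at $0$ (a hypothetical concrete instance), for which $\rho(\lambda) = \frac{2}{3\pi}\lambda^{3/2}$ on $[0,\infty)$:

```python
import cmath, math

# Approximate rho(1) = (1/pi) * int_0^1 Im(m_b(mu + i*eps)) dmu for small eps
# and compare with the closed form (2/(3 pi)) * lam^{3/2}.
eps, lam, n = 1e-6, 1.0, 10000
h = lam / n
approx = sum((-cmath.sqrt(-(j + 0.5) * h - 1j * eps)).imag
             for j in range(n)) * h / math.pi
exact = 2.0 / (3.0 * math.pi) * lam ** 1.5
gap = abs(approx - exact)
```

For $\mu > 0$ one has $\mathrm{Im}(m_b(\mu + i0)) = \sqrt{\mu}$, so the integral reproduces $\rho$ as the theorem asserts.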
Example. Consider $\tau = -\frac{d^2}{dx^2}$ on $I = (0,\infty)$. Then
\[ c(z,x) = \cos(\alpha)\cos(\sqrt{z}\,x) - \frac{\sin(\alpha)}{\sqrt{z}}\sin(\sqrt{z}\,x) \tag{9.65} \]
and
\[ s(z,x) = \sin(\alpha)\cos(\sqrt{z}\,x) + \frac{\cos(\alpha)}{\sqrt{z}}\sin(\sqrt{z}\,x). \tag{9.66} \]
Moreover,
\[ u_b(z,x) = u_b(z,0)\, e^{-\sqrt{-z}\,x} \tag{9.67} \]
and thus
\[ m_b(z) = \frac{\sin(\alpha) - \sqrt{-z}\cos(\alpha)}{\cos(\alpha) + \sqrt{-z}\sin(\alpha)}, \tag{9.68} \]
respectively,
\[ d\rho(\lambda) = \frac{\sqrt{\lambda}}{\pi\big(\cos(\alpha)^2 + \lambda\sin(\alpha)^2\big)}\, \chi_{[0,\infty)}(\lambda)\, d\lambda. \tag{9.69} \]
$\diamond$
Note that in the previous example, if $\alpha \ne 0$, we even have $\int_{\mathbb{R}} \frac{1}{|\lambda - z|}\, d\rho(\lambda) < \infty$ and hence
\[ m_b(z) = -\cot(\alpha) + \int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\rho(\lambda) \tag{9.70} \]
in this case (the factor $-\cot(\alpha)$ follows by considering the limit $|z| \to \infty$ of both sides). Formally this even follows in the general case by choosing $x = a$ in $u_b(z,x) = (U^{-1}\frac{1}{\lambda - z})(x)$; however, since we know equality only for a.e. $x$, a more careful analysis is needed. We will address this problem in the next section.
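The consistency of (9.68) with (9.69) can be spot-checked numerically: for $\lambda > 0$ the boundary value $\mathrm{Im}(m_b(\lambda + i0))$ should equal $\pi$ times the density of $\rho$. The sample points below are arbitrary choices:

```python
import cmath, math

# m_b(z) from (9.68); cmath.sqrt is the principal branch, matching sqrt(-z).
def m_b(z, alpha):
    w = cmath.sqrt(-z)
    return (math.sin(alpha) - w * math.cos(alpha)) / (math.cos(alpha) + w * math.sin(alpha))

worst = 0.0
for alpha in (0.0, 0.3, 1.0):
    for lam in (0.5, 2.0, 9.0):
        boundary = m_b(lam + 1e-9j, alpha).imag       # Im(m_b(lam + i*eps)), eps small
        density = math.sqrt(lam) / (math.cos(alpha) ** 2 + lam * math.sin(alpha) ** 2)
        worst = max(worst, abs(boundary - density))
```

This is exactly the relation $d\rho(\lambda) = \frac{1}{\pi}\mathrm{Im}(m_b(\lambda + i0))\,d\lambda$ underlying the Stieltjes inversion formula.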
Problem 9.10. Show
\[ m_{b,\alpha}(z) = \frac{\cos(\alpha-\beta)\, m_{b,\beta}(z) + \sin(\alpha-\beta)}{\cos(\alpha-\beta) - \sin(\alpha-\beta)\, m_{b,\beta}(z)}. \tag{9.71} \]
(Hint: The case $\beta = 0$ is (9.52).)
Problem 9.11. Let $\phi_0(x)$, $\theta_0(x)$ be two real-valued solutions of $\tau u = \lambda_0 u$ for some fixed $\lambda_0 \in \mathbb{R}$ such that $W(\theta_0, \phi_0) = 1$. We will call $\tau$ quasi-regular at $a$ if the limits
\[ \lim_{x\to a} W_x(\phi_0, u(z)), \qquad \lim_{x\to a} W_x(\theta_0, u(z)) \tag{9.72} \]
exist for every solution $u(z)$ of $\tau u = zu$. Show that this definition is independent of $\lambda_0$ (Hint: Plücker's identity). Show that $\tau$ is quasi-regular at $a$ if it is l.c. at $a$.

Introduce
\[ \phi(z,x) = W_a(c(z), \phi_0)s(z,x) - W_a(s(z), \phi_0)c(z,x), \]
\[ \theta(z,x) = W_a(c(z), \theta_0)s(z,x) - W_a(s(z), \theta_0)c(z,x), \tag{9.73} \]
where $c(z,x)$ and $s(z,x)$ are chosen with respect to some base point $c \in (a,b)$, and a singular Weyl m-function $M_b(z)$ such that
\[ \psi(z,x) = \theta(z,x) + M_b(z)\phi(z,x) \in L^2(c,b). \tag{9.74} \]
Show that all claims from this section still hold true in this case for the operator associated with the boundary condition $W_a(\phi_0, f) = 0$ if $\tau$ is l.c. at $a$.
9.4. Inverse spectral theory

In this section we want to show that the Weyl m-function (respectively, the corresponding spectral measure) uniquely determines the operator. For simplicity we only consider the case $p = r \equiv 1$.

We begin with some asymptotics for large $z$ away from the spectrum. We recall that $\sqrt{z}$ always denotes the branch with $\arg(z) \in (-\pi, \pi]$. We will write $c(z,x) = c_\alpha(z,x)$ and $s(z,x) = s_\alpha(z,x)$ to display the dependence on $\alpha$ whenever necessary.

We first observe (Problem 9.12)
Lemma 9.18. For $\alpha = 0$ we have
\[ c_0(z,x) = \cosh(\sqrt{-z}(x-a)) + O\Big(\tfrac{1}{\sqrt{-z}}\, e^{\mathrm{Re}(\sqrt{-z})(x-a)}\Big), \]
\[ s_0(z,x) = \frac{1}{\sqrt{-z}}\sinh(\sqrt{-z}(x-a)) + O\Big(\tfrac{1}{z}\, e^{\mathrm{Re}(\sqrt{-z})(x-a)}\Big), \tag{9.75} \]
uniformly for $x \in (a,c)$ as $|z| \to \infty$.

Note that for $z \in \mathbb{C}\setminus[0,\infty)$ this can be written as
\[ c_0(z,x) = \tfrac{1}{2}\, e^{\sqrt{-z}(x-a)}\big(1 + O(\tfrac{1}{\sqrt{-z}})\big), \]
\[ s_0(z,x) = \frac{1}{2\sqrt{-z}}\, e^{\sqrt{-z}(x-a)}\big(1 + O(\tfrac{1}{\sqrt{-z}})\big), \tag{9.76} \]
for $\mathrm{Im}(z) \to \infty$ and for $z = \lambda \in [0,\infty)$ we have
\[ c_0(\lambda,x) = \cos(\sqrt{\lambda}(x-a)) + O\Big(\tfrac{1}{\sqrt{\lambda}}\Big), \]
\[ s_0(\lambda,x) = \frac{1}{\sqrt{\lambda}}\sin(\sqrt{\lambda}(x-a)) + O\Big(\tfrac{1}{\lambda}\Big), \tag{9.77} \]
as $\lambda \to \infty$.
From this lemma we obtain

Lemma 9.19. The Weyl m-function satisfies
\[ m_b(z) = \begin{cases} -\cot(\alpha) + O(\tfrac{1}{\sqrt{-z}}), & \alpha \ne 0, \\ -\sqrt{-z} + O(1), & \alpha = 0, \end{cases} \tag{9.78} \]
as $z \to \infty$ in any sector $|\mathrm{Re}(z)| \le C\,|\mathrm{Im}(z)|$.
Proof. As in the proof of Theorem 9.17 we obtain from Lemma 9.15
\[ G(z,x,x) = d(x) + \int_{\mathbb{R}} \left( \frac{1}{\lambda - z} - \frac{\lambda}{1+\lambda^2} \right) s(\lambda,x)^2\, d\rho(\lambda). \]
Hence, since the integrand converges pointwise to $0$, dominated convergence (Problem 9.13) implies $G(z,x,x) = o(z)$ as $z \to \infty$ in any sector $|\mathrm{Re}(z)| \le C\,|\mathrm{Im}(z)|$. Now solving $G(z,x,x) = s(z,x)u_b(z,x)$ for $m_b(z)$ and using the asymptotic expansions from Lemma 9.18, we see
\[ m_b(z) = -\frac{c(z,x)}{s(z,x)} + o\big(z\, e^{-2\mathrm{Re}(\sqrt{-z})(x-a)}\big) \]
from which the claim follows. $\square$
Note that assuming $q \in C^k([a,b))$, one can obtain further asymptotic terms in Lemma 9.18 and hence also in the expansion of $m_b(z)$.

The asymptotics of $m_b(z)$ in turn tell us more about $L^2(\mathbb{R}, d\rho)$.
Lemma 9.20. Let
\[ F(z) = d + \int_{\mathbb{R}} \left( \frac{1}{\lambda - z} - \frac{\lambda}{1+\lambda^2} \right) d\rho(\lambda) \]
be a Herglotz function. Then, for any $0 < \gamma < 2$, we have
\[ \int_{\mathbb{R}} \frac{d\rho(\lambda)}{1 + |\lambda|^\gamma} < \infty \quad\Longleftrightarrow\quad \int_1^\infty \frac{\mathrm{Im}(F(iy))}{y^\gamma}\, dy < \infty. \tag{9.79} \]

Proof. First of all note that we can split $F(z) = F_1(z) + F_2(z)$ according to $d\rho = \chi_{[-1,1]}d\rho + (1 - \chi_{[-1,1]})d\rho$. The part $F_1(z)$ corresponds to a finite measure and does not contribute by Theorem 3.20. Hence we can assume that $\rho$ is not supported near $0$. Then Fubini shows
\[ \int_0^\infty \frac{\mathrm{Im}(F(iy))}{y^\gamma}\, dy = \int_0^\infty \int_{\mathbb{R}} \frac{y^{1-\gamma}}{\lambda^2 + y^2}\, d\rho(\lambda)\, dy = \frac{\pi/2}{\sin(\gamma\pi/2)} \int_{\mathbb{R}} \frac{1}{|\lambda|^\gamma}\, d\rho(\lambda), \]
which proves the claim. Here we have used (Problem 9.14)
\[ \int_0^\infty \frac{y^{1-\gamma}}{\lambda^2 + y^2}\, dy = \frac{\pi/2}{|\lambda|^\gamma \sin(\gamma\pi/2)}. \]
$\square$

For the case $\gamma = 0$ see Theorem 3.20 and for the case $\gamma = 2$ see Problem 9.15.
Corollary 9.21. We have
\[ G(z,x,y) = \int_{\mathbb{R}} \frac{s(\lambda,x)s(\lambda,y)}{\lambda - z}\, d\rho(\lambda), \tag{9.80} \]
where the integrand is integrable. Moreover, for any $\varepsilon > 0$ we have
\[ G(z,x,y) = O\big(z^{-1/2+\varepsilon}\, e^{-\mathrm{Re}(\sqrt{-z})|y-x|}\big), \tag{9.81} \]
as $z \to \infty$ in any sector $|\mathrm{Re}(z)| \le C\,|\mathrm{Im}(z)|$.

Proof. The previous lemma implies $\int s(\lambda,x)^2 (1 + |\lambda|)^{-\gamma}\, d\rho(\lambda) < \infty$ for $\gamma > \frac{1}{2}$. This already proves the first part and also the second in the case $x = y$, and hence the result follows from $|\lambda - z|^{-1} \le \mathrm{const}\; \mathrm{Im}(z)^{-1/2+\varepsilon}(1+\lambda^2)^{-1/4-\varepsilon/2}$ (Problem 9.13) in any sector $|\mathrm{Re}(z)| \le C\,|\mathrm{Im}(z)|$. But the case $x = y$ implies
\[ u_b(z,x) = O\big(z^{-1/2+\varepsilon}\, e^{-\mathrm{Re}(\sqrt{-z})(x-a)}\big), \]
which in turn implies the $x \ne y$ case. $\square$
Now we come to our main result of this section:

Theorem 9.22. Suppose $\tau_j$, $j = 0,1$, are given on $(a,b)$ and both are regular at $a$. Moreover, $A_j$ are some self-adjoint operators associated with $\tau_j$ and the same boundary condition at $a$.

Let $c \in (a,b)$. Then $q_0(x) = q_1(x)$ for $x \in (a,c)$ if and only if for every $\varepsilon > 0$ we have that $m_{1,b}(z) - m_{0,b}(z) = O\big(e^{-2(c-a-\varepsilon)\mathrm{Re}(\sqrt{-z})}\big)$ as $z \to \infty$ along some nonreal ray.
Proof. By (9.75) we have $s_1(z,x)/s_0(z,x) \to 1$ as $z \to \infty$ along any nonreal ray. Moreover, (9.81) in the case $y = x$ shows $s_0(z,x)u_{1,b}(z,x) \to 0$ and $s_1(z,x)u_{0,b}(z,x) \to 0$ as well. In particular, the same is true for the difference
\[ s_1(z,x)c_0(z,x) - s_0(z,x)c_1(z,x) - \big(m_{1,b}(z) - m_{0,b}(z)\big)s_0(z,x)s_1(z,x). \]
Since the first two terms cancel for $x \in (a,c)$, (9.75) implies $m_{1,b}(z) - m_{0,b}(z) = O\big(e^{-2(c-a-\varepsilon)\mathrm{Re}(\sqrt{-z})}\big)$.

To see the converse, first note that the entire function
\[ s_1(z,x)c_0(z,x) - s_0(z,x)c_1(z,x) = s_1(z,x)u_{0,b}(z,x) - s_0(z,x)u_{1,b}(z,x) + \big(m_{1,b}(z) - m_{0,b}(z)\big)s_0(z,x)s_1(z,x) \]
vanishes as $z \to \infty$ along any nonreal ray for fixed $x \in (a,c)$ by the same arguments used before together with the assumption on $m_{1,b}(z) - m_{0,b}(z)$. Moreover, by (9.75) this function has an order of growth $\le 1/2$ and thus by the Phragmén–Lindelöf theorem (e.g., [53, Thm. 4.3.4]) it is bounded on all of $\mathbb{C}$. By Liouville's theorem it must be constant and since it vanishes along rays, it must be zero; that is, $s_1(z,x)c_0(z,x) = s_0(z,x)c_1(z,x)$ for all $z \in \mathbb{C}$ and $x \in (a,c)$. Differentiating this identity with respect to $x$ and using $W(c_j(z), s_j(z)) = 1$ shows $s_1(z,x)^2 = s_0(z,x)^2$. Taking the logarithmic derivative further gives $s_1'(z,x)/s_1(z,x) = s_0'(z,x)/s_0(z,x)$ and differentiating once more shows $s_1''(z,x)/s_1(z,x) = s_0''(z,x)/s_0(z,x)$. This finishes the proof since $q_j(x) = z + s_j''(z,x)/s_j(z,x)$. $\square$
Problem 9.12. Prove Lemma 9.18. (Hint: Without loss set $a = 0$. Now use that
\[ c(z,x) = \cos(\alpha)\cosh(\sqrt{-z}\,x) - \frac{\sin(\alpha)}{\sqrt{-z}}\sinh(\sqrt{-z}\,x) + \frac{1}{\sqrt{-z}} \int_0^x \sinh(\sqrt{-z}(x-y))\, q(y)\, c(z,y)\, dy \]
by Lemma 9.2 and consider $\tilde c(z,x) = e^{-\mathrm{Re}(\sqrt{-z})x} c(z,x)$.)
Problem 9.13. Show
\[ \left| \frac{\lambda}{\lambda - z} \right| \le \frac{|z|}{\mathrm{Im}(z)} \tag{9.82} \]
and
\[ \left| \frac{1}{\lambda - z} - \frac{\lambda}{1+\lambda^2} \right| \le \sqrt{\frac{2}{1+\lambda^2}}\; \frac{(1 + |z|)\,|z|}{\mathrm{Im}(z)^2} \tag{9.83} \]
for any $\lambda \in \mathbb{R}$. (Hint: To obtain the first, search for the maximum as a function of $\lambda$ (cf. also Problem 3.7). The second then follows from the first.)
Problem 9.14. Show
\[ \int_0^\infty \frac{y^{\alpha-1}}{1+y^2}\, dy = \frac{\pi/2}{\sin(\alpha\pi/2)}, \qquad \alpha \in (0,2), \]
by proving
\[ \int_{-\infty}^{\infty} \frac{e^{\alpha x}}{1 + e^x}\, dx = \frac{\pi}{\sin(\alpha\pi)}, \qquad \alpha \in (0,1). \]
(Hint: To compute the last integral, use a contour consisting of the straight lines connecting the points $-R$, $R$, $R + 2\pi i$, $-R + 2\pi i$. Evaluate the contour integral using the residue theorem and let $R \to \infty$. Show that the contributions from the vertical lines vanish in the limit and relate the integrals along the horizontal lines.)
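Before attempting the contour argument, the identity can be checked numerically. Folding $[1,\infty)$ back to $(0,1]$ via $y \mapsto 1/y$ sends $y^{\alpha-1}/(1+y^2)$ to $y^{(2-\alpha)-1}/(1+y^2)$, so the integral equals $\int_0^1 (y^{\alpha-1} + y^{1-\alpha})/(1+y^2)\,dy$, which a midpoint rule handles (the sample values of $\alpha$ are arbitrary):

```python
import math

# Numerical check of int_0^oo y^{alpha-1}/(1+y^2) dy = (pi/2)/sin(pi*alpha/2),
# using the folded form on (0, 1]; the endpoint singularity y^{alpha-1} is
# integrable, and the midpoint rule avoids evaluating at y = 0.
def check(alpha, n=200000):
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * h
        total += (y ** (alpha - 1) + y ** (1 - alpha)) / (1 + y * y)
    return abs(total * h - (math.pi / 2) / math.sin(math.pi * alpha / 2))

errs = [check(a) for a in (0.5, 1.0, 1.4)]
```

For $\alpha = 1$ the formula reduces to the elementary $\int_0^\infty dy/(1+y^2) = \pi/2$, a useful sanity anchor.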
Problem 9.15. In Lemma 9.20 we assumed $0 < \gamma < 2$. Show that in the case $\gamma = 2$ we have
\[ \int_{\mathbb{R}} \frac{\log(1+\lambda^2)}{1+\lambda^2}\, d\rho(\lambda) < \infty \quad\Longleftrightarrow\quad \int_1^\infty \frac{\mathrm{Im}(F(iy))}{y^2}\, dy < \infty. \]
(Hint: $\int_1^\infty \frac{y^{-1}}{\lambda^2 + y^2}\, dy = \frac{\log(1+\lambda^2)}{2\lambda^2}$.)
9.5. Absolutely continuous spectrum

In this section we will show how to locate the absolutely continuous spectrum. We will again assume that $a$ is a regular endpoint. Moreover, we assume that $b$ is l.p. since otherwise the spectrum is discrete and there will be no absolutely continuous spectrum.

In this case we have seen in Section 9.3 that $A$ is unitarily equivalent to multiplication by $\lambda$ in the space $L^2(\mathbb{R}, d\rho)$, where $\rho$ is the measure associated to the Weyl m-function. Hence by Theorem 3.23 we conclude that the set
\[ M_s = \{\lambda \,|\, \limsup_{\varepsilon\downarrow 0} \mathrm{Im}(m_b(\lambda + i\varepsilon)) = \infty\} \tag{9.84} \]
is a support for the singularly continuous part and
\[ M_{ac} = \{\lambda \,|\, 0 < \limsup_{\varepsilon\downarrow 0} \mathrm{Im}(m_b(\lambda + i\varepsilon)) < \infty\} \tag{9.85} \]
is a minimal support for the absolutely continuous part. Moreover, $\sigma(A_{ac})$ can be recovered from the essential closure of $M_{ac}$; that is,
\[ \sigma(A_{ac}) = \overline{M_{ac}}^{\,ess}. \tag{9.86} \]
Compare also Section 3.2.

We now begin our investigation with a crucial estimate on $\mathrm{Im}(m_b(\lambda + i\varepsilon))$. Set
\[ \|f\|_{(a,x)} = \sqrt{\int_a^x |f(y)|^2\, r(y)\,dy}, \qquad x \in (a,b). \tag{9.87} \]
Lemma 9.23. Let
\[ \varepsilon = \big( 2\|s(\lambda)\|_{(a,x)} \|c(\lambda)\|_{(a,x)} \big)^{-1} \tag{9.88} \]
and note that since $b$ is l.p., there is a one-to-one correspondence between $\varepsilon \in (0,\infty)$ and $x \in (a,b)$. Then
\[ 5 - \sqrt{24} \;\le\; |m_b(\lambda + i\varepsilon)|\, \frac{\|s(\lambda)\|_{(a,x)}}{\|c(\lambda)\|_{(a,x)}} \;\le\; 5 + \sqrt{24}. \tag{9.89} \]
Proof. Let $x > a$. Then by Lemma 9.2
\[ u_b(\lambda + i\varepsilon, x) = c(\lambda,x) + m_b(\lambda + i\varepsilon)s(\lambda,x) - i\varepsilon \int_a^x \big( c(\lambda,x)s(\lambda,y) - c(\lambda,y)s(\lambda,x) \big) u_b(\lambda + i\varepsilon, y)\, r(y)\,dy. \]
Hence one obtains after a little calculation (as in the proof of Theorem 9.9)
\[ \|c(\lambda) + m_b(\lambda + i\varepsilon)s(\lambda)\|_{(a,x)} \le \|u_b(\lambda + i\varepsilon)\|_{(a,x)} + 2\varepsilon \|s(\lambda)\|_{(a,x)} \|c(\lambda)\|_{(a,x)} \|u_b(\lambda + i\varepsilon)\|_{(a,x)}. \]
Using the definition of $\varepsilon$ and (9.55), we obtain
\[ \|c(\lambda) + m_b(\lambda + i\varepsilon)s(\lambda)\|^2_{(a,x)} \le 4\|u_b(\lambda + i\varepsilon)\|^2_{(a,x)} \le 4\|u_b(\lambda + i\varepsilon)\|^2_{(a,b)} = \frac{4}{\varepsilon}\, \mathrm{Im}(m_b(\lambda + i\varepsilon)) = 8 \|s(\lambda)\|_{(a,x)} \|c(\lambda)\|_{(a,x)}\, \mathrm{Im}(m_b(\lambda + i\varepsilon)). \]
Combining this estimate with
\[ \|c(\lambda) + m_b(\lambda + i\varepsilon)s(\lambda)\|^2_{(a,x)} \ge \big( \|c(\lambda)\|_{(a,x)} - |m_b(\lambda + i\varepsilon)| \|s(\lambda)\|_{(a,x)} \big)^2 \]
shows $(1-t)^2 \le 8t$, where $t = |m_b(\lambda + i\varepsilon)|\, \|s(\lambda)\|_{(a,x)} \|c(\lambda)\|^{-1}_{(a,x)}$. $\square$
We now introduce the concept of subordinacy. A nonzero solution $u$ of $\tau u = zu$ is called sequentially subordinate at $b$ with respect to another solution $v$ if
\[ \liminf_{x\to b} \frac{\|u\|_{(a,x)}}{\|v\|_{(a,x)}} = 0. \tag{9.90} \]
If the liminf can be replaced by a lim, the solution is called subordinate. Both concepts will eventually lead to the same results (cf. Remark 9.26 below). We will work with (9.90) since this will simplify proofs later on and hence we will drop the additional sequentially.

It is easy to see that if $u$ is subordinate with respect to $v$, then it is subordinate with respect to any linearly independent solution. In particular, a subordinate solution is unique up to a constant. Moreover, if a solution $u$ of $\tau u = \lambda u$, $\lambda \in \mathbb{R}$, is subordinate, then it is real up to a constant, since both the real and the imaginary parts are subordinate. For $z \in \mathbb{C}\setminus\mathbb{R}$ we know that there is always a subordinate solution near $b$, namely $u_b(z,x)$. The following result considers the case $z \in \mathbb{R}$.
Lemma 9.24. Let $\lambda \in \mathbb{R}$. There is a subordinate solution $u(\lambda)$ near $b$ if and only if there is a sequence $\varepsilon_n \downarrow 0$ such that $m_b(\lambda + i\varepsilon_n)$ converges to a limit in $\mathbb{R} \cup \{\infty\}$ as $n \to \infty$. Moreover,
\[ \lim_{n\to\infty} m_b(\lambda + i\varepsilon_n) = \frac{\cos(\alpha)p(a)u'(\lambda,a) + \sin(\alpha)u(\lambda,a)}{\cos(\alpha)u(\lambda,a) - \sin(\alpha)p(a)u'(\lambda,a)} \tag{9.91} \]
in this case (compare (9.52)).
Proof. We will consider the number $\alpha$ fixing the boundary condition as a parameter and write $s_\alpha(z,x)$, $c_\alpha(z,x)$, $m_{b,\alpha}$, etc., to emphasize the dependence on $\alpha$.

Every solution can (up to a constant) be written as $s_\beta(\lambda,x)$ for some $\beta \in [0,\pi)$. But by Lemma 9.23, $s_\beta(\lambda,x)$ is subordinate if and only if there is a sequence $\varepsilon_n \downarrow 0$ such that $\lim_{n\to\infty} m_{b,\beta}(\lambda + i\varepsilon_n) = \infty$ and by (9.71) this is the case if and only if
\[ \lim_{n\to\infty} m_{b,\alpha}(\lambda + i\varepsilon_n) = \lim_{n\to\infty} \frac{\cos(\alpha-\beta)\, m_{b,\beta}(\lambda + i\varepsilon_n) + \sin(\alpha-\beta)}{\cos(\alpha-\beta) - \sin(\alpha-\beta)\, m_{b,\beta}(\lambda + i\varepsilon_n)} = -\cot(\alpha-\beta) \]
is a number in $\mathbb{R} \cup \{\infty\}$. $\square$
We are interested in $N(\tau)$, the set of all $\lambda \in \mathbb{R}$ for which no subordinate solution exists, that is,
\[ N(\tau) = \{\lambda \in \mathbb{R} \,|\, \text{no solution of } \tau u = \lambda u \text{ is subordinate at } b\} \tag{9.92} \]
and the sets
\[ S_\alpha(\tau) = \{\lambda \,|\, s_\alpha(\lambda,x) \text{ is subordinate at } b\}. \tag{9.93} \]
From the previous lemma we obtain

Corollary 9.25. We have $\lambda \in N(\tau)$ if and only if
\[ \liminf_{\varepsilon\downarrow 0} \mathrm{Im}(m_b(\lambda + i\varepsilon)) > 0 \quad\text{and}\quad \limsup_{\varepsilon\downarrow 0} |m_b(\lambda + i\varepsilon)| < \infty. \]
Similarly, $\lambda \in S_\alpha(\tau)$ if and only if $\limsup_{\varepsilon\downarrow 0} |m_{b,\alpha}(\lambda + i\varepsilon)| = \infty$.
Remark 9.26. Since the set, for which the limit $\lim_{\varepsilon\downarrow 0} m_b(\lambda + i\varepsilon)$ does not exist, is of zero spectral and Lebesgue measure (Corollary 3.25), changing the lim in (9.90) to a liminf will affect $N(\tau)$ only on such a set (which is irrelevant for our purpose). Moreover, by (9.71) the set where the limit exists (finitely or infinitely) is independent of the boundary condition $\alpha$.
Then, as a consequence of the previous corollary, we have

Theorem 9.27. The set $N(\tau) \subseteq M_{ac}$ is a minimal support for the absolutely continuous spectrum of $H$. In particular,
\[ \sigma_{ac}(H) = \overline{N(\tau)}^{\,ess}. \tag{9.94} \]
Moreover, the set $S_\alpha(\tau) \subseteq M_s$ is a minimal support for the singular spectrum of $H$.
Proof. By our corollary we have $N(\tau) \subseteq M_{ac}$. Moreover, if $\lambda \in M_{ac}\setminus N(\tau)$, then either $0 = \liminf \mathrm{Im}(m_b) < \limsup \mathrm{Im}(m_b)$ or $\limsup \mathrm{Re}(m_b) = \infty$. The first case can only happen on a set of Lebesgue measure zero by Theorem 3.23 and the same is true for the second by Corollary 3.25.

Similarly, by our corollary we also have $S_\alpha(\tau) \subseteq M_s$ and $\lambda \in S_\alpha(\tau)\setminus M_s$ happens precisely when $\limsup \mathrm{Re}(m_b) = \infty$, which can only happen on a set of Lebesgue measure zero by Corollary 3.25. $\square$
Note that if $(\lambda_1, \lambda_2) \subseteq N(\tau)$, then the spectrum of any self-adjoint extension $H$ of $\tau$ is purely absolutely continuous in the interval $(\lambda_1, \lambda_2)$.
Example. Consider $H_0 = -\frac{d^2}{dx^2}$ on $(0,\infty)$ with a Dirichlet boundary condition at $x = 0$. Then it is easy to check $H_0 \ge 0$ and $N(\tau_0) = (0,\infty)$. Hence $\sigma_{ac}(H_0) = [0,\infty)$. Moreover, since the singular spectrum is supported on $[0,\infty)\setminus N(\tau_0) = \{0\}$, we see $\sigma_{sc}(H_0) = \emptyset$ (since the singular continuous spectrum cannot be supported on a finite set) and $\sigma_{pp}(H_0) \subseteq \{0\}$. Since $0$ is no eigenvalue, we have $\sigma_{pp}(H_0) = \emptyset$. $\diamond$
Problem 9.16. Determine the spectrum of $H_0 = -\frac{d^2}{dx^2}$ on $(0,\infty)$ with a general boundary condition (9.44) at $a = 0$.
9.6. Spectral transformations II

In Section 9.3 we have looked at the case of one regular endpoint. In this section we want to remove this restriction. In the case of a regular endpoint (or more generally an l.c. endpoint), the choice of $u(\lambda,x)$ in Lemma 9.13 was dictated by the fact that $u(\lambda,x)$ is required to satisfy the boundary condition at the regular (l.c.) endpoint. We begin by showing that in the general case we can choose any pair of linearly independent solutions. We will choose some arbitrary point $c \in I$ and two linearly independent solutions according to the initial conditions
\[ c(z,c) = 1, \quad p(c)c'(z,c) = 0, \qquad s(z,c) = 0, \quad p(c)s'(z,c) = 1. \tag{9.95} \]
We will abbreviate
\[ \mathbf{s}(z,x) = \begin{pmatrix} c(z,x) \\ s(z,x) \end{pmatrix}. \tag{9.96} \]
Lemma 9.28. There is a measure $d\rho(\lambda)$ and a nonnegative matrix $R(\lambda)$ with trace one such that
\[ U:\ L^2(I, r\,dx) \to L^2(\mathbb{R}, R\,d\rho), \qquad f(x) \mapsto \int_a^b \mathbf{s}(\lambda,x) f(x)\, r(x)\,dx \tag{9.97} \]
is a spectral mapping as in Lemma 9.13. As before, the integral has to be understood as $\int_a^b dx = \lim_{c\downarrow a,\, d\uparrow b} \int_c^d dx$ with limit taken in $L^2(\mathbb{R}, R\,d\rho)$, where $L^2(\mathbb{R}, R\,d\rho)$ is the Hilbert space of all $\mathbb{C}^2$-valued measurable functions with scalar product
\[ \langle f, g\rangle = \int_{\mathbb{R}} f^* R\, g\, d\rho. \tag{9.98} \]
The inverse is given by
\[ (U^{-1}F)(x) = \int_{\mathbb{R}} \mathbf{s}(\lambda,x) R(\lambda) F(\lambda)\, d\rho(\lambda). \tag{9.99} \]
Proof. Let $U_0$ be a spectral transformation as in Lemma 9.13 with corresponding real solutions $u_j(\lambda,x)$ and measures $d\mu_j(\lambda)$, $1 \le j \le k$. Without loss of generality we can assume $k = 2$ since we can always choose $d\mu_2 = 0$ and $u_2(\lambda,x)$ such that $u_1$ and $u_2$ are linearly independent.

Now define the $2\times 2$ matrix $C(\lambda)$ via
\[ \begin{pmatrix} u_1(\lambda,x) \\ u_2(\lambda,x) \end{pmatrix} = C(\lambda) \begin{pmatrix} c(\lambda,x) \\ s(\lambda,x) \end{pmatrix} \]
and note that $C(\lambda)$ is nonsingular since $u_1, u_2$ as well as $s, c$ are linearly independent.

Set $d\tilde\rho = d\mu_1 + d\mu_2$. Then $d\mu_j = r_j\, d\tilde\rho$ and we can introduce $\tilde R = C^* \begin{pmatrix} r_1 & 0 \\ 0 & r_2 \end{pmatrix} C$. By construction $\tilde R$ is a (symmetric) nonnegative matrix. Moreover, since $C(\lambda)$ is nonsingular, $\mathrm{tr}(\tilde R)$ is positive a.e. with respect to $\tilde\rho$. Thus we can set $R = \mathrm{tr}(\tilde R)^{-1} \tilde R$ and $d\rho = \mathrm{tr}(\tilde R)\, d\tilde\rho$.

This matrix gives rise to an operator
\[ C:\ L^2(\mathbb{R}, R\,d\rho) \to \bigoplus_j L^2(\mathbb{R}, d\mu_j), \qquad F(\lambda) \mapsto C(\lambda)F(\lambda), \]
which, by our choice of $R\,d\rho$, is norm preserving. By $CU = U_0$ it is onto and hence it is unitary (this also shows that $L^2(\mathbb{R}, R\,d\rho)$ is a Hilbert space, i.e., complete).

It is left as an exercise to check that $C$ maps multiplication by $\lambda$ in $L^2(\mathbb{R}, R\,d\rho)$ to multiplication by $\lambda$ in $\bigoplus_j L^2(\mathbb{R}, d\mu_j)$ and the formula for $U^{-1}$. $\square$
Clearly the matrix-valued measure $R\,d\rho$ contains all the spectral information of $A$. Hence it remains to relate it to the resolvent of $A$ as in Section 9.3.

For our base point $x = c$ there are corresponding Weyl m-functions $m_a(z)$ and $m_b(z)$ such that
\[ u_a(z,x) = c(z,x) - m_a(z)s(z,x), \qquad u_b(z,x) = c(z,x) + m_b(z)s(z,x). \tag{9.100} \]
The different sign in front of $m_a(z)$ is introduced such that $m_a(z)$ will again be a Herglotz function. In fact, this follows using reflection at $c$, $x - c \mapsto -(x-c)$, which will interchange the roles of $m_a(z)$ and $m_b(z)$. In particular, all considerations from Section 9.3 hold for $m_a(z)$ as well.
Furthermore, we will introduce the Weyl M-matrix
\[ M(z) = \frac{1}{m_a(z) + m_b(z)} \begin{pmatrix} -1 & \frac{m_a(z) - m_b(z)}{2} \\ \frac{m_a(z) - m_b(z)}{2} & m_a(z)m_b(z) \end{pmatrix}. \tag{9.101} \]
Note $\det(M(z)) = -\frac{1}{4}$. Since
\[ m_a(z) = -\frac{p(c)u_a'(z,c)}{u_a(z,c)} \quad\text{and}\quad m_b(z) = \frac{p(c)u_b'(z,c)}{u_b(z,c)}, \tag{9.102} \]
it follows that $W(u_a(z), u_b(z)) = m_a(z) + m_b(z)$ and
\[ M(z) = \lim_{x,y\to c} \begin{pmatrix} G(z,x,y) & (p(x)\partial_x + p(y)\partial_y)G(z,x,y)/2 \\ (p(x)\partial_x + p(y)\partial_y)G(z,x,y)/2 & p(x)\partial_x\, p(y)\partial_y\, G(z,x,y) \end{pmatrix}, \tag{9.103} \]
where $G(z,x,y)$ is the Green function of $A$. The limit is necessary since $\partial_x G(z,x,y)$ has different limits as $y \to x$ from $y > x$, respectively, $y < x$.
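The algebra behind (9.101) can be spot-checked numerically: with these entries, $\det(M(z)) = -\tfrac14$ is an identity in $m_a, m_b$, and the diagonal entries are again Herglotz when $m_a$ and $m_b$ are. The sample values below are arbitrary points in the upper half-plane (a sanity check, not a proof):

```python
# Build M from (9.101) and check det(M) = -1/4 plus Im(M_11), Im(M_22) > 0.
def M(ma, mb):
    d = ma + mb
    return [[-1 / d, (ma - mb) / (2 * d)],
            [(ma - mb) / (2 * d), ma * mb / d]]

samples = [(1 + 1j, 2j), (-0.5 + 0.3j, 4 + 2j), (1j, 1j)]  # Im > 0 in each slot
det_err = 0.0
herglotz_ok = True
for ma, mb in samples:
    m = M(ma, mb)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    det_err = max(det_err, abs(det + 0.25))
    herglotz_ok = herglotz_ok and m[0][0].imag > 0 and m[1][1].imag > 0
```

Indeed $-m_a m_b - \big(\tfrac{m_a - m_b}{2}\big)^2 = -\big(\tfrac{m_a + m_b}{2}\big)^2$, which is the algebraic identity the determinant check confirms.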
We begin by showing

Lemma 9.29. Let $U$ be the spectral mapping from the previous lemma. Then
\[ (UG(z,x,.))(\lambda) = \frac{1}{\lambda - z}\, \mathbf{s}(\lambda,x), \qquad (U p(x)\partial_x G(z,x,.))(\lambda) = \frac{1}{\lambda - z}\, p(x)\mathbf{s}'(\lambda,x) \tag{9.104} \]
for every $x \in (a,b)$ and every $z \in \rho(A)$.

Proof. First of all note that $G(z,x,.) \in L^2((a,b), r\,dx)$ for every $x \in (a,b)$ and $z \in \rho(A)$. Moreover, from $R_A(z)f = U^{-1} \frac{1}{\lambda - z} Uf$ we have
\[ \int_a^b G(z,x,y)f(y)\, r(y)\,dy = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, \mathbf{s}(\lambda,x) R(\lambda) F(\lambda)\, d\rho(\lambda), \]
where $F = Uf$. Now proceed as in the proof of Lemma 9.15. $\square$
With the aid of this lemma we can now show

Theorem 9.30. The Weyl M-matrix is given by
\[ M(z) = D + \int_{\mathbb{R}} \left( \frac{1}{\lambda - z} - \frac{\lambda}{1+\lambda^2} \right) R(\lambda)\, d\rho(\lambda), \qquad D_{jk} \in \mathbb{R}, \tag{9.105} \]
and
\[ D = \mathrm{Re}(M(i)), \qquad \int_{\mathbb{R}} \frac{1}{1+\lambda^2}\, R(\lambda)\, d\rho(\lambda) = \mathrm{Im}(M(i)), \tag{9.106} \]
where
\[ \mathrm{Re}(M(z)) = \frac{1}{2}\big( M(z) + M(z)^* \big), \qquad \mathrm{Im}(M(z)) = \frac{1}{2i}\big( M(z) - M(z)^* \big). \tag{9.107} \]
Proof. By the previous lemma we have
\[ \int_a^b |G(z,c,y)|^2\, r(y)\,dy = \int_{\mathbb{R}} \frac{1}{|z - \lambda|^2}\, R_{11}(\lambda)\, d\rho(\lambda). \]
Moreover, by (9.28), (9.55), and (9.100) we infer
\[ \int_a^b |G(z,c,y)|^2\, r(y)\,dy = \frac{1}{|W(u_a, u_b)|^2} \left( |u_b(z,c)|^2 \int_a^c |u_a(z,y)|^2\, r(y)\,dy + |u_a(z,c)|^2 \int_c^b |u_b(z,y)|^2\, r(y)\,dy \right) = \frac{\mathrm{Im}(M_{11}(z))}{\mathrm{Im}(z)}. \]
Similarly we obtain
\[ \int_{\mathbb{R}} \frac{1}{|z - \lambda|^2}\, R_{22}(\lambda)\, d\rho(\lambda) = \frac{\mathrm{Im}(M_{22}(z))}{\mathrm{Im}(z)} \]
and
\[ \int_{\mathbb{R}} \frac{1}{|z - \lambda|^2}\, R_{12}(\lambda)\, d\rho(\lambda) = \frac{\mathrm{Im}(M_{12}(z))}{\mathrm{Im}(z)}. \]
Hence the result follows as in the proof of Theorem 9.17. $\square$
Now we are also able to extend Theorem 9.27. Note that by
\[ \mathrm{tr}(M(z)) = M_{11}(z) + M_{22}(z) = d + \int_{\mathbb{R}} \left( \frac{1}{\lambda - z} - \frac{\lambda}{1+\lambda^2} \right) d\rho(\lambda) \tag{9.108} \]
(with $d = \mathrm{tr}(D) \in \mathbb{R}$) we have that the set
\[ M_s = \{\lambda \,|\, \limsup_{\varepsilon\downarrow 0} \mathrm{Im}(\mathrm{tr}(M(\lambda + i\varepsilon))) = \infty\} \tag{9.109} \]
is a support for the singularly continuous part and
\[ M_{ac} = \{\lambda \,|\, 0 < \limsup_{\varepsilon\downarrow 0} \mathrm{Im}(\mathrm{tr}(M(\lambda + i\varepsilon))) < \infty\} \tag{9.110} \]
is a minimal support for the absolutely continuous part.
Theorem 9.31. The set $N_a(\tau) \cup N_b(\tau) \subseteq M_{ac}$ is a minimal support for the absolutely continuous spectrum of $H$. In particular,
\[ \sigma_{ac}(H) = \overline{N_a(\tau) \cup N_b(\tau)}^{\,ess}. \tag{9.111} \]
Moreover, the set
\[ \bigcup_{\alpha\in[0,\pi)} S_{a,\alpha}(\tau) \cap S_{b,\alpha}(\tau) \subseteq M_s \tag{9.112} \]
is a support for the singular spectrum of $H$.
Proof. By Corollary 9.25 we have $0 < \liminf \mathrm{Im}(m_a)$ and $\limsup |m_a| < \infty$ if and only if $\lambda \in N_a(\tau)$ and similarly for $m_b$.

Now suppose $\lambda \in N_a(\tau)$. Then $\limsup |M_{11}| < \infty$ since $\limsup |M_{11}| = \infty$ is impossible by $0 = \liminf |M_{11}^{-1}| = \liminf |m_a + m_b| \ge \liminf \mathrm{Im}(m_a) > 0$. Similarly $\limsup |M_{22}| < \infty$. Moreover, if $\limsup |m_b| < \infty$, we also have
\[ \liminf \mathrm{Im}(M_{11}) = \liminf \frac{\mathrm{Im}(m_a + m_b)}{|m_a + m_b|^2} \ge \frac{\liminf \mathrm{Im}(m_a)}{\limsup |m_a|^2 + \limsup |m_b|^2} > 0 \]
and if $\limsup |m_b| = \infty$, we have
\[ \liminf \mathrm{Im}(M_{22}) = \liminf \mathrm{Im}\left( \frac{m_a}{1 + m_a/m_b} \right) \ge \liminf \mathrm{Im}(m_a) > 0. \]
Thus $N_a(\tau) \subseteq M_{ac}$ and similarly $N_b(\tau) \subseteq M_{ac}$.

Conversely, let $\lambda \in M_{ac}$. By Corollary 3.25 we can assume that the limits $\lim m_a$ and $\lim m_b$ both exist and are finite after disregarding a set of Lebesgue measure zero. For such $\lambda$, $\lim \mathrm{Im}(M_{11})$ and $\lim \mathrm{Im}(M_{22})$ both exist and are finite. Moreover, either $\lim \mathrm{Im}(M_{11}) > 0$, in which case $\lim \mathrm{Im}(m_a + m_b) > 0$, or $\lim \mathrm{Im}(M_{11}) = 0$, in which case
\[ 0 < \lim \mathrm{Im}(M_{22}) = \lim \frac{|m_a|^2\, \mathrm{Im}(m_b) + |m_b|^2\, \mathrm{Im}(m_a)}{|m_a + m_b|^2} = 0 \]
yields a contradiction. Thus $\lambda \in N_a(\tau) \cup N_b(\tau)$ and the first part is proven.

To prove the second part, let $\lambda \in M_s$. If $\limsup \mathrm{Im}(M_{11}) = \infty$, we have $\limsup |M_{11}| = \infty$ and thus $\liminf |m_a + m_b| = 0$. But this implies that there is some subsequence such that $\lim m_b = -\lim m_a = \cot(\alpha) \in \mathbb{R} \cup \{\infty\}$. Similarly, if $\limsup \mathrm{Im}(M_{22}) = \infty$, we have $\liminf |m_a^{-1} + m_b^{-1}| = 0$ and there is some subsequence such that $\lim m_b^{-1} = -\lim m_a^{-1} = \tan(\alpha) \in \mathbb{R} \cup \{\infty\}$. This shows $M_s \subseteq \bigcup_\alpha S_{a,\alpha}(\tau) \cap S_{b,\alpha}(\tau)$. $\square$
Problem 9.17. Show
    R(λ) dρ_ac(λ) = (1/π) ( Im(m_a(λ)+m_b(λ)) / (|m_a(λ)|²+|m_b(λ)|²)            Im(m_a(λ) m_b(λ)*) / (|m_a(λ)|²+|m_b(λ)|²)
                            Im(m_a(λ) m_b(λ)*) / (|m_a(λ)|²+|m_b(λ)|²)           (|m_a(λ)|² Im(m_b(λ)) + |m_b(λ)|² Im(m_a(λ))) / (|m_a(λ)|²+|m_b(λ)|²) ) dλ,
where m_a(λ) = lim_{ε↓0} m_a(λ+iε) and similarly for m_b(λ).
Moreover, show that the choice of solutions
    ( u_b(λ,x) )           ( c(λ,x) )
    ( u_a(λ,x) ) = V(λ) ( s(λ,x) ),
where
    V(λ) = (1/(m_a(λ)+m_b(λ))) (  1   m_b(λ)
                                  −1   m_a(λ) ),
diagonalizes the absolutely continuous part,
    V⁻¹(λ)* R(λ) V(λ)⁻¹ dρ_ac(λ) = (1/π) ( Im(m_a(λ))   0
                                            0   Im(m_b(λ)) ) dλ.
9.7. The spectra of one-dimensional Schrödinger operators

In this section we want to look at the case of one-dimensional Schrödinger operators; that is, r = p = 1 on (a,b) = (−∞,∞).
Recall that
    H₀ = −d²/dx², D(H₀) = H²(ℝ),   (9.113)
is self-adjoint and
    q_{H₀}(f) = ‖f′‖², Q(H₀) = H¹(ℝ).   (9.114)
Hence we can try to apply the results from Chapter 6. We begin with a simple estimate:
Lemma 9.32. Suppose f ∈ H¹(0,1). Then
    sup_{x∈[0,1]} |f(x)|² ≤ ε ∫₀¹ |f′(x)|² dx + (1 + 1/ε) ∫₀¹ |f(x)|² dx   (9.115)
for every ε > 0.
Proof. First note that
    |f(x)|² = |f(c)|² + 2 ∫_c^x Re( f(t)* f′(t) ) dt ≤ |f(c)|² + 2 ∫₀¹ |f(t) f′(t)| dt
            ≤ |f(c)|² + ε ∫₀¹ |f′(t)|² dt + (1/ε) ∫₀¹ |f(t)|² dt
for any c ∈ [0,1]. But by the mean value theorem there is a c ∈ (0,1) such that |f(c)|² = ∫₀¹ |f(t)|² dt. □
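The estimate (9.115) is easy to probe numerically. The following sketch (an illustration, not from the book; the sample function is an arbitrary smooth choice) checks the bound on a grid for several values of ε:

```python
import numpy as np

# Check of (9.115): sup |f|^2 <= eps*int|f'|^2 + (1 + 1/eps)*int|f|^2 on (0,1).
x = np.linspace(0.0, 1.0, 20001)
f = np.sin(3*x) + x**2                 # arbitrary smooth sample in H^1(0,1)
fp = 3*np.cos(3*x) + 2*x               # its exact derivative

def trap(y):                           # trapezoidal rule on the grid x
    return float(np.sum((y[:-1] + y[1:]) * 0.5 * np.diff(x)))

sup_f2 = float(np.max(np.abs(f)**2))
int_fp2 = trap(np.abs(fp)**2)
int_f2 = trap(np.abs(f)**2)

for eps in (0.1, 1.0, 10.0):
    assert sup_f2 <= eps*int_fp2 + (1 + 1/eps)*int_f2
```

Note how the bound degenerates as ε ↓ 0, reflecting the 1/ε blow-up of the zeroth-order term.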
As a consequence we obtain

Lemma 9.33. Suppose q ∈ L²_loc(ℝ) and
    sup_{n∈ℤ} ∫_n^{n+1} |q(x)|² dx < ∞.   (9.116)
Then q is relatively bounded with respect to H₀ with bound zero.
Similarly, if q ∈ L¹_loc(ℝ) and
    sup_{n∈ℤ} ∫_n^{n+1} |q(x)| dx < ∞,   (9.117)
then q is relatively form bounded with respect to H₀ with bound zero.
Proof. Let Q be in L²_loc(ℝ) and abbreviate M = sup_{n∈ℤ} ∫_n^{n+1} |Q(x)|² dx. Using the previous lemma, we have for f ∈ H¹(ℝ) that
    ‖Qf‖² = Σ_{n∈ℤ} ∫_n^{n+1} |Q(x)f(x)|² dx ≤ M Σ_{n∈ℤ} sup_{x∈[n,n+1]} |f(x)|²
          ≤ M Σ_{n∈ℤ} ( ε ∫_n^{n+1} |f′(x)|² dx + (1 + 1/ε) ∫_n^{n+1} |f(x)|² dx )
          = M ( ε ‖f′‖² + (1 + 1/ε) ‖f‖² ).
Choosing Q = |q|^{1/2}, this already proves the form case since ‖f′‖² = q_{H₀}(f). Choosing Q = q and observing q_{H₀}(f) = ⟨f, H₀f⟩ ≤ ‖H₀f‖ ‖f‖ for f ∈ H²(ℝ) shows the operator case. □
Hence in both cases H₀ + q is a well-defined (semi-bounded) operator, defined as an operator sum on D(H₀ + q) = D(H₀) = H²(ℝ) in the first case and as a form sum on Q(H₀ + q) = Q(H₀) = H¹(ℝ) in the second case. Note also that the first case implies the second one since by Cauchy–Schwarz we have
    ∫_n^{n+1} |q(x)| dx ≤ ( ∫_n^{n+1} |q(x)|² dx )^{1/2}.   (9.118)
This is not too surprising since we already know how to turn H₀ + q into a self-adjoint operator without imposing any conditions on q (except for q ∈ L¹_loc(ℝ)) at all. However, we get at least a simple description of the (form) domains, and by requiring a bit more, we can even compute the essential spectrum of the perturbed operator.
Lemma 9.34. Suppose q ∈ L¹(ℝ). Then the resolvent difference of H₀ and H₀ + q is trace class.

Proof. Using G₀(z,x,x) = 1/(2√−z), Lemma 9.12 implies that |q|^{1/2} R_{H₀}(z) is Hilbert–Schmidt and hence the result follows from Lemma 6.29. □
Lemma 9.35. Suppose q ∈ L¹_loc(ℝ) and
    lim_{|n|→∞} ∫_n^{n+1} |q(x)| dx = 0.   (9.119)
Then R_{H₀+q}(z) − R_{H₀}(z) is compact and hence σ_ess(H₀ + q) = σ_ess(H₀) = [0, ∞).
Proof. By Weyl's theorem it suffices to show that the resolvent difference is compact. Let qₙ(x) = q(x) χ_{ℝ\[−n,n]}(x). Then R_{H₀+q}(z) − R_{H₀+qₙ}(z) is trace class, which can be shown as in the previous lemma since q − qₙ has compact support (no information on the corresponding diagonal Green's function is needed since by continuity it is bounded on every compact set). Moreover, by the proof of Lemma 9.33, qₙ is form bounded with respect to H₀ with constants a = Mₙ and b = 2Mₙ, where Mₙ = sup_{|m|≥n} ∫_m^{m+1} |q(x)| dx. Hence by Theorem 6.25 we see
    R_{H₀+qₙ}(λ) = R_{H₀}(λ)^{1/2} (1 − C_{qₙ}(λ))⁻¹ R_{H₀}(λ)^{1/2},   λ < −2,
with ‖C_{qₙ}(λ)‖² ≤ 2Mₙ. So we conclude
    R_{H₀+qₙ}(λ) − R_{H₀}(λ) = R_{H₀}(λ)^{1/2} C_{qₙ}(λ) (1 − C_{qₙ}(λ))⁻¹ R_{H₀}(λ)^{1/2},   λ < −2,
which implies that the sequence of compact operators R_{H₀+q}(λ) − R_{H₀+qₙ}(λ) converges to R_{H₀+q}(λ) − R_{H₀}(λ) in norm. Hence the limit is also compact, finishing the proof. □
Using Lemma 6.23, respectively Corollary 6.27, we even obtain

Corollary 9.36. Let q = q₁ + q₂, where q₁ and q₂ satisfy the assumptions of Lemma 9.33 and Lemma 9.35, respectively. Then H₀ + q₁ + q₂ is self-adjoint and
    σ_ess(H₀ + q₁ + q₂) = σ_ess(H₀ + q₁).
This result applies, for example, in the case where q₂ is a decaying perturbation of a periodic potential q₁.
Finally we turn to the absolutely continuous spectrum.

Lemma 9.37. Suppose q = q₁ + q₂, where q₁ ∈ L¹(0,∞) and q₂ ∈ AC[0,∞) with q₂′ ∈ L¹(0,∞) and lim_{x→∞} q₂(x) = 0. Then there are two solutions u±(λ,x) of τu = λu, λ > 0, of the form
    u±(λ,x) = (1 + o(1)) u_{0,±}(λ,x),   u±′(λ,x) = (1 + o(1)) u_{0,±}′(λ,x)   (9.120)
as x → ∞, where
    u_{0,±}(λ,x) = exp( ±i ∫₀^x √(λ − q₂(y)) dy ).   (9.121)
Proof. We will omit the dependence on λ for notational simplicity. Moreover, we will choose x so large that W_x(u₋, u₊) = 2i√(λ − q₂(x)) ≠ 0. Write
    u(x) = U₀(x) a(x),   U₀(x) = ( u_{0,+}(x)    u_{0,−}(x)
                                   u_{0,+}′(x)   u_{0,−}′(x) ),   a(x) = ( a₊(x)
                                                                            a₋(x) ).
Then
    u′(x) = ( 0   1
              q(x)−λ   0 ) u(x) + ( 0   0
                                    −q₊(x) u_{0,+}(x)   −q₋(x) u_{0,−}(x) ) a(x) + U₀(x) a′(x),
where
    q±(x) = q₁(x) ± i q₂′(x)/(2√(λ − q₂(x))).
Hence u(x) will solve τu = λu if
    a′(x) = (1/W_x(u₋,u₊)) ( q₊(x)   q₋(x) u_{0,−}(x)²
                             −q₊(x) u_{0,+}(x)²   −q₋(x) ) a(x).
Since the coefficient matrix of this linear system is integrable, the claim follows by a simple application of Gronwall's inequality. □
Theorem 9.38 (Weidmann). Let q₁ and q₂ be as in the previous lemma and suppose q = q₁ + q₂ satisfies the assumptions of Lemma 9.35. Let H = H₀ + q₁ + q₂. Then σ_ac(H) = [0,∞), σ_sc(H) = ∅, and σ_p(H) ⊆ (−∞, 0].

Proof. By the previous lemma there is no subordinate solution for λ > 0 on (0,∞) and hence 0 < Im(m_b(λ+i0)) < ∞. Similarly, there is no subordinate solution on (−∞,0) and hence 0 < Im(m_a(λ+i0)) < ∞. Thus the same is true for the diagonal entries M_jj(z) of the Weyl M-matrix, 0 < Im(M_jj(λ+i0)) < ∞, and hence dρ is purely absolutely continuous on (0,∞). Since σ_ess(H) = [0,∞), we conclude σ_ac(H) = [0,∞) and σ_sc(H) ⊆ {0}. Since the singular continuous part cannot live on a single point, we are done. □
Note that the same results hold for operators on [0,∞) rather than ℝ. Moreover, observe that the conditions from Lemma 9.37 are only imposed near +∞ but not near −∞. The conditions from Lemma 9.35 are only used to ensure that there is no essential spectrum in (−∞, 0).
Having dealt with the essential spectrum, let us next look at the discrete spectrum. In the case of decaying potentials, as in the previous theorem, one key question is if the number of eigenvalues below the essential spectrum is finite or not.
As preparation, we shall prove Sturm's comparison theorem:
Theorem 9.39 (Sturm). Let τ₀, τ₁ be associated with q₀ ≥ q₁ on (a,b), respectively. Let (c,d) ⊆ (a,b) and τ₀u = 0, τ₁v = 0. Suppose at each end of (c,d) either W_x(u,v) = 0 or, if c,d ∈ (a,b), u = 0. Then v is either a multiple of u in (c,d) or v must vanish at some point in (c,d).

Proof. By decreasing d to the first zero of u in (c,d] (and perhaps flipping signs), we can suppose u > 0 on (c,d). If v has no zeros in (c,d), we can suppose v > 0 on (c,d), again by perhaps flipping signs. At each endpoint, W(u,v) vanishes or else u = 0, v > 0, and u′(c) > 0 (respectively u′(d) < 0). Thus W_c(u,v) ≤ 0 and W_d(u,v) ≥ 0. But this is inconsistent with
    W_d(u,v) − W_c(u,v) = −∫_c^d (q₀(t) − q₁(t)) u(t) v(t) dt,   (9.122)
unless both sides vanish. □
In particular, choosing q₀ = q − λ₀ and q₁ = q − λ₁ with λ₀ ≤ λ₁, this result holds for solutions of τu = λ₀u and τv = λ₁v.
Now we can prove
Theorem 9.40. Suppose q satisfies (9.117) such that H is semi-bounded and Q(H) = H¹(ℝ). Let λ₀ < λ₁ < ··· < λₙ < ··· be its eigenvalues below the essential spectrum and ψ₀, ψ₁, ..., ψₙ, ... the corresponding eigenfunctions. Then ψₙ has n zeros.
Proof. We first prove that ψₙ has at least n zeros and then that if ψₙ has m zeros, then (−∞, λₙ] contains at least m+1 eigenvalues. If ψₙ has m zeros at x₁, x₂, ..., x_m and we let x₀ = a, x_{m+1} = b, then by Theorem 9.39, ψ_{n+1} must have at least one zero in each of (x₀,x₁), (x₁,x₂), ..., (x_m, x_{m+1}); that is, ψ_{n+1} has at least m+1 zeros. It follows by induction that ψₙ has at least n zeros.
On the other hand, if ψₙ has m zeros x₁, ..., x_m, define
    ψⱼ(x) = { ψₙ(x), xⱼ ≤ x ≤ x_{j+1};  0, otherwise },   j = 0, ..., m,   (9.123)
where we set x₀ = −∞ and x_{m+1} = ∞. Then ψⱼ is in the form domain of H and satisfies ⟨ψⱼ, Hψⱼ⟩ = λₙ ‖ψⱼ‖². Hence if ψ = Σ_{j=0}^m cⱼ ψⱼ, then ⟨ψ, Hψ⟩ = λₙ ‖ψ‖², and it follows by Theorem 4.12 (i) that there are at least m+1 eigenvalues in (−∞, λₙ]. □
Note that by Theorem 9.39 the zeros of ψₙ₊₁ interlace the zeros of ψₙ. The second part of the proof also shows

Corollary 9.41. Let H be as in the previous theorem. If the Weyl solution u±(λ, ·) has m zeros, then dim Ran P_H((−∞,λ)) ≥ m. In particular, λ below the spectrum of H implies that u±(λ, ·) has no zeros.
The equation (τ − λ)u = 0 is called oscillating if one solution has an infinite number of zeros. Theorem 9.39 implies that this is then true for all solutions. By our previous considerations this is the case if and only if σ(H) has infinitely many points below λ. Hence it remains to find a good oscillation criterion.

Theorem 9.42 (Kneser). Consider q on (0,∞). Then
    liminf_{x→∞} ( x² q(x) ) > −1/4  implies nonoscillation of τ near ∞   (9.124)
and
    limsup_{x→∞} ( x² q(x) ) < −1/4  implies oscillation of τ near ∞.   (9.125)
Proof. The key idea is that the equation
    τ₀ = −d²/dx² + ν/x²
is of Euler type. Hence it is explicitly solvable, with a fundamental system given by
    x^{1/2 ± √(ν + 1/4)}.
There are two cases to distinguish. If ν ≥ −1/4, all solutions are nonoscillatory. If ν < −1/4, one has to take real/imaginary parts and all solutions are oscillatory. Hence a straightforward application of Sturm's comparison theorem between τ₀ and τ yields the result. □
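The Euler-equation computation behind this proof can be verified symbolically. The following sketch (an illustration, not from the book) checks that x^{1/2 ± s} with s = √(ν + 1/4) indeed solves −u″ + (ν/x²)u = 0; for ν < −1/4 the exponent becomes complex and the real/imaginary parts behave like √x cos(|s| log x), which oscillate:

```python
import sympy as sp

# Verify that u = x^{1/2 +- s}, s = sqrt(nu + 1/4), solves -u'' + (nu/x^2) u = 0.
x, nu = sp.symbols('x nu', positive=True)
s = sp.sqrt(nu + sp.Rational(1, 4))
for u in (x**(sp.Rational(1, 2) + s), x**(sp.Rational(1, 2) - s)):
    residual = sp.simplify(-sp.diff(u, x, 2) + nu/x**2 * u)
    assert residual == 0
```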
Corollary 9.43. Suppose q satisfies (9.117). Then H has finitely many eigenvalues below the infimum of the essential spectrum 0 if
    liminf_{|x|→∞} ( x² q(x) ) > −1/4   (9.126)
and infinitely many if
    limsup_{|x|→∞} ( x² q(x) ) < −1/4.   (9.127)
Problem 9.18. Show that if q is relatively bounded with respect to H₀, then necessarily q ∈ L²_loc(ℝ) and (9.116) holds. Similarly, if q is relatively form bounded with respect to H₀, then necessarily q ∈ L¹_loc(ℝ) and (9.117) holds.
Problem 9.19. Suppose q ∈ L¹(ℝ) and consider H = −d²/dx² + q. Show that inf σ(H) < 0 if ∫_ℝ q(x) dx < 0; in particular, there is then at least one eigenvalue below the essential spectrum. (Hint: Let φ ∈ C_c^∞(ℝ) with φ(x) = 1 for |x| ≤ 1 and investigate q_H(φₙ), where φₙ(x) = φ(x/n).)
Chapter 10

One-particle Schrödinger operators

10.1. Self-adjointness and spectrum

Our next goal is to apply these results to Schrödinger operators. The Hamiltonian of one particle in d dimensions is given by
    H = H₀ + V,   (10.1)
where V: ℝᵈ → ℝ is the potential energy of the particle. We are mainly interested in the case 1 ≤ d ≤ 3 and want to find classes of potentials which are relatively bounded, respectively relatively compact. To do this, we need a better understanding of the functions in the domain of H₀.
Lemma 10.1. Suppose n ≤ 3 and ψ ∈ H²(ℝⁿ). Then ψ ∈ C_∞(ℝⁿ) and for any a > 0 there is a b > 0 such that
    ‖ψ‖_∞ ≤ a ‖H₀ψ‖ + b ‖ψ‖.   (10.2)

Proof. The important observation is that (p² + γ²)⁻¹ ∈ L²(ℝⁿ) if n ≤ 3. Hence, since (p² + γ²)ψ̂ ∈ L²(ℝⁿ), the Cauchy–Schwarz inequality
    ‖ψ̂‖₁ = ‖(p² + γ²)⁻¹ (p² + γ²) ψ̂(p)‖₁ ≤ ‖(p² + γ²)⁻¹‖ ‖(p² + γ²) ψ̂(p)‖
shows ψ̂ ∈ L¹(ℝⁿ). But now everything follows from the Riemann–Lebesgue lemma, that is,
    ‖ψ‖_∞ ≤ (2π)^{−n/2} ‖(p² + γ²)⁻¹‖ ( ‖p² ψ̂(p)‖ + γ² ‖ψ̂(p)‖ )
          = (γ/(2π))^{n/2} ‖(p² + 1)⁻¹‖ ( γ⁻² ‖H₀ψ‖ + ‖ψ‖ ),
which finishes the proof. □
Now we come to our first result.

Theorem 10.2. Let V be real-valued and V ∈ L∞_∞(ℝⁿ) if n > 3 and V ∈ L∞_∞(ℝⁿ) + L²(ℝⁿ) if n ≤ 3. Then V is relatively compact with respect to H₀. In particular,
    H = H₀ + V, D(H) = H²(ℝⁿ),   (10.3)
is self-adjoint, bounded from below, and
    σ_ess(H) = [0, ∞).   (10.4)
Moreover, C_c^∞(ℝⁿ) is a core for H.

Proof. Our previous lemma shows D(H₀) ⊆ D(V). Moreover, invoking Lemma 7.11 with f(p) = (p² − z)⁻¹ and g(x) = V(x) (note that f ∈ L∞_∞(ℝⁿ) ∩ L²(ℝⁿ) for n ≤ 3) shows that V is relatively compact. Since C_c^∞(ℝⁿ) is a core for H₀ by Lemma 7.9, the same is true for H by the Kato–Rellich theorem. □

Observe that since C_c^∞(ℝⁿ) ⊆ D(H₀), we must have V ∈ L²_loc(ℝⁿ) if D(H₀) ⊆ D(V).
10.2. The hydrogen atom

We begin with the simple model of a single electron in ℝ³ moving in the external potential V generated by a nucleus (which is assumed to be fixed at the origin). If one takes only the electrostatic force into account, then V is given by the Coulomb potential and the corresponding Hamiltonian is given by
    H^{(1)} = −Δ − γ/|x|, D(H^{(1)}) = H²(ℝ³).   (10.5)
If the potential is attracting, that is, if γ > 0, then it describes the hydrogen atom and is probably the most famous model in quantum mechanics.
We have chosen as domain D(H^{(1)}) = D(H₀) ∩ D(1/|x|) = D(H₀), and by Theorem 10.2 we conclude that H^{(1)} is self-adjoint. Moreover, Theorem 10.2 also tells us
    σ_ess(H^{(1)}) = [0, ∞)   (10.6)
and that H^{(1)} is bounded from below,
    E₀ = inf σ(H^{(1)}) > −∞.   (10.7)
If γ ≤ 0, we have H^{(1)} ≥ 0 and hence E₀ = 0, but if γ > 0, we might have E₀ < 0 and there might be some discrete eigenvalues below the essential spectrum.
In order to say more about the eigenvalues of H^{(1)}, we will use the fact that both H₀ and V^{(1)} = −γ/|x| have a simple behavior with respect to scaling. Consider the dilation group
    U(s)ψ(x) = e^{−ns/2} ψ(e^{−s} x), s ∈ ℝ,   (10.8)
which is a strongly continuous one-parameter unitary group. The generator can be easily computed:
    Dψ(x) = (1/2)(xp + px)ψ(x) = (xp − in/2)ψ(x), ψ ∈ S(ℝⁿ).   (10.9)
Now let us investigate the action of U(s) on H^{(1)}:
    H^{(1)}(s) = U(−s) H^{(1)} U(s) = e^{−2s} H₀ + e^{−s} V^{(1)}, D(H^{(1)}(s)) = D(H^{(1)}).   (10.10)
Now suppose Hψ = λψ. Then
    ⟨ψ, [U(s), H]ψ⟩ = ⟨U(−s)ψ, Hψ⟩ − ⟨Hψ, U(s)ψ⟩ = 0   (10.11)
and hence
    0 = lim_{s→0} (1/s) ⟨ψ, [U(s), H]ψ⟩ = lim_{s→0} ⟨U(−s)ψ, ((H − H(s))/s) ψ⟩ = ⟨ψ, (2H₀ + V^{(1)})ψ⟩.   (10.12)
Thus we have proven the virial theorem.

Theorem 10.3. Suppose H = H₀ + V with U(−s)V U(s) = e^{−s}V. Then any normalized eigenfunction ψ corresponding to an eigenvalue λ satisfies
    λ = −⟨ψ, H₀ψ⟩ = (1/2) ⟨ψ, Vψ⟩.   (10.13)
In particular, all eigenvalues must be negative.
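The identities in (10.13) can be checked symbolically on the well-known hydrogen ground state ψ ∝ e^{−γr/2} (taken here as an external fact, with λ = −γ²/4; normalization is irrelevant since it cancels). Working with u = rψ and the measure dr, ⟨H₀⟩ = ∫|u′|² dr / ∫|u|² dr:

```python
import sympy as sp

# Virial-theorem check: lambda = -<psi, H0 psi> = (1/2)<psi, V psi>
# for psi(r) = exp(-gamma*r/2), V = -gamma/r, using u = r*psi.
r, gamma = sp.symbols('r gamma', positive=True)
u = r * sp.exp(-gamma * r / 2)
norm = sp.integrate(u**2, (r, 0, sp.oo))
kin = sp.integrate(sp.diff(u, r)**2, (r, 0, sp.oo)) / norm     # <H_0>
pot = sp.integrate(-gamma / r * u**2, (r, 0, sp.oo)) / norm    # <V>
lam = sp.simplify(kin + pot)                                   # the eigenvalue

assert sp.simplify(lam + kin) == 0        # lambda = -<H_0>
assert sp.simplify(lam - pot / 2) == 0    # lambda = <V>/2
assert sp.simplify(lam + gamma**2 / 4) == 0
```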
This result even has some further consequences for the point spectrum of H^{(1)}.

Corollary 10.4. Suppose γ > 0. Then
    σ_p(H^{(1)}) = σ_d(H^{(1)}) = {E_j}_{j∈ℕ₀}, E_j < E_{j+1} < 0,   (10.14)
with lim_{j→∞} E_j = 0.

Proof. Choose ψ ∈ C_c^∞(ℝ³\{0}) and set ψ(s) = U(s)ψ. Then
    ⟨ψ(s), H^{(1)} ψ(s)⟩ = e^{−2s} ⟨ψ, H₀ψ⟩ + e^{−s} ⟨ψ, V^{(1)}ψ⟩,
which is negative for s large. Now choose a sequence sₙ → ∞ such that we have supp(ψ(sₙ)) ∩ supp(ψ(sₘ)) = ∅ for n ≠ m. Then Theorem 4.12 (i) shows that rank(P_{H^{(1)}}((−∞,0))) = ∞. Since each eigenvalue E_j has finite multiplicity (it lies in the discrete spectrum), there must be an infinite number of eigenvalues which accumulate at 0. □
If γ ≤ 0, we have σ_d(H^{(1)}) = ∅ since H^{(1)} ≥ 0 in this case.
Hence we have obtained quite a complete picture of the spectrum of H^{(1)}. Next, we could try to compute the eigenvalues of H^{(1)} (in the case γ > 0) by solving the corresponding eigenvalue equation, which is given by the partial differential equation
    −Δψ(x) − (γ/|x|) ψ(x) = λψ(x).   (10.15)
For a general potential this is hopeless, but in our case we can use the rotational symmetry of our operator to reduce our partial differential equation to ordinary ones.
First of all, it suggests itself to switch from Cartesian coordinates x = (x₁, x₂, x₃) to spherical coordinates (r, θ, φ) defined by
    x₁ = r sin(θ) cos(φ), x₂ = r sin(θ) sin(φ), x₃ = r cos(θ),   (10.16)
where r ∈ [0,∞), θ ∈ [0,π], and φ ∈ (−π,π]. This change of coordinates corresponds to a unitary transform
    L²(ℝ³) → L²((0,∞), r²dr) ⊗ L²((0,π), sin(θ)dθ) ⊗ L²((0,2π), dφ).   (10.17)
In these new coordinates (r, θ, φ) our operator reads
    H^{(1)} = −(1/r²) (∂/∂r) r² (∂/∂r) + (1/r²) L² + V(r), V(r) = −γ/r,   (10.18)
where
    L² = L₁² + L₂² + L₃² = −(1/sin(θ)) (∂/∂θ) sin(θ) (∂/∂θ) − (1/sin(θ)²) ∂²/∂φ².   (10.19)
(Recall the angular momentum operators L_j from Section 8.2.)
Making the product ansatz (separation of variables)
    ψ(r, θ, φ) = R(r) Θ(θ) Φ(φ),   (10.20)
we obtain the three Sturm–Liouville equations
    ( −(1/r²) (d/dr) r² (d/dr) + l(l+1)/r² + V(r) ) R(r) = λ R(r),
    (1/sin(θ)) ( −(d/dθ) sin(θ) (d/dθ) + m²/sin(θ) ) Θ(θ) = l(l+1) Θ(θ),
    −(d²/dφ²) Φ(φ) = m² Φ(φ).   (10.21)
The form chosen for the constants l(l+1) and m² is for convenience later on. These equations will be investigated in the following sections.
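The separated angular equations can be spot-checked symbolically. The following sketch (an illustration, not from the book) verifies that Θ(θ) = cos(θ) satisfies the θ-equation with l = 1, m = 0 (eigenvalue l(l+1) = 2), and that Φ(φ) = e^{imφ} satisfies the φ-equation:

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')

# theta-equation, l = 1, m = 0: (1/sin)( -d/dtheta sin(theta) d/dtheta ) Theta = 2*Theta
Theta = sp.cos(theta)
lhs = -sp.diff(sp.sin(theta) * sp.diff(Theta, theta), theta) / sp.sin(theta)
assert sp.simplify(lhs - 2 * Theta) == 0

# phi-equation: -Phi'' = m^2 * Phi
m = 3
Phi = sp.exp(sp.I * m * phi)
assert sp.simplify(-sp.diff(Phi, phi, 2) - m**2 * Phi) == 0
```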
Problem 10.1. Generalize the virial theorem to the case U(−s)V U(s) = e^{−αs}V, α ∈ ℝ\{0}. What about Corollary 10.4?
10.3. Angular momentum

We start by investigating the equation for Φ(φ), which is associated with the Sturm–Liouville equation
    τΦ = −Φ″, I = (0, 2π).   (10.22)
Since we want ψ defined via (10.20) to be in the domain of H₀ (in particular continuous), we choose periodic boundary conditions for the Sturm–Liouville equation
    AΦ = τΦ, D(A) = { Φ ∈ L²(0, 2π) | Φ ∈ AC¹[0, 2π], Φ(0) = Φ(2π), Φ′(0) = Φ′(2π) }.   (10.23)
From our analysis in Section 9.1 we immediately obtain

Theorem 10.5. The operator A defined via (10.23) is self-adjoint. Its spectrum is purely discrete, that is,
    σ(A) = σ_d(A) = { m² | m ∈ ℤ },   (10.24)
and the corresponding eigenfunctions
    Φ_m(φ) = (1/√(2π)) e^{imφ}, m ∈ ℤ,   (10.25)
form an orthonormal basis for L²(0, 2π).
Note that except for the lowest eigenvalue, all eigenvalues are twice degenerate.
We note that this operator is essentially the square of the angular momentum in the third coordinate direction, since in polar coordinates
    L₃ = (1/i) ∂/∂φ.   (10.26)
Now we turn to the equation for Θ(θ):
    τ_m Θ(θ) = (1/sin(θ)) ( −(d/dθ) sin(θ) (d/dθ) + m²/sin(θ) ) Θ(θ), I = (0, π), m ∈ ℕ₀.   (10.27)
For the investigation of the corresponding operator we use the unitary transform
    L²((0, π), sin(θ)dθ) → L²((−1, 1), dx), Θ(θ) ↦ f(x) = Θ(arccos(x)).   (10.28)
The operator transforms to the somewhat simpler form
    τ_m = −(d/dx)(1 − x²)(d/dx) + m²/(1 − x²).   (10.29)
The corresponding eigenvalue equation
    τ_m u = l(l+1) u   (10.30)
is the associated Legendre equation. For l ∈ ℕ₀ it is solved by the associated Legendre functions [1, (8.6.6)]
    P_l^m(x) = (−1)^m (1 − x²)^{m/2} (d^m/dx^m) P_l(x), |m| ≤ l,   (10.31)
where the
    P_l(x) = (1/(2^l l!)) (d^l/dx^l) (x² − 1)^l, l ∈ ℕ₀,   (10.32)
are the Legendre polynomials [1, (8.6.18)] (Problem 10.2). Moreover, note that the P_l(x) are (nonzero) polynomials of degree l, and since τ_m depends only on m², there must be a relation between P_l^m(x) and P_l^{−m}(x). In fact (Problem 10.3),
    P_l^m(x) = (−1)^m ((l+m)!/(l−m)!) P_l^{−m}(x).   (10.33)
A second, linearly independent solution is given by
    Q_l^m(x) = P_l^m(x) ∫₀^x dt / ( (1 − t²) P_l^m(t)² ).   (10.34)
In fact, for every Sturm–Liouville equation, v(x) = u(x) ∫^x dt/(p(t)u(t)²) satisfies τv = 0 whenever τu = 0. Now fix l = 0 and note P₀(x) = 1. For m = 0 we have Q₀⁰ = arctanh(x) ∈ L², and so τ₀ is l.c. at both endpoints. For m > 0 we have Q₀^m = (x ∓ 1)^{−m/2} (C + O(x ∓ 1)), which shows that it is not square integrable. Thus τ_m is l.c. for m = 0 and l.p. for m > 0 at both endpoints. In order to make sure that the eigenfunctions for m = 0 are continuous (such that ψ defined via (10.20) is continuous), we choose the boundary condition generated by P₀(x) = 1 in this case:
    A_m f = τ_m f,
    D(A_m) = { f ∈ L²(−1,1) | f ∈ AC¹(−1,1), τ_m f ∈ L²(−1,1), lim_{x→±1} (1 − x²) f′(x) = 0 if m = 0 }.   (10.35)
Theorem 10.6. The operator A_m, m ∈ ℕ₀, defined via (10.35) is self-adjoint. Its spectrum is purely discrete, that is,
    σ(A_m) = σ_d(A_m) = { l(l+1) | l ∈ ℕ₀, l ≥ m },   (10.36)
and the corresponding eigenfunctions
    u_{l,m}(x) = √( ((2l+1)/2) ((l−m)!/(l+m)!) ) P_l^m(x), l ∈ ℕ₀, l ≥ m,   (10.37)
form an orthonormal basis for L²(−1, 1).
Proof. By Theorem 9.6, A_m is self-adjoint. Moreover, P_l^m is an eigenfunction corresponding to the eigenvalue l(l+1), and it suffices to show that the P_l^m form a basis. To prove this, it suffices to show that the functions P_l^m(x) are dense. Since (1 − x²) > 0 for x ∈ (−1, 1), it suffices to show that the functions (1 − x²)^{−m/2} P_l^m(x) are dense. But the span of these functions contains every polynomial. Every continuous function can be approximated by polynomials (in the sup norm, Theorem 0.15, and hence in the L² norm), and since the continuous functions are dense, so are the polynomials. □
For the normalization of the eigenfunctions see Problem 10.7, respectively [1, (8.14.13)].
Returning to our original setting, we conclude that the
    Θ_l^m(θ) = √( ((2l+1)/2) ((l−m)!/(l+m)!) ) P_l^m(cos(θ)), |m| ≤ l,   (10.38)
form an orthonormal basis for L²((0,π), sin(θ)dθ) for any fixed m ∈ ℕ₀.
Theorem 10.7. The operator L² on L²((0,π), sin(θ)dθ) ⊗ L²((0,2π)) has a purely discrete spectrum given by
    σ(L²) = { l(l+1) | l ∈ ℕ₀ }.   (10.39)
The spherical harmonics
    Y_l^m(θ,φ) = Θ_l^m(θ) Φ_m(φ) = √( ((2l+1)/(4π)) ((l−m)!/(l+m)!) ) P_l^m(cos(θ)) e^{imφ}, |m| ≤ l,   (10.40)
form an orthonormal basis and satisfy L² Y_l^m = l(l+1) Y_l^m and L₃ Y_l^m = m Y_l^m.

Proof. Everything follows from our construction, if we can show that the Y_l^m form a basis. But this follows as in the proof of Lemma 1.10. □
Note that transforming the Y_l^m back to Cartesian coordinates gives
    Y_l^m(x) = (−1)^m √( ((2l+1)/(4π)) ((l−|m|)!/(l+|m|)!) ) P̃_l^{|m|}(x₃/r) ( (x₁ + sgn(m) i x₂)/r )^{|m|}, r = |x|,   (10.41)
where P̃_l^m is a polynomial of degree l − m given by
    P̃_l^m(x) = (1 − x²)^{−m/2} P_l^m(x) = ((−1)^m/(2^l l!)) (d^{l+m}/dx^{l+m}) (x² − 1)^l.   (10.42)
In particular, the Y_l^m are smooth away from the origin, and by construction they satisfy
    −Δ Y_l^m = (l(l+1)/r²) Y_l^m.   (10.43)
Problem 10.2. Show that the associated Legendre functions satisfy the differential equation (10.30). (Hint: Start with the Legendre polynomials (10.32), which correspond to m = 0. Set v(x) = (x² − 1)^l and observe (x² − 1)v′(x) = 2lx v(x). Then differentiate this identity l + 1 times using Leibniz's rule. For the case of the associated Legendre functions, substitute v(x) = (1 − x²)^{m/2} u(x) in (10.30) and differentiate the resulting equation once.)
Problem 10.3. Show (10.33). (Hint: Write (x² − 1)^l = (x − 1)^l (x + 1)^l and use Leibniz's rule.)
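The identity (10.33) can also be checked symbolically. The following sketch (an illustration, not from the book) builds P_l^m directly from the Rodrigues-type formula P_l^m(x) = ((−1)^m/(2^l l!))(1−x²)^{m/2} d^{l+m}/dx^{l+m} (x²−1)^l, which reduces to (10.31)–(10.32) for m ≥ 0 and also makes sense for negative m, and then verifies the relation numerically at sample points:

```python
import sympy as sp

x = sp.symbols('x')

def P(l, m):
    # Rodrigues-type formula, valid for -l <= m <= l
    return (-1)**m / (2**l * sp.factorial(l)) * (1 - x**2)**sp.Rational(m, 2) \
        * sp.diff((x**2 - 1)**l, x, l + m)

ok = True
for l in range(1, 5):
    for m in range(1, l + 1):
        d = P(l, m) - (-1)**m * sp.factorial(l + m) / sp.factorial(l - m) * P(l, -m)
        for xv in (-0.6, 0.25, 0.8):
            ok = ok and abs(float(d.subs(x, xv))) < 1e-9
assert ok
```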
Problem 10.4 (Orthogonal polynomials). Suppose the monic polynomials P_j(x) = x^j + β_j x^{j−1} + ... are orthogonal with respect to the weight function w(x):
    ∫_a^b P_i(x) P_j(x) w(x) dx = { γ_j², i = j;  0, otherwise }.
Note that they are uniquely determined by the Gram–Schmidt procedure. Let P̃_j(x) = γ_j⁻¹ P_j(x) and show that they satisfy the three-term recurrence relation
    a_j P̃_{j+1}(x) + b_j P̃_j(x) + a_{j−1} P̃_{j−1}(x) = x P̃_j(x),
where
    a_j = ∫_a^b x P̃_{j+1}(x) P̃_j(x) w(x) dx,   b_j = ∫_a^b x P̃_j(x)² w(x) dx.
Moreover, show
    a_j = γ_{j+1}/γ_j,   b_j = β_j − β_{j+1}.
(Note that w(x)dx could be replaced by a measure dµ(x).)
(Note that w(x)dx could be replaced by a measure d(x).)
Problem 10.5. Consider the orthogonal polynomials with respect to the
weight function w(x) as in the previous problem. Suppose [w(x)[ Ce
k[x[
for some C, k > 0. Show that the orthogonal polynomials are dense in
L
2
(R, w(x)dx). (Hint: It suces to show that
_
f(x)x
j
w(x)dx = 0 for
all j N
0
implies f = 0. Consider the Fourier transform of f(x)w(x) and
note that it has an analytic extension by Problem 7.11. Hence this Fourier
transform will be zero if, e.g., all derivatives at p = 0 are zero (cf. Prob-
lem 7.3).)
Problem 10.6. Show
    P_l(x) = Σ_{k=0}^{⌊l/2⌋} ( (−1)^k (2l − 2k)! / (2^l k! (l − k)! (l − 2k)!) ) x^{l−2k}.
Moreover, by Problem 10.4 there is a recurrence relation of the form P_{l+1}(x) = (ã_l + b̃_l x) P_l(x) + c̃_l P_{l−1}(x). Find the coefficients by comparing the highest powers in x and conclude
    (l + 1) P_{l+1}(x) = (2l + 1) x P_l(x) − l P_{l−1}(x).
Use this to prove
    ∫_{−1}^1 P_l(x)² dx = 2/(2l + 1).
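Both the recurrence and the normalization can be confirmed numerically (an illustration, not from the book), using scipy's Legendre evaluator and Gauss–Legendre quadrature, which is exact for the polynomial integrands involved:

```python
import numpy as np
from scipy.special import eval_legendre

# Recurrence: (l+1) P_{l+1}(x) = (2l+1) x P_l(x) - l P_{l-1}(x)
x = np.linspace(-1, 1, 101)
for l in range(1, 8):
    lhs = (l + 1) * eval_legendre(l + 1, x)
    rhs = (2*l + 1) * x * eval_legendre(l, x) - l * eval_legendre(l - 1, x)
    assert np.allclose(lhs, rhs)

# Normalization: int_{-1}^{1} P_l(x)^2 dx = 2/(2l+1)
nodes, weights = np.polynomial.legendre.leggauss(30)
for l in range(8):
    norm = float(np.sum(weights * eval_legendre(l, nodes)**2))
    assert abs(norm - 2/(2*l + 1)) < 1e-12
```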
Problem 10.7. Prove
    ∫_{−1}^1 P_l^m(x)² dx = (2/(2l + 1)) ((l + m)!/(l − m)!).
(Hint: Use (10.33) to compute ∫_{−1}^1 P_l^m(x) P_l^{−m}(x) dx by integrating by parts until you can use the case m = 0 from the previous problem.)
10.4. The eigenvalues of the hydrogen atom

Now we want to use the considerations from the previous section to decompose the Hamiltonian of the hydrogen atom. In fact, we can even admit any spherically symmetric potential V(x) = V(|x|) with
    V(r) ∈ L∞_∞(ℝ) + L²((0, ∞), r² dr)   (10.44)
such that Theorem 10.2 holds.
The important observation is that the spaces
    H_{l,m} = { ψ(x) = R(r) Y_l^m(θ, φ) | R(r) ∈ L²((0, ∞), r² dr) }   (10.45)
with corresponding projectors
    P_l^m ψ(r, θ, φ) = ( ∫₀^{2π} ∫₀^π ψ(r, θ′, φ′) Y_l^m(θ′, φ′)* sin(θ′) dθ′ dφ′ ) Y_l^m(θ, φ)   (10.46)
reduce our operator H = H₀ + V. By Lemma 2.24 it suffices to check this for H restricted to C_c^∞(ℝ³), which is straightforward. Hence, again by Lemma 2.24,
    H = H₀ + V = ⊕_{l,m} H̃_l,   (10.47)
where
    H̃_l R(r) = τ̃_l R(r),   τ̃_l = −(1/r²) (d/dr) r² (d/dr) + l(l+1)/r² + V(r),
    D(H̃_l) ⊆ L²((0, ∞), r² dr).   (10.48)
Using the unitary transformation
    L²((0, ∞), r² dr) → L²((0, ∞)), R(r) ↦ u(r) = r R(r),   (10.49)
our operator transforms to
    A_l f = τ_l f,   τ_l = −d²/dr² + l(l+1)/r² + V(r),
    D(A_l) = P_l^m D(H) ⊆ L²((0, ∞)).   (10.50)
It remains to investigate this operator (that its domain is indeed independent of m follows from the next theorem).
It remains to investigate this operator (that its domain is indeed independent
of m follows from the next theorem).
Theorem 10.8. The domain of the operator A
l
is given by
D(A
l
) = f L
2
(I)[ f, f
t
AC(I), f L
2
(I),
lim
r0
(f(r) rf
t
(r)) = 0 if l = 0,
(10.51)
where I = (0, ). Moreover,

ess
(A
l
) =
ac
(A
l
) = [0, ),
sc
(A
l
) = ,
p
(, 0]. (10.52)
Proof. By construction of A_l we know that it is self-adjoint and satisfies σ_ess(A_l) ⊆ [0, ∞) (Problem 10.8). By Lemma 9.37 we have (0, ∞) ⊆ N_∞(τ_l) and hence Theorem 9.31 implies σ_ac(A_l) = [0, ∞), σ_sc(A_l) = ∅, and σ_p ⊆ (−∞, 0]. So it remains to compute the domain. We know at least D(A_l) ⊆ D(τ), and since D(H) = D(H₀), it suffices to consider the case V = 0. In this case the solutions of −u″(r) + (l(l+1)/r²) u(r) = 0 are given by u(r) = α r^{l+1} + β r^{−l}. Thus we are in the l.p. case at ∞ for any l ∈ ℕ₀. However, at 0 we are in the l.p. case only if l > 0; that is, we need an additional boundary condition at 0 if l = 0. Since we need R(r) = u(r)/r to be bounded (such that (10.20) is in the domain of H₀, that is, continuous), we have to take the boundary condition generated by u(r) = r. □
Finally, let us turn to some explicit choices for V, where the corresponding differential equation can be explicitly solved. The simplest case is V = 0. In this case the solutions of
    −u″(r) + (l(l+1)/r²) u(r) = z u(r)   (10.53)
are given by
    u(r) = α z^{−l/2} r j_l(√z r) + β z^{(l+1)/2} r y_l(√z r),   (10.54)
where j_l(r) and y_l(r) are the spherical Bessel, respectively spherical Neumann, functions
    j_l(r) = √(π/(2r)) J_{l+1/2}(r) = (−r)^l ( (1/r)(d/dr) )^l ( sin(r)/r ),
    y_l(r) = √(π/(2r)) Y_{l+1/2}(r) = −(−r)^l ( (1/r)(d/dr) )^l ( cos(r)/r ).   (10.55)
Note that z^{−l/2} r j_l(√z r) and z^{(l+1)/2} r y_l(√z r) are entire as functions of z and their Wronskian is given by
    W( z^{−l/2} r j_l(√z r), z^{(l+1)/2} r y_l(√z r) ) = 1.
See [1, Sects. 10.1 and 10.3]. In particular,
    u_a(z, r) = z^{−l/2} r j_l(√z r) = (2^l l!/(2l+1)!) r^{l+1} (1 + O(r²)),
    u_b(z, r) = √z r ( j_l(√z r) + i y_l(√z r) ) = e^{i(√z r − (l+1)π/2)} (1 + O(1/r))   (10.56)
are the functions which are square integrable and satisfy the boundary condition (if any) near a = 0 and b = ∞, respectively.
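The closed forms in (10.55) and the constancy of the Wronskian can be checked numerically (an illustration, not from the book) against scipy's spherical Bessel routines:

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

r = np.linspace(0.5, 10, 200)

# Closed forms for l = 1: j_1(r) = sin r/r^2 - cos r/r, y_1(r) = -cos r/r^2 - sin r/r
assert np.allclose(spherical_jn(1, r), np.sin(r)/r**2 - np.cos(r)/r)
assert np.allclose(spherical_yn(1, r), -np.cos(r)/r**2 - np.sin(r)/r)

# Constancy of the Wronskian W(r j_l(r), r y_l(r)) (z = 1); its value is 1.
l = 2
u = r * spherical_jn(l, r)
v = r * spherical_yn(l, r)
up = spherical_jn(l, r) + r * spherical_jn(l, r, derivative=True)
vp = spherical_yn(l, r) + r * spherical_yn(l, r, derivative=True)
W = u * vp - up * v
assert np.allclose(W, 1.0)
```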
The second case is that of our Coulomb potential
    V(r) = −γ/r, γ > 0,   (10.57)
where we will try to compute the eigenvalues plus corresponding eigenfunctions. It turns out that they can be expressed in terms of the Laguerre polynomials ([1, (22.2.13)])
    L_j(r) = (e^r/j!) (d^j/dr^j) ( e^{−r} r^j )   (10.58)
and the generalized Laguerre polynomials ([1, (22.2.12)])
    L_j^{(k)}(r) = (−1)^k (d^k/dr^k) L_{j+k}(r).   (10.59)
Note that the L_j^{(k)}(r) are polynomials of degree j which are explicitly given by
    L_j^{(k)}(r) = Σ_{i=0}^j (−1)^i ( (j+k) choose (j−i) ) r^i/i!   (10.60)
and satisfy the differential equation (Problem 10.9)
    r y″(r) + (k + 1 − r) y′(r) + j y(r) = 0.   (10.61)
Moreover, they are orthogonal in the Hilbert space L²((0, ∞), r^k e^{−r} dr) (Problem 10.10):
    ∫₀^∞ L_i^{(k)}(r) L_j^{(k)}(r) r^k e^{−r} dr = { (j+k)!/j!, i = j;  0, otherwise }.   (10.62)
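The orthogonality relation (10.62) can be confirmed numerically (an illustration, not from the book) with Gauss–Laguerre quadrature, whose weight is exactly e^{−r} on (0,∞); the extra factor r^k goes into the integrand:

```python
import numpy as np
from math import factorial
from scipy.special import genlaguerre

nodes, weights = np.polynomial.laguerre.laggauss(40)
k = 2
for i in range(4):
    for j in range(4):
        integrand = genlaguerre(i, k)(nodes) * genlaguerre(j, k)(nodes) * nodes**k
        val = float(np.sum(weights * integrand))
        expected = factorial(j + k) / factorial(j) if i == j else 0.0
        assert abs(val - expected) < 1e-8
```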
Theorem 10.9. The eigenvalues of H^{(1)} are explicitly given by
    E_n = −( γ/(2(n+1)) )², n ∈ ℕ₀.   (10.63)
An orthonormal basis for the corresponding eigenspace is given by the (n+1)² functions
    ψ_{n,l,m}(x) = R_{n,l}(r) Y_l^m(x), |m| ≤ l ≤ n,   (10.64)
where
    R_{n,l}(r) = √( γ³ (n−l)! / (2(n+1)⁴ (n+l+1)!) ) ( γr/(n+1) )^l e^{−γr/(2(n+1))} L_{n−l}^{(2l+1)}( γr/(n+1) ).   (10.65)
In particular, the lowest eigenvalue E₀ = −γ²/4 is simple and the corresponding eigenfunction
    ψ_{0,0,0}(x) = √( γ³/(8π) ) e^{−γr/2}
is positive.
Proof. Since all eigenvalues are negative, we need to look at the equation
    −u″(r) + ( l(l+1)/r² − γ/r ) u(r) = λ u(r)
for λ < 0. Introducing new variables x = 2√(−λ) r and v(x) = (e^{x/2}/x^{l+1}) u( x/(2√(−λ)) ), this equation transforms into Kummer's equation
    x v″(x) + (k + 1 − x) v′(x) + j v(x) = 0,   k = 2l + 1,   j = γ/(2√(−λ)) − (l + 1).
Now let us search for a solution which can be expanded into a convergent power series
    v(x) = Σ_{i=0}^∞ v_i x^i, v₀ = 1.   (10.66)
The corresponding u(r) is square integrable near 0 and satisfies the boundary condition (if any). Thus we need to find those values of λ for which it is square integrable near +∞.
Substituting the ansatz (10.66) into our differential equation and comparing powers of x gives the following recursion for the coefficients:
    v_{i+1} = ( (i − j) / ((i+1)(i+k+1)) ) v_i
and thus
    v_i = (1/i!) Π_{ℓ=0}^{i−1} ( (ℓ − j) / (ℓ + k + 1) ).
Now there are two cases to distinguish. If j ∈ ℕ₀, then v_i = 0 for i > j and v(x) is a polynomial; namely
    v(x) = ( (j+k) choose j )⁻¹ L_j^{(k)}(x).
In this case u(r) is square integrable and hence an eigenfunction corresponding to the eigenvalue λ_j = −( γ/(2(n+1)) )², n = j + l. This proves the formula for R_{n,l}(r) except for the normalization, which follows from (Problem 10.11)
    ∫₀^∞ L_j^{(k)}(r)² r^{k+1} e^{−r} dr = ((j+k)!/j!) (2j + k + 1).   (10.67)
It remains to show that we have found all eigenfunctions, that is, that there are no other square integrable solutions. Otherwise, if j ∉ ℕ₀, we have
    v_{i+1}/v_i ≥ (1 − ε)/(i + 1)
for i sufficiently large. Hence by adding a polynomial to v(x) (and perhaps flipping its sign), we can get a function ṽ(x) such that ṽ_i ≥ (1−ε)^i/i! for all i. But then ṽ(x) ≥ exp((1−ε)x) and thus the corresponding u(r) is not square integrable near +∞. □
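The eigenvalue formula (10.63) can be probed with a crude finite-difference discretization of the radial operator for l = 0 (an illustration, not from the book; grid size and cutoff R are ad-hoc choices, and u(0) = u(R) = 0 implements the boundary condition at 0 and an artificial cutoff at R):

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

# Discretize -u'' - (gamma/r) u on (0, R] and compare the lowest eigenvalues
# with E_n = -(gamma/(2(n+1)))^2; for gamma = 2 this is E_n = -1/(n+1)^2.
gamma = 2.0
R, N = 80.0, 8000
h = R / (N + 1)
r = h * np.arange(1, N + 1)
diag = 2.0/h**2 - gamma/r          # l = 0: no centrifugal term
off = -np.ones(N - 1)/h**2
evals = eigh_tridiagonal(diag, off, select='i', select_range=(0, 2))[0]
for n in range(3):
    assert abs(evals[n] + (gamma/(2*(n + 1)))**2) < 1e-3
```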
Finally, let us also look at an alternative algebraic approach for computing the eigenvalues and eigenfunctions of A_l based on the commutation methods from Section 8.4. We begin by introducing
    Q_l f = −f′ + ( (l+1)/r − γ/(2(l+1)) ) f,
    D(Q_l) = { f ∈ L²((0,∞)) | f ∈ AC((0,∞)), Q_l f ∈ L²((0,∞)) }.   (10.68)
Then (Problem 9.3) Q_l is closed and its adjoint is given by
    Q_l* f = f′ + ( (l+1)/r − γ/(2(l+1)) ) f,
    D(Q_l*) = { f ∈ L²((0,∞)) | f ∈ AC((0,∞)), Q_l* f ∈ L²((0,∞)), lim_{x→0,∞} f(x)g(x) = 0, ∀g ∈ D(Q_l) }.   (10.69)
It is straightforward to check
    Ker(Q_l) = span{u_{l,0}},   Ker(Q_l*) = {0},   (10.70)
where
    u_{l,0}(r) = (1/√((2l+2)!)) ( γ/(l+1) )^{(l+1)+1/2} r^{l+1} e^{−γr/(2(l+1))}   (10.71)
is normalized.
Theorem 10.10. The radial Schrödinger operator A_l satisfies
    A_l = Q_l* Q_l − c_l²,   A_{l+1} = Q_l Q_l* − c_l²,   (10.72)
where
    c_l = γ/(2(l+1)).   (10.73)
Proof. Equality is easy to check for $f \in AC^2$ with compact support. Hence
$Q_l^* Q_l - c_l^2$ is a self-adjoint extension of $\tau_l$ restricted to this set. If $l > 0$,
there is only one self-adjoint extension and equality follows. If $l = 0$, we
know $u_{0,0} \in D(Q_l^* Q_l)$ and since $A_l$ is the only self-adjoint extension with
$u_{0,0} \in D(A_l)$, equality follows in this case as well. □
Hence, as a consequence of Theorem 8.6 we see $\sigma(A_l) = \sigma(A_{l+1}) \cup \{-c_l^2\}$,
or, equivalently,

$$\sigma_p(A_l) = \{-c_j^2 \,|\, j \ge l\} \qquad (10.74)$$

if we use that $\sigma_p(A_l) \subseteq (-\infty, 0)$, which already follows from the virial theorem. Moreover, using $Q_l$, we can turn any eigenfunction of $H_l$ into one
of $H_{l+1}$. However, we only know the lowest eigenfunction $u_{l,0}$, which is
mapped to $0$ by $Q_l$. On the other hand, we can also use $Q_l^*$ to turn an
eigenfunction of $H_{l+1}$ into one of $H_l$. Hence $Q_l^* u_{l+1,0}$ will give the second
eigenfunction of $H_l$. Proceeding inductively, the normalized eigenfunction
of $H_l$ corresponding to the eigenvalue $-c_{l+j}^2$ is given by

$$u_{l,j} = \Big(\prod_{k=0}^{j-1} (c_{l+k}^2 - c_{l+j}^2)\Big)^{-1/2} Q_l^* Q_{l+1}^* \cdots Q_{l+j-1}^*\, u_{l+j,0}. \qquad (10.75)$$

The connection with Theorem 10.9 is given by

$$R_{n,l}(r) = \frac{1}{r}\, u_{l,n-l}(r). \qquad (10.76)$$
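The starting point of the ladder, namely that $u_{l,0}(r) \propto r^{l+1}\mathrm{e}^{-c_l r}$ is an eigenfunction of $A_l = -\frac{d^2}{dr^2} + \frac{l(l+1)}{r^2} - \frac{\gamma}{r}$ with eigenvalue $-c_l^2$, can be verified exactly by tracking the polynomial factor of $p(r)\mathrm{e}^{-cr}$. The sketch below (ours, not from the text; $\gamma = 3$ is an arbitrary sample value) does this with rational arithmetic for $l = 0,\dots,3$:

```python
from fractions import Fraction

# represent f(r) = (sum_m coef[m] r^m) * exp(-c r) as the dict {m: coef}
def d(poly, c):
    # derivative of poly(r) e^{-cr}, in the same representation
    out = {}
    for m, a in poly.items():
        if m != 0:
            out[m - 1] = out.get(m - 1, Fraction(0)) + m * a
        out[m] = out.get(m, Fraction(0)) - c * a
    return out

def add(p, q):
    out = dict(p)
    for m, a in q.items():
        out[m] = out.get(m, Fraction(0)) + a
    return {m: a for m, a in out.items() if a != 0}

def shift(p, k, s):  # s * r^k * p
    return {m + k: s * a for m, a in p.items()}

gamma = Fraction(3)
for l in range(4):
    c = gamma / (2 * (l + 1))                 # c_l = γ/(2(l+1))
    u = {l + 1: Fraction(1)}                  # u_{l,0} ∝ r^{l+1} e^{-c r}
    upp = d(d(u, c), c)
    # A_l u = -u'' + l(l+1)/r^2 u - γ/r u
    Au = add(add(shift(upp, 0, Fraction(-1)), shift(u, -2, Fraction(l * (l + 1)))),
             shift(u, -1, -gamma))
    assert add(Au, shift(u, 0, c * c)) == {}  # A_l u = -c_l^2 u, exactly
print("u_{l,0} is an eigenfunction of A_l with eigenvalue -c_l^2 for l = 0..3")
```

The cancellation uses nothing but $2c_l(l+1) = \gamma$, which is exactly how $c_l$ was chosen.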
Problem 10.8. Let $A = \bigoplus_n A_n$. Then $\bigcup_n \sigma_{ess}(A_n) \subseteq \sigma_{ess}(A)$.

Problem 10.9. Show that the generalized Laguerre polynomials satisfy the
differential equation (10.61). (Hint: Start with the Laguerre polynomials
(10.58) which correspond to $k = 0$. Set $v(r) = r^j \mathrm{e}^{-r}$ and observe $r v'(r) =
(j - r)v(r)$. Then differentiate this identity $j+1$ times using Leibniz's
rule. For the case of the generalized Laguerre polynomials, start with the
differential equation for $L_{j+k}(r)$ and differentiate $k$ times.)
Problem 10.10. Show that the differential equation (10.58) can be rewritten
in Sturm–Liouville form as

$$-r^{-k}\mathrm{e}^{r}\, \frac{d}{dr}\, r^{k+1}\mathrm{e}^{-r}\, \frac{d}{dr}\, u = j\, u.$$

We have found one entire solution in the proof of Theorem 10.9. Show that
any linearly independent solution behaves like $\log(r)$ if $k = 0$, respectively,
like $r^{-k}$ otherwise. Show that it is l.c. at the endpoint $r = 0$ if $k = 0$ and
l.p. otherwise.
Let $H = L^2((0,\infty), r^k \mathrm{e}^{-r} dr)$. The operator

$$A_k f = \tau_k f = -r^{-k}\mathrm{e}^{r}\, \frac{d}{dr}\, r^{k+1}\mathrm{e}^{-r}\, \frac{d}{dr}\, f,$$
$$D(A_k) = \{f \in H\,|\, f \in AC^1((0,\infty)),\ \tau_k f \in H,\ \lim_{r\to 0} r f'(r) = 0 \text{ if } k = 0\}$$

for $k \in \mathbb{N}_0$ is self-adjoint. Its spectrum is purely discrete, that is,

$$\sigma(A_k) = \sigma_d(A_k) = \mathbb{N}_0, \qquad (10.77)$$

and the corresponding eigenfunctions

$$L^{(k)}_j(r),\qquad j \in \mathbb{N}_0, \qquad (10.78)$$

form an orthogonal base for $H$. (Hint: Compare the argument for the associated Legendre equation and Problem 10.5.)
Problem 10.11. By Problem 10.4 there is a recurrence relation of the form
$L^{(k)}_{j+1}(r) = (\tilde{a}_j + \tilde{b}_j r)L^{(k)}_j(r) + \tilde{c}_j L^{(k)}_{j-1}(r)$. Find the coefficients by comparing
the highest powers in $r$ and conclude

$$L^{(k)}_{j+1}(r) = \frac{1}{1+j}\Big((2j+k+1-r) L^{(k)}_j(r) - (j+k) L^{(k)}_{j-1}(r)\Big).$$

Use this to prove (10.62) and (10.67).
10.5. Nondegeneracy of the ground state
The lowest eigenvalue (below the essential spectrum) of a Schrödinger operator is called the ground state. Since the laws of physics state that a
quantum system will transfer energy to its surroundings (e.g., an atom emits
radiation) until it eventually reaches its ground state, this state is in some
sense the most important state. We have seen that the hydrogen atom has
a nondegenerate (simple) ground state with a corresponding positive eigenfunction. In particular, the hydrogen atom is stable in the sense that there
is a lowest possible energy. This is quite surprising since the corresponding
classical mechanical system is not: the electron could fall into the nucleus!

Our aim in this section is to show that the ground state is simple with a
corresponding positive eigenfunction. Note that it suffices to show that any
ground state eigenfunction is positive since nondegeneracy then follows for
free: two positive functions cannot be orthogonal.

To set the stage, let us introduce some notation. Let $H = L^2(\mathbb{R}^n)$. We
call $f \in L^2(\mathbb{R}^n)$ positive if $f \ge 0$ a.e. and $f \ne 0$. We call $f$ strictly positive if $f > 0$ a.e. A bounded operator $A$ is called positivity preserving if
$f \ge 0$ implies $Af \ge 0$ and positivity improving if $f$ positive implies $Af$ strictly positive.
Clearly $A$ is positivity preserving (improving) if and only if $\langle f, Ag\rangle \ge 0$
($> 0$) for all positive $f, g$.
Example. Multiplication by a positive function is positivity preserving (but
not improving). Convolution with a strictly positive function is positivity
improving.

We first show that positivity improving operators have positive eigenfunctions.

Theorem 10.11. Suppose $A \in \mathcal{L}(L^2(\mathbb{R}^n))$ is a self-adjoint, positivity improving and real (i.e., it maps real functions to real functions) operator. If
$\|A\|$ is an eigenvalue, then it is simple and the corresponding eigenfunction
is strictly positive.
Proof. Let $\psi$ be an eigenfunction. It is no restriction to assume that $\psi$ is
real (since $A$ is real, both real and imaginary parts of $\psi$ are eigenfunctions
as well). We assume $\|\psi\| = 1$ and denote by $\psi_\pm = \frac{|\psi| \pm \psi}{2}$ the positive and
negative parts of $\psi$. Then by $|A\psi| = |A\psi_+ - A\psi_-| \le A\psi_+ + A\psi_- = A|\psi|$
we have

$$\|A\| = \langle \psi, A\psi\rangle \le \langle |\psi|, |A\psi|\rangle \le \langle |\psi|, A|\psi|\rangle \le \|A\|;$$

that is, $\langle \psi, A\psi\rangle = \langle |\psi|, A|\psi|\rangle$ and thus

$$\langle \psi_+, A\psi_-\rangle = \frac{1}{4}\big(\langle |\psi|, A|\psi|\rangle - \langle \psi, A\psi\rangle\big) = 0.$$

Consequently $\psi_- = 0$ or $\psi_+ = 0$, since otherwise $A\psi_- > 0$ and hence also
$\langle \psi_+, A\psi_-\rangle > 0$. Without restriction $\psi = \psi_+ \ge 0$ and since $A$ is positivity
improving, we even have $\psi = \|A\|^{-1} A\psi > 0$. □
So we need a positivity improving operator. By (7.38) and (7.39) both
$\mathrm{e}^{-tH_0}$, $t > 0$, and $R_\lambda(H_0)$, $\lambda < 0$, are positivity improving, since they are given by convolution
with a strictly positive function. Our hope is that this property carries over
to $H = H_0 + V$.
Theorem 10.12. Suppose $H = H_0 + V$ is self-adjoint and bounded from
below with $C_c^\infty(\mathbb{R}^n)$ as a core. If $E_0 = \min \sigma(H)$ is an eigenvalue, it is
simple and the corresponding eigenfunction is strictly positive.
Proof. We first show that $\mathrm{e}^{-tH}$, $t > 0$, is positivity preserving. If we set
$V_n = V \chi_{\{x\,|\,|V(x)| \le n\}}$, then $V_n$ is bounded and $H_n = H_0 + V_n$ is positivity
preserving by the Trotter product formula since both $\mathrm{e}^{-tH_0}$ and $\mathrm{e}^{-tV_n}$ are.
Moreover, we have $H_n\psi \to H\psi$ for $\psi \in C_c^\infty(\mathbb{R}^n)$ (note that necessarily
$V \in L^2_{loc}$) and hence $H_n \to H$ in the strong resolvent sense by Lemma 6.36.
Hence $\mathrm{e}^{-tH_n} \to \mathrm{e}^{-tH}$ strongly by Theorem 6.31, which shows that $\mathrm{e}^{-tH}$ is at least
positivity preserving (since $0$ cannot be an eigenvalue of $\mathrm{e}^{-tH}$, it cannot map
a positive function to $0$).
Next I claim that for $\psi$ positive the closed set

$$N(\psi) = \{\varphi \in L^2(\mathbb{R}^n)\,|\, \varphi \ge 0,\ \langle \varphi, \mathrm{e}^{-sH}\psi\rangle = 0\ \forall s \ge 0\}$$

is just $\{0\}$. If $\varphi \in N(\psi)$, we have by $\mathrm{e}^{-sH}\psi \ge 0$ that $\varphi\, \mathrm{e}^{-sH}\psi = 0$. Hence
$\mathrm{e}^{tV_n}\varphi\, \mathrm{e}^{-sH}\psi = 0$; that is, $\mathrm{e}^{tV_n}\varphi \in N(\psi)$. In other words, both $\mathrm{e}^{tV_n}$ and $\mathrm{e}^{-tH}$
leave $N(\psi)$ invariant and invoking Trotter's formula again, the same is true
for

$$\mathrm{e}^{-t(H-V_n)} = \operatorname*{s-lim}_{k\to\infty} \Big(\mathrm{e}^{-\frac{t}{k}H}\, \mathrm{e}^{\frac{t}{k}V_n}\Big)^k.$$

Since $\mathrm{e}^{-t(H-V_n)} \to \mathrm{e}^{-tH_0}$ strongly, we finally obtain that $\mathrm{e}^{-tH_0}$ leaves $N(\psi)$ invariant,
but this operator is positivity improving and thus $N(\psi) = \{0\}$.

Now it remains to use (7.37), which shows

$$\langle \varphi, R_H(\lambda)\psi\rangle = \int_0^\infty \mathrm{e}^{\lambda t} \langle \varphi, \mathrm{e}^{-tH}\psi\rangle\, dt > 0,\qquad \lambda < E_0,$$

for $\varphi$, $\psi$ positive. So $R_H(\lambda)$ is positivity improving for $\lambda < E_0$.

If $\psi$ is an eigenfunction of $H$ corresponding to $E_0$, it is an eigenfunction
of $R_H(\lambda)$ corresponding to $\frac{1}{E_0 - \lambda}$ and the claim follows since $\|R_H(\lambda)\| = \frac{1}{E_0 - \lambda}$. □
The assumptions are, for example, satisfied for the potentials $V$ considered
in Theorem 10.2.
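The Trotter product formula, used twice in the proof above, can be illustrated for matrices, where the role of $\mathrm{e}^{-tH_0}$ is played by the semigroup of a discrete Laplacian (strictly positive entries, hence positivity improving) and $V$ by a diagonal matrix. The following sketch (ours, not from the text; all matrices and parameters are arbitrary samples) checks both the convergence and the preservation of positivity:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(A, terms=60):
    # exp(A) by its power series (fine for these small matrices)
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    T = [row[:] for row in E]
    for m in range(1, terms):
        T = mat_mul(T, [[a / m for a in row] for row in A])
        E = [[E[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    return E

t, k = 0.5, 20000
H0 = [[2.0, -1.0], [-1.0, 2.0]]   # discrete Laplacian: e^{-tH0} has positive entries
V  = [[5.0, 0.0], [0.0, -3.0]]    # a bounded multiplication operator

exact = mat_exp([[-t * (H0[i][j] + V[i][j]) for j in range(2)] for i in range(2)])
step = mat_mul(mat_exp([[-t / k * a for a in row] for row in H0]),
               mat_exp([[-t / k * a for a in row] for row in V]))
approx = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(k):                 # (e^{-tH0/k} e^{-tV/k})^k
    approx = mat_mul(approx, step)

err = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
assert err < 1e-3                  # Trotter product converges to e^{-t(H0+V)}
# each factor maps nonnegative vectors to nonnegative vectors, hence so does the limit
assert all(x >= 0 for row in approx for x in row)
print("Trotter error:", err)
```

This is exactly the mechanism of the proof: positivity preservation survives the limit because it holds for every finite product.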
Chapter 11
Atomic Schrödinger operators
11.1. Self-adjointness

In this section we want to have a look at the Hamiltonian corresponding to
more than one interacting particle. It is given by

$$H = -\sum_{j=1}^{N} \Delta_j + \sum_{j<k}^{N} V_{j,k}(x_j - x_k). \qquad (11.1)$$

We first consider the case of two particles, which will give us a feeling
for how the many particle case differs from the one particle case and how
the difficulties can be overcome.

We denote the coordinates corresponding to the first particle by $x_1 =
(x_{1,1}, x_{1,2}, x_{1,3})$ and those corresponding to the second particle by $x_2 =
(x_{2,1}, x_{2,2}, x_{2,3})$. If we assume that the interaction is again of the Coulomb
type, the Hamiltonian is given by

$$H = -\Delta_1 - \Delta_2 - \frac{\gamma}{|x_1 - x_2|},\qquad D(H) = H^2(\mathbb{R}^6). \qquad (11.2)$$
Since Theorem 10.2 does not allow singularities for $n \ge 3$, it does not tell
us whether $H$ is self-adjoint or not. Let

$$(y_1, y_2) = \frac{1}{\sqrt{2}}\begin{pmatrix} I & I \\ I & -I \end{pmatrix}(x_1, x_2). \qquad (11.3)$$
Then $H$ reads in this new coordinate system as

$$H = (-\Delta_1) + \Big(-\Delta_2 - \frac{\gamma}{\sqrt{2}\,|y_2|}\Big). \qquad (11.4)$$
In particular, it is the sum of a free particle plus a particle in an external
Coulomb field. From a physics point of view, the first part corresponds to
the center of mass motion and the second part to the relative motion.
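The mechanics behind (11.3) and (11.4), namely that the map is orthogonal (so the Laplacian keeps its form) and that $|x_1 - x_2| = \sqrt{2}\,|y_2|$ (so the Coulomb term becomes $\gamma/(\sqrt{2}|y_2|)$), can be spot-checked numerically. A small sketch (an illustration added here, not part of the original text):

```python
import math, random

s = 1 / math.sqrt(2)
random.seed(0)
for _ in range(100):
    x1 = [random.uniform(-5, 5) for _ in range(3)]
    x2 = [random.uniform(-5, 5) for _ in range(3)]
    y1 = [s * (a + b) for a, b in zip(x1, x2)]   # y1 = (x1 + x2)/√2
    y2 = [s * (a - b) for a, b in zip(x1, x2)]   # y2 = (x1 - x2)/√2
    # orthogonality: the Euclidean norm on R^6 is preserved
    n_x = sum(a * a for a in x1) + sum(a * a for a in x2)
    n_y = sum(a * a for a in y1) + sum(a * a for a in y2)
    assert abs(n_x - n_y) < 1e-9
    # |x1 - x2| = √2 |y2|, so γ/|x1 - x2| = γ/(√2 |y2|)
    assert abs(math.dist(x1, x2) - math.sqrt(2) * math.dist(y2, [0, 0, 0])) < 1e-9
print("transform is orthogonal; the Coulomb term becomes γ/(√2 |y2|)")
```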
Using that $\gamma/(\sqrt{2}|y_2|)$ has $(-\Delta_2)$-bound $0$ in $L^2(\mathbb{R}^3)$, it is not hard to
see that the same is true for the $(-\Delta_1 - \Delta_2)$-bound in $L^2(\mathbb{R}^6)$ (details will
follow in the next section). In particular, $H$ is self-adjoint and semi-bounded
for any $\gamma \in \mathbb{R}$. Moreover, you might suspect that $\gamma/(\sqrt{2}|y_2|)$ is relatively
compact with respect to $-\Delta_1 - \Delta_2$ in $L^2(\mathbb{R}^6)$ since it is with respect to $-\Delta_2$
in $L^2(\mathbb{R}^3)$. However, this is not true! This is due to the fact that $\gamma/(\sqrt{2}|y_2|)$
does not vanish as $|y| \to \infty$.
Let us look at this problem from the physical viewpoint. If $\lambda \in \sigma_{ess}(H)$,
this means that the movement of the whole system is somehow unbounded.
There are two possibilities for this.

First, both particles are far away from each other (such that we can
neglect the interaction) and the energy corresponds to the sum of the kinetic
energies of both particles. Since both can be arbitrarily small (but positive),
we expect $[0, \infty) \subseteq \sigma_{ess}(H)$.

Secondly, both particles remain close to each other and move together.
In the last set of coordinates this corresponds to a bound state of the second
operator. Hence we expect $[\sigma_0, \infty) \subseteq \sigma_{ess}(H)$, where $\sigma_0 = -\gamma^2/8$ is the
smallest eigenvalue of the second operator if the forces are attracting ($\gamma \ge 0$)
and $\sigma_0 = 0$ if they are repelling ($\gamma \le 0$).
It is not hard to translate this intuitive idea into a rigorous proof. Let
$\psi_1(y_1)$ be a Weyl sequence corresponding to $\lambda \in [0, \infty)$ for $-\Delta_1$ and let
$\psi_2(y_2)$ be a Weyl sequence corresponding to $\sigma_0$ for $-\Delta_2 - \gamma/(\sqrt{2}|y_2|)$. Then
$\psi_1(y_1)\psi_2(y_2)$ is a Weyl sequence corresponding to $\lambda + \sigma_0$ for $H$ and thus
$[\sigma_0, \infty) \subseteq \sigma_{ess}(H)$. Conversely, we have $-\Delta_1 \ge 0$, respectively, $-\Delta_2 -
\gamma/(\sqrt{2}|y_2|) \ge \sigma_0$, and hence $H \ge \sigma_0$. Thus we obtain

$$\sigma(H) = \sigma_{ess}(H) = [\sigma_0, \infty),\qquad \sigma_0 = \begin{cases} -\gamma^2/8, & \gamma \ge 0, \\ 0, & \gamma \le 0. \end{cases} \qquad (11.5)$$

Clearly, the physically relevant information is the spectrum of the operator
$-\Delta_2 - \gamma/(\sqrt{2}|y_2|)$, which is hidden by the spectrum of $-\Delta_1$. Hence, in order
to reveal the physics, one first has to remove the center of mass motion.
To avoid clumsy notation, we will restrict ourselves to the case of one
atom with $N$ electrons whose nucleus is fixed at the origin. In particular,
this implies that we do not have to deal with the center of mass motion
encountered in our example above. In this case the Hamiltonian is given by

$$H^{(N)} = -\sum_{j=1}^{N} \Delta_j - \sum_{j=1}^{N} V_{ne}(x_j) + \sum_{j<k}^{N} V_{ee}(x_j - x_k),\qquad D(H^{(N)}) = H^2(\mathbb{R}^{3N}), \qquad (11.6)$$

where $V_{ne}$ describes the interaction of one electron with the nucleus and $V_{ee}$
describes the interaction of two electrons. Explicitly we have

$$V_j(x) = \frac{\gamma_j}{|x|},\qquad \gamma_j > 0,\quad j = ne, ee. \qquad (11.7)$$
We first need to establish the self-adjointness of $H^{(N)} = H_0 + V^{(N)}$. This
will follow from Kato's theorem.

Theorem 11.1 (Kato). Let $V_k \in L^\infty(\mathbb{R}^d) + L^2(\mathbb{R}^d)$, $d \le 3$, be real-valued
and let $V_k(y^{(k)})$ be the multiplication operator in $L^2(\mathbb{R}^n)$, $n = Nd$, obtained
by letting $y^{(k)}$ be the first $d$ coordinates of a unitary transform of $\mathbb{R}^n$. Then
$V_k$ is $H_0$ bounded with $H_0$-bound $0$. In particular,

$$H = H_0 + \sum_k V_k(y^{(k)}),\qquad D(H) = H^2(\mathbb{R}^n), \qquad (11.8)$$

is self-adjoint and $C_0^\infty(\mathbb{R}^n)$ is a core.
Proof. It suffices to consider one $k$. After a unitary transform of $\mathbb{R}^n$ we can
assume $y^{(1)} = (x_1, \dots, x_d)$ since such transformations leave both the scalar
product of $L^2(\mathbb{R}^n)$ and $H_0$ invariant. Now let $\psi \in \mathcal{S}(\mathbb{R}^n)$. Then

$$\|V_k \psi\|^2 \le a^2 \int_{\mathbb{R}^n} |\Delta_1 \psi(x)|^2\, d^n x + b^2 \int_{\mathbb{R}^n} |\psi(x)|^2\, d^n x,$$

where $\Delta_1 = \sum_{j=1}^d \partial^2/\partial^2 x_j$, by our previous lemma. Hence we obtain

$$\|V_k\psi\|^2 \le a^2 \int_{\mathbb{R}^n} \Big|\sum_{j=1}^d p_j^2\, \hat{\psi}(p)\Big|^2 d^n p + b^2\|\psi\|^2
\le a^2 \int_{\mathbb{R}^n} \Big|\sum_{j=1}^n p_j^2\, \hat{\psi}(p)\Big|^2 d^n p + b^2\|\psi\|^2 = a^2\|H_0\psi\|^2 + b^2\|\psi\|^2,$$

which implies that $V_k$ is relatively bounded with bound $0$. The rest follows
from the Kato–Rellich theorem. □

So $V^{(N)}$ is $H_0$ bounded with $H_0$-bound $0$ and thus $H^{(N)} = H_0 + V^{(N)}$
is self-adjoint on $D(H^{(N)}) = D(H_0)$.
11.2. The HVZ theorem
The considerations at the beginning of this section show that it is not so
easy to determine the essential spectrum of $H^{(N)}$ since the potential does
not decay in all directions as $|x| \to \infty$. However, there is still something we
can do. Denote the infimum of the spectrum of $H^{(N)}$ by $\lambda_N$. Then, let us
split the system into $H^{(N-1)}$ plus a single electron. If the single electron is
far away from the remaining system such that there is little interaction, the
energy should be the sum of the kinetic energy of the single electron and
the energy of the remaining system. Hence, arguing as in the two electron
example of the previous section, we expect

Theorem 11.2 (HVZ). Let $H^{(N)}$ be the self-adjoint operator given in (11.6).
Then $H^{(N)}$ is bounded from below and

$$\sigma_{ess}(H^{(N)}) = [\lambda_{N-1}, \infty), \qquad (11.9)$$

where $\lambda_{N-1} = \min \sigma(H^{(N-1)}) < 0$.

In particular, the ionization energy (i.e., the energy needed to remove
one electron from the atom in its ground state) of an atom with $N$ electrons
is given by $\lambda_N - \lambda_{N-1}$.
Our goal for the rest of this section is to prove this result which is due
to Zhislin, van Winter, and Hunziker and is known as the HVZ theorem. In
fact there is a version which holds for general $N$-body systems. The proof
is similar but involves some additional notation.

The idea of proof is the following. To prove $[\lambda_{N-1}, \infty) \subseteq \sigma_{ess}(H^{(N)})$,
we choose Weyl sequences for $H^{(N-1)}$ and $-\Delta_N$ and proceed according to
our intuitive picture from above. To prove $\sigma_{ess}(H^{(N)}) \subseteq [\lambda_{N-1}, \infty)$, we
will localize $H^{(N)}$ on sets where one electron is far away from the nucleus
whenever some of the others are. On these sets, the interaction term between
this electron and the nucleus is decaying and hence does not contribute to
the essential spectrum. So it remains to estimate the infimum of the spectrum
of a system where one electron does not interact with the nucleus. Since the
interaction term with the other electrons is positive, we can finally estimate
this infimum by the infimum of the case where one electron is completely
decoupled from the rest.
We begin with the first inclusion. Let $\psi^{N-1}(x_1, \dots, x_{N-1}) \in H^2(\mathbb{R}^{3(N-1)})$
such that $\|\psi^{N-1}\| = 1$, $\|(H^{(N-1)} - \lambda_{N-1})\psi^{N-1}\| \le \varepsilon$ and $\psi^1 \in H^2(\mathbb{R}^3)$
such that $\|\psi^1\| = 1$, $\|(-\Delta_N - \lambda)\psi^1\| \le \varepsilon$ for some $\lambda \ge 0$. Now consider
$$\psi_r(x_1, \dots, x_N) = \psi^{N-1}(x_1, \dots, x_{N-1})\,\psi^1_r(x_N),\qquad \psi^1_r(x_N) = \psi^1(x_N - r).$$

Then

$$\|(H^{(N)} - \lambda - \lambda_{N-1})\psi_r\| \le \|(H^{(N-1)} - \lambda_{N-1})\psi^{N-1}\|\,\|\psi^1_r\|
+ \|\psi^{N-1}\|\,\|(-\Delta_N - \lambda)\psi^1_r\|
+ \Big\|\Big(V_N - \sum_{j=1}^{N-1} V_{N,j}\Big)\psi_r\Big\|, \qquad (11.10)$$

where $V_N = V_{ne}(x_N)$ and $V_{N,j} = V_{ee}(x_N - x_j)$. Using the fact that
$(V_N - \sum_{j=1}^{N-1} V_{N,j})\psi^{N-1} \in L^2(\mathbb{R}^{3N})$ and $|\psi^1_r| \to 0$ pointwise as $|r| \to \infty$
(by Lemma 10.1), the third term can be made smaller than $\varepsilon$ by choosing
$|r|$ large (dominated convergence). In summary,

$$\|(H^{(N)} - \lambda - \lambda_{N-1})\psi_r\| \le 3\varepsilon, \qquad (11.11)$$

proving $[\lambda_{N-1}, \infty) \subseteq \sigma_{ess}(H^{(N)})$.
The second inclusion is more involved. We begin with a localization
formula.

Lemma 11.3 (IMS localization formula). Suppose $\varphi_j \in C^\infty(\mathbb{R}^n)$, $1 \le j \le m$,
is such that

$$\sum_{j=1}^m \varphi_j(x)^2 = 1,\qquad x \in \mathbb{R}^n. \qquad (11.12)$$

Then

$$\Delta\psi = \sum_{j=1}^m \big(\varphi_j \Delta(\varphi_j\psi) + |\partial\varphi_j|^2\psi\big),\qquad \psi \in H^2(\mathbb{R}^n). \qquad (11.13)$$

Proof. The proof follows from a straightforward computation using the
identities $\sum_j \varphi_j \partial\varphi_j = 0$ and $\sum_j ((\partial_k\varphi_j)^2 + \varphi_j \partial_k^2\varphi_j) = 0$, which follow by
differentiating (11.12). □
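The identity (11.13) can be spot-checked in one dimension with the simplest partition $\cos^2 + \sin^2 = 1$ and finite differences. The sketch below (an illustration added here, not part of the original text; the test function and sample points are arbitrary) verifies it pointwise:

```python
import math

h = 1e-4
def dd(f, x):  # second derivative by central differences
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

psi = lambda x: math.exp(-x * x) * math.sin(3 * x)
p1, p2 = math.cos, math.sin    # p1^2 + p2^2 = 1, a smooth partition as in (11.12)

for x in (-1.3, 0.2, 0.9):
    lhs = dd(psi, x)
    rhs = (p1(x) * dd(lambda y: p1(y) * psi(y), x)
         + p2(x) * dd(lambda y: p2(y) * psi(y), x)
         + (math.sin(x)**2 + math.cos(x)**2) * psi(x))  # |p1'|^2 + |p2'|^2 = 1
    assert abs(lhs - rhs) < 1e-4
print("IMS identity verified pointwise")
```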
Now we will choose $\varphi_j$, $1 \le j \le N$, in such a way that, for $x$ outside
some ball, $x \in \mathrm{supp}(\varphi_j)$ implies that the $j$'th particle is far away from the
nucleus.

Lemma 11.4. Fix some $C \in (0, \frac{1}{\sqrt{N}})$. There exist smooth functions $\varphi_j \in
C^\infty(\mathbb{R}^n, [0,1])$, $1 \le j \le N$, such that (11.12) holds,

$$\mathrm{supp}(\varphi_j) \cap \{x\,|\,|x| \ge 1\} \subseteq \{x\,|\,|x_j| \ge C|x|\}, \qquad (11.14)$$

and $|\partial\varphi_j(x)| \to 0$ as $|x| \to \infty$.

Proof. The open sets

$$U_j = \{x \in S^{3N-1}\,|\,|x_j| > C\}$$
cover the unit sphere in $\mathbb{R}^{3N}$; that is,

$$\bigcup_{j=1}^N U_j = S^{3N-1}.$$

By Lemma 0.13 there is a partition of unity $\tilde{\varphi}_j(x)$ subordinate to this cover.
Extend $\tilde{\varphi}_j(x)$ to a smooth function from $\mathbb{R}^{3N} \setminus \{0\}$ to $[0,1]$ by

$$\tilde{\varphi}_j(\lambda x) = \tilde{\varphi}_j(x),\qquad x \in S^{3N-1},\ \lambda > 0,$$

and pick a function $\tilde{\varphi} \in C^\infty(\mathbb{R}^{3N}, [0,1])$ with support inside the unit ball
which is $1$ in a neighborhood of the origin. Then the

$$\varphi_j = \frac{\tilde{\varphi} + (1 - \tilde{\varphi})\tilde{\varphi}_j}{\sqrt{\sum_{\ell=1}^N \big(\tilde{\varphi} + (1 - \tilde{\varphi})\tilde{\varphi}_\ell\big)^2}}$$

are the desired functions. The gradient tends to zero since $\varphi_j(\lambda x) = \varphi_j(x)$
for $\lambda \ge 1$ and $|x| \ge 1$, which implies $(\partial\varphi_j)(\lambda x) = \lambda^{-1}(\partial\varphi_j)(x)$. □
By our localization formula we have

$$H^{(N)} = \sum_{j=1}^N \varphi_j H^{(N,j)} \varphi_j + P - K,$$
$$K = \sum_{j=1}^N \big(\varphi_j^2 V_j + |\partial\varphi_j|^2\big),\qquad P = \sum_{j=1}^N \varphi_j^2 \sum_{\ell \ne j} V_{j,\ell}, \qquad (11.15)$$

where

$$H^{(N,j)} = -\Delta - \sum_{\ell \ne j}^{N} V_\ell + \sum_{k<\ell,\ k,\ell \ne j}^{N} V_{k,\ell} \qquad (11.16)$$

is the Hamiltonian with the $j$'th electron decoupled from the rest of the
system. Here we have abbreviated $V_j(x) = V_{ne}(x_j)$ and $V_{j,\ell} = V_{ee}(x_j - x_\ell)$.
Since $K$ vanishes as $|x| \to \infty$, we expect it to be relatively compact with
respect to the rest. By Lemma 6.23 it suffices to check that it is relatively
compact with respect to $H_0$. The terms $|\partial\varphi_j|^2$ are bounded and vanish at
$\infty$; hence they are $H_0$ compact by Lemma 7.11. However, the terms $\varphi_j^2 V_j$
have singularities and will be covered by the following lemma.

Lemma 11.5. Let $V$ be a multiplication operator which is $H_0$ bounded with
$H_0$-bound $0$ and suppose that $\|\chi_{\{x\,|\,|x| \ge R\}} V R_{H_0}(z)\| \to 0$ as $R \to \infty$. Then
$V$ is relatively compact with respect to $H_0$.
Proof. Let $\psi_n$ converge to $0$ weakly. Note that $\|\psi_n\| \le M$ for some
$M > 0$. It suffices to show that $\|V R_{H_0}(z)\psi_n\|$ converges to $0$. Choose
$\varphi \in C_0^\infty(\mathbb{R}^n, [0,1])$ such that it is one for $|x| \le R$. Note $\varphi D(H_0) \subset D(H_0)$.
Then

$$\|V R_{H_0}(z)\psi_n\| \le \|(1 - \varphi)V R_{H_0}(z)\psi_n\| + \|V \varphi R_{H_0}(z)\psi_n\|
\le \|(1 - \varphi)V R_{H_0}(z)\|\,\|\psi_n\| + a\|H_0 \varphi R_{H_0}(z)\psi_n\| + b\|\varphi R_{H_0}(z)\psi_n\|.$$

By assumption, the first term can be made smaller than $\varepsilon$ by choosing $R$
large. Next, the same is true for the second term choosing $a$ small, since
$H_0 \varphi R_{H_0}(z)$ is bounded (by Problem 2.9 and the closed graph theorem).
Finally, the last term can also be made smaller than $\varepsilon$ by choosing $n$ large
since $\varphi$ is $H_0$ compact. □
So $K$ is relatively compact with respect to $H^{(N)}$. In particular, $H^{(N)} +
K$ is self-adjoint on $H^2(\mathbb{R}^{3N})$ and $\sigma_{ess}(H^{(N)}) = \sigma_{ess}(H^{(N)} + K)$. Since
the operators $H^{(N,j)}$, $1 \le j \le N$, are all of the form $H^{(N-1)}$ plus one
particle which does not interact with the others and the nucleus, we have
$H^{(N,j)} - \lambda_{N-1} \ge 0$, $1 \le j \le N$. Moreover, we have $P \ge 0$ and hence

$$\langle \psi, (H^{(N)} + K - \lambda_{N-1})\psi\rangle = \sum_{j=1}^N \langle \varphi_j\psi, (H^{(N,j)} - \lambda_{N-1})\varphi_j\psi\rangle
+ \langle \psi, P\psi\rangle \ge 0. \qquad (11.17)$$

Thus we obtain the remaining inclusion

$$\sigma_{ess}(H^{(N)}) = \sigma_{ess}(H^{(N)} + K) \subseteq \sigma(H^{(N)} + K) \subseteq [\lambda_{N-1}, \infty), \qquad (11.18)$$

which finishes the proof of the HVZ theorem.
Note that the same proof works if we add additional nuclei at fixed
locations. That is, we can also treat molecules if we assume that the nuclei
are fixed in space.

Finally, let us consider the example of helium-like atoms ($N = 2$). By
the HVZ theorem and the considerations of the previous section we have

$$\sigma_{ess}(H^{(2)}) = \Big[-\frac{\gamma_{ne}^2}{4}, \infty\Big). \qquad (11.19)$$

Moreover, if $\gamma_{ee} = 0$ (no electron interaction), we can take products of one-particle eigenfunctions to show that

$$-\gamma_{ne}^2\Big(\frac{1}{4n^2} + \frac{1}{4m^2}\Big) \in \sigma_p(H^{(2)}(\gamma_{ee} = 0)),\qquad n, m \in \mathbb{N}. \qquad (11.20)$$

In particular, there are eigenvalues embedded in the essential spectrum in
this case. Moreover, since the electron interaction term is positive, we see

$$H^{(2)} \ge -\frac{\gamma_{ne}^2}{2}. \qquad (11.21)$$
Note that there can be no positive eigenvalues by the virial theorem. This
even holds for arbitrary $N$,

$$\sigma_p(H^{(N)}) \subset (-\infty, 0). \qquad (11.22)$$
Chapter 12
Scattering theory
12.1. Abstract theory
In physical measurements one often has the following situation. A particle
is shot into a region where it interacts with some forces and then leaves
the region again. Outside this region the forces are negligible and hence the
time evolution should be asymptotically free. Hence one expects asymptotic
states $\psi_\pm(t) = \exp(-\mathrm{i}tH_0)\psi_\pm(0)$ to exist such that

$$\|\psi(t) - \psi_\pm(t)\| \to 0 \quad \text{as}\quad t \to \pm\infty. \qquad (12.1)$$

[Figure: a trajectory $\psi(t)$ together with its incoming asymptote $\psi_-(t)$ and its outgoing asymptote $\psi_+(t)$.]
Rewriting this condition, we see

$$0 = \lim_{t\to\pm\infty} \|\mathrm{e}^{-\mathrm{i}tH}\psi(0) - \mathrm{e}^{-\mathrm{i}tH_0}\psi_\pm(0)\| = \lim_{t\to\pm\infty} \|\psi(0) - \mathrm{e}^{\mathrm{i}tH}\mathrm{e}^{-\mathrm{i}tH_0}\psi_\pm(0)\| \qquad (12.2)$$

and motivated by this, we define the wave operators by

$$D(\Omega_\pm) = \{\psi \in H\,|\, \exists \lim_{t\to\pm\infty} \mathrm{e}^{\mathrm{i}tH}\mathrm{e}^{-\mathrm{i}tH_0}\psi\},\qquad
\Omega_\pm \psi = \lim_{t\to\pm\infty} \mathrm{e}^{\mathrm{i}tH}\mathrm{e}^{-\mathrm{i}tH_0}\psi. \qquad (12.3)$$
The set $D(\Omega_\pm)$ is the set of all incoming/outgoing asymptotic states $\psi_\pm$ and
$\mathrm{Ran}(\Omega_\pm)$ is the set of all states $\psi$ which have an incoming/outgoing asymptotic
state. If a state $\psi$ has both, that is, $\psi \in \mathrm{Ran}(\Omega_+) \cap \mathrm{Ran}(\Omega_-)$, it is called a
scattering state.

By construction we have

$$\|\Omega_\pm\psi\| = \lim_{t\to\pm\infty} \|\mathrm{e}^{\mathrm{i}tH}\mathrm{e}^{-\mathrm{i}tH_0}\psi\| = \lim_{t\to\pm\infty} \|\psi\| = \|\psi\| \qquad (12.4)$$

and it is not hard to see that $D(\Omega_\pm)$ is closed. Moreover, interchanging the
roles of $H_0$ and $H$ amounts to replacing $\Omega_\pm$ by $\Omega_\pm^{-1}$ and hence $\mathrm{Ran}(\Omega_\pm)$ is
also closed. In summary,

Lemma 12.1. The sets $D(\Omega_\pm)$ and $\mathrm{Ran}(\Omega_\pm)$ are closed and $\Omega_\pm : D(\Omega_\pm) \to
\mathrm{Ran}(\Omega_\pm)$ is unitary.
Next, observe that

$$\lim_{t\to\pm\infty} \mathrm{e}^{\mathrm{i}tH}\mathrm{e}^{-\mathrm{i}tH_0}(\mathrm{e}^{-\mathrm{i}sH_0}\psi) = \lim_{t\to\pm\infty} \mathrm{e}^{-\mathrm{i}sH}(\mathrm{e}^{\mathrm{i}(t+s)H}\mathrm{e}^{-\mathrm{i}(t+s)H_0}\psi) \qquad (12.5)$$

and hence

$$\Omega_\pm \mathrm{e}^{-\mathrm{i}tH_0}\psi = \mathrm{e}^{-\mathrm{i}tH}\Omega_\pm\psi,\qquad \psi \in D(\Omega_\pm). \qquad (12.6)$$

In addition, $D(\Omega_\pm)$ is invariant under $\exp(-\mathrm{i}tH_0)$ and $\mathrm{Ran}(\Omega_\pm)$ is invariant
under $\exp(-\mathrm{i}tH)$. Moreover, if $\psi \in D(\Omega_\pm)^\perp$, then

$$\langle \varphi, \exp(-\mathrm{i}tH_0)\psi\rangle = \langle \exp(\mathrm{i}tH_0)\varphi, \psi\rangle = 0,\qquad \varphi \in D(\Omega_\pm). \qquad (12.7)$$

Hence $D(\Omega_\pm)^\perp$ is invariant under $\exp(-\mathrm{i}tH_0)$ and $\mathrm{Ran}(\Omega_\pm)^\perp$ is invariant
under $\exp(-\mathrm{i}tH)$. Consequently, $D(\Omega_\pm)$ reduces $\exp(-\mathrm{i}tH_0)$ and $\mathrm{Ran}(\Omega_\pm)$
reduces $\exp(-\mathrm{i}tH)$. Moreover, differentiating (12.6) with respect to $t$, we
obtain from Theorem 5.1 the intertwining property of the wave operators.

Theorem 12.2. The subspaces $D(\Omega_\pm)$, respectively, $\mathrm{Ran}(\Omega_\pm)$, reduce $H_0$,
respectively, $H$, and the operators restricted to these subspaces are unitarily
equivalent:

$$\Omega_\pm H_0 \psi = H \Omega_\pm \psi,\qquad \psi \in D(\Omega_\pm) \cap D(H_0). \qquad (12.8)$$
It is interesting to know the correspondence between incoming and outgoing states. Hence we define the scattering operator

$$S = \Omega_+^{-1}\Omega_-,\qquad D(S) = \{\psi \in D(\Omega_-)\,|\,\Omega_-\psi \in \mathrm{Ran}(\Omega_+)\}. \qquad (12.9)$$

Note that we have $D(S) = D(\Omega_-)$ if and only if $\mathrm{Ran}(\Omega_-) \subseteq \mathrm{Ran}(\Omega_+)$ and
$\mathrm{Ran}(S) = D(\Omega_+)$ if and only if $\mathrm{Ran}(\Omega_+) \subseteq \mathrm{Ran}(\Omega_-)$. Moreover, $S$ is unitary
from $D(S)$ onto $\mathrm{Ran}(S)$ and we have

$$H_0 S\psi = S H_0\psi,\qquad \psi \in D(H_0) \cap D(S). \qquad (12.10)$$

However, note that this whole theory is meaningless until we can show that
the domains $D(\Omega_\pm)$ are nontrivial. We first show a criterion due to Cook.
Lemma 12.3 (Cook). Suppose $D(H_0) \subseteq D(H)$. If

$$\int_0^\infty \|(H - H_0)\exp(\mp\mathrm{i}tH_0)\psi\|\, dt < \infty,\qquad \psi \in D(H_0), \qquad (12.11)$$

then $\psi \in D(\Omega_\pm)$, respectively. Moreover, we even have

$$\|(\Omega_\pm - I)\psi\| \le \int_0^\infty \|(H - H_0)\exp(\mp\mathrm{i}tH_0)\psi\|\, dt \qquad (12.12)$$

in this case.

Proof. The result follows from

$$\mathrm{e}^{\mathrm{i}tH}\mathrm{e}^{-\mathrm{i}tH_0}\psi = \psi + \mathrm{i}\int_0^t \exp(\mathrm{i}sH)(H - H_0)\exp(-\mathrm{i}sH_0)\psi\, ds, \qquad (12.13)$$

which holds for $\psi \in D(H_0)$. □
As a simple consequence we obtain the following result for Schrödinger
operators in $\mathbb{R}^3$.

Theorem 12.4. Suppose $H_0$ is the free Schrödinger operator and $H =
H_0 + V$ with $V \in L^2(\mathbb{R}^3)$. Then the wave operators exist and $D(\Omega_\pm) = H$.

Proof. Since we want to use Cook's lemma, we need to estimate

$$\|V\psi(s)\|^2 = \int_{\mathbb{R}^3} |V(x)\psi(s,x)|^2\, dx,\qquad \psi(s) = \exp(\mp\mathrm{i}sH_0)\psi,$$

for given $\psi \in D(H_0)$. Invoking (7.31), we get

$$\|V\psi(s)\| \le \|\psi(s)\|_\infty \|V\| \le \frac{1}{(4\pi s)^{3/2}}\|\psi\|_1\|V\|,\qquad s > 0,$$

at least for $\psi \in L^1(\mathbb{R}^3)$. Moreover, this implies

$$\int_1^\infty \|V\psi(s)\|\, ds \le \frac{1}{4\pi^{3/2}}\|\psi\|_1\|V\|$$

and thus any such $\psi$ is in $D(\Omega_+)$. Since such functions are dense, we obtain
$D(\Omega_+) = H$, and similarly for $\Omega_-$. □
By the intertwining property $\psi$ is an eigenfunction of $H_0$ if and only
if $\Omega_\pm\psi$ is an eigenfunction of $H$. Hence for $\psi \in H_{pp}(H_0)$ it is easy to check
whether it is in $D(\Omega_\pm)$ or not and only the continuous subspace is of interest.
We will say that the wave operators exist if all elements of $H_{ac}(H_0)$ are
asymptotic states, that is,

$$H_{ac}(H_0) \subseteq D(\Omega_\pm), \qquad (12.14)$$

and that they are complete if, in addition, all elements of $H_{ac}(H)$ are
scattering states, that is,

$$H_{ac}(H) \subseteq \mathrm{Ran}(\Omega_\pm). \qquad (12.15)$$
If we even have

$$H_c(H) \subseteq \mathrm{Ran}(\Omega_\pm), \qquad (12.16)$$

they are called asymptotically complete.

We will be mainly interested in the case where $H_0$ is the free Schrödinger
operator and hence $H_{ac}(H_0) = H$. In this latter case the wave operators exist if $D(\Omega_\pm) = H$, they are complete if $H_{ac}(H) = \mathrm{Ran}(\Omega_\pm)$, and they are
asymptotically complete if $H_c(H) = \mathrm{Ran}(\Omega_\pm)$. In particular, asymptotic
completeness implies $H_{sc}(H) = \{0\}$ since $H$ restricted to $\mathrm{Ran}(\Omega_\pm)$ is unitarily equivalent to $H_0$. Completeness implies that the scattering operator
is unitary. Hence, by the intertwining property, kinetic energy is preserved
during scattering:

$$\langle \psi_-, H_0\psi_-\rangle = \langle S\psi_-, SH_0\psi_-\rangle = \langle S\psi_-, H_0 S\psi_-\rangle = \langle \psi_+, H_0\psi_+\rangle \qquad (12.17)$$

for $\psi_- \in D(H_0)$ and $\psi_+ = S\psi_-$.
12.2. Incoming and outgoing states
In the remaining sections we want to apply this theory to Schrödinger operators. Our first goal is to give a precise meaning to some terms in the
intuitive picture of scattering theory introduced in the previous section.

This physical picture suggests that we should be able to decompose
$\psi \in H$ into an incoming and an outgoing part. But how should incoming,
respectively, outgoing, be defined for $\psi \in H$? Well, incoming (outgoing)
means that the expectation of $x^2$ should decrease (increase). Set $x(t)^2 =
\exp(\mathrm{i}H_0 t)x^2\exp(-\mathrm{i}H_0 t)$. Then, abbreviating $\psi(t) = \mathrm{e}^{-\mathrm{i}tH_0}\psi$,

$$\frac{d}{dt} E_\psi(x(t)^2) = \langle \psi(t), \mathrm{i}[H_0, x^2]\psi(t)\rangle = 4\langle \psi(t), D\psi(t)\rangle,\qquad \psi \in \mathcal{S}(\mathbb{R}^n), \qquad (12.18)$$

where $D$ is the dilation operator introduced in (10.9). Hence it is natural to
consider $\psi \in \mathrm{Ran}(P_\pm)$,

$$P_\pm = P_D((0, \pm\infty)), \qquad (12.19)$$

as outgoing, respectively, incoming, states. If we project a state in $\mathrm{Ran}(P_\pm)$
to energies in the interval $(a^2, b^2)$, we expect that it cannot be found in a
ball of radius proportional to $a|t|$ as $t \to \pm\infty$ ($a$ is the minimal velocity of
the particle, since we have assumed the mass to be two). In fact, we will
show below that the tail decays faster than any inverse power of $|t|$.

We first collect some properties of $D$ which will be needed later on. Note

$$\mathcal{F}D = -D\mathcal{F} \qquad (12.20)$$

and hence $\mathcal{F}f(D) = f(-D)\mathcal{F}$. Additionally, we will look for a transformation which maps $D$ to a multiplication operator.
Since the dilation group acts on $|x|$ only, it seems reasonable to switch to
polar coordinates $x = r\omega$, $(r, \omega) \in \mathbb{R}^+ \times S^{n-1}$. Since $U(s)$ essentially transforms $r$ into $r\exp(s)$, we will replace $r$ by $\rho = \log(r)$. In these coordinates
we have

$$U(s)\psi(\mathrm{e}^{\rho}\omega) = \mathrm{e}^{ns/2}\psi(\mathrm{e}^{\rho+s}\omega) \qquad (12.21)$$

and hence $U(s)$ corresponds to a shift of $\rho$ (the constant in front is absorbed
by the volume element). Thus $D$ corresponds to differentiation with respect
to this coordinate and all we have to do to make it a multiplication operator
is to take the Fourier transform with respect to $\rho$.

This leads us to the Mellin transform

$$\mathcal{M} : L^2(\mathbb{R}^n) \to L^2(\mathbb{R} \times S^{n-1}),\qquad
\psi(r\omega) \mapsto (\mathcal{M}\psi)(\lambda, \omega) = \frac{1}{\sqrt{2\pi}} \int_0^\infty r^{-\mathrm{i}\lambda}\psi(r\omega)\, r^{n/2-1}\, dr. \qquad (12.22)$$

By construction, $\mathcal{M}$ is unitary; that is,

$$\int_{\mathbb{R}}\int_{S^{n-1}} |(\mathcal{M}\psi)(\lambda,\omega)|^2\, d\lambda\, d^{n-1}\omega = \int_{\mathbb{R}^+}\int_{S^{n-1}} |\psi(r\omega)|^2\, r^{n-1}\, dr\, d^{n-1}\omega, \qquad (12.23)$$

where $d^{n-1}\omega$ is the normalized surface measure on $S^{n-1}$. Moreover,

$$\mathcal{M}\, U(s)\, \mathcal{M}^{-1} = \mathrm{e}^{\mathrm{i}s\lambda} \qquad (12.24)$$

and hence

$$\mathcal{M}\, D\, \mathcal{M}^{-1} = \lambda. \qquad (12.25)$$

From this it is straightforward to show that

$$\sigma(D) = \sigma_{ac}(D) = \mathbb{R},\qquad \sigma_{sc}(D) = \sigma_{pp}(D) = \emptyset \qquad (12.26)$$

and that $\mathcal{S}(\mathbb{R}^n)$ is a core for $D$. In particular, we have $P_+ + P_- = I$.
Using the Mellin transform, we can now prove Perry's estimate.

Lemma 12.5. Suppose $f \in C_c^\infty(\mathbb{R})$ with $\mathrm{supp}(f) \subset (a^2, b^2)$ for some $a, b >
0$. For any $R \in \mathbb{R}$, $N \in \mathbb{N}$ there is a constant $C$ such that

$$\|\chi_{\{x\,|\,|x| < 2a|t|\}}\, \mathrm{e}^{-\mathrm{i}tH_0} f(H_0) P_D((\pm R, \pm\infty))\| \le \frac{C}{(1+|t|)^N},\qquad \pm t \ge 0, \qquad (12.27)$$

respectively.

Proof. We prove only the $+$ case, the remaining one being similar. Consider
$\psi \in \mathcal{S}(\mathbb{R}^n)$. Introducing

$$\psi(t,x) = \mathrm{e}^{-\mathrm{i}tH_0}f(H_0)P_D((R,\infty))\psi(x) = \langle K_{t,x}, \mathcal{F}P_D((R,\infty))\psi\rangle
= \langle K_{t,x}, P_D((-\infty,-R))\hat{\psi}\rangle,$$

where

$$K_{t,x}(p) = \frac{1}{(2\pi)^{n/2}}\, \mathrm{e}^{\mathrm{i}(tp^2 - px)} f(p^2)^*,$$
we see that it suffices to show

$$\|P_D((-\infty,-R))K_{t,x}\|^2 \le \frac{\mathrm{const}}{(1+|t|)^{2N}},\qquad \text{for } |x| < 2a|t|,\ t > 0.$$

Now we invoke the Mellin transform to estimate this norm:

$$\|P_D((-\infty,-R))K_{t,x}\|^2 = \int_{-\infty}^{-R}\int_{S^{n-1}} |(\mathcal{M}K_{t,x})(\lambda,\omega)|^2\, d\lambda\, d^{n-1}\omega.$$

Since

$$(\mathcal{M}K_{t,x})(\lambda,\omega) = \frac{1}{(2\pi)^{(n+1)/2}} \int_0^\infty \tilde{f}(r)\, \mathrm{e}^{\mathrm{i}\alpha(r)}\, dr \qquad (12.28)$$

with $\tilde{f}(r) = f(r^2)^* r^{n/2-1} \in C_c^\infty((a,b))$ and $\alpha(r) = tr^2 - r\omega x - \lambda\log(r)$, estimating the derivative of $\alpha$, we see

$$\alpha'(r) = 2tr - \omega x - \lambda/r > 0,\qquad r \in (a,b),$$

for $\lambda \le -R$ and $t > -R(2\delta a)^{-1}$, where $\delta$ is the distance of $a$ to the support
of $\tilde{f}$. Hence we can find a constant such that

$$|\alpha'(r)| \ge \mathrm{const}(1 + |\lambda| + |t|),\qquad r \in \mathrm{supp}(\tilde{f}),$$

for $\lambda \le -R$ and $t > -R(\delta a)^{-1}$. Now the method of stationary phase (Problem 12.1) implies

$$|(\mathcal{M}K_{t,x})(\lambda,\omega)| \le \frac{\mathrm{const}}{(1+|\lambda|+|t|)^N}$$

for $\lambda$, $t$ as before. By increasing the constant, we can even assume that it
holds for $t \ge 0$ and $\lambda \le -R$. This finishes the proof. □
Corollary 12.6. Suppose that $f \in C_c^\infty((0,\infty))$ and $R \in \mathbb{R}$. Then the
operator $P_D((\pm R, \pm\infty))f(H_0)\exp(-\mathrm{i}tH_0)$ converges strongly to $0$ as $t \to
\mp\infty$.

Proof. Abbreviating $P_D = P_D((R,\infty))$ and $\chi = \chi_{\{x\,|\,|x| < 2a|t|\}}$, we have

$$\|P_D f(H_0)\mathrm{e}^{\mathrm{i}tH_0}\psi\| \le \|\chi\, \mathrm{e}^{-\mathrm{i}tH_0} f(H_0)^* P_D\|\,\|\psi\| + \|f(H_0)\|\,\|(I - \chi)\psi\|,$$

since $\|A\| = \|A^*\|$. Taking $t \to \infty$, the first term goes to zero by our
lemma and the second goes to zero since $\chi\psi \to \psi$. □
Problem 12.1 (Method of stationary phase). Consider the integral

$$I(t) = \int_{\mathbb{R}} f(r)\mathrm{e}^{\mathrm{i}t\alpha(r)}\, dr$$

with $f \in C_c^\infty(\mathbb{R})$ and a real-valued phase $\alpha \in C^\infty(\mathbb{R})$. Show that $|I(t)| \le
C_N t^{-N}$ for any $N \in \mathbb{N}$ if $|\alpha'(r)| \ge 1$ for $r \in \mathrm{supp}(f)$. (Hint: Make a change
of variables $\rho = \alpha(r)$ and conclude that it suffices to show the case $\alpha(r) = r$.
Now use integration by parts.)
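The decay asserted in this problem, faster than any inverse power of $t$ when $\alpha'$ never vanishes on the support of the amplitude, can be illustrated with a Gaussian amplitude and phase $\alpha(r) = r$, for which the integral has a closed form. The sketch below (ours, not part of the original text; here $f$ is a Gaussian rather than compactly supported, which only strengthens the decay) checks both the value and the rapid decay:

```python
import cmath, math

def I(t, steps=40000, L=8.0):
    # I(t) = ∫ f(r) e^{itα(r)} dr with f(r) = e^{-r²}, α(r) = r (so α'(r) = 1)
    h = 2 * L / steps
    return sum(math.exp(-r * r) * cmath.exp(1j * t * r)
               for r in (-L + (i + 0.5) * h for i in range(steps))) * h

for t in (1.0, 3.0, 6.0):
    exact = math.sqrt(math.pi) * math.exp(-t * t / 4)   # closed form for this f
    assert abs(I(t) - exact) < 1e-8
# no stationary point on supp(f) ⇒ decay faster than any inverse power of t
assert abs(I(6.0)) < abs(I(1.0)) / 1000
print("oscillatory integral decays rapidly when α' does not vanish on supp(f)")
```

If instead $\alpha$ had a stationary point inside the support, the decay would drop to the $t^{-1/2}$ rate familiar from stationary phase.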
12.3. Schrödinger operators with short range potentials
By the RAGE theorem we know that for $\psi \in H_c$, $\psi(t)$ will eventually leave
every compact ball (at least on the average). Hence we expect that the
time evolution will asymptotically look like the free one for $\psi \in H_c$ if the
potential decays sufficiently fast. In other words, we expect such potentials
to be asymptotically complete.

Suppose $V$ is relatively bounded with bound less than one. Introduce

$$h_1(r) = \|V R_{H_0}(z)\chi_r\|,\qquad h_2(r) = \|\chi_r V R_{H_0}(z)\|,\qquad r \ge 0, \qquad (12.29)$$

where

$$\chi_r = \chi_{\{x\,|\,|x| \ge r\}}. \qquad (12.30)$$

The potential $V$ will be called short range if these quantities are integrable.
We first note that it suffices to check this for $h_1$ or $h_2$ and for one $z \in \rho(H_0)$.

Lemma 12.7. The function $h_1$ is integrable if and only if $h_2$ is. Moreover,
$h_j$ integrable for one $z_0 \in \rho(H_0)$ implies $h_j$ integrable for all $z \in \rho(H_0)$.
Proof. Pick $\varphi \in C_c^\infty(\mathbb{R}^n, [0,1])$ such that $\varphi(x) = 0$ for $0 \le |x| \le 1/2$ and
$\varphi(x) = 1$ for $1 \le |x|$. Then it is not hard to see that $h_j$ is integrable if and
only if $\tilde{h}_j$ is integrable, where

$$\tilde{h}_1(r) = \|V R_{H_0}(z)\varphi_r\|,\qquad \tilde{h}_2(r) = \|\varphi_r V R_{H_0}(z)\|,\qquad r \ge 1,$$

and $\varphi_r(x) = \varphi(x/r)$. Using

$$[R_{H_0}(z), \varphi_r] = -R_{H_0}(z)[H_0 - z, \varphi_r]R_{H_0}(z) = R_{H_0}(z)\big(\Delta\varphi_r + 2(\partial\varphi_r)\partial\big)R_{H_0}(z)$$

and $\Delta\varphi_r = \varphi_{r/2}\Delta\varphi_r$, $\|\Delta\varphi_r\|_\infty \le \|\Delta\varphi\|_\infty/r^2$, respectively, $\partial\varphi_r = \varphi_{r/2}(\partial\varphi_r)$,
$\|\partial\varphi_r\|_\infty \le \|\partial\varphi\|_\infty/r$, we see

$$|\tilde{h}_1(r) - \tilde{h}_2(r)| \le \frac{c}{r}\tilde{h}_1(r/2),\qquad r \ge 1.$$

Hence $\tilde{h}_2$ is integrable if $\tilde{h}_1$ is. Conversely,

$$\tilde{h}_1(r) \le \tilde{h}_2(r) + \frac{c}{r}\tilde{h}_1(r/2) \le \tilde{h}_2(r) + \frac{c}{r}\tilde{h}_2(r/2) + \frac{2c^2}{r^2}\tilde{h}_1(r/4)$$

shows that $\tilde{h}_1$ is integrable if $\tilde{h}_2$ is.

Invoking the first resolvent formula

$$\|\varphi_r V R_{H_0}(z)\| \le \|\varphi_r V R_{H_0}(z_0)\|\,\|I - (z - z_0)R_{H_0}(z)\|$$

finishes the proof. □
As a first consequence note

Lemma 12.8. If $V$ is short range, then $R_H(z) - R_{H_0}(z)$ is compact.
Proof. The operator $R_H(z)V(I - \varphi_r)R_{H_0}(z)$ is compact since $(I - \varphi_r)R_{H_0}(z)$
is by Lemma 7.11 and $R_H(z)V$ is bounded by Lemma 6.23. Moreover, by
our short range condition it converges in norm to

$$R_H(z)V R_{H_0}(z) = R_H(z) - R_{H_0}(z)$$

as $r \to \infty$ (at least for some subsequence). □

In particular, by Weyl's theorem we have $\sigma_{ess}(H) = [0, \infty)$. Moreover,
$V$ short range implies that $H$ and $H_0$ look alike far outside.
Lemma 12.9. Suppose $R_H(z) - R_{H_0}(z)$ is compact. Then so is $f(H) - f(H_0)$
for any $f \in C_\infty(\mathbb{R})$ and

$$\lim_{r\to\infty} \|(f(H) - f(H_0))\chi_r\| = 0. \qquad (12.31)$$

Proof. The first part is Lemma 6.21 and the second part follows from part
(ii) of Lemma 6.8 since $\chi_r$ converges strongly to $0$. □
However, this is clearly not enough to prove asymptotic completeness
and we need a more careful analysis.

We begin by showing that the wave operators exist. By Cook's criterion
(Lemma 12.3) we need to show that

$$\|V \exp(\mp\mathrm{i}tH_0)\psi\| \le \|V R_{H_0}(-1)\|\,\|(I - \chi_{2a|t|})\exp(\mp\mathrm{i}tH_0)(H_0 + I)\psi\|
+ \|V R_{H_0}(-1)\chi_{2a|t|}\|\,\|(H_0 + I)\psi\| \qquad (12.32)$$

is integrable for a dense set of vectors $\psi$. The second term is integrable by our
short range assumption. The same is true by Perry's estimate (Lemma 12.5)
for the first term if we choose $\psi = f(H_0)P_D((\pm R, \pm\infty))\varphi$. Since vectors of
this form are dense, we see that the wave operators exist,

$$D(\Omega_\pm) = H. \qquad (12.33)$$

Since $H$ restricted to $\mathrm{Ran}(\Omega_\pm)$ is unitarily equivalent to $H_0$, we obtain
$[0, \infty) = \sigma_{ac}(H_0) \subseteq \sigma_{ac}(H)$. Furthermore, by $\sigma_{ac}(H) \subseteq \sigma_{ess}(H) = [0, \infty)$
we even have $\sigma_{ac}(H) = [0, \infty)$.
To prove asymptotic completeness of the wave operators, we will need
that the $(\Omega_\pm - I)f(H_0)P_\pm$ are compact.

Lemma 12.10. Let $f \in C_c^\infty((0,\infty))$ and suppose $\psi_n$ converges weakly to $0$.
Then

$$\lim_{n\to\infty} \|(\Omega_\pm - I)f(H_0)P_\pm\psi_n\| = 0; \qquad (12.34)$$

that is, $(\Omega_\pm - I)f(H_0)P_\pm$ is compact.
12.3. Schrödinger operators with short range potentials 255
Proof. By (12.13) we see
\[
\|R_H(z)(\Omega_\pm - I)f(H_0)P_\pm \psi_n\| \le \int_0^\infty \|R_H(z) V \exp(\mp\mathrm{i}sH_0) f(H_0)P_\pm \psi_n\|\, ds.
\]
Since $R_H(z)V R_{H_0}$ is compact, we see that the integrand
\[
R_H(z) V \exp(\mp\mathrm{i}sH_0) f(H_0)P_\pm \psi_n = R_H(z) V R_{H_0} \exp(\mp\mathrm{i}sH_0)(H_0 + 1) f(H_0)P_\pm \psi_n
\]
converges pointwise to 0. Moreover, arguing as in (12.32), the integrand is bounded by an $L^1$ function depending only on $\|\psi_n\|$. Thus $R_H(z)(\Omega_\pm - I)f(H_0)P_\pm$ is compact by the dominated convergence theorem. Furthermore, using the intertwining property, we see that
\[
(\Omega_\pm - I)f(H_0)P_\pm = R_H(-1)(\Omega_\pm - I)\tilde f(H_0)P_\pm + (R_H(-1) - R_{H_0}(-1))\tilde f(H_0)P_\pm
\]
is compact by Lemma 6.21, where $\tilde f(\lambda) = (\lambda + 1) f(\lambda)$. $\square$
Now we have gathered enough information to tackle the problem of asymptotic completeness.

We first show that the singular continuous spectrum is absent. This is not really necessary, but it avoids the use of Cesàro means in our main argument.

Abbreviate $P = P^{\mathrm{sc}}_H P_H((a,b))$, $0 < a < b$. Since $H$ restricted to $\operatorname{Ran}(\Omega_\pm)$ is unitarily equivalent to $H_0$ (which has purely absolutely continuous spectrum), the singular part must live on $\operatorname{Ran}(\Omega_\pm)^\perp$; that is, $P^{\mathrm{sc}}_H \Omega_\pm = 0$. Thus $P f(H_0) = P(I - \Omega_+) f(H_0) P_+ + P(I - \Omega_-) f(H_0) P_-$ is compact. Since $f(H) - f(H_0)$ is compact, it follows that $P f(H)$ is also compact. Choosing $f$ such that $f(\lambda) = 1$ for $\lambda \in [a,b]$, we see that $P = P f(H)$ is compact and hence finite dimensional. In particular, $\sigma_{\mathrm{sc}}(H) \cap (a,b)$ is a finite set. But a continuous measure cannot be supported on a finite set, showing $\sigma_{\mathrm{sc}}(H) \cap (a,b) = \emptyset$. Since $0 < a < b$ are arbitrary, we even have $\sigma_{\mathrm{sc}}(H) \cap (0,\infty) = \emptyset$ and by $\sigma_{\mathrm{sc}}(H) \subseteq \sigma_{\mathrm{ess}}(H) = [0,\infty)$, we obtain $\sigma_{\mathrm{sc}}(H) = \emptyset$.

Observe that replacing $P^{\mathrm{sc}}_H$ by $P^{\mathrm{pp}}_H$, the same argument shows that all nonzero eigenvalues have finite multiplicity and cannot accumulate in $(0,\infty)$.

In summary we have shown

Theorem 12.11. Suppose $V$ is short range. Then
\[
\sigma_{\mathrm{ac}}(H) = \sigma_{\mathrm{ess}}(H) = [0,\infty), \qquad \sigma_{\mathrm{sc}}(H) = \emptyset. \tag{12.35}
\]
All nonzero eigenvalues have finite multiplicity and cannot accumulate in $(0,\infty)$.
Now we come to the anticipated asymptotic completeness result of Enß. Choose
\[
\psi \in \mathcal{H}_c(H) = \mathcal{H}_{\mathrm{ac}}(H) \quad \text{such that} \quad \psi = f(H)\psi \tag{12.36}
\]
for some $f \in C_c^\infty((0,\infty))$. By the RAGE theorem the sequence $\psi(t)$ converges weakly to zero as $t \to \pm\infty$. Abbreviate $\psi(t) = \exp(-\mathrm{i}tH)\psi$. Introduce
\[
\psi_\pm(t) = f(H_0) P_\pm \psi(t), \tag{12.37}
\]
which satisfy
\[
\lim_{t \to \pm\infty} \|\psi(t) - \psi_+(t) - \psi_-(t)\| = 0. \tag{12.38}
\]
Indeed, this follows from
\[
\psi(t) = \psi_+(t) + \psi_-(t) + (f(H) - f(H_0))\psi(t) \tag{12.39}
\]
and Lemma 6.21. Moreover, we even have
\[
\lim_{t \to \pm\infty} \|(\Omega_\pm - I)\psi_\pm(t)\| = 0 \tag{12.40}
\]
by Lemma 12.10. Now suppose $\psi \in \operatorname{Ran}(\Omega_\pm)^\perp$. Then
\[
\|\psi\|^2 = \lim_{t \to \pm\infty} \langle \psi(t), \psi(t) \rangle = \lim_{t \to \pm\infty} \langle \psi(t), \psi_+(t) + \psi_-(t) \rangle = \lim_{t \to \pm\infty} \langle \psi(t), \Omega_+ \psi_+(t) + \Omega_- \psi_-(t) \rangle. \tag{12.41}
\]
By Theorem 12.2, $\operatorname{Ran}(\Omega_\pm)$ is invariant under $H$ and thus $\psi(t) \in \operatorname{Ran}(\Omega_\pm)^\perp$, implying
\[
\|\psi\|^2 = \lim_{t \to \pm\infty} \langle \psi(t), \Omega_\mp \psi_\mp(t) \rangle = \lim_{t \to \pm\infty} \langle P_\mp f(H_0)^* \Omega_\mp^* \psi(t), \psi(t) \rangle. \tag{12.42}
\]
Invoking the intertwining property, we see
\[
\|\psi\|^2 = \lim_{t \to \pm\infty} \langle P_\mp f(H_0)^* \mathrm{e}^{-\mathrm{i}tH_0} \Omega_\mp^* \psi, \psi(t) \rangle = 0 \tag{12.43}
\]
by Corollary 12.6. Hence $\operatorname{Ran}(\Omega_\pm) = \mathcal{H}_{\mathrm{ac}}(H) = \mathcal{H}_c(H)$ and we have thus shown

Theorem 12.12 (Enß). Suppose $V$ is short range. Then the wave operators are asymptotically complete.
Part 3
Appendix
Appendix A

Almost everything about Lebesgue integration
In this appendix I give a brief introduction to measure theory. Good references are [7], [32], or [47].
A.1. Borel measures in a nut shell
The first step in defining the Lebesgue integral is extending the notion of size from intervals to arbitrary sets. Unfortunately, this turns out to be too much, since a classical paradox by Banach and Tarski shows that one can break the unit ball in $\mathbb{R}^3$ into a finite number of wild pieces (choosing the pieces uses the Axiom of Choice and cannot be done with a jigsaw ;-), rotate and translate them, and reassemble them to get two copies of the unit ball (compare Problem A.1). Hence any reasonable notion of size (i.e., one which is translation and rotation invariant) cannot be defined for all sets!
A collection of subsets $\mathcal{A}$ of a given set $X$ such that
• $X \in \mathcal{A}$,
• $\mathcal{A}$ is closed under finite unions,
• $\mathcal{A}$ is closed under complements
is called an algebra. Note that $\emptyset \in \mathcal{A}$ and that, by de Morgan, $\mathcal{A}$ is also closed under finite intersections. If an algebra is closed under countable unions (and hence also countable intersections), it is called a $\sigma$-algebra.
Moreover, the intersection of any family of ($\sigma$-)algebras $\{\mathcal{A}_\alpha\}$ is again a ($\sigma$-)algebra and for any collection $S$ of subsets there is a unique smallest ($\sigma$-)algebra $\Sigma(S)$ containing $S$ (namely the intersection of all ($\sigma$-)algebras containing $S$). It is called the ($\sigma$-)algebra generated by $S$.

If $X$ is a topological space, the Borel $\sigma$-algebra of $X$ is defined to be the $\sigma$-algebra generated by all open (respectively, all closed) sets. Sets in the Borel $\sigma$-algebra are called Borel sets.
Example. In the case $X = \mathbb{R}^n$ the Borel $\sigma$-algebra will be denoted by $\mathfrak{B}^n$ and we will abbreviate $\mathfrak{B} = \mathfrak{B}^1$.
Now let us turn to the definition of a measure: A set $X$ together with a $\sigma$-algebra $\Sigma$ is called a measurable space. A measure $\mu$ is a map $\mu : \Sigma \to [0,\infty]$ on a $\sigma$-algebra $\Sigma$ such that
• $\mu(\emptyset) = 0$,
• $\mu\big(\bigcup_{j=1}^\infty A_j\big) = \sum_{j=1}^\infty \mu(A_j)$ if $A_j \cap A_k = \emptyset$ for all $j \ne k$ ($\sigma$-additivity).
It is called $\sigma$-finite if there is a countable cover $\{X_j\}_{j=1}^\infty$ of $X$ with $\mu(X_j) < \infty$ for all $j$. (Note that it is no restriction to assume $X_j \subseteq X_{j+1}$.) It is called finite if $\mu(X) < \infty$. The sets in $\Sigma$ are called measurable sets and the triple $X$, $\Sigma$, $\mu$ is referred to as a measure space.
If we replace the $\sigma$-algebra by an algebra $\mathcal{A}$, then $\mu$ is called a premeasure. In this case $\sigma$-additivity clearly only needs to hold for disjoint sets $A_n$ for which $\bigcup_n A_n \in \mathcal{A}$.
We will write $A_n \nearrow A$ if $A_n \subseteq A_{n+1}$ (note $A = \bigcup_n A_n$) and $A_n \searrow A$ if $A_{n+1} \subseteq A_n$ (note $A = \bigcap_n A_n$).
Theorem A.1. Any measure $\mu$ satisfies the following properties:
(i) $A \subseteq B$ implies $\mu(A) \le \mu(B)$ (monotonicity).
(ii) $\mu(A_n) \to \mu(A)$ if $A_n \nearrow A$ (continuity from below).
(iii) $\mu(A_n) \to \mu(A)$ if $A_n \searrow A$ and $\mu(A_1) < \infty$ (continuity from above).

Proof. The first claim is obvious. The second follows using $\tilde A_n = A_n \setminus A_{n-1}$ and $\sigma$-additivity. The third follows from the second using $\tilde A_n = A_1 \setminus A_n$ and $\mu(\tilde A_n) = \mu(A_1) - \mu(A_n)$. $\square$
Example. Let $A \in \mathfrak{P}(M)$ and set $\mu(A)$ to be the number of elements of $A$ (respectively, $\infty$ if $A$ is infinite). This is the so-called counting measure. Note that if $X = \mathbb{N}$ and $A_n = \{j \in \mathbb{N} \mid j \ge n\}$, then $\mu(A_n) = \infty$, but $\mu(\bigcap_n A_n) = \mu(\emptyset) = 0$, which shows that the requirement $\mu(A_1) < \infty$ in the last claim of Theorem A.1 is not superfluous.
A measure on the Borel $\sigma$-algebra is called a Borel measure if $\mu(C) < \infty$ for any compact set $C$. A Borel measure is called outer regular if
\[
\mu(A) = \inf_{A \subseteq O,\ O\ \mathrm{open}} \mu(O) \tag{A.1}
\]
and inner regular if
\[
\mu(A) = \sup_{C \subseteq A,\ C\ \mathrm{compact}} \mu(C). \tag{A.2}
\]
It is called regular if it is both outer and inner regular.
But how can we obtain some more interesting Borel measures? We will restrict ourselves to the case of $X = \mathbb{R}$ for simplicity. Then the strategy is as follows: Start with the algebra of finite unions of disjoint intervals and define $\mu$ for those sets (as the sum over the intervals). This yields a premeasure. Extend this to an outer measure for all subsets of $\mathbb{R}$. Show that the restriction to the Borel sets is a measure.
Let us first show how we should define $\mu$ for intervals: To every Borel measure $\mu$ on $\mathfrak{B}$ we can assign its distribution function
\[
\mu(x) =
\begin{cases}
-\mu((x,0]), & x < 0, \\
0, & x = 0, \\
\mu((0,x]), & x > 0,
\end{cases} \tag{A.3}
\]
which is right continuous and nondecreasing. Conversely, given a right continuous nondecreasing function $\mu : \mathbb{R} \to \mathbb{R}$, we can set
\[
\mu(A) =
\begin{cases}
\mu(b) - \mu(a), & A = (a,b], \\
\mu(b) - \mu(a-), & A = [a,b], \\
\mu(b-) - \mu(a), & A = (a,b), \\
\mu(b-) - \mu(a-), & A = [a,b),
\end{cases} \tag{A.4}
\]
where $\mu(a-) = \lim_{\varepsilon \downarrow 0} \mu(a - \varepsilon)$. In particular, this gives a premeasure on the algebra of finite unions of intervals which can be extended to a measure:

Theorem A.2. For every right continuous nondecreasing function $\mu : \mathbb{R} \to \mathbb{R}$ there exists a unique regular Borel measure $\mu$ which extends (A.4). Two different functions generate the same measure if and only if they differ by a constant.

Since the proof of this theorem is rather involved, we defer it to the next section and look at some examples first.
Example. Suppose $\Theta(x) = 0$ for $x < 0$ and $\Theta(x) = 1$ for $x \ge 0$. Then we obtain the so-called Dirac measure at 0, which is given by $\Theta(A) = 1$ if $0 \in A$ and $\Theta(A) = 0$ if $0 \notin A$.
Example. Suppose $\lambda(x) = x$. Then the associated measure is the ordinary Lebesgue measure on $\mathbb{R}$. We will abbreviate the Lebesgue measure of a Borel set $A$ by $\lambda(A) = |A|$.
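The case distinctions in (A.4) can be tried out numerically. The sketch below is my own illustration (not part of the book); the helper `stieltjes_measure` and the idea of passing a left-limit function `F_left` explicitly are assumptions of this sketch, since one-sided limits cannot be taken reliably in floating point.

```python
def stieltjes_measure(F, F_left):
    """Measure of an interval from a distribution function, per (A.4).

    F is a right continuous nondecreasing function and F_left(x)
    plays the role of the left limit F(x-).
    """
    def mu(a, b, kind):
        if kind == "(a,b]":
            return F(b) - F(a)
        if kind == "[a,b]":
            return F(b) - F_left(a)
        if kind == "(a,b)":
            return F_left(b) - F(a)
        if kind == "[a,b)":
            return F_left(b) - F_left(a)
        raise ValueError(kind)
    return mu

# Dirac measure at 0: F is the Heaviside step (right continuous).
mu = stieltjes_measure(lambda x: 1.0 if x >= 0 else 0.0,
                       lambda x: 1.0 if x > 0 else 0.0)
print(mu(-1, 0, "(a,b]"))  # 1.0: the interval contains 0
print(mu(-1, 0, "(a,b)"))  # 0.0: the point 0 is excluded
print(mu(0, 1, "[a,b)"))   # 1.0: the point 0 is included

# Lebesgue measure: F(x) = x gives the length for every interval type.
leb = stieltjes_measure(lambda x: x, lambda x: x)
print(leb(2, 5, "[a,b]"))  # 3
```

Note how only the choice between $F$ and $F_{\text{left}}$ at the endpoints decides whether a point mass at an endpoint is counted.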
It can be shown that Borel measures on a locally compact second countable space are always regular ([7, Thm. 29.12]).

A set $A \in \Sigma$ is called a support for $\mu$ if $\mu(X \setminus A) = 0$. A property is said to hold $\mu$-almost everywhere (a.e.) if it holds on a support for $\mu$ or, equivalently, if the set where it does not hold is contained in a set of measure zero.
Example. The set of rational numbers has Lebesgue measure zero: $\lambda(\mathbb{Q}) = 0$. In fact, any single point has Lebesgue measure zero, and so has any countable union of points (by countable additivity).
Example. The Cantor set is an example of a closed uncountable set of Lebesgue measure zero. It is constructed as follows: Start with $C_0 = [0,1]$ and remove the middle third to obtain $C_1 = [0,\tfrac{1}{3}] \cup [\tfrac{2}{3},1]$. Next, again remove the middle thirds of the remaining sets to obtain $C_2 = [0,\tfrac{1}{9}] \cup [\tfrac{2}{9},\tfrac{1}{3}] \cup [\tfrac{2}{3},\tfrac{7}{9}] \cup [\tfrac{8}{9},1]$, and so on:
\[
C_0 \supseteq C_1 \supseteq C_2 \supseteq C_3 \supseteq \cdots.
\]
Proceeding like this, we obtain a sequence of nesting sets $C_n$ and the limit $C = \bigcap_n C_n$ is the Cantor set. Since $C_n$ is compact, so is $C$. Moreover, $C_n$ consists of $2^n$ intervals of length $3^{-n}$, and thus its Lebesgue measure is $\lambda(C_n) = (2/3)^n$. In particular, $\lambda(C) = \lim_{n\to\infty} \lambda(C_n) = 0$. Using the ternary expansion, it is extremely simple to describe: $C$ is the set of all $x \in [0,1]$ whose ternary expansion contains no ones, which shows that $C$ is uncountable (why?). It has some further interesting properties: it is totally disconnected (i.e., it contains no subintervals) and perfect (it has no isolated points).
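The counting in this example is easy to verify by computation. The following sketch (my own, not from the book) builds the sets $C_n$ exactly with rationals and tests the ternary criterion digit by digit; the function names are made up for this illustration.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third of every interval."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out += [(a, a + third), (b - third, b)]
    return out

def avoids_digit_one(x, digits=60):
    """Check the first `digits` ternary digits of x in [0,1] for a 1.

    Finite test only; an exact trailing digit 1 (as in 1/3 = 0.1_3,
    which equals 0.0222..._3) is accepted as a member.
    """
    for _ in range(digits):
        x *= 3
        d = int(x)
        x -= d
        if d == 1 and x != 0:
            return False
    return True

C = [(Fraction(0), Fraction(1))]
for n in range(1, 6):
    C = cantor_step(C)
    assert len(C) == 2 ** n                                 # 2^n intervals
    assert sum(b - a for a, b in C) == Fraction(2, 3) ** n  # measure (2/3)^n

print(avoids_digit_one(Fraction(1, 4)))  # True:  1/4 = 0.020202..._3
print(avoids_digit_one(Fraction(1, 2)))  # False: 1/2 = 0.111..._3
```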
Problem A.1 (Vitali set). Call two numbers $x, y \in [0,1)$ equivalent if $x - y$ is rational. Construct the set $V$ by choosing one representative from each equivalence class. Show that $V$ cannot be measurable with respect to any nontrivial finite translation invariant measure on $[0,1)$. (Hint: How can you build up $[0,1)$ from translations of $V$?)
A.2. Extending a premeasure to a measure
The purpose of this section is to prove Theorem A.2. It is rather technical and should be skipped on first reading.
In order to prove Theorem A.2, we need to show how a premeasure can be extended to a measure. As a prerequisite we first establish that it suffices to check increasing (or decreasing) sequences of sets when checking whether a given algebra is in fact a $\sigma$-algebra:

A collection of sets $\mathcal{M}$ is called a monotone class if $A_n \nearrow A$ implies $A \in \mathcal{M}$ whenever $A_n \in \mathcal{M}$ and $A_n \searrow A$ implies $A \in \mathcal{M}$ whenever $A_n \in \mathcal{M}$. Every $\sigma$-algebra is a monotone class and the intersection of monotone classes is a monotone class. Hence every collection of sets $S$ generates a smallest monotone class $\mathcal{M}(S)$.
Theorem A.3. Let $\mathcal{A}$ be an algebra. Then $\mathcal{M}(\mathcal{A}) = \Sigma(\mathcal{A})$.

Proof. We first show that $\mathcal{M} = \mathcal{M}(\mathcal{A})$ is an algebra.

Put $M(A) = \{B \in \mathcal{M} \mid A \cup B \in \mathcal{M}\}$. If $B_n$ is an increasing sequence of sets in $M(A)$, then $A \cup B_n$ is an increasing sequence in $\mathcal{M}$ and hence $\bigcup_n (A \cup B_n) \in \mathcal{M}$. Now
\[
A \cup \Big(\bigcup_n B_n\Big) = \bigcup_n (A \cup B_n)
\]
shows that $M(A)$ is closed under increasing sequences. Similarly, $M(A)$ is closed under decreasing sequences and hence it is a monotone class. But does it contain any elements? Well, if $A \in \mathcal{A}$, we have $\mathcal{A} \subseteq M(A)$, implying $M(A) = \mathcal{M}$ for $A \in \mathcal{A}$. Hence $A \cup B \in \mathcal{M}$ if at least one of the sets is in $\mathcal{A}$. But this shows $\mathcal{A} \subseteq M(A)$ and hence $M(A) = \mathcal{M}$ for any $A \in \mathcal{M}$. So $\mathcal{M}$ is closed under finite unions.

To show that we are closed under complements, consider $M = \{A \in \mathcal{M} \mid X \setminus A \in \mathcal{M}\}$. If $A_n$ is an increasing sequence, then $X \setminus A_n$ is a decreasing sequence and $X \setminus \bigcup_n A_n = \bigcap_n (X \setminus A_n) \in \mathcal{M}$ if $A_n \in M$, and similarly for decreasing sequences. Hence $M$ is a monotone class and must be equal to $\mathcal{M}$ since it contains $\mathcal{A}$.

So we know that $\mathcal{M}$ is an algebra. To show that it is a $\sigma$-algebra, let $A_n \in \mathcal{M}$ be given and put $\tilde A_n = \bigcup_{k \le n} A_k \in \mathcal{M}$. Then $\tilde A_n$ is increasing and $\bigcup_n \tilde A_n = \bigcup_n A_n \in \mathcal{M}$. $\square$
The typical use of this theorem is as follows: First verify some property for sets in an algebra $\mathcal{A}$. In order to show that it holds for any set in $\Sigma(\mathcal{A})$, it suffices to show that the collection of sets for which it holds is closed under countable increasing and decreasing sequences (i.e., is a monotone class).

Now we start by proving that (A.4) indeed gives rise to a premeasure.
Lemma A.4. The interval function $\mu$ defined in (A.4) gives rise to a unique $\sigma$-finite regular premeasure on the algebra $\mathcal{A}$ of finite unions of disjoint intervals.

Proof. First of all, (A.4) can be extended to finite unions of disjoint intervals by summing over all intervals. It is straightforward to verify that $\mu$ is well-defined (one set can be represented by different unions of intervals) and by construction additive.

To show regularity, we can assume any such union to consist of open intervals and points only. To show outer regularity, replace each point $\{x\}$ by a small open interval $(x - \varepsilon, x + \varepsilon)$ and use that $\mu(\{x\}) = \lim_{\varepsilon \downarrow 0} \mu(x + \varepsilon) - \mu(x - \varepsilon)$. Similarly, to show inner regularity, replace each open interval $(a,b)$ by a compact one, $[a_n, b_n] \subset (a,b)$, and use $\mu((a,b)) = \lim_{n \to \infty} \mu(b_n) - \mu(a_n)$ if $a_n \downarrow a$ and $b_n \uparrow b$.
It remains to verify $\sigma$-additivity. We need to show
\[
\mu\Big(\bigcup_k I_k\Big) = \sum_k \mu(I_k)
\]
whenever $I_n \in \mathcal{A}$ and $I = \bigcup_k I_k \in \mathcal{A}$. Since each $I_n$ is a finite union of intervals, we can as well assume each $I_n$ is just one interval (just split $I_n$ into its subintervals and note that the sum does not change by additivity). Similarly, we can assume that $I$ is just one interval (just treat each subinterval separately).

By additivity $\mu$ is monotone and hence
\[
\sum_{k=1}^n \mu(I_k) = \mu\Big(\bigcup_{k=1}^n I_k\Big) \le \mu(I)
\]
which shows
\[
\sum_{k=1}^\infty \mu(I_k) \le \mu(I).
\]
To get the converse inequality, we need to work harder.

By outer regularity we can cover each $I_k$ by some open interval $J_k$ such that $\mu(J_k) \le \mu(I_k) + \frac{\varepsilon}{2^k}$. First suppose $I$ is compact. Then finitely many of the $J_k$, say the first $n$, cover $I$ and we have
\[
\mu(I) \le \mu\Big(\bigcup_{k=1}^n J_k\Big) \le \sum_{k=1}^n \mu(J_k) \le \sum_{k=1}^\infty \mu(I_k) + \varepsilon.
\]
Since $\varepsilon > 0$ is arbitrary, this shows $\sigma$-additivity for compact intervals. By additivity we can always add/subtract the endpoints of $I$ and hence $\sigma$-additivity holds for any bounded interval. If $I$ is unbounded, say $I = [a,\infty)$, then given $x > 0$, we can find an $n$ such that the intervals $J_k$, $k \le n$, cover at least $[a,x]$ and hence
\[
\sum_{k=1}^n \mu(I_k) \ge \sum_{k=1}^n \mu(J_k) - \varepsilon \ge \mu([a,x]) - \varepsilon.
\]
Since $x > a$ and $\varepsilon > 0$ are arbitrary, we are done. $\square$
This premeasure determines the corresponding measure $\mu$ uniquely (if there is one at all):

Theorem A.5 (Uniqueness of measures). Let $\mu$ be a $\sigma$-finite premeasure on an algebra $\mathcal{A}$. Then there is at most one extension to $\Sigma(\mathcal{A})$.

Proof. We first assume that $\mu(X) < \infty$. Suppose there is another extension $\nu$ and consider the set
\[
S = \{A \in \Sigma(\mathcal{A}) \mid \mu(A) = \nu(A)\}.
\]
I claim $S$ is a monotone class and hence $S = \Sigma(\mathcal{A})$ since $\mathcal{A} \subseteq S$ by assumption (Theorem A.3).

Let $A_n \nearrow A$. If $A_n \in S$, we have $\mu(A_n) = \nu(A_n)$ and taking limits (Theorem A.1 (ii)), we conclude $\mu(A) = \nu(A)$. Next let $A_n \searrow A$ and take limits again. This finishes the finite case. To extend our result to the $\sigma$-finite case, let $X_j \nearrow X$ be an increasing sequence such that $\mu(X_j) < \infty$. By the finite case $\mu(A \cap X_j) = \nu(A \cap X_j)$ (just restrict $\mu$, $\nu$ to $X_j$). Hence
\[
\mu(A) = \lim_{j \to \infty} \mu(A \cap X_j) = \lim_{j \to \infty} \nu(A \cap X_j) = \nu(A)
\]
and we are done. $\square$
Note that if our premeasure is regular, so will the extension be:

Lemma A.6. Suppose $\mu$ is a $\sigma$-finite measure on the Borel sets $\mathfrak{B}$. Then outer (inner) regularity holds for all Borel sets if it holds for all sets in some algebra $\mathcal{A}$ generating the Borel sets $\mathfrak{B}$.

Proof. We first assume that $\mu(X) < \infty$. Set
\[
\mu^\circ(A) = \inf_{A \subseteq O,\ O\ \mathrm{open}} \mu(O) \ge \mu(A)
\]
and let $M = \{A \in \mathfrak{B} \mid \mu^\circ(A) = \mu(A)\}$. Since by assumption $M$ contains some algebra generating $\mathfrak{B}$, it suffices to prove that $M$ is a monotone class.

Let $A_n \in M$ be a monotone sequence and let $O_n \supseteq A_n$ be open sets such that $\mu(O_n) \le \mu(A_n) + \frac{\varepsilon}{2^n}$. Then
\[
\mu(A_n) \le \mu(O_n) \le \mu(A_n) + \frac{\varepsilon}{2^n}.
\]
Now if $A_n \searrow A$, just take limits and use continuity from above of $\mu$ to see that $O_n \supseteq A_n \supseteq A$ is a sequence of open sets with $\mu(O_n) \to \mu(A)$. Similarly, if $A_n \nearrow A$, observe that $O = \bigcup_n O_n$ satisfies $O \supseteq A$ and
\[
\mu(O) \le \mu(A) + \sum_n \mu(O_n \setminus A) \le \mu(A) + \varepsilon
\]
since $\mu(O_n \setminus A) \le \mu(O_n \setminus A_n) \le \frac{\varepsilon}{2^n}$.

Next let $\mu$ be arbitrary. Let $X_j$ be a cover with $\mu(X_j) < \infty$. Given $A$, we can split it into disjoint sets $A_j$ such that $A_j \subseteq X_j$ ($A_1 = A \cap X_1$, $A_2 = (A \setminus A_1) \cap X_2$, etc.). By regularity, we can assume $X_j$ open. Thus there are open (in $X$) sets $O_j$ covering $A_j$ such that $\mu(O_j) \le \mu(A_j) + \frac{\varepsilon}{2^j}$. Then $O = \bigcup_j O_j$ is open, covers $A$, and satisfies
\[
\mu(A) \le \mu(O) \le \sum_j \mu(O_j) \le \mu(A) + \varepsilon.
\]
This settles outer regularity.

Next let us turn to inner regularity. If $\mu(X) < \infty$, one can show as before that $M = \{A \in \mathfrak{B} \mid \mu_\circ(A) = \mu(A)\}$, where
\[
\mu_\circ(A) = \sup_{C \subseteq A,\ C\ \mathrm{compact}} \mu(C) \le \mu(A),
\]
is a monotone class. This settles the finite case.

For the $\sigma$-finite case split $A$ again as before. Since $X_j$ has finite measure, there are compact subsets $K_j$ of $A_j$ such that $\mu(A_j) \le \mu(K_j) + \frac{\varepsilon}{2^j}$. Now we need to distinguish two cases: If $\mu(A) = \infty$, the sum $\sum_j \mu(A_j)$ will diverge and so will $\sum_j \mu(K_j)$. Hence $\tilde K_n = \bigcup_{j=1}^n K_j \subseteq A$ is compact with $\mu(\tilde K_n) \to \infty = \mu(A)$. If $\mu(A) < \infty$, the sum $\sum_j \mu(A_j)$ will converge and choosing $n$ sufficiently large, we will have
\[
\mu(\tilde K_n) \le \mu(A) \le \mu(\tilde K_n) + 2\varepsilon.
\]
This finishes the proof. $\square$
So it remains to ensure that there is an extension at all. For any premeasure $\mu$ we define
\[
\mu^*(A) = \inf\Big\{ \sum_{n=1}^\infty \mu(A_n) \,\Big|\, A \subseteq \bigcup_{n=1}^\infty A_n,\ A_n \in \mathcal{A} \Big\} \tag{A.5}
\]
where the infimum extends over all countable covers from $\mathcal{A}$. Then the function $\mu^* : \mathfrak{P}(X) \to [0,\infty]$ is an outer measure; that is, it has the properties (Problem A.2)
• $\mu^*(\emptyset) = 0$,
• $A_1 \subseteq A_2 \Rightarrow \mu^*(A_1) \le \mu^*(A_2)$, and
• $\mu^*(\bigcup_{n=1}^\infty A_n) \le \sum_{n=1}^\infty \mu^*(A_n)$ (subadditivity).
Note that $\mu^*(A) = \mu(A)$ for $A \in \mathcal{A}$ (Problem A.3).
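Definition (A.5) also explains why every countable set has Lebesgue outer measure zero: cover the $n$-th point by an interval of length $\varepsilon/2^n$. A quick computation (my own sketch, not from the book) confirms that such covers have total length less than $\varepsilon$.

```python
from fractions import Fraction

def cover_length(points, eps):
    """Total length of a cover that puts an interval of length
    eps/2^n around the n-th point; an admissible cover in (A.5)."""
    return sum(eps / 2 ** n for n in range(1, len(points) + 1))

# Finitely many rationals in [0,1] (an initial piece of a countable set).
points = sorted({Fraction(k, m) for m in range(1, 20) for k in range(m + 1)})

for eps in (Fraction(1, 10), Fraction(1, 1000)):
    # Geometric series: the cover costs less than eps no matter how
    # many points there are, so the outer measure of the set is 0.
    assert cover_length(points, eps) < eps
print(len(points), "points covered with total length below", Fraction(1, 1000))
```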
Theorem A.7 (Extensions via outer measures). Let $\mu^*$ be an outer measure. Then the set $\Sigma$ of all sets $A$ satisfying the Carathéodory condition
\[
\mu^*(E) = \mu^*(A \cap E) + \mu^*(A' \cap E), \qquad E \subseteq X \tag{A.6}
\]
(where $A' = X \setminus A$ is the complement of $A$) forms a $\sigma$-algebra and $\mu^*$ restricted to this $\sigma$-algebra is a measure.

Proof. We first show that $\Sigma$ is an algebra. It clearly contains $X$ and is closed under complements. Let $A, B \in \Sigma$. Applying Carathéodory's condition twice finally shows
\[
\mu^*(E) = \mu^*(A \cap B \cap E) + \mu^*(A' \cap B \cap E) + \mu^*(A \cap B' \cap E) + \mu^*(A' \cap B' \cap E) \ge \mu^*((A \cup B) \cap E) + \mu^*((A \cup B)' \cap E),
\]
where we have used de Morgan and
\[
\mu^*(A \cap B \cap E) + \mu^*(A' \cap B \cap E) + \mu^*(A \cap B' \cap E) \ge \mu^*((A \cup B) \cap E)
\]
which follows from subadditivity and $(A \cup B) \cap E = (A \cap B \cap E) \cup (A' \cap B \cap E) \cup (A \cap B' \cap E)$. Since the reverse inequality is just subadditivity, we conclude that $\Sigma$ is an algebra.

Next, let $A_n$ be a sequence of sets from $\Sigma$. Without restriction we can assume that they are disjoint (compare the last argument in the proof of Theorem A.3). Abbreviate $\tilde A_n = \bigcup_{k \le n} A_k$, $A = \bigcup_n A_n$. Then for any set $E$ we have
\[
\mu^*(\tilde A_n \cap E) = \mu^*(A_n \cap \tilde A_n \cap E) + \mu^*(A_n' \cap \tilde A_n \cap E) = \mu^*(A_n \cap E) + \mu^*(\tilde A_{n-1} \cap E) = \dots = \sum_{k=1}^n \mu^*(A_k \cap E).
\]
Using $\tilde A_n \in \Sigma$ and monotonicity of $\mu^*$, we infer
\[
\mu^*(E) = \mu^*(\tilde A_n \cap E) + \mu^*(\tilde A_n' \cap E) \ge \sum_{k=1}^n \mu^*(A_k \cap E) + \mu^*(A' \cap E).
\]
Letting $n \to \infty$ and using subadditivity finally gives
\[
\mu^*(E) \ge \sum_{k=1}^\infty \mu^*(A_k \cap E) + \mu^*(A' \cap E) \ge \mu^*(A \cap E) + \mu^*(A' \cap E) \ge \mu^*(E) \tag{A.7}
\]
and we infer that $\Sigma$ is a $\sigma$-algebra.

Finally, setting $E = A$ in (A.7), we have
\[
\mu^*(A) = \sum_{k=1}^\infty \mu^*(A_k \cap A) + \mu^*(A' \cap A) = \sum_{k=1}^\infty \mu^*(A_k)
\]
and we are done. $\square$
Remark: The constructed measure $\mu$ is complete; that is, for any measurable set $A$ of measure zero, any subset of $A$ is again measurable (Problem A.4).

The only remaining question is whether there are any nontrivial sets satisfying the Carathéodory condition.
Lemma A.8. Let $\mu$ be a premeasure on $\mathcal{A}$ and let $\mu^*$ be the associated outer measure. Then every set in $\mathcal{A}$ satisfies the Carathéodory condition.

Proof. Let $A_n \in \mathcal{A}$ be a countable cover for $E$. Then for any $A \in \mathcal{A}$ we have
\[
\sum_{n=1}^\infty \mu(A_n) = \sum_{n=1}^\infty \mu(A_n \cap A) + \sum_{n=1}^\infty \mu(A_n \cap A') \ge \mu^*(E \cap A) + \mu^*(E \cap A')
\]
since $A_n \cap A \in \mathcal{A}$ is a cover for $E \cap A$ and $A_n \cap A' \in \mathcal{A}$ is a cover for $E \cap A'$. Taking the infimum, we have $\mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A')$, which finishes the proof. $\square$

Thus, as a consequence we obtain Theorem A.2.
Problem A.2. Show that $\mu^*$ defined in (A.5) is an outer measure. (Hint for the last property: Take a cover $\{B_{nk}\}_{k=1}^\infty$ for $A_n$ such that $\sum_{k=1}^\infty \mu(B_{nk}) \le \mu^*(A_n) + \frac{\varepsilon}{2^n}$ and note that $\{B_{nk}\}_{n,k=1}^\infty$ is a cover for $\bigcup_n A_n$.)

Problem A.3. Show that $\mu^*$ defined in (A.5) extends $\mu$. (Hint: For the cover $A_n$ it is no restriction to assume $A_n \cap A_m = \emptyset$ and $A_n \subseteq A$.)

Problem A.4. Show that the measure constructed in Theorem A.7 is complete.
A.3. Measurable functions
The Riemann integral works by splitting the $x$ coordinate into small intervals and approximating $f(x)$ on each interval by its minimum and maximum. The problem with this approach is that the difference between maximum and minimum will only tend to zero (as the intervals get smaller) if $f(x)$ is sufficiently nice. To avoid this problem, we can force the difference to go to zero by considering, instead of an interval, the set of $x$ for which $f(x)$ lies between two given numbers $a < b$. Now we need the size of the set of these $x$, that is, the size of the preimage $f^{-1}((a,b))$. For this to work, preimages of intervals must be measurable.

A function $f : X \to \mathbb{R}^n$ is called measurable if $f^{-1}(A) \in \Sigma$ for every $A \in \mathfrak{B}^n$. A complex-valued function is called measurable if both its real and imaginary parts are. Clearly it suffices to check this condition for every set $A$ in a collection of sets which generate $\mathfrak{B}^n$, since the collection of sets for which it holds forms a $\sigma$-algebra by $f^{-1}(\mathbb{R}^n \setminus A) = X \setminus f^{-1}(A)$ and $f^{-1}(\bigcup_j A_j) = \bigcup_j f^{-1}(A_j)$.
Lemma A.9. A function $f : X \to \mathbb{R}^n$ is measurable if and only if
\[
f^{-1}(I) \in \Sigma \quad \text{for every } I = \prod_{j=1}^n (a_j, \infty). \tag{A.8}
\]
In particular, a function $f : X \to \mathbb{R}^n$ is measurable if and only if every component is measurable.

Proof. We need to show that $\mathfrak{B}^n$ is generated by rectangles of the above form. The $\sigma$-algebra generated by these rectangles also contains all open rectangles of the form $I = \prod_{j=1}^n (a_j, b_j)$. Moreover, given any open set $O$, we can cover it by such open rectangles satisfying $I \subseteq O$. By Lindelöf's theorem there is a countable subcover and hence every open set can be written as a countable union of open rectangles. $\square$

Clearly the intervals $(a_j, \infty)$ can also be replaced by $[a_j, \infty)$, $(-\infty, a_j)$, or $(-\infty, a_j]$.
If $X$ is a topological space and $\Sigma$ the corresponding Borel $\sigma$-algebra, we will also call a measurable function a Borel function. Note that, in particular,

Lemma A.10. Let $X$ be a topological space and $\Sigma$ its Borel $\sigma$-algebra. Any continuous function is Borel. Moreover, if $f : X \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ are Borel functions, then the composition $g \circ f$ is again Borel.

Sometimes it is also convenient to allow $\pm\infty$ as possible values for $f$, that is, functions $f : X \to \overline{\mathbb{R}}$, $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$. In this case $A \subseteq \overline{\mathbb{R}}$ is called Borel if $A \cap \mathbb{R}$ is.
The set of all measurable functions forms an algebra.

Lemma A.11. Let $X$ be a topological space and $\Sigma$ its Borel $\sigma$-algebra. Suppose $f, g : X \to \mathbb{R}$ are measurable functions. Then the sum $f + g$ and the product $fg$ are measurable.

Proof. Note that addition and multiplication are continuous functions from $\mathbb{R}^2 \to \mathbb{R}$ and hence the claim follows from the previous lemma. $\square$
Moreover, the set of all measurable functions is closed under all important limiting operations.

Lemma A.12. Suppose $f_n : X \to \overline{\mathbb{R}}$ is a sequence of measurable functions. Then
\[
\inf_{n \in \mathbb{N}} f_n, \quad \sup_{n \in \mathbb{N}} f_n, \quad \liminf_{n \to \infty} f_n, \quad \limsup_{n \to \infty} f_n \tag{A.9}
\]
are measurable as well.

Proof. It suffices to prove that $\sup f_n$ is measurable since the rest follows from $\inf f_n = -\sup(-f_n)$, $\liminf f_n = \sup_k \inf_{n \ge k} f_n$, and $\limsup f_n = \inf_k \sup_{n \ge k} f_n$. But $(\sup f_n)^{-1}((a,\infty]) = \bigcup_n f_n^{-1}((a,\infty])$ and we are done. $\square$

A few immediate consequences are worthwhile noting: It follows that if $f$ and $g$ are measurable functions, so are $\min(f,g)$, $\max(f,g)$, $|f| = \max(f,-f)$, and $f^\pm = \max(\pm f, 0)$. Furthermore, the pointwise limit of measurable functions is again measurable.
A.4. The Lebesgue integral
Now we can define the integral for measurable functions as follows. A measurable function $s : X \to \mathbb{R}$ is called simple if its range is finite; that is, if
\[
s = \sum_{j=1}^p \alpha_j\, \chi_{A_j}, \qquad A_j = s^{-1}(\alpha_j) \in \Sigma. \tag{A.10}
\]
Here $\chi_A$ is the characteristic function of $A$; that is, $\chi_A(x) = 1$ if $x \in A$ and $\chi_A(x) = 0$ otherwise.

For a nonnegative simple function we define its integral as
\[
\int_A s\, d\mu = \sum_{j=1}^p \alpha_j\, \mu(A_j \cap A). \tag{A.11}
\]
Here we use the convention $0 \cdot \infty = 0$.
Lemma A.13. The integral has the following properties:
(i) $\int_A s\, d\mu = \int_X \chi_A\, s\, d\mu$.
(ii) $\int_{\bigcup_{j=1}^\infty A_j} s\, d\mu = \sum_{j=1}^\infty \int_{A_j} s\, d\mu$ if $A_j \cap A_k = \emptyset$ for $j \ne k$.
(iii) $\int_A \alpha s\, d\mu = \alpha \int_A s\, d\mu$, $\alpha \ge 0$.
(iv) $\int_A (s + t)\, d\mu = \int_A s\, d\mu + \int_A t\, d\mu$.
(v) $A \subseteq B \Rightarrow \int_A s\, d\mu \le \int_B s\, d\mu$.
(vi) $s \le t \Rightarrow \int_A s\, d\mu \le \int_A t\, d\mu$.
Proof. (i) is clear from the definition. (ii) follows from $\sigma$-additivity of $\mu$. (iii) is obvious. (iv) Let $s = \sum_j \alpha_j\, \chi_{A_j}$, $t = \sum_j \beta_j\, \chi_{B_j}$ and abbreviate $C_{jk} = (A_j \cap B_k) \cap A$. Then, by (ii),
\[
\int_A (s + t)\, d\mu = \sum_{j,k} \int_{C_{jk}} (s + t)\, d\mu = \sum_{j,k} (\alpha_j + \beta_k)\, \mu(C_{jk}) = \sum_{j,k} \Big( \int_{C_{jk}} s\, d\mu + \int_{C_{jk}} t\, d\mu \Big) = \int_A s\, d\mu + \int_A t\, d\mu.
\]
(v) follows from monotonicity of $\mu$. (vi) follows since by (iv) we can write $s = \sum_j \alpha_j\, \chi_{C_j}$, $t = \sum_j \beta_j\, \chi_{C_j}$ where, by assumption, $\alpha_j \le \beta_j$. $\square$
Our next task is to extend this definition to arbitrary positive functions by
\[
\int_A f\, d\mu = \sup_{s \le f} \int_A s\, d\mu, \tag{A.12}
\]
where the supremum is taken over all simple functions $s \le f$. Note that, except for possibly (ii) and (iv), Lemma A.13 still holds for this extension.
Theorem A.14 (Monotone convergence). Let $f_n$ be a monotone nondecreasing sequence of nonnegative measurable functions, $f_n \nearrow f$. Then
\[
\int_A f_n\, d\mu \to \int_A f\, d\mu. \tag{A.13}
\]

Proof. By property (vi), $\int_A f_n\, d\mu$ is monotone and converges to some number $\alpha$. By $f_n \le f$ and again (vi) we have
\[
\alpha \le \int_A f\, d\mu.
\]
To show the converse, let $s$ be simple such that $s \le f$ and let $\theta \in (0,1)$. Put $A_n = \{x \in A \mid f_n(x) \ge \theta s(x)\}$ and note $A_n \nearrow A$ (show this). Then
\[
\int_A f_n\, d\mu \ge \int_{A_n} f_n\, d\mu \ge \theta \int_{A_n} s\, d\mu.
\]
Letting $n \to \infty$, we see
\[
\alpha \ge \theta \int_A s\, d\mu.
\]
Since this is valid for any $\theta < 1$, it still holds for $\theta = 1$. Finally, since $s \le f$ is arbitrary, the claim follows. $\square$
In particular
\[
\int_A f\, d\mu = \lim_{n \to \infty} \int_A s_n\, d\mu \tag{A.14}
\]
for any monotone sequence $s_n \nearrow f$ of simple functions. Note that there is always such a sequence, for example,
\[
s_n(x) = \sum_{k=0}^{n2^n} \frac{k}{2^n}\, \chi_{f^{-1}(A_k)}(x), \qquad A_k = \Big[\frac{k}{2^n}, \frac{k+1}{2^n}\Big), \quad A_{n2^n} = [n, \infty). \tag{A.15}
\]
By construction $s_n$ converges uniformly if $f$ is bounded, since $s_n(x) = n$ if $f(x) = \infty$ and $f(x) - s_n(x) < \frac{1}{2^n}$ if $f(x) < n$.
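The staircase approximation (A.15) is concrete enough to compute with. The sketch below (my own illustration; the names are not from the book) evaluates $s_n$ for $f(x) = x$ on $[0,1]$ and checks that $\int_{[0,1]} s_n\, d\lambda = (2^n - 1)/2^{n+1} \to 1/2$, in line with the monotone convergence theorem.

```python
import math
from fractions import Fraction

def s_n(f, x, n):
    """The simple function from (A.15): round f(x) down to a multiple
    of 2^-n and cut the result off at n."""
    v = min(f(x), Fraction(n))
    k = min(math.floor(v * 2 ** n), n * 2 ** n)
    return Fraction(k, 2 ** n)

f = lambda x: x  # on [0,1]; the Lebesgue integral of f is 1/2

for n in range(1, 10):
    # s_n equals k/2^n on [k/2^n, (k+1)/2^n), so its integral over [0,1]
    # is sum_k (k/2^n) * (1/2^n) = (2^n - 1)/2^(n+1).
    integral = sum(Fraction(k, 2 ** n) * Fraction(1, 2 ** n)
                   for k in range(2 ** n))
    assert integral == Fraction(2 ** n - 1, 2 ** (n + 1))
    # monotone from below at a few sample points:
    for x in (Fraction(1, 3), Fraction(2, 3), Fraction(9, 10)):
        assert s_n(f, x, n) <= s_n(f, x, n + 1) <= f(x)

print("integrals converge to", Fraction(1, 2))
```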
Now what about the missing items (ii) and (iv) from Lemma A.13? Since limits can be spread over sums, the extension is linear (i.e., item (iv) holds) and (ii) also follows directly from the monotone convergence theorem. We even have the following result:

Lemma A.15. If $f \ge 0$ is measurable, then $d\nu = f\, d\mu$ defined via
\[
\nu(A) = \int_A f\, d\mu \tag{A.16}
\]
is a measure such that
\[
\int g\, d\nu = \int g f\, d\mu. \tag{A.17}
\]

Proof. As already mentioned, additivity of $\nu$ is equivalent to linearity of the integral and $\sigma$-additivity follows from the monotone convergence theorem:
\[
\nu\Big(\bigcup_{n=1}^\infty A_n\Big) = \int \Big(\sum_{n=1}^\infty \chi_{A_n}\Big) f\, d\mu = \sum_{n=1}^\infty \int \chi_{A_n} f\, d\mu = \sum_{n=1}^\infty \nu(A_n).
\]
The second claim holds for simple functions and hence for all functions by construction of the integral. $\square$
If $f_n$ is not necessarily monotone, we have at least

Theorem A.16 (Fatou's lemma). If $f_n$ is a sequence of nonnegative measurable functions, then
\[
\int_A \liminf_{n \to \infty} f_n\, d\mu \le \liminf_{n \to \infty} \int_A f_n\, d\mu. \tag{A.18}
\]

Proof. Set $g_n = \inf_{k \ge n} f_k$. Then $g_n \le f_n$, implying
\[
\int_A g_n\, d\mu \le \int_A f_n\, d\mu.
\]
Now take the liminf on both sides and note that by the monotone convergence theorem
\[
\liminf_{n \to \infty} \int_A g_n\, d\mu = \lim_{n \to \infty} \int_A g_n\, d\mu = \int_A \lim_{n \to \infty} g_n\, d\mu = \int_A \liminf_{n \to \infty} f_n\, d\mu,
\]
proving the claim. $\square$
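Strict inequality in (A.18) is the typical case when mass escapes. A finite-truncation sketch with the counting measure (my own illustration, not from the book) shows the mechanism with $f_n$ the indicator of $\{n\}$: every integral equals 1, while the pointwise liminf vanishes.

```python
N = 50  # truncation of the counting measure space {0, 1, 2, ...}

def f(n, j):
    """f_n = indicator of {n}: a unit bump escaping to infinity."""
    return 1 if j == n else 0

# int f_n d(counting measure) = sum_j f_n(j) = 1 for every n.
integrals = [sum(f(n, j) for j in range(N + 1)) for n in range(N + 1)]
assert all(v == 1 for v in integrals)

# liminf_n f_n(j) = 0 for every fixed j, since f_n(j) = 0 once n > j.
liminf_values = [min(f(n, j) for n in range(j + 1, N + 1)) for j in range(N)]
assert all(v == 0 for v in liminf_values)

print("int liminf f_n = 0  <  liminf int f_n = 1")
```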
If the integral is finite for both the positive and negative part $f^\pm$ of an arbitrary measurable function $f$, we call $f$ integrable and set
\[
\int_A f\, d\mu = \int_A f^+\, d\mu - \int_A f^-\, d\mu. \tag{A.19}
\]
The set of all integrable functions is denoted by $L^1(X, d\mu)$.
Lemma A.17. Lemma A.13 holds for integrable functions s, t.
Similarly, we handle the case where $f$ is complex-valued by calling $f$ integrable if both the real and imaginary part are and setting
\[
\int_A f\, d\mu = \int_A \operatorname{Re}(f)\, d\mu + \mathrm{i} \int_A \operatorname{Im}(f)\, d\mu. \tag{A.20}
\]
Clearly $f$ is integrable if and only if $|f|$ is.
Lemma A.18. For any integrable functions $f, g$ we have
\[
\Big| \int_A f\, d\mu \Big| \le \int_A |f|\, d\mu \tag{A.21}
\]
and (triangle inequality)
\[
\int_A |f + g|\, d\mu \le \int_A |f|\, d\mu + \int_A |g|\, d\mu. \tag{A.22}
\]

Proof. Put $\alpha = \frac{z^*}{|z|}$, where $z = \int_A f\, d\mu$ (without restriction $z \ne 0$). Then
\[
\Big| \int_A f\, d\mu \Big| = \alpha \int_A f\, d\mu = \int_A \alpha f\, d\mu = \int_A \operatorname{Re}(\alpha f)\, d\mu \le \int_A |f|\, d\mu,
\]
proving the first claim. The second follows from $|f + g| \le |f| + |g|$. $\square$
In addition, our integral is well behaved with respect to limiting operations.

Theorem A.19 (Dominated convergence). Let $f_n$ be a convergent sequence of measurable functions and set $f = \lim_{n \to \infty} f_n$. Suppose there is an integrable function $g$ such that $|f_n| \le g$. Then $f$ is integrable and
\[
\lim_{n \to \infty} \int f_n\, d\mu = \int f\, d\mu. \tag{A.23}
\]
Proof. The real and imaginary parts satisfy the same assumptions and so do the positive and negative parts. Hence it suffices to prove the case where $f_n$ and $f$ are nonnegative.

By Fatou's lemma
\[
\liminf_{n \to \infty} \int_A f_n\, d\mu \ge \int_A f\, d\mu
\]
and
\[
\liminf_{n \to \infty} \int_A (g - f_n)\, d\mu \ge \int_A (g - f)\, d\mu.
\]
Subtracting $\int_A g\, d\mu$ on both sides of the last inequality finishes the proof since $\liminf(-f_n) = -\limsup f_n$. $\square$
Remark: Since sets of measure zero do not contribute to the value of the integral, it clearly suffices if the requirements of the dominated convergence theorem are satisfied almost everywhere (with respect to $\mu$).

Note that the existence of $g$ is crucial, as the example $f_n(x) = \frac{1}{n}\, \chi_{[-n,n]}(x)$ on $\mathbb{R}$ with Lebesgue measure shows.
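The integrals in this counterexample can be written down exactly (a quick check of my own, not from the book):

```python
from fractions import Fraction

def integral_fn(n):
    """int (1/n) * chi_[-n,n] dlambda = (1/n) * lambda([-n,n]) = 2."""
    return Fraction(1, n) * (n - (-n))

# f_n -> 0 (even uniformly), yet every integral equals 2, so
# lim int f_n = 2 != 0 = int lim f_n; no integrable g dominates all f_n.
assert all(integral_fn(n) == 2 for n in range(1, 1000))
print("each integral equals", integral_fn(7))
```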
Example. If $\mu(x) = \sum_n \alpha_n \Theta(x - x_n)$ is a sum of Dirac measures, $\Theta(x)$ centered at $x = 0$, then
\[
\int f(x)\, d\mu(x) = \sum_n \alpha_n f(x_n). \tag{A.24}
\]
Hence our integral contains sums as special cases.
Problem A.5. Show that the set $B(X)$ of bounded measurable functions with the sup norm is a Banach space. Show that the set $S(X)$ of simple functions is dense in $B(X)$. Show that the integral is a bounded linear functional on $B(X)$. (Hence Theorem 0.26 could be used to extend the integral from simple to bounded measurable functions.)

Problem A.6. Show that the dominated convergence theorem implies (under the same assumptions)
\[
\lim_{n \to \infty} \int |f_n - f|\, d\mu = 0.
\]
Problem A.7. Let $X \subseteq \mathbb{R}$, let $Y$ be some measure space, and let $f : X \times Y \to \mathbb{R}$. Suppose $y \mapsto f(x,y)$ is measurable for every $x$ and $x \mapsto f(x,y)$ is continuous for every $y$. Show that
\[
F(x) = \int_A f(x,y)\, d\mu(y) \tag{A.25}
\]
is continuous if there is an integrable function $g(y)$ such that $|f(x,y)| \le g(y)$.

Problem A.8. Let $X \subseteq \mathbb{R}$, let $Y$ be some measure space, and let $f : X \times Y \to \mathbb{R}$. Suppose $y \mapsto f(x,y)$ is measurable for all $x$ and $x \mapsto f(x,y)$ is differentiable for a.e. $y$. Show that
\[
F(x) = \int_A f(x,y)\, d\mu(y) \tag{A.26}
\]
is differentiable if there is an integrable function $g(y)$ such that $|\partial_x f(x,y)| \le g(y)$. Moreover, $y \mapsto \partial_x f(x,y)$ is measurable and
\[
F'(x) = \int_A \partial_x f(x,y)\, d\mu(y) \tag{A.27}
\]
in this case.
A.5. Product measures
Let $\mu_1$ and $\mu_2$ be two measures on $\Sigma_1$ and $\Sigma_2$, respectively. Let $\Sigma_1 \otimes \Sigma_2$ be the $\sigma$-algebra generated by rectangles of the form $A_1 \times A_2$.

Example. Let $\mathfrak{B}$ be the Borel sets in $\mathbb{R}$. Then $\mathfrak{B}^2 = \mathfrak{B} \otimes \mathfrak{B}$ are the Borel sets in $\mathbb{R}^2$ (since the rectangles are a basis for the product topology).
Any set in $\Sigma_1 \otimes \Sigma_2$ has the section property; that is,

Lemma A.20. Suppose $A \in \Sigma_1 \otimes \Sigma_2$. Then its sections
\[
A_1(x_2) = \{x_1 \mid (x_1, x_2) \in A\} \quad \text{and} \quad A_2(x_1) = \{x_2 \mid (x_1, x_2) \in A\} \tag{A.28}
\]
are measurable.

Proof. Denote all sets $A \in \Sigma_1 \otimes \Sigma_2$ with the property that $A_1(x_2) \in \Sigma_1$ by $S$. Clearly all rectangles are in $S$ and it suffices to show that $S$ is a $\sigma$-algebra. Now, if $A \in S$, then $(A')_1(x_2) = (A_1(x_2))' \in \Sigma_1$ and thus $S$ is closed under complements. Similarly, if $A_n \in S$, then $(\bigcup_n A_n)_1(x_2) = \bigcup_n (A_n)_1(x_2)$ shows that $S$ is closed under countable unions. $\square$
This implies that if $f$ is a measurable function on $X_1 \times X_2$, then $f(\cdot, x_2)$ is measurable on $X_1$ for every $x_2$ and $f(x_1, \cdot)$ is measurable on $X_2$ for every $x_1$ (observe $A_1(x_2) = \{x_1 \mid f(x_1, x_2) \in B\}$, where $A = \{(x_1, x_2) \mid f(x_1, x_2) \in B\}$).
Given two measures $\mu_1$ on $\Sigma_1$ and $\mu_2$ on $\Sigma_2$, we now want to construct the product measure $\mu_1 \otimes \mu_2$ on $\Sigma_1 \otimes \Sigma_2$ such that
\[
\mu_1 \otimes \mu_2(A_1 \times A_2) = \mu_1(A_1)\, \mu_2(A_2), \qquad A_j \in \Sigma_j,\ j = 1, 2. \tag{A.29}
\]
Theorem A.21. Let
1
and
2
be two -nite measures on
1
and
2
,
respectively. Let A
1

2
. Then
2
(A
2
(x
1
)) and
1
(A
1
(x
2
)) are mea-
surable and
_
X
1

2
(A
2
(x
1
))d
1
(x
1
) =
_
X
2

1
(A
1
(x
2
))d
2
(x
2
). (A.30)
Proof. Let $S$ be the set of all subsets for which our claim holds. Note that $S$ contains at least all rectangles. It even contains the algebra of finite disjoint unions of rectangles. Thus it suffices to show that $S$ is a monotone class. If $\mu_1$ and $\mu_2$ are finite, measurability and equality of both integrals follow from the monotone convergence theorem for increasing sequences of sets and from the dominated convergence theorem for decreasing sequences of sets.
If $\mu_1$ and $\mu_2$ are $\sigma$-finite, let $X_{i,j} \nearrow X_i$ with $\mu_i(X_{i,j}) < \infty$ for $i = 1, 2$. Now $\mu_2((A \cap X_{1,j} \times X_{2,j})_2(x_1)) = \mu_2(A_2(x_1) \cap X_{2,j})\, \chi_{X_{1,j}}(x_1)$ and similarly with 1 and 2 exchanged. Hence by the finite case

  $\int_{X_1} \mu_2(A_2 \cap X_{2,j})\, \chi_{X_{1,j}}\, d\mu_1 = \int_{X_2} \mu_1(A_1 \cap X_{1,j})\, \chi_{X_{2,j}}\, d\mu_2$  (A.31)

and the $\sigma$-finite case follows from the monotone convergence theorem. $\square$
Hence we can define

  $\mu_1 \otimes \mu_2(A) = \int_{X_1} \mu_2(A_2(x_1))\, d\mu_1(x_1) = \int_{X_2} \mu_1(A_1(x_2))\, d\mu_2(x_2)$  (A.32)

or equivalently, since $\chi_{A_1(x_2)}(x_1) = \chi_{A_2(x_1)}(x_2) = \chi_A(x_1, x_2)$,

  $\mu_1 \otimes \mu_2(A) = \int_{X_1} \left( \int_{X_2} \chi_A(x_1, x_2)\, d\mu_2(x_2) \right) d\mu_1(x_1) = \int_{X_2} \left( \int_{X_1} \chi_A(x_1, x_2)\, d\mu_1(x_1) \right) d\mu_2(x_2).$  (A.33)
Additivity of $\mu_1 \otimes \mu_2$ follows from the monotone convergence theorem.

Note that (A.29) uniquely defines $\mu_1 \otimes \mu_2$ as a $\sigma$-finite premeasure on the algebra of finite disjoint unions of rectangles. Hence by Theorem A.5 it is the only measure on $\Sigma_1 \otimes \Sigma_2$ satisfying (A.29).
Finally we have

Theorem A.22 (Fubini). Let $f$ be a measurable function on $X_1 \times X_2$ and let $\mu_1$, $\mu_2$ be $\sigma$-finite measures on $X_1$, $X_2$, respectively.

(i) If $f \ge 0$, then $\int f(., x_2)\, d\mu_2(x_2)$ and $\int f(x_1, .)\, d\mu_1(x_1)$ are both measurable and

  $\iint f(x_1, x_2)\, d\mu_1 \otimes \mu_2(x_1, x_2) = \int \left( \int f(x_1, x_2)\, d\mu_1(x_1) \right) d\mu_2(x_2) = \int \left( \int f(x_1, x_2)\, d\mu_2(x_2) \right) d\mu_1(x_1).$  (A.34)

(ii) If $f$ is complex, then

  $\int |f(x_1, x_2)|\, d\mu_1(x_1) \in L^1(X_2, d\mu_2),$  (A.35)

respectively,

  $\int |f(x_1, x_2)|\, d\mu_2(x_2) \in L^1(X_1, d\mu_1),$  (A.36)

if and only if $f \in L^1(X_1 \times X_2, d\mu_1 \otimes d\mu_2)$. In this case (A.34) holds.
Proof. By Theorem A.21 and linearity the claim holds for simple functions. To see (i), let $s_n \nearrow f$ be a sequence of nonnegative simple functions. Then it follows by applying the monotone convergence theorem (twice for the double integrals).

For (ii) we can assume that $f$ is real-valued by considering its real and imaginary parts separately. Moreover, splitting $f = f_+ - f_-$ into its positive and negative parts, the claim reduces to (i). $\square$
In particular, if $f(x_1, x_2)$ is either nonnegative or integrable, then the order of integration can be interchanged.
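The interchange of the order of integration is easy to check numerically. Here is a minimal sketch (the integrand $f(x,y) = x\,e^{-xy}$ and the grid size are arbitrary illustrative choices) comparing both iterated midpoint Riemann sums of a nonnegative function:

```python
import math

# Midpoint-rule sketch of the Fubini/Tonelli interchange for the
# nonnegative function f(x, y) = x*exp(-x*y) on [0,1] x [0,1].
N = 400
h = 1.0 / N
pts = [(i + 0.5) * h for i in range(N)]

def f(x, y):
    return x * math.exp(-x * y)

# integrate in y first, then in x
I_xy = sum(sum(f(x, y) * h for y in pts) * h for x in pts)
# integrate in x first, then in y
I_yx = sum(sum(f(x, y) * h for x in pts) * h for y in pts)

# both sums approximate the double integral (exact value 1/e here)
assert abs(I_xy - I_yx) < 1e-12
```

The two sums contain exactly the same terms in different order, which is precisely what Fubini's theorem guarantees in the limit.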
Lemma A.23. If $\mu_1$ and $\mu_2$ are $\sigma$-finite regular Borel measures, then so is $\mu_1 \otimes \mu_2$.

Proof. Regularity holds for every rectangle and hence also for the algebra of finite disjoint unions of rectangles. Thus the claim follows from Lemma A.6. $\square$
Note that we can iterate this procedure.

Lemma A.24. Suppose $\mu_j$, $j = 1, 2, 3$, are $\sigma$-finite measures. Then

  $(\mu_1 \otimes \mu_2) \otimes \mu_3 = \mu_1 \otimes (\mu_2 \otimes \mu_3).$  (A.37)

Proof. First of all note that $(\Sigma_1 \otimes \Sigma_2) \otimes \Sigma_3 = \Sigma_1 \otimes (\Sigma_2 \otimes \Sigma_3)$ is the $\sigma$-algebra generated by the rectangles $A_1 \times A_2 \times A_3$ in $X_1 \times X_2 \times X_3$. Moreover, since

  $((\mu_1 \otimes \mu_2) \otimes \mu_3)(A_1 \times A_2 \times A_3) = \mu_1(A_1)\, \mu_2(A_2)\, \mu_3(A_3) = (\mu_1 \otimes (\mu_2 \otimes \mu_3))(A_1 \times A_2 \times A_3),$

the two measures coincide on the algebra of finite disjoint unions of rectangles. Hence they coincide everywhere by Theorem A.5. $\square$
Example. If $\lambda$ is Lebesgue measure on $\mathbb{R}$, then $\lambda^n = \lambda \otimes \cdots \otimes \lambda$ is Lebesgue measure on $\mathbb{R}^n$. Since $\lambda$ is regular, so is $\lambda^n$. $\diamond$

Problem A.9. Show that the set of all finite unions of rectangles $A_1 \times A_2$ forms an algebra.
Problem A.10. Let $U \subseteq \mathbb{C}$ be a domain, $Y$ be some measure space, and $f : U \times Y \to \mathbb{C}$. Suppose $y \mapsto f(z, y)$ is measurable for every $z$ and $z \mapsto f(z, y)$ is holomorphic for every $y$. Show that

  $F(z) = \int_A f(z, y)\, d\mu(y)$  (A.38)

is holomorphic if for every compact subset $V \subset U$ there is an integrable function $g(y)$ such that $|f(z, y)| \le g(y)$, $z \in V$. (Hint: Use Fubini and Morera.)
A.6. Vague convergence of measures
Let $\mu_n$ be a sequence of Borel measures. We will say that $\mu_n$ converges to $\mu$ vaguely if

  $\int_X f\, d\mu_n \to \int_X f\, d\mu$  (A.39)

for every $f \in C_c(X)$.
We are only interested in the case of Borel measures on $\mathbb{R}$. In this case we have the following equivalent characterization of vague convergence.

Lemma A.25. Let $\mu_n$ be a sequence of Borel measures on $\mathbb{R}$. Then $\mu_n \to \mu$ vaguely if and only if the (normalized) distribution functions converge at every point of continuity of $\mu$.
Proof. Suppose $\mu_n \to \mu$ vaguely. Let $I$ be any bounded interval (closed, half closed, or open) with boundary points $x_0$, $x_1$. Moreover, choose continuous functions $f$, $g$ with compact support such that $f \le \chi_I \le g$. Then we have $\int f\, d\mu \le \mu(I) \le \int g\, d\mu$ and similarly for $\mu_n$. Hence

  $\mu(I) - \mu_n(I) \le \int g\, d\mu - \int f\, d\mu_n \le \int (g - f)\, d\mu + \left| \int f\, d\mu - \int f\, d\mu_n \right|$

and

  $\mu_n(I) - \mu(I) \le \int g\, d\mu_n - \int f\, d\mu \le \int (g - f)\, d\mu + \left| \int g\, d\mu - \int g\, d\mu_n \right|.$

Combining both estimates, we see

  $|\mu(I) - \mu_n(I)| \le \int (g - f)\, d\mu + \left| \int f\, d\mu - \int f\, d\mu_n \right| + \left| \int g\, d\mu - \int g\, d\mu_n \right|$

and so

  $\limsup_{n \to \infty} |\mu(I) - \mu_n(I)| \le \int (g - f)\, d\mu.$

Choosing $f$, $g$ such that $g - f \to \chi_{\{x_0\}} + \chi_{\{x_1\}}$ pointwise, we even get from dominated convergence that

  $\limsup_{n \to \infty} |\mu(I) - \mu_n(I)| \le \mu(\{x_0\}) + \mu(\{x_1\}),$

which proves that the distribution functions converge at every point of continuity of $\mu$.
Conversely, suppose that the distribution functions converge at every point of continuity of $\mu$. To see that in fact $\mu_n \to \mu$ vaguely, let $f \in C_c(\mathbb{R})$. Fix some $\varepsilon > 0$ and note that, since $f$ is uniformly continuous, there is a $\delta > 0$ such that $|f(x) - f(y)| \le \varepsilon$ whenever $|x - y| \le \delta$. Next, choose some points $x_0 < x_1 < \cdots < x_k$ such that $\operatorname{supp}(f) \subset (x_0, x_k)$, $\mu$ is continuous at $x_j$, and $x_j - x_{j-1} \le \delta$ (recall that a monotone function has at most countably many discontinuities). Furthermore, there is some $N$ such that $|\mu_n(x_j) - \mu(x_j)| \le \frac{\varepsilon}{2k}$ for all $j$ and $n \ge N$. Then

  $\left| \int f\, d\mu_n - \int f\, d\mu \right| \le \sum_{j=1}^{k} \int_{(x_{j-1}, x_j]} |f(x) - f(x_j)|\, d\mu_n(x) + \sum_{j=1}^{k} |f(x_j)|\, \big|\mu((x_{j-1}, x_j]) - \mu_n((x_{j-1}, x_j])\big| + \sum_{j=1}^{k} \int_{(x_{j-1}, x_j]} |f(x) - f(x_j)|\, d\mu(x).$

Now, for $n \ge N$, the first and the last term on the right-hand side are both bounded by $\varepsilon(\mu((x_0, x_k]) + \frac{\varepsilon}{k})$ and the middle term is bounded by $\max |f|\, \varepsilon$. Thus the claim follows. $\square$
Moreover, every bounded sequence of measures has a vaguely convergent subsequence.

Lemma A.26. Suppose $\mu_n$ is a sequence of finite Borel measures on $\mathbb{R}$ such that $\mu_n(\mathbb{R}) \le M$. Then there exists a subsequence which converges vaguely to some measure $\mu$ with $\mu(\mathbb{R}) \le M$.

Proof. Let $\mu_n(x) = \mu_n((-\infty, x])$ be the corresponding distribution functions. By $0 \le \mu_n(x) \le M$ there is a convergent subsequence for fixed $x$. Moreover, by the standard diagonal series trick, we can assume that $\mu_n(x)$ converges to some number $\mu(x)$ for each rational $x$. For irrational $x$ we set $\mu(x) = \inf \{ \mu(x_0) \mid x_0 > x \text{ rational} \}$. Then $\mu(x)$ is monotone, $0 \le \mu(x_1) \le \mu(x_2) \le M$ for $x_1 \le x_2$. Furthermore,

  $\mu(x-) \le \liminf_{n} \mu_n(x) \le \limsup_{n} \mu_n(x) \le \mu(x+)$

shows that $\mu_n(x) \to \mu(x)$ at every point of continuity of $\mu$. So we can redefine $\mu$ to be right continuous without changing this last fact. $\square$
In the case where the sequence is bounded, (A.39) even holds for a larger class of functions.

Lemma A.27. Suppose $\mu_n \to \mu$ vaguely and $\mu_n(\mathbb{R}) \le M$. Then (A.39) holds for any $f \in C_\infty(\mathbb{R})$.

Proof. Split $f = f_1 + f_2$, where $f_1$ has compact support and $|f_2| \le \varepsilon$. Then $\left| \int f\, d\mu - \int f\, d\mu_n \right| \le \left| \int f_1\, d\mu - \int f_1\, d\mu_n \right| + 2 \varepsilon M$ and the claim follows. $\square$
Example. The example $d\mu_n(\lambda) = d\Theta(\lambda - n)$ (a unit point mass at $n$) shows that in the above claim $f$ cannot be replaced by a bounded continuous function. Moreover, the example $d\mu_n(\lambda) = n\, d\Theta(\lambda - n)$ also shows that the uniform bound cannot be dropped. $\diamond$
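A quick numerical companion to this example: integrating against the unit point mass at $n$ just evaluates the test function at $n$, so compactly supported functions see the mass escape to infinity while the bounded function $f \equiv 1$ does not. (The tent function below is an arbitrary illustrative choice.)

```python
# The measures mu_n = unit point mass at n escape to infinity: against
# compactly supported f the integrals tend to 0 (the vague limit is the
# zero measure), but against the bounded continuous f = 1 they stay 1.

def integrate_delta(f, n):
    """Integral of f against the unit point mass at n."""
    return f(n)

f_compact = lambda x: max(0.0, 1.0 - abs(x))   # tent, supported in [-1, 1]
f_bounded = lambda x: 1.0                      # bounded, not compactly supported

vals_compact = [integrate_delta(f_compact, n) for n in range(1, 10)]
vals_bounded = [integrate_delta(f_bounded, n) for n in range(1, 10)]

assert all(v == 0.0 for v in vals_compact)  # f_compact vanishes at every n >= 1
assert all(v == 1.0 for v in vals_bounded)  # the total mass never vanishes
```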
A.7. Decomposition of measures
Let $\mu$, $\nu$ be two measures on a measure space $(X, \Sigma)$. They are called mutually singular (in symbols $\mu \perp \nu$) if they are supported on disjoint sets. That is, there is a measurable set $N$ such that $\mu(N) = 0$ and $\nu(X \setminus N) = 0$.

Example. Let $\lambda$ be the Lebesgue measure and $\Theta$ the Dirac measure (centered at 0). Then $\lambda \perp \Theta$: Just take $N = \{0\}$; then $\lambda(\{0\}) = 0$ and $\Theta(\mathbb{R} \setminus \{0\}) = 0$. $\diamond$

On the other hand, $\nu$ is called absolutely continuous with respect to $\mu$ (in symbols $\nu \ll \mu$) if $\mu(A) = 0$ implies $\nu(A) = 0$.

Example. The prototypical example is the measure $d\nu = f\, d\mu$ (compare Lemma A.15). Indeed $\mu(A) = 0$ implies

  $\nu(A) = \int_A f\, d\mu = 0$  (A.40)

and shows that $\nu$ is absolutely continuous with respect to $\mu$. In fact, we will show below that every absolutely continuous measure is of this form. $\diamond$
The two main results will follow as simple consequences of the following result:

Theorem A.28. Let $\mu$, $\nu$ be $\sigma$-finite measures. Then there exists a unique (a.e.) nonnegative function $f$ and a set $N$ of $\mu$ measure zero, such that

  $\nu(A) = \nu(A \cap N) + \int_A f\, d\mu.$  (A.41)
Proof. We first assume $\mu$, $\nu$ to be finite measures. Let $\alpha = \mu + \nu$ and consider the Hilbert space $L^2(X, d\alpha)$. Then

  $\ell(h) = \int_X h\, d\nu$

is a bounded linear functional by Cauchy–Schwarz:

  $|\ell(h)|^2 = \left| \int_X 1 \cdot h\, d\nu \right|^2 \le \left( \int |1|^2\, d\nu \right) \left( \int |h|^2\, d\nu \right) \le \nu(X) \int |h|^2\, d\alpha = \nu(X) \|h\|^2.$

Hence by the Riesz lemma (Theorem 1.8) there exists a $g \in L^2(X, d\alpha)$ such that

  $\ell(h) = \int_X h g\, d\alpha.$

By construction

  $\nu(A) = \int \chi_A\, d\nu = \int \chi_A g\, d\alpha = \int_A g\, d\alpha.$  (A.42)

In particular, $g$ must be positive a.e. (take $A$ the set where $g$ is negative). Furthermore, let $N = \{x \mid g(x) \ge 1\}$. Then

  $\nu(N) = \int_N g\, d\alpha \ge \alpha(N) = \mu(N) + \nu(N),$

which shows $\mu(N) = 0$. Now set

  $f = \frac{g}{1 - g} \chi_{N'}, \qquad N' = X \setminus N.$

Then, since (A.42) implies $d\nu = g\, d\alpha$, respectively, $d\mu = (1 - g)\, d\alpha$, we have

  $\int_A f\, d\mu = \int \chi_A \frac{g}{1 - g} \chi_{N'}\, d\mu = \int \chi_{A \cap N'}\, g\, d\alpha = \nu(A \cap N')$

as desired. Clearly $f$ is unique, since if there is a second function $\tilde{f}$, then $\int_A (f - \tilde{f})\, d\mu = 0$ for every $A$ shows $f - \tilde{f} = 0$ $\mu$ a.e.

To see the $\sigma$-finite case, observe that $X_n \nearrow X$, $\mu(X_n) < \infty$ and $Y_n \nearrow X$, $\nu(Y_n) < \infty$ implies $X_n \cap Y_n \nearrow X$ and $\alpha(X_n \cap Y_n) < \infty$. Hence when restricted to $X_n \cap Y_n$, we have sets $N_n$ and functions $f_n$. Now take $N = \bigcup N_n$ and choose $f$ such that $f|_{X_n} = f_n$ (this is possible since $f_{n+1}|_{X_n} = f_n$ a.e.). Then $\mu(N) = 0$ and

  $\nu(A \cap N') = \lim_{n} \nu(A \cap (X_n \setminus N)) = \lim_{n} \int_{A \cap X_n} f\, d\mu = \int_A f\, d\mu,$

which finishes the proof. $\square$
Now the anticipated results follow with no effort:

Theorem A.29 (Lebesgue decomposition). Let $\mu$, $\nu$ be two $\sigma$-finite measures on a measure space $(X, \Sigma)$. Then $\nu$ can be uniquely decomposed as $\nu = \nu_{ac} + \nu_{sing}$, where $\mu$ and $\nu_{sing}$ are mutually singular and $\nu_{ac}$ is absolutely continuous with respect to $\mu$.
Proof. Taking $\nu_{sing}(A) = \nu(A \cap N)$ and $d\nu_{ac} = f\, d\mu$, there is at least one such decomposition. To show uniqueness, first let $\nu$ be finite. If there is another one, $\nu = \tilde{\nu}_{ac} + \tilde{\nu}_{sing}$, then let $\tilde{N}$ be such that $\mu(\tilde{N}) = 0$ and $\tilde{\nu}_{sing}(\tilde{N}') = 0$. Then $\nu_{sing}(A) - \tilde{\nu}_{sing}(A) = \int_A (\tilde{f} - f)\, d\mu$. In particular, $\int_{A \cap N' \cap \tilde{N}'} (\tilde{f} - f)\, d\mu = 0$ and hence $\tilde{f} = f$ a.e. away from $N \cup \tilde{N}$. Since $\mu(N \cup \tilde{N}) = 0$, we have $\tilde{f} = f$ a.e. and hence $\tilde{\nu}_{ac} = \nu_{ac}$ as well as $\tilde{\nu}_{sing} = \nu - \tilde{\nu}_{ac} = \nu - \nu_{ac} = \nu_{sing}$. The $\sigma$-finite case follows as usual. $\square$
Theorem A.30 (Radon–Nikodym). Let $\mu$, $\nu$ be two $\sigma$-finite measures on a measure space $(X, \Sigma)$. Then $\nu$ is absolutely continuous with respect to $\mu$ if and only if there is a positive measurable function $f$ such that

  $\nu(A) = \int_A f\, d\mu$  (A.43)

for every $A \in \Sigma$. The function $f$ is determined uniquely a.e. with respect to $\mu$ and is called the Radon–Nikodym derivative $\frac{d\nu}{d\mu}$ of $\nu$ with respect to $\mu$.

Proof. Just observe that in this case $\nu(A \cap N) = 0$ for every $A$; that is, $\nu_{sing} = 0$. $\square$
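For a measure on $\mathbb{R}$ with a continuous density, the Radon–Nikodym derivative can be read off from difference quotients of the distribution function (cf. Problem A.11). A small numerical sketch, with the density $f(x) = 3x^2$ on $[0, 1]$ chosen purely for illustration:

```python
# For d(nu) = f d(lambda) with f(x) = 3x^2 on [0, 1], the distribution
# function is nu(x) = x^3, and a symmetric difference quotient of nu
# recovers the Radon-Nikodym derivative f.

f = lambda x: 3 * x**2          # density d(nu)/d(lambda)
nu = lambda x: x**3             # distribution function on [0, 1]

x, h = 0.5, 1e-5
approx = (nu(x + h) - nu(x - h)) / (2 * h)   # symmetric difference quotient

assert abs(approx - f(x)) < 1e-6
```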
Problem A.11. Let $\mu$ be a Borel measure on $\mathfrak{B}$ and suppose its distribution function $\mu(x)$ is differentiable. Show that the Radon–Nikodym derivative equals the ordinary derivative $\mu'(x)$.
Problem A.12. Suppose $\mu$ and $\nu$ are inner regular measures. Show that $\nu \ll \mu$ if and only if $\mu(C) = 0$ implies $\nu(C) = 0$ for every compact set $C$.
Problem A.13. Let $d\nu = f\, d\mu$. Suppose $f > 0$ a.e. with respect to $\mu$. Then $\mu \ll \nu$ and $d\mu = f^{-1}\, d\nu$.
Problem A.14 (Chain rule). Show that $\ll$ is a transitive relation. In particular, if $\lambda \ll \mu \ll \nu$, show that

  $\frac{d\lambda}{d\nu} = \frac{d\lambda}{d\mu} \frac{d\mu}{d\nu}.$
Problem A.15. Suppose $\lambda \ll \mu$. Show that for any measure $\nu$ we have

  $\frac{d\lambda}{d\mu}\, d\mu = \frac{d\lambda}{d\nu}\, d\nu + d\zeta,$

where $\zeta$ is a positive measure (depending on $\nu$) which is singular with respect to $\nu$. Show that $\zeta = 0$ if and only if $\lambda \ll \nu$.
A.8. Derivatives of measures
If $\mu$ is a Borel measure on $\mathfrak{B}$ and its distribution function $\mu(x)$ is differentiable, then the Radon–Nikodym derivative is just the ordinary derivative $\mu'(x)$ (Problem A.11). Our aim in this section is to generalize this result to arbitrary regular Borel measures on $\mathfrak{B}^n$.

We call

  $(D\mu)(x) = \lim_{\varepsilon \downarrow 0} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|}$  (A.44)

the derivative of $\mu$ at $x \in \mathbb{R}^n$ provided the above limit exists. (Here $B_r(x) \subset \mathbb{R}^n$ is a ball of radius $r$ centered at $x \in \mathbb{R}^n$ and $|A|$ denotes the Lebesgue measure of $A \in \mathfrak{B}^n$.)
Note that for a Borel measure on $\mathfrak{B}$, $(D\mu)(x)$ exists if and only if $\mu(x)$ (as defined in (A.3)) is differentiable at $x$ and $(D\mu)(x) = \mu'(x)$ in this case.

To compute the derivative of $\mu$, we introduce the upper and lower derivative,

  $(\overline{D}\mu)(x) = \limsup_{\varepsilon \downarrow 0} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|}$ and $(\underline{D}\mu)(x) = \liminf_{\varepsilon \downarrow 0} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|}.$  (A.45)
Clearly $\mu$ is differentiable at $x$ if $(\overline{D}\mu)(x) = (\underline{D}\mu)(x) < \infty$. First of all note that they are measurable:

Lemma A.31. The upper derivative is lower semicontinuous; that is, the set $\{x \mid (\overline{D}\mu)(x) > \alpha\}$ is open for every $\alpha \in \mathbb{R}$. Similarly, the lower derivative is upper semicontinuous; that is, $\{x \mid (\underline{D}\mu)(x) < \alpha\}$ is open.

Proof. We only prove the claim for $\overline{D}\mu$, the case $\underline{D}\mu$ being similar. Abbreviate

  $M_r(x) = \sup_{0 < \varepsilon < r} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|}$

and note that it suffices to show that $O_r = \{x \mid M_r(x) > \alpha\}$ is open.

If $x \in O_r$, there is some $\varepsilon < r$ such that

  $\frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|} > \alpha.$

Let $\delta > 0$ and $y \in B_\delta(x)$. Then $B_\varepsilon(x) \subseteq B_{\varepsilon+\delta}(y)$ implying

  $\frac{\mu(B_{\varepsilon+\delta}(y))}{|B_{\varepsilon+\delta}(y)|} \ge \left( \frac{\varepsilon}{\varepsilon+\delta} \right)^n \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|} > \alpha$

for $\delta$ sufficiently small. That is, $B_\delta(x) \subseteq O_r$. $\square$
In particular, both the upper and lower derivatives are measurable.

Next, the following geometric fact of $\mathbb{R}^n$ will be needed.

Lemma A.32. Given open balls $B_1, \dots, B_m$ in $\mathbb{R}^n$, there is a subset of disjoint balls $B_{j_1}, \dots, B_{j_k}$ such that

  $\left| \bigcup_{i=1}^{m} B_i \right| \le 3^n \sum_{i=1}^{k} |B_{j_i}|.$  (A.46)
Proof. Assume that the balls $B_j$ are ordered by decreasing radius. Start with $B_{j_1} = B_1 = B_{r_1}(x_1)$ and remove all balls from our list which intersect $B_{j_1}$. Observe that the removed balls are all contained in $3B_1 = B_{3r_1}(x_1)$. Proceeding like this, we obtain $B_{j_1}, \dots, B_{j_k}$ such that

  $\bigcup_{i=1}^{m} B_i \subseteq \bigcup_{i=1}^{k} 3 B_{j_i}$

and the claim follows since $|3B| = 3^n |B|$. $\square$
Now we can show

Lemma A.33. Let $\alpha > 0$. For any Borel set $A$ we have

  $|\{x \in A \mid (\overline{D}\mu)(x) > \alpha\}| \le 3^n \frac{\mu(A)}{\alpha}$  (A.47)

and

  $|\{x \in A \mid (\overline{D}\mu)(x) > 0\}| = 0, \quad \text{whenever } \mu(A) = 0.$  (A.48)

Proof. Let $A_\alpha = \{x \in A \mid (\overline{D}\mu)(x) > \alpha\}$. We will show

  $|K| \le 3^n \frac{\mu(O)}{\alpha}$

for any compact set $K$ and open set $O$ with $K \subseteq A_\alpha \subseteq O$. The first claim then follows from regularity of $\mu$ and the Lebesgue measure.

Given fixed $K$, $O$, for every $x \in K$ there is some $r_x$ such that $B_{r_x}(x) \subseteq O$ and $|B_{r_x}(x)| < \alpha^{-1} \mu(B_{r_x}(x))$. Since $K$ is compact, we can choose a finite subcover of $K$. Moreover, by Lemma A.32 we can refine our set of balls such that

  $|K| \le 3^n \sum_{i=1}^{k} |B_{r_i}(x_i)| < \frac{3^n}{\alpha} \sum_{i=1}^{k} \mu(B_{r_i}(x_i)) \le 3^n \frac{\mu(O)}{\alpha}.$

To see the second claim, observe that

  $\{x \in A \mid (\overline{D}\mu)(x) > 0\} = \bigcup_{j=1}^{\infty} \{x \in A \mid (\overline{D}\mu)(x) > \tfrac{1}{j}\}$

and by the first part $|\{x \in A \mid (\overline{D}\mu)(x) > \tfrac{1}{j}\}| = 0$ for any $j$ if $\mu(A) = 0$. $\square$
Theorem A.34 (Lebesgue). Let $f$ be (locally) integrable. Then for a.e. $x \in \mathbb{R}^n$ we have

  $\lim_{r \downarrow 0} \frac{1}{|B_r(x)|} \int_{B_r(x)} |f(y) - f(x)|\, dy = 0.$  (A.49)

Proof. Decompose $f$ as $f = g + h$, where $g$ is continuous and $\|h\|_1 < \varepsilon$ (Theorem 0.34) and abbreviate

  $D_r(f)(x) = \frac{1}{|B_r(x)|} \int_{B_r(x)} |f(y) - f(x)|\, dy.$
Then, since $\lim D_r(g)(x) = 0$ (for every $x$) and $D_r(f) \le D_r(g) + D_r(h)$, we have

  $\limsup_{r \downarrow 0} D_r(f)(x) \le \limsup_{r \downarrow 0} D_r(h)(x) \le (\overline{D}\mu)(x) + |h(x)|,$

where $d\mu = |h|\, dx$. This implies

  $\{x \mid \limsup_{r \downarrow 0} D_r(f)(x) \ge 2\alpha\} \subseteq \{x \mid (\overline{D}\mu)(x) \ge \alpha\} \cup \{x \mid |h(x)| \ge \alpha\}$

and using the first part of Lemma A.33 plus $|\{x \mid |h(x)| \ge \alpha\}| \le \alpha^{-1} \|h\|_1$, we see

  $|\{x \mid \limsup_{r \downarrow 0} D_r(f)(x) \ge 2\alpha\}| \le (3^n + 1) \frac{\varepsilon}{\alpha}.$

Since $\varepsilon$ is arbitrary, the Lebesgue measure of this set must be zero for every $\alpha$. That is, the set where the limsup is positive has Lebesgue measure zero. $\square$
The points where (A.49) holds are called Lebesgue points of $f$.

Note that the balls can be replaced by more general sets: A sequence of sets $A_j(x)$ is said to shrink to $x$ nicely if there are balls $B_{r_j}(x)$ with $r_j \to 0$ and a constant $\varepsilon > 0$ such that $A_j(x) \subseteq B_{r_j}(x)$ and $|A_j| \ge \varepsilon |B_{r_j}(x)|$. For example, $A_j(x)$ could be some balls or cubes (not necessarily containing $x$). However, the portion of $B_{r_j}(x)$ which they occupy must not go to zero! For example the rectangles $(0, \frac{1}{j}) \times (0, \frac{2}{j}) \subset \mathbb{R}^2$ do shrink nicely to $0$, but the rectangles $(0, \frac{1}{j}) \times (0, \frac{2}{j^2})$ do not.
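The two rectangle families can be checked directly. The following sketch computes the portion $|A_j|/|B_{r_j}(0)|$, taking for $B_{r_j}(0)$ the smallest ball centered at $0$ containing the rectangle:

```python
import math

# Portion of B_{r_j}(0) occupied by the rectangles from the text:
# (0, 1/j) x (0, 2/j) stays bounded below (nicely shrinking), while
# (0, 1/j) x (0, 2/j^2) has portion tending to zero (not nice).

def portion(j, height_exp):
    a, b = 1.0 / j, 2.0 / j**height_exp    # rectangle side lengths
    r = math.hypot(a, b)                   # smallest ball B_r(0) containing it
    return (a * b) / (math.pi * r**2)      # |A_j| / |B_r(0)|

nice = [portion(j, 1) for j in (10, 100, 1000)]
bad = [portion(j, 2) for j in (10, 100, 1000)]

assert min(nice) > 0.1        # constant 2/(5*pi), bounded away from zero
assert bad[-1] < 1e-3         # portion ~ 2/(pi*j) goes to zero
```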
Lemma A.35. Let $f$ be (locally) integrable. Then at every Lebesgue point we have

  $f(x) = \lim_{j \to \infty} \frac{1}{|A_j(x)|} \int_{A_j(x)} f(y)\, dy$  (A.50)

whenever $A_j(x)$ shrinks to $x$ nicely.

Proof. Let $x$ be a Lebesgue point and choose some nicely shrinking sets $A_j(x)$ with corresponding $B_{r_j}(x)$ and $\varepsilon$. Then

  $\frac{1}{|A_j(x)|} \int_{A_j(x)} |f(y) - f(x)|\, dy \le \frac{1}{\varepsilon |B_{r_j}(x)|} \int_{B_{r_j}(x)} |f(y) - f(x)|\, dy$

and the claim follows. $\square$
Corollary A.36. Suppose $\mu$ is an absolutely continuous Borel measure on $\mathbb{R}$. Then its distribution function is differentiable a.e. and $d\mu(x) = \mu'(x)\, dx$.

As another consequence we obtain

Theorem A.37. Let $\mu$ be a Borel measure on $\mathfrak{B}^n$. The derivative $D\mu$ exists a.e. with respect to Lebesgue measure and equals the Radon–Nikodym derivative of the absolutely continuous part of $\mu$ with respect to Lebesgue measure; that is,

  $\mu_{ac}(A) = \int_A (D\mu)(x)\, dx.$  (A.51)

Proof. If $d\mu = f\, dx$ is absolutely continuous with respect to Lebesgue measure, the claim follows from Theorem A.34. To see the general case, use the Lebesgue decomposition of $\mu$ and let $N$ be a support for the singular part with $|N| = 0$. Then $(D\mu_{sing})(x) = 0$ for a.e. $x \in \mathbb{R}^n \setminus N$ by the second part of Lemma A.33. $\square$

In particular, $\mu$ is singular with respect to Lebesgue measure if and only if $D\mu = 0$ a.e. with respect to Lebesgue measure.
Using the upper and lower derivatives, we can also give supports for the absolutely and singularly continuous parts.

Theorem A.38. The set $\{x \mid (D\mu)(x) = \infty\}$ is a support for the singular part and $\{x \mid 0 < (D\mu)(x) < \infty\}$ is a support for the absolutely continuous part.

Proof. First suppose $\mu$ is purely singular. Let us show that the set $O_k = \{x \mid (\underline{D}\mu)(x) < k\}$ satisfies $\mu(O_k) = 0$ for every $k \in \mathbb{N}$.

Let $K \subseteq O_k$ be compact, and let $V_j \supseteq K$ be some open set such that $|V_j \setminus K| \le \frac{1}{j}$. For every $x \in K$ there is some $\varepsilon = \varepsilon(x)$ such that $B_\varepsilon(x) \subseteq V_j$ and $\mu(B_\varepsilon(x)) \le k |B_\varepsilon(x)|$. By compactness, finitely many of these balls cover $K$ and hence

  $\mu(K) \le \sum_i \mu(B_{\varepsilon_i}(x_i)) \le k \sum_i |B_{\varepsilon_i}(x_i)|.$

Selecting disjoint balls as in Lemma A.32 further shows

  $\mu(K) \le k\, 3^n \sum_\ell |B_{\varepsilon_{i_\ell}}(x_{i_\ell})| \le k\, 3^n |V_j|.$

Letting $j \to \infty$, we see $\mu(K) \le k\, 3^n |K|$ and by regularity we even have $\mu(A) \le k\, 3^n |A|$ for every $A \subseteq O_k$. Hence $\mu$ is absolutely continuous on $O_k$ and since we assumed $\mu$ to be singular, we must have $\mu(O_k) = 0$.

Thus $(\underline{D}\mu_{sing})(x) = \infty$ for a.e. $x$ with respect to $\mu_{sing}$ and we are done. $\square$
Finally, we note that these supports are minimal. Here a support $M$ of some measure $\mu$ is called a minimal support (it is sometimes also called an essential support) if any subset $M_0 \subseteq M$ which does not support $\mu$ (i.e., $\mu(M_0) = 0$) has Lebesgue measure zero.
Lemma A.39. The set $M_{ac} = \{x \mid 0 < (D\mu)(x) < \infty\}$ is a minimal support for $\mu_{ac}$.

Proof. Suppose $M_0 \subseteq M_{ac}$ and $\mu_{ac}(M_0) = 0$. Set $M_\varepsilon = \{x \in M_0 \mid \varepsilon < (D\mu)(x)\}$ for $\varepsilon > 0$. Then $M_\varepsilon \nearrow M_0$ and

  $|M_\varepsilon| = \int_{M_\varepsilon} dx \le \frac{1}{\varepsilon} \int_{M_\varepsilon} (D\mu)(x)\, dx = \frac{1}{\varepsilon} \mu_{ac}(M_\varepsilon) \le \frac{1}{\varepsilon} \mu_{ac}(M_0) = 0$

shows $|M_0| = \lim_{\varepsilon \downarrow 0} |M_\varepsilon| = 0$. $\square$
Note that the set $M = \{x \mid 0 < (D\mu)(x)\}$ is a minimal support of $\mu$.
Example. The Cantor function is constructed as follows: Take the sets $C_n$ used in the construction of the Cantor set $C$: $C_n$ is the union of $2^n$ closed intervals with $2^n - 1$ open gaps in between. Set $f_n$ equal to $j/2^n$ on the $j$th gap of $C_n$ and extend it to $[0, 1]$ by linear interpolation. Note that, since we are creating precisely one new gap between every old gap when going from $C_n$ to $C_{n+1}$, the value of $f_{n+1}$ is the same as the value of $f_n$ on the gaps of $C_n$. In particular, $\|f_n - f_m\|_\infty \le 2^{-\min(n,m)}$ and hence we can define the Cantor function as $f = \lim_{n \to \infty} f_n$. By construction $f$ is a continuous function which is constant on every subinterval of $[0, 1] \setminus C$. Since $C$ is of Lebesgue measure zero, this set is of full Lebesgue measure and hence $f' = 0$ a.e. in $[0, 1]$. In particular, the corresponding measure, the Cantor measure, is supported on $C$ and is purely singular with respect to Lebesgue measure. $\diamond$
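The construction can be sketched in code. The approximation below uses the self-similar recursion $f(x) = \frac12 f(3x)$ on $[0, \frac13)$, $f \equiv \frac12$ on the middle gap, and $f(x) = \frac12 + \frac12 f(3x - 2)$ on $(\frac23, 1]$, which reproduces the approximants $f_n$ above:

```python
def cantor(x, n=30):
    """Approximate the Cantor function on [0, 1] to resolution 2^(-n)."""
    y = 0.0
    scale = 0.5
    for _ in range(n):
        if x < 1.0 / 3.0:
            x = 3.0 * x                # left third: f(x) = f(3x)/2
        elif x <= 2.0 / 3.0:
            return y + scale           # middle gap: f is constant there
        else:
            y += scale                 # right third: f(x) = 1/2 + f(3x-2)/2
            x = 3.0 * x - 2.0
        scale /= 2.0
    return y

assert cantor(0.0) == 0.0
assert cantor(0.5) == 0.5              # value on the first gap (1/3, 2/3)
assert cantor(1.0/3.0 + 0.01) == 0.5   # constant on the whole middle gap
assert abs(cantor(1.0) - 1.0) < 1e-6
```

Evaluating on a fine grid shows the function climbing only on the Cantor set, consistent with $f' = 0$ a.e.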
Problem A.16. Show that $M = \{x \mid 0 < (D\mu)(x)\}$ is a minimal support of $\mu$.
Bibliographical notes
The aim of this section is not to give a comprehensive guide to the literature,
but to document the sources from which I have learned the materials and
which I have used during the preparation of this text. In addition, I will
point out some standard references for further reading. In some sense all
books on this topic are inspired by von Neumann's celebrated monograph
[64] and the present text is no exception.
General references for the first part are Akhiezer and Glazman [2],
Berthier (Boutet de Monvel) [9], Blank, Exner, and Havlíček [10], Edmunds
and Evans [16], Lax [25], Reed and Simon [40], Weidmann [60], [62], or
Yosida [66].
Chapter 0: A rst look at Banach and Hilbert spaces
As a reference for general background I can warmly recommend Kelly's
classical book [26]. The rest is standard material and can be found in any
book on functional analysis.
Chapter 1: Hilbert spaces
The material in this chapter is again classical and can be found in any book
on functional analysis. I mainly follow Reed and Simon [40], respectively,
Weidmann [60], with the main difference being that I use orthonormal sets
and their projections as the central theme from which everything else is
derived. For an alternate problem-based approach see Halmos' book [22].
Chapter 2: Self-adjointness and spectrum
This chapter is still similar in spirit to [40], [60] with some ideas taken from
Schechter [48].
Chapter 3: The spectral theorem
The approach via the Herglotz representation theorem follows Weidmann
[60]. However, I use projection-valued measures as in Reed and Simon [40]
rather than the resolution of the identity. Moreover, I have augmented the
discussion by adding material on spectral types and the connections with
the boundary values of the resolvent. For a survey containing several recent
results see [28].
Chapter 4: Applications of the spectral theorem
This chapter collects several applications from various sources which I have
found useful or which are needed later on. Again Reed and Simon [40] and
Weidmann [60], [63] are the main references here.
Chapter 5: Quantum dynamics
The material is a synthesis of the lecture notes by Enß [18], Reed and Simon
[40], [42], and Weidmann [63].
Chapter 6: Perturbation theory for self-adjoint operators
This chapter is similar to [60] (which contains more results) with the main
difference that I have added some material on quadratic forms. In particular,
the section on quadratic forms contains, in addition to the classical results,
some material which I consider useful but was unable to find (at least not
in the present form) in the literature. The prime reference here is Kato's
monumental treatise [24] and Simon's book [49]. For further information
on trace class operators see Simon's classic [52]. The idea to extend the
usual notion of strong resolvent convergence by allowing the approximating
operators to live on subspaces is taken from Weidmann [62].
Chapter 7: The free Schrödinger operator
Most of the material is classical. Much more on the Fourier transform can
be found in Reed and Simon [41].
Chapter 8: Algebraic methods
This chapter collects some material which can be found in almost any physics
text book on quantum mechanics. My only contribution is to provide some
mathematical details. I recommend the classical book by Thirring [58] and
the visual guides by Thaller [56], [57].
Chapter 9: One-dimensional Schrödinger operators
One-dimensional models have always played a central role in understand-
ing quantum mechanical phenomena. In particular, general wisdom used to
say that Schrödinger operators should have absolutely continuous spectrum
plus some discrete point spectrum, while singular continuous spectrum is a
pathology that should not occur in examples with bounded V [14, Sect. 10.4].
In fact, a large part of [43] is devoted to establishing the absence of sin-
gular continuous spectrum. This was proven wrong by Pearson, who con-
structed an explicit one-dimensional example with singular continuous spec-
trum. Moreover, after the appearance of random models, it became clear
that such kind of exotic spectra (singular continuous or dense pure point)
are frequently generic. The starting point is often the boundary behaviour
of the Weyl m-function and its connection with the growth properties of
solutions of the underlying dierential equation, the latter being known as
Gilbert and Pearson or subordinacy theory. One of my main goals is to give
a modern introduction to this theory. The section on inverse spectral theory
presents a simple proof for the BorgMarchenko theorem (in the local ver-
sion of Simon) from Bennewitz [8]. Again this result is the starting point of
almost all other inverse spectral results for Sturm–Liouville equations and
should enable the reader to start reading research papers in this area.
Other references with further information are the lecture notes by Weid-
mann [61] or the classical books by Coddington and Levinson [13], Levitan
[29], Levitan and Sargsjan [30], [31], Marchenko [33], Naimark [34], Pearson [37]. See also the recent monographs by Rofe-Beketov and Kholkin [46],
Zettl [67] or the recent collection of historic and survey articles [4]. For a
nice introduction to random models I can recommend the recent notes by
Kirsch [27] or the classical monographs by Carmona and Lacroix [11] or Pastur and Figotin [36]. For the discrete analog of Sturm–Liouville operators,
Jacobi operators, see my monograph [54].
Chapter 10: One-particle Schrödinger operators
The presentation in the first two sections is influenced by Enß [18] and
Thirring [58]. The solution of the Schrödinger equation in spherical coordi-
nates can be found in any text book on quantum mechanics. Again I tried
to provide some missing mathematical details. Several other explicitly solv-
able examples can be found in the books by Albeverio et al. [3] or Flügge
[19]. For the formulation of quantum mechanics via path integrals I suggest
Roepstorff [45] or Simon [50].
Chapter 11: Atomic Schrödinger operators
This chapter essentially follows Cycon, Froese, Kirsch, and Simon [14]. For
a recent review see Simon [51].
Chapter 12: Scattering theory
This chapter follows the lecture notes by Enß [18] (see also [17]) using some
material from Perry [38]. Further information on mathematical scattering
theory can be found in Amrein, Jauch, and Sinha [5], Baumgaertel and
Wollenberg [6], Chadan and Sabatier [12], Cycon, Froese, Kirsch, and Simon
[14], Newton [35], Pearson [37], Reed and Simon [42], or Yafaev [65].
Appendix A: Almost everything about Lebesgue integration
Most parts follow Rudin's book [47], respectively, Bauer [7], with some ideas
also taken from Weidmann [60]. I have tried to strip everything down to the
results needed here while staying self-contained. Another useful reference is
the book by Lieb and Loss [32].
Bibliography
[1] M. Abramovitz and I. A. Stegun, Handbook of Mathematical Functions, Dover,
New York, 1972.
[2] N. I. Akhiezer and I. M. Glazman, Theory of Linear Operators in Hilbert Space,
Vols. I and II, Pitman, Boston, 1981.
[3] S. Albeverio, F. Gesztesy, R. Høegh-Krohn, and H. Holden, Solvable Models
in Quantum Mechanics, 2nd ed., American Mathematical Society, Providence,
2005.
[4] W. O. Amrein, A. M. Hinz, and D. B. Pearson, Sturm–Liouville Theory: Past
and Present, Birkhäuser, Basel, 2005.
[5] W. O. Amrein, J. M. Jauch, and K. B. Sinha, Scattering Theory in Quantum
Mechanics, W. A. Benjamin Inc., New York, 1977.
[6] H. Baumgaertel and M. Wollenberg, Mathematical Scattering Theory,
Birkhäuser, Basel, 1983.
[7] H. Bauer, Measure and Integration Theory, de Gruyter, Berlin, 2001.
[8] C. Bennewitz, A proof of the local Borg–Marchenko theorem, Commun. Math.
Phys. 218, 131–132 (2001).
[9] A. M. Berthier, Spectral Theory and Wave Operators for the Schrödinger Equation, Pitman, Boston, 1982.
[10] J. Blank, P. Exner, and M. Havlíček, Hilbert-Space Operators in Quantum
Physics, 2nd ed., Springer, Dordrecht, 2008.
[11] R. Carmona and J. Lacroix, Spectral Theory of Random Schrödinger Operators,
Birkhäuser, Boston, 1990.
[12] K. Chadan and P. C. Sabatier, Inverse Problems in Quantum Scattering Theory,
Springer, New York, 1989.
[13] E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations,
Krieger, Malabar, 1985.
[14] H. L. Cycon, R. G. Froese, W. Kirsch, and B. Simon, Schrödinger Operators,
2nd printing, Springer, Berlin, 2008.
[15] M. Demuth and M. Krishna, Determining Spectra in Quantum Theory,
Birkhäuser, Boston, 2005.
[16] D. E. Edmunds and W. D. Evans, Spectral Theory and Differential Operators,
Oxford University Press, Oxford, 1987.
[17] V. Enss, Asymptotic completeness for quantum mechanical potential scattering,
Comm. Math. Phys. 61, 285–291 (1978).
[18] V. Enß, Schrödinger Operators, lecture notes (unpublished).
[19] S. Flügge, Practical Quantum Mechanics, Springer, Berlin, 1994.
[20] I. Gohberg, S. Goldberg, and N. Krupnik, Traces and Determinants of Linear
Operators, Birkhäuser, Basel, 2000.
[21] S. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics,
Springer, Berlin, 2003.
[22] P. R. Halmos, A Hilbert Space Problem Book, 2nd ed., Springer, New York, 1984.
[23] P. D. Hislop and I. M. Sigal, Introduction to Spectral Theory, Springer, New
York, 1996.
[24] T. Kato, Perturbation Theory for Linear Operators, Springer, New York, 1966.
[25] P. D. Lax, Functional Analysis, Wiley-Interscience, New York, 2002.
[26] J. L. Kelly, General Topology, Springer, New York, 1955.
[27] W. Kirsch, An Invitation to Random Schrödinger Operators, in Random
Schrödinger Operators, M. Disertori et al. (eds.), 1–119, Panoramas et Synthèses
25, Société Mathématique de France, Paris, 2008.
[28] Y. Last, Quantum dynamics and decompositions of singular continuous spectra,
J. Funct. Anal. 142, 406–445 (1996).
[29] B. M. Levitan, Inverse Sturm–Liouville Problems, VNU Science Press, Utrecht,
1987.
[30] B. M. Levitan and I. S. Sargsjan, Introduction to Spectral Theory, American
Mathematical Society, Providence, 1975.
[31] B. M. Levitan and I. S. Sargsjan, Sturm–Liouville and Dirac Operators, Kluwer
Academic Publishers, Dordrecht, 1991.
[32] E. Lieb and M. Loss, Analysis, American Mathematical Society, Providence,
1997.
[33] V. A. Marchenko, Sturm–Liouville Operators and Applications, Birkhäuser,
Basel, 1986.
[34] M. A. Naimark, Linear Differential Operators, Parts I and II, Ungar, New York,
1967 and 1968.
[35] R. G. Newton, Scattering Theory of Waves and Particles, 2nd ed., Dover, New
York, 2002.
[36] L. Pastur and A. Figotin, Spectra of Random and Almost-Periodic Operators,
Springer, Berlin, 1992.
[37] D. Pearson, Quantum Scattering and Spectral Theory, Academic Press, London,
1988.
[38] P. Perry, Mellin transforms and scattering theory, Duke Math. J. 47, 187–193
(1987).
[39] E. Prugovecki, Quantum Mechanics in Hilbert Space, 2nd ed., Academic Press,
New York, 1981.
[40] M. Reed and B. Simon, Methods of Modern Mathematical Physics I. Functional
Analysis, rev. and enl. ed., Academic Press, San Diego, 1980.
[41] M. Reed and B. Simon, Methods of Modern Mathematical Physics II. Fourier
Analysis, Self-Adjointness, Academic Press, San Diego, 1975.
[42] M. Reed and B. Simon, Methods of Modern Mathematical Physics III. Scattering
Theory, Academic Press, San Diego, 1979.
[43] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV. Analysis
of Operators, Academic Press, San Diego, 1978.
[44] J. R. Retherford, Hilbert Space: Compact Operators and the Trace Theorem,
Cambridge University Press, Cambridge, 1993.
[45] G. Roepstorff, Path Integral Approach to Quantum Physics, Springer, Berlin,
1994.
[46] F. S. Rofe-Beketov and A. M. Kholkin, Spectral Analysis of Differential Operators.
Interplay Between Spectral and Oscillatory Properties, World Scientific, Hackensack, 2005.
[47] W. Rudin, Real and Complex Analysis, 3rd ed., McGraw-Hill, New York, 1987.
[48] M. Schechter, Operator Methods in Quantum Mechanics, North Holland, New
York, 1981.
[49] B. Simon, Quantum Mechanics for Hamiltonians Dened as Quadratic Forms,
Princeton University Press, Princeton, 1971.
[50] B. Simon, Functional Integration and Quantum Physics, Academic Press, New
York, 1979.
[51] B. Simon, Schrödinger operators in the twentieth century, J. Math. Phys. 41:6,
3523–3555 (2000).
[52] B. Simon, Trace Ideals and Their Applications, 2nd ed., American Mathematical
Society, Providence, 2005.
[53] E. Stein and R. Shakarchi, Complex Analysis, Princeton University Press, Prince-
ton, 2003.
[54] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Math.
Surv. and Mon. 72, Amer. Math. Soc., Providence, 2000.
[55] B. Thaller, The Dirac Equation, Springer, Berlin, 1992.
[56] B. Thaller, Visual Quantum Mechanics, Springer, New York, 2000.
[57] B. Thaller, Advanced Visual Quantum Mechanics, Springer, New York, 2005.
[58] W. Thirring, Quantum Mechanics of Atoms and Molecules, Springer, New York,
1981.
[59] G. N. Watson, A Treatise on the Theory of Bessel Functions, 2nd ed., Cambridge
University Press, Cambridge, 1962.
[60] J. Weidmann, Linear Operators in Hilbert Spaces, Springer, New York, 1980.
[61] J. Weidmann, Spectral Theory of Ordinary Differential Operators, Lecture Notes
in Mathematics, 1258, Springer, Berlin, 1987.
[62] J. Weidmann, Lineare Operatoren in Hilberträumen, Teil 1: Grundlagen, B. G.
Teubner, Stuttgart, 2000.
[63] J. Weidmann, Lineare Operatoren in Hilberträumen, Teil 2: Anwendungen, B. G.
Teubner, Stuttgart, 2003.
[64] J. von Neumann, Mathematical Foundations of Quantum Mechanics, Princeton
University Press, Princeton, 1996.
[65] D. R. Yafaev, Mathematical Scattering Theory: General Theory, American Math-
ematical Society, Providence, 1992.
[66] K. Yosida, Functional Analysis, 6th ed., Springer, Berlin, 1980.
[67] A. Zettl, Sturm–Liouville Theory, American Mathematical Society, Providence,
2005.
Glossary of notation
AC(I) . . . absolutely continuous functions, 84
B = B^1, B^n . . . Borel σ-field of R^n, 260
C(H) . . . set of compact operators, 128
C(U) . . . set of continuous functions from U to C
C_∞(U) . . . set of functions in C(U) which vanish at ∞
C(U, V ) . . . set of continuous functions from U to V
C_c^∞(U, V ) . . . set of compactly supported smooth functions
χ_Ω(.) . . . characteristic function of the set Ω
dim . . . dimension of a linear space
dist(x, Y ) = inf_{y∈Y} ‖x − y‖, distance between x and Y
D(.) . . . domain of an operator
e . . . exponential function, e^z = exp(z)
E(A) . . . expectation of an operator A, 55
F . . . Fourier transform, 161
H . . . Schrödinger operator, 221
H_0 . . . free Schrödinger operator, 167
H^m(a, b) . . . Sobolev space, 85
H^m(R^n) . . . Sobolev space, 164
hull(.) . . . convex hull
ℌ . . . a separable Hilbert space
i . . . complex unity, i^2 = −1
I . . . identity operator
Im(.) . . . imaginary part of a complex number
inf . . . inmum
Ker(A) . . . kernel of an operator A, 22
L(X, Y ) . . . set of all bounded linear operators from X to Y , 23
L(X) = L(X, X)
L^p(X, dμ) . . . Lebesgue space of p integrable functions, 26
L^p_loc(X, dμ) . . . locally p integrable functions, 31
L^p_c(X, dμ) . . . compactly supported p integrable functions
L^∞(X, dμ) . . . Lebesgue space of bounded functions, 26
L^∞_∞(R^n) . . . Lebesgue space of bounded functions vanishing at ∞
ℓ^1(N) . . . Banach space of summable sequences, 13
ℓ^2(N) . . . Hilbert space of square summable sequences, 17
ℓ^∞(N) . . . Banach space of bounded sequences, 13
λ . . . a real number
m_a(z) . . . Weyl m-function, 199
M(z) . . . Weyl M-matrix, 211
max . . . maximum
M . . . Mellin transform, 251
μ_ψ . . . spectral measure, 95
N . . . the set of positive integers
N_0 = N ∪ {0}
o(x) . . . Landau symbol little-o
O(x) . . . Landau symbol big-O
Ω . . . a Borel set
Ω_± . . . wave operators, 247
P_A(.) . . . family of spectral projections of an operator A, 96
P_± . . . projector onto outgoing/incoming states, 250
Q . . . the set of rational numbers
Q(.) . . . form domain of an operator, 97
R(I, X) . . . set of regulated functions, 112
R_A(z) . . . resolvent of A, 74
Ran(A) . . . range of an operator A, 22
rank(A) = dimRan(A), rank of an operator A, 127
Re(.) . . . real part of a complex number
ρ(A) . . . resolvent set of A, 73
R . . . the set of real numbers
S(I, X) . . . set of simple functions, 112
S(R^n) . . . set of smooth functions with rapid decay, 161
sign(x) . . . +1 for x > 0 and −1 for x < 0; sign function
σ(A) . . . spectrum of an operator A, 73
σ_ac(A) . . . absolutely continuous spectrum of A, 106
σ_sc(A) . . . singular continuous spectrum of A, 106
σ_pp(A) . . . pure point spectrum of A, 106
σ_p(A) . . . point spectrum (set of eigenvalues) of A, 103
σ_d(A) . . . discrete spectrum of A, 145
σ_ess(A) . . . essential spectrum of A, 145
span(M) . . . set of nite linear combinations from M, 14
sup . . . supremum
supp(f) . . . support of a function f, 7
Z . . . the set of integers
z . . . a complex number
√z . . . square root of z with branch cut along (−∞, 0]
z* . . . complex conjugation
A* . . . adjoint of A, 59
A̅ . . . closure of A, 63
f̂ = Ff, Fourier transform of f, 161
f̌ = F^(−1)f, inverse Fourier transform of f, 163
‖.‖ . . . norm in the Hilbert space ℌ, 17
‖.‖_p . . . norm in the Banach space L^p, 25
⟨., ..⟩ . . . scalar product in ℌ, 17
E_ψ(A) = ⟨ψ, Aψ⟩, expectation value, 56
Δ_ψ(A) = E_ψ(A^2) − E_ψ(A)^2, variance, 56
Δ . . . Laplace operator, 167
∂ . . . gradient, 162
∂_α . . . derivative, 161
⊕ . . . orthogonal sum of linear spaces or operators, 45, 79
M^⊥ . . . orthogonal complement, 43
A′ . . . complement of a set
(λ_1, λ_2) = {λ ∈ R | λ_1 < λ < λ_2}, open interval
[λ_1, λ_2] = {λ ∈ R | λ_1 ≤ λ ≤ λ_2}, closed interval
ψ_n → ψ . . . norm convergence, 12
ψ_n ⇀ ψ . . . weak convergence, 49
A_n → A . . . norm convergence
A_n →s A . . . strong convergence, 50
A_n ⇀ A . . . weak convergence, 50
A_n →nr A . . . norm resolvent convergence, 153
A_n →sr A . . . strong resolvent convergence, 153
Index
a.e., see almost everywhere
absolute value of an operator, 99
absolute convergence, 16
absolutely continuous
function, 84
measure, 280
spectrum, 106
adjoint operator, 47, 59
algebra, 259
almost everywhere, 262
angular momentum operator, 176
B.L.T. theorem, 23
Baire category theorem, 32
Banach algebra, 24
Banach space, 13
Banach–Steinhaus theorem, 33
base, 5
basis, 14
orthonormal, 40
spectral, 93
Bessel function, 171
spherical, 230
Bessel inequality, 39
Borel
function, 269
measure, 261
regular, 261
set, 260
-algebra, 260
transform, 95, 100
boundary condition
Dirichlet, 188
Neumann, 188
periodic, 188
bounded
operator, 23
sesquilinear form, 21
C-real, 83
canonical form of compact operators, 137
Cantor
function, 287
measure, 287
set, 262
Cauchy sequence, 6
Cauchy–Schwarz–Bunjakowski inequality, 18
Cayley transform, 81
Cesàro average, 126
characteristic function, 270
closable
form, 71
operator, 63
closed
form, 71
operator, 63
closed graph theorem, 66
closed set, 5
closure, 5
essential, 104
commute, 115
compact, 8
locally, 10
sequentially, 8
complete, 6, 13
completion, 22
configuration space, 56
conjugation, 83
continuous, 7
convergence, 6
convolution, 165
core, 63
cover, 8
C*-algebra, 48
cyclic vector, 93
dense, 6
dilation group, 223
Dirac measure, 261, 274
Dirichlet boundary condition, 188
discrete topology, 4
distance, 3, 10
distribution function, 261
domain, 22, 56, 58
dominated convergence theorem, 273
eigenspace, 112
eigenvalue, 74
multiplicity, 112
eigenvector, 74
element
adjoint, 48
normal, 48
positive, 48
self-adjoint, 48
unitary, 48
equivalent norms, 20
essential
closure, 104
range, 74
spectrum, 145
supremum, 26
expectation, 55
extension, 59
finite intersection property, 8
first resolvent formula, 75
form, 71
bound, 149
bounded, 21, 72
closable, 71
closed, 71
core, 72
domain, 68, 97
hermitian, 71
nonnegative, 71
semi-bounded, 71
Fourier
series, 41
transform, 126, 161
Friedrichs extension, 70
Fubini theorem, 276
function
absolutely continuous, 84
Gaussian wave packet, 175
gradient, 162
Gram–Schmidt orthogonalization, 42
graph, 63
graph norm, 64
Green's function, 171
ground state, 235
Hamiltonian, 57
harmonic oscillator, 178
Hausdor space, 5
Heine–Borel theorem, 10
Heisenberg picture, 130
Hellinger–Toeplitz theorem, 67
Herglotz
function, 95
representation theorem, 107
Hermite polynomials, 179
hermitian
form, 71
operator, 58
Hilbert space, 17, 37
separable, 41
Hölder's inequality, 26
HVZ theorem, 242
hydrogen atom, 222
ideal, 48
identity, 24
induced topology, 5
inner product, 17
inner product space, 17
integrable, 273
integral, 270
interior, 6
interior point, 4
intertwining property, 248
involution, 48
ionization, 242
Jacobi operator, 67
Kato–Rellich theorem, 135
kernel, 22
KLMN theorem, 150
l.c., see limit circle
l.p., see limit point
Lagrange identity, 182
Laguerre polynomial, 231
generalized, 231
Lebesgue
decomposition, 281
measure, 262
point, 285
Legendre equation, 226
lemma
Riemann–Lebesgue, 165
Lidskij trace theorem, 143
limit circle, 187
limit point, 4, 187
Lindelöf theorem, 8
linear
functional, 24, 44
operator, 22
Liouville normal form, 186
localization formula, 243
maximum norm, 12
mean-square deviation, 56
measurable
function, 269
set, 260
space, 260
measure, 260
absolutely continuous, 280
complete, 268
nite, 260
growth point, 99
Lebesgue, 262
minimal support, 286
mutually singular, 280
product, 275
projection-valued, 88
space, 260
spectral, 95
support, 262
Mellin transform, 251
metric space, 3
Minkowskis inequality, 27
mollifier, 30
momentum operator, 174
monotone convergence theorem, 271
multi-index, 161
order, 161
multiplicity
spectral, 94
neighborhood, 4
Neumann
boundary condition, 188
function
spherical, 230
series, 76
Nevanlinna function, 95
Noether theorem, 174
norm, 12
operator, 23
norm resolvent convergence, 153
normal, 11, 91
normalized, 17, 38
normed space, 12
nowhere dense, 32
observable, 55
ONB, see orthonormal basis
one-parameter unitary group, 57
ONS, see orthonormal set
open ball, 4
open set, 4
operator
adjoint, 47, 59
bounded, 23
bounded from below, 70
closable, 63
closed, 63
closure, 63
compact, 128
domain, 22, 58
nite rank, 127
hermitian, 58
HilbertSchmidt, 139
linear, 22, 58
nonnegative, 68
normal, 60, 67, 91
positive, 68
relatively bounded, 133
relatively compact, 128
self-adjoint, 59
semi-bounded, 70
strong convergence, 50
symmetric, 58
unitary, 39, 57
weak convergence, 50
orthogonal, 17, 38
complement, 43
polynomials, 228
projection, 43
sum, 45
orthonormal
basis, 40
set, 38
orthonormal basis, 40
oscillating, 219
outer measure, 266
parallel, 17, 38
parallelogram law, 19
parity operator, 98
Parseval's identity, 163
partial isometry, 99
partition of unity, 11
perpendicular, 17, 38
phase space, 56
Plücker identity, 186
polar decomposition, 99
polarization identity, 19, 39, 58
position operator, 173
positivity
improving, 235
preserving, 235
premeasure, 260
probability density, 55
product measure, 275
product topology, 8
projection, 48
pure point spectrum, 106
Pythagorean theorem, 17, 38
quadrangle inequality, 11
quadratic form, 58, see form
Radon–Nikodym
derivative, 282
theorem, 282
RAGE theorem, 129
range, 22
essential, 74
rank, 127
reducing subspace, 80
regulated function, 112
relatively compact, 128
resolution of the identity, 89
resolvent, 74
convergence, 153
formula
rst, 75
second, 135
Neumann series, 76
set, 73
Riesz lemma, 44
scalar product, 17
scattering operator, 248
scattering state, 248
Schatten p-class, 141
Schauder basis, 14
Schrödinger equation, 57
Schur criterion, 28
second countable, 5
second resolvent formula, 135
self-adjoint
essentially, 63
semi-metric, 3
separable, 6, 14
series
absolutely convergent, 16
sesquilinear form, 17
bounded, 21
parallelogram law, 21
polarization identity, 21
short range, 253
σ-algebra, 259
σ-finite, 260
simple function, 112, 270
simple spectrum, 94
singular values, 137
singularly continuous
spectrum, 106
Sobolev space, 164
span, 14
spectral
basis, 93
ordered, 105
mapping theorem, 105
measure
maximal, 105
theorem, 97
compact operators, 136
vector, 93
maximal, 105
spectrum, 73
absolutely continuous, 106
discrete, 145
essential, 145
pure point, 106
singularly continuous, 106
spherical coordinates, 224
spherical harmonics, 227
spherically symmetric, 166
∗-ideal, 48
∗-subalgebra, 48
stationary phase, 252
Stieltjes inversion formula, 95, 114
Stone theorem, 124
Stone's formula, 114
Stone–Weierstraß theorem, 52
strong convergence, 50
strong resolvent convergence, 153
Sturm comparison theorem, 218
Sturm–Liouville equation, 181
regular, 182
subcover, 8
subordinacy, 207
subordinate solution, 208
subspace
reducing, 80
superposition, 56
supersymmetric quantum mechanics, 180
support, 7
Temple's inequality, 120
tensor product, 46
theorem
B.L.T., 23
Baire, 32
Banach–Steinhaus, 33
closed graph, 66
dominated convergence, 273
Fubini, 276
Heine–Borel, 10
Hellinger–Toeplitz, 67
Herglotz, 107
HVZ, 242
Kato–Rellich, 135
KLMN, 150
Lebesgue decomposition, 281
Lindelöf, 8
monotone convergence, 271
Noether, 174
Pythagorean, 17, 38
Radon–Nikodym, 282
RAGE, 129
Riesz, 44
Schur, 28
spectral, 97
spectral mapping, 105
Stone, 124
Stone–Weierstraß, 52
Sturm, 218
Urysohn, 10
virial, 223
Weierstraß, 14
Weyl, 146
Wiener, 126, 167
topological space, 4
topology
base, 5
product, 8
total, 14
trace, 143
class, 142
triangle inequality, 3, 12
inverse, 3, 12
trivial topology, 4
Trotter product formula, 131
uncertainty principle, 174
uniform boundedness principle, 33
unit vector, 17, 38
unitary group, 57
generator, 57
strongly continuous, 57
weakly continuous, 124
Urysohn lemma, 10
variance, 56
virial theorem, 223
Vitali set, 262
wave
function, 55
operators, 247
weak
Cauchy sequence, 49
convergence, 24, 49
derivative, 85, 164
Weierstraß approximation, 14
Weyl
M-matrix, 211
circle, 194
relations, 174
sequence, 76
singular, 145
theorem, 146
Weyl–Titchmarsh m-function, 199
Wiener theorem, 126
Wronskian, 182
Young's inequality, 165