Quantum Mechanics
Franco Gallone
Università degli Studi di Milano, Italy
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.
ISBN 978-981-4635-83-7
Printed in Singapore
Preface
The subjects of this book are the mathematical foundations of non-relativistic quan-
tum mechanics and the mathematical theory they require. In its mathematical part,
this book aims at expounding in a complete and self-contained way the mathemati-
cal basis for “mathematical” quantum mechanics, namely the branch of mathemat-
ical physics that was constructed by David Hilbert, John von Neumann and other
mathematicians, notably George Mackey, in order to systematize quantum mechan-
ics, and which was presented in book form for the first time by von Neumann in
1932 (Neumann, 1932). In von Neumann’s approach, the language of quantum
mechanics is the theory of linear operators in Hilbert space.
Von Neumann’s book was the result of work which had been done previously over
several years. Hilbert, who had been consulted on numerous aspects of quantum
mechanics since its inception, began in 1926 a systematic study of its mathemat-
ical foundations. Hilbert taught the course “Mathematical Methods of Quantum
Theory” in the academic year 1926-27, and a summary of Hilbert’s lessons was pub-
lished in the spring of 1927 by Hilbert himself and his assistants Lothar Nordheim
and von Neumann (Hilbert et al., 1927). In their view, the mathematical framework
suitable for quantum mechanics was the mathematical structure that was defined
in an abstract way and called a Hilbert space by von Neumann in 1927. Further-
more, between 1926 and 1932, von Neumann proved a number of theorems about
operators in Hilbert space which bore upon quantum mechanics (among them, the
spectral theorem for unbounded self-adjoint operators), and so did the mathemati-
cians Marshall Stone and Hermann Weyl, who had a keen interest in quantum
mechanics. Thus, the theory of linear operators in Hilbert space was actually born
as the mathematical basis for quantum mechanics.
Quantum mechanics and the theory of Hilbert space operators constitute one
of those rare examples in which there is complete correspondence between physical
and mathematical concepts (another example is Euclidean geometry). Actually, it is
one of the most stunning examples of “the unreasonable effectiveness of mathemat-
ics in the natural sciences” (E.P. Wigner). Unfortunately, this aspect of quantum
mechanics is almost completely overlooked in most quantum mechanics textbooks,
where too many subtle points are dealt with by means of mathematical shortcuts
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page viii
which not only can hardly convince a mathematically aware reader but also blot out
physical subtleties. The main reason for this is that, in the community of physicists,
Dirac’s quantum mechanics (Dirac, 1958, 1947, 1935, 1930) is far more popular
than von Neumann’s quantum mechanics, perhaps precisely because the former
requires almost no mathematics. For instance, the idea that self-adjoint operators
have a critical domain is almost completely missing in standard quantum mechanics
textbooks; however, the domain of an unbounded self-adjoint operator represents
exactly the pure states in which the fundamental statistical quantities (expected
result and uncertainty) are defined for the observable represented by that opera-
tor. This point gets hopelessly blurred in most quantum mechanics books, which
treat unbounded observables — like energy, position, momentum, orbital angular
momentum — as if they were represented by self-adjoint operators defined on the
entire space, while this is impossible on account of the Hellinger–Toeplitz theorem.
Another example is the relation existing between the physical idea of compatibility
of two observables and the mathematical idea of commutativity of the operators
that represent them; for self-adjoint operators, the right notion of commutativity
is subtler than the one usually found in quantum mechanics books and depends
on the representations of the operators as projection valued measures; however it is
exactly through this subtler notion that the physical essence of compatibility can be
really grasped. More than anything else, the real way to understand why quantum
observables are represented by self-adjoint operators is through the spectral theo-
rem, since quantum observables arise most naturally as projection valued measures,
but this is usually outside the scope of standard quantum mechanics books.
One last word about the mathematical framework for quantum mechanics pre-
sented in this book. It is undoubtedly very interesting and useful to treat quantum
mechanics in the framework of mathematical structures more general than Hilbert
space theory, especially in order to study quantum mechanics of systems with an in-
finite number of degrees of freedom. However, quantum mechanics in Hilbert space
is an enthralling subject in its own right, mainly because it is here that one can see
most clearly how the mathematical structure is linked to the physical theory in an
almost necessary way.
Most books about fundamental quantum mechanics use results in the theory
of Hilbert space operators without proving them, while most books about Hilbert
space operators do not treat quantum mechanics; moreover, they often use fairly
advanced results from other branches of mathematics assuming the reader to be
already familiar with them, but this is seldom true. The aim of this book is not
to be a complete treatise about Hilbert space operators, but to give a really self-
contained treatment of all the elements of this subject that are necessary for a
sound and mathematically accurate exposition of the principles of quantum me-
chanics; this exposition is the object of the final chapters of the book. The main
characteristic of the book is that the mathematical theory is developed only assum-
ing familiarity with elementary analysis. Moreover, all the proofs in the book are
carried out in a very detailed way. These features make the book easily accessible to
readers with only the mathematical experience offered by undergraduate education
in mathematics or in physics, and also ideal for individual study. The principles
of quantum mechanics are discussed with complete mathematical accuracy and an
effort is always made to trace them back to the experimental reality that lies at
their root. The treatment of quantum mechanics is axiomatic, with definitions fol-
lowed by propositions proved in a mathematical fashion. No previous knowledge
of quantum mechanics is required. The level of this book is intermediate between
advanced undergraduate and graduate. It is a purely theoretical book, in which no
exercises are provided.
After the first chapter, whose function is mainly to fix notation and terminology,
the first part of the book (Chapters 2–9) is devoted to an exposition of the elements
of real and abstract analysis that are needed later in the study of operators in
Hilbert space. The reason for this is to make it really self-contained and avoid
proving theorems by means of other fairly advanced theorems outside this book. In
particular, the chapter devoted to metric spaces (Chapter 2) contains results which
are not completely elementary but are necessary in order to prove (in Chapter 6) the
theorem about Borel functions that plays an essential role in proving the spectral
theorems (in Chapter 15). The chapters about measure and integration (Chapters
5–9) contain results about extensions of measures which are not to be found in
introductory books on measure theory but which are essential in order to study commuting
self-adjoint operators, and also the Riesz–Markov theorem about positive linear
functionals which plays an essential role in proving the spectral theorems. Actually,
Chapters 1–2 and 5–9 could by themselves be a short book about measure and
integration. Chapters 3 and 4 deal with that part of the theory of linear operators in
normed spaces that is used later in the study of Hilbert space operators. Moreover,
the Stone–Weierstrass approximation theorem is proved in Chapter 4; this theorem
plays an essential role in proving the spectral theorems.
The second part of this book (Chapters 10–18) is its core, and contains a treat-
ment of the theory of linear operators in Hilbert space which is particularly well
suited for the discussion of the mathematical foundations of quantum mechanics
presented later in the book. It contains the spectral theorems for unitary and for
self-adjoint operators, one-parameter unitary groups and Stone’s theorem, theo-
rems about commuting operators and invariant subspaces, trace class operators,
and also Wigner’s theorem and the real line special case of Bargmann’s theorem
about automorphisms of projective Hilbert spaces.
The theory of Hilbert space operators is the backbone of the third and final
part of the book, which consists of two chapters (19 and 20). The first of these is
by far the longest chapter in the book and endeavours to present the principles of
non-relativistic quantum mechanics in a mathematically accurate way, together with
an unstinting effort to present some possible physical reasoning behind the constructs
that are considered. Since the predictions provided by quantum mechanics are in
general statistical ones, the first part of this chapter introduces general statistical
ideas and examines how they are implemented in classical theories;
later in the chapter, the statistical aspects of quantum mechanics are compared and
contrasted with the same aspects of classical theories. The final chapter deals with
an important example of how quantum observables can arise in connection with
symmetry principles; moreover, it presents the Stone–von Neumann uniqueness
theorem about canonical commutation relations.
Although the book’s length might make it difficult to use it as a textbook for a
single course, parts of it can easily be used in that way for various courses. Here
are some concrete suggestions:
• Chapters 1, 2, 5, 6, 7, 8, 9 for a one-semester course in Real Analysis or in Measure
Theory (intermediate, could be either undergraduate or graduate, mathematics);
• Chapters 3, 4, 10, 11, 12, 13, 14, 15, 16, 17, 18 for a two-semester course in
Operators in Hilbert Space (graduate, mathematics and physics);
• Chapters 19, 20 (using without proof a large number of results from the previous
chapters) for a one-semester course in Mathematical Foundations of Quantum
Mechanics (graduate, mathematics and physics).
To make cross-referencing as easy as possible, almost every bit of this book is
marked with three numbers, the first for the chapter, the second for the section,
and the third for the position within the section. Comments also are marked in
this way, and they are called “remarks”. As already mentioned, all the proofs in
this book are written in minute detail; in them, however, previous results are always
quoted simply by means of the three-number code, without spelling them out. This
should enable experts to pursue the logic of a proof without too many diversions,
and beginners to receive all the support they might need.
I wish to thank Roberto Palazzi for the great job he did of preparing the LaTeX
files for the book, and also for useful mathematical comments.
Franco Gallone
Contents
Preface vii
2. Metric Spaces 21
2.1 Distance, convergent sequences . . . . . . . . . . . . . . . . . . . . 21
2.2 Open sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Closed sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Continuous mappings . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5 Characteristic functions of closed and of open sets . . . . . . . . . 32
2.6 Complete metric spaces . . . . . . . . . . . . . . . . . . . . . . . . 35
2.7 Product of two metric spaces . . . . . . . . . . . . . . . . . . . . . 37
2.8 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.9 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
7. Measures 151
7.1 Additive functions, premeasures, measures . . . . . . . . . . . . . . 151
7.2 Outer measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.3 Extension theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
7.4 Finite measures in metric spaces . . . . . . . . . . . . . . . . . . . 168
8. Integration 177
8.1 Integration of positive functions . . . . . . . . . . . . . . . . . . . . 177
8.2 Integration of complex functions . . . . . . . . . . . . . . . . . . . 191
8.3 Integration with respect to measures constructed from other
measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
8.4 Integration on product spaces . . . . . . . . . . . . . . . . . . . . . 210
8.5 The Riesz–Markov theorem . . . . . . . . . . . . . . . . . . . . . . 227
Bibliography 739
Index 741
Chapter 1
Most readers are likely to have a working familiarity with most of the subjects of
this introductory chapter. For them, the main function of this chapter is to fix the
notation and the terminology that will be used throughout this book and provide
ready reference inside the book.
The reader is assumed to be already familiar with the topics of this section, which
is only intended for future reference.
absolute value of a real number a coincides with |a|. Identifying a ∈ R with (a, 0)
and defining i := (0, 1), we also have (a1 , a2 ) = a1 + ia2 . When for a complex
number z we write 0 ≤ z (or 0 < z, z ≤ 0, z < 0), we mean Im z = 0 and 0 ≤ Re z
(or 0 < Re z, Re z ≤ 0, Re z < 0). More generally, outside the chapters devoted to
measure and integration, when for a symbol x we write 0 ≤ x or x ≥ 0 we mean
x ∈ [0, ∞); similarly, by 0 < x or x > 0 we mean x ∈ (0, ∞). However, in chapters
from 5 to 9 by 0 ≤ x or x ≥ 0 we mean x ∈ [0, ∞] and by 0 < x or x > 0 we mean
x ∈ (0, ∞] (i.e. we allow the case x = ∞; cf. 5.1.1).
It is always understood that the square root of a positive real number is taken
to be positive.
1.1.2 Proofs
A proposition is a statement that is either true or false (but not both). By means of
logical connectives and brackets, a new proposition can be defined starting from one
or more given propositions. We assume that the reader is familiar with the logical
connectives: “not”, “and”, “or” (“A or B” means “A or B or both”), “⇒” (if . . .
then), “⇔” (if and only if).
Given two propositions P, Q, the proposition P ⇒ Q is logically equivalent to the
proposition (not Q) ⇒ (not P ), which is called the contrapositive form of P ⇒ Q. A
proof that (not Q) ⇒ (not P ) is true is called a proof by contraposition of P ⇒ Q. The
proposition P ⇒ Q is also logically equivalent to the proposition [P and (not Q)] ⇒
[R and (not R)], for any proposition R. A proof that there is a proposition R such
that [P and (not Q)] ⇒ [R and (not R)] is true is called a proof by contradiction of
P ⇒ Q.
Suppose that, for each positive integer n, we are given a proposition Pn . From
the principle of induction it follows that, if the propositions
(a) P1 ,
(b) Pn ⇒ Pn+1 for each positive integer n
are true, then the proposition
(c) Pn is true for each positive integer n
is true. A proof that propositions a and b are true is called proof by induction of
proposition c.
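By way of illustration, the induction scheme can be exercised on a concrete family of propositions; in the following Python sketch (the choice of Pn, namely “1 + 3 + ... + (2n − 1) = n²”, is merely an example of ours and is not drawn from the text), the base case, the algebraic pattern behind the inductive step, and the conclusion are spot-checked for small n:

```python
# Illustrative example (not from the text): P_n is the proposition
# "1 + 3 + ... + (2n - 1) = n^2".
def P(n: int) -> bool:
    return sum(2 * k - 1 for k in range(1, n + 1)) == n * n

assert P(1)  # (a) the base case P_1 is true

# (b) if P_n is true, adding the next odd number 2n + 1 yields
# n^2 + 2n + 1 = (n + 1)^2, so P_{n+1} is true; we spot-check the
# algebraic identity behind the inductive step:
assert all(n * n + (2 * n + 1) == (n + 1) ** 2 for n in range(1, 100))

# (c) hence P_n is true for each positive integer n; spot-check:
assert all(P(n) for n in range(1, 100))
```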
Often, for a proposition P , we will write “P ” instead of “P is true” or “P holds”.
Propositions will be written in a rather informal style, mixing logical symbols and
ordinary language.
The symbols “∀”, “∃”, “∈” are often used collectively: instead of writing
“∃x ∈ S, ∃y ∈ S” or “∀x ∈ S, ∀y ∈ S”
one often writes
“∃x, y ∈ S” or “∀x, y ∈ S”.
The expressions “∀x ∈ S” and “for x ∈ S” are regarded as equivalent.
When n ∈ N, “for k ∈ {1, ..., n}” is often written as “for k = 1, ..., n”.
In definitions, “if” means “if and only if”.
When, for a symbol x, we write “∃x ≥ 0”, or “∃x > 0”, or “∀x ≥ 0”, or “∀x > 0”,
we mean “∃x ∈ [0, ∞)”, or “∃x ∈ (0, ∞)”, or “∀x ∈ [0, ∞)”, or “∀x ∈ (0, ∞)”
respectively if we are not in chapters from 5 to 9; in chapters from 5 to 9, we mean
“∃x ∈ [0, ∞]”, or “∃x ∈ (0, ∞]”, or “∀x ∈ [0, ∞]”, or “∀x ∈ (0, ∞]” respectively (cf.
5.1.1).
If I = {1, ..., N } or I := N, we will often write “∑_{n∈I}” for “∑_{n=1}^{N}” or for
“∑_{n=1}^{∞}”.
1.1.4 Sets
The words family and collection will be used synonymously with set, e.g. in order
to avoid phrases like “set of sets”.
The empty set is denoted by ∅, and the family of all subsets of a set X is denoted
by P(X). If X is a set and if, for each x ∈ X, P (x) is a proposition involving x,
then
{x ∈ X : P (x)}
denotes the set of all elements x of X for which P (x) is true.
{a, b, c, ...} denotes the set which contains the elements that are listed, and {x}
denotes the set which contains just x (such a set is called a singleton set).
For two subsets S1 , S2 of a set X, we use the following symbols:
Symbol      Meaning
S1 ⊂ S2     x ∈ S1 ⇒ x ∈ S2
            (S1 is said to be a subset of S2 or to be contained in S2 )
S2 ⊃ S1     S1 ⊂ S2
S1 ⊄ S2     not (S1 ⊂ S2 ), i.e. ∃x ∈ S1 s.t. x ∉ S2
S1 = S2     (S1 ⊂ S2 ) and (S2 ⊂ S1 ), i.e. x ∈ S1 ⇔ x ∈ S2
S1 ≠ S2     not (S1 = S2 ), i.e. (S1 ⊄ S2 ) or (S2 ⊄ S1 )
We have
S1 ∩ S2 = ∅ ⇔ S1 ⊂ X − S2 ,
S2 − S1 = S2 ∩ (X − S1 );
if moreover S1 ⊂ S2 , then we also have
(S2 − S1 ) ∪ S1 = (S2 ∩ (X − S1 )) ∪ S1 = S2 ∪ S1 ,
S2 − (S2 − S1 ) = S1 ,
and this implies
X − S1 = X − (S2 ∩ (X − (S2 − S1 ))) = (X − S2 ) ∪ (S2 − S1 ).
Then, for three subsets S1 , S2 , S3 of X such that S1 ⊂ S2 ⊂ S3 we have
S3 − S1 = S3 ∩ ((X − S2 ) ∪ (S2 − S1 )) = (S3 − S2 ) ∪ (S2 − S1 ).
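The displayed identities lend themselves to mechanical verification; the following Python sketch (the particular sets are our own example) checks the final identity and the auxiliary facts used to derive it, for nested subsets S1 ⊂ S2 ⊂ S3 of an ambient set X:

```python
# Our example: nested subsets S1 ⊂ S2 ⊂ S3 of an ambient set X.
S1 = {1}
S2 = {1, 2, 3}
S3 = {1, 2, 3, 4, 5}
X = S3 | {9}
assert S1 <= S2 <= S3          # S1 ⊂ S2 ⊂ S3

# identities quoted in the derivation (the middle two need S1 ⊂ S2):
assert S2 - S1 == S2 & (X - S1)
assert (S2 - S1) | S1 == S2
assert S2 - (S2 - S1) == S1

# the final identity:
assert S3 - S1 == (S3 - S2) | (S2 - S1)
```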
1.1.5 Relations
If X and Y are sets, the cartesian product of X and Y , written X × Y , is the set of
all ordered pairs (x, y) with x ∈ X and y ∈ Y .
A relation in a non-empty set X is a subset R of X × X. If (x, y) ∈ R, we write
xRy and say that x is related by R to y. If S is a subset of X, then R ∩ (S × S) is
a relation in S which is called the relation induced by R in S.
A relation R in a set X is said to be an equivalence relation if it has the following
three properties:
(er1 ) xRx, ∀x ∈ X (R is reflexive);
(er2 ) xRy ⇒ yRx (R is symmetric);
(er3 ) (xRy and yRz) ⇒ xRz (R is transitive).
A symbol often used for an equivalence relation is “∼”.
Let X be a set equipped with an equivalence relation ∼ and let x ∈ X. The
equivalence class of x for ∼ is the set
[x] := {y ∈ X : y ∼ x},
and any element of [x] is called a representative of [x].
The following facts can be easily proved:
(a) x ∈ [x], ∀x ∈ X; thus, every equivalence class is nonempty and X = ∪x∈X [x];
(b) either [x] = [y] or [x] ∩ [y] = ∅ (but not both), ∀x, y ∈ X;
(c) [x] = [y] ⇔ x ∼ y;
we notice that, by assertion b, the contrapositive form of statement c is
[x] ∩ [y] = ∅ ⇔ not (x ∼ y).
A partition of a set X is a family F of subsets of X which has the following three
properties:
(pa1 ) S ≠ ∅, ∀S ∈ F ;
(pa2 ) (S1 , S2 ∈ F , S1 ≠ S2 ) ⇒ S1 ∩ S2 = ∅;
(pa3 ) ∪S∈F S = X.
Thus, if X is a non-empty set equipped with an equivalence relation, the family of
equivalence classes constitutes a partition of X. Conversely, it is straightforward to
prove that, if F is a partition of a non-empty set X, then the set
R := {(x, y) ∈ X × X : ∃S ∈ F such that x ∈ S and y ∈ S}
is an equivalence relation in X and F is the family of equivalence classes defined by
R.
If ∼ is an equivalence relation in a non-empty set X, the family of equivalence
classes defined by ∼ is called the quotient set of X by the relation ∼ and is denoted
by X/ ∼.
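The correspondence between equivalence relations and partitions can be made concrete; in the following Python sketch (the relation, congruence modulo 3 on {0, ..., 9}, is our own example), the equivalence classes are computed and the partition properties pa1–pa3 are verified:

```python
# Our example: x ~ y iff x - y is divisible by 3, on X = {0, ..., 9}.
X = set(range(10))

def equiv(x, y):
    return (x - y) % 3 == 0

def cls(x):
    """The equivalence class [x] = {y in X : y ~ x}."""
    return frozenset(y for y in X if equiv(y, x))

quotient = {cls(x) for x in X}    # the quotient set X / ~

assert all(S for S in quotient)                   # (pa1) no class is empty
assert all(S == T or not (S & T)                  # (pa2) distinct classes
           for S in quotient for T in quotient)   #       are disjoint
assert frozenset().union(*quotient) == X          # (pa3) the classes cover X

# fact (c): [x] = [y] if and only if x ~ y
assert all((cls(x) == cls(y)) == equiv(x, y) for x in X for y in X)
```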
A relation R in a non-empty set X is said to be a partial ordering if it has the
following three properties:
(po1 ) xRx, ∀x ∈ X (R is reflexive);
(po2 ) (xRy and yRx) ⇒ x = y (R is antisymmetric);
(po3 ) (xRy and yRz) ⇒ xRz (R is transitive).
A partial ordering is called a total ordering if it has the following further property:
(po4 ) (xRy or yRx), ∀x, y ∈ X.
A symbol often used for a partial ordering is “≤”. A partially ordered set is a pair
(X, ≤), where X is a non-empty set and ≤ is a partial ordering in X.
Let (X, ≤) be a partially ordered set, S a non-empty subset of X, and x a point
of X; the following terms are used:
x is called an upper bound for S if y ≤ x for each y ∈ S;
x is called a lower bound for S if x ≤ y for each y ∈ S;
x is called a least upper bound (l.u.b.) for S if x is an upper bound for S and if,
for every upper bound x′ for S, we have x ≤ x′ ; if a l.u.b. for S exists, then (as can
be readily seen) it is the unique l.u.b. for S and is denoted by sup S; if the l.u.b.
of S exists and it is an element of S, we write max S := sup S;
x is called a greatest lower bound (g.l.b.) for S if x is a lower bound for S and
if, for every lower bound x′ for S, we have x′ ≤ x; if a g.l.b. for S exists, then it is
the unique g.l.b. for S and is denoted by inf S; if the g.l.b. of S exists and it is an
element of S, we write min S := inf S.
In the family P(X) of all subsets of a set X, a relation R is defined by letting
R := {(S1 , S2 ) ∈ P(X) × P(X) : S1 ⊂ S2 }.
For S1 , S2 ∈ P(X), one writes S1 RS2 directly as S1 ⊂ S2 . This relation is a partial
ordering and, for a non-empty subfamily F ⊂ P(X), both sup F and inf F exist
and in fact
sup F = ∪S∈F S and inf F = ∩S∈F S.
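These formulas for sup F and inf F in (P(X), ⊂) can be verified directly on a small family; the sets below are our own example:

```python
from itertools import chain, combinations

# Our example: X = {1, 2, 3, 4} and the family F = {{1, 2}, {2, 3}}.
X = {1, 2, 3, 4}
F = [frozenset({1, 2}), frozenset({2, 3})]

def powerset(A):
    """All subsets of A, as frozensets (the family P(A))."""
    A = list(A)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(A, r) for r in range(len(A) + 1))]

union = frozenset.union(*F)
inter = frozenset.intersection(*F)

# sup F is the union: an upper bound, contained in every upper bound.
assert all(S <= union for S in F)
assert all(union <= U for U in powerset(X) if all(S <= U for S in F))

# inf F is the intersection: a lower bound, containing every lower bound.
assert all(inter <= S for S in F)
assert all(L <= inter for L in powerset(X) if all(L <= S for S in F))
```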
1.2 Mappings
In this section we give a methodical treatment of the subject, since some of the
concepts which are contained in this section might not be entirely familiar to all
readers. Indeed, for two sets X and Y , we consider mappings from X to Y which
are defined on any subset of X. This foreshadows what will happen in the study of
linear operators in Hilbert space, where we use the definitions, notations and results
of this section.
The choice of what sets to use as initial and final sets is often made on the grounds
of particular properties they possess, or in order to have a common playground for
several mappings.
Mappings are sometimes given different names. A mapping is also called a map
or a function, and we will use the latter name especially when the final set is C or
R∗ (cf. 5.1.1), or some subset of them. When the final set is C (or R) we sometimes
say that the mapping is a complex (or a real ) function. A mapping from a cartesian
product of two sets to one of them is occasionally called a binary operation. A
mapping ϕ : N → X, where X is a non-empty set, is called a sequence in X and is
denoted by the symbol {xn }, where xn := ϕ(n). Sometimes, given a non-empty set
X and a non-empty set I which for psychological reasons we like to think of as a
set of indices, the range of a mapping ϕ : I → X is denoted by the symbol {xi }i∈I ,
where xi := ϕ(i), and is referred to as a family of elements of X indexed by the set
I. If a family F of subsets of a set X is obtained in this way, i.e. if F = {Si }i∈I ,
the union and the intersection of the elements of F are usually written as follows:
∪i∈I Si and ∩i∈I Si . If I := {1, ..., n} or I := N, “∪i∈I ” and “∩i∈I ” are written as
“∪_{i=1}^{n}” and “∩_{i=1}^{n}”, or “∪_{i=1}^{∞}” and “∩_{i=1}^{∞}”, respectively.
We can now formalize better the concept of cartesian product, which we have
already introduced for two sets. Let {X1 , X2 , ..., Xn } be a finite family of sets. If
Xi ≠ ∅ for i = 1, 2, ..., n, then the cartesian product X1 × X2 × · · · × Xn is defined
to be the set of all mappings ϕ : {1, 2, ..., n} → ∪_{i=1}^{n} Xi such that ϕ(i) ∈ Xi for
i = 1, 2, ..., n; if there is i such that Xi = ∅, then X1 × X2 × · · · × Xn := ∅. If Xi ≠ ∅
for i = 1, 2, ..., n, an element ϕ of X1 × X2 × · · · × Xn is called an ordered n-tuple, or
simply an n-tuple, and is denoted by the symbol (x1 , x2 , ..., xn ), where xi := ϕ(i). If
Ei ⊂ Xi for i = 1, 2, ..., n, then E1 × E2 × · · · × En is a subset of X1 × X2 × · · · × Xn ,
and
(X1 × X2 × · · · × Xn ) − (E1 × E2 × · · · × En )
= ∪ni=1 (X1 × · · · × Xi−1 × (Xi − Ei ) × Xi+1 × · · · × Xn );
if Fi ⊂ Xi for i = 1, 2, ..., n, then
(E1 × E2 × · · · × En ) ∩ (F1 × F2 × · · · × Fn )
= (E1 ∩ F1 ) × (E2 ∩ F2 ) × · · · × (En ∩ Fn ).
If X is a set such that Xi = X for i = 1, 2, ..., n, then we write
X n := X1 × X2 × · · · × Xn .
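Both product identities can be verified mechanically for n = 2; in the following Python sketch, the sets X1, X2, E1, E2, F1, F2 are our own example:

```python
from itertools import product

# Our example with n = 2.
X1, X2 = {1, 2, 3}, {'a', 'b'}
E1, E2 = {1, 2}, {'a'}
F1, F2 = {2, 3}, {'a', 'b'}

P = set(product(X1, X2))
PE = set(product(E1, E2))

# (X1 x X2) - (E1 x E2) = ((X1 - E1) x X2) u (X1 x (X2 - E2))
assert P - PE == set(product(X1 - E1, X2)) | set(product(X1, X2 - E2))

# (E1 x E2) n (F1 x F2) = (E1 n F1) x (E2 n F2)
assert PE & set(product(F1, F2)) == set(product(E1 & F1, E2 & F2))
```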
1.2.2 Remark. Given two non-empty sets X and Y , if we want to define a mapping
ϕ from X to Y by using a rule r that assigns elements of Y to some elements of X,
we need to define a subset Dϕ of X such that the rule r assigns one and only one
element of Y to each element of Dϕ . After defining Dϕ , a mapping ϕ is defined by
assigning to each element of Dϕ the element r(x) of Y that we obtain by applying
the rule r to x. To indicate a mapping defined in this way, we often write
ϕ : Dϕ → Y
x ↦ ϕ(x) := r(x),
or equivalently
Dϕ ∋ x ↦ ϕ(x) := r(x) ∈ Y.
When, for a given non-empty subset S of X, the rule r assigns one and only one
element of Y to each element of S and we want to define Dϕ by setting Dϕ := S,
we often write directly
ϕ:S→Y
x ↦ ϕ(x) := r(x),
or even (without introducing a symbol to denote the mapping)
S ∋ x ↦ r(x) ∈ Y.
1.2.3 Definition. Let ϕ be a mapping from X to Y (by this, here and in the
sequel, we mean that X, Y are non-empty sets and ϕ is a mapping ϕ : Dϕ → Y
with Dϕ ⊂ X). The graph of ϕ is the subset of X × Y defined by
Gϕ := {(x, y) ∈ X × Y : x ∈ Dϕ and y = ϕ(x)}.
We remark that, when X and Y are replaced with two different sets X ′ and Y ′ such
that Dϕ ⊂ X ′ and Rϕ ⊂ Y ′ , the graph of ϕ will remain unaltered (but it will be
considered as a subset of X ′ × Y ′ ).
The projection of X × Y onto Y is the mapping
πY : X × Y → Y
(x, y) ↦ πY (x, y) := y.
ϕ(Dϕ ) = Rϕ ;
We recall (cf. 1.2.7) that, for any mapping ϕ from X to Y , for a subset T of Y we
have defined the set
ϕ−1 (T ) := {x ∈ Dϕ : ϕ(x) ∈ T }.
For an injective mapping ϕ we have {ϕ−1 (y)} = ϕ−1 ({y}) for each y ∈ Rϕ ; more-
over, for any subset S of Rϕ , ϕ−1 (S) is the same thing when interpreted as the
image of S under the inverse of ϕ or as the counterimage of S under ϕ. One can
see immediately that the following facts are true for an injective mapping ϕ:
(a) Rϕ−1 = Dϕ ;
(b) ϕ−1 is injective and (ϕ−1 )−1 = ϕ;
(c) if V denotes the mapping
V : X × Y → Y × X
(x, y) ↦ V (x, y) := (y, x),
then Gϕ−1 = V (Gϕ ) (notice that condition b of 1.2.4 is in effect for G := V (Gϕ )
iff ϕ is injective).
ψ ◦ ϕ : Dψ◦ϕ → Z
x ↦ (ψ ◦ ϕ)(x) := ψ(ϕ(x)).
If ψ : N → X is a sequence in X and ϕ : N → N is a mapping such that ϕ(n1 ) <
ϕ(n2 ) whenever n1 < n2 , then the mapping ψ ◦ ϕ is called a subsequence of ψ. If
ψ is denoted by {xn }, then ψ ◦ ϕ is denoted by {xϕ(k) }, or by {xnk } if ϕ does not
need to be specified.
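The notion of a mapping whose domain is a proper subset of X, and of the composition ψ ◦ ϕ defined on {x ∈ Dϕ : ϕ(x) ∈ Dψ }, can be modelled with finite dictionaries; the following Python sketch (the particular mappings are our own example) also checks two of the inclusions stated in 1.2.13:

```python
# Our example: phi is a mapping from X to Y with D_phi = {1, 2, 3},
# psi is a mapping from Y to Z with D_psi = {'a', 'b'}; a dictionary
# plays the role of a mapping together with its domain.
phi = {1: 'a', 2: 'b', 3: 'c'}
psi = {'a': 10, 'b': 20}

def compose(psi, phi):
    """psi o phi, defined on {x in D_phi : phi(x) in D_psi}."""
    return {x: psi[phi[x]] for x in phi if phi[x] in psi}

comp = compose(psi, phi)
assert comp == {1: 10, 2: 20}
assert set(comp) <= set(phi)                    # D_{psi o phi} ⊂ D_phi
assert set(comp.values()) <= set(psi.values())  # R_{psi o phi} ⊂ R_psi
```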
1.2.13 Proposition.
(A) Let ϕ be a mapping from X to Y . We have:
(a) ϕ ◦ idX = idY ◦ ϕ = ϕ.
If ψ is a mapping from Y to Z such that ϕ−1 (Dψ ) ≠ ∅, we have:
(b) Rψ◦ϕ ⊂ Rψ ;
(c) Dψ◦ϕ ⊂ Dϕ ;
(d) Dψ◦ϕ = Dϕ iff Rϕ ⊂ Dψ ;
(e) Dψ = Y ⇒ Dψ◦ϕ = Dϕ ;
(f ) (ψ ◦ ϕ)−1 (S) = ϕ−1 (ψ −1 (S)) for every subset S of Z.
If S is a subset of Y , we have:
∀x ∈ Dϕ , (πY ◦ ((πX )G )−1 )(x) = πY (x, r(x)) = r(x) = ϕ(x).
1.2.14 Proposition.
(A) If ϕ is an injective mapping from X to Y , we have:
ϕ−1 ◦ ϕ = idDϕ ⊂ idX and ϕ ◦ ϕ−1 = idRϕ ⊂ idY .
If ϕ is a bijection from X onto Y , we have
ϕ−1 ◦ ϕ = idX and ϕ ◦ ϕ−1 = idY .
(B) Let ϕ be an injective mapping from X to Y , ψ an injective mapping from Y
to Z, and suppose that ϕ−1 (Dψ ) ≠ ∅. Then the mapping ψ ◦ ϕ is injective and
(ψ ◦ ϕ)−1 = ϕ−1 ◦ ψ −1 .
Proof. We have
[x1 , x2 ∈ Dψ◦ϕ , (ψ ◦ ϕ)(x1 ) = (ψ ◦ ϕ)(x2 )] ⇒
[ϕ(x1 ), ϕ(x2 ) ∈ Dψ , ψ(ϕ(x1 )) = ψ(ϕ(x2 ))] ⇒ ϕ(x1 ) = ϕ(x2 ) ⇒ x1 = x2 ,
which proves that ψ ◦ ϕ is injective. We also have by 1.2.14A:
y ∈ D(ψ◦ϕ)−1 = Rψ◦ϕ ⇒ y = (ψ ◦ ϕ)((ψ ◦ ϕ)−1 (y)) = ψ(ϕ((ψ ◦ ϕ)−1 (y))) ⇒
[y ∈ Rψ = Dψ−1 and ψ −1 (y) = ψ −1 (ψ(ϕ((ψ ◦ ϕ)−1 (y)))) = ϕ((ψ ◦ ϕ)−1 (y))],
hence (ψ ◦ ϕ)−1 (y) = ϕ−1 (ψ −1 (y)), which proves (ψ ◦ ϕ)−1 ⊂ ϕ−1 ◦ ψ −1 .
Proof. We have
D(ϕ3 ◦ϕ2 )◦ϕ1 := {w ∈ Dϕ1 : ϕ1 (w) ∈ Dϕ3 ◦ϕ2 }
= {w ∈ Dϕ1 : ϕ1 (w) ∈ Dϕ2 and ϕ2 (ϕ1 (w)) ∈ Dϕ3 }
= {w ∈ Dϕ2 ◦ϕ1 : (ϕ2 ◦ ϕ1 )(w) ∈ Dϕ3 }
= Dϕ3 ◦(ϕ2 ◦ϕ1 ) ,
(a) Dϕ = ψ −1 (Dψ◦ϕ◦ψ−1 );
(b) Rψ◦ϕ◦ψ−1 = ψ(Rϕ ).
Proof. a: We have
x ∈ Dϕ ⇒ ψ −1 (ψ(x)) ∈ Dϕ ⇒ ψ(x) ∈ Dϕ◦ψ−1 = Dψ◦ϕ◦ψ−1 (by (∗)) ⇒
x ∈ ψ −1 (Dψ◦ϕ◦ψ−1 ),
x ∈ ψ −1 (Dψ◦ϕ◦ψ−1 ) ⇒ ψ(x) ∈ Dψ◦ϕ◦ψ−1 = Dϕ◦ψ−1 (by (∗)) ⇒
x ∈ Dϕ◦ψ−1 ◦ψ = Dϕ◦idX = Dϕ ,
−ϕ : Dϕ → C
x ↦ (−ϕ)(x) := −ϕ(x);
Re ϕ : Dϕ → C
x ↦ (Re ϕ)(x) := Re ϕ(x);
Im ϕ : Dϕ → C
x ↦ (Im ϕ)(x) := Im ϕ(x);
ϕ̄ : Dϕ → C
x ↦ ϕ̄(x) := \overline{ϕ(x)};
|ϕ| : Dϕ → C
x ↦ |ϕ|(x) := |ϕ(x)|;
D_{1/ϕ} := {x ∈ Dϕ : ϕ(x) ≠ 0} and
1/ϕ : D_{1/ϕ} → C
x ↦ (1/ϕ)(x) := 1/ϕ(x);
e^ϕ : Dϕ → C
x ↦ (e^ϕ )(x) := exp ϕ(x);
for n ∈ N,
ϕ^n : Dϕ → C
x ↦ (ϕ^n )(x) := (ϕ(x))^n ;
for α ∈ C,
αϕ : Dϕ → C
x ↦ (αϕ)(x) := αϕ(x).
For a function ϕ from X to R, we define:
ϕ+ : Dϕ → R
x ↦ ϕ+ (x) := max{ϕ(x), 0};
ϕ− : Dϕ → R
x ↦ ϕ− (x) := − min{ϕ(x), 0}.
For two functions ϕ, ψ from X to C s.t. Dϕ ∩ Dψ ≠ ∅, we define:
ϕ + ψ : Dϕ ∩ Dψ → C
x ↦ (ϕ + ψ)(x) := ϕ(x) + ψ(x);
ϕψ : Dϕ ∩ Dψ → C
x ↦ (ϕψ)(x) := ϕ(x)ψ(x).
Clearly, for a function ϕ from X to C we have
ϕ = Re ϕ + i Im ϕ and |ϕ|² = ϕϕ̄,
and for a function from X to R we have
ϕ = ϕ+ − ϕ− and |ϕ| = ϕ+ + ϕ− ;
thus, for a function ϕ from X to C we have
ϕ = (Re ϕ)+ − (Re ϕ)− + i(Im ϕ)+ − i(Im ϕ)− .
For α ∈ C, we define the constant function
αX : X → C
x ↦ αX (x) := α;
we also write ϕ + α := ϕ + αX for every function ϕ from X to C.
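The pointwise operations above, together with their domains, can likewise be modelled with finite dictionaries; in the following Python sketch (the particular functions are our own example), note how the domain of 1/ϕ excludes the zeros of ϕ:

```python
# Our example: a complex function phi with D_phi = {0, 1, 2}.
phi = {0: 3 + 4j, 1: 0j, 2: -2 + 0j}

# 1/phi is defined on D_{1/phi} = {x in D_phi : phi(x) != 0}:
recip = {x: 1 / v for x, v in phi.items() if v != 0}
assert set(recip) == {0, 2}

# |phi|^2 = phi * (conjugate of phi), pointwise:
assert all(abs(v) ** 2 == (v * v.conjugate()).real for v in phi.values())

# For a real function f: f = f+ - f- and |f| = f+ + f-.
f = {0: 2.5, 1: -1.5, 2: 0.0}
fplus = {x: max(v, 0.0) for x, v in f.items()}
fminus = {x: -min(v, 0.0) for x, v in f.items()}
assert all(fplus[x] - fminus[x] == f[x] for x in f)
assert all(fplus[x] + fminus[x] == abs(f[x]) for x in f)
```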
1.3 Groups
We review in this section the few elementary facts about groups that will be used
in the book.
1.3.1 Definitions. A group is a pair (G, γ), where G is a non-empty set and γ is
a mapping γ : G × G → G with the following properties, which we write with the
shorthand notation g1 g2 := γ(g1 , g2 ):
A group is said to be abelian if it has the following further property:
(ag) g1 g2 = g2 g1 , ∀g1 , g2 ∈ G.
For an abelian group, one usually writes “g1 + g2 ” instead of “g1 g2 ”, “sum” instead
of “product”, “zero” instead of “identity”, “0” instead of “e”, “opposite” instead of
“inverse”, “−g” instead of “g −1 ”, “g1 − g2 ” instead of “g1 + (−g2 )”. For elements
of G, one writes g1 + g2 + g3 := g1 + (g2 + g3 ) and ∑_{i=n}^{m} gi := gn + gn+1 + ... + gm
if n < m; one also writes ∑_{i∈I} gi to denote the sum of a finite family {gi }i∈I of
elements of G.
One often says “the group G” to mean the pair (G, γ), but on the other hand one
often speaks of “elements of the group G” to mean “elements of the set G”. Tacit
conventions of this sort are used whenever one deals with mathematical structures
which are composed of sets together with some mappings (as in the case of metric,
linear, normed, inner product spaces, algebras, normed algebras, etc.), or together
with some relation (as in the case of a partially ordered set), or together with some
class of distinguished subsets (as in the case of a measurable space), and will not
be mentioned again later on.
A non-empty subset G̃ of G is called a subgroup of (G, γ) if:
(sg1) g1, g2 ∈ G̃ ⇒ g1g2 ∈ G̃;
(sg2) g ∈ G̃ ⇒ g⁻¹ ∈ G̃.
If G̃ is a subgroup of (G, γ), condition sg1 makes it possible to use G̃ as the final set of
γG̃×G̃ (the restriction of γ to G̃ × G̃). Then (G̃, γG̃×G̃) is a group: indeed for this pair
condition gr1 is obviously satisfied; moreover, for any g ∈ G̃ we have e = gg⁻¹ ∈ G̃
by conditions sg1 and sg2, hence condition gr2 is satisfied for (G̃, γG̃×G̃) with e still
playing the role of the identity; finally, condition gr3 is satisfied for (G̃, γG̃×G̃) since
by condition sg2 we have g⁻¹ ∈ G̃ for all g ∈ G̃.
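The argument just given — conditions sg1 and sg2 force the identity into G̃ — can be illustrated on a small finite example (a Python sketch; the choice of the additive group Z₁₂ and of the subset H is ours):

```python
# The subgroup criterion in (Z_12, addition mod 12): H = {0, 4, 8} satisfies
# sg1 (closure under the group product) and sg2 (closure under inverses),
# and therefore contains the identity e = g g^{-1}, as in the argument above.
G = set(range(12))
op = lambda a, b: (a + b) % 12        # the group "product" (here: sum mod 12)
inv = lambda a: (-a) % 12             # the group inverse (here: the opposite)

H = {0, 4, 8}
sg1 = all(op(a, b) in H for a in H for b in H)
sg2 = all(inv(a) in H for a in H)
assert sg1 and sg2

g = 4                                 # any element of H
assert op(g, inv(g)) == 0 and 0 in H  # e = g g^{-1} lies in H
```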
1.3.4 Remark. It can be easily seen that the family of all automorphisms of any
group G is a group if the product of two automorphisms is assumed to be their
composition as defined in 1.2.12. The identity of this group is idG (which is clearly
an automorphism of G), and the group inverse of an automorphism is its inverse as
defined in 1.2.11.
Proof. a: We have
g, g′ ∈ RΦ ⇒
[∃g̃, g̃′ ∈ G1 s.t. g = Φ(g̃), g′ = Φ(g̃′), hence s.t. gg′ = Φ(g̃g̃′)] ⇒
gg′ ∈ RΦ
and
g ∈ RΦ ⇒
[∃g̃ ∈ G1 s.t. g = Φ(g̃), hence s.t. g⁻¹ = Φ(g̃)⁻¹ = Φ(g̃⁻¹)] ⇒
g⁻¹ ∈ RΦ.
b: Let G1 be abelian. Then we have
Φ(g1)Φ(g2) = Φ(g1g2) = Φ(g2g1) = Φ(g2)Φ(g1), ∀g1, g2 ∈ G1.
Chapter 2
Metric Spaces
This chapter contains just the facts about metric spaces that will be used later in
the book; it is not intended as a thorough treatment of the subject.
2.1.1 Definition. A metric space is a pair (X, d), where X is a non-empty set and
d is a function d : X × X → R, called a distance on X, such that:
(di1) d(x, y) = 0 ⇔ x = y, ∀x, y ∈ X;
(di2) d(x, y) ≤ d(x, z) + d(z, y), ∀x, y, z ∈ X;
(di3) d(x, y) = d(y, x), ∀x, y ∈ X.
2.1.7 Remarks.
(a) Let (X, d) be a metric space. For x ∈ X and a sequence {xn } in X, xn → x iff
d(xn , x) → 0 in the metric space (R, dR ).
(b) Let (X, d) be a metric space, {xn } a convergent sequence in X, and ϕ : N → N
a mapping such that ϕ(n1 ) < ϕ(n2 ) whenever n1 < n2 . Then the subsequence
{xϕ(k) } is convergent and limk→∞ xϕ(k) = limn→∞ xn . Indeed, write x :=
limn→∞ xn and for each ε > 0 let Nε ∈ N be such that
n > Nε ⇒ d(xn , x) < ε.
Then, for each ε > 0,
k > min ϕ⁻¹([Nε, ∞) ∩ N) ⇒ ϕ(k) > Nε ⇒ d(xϕ(k), x) < ε.
2.1.10 Definition. Let (X, γ, d) be a triple such that (X, γ) is an abelian group and
(X, d) is a metric space, let {xn} be a sequence in X, and let sn := ∑_{k=1}^{n} xk for
all n ∈ N. The sequence {sn} is called the series of the xn's and is denoted by
the symbol ∑_{n=1}^{∞} xn; thus, one says that ∑_{n=1}^{∞} xn is convergent when the sequence
{sn} is convergent. If the sequence {sn} is convergent then one calls limn→∞ sn
the sum of the series and denotes limn→∞ sn by the same symbol ∑_{n=1}^{∞} xn as the
series, i.e. one writes ∑_{n=1}^{∞} xn := limn→∞ sn.
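For instance, taking xn := 2⁻ⁿ in the abelian group (R, +) with the metric dR, the partial sums sn = 1 − 2⁻ⁿ converge to 1, so the sum of the series is 1 (a Python sketch; the helper name partial_sum is ours):

```python
# The series sum_{n=1}^{oo} x_n is by definition the sequence of partial
# sums s_n = x_1 + ... + x_n; here x_n = 1/2^n in (R, d_R).
def partial_sum(n):
    return sum(2.0**(-k) for k in range(1, n + 1))

s = [partial_sum(n) for n in range(1, 41)]
assert all(s[i] < s[i + 1] for i in range(len(s) - 1))  # s_n = 1 - 2^{-n} increases
assert abs(s[-1] - 1.0) < 1e-11                         # s_n -> 1, the sum of the series
```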
2.2.1 Definition. Let (X, d) be a metric space. If x ∈ X and r ∈ (0, ∞), the open
ball with center x and radius r is the set
B(x, r) := {y ∈ X : d(x, y) < r}.
2.2.3 Proposition. Let (X, d) be a metric space. For all x ∈ X and r ∈ (0, ∞),
the open ball B(x, r) is an open set (this justifies its name).
Proof. Let y be a point in B(x, r). We must produce r′ ∈ (0, ∞) such that B(y, r′) ⊂
B(x, r). Since d(y, x) < r, we have 0 < r − d(y, x). Defining r′ := r − d(y, x), we
have:
z ∈ B(y, r′) ⇒ d(z, y) < r − d(y, x) ⇒
d(z, x) ≤ d(z, y) + d(y, x) < r ⇒ z ∈ B(x, r).
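The proof's choice of the smaller radius r − d(y, x) can be visualized numerically in the plane with the Euclidean distance (a Python sketch; the sample point y and the name r_prime are ours):

```python
import math
import random

# Proposition 2.2.3 in (R^2, Euclidean distance): every point y of B(x, r)
# is the center of an open ball B(y, r') contained in B(x, r), r' = r - d(y, x).
def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

x, r = (0.0, 0.0), 1.0
y = (0.3, 0.4)                       # d(y, x) = 0.5 < r, so y is in B(x, r)
r_prime = r - d(y, x)                # the radius produced in the proof
assert r_prime > 0

random.seed(0)
for _ in range(1000):                # sample points z of B(y, r')
    t = random.uniform(0.0, 2.0 * math.pi)
    s = r_prime * random.random()    # s < r', so z lies inside B(y, r')
    z = (y[0] + s * math.cos(t), y[1] + s * math.sin(t))
    assert d(z, x) < r               # z lies in B(x, r), as the proof predicts
```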
2.2.4 Theorem. Let (X, d) be a metric space. The topology Td has the following
properties:
(to1 ) ∅ ∈ Td , X ∈ Td ;
(to2 ) if F is any family of elements of Td , then ∪S∈F S ∈ Td ;
(to3 ) if F is a finite family of elements of Td , then ∩S∈F S ∈ Td .
Proof. to1 : To show that ∅ is open, we must show that each point in ∅ is the center
of an open ball contained in ∅; but since there are no points in ∅, this requirement
is automatically satisfied. The set X is clearly open, since every open ball centered
on any of its points is contained in X.
to2 : Every point x in ∪S∈F S lies in some Sx of the family F . Since Sx is an
open set, some open ball centered on x is contained in Sx and hence in ∪S∈F S.
to3 : If ∩S∈F S = ∅, then ∩S∈F S is an open set. Assume then that ∩S∈F S is
non-empty and write F = {S1 , ..., Sn } for some n ∈ N; let x be a point in ∩S∈F S;
for k = 1, ..., n, ∃rk > 0 s.t. B(x, rk ) ⊂ Sk ; let r be the smallest number in the
set {r1 , ..., rn }; r is a positive real number and we have B(x, r) ⊂ B(x, rk ) for
k = 1, ..., n; therefore B(x, r) ⊂ ∩S∈F S.
Proof. First we note that if x ∈ S then B(x, r) ∩ S is the open ball with center x
and radius r in the metric subspace (S, dS ).
If G ∈ Td and x ∈ G ∩ S, then ∃r > 0 such that B(x, r) ⊂ G, hence such that
B(x, r) ∩ S ⊂ G ∩ S. This shows that G ∩ S is an open set in (S, dS ).
Conversely, if T is an open set in (S, dS ), then for each x ∈ T there is rx > 0 s.t.
B(x, rx ) ∩ S ⊂ T . Then we have T = ∪x∈T (B(x, rx ) ∩ S) = (∪x∈T B(x, rx )) ∩ S,
with ∪x∈T B(x, rx ) ∈ Td by 2.2.3 and 2.2.4.
2.2.6 Definition. Let (X, d) be a metric space and S a subset of X. The interior
of S is the set:
S° := ∪G∈F G, with F := {G ∈ P(X) : G ∈ Td and G ⊂ S}.
2.3.2 Theorem. Let (X, d) be a metric space. The family Kd of all closed sets has
the following properties:
(cl1 ) ∅ ∈ Kd , X ∈ Kd ;
(cl2 ) if F is any family of elements of Kd , then ∩S∈F S ∈ Kd ;
(cl3 ) if F is a finite family of elements of Kd , then ∪S∈F S ∈ Kd .
Proof. Properties cl1 , cl2 , cl3 follow from 2.3.1, 2.2.4 and De Morgan’s laws (cf.
1.1.4).
2.3.4 Theorem. Let (X, d) be a metric space. For a subset S of X the following
conditions are equivalent:
(a) S is a closed set;
(b) [x ∈ X, {xn } a sequence in S, xn → x] ⇒ x ∈ S.
2.3.5 Remark. Using 2.3.4 one can see at once that the set {x} is closed, for every
point x of any metric space.
2.3.6 Definition. Let (X, d) be a metric space. If x ∈ X and r ∈ (0, ∞), the closed
ball with center x and radius r is the set
K(x, r) := {y ∈ X : d(x, y) ≤ r}.
2.3.7 Proposition. Let (X, d) be a metric space. For all x ∈ X and r ∈ (0, ∞),
the closed ball K(x, r) is a closed set (this justifies its name).
2.3.8 Definition. Let (X, d) be a metric space and S a subset of X. The closure
of S is the set
S̄ := ∩F∈F F, with F := {F ∈ P(X) : F ∈ Kd and S ⊂ F}.
(f) the closure of ∩S∈L S is contained in ∩S∈L S̄.
(a) x ∈ S̄;
(b) ∀ǫ > 0, ∃y ∈ S such that d(x, y) < ǫ;
(c) there exists a sequence {xn} in S such that xn → x.
Proof. a ⇒ b: We prove (not b) ⇒ (not a). Assume (not b), i.e.
∃ǫ > 0 such that ǫ ≤ d(x, y) for each y ∈ S,
which is equivalent to
∃ǫ > 0 such that S ⊂ X − B(x, ǫ);
since X − B(x, ǫ) ∈ Kd (cf. 2.2.3 and 2.3.1), we have (cf. 2.3.9a)
∃ǫ > 0 such that S̄ ⊂ X − B(x, ǫ);
S ⊂ X − {x},
∀ǫ > 0, B(x, ǫ) ∩ S ≠ ∅;
2.3.12 Corollary. Let (X, d) be a metric space and S a subset of X. The following
conditions are equivalent:
(a) S̄ = X;
(b) ∀x ∈ X, ∀ǫ > 0, ∃y ∈ S such that d(x, y) < ǫ;
(c) ∀x ∈ X, there exists a sequence {xn } in S such that xn → x.
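Condition (b) is exactly how one verifies that Q is dense in (R, dR): given x and ǫ > 0, a rational within ǫ of x can be produced explicitly (a Python sketch; the helper name rational_within is ours):

```python
import math
from fractions import Fraction

# Density of Q in (R, d_R): for any x and eps > 0, the rational
# y = floor(x*n)/n with n > 1/eps satisfies |x - y| < 1/n < eps.
def rational_within(x, eps):
    n = int(1.0 / eps) + 1
    return Fraction(math.floor(x * n), n)

x, eps = math.pi, 1e-6
y = rational_within(x, eps)
assert abs(x - float(y)) < eps
# A sequence of rationals converging to x, as in condition (c):
seq = [rational_within(x, 10.0**(-k)) for k in range(1, 8)]
assert all(abs(x - float(q)) < 10.0**(-k) for k, q in enumerate(seq, start=1))
```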
2.3.13 Theorem. Let (X, d) be a metric space, and let S and T be two subsets of
X such that T ⊂ S. The following conditions are equivalent:
S = ∩F∈F (F ∩ S) ⊂ ∩F∈F F = T̄.
2.3.14 Corollary. Let (X, d) be a metric space, and let S and T be two subsets of
X such that T ⊂ S. If T is dense in (S, dS ) and S is dense in (X, d), then T is
dense in (X, d).
Proof. If T is dense in (S, dS), then we have (by 2.3.13) S ⊂ T̄, which implies (by
2.3.9a) S̄ ⊂ T̄. If moreover S is dense in (X, d), then we also have S̄ = X and hence
X ⊂ T̄, which implies T̄ = X.
2.3.16 Remark. The metric space (R, dR ) is separable since the set Q of all rational
numbers is both countable and dense in R.
2.3.17 Proposition. Let (X, d) be a separable metric space. Then there is a count-
able family Tc of open balls such that every open set is a union of elements of Tc .
Proof. If G is the empty set then the statement is trivial. Assume then G non-
empty. Let Tc be the countable family of open balls of 2.3.17. Let x be a point
in G. The point x is in some Gi , and we can find an open ball B in Tc such that
x ∈ B ⊂ Gi . If we do this for each point x in G, we obtain a family of open
balls {Bn }n∈J such that ∪n∈J Bn = G, and this family is countable since it is a
subfamily of Tc . Further, for each open ball in this subfamily, we can select i ∈ I so
that Gi contains the ball. The family Ic of i’s which arises in this way is countable,
since there exists a surjection of the countable set J onto Ic by construction of Ic .
Moreover, ∪i∈Ic Gi = G since Gi ⊂ G for each i ∈ I, and ∀n ∈ J, ∃i ∈ Ic such that
Bn ⊂ Gi .
2.3.19 Theorem. Assume that in a metric space (X, d) there is a countable family
Tc of open sets such that every open set is a union of elements of Tc . Then (X, d)
is separable.
Proof. For each element of Tc choose a point, and let S be the set of all these
points. The set S is countable since by its construction there is a surjection from
Tc onto S. For every x ∈ X and every ǫ > 0, B(x, ǫ) contains an element of Tc ,
hence a point y ∈ S, and we have d(x, y) < ǫ. In view of 2.3.12, this shows that
S̄ = X.
2.4.5 Remark. It is obvious that an isomorphism from one metric space onto
another is a continuous mapping.
Proof. We have
d(x, S) = 0 ⇔ [∀ǫ > 0, ∃y ∈ S s.t. d(x, y) < ǫ] ⇔ x ∈ S̄,
where the latter equivalence holds by 2.3.10.
2.5.4 Lemma. Let (X, d) be a metric space and S a non-empty subset of X. The
function
δS : X → R
x ↦ δS(x) := d(x, S)
is uniformly continuous.
function of 2.5.5, have the required properties (cf. 1.2.8 and 2.4.3).
2.5.7 Corollary. Let F be a closed set in a metric space (X, d). Then there exists
a sequence {ϕn } such that:
∀n ∈ N, ϕn is a continuous function ϕn : X → [0, 1];
∀x ∈ X, ϕn (x) → χF (x) as n → ∞.
Proof. If F = ∅, let ϕn := 0X. Assuming F ≠ ∅, for n ∈ N the set Fn :=
δF⁻¹([1/n, ∞)) is closed by 2.5.4 and 2.4.3, and F ∩ Fn = ∅ by 2.5.2. Hence, by 2.5.5
there is a continuous function ϕn : X → [0, 1] such that ϕn(x) = 1, ∀x ∈ F, and
ϕn(x) = 0, ∀x ∈ Fn. By 2.5.2 we also have:
∀x ∈ X − F, ∃Nx ∈ N such that x ∈ Fn for n > Nx,
and hence such that ϕn(x) = 0 for n > Nx.
This proves that
∀x ∈ X, ϕn (x) → χF (x) as n → ∞.
2.5.8 Corollary. Let G be an open set in a metric space (X, d). Then there exists
a sequence {ψn } such that:
(a) ∀n ∈ N, ψn is a continuous function ψn : X → [0, 1];
(b) ∀x ∈ X, ψn (x) → χG (x) as n → ∞.
Proof. Let F := X −G in 2.5.7 and define ψn := 1X −ϕn (cf. 1.2.19). The required
properties for {ψn } follow from the properties of {ϕn }. In particular
∀x ∈ X, ψn (x) = 1 − ϕn (x) → 1 − χX−G (x) = χG (x) as n → ∞.
(a) F ⊂ G,
(b) ϕ is continuous,
(c) χF (x) ≤ ϕ(x) ≤ χG (x), ∀x ∈ X,
(d) supp ϕ ⊂ G.
2.5.11 Theorem. In a metric space (X, d), let F be a closed set, G an open set,
and F ⊂ G. Then there exists a function ϕ : X → [0, 1] such that F ≺ ϕ ≺ G.
2.6.2 Theorem. Let (X, d) be a metric space and let {xn } be a sequence in X
which is convergent. Then {xn } is a Cauchy sequence.
Proof. Let x be the limit of {xn}. For ǫ > 0 let Nǫ ∈ N be such that Nǫ < n ⇒
d(xn, x) < ǫ/2. Then we have
Nǫ < n, m ⇒ d(xn, xm) ≤ d(xn, x) + d(x, xm) < ǫ.
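Numerically, for xn := 1/n in (R, dR) (a Python sketch with a sample tolerance; the cutoff N_eps is ours):

```python
# Theorem 2.6.2 illustrated: x_n = 1/n converges to 0, and past a cutoff
# N_eps with |x_n - 0| < eps/2 the differences |x_n - x_m| stay below eps.
x = lambda n: 1.0 / n

eps = 1e-3
N_eps = 2000                         # 1/n < eps/2 for every n > N_eps
assert all(abs(x(n) - 0.0) < eps / 2 for n in range(N_eps + 1, N_eps + 500))
# the Cauchy property, via the triangle inequality as in the proof:
assert all(abs(x(n) - x(m)) < eps
           for n in range(N_eps + 1, N_eps + 50)
           for m in range(N_eps + 1, N_eps + 50))
```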
2.6.4 Proposition. Let (X1 , d1 ), (X2 , d2 ) be two metric spaces such that there ex-
ists an isomorphism from (X1 , d1 ) onto (X2 , d2 ). Then (X1 , d1 ) is complete iff
(X2 , d2 ) is complete.
2.6.5 Example. The metric space (R, dR ) (cf. 2.1.4) is complete (cf. e.g. Rudin,
1976, 3.11).
2.6.6 Proposition. Let (X, d) be a metric space and let S be a non-empty subset
of X.
(a) If the metric subspace (S, dS ) is a complete metric space, then S is a closed set
in (X, d).
(b) If (X, d) is a complete metric space and S is a closed set in (X, d), then the
metric subspace (S, dS ) is a complete metric space.
We point out that, as a result of condition co1, the mapping ι is necessarily injective:
ι(x) = ι(y) ⇒ d̂(ι(x), ι(y)) = 0 ⇒ d(x, y) = 0 ⇒ x = y.
If ((X̂, d̂), ι) is a completion of (X, d), then clearly ι is an isomorphism from (X, d)
onto the metric subspace (Rι, d̂Rι) of (X̂, d̂).
2.6.8 Proposition. Let (X, d) be a complete metric space and let S be a subset of
X such that S̄ = X (hence S is non-empty) and S ≠ X. Then the metric subspace
(S, dS) is not complete and the pair ((X, d), idS) is one of the completions of (S, dS).
Proof. Since S is not closed (cf. 2.3.9c), (S, dS) is not complete by 2.6.6a. It
follows directly from the definitions that ((X, d), idS) is a completion of (S, dS).
Although we shall need completions of metric spaces later in the book, those
completions will be constructed without relying on either the statement or the
proof of the following theorem. For this reason, we state it without giving its
proof, which can be found e.g. in 6.3.11 of (Berberian, 1999).
2.6.9 Theorem. If (X, d) is any metric space, then there exists a completion
((X̂, d̂), ι) of (X, d).
If ((X̃, d̃), ω) is also a completion of (X, d), then there exists an isomorphism Φ
from (X̂, d̂) onto (X̃, d̃) such that Φ ◦ ι = ω, i.e. such that Φ(ι(x)) = ω(x), ∀x ∈ X.
In order that (X, d) be complete, it is necessary and sufficient that ι be surjective
onto X̂.
2.7.1 Theorem. Let (X, d) and (X̃, d̃) be metric spaces. The function
d × d̃ : (X × X̃) × (X × X̃) → R
((x, x̃), (y, ỹ)) ↦ (d × d̃)((x, x̃), (y, ỹ)) := √(d(x, y)² + d̃(x̃, ỹ)²)
is a distance on X × X̃.
Proof. One can see that, for d × d̃, properties di1 and di3 of 2.1.1 follow immediately
from the corresponding properties for d and for d̃.
As to property di2, for all (x, x̃), (y, ỹ), (z, z̃) ∈ X × X̃ we have
(d × d̃)((x, x̃), (y, ỹ)) ≤ √((d(x, z) + d(z, y))² + (d̃(x̃, z̃) + d̃(z̃, ỹ))²)
≤ √(d(x, z)² + d̃(x̃, z̃)²) + √(d(z, y)² + d̃(z̃, ỹ)²)
= (d × d̃)((x, x̃), (z, z̃)) + (d × d̃)((z, z̃), (y, ỹ)).
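The product distance and the triangle inequality for it can be exercised numerically, taking both factor spaces to be (R, dR) for concreteness (a Python sketch; the function name d_prod is ours):

```python
import math
import random

# The product distance of 2.7.1, built from two factor distances.
d = lambda a, b: abs(a - b)           # distance on the first factor, here R
d_tilde = lambda a, b: abs(a - b)     # distance on the second factor, here R

def d_prod(p, q):
    # p = (x, x~), q = (y, y~)
    return math.sqrt(d(p[0], q[0])**2 + d_tilde(p[1], q[1])**2)

random.seed(1)
for _ in range(1000):
    p, q, z = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(3)]
    assert d_prod(p, q) == d_prod(q, p)                          # di3
    assert d_prod(p, q) <= d_prod(p, z) + d_prod(z, q) + 1e-12   # di2
```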
Proof. Statements a and b follow directly from the definitions and statement d
follows immediately from statements a and b.
As to statement c, assume first (X, d) and (X̃, d̃) separable, and let S and S̃
be two countable subsets of X and X̃ which are dense in X and X̃ respectively.
Then S × S̃ is a countable subset of X × X̃. Moreover, for each (x, x̃) ∈ X × X̃, by
2.3.12 there are sequences {xn } and {x̃n } in S and S̃ respectively such that xn → x
and x̃n → x̃ . Then, by statement a, {(xn , x̃n )} is a sequence in S × S̃ such that
(xn , x̃n ) → (x, x̃). By 2.3.12, this proves that S × S̃ is dense in X × X̃, and hence
that X × X̃ is separable.
Assume now (X × X̃, d × d̃) separable, and let T be a countable subset of X × X̃
which is dense in X × X̃. Let S be the set of first members of T, i.e. S := πX(T)
(cf. 1.2.6c). Then S is countable. Fix now x ∈ X and let x̃ be any element of
X̃. By 2.3.12, there is a sequence {(xn , x̃n )} in T such that (xn , x̃n ) → (x, x̃), and
hence, by statement a, such that xn → x. Since xn ∈ S, in view of 2.3.12 this
proves that S is dense in X, and hence that X is separable. For X̃, proceed in a
similar way.
2.7.4 Examples.
dC : C × C → R
(z1, z2) ↦ dC(z1, z2) := |z1 − z2|
Proof. By 2.4.2, the mapping ϕ is continuous at x iff the following condition holds:
2.7.6 Remark. Let (X, d) be a metric space. By 2.7.4a and 2.7.5, a function
ϕ : Dϕ → C with Dϕ ⊂ X is continuous at x ∈ Dϕ iff Re ϕ and Im ϕ (cf. 1.2.19)
are both continuous at x.
If ϕ is a function from R to C, i.e. ϕ : Dϕ → C with Dϕ ⊂ R, and x0 is a point
in Dϕ for which ∃ǫ > 0 such that (x0 −ǫ, x0 +ǫ) ⊂ Dϕ , we see that ϕ is differentiable
at x0 (cf. 1.2.21) iff ∃z ∈ C such that the following function is continuous at x0 :
Dϕ → C
x ↦ (ϕ(x) − ϕ(x0))/(x − x0) if x ≠ x0,
x ↦ z if x = x0.
If z with this property exists, then it is unique and in fact
2.7.7 Proposition. Let (X1 , d1 ), (X2 , d2 ), (Y, d) be metric spaces, and let a map-
ping ϕ : X1 × X2 → Y be continuous at a point (x1 , x2 ) of X1 × X2 . Then the
mapping
ϕx1 : X2 → Y
x2 ↦ ϕx1(x2) := ϕ(x1, x2)
is continuous at x2, and the mapping
ϕx2 : X1 → Y
x1 ↦ ϕx2(x1) := ϕ(x1, x2)
is continuous at x1 .
Proof. Let {x2,n } be a sequence in X2 such that x2,n → x2 . By 2.7.3a, {(x1 , x2,n )}
is then a sequence in X1 × X2 such that (x1 , x2,n ) → (x1 , x2 ). Since ϕ is continuous
at (x1, x2), by 2.4.2 this implies that ϕ(x1, x2,n) → ϕ(x1, x2), i.e. that ϕx1(x2,n) → ϕx1(x2).
By 2.4.2, this proves that ϕx1 is continuous at x2 . For ϕx2 one proceeds in a similar
way.
2.8 Compactness
2.8.2 Theorem. For a subset S of a metric space (X, d) the following three condi-
tions are equivalent:
(a) the metric subspace (S, dS ) is complete and, for every ǫ > 0, S can be covered
by a finite family of open balls with radius ǫ;
(b) for every sequence in S, there exists a subsequence which converges to a point
of S;
(c) for every family G of open sets which is a cover of S, there exists a finite
subfamily Gf of G which is a cover of S.
must contain xn for infinitely many n ∈ N; define then the infinite subset N1 of N
by
N1 := {n ∈ N : xn ∈ B1 },
and choose n1 ∈ N1. Suppose now that we have defined an open ball Bk with radius
1/k and an infinite subset Nk of N such that xn ∈ Bk for all n ∈ Nk, and that we
have chosen nk ∈ Nk. Proceed hence as follows: since there is a finite family of open
balls with radius 1/(k+1) which is a cover of S and hence of S ∩ Bk, at least one of
these balls, which we denote by Bk+1, must contain xn for infinitely many n ∈ Nk;
define then the infinite subset Nk+1 of Nk by
Nk+1 := {n ∈ Nk : xn ∈ Bk+1 },
and choose nk+1 ∈ Nk+1 such that nk < nk+1 (this is possible because Nk+1 is an
infinite set). Since Nk+1 ⊂ Nk for each k ∈ N, for k and l such that l < k we have
Nk ⊂ Nl , and hence nk ∈ Nl , and hence xnk ∈ Bl .
Now, {xnk} is a subsequence of {xn} and it is a Cauchy sequence: if k > l then
xnk, xnl ∈ Bl and hence d(xnk, xnl) < 2/l. Since (S, dS) is complete, there exists
x ∈ S such that xnk → x as k → ∞.
b ⇒ a: We shall prove (not a)⇒(not b). Assume (not a), i.e. [(S, dS ) not
complete] or [∃ǫ > 0 s.t. S cannot be covered by a finite family of open balls with
radius ǫ].
If (S, dS ) is not complete, there is a Cauchy sequence {xn } in S with no limit in
S. Then, no subsequence of {xn } can converge to a point of S. Indeed, for ǫ > 0 let
Nǫ ∈ N be so that d(xn , xm ) < ǫ whenever n, m > Nǫ ; then, if a subsequence {xnk }
and x ∈ S existed such that xnk → x as k → ∞, by choosing k(ǫ) large enough so
that nk(ǫ) > Nǫ and d(xnk(ǫ) , x) < ǫ we would have
n > Nǫ ⇒ d(xn , x) ≤ d(xn , xnk(ǫ) ) + d(xnk(ǫ) , x) < 2ǫ,
and this would prove that xn → x.
On the other hand, if ǫ > 0 exists such that S cannot be covered by a finite family
of open balls with radius ǫ, we can construct a sequence {xn } in S inductively as
follows. Choose x1 ∈ S; having chosen x1, ..., xn, notice that S − ∪_{k=1}^{n} B(xk, ǫ) ≠ ∅
(otherwise we should have S ⊂ ∪_{k=1}^{n} B(xk, ǫ)) and choose xn+1 ∈ S − ∪_{k=1}^{n} B(xk, ǫ).
Then for n ≠ m we have ǫ ≤ d(xn, xm) (if e.g. n > m, then xn ∉ B(xm, ǫ)), and no
subsequence of {xn } can be convergent (cf. 2.6.2).
(a and b) ⇒ c: Assume condition b and let G be a family of open sets which is
a cover of S. We can prove by contradiction that
∃n ∈ N such that
(∗) [x ∈ X and B(x, 1/n) ∩ S ≠ ∅] ⇒ [∃G ∈ G s.t. B(x, 1/n) ⊂ G].
Indeed, suppose to the contrary that
∀n ∈ N, ∃xn ∈ X such that
B(xn, 1/n) ∩ S ≠ ∅ and B(xn, 1/n) ⊄ G for all G ∈ G.
(a) S is compact;
2.8.5 Proposition. For a subset S of a metric space (X, d) the following conditions
are equivalent:
(a) S is compact;
(b) the metric subspace (S, dS) is complete and, for each n ∈ N, there exists a finite
family {xn,1, ..., xn,Nn} of points of X such that S ⊂ ∪_{k=1}^{Nn} K(xn,k, 1/n).
we also have S ⊂ ∪_{k=1}^{Nn} B(xn,k, ǫ). Since (S, dS) is complete, this proves that S has
property a of 2.8.2.
2.8.7 Theorem (Heine–Borel). In the metric space (Rⁿ, dn) (cf. 2.7.4b), every
closed and bounded subset of Rⁿ is compact.
it is enough to prove that, for every ǫ > 0, QR can be covered by a finite family of
open balls of radius ǫ. Given ǫ > 0, choose an integer k such that k > R√n/ǫ and
construct a partition of QR made of kⁿ congruent subcubes, by dividing the interval
[−R, R] into k intervals of equal length. Each of these subcubes has side length 2R/k,
hence its diameter is 2R√n/k < 2ǫ, so it is contained in the open ball that has the
center of the subcube as its own center and radius ǫ. Thus, QR can be covered by
a family of kⁿ open balls with radius ǫ.
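The counting in this proof is easy to reproduce (a Python sketch; the helper name balls_needed is ours):

```python
import math

# The covering step in the proof of 2.8.7: the cube Q_R = [-R, R]^n can be
# covered by k^n subcubes of side 2R/k; choosing k > R*sqrt(n)/eps makes each
# subcube's diameter 2R*sqrt(n)/k smaller than 2*eps, so each subcube fits in
# an open ball of radius eps about its center.
def balls_needed(R, n, eps):
    k = math.floor(R * math.sqrt(n) / eps) + 1   # smallest integer > R*sqrt(n)/eps
    diameter = 2.0 * R * math.sqrt(n) / k
    assert diameter < 2.0 * eps
    return k**n

# Covering [-1, 1]^2 by eps-balls, for a few tolerances:
counts = [balls_needed(1.0, 2, eps) for eps in (1.0, 0.5, 0.1)]
assert counts == [4, 9, 225]
```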
2.8.8 Theorem. In a metric space (X, d), let F and K be subsets of X such that
F is closed, K is compact, and F ⊂ K. Then F is compact.
K = F ∪ (K − F ) ⊂ (∪G∈G G) ∪ (X − F ).
Since G ∪ {X − F } is a family of open sets and K is compact, this implies that there
exists a finite subfamily Gf of G such that
K ⊂ (∪G∈Gf G) ∪ (X − F )
2.8.9 Proposition. Let (X, d) be a metric space and {S1, ..., Sn} a finite family of
compact subsets of X. Then ∪_{k=1}^{n} Sk is a compact subset of X.
Proof. Let G be a family of open subsets of X which is a cover of ∪_{k=1}^{n} Sk. Then, for
each k ∈ {1, ..., n}, G is also a cover of Sk, and hence there exists a finite subfamily
Gk of G which is a cover of Sk. Hence, ∪_{k=1}^{n} Gk is a finite subfamily of G which is a
cover of ∪_{k=1}^{n} Sk. This proves that ∪_{k=1}^{n} Sk is compact.
2.8.10 Proposition. Let (X, d) and (X̃, d̃) be metric spaces. If S and S̃ are compact
subsets in (X, d) and (X̃, d̃) respectively, then S × S̃ is a compact subset in the
product metric space (X × X̃, d × d̃).
Proof. Let S and S̃ be compact subsets in (X, d) and (X̃, d̃) respectively, and let
{(xn, x̃n)} be a sequence in S × S̃. Since S is compact, there is a subsequence {xnk}
of the sequence {xn} which converges to a point x of S. Since {x̃nk} is a sequence
in S̃ and S̃ is compact, there is a subsequence {x̃nkl} of {x̃nk} which converges to
a point x̃ of S̃. Now, the subsequence {xnkl} of {xnk} converges to x (cf. 2.1.7b).
Then, by 2.7.3a, {(xnkl, x̃nkl)} is a subsequence of {(xn, x̃n)} which converges (with
respect to d × d̃) to (x, x̃), which is a point of S × S̃. This proves that S × S̃ is
compact.
Proof. Let {G̃i }i∈I be a family (which for convenience we denote as an indexed
family) of open subsets of X̃ such that ϕ(S) ⊂ ∪i∈I G̃i . By 2.4.3 and 2.2.5,
Since S is compact, this implies that there is a finite subset If of I such that
S ⊂ ∪i∈If Gi ,
Proof. Assume Dϕ compact and let ǫ > 0 be given. Since ϕ is continuous, the
following condition is satisfied:
∀x ∈ Dϕ, ∃δx,ǫ > 0 s.t. [y ∈ Dϕ and d(x, y) < δx,ǫ] ⇒ d̃(ϕ(x), ϕ(y)) < ǫ/2.
The family {B(x, δx,ǫ/2)}x∈Dϕ is a family of open sets and Dϕ ⊂ ∪x∈Dϕ B(x, δx,ǫ/2).
Since Dϕ is compact, there exists a finite subset {x1, ..., xn} of Dϕ such that
Dϕ ⊂ ∪_{i=1}^{n} B(xi, δxi,ǫ/2). We define
δǫ := min{δxi,ǫ/2 : i = 1, ..., n}
and we have δǫ > 0.
Let now x and y be points in Dϕ such that d(x, y) < δǫ. There is k ∈ {1, ..., n}
such that x ∈ B(xk, δxk,ǫ/2); then we have d̃(ϕ(x), ϕ(xk)) < ǫ/2 since d(x, xk) < δxk,ǫ;
we also have d̃(ϕ(xk), ϕ(y)) < ǫ/2 since
d(xk, y) ≤ d(xk, x) + d(x, y) < δxk,ǫ/2 + δǫ ≤ δxk,ǫ;
thus we have
d̃(ϕ(x), ϕ(y)) ≤ d̃(ϕ(x), ϕ(xk)) + d̃(ϕ(xk), ϕ(y)) < ǫ/2 + ǫ/2 = ǫ.
This proves that ϕ is uniformly continuous.
2.8.16 Theorem. In a compact metric space (X, d), let F be a closed set,
{G1, ..., GN} a finite family of open sets, and F ⊂ ∪_{n=1}^{N} Gn. Then there exists
a family {ψ1, ..., ψN} of functions ψn : X → [0, 1] such that ψn ≺ Gn for all
n ∈ {1, ..., N} and such that ∑_{n=1}^{N} ψn(x) = 1 for all x ∈ F.
For k ∈ {1, ..., N}, Hk is closed (cf. 2.3.2 and 2.3.7) and Hk ⊂ Gk since rx,k has
been chosen in such a way that
K(x, rx,k/2) ⊂ B(x, rx,k) ⊂ Gk, ∀(x, k) ∈ Ik;
then by 2.5.11 there exists a function ϕk : X → [0, 1] such that Hk ≺ ϕk ≺ Gk .
Define:
ψ1 := ϕ1 ,
ψ2 := (1X − ϕ1 )ϕ2 ,
..
.
ψN := (1X − ϕ1 )(1X − ϕ2 ) · · · (1X − ϕN −1 )ϕN .
For every n ∈ {1, ..., N }, ψn is a continuous function and 0 ≤ ψn (x) ≤ 1 for all
x ∈ X (since ψn is a product of continuous functions with values in [0, 1]), and
also supp ψn ⊂ Gn (since clearly supp ψn ⊂ supp ϕn ). Thus, ψn ≺ Gn . It is easily
verified, by induction, that
∑_{n=1}^{N} ψn = 1X − (1X − ϕ1)(1X − ϕ2) · · · (1X − ϕN). (4)
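Identity (4) is a telescoping identity and can be checked pointwise on sample values of the ϕn's (a Python sketch; since each x ∈ F lies in some Hk, where ϕk takes the value 1, the right-hand side equals 1 on F):

```python
# The telescoping identity behind 2.8.16: with psi_1 = phi_1 and
# psi_n = (1-phi_1)...(1-phi_{n-1}) phi_n, the psi_n sum pointwise to
# 1 - (1-phi_1)...(1-phi_N). Checked on sample values in [0, 1].
def psi_values(phis):
    out, carry = [], 1.0             # carry = (1-phi_1)...(1-phi_{n-1})
    for p in phis:
        out.append(carry * p)
        carry *= (1.0 - p)
    return out, carry

for phis in ([0.2, 0.7, 1.0], [0.0, 0.0, 0.5], [1.0, 0.3]):
    psis, rest = psi_values(phis)
    assert abs(sum(psis) - (1.0 - rest)) < 1e-12
    # if some phi_k(x) = 1 (as happens for x in F), the psi's sum to 1
    if any(p == 1.0 for p in phis):
        assert abs(sum(psis) - 1.0) < 1e-12
```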
2.9 Connectedness
2.9.1 Definition. A metric space (X, d) is said to be connected if there does not
exist a pair of non-empty open sets G1 and G2 such that
G1 ∩ G2 = ∅ and G1 ∪ G2 = X.
A non-empty subset S of X is said to be connected if the metric subspace (S, dS )
is connected.
2.9.2 Proposition. For a metric space (X, d), the following conditions are equiv-
alent:
(a) (X, d) is connected;
(b) the only subsets of X which are both open and closed are ∅ and X;
(c) there does not exist a pair of non-empty closed sets F1 and F2 such that
F1 ∩ F2 = ∅ and F1 ∪ F2 = X.
Proof. (not a)⇒(not b): Let G1 and G2 be non-empty open sets such that
G1 ∩ G2 = ∅ and G1 ∪ G2 = X.
Then G2 = X − G1 and hence G2 is closed and G2 ≠ X (for otherwise G1 = ∅).
(not b) ⇒ (not c): Let S be a subset of X which is both open and closed and
such that S ≠ ∅ and S ≠ X. Then S′ := X − S is non-empty and closed and
S ∩ S′ = ∅ and S ∪ S′ = X.
(not c)⇒(not a): Let F1 and F2 be non-empty closed sets such that
F1 ∩ F2 = ∅ and F1 ∪ F2 = X.
Then G1 := X − F1 and G2 := X − F2 are non-empty (G1 = ∅ would imply F2 = ∅
since F2 = X − F1 , and similarly for G2 ) open sets and
G1 ∩ G2 = X − (F1 ∪ F2 ) = ∅ and G1 ∪ G2 = X − (F1 ∩ F2 ) = X.
Proof. a ⇒ b: We shall prove (not b)⇒(not a). Suppose that S is neither R nor
an interval nor a singleton set. Then there exist x, y, z ∈ R such that
x < y < z, x, z ∈ S, y ∉ S.
Then,
S = ((−∞, y) ∩ S) ∪ ((y, ∞) ∩ S)
and (−∞, y) ∩ S and (y, ∞) ∩ S are two disjoint non-empty sets which are open in
the metric space (S, dS ) (cf. 2.2.5). Therefore S is not connected.
b ⇒ a: If S is a singleton set then it is obviously connected. Then suppose that
S is R or an interval. We shall prove by contradiction that S is connected. Suppose
to the contrary that S is not connected. Then (cf. 2.9.2) there exist two non-empty
subsets T1 and T2 of S which are closed in the metric space (S, dS ) and such that
T1 ∩ T2 = ∅ and T1 ∪ T2 = S.
Since T1 and T2 are non-empty we can choose x1 ∈ T1 and x2 ∈ T2 . Since T1 and
T2 are disjoint, x1 6= x2 and (by altering our notation if necessary) we may assume
that x1 < x2 . Since S is R or an interval, [x1 , x2 ] ⊂ S and each point in [x1 , x2 ] is
in either T1 or T2 . Since [x1 , x2 ] ∩ T1 6= ∅, we can define
y := sup([x1 , x2 ] ∩ T1 ).
It is clear that x1 ≤ y ≤ x2, so y ∈ S. By definition of the l.u.b. (cf. 1.1.5), for
each n ∈ N we can choose zn ∈ [x1, x2] ∩ T1 such that y − 1/n < zn; thus we have a
sequence {zn} in T1 such that y − 1/n < zn ≤ y; since T1 is closed in the metric space
(S, dS) and dS is a restriction of dR, this proves that y ∈ T1 (cf. 2.3.4). Since T1
and T2 are disjoint, this implies y < x2. For each n ∈ N such that y + 1/n ≤ x2 we
have y + 1/n ∈ [x1, x2] and hence y + 1/n ∈ S, and then either y + 1/n ∈ T1 or y + 1/n ∈ T2;
however y + 1/n ∈ T1 would imply y + 1/n ∈ [x1, x2] ∩ T1 and this would contradict the
definition of y; therefore, y + 1/n ∈ T2; thus, the sequence {y + 1/n} is in T2 for n large
2.9.4 Theorem. Let (X, d) and (X̃, d̃) be metric spaces and let ϕ : X → X̃ be a
continuous mapping. If (X, d) is connected then Rϕ is a connected subset of X̃.
ϕ⁻¹(G2) are non-empty, and they are open sets in (X, d) (cf. 2.4.3). Moreover,
ϕ⁻¹(G1) ∩ ϕ⁻¹(G2) = ϕ⁻¹((G1 ∩ Rϕ) ∩ (G2 ∩ Rϕ)) = ϕ⁻¹(∅) = ∅,
Proof. The statement follows immediately from 2.9.4 and 2.9.3 (a ⇒ b).
2.9.7 Definition. Let (X, d) be a metric space. Two subsets S1 and S2 of X are
said to be separated from one another if S̄1 ∩ S̄2 = ∅.
2.9.8 Theorem. Let (X, d) be a metric space and suppose that there exists a family
F of subsets of X such that:
(a) each element of F is connected;
(b) ∪S∈F S = X;
(c) no two elements of F are separated from one another.
Then (X, d) is connected.
Proof. Let T be a subset of X which is both open and closed. We shall show that
T is either empty or equal to all of X. In view of 2.9.2, this will prove the
statement.
Each element of F is connected (cf. a), so for any S ∈ F we know (cf. 2.9.2)
that T ∩ S is either empty or all of S, since T ∩ S is both open and closed in the
metric subspace (S, dS ) (cf. 2.2.5 and 2.3.3).
If T ∩ S = ∅ for all S ∈ F, then T = ∪S∈F (T ∩ S) = ∅ (cf. b).
The other possibility is that there exists S0 ∈ F such that T ∩ S0 ≠ ∅. Then
T ∩ S0 = S0, i.e. S0 ⊂ T. If S0 is the only element of F, this gives T = X (cf. b).
If not, let S be an element of F different from S0; if T ∩ S = ∅ then S ⊂ X − T, and
hence S̄ ⊂ X − T since X − T is closed; therefore, S̄0 ∩ S̄ = ∅ (we have S̄0 ⊂ T since
T is closed); however, this is not possible since no two elements of F are separated
from one another (cf. c); thus, we must have T ∩ S ≠ ∅ and hence T ∩ S = S. Hence
we have T ∩ S = S for all S ∈ F, and hence T = ∪S∈F (T ∩ S) = ∪S∈F S = X (cf.
b).
2.9.9 Theorem. Let (X, d) and (X̃, d̃) be connected metric spaces. Then the prod-
uct metric space (X × X̃, d × d̃) is connected.
For all (x, x̃), (x′, x̃′) ∈ X × X̃, ({x} × X̃) ∪ (X × {x̃}) and ({x′} × X̃) ∪ (X × {x̃′})
are not separated from one another since
(x, x̃′) ∈ ({x} × X̃) ∩ (X × {x̃′}).
Therefore (X × X̃, d × d̃) is connected by 2.9.8.
by 2.9.9, and hence (Rⁿ⁺¹, dn+1) is connected by 2.9.5 since there exists an obvious
isomorphism from (Rⁿ × R, dn × dR) onto (Rⁿ⁺¹, dn+1). This concludes the proof
by induction.
Chapter 3
Our main purpose is to study operators in Hilbert spaces, which are in fact linear
operators in linear spaces. Hence the subject of this chapter. Throughout the
chapter, K stands for a field. By 0 and 1 we denote the zero and unit elements of
K.
3.1.1 Definition. A linear space over K (or, simply, a linear space) is a triple
(X, σ, µ), where X is a non-empty set, σ is a mapping σ : X × X → X, µ is a
mapping µ : K × X → X, and the conditions listed under ls1 and ls2 are satisfied.
(ls1 ) (X, σ) is an abelian group; i.e., with the shorthand notation f + g := σ(f, g),
we have:
f + (g + h) = (f + g) + h, ∀f, g, h ∈ X,
∃0X ∈ X s.t. f + 0X = f , ∀f ∈ X,
∀f ∈ X, ∃f ′ ∈ X s.t. f + f ′ = 0X ,
f + g = g + f , ∀f, g ∈ X;
we recall (cf. 1.3.1) that 0X is the only element of X s.t. f + 0X = f for all
f ∈ X, that for f ∈ X there is only one element f ′ of X s.t. f + f ′ = 0X
and that it is denoted by −f , and that we write f − g := f + (−g).
(ls2 ) With the shorthand notation αf := µ(α, f ), we have:
α(βf ) = (αβ)f , ∀α, β ∈ K, ∀f ∈ X,
(α + β)f = αf + βf , ∀α, β ∈ K, ∀f ∈ X,
α(f + g) = αf + αg, ∀α ∈ K, ∀f, g ∈ X,
1f = f , ∀f ∈ X.
The elements of K are called scalars, and will be preferably denoted by the small
Greek letters α, β, γ, .... The elements of X are called vectors, and will be preferably
denoted by the italics f, g, h, .... The composition law σ is called vector sum and
the composition law µ is called scalar multiplication. Another name for a linear
space is vector space.
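The axioms ls1 and ls2 form a finite list of identities, so for a concrete candidate space they can be spot-checked mechanically. The following sketch (an illustration of ours, not part of the text) samples random vectors in X = R² over K = R with the usual componentwise operations; a finite random check is of course no substitute for a proof.

```python
import numpy as np

# Spot-check of the linear-space axioms of 3.1.1 for X = R^2 over K = R.
# (Illustrative only: a finite sample cannot prove the axioms.)
rng = np.random.default_rng(0)
f, g, h = rng.normal(size=(3, 2))          # three random vectors
alpha, beta = rng.normal(size=2)           # two random scalars
zero = np.zeros(2)                         # the zero vector 0_X

# (ls1) abelian-group axioms
assert np.allclose(f + (g + h), (f + g) + h)    # associativity
assert np.allclose(f + zero, f)                 # zero element
assert np.allclose(f + (-f), zero)              # opposite element
assert np.allclose(f + g, g + f)                # commutativity

# (ls2) scalar-multiplication axioms
assert np.allclose(alpha * (beta * f), (alpha * beta) * f)
assert np.allclose((alpha + beta) * f, alpha * f + beta * f)
assert np.allclose(alpha * (f + g), alpha * f + alpha * g)
assert np.allclose(1.0 * f, f)
print("all eight axioms hold on the sample")
```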
3.1.2 Theorem. Let X be a linear space over K. Then the following statements hold:
(a) 0f = 0X , ∀f ∈ X;
(b) α0X = 0X , ∀α ∈ K;
(c) if α ∈ K and f ∈ X are such that αf = 0X , then α ≠ 0 ⇒ f = 0X (or equivalently f ≠ 0X ⇒ α = 0);
(d) (−1)f = −f , ∀f ∈ X;
(e) (−α)f = −(αf ), ∀α ∈ K, ∀f ∈ X (hence we will write −αf := (−α)f ).
3.1.3 Definition. A non-empty subset M of a linear space X is called a linear manifold in X if the following conditions hold:
(lm1 ) f, g ∈ M ⇒ f + g ∈ M ;
(lm2 ) (α ∈ K and f ∈ M ) ⇒ αf ∈ M .
3.1.4 Remarks.
(a) In any linear space X there are two trivial linear manifolds: {0X } and X.
(b) If M is a linear manifold in a linear space (X, σ, µ) and N is a non-empty
subset of M , then N is a linear manifold in (X, σ, µ) iff N is a linear manifold
in (M, σM×M , µK×M ).
(c) Conditions lm1 and lm2 of 3.1.3 are equivalent to the one condition:
(lm) (α, β ∈ K and f, g ∈ M ) ⇒ αf + βg ∈ M .
3.1.8 Definition. If S1 and S2 are subsets of a linear space X, we define their sum
S1 + S2 by
S1 + S2 := {f1 + f2 : f1 ∈ S1 and f2 ∈ S2 }.
It is immediate to see that, if S1 and S2 are linear manifolds in X, then S1 + S2 is
a linear manifold in X.
3.1.9 Definition. Let X and Y be linear spaces over the same field K. It is
immediate to see that the set X × Y becomes a linear space over K if we define
vector sum and scalar multiplication by the rules:
(f1 , g1 ) + (f2 , g2 ) := (f1 + f2 , g1 + g2 ), ∀(f1 , g1 ), (f2 , g2 ) ∈ X × Y ;
α(f, g) := (αf, αg), ∀α ∈ K, ∀(f, g) ∈ X × Y.
This linear space is called the sum of the linear spaces X, Y and is denoted by
X + Y . It is immediate to see that the two subsets of X × Y
X̂ := {(f, 0Y ) : f ∈ X} and Ŷ := {(0X , g) : g ∈ Y }
are linear manifolds in X + Y and that X × Y = X̂ + Ŷ , with X̂ + Ŷ defined as in
3.1.8.
3.1.10 Examples.
(a) Let x denote a point (of any set). Define X := {x}, and vector sum and scalar
multiplication by the rules:
σ(x, x) := x, µ(α, x) := x for all α ∈ K.
The triple (X, σ, µ) defined in this way is a trivial linear space, which is called
a zero linear space. If X is a zero linear space, we have X = {0X }.
(b) Define X := K, and vector sum and scalar multiplication by the rules:
σ(z1 , z2 ) := z1 + z2 , ∀z1 , z2 ∈ K,
µ(α, z) := αz, ∀α, z ∈ K
(where z1 + z2 and αz are the sum and the product that are defined in the field
K). The triple (X, σ, µ) defined in this way is a linear space over K, which is
called the linear space K.
(c) Let X be a non-empty set and let F (X) denote the family of all the functions
from X to C that have the whole of X as their domains, i.e. the family of
complex functions on X. Define the mappings
σ : F (X) × F (X) → F (X)
(ϕ, ψ) ↦ σ(ϕ, ψ) := ϕ + ψ,
µ : C × F (X) → F (X)
(α, ϕ) ↦ µ(α, ϕ) := αϕ,
where ϕ + ψ and αϕ are defined as in 1.2.19. It is immediate to check that
(F (X), σ, µ) is a linear space over C (hence, the symbols ϕ + ψ and αϕ defined
in 1.2.19 are in agreement with the shorthand notations used in 3.1.1), with the
function
0X : X → C
x ↦ 0X (x) := 0
(cf. 1.2.19) as zero element, and the function −ϕ (cf. 1.2.19) as the opposite
element of an element ϕ of F (X).
(d) Let X be a non-empty set, and let FB (X) denote the set of all bounded elements
of F (X):
FB (X) := {ϕ ∈ F (X) : ∃mϕ ∈ [0, ∞) such that
|ϕ(x)| ≤ mϕ for all x ∈ X}.
It is immediate to check that FB (X) is a linear manifold in F (X).
(e) Let (X, d) be a metric space, and define
C(X) := {ϕ ∈ F (X) : ϕ is continuous}.
Since a linear combination (cf. 3.1.12) of continuous functions is a continuous
function, C(X) is a linear manifold in F (X).
We also define
CB (X) := C(X) ∩ FB (X),
which is a linear manifold in F (X) by 3.1.5, and hence in FB (X) by 3.1.4b.
If (X, d) is a compact metric space, we have C(X) = CB (X) by 2.8.14.
(f) For a, b ∈ R such that a < b, the family of functions C(a, b) := C([a, b]) is a
linear manifold in FB ([a, b]) since [a, b] is compact (cf. 2.3.7 and 2.8.7).
By C¹(a, b) we denote the set of all the elements of C(a, b) that are differentiable at all points of [a, b] and such that their derivatives (cf. 1.2.21 and 2.7.6) are elements of C(a, b) (differentiability and derivatives at a and b are one-sided).
Since a linear combination of differentiable functions is differentiable and its derivative is the linear combination of the derivatives, C¹(a, b) is a linear manifold in C(a, b).
(g) By Cc (R) we denote the family of all continuous complex functions on R whose
support is compact, i.e. we define
Cc (R) := {ϕ ∈ C(R) : supp ϕ is compact}
(for supp ϕ cf. 2.5.9). From 2.8.6 and 2.8.7, for ϕ ∈ C(R) we have
ϕ ∈ Cc (R) ⇔ ∃aϕ , bϕ ∈ R s.t. aϕ < bϕ and supp ϕ ⊂ [aϕ , bϕ ].
(h) By C∞ (R) we denote the subset of F (R) defined by
C∞ (R) := {ϕ ∈ F (R) : ϕ is infinitely differentiable at all points of R}.
Clearly, C∞ (R) is a linear manifold in F (R).
Next, we define the Schwartz space of functions of rapid decrease:
S(R) := {ϕ ∈ C∞ (R) : lim_{x→±∞} x^k ϕ^{(l)} (x) = 0, ∀k = 0, 1, 2, ..., ∀l = 0, 1, 2, ...},
where ϕ^{(l)} denotes the l-th derivative of ϕ (and ϕ^{(0)} := ϕ).
The following properties of S(R) are easily checked:
(1) ϕ ∈ S(R) ⇒ ϕ^{(l)} ∈ S(R), ∀l ∈ N;
(2) ϕ ∈ S(R) ⇒ ϕ ∈ S(R);
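As a numerical aside (ours, not the book's), the Gaussian ϕ(x) = e^{−x²} is the standard example of an element of S(R): every product x^k ϕ^{(l)}(x) dies off at ±∞. The sketch below checks this for l = 0, 1 at a few sample points, using the explicit derivative ϕ′(x) = −2x e^{−x²}.

```python
import numpy as np

# The Gaussian belongs to S(R): x^k phi^(l)(x) is tiny for large |x|.
# (Illustration only; we check l = 0, 1 at sample points, not the limit.)
phi  = lambda x: np.exp(-x**2)             # phi(x)  = e^{-x^2}
dphi = lambda x: -2.0 * x * np.exp(-x**2)  # phi'(x) = -2x e^{-x^2}

for k in range(6):
    for x in (10.0, -10.0, 50.0):
        assert abs(x**k * phi(x))  < 1e-20
        assert abs(x**k * dphi(x)) < 1e-20
print("rapid decrease confirmed on the sample points")
```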
We will see now just the few facts about linear independence and linear dimen-
sion that will be used in our study of Hilbert spaces. Thus, our treatment of these
subjects will be nowhere near complete.
Proof. The first statement follows from 3.1.2b,c. Assume then n > 1 and f1 6= 0X .
If there exists k ∈ {2, ..., n} such that fk = ∑_{i=1}^{k−1} αi fi , with αi ∈ K for i = 1, ..., k − 1, then ∑_{i=1}^{k−1} αi fi − 1fk = 0X , and this proves that S is linearly dependent.
Assume now that S is linearly dependent; this means that there are a non-empty subset {fi1 , ..., fir } of S and (α1 , ..., αr ) ∈ K^r such that (α1 , ..., αr ) ≠ (0, ..., 0) and ∑_{l=1}^{r} αl fil = 0X ; hence there is (β1 , ..., βn ) ∈ K^n such that (β1 , ..., βn ) ≠ (0, ..., 0) and ∑_{i=1}^{n} βi fi = 0X . If βk is the last non-zero element in (β1 , ..., βn ), then k > 1 (since f1 ≠ 0X ) and fk = ∑_{i=1}^{k−1} (−βi /βk )fi .
3.1.15 Theorem. Let X be a non-zero linear space and assume that there exists
a linear basis in X which is finite and contains n elements. Then every linearly
independent subset of X which contains n elements is a linear basis.
Proof. Let {e1 , ..., en } be the linear basis in X of which we assume the existence,
and let {f1 , ..., fn } be a linearly independent subset of X.
Since f1 is a linear combination of the ei ’s, the set
S1 := {f1 , e1 , ..., en }
is linearly dependent; then, by 3.1.14, there is one of the ei ’s, say ei1 , which is
a linear combination of the vectors that precede it in S1 ; if we delete ei1 , the
remaining set
(we have written as if i1 < i2 , but it could well be the other way round) is still (as
it was S1′ ) such that X = LS2′ . Continuing in this way, in the end we are left with
the set
which is such that X = LSn′ . This proves that {f1 , ..., fn } is a linear basis.
3.1.16 Corollary. Let X be a non-zero linear space and assume that there exists
a linear basis in X which is finite and contains n elements. Then every linearly
independent subset of X is finite and contains at most n elements.
Proof. Our proof is by contradiction. Assume that there exists a linearly indepen-
dent subset S of X which is either infinite or finite with more than n elements. In
both cases, there is a subset {f1 , ..., fn , fn+1 } of S which contains n + 1 elements.
As any subset of S, {f1 , ..., fn , fn+1 } is linearly independent. However, {f1 , ..., fn }
is also linearly independent, hence a linear basis by 3.1.15. But then fn+1 is a
linear combination of f1 , ..., fn , and this contradicts the linear independence of
{f1 , ..., fn , fn+1 }.
3.1.17 Corollary. If in a linear space X there exists a finite linear basis, then
every other linear basis in X is finite and contains the same number of elements.
Proof. Assume that B1 is a linear basis in X which is finite and contains n elements,
and let B2 be another linear basis in X. Since B2 is a linearly independent subset
of X, by 3.1.16 B2 is finite and contains m elements with m ≤ n. Then, since B2
is a finite linear basis with m elements and B1 is a linearly independent subset of
X, by 3.1.16 we have n ≤ m. Thus, m = n.
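In the finite-dimensional case these statements are easy to probe numerically. The sketch below (our illustration; it assumes X = R³, where linear independence of n vectors amounts to the matrix with those columns having rank n) exhibits two bases with the same number of elements, in line with 3.1.15-3.1.17, and shows that four vectors cannot be independent.

```python
import numpy as np

# Independence test in R^3: n vectors are linearly independent iff the
# matrix having them as columns has rank n. (Illustration; X = R^3 assumed.)
def is_independent(vectors):
    return np.linalg.matrix_rank(np.column_stack(vectors)) == len(vectors)

e = [np.eye(3)[:, i] for i in range(3)]               # standard basis, n = 3
f = [np.array([1., 1., 0.]),
     np.array([0., 1., 1.]),
     np.array([1., 0., 1.])]                          # another independent triple

assert is_independent(e) and is_independent(f)
# f also spans R^3 (its matrix has rank 3), so by 3.1.15 it is a second
# basis, and by 3.1.17 it has the same cardinality as e.
assert np.linalg.matrix_rank(np.column_stack(f)) == 3
# By 3.1.16, no four vectors of R^3 are linearly independent.
assert not is_independent(f + [np.array([2., 3., 4.])])
```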
3.1.19 Proposition. Let {fn } be a sequence in a linear space X, and assume that
there exists n ∈ N such that fn 6= 0X . Then there exists an N -tuple (fn1 , ..., fnN )
or a subsequence {fnk } such that, letting I := {1, ..., N } or I := N, {fnk }k∈I is a
linearly independent subset of X and L{fnk }k∈I = L{fn}n∈N .
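The selection asserted by 3.1.19 can be imitated by a greedy scan in the finite-dimensional setting (our sketch, with X = R³, not the book's proof): keep fn exactly when it is not a linear combination of the vectors already kept; the kept vectors are then linearly independent and have the same linear span as the whole sequence.

```python
import numpy as np

# Greedy sifting of a (finite) sequence in R^3 into an independent family
# with the same linear span. (Illustration of 3.1.19 in finite dimension.)
def sift(seq):
    kept = []
    for v in seq:
        candidate = kept + [v]
        # keep v only if it enlarges the span, i.e. the rank grows
        if np.linalg.matrix_rank(np.column_stack(candidate)) == len(candidate):
            kept.append(v)
    return kept

seq = [np.array([1., 0., 0.]), np.array([2., 0., 0.]),
       np.array([0., 1., 0.]), np.array([1., 1., 0.])]
kept = sift(seq)
assert len(kept) == 2                       # f_1 and f_3 survive
# same span: adjoining any original vector does not raise the rank
for v in seq:
    assert np.linalg.matrix_rank(np.column_stack(kept + [v])) == len(kept)
```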
3.2.1 Definition. Let X and Y be linear spaces over the same field K. A linear
operator (or, simply, an operator ) from X to Y is a mapping A from X to Y , i.e.
A : DA → Y with DA ⊂ X (the set DA is called the domain of A, cf. 1.2.1), which
has the following properties:
(lo1 ) DA is a linear manifold in the linear space X;
(lo2 ) A(f + g) = Af + Ag, ∀f, g ∈ DA ;
(lo3 ) A(αf ) = αAf , ∀α ∈ K, ∀f ∈ DA .
By tradition, linear operators are denoted by capital letters, and the value A(f ) of
a linear operator A at f ∈ DA is written as Af .
When X = Y , a linear operator A from X to X is called a linear operator in
X, and on X if DA = X.
When Y is the linear space K (cf. 3.1.10b), a linear operator from X to K is
called a linear functional.
We point out that conditions lo2 and lo3 are consistent only when condition lo1
is assumed, and that conditions lo1 , lo2 , lo3 are equivalent to the one condition:
(lo) A(αf + βg) = αAf + βAg, ∀α, β ∈ K, ∀f, g ∈ DA .
We denote by the symbol O(X, Y ) the family of all linear operators from X to Y .
If X = Y , we write O(X) := O(X, X).
For A ∈ O(X, Y ), the null space (also called the kernel ) of A is the subset of X
defined by
NA := {f ∈ X : f ∈ DA and Af = 0Y }.
3.2.2 Definition. Let X and Y be linear spaces over the same field K, and let
A ∈ O(X, Y ). We have:
Proof. a: By its definition, RA can never be the empty set. Moreover we have:
(α, β ∈ K and g1 , g2 ∈ RA ) ⇒
[∃f1 , f2 ∈ DA s.t. g1 = Af1 and g2 = Af2 , and hence s.t.
αf1 + βf2 ∈ DA and αg1 + βg2 = αAf1 + βAf2 = A(αf1 + βf2 )] ⇒
αg1 + βg2 ∈ RA .
3.2.3 Definition. Let X and Y be linear spaces over the same field.
For A, B ∈ O(X, Y ), in agreement with 1.2.5 we write A ⊂ B (or B ⊃ A) if
f ∈ DA ⇒ (f ∈ DB and Af = Bf ),
and we have A = B iff
A ⊂ B and DB ⊂ DA .
For A ∈ O(X, Y ) and a subset M of DA , one can see at once that the restriction
AM of A to M (cf. 1.2.5) is a linear operator iff M is a linear manifold.
3.2.6 Theorem. Let X and Y be linear spaces over the same field K, and let
A ∈ O(X, Y ). We have:
(a) A is injective iff NA = {0X };
(b) if A is injective, then the mapping A−1 is a linear operator, i.e. A−1 ∈ O(Y, X),
and we have A−1 A = 1DA and AA−1 = 1RA .
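For a concrete instance of 3.2.6 (ours, not the book's), take for A an invertible 2 × 2 real matrix acting on R²: the null space reduces to {0}, the inverse mapping is again linear, and A⁻¹A, AA⁻¹ act as identities (here DA = RA = R²).

```python
import numpy as np

# An injective operator on R^2 given by an invertible matrix, and the
# linearity of its inverse mapping (illustration of 3.2.6).
A = np.array([[2., 1.],
              [0., 3.]])
Ainv = np.linalg.inv(A)

# N_A = {0_X}: full rank means only the trivial solution of Af = 0
assert np.linalg.matrix_rank(A) == 2

rng = np.random.default_rng(1)
f, g = rng.normal(size=(2, 2))
alpha, beta = 2.0, -0.5
# linearity of A^{-1}
assert np.allclose(Ainv @ (alpha * f + beta * g),
                   alpha * (Ainv @ f) + beta * (Ainv @ g))
# A^{-1}A = 1 on D_A and AA^{-1} = 1 on R_A
assert np.allclose(Ainv @ (A @ f), f)
assert np.allclose(A @ (Ainv @ f), f)
```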
3.2.8 Definitions. Let X and Y be linear spaces over the same field K. For
A, B ∈ O(X, Y ), we define the mapping:
A + B : DA ∩ DB → Y
f ↦ (A + B)f := Af + Bf,
which is called the sum of A and B. Recalling 3.1.5, it is immediate to see that
A + B ∈ O(X, Y ).
For α ∈ K and A ∈ O(X, Y ), we define the mapping
αA : DA → Y
f ↦ (αA)f := α(Af ).
3.2.9 Definition. Let X and Y be linear spaces over the same field K. We define
the mapping
OX,Y : X → Y
f ↦ OX,Y f := 0Y .
3.2.10 Proposition. The three binary operations defined in 3.2.4 and 3.2.8 have
the following properties
(a) If X and Y are linear spaces over the same field K, then we have:
(a1 ) A + (B + C) = (A + B) + C, ∀A, B, C ∈ O(X, Y ),
(a2 ) A + OX,Y = A, ∀A ∈ O(X, Y ),
(a3 ) A − A ⊂ OX,Y , ∀A ∈ O(X, Y ),
(a′3 ) A − A = OX,Y , ∀A ∈ O(X, Y ) s.t. DA = X,
(a4 ) A + B = B + A, ∀A, B ∈ O(X, Y ),
(a5 ) α(βA) = (αβ)A, ∀α, β ∈ K, ∀A ∈ O(X, Y ),
(a6 ) (α + β)A = αA + βA, ∀α, β ∈ K, ∀A ∈ O(X, Y ),
(a7 ) α(A + B) = αA + αB, ∀α ∈ K, ∀A, B ∈ O(X, Y ),
(a8 ) 1A = A, ∀A ∈ O(X, Y ).
(b) If W, X, Y, Z are four linear spaces over the same field K, then we have:
(b1 ) (AB)C = A(BC), ∀A ∈ O(Y, Z), ∀B ∈ O(X, Y ), ∀C ∈ O(W, X),
(b2 ) AB + AC ⊂ A(B + C), ∀A ∈ O(Y, Z), ∀B, C ∈ O(X, Y ),
(b′2 ) AB + AC = A(B + C), ∀A ∈ O(Y, Z) s.t. DA = Y, ∀B, C ∈ O(X, Y ),
(b3 ) AC + BC = (A + B)C, ∀A, B ∈ O(Y, Z), ∀C ∈ O(X, Y ),
(b4 ) (αA)B = α(AB) = A(αB), ∀α ∈ K − {0}, ∀A ∈ O(Y, Z), ∀B ∈ O(X, Y ),
(b′4 ) (0A)B = 0(AB) ⊂ A(0B), ∀A ∈ O(Y, Z), ∀B ∈ O(X, Y ),
(b5 ) 1Y A = A1X = A, ∀A ∈ O(X, Y ).
Proof. For all the relations we have to prove it is clear that, at a vector which
belongs to the intersection of the domains of the two operators which appear on the
two sides of the relation, the value of the operator on the left hand side coincides
with the value of the operator on the right hand side. Thus, in order to prove the
relations between the operators, we need only prove the same relations between
their domains. We will examine only the cases that are not completely obvious.
a1 : We have
DA+(B+C) = DA ∩ DB+C = DA ∩ (DB ∩ DC )
= (DA ∩ DB ) ∩ DC = DA+B ∩ DC = D(A+B)+C .
b1 : Cf. 1.2.17.
b2 : We have
f ∈ DAB+AC ⇒ f ∈ DAB ∩ DAC ⇒
(f ∈ DB , Bf ∈ DA , f ∈ DC , Cf ∈ DA ) ⇒
(f ∈ DB ∩ DC and Bf + Cf ∈ DA ) ⇒
(f ∈ DB+C and (B + C)f ∈ DA ) ⇒ f ∈ DA(B+C) .
b′2 : If DA = Y , we have (cf. 1.2.13e)
DAB+AC = DAB ∩ DAC = DB ∩ DC = DB+C = DA(B+C) .
b3 : We have
f ∈ DAC+BC ⇔ f ∈ DAC ∩ DBC ⇔
(f ∈ DC , Cf ∈ DA , f ∈ DC , Cf ∈ DB ) ⇔
(f ∈ DC and Cf ∈ DA ∩ DB ) ⇔
(f ∈ DC and Cf ∈ DA+B ) ⇔ f ∈ D(A+B)C .
b4 : If α ≠ 0, then DAB = DA(αB) : since DA is a linear manifold, for f ∈ DB
we have Bf ∈ DA ⇒ αBf ∈ DA , and also αBf ∈ DA ⇒ Bf = α−1 (αBf ) ∈ DA .
b′4 : We have (0A)B = 0(AB) = (OX,Y )DAB and also A(0B) = (OX,Y )DB
because 0Bf = 0Y ∈ DA for all f ∈ DB . Then, we recall that DAB ⊂ DB (cf.
1.2.13c).
3.2.11 Remark. The family O(X, Y ) with the two binary operations defined in
3.2.8, despite the symbols used to denote them, is not a linear space. In fact, OX,Y
is the only element of O(X, Y ) for which condition a2 of 3.2.10 can hold (if Õ is
another operator which satisfies that condition, then we have Õ = Õ + OX,Y =
OX,Y + Õ = OX,Y ), and for an operator A ∈ O(X, Y ) with DA ≠ X no operator
A′ ∈ O(X, Y ) can exist such that A + A′ = OX,Y , since DA+A′ ⊂ DA for all
A′ ∈ O(X, Y ). Thus, there is no opposite for any element of O(X, Y ) that is not
defined on the whole of X.
We also notice that, in condition b2 of 3.2.10, we do have AB + AC ≠ A(B + C) if for instance RB ⊄ DA and C = −B. In fact, this implies both DAB+AC = DAB ≠ DB (cf. 1.2.13d) and DA(B+C) = DB (from B + C = B − B ⊂ OX,Y we
have (B + C)f = 0Y ∈ DA for all f ∈ DB = DB+C ).
3.2.12 Definition. Let X and Y be linear spaces over the same field, and define
OE (X, Y ) := {A ∈ O(X, Y ) : DA = X}.
Thus, OE (X, Y ) is the family of all the operators from X to Y that are defined
everywhere on X. For X = Y we write OE (X) := OE (X, X).
3.2.13 Remark. All the relations that appear in 3.2.10 are equalities if
O(W, X), O(X, Y ), O(Y, Z) are replaced by OE (W, X), OE (X, Y ), OE (Y, Z) respec-
tively.
3.2.14 Theorem. Let X and Y be linear spaces over the same field K, and define
the mappings
σ : OE (X, Y ) × OE (X, Y ) → OE (X, Y )
(A, B) ↦ σ(A, B) := A + B,
µ : C × OE (X, Y ) → OE (X, Y )
(α, A) ↦ µ(α, A) := αA,
with A + B and αA defined as in 3.2.8. Then (OE (X, Y ), σ, µ) is a linear space
over K (thus, the symbols A + B and αA introduced in 3.2.8 are in agreement with
the shorthand notation introduced in 3.1.1). The zero element is the operator OX,Y
defined in 3.2.9 and the opposite element of A ∈ OE (X, Y ) is the operator −A
defined in 3.2.8.
3.2.15 Proposition. Let X and Y be linear spaces over the same field.
(a) A mapping ϕ from X to Y is a linear operator iff Gϕ is a linear manifold in
the linear space X + Y (for Gϕ , cf. 1.2.3).
(b) A linear manifold G in the linear space X + Y is the graph of a mapping from
X to Y iff G has the property:
(0X , g) ∈ G ⇒ g = 0Y .
3.3.7 Theorem. The linear space OE (X) of the operators defined on a linear space
X (cf. 3.2.12 and 3.2.14) becomes an associative algebra with identity if we define
π : OE (X) × OE (X) → OE (X)
(A, B) ↦ π(A, B) := AB,
with AB defined as in 3.2.4 (thus, the symbol AB introduced in 3.2.4 is in agreement
with the shorthand notation introduced in 3.3.1). The identity is the operator 1X
defined in 3.2.5.
3.3.8 Examples.
(a) For the linear space F (X) (cf. 3.1.10c) we define
π : F (X) × F (X) → F (X)
(ϕ, ψ) ↦ π(ϕ, ψ) := ϕψ,
Chapter 4
There are properties of a linear operator in a Hilbert space which depend only on
the relation between the operator and the norm which is generated by the inner
product of the space. Thus, in this chapter we examine what can be said about
linear operators in normed spaces. Throughout the chapter, K stands for the field
C of complex numbers or the field R of real numbers.
4.1.1 Definition. A normed space over K (or simply a normed space) is a quadru-
ple (X, σ, µ, ν), where (X, σ, µ) is a linear space over K and ν is a function ν : X → R
which, with the shorthand notation ‖f ‖ := ν(f ), has the following properties:
(no1 ) ‖f + g‖ ≤ ‖f ‖ + ‖g‖, ∀f, g ∈ X,
(no2 ) ‖αf ‖ = |α|‖f ‖, ∀α ∈ K, ∀f ∈ X,
(no3 ) ‖f ‖ = 0 ⇒ f = 0X .
The function ν is called a norm for the linear space (X, σ, µ).
4.1.4 Example. Recall that K is a linear space over K (cf. 3.1.10b). From the
properties of the absolute value in R or of the modulus in C it follows immediately
that the function
ν:K→R
z ↦ ν(z) := |z|
is a norm for the linear space K. We have dν = dK (cf. 2.1.4 and 2.7.4a).
4.1.8 Remarks.
(a) Let (X, σ, µ, ν) be a normed space and let M be a linear manifold in the linear
space (X, σ, µ). It is immediate to see that (M, σM×M , µK×M , νM ) is a normed
space. If X is a Banach space, then (M, σM×M , µK×M , νM ) is a Banach space
as well iff M is a closed set. This follows at once from 2.6.6. This partially
justifies the definition we give in 4.1.9 (which is, however, completely justified
only in the context of Banach spaces).
(b) Let {fn } be a sequence in a Banach space. The series ∑_{n=1}^∞ fn is said to be absolutely convergent if the series ∑_{n=1}^∞ ‖fn ‖ is convergent. Suppose that the series ∑_{n=1}^∞ fn is absolutely convergent. Then the series ∑_{n=1}^∞ fn is convergent as well. Indeed, if we define sn := ∑_{k=1}^n fk and σn := ∑_{k=1}^n ‖fk ‖ for each n ∈ N, then the sequence {σn } is a Cauchy sequence (cf. 2.6.2), and hence the sequence {sn } is a Cauchy sequence as well since, for n < m,
‖sm − sn ‖ ≤ ∑_{k=n+1}^m ‖fk ‖ = |σm − σn |,
and this implies that the sequence {sn } is convergent (cf. 2.6.3). Moreover, if β is a bijection from N onto N then the series ∑_{n=1}^∞ fβ(n) is convergent and for the sums we have ∑_{n=1}^∞ fβ(n) = ∑_{n=1}^∞ fn . Indeed, for any ε > 0 let Nε ∈ N be so that |σn − σm | < ε for n, m > Nε . Then ∑_{k=Nε+2}^p ‖fk ‖ < ε for all p ≥ Nε + 2, and hence ∑_{k∈I} ‖fk ‖ < ε if I is a finite set of positive integers such that k > Nε + 1 for all k ∈ I. Let Mε := max β^{−1} ({1, ..., Nε + 1}) and note
Then
‖lim_{m→∞} sm − s′n ‖ ≤ ‖lim_{m→∞} sm − sn ‖ + ‖sn − s′n ‖ < 2ε for n > max{Mε , Lε }.
This proves that the sequence {s′n } is convergent and lim_{n→∞} s′n = lim_{n→∞} sn .
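The invariance of the sum under a rearrangement β can be watched numerically (our sketch, in the Banach space R, with the absolutely convergent series ∑ 1/n²; truncation stands in for the limit, so agreement holds only up to the neglected tail).

```python
import numpy as np

# Absolute convergence and rearrangement in R: a random bijection beta of
# the index set leaves the (truncated) sum essentially unchanged.
N = 200_000
terms = 1.0 / np.arange(1, N + 1) ** 2     # the terms 1/n^2, n = 1..N

rng = np.random.default_rng(2)
beta = rng.permutation(N)                  # a bijection of {1, ..., N}
s, s_rearranged = terms.sum(), terms[beta].sum()

assert abs(s - np.pi**2 / 6) < 1e-4        # the tail beyond N is about 1/N
assert abs(s - s_rearranged) < 1e-9        # same finite set of terms
```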
Proof. We will prove that M is a linear manifold by using 2.3.10 and 3.1.4c. For
α, β ∈ K and f, g ∈ M , let {fn } and {gn } be sequences in M such that fn → f and
gn → g; then by 4.1.6b,c we have αfn + βgn → αf + βg; since αfn + βgn ∈ M , we
have αf + βg ∈ M .
Proof. The equality L{f } = {αf : α ∈ K} follows directly from 3.1.7. Then, in
view of 4.1.13, V {f } = L{f } is true if the set {αf : α ∈ K} is closed. If f = 0X ,
then {αf : α ∈ K} = {0X }, which is a closed set (cf. 2.3.5). Assume next f ≠ 0X .
Let {gn } be a sequence in {αf : α ∈ K} and g an element of X such that gn → g.
If βn is the element of K such that gn = βn f , the sequence {βn } turns out to be
a Cauchy sequence because {gn } is such (cf. 2.6.2) and f ≠ 0X . Since K is a
complete metric space, there exists β ∈ K such that βn → β, hence by 4.1.6c such
that gn → βf . Therefore, g = βf ∈ {αf : α ∈ K}. On account of 2.3.4, this proves
that {αf : α ∈ K} is a closed set.
4.1.16 Definition. Let X and Y be normed spaces over the same field, and denote
by νX and νY their norms. The function
ν : X × Y → R
(f, g) ↦ ν(f, g) := √(νX²(f ) + νY²(g))
is a norm for the linear space X + Y (cf. 3.1.9); in fact, properties no1 , no2 and
no3 of 4.1.1 follow immediately for ν from the same properties for νX and νY , using
also (for no1 ) the inequality
∀a1 , a2 , b1 , b2 ∈ C,
√(|a1 + b1 |² + |a2 + b2 |²) ≤ √(|a1 |² + |a2 |²) + √(|b1 |² + |b2 |²),
which will be proved in 10.3.8c.
The linear space X + Y with this norm ν is called the sum of the normed spaces
X and Y .
It can be seen immediately that dν = dνX × dνY . Hence, from 2.7.3d it follows
that the normed space X +Y is a Banach space iff X and Y are both Banach spaces.
4.2.1 Definition. Let X and Y be normed spaces over the same field and let
A ∈ O(X, Y ). The linear operator A is said to be bounded if it has the following
property:
∃m ∈ [0, ∞) such that ‖Af ‖ ≤ m‖f ‖ for all f ∈ DA .
For a linear operator, the importance of the condition of being bounded lies in
the fact that a linear operator is bounded iff it is continuous, as is shown by the
following theorem.
4.2.2 Theorem. Let X and Y be normed spaces over the same field. For a linear
operator A ∈ O(X, Y ), the following conditions are equivalent:
(a) A is bounded, i.e. ∃m ≥ 0 such that ‖Af ‖ ≤ m‖f ‖ for all f ∈ DA ;
(b) A is uniformly continuous;
(c) A is continuous;
(d) ∃f0 ∈ DA such that A is continuous at f0 .
Proof. a ⇒ b: Assume condition a and let ε > 0. Define δε := ε/(m + 1). Then we have
[f, g ∈ DA and ‖f − g‖ < δε ] ⇒
‖Af − Ag‖ = ‖A(f − g)‖ ≤ m‖f − g‖ < (m/(m + 1))ε < ε.
This proves that A is uniformly continuous.
b ⇒ c: This is obvious.
c ⇒ d: This is obvious.
d ⇒ a: Assume condition d. Then (setting ε := 1 in the condition of continuity of A at f0 and then δ := δ1 , cf. 2.4.1) ∃δ > 0 such that
[f ∈ DA and ‖f0 − f ‖ < δ] ⇒ ‖Af0 − Af ‖ < 1.
Then we have
g ∈ DA − {0X } ⇒ [(δ/(2‖g‖))g ∈ DA and ‖(δ/(2‖g‖))g‖ < δ] ⇒
[f0 + (δ/(2‖g‖))g ∈ DA and ‖f0 − (f0 + (δ/(2‖g‖))g)‖ < δ] ⇒
(δ/(2‖g‖))‖Ag‖ = ‖Af0 − A(f0 + (δ/(2‖g‖))g)‖ < 1 ⇒ ‖Ag‖ < (2/δ)‖g‖.
This proves condition a with m = 2/δ.
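To see that boundedness is a genuine restriction, here is a classical counterexample rendered numerically (ours, not in the text): in C(0, 2π) with the sup norm, the differentiation operator, defined on C¹ functions, admits no bound ‖Af ‖ ≤ m‖f ‖, since fn(x) = sin(nx) has sup norm 1 while its derivative has sup norm n.

```python
import numpy as np

# Differentiation is an unbounded operator for the sup norm:
# ||f_n||_sup = 1 but ||f_n'||_sup = n for f_n(x) = sin(nx).
x = np.linspace(0.0, 2.0 * np.pi, 20001)   # grid on [0, 2*pi]
for n in (1, 10, 100):
    fn  = np.sin(n * x)
    dfn = n * np.cos(n * x)                # exact derivative of sin(nx)
    ratio = np.max(np.abs(dfn)) / np.max(np.abs(fn))
    assert abs(ratio - n) < 1e-6           # the ratio ||Af||/||f|| equals n
print("||Af_n||/||f_n|| grows like n: no single bound m can work")
```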
4.2.3 Theorem. Let X and Y be normed spaces over the same field. For a linear
operator A ∈ O(X, Y ), the following conditions are equivalent:
(a) A is injective and A−1 is bounded;
(b) ∃k > 0 such that ‖Af ‖ ≥ k‖f ‖ for all f ∈ DA .
f ∈ NA ⇒ ‖Af ‖ = 0 ⇒ ‖f ‖ = 0 ⇒ f = 0X ,
so A is injective (cf. 3.2.6a). Moreover, for g ∈ DA−1 we have ‖g‖ = ‖A(A−1 g)‖ ≥ k‖A−1 g‖, and hence
‖A−1 g‖ ≤ (1/k)‖g‖, ∀g ∈ DA−1 ,
and this proves that A−1 is bounded.
4.2.4 Definition. Let X and Y be normed spaces over the same field, and let A be a bounded operator from X to Y . Then the set of non-negative real numbers
BA := {m ∈ [0, ∞) : ‖Af ‖ ≤ m‖f ‖, ∀f ∈ DA }
is non-empty, and we define the norm of A as
‖A‖ := inf BA .
4.2.5 Proposition. Let X and Y be normed spaces over the same field (with X 6=
{0X }), and let A be a bounded operator from X to Y . We have:
4.2.6 Theorem. Let X and Y be normed spaces over the same field, and let A be
a bounded operator from X to Y . Assume further that Y is a Banach space. Then
there exists one and only one operator à from X to Y with the following properties:
(a) DÃ = D̄A (the closure of DA in X);
(b) A ⊂ Ã;
(c) Ã is bounded.
We also have
(d) ‖Ã‖ = ‖A‖.
Thus, lim_{n→∞} Afn depends only on f and not on the choice of the sequence {fn } in DA , as long as fn → f . Therefore, we can define the mapping
Ã : D̄A → Y
f ↦ Ãf := lim_{n→∞} Afn if {fn } is a sequence in DA such that fn → f .
owing to 4.1.6a (used twice) and 4.2.5b. This proves that Ã has property c and that ‖A‖ ∈ BÃ , whence ‖Ã‖ ≤ ‖A‖. Since A ⊂ Ã, we also have ‖A‖ ≤ ‖Ã‖ by 4.2.5a. Thus, ‖Ã‖ = ‖A‖.
It remains to prove the uniqueness of Ã. Let B ∈ O(X, Y ) be such that DB = D̄A , A ⊂ B, B is bounded. Then, if f ∈ D̄A and {fn } is a sequence in DA such that fn → f , we have
Bf = lim_{n→∞} Bfn = lim_{n→∞} Afn = Ãf ,
4.2.7 Proposition. Let X and Y be normed spaces over the same field, and let
A, B be bounded operators from X to Y . Then the operator A + B is bounded, and
‖A + B‖ ≤ ‖A‖ + ‖B‖.
4.2.8 Proposition. Let X, Y be normed spaces over the same field K, and let A
be a bounded operator from X to Y . Then, for each α ∈ K, the operator αA is
bounded and ‖αA‖ = |α|‖A‖.
4.2.9 Proposition. Let X, Y, Z be normed spaces over the same field, and suppose
that A ∈ O(X, Y ) and B ∈ O(Y, Z) are bounded operators. Then the operator BA
is bounded and ‖BA‖ ≤ ‖B‖‖A‖.
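Propositions 4.2.7-4.2.9 can be sampled in the matrix realization of everywhere-defined operators on R⁴ (our sketch; for the Euclidean norm, the operator norm of a matrix is its largest singular value, computed here with numpy).

```python
import numpy as np

# Subadditivity, homogeneity, and submultiplicativity of the operator
# norm for matrices acting on R^4. (Random-sample illustration.)
rng = np.random.default_rng(3)
A, B = rng.normal(size=(2, 4, 4))
op = lambda M: np.linalg.norm(M, 2)        # spectral (operator) norm

assert op(A + B) <= op(A) + op(B) + 1e-12  # ||A + B|| <= ||A|| + ||B||  (4.2.7)
assert abs(op(3.0 * A) - 3.0 * op(A)) < 1e-12   # ||alpha A|| = |alpha| ||A||  (4.2.8)
assert op(B @ A) <= op(B) * op(A) + 1e-12  # ||BA|| <= ||B|| ||A||  (4.2.9)
```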
4.2.10 Definition. Let X and Y be normed spaces over the same field. We define
B(X, Y ) := {A ∈ OE (X, Y ) : A is bounded}.
For X = Y , we write B(X) := B(X, X).
4.2.11 Theorem. Let X and Y be normed spaces over the same field. We have:
(a) B(X, Y ) is a linear manifold in the linear space OE (X, Y ) (cf. 3.2.14) and the
function
νB : B(X, Y ) → R
A ↦ νB (A) := ‖A‖ := inf BA
is a norm for the linear space B(X, Y ); hence, B(X, Y ) is a normed space;
(b) if Y is a Banach space, then B(X, Y ) is also a Banach space.
Proof. a: On account of 4.2.7 and 4.2.8, conditions lm1 and lm2 of 3.1.3 hold for
B(X, Y ), and conditions no1 and no2 of 4.1.1 hold for νB . By 4.2.5b we also have,
for A ∈ B(X, Y ),
‖A‖ = 0 ⇒ [‖Af ‖ ≤ 0, ∀f ∈ X] ⇒ [Af = 0Y , ∀f ∈ X] ⇒ A = OX,Y ,
which proves that condition no3 of 4.1.1 holds for νB .
4.2.12 Remark. Let X and Y be normed spaces over the same field, and let {An }
be a sequence in B(X, Y ).
If {An } is convergent then, letting A := limn→∞ An , we have
∀f ∈ X, Af = lim An f,
n→∞
which is proved by
∀f ∈ X, ∀n ∈ N, ‖An f − Af ‖ ≤ ‖An − A‖‖f ‖.
This implies that, if the series ∑_{n=1}^∞ An is convergent, then for each f ∈ X the series ∑_{n=1}^∞ (An f ) is convergent and (∑_{n=1}^∞ An )f = ∑_{n=1}^∞ (An f ).
thus,
rn+1 ≤ 1/2^{n+1} ,
‖An+1 h‖ > n + 1 for all h ∈ B(hn+1 , rn+1 ),
B(hn+1 , rn+1 ) ⊂ B(hn , rn ).
n, m > k ⇒ hn , hm ∈ B(hk , rk ) ⇒ ‖hn − hm ‖ < 2rk ≤ 1/2^{k−1} ;
hence, the sequence {hn } is convergent since X is a complete metric space. We
define h := limn→∞ hn . For each k ∈ N we have
‖Ak h‖ ≥ k.
This proves that if proposition P is not true then the assumption of the statement
is not true.
Step 2: We prove that if P is true then the conclusion of the statement is true.
Thus, we assume that proposition P is true and we fix (g, r, n) ∈ X × (0, ∞) × N
which satisfies the condition of proposition P . For each f ∈ X such that ‖f ‖ < r,
we have g + f ∈ B(g, r) and hence
Proof. We use 2.4.2 and the following remarks. For (x, y) ∈ X × X and a sequence
(xn , yn ) in X × X we have:
dν (xn yn , xy) = ‖xn yn − xy‖ = ‖(xn yn − xyn ) + (xyn − xy)‖
≤ ‖xn − x‖‖yn ‖ + ‖x‖‖yn − y‖.
If (xn , yn ) → (x, y), then ‖xn − x‖ → 0 and ‖yn − y‖ → 0 by 2.7.3a. Besides, ‖yn − y‖ → 0 implies ‖yn ‖ → ‖y‖ by 4.1.6a, and hence the sequence {‖yn ‖} is bounded (cf. 2.1.9).
where we have used first 4.3.3 and then 4.1.6b. Then we have
(1 + ∑_{n=1}^∞ x^n )(1 − x) = 1 + ∑_{n=1}^∞ x^n − x − (∑_{n=1}^∞ x^n )x = 1.
In a similar way we can prove that (1 − x)(1 + ∑_{n=1}^∞ x^n ) = 1.
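The identity just obtained is the Neumann series for (1 − x)⁻¹ when ‖x‖ < 1. A numerical check (ours) in the Banach algebra of 4 × 4 real matrices: scale x to norm 1/2 and verify that the partial sums of 1 + ∑ xⁿ multiply 1 − x to 1 on both sides.

```python
import numpy as np

# Neumann series in B(R^4) realized as matrices: if ||x|| < 1 then
# 1 + x + x^2 + ... inverts 1 - x. (Truncated at 60 terms; the tail
# is bounded by ||x||^61.)
rng = np.random.default_rng(4)
x = rng.normal(size=(4, 4))
x *= 0.5 / np.linalg.norm(x, 2)            # force ||x|| = 1/2 < 1

I = np.eye(4)
S = I.copy()                                # partial sum 1 + x + ... + x^60
power = I.copy()
for _ in range(60):
    power = power @ x
    S += power

assert np.linalg.norm(S @ (I - x) - I, 2) < 1e-12
assert np.linalg.norm((I - x) @ S - I, 2) < 1e-12
```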
Proof. Condition sa1 of 3.3.2 has been proved for B(X) in 4.2.11a, and condition
sa2 follows from 4.2.9. Thus, B(X) is a subalgebra of OE (X), and therefore it is
also an associative algebra. By 4.2.11a, B(X) is also a normed space. On account of
4.2.9, for B(X) we also have property na of 4.3.1. Thus, B(X) is a normed algebra,
and it is with identity since 1X ∈ B(X) and ‖1X ‖ = 1 (unless X is a zero space).
Indeed we have
‖1X f ‖ = ‖f ‖ ≤ 1‖f ‖, ∀f ∈ X,
which proves that 1X ∈ B(X) and 1 ∈ B1X , and hence ‖1X ‖ ≤ 1 (cf. 4.2.4). By
4.2.5b we also have
‖f ‖ = ‖1X f ‖ ≤ ‖1X ‖‖f ‖, ∀f ∈ X,
and this implies 1 ≤ ‖1X ‖ if ∃f ∈ X s.t. f ≠ 0X .
Finally, if X is a Banach space then B(X) is also a Banach space by 4.2.11b.
4.3.6 Examples.
(a) For the associative algebra FB (X) (cf. 3.3.8b), define
ν : FB (X) → R
ϕ ↦ ν(ϕ) := ‖ϕ‖∞ := sup{|ϕ(x)| : x ∈ X}.
It is easy to see that ν is a norm for the linear space FB (X) and that ν has
property na of 4.3.1. Therefore, FB (X) is a normed algebra, and it is with
identity since ‖1X ‖∞ = 1.
Actually, FB (X) is a Banach algebra. In fact, let {ϕn } be a Cauchy sequence
in FB (X); then {ϕn (x)} is a Cauchy sequence in C for every x ∈ X, and hence
we can define the function
X ∋ x ↦ ϕ(x) := lim_{n→∞} ϕn (x) ∈ C;
now, for ε > 0, let Nε ∈ N be such that ‖ϕn − ϕm ‖∞ < ε for n, m > Nε ; then we have, for n > Nε ,
∀x ∈ X, |ϕn (x) − ϕ(x)| = lim_{m→∞} |ϕn (x) − ϕm (x)| ≤ ε.
ϕ = (ϕ − ϕn ) + ϕn ∈ FB (X)
since FB (X) is a linear manifold in F (X), and second that
‖ϕn − ϕ‖∞ ≤ ε.
As a consequence, the sequence {ϕn } is convergent and its limit is ϕ. Thus,
FB (X) is a Banach space.
(b) Let (X, d) be a metric space. Then CB (X) is a subalgebra with identity of
FB (X) (cf. 3.3.8c) and hence it is an associative algebra with identity. Further,
CB (X) is a closed subset of the Banach space FB (X). Indeed, let {ϕn } be a
sequence in CB (X), let ϕ ∈ FB (X), and suppose that ‖ϕn − ϕ‖∞ → 0; for each x ∈ X and each ε > 0, let nε ∈ N be such that ‖ϕnε − ϕ‖∞ < ε/3, and let δx,ε > 0 be such that |ϕnε (x) − ϕnε (y)| < ε/3 whenever d(x, y) < δx,ε ; then we have
d(x, y) < δx,ε ⇒ |ϕ(x) − ϕ(y)| ≤ |ϕ(x) − ϕnε (x)| + |ϕnε (x) − ϕnε (y)| + |ϕnε (y) − ϕ(y)| < ε;
this shows that ϕ ∈ C(X), and hence that ϕ ∈ CB (X). Since CB (X) is a
closed linear manifold in the Banach space FB (X), CB (X) is a Banach space
(cf. 4.1.8a).
Thus, CB (X) is a Banach algebra with identity (cf. 4.3.2).
(c) Let T be the unit circle in the complex plane (T is also called the one-
dimensional torus), i.e. we define
T := {z ∈ C : |z| = 1}.
Proof. We will show that, for each ϕ ∈ C(T), there exists a sequence {pn } in P such that ‖pn − ϕ‖∞ → 0. By 2.3.12, this will prove that P̄ = C(T).
Suppose we have a sequence {qn } in P such that each qn has the following properties:
(a) 0 ≤ qn (z), ∀z ∈ T;
(b) ∫_{−π}^{π} qn (e^{is} )ds = 1 (by ∫_{−π}^{π} ...ds we denote a Riemann integral, cf. 9.3.2);
(c) for each δ ∈ (0, π), if mn (δ) := sup{qn (e^{it} ) : δ ≤ |t| ≤ π}, then lim_{n→∞} mn (δ) = 0.
Then, for each n ∈ N, we define the function
pn : T → C
z ↦ pn (z) := ∫_{−π}^{π} ϕ(e^{i(t−s)} )qn (e^{is} )ds if t ∈ (−π, π] is so that z = e^{it} .
where in the first equality we have made the change of variables s ↦ −s, in the second the change of variables s ↦ s − t, and the third holds because the integrand is a periodic function of period 2π.
Thus, pn ∈ P.
Let now ε > 0 be given. Since the function [−π, π] ∋ t ↦ ϕ(e^{it} ) ∈ C is continuous, it is uniformly continuous (cf. 2.8.7 and 2.8.15). Therefore, the function R ∋ t ↦ ϕ(e^{it} ) ∈ C is uniformly continuous since it is periodic of period 2π. Thus, ∃δε > 0 s.t. |ϕ(e^{it} ) − ϕ(e^{is} )| < ε whenever |t − s| < δε . For z ∈ T and t ∈ (−π, π] so that z = e^{it} , by property b we have
pn (z) − ϕ(z) = ∫_{−π}^{π} (ϕ(e^{i(t−s)} ) − ϕ(e^{it} ))qn (e^{is} )ds
and property a implies, assuming 0 < δε < π,
|pn (z) − ϕ(z)| ≤ ∫_{−π}^{π} |ϕ(e^{i(t−s)} ) − ϕ(e^{it} )|qn (e^{is} )ds = ∫_{−π}^{−δε} ...ds + ∫_{−δε}^{δε} ...ds + ∫_{δε}^{π} ...ds;
now, we have
Z δǫ Z π
i(t−s) it is
|ϕ(e ) − ϕ(e )|qn (e )ds ≤ ǫ qn (eis )ds = ǫ
−δǫ −π
since |(t − s) − t| < δǫ when s ∈ (−δǫ , δǫ ) and since qn has property b, and also
∫_{−π}^{−δǫ} |ϕ(e^{i(t−s)}) − ϕ(e^{it})| qn(e^{is}) ds + ∫_{δǫ}^{π} |ϕ(e^{i(t−s)}) − ϕ(e^{it})| qn(e^{is}) ds
≤ 2‖ϕ‖∞ mn(δǫ) 2(π − δǫ) < 4π‖ϕ‖∞ mn(δǫ),
where the definition of mn (δǫ ) has been used; thus we have
|pn(z) − ϕ(z)| ≤ ǫ + 4π‖ϕ‖∞ mn(δǫ).
Since this estimate is independent of z, we have
‖pn − ϕ‖∞ ≤ ǫ + 4π‖ϕ‖∞ mn(δǫ).
Recalling property c, let Nǫ ∈ N be such that 4π‖ϕ‖∞ mn(δǫ) < ǫ whenever Nǫ < n;
then we have
Nǫ < n ⇒ ‖pn − ϕ‖∞ < 2ǫ.
This proves that ‖pn − ϕ‖∞ → 0.
It remains to construct a sequence {qn} in P with properties a, b, c. Let
qn(z) := (γn/4^n)(2 + z + z^{−1})^n for every z ∈ T, with γn so that condition b
is satisfied. For z ∈ T and t ∈ (−π, π] so that z = e^{it}, we have
qn(z) = γn ((1 + cos t)/2)^n, with γn := (∫_{−π}^{π} ((1 + cos t)/2)^n dt)^{−1}.
4.4.1 Definition. Let X and Y be normed spaces over the same field and let
A ∈ O(X, Y ). The linear operator A is said to be closed if its graph GA is a closed
subset of the product of the two metric spaces X, Y (cf. 4.1.3 and 2.7.2), i.e. a
subspace of the normed space X + Y (cf. 3.2.15 and 4.1.16). From 2.3.4 and 2.7.3a
we have that A is closed iff the following condition is satisfied:
[f ∈ X, g ∈ Y, {fn } is a sequence in DA , fn → f, Afn → g] ⇒
[f ∈ DA and g = Af ].
This condition can be written in the equivalent way:
[{fn } is a sequence in DA that is convergent in X and
{Afn } is a convergent sequence in Y ] ⇒
[lim_{n→∞} fn ∈ DA and A(lim_{n→∞} fn) = lim_{n→∞} Afn].
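A standard illustration (an assumption of this note, not part of the book's text at this point, though consistent with it) is the differentiation operator, which is closed but not bounded:

```latex
% Illustrative example: X = Y = C[0,1] with the sup norm.
\[
  \|f\| := \sup_{t\in[0,1]}|f(t)|, \qquad Af := f', \qquad D_A := C^1[0,1].
\]
% A is closed: if f_n \to f and Af_n = f_n' \to g uniformly, then
% f \in C^1[0,1] and f' = g, so the graph of A is closed.
% A is unbounded: for f_n(t) := \sin(nt) one has \|f_n\| \le 1
% while \|Af_n\| = n \to \infty.
```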
4.4.2 Remark. Let X and Y be normed spaces over the same field. For a linear
operator A ∈ O(X, Y ) we have that A is bounded iff the following condition is
satisfied (cf. 4.2.2 and 2.4.2):
[f ∈ DA , {fn } is a sequence in DA , fn → f ] ⇒ [Afn → Af ].
This condition can be written in the equivalent way:
[{fn} is a sequence in DA that is convergent in X and lim_{n→∞} fn ∈ DA] ⇒
[{Afn} is a convergent sequence in Y and A(lim_{n→∞} fn) = lim_{n→∞} Afn].
Thus, for both a bounded (i.e. continuous on account of 4.2.2) operator and a closed
one there are conditions, for a convergent sequence {fn } in their domains, which
allow one to “commute the operator with the limit”. However, while for a bounded
operator A one must assume limn→∞ fn ∈ DA in order to obtain that the sequence
{Afn } is convergent, for a closed operator A one must assume that the sequence
{Afn } is convergent in order to obtain limn→∞ fn ∈ DA .
The interplay between the concepts of bounded operator and closed operator is
studied in 4.4.3, 4.4.4, 4.4.6.
4.4.3 Theorem. Let X and Y be normed spaces over the same field, and suppose
A ∈ O(X, Y ). If A is bounded and DA is closed, then A is closed.
Proof. The implications
[f ∈ X, g ∈ Y, {fn} is a sequence in DA, fn → f, Afn → g] (1)⇒ f ∈ DA (2)⇒ [Afn → Af, and hence g = Af]
are true, where 1 is true by 2.3.4 because DA is closed (notice that the condition "{Afn} is
a convergent sequence in Y" plays no role) and 2 is true because A is continuous
(cf. 4.2.2). Thus, A is closed.
4.4.4 Theorem. Let X be a normed space, Y a Banach space over the same field,
and A ∈ O(X, Y ). If A is bounded and closed, then DA is closed.
Proof. Assume A bounded and closed. First, we notice that, if {fn } is a sequence in
DA that is convergent in X, then {Afn } is a Cauchy sequence since kAfn − Afm k ≤
kAkkfn − fm k, and therefore {Afn } is a convergent sequence in Y since Y is a
complete metric space. Then, DA is closed by 2.3.4 since the following implications
are true:
[f ∈ X, {fn} is a sequence in DA, fn → f] ⇒ [{Afn} is a convergent sequence in Y] ⇒ f ∈ DA,
where the last implication holds because A is closed.
We state the following theorem without giving its proof, which can be found e.g.
in Chapter 10 of (Royden, 1988), since we shall use neither this theorem nor its
corollary. We prove less general versions of this theorem and of its corollary in
12.2.3 and in 13.1.9, for an operator in a Hilbert space.
4.4.5 Theorem (Closed graph theorem). Let X and Y be Banach spaces over
the same field. If A ∈ OE (X, Y ) (for OE (X, Y ), cf. 3.2.12) and A is closed, then
A is bounded.
4.4.6 Corollary. Let X and Y be Banach spaces over the same field, and suppose
A ∈ O(X, Y ). If DA is closed and A is closed, then A is bounded.
Assuming X, Y Banach spaces over the same field and A ∈ O(X, Y ), from 4.4.3,
4.4.4 and 4.4.6 we see that, if two of the three conditions A bounded, A closed,
DA closed are true, then the remaining one is true as well. This rounds off our
examination of the interplay between the concepts of bounded operator and closed
operator. However, we shall not use either 4.4.5 or 4.4.6 for general closed operators
in general Banach spaces, and this is the reason why we have not provided a proof
of the closed graph theorem (which, moreover, would require preliminary results
outside the scope of this book). Anyway, as already mentioned, the closed graph
theorem and its corollary will be proved in 12.2.3 and in 13.1.9 respectively, for
operators in Hilbert spaces.
4.4.7 Proposition. Let X and Y be normed spaces over the same field, and suppose
A ∈ O(X, Y ) and A injective. Then A is closed iff A−1 is closed.
4.4.8 Proposition. Let X and Y be normed spaces over the same field, and let
A ∈ O(X, Y ). If A is closed then NA is a subspace of X.
4.4.9 Proposition. Let X and Y be normed spaces over the same field, and suppose
A ∈ O(X, Y ), B ∈ B(X, Y ), A closed. Then A + B is closed.
Proof. Let {fn } be a sequence in DA+B and assume that there exists (f, g) ∈ X ×Y
so that fn → f and (A + B)fn → g. Since B ∈ B(X, Y ), we have Bfn → Bf and
hence Afn → g − Bf . Since fn ∈ DA for each n ∈ N and A is closed, this implies
f ∈ DA and g − Bf = Af , i.e. f ∈ DA+B and g = (A + B)f . This proves that
A + B is closed.
4.4.10 Definition. Let X and Y be normed spaces over the same field, and let
A ∈ O(X, Y ). The linear operator A is said to be closable if the closure (in the
product metric space X × Y or equivalently in the normed space X + Y) ḠA of its
graph is the graph of a mapping. If A is closable, then the mapping which has ḠA
as its graph is a closed linear operator from X to Y (cf. 4.1.12 and 3.2.15a) which
is called the closure of A and is denoted by Ā. Clearly, A ⊂ Ā (cf. 1.2.5) and Ā is
the smallest closed operator that contains A: if B is a closed operator that contains
A, then GA ⊂ GB, and hence GĀ = ḠA ⊂ ḠB = GB, and hence Ā ⊂ B (cf. 1.2.5).
If A is closable, GĀ = ḠA means that (cf. 2.3.10):
DĀ = {f ∈ X : there exists a sequence {fn} in DA s.t. fn → f and {Afn} is convergent},
∀f ∈ DĀ, Āf = lim_{n→∞} Afn if {fn} is a sequence in DA s.t. fn → f.
4.4.11 Proposition. Let X and Y be normed spaces over the same field, and let
A ∈ O(X, Y ). Then:
(a) A is closable iff the following condition holds
(0X, g) ∈ ḠA ⇒ g = 0Y;
(b) A is closable iff ∃B ∈ O(X, Y ) such that B is closed and A ⊂ B.
4.4.12 Proposition. Let X be a normed space and Y a Banach space over the same
field. Let A ∈ O(X, Y ), and suppose A bounded. Then A is closable, DĀ = D̄A and
Ā is bounded.
Proof. Let Ã be the element of O(X, Y ) such that DÃ = D̄A, A ⊂ Ã, Ã is bounded
(cf. 4.2.6).
Since Ã is bounded and DÃ is closed, Ã is closed by 4.4.3. Then, since A ⊂ Ã,
A is closable by 4.4.11b and we also have Ā ⊂ Ã.
For f ∈ DÃ, in view of DÃ = D̄A and 2.3.10 there is a sequence {fn} in DA s.t.
fn → f, and hence (since Ã is continuous) also s.t. Ãfn → Ãf. Since A ⊂ Ā ⊂ Ã,
the sequence {fn} is also in DĀ and Āfn = Ãfn; since Ā is closed, this implies
f ∈ DĀ. Thus, DÃ ⊂ DĀ and therefore Ā = Ã.
4.4.13 Proposition. Let X and Y be normed spaces over the same field, let
A ∈ O(X, Y ), and suppose A injective and closable. Then A−1 is closable iff Ā
is injective. If these conditions are satisfied, then the closure of A−1 is (Ā)−1.
4.4.14 Proposition. Let X and Y be normed spaces over the same field, and
suppose A ∈ O(X, Y ), B ∈ B(X, Y ), A closable. Then A + B is closable and
the closure of A + B is Ā + B.
In this section, which contains little more than definitions, X denotes a normed
space over K and A denotes an operator in X, i.e. A ∈ O(X).
Proof. Since X is a non-zero Banach space, B(X) is a Banach algebra with identity
(cf. 4.3.5). For λ ∈ K so that ‖A‖ < |λ|, we have ‖(1/λ)A‖ < 1. Then, by 4.3.4, the
series Σ_{n=1}^{∞} ((1/λ)A)^n is convergent in B(X) and
(1X + Σ_{n=1}^{∞} ((1/λ)A)^n)(1X − (1/λ)A) = (1X − (1/λ)A)(1X + Σ_{n=1}^{∞} ((1/λ)A)^n) = 1X,
and hence
[−(1/λ)(1X + Σ_{n=1}^{∞} ((1/λ)A)^n)](A − λ1X) = (A − λ1X)[−(1/λ)(1X + Σ_{n=1}^{∞} ((1/λ)A)^n)] = 1X.
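The Neumann-series resolvent used in the proof above can be checked numerically in finite dimension. In the sketch below (an illustration only: the 3×3 matrix A stands in for an element of B(X), and the truncation length 60 is an arbitrary choice), the partial sums of −(1/λ) Σ ((1/λ)A)^n really invert A − λ·1X once |λ| > ‖A‖.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
lam = 2 * np.linalg.norm(A, 2)          # guarantees ||(1/lam) A|| = 1/2 < 1

S = np.eye(3)                           # partial sum of sum_{n>=0} (A/lam)^n
term = np.eye(3)
for _ in range(60):
    term = term @ (A / lam)
    S += term

R = -(1.0 / lam) * S                    # candidate for (A - lam*1)^{-1}
err = np.linalg.norm(R @ (A - lam * np.eye(3)) - np.eye(3))
print(err)  # essentially zero: R inverts A - lam*1
```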
4.5.12 Proposition. Suppose that X is a Banach space and that A is closed. Then
RA−λ1X = X for each λ ∈ ρ(A).
Proof. Since A is closed, A − λ1X is closed for each λ ∈ K (cf. 4.4.9). Thus,
if λ ∈ ρ(A) then (A − λ1X )−1 is closed (cf. 4.4.7) and bounded. Since X is a
Banach space, this implies that D(A−λ1X)−1 = RA−λ1X is closed (cf. 4.4.4); since
λ ∈ ρ(A) also implies that RA−λ1X is dense in X, we conclude that RA−λ1X = X.
4.5.13 Remark. Some define the resolvent set and the spectrum of A in a different
way than we did in 4.5.1, by letting the resolvent set of A be the set
ρ′ (A) := {λ ∈ K : A − λ1X is injective and (A − λ1X )−1 ∈ B(X)}
and letting the spectrum of A be the set
σ ′ (A) := K − ρ′ (A).
However, these definitions are not very useful for non-closed operators. In fact, if
ρ′(A) ≠ ∅ then there is λ ∈ K so that (A − λ1X)−1 exists and (A − λ1X)−1 ∈ B(X),
hence (A − λ1X )−1 is closed by 4.4.3, hence A − λ1X is closed by 4.4.7, hence A
is closed by 4.4.9. This proves that ρ′ (A) = ∅, and hence σ ′ (A) = K, if A is not
closed. Thus, the spectrum as defined above is always trivially the same for all non-
closed operators, even when they are closable. This is not true with our definition
of spectrum, as indicated by 4.5.11 (cf. also 12.4.25).
If X is a Banach space then the definitions given above are actually equivalent to
ours, for closed operators. In fact, 4.5.12 proves that ρ(A) ⊂ ρ′ (A) if A is closed
and X is a Banach space, and hence ρ(A) = ρ′ (A) since ρ′ (A) ⊂ ρ(A) is obvious.
If an isomorphism from X1 onto X2 exists, then the two normed spaces X1 and X2
are said to be isomorphic.
If the two normed spaces X1 and X2 are the same, an isomorphism from X1
onto X2 is called an automorphism of X1 .
4.6.2 Remarks.
(a) In 4.6.1, condition in1 means that U is an “isomorphism” from the set X1 onto
the set X2 (it preserves the set theoretical “operations”, i.e. union, intersec-
tion, complementation), conditions in1 and in2 mean that U is an isomorphism
from the linear space (X1 , σ1 , µ1 ) onto the linear space (X2 , σ2 , µ2 ) (actually,
condition in2 says that U is a linear operator), and condition in3 says that U
preserves the norm. We laid down condition in1 the way we did in order to
make it clear from the outset that an isomorphism preserves the three level
structure of a normed space. However, in in1 we could have asked only that
U be surjective onto X2, since U is a linear operator by condition in2 and
NU = {0X1} holds by condition in3, and hence U is injective by 3.2.6a.
(b) It is obvious (also in view of 3.2.6b) that, if U is an isomorphism from a normed
space X1 onto a normed space X2 , then the inverse mapping U −1 (i.e. the linear
operator U −1 ) is an isomorphism from X2 onto X1 ; and also that, if V is an
isomorphism from X2 onto a third normed space X3 , then the composition V ◦U
(i.e. the product V U of the linear operators U and V ) is an isomorphism from
X1 onto X3 .
(c) For any normed space X, the identity mapping idX (i.e. the linear operator
1X , cf. 3.2.5) is obviously an automorphism of X. It is immediate to see, also
in view of remark b, that the family of all automorphisms of X is a group,
with the product of operators as group product, the identity mapping as group
identity, the inverse mapping as group inverse.
(d) If U is an isomorphism from a normed space (X1 , σ1 , µ1 , ν1 ) onto a normed
space (X2, σ2, µ2, ν2), then
dν2(Uf, Ug) = ν2(Uf − Ug) = ν2(U(f − g)) = ν1(f − g) = dν1(f, g), ∀f, g ∈ X1.
Thus, U is an isomorphism from the metric space (X1, dν1) onto the metric
space (X2, dν2).
(e) As remarked above, an isomorphism U from a normed space X1 onto a normed
space X2 is a linear operator. It is obvious that U is a bounded operator, i.e.
U ∈ B(X1, X2). Thus, U is a continuous mapping (cf. 4.2.2; this was in fact
already clear from remark d).
4.6.3 Proposition. Let X1 and X2 be isomorphic normed spaces over the same
field, and let U be an isomorphism from X1 onto X2 . The mapping
TU : X1 + X1 → X2 + X2
(f, g) ↦ TU(f, g) := (Uf, Ug)
is an isomorphism from the normed space X1 + X1 onto the normed space X2 + X2
(cf. 4.1.16).
For each linear operator A ∈ O(X1 ) we have TU (GA ) = GUAU −1 .
4.6.4 Proposition. Let X1 and X2 be isomorphic normed spaces over the same
field, let A ∈ O(X1 ) and B ∈ O(X2 ), and let U be an isomorphism from X1 onto
X2 . The following conditions are equivalent:
(a) B = U AU −1 ;
(b) A = U −1 BU ;
(c) BU = U A;
(d) AU −1 = U −1 B;
(e) DA = U −1 (DB ) and Bf = U AU −1 f , ∀f ∈ DB ;
(f ) DB = U (DA ) and Ag = U −1 BU g, ∀g ∈ DA ;
(g) GB = TU (GA ) (for TU , cf. 4.6.3).
If these conditions are satisfied then:
(h) RB = U (RA );
(i) DB = U (DA ) and RB = U (RA ).
4.6.5 Theorem. Let X1 and X2 be isomorphic normed spaces over the same field
K. Let A ∈ O(X1 ), B ∈ O(X2 ) and suppose that there exists an isomorphism U
from X1 onto X2 so that B = U AU −1 . Then:
(a) A is injective iff B is injective; if A and B are injective then B −1 = U A−1 U −1 ;
(b) A is bounded iff B is bounded; if A and B are bounded then kBk = kAk;
(c) A is closed iff B is closed;
(d) A is closable iff B is closable; if A and B are closable, then B = U AU −1 ;
(e) NB = U (NA );
(f ) B − λ1X2 = U (A − λ1X1 )U −1 , ∀λ ∈ K;
(g) σ(B) = σ(A);
(h) Apσ(B) = Apσ(A);
(i) σp (B) = σp (A).
Proof. a: Everything follows from 1.2.17 and 1.2.14B (cf. also the equivalence
between conditions a and b in 4.6.4)
b: If A is bounded then
‖Bf‖2 = ‖UAU−1f‖2 = ‖AU−1f‖1 ≤ ‖A‖ ‖U−1f‖1 = ‖A‖ ‖f‖2, ∀f ∈ DB
(cf. 4.2.5b). This proves that B is bounded and kBk ≤ kAk. Since A = U −1 BU
(cf. the equivalence between conditions a and b in 4.6.4), by the same token it can
be proved that if B is bounded then A is bounded and kAk ≤ kBk.
c: Since TU is an isomorphism from X1 + X1 onto X2 + X2 as metric spaces (cf.
4.6.3 and 4.6.2d) and since GB = TU (GA ) (cf. the equivalence between conditions
a and g in 4.6.4), 2.3.21b implies that GA is closed iff GB is closed.
d: Since TU is an isomorphism from X1 + X1 onto X2 + X2 as metric spaces and
since GB = TU (GA ), 2.3.21a implies that GB = TU (GA ). Then, if B is closable we
have
where the second equality holds by 3.2.10b′2 since DU = X1 and the last by 3.2.10b3.
g: Let λ ∈ K. We have that
RB−λ1X2 = U (RA−λ1X1 ),
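In finite dimension the invariance statements of 4.6.4 and 4.6.5 can be checked numerically. The sketch below is an illustration only (a unitary 3×3 matrix U plays the role of an isometric isomorphism of C³, and A is an arbitrarily chosen matrix): it verifies BU = UA, ‖B‖ = ‖A‖, and the equality of the spectra.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
U = Q                                       # unitary, hence an isometric isomorphism of C^3

B = U @ A @ np.linalg.inv(U)                # B = U A U^{-1}

assert np.allclose(B @ U, U @ A)                               # condition (c) of 4.6.4
assert np.isclose(np.linalg.norm(B, 2), np.linalg.norm(A, 2))  # (b) of 4.6.5: ||B|| = ||A||
eigA, eigB = np.linalg.eigvals(A), np.linalg.eigvals(B)
assert all(np.min(np.abs(eigB - lam)) < 1e-8 for lam in eigA)  # (i): sigma(B) = sigma(A)
print("ok")
```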
4.6.6 Proposition. Let X1 and X2 be Banach spaces over the same field. Suppose
that there exists a linear operator V from X1 to X2 , i.e. V ∈ O(X1 , X2 ), such that
DV dense in X1, RV dense in X2, ‖V f‖ = ‖f‖ for all f ∈ DV.
Then there exists a unique operator U ∈ B(X1, X2) such that V ⊂ U. The operator
U is an isomorphism from X1 onto X2. Moreover, the operator U−1 is the unique
element of B(X2, X1) such that V−1 ⊂ U−1, or equivalently such that
U−1(V f) = f, ∀f ∈ DV.
Proof. From 4.2.6 we have that there exists a unique operator U ∈ B(X1 , X2 ) such
that V ⊂ U . Clearly, condition in2 of 4.6.1 holds true for U .
Now we fix f ∈ X1 and let {fn } be a sequence in DV such that fn → f (cf.
2.3.12). Then,
Uf = lim_{n→∞} V fn
and hence
‖Uf‖ = lim_{n→∞} ‖V fn‖ = lim_{n→∞} ‖fn‖ = ‖f‖
(cf. 4.1.6a). Since f was an arbitrary element of X1 , this proves condition in3 of
4.6.1 for U . Moreover we fix g ∈ X2 and let {gn } be a sequence in DV such that
V gn → g. Then {gn} is a Cauchy sequence because
‖gn − gm‖ = ‖V gn − V gm‖, ∀n, m ∈ N,
and hence {gn} is convergent since X1 is complete; since U is continuous, we have
U(lim_{n→∞} gn) = lim_{n→∞} V gn = g.
Since g was an arbitrary element of X2, this proves that RU = X2.
Chapter 5

The Extended Real Line
There are many situations in integration theory where one finds it unavoidable to
deal with infinity. For instance, one wants to be able to integrate over sets of infinite
measure. Moreover, even if one is only interested in real-valued functions, the least
upper bound or the sum of a sequence of positive real-valued functions may well be
infinite at some points. More generally, there are a number of idiomatic expressions
about real functions where the word infinity and the symbol ∞ are used, even when
these two things have not been given a definite status.
This brief chapter is devoted to the extended real line, which is a way to organize
the various rules according to which infinity is dealt with in real analysis, and in
particular in the next chapters about measure and integration theory.
[a, b] := {x ∈ R∗ : a ≤ x ≤ b}.
Note that some of these sets can be empty (e.g. (a, b) = ∅ if b ≤ a) and that
R = (−∞, ∞), R∗ = [−∞, ∞], [0, ∞] = [0, ∞) ∪ {∞}.
Given a non-empty set X, for two functions ϕ : X → R∗ and ψ : X → R∗ we write
ϕ ≤ ψ if ϕ(x) ≤ ψ(x) for all x ∈ X.
δ : R∗ × R∗ → R
(a, b) ↦ δ(a, b) := dR(ϕ−1(a), ϕ−1(b)) (= |ϕ−1(a) − ϕ−1(b)|, cf. 2.1.4).
(a) The function δ is a distance on R∗ (R∗ will always be regarded as the first
element of the metric space (R∗ , δ)).
(b) For a sequence {an } in R∗ we have:
(b1 ) an → ∞ iff ∀m ∈ R, ∃Nm ∈ N such that n > Nm ⇒ an > m;
(b2 ) an → −∞ iff ∀m ∈ R, ∃Nm ∈ N such that n > Nm ⇒ an < m;
(b3 ) for a ∈ R,
an → a iff ∀ǫ > 0, ∃Nǫ ∈ N such that n > Nǫ ⇒ a − ǫ < an < a + ǫ.
(c) A sequence {an } in R is convergent in the metric subspace (R, δR ) (i.e. it is
convergent in the metric space (R∗ , δ) and limn→∞ an ∈ R) iff it is convergent
in the metric space (R, dR ), and in case of convergence the two limits are equal.
(d) The topology (i.e. the family of all open sets) of the metric subspace (R, δR ) is
the same as the topology of the metric space (R, dR ).
(e) The metric subspace (R, δR ) is not complete, and one of its completions is the
pair ((R∗ , δ), idR ).
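The metric δ can be exercised numerically. In the sketch below, ϕ−1 is taken to be arctan extended by ϕ−1(±∞) = ±π/2 (an assumption of this demo, since the definition of ϕ is not reproduced in this excerpt); the demo then illustrates b1 and the agreement of δ-convergence with ordinary convergence on R.

```python
import math

def phi_inv(a):
    # assumed: arctan extended to R* = [-inf, inf], with values +-pi/2 at +-inf
    if a == math.inf:
        return math.pi / 2
    if a == -math.inf:
        return -math.pi / 2
    return math.atan(a)

def delta(a, b):
    # the distance on R* defined above
    return abs(phi_inv(a) - phi_inv(b))

# (b1): a_n -> infinity in (R*, delta) exactly when a_n eventually exceeds every m
d = [delta(10.0 ** n, math.inf) for n in range(1, 8)]
print(d)  # strictly decreasing toward 0

# (c)/(d): on R, delta-closeness agrees with ordinary closeness
assert delta(1.0, 1.0 + 1e-9) < 1e-8
```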
Proof. a: The properties of 2.1.1 for δ follow directly from the same properties for
dR .
b: Let {an } be a sequence in R∗ .
b1: We have
δ(an, ∞) → 0 ⇔ dR(ϕ−1(an), π/2) → 0 ⇔
[∀ǫ ∈ (0, π), ∃Nǫ ∈ N s.t. n > Nǫ ⇒ (an = ∞ or (an ∈ R and arctan an > π/2 − ǫ))] (1)⇔
(∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an > m).
Indeed:
(1)⇒: for m ∈ R, put ǫ := π/2 − arctan m; then
n > Nm := Nǫ ⇒ [an = ∞ or (an ∈ R and arctan an > arctan m, i.e. an > m)];
(1)⇐: for ǫ ∈ (0, π), put m := tan(π/2 − ǫ); then
n > Nǫ := Nm ⇒ an > tan(π/2 − ǫ) ⇒ [an = ∞ or (an ∈ R and arctan an > π/2 − ǫ)].
b2 : The proof is analogous to the one given for b1 .
b3: For a ∈ R we have
δ(an, a) → 0 (2)⇔
[∃k ∈ N s.t. an ∈ R for n > k and dR(ϕ−1(ak+n), ϕ−1(a)) → 0] (3)⇔
[∃k ∈ N s.t. an ∈ R for n > k and dR(ak+n, a) → 0] (4)⇔
(∀ǫ > 0, ∃Nǫ ∈ N s.t. n > Nǫ ⇒ a − ǫ < an < a + ǫ).
Indeed:
(2)⇒: since η := min{δ(a, −∞), δ(a, ∞)} > 0, there exists k ∈ N such that an ∈ R
for n > k, and then dR(ϕ−1(ak+n), ϕ−1(a)) = δ(ak+n, a) → 0;
(4)⇒: for ǫ > 0, let nǫ ∈ N be such that n > nǫ ⇒ a − ǫ < ak+n < a + ǫ;
let then Nǫ := k + nǫ;
(4)⇐: set for instance k := N1 and notice that dR(an, a) → 0 implies trivially
dR(ak+n, a) → 0.
inf_{m≥1} inf_{k≥1} am,k = inf_{k≥1} inf_{m≥1} am,k.
Proof. We have
∀(n, l) ∈ N × N, sup_{m≥1} sup_{k≥1} am,k ≥ sup_{k≥1} an,k ≥ an,l,
and hence
∀l ∈ N, sup_{m≥1} sup_{k≥1} am,k ≥ sup_{m≥1} am,l,
and hence
sup_{m≥1} sup_{k≥1} am,k ≥ sup_{k≥1} sup_{m≥1} am,k.
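A finite analogue of this interchange of iterated suprema and infima can be checked directly; the 50×50 array below stands in for the doubly indexed family {am,k} (an illustration only, since the proposition concerns arbitrary families in R∗).

```python
import random

random.seed(0)
a = [[random.random() for _ in range(50)] for _ in range(50)]

sup_sup_rows = max(max(row) for row in a)         # sup_m sup_k a_{m,k}
sup_sup_cols = max(max(col) for col in zip(*a))   # sup_k sup_m a_{m,k}
inf_inf_rows = min(min(row) for row in a)         # inf_m inf_k a_{m,k}
inf_inf_cols = min(min(col) for col in zip(*a))   # inf_k inf_m a_{m,k}

assert sup_sup_rows == sup_sup_cols               # both equal the global maximum
assert inf_inf_rows == inf_inf_cols               # both equal the global minimum
print("ok")
```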
If an+1 ≤ an for each n ∈ N, then {an } is convergent in the metric space (R, dR )
iff there exists m ∈ R such that m ≤ an for each n ∈ N. In case of convergence we
have
lim_{n→∞} an = inf_{n≥1} an.
Proof. Suppose an ≤ an+1 for each n ∈ N (the proof is analogous in the other
case). If there exists m ∈ R s.t. an ≤ m for each n ∈ N, then s := supn≥1 an is an
element of R since an ≤ s ≤ m for each n ∈ N by the definition of l.u.b.. Then, by
the same token,
∀ǫ > 0, ∃Nǫ ∈ N s.t. s − ǫ < aNǫ .
This implies
∀ǫ > 0, ∃Nǫ ∈ N s.t. n > Nǫ ⇒ s − ǫ < an ≤ s,
Proof. Suppose an ≤ an+1 for each n ∈ N (the proof is analogous in the other
case).
If ∃m ∈ R s.t. an ≤ m for each n ∈ N, then the result follows from 5.2.4 and
5.2.1c.
If ∀m ∈ R, ∃Nm ∈ N s.t. m < aNm , then
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an > m,
hence an → ∞ by 5.2.1b1; also, ∞ is the only upper bound for {an : n ∈ N}, hence
∞ = sup{an : n ∈ N} = sup_{n≥1} an.
Proof. Clearly, we have bn+1 ≤ bn for each n ∈ N. Hence 5.2.5 implies that
the sequence {bn } is convergent and limn→∞ bn = inf n≥1 bn . There are now three
possibilities.
If limn→∞ an = ∞, then (cf. 5.2.1b1)
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an > m ⇒ bn > m,
and this proves that limn→∞ bn = ∞.
If limn→∞ an = −∞, then (cf. 5.2.1b2 )
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an < m − 1,
and therefore
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ bn ≤ m − 1 < m,
and this proves that limn→∞ bn = −∞.
5.3.2 Remarks.
(a) Note that, for a, b ∈ R∗, a ≤ b iff −b ≤ −a. As can be easily seen, this implies
that, if for S ⊂ R∗ we write −S := {−a : a ∈ S}, then
sup(−S) = − inf S and inf(−S) = − sup S.
(b) For a non-empty subset S of [0, ∞] and c ∈ (0, ∞), writing cS := {ca : a ∈ S}
we have sup cS = c sup S and inf cS = c inf S. In fact, we have
∀a ∈ S, ca ≤ c sup S,
and, for m ∈ R∗,
[ca ≤ m, ∀a ∈ S] ⇒ [a ≤ (1/c)m, ∀a ∈ S] ⇒ sup S ≤ (1/c)m ⇒ c sup S ≤ m,
and this proves that sup cS = c sup S. For the other equality we can proceed in
a similar way.
(c) As can be checked directly, if a, b ∈ R∗ are such that a ≤ b, then a + c ≤ b + c
for every c ∈ R∗ for which the sums are defined. Moreover, for a non-empty
subset S of R∗ and c ∈ R, writing S + c := {a + c : a ∈ S} we have
sup(S + c) = sup S + c and inf(S + c) = inf S + c. In fact, we have
∀a ∈ S, a + c ≤ sup S + c
and, for m ∈ R∗ ,
[a + c ≤ m, ∀a ∈ S] ⇒ [a ≤ m − c, ∀a ∈ S] ⇒ sup S ≤ m − c ⇒ sup S + c ≤ m,
and this proves that sup(S + c) = sup S + c. For the other equality we can
proceed in a similar way.
(d) From remark b it follows that, if a1 , a2 , b1 , b2 ∈ [0, ∞] are such that ai ≤ bi for
i = 1, 2, then
a1 a2 ≤ b 1 a2 ≤ b 1 b 2 .
5.3.3 Remark. In [0, ∞] both the product and the sum are defined without re-
straints, and they are commutative and associative. Owing to this, for elements of
[0, ∞] we will write a1 + a2 + a3 := a1 + (a2 + a3) and Σ_{k=1}^{n} ak := a1 + ... + an; we
will also write Σ_{i∈I} ai to denote the sum of a finite family {ai}i∈I of elements of
[0, ∞].
By a straightforward check we see that
a(b + c) = ab + ac, ∀a, b, c ∈ [0, ∞].
5.3.4 Proposition. Suppose that {an } and {bn } are sequences in [0, ∞] such that
an ≤ an+1 and bn ≤ bn+1 for each n ∈ N, and let a ∈ [0, ∞]. Then the sequences
{an + bn } , {an bn } and {abn } are convergent in the metric space (R∗ , δ) and
lim_{n→∞}(an + bn) = sup_{n≥1}(an + bn) = sup_{n≥1} an + sup_{n≥1} bn = lim_{n→∞} an + lim_{n→∞} bn,
Proof. Recall (cf. 5.2.5) that {an } and {bn } are convergent and limn→∞ an =
supn≥1 an and limn→∞ bn = supn≥1 bn .
If both limn→∞ an and limn→∞ bn are elements of R, then what we want to
prove follows from 5.2.1c and from the continuity of the sum and the product in R.
Thus, in what follows we assume e.g. limn→∞ an = ∞.
We have limn→∞ an + limn→∞ bn = ∞; since
an + bn ≤ an+1 + bn+1 and an ≤ an + bn , ∀n ∈ N,
(cf. 5.3.2e), the sequence {an + bn } is convergent and
lim_{n→∞}(an + bn) = sup_{n≥1}(an + bn) ≥ sup_{n≥1} an = ∞,
and hence
lim_{n→∞}(an + bn) = ∞ = lim_{n→∞} an + lim_{n→∞} bn.
and hence
lim_{n→∞}(an bn) = ∞ = (lim_{n→∞} an)(lim_{n→∞} bn).
Finally, the sequence {abn } is the sequence {an bn } if an := a for all n ∈ N.
5.4.1 Definition. Let {an} be a sequence in [0, ∞]. For each n ∈ N, define sn :=
Σ_{k=1}^{n} ak. The sequence {sn} is called the series of the an's and is denoted by the
symbol Σ_{n=1}^{∞} an. Since sn ≤ sn+1 for each n ∈ N, 5.2.5 implies that the sequence
{sn} is convergent in the metric space (R∗, δ) and lim_{n→∞} sn = sup_{n≥1} sn. Then,
lim_{n→∞} sn is called the sum of the series of the an's and is denoted by the same
symbol Σ_{n=1}^{∞} an as the series, i.e.
Σ_{n=1}^{∞} an := lim_{n→∞} sn = sup_{n≥1} sn
(these definitions are in agreement with the ones given in 2.1.10).
If an ∈ R for each n ∈ N, then lim_{n→∞} sn can be either ∞ or an element of R.
If lim_{n→∞} sn ∈ R, then {sn} converges to Σ_{n=1}^{∞} an also in the metric space (R, dR)
(cf. 5.2.1c). Clearly, lim_{n→∞} sn ∈ R iff Σ_{n=1}^{∞} an < ∞ (we will always use the latter
expression).
5.4.2 Remarks.
(a) Let {an } and {bn } be sequences in [0, ∞], and suppose an ≤ bn for each n ∈ N.
Then, by induction applied to 5.3.2e,
∀n ∈ N, Σ_{k=1}^{n} ak ≤ Σ_{k=1}^{n} bk ≤ sup_{n≥1} Σ_{k=1}^{n} bk =: Σ_{n=1}^{∞} bn,
whence
Σ_{n=1}^{∞} an := sup_{n≥1} Σ_{k=1}^{n} ak ≤ Σ_{n=1}^{∞} bn.
(b) For a sequence {an} in [0, ∞], letting sn := Σ_{k=1}^{n} ak we have
Σ_{n=1}^{∞} an < ∞ (1)⇔
[∃m ∈ [0, ∞) s.t. sn ≤ m, ∀n ∈ N] (2)⇔
[an ∈ [0, ∞), ∀n ∈ N, and {sn} is convergent in (R, dR)] (3)⇔
[an ∈ [0, ∞), ∀n ∈ N, and the series Σ_{n=1}^{∞} an is convergent in the normed space R (cf. 4.1.4)],
5.4.3 Proposition. Let {an } be a sequence in [0, ∞] and let β be a bijection from
N onto N. Then
Σ_{n=1}^{∞} aβ(n) = Σ_{n=1}^{∞} an.
For this reason, for a countable family {bn}n∈I of elements of [0, ∞], we will write
Σ_{n∈I} bn to denote the sum or the series of the bn's, with no need to specify the
order.
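A finite floating-point analogue of this rearrangement invariance can be checked directly (an illustration only: the proposition concerns infinite series in [0, ∞]; here `math.fsum` is used because it returns the exactly rounded sum of the terms, so reordering changes nothing).

```python
import math
import random

random.seed(0)
a = [random.random() for _ in range(10000)]   # nonnegative terms a_n
perm = list(range(10000))
random.shuffle(perm)                          # a bijection beta of the index set

s1 = math.fsum(a)                             # sum of a_n
s2 = math.fsum(a[p] for p in perm)            # sum of a_{beta(n)}
assert s1 == s2                               # fsum is exactly rounded: exact equality
print(s1 == s2)
```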
5.4.4 Corollary. Let {an} be a sequence in [0, ∞) and let β be a bijection from N
onto N. Then the series Σ_{n=1}^{∞} aβ(n) is convergent in the normed space R (cf. 4.1.4
and 4.1.5) iff the series Σ_{n=1}^{∞} an is convergent in the normed space R, and in case
of convergence the two sums are equal.
5.4.5 Proposition. Let {an } be a sequence in [0, ∞] and a ∈ [0, ∞]. Then
Σ_{n=1}^{∞}(a an) = a Σ_{n=1}^{∞} an.
Proof. If a = ∞, then the two sides are zero if an = 0 for each n ∈ N, otherwise
the two sides are ∞.
Assuming now a ∈ [0, ∞), we have (using 5.3.2b)
Σ_{n=1}^{∞}(a an) := sup{Σ_{k=1}^{n}(a ak) : n ∈ N} = sup{a Σ_{k=1}^{n} ak : n ∈ N}
= a sup{Σ_{k=1}^{n} ak : n ∈ N} = a Σ_{n=1}^{∞} an.
5.4.6 Proposition. Let {an } and {bn } be two sequences in [0, ∞]. Then
Σ_{n=1}^{∞}(an + bn) = Σ_{n=1}^{∞} an + Σ_{n=1}^{∞} bn.
If Σ_{n=1}^{∞}(an + bn) = ∞, this proves the equality of the statement. Assume next
Σ_{n=1}^{∞}(an + bn) < ∞; this implies Σ_{n=1}^{∞} an < ∞ and Σ_{n=1}^{∞} bn < ∞ since e.g.
∀n ∈ N, Σ_{k=1}^{n} ak ≤ Σ_{k=1}^{n}(ak + bk) ≤ Σ_{n=1}^{∞}(an + bn);
thus, all three series of the statement are convergent in the normed space R (cf.
5.4.2b), and the equality of the statement holds by the continuity of the sum in R.
Fix also M ∈ N and let K := max σ−1({1, ..., N} × {1, ..., M}); we have
Σ_{m=1}^{M} Σ_{n=1}^{N} an,m ≤ Σ_{l=1}^{K} aσ(l) ≤ Σ_{n=1}^{∞} aσ(n).
Thus,
Σ_{n=1}^{∞} aσ(n) = Σ_{n=1}^{∞} (Σ_{m=1}^{∞} an,m).
Finally, let bn,m := am,n and denote by γ the bijection from N × N onto N × N
defined by γ(n, m) := (m, n). Then γ ◦ σ is a bijection from N onto N × N and
Σ_{n=1}^{∞} aσ(n) = Σ_{n=1}^{∞} b(γ◦σ)(n) = Σ_{n=1}^{∞} (Σ_{m=1}^{∞} bn,m)
= Σ_{n=1}^{∞} (Σ_{m=1}^{∞} am,n) = Σ_{m=1}^{∞} (Σ_{n=1}^{∞} an,m).
∀n ∈ N, the series Σ_{m=1}^{∞} an,m is convergent in the normed space R;
the series Σ_{n=1}^{∞} (Σ_{m=1}^{∞} an,m) is convergent in the normed space R.
In case of convergence, for the sums of the two series we have the equality
Σ_{n=1}^{∞} aσ(n) = Σ_{n=1}^{∞} (Σ_{m=1}^{∞} an,m).
Thus, both the convergence of the series Σ_{n=1}^{∞} aσ(n) and its sum do not depend on
the particular bijection σ used.
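The order-independence of double sums with nonnegative terms can be illustrated with a finite truncation (an illustration only; the choice an,m := 2^{−n} 3^{−m} is arbitrary, picked because its double sum factorizes as (Σ 2^{−n})(Σ 3^{−m}) = 1 · (1/2) = 1/2).

```python
import math

N = 60
# sum over n first, then m ("row first"), and in the opposite order
row_first = math.fsum(math.fsum(2.0 ** -n * 3.0 ** -m for m in range(1, N))
                      for n in range(1, N))
col_first = math.fsum(math.fsum(2.0 ** -n * 3.0 ** -m for n in range(1, N))
                      for m in range(1, N))
print(row_first, col_first)  # both very close to 0.5
```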
5.4.9 Proposition. Let {an,k }(n,k)∈N×N be a family of elements of [0, ∞] such that
an,k ≤ an+1,k for each (n, k) ∈ N × N. Then the sequence {Σ_{k=1}^{∞} an,k} is convergent
in the metric space (R∗, δ) and
lim_{n→∞} Σ_{k=1}^{∞} an,k = sup_{n≥1} Σ_{k=1}^{∞} an,k = Σ_{k=1}^{∞} (sup_{n≥1} an,k)
= Σ_{k=1}^{∞} (lim_{n→∞} an,k).
i.e.
sup_{n≥1} sup_{N≥1} (Σ_{k=1}^{N} an,k) = sup_{N≥1} (Σ_{k=1}^{N} (sup_{n≥1} an,k));
for each n ∈ N, since an,k ≤ an+1,k for all n ∈ N and for k = 1, ..., N , we can apply
induction to 5.3.4 to obtain
Σ_{k=1}^{N} (sup_{n≥1} an,k) = sup_{n≥1} Σ_{k=1}^{N} an,k;
thus, letting sN,n := Σ_{k=1}^{N} an,k, what we need to prove is
sup_{n≥1} sup_{N≥1} sN,n = sup_{N≥1} sup_{n≥1} sN,n,
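A finite truncation of 5.4.9 can be inspected numerically (an illustration only: the family an,k := (1 − 1/n)·2^{−k}, nondecreasing in n with sup_n an,k = 2^{−k}, is an arbitrary choice, and the truncation bounds are assumptions of the demo).

```python
K = 50

def row_sum(n):
    # sum_k a_{n,k} for a_{n,k} = (1 - 1/n) * 2^{-k}
    return sum((1 - 1 / n) * 2.0 ** -k for k in range(1, K))

sup_of_sums = max(row_sum(n) for n in range(1, 2000))   # sup_n sum_k a_{n,k}
sum_of_sups = sum(2.0 ** -k for k in range(1, K))       # sum_k sup_n a_{n,k}
print(sup_of_sums, sum_of_sups)  # the left side approaches the right from below
```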
5.4.10 Remark. Let {fn } be a sequence in a normed space and suppose that the
series Σ_{n=1}^{∞} fn is convergent (cf. 2.1.10). Then,
‖Σ_{n=1}^{∞} fn‖ ≤ Σ_{n=1}^{∞} ‖fn‖.
If Σ_{n=1}^{∞} ‖fn‖ = ∞, this is obvious. Otherwise, it is proved by
‖Σ_{n=1}^{N} fn‖ ≤ Σ_{n=1}^{N} ‖fn‖ ≤ Σ_{n=1}^{∞} ‖fn‖, ∀N ∈ N,
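The norm inequality for series can be illustrated in the normed space R², with an arbitrarily chosen convergent series fn := ((−0.9)^n, 1/n²) (an assumption of this sketch, not an example from the text).

```python
import numpy as np

# f_n = ((-0.9)^n, 1/n^2): both coordinate series converge absolutely
fs = [np.array([(-0.9) ** n, 1.0 / n ** 2]) for n in range(1, 200)]

lhs = np.linalg.norm(sum(fs))             # || sum_n f_n ||
rhs = sum(np.linalg.norm(f) for f in fs)  # sum_n ||f_n||
print(lhs, rhs)  # lhs <= rhs, as the remark asserts
```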
Chapter 6

Measurable Sets and Functions
Although this chapter deals with measurable sets and functions, its contents are
purely set-theoretic: in this chapter there is still no measure in view. The reason
for the adjective “measurable” lies in the following facts, which will be seen in later
chapters: a measure is a function defined on a family of measurable sets and an
integral is a concept which is consistent only for measurable functions.
Proof. The proof is by induction. For n = 1, the statement follows at once from
the definition of semialgebra. Assume then that the statement is true for n = m and
consider a disjoint family {E1 , ..., Em , Em+1 } of elements of S. Then there exists a
finite and disjoint family {Fi }i∈I of elements of S such that
X − ∪_{k=1}^{m} Ek = ∪_{i∈I} Fi.
Since Em+1 ∈ S, there exists also a finite and disjoint family {Gj }j∈J of elements
of S such that
X − Em+1 = ∪j∈J Gj .
6.1.7 Remark. It is clear from al4 , al3 , al2 that, for an algebra, conditions
sa1 , sa2 , sa3 of 6.1.1 are satisfied. Thus, an algebra is also a semialgebra. For
any non-empty set X, the collections {∅, X} and P(X) are algebras on X.
Then:
(a) Fk ∩ Fl = ∅ if k ≠ l,
(b) ∪_{n=1}^{N} Fn = ∪_{n=1}^{N} En, ∀N ∈ N,
(c) ∪_{n=1}^{∞} Fn = ∪_{n=1}^{∞} En,
(d) if A0 is an algebra on X and En ∈ A0 for all n ∈ N, then Fn ∈ A0 for all
n ∈ N.
x ∈ ∪_{n=1}^{N} En ⇒ [∃k ∈ {1, ..., N} s.t. x ∈ Ek and x ∉ Ei for i < k] ⇒
[∃k ∈ {1, ..., N} s.t. x ∈ Fk] ⇒ x ∈ ∪_{n=1}^{N} Fn.
6.1.10 Theorem.
(a) Let F be a family of subsets of X. Then there exists a unique algebra on X,
which is called the algebra on X generated by F and is denoted by A0 (F ), such
that
(ga1 ) F ⊂ A0 (F ),
(ga2 ) if A0 is an algebra on X and F ⊂ A0 , then A0 (F ) ⊂ A0 .
If F is the empty family, then A0 (F ) = {∅, X}.
(b) Let F1 and F2 be families of subsets of X. If F1 ⊂ F2 or F1 ⊂ A0 (F2 ), then
A0 (F1 ) ⊂ A0 (F2 ).
Proof. It is obvious that C ⊂ A0 (S), since S ⊂ A0 (S) and A0 (S) has property al5 .
Since it is also obvious that S ⊂ C, if we can prove that C is an algebra on X then
we can conclude that A0 (S) ⊂ C by property ga2 of A0 (S).
Now, let E, F be two elements of C and let {E1 , ..., En }, {F1 , ..., Fm } be two
disjoint families of elements of S such that E = ∪_{k=1}^{n} Ek and F = ∪_{l=1}^{m} Fl. From
6.1.4 it follows that there exists a finite and disjoint family {Bj }j∈J of elements of
S such that each element in the family {E1 , ..., En , F1 , ..., Fm } can be obtained as
the union of a subfamily of {Bj }j∈J ; it is then clear that E ∪ F too can be obtained
in this way, and this implies that E ∪ F ∈ C. Thus, C has property al1 of 6.1.5.
Moreover, if {E1 , ..., En } is a disjoint family of elements of S, then from 6.1.2 it
follows that X − ∪_{k=1}^{n} Ek can be obtained as the union of a finite and disjoint family
of elements of S. This proves that C has property al2 of 6.1.5.
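The generated algebra A0(F) of 6.1.10 can be computed by brute force on a small finite X (a sketch under the obvious assumptions: close F together with ∅ and X under complements and finite unions until nothing new appears).

```python
X = frozenset(range(4))
F = [frozenset({0}), frozenset({1, 2})]

A = {frozenset(), X} | set(F)
changed = True
while changed:
    changed = False
    for E in list(A):
        for G in list(A):
            for new in (X - E, E | G):   # complement and union (al2, al1)
                if new not in A:
                    A.add(new)
                    changed = True

assert all(X - E in A for E in A)              # closed under complementation
assert all(E | G in A for E in A for G in A)   # closed under finite unions
print(len(A))  # 8: all unions of the atoms {0}, {1,2}, {3}
```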
The pair (X, A), where X is a non-empty set and A is a σ-algebra on X, is said to
be a measurable space and the elements of A are called the measurable subsets of X
(the reason for these names is that a measure is a function defined on a σ-algebra).
(En ∈ A, ∀n ∈ N) ⇒ (X − En ∈ A, ∀n ∈ N) ⇒
X − ∩_{n=1}^{∞} En = ∪_{n=1}^{∞} (X − En) ∈ A ⇒
∩_{n=1}^{∞} En = X − (X − ∩_{n=1}^{∞} En) ∈ A.
6.1.15 Remark. For any non-empty set X, the collections {∅, X} and P(X) are
σ-algebras on X.
(En ∈ A, ∀n ∈ N, ∀A ∈ Λ) ⇒ (∪_{n=1}^∞ En ∈ A, ∀A ∈ Λ) ⇒ ∪_{n=1}^∞ En ∈ ∩_{A∈Λ} A.
6.1.17 Theorem.
Proof. The proof of the present statement is a slight modification of the proof of
6.1.10.
Proof. Since F ⊂ A0 (F ) (cf. ga1 ), we have A(F ) ⊂ A(A0 (F )) (cf. 6.1.17b). Since
A(F ) is an algebra on X and F ⊂ A(F ) (cf. gσ1 ), we have A0 (F ) ⊂ A(F ) (cf.
ga2 ), whence A(A0 (F )) ⊂ A(F ) (cf. 6.1.17b).
Proof. From the inclusion F ⊂ A(F ) (cf. gσ1 ) we have F^Y ⊂ (A(F ))^Y , and hence
A(F^Y ) ⊂ (A(F ))^Y by property gσ2 for A(F^Y ) (since (A(F ))^Y is a σ-algebra on Y
by 6.1.19).
To prove the opposite inclusion, we define the collection A of subsets of X by
A := {E ∈ P(X) : E ∩ Y ∈ A(F^Y )}.
Proof. Since
E ∈ Kd ⇒ X − E ∈ Td ⇒ X − E ∈ A(d) ⇒ E = X − (X − E) ∈ A(d),
we have Kd ⊂ A(d), whence A(Kd ) ⊂ A(d) (cf. 6.1.17b).
Since
E ∈ Td ⇒ X − E ∈ Kd ⇒ X − E ∈ A(Kd ) ⇒ E = X − (X − E) ∈ A(Kd ),
we have Td ⊂ A(Kd ), whence A(d) ⊂ A(Kd ).
6.1.24 Proposition. Consider the measurable spaces (R, A(dR )) (cf. 2.1.4),
(R∗ , A(δ)) (cf. 5.2.1), and (C, A(dC )) (cf. 2.7.4a). We have on R the three
σ-algebras A(dR ), (A(δ))^R , (A(dC ))^R (the last symbol is consistent since we identify R with the subset {(a, 0) : a ∈ R} of C). However,
A(dR ) = (A(δ))^R = (A(dC ))^R .
Then we have
A(dR ) = A(In ) for n = 1, ..., 8.
If we define
I9 := I4 ∪ I5 ∪ I8 ,
then I9 is a semialgebra on R and
A(dR ) = A(I9 ).
Proof. From 2.3.16 and 2.3.17 it follows that every element of TdR is the union of a
countable family of open balls. Now, the family of open balls in (R, dR ) is I1 . Thus,
TdR ⊂ A(I1 )
by property σa1 of A(I1 ), since I1 ⊂ A(I1 ) (cf. gσ1 ). Therefore (cf. 6.1.17b),
A(dR ) := A(TdR ) ⊂ A(I1 ).
Next, we notice that:
∀a, b ∈ R, (a, b) = ∪_{n=1}^∞ [a + 1/n, b), hence I1 ⊂ A(I2), whence A(I1) ⊂ A(I2),
∀a, b ∈ R, [a, b) = ∪_{n=1}^∞ [a, b − 1/n], hence I2 ⊂ A(I3), whence A(I2) ⊂ A(I3),
∀a, b ∈ R, [a, b] = ∩_{n=1}^∞ (a − 1/n, b], hence I3 ⊂ A(I4), whence A(I3) ⊂ A(I4),
∀a, b ∈ R, (a, b] = (−∞, b] ∩ (R − (−∞, a]), hence I4 ⊂ A(I5), whence A(I4) ⊂ A(I5),
∀a ∈ R, (−∞, a] = ∩_{n=1}^∞ (−∞, a + 1/n), hence I5 ⊂ A(I6), whence A(I5) ⊂ A(I6),
∀a ∈ R, (−∞, a) = R − [a, ∞), hence I6 ⊂ A(I7), whence A(I6) ⊂ A(I7),
∀a ∈ R, [a, ∞) = ∩_{n=1}^∞ (a − 1/n, ∞), hence I7 ⊂ A(I8), whence A(I7) ⊂ A(I8),
∀a ∈ R, (a, ∞) ∈ TdR , i.e. I8 ⊂ TdR , whence A(I8) ⊂ A(dR).
This proves that A(dR ) = A(I1 ) = ... = A(I8 ).
As to I9 , recall that (a, b] := ∅ if b < a. Thus I9 has the property sa1 of 6.1.1,
and it is immediate to check that it has properties sa2 and sa3 as well. Moreover,
from
I4 ⊂ A(dR ), I5 ⊂ A(dR ), I8 ⊂ A(dR ),
we have I9 ⊂ A(dR ), whence A(I9 ) ⊂ A(dR ). And from I4 ⊂ I9 we have A(dR ) =
A(I4 ) ⊂ A(I9 ). This proves that A(dR ) = A(I9 ).
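The interval identities used in the proof above, e.g. (a, b) = ∪_{n=1}^∞ [a + 1/n, b), can be spot-checked numerically. The sketch below (our own illustration, truncating the countable union at finitely many values of n) compares membership in (a, b) with membership in the truncated union for a few sample points.

```python
def in_open(x, a, b):  # x ∈ (a, b)
    return a < x < b

def in_union_halfopen(x, a, b, N=10**6):
    # x ∈ [a + 1/n, b) for some sampled n ≤ N — a finite truncation of the
    # countable union ∪_{n≥1} [a + 1/n, b) used in the proof
    return any(a + 1.0 / n <= x < b for n in (1, 2, 5, 10, 100, 10**4, N))

a, b = 0.0, 1.0
samples = [-0.5, 0.0, 1e-5, 0.3, 0.999, 1.0, 1.5]
agree = [in_open(x, a, b) == in_union_halfopen(x, a, b) for x in samples]
print(agree)
```

The two membership tests agree on every sample point, including the endpoints a and b, which belong to neither side.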
which proves that {{−∞} , {∞}} ⊂ A(I2∗ ). Thus we have I1∗ ⊂ A(I2∗ ), whence
A(I1∗ ) ⊂ A(I2∗ ).
Since I2∗ ⊂ I ∗ , we also have A(I2∗ ) ⊂ A(I ∗ ).
[a, ∞] = ∩_{n=1}^∞ (a − 1/n, ∞] ∈ A(I2∗), ∀a ∈ R,
and this proves that I2∗ ⊂ A(I3∗) and I3∗ ⊂ A(I2∗), and hence that A(I2∗) = A(I3∗).
Proof. a: We have
∅ × · · · × ∅ = ∅;
∀E1 × · · · × EN , F1 × · · · × FN ∈ SN ,
(E1 × · · · × EN ) ∩ (F1 × · · · × FN ) = (E1 ∩ F1 ) × · · · × (EN ∩ FN );
∀E1 × · · · × EN ∈ SN ,
(X1 × · · · × XN ) − (E1 × · · · × EN )
= ∪_{k=1}^N (X1 × · · · × Xk−1 × (Xk − Ek ) × Xk+1 × · · · × XN ).
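For N = 2 the displayed complement formula can be verified exhaustively on small finite sets; the following Python check (an illustration of ours, not part of the text) compares the two sides as plain sets of pairs.

```python
from itertools import product

X1, X2 = {1, 2, 3}, {"a", "b"}
E1, E2 = {1, 2}, {"a"}

full = set(product(X1, X2))
rect = set(product(E1, E2))
lhs = full - rect
# ∪_{k=1}^2 X1 × ... × (Xk − Ek) × ... × XN with N = 2
rhs = set(product(X1 - E1, X2)) | set(product(X1, X2 - E2))
print(lhs == rhs)  # → True
```

Note that, as written in the text, the union on the right-hand side need not be disjoint: both terms contain (3, "b") in this example.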
Then
A(d1 ) ⊗ · · · ⊗ A(dN ) ⊂ A(d).
If (Xk , dk ) is separable for each k ∈ {1, ..., N }, then
A(d1 ) ⊗ · · · ⊗ A(dN ) = A(d).
By 6.1.29 we have
A(d1 ) ⊗ · · · ⊗ A(dN ) = A(G).
It is easy to see that ρ is a distance on X since properties di1 , di2 , di3 of 2.1.1 for the
function ρ follow immediately from the corresponding properties for the functions
dk . It is also immediate to see that, for a sequence {(x1,n , ..., xN,n )} in X and for
(x1 , ..., xN ) ∈ X,
ρ((x1,n , ..., xN,n ), (x1 , ..., xN )) → 0 as n → ∞ ⇔
(dk (xk,n , xk ) → 0 as n → ∞, ∀k ∈ {1, ..., N }) ⇔
d((x1,n , ..., xN,n ), (x1 , ..., xN )) → 0 as n → ∞.
By 2.3.4, this implies Kρ = Kd , and hence Tρ = Td .
Suppose now that (Xk , dk ) is separable for each k ∈ {1, ..., N }. Then, by 2.7.3c,
(X, d) is separable. Since Kρ = Kd , (X, ρ) is separable as well. By 2.3.17, this
implies that every element of Tρ is a countable union of open balls in X defined
with respect to ρ. But for any such ball Bρ ((x1 , ..., xN ), r) we have (denoting by
Bdk (xk , r) a ball in Xk defined with respect to dk ):
Bρ((x1, ..., xN), r) = Bd1(x1, r) × · · · × BdN(xN, r)
= ∩_{k=1}^N πk^{−1}(Bdk(xk, r)) ∈ A(G).
This proves the inclusion Tρ ⊂ A(G), i.e. Td ⊂ A(G), and therefore also the inclusion
A(d) := A(Td ) ⊂ A(G). In view of the inclusion A(G) ⊂ A(d) proved above, this
shows that
A(d1 ) ⊗ · · · ⊗ A(dN ) = A(G) = A(d).
show that G ⊂ A(R), and this implies that A(G) ⊂ A(R). Thus, we have A(dC ) =
A(G) = A(R).
Proof. Since a σ-algebra is always a monotone class and A0 ⊂ A(A0 ) (cf. gσ1
in 6.1.17), we have C(A0 ) ⊂ A(A0 ) (cf. gm2 ). Hence it is sufficient to show that
C(A0 ) is a σ-algebra, because then from A0 ⊂ C(A0 ) (cf. gm1 ) we can derive
A(A0 ) ⊂ C(A0 ) (cf. gσ2 in 6.1.17). Besides, if we prove that C(A0 ) is an algebra on
X, then for any sequence {En} in C(A0) we have ∪_{n=1}^N En ∈ C(A0) for each N ∈ N,
and hence ∪_{n=1}^∞ En = ∪_{N=1}^∞ (∪_{n=1}^N En) ∈ C(A0) by property mo1 of C(A0), and we
can conclude that C(A0) is a σ-algebra on X.
For each E ∈ C(A0 ) we define a collection C(E) of subsets of X by
C(E) := {F ∈ C(A0 ) : E − F, F − E, E ∩ F ∈ C(A0 )} .
Clearly, ∅ ∈ C(E) (since ∅ ∈ A0 ⊂ C(A0 )) and hence C(E) is not empty. Moreover,
if {Fn } is a sequence in C(E) such that Fn ⊂ Fn+1 for each n ∈ N, then:
E − (∪_{n=1}^∞ Fn) = E ∩ (∩_{n=1}^∞ (X − Fn)) = ∩_{n=1}^∞ (E − Fn) ∈ C(A0)
by property mo2 of C(A0 ), since E − Fn+1 ⊂ E − Fn for each n ∈ N;
(∪_{n=1}^∞ Fn) − E = ∪_{n=1}^∞ (Fn − E) ∈ C(A0)
by property mo1 of C(A0 ), since Fn − E ⊂ Fn+1 − E for each n ∈ N;
E ∩ (∪_{n=1}^∞ Fn) = ∪_{n=1}^∞ (E ∩ Fn) ∈ C(A0)
by property mo1 of C(A0 ), since E ∩ Fn ⊂ E ∩ Fn+1 for each n ∈ N.
This proves property mo1 for C(E). Property mo2 can be proved for C(E) in a
similar way. Thus, C(E) is a monotone class.
For each E ∈ A0 , it is clear (from 6.1.6 and A0 ⊂ C(A0 )) that A0 ⊂ C(E), so
that C(A0 ) ⊂ C(E) (cf. gm2 ). Hence, for each F ∈ C(A0 ), we have
E ∈ A0 ⇒ F ∈ C(E) ⇒ E ∈ C(F ),
where the second implication follows from the symmetry of the definition of C(E).
This proves that, for each F ∈ C(A0 ), A0 ⊂ C(F ) and hence C(A0 ) ⊂ C(F ) (cf.
gm2 ). Thus, if E, F ∈ C(A0 ) then E ∈ C(F ) and hence F − E and F ∩ E are
elements of C(A0 ). Since X ∈ A0 ⊂ C(A0 ), this implies that C(A0 ) is an algebra on
X:
if E ∈ C(A0 ), then X − E ∈ C(A0 );
if E, F ∈ C(A0 ), then X − E, X − F ∈ C(A0 ), and then
E ∪ F = X − ((X − E) ∩ (X − F )) ∈ C(A0 ).
This completes the proof.
6.2.1 Definition. Let (X1 , A1 ) and (X2 , A2 ) be measurable spaces, i.e. let X1 , X2
be non-empty sets and let A1 , A2 be σ-algebras on X1 , X2 respectively. A mapping
ϕ : X1 → X2 is said to be measurable w.r.t. (with respect to) A1 and A2 (or, simply,
measurable when no confusion can occur) if the following condition holds:
ϕ−1 (E) ∈ A1 , ∀E ∈ A2 .
6.2.3 Proposition. Let (X1 , A1 ), (X2 , A2 ) be measurable spaces, and suppose that
ϕ : X1 → X2 is a measurable mapping w.r.t. A1 and A2 . If Y is a non-empty
subset of X1 , the restriction ϕY of ϕ to Y is measurable w.r.t. AY1 and A2 .
6.2.4 Proposition. Let (X1 , A1 ) and (X2 , A2 ) be measurable spaces. For a map-
ping ϕ : X1 → X2 , let Y be a subset of X2 such that Rϕ ⊂ Y . Then the final set X2
can be replaced by Y , i.e. we can consider ϕ as ϕ : X1 → Y (cf. 1.2.1). However,
the following are equivalent conditions:
(a) ϕ is measurable w.r.t. A1 and A2 ;
(b) ϕ is measurable w.r.t. A1 and AY2 .
6.2.5 Theorem. Let (X1 , A1 ), (X2 , A2 ), (X3 , A3 ) be measurable spaces, and let
ϕ : X1 → X2 be a measurable mapping w.r.t. A1 and A2 and ψ : X2 → X3 a
measurable mapping w.r.t. A2 and A3 . Then ψ ◦ ϕ is a measurable mapping w.r.t
A1 and A3 .
Proof. Use the definition of measurable mapping (cf. 6.2.1) and (cf. 1.2.13f)
(ψ ◦ ϕ)−1 (S) = ϕ−1 (ψ −1 (S)), ∀S ∈ P(X3 ).
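The identity (ψ ◦ ϕ)^{−1}(S) = ϕ^{−1}(ψ^{−1}(S)) on which the proof rests can be checked directly on finite maps; the Python sketch below (an illustration with maps of our own choosing) does so for specific ϕ, ψ and S.

```python
def preimage(f, domain, S):
    """The preimage f^{-1}(S) of S under f, restricted to the given domain."""
    return {x for x in domain if f(x) in S}

X1 = range(10)
phi = lambda x: x % 4   # ϕ : X1 → X2 = {0, 1, 2, 3}
psi = lambda y: y // 2  # ψ : X2 → X3 = {0, 1}
S = {1}                 # a subset of X3

lhs = preimage(lambda x: psi(phi(x)), X1, S)          # (ψ ∘ ϕ)^{-1}(S)
rhs = preimage(phi, X1, preimage(psi, range(4), S))   # ϕ^{-1}(ψ^{-1}(S))
print(sorted(lhs))  # → [2, 3, 6, 7]
```

Both sides pick out exactly the points x with x % 4 ∈ {2, 3}.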
6.2.9 Proposition. Let N ∈ N and let (Xk , Ak ), (Yk , Bk ) be measurable spaces for
k = 1, ..., N . For k = 1, ..., N , let ϕk : Xk → Yk be a measurable mapping w.r.t. Ak
and Bk .
(a) The mapping
ϕ1 × · · · × ϕN : X1 × · · · × XN → Y1 × · · · × YN
(x1 , ..., xN ) 7→ (ϕ1 × · · · × ϕN )(x1 , ..., xN )
:= (ϕ1 (x1 ), ..., ϕN (xN ))
is measurable w.r.t. A1 ⊗ · · · ⊗ AN and B1 ⊗ · · · ⊗ BN .
(b) If (Z, C) is a measurable space and ρ : Y1 × · · · × YN → Z is a measurable
mapping w.r.t. B1 ⊗ · · · ⊗ BN and C, then the mapping
χ : X1 × · · · × XN → Z
(x1 , ..., xN ) 7→ χ(x1 , ..., xN ) := ρ(ϕ1 (x1 ), ..., ϕN (xN ))
is measurable w.r.t. A1 ⊗ · · · ⊗ AN and C.
6.2.10 Proposition. Let (X, A) be a measurable space, let N ∈ N, and let (Yk , Bk )
be a measurable space for k = 1, ..., N . For k = 1, ..., N , let ϕk : X → Yk be a
measurable mapping w.r.t. A and Bk .
(a) The mapping
ϕ : X → Y1 × · · · × YN
x 7→ ϕ(x) := (ϕ1 (x), ..., ϕN (x))
is measurable w.r.t. A and B1 ⊗ · · · ⊗ BN .
(b) If (Z, C) is a measurable space and ρ : Y1 × · · · × YN → Z is a measurable
mapping w.r.t. B1 ⊗ · · · ⊗ BN and C, then the mapping
χ :X → Z
x 7→ χ(x) := ρ(ϕ1 (x), ..., ϕN (x))
is measurable w.r.t. A and C.
By 6.1.30a and 6.2.7, this proves that the mapping ι is measurable w.r.t. A and
A ⊗ · · · N times · · · ⊗ A. Now, by 6.2.9a the mapping ϕ1 × · · · × ϕN is measurable
w.r.t. A ⊗ · · · N times · · · ⊗ A and B1 ⊗ · · · ⊗ BN . Then ϕ is measurable w.r.t. A
and B1 ⊗ · · · ⊗ BN by 6.2.5, since ϕ = (ϕ1 × · · · × ϕN ) ◦ ι.
b: This follows at once from part a and 6.2.5, since χ = ρ ◦ ϕ.
Proof. With the exception of the second assertion of part b, everything follows at
once from 6.2.7 along with 6.1.25, 6.1.26, 6.1.33, 6.1.22, 6.1.23.
As to the second assertion of part b, we notice that the condition
ϕ−1 (E) ∈ A, ∀E ∈ I1∗
is precisely the condition
ϕ−1 ({−∞}), ϕ−1 ({∞}) ∈ A and ϕ−1 (E) ∈ A for all E ∈ A(dR ).
We also notice that the condition
ϕ−1 (E) ∈ A, ∀E ∈ A(dR )
can be written as
∃F ∈ A s.t. (ϕ_{ϕ^{−1}(R)})^{−1}(E) = ϕ^{−1}(E) = F = F ∩ ϕ^{−1}(R), ∀E ∈ A(dR),
and that this in its turn can be written as
(ϕ_{ϕ^{−1}(R)})^{−1}(E) ∈ A^{ϕ^{−1}(R)}, ∀E ∈ A(dR).
Finally, we notice that the last condition is the condition of A^{ϕ^{−1}(R)}-measurability
for the mapping ϕ_{ϕ^{−1}(R)}.
6.2.14 Remark. Let (X, A) be a measurable space. From 6.2.4 and 6.1.24 it follows
that, for the A-measurability of a function ϕ : X → R, it is immaterial what choice
is made among R, R∗ and C for the final set of ϕ. For this reason, in what follows
we consider only functions whose final sets are either R∗ or C.
6.2.15 Definition. For a measurable space (X, A), we denote by M(X, A) the
family of all the functions, with X as domain and C as final set, that are
A-measurable, i.e. M(X, A) is the family of all A-measurable complex functions
on X:
M(X, A) := {ϕ ∈ F (X) : ϕ is A-measurable}
(for F (X) cf. 3.1.10c).
Proof. For lm1 and sa2 , use 6.2.10b with N := 2, (Y1 , B1 ) = (Y2 , B2 ) = (C, A(dC )),
ϕ1 := ϕ, ϕ2 := ψ and either
C × C ∋ (z1 , z2 ) 7→ ρ(z1 , z2 ) := z1 + z2 ∈ C
or
C × C ∋ (z1 , z2 ) 7→ ρ(z1 , z2 ) := z1 z2 ∈ C.
Notice in fact that in either case ρ is A(dC × dC )-measurable since it is continuous
(cf. 6.2.8), and that A(dC × dC ) = A(dC ) ⊗ A(dC ) by 6.1.31.
Condition lm2 follows from sa2 , since αϕ = αX ϕ (for the constant function αX
cf. 1.2.19) and αX ∈ M(X, A) by 6.2.2.
Finally, 1X ∈ M(X, A) by 6.2.2.
6.2.17 Proposition. Let (X, A) be a measurable space and ϕ ∈ M(X, A). Then:
ϕ̄ ∈ M(X, A);
|ϕ|^n ∈ M(X, A), ∀n ∈ N;
the function 1/ϕ is A^{D_{1/ϕ}}-measurable
(for the functions ϕ̄, |ϕ|^n and 1/ϕ, cf. 1.2.19).
Finally, 1/ϕ is A^{D_{1/ϕ}}-measurable by 6.2.6. Indeed, 1/ϕ = ψ ◦ ϕ if ψ is the function
C − {0} ∋ z 7→ ψ(z) := 1/z ∈ C.
Now ψ is continuous, hence A(dC−{0})-measurable by 6.2.8 (we have denoted by
dC−{0} the restriction of dC to (C − {0}) × (C − {0})). Moreover, A(dC−{0}) =
(A(dC))^{C−{0}} = (A(dC))^{Dψ} by 6.1.21 (cf. also 6.1.22).
inf_{k≥n} ϕk : X → R∗.
If the sequence {ϕn(x)} is convergent (in the metric space (R∗, δ)) for all x ∈ X,
we define the function
lim_{n→∞} ϕn : X → R∗,
and from 5.2.6 we have lim_{n→∞} ϕn = inf_{n≥1} (sup_{k≥n} ϕk) = sup_{n≥1} (inf_{k≥n} ϕk).
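The identity lim_{n→∞} ϕn = inf_{n≥1}(sup_{k≥n} ϕk) = sup_{n≥1}(inf_{k≥n} ϕk) can be illustrated numerically by truncating the sequence; in the sketch below (our own, with an arbitrarily chosen convergent sequence ϕn(x) = x + (−1)^n/n) both iterated expressions approach the pointwise limit x.

```python
# ϕ_n(x) = x + (-1)^n / n converges to x for every x; we check the identity
# lim = inf_n sup_{k≥n} = sup_n inf_{k≥n} on a finite truncation.
N = 2000
x = 0.5
phi = [x + (-1)**n / n for n in range(1, N + 1)]

inf_sup = min(max(phi[n:]) for n in range(N // 2))  # inf_n sup_{k≥n} ϕ_k(x)
sup_inf = max(min(phi[n:]) for n in range(N // 2))  # sup_n inf_{k≥n} ϕ_k(x)
print(abs(inf_sup - x) < 1e-2 and abs(sup_inf - x) < 1e-2)  # → True
```

Both truncated quantities land within 1/1000 of the limit x = 0.5, on opposite sides of it.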
6.2.19 Proposition. Let (X, A) be a measurable space and let {ϕn } be a sequence
of A-measurable functions ϕn : X → R∗ .
(a) For each n ∈ N, the functions sup_{k≥n} ϕk and inf_{k≥n} ϕk are A-measurable.
(b) If the sequence {ϕn(x)} is convergent for all x ∈ X, then the function
lim_{n→∞} ϕn is A-measurable.
where ⇐ holds by 1 and ⇒ holds by 2. This proves that
(sup_{k≥n} ϕk)^{−1}((a, ∞]) = ∪_{k=n}^∞ ϕk^{−1}((a, ∞]).
Since this is true for all a ∈ R, it proves that sup_{k≥n} ϕk is A-measurable, in view
of 6.2.13b with F∗ := I2∗.
For inf k≥n ϕk the proof is analogous.
b: Let the sequence {ϕn(x)} be convergent for all x ∈ X. Then lim_{n→∞} ϕn =
inf_{n≥1} (sup_{k≥n} ϕk). Now, sup_{k≥n} ϕk is A-measurable for each n ∈ N, in view of part
a. Then, in view of part a once again, inf_{n≥1} (sup_{k≥n} ϕk) is A-measurable.
Now, the functions Re ϕn and Im ϕn are A-measurable for each n ∈ N (cf. 6.2.12).
Thus, 6.2.19b implies that lim_{n→∞} Re ϕn and lim_{n→∞} Im ϕn are A-measurable (we
have also used 6.2.14 twice). Then, lim_{n→∞} ϕn is A-measurable by 6.2.16 (or by
6.2.10a and the equality A(dC) = A(dR) ⊗ A(dR)).
d: This result follows from 6.2.16 and corollary c, since
Σ_{n=1}^∞ ϕn = lim_{n→∞} Σ_{k=1}^n ϕk.
Ei ∩ Ej = ∅ if i ≠ j and ψ = Σ_{k=1}^n αk χ_{Ek}.
We point out the obvious fact that, for ψ ∈ S(X, A), its representation ψ =
Σ_{k=1}^n αk χ_{Ek} as in the definition above is never unique (it would be if we required
αi ≠ αj for i ≠ j and ∪_{k=1}^n Ek = X, but we do not).
Then χEk ∈ M(X, A) for k = 1, ..., n (cf. 6.2.21), and hence ψ ∈ M(X, A) by
6.2.16. Moreover, ψ has finitely many values since the only possible values of ψ are
the numbers 0, α1 , ..., αn .
b ⇒ a: Assume condition b and let {α1 , ..., αn } := Rψ and Ek := ψ −1 ({αk }) for
k = 1, ..., n.
We have Ek ∈ A by 6.2.13c (with G = KdC ) since ψ is A-measurable and
{αk } ∈ KdC . Moreover,
Ei ∩ Ej = ψ −1 ({αi } ∩ {αj }) = ψ −1 (∅) = ∅ if i 6= j.
Finally, since X = ∪_{k=1}^n Ek we have
∀x ∈ X, ∃!i ∈ {1, ..., n} s.t. x ∈ Ei, hence s.t. ψ(x) = αi = Σ_{k=1}^n αk χ_{Ek}(x),
and this proves that ψ = Σ_{k=1}^n αk χ_{Ek}.
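The construction in the proof of b ⇒ a — take the finitely many values αk and the preimages Ek := ψ^{−1}({αk}) — is easy to carry out when X is finite. The following Python sketch (the function name is ours) builds this canonical representation and verifies ψ = Σ_k αk χ_{Ek} pointwise.

```python
def canonical_representation(psi, X):
    """Given a function psi with finitely many values on the finite set X,
    return the pairs (alpha_k, E_k) with E_k := psi^{-1}({alpha_k})."""
    values = sorted(set(psi(x) for x in X))
    return [(a, {x for x in X if psi(x) == a}) for a in values]

X = range(6)
psi = lambda x: x % 2  # a simple function with values 0 and 1
rep = canonical_representation(psi, X)

# ψ(x) = Σ_k α_k · χ_{E_k}(x): reconstruct from the representation and compare
recon = lambda x: sum(a * (x in E) for a, E in rep)
print([psi(x) == recon(x) for x in X])
```

The sets Ek produced this way are automatically disjoint and cover X, which is exactly what the proof uses.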
Proof. In view of 6.2.23 and 6.2.16, we only need to notice that, if α ∈ C and
ψ1 , ψ2 , ψ ∈ S(X, A), then ψ1 + ψ2 , αψ, ψ1 ψ2 have only finitely many values.
6.2.26 Theorem. Let (X, A) be a measurable space and ϕ ∈ L+ (X, A). Then there
exists a sequence {ψn } in S + (X, A) such that:
6.2.27 Corollary. Let (X, A) be a measurable space and ϕ ∈ M(X, A). Then there
exists a sequence {ψn } in S(X, A) such that:
(a) |ψn | ≤ |ϕ|, ∀n ∈ N,
(b) ψn (x) → ϕ(x) as n → ∞, ∀x ∈ X,
(c) if a subset Y of X and m ∈ [0, ∞) are such that |ϕ(x)| ≤ m for all x ∈ Y , then
sup{|ϕ(x) − ψn (x)| : x ∈ Y } → 0 as n → ∞.
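One standard way to produce such approximating simple functions for a nonnegative ϕ — not necessarily the construction used in the proof of 6.2.26 — is the dyadic truncation ψn := min(⌊2^n ϕ⌋/2^n, n), which satisfies ψn ≤ ϕ and approximates ϕ within 2^{−n} wherever ϕ ≤ n. A numerical sketch under that assumption:

```python
import math

def psi_n(phi, n):
    # Dyadic simple-function approximation: values k/2^n, truncated at n
    return lambda x: min(math.floor(phi(x) * 2**n) / 2**n, n)

phi = lambda x: x * x              # a bounded nonnegative function on [0, 2]
Y = [i / 100 for i in range(201)]  # sample points of [0, 2]; ϕ ≤ 4 there

errors = [max(abs(phi(x) - psi_n(phi, n)(x)) for x in Y) for n in (3, 5, 8)]
print(errors)
```

For n = 3 the truncation at the value 3 dominates the error (since ϕ reaches 4), while for n = 5 and n = 8 the error is bounded by 2^{−n} on all of [0, 2], matching condition c of the corollary.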
6.2.28 Definition. For a measurable space (X, A), we denote by MB (X, A) the
family of all bounded A-measurable complex functions on X, i.e. we define
MB (X, A) := M(X, A) ∩ FB (X)
(for FB (X), cf. 3.1.10d).
6.2.29 Remarks. Let (X, A) be a measurable space. Since M(X, A) and FB (X)
are subalgebras of the associative algebra F (X) (cf. 6.2.16 and 3.3.8b), MB (X, A) is
a subalgebra of the associative algebra F (X) (cf. 3.3.4), and hence of the associative
algebras M(X, A) and FB (X) as well (cf. 3.3.3b).
Since FB (X) is a normed algebra (cf. 4.3.6a), MB (X, A) is a normed algebra
as well (cf. 4.3.2).
Since convergence with respect to the k k∞ norm (cf. 4.3.6a) implies pointwise
convergence, 6.2.20c shows that MB (X, A) is a closed subset of FB (X). Hence,
since FB (X) is a Banach space (cf. 4.3.6a), MB (X, A) is a Banach space as well
(cf. 4.1.8a) and hence a Banach algebra.
Clearly, S(X, A) ⊂ MB (X, A). Since S(X, A) is a subalgebra of the associative
algebra M(X, A) (cf. 6.2.24), S(X, A) is a subalgebra of the associative algebra
MB (X, A) as well (cf. 3.3.3b).
Finally, 6.2.27c and 2.3.12 show that S(X, A) is dense (in the k k∞ norm) in
MB (X, A).
ϕ + ψ : Dϕ ∩ Dψ → [0, ∞]
x 7→ (ϕ + ψ)(x) := ϕ(x) + ψ(x),
ϕψ : Dϕ ∩ Dψ → [0, ∞]
x 7→ (ϕψ)(x) := ϕ(x)ψ(x).
aϕ : Dϕ → [0, ∞]
x 7→ (aϕ)(x) := aϕ(x).
Clearly, if Rϕ ⊂ [0, ∞) and a ∈ [0, ∞) then this definition is in agreement with the
one given in 1.2.19.
Proof. It is obvious that Rϕ1+ϕ2 ⊂ [0, ∞]. By 6.2.26, there are two sequences {ψn^1}
and {ψn^2} in S+(X, A) so that, for i = 1, 2,
In this section we prove a result about Borel functions which has hardly anything
to do with the theory of measure and integration, but which will play an essential
role in our proof of the spectral theorem for unitary operators (from which we will
deduce the spectral theorem for self-adjoint operators).
6.3.1 Definition. Let X be a non-empty set, {ϕn } a sequence in F (X) (for F (X),
cf. 3.1.10c), and ϕ ∈ F (X). We say that ϕ is the uniformly bounded pointwise limit,
in short ubp limit, of {ϕn} and we write ϕn −ubp→ ϕ if the following two conditions
are satisfied:
∃m ∈ [0, ∞) such that |ϕn (x)| ≤ m, ∀x ∈ X, ∀n ∈ N;
ϕn (x) → ϕ(x) as n → ∞, ∀x ∈ X.
Clearly, if ϕn −ubp→ ϕ then ϕ ∈ FB (X) (for FB (X), cf. 3.1.10d).
A family of functions V ⊂ F (X) is said to be ubp closed if the following condition
is satisfied:
[ϕ ∈ F (X), {ϕn} a sequence in V, ϕn −ubp→ ϕ] ⇒ ϕ ∈ V.
6.3.2 Lemma. Let (X, d) be a metric space, define the collection of families of
functions
Γ := {V ⊂ F (X) : V is ubp closed and CB (X) ⊂ V}
(for CB (X), cf. 3.1.10e), and then define the family of functions
E := ∩_{V∈Γ} V.
and hence χ_{∪_{n=1}^N En} −ubp→ χ_{∪_{n=1}^∞ En}. Since E is ubp closed (cf. property a), this
implies that χ_{∪_{n=1}^∞ En} ∈ E, and hence ∪_{n=1}^∞ En ∈ A.
6.3.4 Theorem. Let (X, d) be a metric space. The family MB (X, A(d)) of all
bounded Borel functions is the smallest family of complex functions on X that is
ubp closed and that contains CB (X). More explicitly:
(a) MB (X, A(d)) is ubp closed and CB (X) ⊂ MB (X, A(d));
(b) if V ⊂ F (X), V is ubp closed, and CB (X) ⊂ V, then MB (X, A(d)) ⊂ V.
Proof. a: Suppose that a sequence {ϕn} in MB (X, A(d)) and ϕ ∈ F (X) are such
that ϕn −ubp→ ϕ. Then ϕ ∈ M(X, A(d)) by 6.2.20c, and ϕ ∈ FB (X) since ϕ is a
ubp limit. This shows that MB (X, A(d)) is ubp closed. Moreover, the inclusion
CB (X) ⊂ MB (X, A(d)) holds by 6.2.8.
b: We prove this property of MB (X, A(d)) by proving the inclusion
MB (X, A(d)) ⊂ E, where E is the family of functions defined in 6.3.2. Indeed,
for every ϕ ∈ MB (X, A(d)), by 6.2.27 there exists a sequence {ψn} in S(X, A(d))
such that ψn −ubp→ ϕ. Now, properties c and d of 6.3.2 imply that S(X, A(d)) ⊂ E.
Since E is ubp closed (cf. 6.3.2a), this shows that ϕ ∈ E.
6.3.6 Remark. By substituting pointwise limits for ubp limits and dropping ev-
erywhere any condition of boundedness, the whole reasoning of this section can be
rerun to prove that, for any metric space (X, d), the family M(X, A(d)) of all Borel
functions is the smallest subset of F (X) that is closed with respect to pointwise
convergence and that contains C(X) (for C(X), cf. 3.1.10e).
Chapter 7
Measures
(cf. 6.1.8). If ∪_{n=1}^∞ En ∈ A0, then we have
µ0(∪_{n=1}^∞ En) = µ0(∪_{n=1}^∞ Fn) = Σ_{n=1}^∞ µ0(Fn) ≤ Σ_{n=1}^∞ µ0(En)
and hence, if ∪_{n=1}^∞ En ∈ A0, by pm we have
µ0(∪_{n=1}^∞ En) = µ0(∪_{n=1}^∞ Fn) = Σ_{n=1}^∞ µ0(Fn) = lim_{n→∞} Σ_{k=1}^n µ0(Fk) = lim_{n→∞} µ0(En),
or
µ0(∪_{n=1}^∞ En) = Σ_{n=1}^∞ µ0(Fn) = sup_{n≥1} Σ_{k=1}^n µ0(Fk) = sup_{n≥1} µ0(En).
∪_{n=1}^∞ Fn = ∪_{n=1}^∞ (E1 ∩ (X − En)) = E1 ∩ (X − ∩_{n=1}^∞ En) = E1 − ∩_{n=1}^∞ En.
Notice that lim_{n→∞} µ0(Fn) < ∞ since ∪_{n=1}^∞ Fn ⊂ E1 implies, by the monotonicity
of µ0, µ0(∪_{n=1}^∞ Fn) ≤ µ0(E1) < ∞. We also have
E1 = (∩_{n=1}^∞ En) ∪ (∪_{n=1}^∞ Fn) and (∩_{n=1}^∞ En) ∩ (∪_{n=1}^∞ Fn) = ∅,
whence, by af2,
µ0(E1) = µ0(∩_{n=1}^∞ En) + µ0(∪_{n=1}^∞ Fn),
and hence
µ0(∩_{n=1}^∞ En) = µ0(E1) − µ0(∪_{n=1}^∞ Fn).
Thus we have
µ0(∩_{n=1}^∞ En) = µ0(E1) − lim_{n→∞} µ0(Fn).
From
∀n ∈ N, E1 = En ∪ Fn and En ∩ Fn = ∅
we have
∀n ∈ N, µ0 (E1 ) = µ0 (En ) + µ0 (Fn ),
whence, since En ⊂ E1 implies µ0 (En ) ≤ µ0 (E1 ) < ∞,
∀n ∈ N, µ0 (Fn ) = µ0 (E1 ) − µ0 (En ),
and this implies, since all the terms involved are in R and so is limn→∞ µ0 (Fn ),
that the sequence {µ0 (En )} is convergent to a limit in R and that
lim_{n→∞} µ0(Fn) = µ0(E1) − lim_{n→∞} µ0(En).
Finally, from 7.1.2a and from 5.2.5 we obtain lim_{n→∞} µ0(En) = inf_{n≥1} µ0(En).
Proof. First we prove that conditions a and b imply that µ0 has property pm of
7.1.3. Let then {En} be a sequence in A0 such that ∪_{n=1}^∞ En ∈ A0 and Ei ∩ Ej = ∅
if i 6= j. The additivity and the monotonicity of µ0 imply that
∀N ∈ N, Σ_{n=1}^N µ0(En) = µ0(∪_{n=1}^N En) ≤ µ0(∪_{n=1}^∞ En),
whence
Σ_{n=1}^∞ µ0(En) := sup_{N≥1} Σ_{n=1}^N µ0(En) ≤ µ0(∪_{n=1}^∞ En). (1)
If Σ_{n=1}^∞ µ0(En) = ∞, 1 implies that
Σ_{n=1}^∞ µ0(En) = µ0(∪_{n=1}^∞ En).
Thus our task is now to prove that conditions a and b imply that
µ0(∪_{n=1}^∞ En) ≤ Σ_{n=1}^∞ µ0(En),
assuming that Σ_{n=1}^∞ µ0(En) < ∞ (we will see that, in this part of the proof, no
role is played by the condition Ei ∩ Ej = ∅ if i ≠ j, which however has already
played its role in the proof of 1). Assume then Σ_{n=1}^∞ µ0(En) < ∞ (this implies that
µ0(En) < ∞ for all n ∈ N), and consider any F ∈ A0 such that F̄ ⊂ ∪_{n=1}^∞ En and
such that F̄ is compact. Choose ǫ > 0. For each n ∈ N, condition b for En implies
that
∃Gn,ǫ ∈ A0 s.t. En ⊂ (Gn,ǫ)◦ and µ0(Gn,ǫ) − µ0(En) < ǫ/2^n.
Since ∪_{n=1}^∞ (Gn,ǫ)◦ ⊃ ∪_{n=1}^∞ En ⊃ F̄ and F̄ is compact, there exists N ∈ N so that
∪_{n=1}^N (Gn,ǫ)◦ ⊃ F̄, and hence so that
∪_{n=1}^N Gn,ǫ ⊃ ∪_{n=1}^N (Gn,ǫ)◦ ⊃ F̄ ⊃ F.
Then we have
Σ_{n=1}^∞ µ0(En) ≥ Σ_{n=1}^N µ0(En) > Σ_{n=1}^N (µ0(Gn,ǫ) − ǫ/2^n) > Σ_{n=1}^N µ0(Gn,ǫ) − ǫ
≥ µ0(∪_{n=1}^N Gn,ǫ) − ǫ ≥ µ0(F) − ǫ,
where the subadditivity and the monotonicity of µ0 have been used. Since ǫ was
arbitrary, this proves that
Σ_{n=1}^∞ µ0(En) ≥ µ0(F).
Since F was any element of A0 such that F̄ ⊂ ∪_{n=1}^∞ En and such that F̄ was compact,
condition a for ∪_{n=1}^∞ En implies that
Σ_{n=1}^∞ µ0(En) ≥ µ0(∪_{n=1}^∞ En),
which along with 1 proves that
Σ_{n=1}^∞ µ0(En) = µ0(∪_{n=1}^∞ En).
Now suppose that µ0 (X) < ∞. Notice that this implies that µ0 (H) < ∞ for
all H ∈ A0 , by the monotonicity of µ0 . We will prove that from this it follows
that condition a implies condition b. Assume then condition a, and consider any
E ∈ A0 . Choose ǫ > 0. Since X − E ∈ A0 , condition a for X − E implies that
∃Fǫ ∈ A0 s.t. F̄ǫ ⊂ X − E and µ0(X − E) − µ0(Fǫ) < ǫ.
Letting Gǫ := X − Fǫ , we have:
Gǫ ∈ A0 ;
E ⊂ (Gǫ)◦ (because F̄ǫ ⊂ X − E implies E ⊂ X − F̄ǫ, and X − F̄ǫ ⊂ (X − Fǫ)◦
follows from X − F̄ǫ ∈ Td and X − F̄ǫ ⊂ X − Fǫ);
µ0 (Gǫ ) − µ0 (E) = µ0 (X) − µ0 (Fǫ ) − µ0 (E) = µ0 (X − E) − µ0 (Fǫ ) < ǫ, where
µ0 (Gǫ ) = µ0 (X) − µ0 (Fǫ ) and µ0 (X) − µ0 (E) = µ0 (X − E) are true because µ0
is finite.
Since ǫ was arbitrary, this proves condition b for E.
Proof. We will prove this corollary by proving that condition a of the statement
implies condition 7.1.5a and that condition b implies condition 7.1.5b.
Consider then E ∈ A0 (S). By 6.1.11, there is a finite and disjoint family
{E1 , ..., En } of elements of S so that E = ∪_{k=1}^n Ek.
Assume condition a and suppose first that µ0(E) = ∞. Then there exists
l ∈ {1, ..., n} such that µ0(El) = ∞, for otherwise we would have µ0(E) < ∞ by
the additivity of µ0 . Notice that, since S ⊂ A0 (S) and El ⊂ E,
{F ∈ S : F̄ ⊂ El, F̄ is compact} ⊂ {F ∈ A0(S) : F̄ ⊂ E, F̄ is compact}.
Condition a for El is
∞ = µ0(El) = sup{µ0(F) : F ∈ S, F̄ ⊂ El, F̄ is compact}.
Hence we have
sup{µ0(F) : F ∈ A0(S), F̄ ⊂ E, F̄ is compact} = ∞ = µ0(E),
which is condition 7.1.5a for E.
Assume next condition a and suppose that µ0 (E) < ∞. Then µ0 (Ek ) < ∞ for
k = 1, ..., n by the monotonicity of µ0 . Choose ǫ > 0. For k = 1, ..., n, condition a
for Ek implies that
∃Fk,ǫ ∈ S s.t. F̄k,ǫ ⊂ Ek, F̄k,ǫ is compact, µ0(Ek) − µ0(Fk,ǫ) < ǫ/n.
Letting Fǫ := ∪_{k=1}^n Fk,ǫ, we have:
Proof. For each E ∈ P(X), conditions om3 and om1 imply that
µ∗ (A) ≤ µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)), ∀A ∈ P(X), (1)
since A = (A ∩ E) ∪ (A ∩ (X − E)), and 1 implies that
µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)) for all A ∈ P(X) such that µ∗ (A) = ∞. (2)
If E ∈ P(X) is such that
µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)) ≤ µ∗ (A) for all A ∈ P(X) such that µ∗ (A) < ∞,
then in view of 1 we have
µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)) for all A ∈ P(X) such that µ∗ (A) < ∞,
which, together with 2, proves that E is µ∗ -measurable.
is obvious. Suppose then µ∗(Ek) < ∞ for all k ∈ N and choose ǫ > 0. For each
k ∈ N there exists a sequence {A^k_n} in E such that
Ek ⊂ ∪_{n=1}^∞ A^k_n and Σ_{n=1}^∞ ρ(A^k_n) < µ∗(Ek) + ǫ/2^k.
Then we have
∪_{k=1}^∞ Ek ⊂ ∪_{(k,n)∈N×N} A^k_n
and
Σ_{(k,n)∈N×N} ρ(A^k_n) = Σ_{k=1}^∞ (Σ_{n=1}^∞ ρ(A^k_n)) ≤ Σ_{k=1}^∞ (µ∗(Ek) + ǫ/2^k) = Σ_{k=1}^∞ µ∗(Ek) + ǫ,
Proof. A, existence: Let E ∈ A0 (S). Then there exists a finite and disjoint family
{E1 , ..., EN } of elements of S such that E = ∪_{n=1}^N En (cf. 6.1.11). Suppose now
that there exists another finite and disjoint family {F1 , ..., FL } of elements of S such
that E = ∪_{l=1}^L Fl. If we define Gn,l := En ∩ Fl , we have Gn,l ∈ S for all n = 1, ..., N
and l = 1, ..., L, and Gn,l ∩ Gm,k = ∅ if (n, l) ≠ (m, k). By condition b, we also have
for n = 1, ..., N, En = ∪_{l=1}^L Gn,l, whence ν(En) = Σ_{l=1}^L ν(Gn,l),
for l = 1, ..., L, Fl = ∪_{n=1}^N Gn,l, whence ν(Fl) = Σ_{n=1}^N ν(Gn,l),
and hence
Σ_{n=1}^N ν(En) = Σ_{n=1}^N Σ_{l=1}^L ν(Gn,l) = Σ_{l=1}^L Σ_{n=1}^N ν(Gn,l) = Σ_{l=1}^L ν(Fl).
of elements of S s.t. E = ∪_{n=1}^N En.
Obviously, µ0 is an extension of ν.
Since µ0 is an extension of ν, µ0 has property af1 of 7.1.1 because ν satisfies
condition a. Let now {E1, E2} be a disjoint pair of elements of A0(S), and let
{E^1_1, ..., E^1_{N1}} and {E^2_1, ..., E^2_{N2}} be finite and disjoint families of elements of S such
that Ei = ∪_{n=1}^{Ni} E^i_n for i = 1, 2. Then {E^i_n}_{i=1,2; n=1,...,Ni} is a finite and disjoint
family, and
µ0(E1 ∪ E2) = Σ_{i=1,2} Σ_{n=1}^{Ni} ν(E^i_n) = Σ_{n=1}^{N1} ν(E^1_n) + Σ_{n=1}^{N2} ν(E^2_n) = µ0(E1) + µ0(E2).
Applying induction to this result, we obtain property af2 of 7.1.1 for µ0, which is
therefore an additive function on A0 (S).
A, uniqueness: Suppose that µ̃0 is an additive function on A0 (S) which extends
ν. For any E ∈ A0 (S), let {E1 , ..., EN } be a finite and disjoint family of elements of
S such that E = ∪_{n=1}^N En. Then we have, by the additivity of µ̃0 and the definition
of µ0,
µ̃0(E) = Σ_{n=1}^N µ̃0(En) = Σ_{n=1}^N ν(En) = µ0(E).
B: Assume now condition c, and suppose that {Fn} is a sequence in A0 (S) such
that F := ∪_{n=1}^∞ Fn ∈ A0 (S) and Fi ∩ Fj = ∅ if i ≠ j.
There are finite and disjoint families {A1 , ..., AN } and {Bn,1 , ..., Bn,Nn } (for each
n ∈ N) of elements of S so that (cf. 6.1.11)
F = ∪_{k=1}^N Ak and Fn := ∪_{l=1}^{Nn} Bn,l (for each n ∈ N).
Define Ck,n,l := Ak ∩ Bn,l; then
∪_{k=1}^N Ck,n,l = Bn,l, ∪_{l=1}^{Nn} Ck,n,l = Ak ∩ Fn, ∪_{n=1}^∞ (Ak ∩ Fn) = Ak.
We have
ν(Bn,l) = Σ_{k=1}^N ν(Ck,n,l)
where we have used induction applied to 5.3.2e, induction applied to 5.4.6, and
5.3.3.
On the other hand, the additivity and the monotonicity of µ0 imply that
∀N ∈ N, Σ_{n=1}^N µ0(Fn) = µ0(∪_{n=1}^N Fn) ≤ µ0(F),
and note that ME is non-empty (take A1 := X and An := ∅ for n > 1). Then the
function
µ : A(A0 ) → [0, ∞]
E 7→ µ(E) := inf ME
is a measure on A(A0 ) (the σ-algebra on X generated by A0 ) and µ is an extension
of µ0 , i.e.
µ(E) = µ0 (E), ∀E ∈ A0 .
If µ̃ is another measure on A(A0 ) that is an extension of µ0 , then:
µ̃(E) ≤ µ(E), ∀E ∈ A(A0 );
µ̃(E) = µ(E) for each E ∈ A(A0 ) such that µ(E) < ∞;
µ̃ = µ if µ0 is σ-finite.
(e) there exists a countable family {En }n∈I of elements of S such that ν(En ) < ∞
for all n ∈ I and X = ∪n∈I En ,
Proof. Conditions a and b imply that there exists a unique additive function µ0
on A0 (S) that extends ν (cf. 7.3.1A). Then conditions c and d become respectively
conditions a and b of 7.1.6 for µ0 , and this implies that µ0 is a premeasure. If there
is a finite family {E1 , ..., EN } of elements of S such that ν(En) < ∞ for n = 1, ..., N
and X = ∪_{n=1}^N En, then the additive function µ0 that exists on A0 (S) is finite (since
it is subadditive), and hence condition c is enough to make µ0 a premeasure (cf.
7.1.6).
Since µ0 is a premeasure on A0 (S), 7.3.2 implies that there is a measure µ on
A(A0 (S)) that extends µ0 . Since A(A0 (S)) = A(S) (cf. 6.1.18) and µ0 extends ν,
this proves that there exists a measure µ on A(S) that extends ν.
Finally, assume that ν satisfies also condition e. Then clearly µ and µ0 are
σ-finite. Suppose that µ̃ is another measure on A(S) that extends ν. Then the
restriction µ̃A0 (S) of µ̃ to A0 (S) is an additive function on A0 (S) that extends ν
and we have µ̃A0 (S) = µ0 by the uniqueness asserted in 7.3.1A. Hence, µ̃ is an
extension of µ0 and we have µ̃ = µ by the uniqueness asserted in 7.3.2 in the event
that µ0 is σ-finite.
The content of the first part of this section will be used mainly in the study of
the product of two commuting projection valued measures, in Section 13.5. Indeed,
for that study, which is a necessary step for the spectral theory of two commuting
self-adjoint operators, it is essential to prove first that every finite measure on the
Borel σ-algebra A(dR ) on R is regular. This can be achieved in several ways. The
way we adopt here is borrowed from Section 2.7 of (Parthasarathy, 2005), and will
allow us to prove a more general result about commuting projection valued measures
than the one that is required by the spectral theory of two commuting self-adjoint
operators.
Lusin’s theorem is presented in the second part of this section. It will be used
to prove that C(a, b) is isomorphic to a dense linear manifold in the Hilbert space
L2 (a, b) (cf. 11.2.1).
Throughout this section, (X, d) stands for a metric space.
7.4.1 Proposition. Let µ be a finite measure on the Borel σ-algebra A(d) (cf.
6.1.22). Then, for each E ∈ A(d) the following conditions are both satisfied:
(a) µ(E) = sup {µ(F ) : F ⊂ E, F ∈ Kd };
(b) µ(E) = inf {µ(G) : E ⊂ G, G ∈ Td }.
Proof. We prove first that, for E ∈ A(d), conditions a and b together are equivalent
to the one condition
(c) ∀ǫ > 0, ∃Fǫ ∈ Kd, ∃Gǫ ∈ Td s.t. Fǫ ⊂ E ⊂ Gǫ and µ(Gǫ − Fǫ) < ǫ.
On the one hand in fact, if conditions a and b are true for E ∈ A(d), then (since µ
is finite) for ǫ > 0 there exist Fǫ ∈ Kd and Gǫ ∈ Td such that Fǫ ⊂ E ⊂ Gǫ and
µ(E) − µ(Fǫ) < ǫ/2 and µ(Gǫ) − µ(E) < ǫ/2. (1)
Since Gǫ − Fǫ = (Gǫ − E) ∪ (E − Fǫ ) (cf. 1.1.4) and (Gǫ − E) ∩ (E − Fǫ ) = ∅, and
since µ is finite, 1 implies that
µ(Gǫ − Fǫ ) = µ(Gǫ − E) + µ(E − Fǫ ) = µ(Gǫ ) − µ(E) + µ(E) − µ(Fǫ ) < ǫ,
and this shows that condition c is true for E. On the other hand, if condition c is
true for E ∈ A(d), then for ǫ > 0 we have
µ(E) − µ(Fǫ ) = µ(E − Fǫ ) ≤ µ(Gǫ − Fǫ ) < ǫ and
µ(Gǫ ) − µ(E) = µ(Gǫ − E) ≤ µ(Gǫ − Fǫ ) < ǫ
since E − Fǫ ⊂ Gǫ − Fǫ and Gǫ − E ⊂ Gǫ − Fǫ and since µ is finite, and this shows
that conditions a and b are true for E.
Let now B denote the collection of all the subsets of X for which condition c is
satisfied. We will prove that B is a σ-algebra on X and that Kd ⊂ B.
In the first place, we prove that condition al2 of 6.1.5 is satisfied for B. Indeed,
let E ∈ B and ǫ > 0. Then there exist Fǫ ∈ Kd and Gǫ ∈ Td such that Fǫ ⊂ E ⊂ Gǫ
and µ(Gǫ −Fǫ ) < ǫ. Hence, X −Gǫ ∈ Kd , X −Fǫ ∈ Td , X −Gǫ ⊂ X −E ⊂ X −Fǫ and
(X − Fǫ ) − (X − Gǫ ) = (X − Fǫ ) ∩ Gǫ = Gǫ − Fǫ , whence µ((X − Fǫ ) − (X − Gǫ )) < ǫ.
This shows that X − E ∈ B. Note next that ∅ ∈ B, since ∅ ∈ Kd and ∅ ∈ Td , and
thus condition c is trivially satisfied for ∅. Therefore, to prove both al1 of 6.1.5
and σa1 of 6.1.13 for B it is enough to prove that ∪∞ n=1 En ∈ B whenever {En } is a
sequence in B. Let then {En } be a sequence in B and ǫ > 0. For each n ∈ N there
exist Fn,ǫ ∈ Kd and Gn,ǫ ∈ Td such that Fn,ǫ ⊂ En ⊂ Gn,ǫ and µ(Gn,ǫ − Fn,ǫ) < ǫ/3^n.
Letting S := ∪_{n=1}^∞ Fn,ǫ = ∪_{N=1}^∞ (∪_{n=1}^N Fn,ǫ), from 7.1.4b it follows (since µ is finite)
that we can choose Nǫ so large that

µ(S − ∪_{n=1}^{Nǫ} Fn,ǫ) = µ(S) − µ(∪_{n=1}^{Nǫ} Fn,ǫ) < ǫ/2.

Letting now Fǫ := ∪_{n=1}^{Nǫ} Fn,ǫ and Gǫ := ∪_{n=1}^∞ Gn,ǫ, we have Fǫ ∈ Kd, Gǫ ∈ Td,
Fǫ ⊂ ∪_{n=1}^∞ En ⊂ Gǫ and

µ(Gǫ − Fǫ) = µ(Gǫ − S) + µ(S − Fǫ)
  ≤ µ(∪_{n=1}^∞ (Gn,ǫ − Fn,ǫ)) + µ(S − Fǫ)
  ≤ Σ_{n=1}^∞ µ(Gn,ǫ − Fn,ǫ) + µ(S − Fǫ) < Σ_{n=1}^∞ ǫ/3^n + ǫ/2 = ǫ,
Proof. This result is obtained immediately from 7.4.1, by using either condition a
or condition b for all elements of A(d).
7.4.5 Theorem. Let µ be a finite and tight measure on the Borel σ-algebra A(d).
Then µ is regular.
Proof. Let E be an arbitrary element of A(d). Condition β of 7.4.3 coincides with
condition b of 7.4.1, and therefore we know that it is satisfied since µ is finite. Let
now ǫ > 0. Since µ is finite, from 7.4.1 we know that

∃Fǫ ∈ Kd s.t. Fǫ ⊂ E and µ(E) − µ(Fǫ) < ǫ/2.
Since µ is tight, there is a compact subset Kǫ of X such that µ(X − Kǫ ) < 2ǫ . Define
then Cǫ := Fǫ ∩ Kǫ . We have Cǫ ⊂ E. Also, Cǫ is closed since Kǫ is closed by 2.8.6,
and hence Cǫ is compact by 2.8.8. Since µ is finite, we also have
µ(E) − µ(Cǫ) = µ(E) − µ(Fǫ) + µ(Fǫ) − µ(Cǫ) < ǫ/2 + µ(Fǫ − Cǫ) ≤ ǫ/2 + µ(X − Kǫ) < ǫ,
where we have used the monotonicity of µ and the calculation
Fǫ − Cǫ = Fǫ ∩ ((X − Fǫ ) ∪ (X − Kǫ )) = Fǫ − Kǫ ⊂ X − Kǫ .
Since µ is finite, this shows that condition α is satisfied for E.
7.4.6 Theorem. If the metric space (X, d) is complete and separable, then every
finite measure on the Borel σ-algebra A(d) is tight.
Proof. Let (X, d) be complete and separable, and let µ be a finite measure on
A(d). Choose ǫ > 0. For each n ∈ N, the family of open balls {B(x, 1/n)}_{x∈X} is such
that X = ∪_{x∈X} B(x, 1/n). Then, by 2.3.18, there is a countable family {xn,k}_{k∈In} of
points of X such that X = ∪_{k∈In} B(xn,k, 1/n), and hence such that X = ∪_{k∈In} K(xn,k, 1/n).
We can assume that either In = {1, ..., Nn} or In = N. If In = N, we have
X = ∪_{N=1}^∞ (∪_{k=1}^N K(xn,k, 1/n)) and 7.1.4b implies (since µ is finite) that there exists Nn ∈ N
so large that

µ(X − ∪_{k=1}^{Nn} K(xn,k, 1/n)) = µ(X) − µ(∪_{k=1}^{Nn} K(xn,k, 1/n)) < ǫ/2^n.
Thus, for each n ∈ N, in either case there is a finite family {xn,1, ..., xn,Nn} of points
so that

µ(X − ∪_{k=1}^{Nn} K(xn,k, 1/n)) < ǫ/2^n.

Let then Kǫ := ∩_{n=1}^∞ (∪_{k=1}^{Nn} K(xn,k, 1/n)). The set Kǫ is closed (cf. 2.3.7 and 2.3.2)
and hence the metric subspace (Kǫ, dKǫ) is complete (cf. 2.6.6b). Moreover,

∀n ∈ N, Kǫ ⊂ ∪_{k=1}^{Nn} K(xn,k, 1/n).
Therefore Kǫ is compact by 2.8.5. Moreover, we have

X − Kǫ = ∪_{n=1}^∞ (X − ∪_{k=1}^{Nn} K(xn,k, 1/n))

and this implies, by the σ-subadditivity of µ, that

µ(X − Kǫ) ≤ Σ_{n=1}^∞ µ(X − ∪_{k=1}^{Nn} K(xn,k, 1/n)) < Σ_{n=1}^∞ ǫ/2^n = ǫ.
This shows that µ is tight.
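As a concrete illustration (a toy setting of our own, not from the text; the helper name `tight_radius` is assumed), tightness can be checked by hand for a finite discrete measure on R: all but an arbitrarily small amount of the total mass sits in a compact interval [−M, M].

```python
# Illustrative sketch: a finite discrete measure on R, given by point masses,
# is tight -- for each eps > 0 some compact interval [-M, M] carries all but
# eps of the total mass.

def tight_radius(masses, eps):
    """Return M such that mu(R - [-M, M]) < eps, where mu is the sum of the
    point masses in `masses` (a dict {point: mass} with finite total mass)."""
    total = sum(masses.values())
    # Sort points by distance from the origin and accumulate mass inward-out.
    pts = sorted(masses, key=abs)
    acc = 0.0
    for p in pts:
        acc += masses[p]
        if total - acc < eps:        # the mass outside [-|p|, |p|] is < eps
            return abs(p)
    return abs(pts[-1]) if pts else 0.0

mu = {0.0: 0.5, 1.0: 0.25, -2.0: 0.125, 10.0: 0.125}
M = tight_radius(mu, 0.2)
outside = sum(m for p, m in mu.items() if abs(p) > M)
print(M, outside)   # → 2.0 0.125
```

Here the compact set is just a closed bounded interval; for a general complete separable metric space the proof above builds the compact set as an intersection of finite unions of closed balls instead.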
7.4.7 Corollary. If the metric space (X, d) is complete and separable, then every
finite measure on the Borel σ-algebra A(d) is regular.
Proof. First of all we point out that the statement is consistent because
we have a sequence {ψn } in S + (X, A(d)) such that limn→∞ ψn (x) = ϕ(x) for all
x ∈ X (cf. the proof of 6.2.26).
We have

ψ1 = (1/2) χ_{E2,1},

which can be written as

ψ1 = (1/2) χ_{E1}

if we define E1 := E2,1.
Now fix ε > 0. From 7.4.1 and its proof we have that, for each n ∈ N,

∃Fn ∈ Kd, ∃Gn ∈ Td such that Fn ⊂ En ⊂ Gn and µ(Gn − Fn) < ε/2^n,

and from 2.5.11 we have that

∃ϕn ∈ C(X) such that 0 ≤ ϕn(x) ≤ 1, ∀x ∈ X, and Fn ≺ ϕn ≺ Gn.

Then we define the function

ψ : X → R
x ↦ ψ(x) := Σ_{n=1}^∞ (1/2^n) ϕn(x).
Since

0 ≤ ψ(x) ≤ Σ_{n=1}^∞ 1/2^n = 1, ∀x ∈ X,
we have ψ ∈ FB(X), and since

‖Σ_{n=1}^N (1/2^n) ϕn − ψ‖_∞ ≤ Σ_{n=N+1}^∞ 1/2^n → 0 as N → ∞,
and hence

{x ∈ X : ϕ(x) ≠ ψ(x)} ⊂ ∪_{n=1}^∞ ((X − Fn) ∩ Gn) = ∪_{n=1}^∞ (Gn − Fn),
are elements of M(X, A(d)) ∩ FB (X) such that 0 ≤ ϕi (x) for all x ∈ X, for
i = 1, 2, 3, 4. Therefore, the result of step 2 implies that if we fix ε > 0 then, for
i = 1, 2, 3, 4, there exists ψi ∈ CB (X) such that
µ({x ∈ X : ϕi(x) ≠ ψi(x)}) < ε/4.
Then the function
ψ := ψ1 − ψ2 + iψ3 − iψ4
is an element of CB (X) such that
{x ∈ X : ϕ(x) ≠ ψ(x)} ⊂ ∪_{i=1}^4 {x ∈ X : ϕi(x) ≠ ψi(x)},
Chapter 8
Integration
8.1.1 Proposition. Let n, m ∈ N, let {a1 , ..., an } and {b1 , ..., bm } be families of
elements of [0, ∞), let {E1 , ..., En } and {F1 , ..., Fm } be disjoint (i.e. Ei ∩ Ej = ∅
and Fi ∩ Fj = ∅ if i 6= j) families of elements of A, and suppose that
Σ_{k=1}^n ak χ_{Ek} = Σ_{l=1}^m bl χ_{Fl}.

Then

Σ_{k=1}^n ak µ(Ek) = Σ_{l=1}^m bl µ(Fl).
Proof. We define

a_{n+1} := b_{m+1} := 0, E_{n+1} := X − ∪_{k=1}^n Ek, F_{m+1} := X − ∪_{l=1}^m Fl.

Then

∀l ∈ {1, ..., m + 1}, Fl = ∪_{k=1}^{n+1} (Ek ∩ Fl), whence µ(Fl) = Σ_{k=1}^{n+1} µ(Ek ∩ Fl).
We also notice that
Σ_{k=1}^{n+1} ak χ_{Ek} = Σ_{k=1}^n ak χ_{Ek} = Σ_{l=1}^m bl χ_{Fl} = Σ_{l=1}^{m+1} bl χ_{Fl}.
8.1.2 Definition. Let ψ ∈ S+(X, A) (for S+(X, A), cf. 6.2.25). Then there are
n ∈ N, a family {a1, ..., an} of elements of [0, ∞), and a disjoint family {E1, ..., En}
of elements of A so that ψ = Σ_{k=1}^n ak χ_{Ek}. We define the integral (with respect to
µ) of ψ by

∫_X ψ dµ := Σ_{k=1}^n ak µ(Ek),

and this definition is consistent by 8.1.1 (even though the representation of ψ as a
linear combination of characteristic functions is not unique).
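In a discrete toy setting the definition in 8.1.2 can be computed directly (a sketch of our own; the helper `integral_simple` and the concrete µ are assumptions, not the text's notation): the integral of ψ = Σ_k ak χ_{Ek} is Σ_k ak µ(Ek), and by 8.1.1 the value does not depend on the chosen representation.

```python
# Minimal sketch: the integral of a simple function over a discrete measure.

def integral_simple(coeffs_sets, mu):
    """coeffs_sets: list of (a_k, E_k) with the E_k disjoint subsets of X;
    mu: dict assigning a mass to each point of X (a discrete measure)."""
    def measure(E):
        return sum(mu[x] for x in E)
    return sum(a * measure(E) for a, E in coeffs_sets)

mu = {1: 0.5, 2: 0.5, 3: 2.0}            # a measure on X = {1, 2, 3}
psi = [(3.0, {1, 2}), (1.0, {3})]         # psi = 3*chi_{1,2} + 1*chi_{3}
print(integral_simple(psi, mu))           # → 5.0
```

Splitting {1, 2} into {1} and {2} gives another representation of the same ψ with the same integral, which is exactly what 8.1.1 guarantees.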
8.1.3 Remarks.
(a) For each E ∈ A, we have χE ∈ S+(X, A). Thus, immediately from the defini-
tion in 8.1.2, we have

∫_X χE dµ = µ(E).

Hence in particular ∫_X 0X dµ = µ(∅) = 0 (even if µ(X) = ∞) since 0X = χ∅,
and ∫_X 1X dµ = µ(X) since 1X = χX.
(b) From the definition in 8.1.2 and from 7.1.2a we have that if µ(X) = 0 then
∫_X ψ dµ = 0 for all ψ ∈ S+(X, A).
We have

ψ1 = Σ_{k=1}^{n+1} ak χ_{Ek} and ψ2 = Σ_{l=1}^{m+1} bl χ_{Fl}.
The family {Ek ∩ Fl}_{(k,l)∈I}, with I := {1, ..., n + 1} × {1, ..., m + 1}, is a disjoint
family of elements of A, and from Ek = ∪_{l=1}^{m+1} (Ek ∩ Fl) and Fl = ∪_{k=1}^{n+1} (Ek ∩ Fl)
we have

χ_{Ek} = Σ_{l=1}^{m+1} χ_{Ek∩Fl} for k = 1, ..., n + 1 and
χ_{Fl} = Σ_{k=1}^{n+1} χ_{Ek∩Fl} for l = 1, ..., m + 1,

and hence

ψ1 = Σ_{(k,l)∈I} ak χ_{Ek∩Fl} and ψ2 = Σ_{(k,l)∈I} bl χ_{Ek∩Fl}.
This shows that aψ1 + bψ2 ∈ S+(X, A) (this was already clear from 6.2.24). More-
over (cf. 5.3.3)

∫_X (aψ1 + bψ2) dµ = Σ_{(k,l)∈I} (a ak + b bl) µ(Ek ∩ Fl)
  = a Σ_{k=1}^{n+1} Σ_{l=1}^{m+1} ak µ(Ek ∩ Fl) + b Σ_{l=1}^{m+1} Σ_{k=1}^{n+1} bl µ(Ek ∩ Fl)
  = a Σ_{k=1}^{n+1} ak µ(Ek) + b Σ_{l=1}^{m+1} bl µ(Fl) = a ∫_X ψ1 dµ + b ∫_X ψ2 dµ.
b: Suppose ψ1 ≤ ψ2, i.e.

Σ_{(k,l)∈I} ak χ_{Ek∩Fl}(x) ≤ Σ_{(k,l)∈I} bl χ_{Ek∩Fl}(x), ∀x ∈ X.
Proof. We have ν(∅) = ∫_X 0X dµ = 0 (cf. 8.1.3a). Thus, ν has property af1 of
7.1.1.
Write now ψ = Σ_{k=1}^m ak χ_{Fk}, with ak ∈ [0, ∞) for k = 1, ..., m and {F1, ..., Fm}
a disjoint family of elements of A, and notice that, for every E ∈ A, the equality
χE ψ = Σ_{k=1}^m ak χ_{E∩Fk} shows that χE ψ ∈ S+(X, A) (this was already clear from
6.2.24) and ∫_X χE ψ dµ = Σ_{k=1}^m ak µ(E ∩ Fk). Then, if {En} is a sequence in A such
that Ei ∩ Ej = ∅ whenever i ≠ j, we have (by 5.4.5 and induction applied to 5.4.6)

ν(∪_{n=1}^∞ En) = ∫_X χ_{∪_{n=1}^∞ En} ψ dµ = Σ_{k=1}^m ak µ((∪_{n=1}^∞ En) ∩ Fk)
  = Σ_{k=1}^m ak Σ_{n=1}^∞ µ(En ∩ Fk) = Σ_{n=1}^∞ Σ_{k=1}^m ak µ(En ∩ Fk)
  = Σ_{n=1}^∞ ∫_X χ_{En} ψ dµ = Σ_{n=1}^∞ ν(En).

Thus, ν has property me of 7.1.7.
(b) ∫_X (lim_{n→∞} ϕn) dµ = lim_{n→∞} ∫_X ϕn dµ = sup_{n≥1} ∫_X ϕn dµ.
Proof. a: By 5.2.5 we have that, for each x ∈ X, the sequence {ϕn (x)} is con-
vergent and limn→∞ ϕn (x) = supn≥1 ϕn (x), and hence also that limn→∞ ϕn (x) ∈
[0, ∞]. Thus, limn→∞ ϕn = supn≥1 ϕn , and limn→∞ ϕn ∈ L+ (X, A) follows from
6.2.19b.
b: From 8.1.7 and 5.2.5 it follows that the sequence {∫_X ϕn dµ} is convergent (in
the metric space (R*, δ)) and that

lim_{n→∞} ∫_X ϕn dµ = sup_{n≥1} ∫_X ϕn dµ.
From 6.2.31 and 6.1.26 we have En ∈ A for all n ∈ N. Also, En ⊂ En+1 for all
n ∈ N and X = ∪_{n=1}^∞ En (to see this, notice that if ψ(x) = 0 then x ∈ E1; if
0 < ψ(x), then aψ(x) < ψ(x) ≤ sup_{n≥1} ϕn(x) and hence there exists n ∈ N so that
aψ(x) < ϕn(x)). Then, 8.1.4a, 8.1.5 (since aψ ∈ S+(X, A) by 6.2.24) and 7.1.4b
imply that

a ∫_X ψ dµ = ∫_X aψ dµ = sup_{n≥1} ∫_X χ_{En} aψ dµ.
Thus,

a ∫_X ψ dµ ≤ sup_{n≥1} ∫_X ϕn dµ.

Since this is true for every ψ ∈ S+(X, A) such that ψ ≤ sup_{n≥1} ϕn, we have

∫_X (sup_{n≥1} ϕn) dµ ≤ sup_{n≥1} ∫_X ϕn dµ.
Proof. We have ϕ1 + ϕ2 ∈ L+(X, A) from 6.2.31. By 6.2.26, there are two sequences
{ψn^1} and {ψn^2} in S+(X, A) so that, for i = 1, 2, ψn^i ≤ ψ_{n+1}^i and lim_{n→∞} ψn^i = ϕi.
By 8.1.4a, {ψn^1 + ψn^2} is a sequence in S+(X, A) and

∫_X (ψn^1 + ψn^2) dµ = ∫_X ψn^1 dµ + ∫_X ψn^2 dµ.
Since
8.1.10 Proposition. Let {ϕn} be a sequence in L+(X, A). Then (for Σ_{n=1}^∞ ϕn,
cf. 6.2.32)

Σ_{n=1}^∞ ϕn ∈ L+(X, A) and ∫_X (Σ_{n=1}^∞ ϕn) dµ = Σ_{n=1}^∞ ∫_X ϕn dµ.
Proof. We have Σ_{n=1}^∞ ϕn ∈ L+(X, A) from 6.2.32. Applying induction to 8.1.9 we
have

Σ_{k=1}^n ϕk ∈ L+(X, A) and ∫_X (Σ_{k=1}^n ϕk) dµ = Σ_{k=1}^n ∫_X ϕk dµ, ∀n ∈ N.

Then, since Σ_{k=1}^n ϕk ≤ Σ_{k=1}^{n+1} ϕk for all n ∈ N, by 8.1.8 and 5.4.1 we have

∫_X (Σ_{n=1}^∞ ϕn) dµ = lim_{n→∞} ∫_X (Σ_{k=1}^n ϕk) dµ
  = lim_{n→∞} Σ_{k=1}^n ∫_X ϕk dµ = Σ_{n=1}^∞ ∫_X ϕn dµ.
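For the counting measure this proposition reduces to the familiar fact that a double series with nonnegative terms may be summed in either order. A quick numerical sanity check (a truncated toy setting of our own, not from the text):

```python
# Illustrative check of term-by-term integration of a series of nonnegative
# functions, for the counting measure on X = {0, ..., 49}.

X = range(50)
integral = lambda f: sum(f(x) for x in X)        # counting-measure integral

# phi_n(x) = (1/2)^n * (0.9)^x for n = 1, ..., 29 (nonnegative on all of X)
phis = [lambda x, n=n: (0.5 ** n) * (0.9 ** x) for n in range(1, 30)]

lhs = integral(lambda x: sum(phi(x) for phi in phis))   # integral of the sum
rhs = sum(integral(phi) for phi in phis)                # sum of the integrals
print(abs(lhs - rhs) < 1e-12)   # → True
```

With finitely many terms the two sides agree by plain additivity; the content of 8.1.10 is that (by 8.1.8) the agreement survives the passage to infinitely many terms.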
On the other hand, letting E := ϕ^{-1}((0, ∞]) and En := ϕ^{-1}((1/n, ∞]) for each
n ∈ N, we have (cf. 6.1.26):

E ∈ A, En ∈ A for each n ∈ N, E = ∪_{n=1}^∞ En.

Since (1/n) χ_{En} ≤ ϕ, by 8.1.7 we have

(1/n) µ(En) = ∫_X (1/n) χ_{En} dµ ≤ ∫_X ϕ dµ, ∀n ∈ N.

Thus, if ∫_X ϕ dµ = 0 then µ(En) = 0 for each n ∈ N, and hence (cf. 7.1.4a)
µ(E) = 0. Since

ϕ(x) = 0, ∀x ∈ X − E,

this shows that ∫_X ϕ dµ = 0 implies ϕ(x) = 0 µ-a.e. on X.
b: We have ϕ^{-1}({∞}) ∈ A by 6.1.26. Defining ψn := n χ_{ϕ^{-1}({∞})} for each
n ∈ N, we have

ψn ∈ S+(X, A), ψn ≤ ϕ, ∫_X ψn dµ = n µ(ϕ^{-1}({∞})).
Actually, since the value of ∫_X ϕ dµ is independent of which extension of ϕ is used
in order to define it, this extension need not be specified, unless this is useful for
calculations.
As a matter of convenience, for ϕ ∈ L+(X, A, µ) we will sometimes write

∫_X ϕ(x) dµ(x) := ∫_X ϕ dµ.
Proof. We have

∫_X ϕ dµ := ∫_X ϕ̃ dµ,

and we define

E_{0,n} := ϕ̃^{-1}([n, ∞]),
E_{k,n} := ϕ̃^{-1}([(k − 1)/2^n, k/2^n)) for k = 1, ..., n2^n,
ψn := Σ_{k=1}^{n2^n} ((k − 1)/2^n) χ_{Ek,n} + n χ_{E0,n}.
We recall (cf. the proof of 6.2.26) that {ψn } is a sequence in S + (X, A) so that
8.1.16 Remark. Proposition 8.1.15 shows that we could have defined, for every
ϕ ∈ L+(X, A, µ), the integral of ϕ with respect to µ by

∫_X ϕ dµ := sup_{n≥1} [ Σ_{k=2}^{n2^n} ((k − 1)/2^n) µ(ϕ^{-1}([(k − 1)/2^n, k/2^n))) + n µ(ϕ^{-1}([n, ∞])) ],

without going through 8.1.2, 8.1.6, 8.1.14; this would have been close to Lebesgue's
original way of defining his integral (cf. Shilov and Gurevich, 1966, 6.6). This way
of defining the integral has the merit of showing at the outset why the integral
can be defined for measurable functions only (if ϕ were not measurable with Dϕ
measurable, then the sets ϕ^{-1}([(k − 1)/2^n, k/2^n)) and ϕ^{-1}([n, ∞]) would not be elements
of A for all n's and k's, and the whole formula would be meaningless since the
domain of µ is A). Indeed, the definition in 8.1.6 would not be contradictory if
it were given for all functions with X as domain and [0, ∞] as final set, and only
later does it become clear why the functions must be measurable (e.g., the proof of
additivity given in 8.1.9 requires the measurability of the functions in an essential
way).
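The Lebesgue-style sums above can be computed for a concrete function. The following sketch (a setup of our own: ϕ(x) = x² and Lebesgue measure on [0, 1) approximated by a uniform discrete measure on a grid) shows the sums increasing toward the integral 1/3.

```python
# Illustrative computation of the Lebesgue-style sums of 8.1.16 for
# phi(x) = x**2 on [0, 1), with Lebesgue measure approximated by N grid points.

N = 100_000
grid = [i / N for i in range(N)]
mu_point = 1.0 / N                        # each grid point carries mass 1/N

def lebesgue_sum(phi_vals, n):
    # Sum over levels (k-1)/2^n of the measure of phi^{-1}([(k-1)/2^n, k/2^n)),
    # computed here by quantizing phi downward to the grid of levels.
    step = 2.0 ** -n
    s = 0.0
    for v in phi_vals:
        if v >= n:
            s += n * mu_point             # the n * mu(phi^{-1}([n, inf])) term
        else:
            s += (int(v / step) * step) * mu_point
    return s

phi_vals = [x * x for x in grid]
approx = [lebesgue_sum(phi_vals, n) for n in (1, 2, 4, 8)]
print(approx[-1])   # close to the exact integral 1/3
```

The sums are nondecreasing in n (each dyadic refinement can only raise the quantized values), which is why the definition takes a supremum over n.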
8.1.17 Theorem. Let ϕ, ψ ∈ L+(X, A, µ). Then:
(a) aϕ + bψ ∈ L+(X, A, µ) and ∫_X (aϕ + bψ) dµ = a ∫_X ϕ dµ + b ∫_X ψ dµ, ∀a, b ∈ [0, ∞];
(b) if ϕ(x) ≤ ψ(x) µ-a.e. on Dϕ ∩ Dψ, then ∫_X ϕ dµ ≤ ∫_X ψ dµ;
(c) if ϕ(x) = ψ(x) µ-a.e. on Dϕ ∩ Dψ, then ∫_X ϕ dµ = ∫_X ψ dµ;
(d) ϕψ ∈ L+(X, A, µ).
Proof. a: For a, b ∈ [0, ∞] we have:
D_{aϕ+bψ} = Dϕ ∩ Dψ ∈ A (cf. 6.2.30);
µ(X − Dϕ ∩ Dψ) = µ((X − Dϕ) ∪ (X − Dψ)) = 0 (cf. 7.1.2b);
ϕ_{Dϕ∩Dψ}, ψ_{Dϕ∩Dψ} ∈ L+(Dϕ ∩ Dψ, A_{Dϕ∩Dψ}) (cf. 6.2.3 and 6.1.19b) and hence
aϕ + bψ = aϕ_{Dϕ∩Dψ} + bψ_{Dϕ∩Dψ} ∈ L+(Dϕ ∩ Dψ, A_{Dϕ∩Dψ}) (cf. 6.2.31).
This proves that aϕ + bψ ∈ L+(X, A, µ).
Further, if ϕ̃, ψ̃ ∈ L+(X, A) are extensions of ϕ, ψ respectively (cf. 8.1.14), then
aϕ̃ + bψ̃ ∈ L+(X, A) and aϕ̃ + bψ̃ is an extension of aϕ + bψ. Then,

∫_X (aϕ + bψ) dµ = ∫_X (aϕ̃ + bψ̃) dµ = a ∫_X ϕ̃ dµ + b ∫_X ψ̃ dµ = a ∫_X ϕ dµ + b ∫_X ψ dµ,
where 8.1.9 and 8.1.13 have been used.
b: Let E ∈ A be so that
µ(E) = 0 and ϕ(x) ≤ ψ(x) for all x ∈ Dϕ ∩ Dψ ∩ (X − E).
If ϕ̃, ψ̃ ∈ L+(X, A) are extensions of ϕ, ψ respectively, then we have
ϕ̃(x) ≤ ψ̃(x) for all x ∈ Dϕ ∩ Dψ ∩ (X − E) = X − ((X − Dϕ ) ∪ (X − Dψ ) ∪ E).
Since (X − Dϕ ) ∪ (X − Dψ ) ∪ E ∈ A and µ((X − Dϕ ) ∪ (X − Dψ ) ∪ E) = 0 (cf.
7.1.2b), we have
ϕ̃(x) ≤ ψ̃(x) µ-a.e. on X,
(a) there exists ϕ̃ ∈ L+(X, A) such that ϕ̃(x) = lim_{n→∞} ϕn(x) µ-a.e. on ∩_{n=1}^∞ Dϕn;
(b) if ϕ ∈ L+(X, A, µ) and ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn),
then

∫_X ϕ dµ = lim_{n→∞} ∫_X ϕn dµ = sup_{n≥1} ∫_X ϕn dµ.
which is an element of L+(X, A) since, for every S ⊂ [0, ∞], ϕ̃n^{-1}(S) is either
ϕn^{-1}(S) ∩ (X − E) or (ϕn^{-1}(S) ∩ (X − E)) ∪ E, and A_{Dϕn} ⊂ A. We have ϕ̃n ≤ ϕ̃n+1
since X − E ⊂ Dϕn ∩ Dϕn+1 ∩ (X − En). Hence, by 8.1.8a, the sequence {ϕ̃n(x)}
is convergent for all x ∈ X and lim_{n→∞} ϕ̃n ∈ L+(X, A). Letting ϕ̃ := lim_{n→∞} ϕ̃n,
we also have

ϕ̃(x) := lim_{n→∞} ϕ̃n(x) = lim_{n→∞} ϕn(x), ∀x ∈ X − E = (∩_{n=1}^∞ Dϕn) ∩ (X − E),

and hence

ϕ̃(x) = lim_{n→∞} ϕn(x) µ-a.e. on ∩_{n=1}^∞ Dϕn.
b: Let ϕ̃n and ϕ̃ denote the same functions as in the proof of part a. By 8.1.8b
we have

∫_X ϕ̃ dµ = lim_{n→∞} ∫_X ϕ̃n dµ = sup_{n≥1} ∫_X ϕ̃n dµ. (1)

Similarly, if ϕ ∈ L+(X, A, µ) and ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn),
and if F ∈ A is so that

µ(F) = 0 and ϕ(x) = lim_{n→∞} ϕn(x) for all x ∈ Dϕ ∩ (∩_{n=1}^∞ Dϕn) ∩ (X − F),

then

ϕ(x) = ϕ̃(x) for all x ∈ Dϕ ∩ (∩_{n=1}^∞ Dϕn) ∩ (X − F) ∩ (X − E) = Dϕ ∩ (X − (F ∪ E)).
8.1.20 Lemma (Fatou's lemma). Let {ϕn} be a sequence in L+(X, A, µ) and let
ϕ ∈ L+(X, A, µ). Suppose that

ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn)

and that there exists M ∈ [0, ∞) such that ∫_X ϕn dµ ≤ M for all n ∈ N. Then,

∫_X ϕ dµ ≤ M.
and hence (cf. 6.2.18) ϕ̃ = sup_{n≥1} (inf_{k≥n} ϕ̃k). Now, for each n ∈ N,

inf_{k≥n} ϕ̃k ∈ L+(X, A) (cf. 6.2.19a) and inf_{k≥n} ϕ̃k ≤ inf_{k≥n+1} ϕ̃k.

Moreover, for each n ∈ N we have inf_{k≥n} ϕ̃k ≤ ϕ̃n and hence (cf. 8.1.7)

∫_X (inf_{k≥n} ϕ̃k) dµ ≤ ∫_X ϕ̃n dµ = ∫_X ϕn dµ ≤ M.

Then we have

∫_X ϕ dµ = ∫_X ϕ̃ dµ ≤ M.
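The inequality in Fatou's lemma can be strict; the classic "moving bump" makes this concrete (a toy discrete illustration of our own, with the counting measure on a finite set standing in for N):

```python
# Illustrative "moving bump": phi_n = chi_{n} has integral 1 for every n in the
# window, yet its pointwise limit is the zero function.

X = range(100)
integral = lambda f: sum(f(x) for x in X)     # counting-measure integral

def phi(n):
    return lambda x: 1.0 if x == n else 0.0

M = max(integral(phi(n)) for n in range(100))   # every ∫ phi_n dµ equals 1
# pointwise limit of phi_n(x) as n -> infinity: the bump leaves any fixed x
limit = lambda x: 0.0
print(integral(limit), "<=", M)   # → 0.0 <= 1.0
```

So the bound ∫ ϕ dµ ≤ M of the lemma holds here with ∫ ϕ dµ = 0 strictly below M = 1: the mass of the bumps "escapes" in the limit.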
8.2.4 Proposition. For ϕ ∈ M(X, A, µ), the following conditions are equivalent:
(a) ϕ ∈ L1(X, A, µ);
(b) ∫_X |ϕ| dµ < ∞.
Proof. We have
(Re ϕ)± (x) = (Re ψ)± (x) and (Im ϕ)± (x) = (Im ψ)± (x) µ-a.e. on Dϕ ∩ Dψ .
The result then follows from 8.1.17c.
Proof. Use 8.1.18b (or else, notice that if µ(X) = 0 then ϕ(x) = 0 µ-a.e. for each
ϕ ∈ M(X, A, µ), and use 8.2.7).
Let now ϕ ∈ L1 (X, A, µ). If α ≥ 0, then 2 follows from 8.1.17a and from the
equalities
(Re(αϕ))± = α(Re ϕ)± and (Im(αϕ))± = α(Im ϕ)± .
If α = −1, then 2 follows from the equalities
(Re(−ϕ))± = (Re ϕ)∓ and (Im(−ϕ))± = (Im ϕ)∓ .
If α = i, then 2 follows from the equalities
(Re(iϕ))± = (Im ϕ)∓ and (Im(iϕ))± = (Re ϕ)± .
Combining these cases with 1, we obtain 2 for any α ∈ C.
where the second inequality holds by 8.1.17b and the last equality holds since |α| =
1.
(c) if ϕ ∈ M(X, A, µ) is s.t. ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn),
then ϕ ∈ L1(X, A, µ), lim_{n→∞} ∫_X |ϕn − ϕ| dµ = 0, and ∫_X ϕ dµ = lim_{n→∞} ∫_X ϕn dµ.
Proof. a: For each n ∈ N, we note that |ϕn (x)| ≤ ψ(x) entails ψ(x) ∈ [0, ∞). Thus
we have |ϕn (x)| ≤ |ψ(x)| µ-a.e. on Dϕn ∩ Dψ , and this implies ϕn ∈ L1 (X, A, µ)
by 8.2.5.
b: Let E ∈ A be so that

µ(E) = 0 and {ϕn(x)} is convergent for all x ∈ (∩_{n=1}^∞ Dϕn) ∩ (X − E).
Letting S := (∩_{n=1}^∞ Dϕn) ∩ (X − E), we have S ∈ A. We define the function

ϕ : S → C
x ↦ ϕ(x) := lim_{n→∞} ϕn(x).

Since (ϕn)S ∈ M(S, AS) (cf. 6.2.3 and 6.1.19b), we have ϕ ∈ M(S, AS) by 6.2.20c,
and hence ϕ ∈ M(X, A, µ) since µ(X − S) = µ((∪_{n=1}^∞ (X − Dϕn)) ∪ E) = 0 (cf.
7.1.4a). From the definition of ϕ we have

ϕ(x) = lim_{n→∞} ϕn(x), ∀x ∈ Dϕ = Dϕ ∩ (∩_{n=1}^∞ Dϕn).
We have
|ϕ(x)| ≤ ψ(x), ∀x ∈ X − H = Dϕ ∩ Dψ ∩ (X − H),
and hence ϕ ∈ L1 (X, A, µ) by 8.2.5.
Now we define the functions:
ϕ̃n := (ϕn )X−H for each n ∈ N, ψ̃ := ψX−H , ϕ̃ := ϕX−H .
These functions are elements of M(X − H, AX−H ) by 6.2.3 and 6.1.19b, and hence
elements of M(X, A, µ) since µ(H) = 0. Moreover, ψ̃ ∈ L1 (X, A, µ) by 8.2.7.
For each n ∈ N, we define the function

ψ̃n : X − H → [0, ∞]
x ↦ ψ̃n(x) := sup_{k≥n} |ϕ̃k(x) − ϕ̃(x)|
(in this proof we characterize with a tilde the functions whose domain is X −H). By
6.2.16, 6.2.17, 6.2.19a we have ψ̃n ∈ M(X −H, AX−H ) and hence ψ̃n ∈ M(X, A, µ).
From
|ϕ̃k (x)| ≤ ψ̃(x) for each k ∈ N and |ϕ̃(x)| ≤ ψ̃(x), ∀x ∈ X − H,
we have
ψ̃n (x) ≤ 2ψ̃(x), ∀x ∈ X − H,
and hence ψ̃n ∈ L1 (X, A, µ) (cf. 8.2.5) and 2ψ̃ − ψ̃n ∈ L+ (X, A, µ). We also have
2ψ̃ − ψ̃n ≤ 2ψ̃ − ψ̃n+1 since obviously ψ̃n+1 ≤ ψ̃n . Furthermore, by 5.2.6 we have
lim ψ̃n (x) = lim |ϕ̃n (x) − ϕ̃(x)|
n→∞ n→∞
= lim |ϕn (x) − ϕ(x)| = 0, ∀x ∈ X − H,
n→∞
and hence
2ψ̃(x) = lim (2ψ̃(x) − ψ̃n (x)), ∀x ∈ X − H.
n→∞
and hence

lim_{n→∞} ∫_X ψ̃n dµ = 0. (1)
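The dominated convergence theorem can be watched in action numerically (an illustrative discrete setup of our own): ϕn(x) = x/(x + n) tends to 0 pointwise and is dominated by the integrable constant function 1, so the integrals must tend to 0 as well.

```python
# Illustrative check of dominated convergence for the discrete measure
# mu({x}) = 2**(-x) on X = {1, ..., 199} (a truncation of N).

X = range(1, 200)
mu = {x: 2.0 ** -x for x in X}
integral = lambda f: sum(f(x) * mu[x] for x in X)

# phi_n(x) = x / (x + n): decreases to 0 pointwise, dominated by psi = 1
vals = [integral(lambda x: x / (x + n)) for n in (1, 10, 100, 1000)]
print(vals)   # strictly decreasing toward ∫ (lim phi_n) dµ = 0
```

Without the dominating function the conclusion can fail (compare the moving bump used above to show that Fatou's inequality can be strict).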
µ : C × M(X, A, µ) → M(X, A, µ)
(α, [ϕ]) ↦ µ(α, [ϕ]) := α[ϕ] := [αϕ];
Proof. The only thing to prove is that the mappings σ, µ, π can indeed be defined
as they were in the statement, while it is immediate to check all the rest.
We already know that, if ϕ, ψ ∈ M(X, A, µ), then ϕ + ψ ∈ M(X, A, µ) (cf.
8.2.2). Suppose now that ϕ, ϕ′ , ψ, ψ ′ ∈ M(X, A, µ) are so that ϕ′ ∼ ϕ, ψ ′ ∼ ψ, and
let:
E ∈ A be so that µ(E) = 0 and ϕ′ (x) = ϕ(x), ∀x ∈ Dϕ′ ∩ Dϕ ∩ (X − E);
F ∈ A be so that µ(F ) = 0 and ψ ′ (x) = ψ(x), ∀x ∈ Dψ′ ∩ Dψ ∩ (X − F ).
Then
ϕ′ (x) + ψ ′ (x) = ϕ(x) + ψ(x),
∀x ∈ Dϕ′ ∩ Dϕ ∩ (X − E) ∩ Dψ′ ∩ Dψ ∩ (X − F )
= Dϕ′ +ψ′ ∩ Dϕ+ψ ∩ (X − (E ∪ F )),
which proves that ϕ′ + ψ ′ ∼ ϕ + ψ. This shows that the equivalence class [ϕ + ψ]
does not depend on the particular elements ϕ and ψ (of the classes [ϕ] and [ψ])
through which it has been defined. Hence, the rule which assigns [ϕ + ψ] to a
pair ([ϕ], [ψ]) ∈ M (X, A, µ) × M (X, A, µ) does assign one and only one element of
M (X, A, µ) to ([ϕ], [ψ]).
The arguments for µ and for π are analogous.
8.2.15 Theorem. The following definition, of the set L1(X, A, µ), is consistent:

L1(X, A, µ) := {[ϕ] ∈ M(X, A, µ) : ∫_X |ϕ| dµ < ∞}.

Then, L1(X, A, µ) is a linear manifold in the linear space M(X, A, µ).
The following definition, of the function ν, is consistent:

ν : L1(X, A, µ) → R
[ϕ] ↦ ν([ϕ]) := ‖[ϕ]‖_{L1} := ∫_X |ϕ| dµ.
Proof. To prove that L1(X, A, µ) can indeed be defined as it was in the statement,
we must show that the implication

[ϕ′, ϕ ∈ M(X, A, µ), ϕ′ ∼ ϕ, ∫_X |ϕ| dµ < ∞] ⇒ ∫_X |ϕ′| dµ < ∞ (∗)

holds true, because then the condition ∫_X |ϕ| dµ < ∞ is actually a condition for the
equivalence class [ϕ] even though it is expressed through a particular element of it.
Now, (∗) is true by 8.1.17c.
Similar arguments, based on 8.1.17c and 8.2.7, show that ν and I can be defined
as they were in the statement.
Since, for ϕ ∈ M(X, A, µ), ∫_X |ϕ| dµ < ∞ is equivalent to ϕ ∈ L1(X, A, µ) (cf.
8.2.4), 8.2.9 proves that L1(X, A, µ) is a linear manifold in M(X, A, µ) and that I
is a linear functional.
To prove that ν is a norm, we notice that:

∀[ϕ], [ψ] ∈ L1(X, A, µ), ‖[ϕ] + [ψ]‖_{L1} = ∫_X |ϕ + ψ| dµ ≤ ∫_X (|ϕ| + |ψ|) dµ
8.3.4 Proposition. Let (X, A, µ) be a measure space, let ρ ∈ L+(X, A, µ), and
define the function

ν : A → [0, ∞]
E ↦ ν(E) := ∫_E ρ dµ.
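In a finite discrete setting (toy data of our own) the set function ν defined by a density ρ can be verified directly: it inherits additivity from the integral, and ν(E) = 0 whenever µ(E) = 0.

```python
# Minimal sketch of 8.3.4: nu(E) = ∫_E rho dµ, in a discrete setting.

mu = {1: 1.0, 2: 0.5, 3: 0.25}           # a measure on X = {1, 2, 3}
rho = {1: 0.0, 2: 4.0, 3: 8.0}           # a nonnegative density

def nu(E):
    return sum(rho[x] * mu[x] for x in E)

# additivity on disjoint sets, inherited from the integral
assert nu({1, 2, 3}) == nu({1}) + nu({2}) + nu({3})
print(nu({2, 3}))   # → 4.0
```

Note also the absolute continuity visible here: a µ-null set E necessarily satisfies ν(E) = 0, which is exactly what the proof below establishes in general.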
Thus, ν is a measure on A.
If E ∈ A and µ(E) = 0, then χE(x)ρ̃(x) = 0 µ-a.e. on X and hence by 8.1.12a

ν(E) = ∫_X χE ρ̃ dµ = 0.

Assume now ρ(x) < ∞ µ-a.e. on Dρ. Then, if E ∈ A is so that

µ(E) = 0 and ρ(x) < ∞ for all x ∈ Dρ ∩ (X − E),

we have ρ̃^{-1}({∞}) ⊂ X − (Dρ ∩ (X − E)) = (X − Dρ) ∪ E; thus ρ̃^{-1}({∞}) is an
c: Suppose ρ(x) < ∞ for all x ∈ Dρ ; then ρ ∈ M(X, A, µ). If ψ ∈ M(X, A, µ),
then ψρ ∈ M(X, A, µ) by 8.2.2 and ψ ∈ M(X, A, ν) since µ(X − Dψ ) = 0 implies
ν(X − Dψ ) = 0. The rest of the statement about ψ follows from the definitions
given in 8.2.3 and from the results proved in part b.
Proof. a: We have:
µ(∅) = Σ_{k=1}^∞ ak µk(∅) = 0;
for a sequence {En} in A such that Ei ∩ Ej = ∅ if i ≠ j, by 5.4.5 and 5.4.7,

µ(∪_{n=1}^∞ En) = Σ_{k=1}^∞ ak µk(∪_{n=1}^∞ En) = Σ_{k=1}^∞ ak Σ_{n=1}^∞ µk(En)
  = Σ_{n=1}^∞ Σ_{k=1}^∞ ak µk(En) = Σ_{n=1}^∞ µ(En).
Thus, µ is a measure on A.
Now notice that, for E ∈ A, µ(E) = 0 iff µk(E) = 0 for all k ∈ J. This proves
that

L+(X, A, µ) = ∩_{k∈J} L+(X, A, µk) and M(X, A, µ) = ∩_{k∈J} M(X, A, µk).
For each E ∈ A we have

∫_X χE dµ = µ(E) = Σ_{k=1}^∞ ak µk(E) = Σ_{k=1}^∞ ak ∫_X χE dµk.
Hence, for τ ∈ S+(X, A), letting τ = Σ_{n=1}^N bn χ_{En} with {E1, ..., EN} a disjoint
family of elements of A and bn ∈ [0, ∞) for n = 1, ..., N, we have (cf. 5.4.5, 5.4.6,
5.3.3)

∫_X τ dµ = Σ_{n=1}^N bn ∫_X χ_{En} dµ = Σ_{n=1}^N bn Σ_{k=1}^∞ ak ∫_X χ_{En} dµk
  = Σ_{k=1}^∞ ak Σ_{n=1}^N bn ∫_X χ_{En} dµk = Σ_{k=1}^∞ ak ∫_X τ dµk.
For ϕ ∈ L+(X, A, µ), let ϕ̃ ∈ L+(X, A) be an extension of ϕ (cf. 8.1.14) and let
{τn} be a sequence in S+(X, A) so that (cf. 6.2.26)

τn ≤ τn+1 for all n ∈ N and ϕ̃ = lim_{n→∞} τn

(the function lim_{n→∞} τn is defined as in 6.2.18). Then we have, by 8.1.7 and 5.3.2b,

ak ∫_X τn dµk ≤ ak ∫_X τn+1 dµk, ∀(n, k) ∈ N × N,
The part of the statement about ψ ∈ M(X, A, µ) follows easily from what has just
been proved for ϕ ∈ L+ (X, A, µ), from 8.2.4, and from the definitions given in 8.2.3.
b: In part a of the statement, assume µ1 := µ, µ2 := ν, a1 := a, a2 := b, and,
for k > 2, ak any positive number and µk the null measure on A. Then everything
asserted in part b follows at once from part a.
Proof. By a straightforward check, we see that µx0 has properties af1 of 7.1.1 and
me of 7.1.7.
If ϕ ∈ L+(X, A, µx0), then x0 ∈ Dϕ (otherwise, µx0(X − Dϕ) = 1) and

ϕ(x) = ϕ(x0) µx0-a.e.

(since µx0(X − {x0}) = 0). Then (cf. 8.1.17c and 8.1.3a)

∫_X ϕ dµx0 = ∫_X ϕ(x0) dµx0 = ϕ(x0) µx0(X) = ϕ(x0).

The part of the statement about M(X, A, µx0) follows from this and from the
definitions given in 8.2.3.
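A direct check of this computation in a toy discrete setting (names and data ours): integrating against the Dirac measure at x0 simply evaluates the integrand at x0.

```python
# Minimal sketch: the Dirac measure at x0 on a finite X, and its integral.

x0 = 2

def dirac(E):                        # mu_{x0}(E) = 1 if x0 in E, else 0
    return 1.0 if x0 in E else 0.0

def integral_dirac(phi, X):
    # for a function on a finite X, ∫ phi dmu_{x0} = sum_x phi(x) mu_{x0}({x})
    return sum(phi(x) * dirac({x}) for x in X)

phi = lambda x: x ** 2 + 1
print(integral_dirac(phi, range(10)))   # → 5.0, i.e. phi(2)
```

Every term of the sum vanishes except the one at x0, which is the discrete shadow of the a.e. argument in the proof above.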
8.3.7 Proposition. Let µ be a non-null measure on A(dR ) and suppose that µ(E)
is either 0 or 1 for each E ∈ A(dR ). Then there exists x0 ∈ R so that µ is the Dirac
measure in x0 .
Proof. Since R = ∪_{n∈Z} [n, n + 1], the σ-subadditivity of µ (cf. 7.1.4a) implies that

∃n ∈ Z such that µ([n, n + 1]) = 1.

Then the family

X := {[a, b] : a, b ∈ R, n ≤ a ≤ b ≤ n + 1, µ([a, b]) = 1}

is non-empty because [n, n + 1] ∈ X. The subadditivity of µ (cf. 7.1.2b) implies
that

∀[a, b] ∈ X, µ([a, (a + b)/2]) = 0 ⇒ µ([(a + b)/2, b]) = 1.

Thus, we can define the mapping

ϕ : X → X
[a, b] ↦ ϕ([a, b]) := [a, (a + b)/2] if µ([a, (a + b)/2]) = 1, and
ϕ([a, b]) := [(a + b)/2, b] if µ([a, (a + b)/2]) = 0.
Next, we define a sequence {[an , bn ]} by letting:
[a1 , b1 ] := [n, n + 1],
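The halving construction started here can be simulated numerically (a hypothetical oracle of our own; the {0,1}-valued measure is secretly the Dirac measure at a point x0): each step keeps a half of measure 1, and the nested intervals close down on the atom.

```python
# Illustrative bisection for a {0,1}-valued measure on [0, 1].

x0 = 0.3711                            # the atom, unknown to the bisection

def mu(a, b):                          # oracle answering mu([a, b])
    return 1 if a <= x0 <= b else 0

a, b = 0.0, 1.0                        # [a1, b1], with mu([a1, b1]) = 1
for _ in range(50):
    m = (a + b) / 2
    # by subadditivity at least one half has measure 1; keep that half
    a, b = (a, m) if mu(a, m) == 1 else (m, b)
print(a, b)   # both within 2**-50 of x0
```

The invariant µ([a, b]) = 1 is preserved at every step, so the common point of the nested intervals must carry all the mass, which is what the proof goes on to establish.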
8.3.9 Remark. For the measure µ defined in 8.3.8 we have µ(X − {xn }n∈I ) = 0.
Conversely, suppose that we have a measure space (X, A, µ) such that there
exists a family {xn }n∈I , with I := {1, ..., N } or I := N, of points of X so that the
singleton set {xn } is an element of A for each n ∈ I and µ(X − {xn }n∈I ) = 0.
Then, for each E ∈ A, we have

µ(E) = µ(E ∩ {xn}_{n∈I}) = Σ_{n∈IE} µ({xn})
if we define IE := {n ∈ I : xn ∈ E}. Thus, µ turns out to be the measure defined
in 8.3.8, with an := µ({xn }) for each n ∈ I.
The measures of this kind, i.e. the ones that can be constructed as in 8.3.8, are
said to be discrete.
8.3.10 Remarks.
(a) In 8.3.8, let X := N, A := P(N) (cf. 6.1.15), I := N, xn := n and an := 1
for each n ∈ N. Then the measure µ is called the counting measure on N
(since, for E ⊂ N, µ(E) is the number of the points that are contained in E),
L+ (X, A, µ) is the family of all sequences in [0, ∞] and M(X, A, µ) is the family
of all sequences in C. For a sequence ϕ := {yn} in [0, ∞] we have

∫_X ϕ dµ = Σ_{n=1}^∞ yn,
and for a sequence ψ := {zn } in C we have:
ψ ∈ L1(X, A, µ) iff Σ_{n=1}^∞ |zn| < ∞;
if ψ ∈ L1(X, A, µ) then ∫_X ψ dµ = Σ_{n=1}^∞ zn.
Thus, all the results about integrals of Section 8.1 and 8.2 have corollaries which
are results about series.
(b) In 8.3.8, let X := {1, ..., N }, A := P({1, ..., N }), I := {1, ..., N }, xn := n and
an := 1 for each n ∈ {1, ..., N }. Then the measure µ is called the counting
measure on {1, ..., N }, the equalities L1 (X, A, µ) = M(X, A, µ) = CN hold
true, and for an N-tuple ψ := (z1, ..., zN) ∈ CN we have

∫_X ψ dµ = Σ_{n=1}^N zn.
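Remark (a) above reduces integrals to series. As a truncated numerical check (our own): the geometric sequence ψ(n) = (−1/2)^n is absolutely summable, hence integrable for the counting measure, and its integral is the sum of the series, −1/3.

```python
# Illustrative check: for the counting measure on N (truncated here),
# psi in L1 iff sum |z_n| < infinity, and the integral is the sum of the series.

N_TRUNC = 60                                    # truncation of N for the demo
psi = [(-0.5) ** n for n in range(1, N_TRUNC)]

abs_sum = sum(abs(z) for z in psi)              # ∫ |psi| dµ = Σ |z_n|
integral = sum(psi)                             # ∫ psi dµ = Σ z_n
print(abs_sum, integral)   # ≈ 1.0 and ≈ -1/3
```

In this way the convergence theorems of Sections 8.1 and 8.2 (monotone and dominated convergence, Fatou) specialize to classical statements about rearranging and passing to the limit in series.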
µ2 : A2 → [0, ∞]
E ↦ µ2(E) := µ1(π^{-1}(E))

is a measure on A2.
(b) For ϕ ∈ L+(X2, A2, µ2) we have:
ϕ ∘ π ∈ L+(X1, A1, µ1);
∫_{X2} ϕ dµ2 = ∫_{X1} (ϕ ∘ π) dµ1 and ∫_E ϕ dµ2 = ∫_{π^{-1}(E)} (ϕ ∘ π) dµ1, ∀E ∈ A2.
(c) For ψ ∈ M(X2, A2, µ2) we have:
ψ ∘ π ∈ M(X1, A1, µ1);
ψ ∈ L1(X2, A2, µ2) iff ψ ∘ π ∈ L1(X1, A1, µ1);
if ψ ∈ L1(X2, A2, µ2) then ∫_{X2} ψ dµ2 = ∫_{X1} (ψ ∘ π) dµ1 and ∫_E ψ dµ2 =
∫_{π^{-1}(E)} (ψ ∘ π) dµ1, ∀E ∈ A2.
Thus, µ2 is a measure on A2 .
w.r.t. A2 and A(δ) (cf. 6.2.6). Thus, ϕ ∘ π ∈ L+(X1, A1, µ1).
Then, by 8.1.15 and 1.2.13Af we have

∫_{X2} ϕ dµ2 = sup_{n≥1} [ Σ_{k=2}^{n2^n} ((k − 1)/2^n) µ2(ϕ^{-1}([(k − 1)/2^n, k/2^n))) + n µ2(ϕ^{-1}([n, ∞])) ]
  = sup_{n≥1} [ Σ_{k=2}^{n2^n} ((k − 1)/2^n) µ1((ϕ ∘ π)^{-1}([(k − 1)/2^n, k/2^n))) + n µ1((ϕ ∘ π)^{-1}([n, ∞])) ]
  = ∫_{X1} (ϕ ∘ π) dµ1.
since (χE ◦ π)(x) = χπ−1 (E) (x) for each x ∈ Dπ (cf. 1.2.13Ag).
c: We have ψ ◦ π ∈ M(X1 , A1 , µ1 ) for each ψ ∈ M(X2 , A2 , µ2 ) by the first result
of part b, in view of 6.2.12 and 6.2.20b. The rest of the statement follows easily
from what was proved in part b and from the definitions given in 8.2.3.
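A discrete sketch (data and names ours) of the change-of-variables identity just proved: integrating ψ against the image measure µ2 equals integrating ψ ∘ π against µ1.

```python
# Illustrative image (pushforward) measure: mu2(E) := mu1(pi^{-1}(E)), with
# ∫_{X2} psi dmu2 = ∫_{X1} (psi ∘ pi) dmu1.

X1 = [0, 1, 2, 3]
mu1 = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
pi = lambda x: x % 2                    # maps X1 onto X2 = {0, 1}

def mu2(E):                             # image measure on X2
    return sum(mu1[x] for x in X1 if pi(x) in E)

psi = lambda y: 10.0 if y == 0 else 1.0

lhs = sum(psi(y) * mu2({y}) for y in (0, 1))     # ∫ psi dmu2
rhs = sum(psi(pi(x)) * mu1[x] for x in X1)       # ∫ (psi ∘ pi) dmu1
print(lhs, rhs)   # equal (≈ 4.6)
```

Both sides just regroup the same weighted values, which is the finite shadow of the sup-of-Lebesgue-sums argument above.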
The subject of this section is integration of functions defined on the cartesian prod-
uct of two σ-finite measure spaces. Actually, this section could be part of the
preceding one, since it deals with integration with respect to measures which are
constructed out of previously given measures. However, since the treatment of the
subject goes through several steps, it is perhaps better to deal with it in a separate
section.
We start with two set-theoretical concepts, which we define in 8.4.1 and examine
in 8.4.2.
This implies that E is a σ-algebra, and hence that A(S2) ⊂ E (cf. gσ2 of 6.1.17).
Since A1 ⊗ A2 = A(S2) (cf. 6.1.30a), this proves the statement.
b: Let x1 ∈ X1. Then, for every subset F of Y,

(ϕx1)^{-1}(F) = {x2 ∈ X2 : ϕ(x1, x2) ∈ F}
= {x2 ∈ X2 : (x1, x2) ∈ ϕ^{-1}(F)} = (ϕ^{-1}(F))x1.

Thus, if F ∈ B then ϕ^{-1}(F) ∈ A1 ⊗ A2, and hence (in view of the result proved in
part a) (ϕx1)^{-1}(F) ∈ A2. This proves that ϕx1 is measurable w.r.t. A2 and B.
The proof for ϕ^{x2}, for every x2 ∈ X2, is analogous.
S2 := {E1 × E2 : Ek ∈ Ak for k = 1, 2}
ν : S2 → [0, ∞]
E1 × E2 7→ ν(E1 × E2 ) := µ1 (E1 )µ2 (E2 )
has obviously property a of 7.3.1. We will prove that it has properties b and c
as well. Indeed, suppose that {E1,n × E2,n}_{n∈I} is a disjoint family of elements of
S2 such that I = {1, ..., N} or I = N, and such that ∪_{n∈I} (E1,n × E2,n) ∈ S2,
i.e. ∪_{n∈I} (E1,n × E2,n) = E1 × E2 with E1 ∈ A1 and E2 ∈ A2. Then, for each
(x1, x2) ∈ X1 × X2,
Thus, ν has properties a, b, c of 7.3.1, and hence there exists a unique additive
function µ0 on the algebra A0 (S2 ) which is an extension of ν, and µ0 is a premeasure.
Then, by 7.3.2 there exists a measure µ on A(A0 (S2 )) which is an extension of µ0 .
Since A(S2 ) = A(A0 (S2 )) (cf. 6.1.18), this proves that there exists a measure µ on
A1 ⊗ A2 which is an extension of ν.
If µ̃ is another measure which is an extension of ν, then the restriction of µ̃ to
A0 (S2 ) must coincide with µ0 on account of the uniqueness asserted in 7.3.1A, and
then µ̃ must coincide with µ on account of the uniqueness asserted in 7.3.2, since
µ0 is σ-finite. Indeed, for k = 1, 2 there exists a countable family {Fk,n}_{n∈Ik} of
elements of Ak so that µk(Fk,n) < ∞ for all n ∈ Ik and Xk = ∪_{n∈Ik} Fk,n. Then
µ0(F1,n × F2,m) = µ1(F1,n) µ2(F2,m) < ∞ for all (n, m) ∈ I1 × I2 and X1 × X2 =
∪_{(n,m)∈I1×I2} (F1,n × F2,m), so µ0 is σ-finite. Obviously, this also proves that µ is
σ-finite.
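On finite spaces the product measure is simply the pointwise product of weights (a toy sketch of our own), and the defining property ν(E1 × E2) = µ1(E1)µ2(E2) can be checked directly.

```python
# Illustrative product measure on the product of two finite measure spaces.

mu1 = {'a': 1.0, 'b': 2.0}
mu2 = {0: 0.5, 1: 0.5, 2: 1.0}

# product measure on X1 x X2, defined on singletons by multiplying the weights
prod = {(x1, x2): mu1[x1] * mu2[x2] for x1 in mu1 for x2 in mu2}

def measure(m, E):
    return sum(m[p] for p in E)

E1, E2 = {'b'}, {0, 2}
rect = {(x1, x2) for x1 in E1 for x2 in E2}
print(measure(prod, rect))   # → 3.0, i.e. mu1({'b'}) * mu2({0, 2}) = 2.0 * 1.5
```

The uniqueness argument above says that, in the σ-finite case, multiplying weights on rectangles pins down the product measure on the whole product σ-algebra.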
µ1 ⊗ · · · ⊗ µN := ((· · · ((µ1 ⊗ µ2 ) ⊗ µ3 ) ⊗ · · · ) ⊗ µN −1 ) ⊗ µN .
Then
µ1 ⊗ · · · ⊗ µN
= (· · · (((µ1 ⊗ · · · ⊗ µi1 ) ⊗ (µi1 +1 ⊗ · · · ⊗ µi2 )) ⊗ (µi2 +1 ⊗ · · · ⊗ µi3 )) ⊗ · · · )
⊗(µir +1 ⊗ · · · ⊗ µN ).
Proof. We recall that the measure (µ1 ⊗ · · · ⊗ µN)Y1×···×YN is defined on the σ-
algebra (A1 ⊗ · · · ⊗ AN)Y1×···×YN and that the measure (µ1)Y1 ⊗ · · · ⊗ (µN)YN is
defined on the σ-algebra A1^{Y1} ⊗ · · · ⊗ AN^{YN}. Moreover, from 6.1.30c and its proof we
know that

(A1 ⊗ · · · ⊗ AN)Y1×···×YN = A1^{Y1} ⊗ · · · ⊗ AN^{YN} = A(SN^{Y1×···×YN}),

with

SN^{Y1×···×YN} = {G1 × · · · × GN : Gk ∈ Ak^{Yk} for k = 1, ..., N}.

We also know that SN^{Y1×···×YN} is a semialgebra on Y1 × · · · × YN (cf. 6.1.30a, with
(Xk, Ak) replaced by (Yk, Ak^{Yk})). For each G1 × · · · × GN ∈ SN^{Y1×···×YN} we have

(µ1 ⊗ · · · ⊗ µN)Y1×···×YN (G1 × · · · × GN) = (µ1 ⊗ · · · ⊗ µN)(G1 × · · · × GN)
= µ1(G1) · · · µN(GN) = (µ1)Y1(G1) · · · (µN)YN(GN)
= ((µ1)Y1 ⊗ · · · ⊗ (µN)YN)(G1 × · · · × GN).

Since the restrictions of (µ1 ⊗ · · · ⊗ µN)Y1×···×YN and (µ1)Y1 ⊗ · · · ⊗ (µN)YN to
SN^{Y1×···×YN} coincide and since SN^{Y1×···×YN} is a semialgebra on Y1 × · · · × YN, an
argument similar to the one used at the end of 8.4.5 leads to the equality between
the measures (µ1 ⊗ · · · ⊗ µN)Y1×···×YN and (µ1)Y1 ⊗ · · · ⊗ (µN)YN.
8.4.7 Lemma. Let (X1, A1, µ1) and (X2, A2, µ2) be σ-finite measure spaces. For
each E ∈ A1 ⊗ A2, the functions

ψ1E : X1 → [0, ∞]
x1 ↦ ψ1E(x1) := µ2(Ex1)

and

ψ2E : X2 → [0, ∞]
x2 ↦ ψ2E(x2) := µ1(E^{x2})

are defined consistently, are elements of L+(X1, A1) and L+(X2, A2) respectively,
and

(µ1 ⊗ µ2)(E) = ∫_{X1} ψ1E dµ1 = ∫_{X2} ψ2E dµ2,
Proof. It follows from 8.4.2a that the definitions of the functions ψ1E and ψ2E are
consistent, for each E ∈ A1 ⊗ A2 .
First, suppose that (µ1 ⊗ µ2 )(X1 × X2 ) < ∞. Since (µ1 ⊗ µ2 )(X1 × X2 ) =
µ1 (X1 )µ2 (X2 ), this is equivalent to µ1 (X1 ) < ∞ and µ2 (X2 ) < ∞.
Define

C := {E ∈ A1 ⊗ A2 : ψkE ∈ L+(Xk, Ak) and (µ1 ⊗ µ2)(E) = ∫_{Xk} ψkE dµk, for k = 1, 2}.
and

ψ1E(x1) = µ2(Ex1) = lim_{n→∞} µ2((En)x1) = lim_{n→∞} ψ1En(x1), ∀x1 ∈ X1.
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 217
This shows that ψ_1^E ∈ L^+(X1, A1) (cf. 6.2.19b). Moreover, we notice that ψ_1^E, ψ_1^{En} for each n ∈ N, and the constant function
ψ1 : X1 → [0, ∞]
    x1 ↦ ψ1(x1) := µ2(X2),
which are elements of L^+(X1, A1) (cf. also 6.2.2), are in fact elements of M(X1, A1) since µ2(X2) < ∞ (cf. also 7.1.2a). Further, ψ1 ∈ L^1(X1, A1, µ1) since µ1(X1) < ∞ (cf. 8.2.6). Then, by 8.2.11 with ψ1 as dominating function, we have
(µ1 ⊗ µ2)(E) = lim_{n→∞} (µ1 ⊗ µ2)(En) = lim_{n→∞} ∫_{X1} ψ_1^{En} dµ1 = ∫_{X1} ψ_1^E dµ1.
Proof. It follows from 8.4.2b that the definitions of the functions ψ_1^ϕ and ψ_2^ϕ are consistent, for each ϕ ∈ L^+(X1 × X2, A1 ⊗ A2).
Suppose E ∈ A1 ⊗ A2 and ϕ := χ_E. Then, for each x1 ∈ X1,
∀x2 ∈ X2, ϕ_{x1}(x2) = χ_E(x1, x2) = χ_{E_{x1}}(x2)
and hence, if we define ψ_1^E as in 8.4.7,
∀x1 ∈ X1, ψ_1^ϕ(x1) = ∫_{X2} χ_{E_{x1}} dµ2 = µ2(E_{x1}) = ψ_1^E(x1).
And similarly for ψ_2^ϕ. Thus, the conclusions of the statement are true for ϕ = χ_E
with E ∈ A1 ⊗ A2 .
For a, b ∈ [0, ∞) and ϕ, ϕ̃ ∈ L^+(X1 × X2, A1 ⊗ A2), we have
aϕ + bϕ̃ ∈ L^+(X1 × X2, A1 ⊗ A2)
(cf. 6.2.31) and also, for each x1 ∈ X1,
∀x2 ∈ X2, (aϕ + bϕ̃)_{x1}(x2) = aϕ_{x1}(x2) + bϕ̃_{x1}(x2),
and hence ψ_1^{aϕ+bϕ̃} = aψ_1^ϕ + bψ_1^ϕ̃, and similarly for ψ_2^{aϕ+bϕ̃}. From this and from what was proved above for a characteristic function, by linearity (cf. 8.1.9 and 8.1.13) we have that the conclusions of the statement are true for all the elements of S^+(X1 × X2, A1 ⊗ A2).
Suppose now ϕ ∈ L^+(X1 × X2, A1 ⊗ A2). Then there is a sequence {ϕn} in S^+(X1 × X2, A1 ⊗ A2) such that
ϕn ≤ ϕ_{n+1}, ∀n ∈ N, and ϕn(x1, x2) → ϕ(x1, x2) as n → ∞, ∀(x1, x2) ∈ X1 × X2
(cf. 6.2.26). Now, for each x1 ∈ X1, ϕ_{x1} ∈ L^+(X2, A2), {(ϕn)_{x1}} is a sequence in L^+(X2, A2) (cf. 8.4.2b), (ϕn)_{x1} ≤ (ϕ_{n+1})_{x1}, and (ϕn)_{x1}(x2) → ϕ_{x1}(x2) as n → ∞ for all x2 ∈ X2. By 8.1.7 and 8.1.8, this implies that
∀x1 ∈ X1, ψ_1^{ϕn}(x1) ≤ ψ_1^{ϕ_{n+1}}(x1) and ψ_1^{ϕn}(x1) → ψ_1^ϕ(x1) as n → ∞.
Since the conclusions of the statement are true for ϕn for all n ∈ N, this implies, by 8.1.8 used twice, that ψ_1^ϕ ∈ L^+(X1, A1) and that
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = lim_{n→∞} ∫_{X1×X2} ϕn d(µ1 ⊗ µ2) = lim_{n→∞} ∫_{X1} ψ_1^{ϕn} dµ1 = ∫_{X1} ψ_1^ϕ dµ1.
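Tonelli's theorem can be sanity-checked numerically in the simplest σ-finite setting, the counting measure on N, where integrals become series; a minimal Python sketch (the summand phi and the truncation level are hypothetical choices, not from the text):

```python
# Tonelli for the counting measures on N: for a non-negative summand,
# the two iterated sums and the sum over the product set coincide.
# phi and the truncation level N are hypothetical illustrative choices.

def phi(n, s):
    return 1.0 / (2 ** n * 3 ** s)   # non-negative, with finite double sum

N = 60   # truncation; the neglected tails are far below the tolerance used

rows_first = sum(sum(phi(n, s) for s in range(1, N)) for n in range(1, N))
cols_first = sum(sum(phi(n, s) for n in range(1, N)) for s in range(1, N))
product = sum(phi(n, s) for n in range(1, N) for s in range(1, N))

print(abs(rows_first - cols_first) < 1e-12)   # → True
print(abs(rows_first - product) < 1e-12)      # → True
```

For a non-negative summand no absolute-convergence hypothesis is needed, which is exactly the point of Tonelli as opposed to Fubini.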
8.4.9 Corollary. Let (X1, A1, µ1) and (X2, A2, µ2) be σ-finite measure spaces and suppose that, for ϕ ∈ M(X1 × X2, A1 ⊗ A2), there exist ϕ1 ∈ L^1(X1, A1, µ1) and ϕ2 ∈ L^1(X2, A2, µ2) so that
|ϕ(x1, x2)| = |ϕ1(x1)||ϕ2(x2)|, ∀(x1, x2) ∈ X1 × X2.
Then ϕ ∈ L^1(X1 × X2, A1 ⊗ A2, µ1 ⊗ µ2).
Proof. Letting I_k := ∫_{Xk} |ϕk| dµk, we have I_k < ∞ for k = 1, 2 (cf. 8.2.4). Since |ϕ| ∈ L^+(X1 × X2, A1 ⊗ A2) (cf. 6.2.17), from 8.4.8 (with ϕ replaced by |ϕ|) we have
∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) = ∫_{X1} ψ_1^{|ϕ|} dµ1.
Now, ψ_1^{|ϕ|} = I_2 |ϕ1| and hence
∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) = I_2 I_1 < ∞,
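A discrete sketch of this corollary, with two finite measures given by hypothetical weight tables: when |ϕ| factorizes as |ϕ1||ϕ2|, the integral of |ϕ| with respect to the product measure is I1 I2.

```python
# Corollary 8.4.9 in a discrete setting: mu1, mu2 are finite measures
# on three- and two-point spaces (hypothetical weights), and
# |phi(x1, x2)| = |phi1(x1)| |phi2(x2)|; the integral of |phi| with
# respect to mu1 (x) mu2 then equals I1 * I2.

mu1 = {0: 0.5, 1: 1.5, 2: 0.25}
mu2 = {0: 2.0, 1: 0.75}
phi1 = {0: 1.0, 1: -2.0, 2: 3.0}
phi2 = {0: 0.5, 1: -4.0}

I1 = sum(abs(phi1[x]) * mu1[x] for x in mu1)
I2 = sum(abs(phi2[y]) * mu2[y] for y in mu2)
integral = sum(abs(phi1[x] * phi2[y]) * mu1[x] * mu2[y]
               for x in mu1 for y in mu2)

print(abs(integral - I1 * I2) < 1e-12)   # → True
```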
with the understanding that the expressions ∫_{X2} ϕ(x1, x2) dµ2(x2) and ∫_{X1} ϕ(x1, x2) dµ1(x1) are to be considered only for x1 ∈ D1 and for x2 ∈ D2, and hence only for µ1-a.e. x1 ∈ X1 and for µ2-a.e. x2 ∈ X2 respectively (the second equality is often referred to by saying that "the order of integration may be reversed").
Proof. We will prove the conclusions of the statement for ϕ_{x1} and for ρ_1^ϕ. The proof for ϕ^{x2} and ρ_2^ϕ is similar.
a: From 8.4.2b we have ϕ_{x1} ∈ M(X2, A2) for each x1 ∈ X1. Moreover we have |ϕ| ∈ L^+(X1 × X2, A1 ⊗ A2) (cf. 6.2.17) and ∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) < ∞ (cf. 8.2.4). Then from 8.4.8 (with ϕ replaced by |ϕ|) we have ψ_1^{|ϕ|} ∈ L^+(X1, A1) and
∫_{X1} ψ_1^{|ϕ|} dµ1 = ∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) < ∞,
and hence (since obviously |ϕ_{x1}| = |ϕ|_{x1}) that the set
D_1^∞ := (ψ_1^{|ϕ|})^{−1}({∞}) = {x1 ∈ X1 : ∫_{X2} |ϕ_{x1}| dµ2 = ∞}
is an element of A1 (cf. 6.1.26) and µ1(D_1^∞) = 0 (cf. 8.1.12b). This proves that ϕ_{x1} ∈ L^1(X2, A2, µ2) for µ1-a.e. x1 ∈ X1 (cf. 8.2.4).
b: We define
ϕ1 := (Re ϕ)+ , ϕ2 := (Re ϕ)− , ϕ3 := (Im ϕ)+ , ϕ4 := (Im ϕ)−
and we notice that ϕx1 = (ϕ1 )x1 − (ϕ2 )x1 + i(ϕ3 )x1 − i(ϕ4 )x1 for each x1 ∈ X1 (cf.
1.2.19).
Fix now i ∈ {1, 2, 3, 4}. We have ϕi ∈ L^+(X1 × X2, A1 ⊗ A2) (cf. 6.2.12 and 6.2.20b) and ϕi ≤ |ϕ|, and hence (ϕi)_{x1} ∈ L^+(X2, A2) (cf. 8.4.2b) and (ϕi)_{x1} ≤ |ϕ_{x1}| for each x1 ∈ X1. We define ψi := (ψ_1^{ϕi})_{D1}, with ψ_1^{ϕi} as in 8.4.8 (with ϕ replaced by ϕi). From 8.4.8 we have ψ_1^{ϕi} ∈ L^+(X1, A1), and hence ψi ∈ L^+(D1, A_1^{D1}) (cf. 6.2.3). As a matter of fact, ψi ∈ M(D1, A_1^{D1}) because D1 = X1 − D_1^∞ (cf. 8.2.4) and hence
∀x1 ∈ D1, ψi(x1) = ψ_1^{ϕi}(x1) = ∫_{X2} (ϕi)_{x1} dµ2 ≤ ∫_{X2} |ϕ_{x1}| dµ2 < ∞.
Thus, ψi ∈ M(X1, A1, µ1) since D1 ∈ A1 and µ1(X1 − D1) = µ1(D_1^∞) = 0 (cf. 8.2.1). Moreover, ψi ∈ L^+(X1, A1, µ1) and from 8.1.14, 8.4.8, 8.1.11a we have
∫_{X1} ψi dµ1 = ∫_{X1} ψ_1^{ϕi} dµ1 = ∫_{X1×X2} ϕi d(µ1 ⊗ µ2) ≤ ∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) < ∞,
and this shows (cf. 8.2.4) that ψi ∈ L1 (X1 , A1 , µ1 ).
Finally, for each x1 ∈ D1, we have
ρ_1^ϕ(x1) = ∫_{X2} ϕ_{x1} dµ2
= ∫_{X2} (ϕ1)_{x1} dµ2 − ∫_{X2} (ϕ2)_{x1} dµ2 + i ∫_{X2} (ϕ3)_{x1} dµ2 − i ∫_{X2} (ϕ4)_{x1} dµ2
= ψ1(x1) − ψ2(x1) + iψ3(x1) − iψ4(x1).
8.4.11 Remarks.
(a) Let N ∈ N be so that N > 2 and let (Xk, Ak, µk) be a σ-finite measure space for k = 1, ..., N. If 1 < i1 < i2 < ... < ir < N, then (cf. 8.4.5)
µ1 ⊗ · · · ⊗ µN = (· · · (((µ1 ⊗ · · · ⊗ µ_{i1}) ⊗ (µ_{i1+1} ⊗ · · · ⊗ µ_{i2})) ⊗ (µ_{i2+1} ⊗ · · · ⊗ µ_{i3})) ⊗ · · · ) ⊗ (µ_{ir+1} ⊗ · · · ⊗ µN).
Thus, if ϕ ∈ L^+(X1 × · · · × XN, A1 ⊗ · · · ⊗ AN), from 8.4.8 we have
∫_{X1×···×XN} ϕ d(µ1 ⊗ · · · ⊗ µN)
= ∫_{X_{ir+1}×···×XN} (· · · (∫_{X_{i1+1}×···×X_{i2}} (∫_{X1×···×X_{i1}} ϕ(x1, ..., xN) d(µ1 ⊗ · · · ⊗ µ_{i1})(x1, ..., x_{i1})) d(µ_{i1+1} ⊗ · · · ⊗ µ_{i2})(x_{i1+1}, ..., x_{i2})) · · · ) d(µ_{ir+1} ⊗ · · · ⊗ µN)(x_{ir+1}, ..., xN),
where the last equality holds because the names we give to variables are imma-
terial (while their positions are essential). Thus, the two functions ϕ and
(x1 , ..., xN ) 7→ ϕ(x1 , ..., xk−1 , xj , xk+1 , ..., xj−1 , xk , xj+1 , ..., xN )
8.4.12 Remark. In 8.4.8 and in 8.4.10 we assumed Dϕ = X1 ×X2 for the functions
ϕ in the statements. However, both Tonelli’s theorem and Fubini’s theorem can be
generalized to the case of functions defined only µ1 ⊗ µ2 -a.e. We examine here the
case of Tonelli’s theorem. For Fubini’s theorem the treatment would be analogous.
Let (X1 , A1 , µ1 ) and (X2 , A2 , µ2 ) be σ-finite measure spaces, and suppose that
for a function ϕ we have ϕ ∈ L+ (X1 × X2 , A1 ⊗ A2 , µ1 ⊗ µ2 ). This entails (cf.
8.1.14) Dϕ ∈ A1 ⊗ A2 , (µ1 ⊗ µ2 )(X1 × X2 − Dϕ ) = 0, ϕ ∈ L+ (Dϕ , (A1 ⊗ A2 )Dϕ ).
For each x1 ∈ π_{X1}(Dϕ) (cf. 1.2.6c), we define
ϕ_{x1} : (Dϕ)_{x1} → [0, ∞]
    x2 ↦ ϕ_{x1}(x2) := ϕ(x1, x2)
(the condition x1 ∈ π_{X1}(Dϕ) implies (Dϕ)_{x1} ≠ ∅). We have (Dϕ)_{x1} ∈ A2 (cf. 8.4.2a). We also have, for every subset F of R∗,
ϕ_{x1}^{−1}(F) = {x2 ∈ (Dϕ)_{x1} : ϕ(x1, x2) ∈ F}
= {x2 ∈ X2 : (x1, x2) ∈ Dϕ and ϕ(x1, x2) ∈ F}
= {x2 ∈ X2 : (x1, x2) ∈ ϕ^{−1}(F)} = (ϕ^{−1}(F))_{x1};
thus, if F ∈ A(δ), then ϕ^{−1}(F) = E ∩ Dϕ with E ∈ A1 ⊗ A2, and hence
ϕ_{x1}^{−1}(F) = (E ∩ Dϕ)_{x1} = E_{x1} ∩ (Dϕ)_{x1} ∈ A_2^{(Dϕ)_{x1}}
since E_{x1} ∈ A2 (cf. 8.4.2a); this implies that ϕ_{x1} ∈ L^+((Dϕ)_{x1}, A_2^{(Dϕ)_{x1}}). Moreover,
∫_{X1} µ2((X1 × X2 − Dϕ)_{x1}) dµ1(x1) = (µ1 ⊗ µ2)(X1 × X2 − Dϕ) = 0
(cf. 8.4.7) implies (cf. 8.1.12a) that µ2 ((X1 × X2 − Dϕ )x1 ) = 0 µ1 -a.e. on X1 ; since
(X1 × X2 − Dϕ )x1 = X2 − (Dϕ )x1 , this implies that
ϕx1 ∈ L+ (X2 , A2 , µ2 ) µ1 -a.e. on X1 .
Let then E1 ∈ A1 be such that µ1 (E1 ) = 0 and ϕx1 ∈ L+ (X2 , A2 , µ2 ) for each
x1 ∈ X1 − E1 .
Now, if ϕ̃ ∈ L+ (X1 × X2 , A1 ⊗ A2 ) is an extension of ϕ, we have (cf. 8.1.14)
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = ∫_{X1×X2} ϕ̃ d(µ1 ⊗ µ2).
Moreover, for each x1 ∈ π_{X1}(Dϕ), ϕ̃_{x1} is an element of L^+(X2, A2) (cf. 8.4.2b) which is obviously an extension of ϕ_{x1} for each x1 ∈ X1 − E1, and hence we have (cf. 8.1.14)
∫_{X2} ϕ_{x1} dµ2 = ∫_{X2} ϕ̃_{x1} dµ2, ∀x1 ∈ X1 − E1.
Thus, the function
X1 − E1 ∋ x1 ↦ ∫_{X2} ϕ_{x1} dµ2 ∈ [0, ∞]
is the restriction of the function ψ_1^{ϕ̃} (cf. 8.4.8 with ϕ replaced by ϕ̃) to X1 − E1, and hence (cf. 6.2.3) it is an element of L^+(X1, A1, µ1) and (cf. 8.1.14)
∫_{X1} (∫_{X2} ϕ_{x1}(x2) dµ2(x2)) dµ1(x1) = ∫_{X1} ψ_1^{ϕ̃} dµ1.
Then, 8.4.8 (with ϕ replaced by ϕ̃) implies that
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = ∫_{X1×X2} ϕ̃ d(µ1 ⊗ µ2) = ∫_{X1} ψ_1^{ϕ̃} dµ1 = ∫_{X1} (∫_{X2} ϕ(x1, x2) dµ2(x2)) dµ1(x1).
In a similar way it can be proved that
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = ∫_{X2} (∫_{X1} ϕ(x1, x2) dµ1(x1)) dµ2(x2).
From Fubini’s theorem we can derive the results about double series that we
present in 8.4.14. These results, it must be said, can be obtained by more elementary
means (cf. e.g. Apostol, 1974, th. 8.42).
(b) If the conditions in part a are satisfied, then all the series written below are convergent and
Σ_{n=1}^∞ (Σ_{s=1}^∞ ϕ(n, s)) = Σ_{s=1}^∞ (Σ_{n=1}^∞ ϕ(n, s)) = Σ_{k=1}^∞ ϕ(σ(k)).
Σ_{s=1}^∞ |ϕ(n, s)| and |Σ_{n=1}^∞ ϕ(n, s)| ≤ Σ_{n=1}^∞ |ϕ(n, s)|), and from 8.3.10a we have the equalities
Σ_{n=1}^∞ (Σ_{s=1}^∞ ϕ(n, s)) = ∫_N (∫_N ϕ(n, s) dµ(s)) dµ(n),
Σ_{s=1}^∞ (Σ_{n=1}^∞ ϕ(n, s)) = ∫_N (∫_N ϕ(n, s) dµ(n)) dµ(s),
Σ_{k=1}^∞ ϕ(σ(k)) = ∫_N (ϕ ◦ σ) dµ.
Now, the equalities we need to prove follow from 8.4.10 since from 8.3.11c we have
∫_N (ϕ ◦ σ) dµ = ∫_{N×N} ϕ dµ̃ = ∫_{N×N} ϕ d(µ ⊗ µ).
(b) If the conditions in part a are satisfied, then the series that may appear below
are convergent and
Σ_{n∈I} (Σ_{(n,s)∈I_n} α_{(n,s)}) = Σ_{(n,s)∈J} α_{(n,s)}.
Note that, for the various sums or series of this statement to be defined properly,
an ordering must be assumed in the various index sets. However, the orderings we
use in part a are immaterial in view of 5.4.3, and the orderings we use in part b are
immaterial in view of 4.1.8b because, if the conditions in part a are satisfied, then
all the series that may appear in part b are absolutely convergent.
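The content of part b — that for an absolutely convergent double series the row sums, the column sums, and any enumeration of N × N give the same value — can be illustrated numerically; the summand and the diagonal enumeration below are illustrative choices.

```python
# Absolutely convergent double series: summing by rows, by columns, or
# along a diagonal enumeration of N x N (a stand-in for a bijection
# sigma: N -> N x N) gives the same value. The summand is hypothetical.

def phi(n, s):
    return (-1) ** (n + s) / (2.0 ** n * 2.0 ** s)

N = 40   # truncation level (illustrative)
by_rows = sum(sum(phi(n, s) for s in range(1, N)) for n in range(1, N))
by_cols = sum(sum(phi(n, s) for n in range(1, N)) for s in range(1, N))

# enumerate {1,...,N-1}^2 diagonal by diagonal, each pair exactly once
diagonal = [(n, d - n) for d in range(2, 2 * N)
            for n in range(1, d) if n < N and 0 < d - n < N]
by_diagonal = sum(phi(n, s) for n, s in diagonal)

print(abs(by_rows - by_cols) < 1e-12)      # → True
print(abs(by_rows - by_diagonal) < 1e-12)  # → True
```

Absolute convergence is what makes the orderings immaterial, in agreement with the remark above about 4.1.8b and 5.4.3.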
In this section we prove the Riesz–Markov theorem for compact metric spaces, which
will play an essential role in our proof of the spectral theorem for unitary operators
(from which we will deduce the spectral theorem for self-adjoint operators).
8.5.1 Definition. Let M be a linear manifold in the linear space F (X) (cf. 3.1.10c)
and let L be a linear functional with DL = M . The linear functional L is said to
be positive if 0 ≤ Lϕ whenever ϕ ∈ M is such that 0X ≤ ϕ.
Notice that, if L is positive and ϕ, ψ ∈ M are such that ϕ ≤ ψ, then Lϕ ≤ Lψ
since Lψ = Lϕ + L(ψ − ϕ) and 0X ≤ ψ − ϕ.
8.5.2 Remark. Let (X, d) be a compact metric space and µ a finite measure on
the Borel σ-algebra A(d). For C(X) (cf. 3.1.10e) we have C(X) ⊂ L1 (X, A(d), µ)
by 6.2.8, 2.8.14 and 8.2.6. Thus, we can define the mapping
Lµ : C(X) → C
    ϕ ↦ Lµϕ := ∫_X ϕ dµ,
Proof. Existence: For each G ∈ Td, the family {ϕ ∈ C(X) : ϕ ≺ G} is not empty (it contains the function 0X; for ϕ ≺ G, cf. 2.5.10). Thus, we can define the function
ν : Td → [0, ∞]
    G ↦ ν(G) := sup{Lϕ : ϕ ∈ C(X) and ϕ ≺ G}. (1)
If G1 , G2 ∈ Td are so that G1 ⊂ G2 , then
{ϕ ∈ C(X) : ϕ ≺ G1 } ⊂ {ϕ ∈ C(X) : ϕ ≺ G2 },
and this implies that
if G1 , G2 ∈ Td are so that G1 ⊂ G2 , then ν(G1 ) ≤ ν(G2 ). (2)
We recall that, for ϕ ∈ C(X) and G ∈ Td , ϕ ≺ G implies ϕ ≤ 1X ; thus, since
1X ≺ X and L is positive, we have
ν(X) = L1X . (3)
Since ∪_{n=1}^∞ En ⊂ ∪_{n=1}^∞ Gn, in view of 4 this implies
µ∗(∪_{n=1}^∞ En) ≤ Σ_{n=1}^∞ µ∗(En) + ε.
We also have
ϕ = Σ_{i=1}^n ψi ϕ. (14)
Then we have
Lϕ = Σ_{i=1}^n L(ψi ϕ) (15)
≤ Σ_{i=1}^n (yi + ε) Lψi = Σ_{i=1}^n (|a| + yi + ε) Lψi − |a| Σ_{i=1}^n Lψi (16)
≤ Σ_{i=1}^n (|a| + yi + ε) µ(G̃i) − |a| µ(X) (17)
≤ Σ_{i=1}^n (|a| + yi + ε) (µ(Ei) + ε/n) − |a| µ(X) (18)
≤ Σ_{i=1}^n (yi + ε) µ(Ei) + (ε/n) Σ_{i=1}^n (|a| + yi + ε) (19)
≤ Σ_{i=1}^n (yi − ε) µ(Ei) + 2εµ(X) + ε|a| + εb + ε² (20)
≤ ∫_X ϕ dµ + ε(2µ(X) + |a| + b + ε), (21)
where: 15 holds by 14 and the linearity of L; 16 holds by the linearity and the positivity of L, since ψiϕ ≤ (yi + ε)ψi by the definition of G̃i; 17 holds by 1 and by 13; 18 holds by 12 since G̃i ⊂ Gi; 19 holds because Σ_{i=1}^n µ(Ei) = µ(∪_{i=1}^n Ei) = µ(X);
20 holds because Σ_{i=1}^n µ(Ei) = µ(X) and because yi ≤ b for i = 1, ..., n; 21 holds because
Σ_{i=1}^n (yi − ε) χ_{Ei} ≤ Σ_{i=1}^n y_{i−1} χ_{Ei} ≤ ϕ,
Σ_{i=1}^n (yi − ε) µ(Ei) = ∫_X (Σ_{i=1}^n (yi − ε) χ_{Ei}) dµ.
Since ε was arbitrary and µ(X) < ∞, 11 is established and the proof of existence
is complete.
Uniqueness: Let µ̃ be a finite measure on A(d) such that
Lϕ = ∫_X ϕ dµ̃, ∀ϕ ∈ C(X).
For every closed set F , by 2.5.7 there exists a sequence {ϕn } in C(X) so that
∀x ∈ X, 0 ≤ ϕn (x) ≤ 1 and ϕn (x) → χF (x) as n → ∞.
Then, by Lebesgue’s dominated convergence theorem (cf. 8.2.11, with 1X as domi-
nating function),
µ̃(F) = ∫_X χ_F dµ̃ = lim_{n→∞} ∫_X ϕn dµ̃
= lim_{n→∞} Lϕn = lim_{n→∞} ∫_X ϕn dµ = ∫_X χ_F dµ = µ(F).
In view of 7.4.2, this proves that µ̃ = µ.
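The construction (1) used in the existence proof can be made concrete numerically: taking for L the integral against Lebesgue measure on [0, 1] (approximated here by a midpoint Riemann sum), the supremum over continuous ϕ ≺ G recovers the measure of an open interval G. A rough Python sketch, in which the endpoints, ramp widths, and grid size are all hypothetical choices:

```python
# The set function (1): nu(G) = sup{L(phi) : phi in C(X), phi < G}.
# Here L is integration over [0, 1] against Lebesgue measure
# (approximated by a midpoint Riemann sum), G = (a, b), and the sup is
# approached by trapezoidal phi with support inside (a, b).

def L(phi, grid=100000):
    h = 1.0 / grid
    return sum(phi((k + 0.5) * h) for k in range(grid)) * h

a, b = 0.2, 0.7

def trapezoid(delta):
    # continuous, 0 <= phi <= 1, support [a + delta, b - delta] in (a, b)
    def phi(x):
        return max(0.0, min(1.0, (x - a - delta) / delta,
                            (b - delta - x) / delta))
    return phi

approximations = [L(trapezoid(d)) for d in (0.1, 0.01, 0.001)]
print(approximations)   # increases toward m((a, b)) = b - a = 0.5
```

Each trapezoid contributes b − a − 3δ, so the supremum over admissible ϕ is exactly the Lebesgue measure of the interval, as the theorem asserts.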
Chapter 9
Lebesgue Measure
Then there exists a unique measure µF on the Borel σ-algebra A(dR ) on R (cf.
6.1.22 and 2.1.4) such that
µF ((a, b]) = F (b) − F (a), for all a, b ∈ R so that a < b.
The measure µF is σ-finite and is called the Lebesgue–Stieltjes measure associated
to F .
Proof. Recall that I9 denotes a semialgebra on R such that A(dR) = A(I9) (cf. 6.1.25), and define the function
ν : I9 → [0, ∞]
    E ↦ ν(E) :=
        0                          if E = ∅,
        F(b) − F(a)                if E = (a, b] with a, b ∈ R s.t. a < b,
        F(b) − lim_{n→∞} F(−n)     if E = (−∞, b] with b ∈ R,
        lim_{n→∞} F(n) − F(a)      if E = (a, ∞) with a ∈ R
(notice that lim_{n→∞} F(−n) and lim_{n→∞} F(n) do exist by 5.2.5). We will show that ν satisfies conditions a, b, c, d, e of 7.3.3.
continuous)
∀ε > 0, ∃nε ∈ N s.t. ν((a, b]) − ν((a + 1/nε, b]) = F(a + 1/nε) − F(a) < ε.
If E = (−∞, b] (with b ∈ R): ∀n ∈ N, (−n, b] ∈ S, (−n, b] ⊂ (−∞, b], (−n, b] is compact; if lim_{n→∞} F(−n) ∈ R, then (from the definition of limit)
∀ε > 0, ∃nε ∈ N s.t. ν((−∞, b]) − ν((−nε, b]) = F(−nε) − lim_{n→∞} F(−n) < ε;
if lim_{n→∞} F(−n) = −∞, then ν((−∞, b]) = ∞ and (cf. 5.3.2c, 5.3.2a, 5.2.5)
sup_{n≥1} ν((−n, b]) = sup_{n≥1} (F(b) − F(−n)) = F(b) − inf_{n≥1} F(−n)
F is right continuous)
∀ε > 0, ∃nε ∈ N s.t. ν((−∞, b + 1/nε]) − ν((−∞, b]) = F(b + 1/nε) − F(b) < ε.
If E = (a, ∞) (with a ∈ R) and ν((a, ∞)) < ∞, simply notice that (a, ∞)◦ =
(a, ∞).
e: We have R = ∪_{n∈Z} (n, n + 1] and ν((n, n + 1]) = F(n + 1) − F(n) < ∞ for all n ∈ Z.
Since ν satisfies conditions a, b, c, d, e and A(dR ) = A(I9 ), 7.3.3 implies that
there exists a unique measure µF which is an extension of ν, and that µF is σ-finite.
Since µF extends ν, we have
∀a, b ∈ R so that a < b, µF ((a, b]) = ν((a, b]) = F (b) − F (a).
Suppose now that µ is a measure on A(dR ) such that
∀a, b ∈ R so that a < b, µ((a, b]) = F (b) − F (a).
Then we have, by 7.1.4b:
∀b ∈ R, µ((−∞, b]) = µ(∪_{n=1}^∞ (−n, b]) = lim_{n→∞} µ((−n, b]) = F(b) − lim_{n→∞} F(−n);
∀a ∈ R, µ((a, ∞)) = µ(∪_{n=1}^∞ (a, n]) = lim_{n→∞} µ((a, n]) = lim_{n→∞} F(n) − F(a).
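A quick numerical sketch of the set function ν underlying the Lebesgue–Stieltjes construction, with the hypothetical choice F = arctan (increasing and continuous, hence in particular right continuous): ν is additive over a splitting of (a, b] into half-open pieces, as a measure must be.

```python
# nu((a, b]) := F(b) - F(a) with the hypothetical distribution
# function F = atan; additivity over a partition of (a, b] into
# half-open pieces follows by telescoping.
import math

F = math.atan

def nu(a, b):
    return F(b) - F(a)   # nu((a, b])

a, b = -2.0, 3.0
cuts = [a, -1.0, 0.5, 2.0, b]   # (a, b] as a union of (c_i, c_{i+1}]
pieces = sum(nu(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1))

print(abs(pieces - nu(a, b)) < 1e-12)   # → True (telescoping)
```

The half-open intervals matter: the pieces (c_i, c_{i+1}] are pairwise disjoint and their union is exactly (a, b], which is why the telescoping identity is additivity and not merely an approximation.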
The Lebesgue measure m is the only measure on A(dR ) with the property:
∀a, b ∈ R so that a < b, m((a, b)) = b − a.
(or the same proposition with (a, b) replaced by [a, b) or by [a, b]).
Proof. We know that m is the only measure on A(dR ) such that
∀a, b ∈ R so that a < b, m((a, b]) = b − a.
Then we have, by 7.1.4b and 7.1.4c, for all a, b ∈ R so that a < b:
m((a, b)) = m(∪_{n=1}^∞ (a, b − 1/n]) = lim_{n→∞} m((a, b − 1/n]) = lim_{n→∞} (b − 1/n − a) = b − a;
m([a, b)) = m(∩_{n=1}^∞ (a − 1/n, b)) = lim_{n→∞} m((a − 1/n, b)) = lim_{n→∞} (b − a + 1/n) = b − a;
m([a, b]) = m(∩_{n=1}^∞ [a, b + 1/n)) = lim_{n→∞} m([a, b + 1/n)) = lim_{n→∞} (b + 1/n − a) = b − a.
Similarly we have, for all a ∈ R:
m({a}) = m(∩_{n=1}^∞ (a − 1/n, a]) = lim_{n→∞} m((a − 1/n, a]) = lim_{n→∞} 1/n = 0;
m((−∞, a)) = m(∪_{n=1}^∞ (−n, a)) = lim_{n→∞} m((−n, a)) = lim_{n→∞} (a + n) = ∞;
m((a, ∞)) = m(∪_{n=1}^∞ (a, n)) = lim_{n→∞} m((a, n)) = lim_{n→∞} (n − a) = ∞;
m((−∞, a]) = m([a, ∞)) = ∞ since (−∞, a) ⊂ (−∞, a] and (a, ∞) ⊂ [a, ∞)
(cf. 7.1.2a).
Suppose now that m̃ is a measure on A(dR ) such that
∀a, b ∈ R so that a < b, m̃((a, b)) = b − a.
Then
∀a, b ∈ R so that a < b,
m̃((a, b]) = m̃(∩_{n=1}^∞ (a, b + 1/n)) = lim_{n→∞} m̃((a, b + 1/n)) = lim_{n→∞} (b + 1/n − a) = b − a,
and hence m̃ = m by the uniqueness property of m quoted above. The proofs for
(a, b) replaced by [a, b) or by [a, b] are analogous.
The function
ρ : Rn × Rn → R
((x1 , ..., xn ), (y1 , ..., yn )) 7→ ρ((x1 , ..., xn ), (y1 , ..., yn ))
:= max{|xk − yk | : k = 1, ..., n}
is a distance on Rn and Tρ = Tdn (cf. the proof of 6.1.31). For each (x1 , ..., xn ) ∈ G,
since G ∈ Tρ there exists ε > 0 so that, for (y1 , ..., yn ) ∈ Rn ,
[|xk − yk | < ε for k = 1, ..., n] ⇒ (y1 , ..., yn ) ∈ G;
then, if r ∈ N is so that 1/2^r < ε, the half-open interval defined by Hr that contains
(x1 , ..., xn ) must be contained in G, and therefore either this half-open interval is
contained in an interval Isq with q ∈ Js and s < r or this half-open interval itself
is an interval Irp with p ∈ Jr . This shows that each point of G is contained in an
element of the family {Irp : r ∈ N, p ∈ Jr }, and hence that
G ⊂ ∪_{r=1}^∞ ∪_{p∈Jr} I_{rp}.
The first result of this section is that Lebesgue measure is invariant under transla-
tion.
then
(b) for each ϕ ∈ L^+(R^n, A(dn), mn) (or ϕ ∈ L^1(R^n, A(dn), mn)), if we define
ϕc : Dϕ + (c1, ..., cn) → [0, ∞] (or C)
    (x1, ..., xn) ↦ ϕc(x1, ..., xn) := ϕ(x1 − c1, ..., xn − cn)
The function
µc : A(dn ) → [0, ∞]
E 7→ µc (E) := m(τc−1 (E))
is a measure on A(dn ) (cf. 8.3.11 with µ1 := mn and π := τc ) and we have, for all
(a1 , ..., an ), (b1 , ..., bn ) ∈ Rn so that ak < bk for k = 1, ..., n,
(1/|c|) µc((a, b)) = (1/|c|) |c| (b − a) = b − a,
since m((ca, cb)) = c(b − a) if c > 0 and m((cb, ca)) = (−c)(b − a) if c < 0. Then, by the uniqueness asserted in 9.1.3 we have (1/|c|) µc = m, i.e.
∀E ∈ A(dR ), m(cE) = µc (E) = |c|m(E).
b: Since ϕc = ϕ ◦ δc and (|c|m)(E) = m(δc−1 (E)) for each E ∈ A(dR ), the
assertions of the statement follow from 8.3.11 (and from 8.3.5b with a := |c|, µ := m,
ν the null measure).
9.2.3 Remark. We denote by GL(n, R) the family of all injective linear operators
on the linear space Rn . If A, B ∈ GL(n, R) then AB ∈ GL(n, R) (cf. 1.2.14B and
3.2.4), and if A ∈ GL(n, R) then A−1 ∈ GL(n, R) (cf. 3.2.6b). Our next result
requires a few facts which are known from linear algebra (cf. e.g. Munkres, 1991,
Chapter 1). Every A ∈ GL(n, R) determines a unique matrix [Aik ] so that
A(x1, ..., xn) = (x′_1, ..., x′_n) with x′_i = Σ_{k=1}^n A_{ik} x_k for i = 1, ..., n, ∀(x1, ..., xn) ∈ R^n;
from this it is clear that A is a continuous mapping; also, denoting by det A the
determinant of the matrix [Aik ], we have:
• A1(x1, ..., xn) := (x1, ..., x_{i−1}, c x_i, x_{i+1}, ..., xn), with c ∈ R − {0},
• A2(x1, ..., xn) := (x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn), with c ∈ R and i ≠ k,
• A3(x1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_{k−1}, x_k, x_{k+1}, ..., xn) := (x1, ..., x_{i−1}, x_k, x_{i+1}, ..., x_{k−1}, x_i, x_{k+1}, ..., xn).
then
and
∫_{R^n} (ϕ ◦ A^{−1}) dmn = |det A| ∫_{R^n} ϕ dmn.
Proof. a: For each ψ ∈ L+ (Rn , A(dn )) and each T ∈ GL(n, R), the continuity of
T implies that ψ ◦ T ∈ L+ (Rn , A(dn )) (cf. 6.2.8 and 6.2.5).
For each ψ ∈ L+ (Rn , A(dn )) we have:
where 8.4.11a, 8.4.8 and 9.2.2 (with c replaced by 1c ) have been used;
for c ∈ R and i, k = 1, ..., n so that i 6= k, assuming i < n (otherwise, it is easy
to simplify the calculation below),
∫_{R^n} ψ(x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn) dmn(x1, ..., xn)
= ∫_{R^{n−i}} (∫_R (∫_{R^{i−1}} ψ(x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn) dm_{i−1}(x1, ..., x_{i−1})) dm(x_i)) dm_{n−i}(x_{i+1}, ..., xn)
= ∫_{R^{n−i}} (∫_{R^{i−1}} (∫_R ψ(x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn) dm(x_i)) dm_{i−1}(x1, ..., x_{i−1})) dm_{n−i}(x_{i+1}, ..., xn)
= ∫_{R^{n−i}} (∫_{R^{i−1}} (∫_R ψ(x1, ..., xn) dm(x_i)) dm_{i−1}(x1, ..., x_{i−1})) dm_{n−i}(x_{i+1}, ..., xn)
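The behaviour m_n(A(E)) = |det A| m_n(E) for the elementary operators A1 and A2 can be checked by a crude Monte Carlo count; the sample size, the seed, and the parameter c below are arbitrary illustrative choices.

```python
# m_2(A(E)) = |det A| m_2(E) for E = [0,1)^2, checked by Monte Carlo:
# draw points in a box containing A(E) and count how often A^{-1}
# lands in E. Sample size, seed, and c are illustrative choices.
import random

random.seed(0)
SAMPLES = 200000
c = 2.5

def measure_of_image(apply_inverse, box):
    (x0, x1), (y0, y1) = box
    hits = 0
    for _ in range(SAMPLES):
        x, y = random.uniform(x0, x1), random.uniform(y0, y1)
        u, v = apply_inverse(x, y)
        if 0.0 <= u < 1.0 and 0.0 <= v < 1.0:
            hits += 1
    return hits / SAMPLES * (x1 - x0) * (y1 - y0)

# A1(x, y) = (c x, y), det A1 = c;  A1^{-1}(x, y) = (x / c, y)
m1 = measure_of_image(lambda x, y: (x / c, y), ((0.0, c), (0.0, 1.0)))
# A2(x, y) = (x + c y, y), det A2 = 1;  A2^{-1}(x, y) = (x - c y, y)
m2 = measure_of_image(lambda x, y: (x - c * y, y), ((0.0, 1.0 + c), (0.0, 1.0)))

print(m1, m2)   # near |det A1| = 2.5 and |det A2| = 1.0
```

The shear A2 has determinant one, so it leaves Lebesgue measure invariant even though it moves every point with x_k ≠ 0.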
In this section we see how the Riemann integral can be subsumed in the Lebesgue
integral, when this is defined with respect to the Lebesgue measure on a bounded
interval.
Proof. Since ϕ is bounded and m[a,b] ([a, b]) = b−a < ∞, ϕ ∈ L1 ([a, b], A[a,b] , m[a,b] )
by 8.2.6. Then Re ϕ, Im ϕ ∈ L1 ([a, b], A[a,b] , m[a,b] ) (cf. 8.2.3).
Suppose now that ϕ is Riemann integrable, denote by ϕ̃ either Re ϕ or Im ϕ, and let I := ∫_a^b ϕ̃(x) dx. Then, using the symbols introduced in 9.3.2,
∀n ∈ N, ∃Pn ∈ P s.t. I − 1/n < s_{Pn}(ϕ̃).
Thus, I = lim_{n→∞} s_{Pn}(ϕ̃). Define now a sequence {P′n} in P by letting
P′1 := P1 and P′_{n+1} := P_{n+1} ∪ P′n for n ∈ N,
where P_{n+1} ∪ P′n denotes the partition of [a, b] that is obtained by reordering the union of the families P_{n+1} and P′n. For each n ∈ N, write
{t_0^n, t_1^n, ..., t_{N_n}^n} := P′n and m_i^n(ϕ̃) := inf{ϕ̃(t) : t ∈ [t_{i−1}^n, t_i^n]} for i = 1, ..., N_n,
and define ψn := Σ_{i=1}^{N_n} m_i^n(ϕ̃) χ_{(t_{i−1}^n, t_i^n]}. Then, for each n ∈ N:
(a) ψn ≤ ψ_{n+1} since P′_{n+1} is a refinement of P′n, and ψn ≤ ϕ̃;
(b) s_{Pn}(ϕ̃) ≤ s_{P′n}(ϕ̃) since P′n is a refinement of Pn;
(c) ψn is obviously A_{[a,b]}-measurable, ψn ∈ L^1([a, b], A_{[a,b]}, m_{[a,b]}) since ψn is bounded, and ∫_{[a,b]} ψn dm_{[a,b]} = s_{P′n}(ϕ̃).
From (a) and from 5.2.4 it follows that we can define the function
ψ : [a, b] → R
x 7→ ψ(x) := lim ψn (x)
n→∞
and that ψ ≤ ϕ̃. By 6.2.20c, ψ is A_{[a,b]}-measurable. From (b) it follows that
s_{P′n}(ϕ̃) → I as n → ∞,
and this and (c) imply that
I = lim_{n→∞} ∫_{[a,b]} ψn dm_{[a,b]} = ∫_{[a,b]} ψ dm_{[a,b]}
by 8.2.11 (with dominating function any constant function which majorizes |ϕ|).
We can show in a similar way that there exists χ ∈ L1 ([a, b], A[a,b] , m[a,b] ) such
that
ϕ̃ ≤ χ and ∫_{[a,b]} χ dm_{[a,b]} = I.
Then we have
∫_{[a,b]} (χ − ψ) dm_{[a,b]} = ∫_{[a,b]} χ dm_{[a,b]} − ∫_{[a,b]} ψ dm_{[a,b]} = 0.
Since 0 ≤ χ − ψ, by 8.1.18a we have ψ(x) = χ(x) m[a,b] -a.e. on [a, b], and hence
ψ(x) = ϕ̃(x) m[a,b] -a.e on [a, b] since ψ ≤ ϕ̃ ≤ χ. From this we obtain, by 8.2.7,
∫_{[a,b]} ϕ̃ dm_{[a,b]} = ∫_{[a,b]} ψ dm_{[a,b]} = I = ∫_a^b ϕ̃(x) dx.
Chapter 10
Hilbert Spaces
In this chapter we study inner product spaces, and Hilbert spaces in particular,
which we only consider over the complex field C. While linear operators in Hilbert
spaces are studied in later chapters in connection with the concept of adjoint opera-
tor, we present here what more can be said about the concepts previously introduced
for linear operators when the linear spaces in which they are defined are actually
inner product or Hilbert spaces.
(ᾱ denotes the complex conjugate of a complex number α). We point out that conditions sf2 and sf3 are consistent only when condition sf1 is assumed.
A sesquilinear form ψ is said to be on X if Mψ = X.
The function φ is called an inner product for the linear space (X, σ, µ). An inner
product is also called a scalar product.
10.1.4 Remarks.
(a) It is immediately clear that, in every inner product space X, conditions ip1 and
ip2 imply the following condition:
(ip5 ) (αf1 + βf2 |g) = ᾱ (f1 |g) + β̄ (f2 |g), ∀α, β ∈ C, ∀f1 , f2 , g ∈ X.
Thus, an inner product for X is a sesquilinear form on X.
(b) The reader should be aware that some define an inner product with condition
ip1 replaced by condition
(ip′1 ) (f |αg1 + βg2 ) = ᾱ (f |g1 ) + β̄ (f |g2 ), ∀α, β ∈ C, ∀f, g1 , g2 ∈ X.
Then, condition ip5 gets replaced by condition
(ip′5 ) (αf1 + βf2 |g) = α (f1 |g) + β (f2 |g), ∀α, β ∈ C, ∀f1 , f2 , g ∈ X.
Of course, the two definitions of an inner product are fully equivalent. However,
care must be taken not to mix formulae obtained on the basis of different
definitions.
10.1.5 Examples.
(a) Let ℓf denote the family of all the sequences in C that have just a finite number of non-zero elements, i.e.
ℓf := {{xn} ∈ F(N) : xn ≠ 0 only for a finite number of indices n}.
Obviously, ℓf is a linear manifold in the linear space F (N) (cf. 3.1.10c), and
therefore it is a linear space over C (cf. 3.1.3). It is immediately clear that the
function
φ : ℓf × ℓf → C
    ({xn}, {yn}) ↦ φ({xn}, {yn}) := Σ_{n=1}^∞ x̄n yn
and it is immediately clear that this function has properties ip1 , ip2 , ip3 of
10.1.3. As to property ip4 , we note first that if ϕ ∈ C(a, b) is such that ϕ(x) = 0
m-a.e. on [a, b] then ϕ(x) = 0 for all x ∈ [a, b]. Indeed, as can be easily seen, if
for ϕ ∈ C(a, b) there exists x0 ∈ (a, b) so that ϕ(x0 ) 6= 0, then there exists δ > 0
so that (x0 −δ, x0 +δ) ⊂ [a, b] and ϕ(x) 6= 0 for all x ∈ (x0 −δ, x0 +δ), and hence
it cannot be that ϕ(x) = 0 m-a.e. on [a, b], since m((x0 − δ, x0 + δ)) = 2δ > 0.
Now, for ϕ ∈ C(a, b), (ϕ|ϕ) = 0 implies ϕ(x) = 0 m-a.e. on [a, b] by 8.1.12a,
and hence ϕ = 0C(a,b) . This shows that φ has property ip4 .
It is worth remarking that this example can be formulated without recourse
to Lebesgue integration, but using Riemann integration instead. In fact, 9.3.3
implies that
φ(ϕ, ψ) = ∫_a^b ϕ̄(x) ψ(x) dx, ∀ϕ, ψ ∈ C(a, b),
since ϕψ ∈ C(a, b) for all ϕ, ψ ∈ C(a, b) and the elements of C(a, b) are Riemann-
integrable. Moreover, the argument presented above to prove property ip4 of
φ can be replaced with an argument suited to the definition of φ by means of
Riemann integrals.
(c) The linear space S(R) (cf. 3.1.10h) is an inner product space, with the inner
product defined by
φ : S(R) × S(R) → C
    (ϕ, ψ) ↦ φ(ϕ, ψ) := ∫_R ϕ̄ψ dm
10.1.6 Remark. Let (X, σ, µ, φ) be an inner product space and M a linear manifold in the linear space (X, σ, µ). It is immediate to see that (M, σ_{M×M}, µ_{C×M}, φ_{M×M}) is an inner product space, since (M, σ_{M×M}, µ_{C×M}) is a linear space over C (cf. 3.1.3) and conditions ip1, ip2, ip3, ip4 of 10.1.3 hold trivially if X is replaced by M.
10.1.7 Proposition. Let f, g be two elements of an inner product space X. Then:
(a) |(f|g)| ≤ √(f|f) √(g|g)
(by √(f|f) we mean the non-negative square root of (f|f), which is non-negative by ip3); this is called the Schwarz inequality;
(b) we have |(f|g)| = √(f|f) √(g|g) iff the set {f, g} is linearly dependent.
Proof. As a preliminary step, we note that if f ≠ 0X then (f|f) ≠ 0, by property ip4, and that, by properties ip1 and ip5,
(g − ((f|g)/(f|f)) f | g − ((f|g)/(f|f)) f) = (g|g) − |(f|g)|²/(f|f). (∗)
a: If f = 0X we have (cf. 10.1.2b)
|(f|g)| = 0 = 0 √(g|g) = √(f|f) √(g|g).
If f ≠ 0X, from (∗) we have, by property ip3,
0 ≤ (g|g) − |(f|g)|²/(f|f),
and hence
|(f|g)| ≤ √(f|f) √(g|g).
b: If the set {f, g} is linearly dependent, then there exist α, β ∈ C so that (α, β) ≠ (0, 0) and αf + βg = 0X; assuming for instance α ≠ 0, we have f = −(β/α) g and hence
|(f|g)| = |β/α| |(g|g)| = |β/α| (g|g) = √((−(β/α)g | −(β/α)g)) √(g|g) = √(f|f) √(g|g).
If |(f|g)| = √(f|f) √(g|g) and f ≠ 0X (f = 0X would make the set {f, g} linearly dependent in any case), from (∗) we have
(g − ((f|g)/(f|f)) f | g − ((f|g)/(f|f)) f) = 0,
and hence
g − ((f|g)/(f|f)) f = 0X,
which shows that the set {f, g} is linearly dependent.
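Both parts of 10.1.7 can be checked numerically in C³; the inner product below is written with the conjugation on the first argument (consistent with the polarization identity 10.1.10b used later in this chapter), and the vectors are arbitrary examples.

```python
# Schwarz inequality and its equality criterion in C^3; the vectors
# are arbitrary examples, and (f|g) carries the conjugation on f.

def ip(f, g):
    return sum(fk.conjugate() * gk for fk, gk in zip(f, g))

def norm(f):
    return abs(ip(f, f)) ** 0.5

f = [1 + 2j, -0.5j, 3.0]
g = [2 - 1j, 4.0, 1j]

print(abs(ip(f, g)) <= norm(f) * norm(g))   # → True (strict here)

h = [(2 + 1j) * fk for fk in f]             # {f, h} linearly dependent
print(abs(abs(ip(f, h)) - norm(f) * norm(h)) < 1e-9)   # → True
```

For the independent pair {f, g} the inequality is strict; scaling f by a complex constant produces a dependent pair and equality, exactly as part (b) asserts.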
νφ : X → R
    f ↦ νφ(f) := ‖f‖φ := √(f|f)
Proof. For νφ, condition no2 of 4.1.1 follows from properties ip1 and ip5 of an inner product:
‖αf‖φ = √(αf|αf) = |α| √(f|f) = |α| ‖f‖φ, ∀α ∈ C, ∀f ∈ X.
Condition no3 of 4.1.1 is actually the same as condition ip4 of 10.1.3. It remains to verify condition no1 of 4.1.1. Now, for all f, g ∈ X,
‖f + g‖φ² = (f + g|f + g) = ‖f‖φ² + 2 Re (f|g) + ‖g‖φ².
Thus we have, by the Schwarz inequality (cf. 10.1.7a),
‖f + g‖φ² ≤ ‖f‖φ² + 2 |(f|g)| + ‖g‖φ² ≤ ‖f‖φ² + 2 ‖f‖φ ‖g‖φ + ‖g‖φ² = (‖f‖φ + ‖g‖φ)²,
which implies
‖f + g‖φ ≤ ‖f‖φ + ‖g‖φ.
10.1.11 Proposition. Let X1 and X2 be inner product spaces. For a linear oper-
ator A ∈ O(X1 , X2 ), the following conditions are equivalent:
(a) (Af |Ag)2 = (f |g)1 , ∀f, g ∈ DA ;
(b) kAf k2 = kf k1 , ∀f ∈ DA
(we have indexed by 1 and 2 the inner products and the norms in X1 and X2
respectively, as we will do whenever a similar situation arises).
Proof. a ⇒ b: This follows immediately from the definition of νφ in 10.1.8.
b ⇒ a: If condition b holds true then, by 10.1.10b, we have for all f, g ∈ DA
(Af|Ag)2 = Σ_{n=1}^4 (1/(4iⁿ)) ‖Af + iⁿAg‖₂²
= Σ_{n=1}^4 (1/(4iⁿ)) ‖A(f + iⁿg)‖₂² = Σ_{n=1}^4 (1/(4iⁿ)) ‖f + iⁿg‖₁² = (f|g)1.
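The polarization identity 10.1.10b that drives this proof can be verified directly for vectors in C²; the example vectors below are arbitrary, and the inner product is linear in its second argument.

```python
# The polarization identity 10.1.10b in C^2: (f|g) equals
# sum_{n=1}^{4} (1 / (4 i^n)) ||f + i^n g||^2.  Example vectors are
# arbitrary; the inner product carries the conjugation on f.

def ip(f, g):
    return sum(fk.conjugate() * gk for fk, gk in zip(f, g))

def norm_sq(f):
    return ip(f, f).real

f = [1 + 1j, 2 - 3j]
g = [-2j, 0.5 + 1j]

polarized = sum(norm_sq([fk + (1j ** n) * gk for fk, gk in zip(f, g)])
                / (4 * 1j ** n) for n in range(1, 5))

print(abs(polarized - ip(f, g)) < 1e-9)   # → True
```

Because the identity recovers the inner product from norms alone, a norm-preserving operator automatically preserves inner products, which is the content of the implication b ⇒ a above.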
10.1.13 Remarks.
(a) We saw in 10.1.12 that if a norm is derived from an inner product as in 10.1.8
then it satisfies the parallelogram law.
The converse is also true, namely if a norm ν for a linear space (X, σ, µ) over
C is such that
ν(f + g)2 + ν(f − g)2 = 2ν(f )2 + 2ν(g)2 , ∀f, g ∈ X, (∗)
then there exists a unique inner product φ for (X, σ, µ) so that ν = νφ . The
idea of the proof is as follows. If an inner product φ does exist so that ν = νφ ,
then
φ(f, g) = Σ_{n=1}^4 (1/(4iⁿ)) ν(f + iⁿg)², ∀f, g ∈ X,
must be true, in view of 10.1.10b. Thus, one is led to define the function
φ : X × X → C
    (f, g) ↦ φ(f, g) := Σ_{n=1}^4 (1/(4iⁿ)) ν(f + iⁿg)²,
and to check that this function has properties ip1 , ip2 , ip3 , ip4 of 10.1.3; in
this check, properties no1 , no2 , no3 of 4.1.1 and condition (∗) are used (cf.
Weidmann, 1980, p.10–11). After that, one notes that, for every f ∈ X,
νφ(f)² = φ(f, f) = Σ_{n=1}^4 (1/(4iⁿ)) ν(f + iⁿf)² = (1/4) (2/i + 0/(−1) + 2/(−i) + 4/1) ν(f)² = ν(f)².
We do not give the details of the aforementioned checks because we shall not
use this result.
(b) There are norms which do not satisfy the parallelogram law and which therefore
cannot be derived from any inner product. Such is e.g. the norm defined in
4.3.6a.
Proof. The equality between the two least upper bounds of the statement is obvi-
ous.
Now assume A bounded. Then, by 10.1.7a and 4.2.5b,
| (f |Ag) | ≤ kf kkAgk ≤ kAkkf kkgk, ∀f ∈ Y, ∀g ∈ DA ;
thus, k ≤ kAk and therefore k < ∞.
Conversely, assume k < ∞. Then,
‖Ah‖ = ((Ah|Ah)/(‖Ah‖ ‖h‖)) ‖h‖ ≤ k ‖h‖, ∀h ∈ DA − NA,
and this implies
kAf k ≤ kkf k, ∀f ∈ DA ;
thus, A is bounded, k ∈ BA and therefore kAk ≤ k (cf. 4.2.4).
The statement follows from the two arguments above.
10.1.15 Remark. Let (X, σ, µ, φ) be an inner product space. From 10.1.8 and
4.1.3 we have that the function
dφ := dνφ : X × X → R
    (f, g) ↦ dφ(f, g) := νφ(f − g) = √(φ(f − g, f − g))
is a distance on X. Whenever we use metric concepts in an inner product space,
we will refer to this distance. For instance, if we say that the inner product space
X is complete or separable we mean that the metric space (X, dφ ) is such.
If M is a linear manifold in X, it is immediately clear that we obtain the same
metric space by first defining the inner product space (M, σ_{M×M}, µ_{C×M}, φ_{M×M}) (cf. 10.1.6) and then the metric space (M, d_{φ_{M×M}}), or by first defining the metric space (X, dφ) and then the metric subspace (M, (dφ)M) (cf. 2.1.3). Thus, there can
be no ambiguity when we refer to M as a metric space.
If an isomorphism from X1 onto X2 exists, then the two inner product spaces X1
and X2 are said to be isomorphic.
If the two inner product spaces X1 and X2 are the same, an isomorphism from
X1 onto X2 is called an automorphism of X1 .
the structure of an inner product space. However, 10.1.19 proves that the injectivity part of condition is1 and the whole of condition is2 are in fact redundant.
10.1.19 Theorem. Let X1 and X2 be inner product spaces and U a mapping from
X1 to X2 such that DU is a linear manifold in X1 and
(U(f)|U(g))2 = (f|g)1, ∀f, g ∈ DU.
Then U is an injective linear operator.
10.1.20 Theorem. Let X1 and X2 be inner product spaces and U a mapping from
X1 to X2 . The following conditions are equivalent:
(a) U is an isomorphism from X1 onto X2 ;
(b) DU = X1 , RU = X2 , U is a linear operator, and
kU f k2 = kf k1 , ∀f ∈ X1 ;
(c) DU = X1 , RU = X2 , and
(U(f)|U(g))2 = (f|g)1, ∀f, g ∈ X1.
Proof. Suppose that {f1, ..., fn} is a subset of S and (α1, ..., αn) ∈ Cⁿ is so that Σ_{i=1}^n αi fi = 0X. For k = 1, ..., n we have
0 = (fk | Σ_{i=1}^n αi fi) = Σ_{i=1}^n αi (fk|fi) = αk ‖fk‖²;
Proof. We have
‖Σ_{i=1}^{n} fi ‖² = (Σ_{i=1}^{n} fi | Σ_{k=1}^{n} fk ) = Σ_{i=1}^{n} (fi |fi ) + Σ_{i=1}^{n} Σ_{k≠i} (fi |fk ) = Σ_{i=1}^{n} ‖fi ‖².
10.2.5 Examples.
(a) For each k ∈ N, let δk be the element of the inner product space ℓf (cf. 10.1.5a)
defined by
δk := {δk,n },
i.e. δk is the sequence whose elements are all zero but the k-th, which is one.
The family {δk }k∈N is an o.n.s. in ℓf , since it is obvious that (δk |δl ) = δk,l for
all k, l ∈ N.
(b) We define a family {un }n∈Z of elements of the inner product space C(0, 2π) (cf.
10.1.5b) by
un (x) := (1/√(2π)) e^{inx} , ∀x ∈ [0, 2π], ∀n ∈ Z.
The family {un }n∈Z is an o.n.s. in C(0, 2π) since (un |un ) = 1 is obvious and, for n ≠ m,
(um |un ) = (1/(2π)) ∫_0^{2π} e^{i(n−m)x} dx = (1/(2π)) ∫_0^{2π} cos((n − m)x) dx + (i/(2π)) ∫_0^{2π} sin((n − m)x) dx = 0.
For each n ∈ N we define the elements vn and wn of C(0, 2π) by
vn (x) := (1/√π) cos nx and wn (x) := (1/√π) sin nx, ∀x ∈ [0, 2π].
Since
vn = (1/√2)(un + u−n ) and wn = (1/(√2 i))(un − u−n ), ∀n ∈ N,
a straightforward computation shows that the family {u0 } ∪ {vn }n∈N ∪ {wn }n∈N
is an o.n.s., and 3.1.7 implies that
L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N ) ⊂ L{un }n∈Z .
However, since
un = (1/√2)(vn + iwn ) and u−n = (1/√2)(vn − iwn ), ∀n ∈ N,
3.1.7 implies also that
L{un }n∈Z ⊂ L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N ).
Thus,
L{un }n∈Z = L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N ).
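As a quick numerical sanity check (a sketch, not part of the text), the orthonormality relations (um |un ) = δm,n can be observed by approximating the integral above with a Riemann sum:

```python
# The functions u_n(x) = e^{inx}/sqrt(2*pi) form an o.n.s. in C(0, 2*pi)
# under (f|g) = integral over [0, 2*pi] of conj(f(x)) * g(x) dx,
# approximated here by a uniform Riemann sum.
import cmath, math

def u(n):
    return lambda x: cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

def inner(f, g, steps=4096):
    h = 2 * math.pi / steps
    return sum(f(k * h).conjugate() * g(k * h) * h for k in range(steps))

# (u_n | u_m) should be approximately delta_{n,m}.
print(abs(inner(u(2), u(2))))   # ~1
print(abs(inner(u(2), u(3))))   # ~0
```

The uniform grid makes the cancellation in (u2 |u3 ) essentially exact, since the sum is a full geometric sum of roots of unity.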
(e) if {gn }n∈I is an orthogonal set such that gn 6= 0X and gn is a linear combination
of f1 , ..., fn for each n ∈ I, then for each n ∈ I there exists αn ∈ C so that
gn = αn un .
then by 3.1.7 and by proposition Pn−1 we should have fn ∈ L{u1 , ..., un−1 } =
L{f1 , ..., fn−1 } and hence {fn }n∈I would not be a linearly independent set); this
and proposition Pn−1 imply that the family {u1 , ..., un } can be consistently defined.
Furthermore, kun k = 1 holds trivially and, for l = 1, ..., n − 1,
(ul |un ) = ‖fn − Σ_{k=1}^{n−1} (uk |fn ) uk ‖⁻¹ ((ul |fn ) − Σ_{k=1}^{n−1} (uk |fn ) (ul |uk ))
= ‖fn − Σ_{k=1}^{n−1} (uk |fn ) uk ‖⁻¹ ((ul |fn ) − Σ_{k=1}^{n−1} (uk |fn ) δl,k ) = 0;
these facts and proposition Pn−1 imply that {u1 , ..., un } is an o.n.s. in X. Finally,
the definition of un , 3.1.7 and proposition Pn−1 imply that
L{u1 , ..., un−1 , un } ⊂ L{u1 , ..., un−1 , fn } ⊂ L{f1 , ..., fn−1 , fn },
and also
l
X l
X
(ul |gk ) = βl,j gj |gk = βl,j (gj |gk ) = 0 if l < k.
j=1 j=1
This proves that, for each k ∈ I, αk,l = 0 if l < k and hence that gk = αk,k uk .
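The orthonormalization described above is the Gram–Schmidt process: subtract from each fn its components along the uk already constructed, then normalize. A minimal numerical sketch in C³ (the input vectors are hypothetical):

```python
# Gram-Schmidt in C^n:
# u_n = ||f_n - sum_k (u_k|f_n) u_k||^{-1} (f_n - sum_k (u_k|f_n) u_k).
import math

def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

def gram_schmidt(fs):
    us = []
    for f in fs:
        # Subtract the components along the u_k already constructed.
        v = list(f)
        for u_vec in us:
            c = inner(u_vec, f)
            v = [vi - c * ui for vi, ui in zip(v, u_vec)]
        norm = math.sqrt(inner(v, v).real)
        us.append([vi / norm for vi in v])  # fails iff the f_n are dependent
    return us

fs = [[1 + 0j, 1 + 0j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1j, 1 + 0j]]
us = gram_schmidt(fs)
# Orthonormality check: (u_k|u_l) = delta_{k,l} up to rounding.
for k in range(3):
    for l in range(3):
        print(k, l, abs(inner(us[k], us[l]) - (1 if k == l else 0)) < 1e-10)
```

Each un is a linear combination of f1 , ..., fn , matching the span equalities proved in the text.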
then
dⁿe^{−x²}/dxⁿ = Hn (x) e^{−x²} = Σ_{k=0}^{n} αk x^k e^{−x²}, ∀x ∈ R,
and hence
Hn+1 (x) = gn+1 (x) e^{x²/2} = e^{x²} d^{n+1}e^{−x²}/dx^{n+1} = e^{x²} (d/dx)(Σ_{k=0}^{n} αk x^k e^{−x²}) = Σ_{k=0}^{n} αk (k x^{k−1} − 2x^{k+1} ), ∀x ∈ R;
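With the convention Hn (x) := e^{x²} dⁿe^{−x²}/dxⁿ used here (so Hn is (−1)ⁿ times the physicists' Hermite polynomial), the coefficient recursion αk ↦ αk (k x^{k−1} − 2x^{k+1}) in the display above can be sketched directly on coefficient lists:

```python
# A sketch of the recursion above: if H_n(x) e^{-x^2} is the n-th
# derivative of e^{-x^2}, with H_n(x) = sum_k a_k x^k, then
# H_{n+1}(x) = sum_k a_k (k x^{k-1} - 2 x^{k+1}).
def next_hermite(coeffs):
    """coeffs[k] is the coefficient of x^k in H_n; returns those of H_{n+1}."""
    out = [0] * (len(coeffs) + 1)
    for k, a in enumerate(coeffs):
        if k >= 1:
            out[k - 1] += a * k      # from differentiating x^k
        out[k + 1] += -2 * a         # from x^k times d/dx e^{-x^2}
    return out

H = [[1]]                            # H_0 = 1
for _ in range(3):
    H.append(next_hermite(H[-1]))
# With this convention H_1 = -2x, H_2 = 4x^2 - 2, H_3 = 12x - 8x^3.
print(H[1], H[2], H[3])  # [0, -2] [-2, 0, 4] [0, 12, 0, -8]
```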
Since {gn }n∈I is an orthogonal set, it is obvious that {hn }n∈I is an o.n.s. Moreover,
since hn = cn αn un and cn αn 6= 0 for each n ∈ I, from 3.1.7 it follows that
L{hn }n∈I = L{un }n∈I .
Hence we have
L{hn }n∈I = L{fn }n∈I .
10.2.8 Theorem.
(a) Let {u1 , ..., un } be a finite o.n.s. in X. Then
Σ_{k=1}^{n} |(uk |f )|² ≤ ‖f ‖² , ∀f ∈ X.
(b) Let {ui }i∈I be any o.n.s. in X and, for every f ∈ X, define
If := {i ∈ I : (ui |f ) ≠ 0}.
Then the set If is countable and
Σ_{i∈If } |(ui |f )|² ≤ ‖f ‖² , ∀f ∈ X.
Note that the total ordering in If that is necessary for the definition of the sum or the series Σ_{i∈If } |(ui |f )|² need not be specified in view of 5.4.3. This inequality is called Bessel's inequality. For any f, g ∈ X such that the set If ∩ Ig is denumerable, the series
Σ_{i∈If ∩Ig } (f |ui ) (ui |g)
is absolutely convergent in the Banach space C (cf. 4.1.4 and 2.7.4a). Hence, the total ordering of If ∩ Ig that is necessary for the definition of the series Σ_{i∈If ∩Ig } (f |ui ) (ui |g) need not be specified (cf. 4.1.8b).
Proof. a: We have
0 ≤ (f − Σ_{k=1}^{n} (uk |f ) uk | f − Σ_{l=1}^{n} (ul |f ) ul )
= (f |f ) − Σ_{k=1}^{n} (f |uk ) (uk |f ) − Σ_{l=1}^{n} (ul |f ) (f |ul ) + Σ_{k=1}^{n} Σ_{l=1}^{n} (f |uk ) (ul |f ) δk,l
= ‖f ‖² − Σ_{k=1}^{n} |(uk |f )|².
The result obtained in part a shows that the number of the elements of If,n cannot exceed n² ‖f ‖²; thus, If,n is a finite set for each n ∈ N, and this implies that If is a countable set since If = ∪_{n∈N} If,n (cf. 1.2.10). If If is finite then the inequality of the statement follows from the result obtained in part a. If If is denumerable, let {ik }k∈N := If be an ordering in If ; then (cf. 5.4.1)
Σ_{i∈If } |(ui |f )|² = sup_{n≥1} Σ_{k=1}^{n} |(uik |f )|²,
and the inequality of the statement follows once again from the result obtained in part a.
For any α, β ∈ C, the inequality |αβ| ≤ ½(|α|² + |β|²) follows from 0 ≤ (|α| − |β|)²; then, for all f, g ∈ X we have (whatever ordering is chosen in If ∩ Ig )
Σ_{i∈If ∩Ig } |(f |ui ) (ui |g)| ≤ Σ_{i∈If ∩Ig } ½(|(f |ui )|² + |(ui |g)|²)
≤ ½ Σ_{i∈If ∩Ig } |(f |ui )|² + ½ Σ_{i∈If ∩Ig } |(ui |g)|²
≤ ½ ‖f ‖² + ½ ‖g‖²,
where 5.4.2a, 5.4.5 and 5.4.6 have been used if If ∩ Ig is denumerable. This proves that if If ∩ Ig is denumerable then the series Σ_{i∈If ∩Ig } (f |ui ) (ui |g) is absolutely convergent.
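Bessel's inequality is easy to observe numerically. A sketch in C⁴ (vectors hypothetical), where the o.n.s. {u1 , u2 } does not span the space and the inequality is strict:

```python
# Bessel's inequality: sum_k |(u_k|f)|^2 <= ||f||^2 for any o.n.s.,
# with equality only when f lies in the span of the u_k.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

u1 = [1 + 0j, 0j, 0j, 0j]
u2 = [0j, 1 + 0j, 0j, 0j]
f = [1 + 1j, 2 + 0j, 3j, 1 + 0j]

bessel = sum(abs(inner(u, f)) ** 2 for u in (u1, u2))
norm_sq = inner(f, f).real
print(bessel, norm_sq)  # 6.0 <= 16.0
```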
f: We have
f ∈ S ∩ S ⊥ ⇒ (f |f ) = 0 ⇒ f = 0X . (∗)
If 0X ∈ S then 0X ∈ S ∩ S ⊥ since 0X ∈ S ⊥ follows from 10.1.2b, and hence (∗) proves that S ∩ S ⊥ = {0X }. If 0X ∉ S then (∗) proves that S ∩ S ⊥ = ∅.
10.2.15 Proposition. Let S1 and S2 be two subsets of X such that S1 ⊂ S2⊥ and
X = S1 + S2 . Then S1 = S2⊥ and S2 = S1⊥ .
Proof. We prove that S1 = S2⊥ by proving that S2⊥ ⊂ S1 . For each f ∈ S2⊥ , there
exists a pair (f1 , f2 ) ∈ S1 × S2 so that f = f1 + f2 and hence so that f − f1 = f2 ;
now, f − f1 ∈ S2⊥ since f1 ∈ S1 ⊂ S2⊥ and S2⊥ is a linear manifold (cf. 10.2.13),
while f2 ∈ S2 ; thus, f − f1 = 0X (cf. 10.2.10f), and hence f = f1 ∈ S1 .
Since S1 ⊂ S2⊥ implies S2 ⊂ S1⊥ (cf. 10.2.14), by the same reasoning we can
prove that S2 = S1⊥ .
U (S ⊥ ) = (U (S))⊥ . Indeed,
f ∈ U (S ⊥ ) ⇔ U ⁻¹f ∈ S ⊥ ⇔ [(f |U g) = (U U ⁻¹f |U g) = (U ⁻¹f |g) = 0, ∀g ∈ S] ⇔ f ∈ (U (S))⊥ .
10.3.1 Definition. A Hilbert space is an inner product space (X, σ, µ, φ) such that
the metric space (X, dφ ) is complete (equivalently, such that the normed space
(X, σ, µ, νφ ) is a Banach space).
10.3.2 Theorem. Let M be a linear manifold in a Hilbert space (H, σ, µ, φ). Then the inner product space (M, σM×M , µC×M , φM×M ) (cf. 10.1.6) is a Hilbert space iff M is a closed set in the metric space (H, dφ ). This fully explains why in 4.1.9 a closed linear manifold was called a subspace.
10.3.3 Theorem. If two inner product spaces are isomorphic and one of them is
a Hilbert space, then the other one is also a Hilbert space.
10.3.5 Remark. Let ((X̂, σ̂, µ̂, φ̂), ι) be a completion of an inner product space
(X, σ, µ, φ). Then ι is a linear operator (cf. 10.1.19) and therefore Rι can be
considered as an inner product space (cf. 3.2.2a and 10.1.6). Moreover, ι is injective
(cf. 10.1.19). Thus, condition co1 in 10.3.4 is equivalent to the condition that Rι
be a linear manifold in X̂ and ι be an isomorphism from the inner product space
(X, σ, µ, φ) onto the inner product space (Rι , σ̂Rι ×Rι , µ̂C×Rι , φ̂Rι ×Rι ) (cf. 10.1.6
and 10.1.20).
Since ι is a linear operator, it follows directly from the definitions that ((X̂, dφ̂ ), ι)
is a completion of the metric space (X, dφ ) (cf. 2.6.7).
We shall not use the following theorem; moreover, the completions of inner product spaces that we need will be constructed without using either the statement or the proof of this theorem. For this reason we state it without giving its proof, which can be found e.g. in 4.11 of (Weidmann, 1980).
10.3.6 Theorem. For every inner product space (X, σ, µ, φ), there exists a com-
pletion ((X̂, σ̂, µ̂, φ̂), ι) of (X, σ, µ, φ).
If ((X̃, σ̃, µ̃, φ̃), ω) is also a completion of (X, σ, µ, φ), then there exists an iso-
morphism U from (X̂, σ̂, µ̂, φ̂) onto (X̃, σ̃, µ̃, φ̃) such that U ◦ ι = ω, i.e. such that
U (ι(f )) = ω(f ), ∀f ∈ X.
µ : C × Σ⊕_{n∈I} Hn → Σ⊕_{n∈I} Hn
(α, {fn }) 7→ µ(α, {fn }) := {αfn },
φ : Σ⊕_{n∈I} Hn × Σ⊕_{n∈I} Hn → C
({fn }, {gn }) 7→ φ({fn }, {gn }) := Σ_{n∈I} φn (fn , gn )
(Σ_{n∈I} stands for either Σ_{n=1}^{∞} or Σ_{n=1}^{N} ). The quadruple (Σ⊕_{n∈I} Hn , σ, µ, φ) is a Hilbert space, which is called the direct sum of the family {Hn }n∈I . The symbol Σ⊕_{n∈I} Hn is written as Σ⊕_{n=1}^{N} Hn or as H1 ⊕ · · · ⊕ HN if I = {1, ..., N }, and as Σ⊕_{n=1}^{∞} Hn if I = N.
Proof. We expound the proof for I = N, from which the proof for I = {1, ..., N }
can be obtained by obvious simplifications.
To prove that the definition of σ is consistent, we note first that the inequality
|αβ| ≤ ½(|α|² + |β|²) (i.e. 0 ≤ (|α| − |β|)²), ∀α, β ∈ C (1)
(where 5.4.2a, 5.4.5, 5.4.6 and inequality 2 have been used), which proves that {σn (fn , gn )} ∈ Σ⊕_{n=1}^{∞} Hn .
As to the definition of µ, for α ∈ C and {fn } ∈ Σ⊕_{n=1}^{∞} Hn we have
Σ_{n=1}^{∞} ‖αfn ‖_n² = Σ_{n=1}^{∞} |α|² ‖fn ‖_n² = |α|² Σ_{n=1}^{∞} ‖fn ‖_n² < ∞
(where 5.4.5 has been used), which proves that {µn (α, fn )} ∈ Σ⊕_{n=1}^{∞} Hn .
As to the definition of φ, for {fn }, {gn } ∈ Σ⊕_{n=1}^{∞} Hn we have
Σ_{n=1}^{∞} |(fn |gn )n | ≤ Σ_{n=1}^{∞} ‖fn ‖_n ‖gn ‖_n
≤ Σ_{n=1}^{∞} ½(‖fn ‖_n² + ‖gn ‖_n²) = ½ Σ_{n=1}^{∞} ‖fn ‖_n² + ½ Σ_{n=1}^{∞} ‖gn ‖_n² < ∞
(where 10.1.7a, 5.4.2a, 5.4.5, 5.4.6 and inequality 1 have been used), which proves that the series Σ_{n=1}^{∞} φn (fn , gn ) is absolutely convergent and hence convergent.
Then, it is easy to see that (Σ⊕_{n∈I} Hn , σ, µ, φ) is an inner product space. Properties ls1 and ls2 of 3.1.1 follow directly from the definitions of σ and µ (the zero vector of Σ⊕_{n=1}^{∞} Hn is the sequence {0Hn }, and the opposite of {fn } ∈ Σ⊕_{n=1}^{∞} Hn is the sequence {−fn }), and properties ip1 , ip2 , ip3 , ip4 of 10.1.3 follow from the definition of φ and from the continuity of sum and product in C (for ip1 ) or the continuity of complex conjugation (for ip2 ).
Finally, we prove that the metric space (Σ⊕_{n=1}^{∞} Hn , dφ ) is complete. Let {ϕk } be a Cauchy sequence in Σ⊕_{n=1}^{∞} Hn . This means that ϕk := {fk,n } ∈ Π_{n∈N} Hn and Σ_{n=1}^{∞} ‖fk,n ‖_n² < ∞ for each k ∈ N, and that ∀ε > 0, ∃Nε ∈ N so that
Nε < k, l ⇒ (Σ_{n=1}^{∞} ‖fk,n − fl,n ‖_n²)^{1/2} = dφ (ϕk , ϕl ) < ε. (3)
Thus, for each n ∈ N, {fk,n } (where k is the index within the sequence) is a
Cauchy sequence in Hn . Therefore (since Hn is a complete metric space) there
exists fn ∈ Hn so that fn = limk→∞ fk,n . Moreover, 3 implies that, for each p ∈ N,
Nε < k, l ⇒ Σ_{n=1}^{p} ‖fk,n − fl,n ‖_n² ≤ ε²,
and therefore (in view of the continuity of σn and νφn ) also that, for each p ∈ N,
Nε < k ⇒ Σ_{n=1}^{p} ‖fk,n − fn ‖_n² = lim_{l→∞} Σ_{n=1}^{p} ‖fk,n − fl,n ‖_n² ≤ ε²,
and therefore also that
Nε < k ⇒ Σ_{n=1}^{∞} ‖fk,n − fn ‖_n² ≤ ε². (4)
Now, if we fix k > Nε , 4 implies that the sequence ψk := {fk,n − fn } ∈ Π_{n∈N} Hn is an element of Σ⊕_{n=1}^{∞} Hn , and hence that the sequence ϕ := {fn } ∈ Π_{n∈N} Hn is an element of Σ⊕_{n=1}^{∞} Hn as well since ϕk − ψk = ϕ. Then, 4 can be written as
Nε < k ⇒ dφ (ϕk , ϕ) ≤ ε,
and this shows that the sequence {ϕk } is convergent.
10.3.8 Examples.
(a) We define an inner product φ for a zero linear space (cf. 3.1.10a) by letting
φ(0X , 0X ) := 0. This trivial inner product space is obviously a Hilbert space,
which is called a zero Hilbert space. It is obvious that two zero Hilbert spaces
are isomorphic and that an inner product space which is isomorphic to a zero
Hilbert space is also a zero Hilbert space.
(b) The function
φ: C×C→C
(x1 , x2 ) 7→ φ(x1 , x2 ) := x̄1 x2
is an inner product for the linear space C (cf. 3.1.10b) and dφ = dC (cf. 2.7.4a), as can be immediately seen. Since (C, dC ) is a complete metric space, the inner product space C is a Hilbert space.
(c) Let N ∈ N and let Hn := C for n = 1, ..., N . The Hilbert space Σ⊕_{n=1}^{N} Hn (cf. 10.3.7) is then denoted by CN (this is consistent with the definition CN := C × · · · N times · · · × C given in 1.2.1). Explicitly, the mappings σ, µ, φ are defined by
σ((x1 , ..., xN ), (y1 , ..., yN )) := (x1 + y1 , ..., xN + yN ), ∀(x1 , ..., xN ), (y1 , ..., yN ) ∈ CN ,
µ(α, (x1 , ..., xN )) := (αx1 , ..., αxN ), ∀α ∈ C, ∀(x1 , ..., xN ) ∈ CN ,
φ((x1 , ..., xN ), (y1 , ..., yN )) := Σ_{n=1}^{N} x̄n yn , ∀(x1 , ..., xN ), (y1 , ..., yN ) ∈ CN .
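Consistently with 10.3.7, φ on CN is just the sum of the component inner products φn (x, y) = x̄n yn of the N copies of C. A small sketch (values hypothetical), with the conjugation on the first argument made explicit:

```python
# C^N as the direct sum of N copies of C: the inner product is the sum
# of the component inner products phi_n(x, y) = conj(x) * y.
def phi_component(x, y):
    return x.conjugate() * y

def phi(xs, ys):
    return sum(phi_component(x, y) for x, y in zip(xs, ys))

x = (1 + 2j, 0j, 3 + 0j)
y = (1j, 1 + 0j, 2 + 0j)
# conj(1+2j)*1j + 0 + 3*2 = (2+1j) + 6
print(phi(x, y))  # (8+1j)
```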
Thus, condition no1 of 4.1.1 turns out to be, for the norm νφ ,
(Σ_{n=1}^{∞} |xn + yn |²)^{1/2} ≤ (Σ_{n=1}^{∞} |xn |²)^{1/2} + (Σ_{n=1}^{∞} |yn |²)^{1/2} , ∀{xn }, {yn } ∈ ℓ2 ,
10.3.10 Remark. For a Hilbert space H, the family U(H) is a group with product
of operators as group product, the operator 1H as group identity, the operator U −1
as group inverse of U for every U ∈ U(H) (cf. 10.1.21 and 4.6.2c).
10.3.12 Remarks.
(a) For a Hilbert space H, the set
R := {(A, B) ∈ O(H) × O(H) : ∃U ∈ U(H) such that B = U AU −1 }
defines a relation in O(H) which can be easily seen to be an equivalence relation.
This justifies the term “unitarily equivalent” used in 10.3.11. The fact that R
10.3.14 Remark. All the definitions and the symbols introduced in Section 3.2 for
linear operators can be extended to the family of all linear or antilinear operators,
and it is easy to see that all the results proved in Section 3.2 hold for this wider
family, with only the following exceptions: 3.2.10b4 must be supplemented with
(αA)B = α(AB) = A(ᾱB), ∀α ∈ C − {0}, for every antilinear A and every linear or antilinear B;
3.2.15 is not true for an antilinear operator.
The product of two antilinear operators is a linear operator and the product of
a linear operator and an antilinear one (in either order) is an antilinear operator;
for the sum of two operators to give a linear or an antilinear operator, the two
operators must be both linear or both antilinear.
Moreover, if X and Y are normed spaces over C, all the definitions, the symbols
and the results set out about linear operators in Section 4.2 can be extended to the
family of all linear or antilinear operators (in the extended version of 4.2.7, both
the operators A and B must be either linear or antilinear).
The family (which can be empty) of all antiunitary operators from H1 onto H2 is
denoted by the symbol A(H1 , H2 ). We also write
UA(H1 , H2 ) := U(H1 , H2 ) ∪ A(H1 , H2 ).
For a Hilbert space H, we write
A(H) := A(H, H) and UA(H) := UA(H, H).
An element of A(H) is called an antiunitary operator in H.
10.3.16 Remarks.
(a) The reason why we take antiunitary operators into consideration is that they
play an essential role in Wigner’s theorem (cf. Section 10.9).
(b) If H1 and H2 are Hilbert spaces and V ∈ A(H1 , H2 ), it is immediate to see
that V −1 ∈ A(H2 , H1 ).
(c) If H1 , H2 , H3 are Hilbert spaces, U ∈ UA(H1 , H2 ), and V ∈ UA(H2 , H3 ), it is
immediate to see that V U ∈ UA(H1 , H3 ), and that V U ∈ U(H1 , H3 ) iff U and
V are both unitary or both antiunitary.
(d) For every Hilbert space H, 10.3.10 and remarks b and c above imply that
the family UA(H) is a group with product of operators as group product, the
operator 1H as group identity, the operator T −1 as group inverse of T for every
T ∈ UA(H).
(e) If H1 and H2 are Hilbert spaces and V ∈ A(H1 , H2 ), it is immediate to see that
kV f − V gk2 = kf − gk1 , ∀f, g ∈ H1 .
Thus, V is an isomorphism from the metric space H1 onto the metric space H2 .
(f) The result of 10.2.16 holds true also for an antiunitary operator (if X1 and X2
in that proposition are Hilbert spaces and U is an antiunitary operator, the
proof remains essentially the same).
Proof. The proof is an obvious modification of the proof of 10.1.20, and it follows
from obvious modifications of 10.1.11 and 10.1.19 and their proofs.
10.3.18 Definition. Let H1 and H2 be Hilbert spaces such that the family
A(H1 , H2 ) is not empty. Two linear operators A ∈ O(H1 ) and B ∈ O(H2 ) are said
to be antiunitarily equivalent if there exists V ∈ A(H1 , H2 ) so that B = V AV −1 .
10.3.19 Remark. Let H1 and H2 be Hilbert spaces such that the family
UA(H1 , H2 ) is not empty. For A ∈ O(H1 ), B ∈ O(H2 ), U ∈ A(H1 , H2 ), sup-
pose that B = U AU −1 . Then it is easy to check that all the conditions listed in
4.6.4 still hold true. Indeed, conditions from a to h depend on U being a bijection
from H1 onto H2 , and condition i depends on U being an isomorphism of metric
spaces (the mapping TU defined in 4.6.3 is now an antiunitary operator from the
Hilbert space H1 ⊕ H1 onto the Hilbert space H2 ⊕ H2 ; note that conditions e, f, g,
h, i are still consistent because the image of a linear manifold under an antiunitary
operator is a linear manifold, as can be easily seen). Furthermore, it is easy to check
that propositions from a to e in 4.6.5 still hold true, while propositions from f to i
get replaced by:
10.3.20 Definition. Let H1 and H2 be Hilbert spaces such that the family
UA(H1 , H2 ) is not empty. Two linear operators A ∈ O(H1 ) and B ∈ O(H2 )
are said to be unitarily-antiunitarily equivalent if there exists T ∈ UA(H1 , H2 ) so
that B = T AT −1 .
10.3.21 Remark. For a Hilbert space H, it is easy to check that the relation in O(H) of unitary-antiunitary equivalence is indeed an equivalence relation, in analogy with what we saw in 10.3.12a. This is linked to the group structure of UA(H) (cf. 10.3.16d).
∀f ∈ H, ∃!(f1 , f2 ) ∈ M × M ⊥ so that f = f1 + f2 .
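For a one-dimensional subspace M = V {u} of C³ the decomposition can be computed explicitly: f1 = (u|f ) u and f2 = f − f1 . A numerical sketch (vectors hypothetical):

```python
# Orthogonal decomposition in C^3 with M = span{u} for a unit vector u:
# f = f1 + f2 with f1 = (u|f) u in M and f2 = f - f1 in M-perp.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

u = [1 / 2 ** 0.5 + 0j, 1j / 2 ** 0.5, 0j]   # unit vector spanning M
f = [1 + 0j, 2j, 3 + 0j]

c = inner(u, f)
f1 = [c * ui for ui in u]                     # component in M
f2 = [fi - g for fi, g in zip(f, f1)]         # component in M-perp
print(abs(inner(u, f2)))  # ~0: f2 is orthogonal to M
```

The uniqueness asserted by the theorem corresponds to the fact that f1 is forced to be the orthogonal projection of f onto M.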
10.4.2 Remarks.
(a) The existence part of the statement of 10.4.1 can be rephrased as follows (cf.
3.1.8): if M is a subspace of a Hilbert space H, then H = M + M ⊥ .
(b) In 10.4.1, the condition that the linear manifold M be closed is essential. This is proved by the following counterexample. In the Hilbert space ℓ2 (cf. 10.3.8d), ℓf is a linear manifold and it is not a closed set since the closure of ℓf is ℓ2 while ℓf ≠ ℓ2 (cf. 2.3.9c). Now, for any Hilbert space H and any linear manifold M in H such that M̄ = H and M ≠ H, we have M ⊥ = H⊥ = {0H } (cf. 10.2.11 and 10.2.10a) and hence H ≠ M + M ⊥ .
(c) In 10.4.1, the condition that the inner product space H be complete is essential. This is proved by the following counterexample. For a, b ∈ R, let c ∈ (a, b) and define the subsets M (a, c) and M (c, b) of the inner product space C(a, b) (cf. 10.1.5b) by letting:
M (a, c) := {ϕ ∈ C(a, b) : ϕ(x) = 0, ∀x ∈ (c, b)};
M (c, b) := {ϕ ∈ C(a, b) : ϕ(x) = 0, ∀x ∈ (a, c)}.
Obviously, M (c, b) ⊂ M (a, c)⊥ . We will prove by contraposition the inclusion M (a, c)⊥ ⊂ M (c, b), i.e. that
[ϕ ∈ C(a, b), ∃x0 ∈ (a, c) s.t. ϕ(x0 ) ≠ 0] ⇒ [∃ψ ∈ M (a, c) s.t. (ϕ|ψ) ≠ 0].
Assume that ϕ ∈ C(a, b) and x0 ∈ (a, c) are so that ϕ(x0 ) ≠ 0, and suppose e.g. that Re ϕ(x0 ) > 0 (the argument would be analogous if Re ϕ(x0 ) < 0, or Im ϕ(x0 ) > 0, or Im ϕ(x0 ) < 0); since Re ϕ is a continuous function (cf. 2.7.6), there exists ε > 0 so that Re ϕ(x) > 0 for all x ∈ (x0 − ε, x0 + ε) ∩ [a, b], and we can choose ε so that (x0 − ε, x0 + ε) ⊂ (a, c); then the function
ψ : [a, b] → C
x 7→ ψ(x) := 0 if x ∉ (x0 − ε, x0 + ε),
x − (x0 − ε) if x ∈ (x0 − ε, x0 ),
−x + (x0 + ε) if x ∈ [x0 , x0 + ε)
S ⊥ = {0X } ⇔ S ⊥⊥ = X,
M ⊥ = {0H } ⇔ M̄ = H
by corollary c.
10.4.5 Remark. In all the corollaries of 10.4.1 proved in 10.4.4, the condition that
the inner product space H be complete is essential. We prove this by a counterex-
ample, which shows that if H were not complete then the statement of corollary
10.4.4d would not be true. Since each corollary listed in 10.4.4 implies the following
one (see the proof of 10.4.4), this actually shows that if H were not complete then
no corollary listed in 10.4.4 would be true.
In the inner product space ℓf (cf. 10.1.5a), which is not a Hilbert space (cf.
10.3.8d), let
M := { {xn } ∈ ℓf : Σ_{n=1}^{∞} (1/n) xn = 0 }.
10.4.6 Corollary. Let A be a linear operator in a Hilbert space. Then, the spectrum
of A is a closed subset of C.
The results we prove in 10.4.8, which are sometimes known as the Riesz–Fischer theorem, are corollaries of the next theorem, which is an extension of 10.2.3.
10.4.7 Theorem. Let {fn } be a sequence in an inner product space X and suppose that (fn |fm ) = 0 if n ≠ m. Then:
(a) if the series Σ_{n=1}^{∞} fn is convergent then the series Σ_{n=1}^{∞} ‖fn ‖² is convergent in R, i.e. Σ_{n=1}^{∞} ‖fn ‖² < ∞, and Σ_{n=1}^{∞} ‖fn ‖² = ‖Σ_{n=1}^{∞} fn ‖²;
(b) if X is a Hilbert space and Σ_{n=1}^{∞} ‖fn ‖² < ∞ then the series Σ_{n=1}^{∞} fn is convergent.
Proof. For each n ∈ N, let sn := Σ_{k=1}^{n} fk and σn := Σ_{k=1}^{n} ‖fk ‖². We recall that the series Σ_{n=1}^{∞} fn is said to be convergent if the sequence {sn } is convergent, and that we write Σ_{n=1}^{∞} fn := limn→∞ sn when {sn } is convergent (cf. 2.1.10). Similarly, the series Σ_{n=1}^{∞} ‖fn ‖² is said to be convergent in R if the sequence {σn } is convergent in R, and we write Σ_{n=1}^{∞} ‖fn ‖² := limn→∞ σn when {σn } is convergent (cf. 5.4.1).
a: Assume that the series Σ_{n=1}^{∞} fn is convergent. Then the continuity of the norm (cf. 4.1.6a) implies that the sequence {‖sn ‖²} is convergent in R and limn→∞ ‖sn ‖² = ‖limn→∞ sn ‖² (cf. 2.4.2). Since ‖sn ‖² = σn by 10.2.3, this means that the series Σ_{n=1}^{∞} ‖fn ‖² is convergent in R, i.e. Σ_{n=1}^{∞} ‖fn ‖² < ∞ (cf. 5.4.1), and Σ_{n=1}^{∞} ‖fn ‖² = ‖Σ_{n=1}^{∞} fn ‖².
b: Assume Σ_{n=1}^{∞} ‖fn ‖² < ∞, i.e. that the sequence {σn } is convergent in R. Then {σn } is a Cauchy sequence (cf. 2.6.2). Since |σm − σn | = Σ_{k=n+1}^{m} ‖fk ‖² = ‖Σ_{k=n+1}^{m} fk ‖² = ‖sm − sn ‖² for all m, n ∈ N such that n < m (cf. 10.2.3), this implies that {sn } is a Cauchy sequence as well, and hence a convergent sequence since X is a complete metric space.
10.4.8 Corollaries. Let {un }n∈N be an o.n.s. in an inner product space X, and
let {αn } be a sequence in C. Then:
(a) if the series Σ_{n=1}^{∞} αn un is convergent then Σ_{n=1}^{∞} |αn |² < ∞ and Σ_{n=1}^{∞} |αn |² = ‖Σ_{n=1}^{∞} αn un ‖²;
(b) if X is a Hilbert space and Σ_{n=1}^{∞} |αn |² < ∞ then the series Σ_{n=1}^{∞} αn un is convergent;
(c) if the series Σ_{n=1}^{∞} αn un is convergent then αk = (uk | Σ_{n=1}^{∞} αn un ) for all k ∈ N.
Proof. Letting fn := αn un for all n ∈ N, statements a and b follow immediately
from 10.4.7.
c: If the series Σ_{n=1}^{∞} αn un is convergent, then the continuity of the inner product (cf. 10.1.16c) implies that
(uk | Σ_{n=1}^{∞} αn un ) = (uk | limn→∞ Σ_{l=1}^{n} αl ul ) = limn→∞ Σ_{l=1}^{n} αl δk,l = αk , ∀k ∈ N.
10.4.9 Proposition. Let {fn } be a sequence in a Hilbert space such that (fn |fm ) = 0 if n ≠ m, and let β be a bijection from N onto N. Then the series Σ_{n=1}^{∞} fβ(n) is convergent iff the series Σ_{n=1}^{∞} fn is convergent. If these series are convergent then their sums are the same, i.e. Σ_{n=1}^{∞} fβ(n) = Σ_{n=1}^{∞} fn .
Proof. By 10.4.7, the series Σ_{n=1}^{∞} fβ(n) is convergent iff Σ_{n=1}^{∞} ‖fβ(n) ‖² < ∞ and the series Σ_{n=1}^{∞} fn is convergent iff Σ_{n=1}^{∞} ‖fn ‖² < ∞. Now, Σ_{n=1}^{∞} ‖fβ(n) ‖² = Σ_{n=1}^{∞} ‖fn ‖² by 5.4.3.
Suppose that the series Σ_{n=1}^{∞} fβ(n) and Σ_{n=1}^{∞} fn are convergent. Then, by the continuity of the inner product,
(fk | Σ_{n=1}^{∞} fβ(n) − Σ_{n=1}^{∞} fn ) = Σ_{n=1}^{∞} (fk |fβ(n) ) − Σ_{n=1}^{∞} (fk |fn ) = (fk |fk ) − (fk |fk ) = 0, ∀k ∈ N.
Hence,
Σ_{n=1}^{∞} fβ(n) − Σ_{n=1}^{∞} fn ∈ ({fn }n∈N )⊥ = (V {fn }n∈N )⊥
by 10.2.11. We also have
Σ_{n=1}^{∞} fβ(n) − Σ_{n=1}^{∞} fn = limn→∞ Σ_{k=1}^{n} fβ(k) − limn→∞ Σ_{k=1}^{n} fk ∈ V {fn }n∈N
by 4.1.13, 2.3.10, and 3.1.7. Then, Σ_{n=1}^{∞} fβ(n) = Σ_{n=1}^{∞} fn by 10.2.10f.
Proof. By 10.4.7, condition a is true iff Σ_{(n,s)∈N×N} ‖fn,s ‖² < ∞ (cf. 5.4.7 for the symbol Σ_{(n,s)∈N×N} ‖fn,s ‖²). Further, condition b is true iff Σ_{s=1}^{∞} ‖fn,s ‖² < ∞ for all n ∈ N and Σ_{n=1}^{∞} ‖Σ_{s=1}^{∞} fn,s ‖² < ∞, since (Σ_{s=1}^{∞} fn,s | Σ_{t=1}^{∞} fm,t ) = Σ_{s=1}^{∞} Σ_{t=1}^{∞} (fn,s |fm,t ) = 0 if n ≠ m; also, if Σ_{s=1}^{∞} ‖fn,s ‖² < ∞ then ‖Σ_{s=1}^{∞} fn,s ‖² = Σ_{s=1}^{∞} ‖fn,s ‖² by 10.4.7; hence, condition b is true iff Σ_{n=1}^{∞} Σ_{s=1}^{∞} ‖fn,s ‖² < ∞. Then, conditions a and b are equivalent by 5.4.7.
Suppose that the series of conditions a and b are convergent. Then, by the same procedure as in 10.4.9, we see that
(fm,t | Σ_{(n,s)∈N×N} fn,s − Σ_{n=1}^{∞} Σ_{s=1}^{∞} fn,s ) = (fm,t |fm,t ) − (fm,t |fm,t ) = 0, ∀(m, t) ∈ N × N,
and hence that Σ_{(n,s)∈N×N} fn,s = Σ_{n=1}^{∞} (Σ_{s=1}^{∞} fn,s ).
10.4.11 Remark. By using 10.3.6, it is easy to see that the statements of 10.4.9
and 10.4.10 hold true even if the inner product space of the statements, which
we denote here by X, is not a Hilbert space. Simply, let (X̂, ι) be a completion
of X, substitute the vectors fn or fn,s with ι(fn ) or ι(fn,s ), and note that the series Σ_{n=1}^{∞} fn (for instance) is convergent (in the metric space X) iff the series Σ_{n=1}^{∞} ι(fn ) is convergent (in the metric space X̂) and the sum Σ_{n=1}^{∞} ι(fn ) is an element of Rι .
We present here the Riesz–Fréchet theorem and a result about bounded sesquilinear
forms which follows from it. The Riesz–Fréchet theorem is actually a corollary of
the orthogonal decomposition theorem, since its proof relies on 10.4.4a. The Riesz–
Fréchet theorem is also known as the Riesz representation theorem, but we prefer
to call it Riesz–Fréchet theorem in order to distinguish it from several other “Riesz
representation theorems”. For the same reason, we called the theorem in 8.5.3 the Riesz–Markov theorem, which is often named after the first author only.
10.5.1 Proposition. Let h be a vector of an inner product space X and M a linear
manifold in X. Then the function
Fh : M → C
f 7→ Fh f := (h|f )
is a continuous linear functional (for the definition of a linear functional, cf. 3.2.1),
kFh k ≤ khk and kFh k = khk if h ∈ M .
Proof. The function Fh is a linear operator by property ip1 of an inner product.
Moreover, by 10.1.9 we have
|Fh f | ≤ khkkf k, ∀f ∈ M,
and this shows that the linear operator Fh is bounded, and hence continuous (cf.
4.2.2), and that kFh k ≤ khk (cf. 4.2.4). If h = 0X then kFh k = 0. If h 6= 0X and
h ∈ M then |Fh h| = khkkhk shows that kFh k ≥ khk and hence that kFh k = khk.
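In CN the Riesz–Fréchet correspondence F ↔ h can be made explicit: evaluating F on an orthonormal basis recovers h componentwise, since F (ek ) = (h|ek ) is the conjugate of the k-th component of h. A sketch (the functional and vectors are hypothetical):

```python
# Riesz-Frechet in C^3: the functional F(f) = (h|f) determines h, and
# conversely h can be read off from F via an orthonormal basis.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

h = [1 + 1j, 2j, 0.5]

def F(f):
    return inner(h, f)        # the functional represented by h

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# F(e_k) = (h|e_k) = conj(h_k), so conjugating recovers h:
h_recovered = [F(e).conjugate() for e in basis]
print(h_recovered)
```

This also makes the norm equality ‖F ‖ = ‖h‖ of 10.5.1 plausible: the supremum of |F f |/‖f ‖ is attained at f = h.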
and we have
F (f − (F f /F g) g) = 0, i.e. f − (F f /F g) g ∈ NF , ∀f ∈ H,
F f = (h|f ) = (h′ |f ) , ∀f ∈ H.
10.5.3 Remarks.
(a) The plan for the proof of 10.5.2 is prompted by the following considerations. If the theorem is true, then NF = {h}⊥ and hence
NF⊥ = {h}⊥⊥ = V {h}
(cf. 10.4.4b and 4.1.15). Thus, if we assume that the theorem is true and that NF ≠ H and hence h ≠ 0H , for any non-zero element g of NF⊥ there exists α ∈ C such that α ≠ 0 and g = αh, and hence also such that F g = (h|g) = ᾱ⁻¹ (g|g), which implies ᾱ⁻¹ = F g/‖g‖². Therefore, if the theorem is true and NF ≠ H, we must have h = α⁻¹ g, with α⁻¹ the complex conjugate of F g/‖g‖², for any non-zero element g of NF⊥ .
(b) In 10.5.2, the condition that the inner product space H be complete is essential. This is readily seen as follows. Let H be a Hilbert space and M a linear manifold in H such that M ≠ H and M̄ = H (such are e.g. ℓ2 and ℓf , cf. 10.3.8d). Then, M can be regarded as an inner product space (cf. 10.1.6), which is not complete by 2.6.8. Let g ∈ H be such that g ∉ M . Then the function
M ∋ f 7→ F f := (g|f ) ∈ C
is a continuous linear functional on M for which there exists no h ∈ M such that
F f = (h|f ) , ∀f ∈ M.
Proof. For both the definitions of ψ given in the statement, the function ψ is a
sesquilinear form since A is a linear operator and an inner product is a sesquilinear
form. Moreover, for both definitions,
|ψ(f, g)| ≤ kAkkf kkgk, ∀(f, g) ∈ DA × DA ,
by 10.1.7a and 4.2.5b.
Proof. Existence: Let m ≥ 0 be such that |ψ(f, g)| ≤ mkf kkgk, ∀f, g ∈ H. For
each f ∈ H, define the function
Ff : H → C
g 7→ Ff (g) := ψ(f, g),
which is a linear functional in view of property sf2 of ψ (cf. 10.1.1), and is continuous
since (cf. 4.2.2)
|Ff g| ≤ (mkf k)kgk, ∀g ∈ H;
hence, by 10.5.2,
∃!hf ∈ H such that ψ(f, g) = Ff g = (hf |g) , ∀g ∈ H.
Then, we can define the mapping
A:H→H
f 7→ Af := hf if hf ∈ H is such that (hf |g) = ψ(f, g), ∀g ∈ H,
which is obviously such that ψ(f, g) = (Af |g), ∀f, g ∈ H.
The mapping A is a linear operator since, for all α, β ∈ C and f1 , f2 ∈ H,
(αAf1 + βAf2 |g) = α (Af1 |g) + β (Af2 |g)
= αψ(f1 , g) + βψ(f2 , g)
= ψ(αf1 + βf2 , g) = (A(αf1 + βf2 )|g) , ∀g ∈ H,
in view of property sf3 of ψ, and hence αAf1 + βAf2 = A(αf1 + βf2 ). Moreover,
| (Af |g) | = |ψ(f, g)| ≤ mkf kkgk, ∀f, g ∈ H,
proves that A is bounded, in view of 10.1.14.
Finally, we note that the function
ψ̃ : H × H → C
(f, g) 7→ ψ̃(f, g) := ψ(g, f )
is obviously a bounded sesquilinear form on H. Therefore, what was proved above
implies that there exists B ∈ B(H) such that
ψ̃(f, g) = (Bf |g) , ∀f, g ∈ H,
and hence such that
ψ(g, f ) = (g|Bf ) , ∀f, g ∈ H.
Uniqueness: If A, A′ ∈ OE (H) are such that
ψ(f, g) = (Af |g) = (A′ f |g) , ∀f, g ∈ H,
then A = A′ by 10.2.12. And similarly for the uniqueness of B.
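In C² the correspondence ψ ↔ A of this theorem is concrete: the values ψ(ej , ek ) = (A ej |ek ) on an orthonormal basis determine the matrix of A, with A[k][j] = conj(ψ(ej , ek )) for the convention used here (inner product conjugate-linear in its first argument). A sketch with a hypothetical bounded form ψ:

```python
# Recovering the operator A with psi(f, g) = (Af|g) from a bounded
# sesquilinear form psi on C^2.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

# A hypothetical bounded sesquilinear form (conjugate-linear in f).
def psi(f, g):
    return (2 * f[0].conjugate() * g[0]
            + 1j * f[0].conjugate() * g[1]
            + f[1].conjugate() * g[0])

e = [[1, 0], [0, 1]]
# (A e_j | e_k) = psi(e_j, e_k) forces A[k][j] = conj(psi(e_j, e_k)).
A = [[psi(e[j], e[k]).conjugate() for j in range(2)] for k in range(2)]

def apply(A, f):
    return [sum(A[k][j] * f[j] for j in range(2)) for k in range(2)]

f, g = [1 + 1j, 2], [0, 1j]
print(psi(f, g), inner(apply(A, f), g))  # equal
```

The operator B of the theorem would be obtained the same way from the swapped form ψ̃(f, g) := ψ(g, f ).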
10.6.1 Proposition. Let {ui }i∈I be any o.n.s. in H. The family of indices If := {i ∈ I : (ui |f ) ≠ 0} is countable, for every f ∈ H. If If is denumerable then the series Σ_{i∈If } (ui |f ) ui is convergent and its sum is the same whatever ordering is chosen in If for the definition of this series. Thus, we can define
Σ_{i∈I} (ui |f ) ui := Σ_{i∈If } (ui |f ) ui , ∀f ∈ H,
Proof. For every f ∈ H, it was proved in 10.2.8b that If is countable. Now, suppose that If is denumerable. Since
((ui |f ) ui | (uk |f ) uk ) = (f |ui ) (uk |f ) (ui |uk ) = 0 if i ≠ k,
10.4.9 proves that the choice of the ordering in If , which is necessary for the def-
P
inition of the series i∈If (ui |f ) ui , is immaterial both for the convergence of the
series and, in case of convergence, for its sum. Moreover, it was proved in 10.2.8b
that, for whatever ordering in If ,
X
| (ui |f ) |2 ≤ kf k2 ,
i∈If
and hence 10.4.8b implies that the series Σ_{i∈If } (ui |f ) ui is convergent. From 4.1.13, 2.3.10, 3.1.7 we have
Σ_{i∈If } (ui |f ) ui ∈ V {ui }i∈I .
For each k ∈ I we also have, using the continuity of the inner product if If is denumerable,
(uk | f − Σ_{i∈If } (ui |f ) ui ) = (uk |f ) − Σ_{i∈If } (ui |f ) δk,i
= 0 if k ∉ If , and = (uk |f ) − (uk |f ) = 0 if k ∈ If .
Proof. For the part of the statement concerning convergence of the series and independence of the sums from the orderings, cf. 10.2.8b.
Thus,
Σ_{i∈If ∩Ig } (f |ui ) (ui |g) = (Σ_{i∈If } (ui |f ) ui | Σ_{k∈Ig } (uk |g) uk ).
If If ∪ Ig is finite then this equality follows solely from properties ip1 , ip2 , ip5 of an inner product.
By letting g := f in the equality above, for every f ∈ H we have
Σ_{i∈If } |(ui |f )|² = ‖Σ_{i∈If } (ui |f ) ui ‖².
10.6.4 Theorem. Let M be a subspace of H and {ui }i∈I an o.n.s. in H such that
ui ∈ M for all i ∈ I. Then, the following conditions are equivalent:
(a) {ui }i∈I is complete in M ;
(b) f = Σ_{i∈I} (ui |f ) ui , ∀f ∈ M ;
(c) (f |g) = Σ_{i∈I} (f |ui ) (ui |g), ∀f, g ∈ M ;
(d) ‖f ‖² = Σ_{i∈I} |(ui |f )|² , ∀f ∈ M ;
(e) [f ∈ M and (ui |f ) = 0, ∀i ∈ I] ⇒ f = 0H .
If M = H, the equality in condition b is called Fourier expansion and the equalities
in conditions c and d are called Parseval’s identities.
c ⇒ d: Set g := f in condition c.
d ⇒ e: We prove this by contraposition. Assume that f ∈ M exists such that
f 6= 0H and (ui |f ) = 0 for all i ∈ I. Then,
‖f ‖² ≠ 0 = Σ_{i∈I} |(ui |f )|².
10.6.5 Remarks.
(a) The equivalence of conditions a and e in 10.6.4 can be rephrased as follows:
an o.n.s. {ui }i∈I in H is complete in a subspace M of H iff {ui }i∈I ⊂ M and
({ui }i∈I )⊥ ∩ M = {0H }; in particular, {ui }i∈I is a c.o.n.s. in H iff ({ui }i∈I )⊥ =
{0H }.
(b) Suppose that {ui }i∈I is a c.o.n.s. in H and M is a linear manifold in H such
that {ui }i∈I ⊂ M . Then M is dense in H. Indeed,
{ui }i∈I ⊂ M ⇒ L{ui }i∈I ⊂ M ⇒ H = V {ui }i∈I = \overline{L{ui }i∈I } ⊂ M̄ ⇒ M̄ = H
(cf. 3.1.6c, 4.1.13, 2.3.9d).
10.6.6 Corollaries.
(a) Assume that, for N ∈ N, an o.n.s. {u_1, ..., u_N} exists in H. Then,
V{u_1, ..., u_N} = {Σ_{n=1}^N α_n u_n : (α_1, ..., α_N) ∈ C^N}.
Hence,
V{u_1, ..., u_N} ⊂ {Σ_{n=1}^N α_n u_n : (α_1, ..., α_N) ∈ C^N}.
Since the opposite inclusion is obvious, we have the equality of the statement.
b: First, we note that the condition {α_n} ∈ ℓ² is necessary and sufficient for the series Σ_{n=1}^∞ α_n u_n to converge (cf. 10.4.8a,b). Then, since the o.n.s. {u_n}_{n∈N} is obviously complete in the subspace V{u_n}_{n∈N}, 10.6.4 proves that
f = Σ_{n=1}^∞ (u_n|f) u_n, ∀f ∈ V{u_n}_{n∈N}.
Since the opposite inclusion follows from 4.1.13, 2.3.10, 3.1.7, we have the equality of the statement.
10.6.7 Examples.
(a) The family {e_1, ..., e_N} is an o.n.s. in the Hilbert space C^N (cf. 10.3.8c) since it is obvious that (e_k|e_l) = δ_{k,l} for k, l = 1, ..., N, and it is complete by 10.6.4 because, for every f := (f_1, ..., f_N) ∈ C^N, the equality f = Σ_{k=1}^N f_k e_k = Σ_{k=1}^N (e_k|f) e_k proves that condition b of 10.6.4 holds.
(b) The family {δ_k}_{k∈N}, which is an o.n.s. in ℓ_f (cf. 10.2.5a), is obviously an o.n.s. in the Hilbert space ℓ² (cf. 10.3.8d) as well, and it is complete by 10.6.4 because, for every f := {x_n}_{n∈N} ∈ ℓ², (δ_k|f) = x_k for all k ∈ N, so that [(δ_k|f) = 0, ∀k ∈ N] implies f = 0_{ℓ²}; this proves that condition e of 10.6.4 holds.
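For example b, the Fourier coefficients with respect to {δ_k} are just the entries of the sequence, and the tail of Parseval's sum measures the error of a truncated expansion. A small numerical sketch (the sequence x_n = 1/n and the truncation point are hypothetical choices, not from the text):

```python
import numpy as np

# Model an element of l^2 by its first N entries: x_n = 1/n.
N = 10_000
x = 1.0 / np.arange(1, N + 1)

# (delta_k|x) = x_k, so truncating the Fourier expansion over {delta_k}
# after 100 terms leaves exactly the tail of Parseval's sum as the
# squared error.
partial = np.zeros(N)
partial[:100] = x[:100]
tail = np.sum(x[100:] ** 2)          # sum_{n>100} |(delta_n|x)|^2
assert np.isclose(np.linalg.norm(x - partial) ** 2, tail)
```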
10.6.8 Proposition. Let H1 and H2 be Hilbert spaces such that the family
UA(H1 , H2 ) is not empty, and let U ∈ UA(H1 , H2 ). Then:
b: For any family {u_i}_{i∈I} of vectors of H_1, suppose that f ∈ H_2 is such that
(U u_i|f)_2 = 0, ∀i ∈ I.
Then
(u_i|U⁻¹f)_1 = 0, ∀i ∈ I.
If {ui }i∈I is a c.o.n.s. in H1 , this implies U −1 f = 0H1 (cf. 10.6.4) and hence
f = 0H2 since U is a linear or antilinear operator. In view of statement a and of
10.6.4, this proves that {U ui }i∈I is a c.o.n.s. in H2 if {ui }i∈I is a c.o.n.s. in H1 .
10.6.9 Proposition. Let H1 and H2 be Hilbert spaces such that a c.o.n.s. {ui }i∈I
exists in H1 and a c.o.n.s. {vi }i∈I exists in H2 which are indexed by the same set I
of indices. For every f ∈ H_1, the set I_f := {i ∈ I : (u_i|f)_1 ≠ 0} is countable. If I_f
is denumerable then the series Σ_{i∈I_f} (u_i|f)_1 v_i and Σ_{i∈I_f} (f|u_i)_1 v_i are convergent
and their sums are independent from the orderings chosen in I_f for their definitions.
The mapping
U : H_1 → H_2
f ↦ U f := Σ_{i∈I} (u_i|f)_1 v_i := Σ_{i∈I_f} (u_i|f)_1 v_i
Proof. We set out the proof for U , from which the proof for V can be obtained by
obvious modifications.
For each f ∈ H_1, it was proved in 10.2.8b that I_f is countable and Σ_{i∈I_f} |(u_i|f)_1|² < ∞; then, if I_f is denumerable, 10.4.9 implies that the convergence of the series Σ_{i∈I_f} (u_i|f)_1 v_i and its sum do not depend on the ordering chosen in I_f, and the series is convergent by 10.4.8b.
For each g ∈ H_2, the same arguments as above prove that I_g is countable and that if I_g is denumerable then the series Σ_{i∈I_g} (v_i|g)_2 u_i is convergent and its sum is independent from the ordering chosen in I_g, and we can define the vector f of H_1 by
f := Σ_{i∈I_g} (v_i|g)_2 u_i;
we have
(u_i|f)_1 = (v_i|g)_2 if i ∈ I_g, and (u_i|f)_1 = 0 if i ∉ I_g
(cf. 10.4.8c); thus, I_f = I_g and
U f = Σ_{i∈I_f} (u_i|f)_1 v_i = Σ_{i∈I_g} (v_i|g)_2 v_i = g
since {vi }i∈I is a c.o.n.s. in H2 (cf. 10.6.4b). This proves that RU = H2 . Moreover,
for all f, h ∈ H1 ,
(U f|U h)_2 = Σ_{i∈I_f} Σ_{k∈I_h} (f|u_i)_1 (u_k|h)_1 (v_i|v_k)_2
= Σ_{i∈I_f∩I_h} (f|u_i)_1 (u_i|h)_1 = (f|h)_1
since {ui }i∈I is a c.o.n.s. in H1 (cf. 10.6.4c). In view of 10.1.20, this proves that
U ∈ U(H1 , H2 ) (in the proof for V , 10.1.20 must be replaced by 10.3.17).
Since U is an isomorphism, it is injective and the proof of surjectivity given
above for U proves also the part of the statement concerning U −1 .
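In finite dimension the construction of 10.6.9 is easy to reproduce numerically. A sketch (assuming NumPy; the two complete orthonormal systems are randomly generated, not taken from the text) showing that f ↦ Σ_i (u_i|f)_1 v_i maps u_k to v_k and preserves inner products:

```python
import numpy as np

def random_onb(n, seed):
    # Columns of the unitary factor Q form a c.o.n.s. of C^n.
    g = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(g.standard_normal((n, n)) + 1j * g.standard_normal((n, n)))
    return [Q[:, i] for i in range(n)]

n = 4
u = random_onb(n, 1)          # c.o.n.s. {u_i} in H1 := C^4
v = random_onb(n, 2)          # c.o.n.s. {v_i} in H2 := C^4

def U(f):
    # U f := sum_i (u_i|f) v_i   (np.vdot conjugates its first argument)
    return sum(np.vdot(ui, f) * vi for ui, vi in zip(u, v))

rng = np.random.default_rng(3)
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# U maps u_k to v_k and preserves inner products: (Uf|Uh) = (f|h).
assert np.allclose(U(u[0]), v[0])
assert np.isclose(np.vdot(U(f), U(h)), np.vdot(f, h))
```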
10.6.10 Remark. Suppose that a c.o.n.s. {ui }i∈I exists in a Hilbert space H.
Then the mapping
V : H → H
f ↦ V f := Σ_{i∈I} (f|u_i) u_i
is an element of A(H) (cf. 10.6.9) and V 2 = 1H , as can be easily seen. Thus, every
antiunitary operator in H is the product of a unitary operator multiplied by V . In
fact, for A ∈ A(H), A = (AV )V and AV ∈ U(H) (cf. 10.3.16c).
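A finite-dimensional sketch of the remark (assuming NumPy; the c.o.n.s. is randomly generated, not from the text): the mapping V f := Σ_i (f|u_i) u_i is antilinear and squares to the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
u = [Q[:, i] for i in range(3)]       # a c.o.n.s. of C^3

def V(f):
    # V f := sum_i (f|u_i) u_i; note (f|u_i) = conj (u_i|f), so V is antilinear.
    return sum(np.vdot(f, ui) * ui for ui in u)

f = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# V is an involution: V(Vf) = f.
assert np.allclose(V(V(f)), f)
# V is antilinear: V(alpha f) = conj(alpha) V f.
alpha = 2 - 3j
assert np.allclose(V(alpha * f), np.conj(alpha) * V(f))
```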
If the axiom of choice is assumed (in its equivalent form known as Zorn's lemma), it can be proved that a c.o.n.s. exists in every non-zero Hilbert space (cf. e.g. Weidmann, 1980, th. 3.10). However, it is possible to prove that there exists a c.o.n.s. in every separable non-zero Hilbert space without using the axiom of choice.
In this section, we give the proof of the existence of a c.o.n.s. in this reduced form
only, because in our opinion the idea of a c.o.n.s. is really useful in separable Hilbert
spaces only (mainly because, as we see below, a c.o.n.s. is countable iff the Hilbert
space is separable). The importance of a theorem which proves the existence of a
c.o.n.s. is that it justifies all the procedures in which complete orthonormal systems
are used.
10.7.1 Theorem. Suppose that a Hilbert space H is separable and non-zero. Then
a countable c.o.n.s. exists in H.
Since {fnk }k∈I is a linearly independent subset of H, 10.2.6 implies that there exists
an o.n.s. {un }n∈I in H such that
L{un }n∈I = L{fnk }k∈I ,
and hence such that
L{un}n∈I = LS.
Then V{u_n}_{n∈I} is the closure of L{u_n}_{n∈I} = LS, and hence it contains the closure of S, which is H (cf. 4.1.13, 3.1.6b, 2.3.9d); thus, V{u_n}_{n∈I} = H.
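The orthonormalization step invoked from 10.2.6 is, in effect, a Gram–Schmidt procedure applied to a countable spanning family, discarding linearly dependent vectors. A minimal numerical sketch (assuming NumPy; the family of vectors is a hypothetical example):

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize, discarding vectors that are (numerically) dependent."""
    ons = []
    for f in vectors:
        g = f - sum(np.vdot(w, f) * w for w in ons)
        norm = np.linalg.norm(g)
        if norm > tol:
            ons.append(g / norm)
    return ons

# A spanning family with a repetition/dependency:
fam = [np.array([1.0, 0, 0]), np.array([2.0, 0, 0]),
       np.array([1.0, 1, 0]), np.array([0.0, 1, 1])]
u = gram_schmidt(fam)

assert len(u) == 3                                   # dependent vector dropped
G = np.array([[np.vdot(a, b) for b in u] for a in u])
assert np.allclose(G, np.eye(3))                     # orthonormal system
```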
Proof. Since M is a subspace, it can be regarded as a Hilbert space on its own (cf.
10.1.6 and 10.3.2), and it is not a zero Hilbert space since M ≠ {0_H}. Moreover,
M is separable (cf. 2.3.20 and 10.1.15). Then, 10.7.1 proves that there exists a
countable c.o.n.s. in the Hilbert space M , and hence a countable o.n.s. in H which
is complete in the subspace M (cf. 10.6.5c).
10.7.3 Corollary. Suppose that a Hilbert space H is separable, and let {ui }i∈I be
an o.n.s. in H. Then there exists a c.o.n.s. in H which contains {ui }i∈I .
Proof. If ({ui }i∈I )⊥ = {0H } then {ui }i∈I is a c.o.n.s. in H (cf. 10.6.5a). Now
assume ({u_i}_{i∈I})⊥ ≠ {0_H}. Then, 10.2.13 and 10.7.2 imply that there exists an
o.n.s. {vj }j∈J in H such that V {vj }j∈J = ({ui }i∈I )⊥ . It is obvious that the family
{ui }i∈I ∪ {vj }j∈J is an o.n.s. in H. Moreover,
({vj }j∈J )⊥ = (V {vj }j∈J )⊥ = (({ui }i∈I )⊥ )⊥
(cf. 10.2.11), and hence
({ui }i∈I ∪ {vj }j∈J )⊥ = ({ui }i∈I )⊥ ∩ ({vj }j∈J )⊥
= ({ui }i∈I )⊥ ∩ (({ui }i∈I )⊥ )⊥ = {0H }
(cf. 10.2.10c,f). This proves that {ui }i∈I ∪ {vj }j∈J is a c.o.n.s. in H (cf. 10.6.5a).
10.7.4 Remark. In the proof of the orthogonal decomposition theorem that was
given in 10.4.1, the axiom of choice (cf. 1.2.22) was used in the construction of the
sequence {g_n} in M which was such that ‖f − g_n‖ → d. Now, corollary 10.7.2 makes
it possible to prove the orthogonal decomposition theorem without resorting to the
axiom of choice, if the Hilbert space is separable. Indeed, if the Hilbert space H
is separable and M is a non-zero subspace of H, 10.7.2 proves that there exists an
o.n.s. {ui }i∈I in H which is complete in M (this o.n.s. is countable, but this has
no relevance here). Since 10.6.1 proves that
Σ_{i∈I} (u_i|f) u_i ∈ V{u_i}_{i∈I} and f − Σ_{i∈I} (u_i|f) u_i ∈ (V{u_i}_{i∈I})⊥, ∀f ∈ H,
then for each f ∈ H we actually have a pair (f_1, f_2) ∈ M × M⊥ such that f = f_1 + f_2
if we define
f_1 := Σ_{i∈I} (u_i|f) u_i and f_2 := f − Σ_{i∈I} (u_i|f) u_i
(the uniqueness of the pair can then be proved as in 10.4.1).
The next two theorems round off our exposition of the relation between separa-
bility of a Hilbert space and countability of orthonormal systems. Theorem 10.7.5
is the converse of theorem 10.7.1.
Proof. Assume that there exists a countable c.o.n.s. {un }n∈I in H. We set out
the proof of the separability of H for I denumerable, from which the proof for I
finite can be derived easily. Let then I := N, and fix f ∈ H and ε > 0. Since
V {un }n∈N = H, 4.1.13 and 2.3.12 imply that
∃f_ε ∈ L{u_n}_{n∈N} such that ‖f − f_ε‖ < ε/2,
and 3.1.7 implies that
∃N_ε ∈ N, ∃(α_1^ε, ..., α_{N_ε}^ε) ∈ C^{N_ε} such that f_ε = Σ_{n=1}^{N_ε} α_n^ε u_n.
Since Q is dense in R (cf. 2.3.16), there exist (a_1^ε, ..., a_{N_ε}^ε), (b_1^ε, ..., b_{N_ε}^ε) ∈ Q^{N_ε} such that
|Re α_n^ε − a_n^ε| < ε/(4N_ε) and |Im α_n^ε − b_n^ε| < ε/(4N_ε), ∀n ∈ {1, ..., N_ε},
and hence such that
‖f − Σ_{n=1}^{N_ε} (a_n^ε + i b_n^ε) u_n‖ ≤ ‖f − f_ε‖ + ‖f_ε − Σ_{n=1}^{N_ε} (a_n^ε + i b_n^ε) u_n‖
< ε/2 + Σ_{n=1}^{N_ε} |α_n^ε − (a_n^ε + i b_n^ε)|
≤ ε/2 + Σ_{n=1}^{N_ε} |Re α_n^ε − a_n^ε| + Σ_{n=1}^{N_ε} |Im α_n^ε − b_n^ε| < ε.
This proves that the subset H_0 of H defined by
H_0 := {Σ_{n=1}^N (a_n + i b_n) u_n : N ∈ N, (a_1, ..., a_N), (b_1, ..., b_N) ∈ Q^N}
is dense in H; since H_0 is countable, H is separable.
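The rational-approximation step of the proof can be illustrated numerically: approximating the real and imaginary parts of each coefficient to within ε/(4N_ε) keeps the total coefficient error below ε/2. A sketch (the coefficients α_n are hypothetical values; `Fraction.limit_denominator` produces the rational approximants):

```python
from fractions import Fraction
import math

import numpy as np

eps = 1e-3

# Hypothetical coefficients alpha_n of f_eps in span{u_1, ..., u_Neps}:
alpha = np.array([0.7 - 1.3j, math.pi + 0.5j])
Neps = len(alpha)
tol = eps / (4 * Neps)

# A denominator bound D > 1/tol guarantees |x - p/q| < 1/D < tol for the
# best rational approximation with denominator <= D.
D = math.ceil(1 / tol) + 1
rat = np.array([complex(float(Fraction(z.real).limit_denominator(D)),
                        float(Fraction(z.imag).limit_denominator(D)))
                for z in alpha])

# Since {u_n} is orthonormal, ||f_eps - sum (a_n + i b_n) u_n|| is at most
# the sum of the coefficient errors, which is below eps/2 by construction.
err = float(np.sum(np.abs(alpha - rat)))
assert err < eps / 2
```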
10.7.6 Remark. From 10.7.5 and 10.6.7 we have that the Hilbert spaces CN and
ℓ2 (cf. 10.3.8c,d) are separable.
Proof. If H is a zero Hilbert space then there is nothing to prove. Now assume
that H is a non-zero separable Hilbert space. Then there exists a countable subset
S of H which is dense in H. Let {ui }i∈I be an o.n.s. in H and define
S_i := {f ∈ S : ‖u_i − f‖ < 1/√2}, ∀i ∈ I.
In view of 2.3.12, Si 6= ∅ for each i ∈ I. Then, by the axiom of choice (cf. 1.2.22)
there exists a mapping ϕ : I → S such that ϕ(i) ∈ Si for each i ∈ I. Moreover,
S_i ∩ S_k = ∅ if i ≠ k;
in fact, if f ∈ S existed such that f ∈ Si ∩ Sk , we would have
‖u_i − u_k‖ ≤ ‖u_i − f‖ + ‖f − u_k‖ < 2/√2 = √2,
while we have, if i ≠ k,
‖u_i − u_k‖ = √((u_i − u_k|u_i − u_k)) = √((u_i|u_i) + (u_k|u_k)) = √2.
Then, the mapping ϕ is injective and hence ϕ is a bijection from I onto Rϕ , which
is a countable set since it is a subset of S (cf. 1.2.10). Thus, I is countable and so
is {ui }i∈I .
10.7.8 Proposition. Suppose that a set X is not finite. Then there exists an
injection i : N → X.
(a) If an o.n.s. in the Hilbert space H is a linear basis in the linear space (H, σ, µ),
then it is a finite set.
(b) If a c.o.n.s. in the Hilbert space H is finite, then it is a linear basis in the linear
space (H, σ, µ).
Proof. a: The proof is by contradiction. Assume that {ui }i∈I is an o.n.s. in the
Hilbert space H, and that it is not finite. Then there exists an injection i : N → I
(cf. 10.7.8) and we can define the vector f := Σ_{n=1}^∞ (1/n) u_{i(n)} (cf. 10.4.8b). Next,
assume that {u_i}_{i∈I} is a linear basis in the linear space (H, σ, µ). Then there exist a finite subfamily {u_{i_1}, ..., u_{i_n}} of {u_i}_{i∈I} and (α_1, ..., α_n) ∈ C^n so that
f = Σ_{k=1}^n α_k u_{i_k},
and hence so that
(ui |f ) = 0, ∀i ∈ I − {i1 , ..., in }.
But this is in contradiction to
(u_{i(n)}|f) = 1/n, ∀n ∈ N,
which holds true by 10.4.8c.
b: If a c.o.n.s. in the Hilbert space H is finite, then 10.2.2 and 10.6.4b prove that it is a linear basis in the linear space (H, σ, µ).
10.7.10 Theorem. Let H be a separable and non-zero Hilbert space. Then a count-
able c.o.n.s. S exists in H and:
(a) if S is finite then every other c.o.n.s. in H is finite and contains the same
number of vectors as S;
(b) if S is denumerable then every other c.o.n.s. in H is denumerable.
10.7.12 Remarks.
(a) From 10.6.7a we have that the orthogonal dimension of the Hilbert space CN is
finite and equal to N , and from 10.6.7b we have that the orthogonal dimension
of the Hilbert space ℓ2 is denumerable.
(b) If M is a subspace of a separable Hilbert space whose orthogonal dimension
is finite, then the orthogonal dimension of M is finite as well. This follows
immediately from 10.7.2 and 10.7.3.
10.7.14 Theorem. Let H1 and H2 be Hilbert spaces, and suppose that H1 is sep-
arable. Then the following conditions are equivalent:
(a) H2 is separable and the orthogonal dimensions of H1 and H2 are equal;
(b) U(H1 , H2 ) is not empty (i.e. H1 and H2 are isomorphic);
(c) A(H1 , H2 ) is not empty.
If H1 is not a zero space and if these conditions are satisfied, then a mapping
T : H1 → H2 is a unitary (or antiunitary) operator iff there exist a c.o.n.s. {un }n∈I
in H1 and a c.o.n.s. {vn }n∈I in H2 , with I := {1, ..., N } or I := N, so that
T f = Σ_{n∈I} (u_n|f)_1 v_n (or T f = Σ_{n∈I} (f|u_n)_1 v_n), ∀f ∈ H_1,
where Σ_{n∈I} stands for Σ_{n=1}^N or Σ_{n=1}^∞.
Proof. The first half of the statement is trivial if H1 is a zero space. Then we
assume that H1 is not a zero space.
The implications “a ⇒ b” and “a ⇒ c”, as well as the “if” part of the second
half of the statement are proved by 10.6.9.
The implications “b ⇒ a” and “c ⇒ a” are proved by 10.6.8b.
As to the “only if” part of the second half of the statement, let T ∈ UA(H1 , H2 )
and let {un }n∈I be a c.o.n.s. in H1 . We may assume I := {1, ..., N } or I := N by
10.7.7. If we define vn := T un for all n ∈ I, then {T un}n∈I is a c.o.n.s. in H2 by
10.6.8b and we have, in view of 10.6.4b,
T f = T(Σ_{n∈I} (u_n|f)_1 u_n) = Σ_{n∈I} (u_n|f)_1 T u_n = Σ_{n∈I} (u_n|f)_1 v_n, ∀f ∈ H_1,
if T ∈ U(H_1, H_2) or
T f = T(Σ_{n∈I} (u_n|f)_1 u_n) = Σ_{n∈I} (f|u_n)_1 T u_n = Σ_{n∈I} (f|u_n)_1 v_n, ∀f ∈ H_1,
if T ∈ A(H1 , H2 ), since T is a linear or antilinear operator and (if I = N) since
T is a continuous mapping in either case (cf. 10.1.21 and 4.6.2.d, or 10.3.16e, and
2.4.5).
10.7.15 Remarks.
(a) From 10.7.14 and 10.7.12 we have that any non-zero separable Hilbert space is
isomorphic either to CN for a suitable N ∈ N or to ℓ2 .
(b) If the orthogonal dimension of a separable Hilbert space H is finite and equal
to N , for every c.o.n.s. {u1 , ..., uN } in H the mapping
U : H → CN
f ↦ U f := ((u_1|f), ..., (u_N|f))
is a unitary operator from H onto CN . This follows immediately from 10.6.9
with {ui }i∈I := {u1 , ..., uN } and {vi }i∈I := {e1 , ..., eN } (cf. 10.6.7a).
(c) If the orthogonal dimension of a separable Hilbert space H is denumerable, for
every c.o.n.s. {un }n∈N in H the mapping
U : H → ℓ2
f ↦ U f := {(u_n|f)}
is a unitary operator from H onto ℓ2 . This follows immediately from 10.6.9
with {ui }i∈I := {un }n∈N and {vi }i∈I := {δn }n∈N (cf. 10.6.7b). In fact, for
each f ∈ H, the sequence ξ := {(u_n|f)} is an element of ℓ² by 10.2.8b, and Σ_{n=1}^∞ |(u_n|f)|² < ∞ implies that
‖ξ − Σ_{n=1}^N (u_n|f) δ_n‖² = Σ_{n=N+1}^∞ |(u_n|f)|² → 0 as N → ∞.
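In a finite-dimensional model, the coordinate mapping f ↦ {(u_n|f)} of remarks b and c is just multiplication by the adjoint of the matrix whose columns are the u_n, and its unitarity can be checked directly. A sketch (assuming NumPy; a randomly generated real orthonormal basis):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))   # columns: a c.o.n.s. {u_n}

f = rng.standard_normal(N)
g = rng.standard_normal(N)

# U f := ((u_1|f), ..., (u_N|f)) computed all at once as Q^T f (real case).
xi, eta = Q.T @ f, Q.T @ g

assert np.isclose(np.dot(xi, eta), np.dot(f, g))        # (Uf|Ug) = (f|g)
assert np.isclose(np.linalg.norm(xi), np.linalg.norm(f))
```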
(d) The reason why, notwithstanding remark a, other separable Hilbert spaces are
worth studying besides CN and ℓ2 is that there are problems which can be
formulated in separable Hilbert spaces and which, although in their abstract
form they are the same in all isomorphic Hilbert spaces, are actually easier to
solve or even to phrase in certain Hilbert spaces than in others.
Proof. If M = {0_X} then all the conditions of the statement are true. Therefore, suppose M ≠ {0_X}.
a ⇒ [b, c, d]: Assume condition a, let N be the linear dimension of
(M, σM×M , µC×M ), and let {f1 , ..., fN } be a linear basis in M . Then by 10.2.6
there exists an o.n.s. {u1 , ..., uN } in X such that
L{u1, ..., uN } = L{f1 , ..., fN } = M.
Now suppose that {gn } is a Cauchy sequence in M . Then,
∀n ∈ N, ∃(α_1^{(n)}, ..., α_N^{(n)}) ∈ C^N such that g_n = Σ_{k=1}^N α_k^{(n)} u_k,
and, by 10.2.3,
Σ_{k=1}^N |α_k^{(n)} − α_k^{(m)}|² = ‖g_n − g_m‖² → 0 as n, m → ∞.
Thus, for all k ∈ {1, ..., N}, the sequence {α_k^{(n)}} is a Cauchy sequence in C and therefore there exists α_k ∈ C such that α_k = lim_{n→∞} α_k^{(n)}. By the continuity of
vector sum and of scalar multiplication, this implies that
Σ_{k=1}^N α_k u_k = lim_{n→∞} g_n.
This proves that the metric space (M, dφM ×M ) is complete and consequently that
(M, σ_{M×M}, µ_{C×M}, φ_{M×M}) is a Hilbert space. From 2.6.6a it follows that M is a closed subset of X, i.e. a subspace of X.
This implies that f = 0M . By 10.6.4, this proves that {u1 , ..., uN } is a c.o.n.s.
in the Hilbert space (M, σM×M , µC×M , φM×M ). Therefore, the Hilbert space
(M, σM×M , µC×M , φM×M ) is separable (cf. 10.7.5) and its orthogonal dimension
is N .
b ⇒ a: This is proved by 10.7.9b.
Proof. From 10.8.1 we obtain immediately that X is a separable Hilbert space and
that its orthogonal dimension is finite.
Let M be a linear manifold in X. If M = {0X } then M is a subspace. Now
suppose M ≠ {0_X}. If we had proved in Section 3.1 that every linear manifold in a
finite-dimensional linear space was finite-dimensional then we would have that M
is a subspace of X by 10.8.1. However we did not prove that result and therefore
we must take a different tack. From 2.3.20 we have that M is separable. Hence,
there exists a countable subset S of M such that M is contained in the closure of S (cf. 2.3.13). Proceeding
as in the proof of 10.7.1 we see that there exists an o.n.s. {u_n}_{n∈I} in X such that
L{u_n}_{n∈I} = LS,
and hence such that M is contained in the closure of LS, which coincides with the closure of L{u_n}_{n∈I}.
Now, {un }n∈I is a linearly independent subset of X (cf. 10.2.2) and hence it must
be a finite set (cf. 3.1.16). Then, L{un }n∈I is a finite-dimensional linear manifold
in X and hence it is a subspace of X by 10.8.1. Thus,
L{u_n}_{n∈I} coincides with its closure,
and hence
M ⊂ L{u_n}_{n∈I}.
Since M is a linear manifold containing S, we have
L{u_n}_{n∈I} = LS ⊂ M
and hence M = L{u_n}_{n∈I}. This proves that M is a subspace of X.
10.8.3 Proposition.
(A) Let A be a linear operator from an inner product space X to a normed space
Y and suppose that the linear manifold DA is finite-dimensional. Then A is
bounded, and hence continuous.
(B) We say that a Hilbert space is finite-dimensional if it is separable and its or-
thogonal dimension is finite; in view of 10.7.9b and 10.8.2, this is equivalent
to its being finite dimensional as a linear space.
Every linear operator in a finite-dimensional Hilbert space is bounded.
Proof. A: From 10.8.1 we have that DA is a separable Hilbert space and that its
orthogonal dimension is finite. Then, let {u1 , ..., uN } be a c.o.n.s. in the Hilbert
space DA , and define
K := max{kAun k : n ∈ {1, ..., N }}.
We have, for all f ∈ DA ,
‖Af‖ = ‖A Σ_{n=1}^N (u_n|f) u_n‖ ≤ Σ_{n=1}^N |(u_n|f)| ‖Au_n‖
≤ K Σ_{n=1}^N |(u_n|f)| ≤ K√N (Σ_{n=1}^N |(u_n|f)|²)^{1/2} = K√N ‖f‖,
where 10.6.4b,d and the Schwarz inequality in CN (cf. 10.3.8c) have been used.
This proves that A is bounded, and hence continuous (cf. 4.2.2).
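The bound obtained in the proof can be observed numerically. A sketch (assuming NumPy; the operator is a random matrix on a 5-dimensional space, real entries for simplicity, and the c.o.n.s. is the standard basis, so K is the largest column norm):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
A = rng.standard_normal((N, N))       # a linear operator, in the standard basis

# K := max_n ||A e_n|| is the largest column norm of A.
K = max(np.linalg.norm(A[:, n]) for n in range(N))

# ||A f|| <= K * sqrt(N) * ||f|| for every f, as in the proof.
for _ in range(100):
    f = rng.standard_normal(N)
    assert np.linalg.norm(A @ f) <= K * np.sqrt(N) * np.linalg.norm(f) + 1e-12
```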
B: Let A be a linear operator in a finite-dimensional Hilbert space. The domain
DA is a linear manifold in H, and hence it is a subspace of H (cf. 10.8.2). Therefore,
the orthogonal dimension of DA is finite (cf. 10.7.12b). Then, proceeding as in the
proof of statement A, we see that the operator A is bounded.
O_E(H) ∋ A ↦ U A U⁻¹ ∈ O_E(C^N)
is an isomorphism from the associative algebra OE (H) (cf. 3.3.7) onto the asso-
ciative algebra OE (CN ). The composition of this isomorphism with the one from
OE (CN ) onto M(N ) mentioned above is obviously an isomorphism ΦU from OE (H)
onto M(N ). Now, for A ∈ OE (H), ΦU (A) is the element [αik ] of M(N ) such that,
for all (x_1, ..., x_N) ∈ C^N,
U A U⁻¹(x_1, ..., x_i, ..., x_N) = (Σ_{k=1}^N α_{1k} x_k, ..., Σ_{k=1}^N α_{ik} x_k, ..., Σ_{k=1}^N α_{Nk} x_k);
we also have
U A U⁻¹(x_1, ..., x_i, ..., x_N) = U A Σ_{k=1}^N x_k u_k = U Σ_{k=1}^N x_k A u_k
= ((u_1|Σ_{k=1}^N x_k A u_k), ..., (u_i|Σ_{k=1}^N x_k A u_k), ..., (u_N|Σ_{k=1}^N x_k A u_k))
= (Σ_{k=1}^N (u_1|Au_k) x_k, ..., Σ_{k=1}^N (u_i|Au_k) x_k, ..., Σ_{k=1}^N (u_N|Au_k) x_k);
this proves that ΦU (A) = [(ui |Auk )]. We underline the fact that the isomorphism
ΦU depends in a crucial way on the c.o.n.s. {u1 , ..., uN } in H that was chosen in
order to define the isomorphism U .
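The formula Φ_U(A) = [(u_i|Au_k)] can be checked in coordinates: if Q is the matrix whose columns are the chosen c.o.n.s., then U = Q* and the matrix of A in that basis is Q*AQ. A sketch (assuming NumPy; the basis and the operator are randomly generated):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Q, _ = np.linalg.qr(M)
u = [Q[:, k] for k in range(N)]       # a c.o.n.s. {u_1, ..., u_N} in H = C^N

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

# Matrix elements Phi_U(A)[i, k] = (u_i | A u_k).
Phi = np.array([[np.vdot(ui, A @ uk) for uk in u] for ui in u])

# Equivalently Phi = U A U^{-1} with U f := ((u_1|f), ..., (u_N|f)), i.e. U = Q*.
U = Q.conj().T
assert np.allclose(Phi, U @ A @ np.linalg.inv(U))
```

Note how Φ_U depends on the chosen basis: a different Q yields a different, though similar, matrix.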
difference that we find it more convenient to arrange his reasoning in two theorems
instead of one.
10.9.2 Proposition. Let H and H′ be Hilbert spaces and suppose that there exists a
linear or an antilinear operator U : H → H′ which fulfils either one of the following
conditions:
(U f |U g) = (f |g) , ∀f, g ∈ H, if U is linear;
(U f |U g) = (g|f ) , ∀f, g ∈ H, if U is antilinear.
Then the mapping
ΦU : Hq → Hq′
[f] ↦ Φ_U([f]) := [U f]
10.9.3 Theorem. Let H and H′ be Hilbert spaces and suppose that there exists a
mapping Φ : Hq → Hq′ which has the following properties:
Φ(a[f ]) = aΦ([f ]), ∀a ∈ [0, ∞), ∀f ∈ H;
τ ′ (Φ([f ]), Φ([g])) = τ ([f ], [g]), ∀f, g ∈ H.
Then the following properties are true:
(A) The mapping Φ is injective.
(B) There exists a linear or an antilinear operator U : H → H′ which is so that
ΦU = Φ, i.e. [U f ] = Φ([f ]) for all f ∈ H,
and which fulfils either one of the following conditions:
(U f |U g) = (f |g) , ∀f, g ∈ H, if U is linear;
(U f |U g) = (g|f ) , ∀f, g ∈ H, if U is antilinear.
(C) If the mapping Φ is surjective onto Hq′ then the operator U is surjective onto
H′ and hence it is a unitary or an antiunitary operator, i.e. U ∈ UA(H, H′ ).
(D) If H is not one-dimensional as a linear space, and if a mapping V : H → H′
is such that
V (f + g) = V f + V g, ∀f, g ∈ H,
and
[V f ] = Φ([f ]), ∀f ∈ H,
then there exists z ∈ T so that V f = zU f for all f ∈ H.
Proof. A: We have
‖Φ([f])‖² = τ′(Φ([f]), Φ([f])) = τ([f], [f]) = ‖[f]‖², ∀f ∈ H. (1)
This implies that, for f ∈ H,
Φ([f ]) = [0H′ ] iff f = 0H . (2)
Moreover, if f, g ∈ H − {0H } are so that Φ([f ]) = Φ([g]), then 1 implies that
τ′(Φ([f]), Φ([g])) = ‖[f]‖² = ‖[g]‖²,
and hence
‖f‖ ‖g‖ = ‖[f]‖ ‖[g]‖ = τ′(Φ([f]), Φ([g])) = τ([f], [g]) = |(f|g)|;
by 10.1.7b and 3.1.14, this implies that
∃z ∈ T such that f = zg, i.e. [f ] = [g].
This proves that the mapping Φ is injective.
B: We prove in what follows the existence of U with the required properties. If
H is a zero Hilbert space, this is true trivially. In the other cases, the proof is by
construction.
First we assume that, as a linear space, H is one-dimensional, and we choose an
element u ∈ H such that ‖u‖ = 1. Then,
∀f ∈ H, ∃!α ∈ C so that f = αu.
Therefore, if we choose u′ ∈ Φ([u]), we can define the mappings
U_1 : H → H′
αu ↦ U_1(αu) := αu′
and
U_2 : H → H′
αu ↦ U_2(αu) := ᾱu′.
The mapping U1 is linear since
U1 (α(βu) + γ(δu)) = (αβ + γδ)u′ = α(βu′ ) + γ(δu′ )
= αU1 (βu) + γU1 (δu), ∀α, β, γ, δ ∈ C;
we also have
[U1 (αu)] = [αu′ ] = |α|[u′ ] = |α|Φ([u]) = Φ(|α|[u]) = Φ([αu]), ∀α ∈ C,
and
(U_1(αu)|U_1(βu)) = (αu′|βu′) = ᾱβ = (αu|βu), ∀α, β ∈ C,
which is true since 1 implies that
‖u′‖ = ‖Φ([u])‖ = ‖[u]‖ = ‖u‖ = 1.
Thus, both the mappings U1 and U2 have the properties required for U in the
statement, and the proof for the one-dimensional case is concluded.
In what follows we assume that H, as a linear space, is neither zero-dimensional
nor one-dimensional. During the construction of U we shall need the results that
we collect in the following preliminary remarks.
Remark 1: Suppose that f ∈ H is so that U f has been defined and [U f ] = Φ([f ]).
If g ∈ H is also so that U g has been defined and [U g] = Φ([g]), then
|(Uf|Ug)| = τ′([Uf], [Ug]) = τ′(Φ([f]), Φ([g])) = τ([f], [g]) = |(f|g)|. (3)
Hence, for g := f ,
‖Uf‖ = ‖f‖. (4)
Now suppose f ≠ 0_H and also that, for all α ∈ C, U(αf) has been defined and
[U (αf )] = Φ([αf ]); then
U (αf ) ∈ Φ([αf ]) = |α|Φ([f ]) = |α|[U f ] = [αU f ], ∀α ∈ C,
and hence
∀α ∈ C, ∃z ∈ T so that U (αf ) = zαU f,
and hence
∀α ∈ C, ∃!αf ∈ C so that |αf | = |α| and U (αf ) = αf U f, (5)
where uniqueness holds since Uf ≠ 0_{H′} (cf. 4). This defines the function
C ∋ α ↦ χ_f(α) := α_f ∈ C.
Obviously, χf (1) = 1.
Remark 2: Let {ui }i∈I be a finite o.n.s. in H. If we choose u′i ∈ Φ([ui ]) for each
i ∈ I, then {u′i }i∈I is an o.n.s. in H′ since
|(u_i′|u_j′)| = τ′(Φ([u_i]), Φ([u_j])) = τ([u_i], [u_j]) = |(u_i|u_j)|, ∀i, j ∈ I.
If f = Σ_{i∈I} α_i u_i with α_i ∈ C for all i ∈ I, then for each f′ ∈ Φ([f]) we have
f′ = Σ_{i∈I} α_i′ u_i′ with α_i′ ∈ C such that |α_i′| = |α_i| for all i ∈ I. In fact, 1 implies
‖f′‖² = ‖f‖² = Σ_{i∈I} |α_i|²,
and also, with α_i′ := (u_i′|f′) for all i ∈ I,
‖f′‖² = ‖f′ − Σ_{i∈I} α_i′ u_i′ + Σ_{i∈I} α_i′ u_i′‖²
= ‖f′ − Σ_{i∈I} α_i′ u_i′‖² + ‖Σ_{i∈I} α_i′ u_i′‖² = ‖f′ − Σ_{i∈I} α_i′ u_i′‖² + Σ_{i∈I} |α_i′|²
Now suppose that g ′′ ∈ Φ([g]) is such that u′ + g ′′ ∈ Φ([u + g]). Then there exists
z ∈ T so that
u′ + g ′′ = z(u′ + g ′ ), i.e. (1 − z)u′ = zg ′ − g ′′ ;
since
| (u′ |g ′ ) | = | (u′ |g ′′ ) | = τ ′ (Φ([u]), Φ([g])) = τ ([u], [g]) = | (u|g) | = 0,
we have
(1 − z)u′ = 0H′ and zg ′ = g ′′ ,
and hence z = 1 and g ′′ = g ′ . Thus, 6 is proved.
Moreover, we note that
∃!g ′ ∈ Φ([0H ]) such that u′ + g ′ ∈ Φ([u + 0H ]);
indeed, the vector g ′ := 0H′ satisfies the above condition trivially since Φ([0H ]) =
[0H′ ] (cf. 2), and it is the only one that does so since the equivalence class [0H′ ]
contains just the vector 0H′ .
Thus, we have proved that
∀g ∈ {u}⊥, ∃!g ′ ∈ Φ([g]) such that u′ + g ′ ∈ Φ([u + g]).
Therefore we can define, for each g ∈ {u}⊥ ,
U g := g ′ and U (u + g) := u′ + g ′
if g ′ is the unique element of Φ([g]) such that u′ + g ′ ∈ Φ([u + g]). Since g ′ = 0H′ if
g = 0H (see above), this entails
U 0H = 0H′ and U u = u′ ,
and hence
U (u + g) = U u + U g, ∀g ∈ {u}⊥. (7)
We point out that
[U g] = Φ([g]), ∀g ∈ {u}⊥ , (8)
and
[U (u + g)] = Φ([u + g]), ∀g ∈ {u}⊥ . (9)
Step 3: In this step we prove that, for each v ∈ {u}⊥ such that kvk = 1,
either χv (α) = α for all α ∈ C or χv (α) = α for all α ∈ C. (10)
Let g and h be elements of {u}⊥ . From 7, 9, 3 we have
| (U u + U g|U u + U h) |2 = | (u + g|u + h) |2 ;
from 8, 9, 3 we have
(U u|U g) = (U u|U h) = 0;
Let v be an element of {u}⊥ such that kvk = 1. Then {u}⊥ = {αv : α ∈ C}.
For g, h ∈ {u}⊥ , by 8, 4, 5, 16, 17, 18 we have, if β, γ ∈ C are so that g = βv and
h = γv:
U (g + h) = U (βv + γv) = U ((β + γ)v) = χv (β + γ)U v
= (χv (β) + χv (γ))U v = χv (β)U v + χv (γ)U v
= U (βv) + U (γv) = U (g) + U (h);
(U g|U h) = (χ_{u_1}(β_1) U u_1 + χ_{u_1}(β_2) U u_2 | χ_{u_1}(γ_1) U u_1 + χ_{u_1}(γ_2) U u_2)
= χ_{u_1}(β̄_1) χ_{u_1}(γ_1) + χ_{u_1}(β̄_2) χ_{u_1}(γ_2) = χ_{u_1}(β̄_1 γ_1 + β̄_2 γ_2) (28)
= χ_{u_1}((β_1 u_1 + β_2 u_2 | γ_1 u_1 + γ_2 u_2)) = χ_{u_1}((g|h)).
Now we are ready to prove 19, 20, 21 for all g, h ∈ {u}⊥. Fix g0 ∈ {u}⊥ − {0H } and
define u_1 := (1/‖g_0‖) g_0. For each h ∈ {u}⊥ there exists u_2 ∈ {u}⊥ so that {u_1, u_2} is
an o.n.s. and h ∈ L{u1 , u2 }: u2 is obtained as in 10.2.6 with I := {1, 2}, f1 := g0 ,
and either f2 := h if h is not a multiple of g0 or f2 any element of {u}⊥ which is
not a multiple of g0 if h is a multiple of g0 . Then, from 26 we obtain
U (g0 + h) = U g0 + U h, ∀h ∈ {u}⊥ .
Since g0 is an arbitrary element of {u}⊥ − {0H } and since 19 holds trivially for all
h ∈ {u}⊥ if g = 0H (recall that U 0H = 0H′ ), this proves 19 for all g, h ∈ {u}⊥.
Furthermore, from 27 we obtain 20 for all h ∈ {u}⊥, with χ := χu1 . Finally, from
28 we obtain
(U g0 |U h) = χu1 ((g0 |h)), ∀h ∈ {u}⊥ .
If g is any element of {u}⊥ − {0H } different from g0 , proceeding as above we obtain
(U g|U h) = χ_{u_{1,g}}((g|h)), ∀h ∈ {u}⊥,
with u_{1,g} := (1/‖g‖) g. However, 27 proves that χ_h = χ_{u_1} for all h ∈ {u}⊥ − {0_H}.
Thus we have
(U g|U h) = χu1 ((g|h)), ∀g ∈ {u}⊥ − {0H }, ∀h ∈ {u}⊥ .
Since 21 holds trivially for all h ∈ {u}⊥ if g = 0H , this proves 21 for all g, h ∈ {u}⊥,
with χ := χu1 . Finally, recall 10.
Step 5: In this step we define U f for all f ∈ H and conclude the proof of part
B of the statement.
From 10.4.1 we have
∀f ∈ H, ∃!(f1 , f2 ) ∈ V {u} × (V {u})⊥ so that f = f1 + f2 .
Since V {u} = {αu : α ∈ C} (cf. 4.1.15) and (V {u})⊥ = {u}⊥ (cf. 10.2.11), we have
∀f ∈ H, ∃!(α, g) ∈ C × {u}⊥ so that f = αu + g.
Therefore, we can define U f for all f ∈ H by letting
U (αu + g) := χ(α)U u + U g, ∀α ∈ C, ∀g ∈ {u}⊥ ,
where χ is the function of step 4. For α = 0 or α = 1, this definition coincides with
the ones already given in step 2 since χ(0) = 0 and χ(1) = 1 (cf. 7).
We have already noted that [U g] = Φ([g]) for all g ∈ {u}⊥ (cf. 8). Moreover,
for every α ∈ C − {0} and every g ∈ {u}⊥ we have
[U(αu + g)] = [χ(α)(U u + U(α⁻¹g))] = |χ(α)| [U(u + α⁻¹g)]
= |α| Φ([u + α⁻¹g]) = Φ([αu + g]),
where 20, 7, 9 have been used. Thus,
[U f ] = Φ([f ]), ∀f ∈ H.
Finally, for f1 , f2 ∈ H we have, if α1 , α2 ∈ C and g1 , g2 ∈ {u}⊥ are so that f1 =
α1 u + g1 and f2 = α2 u + g2 ,
U (f1 + f2 ) = U ((α1 + α2 )u + (g1 + g2 ))
= χ(α1 + α2 )U u + U (g1 + g2 )
= χ(α1 )U u + U g1 + χ(α2 )U u + U g2
= U (α1 u + g1 ) + U (α2 u + g2 ) = U f1 + U f2 ,
where 16 and 19 have been used; we also have
U (αf1 ) = U (αα1 u + αg1 ) = χ(αα1 )U u + U (αg1 )
= χ(α)(χ(α1 )U u + U g1 ) = χ(α)U f1 , ∀α ∈ C,
10.9.5 Proposition. Let H and H′ be non-zero Hilbert spaces and suppose that the
family UA(H, H′ ) is not empty. For every U ∈ UA(H, H′ ), the mapping
ωU : Ĥ → Ĥ′
[u] ↦ ω_U([u]) := [U u]
Thus, Φω has all the properties that were assumed for Φ in 10.9.3. Then, Φω is
injective by 10.9.3A and so is ω. Thus, ω is an isomorphism from Ĥ onto Ĥ′ .
Further, if f′ ∈ H′ − {0_{H′}} then (1/‖f′‖) f′ ∈ H̃′ and hence (since ω is surjective onto Ĥ′) there exists u ∈ H̃ so that
ω([u]) = [(1/‖f′‖) f′],
and hence so that
Φ_ω([‖f′‖ u]) = ‖f′‖ Φ_ω([u]) = ‖f′‖ ω([u]) = ‖f′‖ [(1/‖f′‖) f′] = [f′].
This proves that the mapping Φω is surjective onto Hq′ . Then, by 10.9.3B,C there
exists U ∈ UA(H, H′ ) such that
[U f ] = Φω ([f ]), ∀f ∈ H,
and hence such that
[U u] = Φω ([u]) = ω([u]), ∀u ∈ H̃.
If V ∈ UA(H, H′) is such that ω_V = ω, then we have:
[V f] = [‖f‖ V((1/‖f‖) f)] = ‖f‖ [V((1/‖f‖) f)] = ‖f‖ ω_V([(1/‖f‖) f])
= ‖f‖ ω([(1/‖f‖) f]) = Φ_ω([f]), ∀f ∈ H − {0_H};
10.9.7 Remarks.
(a) If H and H′ are non-zero Hilbert spaces and ω is an isomorphism from Ĥ onto
Ĥ′ , we say that an element U of UA(H, H′ ) implements ω if ωU = ω.
Suppose that H and H′ are non-zero Hilbert spaces and that the projective
spaces (Ĥ, τ ), (Ĥ′ , τ ′ ) are isomorphic. Further, suppose that H and H′ are
not one-dimensional as linear spaces. If U is an element of UA(H, H′ ) which
implements an isomorphism ω from Ĥ onto Ĥ′ , then it is clear from 10.9.5 and
10.9.6 that another element V of UA(H, H′ ) implements ω iff there exists z ∈ T
so that V = zU . Then, the operators in UA(H, H′ ) which implement a given
isomorphism are either all unitary or all antiunitary.
(b) Let H be a non-zero Hilbert space. It is obvious that the mapping
UA(H) ∋ U ↦ ω_U ∈ Aut Ĥ
is a homomorphism from the group UA(H) (cf. 10.3.16d) to the group Aut Ĥ.
Wigner’s theorem proves that this homomorphism is surjective onto Aut Ĥ.
Chapter 11
L2 Hilbert Spaces
This chapter deals with concrete realizations of the abstract concept of a Hilbert space; these realizations are used in most applications of Hilbert space theory.
11.1 L2 (X, A, µ)
a condition for the equivalence class [ϕ] even though it is expressed through a
particular representative of it. Now, the above implication follows from 8.1.17c.
Then, 11.1.2a proves that L2 (X, A, µ) is a linear manifold in M (X, A, µ).
To prove the consistency of the definition of the function φ, first we note
that 11.1.2b implies that ϕψ ∈ L1 (X, A, µ) for all ϕ, ψ ∈ L2 (X, A, µ). Next, if
ϕ′ , ϕ, ψ ′ , ψ ∈ L2 (X, A, µ) are so that ϕ′ ∼ ϕ and ψ ′ ∼ ψ, then
ϕ′(x)ψ′(x) = ϕ(x)ψ(x) µ-a.e. on D_{ϕψ} ∩ D_{ϕ′ψ′}
(the proof is similar to the one given in 8.2.13 for ϕ′ + ψ′ ∼ ϕ + ψ) and hence
∫_X ϕ̄′ψ′ dµ = ∫_X ϕ̄ψ dµ
by 8.2.7. Thus, the number ∫_X ϕ̄ψ dµ depends only on the equivalence classes [ϕ] and [ψ], and not on the particular representatives ϕ and ψ through which it is obtained.
As to the properties listed in 10.1.3 which φ must have in order to be an inner
product, ip1 follows from 8.2.9, ip2 follows from the fact that
∫_X ϕ̄ψ dµ is the complex conjugate of ∫_X ψ̄ϕ dµ, ∀ϕ, ψ ∈ L²(X, A, µ)
(cf. 8.2.3), and ip3 is obvious. Finally, for ϕ ∈ L²(X, A, µ) we have
∫_X |ϕ|² dµ = ([ϕ]|[ϕ]) = 0 ⇒ ϕ(x) = 0 µ-a.e. on D_ϕ ⇒ [ϕ] = 0_{L²(X,A,µ)}
(cf. 8.1.18a), and this shows that φ has property ip4.
and then writes L2 (X, A, µ) := L2 (X, A, µ)/∼ for the quotient set. Then one
defines vector sum, scalar multiplication and inner product by

[ϕ] + [ψ] := [ϕ + ψ], α[ϕ] := [αϕ], ([ϕ]|[ψ]) := ∫_X ϕ̄ψ dµ.

This way of defining the inner product space L2 (X, A, µ) is more frequent than
ours.
In a similar way one can give an alternative but equivalent definition of the
normed space L1 (X, A, µ) (cf. 8.2.15).
11.1.6 Theorem. The normed space L1 (X, A, µ) is a Banach space and the inner
product space L2 (X, A, µ) is a Hilbert space.
Proof. In what follows, p stands for either 1 or 2. We need to prove that the metric
space (Lp (X, A, µ), d) is complete, where d is the distance on Lp (X, A, µ) defined
by
d([ϕ], [ψ]) := ‖[ϕ] − [ψ]‖ = ( ∫_X |ϕ − ψ|^p dµ )^{1/p} , ∀[ϕ], [ψ] ∈ Lp (X, A, µ).
Then, let {[ϕn ]} be a Cauchy sequence in Lp (X, A, µ), and for each ε > 0 let Nε ∈ N
be so that

n, m ≥ Nε ⇒ ‖[ϕn ] − [ϕm ]‖ < ε.
11.1.7 Remark. From the proof of 11.1.6 we obtain the following result:
if {[ϕn ]} is a convergent sequence in L1 (X, A, µ) or in L2 (X, A, µ) and
[ϕ] := lim_{n→∞} [ϕn ], then there exists a subsequence {[ϕnk ]} so that

ϕ(x) = lim_{k→∞} ϕnk (x) µ-a.e. on X.
Proof. Since S(X, A, µ) and L2 (X, A, µ) are linear manifolds in the linear space
M (X, A, µ), the same holds true for their intersection (cf. 3.1.5). Then S 2 (X, A, µ)
is a linear manifold in the linear space L2 (X, A, µ) as well (cf. 3.1.4b).
Let [ϕ] ∈ L2 (X, A, µ) and assume that the representative ϕ is an element of
M(X, A) (cf. 8.2.12). Then, by 6.2.27 there exists a sequence {ψn } in S(X, A)
such that
|ψn (x)| ≤ |ϕ(x)|, ∀x ∈ X, ∀n ∈ N,
lim_{n→∞} ψn (x) = ϕ(x), ∀x ∈ X,
and hence such that
|ψn (x) − ϕ(x)|² ≤ 4|ϕ(x)|², ∀x ∈ X,
lim_{n→∞} |ψn (x) − ϕ(x)|² = 0, ∀x ∈ X.
Then ψn ∈ L2 (X, A, µ) by 8.2.5, and by 8.2.11 we have
lim_{n→∞} ‖[ψn ] − [ϕ]‖ = lim_{n→∞} ( ∫_X |ψn − ϕ|² dµ )^{1/2} = 0.

In view of 2.3.12, this proves that S² (X, A, µ) is dense in L2 (X, A, µ).
is consistent.
The definition of the mapping
V : L2 (X1 , A1 , µ1 ) → L2 (X2 , A2 , µ2 )
[ψ] 7→ V [ψ] := [ψ ◦ π −1 ]
is also consistent.
The mappings U and V are unitary operators and V = U −1 .
By 1.2.16b, this proves that U is injective and U −1 = V , and therefore also that
RU = L2 (X1 , A1 , µ1 ). Moreover, for all [ϕ], [ψ] ∈ L2 (X2 , A2 , µ2 ) we have
∫_{X1} (ϕ ∘ π)(ψ ∘ π) dµ1 = ∫_{X2} ϕψ dµ2
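Though its full statement falls on a page not reproduced here, 11.1.10 is used below through the identity just displayed, and that identity can be probed numerically. A sketch, assuming π(x) = 2π(x − a)/(b − a) (the map used for L2 (a, b) in the next section) and arbitrary sample functions `phi`, `psi`; the inner products conjugate the first factor, as usual:

```python
import numpy as np

a, b = 1.0, 4.0
x = np.linspace(a, b, 200001)                  # grid on X1 = [a, b]
t = 2*np.pi*(x - a)/(b - a)                    # t = pi(x): grid on X2 = [0, 2*pi]
integ = lambda y, g: np.sum((y[1:] + y[:-1]) / 2 * np.diff(g))   # trapezoid rule

phi = lambda s: np.cos(s) + 1j*np.sin(2*s)     # arbitrary sample functions on [0, 2*pi]
psi = lambda s: np.exp(1j*s)

# integral over X1 of conj(phi o pi)(psi o pi) d(mu_1), mu_1 = Lebesgue on [a, b]
lhs = integ(np.conj(phi(t))*psi(t), x)
# integral over X2 of conj(phi) psi d(mu_2), mu_2 = ((b-a)/(2*pi)) * Lebesgue on [0, 2*pi]
rhs = (b - a)/(2*np.pi) * integ(np.conj(phi(t))*psi(t), t)
print(abs(lhs - rhs))  # ~ 0: composition with pi preserves inner products
```

The agreement is exact up to quadrature rounding, because µ2 was chosen as the image of µ1 under π.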
11.2 L2 (a, b)
In this section, a and b are two real numbers such that a < b.
We write M(a, b) := M(X, A, µ), L1 (a, b) := L1 (X, A, µ), L2 (a, b) := L2 (X, A, µ),
L2 (a, b) := L2 (X, A, µ) if X := [a, b], A := (A(dR ))[a,b] , µ := m[a,b] ,
where m[a,b] is the Lebesgue measure on [a, b] (cf. 9.3.1). Moreover, we denote
(A(dR ))[a,b] by the symbol A[a,b] .
11.2.1 Theorem. The inclusion C(a, b) ⊂ L2 (a, b) holds true. If the mapping ι is
defined by
ι : C(a, b) → L2 (a, b)
ϕ 7→ ι(ϕ) := [ϕ],
then the pair (L2 (a, b), ι) is a completion of the inner product space C(a, b) (for
which, cf. 10.1.5b).
Proof. For every ϕ ∈ C(a, b), ϕ ∈ M(a, b) (cf. 6.2.8). Moreover, the function |ϕ|2
is bounded since it is an element of C(a, b) (cf. 3.1.10f), and therefore |ϕ|2 ∈ L1 (a, b)
(cf. 8.2.6), and hence ϕ ∈ L2 (a, b).
If the inner products in C(a, b) and in L2 (a, b) are denoted by φ and by φ̂
respectively, directly from their definitions we have
φ̂(ι(ϕ), ι(ψ)) = ∫_[a,b] ϕψ dm[a,b] = φ(ϕ, ψ), ∀ϕ, ψ ∈ C(a, b).
11.2.2 Remarks.
(a) Since (ι(C(a, b)))⁻ = L2 (a, b) and ι(C(a, b)) ≠ L2 (a, b) (here (·)⁻ denotes
closure), ι(C(a, b)) is not a closed subset of L2 (a, b) (cf. 2.3.9c). Hence, the inner
product space ι(C(a, b)) (cf. 10.3.5) is not a Hilbert space (cf. 2.6.6a). Since
the inner product spaces ι(C(a, b)) and C(a, b) are isomorphic (cf. 10.3.5), this
furnishes another proof (besides the one given in 10.4.2c) that the inner product
space C(a, b) is not a Hilbert space (cf. 10.1.21).
(b) The mapping ι of 11.2.1 is a linear operator and it is injective (cf. 10.3.5). Thus,
if ϕ, ψ ∈ C(a, b) are such that [ϕ] = [ψ] then ϕ = ψ. Namely, if an element of
L2 (a, b) contains a continuous function then this function is the only continuous
one it contains.
(c) The mapping ι of 11.2.1 is not surjective onto L2 (a, b). To see this, let x0 ∈ (a, b)
and consider the element [χ[a,x0 ] ] of L2 (a, b). If ϕ ∈ [χ[a,x0 ] ] and Dϕ = [a, b],
then for each n ∈ N large enough there exists x′n ∈ (x0 − 1/n, x0 ) such that
ϕ(x′n ) = 1 (otherwise we should have ϕ(x) ≠ χ[a,x0 ] (x) for all x ∈ (x0 − 1/n, x0 ),
and this would be in contradiction with ϕ ∼ χ[a,x0 ] ), and similarly there exists
x′′n ∈ (x0 , x0 + 1/n) such that ϕ(x′′n ) = 0. Hence, there exist two sequences {x′n }
and {x′′n } in [a, b] such that lim_{n→∞} x′n = lim_{n→∞} x′′n = x0 , lim_{n→∞} ϕ(x′n ) = 1,
lim_{n→∞} ϕ(x′′n ) = 0. By 2.4.2, this proves that ϕ is not continuous at x0 , and
hence that ϕ ∉ C(a, b). Another proof that ι is not surjective is by contraposition
as follows: if ι were surjective onto L2 (a, b) then ι would be an isomorphism from
C(a, b) onto L2 (a, b) and hence C(a, b) would be a Hilbert space (cf. 10.1.21),
which is not true.
Proof. The proof is based on 11.1.10. Let X1 := [a, b], A1 := A[a,b] , µ1 := m[a,b] ,
X2 := [0, 2π], A2 := A[0,2π] , µ2 := ((b − a)/2π) m[0,2π] (for which, cf. 8.3.5b with
µ := m[0,2π] and ν the null measure on A[0,2π] ). In view of 9.2.1a and 9.2.2a we
have, for every E ∈ A[0,2π] ,

((b − a)/2π) m[0,2π] (E) = ((b − a)/2π) m(E) = m( ((b − a)/2π) E + a )
  = m[a,b] ( ((b − a)/2π) E + a ) = m[a,b] (π⁻¹(E)).
Then, 11.1.10 proves that the mapping

W : L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) → L2 (a, b)
    [ϕ] ↦ W [ϕ] := [ϕ ∘ π]

is a unitary operator. Since M([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) = M(0, 2π)
and L1 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) = L1 (0, 2π) (cf. 8.3.5b), and since
trivially, for E ∈ A[0,2π] , ((b − a)/2π) m[0,2π] (E) = 0 iff m[0,2π] (E) = 0, we have
L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) = L2 (0, 2π). Thus, the mapping

T : L2 (0, 2π) → L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] )
    [ϕ] ↦ T [ϕ] := [(2π/(b − a))^{1/2} ϕ]

is defined consistently and RT = L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ). In view
of 8.3.5b, we also have

∫_[0,2π] (2π/(b − a))^{1/2} ϕ (2π/(b − a))^{1/2} ψ d( ((b − a)/2π) m[0,2π] ) = ∫_[0,2π] ϕψ dm[0,2π] .
wn (x) := (2/(b − a))^{1/2} sin( 2πn (x − a)/(b − a) ), ∀x ∈ [a, b].

Both the families {[un ]}n∈Z and {[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N are complete or-
thonormal systems in the Hilbert space L2 (a, b).
Proof. First we consider the special case a := 0 and b := 2π. We already know that
both the families {un }n∈Z and {u0 } ∪ {vn }n∈N ∪ {wn }n∈N are orthonormal systems
in the inner product space C(0, 2π) (cf. 10.2.5b). Hence the two families of the
statement are orthonormal systems in L2 (0, 2π) as well, owing to property co1 of
10.3.4 possessed by the mapping ι of 11.2.1. To prove that these orthonormal
systems are complete in L2 (0, 2π), we first prove that L{[un ]}n∈Z is dense in Rι .
Then, fix ϕ ∈ C(0, 2π) and ε > 0. For
n ∈ N large enough, let χn be the element of C(0, 2π) defined by

χn (x) := nx          if 0 ≤ x < 1/n,
          1           if 1/n ≤ x ≤ 2π − 1/n,
          n(2π − x)   if 2π − 1/n < x ≤ 2π.
Then, fix k ∈ N so that ‖[ϕ] − [ϕk ]‖ < ε/2. Since ϕk (0) = 0 = ϕk (2π), ϕk can be
identified in an obvious way with an element of C(T) and conversely any trigono-
metric polynomial can be identified with an element of L{un }n∈Z by 3.1.7 (for C(T)
and the trigonometric polynomials, cf. 4.3.6c). Then, 4.3.7 implies that there exists
ψ ∈ L{un }n∈Z such that

sup{|ϕk (x) − ψ(x)| : x ∈ [0, 2π)} < ε/(2(b − a)^{1/2})

(cf. 2.3.12), and hence such that

‖[ϕk ] − [ψ]‖ = ( ∫_[a,b] |ϕk − ψ|² dm )^{1/2} ≤ ( (b − a) ε²/(4(b − a)) )^{1/2} = ε/2,
Now, [ψ] ∈ L{[un ]}n∈Z by 3.1.7 since ι is a linear operator. By 2.3.12, this
proves that L{[un ]}n∈Z is dense in Rι . Since Rι is dense in L2 (0, 2π) (cf. 11.2.1),
L{[un ]}n∈Z is dense in L2 (0, 2π) by 2.3.14. Thus we have
V {[un ]}n∈Z = (L{[un ]}n∈Z)⁻ = L2 (0, 2π)
(cf. 4.1.13), and also
V ({[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N ) = L2 (0, 2π)
since
L{un }n∈Z = L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N )
(cf. 10.2.5b) implies
L{[un]}n∈Z = L({[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N )
in view of 3.1.7 and of the linearity of ι. This proves that the two orthonormal
systems of the statement are complete in L2 (a, b) if a := 0 and b := 2π.
For any a, b ∈ R such that a < b, the two families of vectors of the statement
can be obtained from the same families for a := 0 and b := 2π by means of the
unitary operator U of 11.2.3. In view of 10.6.8b, this proves that they are complete
orthonormal systems in L2 (a, b).
11.2.5 Remark. In view of 10.7.5, 11.2.4 proves that the Hilbert space L2 (a, b) is
separable.
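The completeness result 11.2.4 lends itself to a numerical check. A sketch, assuming un (x) = (b − a)^{−1/2} e^{2πin(x−a)/(b−a)} — the definition of un falls on a page not reproduced here, and this normalization is chosen only to match the wn displayed above:

```python
import numpy as np

a, b = -1.0, 3.0
x = np.linspace(a, b, 400001)
dx = x[1] - x[0]
integ = lambda y: np.sum((y[1:] + y[:-1]) / 2) * dx   # trapezoid rule on [a, b]

# hypothetical normalization, consistent with the w_n of 11.2.4
u = lambda n: (b - a)**-0.5 * np.exp(2j*np.pi*n*(x - a)/(b - a))

# orthonormality: (u_m|u_n) = delta_mn (the inner product conjugates the first factor)
gram_err = max(abs(integ(np.conj(u(m))*u(n)) - (1.0 if m == n else 0.0))
               for m in range(-2, 3) for n in range(-2, 3))

# truncated Parseval identity for a smooth (b-a)-periodic sample function
phi = np.exp(np.cos(2*np.pi*(x - a)/(b - a)))
norm2 = integ(np.abs(phi)**2)
coeff2 = sum(abs(integ(np.conj(u(n))*phi))**2 for n in range(-30, 31))
print(gram_err, norm2 - coeff2)  # both ~ 0
```

For a smooth periodic sample the Fourier coefficients decay superexponentially, so 61 terms already exhaust the norm to machine precision; for a rough ϕ the truncated sum would only approach ‖ϕ‖² from below.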
sn (x) := (2/(b − a))^{1/2} sin( πn (x − a)/(b − a) ), ∀x ∈ [a, b].

Both the families {[u]} ∪ {[cn ]}n∈N and {[sn ]}n∈N are complete orthonormal systems
in the Hilbert space L2 (a, b).
(cf. 8.3.11c and note that if ψ ∈ L1 (−2π, 2π) then ψ[0,2π] ∈ L1 (0, 2π); in the
integrals we simply denote by ψ the restrictions of ψ that are actually used). Hence,
if ψ is an even function, i.e. if Dψ = [−2π, 2π] and (ψ ◦ π)(x) = ψ(−x) = ψ(x) for
all x ∈ [−2π, 0], then
∫_[−2π,2π] ψ dm[−2π,2π] = ∫_[−2π,2π] χ[−2π,0) ψ dm[−2π,2π] + ∫_[−2π,2π] χ[0,2π] ψ dm[−2π,2π]
  = ∫_[−2π,0] ψ dm[−2π,0] + ∫_[0,2π] ψ dm[0,2π] = 2 ∫_[0,2π] ψ dm[0,2π] .
If ψ is an odd function, i.e. if Dψ = [−2π, 2π] and (ψ ◦ π)(x) = ψ(−x) = −ψ(x) for
all x ∈ [−2π, 0], then
∫_[−2π,2π] ψ dm[−2π,2π] = ∫_[−2π,0] ψ dm[−2π,0] + ∫_[0,2π] ψ dm[0,2π]
  = − ∫_[0,2π] ψ dm[0,2π] + ∫_[0,2π] ψ dm[0,2π] = 0.
Now we consider the c.o.n.s. {[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N of 11.2.4, in the special
case a := −2π and b := 2π. We note that, for each n ∈ N,
wn (x) = (−1)^n (1/(2π))^{1/2} sin(nx/2), ∀x ∈ [−2π, 2π],
and hence

σn (x) = (−1)^n √2 wn (x), ∀x ∈ [0, 2π].
For all n, k ∈ N we have

∫_[0,2π] σn σk dm[0,2π] = (−1)^n (−1)^k 2 ∫_[0,2π] wn wk dm[0,2π]
  = (−1)^n (−1)^k ∫_[−2π,2π] wn wk dm[−2π,2π] = δnk ,
where the second equality holds because the function wn wk is even and the third
holds because {[wn ]}n∈N is an o.n.s. in L2 (−2π, 2π). Thus, {[σn ]}n∈N is an o.n.s.
in L2 (0, 2π). For each [ϕ] ∈ L2 (0, 2π), assuming for convenience that for the repre-
sentative ϕ we have Dϕ = [0, 2π] (cf. 8.2.12), we define the function
ϕ̃ : [−2π, 2π] → C
x ↦ ϕ̃(x) := −ϕ(−x) if x ∈ [−2π, 0), ϕ(x) if x ∈ [0, 2π].
From 8.3.11c we have that ϕ ◦ π ∈ M(−2π, 0) and |ϕ ◦ π|2 ∈ L1 (−2π, 0); it is then
easy to see that ϕ̃ ∈ L2 (−2π, 2π). Then we have
2 ∫_[0,2π] |ϕ|² dm[0,2π] = ∫_[−2π,2π] |ϕ̃|² dm[−2π,2π]
  = | ∫_[−2π,2π] u0 ϕ̃ dm[−2π,2π] |² + Σ_{n=1}^∞ | ∫_[−2π,2π] vn ϕ̃ dm[−2π,2π] |²
    + Σ_{n=1}^∞ | ∫_[−2π,2π] wn ϕ̃ dm[−2π,2π] |²
  = Σ_{n=1}^∞ | 2 ∫_[0,2π] wn ϕ̃ dm[0,2π] |² ,
where the first equality holds because the function |ϕ̃|2 is even, the second holds
by 10.6.4d, the third holds because the functions u0 ϕ̃ and vn ϕ̃ are odd and the
functions wn ϕ̃ are even. Thus we have
∫_[0,2π] |ϕ|² dm[0,2π] = Σ_{n=1}^∞ | √2 ∫_[0,2π] wn ϕ̃ dm[0,2π] |² = Σ_{n=1}^∞ | ∫_[0,2π] σn ϕ̃ dm[0,2π] |² .
This proves that condition 10.6.4d (with M := L2 (0, 2π)) holds true for the o.n.s.
{[σn ]}n∈N , which is therefore a c.o.n.s. in L2 (0, 2π).
Now, if U is the unitary operator of 11.2.3 then {U [σn ]}n∈N is a c.o.n.s. in
L2 (a, b) by 10.6.8b, and we note that
U [σn ] = [sn ], ∀n ∈ N.
(For the family {[u]} ∪ {[cn ]}n∈N one argues analogously with the even extension ϕ̃
of ϕ, so that the functions u0 ϕ̃ and vn ϕ̃ are even and the functions wn ϕ̃ are odd.)
11.3 L2 (R)
We write M(R) := M(X, A, µ), L1 (R) := L1 (X, A, µ), L2 (R) := L2 (X, A, µ),
L2 (R) := L2 (X, A, µ) if X := R, A := A(dR ), µ := m, where m is the Lebesgue
measure on R.
11.3.1 Proposition. The inclusion S(R) ⊂ L2 (R) holds true (for S(R), cf. 3.1.10h
and 10.1.5c). The mapping
ι : S(R) → L2 (R)
ϕ 7→ ι(ϕ) := [ϕ]
is a linear operator and
(ι(ϕ)|ι(ψ)) L2 (R) = (ϕ|ψ)S(R) , ∀ϕ, ψ ∈ S(R).
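The system {hn }n∈I used below is obtained in 10.2.7 by orthonormalizing fn = ξ^n e^{−ξ²/2}; up to signs this produces the standard normalized Hermite functions, which is the (unverified against the text) assumption behind the following numerical sketch of their orthonormality in L2 (R):

```python
import numpy as np

x = np.linspace(-15, 15, 300001)
dx = x[1] - x[0]
integ = lambda y: np.sum((y[1:] + y[:-1]) / 2) * dx   # trapezoid rule

h = [np.pi**-0.25 * np.exp(-x**2/2)]        # h_0, normalized Gaussian
h.append(np.sqrt(2.0) * x * h[0])           # h_1
for n in range(1, 6):                       # standard three-term recursion
    h.append(np.sqrt(2.0/(n + 1)) * x * h[n] - np.sqrt(n/(n + 1.0)) * h[n - 1])

gram = np.array([[integ(h[m]*h[n]) for n in range(7)] for m in range(7)])
print(np.max(np.abs(gram - np.eye(7))))  # ~ 0: the h_n are orthonormal
```

The recursion is the numerically stable way to generate normalized Hermite functions; expanding fn and orthonormalizing directly would lose precision quickly.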
Proof. It is obvious that the family {[hn ]}n∈I is an o.n.s. in L2 (R), since {hn }n∈I
is an o.n.s. in S(R) and the mapping ι of 11.3.1 preserves the inner product. To
prove that {[hn ]}n∈I is complete in L2 (R) we proceed as follows. We prove below
that
({[fn ]}n∈I )⊥ = {0L2 (R) },
where {fn }n∈I is as in 10.2.7, i.e. fn = ξ^n e^{−ξ²/2} for all n ∈ I. From the equality
L{hn }n∈I = L{fn }n∈I
in S(R) (cf. 10.2.7), we obtain the equality
L{[hn ]}n∈I = L{[fn ]}n∈I
in L2 (R), by 3.1.7 and the linearity of the mapping ι of 11.3.1. Then we have
({[hn ]}n∈I )⊥ = (L{[hn ]}n∈I )⊥ = (L{[fn ]}n∈I )⊥ = ({[fn ]}n∈I )⊥ = {0L2 (R) }
(cf. 10.2.11), and this proves that {[hn ]}n∈I is a c.o.n.s. in L2 (R) (cf. 10.6.5a).
Now we prove that ({[fn ]}n∈I )⊥ = {0L2 (R) }. Then, let [ϕ] ∈ L2 (R) be such that
([ϕ]|[fn ]) = 0, ∀n ∈ I.
We assume for convenience that for the representative ϕ we have Dϕ = R (cf.
8.2.12), and we write ϕ1 := Re ϕ and ϕ2 := Im ϕ. In what follows, fix i = 1 or
i = 2. We have ϕi ∈ L2 (R) (cf. 6.2.12 and 8.1.17b) and
([ϕi ]|[fn ]) = 0, ∀n ∈ I,
because fn is a real function. For any a ∈ R, the function e^{iaξ} e^{−ξ²/2} is an element of
S(R) (cf. 5 in 3.1.10h) and hence of L2 (R) (cf. 11.3.1). Then ϕi e^{iaξ} e^{−ξ²/2} ∈ L1 (R)
(cf. 11.1.2b) and
∫_R ϕi e^{iaξ} e^{−ξ²/2} dm = ∫_R ϕi (x) ( Σ_{n=0}^∞ (iax)^n /n! ) e^{−x²/2} dm(x)
  = ∫_R lim_{N→∞} Σ_{n=0}^N ϕi (x) ((iax)^n /n!) e^{−x²/2} dm(x)
  = lim_{N→∞} Σ_{n=0}^N ((ia)^n /n!) ∫_R ϕi (x) x^n e^{−x²/2} dm(x)
  = lim_{N→∞} Σ_{n=0}^N ((ia)^n /n!) ([ϕi ]|[fn ]) = 0,   (1)
where the third equality holds true by 8.2.11, with the function |ϕi | e^{|aξ|} e^{−ξ²/2} as
dominating function. Indeed,

| Σ_{n=0}^N ϕi (x) ((iax)^n /n!) e^{−x²/2} | ≤ |ϕi (x)| Σ_{n=0}^∞ (|ax|^n /n!) e^{−x²/2}
  = |ϕi (x)| e^{|ax|} e^{−x²/2} , ∀x ∈ R, ∀N ∈ N;
moreover, e^{|aξ|} e^{−ξ²/2} ∈ L2 (R) (it can be seen that e^{2|aξ|} e^{−ξ²} ∈ L1 (R) in the same
way as for an element of S(R) in 10.1.5c) and therefore |ϕi | e^{|aξ|} e^{−ξ²/2} ∈ L1 (R) (cf.
11.1.2b). For each l ∈ N, let the function ψl : R → C be the periodic function with
period 2l that is defined as follows for x ∈ [−l, l):
ψl (x) := 1   if ϕi (x) > 0,
          0   if ϕi (x) = 0,
          −1  if ϕi (x) < 0.
We have ψl ∈ M(R) since, if we denote by ϕi,0 the restriction of ϕi to [−l, l),
then the three sets ϕi,0^{−1}((0, ∞)), ϕi,0^{−1}({0}), ϕi,0^{−1}((−∞, 0)) are elements of A(dR ) (cf.
6.2.3, 6.2.13a, 6.1.19a) and the same is true for their translations (cf. 9.2.1a). We
also have
lim_{l→∞} ϕi (x) ψl (x) e^{−x²/2} = |ϕi (x)| e^{−x²/2} , ∀x ∈ R,
since ϕi (x)ψl (x) = |ϕi (x)| for all x ∈ [−l, l), and
|ϕi (x) ψl (x) e^{−x²/2}| ≤ |ϕi (x) e^{−x²/2}|, ∀x ∈ R,
since |ψl (x)| ≤ 1 for all x ∈ R. Hence, by 8.2.11 with the function |ϕi | e^{−ξ²/2} (which
is an element of L1 (R) by 11.1.2b) as dominating function, we have

lim_{l→∞} ∫_R ϕi ψl e^{−ξ²/2} dm = ∫_R |ϕi | e^{−ξ²/2} dm.
Therefore, if we fix ε > 0, there exists r ∈ N such that

| ∫_R |ϕi | e^{−ξ²/2} dm − ∫_R ϕi ψr e^{−ξ²/2} dm | < ε.   (2)
Now, the restriction of ψr to the interval [0, 2r) is obviously an element of L2 (0, 2r),
and therefore there is a function q := Σ_{k=−m}^m αk e^{i(π/r)kξ} such that

∫_[0,2r) |ψr − q|² dm = ∫_[0,2r] |ψr − q|² dm < ε² ( 2 Σ_{n=0}^∞ e^{−4r²n²} )^{−1}
(cf. 10.6.4b in the Hilbert space L2 (0, 2r), with the c.o.n.s. {[uk ]}k∈Z of 11.2.4).
Now, (ψr − q) e^{−ξ²/2} is an element of L2 (R) since both ψr and q are bounded functions,
and

‖[(ψr − q) e^{−ξ²/2}]‖² = ∫_R |ψr − q|² e^{−ξ²} dm
  = Σ_{n=0}^∞ ∫_[2rn,2r(n+1)) |ψr − q|² e^{−ξ²} dm + Σ_{n=0}^∞ ∫_[−2r(n+1),−2rn) |ψr − q|² e^{−ξ²} dm
  ≤ Σ_{n=0}^∞ e^{−4r²n²} ∫_[2rn,2r(n+1)) |ψr − q|² dm + Σ_{n=0}^∞ e^{−4r²n²} ∫_[−2r(n+1),−2rn) |ψr − q|² dm   (3)
  = 2 Σ_{n=0}^∞ e^{−4r²n²} ∫_[0,2r) |ψr − q|² dm < ε²,
where the second equality holds by 8.3.4a and the last equality holds because ψr
and q are periodic functions with period 2r. Moreover, from (1) we have

∫_R ϕi q e^{−ξ²/2} dm = Σ_{k=−m}^m αk ∫_R ϕi e^{i(π/r)kξ} e^{−ξ²/2} dm = 0.   (4)
11.3.4 Remark. In view of 10.7.5, 11.3.3 proves that the Hilbert space L2 (R) is
separable.
11.3.5 Theorem. The pair (L2 (R), ι), with ι defined as in 11.3.1, is a completion
of the inner product space S(R).
Proof. We already know that ι fulfils condition co1 of 10.3.4 (cf. 11.3.1). Condition
co2 follows from 11.3.3 and 10.6.5b.
11.3.6 Remarks.
(a) By reasoning as in 11.2.2a, we can see that the inner product space S(R) (cf.
10.1.5c) is not a Hilbert space.
(b) The mapping ι of 11.3.1 is injective (cf. 10.1.19). This means that if an element
of L2 (R) contains an element ϕ of S(R), then ϕ is the only element of S(R) it
contains. As a rule, when we denote by [ϕ] an element of Rι , the representative
ϕ by which we mark the equivalence class is meant to be the element of S(R)
that is contained by the class.
(f̌)^{(l)} = i^l (ξ^l f )ˇ, ∀l ∈ N.
Proof. Let {tn } be a sequence in R − {0} such that tn → 0. For all x ∈ R, we have

lim_{n→∞} (1/tn )(f̂(x + tn ) − f̂(x)) = lim_{n→∞} (2π)^{−1/2} ∫_R (1/tn )(e^{−i tn y} − 1) e^{−ixy} f (y) dm(y)
  = (2π)^{−1/2} ∫_R (−iy) e^{−ixy} f (y) dm(y) = −i (ξf )ˆ(x),

where the limit may be taken under the integral sign by 8.2.11, with |y||f (y)| as
dominating function; indeed

|e^{iα} − 1| ≤ |α|, ∀α ∈ R,

and then |(1/tn )(e^{−i tn y} − 1) e^{−ixy} f (y)| ≤ |y||f (y)| for all y ∈ R, with ξf ∈ L1 (R).
Hence (f̂)^{(1)} = −i (ξf )ˆ.
This proves by induction the part of the statement about f̂. The proof for f̌ is
analogous.
11.4.3 Remark. We recall (cf. 10.1.5c) that we have, for all ϕ ∈ S(R):
(a) ϕ ∈ L1 (R);
(b) ∫_R ϕ dm = lim_{n→∞} ∫_{−n}^{n} ϕ(x) dx
(the integrals on the right hand side of this equation are Riemann integrals).
Proof. Preliminarily we note that, for all k ∈ N, ϕ^{(k)} ∈ S(R) (cf. 3.1.10h-1) and
hence ϕ^{(k)} ∈ L1 (R) (cf. 11.4.3a). Thus, the statement is consistent.
For all x ∈ R, we have

(ϕ^{(1)})ˆ(x) =⁽¹⁾ (2π)^{−1/2} lim_{n→∞} ∫_{−n}^{n} e^{−ixy} ϕ^{(1)}(y) dy
  =⁽²⁾ (2π)^{−1/2} lim_{n→∞} ( e^{−ixn} ϕ(n) − e^{ixn} ϕ(−n) + ix ∫_{−n}^{n} e^{−ixy} ϕ(y) dy )
  =⁽³⁾ ix (2π)^{−1/2} lim_{n→∞} ∫_{−n}^{n} e^{−ixy} ϕ(y) dy =⁽⁴⁾ ix ϕ̂(x),

where 1 and 4 hold true by 11.4.3b, 2 is integration by parts for the Riemann
integrals, 3 holds true because lim_{n→∞} ϕ(±n) = 0. This proves that

(ϕ^{(1)})ˆ = iξ ϕ̂.
In the same way we can prove that, for each k ∈ N, if

(ϕ^{(k)})ˆ = (iξ)^k ϕ̂

then

(ϕ^{(k+1)})ˆ = (iξ)^{k+1} ϕ̂.
This proves by induction the part of the statement about ϕ̂. The proof for ϕ̌ is
analogous.
Proof. Preliminarily we note that (ξ^l ϕ)^{(k)} ∈ S(R) (cf. 3.1.10h-1,4) and hence
(ξ^l ϕ)^{(k)} ∈ L1 (R) (cf. 11.4.3a), for all k, l ∈ N. Moreover, ϕ̂ and ϕ̌ are elements
of C∞ (R) since ξ^l ϕ ∈ L1 (R) for all l ∈ N (cf. 11.4.2). Thus, the statement is
consistent.
We fix k, l ∈ N. From 11.4.2 we have

ξ^k (ϕ̂)^{(l)} = (−i)^l ξ^k (ξ^l ϕ)ˆ.   (1)

Since ξ^l ϕ ∈ S(R), we can write the first equality in 11.4.4 with ϕ replaced by ξ^l ϕ,
to obtain

((ξ^l ϕ)^{(k)})ˆ = (iξ)^k (ξ^l ϕ)ˆ.   (2)

Now, (1) and (2) yield the first equality of the statement. The proof of the second
equality is analogous.
Proof. As already noted in the proof of 11.4.5, ϕ̂ and ϕ̌ are elements of C ∞ (R).
Moreover we have, for all k = 0, 1, 2, ... and all l = 0, 1, 2, ...,

sup{|x^{k+1} (ϕ̂)^{(l)}(x)| : x ∈ R} =⁽¹⁾ sup{|((ξ^l ϕ)^{(k+1)})ˆ(x)| : x ∈ R}
  ≤⁽²⁾ (2π)^{−1/2} ∫_R |(ξ^l ϕ)^{(k+1)}| dm <⁽³⁾ ∞,

where: 1 holds true by 11.4.5 if l ∈ N or by 11.4.4 if l = 0; 2 follows from 8.2.10; 3
holds true in view of 11.4.3a since (ξ^l ϕ)^{(k+1)} ∈ S(R) (cf. 3.1.10h-1,4). Then,

lim_{x→±∞} x^k (ϕ̂)^{(l)}(x) = lim_{x→±∞} (1/x) x^{k+1} (ϕ̂)^{(l)}(x) = 0.
11.4.8 Lemma. For each a ∈ (0, ∞), let the function γa be defined by

γa : R → C
x ↦ γa (x) := e^{−(1/2) a x²}.

Then, γa ∈ S(R) and

γ̂a (x) = γ̌a (x) = a^{−1/2} e^{−(1/2) a^{−1} x²} , ∀x ∈ R.
Proof. We fix a ∈ (0, ∞). It is obvious that γa ∈ S(R). Then γ̂a ∈ C∞ (R) and

(γ̂a )^{(1)} = −i (ξγa )ˆ   (1)

(cf. 11.4.2). Moreover, from

γa′ (x) = −a x γa (x), ∀x ∈ R,

we obtain

(ξγa )ˆ = −a^{−1} (γa^{(1)})ˆ = −a^{−1} iξ γ̂a ,   (2)

in view of 11.4.4. From (1) and (2), we see that

(γ̂a )^{(1)} + a^{−1} ξ γ̂a = 0.
Moreover,

γ̂a (0) = (2π)^{−1/2} ∫_R e^{−(1/2) a x²} dm(x) = (2π)^{−1/2} (2a^{−1})^{1/2} √π = a^{−1/2}

(cf. 11.4.7). Now, there exists a unique element ϕ of C∞ (R) such that

ϕ′(x) + a^{−1} x ϕ(x) = 0, ∀x ∈ R, and ϕ(0) = a^{−1/2} ,

and ϕ is defined by

ϕ(x) := a^{−1/2} e^{−(1/2) a^{−1} x²} .

Thus,

γ̂a (x) = a^{−1/2} e^{−(1/2) a^{−1} x²} , ∀x ∈ R.
As to γ̌a , we note that (f̄ )ˆ equals the complex conjugate of f̌ for every f ∈ L1 (R),
since complex conjugation commutes with integration (cf. 8.2.3); since γa and γ̂a
are real-valued, this yields γ̌a = γ̂a .
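The closed form of 11.4.8 can also be checked by direct quadrature; a numerical sketch, with arbitrary choices of a, grid and sample points:

```python
import numpy as np

a = 0.7
y = np.linspace(-40, 40, 400001)
dy = y[1] - y[0]
gamma = np.exp(-a*y**2/2)                      # gamma_a(y) = exp(-a*y^2/2)

def ft(xval):
    # (2*pi)**-0.5 * integral of exp(-i*xval*y)*gamma_a(y) dy, trapezoid rule
    z = np.exp(-1j*xval*y) * gamma
    return (2*np.pi)**-0.5 * np.sum(z[:-1] + z[1:]) / 2 * dy

# compare with the claimed closed form a**-0.5 * exp(-x^2/(2a))
err = max(abs(ft(xv) - a**-0.5*np.exp(-xv**2/(2*a))) for xv in [0.0, 0.5, 1.3, -2.0])
print(err)  # ~ 0
```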
Proof. The first equation of the statement is proved by the following equalities,
where x is a fixed but arbitrary element of R:

(ϕ̂)ˇ(x) = (2π)^{−1/2} ∫_R e^{ixt} ϕ̂(t) dm(t)
  =⁽¹⁾ (2π)^{−1/2} lim_{n→∞} ∫_R e^{−(1/2) n^{−2} t²} e^{ixt} ( (2π)^{−1/2} ∫_R e^{−ity} ϕ(y) dm(y) ) dm(t)
  =⁽²⁾ (2π)^{−1/2} lim_{n→∞} ∫_R ϕ(y) ( (2π)^{−1/2} ∫_R e^{−i(y−x)t} e^{−(1/2) n^{−2} t²} dm(t) ) dm(y)
  =⁽³⁾ (2π)^{−1/2} lim_{n→∞} ∫_R ϕ(y) n e^{−(1/2) n² (y−x)²} dm(y)
  =⁽⁴⁾ (2π)^{−1/2} lim_{n→∞} ∫_R ϕ((1/n)s + x) e^{−(1/2) s²} dm(s)
  =⁽⁵⁾ (2π)^{−1/2} ∫_R ϕ(x) e^{−(1/2) s²} dm(s)
  =⁽⁶⁾ (2π)^{−1/2} ϕ(x) (2π)^{1/2} = ϕ(x).
The explanations of the above equalities are as follows:
1 holds true in view of 8.2.11 with dominating function |ϕ̂|, which is an element
of L1 (R) in view of 11.4.6 and 11.4.3a;
2 holds true in view of 8.4.9 and 8.4.10c, because
|e^{−(1/2) n^{−2} t²} e^{ixt} e^{−ity} ϕ(y)| = e^{−(1/2) n^{−2} t²} |ϕ(y)|, ∀(y, t) ∈ R²,
and because γn−2 and ϕ are elements of S(R) and hence of L1 (R) (cf. 11.4.3a);
3 follows from 11.4.8;
4 follows from the change of variable
s := n(y − x),
in view of 9.2.1 and 9.2.2;
5 holds in view of 8.2.11, since ϕ is continuous and hence

lim_{n→∞} ϕ((1/n)s + x) e^{−(1/2) s²} = ϕ(x) e^{−(1/2) s²} , ∀s ∈ R,

and since

|ϕ((1/n)s + x)| e^{−(1/2) s²} ≤ sup{|ϕ(y)| : y ∈ R} e^{−(1/2) s²} , ∀s ∈ R, ∀n ∈ N.
Proof. Preliminarily we note that ϕ̄ψ, (ϕ̂)*ψ̂, (ϕ̌)*ψ̌ are elements of S(R) (in view of
3.1.10h-2,6 and 11.4.6) and hence of L1 (R) (in view of 11.4.3a).
The first equation of the statement is proved by the following equalities:

∫_R ϕ̄ψ dm =⁽¹⁾ ∫_R ϕ̄ (ψ̂)ˇ dm
  = ∫_R ϕ̄(x) ( (2π)^{−1/2} ∫_R e^{ixy} ψ̂(y) dm(y) ) dm(x)
  =⁽²⁾ ∫_R ψ̂(y) ( (2π)^{−1/2} ∫_R e^{ixy} ϕ̄(x) dm(x) ) dm(y)
  =⁽³⁾ ∫_R ψ̂(y) ( (2π)^{−1/2} ∫_R e^{−ixy} ϕ(x) dm(x) )* dm(y)
  = ∫_R (ϕ̂)* ψ̂ dm.
The explanations of the above equalities are as follows:
1 holds true by 11.4.9;
2 holds true in view of 8.4.9 and 8.4.10c, because

|ϕ̄(x) e^{ixy} ψ̂(y)| = |ϕ(x)||ψ̂(y)|, ∀(x, y) ∈ R²,
3 holds true because complex conjugation commutes with integration (cf. 8.2.3).
The proof of the second equation of the statement is analogous.
These definitions are consistent in view of 11.4.6. The mappings F̂ and F̌ are
obviously linear operators on the linear space S(R) (cf. 3.1.10h). The statement of
11.4.9 can be written as
F̌ F̂ = F̂ F̌ = 1S(R) .
In view of 1.2.16b, this implies that both F̂ and F̌ are injective and that
F̌ = F̂ −1 and F̂ = F̌ −1 .
Since RF̂ = DF̂ −1 and RF̌ = DF̌ −1 , both F̂ and F̌ are surjective. By means of the
inner product for S(R) (cf. 10.1.5c), the statement of 11.4.10 can be written as
(F̂ ϕ|F̂ ψ) = (F̌ ϕ|F̌ ψ) = (ϕ|ψ), ∀ϕ, ψ ∈ S(R).
Therefore, F̂ and F̌ are automorphisms of the inner product space S(R) (cf.
10.1.17).
11.4.12 Theorem. There exists a unique operator F ∈ B(L2 (R)) such that
F [ϕ] = [ϕ̂], ∀ϕ ∈ S(R).
The operator F is a unitary operator in L2 (R). The operator F −1 is the unique
element of B(L2 (R)) such that
F −1 [ϕ] = [ϕ̌], ∀ϕ ∈ S(R),
or equivalently such that
F −1 [ϕ̂] = [ϕ], ∀ϕ ∈ S(R).
The operator F is called the Fourier transform on L2 (R).
Proof. We recall that the pair (L2 (R), ι), with ι defined as in 11.3.1, is a completion
of the inner product space S(R) (cf. 11.3.5). We define the mapping
F0 : Rι → L2 (R)
[ϕ] 7→ F0 [ϕ] := [ϕ̂]
(cf. 11.3.6b). Clearly, F0 is a linear operator in L2 (R). Moreover, from 11.4.9 we
have
F0 [ϕ̌] = [ϕ], ∀ϕ ∈ S(R), (1)
Then, by 8.2.11 (with sup{|ψ ′ (s)| : s ∈ R}|f | as dominating function, cf. 3.1.10h-7)
we have
lim_{n→∞} (1/tn )(ϕ(x + tn ) − ϕ(x)) = ∫_R ψ′(x − y) f (y) dm(y).
11.4.15 Definitions. For each t ∈ R and each f ∈ L2 (R), we define the functions

f^t : Df → C, x ↦ f^t (x) := e^{itx} f (x)

and

f_{−t} : Df − t → C, x ↦ f_{−t}(x) := f (x + t)

(the definition of f_{−t} is consistent with the definition of ϕc given in 9.2.1b, while
the definition of f^t has nothing to do with the definition of ϕc given in 9.2.2). It is
obvious that f^t ∈ L2 (R), while f_{−t} ∈ L2 (R) follows from 9.2.1b.
It is obvious that, for f, g ∈ L2 (R),

f ∼ g ⇒ f^t ∼ g^t ,

while the implication

f ∼ g ⇒ f_{−t} ∼ g_{−t}

follows from 9.2.1a.
In view of the remarks above, for each t ∈ R we can define the mappings
Ut : L2 (R) → L2 (R)
[f ] 7→ Ut [f ] := [f t ]
and
Vt : L2 (R) → L2 (R)
[f ] 7→ Vt [f ] := [f−t ].
It is obvious that Ut and Vt are linear operators. Moreover,

‖Ut [f ]‖ = ‖[f ]‖, ∀[f ] ∈ L2 (R),

is obvious, while

‖Vt [f ]‖ = ‖[f ]‖, ∀[f ] ∈ L2 (R),

follows from 9.2.1b. Thus, Ut and Vt are elements of B(L2 (R)).
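The isometry of Ut and Vt can be illustrated on a grid; a sketch with an arbitrary sample f and an arbitrary t:

```python
import numpy as np

x = np.linspace(-30, 30, 600001)
dx = x[1] - x[0]
def norm(y):
    # trapezoid approximation of the L2(R) norm, for rapidly decaying samples
    w = np.abs(y)**2
    return np.sqrt(np.sum(w[:-1] + w[1:]) / 2 * dx)

f = lambda s: np.exp(-s**2) * (1 + 1j*s)       # arbitrary rapidly decaying sample
t = 1.7
nf   = norm(f(x))
nUtf = norm(np.exp(1j*t*x) * f(x))             # U_t[f] = [f^t], f^t(x) = e^{itx} f(x)
nVtf = norm(f(x + t))                          # V_t[f] = [f_{-t}], f_{-t}(x) = f(x + t)
print(nUtf - nf, nVtf - nf)  # both ~ 0
```

Multiplication by e^{itx} leaves |f| pointwise unchanged, while translation invariance of Lebesgue measure handles Vt; the tiny residuals come only from the finite window and quadrature.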
and

|(e^{i tn x} − e^{i t0 x}) f (x)|² ≤ 4|f (x)|², ∀x ∈ Df .
Then, by 8.2.11 (with 4|f |² as dominating function) we have

lim_{n→∞} ∫_R |e^{i tn x} f (x) − e^{i t0 x} f (x)|² dm(x) = 0,

or

lim_{n→∞} ‖Utn [f ] − Ut0 [f ]‖ = 0.
11.4.18 Lemma. Let f ∈ L2 (R), let a, b ∈ R be such that a < b, and suppose that
f (x) = 0 m-a.e. on Df − [a, b].
Then, for each ε1 > 0 and each ε2 > 0, there exists ϕ ∈ Cc∞ (R) so that

‖[f ] − [ϕ]‖ ≤ ε1

and

supp ϕ ⊂ [a − ε2 , b + ε2 ].
ψn : R → C
x ↦ ψn (x) := kn exp((x² − n^{−2})^{−1}) if x ∈ In , 0 if x ∉ In ,

where kn ∈ R is so that ∫_R ψn dm = 1. It is easy to see that ψn ∈ C∞ (R). Hence
ψn ∈ S(R).
Now, we fix ε1 > 0. Since the mapping
R ∋ t 7→ Vt [f ] ∈ L2 (R)
is continuous (cf. 11.4.16), there exists δ > 0 so that

|t| < δ ⇒ ‖[f ] − [ft ]‖ < ε1 .

Moreover, we fix ε2 > 0 and then n ∈ N such that n^{−1} < min{δ, ε2 }.
We note that f ∈ L1 (R) (this follows from 11.1.2b with ϕ := χ[a,b] and ψ := f )
and define the function
ϕ : R → C
x ↦ ϕ(x) := ∫_R ψn (s) f (x − s) dm(s),
which is an element of C ∞ (R) (cf. 11.4.14). Moreover, if x 6∈ [a − ε2 , b + ε2 ] then
s ∈ In ⇒ |s| < ε2 ⇒ x − s 6∈ [a, b],
and hence
ψn (s)f (x − s) = 0 for m-a.e. s ∈ Df + x,
and hence
ϕ(x) = 0.
This proves that supp ϕ ⊂ [a − ε2 , b + ε2 ], and hence also that ϕ ∈ Cc∞ (R).
Now we want to prove that

‖[f ] − [ϕ]‖ ≤ ε1 .   (1)

In view of the Schwarz inequality (cf. 10.1.9) we have

‖[f ] − [ϕ]‖ = sup{|([h]|[f ] − [ϕ])| : [h] ∈ L2 (R) s.t. ‖[h]‖ = 1}.   (2)
We fix [h] ∈ L2 (R) such that ‖[h]‖ = 1. We have

([h]|[ϕ]) = ∫_R h̄(x) ( ∫_R ψn (s) f (x − s) dm(s) ) dm(x).
We note that the function

R ∋ x ↦ ∫_R ψn (s) |f (x − s)| dm(s) ∈ [0, ∞)
is an element of Cc∞ (R) (by the same argument as above, with f replaced by |f |)
and hence of L2 (R), and hence
∫_R |h(x)| ( ∫_R ψn (s) |f (x − s)| dm(s) ) dm(x) < ∞.
Moreover we have

([h]|[f ]) = ∫_{In} ψn (s) ([h]|[f ]) dm(s)

since ∫_{In} ψn dm = 1. Therefore we have

([h]|[f ] − [ϕ]) = ∫_{In} ψn (s) ([h]|[f ] − [fs ]) dm(s),
and hence

|([h]|[f ] − [ϕ])| ≤⁽³⁾ ∫_{In} ψn (s) |([h]|[f ] − [fs ])| dm(s)
  ≤⁽⁴⁾ ∫_{In} ψn (s) ‖[f ] − [fs ]‖ dm(s) ≤⁽⁵⁾ ε1 ∫_{In} ψn dm = ε1 ,
11.4.19 Corollary. The family ι(Cc∞ (R)) (with ι defined as in 11.3.1) is dense in
L2 (R).
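The mollification behind 11.4.18 and 11.4.19 can be sketched numerically: convolving χ[a,b] with the bump ψn of the proof of 11.4.18 produces a smooth function supported in [a − 1/n, b + 1/n] and L²-close to χ[a,b]. Here In := (−1/n, 1/n) is an assumption, since the definition of In falls on a page not reproduced here:

```python
import numpy as np

a, b, n = -1.0, 1.0, 20
x = np.linspace(-3, 3, 6001)
dx = x[1] - x[0]

s = np.linspace(-1.0/n, 1.0/n, 101)            # grid on I_n, same spacing as x
core = np.zeros_like(s)
ins = s**2 < (1.0/n)**2 - 1e-9                 # strict interior of I_n
core[ins] = np.exp(1.0/(s[ins]**2 - (1.0/n)**2))
kn = 1.0 / (np.sum(core) * (s[1] - s[0]))      # normalization: integral of psi_n = 1
psi = kn * core

f = ((x >= a) & (x <= b)).astype(float)        # indicator of [a, b]
phi = np.convolve(f, psi, mode="same") * (s[1] - s[0])   # phi(x) ~ int psi_n(t) f(x-t) dt

supp = x[np.abs(phi) > 1e-12]
l2dist = np.sqrt(np.sum((phi - f)**2) * dx)
print(supp.min(), supp.max(), l2dist)
```

The support stays within [a − 1/n, b + 1/n] (up to one grid step), phi equals 1 deep inside [a, b], and the L² distance shrinks as n grows, which is the content of the density corollary.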
11.4.20 Corollary. Let f ∈ L2 (R), let a, b ∈ R be such that a < b, and suppose
that
f (x) = 0 m-a.e. on Df − [a, b].
Then, for each ε > 0 there exists ϕ ∈ Cc∞ (R) so that
k[f ] − [ϕ]k < ε
and
supp ϕ ⊂ (a, b).
11.4.21 Corollary. Let a, b ∈ R be such that a < b. The family ι(C0∞ (a, b)) (with
ι defined as in 11.2.1) is dense in L2 (a, b).
Proof. This follows immediately from 11.4.20, since each element of L2 (a, b) is
extended trivially by an element of L2 (R) which satisfies the condition of 11.4.20,
and each element ϕ of Cc∞ (R) such that supp ϕ ⊂ (a, b) becomes an element of
C0∞ (a, b) when it is restricted to [a, b].
Proof. We define fn := χ[−n,n] f for all n ∈ N. For each n ∈ N, let ϕn ∈ Cc∞ (R) be
such that
‖[fn ] − [ϕn ]‖ < 1/n and supp ϕn ⊂ (−n, n)
(ϕn with these properties exists by 11.4.20).
From the condition f ∈ L2 (R) we obtain
‖[f ] − [ϕn ]‖ ≤ ‖[f ] − [fn ]‖ + ‖[fn ] − [ϕn ]‖ → 0 as n → ∞,   (1)

since

lim_{n→∞} ∫_R |f − fn |² dm = 0
therefore we have

∫_R |f − ϕn | dm = ∫_[−n,n] |f − ϕn | dm + ∫_{R−[−n,n]} |f | dm → 0 as n → ∞.   (2)
by the continuity of F and since ϕn ∈ S(R) for all n ∈ N. In view of 11.1.7, this
implies that there exists a subsequence {ϕ̂nk } of the sequence {ϕ̂n } so that
h(x) = lim ϕ̂nk (x) m-a.e. on R.
k→∞
Moreover, from (2) we have

|f̂(x) − ϕ̂n (x)| ≤ (2π)^{−1/2} ∫_R |f − ϕn | dm → 0 as n → ∞, ∀x ∈ R
11.4.23 Remark. For all [f ] ∈ L2 (R), on the basis of 11.4.22 we can find a formula
which yields F [f ] more directly than the mere definition of F does. Indeed, let
[f ] ∈ L2 (R), let {an } and {bn } be sequences in R such that
an < bn for all n ∈ N, lim an = −∞, lim bn = ∞,
and define
fn := χ[an ,bn ] f, ∀n ∈ N.
For all n ∈ N, fn ∈ L2 (R) is obvious and fn ∈ L1 (R) follows from 11.1.2b. Moreover,
lim_{n→∞} ‖[f ] − [fn ]‖ = 0 follows from 8.2.11 (with |f |² as dominating function).
Then, in view of the continuity of F and of 11.4.22, we have
F [f ] = lim_{n→∞} F [fn ] = lim_{n→∞} [f̂n ],

with

f̂n (x) = (2π)^{−1/2} ∫_{[an ,bn ]} e^{−ixy} f (y) dm(y), ∀x ∈ R, ∀n ∈ N.
The sequences {an } and {bn } can be chosen in order to make the computation of
the limit above as easy as possible.
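For f := χ[−1,1] the procedure needs no limit at all, since f already has compact support, and the integral has the familiar closed form (2/π)^{1/2} sin x/x; a numerical sketch:

```python
import numpy as np

y = np.linspace(-1, 1, 200001)   # [a_n, b_n] = [-1, 1] already suffices here
dy = y[1] - y[0]

def fhat(xval):
    # (2*pi)**-0.5 * integral over [-1, 1] of exp(-i*xval*y) dy, trapezoid rule
    z = np.exp(-1j*xval*y)
    return (2*np.pi)**-0.5 * np.sum(z[:-1] + z[1:]) / 2 * dy

err = max(abs(fhat(xv) - (2/np.pi)**0.5*np.sin(xv)/xv) for xv in [0.3, 1.0, 2.5])
print(err)  # ~ 0
```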
Chapter 12
Adjoint Operators
In this chapter we study the idea of adjoint operator, which is in a sense the main
tool for dealing with linear operators in Hilbert space. Throughout the chapter, H
denotes an abstract Hilbert space. We recall that O(H) denotes the family of all
linear operators in H (cf. 3.2.1).
Proof. a: This follows from 12.1.5c since (W (GA ))⊥ is a subspace of H ⊕ H (cf.
10.2.13).
b: We have

(W (GA† ))⊥ =⁽¹⁾ (W ((W (GA ))⊥ ))⊥ =⁽²⁾ ((W ²(GA ))⊥ )⊥ =⁽³⁾ (GA ⊥)⊥ =⁽⁴⁾ (GA )⁻,

where: 1 holds by 12.1.5c; 2 holds by 10.2.16; 3 holds because W ² = −1H⊕H and
GA is a linear manifold in H ⊕ H (cf. 3.2.15a); 4 holds by 10.4.4c (here (·)⁻ denotes
closure). Now, A is closable iff (GA )⁻ is the graph of a mapping, and (W (GA† ))⊥
is the graph of a mapping iff (DA† )⁻ = H (cf. 12.1.5b). Thus, A is closable iff
(DA† )⁻ = H.
If A is closable, and hence (DA† )⁻ = H, then (cf. 12.1.5c)

GA†† = (W (GA† ))⊥ = (GA )⁻ = GĀ ,

and hence A†† = Ā because two mappings are equal iff their graphs are equal.
Moreover,

G(Ā)† = (W (GĀ ))⊥ = (W ((GA )⁻))⊥ =⁽⁵⁾ ((W (GA ))⁻)⊥ =⁽⁶⁾ (W (GA ))⊥ = GA† ,

where: 5 holds by 10.1.21, 4.6.2d, 2.3.21a; 6 holds by 10.2.11. This proves that
(Ā)† = A† .
c: This follows immediately from result b, since A is closed iff [A is closable and
A = Ā].
12.1.8 Theorem. Let A ∈ O(H) be such that (DA )⁻ = H and NA = {0H } (thus, the
operators A† and A⁻¹ are defined). Then, (DA⁻¹ )⁻ = H iff NA† = {0H } (thus, the
operator (A⁻¹)† is defined iff the operator (A† )⁻¹ is defined). If these conditions
hold true, then

(A⁻¹)† = (A† )⁻¹.
Proof. The parenthetical remarks of the statement are true by 12.1.1 and by 3.2.6a.
We have

(RA )⁻ = H ⇔ RA ⊥ = {0H }
This proves that A ∈ B(H), and hence that A is closed (cf. 4.4.3).
Second, we suppose {xn } ∉ ℓ². We choose n0 ∈ N such that xn0 ≠ 0 and define

fk := −xk un0 + xn0 uk , ∀k ∈ N;
clearly,
fk ∈ DA and Afk = (−xk xn0 + xn0 xk )u = 0H , ∀k ∈ N.
Now, let g ∈ DA† ; then
0 = (Afk |g) = (fk |A† g) = −x̄k (un0 |A† g) + x̄n0 (uk |A† g), ∀k ∈ N;
12.2.1 Theorem. Let A ∈ OE (H) (for OE (H), cf. 3.2.12). Then the operator A†
is bounded.
(for FA† g , cf. 10.5.1; for H̃, cf. 10.9.4). This proves that

∀f ∈ H, ∃mf ∈ [0, ∞) such that |FA† g f | ≤ mf , ∀g ∈ DA† ∩ H̃,

and hence, by 4.2.13 and 10.5.1, that

∃m ∈ [0, ∞) such that ‖A† g‖ = ‖FA† g ‖ ≤ m, ∀g ∈ DA† ∩ H̃,

and hence that

∃m ∈ [0, ∞) such that ‖A† (‖g‖⁻¹ g)‖ ≤ m, i.e. ‖A† g‖ ≤ m‖g‖, ∀g ∈ DA† − {0H },

and hence that the operator A† is bounded.
Proof. a ⇒ (b and d): Assuming condition a, by 4.2.6 there exists Ã ∈ B(H) such
that A ⊂ Ã, since (DA )⁻ = H. Then the function
ψ : H×H→C
(f, g) 7→ ψ(f, g) := (Ãf |g)
is a bounded sesquilinear form on H (cf. 10.5.5), and hence by 10.5.6 there exists
B ∈ B(H) such that
(Ãf |g) = (f |Bg) , ∀f, g ∈ H,
and hence such that
(Af |g) = (f |Bg) , ∀f ∈ DA , ∀g ∈ H = DB .
By 12.1.3B, this implies B ⊂ A† and hence B = A† . Further, we have kAk = kÃk
by 4.2.6d and kÃk = kBk by 10.1.14.
b ⇒ c: This is obvious.
c ⇒ a: Assuming condition c, by 12.2.1 we have that A†† is bounded. Since
A ⊂ A†† (cf. 12.1.6b), by 4.2.5a we obtain that A is bounded.
thus, the matrix ΦU (A† ) is the complex conjugate of the transpose of the matrix
ΦU (A).
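This finite-dimensional fact is easy to check numerically. The following sketch (an illustration added here, not part of the text) uses Python with NumPy, taking H = C⁴ with the standard inner product, and verifies that the conjugate transpose satisfies the defining property (Af |g) = (f |A† g) of the adjoint:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A_dag = A.conj().T  # conjugate transpose of the matrix of A

f = rng.normal(size=n) + 1j * rng.normal(size=n)
g = rng.normal(size=n) + 1j * rng.normal(size=n)

# np.vdot conjugates its first argument, matching the convention (f|g)
inner = lambda x, y: np.vdot(x, y)

# the defining property of the adjoint: (Af|g) = (f|A† g)
assert np.isclose(inner(A @ f, g), inner(f, A_dag @ g))
```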
(a) A† + B † ⊂ (A + B)† ;
(b) if B ∈ B(H) then A† + B † = (A + B)† .
Proof. For all α ∈ C, we have DαA = H because DαA = DA , and also (cf. 12.1.3A)
i.e. D(αA)† ⊂ DᾱA† , which (in view of what was proved above) implies ᾱA† = (αA)† .
b: In view of what was proved above we already know that 0A† ⊂ (0A)† .
Moreover,
proves that OH ⊂ (0A)† (cf. 12.1.3B), and hence that OH = (0A)† . Now, D0A† =
DA† and DA† = H iff A is bounded (cf. 12.2.2).
12.3.3 Remark. The equality 1†H = 1H (for the operator 1H , cf. 3.2.5) follows
immediately from 12.1.3B with A := ψ := 1H . Then, for every adjointable operator
A in H and every α ∈ C, from 12.3.1b and 12.3.2 we obtain
(A + α1H )† = A† + ᾱ1H
(a) A† B † ⊂ (BA)† ;
(b) if B ∈ B(H) then A† B † = (BA)† .
g ∈ D(BA)† ⇒ [ (Af |B † g) = (f |(BA)† g) , ∀f ∈ DA ]
⇒ B † g ∈ DA∗ = DA† ⇒ g ∈ DA† B † ,
where 12.1.3A, the equality DBA = DA , and the definition of DA∗ have been used.
This proves that D(BA)† ⊂ DA† B † , which (in view of result a) implies A† B † =
(BA)† .
Proof. From 12.1.4 we have DA† = H and A†† ⊂ A† . From 12.1.6b we have
A ⊂ A†† .
12.4.4 Remarks.
12.4.6 Remarks.
(a) A self-adjoint operator is closed (cf. 12.1.6a).
(b) Suppose that A and B are self-adjoint operators in H and that A ⊂ B. Then
B = B † ⊂ A† = A by 12.1.4, and hence A = B.
Proof. The operator A is injective in view of 3.2.6a. Then, 12.1.8 implies that
DA−1 = H and (A−1 )† = (A† )−1 = A−1 ,
i.e. that A−1 is s.a..
Proof. a ⇒ (b and c): Suppose that A is e.s.a.. Then it is symmetric, and hence it
is closable and Ā = A†† (cf. 12.4.4a). Thus, DA = DA† = H and (Ā)† = (A†† )† =
A† = A†† = Ā (cf. 12.1.6b).
Now suppose that B is a s.a. operator in H such that A ⊂ B. From 12.1.4 we
have B † ⊂ A† and then A†† ⊂ B †† . Since B †† = B † = B and A†† = A† , this implies
B = A†† = Ā. This proves that condition c is true.
b ⇒ a: If A is closable, then DA† = H and A†† = Ā by 12.1.6b. If Ā is s.a.,
we also have A†† = Ā = (Ā)† = (A†† )† = A† (cf. 12.1.6b). Thus, A ⊂ A† and
A†† = A† .
Proof. The equality RA−λ1H = H implies that, for each g ∈ DA† , there exists
f ∈ DA−λ1H so that
A† g − λg = Af − λf,
and hence, since A ⊂ A† , so that
(A† − λ1H )(g − f ) = 0H .
Now, A† − λ1H = (A − λ̄1H )† by 12.3.3, N(A−λ̄1H )† = R⊥A−λ̄1H by 12.1.7, R⊥A−λ̄1H =
{0H } since R̄A−λ̄1H = H (cf. 10.4.4d). Thus, g − f = 0H and hence g ∈ DA−λ1H =
DA . This proves that DA† ⊂ DA and hence that A is s.a..
Proof. Suppose that {fn } is a sequence in RA†† +i1H and that there exists f ∈ H
so that limn→∞ fn = f . Then there exists a sequence {gn } in DA†† +i1H = DA†† so
that (A†† + i1H )gn = fn for all n ∈ N. Now, for every g ∈ DA†† ,
k(A†† + i1H )gk2 = kA†† gk2 + i (A†† g|g) − i (g|A†† g) + kgk2 = kA†† gk2 + kgk2 ≥ kgk2
since A†† is symmetric; thus kgn − gm k ≤ kfn − fm k for all n, m ∈ N, and hence {gn } is a Cauchy sequence and converges to some g ∈ H. Then limn→∞ A†† gn = limn→∞ (fn − ign ) = f − ig, and since the operator A†† is closed we have g ∈ DA†† and A†† g = f − ig, i.e. (A†† + i1H )g = f ,
and hence f ∈ RA†† +i1H . This proves that RA†† +i1H is closed (cf. 2.3.4).
The proof for RA†† −i1H is analogous.
f = 0H .
by 12.4.16. Then A†† is s.a. by 12.4.14 (with A replaced by A†† and λ := i), since
it is a symmetric operator (cf. 12.4.4a). Thus, A†† = (A†† )† = A† (cf. 12.1.6b).
(a) A is self-adjoint;
(b) A is closed and NA† +i1H = NA† −i1H = {0H };
(c) RA+i1H = RA−i1H = H.
by 12.4.17.
b ⇒ c: Assuming condition b, we have
R̄A+i1H = R̄A−i1H = H
(cf. 12.1.7 and 10.4.4d), and the ranges RA+i1H and RA−i1H are closed by 12.4.16
since A = A†† (the operator A being closed); hence
RA+i1H = RA−i1H = H.
c ⇒ a: This follows directly from 12.4.14 (with λ := i).
12.4.19 Remark. Self-adjoint operators are, among symmetric operators, the im-
portant ones because the spectral theorem holds true for them. One is often given
an operator A which for some reason is known to be symmetric even if its adjoint is
not known (e.g., A might have been proved to be symmetric by 12.4.3), and wants
to find out if A is self-adjoint, or at least essentially self-adjoint. Condition 12.4.17c
is a criterion for deciding whether a symmetric operator is essentially self-adjoint
in which only the operator itself appears, and condition 12.4.18c is the same for
self-adjointness. If the operator A is found to be essentially self-adjoint, then it has
a unique self-adjoint extension, which is Ā (cf. 12.4.11), and it is often possible to
learn the relevant properties of the self-adjoint extension of A without explicitly
constructing Ā or A†† , but relying instead on the explicit form of A and on the
abstract properties of closures and adjoints. We point out that it is usually easier to
find essentially self-adjoint operators than self-adjoint ones because there are usually
many essentially self-adjoint operators that are restrictions of the same self-adjoint
operator (cf. 12.4.13). It is worth mentioning that there exist symmetric oper-
ators that have many self-adjoint extensions and others that have no self-adjoint
extension.
(A) Apσ(A) ⊂ R.
(B) Assuming σp (A) ≠ ∅, suppose λ1 , λ2 ∈ σp (A) and λ1 ≠ λ2 . Then,
NA−λ1 1H ⊂ N⊥A−λ2 1H .
since λ1 , λ2 ∈ R (cf. 4.5.8 and result A) and A is symmetric (cf. 12.4.3), and hence
(f1 |f2 ) = 0.
C: This follows from B, in view of 10.7.7. Indeed, we can construct an o.n.s. in
H by choosing an element of NA−λ1H ∩ H̃ for each λ ∈ σp (A).
Proof. a ⇒ b: Since the operator A is closed (cf. 12.4.6a), this follows directly
from 4.5.12.
b ⇒ a: If λ ∉ R, then λ ∈ ρ(A) by 12.4.21a. Now assume λ ∈ R and RA−λ1H =
H. Then the operator A − λ1H is s.a. (cf. 12.3.3) and in view of 12.1.7 we have
NA−λ1H = R⊥A−λ1H = {0H },
which implies that the operator A − λ1H is injective and (A − λ1H )−1 is s.a. (cf.
12.4.8). Then the equalities
D(A−λ1H )−1 = RA−λ1H = H
imply that the operator (A − λ1H )−1 is bounded (cf. 12.4.7). Therefore, we have
λ ∈ ρ(A).
12.4.24 Proposition. Suppose that A is a symmetric operator and that there exist
a c.o.n.s. {un }n∈N in H and a sequence {λn } in R so that
un ∈ DA and Aun = λn un , ∀n ∈ N.
Then:
DA† = {g ∈ H : Σ∞n=1 λ2n |(un |g)|2 < ∞} and
A† g = Σ∞n=1 λn (un |g) un , ∀g ∈ DA† ;
Proof. We have
g ∈ DA† ⇒ [ (un |A† g) = (Aun |g) = λn (un |g) , ∀n ∈ N] ⇒
[ Σ∞n=1 λ2n |(un |g)|2 = Σ∞n=1 |(un |A† g)|2 < ∞ and
A† g = Σ∞n=1 (un |A† g) un = Σ∞n=1 λn (un |g) un ]
This proves that the operator A† is symmetric (cf. 12.4.3), and hence that A† = A††
since we already know that A†† ⊂ A† (cf. 12.4.2). Thus, the operator A is e.s.a..
It is obvious that {λn }n∈N ⊂ σp (A) ⊂ σp (A† ). If λ ∈ σp (A† ) existed such that
λ ≠ λn for all n ∈ N, then by 12.4.20B there would exist f ∈ DA† so that f ≠ 0H
and (un |f ) = 0 for all n ∈ N, and hence the o.n.s. {un }n∈N would not be complete
(cf. 10.6.4e). This proves that σp (A† ) ⊂ {λn }n∈N and hence that
σp (A) = σp (A† ) = {λn }n∈N .
The inclusion {λn }n∈N ⊂ σ(A† ) is true because {λn }n∈N = σp (A† ) ⊂ σ(A† ) and
σ(A† ) is a closed subset of C (cf. 10.4.6). Now let λ ∈ C − {λn }n∈N ; then (cf.
2.3.10),
∃ε > 0 such that |λ − λn | ≥ ε, ∀n ∈ N,
and this implies that
∃ε > 0 s.t. k(A† − λ1H )gk2 = Σ∞n=1 |λn − λ|2 |(un |g)|2 ≥ ε2 Σ∞n=1 |(un |g)|2 = ε2 kgk2 , ∀g ∈ DA†
(cf. 10.6.4b, 10.4.8a, 10.6.4d), and this implies that λ ∈ C − Apσ(A† ) (cf. 4.2.3),
i.e. λ ∈ C − σ(A† ) (cf. 12.4.21b). This proves that σ(A† ) ⊂ {λn }n∈N , and hence
that
σ(A† ) = {λn }n∈N .
Finally, the equation σ(A) = σ(A† ) follows from 4.5.11 since the operator A is
closable and Ā = A†† = A† (cf. 12.4.4a).
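A finite-dimensional sketch of the situation of 12.4.24 can be added here (an illustration, not part of the text; Python with NumPy, H = C⁵, and an arbitrarily chosen real sequence of eigenvalues): for a matrix with an orthonormal basis of eigenvectors and real eigenvalues, the operator acts by the eigenvector expansion and the spectrum is exactly the set of eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 5
# a random orthonormal basis {u_k} of C^5 (columns of Q)
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
lam = np.array([1.0, 2.0, 2.0, 5.0, 7.0])     # real eigenvalues (arbitrary)
A = Q @ np.diag(lam) @ Q.conj().T             # A u_k = lam_k u_k, A Hermitian

g = rng.normal(size=n) + 1j * rng.normal(size=n)
# A g = sum_k lam_k (u_k|g) u_k
expansion = sum(lam[k] * np.vdot(Q[:, k], g) * Q[:, k] for k in range(n))
assert np.allclose(A @ g, expansion)

# the spectrum is exactly the set of eigenvalues
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), np.sort(lam))
```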
12.4.25 Examples. The examples we examine here are operators in the Hilbert
space L2 (a, b). Most of the elements of L2 (a, b) that we use in these examples are
equivalence classes which contain an element of C(a, b), and we find it pointless to
distinguish always between the symbol ϕ for an element of C(a, b) and the symbol
[ϕ] for the element of L2 (a, b) that contains ϕ. In fact, if ϕ ∈ C(a, b) then ϕ is the
only continuous function in the equivalence class of [ϕ] (cf. 11.2.2b) and therefore it
is unambiguously identified with [ϕ]. This is useful for avoiding some cumbersome
notation. In the same spirit, we use the same symbol for a subset of C(a, b) and
its image under the mapping ι defined in 11.2.1. For instance, in what follows we
regard the set
C01 (a, b) := {ϕ ∈ C 1 (a, b) : ϕ(a) = ϕ(b) = 0}
(for C 1 (a, b), cf. 3.1.10f) as a subset of L2 (a, b). Clearly, C01 (a, b) is a linear manifold
in L2 (a, b). Also, C01 (a, b) is dense in L2 (a, b) by 10.6.5b, since {sn }n∈N ⊂ C01 (a, b), where
{sn }n∈N is the c.o.n.s. in L2 (a, b) defined in 11.2.6.
For any θ ∈ [0, 2π) we define:
DAθ := {ϕ ∈ C 1 (a, b) : ϕ(b) = eiθ ϕ(a)},
Aθ : DAθ → L2 (a, b)
ϕ ↦ Aθ ϕ := −iϕ′ .
Clearly, the mapping Aθ is a linear operator in L2 (a, b), and it is adjointable since
C01 (a, b) ⊂ DAθ . Moreover
(Aθ ϕ|ψ) = i ∫ab ϕ̄′ (x)ψ(x)dx = i(ϕ̄(b)ψ(b) − ϕ̄(a)ψ(a)) − i ∫ab ϕ̄(x)ψ ′ (x)dx
= −i ∫ab ϕ̄(x)ψ ′ (x)dx = (ϕ|Aθ ψ) , ∀ϕ, ψ ∈ DAθ ,
where: the first and last equalities hold because an inner product of elements of
C(a, b) can be written as a Riemann integral (cf. 10.1.5b) and because the conjugate
of ϕ′ is the derivative of ϕ̄ (cf. 1.2.21); the second equality is integration by parts;
the third equality holds because ϕ̄(b)ψ(b) = e−iθ ϕ̄(a)eiθ ψ(a) = ϕ̄(a)ψ(a) for all
ϕ, ψ ∈ DAθ . By 12.4.3, this proves that
the operator Aθ is symmetric. Let eθ be the element of C(a, b) defined by
eθ (x) := exp(iθ(x − a)/(b − a)), ∀x ∈ [a, b].
If {un }n∈Z is the c.o.n.s. in L2 (a, b) defined in 11.2.4, it is obvious that the family
{eθ un }n∈Z is an o.n.s. in L2 (a, b). Moreover, for [ϕ] ∈ L2 (a, b), ēθ ϕ ∈ L2 (a, b) and
([eθ un ]|[ϕ]) = ([un ]|[ēθ ϕ]) , ∀n ∈ Z;
therefore, in view of 10.6.4e,
[([eθ un ]|[ϕ]) = 0, ∀n ∈ Z] ⇒ [exp(−iθ(x − a)/(b − a))ϕ(x) = 0 m-a.e. on [a, b]]
(we have used 10.1.7a for the elements 1[a,b] and ϕ of L2 (a, b)). Thus, the operator
Sλ is bounded for each λ ∈ C.
Now, for each λ ∈ C we have RSλ ⊂ DD and
(D − λ1L2 (a,b) )Sλ ϕ = ϕ, ∀ϕ ∈ C(a, b) = DSλ , (1)
and also, for all ψ ∈ DD and x ∈ [a, b],
(Sλ (D − λ1L2 (a,b) )ψ)(x) = i exp(iλx) ∫ax exp(−iλs)(−iψ ′ (s) − λψ(s))ds
= exp(iλx)(exp(−iλx)ψ(x) + iλ ∫ax exp(−iλs)ψ(s)ds) − iλ exp(iλx) ∫ax exp(−iλs)ψ(s)ds = ψ(x),
which proves that
Sλ (D − λ1L2 (a,b) )ψ = ψ, ∀ψ ∈ DD−λ1L2 (a,b) . (2)
By 1.2.16b, 1 and 2 imply that, for each λ ∈ C,
D − λ1L2 (a,b) is injective and (D − λ1L2 (a,b) )−1 = Sλ .
Since DSλ = C(a, b) and C(a, b) is dense in L2 (a, b), this proves that
ρ(D) = C and hence σ(D) = ∅.
Furthermore, for each λ ∈ C we have
Sλ (B − λ1L2 (a,b) )ψ = ψ, ∀ψ ∈ DB−λ1L2 (a,b) ,
since B ⊂ D. By 1.2.16a, this implies that, for each λ ∈ C,
B − λ1L2 (a,b) is injective and (B − λ1L2 (a,b) )−1 ⊂ Sλ ,
and hence (cf. 4.2.5a) also that (B − λ1L2 (a,b) )−1 is bounded. This proves that
Apσ(B) = ∅.
(cf. 12.1.7), and hence RB−λ1L2 (a,b) ≠ L2 (a, b) (cf. 10.4.4d). This proves that
All the operators examined above are defined by the same rule (cf. 1.2.1);
actually, they are all restrictions of the operator C. It is therefore clear that their
various properties depend entirely on the domains on which they are defined.
Finally, we examine some “second order” derivation operators. We have the
inclusion {u} ∪ {cn }n∈N ⊂ DBC , where {u} ∪ {cn }n∈N is the c.o.n.s. in L2 (a, b)
defined in 11.2.6, and hence DBC = L2 (a, b) (cf. 10.6.5b). Furthermore, we have
BC ⊂ C † B † ⊂ (BC)† by 12.3.4a. Thus, the operator BC is symmetric. Moreover,
BCu = 0L2 (a,b) and BCcn = (nπ/(b − a))2 cn , ∀n ∈ N.
Similarly, relying on the c.o.n.s. {sn }n∈N defined in 11.2.6, we can prove that the
operator CB is e.s.a., that the elements of {sn }n∈N are eigenvectors of CB, and
that
σp (CB) = σp ((CB)† ) = σ((CB)† ) = σ(CB) = {(nπ/(b − a))2 }n∈N .
Similarly, relying on the c.o.n.s. {eθ un }n∈Z defined above, we can prove that the
operator A2θ is e.s.a. for any θ ∈ [0, 2π), that the elements of {eθ un }n∈Z are eigen-
vectors of A2θ , and that
σp (A2θ ) = σp ((A2θ )† ) = σ((A2θ )† ) = σ(A2θ ) = {((2πn + θ)/(b − a))2 }n∈Z .
All these “second order” derivation operators are defined by the same rule.
Hence, the diversity of their spectra depends on the diversity of their domains.
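This can be illustrated numerically (a sketch added here, not part of the text; Python with NumPy, and assuming (a, b) = (0, 1)): a finite-difference discretization of the "second order" derivation with Dirichlet-type boundary values ϕ(a) = ϕ(b) = 0 reproduces the eigenvalues (nπ/(b − a))2 found above for the operator with eigenvectors {sn }n∈N .

```python
import numpy as np

# discrete -d²/dx² on (0, 1) with Dirichlet boundary conditions:
# tridiagonal matrix (2, -1, -1)/h² on N interior grid points
N = 800
h = 1.0 / (N + 1)
T = (np.diag(2.0 * np.ones(N))
     - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h**2

# the lowest eigenvalues approach (n π)², n = 1, 2, 3, as N grows
eigs = np.sort(np.linalg.eigvalsh(T))[:3]
exact = np.array([(n * np.pi) ** 2 for n in (1, 2, 3)])
assert np.allclose(eigs, exact, rtol=1e-4)
```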
and this implies A−1 ⊂ A† by 12.1.3B, and hence A−1 = A† since DA−1 = RA = H
by the definition of an automorphism.
b ⇒ c: Assuming condition b, we have A† A = 1H since DA = H and AA† = 1RA
(cf. 3.2.6b). Now, A−1 = A† implies that A−1 is closed (cf. 12.1.6a). Hence A is
closed (cf. 4.4.7), hence A is bounded by 12.2.3, hence DA† = H by 12.2.2, hence
RA = DA−1 = H, and hence AA† = 1H .
c ⇒ d: Assuming condition c, A† A = 1H implies DA = H and RA† = H, and
AA† = 1H implies DA† = H (cf. 1.2.13Ab,Ac). Further we have
(f |g) = (A(A† f )|g) = (A† f |A† g) , ∀f, g ∈ H,
= k(U † − λ̄1H )f k2 , ∀λ ∈ C, ∀f ∈ H,
where 12.1.3A and 12.3.3 have been used. This implies
NU−λ1H = NU † −λ̄1H , ∀λ ∈ C.
Proof. In view of 12.4.20A we have −i ∉ σp (A), and hence the operator
A + i1H is injective. We have
D(A+i1H )−1 = RA+i1H and R(A+i1H )−1 = DA+i1H = DA = DA−i1H
(cf. 1.2.11a), and from these equalities we obtain
DV = RA+i1H and RV = RA−i1H .
For each f ∈ DV , we set g := (A + i1H )−1 f (note that DV = D(A+i1H )−1 ); then,
g ∈ DA+i1H = DA = DA−i1H and f = (A + i1H )g, (1)
and also
V f = (A − i1H )(A + i1H )−1 f = (A − i1H )g; (2)
since
k(A ± i1H )gk2 = kAgk2 ± i (Ag|g) ∓ i (g|Ag) + kgk2 = kAgk2 + kgk2
(cf. 12.4.3a), from 2 we have
kV f k = k(A − i1H )gk = k(A + i1H )gk = kf k;
moreover, from 1 and 2 we have
(V − 1H )f = (A − i1H )g − (A + i1H )g = −2ig = −2i(A + i1H )−1 f (3)
and also
(V + 1H )f = 2Ag = 2A(A + i1H )−1 f. (4)
Since
DV −1H = DV = RA+i1H = D(A+i1H )−1 ,
3 implies that
V − 1H = −2i(A + i1H )−1 , (5)
which implies (cf. 1.2.11b) that the operator V − 1H is injective and hence that
1 ∉ σp (V ), and also that
(V − 1H )−1 = −(1/(2i))(A + i1H ).
From R(A+i1H )−1 = DA we have
DA(A+i1H )−1 = D(A+i1H )−1 = DV = DV +1H
(cf. 1.2.13Ad); then, 4 implies that
V + 1H = 2A(A + i1H )−1 . (6)
Now, 5 and 6 imply that
−i(V + 1H )(V − 1H )−1 = A(A + i1H )−1 (A + i1H ) = A,
where the last equality holds because
(A + i1H )−1 (A + i1H ) = 1DA+i1H = 1DA .
Finally, since
kV f k = kf k, ∀f ∈ DV ,
the operator V is unitary iff DV = RV = H (cf. 10.1.20), i.e. iff
RA+i1H = RA−i1H = H,
i.e. iff the operator A is s.a. (cf. 12.4.18).
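In finite dimension every Hermitian matrix is self-adjoint, so the Cayley transform is always unitary there; the identities established in the proof above can then be checked numerically. A sketch (an addition, not part of the text; Python with NumPy, H = C⁴):

```python
import numpy as np

rng = np.random.default_rng(2)

n = 4
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (B + B.conj().T) / 2                 # a self-adjoint (Hermitian) A
I = np.eye(n)

# Cayley transform V = (A - i1)(A + i1)^{-1}
V = (A - 1j * I) @ np.linalg.inv(A + 1j * I)

assert np.allclose(V.conj().T @ V, I)                      # V is unitary
assert not np.any(np.isclose(np.linalg.eigvals(V), 1.0))   # 1 is not an eigenvalue of V
# A is recovered as -i(V + 1)(V - 1)^{-1}
assert np.allclose(-1j * (V + I) @ np.linalg.inv(V - I), A)
```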
The next theorem must be added to what was proved in 4.6.5 (also, cf. 10.3.19)
about the unitary-antiunitary equivalence of operators.
12.5.4 Theorem. Let H1 and H2 be isomorphic Hilbert spaces, let A ∈ O(H1 ) and
B ∈ O(H2 ), and suppose that there exists U ∈ UA(H1 , H2 ) so that B = U AU −1 .
Then:
(a) if DA = H1 then DB = H2 and B † = U A† U −1 ;
(b) if A is symmetric then B is symmetric;
(c) if A is self-adjoint then B is self-adjoint;
(d) if A is essentially self-adjoint then B is essentially self-adjoint.
Proof. We already know that B(H) is a Banach algebra over C (cf. 4.3.5). The
definition of the mapping ι of the statement is consistent because
A† ∈ B(H), ∀A ∈ B(H),
by 12.2.2. Now we prove that the mapping ι of the statement has all the properties
listed in 12.6.1.
c∗1 : this follows from 12.3.1b.
c∗2 : this follows from 12.3.2.
c∗3 : this follows from 12.3.4b.
c∗4 : this follows from 12.1.6b.
c∗5 : For A ∈ B(H) we have
kAf k2 = (Af |Af ) = (f |A† Af ) ≤ kf kkA† Af k ≤ kf kkA† Akkf k, ∀f ∈ H,
by 12.1.3A, 10.1.7a, 4.2.5b, and this proves that kAk2 ≤ kA† Ak. We also have
|(f |A† Ag)| = |(Af |Ag)| ≤ kAf kkAgk ≤ kAk2 kf kkgk, ∀f, g ∈ H,
for the same reasons as above, and this proves that kA† Ak ≤ kAk2 (cf. 10.1.14).
The last two assertions of the statement follow directly from 12.6.2 and from
12.6.3.
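The C*-property c∗5 can be checked numerically for B(H) with H = C⁶, where the operator norm is the largest singular value (a sketch added here, not part of the text; Python with NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)

n = 6
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# operator (spectral) norm = largest singular value
op_norm = lambda M: np.linalg.norm(M, 2)

# the C*-identity ||A† A|| = ||A||²
assert np.isclose(op_norm(A.conj().T @ A), op_norm(A) ** 2)
```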
were bounded, then we should have A, B ∈ B(H) (cf. 12.4.7) and condition HCCR
would be
AB − BA = i1H . (1)
This would imply the equations
An B − BAn = inAn−1 , ∀n ∈ N. (2)
Indeed, 1 is 2 for n = 1 (recall that A0 := 1H , cf. 3.3.1) and, assuming that 2 is
true for a given n ∈ N, we have
An BA − BAn+1 = inAn ,
which in view of 1 can be written as
An (AB − i1H ) − BAn+1 = inAn ,
or
An+1 B − BAn+1 = i(n + 1)An .
This proves 2 by induction. From 2 we would have, by 12.6.3 (also, cf. 12.6.4),
nkAkn−1 = kinAn−1 k ≤ kAkn kBk + kBkkAkn, ∀n ∈ N,
which would imply (note that 1 implies A ≠ OH )
n ≤ 2kAkkBk, ∀n ∈ N,
which is a contradiction.
This also shows that the relation
AB − BA = i1H
(which is clearly stronger than HCCR) is an impossible relation for two self-adjoint
operators A and B, since it would imply DA = DB = H, and hence that both the
operators A and B are bounded (cf. 12.4.7). We mention the fact that there are
pairs of self-adjoint operators which satisfy HCCR (cf. 20.1.3b and 20.1.7).
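In finite dimension there is an even shorter obstruction to AB − BA = i1H , consistent with the norm argument above: the trace of any commutator of matrices vanishes, while tr(i1H ) = i · dim H ≠ 0. A numerical sketch (an addition, not part of the text; Python with NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)

n = 5
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

comm = A @ B - B @ A
# tr(AB - BA) = 0 for every pair of matrices ...
assert np.isclose(np.trace(comm), 0.0)
# ... while tr(i·1) = i·n is never zero
assert not np.isclose(np.trace(1j * np.eye(n)), 0.0)
```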
12.6.6 Remarks.
(a) Here we make some remarks about linear operators in a one-dimensional Hilbert
space which could also be deduced from 10.8.4. Thus, we suppose in what
follows that H is a one-dimensional Hilbert space.
Since {0H } and H are the only linear manifolds in H, the domain of every non-
trivial linear operator in H must be H. Moreover, every linear operator in H
is bounded (cf. 10.8.3). Thus, the family of non-trivial linear operators in H is
B(H).
For α ∈ C, we define the mapping
Aα : H → H
f ↦ Aα f := αf.
(b) For a one-dimensional Hilbert space H, theorem 12.5.3 on the Cayley transform
of a self-adjoint operator can be rephrased as follows, in view of what was seen
in remark a:
for all x ∈ R, (x − i)/(x + i) ∈ T and (x − i)/(x + i) ≠ 1;
if we define the function
ϕ : R → T − {1}
x ↦ ϕ(x) := (x − i)/(x + i),
then we have
x = −i(ϕ(x) + 1)/(ϕ(x) − 1), ∀x ∈ R,
and hence the function ϕ is injective.
Of course, all this can be proved directly, without going through 12.5.3. The
name of Cayley transform was originally given to the function ϕ.
Now, let z ∈ T − {1} and write z = exp iθ with 0 < θ < 2π. Then,
−i(z + 1)/(z − 1) = −i(exp(iθ/2) + exp(−iθ/2))/(exp(iθ/2) − exp(−iθ/2)) = −cos(θ/2)/sin(θ/2) ∈ R
and
ϕ(−cos(θ/2)/sin(θ/2)) = (cos(θ/2) + i sin(θ/2))/(cos(θ/2) − i sin(θ/2)) = exp(iθ/2)/exp(−iθ/2) = exp iθ = z.
This proves that the function ϕ is a bijection from R onto T − {1} and that its
inverse is the function
ψ : T − {1} → R
z ↦ ψ(z) := −i(z + 1)/(z − 1).
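The bijection ϕ and its inverse ψ can be verified numerically on a grid of real points (a sketch added here, not part of the text; Python with NumPy):

```python
import numpy as np

x = np.linspace(-50.0, 50.0, 1001)
phi = (x - 1j) / (x + 1j)        # ϕ(x) = (x - i)/(x + i)

assert np.allclose(np.abs(phi), 1.0)       # ϕ(x) ∈ T
assert not np.any(np.isclose(phi, 1.0))    # ϕ(x) ≠ 1

psi = -1j * (phi + 1) / (phi - 1)          # ψ(z) = -i(z + 1)/(z - 1)
assert np.allclose(psi, x)                 # ψ inverts ϕ on R
```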
12.6.7 Proposition. Let X be a non-empty set. For the Banach algebra FB (X)
(cf. 4.3.6a), the mapping
ι : FB (X) → FB (X)
ϕ ↦ ι(ϕ) := ϕ̄
is defined consistently, and FB (X) is a C ∗ -algebra with this mapping as involution.
If A is a σ-algebra on X, the Banach algebra MB (X, A) (cf. 6.2.29) is a C ∗ -
algebra with the restriction ιMB (X,A) as involution.
If a distance is defined on X, the Banach algebra CB (X) (cf. 4.3.6b) is a C ∗ -
algebra with the restriction ιCB (X) as involution.
Proof. It is obvious that the mapping ι is defined consistently and that it satisfies
all the conditions listed in 12.6.1. For instance, as to condition c∗5 we have, for all
ϕ ∈ FB (X),
[|ϕ(x)| ≤ kϕk∞ , ∀x ∈ X] ⇒
[|ϕ̄(x)ϕ(x)| = |ϕ(x)|2 ≤ kϕk2∞ , ∀x ∈ X] ⇒ kϕ̄ϕk∞ ≤ kϕk2∞
and
[|ϕ(x)|2 = |ϕ̄(x)ϕ(x)| ≤ kϕ̄ϕk∞ , ∀x ∈ X] ⇒ kϕk∞ ≤ (kϕ̄ϕk∞ )1/2 ⇒ kϕk2∞ ≤ kϕ̄ϕk∞ .
The restrictions ιMB (X,A) and ιCB (X) are defined because MB (X, A) and CB (X)
are subsets of FB (X). Moreover, ι(MB (X, A)) ⊂ MB (X, A) (cf. 6.2.17),
ι(CB (X)) ⊂ CB (X), and it is obvious that ιMB (X,A) and ιCB (X) have the prop-
erties of an involution.
Chapter 13
In the first half of this chapter we study orthogonal projections, which are the
building blocks of unitary and of self-adjoint operators, as the spectral theorems
show. Orthogonal projections enter our formulation of the spectral theorems in
the guise of projection valued measures, which we study in the second half of this
chapter.
Throughout this chapter, H denotes an abstract Hilbert space.
δM : H → M × M ⊥
f ↦ δM (f ) := (f1 , f2 ) if (f1 , f2 ) ∈ M × M ⊥ is such that f = f1 + f2 .
πM : M × M ⊥ → M
(f, g) ↦ πM (f, g) := f,
and we call orthogonal projection onto M the composition of πM with δM , i.e. the
mapping PM defined by PM := πM ◦ δM . Thus, PM is a mapping from H to M .
However, it is convenient to consider H instead of M as the final set of the mapping
PM (cf. 1.2.1). Clearly, the mapping PM can be defined directly as follows:
PM : H → H
f ↦ PM f := f ′ if f ′ ∈ M and f − f ′ ∈ M ⊥ .
13.1.4 Remark.
(a) For every projection A in H we have, in view of 13.1.3c,
RA ∈ S (H) and A = PRA .
We also have, in view of 13.1.3c,e,
PR⊥A = 1H − A and hence R⊥A = R1H −A .
(b) In view of 13.1.3c, the mapping
S (H) ∋ M 7→ PM ∈ P(H)
is injective (if M, N ∈ S (H) are such that PM = PN then M = RPM = RPN =
N ) and hence it is bijective from S (H) onto P(H), and the mapping
P(H) ∋ A 7→ RA ∈ S (H)
is defined consistently and it is the inverse of the preceding mapping (cf. remark
a).
in view of 13.1.3c. This proves that A is symmetric (cf. 12.4.3) and hence self-
adjoint (since DA = H) and also that A2 = A.
b ⇒ a: We assume condition b. For every f ∈ H, we have obviously
f = Af + (f − Af )
and
Af ∈ RA and hence Af ∈ R̄A ;
we also have
(f − Af |Ag) = (A(f − Af )|g) = (Af − A2 f |g) = 0, ∀g ∈ H,
and this proves that f − Af ∈ R⊥A and hence f − Af ∈ (R̄A )⊥ (cf. 10.2.11); since
R̄A ∈ S (H) (cf. 3.2.2a and 4.1.12), all this can be written as
δR̄A (f ) = (Af, f − Af ),
and this implies Af = PR̄A f . This proves that A = PR̄A , and hence condition
a.
13.1.7 Remarks.
(a) For any normed space X, an operator A ∈ OE (X) is called a projection in X if
A = A2 . From 13.1.5 we see that a projection in H is an orthogonal projection
iff it is self-adjoint. Besides, from 13.1.6 we see that a projection in H is an
orthogonal projection iff it is bounded with norm not greater than one.
The only projections in H that we consider in this book are orthogonal projec-
tions. For this reason, we may sometimes use the word projection to mean an
orthogonal projection.
(b) The plan for the proof of b ⇒ a in 13.1.5 was suggested by the fact that if
A ∈ P(H) then A = PRA = PR̄A (cf. 13.1.4.a). We point out that we could
not have set out to prove the equation A = PRA , because we did not know yet
that RA was a subspace. However, we did know that R̄A was a subspace and
therefore it was sensible to set out to prove that A = PR̄A .
The plan for the proof of b ⇒ a in 13.1.6 was suggested by the fact that if
A ∈ P(H) then 1H − A = PR⊥A (cf. 13.1.4a), and hence N1H −A = R⊥⊥A = R̄A
(cf. 13.1.3b and 10.4.4a), and hence A = PR̄A = PM if we write M := N1H −A
(cf. 13.1.4a). We point out that the first thing we proved was that N1H −A was
a subspace.
(c) In view of 13.1.5, for each A ∈ P(H) we have
(f |Af ) = (f |A2 f ) = (Af |Af ) = kAf k2 , ∀f ∈ H.
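The characterizations in 13.1.5–13.1.7 are easy to verify numerically in finite dimension. The sketch below (an addition, not part of the text; Python with NumPy, H = C⁶) builds the orthogonal projection onto a two-dimensional subspace M from a matrix Q with orthonormal columns spanning M, and checks idempotence, self-adjointness, unit operator norm, and the identity of remark c:

```python
import numpy as np

rng = np.random.default_rng(5)

n, k = 6, 2
Q, _ = np.linalg.qr(rng.normal(size=(n, k)) + 1j * rng.normal(size=(n, k)))
P = Q @ Q.conj().T    # orthogonal projection onto the column span M of Q

assert np.allclose(P @ P, P)                    # P² = P
assert np.allclose(P, P.conj().T)               # P† = P
assert np.isclose(np.linalg.norm(P, 2), 1.0)    # operator norm ||P|| = 1

f = rng.normal(size=n) + 1j * rng.normal(size=n)
# remark c: (f|Pf) = ||Pf||²
assert np.isclose(np.vdot(f, P @ f), np.linalg.norm(P @ f) ** 2)
```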
13.1.8 Theorem. Let H1 and H2 be isomorphic Hilbert spaces, let A ∈ P(H1 ) and
B ∈ O(H2 ), and suppose that there exists U ∈ UA(H1 , H2 ) so that B = U AU −1 .
Then B ∈ P(H2 ). In fact,
U (RA ) ∈ S (H2 ) and B = PU(RA ) .
Proof. Assume DA closed and A closed, let P denote the orthogonal projection
onto DA , and consider the operator AP . Clearly, DAP = H. Moreover, if two
vectors f, g of H and a sequence {fn } in H are so that
fn → f and AP fn → g
then
P fn → P f
13.1.11 Corollary. If H is separable then for every A ∈ P(H)− {OH } there exists
a countable o.n.s. {ui }i∈I in H so that
X
Af = (ui |f ) ui , ∀f ∈ H.
i∈I
13.1.12 Definition. For u ∈ H̃ (for H̃, cf. 10.9.4) we write Au := PV {u} . In view
of 13.1.10, we have
DAu = H and Au f = (u|f ) u, ∀f ∈ H.
The operator Au is said to be a one-dimensional projection in H.
13.1.13 Remarks.
(a) If u, v ∈ H̃ are such that u ÷ v (for the relation ÷ in H, cf. 10.9.1) then there
exists z ∈ T such that u = zv, and hence
Au f = (zv|f ) zv = z̄z (v|f ) v = (v|f ) v = Av f, ∀f ∈ H,
i.e. Au = Av . Conversely, if u, v ∈ H̃ are such that Au = Av then in particular
u = Au u = Av u = (v|u) v,
and hence u ÷ v. Therefore, the mapping
Ĥ ∋ [u] 7→ Au ∈ P(H)
(for Ĥ, cf. 10.9.4) can be defined consistently and it is injective; hence, it is a
bijection from Ĥ onto the family of all one-dimensional projections in H.
(b) If H1 and H2 are isomorphic Hilbert spaces, for u ∈ H̃1 and U ∈ UA(H1 , H2 )
we have U Au U −1 = AUu . Indeed,
U Au U −1 f = U ((u|U −1 f ) u) = (u|U −1 f ) U u = (U u|f ) U u = AUu f, ∀f ∈ H,
if U is unitary, and
U Au U −1 f = U ((u|U −1 f ) u) = (U −1 f |u) U u = (U u|f ) U u = AUu f, ∀f ∈ H,
if U is antiunitary.
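The one-dimensional projection Au and the phase invariance of remark a can be checked numerically (a sketch added here, not part of the text; Python with NumPy, H = C⁵):

```python
import numpy as np

rng = np.random.default_rng(6)

n = 5
u = rng.normal(size=n) + 1j * rng.normal(size=n)
u = u / np.linalg.norm(u)                  # a unit vector u
f = rng.normal(size=n) + 1j * rng.normal(size=n)

# A_v f = (v|f) v; np.vdot conjugates its first argument
A_u = lambda v, w: np.vdot(v, w) * v

z = np.exp(1j * 0.7)                       # a phase z ∈ T
assert np.allclose(A_u(u, f), A_u(z * u, f))       # A_u = A_{zu}
assert np.allclose(A_u(u, A_u(u, f)), A_u(u, f))   # A_u is idempotent
```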
In this section we examine some conditions under which orthogonal projections can
be constructed out of other orthogonal projections. The bijection between S (H)
and P(H) examined in 13.1.4b translates relations between subspaces into relations
between orthogonal projections. Examples of this can be found in 13.2.1, 13.2.4,
13.2.8, 13.2.9.
Then,
f ∈ M ⇒ f = PM f = PM PN f + PM (1H − PN )f
= PM PN f + PM PN ⊥ f ∈ (M ∩ N ) + (M ∩ N ⊥ ).
Since the inclusion (M ∩ N ) + (M ∩ N ⊥ ) ⊂ M is obvious, this proves the equation
M = (M ∩ N ) + (M ∩ N ⊥ ).
Assuming condition b, we also have PN PM ⊥ = PM ⊥ PN (cf. the second remark in
the statement), and this implies (proceeding as above) the equation
M ⊥ = (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ).
In view of 10.4.2a, this proves the equation
H = (M ∩ N ) + (M ∩ N ⊥ ) + (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ). (1)
Now, in view of 10.2.10b and 10.2.13, we have
M ∩ N ⊥ ⊂ N ⊥ ⊂ (M ∩ N )⊥ and (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ) ⊂ M ⊥ ⊂ (M ∩ N )⊥ ,
and hence
(M ∩ N ⊥ ) + (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ) ⊂ (M ∩ N )⊥ . (2)
In view of 10.2.15, 1 and 2 imply condition c.
c ⇒ d: Assuming condition c, by 10.4.2a we have
H = (M ∩ N ) + (M ∩ N ⊥ ) + (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ),
and hence
H = (M ∩ N ) + (M ∩ N ⊥ ) + M ⊥ (3)
since (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ) ⊂ M ⊥ . Now,
(M ∩ N ) + (M ∩ N ⊥ ) ⊂ M = (M ⊥ )⊥ (4)
(cf. 10.4.4a). In view of 10.2.15, 3 and 4 imply that
(M ∩ N ) + (M ∩ N ⊥ ) = (M ⊥ )⊥ = M.
d ⇒ (a and e): Assuming condition d we have that
∀g ∈ M, ∃!(g1 , g2 ) ∈ (M ∩ N ) × (M ∩ N ⊥ ) so that g = g1 + g2
(the uniqueness of (g1 , g2 ) as above follows from g1 ∈ N and g2 ∈ N ⊥ , cf. 10.4.1).
This and 10.4.1 imply that, for every f ∈ H,
∃!(f1 , f2 , f3 ) ∈ (M ∩ N ) × (M ∩ N ⊥ ) × M ⊥ so that f = f1 + f2 + f3 ;
then we have:
δM∩N (f ) = (f1 , f2 + f3 )
For any M, N ∈ S (H) we have M ∩ N ∈ S (H) (cf. 4.1.10) and hence we can
consider the orthogonal projection PM∩N . If PN PM = PM PN then 13.2.1 proves
that PM∩N = PN PM . The next theorem shows how PM∩N can be obtained from
PM and PN in the general case. In its proof, we follow von Neumann faithfully
(Neumann, 1950).
since
(PM PN )h (PN PM )k = PM (PN PM )h−1 PN (PN PM )k
= PM (PN PM )k+h−1 = A2k+2h−1
and
(PM PN )h PM (PN PM )k = PM (PN PM )h (PN PM )k = A2k+2h+1 .
This proves that, for all m, n ∈ N and all f ∈ H,
(Am f |An f ) = (Am+n−s f |f ) ,
with s = 1 if m and n have the same parity and s = 0 if m and n have different
parity, and hence that
kAm f − An f k2 = (Am f |Am f ) + (An f |An f ) − (Am f |An f ) − (An f |Am f )
= (A2m−1 f |f ) + (A2n−1 f |f ) − 2 (Am+n−s f |f ) (5)
= (A2m−1 f |f ) + (A2n−1 f |f ) − 2 (A2km,n −1 f |f ) ,
with km,n ∈ N such that 2km,n − 1 = m + n − s (note that m + n − s is always odd).
Moreover, for all i ∈ N and all f ∈ H, we have
(A2i−1 f |f ) = (Ai f |Ai f ) = kAi f k2 and
(A2i+1 f |f ) = (Ai+1 f |Ai+1 f ) = kAi+1 f k2 ;
now, Ai+1 f = PM Ai f if i is even and Ai+1 f = PN Ai f if i is odd; therefore (cf.
13.1.3d or 13.1.6), in any case,
kAi+1 f k ≤ kAi f k.
This proves that, for every f ∈ H, the sequence of non-negative real numbers
{(A2i−1 f |f )} is monotone non-increasing, and hence that it is convergent, and
hence (cf. 2.6.2) that
∀ε > 0, ∃Nε ∈ N s.t. Nε < i, j ⇒ | (A2i−1 f |f ) − (A2j−1 f |f ) | < ε;
from this and from 5 we have that
∀ε > 0, ∃Nε ∈ N s.t. Nε < m, n ⇒
kAm f − An f k2 ≤ | (A2m−1 f |f ) − (A2km,n −1 f |f ) | +
| (A2n−1 f |f ) − (A2km,n −1 f |f ) | < 2ε
(note that Nε < m, n implies Nε < km,n ); since H is a complete metric space, this
proves that the sequence {An f } is convergent.
Thus, we can define the mapping
A : H → H
f ↦ Af := limn→∞ An f.
It is easy to see that the mapping A is a linear operator by the continuity of vector
sum and of scalar multiplication. Further, we have
(Af |Af ) = limn→∞ (An f |An f ) = limn→∞ (A2n−1 f |f ) = (limn→∞ A2n−1 f |f ) = (Af |f ) , ∀f ∈ H,
by the continuity of inner product and by 2.1.7b (in relation to the subsequence
{A2n−1 f } of the sequence {An f }). In view of 12.4.3, this proves that the operator
A is symmetric, and hence self-adjoint since DA = H. Then, from the equation
above we also have
(A2 f |f ) = (Af |f ) , ∀f ∈ H,
which proves the equation A2 = A, in view of 10.2.12. Thus (cf. 13.1.5), the
operator A is an orthogonal projection.
Now, in view of 13.1.3c we have
f ∈ M ∩ N ⇒ [PM f = f and PN f = f ] ⇒
[An f = f, ∀n ∈ N] ⇒ Af = f ⇒ f ∈ RA ,
and conversely
f ∈ RA ⇒ f = Af = limn→∞ A2n f ⇒ PN f = limn→∞ PN A2n f = limn→∞ A2n f = f ⇒ f ∈ N
as well as
f ∈ RA ⇒ f = Af = limn→∞ A2n+1 f ⇒ PM f = limn→∞ PM A2n+1 f = limn→∞ A2n+1 f = f ⇒ f ∈ M
(we have used 2.1.7b in relation to the subsequences {A2n f } and {A2n+1 f } of the
sequence {An f }). This proves that RA = M ∩ N , and hence that A = PM∩N (cf.
13.1.4a).
Finally we have, as already noted,
PM∩N f = Af = limn→∞ A2n f = limn→∞ (PN PM )n f, ∀f ∈ H.
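Von Neumann's formula can be watched at work in R³ (a numerical sketch added here, not part of the text; Python with NumPy). With M the plane spanned by e1, e2 and N the plane spanned by e1, e2 + e3, the intersection M ∩ N is the span of e1, and the two projections do not commute:

```python
import numpy as np

def proj(cols):
    # orthogonal projection onto the column span of the given matrix
    Q, _ = np.linalg.qr(cols)
    return Q @ Q.T

PM = proj(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]))  # M = span{e1, e2}
PN = proj(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]))  # N = span{e1, e2+e3}

assert not np.allclose(PM @ PN, PN @ PM)   # the projections do not commute

f = np.array([2.0, 3.0, 5.0])
g = f.copy()
for _ in range(200):                        # iterate (P_N P_M)^n f
    g = PN @ (PM @ g)

# the limit is the projection of f onto M ∩ N = span{e1}
assert np.allclose(g, [2.0, 0.0, 0.0], atol=1e-8)
```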
whence
PM PN + PM PN PM = 2PM PN and PM PN PM + PN PM = 2PN PM ,
whence
PM PN = PN PM .
Substituting PM PN for PN PM in 6, we obtain PM PN = PN .
b ⇒ c: In view of 13.1.5 and 12.3.4b, condition b implies
PN = PN† = (PM PN )† = PN† PM† = PN PM .
c ⇒ d: Assuming condition c, in view of 13.1.3d or 13.1.6 we have
kPN f k = kPN PM f k ≤ kPM f k, ∀f ∈ H,
and hence condition d by 13.1.7c.
d ⇒ e: Assuming condition d, in view of 13.1.3c and 13.1.7c we have
f ∈ N ⇒ kf k = kPN f k ≤ kPM f k ⇒ kf k = kPM f k ⇒ f ∈ M,
where the second implication is true because kPM f k ≤ kf k for all f ∈ H (cf. 13.1.3d
or 13.1.6).
e ⇒ (a and f ): Assuming condition e, 10.4.3 implies that
∀g ∈ M, ∃!(g1 , g2 ) ∈ N × (M ∩ N ⊥ ) so that g = g1 + g2 .
This and 10.4.1 imply that, for every f ∈ H,
∃!(f1 , f2 , f3 ) ∈ N × (M ∩ N ⊥ ) × M ⊥ so that f = f1 + f2 + f3 ;
then we have:
δM∩N ⊥ (f ) = (f2 , f1 + f3 )
since f1 ∈ N = N ⊥⊥ ⊂ (M ∩ N ⊥ )⊥ and f3 ∈ M ⊥ ⊂ (M ∩ N ⊥ )⊥ , and hence
f1 + f3 ∈ (M ∩ N ⊥ )⊥ ;
δN (f ) = (f1 , f2 + f3 )
since f2 ∈ M ∩ N ⊥ ⊂ N ⊥ and f3 ∈ M ⊥ ⊂ N ⊥ , and hence f2 + f3 ∈ N ⊥ ;
δM (f ) = (f1 + f2 , f3 )
since f1 ∈ N ⊂ M and f2 ∈ M ∩ N ⊥ ⊂ M , and hence f1 + f2 ∈ M ; therefore we
have
PM∩N ⊥ f = f2 = (f1 + f2 ) − f1 = PM f − PN f = (PM − PN )f.
This proves the equation PM − PN = PM∩N ⊥ , which obviously implies condition a.
Finally, the equation PM − PN = PM∩N ⊥ is equivalent to the equation
RPM −PN = M ∩ N ⊥ , in view of 13.1.3c and 13.1.4a.
for N, M ∈ S (H), N ≤ M if N ⊂ M.
The l.u.b. and the g.l.b. exist for every family {Mi }i∈I of elements of S (H), and
they are
sup{Mi }i∈I = V (∪i∈I Mi ) and inf{Mi }i∈I = ∩i∈I Mi
(the first equation is proved by 4.1.11a,b,c). In view of the bijection existing between
S (H) and P(H) (cf. 13.1.4b), we can obviously define a partial ordering in P(H)
as follows:
for P, Q ∈ P(H), P ≤ Q if RP ⊂ RQ .
and
inf{Pi}i∈I = PM̌ if M̌ := ∩_{i∈I} RPi.
13.2.6 Remark. Let {Pi }i∈I be an arbitrary family of elements of P(H). Since
inf{Pi }i∈I ≤ Pk for all k ∈ I, we have
need not hold true, and indeed they do not in general, not even when the elements
of the family are so that Pi Pj = Pj Pi for all i, j ∈ I, as is shown by the family
of one-dimensional projections {Au1 , Au2 } with {u1 , u2 } an o.n.s. in H; in fact,
inf{Au1 , Au2 } = OH , while for the vector f := u1 + u2 we have (f |Aui f ) = 1 for
i = 1, 2. In 13.2.7 we prove that statement 7 is true if the family {Pi }i∈I is closed
under multiplication, i.e. if the product Pi Pj belongs to the family for all i, j ∈ I.
This result is important in the theory of projection valued measures (it is used in
the proof of 13.4.2).
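The counterexample mentioned above can be verified numerically. The sketch below is our own code: it takes the o.n.s. {u1, u2} in R^2, for which inf{Au1, Au2} = OH, while both quadratic forms at f := u1 + u2 equal 1, so the quadratic form of the infimum projection is not the infimum of the quadratic forms.

```python
# Our own 2-dimensional check of the counterexample with an o.n.s. {u1, u2}.
def rank_one(u, f):
    """Orthogonal projection onto span{u}: (u|f) u, for real vectors."""
    c = sum(a * b for a, b in zip(u, f))
    return tuple(c * a for a in u)

def form(u, f):
    """Quadratic form (f | A_u f)."""
    return sum(a * b for a, b in zip(f, rank_one(u, f)))

u1, u2 = (1.0, 0.0), (0.0, 1.0)
f = (1.0, 1.0)                          # f = u1 + u2

assert form(u1, f) == 1.0 and form(u2, f) == 1.0
# A_u1 A_u2 = O_H, so inf{A_u1, A_u2} = O_H, whose quadratic form at f is 0,
# not inf{1, 1} = 1:
assert rank_one(u1, rank_one(u2, f)) == (0.0, 0.0)
```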
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 400
13.2.7 Theorem. Let {Pi }i∈I be a family of elements of P(H) and suppose that
∀i, j ∈ I, ∃k ∈ I such that Pi Pj = Pk .
Then,
∃!P ∈ P(H) so that (f |P f ) = inf{(f |Pi f )}i∈I , ∀f ∈ H.
This unique orthogonal projection P is the orthogonal projection inf{Pi }i∈I .
13.2.8 Theorem. For a sequence {Mn } in S (H), the following conditions are
equivalent (we write Pn := PMn , ∀n ∈ N):
(a) the series Σ_{n=1}^∞ Pn f is convergent for all f ∈ H and the mapping
P : H → H, f ↦ Pf := Σ_{n=1}^∞ Pn f
is an orthogonal projection;
(b) Σ_{n=1}^∞ ‖Pn f‖² ≤ ‖f‖², ∀f ∈ H;
(c) Pi Pk = OH if i ≠ k;
(d) Mk ⊂ Mi⊥ if i ≠ k.
If the above conditions are satisfied, the subset of H defined by
{f ∈ H : there exists a sequence {fn} in ∪_{n=1}^∞ Mn such that
fn ∈ Mn for all n ∈ N, Σ_{n=1}^∞ fn is convergent, f = Σ_{n=1}^∞ fn}
is called the orthogonal sum of the sequence of subspaces {Mn} and is denoted by
the symbol Σ_{n=1}^{∞⊕} Mn.
If the above conditions are satisfied, then:
(e) RP = Σ_{n=1}^{∞⊕} Mn = V(∪_{n=1}^∞ Mn), and hence Σ_{n=1}^{∞⊕} Mn ∈ S(H);
(f) if β is a bijection from N onto N then Σ_{n=1}^∞ Pβ(n) f = Pf, ∀f ∈ H.
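Conditions c, d and statement f can be illustrated in a finite-dimensional model (our own code; pairwise-disjoint coordinate sets play the role of the mutually orthogonal subspaces Mn): mutually orthogonal coordinate projections sum, in any order, to the projection onto the orthogonal sum of their ranges.

```python
# Finite-dimensional sketch (our own notation) of conditions c, d and statement f.
def proj(S, f):
    return tuple(x if i in S else 0 for i, x in enumerate(f))

def add(f, g):
    return tuple(a + b for a, b in zip(f, g))

E = [{0}, {1, 2}, {3}]             # pairwise disjoint: Mk ⊂ Mi⊥ for i != k
f = (1.0, 2.0, 3.0, 4.0)

s = (0,) * 4
for Ek in E:
    s = add(s, proj(Ek, f))
# The sum is the projection onto the union (the orthogonal sum of the Mn):
assert s == proj({0, 1, 2, 3}, f)

# Reordering the summands (statement f, with a bijection beta) changes nothing:
t = (0,) * 4
for Ek in reversed(E):
    t = add(t, proj(Ek, f))
assert t == s
```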
(cf. 13.1.7c). By 12.4.3, this proves that the operator P is symmetric, and hence
self-adjoint since DP = H. We also have
(Pf|Pf) = Σ_{i=1}^∞ Σ_{k=1}^∞ (Pi f|Pk f) = Σ_{i=1}^∞ (Pi f|Pi f)
= Σ_{i=1}^∞ ‖Pi f‖² = (f|Pf), ∀f ∈ H,
and hence
(f|P²f) = (f|Pf), ∀f ∈ H,
since fk ∈ Mk ⊂ Mi⊥ = NPi if i ≠ k, and hence Pi fk = 0H if i ≠ k, while Pi fi = fi
holds for all i ∈ N (cf. 13.1.3b,c). In view of 13.1.3c, this proves the inclusion
Σ_{n=1}^{∞⊕} Mn ⊂ RP and hence the equation RP = Σ_{n=1}^{∞⊕} Mn. This equation implies
Σ_{n=1}^{∞⊕} Mn ∈ S(H) (cf. 13.1.4a). Next, the inclusion Σ_{n=1}^{∞⊕} Mn ⊂ V(∪_{n=1}^∞ Mn) is
obvious since V(∪_{n=1}^∞ Mn) is a subspace and it contains ∪_{n=1}^∞ Mn (also, cf. 2.3.4).
On the other hand, the inclusion Mk ⊂ Σ_{n=1}^{∞⊕} Mn is obvious for all k ∈ N; then
the inclusion V(∪_{n=1}^∞ Mn) ⊂ Σ_{n=1}^{∞⊕} Mn follows from 4.1.11c since Σ_{n=1}^{∞⊕} Mn is a
subspace.
Finally, if β is a bijection from N onto N, then the series Σ_{n=1}^∞ Pβ(n) f is convergent and
Σ_{n=1}^∞ Pβ(n) f = Σ_{n=1}^∞ Pn f = Pf, ∀f ∈ H,
by 10.4.9 since (Pi f|Pk f) = 0 if i ≠ k.
13.2.9 Corollary. For a finite family {M1 , ..., MN } of subspaces of H, the following
conditions are equivalent (we write Pn := PMn , ∀n ∈ {1, ..., N }):
(a) Σ_{n=1}^N Pn ∈ P(H);
(b) Pi Pk = OH if i ≠ k;
(c) Mk ⊂ Mi⊥ if i ≠ k.
If the above conditions are satisfied, the subset Σ_{n=1}^{N⊕} Mn of H defined by
Σ_{n=1}^{N⊕} Mn := M1 + ... + MN
(cf. 3.1.8) is called the orthogonal sum of the family of subspaces {M1, ..., MN} and
the following equations are true (we write P := Σ_{n=1}^N Pn):
(d) RP = Σ_{n=1}^{N⊕} Mn = V(∪_{n=1}^N Mn), and hence Σ_{n=1}^{N⊕} Mn ∈ S(H).
13.2.10 Remarks.
(a) If the conditions in 13.2.8 are satisfied then
Σ_{n=1}^{∞⊕} Mn = {f ∈ H : there exists a sequence {fn} in ∪_{n=1}^∞ Mn such that
fn ∈ Mn for all n ∈ N, Σ_{n=1}^∞ ‖fn‖² < ∞, f = Σ_{n=1}^∞ fn}.
This follows immediately from 10.4.7. If f ∈ Σ_{n=1}^{∞⊕} Mn then the sequence {fn}
such that fn ∈ Mn and f = Σ_{n=1}^∞ fn is unique. In fact, suppose that {gn} is
another sequence such that gn ∈ Mn and f = Σ_{n=1}^∞ gn. Then,
fk = Σ_{n=1}^∞ Pk fn = Pk Σ_{n=1}^∞ fn = Pk f = Pk Σ_{n=1}^∞ gn = Σ_{n=1}^∞ Pk gn = gk, ∀k ∈ N,
where we have used the continuity of Pk (cf. 13.1.3d) and the equations
Pk fn = δk,n fn and Pk gn = δk,n gn, ∀k, n ∈ N,
which follow from 13.1.3b,c.
Similarly, if the conditions in 13.2.9 are satisfied then, for f ∈ Σ_{n=1}^{N⊕} Mn, the
N-tuple {f1, ..., fN} such that fn ∈ Mn and f = Σ_{n=1}^N fn is unique.
(b) If the conditions in 13.2.8 are satisfied, the orthogonal projection P is called
the series of the sequence of projections {Pn} and is denoted by the symbol
Σ_{n=1}^∞ Pn, i.e. one writes Σ_{n=1}^∞ Pn := P. However, unless
(cf. 13.1.3d); this implies that the sequence {Σ_{k=1}^n Pk} is not a Cauchy sequence
in the normed space B(H), and hence that it is not convergent (cf. 2.6.2).
(c) From 13.2.8f we have that the projection Σ_{i∈I} Pi can be defined unambiguously
for any countable family {Pi}i∈I of projections such that Pi Pj = OH if i ≠ j.
Indeed, if I is denumerable, we define
Σ_{i∈I} Pi := Σ_{n=1}^∞ Pi(n)
E ↦ μ_f^Q(E) := (f|Q(E)f) (= ‖Q(E)f‖²)
g: For every f ∈ H, the function μ_f^{P0} has property af1 of 7.1.1 in view of result
μ_f^{P0}(∪_{n=1}^N En) = 0, ∀f ∈ H,
by 7.1.2b, and this implies P0(∪_{n=1}^N En) = OH.
(a) P is a p.v.m. on A;
(b) μ_f^P is a measure on A and μ_f^P(X) = ‖f‖², ∀f ∈ H;
(c) μ_u^P is a probability measure on A (i.e. μ_u^P is a measure on A and μ_u^P(X) = 1),
∀u ∈ H̃.
b ⇒ c: This is obvious.
c ⇒ a: Assuming condition c, for every finite and disjoint family {E1 , ..., En } of
elements of A we have
(u|P(∪_{k=1}^n Ek)u) = μ_u^P(∪_{k=1}^n Ek) = Σ_{k=1}^n μ_u^P(Ek)
= Σ_{k=1}^n (u|P(Ek)u) = (u|Σ_{k=1}^n P(Ek)u), ∀u ∈ H̃,
and hence P(∪_{k=1}^n Ek) = Σ_{k=1}^n P(Ek) by 10.2.12. We also have
(u|P(X)u) = μ_u^P(X) = 1 = (u|1H u), ∀u ∈ H̃,
c: If P(En) = OH then μ_f^P(En) = 0 for all f ∈ H, and hence
μ_f^P(∪_{n=1}^∞ En) = 0, ∀f ∈ H,
by 7.1.4a, and this implies P(∪_{n=1}^∞ En) = OH.
of elements of S s.t. E = ∪_{n=1}^N En
Proof. From 7.3.2 we have that, for every f ∈ H, there exists a measure μf on
A(A0) which is an extension of μ_f^{P0} and which is defined by
An ∩ Bl ∈ A0, ∀(n, l) ∈ N × N,
(Am ∩ Bk) ∩ (An ∩ Bl) = ∅ if (m, k) ≠ (n, l),
E ⊂ (∪_{n=1}^∞ An) ∩ (∪_{l=1}^∞ Bl) = ∪_{(n,l)∈N×N} (An ∩ Bl),
(Σ_{n=1}^∞ P0(An))(Σ_{l=1}^∞ P0(Bl))f = Σ_{n=1}^∞ Σ_{l=1}^∞ P0(An)P0(Bl)f
= Σ_{(n,l)∈N×N} P0(An ∩ Bl)f, ∀f ∈ H
(f|P(E)f) = μf(E) = μ_f^{P0}(E) = (f|P0(E)f), ∀f ∈ H, ∀E ∈ A0,
hence μ_f^{P̃} = μf for each f ∈ H by the uniqueness asserted in 7.3.2 for a σ-finite
premeasure (μ_f^{P0} is finite since μ_f^{P0}(X) = ‖f‖²). Therefore we have
(f|P̃(E)f) = μ_f^{P̃}(E) = μf(E) = (f|P(E)f), ∀f ∈ H, ∀E ∈ A(A0),
P (E) = P̃ (E), ∀E ∈ S.
Then P = P̃ .
instance) to A0(S) and μ_f^P is a measure on A(S) (note that A0(S) ⊂ A(S) since
A(S) = A(A0(S)), cf. 6.1.18). Then, 13.4.2 implies that there exists a unique
p.v.m. on A(A0(S)), i.e. on A(S), which extends P0. Therefore, P = P̃.
(q1) for every finite and disjoint family {E1, ..., En} of elements of S such that
∪_{k=1}^n Ek ∈ S,
Q(∪_{k=1}^n Ek) = Σ_{k=1}^n Q(Ek);
μ_f^Q; hence, condition q4 implies that μ_f^{P0} satisfies condition a of 7.1.6 (if E, F ∈ S
and F ⊂ E then μ_f^Q(F) ≤ μ_f^Q(E) since μ_f^Q is the restriction of an additive function,
cf. 7.1.2a); since μ_f^{P0}(X) = ‖f‖² < ∞, this implies that μ_f^{P0} is a premeasure (cf.
7.1.6). Then, 13.4.2 implies that there exists a unique p.v.m. P on A(A0(S))
which extends P0. Since A(A0(S)) = A(S) (cf. 6.1.18), P is a p.v.m. on A(S).
The uniqueness of P follows from 13.4.3.
13.5.2 Proposition. Let (X1 , A1 ) and (X2 , A2 ) be measurable spaces and let P be
a p.v.m. on the σ-algebra A1 ⊗ A2 (which is a σ-algebra on X1 × X2 , cf. 6.1.28).
Then the mappings
P1 : A1 → P(H)
E1 7→ P1 (E1 ) := P (E1 × X2 )
and
P2 : A2 → P(H)
E2 7→ P2 (E2 ) := P (X1 × E2 )
are projection valued measures, they commute, and
P1 (E1 )P2 (E2 ) = P (E1 × E2 ), ∀E1 ∈ A1 , ∀E2 ∈ A2
(recall that E1 × E2 ∈ A1 ⊗ A2 for all E1 ∈ A1 and E2 ∈ A2 , cf. 6.1.30a).
Proof. For every family {E1,i}i∈I of elements of A1 we have (∪_{i∈I} E1,i) × X2 =
∪_{i∈I} (E1,i × X2). For E1, F1 ∈ A1, if E1 ∩ F1 = ∅ then (E1 × X2) ∩ (F1 × X2) = ∅.
Then, it is obvious that P1 has the properties of a p.v.m. on A1 since P is a p.v.m.
on A1 ⊗ A2 . And similarly for P2 . From property 13.3.2d of P we have that P1 and
P2 commute. Finally, for all E1 ∈ A1 and E2 ∈ A2 ,
P1 (E1 )P2 (E2 ) = P (E1 × X2 )P (X1 × E2 )
= P ((E1 × X2 ) ∩ (X1 × E2 )) = P (E1 × E2 ),
by property 13.3.2c of P .
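The marginal construction of 13.5.2 can be sketched on finite sets (the code and names below are ours): identifying each projection with the set of points of X1 × X2 whose coordinates it keeps, composition of commuting projections becomes intersection of index sets, and P1(E1)P2(E2) = P(E1 × E2) is immediate.

```python
# Our own sketch on finite sets of the marginals of a product p.v.m.
X1, X2 = {"a", "b"}, {0, 1, 2}

def P(E):
    """The p.v.m. of the sketch: a subset E of X1 x X2 indexes a coordinate projection."""
    return set(E)

def P1(E1):
    """Marginal: P1(E1) := P(E1 x X2)."""
    return P({(x1, x2) for x1 in E1 for x2 in X2})

def P2(E2):
    """Marginal: P2(E2) := P(X1 x E2)."""
    return P({(x1, x2) for x1 in X1 for x2 in E2})

E1, E2 = {"a"}, {0, 2}
# Composition of commuting coordinate projections = intersection of index sets:
assert P1(E1) & P2(E2) == P({(x1, x2) for x1 in E1 for x2 in E2})
# and the marginals commute:
assert P1(E1) & P2(E2) == P2(E2) & P1(E1)
```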
13.5.3 Theorem. Let (X1 , d1 ) and (X2 , d2 ) be complete and separable metric
spaces, let P1 be a p.v.m. on the Borel σ-algebra A(d1 ), let P2 be a p.v.m. on
the Borel σ-algebra A(d2 ) (both P1 and P2 with values in P(H)), and suppose that
P1 and P2 commute. Then there exists a p.v.m. P on the σ-algebra A(d1 ) ⊗ A(d2 )
(which is the same as the Borel σ-algebra A(d1 × d2 ), where d1 × d2 denotes the
product distance on X1 × X2 , cf. 6.1.31, 2.7.1, 2.7.2) such that
P (E1 × E2 ) = P1 (E1 )P2 (E2 ), ∀E1 ∈ A(d1 ), ∀E2 ∈ A(d2 ).
The p.v.m. P is the unique p.v.m. on A(d1 ) ⊗ A(d2 ) such that
P (E1 × X2 ) = P1 (E1 ), ∀E1 ∈ A(d1 ), and P (X1 × E2 ) = P2 (E2 ), ∀E2 ∈ A(d2 ).
The p.v.m. P is called the product of P1 and P2 .
We can assume that the sets E1,k and E2,k are non-empty for all k ∈ {1, ..., n}.
Then we have Ep = ∪_{k=1}^n Ep,k for p = 1, 2. Since every σ-algebra is a semialgebra,
6.1.4 implies that, for p = 1, 2, there exists a finite and disjoint family {Fp,j}j∈Jp
of elements of A(dp) so that
∀k ∈ {1, ..., n}, ∃Jp,k ⊂ Jp such that Ep,k = ∪_{j∈Jp,k} Fp,j.
Then we have
E1,k × E2,k = ∪_{(i,j)∈J1,k×J2,k} (F1,i × F2,j), ∀k ∈ {1, ..., n}. (14)
Clearly, we can assume that Fp,j is non-empty for all j ∈ Jp and for p = 1, 2. Then,
the condition
(E1,k × E2,k) ∩ (E1,h × E2,h) = ∅ if k ≠ h
and 14 imply the condition
(J1,k × J2,k) ∩ (J1,h × J2,h) = ∅ if k ≠ h. (15)
Moreover, we can assume Ep = ∪_{j∈Jp} Fp,j for p = 1, 2 (if this is not already true,
we can replace Jp with ∪_{k=1}^n Jp,k). Then we have
∪_{(i,j)∈J1×J2} (F1,i × F2,j) = E1 × E2 = ∪_{k=1}^n (E1,k × E2,k)
= ∪_{k=1}^n ∪_{(i,j)∈J1,k×J2,k} (F1,i × F2,j). (16)
Now, the inclusion ∪_{k=1}^n (J1,k × J2,k) ⊂ J1 × J2 is obvious. For (i, j) ∈ J1 × J2, let
(x1, x2) ∈ F1,i × F2,j; then 16 implies that there exist k ∈ {1, ..., n} and (l, m) ∈
J1,k × J2,k so that (x1, x2) ∈ F1,l × F2,m; since the families {F1,j}j∈J1 and {F2,j}j∈J2
are disjoint, this forces i = l and j = m, and hence (i, j) ∈ J1,k × J2,k. Therefore
J1 × J2 = ∪_{k=1}^n (J1,k × J2,k).
q2 : Let E1 ×E2 and F1 ×F2 be elements of S such that (E1 ×E2 )∩(F1 ×F2 ) = ∅.
Then at least one of the two conditions
E1 ∩ F1 = ∅ and E2 ∩ F2 = ∅
is true, and hence (cf. 13.3.2a,c) at least one of the two conditions
P1 (E1 )P1 (F1 ) = P1 (E1 ∩ F1 ) = OH and P2 (E2 )P2 (F2 ) = P2 (E2 ∩ F2 ) = OH
is true, and hence
Q(E1 × E2 )Q(F1 × F2 ) = P1 (E1 )P2 (E2 )P1 (F1 )P2 (F2 )
= P1 (E1 )P1 (F1 )P2 (E2 )P2 (F2 ) = OH .
q3 : We have X1 × X2 ∈ S and Q(X1 × X2 ) = P1 (X1 )P2 (X2 ) = 1H 1H = 1H .
q4: We fix f ∈ H, E1 × E2 ∈ S, ε ∈ (0, ∞). For i = 1, 2, in view of the fact that
the measure μ_f^{Pi} is finite and the metric space (Xi, di) is complete and separable,
there exists a compact subset Fi of Ei such that μ_f^{Pi}(Ei) − μ_f^{Pi}(Fi) < ε/2.
Now, F1 × F2 ∈ S, F1 × F2 ⊂ E1 × E2, F1 × F2 is compact in the metric space
(X1 × X2, d1 × d2) (cf. 2.8.10), and hence F1 × F2 coincides with its closure (cf. 2.8.6). We have
E1 × E2 = (E1 × (E2 − F2 )) ∪ ((E1 − F1 ) × F2 ) ∪ (F1 × F2 ),
E1 − F1 ∈ A(d1 ),
E2 − F2 ∈ A(d2 );
then, by property q1 of Q already proved,
Q(E1 × E2 ) = Q(E1 × (E2 − F2 )) + Q((E1 − F1 ) × F2 ) + Q(F1 × F2 ),
and hence
μ_f^Q(E1 × E2) − μ_f^Q(F1 × F2) = μ_f^Q(E1 × (E2 − F2)) + μ_f^Q((E1 − F1) × F2);
moreover,
μ_f^Q(E1 × (E2 − F2)) = ‖P1(E1)P2(E2 − F2)f‖² ≤ ‖P2(E2 − F2)f‖²
= μ_f^{P2}(E2 − F2) = μ_f^{P2}(E2) − μ_f^{P2}(F2) < ε/2
and
μ_f^Q((E1 − F1) × F2) = ‖P1(E1 − F1)P2(F2)f‖² = ‖P2(F2)P1(E1 − F1)f‖²
≤ ‖P1(E1 − F1)f‖² = μ_f^{P1}(E1 − F1) = μ_f^{P1}(E1) − μ_f^{P1}(F1) < ε/2,
and therefore
|μ_f^Q(E1 × E2) − μ_f^Q(F1 × F2)| < ε.
Thus, the mapping Q satisfies all the conditions of 13.4.4, and hence there exists a
unique p.v.m. P on A(S) which is an extension of Q. Now, A(S) = A(d1 ) ⊗ A(d2 )
(cf. 6.1.30a).
Finally, suppose that P̃ is a p.v.m. on A(d1 ) ⊗ A(d2 ) such that
P̃ (E1 × X2 ) = P1 (E1 ), ∀E1 ∈ A(d1 ), and P̃ (X1 × E2 ) = P2 (E2 ), ∀E2 ∈ A(d2 ).
Then, in view of 13.3.2c,
P̃(E1 × E2) = P̃((E1 × X2) ∩ (X1 × E2)) = P̃(E1 × X2)P̃(X1 × E2)
= P1(E1)P2(E2) = Q(E1 × E2), ∀E1 ∈ A(d1), ∀E2 ∈ A(d2),
and hence P̃ is an extension of Q, and hence P̃ = P .
Our version of the spectral theorem for self-adjoint operators (cf. 15.2.1) relates
self-adjoint operators to projection valued measures on the Borel σ-algebra A(dR )
(which is a σ-algebra on R, cf. 6.1.22 and 2.1.4). In other books, this theorem is
often phrased so that it relates self-adjoint operators to spectral families. These
two versions of the spectral theorem are completely equivalent. In this section we
prove the equivalence of the notions of a spectral family and of a p.v.m. on A(dR ).
However, the results of this section are not needed in other parts of the present
book.
13.6.2 Proposition. Let P be a p.v.m. on the σ-algebra A(dR ). Then the mapping
T : R → P(H)
x 7→ T (x) := P ((−∞, x])
is a spectral family.
Proof. We prove that the mapping T has all the properties of a spectral family.
sf1 : This follows immediately from 13.3.2e.
sf2: We fix x ∈ R and f ∈ H. We have obviously
(−∞, x + 1/(n+1)] ⊂ (−∞, x + 1/n], ∀n ∈ N, and (−∞, x] = ∩_{n=1}^∞ (−∞, x + 1/n],
where the equalities hold by 13.1.7c since T(x + δn) − T(x) and T(x + 1/nε) − T(x)
are orthogonal projections (cf. sf1 and 13.2.4).
sf3: We fix f ∈ H. By 13.3.2a and 13.3.6b we have
0H = P(∅)f = lim_{n→∞} P((−∞, −n])f = lim_{n→∞} T(−n)f.
Now, let {xn} be a sequence in R such that xn → −∞ as n → ∞, and fix ε > 0. Let nε ∈ N
be such that
‖T(−nε)f‖ < ε
and let Nε ∈ N be such that
n > Nε ⇒ xn < −nε.
Then, for n > Nε we have T(xn) ≤ T(−nε) in view of property sf1, and hence
‖T(xn)f‖² = (f|T(xn)f) ≤ (f|T(−nε)f) = ‖T(−nε)f‖² < ε².
sf4: We fix f ∈ H. By property pvm2 of P and by 13.3.6a we have
f = P(R)f = lim_{n→∞} P((−∞, n])f = lim_{n→∞} T(n)f.
Now, let {xn} be a sequence in R such that xn → ∞ as n → ∞, and fix ε > 0. Let nε ∈ N
be such that
‖f − T(nε)f‖ < ε
and let Nε ∈ N be such that
n > Nε ⇒ xn > nε.
Then, for n > Nε we have T(nε) ≤ T(xn) in view of property sf1, and hence
‖f − T(xn)f‖² = (f|f − T(xn)f) ≤ (f|f − T(nε)f) = ‖f − T(nε)f‖² < ε²,
where the equalities hold by 13.1.7c because 1H − T(xn) and 1H − T(nε) are
orthogonal projections (cf. 13.1.3e).
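For a p.v.m. concentrated on finitely many points of R, the spectral family T(x) = P((−∞, x]) is a right-continuous step function, and properties sf1–sf4 can be checked directly. The discrete sketch below is our own code, identifying each projection with the index set of its range.

```python
# Our own discrete sketch: a p.v.m. concentrated on the points lam_i of R,
# with T(x) := P((-inf, x]) the projection onto the coordinates with lam_i <= x.
lam = [-1.0, 0.5, 0.5, 2.0]        # points carrying the p.v.m., one per coordinate

def T(x):
    """Index set of the range of P((-inf, x])."""
    return {i for i, l in enumerate(lam) if l <= x}

assert T(-2.0) == set()            # sf3: T(x) -> O_H as x -> -inf
assert T(3.0) == {0, 1, 2, 3}      # sf4: T(x) -> 1_H as x -> +inf
assert T(0.0) <= T(1.0)            # sf1: monotonicity
assert T(0.5) == T(0.5 + 1e-9)     # sf2: right-continuity at a jump point
assert T(0.5 - 1e-9) != T(0.5)     # but not left-continuity
assert T(2.0) - T(0.5) == {3}      # range of P((a, b]) = T(b) - T(a)
```

The last assertion is the discrete form of the link P((a, b]) = T(b) − T(a) used in the uniqueness argument of 13.6.3.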
13.6.3 Theorem. Let T be a spectral family. Then there exists a unique p.v.m. P
on the σ-algebra A(dR ) such that
T (x) = P ((−∞, x]), ∀x ∈ R.
= (f|Q((−∞, b])f), ∀b ∈ R;
μf((a, ∞)) = lim_{n→∞} Ff(n) − Ff(a) = (f|f) − Ff(a) = (f|Q((a, ∞))f), ∀a ∈ R
(we have lim_{n→∞} Ff(−n) = 0 by property sf3 of T and lim_{n→∞} Ff(n) = (f|f) by
property sf4 of T). Thus, we have
μf(E) = (f|Q(E)f), ∀E ∈ S.
Now we can prove that the mapping Q satisfies conditions q1 , q2 , q3 of 13.4.1.
q1: For every finite and disjoint family {E1, ..., En} of elements of S such that
∪_{k=1}^n Ek ∈ S, we have
(f|Q(∪_{k=1}^n Ek)f) = μf(∪_{k=1}^n Ek) = Σ_{k=1}^n μf(Ek)
= Σ_{k=1}^n (f|Q(Ek)f) = (f|Σ_{k=1}^n Q(Ek)f), ∀f ∈ H,
and hence
Q(∪_{k=1}^n Ek) = Σ_{k=1}^n Q(Ek),
by 10.2.12.
the restrictions of μf to A0(S) (note that A0(S) ⊂ A(dR) since A(dR) = A(S) =
A(A0(S)), cf. 6.1.25 and 6.1.18) owing to the uniqueness asserted in 7.3.1, since
the restrictions of μ_f^{P0} and of μf to S are the same (both of them are equal to
μ_f^Q). Hence, μ_f^{P0} is a premeasure on A0(S). Then, 13.4.2 implies that there exists
a unique p.v.m. P on A(A0(S)) = A(dR) which is an extension of P0. Thus, P is
also an extension of Q and we have in particular
P((−∞, x]) = Q((−∞, x]) = T(x), ∀x ∈ R.
Finally, suppose that P̃ is a p.v.m. on A(dR ) such that
P̃ ((−∞, x]) = T (x), ∀x ∈ R.
Then we have
P̃ ((−∞, a]) + P̃ ((a, ∞)) = P̃ (R) = 1H , ∀a ∈ R,
and hence
P̃ ((a, ∞)) = 1H − T (a), ∀a ∈ R.
We also have, for all a, b ∈ R so that a < b,
P̃ ((a, b]) = P̃ ((−∞, b] ∩ (a, ∞)) = P̃ ((−∞, b])P̃ ((a, ∞))
= T (b)(1H − T (a)) = T (b) − T (a)
(cf. 13.3.2c). This proves that
P̃ (E) = Q(E) = P (E), ∀E ∈ S,
and this implies P̃ = P by 13.4.3.
13.6.4 Remark. Some define a spectral family replacing “continuity from the
right” in sf2 with “continuity from the left” (defined in an obvious way). Clearly
the two definitions are not the same, but they are equivalent in the following sense:
the spectral theorem (in the formulation in which spectral families instead of pro-
jection valued measures are used) says that for any given self-adjoint operator there
exists a unique spectral family for each type (i.e. either continuous from the right
or continuous from the left) so that the operator “is the integral of the function ξ
(cf. 11.3.2) with respect to that family”. Actually, in order to prove the existence of
a spectral family which does this trick, one could dispense with condition sf2 altogether
and only require condition sf1 in the definition of a spectral family. However, the
spectral family (thus redefined) associated to a given self-adjoint operator would
not be unique. In a way, right continuity or left continuity are “normalization con-
ditions”. Obviously, a p.v.m. P on A(dR ) determines and is determined uniquely
by a “left continuous” spectral family T in a way similar to what was seen in 13.6.2
and in 13.6.3, and the link condition is
T (x) = P ((−∞, x)), ∀x ∈ R.
Chapter 14
The spectral theorems for unitary and for self-adjoint operators will be presented
in the next chapter. They consist in the representation of a unitary or a self-adjoint
operator as an integral with respect to a projection valued measure. In this chapter
we investigate the idea of an integral with respect to a projection valued measure
and study the properties of this kind of integral.
14.1.1 Theorem. There exists a unique mapping JˆP : MB (X, A) → B(H) such
that:
(a) JˆP (χE ) = P (E), ∀E ∈ A;
(b) JˆP is a linear operator;
(c) JˆP is continuous.
In addition, the following conditions are true:
(d) JˆP(ϕ1ϕ2) = JˆP(ϕ1)JˆP(ϕ2), ∀ϕ1, ϕ2 ∈ MB(X, A);
(e) JˆP(ϕ̄) = (JˆP(ϕ))†, ∀ϕ ∈ MB(X, A);
(f) (f|JˆP(ϕ)f) = ∫_X ϕ dμ_f^P, ∀f ∈ H, ∀ϕ ∈ MB(X, A);
(h) if A ∈ B(H) is so that AP (E) = P (E)A for all E ∈ A, then AJˆP (ϕ) = JˆP (ϕ)A
for all ϕ ∈ MB (X, A).
Proof. We begin with a preliminary remark. For n, m ∈ N, let {α1 , ..., αn } and
{β1 , ..., βm } be families of elements of C, let {E1 , ..., En } and {F1 , ..., Fm } be disjoint
families of elements of A, and suppose that
Σ_{k=1}^n αk χEk = Σ_{l=1}^m βl χFl.
The same proof as the one given in 8.1.1 (with µ replaced by P ) shows that
Σ_{k=1}^n αk P(Ek) = Σ_{l=1}^m βl P(Fl).
Now let ψ ∈ S(X, A). Then there are n ∈ N, a family {α1, ..., αn} of elements of C,
and a disjoint family {E1, ..., En} of elements of A so that ψ = Σ_{k=1}^n αk χEk. We
define the operator
A_ψ^P := Σ_{k=1}^n αk P(Ek),
is a linear operator. Moreover, for every ψ ∈ S(X, A) we have (cf. 13.3.2b, 13.2.9,
10.2.3)
‖A_ψ^P f‖² = Σ_{k=1}^n |αk|² ‖P(Ek)f‖² ≤ ‖ψ‖∞² Σ_{k=1}^n ‖P(Ek)f‖²
= ‖ψ‖∞² ‖P(∪_{k=1}^n Ek)f‖² ≤ ‖ψ‖∞² ‖f‖², ∀f ∈ H,
and hence ‖A_ψ^P‖ ≤ ‖ψ‖∞. This proves that
‖A_ψ^P‖ ≤ ‖ψ‖∞, ∀ψ ∈ S(X, A),
and hence that the linear operator A^P is bounded. Since S(X, A) is dense in
MB(X, A) and B(H) is a Banach space, by 4.2.6 there exists a unique bounded
(and hence continuous) linear operator
JˆP : MB(X, A) → B(H)
which is an extension of A^P, i.e. such that JˆP(ψ) = A_ψ^P for all ψ ∈ S(X, A), and
hence such that
JˆP(χE) = A_{χE}^P = P(E), ∀E ∈ A.
Then
ψ1ψ2 = Σ_{k=1}^n Σ_{l=1}^m αk βl χ_{Ek∩Fl},
whence JˆP(ψ1ψ2) = A_{ψ1ψ2}^P = A_{ψ1}^P A_{ψ2}^P = JˆP(ψ1)JˆP(ψ2).
Now, for ϕ1, ϕ2 ∈ MB(X, A) let {ψ1,n} and {ψ2,n} be sequences in S(X, A) such
that ϕ1 = lim_{n→∞} ψ1,n and ϕ2 = lim_{n→∞} ψ2,n (in the ‖ ‖∞ norm); then ϕ1ϕ2 =
lim_{n→∞} ψ1,nψ2,n in view of 4.3.3, and hence
JˆP(ϕ1ϕ2) (1)= lim_{n→∞} JˆP(ψ1,nψ2,n) = lim_{n→∞} JˆP(ψ1,n)JˆP(ψ2,n)
(2)= (lim_{n→∞} JˆP(ψ1,n))(lim_{n→∞} JˆP(ψ2,n)) (3)= JˆP(ϕ1)JˆP(ϕ2),
JˆP(ψ̄) = A_{ψ̄}^P = Σ_{k=1}^n ᾱk P(Ek) = (Σ_{k=1}^n αk P(Ek))† = (A_ψ^P)† = (JˆP(ψ))†.
(f|JˆP(ψ)f) = (f|A_ψ^P f) = Σ_{k=1}^n αk (f|P(Ek)f) = Σ_{k=1}^n αk μ_f^P(Ek) = ∫_X ψ dμ_f^P, ∀f ∈ H.
in view of conditions a and b, and hence J = Jˆµ by the uniqueness asserted in 4.2.6,
since S(X, A) is dense in MB(X, A) and both J and Jˆµ are continuous.
This shows that there exists a close analogy between the mappings JˆP and Jˆµ.
Owing to this analogy, for ϕ ∈ MB(X, A) the operator JˆP(ϕ) is called the integral
of ϕ with respect to P and it is often denoted as follows:
∫_X ϕ dP := JˆP(ϕ).
On the basis of the results of the previous section, in this section we extend the
notion of an integral with respect to a projection valued measure, to measurable
functions which are not necessarily bounded nor necessarily defined on the whole
of X.
As before, (X, A) denotes an abstract measurable space, H denotes an abstract
Hilbert space, and P denotes a projection valued measure on A with values in
P(H).
It is obvious that, if
Q(x) P-a.e. on E
then (cf. 7.1.9)
Q(x) μ_f^P-a.e. on E, ∀f ∈ H.
whence
|α|P -sup|ϕ| ≤ P -sup|αϕ|;
therefore,
P -sup|αϕ| = |α|P -sup|ϕ|.
For every ψ ∈ L∞ (X, A, P ) we have ϕ + ψ ∈ M(X, A, P ) and ϕψ ∈ M(X, A, P )
by 14.2.3a. Now, let E be as before and let F ∈ A be such that
P (F ) = OH and |ψ(x)| ≤ P -sup|ψ|, ∀x ∈ Dψ − F ;
then E ∪ F ∈ A and P (E ∪ F ) = OH (cf. 13.3.2h), and also
|ϕ(x) + ψ(x)| ≤ |ϕ(x)| + |ψ(x)| ≤ P -sup|ϕ| + P -sup|ψ|,
∀x ∈ (Dϕ − E) ∩ (Dψ − F ) = Dϕ+ψ − (E ∪ F ),
which proves that
ϕ + ψ ∈ L∞ (X, A, P ) and P -sup|ϕ + ψ| ≤ P -sup|ϕ| + P -sup|ψ|;
moreover
|ϕ(x)ψ(x)| = |ϕ(x)||ψ(x)| ≤ (P -sup|ϕ|)(P -sup|ψ|),
∀x ∈ (Dϕ − E) ∩ (Dψ − F ) = Dϕψ − (E ∪ F ),
which proves that
ϕψ ∈ L∞ (X, A, P ) and P -sup|ϕψ| ≤ (P -sup|ϕ|)(P -sup|ψ|).
It is obvious that
ϕ ∈ L∞ (X, A, P ) and P -sup|ϕ| = P -sup|ϕ|.
Finally, suppose that there exists E ∈ A such that E ≠ ∅ and P(E) = OH. Then
the family of functions L∞(X, A, P) is neither an associative algebra nor a linear space,
for the same reason why M(X, A, P) is not (cf. the proof of 8.2.2). Therefore,
the function L∞(X, A, P) ∋ ϕ ↦ P-sup|ϕ| ∈ R cannot be a norm. Moreover
χE ∈ MB(X, A), χE ≠ 0X, and P-sup|χE| = 0; this proves that the function
MB(X, A) ∋ ϕ ↦ P-sup|ϕ| ∈ R is not a norm.
and hence JˆP (ϕe ) = JˆP (ϕ′e ) (cf. 10.2.12). This proves that the mapping J˜P is
defined consistently. It is obvious that J˜P is an extension of JˆP .
Now we prove the conditions listed in the statement. For ϕ ∈ L∞ (X, A, P ), we
denote by ϕe an element of MB (X, A) such that ϕe (x) = ϕ(x) P -a.e. on Dϕ .
a: This follows at once from 14.1.1.a, since J˜P is an extension of JˆP .
b: Let α, β ∈ C and ϕ, ψ ∈ L∞ (X, A, P ). Then, αϕe + βψe ∈ MB (X, A) and
(αϕe + βψe )(x) = (αϕ + βψ)(x) P -a.e. on Dαϕ+βψ
by 14.2.3c, and hence (cf. 14.1.1b)
J˜P (αϕ + βψ) = JˆP (αϕe + βψe ) = αJˆP (ϕe ) + β JˆP (ψe ) = αJ˜P (ϕ) + β J˜P (ψ).
c: Let ϕ, ψ ∈ L∞ (X, A, P ). Then ϕe ψe ∈ MB (X, A) and
(ϕe ψe )(x) = (ϕψ)(x) P -a.e. on Dϕψ
by 14.2.3c, and hence (cf. 14.1.1d)
J˜P (ϕψ) = JˆP (ϕe ψe ) = JˆP (ϕe )JˆP (ψe ) = J˜P (ϕ)J˜P (ψ).
and hence J̃P(ϕ) = J̃P(ϕ′) by 10.2.12. Conversely, if ϕ, ϕ′ ∈ L∞(X, A, P) are such
that J̃P(ϕ) = J̃P(ϕ′), then (cf. conditions b and h)
P-sup|ϕ − ϕ′| = ‖J̃P(ϕ − ϕ′)‖ = ‖J̃P(ϕ) − J̃P(ϕ′)‖ = 0,
and hence (cf. 14.2.5)
ϕ(x) − ϕ′(x) = 0, i.e. ϕ(x) = ϕ′(x), P-a.e. on Dϕ−ϕ′ = Dϕ ∩ Dϕ′.
14.2.8 Remark. We denote by M (X, A, P ) the quotient set defined by the equiv-
alence relation ∼ in M(X, A, P ) (cf. 14.2.3.b). On the basis of 14.2.3a,c, it is easy
to see that M (X, A, P ) becomes an abelian associative algebra if we define
[ϕ] + [ψ] := [ϕ + ψ], ∀[ϕ], [ψ] ∈ M (X, A, P ),
α[ϕ] := [αϕ], ∀α ∈ C, ∀[ϕ] ∈ M (X, A, P ),
[ϕ][ψ] := [ϕψ], ∀[ϕ], [ψ] ∈ M (X, A, P )
(there is a close analogy between M (X, A, P ) and M (X, A, µ), cf. 8.2.13).
We can define a subset of M (X, A, P ) by
L∞ (X, A, P ) := {[ϕ] ∈ M (X, A, P ) : ϕ ∈ L∞ (X, A, P )}.
Indeed, if ϕ ∈ L∞ (X, A, P ) and ϕ′ ∈ [ϕ], let E ∈ A be such that
P (E) = OH and ∃m ∈ [0, ∞) s.t. |ϕ(x)| ≤ m, ∀x ∈ Dϕ − E,
and let F ∈ A be such that
P (F ) = OH and ϕ′ (x) = ϕ(x), ∀x ∈ (Dϕ′ ∩ Dϕ ) − F ;
then,
|ϕ′ (x)| ≤ m, ∀x ∈ ((Dϕ′ ∩ Dϕ ) − F ) ∩ (Dϕ − E) = Dϕ′ − ((X − Dϕ ) ∪ F ∪ E),
and this proves that ϕ′ ∈ L∞ (X, A, P ), in view of 13.3.2h. Thus, the condition
ϕ ∈ L∞ (X, A, P ) is actually a condition for the equivalence class [ϕ] even though
it is expressed through a particular element of the class. On the basis of 14.2.5, it
is easy to see that L∞ (X, A, P ) is a subalgebra (cf. 3.3.2) of M (X, A, P ), and that
it becomes a normed algebra if we define a norm by
k[ϕ]k := P -sup|ϕ|, ∀[ϕ] ∈ L∞ (X, A, P ).
14.2.10 Remarks.
(a) Let ϕ ∈ M(X, A, P ). For each n ∈ N, we define the set
En := |ϕ|−1 ([0, n]),
which is an element of A (cf. 6.2.17 and 6.2.13a with n := 3). It is obvious
that the sequence {χEn ϕ} is ϕ-convergent. This proves that the family of ϕ-
convergent sequences is not empty.
(b) If ψ ∈ L∞(X, A, P) then |ψ(x)| ≤ P-sup|ψ| μ_f^P-a.e. on Dψ for all f ∈ H
(cf. 14.2.5), and hence ψ ∈ L²(X, A, μ_f^P) for all f ∈ H (cf. 8.2.6). Thus, if
ϕ ∈ M(X, A, P) and {ϕn} is a ϕ-convergent sequence, then ϕn ∈ L²(X, A, μ_f^P)
for all f ∈ H and all n ∈ N.
(a) f ∈ DP(ϕ);
(b) the sequence {[ϕn]} is convergent in the Hilbert space L²(X, A, μ_f^P);
(c) the sequence {J̃P(ϕn)f} is convergent in the Hilbert space H.
If these conditions are satisfied, then:
(d) [ϕ] = lim_{n→∞} [ϕn] in the Hilbert space L²(X, A, μ_f^P);
(e) if {ϕ′n} is any ϕ-convergent sequence, then
lim_{n→∞} J̃P(ϕ′n)f = lim_{n→∞} J̃P(ϕn)f.
Proof. a ⇒ (b and d): Since the sequence {ϕn} is ϕ-convergent, for any f ∈ H we
have:
lim_{n→∞} |ϕn(x) − ϕ(x)|² = 0 μ_f^P-a.e. on ∩_{n=1}^∞ (Dϕ ∩ Dϕn) = ∩_{n=1}^∞ Dϕn−ϕ;
∃k1, k2 ∈ [0, ∞) such that
|ϕn(x) − ϕ(x)|² ≤ 2|ϕn(x)|² + 2|ϕ(x)|² ≤ 2(k1 + 1)|ϕ(x)|² + 2k2
μ_f^P-a.e. on Dϕn−ϕ, ∀n ∈ N
this implies
∫_X |ϕ|² dμ_f^P ≤ M
and hence
‖lim_{k→∞} J̃P(ϕk)f − J̃P(ϕ′n)f‖
≤ ‖lim_{k→∞} J̃P(ϕk)f − J̃P(ϕn)f‖ + ‖J̃P(ϕn)f − J̃P(ϕ′n)f‖ → 0 as n → ∞,
which is condition e.
Now, we fix f ∈ H and write gn := P(En)f for each n ∈ N. Then gn = J̃P(χEn)f
(cf. 14.2.7a), and hence (cf. 14.2.12)
μ_{gn}^P(E) = ∫_E χEn dμ_f^P, ∀E ∈ A,
and hence (cf. 8.3.4b and 8.1.17b)
∫_X |ϕ|² dμ_{gn}^P = ∫_X |ϕ|² χEn dμ_f^P ≤ n² ∫_X 1X dμ_f^P = n² μ_f^P(X) < ∞,
14.2.14 Theorem. Let ϕ ∈ M(X, A, P). Then there exists a unique linear operator
JϕP in H such that:
(a) D_{JϕP} = DP(ϕ);
(b) (f|JϕP f) = ∫_X ϕ dμ_f^P, ∀f ∈ DP(ϕ)
since αJ̃P(ϕn)f + βJ̃P(ϕn)g = J̃P(ϕn)(αf + βg), this implies that αf + βg ∈ DP(ϕ)
(cf. 14.2.11, c ⇒ a), and that
JϕP(αf + βg) = αJϕP f + βJϕP g.
This proves that JϕP is a linear operator.
Further we have, for every f ∈ DP(ϕ),
(f|JϕP f) (1)= lim_{n→∞} (f|J̃P(ϕn)f) (2)= lim_{n→∞} ∫_X ϕn dμ_f^P (3)= ∫_X ϕ dμ_f^P,
where 1 holds by 10.1.16c and 2 by 14.2.7e; as to 3, from [ϕ] = lim_{n→∞} [ϕn] in the
Hilbert space L²(X, A, μ_f^P) (cf. 14.2.11, a ⇒ d) we have
|∫_X ϕn dμ_f^P − ∫_X ϕ dμ_f^P| = |∫_X 1X(ϕn − ϕ) dμ_f^P|
≤ (∫_X 1X dμ_f^P)^{1/2} (∫_X |ϕn − ϕ|² dμ_f^P)^{1/2} → 0 as n → ∞
by the Schwarz inequality in L²(X, A, μ_f^P).
This proves that condition b is satisfied.
The uniqueness of the linear operator in H which satisfies conditions a and b
follows from 14.2.13 and 10.2.12.
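A discrete sketch of the operator JϕP (our own construction, not the book's): take X = {1, 2, 3, ...}, P({n}) the projection onto the n-th coordinate of ℓ², and the unbounded function ϕ(n) = n. Then f ∈ DP(ϕ) iff Σ n²|fn|² < ∞, and on the domain condition b reads (f|JϕP f) = Σ n|fn|². With exact rational arithmetic on a truncation:

```python
# Our own discrete sketch of J_phi^P with phi(n) = n on X = {1, 2, 3, ...}.
from fractions import Fraction

N = 200
f = [Fraction(1, n * n) for n in range(1, N + 1)]   # f_n = 1/n^2 (truncated)

# Condition b on the truncation, computed two ways:
lhs = sum(fn * (n * fn) for n, fn in enumerate(f, start=1))   # (f | J_phi f)
rhs = sum(n * fn * fn for n, fn in enumerate(f, start=1))     # int phi d mu_f
assert lhs == rhs

# The domain condition: sum n^2 |f_n|^2 = sum 1/n^2 stays below 2 (it increases
# to pi^2/6), so this f, extended to all of X, lies in D_P(phi):
assert sum(n * n * fn * fn for n, fn in enumerate(f, start=1)) < 2
```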
In what follows, we prove conditions d and e.
d: For every f ∈ DP(ϕ), we have
‖JϕP f‖² (4)= lim_{n→∞} ‖J̃P(ϕn)f‖² (5)= lim_{n→∞} ∫_X |ϕn|² dμ_f^P (6)= ∫_X |ϕ|² dμ_f^P,
where 4 holds by 4.1.6a and 5 by 14.2.7f; as to 6, from [ϕ] = lim_{n→∞} [ϕn] in the
Hilbert space L²(X, A, μ_f^P) we have ‖[ϕ]‖²_{L²(X,A,μ_f^P)} = lim_{n→∞} ‖[ϕn]‖²_{L²(X,A,μ_f^P)}.
e: We have:
f ∈ DP(ϕ) (7)⇒ {J̃P(ϕn)f} is convergent (8)⇒
[{AJ̃P(ϕn)f} is convergent and A lim_{n→∞} J̃P(ϕn)f = lim_{n→∞} AJ̃P(ϕn)f] (9)⇒
[{J̃P(ϕn)Af} is convergent and AJϕP f = lim_{n→∞} J̃P(ϕn)Af] (10)⇒
[Af ∈ DP(ϕ) and AJϕP f = JϕP Af],
where: 7 holds by 14.2.11 (a ⇒ c); 8 holds because A ∈ B(H); 9 holds by 14.2.7g;
10 holds by 14.2.11 (c ⇒ a). Since D_{AJϕP} = DP(ϕ), this proves condition e (cf.
3.2.3 and 3.2.4).
14.2.15 Theorem. For all ϕ ∈ M(X, A, P), the operator JϕP is adjointable and
(JϕP)† = Jϕ̄P.
Proof. For every ϕ ∈ M(X, A, P), 14.2.13 shows that the operator JϕP is adjointable.
Now, let ϕ ∈ M(X, A, P) and let {ϕn} be a ϕ-convergent sequence. The sequence
{ϕ̄n} is obviously ϕ̄-convergent, and hence (cf. 14.2.14c and 14.2.7d)
(Jϕ̄P f|g) = lim_{n→∞} (J̃P(ϕ̄n)f|g) = lim_{n→∞} (f|(J̃P(ϕ̄n))†g)
= lim_{n→∞} (f|J̃P(ϕn)g) = (f|JϕP g), ∀f ∈ DP(ϕ̄), ∀g ∈ DP(ϕ)
For each n ∈ N, define the set En := |ϕ|⁻¹([0, n]), which is an element of A (cf.
14.2.10a), and the vector fn := J̃P(χEn ϕ̄)g (note that χEn ϕ̄ ∈ L∞(X, A, P)); then
(cf. 14.2.12)
μ_{fn}^P(E) = ∫_E χEn |ϕ|² dμ_g^P, ∀E ∈ A,
and hence fn ∈ DP(ϕ). The sequence {χEn ϕ} is ϕ-convergent (cf. 14.2.10a), and
hence
JϕP fn (1)= lim_{k→∞} J̃P(χEk ϕ)J̃P(χEn ϕ̄)g (2)= J̃P(χEn |ϕ|²)g, ∀n ∈ N,
where 1 holds by 14.2.14c and 2 by 14.2.7c, since χEk χEn = χEn if k ≥ n. Then,
(h|fn) = (g|JϕP fn) = (g|J̃P(χEn |ϕ|²)g)
(3)= ∫_X χEn |ϕ|² dμ_g^P (4)= ‖J̃P(χEn ϕ̄)g‖² = ‖fn‖², ∀n ∈ N,
where 3 holds by 14.2.7e and 4 by 14.2.7f. Then the Schwarz inequality yields
‖fn‖² = (h|fn) ≤ ‖h‖‖fn‖, ∀n ∈ N,
and hence
‖fn‖ ≤ ‖h‖, ∀n ∈ N,
and hence
∫_X χEn |ϕ|² dμ_g^P = ‖fn‖² ≤ ‖h‖², ∀n ∈ N,
and hence g ∈ DP(ϕ). This proves the inclusion D_{(JϕP)†} ⊂ DP(ϕ̄), and hence that
(JϕP)† = Jϕ̄P.
Proof. Equivalence of a, b, c: We know that JϕP is closed (cf. 14.2.16) and that
DP (ϕ) = H (cf. 14.2.13). Then, the implication a ⇒ b is true by 4.4.4 and 2.3.9c,
and the implication b ⇒ a is true by 12.2.3. In view of this, the implications a ⇒ c
and b ⇒ c are obvious. The implications c ⇒ a and c ⇒ b are obvious.
a ⇒ d: We prove this by contraposition. We suppose that ϕ ∉ L∞(X, A, P).
Then we have
in fact, if we had
and hence
∫_X |ϕ|² dμ_{fn}^P = ∫_{Fn} |ϕ|² dμ_{fn}^P ≤ (kn + 1)² ∫_{Fn} 1X dμ_{fn}^P < ∞,
where 3 holds by 14.2.14d and 4 by 8.1.17b. Since fn ≠ 0H for all n ∈ N, this proves
that the operator JϕP is not bounded.
d ⇒ (a and e): If ϕ ∈ L∞ (X, A, P ), then ϕn := ϕ for each n ∈ N defines an
obviously ϕ-convergent sequence. In view of 14.2.14c, this proves that JϕP = J˜P (ϕ),
and therefore also that the operator JϕP is bounded.
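The equivalence just proved can be seen concretely in finite dimensions, where every ϕ is P-essentially bounded and the norm of the (then bounded) operator J̃P(ϕ) equals the largest |ϕ(x)| over the atoms of the p.v.m. A small sketch in our own notation:

```python
# Our own finite-dimensional sketch: J_P(phi) is diagonal multiplication by phi,
# and its operator norm is P-sup|phi| = max |phi(x)| over the atoms.
import math

phi = [3.0, -1.0, 2.0]            # phi on X = {0, 1, 2}

def apply(f):
    return [p * x for p, x in zip(phi, f)]

def norm(f):
    return math.sqrt(sum(x * x for x in f))

# The norm is attained at the basis vector of the largest |phi(x)|:
e0 = [1.0, 0.0, 0.0]
assert norm(apply(e0)) / norm(e0) == 3.0
# and no unit vector does better: ||J f|| <= max|phi| ||f||
f = [0.6, 0.8, 0.0]
assert norm(apply(f)) <= 3.0 * norm(f) + 1e-12
```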
D_{JP(ϕ)+JP(ψ)} = DP(ϕ) (2)= DP(ϕ + ψ),
where 2 holds because DP(ψ) = H (cf. 14.2.17).
Next, let {ϕn } be a ϕ-convergent sequence. Then {ϕn + ψ} is a (ϕ + ψ)-
convergent sequence since the condition
∃k1 , k2 ∈ [0, ∞) such that |ϕn (x)|2 ≤ k1 |ϕ(x)|2 + k2 P -a.e. on Dϕ ∩ Dϕn , ∀n ∈ N,
implies that there exist k1 , k2 ∈ [0, ∞) such that
$$\begin{aligned}
|\varphi_n(x)+\psi(x)|^2 &\overset{(3)}{\le} 2|\varphi_n(x)|^2 + 2|\psi(x)|^2 \le 2k_1|\varphi(x)|^2 + 2k_2 + 2|\psi(x)|^2 \\
&\overset{(4)}{\le} 4k_1|\varphi(x)+\psi(x)|^2 + 4k_1|\psi(x)|^2 + 2k_2 + 2|\psi(x)|^2 \\
&\le 4k_1|\varphi(x)+\psi(x)|^2 + (4k_1+2)(P\text{-}\sup|\psi|)^2 + 2k_2
\end{aligned}$$
$P$-a.e. on $D_\varphi \cap D_{\varphi_n} \cap D_\psi = D_{\varphi+\psi} \cap D_{\varphi_n+\psi}$, $\forall n \in \mathbb N$,
where 3 holds by inequality 2 in the proof of 10.3.7 and 4 holds by the inequality
(thereby derived)
|α|2 = |α + β − β|2 ≤ 2|α + β|2 + 2|β|2 , ∀α, β ∈ C.
This yields
$$\begin{aligned}
(J_P(\varphi)+J_P(\psi))f &\overset{(5)}{=} \lim_{n\to\infty}\tilde J_P(\varphi_n)f + \tilde J_P(\psi)f = \lim_{n\to\infty}(\tilde J_P(\varphi_n)+\tilde J_P(\psi))f \\
&\overset{(6)}{=} \lim_{n\to\infty}\tilde J_P(\varphi_n+\psi)f \overset{(7)}{=} J_P(\varphi+\psi)f, \quad \forall f \in D_P(\varphi),
\end{aligned}$$
where: 5 holds by 14.2.14c and 14.2.17e; 6 holds by 14.2.7b; 7 holds by 14.2.14c.
Proof. Let {ψn } be a ϕ-convergent sequence. Then limn→∞ [ψn ] = [ϕ] in the
Hilbert space $L^2(X,\mathcal A,\mu^P_f)$ (cf. 14.2.11), and hence
$$\|J_P(\varphi_n)f - J_P(\psi_n)f\|^2 \overset{(1)}{=} \|J_P(\varphi_n-\psi_n)f\|^2 \overset{(2)}{=} \int_X|\varphi_n-\psi_n|^2\,d\mu^P_f = \|[\varphi_n]-[\psi_n]\|^2_{L^2(X,\mathcal A,\mu^P_f)} \xrightarrow[n\to\infty]{} 0,$$
where 1 holds by 14.3.1 and 2 by 14.2.14d. Then,
$$\|J_P(\varphi_n)f - J_P(\varphi)f\| \le \|J_P(\varphi_n)f - J_P(\psi_n)f\| + \|J_P(\psi_n)f - J_P(\varphi)f\| \xrightarrow[n\to\infty]{} 0$$
since JP (ϕ)f = limn→∞ JP (ψn )f (cf. 14.2.14c and 14.2.17e).
Proof. We have
$$f \in D_P(\varphi)\cap D_P(\psi) \Rightarrow \varphi,\psi \in L^2(X,\mathcal A,\mu^P_f) \Rightarrow \varphi+\psi \in L^2(X,\mathcal A,\mu^P_f) \Rightarrow f \in D_P(\varphi+\psi),$$
and hence
$$[\varphi+\psi] = \lim_{n\to\infty}[\varphi_n+\psi_n] \quad \text{in the Hilbert space } L^2(X,\mathcal A,\mu^P_f).$$
Now, $J_P(\varphi_n+\psi_n) = \tilde J_P(\varphi_n+\psi_n) = \tilde J_P(\varphi_n) + \tilde J_P(\psi_n)$ (cf. 14.2.17e and 14.2.7b), and hence
$$J_P(\varphi+\psi)f = \lim_{n\to\infty}(\tilde J_P(\varphi_n)+\tilde J_P(\psi_n))f$$
(cf. 14.2.14c).
Moreover,
$$\tilde J_P(\psi_n) = \hat J_P(\psi_n) = \sum_{k=2}^{n2^n}\frac{k-1}{2^n}\,P\!\left(\varphi^{-1}\!\left(\left[\frac{k-1}{2^n},\frac{k}{2^n}\right)\right)\right) + n\,P(\varphi^{-1}([n,\infty])), \quad \forall n \in \mathbb N$$
(cf. 14.2.18).
We point out that, if ϕ ∈ L∞ (X, A, P ) is such that 0 ≤ ϕ(x) P -a.e. on Dϕ , then
for the sequence {ψn } considered above we have
P -sup|ψn − ϕ| → 0 as n → ∞
and hence
$$\mu^P_g(E) = \|P(E)J_P(\varphi)f\|^2 = \left\|P(E)\lim_{n\to\infty}\tilde J_P(\varphi_n)f\right\|^2 = \lim_{n\to\infty}\|P(E)\tilde J_P(\varphi_n)f\|^2 = \lim_{n\to\infty}\|\tilde J_P(\chi_E\varphi_n)f\|^2 = \lim_{n\to\infty}\int_X|\chi_E\varphi_n|^2\,d\mu^P_f$$
and
Thus we have
f ∈ DJP (ψ)JP (ϕ) ⇔ [f ∈ DP (ϕ) and f ∈ DP (ψϕ)],
or DJP (ψ)JP (ϕ) = DP (ϕ) ∩ DP (ψϕ).
Now we prove the part of the statement about the operators. We note that
from the part of the statement about the domains we have DJP (ψ)JP (ϕ) ⊂ DP (ψϕ).
Thus, we need to prove that JP (ψ)JP (ϕ)f = JP (ψϕ)f for all f ∈ DJP (ψ)JP (ϕ) .
First we assume ψ ∈ L∞ (X, A, P ). If {ϕn } is a ϕ-convergent sequence then
the sequence {ψϕn } is (ψϕ)-convergent, as can be seen easily. Hence for every
f ∈ DP (ϕ) we have f ∈ DJP (ψ)JP (ϕ) (in view of 14.2.17) and
$$J_P(\psi)J_P(\varphi)f \overset{(1)}{=} \tilde J_P(\psi)\lim_{n\to\infty}\tilde J_P(\varphi_n)f \overset{(2)}{=} \lim_{n\to\infty}\tilde J_P(\psi)\tilde J_P(\varphi_n)f \overset{(3)}{=} \lim_{n\to\infty}\tilde J_P(\psi\varphi_n)f \overset{(4)}{=} J_P(\psi\varphi)f;$$
where: 1 holds by 14.2.17e and 14.2.14c, since f ∈ DP (ϕ); 2 holds because J˜P (ψ)
is continuous; 3 holds by 14.2.7c; 4 holds by 14.2.14c, since f ∈ DP (ψϕ).
Next, let ψ be any element of M(X, A, P ), let {ψn } be a ψ-convergent sequence,
and let $f \in D_{J_P(\psi)J_P(\varphi)}$; this implies $f \in D_P(\varphi)\cap D_P(\psi\varphi)$, or $\varphi \in L^2(X,\mathcal A,\mu^P_f)$
and $\psi\varphi \in L^2(X,\mathcal A,\mu^P_f)$; since $\psi_n \in L^\infty(X,\mathcal A,P)$, we have also $\psi_n\varphi \in L^2(X,\mathcal A,\mu^P_f)$
for all $n \in \mathbb N$. Since $\{\psi_n\}$ is a $\psi$-convergent sequence, we have
$$\lim_{n\to\infty}|\psi_n(x)\varphi(x) - \psi(x)\varphi(x)|^2 = 0 \quad \mu^P_f\text{-a.e. on } D_{\psi\varphi}\cap\bigcap_{n=1}^{\infty}D_{\psi_n\varphi},$$
In view of 14.3.3 (recall that f ∈ DP (ψϕ) and f ∈ DP (ψn ϕ) for all n ∈ N), this
yields
lim JP (ψn ϕ)f = JP (ψϕ)f ;
n→∞
moreover, JP (ψn ϕ)f = JP (ψn )JP (ϕ)f in view of what was proved above (since
ψn ∈ L∞ (X, A, P ) and f ∈ DP (ϕ)), and hence
lim JP (ψn ϕ)f = lim JP (ψn )JP (ϕ)f = JP (ψ)JP (ϕ)f
n→∞ n→∞
Proof. This follows at once from 14.3.9, since ϕ ∈ L∞ (X, A, P ) entails DP (ϕ) = H
(cf. 14.2.17) and hence DJP (ψ)JP (ϕ) = DP (ψϕ).
where 1 holds by 14.2.7a and 14.2.17e, and 2 holds by 14.3.10; similarly, letting
ψn := χEn ψ, we have ψn ∈ L∞ (X, A, P ) and
Moreover, we have
and
or (note that $\varphi_n+\psi_n \in L^\infty(X,\mathcal A,P)$ implies $\varphi_n+\psi_n \in L^2(X,\mathcal A,\mu^P_f)$, cf. 14.2.10b)
Thus, we have constructed a sequence {gn } in DP (ϕ) ∩ DP (ψ) = DJP (ϕ)+JP (ψ)
which is such that
This implies f ∈ DJP (ϕ)+JP (ψ) (cf. 4.4.10). Since f was an arbitrary element of
DP (ϕ + ψ), we have
Proof. From 14.2.16 and 14.3.9 we have that JP (ψϕ) is a closed extension of
JP (ψ)JP (ϕ). Therefore, the operator JP (ψ)JP (ϕ) is closable (cf. 4.4.11b) and
(cf. 4.4.10)
$$\overline{J_P(\psi)J_P(\varphi)} \subset J_P(\psi\varphi).$$
Now we fix f ∈ DP (ψϕ). For each n ∈ N, we define the set
En := |ϕ|−1 ([0, n]),
which is an element of A, and we define the vector gn := P (En )f . In the proof of
14.2.13, we saw that f = limn→∞ gn and that gn ∈ DP (ϕ) for each n ∈ N. Letting
ϕn := χEn ϕ, we have ϕn ∈ L∞ (X, A, P ) and
$$J_P(\varphi)g_n \overset{(1)}{=} J_P(\varphi)J_P(\chi_{E_n})f \overset{(2)}{=} J_P(\varphi_n)f, \quad \forall n \in \mathbb N,$$
where 1 holds by 14.2.7a and 14.2.17e, and 2 holds by 14.3.10. Moreover, we have
$$\lim_{n\to\infty}|\psi(x)\varphi_n(x) - \psi(x)\varphi(x)|^2 = 0, \quad \forall x \in D_{\psi\varphi},$$
and
|ψ(x)ϕn (x) − ψ(x)ϕ(x)|2 ≤ 4|ψ(x)ϕ(x)|2 , ∀x ∈ Dψϕ , ∀n ∈ N;
then, by 8.2.11 (recall that $\psi\varphi \in L^2(X,\mathcal A,\mu^P_f)$ since $f \in D_P(\psi\varphi)$) we have
$$\lim_{n\to\infty}\int_X|\psi\varphi_n - \psi\varphi|^2\,d\mu^P_f = 0,$$
or ($|\psi(x)\varphi_n(x)| \le |\psi(x)\varphi(x)|$, $\forall x \in D_{\psi\varphi}$, implies $\psi\varphi_n \in L^2(X,\mathcal A,\mu^P_f)$)
This implies f ∈ DJP (ψ)JP (ϕ) (cf. 4.4.10). Since f was an arbitrary element of
DP (ψϕ), we have
DP (ψϕ) ⊂ DJP (ψ)JP (ϕ) ,
and hence
$$\overline{J_P(\psi)J_P(\varphi)} = J_P(\psi\varphi).$$
14.3.13 Remark. For every ϕ ∈ M(X, A, P ) and every α ∈ C − {0}, the operator
αJP (ϕ) is closed; this is true because αJP (ϕ) = JP (αϕ) (cf. 14.3.5 and 14.2.16), but
more generally because αA is closed for any closed operator A and every α ∈ C − {0}
(as can be seen easily). If α = 0 then αJP (ϕ) is closed iff ϕ ∈ L∞ (X, A, P ) (cf.
14.2.17, 14.2.13, 4.4.3, 4.4.4).
Proof. a: First we point out that ϕ−1 ({0}) ∈ ADϕ ⊂ A. Then, for f ∈ H we have
$$\begin{aligned}
f \in N_{J_P(\varphi)} &\overset{(1)}{\Leftrightarrow} [f \in D_P(\varphi) \text{ and } \|J_P(\varphi)f\| = 0] \Leftrightarrow \int_X|\varphi|^2\,d\mu^P_f = 0 \\
&\overset{(2)}{\Leftrightarrow} \varphi(x) = 0\ \mu^P_f\text{-a.e. on } D_\varphi \overset{(3)}{\Leftrightarrow} \mu^P_f(D_\varphi - \varphi^{-1}(\{0\})) = 0 \\
&\Leftrightarrow P(D_\varphi - \varphi^{-1}(\{0\}))f = 0_H \Leftrightarrow f = P(X)f \overset{(4)}{=} P(D_\varphi)f = P(\varphi^{-1}(\{0\}))f \\
&\overset{(5)}{\Leftrightarrow} f \in R_{P(\varphi^{-1}(\{0\}))},
\end{aligned}$$
where: 1 holds by definition of DP (ϕ) and by 14.2.14d; 2 holds by 8.1.18a; 3 holds
because ϕ−1 ({0}) ∈ ADϕ (cf. the last part of 7.1.10); 4 holds because P (X − Dϕ ) =
OH ; 5 holds by 13.1.3c.
b ⇔ c: This follows from a and from 3.2.6a.
c ⇔ d: This is true (by an argument similar to the argument used in the last
part of 7.1.10) because ϕ−1 ({0}) ∈ ADϕ .
e: We assume condition c and note that
$$D_{\frac{1}{\varphi}} = D_\varphi - \varphi^{-1}(\{0\});$$
then $D_{\frac{1}{\varphi}} \in \mathcal A$ and $X - D_{\frac{1}{\varphi}} = (X - D_\varphi)\cup\varphi^{-1}(\{0\})$, whence $P(X - D_{\frac{1}{\varphi}}) = O_H$
by 13.3.2h; in view of 6.2.17, this proves that $\frac{1}{\varphi} \in M(X,\mathcal A,P)$. Moreover,
$$\varphi(x)\frac{1}{\varphi}(x) = \frac{1}{\varphi}(x)\varphi(x) = 1, \quad \forall x \in D_{\frac{1}{\varphi}} = D_{\varphi\frac{1}{\varphi}} = D_{\frac{1}{\varphi}\varphi};$$
this implies $\varphi\frac{1}{\varphi}, \frac{1}{\varphi}\varphi \in L^\infty(X,\mathcal A,P)$ and hence (cf. 14.2.17e and 14.2.7a,i)
$$J_P\!\left(\varphi\frac{1}{\varphi}\right) = J_P\!\left(\frac{1}{\varphi}\varphi\right) = \tilde J_P(1_X) = P(X) = 1_H;$$
then, by 14.3.9,
$$J_P(\varphi)J_P\!\left(\frac{1}{\varphi}\right) \subset 1_H, \quad J_P\!\left(\frac{1}{\varphi}\right)J_P(\varphi) \subset 1_H,$$
$$D_{J_P(\varphi)J_P(1/\varphi)} = D_P\!\left(\frac{1}{\varphi}\right), \quad D_{J_P(1/\varphi)J_P(\varphi)} = D_P(\varphi);$$
by 1.2.16b, this implies $(J_P(\varphi))^{-1} = J_P\!\left(\frac{1}{\varphi}\right)$.
ρ(JP (ϕ)) = {λ ∈ C : JP (ϕ) − λ1H is injective and (JP (ϕ) − λ1H )−1 is bounded}
or equivalently
Now, JP (ϕ) − λ1H = JP (ϕ − λ) for all λ ∈ C (cf. the proof of 14.3.2); therefore, if
JP (ϕ) − λ1H is injective then (cf. 14.3.14)
$$\frac{1}{\varphi-\lambda} \in M(X,\mathcal A,P) \quad\text{and}\quad (J_P(\varphi)-\lambda 1_H)^{-1} = J_P\!\left(\frac{1}{\varphi-\lambda}\right),$$
and hence $R_{J_P(\varphi)-\lambda 1_H} = D_P\!\left(\frac{1}{\varphi-\lambda}\right)$, and hence $\overline{R_{J_P(\varphi)-\lambda 1_H}} = H$ (cf. 14.2.13).
Proof. a ⇒ b: We prove (not b)⇒(not a). Assuming condition (not b), there exists
ε ∈ (0, ∞) so that P (ϕ−1 (B(λ, ε))) = OH , and hence so that
$$\mu^P_f(\varphi^{-1}(B(\lambda,\varepsilon))) = \|P(\varphi^{-1}(B(\lambda,\varepsilon)))f\|^2 = 0, \quad \forall f \in H,$$
(cf. 14.3.2, 8.3.3a, 8.1.17b). By 4.2.3 and 14.4.1, this proves that λ ∈ ρ(JP (ϕ)), i.e.
condition (not a).
b ⇒ a: We prove (not a)⇒(not b). In view of 14.4.1, condition (not a) implies
that
and hence, in view of the equality JP (ϕ)− λ1H = JP (ϕ− λ) (cf. the proof of 14.3.2)
and of 14.3.14, that
$$\frac{1}{\varphi-\lambda} \in M(X,\mathcal A,P) \quad\text{and}\quad J_P\!\left(\frac{1}{\varphi-\lambda}\right) \text{ is bounded},$$
and hence, in view of 14.2.17, that
$$\exists m \in (0,\infty) \text{ such that } \left|\frac{1}{\varphi(x)-\lambda}\right| \le m \quad P\text{-a.e. on } D_{\frac{1}{\varphi-\lambda}};$$
(note that ϕ−1 (σ(JP (ϕ))) ∈ A by 10.4.6 and by 6.2.13c with G := KdC ), and hence
σ(JP (ϕ)) ≠ ∅.
Proof. For each λ ∈ C − σ(JP (ϕ)), 14.4.2 implies that there exists ε ∈ (0, ∞) such
that
P (ϕ−1 (B(λ, ε))) = OH ;
this condition implies
B(λ, ε) ⊂ C − σ(JP (ϕ));
indeed, if z ∈ B(λ, ε) then there exists η ∈ (0, ∞) such that B(z, η) ⊂ B(λ, ε), and
hence such that ϕ−1 (B(z, η)) ⊂ ϕ−1 (B(λ, ε)), and hence (cf. 13.3.2e) such that
P (ϕ−1 (B(z, η))) = OH ;
in view of 14.4.2, this implies z ∈ C − σ(JP (ϕ)).
Now, for each λ ∈ C − σ(JP (ϕ)) let ελ ∈ (0, ∞) be such that
P (ϕ−1 (B(λ, ελ ))) = OH .
Since $B(\lambda,\varepsilon_\lambda) \subset \mathbb C - \sigma(J_P(\varphi))$ for all $\lambda \in \mathbb C - \sigma(J_P(\varphi))$, we have obviously
$$\mathbb C - \sigma(J_P(\varphi)) = \bigcup_{\lambda\in\mathbb C-\sigma(J_P(\varphi))}B(\lambda,\varepsilon_\lambda).$$
Since (C, dC ) is a separable metric space (cf. 2.7.4a), by 2.3.18 there exists a count-
able subset {λn }n∈I of C − σ(JP (ϕ)) such that
$$\mathbb C - \sigma(J_P(\varphi)) = \bigcup_{n\in I}B(\lambda_n,\varepsilon_{\lambda_n}),$$
and hence such that
$$D_\varphi - \varphi^{-1}(\sigma(J_P(\varphi))) = \varphi^{-1}(\mathbb C - \sigma(J_P(\varphi))) = \bigcup_{n\in I}\varphi^{-1}(B(\lambda_n,\varepsilon_{\lambda_n}));$$
then (cf. 13.3.6c) $P(D_\varphi - \varphi^{-1}(\sigma(J_P(\varphi)))) = O_H$, or equivalently
$$P(\varphi^{-1}(\sigma(J_P(\varphi)))) = P(D_\varphi) = P(D_\varphi) + P(X - D_\varphi) = P(X) = 1_H.$$
Obviously, this implies $\sigma(J_P(\varphi)) \ne \emptyset$ (otherwise, $\varphi^{-1}(\sigma(J_P(\varphi))) = \emptyset$ and hence
$P(\varphi^{-1}(\sigma(J_P(\varphi)))) = O_H$).
Proof. If the operator JP (ϕ) is bounded then JP (ϕ) ∈ B(H) (cf. 14.2.17), and
hence σ(JP (ϕ)) is a bounded subset of C by 4.5.10.
Conversely, suppose that σ(JP (ϕ)) is bounded and let m ∈ [0, ∞) be such that
|z| ≤ m, ∀z ∈ σ(JP (ϕ));
then (cf. 14.4.4)
$$\begin{aligned}
\|J_P(\varphi)f\|^2 &= \int_{\varphi^{-1}(\sigma(J_P(\varphi)))}|\varphi|^2\,d\mu^P_f \le m^2\int_{\varphi^{-1}(\sigma(J_P(\varphi)))}1_X\,d\mu^P_f \\
&= m^2\int_X 1_X\,d\mu^P_f = m^2\mu^P_f(X) = m^2\|f\|^2, \quad \forall f \in D_P(\varphi),
\end{aligned}$$
and this proves that $J_P(\varphi)$ is bounded.
In view of this, the equivalence of conditions a and b follows directly from 4.5.7, and
so does the part of the statement about eigenspaces (for which, cf. also 13.1.3c).
In this section, (X, A, µ) stands for an abstract measure space. At variance with
what was done in Section 11.1, we denote the elements of L2 (X, A, µ) with the
letters f , g,....
For ϕ ∈ M(X, A, µ), we define the mapping from L2 (X, A, µ) to itself
Mϕ : DMϕ → L2 (X, A, µ)
[f ] 7→ Mϕ [f ] := [ϕf ],
with
$$D_{M_\varphi} := \{[f] \in L^2(X,\mathcal A,\mu) : \varphi f \in L^2(X,\mathcal A,\mu)\} = \left\{[f] \in L^2(X,\mathcal A,\mu) : \int_X|\varphi f|^2\,d\mu < \infty\right\}$$
(note that $\varphi f \in M(X,\mathcal A,\mu)$ for all $f \in L^2(X,\mathcal A,\mu)$, in view of 8.2.2). It is easy
to see that Mϕ is a linear operator (DMϕ is a linear manifold in L2 (X, A, µ) by
11.1.2a).
For E ∈ A, we write PE := MχE . We have:
$$D_{P_E} = L^2(X,\mathcal A,\mu) \ \text{ since } \int_X|\chi_Ef|^2\,d\mu \le \int_X|f|^2\,d\mu < \infty, \quad \forall [f] \in L^2(X,\mathcal A,\mu);$$
$$([f]|P_E[f]) = \int_X\chi_E|f|^2\,d\mu \in \mathbb R, \quad \forall [f] \in L^2(X,\mathcal A,\mu), \ \text{ hence } P_E = P_E^\dagger \text{ (cf. 12.4.3)};$$
$$P_E([f]) = [\chi_Ef] = [\chi_E^2f] = P_E^2[f], \quad \forall [f] \in L^2(X,\mathcal A,\mu), \ \text{ hence } P_E = P_E^2.$$
This proves that PE is a projection (cf. 13.1.5).
Now, we define the mapping
P : A → P(L2 (X, A, µ))
E 7→ P (E) := PE .
For all [f ] ∈ L2 (X, A, µ), we have:
$$\mu^P_{[f]}(X) = ([f]|P_X[f]) = \|[f]\|^2, \quad \forall [f] \in L^2(X,\mathcal A,\mu);$$
$$\mu^P_{[f]}(E) = ([f]|P_E[f]) = \int_X\chi_E|f|^2\,d\mu = \int_E|f|^2\,d\mu, \quad \forall E \in \mathcal A.$$
In view of 8.3.4a and 13.3.5, this proves that P is a projection valued measure on
A. If F ∈ A is such that µ(F) = 0, then $\mu^P_{[f]}(F) = 0$ for all $[f] \in L^2(X,\mathcal A,\mu)$ (cf.
8.3.4a) and hence $P(F) = O_{L^2(X,\mathcal A,\mu)}$. Therefore, M(X, A, µ) ⊂ M(X, A, P ).
For ϕ ∈ M(X, A, µ), we have
$$[f] \in D_{M_\varphi} \Leftrightarrow \int_X|\varphi|^2|f|^2\,d\mu < \infty \overset{(1)}{\Leftrightarrow} \int_X|\varphi|^2\,d\mu^P_{[f]} < \infty \Leftrightarrow [f] \in D_P(\varphi),$$
where 1 holds by 8.3.4b; moreover, we have
$$([f]|M_\varphi[f]) = \int_X\varphi|f|^2\,d\mu \overset{(2)}{=} \int_X\varphi\,d\mu^P_{[f]}, \quad \forall [f] \in D_{M_\varphi},$$
where 2 holds by 8.3.4c. This proves that Mϕ = JP (ϕ), by the uniqueness asserted
in 14.2.14.
Now we assume that the measure µ is σ-finite, i.e. that there exists a countable
family $\{E_n\}_{n\in I}$ of elements of $\mathcal A$ so that $X = \bigcup_{n\in I}E_n$ and $\mu(E_n) < \infty$ for all
$n \in I$ (this implies that $\chi_{E_n} \in L^2(X,\mathcal A,\mu)$ for all $n \in I$). If $F \in \mathcal A$ is such that
P (F ) = OL2 (X,A,µ) , then
$$\mu(F\cap E_n) = \int_X\chi_{F\cap E_n}\,d\mu = \int_X\chi_F|\chi_{E_n}|^2\,d\mu = ([\chi_{E_n}]|P(F)[\chi_{E_n}]) = 0, \quad \forall n \in I,$$
and hence
$$\mu(F) = \mu\!\left(\bigcup_{n\in I}(F\cap E_n)\right) \le \sum_{n\in I}\mu(F\cap E_n) = 0$$
(cf. 7.1.4a), and hence µ(F ) = 0. Now, let E be an element of A and, for each
x ∈ E, let Q(x) be a proposition. Then,
[Q(x) P -a.e. on E] is equivalent to [Q(x) µ-a.e. on E].
Thus, M(X, A, µ) = M(X, A, P ) and all the statements of Sects. 14.2, 14.3,
14.4 hold true with JP (ϕ) replaced by Mϕ , the projection valued measure P re-
placed by the measure µ, “P -a.e.” replaced by “µ-a.e.” (L∞ (X, A, µ) is defined as
L∞ (X, A, P ) was, with P replaced by µ).
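As a hedged numerical sketch (not from the book; the space and function are illustrative assumptions), the simplest σ-finite case makes the correspondence Mϕ = JP (ϕ) concrete: take X := {0, ..., 4} with the counting measure, so that L²(X, A, µ) is C⁵, Mϕ is the diagonal matrix diag(ϕ(0), ..., ϕ(4)), and P (E) is multiplication by χE:

```python
import numpy as np

# Hypothetical finite model: X = {0,...,4} with counting measure, so
# L^2(X, A, mu) is C^5 and M_phi acts as the diagonal matrix diag(phi(x)).
phi = np.array([0.5, -1.0, 2.0, 3.5, 2.0])
M_phi = np.diag(phi)

f = np.array([1.0, 2.0, 0.0, -1.0, 1.0])
assert np.allclose(M_phi @ f, phi * f)       # (M_phi f)(x) = phi(x) f(x)

# P(E) = M_{chi_E} is a projection; here E := phi^{-1}({2.0}).
P_E = np.diag((phi == 2.0).astype(float))
assert np.allclose(P_E @ P_E, P_E) and np.allclose(P_E.T, P_E)

# The eigenvalues of M_phi are exactly the values taken by phi.
assert np.allclose(np.sort(np.linalg.eigvalsh(M_phi)), np.sort(phi))
```

In this toy model the spectrum of Mϕ reduces to the finite set of values of ϕ, mirroring the essential-range description of σ(JP (ϕ)) in Section 14.4.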
In some cases, there are relations between integrals constructed with respect to two
different projection valued measures. In this section we examine two important
cases of this kind.
$$\mu^{P_2}_f(X_2) \overset{(1)}{=} \mu^{P_1}_f(D_\pi) = \mu^{P_1}_f(X_1) = \|f\|^2,$$
and hence DP1 (ϕ ◦ π) = DP2 (ϕ). Moreover, in view of 14.2.14b and 8.3.11c we have
$$(f|J_{P_2}(\varphi)f) = \int_{X_2}\varphi\,d\mu^{P_2}_f = \int_{X_1}(\varphi\circ\pi)\,d\mu^{P_1}_f = (f|J_{P_1}(\varphi\circ\pi)f), \quad \forall f \in D_{P_2}(\varphi).$$
This proves the equality JP1 (ϕ ◦ π) = JP2 (ϕ), in view of 14.2.13 and 10.2.12.
14.6.2 Theorem. Let H1 and H2 be isomorphic Hilbert spaces and suppose that
U ∈ UA(H1 , H2 ) (for UA(H1 , H2 ), cf. 10.3.15). Let (X, A) be a measurable space
and let P1 be a projection valued measure on A with values in P(H1 ). Then the
mapping
P2 : A → P(H2 )
E 7→ P2 (E) := U P1 (E)U −1
is a projection valued measure on A. We have M(X, A, P2 ) = M(X, A, P1 ) and,
for all ϕ ∈ M(X, A, P1 ),
$$J_{P_2}(\varphi) = UJ_{P_1}(\varphi)U^{-1} \quad \text{if } U \in \mathcal U(H_1,H_2),$$
$$J_{P_2}(\bar\varphi) = UJ_{P_1}(\varphi)U^{-1} \quad \text{if } U \in \mathcal A(H_1,H_2).$$
and hence
$$D_{P_2}(\bar\varphi) = D_{P_2}(\varphi) = \{f \in H_2 : U^{-1}f \in D_{P_1}(\varphi)\} = D_{J_{P_1}(\varphi)U^{-1}} = D_{UJ_{P_1}(\varphi)U^{-1}}.$$
Moreover, for every $f \in D_{P_2}(\varphi)(= D_{P_2}(\bar\varphi))$, in view of 14.2.14b and of 1 we have
(since $U^{-1}f \in D_{P_1}(\varphi) = D_{P_1}(\bar\varphi)$):
$$(f|J_{P_2}(\varphi)f) = \int_X\varphi\,d\mu^{P_2}_f = \int_X\varphi\,d\mu^{P_1}_{U^{-1}f} = \left(U^{-1}f\middle|J_{P_1}(\varphi)U^{-1}f\right) = \left(f\middle|UJ_{P_1}(\varphi)U^{-1}f\right)$$
if $U \in \mathcal U(H_1,H_2)$;
$$(f|J_{P_2}(\bar\varphi)f) = \int_X\bar\varphi\,d\mu^{P_2}_f = \overline{\left(U^{-1}f\middle|J_{P_1}(\varphi)U^{-1}f\right)} \overset{(2)}{=} \left(J_{P_1}(\varphi)U^{-1}f\middle|U^{-1}f\right) = \left(f\middle|UJ_{P_1}(\varphi)U^{-1}f\right),$$
Chapter 15
Spectral Theorems
The proof we give of the spectral theorem for unitary operators rests on the Fejér–
Riesz lemma which is proved in 15.1.2, on the Stone–Weierstrass theorem for the
unit circle proved in 4.3.7, on the Riesz–Markov theorem for positive linear func-
tionals proved in 8.5.3, and on the characterization of the family of bounded Borel
functions provided in 6.3.4.
We recall that P denotes the family of trigonometric polynomials on the unit
circle T, that P is a subalgebra of the associative algebra C(T), that C(T) = CB (T)
since the metric subspace (T, dT ) of the metric space (C, dC ) is compact, and hence
that C(T) is a normed algebra (cf. 4.3.6a,c). We note that obviously $\bar p \in \mathcal P$ for all
$p \in \mathcal P$.
Throughout this section, H denotes an abstract Hilbert space. We recall that
B(H) is a C ∗ -algebra (cf. 12.6.4).
15.1.2 Lemma (The Fejér–Riesz lemma). Let p ∈ P be such that 0 ≤ p(z) for
all z ∈ T. Then,
$\exists q \in \mathcal P$ such that $p = \bar q q$.
Proof. Since p ∈ P,
$\exists N \ge 0$, $\exists(\alpha_0,\alpha_1,\alpha_{-1},\ldots,\alpha_N,\alpha_{-N}) \in \mathbb C^{2N+1}$ so that
$$p(z) = \sum_{k=-N}^{N}\alpha_kz^k, \quad \forall z \in \mathbb T.$$
and hence
$$\sum_{k=-N}^{N}(\alpha_k - \bar\alpha_{-k})z^{k+N} = 0, \quad \forall z \in \mathbb T,$$
and hence
$$\alpha_k = \bar\alpha_{-k}, \quad \forall k \in \{0,\pm1,\ldots,\pm N\}.$$
This implies that both αN and α−N are non-zero (if one of them were zero then
the other one would be zero as well, and thus we should have |αN | + |α−N | = 0).
Therefore, zero cannot be a root of the polynomial P defined by
$$P(z) := \sum_{k=-N}^{N}\alpha_kz^{k+N}, \quad \forall z \in \mathbb C,$$
Since the set of the roots inside the unit circle must be the same on the two sides
of this equation and so must be their multiplicities (or, equivalently, since the fac-
torization of a polynomial with respect to its roots is unique), this implies that
$\{\lambda_i\}_{i\in I} = \{\bar\mu_j^{-1}\}_{j\in J}$, and hence the sets of indices $I$ and $J$ can be identified, and
also that $r_i = s_i$ for all $i \in I$, and hence $\sum_{i\in I}r_i = \sum_{i\in I}s_i = N$. Thus there exists
$(\nu_1,\ldots,\nu_N) \in \mathbb C^N$ (the components of this $N$-tuple are the roots of $P$ outside the
unit circle, each of them repeated as many times as its multiplicity) so that
$$P(z) = c\prod_{k=1}^{N}(z - \bar\nu_k^{-1})\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb C.$$
Now we suppose that p is not strictly positive, i.e. that there exists z ∈ T such that
p(z) = 0. For every n ∈ N, we define the trigonometric polynomial $p_n := p + \frac{1}{n}$ and
the polynomial
$$P_n(z) := \sum_{k=-N}^{N}\left(\alpha_k + \frac{1}{n}\,\delta_{0,k}\right)z^{k+N}, \quad \forall z \in \mathbb C.$$
Since pn (z) > 0 and pn (z) = z −N Pn (z) for all z ∈ T, proceeding as above we see
that
$$P_n(z) = c(n)\prod_{k=1}^{N}\left(z - \overline{\nu_k(n)}^{-1}\right)\prod_{k=1}^{N}(z - \nu_k(n)), \quad \forall z \in \mathbb C,$$
where c(n) ∈ C and the components of the N -tuple (ν1 (n), ..., νN (n)) are the roots
of Pn outside the unit circle, repeated as many times as their multiplicities. Since
the roots of a polynomial depend continuously on the coefficients of the polynomial
(cf. e.g. Horn and Johnson, 2013, th.D.1.), the sequence {νk (n)} converges to a
root νk of P for each k ∈ {1, ..., N }. Then we have
$$P(z) = \lim_{n\to\infty}P_n(z) = c\prod_{k=1}^{N}(z - \bar\nu_k^{-1})\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb C,$$
where c := limn→∞ c(n); indeed, the sequence {c(n)} is convergent since, for z0 ∈ C
such that P (z0 ) 6= 0, for n large enough we have
$$\prod_{k=1}^{N}\left(z_0 - \overline{\nu_k(n)}^{-1}\right)\prod_{k=1}^{N}(z_0 - \nu_k(n)) \ne 0$$
and
$$c(n) = P_n(z_0)\left(\prod_{k=1}^{N}\left(z_0 - \overline{\nu_k(n)}^{-1}\right)\prod_{k=1}^{N}(z_0 - \nu_k(n))\right)^{-1}.$$
Although it is not relevant for the present proof, we note that what we have just
seen proves that every root of P on the unit circle has even multiplicity.
Thus, as a consequence of the hypothesis $p(z) \ge 0$ for all $z \in \mathbb T$, there exist
$c \in \mathbb C$ and $(\nu_1,\ldots,\nu_N) \in \mathbb C^N$ so that
$$P(z) = c\prod_{k=1}^{N}(z - \bar\nu_k^{-1})\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb C,$$
and hence so that
$$p(z) = z^{-N}P(z) = c(-1)^N\prod_{k=1}^{N}\bar\nu_k^{-1}\prod_{k=1}^{N}(z^{-1} - \bar\nu_k)\prod_{k=1}^{N}(z - \nu_k) = c_2\prod_{k=1}^{N}\overline{(z - \nu_k)}\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb T,$$
with $c_2 \in \mathbb C$. Since there exists $z \in \mathbb T$ such that $p(z) > 0$, $c_2 > 0$ must be true.
Then, the trigonometric polynomial q defined by
$$q(z) := \sqrt{c_2}\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb T,$$
is such that $p = \bar q q$.
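The lemma can be checked numerically in a simple case. This sketch (an illustration, not part of the proof; the specific polynomial is an assumption) uses the nonnegative trigonometric polynomial p(z) = 2 + z + z⁻¹ on T, for which a Fejér–Riesz factor is q(z) = 1 + z:

```python
import numpy as np

# Sample the unit circle T and verify p = |q|^2 pointwise for
# p(z) = 2 + z + 1/z (which equals 2 + 2cos(theta) >= 0 on T) and q(z) = 1 + z.
z = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 361))
p = 2 + z + 1 / z
q = 1 + z
assert np.allclose(p.imag, 0.0)              # p is real on T
assert np.all(p.real >= -1e-12)              # p is nonnegative on T
assert np.allclose(p.real, np.abs(q) ** 2)   # p = q-bar q on T
```

Note that this example also exhibits the boundary behaviour discussed in the proof: p vanishes at z = −1, a root of q on the unit circle, which accordingly gives p a root of even multiplicity there.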
Proof. In view of 15.1.1a and 15.1.4, the mapping φ̂U is a bounded linear operator
from the normed space C(T) to the Banach space B(H). Since $\overline{\mathcal P} = C(\mathbb T)$ (cf. 4.3.7),
4.2.6 implies that there exists a unique linear operator φU : C(T) → B(H) which is
an extension of φ̂U and which is bounded, i.e. continuous. This proves that there
exists a unique mapping φU : C(T) → B(H) which has properties a, b, c.
Now we prove the additional properties of φU .
d: In view of 4.2.6d, the norm of the linear operator φU equals the norm of
the linear operator φ̂U . Now, 15.1.4 implies that the latter is not greater than one.
Thus, we have condition d (cf. 4.2.5b).
(A) There exists a unique projection valued measure P on the Borel σ-algebra A(dT )
on T (cf. 6.1.22; as before, dT denotes the restriction of the distance dC to
T × T), with values in P(H), such that U = JζP , where ζ is the function
defined by
ζ : T→T
z 7→ ζ(z) := z
and JζP is the operator defined in 14.2.14.
Equivalently, there exists a unique projection valued measure P on A(dT ), with
values in P(H), such that
$$(f|Uf) = \int_{\mathbb T}\zeta\,d\mu^P_f, \quad \forall f \in H.$$
(note that MB (T, A(dT )) ⊂ L1 (T, A(dT ), µf ) for all f ∈ H, in view of 8.2.6).
We want to prove that:
∀ϕ ∈ MB (T, A(dT )), ∃!Bϕ ∈ B(H) such that (f |Bϕ g) = ψϕ (f, g), ∀f, g ∈ H.
To this end, we define the family
V1 := {ϕ ∈ MB (T, A(dT )) : ψϕ is a bounded sesquilinear form}
(for a bounded sesquilinear form, cf. 10.1.1 and 10.5.4).
If ϕ ∈ C(T) then
$$\psi_\varphi(f,g) = \sum_{n=1}^{4}\frac{1}{4i^n}\left(f + i^ng\middle|\phi_U(\varphi)(f + i^ng)\right) \overset{(1)}{=} (f|\phi_U(\varphi)g), \quad \forall f,g \in H$$
Then, ϕ ∈ MB (T, A(dT )) (cf. 6.3.4a). Moreover, 8.2.11 (with the constant function
mT as dominating function) implies that
$$\int_{\mathbb T}\varphi\,d\mu_f = \lim_{n\to\infty}\int_{\mathbb T}\varphi_n\,d\mu_f, \quad \forall f \in H,$$
and hence
$$\psi_\varphi(f,g) = \lim_{n\to\infty}\sum_{k=1}^{4}\frac{1}{4i^k}\int_{\mathbb T}\varphi_n\,d\mu_{f+i^kg} \quad (2)$$
Now suppose also that ϕn ∈ V1 for all n ∈ N. Then 2 implies that ψϕ is a sesquilinear
form, since so is ψϕn for all n ∈ N. Moreover, for all u, v ∈ H such that kuk =
kvk = 1,
$$|\psi_\varphi(u,v)| \le \frac{1}{4}\sum_{n=1}^{4}\int_{\mathbb T}|\varphi|\,d\mu_{u+i^nv} \le \frac{m}{4}\sum_{n=1}^{4}\|u + i^nv\|^2 \le 4m$$
(for 3, cf. 8.3.5a with a1 := |α|2 , µ1 := µf , µk the null measure on A(dT ) for k > 1);
in view of the uniqueness asserted in 8.5.3, this proves that
µαf = |α|2 µf .
In particular, for every f ∈ H we have µf +in f = |1 + in |2 µf for n = 1, 2, 3, 4; thus,
µf −f is the null measure, µf +f = 4µf , µf +if = µf −if ;
this yields
$$\sum_{n=1}^{4}\frac{1}{4i^n}\int_{\mathbb T}\varphi\,d\mu_{f+i^nf} = \frac{1}{4}\,4\int_{\mathbb T}\varphi\,d\mu_f = \int_{\mathbb T}\varphi\,d\mu_f, \quad \forall\varphi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T})),$$
and hence
$$(f|B_\varphi f) = \psi_\varphi(f,f) = \int_{\mathbb T}\varphi\,d\mu_f, \quad \forall\varphi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T})).$$
Step 4: Suppose $\varphi \in F(\mathbb T)$ and that $\{\varphi_n\}$ is a sequence in $M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$ such
that $\varphi_n \xrightarrow{\text{ubp}} \varphi$. Then $\varphi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$ and
$$(f|B_\varphi g) = \psi_\varphi(f,g) = \lim_{n\to\infty}\psi_{\varphi_n}(f,g) = \lim_{n\to\infty}(f|B_{\varphi_n}g), \quad \forall f,g \in H.$$
This follows from what we saw in step 2.
Step 5: For every ϕ ∈ C(T) we have (cf. step 3)
$$(f|B_\varphi f) = \int_{\mathbb T}\varphi\,d\mu_f = (f|\phi_U(\varphi)f), \quad \forall f \in H,$$
whence Bϕ = φU (ϕ) by 10.2.12.
Step 6: For every ϕ ∈ MB (T, A(dT )) we have (cf. step 3)
$$\left(f\middle|B_\varphi^\dagger f\right) = \overline{(f|B_\varphi f)} = \overline{\int_{\mathbb T}\varphi\,d\mu_f} \overset{(4)}{=} \int_{\mathbb T}\bar\varphi\,d\mu_f = (f|B_{\bar\varphi}f), \quad \forall f \in H$$
where 8 holds (in view of step 4) because $\{\psi_n\varphi\}$ is a sequence in $M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$
such that $\psi_n\varphi \xrightarrow{\text{ubp}} \psi\varphi$; in fact, if $m \in [0,\infty)$ is such that $|\psi_n(z)| \le m$ for all $z \in \mathbb T$,
then $|(\psi_n\varphi)(z)| \le m\|\varphi\|_\infty$ for all $z \in \mathbb T$. Therefore, $B_\psi B_\varphi = B_{\psi\varphi}$.
Thus, V2 ⊂ F (T), C(T) ⊂ V2 , and V2 is ubp closed. Hence, MB (T, A(dT )) ⊂ V2 ,
or
Bψ Bϕ = Bψϕ , ∀ϕ ∈ C(T), ∀ψ ∈ MB (T, A(dT )).
This implies that
$$B_\varphi B_\psi \overset{(9)}{=} B_{\bar\varphi}^\dagger B_{\bar\psi}^\dagger \overset{(10)}{=} (B_{\bar\psi}B_{\bar\varphi})^\dagger = B_{\overline{\psi\varphi}}^\dagger \overset{(11)}{=} B_{\psi\varphi} = B_{\varphi\psi}, \quad \forall\varphi \in C(\mathbb T),\ \forall\psi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$$
(for 9 and 11, cf. step 6; 10 holds by 12.3.4b).
Now we define the family
V3 := {ϕ ∈ MB (T, A(dT )) : Bϕ Bψ = Bϕψ , ∀ψ ∈ MB (T, A(dT ))}.
The last thing proved implies C(T) ⊂ V3 .
Next, suppose $\varphi \in F(\mathbb T)$ and that $\{\varphi_n\}$ is a sequence in $V_3$ such that $\varphi_n \xrightarrow{\text{ubp}} \varphi$.
Then, proceeding exactly as above we have that, for all $\psi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$,
$$(f|B_\varphi B_\psi g) = \lim_{n\to\infty}(f|B_{\varphi_n}B_\psi g) = \lim_{n\to\infty}(f|B_{\varphi_n\psi}g) \overset{(12)}{=} (f|B_{\varphi\psi}g), \quad \forall f,g \in H,$$
where 12 holds (in view of step 4) because $\{\varphi_n\psi\}$ is a sequence in $M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$
such that $\varphi_n\psi \xrightarrow{\text{ubp}} \varphi\psi$. Therefore, $B_\varphi B_\psi = B_{\varphi\psi}$.
Thus, V3 ⊂ F (T), C(T) ⊂ V3 , and V3 is ubp closed. Hence, MB (T, A(dT )) ⊂ V3 ,
or
Bϕ Bψ = Bϕψ , ∀ψ ∈ MB (T, A(dT )), ∀ϕ ∈ MB (T, A(dT )).
Step 8: For every $E \in \mathcal A(d_{\mathbb T})$, we have $\chi_E \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$. Since $\overline{\chi_E} = \chi_E$,
we have $B_{\chi_E}^\dagger = B_{\chi_E}$ (cf. step 6). Since $\chi_E^2 = \chi_E$, we have $B_{\chi_E}^2 = B_{\chi_E}$ (cf. step 7).
Thus, BχE ∈ P(H) by 13.1.5.
Now, we define the mapping
P : A(dT ) → P(H)
E 7→ P (E) := BχE .
For every f ∈ H, we have (cf. step 3)
$$\mu^P_f(E) = (f|B_{\chi_E}f) = \int_{\mathbb T}\chi_E\,d\mu_f = \mu_f(E), \quad \forall E \in \mathcal A(d_{\mathbb T});$$
Now, we have
$$D_U = H = \left\{f \in H : \int_{\mathbb T}|\zeta|^2\,d\mu^P_f < \infty\right\}$$
since the function $\zeta$ is bounded and the measure $\mu^P_f$ is finite for all $f \in H$, and also
$$(f|Uf) \overset{(13)}{=} \left(f\middle|\hat\phi_U(\zeta)f\right) = (f|\phi_U(\zeta)f) \overset{(14)}{=} \int_{\mathbb T}\zeta\,d\mu_f = \int_{\mathbb T}\zeta\,d\mu^P_f, \quad \forall f \in H,$$
where 13 holds by the very definition of φ̂U and 14 by 15.1.5b. By the uniqueness
asserted in 14.2.14, this is equivalent to U = JζP .
This concludes the proof that a projection valued measure P exists as in the
statement.
Step 9: Here we prove that the projection valued measure as in the statement is
unique. To this end, suppose that Q is a projection valued measure on A(dT ), with
values in P(H), such that
$$(f|Uf) = \int_{\mathbb T}\zeta\,d\mu^Q_f, \quad \forall f \in H.$$
Since the equality $D_U = \left\{f \in H : \int_{\mathbb T}|\zeta|^2\,d\mu^Q_f < \infty\right\}$ is obvious, this is equivalent to
$$J_\zeta^Q = U.$$
In view of 14.2.17e (and of the fact that the mapping J˜Q is an extension of the
mapping JˆQ , cf. 14.2.7), this can be written as
JˆQ (ζ) = U.
Now, in view of 14.1.1b,d,e (and of the fact that $\zeta^{-1} = \bar\zeta$ and $U^\dagger = U^{-1}$, cf.
12.5.1b), this implies that
JˆQ (p) = φ̂U (p), ∀p ∈ P.
Moreover, the mapping JˆQ is continuous (cf. 14.1.1c), and so is its restriction to
C(T). Then, the uniqueness asserted in 15.1.5 implies that
JˆQ (ϕ) = φU (ϕ), ∀ϕ ∈ C(T),
and hence (cf. 14.1.1f)
$$\int_{\mathbb T}\varphi\,d\mu^Q_f = \left(f\middle|\hat J_Q(\varphi)f\right) = (f|\phi_U(\varphi)f) = \int_{\mathbb T}\varphi\,d\mu_f, \quad \forall\varphi \in C(\mathbb T),\ \forall f \in H;$$
and hence
$$(f|AB_\varphi g) = \left(A^\dagger f\middle|B_\varphi g\right) = \lim_{n\to\infty}\left(A^\dagger f\middle|B_{\varphi_n}g\right) = \lim_{n\to\infty}(f|AB_{\varphi_n}g) = \lim_{n\to\infty}(f|B_{\varphi_n}Ag) = (f|B_\varphi Ag), \quad \forall f,g \in H,$$
whence ABϕ = Bϕ A.
Thus, V4 ⊂ F (T), C(T) ⊂ V4 , and V4 is ubp closed. Hence, MB (T, A(dT )) ⊂ V4 ,
or
ABϕ = Bϕ A, ∀ϕ ∈ MB (T, A(dT )).
Then, in particular
AP (E) = ABχE = BχE A = P (E)A, ∀E ∈ A(dT ).
The spectral theorem for self-adjoint operators is deduced from the spectral theorem
for unitary operators, by means of the Cayley transform (in both its incarnations,
as an operator and as a function).
Throughout this section, H denotes an abstract Hilbert space.
P 7→ JξP
is a bijection from the family of all projection valued measures on the Borel σ-
algebra A(dR ), with values in P(H), onto the family of all self-adjoint operators in
H. Indeed, 14.3.17 proves that the operator JξP is self-adjoint for every projection
valued measure P on A(dR ). Further, 15.2.1A proves that, for every self-adjoint
operator A, there exists one and only one projection valued measure P on A(dR )
such that A = JξP .
For a given projection valued measure P on A(dR ), we sometimes denote by AP
the operator JξP . Conversely, for a given self-adjoint operator we always denote by
P A the unique projection valued measure P on A(dR ) which is so that A = JξP , and
we call it the projection valued measure of A. Thus, the bijection discussed above
is defined by
P 7→ AP ,
while its inverse is the bijection from the family of all self-adjoint operators onto
the family of all projection valued measures on A(dR ) which is defined by
A 7→ P A .
For every self-adjoint operator $A$, from the more general results of Chapter 14 we
have the following results, since $A = J_\xi^{P^A}$:
For convenience, we collect the results (also the ones already known before this
chapter) for the spectrum and the point spectrum of a self-adjoint operator in the
next two theorems, after defining two numbers which are of great importance in
quantum mechanics.
Moreover,
(e) $N_{A-\lambda 1_H} = R_{P^A(\{\lambda\})}$, $\forall\lambda \in \mathbb R$;
thus, if λ ∈ σp (A) then P A ({λ}) is the projection onto the corresponding eigenspace.
The next theorem can be proved directly (cf. e.g. Simmons, 1963, Chapter 11).
Instead, we deduce it from the results proved in this section.
Proof. We know that σ(A) 6= ∅ (cf. 15.2.2d). Now let λ ∈ σ(A). Since σ(A) =
Apσ(A) (cf. 12.4.21b) and since every linear operator in H is bounded (cf. 10.8.3B),
we have that the operator A − λ1H is not injective, i.e. that λ ∈ σp (A). This proves
that σp (A) is a non-empty set, and also (in view of 4.5.8) that
σp (A) = σ(A).
In view of 12.4.20B, σp (A) must be a finite set: if it were not, then by choosing an
element of NA−λ1H ∩ H̃ for each λ ∈ σp (A) we could construct a non-finite o.n.s.
in H and hence (cf. 10.7.3) there would exist a non-finite c.o.n.s. in H, contrary to
the hypothesis that H is finite-dimensional. Thus, we can write
{λ1 , ..., λN } := σp (A).
In view of 15.2.5 we have
$P_n := P^A(\{\lambda_n\}) \ne O_H$, $\forall n \in \{1,\ldots,N\}$.
Moreover, we have
$P_iP_j = P^A(\{\lambda_i\})P^A(\{\lambda_j\}) = O_H$ if $i \ne j$
(cf. 13.3.2b) and also
$$\sum_{n=1}^{N}P_n = \sum_{n=1}^{N}P^A(\{\lambda_n\}) = P^A(\sigma_p(A)) = P^A(\sigma(A)) = 1_H$$
(cf. 15.2.2d). Finally, we note that DA = H in view of 10.8.3B and 12.4.7, and that
Pn f = P A ({λn })f ∈ NA−λn 1H and hence APn f = λn Pn f, ∀f ∈ H
(cf. 15.2.5e). This yields
$$Af = \sum_{n=1}^{N}AP_nf = \sum_{n=1}^{N}\lambda_nP_nf, \quad \forall f \in H,$$
or $A = \sum_{n=1}^{N}\lambda_nP_n$.
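The finite-dimensional decomposition A = Σₙ λₙPₙ can be checked numerically; this is a hedged sketch (the matrix is an illustrative assumption, not from the text) that builds the spectral projections of a Hermitian matrix from an orthonormal eigenbasis:

```python
import numpy as np

# Illustrative self-adjoint operator on C^3 (a Hermitian matrix).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
eigvals, eigvecs = np.linalg.eigh(A)     # orthonormal eigenbasis

# Group eigenvectors by (rounded) eigenvalue and build the projections P_n.
decomp = {}
for lam, v in zip(np.round(eigvals, 10), eigvecs.T):
    decomp.setdefault(lam, []).append(v)
projections = {lam: sum(np.outer(v, v) for v in vs) for lam, vs in decomp.items()}

# The P_n sum to the identity and reconstruct A as sum lambda_n P_n.
total = sum(projections.values())
assert np.allclose(total, np.eye(3))
A_rebuilt = sum(lam * P for lam, P in projections.items())
assert np.allclose(A_rebuilt, A)
```

Grouping by eigenvalue (rather than taking rank-one projections per eigenvector) mirrors the theorem's Pₙ := Pᴬ({λₙ}), which projects onto the full eigenspace of each distinct eigenvalue.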
(cf. 15.3.2a,b), and hence χE (A) = P A (E) (cf. 10.2.12). Obviously, this equation
cannot be used for the construction of the projection valued measure P A by means
of A, since it is actually based on the previous existence of P A .
15.3.4 Examples.
(A) We set (X, A, µ) := (R, A(dR ), m) in the discussion of Section 14.5; we recall
that m denotes the Lebesgue measure on R. Thus, L2 (R, A, µ) = L2 (R). The
projection valued measure P of Section 14.5 is now defined on A(dR ) and
we define the operator Q := JξP , which is a self-adjoint operator in L2 (R).
This operator is denoted by Q since in non-relativistic quantum mechanics it
(B) Suppose that H is a separable Hilbert space and let A be a self-adjoint operator
in H. In view of 12.4.20C, σp (A) is a countable set and hence σp (A) ∈ A(dR ).
The following conditions are equivalent:
(a) P A (R − σp (A)) = OH , or equivalently P A (σp (A)) = 1H ;
(b) there exists a family {(λn , Pn )}n∈I , with I = {1, ..., N } or I = N, so that
$\lambda_n \in \mathbb R$, $P_n \in \mathcal P(H)$, $P_n \ne O_H$, $\forall n \in I$,
$\lambda_i \ne \lambda_j$ and $P_iP_j = O_H$ if $i \ne j$,
$\sum_{n\in I}P_nf = f$, $\forall f \in H$,
$D_A = \left\{f \in H : \sum_{n\in I}\lambda_n^2\|P_nf\|^2 < \infty\right\}$,
$Af = \sum_{n\in I}\lambda_nP_nf$, $\forall f \in D_A$
(we note that, if $I = \mathbb N$, the series $\sum_{n\in I}\lambda_nP_nf$ is convergent for all
$f \in D_A$, in view of 13.2.8d and 10.4.7b).
Indeed, if condition a is true then we define $\{\lambda_n\}_{n\in I} := \sigma_p(A)$, with the condition
$\lambda_i \ne \lambda_j$ if $i \ne j$ and with $I := \{1,\ldots,N\}$ or $I := \mathbb N$ as the case may be,
and Pn := P A ({λn }) for each n ∈ I. Then we have:
$\lambda_n \in \mathbb R$, $P_n \in \mathcal P(H)$, $P_n \ne O_H$ (cf. 15.2.5c), $\forall n \in I$;
$P_iP_j = O_H$ if $i \ne j$ and $\sum_{n\in I}P_nf = P^A(\sigma_p(A))f = f$, $\forall f \in H$
and hence
$$D_A = \left\{f \in H : \int_{\mathbb R}\xi^2\,d\mu^{P^A}_f < \infty\right\} = \left\{f \in H : \sum_{n\in I}\lambda_n^2\|P_nf\|^2 < \infty\right\},$$
by 15.2.2a and 8.3.8. Thus, if $I = \mathbb N$, the series $\sum_{n\in I}\lambda_nP_nf$ is convergent for
all $f \in D_A$, and, for either $I = \{1,\ldots,N\}$ or $I = \mathbb N$, we can define the mapping
all f ∈ DA , and, for either I = {1, ..., N } or I = N, we can define the mapping
$$B : D_A \to H, \quad f \mapsto Bf := \sum_{n\in I}\lambda_nP_nf$$
i.e. $N_{A-\lambda_k1_H} = R_{P_k}$ (cf. 13.1.3c). Since $P_n \ne O_H$ for all $n \in I$, this proves
that
Therefore,
$$f = \sum_{n\in I}P^A(\{\lambda_n\})f = P^A(\sigma_p(A))f, \quad \forall f \in H,$$
Jn := {j ∈ J : µj = λn }
$$(f|\varphi(A)f) = \int_{\mathbb R}\varphi\,d\mu^{P^A}_f = \sum_{n\in I}\varphi(\lambda_n)\|P_nf\|^2 = \left(f\,\middle|\,\sum_{n\in I}\varphi(\lambda_n)P_nf\right), \quad \forall f \in D_{\varphi(A)};$$
since the mapping $D_{\varphi(A)} \ni f \mapsto \sum_{n\in I}\varphi(\lambda_n)P_nf \in H$ is obviously a linear
operator (its definition is consistent by 10.4.7b), in view of 10.2.12 this implies that
$$\varphi(A)f = \sum_{n\in I}\varphi(\lambda_n)P_nf, \quad \forall f \in D_{\varphi(A)}.$$
(C) If the Hilbert space H is finite-dimensional then 15.2.8 proves that condition b
of example B holds true for every self-adjoint operator A in H. Then condition
a holds true as well (this was also seen directly in the proof of 15.2.8), and so
does condition c. Thus, for every self-adjoint operator A in a finite-dimensional
Hilbert space H there exists a c.o.n.s. in H whose elements are eigenvectors of
A.
(D) Let M be a subspace of H. The mapping
$$P : \mathcal A(d_{\mathbb R}) \to \mathcal P(H), \quad E \mapsto P(E) := \chi_E(0)P_{M^\perp} + \chi_E(1)P_M$$
is a projection valued measure in view of 13.3.5. Indeed, for every $f \in H$, $\mu^P_f$
is the measure $\mu$ defined in 8.3.8 with
$$I := \{1,2\}, \quad x_1 := 0, \quad x_2 := 1, \quad a_1 := (f|P_{M^\perp}f), \quad a_2 := (f|P_Mf);$$
moreover, this entails $\mu^P_f(\mathbb R) = a_1 + a_2 = \|f\|^2 - \|P_Mf\|^2 + \|P_Mf\|^2 = \|f\|^2$.
The operator $A^P$ is the projection $P_M$ since (cf. 8.3.8 and 15.2.2a,b)
$$\int_{\mathbb R}\xi^2\,d\mu^P_f = \int_{\mathbb R}\xi\,d\mu^P_f = 0a_1 + 1a_2 = (f|P_Mf),$$
and hence
$$\left\{f \in H : \int_{\mathbb R}\xi^2\,d\mu^P_f < \infty\right\} = H = D_{P_M}$$
and
$$\int_{\mathbb R}\xi\,d\mu^P_f = (f|P_Mf), \quad \forall f \in H.$$
Then,
$$p(A) = \sum_{k=0}^{N}\alpha_kA^k \quad (\text{we define } A^0 := 1_H).$$
Let $q$ be a non-trivial polynomial, i.e. there exist $M \ge 1$ and $(\beta_0,\beta_1,\ldots,\beta_M) \in \mathbb C^{M+1}$
with $\beta_M \ne 0$ so that
$$q = \sum_{i=0}^{M}\beta_i\xi^i.$$
If the roots of q are not elements of σ_p(A), then 1/q ∈ M(R, A(d_R), P^A) (where 1/q is defined as in 1.2.19), the operator Σ_{i=0}^{M} β_i A^i is injective and
(1/q)(A) = ( Σ_{i=0}^{M} β_i A^i )^{−1}.
If, further, the roots of q are not elements of σ(A), then (letting p/q := p·(1/q))
(p/q)(A) = ( Σ_{k=0}^{N} α_k A^k ) ( Σ_{i=0}^{M} β_i A^i )^{−1}.
since obviously D_{A^{k+1}} ⊂ D_{A^k} for all k ∈ N. Now, it is easy to prove that there exists a bounded interval I so that
(1/2)|α_N| |x|^N ≤ |p(x)|, ∀x ∈ R − I;
therefore,
f ∈ D_{p(A)} ⇒ p ∈ L²(R, A(d_R), µ_f^{P^A}) ⇒ ξ^N ∈ L²(R, A(d_R), µ_f^{P^A}) ⇒ f ∈ D_{ξ^N(A)} = D_{A^N}.
This proves the first part of the statement. In what follows, we prove the second
part.
If the roots of q are not elements of σ_p(A), then (cf. 15.2.5c)
P^A(q^{−1}({0})) = O_H
and hence (cf. 14.3.14)
the operator q(A) is injective, 1/q ∈ M(R, A(d_R), P^A), (1/q)(A) = (q(A))^{−1};
now, q(A) = Σ_{i=0}^{M} β_i A^i in view of the first part of the statement.
To prove the last part of the statement, let {λ_1, ..., λ_M} be the roots of q (each value is repeated as many times as its multiplicity); then
1/q = β_M^{−1} (1/(ξ − λ_1)) ··· (1/(ξ − λ_M)).
Now, suppose λi 6∈ σ(A) for all i ∈ {1, ..., M }. Then, for each i ∈ {1, ..., M },
the operator A − λi 1H is injective and the operator (A − λi 1H )−1 is bounded (cf.
12.4.21b, 4.5.2, 4.5.3); moreover, A − λi 1H = (ξ − λi )(A) (cf. the first part of the
statement); then, 14.3.14 and 14.2.17 imply that
1/(ξ − λ_i) ∈ L^∞(R, A(d_R), P^A).
Thus, 1/q ∈ L^∞(R, A(d_R), P^A) (cf. 14.2.5) and hence, in view of 14.3.10,
(p/q)(A) = J_{P^A}(p·(1/q)) = J_{P^A}(p) J_{P^A}(1/q) = p(A) (1/q)(A),
or
(p/q)(A) = ( Σ_{k=0}^{N} α_k A^k ) ( Σ_{i=0}^{M} β_i A^i )^{−1},
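The identity (p/q)(A) = p(A) q(A)^{−1} can be checked numerically. The sketch below (our own illustration, with C^3 standing in for H and p(x) = x² + 1, q(x) = x − 5, whose root 5 is not in σ(A)) compares the functional calculus, i.e. applying p/q to the eigenvalues, with the matrix expression p(A) q(A)^{−1}.

```python
import numpy as np

rng = np.random.default_rng(1)
Q = np.linalg.qr(rng.standard_normal((3, 3)))[0]
A = Q @ np.diag([1.0, 2.0, 3.0]) @ Q.T        # self-adjoint, sigma(A) = {1, 2, 3}

lam, V = np.linalg.eigh(A)
pq_A = V @ np.diag((lam**2 + 1) / (lam - 5)) @ V.T   # (p/q)(A) via functional calculus

I = np.eye(3)
p_A = A @ A + I                               # p(A) = A^2 + 1_H
q_A_inv = np.linalg.inv(A - 5 * I)            # q(A)^{-1}, exists since 5 is not in sigma(A)
assert np.allclose(pq_A, p_A @ q_A_inv)
```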
0 ≤ (f|Bf), ∀f ∈ D_B, and A = B².
ψ(C) = ξ²(C) = A.
X_1 := R, A_1 := A(d_R), P_1 := P^C, X_2 := R, A_2 := A(d_R), π := ψ
now,
and hence
by 13.3.2c and the equality P^C([0, ∞)) = 1_H. Thus, P^B = P^C and hence (cf. 15.2.2)
B = J_ξ^{P^B} = J_ξ^{P^C} = C.
σ(ϕ(A)) = ϕ(σ(A)).
15.4.1 Theorem. Let H1 and H2 be isomorphic Hilbert spaces and suppose that
U ∈ UA(H1 , H2 ). Let A1 and A2 be self-adjoint operators in H1 and in H2 respec-
tively. Then the following conditions are equivalent:
(a) P A2 (E) = U P A1 (E)U −1 , ∀E ∈ A(dR );
(b) A2 = U A1 U −1 .
Proof. a ⇒ b: This follows immediately from 14.6.2.
b ⇒ a: We define the mapping
Q : A(d_R) → P(H_2)
E ↦ Q(E) := U P^{A_1}(E) U^{−1}.
Then (cf. 14.6.2) Q is a projection valued measure on A(d_R) and
J_ξ^Q = U J_ξ^{P^{A_1}} U^{−1} = U A_1 U^{−1} = A_2.
Thus, Q = P^{A_2} by the definition of P^{A_2}, and hence condition a holds.
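The equivalence in 15.4.1 can be sketched numerically. In the following illustration of ours (C^3 stands in for both Hilbert spaces, U is a real orthogonal matrix, and `spectral_projection` is a helper name we introduce), A_2 = U A_1 U^{−1} implies P^{A_2}(E) = U P^{A_1}(E) U^{−1} for E = [0, ∞).

```python
import numpy as np

rng = np.random.default_rng(2)
Q = np.linalg.qr(rng.standard_normal((3, 3)))[0]
A1 = Q @ np.diag([-1.0, 0.5, 2.0]) @ Q.T      # sigma(A1) = {-1, 0.5, 2}
U = np.linalg.qr(rng.standard_normal((3, 3)))[0]   # a unitary (here real orthogonal)
A2 = U @ A1 @ U.T

def spectral_projection(A, E):
    """P^A(E): sum of eigenprojections of A whose eigenvalue satisfies predicate E."""
    lam, V = np.linalg.eigh(A)
    cols = V[:, [bool(E(l)) for l in lam]]
    return cols @ cols.T

E = lambda x: x >= 0.0                        # the Borel set E = [0, infinity)
P1 = spectral_projection(A1, E)
P2 = spectral_projection(A2, E)
assert np.allclose(P2, U @ P1 @ U.T)          # P^{A2}(E) = U P^{A1}(E) U^{-1}
```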
Chapter 16
The subject of this chapter is fundamental for quantum mechanics. Indeed, con-
tinuous one-parameter unitary groups and Stone’s theorem are the mathematical
basis for the description of time evolution of conservative and reversible quantum
systems (cf. Section 19.6).
Moreover, if G is a Lie group which is considered to be a symmetry group for
a quantum system, then a continuous one-parameter unitary group is found to be
associated with each element of the Lie algebra of G, and the generators of these
one-parameter groups are self-adjoint operators which are interpreted as observables
representing the elements of the Lie algebra. However, this topic is outside the scope
of this book (cf. e.g. Thaller, 1992, 2.3.1).
Throughout this section, H denotes an abstract Hilbert space. We recall that U(H)
denotes the group of unitary operators in H (cf. 10.3.9 and 10.3.10).
Obviously, if X = C then these definitions agree with the ones given in 1.2.21 (cf.
2.7.6 and 2.4.2).
We also have
lim_{n→∞} ( (1/t_n)(ϕ_{t_n}(x) − 1) − ix ) = 0, ∀x ∈ R
(since (d/dt) e^{itx} |_{t=0} = ix, ∀x ∈ R), and
| (1/t_n)(ϕ_{t_n}(x) − 1) − ix | ≤ | (1/t_n)(e^{i t_n x} − 1) | + |x| ≤ 2|x|, ∀x ∈ R, ∀n ∈ N
(we have used the inequality |e^{iα} − 1| ≤ |α|, ∀α ∈ R). Since ξ ∈ L²(R, A(d_R), µ_f^{P^A}) for all f ∈ D_A, by 8.2.11 (with 4ξ² as dominating function) we have
lim_{n→∞} || (1/t_n)(U_f(t_n) − U_f(0)) − iAf || = 0, ∀f ∈ D_A.
16.1.7 Remark. For every self-adjoint operator A in H, 16.1.6 and 16.1.5e show that the mapping U^A defined by
R ∋ t ↦ U^A(t) := ϕ_t(A) ∈ U(H)
is a c.o.p.u.g. and that it is the only c.o.p.u.g. U in H which satisfies with A the condition
(sa-ug) the mapping U_f is differentiable at 0 and (dU_f/dt)|_0 = iAf, ∀f ∈ D_A.
We point out that, for each t ∈ R, U^A(t) is the unique linear operator in H such that D_{U^A(t)} = H and
( f | U^A(t) f ) = ∫_R ϕ_t dµ_f^{P^A}, ∀f ∈ H
(a) Let λ ∈ R. Then, the operator B := A + λ1_H is self-adjoint and the following conditions are true:
P^B(E) = P^A(E − λ), ∀E ∈ A(d_R)
(we recall that E − λ := {x − λ : x ∈ E}, cf. 9.2.1a);
U^B(t) = e^{iλt} U^A(t), ∀t ∈ R.
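The relation U^B(t) = e^{iλt} U^A(t) for B := A + λ1_H can be verified numerically. In this sketch of ours, C^3 stands in for H and U^A(t) = ϕ_t(A) is realized by applying e^{it(·)} to the eigenvalues (the helper `U_group` is our own name).

```python
import numpy as np

def U_group(A, t):
    """U^A(t) = e^{itA}, computed via the spectral decomposition of Hermitian A."""
    lam_, V = np.linalg.eigh(A)
    return V @ np.diag(np.exp(1j * t * lam_)) @ V.conj().T

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3))
A = (M + M.T) / 2                             # self-adjoint
lam, t = 0.7, 1.3
B = A + lam * np.eye(3)                       # B := A + lambda * 1_H
U_A, U_B = U_group(A, t), U_group(B, t)
assert np.allclose(U_B, np.exp(1j * lam * t) * U_A)   # U^B(t) = e^{i lam t} U^A(t)
```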
(b) Let µ ∈ R − {0}. Then, the operator C := µA is self-adjoint and the following
conditions are true:
B = ψ_λ(A).
X_1 := R, A_1 := A(d_R), P_1 := P^A, X_2 := R, A_2 := A(d_R), π := ψ_λ,
and hence P_2 = P^B,
to obtain
C = γ_µ(A).
X_1 := R, A_1 := A(d_R), P_1 := P^A, X_2 := R, A_2 := A(d_R), π := γ_µ,
and hence P_2 = P^C,
to obtain
(a) D = H;
(b) U (t)f ∈ D, ∀f ∈ D, ∀t ∈ R;
(c) the mapping Uf is differentiable at 0, ∀f ∈ D.
is differentiable at 0 and its derivative at 0 is the number (U(0)f|f); hence, for any sequence {t_n} in R − {0} such that t_n → 0, we have
||f||² = (U(0)f|f) = lim_{n→∞} (1/t_n) ∫_0^{t_n} (U(x)f|f) dx = lim_{n→∞} (B_{t_n} f|f), (3)
where 11 holds because U (t0 ) ∈ B(H) and 12 holds because U (t0 )Ct = Ct U (t0 ) for
all t ∈ R − {0}. Finally we note that, for f ∈ H,
DA = {f ∈ H : Uf is differentiable at 0};
16.1.11 Remarks.
(a) The mapping
A ↦ U^A
(cf. 16.1.7) is a surjection, and hence a bijection from the family of all self-adjoint operators in H onto the family of all c.o.p.u.g.'s in H.
For a c.o.p.u.g. U, the self-adjoint operator A such that U = U^A is called the generator of U.
(b) For every self-adjoint operator A, it is obvious that the mapping
The theorem we present in this section determines when the results of the previous
section can be expressed entirely within the Banach algebra structure of B(H).
moreover, we have
|| U^A(t) − Σ_{k=0}^{n} (1/k!) (it)^k A^k || (10)= || J_{P^A}(ϕ_t) − Σ_{k=0}^{n} (1/k!) (it)^k J_{P^A}(ξ^k) ||
(11)= || J̃_{P^A}( ϕ_t − Σ_{k=0}^{n} (1/k!) (it)^k ξ^k ) ||
(12)= P^A-sup | ϕ_t − Σ_{k=0}^{n} (1/k!) (it)^k ξ^k |
(13)≤ s_n(t), ∀n ∈ N,
where 10 holds by 15.3.5, 11 holds by 14.2.17e and 14.2.7b since 3 implies that ξ ∈ L^∞(R, A(d_R), P^A), 12 holds by 14.2.7h, 13 holds in view of 3. In view of 9, this proves that
|| U^A(t) − Σ_{k=0}^{n} (1/k!) (it)^k A^k || → 0 as n → ∞, ∀t ∈ R,
i.e. condition c.
b ⇒ d: We assume condition b and fix m ∈ (0, ∞) as above. Let t0 ∈ R and
let {tn } be any sequence in R − {0} such that tn → 0. Proceeding as before (using
16.1.5c, which now reads U A (t0 )A = AU A (t0 ) since A ∈ B(H), and also 4.2.9 and
the equation kU (t0 )k = 1) we see that
|| (1/t_n)(U^A(t_0 + t_n) − U^A(t_0)) − iA U^A(t_0) ||
≤ || (1/t_n)(U^A(t_n) − 1_H) − iA || (14)= || J̃_{P^A}( (1/t_n)(ϕ_{t_n} − 1_R) − iξ ) ||
= P^A-sup | (1/t_n)(ϕ_{t_n} − 1_R) − iξ |, ∀n ∈ N.
Next we fix ε > 0; since (d/ds) e^{is} |_{s=0} = i, there exists δ_ε > 0 such that
0 < s < δ_ε ⇒ | (1/s)(e^{is} − 1) − i | < ε/m;
now let N_ε ∈ N be such that
n > N_ε ⇒ |t_n| < δ_ε/m;
then, for every x ∈ [−m, m] such that x ≠ 0, we have
| (1/t_n)(e^{i t_n x} − 1) − ix | = |x| | (1/(t_n x))(e^{i t_n x} − 1) − i | < |x| (ε/m) ≤ ε for n > N_ε;
hence, in view of 3 (also, note that (1/t_n)(e^{i t_n x} − 1) − ix = 0 for x = 0), we have
n > N_ε ⇒ P^A-sup | (1/t_n)(ϕ_{t_n} − 1_R) − iξ | < ε,
and hence, in view of 14,
n > N_ε ⇒ || (1/t_n)(U^A(t_0 + t_n) − U^A(t_0)) − iA U^A(t_0) || < ε.
16.2.2 Remark. From 12.6.1 we have that the restriction of the mapping A 7→ U A
(cf. 16.1.7 and 16.1.11a) to the family of all bounded self-adjoint operators is a
bijection from this family onto the family of all norm-continuous c.o.p.u.g.’s.
16.3.1 Theorem. Let H1 and H2 be isomorphic Hilbert spaces and suppose that
V ∈ UA(H1 , H2 ). Let A1 and A2 be self-adjoint operators in H1 and in H2 respec-
tively. Then the following conditions are equivalent:
(a) A2 = V A1 V −1 ;
(b) U A2 (t) = V U A1 (t)V −1 , ∀t ∈ R, if V ∈ U(H1 , H2 ), or
U A2 (−t) = V U A1 (t)V −1 , ∀t ∈ R, if V ∈ A(H1 , H2 ).
Now,
B := V A_1 V^{−1};
The main theorem of this section (cf. 16.4.11) is a special case of a much more
general theorem proved by Valentine Bargmann (Bargmann, 1954). In the analy-
sis of time evolution of conservative and reversible quantum systems (cf. Section
19.6) and of symmetries (for an example, cf. Section 20.3), one is led to consider
what we call continuous one-parameter groups of automorphisms. The special case
of Bargmann’s theorem we consider here is the essential link between these and
c.o.p.u.g.’s.
16.4.2 Remarks.
(a) For any mapping R ∋ t 7→ ωt ∈ Aut Ĥ, by Wigner’s theorem (cf. 10.9.6) we
have that, for each t ∈ R, there exists a family of operators Ut ∈ UA(H) which
are such that
ωUt = ωt , i.e. [Ut u] = ωt ([u]), ∀u ∈ H̃,
and that, given an operator of this family, all the others are multiples of this
one by a factor in T. Hence, for each t ∈ R, either all the operators Ut ∈ UA(H)
which are such that ωUt = ωt are unitary or all of them are antiunitary.
(b) Let R ∋ t 7→ ωt ∈ Aut Ĥ be a homomorphism from the additive group R to
Aut Ĥ.
First, we have (cf. 1.3.3 and 1.3.5b):
ω0 = idĤ ;
ω−t = ωt−1 , ∀t ∈ R;
ωt1 ◦ ωt2 = ωt2 ◦ ωt1 , ∀t1 , t2 ∈ R.
Next, for each t ∈ R and any choice of U_t and of U_{t/2} in UA(H) such that ω_{U_t} = ω_t and ω_{U_{t/2}} = ω_{t/2}, we have
ω_{U_{t/2}²} = ω_{t/2} ◦ ω_{t/2} = ω_t;
then (cf. a) there exists z ∈ T so that U_t = z U_{t/2}², and hence U_t ∈ U(H) (cf. 10.3.16c). Thus, the operators U_t ∈ UA(H) such that ω_{U_t} = ω_t are unitary, for each t ∈ R.
(1 holds by 16.4.4b), or
τ([w], ω_{s_n}([v])) → τ([w], [v]) as n → ∞.
Then, for all u, v ∈ H̃, for every t ∈ R, and for every sequence {t_n} in R such that t_n → t, we have
τ([u], ω_{t_n}([v])) = τ(ω_{−t}([u]), ω_{t_n−t}([v])) → τ(ω_{−t}([u]), [v]) = τ([u], ω_t([v])) as n → ∞
Proof. We fix h ∈ H̃ and δ ∈ (0, 1) throughout the proof. We divide the proof into
five steps.
Step 1: Here we define a ∈ (0, ∞) and V_t ∈ U(H) for all t ∈ (−a, a) so that
V_0 = 1_H and ω_t = ω_{V_t}, ∀t ∈ (−a, a).
The function
R ∋ r 7→ ρr := τ ([h], ωr ([h])) ∈ [0, 1]
is continuous and ρ0 = 1, in view of conditions c and a. Hence, we can choose
a ∈ (0, ∞) so that
r ∈ (−a, a) ⇒ δ < ρr ≤ 1.
For each r ∈ (−a, a), there exists a unique Vr ∈ U(H) so that ωVr = ωr and
(h|Vr h) = | (h|Vr h) | = ρr (1)
(in view of condition b, there exists U_r ∈ U(H) such that ω_{U_r} = ω_r; then, we define V_r := z_r U_r, with z_r := | (h|U_r h) | (h|U_r h)^{−1}; the uniqueness of V_r is obvious, since for any other V_r′ ∈ U(H) such that ω_{V_r′} = ω_r we would have (h|V_r′ h) = zρ_r with z ≠ 1, cf. 16.4.2a). Clearly,
V0 = 1H .
Step 2: Here we prove two auxiliary relations.
For all u ∈ H̃ and r, s ∈ (−a, a), we define
d_{r,s}(u) := d(ω_r([u]), ω_s([u]));
σ_{r,s}(u) := (V_r u|V_s u);
z_{r,s}(u) := V_s u − σ_{r,s}(u) V_r u.
We have
(V_r u|z_{r,s}(u)) = 0,
and hence
1 = ||V_s u||² = ||z_{r,s}(u) + σ_{r,s}(u) V_r u||² = ||z_{r,s}(u)||² + |σ_{r,s}(u)|²,
and hence
||z_{r,s}(u)||² = 1 − |σ_{r,s}(u)|² (2)= 1 − τ(ω_r([u]), ω_s([u]))² ≤ d_{r,s}(u)², (3)
where 2 holds by 16.4.4a. Moreover, we have
||V_s u − V_r u||² = 2 − 2 Re (V_r u|V_s u) ≤ 2|1 − (V_r u|V_s u)| = 2|1 − σ_{r,s}(u)|. (4)
Step 3: Here we prove that, for every t ∈ (−a, a) and every sequence {t_n} in (−a, a) such that t_n → t, we have d_{t,t_n}(u) → 0 as n → ∞, for all u ∈ H̃.
Indeed, for each u ∈ H̃, by condition c we have
τ(ω_t([u]), ω_{t_n}([u])) → τ(ω_t([u]), ω_t([u])) = 1 as n → ∞
(we have used the continuity at t of the function s ↦ τ(ω_t([u]), ω_s([u]))); therefore,
d_{t,t_n}(u) = d(ω_t([u]), ω_{t_n}([u])) = 2^{1/2} (1 − τ(ω_t([u]), ω_{t_n}([u])))^{1/2} → 0 as n → ∞.
Step 4: Here we prove that, for every t ∈ (−a, a) and every sequence {t_n} in (−a, a) such that t_n → t, we have ||V_{t_n} h − V_t h|| → 0 as n → ∞.
For all r, s ∈ (−a, a), we have
(h|z_{r,s}(h)) = (h|V_s h) − σ_{r,s}(h) (h|V_r h) (5)= ρ_s − σ_{r,s}(h) ρ_r
(5 holds in view of 1), and hence
1 − σ_{r,s}(h) = ρ_r^{−1} (ρ_r − ρ_s + (h|z_{r,s}(h))),
and hence
||V_s h − V_r h||² ≤ 2|1 − σ_{r,s}(h)| (6)≤ 2ρ_r^{−1} (|ρ_r − ρ_s| + |(h|z_{r,s}(h))|)
(7)≤ 2ρ_r^{−1} (|τ([h], ω_r([h])) − τ([h], ω_s([h]))| + ||h|| ||z_{r,s}(h)||)
(8)≤ 2ρ_r^{−1} (d(ω_r([h]), ω_s([h])) + d_{r,s}(h)) = 4ρ_r^{−1} d_{r,s}(h) < 4δ^{−1} d_{r,s}(h),
We need the next four results in the proof of 16.4.11, which is the above-
mentioned special case of Bargmann’s general theorem.
16.4.7 Lemma. Let U, V ∈ U(H) and let {U_n}, {V_n} be sequences in U(H) such that:
U_n f → U f as n → ∞, ∀f ∈ H;
V_n f → V f as n → ∞, ∀f ∈ H.
Then:
U_n V_n^{−1} f → U V^{−1} f as n → ∞, ∀f ∈ H;
V_n^{−1} f → V^{−1} f as n → ∞, ∀f ∈ H;
U_n V_n f → U V f as n → ∞, ∀f ∈ H.
Thus,
U_n V_n^{−1} f → U V^{−1} f as n → ∞, ∀f ∈ H. (1)
Since 2 is true, we can substitute V^{−1} for V and V_n^{−1} for V_n in the statement, and thus obtain from 1
U_n V_n f → U V f as n → ∞, ∀f ∈ H.
Proof. It is outside the scope of this book to prove this result, which is a special
case of a theorem of topology about liftings (cf. e.g. Greenberg and Harper, 1981,
6.1 and 6.6).
Proof. First we note that the integral which defines the function λ exists by 2.8.14
and 8.2.6.
Let a, b ∈ R be so that a < b and ϕ(t) = 0 for all t ∈ R − [a, b]. Then,
λ(x) = ∫_{[a,b]} ξ(x, t) ϕ(t) dm(t), ∀x ∈ R.
This proves that the function λ is continuous at x0 , and hence that it is continuous
since x0 was arbitrary.
Let a, b ∈ R be so that a < b and ϕ(s) = 0 for all s ∈ R − [a, b]. Then, for each
y ∈ R,
ϕ(t − y) = 0, ∀t ∈ R − [a + y, b + y].
We fix y0 ∈ R and d ∈ (0, ∞), and define the interval
I(y0 , d) := [a + y0 − d, b + y0 + d];
then, for each y ∈ [y0 − d, y0 + d],
ϕ(t − y) = 0, ∀t ∈ R − I(y0 , d).
Since ϕ′ (s) = 0 for all s 6∈ [a, b], the same reasoning as above proves that, for each
y ∈ [y0 − d, y0 + d],
ϕ′ (t − y) = 0, ∀t ∈ R − I(y0 , d).
Hence, for each h ∈ R − {0} such that |h| ≤ d, we have (cf. 1)
(1/h)(ψ(x_0, y_0 + h) − ψ(x_0, y_0)) + ∫_R ξ(x_0, t) ϕ′(t − y_0) dm(t)
(2)= ∫_{I(y_0,d)} ( (1/h)(χ(t, y_0 + h) − χ(t, y_0)) − (∂χ/∂y)(t, y_0) ) dm(t)
(note that the function χ depends on x0 , which however is fixed).
Now, the function
I(y_0, d) × [y_0 − d, y_0 + d] ∋ (t, y) ↦ (∂χ/∂y)(t, y) ∈ C
is continuous (cf. 1); hence it is uniformly continuous (cf. 2.8.7 and 2.8.15), and hence for each ε ∈ (0, ∞) there exists δ_ε ∈ (0, ∞) such that δ_ε < d and
|y − y_0| < δ_ε ⇒ [ d_2((t, y), (t, y_0)) < δ_ε, ∀t ∈ I(y_0, d) ]
(3)⇒ [ |(∂χ/∂y)(t, y) − (∂χ/∂y)(t, y_0)| < ε, ∀t ∈ I(y_0, d) ].
Proof. The mapping R ∋ t 7→ ωt ∈ Aut Ĥ we are now considering has all the
properties assumed for the mapping considered in 16.4.6 (cf. 16.4.2b and ag2 ).
Then there exists a ∈ (0, ∞) and a mapping
(−a, a) ∋ t 7→ Vt ∈ U(H)
with the properties listed in 16.4.6.
We divide the proof of the theorem into five steps.
Step 1: Here we define a mapping R ∋ t ↦ T(t) ∈ U(H) such that
T(0) = 1_H and ω_t = ω_{T(t)}, ∀t ∈ R.
We set b := a/2. Then,
∀t ∈ R, ∃!(k(t), r(t)) ∈ Z × [0, b) such that t = k(t)b + r(t).
Thus we can define the mapping
T : R → U(H)
t ↦ T(t) := V_b^{k(t)} V_{r(t)}
(we recall that V_b^0 := 1_H). We see that T(0) = 1_H. Moreover, in view of ag_1,
ω_{T(t)}([u]) = [V_b^{k(t)} V_{r(t)} u] = (ω_b ◦ ··· k(t) times ··· ◦ ω_b ◦ ω_{r(t)})([u]) = ω_{k(t)b+r(t)}([u]) = ω_t([u]), ∀u ∈ H̃, ∀t ∈ R,
and hence
ω_{T(t)} = ω_t, ∀t ∈ R.
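The decomposition t = k(t)b + r(t) with k(t) ∈ Z and r(t) ∈ [0, b) used in Step 1 is just floor division with remainder; a minimal sketch (our own, with `k_r` a name we introduce):

```python
import math

def k_r(t, b):
    """Unique (k, r) with k in Z, r in [0, b), and t = k*b + r."""
    k = math.floor(t / b)
    return k, t - k * b

b = 0.5                       # plays the role of b := a/2 for some a > 0
for t in (1.3, -0.2, 0.0, 2.0):
    k, r = k_r(t, b)
    assert 0.0 <= r < b and abs(k * b + r - t) < 1e-12
```

Note that for negative t the integer part k(t) is negative while the remainder r(t) stays in [0, b), which is exactly what the definition of T requires.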
(1 holds for n > n0 ) since r(tn ) − r(t) = tn − t for n > n0 . This proves that the
mapping t 7→ T (t)f is continuous at t.
In the second place we suppose that t ∈ R is such that t = k(t)b; hence, T(t) = V_b^{k(t)}. First let {t_n} be a sequence in R such that t_n → t and such that there exists n_1 ∈ N so that
n > n_1 ⇒ k(t)b = t ≤ t_n < (k(t) + 1)b ⇒ T(t_n) = V_b^{k(t)} V_{r(t_n)};
then,
||T(t_n)f − T(t)f|| (2)= ||V_b^{k(t)} V_{r(t_n)} f − V_b^{k(t)} f|| = ||V_{r(t_n)} f − f|| → 0 as n → ∞
(2 holds for n > n_1) since r(t_n) = t_n − t for n > n_1; by the argument used in the proof of 2.4.2 (b ⇒ a), this implies that
∀ε > 0, ∃δ_ε^+ > 0 such that t ≤ s < t + δ_ε^+ ⇒ ||T(t)f − T(s)f|| < ε.
Next let {t_n} be a sequence in R such that t_n → t and such that there exists n_2 ∈ N so that
n > n_2 ⇒ (k(t) − 1)b ≤ t_n < t = k(t)b ⇒ T(t_n) = V_b^{k(t)−1} V_{r(t_n)};
then t − t_n = b − r(t_n) for n > n_2 and hence
[t_n → t] ⇒ [r(t_n) → b] ⇒ [V_{r(t_n)} f → V_b f as n → ∞],
and hence, since V_b^{k(t)−1} ∈ B(H),
T(t_n)f = V_b^{k(t)−1} V_{r(t_n)} f (3)→ V_b^{k(t)−1} V_b f = T(t)f as n → ∞
(3 holds for n > n_2); by the argument used in the proof of 2.4.2 (b ⇒ a), this implies that
∀ε > 0, ∃δ_ε^− > 0 such that t − δ_ε^− < s < t ⇒ ||T(t)f − T(s)f|| < ε.
Letting δ_ε := min{δ_ε^+, δ_ε^−}, we have proved that
∀ε > 0, ∃δ_ε > 0 such that |s − t| < δ_ε ⇒ ||T(t)f − T(s)f|| < ε,
thus, the function µ is continuous (cf. 2.4.2). Therefore, by 16.4.8 there exists a
continuous function ξ : R2 → R such that
ξ(0, 0) = 0 and µ(r, s) = eiξ(r,s) , ∀(r, s) ∈ R2 . (6)
The function
R3 ∋ (r, s, t) 7→ ξ(r, s) + ξ(r + s, t) − ξ(s, t) − ξ(r, s + t) ∈ R
is obviously continuous; then, since (R3 , d3 ) is a connected metric space (cf. 2.9.10),
its range can only be either R or an interval or a singleton set (cf. 2.9.6); now, 5
implies that
∀(r, s, t) ∈ R3 , ∃nr,s,t ∈ Z such that
ξ(r, s) + ξ(r + s, t) − ξ(s, t) − ξ(r, s + t) = 2nr,s,t π;
hence,
∃n ∈ Z such that ξ(r, s) + ξ(r + s, t) − ξ(s, t) − ξ(r, s + t) = 2nπ, ∀(r, s, t) ∈ R3 ;
if we set r = s = t = 0 in this, we see that n = 0 since ξ(0, 0) = 0; thus,
ξ(r, s) + ξ(r + s, t) = ξ(s, t) + ξ(r, s + t), ∀(r, s, t) ∈ R3 . (7)
If we set r = s = 0 in this, we obtain
ξ(0, t) = 0, ∀t ∈ R. (8)
Step 4: The exponent ξ0 .
Throughout this step we fix a real function ϕ ∈ C_c(R) which is differentiable at all points of R, and also such that the derivative ϕ′ is continuous and ∫_R ϕ dm = 1. A possible choice is
ϕ(x) := (1/2π)(cos x + 1) if x ∈ [−π, π], and ϕ(x) := 0 if x ∉ [−π, π].
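The stated properties of this choice of ϕ can be checked numerically; in this sketch of ours, the integral is approximated by a midpoint Riemann sum on a fine grid.

```python
import numpy as np

def phi(x):
    """phi(x) = (cos x + 1)/(2*pi) on [-pi, pi], 0 elsewhere."""
    return np.where(np.abs(x) <= np.pi, (np.cos(x) + 1) / (2 * np.pi), 0.0)

dx = 1e-4
x = np.arange(-4.0, 4.0, dx) + dx / 2         # midpoints of a grid covering supp(phi)
integral = float(np.sum(phi(x)) * dx)
assert abs(integral - 1.0) < 1e-6             # int_R phi dm = 1

# phi and its derivative -sin(x)/(2*pi) both vanish at x = +-pi, so phi is
# continuously differentiable on all of R.
assert phi(np.array([3.5]))[0] == 0.0
```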
We define the function
R ∋ r ↦ λ(r) := − ∫_R ξ(r, t) ϕ(t) dm(t) ∈ R; (9)
this function is continuous, in view of 16.4.9; moreover, λ(0) = 0 in view of 8.
Next we define the mapping
R ∋ t 7→ W (t) := eiλ(t) T (t) ∈ U(H).
In view of step 1 we have
W (0) = 1H and ωt = ωW (t) , ∀t ∈ R; (10)
in view of step 2 and of 10.1.16b we have that
the mapping R ∋ t 7→ W (t)f ∈ H is continuous, ∀f ∈ H. (11)
In view of 4 and 6 we see that
W (r)W (s) = ei(λ(r)+λ(s)−λ(r+s)+ξ(r,s)) W (r + s), ∀(r, s) ∈ R2 ,
or
W (r)W (s) = eiξ0 (r,s) W (r + s), ∀(r, s) ∈ R2 , (12)
where ξ0 is the function defined by
R2 ∋ (r, s) 7→ ξ0 (r, s) := ξ(r, s) + λ(r) + λ(s) − λ(r + s) ∈ R; (13)
the function R ∋ r ↦ (∂ξ_0/∂s)(r, 0) ∈ R is continuous. (19)
Step 5: Here we prove the statement of the theorem.
We define the function
R² ∋ (r, s) ↦ ψ(r, s) := (∂ξ_0/∂s)(r, s) ∈ R.
If we differentiate 14 with respect to t at t = 0 we obtain
ψ(r + s, 0) = ψ(s, 0) + ψ(r, s), ∀(r, s) ∈ R². (20)
From 19 and 20 we have that the function ψ is continuous.
Next we define the function
R ∋ t ↦ λ_0(t) := ∫_0^t ψ(r, 0) dr ∈ R
(the symbol ∫_a^b has in the present proof the same meaning it has in the proof of 16.1.10), which is continuous.
For all (t_1, t_2) ∈ R², we have
λ_0(t_1 + t_2) − λ_0(t_1) − λ_0(t_2)
= ∫_0^{t_1+t_2} ψ(r, 0) dr − ∫_0^{t_1} ψ(r, 0) dr − ∫_0^{t_2} ψ(r, 0) dr
= ∫_{t_1}^{t_1+t_2} ψ(r, 0) dr − ∫_0^{t_2} ψ(r, 0) dr (21)= ∫_0^{t_2} ψ(t_1 + r, 0) dr − ∫_0^{t_2} ψ(r, 0) dr
(22)= ∫_0^{t_2} ψ(t_1, r) dr (23)= ξ_0(t_1, t_2) − ξ_0(t_1, 0) (24)= ξ_0(t_1, t_2),
where 21 holds by a change of variable (cf. the explanation of 7 and 8 in the proof
of 16.1.10), 22 holds in view of 20, 23 holds by Riemann’s fundamental theorem of
calculus, 24 holds in view of 15.
Finally, we define the mapping
R ∋ t 7→ U (t) := eiλ0 (t) W (t) ∈ U(H).
In view of 10 we have
ωt = ωU(t) , ∀t ∈ R.
In view of 11 and of 10.1.16b we have that
the mapping R ∋ t 7→ U (t)f ∈ H is continuous, ∀f ∈ H;
moreover, in view of 12 and of the equation proved above we have
U (t1 )U (t2 ) = ei(λ0 (t1 )+λ0 (t2 )−λ0 (t1 +t2 )+ξ0 (t1 ,t2 )) U (t1 + t2 )
= U (t1 + t2 ), ∀t1 , t2 ∈ R;
thus, the mapping U is a c.o.p.u.g.
July 25, 2013 17:28 WSPC - Proceedings Trim Size: 9.75in x 6.5in icmp12-master
Chapter 17
The results of operations performed on elements of O(H) (cf. 3.2.1) can be misun-
derstood if they are interpreted as if O(H) were an algebra, since O(H) is not an
algebra and not even a linear space (cf. 3.2.11). This is true in particular for the
commutator of two elements of O(H).
17.1.1 Definitions.
(a) Let A, B be elements of O(H), i.e. linear operators in H. The commutator of
A and B is the linear operator denoted by the symbol [A, B] and defined by
[A, B] := AB − BA.
We note that
D[A,B] = {f ∈ DA ∩ DB : Af ∈ DB and Bf ∈ DA }.
(b) Two elements A and B of B(H) are said to commute if AB = BA, i.e. if
[A, B] = OH .
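For elements of B(H) the definitions above are unproblematic; a minimal numerical illustration (our own, with 2×2 matrices standing in for B(H), where every linear operator is bounded and everywhere defined):

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, -1.0]])

def commutator(A, B):
    """[A, B] := AB - BA, defined on all of H when A, B are in B(H)."""
    return A @ B - B @ A

# These particular A and B do not commute: [A, B] != O_H ...
assert not np.allclose(commutator(A, B), 0)
# ... while every operator commutes with itself and with 1_H.
assert np.allclose(commutator(A, A), 0)
assert np.allclose(commutator(A, np.eye(2)), 0)
```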
The definition given in 17.1.1b is the natural one for elements of B(H), since
B(H) is an algebra (cf. 4.3.5). It might be thought that this definition could be
generalized meaningfully to arbitrary elements of O(H) in a direct way, by saying
that two elements A and B of O(H) commute if [A, B] ⊂ OH . However, this
definition would not be very useful. Firstly, it is clear that the content of the
relation [A, B] ⊂ OH depends on the size of D[A,B] (it is even void if D[A,B] = {0H }).
Moreover, if A and B are self-adjoint elements of B(H) then the relation [A, B] = OH
529
has consequences (cf. 17.1.4 and 17.1.7) which are not granted in general by the
relation [A, B] ⊂ OH when A and B are self-adjoint elements of O(H) which are
not defined on the whole of H (we recall that, for a self-adjoint operator A, DA = H
is equivalent to A ∈ B(H), cf. 12.4.7); this is proved by examples (cf. 17.1.8). In
general, the condition [A, B] ⊂ OH does not seem to lead to interesting results for
self-adjoint operators A and B which are not in B(H). The main task of this section
is to find a commutativity condition for self-adjoint operators which has the same
consequences whether or not the operators are in B(H).
We start off by noting that there is a condition of commutativity which has
already played a role in previous chapters.
17.1.3 Remarks.
(a) Let (X, A) be a measurable space and let P : A → P(H) be a projection
valued measure. If B ∈ B(H) is such that [B, P (E)] = OH for all E ∈ A, then
B commutes with JϕP for all ϕ ∈ M(X, A, P ) (cf. 14.2.14e).
(b) The previous remark implies that, if A is a self-adjoint operator in H and
B ∈ B(H) is such that [B, P A (E)] = OH for all E ∈ A(dR ), then B commutes
with ϕ(A) for all ϕ ∈ M(R, A(dR ), P A ); in particular, B commutes with A
since A = ξ(A).
(c) If A is a self-adjoint operator in H and B ∈ B(H) commutes with A, then
[B, P A (E)] = OH for all E ∈ A(dR ) (cf. 15.2.1B).
(d) Remarks b and c imply that, if A is a self-adjoint operator in H and B ∈ B(H)
commutes with A, then B commutes with ϕ(A) for all ϕ ∈ M(R, A(dR ), P A ).
We note that, while the definition provided in 17.1.1b sets up a relation in B(H)
which is obviously symmetric, for A ∈ O(H) and B ∈ B(H) the condition BA ⊂ AB
is asymmetric if DA 6= H. The implication equivalent to BA ⊂ AB that is written
in 17.1.2 makes this asymmetry immediately clear. If neither of two linear operators
A and B is an element of B(H) then we do not try at all to define anything like
commutativity for A and B, unless both operators are self-adjoint. Indeed, 17.1.4
proves that if both A ∈ O(H) and B ∈ B(H) are self-adjoint then the condition
BA ⊂ AB is in fact equivalent to a relation in which A and B have equal roles,
and suggests how this condition can be generalized to a symmetric relation between
any kind of self-adjoint operators (cf. 17.1.5). After that, 17.1.7 shows that this
generalization is a meaningful condition of commutativity for self-adjoint operators.
In view of 17.1.4, the following definition is consistent with the definition given
in 17.1.2.
17.1.6 Remarks.
(cf. 13.3.2d).
(c) Let A and B be commuting (in the sense of 17.1.5) self-adjoint operators,
let ϕ be a real element of M(R, A(dR ), P A ), let ψ be a real element of
M(R, A(dR ), P B ). Then the operators ϕ(A) and ψ(B) are self-adjoint and
they commute (in the sense of 17.1.5). This follows immediately from 15.3.8.
Every self-adjoint operator defines the family of operators that contains the
operator itself and the ranges of the projection valued measure and of the continuous
one-parameter unitary group determined by the operator. The next theorem proves
that two self-adjoint operators commute (in the sense of 17.1.5) iff any bounded
element of the family defined as above by one of them commutes (in the sense of
17.1.2) with any element of the family defined as above by the other one.
then we have
Af = −i lim_{n→∞} g_n
(cf. 16.1.6); since P^B(F) ∈ B(H), this implies that
the sequence {P^B(F) g_n} is convergent and P^B(F) Af = −i lim_{n→∞} P^B(F) g_n;
From 17.1.7 we see that if two self-adjoint operators A and B commute (in the
sense defined in 17.1.5) then [A, B] ⊂ OH . It is almost obvious that the converse
cannot be true in general because two self-adjoint operators A and B can be such
that [A, B] ⊂ OH , but with D[A,B] so small that the relation [A, B] ⊂ OH is of no
consequence. Now, one might conjecture that, if [A, B] ⊂ OH and D[A,B] is dense in
H, then A and B should commute. Example a in 17.1.8 proves that this conjecture
is false. Then one might go one step further and conjecture that if D[A,B] is not
only dense in H but also large enough so that two self-adjoint operators A and B
are uniquely determined by their restrictions to D[A,B] , then [A, B] ⊂ OH could be
17.1.8 Examples.
(a) The Hilbert space of this example is L2 (a, b). As in 12.4.25, here we do not
distinguish between a symbol ϕ for an element of C(a, b) and the symbol [ϕ] for
the element of L2 (a, b) that contains ϕ. Accordingly, the family of functions
C0∞ (a, b) defined in 11.4.17 is identified with the subset ι(C0∞ (a, b)) of L2 (a, b).
We consider the operators A0 and A1 defined as Aθ in 12.4.25, with θ := 0 and
θ := 1. It is obvious that
[A0 , A1 ] ⊂ OL2 (a,b) .
Moreover, it is obvious that C0∞ (a, b) ⊂ D[A0 ,A1 ] . Now, C0∞ (a, b) is dense in
L2 (a, b) (cf. 11.4.21) and hence so is D[A0 ,A1 ] . Then we have
O_{L²(a,b)} = O_{L²(a,b)}^† ⊂ [A_0, A_1]^†
(cf. 12.1.4; the equation O_{L²(a,b)} = O_{L²(a,b)}^† follows e.g. from 12.1.3B), and hence
[A_0, A_1]^† = O_{L²(a,b)},
and hence
A_1^† A_0^† − A_0^† A_1^† ⊂ (A_0 A_1)^† − (A_1 A_0)^† ⊂ (A_0 A_1 − A_1 A_0)^† = O_{L²(a,b)}
(cf. 12.3.4a and 12.3.1a). Thus, A_0^† and A_1^† are self-adjoint operators (since A_0 and A_1 are essentially self-adjoint, cf. 12.4.25) such that
[A_0^†, A_1^†] ⊂ O_{L²(a,b)}
and
D_{[A_0^†, A_1^†]} is dense in L²(a, b)
(note that D_{[A_0, A_1]} ⊂ D_{[A_0^†, A_1^†]} since A_0 ⊂ A_0^† and A_1 ⊂ A_1^†).
Now, the conditions of 15.3.4B hold true for both the self-adjoint operators A_0^† and A_1^† (cf. 12.4.25). The number 0 is an eigenvalue of A_0^† and its eigenspace is the one-dimensional subspace generated by the element u of L²(a, b) defined by
u(x) := (1/(b − a))^{1/2}, ∀x ∈ [a, b];
therefore we have
P^{A_0^†}({0}) ϕ = (u|ϕ) u, ∀ϕ ∈ L²(a, b)
(we identify the symbols ϕ and [ϕ] also for an element ϕ of L²(a, b)). The number 1/(b − a) is an eigenvalue of A_1^† and its eigenspace is the one-dimensional subspace generated by the element v of L²(a, b) defined by
v(x) := (1/(b − a))^{1/2} exp( i (x − a)/(b − a) ), ∀x ∈ [a, b];
therefore we have
P^{A_1^†}({1/(b − a)}) ϕ = (v|ϕ) v, ∀ϕ ∈ L²(a, b).
Thus, we have
P^{A_0^†}({0}) P^{A_1^†}({1/(b − a)}) v = (u|v) u
and
P^{A_1^†}({1/(b − a)}) P^{A_0^†}({0}) v = (u|v) (v|u) v.
Now,
(u|v) = (1/(b − a)) ∫_a^b exp( i (x − a)/(b − a) ) dx = ∫_0^1 e^{is} ds ≠ 0,
and hence
[ P^{A_0^†}({0}), P^{A_1^†}({1/(b − a)}) ] ≠ O_{L²(a,b)}.
This proves that the self-adjoint operators A†0 and A†1 do not commute (in the
sense defined in 17.1.5).
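The computation above can be checked numerically. In this sketch of ours, (a, b) = (0, 1) and L²(a, b) is replaced by a fine grid; the rank-one projections onto u(x) = 1 and v(x) = exp(i(x − a)/(b − a)) are applied to v exactly as in the text.

```python
import numpy as np

N = 2000
x = np.linspace(0.0, 1.0, N, endpoint=False)
dx = 1.0 / N
u = np.ones(N)                                # u(x) = 1 on (0, 1)
v = np.exp(1j * x)                            # v(x) = exp(i(x - a)/(b - a))
inner = lambda f, g: np.sum(f.conj() * g) * dx   # grid approximation of (f|g)

uv = inner(u, v)                              # approximates int_0^1 e^{is} ds
assert abs(uv) > 0.1                          # (u|v) != 0

# The two orders of application give different vectors, so the spectral
# projections do not commute:
Pu_Pv_v = uv * u                              # P^{A0+}({0}) P^{A1+}({1/(b-a)}) v = (u|v) u
Pv_Pu_v = uv * inner(v, u) * v                # P^{A1+}({1/(b-a)}) P^{A0+}({0}) v = (u|v)(v|u) v
assert not np.allclose(Pu_Pv_v, Pv_Pu_v)
```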
We point out that D_{[A_0^†, A_1^†]}, though dense in L²(a, b), cannot be such that the restrictions of A_0^† and A_1^† to D_{[A_0^†, A_1^†]} are essentially self-adjoint, since these restrictions are equal but the self-adjoint operators A_0^† and A_1^† are not (cf. 12.4.11c and 12.4.13).
(b) This example is due to Edward Nelson (cf. Reed and Simon, 1980, 1972, p. 306), and its key point is the proof of the following proposition.
There exists a Hilbert space K, a linear manifold D dense in K, and two linear
operators A and B in K so that:
(a) DA = DB = D, A(D) ⊂ D, B(D) ⊂ D;
(b) ABf − BAf = 0K , ∀f ∈ D;
(c) A and B are essentially self-adjoint;
(d) ∃f ∈ D such that U A (1)U B (1)f 6= U B (1)U A (1)f .
We do not prove this proposition. A scheme of its proof can be found at p.273–
274 of (Reed and Simon, 1980, 1972).
We note that from conditions a and b we have
AB − BA ⊂ OK ,
and hence (in view of 12.1.4), since condition a implies D_{BA−AB} = D and hence that D_{BA−AB} is dense in K,
O_K = O_K^† ⊂ (BA − AB)^†,
and hence
(BA − AB)^† = O_K.
Then from condition c we have (in view of 12.1.6b, 12.3.4a, 12.3.1a)
Ā B̄ − B̄ Ā = A^† B^† − B^† A^† ⊂ (BA)^† − (AB)^† ⊂ (BA − AB)^† = O_K.
However, condition d proves (in view of 17.1.7) that the self-adjoint operators
A and B do not commute (in the sense defined in 17.1.5).
Two self-adjoint operators commute (in the sense of 17.1.5) iff they are functions
of a third self-adjoint operator (cf. 17.1.10 a ⇔ c). The difficult part of this
equivalence is proved by the next theorem. The main idea for the proof we provide
is drawn from Section 130 of (Riesz and Sz.-Nagy, 1972). We will write this proof
in full detail even at the risk of belabouring the obvious.
[0, 1) × [0, 1) =
σ_1^1 σ_4^1
σ_2^1 σ_3^1
(a 2 × 2 array of subsquares);
for m > 1, supposing that {σ_n^m}_{n=1,...,4^m} has already been defined, we define {σ_n^{m+1}}_{n=1,...,4^{m+1}} by
σ_n^m =
σ_{4n−3}^{m+1} σ_{4n}^{m+1}
σ_{4n−2}^{m+1} σ_{4n−1}^{m+1}
(a 2 × 2 array of subsquares), for all n ∈ {1, ..., 4^m}. (3)
It can be easily proved by induction that, for all m ∈ N, for all n ∈ {1, ..., 4^m}, for all l ∈ N such that m < l,
ι_n^m = ∪_{s∈I(m,l)} ι_s^l and σ_n^m = ∪_{s∈I(m,l)} σ_s^l, (4)
For all m ∈ N and for all l ∈ N such that m ≤ l, in view of 4 we have, for n = 1, ..., 4^m and s = 1, ..., 4^l:
either ι_s^l ⊂ ι_n^m or ι_s^l ∩ ι_n^m = ∅; (5)
ι_s^l ⊂ ι_n^m iff σ_s^l ⊂ σ_n^m; (7)
ι_s^l ∩ ι_n^m = ∅ iff σ_s^l ∩ σ_n^m = ∅; (8)
For l, m ∈ N, let I_1 and I_2 be subfamilies of {1, ..., 4^l} and of {1, ..., 4^m} respectively. First we note that if l = m then (in view of 5)
∪_{s∈I_1} ι_s^m = ∪_{n∈I_2} ι_n^m ⇒ I_1 = I_2 and hence ∪_{s∈I_1} σ_s^m = ∪_{n∈I_2} σ_n^m. (9)
∀s ∈ I_1, ∃n ∈ I_2 s.t. ι_s^l ⊂ ι_n^m and hence (in view of 7) s.t. σ_s^l ⊂ σ_n^m;
and hence (in view of 4) there would exist t ∈ {1, ..., 4^l} such that
σ_t^l ⊂ σ_n^m − ∪_{s∈I_1} σ_s^l,
Step 3: We prove that there exists a projection valued measure T on A(d_R) such that T(ι_n^m) = P(σ_n^m), ∀m ∈ N, ∀n ∈ {1, ..., 4^m}.
Let S be the collection of subsets of [0, 1) whose elements are the empty set and
all the intervals [a, b) such that 0 ≤ a < b ≤ 1 and
a = 0 or a = n_a 4^{−m_a}, b = 1 or b = n_b 4^{−m_b} (12)
with m_a, m_b ∈ N and n_a, n_b elements of N which are not multiples of 4 (equivalently, if a ≠ 0 then m_a is the least positive integer so that a = n_a 4^{−m_a} with n_a ∈ N, and similarly for m_b if b ≠ 1). It is obvious that S is a semialgebra on [0, 1).
and we define
Q(E) := Σ_{n∈I(E)} P(σ_n^{m(E)}) = P( ∪_{n∈I(E)} σ_n^{m(E)} ).
We now prove that the mapping Q satisfies all the conditions of 13.4.4.
q1 : Let {E1 , ..., EN } be a disjoint family of elements of S such that
  E := ⋃_{k=1}^N E_k ∈ S.
We define m := max{m(E), m(E_1), ..., m(E_N)}. In view of 4 there are subsets of {1, ..., 4^m}, I and I_k for k = 1, ..., N, so that
  E = ⋃_{s∈I} ι_s^m  and  E_k = ⋃_{s∈I_k} ι_s^m for k = 1, ..., N;
moreover,
  ⋃_{s∈I} ι_s^m = E = ⋃_{k=1}^N E_k = ⋃_{k=1}^N (⋃_{s∈I_k} ι_s^m)
implies (in view of 9) I = ⋃_{k=1}^N I_k, and hence
  ⋃_{n∈I(E)} σ_n^{m(E)} = ⋃_{s∈I} σ_s^m = ⋃_{k=1}^N (⋃_{s∈I_k} σ_s^m) = ⋃_{k=1}^N (⋃_{r∈I(E_k)} σ_r^{m(E_k)}),
and hence
  Q(E) = P(⋃_{n∈I(E)} σ_n^{m(E)}) = Σ_{k=1}^N P(⋃_{r∈I(E_k)} σ_r^{m(E_k)}) = Σ_{k=1}^N Q(E_k),
where the second equality is true because P is a projection valued measure (if k ≠ h then σ_r^{m(E_k)} ∩ σ_t^{m(E_h)} = ∅ in view of 8 since ι_r^{m(E_k)} ∩ ι_t^{m(E_h)} = ∅, for all r ∈ I(E_k) and t ∈ I(E_h)).
and hence
  lim_{m→∞} μ_f^Q([b − 4^{−m}, b)) = 0.
To conclude the proof that condition q4 holds true, we note that obviously
  \overline{[a, b − 4^{−m})} = [a, b − 4^{−m}] ⊂ [a, b), ∀m > m_0,
and recall 2.8.7.
Step 4: We prove the statement of the theorem.
We define the self-adjoint operator B := J_ξ^T. Clearly,
  T = P^B.   (14)
For all m ∈ N and n ∈ {1, ..., 4^m}, we denote by (x_n^m, y_n^m) the bottom-left corner of the square σ_n^m. For each m ∈ N, we define the function
  ρ_m := Σ_{n=1}^{4^m} x_n^m χ_{ι_n^m},
which is obviously an element of L^∞(R, A(dR), T). We note that
  ρ_m(x) ≤ ρ_{m+1}(x), ∀x ∈ R, ∀m ∈ N;
indeed, fix x ∈ [0, 1) and for each m ∈ N let n_m(x) ∈ {1, ..., 4^m} be such that x ∈ ι_{n_m(x)}^m; then,
  ι_{n_{m+1}(x)}^{m+1} ⊂ ι_{n_m(x)}^m
17.1.11 Definition. Let A1 and A2 be commuting (in the sense of 17.1.5) self-
adjoint operators in H, and let P be the projection valued measure of 17.1.10b. For
a function ϕ ∈ M(R^2, A(d^2), P), we write
  ϕ(A_1, A_2) := J_ϕ^P
and we say that this operator is a function of A1 and A2 . This name is justified by
the fact that ϕ(A1 , A2 ) is often the closure of the function ϕ of A1 and A2 defined
in an obvious direct way. In 17.1.12 we examine two instances of this.
17.1.12 Proposition. Let A1 and A2 be commuting (in the sense of 17.1.5) self-
adjoint operators in H.
  R^2 ∋ (x_1, x_2) ↦ ϕ(x_1, x_2) := x_1 + x_2 ∈ R,
  R^2 ∋ (x_1, x_2) ↦ ψ(x_1, x_2) := x_1 x_2 ∈ R,
  R^2 ∋ (x_1, x_2) ↦ π_i(x_1, x_2) := x_i ∈ R,
and the operator B_i := J_{π_i}^P. The operators B_1 and B_2 are self-adjoint and
The operator A_1 + A_2 is adjointable since D_{ϕ(A_1,A_2)} ⊂ D_{A_1+A_2} (cf. 4.4.10) and \overline{D_{ϕ(A_1,A_2)}} = H (cf. 14.2.13). Moreover, the operator ϕ(A_1, A_2) is self-adjoint (cf. 14.3.17). Therefore, A_1 + A_2 is essentially self-adjoint by 12.4.11. Then we have \overline{A_1 + A_2} = (A_1 + A_2)^† (cf. 12.1.6b).
If A_1 ∈ B(H) then (cf. 12.3.1b) (A_1 + A_2)^† = A_1^† + A_2^† = A_1 + A_2.
b: We note that π1 π2 = π2 π1 = ψ. Then, 1 and 14.3.12 imply that
The next result has an important role in the discussion of compatible quantum
observables (cf. 19.5.23 and 19.5.24f). It may be interesting to note that in a way
it extends to a pair of commuting self-adjoint operators what happens for a single
self-adjoint operator (cf. 15.2.4 a ⇒ d).
17.1.13 Theorem. Let A1 and A2 be commuting (in the sense of 17.1.5) self-
adjoint operators in H, and let λ1 ∈ σ(A1 ). Then, for every ε > 0, there exist
λ2 ∈ σ(A2 ) and uε ∈ DA1 ∩ DA2 ∩ H̃ so that
  |⟨A_i⟩_{u_ε} − λ_i| ≤ ε and Δ_{u_ε} A_i ≤ 2ε, for i = 1, 2
(for ⟨A⟩_u and Δ_u A, cf. 15.2.3).
Proof. We fix ε ∈ (0, ∞). Then P^{A_1}((λ_1 − ε, λ_1 + ε)) ≠ O_H (cf. 15.2.4). We define the mapping
  Q : A(dR) → P(H)
  E ↦ Q(E) := P^{A_1}((λ_1 − ε, λ_1 + ε)) P^{A_2}(E);
we point out that this definition is consistent (in view of 13.2.1) because A1 and A2
commute. We note that, for each f ∈ H and all E ∈ A(dR ),
  μ_f^Q(E) = (P^{A_1}((λ_1 − ε, λ_1 + ε))f | P^{A_2}(E) P^{A_1}((λ_1 − ε, λ_1 + ε))f) = μ_g^{P^{A_2}}(E)
Then, by 2.3.16 and 2.3.18 there is a countable subset {µn }n∈J of G so that
  G = ⋃_{n∈J} (μ_n − ε_{μ_n}, μ_n + ε_{μ_n}).
and
  ‖A_i u_ε − λ_i u_ε‖^2 = ∫_{(λ_i−ε, λ_i+ε)} |ξ − λ_i|^2 dμ_{u_ε}^{P^{A_i}} ≤ ε^2 μ_{u_ε}^{P^{A_i}}((λ_i − ε, λ_i + ε)) = ε^2.
17.1.14 Example. Suppose that H is a separable Hilbert space and let A1 and A2
be self-adjoint operators in H such that conditions a, b, c of 15.3.4B hold true for
both of them. Then the following conditions are equivalent:
(α) A1 and A2 commute (in the sense defined in 17.1.5);
(β) if, for k = 1, 2, {(λ_n^k, P_n^k)}_{n∈I_k} is the family associated with A_k as {(λ_n, P_n)}_{n∈I} was associated with A in 15.3.4B, then [P_n^1, P_l^2] = O_H for all (n, l) ∈ I_1 × I_2;
(γ) there exists a c.o.n.s. {vj }j∈J in H whose elements are eigenvectors of both A1
and A2 .
Indeed, the equivalence of conditions α and β follows at once from
  P^{A_k}(E)f = Σ_{n∈I_E^k} P_n^k f, ∀f ∈ H, ∀E ∈ A(dR),
with I_E^k := {n ∈ I_k : λ_n^k ∈ E}, for k = 1, 2 (cf. 15.3.4B).
Moreover, if condition β is true then {P_n^1 P_l^2}_{(n,l)∈I_1×I_2} is a family of projections (cf. 13.2.1) which is so that
  (P_n^1 P_l^2)(P_m^1 P_j^2) = O_H if (n, l) ≠ (m, j);   (1)
if we set I_0 := {(n, l) ∈ I_1 × I_2 : P_n^1 P_l^2 ≠ O_H}, we have
  Σ_{(n,l)∈I_0} P_n^1 P_l^2 f = Σ_{n∈I_1} P_n^1 (Σ_{l∈I_2} P_l^2 f) = Σ_{n∈I_1} P_n^1 f = f, ∀f ∈ H;   (2)
for each (n, l) ∈ I_0, we fix an o.n.s. {u_{n,l,s}}_{s∈I_{n,l}} which is complete in the subspace R_{P_n^1 P_l^2} (cf. 10.7.2); then the set ⋃_{(n,l)∈I_0} {u_{n,l,s}}_{s∈I_{n,l}} is an o.n.s. in H in view of 1 (cf. 13.2.8d and 13.2.9c) and it is complete in H by 10.6.4 (with M := H) since (cf. 2 and 13.1.10)
  f = Σ_{(n,l)∈I_0} P_n^1 P_l^2 f = Σ_{(n,l)∈I_0} Σ_{s∈I_{n,l}} (u_{n,l,s}|f) u_{n,l,s}, ∀f ∈ H;
moreover, all the elements of this c.o.n.s. are eigenvectors of both A_1 and A_2, since
  R_{P_n^1 P_l^2} = R_{P_n^1} ∩ R_{P_l^2}, ∀(n, l) ∈ I_0
(cf. 13.2.1e) and since all the non-null elements of R_{P_n^k} are eigenvectors of A_k, for all n ∈ I_k and for k = 1, 2 (cf. 15.3.4B). This proves that condition β implies condition γ.
Conversely, assume that condition γ is true. Then (cf. 15.3.4B):
  ∀n ∈ I_1, ∃J_n^1 ⊂ J s.t. P_n^1 f = Σ_{j∈J_n^1} (v_j|f) v_j, ∀f ∈ H;
  ∀l ∈ I_2, ∃J_l^2 ⊂ J s.t. P_l^2 f = Σ_{j∈J_l^2} (v_j|f) v_j, ∀f ∈ H.
(if J_n^1 ∩ J_l^2 = ∅ then the sum of the series is defined to be 0_H), and this proves that condition β is true.
Now suppose that A_1 and A_2 commute and let {(λ_n^k, P_n^k)}_{n∈I_k} be as in condition β, for k = 1, 2. For each G ∈ A(d^2), let J_G := {(n, l) ∈ I_0 : (λ_n^1, λ_l^2) ∈ G} and let P_G be the projection defined by
  P_G f := Σ_{(n,l)∈J_G} P_n^1 P_l^2 f, ∀f ∈ H.
hence, μ_f^P is a measure (cf. 8.3.8 with (X, A) := (R^2, A(d^2)), I := I_0, x_{(n,l)} := (λ_n^1, λ_l^2), a_{(n,l)} := ‖P_n^1 P_l^2 f‖^2) and
  μ_f^P(R^2) = Σ_{(n,l)∈I_0} ‖P_n^1 P_l^2 f‖^2 = ‖Σ_{(n,l)∈I_0} P_n^1 P_l^2 f‖^2 = ‖f‖^2.
Therefore, P is a projection valued measure (cf. 13.3.5). Furthermore, for every E ∈ A(dR),
  (f|P(E × R)f) = Σ_{n∈I_E^1} Σ_{l∈I_2} ‖P_n^1 P_l^2 f‖^2 = Σ_{n∈I_E^1} Σ_{l∈I_2} ‖P_l^2 P_n^1 f‖^2
    = Σ_{n∈I_E^1} ‖P_n^1 f‖^2 = μ_f^{P^{A_1}}(E) = (f|P^{A_1}(E)f), ∀f ∈ H.
  (f|ϕ(A_1, A_2)f) = ∫_{R^2} ϕ dμ_f^P = Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) ‖P_n^1 P_l^2 f‖^2
    = (f | Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) P_n^1 P_l^2 f), ∀f ∈ D_{ϕ(A_1,A_2)};
since the mapping D_{ϕ(A_1,A_2)} ∋ f ↦ Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) P_n^1 P_l^2 f is obviously a linear operator (its definition is consistent by 10.4.7b), in view of 10.2.12 this implies that
  ϕ(A_1, A_2)f = Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) P_n^1 P_l^2 f, ∀f ∈ D_{ϕ(A_1,A_2)}.
this case to the study of the two operators A^M and A^{M^⊥}? The answer is clearly in the affirmative if D_A = H. Indeed, if D_A = H then for each f ∈ D_A we have the unique representation
  f = f_1 + f_2,
where f_1 ∈ M and f_2 ∈ M^⊥ (cf. 10.4.1), from which it follows that
  Af = A^M f_1 + A^{M^⊥} f_2.
However, if D_A ≠ H then we do not have in general
  D_A = (D_A ∩ M) + (D_A ∩ M^⊥)
(cf. 3.1.8 for the sum of two subsets of a linear space). In fact, we have the following
proposition, which is preliminary to the idea of a reducing subspace.
and hence
f ∈ D ⇒ [PM f ∈ D and PM ⊥ f ∈ D] ⇒
[PM f ∈ D ∩ M and PM ⊥ f ∈ D ∩ M ⊥ ] ⇒
[∃(f1 , f2 ) ∈ (D ∩ M ) × (D ∩ M ⊥ ) so that f = f1 + f2 ] ⇒
f ∈ (D ∩ M ) + (D ∩ M ⊥ ).
This proves the inclusion
D ⊂ (D ∩ M ) + (D ∩ M ⊥ ).
On the other hand, the inclusion
(D ∩ M ) + (D ∩ M ⊥ ) ⊂ D
is obvious.
17.2.3 Theorem. Let A ∈ O(H) and M ∈ S (H). The following conditions are
equivalent:
(a) M and M ⊥ are invariant subspaces for A (i.e. Af ∈ M , ∀f ∈ DA ∩ M , and
Ag ∈ M ⊥ , ∀g ∈ DA ∩ M ⊥ ) and PM (DA ) ⊂ DA ;
(b) PM (DA ) ⊂ DA and there exist A1 ∈ O(M ), A2 ∈ O(M ⊥ ) so that
DA1 = PM (DA ), DA2 = PM ⊥ (DA ),
Af = A1 PM f + A2 PM ⊥ f , ∀f ∈ DA ;
(c) PM A ⊂ APM (i.e. PM commutes with A, in the sense defined in 17.1.2).
If these conditions are satisfied, then
(d) the operators A_1 and A_2 are uniquely determined by condition b; in fact,
  A_1 = A^M and A_2 = A^{M^⊥}.
This proves condition b, with A_1 := A^M and A_2 := A^{M^⊥}.
b ⇒ c: We assume condition b. Then we have
f ∈ DPM A ⇒ f ∈ DA ⇒ PM f ∈ DA ,
i.e. DPM A ⊂ DAPM . Moreover, for every f ∈ DPM A (= DA ) we have
PM Af = PM (A1 PM f + A2 PM ⊥ f ) = A1 PM f
(since A1 PM f ∈ M and A2 PM ⊥ f ∈ M ⊥ , cf. 13.1.3b,c), and also
  AP_M f = A_1 P_M^2 f + A_2 P_{M^⊥} P_M f = A_1 P_M f,
and hence
PM Af = APM f.
c ⇒ a: We assume condition c. Then we have DPM A ⊂ DAPM and hence
f ∈ DA ⇒ f ∈ DPM A ⇒ f ∈ DAPM ⇒ PM f ∈ DA ,
i.e. PM (DA ) ⊂ DA . Moreover we have
  f ∈ D_A ∩ M ⇒ Af (1)= AP_M f (2)= P_M Af ∈ M
(1 holds true because f ∈ M and 2 because f ∈ D_A = D_{P_M A}), and also
  f ∈ D_A ∩ M^⊥ ⇒ Af (3)= AP_{M^⊥} f = Af − AP_M f (4)= Af − P_M Af (5)= P_{M^⊥} Af ∈ M^⊥
(3 holds true because f ∈ M^⊥, 4 because f ∈ D_A = D_{P_M A} and hence f ∈ D_{AP_M}, 5 because f ∈ D_{P_M A}).
d: We suppose that P_M(D_A) ⊂ D_A and that A_1 ∈ O(M), A_2 ∈ O(M^⊥) are so that condition b holds true. Then condition a holds true as well and we have
  D_{A_1} = P_M(D_A) = D_A ∩ M = D_{A^M},
in view of 17.2.2, and
  A_1 f = A_1 P_M f = A_1 P_M^2 f + A_2 P_{M^⊥} P_M f = AP_M f = A^M f, ∀f ∈ D_{A_1}.
This proves that A_1 = A^M. The proof of the equation A_2 = A^{M^⊥} is similar.
Proof. Let (0_M, g) ∈ \overline{G_{A^M}}; then there exists a sequence {(f_n, g_n)} in G_{A^M} so that
  f_n → 0_M and g_n → g;
now, (f_n, g_n) ∈ G_A for all n ∈ N and hence (since 0_M = 0_H) (0_H, g) ∈ \overline{G_A}, and hence (since A is closable) g = 0_H = 0_M. By 4.4.11a, this proves that A^M is closable.
We have P_M A ⊂ AP_M by hypothesis. Let f ∈ D_{\overline{A}} (= D_{P_M \overline{A}}); then there exists a sequence {f_n} in D_A so that
  f_n → f, {Af_n} is convergent, \overline{A}f = lim_{n→∞} Af_n
(cf. 4.4.10); now, the sequence {P_M f_n} is in D_A and (since P_M is continuous)
  P_M f_n → P_M f, {P_M Af_n} is convergent, i.e. {AP_M f_n} is convergent;
therefore,
  P_M f ∈ D_{\overline{A}} and \overline{A}P_M f = lim_{n→∞} AP_M f_n = lim_{n→∞} P_M Af_n = P_M lim_{n→∞} Af_n = P_M \overline{A}f.
  (A^M)^† = (A^†)^M,
where (A^M)^† denotes the adjoint of the operator A^M in the Hilbert space M (hence, (A^M)^† ∈ O(M)).
  g = g_1 + g_2 with g_1 ∈ D_A ∩ M and g_2 ∈ D_A ∩ M^⊥,
  (f|g_1) = (f|g_2) = 0,
  (f|g) = 0.
Since g was an arbitrary element of D_A and D_A^⊥ = {0_H} (cf. 10.4.4d), this proves that
(3 holds in view of 17.2.3b,d; 4 holds because A^{M^⊥} P_{M^⊥} f ∈ M^⊥ and g ∈ M; 5 holds because P_{M^⊥} f ∈ M^⊥ and (A^M)^† g ∈ M, since (A^M)^† denotes the adjoint of the operator A^M in the Hilbert space M); therefore g ∈ D_{A^†} and hence g ∈ D_{A^†} ∩ M = D_{(A^†)^M}. This proves that
  D_{(A^M)^†} ⊂ D_{(A^†)^M},
and hence, in view of 2, that (A^†)^M = (A^M)^†.
17.2.9 Proposition. Let A ∈ B(H) and M ∈ S (H). The following conditions are
equivalent:
(a) A is reduced by M ;
(b) M is an invariant subspace for both A and A† (i.e. Af ∈ M and A† f ∈ M ,
∀f ∈ M ; recall that DA† = H, cf. 12.2.2).
  ‖A^M f‖_M = ‖Af‖_H = ‖f‖_H = ‖f‖_M, ∀f ∈ M.
  A^{−1} f = A^† f ∈ M
  f = A(A^{−1} f) = A^M(A^{−1} f).
Proof. The parenthetic equivalence in condition b follows from 13.1.5 and 17.2.9.
The parenthetic equivalence in condition c follows from 17.2.11.
Condition a is P_M A ⊂ AP_M and hence, in view of 17.1.4, it is equivalent to condition a of 17.1.7 with B := P_M. Condition b is P_M P^A(E) = P^A(E) P_M for all E ∈ A(dR), and hence it is condition b of 17.1.7 with B := P_M. Condition c is P_M U^A(t) = U^A(t) P_M for all t ∈ R, and hence it is condition e of 17.1.7 with B := P_M. This proves that conditions a, b, c are equivalent.
d: We assume conditions a and b. Then we have
  P^{A,M}(E) ∈ P(M), ∀E ∈ A(dR),
by 17.2.12, and also
  μ_f^{P^{A,M}} = μ_f^{P^A} and μ_f^{P^{A,M}}(R) = (f|1_H f)_H = (f|1_M f)_M = ‖f‖_M^2, ∀f ∈ M.
In view of 13.3.5, this proves that P^{A,M} is a projection valued measure with values in P(M). In view of 15.2.2e, we have, for f ∈ M,
  ∫_R ξ^2 dμ_f^{P^{A,M}} < ∞ iff ∫_R ξ^2 dμ_f^{P^A} < ∞ iff f ∈ D_A iff f ∈ D_A ∩ M = D_{A^M},
and also
  (f|A^M f)_M = (f|Af)_H = ∫_R ξ dμ_f^{P^A} = ∫_R ξ dμ_f^{P^{A,M}}, ∀f ∈ D_{A^M}.
This proves that P^{A,M} is the projection valued measure of the self-adjoint operator A^M (cf. 15.2.1).
e: We assume conditions a and c. Then, for every t ∈ R, (U^A(t))^M is a linear operator in M and (cf. 16.1.7)
  (f|(U^A(t))^M f)_M = (f|U^A(t)f)_H = ∫_R ϕ_t dμ_f^{P^A} = ∫_R ϕ_t dμ_f^{P^{A,M}}, ∀f ∈ M.
By 16.1.7, this proves that (U^A(t))^M = U^{A^M}(t) for all t ∈ R.
In the next two theorems, the first statements generalize the content of 17.2.3b,d.
On the basis of these theorems, the study of the structure of a (closed) operator can
be carried out through the investigation of its reducing subspaces and its restrictions
to the intersections of its domain with them.
and suppose that an operator A in H is reduced by Mn for all n ∈ {1, ..., N }. Writing
A_n := A^{M_n} for all n ∈ {1, ..., N}, we have:
(a) D_A = {f ∈ H : P_n f ∈ D_{A_n}, ∀n ∈ {1, ..., N}},
  Af = Σ_{n=1}^N A_n P_n f, ∀f ∈ D_A;
(b) A is closed iff An is a closed operator in the Hilbert space Mn , ∀n ∈ {1, ..., N };
(c) A is adjointable iff An is an adjointable operator in the Hilbert space Mn ,
∀n ∈ {1, ..., N };
if these conditions hold true, then
  D_{A^†} = {f ∈ H : P_n f ∈ D_{A_n^†}, ∀n ∈ {1, ..., N}},
  A^† f = Σ_{n=1}^N A_n^† P_n f, ∀f ∈ D_{A^†},
where A_n^† denotes the adjoint of A_n in the Hilbert space M_n;
(d) A is symmetric iff An is symmetric, ∀n ∈ {1, ..., N };
A is self-adjoint iff An is self-adjoint, ∀n ∈ {1, ..., N };
A is essentially self-adjoint iff An is essentially self-adjoint, ∀n ∈ {1, ..., N };
(e) A ∈ B(H) iff An ∈ B(Mn ), ∀n ∈ {1, ..., N };
(f ) A ∈ U(H) iff An ∈ U(Mn ), ∀n ∈ {1, ..., N };
(g) A ∈ P(H) iff An ∈ P(Mn ), ∀n ∈ {1, ..., N }.
Moreover, if A is self-adjoint then:
(h) P^A(E) = Σ_{n=1}^N P^{A_n}(E) P_n, ∀E ∈ A(dR);
(i) U^A(t) = Σ_{n=1}^N U^{A_n}(t) P_n, ∀t ∈ R.
since DAn ⊂ DA and DA is a linear manifold. This proves the part of statement a
about D_A. Moreover, from 2 we have
  Af = Σ_{n=1}^N P_n Af = Σ_{n=1}^N A_n P_n f, ∀f ∈ D_A.
since D_{A_n^†} = D_{(A^†)^{M_n}} ⊂ D_{A^†} and D_{A^†} is a linear manifold. This proves the part of the statement about D_{A^†}. Moreover, from 4 we have
  A^† f = Σ_{n=1}^N P_n A^† f = Σ_{n=1}^N A_n^† P_n f, ∀f ∈ D_{A^†}.
d: The “only if” parts of statement d follow from 17.2.8. The “if” parts follow
from results a and c.
e: The “only if” part of statement e follows from 17.2.9c.
Now we assume An ∈ B(Mn ), for all n ∈ {1, ..., N }. From result a we have
D_A = H and also (cf. 4.2.5b)
  ‖Af‖^2 = Σ_{n=1}^N ‖A_n P_n f‖^2 ≤ max{‖A_n‖^2 : n ∈ {1, ..., N}} Σ_{n=1}^N ‖P_n f‖^2
    = max{‖A_n‖^2 : n ∈ {1, ..., N}} ‖f‖^2, ∀f ∈ H.
f: The “only if” part of statement f follows from 17.2.10.
Now we assume A_n ∈ U(M_n), for all n ∈ {1, ..., N}. From results a and c we have D_A = D_{A^†} = H, and also (in view of 12.5.1)
  AA^† f = Σ_{n=1}^N A_n P_n (Σ_{k=1}^N A_k^† P_k f) = Σ_{n=1}^N A_n A_n^† P_n f = Σ_{n=1}^N P_n f = f, ∀f ∈ H,
since D_{A_n} ⊂ D_A and D_A is a linear manifold, the first condition in 2 implies that
  Σ_{k=1}^n P_k f ∈ D_A and A(Σ_{k=1}^n P_k f) = Σ_{k=1}^n A_k P_k f, ∀n ∈ N;
now, the sequence {Σ_{k=1}^n P_k f} is convergent (cf. 13.2.8); moreover, the second condition in 2 implies that the series Σ_{n=1}^∞ AP_n f is convergent (cf. 10.4.7b), and hence that the sequence {A Σ_{k=1}^n P_k f} is convergent; since A is supposed to be closed, this implies that
  f = lim_{n→∞} Σ_{k=1}^n P_k f ∈ D_A.
The opposite inclusion follows from result a and 10.4.7a. This concludes the proof
of the “only if” part of statement b.
Now we suppose that A_n is closed, for all n ∈ N, and that
  D_A = {f ∈ H : P_n f ∈ D_{A_n}, ∀n ∈ N, and Σ_{n=1}^∞ ‖A_n P_n f‖^2 < ∞}.
e: The “only if” part of statement e follows from 17.2.9c and 4.4.3.
Now we assume that A is closed, An ∈ B(Mn ) for all n ∈ N, and
m := sup{kAn k : n ∈ N} < ∞.
Then we have
  Σ_{n=1}^∞ ‖A_n P_n f‖^2 ≤ m^2 Σ_{n=1}^∞ ‖P_n f‖^2 = m^2 ‖f‖^2, ∀f ∈ H,
and hence D_A = H, in view of result b. Moreover, in view of result a, we have (cf. 10.4.7a)
  ‖Af‖^2 = ‖Σ_{n=1}^∞ A_n P_n f‖^2 = Σ_{n=1}^∞ ‖A_n P_n f‖^2 ≤ m^2 ‖f‖^2, ∀f ∈ H.
Thus, A ∈ B(H).
f, g, h, i: The proofs of these statements are analogous to those of statements f,
g, h, i of 17.2.14, on the basis of results a, b, c.
To conclude this section, we present an example which shows that an invariant subspace may fail to be a reducing subspace even when its orthogonal complement is invariant as well.
since the equivalence class PM1 [u0 ] does not contain any continuous function (cf.
11.2.2c).
We note that not even the self-adjoint operator A0 (cf. 12.4.25) is reduced by
the subspace M1 . Indeed, if A0 were reduced by M1 then we should have
  P_{M_1} A_0 ⊂ A_0 P_{M_1},
and hence
  P_{M_1}[u_0] = P_{M_1} P^{A_0}({0})[u_0] = P^{A_0}({0}) P_{M_1}[u_0] = ([u_0]|P_{M_1}[u_0]) [u_0],
which cannot be true since PM1 [u0 ] does not contain any continuous function.
17.3 Irreducibility
We see that:
  D_A = H = {f ∈ H : ∫_R ξ^2 dμ_f^P < ∞},
  (f|Af) = (f|λ1_H f) = λ‖f‖^2 = ∫_R ξ dμ_f^P, ∀f ∈ H.
This proves that P^A = P (cf. 15.2.1), and hence that condition a is true.
17.3.3 Theorem. Let {Ai }i∈I be a set of self-adjoint operators in H. The following
conditions are equivalent:
(a) the set {Ai }i∈I is irreducible;
(b) if A is a self-adjoint operator in H which commutes with Ai (in the sense defined
in 17.1.5) for all i ∈ I, then there exists λ ∈ R so that A = λ1H .
Proof. a ⇒ b: Let A be a self-adjoint operator in H which commutes with A_i for all i ∈ I. Then,
  P^A(E) A_i ⊂ A_i P^A(E), ∀i ∈ I, ∀E ∈ A(dR)
(cf. 17.1.7b with B := A_i), and hence, if we assume condition a,
  P^A(E) ∈ {O_H, 1_H}, ∀E ∈ A(dR).
By 17.3.2, this implies that there exists λ ∈ R so that A = λ1_H.
b ⇒ a: Let P ∈ P(H) be such that
P Ai ⊂ Ai P, ∀i ∈ I.
Then P commutes with Ai (in the sense defined in 17.1.5) for all i ∈ I, by 17.1.4,
and hence, if we assume condition b,
∃λ ∈ R so that P = λ1H .
Since P is a projection, from P = P 2 (cf. 13.1.5) we have λ = λ2 , and hence
P ∈ {OH , 1H }.
17.3.5 Corollary (Schur's lemma). Let {A_i}_{i∈I} be an irreducible set of elements
of B(H) such that
  ∀i ∈ I, ∃j ∈ I so that A_i^† = A_j.
Then,
[B ∈ B(H) and BAi = Ai B, ∀i ∈ I] ⇒ [∃α ∈ C so that B = α1H ].
Chapter 18
Statistical operators were devised by John von Neumann in order to represent the
most general statistical ensembles of a given quantum system (cf. Neumann, 1932,
Chapter IV). In this representation, those particular ensembles which von Neu-
mann denoted as homogeneous (and which we call pure states in Chapter 19) are
represented by one-dimensional projections, which are a special case of statistical
operators.
In this chapter we study statistical operators. Before that, we need to study the
polar decomposition for elements of B(H) and a subset of B(H) which is called the
trace class. As usual, H denotes an abstract Hilbert space throughout the chapter.
In this section we find a decomposition for elements of B(H) which is the general-
ization of the decomposition z = |z| exp(i arg z) for a complex number z. First, we
must find the right analogues of a positive number and of the absolute value |z| of a complex number z.
18.1.2 Remarks.
(a) If an operator A ∈ B(H) is positive then (f |Af ) ∈ R for all f ∈ H, and hence
A is self-adjoint (cf. 12.4.3).
(b) If an operator A ∈ B(H) is positive then there exists a unique positive operator B ∈ B(H) such that A = B^2 (cf. 15.3.9). The operator B will be denoted by the symbol A^{1/2}.
If A ∈ B(H) is positive and T ∈ B(H) is such that [T, A] = O_H, then [T, A^{1/2}] = O_H (cf. 15.3.9).
(c) If an operator A ∈ B(H) is positive then the operator U AU −1 is a positive
Proof. a: We have
  ‖|A|f‖^2 = (f||A|^2 f) = (f|A^† A f) = ‖Af‖^2, ∀f ∈ H.
The definitions given in 18.1.1 and in 18.1.3 are generalizations from C to B(H).
Indeed, if H is a one-dimensional Hilbert space then every complex number can
be identified with an element of B(H) (cf. 12.6.6a). In this identification, positive
numbers are identified with positive operators; moreover, if Aα is the operator
that corresponds to the complex number α, then |Aα | corresponds to |α|. In the
decomposition z = |z| exp(i arg z) for a complex number z, the number exp(i arg z)
is an element of T and hence it can be identified with an element of U(H) (cf.
12.6.6a). However, in order to obtain the decomposition for elements of B(H) we
are after, the right generalization of T is wider than U(H) when H is not a one-
dimensional Hilbert space.
18.1.5 Definitions. An operator U ∈ B(H) is called an isometry if ‖Uf‖ = ‖f‖ for all f ∈ H, while it is more generally called a partial isometry, or it is said to be partially isometric, if ‖Uf‖ = ‖f‖ for all f ∈ N_U^⊥.
If U ∈ B(H) is partially isometric then I(U) := N_U^⊥ is called the initial subspace of U (I(U) is actually a subspace of H by 10.2.13).
Each element of U(H) is obviously an isometry. Each element of P(H) is a
partial isometry in view of 13.1.3b,c and 10.4.4a.
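As a finite-dimensional sketch of the definition (our own numerical illustration, using numpy), a one-dimensional orthogonal projection is a partial isometry: it preserves norms on N_U^⊥ but not on all of H.

```python
import numpy as np

# A rank-one orthogonal projection P on C^2 (P f = (u|f) u, ||u|| = 1).
u = np.array([1.0, 1.0]) / np.sqrt(2.0)
P = np.outer(u, u.conj())

# On N_P^perp = span{u}, P preserves norms ...
f = 3.0 * u
assert np.isclose(np.linalg.norm(P @ f), np.linalg.norm(f))

# ... but on a generic vector it does not, so P is a partial isometry
# that is not an isometry.
g = np.array([1.0, 0.0])
assert np.linalg.norm(P @ g) < np.linalg.norm(g)
```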
c: First we point out that I(U ) and F (U ) can be considered Hilbert spaces
since they are subspaces of H (cf. 10.3.2). Next we notice that the linear operator
UI(U) is surjective onto RU since U f = U PI(U) f for all f ∈ H (cf. the preliminary
remark). Then, statement c holds true in view of 10.1.20.
d: We have
  (f|U^† U f) = ‖Uf‖^2 = ‖U P_{I(U)} f‖^2 = ‖P_{I(U)} f‖^2 = (f|P_{I(U)} f), ∀f ∈ H.
is a unitary operator from the Hilbert space I(U^†) onto the Hilbert space F(U^†). Now,
  I(U^†) = N_{U^†}^⊥ = \overline{R_U} = F(U).
Moreover, from
  N_U = N_{U^{††}} = R_{U^†}^⊥
(cf. 12.1.6b and 12.1.7) we have
  F(U^†) = \overline{R_{U^†}} = R_{U^†} = N_U^⊥ = I(U)
by 10.4.4a since R_{U^†} is a subspace of H, in view of statements e and b (written with U^† in place of U).
18.1.7 Theorem. Let A ∈ B(H). Then there exists a unique partially isometric operator U ∈ B(H) such that
  A = U|A| and N_U = N_A.
Moreover,
  R_U = \overline{R_A} and |A| = U^† A.
The equality A = U|A| is called the polar decomposition of A.
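For matrices, the theorem can be checked numerically through the singular value decomposition; the following sketch is our construction (not the book's proof): it produces |A| and the factor U and verifies A = U|A| and |A| = U^†A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# SVD: A = W diag(S) Vh with S >= 0.
W, S, Vh = np.linalg.svd(A)

# |A| = (A^+ A)^{1/2} = Vh^+ diag(S) Vh, and U := W Vh (unitary here,
# since a generic A is invertible; in general U is a partial isometry).
absA = Vh.conj().T @ np.diag(S) @ Vh
U = W @ Vh

assert np.allclose(U @ absA, A)           # A = U |A|
assert np.allclose(U.conj().T @ A, absA)  # |A| = U^+ A
```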
hence such that h = lim_{n→∞} V g_n = Ṽ g; thus, h ∈ R_{Ṽ}. This proves the inclusion \overline{R_V} ⊂ R_{Ṽ}, and hence (in view of 3) the equalities
  R_{Ṽ} = \overline{R_V} = \overline{R_A}.   (4)
Now we set M := \overline{R_{|A|}} and define the operator
  U := Ṽ P_M,
which is an element of B(H) (note that D_{Ṽ} = M). In what follows we prove that U satisfies the conditions of the statement.
From the definition of U and from 4 we have
  R_U = R_{Ṽ} = \overline{R_A}.
From the definitions of U, Ṽ , V , from 4 and from 13.1.3c we have
U |A|g = Ṽ |A|g = V |A|g = Ag, ∀g ∈ H,
i.e. A = U |A|.
Moreover, we have
  N_U (5)= N_{P_M} (6)= (\overline{R_{|A|}})^⊥ (7)= R_{|A|}^⊥ (8)= N_{|A|} (9)= N_A,   (10)
where 5 follows from 2, 6 from 13.1.3b, 7 from 10.2.11, 8 from 12.1.7 (since |A| is self-adjoint), 9 from 18.1.4c.
Furthermore, from 10 and from 10.4.4c we have
  N_U^⊥ = R_{|A|}^{⊥⊥} = \overline{R_{|A|}} = M,   (11)
and hence, in view of the definition of U and of 13.1.3c,
  U f = Ṽ f, ∀f ∈ N_U^⊥,
and hence, in view of 2,
  ‖Uf‖ = ‖f‖, ∀f ∈ N_U^⊥.
Thus, the operator U is partially isometric.
Finally, from U |A| = A we have
U † U |A| = U † A;
now, from 18.1.6d and 11, we have U † U = PM and hence (in view of 13.1.3c)
|A| = U † A.
Uniqueness: Suppose that T is a partially isometric element of B(H) such that
A = T |A| and NT = NA .
Let f ∈ R|A| and let g ∈ H be so that f = |A|g; then,
U f = U |A|g = Ag = T |A|g = T f.
18.1.9 Remark. The analogy between the symbols |A| for A ∈ B(H) and |z| for
z ∈ C must not induce the reader to expect other properties for |A| than the ones
discussed above. To see this, for any u, v ∈ H̃ (cf. 10.9.4) we define the mapping
  A_{u,v} : H → H
  f ↦ A_{u,v} f := (u|f) v.
We notice that, if u = v, then A_{u,v} = A_u (the one-dimensional projection defined in 13.1.12). It is obvious that A_{u,v} ∈ B(H) (use the Schwarz inequality, cf. 10.1.9).
Moreover, the equation
  (A_{u,v} f|g) = \overline{(u|f)} (v|g) = (f|A_{v,u} g), ∀f, g ∈ H,
proves that A_{u,v}^† = A_{v,u} (cf. 12.1.3B). Then, the equation
  A_{u,v}^† A_{u,v} f = (u|f) u = A_u f, ∀f ∈ H,
proves that A_{u,v}^† A_{u,v} = A_u, and hence (since A_u is positive and A_u^2 = A_u) that
  |A_{u,v}| = A_u.
By the same token, we also have
  |A_{u,v}^†| = |A_{v,u}| = A_v.
Moreover, from
  A_{v,u} A_{u,v} = A_u
we have
  |A_{v,u} A_{u,v}| = A_u,
while
  |A_{v,u}| |A_{u,v}| = A_v A_u.
Thus, the properties
  |z̄| = |z|, ∀z ∈ C, and |zw| = |z||w|, ∀z, w ∈ C,
have no counterparts in B(H).
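A small numerical sketch of the remark (ours, with numpy): for orthogonal unit vectors u, v, the operators A_{u,v} show that |A^†| ≠ |A| and |BC| ≠ |B||C| in general.

```python
import numpy as np

def absval(A):
    """|A| = (A^+ A)^{1/2} via diagonalization of the positive A^+ A."""
    w, V = np.linalg.eigh(A.conj().T @ A)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

# A_{u,v} f = (u|f) v, i.e. the outer product v u^+, for orthonormal u, v.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
A_uv = np.outer(v, u.conj())
A_vu = np.outer(u, v.conj())  # this is A_{u,v}^+

# |A_{u,v}| = A_u (projection onto u), while |A_{u,v}^+| = A_v.
assert np.allclose(absval(A_uv), np.outer(u, u.conj()))
assert not np.allclose(absval(A_uv.conj().T), absval(A_uv))

# |A_{v,u} A_{u,v}| = A_u, but |A_{v,u}| |A_{u,v}| = A_v A_u = 0 here.
assert np.allclose(absval(A_vu @ A_uv), np.outer(u, u.conj()))
assert np.allclose(absval(A_vu) @ absval(A_uv), np.zeros((2, 2)))
```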
In this and in the next section, the Hilbert space H is assumed to be separable. All
definitions, statements and proofs are written on the hypothesis that the orthogonal
dimension of H is denumerable. If the orthogonal dimension of H were finite then all the arguments presented would be simplified in an obvious way and some conditions would become trivial.
18.2.1 Theorem. Let A be a positive element of B(H) and let {un }n∈N be a c.o.n.s.
in H.
If {v_n}_{n∈N} is another c.o.n.s. in H then
  Σ_{n=1}^∞ (u_n|Au_n) = Σ_{n=1}^∞ (v_n|Av_n)
tr(A + B) = tr A + tr B;
tr(aA) = a tr A;
(c) if B ∈ B(H) is such that (f |Af ) ≤ (f |Bf ) for all f ∈ H, then B is positive and
tr A ≤ tr B;
tr(U AU −1 ) = tr A;
tr(V AV † ) ≤ tr A;
tr(V AV † ) = tr A;
  (f|V A V^† f) = (V^† f|A V^† f) ≥ 0, ∀f ∈ H.
  = Σ_{i∈I} (u_i|V A V^† u_i),   (2)
since R_V^⊥ = N_{V^†} (cf. 12.1.7). We point out that 1 holds true by an easy corollary to 5.4.7 (in 5.4.7, take a_{n,m} := 0 for n > 2). Now, the restriction of V^† to R_V is a unitary operator from the Hilbert space R_V onto the Hilbert space N_V^⊥ (cf. 18.1.6f), and hence {V^† u_i}_{i∈I} is a c.o.n.s. in the Hilbert space N_V^⊥ (cf. 10.6.5c and 10.6.8b), and hence it is an o.n.s. in H which is complete in the subspace N_V^⊥ (cf. 10.6.5c). If V is an isometry then N_V = {0_H}, and hence {V^† u_i}_{i∈I} is a c.o.n.s. in H, and hence (in view of 2)
  tr A = Σ_{i∈I} (V^† u_i|A V^† u_i) = Σ_{i∈I} (u_i|V A V^† u_i) = tr(V A V^†).
f: Let {vi }i∈I be an o.n.s. in H and let {un }n∈N be a c.o.n.s. in H which
contains {v_i}_{i∈I} (cf. 10.7.3). Then,
  Σ_{i∈I} ‖Av_i‖^2 ≤ Σ_{n=1}^∞ ‖Au_n‖^2 = Σ_{n=1}^∞ (Σ_{k=1}^∞ |(u_k|Au_n)|^2)
    (3)≤ Σ_{n=1}^∞ [Σ_{k=1}^∞ (u_k|Au_k)(u_n|Au_n)]
    (4)= Σ_{n=1}^∞ [(Σ_{k=1}^∞ (u_k|Au_k))(u_n|Au_n)]
    (5)= (Σ_{k=1}^∞ (u_k|Au_k))(Σ_{n=1}^∞ (u_n|Au_n)) = (tr A)(tr A),
where 4 and 5 hold true by 5.4.5 and 3 by 5.4.2a since, for all k, n ∈ N,
  |(u_k|Au_n)|^2 = |(A^{1/2} u_k|A^{1/2} u_n)|^2 ≤ ‖A^{1/2} u_k‖^2 ‖A^{1/2} u_n‖^2 = (u_k|Au_k)(u_n|Au_n).
  = tr(U^† V |A| V^† U) ≤ tr(V |A| V^†) ≤ tr |A|,
in view of 18.2.2e (once for V and once for U^†, which is partially isometric by 18.1.6e). Moreover,
  Σ_{n=1}^∞ ‖|A|^{1/2} u_n‖^2 = Σ_{n=1}^∞ (u_n||A|u_n) = tr |A|.
(cf. 18.1.4a), and hence that ‖A‖ ≤ tr |A|, which is statement d. From this we obviously have
  A = O_H if tr |A| = 0,
which is statement c.
e: Let A ∈ T (H) and let U be a partially isometric element of B(H) such that
A = U |A| and NU = NA
(cf. 18.1.7). Then A† = |A|U † (cf. 12.6.4) and hence
  A^† = U^† U |A| U^†
by 13.1.3c, since U^† U is the orthogonal projection onto the subspace
  N_U^⊥ = N_A^⊥ = N_{|A|}^⊥
(cf. 18.1.6d and 18.1.4c) and since the equality N_{|A|} = R_{|A|}^⊥ (cf. 12.1.7) implies the inclusion
  N_{|A|}^⊥ ⊃ R_{|A|}
(cf. 10.2.10d). Thus,
  (A^†)^† A^† = A A^† = U|A|U^† U|A|U^† = (U|A|U^†)^2.
In view of 18.2.2e, the operator U|A|U^† is positive. Therefore,
  U|A|U^† = |A^†|.
Then, by 18.2.2e once more,
tr |A† | = tr(U |A|U † ) ≤ tr |A|.
This proves that A† ∈ T (H).
Now, since in the reasoning above A was an arbitrary element of T (H), we can
replace A with A† and obtain
tr |A| = tr |(A† )† | ≤ tr |A† |,
and hence
tr |A† | = tr |A|.
f: If U ∈ UA(H) and A ∈ B(H), then |U AU −1 | = U |A|U −1 (cf. 18.1.4d), and
hence
tr |U AU −1 | = tr(U |A|U −1 ) = tr |A|
(cf. 18.2.2d). Therefore, if A ∈ T (H) then U AU −1 ∈ T (H).
(cf. 4.2.5b), and hence we can define the operator (1_H − B^2)^{1/2}. We have
  (B ± i(1_H − B^2)^{1/2})^† = B ∓ i(1_H − B^2)^{1/2}
and hence
  (B ± i(1_H − B^2)^{1/2})^† (B ± i(1_H − B^2)^{1/2}) = B^2 ∓ i(1_H − B^2)^{1/2} B ± iB(1_H − B^2)^{1/2} + 1_H − B^2 = 1_H,
since [B, (1_H − B^2)^{1/2}] = O_H (cf. 18.1.2b). Similarly, we have
  (B ± i(1_H − B^2)^{1/2})(B ± i(1_H − B^2)^{1/2})^† = 1_H.
In view of 12.5.1, this proves that B + i(1_H − B^2)^{1/2} and B − i(1_H − B^2)^{1/2} are unitary operators. Moreover,
  (1/2)(B + i(1_H − B^2)^{1/2}) + (1/2)(B − i(1_H − B^2)^{1/2}) = B.
Thus, there exist V_1, V_2 ∈ U(H) so that B = (1/2)(V_1 + V_2).
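The construction just given can be checked numerically; in the sketch below (our illustration, with values chosen only for concreteness) B is a 2×2 self-adjoint matrix with ‖B‖ ≤ 1 and V_1, V_2 are built as B ± i(1_H − B^2)^{1/2}.

```python
import numpy as np

# A self-adjoint B with ||B|| <= 1 (values chosen only for illustration).
B = np.array([[0.3, 0.2], [0.2, -0.1]])

# (1_H - B^2)^{1/2} via the spectral theorem for the positive 1 - B^2.
w, V = np.linalg.eigh(np.eye(2) - B @ B)
root = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

V1 = B + 1j * root
V2 = B - 1j * root

# V1 and V2 are unitary, and their mean is B.
assert np.allclose(V1.conj().T @ V1, np.eye(2))
assert np.allclose(V2 @ V2.conj().T, np.eye(2))
assert np.allclose((V1 + V2) / 2.0, B)
```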
Next we notice that, for all A ∈ B(H) − {O_H}:
  A = ‖A‖ [ (1/2)‖A‖^{−1}(A + A^†) − i (1/2)‖A‖^{−1} i(A − A^†) ];
  (1/2)‖A‖^{−1}(A + A^†) and (1/2)‖A‖^{−1} i(A − A^†) are self-adjoint;
  ‖(1/2)‖A‖^{−1}(A + A^†)‖ ≤ 1.
18.2.7 Theorem. Suppose that A ∈ T (H) and B ∈ B(H). Then BA ∈ T (H) and
AB ∈ T (H).
Proof. Let U ∈ U(H). The operator U −1 |A|U is positive (cf. 18.1.2c) and
(U −1 |A|U )2 = U −1 |A|2 U = U −1 A† AU = (AU )† (AU )
(cf. 12.6.4 and 12.5.1b). Therefore,
U −1 |A|U = |AU | and hence tr |AU | = tr |A| < ∞
(cf. 18.2.2d). Moreover,
|A|2 = A† A = A† U † U A = (U A)† (U A)
(cf. 12.5.1c) proves that
|A| = |U A| and hence tr |U A| = tr |A| < ∞.
Since U was an arbitrary element of U(H), this proves that
AU, U A ∈ T (H), ∀U ∈ U(H),
and this proves the statement, in view of 18.2.6 and 18.2.4a,b.
If I = N, the first series is convergent with respect to the norm for B(H) defined in 4.2.11a. The one-dimensional projection A_{u_n} is defined as in 13.1.12.
Proof. We set
  E_k := ( (1/(k+1))‖A‖, (1/k)‖A‖ ] and P_k := P^A(E_k), ∀k ∈ N.
Since σ(A) ⊂ [0, ‖A‖] (cf. 15.3.9 and 4.5.10), we have
  P^A({0})f + Σ_{k=1}^∞ P_k f = P^A(σ(A))f = f, ∀f ∈ H   (1)
(cf. 15.2.2d).
For all k ∈ N, the subspace M_k := R_{P_k} is finite-dimensional. Indeed, if we fix k ∈ N then for all f ∈ M_k we have
  μ_f^{P^A}(R − E_k) = ‖P^A(R − E_k)f‖^2 = ‖P^A(R − E_k)P^A(E_k)f‖^2 = 0
(cf. 15.2.2e and 8.1.11a). In view of 18.2.2f, this proves that each o.n.s. contained in M_k must be finite, and hence that the orthogonal dimension of M_k is finite.
For all k ∈ N, we have
[A, Pk ] = OH
(cf. 15.2.1B). Hence, the operator A is reduced by the subspace Mk (cf. 17.2.4)
and Ak := AMk is a self-adjoint operator in the Hilbert space Mk (cf. 17.2.8).
Therefore, if Mk 6= {0H } then there exists an o.n.s. {vk,i }i∈Ik which is complete
in the subspace Mk and whose elements are eigenvectors of Ak (cf. 15.3.4C and
10.6.5c), i.e. so that
For each i ∈ Ik , it is obvious that µk,i is an eigenvalue of A; then, µk,i ∈ [0, ∞); more-
over, Pk P A ({0}) = OH (cf. 13.3.2b) implies vk,i ∈ NA⊥ by 13.2.9 (since P A ({0}) is
the orthogonal projection onto NA , cf. 15.2.5e), and hence µk,i ∈ (0, ∞). Further,
we have
  AP_k f = A_k P_k f = Σ_{i∈I_k} (v_{k,i}|A_k P_k f)_{M_k} v_{k,i} = Σ_{i∈I_k} (P_k A v_{k,i}|f)_H v_{k,i} = Σ_{i∈I_k} μ_{k,i} (v_{k,i}|f) v_{k,i}, ∀f ∈ H   (2)
(cf. 10.6.4b). Letting J := {k ∈ N : M_k ≠ {0_H}}, from 1 and 2 and from the continuity of A we infer that
  Af = AP^A({0})f + Σ_{k=1}^∞ AP_k f = Σ_{k∈J} (Σ_{i∈I_k} μ_{k,i} (v_{k,i}|f) v_{k,i}), ∀f ∈ H.   (3)
Now let I := {1, ..., N} or I := N be so that there is a bijection from I onto the set ⋃_{k∈J} I_k, and for each n ∈ I let
Then {u_n}_{n∈I} is obviously an o.n.s. in H (since M_k ⊂ M_h^⊥ if k ≠ h), {λ_n}_{n∈I} is a family of elements of (0, ∞), and 3 can be written as
  Af = Σ_{n∈I} λ_n (u_n|f) u_n, ∀f ∈ H,   (4)
in view of 10.4.10 (note that every series which may appear in 3 is convergent in view of 13.2.8 and 10.6.1).
Now let {wj }j∈N be a c.o.n.s. in H which contains {un }n∈I (cf. 10.7.3). Then,
18.2.9 Corollary. Let A ∈ T(H) and suppose that A ≠ OH. Then there exist two
orthonormal systems {un}n∈I and {vn}n∈I (with I := {1, ..., N} or I := N) in H
and a family {λn}n∈I of elements of (0, ∞) (not necessarily different from each
other) so that (denoting by Σ_{n∈I} either Σ_{n=1}^N or Σ_{n=1}^∞)
$$A = \sum_{n \in I} \lambda_n A_{u_n,v_n}, \qquad |A| = \sum_{n \in I} \lambda_n A_{u_n}, \qquad \operatorname{tr}|A| = \sum_{n \in I} \lambda_n.$$
If I = N, the first two series are convergent with respect to the norm for B(H)
defined in 4.2.11a. The operator Aun ,vn is defined as in 18.1.9.
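In finite dimension the data of the corollary can be read off a singular value decomposition: the singular values are the λn and the singular vectors supply {un} and {vn}. The following Python sketch is a numerical illustration only, with an arbitrarily chosen matrix standing in for A (in finite dimension every operator is trace class); it checks the three identities of the statement, with A_{u,v} the rank-one operator f ↦ (u|f)v of 18.1.9.

```python
import numpy as np

# An arbitrary invertible 3x3 matrix standing in for A.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 2.0]])

# SVD: A = V diag(lam) U^dagger, i.e. A f = sum_n lam_n (u_n|f) v_n,
# so A = sum_n lam_n A_{u_n,v_n} with A_{u,v} f := (u|f) v.
V, lam, Uh = np.linalg.svd(A)
U = Uh.conj().T                       # columns of U are the u_n

A_rebuilt = sum(l * np.outer(V[:, k], U[:, k].conj()) for k, l in enumerate(lam))
assert np.allclose(A, A_rebuilt)      # A = sum_n lam_n A_{u_n,v_n}

absA = U @ np.diag(lam) @ U.conj().T  # |A| = sqrt(A^dagger A) = sum_n lam_n A_{u_n}
assert np.allclose(absA @ absA, A.T @ A)
assert np.isclose(np.trace(absA), lam.sum())   # tr|A| = sum_n lam_n
```

The polar decomposition A = U_pol|A| used in the proof corresponds to U_pol = V U† here, so that vn = U_pol un.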
(cf. 18.2.8 with |A| in place of A). If I = N, the first series is convergent with
respect to the norm for B(H) defined in 4.2.11a. Since N_{|A|} = NA (cf. 18.1.4c) and
since un ∈ N_{|A|}^⊥ (note that |A|un = λn un and then use 12.4.20B), we have
$$u_n \in N_U^{\perp}, \quad \forall n \in I.$$
Now, the restriction of the operator U to the subspace N_U^⊥ is a unitary operator
from the Hilbert space N_U^⊥ onto the Hilbert space RU (cf. 18.1.6c). Thus, if we
set vn := U un for all n ∈ I, {vn}n∈I is an o.n.s. in H (cf. 10.6.5c and 10.6.8a).
Moreover,
$$U A_{u_n} = A_{u_n,v_n}, \quad \forall n \in I,$$
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 588
18.2.10 Theorem. Let A ∈ T(H) and let {vn}n∈N be a c.o.n.s. in H. Then the
series Σ_{n=1}^∞ (vn|Avn) is absolutely convergent and hence it is convergent. The sum
of this series is independent of the c.o.n.s. {vn}n∈N in H chosen to compute it, and
it is called the trace of A and denoted by tr A. Thus,
$$\operatorname{tr} A := \sum_{n=1}^{\infty} (w_n|Aw_n)$$
for whichever c.o.n.s. {wn}n∈N in H. It is obvious that, if A is positive, this
definition agrees with the one given in 18.2.1.
The following inequalities hold true:
(a) |tr BA| ≤ ‖B‖ tr |A|, ∀B ∈ B(H);
(b) tr |BA| ≤ ‖B‖ tr |A|, ∀B ∈ B(H);
(c) |tr A| ≤ tr |A|.
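The basis-independence of the trace and inequality (c) are easy to test numerically; the sketch below is illustrative only, with a randomly generated matrix standing in for A and a second c.o.n.s. obtained from a QR factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Two c.o.n.s. of C^4: the standard basis and the columns of a random unitary Q.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

tr_standard = sum(A[k, k] for k in range(n))                      # sum (e_k|A e_k)
tr_rotated = sum(Q[:, k].conj() @ A @ Q[:, k] for k in range(n))  # sum (q_k|A q_k)
assert np.allclose(tr_standard, tr_rotated)

# Inequality (c): |tr A| <= tr|A|, the sum of the singular values of A.
assert abs(np.trace(A)) <= np.linalg.svd(A, compute_uv=False).sum() + 1e-12
```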
(1 holds true by the Schwarz inequality in ℓ², cf. 10.2.8b and 10.3.8d; 2 holds true
by 10.6.4d with M := H; 3 holds true by 18.1.6a and 4.2.5b). Then we have
$$\sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \bigr| \Bigr) \le \sum_{n \in I} \lambda_n = \operatorname{tr}|A| < \infty, \tag{4}$$
and hence
$$\begin{aligned}
\sum_{k=1}^{\infty} |(v_k|Av_k)| &= \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| |A| v_k\bigr) \bigr| = \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| P |A| v_k\bigr) \bigr| \\
&= \sum_{k=1}^{\infty} \Bigl| \sum_{n \in I} \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \Bigr| \\
&\overset{(5)}{\le} \sum_{k=1}^{\infty} \sum_{n \in I} \bigl| \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \bigr| \\
&\overset{(6)}{=} \sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \bigr| \Bigr) < \infty
\end{aligned}$$
(5 holds true by 5.4.2a, and by 5.4.10 if I = N; 6 holds true by 5.4.6 if I = {1, ..., N}
or by 5.4.7 if I = N). Thus, the series Σ_{k=1}^∞ (vk|Avk) is absolutely convergent and
hence it is convergent by 4.1.8b. Moreover, we have
$$\begin{aligned}
\sum_{k=1}^{\infty} (v_k|Av_k) &= \sum_{k=1}^{\infty} \bigl(U^{\dagger} v_k \big| |A| v_k\bigr) = \sum_{k=1}^{\infty} \bigl(U^{\dagger} v_k \big| P |A| v_k\bigr) \\
&= \sum_{k=1}^{\infty} \Bigl( \sum_{n \in I} \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \Bigr) \\
&\overset{(7)}{=} \sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \Bigr) \\
&\overset{(8)}{=} \sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl(|A| u_n \big| v_k\bigr) \bigl(v_k \big| U u_n\bigr) \Bigr) = \sum_{n \in I} \bigl(|A| u_n \big| U u_n\bigr)
\end{aligned}$$
(7 holds true by 8.4.14b, since 4 proves that the conditions in 8.4.14a are satisfied;
8 holds true by 10.6.4c with M := H). Since the last term of this equation
is independent of the choice of the c.o.n.s. {vn}n∈N, this proves that the sum of
the series Σ_{k=1}^∞ (vk|Avk) is independent as well (note that, if I = N, the series
Σ_{n=1}^∞ (|A|un|U un) is shown to be convergent by the way the last term of the
equation above has been obtained).
Now we prove the inequalities of the statement.
a: Let {wj}j∈N be a c.o.n.s. in H which contains the o.n.s. {un}n∈I (cf. 10.7.3).
If j ∈ N is such that wj ∉ {un}n∈I, then
$$(w_j|u_n) = 0, \quad \forall n \in I,$$
and hence
$$w_j \in N_{|A|} = N_A$$
$$|BA| = V^{\dagger} B A$$
Proof. a: For the function tr, property lo₁ of 3.2.1 is obvious and properties lo₂,
lo₃ follow directly from the property ip₁ of an inner product and from the continuity
of sum and product in C. The continuity of tr follows from 18.2.10c and from 4.2.2.
b: This follows directly from 12.1.3A, from property ip₂ of an inner product,
and from the continuity of complex conjugation.
where 1 holds true by 12.5.1c and 2 because {U vn}n∈N is a c.o.n.s. in H (cf. 10.6.8b).
In view of 18.2.6 and property a, this proves that
$$\operatorname{tr}(AB) = \operatorname{tr}(BA), \quad \forall B \in B(H).$$
d: This follows immediately from property c.
e: Let A ∈ T(H) and V ∈ A(H). If {vn}n∈N is a c.o.n.s. in H then
$$\operatorname{tr}(V A V^{-1}) = \sum_{n=1}^{\infty} \bigl(v_n \big| V A V^{-1} v_n\bigr) = \sum_{n=1}^{\infty} \bigl(A V^{-1} v_n \big| V^{-1} v_n\bigr) = \operatorname{tr} A,$$
since {V⁻¹vn}n∈N is a c.o.n.s. in H (cf. 10.6.8b).
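These trace identities can be tested numerically in finite dimension; the sketch below is illustrative only, with randomly generated matrices, and it takes V unitary (the text's property e concerns V ∈ A(H); the unitary case already follows from the cyclic property tr(AB) = tr(BA)).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Cyclic property: tr(AB) = tr(BA).
assert np.allclose(np.trace(A @ B), np.trace(B @ A))

# Invariance under conjugation by a unitary V (from a QR factorization).
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
assert np.allclose(np.trace(V @ A @ np.linalg.inv(V)), np.trace(A))
```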
for each A ∈ B(H) and for each o.n.s. {un }n∈I in H which is complete in the
subspace RP ;
(d) if P ∈ T (H) then
0 ≤ tr(P A) ≤ tr A
for each positive element A of B(H).
Proof. In what follows, let {un }n∈I be a c.o.n.s. in H which is complete in the
subspace RP (cf. 10.7.2) and let {wj }j∈N be a c.o.n.s. in H which contains {un }n∈I
(cf. 10.7.3). Then,
wj ∈ NP if j ∈ N is such that wj ∉ {un}n∈I
(cf. 13.1.10).
a and b: We notice that P is positive, in view of 13.1.7c. Then we have
$$\operatorname{tr}|P| = \operatorname{tr} P = \sum_{j=1}^{\infty} (w_j|P w_j) = \sum_{n \in I} (u_n|P u_n) = \sum_{n \in I} (u_n|u_n)$$
18.2.14 Theorem. The normed space (T (H), ν1 ) (i.e. the linear space T (H) with
the norm ν1 , cf. 18.2.5) is a Banach space.
18.2.15 Theorem. Let {un }n∈I and {vn }n∈I be families of elements of H̃ (cf.
10.9.4) and let {λn }n∈I be a family of elements of C, with I := {1, ..., N } or I := N.
If I = N, suppose that Σ_{n=1}^∞ |λn| < ∞.
If I = {1, ..., N} then the operator defined by
$$A := \sum_{n=1}^{N} \lambda_n A_{u_n,v_n}$$
is an element of T (H).
If I = N then the series Σ_{n=1}^∞ λn A_{un,vn} is convergent in the normed space
(T(H), ν₁) (and hence also in the normed space B(H) with respect to the norm for
B(H) defined in 4.2.11a) and therefore the operator defined by
$$A := \sum_{n=1}^{\infty} \lambda_n A_{u_n,v_n}$$
is an element of T(H).
In both cases we have (denoting by Σ_{n∈I} either Σ_{n=1}^N or Σ_{n=1}^∞)
$$\operatorname{tr}(AB) = \operatorname{tr}(BA) = \sum_{n \in I} \lambda_n (u_n|Bv_n), \quad \forall B \in B(H),$$
Proof. First we recall that, for u, v ∈ H̃, we have |A_{u,v}| = A_u (cf. 18.1.9), and
hence tr |A_{u,v}| = 1 (cf. 18.2.12b), and hence A_{u,v} ∈ T(H). Moreover, if {wn}n∈N is
a c.o.n.s. in H which contains {u} (cf. 10.7.3), then we have
$$\operatorname{tr}(B A_{u,v}) = \sum_{n=1}^{\infty} (w_n|B A_{u,v} w_n) = (u|Bv), \quad \forall B \in B(H).$$
Since T (H) is a linear manifold in B(H) and since the function tr is a linear func-
tional (cf. 18.2.11a), this proves the whole statement for I = {1, ..., N }.
Now we suppose I = N. We notice that, in the normed space (T(H), ν₁), the
series Σ_{n=1}^∞ λn A_{un,vn} is absolutely convergent since
$$\nu_1(\lambda_n A_{u_n,v_n}) = |\lambda_n| \operatorname{tr}|A_{u_n,v_n}| = |\lambda_n|, \quad \forall n \in \mathbb{N}.$$
Then, in view of 18.2.14 and 4.1.8b, the series Σ_{n=1}^∞ λn A_{un,vn} is convergent in the
normed space (T(H), ν₁), and hence also in the normed space B(H) with respect to
the norm for B(H) defined in 4.2.11a (cf. 18.2.5a). For all B ∈ B(H), we have (cf.
18.2.10b)
$$\operatorname{tr}\Bigl|BA - B \sum_{k=1}^{n} \lambda_k A_{u_k,v_k}\Bigr| \le \|B\| \operatorname{tr}\Bigl|A - \sum_{k=1}^{n} \lambda_k A_{u_k,v_k}\Bigr| \xrightarrow[n \to \infty]{} 0;$$
therefore, in view of the continuity of the linear functional tr (cf. 18.2.11a), we have
$$\operatorname{tr}(BA) = \lim_{n \to \infty} \operatorname{tr}\Bigl(B \sum_{k=1}^{n} \lambda_k A_{u_k,v_k}\Bigr) = \lim_{n \to \infty} \sum_{k=1}^{n} \lambda_k (u_k|Bv_k) = \sum_{n=1}^{\infty} \lambda_n (u_n|Bv_n);$$
18.2.16 Remark. In view of 18.2.15, the series of operators which appear in 18.2.8
and in 18.2.9 (if I = N) are convergent not only with respect to the norm defined
in 4.2.11a but also with respect to the norm ν1 .
18.2.17 Proposition. Let M and N be subspaces of H, let T₁ := P_M, and let
$$T_{2h} := (P_N P_M)^h, \qquad T_{2h+1} := P_M (P_N P_M)^h, \quad \forall h \in \mathbb{N}.$$
Then
$$\operatorname{tr}(B P_{M \cap N} A P_{M \cap N}) = \lim_{k \to \infty} \operatorname{tr}(B T_k A T_k^{\dagger}), \quad \forall A \in T(H), \ \forall B \in B(H),$$
and hence (for B := 1_H)
$$\operatorname{tr}(P_{M \cap N} A P_{M \cap N}) = \lim_{k \to \infty} \operatorname{tr}(T_k A T_k^{\dagger}), \quad \forall A \in T(H).$$
If A is a positive element of T(H) and tr(P_{M∩N} A P_{M∩N}) ≠ 0, then
$$\operatorname{tr}(T_k A T_k^{\dagger}) \ne 0, \quad \forall k \in \mathbb{N}.$$
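The proposition rests on the strong convergence of the alternating products Tk to P_{M∩N} (cited from 13.2.2). A small numerical sketch, illustrative only and with arbitrarily chosen subspaces of ℝ³, shows tr(B Tk A Tk†) approaching tr(B P_{M∩N} A P_{M∩N}):

```python
import numpy as np

# Arbitrary subspaces of R^3: M = span{e1, e2}, N = span{e1, (e2+e3)/sqrt(2)};
# then M ∩ N = span{e1}.
PM = np.diag([1.0, 1.0, 0.0])
w = np.array([0.0, 1.0, 1.0]) / np.sqrt(2.0)
PN = np.diag([1.0, 0.0, 0.0]) + np.outer(w, w)
P_int = np.diag([1.0, 0.0, 0.0])      # P_{M ∩ N}

def T(k):
    """T_1 = P_M, T_{2h} = (P_N P_M)^h, T_{2h+1} = P_M (P_N P_M)^h."""
    h, r = divmod(k, 2)
    C = np.linalg.matrix_power(PN @ PM, h)
    return PM @ C if r == 1 else C

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3))
A = X @ X.T                           # positive, hence trace class
B = rng.standard_normal((3, 3))

target = np.trace(B @ P_int @ A @ P_int)
approx = np.trace(B @ T(60) @ A @ T(60).T)
assert np.isclose(target, approx, atol=1e-6)
```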
Proof. If A = OH then the statement is trivially true. In what follows, we assume
A ∈ T (H)−{OH }, we fix B ∈ B(H), and we set P := PM∩N . Let {un }n∈I , {vn }n∈I ,
{λn}n∈I be with respect to A as in 18.2.9. In view of 18.2.11c and 18.2.15, we have
$$\operatorname{tr}(BPAP) = \operatorname{tr}(PBPA) = \sum_{n \in I} \lambda_n (u_n|PBPv_n) = \sum_{n \in I} \lambda_n (Pu_n|BPv_n),$$
and
$$\operatorname{tr}(B T_k A T_k^{\dagger}) = \operatorname{tr}(T_k^{\dagger} B T_k A) = \sum_{n \in I} \lambda_n (T_k u_n|B T_k v_n), \quad \forall k \in \mathbb{N}.$$
Moreover, by 13.2.2 (and by the continuity of B) we have
$$(Pu_n|BPv_n) = \lim_{k \to \infty} (T_k u_n|B T_k v_n), \quad \forall n \in I.$$
If I = {1, ..., N}, this proves that
$$\operatorname{tr}(BPAP) = \lim_{k \to \infty} \operatorname{tr}(B T_k A T_k^{\dagger}).$$
Now we suppose I = N. We notice that
$$|(T_k u_n|B T_k v_n)| \le \|T_k u_n\| \|B T_k v_n\| \le \|B\|, \quad \forall n \in \mathbb{N}, \ \forall k \in \mathbb{N}$$
(cf. 10.1.9, 4.2.5b, 4.2.9), and that
$$\sum_{n=1}^{\infty} |\lambda_n| \|B\| = \|B\| \sum_{n=1}^{\infty} |\lambda_n| < \infty.$$
Then, by 8.3.10a and 8.2.11 (with the sequence {|λn|‖B‖} as dominating function)
we have
$$\operatorname{tr}(BPAP) = \sum_{n=1}^{\infty} \lambda_n (Pu_n|BPv_n) = \lim_{k \to \infty} \sum_{n=1}^{\infty} \lambda_n (T_k u_n|B T_k v_n) = \lim_{k \to \infty} \operatorname{tr}(B T_k A T_k^{\dagger}).$$
Finally, we suppose that A is positive. Then the operator Tk A Tk† is positive for all
k ∈ N, as can be seen easily. Therefore, if k ∈ N exists so that tr(Tk A Tk†) = 0 then
Tk A Tk† = OH (cf. 18.2.4c), and hence Tm A Tm† = OH for all m > k since
$$\forall m > k, \ \exists S_{m,k} \in B(H) \ \text{s.t.} \ T_m A T_m^{\dagger} = S_{m,k} T_k A T_k^{\dagger} S_{m,k}^{\dagger},$$
and hence lim_{k→∞} tr(Tk A Tk†) = 0, in contradiction with tr(P_{M∩N} A P_{M∩N}) ≠ 0.
Statistical operators are nothing else than positive trace class operators which are
normalized with respect to the norm ν1 for T (H) (i.e., their trace is one). Thus, the
results we prove in this section are essentially exercises about positive trace class
operators and they are of interest especially in view of the role played by statistical
operators in quantum mechanics.
Throughout this section, H denotes a separable Hilbert space whose orthogonal
dimension is denumerable. For a finite-dimensional Hilbert space, everything holds
in an obviously simplified fashion.
18.3.2 Remarks.
(a) If W ∈ W(H) and U ∈ UA(H), then U W U⁻¹ ∈ W(H). This follows from
18.2.2d.
(b) For each u ∈ H̃, the one-dimensional projection Au is a statistical operator. In
fact Au is positive (so are all orthogonal projections, in view of 13.1.7c) and
tr Au = 1 (cf. 18.2.12b). From 18.2.12c we have
tr(BAu ) = (u|Bu) , ∀B ∈ B(H).
In view of 18.2.12a,b, the one-dimensional projections are the only orthogonal
projections which are statistical operators.
(c) If W ∈ W(H) then, in view of 18.2.8, there exist an o.n.s. {un }n∈I (with
I := {1, ..., N } or I := N) and a family {λn }n∈I of elements of (0, ∞) so that
$$W = \sum_{n \in I} \lambda_n A_{u_n} \quad \text{and} \quad \sum_{n \in I} \lambda_n = \operatorname{tr} W = 1; \tag{1}$$
thus λn ∈ (0, 1] for all n ∈ I. If I = N then the first of these series is convergent
with respect to the norm for B(H) defined in 4.2.11a and also with respect to
the norm ν1 for T (H) (cf. 18.2.16), and we have
$$Wf = \sum_{n=1}^{\infty} \lambda_n (u_n|f)\, u_n, \quad \forall f \in H,$$
cf. 1.2.1); therefore, this family is uniquely determined (if {un }n∈I is required,
as above, to be an o.n.s.). The family {Aun }n∈I is uniquely determined iff the
eigenspaces of all non-zero eigenvalues of W are one-dimensional (if this is true
then Aun is the orthogonal projection on the eigenspace corresponding to λn ).
However, even in this case, a decomposition of W as in 1 is not unique if the
family {un }n∈I is not required to be an o.n.s. but only to consist of elements
of H̃, unless W is a one-dimensional projection. This will be proved in 18.3.7.
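Concretely, on C² a statistical operator is a positive matrix of unit trace, and decomposition 1 is its spectral decomposition. The sketch below is a numerical illustration only (the weights 0.75 and 0.25 are arbitrary):

```python
import numpy as np

# W = 0.75 A_{e1} + 0.25 A_{e2}: a mixture of the standard basis states of C^2.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
W = 0.75 * np.outer(e1, e1) + 0.25 * np.outer(e2, e2)

lam, U = np.linalg.eigh(W)
assert np.all((lam > 0) & (lam <= 1))   # eigenvalues lambda_n lie in (0, 1]
assert np.isclose(lam.sum(), 1.0)       # tr W = 1

# tr(BW) is the corresponding mixture of the expectations (u_n|B u_n).
B = np.array([[1.0, 2.0], [2.0, -1.0]])
assert np.isclose(np.trace(B @ W), 0.75 * (e1 @ B @ e1) + 0.25 * (e2 @ B @ e2))
```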
Proof. Let {un }n∈I be an o.n.s. in H (with I := {1, ..., N } or I := N) and {λn }n∈I
a family of elements of (0, 1] so that
$$W = \sum_{n \in I} \lambda_n A_{u_n} \quad \text{and} \quad \sum_{n \in I} \lambda_n = 1,$$
as in 18.3.2c. We have
$$W^2 = \sum_{n \in I} \lambda_n^2 A_{u_n}$$
since A_{un} A_{um} = δ_{n,m} A_{un} for all n, m ∈ I (if I = N, we have used also the continuity
of the operator product in B(H), cf. 4.3.5 and 4.3.3). We notice that λn² ≤ λn and
hence Σ_{n∈I} λn² < ∞. Then, in view of 18.2.15, W² ∈ T(H) and
$$1 = \operatorname{tr} W^2 = \sum_{n \in I} \lambda_n^2 (u_n|u_n) = \sum_{n \in I} \lambda_n^2.$$
Therefore,
$$\sum_{n \in I} (\lambda_n - \lambda_n^2) = 0$$
and hence
$$\lambda_n \in \{0, 1\}, \quad \forall n \in I.$$
This implies I = {1} and hence W = Au1 .
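The computation above shows that tr W² = Σ λn² equals 1 exactly when W is a one-dimensional projection, while for any proper mixture tr W² < 1. A numerical sketch, illustrative only:

```python
import numpy as np

def purity(W):
    """tr W^2, which equals sum_n lambda_n^2 over the spectral weights of W."""
    return float(np.trace(W @ W).real)

u = np.array([1.0, 1.0]) / np.sqrt(2.0)
pure = np.outer(u, u)              # W = A_u, a one-dimensional projection
mixed = np.diag([0.5, 0.5])        # W = (1/2)A_{e1} + (1/2)A_{e2}

assert np.isclose(purity(pure), 1.0)   # tr W^2 = 1 only in the pure case
assert np.isclose(purity(mixed), 0.5)  # sum lambda_n^2 = 1/4 + 1/4 < 1
```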
is an element of W(H);
(c) for all B ∈ B(H),
$$\operatorname{tr}(BW) = \sum_{n \in I} w_n \operatorname{tr}(B W_n).$$
Proof. a: If I = N then the series Σ_{n=1}^∞ wn Wn is absolutely convergent in the
normed space (T(H), ν₁) since
$$\nu_1(w_n W_n) = w_n \operatorname{tr} W_n = w_n, \quad \forall n \in \mathbb{N},$$
and hence it is convergent in this normed space (cf. 18.2.14 and 4.1.8b). Then, this
series is convergent also with respect to the norm for B(H) defined in 4.2.11a (cf.
18.2.5a).
b: From 18.2.4a,b (if I = {1, ..., N }) or from result a (if I = N) we have
W ∈ T (H). Moreover,
$$(f|Wf) = \sum_{n \in I} w_n (f|W_n f) \ge 0, \quad \forall f \in H$$
by 18.2.11a.
c: If I = {1, ..., N }, this follows from the linearity of the function tr (cf.
18.2.11a). Now we suppose that I = N and fix B ∈ B(H). Then the series
Σ_{n=1}^∞ wn B Wn is absolutely convergent in the normed space (T(H), ν₁) since
in view of the continuity of the operator product with respect to the norm defined
in 4.2.11a. Hence we have
$$\operatorname{tr}(BW) = \operatorname{tr}\Bigl(\sum_{n=1}^{\infty} w_n B W_n\Bigr) = \sum_{n=1}^{\infty} w_n \operatorname{tr}(B W_n),$$
in view of the continuity of the function tr with respect to the norm ν1 (cf. 18.2.11a).
18.3.5 Corollary. Let I := {1, ..., N} or I := N, let {un}n∈I be a family of ele-
ments of H̃, let {wn}n∈I be a family of elements of (0, 1] such that Σ_{n∈I} wn = 1.
Then:
(a) if I = N, the series Σ_{n=1}^∞ wn A_{un} is convergent in the normed space (T(H), ν₁)
and also with respect to the norm for B(H) defined in 4.2.11a;
(b) the operator
$$W := \sum_{n \in I} w_n A_{u_n}$$
is an element of W(H);
(c) for all B ∈ B(H),
$$\operatorname{tr}(BW) = \sum_{n \in I} w_n \operatorname{tr}(B A_{u_n}) = \sum_{n \in I} w_n (u_n|B u_n).$$
(a) W ∈ W(H);
(b) there exist a family {un}n∈I (with I := {1, ..., N} or I := N) of elements of H̃
and a family {wn}n∈I of elements of (0, 1] so that
$$A_{u_i} \ne A_{u_k} \ \text{if} \ i \ne k, \qquad \sum_{n \in I} w_n = 1, \qquad Wf = \sum_{n \in I} w_n A_{u_n} f, \quad \forall f \in H.$$
(a) the representation of W as in 18.3.6b is unique (i.e. the families {Aun }n∈I and
{wn }n∈I as in 18.3.6b are uniquely determined);
(b) W is a one-dimensional projection.
with {un}n∈I an o.n.s. in H and {wn}n∈I a family of elements of (0, 1] such that
Σ_{n∈I} wn = 1. We suppose that W is not a one-dimensional projection. Then the
index set I must contain more than one element, and we define the vectors
$$v_1 := 2^{-1/2}(u_1 + u_2) \quad \text{and} \quad v_2 := 2^{-1/2}(u_1 - u_2),$$
which are elements of H̃. It is easy to see that
$$A_{u_1} + A_{u_2} = A_{v_1} + A_{v_2}.$$
We set J := I − {1, 2}. If w₁ = w₂, we have
$$W = w_1 A_{v_1} + w_2 A_{v_2} + \sum_{n \in J} w_n A_{u_n} \tag{2}$$
(if I = {1, 2} then Σ_{n∈J} wn A_{un} := OH). If w₁ ≠ w₂ and (for instance) w₁ < w₂,
we have
$$W = w_1 A_{v_1} + w_1 A_{v_2} + (w_2 - w_1) A_{u_2} + \sum_{n \in J} w_n A_{u_n}. \tag{3}$$
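The identity A_{u1} + A_{u2} = A_{v1} + A_{v2} used above is easy to verify numerically. The sketch below is illustrative only, taking {u1, u2} to be the standard basis of C² and w1 = w2 = 1/2 as in decomposition 2; it exhibits two distinct families of unit vectors yielding the same statistical operator:

```python
import numpy as np

u1, u2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
v1 = (u1 + u2) / np.sqrt(2.0)
v2 = (u1 - u2) / np.sqrt(2.0)

# A_{u1} + A_{u2} = A_{v1} + A_{v2}
assert np.allclose(np.outer(u1, u1) + np.outer(u2, u2),
                   np.outer(v1, v1) + np.outer(v2, v2))

# With w1 = w2 = 1/2, two distinct families give the same W (here W = I/2).
W_u = 0.5 * np.outer(u1, u1) + 0.5 * np.outer(u2, u2)
W_v = 0.5 * np.outer(v1, v1) + 0.5 * np.outer(v2, v2)
assert np.allclose(W_u, W_v)
```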
Proof. Let {un}n∈I be an o.n.s. in H which is complete in the subspace RP and
let {vn}n∈N be a c.o.n.s. in H which contains {un}n∈I (cf. 10.7.3).
a: We have
$$\operatorname{tr}(PW) = \operatorname{tr}(WP) = \sum_{n=1}^{\infty} (v_n|W P v_n) = \sum_{n \in I} (u_n|W u_n)$$
18.3.9 Proposition. Let W ∈ W(H) and P ∈ P(H). Then the following condi-
tions are equivalent:
(a) tr(P W ) = 1;
(b) RW ⊂ RP ;
(c) PW = W;
(d) PWP = W;
a ⇒ b: Condition a implies
$$\sum_{n \in I} \lambda_n (u_n|P u_n) = \operatorname{tr}(PW) = 1,$$
and hence (since λn > 0 for each n ∈ I and Σ_{n∈I} λn = 1)
$$\|P u_n\|^2 = (u_n|P u_n) = 1, \quad \forall n \in I,$$
and hence (cf. 13.1.3c)
$$u_n \in R_P, \quad \forall n \in I.$$
Since RW ⊂ V{un}n∈I, this proves that RW ⊂ RP.
b ⇒ c: We assume condition b. We have
$$u_n = \lambda_n^{-1} W u_n, \quad \forall n \in I,$$
and hence
$$u_n \in R_P, \quad \forall n \in I,$$
and hence
$$P A_{u_n} = A_{u_n}, \quad \forall n \in I$$
(cf. 13.1.3c). This implies P W = W (if I = N, use e.g. the continuity of the
operator product in B(H)).
c ⇒ d: We have
$$PW = W \ \Rightarrow\ WP = (PW)^{\dagger} = W,$$
$$PW = W \ \Rightarrow\ PWP = WP = W.$$
d ⇒ a: Condition d implies
18.3.10 Corollary. Let W ∈ W(H) and u ∈ H̃. Then the following conditions are
equivalent:
(a) tr(Au W ) = 1;
(b) W = Au .
$$u_n \in V\{u\}, \quad \forall n \in I,$$
and hence
$$I = \{1\}, \quad \lambda_1 = 1, \quad A_{u_1} = A_u,$$
and hence W = Au .
b ⇒ a: If W = Au then
tr(Au W ) = tr Au = 1.
18.3.11 Corollary. Let W ∈ W(H) and P ∈ P(H). Then the following conditions
are equivalent:
(a) tr(P W ) = 0;
(b) RW ⊂ NP ;
(c) P W = OH ;
(d) P W P = OH .
18.3.12 Proposition. Let W ∈ W(H) and let {Pn } be a sequence in P(H) such
that Pi Pk = OH if i ≠ k. Then
$$\operatorname{tr}\Bigl(\Bigl(\sum_{n=1}^{\infty} P_n\Bigr) W\Bigr) = \sum_{n=1}^{\infty} \operatorname{tr}(P_n W)$$
(the orthogonal projection Σ_{n=1}^∞ Pn is defined as in 13.2.10b).
Proof. The range of the function µ_W^P is indeed a subset of [0, 1], in view of 18.3.8b.
The rest of the statement follows immediately from the definition of a projection
valued measure and from 18.3.12.
If condition a holds true then we have (in view of 2, and since wn > 0 for all n ∈ I)
$$\int_{\mathbb{R}} \xi^2\, d\mu_{u_n}^{P^A} < \infty, \quad \forall n \in I,$$
and hence (cf. 15.2.2e)
$$u_n \in D_A, \quad \forall n \in I,$$
and also
$$\|A u_n\|^2 = \int_{\mathbb{R}} \xi^2\, d\mu_{u_n}^{P^A}, \quad \forall n \in I,$$
and hence (cf. 2) also
$$\sum_{n \in I} w_n \|A u_n\|^2 < \infty.$$
Thus, condition b holds true.
If condition b holds true then we have (in view of 15.2.2e and 2)
$$\int_{\mathbb{R}} \xi^2\, d\mu_W^{P^A} = \sum_{n \in I} w_n \|A u_n\|^2 < \infty,$$
and this proves that condition a holds true.
In what follows we assume that conditions a and b are satisfied.
If I = {1, ..., N} then D_{AW} = H since
$$R_W \subset L\{u_1, ..., u_N\} \subset D_A,$$
and also
$$AWf = \sum_{n=1}^{N} w_n (u_n|f)\, A u_n, \quad \forall f \in H. \tag{3}$$
Now we suppose I = N. We fix f ∈ H. Then,
$$\sum_{n=1}^{N} w_n (u_n|f)\, u_n \in D_A \quad \text{and} \quad A\Bigl(\sum_{n=1}^{N} w_n (u_n|f)\, u_n\Bigr) = \sum_{n=1}^{N} w_n (u_n|f)\, A u_n, \quad \forall N \in \mathbb{N}.$$
Moreover, the inequality
$$\sum_{n=1}^{\infty} \|w_n (u_n|f)\, u_n\| \overset{(4)}{\le} \sum_{n=1}^{\infty} w_n \|f\| = \|f\| \sum_{n=1}^{\infty} w_n < \infty$$
(4 holds true by the Schwarz inequality) proves that the series Σ_{n=1}^∞ wn (un|f) un
is convergent (cf. 4.1.8b). Similarly, the inequalities
$$\sum_{n=1}^{\infty} \|w_n (u_n|f)\, A u_n\| \overset{(5)}{\le} \sum_{n=1}^{\infty} w_n \|f\| \|A u_n\| \overset{(6)}{\le} \|f\| \Bigl(\sum_{n=1}^{\infty} w_n\Bigr)^{1/2} \Bigl(\sum_{n=1}^{\infty} w_n \|A u_n\|^2\Bigr)^{1/2} < \infty$$
(5 holds true by the Schwarz inequality in H; 6 holds true by the Schwarz inequality
in ℓ² for the two sequences {wn^{1/2}} and {wn^{1/2}‖Aun‖}, cf. 10.3.8d) prove that the series
Σ_{n=1}^∞ wn (un|f) Aun is convergent. Since the operator A is closed (cf. 12.4.6a), this
implies that
$$\sum_{n=1}^{\infty} w_n (u_n|f)\, u_n \in D_A \quad \text{and} \quad A\Bigl(\sum_{n=1}^{\infty} w_n (u_n|f)\, u_n\Bigr) = \sum_{n=1}^{\infty} w_n (u_n|f)\, A u_n.$$
Since f was an arbitrary element of H, this proves that
$$Wf \in D_A \ \text{for all} \ f \in H, \ \text{i.e.} \ D_{AW} = H,$$
and
$$AWf = \sum_{n=1}^{\infty} w_n (u_n|f)\, A u_n, \quad \forall f \in H. \tag{7}$$
In what follows, I can be either {1, ..., N} or N. We define the set of indices
$$J := \{n \in I : A u_n \ne 0_H\}.$$
If J = ∅ then from either 3 or 7 we have AW = OH and hence AW ∈ T(H), and
also
$$\operatorname{tr}(AW) = 0 = \sum_{n \in I} w_n (u_n|A u_n) = \sum_{n \in I} w_n \langle A \rangle_{u_n}.$$
If J ≠ ∅, we define
$$v_n := \|A u_n\|^{-1} A u_n, \quad \forall n \in J;$$
then from either 3 or 7 we have
$$AWf = \sum_{n \in J} w_n \|A u_n\| A_{u_n,v_n} f, \quad \forall f \in H,$$
and hence (cf. 18.2.15; note that, if I = N, 6 proves that Σ_{n=1}^∞ wn ‖Aun‖ < ∞)
AW ∈ T(H) and
$$\operatorname{tr}(AW) = \sum_{n \in J} w_n \|A u_n\| (u_n|v_n) = \sum_{n \in I} w_n (u_n|A u_n) = \sum_{n \in I} w_n \langle A \rangle_{u_n}.$$
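For a bounded self-adjoint A the formula tr(AW) = Σ wn⟨A⟩_{un} can be checked directly in finite dimension; the sketch below is illustrative only, with arbitrarily chosen weights and a random symmetric matrix standing in for A:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 3))
A = (X + X.T) / 2.0                               # bounded self-adjoint A on C^3

u = [np.eye(3)[:, k] for k in range(3)]           # o.n.s. {u_n}
w = np.array([0.5, 0.3, 0.2])                     # weights with sum 1
W = sum(wk * np.outer(uk, uk) for wk, uk in zip(w, u))

lhs = np.trace(A @ W)                                  # <A>_W = tr(AW)
rhs = sum(wk * (uk @ A @ uk) for wk, uk in zip(w, u))  # sum_n w_n <A>_{u_n}
assert np.isclose(lhs, rhs)
```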
$$= \langle A^2 \rangle_W - 2\langle A \rangle_W^2 + \langle A \rangle_W^2 = \langle A^2 \rangle_W - \langle A \rangle_W^2.$$
Proof. Let W ∈ W(H), let {un }n∈I and {wn }n∈I be as in 18.3.16, and let A be a
self-adjoint operator in H. If A is bounded then DA = H (cf. 12.4.7) and hence
un ∈ DA , ∀n ∈ I;
moreover,
$$\sum_{n \in I} w_n \|A u_n\|^2 \le \sum_{n \in I} w_n \|A\|^2 \|u_n\|^2 = \|A\|^2 \sum_{n \in I} w_n < \infty$$
18.3.19 Remarks.
(a) Let A be a self-adjoint operator in H and W ∈ W(H). If conditions a and b in
18.3.16 hold true, then
AW ∈ T (H) and hAiW = tr(AW ).
If A is bounded then A ∈ B(H) (cf. 12.4.7) and hence we have
W A ∈ T (H) and hAiW = tr(W A),
by 18.2.7 and by 18.2.11c respectively.
If A is not bounded then DA 6= H (cf. 12.4.7), and hence DW A 6= H, and hence
the operator W A cannot be trace class and the formula tr(W A) is meaningless.
(b) We recall that, for a statistical operator W, the decomposition W =
Σ_{n∈I} wn A_{un} as in 18.3.6b is not unique unless W is a one-dimensional pro-
jection (cf. 18.3.7). From 18.3.16 we have that, if a self-adjoint operator A is
computable in W, then
$$u_n \in D_A \ \text{for all} \ n \in I \quad \text{and} \quad \sum_{n \in I} w_n \|A u_n\|^2 < \infty$$
Chapter 19
In this chapter we examine how the theory of Hilbert space operators is used in
quantum mechanics. This chapter is not meant to be a short treatise on quantum
mechanics, since only the basic mathematical structure of the quantum theories is
discussed and no applications are provided.
The predictions that are provided by quantum mechanics are in general sta-
tistical ones. And indeed, in what follows, quantum mechanics is presented as a
theoretical scheme which can account for the probabilistic distributions of mea-
surements, in experiments where measurements are repeatedly carried out on a
large number of suitably-prepared copies of a physical system. The probabilities
are interpreted as the theoretical predictions of the relative frequencies with which
results are obtained when measurements are made on a large number of identically-
prepared copies of the physical system under consideration. Quantum mechanics
shares a good deal of its theoretical framework with other statistical theories, e.g.
classical statistical mechanics and theories of games of chance (we will refer to all
these theories as “classical statistical theories”). In the first section of this chapter
we give an outline of this shared framework, which we call a “general statistical
theory”. For the abstract concepts we introduce, we use the names that are com-
monly used for them in quantum mechanics. In the second section we examine how
this general statistical theory is implemented in the classical theories, and in the
third section how it is implemented in the quantum theories. In the fourth and fifth
sections, other topics are examined which are specific to quantum mechanics: state
reduction, compatibility of observables, uncertainty relations.
Up to Section 19.5 we think of time as standing still: the time intervals between
operational procedures are always supposed to be sufficiently small that there is no
need to consider the internal time evolution of the system. At times, this is indicated
by the use of the locution “immediately after”. In Section 19.6 we examine time
evolution in non-relativistic quantum mechanics.
Since we are mainly concerned with the mathematical aspects of the foundations of
quantum mechanics, we could set off in an axiomatic way by simply saying that we
are given two abstract sets Π and Σ and a function p : Π × Σ → [0, 1], specifying
that Π represents the family of all propositions pertaining to a physical system and
Σ the family of all states of the system, and that p(π, σ) is the probability that
the proposition π is true when the system is in the state σ. However, we prefer to
explain by what kind of reasoning these abstract objects are brought about in what
we call a general statistical theory.
19.1.1 Definitions. A state preparation (or, simply, a state) is a collection of
instructions for a set of physical operations to be performed on an array of objects,
so that:
the operations can be repeated, at least in principle, an indefinite number of
times;
the objects are macroscopic bodies, in the sense that the instructions are governed
by standard classical logic.
A proposition is an event so that:
the occurrence or non-occurrence of the event is to be decided immediately after
a state preparation has been performed;
when the event occurs, it takes place in a macroscopic device, in the sense that
the procedure for ascertaining whether the event has occurred is governed by
standard classical logic;
when the event occurs, it leaves a long-lasting record in the device and its occur-
rence is ascertained by verifying this record, which has an objective meaning (the
record can be read by any number of scientific observers and all of them agree
about its meaning).
When the procedures which define a state σ and a proposition π are implemented,
they have obviously space and time positions. However, it is assumed that these
“absolute” positions are immaterial (and therefore they are not specified in the
definitions of σ and π) and that, if π is to be decided immediately after σ, then
the relative space positions of π and σ are suitable for an “interaction” between σ
and π to take place (suitable, that is, according to the picture one has of a possible
“interaction” between σ and π).
For a given proposition π and a given state σ, we define the following course
of action: we perform the operations prescribed by σ and immediately after we
ascertain whether the event π has occurred, and we do this a large number N of
times. If we implement this course of action twice, and if N′_π (respectively, N″_π)
denotes the number of times when π has occurred in the first (respectively, second)
implementation, we cannot expect N′_π and N″_π to be equal in general. We say
that π and σ are correlated when, as N grows, the difference between the relative
frequencies N′_π/N and N″_π/N approaches zero and the relative frequencies approach a
limit (clearly, the term “approach” has here an informal meaning and so has the
term “limit”). When π and σ are correlated, the limit is called the probability of
the occurrence of the proposition π immediately after the state preparation σ or,
simply, the probability of π in σ.
We say that we have a physical system (or, simply, a system) if we have a family
Σ of states and a family Π of propositions so that π and σ are correlated for each
pair (π, σ) ∈ Π × Σ and if we think that the resulting probabilities are liable to
be organized in a consistent theory. If (π, σ) ∈ Π × Σ, a single implementation of
the operations prescribed by σ is said to be a copy of the system prepared in the
state σ (or, simply, a copy in σ), and the ascertainment whether π has occurred
immediately after an implementation of the operations prescribed by σ is said to
be the determination of π for a copy in σ; if π has occurred, then π is said to be
true in that copy. We say that we have a statistical theory of the system if we have
a theoretical scheme whereby a function p : Π × Σ → [0, 1] can be obtained so that
p(π, σ) is, for all (π, σ) ∈ Π × Σ, the probability of the occurrence of π immediately
after the state preparation σ. The function p is called a probability function. If
the theory supplies such a function, then p(π, σ) is the theoretical prediction of the
relative frequency Nπ/N, where Nπ is the number of copies in which a proposition
π has turned out to be true out of N copies of the system prepared in a state σ,
provided that N is large enough. It is important to note that, although a state σ
and a proposition π are procedures to be operated on a single copy of the system,
the number p(π, σ) that the theory assigns to the pair (π, σ) can be compared with
the experimental results only if we have a large (hypothetically infinite) collection
of copies of the system, for each of which the determination of π is carried out
immediately after the copy has been prepared in the state σ. A large collection of
copies, all prepared in the state σ, is sometimes called an ensemble representing σ.
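The informal notion of two runs of relative frequencies approaching a common limit can be mimicked with a toy simulation; the sketch below is illustrative only (the value p = 0.3 and the sample sizes are arbitrary), drawing two independent ensembles and comparing their frequencies:

```python
import numpy as np

# Toy ensemble: N copies prepared in a state sigma for which a hypothetical
# theory assigns p(pi, sigma) = 0.3; pi is determined once for each copy.
rng = np.random.default_rng(4)
p = 0.3
for N in (100, 10_000, 1_000_000):
    f1 = (rng.random(N) < p).mean()   # first run:  N'_pi / N
    f2 = (rng.random(N) < p).mean()   # second run: N''_pi / N

# For the largest N, both frequencies are close to p and to each other.
assert abs(f1 - p) < 0.01 and abs(f2 - p) < 0.01 and abs(f1 - f2) < 0.02
```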
19.1.2 Remarks.
(a) While in the classical statistical theories it is often obvious what is to be con-
sidered the physical system under consideration (e.g. the gas contained in a
vessel, a coin, a pair of dice, a roulette table), this is not so in the quantum
theories, and in our opinion it is convenient to consider a quantum system as an
“interaction channel” between a definite set of states and a definite set of propo-
sitions, according to the definition given in 19.1.1. For instance, if the state is
to switch on a “source” to the left of a Stern–Gerlach apparatus (the source and
the Stern–Gerlach apparatus are thus the objects which appear in the abstract
definition of state) and the proposition is the event defined by the reaction of a
detector to the right of the Stern–Gerlach apparatus, then the physical system
is called a spin-half (for instance) particle. As another example, if the state is to
operate an accelerator in a given mode and to arrange magnetic analysers and
collimating slits in a given way, and the proposition is once again the reaction of
a detector, then the case may be that the physical system is called a meson. In
(d) Experimental evidence supports the assumption that only the relative positions
in space (as specified in 19.1.1) and time (a proposition immediately after a
state) are important. Actually, this is perhaps the first invariance law discovered
in the history of physics. This fact has allowed physics to be tackled as an
experimental science.
19.1.3 Definition. The event that defines a proposition π can be used to define
another event, which defines another proposition; this new proposition is called the
negation of π and denoted by the symbol ¬π, and the event that defines ¬π is the
non-occurrence of the event that defines π.
19.1.4 Remarks.
(a) If there is a statistical theory for the physical system defined by a collection Π
of propositions and a collection Σ of states, then the theory is consistent only
if, for the probability function p, we have
p(¬π, σ) = 1 − p(π, σ), ∀π ∈ Π, ∀σ ∈ Σ. (1)
(b) We point out that nothing has been said about the possibility, for two proposi-
tions π and π ′ , of defining the event that is said to have occurred if and only if
both the events that define π and π ′ have occurred (such new event would define
the proposition “π and π ′ ”) or of defining the event that is said to have occurred
if and only if at least one of the events that define π and π ′ has occurred (such
new event would define the proposition “π or π ′ ”). Indeed, this requires that it
is feasible to determine both π and π ′ for a single copy prepared in any state.
This feasibility is actually assumed in all classical statistical theories. We will
see that this is one of the aspects in which the quantum theories differ from the
classical statistical ones.
19.1.5 Definitions. Let Π and Σ be families of propositions and states so that
Π × Σ defines a physical system for which a probability function p is given. We
define an equivalence relation RΠ in Π by letting
RΠ := {(π ′ , π ′′ ) ∈ Π × Π : p(π ′ , σ) = p(π ′′ , σ), ∀σ ∈ Σ},
and similarly we define an equivalence relation RΣ in Σ by letting
RΣ := {(σ ′ , σ ′′ ) ∈ Σ × Σ : p(π, σ ′ ) = p(π, σ ′′ ), ∀π ∈ Π}.
We denote by Π̂ and Σ̂ the quotient sets which are thus defined, we denote by π̂
and σ̂, and still call propositions and states, the equivalence classes containing π ∈ Π
and σ ∈ Σ, and we define the function
p̂ : Π̂ × Σ̂ → [0, 1]
(π̂, σ̂) 7→ p̂(π̂, σ̂) := p(π, σ),
which we still call a probability function. Obviously, we have
p̂(π̂ ′ , σ̂) = p̂(π̂ ′′ , σ̂), ∀σ̂ ∈ Σ̂ ⇒ π̂ ′ = π̂ ′′ and
p̂(π̂, σ̂ ′ ) = p̂(π̂, σ̂ ′′ ), ∀π̂ ∈ Π̂ ⇒ σ̂ ′ = σ̂ ′′ .
19.1.6 Remarks.
(a) It is clear that, for a physical system defined by a family Π of propositions
and a family Σ of states, there may be state preparations σ ′ , σ ′′ ∈ Σ which are
different (i.e. they are different collections of instructions which refer to dif-
ferent arrays of objects), but which nonetheless lead to the same experimental
statistical results in the sense that, for every π ∈ Π, the probability of π in
σ ′ equals the probability of π in σ ′′ . This means that the differences between
σ ′ and σ ′′ are immaterial as far as the statistical study of the physical system
under consideration goes. And a similar remark can be made for the elements
of Π. Thus, a statistical theory of the system must contain mathematical rep-
resentations of the quotient sets Π̂, Σ̂ defined above, through which a formula
which defines the function p̂ must then be written.
(b) From condition 1 of 19.1.4 we see that, for each π̂ ∈ Π̂, the family of all negations
of the representatives of π̂ constitute an equivalence class, which we still call
the negation of π̂ and denote by the symbol ¬π̂.
19.1.7 Definitions. For every physical system, two (trivial) propositions always
exist, which we denote by the symbols π0 and π1 . The proposition π0 is defined
by the event which is said to occur if and only if no copy of the system has been
prepared: this event never occurs if π0 is determined immediately after a state
preparation has been performed. The proposition π1 is defined by the event which
is said to occur if and only if a copy of the system has been prepared: this event
always occurs if π1 is determined immediately after a state preparation has been
performed. Clearly, the equivalence classes π̂0 and π̂1 are characterized by the
following conditions
p̂(π̂0 , σ̂) = 0, ∀σ̂ ∈ Σ̂, and p̂(π̂1 , σ̂) = 1, ∀σ̂ ∈ Σ̂,
where Σ denotes as usual the family of states that defines the system.
19.1.9 Remarks.
(a) In 19.1.8, the measurable space (X, A) provides a representation of the events
into which the measurement of a physical quantity can be analysed. In our view,
an X-valued physical quantity is defined by an ideal apparatus which comprises
a dial, which is represented by X (i.e. there is a mapping, not necessarily
injective, from the dial to X), and a pointer. Immediately after a copy of the
system has been prepared in a state in such a way that an “interaction” between
the copy and the apparatus can happen (i.e., the state preparation must be
implemented in a suitable spatial position with respect to the apparatus), the
apparatus gives an X-result by the position of the pointer on the dial, which
identifies a point of the dial (since the apparatus is an ideal one) and hence a
point of X. The pointer and the dial are assumed to be macroscopic objects,
to wit the ascertainment of the possible positions of the pointer is governed by
standard classical logic. An element E of A is a subset of X to which it is
deemed sensible to assign the probability that, in a given state, the X-result is
a point of E (a probability is always considered in this book to be a normalized
measure on a σ-algebra, cf. 7.1.7); for instance, if X is endowed with a distance
then a natural choice for A is the Borel σ-algebra on X (cf. 6.1.22).
All this leads to an X-valued observable α if, for each E ∈ A, we define α(E) to
be the event which is said to have occurred if and only if the X-result given by
the apparatus has been an element of E and if we assume that this event defines
a proposition of the system. In fact, the condition that µ^α_σ be a probability
measure on A for each σ ∈ Σ can be accounted for as follows. We assume
that all propositions α(E), for E ∈ A, can be determined simultaneously for
any single copy prepared in any state. Moreover, we assume that, if {En } is a
sequence in A such that Ei ∩ Ek = ∅ for i ≠ k, then the proposition α(∪_{n=1}^∞ En)
is true in a copy prepared in a state if and only if there exists exactly one En such
that α(En ) is true in that copy. Actually, the basis for these two assumptions
is the macroscopic nature assumed before for the pointer and the dial. Now let
{En } be a sequence in A such that Ei ∩ Ek = ∅ for i ≠ k; if we determine all
propositions α(En ), for n ∈ N, as well as the proposition α(∪_{n=1}^∞ En) (which
is possible on account of the first assumption above) for a large number N of
copies of the system prepared in a state σ, and if we denote by Nn the number
of copies in which the proposition α(En ) is true and by NU the number of copies
in which the proposition α(∪_{n=1}^∞ En) is true, then we have (on account of the
second assumption above)
∑_{n=1}^∞ Nn/N = NU/N
(the series on the left hand side is actually a sum); since p(π, σ) is the theoretical
prediction of the relative frequency of a proposition π ∈ Π being true in a large
number of copies all prepared in the state σ, the consistency of the theory leads
to the equation
∑_{n=1}^∞ µ^α_σ(En) = ∑_{n=1}^∞ p(α(En), σ) = p(α(∪_{n=1}^∞ En), σ) = µ^α_σ(∪_{n=1}^∞ En).
Thus, the function µ^α_σ is σ-additive. Moreover, since we have assumed that the
apparatus (which is an ideal one) always gives an X-result immediately after a
copy has been prepared in some suitable state, the proposition α(X) is always
true in every state, and therefore the consistency of the theory leads to the
condition µ^α_σ(X) = 1 for every state σ. Thus, µ^α_σ is a probability measure on A
for every state σ.
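The counting argument above can be sketched numerically. The simulation below is illustrative only (the six-point dial X, the events En and the uniform state preparation are hypothetical): for pairwise disjoint En, each copy whose result falls in ∪En falls in exactly one En, so the relative frequencies are exactly additive.

```python
import random

random.seed(0)

# Hypothetical dial: X = {1, ..., 6} (an ideal die), with A = all subsets of X.
# Pairwise disjoint events E1, E2, E3 in A:
events = [{1, 2}, {3}, {5, 6}]
union = set().union(*events)

N = 10_000                                   # number of prepared copies
results = [random.randint(1, 6) for _ in range(N)]

# N_n = copies in which alpha(E_n) is true; N_U = copies for alpha(U E_n).
N_n = [sum(1 for x in results if x in E) for E in events]
N_U = sum(1 for x in results if x in union)

# Each copy landing in the union lands in exactly one E_n, hence
# sum_n N_n / N = N_U / N exactly, which is the counting form of sigma-additivity.
assert sum(N_n) == N_U
```

Dividing the counts by N gives the empirical counterpart of the σ-additivity of µ^α_σ.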
As to the assumption that A = A(dR )X when X is a Borel subset of R, we note
that intervals are most naturally associated with a dial which is represented by
a subset of R, and that A(dR ) is the σ-algebra on R generated by the family of
intervals (cf. 6.1.25). And similarly for the general case A(dn )X .
The determination of all propositions α(E) (i.e. of α(E) for all E ∈ A) for a
copy, prepared in some state, can be performed in actual fact by determining
only some propositions, on account of the assumption that the ascertainment
of the position of the pointer is governed by classical logic (e.g., if X is such that
{x} ∈ A and if α({x}) is true in a copy, then the proposition α(E) is true in
that copy if and only if x ∈ E). The determination of all propositions α(E) for
a copy is said to be a measurement of the observable α in that copy.
(b) The “position of the pointer on the dial” may be actually implemented by an
apparatus which does not comprise a needle-like object over a graduated scale:
it may be the blackening of a grain in a photographic plate (and then X is a
subset of R2 ), or the formation of a bubble in a bubble-chamber (and then X
is a subset of R3 ), or the digital reading of an instrument (and then X is a
subset of R). In any case, the apparatus that was considered above was clearly
an ideal one inasmuch as the position of the pointer was supposed to identify
a point of the dial. But reference to ideal instruments is a common feature
of all mathematical physics (however, we shall see that in quantum mechanics
we do not need an ideal apparatus in order to get exact measurements, if the
observable is quantized; nor do we need an ideal apparatus in the game of dice
or in the game of roulette).
(c) The analysis which was carried out in remark a was aimed at showing that it is
reasonable to represent an instrument, which can measure a physical quantity,
by a mapping α : A → Π which is so that the function µ^α_σ is a probability
measure on A for every state σ, where A is a σ-algebra which represents the
sensible parts of the dial of the instrument. However, even when (X, A) is
chosen in a conservative way (e.g. (X, A) = (R, A(dR )), since the dial of most
measuring instruments can be identified with some part of R), it would be
hard to justify in general the assumption that any mapping α : A → Π for
which µ^α_σ is a probability measure for all σ ∈ Σ should be taken to represent a
measuring instrument, and therefore should be considered a bona-fide X-valued
observable.
is a probability measure for each σ̂ ∈ Σ̂. We still call such mappings X-valued
observables.
Proof. The equalities of the statement follow from the following facts, which are
true because µ^α̂_σ̂ is a probability measure: for each σ̂ ∈ Σ̂,
19.1.12 Remark. Throughout the rest of this chapter, we always assume that we
are dealing with a fixed, although general, physical system for which we assume
that a probability function is defined. The symbols Σ and Π always denote the
families of states and propositions which define the system.
As a rule we drop the carets in the symbols Σ̂, Π̂, σ̂, π̂, α̂ and we leave it to the
reader to understand whether we refer to an equivalence class or to a representative
of it. If σ is an equivalence class of states, a representative of σ is sometimes called
an implementation of σ; and similarly for propositions.
19.1.14 Remark. If α and ϕ are as in the statement of 19.1.13, then ϕ(α) can be
considered an X2 -valued observable, which is called the function of α according to
ϕ. Indeed, there exists a measuring instrument which is represented by ϕ(α) (cf.
19.1.9c), since ϕ(α) can be interpreted as the X2 -valued observable that is defined
operationally by the same apparatus that defines α (cf. 19.1.9a), in which however
a change of scale has been made: while the dial of the apparatus is represented
by X1 when the apparatus is related to α, the dial is represented by X2 when the
apparatus is related to ϕ(α). Assuming first Dϕ = X1 , if a point of the dial is
represented by x ∈ X1 when the scale that defines α is used, then that same point
of the dial is represented by ϕ(x) ∈ X2 when the scale that defines ϕ(α) is used;
thus, in a copy prepared in some state and for any E ∈ A2 , the operational meaning
of the proposition ϕ(α)(E) is so that the proposition ϕ(α)(E) is true if and only if
the X2 -value given by the apparatus that defines ϕ(α) is in E, and this is true if
and only if the X1 -value given by the apparatus that defines α is in ϕ−1 (E), and
this is true (by the operational meaning of the proposition α(ϕ−1 (E))) if and only
if the proposition α(ϕ−1 (E)) is true; thus, the proposition ϕ(α)(E) must coincide
with the proposition α(ϕ−1 (E)). We notice that, in the reasoning just made, there
was no need for ϕ to be injective (e.g., in the game of roulette all observables are
functions of the observable α which assigns a number from 0 to 36 to any copy of
the system, and for instance the observable which assigns the colour-values “rouge”,
“noir” or “nul” is defined by a non-injective function). If Dϕ ≠ X1 we can extend ϕ
to a function ϕ̃ defined on the whole of X1 in any way that makes it A1 -measurable,
and repeat the reasoning for this extension ϕ̃. For each E ∈ A2 we have
p(α(ϕ̃⁻¹(E)), σ) = µ^α_σ(ϕ̃⁻¹(E))
= µ^α_σ(ϕ̃⁻¹(E) ∩ Dϕ) + µ^α_σ(ϕ̃⁻¹(E) ∩ (X1 − Dϕ))
= µ^α_σ(ϕ̃⁻¹(E) ∩ Dϕ) = µ^α_σ(ϕ⁻¹(E)) = p(α(ϕ⁻¹(E)), σ), ∀σ ∈ Σ,
where the monotonicity of µ^α_σ and the condition µ^α_σ(X1 − Dϕ) = 0 have been used;
this proves that α(ϕ̃⁻¹(E)) = α(ϕ⁻¹(E)). Thus, the reasoning we made before
can indeed be referred to the extension ϕ̃, but the observable ϕ̃(α) we obtain does
not depend on the extension we use: it depends only on ϕ, and therefore can be
denoted by the symbol ϕ(α). Furthermore, it can be defined directly through ϕ as
in the statement of 19.1.13.
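The change-of-scale reasoning above can be sketched with the roulette example; the colour assignment below is the standard one but should be read as a hypothetical encoding, and the check is that the proposition ϕ(α)(E) is true exactly when the X1-result lies in ϕ⁻¹(E):

```python
# Hypothetical roulette observable: X1 = {0, ..., 36}; the colour observable
# phi(alpha) uses the non-injective phi : X1 -> X2 = {"rouge", "noir", "nul"}.
X1 = set(range(37))
reds = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}

def phi(x):
    if x == 0:
        return "nul"
    return "rouge" if x in reds else "noir"

def preimage(E):
    """phi^{-1}(E): the numbers whose colour lies in E."""
    return {x for x in X1 if phi(x) in E}

# phi(alpha)(E) := alpha(phi^{-1}(E)): an X1-result x makes phi(alpha)(E)
# true if and only if x lies in phi^{-1}(E).
E = {"rouge"}
assert all((phi(x) in E) == (x in preimage(E)) for x in X1)
assert preimage({"rouge", "noir"}) == X1 - {0}   # every non-zero number is coloured
```

Note that ϕ here is not injective, exactly as in the remark.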
In what follows, we are concerned mainly with R-valued observables, which are
simply called observables.
The set of all possible results for α, i.e. the set spα defined by
spα := {λ ∈ R : ∀ε > 0, ∃σ ∈ Σ so that µ^α_σ((λ − ε, λ + ε)) ≠ 0},
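As an illustrative sketch (not part of the text), the defining condition for spα can be checked for a single hypothetical state whose measure is a sum of two point masses; a finite sample of values of ε stands in for the quantifier ∀ε > 0:

```python
# One hypothetical state sigma with a discrete measure mu^alpha_sigma:
# mu = 0.5*delta_0 + 0.5*delta_1 on (R, A(d_R)).
atoms = {0.0: 0.5, 1.0: 0.5}

def mu(a, b):
    """mu^alpha_sigma((a, b)) for an open interval (a, b)."""
    return sum(w for lam, w in atoms.items() if a < lam < b)

def in_spectrum(lam, eps_list=(1.0, 0.1, 0.01, 0.001)):
    # With a single state, the condition reduces to mu((lam-eps, lam+eps)) != 0
    # for every eps > 0 (sampled here on a finite list).
    return all(mu(lam - e, lam + e) != 0 for e in eps_list)

assert in_spectrum(0.0) and in_spectrum(1.0)   # the point masses are possible results
assert not in_spectrum(0.5)                    # a small interval around 0.5 has measure 0
```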
19.1.16 Remarks.
(a) If a number λ ∈ R happens to be so that, for an observable α, there exists
σ ∈ Σ such that
µ^α_σ({λ}) ≠ 0, (2)
then obviously λ must be considered a possible result for α from an operational
point of view: in N repetitions of the measurement of α in copies of the system
prepared in the state σ, the result λ occurs so often that its relative frequency
will approach a non-null number as N grows. In this case, it is obvious (owing
to the monotonicity of µ^α_σ) that λ fulfills the condition that we have given in
19.1.15 to characterize a possible result for α, and λ is said to be an exact result
for α.
However, condition 2 need not be fulfilled by every number which can occur as
the result obtained in the measurement of α in some copy: in N repetitions of
the measurement of α in a copy prepared in a state, a number λ can indeed
occur, but so seldom that its relative frequency will approach zero as N grows.
This is indeed what we expect to happen if λ belongs to what our theoretical
image of the system depicts as a continuum of possible results (unless state
preparations are assumed to exist that are so “precise” as to pinpoint a value
of an observable amid a continuum of possible values; such state preparations
are not realistic; however, classical mechanics is indeed based on such state
19.1.17 Proposition. For every observable α, the spectrum spα is a closed subset
of R and we have µ^α_σ(R − spα) = 0, or equivalently µ^α_σ(spα) = 1, for all σ ∈ Σ.
µ^α_σ((µ − η, µ + η)) = 0, ∀σ ∈ Σ,
µ^α_σ(R − spα) = 0, ∀σ ∈ Σ,
which is equivalent to µ^α_σ(spα) = 1 for all σ ∈ Σ, since µ^α_σ is a probability measure.
property the family is required to have in 19.1.18. Then, if 3 is true, all the elements
of the family {λn }n∈I are exact results for α (cf. 19.1.16a).
We cannot say that in general every possible result for α is an element of
{λn }n∈I , but we do have spα = cl({λn }n∈I ), the closure of {λn }n∈I . Indeed,
cl({λn }n∈I ) ⊂ spα holds since {λn }n∈I ⊂ spα is obvious and spα is closed, while
for λ ∈ R we have (by the monotonicity of µ^α_σ)
λ ∉ cl({λn }n∈I ) ⇒
[∃ε > 0 s.t. (λ − ε, λ + ε) ⊂ R − {λn }n∈I and hence s.t.
µ^α_σ((λ − ε, λ + ε)) = 0, ∀σ ∈ Σ],
19.1.22 Remarks.
(a) Suppose that α is a discrete observable with a finite family {λk }k∈I of possible
results. Then µ^α_σ({λk }k∈I ) = 1 for all σ ∈ Σ and the results obtained in any
collection of measurements of α are bound to be elements of the family {λk }k∈I .
Now suppose that measurements of α are performed in N copies of the system,
all prepared in the same state σ ∈ Σ. Two important quantities connected with
these measurements are the average of the results and the standard deviation
of the results, which are defined respectively by
Aσ,N (α) := ∑_{k∈I} λk Nk/N and Dσ,N (α) := (∑_{k∈I} (λk − Aσ,N (α))² Nk/N)^{1/2},
if Nk denotes the number of copies for which the result λk has been obtained.
For N large enough, the theoretical predictions of Aσ,N and Dσ,N are respectively
Aσ,th (α) := ∑_{k∈I} λk p(α({λk }), σ) and
Dσ,th (α) := (∑_{k∈I} (λk − Aσ,th (α))² p(α({λk }), σ))^{1/2}
(cf. 8.3.9 and 8.3.8). Thus, for a discrete observable with a finite number of
possible results, the expected result and the uncertainty defined in 19.1.20 are
the theoretical predictions of the average and of the standard deviation of the
results obtained in a large number of measurements.
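As a numerical sketch (the observable, its values λk and the probabilities below are made up), the empirical Aσ,N and Dσ,N approach the theoretical Aσ,th and Dσ,th for large N:

```python
import random

random.seed(1)

# Hypothetical discrete observable: results lambda_k with p(alpha({lambda_k}), sigma).
lam = [-1.0, 0.0, 2.0]
prob = [0.25, 0.50, 0.25]

A_th = sum(l * p for l, p in zip(lam, prob))                       # expected result
D_th = sum((l - A_th) ** 2 * p for l, p in zip(lam, prob)) ** 0.5  # uncertainty

N = 200_000
results = random.choices(lam, weights=prob, k=N)
A_N = sum(results) / N                                             # empirical average
D_N = (sum((x - A_N) ** 2 for x in results) / N) ** 0.5            # empirical std dev

# For large N the measured statistics approach the theoretical predictions:
assert abs(A_N - A_th) < 0.02 and abs(D_N - D_th) < 0.02
```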
The analysis above cannot be carried out for an observable with an infinite
number of possible results, since for an observable α of this kind there might be
possible results λ such that p(α({λ}), σ) = 0 for all σ ∈ Σ (this would represent
the existence of a continuum of possible results). One could argue that for
every actual measuring instrument there exists a finite set which contains all the
results that the instrument can produce, and therefore every actual measuring
instrument must be represented by an observable with only a finite number
of possible results. Thus, one could be tempted into discarding observables
with an infinite number of possible results on the grounds that they are not
realistic. However, physical theories can hardly ever be formulated in terms of
actual measuring instruments and the use of idealized observables is common
practice in physics (for instance, the position and the velocity of a particle
are observables which in both classical and quantum mechanics are
not discrete, even though no actual measuring instrument can pinpoint their
alleged values better than assigning them to intervals related to the resolution
of the instrument; moreover, as to velocity in classical mechanics, no actual
instrument can really compute a derivative). Hence, idealized observables must
be taken into consideration. The idealistic import of this is lessened by the fact
that every observable α can be considered as the limit of a sequence of realistic
observables in the sense explained below.
Let α be an observable. For each n ∈ N, let En be a bounded interval, let
{Fn,k }k∈In be a finite partition of En such that Fn,k is an interval for all k ∈ In ,
let λn,k be a non-null element of Fn,k for all k ∈ In ; further, assume that
En ⊂ En+1 for each n ∈ N, that ∪_{n=1}^∞ En = R, and that lim_{n→∞} ℓn = 0
if ℓn denotes the maximum length of the intervals of the family {Fn,k }k∈In .
For instance, we could have En := [−n, n + 1/2ⁿ), In := {0, ±1, ±2, ..., ±n2ⁿ},
Fn,k := [k/2ⁿ, (k + 1)/2ⁿ), λn,k := (k + 1/2)/2ⁿ. We define the function
ξn := ∑_{k∈In} λn,k χ_{Fn,k}
and the observable αn := ξn (α). The observable αn is discrete and it has a
finite number of possible results since
µ^{αn}_σ({λn,k }k∈In ∪ {0}) = µ^α_σ(ξn⁻¹({λn,k }k∈In ∪ {0})) = µ^α_σ(R) = 1, ∀σ ∈ Σ.
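The dyadic step functions of the example above can be sketched as follows (the choices of En, Fn,k and λn,k are those assumed in the example):

```python
import math

def xi(n, x):
    """Step function xi_n: lambda_{n,k} on F_{n,k} = [k/2^n, (k+1)/2^n), 0 outside E_n."""
    if not (-n <= x < n + 2.0 ** -n):      # outside E_n = [-n, n + 1/2^n)
        return 0.0
    k = math.floor(x * 2 ** n)             # index of the dyadic cell containing x
    return (k + 0.5) / 2 ** n              # lambda_{n,k} = (k + 1/2)/2^n, non-null

x = 0.7
ns = (1, 4, 8, 12)
ell = [2.0 ** -n for n in ns]              # maximal interval length ell_n
errs = [abs(xi(n, x) - x) for n in ns]

# |xi_n(x) - x| <= ell_n on E_n, so xi_n(x) -> x as n -> infinity:
assert all(e <= l for e, l in zip(errs, ell))
```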
Proposition. Let σ ∈ Σ. Then the sequences {Aσ,th (αn )} and {Dσ,th (αn )}
are convergent iff ∫_R ξ² dµ^α_σ < ∞; if these sequences are convergent then
lim_{n→∞} Aσ,th (αn ) = ∫_R ξ dµ^α_σ and
lim_{n→∞} Dσ,th (αn ) = (∫_R (ξ − ∫_R ξ dµ^α_σ)² dµ^α_σ)^{1/2}.
and
Dσ,th (αn ) := (∑_{k∈In} (λn,k − Aσ,th (αn ))² p(αn ({λn,k }), σ))^{1/2}
= (∫_R (ξn − Aσ,th (αn ))² dµ^α_σ)^{1/2} = (∫_R ξn² dµ^α_σ − (Aσ,th (αn ))²)^{1/2},
since p(αn ({λn,k }), σ) = p(α(ξn⁻¹({λn,k })), σ) = µ^α_σ(Fn,k ) for all k ∈ In (the
equalities above are in agreement with what is proved more generally in
19.1.23).
First we assume ∫_R ξ² dµ^α_σ < ∞. Because the measure µ^α_σ is finite, we have
ξ ∈ L¹(R, A(dR ), µ^α_σ) (cf. 11.1.3) and hence |ξ| + ℓn ∈ L¹(R, A(dR ), µ^α_σ) since
1R ∈ L¹(R, A(dR ), µ^α_σ) (cf. 8.2.6). Moreover,
|ξn (x)| ≤ |x| + ℓn and ξn (x) → x as n → ∞, ∀x ∈ R.
Then, by Lebesgue's dominated convergence theorem (cf. 8.2.11) we have
∫_R ξn dµ^α_σ → ∫_R ξ dµ^α_σ as n → ∞.
Also, we have (|ξ| + ℓn )² ∈ L¹(R, A(dR ), µ^α_σ) since 1R ∈ L²(R, A(dR ), µ^α_σ) (cf.
11.1.2a). Moreover,
ξn²(x) ≤ (|x| + ℓn )² and ξn²(x) → x² as n → ∞, ∀x ∈ R.
Then, by Lebesgue's dominated convergence theorem we have
∫_R ξn² dµ^α_σ → ∫_R ξ² dµ^α_σ as n → ∞
and hence
(∫_R ξn² dµ^α_σ − (∫_R ξn dµ^α_σ)²)^{1/2} → (∫_R ξ² dµ^α_σ − (∫_R ξ dµ^α_σ)²)^{1/2}
= (∫_R (ξ − ∫_R ξ dµ^α_σ)² dµ^α_σ)^{1/2}.
Next and conversely we assume that the sequences {Aσ,th (αn )} and
{Dσ,th (αn )} are convergent. Then the sequence {∫_R ξn² dµ^α_σ} is convergent since
(Dσ,th (αn ))² + (Aσ,th (αn ))² = ∫_R ξn² dµ^α_σ,
and hence (cf. 2.1.9)
∃M ∈ [0, ∞) such that ∫_R ξn² dµ^α_σ ≤ M, ∀n ∈ N.
By Fatou's lemma (cf. 8.1.20), this implies that ∫_R ξ² dµ^α_σ ≤ M.
(b) Suppose that an observable α and a state σ ∈ Σ are so that α is evaluable in σ
and ∆σ α = 0. Then ∫_R (ξ − ⟨α⟩σ)² dµ^α_σ = 0, and hence (cf. 8.1.12a) x − ⟨α⟩σ = 0
µ^α_σ-a.e. on R, and hence µ^α_σ(R − {⟨α⟩σ}) = 0, and hence µ^α_σ({⟨α⟩σ}) = 1. This
means that there is a result λ which is obtained with certainty for any number
of copies prepared in the state σ (then, obviously, ⟨α⟩σ = λ).
Conversely, suppose that for σ ∈ Σ there is λ ∈ R such that µ^α_σ({λ}) = 1. Then
µ^α_σ(R − {λ}) = 0. Thus, µ^α_σ is the Dirac measure in λ and we have (cf. 8.3.6)
∫_R ξ² dµ^α_σ = λ² < ∞ (hence, α is evaluable in σ), ⟨α⟩σ = λ, ∆σ α = 0.
Proof. Since µ^{ϕ(α)}_σ(E) = µ^α_σ(ϕ⁻¹(E)) for all E ∈ A(dR ), we obtain the statement
from 8.3.11 (π is there what ϕ is here).
Proof. If α is bounded then there exists k ∈ [0, ∞) such that ξ²(x) = x² ≤ k for
all x ∈ spα , and hence µ^α_σ-a.e. on R for every σ ∈ Σ, since µ^α_σ(R − spα ) = 0 (cf.
19.1.17). The result then follows from 8.2.6.
(∆σ απ )² = (0 − ⟨απ⟩σ)² µ^{απ}_σ({0}) + (1 − ⟨απ⟩σ)² µ^{απ}_σ({1})
Classical statistical theories, although very diverse, have some common features,
some of which we set out here axiomatically. As before, we denote by Σ and Π the
families of states and of propositions that define a fixed physical system, which is
assumed in this section to be described by a classical statistical theory. By Σ, Π
and p we denote what was denoted by Σ̂, Π̂, p̂ in 19.1.5 (cf. 19.1.12).
19.2.2 Remark. The reason behind axiom C1 is that, in a classical theory, the
determination of any proposition π for any copy prepared in any state σ is held to be
implementable in such an “unobtrusive” way that, immediately after the proposition
has been determined, the copy can still be considered as if it had just been prepared
in the state σ. That is to say, recalling that π stands for an equivalence class, there
is an event which belongs to the class π and which requires an interaction, between
a copy prepared in σ and the apparatus in which the event possibly occurs, which
involves so little transfers of e.g. energy, momentum, angular momentum that they
can be considered negligible, so that it is as if nothing had happened to the copy,
which therefore can be considered still in the state σ. This makes it possible to
determine two propositions, one immediately after the other, in the same copy and
assume that the determination of the first of them has had no influence on the
outcome of the determination of the second. Moreover, it makes it possible to
consider immaterial the order in which the two propositions are determined.
In any case, the main role of the condition {s} ∈ A is to make it possible to claim that
there exists a probability measure µs on A such that
p(π, s) = µs (Sπ ), ∀π ∈ Π
(and indeed in 19.2.5c we will see that this condition implies that µs is the Dirac
measure in s, which requires the condition {s} ∈ A in order to be defined).
then
⟨α⟩σ = ∫_S ϕα dµσ and ∆σ α = (∫_S (ϕα − ⟨α⟩σ)² dµσ)^{1/2};
µ^α_s(E) = p(α(E), s), ∀E ∈ A(dR )
(cf. 19.1.8). Hence, in view of 19.2.3 we have µ^α_s(E) ∈ {0, 1} for all E ∈ A(dR ). By
8.3.7, this implies that there exists αs ∈ R so that µ^α_s is the Dirac measure in αs ,
and this implies (cf. 8.3.6) that
∫_R ξ² dµ^α_s = αs² < ∞, ⟨α⟩s = ∫_R ξ dµ^α_s = αs , ∆s α = (∫_R (x − ⟨α⟩s)² dµ^α_s(x))^{1/2} = 0.
Now we prove that spα = cl(Rϕα ), the closure of the range of ϕα . If λ ∈ Rϕα then
there exists s ∈ S such that s ∈ ϕα⁻¹({λ}) and hence
µ^α_s({λ}) = µs (Sα({λ}) ) = µs (ϕα⁻¹({λ})) = 1;
this proves that Rϕα ⊂ spα and hence cl(Rϕα ) ⊂ spα since spα is closed (cf. 19.1.17).
If conversely λ ∉ cl(Rϕα ) then there exists ε > 0 such that ϕα (s) ∉ (λ − ε, λ + ε), and
hence µs (ϕα⁻¹((λ − ε, λ + ε))) = 0, for all s ∈ S (since ϕα (s) ∉ (λ − ε, λ + ε) is
equivalent to s ∉ ϕα⁻¹((λ − ε, λ + ε))); then we have, for every σ ∈ Σ,
µ^α_σ((λ − ε, λ + ε)) = µσ (ϕα⁻¹((λ − ε, λ + ε)))
= ∫_S χ_{ϕα⁻¹((λ−ε,λ+ε))} (s) dµσ (s)
= ∫_S µs (ϕα⁻¹((λ − ε, λ + ε))) dµσ (s) = 0,
we note that if µ^α_σ(R − Dψ ) = 0 for all σ ∈ Σ then obviously µ^α_s(R − Dψ ) = 0 for
all s ∈ S. Finally, we note that if µ^α_s(R − Dψ ) = 0 for all s ∈ S then Sα(R−Dψ ) = ∅,
and hence µ^α_σ(R − Dψ ) = µσ (Sα(R−Dψ ) ) = 0 for all σ ∈ Σ.
Thus, the condition Rϕα ⊂ Dψ holds true if and only if the observable ψ(α) can
be defined (cf. 19.1.23). In this case, ψ ◦ ϕα is an A-measurable function from S to
R such that Dψ◦ϕα = S, and we have
ϕψ(α) (s) = ⟨ψ(α)⟩s = ∫_R ψ dµ^α_s = ψ(⟨α⟩s) = ψ(ϕα (s)) = (ψ ◦ ϕα )(s), ∀s ∈ S,
where the first equation holds by statement b, the second by 19.1.23, and the third
because µ^α_s is the Dirac measure in ⟨α⟩s (cf. statement a).
π ≤ π ′ if Sπ ⊂ Sπ′ .
For each pair {π, π ′ } of elements of Π, the g.l.b. exists and we have inf{π, π ′ } =
π ∧ π ′ , and the l.u.b. exists and we have sup{π, π ′ } = π ∨ π ′ . Further, we have:
π ∧ (π ′ ∨ π ′′ ) = (π ∧ π ′ ) ∨ (π ∧ π ′′ ) and
π ∨ (π ′ ∧ π ′′ ) = (π ∨ π ′ ) ∧ (π ∨ π ′′ ), ∀π, π ′ , π ′′ ∈ Π
π0 ≤ π and π ≤ π1 , ∀π ∈ Π;
π ∧ (¬π) = π0 and π ∨ (¬π) = π1 , ∀π ∈ Π;
¬(¬π) = π, ∀π ∈ Π;
π ≤ π ′ ⇒ ¬π ′ ≤ ¬π, ∀π, π ′ ∈ Π
the family Σ of all states to the family of all probability measures on A, so that A
is the σ-algebra generated by the family {Sπ : π ∈ Π} and
p(π, σ) = µσ (Sπ ), ∀π ∈ Π, ∀σ ∈ Σ
(in this subsection, we denote by (S, A) an abstract measurable space and therefore
we must denote the family of microstates by a different symbol than the symbol S
used before; in what follows the family of microstates is denoted by the symbol Σ0 ).
Further, there is an injective mapping α 7→ ϕα from the family of all observables
to the family of all A-measurable real functions so that for each observable α we
have:
• Sα(E) = ϕα⁻¹(E), ∀E ∈ A(dR );
• α is a bounded observable iff ϕα is a bounded function;
• for a state σ, α is evaluable in σ iff ∫_S ϕα² dµσ < ∞;
• for a state σ, if α is evaluable in σ then
⟨α⟩σ = ∫_S ϕα dµσ and ∆σ α = (∫_S (ϕα (s) − ⟨α⟩σ)² dµσ (s))^{1/2};
19.2.9 Remark. If we consider only one observable in the general statistical theory
of Section 19.1 (and therefore in a quantum theory as a special case), we can note a
similarity between the nature of the probabilities that played a role in that situation
(cf. 19.1.8 and 19.1.9) and the nature of probabilities in a classical statistical theory.
In fact, for a state σ, while the nature of the probability p(π, σ) for a general
proposition π is completely unspecified in the general statistical theory (and indeed
p(π, σ) will be obtained in a quantum theory by an algorithm altogether different
from the one used in a classical theory, cf. 19.3.1 and 19.2.3), if a fixed observable
α is considered then there is a σ-algebra A so that an element E of A represents
the proposition “the position of the pointer is in the section of the dial identified
with E” and the probability of this proposition is µ^α_σ(E), where µ^α_σ is a probability
measure on A, and this is similar to what happens in a classical statistical theory.
Actually, this is due to the classical nature we assumed for the dial and the pointer
19.3.1 Axiom (Axiom Q1). A quantum theory is a statistical theory for which
a separable Hilbert space H is assumed to exist so that:
19.3.2 Remarks.
(a) For all P ∈ P(H) and W ∈ W(H) we have 0 ≤ tr(P W ) ≤ 1 (cf. 18.3.8b).
Thus, condition c in 19.3.1 is consistent with the fact that p is a probability
function.
(b) The structure which emerges from 19.3.1 is a truly statistical one. In a statis-
tical theory, the probabilistic aspects become trivial only when there is a pair
proposition-state (π, σ) such that the probability p(π, σ) is either 0 or 1: the
proposition π is then either never true or always true in all copies of the system
prepared in the state σ. Consider then a proposition π such that Pπ ≠ OH and
Pπ ≠ 1H (this is possible if the dimension of H is greater than one, which we
assume), and a state σ such that Wσ = Au , with u ∈ H̃ (cf. 18.3.2b). Then we
have
p(π, σ) = (u|Pπ u) = ‖Pπ u‖²,
and hence p(π, σ) ≠ 0 and p(π, σ) ≠ 1 whenever u ∉ NPπ ∪ RPπ (cf. 13.1.3c).
Now, there are infinitely many operators Au such that u ∉ NPπ ∪ RPπ .
(c) There are quantum theories, which are said to be “with superselection rules”, for
which the mappings of conditions a and b in 19.3.1 are not surjective. These
theories are outside the scope of this book. Thus, all quantum theories we
discuss are “without superselection rules”.
Proof. We have
tr(Pπ0 Wσ ) = p(π0 , σ) = 0 and tr(Pπ1 Wσ ) = p(π1 , σ) = 1, ∀σ ∈ Σ.
Since the mapping Σ ∋ σ ↦ Wσ ∈ W(H) is surjective, this implies (cf. 18.3.2b)
(u|Pπ0 u) = 0 = (u|OH u) and (u|Pπ1 u) = 1 = (u|1H u) , ∀u ∈ H̃,
and hence
19.3.5 Remarks.
(a) We always assume that the dimension of the Hilbert space H in 19.3.1 is greater
than one, for otherwise the only projections in H would be OH and 1H and
hence the only propositions of the system would be the trivial propositions π0
and π1 .
(b) Let σ ∈ Σ be a state such that Wσ is not a one dimensional projection. Then (cf.
18.3.6) there exist countable families {un }n∈I of elements of H̃ and {wn }n∈I of
elements of (0, 1], so that I contains more than one index, Aui ≠ Auk if i ≠ k,
∑_{n∈I} wn = 1, and
Wσ f = ∑_{n∈I} wn Aun f, ∀f ∈ H. (1)
If we denote by σn the element of Σ such that Wσn = Aun , then we have (cf.
18.3.5c)
p(π, σ) = tr(Pπ Wσ ) = ∑_{n∈I} wn tr(Pπ Aun ) = ∑_{n∈I} wn p(π, σn ), ∀π ∈ Π.
(cf. 18.3.4). In this case, if σn denotes the state which is such that Wσn = Wn ,
the state σ is said to be a mixture of the family {σn }n∈I of states, and the
elements of the family {wn }n∈I are said to be the weights of the decomposition.
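A minimal 2×2 numerical sketch of the decomposition p(π, σ) = ∑n wn p(π, σn); the vectors, weights and proposition below are hypothetical, and real coefficients are used for simplicity:

```python
def proj(u):
    """One-dimensional projection A_u for a real unit vector u (2x2 matrix)."""
    return [[u[i] * u[j] for j in range(2)] for i in range(2)]

def tr_prod(A, B):
    """tr(A B) for 2x2 matrices."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

u1, u2 = [1.0, 0.0], [0.0, 1.0]          # hypothetical orthonormal unit vectors
w = [0.3, 0.7]                           # weights, w1 + w2 = 1
P = proj([2 ** -0.5, 2 ** -0.5])         # P_pi: projection onto (u1 + u2)/sqrt(2)

# W_sigma = w1 A_{u1} + w2 A_{u2}, a statistical operator:
W = [[w[0] * proj(u1)[i][j] + w[1] * proj(u2)[i][j] for j in range(2)]
     for i in range(2)]

p_mixture = tr_prod(P, W)
p_parts = sum(wn * tr_prod(P, proj(un)) for wn, un in zip(w, [u1, u2]))
assert abs(p_mixture - p_parts) < 1e-12  # p(pi, sigma) = sum_n w_n p(pi, sigma_n)
```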
(c) A state σ such that Wσ is a one-dimensional projection cannot be decomposed
into a mixture of other states (cf. 18.3.7). Thus, the probabilities p(π, σ) that
arise in connection with σ are not mixtures of probabilities intrinsic to the
quantum theory that is being discussed and probabilities of a different kind;
they are, that is, purely quantum probabilities. For this reason, a state σ ∈ Σ
such that Wσ is a one-dimensional projection is said to be a purely quantum
state, or simply a pure state. Since the mapping Ĥ ∋ [u] ↦ Au ∈ P(H)
is a bijection from the family Ĥ of all rays of H onto the family of all one-
dimensional projections in H (cf. 13.1.13a), if we denote by Σ0 the family of
all pure states we have a bijection Σ0 ∋ σ ↦ [uσ ] ∈ Ĥ, where [uσ ] denotes, for
any σ ∈ Σ0 , the ray such that Wσ = Auσ , i.e. such that (cf. 18.3.2b)
(d) Suppose we are given a countable family {σn }n∈I of pure states, and for each
n ∈ I let un be an element of H̃ such that Wσn = Aun . Moreover, suppose we
are given a family {αn }n∈I of complex numbers so that ∑_{n∈I} αn un converges
(if it is a series) and ‖∑_{n∈I} αn un ‖ = 1. Then, the bijectivity of the mapping
Σ0 ∋ σ ↦ [uσ ] ∈ Ĥ allows considering the pure state σp which is such that
[uσp ] = [∑_{n∈I} αn un ], i.e. such that Wσp = Au with u := ∑_{n∈I} αn un . This
state is said to be a coherent superposition of the family {σn } of pure states.
Note that, in spite of its name, the state σp actually depends not only on the
family {σn } but also on the choice of the representative un in each equivalence
class [un ].
The bijectivity of the mapping Σ0 ∋ σ ↦ [uσ ] ∈ Ĥ is called the superposition
principle.
We point out that, on the basis of the family {σn }n∈I of pure states considered
above, we can obtain a mixed state for any family {wn }n∈I of elements of (0, 1]
such that ∑_{n∈I} wn = 1, defined as the state σm such that
Wσm f = ∑_{n∈I} wn Wσn f = ∑_{n∈I} wn Aun f, ∀f ∈ H.
Clearly, this state depends only on the equivalence classes [un ] and not on their
representatives.
(e) Suppose we are given an o.n.s. {un }n∈I in H and a family {αn }n∈I of complex
numbers so that ∑_{n∈I} |αn |² = 1. Then we can consider the pure state σp ∈ Σ0
which is such that Wσp = Au , with u := ∑_{n∈I} αn un , or else we can consider the
mixed state σm which is such that Wσm f = ∑_{n∈I} |αn |² Aun f for each f ∈ H.
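The difference between the coherent superposition σp and the mixture σm of remark (e) can be sketched in a 2×2 example (hypothetical basis vectors, α1 = α2 = 1/√2, real coefficients for simplicity): testing the proposition with Pπ = Au separates the two states.

```python
def proj(u):
    """One-dimensional projection A_u for a real unit vector u (2x2 matrix)."""
    return [[u[i] * u[j] for j in range(2)] for i in range(2)]

def tr_prod(A, B):
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

u1, u2 = [1.0, 0.0], [0.0, 1.0]              # an o.n.s. (hypothetical)
a = 2 ** -0.5                                # alpha_1 = alpha_2 = 1/sqrt(2)
u = [a * u1[i] + a * u2[i] for i in range(2)]

W_pure = proj(u)                             # W_{sigma_p} = A_u (coherent superposition)
W_mix = [[0.5 * (proj(u1)[i][j] + proj(u2)[i][j])   # W_{sigma_m}, weights |alpha_n|^2
          for j in range(2)] for i in range(2)]

P = proj(u)                                  # proposition pi with P_pi = A_u
p_pure, p_mix = tr_prod(P, W_pure), tr_prod(P, W_mix)
assert abs(p_pure - 1.0) < 1e-12             # certain in the superposition
assert abs(p_mix - 0.5) < 1e-12              # only probability 1/2 in the mixture
```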
(f) For a proposition π ∈ Π we have (cf. 19.3.4 and 19.1.7, recalling that we “drop
the carets” in conformity with 19.1.12)
Pπ ≠ OH ⇔ π ≠ π0 ⇔ [∃σ ∈ Σ s.t. p(π, σ) ≠ 0].
Moreover, for σ ∈ Σ0 we have p(π, σ) = (uσ |Pπ uσ ) = ‖Pπ uσ ‖² (cf. remark c),
and therefore (cf. 13.1.3c)
p(π, σ) = 1 ⇔ uσ ∈ RPπ and p(π, σ) = 0 ⇔ uσ ∈ NPπ ;
these equivalences show that, for each σ ∈ Σ0 , there are propositions π ∈ Π
such that p(π, σ) ∉ {0, 1} (e.g., assume π such that Pπ = Au , with u ∈ H̃ and
|(u|uσ )| ∉ {0, 1}); from the first equivalence we also have
Pπ ≠ OH ⇔ RPπ ≠ {0H } ⇔ [∃σ ∈ Σ0 s.t. p(π, σ) = 1].
For a proposition π ∈ Π and a state σ ∈ Σ we have (cf. 18.3.9 and 18.3.11)
p(π, σ) = 1 ⇔ RWσ ⊂ RPπ ⇔ Pπ Wσ = Wσ ⇔ Pπ Wσ Pπ = Wσ
and
p(π, σ) = 0 ⇔ RWσ ⊂ NPπ ⇔ Pπ Wσ = OH ⇔ Pπ Wσ Pπ = OH .
If Pπ is a one-dimensional projection, i.e. Pπ = Au with u ∈ H̃, then for a state
σ ∈ Σ we have p(π, σ) = 1 if and only if σ is a pure state and [uσ ] = [u] (cf.
18.3.10).
(g) If, for two propositions π, π ′ ∈ Π, we have
{σ ∈ Σ0 : p(π, σ) = 1} = {σ ∈ Σ0 : p(π ′ , σ) = 1},
then π = π ′ . In fact, the above condition can be written as
{u ∈ H̃ : (u|Pπ u) = 1} = {u ∈ H̃ : (u|Pπ′ u) = 1},
and this can be written as
{u ∈ H̃ : ‖Pπ u‖ = ‖u‖} = {u ∈ H̃ : ‖Pπ′ u‖ = ‖u‖},
and this is equivalent to RPπ = RPπ′ , in view of 13.1.3c. Then, Pπ = Pπ′ and
hence π = π ′ .
19.3.6 Definitions. Let (X, A) be a measurable space and α an X-valued observable.
We define the projection valued mapping
Pα : A → P(H), E ↦ Pα (E) := P_{α(E)} .
For every u ∈ H̃, the function µ^{Pα}_u (cf. Section 13.3 for the definition of µ^P_u)
is a probability measure on A since µ^{Pα}_u = µ^α_{σu} if σu is the pure state such that
Wσu = Au :
µ^{Pα}_u(E) = (u|P_{α(E)} u) = tr(P_{α(E)} Au ) = p(α(E), σu ) = µ^α_{σu}(E), ∀E ∈ A.
19.3.8 Remark. If the assumption is made that every mapping α : A(dR ) → Π, for
which µ^α_σ is a probability measure for all σ ∈ Σ, must be considered an observable,
then every self-adjoint operator in H represents an observable. Indeed, if A is a
self-adjoint operator in H, we can define the mapping αA : A(dR ) → Π by letting
αA (E) be the proposition such that P_{αA(E)} = P^A(E), for all E ∈ A(dR ). Then we
have, for each σ ∈ Σ,
µ^{αA}_σ(E) = p(αA (E), σ) = tr(P^A(E)Wσ ), ∀E ∈ A(dR );
now, this is the projection valued measure of the self-adjoint operator Pπ (cf.
15.3.4D and 13.1.3e); thus Aαπ = Pπ . Furthermore, for every projection P ∈ P(H)
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 643
there exists a proposition π such that P = Pπ, owing to the surjectivity of the map-
ping Π ∋ π ↦ Pπ ∈ P(H).
and hence (cf. 15.2.2) J^{Pα}_ϕ = A_{ϕ(α)}. If, further, (X, A) = (R, A(dR)), then (cf.
15.3.1)
ϕ(Aα) = J^{P^{Aα}}_ϕ = J^{Pα}_ϕ = A_{ϕ(α)}.
c: Assume first that α is a discrete observable, and let {λn}_{n∈I} be a countable
family of real numbers such that µ^α_σ({λn}_{n∈I}) = 1 for all σ ∈ Σ and such that (cf.
19.1.19)
∀n ∈ I, ∃σ ∈ Σ such that µ^α_σ({λn}) ≠ 0.
Then µ^α_σ(R − {λn}_{n∈I}) = 0 for all σ ∈ Σ, and hence (by the monotonicity of µ^α_σ)
µ^α_σ({λ}) = 0 for all σ ∈ Σ and for each λ ∈ R − {λn}_{n∈I}; moreover, the displayed
condition gives (cf. 19.3.5f)
P^{Aα}({λn}) ≠ OH, ∀n ∈ I.
Thus, {λn}_{n∈I} = σp(Aα) by 15.2.5, and hence µ^α_σ(R − σp(Aα)) = 0 for all σ ∈ Σ,
and hence (cf. 19.3.5f) P^{Aα}(R − σp(Aα)) = OH, and hence (cf. 15.3.4B) there
exists a c.o.n.s. in H whose elements are eigenvectors of Aα.
Assume, next and conversely, that there exists a c.o.n.s. in H whose elements
are eigenvectors of Aα. Then (cf. 15.3.4B) P^{Aα}(R − σp(Aα)) = OH, and hence (cf.
19.3.5f) µ^α_σ(R − σp(Aα)) = 0 for all σ ∈ Σ. Since σp(Aα) is countable (cf. 12.4.20C),
this proves that the observable α is discrete.
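In a finite-dimensional H every self-adjoint operator is discrete in this sense, and the c.o.n.s. of eigenvectors can be exhibited directly; a small numerical sketch (the matrix is randomly generated, purely for illustration):

```python
import numpy as np

# In finite dimension every self-adjoint operator is "discrete": eigh returns
# a complete orthonormal system of eigenvectors (illustrative sketch).
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                      # a self-adjoint operator on R^4
evals, V = np.linalg.eigh(A)           # columns of V: eigenvector c.o.n.s.

orthonormal = bool(np.allclose(V.T @ V, np.eye(4)))
# P^A(R - sigma_p(A)) = O_H: the eigenprojections resolve the identity.
resolution = sum(np.outer(V[:, k], V[:, k]) for k in range(4))
complete = bool(np.allclose(resolution, np.eye(4)))
```

The genuinely non-discrete case (continuous spectrum) only appears in infinite dimension, which is why the finite model is a caricature.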
(a) For an observable α, a real number λ, and a state σ, the number µ^α_σ({λ}) is the
probability of obtaining λ as result for α in the state σ. For α and λ we have
(cf. 15.2.5 and 19.3.5f, and recall that P^{Aα} = Pα)
ple since it is easy to see that, for each n ∈ I, statistical operators W ex-
ist so that tr(P_{ϕ(α)}((nδ − δ/2, nδ + δ/2))W) = 1, and hence states σ so that
µ^{ϕ(α)}_σ((nδ − δ/2, nδ + δ/2)) = 1 (however, it may be difficult to attain procedures
which define such states in practice; among these states there are the states
σ for which µ^{ϕ(α)}_σ({nδ}) = 1; states σ for which only the milder condition
µ^{ϕ(α)}_σ((nδ − δ/2, nδ + δ/2)) = 1 is required may be easier to implement). Now,
of 19.3.9 with Dψ ∈ A(dR), can give an observable ψ(α) which is not discrete.
This can be seen from
µ^{ψ(α)}_σ(ψ({λn}_{n∈I})) = µ^α_σ(ψ⁻¹(ψ({λn}_{n∈I}))) ≥ µ^α_σ({λn}_{n∈I}) = 1, ∀σ ∈ Σ
(we have used the monotonicity of µ^α_σ and {λn}_{n∈I} ⊂ ψ⁻¹(ψ({λn}_{n∈I}))). Thus,
the discreteness of the observable α is a property which is shared by all functions
of α.
What we have just seen shows that a discrete observable is an observable so
that at least in principle there are realistic states (i.e. preparation procedures
which do not demand absolute precision for their implementation) in which an
exact result is obtainable with certainty. An observable is said to be quantized
if it is discrete. This idea was expressed by John von Neumann as follows: “In
the method of observation of classical mechanics ... we assign to each quantity
α in each state [what is meant here is ’in each microstate’] a completely deter-
mined value. At the same time, however, we recognize that each conceivable
measuring apparatus, as a consequence of the imperfections of human means
of observations (which result in the reading of the position of a pointer or in
locating the blackening of a photographic plate with only limited accuracy), can
furnish this value only with a certain (never vanishing) margin of error. This
margin of error can, by sufficient refinement of the method of measurement, be
made arbitrarily close to zero but it is never exactly zero. One expects that
this will also be true in quantum theory for those quantities which ... are not
quantized; for example, for the cartesian coordinates of an electron (which can
take on every value between −∞ and +∞, and whose operators have continuous
spectra [what is meant here is that their point spectra are empty]). On the other
hand, for those quantities which ... are ’quantized’, the converse is true: since
these are capable of assuming only discrete values, it suffices to observe them
with just sufficient precision that no doubt can exist as to which one of these
’quantized’ values is occurring. That value is then as good as ’observed’ with
absolute precision. ... This division into quantized and unquantized quantities
corresponds ... to the division into quantities α with an operator Aα that has
a pure discrete spectrum [what is meant here is that P^{Aα}(R − σp(Aα)) = OH],
and into such quantities for which this is not the case. And it was for the
former, and only for these, that we found a possibility of an absolutely precise
measurement — while the latter could be observed only with arbitrarily good
(but never absolute) precision” (Neumann, 1932, p.221–222).
(b) Let α be an observable, and suppose that λ ∈ σc(Aα) (cf. 12.4.22). Then the
result λ can never be obtained exactly with certainty, because λ ∉ σp(Aα).
However, from 19.3.10a and 19.3.5f we have that
∀ε > 0, ∃σ ∈ Σ0 such that µ^α_σ((λ − ε, λ + ε)) = 1.
This means that the result λ can be obtained with certainty with arbitrarily
good precision. Thus, σc (Aα ) can be interpreted as representing a continuum
of possible results for α. To obtain one of these results with absolute precision
would require an absolutely precise preparation procedure (the situation is in
a certain sense opposite to the one discussed in remark a). The treatment of
quantum mechanics based on Hilbert space does not allow these rather idealis-
tic procedures, which are instead part of the treatments of quantum mechanics
that use the notion of “improper eigenfunction” to represent them. Now let von
Neumann speak. “It should be observed that the introduction of an eigenfunc-
tion which is ’improper’, i.e. which does not belong to Hilbert space, gives a less
good approach to reality than our treatment here. For such a method pretends
the existence of such states in which quantities with continuous spectra take
on certain values exactly, although this never occurs. Although such idealiza-
tions have often been advanced, we believe that it is necessary to discard them
on these grounds, in addition to their mathematical untenability” (Neumann,
1932, p.223). We point out that, in this respect, quantum mechanics in Hilbert
space is a construction which requires a smaller amount of idealization than
classical mechanics, which has at its core states (the microstates) in which all
quantities take on exact values with certainty.
What was considered as “mathematically untenable” by von Neumann in 1932
was Dirac’s notion of bras and kets (Dirac, 1958, 1947, 1935, 1930), which
was actually systematized later by the mathematical theory of “rigged Hilbert
spaces”. However, this theory relies heavily on von Neumann’s spectral theorem
and “we must emphasize that we regard the spectral theorem as sufficient for
any argument where a nonrigorous approach might rely on Dirac notation; thus,
we only recommend the abstract rigged space approach to readers with a strong
emotional attachment to the Dirac formalism” (Reed and Simon, 1980, 1972,
p.244).
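The situation discussed at the beginning of this remark, a result λ in the continuous spectrum obtainable with certainty only to within a precision ε, can be caricatured in a finite-dimensional model; the grid, λ and ε below are illustrative choices:

```python
import numpy as np

# Finite caricature (illustrative only): discretize a "position"-like
# observable as an operator which is diagonal in the chosen basis, with
# diagonal entries x, and build a state whose spectral measure is
# concentrated in (lam - eps, lam + eps): the result lam is then obtained
# with certainty to within the precision eps, though never exactly.
x = np.linspace(0.0, 1.0, 1001)        # grid of possible results
lam, eps = 0.5, 0.01

window = np.abs(x - lam) < eps         # indices with x in (lam-eps, lam+eps)
u = window.astype(float)
u /= np.linalg.norm(u)                 # unit vector supported in the window

mu_window = float(np.sum(u**2 * window))            # mu_u((lam-eps, lam+eps))
mean = float(np.sum(x * u**2))                      # mean result <alpha>_u
spread = float(np.sqrt(np.sum(x**2 * u**2) - mean**2))  # Delta_u alpha
```

Shrinking ε narrows the window and sharpens the state, mimicking the "arbitrarily good but never absolute" precision described in the text.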
the results then follow from the definitions given in 19.1.20, 19.1.21, 18.3.14, and
from 18.3.16.
b: Since α² := ξ²(α), from 19.3.9 we have A_{α²} = ξ²(Aα); since ξ²(Aα) = (Aα)²
(cf. 15.3.5), we have A_{α²} = (Aα)². Then the results follow from the results in part a
and from 18.3.17.
c: The results follow from the results in part a and from 18.3.16.
d: If σ is a pure state, then Wσ = Auσ (cf. 19.3.5c). Hence the results are the
particularization of the results of part c to the case of I containing just one index
(cf. also the definitions of hAiu and ∆u A in 15.2.3).
19.3.14 Remarks.
(a) If α is not a bounded observable then D_{Aα} ≠ H (cf. 19.3.10b) and therefore
Wσ Aα is not an element of T(H) and we cannot write ⟨α⟩σ = tr(Wσ Aα) even
if α is evaluable in σ (cf. also 18.3.19a).
(b) The results of 19.3.13c,d show that, if a mixed state σ ∈ Σ is the mixture of
a countable family {σn }n∈I of pure states with weights {wn }n∈I , then for an
observable α which is evaluable in σ we have that α is evaluable in every pure
state σn and
⟨α⟩σ = Σ_{n∈I} wn ⟨α⟩_{σn}.
This supports the idea (cf. 19.3.5b) that σ can be implemented using implemen-
tations of the states σn , by the procedure which is put into effect by carrying
out with probability wn the plan of action σn (this procedure is not precise,
because each time it is put into effect we do not know which plan of action σn is
actually going into effect, but it is not utterly at random, because the probabil-
ities wn are defined). However, we remind the reader that the decomposition of
a mixed state into a mixture is never unique, and thus σ cannot be interpreted
as being necessarily implemented by this procedure: in fact, as an equivalence
class, σ contains all the procedures that can be constructed as above, on the
basis of any decomposition of σ into a mixture of other states.
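The mixture formula ⟨α⟩_σ = Σ_n w_n ⟨α⟩_{σ_n} recalled above can be verified numerically in a toy model (the observable, vectors and weights below are arbitrary illustrative data):

```python
import numpy as np

# Sketch: for a mixed state W = sum_n w_n A_{u_n}, the mean
# <alpha>_sigma = tr(W A) is the weighted average of the pure-state means.
A = np.array([[1.0, 2.0],
              [2.0, -1.0]])                       # a bounded observable on R^2
u1 = np.array([1.0, 0.0])
u2 = np.array([1.0, 1.0]) / np.sqrt(2.0)
w1, w2 = 0.3, 0.7                                 # weights, w1 + w2 = 1

W = w1 * np.outer(u1, u1) + w2 * np.outer(u2, u2) # statistical operator
mean_mixed = float(np.trace(W @ A))               # <alpha>_sigma
mean_pure  = w1 * float(u1 @ A @ u1) + w2 * float(u2 @ A @ u2)
```

Linearity of the trace is the whole content of the identity, which is why it holds for every decomposition of W into a mixture.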
19.3.15 Remarks. The results we have obtained for a quantum theory are consis-
tent with the results we obtained for a general statistical theory in Section 19.1. This
could be checked systematically. We examine here five instances of this consistency.
(a) For an observable α we have spα = σ(Aα ) (cf. 19.3.10a). Then spα is closed
because such is the spectrum of every operator in H (cf. 10.4.6), and this is
consistent with 19.1.17.
(b) For an observable α and a function ϕ as in 19.3.9 we have A_{ϕ(α)} = ϕ(Aα).
Then, for a pure state σ, 19.3.13d, 15.3.2 and µ^{P^{Aα}}_{uσ} = µ^α_σ (cf. 19.3.6) imply
19.1.23.
(c) If an observable α is bounded then the operator Aα is bounded (cf. 19.3.10b),
and hence Aα is computable in Wσ for every σ ∈ Σ (cf. 18.3.18), and hence α
is evaluable in every σ ∈ Σ (cf. 19.3.13a). This is consistent with 19.1.25. In a
quantum theory we can also prove the converse of 19.1.25: if an observable α
is evaluable in every state, then α is evaluable in every pure state, and hence
D_{Aα} = H (cf. 19.3.13d), and hence α is bounded (cf. 19.3.10b).
(d) For each π ∈ Π we have A_{απ} = Pπ (cf. 19.3.8). Then, since Pπ is bounded (cf.
13.1.3d), A_{απ} is computable in Wσ for every σ ∈ Σ (cf. 18.3.18), and hence απ
is evaluable in every σ ∈ Σ (cf. 19.3.13a). Moreover, for each σ ∈ Σ, 19.3.13a
implies that
(a) λ ∈ σ(Aα);
(b) ∀ε > 0, ∃σε ∈ Σ0 such that α is evaluable in σε, |⟨α⟩_{σε} − λ| < ε, ∆_{σε}α < 2ε.
19.3.19 Remarks.
(a) The result in 19.3.18 confirms the interpretation that was made in 19.3.12a
of σp (Aα ), for an observable α in quantum mechanics: a real number λ is an
eigenvalue of Aα if and only if there exists a pure state σ so that λ is the result
that is always obtained when α is measured for any number of copies prepared in
σ; in fact (cf. 19.1.22b) the meaning of ∆σ α = 0 is that the same result is always
obtained for any number of measurements (then, of course, this result is also
the mean result). It is also clear from 19.3.13d that, for a pure state σ in which
α is evaluable, we have ⟨α⟩σ = λ and ∆σα = 0 if and only if λ is an eigenvalue
of Aα and uσ is an eigenvector of Aα corresponding to λ; and indeed this is
true if and only if (cf. 15.2.5e and 13.1.3c) µ^α_σ({λ}) = ‖P^{Aα}({λ})uσ‖² = 1, in
agreement with what was seen in 19.3.12a.
(b) For an observable α, a pure state σ, a real number λ, in remark a we saw that
µ^α_σ({λ}) = 1 if and only if λ is an eigenvalue of Aα and uσ is an eigenvector of
Aα corresponding to λ. More generally we have (cf. 19.3.5c)
µ^α_σ({λ}) = (uσ|P^{Aα}({λ})uσ).
Thus, if λ ∈ σp(Aα) and {u_{λ,d}}_{d∈Iλ} is an o.n.s. in H which is complete in
N_{Aα−λ1H}, i.e. so that V{u_{λ,d}}_{d∈Iλ} = N_{Aα−λ1H}, we have (cf. 15.2.5e and
13.1.10)
µ^α_σ({λ}) = Σ_{d∈Iλ} |(u_{λ,d}|uσ)|².
If the dimension of N_{Aα−λ1H} is one, i.e. if λ is a non-degenerate eigenvalue of
Aα, we have
µ^α_σ({λ}) = |(uλ|uσ)|²,
by 19.3.5f, since µ^α_σ({λ}) = p(α({λ}), σ) and P_{α({λ})} = P^{Aα}({λ}).
d: Condition c obviously implies P^{Aα}({λ}) ≠ OH and hence λ ∈ σp(Aα) (cf.
15.2.5). In 19.1.22b it was proved that condition b implies ⟨α⟩σ = λ.
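The expression of µ^α_σ({λ}) through an o.n.s. of eigenvectors, discussed in 19.3.19b, can be checked in a small example with a degenerate eigenvalue (toy operator and state, not from the text):

```python
import numpy as np

# Sketch: for lam in sigma_p(A) with orthonormal eigenvectors {u_{lam,d}},
# mu_sigma({lam}) = sum_d |(u_{lam,d}|u_sigma)|^2 = ||P^A({lam}) u_sigma||^2.
A = np.diag([2.0, 2.0, 5.0])            # lam = 2 is a degenerate eigenvalue
u_lam = [np.array([1.0, 0.0, 0.0]),     # o.n.s. spanning N_{A - 2*1_H}
         np.array([0.0, 1.0, 0.0])]
u_sigma = np.array([3.0, 0.0, 4.0]) / 5.0   # a unit vector (pure state)

mu = sum(abs(v @ u_sigma) ** 2 for v in u_lam)         # Born rule, summed over d
P_lam = sum(np.outer(v, v) for v in u_lam)             # P^A({2})
mu_proj = float(np.linalg.norm(P_lam @ u_sigma) ** 2)  # same number
```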
19.3.21 Remark. From 19.3.20 we have that, for an observable α and a state σ
in which α is evaluable, ∆σα = 0 is possible if and only if σp(Aα) ≠ ∅; moreover, if
σp(Aα) ≠ ∅ then ∆σα = 0 holds if and only if there exists an eigenvalue λ of Aα
such that µ^α_σ({λ}) = 1, namely an eigenvalue of Aα which is the result that is always
obtained when α is measured in any number of copies prepared in σ.
From 19.3.13c we also have that an observable α is evaluable in a state σ with
∆σα = 0 if and only if every collection of pure states, into a mixture of which σ can be
decomposed, consists of states represented by eigenvectors of Aα corresponding
to ⟨α⟩σ, which is then the eigenvalue λ of Aα such that µ^α_σ({λ}) = 1, or equivalently
such that RWσ ⊂ R_{P^{Aα}({λ})}. If the state σ is pure, we have ∆σα = 0 if and only
if uσ is an eigenvector of Aα; if this holds true, then ⟨α⟩σ is the eigenvalue of Aα
to which uσ corresponds. Thus, we have derived the results of 19.3.19a as a special
case of the results obtained in the present remark.
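The equivalence between ∆_σα = 0 and u_σ being an eigenvector of A_α, for a pure state, is easy to verify numerically (illustrative matrix and vectors):

```python
import numpy as np

# Sketch: for a pure state u, Delta_u(alpha)^2 = (u|A^2 u) - (u|A u)^2
# vanishes exactly when u is an eigenvector of A (toy matrix, illustration).
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])                     # eigenvectors (1, +-1)/sqrt(2)

def variance(A, u):
    """Squared uncertainty (Delta_u alpha)^2 for a unit vector u."""
    mean = float(u @ A @ u)
    return float(u @ A @ A @ u) - mean ** 2

u_eig = np.array([1.0, 1.0]) / np.sqrt(2.0)    # eigenvector, eigenvalue 1
u_not = np.array([1.0, 0.0])                   # not an eigenvector

var_eig, var_not = variance(A, u_eig), variance(A, u_not)
```

Zero variance means the same result in every measurement, which is exactly the "exact result with certainty" of the text.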
Pπ′ := U Pπ U⁻¹, ∀π ∈ Π,
The subject of this section is sometimes known as von Neumann’s and Lüders’
reduction postulates. We start by examining in 19.4.1 two experiments which we
consider to be paradigmatic of what we later analyse in the abstract.
As before, in this and in the following sections Σ and Π denote the families
of equivalence classes (cf. 19.1.12) of states and propositions of a given quantum
system (i.e., a system described by a quantum theory), and H denotes the Hilbert
space in which they are represented as summarized in 19.3.22.
19.4.1 Remarks.
emulsion). Even when these catastrophic events do not occur, for other determi-
nation techniques, the analysis of the process of determination of a proposition
in the physics of microparticles (initiated by Werner Heisenberg) leads to the
conclusion that the determination of a proposition is a process which is bound
to alter in a substantial way the copy for which the determination is carried
out. As a matter of fact, an alteration takes place in classical physics too, but
in classical physics it is assumed that the determination of any proposition in
any state can always be implemented by probing the copy in such a way that
the alteration of the copy is negligible (cf. 19.2.2). Since this is not the case for
microparticles, in quantum mechanics (which deals mostly with microparticles)
we must acknowledge that a proposition is true in a copy, or it is not true,
only upon its determination, and not in general also immediately after that.
However the case may be that the experimental set-up which is used for the
determination of a proposition π can be modified so that it selects copies for
which π is certainly true: if π is determined for any number of copies “emerg-
ing” from the modified set-up, then π will be found to be true in all of them.
In what follows, we provide two examples of this sort.
(b) As a first example, we consider the method depicted in fig. 1 (all figures are
on page 656) for determining the magnitude of the linear momentum (in what
follows, briefly, “momentum”) of a charged particle. To the left of the screen S1
a particle of known charge e is produced which, after passing through the narrow
openings O1 and O2 in the screens S1 and S2, is deflected by a uniform magnetic
field H⃗, which is present to the right of the screen S2 and orthogonal to the
plane of the drawing. In D there is a detector. If the particle is detected in the
region D, the magnitude of the momentum of the particle is determined to be
pD = eHrD (in suitable units), where rD is as in fig. 1. In fact, if the particle is
classical (i.e., a charged particle which is not a microparticle) then its trajectory
is a circle with a radius depending on the momentum as in the formula just used;
thus, from the region of localization of the particle we can deduce the magnitude
of its momentum, and indeed the fact that detection of a charged classical
particle in D corresponds to the magnitude pD = eHrD is uncontentious. If
the particle is not a classical particle, but it is a microparticle instead, the whole
description given above, which is based on the idea of a trajectory, is meaningless
(for a microparticle the concept of a trajectory loses its meaning, as first pointed
out by Werner Heisenberg); the observable “magnitude of momentum” is then
defined as the observable to which the result pD := eHrD is ascribed if detection
in the region D occurs (and other results in other similar experiments); indeed,
if the particle is a microparticle, the experimental arrangement described above
is one of those which give an empirical meaning to the concept of momentum
of a microparticle. One could ask the question: “how can I know that the
macroscopic event that happened in a detector located in D (as for instance
the blackening of a spot of a photographic plate or the click of a detector) is due
S3 , with the apertures of the “new” screens S1′ and S2′ on the line of the hypo-
thetical beam coming out of A (we are now using, as “new” source of copies, the
source to the left of the screen S1 plus the first Stern–Gerlach device with the
photographic plate replaced by the screen S3 ) and with a “new” photographic
plate, we see that all the copies that are detected by the “new” photographic
plate leave marks in the upper region of the plate. As in remark b, if the
state σ in which the copies are prepared to the left of the screen S1 is so that
p(πz+ , σ) 6= 0, then σ plus the modified Stern–Gerlach device of fig. 4 amounts
to a new state preparation procedure σ ′ which is so that RWσ′ ⊂ RPπz+ (cf.
19.3.5f). Now, a spin one-half microparticle is wholly described by a quantum
theory the Hilbert space of which is not two-dimensional. However, if one is
interested in studying spin (beside sz , there are other spin observables, one
for each direction in three-dimensional space) and nothing else, then one can
give a partial description of a spin one-half microparticle in a two-dimensional
Hilbert space, e.g. C2 (at the opposite end, if spin is disregarded one can give
a partial description of the same microparticle in L2 (R3 )). In that case, the
projection Pπz+ is one-dimensional and one can conclude that, whatever the
state σ to the left of S1 , if p(πz+ , σ) 6= 0 then the copies that are selected by
the procedure described above are in the pure state σ ′ represented by the ray
[uσ′ ] of C2 (cf. 19.3.5c) which is so that Auσ′ = Pπz+ ; indeed, p(πz+ , σ ′ ) = 1
implies now Wσ′ = Pπz+ since Pπz+ is now one-dimensional (cf. 19.3.5f). Thus,
in the partial description in which just the spin observables are represented,
the procedure described above can be interpreted, provided p(πz+ , σ) 6= 0, as
an implementation of the pure state represented by the ray which contains the
normalized eigenvectors of the self-adjoint operator A_{sz} corresponding to the
eigenvalue 1/2. If a large number N of copies are prepared in the state σ to the
left of S1 , this procedure selects approximately p(πz+ , σ)N copies which are in
this pure state.
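A minimal numerical sketch of the two-dimensional spin description, assuming the partial description in C² with A_{s_z} = diag(1/2, −1/2); the incoming state and the number N of copies are invented for illustration:

```python
import numpy as np

# Two-dimensional spin sketch: P_{pi_z+} projects onto the +1/2 eigenvector
# of A_{s_z} = diag(1/2, -1/2). Incoming state and N are illustrative.
P_zplus = np.array([[1.0, 0.0],
                    [0.0, 0.0]])               # projection onto +1/2 eigenspace
u_sigma = np.array([0.6, 0.8])                 # some incoming pure state

p = float(np.linalg.norm(P_zplus @ u_sigma) ** 2)   # p(pi_z+, sigma)
N = 10_000
selected = p * N          # approximate size of the selected subensemble

# Every selected copy is in the pure state [u'] with A_{u'} = P_{pi_z+}:
u_out = P_zplus @ u_sigma / np.linalg.norm(P_zplus @ u_sigma)
```

Whatever the incoming state, the outgoing ray is the same: this is the one-dimensionality of P_{π_z+} at work.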
19.4.2 Definition. We say that we have a filter for a proposition π ∈ Π if we
have, for every state preparation σ ∈ Σ such that p(π, σ) 6= 0, an experimental set-
up which can be added to a definite experimental implementation of σ and which
affects a collection of copies prepared in σ as follows:
some copies are “absorbed” or “destroyed” by the set-up (i.e., “after” the setup,
no effect can be observed that can be related to those copies);
there is the probability p(π, σ) that a copy will not be absorbed;
there is a state (as an equivalence class) σ ′ which depends on σ and for which
p(π, σ ′ ) = 1, so that if a copy has not been absorbed then it is in σ ′ (a copy which
has not been absorbed by the set-up is said to have gone through the filter ).
Thus, if p(π, σ) 6= 0 then there is a state σ ′ so that p(π, σ ′ ) = 1 and so that the
experimental implementation of σ combined with the experimental set-up of the
filter amounts to an experimental implementation of σ ′ .
19.4.3 Remarks.
(a) In 19.4.2 it is not asserted that a filter exists for every proposition π. Indeed,
such a claim would be an axiom. However, an even stronger assumption will
actually be made in 19.4.6.
(b) In 19.4.2 it is not maintained that a filter produces copies of the system. Rather,
we can say that a filter selects and modifies copies of the system. In fact, the
definition of a filter implies that, if a state preparation procedure σ ∈ Σ is
activated then the filter affects the copy so that the copy is either absorbed or
modified into a new copy (i.e., a copy in a new state). If p(π, σ) 6= 0, we can say
that the filter transforms the state σ into a new state σ ′ . This transformation
is called a state reduction. Note that, in a given experimental situation, σ and
σ ′ are represented by different ensembles: if we have an ensemble consisting of
a large number N of copies prepared in σ, then “after” the filter we have a new
ensemble consisting of approximately p(π, σ)N copies prepared in σ ′ .
(c) For a proposition π ∈ Π there may exist essentially different filters. In fact,
if p(π, σ) 6= 0, the state σ ′ is only subject to the condition RWσ′ ⊂ RPπ (cf.
19.3.5f). Thus, there may exist different experimental set-ups which act as
filters for the same proposition but lead to different state-reductions.
(d) It is expedient to define an equivalence relation in the family of filters for a
proposition π ∈ Π, by defining two filters equivalent if they transform in the
same way any state σ ∈ Σ such that p(π, σ) 6= 0 (it is obvious that this defines
an equivalence relation). An equivalence class is still called a filter. A repre-
sentative of an equivalence class is sometimes called an implementation of the
filter.
(e) If π ∈ Π is such that Pπ is a one-dimensional projection, that is to say Pπ = Au
with u ∈ H̃, then just one filter (as an equivalence class) can exist, because
p(π, σ ′ ) = 1 then implies Wσ′ = Au (cf. 19.3.5f). This can be rephrased as
follows: if π is represented by a one-dimensional projection Au, then the only
state that can be obtained by supplementing any state with a filter for π is the
pure state represented by the ray [u] (cf. 19.3.5c).
(f) Suppose we have, for a proposition π ∈ Π, an experimental implementation of π
which includes a detector so that the event which defines π is declared to have
occurred when the detector “clicks”. Then it is often possible to convert this
apparatus into a filter for π by replacing the detector with a suitably oriented
screen in which an aperture is opened in the shape of the detector. This is in
fact what was done in the two examples of 19.4.1, which are examples of how
filters can be obtained by modifying pieces of equipment originally designed for
determining propositions.
Wσ′ = W_{σ,π} := (1/tr(Pπ Wσ Pπ)) Pπ Wσ Pπ = (1/tr(Pπ Wσ)) Pπ Wσ Pπ.
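The reduction formula above can be sketched numerically; the projection and statistical operator below are illustrative, and the function name is ours:

```python
import numpy as np

# Sketch of the ideal-filter reduction: W' = P W P / tr(P W),
# using tr(P W P) = tr(P^2 W) = tr(P W).
def luders(P, W):
    """State reduction by an ideal filter; requires tr(P W) != 0."""
    p = float(np.trace(P @ W))          # = p(pi, sigma)
    return P @ W @ P / p, p

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])         # a two-dimensional projection
W = np.diag([0.5, 0.3, 0.2])            # a statistical operator (trace 1)

W_red, p = luders(P, W)
p_after = float(np.trace(P @ W_red))    # p(pi, sigma') = 1 after the filter
```

The test below checks the three facts used in remark 19.4.5a: the reduced operator has trace one, the filter passes copies with probability p(π, σ), and the proposition is certainly true afterwards.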
19.4.5 Remarks.
(a) The condition that defines an ideal filter in 19.4.4 is consistent. Indeed, for
every projection P ∈ P(H) and every statistical operator W ∈ W(H) we have:
P W P ∈ T(H) by 18.2.7;
0 ≤ (P f|W P f) = (f|P W P f), ∀f ∈ H, since P = P†;
tr(P W P) = tr(P²W) = tr(P W) by 18.2.11c, since P = P²;
this shows that, if tr(P W) ≠ 0, then (1/tr(P W P)) P W P ∈ W(H). Also, recall that
tr(Pπ Wσ) = p(π, σ). Furthermore, it is clear that p(π, σ′) = 1 since Pπ² = Pπ
implies Pπ W_{σ,π} = W_{σ,π} (cf. 19.3.5f).
An ideal filter can be regarded as a filter which alters any “incoming” state σ as
little as possible. In fact, for the “outgoing” state σ ′ we must have RWσ′ ⊂ RPπ
(cf. 19.4.3c), and the operator Pπ Wσ Pπ is so to speak just the operator Wσ
“reduced” to the subspace RPπ .
(b) If an ideal filter for a proposition π exists then it is clearly unique (as an
equivalence class), owing to the injectivity of the mapping σ 7→ Wσ .
(c) For a proposition π which is represented by a one-dimensional projection, only
one filter can exist (cf. 19.4.3e). Actually, if a filter exists then it is the ideal
filter. Indeed, if a filter for π exists then it transforms every state σ ∈ Σ
such that p(π, σ) 6= 0 into the state σ ′ which is represented by the statistical
operator Wσ′ = Au, if Pπ = Au with u ∈ H̃ (cf. 19.4.3e). Now, for each
W ∈ W(H) and each u ∈ H̃ so that tr(Au W) ≠ 0 we have R_{Au W Au} = V{u}
(notice that Au W Au ≠ OH since (u|Au W Au u) = (u|W u) = tr(Au W)) and hence
(1/tr(Au W)) Au W Au = Au (this follows easily from 18.3.2c). Thus Wσ′ = W_{σ,π}.
(d) If the ideal filter exists for a proposition π, then it transforms each pure state σ
such that p(π, σ) ≠ 0 into the pure state represented by the ray [(1/‖Pπ uσ‖) Pπ uσ]
(cf. 19.3.5c). Indeed, for P ∈ P(H) and u ∈ H̃ we have
P Au P f = (u|P f) P u = (P u|f) P u, ∀f ∈ H,
tr(P Au) = (u|P u) = ‖P u‖² (cf. 18.3.2b),
and hence, if tr(P Au) ≠ 0, (1/tr(P Au)) P Au P = Au′ with u′ := (1/‖P u‖) P u.
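The computation in remark d can be reproduced with concrete toy vectors (all data illustrative):

```python
import numpy as np

# Sketch of remark d: P A_u P / tr(P A_u) equals A_{u'} with u' = P u / ||P u||.
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])               # a projection
u = np.array([0.6, 0.8])                 # a unit vector
A_u = np.outer(u, u)                     # one-dimensional projection A_u

reduced = P @ A_u @ P / np.trace(P @ A_u)
u_prime = P @ u / np.linalg.norm(P @ u)
A_u_prime = np.outer(u_prime, u_prime)   # should coincide with `reduced`
```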
(e) For a proposition π and a state σ, the ideal filter for π (if it exists) transforms σ
into itself, i.e. we have σ′ = σ in 19.4.4, if and only if p(π, σ) = 1. This follows
at once from the equivalence between p(π, σ) = 1 and Pπ Wσ Pπ = Wσ (cf.
19.3.5f). Indeed, p(π, σ) = 1 implies Pπ Wσ Pπ = Wσ and tr(Pπ Wσ ) = 1, and
hence Wσ,π = Pπ Wσ Pπ = Wσ . Conversely, since obviously Pπ Wσ,π Pπ = Wσ,π ,
Wσ,π = Wσ implies Pπ Wσ Pπ = Wσ and hence p(π, σ) = 1.
19.4.6 Axiom (Axiom Q2). The ideal filter exists for every proposition π ∈ Π.
19.4.7 Remarks.
(a) Axiom Q2 is a version of what is sometimes called Lüders’ reduction axiom. A
milder version of the axiom would be to assume that a filter exists for every
proposition represented by a one-dimensional projection. This milder version
would be a version of what is sometimes called von Neumann’s reduction axiom,
or projection postulate.
We point out that in our approach to quantum mechanics, in which states
correspond to ensembles of copies prepared in a definite way, the transformation
of a state σ into a pure state σ ′ such that [uσ′ ] = [u], upon action of a filter for
the proposition represented by a one-dimensional projection Au , is an immediate
consequence of the definition of filter (cf. 19.4.3e) and it does not need to be
assumed. However, it is not obvious that a filter does exist for every one-
dimensional proposition (even less, that a filter exists for every proposition).
(b) For all u, v ∈ H̃, axiom Q2 implies that there exists an experimental set-up
which can be used in conjunction with an apparatus which implements the pure
state σ represented by the ray [v] (cf. 19.3.5c) so that, when the set-up is used,
there is the probability | (u|v) |2 that a copy prepared in σ is modified into a
copy in the pure state σ ′ represented by the ray [u]. Indeed, any implementation
of the filter for the proposition π represented by the one-dimensional projection
Au is such an experimental set-up, since
| (u|v) |2 = (v|Au v) = p(π, σ)
(cf. 19.4.2 and 19.4.3e). For this reason, the number | (u|v) |2 is called the
transition probability from the pure state represented by v to the pure state
represented by u. We point out that the transition probability from one pure
state to another is one if and only if the two states coincide (cf. 10.1.7b and
13.1.13a; also, this is a special case of 19.4.5e).
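The transition probability |(u|v)|² can be illustrated in C²; note that it is insensitive to the phase of the representatives of the rays (vectors chosen arbitrarily):

```python
import numpy as np

# Sketch: the transition probability |(u|v)|^2 between pure states depends
# only on the rays [u], [v], and equals 1 exactly when the states coincide.
def transition(u, v):
    """|(u|v)|^2 for unit vectors u, v (vdot conjugates its first argument)."""
    return abs(np.vdot(u, v)) ** 2

v = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
u_same  = np.exp(1j * 0.7) * v                 # same ray, different phase
u_other = np.array([1.0, 0.0], dtype=complex)  # a different ray

t_same, t_other = transition(u_same, v), transition(u_other, v)
```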
19.4.9 Remarks.
(a) If we have a first kind implementation of a proposition, this happens notwith-
standing the cautionary remarks of 19.4.1a. Some assume that there are first
kind implementations for all propositions, but we do not make this assumption.
(b) Clearly, a first kind (respectively an ideal) implementation of a proposition π is
a collection of procedures which amounts to a filter (respectively an ideal filter)
for π if they are supplemented with devices which absorb all the copies in which
¬π has been found to be true (i.e., in which π has not been found to be true).
(c) If a proposition π is represented by a one-dimensional projection then a first
kind implementation of π is necessarily an ideal one (cf. 19.4.5c).
19.4.11 Remarks. Wolfgang Pauli introduced the distinction between first and
second kind measurements (Pauli, 1933), when he distinguished between two types
of measurements. The first type of measurement brings (or leaves) the copy of the
system into a state in which the observable that has been measured surely gives the
result that has been the outcome of the measurement if it is measured a second time.
The second type of measurement either destroys the copy or else changes its state
arbitrarily. For an example of each type, we quote from Josef M. Jauch (note that
Jauch calls “value” what we call “result”). “First we consider the measurement
of the position of some elementary particle by a counter with a finite sensitive
volume. After the measurement has been performed and the counter has recorded
the presence of a particle inside its sensitive volume, we know for certain that the
particle, at the instant of the triggering, is actually inside the sensitive volume. By
this we mean the following: Suppose we repeated the measurement immediately
after it has occurred (this is of course an idealization, since counters are notorious
for having a dead time after they are triggered), then we would with certainty
observe the particle inside the volume of the counter. In the second example, we
consider a momentum measurement with a counter which analyzes the pulse height
of a recoil particle. Here the situation is quite different. The experiment will permit
us to determine the value of the momentum only before the collision occurred. If
we repeat the measurement immediately after it has occurred, then we find that the
momentum of the particle will have a quite different value from its measured value.
The very act of measurement has changed the momentum, and it is this change
which produced the observable effect. We shall call a measurement which will give
the same value when immediately repeated a measurement of the first kind. The
second example is then a measurement of the second kind ” (Jauch, 1968, p.165).
19.4.12 Remarks.
(a) In what follows we assume that α is a discrete observable. Then the self-
adjoint operator Aα that represents α is the operator determined by a family
{(λn , Pn )}n∈I as A was in 15.3.4B (cf. 19.3.10c). Since {λn }n∈I = σp (Aα ),
{λn }n∈I is the family of all exact results for α (cf. 19.3.12a); moreover, Pn =
P Aα ({λn }) = Pα({λn }) for each n ∈ I. In what follows we consider a definite
state σ ∈ Σ.
First, suppose that we have an ideal measurement of α in an ensemble repre-
senting σ, i.e. in a large number N of copies of the system all prepared in the
state σ, and that, for a definite n ∈ I, a device is installed which absorbs all the
copies in which the result λn has not been found, i.e. in which the proposition
α({λn }) is not true. Then we have an ideal filter for α({λn }) (cf. 19.4.9b).
This selects, from the original ensemble of copies, a subensemble containing ap-
proximately p(α({λn }), σ)N copies which are in the state σn′ represented by the
statistical operator
W_{σn′} = W_{σ,α({λn})} = (1/tr(P_{α({λn})} Wσ)) P_{α({λn})} Wσ P_{α({λn})} = (1/tr(Pn Wσ)) Pn Wσ Pn,
provided this subensemble is not empty, i.e. provided p(α({λn}), σ) =
tr(Pn Wσ) ≠ 0 (cf. 19.4.3b and 19.4.4).
Next suppose that we are in a different situation, and that we have just one copy
which had been previously prepared in σ and in which an ideal measurement
of α has given the exact result λn . Then immediately after the measurement
the copy is in the state σn′ . Indeed, since we are considering the copy after the
proposition α({λn }) has been determined to be true in it, there is no need to
select the copy since everything is as if the copy had gone through an ideal filter
for α({λn }) (if we had provided a device that would absorb the copies in which
α({λn }) was not true, our copy would not have been absorbed).
Suppose once again that we have an ideal measurement of α in a copy pre-
pared in σ, but this time the result obtained has not been recorded; i.e.,
there has been a result which was necessarily one of the numbers in {λn }n∈I
(since a measurement of α means that all propositions α({λn }) have been
determined, and hence one of them has been found to be true because the
elements of {λn }n∈I are the only numbers that can be obtained as results
in view of the fact that P Aα (R − {λn }n∈I ) = OH and this implies that
p(α(R − {λn }n∈I ), σ) = tr(P Aα (R − {λn }n∈I )Wσ ) = 0), but the measuring
apparatus has failed to keep record of the result (if we include ourselves as ob-
servers in the measuring apparatus, this could mean that we have not registered
the result in our memories or elsewhere). Then we only know that immediately
after the measurement the copy has probability
tr(Pn Wσ ) = tr(Pα({λn }) Wσ ) = p(α({λn }), σ)
of being in the state σn′ , and thus we must conclude (cf. 19.3.5b) that the state
of the copy after the measurement is the mixed state σ ′′ represented by the
statistical operator Wσ′′ defined by
Wσ′′ f := Σn∈I0 (tr(Pn Wσ)) Wσn′ f = Σn∈I Pn Wσ Pn f, ∀f ∈ H,
in fact, our ignorance is smaller than it was in the previous case, and we modify
the probabilities of the previous case as we should do if they were classical
probabilities. Proceeding as before and observing that
Pα({λk}k∈IE) = P Aα({λk}k∈IE) = Σk∈IE Pk
(cf. 15.3.4B), we see that immediately after the measurement the copy is in the
state σE′′ represented by the statistical operator WσE′′ defined by
WσE′′ f := (1/Σk∈IE tr(Pk Wσ)) Σn∈IE Pn Wσ Pn f, ∀f ∈ H
(note that tr((Σk∈IE Pk)Wσ) = Σk∈IE tr(Pk Wσ) by 18.3.12 and that
tr WσE′′ = 1 since tr(Pn Wσ Pn) = tr(Pn Wσ)).
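The effect of an unrecorded (non-selective) ideal measurement, σ → σ′′ with Wσ′′ = Σn Pn Wσ Pn, can be sketched numerically; the state and projections below are our own toy choices, not taken from the text:

```python
import numpy as np

# A pure superposition on C^2: Wσ = |u><u| with u = (e1 + e2)/sqrt(2).
u = np.array([1.0, 1.0]) / np.sqrt(2)
W = np.outer(u, u)

# Eigenprojections P_1, P_2 of a discrete observable (the standard basis).
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

# Non-selective ideal measurement: the off-diagonal terms of Wσ are erased.
W_after = sum(Pn @ W @ Pn for Pn in P)

print(np.allclose(W_after, np.diag([0.5, 0.5])))   # True
print(round(np.trace(W_after), 2))                 # 1.0
```

The pure superposition becomes the mixture that assigns each result its Born probability, which is exactly the "classical ignorance" reading given above.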
We must underline the fact that, in the last two cases considered above (when
σ was transformed into σ′′ or σE′′), there is a measuring apparatus which
“interacts” with a copy of the system in such a way as to produce an exact result,
and that only the recording section of the apparatus is defective. Indeed, if in
the last case considered above the apparatus was only capable of determining
whether the proposition α({λn}n∈IE) was true, then we would only have an
ideal determination of this proposition (and not an ideal measurement of α)
and, after an “interaction” with the apparatus in which this proposition was
determined to be true, the copy would be in the state σE′ represented by the
statistical operator WσE′ defined by
WσE′ f = Wσ,α({λk}k∈IE) f
= (1/tr(Pα({λk}k∈IE) Wσ)) Pα({λk}k∈IE) Wσ Pα({λk}k∈IE) f
= (1/Σk∈IE tr(Pk Wσ)) Σn,m∈IE Pn Wσ Pm f, ∀f ∈ H,
which is clearly not the same as WσE′′ (and hence σE′ is not the same as σE′′).
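The difference between WσE′ (determination of the coarse proposition only, with the cross terms Pn Wσ Pm kept) and WσE′′ (ideal measurement of α followed by coarse selection, cross terms erased) is easy to exhibit on small matrices; the following are our own toy choices:

```python
import numpy as np

P1 = np.diag([1.0, 0.0, 0.0])
P2 = np.diag([0.0, 1.0, 0.0])
PE = P1 + P2                         # Pα({λk}k∈IE) with IE = {1, 2}

u = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
W = np.outer(u, u)                   # a pure statistical operator on C^3

norm = np.trace(PE @ W)
W_prime = PE @ W @ PE / norm                         # WσE′: cross terms survive
W_doubleprime = (P1 @ W @ P1 + P2 @ W @ P2) / norm   # WσE′′: cross terms erased

print(np.allclose(W_prime, W_doubleprime))   # False: the two reductions differ
```

Both are statistical operators with trace one, but only WσE′ retains the coherence between the retained results.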
Finally, suppose that we have ideal measurements of α in an ensemble of N
copies all prepared in σ. We have already seen what happens if we make a
selection by keeping just those copies in which a particular exact result has
been obtained. If instead no selection is made, then after the measurements we
have an ensemble which still contains N copies, all of them in the state σ ′′ . If
only a coarse selection is made by keeping just those copies for which a result
has been obtained that belongs to a definite subset E of R, then after the
measurements and the selection we have an ensemble which contains approximately
(Σk∈IE tr(Pk Wσ)) N copies, all of them in the state σE′′.
All the transformations of σ into another state considered above (σn′, σ′′, σE′′, σE′)
are called state reductions (for the transformations of σ into σn′ or into σE′, this
name was already known from 19.4.3b).
(b) We suppose here that α is an observable which is not discrete and, for the sake
of simplicity, we also suppose that Aα has no eigenvalues, i.e. (cf. 19.3.12a)
that there are no real numbers which are exact results for α. What happens
then if an ideal measurement of α is carried out? Naturally, a result is ob-
tained which is identified with a real number λ, but there is no state in which
this result has non-zero probability of being obtained, since α({λ}) = π0 for
each λ ∈ R. Indeed, in N repetitions of the measurement of α we will obtain
N results, but each of them occurs so seldom that its relative frequency approaches
zero as N grows (cf. 19.1.16a). However, an observable with no exact results
(or, more generally, a non discrete observable) is an idealization which is useful
(in some respects, even essential) on the theoretical level but which on the
operational level actually stands for a sequence of more realistic discrete ob-
servables which correspond to more realistic measuring instruments and which
can be assumed to be functions of α, as for instance the observables αn defined
in 19.1.22a. In order to perform a non-fictional measurement of α, we must
actually measure one of these more realistic discrete observables, for instance
one of the observables αn , and hence the analysis of remark a applies.
As already observed in 19.1.22a, the relation between the observable α and
the more realistic discrete observables which approximate α is conceptually
similar to the one that exists, in classical mechanics, between derivatives used
to represent values of speed and the way speed is actually measured.
In discussions about quantum mechanics the issue is often addressed of whether two
observables are compatible with each other, something which is often regarded as
being equivalent to the condition that they can be measured simultaneously. How-
ever, it is not always clear what is meant by a “simultaneous measurement”. And
indeed the idea of an interaction of a copy of a quantum system with two measuring
instruments at the same time does not seem experimentally very sound. A perhaps
more promising idea might be that two observables α and β are simultaneously mea-
surable if a measurement of α followed immediately by a measurement of β yields
the same results as when the order of the α and β measurements is reversed. In the
first part of this section we endeavour to deal with this topic on mainly statistical
grounds.
In the second part of this section we discuss uncertainty relations, an issue which
in the early days of quantum mechanics seemed to involve deep epistemological
and even philosophical questions. However, a strict statistical interpretation of
uncertainty relations as presented here is quite unproblematic.
As usual, states, propositions, observables are referred to a given quantum sys-
tem (cf. also 19.1.12) and they are represented as summarized in 19.3.22.
19.5.1 Remarks.
Proof. First we notice that the denominator in the statement is non-zero since
tr(Pπ′′ Pπ′ Wσ Pπ′ Pπ′′ ) = p(π ′′ , π ′ , σ) (cf. 19.5.1a). Next, from 19.4.4 we have that
the state σ̃ is represented by the statistical operator Wσ′ ,π′′ with Wσ′ = Wσ,π′ , and
hence by the statistical operator
(1/tr(Pπ′′ Wσ,π′ Pπ′′)) Pπ′′ Wσ,π′ Pπ′′ = (1/tr(Pπ′′ Pπ′ Wσ Pπ′ Pπ′′)) Pπ′′ Pπ′ Wσ Pπ′ Pπ′′.
From Pπ′′² = Pπ′′ we have Pπ′′ Wσ̃ = Wσ̃, and hence p(π′′, σ̃) = 1 by 19.3.5f.
We prove now the equivalence between conditions a and b.
a ⇒ b: Assume condition a. Then, for each σ ∈ Σ such that p(π′′, π′, σ) ≠ 0,
we have
Wσ̃ = Wσ̃,π′
(cf. 19.4.5e). This equality is true in particular for each pure state σ ∈ Σ0 such
that ∥Pπ′′ Pπ′ uσ∥² = p(π′′, π′, σ) ≠ 0 (cf. 19.5.1a), for which it can be written as
(cf. 19.4.5d)
Aũσ = Aũ′σ
with
ũσ := (1/∥Pπ′′ Pπ′ uσ∥) Pπ′′ Pπ′ uσ and ũ′σ := (1/∥Pπ′ Pπ′′ Pπ′ uσ∥) Pπ′ Pπ′′ Pπ′ uσ,
and this implies (cf. 13.1.13a) that there exists α ∈ C so that
Pπ′′ Pπ′ uσ = αPπ′ Pπ′′ Pπ′ uσ ;
applying Pπ′ to the left of both sides of this equality we get
Pπ′ Pπ′′ Pπ′ uσ = αPπ′ Pπ′′ Pπ′ uσ ,
and hence α = 1 since Pπ′ Pπ′′ Pπ′ uσ ≠ 0H. Owing to the bijection that exists from
Σ0 onto the family of all rays of H (cf. 19.3.5c), this proves that
Pπ′′ Pπ′ u = Pπ′ Pπ′′ Pπ′ u for each u ∈ H̃ such that Pπ′′ Pπ′ u ≠ 0H;
since the same is trivially true for each u ∈ H̃ such that Pπ′′ Pπ′ u = 0H, we have
Pπ′′ Pπ′ = Pπ′ Pπ′′ Pπ′ .
By taking the adjoints of both sides we get
Pπ′ Pπ′′ = Pπ′ Pπ′′ Pπ′
(cf. 12.3.4b), and hence [Pπ′ , Pπ′′ ] = OH , and hence condition b by 19.5.3.
b ⇒ a: If π ′ and π ′′ are compatible, then [Pπ′ , Pπ′′ ] = OH by 19.5.3, and hence
Pπ′ Wσ̃ = Wσ̃ , and hence p(π ′ , σ̃) = 1 by 19.3.5f.
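The identity on which the proof turns, namely that Pπ′′Pπ′ = Pπ′Pπ′′Pπ′ holds exactly when the two projections commute, can be checked on small matrices; these are our own toy examples, not objects from the text:

```python
import numpy as np

def proj(v):
    """Orthogonal projection onto the line spanned by v."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

P1 = np.diag([1.0, 0.0])        # plays the role of Pπ′
P2_comm = np.diag([0.0, 1.0])   # commutes with P1
P2_skew = proj([1.0, 1.0])      # does not commute with P1

for P2 in (P2_comm, P2_skew):
    commutes = np.allclose(P1 @ P2, P2 @ P1)
    identity = np.allclose(P2 @ P1, P1 @ P2 @ P1)
    print(commutes, identity)   # True True, then False False
```

For the skew pair, P2 @ P1 is not even self-adjoint, so it cannot equal the self-adjoint operator P1 @ P2 @ P1.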
The probability of the occurrence oπ′ ,π′′ is p(π ′′ , π ′ , σ), and the probability of the
occurrence oπ′′ ,π′ is p(π ′ , π ′′ , σ) (cf. 19.5.1b).
The following conditions are equivalent:
(a) there exists a proposition π ∈ Π such that p(π ′′ , π ′ , σ) = p(π, σ) for each σ ∈ Σ;
(b) π ′ and π ′′ are compatible.
If these conditions are satisfied, then the proposition π (as an equivalence class) is
unique and we have:
19.5.6 Remarks.
(a) The equivalence between conditions a and b in 19.5.5 shows that we cannot ac-
cept all occurrences related to a quantum system as bona fide events which define
propositions. Indeed, the meaning of condition a is that the occurrence oπ′ ,π′′
is actually a quantum event which defines a proposition, and the equivalence
between conditions a and b shows that this is true if and only if π ′ and π ′′ are
compatible. Condition c shows that if π ′ and π ′′ are compatible then both the
occurrences oπ′ ,π′′ and oπ′′ ,π′ are implementations of the same proposition π.
Thus, if π ′ and π ′′ are compatible, we can say that an event in the equivalence
class of π is the “simultaneous occurrence” of the events that define π ′ and π ′′ ;
actually, the experimental determinations of π′ and π′′ will require determining
first one of them and then, immediately afterwards, the other one; however, the
order is immaterial since oπ′ ,π′′ and oπ′′ ,π′ define propositions which are in the
same equivalence class. This equivalence class, which we have denoted by π up
to now, will be denoted by the symbol π ′ ∧ π ′′ henceforth (thus, this symbol im-
plies that π ′ and π ′′ are compatible and that there exist ideal implementations
of them).
(b) If π ′ and π ′′ are compatible propositions and ideal implementations of them are
available, then the proposition we have denoted by π ′ ∧ π ′′ is represented by
the orthogonal projection Pπ′ ∧π′′ = Pπ′′ Pπ′ (cf. 19.5.5d), i.e. by the orthogonal
projection defined by the subspace RPπ′ ∩ RPπ′′ (cf. 13.2.1e).
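For commuting projections the product is indeed the orthogonal projection onto the intersection of the ranges; a quick numerical check on C³ (our own choice of subspaces, not from the text):

```python
import numpy as np

Pp = np.diag([1.0, 1.0, 0.0])   # projection onto span{e1, e2} (for π′)
Pq = np.diag([0.0, 1.0, 1.0])   # projection onto span{e2, e3} (for π′′)

P_and = Pq @ Pp                 # candidate for Pπ′∧π′′

print(np.allclose(P_and, P_and @ P_and))              # True: idempotent
print(np.allclose(P_and, P_and.T))                    # True: self-adjoint
print(np.allclose(P_and, np.diag([0.0, 1.0, 0.0])))   # True: projects onto span{e2}
```

The range of the product is span{e1, e2} ∩ span{e2, e3} = span{e2}, as the last check confirms.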
(c) We remark that, for two propositions π ′ and π ′′ , the operator Pπ′′ Pπ′ is an
orthogonal projection if and only if π ′ and π ′′ are compatible (cf. 19.5.3 and
13.2.1). However, for any pair of propositions π ′ , π ′′ there is always (i.e., with no
conditions on π ′ , π ′′ ) an orthogonal projection which is defined by the subspace
RPπ′ ∩ RPπ′′ (cf. 4.1.10), and hence there is always a proposition, which we
still denote by π, such that RPπ = RPπ′ ∩ RPπ′′ , since the mapping of 19.3.1b
is bijective. For a state σ we have
p(π, σ) = 1 ⇔ RWσ ⊂ RPπ = RPπ′ ∩ RPπ′′ ⇔ p(π ′ , σ) = p(π ′′ , σ) = 1
(cf. 19.3.5f). Thus, π is certainly true in a state if and only if both π ′ and π ′′
are certainly true in that state. We note that, if π′ and π′′ were propositions
in a classical theory, then the classical proposition π ′ ∧ π ′′ (defined in 19.2.1)
would be certainly true in a state if and only if both π ′ and π ′′ were certainly
true in that state. Indeed, for a state σ, in a classical theory we would have (cf.
19.2.8 and the proof of 19.2.7)
p(π ′ ∧ π ′′ , σ) = µσ (Sπ′ ∧π′′ ) = µσ (Sπ′ ∩ Sπ′′ ) = 1 ⇔
[p(π ′ , σ) = µσ (Sπ′ ) = 1 and p(π ′′ , σ) = µσ (Sπ′′ ) = 1];
in fact, one implication follows immediately from the monotonicity of µσ and
for the other one we have
µσ (Sπ′ ) = µσ (Sπ′′ ) = 1 ⇒ µσ (S − Sπ′ ) = µσ (S − Sπ′′ ) = 0 ⇒
µσ (S − (Sπ′ ∩ Sπ′′ )) = µσ ((S − Sπ′ ) ∪ (S − Sπ′′ )) = 0 ⇒ µσ (Sπ′ ∩ Sπ′′ ) = 1.
This could suggest interpreting π as the proposition “π ′ and π ′′ ” also in the
quantum theory. However, if pursued in the quantum theory, this interpretation
must not lead to thinking that in general π ′ and π ′′ can be determined in the
same copies (as instead they could in a classical theory); actually, p(π ′ , σ) = 1
means that π ′ is found to be true in all copies of an ensemble representing σ
and p(π ′′ , σ) = 1 means that π ′′ is found to be true in all copies of a different
ensemble representing σ. Moreover, determining π ′ in a copy and then π ′′ in the
resulting copy is a procedure which is not in general equivalent to determining
first π ′′ and then π ′ , as 19.5.3 shows. However, if π ′ and π ′′ are compatible and
if ideal implementations are available for both of them, then we saw in 19.5.6a
that an ideal determination of one of them in a copy immediately followed by
a determination of the other one in the resulting copy defines an event which
lies in the equivalence class of π. Thus, when π ′ and π ′′ are compatible there
are experimentally reasonable grounds for interpreting the proposition π as the
proposition “π ′ and π ′′ ”. In any case, we will reserve the symbol π ′ ∧ π ′′ for
the case of compatible propositions π ′ , π ′′ for which ideal implementations are
available.
(d) Suppose that two propositions π ′ and π ′′ are compatible and that ideal imple-
mentations are available for both of them. Then the pairs π ′ and ¬π ′′ , ¬π ′ and
π ′′ , ¬π ′ and ¬π ′′ are all compatible; this follows at once from 19.5.3 and 19.3.4.
Thus, for every state σ, the probabilities for the joint results of π ′ and π ′′ are
independent of the order in which the determinations are made, i.e.
p(π ∗ , π ∗∗ , σ) = p(π ∗∗ , π ∗ , σ) for π ∗ = π ′ , ¬π ′ and π ∗∗ = π ′′ , ¬π ′′ .
19.5.8 Remark. To understand better the meaning of the results obtained so far
in this section, it is useful to examine what we should have if, in the situations
discussed, we were considering a classical statistical theory (for which we refer to
Section 19.2).
In a classical statistical theory, the action of an ideal filter for a proposition π
would be to transform any state σ such that µσ(Sπ) = p(π, σ) ≠ 0 into the state σ′
represented by the probability measure µσ,π on A defined by
µσ,π(E) := (1/µσ(Sπ)) µσ(E ∩ Sπ), ∀E ∈ A.
Note that this obviously defines a probability measure and that p(π, σ ′ ) =
µσ,π (Sπ ) = 1; thus, the reduction from µσ to µσ,π would indeed represent the
action of a filter for π; moreover, µσ,π is obtained from the original measure µσ
by altering it to the least degree consistent with the condition µσ,π (Sπ ) = 1, as an
ideal filter should do.
Then, for two propositions π ′ , π ′′ and a state σ in a classical statistical theory,
if p(π′, σ) ≠ 0 we should have, reasoning as in 19.5.1,
p(π ′′ , π ′ , σ) = p(π ′′ , σ ′ )p(π ′ , σ),
where σ ′ would be the state represented by the probability measure µσ,π′ , and hence
p(π ′′ , π ′ , σ) = µσ,π′ (Sπ′′ )µσ (Sπ′ ) = µσ (Sπ′′ ∩ Sπ′ );
since p(π ′ , σ) = 0 implies that the occurrence o defined in 19.5.1 can never happen
and hence p(π ′′ , π ′ , σ) = 0, and also implies µσ (Sπ′′ ∩ Sπ′ ) = 0 (by the monotonicity
of µσ ), we should have
p(π ′′ , π ′ , σ) = µσ (Sπ′′ ∩ Sπ′ )
whatever the value of p(π ′ , σ). And similarly we should have
p(π ′ , π ′′ , σ) = µσ (Sπ′ ∩ Sπ′′ ).
Thus, in a classical statistical theory we should have
p(π ′′ , π ′ , σ) = p(π ′ , π ′′ , σ)
for every pair of propositions and every state, in contrast with the result of 19.5.3.
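The contrast with 19.5.3 can be seen on a two-dimensional toy model (our own numbers): for non-commuting projections the sequential probabilities tr(Pπ′′ Pπ′ Wσ Pπ′ Pπ′′) and tr(Pπ′ Pπ′′ Wσ Pπ′′ Pπ′) differ, whereas the classical expression µσ(Sπ′′ ∩ Sπ′) is symmetric by construction:

```python
import numpy as np

def proj(v):
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

Pp = proj([1.0, 0.0])   # Pπ′
Pq = proj([1.0, 1.0])   # Pπ′′, not commuting with Pπ′
W = proj([0.0, 1.0])    # a pure state Wσ

p_qp = np.trace(Pq @ Pp @ W @ Pp @ Pq)   # first π′, then π′′
p_pq = np.trace(Pp @ Pq @ W @ Pq @ Pp)   # first π′′, then π′

print(round(p_qp, 3), round(p_pq, 3))    # 0.0 0.25
```

Here the first filter annihilates the state in one order but not in the other, so no classical joint measure could reproduce both sequential probabilities.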
As to 19.5.4, in a classical statistical theory a copy, initially prepared in a state
σ, after going through an ideal filter for a proposition π ′ and through an ideal filter
for a proposition π ′′ would be in the state σ̃ represented by the probability measure
µσ̃ on A defined by
µσ̃(E) := (1/µσ(Sπ′′ ∩ Sπ′)) µσ(E ∩ Sπ′′ ∩ Sπ′), ∀E ∈ A,
19.5.10 Proposition. Two observables α1 and α2 are compatible if and only if the
operators Aα1 and Aα2 commute.
Proof. This result follows from 19.5.3, from the definitions of the operators Aα1
and Aα2 (cf. 19.3.6), and from the definition of commutativity for two self-adjoint
operators (cf. 17.1.5).
19.5.11 Remarks.
(a) Suppose that we have an R2 -valued observable α. Then α represents a measur-
ing instrument which yields a result by the position of a pointer in a dial which
is represented by R2 (cf. 19.1.9a).
We can define the mapping
α1 : A(dR) → Π, E ↦ α1(E) := α(E × R),
and we see that α1 is an observable since α1 = ϕ1(α), with
ϕ1 : R2 → R, (x1, x2) ↦ ϕ1(x1, x2) := x1
pieces of equipment that define α1 (E1 ) and α2 (E2 ), and hence by the measuring
instruments represented by α1 and α2 , and for which we have (cf. 19.5.5)
p(α2 (E2 ), α1 (E1 ), σ) = p(α1 (E1 ) ∧ α2 (E2 ), σ), ∀σ ∈ Σ,
and hence
α(E1 × E2 ) = α1 (E1 ) ∧ α2 (E2 ).
This gives an operational interpretation to the proposition α(E) on the basis
of the measuring instruments represented by α1 and α2 , for each E ∈ S :=
{E1 × E2 : (E1 , E2 ) ∈ A(dR ) × A(dR )}. In particular, for each (x1 , x2 ) ∈ R2 we
can say that the determination of the proposition α({(x1 , x2 )}) is, in any state,
“the simultaneous determination” of the propositions α1 ({x1 }) and α2 ({x2 }),
in the sense specified in 19.5.6a.
The reason why we define the R2-valued observable α on A(d2) and not just
on S is that we want the probability functions µ^α_σ to be bona fide measures
and hence to be defined on a σ-algebra (S is just a semialgebra and A(d2) is
the σ-algebra generated by S, cf. 6.1.30a and 6.1.32). However, an operational
meaning for the proposition α(E) for each E ∈ A(d2) cannot be inferred from
the operational interpretation given above to all propositions α(E) with E ∈ S,
because there is no constructive procedure for obtaining each element of A(d2)
starting from elements of S. Still, we know that, for every σ ∈ Σ, the measure
µ^α_σ is uniquely determined by its values on S (this follows from 6.1.18, from the
uniqueness asserted in 7.3.1A, and from the uniqueness asserted in 7.3.2 for a
σ-finite premeasure); in this respect, the operational grounds found above for the
propositions α(E) with E ∈ S provide operational grounds for the probability
measures µ^α_σ.
(c) Suppose that we have an R2 -valued observable α and a function ϕ : Dϕ → R
such that Dϕ ∈ A(d2 ), Pα (R2 − Dϕ ) = OH , ϕ is A(d2 )Dϕ -measurable. We can
define the observable ϕ(α) (cf. 19.1.13, 19.1.14, 19.3.9), which is supported by
the same measuring instrument that defines α: if a measurement of α yields the
result (x1 , x2 ) ∈ R2 then we attribute the result ϕ(x1 , x2 ) to ϕ(α). Consider
now the two compatible observables α1 and α2 that are related to α as above:
either α1 and α2 are obtained from α as in remark a, or α is obtained from α1
and α2 as in remark b. Then the observable ϕ(α) can be considered a function
of α1 and α2 : if a “simultaneous measurement” of α1 and α2 brings out the
pair of results x1 , x2 then the result (x1 , x2 ), as an element of R2 , is assigned
to α and hence the result ϕ(x1 , x2 ) is assigned to ϕ(α). For this reason, the
observable ϕ(α) is also called the function of α1 , α2 according to ϕ and denoted
by the symbol ϕ(α1 , α2 ). Thus ϕ(α1 , α2 ) := ϕ(α) and we have
P Aϕ(α1 ,α2 ) (E) = P Aϕ(α) (E) = Pα (ϕ−1 (E)) = P ϕ(Aα1 ,Aα2 ) (E), ∀E ∈ A(dR )
(cf. the proof of 19.3.9, 17.1.11, 15.2.7, noticing that the relation between the
pairs of commuting self-adjoint operators Aα1 , Aα2 and the projection valued
measure Pα is the same as the one between the pair A1 , A2 and P in 17.1.10b),
and hence (cf. 15.2.2) Aϕ(α1 ,α2 ) = ϕ(Aα1 , Aα2 ). This extends the function
preserving property of the representation of observables by self-adjoint operators
that was noted in 19.3.9.
Suppose in particular that we have two compatible observables α1 and α2 ,
that we have ideal implementations of all the propositions in the ranges of α1
and α2 , and that we wish to define, using the measuring instruments that are
represented by α1 and α2 , a new observable to which the result x1 + x2 (or
x1 x2 ) is assigned when the “simultaneous” results x1 and x2 are obtained for
α1 and α2 respectively. Then, from what we saw above and from 17.1.12 it
follows that this new observable is represented by the self-adjoint extension of
the essentially self-adjoint operator A1 +A2 (or A1 A2 ), which actually coincides
with A1 + A2 (or A1 A2 ) whenever A2 is bounded.
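As a finite-dimensional sketch of the last point (our own matrices, with both operators bounded so no domain or extension issues arise): for commuting self-adjoint operators, the observable that returns x1 + x2 is represented by the sum, whose spectrum consists of sums of simultaneous eigenvalues:

```python
import numpy as np

A1 = np.diag([1.0, 1.0, 2.0])
A2 = np.diag([5.0, 7.0, 7.0])

assert np.allclose(A1 @ A2, A2 @ A1)   # compatibility: the operators commute

S = A1 + A2                            # represents the "x1 + x2" observable
print(np.linalg.eigvalsh(S))           # [6. 8. 9.] = {1+5, 1+7, 2+7}
```

In infinite dimensions with A2 unbounded this sum must be replaced by the self-adjoint extension discussed above; the finite case only illustrates the spectral bookkeeping.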
19.5.12 Remark. The results of 19.5.11a,b are based on the equivalence between
conditions a and b in 17.1.10, and can be summarised as follows: two observables
α1 and α2 are compatible if and only if there exists an R2 -valued observable α such
that
α1 (E) = α(E × R) and α2 (E) = α(R × E), ∀E ∈ A(dR )
(actually, for the “only if” part we have to assume that there are ideal implementa-
tions of all the propositions in the ranges of α1 and α2 ). This gives, in our opinion,
a nice characterization of the compatibility of two observables.
However, in standard quantum mechanics textbooks, the only X-valued observables
that are considered are observables (i.e., R-valued ones). Now, it is possible to give
a characterization of the compatibility of two observables in which only observables are used.
This is accomplished on the basis of the equivalence between conditions a and c in
17.1.10. Indeed, if two observables α1 and α2 are functions of an observable β, then
by 19.3.9 the self-adjoint operators Aα1 and Aα2 are functions of the self-adjoint
operator Aβ , and hence Aα1 and Aα2 commute by c ⇒ a in 17.1.10, and hence
α1 and α2 are compatible by 19.5.10. If conversely two observables α1 and α2 are
compatible, then the self-adjoint operators Aα1 and Aα2 commute by 19.5.10, and
hence there are a self-adjoint operator B and two functions ϕi so that Aαi = ϕi (B)
for i = 1, 2, by a ⇒ c in 17.1.10; now, it would be hard to give in general an
operational meaning (as instead we did for the mapping α in 19.5.11b) to the map-
ping β : A(dR ) → Π which is defined by letting β(E) be the proposition such that
Pβ(E) = P B (E), for all E ∈ A(dR ); this is due to the fact that the construction of
the projection valued measure P B out of the projection valued measures P A1 and
P A2 , in the proof of 17.1.9, is utterly abstract (whereas condition b in 17.1.10 relates
directly the projection valued measure P to the projection valued measures P A1
and P A2 ); however, every self-adjoint operator is taken to represent an observable
in standard quantum mechanics textbooks, and hence according to their rules we
can say that there exists an observable β which is represented by the self-adjoint
for i = 1, 2 (cf. 15.3.8 and 19.1.13). Thus, within the rules of standard quantum
mechanics textbooks, two observables α1 and α2 are compatible if and only if there
exists an observable β of which both α1 and α2 are functions.
19.5.13 Proposition. For a proposition π ∈ Π, a discrete observable α, a state
σ ∈ Σ, we denote by p(π, α, σ) the probability that π is true in a copy which is
produced by an ideal measurement of α with any result, carried out in a copy initially
prepared in the state σ. Thus, p(π, α, σ) is the theoretical prediction of the relative
frequency of π being found true in an ensemble of copies which, after being prepared
in σ, have gone through an ideal measurement of α without being selected according
to any particular set of results for α.
The following conditions are equivalent:
(a) p(π, α, σ) = p(π, σ), ∀σ ∈ Σ;
(b) π and α(E) are compatible, ∀E ∈ A(dR ).
Proof. Let {(λn , Pn )}n∈I be the family related to the self-adjoint operator Aα as
in 15.3.4B with A := Aα (cf. 19.3.10c). From 19.4.12a we see that, for every σ ∈ Σ,
p(π, α, σ) = p(π, σ′′) = tr(Pπ Wσ′′) = Σn∈I tr(Pπ Pn Wσ Pn)
(the third equality follows from 18.3.4c).
We prove now the equivalence between conditions a and b.
a ⇒ b: Assuming condition a, we have in particular
p(π, α, σ) = p(π, σ), ∀σ ∈ Σ0 ,
which is equivalent to
Σn∈I tr(Pπ Pn Au Pn) = tr(Pπ Au), ∀u ∈ H̃.
We note that, if I is infinite, the series Σn∈I Pn Pπ Pn f is convergent for each f ∈ H
by 10.4.7b, since
(Pi Pπ Pi f |Pj Pπ Pj f) = (Pπ Pi f |Pi Pj Pπ Pj f) = 0 if i ≠ j,
∥Pn Pπ Pn f∥ ≤ ∥Pn f∥ (cf. 13.1.3d),
Σn∈I ∥Pn f∥² < ∞ (cf. 13.2.8);
thus, we can define the operator
Σn∈I Pn Pπ Pn : H → H, f ↦ (Σn∈I Pn Pπ Pn) f := Σn∈I Pn Pπ Pn f.
Then we have
(u | (Σn∈I Pn Pπ Pn) u) = Σn∈I (u |Pn Pπ Pn u) = Σn∈I tr(Pn Pπ Pn Au)
= Σn∈I tr(Pπ Pn Au Pn) = tr(Pπ Au) = (u |Pπ u), ∀u ∈ H̃,
where we have used 18.2.11c and 18.3.12 (note that Pn Pπ ∈ P(H) for each n ∈ I
by 13.2.1, and that (Pi Pπ)(Pk Pπ) = Pi Pk Pπ = OH if i ≠ k) and the equality
Σn∈I Pn = 1H (cf. 15.3.4B).
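The equality p(π, α, σ) = Σn tr(Pπ Pn Wσ Pn) and its relation to compatibility can be probed numerically; the projections and the state below are our own toy choices, and only a single state is tested here (condition a of 19.5.13 quantifies over all states):

```python
import numpy as np

def proj(v):
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]   # eigenprojections Pn of Aα
W = proj([1.0, 1.0])                              # a pure state Wσ

P_compatible = np.diag([1.0, 0.0])                # commutes with every Pn
P_incompatible = proj([1.0, 1.0])                 # does not

for Ppi in (P_compatible, P_incompatible):
    lhs = sum(np.trace(Ppi @ Pn @ W @ Pn) for Pn in P)   # p(π, α, σ)
    rhs = np.trace(Ppi @ W)                              # p(π, σ)
    print(round(lhs, 3), round(rhs, 3))
# 0.5 0.5   (compatible: the unrecorded measurement does not disturb π)
# 0.5 1.0   (incompatible: it does)
```

For the incompatible proposition, the unrecorded measurement of α destroys the coherence that made π certain in the state σ.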
19.5.14 Corollary. For a discrete observable α and any observable β, the following
conditions are equivalent:
Proof. The result follows immediately from 19.5.13 and the definition of compati-
bility for two observables.
19.5.16 Remark. Let α be a discrete observable and let {(λn , Pn )}n∈I be the
family related to the self-adjoint operator Aα as in 15.3.4B with A := Aα (cf.
19.3.10c). Suppose that the projection Pn is one-dimensional, i.e. that there
exists un ∈ H̃ such that Pn = Aun , for each n ∈ I, and that we have a pro-
cedure for carrying out a first kind measurement of α. If a first kind measure-
ment of α is made in a copy of the system prepared in a state σ ∈ Σ and if
the result λn is obtained (the elements of {λn }n∈I are the only numbers that
can be obtained as results, since P Aα (R − {λn }n∈I ) = OH and this implies
p(α(R − {λn }n∈I ), σ) = tr(P Aα (R − {λn }n∈I )Wσ ) = 0), then immediately after the
measurement we have a copy in the pure state represented by the ray [un ], whatever
the state σ was; this follows from 19.4.3e, since Pα({λn }) = P Aα ({λn }) = Pn (cf.
15.3.4B). We also note that, for i ≠ j, Pi Pj = OH implies (ui |uj) = 0, and also
that
f = P Aα(R)f = Σn∈I Pn f = Σn∈I (un |f) un, ∀f ∈ H
(cf. 15.3.4B). This proves that the family {un }n∈I is a c.o.n.s. in H (cf. 10.6.4).
Thus, if we have a discrete observable α such that the self-adjoint operator
Aα has one-dimensional eigenspaces and a procedure for a first kind measurement
of α, we actually have a procedure for preparing pure states, and a great deal of
them (one for each element of a c.o.n.s. in H). However, observables with these
characteristics are seldom available. More often, their function in preparing pure
states is fulfilled by a set of observables with the features specified in 19.5.17, as is
explained in 19.5.18.
19.5.17 Definition. Let {α1, α2, ..., αℓ} be a finite family of discrete observables
and, for k = 1, 2, ..., ℓ, let {(λ^k_n, P^k_n)}n∈Ik be the family associated with the
self-adjoint operator Aαk as the family {(λn, Pn)}n∈I was associated with the
self-adjoint operator A in 15.3.4B. The family {α1, α2, ..., αℓ} is said to be a
complete set of compatible observables if the observables of the family are pairwise
compatible and if the projection P^1_n1 P^2_n2 · · · P^ℓ_nℓ is either one-dimensional
or the operator OH, for all (n1, n2, ..., nℓ) ∈ I1 × I2 × · · · × Iℓ (the operator
P^1_n1 P^2_n2 · · · P^ℓ_nℓ is a projection by 19.5.10, 17.1.14, 13.2.1).
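A toy complete set of compatible observables on C^4 can be built explicitly (our own construction, not from the text): each observable alone has two-dimensional eigenspaces, yet every product of eigenprojections is one-dimensional:

```python
import numpy as np

# Eigenprojections of two commuting discrete observables on C^4.
P1 = [np.diag([1.0, 1.0, 0.0, 0.0]), np.diag([0.0, 0.0, 1.0, 1.0])]  # for α1
P2 = [np.diag([1.0, 0.0, 1.0, 0.0]), np.diag([0.0, 1.0, 0.0, 1.0])]  # for α2

# Pairwise compatibility: all the eigenprojections commute.
assert all(np.allclose(Pa @ Pb, Pb @ Pa) for Pa in P1 for Pb in P2)

ranks = sorted(int(np.linalg.matrix_rank(Pa @ Pb)) for Pa in P1 for Pb in P2)
print(ranks)   # [1, 1, 1, 1]: every joint eigenprojection is one-dimensional
```

Neither observable alone singles out a pure state, but a joint result (n1, n2) does, which is the point of the definition.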
19.5.18 Remark. Let the family {α1 , α2 , ..., αℓ } be as in 19.5.17, and suppose
that it is a complete set of compatible observables. Suppose further that proce-
dures are available for performing ideal measurements of all observables αk . If ideal
measurements are made for all observables αk , one immediately after the other
in whichever order, in a copy of the system initially prepared in whatever state
σ, and if λ^1_n1, λ^2_n2, ..., λ^ℓ_nℓ are the results obtained, then immediately after
the ℓ measurements we have a copy which is in the pure state represented by the ray
[un1,n2,...,nℓ] if P^1_n1 P^2_n2 · · · P^ℓ_nℓ = Aun1,n2,...,nℓ. Indeed, reasoning as in
19.5.1a we see that the probability of obtaining the results λ^1_n1, λ^2_n2, ..., λ^ℓ_nℓ
was, before the measurements, tr(P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ); thus, if the results
λ^1_n1, λ^2_n2, ..., λ^ℓ_nℓ have actually been obtained then P^1_n1 P^2_n2 · · · P^ℓ_nℓ ≠ OH
and hence the projection P^1_n1 P^2_n2 · · · P^ℓ_nℓ is one-dimensional; then, reasoning as
in the proof of 19.5.4 we see that after the ℓ measurements we have a copy which is
in the state represented by the statistical operator
(1/tr(P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ)) P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ P^1_n1 P^2_n2 · · · P^ℓ_nℓ,
which is the same as Aun1,n2,...,nℓ, for whatever state σ such that
tr(P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ) ≠ 0 (cf. 19.4.5c). This gives us a method for preparing
pure states, one for each element of a c.o.n.s. in H. To see this, define
J := {(n1, n2, ..., nℓ) ∈ I1 × I2 × · · · × Iℓ : P^1_n1 P^2_n2 · · · P^ℓ_nℓ ≠ OH}
and let un1,n2,...,nℓ ∈ H̃ be such that P^1_n1 P^2_n2 · · · P^ℓ_nℓ = Aun1,n2,...,nℓ for
(n1, n2, ..., nℓ) ∈ J. The condition P^k_nk P^k_n′k = OH if nk ≠ n′k (cf. 15.3.4B)
implies that
(un1,n2,...,nℓ | un′1,n′2,...,n′ℓ) = 0 if (n1, n2, ..., nℓ) ≠ (n′1, n′2, ..., n′ℓ);
moreover, the condition 1H = P Aαk(R) = Σnk∈Ik P^k_nk (cf. 15.3.4B) implies that
f = Σn1∈I1 Σn2∈I2 · · · Σnℓ∈Iℓ P^1_n1 P^2_n2 · · · P^ℓ_nℓ f
  = Σ(n1,n2,...,nℓ)∈J (un1,n2,...,nℓ | f) un1,n2,...,nℓ, ∀f ∈ H.
This proves that the family {un1,n2,...,nℓ}(n1,n2,...,nℓ)∈J is a c.o.n.s. in H (cf. 10.6.4).
19.5.19 Proposition. Let α and β be two observables and σ a state in which both
α and β are evaluable, and let {un }n∈I and {wn }n∈I be as in 19.3.13c so that
Wσ f = Σn∈I wn Aun f for all f ∈ H. Then:
un ∈ DAα ∩ DAβ, ∀n ∈ I;
∆σα ∆σβ ≥ (1/2) Σn∈I wn |(Aα un |Aβ un) − (Aβ un |Aα un)|.
If in particular σ is a pure state, then:
uσ ∈ DAα ∩ DAβ and ∆σα ∆σβ ≥ (1/2) |(Aα uσ |Aβ uσ) − (Aβ uσ |Aα uσ)|.
Proof. From 19.3.13c we have un ∈ DAα ∩ DAβ for all n ∈ I. For the product
∆σ α∆σ β we have
∆σα ∆σβ = (Σn∈I wn ∥Aα un − ⟨α⟩σ un∥²)^(1/2) (Σn∈I wn ∥Aβ un − ⟨β⟩σ un∥²)^(1/2)
≥ Σn∈I wn ∥Aα un − ⟨α⟩σ un∥ ∥Aβ un − ⟨β⟩σ un∥;
the equality follows from 19.3.13c and the inequality is the Schwarz inequality in
CN if I contains N elements or in ℓ2 if I is denumerable (cf. 10.3.8c,d; if I is
denumerable, the sequences {√wn ∥Aα un − ⟨α⟩σ un∥} and {√wn ∥Aβ un − ⟨β⟩σ un∥}
are elements of ℓ2, cf. 19.3.13c). Further, for each n ∈ I we have (using the fact
that the operators Aα and Aβ are symmetric, cf. 12.4.3c)
∥Aα un − ⟨α⟩σ un∥ ∥Aβ un − ⟨β⟩σ un∥
≥ |(Aα un − ⟨α⟩σ un |Aβ un − ⟨β⟩σ un)|
≥ |Im (Aα un − ⟨α⟩σ un |Aβ un − ⟨β⟩σ un)|
= (1/2) |(Aα un − ⟨α⟩σ un |Aβ un − ⟨β⟩σ un) − (Aβ un − ⟨β⟩σ un |Aα un − ⟨α⟩σ un)|
= (1/2) |(Aα un |Aβ un) − (Aβ un |Aα un)|.
Thus, the first part of the statement is proved. The second part follows immediately
from the first since Wσ = Auσ if σ is a pure state.
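The pure-state inequality of 19.5.19 can be checked numerically in a finite-dimensional toy model. The following sketch (the matrices and the state vector are illustrative choices, not taken from the text) uses the Pauli matrices on \(\mathbb{C}^2\) in the roles of \(A_\alpha\) and \(A_\beta\):

```python
import numpy as np

# Toy finite-dimensional check of 19.5.19 for a pure state.
# Illustrative choices: A, B are Pauli matrices; u is a normalized vector.
A = np.array([[0, 1], [1, 0]], dtype=complex)     # sigma_x
B = np.array([[0, -1j], [1j, 0]], dtype=complex)  # sigma_y
u = np.array([1.0, 0.0], dtype=complex)           # normalized vector u_sigma

def uncertainty(Op, u):
    # Delta = ||Op u - <Op> u|| for the pure state represented by u
    mean = np.vdot(u, Op @ u).real
    return np.linalg.norm(Op @ u - mean * u)

lhs = uncertainty(A, u) * uncertainty(B, u)
rhs = 0.5 * abs(np.vdot(A @ u, B @ u) - np.vdot(B @ u, A @ u))
assert lhs >= rhs - 1e-12  # the uncertainty inequality holds
```

For this particular choice both sides equal 1, so the bound is attained with equality.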
19.5.20 Corollary. Let α and β be two observables and σ a state in which both α
and β are evaluable, and also such that Aα Aβ Wσ ∈ T (H) and Aβ Aα Wσ ∈ T (H).
Then:
\[
[A_\alpha, A_\beta] W_\sigma \in T(H) \quad\text{and}\quad \Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \left| \operatorname{tr}([A_\alpha, A_\beta] W_\sigma) \right|.
\]
If in particular σ is a pure state, the above conditions for σ are equivalent to the
one condition uσ ∈ D[Aα ,Aβ ] and, if they are fulfilled, the following inequality holds:
\[
\Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \left| \left(u_\sigma \middle| [A_\alpha, A_\beta] u_\sigma\right) \right|.
\]
Proof. Let {un }n∈I and {wn }n∈I be as in 19.5.19, with {un }n∈I an o.n.s. in H
(cf. 18.3.2c); then un ∈ RWσ for each n ∈ I. Since DAα Aβ Wσ = DAβ Aα Wσ = H,
we have Aβ un ∈ DAα and Aα un ∈ DAβ , and hence un ∈ D[Aα ,Aβ ] for each n ∈ I.
Then, since the operators Aα and Aβ are symmetric, from 19.5.19 we obtain
\[
\Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \sum_{n \in I} w_n \left| \left(A_\alpha u_n \middle| A_\beta u_n\right) - \left(A_\beta u_n \middle| A_\alpha u_n\right) \right|
\ge \frac{1}{2} \left| \sum_{n \in I} w_n \left(u_n \middle| [A_\alpha, A_\beta] u_n\right) \right|.
\]
If in particular σ is a pure state, then uσ ∈ D[Aα ,Aβ ] and
\[
\Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \left| \left(u_\sigma \middle| [A_\alpha, A_\beta] u_\sigma\right) \right|.
\]
In the general case, from 18.2.4a,b we have [Aα , Aβ ]Wσ ∈ T (H), and we can com-
pute tr([Aα , Aβ ]Wσ ) by means of a c.o.n.s. in H which contains {un }n∈I (cf. 10.7.3);
then we have
\[
\operatorname{tr}([A_\alpha, A_\beta] W_\sigma) = \sum_{n \in I} \left(u_n \middle| [A_\alpha, A_\beta] W_\sigma u_n\right) = \sum_{n \in I} w_n \left(u_n \middle| [A_\alpha, A_\beta] u_n\right).
\]
Finally, if σ is a pure state and uσ ∈ D[Aα ,Aβ ] , then uσ ∈ DAα ∩ DAβ and hence
both α and β are evaluable in σ (cf. 19.3.13d); moreover,
Aα Aβ Wσ f = (uσ |f ) Aα Aβ uσ and Aβ Aα Wσ f = (uσ |f ) Aβ Aα uσ , ∀f ∈ H,
and this proves that \(A_\alpha A_\beta W_\sigma \in T(H)\) and \(A_\beta A_\alpha W_\sigma \in T(H)\). Indeed, if \(A_\alpha A_\beta u_\sigma \ne 0_H\) then \(A_\alpha A_\beta W_\sigma = \lambda A_{u,v}\) with \(\lambda := \|A_\alpha A_\beta u_\sigma\|\), \(u := u_\sigma\), \(v := \lambda^{-1} A_\alpha A_\beta u_\sigma\), and hence \(A_\alpha A_\beta W_\sigma \in T(H)\) in view of 18.2.15; and similarly for \(A_\beta A_\alpha W_\sigma\).
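The trace form of the bound in 19.5.20 can likewise be checked on a genuinely mixed statistical operator in a toy model (all matrices below are illustrative choices, not from the text):

```python
import numpy as np

# Toy check of 19.5.20: Delta*Delta >= (1/2)|tr([A,B] W)| for a mixed W.
A = np.array([[0, 1], [1, 0]], dtype=complex)     # sigma_x
B = np.array([[0, -1j], [1j, 0]], dtype=complex)  # sigma_y
W = np.diag([0.75, 0.25]).astype(complex)         # positive, trace one

def spread(Op, W):
    # Delta_sigma = sqrt(tr(Op^2 W) - tr(Op W)^2) for the state with operator W
    mean = np.trace(Op @ W).real
    return np.sqrt(np.trace(Op @ Op @ W).real - mean ** 2)

comm = A @ B - B @ A
lhs = spread(A, W) * spread(B, W)
rhs = 0.5 * abs(np.trace(comm @ W))
assert lhs >= rhs - 1e-12
```

Here lhs = 1 and rhs = 1/2, so the inequality is strict for this mixture.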
Proof. Since β is bounded, β is evaluable in every state (cf. 19.3.15c), the operator
Aβ is bounded, and DAβ = H (cf. 19.3.10b). For each pure state σ ∈ Σ0 , in view
of 19.3.13d we have
|hβiσ | = | (uσ |Aβ uσ ) | ≤ kAβ uσ k
by the Schwarz inequality, and hence (cf. 4.2.5b)
∆σ β = kAβ uσ − hβiσ uσ k ≤ 2kAβ uσ k ≤ 2kAβ k.
If kAβ k = 0 then we have ∆σ β = 0 and hence ∆σ α∆σ β = 0 for each state σ ∈ Σ0
in which \(\alpha\) is evaluable. Assuming \(\|A_\beta\| \ne 0\), 19.3.16 implies that for every \(\varepsilon > 0\) there exists a pure state \(\sigma_\varepsilon \in \Sigma_0\) such that \(\alpha\) is evaluable in \(\sigma_\varepsilon\) and \(\Delta_{\sigma_\varepsilon}\alpha < \frac{\varepsilon}{2\|A_\beta\|}\), and hence such that
\[
\Delta_{\sigma_\varepsilon}\alpha\,\Delta_{\sigma_\varepsilon}\beta < \frac{\varepsilon}{2\|A_\beta\|}\, 2\|A_\beta\| = \varepsilon.
\]
Proof. First we notice that, for every self-adjoint operator \(A\) in \(H\), condition sa-ug in 16.1.6 and the continuity of the inner product imply that, for \(g \in H\) and \(f \in D_A\), the function \(\mathbb{R} \ni t \mapsto \left(g \middle| U^A_f(t)\right)\) is differentiable at 0 and
\[
\frac{d}{dt}\left(g \middle| U^A_f(t)\right)\Big|_0 = (g|iAf);
\]
and similarly \(\frac{d}{dt}\left(U^A_f(t) \middle| g\right)\big|_0 = (iAf|g)\).
For \(f \in D_{A_\alpha} \cap D_{A_\beta}\), from 19.5.10 and 17.1.7 we have
\[
\left(U^{A_\alpha}_f(-t) \middle| A_\beta f\right) = \left(U^{A_\alpha}(-t) f \middle| A_\beta f\right)
= \left(f \middle| U^{A_\alpha}(t) A_\beta f\right) = \left(f \middle| A_\beta U^{A_\alpha}(t) f\right) = \left(A_\beta f \middle| U^{A_\alpha}_f(t)\right)
\]
(recall that \(U^{A_\alpha}(-t) = U^{A_\alpha}(t)^{-1} = U^{A_\alpha}(t)^\dagger\), cf. 16.1.1), and hence
\[
(-iA_\alpha f | A_\beta f) = \frac{d}{dt}\left(U^{A_\alpha}_f(-t) \middle| A_\beta f\right)\Big|_0 = \frac{d}{dt}\left(A_\beta f \middle| U^{A_\alpha}_f(t)\right)\Big|_0 = (A_\beta f | iA_\alpha f).
\]
19.5.23 Proposition. Let α1 and α2 be two compatible observables. Then for each
possible result λ1 for α1 and each ε > 0 there exist a possible result λ2 for α2 and
a pure state σε ∈ Σ0 so that
αk is evaluable in σε , |hαk iσε − λk | < ε, ∆σε αk < 2ε, for k = 1, 2.
Proof. Everything follows from 17.1.13 and 19.5.10 since, for each observable \(\alpha\): \(\sigma(A_\alpha)\) is the set \(\operatorname{sp}\alpha\) of all possible results for \(\alpha\) (cf. 19.3.10a); \(\alpha\) is evaluable in a pure state \(\sigma \in \Sigma_0\) if and only if \(u_\sigma \in D_{A_\alpha}\) (cf. 19.3.13d); and, if \(\alpha\) is evaluable in a pure state \(\sigma \in \Sigma_0\), then \(\langle\alpha\rangle_\sigma = \langle A_\alpha\rangle_{u_\sigma}\) and \(\Delta_\sigma\alpha = \Delta_{u_\sigma} A_\alpha\) (cf. 19.3.13d).
19.5.24 Remarks.
(a) We saw in 19.3.16 that, for each observable α, the uncertainty ∆σ α can be made
arbitrarily small by a suitable choice of the state σ. One can wonder if a similar
possibility exists for two observables α and β, i.e. if the following proposition
is true
P : ∀ε > 0, ∃σε ∈ Σ so that α and β are evaluable in σε and ∆σε α∆σε β < ε.
We must emphasize the fact that, whether proposition P is true or not, for
any state σ the product ∆σ α∆σ β has for us only the statistical meaning that is
based on the interpretation of ∆σ α as the theoretical prediction of the standard
deviation of the results obtained when measuring an observable α in a large
number of copies all prepared in σ (cf. 19.1.22a). In particular, considering the
product ∆σ α∆σ β does not imply for us any idea of carrying out measurements
of α and of β in the same copies of the quantum system. In fact, an experimental
test for the value of ∆σ α∆σ β rests on measuring α in a large collection of copies
and hence (cf. 19.5.10) the self-adjoint operators Aα and Aβ can fail to commute
(in the sense defined in 17.1.5), but nonetheless be such that [Aα , Aβ ] ⊂ OH ,
with a mathematically very meaningful domain D[Aα ,Aβ ] to boot (cf. 17.1.8).
It must be granted that, if α and β are bounded, then α and β are compatible
if and only if [Aα , Aβ ] = OH (cf. 19.3.10b, 19.5.10, 17.1.6a); but in this case
19.5.20 is of no real use since in this case the truthfulness of proposition P is
assured by 19.5.21. What is true in general is that if α and β are compatible
then [Aα , Aβ ] ⊂ OH (cf. 19.5.10 and 17.1.7h), but it would be sensible to use
this fact together with 19.5.20 only if we did not have the stronger result of
19.5.22, which shows that for compatible α and β the result of 19.5.19 does not
exert any constraint on ∆σ α∆σ β for any state σ in which α and β are evalu-
able (without the additional condition on σ that we should need if we were to
use 19.5.20). Actually, 19.5.23 shows that if α and β are compatible then an
even stronger proposition than proposition P is true. We point out that, while
for the results previously obtained about the compatibility of two observables we
had to assume that an ideal measurement was available for at least one of them
(cf. 19.5.13 and 19.5.14), this assumption is not required in 19.5.23. We notice
that the result of 19.5.23 holds trivially for every pair of classical observables;
indeed, in the classical case, for each microstate s ∈ S we have ∆s α = 0 for
each observable α (cf. 19.2.6a). Thus, two compatible quantum observables ex-
hibit once again a behaviour similar to the one they would display if they were
any pair of classical observables. The behaviour of two compatible quantum
observables is not in general equal, but only similar to the one of two classical
observables because we do not assume that for every quantum observable α
and for every possible result λ for α there is a state σ such that hαiσ = λ and
∆σ α = 0 (in our treatment of quantum mechanics, there is such a state if and
only if \(\sigma_p(A_\alpha) \ne \emptyset\) and \(\lambda \in \sigma_p(A_\alpha)\), cf. 19.3.21; there would be such a state
for every observable α and every λ ∈ σ(Aα ) if we admitted in our treatment
the absolute precision state preparations represented by elements which do not
belong to the Hilbert space that we mentioned in 19.3.12b).
(g) An observable α is discrete if and only if there exists a c.o.n.s. {vj }j∈J in H so
that, letting σj be the pure state such that uσj = vj , α is evaluable in σj and
∆σj α = 0, for all j ∈ J; this follows from 19.3.10c and from the fact that an
observable α is evaluable in a pure state σ and ∆σ α = 0 if and only if uσ is an
eigenvector of Aα (cf. 19.3.21). Thus, for a discrete observable there are many
pure states (one for each element of a c.o.n.s. in H) in which α behaves as a
classical observable does in a microstate.
Let α and β be discrete observables. Then α and β are compatible if and only
if there exists a c.o.n.s. {vj }j∈J in H so that, letting σj be the pure state such
that uσj = vj , α and β are evaluable in σj and ∆σj α = ∆σj β = 0 (which
is a stronger result than ∆σj α∆σj β = 0) for all j ∈ J (cf. 19.5.10, 17.1.14,
19.3.21). Thus, if two discrete observables are compatible then there are many
pure states (one for every element of a c.o.n.s. in H) in which both of them
behave as a pair of classical observables do in a microstate.
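The statement about compatible discrete observables can be illustrated in a finite-dimensional toy model: two commuting Hermitian matrices (an illustrative choice, standing in for \(A_\alpha\), \(A_\beta\)) admit a common orthonormal eigenbasis, and in each basis state both uncertainties vanish:

```python
import numpy as np

# Two commuting "discrete observables" (toy Hermitian matrices) and the
# common eigenbasis in whose states both uncertainties are zero.
A = np.diag([1.0, 2.0, 2.0])
B = np.diag([3.0, 3.0, 5.0])
assert np.allclose(A @ B, B @ A)  # compatibility (commutation)
for j in range(3):
    v = np.zeros(3)
    v[j] = 1.0                     # j-th vector of the common c.o.n.s.
    for Op in (A, B):
        mean = v @ Op @ v
        # Delta = ||Op v - <Op> v|| vanishes on an eigenvector
        assert np.linalg.norm(Op @ v - mean * v) < 1e-12
```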
19.6.1 Remark. For a given quantum system, let σ be a state preparation proce-
dure and suppose that it is carried out at a definite instant of time t0 . In all the
sections preceding, a copy prepared in σ was used in a second procedure (the deter-
mination of a proposition, the measurement of an observable, the passage through
a filter) which took place immediately after time t0 . However, at least in princi-
ple it is possible to wait for a positive time interval t before activating the second
procedure, and carry out this second procedure at time t0 + t; if this is done, the
second procedure takes place immediately after the new first procedure that can be
described as follows: perform procedure σ and wait for the time interval t. Now,
this new first procedure is not in general equivalent to the procedure σ. We assume
that this new first procedure is still a state preparation procedure, which we denote
by σt , and we say that the state σ at time t0 evolves into the state σt at time t0 + t.
We also assume that σt is a pure state whenever σ is a pure state; thus, we have
the mapping Γt defined by
Σ0 ∋ σ 7→ Γt (σ) := σt ∈ Σ0 .
In what follows we confine our attention to quantum systems for which Γt does
not depend on t0 but only on the time interval t (this was already anticipated by
the symbol Γt , where t0 does not appear); these systems are called conservative.
Also, we confine our attention to quantum systems for which the mapping Γt is a
bijection from Σ0 onto itself, for every positive t; these systems are called reversible.
Completely isolated quantum systems are experimentally seen to be conservative
and reversible. We denote the identity mapping of Σ0 by Γ0 and write Γ−t := (Γt )−1
for every positive t; for every pure state σ and any time t0 , if we prepare the state
Γ−t (σ) at time t0 − t and we wait until time t0 , then at time t0 we have a copy of
the system in the state σ. For every pair of positive t1 , t2 we have Γt1 ◦ Γt2 = Γt1 +t2 ;
this is simply due to the fact that waiting for the time interval t2 and then for the
time interval t1 is the same as waiting for the time interval t1 + t2 (and to the fact
that Γt depends only on the time interval t). Then, it is easy to prove that we have
Γt1 ◦ Γt2 = Γt1 +t2 for all t1 , t2 ∈ R. Further, we assume that, for every pair of
pure states σ1 , σ2 and every positive t, the transition probability (cf. 19.4.7b) from
Γt (σ1 ) to Γt (σ2 ) is the same as the transition probability from σ1 to σ2 ; indeed,
the probability that a copy prepared at time t0 in a pure state σ1 gets modified
(by the action of a suitable filter) to become as if it had been prepared at time t0
in another pure state σ2 is experimentally seen to be the same immediately after
t0 as at any later time. Since Γ−t = (Γt )−1 , this entails that the same is true for
negative t. Next, we assume that for every pure state σ the transition probability
from the pure state Γt (σ) to the pure state σ approaches one as t approaches zero;
the meaning of this continuity condition is obvious.
Now, since Γt is a bijection from the family Σ0 of pure states onto itself, in view
of the bijection between Σ0 and the projective Hilbert space Ĥ defined in 19.3.5c
there exists, for all t ∈ R, a unique mapping ωt : Ĥ → Ĥ which is a bijection from
Ĥ onto itself and also such that
[uΓt (σ) ] = ωt ([uσ ]), ∀σ ∈ Σ0 .
Since Γt preserves the transition probability between pure states, we have
\[
\tau(\omega_t([u_{\sigma_1}]), \omega_t([u_{\sigma_2}])) = \left| \left(u_{\Gamma_t(\sigma_1)} \middle| u_{\Gamma_t(\sigma_2)}\right) \right| = \left| (u_{\sigma_1} | u_{\sigma_2}) \right| = \tau([u_{\sigma_1}], [u_{\sigma_2}]), \quad \forall \sigma_1, \sigma_2 \in \Sigma_0, \forall t \in \mathbb{R},
\]
where τ is the function defined in 10.9.1; thus, ωt is an automorphism of the pro-
jective Hilbert space (Ĥ, τ ), for all t ∈ R (cf. 10.9.4). Moreover, the condition
Γt1 ◦ Γt2 = Γt1 +t2 , ∀t1 , t2 ∈ R,
is obviously equivalent to the condition
ωt1 ◦ ωt2 = ωt1 +t2 , ∀t1 , t2 ∈ R.
Furthermore, the continuity condition assumed above is obviously equivalent to the
condition that
the function R ∋ t 7→ τ ([u], ωt ([u])) ∈ [0, 1] is continuous at 0, ∀u ∈ H̃.
Therefore, in view of 16.4.5, the mapping
R ∋ t 7→ ωt ∈ Aut Ĥ
is a continuous one-parameter group of automorphisms. Consequently, in view of
16.4.11, there exists a continuous one-parameter unitary group U in H so that
ωt ([u]) = [U (t)u], ∀u ∈ H̃, ∀t ∈ R,
i.e. \(W_{\sigma_t} = U^H(t) W_\sigma U^H(t)^{-1}\). We point out that, although this equality has been obtained on the basis of a particular decomposition of \(\sigma\) into a mixture of pure states, there is no trace of that particular decomposition in the final result, as must be the case since that decomposition is not unique unless \(\sigma\) is a pure state (cf. 19.3.5b,c).
We also note that \(U^H(t) W_\sigma U^H(t)^{-1}\) is indeed a statistical operator by 18.3.2a. Now we notice that, for every positive \(t\), the mapping \(\Sigma \ni \sigma \mapsto \sigma_t \in \Sigma\) turns out to be a bijection from \(\Sigma\) onto itself, because the mapping of 19.3.1a is a bijection and the mapping \(W(H) \ni W \mapsto U^H(t) W U^H(t)^{-1} \in W(H)\) is a bijection from \(W(H)\) onto itself, as can easily be seen. Thus, for every state \(\sigma \in \Sigma\) and every positive \(t\) we can
define σ−t as the state that evolves into the state σ at any time t0 if it is prepared
at time \(t_0 - t\); clearly, we have
\[
W_{\sigma_{-t}} = U^H(t)^{-1} W_\sigma U^H(t) = U^H(-t) W_\sigma U^H(-t)^{-1}.
\]
Thus, we have
\[
W_{\sigma_t} = U^H(t) W_\sigma U^H(t)^{-1}, \quad \forall \sigma \in \Sigma, \forall t \in \mathbb{R}.
\]
This outcome of the assumptions above can be stated as the axiom below.
19.6.2 Axiom (Axiom Q3). There are quantum systems for which there exists
a self-adjoint operator H (in the Hilbert space in which the system is represented)
so that for every t0 ∈ R, every positive t, and every state σ ∈ Σ, a copy prepared at
time t0 in σ becomes after the time interval t the same as a copy prepared at time
\(t_0 + t\) in the state \(\sigma_t\) represented by the statistical operator
\[
W_{\sigma_t} := U^H(t) W_\sigma U^H(t)^{-1},
\]
and a copy prepared at time \(t_0 - t\) in the state \(\sigma_{-t}\) represented by the statistical operator
\[
W_{\sigma_{-t}} := U^H(-t) W_\sigma U^H(-t)^{-1}
\]
becomes after the time interval \(t\) the same as a copy prepared at time \(t_0\) in \(\sigma\).
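The evolution prescribed by axiom Q3 can be simulated directly in a finite-dimensional toy model (the matrix \(H\) and the statistical operator below are illustrative choices): \(U^H(t) = e^{itH}\) is built from the spectral decomposition of \(H\), and one can check the group property \(U^H(t_1)U^H(t_2) = U^H(t_1+t_2)\) and the preservation of the trace:

```python
import numpy as np

# Toy self-adjoint generator H and the one-parameter unitary group exp(itH).
H = np.array([[1.0, 0.5], [0.5, -1.0]])
vals, vecs = np.linalg.eigh(H)

def U(t):
    # U^H(t) = exp(itH) via the spectral decomposition of H
    return vecs @ np.diag(np.exp(1j * t * vals)) @ vecs.conj().T

W = np.diag([0.6, 0.4]).astype(complex)   # toy statistical operator
Wt = U(1.3) @ W @ U(1.3).conj().T          # W_{sigma_t}
assert abs(np.trace(Wt) - 1) < 1e-12       # still a statistical operator
assert np.allclose(U(0.7) @ U(0.6), U(1.3))  # group property
```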
19.6.3 Remarks.
(a) In 19.6.1 we proved that the assumptions made there implied axiom Q3, and it
is easy to see that axiom Q3 implies those assumptions (13.1.13b and 16.4.3a
must be used). Then, we see in particular that the quantum systems for which
axiom Q3 holds are the conservative and reversible quantum systems. In what
follows we consider only conservative and reversible quantum systems.
(b) The self-adjoint operator H of axiom Q3 is unique up to an additive multiple
of the unit operator. This is already clear from 19.6.1. In any case, to see it
directly, assume that H ′ is a self-adjoint operator which plays the same role as
\(H\) in axiom Q3. Then we have
\[
U^{H'}(t) W U^{H'}(t)^{-1} = U^H(t) W U^H(t)^{-1}, \quad \forall W \in W(H), \forall t \in \mathbb{R},
\]
and hence
\[
\omega_{U^H(t)} = \omega_{U^{H'}(t)}, \quad \forall t \in \mathbb{R}.
\]
From 16.4.3b we see that this implies that there exists \(k \in \mathbb{R}\) so that
\[
U^{H'}(t) = e^{ikt} U^H(t), \quad \forall t \in \mathbb{R},
\]
and hence \(H' = H + k 1_H\).
(c) The self-adjoint operator −H is called the Hamiltonian of the system, and
it is interpreted as the self-adjoint operator which represents the observable
“energy” of the system. This is consistent with its being unique only up to an
additive multiple of the unit operator, since physically the observable energy
of any system is only defined up to an additive constant (note that, for k ∈ R,
σ(H + k1H ) = σ(H) + k and σp (H + k1H ) = σp (H) + k, as is obvious from
15.2.4b and 15.2.5b; then, cf. 19.3.10a and 19.3.12a).
19.6.4 Remark. The relationship between a state and the state which at any time
evolves from it or has evolved into it, as implied by axiom Q3, is strictly causal, in
spite of the acausal character of quantum mechanics when it is referred to a single
copy of a system, as reflected in the impossibility of making more than statistical
statements about the results to be expected from determinations of propositions
or from measurements of observables. Thus, when referred to ensembles and not
to single copies, quantum mechanics is as deterministic as classical mechanics if
the quantum system is conservative and reversible, hence in particular if it is a
completely isolated system. An altogether different change of state happens when
there is state reduction (cf. 19.4.3b), produced by the interaction of copies of the
system with a filter or with a measuring instrument in a first kind measurement
(cf. 19.4.8 and 19.4.10). We point out that the number of copies in an ensemble
representing a state does not change in the time evolution of axiom Q3, while it
does in a state reduction.
19.6.5 Remark. For a quantum system whose time evolution is determined by a
self-adjoint operator H as in axiom Q3, for each state σ ∈ Σ activated at any time
t0 we can define the mapping R ∋ t 7→ σt ∈ Σ, which is called the trajectory of the
state σ. For a pure state σ, the trajectory of σ corresponds to the mapping
\[
\mathbb{R} \ni t \mapsto u_\sigma(t) := U^H_{u_\sigma}(t) = U^H(t) u_\sigma \in \tilde{H}
\]
(cf. 19.6.1; for \(U^H_{u_\sigma}\), cf. 16.1.1). Now, for \(u_\sigma \in D_H\) we have (cf. 16.1.5b)
\[
u_\sigma(t) \in D_H \quad\text{and}\quad \frac{d u_\sigma(t)}{dt} = i H u_\sigma(t), \quad \forall t \in \mathbb{R}.
\]
Thus, this is the condition that is obeyed by the pure states whose representatives
(as in 19.3.5c) are rays which lie in DH , and this is the abstract form of what is
known as the Schrödinger equation. In many specific cases, the Hilbert space H is
a space of equivalence classes of functions on Rn and H is a differential operator;
then uσ becomes a function (actually, an equivalence class of functions) and the
Schrödinger equation is often written as
\[
\frac{\partial u_\sigma}{\partial t}(x_1, \dots, x_n, t) = i H u_\sigma(x_1, \dots, x_n, t);
\]
however, the use of the symbol \(\frac{\partial u_\sigma}{\partial t}\) is misleading, since \(\frac{d u_\sigma(t)}{dt}\) has the meaning that is defined in 16.1.3 (with the limit taken with respect to the distance defined
in 10.1.15). Finding the continuous one-parameter unitary group U H is sometimes
dubbed “solving the Schrödinger equation”; however, it must be noted that the
trajectories of all states are known if U H is known, while only the trajectories of
the pure states represented by vectors in DH appear in the Schrödinger equation,
and it is physically impossible to have DH = H (cf. 19.3.11).
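The abstract Schrödinger equation can be checked numerically in a finite-dimensional sketch (the matrix \(H\) is an illustrative choice, and a central finite difference stands in for the limit of 16.1.3):

```python
import numpy as np

# Toy check that the trajectory u(t) = U^H(t) u0, with U^H(t) = exp(itH)
# built from the spectral decomposition of H, satisfies du/dt = iHu.
H = np.array([[2.0, 1.0], [1.0, -1.0]])
vals, vecs = np.linalg.eigh(H)

def u(t, u0):
    return vecs @ (np.exp(1j * t * vals) * (vecs.conj().T @ u0))

u0 = np.array([1.0, 0.0], dtype=complex)
t0, dt = 0.4, 1e-5
# central difference approximation of du/dt at t0
deriv = (u(t0 + dt, u0) - u(t0 - dt, u0)) / (2 * dt)
assert np.allclose(deriv, 1j * H @ u(t0, u0), atol=1e-6)
```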
19.6.6 Proposition. If the time evolution of a quantum system is determined by
a self-adjoint operator H as in axiom Q3 and the energy of the system is a discrete
observable, then for every pure state \(\sigma \in \Sigma_0\) we have
\[
u_\sigma(t) := U^H(t) u_\sigma = \sum_{n \in I} e^{it\lambda_n} P_n u_\sigma, \quad \forall t \in \mathbb{R},
\]
where \(\{\lambda_n\}_{n \in I}\) is the family of eigenvalues of \(H\) and \(\{P_n\}_{n \in I}\) the corresponding family of eigenprojections (cf. 15.3.4B).
Proof. From 19.3.10c we have that the conditions of 15.3.4B hold true for the self-
adjoint operator H, since −H represents the observable energy (cf. 19.6.3c). The
result then follows from 16.1.6 and from the explicit forms of the operator ϕ(A) in
15.3.4B.
19.6.7 Remark. The result of 19.6.6 shows why, if the energy of a quantum system
is a discrete observable, knowing the eigenvalues of H and a c.o.n.s. comprised of
eigenvectors of H allows one to “solve the Schrödinger equation”.
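A toy numerical illustration of 19.6.6 (the matrix \(H\) is an illustrative choice): the evolution computed from the spectral decomposition agrees with the sum over eigenvalues and eigenprojections:

```python
import numpy as np

# "Solving the Schroedinger equation" for a toy H with discrete spectrum:
# U^H(t) u = sum_n exp(i t lambda_n) P_n u, with P_n the eigenprojections.
H = np.array([[0.0, 1.0], [1.0, 0.0]])
vals, vecs = np.linalg.eigh(H)
u0 = np.array([1.0, 0.0], dtype=complex)
t = 0.9
expected = vecs @ np.diag(np.exp(1j * t * vals)) @ vecs.conj().T @ u0
via_projections = sum(
    np.exp(1j * t * lam) * np.outer(v, v.conj()) @ u0
    for lam, v in zip(vals, vecs.T)
)
assert np.allclose(expected, via_projections)
```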
19.6.10 Remark. The result of 19.6.9 shows why the point spectrum of the Hamil-
tonian of a quantum system is of interest: the eigenvectors represent stationary
states of the system. The typical reaction of an atom to outside stimuli is to transform its state from one stationary state to another, emitting light whose frequency is proportional to the difference between the corresponding eigenvalues.
Now, this implies condition b by the very definitions given in 19.1.21 and in 19.1.20.
b ⇒ c: This is obvious.
c ⇒ d: This is obvious.
d ⇒ e: Assume condition d. Recalling that for a pure state σ ∈ Σ0 we have
uσt = U H (t)uσ (cf. 19.6.1), from 19.3.13d we have
\[
U^H(t) u \in D_{A_\alpha} \quad\text{and}\quad \left(U^H(t) u \middle| A_\alpha U^H(t) u\right) = (u | A_\alpha u), \quad \forall t \in \mathbb{R}, \forall u \in \tilde{H} \cap D_{A_\alpha},
\]
which is equivalent to
\[
D_{A_\alpha} \subset D_{U^H(t)^{-1} A_\alpha U^H(t)} \quad\text{and}\quad \left(u \middle| U^H(t)^{-1} A_\alpha U^H(t) u\right) = (u | A_\alpha u), \quad \forall t \in \mathbb{R}, \forall u \in \tilde{H} \cap D_{A_\alpha}.
\]
19.6.13 Remark. The way of describing the time evolution of a conservative and
reversible quantum system that has been discussed in this section is called the
Schrödinger picture. There is a mathematically equivalent way of doing the same,
which can at times be useful for practical calculations.
For each proposition \(\pi \in \Pi\) and each state \(\sigma \in \Sigma\), we see that (cf. 18.2.11c)
\[
p(\pi, \sigma_t) = \operatorname{tr}(P_\pi U^H(t) W_\sigma U^H(t)^{-1}) = \operatorname{tr}(U^H(t)^{-1} P_\pi U^H(t) W_\sigma) = p(\pi_t, \sigma), \quad \forall t \in \mathbb{R},
\]
if we define πt as the proposition such that Pπt = U H (t)−1 Pπ U H (t) (this operator
is an orthogonal projection in view of 13.1.8). Similarly, if α is an observable, σ is
a state, and α is evaluable in σt for some t ∈ R, we see that (cf. 19.3.13a)
\[
\langle\alpha\rangle_{\sigma_t} = \operatorname{tr}(A_\alpha U^H(t) W_\sigma U^H(t)^{-1}) = \operatorname{tr}(A_{\alpha,t} W_\sigma),
\]
if we define \(A_{\alpha,t} := U^H(t)^{-1} A_\alpha U^H(t)\) (this operator is self-adjoint in view of
12.5.4c). This mathematical way of dealing with time evolution is called the Heisen-
berg picture.
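The agreement between the two pictures rests only on the cyclicity of the trace, \(\operatorname{tr}(A U W U^{-1}) = \operatorname{tr}((U^{-1} A U) W)\); a toy numerical sketch (all matrices are illustrative choices):

```python
import numpy as np

# Schroedinger picture (evolve the state) vs Heisenberg picture (evolve
# the observable): expectation values coincide by cyclicity of the trace.
H = np.array([[1.0, 0.3], [0.3, -0.5]])
vals, vecs = np.linalg.eigh(H)
U = vecs @ np.diag(np.exp(1j * 0.8 * vals)) @ vecs.conj().T  # U^H(0.8)
A = np.array([[2.0, 1j], [-1j, 0.0]])            # toy observable
W = np.diag([0.7, 0.3]).astype(complex)           # toy statistical operator
schrodinger = np.trace(A @ U @ W @ U.conj().T)
heisenberg = np.trace(U.conj().T @ A @ U @ W)
assert np.isclose(schrodinger, heisenberg)
```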
Chapter 20
Now, conditions a’ and c are equivalent in view of 16.3.1, and conditions b’ and c
are equivalent in view of 15.4.1. Thus, conditions a, b, c are equivalent. It can be
proved in a similar way that conditions a, d, e are equivalent.
\[
\mathbb{R} \ni t \mapsto \left(g \middle| U^T(t) f\right) \in \mathbb{C}
\]
is differentiable at 0 and
\[
\frac{d}{dt}\left(g \middle| U^T(t) f\right)\Big|_0 = i (g | T f),
\]
and similarly
\[
\frac{d}{dt}\left(U^T(-t) f \middle| g\right)\Big|_0 = i (T f | g).
\]
After this preliminary remark, now we give the proofs of statements a and b.
a: Condition a in 20.1.1 (cf. also 16.1.1) implies that
\[
\left(U^A(-t) f \middle| U^B(s) g\right) = e^{-its} \left(U^B(-s) f \middle| U^A(t) g\right), \quad \forall f, g \in H, \forall (s,t) \in \mathbb{R}^2,
\]
and hence, for \(f \in D_A \cap D_B\) (differentiating first with respect to \(s\) at 0 with \(g := f\), by the preliminary remark, and then with respect to \(t\) at 0),
\[
(A f | B f) = -i \frac{d}{dt}\left(U^A(-t) f \middle| B f\right)\Big|_0
= -i \frac{d}{dt}\left(-t \left(f \middle| U^A(t) f\right) + \left(B f \middle| U^A(t) f\right)\right)\Big|_0
= i (f | f) + (B f | A f),
\]
and hence
\[
(A f | B f) - (B f | A f) = i \|f\|^2.
\]
In view of 19.5.19 (the proof of 19.5.19 is actually effective for every pair of self-
adjoint operators and every statistical operator in which they are both computable;
cf. also 19.3.13a), this proves condition a.
b: Condition c in 20.1.1 is equivalent to
\[
U^A(t) B = (B - t 1_H) U^A(t), \quad \forall t \in \mathbb{R}
\]
(cf. 3.2.10b1). For all \(g \in D_B\) and \(f \in D_{AB-BA}\), this implies that
\[
\left(g \middle| U^A(t) B f\right) = \left(B g \middle| U^A(t) f\right) - t \left(g \middle| U^A(t) f\right), \quad \forall t \in \mathbb{R},
\]
Proof. The operators \(A^M\) and \(B^M\) are self-adjoint operators in the Hilbert space \(M\) (cf. 17.2.8). Moreover, the operators \(U^A(t)\) and \(U^B(t)\) are reduced by \(M\) for all \(t \in \mathbb{R}\), and the mappings
\[
\mathbb{R} \ni t \mapsto (U^A(t))^M \quad\text{and}\quad \mathbb{R} \ni t \mapsto (U^B(t))^M
\]
are continuous one-parameter unitary groups whose generators are \(A^M\) and \(B^M\) (cf. 17.2.13). Now, condition a in 20.1.1 implies obviously that
(a) The operator Q defined in 15.3.4A is a self-adjoint operator in the Hilbert space
L2 (R). The continuous one-parameter unitary group U Q is so that
is continuous for all [f ] ∈ L2 (R) (cf. 11.4.16). Then, if we denote by P the generator
of this continuous one-parameter group, we have
P = F −1 QF
by 16.3.1.
The set \(D\) is obviously a linear manifold in \(L^2(\mathbb{R})\) and \(D \subset D_Q\); moreover \(\overline{D} = L^2(\mathbb{R})\) in view of 11.3.3 and 10.6.5b (or in view of 11.4.19). The restriction \(Q_D\) of \(Q\) to \(D\) is a symmetric operator (cf. e.g. 12.4.3); moreover, for each \(\varphi \in S(\mathbb{R})\),
the two functions
R ∋ x 7→ ϕ± (x) := (x ± i)−1 ϕ(x) ∈ C
are obviously elements of S(R) and
(QD ± i1L2 (R) )[ϕ± ] = [ϕ];
in view of 12.4.17, this proves that the operator QD is essentially self-adjoint. There-
fore, the operator F −1 QD F is also essentially self-adjoint (cf. 12.5.4d). We see that
DF −1 QD F = {[f ] ∈ L2 (R) : F [f ] ∈ D}.
Now, for [f ] ∈ L2 (R), we have
F [f ] ∈ D ⇒ [∃[g] ∈ D s.t. F [f ] = [g]] ⇒ [∃[g] ∈ D s.t. [f ] = F −1 [g]] ⇒ [f ] ∈ D
and
[f ] ∈ D ⇒ F [f ] ∈ D
(cf. 11.4.6). Therefore,
DF −1 QD F = D.
Moreover we have
F −1 QD F [ϕ] = [(ξ ϕ̂)ˇ] = −i[((ϕ̂)ˇ)(1) ] = −i[ϕ′ ] = P0 [ϕ], ∀ϕ ∈ S(R)
(cf. 11.4.2 and 11.4.9). This proves that
F −1 QD F = P0 .
Then P0 is essentially self-adjoint (cf. 12.5.4d) and P is its unique self-adjoint
extension, since P is self-adjoint and QD ⊂ Q implies P0 ⊂ P (cf. 12.4.11c). This
concludes the proof of statement b.
c: For all \((s,t) \in \mathbb{R}^2\) and all \([f] \in L^2(\mathbb{R})\), we have
\[
(f_{-s})^t(x) = e^{itx} f(x+s) = e^{-its} e^{it(x+s)} f(x+s) = e^{-its} (f^t)_{-s}(x), \quad \forall x \in D_{f_{-s}},
\]
and hence
\[
U^Q(t) U^P(s) [f] = e^{-its} U^P(s) U^Q(t) [f].
\]
This proves statement c.
20.1.8 Remark. In view of 20.1.7c, 20.1.3b, 12.6.5, we know that either operator
P or Q or both operators P and Q must be non-bounded. Then both P and Q are
non-bounded, in view of their unitary equivalence (cf. 20.1.7b) and of 4.6.5b.
This can be proved more directly in the following way. The operator Q is not
bounded in view of 14.2.17 (cf. also Section 14.5 and 15.3.4A). Then the operator
P is not bounded because Q and P are unitarily equivalent.
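A standard trace argument, sketched numerically below, shows why no pair of operators on a finite-dimensional space can satisfy \([Q, P] = i 1_H\): the trace of any commutator of matrices vanishes, while \(\operatorname{tr}(i\,1) = in \ne 0\). The random matrices are purely illustrative:

```python
import numpy as np

# tr(QP - PQ) = 0 for any square matrices Q, P, while tr(i * I_n) = i*n != 0;
# hence the commutation relation [Q, P] = i*1 forces an infinite-dimensional
# space, consistent with the unboundedness of Q and P noted above.
rng = np.random.default_rng(0)
n = 4
Q = rng.standard_normal((n, n))
P = rng.standard_normal((n, n))
assert abs(np.trace(Q @ P - P @ Q)) < 1e-12
assert np.trace(1j * np.eye(n)) != 0
```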
Neumann proved (Neumann, 1931) what is now called the Stone–von Neumann
uniqueness theorem. This theorem was the real final proof of the equivalence of
Heisenberg’s and Schrödinger’s formulations of quantum mechanics.
In section 20.3 we use the Stone–von Neumann theorem in our discussion of the
position and linear momentum observables for a non-relativistic quantum particle.
There are various proofs of the Stone–von Neumann uniqueness theorem. The
proof we present here is von Neumann’s original one, mainly because this proof is
a nice opportunity to put into action several theorems we saw in previous chapters.
Some facts we use in the proof are collected as preliminary remarks in 20.2.2, part
of the proof is set forth as a lemma in 20.2.3, and the theorem is stated and proved
in 20.2.4. Before all that, in 20.2.1 we compute an integral which has an important
role in the proof of 20.2.3.
(d) Suppose that the operators A and B are reduced by a non-trivial subspace M of
H. Then the operator T is reduced by M and the restriction T M of T to M (cf.
17.2.1) is the same as the element T (M) of B(M ) which is defined by AM and
B M as T is defined by A and B in statement a (recall that the pair AM , B M is
a representation of WCR in the Hilbert space M , cf. 20.1.6).
(e) If the orthogonal dimension of the subspace RT is one, then the pair of self-
adjoint operators A, B is jointly irreducible, and hence the pair A, B is an ir-
reducible representation of WCR.
Then 10.5.6 implies that there exists a unique operator T ∈ B(H) such that
\[
(f | T g) = \int_{\mathbb{R}^2} \gamma(s,t) (f | W(s,t) g)\, dm_2(s,t), \quad \forall f, g \in H.
\]
We have, for all \(f, g \in H\),
\[
(T g | f) = \overline{(f | T g)} \overset{(2)}{=} \int_{\mathbb{R}^2} \overline{\gamma(s,t) (f | W(s,t) g)}\, dm_2(s,t)
\overset{(3)}{=} \int_{\mathbb{R}^2} \overline{\gamma(s,t)}\, (g | W(-s,-t) f)\, dm_2(s,t)
\overset{(4)}{=} \int_{\mathbb{R}^2} \overline{\gamma(-s,-t)}\, (g | W(s,t) f)\, dm_2(s,t) \overset{(5)}{=} (g | T f),
\]
where 2 holds true because complex conjugation commutes with integration (cf.
8.2.3), 3 holds true by 20.2.2b, 4 by 9.2.4b (with A(s, t) := (−s, −t)), 5 by the
equality γ(−s, −t) = γ(s, t). This proves that the operator T is self-adjoint (cf.
12.4.3).
Now we want to prove that \(T \ne O_H\). We assume to the contrary that \(T = O_H\) and we fix \(f, g \in H\). Then we have, for all \((s', t') \in \mathbb{R}^2\),
\[
0 = \left(f \middle| W(-s',-t') T W(s',t') g\right) \overset{(6)}{=} \left(W(s',t') f \middle| T W(s',t') g\right)
= \int_{\mathbb{R}^2} \gamma(s,t) \left(W(s',t') f \middle| W(s,t) W(s',t') g\right) dm_2(s,t)
\]
\[
\overset{(7)}{=} \int_{\mathbb{R}^2} \gamma(s,t)\, e^{i(st'-ts')} (f | W(s,t) g)\, dm_2(s,t)
\overset{(8)}{=} \int_{\mathbb{R}} e^{it's} \left( \int_{\mathbb{R}} e^{-is't} \gamma(s,t) (f | W(s,t) g)\, dm(t) \right) dm(s),
\]
where: 6 is obvious; 7 follows from 20.2.2a,b; 8 holds true by 8.4.10c since 1 implies
that the function
\[
\mathbb{R}^2 \ni (s,t) \mapsto \gamma(s,t)\, e^{i(st'-ts')} (f | W(s,t) g) \in \mathbb{C}
\]
is an element of L1 (R2 , A(d2 ), m2 ). For all s′ ∈ R, 1 and 11.4.7 imply that the
function
\[
\mathbb{R} \ni t \mapsto e^{-is't} \gamma(s,t) (f | W(s,t) g) \in \mathbb{C}
\]
is an element of L1 (R) for all s ∈ R; then we can define the function
\[
\mathbb{R} \ni s \mapsto \varphi_{s'}(s) := \int_{\mathbb{R}} e^{-is't} \gamma(s,t) (f | W(s,t) g)\, dm(t) \in \mathbb{C},
\]
which is an element of L1 (R) in view of 8.4.10b; thus, the result obtained above by
the equalities from 6 to 8 can be written as
ϕ̌s′ (t′ ) = 0, ∀t′ ∈ R, ∀s′ ∈ R. (9)
Moreover, for all s′ ∈ R, 1 implies also that
\[
|\varphi_{s'}(s)| \le (2\pi)^{-1} e^{-\frac{1}{4}s^2} \|f\| \|g\| \int_{\mathbb{R}} e^{-\frac{1}{4}t^2}\, dm(t), \quad \forall s \in \mathbb{R},
\]
and hence that ϕs′ ∈ L2 (R) (cf. also 11.4.7). Then, in view of 11.4.22 we can write
9 as
F −1 [ϕs′ ] = 0L2 (R) , ∀s′ ∈ R,
and this implies that
[ϕs′ ] = 0L2 (R) , ∀s′ ∈ R. (10)
For all \(s' \in \mathbb{R}\), the function \(\varphi_{s'}\) is continuous, as can be proved by 8.2.11 with
\[
\mathbb{R} \ni t \mapsto (2\pi)^{-1} \|f\| \|g\|\, e^{-\frac{1}{4}t^2} \in [0, \infty)
\]
as dominating function. Then (cf. 11.3.6d) 10 implies that
ϕs′ (s) = 0, ∀s ∈ R, ∀s′ ∈ R,
or
\[
\int_{\mathbb{R}} e^{-is't} \gamma(s,t) (f | W(s,t) g)\, dm(t) = 0, \quad \forall s' \in \mathbb{R}, \forall s \in \mathbb{R}. \tag{11}
\]
Now we fix s ∈ R; 1 and 11.4.7 imply that the function
R ∋ t 7→ βs (t) := γ(s, t) (f |W (s, t)g) ∈ C
is an element of L2 (R) ∩ L1 (R); then, in view of 11.4.22, 11 implies that
F [βs ] = [β̂s ] = 0L2 (R) ,
and this implies that
[βs ] = 0L2 (R) ;
\[
\overset{(13)}{=} \int_{\mathbb{R}^2} \gamma(s',t') \left( \int_{\mathbb{R}^2} \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, e^{\frac{1}{2}i[s(\tilde{t}-t'-t) - t(\tilde{s}-s'-s) + s'(\tilde{t}-t') - t'(\tilde{s}-s')]} \left(f \middle| W(\tilde{s},\tilde{t}) g\right) dm_2(\tilde{s},\tilde{t}) \right) dm_2(s',t')
\]
\[
\overset{(14)}{=} \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \gamma(s',t')\, \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, e^{\frac{1}{2}i[s(\tilde{t}-t'-t) - t(\tilde{s}-s'-s) + s'\tilde{t} - t'\tilde{s}]}\, dm_2(s',t') \right) \left(f \middle| W(\tilde{s},\tilde{t}) g\right) dm_2(\tilde{s},\tilde{t}),
\]
where:
12 follows from a direct computation on the basis of the definition of T and of
20.2.2a;
13 follows from the change of variable (s̃, t̃) := (s′ + s + s′′ , t′ + t + t′′ ), in view
of 9.2.1b;
14 holds true by 8.4.8 and 8.4.10c, since
\[
\int_{\mathbb{R}^2} \gamma(s',t') \left( \int_{\mathbb{R}^2} \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, dm_2(\tilde{s},\tilde{t}) \right) dm_2(s',t')
\overset{(15)}{=} \left( \int_{\mathbb{R}^2} \gamma(u,v)\, dm_2(u,v) \right)^2 < \infty
\]
(15 follows from the change of variable \((u,v) := (\tilde{s}-s'-s,\, \tilde{t}-t'-t)\)).
Moreover we have, for all \((\tilde{s},\tilde{t}) \in \mathbb{R}^2\),
\[
\int_{\mathbb{R}^2} \gamma(s',t')\, \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, e^{\frac{1}{2}i[s(\tilde{t}-t'-t) - t(\tilde{s}-s'-s) + s'\tilde{t} - t'\tilde{s}]}\, dm_2(s',t')
\]
\[
\overset{(16)}{=} \int_{\mathbb{R}^2} \gamma(\hat{s}-s,\, \hat{t}-t)\, \gamma(\tilde{s}-\hat{s},\, \tilde{t}-\hat{t})\, e^{\frac{1}{2}i[s(\tilde{t}-\hat{t}) - t(\tilde{s}-\hat{s}) + (\hat{s}-s)\tilde{t} - (\hat{t}-t)\tilde{s}]}\, dm_2(\hat{s},\hat{t})
\]
\[
= (2\pi)^{-2}\, e^{-\frac{1}{4}(s^2+t^2)}\, e^{-\frac{1}{4}(\tilde{s}^2+\tilde{t}^2)} \int_{\mathbb{R}^2} e^{\frac{1}{2}[-\hat{s}^2 + ((s+\tilde{s}) + i(t+\tilde{t}))\hat{s} - \hat{t}^2 + ((t+\tilde{t}) - i(s+\tilde{s}))\hat{t}]}\, dm_2(\hat{s},\hat{t})
\overset{(17)}{=} 2\pi\, \gamma(s,t)\, \gamma(\tilde{s},\tilde{t}),
where:
16 follows from the change of variable (ŝ, t̂) := (s′ + s, t′ + t);
17 holds true because
\[
\int_{\mathbb{R}^2} e^{\frac{1}{2}[-\hat{s}^2 + ((s+\tilde{s}) + i(t+\tilde{t}))\hat{s} - \hat{t}^2 + ((t+\tilde{t}) - i(s+\tilde{s}))\hat{t}]}\, dm_2(\hat{s},\hat{t})
\overset{(18)}{=} \int_{\mathbb{R}^2} e^{-\frac{1}{2}\left(\hat{s} - \frac{1}{2}((s+\tilde{s}) + i(t+\tilde{t}))\right)^2 - \frac{1}{2}\left(\hat{t} - \frac{1}{2}((t+\tilde{t}) - i(s+\tilde{s}))\right)^2}\, dm_2(\hat{s},\hat{t})
\]
\[
\overset{(19)}{=} \int_{\mathbb{R}} e^{-\frac{1}{2}\left(\hat{s} - \frac{1}{2}((s+\tilde{s}) + i(t+\tilde{t}))\right)^2} dm(\hat{s}) \int_{\mathbb{R}} e^{-\frac{1}{2}\left(\hat{t} - \frac{1}{2}((t+\tilde{t}) - i(s+\tilde{s}))\right)^2} dm(\hat{t})
\overset{(20)}{=} 2\pi
\]
(18 holds true because (a + ib)2 + (b − ia)2 = 0 for all a, b ∈ R; 19 holds true by
20.2.1, 8.4.9, 8.4.10c; 20 follows from 20.2.1).
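The Gaussian integral invoked at steps 19 and 20 (the content of 20.2.1) can be checked numerically: even for a complex shift \(z\), the integral of \(e^{-(x-z)^2/2}\) over \(\mathbb{R}\) equals \(\sqrt{2\pi}\), by a contour-shift argument. A Riemann-sum sketch (the grid parameters are illustrative):

```python
import numpy as np

# Numerical check: integral over R of exp(-(x - z)^2 / 2) = sqrt(2*pi),
# even for a complex shift z (here z = 1 + 2i, an arbitrary choice).
z = 1.0 + 2.0j
dx = 0.01
x = np.arange(-25.0, 25.0, dx)
integral = np.sum(np.exp(-0.5 * (x - z) ** 2)) * dx
assert abs(integral - np.sqrt(2 * np.pi)) < 1e-6
```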
Therefore we have
\[
(f | T W(s,t) T g) = 2\pi\, \gamma(s,t) \int_{\mathbb{R}^2} \gamma(\tilde{s},\tilde{t}) \left(f \middle| W(\tilde{s},\tilde{t}) g\right) dm_2(\tilde{s},\tilde{t}) = 2\pi\, \gamma(s,t)\, (f | T g).
\]
Since \(f\) and \(g\) were arbitrary elements of \(H\) and \((s,t)\) was an arbitrary element of \(\mathbb{R}^2\), this proves that
\[
T W(s,t) T = 2\pi\, \gamma(s,t)\, T, \quad \forall (s,t) \in \mathbb{R}^2.
\]
c: If we set s := t := 0 in statement b, we obtain
T 2 = T.
Since T is a self-adjoint element of B(H) (cf. statement a), this proves that T is an
orthogonal projection (cf. 13.1.5).
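The step from \(T^2 = T\) and \(T = T^\dagger\) to "\(T\) is an orthogonal projection" can be illustrated on matrices; a minimal numerical sketch (the projection is an arbitrary illustrative choice):

```python
import numpy as np

# A self-adjoint idempotent matrix is an orthogonal projection:
# here T projects C^3 onto the span of (1, 1, 0)/sqrt(2).
v = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
T = np.outer(v, v)
assert np.allclose(T, T.conj().T)  # self-adjoint
assert np.allclose(T @ T, T)       # idempotent
# orthogonality of range and kernel: T f is orthogonal to f - T f
f = np.array([0.3, -1.2, 2.0])
assert abs(np.vdot(T @ f, f - T @ f)) < 1e-12
```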
For all \((s_1,t_1), (s_2,t_2) \in \mathbb{R}^2\) and all \(f, g \in R_T\), we have
\[
(W(s_1,t_1) f | W(s_2,t_2) g) \overset{(21)}{=} (W(s_1,t_1) T f | W(s_2,t_2) T g)
\overset{(22)}{=} (f | T W(-s_1,-t_1) W(s_2,t_2) T g)
\]
\[
\overset{(23)}{=} e^{\frac{1}{2}i(-s_1 t_2 + t_1 s_2)} (f | T W(-s_1+s_2,\, -t_1+t_2) T g)
\overset{(24)}{=} e^{-\frac{1}{2}i(s_1 t_2 - t_1 s_2) - \frac{1}{4}(s_2-s_1)^2 - \frac{1}{4}(t_2-t_1)^2} (f | g),
\]
where 21 holds true by 13.1.3c, 22 by 20.2.2b, 23 by 20.2.2a, 24 by statement b and
13.1.3c.
d: For all g ∈ M we have, in view of 20.2.2.c,
\[
(f | T g) = \int_{\mathbb{R}^2} \gamma(s,t) (f | W(s,t) g)\, dm_2(s,t) = 0, \quad \forall f \in M^\perp,
\]
and hence T g ∈ M ⊥⊥ . Since M = M ⊥⊥ (cf. 10.4.4a), this proves that M is an
invariant subspace for T , and hence that T is reduced by M (cf. 17.2.9).
The operator W (s, t) is reduced by M , for all (s, t) ∈ R2 (cf. 20.2.2c). Now we
recall that
U^{A^M}(t) = (U^A(t))^M and U^{B^M}(t) = (U^B(t))^M, ∀t ∈ R,

and hence T^M = T^{(M)} (we have denoted by a subscript whether a given inner product is to be regarded as pertaining to the Hilbert space H or to the Hilbert space M).
e: We prove statement e by contraposition. We assume that there exists a non-trivial subspace M of H such that A and B are reduced by M. Then A and B are reduced by the non-trivial subspace M⊥ as well. Then T^M and T^{M⊥} are non-null orthogonal projections in the Hilbert spaces M and M⊥ respectively (cf. statements a, c, d). In view of 13.1.3c, there exist two normalized vectors u₁, u₂ such that:

u₁ ∈ R_{T^M}, and hence Tu₁ = T^M u₁ = u₁;
u₂ ∈ R_{T^{M⊥}}, and hence Tu₂ = T^{M⊥} u₂ = u₂.

Therefore, {u₁, u₂} is an o.n.s. contained in R_T (cf. 13.1.3c). Hence R_T cannot be a one-dimensional subspace (cf. e.g. 10.7.3).
Proof. a: Let W(s,t) and W̃(s,t) be defined as in 20.2.2 for all (s,t) ∈ R², with respect to the pair A, B and the pair Ã, B̃ respectively. Moreover, let T and T̃ be defined as in 20.2.3, with respect to A, B and Ã, B̃ respectively. Since the projections T and T̃ are non-zero (cf. 20.2.3a,c), we can fix two normalized vectors u ∈ R_T and ũ ∈ R_{T̃}. Since the operators A and B are reduced by the subspace

M_u := V{W(s,t)u : (s,t) ∈ R²}

(cf. 20.2.2d) and since M_u cannot be {0_H} (because W(0,0) = 1_H), the equality M_u = H must be true. Similarly, M_ũ = H̃.
For all L ∈ N, all (α₁, ..., α_L) ∈ C^L, all (s₁, t₁, ..., s_L, t_L) ∈ R^{2L}, we have

‖∑_{l=1}^L α_l W(s_l,t_l)u‖²_H = ∑_{h,l=1}^L ᾱ_h α_l (W(s_h,t_h)u | W(s_l,t_l)u)_H
(1) = ∑_{h,l=1}^L ᾱ_h α_l (W̃(s_h,t_h)ũ | W̃(s_l,t_l)ũ)_{H̃}
= ‖∑_{l=1}^L α_l W̃(s_l,t_l)ũ‖²_{H̃},
where 1 holds true in view of 20.2.3c. Then we have, for all N, M ∈ N, all (β₁, ..., β_N) ∈ C^N, all (γ₁, ..., γ_M) ∈ C^M, all (s₁, t₁, ..., s_N, t_N) ∈ R^{2N}, all (x₁, y₁, ..., x_M, y_M) ∈ R^{2M},

∑_{n=1}^N β_n W(s_n,t_n)u = ∑_{m=1}^M γ_m W(x_m,y_m)u ⇒
‖∑_{n=1}^N β_n W(s_n,t_n)u − ∑_{m=1}^M γ_m W(x_m,y_m)u‖_H = 0 ⇒
‖∑_{n=1}^N β_n W̃(s_n,t_n)ũ − ∑_{m=1}^M γ_m W̃(x_m,y_m)ũ‖_{H̃} = 0 ⇒
∑_{n=1}^N β_n W̃(s_n,t_n)ũ = ∑_{m=1}^M γ_m W̃(x_m,y_m)ũ.
Therefore we can define a mapping

V₀ : L{W(s,t)u : (s,t) ∈ R²} → L{W̃(s,t)ũ : (s,t) ∈ R²}

by letting

V₀(∑_{n=1}^N α_n W(s_n,t_n)u) := ∑_{n=1}^N α_n W̃(s_n,t_n)ũ,
∀N ∈ N, ∀(α₁, ..., α_N) ∈ C^N, ∀(s₁, t₁, ..., s_N, t_N) ∈ R^{2N}

(cf. 3.1.7). It is obvious that V₀ is a linear operator from H to H̃ and that

R_{V₀} = L{W̃(s,t)ũ : (s,t) ∈ R²}.
In view of 4.1.13 and 4.6.6, there exists V ∈ U(H, H̃) such that
V0 ⊂ V and V −1 (W̃ (s, t)ũ) = W (s, t)u, ∀(s, t) ∈ R2 .
Then we have, for all (s′, t′) ∈ R²,

V W(s′,t′) V⁻¹ (W̃(s,t)ũ) = V W(s′,t′) W(s,t) u
(2) = V e^{(i/2)(s′t − t′s)} W(s′+s, t′+t) u
= e^{(i/2)(s′t − t′s)} W̃(s′+s, t′+t) ũ
(3) = W̃(s′,t′)(W̃(s,t)ũ), ∀(s,t) ∈ R²

(2 and 3 hold true in view of 20.2.2a), and hence by linearity

V W(s′,t′) V⁻¹ f̃ = W̃(s′,t′) f̃, ∀f̃ ∈ L{W̃(s,t)ũ : (s,t) ∈ R²},

and hence

V W(s′,t′) V⁻¹ = W̃(s′,t′)

in view of 4.2.6. Therefore we have

V U^A(t) V⁻¹ = V W(0,t) V⁻¹ = W̃(0,t) = U^Ã(t), ∀t ∈ R,

and

V U^B(s) V⁻¹ = V W(s,0) V⁻¹ = W̃(s,0) = U^B̃(s), ∀s ∈ R.

By 16.3.1, these conditions are equivalent to

V A V⁻¹ = Ã and V B V⁻¹ = B̃.
b: In view of 20.2.3e, we prove statement b by proving that the orthogonal projection T defined as in 20.2.3 with A := Q and B := P (where Q and P are the operators discussed in 20.1.7) is such that the orthogonal dimension of the subspace R_T is one. In what follows, for simplicity we do not distinguish between the symbol f for an element of 𝓛²(R) and the symbol [f] for the element of L²(R) that contains f.
For all (s,t) ∈ R² and all g ∈ L²(R) we have

(W(s,t)g)(x) = e^{−(i/2)st} (g^t)_{−s}(x) = e^{it(x + s/2)} g(x + s), ∀x ∈ D_g − s.
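With this explicit formula, the Weyl relation of 20.2.2a becomes an exact pointwise identity, which can be verified on a grid; this is only a sketch, and the test function is an arbitrary choice with D_g = R.

```python
import numpy as np

def W(s, t, g):
    # (W(s,t)g)(x) = e^{it(x+s/2)} g(x+s)
    return lambda x: np.exp(1j * t * (x + s / 2)) * g(x + s)

g = lambda x: np.exp(-x**2) * (1 + 1j * x)  # arbitrary test function defined on all of R
x = np.linspace(-5.0, 5.0, 1001)
s, t, sp, tp = 0.7, -0.4, -1.3, 0.9

# W(s',t') W(s,t) = e^{(i/2)(s't - t's)} W(s'+s, t'+t), cf. 20.2.2a
lhs = W(sp, tp, W(s, t, g))(x)
rhs = np.exp(0.5j * (sp * t - tp * s)) * W(sp + s, tp + t, g)(x)
assert np.allclose(lhs, rhs)
```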
Now we fix f, g ∈ L²(R), and suppose that the representative g ∈ 𝓛²(R) is such that D_g = R (cf. 8.2.12). We have

∫_R |f(x)(W(s,t)g)(x)| dm(x) ≤ ‖f‖ ‖W(s,t)g‖ = ‖f‖ ‖g‖, ∀(s,t) ∈ R²,

by the Schwarz inequality (cf. 10.1.9) for the elements |f| and |W(s,t)g| of L²(R), and hence

∫_{R²} γ(s,t) (∫_R |f(x)(W(s,t)g)(x)| dm(x)) dm₂(s,t) ≤ ‖f‖ ‖g‖ ∫_{R²} γ(s,t) dm₂(s,t) < ∞.
Then, by Tonelli's theorem (cf. 8.4.8) followed by Fubini's theorem (cf. 8.4.10c with μ₁ := m₂ and μ₂ := m) we have

(f | T g) = ∫_{R²} γ(s,t) (∫_R f̄(x)(W(s,t)g)(x) dm(x)) dm₂(s,t)
= ∫_R f̄(x) (∫_{R²} γ(s,t)(W(s,t)g)(x) dm₂(s,t)) dm(x).
or

{W(s,t)u_i : (s,t) ∈ R²} ⊂ {W(s,t)u_k : (s,t) ∈ R²}⊥,

and hence

M_k = {W(s,t)u_k : (s,t) ∈ R²}⊥⊥ ⊂ {W(s,t)u_i : (s,t) ∈ R²}⊥ = M_i⊥
(cf. 10.4.4b, 10.2.10b, 10.2.11). For each n ∈ I, M_n is a reducing subspace for A and B (cf. 20.2.2d) and hence for the operator W(s,t) as well, for all (s,t) ∈ R² (cf. 20.2.2c; the operator W(s,t) is defined by A, B as in 20.2.2). Then it is obvious that the subspace ∑⊕_{n∈I} M_n is an invariant subspace for the operator W(s,t), for all (s,t) ∈ R². In view of 17.2.9 and 20.2.2b, this implies that ∑⊕_{n∈I} M_n is a reducing subspace for W(s,t) for all (s,t) ∈ R², and hence also for A and B (cf. 20.2.2c). Therefore, the subspace

M₀ := (∑⊕_{n∈I} M_n)⊥
(u₀ | u_n) = 0, ∀n ∈ I

(note that u_n = W(0,0)u_n ∈ M_n for all n ∈ I), we have a contradiction with the fact that {u_n}_{n∈I} is a c.o.n.s. in R_T (cf. 10.6.4). This proves that M₀ = {0_K} and hence that

∑⊕_{n∈I} M_n = K.
Now we prove that, for each n ∈ I, the pair of self-adjoint operators A^{M_n}, B^{M_n} is jointly irreducible by proving that the projection T^{(M_n)} is one-dimensional (cf. 20.2.3e). Indeed, T^{(M_n)} = T^{M_n} (cf. 20.2.3d) and hence

T^{(M_n)} f = T f = ∑_{k∈I} (u_k | f)_H u_k = (u_n | f)_H u_n = (u_n | f)_{M_n} u_n, ∀f ∈ M_n

(the second equality follows from 13.1.10), and this proves that T^{(M_n)} is a one-dimensional projection in the Hilbert space M_n (cf. 13.1.12).
20.2.5 Remarks.
(a) We stated and proved part c of 20.2.4 even though we do not use it in this book, because we thought it better to reproduce the whole content of von Neumann's article. The other parts of the Stone–von Neumann theorem play an essential role in Section 20.3.
(b) For any irreducible representation of WCR, the Hilbert space H in which it is defined is separable and of denumerable orthogonal dimension. Indeed statements a and b in 20.2.4 imply that H and L²(R) are isomorphic; moreover the Hilbert space L²(R) is separable (cf. 11.3.4) and of denumerable orthogonal dimension (cf. 11.3.3); then so is H, in view of 10.7.14.
for each pure state σ and for each sequence {gn } in G which converges to the identity
of G. When interpreted in terms of transition probabilities, this assumption follows
from the idea that the difference between the description of physical reality given
by an inertial observer O and the description given by the observer g(O) becomes
negligible when g is close enough to the identity of G.
We recall that a proposition is an event which does or does not occur in a
macroscopic device. Therefore each inertial observer describes this event in his
own way, with respect to his own frame of reference. We assume that, for an
inertial observer O and each g ∈ G, if O represents a given proposition π by an
orthogonal projection P_π in H then the inertial observer g(O) will represent the same proposition π by an orthogonal projection P_π^g which will not in general be the same as P_π, while g(O) will represent by the same projection P_π the proposition that he describes (with respect to his own frame of reference) in the same way as O describes π (with respect to O's own frame of reference). The relation between P_π and P_π^g follows from the relation obtained above between the representations of pure states given by O and by g(O). In fact, the principle of relativity implies that the probability p(π, σ) is the same for O and g(O), for all pure states σ. This implies that

P_π^g = U_g P_π U_g⁻¹,
for each proposition π. We point out that ω̃_g depends on ω_g and not on the particular element U_g of UA(H) (among those which implement ω_g) that has been used to define ω̃_g, because in U_g P U_g⁻¹ an arbitrary multiplicative factor in front of U_g is immaterial.
We consider an X-valued observable α, where (X, A) is a measurable space.
The equivalence of the descriptions of physical reality given by all inertial observers,
embodied in the principle of relativity, accounts for the assumption that all inertial
observers represent the dial of the measuring instrument that underlies α by the
same measurable space (X, A) (cf. 19.1.9a,b). Since the pointer and the dial are
macroscopic objects, the position of the pointer on the dial is described by each
inertial observer by means of his own frame of reference. Therefore, if an inertial
observer O represents a position of the pointer on the dial by a point x of X then the inertial observer g(O) (for any g ∈ G) will represent the same position by a point x_g of X which will not in general be the same as x. We assume that, for each g ∈ G, there exists a bijective measurable mapping t^α_g from X onto itself such that

x_g = t^α_g(x), ∀x ∈ X.

These mappings satisfy

t^α_{g₁} ∘ t^α_{g₂} = t^α_{g₁g₂}, ∀g₁, g₂ ∈ G.
We observe that, for E ∈ A, the symbol α(E) denotes different propositions when
it is used by different inertial observers, since it denotes the proposition determined
by the event “the pointer of the measuring instrument is in the section of the dial
represented by E”, but which section of the dial is represented by E depends on
the observer. The proposition denoted as α(E) by an inertial observer O is in fact the proposition denoted as α(t^α_g(E)) by the observer g(O), for all g ∈ G. However, O and g(O) represent the X-valued observable α by the same projection valued measure

P^α : A → P(H)
E ↦ P^α(E) := P_{α(E)}.
Indeed we assumed above that, if O represents a proposition π by a projection P_π, then g(O) represents by the same projection P_π the proposition (in general different from π) that is described by g(O) as π is described by O. Now we fix E ∈ A and consider the proposition that is denoted as α(E) by O. The representation of this
proposition given by O is P_{α(E)}. According to what we saw above, the representation of this proposition given by g(O) must be

ω̃_g(P_{α(E)}),

and it must be

P_{α(t^α_g(E))}

as well. Thus, consistency requires that

U_g P_{α(E)} U_g⁻¹ = ω̃_g(P_{α(E)}) = P_{α(t^α_g(E))}, ∀E ∈ A, ∀g ∈ G,
where U_g is an implementation of g. This condition is called a Galilei-covariance relation and the X-valued observable α is said to be Galilei-covariant. If the relation above is written for a subgroup G₀ of the kinematic Galilei group G, then it is called a covariance relation with respect to G₀.
It may be the case that, for a Galilei-covariant X-valued observable α, the mapping t^α_g is the identity mapping of X for all g ∈ G. This means that the representations of the positions of the pointer on the dial are the same for all inertial observers. Then the X-valued observable α is said to be Galilei-invariant and this case of the covariance condition is called a Galilei-invariance condition.
We remark that, if α is an observable (i.e., an R-observable), then the covariance condition can be written as

U_g P^{A_α}(E) U_g⁻¹ = P^{A_α}(t^α_g(E)), ∀E ∈ A(d_R), ∀g ∈ G,
Then

∃!μ ∈ R such that ϕ(x,y) = e^{iμxy}, ∀(x,y) ∈ R².

Proof. In view of 16.2.3, conditions a and b imply that there exists a function

R ∋ y ↦ a(y) ∈ R

such that

ϕ(x,y) = e^{ia(y)x}, ∀x ∈ R, ∀y ∈ R. (1)
20.3.2 Proposition. Let H be a Hilbert space which is neither a zero- nor a one-dimensional linear space.
For a mapping R² ∋ (s,v) ↦ ω_{(s,v)} ∈ Aut Ĥ, the following conditions are equivalent:

(a) the mapping R² ∋ (s,v) ↦ ω_{(s,v)} ∈ Aut Ĥ is a homomorphism from the additive group R² to the group Aut Ĥ and the following implication holds true:

[(s,v) ∈ R², {(s_n,v_n)} a sequence in R², lim_{n→∞} (s_n,v_n) = (0,0)] ⇒

(b) there exist μ ∈ R and two continuous one-parameter unitary groups U¹_μ, U²_μ in H such that

U²_μ(v) U¹_μ(s) = e^{iμsv} U¹_μ(s) U²_μ(v), ∀(s,v) ∈ R²,

and

ω_{(s,v)}([u]) = [U¹_μ(s) U²_μ(v) u], ∀u ∈ H̃, ∀(s,v) ∈ R².
(cf. 10.9.6). Since z_{s,v} is uniquely determined by this condition (recall that U¹ and U² have been fixed), we have the function

R² ∋ (s,v) ↦ ϕ(s,v) := z_{s,v} ∈ T,

which is such that

U²(v) U¹(s) = ϕ(s,v) U¹(s) U²(v), ∀(s,v) ∈ R².
We see that, for all s, s′ ∈ R and all v ∈ R,

ϕ(s,v)⁻¹ ϕ(s+s′,v) U¹(s+s′) U²(v) = ϕ(s,v)⁻¹ U²(v) U¹(s+s′)
= ϕ(s,v)⁻¹ U²(v) U¹(s) U¹(s′) = U¹(s) U²(v) U¹(s′)
= ϕ(s′,v) U¹(s) U¹(s′) U²(v) = ϕ(s′,v) U¹(s+s′) U²(v),

and hence

ϕ(s+s′,v) = ϕ(s,v) ϕ(s′,v).

Similarly we can prove that

ϕ(s,v+v′) = ϕ(s,v) ϕ(s,v′), ∀v, v′ ∈ R, ∀s ∈ R.
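The exponential ϕ(s,v) = e^{iμsv} singled out by the proposition above does satisfy both of these functional equations; a quick numerical confirmation (μ and the sample points are arbitrary):

```python
import numpy as np

mu = -0.8  # arbitrary real parameter
phi = lambda s, v: np.exp(1j * mu * s * v)

s, sp, v, vp = 1.3, -2.1, 0.4, 0.9
# Additivity in the first variable and in the second variable
assert np.isclose(phi(s + sp, v), phi(s, v) * phi(sp, v))
assert np.isclose(phi(s, v + vp), phi(s, v) * phi(s, vp))
```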
Moreover, let (s,v) ∈ R² and a sequence {(s_n,v_n)} in R² be such that (s,v) = lim_{n→∞} (s_n,v_n), and fix u ∈ H̃. By condition ug2 in 16.1.1 and by 16.4.7, we have

lim_{n→∞} U¹(s_n) U²(v_n) u = U¹(s) U²(v) u and lim_{n→∞} U²(v_n) U¹(s_n) u = U²(v) U¹(s) u,
must exhibit some connection with “external space”. Now we prove that this requisite is equivalent to the joint irreducibility of the pair of continuous one-parameter unitary groups U¹_μ, U²_μ. Indeed, suppose that requisite qp2 is fulfilled. For each orthogonal projection P in H there exists a proposition π such that P_π = P (this is the surjectivity of the mapping in 19.3.1b), and hence there exists the yes-no observable α_π, for which A_{α_π} = P (cf. 19.3.8). Moreover, in the range of the projection valued measure of the self-adjoint operator P there are only the projections O_H, 1_H, P, 1_H − P (cf. 19.3.8), and the only projections which are multiples of the identity operator are O_H and 1_H. In view of all this, requisite qp2 entails that, for P ∈ P(H), the following implications are true:

[U¹_μ(s) P U¹_μ(−s) = U²_μ(v) P U²_μ(−v) = P, ∀(s,v) ∈ R²] ⇒
[U¹_μ(s) U²_μ(v) P U²_μ(−v) U¹_μ(−s) = P, ∀(s,v) ∈ R²] ⇒ P ∈ {O_H, 1_H},

and this is the condition that the pair U¹_μ, U²_μ is jointly irreducible (cf. 17.3.1).
Conversely, suppose that the pair U¹_μ, U²_μ is jointly irreducible and that an observable α is Galilei-invariant. Then we have

U¹_μ(s) P_{α(E)} U¹_μ(−s) = U²_μ(v) P_{α(E)} U²_μ(−v) = P_{α(E)}, ∀E ∈ A(d_R), ∀(s,v) ∈ R²,

and hence, by the irreducibility of the pair U¹_μ, U²_μ,

P^{A_α}(E) = P_{α(E)} ∈ {O_H, 1_H}, ∀E ∈ A(d_R),

and hence, by 17.3.2,

∃λ ∈ R such that A_α = λ1_H.

Thus, requisite qp2 is fulfilled.
In view of the discussion above we assume that, if the homomorphism from R² to Aut Ĥ of the quantum-particle model is implemented by a real number μ and a pair of continuous one-parameter unitary groups U¹_μ, U²_μ as in 7 and 8, then this pair is jointly irreducible. The next proposition proves that this implies μ ≠ 0.
20.3.3 Proposition. Let μ ∈ R and let an irreducible pair U¹_μ, U²_μ of continuous one-parameter unitary groups in a Hilbert space H be such that

U²_μ(v) U¹_μ(s) = e^{iμsv} U¹_μ(s) U²_μ(v), ∀(s,v) ∈ R².

If H is neither a zero- nor a one-dimensional linear space then μ ≠ 0.

Proof. The proof is by contraposition. Since the pair U¹_μ, U²_μ is jointly irreducible, the following implication holds true:

[B ∈ B(H) and [B, U¹_μ(s)] = [B, U²_μ(v)] = O_H, ∀(s,v) ∈ R²] ⇒ [∃α ∈ C such that B = α1_H]

(cf. 17.3.5). Now suppose μ = 0. Then, for all (s,v) ∈ R², U¹_μ(s) and U²_μ(v) satisfy the first condition for B above, and hence U¹_μ(s) and U²_μ(v) are multiples of 1_H. Since the pair U¹_μ, U²_μ is jointly irreducible, this implies that H is either a zero- or a one-dimensional linear space.
Thus, requisites qp1 and qp2 are fulfilled if we have a non-zero real number μ and an irreducible pair of continuous one-parameter unitary groups U¹_μ, U²_μ with property 7. The next proposition proves that an irreducible pair of continuous one-parameter unitary groups with this property does exist, for each μ ≠ 0.
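A concrete pair with property 7 can be sketched as follows: take U¹ to act by translation and U²_μ by multiplication by e^{iμvx}. This is essentially the Schrödinger representation of 20.1.7 in disguise; the realization below is an illustration under that assumption, not the construction used in the text.

```python
import numpy as np

mu = 2.5  # any non-zero real number

# U1(s): translation by s; U2(v): multiplication by e^{i*mu*v*x}
U1 = lambda s: (lambda f: (lambda x: f(x - s)))
U2 = lambda v: (lambda f: (lambda x: np.exp(1j * mu * v * x) * f(x)))

f = lambda x: np.exp(-x**2) * (x + 2j)  # arbitrary test function
x = np.linspace(-4.0, 4.0, 801)
s, v = 0.6, -1.1

# Check the commutation relation U2(v) U1(s) = e^{i*mu*s*v} U1(s) U2(v)
lhs = U2(v)(U1(s)(f))(x)
rhs = np.exp(1j * mu * s * v) * U1(s)(U2(v)(f))(x)
assert np.allclose(lhs, rhs)
```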
and of g_{(0,v)}(O) are in the same place). Well, this is exactly what would happen if
the system were a classical particle and its position were being measured (by means
of detectors suitable for classical particles). By this analogy with the classical case,
the quantum observable described above is given the name of “position” and is
denoted by q.
We imagine the observable “momentum” of a quantum particle in one dimension
as the abstract representation of a pair of detectors which are placed, each time a
measurement is made, on either side of the apparatus that prepares a copy of the
system; no forces act on these detectors and therefore they move with constant ve-
locities with respect to all inertial observers before a copy has been prepared; more-
over, they are so that one and only one of them reacts by changing its velocity after
a copy of the system has been prepared (this too is one of the particle-like aspects
of a quantum particle). Now suppose that a copy has been prepared. Then each
inertial observer assigns, as result to “his” observable “momentum”, the difference
between the values of the momentum of the detector that has reacted, measured
by him (with respect to his own frame of reference) before and after the reaction
(the detectors are classical objects and therefore each of them has a momentum at
all times, in the classical sense). If an inertial observer O assigns the result y to “his” observable “momentum”, then on the basis of the same reaction the inertial observer g_{(s,0)}(O) (for any s ∈ R) will assign the same result to “his” observable “momentum” (the frames of reference of O and of g_{(s,0)}(O) are stationary with respect to each other), while the inertial observer g_{(0,v)}(O) (for any v ∈ R) will assign a different result. The idea that the particle “has mass m” is supported by the experimental evidence that the result assigned by g_{(0,v)}(O) is y + mv, where m is a positive number independent of v. Well, this is exactly what would happen if the system were a classical particle of mass m and its momentum were being measured (by techniques suitable for classical particles). By this analogy with the classical case, the quantum particle is said to “have mass m” and the quantum observable described above is given the name of “momentum” and is denoted by p.
These observations give the transformations t^q_{(s,0)}, t^q_{(0,v)}, t^p_{(s,0)}, t^p_{(0,v)} (for all s, v ∈ R) to be used in the covariance conditions for the observables q and p with respect to the subgroups S and V of the kinematic Galilei group G (and hence, with respect to any other element of G). They are:

t^q_{(s,0)}(x) = x + s, ∀x ∈ R, ∀s ∈ R;
t^q_{(0,v)}(x) = x, ∀x ∈ R, ∀v ∈ R;
t^p_{(s,0)}(y) = y, ∀y ∈ R, ∀s ∈ R;
t^p_{(0,v)}(y) = y + mv, ∀y ∈ R, ∀v ∈ R.
Then we see that conditions 10, 11, 12, 13 are nothing else than the covariance conditions for the observables q and p with respect to S and V, since U^B(−s) and U^A(μv) are implementations of ω_{(s,0)} and ω_{(0,v)} respectively (cf. 9).
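The four transformations listed above compose as required by the relation t^α_{g₁} ∘ t^α_{g₂} = t^α_{g₁g₂}; a minimal sketch, assuming the additive composition law (s₁,v₁)(s₂,v₂) = (s₁+s₂, v₁+v₂) of the kinematic group behind the R² homomorphism used above:

```python
m = 1.7  # mass, arbitrary positive value

# Readings transform as above: position picks up the translation, momentum picks up m*v
t_q = lambda s, v: (lambda x: x + s)
t_p = lambda s, v: (lambda y: y + m * v)

g1, g2 = (0.4, -1.2), (2.0, 0.7)
g12 = (g1[0] + g2[0], g1[1] + g2[1])  # assumed group law of the kinematic Galilei group

x0, y0 = 3.3, -0.8
assert abs(t_q(*g1)(t_q(*g2)(x0)) - t_q(*g12)(x0)) < 1e-12
assert abs(t_p(*g1)(t_p(*g2)(y0)) - t_p(*g12)(y0)) < 1e-12
```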
The outcome of the discussion above is that the structure of the quantum particle model for a definite mass m is equivalent to the structure made up by a pair of self-adjoint operators A, B which are an irreducible representation of WCR, together with a non-zero real number μ and a pair of self-adjoint operators A_q and A_p which satisfy conditions 10, 11, 12, 13 with the pair A, B. The operators A_q and A_p represent the observables position and momentum of the quantum particle, while the pair A, B and the number μ are related as in 9 to the homomorphism from R² to Aut Ĥ that represents the action of the kinematic Galilei group in the quantum particle model.
The question of existence and uniqueness of implementations of these structures
will be addressed on the basis of the next proposition.
(D) Let μ₁, μ₂ ∈ R − {0} and suppose μ₁ ≠ ±μ₂. Then, there does not exist any unitary or antiunitary operator V in H such that

V A V⁻¹ = A and V(μ₂⁻¹ m B)V⁻¹ = μ₁⁻¹ m B.

Now, it seems that not only do we have pairs which fit our scheme, but we have too many of them: what value of μ and which pair (A + k₁1_H, μ⁻¹mB + k₂1_H) should be used to represent a quantum particle of mass m?
For a fixed value of μ ∈ R − {0}, 20.3.5b1 shows that all the pairs related to that value of μ are unitarily equivalent to each other. If we transform, by means of a unitary operator, a pair related to a value of μ to another related to the same value, perhaps we want to transform the operators U^B(−s) and U^A(μv) as well, since they are implementations of the automorphisms ω_{(s,0)} and ω_{(0,v)} respectively. Then 20.3.5b2 shows that these operators get just multiplied by factors in T, and hence in the new representation the same automorphisms ω_{(s,0)} and ω_{(0,v)} are implemented as in the old one. In view of all this and of 19.3.23, we consider two pairs with the same value of μ to be equivalent for the description of position and momentum of a quantum particle of mass m.
For a fixed value of μ ∈ R − {0}, 20.3.5b1, c1 show that all the pairs defined by a value of μ are antiunitarily equivalent to all the pairs defined by the opposite value. If we transform, by means of an antiunitary operator, a pair defined by a value of μ into another defined by the opposite value, perhaps also in this case we want to transform the operators U^B(−s) and U^A(μv). Then 20.3.5b2, c2 show that these operators, besides being multiplied by inessential multiplicative factors in T, get changed into U^B(−s) and U^A(−μv); now, these operators implement the automorphisms ω_{(s,0)} and ω_{(0,−v)}. Thus it appears that, in the new representation, the direction of the flow of time has been reversed. However, since we do not want to study time evolution, in view of 19.3.23 we consider pairs defined by opposite values of μ to be equivalent.
Finally, 20.3.5D (together with 20.3.5b1, c1) shows that, if μ₁, μ₂ ∈ R − {0} are such that μ₁ ≠ ±μ₂, then no pair defined by μ₂ is either unitarily or antiunitarily equivalent to any pair defined by μ₁.
In view of all this, for a given irreducible representation A, B of WCR, we need only consider the pairs
but we must consider all of them. For each μ > 0, they implement in inequivalent ways our quantum particle model of mass m, with the assignments
and with the kinematic Galilei group represented by the automorphism of Ĥ defined by
In addition, we recall that the Stone–von Neumann uniqueness theorem (cf. 20.2.4a) implies that, if a pair Ã, B̃ is a different irreducible representation of WCR, then for each μ ∈ R − {0} the pair (Ã, μ⁻¹mB̃) is unitarily equivalent to the pair (A, μ⁻¹mB), and so is the pair U^Ã, U^B̃ to the pair U^A, U^B. Thus, nothing is gained by considering irreducible representations of WCR different from A, B.
Since the quantum models defined by different positive values of μ are not unitarily or antiunitarily equivalent, the question is now what value of μ should be used to represent a quantum particle of mass m. Mathematical reasoning cannot help us here, and in fact we must turn to experimental outcomes. Indeed suppose that, for a definite positive value of μ, we have the representation

μ := ℏ⁻¹m,

where ℏ := (2π)⁻¹h and h is Planck's constant. Thus, also on the basis of experimental physics, the quantum particle model of mass m is given by

A_q := A,
A_p := ℏB,
ω_{(s,v)}([u]) := [U^B(−s) U^A(ℏ⁻¹mv) u], ∀u ∈ H̃, ∀(s,v) ∈ R².
20.3.6 Remarks.
(a) The discussion above shows that the Hilbert space, in which a non-relativistic quantum particle without internal degrees of freedom is represented, is necessarily separable and of denumerable dimension.
(b) In the representation of a quantum particle of mass m obtained above, the value m of the mass does not have a role in the operators A_q and A_p which represent the observables position and momentum. However it does in the implementation of the homomorphism from R² to Aut Ĥ which represents the kinematic Galilei group. On the basis of 20.3.5D it is easy to see that implementations related to different values of m are not unitarily or antiunitarily equivalent.
(c) Historically, the first mathematical representation of a quantum particle of mass m was obtained in what is now called the Schrödinger representation of WCR. In this representation we have

H := L²(R), A := Q, B := P,

where Q and P are the operators discussed in 20.1.7, and hence

A_q := Q,
A_p := ℏP,
ω_{(s,v)}([f]) := [U^P(−s) U^Q(ℏ⁻¹mv) f],
for each ray [f] in L²(R) and each (s,v) ∈ R²

(here, for f ∈ 𝓛²(R), the element [f] of L²(R) is denoted by the same symbol f; here, for a unit vector f of L²(R), [f] denotes the ray that contains f). More explicitly, for all f ∈ L²(R) and all (s,v) ∈ R², we have (assuming for simplicity D_f = R, cf. 8.2.12)

(U^P(−s) U^Q(ℏ⁻¹mv) f)(x) = e^{iℏ⁻¹mv(x−s)} f(x−s), ∀x ∈ R

(cf. 20.1.7).
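The explicit formula can be checked against the composition of the two groups, with U^Q(a) acting as multiplication by e^{iax} and U^P(t) as translation x ↦ x + t (the conventions assumed in this sketch; toy values stand in for ℏ and m):

```python
import numpy as np

hbar, m = 1.0, 2.0  # toy values in place of the physical constants

UQ = lambda a: (lambda f: (lambda x: np.exp(1j * a * x) * f(x)))  # multiplication group
UP = lambda t: (lambda f: (lambda x: f(x + t)))                   # translation group

f = lambda x: np.exp(-x**2 / 2)  # arbitrary test function
x = np.linspace(-5.0, 5.0, 1001)
s, v = 0.8, 1.5

# (U^P(-s) U^Q(m v / hbar) f)(x) should equal e^{i m v (x-s)/hbar} f(x-s)
lhs = UP(-s)(UQ(m * v / hbar)(f))(x)
rhs = np.exp(1j * m * v / hbar * (x - s)) * f(x - s)
assert np.allclose(lhs, rhs)
```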
If a pure state σ is represented by a ray [f_σ] in L²(R), it is possible to put a direct statistical interpretation on the function |f_σ|². In fact, from 15.3.4A and from Section 14.5 we see that

P_{q(E)} f_σ = P^Q(E) f_σ = χ_E f_σ, ∀E ∈ A(d_R),

and hence

p(q(E), σ) = (f_σ | P_{q(E)} f_σ) = ∫_R χ_E |f_σ|² dm, ∀E ∈ A(d_R).
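For instance, with a normalized Gaussian wave function (an arbitrary sample state, not one singled out by the text) the probability of finding the position in E = [0,1] is ∫_E |f_σ|² dm = erf(1)/2, which a direct numerical integration reproduces:

```python
import numpy as np
from math import erf

f_sigma = lambda x: np.pi ** (-0.25) * np.exp(-x**2 / 2)  # sample normalized wave function

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
density = np.abs(f_sigma(x)) ** 2  # |f_sigma|^2 = pi^{-1/2} e^{-x^2}

E = (x >= 0.0) & (x <= 1.0)  # the Borel set E = [0, 1]
p = np.sum(density[E]) * dx  # p(q(E), sigma)

assert abs(p - 0.5 * erf(1.0)) < 1e-3
```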
and hence

p(p(E), σ) = (f_σ | P_{p(E)} f_σ) = (f̃_σ | P^Q(ℏ⁻¹E) f̃_σ) = ∫_R χ_{ℏ⁻¹E} |f̃_σ|² dm, ∀E ∈ A(d_R),
Bibliography
Apostol, T. M. (1974). Mathematical Analysis, 2nd edn. (Addison-Wesley Publishing Company, Reading).
Bargmann, V. (1954). On the Unitary Ray Representations of Continuous Groups (Annals of Mathematics 59), pp. 1–46.
Bargmann, V. (1964). Note on Wigner's Theorem on Symmetry Operations (Journal of Mathematical Physics 5), pp. 862–868.
Berberian, S. K. (1999). Fundamentals of Real Analysis (Springer, New York).
Dirac, P. A. M. (1958, 1947, 1935, 1930). The Principles of Quantum Mechanics (Clarendon Press, Oxford).
Greenberg, M. J. and Harper, J. R. (1981). Algebraic Topology: A First Course (Addison-Wesley Publishing Company, Redwood City, California).
Heisenberg, W. (1925). Über Quantentheoretische Umdeutung Kinematischer und Mechanischer Beziehungen (Zeitschr. f. Phys. 33), pp. 879–893.
Hewitt, E. and Stromberg, K. (1965). Real and Abstract Analysis (Springer-Verlag, New York).
Hilbert, D., Neumann, J. v., and Nordheim, L. (1927). Über die Grundlagen der Quantenmechanik (Mathematische Annalen 98(1)), pp. 1–30.
Holevo, A. S. (1982). Probabilistic and Statistical Aspects of Quantum Theory (North-Holland Publishing Company, Amsterdam); second English edition published by Scuola Normale Superiore, Pisa, 2011.
Horn, R. A. and Johnson, C. R. (2013). Matrix Analysis, 2nd edn. (Cambridge University Press).
Jauch, J. M. (1968). Foundations of Quantum Mechanics (Addison-Wesley Publishing Company, Reading, Massachusetts).
Jordan, P. (1926). Über Kanonische Transformationen in der Quantenmechanik (Zeitschr. f. Phys. 37), pp. 383–386.
Mackey, G. W. (1978). Unitary Group Representations in Physics, Probability, and Number Theory (The Benjamin/Cummings Publishing Company, Reading, Massachusetts).
Munkres, J. R. (1991). Analysis on Manifolds (Addison-Wesley Publishing Company, Redwood City, California).
Parthasarathy, K. R. (2005). Introduction to Probability and Measure (Hindustan Book Agency (India), New Delhi).
Pauli, W. (1933). Die Allgemeinen Prinzipien der Wellenmechanik (Handbuch der Physik 24), pp. 83–272.
Reed, M. and Simon, B. (1980, 1972). Methods of Modern Mathematical Physics I: Functional Analysis (Academic Press, New York).