Quantum Mechanics
Franco Gallone
Università degli Studi di Milano, Italy
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.
ISBN 978-981-4635-83-7
Printed in Singapore
Preface
The subjects of this book are the mathematical foundations of non-relativistic quan-
tum mechanics and the mathematical theory they require. In its mathematical part,
this book aims at expounding in a complete and self-contained way the mathemati-
cal basis for “mathematical” quantum mechanics, namely the branch of mathemat-
ical physics that was constructed by David Hilbert, John von Neumann and other
mathematicians, notably George Mackey, in order to systematize quantum mechan-
ics, and which was presented in book form for the first time by von Neumann in
1932 (Neumann, 1932). In von Neumann’s approach, the language of quantum
mechanics is the theory of linear operators in Hilbert space.
Von Neumann’s book was the result of work which had been done previously over
several years. Hilbert, who had been consulted on numerous aspects of quantum
mechanics since its inception, began in 1926 a systematic study of its mathemat-
ical foundations. Hilbert taught the course “Mathematical Methods of Quantum
Theory” in the academic year 1926-27, and a summary of Hilbert’s lessons was pub-
lished in the spring of 1927 by Hilbert himself and his assistants Lothar Nordheim
and von Neumann (Hilbert et al., 1927). In their view, the mathematical framework
suitable for quantum mechanics was the mathematical structure that was defined
in an abstract way and called a Hilbert space by von Neumann in 1927. Further-
more, between 1926 and 1932, von Neumann proved a number of theorems about
operators in Hilbert space which bore upon quantum mechanics (among them, the
spectral theorem for unbounded self-adjoint operators), and so did the mathemati-
cians Marshall Stone and Hermann Weyl, who had a keen interest in quantum
mechanics. Thus, the theory of linear operators in Hilbert space was actually born
as the mathematical basis for quantum mechanics.
Quantum mechanics and the theory of Hilbert space operators constitute one
of those rare examples in which there is complete correspondence between physical
and mathematical concepts (another example is Euclidean geometry). Actually, it is
one of the most stunning examples of “the unreasonable effectiveness of mathemat-
ics in the natural sciences” (E.P. Wigner). Unfortunately, this aspect of quantum
mechanics is almost completely overlooked in most quantum mechanics textbooks,
where too many subtle points are dealt with by means of mathematical shortcuts
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page viii
which not only can hardly convince a mathematically aware reader but also blot out
physical subtleties. The main reason for this is that, in the community of physicists,
Dirac’s quantum mechanics (Dirac, 1958, 1947, 1935, 1930) is far more popular
than von Neumann’s quantum mechanics, perhaps precisely because the former
requires almost no mathematics. For instance, the idea that self-adjoint operators
have a critical domain is almost completely missing in standard quantum mechanics
textbooks; however, the domain of an unbounded self-adjoint operator represents
exactly the pure states in which the fundamental statistical quantities (expected
result and uncertainty) are defined for the observable represented by that opera-
tor. This point gets hopelessly blurred in most quantum mechanics books, which
treat unbounded observables — like energy, position, momentum, orbital angular
momentum — as if they were represented by self-adjoint operators defined on the
entire space, while this is impossible on account of the Hellinger–Toeplitz theorem.
Another example is the relation existing between the physical idea of compatibility
of two observables and the mathematical idea of commutativity of the operators
that represent them; for self-adjoint operators, the right notion of commutativity
is subtler than the one usually found in quantum mechanics books and depends
on the representations of the operators as projection valued measures; however it is
exactly through this subtler notion that the physical essence of compatibility can be
really grasped. More than anything else, the real way to understand why quantum
observables are represented by self-adjoint operators is through the spectral theo-
rem, since quantum observables arise most naturally as projection valued measures,
but this is usually outside the scope of standard quantum mechanics books.
One last word about the mathematical framework for quantum mechanics pre-
sented in this book. It is undoubtedly very interesting and useful to treat quantum
mechanics in the framework of mathematical structures more general than Hilbert
space theory, especially in order to study quantum mechanics of systems with an in-
finite number of degrees of freedom. However, quantum mechanics in Hilbert space
is an enthralling subject in its own right, mainly because it is here that one can see
most clearly how the mathematical structure is linked to the physical theory in an
almost necessary way.
Most books about fundamental quantum mechanics use results in the theory
of Hilbert space operators without proving them, while most books about Hilbert
space operators do not treat quantum mechanics; moreover, they often use fairly
advanced results from other branches of mathematics assuming the reader to be
already familiar with them, but this is seldom true. The aim of this book is not
to be a complete treatise about Hilbert space operators, but to give a really self-
contained treatment of all the elements of this subject that are necessary for a
sound and mathematically accurate exposition of the principles of quantum me-
chanics; this exposition is the object of the final chapters of the book. The main
characteristic of the book is that the mathematical theory is developed only assum-
ing familiarity with elementary analysis. Moreover, all the proofs in the book are
carried out in a very detailed way. These features make the book easily accessible to
readers with only the mathematical experience offered by undergraduate education
in mathematics or in physics, and also ideal for individual study. The principles
of quantum mechanics are discussed with complete mathematical accuracy and an
effort is always made to trace them back to the experimental reality that lies at
their root. The treatment of quantum mechanics is axiomatic, with definitions fol-
lowed by propositions proved in a mathematical fashion. No previous knowledge
of quantum mechanics is required. The level of this book is intermediate between
advanced undergraduate and graduate. It is a purely theoretical book, in which no
exercises are provided.
After the first chapter, whose function is mainly to fix notation and terminology,
the first part of the book (Chapters 2–9) is devoted to an exposition of the elements
of real and abstract analysis that are needed later in the study of operators in
Hilbert space. The reason for this is to make it really self-contained and avoid
proving theorems by means of other fairly advanced theorems outside this book. In
particular, the chapter devoted to metric spaces (Chapter 2) contains results which
are not completely elementary but are necessary in order to prove (in Chapter 6) the
theorem about Borel functions that plays an essential role in proving the spectral
theorems (in Chapter 15). The chapters about measure and integration (Chapters
5–9) contain results about extensions of measures which are not to be found in
introductory books on measure theory but which are essential in order to study commuting
self-adjoint operators, and also the Riesz–Markov theorem about positive linear
functionals which plays an essential role in proving the spectral theorems. Actually,
Chapters 1–2 and 5–9 could by themselves be a short book about measure and
integration. Chapters 3 and 4 deal with that part of the theory of linear operators in
normed spaces that is used later in the study of Hilbert space operators. Moreover,
the Stone–Weierstrass approximation theorem is proved in Chapter 4; this theorem
plays an essential role in proving the spectral theorems.
The second part of this book (Chapters 10–18) is its core, and contains a treat-
ment of the theory of linear operators in Hilbert space which is particularly well
suited for the discussion of the mathematical foundations of quantum mechanics
presented later in the book. It contains the spectral theorems for unitary and for
self-adjoint operators, one-parameter unitary groups and Stone’s theorem, theo-
rems about commuting operators and invariant subspaces, trace class operators,
and also Wigner’s theorem and the real line special case of Bargmann’s theorem
about automorphisms of projective Hilbert spaces.
The theory of Hilbert space operators is the backbone of the third and final
part of the book, which consists of two chapters (19 and 20). The first of these is
by far the longest chapter in the book and endeavours to present the principles of
non-relativistic quantum mechanics in a mathematically accurate way, together with
an unstinting effort to present some possible physical reasoning behind the constructs
that are considered. Since the predictions provided by quantum mechanics are in
general statistical ones, the first part of this chapter introduces general statistical
ideas and examines how they are implemented in classical theories;
later in the chapter, the statistical aspects of quantum mechanics are compared and
contrasted with the same aspects of classical theories. The final chapter deals with
an important example of how quantum observables can arise in connection with
symmetry principles; moreover, it presents the Stone–von Neumann uniqueness
theorem about canonical commutation relations.
Although the book’s length might make it difficult to use it as a textbook for a
single course, parts of it can easily be used in that way for various courses. Here
are some concrete suggestions:
• Chapters 1, 2, 5, 6, 7, 8, 9 for a one-semester course in Real Analysis or in Measure
Theory (intermediate, could be either undergraduate or graduate, mathematics);
• Chapters 3, 4, 10, 11, 12, 13, 14, 15, 16, 17, 18 for a two-semester course in
Operators in Hilbert Space (graduate, mathematics and physics);
• Chapters 19, 20 (using without proof a large number of results from the previous
chapters) for a one-semester course in Mathematical Foundations of Quantum
Mechanics (graduate, mathematics and physics).
To make cross-referencing as easy as possible, almost every bit of this book is
marked with three numbers, the first for the chapter, the second for the section,
and the third for the position within the section. Comments also are marked in
this way, and they are called “remarks”. As already mentioned, all the proofs in
this book are written in minute detail; in them, however, previous results are always
quoted simply by means of the three-number code, without spelling them out. This
should enable experts to pursue the logic of a proof without too many diversions,
and beginners to receive all the support they might need.
I wish to thank Roberto Palazzi for the great job he did of preparing the LaTeX
files for the book, and also for useful mathematical comments.
Franco Gallone
Contents
Preface vii
2. Metric Spaces 21
2.1 Distance, convergent sequences . . . . . . . . . . . . . . . . . . . . 21
2.2 Open sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Closed sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Continuous mappings . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5 Characteristic functions of closed and of open sets . . . . . . . . . 32
2.6 Complete metric spaces . . . . . . . . . . . . . . . . . . . . . . . . 35
2.7 Product of two metric spaces . . . . . . . . . . . . . . . . . . . . . 37
2.8 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.9 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
7. Measures 151
7.1 Additive functions, premeasures, measures . . . . . . . . . . . . . . 151
7.2 Outer measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.3 Extension theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
7.4 Finite measures in metric spaces . . . . . . . . . . . . . . . . . . . 168
8. Integration 177
8.1 Integration of positive functions . . . . . . . . . . . . . . . . . . . . 177
8.2 Integration of complex functions . . . . . . . . . . . . . . . . . . . 191
8.3 Integration with respect to measures constructed from other
measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
8.4 Integration on product spaces . . . . . . . . . . . . . . . . . . . . . 210
8.5 The Riesz–Markov theorem . . . . . . . . . . . . . . . . . . . . . . 227
Bibliography 739
Index 741
Chapter 1
Most readers are likely to have a working familiarity with most of the subjects of
this introductory chapter. For them, the main function of this chapter is to fix the
notation and the terminology that will be used throughout this book and provide
ready reference inside the book.
The reader is assumed to be already familiar with the topics of this section, which
is only intended for future reference.
absolute value of a real number a coincides with |a|. Identifying a ∈ R with (a, 0)
and defining i := (0, 1), we also have (a1 , a2 ) = a1 + ia2 . When for a complex
number z we write 0 ≤ z (or 0 < z, z ≤ 0, z < 0), we mean Im z = 0 and 0 ≤ Re z
(or 0 < Re z, Re z ≤ 0, Re z < 0). More generally, outside the chapters devoted to
measure and integration, when for a symbol x we write 0 ≤ x or x ≥ 0 we mean
x ∈ [0, ∞); similarly, by 0 < x or x > 0 we mean x ∈ (0, ∞). However, in chapters
from 5 to 9 by 0 ≤ x or x ≥ 0 we mean x ∈ [0, ∞] and by 0 < x or x > 0 we mean
x ∈ (0, ∞] (i.e. we allow the case x = ∞; cf. 5.1.1).
It is always understood that the square root of a positive real number is taken
to be positive.
1.1.2 Proofs
A proposition is a statement that is either true or false (but not both). By means of
logical connectives and brackets, a new proposition can be defined starting from one
or more given propositions. We assume that the reader is familiar with the logical
connectives: “not”, “and”, “or” (“A or B” means “A or B or both”), “⇒” (if . . .
then), “⇔” (if and only if).
Given two propositions P, Q, the proposition P ⇒ Q is logically equivalent to the
proposition (not Q) ⇒ (not P ), which is called the contrapositive form of P ⇒ Q. A
proof that (not Q) ⇒ (not P ) is true is called a proof by contraposition of P ⇒ Q. The
proposition P ⇒ Q is also logically equivalent to the proposition [P and (not Q)] ⇒
[R and (not R)], for any proposition R. A proof that there is a proposition R such
that [P and (not Q)] ⇒ [R and (not R)] is true is called a proof by contradiction of
P ⇒ Q.
Suppose that, for each positive integer n, we are given a proposition Pn . From
the principle of induction it follows that, if the propositions
(a) P1 ,
(b) Pn ⇒ Pn+1 for each positive integer n
are true, then the proposition
(c) Pn is true for each positive integer n
is true. A proof that propositions a and b are true is called proof by induction of
proposition c.
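By way of illustration, the induction scheme can be exercised on a concrete family of propositions; in the following Python sketch (the choice of Pn, namely “1 + 3 + ... + (2n − 1) = n²”, is merely an example of ours and is not drawn from the text), the base case, the algebraic pattern behind the inductive step, and the conclusion are spot-checked for small n:

```python
# Illustrative example (not from the text): P_n is the proposition
# "1 + 3 + ... + (2n - 1) = n^2".
def P(n: int) -> bool:
    return sum(2 * k - 1 for k in range(1, n + 1)) == n * n

assert P(1)  # (a) the base case P_1 is true

# (b) if P_n is true, adding the next odd number 2n + 1 yields
# n^2 + 2n + 1 = (n + 1)^2, so P_{n+1} is true; we spot-check the
# algebraic identity behind the inductive step:
assert all(n * n + (2 * n + 1) == (n + 1) ** 2 for n in range(1, 100))

# (c) hence P_n is true for each positive integer n; spot-check:
assert all(P(n) for n in range(1, 100))
```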
Often, for a proposition P , we will write “P ” instead of “P is true” or “P holds”.
Propositions will be written in a rather informal style, mixing logical symbols and
ordinary language.
The symbols “∀”, “∃”, “∈” are often used collectively: instead of writing
“∃x ∈ S, ∃y ∈ S” or “∀x ∈ S, ∀y ∈ S”
one often writes
“∃x, y ∈ S” or “∀x, y ∈ S”.
The expressions “∀x ∈ S” and “for x ∈ S” are regarded as equivalent.
When n ∈ N, “for k ∈ {1, ..., n}” is often written as “for k = 1, ..., n”.
In definitions, “if” means “if and only if”.
When, for a symbol x, we write “∃x ≥ 0”, or “∃x > 0”, or “∀x ≥ 0”, or “∀x > 0”,
we mean “∃x ∈ [0, ∞)”, or “∃x ∈ (0, ∞)”, or “∀x ∈ [0, ∞)”, or “∀x ∈ (0, ∞)”
respectively if we are not in chapters from 5 to 9; in chapters from 5 to 9, we mean
“∃x ∈ [0, ∞]”, or “∃x ∈ (0, ∞]”, or “∀x ∈ [0, ∞]”, or “∀x ∈ (0, ∞]” respectively (cf.
5.1.1).
If I = {1, ..., N } or I := N, we will often write “∑_{n∈I}” for “∑_{n=1}^{N}” or for
“∑_{n=1}^{∞}”.
1.1.4 Sets
The words family and collection will be used synonymously with set, e.g. in order
to avoid phrases like “set of sets”.
The empty set is denoted by ∅, and the family of all subsets of a set X is denoted
by P(X). If X is a set and if, for each x ∈ X, P (x) is a proposition involving x,
then
{x ∈ X : P (x)}
denotes the set of all elements x of X for which P (x) is true.
{a, b, c, ...} denotes the set which contains the elements that are listed, and {x}
denotes the set which contains just x (such a set is called a singleton set).
For two subsets S1 , S2 of a set X, we use the following symbols:
Symbol      Meaning
S1 ⊂ S2     x ∈ S1 ⇒ x ∈ S2
            (S1 is said to be a subset of S2 or to be contained in S2 )
S2 ⊃ S1     S1 ⊂ S2
S1 ⊄ S2     not (S1 ⊂ S2 ), i.e. ∃x ∈ S1 s.t. x ∉ S2
S1 = S2     (S1 ⊂ S2 ) and (S2 ⊂ S1 ), i.e. x ∈ S1 ⇔ x ∈ S2
S1 ≠ S2     not (S1 = S2 ), i.e. (S1 ⊄ S2 ) or (S2 ⊄ S1 )
We have
S1 ∩ S2 = ∅ ⇔ S1 ⊂ X − S2 ,
S2 − S1 = S2 ∩ (X − S1 );
if moreover S1 ⊂ S2 , then we also have
(S2 − S1 ) ∪ S1 = (S2 ∩ (X − S1 )) ∪ S1 = S2 ∪ S1 ,
S2 − (S2 − S1 ) = S1 ,
and this implies
X − S1 = X − (S2 ∩ (X − (S2 − S1 ))) = (X − S2 ) ∪ (S2 − S1 ).
Then, for three subsets S1 , S2 , S3 of X such that S1 ⊂ S2 ⊂ S3 we have
S3 − S1 = S3 ∩ ((X − S2 ) ∪ (S2 − S1 )) = (S3 − S2 ) ∪ (S2 − S1 ).
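The displayed identities lend themselves to mechanical verification; the following Python sketch (the particular sets are our own example) checks the final identity and the auxiliary facts used to derive it, for nested subsets S1 ⊂ S2 ⊂ S3 of an ambient set X:

```python
# Our example: nested subsets S1 ⊂ S2 ⊂ S3 of an ambient set X.
S1 = {1}
S2 = {1, 2, 3}
S3 = {1, 2, 3, 4, 5}
X = S3 | {9}
assert S1 <= S2 <= S3          # S1 ⊂ S2 ⊂ S3

# identities quoted in the derivation (the middle two need S1 ⊂ S2):
assert S2 - S1 == S2 & (X - S1)
assert (S2 - S1) | S1 == S2
assert S2 - (S2 - S1) == S1

# the final identity:
assert S3 - S1 == (S3 - S2) | (S2 - S1)
```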
1.1.5 Relations
If X and Y are sets, the cartesian product of X and Y , written X × Y , is the set of
all ordered pairs (x, y) with x ∈ X and y ∈ Y .
A relation in a non-empty set X is a subset R of X × X. If (x, y) ∈ R, we write
xRy and say that x is related by R to y. If S is a subset of X, then R ∩ (S × S) is
a relation in S which is called the relation induced by R in S.
A relation R in a set X is said to be an equivalence relation if it has the following
three properties:
(er1 ) xRx, ∀x ∈ X (R is reflexive);
(er2 ) xRy ⇒ yRx (R is symmetric);
(er3 ) (xRy and yRz) ⇒ xRz (R is transitive).
A symbol often used for an equivalence relation is “∼”.
Let X be a set equipped with an equivalence relation ∼ and let x ∈ X. The
equivalence class of x for ∼ is the set
[x] := {y ∈ X : y ∼ x},
and any element of [x] is called a representative of [x].
The following facts can be easily proved:
(a) x ∈ [x], ∀x ∈ X; thus, every equivalence class is nonempty and X = ∪x∈X [x];
(b) either [x] = [y] or [x] ∩ [y] = ∅ (but not both), ∀x, y ∈ X;
(c) [x] = [y] ⇔ x ∼ y;
we notice that, by assertion b, the contrapositive form of statement c is
[x] ∩ [y] = ∅ ⇔ not (x ∼ y).
A partition of a set X is a family F of subsets of X which has the following three
properties:
(pa1 ) S ≠ ∅, ∀S ∈ F ;
(pa2 ) (S1 , S2 ∈ F , S1 ≠ S2 ) ⇒ S1 ∩ S2 = ∅;
(pa3 ) ∪S∈F S = X.
Thus, if X is a non-empty set equipped with an equivalence relation, the family of
equivalence classes constitutes a partition of X. Conversely, it is straightforward to
prove that, if F is a partition of a non-empty set X, then the set
R := {(x, y) ∈ X × X : ∃S ∈ F such that x ∈ S and y ∈ S}
is an equivalence relation in X and F is the family of equivalence classes defined by
R.
If ∼ is an equivalence relation in a non-empty set X, the family of equivalence
classes defined by ∼ is called the quotient set of X by the relation ∼ and is denoted
by X/ ∼.
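The correspondence between equivalence relations and partitions can be made concrete; in the following Python sketch (the relation, congruence modulo 3 on {0, ..., 9}, is our own example), the equivalence classes are computed and the partition properties pa1–pa3 are verified:

```python
# Our example: x ~ y iff x - y is divisible by 3, on X = {0, ..., 9}.
X = set(range(10))

def equiv(x, y):
    return (x - y) % 3 == 0

def cls(x):
    """The equivalence class [x] = {y in X : y ~ x}."""
    return frozenset(y for y in X if equiv(y, x))

quotient = {cls(x) for x in X}    # the quotient set X / ~

assert all(S for S in quotient)                   # (pa1) no class is empty
assert all(S == T or not (S & T)                  # (pa2) distinct classes
           for S in quotient for T in quotient)   #       are disjoint
assert frozenset().union(*quotient) == X          # (pa3) the classes cover X

# fact (c): [x] = [y] if and only if x ~ y
assert all((cls(x) == cls(y)) == equiv(x, y) for x in X for y in X)
```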
A relation R in a non-empty set X is said to be a partial ordering if it has the
following three properties:
(po1 ) xRx, ∀x ∈ X (R is reflexive);
(po2 ) (xRy and yRx) ⇒ x = y (R is antisymmetric);
(po3 ) (xRy and yRz) ⇒ xRz (R is transitive).
A partial ordering is called a total ordering if it has the following further property:
(po4 ) (xRy or yRx), ∀x, y ∈ X.
A symbol often used for a partial ordering is “≤”. A partially ordered set is a pair
(X, ≤), where X is a non-empty set and ≤ is a partial ordering in X.
Let (X, ≤) be a partially ordered set, S a non-empty subset of X, and x a point
of X; the following terms are used:
x is called an upper bound for S if y ≤ x for each y ∈ S;
x is called a lower bound for S if x ≤ y for each y ∈ S;
x is called a least upper bound (l.u.b.) for S if x is an upper bound for S and if,
for every upper bound x′ for S, we have x ≤ x′ ; if a l.u.b. for S exists, then (as can
be readily seen) it is the unique l.u.b. for S and is denoted by sup S; if the l.u.b.
of S exists and it is an element of S, we write max S := sup S;
x is called a greatest lower bound (g.l.b.) for S if x is a lower bound for S and
if, for every lower bound x′ for S, we have x′ ≤ x; if a g.l.b. for S exists, then it is
the unique g.l.b. for S and is denoted by inf S; if the g.l.b. of S exists and it is an
element of S, we write min S := inf S.
In the family P(X) of all subsets of a set X, a relation R is defined by letting
R := {(S1 , S2 ) ∈ P(X) × P(X) : S1 ⊂ S2 }.
For S1 , S2 ∈ P(X), one writes S1 RS2 directly as S1 ⊂ S2 . This relation is a partial
ordering and, for a non-empty subfamily F ⊂ P(X), both sup F and inf F exist
and in fact
sup F = ∪S∈F S and inf F = ∩S∈F S.
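These formulas for sup F and inf F in (P(X), ⊂) can be verified directly on a small family; the sets below are our own example:

```python
from itertools import chain, combinations

# Our example: X = {1, 2, 3, 4} and the family F = {{1, 2}, {2, 3}}.
X = {1, 2, 3, 4}
F = [frozenset({1, 2}), frozenset({2, 3})]

def powerset(A):
    """All subsets of A, as frozensets (the family P(A))."""
    A = list(A)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(A, r) for r in range(len(A) + 1))]

union = frozenset.union(*F)
inter = frozenset.intersection(*F)

# sup F is the union: an upper bound, contained in every upper bound.
assert all(S <= union for S in F)
assert all(union <= U for U in powerset(X) if all(S <= U for S in F))

# inf F is the intersection: a lower bound, containing every lower bound.
assert all(inter <= S for S in F)
assert all(L <= inter for L in powerset(X) if all(L <= S for S in F))
```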
1.2 Mappings
In this section we give a methodical treatment of the subject, since some of the
concepts which are contained in this section might not be entirely familiar to all
readers. Indeed, for two sets X and Y , we consider mappings from X to Y which
are defined on any subset of X. This foreshadows what will happen in the study of
linear operators in Hilbert space, where we use the definitions, notations and results
of this section.
The choice of what sets to use as initial and final sets is often made on the grounds
of particular properties they possess, or in order to have a common playground for
several mappings.
Mappings are sometimes given different names. A mapping is also called a map
or a function, and we will use the latter name especially when the final set is C or
R∗ (cf. 5.1.1), or some subset of them. When the final set is C (or R) we sometimes
say that the mapping is a complex (or a real ) function. A mapping from a cartesian
product of two sets to one of them is occasionally called a binary operation. A
mapping ϕ : N → X, where X is a non-empty set, is called a sequence in X and is
denoted by the symbol {xn }, where xn := ϕ(n). Sometimes, given a non-empty set
X and a non-empty set I which for psychological reasons we like to think of as a
set of indices, the range of a mapping ϕ : I → X is denoted by the symbol {xi }i∈I ,
where xi := ϕ(i), and is referred to as a family of elements of X indexed by the set
I. If a family F of subsets of a set X is obtained in this way, i.e. if F = {Si }i∈I ,
the union and the intersection of the elements of F are usually written as follows:
∪i∈I Si and ∩i∈I Si . If I := {1, ..., n} or I := N, “∪i∈I ” and “∩i∈I ” are written as
“∪_{i=1}^{n}” and “∩_{i=1}^{n}”, or “∪_{i=1}^{∞}” and “∩_{i=1}^{∞}”, respectively.
We can now formalize better the concept of cartesian product, which we have
already introduced for two sets. Let {X1 , X2 , ..., Xn } be a finite family of sets. If
Xi ≠ ∅ for i = 1, 2, ..., n, then the cartesian product X1 × X2 × · · · × Xn is defined
to be the set of all mappings ϕ : {1, 2, ..., n} → ∪_{i=1}^{n} Xi such that ϕ(i) ∈ Xi for
i = 1, 2, ..., n; if there is i such that Xi = ∅, then X1 × X2 × · · · × Xn := ∅. If Xi ≠ ∅
for i = 1, 2, ..., n, an element ϕ of X1 × X2 × · · · × Xn is called an ordered n-tuple, or
simply an n-tuple, and is denoted by the symbol (x1 , x2 , ..., xn ), where xi := ϕ(i). If
Ei ⊂ Xi for i = 1, 2, ..., n, then E1 × E2 × · · · × En is a subset of X1 × X2 × · · · × Xn ,
and
(X1 × X2 × · · · × Xn ) − (E1 × E2 × · · · × En )
= ∪ni=1 (X1 × · · · × Xi−1 × (Xi − Ei ) × Xi+1 × · · · × Xn );
if Fi ⊂ Xi for i = 1, 2, ..., n, then
(E1 × E2 × · · · × En ) ∩ (F1 × F2 × · · · × Fn )
= (E1 ∩ F1 ) × (E2 ∩ F2 ) × · · · × (En ∩ Fn ).
If X is a set such that Xi = X for i = 1, 2, ..., n, then we write
X n := X1 × X2 × · · · × Xn .
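Both product identities can be verified mechanically for n = 2; in the following Python sketch, the sets X1, X2, E1, E2, F1, F2 are our own example:

```python
from itertools import product

# Our example with n = 2.
X1, X2 = {1, 2, 3}, {'a', 'b'}
E1, E2 = {1, 2}, {'a'}
F1, F2 = {2, 3}, {'a', 'b'}

P = set(product(X1, X2))
PE = set(product(E1, E2))

# (X1 x X2) - (E1 x E2) = ((X1 - E1) x X2) u (X1 x (X2 - E2))
assert P - PE == set(product(X1 - E1, X2)) | set(product(X1, X2 - E2))

# (E1 x E2) n (F1 x F2) = (E1 n F1) x (E2 n F2)
assert PE & set(product(F1, F2)) == set(product(E1 & F1, E2 & F2))
```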
1.2.2 Remark. Given two non-empty sets X and Y , if we want to define a mapping
ϕ from X to Y by using a rule r that assigns elements of Y to some elements of X,
we need to define a subset Dϕ of X such that the rule r assigns one and only one
element of Y to each element of Dϕ . After defining Dϕ , a mapping ϕ is defined by
assigning to each element of Dϕ the element r(x) of Y that we obtain by applying
the rule r to x. To indicate a mapping defined in this way, we often write
ϕ : Dϕ → Y
x ↦ ϕ(x) := r(x),
or equivalently
Dϕ ∋ x ↦ ϕ(x) := r(x) ∈ Y.
When, for a given non-empty subset S of X, the rule r assigns one and only one
element of Y to each element of S and we want to define Dϕ by setting Dϕ := S,
we often write directly
ϕ:S→Y
x ↦ ϕ(x) := r(x),
or even (without introducing a symbol to denote the mapping)
S ∋ x ↦ r(x) ∈ Y.
1.2.3 Definition. Let ϕ be a mapping from X to Y (by this, here and in the
sequel, we mean that X, Y are non-empty sets and ϕ is a mapping ϕ : Dϕ → Y
with Dϕ ⊂ X). The graph of ϕ is the subset of X × Y defined by
Gϕ := {(x, y) ∈ X × Y : x ∈ Dϕ and y = ϕ(x)}.
We remark that, when X and Y are replaced with two different sets X ′ and Y ′ such
that Dϕ ⊂ X ′ and Rϕ ⊂ Y ′ , the graph of ϕ will remain unaltered (but it will be
considered as a subset of X ′ × Y ′ ).
The projection of X × Y onto Y is the mapping
πY : X × Y → Y
(x, y) ↦ πY (x, y) := y.
ϕ(Dϕ ) = Rϕ ;
We recall (cf. 1.2.7) that, for any mapping ϕ from X to Y , for a subset T of Y we
have defined the set
ϕ−1 (T ) := {x ∈ Dϕ : ϕ(x) ∈ T }.
For an injective mapping ϕ we have {ϕ−1 (y)} = ϕ−1 ({y}) for each y ∈ Rϕ ; more-
over, for any subset S of Rϕ , ϕ−1 (S) is the same thing when interpreted as the
image of S under the inverse of ϕ or as the counterimage of S under ϕ. One can
see immediately that the following facts are true for an injective mapping ϕ:
(a) Rϕ−1 = Dϕ ;
(b) ϕ−1 is injective and (ϕ−1 )−1 = ϕ;
(c) if V denotes the mapping
V : X × Y → Y × X
(x, y) ↦ V (x, y) := (y, x),
then Gϕ−1 = V (Gϕ ) (notice that condition b of 1.2.4 is in effect for G := V (Gϕ )
iff ϕ is injective).
ψ ◦ ϕ : Dψ◦ϕ → Z
x ↦ (ψ ◦ ϕ)(x) := ψ(ϕ(x)).
If ψ : N → X is a sequence in X and ϕ : N → N is a mapping such that ϕ(n1 ) <
ϕ(n2 ) whenever n1 < n2 , then the mapping ψ ◦ ϕ is called a subsequence of ψ. If
ψ is denoted by {xn }, then ψ ◦ ϕ is denoted by {xϕ(k) }, or by {xnk } if ϕ does not
need to be specified.
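The notion of a mapping whose domain is a proper subset of X, and of the composition ψ ◦ ϕ defined on {x ∈ Dϕ : ϕ(x) ∈ Dψ }, can be modelled with finite dictionaries; the following Python sketch (the particular mappings are our own example) also checks two of the inclusions stated in 1.2.13:

```python
# Our example: phi is a mapping from X to Y with D_phi = {1, 2, 3},
# psi is a mapping from Y to Z with D_psi = {'a', 'b'}; a dictionary
# plays the role of a mapping together with its domain.
phi = {1: 'a', 2: 'b', 3: 'c'}
psi = {'a': 10, 'b': 20}

def compose(psi, phi):
    """psi o phi, defined on {x in D_phi : phi(x) in D_psi}."""
    return {x: psi[phi[x]] for x in phi if phi[x] in psi}

comp = compose(psi, phi)
assert comp == {1: 10, 2: 20}
assert set(comp) <= set(phi)                    # D_{psi o phi} ⊂ D_phi
assert set(comp.values()) <= set(psi.values())  # R_{psi o phi} ⊂ R_psi
```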
1.2.13 Proposition.
(A) Let ϕ be a mapping from X to Y . We have:
(a) ϕ ◦ idX = idY ◦ ϕ = ϕ.
If ψ is a mapping from Y to Z such that ϕ−1 (Dψ ) ≠ ∅, we have:
(b) Rψ◦ϕ ⊂ Rψ ;
(c) Dψ◦ϕ ⊂ Dϕ ;
(d) Dψ◦ϕ = Dϕ iff Rϕ ⊂ Dψ ;
(e) Dψ = Y ⇒ Dψ◦ϕ = Dϕ ;
(f ) (ψ ◦ ϕ)−1 (S) = ϕ−1 (ψ −1 (S)) for every subset S of Z.
If S is a subset of Y , we have:
∀x ∈ Dϕ , (πY ◦ ((πX )G )−1 )(x) = πY (x, r(x)) = r(x) = ϕ(x).
1.2.14 Proposition.
(A) If ϕ is an injective mapping from X to Y , we have:
ϕ−1 ◦ ϕ = idDϕ ⊂ idX and ϕ ◦ ϕ−1 = idRϕ ⊂ idY .
If ϕ is a bijection from X onto Y , we have
ϕ−1 ◦ ϕ = idX and ϕ ◦ ϕ−1 = idY .
(B) Let ϕ be an injective mapping from X to Y , ψ an injective mapping from Y
to Z, and suppose that ϕ−1 (Dψ ) ≠ ∅. Then the mapping ψ ◦ ϕ is injective and
(ψ ◦ ϕ)−1 = ϕ−1 ◦ ψ −1 .
Proof. We have
[x1 , x2 ∈ Dψ◦ϕ , (ψ ◦ ϕ)(x1 ) = (ψ ◦ ϕ)(x2 )] ⇒
[ϕ(x1 ), ϕ(x2 ) ∈ Dψ , ψ(ϕ(x1 )) = ψ(ϕ(x2 ))] ⇒ ϕ(x1 ) = ϕ(x2 ) ⇒ x1 = x2 ,
which proves that ψ ◦ ϕ is injective. We also have by 1.2.14A:
y ∈ D(ψ◦ϕ)−1 = Rψ◦ϕ ⇒ y = (ψ ◦ ϕ)((ψ ◦ ϕ)−1 (y)) = ψ(ϕ((ψ ◦ ϕ)−1 (y))) ⇒
[y ∈ Rψ = Dψ−1 and ψ −1 (y) = ψ −1 (ψ(ϕ((ψ ◦ ϕ)−1 (y)))) = ϕ((ψ ◦ ϕ)−1 (y))],
hence (ψ ◦ ϕ)−1 (y) = ϕ−1 (ψ −1 (y)), which proves (ψ ◦ ϕ)−1 ⊂ ϕ−1 ◦ ψ −1 .
Proof. We have
D(ϕ3 ◦ϕ2 )◦ϕ1 := {w ∈ Dϕ1 : ϕ1 (w) ∈ Dϕ3 ◦ϕ2 }
= {w ∈ Dϕ1 : ϕ1 (w) ∈ Dϕ2 and ϕ2 (ϕ1 (w)) ∈ Dϕ3 }
= {w ∈ Dϕ2 ◦ϕ1 : (ϕ2 ◦ ϕ1 )(w) ∈ Dϕ3 }
= Dϕ3 ◦(ϕ2 ◦ϕ1 ) ,
(a) Dϕ = ψ −1 (Dψ◦ϕ◦ψ−1 );
(b) Rψ◦ϕ◦ψ−1 = ψ(Rϕ ).
Proof. a: We have
x ∈ Dϕ ⇒ ψ −1 (ψ(x)) ∈ Dϕ ⇒ ψ(x) ∈ Dϕ◦ψ−1 = Dψ◦ϕ◦ψ−1 (by (∗)) ⇒
x ∈ ψ −1 (Dψ◦ϕ◦ψ−1 ),
x ∈ ψ −1 (Dψ◦ϕ◦ψ−1 ) ⇒ ψ(x) ∈ Dψ◦ϕ◦ψ−1 = Dϕ◦ψ−1 (by (∗)) ⇒
x ∈ Dϕ◦ψ−1 ◦ψ = Dϕ◦idX = Dϕ ,
−ϕ : Dϕ → C
x ↦ (−ϕ)(x) := −ϕ(x);
Re ϕ : Dϕ → C
x ↦ (Re ϕ)(x) := Re ϕ(x);
Im ϕ : Dϕ → C
x ↦ (Im ϕ)(x) := Im ϕ(x);
ϕ̄ : Dϕ → C
x ↦ ϕ̄(x) := \overline{ϕ(x)};
|ϕ| : Dϕ → C
x ↦ |ϕ|(x) := |ϕ(x)|;
D_{1/ϕ} := {x ∈ Dϕ : ϕ(x) ≠ 0} and
1/ϕ : D_{1/ϕ} → C
x ↦ (1/ϕ)(x) := 1/ϕ(x);
e^ϕ : Dϕ → C
x ↦ (e^ϕ )(x) := exp ϕ(x);
for n ∈ N,
ϕ^n : Dϕ → C
x ↦ (ϕ^n )(x) := (ϕ(x))^n ;
for α ∈ C,
αϕ : Dϕ → C
x ↦ (αϕ)(x) := αϕ(x).
For a function ϕ from X to R, we define:
ϕ+ : Dϕ → R
x ↦ ϕ+ (x) := max{ϕ(x), 0};
ϕ− : Dϕ → R
x ↦ ϕ− (x) := − min{ϕ(x), 0}.
For two functions ϕ, ψ from X to C s.t. Dϕ ∩ Dψ ≠ ∅, we define:
ϕ + ψ : Dϕ ∩ Dψ → C
x ↦ (ϕ + ψ)(x) := ϕ(x) + ψ(x);
ϕψ : Dϕ ∩ Dψ → C
x ↦ (ϕψ)(x) := ϕ(x)ψ(x).
Clearly, for a function ϕ from X to C we have
ϕ = Re ϕ + i Im ϕ and |ϕ|² = ϕϕ̄,
and for a function from X to R we have
ϕ = ϕ+ − ϕ− and |ϕ| = ϕ+ + ϕ− ;
thus, for a function ϕ from X to C we have
ϕ = (Re ϕ)+ − (Re ϕ)− + i(Im ϕ)+ − i(Im ϕ)− .
For α ∈ C, we define the constant function
αX : X → C
x ↦ αX (x) := α;
we also write ϕ + α := ϕ + αX for every function ϕ from X to C.
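The pointwise operations above, together with their domains, can likewise be modelled with finite dictionaries; in the following Python sketch (the particular functions are our own example), note how the domain of 1/ϕ excludes the zeros of ϕ:

```python
# Our example: a complex function phi with D_phi = {0, 1, 2}.
phi = {0: 3 + 4j, 1: 0j, 2: -2 + 0j}

# 1/phi is defined on D_{1/phi} = {x in D_phi : phi(x) != 0}:
recip = {x: 1 / v for x, v in phi.items() if v != 0}
assert set(recip) == {0, 2}

# |phi|^2 = phi * (conjugate of phi), pointwise:
assert all(abs(v) ** 2 == (v * v.conjugate()).real for v in phi.values())

# For a real function f: f = f+ - f- and |f| = f+ + f-.
f = {0: 2.5, 1: -1.5, 2: 0.0}
fplus = {x: max(v, 0.0) for x, v in f.items()}
fminus = {x: -min(v, 0.0) for x, v in f.items()}
assert all(fplus[x] - fminus[x] == f[x] for x in f)
assert all(fplus[x] + fminus[x] == abs(f[x]) for x in f)
```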
1.3 Groups
We review in this section the few elementary facts about groups that will be used
in the book.
1.3.1 Definitions. A group is a pair (G, γ), where G is a non-empty set and γ is
a mapping γ : G × G → G with the following properties, which we write with the
shorthand notation g1 g2 := γ(g1 , g2 ):
A group is said to be abelian if it has the following further property:
(ag) g1 g2 = g2 g1 , ∀g1 , g2 ∈ G.
For an abelian group, one usually writes “g1 + g2 ” instead of “g1 g2 ”, “sum” instead
of “product”, “zero” instead of “identity”, “0” instead of “e”, “opposite” instead of
“inverse”, “−g” instead of “g −1 ”, “g1 − g2 ” instead of “g1 + (−g2 )”. For elements
of G, one writes g1 + g2 + g3 := g1 + (g2 + g3 ) and ∑_{i=n}^{m} gi := gn + gn+1 + ... + gm
if n < m; one also writes ∑_{i∈I} gi to denote the sum of a finite family {gi }i∈I of
elements of G.
One often says “the group G” to mean the pair (G, γ), but on the other hand one
often speaks of “elements of the group G” to mean “elements of the set G”. Tacit
conventions of this sort are used whenever one deals with mathematical structures
which are composed of sets together with some mappings (as in the case of metric,
linear, normed, inner product spaces, algebras, normed algebras, etc.), or together
with some relation (as in the case of a partially ordered set), or together with some
class of distinguished subsets (as in the case of a measurable space), and will not
be mentioned again later on.
A non-empty subset G̃ of G is called a subgroup of (G, γ) if:
(sg1) g1, g2 ∈ G̃ ⇒ g1g2 ∈ G̃;
(sg2) g ∈ G̃ ⇒ g⁻¹ ∈ G̃.
If G̃ is a subgroup of (G, γ), condition sg1 makes it possible to use G̃ as the final set of
γG̃×G̃ (the restriction of γ to G̃ × G̃). Then (G̃, γG̃×G̃) is a group: indeed for this pair
condition gr1 is obviously satisfied; moreover, for any g ∈ G̃ we have e = gg⁻¹ ∈ G̃
by conditions sg1 and sg2, hence condition gr2 is satisfied for (G̃, γG̃×G̃) with e still
playing the role of the identity; finally, condition gr3 is satisfied for (G̃, γG̃×G̃) since
by condition sg2 we have g⁻¹ ∈ G̃ for all g ∈ G̃.
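The argument just given — conditions sg1 and sg2 force the identity into G̃ — can be illustrated on a small finite example (a Python sketch; the choice of the additive group Z₁₂ and of the subset H is ours):

```python
# The subgroup criterion in (Z_12, addition mod 12): H = {0, 4, 8} satisfies
# sg1 (closure under the group product) and sg2 (closure under inverses),
# and therefore contains the identity e = g g^{-1}, as in the argument above.
G = set(range(12))
op = lambda a, b: (a + b) % 12        # the group "product" (here: sum mod 12)
inv = lambda a: (-a) % 12             # the group inverse (here: the opposite)

H = {0, 4, 8}
sg1 = all(op(a, b) in H for a in H for b in H)
sg2 = all(inv(a) in H for a in H)
assert sg1 and sg2

g = 4                                 # any element of H
assert op(g, inv(g)) == 0 and 0 in H  # e = g g^{-1} lies in H
```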
1.3.4 Remark. It can be easily seen that the family of all automorphisms of any
group G is a group if the product of two automorphisms is assumed to be their
composition as defined in 1.2.12. The identity of this group is idG (which is clearly
an automorphism of G), and the group inverse of an automorphism is its inverse as
defined in 1.2.11.
Proof. a: We have
g, g′ ∈ RΦ ⇒
[∃g̃, g̃′ ∈ G1 s.t. g = Φ(g̃), g′ = Φ(g̃′), hence s.t. gg′ = Φ(g̃g̃′)] ⇒
gg′ ∈ RΦ
and
g ∈ RΦ ⇒
[∃g̃ ∈ G1 s.t. g = Φ(g̃), hence s.t. g⁻¹ = Φ(g̃)⁻¹ = Φ(g̃⁻¹)] ⇒
g⁻¹ ∈ RΦ.
b: Let G1 be abelian. Then we have
Φ(g1)Φ(g2) = Φ(g1g2) = Φ(g2g1) = Φ(g2)Φ(g1), ∀g1, g2 ∈ G1.
Chapter 2
Metric Spaces
This chapter contains just the facts about metric spaces that will be used later in
the book; it is not intended as a thorough treatment of the subject.
2.1.1 Definition. A metric space is a pair (X, d), where X is a non-empty set and
d is a function d : X × X → R, called a distance on X, such that:
(di1) d(x, y) = 0 ⇔ x = y, ∀x, y ∈ X;
(di2) d(x, y) ≤ d(x, z) + d(z, y), ∀x, y, z ∈ X;
(di3) d(x, y) = d(y, x), ∀x, y ∈ X.
2.1.7 Remarks.
(a) Let (X, d) be a metric space. For x ∈ X and a sequence {xn } in X, xn → x iff
d(xn , x) → 0 in the metric space (R, dR ).
(b) Let (X, d) be a metric space, {xn } a convergent sequence in X, and ϕ : N → N
a mapping such that ϕ(n1 ) < ϕ(n2 ) whenever n1 < n2 . Then the subsequence
{xϕ(k) } is convergent and limk→∞ xϕ(k) = limn→∞ xn . Indeed, write x :=
limn→∞ xn and for each ε > 0 let Nε ∈ N be such that
n > Nε ⇒ d(xn , x) < ε.
Then, for each ε > 0,
k > min ϕ⁻¹([Nε, ∞) ∩ N) ⇒ ϕ(k) > Nε ⇒ d(xϕ(k), x) < ε.
2.1.10 Definition. Let (X, γ, d) be a triple such that (X, γ) is an abelian group and
(X, d) is a metric space, let {xn} be a sequence in X, and let sn := ∑_{k=1}^{n} xk for
all n ∈ N. The sequence {sn} is called the series of the xn's and is denoted by
the symbol ∑_{n=1}^{∞} xn; thus, one says that ∑_{n=1}^{∞} xn is convergent when the sequence
{sn} is convergent. If the sequence {sn} is convergent then one calls limn→∞ sn
the sum of the series and denotes limn→∞ sn by the same symbol ∑_{n=1}^{∞} xn as the
series, i.e. one writes ∑_{n=1}^{∞} xn := limn→∞ sn.
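For instance, taking xn := 2⁻ⁿ in the abelian group (R, +) with the metric dR, the partial sums sn = 1 − 2⁻ⁿ converge to 1, so the sum of the series is 1 (a Python sketch; the helper name partial_sum is ours):

```python
# The series sum_{n=1}^{oo} x_n is by definition the sequence of partial
# sums s_n = x_1 + ... + x_n; here x_n = 1/2^n in (R, d_R).
def partial_sum(n):
    return sum(2.0**(-k) for k in range(1, n + 1))

s = [partial_sum(n) for n in range(1, 41)]
assert all(s[i] < s[i + 1] for i in range(len(s) - 1))  # s_n = 1 - 2^{-n} increases
assert abs(s[-1] - 1.0) < 1e-11                         # s_n -> 1, the sum of the series
```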
2.2.1 Definition. Let (X, d) be a metric space. If x ∈ X and r ∈ (0, ∞), the open
ball with center x and radius r is the set
B(x, r) := {y ∈ X : d(x, y) < r}.
2.2.3 Proposition. Let (X, d) be a metric space. For all x ∈ X and r ∈ (0, ∞),
the open ball B(x, r) is an open set (this justifies its name).
Proof. Let y be a point in B(x, r). We must produce r′ ∈ (0, ∞) such that B(y, r′) ⊂
B(x, r). Since d(y, x) < r, we have 0 < r − d(y, x). Defining r′ := r − d(y, x), we
have:
z ∈ B(y, r′) ⇒ d(z, y) < r − d(y, x) ⇒
d(z, x) ≤ d(z, y) + d(y, x) < r ⇒ z ∈ B(x, r).
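The proof's choice of the smaller radius r − d(y, x) can be visualized numerically in the plane with the Euclidean distance (a Python sketch; the sample point y and the name r_prime are ours):

```python
import math
import random

# Proposition 2.2.3 in (R^2, Euclidean distance): every point y of B(x, r)
# is the center of an open ball B(y, r') contained in B(x, r), r' = r - d(y, x).
def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

x, r = (0.0, 0.0), 1.0
y = (0.3, 0.4)                       # d(y, x) = 0.5 < r, so y is in B(x, r)
r_prime = r - d(y, x)                # the radius produced in the proof
assert r_prime > 0

random.seed(0)
for _ in range(1000):                # sample points z of B(y, r')
    t = random.uniform(0.0, 2.0 * math.pi)
    s = r_prime * random.random()    # s < r', so z lies inside B(y, r')
    z = (y[0] + s * math.cos(t), y[1] + s * math.sin(t))
    assert d(z, x) < r               # z lies in B(x, r), as the proof predicts
```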
2.2.4 Theorem. Let (X, d) be a metric space. The topology Td has the following
properties:
(to1 ) ∅ ∈ Td , X ∈ Td ;
(to2 ) if F is any family of elements of Td , then ∪S∈F S ∈ Td ;
(to3 ) if F is a finite family of elements of Td , then ∩S∈F S ∈ Td .
Proof. to1 : To show that ∅ is open, we must show that each point in ∅ is the center
of an open ball contained in ∅; but since there are no points in ∅, this requirement
is automatically satisfied. The set X is clearly open, since every open ball centered
on any of its points is contained in X.
to2 : Every point x in ∪S∈F S lies in some Sx of the family F . Since Sx is an
open set, some open ball centered on x is contained in Sx and hence in ∪S∈F S.
to3 : If ∩S∈F S = ∅, then ∩S∈F S is an open set. Assume then that ∩S∈F S is
non-empty and write F = {S1 , ..., Sn } for some n ∈ N; let x be a point in ∩S∈F S;
for k = 1, ..., n, ∃rk > 0 s.t. B(x, rk ) ⊂ Sk ; let r be the smallest number in the
set {r1 , ..., rn }; r is a positive real number and we have B(x, r) ⊂ B(x, rk ) for
k = 1, ..., n; therefore B(x, r) ⊂ ∩S∈F S.
Proof. First we note that if x ∈ S then B(x, r) ∩ S is the open ball with center x
and radius r in the metric subspace (S, dS ).
If G ∈ Td and x ∈ G ∩ S, then ∃r > 0 such that B(x, r) ⊂ G, hence such that
B(x, r) ∩ S ⊂ G ∩ S. This shows that G ∩ S is an open set in (S, dS ).
Conversely, if T is an open set in (S, dS ), then for each x ∈ T there is rx > 0 s.t.
B(x, rx ) ∩ S ⊂ T . Then we have T = ∪x∈T (B(x, rx ) ∩ S) = (∪x∈T B(x, rx )) ∩ S,
with ∪x∈T B(x, rx ) ∈ Td by 2.2.3 and 2.2.4.
2.2.6 Definition. Let (X, d) be a metric space and S a subset of X. The interior
of S is the set:
S° := ∪G∈F G, with F := {G ∈ P(X) : G ∈ Td and G ⊂ S}.
2.3.2 Theorem. Let (X, d) be a metric space. The family Kd of all closed sets has
the following properties:
(cl1 ) ∅ ∈ Kd , X ∈ Kd ;
(cl2 ) if F is any family of elements of Kd , then ∩S∈F S ∈ Kd ;
(cl3 ) if F is a finite family of elements of Kd , then ∪S∈F S ∈ Kd .
Proof. Properties cl1 , cl2 , cl3 follow from 2.3.1, 2.2.4 and De Morgan’s laws (cf.
1.1.4).
2.3.4 Theorem. Let (X, d) be a metric space. For a subset S of X the following
conditions are equivalent:
(a) S is a closed set;
(b) [x ∈ X, {xn } a sequence in S, xn → x] ⇒ x ∈ S.
2.3.5 Remark. Using 2.3.4 one can see at once that the set {x} is closed, for every
point x of any metric space.
2.3.6 Definition. Let (X, d) be a metric space. If x ∈ X and r ∈ (0, ∞), the closed
ball with center x and radius r is the set
K(x, r) := {y ∈ X : d(x, y) ≤ r}.
2.3.7 Proposition. Let (X, d) be a metric space. For all x ∈ X and r ∈ (0, ∞),
the closed ball K(x, r) is a closed set (this justifies its name).
2.3.8 Definition. Let (X, d) be a metric space and S a subset of X. The closure
of S is the set
S̄ := ∩F∈F F, with F := {F ∈ P(X) : F ∈ Kd and S ⊂ F}.
(f) the closure of ∩S∈L S is contained in ∩S∈L S̄.
(a) x ∈ S̄;
(b) ∀ǫ > 0, ∃y ∈ S such that d(x, y) < ǫ;
(c) there exists a sequence {xn} in S such that xn → x.
Proof. a ⇒ b: We prove (not b) ⇒ (not a). Assume (not b), i.e.
∃ǫ > 0 such that ǫ ≤ d(x, y) for each y ∈ S,
which is equivalent to
∃ǫ > 0 such that S ⊂ X − B(x, ǫ);
since X − B(x, ǫ) ∈ Kd (cf. 2.2.3 and 2.3.1), we have (cf. 2.3.9a)
∃ǫ > 0 such that S̄ ⊂ X − B(x, ǫ);
S ⊂ X − {x},
∀ǫ > 0, B(x, ǫ) ∩ S ≠ ∅;
2.3.12 Corollary. Let (X, d) be a metric space and S a subset of X. The following
conditions are equivalent:
(a) S̄ = X;
(b) ∀x ∈ X, ∀ǫ > 0, ∃y ∈ S such that d(x, y) < ǫ;
(c) ∀x ∈ X, there exists a sequence {xn } in S such that xn → x.
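Condition (b) is exactly how one verifies that Q is dense in (R, dR): given x and ǫ > 0, a rational within ǫ of x can be produced explicitly (a Python sketch; the helper name rational_within is ours):

```python
import math
from fractions import Fraction

# Density of Q in (R, d_R): for any x and eps > 0, the rational
# y = floor(x*n)/n with n > 1/eps satisfies |x - y| < 1/n < eps.
def rational_within(x, eps):
    n = int(1.0 / eps) + 1
    return Fraction(math.floor(x * n), n)

x, eps = math.pi, 1e-6
y = rational_within(x, eps)
assert abs(x - float(y)) < eps
# A sequence of rationals converging to x, as in condition (c):
seq = [rational_within(x, 10.0**(-k)) for k in range(1, 8)]
assert all(abs(x - float(q)) < 10.0**(-k) for k, q in enumerate(seq, start=1))
```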
2.3.13 Theorem. Let (X, d) be a metric space, and let S and T be two subsets of
X such that T ⊂ S. The following conditions are equivalent:
S = ∩F∈F (F ∩ S) ⊂ ∩F∈F F = T̄.
2.3.14 Corollary. Let (X, d) be a metric space, and let S and T be two subsets of
X such that T ⊂ S. If T is dense in (S, dS ) and S is dense in (X, d), then T is
dense in (X, d).
Proof. If T is dense in (S, dS), then we have (by 2.3.13) S ⊂ T̄, which implies (by
2.3.9a) S̄ ⊂ T̄. If moreover S is dense in (X, d), then we also have S̄ = X and hence
X ⊂ T̄, which implies T̄ = X.
2.3.16 Remark. The metric space (R, dR ) is separable since the set Q of all rational
numbers is both countable and dense in R.
2.3.17 Proposition. Let (X, d) be a separable metric space. Then there is a count-
able family Tc of open balls such that every open set is a union of elements of Tc .
Proof. If G is the empty set then the statement is trivial. Assume then G non-
empty. Let Tc be the countable family of open balls of 2.3.17. Let x be a point
in G. The point x is in some Gi , and we can find an open ball B in Tc such that
x ∈ B ⊂ Gi . If we do this for each point x in G, we obtain a family of open
balls {Bn }n∈J such that ∪n∈J Bn = G, and this family is countable since it is a
subfamily of Tc . Further, for each open ball in this subfamily, we can select i ∈ I so
that Gi contains the ball. The family Ic of i’s which arises in this way is countable,
since there exists a surjection of the countable set J onto Ic by construction of Ic .
Moreover, ∪i∈Ic Gi = G since Gi ⊂ G for each i ∈ I, and ∀n ∈ J, ∃i ∈ Ic such that
Bn ⊂ Gi .
2.3.19 Theorem. Assume that in a metric space (X, d) there is a countable family
Tc of open sets such that every open set is a union of elements of Tc . Then (X, d)
is separable.
Proof. For each element of Tc choose a point, and let S be the set of all these
points. The set S is countable since by its construction there is a surjection from
Tc onto S. For every x ∈ X and every ǫ > 0, B(x, ǫ) contains an element of Tc ,
hence a point y ∈ S, and we have d(x, y) < ǫ. In view of 2.3.12, this shows that
S̄ = X.
2.4.5 Remark. It is obvious that an isomorphism from one metric space onto
another is a continuous mapping.
Proof. We have
d(x, S) = 0 ⇔ [∀ǫ > 0, ∃y ∈ S s.t. d(x, y) < ǫ] ⇔ x ∈ S̄,
where the latter equivalence holds by 2.3.10.
2.5.4 Lemma. Let (X, d) be a metric space and S a non-empty subset of X. The
function
δS : X → R
x ↦ δS(x) := d(x, S)
is uniformly continuous.
function of 2.5.5, have the required properties (cf. 1.2.8 and 2.4.3).
2.5.7 Corollary. Let F be a closed set in a metric space (X, d). Then there exists
a sequence {ϕn } such that:
∀n ∈ N, ϕn is a continuous function ϕn : X → [0, 1];
∀x ∈ X, ϕn (x) → χF (x) as n → ∞.
Proof. If F = ∅, let ϕn := 0X. Assuming F ≠ ∅, for n ∈ N the set Fn :=
δF⁻¹([1/n, ∞)) is closed by 2.5.4 and 2.4.3, and F ∩ Fn = ∅ by 2.5.2. Hence, by 2.5.5
there is a continuous function ϕn : X → [0, 1] such that ϕn(x) = 1, ∀x ∈ F, and
ϕn(x) = 0, ∀x ∈ Fn. By 2.5.2 we also have:
∀x ∈ X − F, ∃Nx ∈ N such that x ∈ Fn for n > Nx,
and hence such that ϕn(x) = 0 for n > Nx.
This proves that
∀x ∈ X, ϕn (x) → χF (x) as n → ∞.
2.5.8 Corollary. Let G be an open set in a metric space (X, d). Then there exists
a sequence {ψn } such that:
(a) ∀n ∈ N, ψn is a continuous function ψn : X → [0, 1];
(b) ∀x ∈ X, ψn (x) → χG (x) as n → ∞.
Proof. Let F := X −G in 2.5.7 and define ψn := 1X −ϕn (cf. 1.2.19). The required
properties for {ψn } follow from the properties of {ϕn }. In particular
∀x ∈ X, ψn (x) = 1 − ϕn (x) → 1 − χX−G (x) = χG (x) as n → ∞.
(a) F ⊂ G,
(b) ϕ is continuous,
(c) χF (x) ≤ ϕ(x) ≤ χG (x), ∀x ∈ X,
(d) supp ϕ ⊂ G.
2.5.11 Theorem. In a metric space (X, d), let F be a closed set, G an open set,
and F ⊂ G. Then there exists a function ϕ : X → [0, 1] such that F ≺ ϕ ≺ G.
2.6.2 Theorem. Let (X, d) be a metric space and let {xn } be a sequence in X
which is convergent. Then {xn } is a Cauchy sequence.
Proof. Let x be the limit of {xn}. For ǫ > 0 let Nǫ ∈ N be such that Nǫ < n ⇒
d(xn, x) < ǫ/2. Then we have
Nǫ < n, m ⇒ d(xn, xm) ≤ d(xn, x) + d(x, xm) < ǫ.
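Numerically, for xn := 1/n in (R, dR) (a Python sketch with a sample tolerance; the cutoff N_eps is ours):

```python
# Theorem 2.6.2 illustrated: x_n = 1/n converges to 0, and past a cutoff
# N_eps with |x_n - 0| < eps/2 the differences |x_n - x_m| stay below eps.
x = lambda n: 1.0 / n

eps = 1e-3
N_eps = 2000                         # 1/n < eps/2 for every n > N_eps
assert all(abs(x(n) - 0.0) < eps / 2 for n in range(N_eps + 1, N_eps + 500))
# the Cauchy property, via the triangle inequality as in the proof:
assert all(abs(x(n) - x(m)) < eps
           for n in range(N_eps + 1, N_eps + 50)
           for m in range(N_eps + 1, N_eps + 50))
```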
2.6.4 Proposition. Let (X1 , d1 ), (X2 , d2 ) be two metric spaces such that there ex-
ists an isomorphism from (X1 , d1 ) onto (X2 , d2 ). Then (X1 , d1 ) is complete iff
(X2 , d2 ) is complete.
2.6.5 Example. The metric space (R, dR ) (cf. 2.1.4) is complete (cf. e.g. Rudin,
1976, 3.11).
2.6.6 Proposition. Let (X, d) be a metric space and let S be a non-empty subset
of X.
(a) If the metric subspace (S, dS ) is a complete metric space, then S is a closed set
in (X, d).
(b) If (X, d) is a complete metric space and S is a closed set in (X, d), then the
metric subspace (S, dS ) is a complete metric space.
We point out that, as a result of condition co1, the mapping ι is necessarily injective:
ι(x) = ι(y) ⇒ d̂(ι(x), ι(y)) = 0 ⇒ d(x, y) = 0 ⇒ x = y.
If ((X̂, d̂), ι) is a completion of (X, d), then clearly ι is an isomorphism from (X, d)
onto the metric subspace (Rι, d̂Rι) of (X̂, d̂).
2.6.8 Proposition. Let (X, d) be a complete metric space and let S be a subset of
X such that S̄ = X (hence S is non-empty) and S ≠ X. Then the metric subspace
(S, dS) is not complete and the pair ((X, d), idS) is one of the completions of (S, dS).
Proof. Since S is not closed (cf. 2.3.9c), (S, dS) is not complete by 2.6.6a. It
follows directly from the definitions that ((X, d), idS) is a completion of (S, dS).
Although we shall need completions of metric spaces later in the book, those
completions will be constructed without relying on either the statement or the
proof of the following theorem. For this reason, we state it without giving its
proof, which can be found e.g. in 6.3.11 of (Berberian, 1999).
2.6.9 Theorem. If (X, d) is any metric space, then there exists a completion
((X̂, d̂), ι) of (X, d).
If ((X̃, d̃), ω) is also a completion of (X, d), then there exists an isomorphism Φ
from (X̂, d̂) onto (X̃, d̃) such that Φ ◦ ι = ω, i.e. such that Φ(ι(x)) = ω(x), ∀x ∈ X.
In order that (X, d) be complete, it is necessary and sufficient that ι be surjective
onto X̂.
2.7.1 Theorem. Let (X, d) and (X̃, d̃) be metric spaces. The function
d × d̃ : (X × X̃) × (X × X̃) → R
((x, x̃), (y, ỹ)) ↦ (d × d̃)((x, x̃), (y, ỹ)) := √(d(x, y)² + d̃(x̃, ỹ)²)
is a distance on X × X̃.
Proof. One can see that, for d × d̃, properties di1 and di3 of 2.1.1 follow immediately
from the corresponding properties for d and for d̃.
As to property di2, for all (x, x̃), (y, ỹ), (z, z̃) ∈ X × X̃ we have
(d × d̃)((x, x̃), (y, ỹ)) ≤ √((d(x, z) + d(z, y))² + (d̃(x̃, z̃) + d̃(z̃, ỹ))²)
≤ √(d(x, z)² + d̃(x̃, z̃)²) + √(d(z, y)² + d̃(z̃, ỹ)²)
= (d × d̃)((x, x̃), (z, z̃)) + (d × d̃)((z, z̃), (y, ỹ)).
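The product distance and the triangle inequality for it can be exercised numerically, taking both factor spaces to be (R, dR) for concreteness (a Python sketch; the function name d_prod is ours):

```python
import math
import random

# The product distance of 2.7.1, built from two factor distances.
d = lambda a, b: abs(a - b)           # distance on the first factor, here R
d_tilde = lambda a, b: abs(a - b)     # distance on the second factor, here R

def d_prod(p, q):
    # p = (x, x~), q = (y, y~)
    return math.sqrt(d(p[0], q[0])**2 + d_tilde(p[1], q[1])**2)

random.seed(1)
for _ in range(1000):
    p, q, z = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(3)]
    assert d_prod(p, q) == d_prod(q, p)                          # di3
    assert d_prod(p, q) <= d_prod(p, z) + d_prod(z, q) + 1e-12   # di2
```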
Proof. Statements a and b follow directly from the definitions and statement d
follows immediately from statements a and b.
As to statement c, assume first (X, d) and (X̃, d̃) separable, and let S and S̃
be two countable subsets of X and X̃ which are dense in X and X̃ respectively.
Then S × S̃ is a countable subset of X × X̃. Moreover, for each (x, x̃) ∈ X × X̃, by
2.3.12 there are sequences {xn } and {x̃n } in S and S̃ respectively such that xn → x
and x̃n → x̃ . Then, by statement a, {(xn , x̃n )} is a sequence in S × S̃ such that
(xn , x̃n ) → (x, x̃). By 2.3.12, this proves that S × S̃ is dense in X × X̃, and hence
that X × X̃ is separable.
Assume now (X × X̃, d × d̃) separable, and let T be a countable subset of X × X̃
which is dense in X × X̃. Let S be the set of first members of T, i.e. S := πX(T)
(cf. 1.2.6c). Then S is countable. Fix now x ∈ X and let x̃ be any element of
X̃. By 2.3.12, there is a sequence {(xn , x̃n )} in T such that (xn , x̃n ) → (x, x̃), and
hence, by statement a, such that xn → x. Since xn ∈ S, in view of 2.3.12 this
proves that S is dense in X, and hence that X is separable. For X̃, proceed in a
similar way.
2.7.4 Examples.
dC : C × C → R
(z1, z2) ↦ dC(z1, z2) := |z1 − z2|
Proof. By 2.4.2, the mapping ϕ is continuous at x iff the following condition holds:
2.7.6 Remark. Let (X, d) be a metric space. By 2.7.4a and 2.7.5, a function
ϕ : Dϕ → C with Dϕ ⊂ X is continuous at x ∈ Dϕ iff Re ϕ and Im ϕ (cf. 1.2.19)
are both continuous at x.
If ϕ is a function from R to C, i.e. ϕ : Dϕ → C with Dϕ ⊂ R, and x0 is a point
in Dϕ for which ∃ǫ > 0 such that (x0 −ǫ, x0 +ǫ) ⊂ Dϕ , we see that ϕ is differentiable
at x0 (cf. 1.2.21) iff ∃z ∈ C such that the following function is continuous at x0 :
Dϕ → C
x ↦ (ϕ(x) − ϕ(x0))/(x − x0) if x ≠ x0,
x ↦ z if x = x0.
If z with this property exists, then it is unique and in fact
2.7.7 Proposition. Let (X1 , d1 ), (X2 , d2 ), (Y, d) be metric spaces, and let a map-
ping ϕ : X1 × X2 → Y be continuous at a point (x1 , x2 ) of X1 × X2 . Then the
mapping
ϕx1 : X2 → Y
x2 ↦ ϕx1(x2) := ϕ(x1, x2)
is continuous at x2, and the mapping
ϕx2 : X1 → Y
x1 ↦ ϕx2(x1) := ϕ(x1, x2)
is continuous at x1 .
Proof. Let {x2,n } be a sequence in X2 such that x2,n → x2 . By 2.7.3a, {(x1 , x2,n )}
is then a sequence in X1 × X2 such that (x1 , x2,n ) → (x1 , x2 ). Since ϕ is continuous
at (x1, x2), by 2.4.2 this implies that ϕ(x1, x2,n) → ϕ(x1, x2), i.e. that ϕx1(x2,n) → ϕx1(x2).
By 2.4.2, this proves that ϕx1 is continuous at x2 . For ϕx2 one proceeds in a similar
way.
2.8 Compactness
2.8.2 Theorem. For a subset S of a metric space (X, d) the following three condi-
tions are equivalent:
(a) the metric subspace (S, dS ) is complete and, for every ǫ > 0, S can be covered
by a finite family of open balls with radius ǫ;
(b) for every sequence in S, there exists a subsequence which converges to a point
of S;
(c) for every family G of open sets which is a cover of S, there exists a finite
subfamily Gf of G which is a cover of S.
must contain xn for infinitely many n ∈ N; define then the infinite subset N1 of N
by
N1 := {n ∈ N : xn ∈ B1 },
and choose n1 ∈ N1. Suppose now that we have defined an open ball Bk with radius
1/k and an infinite subset Nk of N such that xn ∈ Bk for all n ∈ Nk, and that we
have chosen nk ∈ Nk. Proceed hence as follows: since there is a finite family of open
balls with radius 1/(k+1) which is a cover of S and hence of S ∩ Bk, at least one of
these balls, which we denote by Bk+1, must contain xn for infinitely many n ∈ Nk;
define then the infinite subset Nk+1 of Nk by
Nk+1 := {n ∈ Nk : xn ∈ Bk+1 },
and choose nk+1 ∈ Nk+1 such that nk < nk+1 (this is possible because Nk+1 is an
infinite set). Since Nk+1 ⊂ Nk for each k ∈ N, for k and l such that l < k we have
Nk ⊂ Nl , and hence nk ∈ Nl , and hence xnk ∈ Bl .
Now, {xnk} is a subsequence of {xn} and it is a Cauchy sequence: if k > l then
xnk, xnl ∈ Bl and hence d(xnk, xnl) < 2/l. Since (S, dS) is complete, there exists
x ∈ S such that xnk → x as k → ∞.
b ⇒ a: We shall prove (not a)⇒(not b). Assume (not a), i.e. [(S, dS ) not
complete] or [∃ǫ > 0 s.t. S cannot be covered by a finite family of open balls with
radius ǫ].
If (S, dS ) is not complete, there is a Cauchy sequence {xn } in S with no limit in
S. Then, no subsequence of {xn } can converge to a point of S. Indeed, for ǫ > 0 let
Nǫ ∈ N be so that d(xn , xm ) < ǫ whenever n, m > Nǫ ; then, if a subsequence {xnk }
and x ∈ S existed such that xnk → x as k → ∞, by choosing k(ǫ) large enough so
that nk(ǫ) > Nǫ and d(xnk(ǫ) , x) < ǫ we would have
n > Nǫ ⇒ d(xn , x) ≤ d(xn , xnk(ǫ) ) + d(xnk(ǫ) , x) < 2ǫ,
and this would prove that xn → x.
On the other hand, if ǫ > 0 exists such that S cannot be covered by a finite family
of open balls with radius ǫ, we can construct a sequence {xn } in S inductively as
follows. Choose x1 ∈ S; having chosen x1, ..., xn, notice that S − ∪_{k=1}^{n} B(xk, ǫ) ≠ ∅
(otherwise we should have S ⊂ ∪_{k=1}^{n} B(xk, ǫ)) and choose xn+1 ∈ S − ∪_{k=1}^{n} B(xk, ǫ).
Then for n ≠ m we have ǫ ≤ d(xn, xm) (if e.g. n > m, then xn ∉ B(xm, ǫ)), and no
subsequence of {xn } can be convergent (cf. 2.6.2).
(a and b) ⇒ c: Assume condition b and let G be a family of open sets which is
a cover of S. We can prove by contradiction that
∃n ∈ N such that
(∗) [x ∈ X and B(x, 1/n) ∩ S ≠ ∅] ⇒ [∃G ∈ G s.t. B(x, 1/n) ⊂ G].
Indeed, suppose to the contrary that
∀n ∈ N, ∃xn ∈ X such that
B(xn, 1/n) ∩ S ≠ ∅ and B(xn, 1/n) ⊄ G for all G ∈ G.
(a) S is compact;
2.8.5 Proposition. For a subset S of a metric space (X, d) the following conditions
are equivalent:
(a) S is compact;
(b) the metric subspace (S, dS) is complete and, for each n ∈ N, there exists a finite
family {xn,1, ..., xn,Nn} of points of X such that S ⊂ ∪_{k=1}^{Nn} K(xn,k, 1/n).
we also have S ⊂ ∪_{k=1}^{Nn} B(xn,k, ǫ). Since (S, dS) is complete, this proves that S has
property a of 2.8.2.
2.8.7 Theorem (Heine–Borel). In the metric space (Rⁿ, dn) (cf. 2.7.4b), every
closed and bounded subset of Rⁿ is compact.
it is enough to prove that, for every ǫ > 0, QR can be covered by a finite family of
open balls of radius ǫ. Given ǫ > 0, choose an integer k such that k > R√n/ǫ and
construct a partition of QR made of kⁿ congruent subcubes, by dividing the interval
[−R, R] into k intervals of equal length. Each of these subcubes has side length 2R/k,
hence its diameter is 2R√n/k < 2ǫ, so it is contained in the open ball that has the
center of the subcube as its own center and radius ǫ. Thus, QR can be covered by
a family of kⁿ open balls with radius ǫ.
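The counting in this proof is easy to reproduce (a Python sketch; the helper name balls_needed is ours):

```python
import math

# The covering step in the proof of 2.8.7: the cube Q_R = [-R, R]^n can be
# covered by k^n subcubes of side 2R/k; choosing k > R*sqrt(n)/eps makes each
# subcube's diameter 2R*sqrt(n)/k smaller than 2*eps, so each subcube fits in
# an open ball of radius eps about its center.
def balls_needed(R, n, eps):
    k = math.floor(R * math.sqrt(n) / eps) + 1   # smallest integer > R*sqrt(n)/eps
    diameter = 2.0 * R * math.sqrt(n) / k
    assert diameter < 2.0 * eps
    return k**n

# Covering [-1, 1]^2 by eps-balls, for a few tolerances:
counts = [balls_needed(1.0, 2, eps) for eps in (1.0, 0.5, 0.1)]
assert counts == [4, 9, 225]
```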
2.8.8 Theorem. In a metric space (X, d), let F and K be subsets of X such that
F is closed, K is compact, and F ⊂ K. Then F is compact.
K = F ∪ (K − F ) ⊂ (∪G∈G G) ∪ (X − F ).
Since G ∪ {X − F } is a family of open sets and K is compact, this implies that there
exists a finite subfamily Gf of G such that
K ⊂ (∪G∈Gf G) ∪ (X − F )
2.8.9 Proposition. Let (X, d) be a metric space and {S1, ..., Sn} a finite family of
compact subsets of X. Then ∪_{k=1}^{n} Sk is a compact subset of X.
Proof. Let G be a family of open subsets of X which is a cover of ∪_{k=1}^{n} Sk. Then, for
each k ∈ {1, ..., n}, G is also a cover of Sk, and hence there exists a finite subfamily
Gk of G which is a cover of Sk. Hence, ∪_{k=1}^{n} Gk is a finite subfamily of G which is a
cover of ∪_{k=1}^{n} Sk. This proves that ∪_{k=1}^{n} Sk is compact.
2.8.10 Proposition. Let (X, d) and (X̃, d̃) be metric spaces. If S and S̃ are compact
subsets in (X, d) and (X̃, d̃) respectively, then S × S̃ is a compact subset in the
product metric space (X × X̃, d × d̃).
Proof. Let S and S̃ be compact subsets in (X, d) and (X̃, d̃) respectively, and let
{(xn, x̃n)} be a sequence in S × S̃. Since S is compact, there is a subsequence {xnk}
of the sequence {xn} which converges to a point x of S. Since {x̃nk} is a sequence
in S̃ and S̃ is compact, there is a subsequence {x̃nkl} of {x̃nk} which converges to
a point x̃ of S̃. Now, the subsequence {xnkl} of {xnk} converges to x (cf. 2.1.7b).
Then, by 2.7.3a, {(xnkl, x̃nkl)} is a subsequence of {(xn, x̃n)} which converges (with
respect to d × d̃) to (x, x̃), which is a point of S × S̃. This proves that S × S̃ is
compact.
Proof. Let {G̃i }i∈I be a family (which for convenience we denote as an indexed
family) of open subsets of X̃ such that ϕ(S) ⊂ ∪i∈I G̃i . By 2.4.3 and 2.2.5,
Since S is compact, this implies that there is a finite subset If of I such that
S ⊂ ∪i∈If Gi ,
Proof. Assume Dϕ compact and let ǫ > 0 be given. Since ϕ is continuous, the
following condition is satisfied:
∀x ∈ Dϕ, ∃δx,ǫ > 0 s.t. [y ∈ Dϕ and d(x, y) < δx,ǫ] ⇒ d̃(ϕ(x), ϕ(y)) < ǫ/2.
The family {B(x, δx,ǫ/2)}x∈Dϕ is a family of open sets and Dϕ ⊂ ∪x∈Dϕ B(x, δx,ǫ/2).
Since Dϕ is compact, there exists a finite subset {x1, ..., xn} of Dϕ such that
Dϕ ⊂ ∪_{i=1}^{n} B(xi, δxi,ǫ/2). We define
δǫ := min{δxi,ǫ/2 : i = 1, ..., n}
and we have δǫ > 0.
Let now x and y be points in Dϕ such that d(x, y) < δǫ. There is k ∈ {1, ..., n}
such that x ∈ B(xk, δxk,ǫ/2); then we have d̃(ϕ(x), ϕ(xk)) < ǫ/2 since d(x, xk) < δxk,ǫ;
we also have d̃(ϕ(xk), ϕ(y)) < ǫ/2 since
d(xk, y) ≤ d(xk, x) + d(x, y) < δxk,ǫ/2 + δǫ ≤ δxk,ǫ;
thus we have
d̃(ϕ(x), ϕ(y)) ≤ d̃(ϕ(x), ϕ(xk)) + d̃(ϕ(xk), ϕ(y)) < ǫ/2 + ǫ/2 = ǫ.
This proves that ϕ is uniformly continuous.
2.8.16 Theorem. In a compact metric space (X, d), let F be a closed set,
{G1, ..., GN} a finite family of open sets, and F ⊂ ∪_{n=1}^{N} Gn. Then there exists
a family {ψ1, ..., ψN} of functions ψn : X → [0, 1] such that ψn ≺ Gn for all
n ∈ {1, ..., N} and such that ∑_{n=1}^{N} ψn(x) = 1 for all x ∈ F.
For k ∈ {1, ..., N}, Hk is closed (cf. 2.3.2 and 2.3.7) and Hk ⊂ Gk since rx,k has
been chosen in such a way that
K(x, rx,k/2) ⊂ B(x, rx,k) ⊂ Gk, ∀(x, k) ∈ Ik;
then by 2.5.11 there exists a function ϕk : X → [0, 1] such that Hk ≺ ϕk ≺ Gk .
Define:
ψ1 := ϕ1 ,
ψ2 := (1X − ϕ1 )ϕ2 ,
..
.
ψN := (1X − ϕ1 )(1X − ϕ2 ) · · · (1X − ϕN −1 )ϕN .
For every n ∈ {1, ..., N }, ψn is a continuous function and 0 ≤ ψn (x) ≤ 1 for all
x ∈ X (since ψn is a product of continuous functions with values in [0, 1]), and
also supp ψn ⊂ Gn (since clearly supp ψn ⊂ supp ϕn ). Thus, ψn ≺ Gn . It is easily
verified, by induction, that
∑_{n=1}^{N} ψn = 1X − (1X − ϕ1)(1X − ϕ2) · · · (1X − ϕN). (4)
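Identity (4) is a telescoping identity and can be checked pointwise on sample values of the ϕn's (a Python sketch; since each x ∈ F lies in some Hk, where ϕk takes the value 1, the right-hand side equals 1 on F):

```python
# The telescoping identity behind 2.8.16: with psi_1 = phi_1 and
# psi_n = (1-phi_1)...(1-phi_{n-1}) phi_n, the psi_n sum pointwise to
# 1 - (1-phi_1)...(1-phi_N). Checked on sample values in [0, 1].
def psi_values(phis):
    out, carry = [], 1.0             # carry = (1-phi_1)...(1-phi_{n-1})
    for p in phis:
        out.append(carry * p)
        carry *= (1.0 - p)
    return out, carry

for phis in ([0.2, 0.7, 1.0], [0.0, 0.0, 0.5], [1.0, 0.3]):
    psis, rest = psi_values(phis)
    assert abs(sum(psis) - (1.0 - rest)) < 1e-12
    # if some phi_k(x) = 1 (as happens for x in F), the psi's sum to 1
    if any(p == 1.0 for p in phis):
        assert abs(sum(psis) - 1.0) < 1e-12
```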
2.9 Connectedness
2.9.1 Definition. A metric space (X, d) is said to be connected if there does not
exist a pair of non-empty open sets G1 and G2 such that
G1 ∩ G2 = ∅ and G1 ∪ G2 = X.
A non-empty subset S of X is said to be connected if the metric subspace (S, dS )
is connected.
2.9.2 Proposition. For a metric space (X, d), the following conditions are equiv-
alent:
(a) (X, d) is connected;
(b) the only subsets of X which are both open and closed are ∅ and X;
(c) there does not exist a pair of non-empty closed sets F1 and F2 such that
F1 ∩ F2 = ∅ and F1 ∪ F2 = X.
Proof. (not a)⇒(not b): Let G1 and G2 be non-empty open sets such that
G1 ∩ G2 = ∅ and G1 ∪ G2 = X.
Then G2 = X − G1 and hence G2 is closed and G2 ≠ X (for otherwise G1 = ∅).
(not b) ⇒ (not c): Let S be a subset of X which is both open and closed and
such that S ≠ ∅ and S ≠ X. Then S′ := X − S is non-empty and closed and
S ∩ S′ = ∅ and S ∪ S′ = X.
(not c)⇒(not a): Let F1 and F2 be non-empty closed sets such that
F1 ∩ F2 = ∅ and F1 ∪ F2 = X.
Then G1 := X − F1 and G2 := X − F2 are non-empty (G1 = ∅ would imply F2 = ∅
since F2 = X − F1 , and similarly for G2 ) open sets and
G1 ∩ G2 = X − (F1 ∪ F2 ) = ∅ and G1 ∪ G2 = X − (F1 ∩ F2 ) = X.
Proof. a ⇒ b: We shall prove (not b)⇒(not a). Suppose that S is neither R nor
an interval nor a singleton set. Then there exist x, y, z ∈ R such that
x < y < z, x, z ∈ S, y ∉ S.
Then,
S = ((−∞, y) ∩ S) ∪ ((y, ∞) ∩ S)
and (−∞, y) ∩ S and (y, ∞) ∩ S are two disjoint non-empty sets which are open in
the metric space (S, dS ) (cf. 2.2.5). Therefore S is not connected.
b ⇒ a: If S is a singleton set then it is obviously connected. Then suppose that
S is R or an interval. We shall prove by contradiction that S is connected. Suppose
to the contrary that S is not connected. Then (cf. 2.9.2) there exist two non-empty
subsets T1 and T2 of S which are closed in the metric space (S, dS ) and such that
T1 ∩ T2 = ∅ and T1 ∪ T2 = S.
Since T1 and T2 are non-empty we can choose x1 ∈ T1 and x2 ∈ T2 . Since T1 and
T2 are disjoint, x1 6= x2 and (by altering our notation if necessary) we may assume
that x1 < x2 . Since S is R or an interval, [x1 , x2 ] ⊂ S and each point in [x1 , x2 ] is
in either T1 or T2 . Since [x1 , x2 ] ∩ T1 6= ∅, we can define
y := sup([x1 , x2 ] ∩ T1 ).
It is clear that x1 ≤ y ≤ x2, so y ∈ S. By definition of the l.u.b. (cf. 1.1.5), for
each n ∈ N we can choose zn ∈ [x1, x2] ∩ T1 such that y − 1/n < zn; thus we have a
sequence {zn} in T1 such that y − 1/n < zn ≤ y; since T1 is closed in the metric space
(S, dS) and dS is a restriction of dR, this proves that y ∈ T1 (cf. 2.3.4). Since T1
and T2 are disjoint, this implies y < x2. For each n ∈ N such that y + 1/n ≤ x2 we
have y + 1/n ∈ [x1, x2] and hence y + 1/n ∈ S, and then either y + 1/n ∈ T1 or y + 1/n ∈ T2;
however y + 1/n ∈ T1 would imply y + 1/n ∈ [x1, x2] ∩ T1 and this would contradict the
definition of y; therefore, y + 1/n ∈ T2; thus, the sequence {y + 1/n} is in T2 for n large
2.9.4 Theorem. Let (X, d) and (X̃, d̃) be metric spaces and let ϕ : X → X̃ be a
continuous mapping. If (X, d) is connected then Rϕ is a connected subset of X̃.
ϕ⁻¹(G2) are non-empty, and they are open sets in (X, d) (cf. 2.4.3). Moreover,
ϕ⁻¹(G1) ∩ ϕ⁻¹(G2) = ϕ⁻¹((G1 ∩ Rϕ) ∩ (G2 ∩ Rϕ)) = ϕ⁻¹(∅) = ∅,
Proof. The statement follows immediately from 2.9.4 and 2.9.3 (a ⇒ b).
2.9.7 Definition. Let (X, d) be a metric space. Two subsets S1 and S2 of X are
said to be separated from one another if S̄1 ∩ S̄2 = ∅.
2.9.8 Theorem. Let (X, d) be a metric space and suppose that there exists a family
F of subsets of X such that:
(a) each element of F is connected;
(b) ∪S∈F S = X;
(c) no two elements of F are separated from one another.
Then (X, d) is connected.
Proof. Let T be a subset of X which is both open and closed. We shall show that
T is either empty or equal to all of X. In view of 2.9.2, this will prove the
statement.
Each element of F is connected (cf. a), so for any S ∈ F we know (cf. 2.9.2)
that T ∩ S is either empty or all of S, since T ∩ S is both open and closed in the
metric subspace (S, dS ) (cf. 2.2.5 and 2.3.3).
If T ∩ S = ∅ for all S ∈ F, then T = ∪S∈F (T ∩ S) = ∅ (cf. b).
The other possibility is that there exists S0 ∈ F such that T ∩ S0 ≠ ∅. Then
T ∩ S0 = S0, i.e. S0 ⊂ T. If S0 is the only element of F, this gives T = X (cf. b).
If not, let S be an element of F different from S0; if T ∩ S = ∅ then S ⊂ X − T, and
hence S̄ ⊂ X − T since X − T is closed; therefore, S̄0 ∩ S̄ = ∅ (we have S̄0 ⊂ T since
T is closed); however, this is not possible since no two elements of F are separated
from one another (cf. c); thus, we must have T ∩ S ≠ ∅ and hence T ∩ S = S. Hence
we have T ∩ S = S for all S ∈ F, and hence T = ∪S∈F (T ∩ S) = ∪S∈F S = X (cf.
b).
2.9.9 Theorem. Let (X, d) and (X̃, d̃) be connected metric spaces. Then the prod-
uct metric space (X × X̃, d × d̃) is connected.
For all (x, x̃), (x′, x̃′) ∈ X × X̃, ({x} × X̃) ∪ (X × {x̃}) and ({x′} × X̃) ∪ (X × {x̃′})
are not separated from one another since
(x, x̃′) ∈ ({x} × X̃) ∩ (X × {x̃′}).
Therefore (X × X̃, d × d̃) is connected by 2.9.8.
by 2.9.9, and hence (Rⁿ⁺¹, dn+1) is connected by 2.9.5 since there exists an obvious
isomorphism from (Rⁿ × R, dn × dR) onto (Rⁿ⁺¹, dn+1). This concludes the proof
by induction.
Chapter 3
Our main purpose is to study operators in Hilbert spaces, which are in fact linear
operators in linear spaces. Hence the subject of this chapter. Throughout the
chapter, K stands for a field. By 0 and 1 we denote the zero and unit elements of
K.
3.1.1 Definition. A linear space over K (or, simply, a linear space) is a triple
(X, σ, µ), where X is a non-empty set, σ is a mapping σ : X × X → X, µ is a
mapping µ : K × X → X, and the conditions listed under ls1 and ls2 are satisfied.
(ls1 ) (X, σ) is an abelian group; i.e., with the shorthand notation f + g := σ(f, g),
we have:
f + (g + h) = (f + g) + h, ∀f, g, h ∈ X,
∃0X ∈ X s.t. f + 0X = f , ∀f ∈ X,
∀f ∈ X, ∃f ′ ∈ X s.t. f + f ′ = 0X ,
f + g = g + f , ∀f, g ∈ X;
we recall (cf. 1.3.1) that 0X is the only element of X s.t. f + 0X = f for all
f ∈ X, that for f ∈ X there is only one element f ′ of X s.t. f + f ′ = 0X
and that it is denoted by −f , and that we write f − g := f + (−g).
(ls2 ) With the shorthand notation αf := µ(α, f ), we have:
α(βf ) = (αβ)f , ∀α, β ∈ K, ∀f ∈ X,
(α + β)f = αf + βf , ∀α, β ∈ K, ∀f ∈ X,
α(f + g) = αf + αg, ∀α ∈ K, ∀f, g ∈ X,
1f = f , ∀f ∈ X.
The elements of K are called scalars, and will be preferably denoted by the small
Greek letters α, β, γ, .... The elements of X are called vectors, and will be preferably
denoted by the italics f, g, h, .... The composition law σ is called vector sum and
the composition law µ is called scalar multiplication. Another name for a linear
space is vector space.
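The axioms ls1 and ls2 form a finite list of identities, so for a concrete candidate space they can be spot-checked mechanically. The following sketch (an illustration of ours, not part of the text) samples random vectors in X = R² over K = R with the usual componentwise operations; a finite random check is of course no substitute for a proof.

```python
import numpy as np

# Spot-check of the linear-space axioms of 3.1.1 for X = R^2 over K = R.
# (Illustrative only: a finite sample cannot prove the axioms.)
rng = np.random.default_rng(0)
f, g, h = rng.normal(size=(3, 2))          # three random vectors
alpha, beta = rng.normal(size=2)           # two random scalars
zero = np.zeros(2)                         # the zero vector 0_X

# (ls1) abelian-group axioms
assert np.allclose(f + (g + h), (f + g) + h)    # associativity
assert np.allclose(f + zero, f)                 # zero element
assert np.allclose(f + (-f), zero)              # opposite element
assert np.allclose(f + g, g + f)                # commutativity

# (ls2) scalar-multiplication axioms
assert np.allclose(alpha * (beta * f), (alpha * beta) * f)
assert np.allclose((alpha + beta) * f, alpha * f + beta * f)
assert np.allclose(alpha * (f + g), alpha * f + alpha * g)
assert np.allclose(1.0 * f, f)
print("all eight axioms hold on the sample")
```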
3.1.2 Theorem. Let X be a linear space over K. Then the following statements hold:
(a) 0f = 0X , ∀f ∈ X;
(b) α0X = 0X , ∀α ∈ K;
(c) if α ∈ K and f ∈ X are such that αf = 0X , then α ≠ 0 ⇒ f = 0X (or equivalently f ≠ 0X ⇒ α = 0);
(d) (−1)f = −f , ∀f ∈ X;
(e) (−α)f = −(αf ), ∀α ∈ K, ∀f ∈ X (hence we will write −αf := (−α)f ).
3.1.3 Definition. A non-empty subset M of a linear space X is called a linear manifold in X if the following conditions hold:
(lm1 ) f, g ∈ M ⇒ f + g ∈ M ;
(lm2 ) (α ∈ K and f ∈ M ) ⇒ αf ∈ M .
3.1.4 Remarks.
(a) In any linear space X there are two trivial linear manifolds: {0X } and X.
(b) If M is a linear manifold in a linear space (X, σ, µ) and N is a non-empty
subset of M , then N is a linear manifold in (X, σ, µ) iff N is a linear manifold
in (M, σM×M , µK×M ).
(c) Conditions lm1 and lm2 of 3.1.3 are equivalent to the one condition:
(lm) (α, β ∈ K and f, g ∈ M ) ⇒ αf + βg ∈ M .
3.1.8 Definition. If S1 and S2 are subsets of a linear space X, we define their sum
S1 + S2 by
S1 + S2 := {f1 + f2 : f1 ∈ S1 and f2 ∈ S2 }.
It is immediate to see that, if S1 and S2 are linear manifolds in X, then S1 + S2 is
a linear manifold in X.
3.1.9 Definition. Let X and Y be linear spaces over the same field K. It is
immediate to see that the set X × Y becomes a linear space over K if we define
vector sum and scalar multiplication by the rules:
(f1 , g1 ) + (f2 , g2 ) := (f1 + f2 , g1 + g2 ), ∀(f1 , g1 ), (f2 , g2 ) ∈ X × Y ;
α(f, g) := (αf, αg), ∀α ∈ K, ∀(f, g) ∈ X × Y.
This linear space is called the sum of the linear spaces X, Y and is denoted by
X + Y . It is immediate to see that the two subsets of X × Y
X̂ := {(f, 0Y ) : f ∈ X} and Ŷ := {(0X , g) : g ∈ Y }
are linear manifolds in X + Y and that X × Y = X̂ + Ŷ , with X̂ + Ŷ defined as in
3.1.8.
3.1.10 Examples.
(a) Let x denote a point (of any set). Define X := {x}, and vector sum and scalar
multiplication by the rules:
σ(x, x) := x, µ(α, x) := x for all α ∈ K.
The triple (X, σ, µ) defined in this way is a trivial linear space, which is called
a zero linear space. If X is a zero linear space, we have X = {0X }.
(b) Define X := K, and vector sum and scalar multiplication by the rules:
σ(z1 , z2 ) := z1 + z2 , ∀z1 , z2 ∈ K,
µ(α, z) := αz, ∀α, z ∈ K
(where z1 + z2 and αz are the sum and the product that are defined in the field
K). The triple (X, σ, µ) defined in this way is a linear space over K, which is
called the linear space K.
(c) Let X be a non-empty set and let F (X) denote the family of all the functions
from X to C that have the whole of X as their domains, i.e. the family of
complex functions on X. Define the mappings
σ : F (X) × F (X) → F (X)
(ϕ, ψ) ↦ σ(ϕ, ψ) := ϕ + ψ,
µ : C × F (X) → F (X)
(α, ϕ) ↦ µ(α, ϕ) := αϕ,
where ϕ + ψ and αϕ are defined as in 1.2.19. It is immediate to check that
(F (X), σ, µ) is a linear space over C (hence, the symbols ϕ + ψ and αϕ defined
in 1.2.19 are in agreement with the shorthand notations used in 3.1.1), with the
function
0X : X → C
x ↦ 0X (x) := 0
(cf. 1.2.19) as zero element, and the function −ϕ (cf. 1.2.19) as the opposite
element of an element ϕ of F (X).
(d) Let X be a non-empty set, and let FB (X) denote the set of all bounded elements
of F (X):
FB (X) := {ϕ ∈ F (X) : ∃mϕ ∈ [0, ∞) such that
|ϕ(x)| ≤ mϕ for all x ∈ X}.
It is immediate to check that FB (X) is a linear manifold in F (X).
(e) Let (X, d) be a metric space, and define
C(X) := {ϕ ∈ F (X) : ϕ is continuous}.
Since a linear combination (cf. 3.1.12) of continuous functions is a continuous
function, C(X) is a linear manifold in F (X).
We also define
CB (X) := C(X) ∩ FB (X),
which is a linear manifold in F (X) by 3.1.5, and hence in FB (X) by 3.1.4b.
If (X, d) is a compact metric space, we have C(X) = CB (X) by 2.8.14.
(f) For a, b ∈ R such that a < b, the family of functions C(a, b) := C([a, b]) is a
linear manifold in FB ([a, b]) since [a, b] is compact (cf. 2.3.7 and 2.8.7).
By C¹(a, b) we denote the set of all the elements of C(a, b) that are differentiable at all points of [a, b] and such that their derivatives (cf. 1.2.21 and 2.7.6) are elements of C(a, b) (differentiability and derivatives at a and b are one-sided).
Since a linear combination of differentiable functions is differentiable and its derivative is the linear combination of the derivatives, C¹(a, b) is a linear manifold in C(a, b).
(g) By Cc (R) we denote the family of all continuous complex functions on R whose
support is compact, i.e. we define
Cc (R) := {ϕ ∈ C(R) : supp ϕ is compact}
(for supp ϕ cf. 2.5.9). From 2.8.6 and 2.8.7, for ϕ ∈ C(R) we have
ϕ ∈ Cc (R) ⇔ ∃aϕ , bϕ ∈ R s.t. aϕ < bϕ and supp ϕ ⊂ [aϕ , bϕ ].
(h) By C∞ (R) we denote the subset of F (R) defined by
C∞ (R) := {ϕ ∈ F (R) : ϕ is infinitely differentiable at all points of R}.
Clearly, C∞ (R) is a linear manifold in F (R).
Next, we define the Schwartz space of functions of rapid decrease:
S(R) := {ϕ ∈ C∞ (R) : lim_{x→±∞} x^k ϕ^{(l)} (x) = 0, ∀k = 0, 1, 2, ..., ∀l = 0, 1, 2, ...},
where ϕ^{(l)} denotes the l-th derivative of ϕ (and ϕ^{(0)} := ϕ).
The following properties of S(R) are easily checked:
(1) ϕ ∈ S(R) ⇒ ϕ^{(l)} ∈ S(R), ∀l ∈ N;
(2) ϕ ∈ S(R) ⇒ ϕ ∈ S(R);
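As a numerical aside (ours, not the book's), the Gaussian ϕ(x) = e^{−x²} is the standard example of an element of S(R): every product x^k ϕ^{(l)}(x) dies off at ±∞. The sketch below checks this for l = 0, 1 at a few sample points, using the explicit derivative ϕ′(x) = −2x e^{−x²}.

```python
import numpy as np

# The Gaussian belongs to S(R): x^k phi^(l)(x) is tiny for large |x|.
# (Illustration only; we check l = 0, 1 at sample points, not the limit.)
phi  = lambda x: np.exp(-x**2)             # phi(x)  = e^{-x^2}
dphi = lambda x: -2.0 * x * np.exp(-x**2)  # phi'(x) = -2x e^{-x^2}

for k in range(6):
    for x in (10.0, -10.0, 50.0):
        assert abs(x**k * phi(x))  < 1e-20
        assert abs(x**k * dphi(x)) < 1e-20
print("rapid decrease confirmed on the sample points")
```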
We will see now just the few facts about linear independence and linear dimen-
sion that will be used in our study of Hilbert spaces. Thus, our treatment of these
subjects will be nowhere near complete.
Proof. The first statement follows from 3.1.2b,c. Assume then n > 1 and f1 6= 0X .
If there exists k ∈ {2, ..., n} such that fk = ∑_{i=1}^{k−1} αi fi , with αi ∈ K for i = 1, ..., k − 1, then ∑_{i=1}^{k−1} αi fi − 1fk = 0X , and this proves that S is linearly dependent.
Assume now that S is linearly dependent; this means that there are a non-empty subset {fi1 , ..., fir } of S and (α1 , ..., αr ) ∈ K^r such that (α1 , ..., αr ) ≠ (0, ..., 0) and ∑_{l=1}^{r} αl fil = 0X ; hence there is (β1 , ..., βn ) ∈ K^n such that (β1 , ..., βn ) ≠ (0, ..., 0) and ∑_{i=1}^{n} βi fi = 0X . If βk is the last non-zero element in (β1 , ..., βn ), then k > 1 (since f1 ≠ 0X ) and fk = ∑_{i=1}^{k−1} (−βi /βk )fi .
3.1.15 Theorem. Let X be a non-zero linear space and assume that there exists
a linear basis in X which is finite and contains n elements. Then every linearly
independent subset of X which contains n elements is a linear basis.
Proof. Let {e1 , ..., en } be the linear basis in X of which we assume the existence,
and let {f1 , ..., fn } be a linearly independent subset of X.
Since f1 is a linear combination of the ei ’s, the set
S1 := {f1 , e1 , ..., en }
is linearly dependent; then, by 3.1.14, there is one of the ei ’s, say ei1 , which is
a linear combination of the vectors that precede it in S1 ; if we delete ei1 , the
remaining set
(we have written as if i1 < i2 , but it could well be the other way round) is still (as
it was S1′ ) such that X = LS2′ . Continuing in this way, in the end we are left with
the set
which is such that X = LSn′ . This proves that {f1 , ..., fn } is a linear basis.
3.1.16 Corollary. Let X be a non-zero linear space and assume that there exists
a linear basis in X which is finite and contains n elements. Then every linearly
independent subset of X is finite and contains at most n elements.
Proof. Our proof is by contradiction. Assume that there exists a linearly indepen-
dent subset S of X which is either infinite or finite with more than n elements. In
both cases, there is a subset {f1 , ..., fn , fn+1 } of S which contains n + 1 elements.
As any subset of S, {f1 , ..., fn , fn+1 } is linearly independent. However, {f1 , ..., fn }
is also linearly independent, hence a linear basis by 3.1.15. But then fn+1 is a
linear combination of f1 , ..., fn , and this contradicts the linear independence of
{f1 , ..., fn , fn+1 }.
3.1.17 Corollary. If in a linear space X there exists a finite linear basis, then
every other linear basis in X is finite and contains the same number of elements.
Proof. Assume that B1 is a linear basis in X which is finite and contains n elements,
and let B2 be another linear basis in X. Since B2 is a linearly independent subset
of X, by 3.1.16 B2 is finite and contains m elements with m ≤ n. Then, since B2
is a finite linear basis with m elements and B1 is a linearly independent subset of
X, by 3.1.16 we have n ≤ m. Thus, m = n.
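In the finite-dimensional case these statements are easy to probe numerically. The sketch below (our illustration; it assumes X = R³, where linear independence of n vectors amounts to the matrix with those columns having rank n) exhibits two bases with the same number of elements, in line with 3.1.15-3.1.17, and shows that four vectors cannot be independent.

```python
import numpy as np

# Independence test in R^3: n vectors are linearly independent iff the
# matrix having them as columns has rank n. (Illustration; X = R^3 assumed.)
def is_independent(vectors):
    return np.linalg.matrix_rank(np.column_stack(vectors)) == len(vectors)

e = [np.eye(3)[:, i] for i in range(3)]               # standard basis, n = 3
f = [np.array([1., 1., 0.]),
     np.array([0., 1., 1.]),
     np.array([1., 0., 1.])]                          # another independent triple

assert is_independent(e) and is_independent(f)
# f also spans R^3 (its matrix has rank 3), so by 3.1.15 it is a second
# basis, and by 3.1.17 it has the same cardinality as e.
assert np.linalg.matrix_rank(np.column_stack(f)) == 3
# By 3.1.16, no four vectors of R^3 are linearly independent.
assert not is_independent(f + [np.array([2., 3., 4.])])
```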
3.1.19 Proposition. Let {fn } be a sequence in a linear space X, and assume that
there exists n ∈ N such that fn 6= 0X . Then there exists an N -tuple (fn1 , ..., fnN )
or a subsequence {fnk } such that, letting I := {1, ..., N } or I := N, {fnk }k∈I is a
linearly independent subset of X and L{fnk }k∈I = L{fn}n∈N .
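The selection asserted by 3.1.19 can be imitated by a greedy scan in the finite-dimensional setting (our sketch, with X = R³, not the book's proof): keep fn exactly when it is not a linear combination of the vectors already kept; the kept vectors are then linearly independent and have the same linear span as the whole sequence.

```python
import numpy as np

# Greedy sifting of a (finite) sequence in R^3 into an independent family
# with the same linear span. (Illustration of 3.1.19 in finite dimension.)
def sift(seq):
    kept = []
    for v in seq:
        candidate = kept + [v]
        # keep v only if it enlarges the span, i.e. the rank grows
        if np.linalg.matrix_rank(np.column_stack(candidate)) == len(candidate):
            kept.append(v)
    return kept

seq = [np.array([1., 0., 0.]), np.array([2., 0., 0.]),
       np.array([0., 1., 0.]), np.array([1., 1., 0.])]
kept = sift(seq)
assert len(kept) == 2                       # f_1 and f_3 survive
# same span: adjoining any original vector does not raise the rank
for v in seq:
    assert np.linalg.matrix_rank(np.column_stack(kept + [v])) == len(kept)
```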
3.2.1 Definition. Let X and Y be linear spaces over the same field K. A linear
operator (or, simply, an operator ) from X to Y is a mapping A from X to Y , i.e.
A : DA → Y with DA ⊂ X (the set DA is called the domain of A, cf. 1.2.1), which
has the following properties:
(lo1 ) DA is a linear manifold in the linear space X;
(lo2 ) A(f + g) = Af + Ag, ∀f, g ∈ DA ;
(lo3 ) A(αf ) = αAf , ∀α ∈ K, ∀f ∈ DA .
By tradition, linear operators are denoted by capital letters, and the value A(f ) of
a linear operator A at f ∈ DA is written as Af .
When X = Y , a linear operator A from X to X is called a linear operator in
X, and on X if DA = X.
When Y is the linear space K (cf. 3.1.10b), a linear operator from X to K is
called a linear functional.
We point out that conditions lo2 and lo3 are consistent only when condition lo1
is assumed, and that conditions lo1 , lo2 , lo3 are equivalent to the one condition:
(lo) A(αf + βg) = αAf + βAg, ∀α, β ∈ K, ∀f, g ∈ DA .
We denote by the symbol O(X, Y ) the family of all linear operators from X to Y .
If X = Y , we write O(X) := O(X, X).
For A ∈ O(X, Y ), the null space (also called the kernel ) of A is the subset of X
defined by
NA := {f ∈ X : f ∈ DA and Af = 0Y }.
3.2.2 Definition. Let X and Y be linear spaces over the same field K, and let
A ∈ O(X, Y ). We have:
Proof. a: By its definition, RA can never be the empty set. Moreover we have:
(α, β ∈ K and g1 , g2 ∈ RA ) ⇒
[∃f1 , f2 ∈ DA s.t. g1 = Af1 and g2 = Af2 , and hence s.t.
αf1 + βf2 ∈ DA and αg1 + βg2 = αAf1 + βAf2 = A(αf1 + βf2 )] ⇒
αg1 + βg2 ∈ RA .
3.2.3 Definition. Let X and Y be linear spaces over the same field.
For A, B ∈ O(X, Y ), in agreement with 1.2.5 we write A ⊂ B (or B ⊃ A) if
f ∈ DA ⇒ (f ∈ DB and Af = Bf ),
and we have A = B iff
A ⊂ B and DB ⊂ DA .
For A ∈ O(X, Y ) and a subset M of DA , one can see at once that the restriction
AM of A to M (cf. 1.2.5) is a linear operator iff M is a linear manifold.
3.2.6 Theorem. Let X and Y be linear spaces over the same field K, and let
A ∈ O(X, Y ). We have:
(a) A is injective iff NA = {0X };
(b) if A is injective, then the mapping A−1 is a linear operator, i.e. A−1 ∈ O(Y, X),
and we have A−1 A = 1DA and AA−1 = 1RA .
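For a concrete instance of 3.2.6 (ours, not the book's), take for A an invertible 2 × 2 real matrix acting on R²: the null space reduces to {0}, the inverse mapping is again linear, and A⁻¹A, AA⁻¹ act as identities (here DA = RA = R²).

```python
import numpy as np

# An injective operator on R^2 given by an invertible matrix, and the
# linearity of its inverse mapping (illustration of 3.2.6).
A = np.array([[2., 1.],
              [0., 3.]])
Ainv = np.linalg.inv(A)

# N_A = {0_X}: full rank means only the trivial solution of Af = 0
assert np.linalg.matrix_rank(A) == 2

rng = np.random.default_rng(1)
f, g = rng.normal(size=(2, 2))
alpha, beta = 2.0, -0.5
# linearity of A^{-1}
assert np.allclose(Ainv @ (alpha * f + beta * g),
                   alpha * (Ainv @ f) + beta * (Ainv @ g))
# A^{-1}A = 1 on D_A and AA^{-1} = 1 on R_A
assert np.allclose(Ainv @ (A @ f), f)
assert np.allclose(A @ (Ainv @ f), f)
```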
3.2.8 Definitions. Let X and Y be linear spaces over the same field K. For
A, B ∈ O(X, Y ), we define the mapping:
A + B : DA ∩ DB → Y
f ↦ (A + B)f := Af + Bf,
which is called the sum of A and B. Recalling 3.1.5, it is immediate to see that
A + B ∈ O(X, Y ).
For α ∈ K and A ∈ O(X, Y ), we define the mapping
αA : DA → Y
f ↦ (αA)f := α(Af ).
3.2.9 Definition. Let X and Y be linear spaces over the same field K. We define
the mapping
OX,Y : X → Y
f ↦ OX,Y f := 0Y .
3.2.10 Proposition. The three binary operations defined in 3.2.4 and 3.2.8 have
the following properties
(a) If X and Y are linear spaces over the same field K, then we have:
(a1 ) A + (B + C) = (A + B) + C, ∀A, B, C ∈ O(X, Y ),
(a2 ) A + OX,Y = A, ∀A ∈ O(X, Y ),
(a3 ) A − A ⊂ OX,Y , ∀A ∈ O(X, Y ),
(a′3 ) A − A = OX,Y , ∀A ∈ O(X, Y ) s.t. DA = X,
(a4 ) A + B = B + A, ∀A, B ∈ O(X, Y ),
(a5 ) α(βA) = (αβ)A, ∀α, β ∈ K, ∀A ∈ O(X, Y ),
(a6 ) (α + β)A = αA + βA, ∀α, β ∈ K, ∀A ∈ O(X, Y ),
(a7 ) α(A + B) = αA + αB, ∀α ∈ K, ∀A, B ∈ O(X, Y ),
(a8 ) 1A = A, ∀A ∈ O(X, Y ).
(b) If W, X, Y, Z are four linear spaces over the same field K, then we have:
(b1 ) (AB)C = A(BC), ∀A ∈ O(Y, Z), ∀B ∈ O(X, Y ), ∀C ∈ O(W, X),
(b2 ) AB + AC ⊂ A(B + C), ∀A ∈ O(Y, Z), ∀B, C ∈ O(X, Y ),
(b′2 ) AB + AC = A(B + C), ∀A ∈ O(Y, Z) s.t. DA = Y, ∀B, C ∈ O(X, Y ),
(b3 ) AC + BC = (A + B)C, ∀A, B ∈ O(Y, Z), ∀C ∈ O(X, Y ),
(b4 ) (αA)B = α(AB) = A(αB), ∀α ∈ K − {0}, ∀A ∈ O(Y, Z), ∀B ∈ O(X, Y ),
(b′4 ) (0A)B = 0(AB) ⊂ A(0B), ∀A ∈ O(Y, Z), ∀B ∈ O(X, Y ),
(b5 ) 1Y A = A1X = A, ∀A ∈ O(X, Y ).
Proof. For all the relations we have to prove it is clear that, at a vector which
belongs to the intersection of the domains of the two operators which appear on the
two sides of the relation, the value of the operator on the left hand side coincides
with the value of the operator on the right hand side. Thus, in order to prove the
relations between the operators, we need only prove the same relations between
their domains. We will examine only the cases that are not completely obvious.
a1 : We have
DA+(B+C) = DA ∩ DB+C = DA ∩ (DB ∩ DC )
= (DA ∩ DB ) ∩ DC = DA+B ∩ DC = D(A+B)+C .
b1 : Cf. 1.2.17.
b2 : We have
f ∈ DAB+AC ⇒ f ∈ DAB ∩ DAC ⇒
(f ∈ DB , Bf ∈ DA , f ∈ DC , Cf ∈ DA ) ⇒
(f ∈ DB ∩ DC and Bf + Cf ∈ DA ) ⇒
(f ∈ DB+C and (B + C)f ∈ DA ) ⇒ f ∈ DA(B+C) .
b′2 : If DA = Y , we have (cf. 1.2.13e)
DAB+AC = DAB ∩ DAC = DB ∩ DC = DB+C = DA(B+C) .
b3 : We have
f ∈ DAC+BC ⇔ f ∈ DAC ∩ DBC ⇔
(f ∈ DC , Cf ∈ DA , f ∈ DC , Cf ∈ DB ) ⇔
(f ∈ DC and Cf ∈ DA ∩ DB ) ⇔
(f ∈ DC and Cf ∈ DA+B ) ⇔ f ∈ D(A+B)C .
b4 : If α ≠ 0, then DAB = DA(αB) : since DA is a linear manifold, for f ∈ DB
we have Bf ∈ DA ⇒ αBf ∈ DA , and also αBf ∈ DA ⇒ Bf = α−1 (αBf ) ∈ DA .
b′4 : We have (0A)B = 0(AB) = (OX,Y )DAB and also A(0B) = (OX,Y )DB
because 0Bf = 0Y ∈ DA for all f ∈ DB . Then, we recall that DAB ⊂ DB (cf.
1.2.13c).
3.2.11 Remark. The family O(X, Y ) with the two binary operations defined in
3.2.8, despite the symbols used to denote them, is not a linear space. In fact, OX,Y
is the only element of O(X, Y ) for which condition a2 of 3.2.10 can hold (if Õ is
another operator which satisfies that condition, then we have Õ = Õ + OX,Y =
OX,Y + Õ = OX,Y ), and for an operator A ∈ O(X, Y ) with DA ≠ X no operator
A′ ∈ O(X, Y ) can exist such that A + A′ = OX,Y , since DA+A′ ⊂ DA for all
A′ ∈ O(X, Y ). Thus, there is no opposite for any element of O(X, Y ) that is not
defined on the whole of X.
We also notice that, in condition b2 of 3.2.10, we do have AB + AC ≠ A(B + C) if for instance RB ⊄ DA and C = −B. In fact, this implies both DAB+AC = DAB ≠ DB (cf. 1.2.13d) and DA(B+C) = DB (from B + C = B − B ⊂ OX,Y we
have (B + C)f = 0Y ∈ DA for all f ∈ DB = DB+C ).
3.2.12 Definition. Let X and Y be linear spaces over the same field, and define
OE (X, Y ) := {A ∈ O(X, Y ) : DA = X}.
Thus, OE (X, Y ) is the family of all the operators from X to Y that are defined
everywhere on X. For X = Y we write OE (X) := OE (X, X).
3.2.13 Remark. All the relations that appear in 3.2.10 are equalities if
O(W, X), O(X, Y ), O(Y, Z) are replaced by OE (W, X), OE (X, Y ), OE (Y, Z) respec-
tively.
3.2.14 Theorem. Let X and Y be linear spaces over the same field K, and define
the mappings
σ : OE (X, Y ) × OE (X, Y ) → OE (X, Y )
(A, B) ↦ σ(A, B) := A + B,
µ : C × OE (X, Y ) → OE (X, Y )
(α, A) ↦ µ(α, A) := αA,
with A + B and αA defined as in 3.2.8. Then (OE (X, Y ), σ, µ) is a linear space
over K (thus, the symbols A + B and αA introduced in 3.2.8 are in agreement with
the shorthand notation introduced in 3.1.1). The zero element is the operator OX,Y
defined in 3.2.9 and the opposite element of A ∈ OE (X, Y ) is the operator −A
defined in 3.2.8.
3.2.15 Proposition. Let X and Y be linear spaces over the same field.
(a) A mapping ϕ from X to Y is a linear operator iff Gϕ is a linear manifold in
the linear space X + Y (for Gϕ , cf. 1.2.3).
(b) A linear manifold G in the linear space X + Y is the graph of a mapping from
X to Y iff G has the property:
(0X , g) ∈ G ⇒ g = 0Y .
3.3.7 Theorem. The linear space OE (X) of the operators defined on a linear space
X (cf. 3.2.12 and 3.2.14) becomes an associative algebra with identity if we define
π : OE (X) × OE (X) → OE (X)
(A, B) ↦ π(A, B) := AB,
with AB defined as in 3.2.4 (thus, the symbol AB introduced in 3.2.4 is in agreement
with the shorthand notation introduced in 3.3.1). The identity is the operator 1X
defined in 3.2.5.
3.3.8 Examples.
(a) For the linear space F (X) (cf. 3.1.10c) we define
π : F (X) × F (X) → F (X)
(ϕ, ψ) ↦ π(ϕ, ψ) := ϕψ,
Chapter 4
There are properties of a linear operator in a Hilbert space which depend only on
the relation between the operator and the norm which is generated by the inner
product of the space. Thus, in this chapter we examine what can be said about
linear operators in normed spaces. Throughout the chapter, K stands for the field
C of complex numbers or the field R of real numbers.
4.1.1 Definition. A normed space over K (or simply a normed space) is a quadru-
ple (X, σ, µ, ν), where (X, σ, µ) is a linear space over K and ν is a function ν : X → R
which, with the shorthand notation ‖f ‖ := ν(f ), has the following properties:
(no1 ) ‖f + g‖ ≤ ‖f ‖ + ‖g‖, ∀f, g ∈ X,
(no2 ) ‖αf ‖ = |α|‖f ‖, ∀α ∈ K, ∀f ∈ X,
(no3 ) ‖f ‖ = 0 ⇒ f = 0X .
The function ν is called a norm for the linear space (X, σ, µ).
4.1.4 Example. Recall that K is a linear space over K (cf. 3.1.10b). From the
properties of the absolute value in R or of the modulus in C it follows immediately
that the function
ν:K→R
z ↦ ν(z) := |z|
is a norm for the linear space K. We have dν = dK (cf. 2.1.4 and 2.7.4a).
4.1.8 Remarks.
(a) Let (X, σ, µ, ν) be a normed space and let M be a linear manifold in the linear
space (X, σ, µ). It is immediate to see that (M, σM×M , µK×M , νM ) is a normed
space. If X is a Banach space, then (M, σM×M , µK×M , νM ) is a Banach space
as well iff M is a closed set. This follows at once from 2.6.6. This partially
justifies the definition we give in 4.1.9 (which is, however, completely justified
only in the context of Banach spaces).
(b) Let {fn } be a sequence in a Banach space. The series ∑_{n=1}^∞ fn is said to be absolutely convergent if the series ∑_{n=1}^∞ ‖fn ‖ is convergent. Suppose that the series ∑_{n=1}^∞ fn is absolutely convergent. Then the series ∑_{n=1}^∞ fn is convergent as well. Indeed, if we define sn := ∑_{k=1}^n fk and σn := ∑_{k=1}^n ‖fk ‖ for each n ∈ N, then the sequence {σn } is a Cauchy sequence (cf. 2.6.2), and hence the sequence {sn } is a Cauchy sequence as well since, for n < m,
‖sm − sn ‖ ≤ ∑_{k=n+1}^m ‖fk ‖ = |σm − σn |,
and this implies that the sequence {sn } is convergent (cf. 2.6.3). Moreover, if β is a bijection from N onto N then the series ∑_{n=1}^∞ fβ(n) is convergent and for the sums we have ∑_{n=1}^∞ fβ(n) = ∑_{n=1}^∞ fn . Indeed, for any ε > 0 let Nε ∈ N be so that |σn − σm | < ε for n, m > Nε . Then ∑_{k=Nε+2}^p ‖fk ‖ < ε for all p ≥ Nε + 2, and hence ∑_{k∈I} ‖fk ‖ < ε if I is a finite set of positive integers such that k > Nε + 1 for all k ∈ I. Let Mε := max β^{−1} ({1, ..., Nε + 1}) and note
Then
‖lim_{m→∞} sm − s′n ‖ ≤ ‖lim_{m→∞} sm − sn ‖ + ‖sn − s′n ‖ < 2ε for n > max{Mε , Lε }.
This proves that the sequence {s′n } is convergent and lim_{n→∞} s′n = lim_{n→∞} sn .
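The invariance of the sum under a rearrangement β can be watched numerically (our sketch, in the Banach space R, with the absolutely convergent series ∑ 1/n²; truncation stands in for the limit, so agreement holds only up to the neglected tail).

```python
import numpy as np

# Absolute convergence and rearrangement in R: a random bijection beta of
# the index set leaves the (truncated) sum essentially unchanged.
N = 200_000
terms = 1.0 / np.arange(1, N + 1) ** 2     # the terms 1/n^2, n = 1..N

rng = np.random.default_rng(2)
beta = rng.permutation(N)                  # a bijection of {1, ..., N}
s, s_rearranged = terms.sum(), terms[beta].sum()

assert abs(s - np.pi**2 / 6) < 1e-4        # the tail beyond N is about 1/N
assert abs(s - s_rearranged) < 1e-9        # same finite set of terms
```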
Proof. We will prove that M is a linear manifold by using 2.3.10 and 3.1.4c. For
α, β ∈ K and f, g ∈ M , let {fn } and {gn } be sequences in M such that fn → f and
gn → g; then by 4.1.6b,c we have αfn + βgn → αf + βg; since αfn + βgn ∈ M , we
have αf + βg ∈ M .
Proof. The equality L{f } = {αf : α ∈ K} follows directly from 3.1.7. Then, in
view of 4.1.13, V {f } = L{f } is true if the set {αf : α ∈ K} is closed. If f = 0X ,
then {αf : α ∈ K} = {0X }, which is a closed set (cf. 2.3.5). Assume next f ≠ 0X .
Let {gn } be a sequence in {αf : α ∈ K} and g an element of X such that gn → g.
If βn is the element of K such that gn = βn f , the sequence {βn } turns out to be
a Cauchy sequence because {gn } is such (cf. 2.6.2) and f ≠ 0X . Since K is a
complete metric space, there exists β ∈ K such that βn → β, hence by 4.1.6c such
that gn → βf . Therefore, g = βf ∈ {αf : α ∈ K}. On account of 2.3.4, this proves
that {αf : α ∈ K} is a closed set.
4.1.16 Definition. Let X and Y be normed spaces over the same field, and denote
by νX and νY their norms. The function
ν : X × Y → R
(f, g) ↦ ν(f, g) := √(νX²(f ) + νY²(g))
is a norm for the linear space X + Y (cf. 3.1.9); in fact, properties no1 , no2 and
no3 of 4.1.1 follow immediately for ν from the same properties for νX and νY , using
also (for no1 ) the inequality
∀a1 , a2 , b1 , b2 ∈ C,
√(|a1 + b1 |² + |a2 + b2 |²) ≤ √(|a1 |² + |a2 |²) + √(|b1 |² + |b2 |²),
which will be proved in 10.3.8c.
The linear space X + Y with this norm ν is called the sum of the normed spaces
X and Y .
It can be seen immediately that dν = dνX × dνY . Hence, from 2.7.3d it follows
that the normed space X +Y is a Banach space iff X and Y are both Banach spaces.
4.2.1 Definition. Let X and Y be normed spaces over the same field and let
A ∈ O(X, Y ). The linear operator A is said to be bounded if it has the following
property:
∃m ∈ [0, ∞) such that ‖Af ‖ ≤ m‖f ‖ for all f ∈ DA .
For a linear operator, the importance of the condition of being bounded lies in
the fact that a linear operator is bounded iff it is continuous, as is shown by the
following theorem.
4.2.2 Theorem. Let X and Y be normed spaces over the same field. For a linear
operator A ∈ O(X, Y ), the following conditions are equivalent:
(a) A is bounded, i.e. ∃m ≥ 0 such that ‖Af ‖ ≤ m‖f ‖ for all f ∈ DA ;
(b) A is uniformly continuous;
(c) A is continuous;
(d) ∃f0 ∈ DA such that A is continuous at f0 .
Proof. a ⇒ b: Assume condition a and let ε > 0. Define δε := ε/(m + 1). Then we have
[f, g ∈ DA and ‖f − g‖ < δε ] ⇒
‖Af − Ag‖ = ‖A(f − g)‖ ≤ m‖f − g‖ < (m/(m + 1))ε < ε.
This proves that A is uniformly continuous.
b ⇒ c: This is obvious.
c ⇒ d: This is obvious.
d ⇒ a: Assume condition d. Then (setting ε := 1 in the condition of continuity of A at f0 and then δ := δ1 , cf. 2.4.1) ∃δ > 0 such that
[f ∈ DA and ‖f0 − f ‖ < δ] ⇒ ‖Af0 − Af ‖ < 1.
Then we have
g ∈ DA − {0X } ⇒ [(δ/(2‖g‖))g ∈ DA and ‖(δ/(2‖g‖))g‖ < δ] ⇒
[f0 + (δ/(2‖g‖))g ∈ DA and ‖f0 − (f0 + (δ/(2‖g‖))g)‖ < δ] ⇒
(δ/(2‖g‖))‖Ag‖ = ‖Af0 − A(f0 + (δ/(2‖g‖))g)‖ < 1 ⇒ ‖Ag‖ < (2/δ)‖g‖.
This proves condition a with m = 2/δ.
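To see that boundedness is a genuine restriction, here is a classical counterexample rendered numerically (ours, not in the text): in C(0, 2π) with the sup norm, the differentiation operator, defined on C¹ functions, admits no bound ‖Af ‖ ≤ m‖f ‖, since fn(x) = sin(nx) has sup norm 1 while its derivative has sup norm n.

```python
import numpy as np

# Differentiation is an unbounded operator for the sup norm:
# ||f_n||_sup = 1 but ||f_n'||_sup = n for f_n(x) = sin(nx).
x = np.linspace(0.0, 2.0 * np.pi, 20001)   # grid on [0, 2*pi]
for n in (1, 10, 100):
    fn  = np.sin(n * x)
    dfn = n * np.cos(n * x)                # exact derivative of sin(nx)
    ratio = np.max(np.abs(dfn)) / np.max(np.abs(fn))
    assert abs(ratio - n) < 1e-6           # the ratio ||Af||/||f|| equals n
print("||Af_n||/||f_n|| grows like n: no single bound m can work")
```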
4.2.3 Theorem. Let X and Y be normed spaces over the same field. For a linear
operator A ∈ O(X, Y ), the following conditions are equivalent:
(a) A is injective and A−1 is bounded;
(b) ∃k > 0 such that ‖Af ‖ ≥ k‖f ‖ for all f ∈ DA .
f ∈ NA ⇒ ‖Af ‖ = 0 ⇒ ‖f ‖ = 0 ⇒ f = 0X ,
so A is injective (cf. 3.2.6a). Moreover, for g ∈ DA−1 we have ‖g‖ = ‖A(A−1 g)‖ ≥ k‖A−1 g‖, and hence
‖A−1 g‖ ≤ (1/k)‖g‖, ∀g ∈ DA−1 ,
and this proves that A−1 is bounded.
4.2.4 Definition. Let X and Y be normed spaces over the same field, and let A be a bounded operator from X to Y . Then the set of non-negative real numbers
BA := {m ∈ [0, ∞) : ‖Af ‖ ≤ m‖f ‖, ∀f ∈ DA }
is non-empty, and we define the norm of A as
‖A‖ := inf BA .
4.2.5 Proposition. Let X and Y be normed spaces over the same field (with X 6=
{0X }), and let A be a bounded operator from X to Y . We have:
4.2.6 Theorem. Let X and Y be normed spaces over the same field, and let A be
a bounded operator from X to Y . Assume further that Y is a Banach space. Then
there exists one and only one operator à from X to Y with the following properties:
(a) DÃ = D̄A (the closure of DA in X);
(b) A ⊂ Ã;
(c) Ã is bounded.
We also have
(d) ‖Ã‖ = ‖A‖.
Thus, lim_{n→∞} Afn depends only on f and not on the choice of the sequence {fn } in DA , as long as fn → f . Therefore, we can define the mapping
Ã : D̄A → Y
f ↦ Ãf := lim_{n→∞} Afn if {fn } is a sequence in DA such that fn → f .
owing to 4.1.6a (used twice) and 4.2.5b. This proves that Ã has property c and that ‖A‖ ∈ BÃ , whence ‖Ã‖ ≤ ‖A‖. Since A ⊂ Ã, we also have ‖A‖ ≤ ‖Ã‖ by 4.2.5a. Thus, ‖Ã‖ = ‖A‖.
It remains to prove the uniqueness of Ã. Let B ∈ O(X, Y ) be such that DB = D̄A , A ⊂ B, B is bounded. Then, if f ∈ D̄A and {fn } is a sequence in DA such that fn → f , we have
Bf = lim_{n→∞} Bfn = lim_{n→∞} Afn = Ãf ,
4.2.7 Proposition. Let X and Y be normed spaces over the same field, and let
A, B be bounded operators from X to Y . Then the operator A + B is bounded, and
‖A + B‖ ≤ ‖A‖ + ‖B‖.
4.2.8 Proposition. Let X, Y be normed spaces over the same field K, and let A
be a bounded operator from X to Y . Then, for each α ∈ K, the operator αA is
bounded and ‖αA‖ = |α|‖A‖.
4.2.9 Proposition. Let X, Y, Z be normed spaces over the same field, and suppose
that A ∈ O(X, Y ) and B ∈ O(Y, Z) are bounded operators. Then the operator BA
is bounded and ‖BA‖ ≤ ‖B‖‖A‖.
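Propositions 4.2.7-4.2.9 can be sampled in the matrix realization of everywhere-defined operators on R⁴ (our sketch; for the Euclidean norm, the operator norm of a matrix is its largest singular value, computed here with numpy).

```python
import numpy as np

# Subadditivity, homogeneity, and submultiplicativity of the operator
# norm for matrices acting on R^4. (Random-sample illustration.)
rng = np.random.default_rng(3)
A, B = rng.normal(size=(2, 4, 4))
op = lambda M: np.linalg.norm(M, 2)        # spectral (operator) norm

assert op(A + B) <= op(A) + op(B) + 1e-12  # ||A + B|| <= ||A|| + ||B||  (4.2.7)
assert abs(op(3.0 * A) - 3.0 * op(A)) < 1e-12   # ||alpha A|| = |alpha| ||A||  (4.2.8)
assert op(B @ A) <= op(B) * op(A) + 1e-12  # ||BA|| <= ||B|| ||A||  (4.2.9)
```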
4.2.10 Definition. Let X and Y be normed spaces over the same field. We define
B(X, Y ) := {A ∈ OE (X, Y ) : A is bounded}.
For X = Y , we write B(X) := B(X, X).
4.2.11 Theorem. Let X and Y be normed spaces over the same field. We have:
(a) B(X, Y ) is a linear manifold in the linear space OE (X, Y ) (cf. 3.2.14) and the
function
νB : B(X, Y ) → R
A ↦ νB (A) := ‖A‖ := inf BA
is a norm for the linear space B(X, Y ); hence, B(X, Y ) is a normed space;
(b) if Y is a Banach space, then B(X, Y ) is also a Banach space.
Proof. a: On account of 4.2.7 and 4.2.8, conditions lm1 and lm2 of 3.1.3 hold for
B(X, Y ), and conditions no1 and no2 of 4.1.1 hold for νB . By 4.2.5b we also have,
for A ∈ B(X, Y ),
‖A‖ = 0 ⇒ [‖Af ‖ ≤ 0, ∀f ∈ X] ⇒ [Af = 0Y , ∀f ∈ X] ⇒ A = OX,Y ,
which proves that condition no3 of 4.1.1 holds for νB .
4.2.12 Remark. Let X and Y be normed spaces over the same field, and let {An }
be a sequence in B(X, Y ).
If {An } is convergent then, letting A := limn→∞ An , we have
∀f ∈ X, Af = lim An f,
n→∞
which is proved by
∀f ∈ X, ∀n ∈ N, ‖An f − Af ‖ ≤ ‖An − A‖‖f ‖.
This implies that, if the series ∑_{n=1}^∞ An is convergent, then for each f ∈ X the series ∑_{n=1}^∞ (An f ) is convergent and (∑_{n=1}^∞ An )f = ∑_{n=1}^∞ (An f ).
thus,
rn+1 ≤ 1/2^{n+1} ,
‖An+1 h‖ > n + 1 for all h ∈ B(hn+1 , rn+1 ),
B(hn+1 , rn+1 ) ⊂ B(hn , rn ).
n, m > k ⇒ hn , hm ∈ B(hk , rk ) ⇒ ‖hn − hm ‖ < 2rk ≤ 1/2^{k−1} ;
hence, the sequence {hn } is convergent since X is a complete metric space. We
define h := limn→∞ hn . For each k ∈ N we have
‖Ak h‖ ≥ k.
This proves that if proposition P is not true then the assumption of the statement
is not true.
Step 2: We prove that if P is true then the conclusion of the statement is true.
Thus, we assume that proposition P is true and we fix (g, r, n) ∈ X × (0, ∞) × N
which satisfies the condition of proposition P . For each f ∈ X such that ‖f ‖ < r,
we have g + f ∈ B(g, r) and hence
Proof. We use 2.4.2 and the following remarks. For (x, y) ∈ X × X and a sequence
(xn , yn ) in X × X we have:
dν (xn yn , xy) = ‖xn yn − xy‖ = ‖(xn yn − xyn ) + (xyn − xy)‖
≤ ‖xn − x‖‖yn ‖ + ‖x‖‖yn − y‖.
If (xn , yn ) → (x, y), then ‖xn − x‖ → 0 and ‖yn − y‖ → 0 by 2.7.3a. Besides, ‖yn − y‖ → 0 implies ‖yn ‖ → ‖y‖ by 4.1.6a, and hence the sequence {‖yn ‖} is bounded (cf. 2.1.9).
where we have used first 4.3.3 and then 4.1.6b. Then we have
(1 + ∑_{n=1}^∞ x^n )(1 − x) = 1 + ∑_{n=1}^∞ x^n − x − (∑_{n=1}^∞ x^n )x = 1.
In a similar way we can prove that (1 − x)(1 + ∑_{n=1}^∞ x^n ) = 1.
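The identity just obtained is the Neumann series for (1 − x)⁻¹ when ‖x‖ < 1. A numerical check (ours) in the Banach algebra of 4 × 4 real matrices: scale x to norm 1/2 and verify that the partial sums of 1 + ∑ xⁿ multiply 1 − x to 1 on both sides.

```python
import numpy as np

# Neumann series in B(R^4) realized as matrices: if ||x|| < 1 then
# 1 + x + x^2 + ... inverts 1 - x. (Truncated at 60 terms; the tail
# is bounded by ||x||^61.)
rng = np.random.default_rng(4)
x = rng.normal(size=(4, 4))
x *= 0.5 / np.linalg.norm(x, 2)            # force ||x|| = 1/2 < 1

I = np.eye(4)
S = I.copy()                                # partial sum 1 + x + ... + x^60
power = I.copy()
for _ in range(60):
    power = power @ x
    S += power

assert np.linalg.norm(S @ (I - x) - I, 2) < 1e-12
assert np.linalg.norm((I - x) @ S - I, 2) < 1e-12
```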
Proof. Condition sa1 of 3.3.2 has been proved for B(X) in 4.2.11a, and condition
sa2 follows from 4.2.9. Thus, B(X) is a subalgebra of OE (X), and therefore it is
also an associative algebra. By 4.2.11a, B(X) is also a normed space. On account of
4.2.9, for B(X) we also have property na of 4.3.1. Thus, B(X) is a normed algebra,
and it is with identity since 1X ∈ B(X) and ‖1X ‖ = 1 (unless X is a zero space).
Indeed we have
‖1X f ‖ = ‖f ‖ ≤ 1‖f ‖, ∀f ∈ X,
which proves that 1X ∈ B(X) and 1 ∈ B1X , and hence ‖1X ‖ ≤ 1 (cf. 4.2.4). By
4.2.5b we also have
‖f ‖ = ‖1X f ‖ ≤ ‖1X ‖‖f ‖, ∀f ∈ X,
and this implies 1 ≤ ‖1X ‖ if ∃f ∈ X s.t. f ≠ 0X .
Finally, if X is a Banach space then B(X) is also a Banach space by 4.2.11b.
4.3.6 Examples.
(a) For the associative algebra FB (X) (cf. 3.3.8b), define
ν : FB (X) → R
ϕ ↦ ν(ϕ) := ‖ϕ‖∞ := sup{|ϕ(x)| : x ∈ X}.
It is easy to see that ν is a norm for the linear space FB (X) and that ν has
property na of 4.3.1. Therefore, FB (X) is a normed algebra, and it is with
identity since ‖1X ‖∞ = 1.
Actually, FB (X) is a Banach algebra. In fact, let {ϕn } be a Cauchy sequence
in FB (X); then {ϕn (x)} is a Cauchy sequence in C for every x ∈ X, and hence
we can define the function
X ∋ x ↦ ϕ(x) := lim_{n→∞} ϕn (x) ∈ C;
now, for ε > 0, let Nε ∈ N be such that ‖ϕn − ϕm ‖∞ < ε for n, m > Nε ; then we have, for n > Nε ,
∀x ∈ X, |ϕn (x) − ϕ(x)| = lim_{m→∞} |ϕn (x) − ϕm (x)| ≤ ε.
ϕ = (ϕ − ϕn ) + ϕn ∈ FB (X)
since FB (X) is a linear manifold in F (X), and second that
‖ϕn − ϕ‖∞ ≤ ε.
As a consequence, the sequence {ϕn } is convergent and its limit is ϕ. Thus,
FB (X) is a Banach space.
(b) Let (X, d) be a metric space. Then CB (X) is a subalgebra with identity of
FB (X) (cf. 3.3.8c) and hence it is an associative algebra with identity. Further,
CB (X) is a closed subset of the Banach space FB (X). Indeed, let {ϕn } be a
sequence in CB (X), let ϕ ∈ FB (X), and suppose that ‖ϕn − ϕ‖∞ → 0; for each x ∈ X and each ε > 0, let nε ∈ N be such that ‖ϕnε − ϕ‖∞ < ε/3, and let δx,ε > 0 be such that |ϕnε (x) − ϕnε (y)| < ε/3 whenever d(x, y) < δx,ε ; then we have
d(x, y) < δx,ε ⇒ |ϕ(x) − ϕ(y)| ≤ |ϕ(x) − ϕnε (x)| + |ϕnε (x) − ϕnε (y)| + |ϕnε (y) − ϕ(y)| < ε;
this shows that ϕ ∈ C(X), and hence that ϕ ∈ CB (X). Since CB (X) is a
closed linear manifold in the Banach space FB (X), CB (X) is a Banach space
(cf. 4.1.8a).
Thus, CB (X) is a Banach algebra with identity (cf. 4.3.2).
(c) Let T be the unit circle in the complex plane (T is also called the one-
dimensional torus), i.e. we define
T := {z ∈ C : |z| = 1}.
Proof. We will show that, for each ϕ ∈ C(T), there exists a sequence {pn } in P such that ‖pn − ϕ‖∞ → 0. By 2.3.12, this will prove that P̄ = C(T).
Suppose we have a sequence {qn } in P such that each qn has the following properties:
(a) 0 ≤ qn (z), ∀z ∈ T;
(b) ∫_{−π}^{π} qn (e^{is} )ds = 1 (by ∫_{−π}^{π} ...ds we denote a Riemann integral, cf. 9.3.2);
(c) for each δ ∈ (0, π), if mn (δ) := sup{qn (e^{it} ) : δ ≤ |t| ≤ π}, then lim_{n→∞} mn (δ) = 0.
Then, for each n ∈ N, we define the function
pn : T → C
z ↦ pn (z) := ∫_{−π}^{π} ϕ(e^{i(t−s)} )qn (e^{is} )ds if t ∈ (−π, π] is so that z = e^{it} .
where in the first equality we have made the change of variables s ↦ −s, in the second the change of variables s ↦ s − t, and the third holds because the integrand is a periodic function of period 2π.
Thus, pn ∈ P.
Let now ε > 0 be given. Since the function [−π, π] ∋ t ↦ ϕ(e^{it} ) ∈ C is continuous, it is uniformly continuous (cf. 2.8.7 and 2.8.15). Therefore, the function R ∋ t ↦ ϕ(e^{it} ) ∈ C is uniformly continuous since it is periodic of period 2π. Thus, ∃δε > 0 s.t. |ϕ(e^{it} ) − ϕ(e^{is} )| < ε whenever |t − s| < δε . For z ∈ T and t ∈ (−π, π] so that z = e^{it} , by property b we have
pn (z) − ϕ(z) = ∫_{−π}^{π} (ϕ(e^{i(t−s)} ) − ϕ(e^{it} ))qn (e^{is} )ds
and property a implies, assuming 0 < δε < π,
|pn (z) − ϕ(z)| ≤ ∫_{−π}^{π} |ϕ(e^{i(t−s)} ) − ϕ(e^{it} )|qn (e^{is} )ds = ∫_{−π}^{−δε} ...ds + ∫_{−δε}^{δε} ...ds + ∫_{δε}^{π} ...ds;
now, we have
Z δǫ Z π
i(t−s) it is
|ϕ(e ) − ϕ(e )|qn (e )ds ≤ ǫ qn (eis )ds = ǫ
−δǫ −π
since |(t − s) − t| < δǫ when s ∈ (−δǫ , δǫ ) and since qn has property b, and also
∫_{−π}^{−δǫ} |ϕ(e^{i(t−s)}) − ϕ(e^{it})| qn(e^{is}) ds + ∫_{δǫ}^{π} |ϕ(e^{i(t−s)}) − ϕ(e^{it})| qn(e^{is}) ds
≤ 2‖ϕ‖∞ mn(δǫ) 2(π − δǫ) < 4π‖ϕ‖∞ mn(δǫ),
where the definition of mn (δǫ ) has been used; thus we have
|pn(z) − ϕ(z)| ≤ ǫ + 4π‖ϕ‖∞ mn(δǫ).
Since this estimate is independent of z, we have
‖pn − ϕ‖∞ ≤ ǫ + 4π‖ϕ‖∞ mn(δǫ).
Recalling property c, let Nǫ ∈ N be such that 4π‖ϕ‖∞ mn(δǫ) < ǫ whenever Nǫ < n;
then we have
Nǫ < n ⇒ ‖pn − ϕ‖∞ < 2ǫ.
This proves that ‖pn − ϕ‖∞ → 0.
It remains to construct a sequence {qn} in P with properties a, b, c. Let
qn(z) := (γn/4^n)(2 + z + z^{−1})^n for every z ∈ T, with γn so that condition b
is satisfied. For z ∈ T and t ∈ (−π, π] so that z = e^{it}, we have
qn(z) = γn ((1 + cos t)/2)^n, with γn := (∫_{−π}^{π} ((1 + cos t)/2)^n dt)^{−1}.
4.4.1 Definition. Let X and Y be normed spaces over the same field and let
A ∈ O(X, Y ). The linear operator A is said to be closed if its graph GA is a closed
subset of the product of the two metric spaces X, Y (cf. 4.1.3 and 2.7.2), i.e. a
subspace of the normed space X + Y (cf. 3.2.15 and 4.1.16). From 2.3.4 and 2.7.3a
we have that A is closed iff the following condition is satisfied:
[f ∈ X, g ∈ Y, {fn } is a sequence in DA , fn → f, Afn → g] ⇒
[f ∈ DA and g = Af ].
This condition can be written in the equivalent way:
[{fn } is a sequence in DA that is convergent in X and
{Afn } is a convergent sequence in Y ] ⇒
[lim_{n→∞} fn ∈ DA and A(lim_{n→∞} fn) = lim_{n→∞} Afn].
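A standard illustration (an assumption of this note, not part of the book's text at this point, though consistent with it) is the differentiation operator, which is closed but not bounded:

```latex
% Illustrative example: X = Y = C[0,1] with the sup norm.
\[
  \|f\| := \sup_{t\in[0,1]}|f(t)|, \qquad Af := f', \qquad D_A := C^1[0,1].
\]
% A is closed: if f_n \to f and Af_n = f_n' \to g uniformly, then
% f \in C^1[0,1] and f' = g, so the graph of A is closed.
% A is unbounded: for f_n(t) := \sin(nt) one has \|f_n\| \le 1
% while \|Af_n\| = n \to \infty.
```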
4.4.2 Remark. Let X and Y be normed spaces over the same field. For a linear
operator A ∈ O(X, Y ) we have that A is bounded iff the following condition is
satisfied (cf. 4.2.2 and 2.4.2):
[f ∈ DA , {fn } is a sequence in DA , fn → f ] ⇒ [Afn → Af ].
This condition can be written in the equivalent way:
[{fn} is a sequence in DA that is convergent in X and lim_{n→∞} fn ∈ DA] ⇒
[{Afn} is a convergent sequence in Y and A(lim_{n→∞} fn) = lim_{n→∞} Afn].
Thus, for both a bounded (i.e. continuous on account of 4.2.2) operator and a closed
one there are conditions, for a convergent sequence {fn } in their domains, which
allow one to “commute the operator with the limit”. However, while for a bounded
operator A one must assume limn→∞ fn ∈ DA in order to obtain that the sequence
{Afn } is convergent, for a closed operator A one must assume that the sequence
{Afn } is convergent in order to obtain limn→∞ fn ∈ DA .
The interplay between the concepts of bounded operator and closed operator is
studied in 4.4.3, 4.4.4, 4.4.6.
4.4.3 Theorem. Let X and Y be normed spaces over the same field, and suppose
A ∈ O(X, Y ). If A is bounded and DA is closed, then A is closed.
Proof. The implications
[f ∈ X, g ∈ Y, {fn} is a sequence in DA, fn → f, Afn → g] (1)⇒ f ∈ DA (2)⇒ [Afn → Af, and hence g = Af]
are true, where 1 is true by 2.3.4 because DA is closed (notice that the condition "{Afn} is
a convergent sequence in Y" plays no role) and 2 is true because A is continuous
(cf. 4.2.2). Thus, A is closed.
4.4.4 Theorem. Let X be a normed space, Y a Banach space over the same field,
and A ∈ O(X, Y ). If A is bounded and closed, then DA is closed.
Proof. Assume A bounded and closed. First, we notice that, if {fn } is a sequence in
DA that is convergent in X, then {Afn } is a Cauchy sequence since kAfn − Afm k ≤
kAkkfn − fm k, and therefore {Afn } is a convergent sequence in Y since Y is a
complete metric space. Then, DA is closed by 2.3.4 since the following implications
are true:
[f ∈ X, {fn} is a sequence in DA, fn → f] ⇒ [{Afn} is a convergent sequence in Y] ⇒ f ∈ DA,
where the last implication holds because A is closed.
We state the following theorem without giving its proof, which can be found e.g.
in Chapter 10 of (Royden, 1988), since we shall use neither this theorem nor its
corollary. We prove less general versions of this theorem and of its corollary in
12.2.3 and in 13.1.9, for an operator in a Hilbert space.
4.4.5 Theorem (Closed graph theorem). Let X and Y be Banach spaces over
the same field. If A ∈ OE (X, Y ) (for OE (X, Y ), cf. 3.2.12) and A is closed, then
A is bounded.
4.4.6 Corollary. Let X and Y be Banach spaces over the same field, and suppose
A ∈ O(X, Y ). If DA is closed and A is closed, then A is bounded.
Assuming X, Y Banach spaces over the same field and A ∈ O(X, Y ), from 4.4.3,
4.4.4 and 4.4.6 we see that, if two of the three conditions A bounded, A closed,
DA closed are true, then the remaining one is true as well. This rounds off our
examination of the interplay between the concepts of bounded operator and closed
operator. However, we shall not use either 4.4.5 or 4.4.6 for general closed operators
in general Banach spaces, and this is the reason why we have not provided a proof
of the closed graph theorem (which, moreover, would require preliminary results
outside the scope of this book). Anyway, as already mentioned, the closed graph
theorem and its corollary will be proved in 12.2.3 and in 13.1.9 respectively, for
operators in Hilbert spaces.
4.4.7 Proposition. Let X and Y be normed spaces over the same field, and suppose
A ∈ O(X, Y ) and A injective. Then A is closed iff A−1 is closed.
4.4.8 Proposition. Let X and Y be normed spaces over the same field, and let
A ∈ O(X, Y ). If A is closed then NA is a subspace of X.
4.4.9 Proposition. Let X and Y be normed spaces over the same field, and suppose
A ∈ O(X, Y ), B ∈ B(X, Y ), A closed. Then A + B is closed.
Proof. Let {fn } be a sequence in DA+B and assume that there exists (f, g) ∈ X ×Y
so that fn → f and (A + B)fn → g. Since B ∈ B(X, Y ), we have Bfn → Bf and
hence Afn → g − Bf . Since fn ∈ DA for each n ∈ N and A is closed, this implies
f ∈ DA and g − Bf = Af , i.e. f ∈ DA+B and g = (A + B)f . This proves that
A + B is closed.
4.4.10 Definition. Let X and Y be normed spaces over the same field, and let
A ∈ O(X, Y ). The linear operator A is said to be closable if the closure (in the
product metric space X × Y or equivalently in the normed space X + Y) ḠA of its
graph is the graph of a mapping. If A is closable, then the mapping which has ḠA
as its graph is a closed linear operator from X to Y (cf. 4.1.12 and 3.2.15a) which
is called the closure of A and is denoted by Ā. Clearly, A ⊂ Ā (cf. 1.2.5) and Ā is
the smallest closed operator that contains A: if B is a closed operator that contains
A, then GA ⊂ GB, and hence GĀ = ḠA ⊂ ḠB = GB, and hence Ā ⊂ B (cf. 1.2.5).
If A is closable, GĀ = ḠA means that (cf. 2.3.10):
DĀ = {f ∈ X : there exists a sequence {fn} in DA s.t. fn → f and {Afn} is convergent},
∀f ∈ DĀ, Āf = lim_{n→∞} Afn if {fn} is a sequence in DA s.t. fn → f.
4.4.11 Proposition. Let X and Y be normed spaces over the same field, and let
A ∈ O(X, Y ). Then:
(a) A is closable iff the following condition holds
(0X, g) ∈ ḠA ⇒ g = 0Y;
(b) A is closable iff ∃B ∈ O(X, Y ) such that B is closed and A ⊂ B.
4.4.12 Proposition. Let X be a normed space and Y a Banach space over the same
field. Let A ∈ O(X, Y ), and suppose A bounded. Then A is closable, DĀ = D̄A and
Ā is bounded.
Proof. Let Ã be the element of O(X, Y ) such that DÃ = D̄A, A ⊂ Ã, Ã is bounded
(cf. 4.2.6).
Since Ã is bounded and DÃ is closed, Ã is closed by 4.4.3. Then, since A ⊂ Ã,
A is closable by 4.4.11b and we also have Ā ⊂ Ã.
For f ∈ DÃ, in view of DÃ = D̄A and 2.3.10 there is a sequence {fn} in DA s.t.
fn → f, and hence (since Ã is continuous) also s.t. Ãfn → Ãf. Since A ⊂ Ā ⊂ Ã,
the sequence {fn} is also in DĀ and Āfn = Ãfn; since Ā is closed, this implies
f ∈ DĀ. Thus, DÃ ⊂ DĀ and therefore Ā = Ã.
4.4.13 Proposition. Let X and Y be normed spaces over the same field, let
A ∈ O(X, Y ), and suppose A injective and closable. Then A−1 is closable iff Ā
is injective. If these conditions are satisfied, then the closure of A−1 is (Ā)−1.
4.4.14 Proposition. Let X and Y be normed spaces over the same field, and
suppose A ∈ O(X, Y ), B ∈ B(X, Y ), A closable. Then A + B is closable and
the closure of A + B is Ā + B.
In this section, which contains little more than definitions, X denotes a normed
space over K and A denotes an operator in X, i.e. A ∈ O(X).
Proof. Since X is a non-zero Banach space, B(X) is a Banach algebra with identity
(cf. 4.3.5). For λ ∈ K so that ‖A‖ < |λ|, we have ‖(1/λ)A‖ < 1. Then, by 4.3.4, the
series Σ_{n=1}^{∞} ((1/λ)A)^n is convergent in B(X) and
(1X + Σ_{n=1}^{∞} ((1/λ)A)^n)(1X − (1/λ)A) = (1X − (1/λ)A)(1X + Σ_{n=1}^{∞} ((1/λ)A)^n) = 1X,
and hence
[−(1/λ)(1X + Σ_{n=1}^{∞} ((1/λ)A)^n)](A − λ1X) = (A − λ1X)[−(1/λ)(1X + Σ_{n=1}^{∞} ((1/λ)A)^n)] = 1X.
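The Neumann-series resolvent used in the proof above can be checked numerically in finite dimension. In the sketch below (an illustration only: the 3×3 matrix A stands in for an element of B(X), and the truncation length 60 is an arbitrary choice), the partial sums of −(1/λ) Σ ((1/λ)A)^n really invert A − λ·1X once |λ| > ‖A‖.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
lam = 2 * np.linalg.norm(A, 2)          # guarantees ||(1/lam) A|| = 1/2 < 1

S = np.eye(3)                           # partial sum of sum_{n>=0} (A/lam)^n
term = np.eye(3)
for _ in range(60):
    term = term @ (A / lam)
    S += term

R = -(1.0 / lam) * S                    # candidate for (A - lam*1)^{-1}
err = np.linalg.norm(R @ (A - lam * np.eye(3)) - np.eye(3))
print(err)  # essentially zero: R inverts A - lam*1
```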
4.5.12 Proposition. Suppose that X is a Banach space and that A is closed. Then
RA−λ1X = X for each λ ∈ ρ(A).
Proof. Since A is closed, A − λ1X is closed for each λ ∈ K (cf. 4.4.9). Thus,
if λ ∈ ρ(A) then (A − λ1X )−1 is closed (cf. 4.4.7) and bounded. Since X is a
Banach space, this implies that D(A−λ1X)−1 = RA−λ1X is closed (cf. 4.4.4); since
λ ∈ ρ(A) also implies that RA−λ1X is dense in X, we conclude that RA−λ1X = X.
4.5.13 Remark. Some define the resolvent set and the spectrum of A in a different
way than we did in 4.5.1, by letting the resolvent set of A be the set
ρ′ (A) := {λ ∈ K : A − λ1X is injective and (A − λ1X )−1 ∈ B(X)}
and letting the spectrum of A be the set
σ ′ (A) := K − ρ′ (A).
However, these definitions are not very useful for non-closed operators. In fact, if
ρ′(A) ≠ ∅ then there is λ ∈ K so that (A − λ1X)−1 exists and (A − λ1X)−1 ∈ B(X),
hence (A − λ1X )−1 is closed by 4.4.3, hence A − λ1X is closed by 4.4.7, hence A
is closed by 4.4.9. This proves that ρ′ (A) = ∅, and hence σ ′ (A) = K, if A is not
closed. Thus, the spectrum as defined above is always trivially the same for all non-
closed operators, even when they are closable. This is not true with our definition
of spectrum, as indicated by 4.5.11 (cf. also 12.4.25).
If X is a Banach space then the definitions given above are actually equivalent to
ours, for closed operators. In fact, 4.5.12 proves that ρ(A) ⊂ ρ′ (A) if A is closed
and X is a Banach space, and hence ρ(A) = ρ′ (A) since ρ′ (A) ⊂ ρ(A) is obvious.
If an isomorphism from X1 onto X2 exists, then the two normed spaces X1 and X2
are said to be isomorphic.
If the two normed spaces X1 and X2 are the same, an isomorphism from X1
onto X2 is called an automorphism of X1 .
4.6.2 Remarks.
(a) In 4.6.1, condition in1 means that U is an “isomorphism” from the set X1 onto
the set X2 (it preserves the set theoretical “operations”, i.e. union, intersec-
tion, complementation), conditions in1 and in2 mean that U is an isomorphism
from the linear space (X1 , σ1 , µ1 ) onto the linear space (X2 , σ2 , µ2 ) (actually,
condition in2 says that U is a linear operator), and condition in3 says that U
preserves the norm. We laid down condition in1 the way we did in order to
make it clear from the outset that an isomorphism preserves the three level
structure of a normed space. However, in in1 we could have asked only that
U be surjective onto X2, since U is a linear operator by condition in2 and
NU = {0X1} holds by condition in3, and hence U is injective by 3.2.6a.
(b) It is obvious (also in view of 3.2.6b) that, if U is an isomorphism from a normed
space X1 onto a normed space X2 , then the inverse mapping U −1 (i.e. the linear
operator U −1 ) is an isomorphism from X2 onto X1 ; and also that, if V is an
isomorphism from X2 onto a third normed space X3 , then the composition V ◦U
(i.e. the product V U of the linear operators U and V ) is an isomorphism from
X1 onto X3 .
(c) For any normed space X, the identity mapping idX (i.e. the linear operator
1X , cf. 3.2.5) is obviously an automorphism of X. It is immediate to see, also
in view of remark b, that the family of all automorphisms of X is a group,
with the product of operators as group product, the identity mapping as group
identity, the inverse mapping as group inverse.
(d) If U is an isomorphism from a normed space (X1 , σ1 , µ1 , ν1 ) onto a normed
space (X2, σ2, µ2, ν2), then
dν2(Uf, Ug) = ν2(Uf − Ug) = ν2(U(f − g)) = ν1(f − g) = dν1(f, g), ∀f, g ∈ X1.
Thus, U is an isomorphism from the metric space (X1, dν1) onto the metric
space (X2, dν2).
(e) As remarked above, an isomorphism U from a normed space X1 onto a normed
space X2 is a linear operator. It is obvious that U is a bounded operator, i.e.
U ∈ B(X1, X2). Thus, U is a continuous mapping (cf. 4.2.2; this was in fact
already clear from remark d).
4.6.3 Proposition. Let X1 and X2 be isomorphic normed spaces over the same
field, and let U be an isomorphism from X1 onto X2 . The mapping
TU : X1 + X1 → X2 + X2
(f, g) ↦ TU(f, g) := (Uf, Ug)
is an isomorphism from the normed space X1 + X1 onto the normed space X2 + X2
(cf. 4.1.16).
For each linear operator A ∈ O(X1 ) we have TU (GA ) = GUAU −1 .
4.6.4 Proposition. Let X1 and X2 be isomorphic normed spaces over the same
field, let A ∈ O(X1 ) and B ∈ O(X2 ), and let U be an isomorphism from X1 onto
X2 . The following conditions are equivalent:
(a) B = U AU −1 ;
(b) A = U −1 BU ;
(c) BU = U A;
(d) AU −1 = U −1 B;
(e) DA = U −1 (DB ) and Bf = U AU −1 f , ∀f ∈ DB ;
(f ) DB = U (DA ) and Ag = U −1 BU g, ∀g ∈ DA ;
(g) GB = TU (GA ) (for TU , cf. 4.6.3).
If these conditions are satisfied then:
(h) RB = U (RA );
(i) DB = U (DA ) and RB = U (RA ).
4.6.5 Theorem. Let X1 and X2 be isomorphic normed spaces over the same field
K. Let A ∈ O(X1 ), B ∈ O(X2 ) and suppose that there exists an isomorphism U
from X1 onto X2 so that B = U AU −1 . Then:
(a) A is injective iff B is injective; if A and B are injective then B −1 = U A−1 U −1 ;
(b) A is bounded iff B is bounded; if A and B are bounded then kBk = kAk;
(c) A is closed iff B is closed;
(d) A is closable iff B is closable; if A and B are closable, then B = U AU −1 ;
(e) NB = U (NA );
(f ) B − λ1X2 = U (A − λ1X1 )U −1 , ∀λ ∈ K;
(g) σ(B) = σ(A);
(h) Apσ(B) = Apσ(A);
(i) σp (B) = σp (A).
Proof. a: Everything follows from 1.2.17 and 1.2.14B (cf. also the equivalence
between conditions a and b in 4.6.4)
b: If A is bounded then
‖Bf‖2 = ‖UAU−1f‖2 = ‖AU−1f‖1 ≤ ‖A‖ ‖U−1f‖1 = ‖A‖ ‖f‖2, ∀f ∈ DB
(cf. 4.2.5b). This proves that B is bounded and kBk ≤ kAk. Since A = U −1 BU
(cf. the equivalence between conditions a and b in 4.6.4), by the same token it can
be proved that if B is bounded then A is bounded and kAk ≤ kBk.
c: Since TU is an isomorphism from X1 + X1 onto X2 + X2 as metric spaces (cf.
4.6.3 and 4.6.2d) and since GB = TU (GA ) (cf. the equivalence between conditions
a and g in 4.6.4), 2.3.21b implies that GA is closed iff GB is closed.
d: Since TU is an isomorphism from X1 + X1 onto X2 + X2 as metric spaces and
since GB = TU (GA ), 2.3.21a implies that GB = TU (GA ). Then, if B is closable we
have
where the second equality holds by 3.2.10b′2 since DU = X1 and the last by 3.2.10b3.
g: Let λ ∈ K. We have that
RB−λ1X2 = U (RA−λ1X1 ),
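In finite dimension the invariance statements of 4.6.4 and 4.6.5 can be checked numerically. The sketch below is an illustration only (a unitary 3×3 matrix U plays the role of an isometric isomorphism of C³, and A is an arbitrarily chosen matrix): it verifies BU = UA, ‖B‖ = ‖A‖, and the equality of the spectra.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
U = Q                                       # unitary, hence an isometric isomorphism of C^3

B = U @ A @ np.linalg.inv(U)                # B = U A U^{-1}

assert np.allclose(B @ U, U @ A)                               # condition (c) of 4.6.4
assert np.isclose(np.linalg.norm(B, 2), np.linalg.norm(A, 2))  # (b) of 4.6.5: ||B|| = ||A||
eigA, eigB = np.linalg.eigvals(A), np.linalg.eigvals(B)
assert all(np.min(np.abs(eigB - lam)) < 1e-8 for lam in eigA)  # (i): sigma(B) = sigma(A)
print("ok")
```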
4.6.6 Proposition. Let X1 and X2 be Banach spaces over the same field. Suppose
that there exists a linear operator V from X1 to X2 , i.e. V ∈ O(X1 , X2 ), such that
DV dense in X1, RV dense in X2, ‖V f‖ = ‖f‖ for all f ∈ DV.
Then there exists a unique operator U ∈ B(X1, X2) such that V ⊂ U. The operator
U is an isomorphism from X1 onto X2. Moreover, the operator U−1 is the unique
element of B(X2, X1) such that V−1 ⊂ U−1, or equivalently such that
U−1(V f) = f, ∀f ∈ DV.
Proof. From 4.2.6 we have that there exists a unique operator U ∈ B(X1 , X2 ) such
that V ⊂ U . Clearly, condition in2 of 4.6.1 holds true for U .
Now we fix f ∈ X1 and let {fn } be a sequence in DV such that fn → f (cf.
2.3.12). Then,
Uf = lim_{n→∞} V fn
and hence
‖Uf‖ = lim_{n→∞} ‖V fn‖ = lim_{n→∞} ‖fn‖ = ‖f‖
(cf. 4.1.6a). Since f was an arbitrary element of X1 , this proves condition in3 of
4.6.1 for U . Moreover we fix g ∈ X2 and let {gn } be a sequence in DV such that
V gn → g. Then {gn} is a Cauchy sequence because
‖gn − gm‖ = ‖V gn − V gm‖, ∀n, m ∈ N,
and hence {gn} is convergent since X1 is complete; since U is continuous, we have
U(lim_{n→∞} gn) = lim_{n→∞} V gn = g.
Since g was an arbitrary element of X2, this proves that RU = X2.
Chapter 5

The Extended Real Line
There are many situations in integration theory where one finds it unavoidable to
deal with infinity. For instance, one wants to be able to integrate over sets of infinite
measure. Moreover, even if one is only interested in real-valued functions, the least
upper bound or the sum of a sequence of positive real-valued functions may well be
infinite at some points. More generally, there are a number of idiomatic expressions
about real functions where the word infinity and the symbol ∞ are used, even when
these two things have not been given a definite status.
This brief chapter is devoted to the extended real line, which is a way to organize
the various rules according to which infinity is dealt with in real analysis, and in
particular in the next chapters about measure and integration theory.
[a, b] := {x ∈ R∗ : a ≤ x ≤ b}.
Note that some of these sets can be empty (e.g. (a, b) = ∅ if b ≤ a) and that
R = (−∞, ∞), R∗ = [−∞, ∞], [0, ∞] = [0, ∞) ∪ {∞}.
Given a non-empty set X, for two functions ϕ : X → R∗ and ψ : X → R∗ we write
ϕ ≤ ψ if ϕ(x) ≤ ψ(x) for all x ∈ X.
δ : R∗ × R∗ → R
(a, b) ↦ δ(a, b) := dR(ϕ−1(a), ϕ−1(b)) (= |ϕ−1(a) − ϕ−1(b)|, cf. 2.1.4).
(a) The function δ is a distance on R∗ (R∗ will always be regarded as the first
element of the metric space (R∗ , δ)).
(b) For a sequence {an } in R∗ we have:
(b1 ) an → ∞ iff ∀m ∈ R, ∃Nm ∈ N such that n > Nm ⇒ an > m;
(b2 ) an → −∞ iff ∀m ∈ R, ∃Nm ∈ N such that n > Nm ⇒ an < m;
(b3 ) for a ∈ R,
an → a iff ∀ǫ > 0, ∃Nǫ ∈ N such that n > Nǫ ⇒ a − ǫ < an < a + ǫ.
(c) A sequence {an } in R is convergent in the metric subspace (R, δR ) (i.e. it is
convergent in the metric space (R∗ , δ) and limn→∞ an ∈ R) iff it is convergent
in the metric space (R, dR ), and in case of convergence the two limits are equal.
(d) The topology (i.e. the family of all open sets) of the metric subspace (R, δR ) is
the same as the topology of the metric space (R, dR ).
(e) The metric subspace (R, δR ) is not complete, and one of its completions is the
pair ((R∗ , δ), idR ).
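The metric δ can be exercised numerically. In the sketch below, ϕ−1 is taken to be arctan extended by ϕ−1(±∞) = ±π/2 (an assumption of this demo, since the definition of ϕ is not reproduced in this excerpt); the demo then illustrates b1 and the agreement of δ-convergence with ordinary convergence on R.

```python
import math

def phi_inv(a):
    # assumed: arctan extended to R* = [-inf, inf], with values +-pi/2 at +-inf
    if a == math.inf:
        return math.pi / 2
    if a == -math.inf:
        return -math.pi / 2
    return math.atan(a)

def delta(a, b):
    # the distance on R* defined above
    return abs(phi_inv(a) - phi_inv(b))

# (b1): a_n -> infinity in (R*, delta) exactly when a_n eventually exceeds every m
d = [delta(10.0 ** n, math.inf) for n in range(1, 8)]
print(d)  # strictly decreasing toward 0

# (c)/(d): on R, delta-closeness agrees with ordinary closeness
assert delta(1.0, 1.0 + 1e-9) < 1e-8
```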
Proof. a: The properties of 2.1.1 for δ follow directly from the same properties for
dR .
b: Let {an } be a sequence in R∗ .
b1: We have
δ(an, ∞) → 0 ⇔ dR(ϕ−1(an), π/2) → 0 ⇔
[∀ǫ ∈ (0, π), ∃Nǫ ∈ N s.t. n > Nǫ ⇒ (an = ∞ or (an ∈ R and arctan an > π/2 − ǫ))] (1)⇔
(∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an > m).
Indeed:
(1)⇒: for m ∈ R, put ǫ := π/2 − arctan m; then
n > Nm := Nǫ ⇒ [an = ∞ or (an ∈ R and arctan an > arctan m, i.e. an > m)];
(1)⇐: for ǫ ∈ (0, π), put m := tan(π/2 − ǫ); then
n > Nǫ := Nm ⇒ an > tan(π/2 − ǫ) ⇒ [an = ∞ or (an ∈ R and arctan an > π/2 − ǫ)].
b2 : The proof is analogous to the one given for b1 .
b3: For a ∈ R we have
δ(an, a) → 0 (2)⇔
[∃k ∈ N s.t. an ∈ R for n > k and dR(ϕ−1(ak+n), ϕ−1(a)) → 0] (3)⇔
[∃k ∈ N s.t. an ∈ R for n > k and dR(ak+n, a) → 0] (4)⇔
(∀ǫ > 0, ∃Nǫ ∈ N s.t. n > Nǫ ⇒ a − ǫ < an < a + ǫ).
Indeed:
(2)⇒: since η := min{δ(a, −∞), δ(a, ∞)} > 0, there exists k ∈ N such that an ∈ R
for n > k, and then dR(ϕ−1(ak+n), ϕ−1(a)) = δ(ak+n, a) → 0;
(4)⇒: for ǫ > 0, let nǫ ∈ N be such that n > nǫ ⇒ a − ǫ < ak+n < a + ǫ;
let then Nǫ := k + nǫ;
(4)⇐: set for instance k := N1 and notice that dR(an, a) → 0 implies trivially
dR(ak+n, a) → 0.
inf_{m≥1} inf_{k≥1} am,k = inf_{k≥1} inf_{m≥1} am,k.
Proof. We have
∀(n, l) ∈ N × N, sup_{m≥1} sup_{k≥1} am,k ≥ sup_{k≥1} an,k ≥ an,l,
and hence
∀l ∈ N, sup_{m≥1} sup_{k≥1} am,k ≥ sup_{m≥1} am,l,
and hence
sup_{m≥1} sup_{k≥1} am,k ≥ sup_{k≥1} sup_{m≥1} am,k.
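A finite analogue of this interchange of iterated suprema and infima can be checked directly; the 50×50 array below stands in for the doubly indexed family {am,k} (an illustration only, since the proposition concerns arbitrary families in R∗).

```python
import random

random.seed(0)
a = [[random.random() for _ in range(50)] for _ in range(50)]

sup_sup_rows = max(max(row) for row in a)         # sup_m sup_k a_{m,k}
sup_sup_cols = max(max(col) for col in zip(*a))   # sup_k sup_m a_{m,k}
inf_inf_rows = min(min(row) for row in a)         # inf_m inf_k a_{m,k}
inf_inf_cols = min(min(col) for col in zip(*a))   # inf_k inf_m a_{m,k}

assert sup_sup_rows == sup_sup_cols               # both equal the global maximum
assert inf_inf_rows == inf_inf_cols               # both equal the global minimum
print("ok")
```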
If an+1 ≤ an for each n ∈ N, then {an } is convergent in the metric space (R, dR )
iff there exists m ∈ R such that m ≤ an for each n ∈ N. In case of convergence we
have
lim_{n→∞} an = inf_{n≥1} an.
Proof. Suppose an ≤ an+1 for each n ∈ N (the proof is analogous in the other
case). If there exists m ∈ R s.t. an ≤ m for each n ∈ N, then s := supn≥1 an is an
element of R since an ≤ s ≤ m for each n ∈ N by the definition of l.u.b.. Then, by
the same token,
∀ǫ > 0, ∃Nǫ ∈ N s.t. s − ǫ < aNǫ .
This implies
∀ǫ > 0, ∃Nǫ ∈ N s.t. n > Nǫ ⇒ s − ǫ < an ≤ s,
Proof. Suppose an ≤ an+1 for each n ∈ N (the proof is analogous in the other
case).
If ∃m ∈ R s.t. an ≤ m for each n ∈ N, then the result follows from 5.2.4 and
5.2.1c.
If ∀m ∈ R, ∃Nm ∈ N s.t. m < aNm , then
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an > m,
hence an → ∞ by 5.2.1b1; also, ∞ is the only upper bound for {an : n ∈ N}, hence
∞ = sup{an : n ∈ N} = sup_{n≥1} an.
Proof. Clearly, we have bn+1 ≤ bn for each n ∈ N. Hence 5.2.5 implies that
the sequence {bn } is convergent and limn→∞ bn = inf n≥1 bn . There are now three
possibilities.
If limn→∞ an = ∞, then (cf. 5.2.1b1)
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an > m ⇒ bn > m,
and this proves that limn→∞ bn = ∞.
If limn→∞ an = −∞, then (cf. 5.2.1b2 )
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ an < m − 1,
and therefore
∀m ∈ R, ∃Nm ∈ N s.t. n > Nm ⇒ bn ≤ m − 1 < m,
and this proves that limn→∞ bn = −∞.
5.3.2 Remarks.
(a) Note that, for a, b ∈ R∗, a ≤ b iff −b ≤ −a. As can be easily seen, this implies
that, if for S ⊂ R∗ we write −S := {−a : a ∈ S}, then
sup(−S) = − inf S and inf(−S) = − sup S.
(b) For a non-empty subset S of [0, ∞] and c ∈ (0, ∞), writing cS := {ca : a ∈ S}
we have sup cS = c sup S and inf cS = c inf S. In fact, we have
∀a ∈ S, ca ≤ c sup S,
and, for m ∈ R∗,
[ca ≤ m, ∀a ∈ S] ⇒ [a ≤ (1/c)m, ∀a ∈ S] ⇒ sup S ≤ (1/c)m ⇒ c sup S ≤ m,
and this proves that sup cS = c sup S. For the other equality we can proceed in
a similar way.
(c) As can be checked directly, if a, b ∈ R∗ are such that a ≤ b, then a + c ≤ b + c
for every c ∈ R∗ for which the sums are defined. Moreover, for a non-empty
subset S of R∗ and c ∈ R, writing S + c := {a + c : a ∈ S} we have
sup(S + c) = sup S + c and inf(S + c) = inf S + c. In fact, we have
∀a ∈ S, a + c ≤ sup S + c
and, for m ∈ R∗ ,
[a + c ≤ m, ∀a ∈ S] ⇒ [a ≤ m − c, ∀a ∈ S] ⇒ sup S ≤ m − c ⇒ sup S + c ≤ m,
and this proves that sup(S + c) = sup S + c. For the other equality we can
proceed in a similar way.
(d) From remark b it follows that, if a1 , a2 , b1 , b2 ∈ [0, ∞] are such that ai ≤ bi for
i = 1, 2, then
a1 a2 ≤ b 1 a2 ≤ b 1 b 2 .
5.3.3 Remark. In [0, ∞] both the product and the sum are defined without re-
straints, and they are commutative and associative. Owing to this, for elements of
[0, ∞] we will write a1 + a2 + a3 := a1 + (a2 + a3) and Σ_{k=1}^{n} ak := a1 + ... + an; we
will also write Σ_{i∈I} ai to denote the sum of a finite family {ai}i∈I of elements of
[0, ∞].
By a straightforward check we see that
a(b + c) = ab + ac, ∀a, b, c ∈ [0, ∞].
5.3.4 Proposition. Suppose that {an } and {bn } are sequences in [0, ∞] such that
an ≤ an+1 and bn ≤ bn+1 for each n ∈ N, and let a ∈ [0, ∞]. Then the sequences
{an + bn } , {an bn } and {abn } are convergent in the metric space (R∗ , δ) and
lim_{n→∞}(an + bn) = sup_{n≥1}(an + bn) = sup_{n≥1} an + sup_{n≥1} bn = lim_{n→∞} an + lim_{n→∞} bn,
Proof. Recall (cf. 5.2.5) that {an } and {bn } are convergent and limn→∞ an =
supn≥1 an and limn→∞ bn = supn≥1 bn .
If both limn→∞ an and limn→∞ bn are elements of R, then what we want to
prove follows from 5.2.1c and from the continuity of the sum and the product in R.
Thus, in what follows we assume e.g. limn→∞ an = ∞.
We have limn→∞ an + limn→∞ bn = ∞; since
an + bn ≤ an+1 + bn+1 and an ≤ an + bn , ∀n ∈ N,
(cf. 5.3.2e), the sequence {an + bn } is convergent and
lim_{n→∞}(an + bn) = sup_{n≥1}(an + bn) ≥ sup_{n≥1} an = ∞,
and hence
lim_{n→∞}(an + bn) = ∞ = lim_{n→∞} an + lim_{n→∞} bn.
and hence
lim_{n→∞}(an bn) = ∞ = (lim_{n→∞} an)(lim_{n→∞} bn).
Finally, the sequence {abn } is the sequence {an bn } if an := a for all n ∈ N.
5.4.1 Definition. Let {an} be a sequence in [0, ∞]. For each n ∈ N, define sn :=
Σ_{k=1}^{n} ak. The sequence {sn} is called the series of the an's and is denoted by the
symbol Σ_{n=1}^{∞} an. Since sn ≤ sn+1 for each n ∈ N, 5.2.5 implies that the sequence
{sn} is convergent in the metric space (R∗, δ) and lim_{n→∞} sn = sup_{n≥1} sn. Then,
lim_{n→∞} sn is called the sum of the series of the an's and is denoted by the same
symbol Σ_{n=1}^{∞} an as the series, i.e.
Σ_{n=1}^{∞} an := lim_{n→∞} sn = sup_{n≥1} sn
(these definitions are in agreement with the ones given in 2.1.10).
If an ∈ R for each n ∈ N, then lim_{n→∞} sn can be either ∞ or an element of R.
If lim_{n→∞} sn ∈ R, then {sn} converges to Σ_{n=1}^{∞} an also in the metric space (R, dR)
(cf. 5.2.1c). Clearly, lim_{n→∞} sn ∈ R iff Σ_{n=1}^{∞} an < ∞ (we will always use the latter
expression).
5.4.2 Remarks.
(a) Let {an } and {bn } be sequences in [0, ∞], and suppose an ≤ bn for each n ∈ N.
Then, by induction applied to 5.3.2e,
∀n ∈ N, Σ_{k=1}^{n} ak ≤ Σ_{k=1}^{n} bk ≤ sup_{n≥1} Σ_{k=1}^{n} bk =: Σ_{n=1}^{∞} bn,
whence
Σ_{n=1}^{∞} an := sup_{n≥1} Σ_{k=1}^{n} ak ≤ Σ_{n=1}^{∞} bn.
(b) For a sequence {an} in [0, ∞], letting sn := Σ_{k=1}^{n} ak we have
Σ_{n=1}^{∞} an < ∞ (1)⇔
[∃m ∈ [0, ∞) s.t. sn ≤ m, ∀n ∈ N] (2)⇔
[an ∈ [0, ∞), ∀n ∈ N, and {sn} is convergent in (R, dR)] (3)⇔
[an ∈ [0, ∞), ∀n ∈ N, and the series Σ_{n=1}^{∞} an is convergent in the normed space R (cf. 4.1.4)],
5.4.3 Proposition. Let {an } be a sequence in [0, ∞] and let β be a bijection from
N onto N. Then
Σ_{n=1}^{∞} aβ(n) = Σ_{n=1}^{∞} an.
For this reason, for a countable family {bn}n∈I of elements of [0, ∞], we will write
Σ_{n∈I} bn to denote the sum or the series of the bn's, with no need to specify the
order.
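A finite floating-point analogue of this rearrangement invariance can be checked directly (an illustration only: the proposition concerns infinite series in [0, ∞]; here `math.fsum` is used because it returns the exactly rounded sum of the terms, so reordering changes nothing).

```python
import math
import random

random.seed(0)
a = [random.random() for _ in range(10000)]   # nonnegative terms a_n
perm = list(range(10000))
random.shuffle(perm)                          # a bijection beta of the index set

s1 = math.fsum(a)                             # sum of a_n
s2 = math.fsum(a[p] for p in perm)            # sum of a_{beta(n)}
assert s1 == s2                               # fsum is exactly rounded: exact equality
print(s1 == s2)
```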
5.4.4 Corollary. Let {an} be a sequence in [0, ∞) and let β be a bijection from N
onto N. Then the series Σ_{n=1}^{∞} aβ(n) is convergent in the normed space R (cf. 4.1.4
and 4.1.5) iff the series Σ_{n=1}^{∞} an is convergent in the normed space R, and in case
of convergence the two sums are equal.
5.4.5 Proposition. Let {an } be a sequence in [0, ∞] and a ∈ [0, ∞]. Then
Σ_{n=1}^{∞}(a an) = a Σ_{n=1}^{∞} an.
Proof. If a = ∞, then the two sides are zero if an = 0 for each n ∈ N, otherwise
the two sides are ∞.
Assuming now a ∈ [0, ∞), we have (using 5.3.2b)
Σ_{n=1}^{∞}(a an) := sup{Σ_{k=1}^{n}(a ak) : n ∈ N} = sup{a Σ_{k=1}^{n} ak : n ∈ N}
= a sup{Σ_{k=1}^{n} ak : n ∈ N} = a Σ_{n=1}^{∞} an.
5.4.6 Proposition. Let {an } and {bn } be two sequences in [0, ∞]. Then
Σ_{n=1}^{∞}(an + bn) = Σ_{n=1}^{∞} an + Σ_{n=1}^{∞} bn.
If Σ_{n=1}^{∞}(an + bn) = ∞, this proves the equality of the statement. Assume next
Σ_{n=1}^{∞}(an + bn) < ∞; this implies Σ_{n=1}^{∞} an < ∞ and Σ_{n=1}^{∞} bn < ∞ since e.g.
∀n ∈ N, Σ_{k=1}^{n} ak ≤ Σ_{k=1}^{n}(ak + bk) ≤ Σ_{n=1}^{∞}(an + bn);
thus, all three series of the statement are convergent in the normed space R (cf.
5.4.2b), and the equality of the statement holds by the continuity of the sum in R.
Fix also M ∈ N and let K := max σ−1({1, ..., N} × {1, ..., M}); we have
Σ_{m=1}^{M} Σ_{n=1}^{N} an,m ≤ Σ_{l=1}^{K} aσ(l) ≤ Σ_{n=1}^{∞} aσ(n).
Thus,
Σ_{n=1}^{∞} aσ(n) = Σ_{n=1}^{∞} (Σ_{m=1}^{∞} an,m).
Finally, let bn,m := am,n and denote by γ the bijection from N × N onto N × N
defined by γ(n, m) := (m, n). Then γ ◦ σ is a bijection from N onto N × N and
Σ_{n=1}^{∞} aσ(n) = Σ_{n=1}^{∞} b(γ◦σ)(n) = Σ_{n=1}^{∞} (Σ_{m=1}^{∞} bn,m)
= Σ_{n=1}^{∞} (Σ_{m=1}^{∞} am,n) = Σ_{m=1}^{∞} (Σ_{n=1}^{∞} an,m).
∀n ∈ N, the series Σ_{m=1}^{∞} an,m is convergent in the normed space R;
the series Σ_{n=1}^{∞} (Σ_{m=1}^{∞} an,m) is convergent in the normed space R.
In case of convergence, for the sums of the two series we have the equality
Σ_{n=1}^{∞} aσ(n) = Σ_{n=1}^{∞} (Σ_{m=1}^{∞} an,m).
Thus, both the convergence of the series Σ_{n=1}^{∞} aσ(n) and its sum do not depend on
the particular bijection σ used.
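The order-independence of double sums with nonnegative terms can be illustrated with a finite truncation (an illustration only; the choice an,m := 2^{−n} 3^{−m} is arbitrary, picked because its double sum factorizes as (Σ 2^{−n})(Σ 3^{−m}) = 1 · (1/2) = 1/2).

```python
import math

N = 60
# sum over n first, then m ("row first"), and in the opposite order
row_first = math.fsum(math.fsum(2.0 ** -n * 3.0 ** -m for m in range(1, N))
                      for n in range(1, N))
col_first = math.fsum(math.fsum(2.0 ** -n * 3.0 ** -m for n in range(1, N))
                      for m in range(1, N))
print(row_first, col_first)  # both very close to 0.5
```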
5.4.9 Proposition. Let {an,k }(n,k)∈N×N be a family of elements of [0, ∞] such that
an,k ≤ an+1,k for each (n, k) ∈ N × N. Then the sequence {Σ_{k=1}^{∞} an,k} is convergent
in the metric space (R∗, δ) and
lim_{n→∞} Σ_{k=1}^{∞} an,k = sup_{n≥1} Σ_{k=1}^{∞} an,k = Σ_{k=1}^{∞} (sup_{n≥1} an,k)
= Σ_{k=1}^{∞} (lim_{n→∞} an,k).
i.e.
sup_{n≥1} sup_{N≥1} (Σ_{k=1}^{N} an,k) = sup_{N≥1} (Σ_{k=1}^{N} (sup_{n≥1} an,k));
for each n ∈ N, since an,k ≤ an+1,k for all n ∈ N and for k = 1, ..., N , we can apply
induction to 5.3.4 to obtain
Σ_{k=1}^{N} (sup_{n≥1} an,k) = sup_{n≥1} Σ_{k=1}^{N} an,k;
thus, letting sN,n := Σ_{k=1}^{N} an,k, what we need to prove is
sup_{n≥1} sup_{N≥1} sN,n = sup_{N≥1} sup_{n≥1} sN,n,
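A finite truncation of 5.4.9 can be inspected numerically (an illustration only: the family an,k := (1 − 1/n)·2^{−k}, nondecreasing in n with sup_n an,k = 2^{−k}, is an arbitrary choice, and the truncation bounds are assumptions of the demo).

```python
K = 50

def row_sum(n):
    # sum_k a_{n,k} for a_{n,k} = (1 - 1/n) * 2^{-k}
    return sum((1 - 1 / n) * 2.0 ** -k for k in range(1, K))

sup_of_sums = max(row_sum(n) for n in range(1, 2000))   # sup_n sum_k a_{n,k}
sum_of_sups = sum(2.0 ** -k for k in range(1, K))       # sum_k sup_n a_{n,k}
print(sup_of_sums, sum_of_sups)  # the left side approaches the right from below
```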
5.4.10 Remark. Let {fn } be a sequence in a normed space and suppose that the
series Σ_{n=1}^{∞} fn is convergent (cf. 2.1.10). Then,
‖Σ_{n=1}^{∞} fn‖ ≤ Σ_{n=1}^{∞} ‖fn‖.
If Σ_{n=1}^{∞} ‖fn‖ = ∞, this is obvious. Otherwise, it is proved by
‖Σ_{n=1}^{N} fn‖ ≤ Σ_{n=1}^{N} ‖fn‖ ≤ Σ_{n=1}^{∞} ‖fn‖, ∀N ∈ N,
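The norm inequality for series can be illustrated in the normed space R², with an arbitrarily chosen convergent series fn := ((−0.9)^n, 1/n²) (an assumption of this sketch, not an example from the text).

```python
import numpy as np

# f_n = ((-0.9)^n, 1/n^2): both coordinate series converge absolutely
fs = [np.array([(-0.9) ** n, 1.0 / n ** 2]) for n in range(1, 200)]

lhs = np.linalg.norm(sum(fs))             # || sum_n f_n ||
rhs = sum(np.linalg.norm(f) for f in fs)  # sum_n ||f_n||
print(lhs, rhs)  # lhs <= rhs, as the remark asserts
```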
Chapter 6

Measurable Sets and Functions
Although this chapter deals with measurable sets and functions, its contents are
purely set-theoretic: in this chapter there is still no measure in view. The reason
for the adjective “measurable” lies in the following facts, which will be seen in later
chapters: a measure is a function defined on a family of measurable sets and an
integral is a concept which is consistent only for measurable functions.
Proof. The proof is by induction. For n = 1, the statement follows at once from
the definition of semialgebra. Assume then that the statement is true for n = m and
consider a disjoint family {E1 , ..., Em , Em+1 } of elements of S. Then there exists a
finite and disjoint family {Fi }i∈I of elements of S such that
X − ∪_{k=1}^{m} Ek = ∪_{i∈I} Fi.
Since Em+1 ∈ S, there exists also a finite and disjoint family {Gj }j∈J of elements
of S such that
X − Em+1 = ∪j∈J Gj .
6.1.7 Remark. It is clear from al4 , al3 , al2 that, for an algebra, conditions
sa1 , sa2 , sa3 of 6.1.1 are satisfied. Thus, an algebra is also a semialgebra. For
any non-empty set X, the collections {∅, X} and P(X) are algebras on X.
Then:
(a) Fk ∩ Fl = ∅ if k ≠ l,
(b) ∪_{n=1}^{N} Fn = ∪_{n=1}^{N} En, ∀N ∈ N,
(c) ∪_{n=1}^{∞} Fn = ∪_{n=1}^{∞} En,
(d) if A0 is an algebra on X and En ∈ A0 for all n ∈ N, then Fn ∈ A0 for all
n ∈ N.
x ∈ ∪_{n=1}^{N} En ⇒ [∃k ∈ {1, ..., N} s.t. x ∈ Ek and x ∉ Ei for i < k] ⇒
[∃k ∈ {1, ..., N} s.t. x ∈ Fk] ⇒ x ∈ ∪_{n=1}^{N} Fn.
6.1.10 Theorem.
(a) Let F be a family of subsets of X. Then there exists a unique algebra on X,
which is called the algebra on X generated by F and is denoted by A0 (F ), such
that
(ga1 ) F ⊂ A0 (F ),
(ga2 ) if A0 is an algebra on X and F ⊂ A0 , then A0 (F ) ⊂ A0 .
If F is the empty family, then A0 (F ) = {∅, X}.
(b) Let F1 and F2 be families of subsets of X. If F1 ⊂ F2 or F1 ⊂ A0 (F2 ), then
A0 (F1 ) ⊂ A0 (F2 ).
Proof. It is obvious that C ⊂ A0 (S), since S ⊂ A0 (S) and A0 (S) has property al5 .
Since it is also obvious that S ⊂ C, if we can prove that C is an algebra on X then
we can conclude that A0 (S) ⊂ C by property ga2 of A0 (S).
Now, let E, F be two elements of C and let {E1 , ..., En }, {F1 , ..., Fm } be two
disjoint families of elements of S such that E = ∪_{k=1}^{n} Ek and F = ∪_{l=1}^{m} Fl. From
6.1.4 it follows that there exists a finite and disjoint family {Bj }j∈J of elements of
S such that each element in the family {E1 , ..., En , F1 , ..., Fm } can be obtained as
the union of a subfamily of {Bj }j∈J ; it is then clear that E ∪ F too can be obtained
in this way, and this implies that E ∪ F ∈ C. Thus, C has property al1 of 6.1.5.
Moreover, if {E1 , ..., En } is a disjoint family of elements of S, then from 6.1.2 it
follows that X − ∪_{k=1}^{n} Ek can be obtained as the union of a finite and disjoint family
of elements of S. This proves that C has property al2 of 6.1.5.
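The generated algebra A0(F) of 6.1.10 can be computed by brute force on a small finite X (a sketch under the obvious assumptions: close F together with ∅ and X under complements and finite unions until nothing new appears).

```python
X = frozenset(range(4))
F = [frozenset({0}), frozenset({1, 2})]

A = {frozenset(), X} | set(F)
changed = True
while changed:
    changed = False
    for E in list(A):
        for G in list(A):
            for new in (X - E, E | G):   # complement and union (al2, al1)
                if new not in A:
                    A.add(new)
                    changed = True

assert all(X - E in A for E in A)              # closed under complementation
assert all(E | G in A for E in A for G in A)   # closed under finite unions
print(len(A))  # 8: all unions of the atoms {0}, {1,2}, {3}
```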
The pair (X, A), where X is a non-empty set and A is a σ-algebra on X, is said to
be a measurable space and the elements of A are called the measurable subsets of X
(the reason for these names is that a measure is a function defined on a σ-algebra).
(En ∈ A, ∀n ∈ N) ⇒ (X − En ∈ A, ∀n ∈ N) ⇒
X − ∩_{n=1}^{∞} En = ∪_{n=1}^{∞} (X − En) ∈ A ⇒
∩_{n=1}^{∞} En = X − (X − ∩_{n=1}^{∞} En) ∈ A.
6.1.15 Remark. For any non-empty set X, the collections {∅, X} and P(X) are
σ-algebras on X.
(En ∈ A, ∀n ∈ N, ∀A ∈ Λ) ⇒ (∪_{n=1}^∞ En ∈ A, ∀A ∈ Λ) ⇒ ∪_{n=1}^∞ En ∈ ∩_{A∈Λ} A.
6.1.17 Theorem.
Proof. The proof of the present statement is a slight modification of the proof of
6.1.10.
Proof. Since F ⊂ A0 (F ) (cf. ga1 ), we have A(F ) ⊂ A(A0 (F )) (cf. 6.1.17b). Since
A(F ) is an algebra on X and F ⊂ A(F ) (cf. gσ1 ), we have A0 (F ) ⊂ A(F ) (cf.
ga2 ), whence A(A0 (F )) ⊂ A(F ) (cf. 6.1.17b).
Proof. From the inclusion F ⊂ A(F ) (cf. gσ1 ) we have F^Y ⊂ (A(F ))^Y , and hence
A(F^Y ) ⊂ (A(F ))^Y by property gσ2 for A(F^Y ) (since (A(F ))^Y is a σ-algebra on Y
by 6.1.19).
To prove the opposite inclusion, we define the collection A of subsets of X by
A := {E ∈ P(X) : E ∩ Y ∈ A(F^Y )}.
Proof. Since
E ∈ Kd ⇒ X − E ∈ Td ⇒ X − E ∈ A(d) ⇒ E = X − (X − E) ∈ A(d),
we have Kd ⊂ A(d), whence A(Kd ) ⊂ A(d) (cf. 6.1.17b).
Since
E ∈ Td ⇒ X − E ∈ Kd ⇒ X − E ∈ A(Kd ) ⇒ E = X − (X − E) ∈ A(Kd ),
we have Td ⊂ A(Kd ), whence A(d) ⊂ A(Kd ).
6.1.24 Proposition. Consider the measurable spaces (R, A(dR )) (cf. 2.1.4),
(R∗ , A(δ)) (cf. 5.2.1), and (C, A(dC )) (cf. 2.7.4a). We have on R the three
σ-algebras A(dR ), (A(δ))^R , (A(dC ))^R (the last symbol is consistent since we identify R with the subset {(a, 0) : a ∈ R} of C). However,
A(dR ) = (A(δ))^R = (A(dC ))^R .
Then we have
A(dR ) = A(In ) for n = 1, ..., 8.
If we define
I9 := I4 ∪ I5 ∪ I8 ,
then I9 is a semialgebra on R and
A(dR ) = A(I9 ).
Proof. From 2.3.16 and 2.3.17 it follows that every element of TdR is the union of a
countable family of open balls. Now, the family of open balls in (R, dR ) is I1 . Thus,
TdR ⊂ A(I1 )
by property σa1 of A(I1 ), since I1 ⊂ A(I1 ) (cf. gσ1 ). Therefore (cf. 6.1.17b),
A(dR ) := A(TdR ) ⊂ A(I1 ).
Next, we notice that:
∀a, b ∈ R, (a, b) = ∪_{n=1}^∞ [a + 1/n, b), hence I1 ⊂ A(I2), whence A(I1) ⊂ A(I2),
∀a, b ∈ R, [a, b) = ∪_{n=1}^∞ [a, b − 1/n], hence I2 ⊂ A(I3), whence A(I2) ⊂ A(I3),
∀a, b ∈ R, [a, b] = ∩_{n=1}^∞ (a − 1/n, b], hence I3 ⊂ A(I4), whence A(I3) ⊂ A(I4),
∀a, b ∈ R, (a, b] = (−∞, b] ∩ (R − (−∞, a]), hence I4 ⊂ A(I5), whence A(I4) ⊂ A(I5),
∀a ∈ R, (−∞, a] = ∩_{n=1}^∞ (−∞, a + 1/n), hence I5 ⊂ A(I6), whence A(I5) ⊂ A(I6),
∀a ∈ R, (−∞, a) = R − [a, ∞), hence I6 ⊂ A(I7), whence A(I6) ⊂ A(I7),
∀a ∈ R, [a, ∞) = ∩_{n=1}^∞ (a − 1/n, ∞), hence I7 ⊂ A(I8), whence A(I7) ⊂ A(I8),
∀a ∈ R, (a, ∞) ∈ TdR , i.e. I8 ⊂ TdR , whence A(I8) ⊂ A(dR).
This proves that A(dR ) = A(I1 ) = ... = A(I8 ).
As to I9 , recall that (a, b] := ∅ if b < a. Thus I9 has the property sa1 of 6.1.1,
and it is immediate to check that it has properties sa2 and sa3 as well. Moreover,
from
I4 ⊂ A(dR ), I5 ⊂ A(dR ), I8 ⊂ A(dR ),
we have I9 ⊂ A(dR ), whence A(I9 ) ⊂ A(dR ). And from I4 ⊂ I9 we have A(dR ) =
A(I4 ) ⊂ A(I9 ). This proves that A(dR ) = A(I9 ).
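The interval identities used in the proof above, e.g. (a, b) = ∪_{n=1}^∞ [a + 1/n, b), can be spot-checked numerically. The sketch below (our own illustration, truncating the countable union at finitely many values of n) compares membership in (a, b) with membership in the truncated union for a few sample points.

```python
def in_open(x, a, b):  # x ∈ (a, b)
    return a < x < b

def in_union_halfopen(x, a, b, N=10**6):
    # x ∈ [a + 1/n, b) for some sampled n ≤ N — a finite truncation of the
    # countable union ∪_{n≥1} [a + 1/n, b) used in the proof
    return any(a + 1.0 / n <= x < b for n in (1, 2, 5, 10, 100, 10**4, N))

a, b = 0.0, 1.0
samples = [-0.5, 0.0, 1e-5, 0.3, 0.999, 1.0, 1.5]
agree = [in_open(x, a, b) == in_union_halfopen(x, a, b) for x in samples]
print(agree)
```

The two membership tests agree on every sample point, including the endpoints a and b, which belong to neither side.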
which proves that {{−∞} , {∞}} ⊂ A(I2∗ ). Thus we have I1∗ ⊂ A(I2∗ ), whence
A(I1∗ ) ⊂ A(I2∗ ).
Since I2∗ ⊂ I ∗ , we also have A(I2∗ ) ⊂ A(I ∗ ).
[a, ∞] = ∩_{n=1}^∞ (a − 1/n, ∞] ∈ A(I2∗), ∀a ∈ R,
and this proves that I2∗ ⊂ A(I3∗) and I3∗ ⊂ A(I2∗), and hence that A(I2∗) = A(I3∗).
Proof. a: We have
∅ × · · · × ∅ = ∅;
∀E1 × · · · × EN , F1 × · · · × FN ∈ SN ,
(E1 × · · · × EN ) ∩ (F1 × · · · × FN ) = (E1 ∩ F1 ) × · · · × (EN ∩ FN );
∀E1 × · · · × EN ∈ SN ,
(X1 × · · · × XN ) − (E1 × · · · × EN )
= ∪_{k=1}^N (X1 × · · · × Xk−1 × (Xk − Ek ) × Xk+1 × · · · × XN ).
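For N = 2 the displayed complement formula can be verified exhaustively on small finite sets; the following Python check (an illustration of ours, not part of the text) compares the two sides as plain sets of pairs.

```python
from itertools import product

X1, X2 = {1, 2, 3}, {"a", "b"}
E1, E2 = {1, 2}, {"a"}

full = set(product(X1, X2))
rect = set(product(E1, E2))
lhs = full - rect
# ∪_{k=1}^2 X1 × ... × (Xk − Ek) × ... × XN with N = 2
rhs = set(product(X1 - E1, X2)) | set(product(X1, X2 - E2))
print(lhs == rhs)  # → True
```

Note that, as written in the text, the union on the right-hand side need not be disjoint: both terms contain (3, "b") in this example.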
Then
A(d1 ) ⊗ · · · ⊗ A(dN ) ⊂ A(d).
If (Xk , dk ) is separable for each k ∈ {1, ..., N }, then
A(d1 ) ⊗ · · · ⊗ A(dN ) = A(d).
By 6.1.29 we have
A(d1 ) ⊗ · · · ⊗ A(dN ) = A(G).
It is easy to see that ρ is a distance on X since properties di1 , di2 , di3 of 2.1.1 for the
function ρ follow immediately from the corresponding properties for the functions
dk . It is also immediate to see that, for a sequence {(x1,n , ..., xN,n )} in X and for
(x1 , ..., xN ) ∈ X,
ρ((x1,n , ..., xN,n ), (x1 , ..., xN )) → 0 as n → ∞ ⇔
(dk (xk,n , xk ) → 0 as n → ∞, ∀k ∈ {1, ..., N }) ⇔
d((x1,n , ..., xN,n ), (x1 , ..., xN )) → 0 as n → ∞.
By 2.3.4, this implies Kρ = Kd , and hence Tρ = Td .
Suppose now that (Xk , dk ) is separable for each k ∈ {1, ..., N }. Then, by 2.7.3c,
(X, d) is separable. Since Kρ = Kd , (X, ρ) is separable as well. By 2.3.17, this
implies that every element of Tρ is a countable union of open balls in X defined
with respect to ρ. But for any such ball Bρ ((x1 , ..., xN ), r) we have (denoting by
Bdk (xk , r) a ball in Xk defined with respect to dk ):
Bρ((x1, ..., xN), r) = Bd1(x1, r) × · · · × BdN(xN, r)
= ∩_{k=1}^N πk^{−1}(Bdk(xk, r)) ∈ A(G).
This proves the inclusion Tρ ⊂ A(G), i.e. Td ⊂ A(G), and therefore also the inclusion
A(d) := A(Td ) ⊂ A(G). In view of the inclusion A(G) ⊂ A(d) proved above, this
shows that
A(d1 ) ⊗ · · · ⊗ A(dN ) = A(G) = A(d).
show that G ⊂ A(R), and this implies that A(G) ⊂ A(R). Thus, we have A(dC ) =
A(G) = A(R).
Proof. Since a σ-algebra is always a monotone class and A0 ⊂ A(A0 ) (cf. gσ1
in 6.1.17), we have C(A0 ) ⊂ A(A0 ) (cf. gm2 ). Hence it is sufficient to show that
C(A0 ) is a σ-algebra, because then from A0 ⊂ C(A0 ) (cf. gm1 ) we can derive
A(A0 ) ⊂ C(A0 ) (cf. gσ2 in 6.1.17). Besides, if we prove that C(A0 ) is an algebra on
X, then for any sequence {En} in C(A0) we have ∪_{n=1}^N En ∈ C(A0) for each N ∈ N,
and hence ∪_{n=1}^∞ En = ∪_{N=1}^∞ (∪_{n=1}^N En) ∈ C(A0) by property mo1 of C(A0), and we
can conclude that C(A0) is a σ-algebra on X.
For each E ∈ C(A0 ) we define a collection C(E) of subsets of X by
C(E) := {F ∈ C(A0 ) : E − F, F − E, E ∩ F ∈ C(A0 )} .
Clearly, ∅ ∈ C(E) (since ∅ ∈ A0 ⊂ C(A0 )) and hence C(E) is not empty. Moreover,
if {Fn } is a sequence in C(E) such that Fn ⊂ Fn+1 for each n ∈ N, then:
E − (∪_{n=1}^∞ Fn) = E ∩ (∩_{n=1}^∞ (X − Fn)) = ∩_{n=1}^∞ (E − Fn) ∈ C(A0)
by property mo2 of C(A0 ), since E − Fn+1 ⊂ E − Fn for each n ∈ N;
(∪_{n=1}^∞ Fn) − E = ∪_{n=1}^∞ (Fn − E) ∈ C(A0)
by property mo1 of C(A0 ), since Fn − E ⊂ Fn+1 − E for each n ∈ N;
E ∩ (∪_{n=1}^∞ Fn) = ∪_{n=1}^∞ (E ∩ Fn) ∈ C(A0)
by property mo1 of C(A0 ), since E ∩ Fn ⊂ E ∩ Fn+1 for each n ∈ N.
This proves property mo1 for C(E). Property mo2 can be proved for C(E) in a
similar way. Thus, C(E) is a monotone class.
For each E ∈ A0 , it is clear (from 6.1.6 and A0 ⊂ C(A0 )) that A0 ⊂ C(E), so
that C(A0 ) ⊂ C(E) (cf. gm2 ). Hence, for each F ∈ C(A0 ), we have
E ∈ A0 ⇒ F ∈ C(E) ⇒ E ∈ C(F ),
where the second implication follows from the symmetry of the definition of C(E).
This proves that, for each F ∈ C(A0 ), A0 ⊂ C(F ) and hence C(A0 ) ⊂ C(F ) (cf.
gm2 ). Thus, if E, F ∈ C(A0 ) then E ∈ C(F ) and hence F − E and F ∩ E are
elements of C(A0 ). Since X ∈ A0 ⊂ C(A0 ), this implies that C(A0 ) is an algebra on
X:
if E ∈ C(A0 ), then X − E ∈ C(A0 );
if E, F ∈ C(A0 ), then X − E, X − F ∈ C(A0 ), and then
E ∪ F = X − ((X − E) ∩ (X − F )) ∈ C(A0 ).
This completes the proof.
6.2.1 Definition. Let (X1 , A1 ) and (X2 , A2 ) be measurable spaces, i.e. let X1 , X2
be non-empty sets and let A1 , A2 be σ-algebras on X1 , X2 respectively. A mapping
ϕ : X1 → X2 is said to be measurable w.r.t. (with respect to) A1 and A2 (or, simply,
measurable when no confusion can occur) if the following condition holds:
ϕ−1 (E) ∈ A1 , ∀E ∈ A2 .
6.2.3 Proposition. Let (X1 , A1 ), (X2 , A2 ) be measurable spaces, and suppose that
ϕ : X1 → X2 is a measurable mapping w.r.t. A1 and A2 . If Y is a non-empty
subset of X1 , the restriction ϕY of ϕ to Y is measurable w.r.t. AY1 and A2 .
6.2.4 Proposition. Let (X1 , A1 ) and (X2 , A2 ) be measurable spaces. For a map-
ping ϕ : X1 → X2 , let Y be a subset of X2 such that Rϕ ⊂ Y . Then the final set X2
can be replaced by Y , i.e. we can consider ϕ as ϕ : X1 → Y (cf. 1.2.1). However,
the following are equivalent conditions:
(a) ϕ is measurable w.r.t. A1 and A2 ;
(b) ϕ is measurable w.r.t. A1 and AY2 .
6.2.5 Theorem. Let (X1 , A1 ), (X2 , A2 ), (X3 , A3 ) be measurable spaces, and let
ϕ : X1 → X2 be a measurable mapping w.r.t. A1 and A2 and ψ : X2 → X3 a
measurable mapping w.r.t. A2 and A3 . Then ψ ◦ ϕ is a measurable mapping w.r.t
A1 and A3 .
Proof. Use the definition of measurable mapping (cf. 6.2.1) and (cf. 1.2.13f)
(ψ ◦ ϕ)−1 (S) = ϕ−1 (ψ −1 (S)), ∀S ∈ P(X3 ).
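The identity (ψ ◦ ϕ)^{−1}(S) = ϕ^{−1}(ψ^{−1}(S)) on which the proof rests can be checked directly on finite maps; the Python sketch below (an illustration with maps of our own choosing) does so for specific ϕ, ψ and S.

```python
def preimage(f, domain, S):
    """The preimage f^{-1}(S) of S under f, restricted to the given domain."""
    return {x for x in domain if f(x) in S}

X1 = range(10)
phi = lambda x: x % 4   # ϕ : X1 → X2 = {0, 1, 2, 3}
psi = lambda y: y // 2  # ψ : X2 → X3 = {0, 1}
S = {1}                 # a subset of X3

lhs = preimage(lambda x: psi(phi(x)), X1, S)          # (ψ ∘ ϕ)^{-1}(S)
rhs = preimage(phi, X1, preimage(psi, range(4), S))   # ϕ^{-1}(ψ^{-1}(S))
print(sorted(lhs))  # → [2, 3, 6, 7]
```

Both sides pick out exactly the points x with x % 4 ∈ {2, 3}.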
6.2.9 Proposition. Let N ∈ N and let (Xk , Ak ), (Yk , Bk ) be measurable spaces for
k = 1, ..., N . For k = 1, ..., N , let ϕk : Xk → Yk be a measurable mapping w.r.t. Ak
and Bk .
(a) The mapping
ϕ1 × · · · × ϕN : X1 × · · · × XN → Y1 × · · · × YN
(x1 , ..., xN ) 7→ (ϕ1 × · · · × ϕN )(x1 , ..., xN )
:= (ϕ1 (x1 ), ..., ϕN (xN ))
is measurable w.r.t. A1 ⊗ · · · ⊗ AN and B1 ⊗ · · · ⊗ BN .
(b) If (Z, C) is a measurable space and ρ : Y1 × · · · × YN → Z is a measurable
mapping w.r.t. B1 ⊗ · · · ⊗ BN and C, then the mapping
χ : X1 × · · · × XN → Z
(x1 , ..., xN ) 7→ χ(x1 , ..., xN ) := ρ(ϕ1 (x1 ), ..., ϕN (xN ))
is measurable w.r.t. A1 ⊗ · · · ⊗ AN and C.
6.2.10 Proposition. Let (X, A) be a measurable space, let N ∈ N, and let (Yk , Bk )
be a measurable space for k = 1, ..., N . For k = 1, ..., N , let ϕk : X → Yk be a
measurable mapping w.r.t. A and Bk .
(a) The mapping
ϕ : X → Y1 × · · · × YN
x 7→ ϕ(x) := (ϕ1 (x), ..., ϕN (x))
is measurable w.r.t. A and B1 ⊗ · · · ⊗ BN .
(b) If (Z, C) is a measurable space and ρ : Y1 × · · · × YN → Z is a measurable
mapping w.r.t. B1 ⊗ · · · ⊗ BN and C, then the mapping
χ :X → Z
x 7→ χ(x) := ρ(ϕ1 (x), ..., ϕN (x))
is measurable w.r.t. A and C.
By 6.1.30a and 6.2.7, this proves that the mapping ι is measurable w.r.t. A and
A ⊗ · · · N times · · · ⊗ A. Now, by 6.2.9a the mapping ϕ1 × · · · × ϕN is measurable
w.r.t. A ⊗ · · · N times · · · ⊗ A and B1 ⊗ · · · ⊗ BN . Then ϕ is measurable w.r.t. A
and B1 ⊗ · · · ⊗ BN by 6.2.5, since ϕ = (ϕ1 × · · · × ϕN ) ◦ ι.
b: This follows at once from part a and 6.2.5, since χ = ρ ◦ ϕ.
Proof. With the exception of the second assertion of part b, everything follows at
once from 6.2.7 along with 6.1.25, 6.1.26, 6.1.33, 6.1.22, 6.1.23.
As to the second assertion of part b, we notice that the condition
ϕ−1 (E) ∈ A, ∀E ∈ I1∗
is precisely the condition
ϕ−1 ({−∞}), ϕ−1 ({∞}) ∈ A and ϕ−1 (E) ∈ A for all E ∈ A(dR ).
We also notice that the condition
ϕ−1 (E) ∈ A, ∀E ∈ A(dR )
can be written as
∃F ∈ A s.t. (ϕ_{ϕ^{−1}(R)})^{−1}(E) = ϕ^{−1}(E) = F = F ∩ ϕ^{−1}(R), ∀E ∈ A(dR),
and that this in its turn can be written as
(ϕ_{ϕ^{−1}(R)})^{−1}(E) ∈ A^{ϕ^{−1}(R)}, ∀E ∈ A(dR).
Finally, we notice that the last condition is the condition of A^{ϕ^{−1}(R)}-measurability
for the mapping ϕ_{ϕ^{−1}(R)}.
6.2.14 Remark. Let (X, A) be a measurable space. From 6.2.4 and 6.1.24 it follows
that, for the A-measurability of a function ϕ : X → R, it is immaterial what choice
is made among R, R∗ and C for the final set of ϕ. For this reason, in what follows
we consider only functions whose final sets are either R∗ or C.
6.2.15 Definition. For a measurable space (X, A), we denote by M(X, A) the
family of all the functions, with X as domain and C as final set, that are
A-measurable, i.e. M(X, A) is the family of all A-measurable complex functions
on X:
M(X, A) := {ϕ ∈ F (X) : ϕ is A-measurable}
(for F (X) cf. 3.1.10c).
Proof. For lm1 and sa2 , use 6.2.10b with N := 2, (Y1 , B1 ) = (Y2 , B2 ) = (C, A(dC )),
ϕ1 := ϕ, ϕ2 := ψ and either
C × C ∋ (z1 , z2 ) 7→ ρ(z1 , z2 ) := z1 + z2 ∈ C
or
C × C ∋ (z1 , z2 ) 7→ ρ(z1 , z2 ) := z1 z2 ∈ C.
Notice in fact that in either case ρ is A(dC × dC )-measurable since it is continuous
(cf. 6.2.8), and that A(dC × dC ) = A(dC ) ⊗ A(dC ) by 6.1.31.
Condition lm2 follows from sa2 , since αϕ = αX ϕ (for the constant function αX
cf. 1.2.19) and αX ∈ M(X, A) by 6.2.2.
Finally, 1X ∈ M(X, A) by 6.2.2.
6.2.17 Proposition. Let (X, A) be a measurable space and ϕ ∈ M(X, A). Then:
ϕ̄ ∈ M(X, A);
|ϕ|^n ∈ M(X, A), ∀n ∈ N;
the function 1/ϕ is A^{D_{1/ϕ}}-measurable
(for the functions ϕ̄, |ϕ|^n and 1/ϕ, cf. 1.2.19).
Finally, 1/ϕ is A^{D_{1/ϕ}}-measurable by 6.2.6. Indeed, 1/ϕ = ψ ◦ ϕ if ψ is the function
C − {0} ∋ z 7→ ψ(z) := 1/z ∈ C.
Now ψ is continuous, hence A(dC−{0})-measurable by 6.2.8 (we have denoted by
dC−{0} the restriction of dC to (C − {0}) × (C − {0})). Moreover, A(dC−{0}) =
(A(dC))^{C−{0}} = (A(dC))^{Dψ} by 6.1.21 (cf. also 6.1.22).
inf_{k≥n} ϕk : X → R∗.
If the sequence {ϕn(x)} is convergent (in the metric space (R∗, δ)) for all x ∈ X,
we define the function
lim_{n→∞} ϕn : X → R∗,
and from 5.2.6 we have lim_{n→∞} ϕn = inf_{n≥1} (sup_{k≥n} ϕk) = sup_{n≥1} (inf_{k≥n} ϕk).
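The identity lim_{n→∞} ϕn = inf_{n≥1}(sup_{k≥n} ϕk) = sup_{n≥1}(inf_{k≥n} ϕk) can be illustrated numerically by truncating the sequence; in the sketch below (our own, with an arbitrarily chosen convergent sequence ϕn(x) = x + (−1)^n/n) both iterated expressions approach the pointwise limit x.

```python
# ϕ_n(x) = x + (-1)^n / n converges to x for every x; we check the identity
# lim = inf_n sup_{k≥n} = sup_n inf_{k≥n} on a finite truncation.
N = 2000
x = 0.5
phi = [x + (-1)**n / n for n in range(1, N + 1)]

inf_sup = min(max(phi[n:]) for n in range(N // 2))  # inf_n sup_{k≥n} ϕ_k(x)
sup_inf = max(min(phi[n:]) for n in range(N // 2))  # sup_n inf_{k≥n} ϕ_k(x)
print(abs(inf_sup - x) < 1e-2 and abs(sup_inf - x) < 1e-2)  # → True
```

Both truncated quantities land within 1/1000 of the limit x = 0.5, on opposite sides of it.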
6.2.19 Proposition. Let (X, A) be a measurable space and let {ϕn } be a sequence
of A-measurable functions ϕn : X → R∗ .
(a) For each n ∈ N, the functions sup_{k≥n} ϕk and inf_{k≥n} ϕk are A-measurable.
(b) If the sequence {ϕn(x)} is convergent for all x ∈ X, then the function
lim_{n→∞} ϕn is A-measurable.
where ⇐ holds by 1 and ⇒ holds by 2. This proves that
(sup_{k≥n} ϕk)^{−1}((a, ∞]) = ∪_{k=n}^∞ ϕk^{−1}((a, ∞]).
Since this is true for all a ∈ R, it proves that sup_{k≥n} ϕk is A-measurable, in view
of 6.2.13b with F∗ := I2∗.
For inf k≥n ϕk the proof is analogous.
b: Let the sequence {ϕn(x)} be convergent for all x ∈ X. Then lim_{n→∞} ϕn =
inf_{n≥1} (sup_{k≥n} ϕk). Now, sup_{k≥n} ϕk is A-measurable for each n ∈ N, in view of part
a. Then, in view of part a once again, inf_{n≥1} (sup_{k≥n} ϕk) is A-measurable.
Now, the functions Re ϕn and Im ϕn are A-measurable for each n ∈ N (cf. 6.2.12).
Thus, 6.2.19b implies that lim_{n→∞} Re ϕn and lim_{n→∞} Im ϕn are A-measurable (we
have also used 6.2.14 twice). Then, lim_{n→∞} ϕn is A-measurable by 6.2.16 (or by
6.2.10a and the equality A(dC) = A(dR) ⊗ A(dR)).
d: This result follows from 6.2.16 and corollary c, since
Σ_{n=1}^∞ ϕn = lim_{n→∞} Σ_{k=1}^n ϕk.
Ei ∩ Ej = ∅ if i ≠ j and ψ = Σ_{k=1}^n αk χ_{Ek}.
We point out the obvious fact that, for ψ ∈ S(X, A), its representation ψ =
Σ_{k=1}^n αk χ_{Ek} as in the definition above is never unique (it would be if we required
αi ≠ αj for i ≠ j and ∪_{k=1}^n Ek = X, but we do not).
Then χEk ∈ M(X, A) for k = 1, ..., n (cf. 6.2.21), and hence ψ ∈ M(X, A) by
6.2.16. Moreover, ψ has finitely many values since the only possible values of ψ are
the numbers 0, α1 , ..., αn .
b ⇒ a: Assume condition b and let {α1 , ..., αn } := Rψ and Ek := ψ −1 ({αk }) for
k = 1, ..., n.
We have Ek ∈ A by 6.2.13c (with G = KdC ) since ψ is A-measurable and
{αk } ∈ KdC . Moreover,
Ei ∩ Ej = ψ −1 ({αi } ∩ {αj }) = ψ −1 (∅) = ∅ if i 6= j.
Finally, since X = ∪_{k=1}^n Ek we have
∀x ∈ X, ∃!i ∈ {1, ..., n} s.t. x ∈ Ei, hence s.t. ψ(x) = αi = Σ_{k=1}^n αk χ_{Ek}(x),
and this proves that ψ = Σ_{k=1}^n αk χ_{Ek}.
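The construction in the proof of b ⇒ a — take the finitely many values αk and the preimages Ek := ψ^{−1}({αk}) — is easy to carry out when X is finite. The following Python sketch (the function name is ours) builds this canonical representation and verifies ψ = Σ_k αk χ_{Ek} pointwise.

```python
def canonical_representation(psi, X):
    """Given a function psi with finitely many values on the finite set X,
    return the pairs (alpha_k, E_k) with E_k := psi^{-1}({alpha_k})."""
    values = sorted(set(psi(x) for x in X))
    return [(a, {x for x in X if psi(x) == a}) for a in values]

X = range(6)
psi = lambda x: x % 2  # a simple function with values 0 and 1
rep = canonical_representation(psi, X)

# ψ(x) = Σ_k α_k · χ_{E_k}(x): reconstruct from the representation and compare
recon = lambda x: sum(a * (x in E) for a, E in rep)
print([psi(x) == recon(x) for x in X])
```

The sets Ek produced this way are automatically disjoint and cover X, which is exactly what the proof uses.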
Proof. In view of 6.2.23 and 6.2.16, we only need to notice that, if α ∈ C and
ψ1 , ψ2 , ψ ∈ S(X, A), then ψ1 + ψ2 , αψ, ψ1 ψ2 have only finitely many values.
6.2.26 Theorem. Let (X, A) be a measurable space and ϕ ∈ L+ (X, A). Then there
exists a sequence {ψn } in S + (X, A) such that:
6.2.27 Corollary. Let (X, A) be a measurable space and ϕ ∈ M(X, A). Then there
exists a sequence {ψn } in S(X, A) such that:
(a) |ψn | ≤ |ϕ|, ∀n ∈ N,
(b) ψn (x) → ϕ(x) as n → ∞, ∀x ∈ X,
(c) if a subset Y of X and m ∈ [0, ∞) are such that |ϕ(x)| ≤ m for all x ∈ Y , then
sup{|ϕ(x) − ψn (x)| : x ∈ Y } → 0 as n → ∞.
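One standard way to produce such approximating simple functions for a nonnegative ϕ — not necessarily the construction used in the proof of 6.2.26 — is the dyadic truncation ψn := min(⌊2^n ϕ⌋/2^n, n), which satisfies ψn ≤ ϕ and approximates ϕ within 2^{−n} wherever ϕ ≤ n. A numerical sketch under that assumption:

```python
import math

def psi_n(phi, n):
    # Dyadic simple-function approximation: values k/2^n, truncated at n
    return lambda x: min(math.floor(phi(x) * 2**n) / 2**n, n)

phi = lambda x: x * x              # a bounded nonnegative function on [0, 2]
Y = [i / 100 for i in range(201)]  # sample points of [0, 2]; ϕ ≤ 4 there

errors = [max(abs(phi(x) - psi_n(phi, n)(x)) for x in Y) for n in (3, 5, 8)]
print(errors)
```

For n = 3 the truncation at the value 3 dominates the error (since ϕ reaches 4), while for n = 5 and n = 8 the error is bounded by 2^{−n} on all of [0, 2], matching condition c of the corollary.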
6.2.28 Definition. For a measurable space (X, A), we denote by MB (X, A) the
family of all bounded A-measurable complex functions on X, i.e. we define
MB (X, A) := M(X, A) ∩ FB (X)
(for FB (X), cf. 3.1.10d).
6.2.29 Remarks. Let (X, A) be a measurable space. Since M(X, A) and FB (X)
are subalgebras of the associative algebra F (X) (cf. 6.2.16 and 3.3.8b), MB (X, A) is
a subalgebra of the associative algebra F (X) (cf. 3.3.4), and hence of the associative
algebras M(X, A) and FB (X) as well (cf. 3.3.3b).
Since FB (X) is a normed algebra (cf. 4.3.6a), MB (X, A) is a normed algebra
as well (cf. 4.3.2).
Since convergence with respect to the k k∞ norm (cf. 4.3.6a) implies pointwise
convergence, 6.2.20c shows that MB (X, A) is a closed subset of FB (X). Hence,
since FB (X) is a Banach space (cf. 4.3.6a), MB (X, A) is a Banach space as well
(cf. 4.1.8a) and hence a Banach algebra.
Clearly, S(X, A) ⊂ MB (X, A). Since S(X, A) is a subalgebra of the associative
algebra M(X, A) (cf. 6.2.24), S(X, A) is a subalgebra of the associative algebra
MB (X, A) as well (cf. 3.3.3b).
Finally, 6.2.27c and 2.3.12 show that S(X, A) is dense (in the k k∞ norm) in
MB (X, A).
ϕ + ψ : Dϕ ∩ Dψ → [0, ∞]
x 7→ (ϕ + ψ)(x) := ϕ(x) + ψ(x),
ϕψ : Dϕ ∩ Dψ → [0, ∞]
x 7→ (ϕψ)(x) := ϕ(x)ψ(x).
aϕ : Dϕ → [0, ∞]
x 7→ (aϕ)(x) := aϕ(x).
Clearly, if Rϕ ⊂ [0, ∞) and a ∈ [0, ∞) then this definition is in agreement with the
one given in 1.2.19.
Proof. It is obvious that Rϕ1+ϕ2 ⊂ [0, ∞]. By 6.2.26, there are two sequences {ψn^1}
and {ψn^2} in S+(X, A) so that, for i = 1, 2,
In this section we prove a result about Borel functions which has hardly anything
to do with the theory of measure and integration, but which will play an essential
role in our proof of the spectral theorem for unitary operators (from which we will
deduce the spectral theorem for self-adjoint operators).
6.3.1 Definition. Let X be a non-empty set, {ϕn } a sequence in F (X) (for F (X),
cf. 3.1.10c), and ϕ ∈ F (X). We say that ϕ is the uniformly bounded pointwise limit,
in short ubp limit, of {ϕn} and we write ϕn −ubp→ ϕ if the following two conditions
are satisfied:
∃m ∈ [0, ∞) such that |ϕn (x)| ≤ m, ∀x ∈ X, ∀n ∈ N;
ϕn (x) → ϕ(x) as n → ∞, ∀x ∈ X.
Clearly, if ϕn −ubp→ ϕ then ϕ ∈ FB (X) (for FB (X), cf. 3.1.10d).
A family of functions V ⊂ F (X) is said to be ubp closed if the following condition
is satisfied:
[ϕ ∈ F (X), {ϕn} a sequence in V, ϕn −ubp→ ϕ] ⇒ ϕ ∈ V.
6.3.2 Lemma. Let (X, d) be a metric space, define the collection of families of
functions
Γ := {V ⊂ F (X) : V is ubp closed and CB (X) ⊂ V}
(for CB (X), cf. 3.1.10e), and then define the family of functions
E := ∩_{V∈Γ} V.
and hence χ_{∪_{n=1}^N En} −ubp→ χ_{∪_{n=1}^∞ En}. Since E is ubp closed (cf. property a), this
implies that χ_{∪_{n=1}^∞ En} ∈ E, and hence ∪_{n=1}^∞ En ∈ A.
6.3.4 Theorem. Let (X, d) be a metric space. The family MB (X, A(d)) of all
bounded Borel functions is the smallest family of complex functions on X that is
ubp closed and that contains CB (X). More explicitly:
(a) MB (X, A(d)) is ubp closed and CB (X) ⊂ MB (X, A(d));
(b) if V ⊂ F (X), V is ubp closed, and CB (X) ⊂ V, then MB (X, A(d)) ⊂ V.
Proof. a: Suppose that a sequence {ϕn} in MB (X, A(d)) and ϕ ∈ F (X) are such
that ϕn −ubp→ ϕ. Then ϕ ∈ M(X, A(d)) by 6.2.20c, and ϕ ∈ FB (X) since ϕ is a
ubp limit. This shows that MB (X, A(d)) is ubp closed. Moreover, the inclusion
CB (X) ⊂ MB (X, A(d)) holds by 6.2.8.
b: We prove this property of MB (X, A(d)) by proving the inclusion
MB (X, A(d)) ⊂ E, where E is the family of functions defined in 6.3.2. Indeed,
for every ϕ ∈ MB (X, A(d)), by 6.2.27 there exists a sequence {ψn} in S(X, A(d))
such that ψn −ubp→ ϕ. Now, properties c and d of 6.3.2 imply that S(X, A(d)) ⊂ E.
Since E is ubp closed (cf. 6.3.2a), this shows that ϕ ∈ E.
6.3.6 Remark. By substituting pointwise limits for ubp limits and dropping ev-
erywhere any condition of boundedness, the whole reasoning of this section can be
rerun to prove that, for any metric space (X, d), the family M(X, A(d)) of all Borel
functions is the smallest subset of F (X) that is closed with respect to pointwise
convergence and that contains C(X) (for C(X), cf. 3.1.10e).
Chapter 7
Measures
(cf. 6.1.8). If ∪_{n=1}^∞ En ∈ A0, then we have
µ0(∪_{n=1}^∞ En) = µ0(∪_{n=1}^∞ Fn) = Σ_{n=1}^∞ µ0(Fn) ≤ Σ_{n=1}^∞ µ0(En)
and hence, if ∪_{n=1}^∞ En ∈ A0, by pm we have
µ0(∪_{n=1}^∞ En) = µ0(∪_{n=1}^∞ Fn) = Σ_{n=1}^∞ µ0(Fn) = lim_{n→∞} Σ_{k=1}^n µ0(Fk) = lim_{n→∞} µ0(En),
or
µ0(∪_{n=1}^∞ En) = Σ_{n=1}^∞ µ0(Fn) = sup_{n≥1} Σ_{k=1}^n µ0(Fk) = sup_{n≥1} µ0(En).
∪_{n=1}^∞ Fn = ∪_{n=1}^∞ (E1 ∩ (X − En)) = E1 ∩ (X − ∩_{n=1}^∞ En) = E1 − ∩_{n=1}^∞ En.
Notice that lim_{n→∞} µ0(Fn) < ∞ since ∪_{n=1}^∞ Fn ⊂ E1 implies, by the monotonicity
of µ0, µ0(∪_{n=1}^∞ Fn) ≤ µ0(E1) < ∞. We also have
E1 = (∩_{n=1}^∞ En) ∪ (∪_{n=1}^∞ Fn) and (∩_{n=1}^∞ En) ∩ (∪_{n=1}^∞ Fn) = ∅,
whence, by af2,
µ0(E1) = µ0(∩_{n=1}^∞ En) + µ0(∪_{n=1}^∞ Fn),
and hence
µ0(∩_{n=1}^∞ En) = µ0(E1) − µ0(∪_{n=1}^∞ Fn).
Thus we have
µ0(∩_{n=1}^∞ En) = µ0(E1) − lim_{n→∞} µ0(Fn).
From
∀n ∈ N, E1 = En ∪ Fn and En ∩ Fn = ∅
we have
∀n ∈ N, µ0 (E1 ) = µ0 (En ) + µ0 (Fn ),
whence, since En ⊂ E1 implies µ0 (En ) ≤ µ0 (E1 ) < ∞,
∀n ∈ N, µ0 (Fn ) = µ0 (E1 ) − µ0 (En ),
and this implies, since all the terms involved are in R and so is limn→∞ µ0 (Fn ),
that the sequence {µ0 (En )} is convergent to a limit in R and that
lim_{n→∞} µ0(Fn) = µ0(E1) − lim_{n→∞} µ0(En).
Finally, from 7.1.2a and from 5.2.5 we obtain lim_{n→∞} µ0(En) = inf_{n≥1} µ0(En).
Proof. First we prove that conditions a and b imply that µ0 has property pm of
7.1.3. Let then {En} be a sequence in A0 such that ∪_{n=1}^∞ En ∈ A0 and Ei ∩ Ej = ∅
if i 6= j. The additivity and the monotonicity of µ0 imply that
∀N ∈ N, Σ_{n=1}^N µ0(En) = µ0(∪_{n=1}^N En) ≤ µ0(∪_{n=1}^∞ En),
whence
Σ_{n=1}^∞ µ0(En) := sup_{N≥1} Σ_{n=1}^N µ0(En) ≤ µ0(∪_{n=1}^∞ En). (1)
If Σ_{n=1}^∞ µ0(En) = ∞, 1 implies that
Σ_{n=1}^∞ µ0(En) = µ0(∪_{n=1}^∞ En).
Thus our task is now to prove that conditions a and b imply that
µ0(∪_{n=1}^∞ En) ≤ Σ_{n=1}^∞ µ0(En),
assuming that Σ_{n=1}^∞ µ0(En) < ∞ (we will see that, in this part of the proof, no
role is played by the condition Ei ∩ Ej = ∅ if i ≠ j, which however has already
played its role in the proof of 1). Assume then Σ_{n=1}^∞ µ0(En) < ∞ (this implies that
µ0(En) < ∞ for all n ∈ N), and consider any F ∈ A0 such that F̄ ⊂ ∪_{n=1}^∞ En and
such that F̄ is compact. Choose ǫ > 0. For each n ∈ N, condition b for En implies
that
∃Gn,ǫ ∈ A0 s.t. En ⊂ (Gn,ǫ)◦ and µ0(Gn,ǫ) − µ0(En) < ǫ/2^n.
Since ∪_{n=1}^∞ (Gn,ǫ)◦ ⊃ ∪_{n=1}^∞ En ⊃ F̄ and F̄ is compact, there exists N ∈ N so that
∪_{n=1}^N (Gn,ǫ)◦ ⊃ F̄, and hence so that
∪_{n=1}^N Gn,ǫ ⊃ ∪_{n=1}^N (Gn,ǫ)◦ ⊃ F̄ ⊃ F.
Then we have
Σ_{n=1}^∞ µ0(En) ≥ Σ_{n=1}^N µ0(En) > Σ_{n=1}^N (µ0(Gn,ǫ) − ǫ/2^n) > Σ_{n=1}^N µ0(Gn,ǫ) − ǫ
≥ µ0(∪_{n=1}^N Gn,ǫ) − ǫ ≥ µ0(F) − ǫ,
where the subadditivity and the monotonicity of µ0 have been used. Since ǫ was
arbitrary, this proves that
Σ_{n=1}^∞ µ0(En) ≥ µ0(F).
Since F was any element of A0 such that F̄ ⊂ ∪_{n=1}^∞ En and such that F̄ was compact,
condition a for ∪_{n=1}^∞ En implies that
Σ_{n=1}^∞ µ0(En) ≥ µ0(∪_{n=1}^∞ En),
which along with 1 proves that
Σ_{n=1}^∞ µ0(En) = µ0(∪_{n=1}^∞ En).
Now suppose that µ0 (X) < ∞. Notice that this implies that µ0 (H) < ∞ for
all H ∈ A0 , by the monotonicity of µ0 . We will prove that from this it follows
that condition a implies condition b. Assume then condition a, and consider any
E ∈ A0 . Choose ǫ > 0. Since X − E ∈ A0 , condition a for X − E implies that
∃Fǫ ∈ A0 s.t. F̄ǫ ⊂ X − E and µ0(X − E) − µ0(Fǫ) < ǫ.
Letting Gǫ := X − Fǫ , we have:
Gǫ ∈ A0 ;
E ⊂ (Gǫ)◦ (because F̄ǫ ⊂ X − E implies E ⊂ X − F̄ǫ, and X − F̄ǫ ⊂ (X − Fǫ)◦
follows from X − F̄ǫ ∈ Td and X − F̄ǫ ⊂ X − Fǫ);
µ0 (Gǫ ) − µ0 (E) = µ0 (X) − µ0 (Fǫ ) − µ0 (E) = µ0 (X − E) − µ0 (Fǫ ) < ǫ, where
µ0 (Gǫ ) = µ0 (X) − µ0 (Fǫ ) and µ0 (X) − µ0 (E) = µ0 (X − E) are true because µ0
is finite.
Since ǫ was arbitrary, this proves condition b for E.
Proof. We will prove this corollary by proving that condition a of the statement
implies condition 7.1.5a and that condition b implies condition 7.1.5b.
Consider then E ∈ A0 (S). By 6.1.11, there is a finite and disjoint family
{E1 , ..., En } of elements of S so that E = ∪_{k=1}^n Ek.
Assume condition a and suppose first that µ0(E) = ∞. Then there exists
l ∈ {1, ..., n} such that µ0(El) = ∞, for otherwise we would have µ0(E) < ∞ by
the additivity of µ0 . Notice that, since S ⊂ A0 (S) and El ⊂ E,
{F ∈ S : F̄ ⊂ El, F̄ is compact} ⊂ {F ∈ A0(S) : F̄ ⊂ E, F̄ is compact}.
Condition a for El is
∞ = µ0(El) = sup{µ0(F) : F ∈ S, F̄ ⊂ El, F̄ is compact}.
Hence we have
sup{µ0(F) : F ∈ A0(S), F̄ ⊂ E, F̄ is compact} = ∞ = µ0(E),
which is condition 7.1.5a for E.
Assume next condition a and suppose that µ0 (E) < ∞. Then µ0 (Ek ) < ∞ for
k = 1, ..., n by the monotonicity of µ0 . Choose ǫ > 0. For k = 1, ..., n, condition a
for Ek implies that
∃Fk,ǫ ∈ S s.t. F̄k,ǫ ⊂ Ek, F̄k,ǫ is compact, µ0(Ek) − µ0(Fk,ǫ) < ǫ/n.
Letting Fǫ := ∪_{k=1}^n Fk,ǫ, we have:
Proof. For each E ∈ P(X), conditions om3 and om1 imply that
µ∗ (A) ≤ µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)), ∀A ∈ P(X), (1)
since A = (A ∩ E) ∪ (A ∩ (X − E)), and 1 implies that
µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)) for all A ∈ P(X) such that µ∗ (A) = ∞. (2)
If E ∈ P(X) is such that
µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)) ≤ µ∗ (A) for all A ∈ P(X) such that µ∗ (A) < ∞,
then in view of 1 we have
µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A ∩ (X − E)) for all A ∈ P(X) such that µ∗ (A) < ∞,
which, together with 2, proves that E is µ∗ -measurable.
is obvious. Suppose then µ∗(Ek) < ∞ for all k ∈ N and choose ǫ > 0. For each
k ∈ N there exists a sequence {A^k_n} in E such that
Ek ⊂ ∪_{n=1}^∞ A^k_n and Σ_{n=1}^∞ ρ(A^k_n) < µ∗(Ek) + ǫ/2^k.
Then we have
∪_{k=1}^∞ Ek ⊂ ∪_{(k,n)∈N×N} A^k_n
and
Σ_{(k,n)∈N×N} ρ(A^k_n) = Σ_{k=1}^∞ (Σ_{n=1}^∞ ρ(A^k_n)) ≤ Σ_{k=1}^∞ (µ∗(Ek) + ǫ/2^k) = Σ_{k=1}^∞ µ∗(Ek) + ǫ,
Proof. A, existence: Let E ∈ A0 (S). Then there exists a finite and disjoint family
{E1 , ..., EN } of elements of S such that E = ∪_{n=1}^N En (cf. 6.1.11). Suppose now
that there exists another finite and disjoint family {F1 , ..., FL } of elements of S such
that E = ∪_{l=1}^L Fl. If we define Gn,l := En ∩ Fl , we have Gn,l ∈ S for all n = 1, ..., N
and l = 1, ..., L, and Gn,l ∩ Gm,k = ∅ if (n, l) ≠ (m, k). By condition b, we also have
for n = 1, ..., N, En = ∪_{l=1}^L Gn,l, whence ν(En) = Σ_{l=1}^L ν(Gn,l),
for l = 1, ..., L, Fl = ∪_{n=1}^N Gn,l, whence ν(Fl) = Σ_{n=1}^N ν(Gn,l),
and hence
Σ_{n=1}^N ν(En) = Σ_{n=1}^N Σ_{l=1}^L ν(Gn,l) = Σ_{l=1}^L Σ_{n=1}^N ν(Gn,l) = Σ_{l=1}^L ν(Fl).
of elements of S s.t. E = ∪_{n=1}^N En.
Obviously, µ0 is an extension of ν.
Since µ0 is an extension of ν, µ0 has property af1 of 7.1.1 because ν satisfies
condition a. Let now {E1, E2} be a disjoint pair of elements of A0(S), and let
{E^1_1, ..., E^1_{N1}} and {E^2_1, ..., E^2_{N2}} be finite and disjoint families of elements of S such
that Ei = ∪_{n=1}^{Ni} E^i_n for i = 1, 2. Then {E^i_n}_{i=1,2; n=1,...,Ni} is a finite and disjoint
family, and
µ0(E1 ∪ E2) = Σ_{i=1,2} Σ_{n=1}^{Ni} ν(E^i_n) = Σ_{n=1}^{N1} ν(E^1_n) + Σ_{n=1}^{N2} ν(E^2_n) = µ0(E1) + µ0(E2).
Applying induction to this result, we obtain property af2 of 7.1.1 for µ0, which is
therefore an additive function on A0 (S).
A, uniqueness: Suppose that µ̃0 is an additive function on A0 (S) which extends
ν. For any E ∈ A0 (S), let {E1 , ..., EN } be a finite and disjoint family of elements of
S such that E = ∪_{n=1}^N En. Then we have, by the additivity of µ̃0 and the definition
of µ0,
µ̃0(E) = Σ_{n=1}^N µ̃0(En) = Σ_{n=1}^N ν(En) = µ0(E).
B: Assume now condition c, and suppose that {Fn} is a sequence in A0 (S) such
that F := ∪_{n=1}^∞ Fn ∈ A0 (S) and Fi ∩ Fj = ∅ if i ≠ j.
There are finite and disjoint families {A1 , ..., AN } and {Bn,1 , ..., Bn,Nn } (for each
n ∈ N) of elements of S so that (cf. 6.1.11)
F = ∪_{k=1}^N Ak and Fn := ∪_{l=1}^{Nn} Bn,l (for each n ∈ N).
Define Ck,n,l := Ak ∩ Bn,l; then
∪_{k=1}^N Ck,n,l = Bn,l, ∪_{l=1}^{Nn} Ck,n,l = Ak ∩ Fn, ∪_{n=1}^∞ (Ak ∩ Fn) = Ak.
We have
ν(Bn,l) = Σ_{k=1}^N ν(Ck,n,l)
where we have used induction applied to 5.3.2e, induction applied to 5.4.6, and
5.3.3.
On the other hand, the additivity and the monotonicity of µ0 imply that
∀N ∈ N, Σ_{n=1}^N µ0(Fn) = µ0(∪_{n=1}^N Fn) ≤ µ0(F),
and note that ME is non-empty (take A1 := X and An := ∅ for n > 1). Then the
function
µ : A(A0 ) → [0, ∞]
E 7→ µ(E) := inf ME
is a measure on A(A0 ) (the σ-algebra on X generated by A0 ) and µ is an extension
of µ0 , i.e.
µ(E) = µ0 (E), ∀E ∈ A0 .
If µ̃ is another measure on A(A0 ) that is an extension of µ0 , then:
µ̃(E) ≤ µ(E), ∀E ∈ A(A0 );
µ̃(E) = µ(E) for each E ∈ A(A0 ) such that µ(E) < ∞;
µ̃ = µ if µ0 is σ-finite.
(e) there exists a countable family {En }n∈I of elements of S such that ν(En ) < ∞
for all n ∈ I and X = ∪n∈I En ,
Proof. Conditions a and b imply that there exists a unique additive function µ0
on A0 (S) that extends ν (cf. 7.3.1A). Then conditions c and d become respectively
conditions a and b of 7.1.6 for µ0 , and this implies that µ0 is a premeasure. If there
is a finite family {E1 , ..., EN } of elements of S such that ν(En) < ∞ for n = 1, ..., N
and X = ∪_{n=1}^N En, then the additive function µ0 that exists on A0 (S) is finite (since
it is subadditive), and hence condition c is enough to make µ0 a premeasure (cf.
7.1.6).
Since µ0 is a premeasure on A0 (S), 7.3.2 implies that there is a measure µ on
A(A0 (S)) that extends µ0 . Since A(A0 (S)) = A(S) (cf. 6.1.18) and µ0 extends ν,
this proves that there exists a measure µ on A(S) that extends ν.
Finally, assume that ν satisfies also condition e. Then clearly µ and µ0 are
σ-finite. Suppose that µ̃ is another measure on A(S) that extends ν. Then the
restriction µ̃A0 (S) of µ̃ to A0 (S) is an additive function on A0 (S) that extends ν
and we have µ̃A0 (S) = µ0 by the uniqueness asserted in 7.3.1A. Hence, µ̃ is an
extension of µ0 and we have µ̃ = µ by the uniqueness asserted in 7.3.2 in the event
that µ0 is σ-finite.
The content of the first part of this section will be used mainly in the study of
the product of two commuting projection valued measures, in Section 13.5. Indeed,
for that study, which is a necessary step for the spectral theory of two commuting
self-adjoint operators, it is essential to prove first that every finite measure on the
Borel σ-algebra A(dR ) on R is regular. This can be achieved in several ways. The
way we adopt here is borrowed from Section 2.7 of (Parthasarathy, 2005), and will
allow us to prove a more general result about commuting projection valued measures
than the one that is required by the spectral theory of two commuting self-adjoint
operators.
Lusin’s theorem is presented in the second part of this section. It will be used
to prove that C(a, b) is isomorphic to a dense linear manifold in the Hilbert space
L2 (a, b) (cf. 11.2.1).
Throughout this section, (X, d) stands for a metric space.
7.4.1 Proposition. Let µ be a finite measure on the Borel σ-algebra A(d) (cf.
6.1.22). Then, for each E ∈ A(d) the following conditions are both satisfied:
(a) µ(E) = sup {µ(F ) : F ⊂ E, F ∈ Kd };
(b) µ(E) = inf {µ(G) : E ⊂ G, G ∈ Td }.
Proof. We prove first that, for E ∈ A(d), conditions a and b together are equivalent
to the one condition
(c) ∀ǫ > 0, ∃Fǫ ∈ Kd, ∃Gǫ ∈ Td s.t. Fǫ ⊂ E ⊂ Gǫ and µ(Gǫ − Fǫ) < ǫ.
On the one hand in fact, if conditions a and b are true for E ∈ A(d), then (since µ
is finite) for ǫ > 0 there exist Fǫ ∈ Kd and Gǫ ∈ Td such that Fǫ ⊂ E ⊂ Gǫ and
µ(E) − µ(Fǫ) < ǫ/2 and µ(Gǫ) − µ(E) < ǫ/2. (1)
Since Gǫ − Fǫ = (Gǫ − E) ∪ (E − Fǫ ) (cf. 1.1.4) and (Gǫ − E) ∩ (E − Fǫ ) = ∅, and
since µ is finite, 1 implies that
µ(Gǫ − Fǫ ) = µ(Gǫ − E) + µ(E − Fǫ ) = µ(Gǫ ) − µ(E) + µ(E) − µ(Fǫ ) < ǫ,
and this shows that condition c is true for E. On the other hand, if condition c is
true for E ∈ A(d), then for ǫ > 0 we have
µ(E) − µ(Fǫ ) = µ(E − Fǫ ) ≤ µ(Gǫ − Fǫ ) < ǫ and
µ(Gǫ ) − µ(E) = µ(Gǫ − E) ≤ µ(Gǫ − Fǫ ) < ǫ
since E − Fǫ ⊂ Gǫ − Fǫ and Gǫ − E ⊂ Gǫ − Fǫ and since µ is finite, and this shows
that conditions a and b are true for E.
Let now B denote the collection of all the subsets of X for which condition c is
satisfied. We will prove that B is a σ-algebra on X and that Kd ⊂ B.
In the first place, we prove that condition al2 of 6.1.5 is satisfied for B. Indeed,
let E ∈ B and ǫ > 0. Then there exist Fǫ ∈ Kd and Gǫ ∈ Td such that Fǫ ⊂ E ⊂ Gǫ
and µ(Gǫ −Fǫ ) < ǫ. Hence, X −Gǫ ∈ Kd , X −Fǫ ∈ Td , X −Gǫ ⊂ X −E ⊂ X −Fǫ and
(X − Fǫ ) − (X − Gǫ ) = (X − Fǫ ) ∩ Gǫ = Gǫ − Fǫ , whence µ((X − Fǫ ) − (X − Gǫ )) < ǫ.
This shows that X − E ∈ B. Note next that ∅ ∈ B, since ∅ ∈ Kd and ∅ ∈ Td , and
thus condition c is trivially satisfied for ∅. Therefore, to prove both al1 of 6.1.5
and σa1 of 6.1.13 for B it is enough to prove that ∪∞ n=1 En ∈ B whenever {En } is a
sequence in B. Let then {En } be a sequence in B and ǫ > 0. For each n ∈ N there
exist Fn,ǫ ∈ Kd and Gn,ǫ ∈ Td such that Fn,ǫ ⊂ En ⊂ Gn,ǫ and µ(Gn,ǫ − Fn,ǫ) < ǫ/3^n.
Letting S := ∪_{n=1}^∞ Fn,ǫ = ∪_{N=1}^∞ (∪_{n=1}^N Fn,ǫ), from 7.1.4b it follows (since µ is finite)
that we can choose Nǫ so large that

µ(S − ∪_{n=1}^{Nǫ} Fn,ǫ) = µ(S) − µ(∪_{n=1}^{Nǫ} Fn,ǫ) < ǫ/2.

Letting now Fǫ := ∪_{n=1}^{Nǫ} Fn,ǫ and Gǫ := ∪_{n=1}^∞ Gn,ǫ, we have Fǫ ∈ Kd, Gǫ ∈ Td,
Fǫ ⊂ ∪_{n=1}^∞ En ⊂ Gǫ and

µ(Gǫ − Fǫ) = µ(Gǫ − S) + µ(S − Fǫ)
  ≤ µ(∪_{n=1}^∞ (Gn,ǫ − Fn,ǫ)) + µ(S − Fǫ)
  ≤ Σ_{n=1}^∞ µ(Gn,ǫ − Fn,ǫ) + µ(S − Fǫ) < Σ_{n=1}^∞ ǫ/3^n + ǫ/2 = ǫ,
Proof. This result is obtained immediately from 7.4.1, by using either condition a
or condition b for all elements of A(d).
7.4.5 Theorem. Let µ be a finite and tight measure on the Borel σ-algebra A(d).
Then µ is regular.
Proof. Let E be an arbitrary element of A(d). Condition β of 7.4.3 coincides with
condition b of 7.4.1, and therefore we know that it is satisfied since µ is finite. Let
now ǫ > 0. Since µ is finite, from 7.4.1 we know that

∃Fǫ ∈ Kd s.t. Fǫ ⊂ E and µ(E) − µ(Fǫ) < ǫ/2.
Since µ is tight, there is a compact subset Kǫ of X such that µ(X − Kǫ ) < 2ǫ . Define
then Cǫ := Fǫ ∩ Kǫ . We have Cǫ ⊂ E. Also, Cǫ is closed since Kǫ is closed by 2.8.6,
and hence Cǫ is compact by 2.8.8. Since µ is finite, we also have
µ(E) − µ(Cǫ) = µ(E) − µ(Fǫ) + µ(Fǫ) − µ(Cǫ) < ǫ/2 + µ(Fǫ − Cǫ) ≤ ǫ/2 + µ(X − Kǫ) < ǫ,
where we have used the monotonicity of µ and the calculation
Fǫ − Cǫ = Fǫ ∩ ((X − Fǫ ) ∪ (X − Kǫ )) = Fǫ − Kǫ ⊂ X − Kǫ .
Since µ is finite, this shows that condition α is satisfied for E.
7.4.6 Theorem. If the metric space (X, d) is complete and separable, then every
finite measure on the Borel σ-algebra A(d) is tight.
Proof. Let (X, d) be complete and separable, and let µ be a finite measure on
A(d). Choose ǫ > 0. For each n ∈ N, the family of open balls {B(x, 1/n)}_{x∈X} is such
that X = ∪_{x∈X} B(x, 1/n). Then, by 2.3.18, there is a countable family {xn,k}_{k∈In} of
points of X such that X = ∪_{k∈In} B(xn,k, 1/n), and hence such that X = ∪_{k∈In} K(xn,k, 1/n).
We can assume that either In = {1, ..., Nn} or In = N. If In = N, we have
X = ∪_{N=1}^∞ (∪_{k=1}^N K(xn,k, 1/n)) and 7.1.4b implies (since µ is finite) that there exists Nn ∈ N
so large that

µ(X − ∪_{k=1}^{Nn} K(xn,k, 1/n)) = µ(X) − µ(∪_{k=1}^{Nn} K(xn,k, 1/n)) < ǫ/2^n.
Thus, for each n ∈ N, in either case there is a finite family {xn,1, ..., xn,Nn} of points
so that

µ(X − ∪_{k=1}^{Nn} K(xn,k, 1/n)) < ǫ/2^n.

Let then Kǫ := ∩_{n=1}^∞ (∪_{k=1}^{Nn} K(xn,k, 1/n)). The set Kǫ is closed (cf. 2.3.7 and 2.3.2)
and hence the metric subspace (Kǫ, dKǫ) is complete (cf. 2.6.6b). Moreover,

∀n ∈ N, Kǫ ⊂ ∪_{k=1}^{Nn} K(xn,k, 1/n).
Therefore Kǫ is compact by 2.8.5. Moreover, we have

X − Kǫ = ∪_{n=1}^∞ (X − ∪_{k=1}^{Nn} K(xn,k, 1/n))

and this implies, by the σ-subadditivity of µ, that

µ(X − Kǫ) ≤ Σ_{n=1}^∞ µ(X − ∪_{k=1}^{Nn} K(xn,k, 1/n)) < Σ_{n=1}^∞ ǫ/2^n = ǫ.
This shows that µ is tight.
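As a concrete illustration (a toy setting of our own, not from the text; the helper name `tight_radius` is assumed), tightness can be checked by hand for a finite discrete measure on R: all but an arbitrarily small amount of the total mass sits in a compact interval [−M, M].

```python
# Illustrative sketch: a finite discrete measure on R, given by point masses,
# is tight -- for each eps > 0 some compact interval [-M, M] carries all but
# eps of the total mass.

def tight_radius(masses, eps):
    """Return M such that mu(R - [-M, M]) < eps, where mu is the sum of the
    point masses in `masses` (a dict {point: mass} with finite total mass)."""
    total = sum(masses.values())
    # Sort points by distance from the origin and accumulate mass inward-out.
    pts = sorted(masses, key=abs)
    acc = 0.0
    for p in pts:
        acc += masses[p]
        if total - acc < eps:        # the mass outside [-|p|, |p|] is < eps
            return abs(p)
    return abs(pts[-1]) if pts else 0.0

mu = {0.0: 0.5, 1.0: 0.25, -2.0: 0.125, 10.0: 0.125}
M = tight_radius(mu, 0.2)
outside = sum(m for p, m in mu.items() if abs(p) > M)
print(M, outside)   # → 2.0 0.125
```

Here the compact set is just a closed bounded interval; for a general complete separable metric space the proof above builds the compact set as an intersection of finite unions of closed balls instead.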
7.4.7 Corollary. If the metric space (X, d) is complete and separable, then every
finite measure on the Borel σ-algebra A(d) is regular.
Proof. First of all we point out that the statement is consistent because
we have a sequence {ψn } in S + (X, A(d)) such that limn→∞ ψn (x) = ϕ(x) for all
x ∈ X (cf. the proof of 6.2.26).
We have

ψ1 = (1/2) χ_{E2,1},

which can be written as

ψ1 = (1/2) χ_{E1}

if we define E1 := E2,1.
Now fix ε > 0. From 7.4.1 and its proof we have that, for each n ∈ N,

∃Fn ∈ Kd, ∃Gn ∈ Td such that Fn ⊂ En ⊂ Gn and µ(Gn − Fn) < ε/2^n,

and from 2.5.11 we have that

∃ϕn ∈ C(X) such that 0 ≤ ϕn(x) ≤ 1, ∀x ∈ X, and Fn ≺ ϕn ≺ Gn.

Then we define the function

ψ : X → R
x ↦ ψ(x) := Σ_{n=1}^∞ (1/2^n) ϕn(x).
Since

0 ≤ ψ(x) ≤ Σ_{n=1}^∞ 1/2^n = 1, ∀x ∈ X,
we have ψ ∈ FB(X), and since

‖Σ_{n=1}^N (1/2^n) ϕn − ψ‖_∞ ≤ Σ_{n=N+1}^∞ 1/2^n → 0 as N → ∞,
and hence

{x ∈ X : ϕ(x) ≠ ψ(x)} ⊂ ∪_{n=1}^∞ ((X − Fn) ∩ Gn) = ∪_{n=1}^∞ (Gn − Fn),
are elements of M(X, A(d)) ∩ FB (X) such that 0 ≤ ϕi (x) for all x ∈ X, for
i = 1, 2, 3, 4. Therefore, the result of step 2 implies that if we fix ε > 0 then, for
i = 1, 2, 3, 4, there exists ψi ∈ CB (X) such that
µ({x ∈ X : ϕi(x) ≠ ψi(x)}) < ε/4.
Then the function
ψ := ψ1 − ψ2 + iψ3 − iψ4
is an element of CB (X) such that
{x ∈ X : ϕ(x) ≠ ψ(x)} ⊂ ∪_{i=1}^4 {x ∈ X : ϕi(x) ≠ ψi(x)},
Chapter 8
Integration
8.1.1 Proposition. Let n, m ∈ N, let {a1 , ..., an } and {b1 , ..., bm } be families of
elements of [0, ∞), let {E1 , ..., En } and {F1 , ..., Fm } be disjoint (i.e. Ei ∩ Ej = ∅
and Fi ∩ Fj = ∅ if i 6= j) families of elements of A, and suppose that
Σ_{k=1}^n ak χ_{Ek} = Σ_{l=1}^m bl χ_{Fl}.

Then

Σ_{k=1}^n ak µ(Ek) = Σ_{l=1}^m bl µ(Fl).
Proof. We define

a_{n+1} := b_{m+1} := 0, E_{n+1} := X − ∪_{k=1}^n Ek, F_{m+1} := X − ∪_{l=1}^m Fl.

Then

∀l ∈ {1, ..., m + 1}, Fl = ∪_{k=1}^{n+1} (Ek ∩ Fl), whence µ(Fl) = Σ_{k=1}^{n+1} µ(Ek ∩ Fl).
We also notice that
Σ_{k=1}^{n+1} ak χ_{Ek} = Σ_{k=1}^n ak χ_{Ek} = Σ_{l=1}^m bl χ_{Fl} = Σ_{l=1}^{m+1} bl χ_{Fl}.
8.1.2 Definition. Let ψ ∈ S+(X, A) (for S+(X, A), cf. 6.2.25). Then there are
n ∈ N, a family {a1, ..., an} of elements of [0, ∞), and a disjoint family {E1, ..., En}
of elements of A so that ψ = Σ_{k=1}^n ak χ_{Ek}. We define the integral (with respect to
µ) of ψ by

∫_X ψ dµ := Σ_{k=1}^n ak µ(Ek),

and this definition is consistent by 8.1.1 (even though the representation of ψ as a
linear combination of characteristic functions is not unique).
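In a discrete toy setting the definition in 8.1.2 can be computed directly (a sketch of our own; the helper `integral_simple` and the concrete µ are assumptions, not the text's notation): the integral of ψ = Σ_k ak χ_{Ek} is Σ_k ak µ(Ek), and by 8.1.1 the value does not depend on the chosen representation.

```python
# Minimal sketch: the integral of a simple function over a discrete measure.

def integral_simple(coeffs_sets, mu):
    """coeffs_sets: list of (a_k, E_k) with the E_k disjoint subsets of X;
    mu: dict assigning a mass to each point of X (a discrete measure)."""
    def measure(E):
        return sum(mu[x] for x in E)
    return sum(a * measure(E) for a, E in coeffs_sets)

mu = {1: 0.5, 2: 0.5, 3: 2.0}            # a measure on X = {1, 2, 3}
psi = [(3.0, {1, 2}), (1.0, {3})]         # psi = 3*chi_{1,2} + 1*chi_{3}
print(integral_simple(psi, mu))           # → 5.0
```

Splitting {1, 2} into {1} and {2} gives another representation of the same ψ with the same integral, which is exactly what 8.1.1 guarantees.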
8.1.3 Remarks.
(a) For each E ∈ A, we have χE ∈ S+(X, A). Thus, immediately from the defini-
tion in 8.1.2, we have

∫_X χE dµ = µ(E).

Hence in particular ∫_X 0X dµ = µ(∅) = 0 (even if µ(X) = ∞) since 0X = χ∅,
and ∫_X 1X dµ = µ(X) since 1X = χX.
(b) From the definition in 8.1.2 and from 7.1.2a we have that if µ(X) = 0 then
∫_X ψ dµ = 0 for all ψ ∈ S+(X, A).
We have

ψ1 = Σ_{k=1}^{n+1} ak χ_{Ek} and ψ2 = Σ_{l=1}^{m+1} bl χ_{Fl}.
The family {Ek ∩ Fl}_{(k,l)∈I}, with I := {1, ..., n + 1} × {1, ..., m + 1}, is a disjoint
family of elements of A, and from Ek = ∪_{l=1}^{m+1} (Ek ∩ Fl) and Fl = ∪_{k=1}^{n+1} (Ek ∩ Fl)
we have

χ_{Ek} = Σ_{l=1}^{m+1} χ_{Ek∩Fl} for k = 1, ..., n + 1 and
χ_{Fl} = Σ_{k=1}^{n+1} χ_{Ek∩Fl} for l = 1, ..., m + 1,

and hence

ψ1 = Σ_{(k,l)∈I} ak χ_{Ek∩Fl} and ψ2 = Σ_{(k,l)∈I} bl χ_{Ek∩Fl}.
This shows that aψ1 + bψ2 ∈ S+(X, A) (this was already clear from 6.2.24). More-
over (cf. 5.3.3)

∫_X (aψ1 + bψ2) dµ = Σ_{(k,l)∈I} (a ak + b bl) µ(Ek ∩ Fl)
  = a Σ_{k=1}^{n+1} Σ_{l=1}^{m+1} ak µ(Ek ∩ Fl) + b Σ_{l=1}^{m+1} Σ_{k=1}^{n+1} bl µ(Ek ∩ Fl)
  = a Σ_{k=1}^{n+1} ak µ(Ek) + b Σ_{l=1}^{m+1} bl µ(Fl) = a ∫_X ψ1 dµ + b ∫_X ψ2 dµ.
b: Suppose ψ1 ≤ ψ2, i.e.

Σ_{(k,l)∈I} ak χ_{Ek∩Fl}(x) ≤ Σ_{(k,l)∈I} bl χ_{Ek∩Fl}(x), ∀x ∈ X.
Proof. We have ν(∅) = ∫_X 0X dµ = 0 (cf. 8.1.3a). Thus, ν has property af1 of
7.1.1.
Write now ψ = Σ_{k=1}^m ak χ_{Fk}, with ak ∈ [0, ∞) for k = 1, ..., m and {F1, ..., Fm}
a disjoint family of elements of A, and notice that, for every E ∈ A, the equality
χE ψ = Σ_{k=1}^m ak χ_{E∩Fk} shows that χE ψ ∈ S+(X, A) (this was already clear from
6.2.24) and ∫_X χE ψ dµ = Σ_{k=1}^m ak µ(E ∩ Fk). Then, if {En} is a sequence in A such
that Ei ∩ Ej = ∅ whenever i ≠ j, we have (by 5.4.5 and induction applied to 5.4.6)

ν(∪_{n=1}^∞ En) = ∫_X χ_{∪_{n=1}^∞ En} ψ dµ = Σ_{k=1}^m ak µ((∪_{n=1}^∞ En) ∩ Fk)
  = Σ_{k=1}^m ak Σ_{n=1}^∞ µ(En ∩ Fk) = Σ_{n=1}^∞ Σ_{k=1}^m ak µ(En ∩ Fk)
  = Σ_{n=1}^∞ ∫_X χ_{En} ψ dµ = Σ_{n=1}^∞ ν(En).

Thus, ν has property me of 7.1.7.
(b) ∫_X (lim_{n→∞} ϕn) dµ = lim_{n→∞} ∫_X ϕn dµ = sup_{n≥1} ∫_X ϕn dµ.
Proof. a: By 5.2.5 we have that, for each x ∈ X, the sequence {ϕn (x)} is con-
vergent and limn→∞ ϕn (x) = supn≥1 ϕn (x), and hence also that limn→∞ ϕn (x) ∈
[0, ∞]. Thus, limn→∞ ϕn = supn≥1 ϕn , and limn→∞ ϕn ∈ L+ (X, A) follows from
6.2.19b.
b: From 8.1.7 and 5.2.5 it follows that the sequence {∫_X ϕn dµ} is convergent (in
the metric space (R*, δ)) and that

lim_{n→∞} ∫_X ϕn dµ = sup_{n≥1} ∫_X ϕn dµ.
From 6.2.31 and 6.1.26 we have En ∈ A for all n ∈ N. Also, En ⊂ En+1 for all
n ∈ N and X = ∪_{n=1}^∞ En (to see this, notice that if ψ(x) = 0 then x ∈ E1; if
0 < ψ(x), then aψ(x) < ψ(x) ≤ sup_{n≥1} ϕn(x) and hence there exists n ∈ N so that
aψ(x) < ϕn(x)). Then, 8.1.4a, 8.1.5 (since aψ ∈ S+(X, A) by 6.2.24) and 7.1.4b
imply that

a ∫_X ψ dµ = ∫_X aψ dµ = sup_{n≥1} ∫_X χ_{En} aψ dµ.
Thus,

a ∫_X ψ dµ ≤ sup_{n≥1} ∫_X ϕn dµ.

Since this is true for every ψ ∈ S+(X, A) such that ψ ≤ sup_{n≥1} ϕn, we have

∫_X (sup_{n≥1} ϕn) dµ ≤ sup_{n≥1} ∫_X ϕn dµ.
Proof. We have ϕ1 + ϕ2 ∈ L+(X, A) from 6.2.31. By 6.2.26, there are two sequences
{ψn^1} and {ψn^2} in S+(X, A) so that, for i = 1, 2, ψn^i ≤ ψ_{n+1}^i and lim_{n→∞} ψn^i = ϕi.
By 8.1.4a, {ψn^1 + ψn^2} is a sequence in S+(X, A) and

∫_X (ψn^1 + ψn^2) dµ = ∫_X ψn^1 dµ + ∫_X ψn^2 dµ.
Since
8.1.10 Proposition. Let {ϕn} be a sequence in L+(X, A). Then (for Σ_{n=1}^∞ ϕn,
cf. 6.2.32)

Σ_{n=1}^∞ ϕn ∈ L+(X, A) and ∫_X (Σ_{n=1}^∞ ϕn) dµ = Σ_{n=1}^∞ ∫_X ϕn dµ.
Proof. We have Σ_{n=1}^∞ ϕn ∈ L+(X, A) from 6.2.32. Applying induction to 8.1.9 we
have

Σ_{k=1}^n ϕk ∈ L+(X, A) and ∫_X (Σ_{k=1}^n ϕk) dµ = Σ_{k=1}^n ∫_X ϕk dµ, ∀n ∈ N.

Then, since Σ_{k=1}^n ϕk ≤ Σ_{k=1}^{n+1} ϕk for all n ∈ N, by 8.1.8 and 5.4.1 we have

∫_X (Σ_{n=1}^∞ ϕn) dµ = lim_{n→∞} ∫_X (Σ_{k=1}^n ϕk) dµ
  = lim_{n→∞} Σ_{k=1}^n ∫_X ϕk dµ = Σ_{n=1}^∞ ∫_X ϕn dµ.
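For the counting measure this proposition reduces to the familiar fact that a double series with nonnegative terms may be summed in either order. A quick numerical sanity check (a truncated toy setting of our own, not from the text):

```python
# Illustrative check of term-by-term integration of a series of nonnegative
# functions, for the counting measure on X = {0, ..., 49}.

X = range(50)
integral = lambda f: sum(f(x) for x in X)        # counting-measure integral

# phi_n(x) = (1/2)^n * (0.9)^x for n = 1, ..., 29 (nonnegative on all of X)
phis = [lambda x, n=n: (0.5 ** n) * (0.9 ** x) for n in range(1, 30)]

lhs = integral(lambda x: sum(phi(x) for phi in phis))   # integral of the sum
rhs = sum(integral(phi) for phi in phis)                # sum of the integrals
print(abs(lhs - rhs) < 1e-12)   # → True
```

With finitely many terms the two sides agree by plain additivity; the content of 8.1.10 is that (by 8.1.8) the agreement survives the passage to infinitely many terms.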
On the other hand, letting E := ϕ^{-1}((0, ∞]) and En := ϕ^{-1}((1/n, ∞]) for each
n ∈ N, we have (cf. 6.1.26):

E ∈ A, En ∈ A for each n ∈ N, E = ∪_{n=1}^∞ En.

Since (1/n) χ_{En} ≤ ϕ, by 8.1.7 we have

(1/n) µ(En) = ∫_X (1/n) χ_{En} dµ ≤ ∫_X ϕ dµ, ∀n ∈ N.

Thus, if ∫_X ϕ dµ = 0 then µ(En) = 0 for each n ∈ N, and hence (cf. 7.1.4a)
µ(E) = 0. Since

ϕ(x) = 0, ∀x ∈ X − E,

this shows that ∫_X ϕ dµ = 0 implies ϕ(x) = 0 µ-a.e. on X.
b: We have ϕ^{-1}({∞}) ∈ A by 6.1.26. Defining ψn := n χ_{ϕ^{-1}({∞})} for each
n ∈ N, we have

ψn ∈ S+(X, A), ψn ≤ ϕ, ∫_X ψn dµ = n µ(ϕ^{-1}({∞})).
Actually, since the value of ∫_X ϕ dµ is independent of which extension of ϕ is used
in order to define it, this extension need not be specified, unless this is useful for
calculations.
As a matter of convenience, for ϕ ∈ L+(X, A, µ) we will sometimes write

∫_X ϕ(x) dµ(x) := ∫_X ϕ dµ.
Proof. We have

∫_X ϕ dµ := ∫_X ϕ̃ dµ,

and we define

E_{0,n} := ϕ̃^{-1}([n, ∞]),
E_{k,n} := ϕ̃^{-1}([(k − 1)/2^n, k/2^n)) for k = 1, ..., n2^n,
ψn := Σ_{k=1}^{n2^n} ((k − 1)/2^n) χ_{Ek,n} + n χ_{E0,n}.
We recall (cf. the proof of 6.2.26) that {ψn } is a sequence in S + (X, A) so that
8.1.16 Remark. Proposition 8.1.15 shows that we could have defined, for every
ϕ ∈ L+(X, A, µ), the integral of ϕ with respect to µ by

∫_X ϕ dµ := sup_{n≥1} [ Σ_{k=2}^{n2^n} ((k − 1)/2^n) µ(ϕ^{-1}([(k − 1)/2^n, k/2^n))) + n µ(ϕ^{-1}([n, ∞])) ],

without going through 8.1.2, 8.1.6, 8.1.14; this would have been close to Lebesgue's
original way of defining his integral (cf. Shilov and Gurevich, 1966, 6.6). This way
of defining the integral has the merit of showing at the outset why the integral
can be defined for measurable functions only (if ϕ were not measurable with Dϕ
measurable, then the sets ϕ^{-1}([(k − 1)/2^n, k/2^n)) and ϕ^{-1}([n, ∞]) would not be elements
of A for all n's and k's, and the whole formula would be meaningless since the
domain of µ is A). Indeed, the definition in 8.1.6 would not be contradictory if
it were given for all functions with X as domain and [0, ∞] as final set, and only
later does it become clear why the functions must be measurable (e.g., the proof of
additivity given in 8.1.9 requires the measurability of the functions in an essential
way).
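The Lebesgue-style sums above can be computed for a concrete function. The following sketch (a setup of our own: ϕ(x) = x² and Lebesgue measure on [0, 1) approximated by a uniform discrete measure on a grid) shows the sums increasing toward the integral 1/3.

```python
# Illustrative computation of the Lebesgue-style sums of 8.1.16 for
# phi(x) = x**2 on [0, 1), with Lebesgue measure approximated by N grid points.

N = 100_000
grid = [i / N for i in range(N)]
mu_point = 1.0 / N                        # each grid point carries mass 1/N

def lebesgue_sum(phi_vals, n):
    # Sum over levels (k-1)/2^n of the measure of phi^{-1}([(k-1)/2^n, k/2^n)),
    # computed here by quantizing phi downward to the grid of levels.
    step = 2.0 ** -n
    s = 0.0
    for v in phi_vals:
        if v >= n:
            s += n * mu_point             # the n * mu(phi^{-1}([n, inf])) term
        else:
            s += (int(v / step) * step) * mu_point
    return s

phi_vals = [x * x for x in grid]
approx = [lebesgue_sum(phi_vals, n) for n in (1, 2, 4, 8)]
print(approx[-1])   # close to the exact integral 1/3
```

The sums are nondecreasing in n (each dyadic refinement can only raise the quantized values), which is why the definition takes a supremum over n.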
8.1.17 Theorem. Let ϕ, ψ ∈ L+(X, A, µ). Then:
(a) aϕ + bψ ∈ L+(X, A, µ) and ∫_X (aϕ + bψ) dµ = a ∫_X ϕ dµ + b ∫_X ψ dµ, ∀a, b ∈ [0, ∞];
(b) if ϕ(x) ≤ ψ(x) µ-a.e. on Dϕ ∩ Dψ, then ∫_X ϕ dµ ≤ ∫_X ψ dµ;
(c) if ϕ(x) = ψ(x) µ-a.e. on Dϕ ∩ Dψ, then ∫_X ϕ dµ = ∫_X ψ dµ;
(d) ϕψ ∈ L+(X, A, µ).
Proof. a: For a, b ∈ [0, ∞] we have:
D_{aϕ+bψ} = Dϕ ∩ Dψ ∈ A (cf. 6.2.30);
µ(X − Dϕ ∩ Dψ) = µ((X − Dϕ) ∪ (X − Dψ)) = 0 (cf. 7.1.2b);
ϕ_{Dϕ∩Dψ}, ψ_{Dϕ∩Dψ} ∈ L+(Dϕ ∩ Dψ, A_{Dϕ∩Dψ}) (cf. 6.2.3 and 6.1.19b) and hence
aϕ + bψ = aϕ_{Dϕ∩Dψ} + bψ_{Dϕ∩Dψ} ∈ L+(Dϕ ∩ Dψ, A_{Dϕ∩Dψ}) (cf. 6.2.31).
This proves that aϕ + bψ ∈ L+(X, A, µ).
Further, if ϕ̃, ψ̃ ∈ L+(X, A) are extensions of ϕ, ψ respectively (cf. 8.1.14), then
aϕ̃ + bψ̃ ∈ L+(X, A) and aϕ̃ + bψ̃ is an extension of aϕ + bψ. Then,

∫_X (aϕ + bψ) dµ = ∫_X (aϕ̃ + bψ̃) dµ = a ∫_X ϕ̃ dµ + b ∫_X ψ̃ dµ = a ∫_X ϕ dµ + b ∫_X ψ dµ,
where 8.1.9 and 8.1.13 have been used.
b: Let E ∈ A be so that
µ(E) = 0 and ϕ(x) ≤ ψ(x) for all x ∈ Dϕ ∩ Dψ ∩ (X − E).
If ϕ̃, ψ̃ ∈ L+(X, A) are extensions of ϕ, ψ respectively, then we have
ϕ̃(x) ≤ ψ̃(x) for all x ∈ Dϕ ∩ Dψ ∩ (X − E) = X − ((X − Dϕ ) ∪ (X − Dψ ) ∪ E).
Since (X − Dϕ ) ∪ (X − Dψ ) ∪ E ∈ A and µ((X − Dϕ ) ∪ (X − Dψ ) ∪ E) = 0 (cf.
7.1.2b), we have
ϕ̃(x) ≤ ψ̃(x) µ-a.e. on X,
(a) there exists ϕ̃ ∈ L+(X, A) such that ϕ̃(x) = lim_{n→∞} ϕn(x) µ-a.e. on ∩_{n=1}^∞ Dϕn;
(b) if ϕ ∈ L+(X, A, µ) and ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn),
then

∫_X ϕ dµ = lim_{n→∞} ∫_X ϕn dµ = sup_{n≥1} ∫_X ϕn dµ.
which is an element of L+(X, A) since, for every S ⊂ [0, ∞], ϕ̃n^{-1}(S) is either
ϕn^{-1}(S) ∩ (X − E) or (ϕn^{-1}(S) ∩ (X − E)) ∪ E, and A_{Dϕn} ⊂ A. We have ϕ̃n ≤ ϕ̃n+1
since X − E ⊂ Dϕn ∩ Dϕn+1 ∩ (X − En). Hence, by 8.1.8a, the sequence {ϕ̃n(x)}
is convergent for all x ∈ X and lim_{n→∞} ϕ̃n ∈ L+(X, A). Letting ϕ̃ := lim_{n→∞} ϕ̃n,
we also have

ϕ̃(x) := lim_{n→∞} ϕ̃n(x) = lim_{n→∞} ϕn(x), ∀x ∈ X − E = (∩_{n=1}^∞ Dϕn) ∩ (X − E),

and hence

ϕ̃(x) = lim_{n→∞} ϕn(x) µ-a.e. on ∩_{n=1}^∞ Dϕn.
b: Let ϕ̃n and ϕ̃ denote the same functions as in the proof of part a. By 8.1.8b
we have

∫_X ϕ̃ dµ = lim_{n→∞} ∫_X ϕ̃n dµ = sup_{n≥1} ∫_X ϕ̃n dµ. (1)

Similarly, if ϕ ∈ L+(X, A, µ) and ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn),
and if F ∈ A is so that

µ(F) = 0 and ϕ(x) = lim_{n→∞} ϕn(x) for all x ∈ Dϕ ∩ (∩_{n=1}^∞ Dϕn) ∩ (X − F),

then

ϕ(x) = ϕ̃(x) for all x ∈ Dϕ ∩ (∩_{n=1}^∞ Dϕn) ∩ (X − F) ∩ (X − E) = Dϕ ∩ (X − (F ∪ E)).
8.1.20 Lemma (Fatou's lemma). Let {ϕn} be a sequence in L+(X, A, µ) and let
ϕ ∈ L+(X, A, µ). Suppose that

ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn)

and that there exists M ∈ [0, ∞) such that ∫_X ϕn dµ ≤ M for all n ∈ N. Then,

∫_X ϕ dµ ≤ M.
and hence (cf. 6.2.18) ϕ̃ = sup_{n≥1} (inf_{k≥n} ϕ̃k). Now, for each n ∈ N,

inf_{k≥n} ϕ̃k ∈ L+(X, A) (cf. 6.2.19a) and inf_{k≥n} ϕ̃k ≤ inf_{k≥n+1} ϕ̃k.

Moreover, for each n ∈ N we have inf_{k≥n} ϕ̃k ≤ ϕ̃n and hence (cf. 8.1.7)

∫_X (inf_{k≥n} ϕ̃k) dµ ≤ ∫_X ϕ̃n dµ = ∫_X ϕn dµ ≤ M.

Then we have

∫_X ϕ dµ = ∫_X ϕ̃ dµ ≤ M.
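The inequality in Fatou's lemma can be strict; the classic "moving bump" makes this concrete (a toy discrete illustration of our own, with the counting measure on a finite set standing in for N):

```python
# Illustrative "moving bump": phi_n = chi_{n} has integral 1 for every n in the
# window, yet its pointwise limit is the zero function.

X = range(100)
integral = lambda f: sum(f(x) for x in X)     # counting-measure integral

def phi(n):
    return lambda x: 1.0 if x == n else 0.0

M = max(integral(phi(n)) for n in range(100))   # every ∫ phi_n dµ equals 1
# pointwise limit of phi_n(x) as n -> infinity: the bump leaves any fixed x
limit = lambda x: 0.0
print(integral(limit), "<=", M)   # → 0.0 <= 1.0
```

So the bound ∫ ϕ dµ ≤ M of the lemma holds here with ∫ ϕ dµ = 0 strictly below M = 1: the mass of the bumps "escapes" in the limit.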
8.2.4 Proposition. For ϕ ∈ M(X, A, µ), the following conditions are equivalent:
(a) ϕ ∈ L1(X, A, µ);
(b) ∫_X |ϕ| dµ < ∞.
Proof. We have
(Re ϕ)± (x) = (Re ψ)± (x) and (Im ϕ)± (x) = (Im ψ)± (x) µ-a.e. on Dϕ ∩ Dψ .
The result then follows from 8.1.17c.
Proof. Use 8.1.18b (or else, notice that if µ(X) = 0 then ϕ(x) = 0 µ-a.e. for each
ϕ ∈ M(X, A, µ), and use 8.2.7).
Let now ϕ ∈ L1 (X, A, µ). If α ≥ 0, then 2 follows from 8.1.17a and from the
equalities
(Re(αϕ))± = α(Re ϕ)± and (Im(αϕ))± = α(Im ϕ)± .
If α = −1, then 2 follows from the equalities
(Re(−ϕ))± = (Re ϕ)∓ and (Im(−ϕ))± = (Im ϕ)∓ .
If α = i, then 2 follows from the equalities
(Re(iϕ))± = (Im ϕ)∓ and (Im(iϕ))± = (Re ϕ)± .
Combining these cases with 1, we obtain 2 for any α ∈ C.
where the second inequality holds by 8.1.17b and the last equality holds since |α| =
1.
(c) if ϕ ∈ M(X, A, µ) is s.t. ϕ(x) = lim_{n→∞} ϕn(x) µ-a.e. on Dϕ ∩ (∩_{n=1}^∞ Dϕn),
then ϕ ∈ L1(X, A, µ), lim_{n→∞} ∫_X |ϕn − ϕ| dµ = 0, and ∫_X ϕ dµ = lim_{n→∞} ∫_X ϕn dµ.
Proof. a: For each n ∈ N, we note that |ϕn (x)| ≤ ψ(x) entails ψ(x) ∈ [0, ∞). Thus
we have |ϕn (x)| ≤ |ψ(x)| µ-a.e. on Dϕn ∩ Dψ , and this implies ϕn ∈ L1 (X, A, µ)
by 8.2.5.
b: Let E ∈ A be so that

µ(E) = 0 and {ϕn(x)} is convergent for all x ∈ (∩_{n=1}^∞ Dϕn) ∩ (X − E).
Letting S := (∩_{n=1}^∞ Dϕn) ∩ (X − E), we have S ∈ A. We define the function

ϕ : S → C
x ↦ ϕ(x) := lim_{n→∞} ϕn(x).

Since (ϕn)S ∈ M(S, AS) (cf. 6.2.3 and 6.1.19b), we have ϕ ∈ M(S, AS) by 6.2.20c,
and hence ϕ ∈ M(X, A, µ) since µ(X − S) = µ((∪_{n=1}^∞ (X − Dϕn)) ∪ E) = 0 (cf.
7.1.4a). From the definition of ϕ we have

ϕ(x) = lim_{n→∞} ϕn(x), ∀x ∈ Dϕ = Dϕ ∩ (∩_{n=1}^∞ Dϕn).
We have
|ϕ(x)| ≤ ψ(x), ∀x ∈ X − H = Dϕ ∩ Dψ ∩ (X − H),
and hence ϕ ∈ L1 (X, A, µ) by 8.2.5.
Now we define the functions:
ϕ̃n := (ϕn )X−H for each n ∈ N, ψ̃ := ψX−H , ϕ̃ := ϕX−H .
These functions are elements of M(X − H, AX−H ) by 6.2.3 and 6.1.19b, and hence
elements of M(X, A, µ) since µ(H) = 0. Moreover, ψ̃ ∈ L1 (X, A, µ) by 8.2.7.
For each n ∈ N, we define the function

ψ̃n : X − H → [0, ∞]
x ↦ ψ̃n(x) := sup_{k≥n} |ϕ̃k(x) − ϕ̃(x)|
(in this proof we characterize with a tilde the functions whose domain is X −H). By
6.2.16, 6.2.17, 6.2.19a we have ψ̃n ∈ M(X −H, AX−H ) and hence ψ̃n ∈ M(X, A, µ).
From
|ϕ̃k (x)| ≤ ψ̃(x) for each k ∈ N and |ϕ̃(x)| ≤ ψ̃(x), ∀x ∈ X − H,
we have
ψ̃n (x) ≤ 2ψ̃(x), ∀x ∈ X − H,
and hence ψ̃n ∈ L1 (X, A, µ) (cf. 8.2.5) and 2ψ̃ − ψ̃n ∈ L+ (X, A, µ). We also have
2ψ̃ − ψ̃n ≤ 2ψ̃ − ψ̃n+1 since obviously ψ̃n+1 ≤ ψ̃n . Furthermore, by 5.2.6 we have
lim ψ̃n (x) = lim |ϕ̃n (x) − ϕ̃(x)|
n→∞ n→∞
= lim |ϕn (x) − ϕ(x)| = 0, ∀x ∈ X − H,
n→∞
and hence
2ψ̃(x) = lim (2ψ̃(x) − ψ̃n (x)), ∀x ∈ X − H.
n→∞
and hence

lim_{n→∞} ∫_X ψ̃n dµ = 0. (1)
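The dominated convergence theorem can be watched in action numerically (an illustrative discrete setup of our own): ϕn(x) = x/(x + n) tends to 0 pointwise and is dominated by the integrable constant function 1, so the integrals must tend to 0 as well.

```python
# Illustrative check of dominated convergence for the discrete measure
# mu({x}) = 2**(-x) on X = {1, ..., 199} (a truncation of N).

X = range(1, 200)
mu = {x: 2.0 ** -x for x in X}
integral = lambda f: sum(f(x) * mu[x] for x in X)

# phi_n(x) = x / (x + n): decreases to 0 pointwise, dominated by psi = 1
vals = [integral(lambda x: x / (x + n)) for n in (1, 10, 100, 1000)]
print(vals)   # strictly decreasing toward ∫ (lim phi_n) dµ = 0
```

Without the dominating function the conclusion can fail (compare the moving bump used above to show that Fatou's inequality can be strict).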
µ : C × M(X, A, µ) → M(X, A, µ)
(α, [ϕ]) ↦ µ(α, [ϕ]) := α[ϕ] := [αϕ];
Proof. The only thing to prove is that the mappings σ, µ, π can indeed be defined
as they were in the statement, while it is immediate to check all the rest.
We already know that, if ϕ, ψ ∈ M(X, A, µ), then ϕ + ψ ∈ M(X, A, µ) (cf.
8.2.2). Suppose now that ϕ, ϕ′ , ψ, ψ ′ ∈ M(X, A, µ) are so that ϕ′ ∼ ϕ, ψ ′ ∼ ψ, and
let:
E ∈ A be so that µ(E) = 0 and ϕ′ (x) = ϕ(x), ∀x ∈ Dϕ′ ∩ Dϕ ∩ (X − E);
F ∈ A be so that µ(F ) = 0 and ψ ′ (x) = ψ(x), ∀x ∈ Dψ′ ∩ Dψ ∩ (X − F ).
Then
ϕ′ (x) + ψ ′ (x) = ϕ(x) + ψ(x),
∀x ∈ Dϕ′ ∩ Dϕ ∩ (X − E) ∩ Dψ′ ∩ Dψ ∩ (X − F )
= Dϕ′ +ψ′ ∩ Dϕ+ψ ∩ (X − (E ∪ F )),
which proves that ϕ′ + ψ ′ ∼ ϕ + ψ. This shows that the equivalence class [ϕ + ψ]
does not depend on the particular elements ϕ and ψ (of the classes [ϕ] and [ψ])
through which it has been defined. Hence, the rule which assigns [ϕ + ψ] to a
pair ([ϕ], [ψ]) ∈ M (X, A, µ) × M (X, A, µ) does assign one and only one element of
M (X, A, µ) to ([ϕ], [ψ]).
The arguments for µ and for π are analogous.
8.2.15 Theorem. The following definition, of the set L1(X, A, µ), is consistent:

L1(X, A, µ) := {[ϕ] ∈ M(X, A, µ) : ∫_X |ϕ| dµ < ∞}.

Then, L1(X, A, µ) is a linear manifold in the linear space M(X, A, µ).
The following definition, of the function ν, is consistent:

ν : L1(X, A, µ) → R
[ϕ] ↦ ν([ϕ]) := ‖[ϕ]‖_{L1} := ∫_X |ϕ| dµ.
Proof. To prove that L1(X, A, µ) can indeed be defined as it was in the statement,
we must show that the implication

[ϕ′, ϕ ∈ M(X, A, µ), ϕ′ ∼ ϕ, ∫_X |ϕ| dµ < ∞] ⇒ ∫_X |ϕ′| dµ < ∞ (∗)

holds true, because then the condition ∫_X |ϕ| dµ < ∞ is actually a condition for the
equivalence class [ϕ] even though it is expressed through a particular element of it.
Now, (∗) is true by 8.1.17c.
Similar arguments, based on 8.1.17c and 8.2.7, show that ν and I can be defined
as they were in the statement.
Since, for ϕ ∈ M(X, A, µ), ∫_X |ϕ| dµ < ∞ is equivalent to ϕ ∈ L1(X, A, µ) (cf.
8.2.4), 8.2.9 proves that L1(X, A, µ) is a linear manifold in M(X, A, µ) and that I
is a linear functional.
To prove that ν is a norm, we notice that:

∀[ϕ], [ψ] ∈ L1(X, A, µ), ‖[ϕ] + [ψ]‖_{L1} = ∫_X |ϕ + ψ| dµ ≤ ∫_X (|ϕ| + |ψ|) dµ
8.3.4 Proposition. Let (X, A, µ) be a measure space, let ρ ∈ L+(X, A, µ), and
define the function

ν : A → [0, ∞]
E ↦ ν(E) := ∫_E ρ dµ.
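In a finite discrete setting (toy data of our own) the set function ν defined by a density ρ can be verified directly: it inherits additivity from the integral, and ν(E) = 0 whenever µ(E) = 0.

```python
# Minimal sketch of 8.3.4: nu(E) = ∫_E rho dµ, in a discrete setting.

mu = {1: 1.0, 2: 0.5, 3: 0.25}           # a measure on X = {1, 2, 3}
rho = {1: 0.0, 2: 4.0, 3: 8.0}           # a nonnegative density

def nu(E):
    return sum(rho[x] * mu[x] for x in E)

# additivity on disjoint sets, inherited from the integral
assert nu({1, 2, 3}) == nu({1}) + nu({2}) + nu({3})
print(nu({2, 3}))   # → 4.0
```

Note also the absolute continuity visible here: a µ-null set E necessarily satisfies ν(E) = 0, which is exactly what the proof below establishes in general.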
Thus, ν is a measure on A.
If E ∈ A and µ(E) = 0, then χE(x)ρ̃(x) = 0 µ-a.e. on X and hence by 8.1.12a

ν(E) = ∫_X χE ρ̃ dµ = 0.

Assume now ρ(x) < ∞ µ-a.e. on Dρ. Then, if E ∈ A is so that

µ(E) = 0 and ρ(x) < ∞ for all x ∈ Dρ ∩ (X − E),

we have ρ̃^{-1}({∞}) ⊂ X − (Dρ ∩ (X − E)) = (X − Dρ) ∪ E; thus ρ̃^{-1}({∞}) is an
c: Suppose ρ(x) < ∞ for all x ∈ Dρ ; then ρ ∈ M(X, A, µ). If ψ ∈ M(X, A, µ),
then ψρ ∈ M(X, A, µ) by 8.2.2 and ψ ∈ M(X, A, ν) since µ(X − Dψ ) = 0 implies
ν(X − Dψ ) = 0. The rest of the statement about ψ follows from the definitions
given in 8.2.3 and from the results proved in part b.
Proof. a: We have:
µ(∅) = Σ_{k=1}^∞ ak µk(∅) = 0;
for a sequence {En} in A such that Ei ∩ Ej = ∅ if i ≠ j, by 5.4.5 and 5.4.7,

µ(∪_{n=1}^∞ En) = Σ_{k=1}^∞ ak µk(∪_{n=1}^∞ En) = Σ_{k=1}^∞ ak Σ_{n=1}^∞ µk(En)
  = Σ_{n=1}^∞ Σ_{k=1}^∞ ak µk(En) = Σ_{n=1}^∞ µ(En).
Thus, µ is a measure on A.
Now notice that, for E ∈ A, µ(E) = 0 iff µk(E) = 0 for all k ∈ J. This proves
that

L+(X, A, µ) = ∩_{k∈J} L+(X, A, µk) and M(X, A, µ) = ∩_{k∈J} M(X, A, µk).
For each E ∈ A we have

∫_X χE dµ = µ(E) = Σ_{k=1}^∞ ak µk(E) = Σ_{k=1}^∞ ak ∫_X χE dµk.
Hence, for τ ∈ S+(X, A), letting τ = Σ_{n=1}^N bn χ_{En} with {E1, ..., EN} a disjoint
family of elements of A and bn ∈ [0, ∞) for n = 1, ..., N, we have (cf. 5.4.5, 5.4.6,
5.3.3)

∫_X τ dµ = Σ_{n=1}^N bn ∫_X χ_{En} dµ = Σ_{n=1}^N bn Σ_{k=1}^∞ ak ∫_X χ_{En} dµk
  = Σ_{k=1}^∞ ak Σ_{n=1}^N bn ∫_X χ_{En} dµk = Σ_{k=1}^∞ ak ∫_X τ dµk.
For ϕ ∈ L+(X, A, µ), let ϕ̃ ∈ L+(X, A) be an extension of ϕ (cf. 8.1.14) and let
{τn} be a sequence in S+(X, A) so that (cf. 6.2.26)

τn ≤ τn+1 for all n ∈ N and ϕ̃ = lim_{n→∞} τn

(the function lim_{n→∞} τn is defined as in 6.2.18). Then we have, by 8.1.7 and 5.3.2b,

ak ∫_X τn dµk ≤ ak ∫_X τn+1 dµk, ∀(n, k) ∈ N × N,
The part of the statement about ψ ∈ M(X, A, µ) follows easily from what has just
been proved for ϕ ∈ L+ (X, A, µ), from 8.2.4, and from the definitions given in 8.2.3.
b: In part a of the statement, assume µ1 := µ, µ2 := ν, a1 := a, a2 := b, and,
for k > 2, ak any positive number and µk the null measure on A. Then everything
asserted in part b follows at once from part a.
Proof. By a straightforward check, we see that µx0 has properties af1 of 7.1.1 and
me of 7.1.7.
If ϕ ∈ L+(X, A, µx0), then x0 ∈ Dϕ (otherwise, µx0(X − Dϕ) = 1) and

ϕ(x) = ϕ(x0) µx0-a.e.

(since µx0(X − {x0}) = 0). Then (cf. 8.1.17c and 8.1.3a)

∫_X ϕ dµx0 = ∫_X ϕ(x0) dµx0 = ϕ(x0) µx0(X) = ϕ(x0).

The part of the statement about M(X, A, µx0) follows from this and from the
definitions given in 8.2.3.
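A direct check of this computation in a toy discrete setting (names and data ours): integrating against the Dirac measure at x0 simply evaluates the integrand at x0.

```python
# Minimal sketch: the Dirac measure at x0 on a finite X, and its integral.

x0 = 2

def dirac(E):                        # mu_{x0}(E) = 1 if x0 in E, else 0
    return 1.0 if x0 in E else 0.0

def integral_dirac(phi, X):
    # for a function on a finite X, ∫ phi dmu_{x0} = sum_x phi(x) mu_{x0}({x})
    return sum(phi(x) * dirac({x}) for x in X)

phi = lambda x: x ** 2 + 1
print(integral_dirac(phi, range(10)))   # → 5.0, i.e. phi(2)
```

Every term of the sum vanishes except the one at x0, which is the discrete shadow of the a.e. argument in the proof above.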
8.3.7 Proposition. Let µ be a non-null measure on A(dR ) and suppose that µ(E)
is either 0 or 1 for each E ∈ A(dR ). Then there exists x0 ∈ R so that µ is the Dirac
measure in x0 .
Proof. Since R = ∪_{n∈Z} [n, n + 1], the σ-subadditivity of µ (cf. 7.1.4a) implies that

∃n ∈ Z such that µ([n, n + 1]) = 1.

Then the family

X := {[a, b] : a, b ∈ R, n ≤ a ≤ b ≤ n + 1, µ([a, b]) = 1}

is non-empty because [n, n + 1] ∈ X. The subadditivity of µ (cf. 7.1.2b) implies
that

∀[a, b] ∈ X, µ([a, (a + b)/2]) = 0 ⇒ µ([(a + b)/2, b]) = 1.

Thus, we can define the mapping

ϕ : X → X
[a, b] ↦ ϕ([a, b]) := [a, (a + b)/2] if µ([a, (a + b)/2]) = 1, and
ϕ([a, b]) := [(a + b)/2, b] if µ([a, (a + b)/2]) = 0.
Next, we define a sequence {[an , bn ]} by letting:
[a1 , b1 ] := [n, n + 1],
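The halving construction started here can be simulated numerically (a hypothetical oracle of our own; the {0,1}-valued measure is secretly the Dirac measure at a point x0): each step keeps a half of measure 1, and the nested intervals close down on the atom.

```python
# Illustrative bisection for a {0,1}-valued measure on [0, 1].

x0 = 0.3711                            # the atom, unknown to the bisection

def mu(a, b):                          # oracle answering mu([a, b])
    return 1 if a <= x0 <= b else 0

a, b = 0.0, 1.0                        # [a1, b1], with mu([a1, b1]) = 1
for _ in range(50):
    m = (a + b) / 2
    # by subadditivity at least one half has measure 1; keep that half
    a, b = (a, m) if mu(a, m) == 1 else (m, b)
print(a, b)   # both within 2**-50 of x0
```

The invariant µ([a, b]) = 1 is preserved at every step, so the common point of the nested intervals must carry all the mass, which is what the proof goes on to establish.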
8.3.9 Remark. For the measure µ defined in 8.3.8 we have µ(X − {xn }n∈I ) = 0.
Conversely, suppose that we have a measure space (X, A, µ) such that there
exists a family {xn }n∈I , with I := {1, ..., N } or I := N, of points of X so that the
singleton set {xn } is an element of A for each n ∈ I and µ(X − {xn }n∈I ) = 0.
Then, for each E ∈ A, we have

µ(E) = µ(E ∩ {xn}_{n∈I}) = Σ_{n∈IE} µ({xn})
if we define IE := {n ∈ I : xn ∈ E}. Thus, µ turns out to be the measure defined
in 8.3.8, with an := µ({xn }) for each n ∈ I.
The measures of this kind, i.e. the ones that can be constructed as in 8.3.8, are
said to be discrete.
8.3.10 Remarks.
(a) In 8.3.8, let X := N, A := P(N) (cf. 6.1.15), I := N, xn := n and an := 1
for each n ∈ N. Then the measure µ is called the counting measure on N
(since, for E ⊂ N, µ(E) is the number of the points that are contained in E),
L+ (X, A, µ) is the family of all sequences in [0, ∞] and M(X, A, µ) is the family
of all sequences in C. For a sequence ϕ := {yn} in [0, ∞] we have

∫_X ϕ dµ = Σ_{n=1}^∞ yn,
and for a sequence ψ := {zn } in C we have:
ψ ∈ L1(X, A, µ) iff Σ_{n=1}^∞ |zn| < ∞;
if ψ ∈ L1(X, A, µ) then ∫_X ψ dµ = Σ_{n=1}^∞ zn.
Thus, all the results about integrals of Section 8.1 and 8.2 have corollaries which
are results about series.
(b) In 8.3.8, let X := {1, ..., N }, A := P({1, ..., N }), I := {1, ..., N }, xn := n and
an := 1 for each n ∈ {1, ..., N }. Then the measure µ is called the counting
measure on {1, ..., N }, the equalities L1 (X, A, µ) = M(X, A, µ) = CN hold
true, and for an N-tuple ψ := (z1, ..., zN) ∈ CN we have

∫_X ψ dµ = Σ_{n=1}^N zn.
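Remark (a) above reduces integrals to series. As a truncated numerical check (our own): the geometric sequence ψ(n) = (−1/2)^n is absolutely summable, hence integrable for the counting measure, and its integral is the sum of the series, −1/3.

```python
# Illustrative check: for the counting measure on N (truncated here),
# psi in L1 iff sum |z_n| < infinity, and the integral is the sum of the series.

N_TRUNC = 60                                    # truncation of N for the demo
psi = [(-0.5) ** n for n in range(1, N_TRUNC)]

abs_sum = sum(abs(z) for z in psi)              # ∫ |psi| dµ = Σ |z_n|
integral = sum(psi)                             # ∫ psi dµ = Σ z_n
print(abs_sum, integral)   # ≈ 1.0 and ≈ -1/3
```

In this way the convergence theorems of Sections 8.1 and 8.2 (monotone and dominated convergence, Fatou) specialize to classical statements about rearranging and passing to the limit in series.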
µ2 : A2 → [0, ∞]
E ↦ µ2(E) := µ1(π^{-1}(E))

is a measure on A2.
(b) For ϕ ∈ L+(X2, A2, µ2) we have:
ϕ ∘ π ∈ L+(X1, A1, µ1);
∫_{X2} ϕ dµ2 = ∫_{X1} (ϕ ∘ π) dµ1 and ∫_E ϕ dµ2 = ∫_{π^{-1}(E)} (ϕ ∘ π) dµ1, ∀E ∈ A2.
(c) For ψ ∈ M(X2, A2, µ2) we have:
ψ ∘ π ∈ M(X1, A1, µ1);
ψ ∈ L1(X2, A2, µ2) iff ψ ∘ π ∈ L1(X1, A1, µ1);
if ψ ∈ L1(X2, A2, µ2) then ∫_{X2} ψ dµ2 = ∫_{X1} (ψ ∘ π) dµ1 and ∫_E ψ dµ2 =
∫_{π^{-1}(E)} (ψ ∘ π) dµ1, ∀E ∈ A2.
Thus, µ2 is a measure on A2 .
w.r.t. A2 and A(δ) (cf. 6.2.6). Thus, ϕ ∘ π ∈ L+(X1, A1, µ1).
Then, by 8.1.15 and 1.2.13Af we have

∫_{X2} ϕ dµ2 = sup_{n≥1} [ Σ_{k=2}^{n2^n} ((k − 1)/2^n) µ2(ϕ^{-1}([(k − 1)/2^n, k/2^n))) + n µ2(ϕ^{-1}([n, ∞])) ]
  = sup_{n≥1} [ Σ_{k=2}^{n2^n} ((k − 1)/2^n) µ1((ϕ ∘ π)^{-1}([(k − 1)/2^n, k/2^n))) + n µ1((ϕ ∘ π)^{-1}([n, ∞])) ]
  = ∫_{X1} (ϕ ∘ π) dµ1.
since (χE ◦ π)(x) = χπ−1 (E) (x) for each x ∈ Dπ (cf. 1.2.13Ag).
c: We have ψ ◦ π ∈ M(X1 , A1 , µ1 ) for each ψ ∈ M(X2 , A2 , µ2 ) by the first result
of part b, in view of 6.2.12 and 6.2.20b. The rest of the statement follows easily
from what was proved in part b and from the definitions given in 8.2.3.
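A discrete sketch (data and names ours) of the change-of-variables identity just proved: integrating ψ against the image measure µ2 equals integrating ψ ∘ π against µ1.

```python
# Illustrative image (pushforward) measure: mu2(E) := mu1(pi^{-1}(E)), with
# ∫_{X2} psi dmu2 = ∫_{X1} (psi ∘ pi) dmu1.

X1 = [0, 1, 2, 3]
mu1 = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
pi = lambda x: x % 2                    # maps X1 onto X2 = {0, 1}

def mu2(E):                             # image measure on X2
    return sum(mu1[x] for x in X1 if pi(x) in E)

psi = lambda y: 10.0 if y == 0 else 1.0

lhs = sum(psi(y) * mu2({y}) for y in (0, 1))     # ∫ psi dmu2
rhs = sum(psi(pi(x)) * mu1[x] for x in X1)       # ∫ (psi ∘ pi) dmu1
print(lhs, rhs)   # equal (≈ 4.6)
```

Both sides just regroup the same weighted values, which is the finite shadow of the sup-of-Lebesgue-sums argument above.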
The subject of this section is integration of functions defined on the cartesian prod-
uct of two σ-finite measure spaces. Actually, this section could be part of the
preceding one, since it deals with integration with respect to measures which are
constructed out of previously given measures. However, since the treatment of the
subject goes through several steps, it is perhaps better to deal with it in a separate
section.
We start with two set-theoretical concepts, which we define in 8.4.1 and examine
in 8.4.2.
This implies that E is a σ-algebra, and hence that A(S2) ⊂ E (cf. gσ2 of 6.1.17).
Since A1 ⊗ A2 = A(S2) (cf. 6.1.30a), this proves the statement.
b: Let x1 ∈ X1. Then, for every subset F of Y,

(ϕx1)^{-1}(F) = {x2 ∈ X2 : ϕ(x1, x2) ∈ F}
= {x2 ∈ X2 : (x1, x2) ∈ ϕ^{-1}(F)} = (ϕ^{-1}(F))x1.

Thus, if F ∈ B then ϕ^{-1}(F) ∈ A1 ⊗ A2, and hence (in view of the result proved in
part a) (ϕx1)^{-1}(F) ∈ A2. This proves that ϕx1 is measurable w.r.t. A2 and B.
The proof for ϕ^{x2}, for every x2 ∈ X2, is analogous.
S2 := {E1 × E2 : Ek ∈ Ak for k = 1, 2}
ν : S2 → [0, ∞]
E1 × E2 7→ ν(E1 × E2 ) := µ1 (E1 )µ2 (E2 )
has obviously property a of 7.3.1. We will prove that it has properties b and c
as well. Indeed, suppose that {E1,n × E2,n}_{n∈I} is a disjoint family of elements of
S2 such that I = {1, ..., N} or I = N, and such that ∪_{n∈I} (E1,n × E2,n) ∈ S2,
i.e. ∪_{n∈I} (E1,n × E2,n) = E1 × E2 with E1 ∈ A1 and E2 ∈ A2. Then, for each
(x1, x2) ∈ X1 × X2,
Thus, ν has properties a, b, c of 7.3.1, and hence there exists a unique additive
function µ0 on the algebra A0 (S2 ) which is an extension of ν, and µ0 is a premeasure.
Then, by 7.3.2 there exists a measure µ on A(A0 (S2 )) which is an extension of µ0 .
Since A(S2 ) = A(A0 (S2 )) (cf. 6.1.18), this proves that there exists a measure µ on
A1 ⊗ A2 which is an extension of ν.
If µ̃ is another measure which is an extension of ν, then the restriction of µ̃ to
A0 (S2 ) must coincide with µ0 on account of the uniqueness asserted in 7.3.1A, and
then µ̃ must coincide with µ on account of the uniqueness asserted in 7.3.2, since
µ0 is σ-finite. Indeed, for k = 1, 2 there exists a countable family {Fk,n}_{n∈Ik} of
elements of Ak so that µk(Fk,n) < ∞ for all n ∈ Ik and Xk = ∪_{n∈Ik} Fk,n. Then
µ0(F1,n × F2,m) = µ1(F1,n) µ2(F2,m) < ∞ for all (n, m) ∈ I1 × I2 and X1 × X2 =
∪_{(n,m)∈I1×I2} (F1,n × F2,m), so µ0 is σ-finite. Obviously, this also proves that µ is
σ-finite.
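On finite spaces the product measure is simply the pointwise product of weights (a toy sketch of our own), and the defining property ν(E1 × E2) = µ1(E1)µ2(E2) can be checked directly.

```python
# Illustrative product measure on the product of two finite measure spaces.

mu1 = {'a': 1.0, 'b': 2.0}
mu2 = {0: 0.5, 1: 0.5, 2: 1.0}

# product measure on X1 x X2, defined on singletons by multiplying the weights
prod = {(x1, x2): mu1[x1] * mu2[x2] for x1 in mu1 for x2 in mu2}

def measure(m, E):
    return sum(m[p] for p in E)

E1, E2 = {'b'}, {0, 2}
rect = {(x1, x2) for x1 in E1 for x2 in E2}
print(measure(prod, rect))   # → 3.0, i.e. mu1({'b'}) * mu2({0, 2}) = 2.0 * 1.5
```

The uniqueness argument above says that, in the σ-finite case, multiplying weights on rectangles pins down the product measure on the whole product σ-algebra.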
µ1 ⊗ · · · ⊗ µN := ((· · · ((µ1 ⊗ µ2 ) ⊗ µ3 ) ⊗ · · · ) ⊗ µN −1 ) ⊗ µN .
Then
µ1 ⊗ · · · ⊗ µN
= (· · · (((µ1 ⊗ · · · ⊗ µi1 ) ⊗ (µi1 +1 ⊗ · · · ⊗ µi2 )) ⊗ (µi2 +1 ⊗ · · · ⊗ µi3 )) ⊗ · · · )
⊗(µir +1 ⊗ · · · ⊗ µN ).
Proof. We recall that the measure (µ1 ⊗ · · · ⊗ µN)Y1×···×YN is defined on the σ-
algebra (A1 ⊗ · · · ⊗ AN)Y1×···×YN and that the measure (µ1)Y1 ⊗ · · · ⊗ (µN)YN is
defined on the σ-algebra A1^{Y1} ⊗ · · · ⊗ AN^{YN}. Moreover, from 6.1.30c and its proof we
know that

(A1 ⊗ · · · ⊗ AN)Y1×···×YN = A1^{Y1} ⊗ · · · ⊗ AN^{YN} = A(SN^{Y1×···×YN}),

with

SN^{Y1×···×YN} = {G1 × · · · × GN : Gk ∈ Ak^{Yk} for k = 1, ..., N}.

We also know that SN^{Y1×···×YN} is a semialgebra on Y1 × · · · × YN (cf. 6.1.30a, with
(Xk, Ak) replaced by (Yk, Ak^{Yk})). For each G1 × · · · × GN ∈ SN^{Y1×···×YN} we have

(µ1 ⊗ · · · ⊗ µN)Y1×···×YN (G1 × · · · × GN) = (µ1 ⊗ · · · ⊗ µN)(G1 × · · · × GN)
= µ1(G1) · · · µN(GN) = (µ1)Y1(G1) · · · (µN)YN(GN)
= ((µ1)Y1 ⊗ · · · ⊗ (µN)YN)(G1 × · · · × GN).

Since the restrictions of (µ1 ⊗ · · · ⊗ µN)Y1×···×YN and (µ1)Y1 ⊗ · · · ⊗ (µN)YN to
SN^{Y1×···×YN} coincide and since SN^{Y1×···×YN} is a semialgebra on Y1 × · · · × YN, an
argument similar to the one used at the end of 8.4.5 leads to the equality between
the measures (µ1 ⊗ · · · ⊗ µN)Y1×···×YN and (µ1)Y1 ⊗ · · · ⊗ (µN)YN.
8.4.7 Lemma. Let (X1, A1, µ1) and (X2, A2, µ2) be σ-finite measure spaces. For
each E ∈ A1 ⊗ A2, the functions

ψ1E : X1 → [0, ∞]
x1 ↦ ψ1E(x1) := µ2(Ex1)

and

ψ2E : X2 → [0, ∞]
x2 ↦ ψ2E(x2) := µ1(E^{x2})

are defined consistently, are elements of L+(X1, A1) and L+(X2, A2) respectively,
and

(µ1 ⊗ µ2)(E) = ∫_{X1} ψ1E dµ1 = ∫_{X2} ψ2E dµ2,
Proof. It follows from 8.4.2a that the definitions of the functions ψ1E and ψ2E are
consistent, for each E ∈ A1 ⊗ A2 .
First, suppose that (µ1 ⊗ µ2 )(X1 × X2 ) < ∞. Since (µ1 ⊗ µ2 )(X1 × X2 ) =
µ1 (X1 )µ2 (X2 ), this is equivalent to µ1 (X1 ) < ∞ and µ2 (X2 ) < ∞.
Define

C := {E ∈ A1 ⊗ A2 : ψkE ∈ L+(Xk, Ak) and (µ1 ⊗ µ2)(E) = ∫_{Xk} ψkE dµk, for k = 1, 2}.
and

ψ1E(x1) = µ2(Ex1) = lim_{n→∞} µ2((En)x1) = lim_{n→∞} ψ1En(x1), ∀x1 ∈ X1.
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 217
This shows that ψ_1^E ∈ L^+(X1, A1) (cf. 6.2.19b). Moreover, we notice that ψ_1^E, ψ_1^{En} for each n ∈ N, and the constant function
ψ1 : X1 → [0, ∞]
    x1 ↦ ψ1(x1) := µ2(X2),
which are elements of L^+(X1, A1) (cf. also 6.2.2), are in fact elements of M(X1, A1) since µ2(X2) < ∞ (cf. also 7.1.2a). Further, ψ1 ∈ L^1(X1, A1, µ1) since µ1(X1) < ∞ (cf. 8.2.6). Then, by 8.2.11 with ψ1 as dominating function, we have
(µ1 ⊗ µ2)(E) = lim_{n→∞} (µ1 ⊗ µ2)(En) = lim_{n→∞} ∫_{X1} ψ_1^{En} dµ1 = ∫_{X1} ψ_1^E dµ1.
Proof. It follows from 8.4.2b that the definitions of the functions ψ_1^ϕ and ψ_2^ϕ are consistent, for each ϕ ∈ L^+(X1 × X2, A1 ⊗ A2).
Suppose E ∈ A1 ⊗ A2 and ϕ := χ_E. Then, for each x1 ∈ X1,
∀x2 ∈ X2, ϕ_{x1}(x2) = χ_E(x1, x2) = χ_{E_{x1}}(x2)
and hence, if we define ψ_1^E as in 8.4.7,
∀x1 ∈ X1, ψ_1^ϕ(x1) = ∫_{X2} χ_{E_{x1}} dµ2 = µ2(E_{x1}) = ψ_1^E(x1).
And similarly for ψ_2^ϕ. Thus, the conclusions of the statement are true for ϕ = χ_E
with E ∈ A1 ⊗ A2 .
For a, b ∈ [0, ∞) and ϕ, ϕ̃ ∈ L^+(X1 × X2, A1 ⊗ A2), we have
aϕ + bϕ̃ ∈ L^+(X1 × X2, A1 ⊗ A2)
(cf. 6.2.31) and also, for each x1 ∈ X1,
∀x2 ∈ X2, (aϕ + bϕ̃)_{x1}(x2) = aϕ_{x1}(x2) + bϕ̃_{x1}(x2),
and hence ψ_1^{aϕ+bϕ̃} = aψ_1^ϕ + bψ_1^ϕ̃, and similarly for ψ_2^{aϕ+bϕ̃}. From this and from what was proved above for a characteristic function, by linearity (cf. 8.1.9 and 8.1.13) we have that the conclusions of the statement are true for all the elements of S^+(X1 × X2, A1 ⊗ A2).
Suppose now ϕ ∈ L^+(X1 × X2, A1 ⊗ A2). Then there is a sequence {ϕn} in S^+(X1 × X2, A1 ⊗ A2) such that
ϕn ≤ ϕ_{n+1}, ∀n ∈ N, and ϕn(x1, x2) → ϕ(x1, x2) as n → ∞, ∀(x1, x2) ∈ X1 × X2
(cf. 6.2.26). Now, for each x1 ∈ X1, ϕ_{x1} ∈ L^+(X2, A2), {(ϕn)_{x1}} is a sequence in L^+(X2, A2) (cf. 8.4.2b), (ϕn)_{x1} ≤ (ϕ_{n+1})_{x1}, and (ϕn)_{x1}(x2) → ϕ_{x1}(x2) as n → ∞ for all x2 ∈ X2. By 8.1.7 and 8.1.8, this implies that
∀x1 ∈ X1, ψ_1^{ϕn}(x1) ≤ ψ_1^{ϕ_{n+1}}(x1) and ψ_1^{ϕn}(x1) → ψ_1^ϕ(x1) as n → ∞.
Since the conclusions of the statement are true for ϕn for all n ∈ N, this implies, by 8.1.8 used twice, that ψ_1^ϕ ∈ L^+(X1, A1) and that
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = lim_{n→∞} ∫_{X1×X2} ϕn d(µ1 ⊗ µ2) = lim_{n→∞} ∫_{X1} ψ_1^{ϕn} dµ1 = ∫_{X1} ψ_1^ϕ dµ1.
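Tonelli's theorem can be sanity-checked numerically in the simplest σ-finite setting, the counting measure on N, where integrals become series; a minimal Python sketch (the summand phi and the truncation level are hypothetical choices, not from the text):

```python
# Tonelli for the counting measures on N: for a non-negative summand,
# the two iterated sums and the sum over the product set coincide.
# phi and the truncation level N are hypothetical illustrative choices.

def phi(n, s):
    return 1.0 / (2 ** n * 3 ** s)   # non-negative, with finite double sum

N = 60   # truncation; the neglected tails are far below the tolerance used

rows_first = sum(sum(phi(n, s) for s in range(1, N)) for n in range(1, N))
cols_first = sum(sum(phi(n, s) for n in range(1, N)) for s in range(1, N))
product = sum(phi(n, s) for n in range(1, N) for s in range(1, N))

print(abs(rows_first - cols_first) < 1e-12)   # → True
print(abs(rows_first - product) < 1e-12)      # → True
```

For a non-negative summand no absolute-convergence hypothesis is needed, which is exactly the point of Tonelli as opposed to Fubini.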
8.4.9 Corollary. Let (X1, A1, µ1) and (X2, A2, µ2) be σ-finite measure spaces and suppose that, for ϕ ∈ M(X1 × X2, A1 ⊗ A2), there exist ϕ1 ∈ L^1(X1, A1, µ1) and ϕ2 ∈ L^1(X2, A2, µ2) so that
|ϕ(x1, x2)| = |ϕ1(x1)||ϕ2(x2)|, ∀(x1, x2) ∈ X1 × X2.
Then ϕ ∈ L^1(X1 × X2, A1 ⊗ A2, µ1 ⊗ µ2).
Proof. Letting I_k := ∫_{Xk} |ϕk| dµk, we have I_k < ∞ for k = 1, 2 (cf. 8.2.4). Since |ϕ| ∈ L^+(X1 × X2, A1 ⊗ A2) (cf. 6.2.17), from 8.4.8 (with ϕ replaced by |ϕ|) we have
∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) = ∫_{X1} ψ_1^{|ϕ|} dµ1.
Now, ψ_1^{|ϕ|} = I_2 |ϕ1| and hence
∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) = I_2 I_1 < ∞,
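A discrete sketch of this corollary, with two finite measures given by hypothetical weight tables: when |ϕ| factorizes as |ϕ1||ϕ2|, the integral of |ϕ| with respect to the product measure is I1 I2.

```python
# Corollary 8.4.9 in a discrete setting: mu1, mu2 are finite measures
# on three- and two-point spaces (hypothetical weights), and
# |phi(x1, x2)| = |phi1(x1)| |phi2(x2)|; the integral of |phi| with
# respect to mu1 (x) mu2 then equals I1 * I2.

mu1 = {0: 0.5, 1: 1.5, 2: 0.25}
mu2 = {0: 2.0, 1: 0.75}
phi1 = {0: 1.0, 1: -2.0, 2: 3.0}
phi2 = {0: 0.5, 1: -4.0}

I1 = sum(abs(phi1[x]) * mu1[x] for x in mu1)
I2 = sum(abs(phi2[y]) * mu2[y] for y in mu2)
integral = sum(abs(phi1[x] * phi2[y]) * mu1[x] * mu2[y]
               for x in mu1 for y in mu2)

print(abs(integral - I1 * I2) < 1e-12)   # → True
```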
with the understanding that the expressions ∫_{X2} ϕ(x1, x2) dµ2(x2) and ∫_{X1} ϕ(x1, x2) dµ1(x1) are to be considered only for x1 ∈ D1 and for x2 ∈ D2, and hence only for µ1-a.e. x1 ∈ X1 and for µ2-a.e. x2 ∈ X2 respectively (the second equality is often referred to by saying that "the order of integration may be reversed").
Proof. We will prove the conclusions of the statement for ϕ_{x1} and for ρ_1^ϕ. The proof for ϕ^{x2} and ρ_2^ϕ is similar.
a: From 8.4.2b we have ϕ_{x1} ∈ M(X2, A2) for each x1 ∈ X1. Moreover we have |ϕ| ∈ L^+(X1 × X2, A1 ⊗ A2) (cf. 6.2.17) and ∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) < ∞ (cf. 8.2.4). Then from 8.4.8 (with ϕ replaced by |ϕ|) we have ψ_1^{|ϕ|} ∈ L^+(X1, A1) and
∫_{X1} ψ_1^{|ϕ|} dµ1 = ∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) < ∞,
and hence (since obviously |ϕ_{x1}| = |ϕ|_{x1}) that the set
D_1^∞ := (ψ_1^{|ϕ|})^{−1}({∞}) = {x1 ∈ X1 : ∫_{X2} |ϕ_{x1}| dµ2 = ∞}
is an element of A1 (cf. 6.1.26) and µ1(D_1^∞) = 0 (cf. 8.1.12b). This proves that ϕ_{x1} ∈ L^1(X2, A2, µ2) for µ1-a.e. x1 ∈ X1 (cf. 8.2.4).
b: We define
ϕ1 := (Re ϕ)+ , ϕ2 := (Re ϕ)− , ϕ3 := (Im ϕ)+ , ϕ4 := (Im ϕ)−
and we notice that ϕx1 = (ϕ1 )x1 − (ϕ2 )x1 + i(ϕ3 )x1 − i(ϕ4 )x1 for each x1 ∈ X1 (cf.
1.2.19).
Fix now i ∈ {1, 2, 3, 4}. We have ϕi ∈ L^+(X1 × X2, A1 ⊗ A2) (cf. 6.2.12 and 6.2.20b) and ϕi ≤ |ϕ|, and hence (ϕi)_{x1} ∈ L^+(X2, A2) (cf. 8.4.2b) and (ϕi)_{x1} ≤ |ϕ_{x1}| for each x1 ∈ X1. We define ψi := (ψ_1^{ϕi})_{D1}, with ψ_1^{ϕi} as in 8.4.8 (with ϕ replaced by ϕi). From 8.4.8 we have ψ_1^{ϕi} ∈ L^+(X1, A1), and hence ψi ∈ L^+(D1, A_1^{D1}) (cf. 6.2.3). As a matter of fact, ψi ∈ M(D1, A_1^{D1}) because D1 = X1 − D_1^∞ (cf. 8.2.4) and hence
∀x1 ∈ D1, ψi(x1) = ψ_1^{ϕi}(x1) = ∫_{X2} (ϕi)_{x1} dµ2 ≤ ∫_{X2} |ϕ_{x1}| dµ2 < ∞.
Thus, ψi ∈ M(X1, A1, µ1) since D1 ∈ A1 and µ1(X1 − D1) = µ1(D_1^∞) = 0 (cf. 8.2.1). Moreover, ψi ∈ L^+(X1, A1, µ1) and from 8.1.14, 8.4.8, 8.1.11a we have
∫_{X1} ψi dµ1 = ∫_{X1} ψ_1^{ϕi} dµ1 = ∫_{X1×X2} ϕi d(µ1 ⊗ µ2) ≤ ∫_{X1×X2} |ϕ| d(µ1 ⊗ µ2) < ∞,
and this shows (cf. 8.2.4) that ψi ∈ L1 (X1 , A1 , µ1 ).
Finally, for each x1 ∈ D1, we have
ρ_1^ϕ(x1) = ∫_{X2} ϕ_{x1} dµ2
= ∫_{X2} (ϕ1)_{x1} dµ2 − ∫_{X2} (ϕ2)_{x1} dµ2 + i ∫_{X2} (ϕ3)_{x1} dµ2 − i ∫_{X2} (ϕ4)_{x1} dµ2
= ψ1(x1) − ψ2(x1) + iψ3(x1) − iψ4(x1).
8.4.11 Remarks.
(a) Let N ∈ N be so that N > 2 and let (Xk, Ak, µk) be a σ-finite measure space for k = 1, ..., N. If 1 < i1 < i2 < ... < ir < N, then (cf. 8.4.5)
µ1 ⊗ · · · ⊗ µN = (· · · (((µ1 ⊗ · · · ⊗ µ_{i1}) ⊗ (µ_{i1+1} ⊗ · · · ⊗ µ_{i2})) ⊗ (µ_{i2+1} ⊗ · · · ⊗ µ_{i3})) ⊗ · · · ) ⊗ (µ_{ir+1} ⊗ · · · ⊗ µN).
Thus, if ϕ ∈ L^+(X1 × · · · × XN, A1 ⊗ · · · ⊗ AN), from 8.4.8 we have
∫_{X1×···×XN} ϕ d(µ1 ⊗ · · · ⊗ µN)
= ∫_{X_{ir+1}×···×XN} (· · · (∫_{X_{i1+1}×···×X_{i2}} (∫_{X1×···×X_{i1}} ϕ(x1, ..., xN) d(µ1 ⊗ · · · ⊗ µ_{i1})(x1, ..., x_{i1})) d(µ_{i1+1} ⊗ · · · ⊗ µ_{i2})(x_{i1+1}, ..., x_{i2})) · · · ) d(µ_{ir+1} ⊗ · · · ⊗ µN)(x_{ir+1}, ..., xN),
where the last equality holds because the names we give to variables are imma-
terial (while their positions are essential). Thus, the two functions ϕ and
(x1 , ..., xN ) 7→ ϕ(x1 , ..., xk−1 , xj , xk+1 , ..., xj−1 , xk , xj+1 , ..., xN )
8.4.12 Remark. In 8.4.8 and in 8.4.10 we assumed Dϕ = X1 ×X2 for the functions
ϕ in the statements. However, both Tonelli’s theorem and Fubini’s theorem can be
generalized to the case of functions defined only µ1 ⊗ µ2 -a.e. We examine here the
case of Tonelli’s theorem. For Fubini’s theorem the treatment would be analogous.
Let (X1 , A1 , µ1 ) and (X2 , A2 , µ2 ) be σ-finite measure spaces, and suppose that
for a function ϕ we have ϕ ∈ L+ (X1 × X2 , A1 ⊗ A2 , µ1 ⊗ µ2 ). This entails (cf.
8.1.14) Dϕ ∈ A1 ⊗ A2 , (µ1 ⊗ µ2 )(X1 × X2 − Dϕ ) = 0, ϕ ∈ L+ (Dϕ , (A1 ⊗ A2 )Dϕ ).
For each x1 ∈ π_{X1}(Dϕ) (cf. 1.2.6c), we define
ϕ_{x1} : (Dϕ)_{x1} → [0, ∞]
    x2 ↦ ϕ_{x1}(x2) := ϕ(x1, x2)
(the condition x1 ∈ π_{X1}(Dϕ) implies (Dϕ)_{x1} ≠ ∅). We have (Dϕ)_{x1} ∈ A2 (cf. 8.4.2a). We also have, for every subset F of R∗,
ϕ_{x1}^{−1}(F) = {x2 ∈ (Dϕ)_{x1} : ϕ(x1, x2) ∈ F}
= {x2 ∈ X2 : (x1, x2) ∈ Dϕ and ϕ(x1, x2) ∈ F}
= {x2 ∈ X2 : (x1, x2) ∈ ϕ^{−1}(F)} = (ϕ^{−1}(F))_{x1};
thus, if F ∈ A(δ), then ϕ^{−1}(F) = E ∩ Dϕ with E ∈ A1 ⊗ A2, and hence
ϕ_{x1}^{−1}(F) = (E ∩ Dϕ)_{x1} = E_{x1} ∩ (Dϕ)_{x1} ∈ A_2^{(Dϕ)_{x1}}
since E_{x1} ∈ A2 (cf. 8.4.2a); this implies that ϕ_{x1} ∈ L^+((Dϕ)_{x1}, A_2^{(Dϕ)_{x1}}). Moreover,
∫_{X1} µ2((X1 × X2 − Dϕ)_{x1}) dµ1(x1) = (µ1 ⊗ µ2)(X1 × X2 − Dϕ) = 0
(cf. 8.4.7) implies (cf. 8.1.12a) that µ2 ((X1 × X2 − Dϕ )x1 ) = 0 µ1 -a.e. on X1 ; since
(X1 × X2 − Dϕ )x1 = X2 − (Dϕ )x1 , this implies that
ϕx1 ∈ L+ (X2 , A2 , µ2 ) µ1 -a.e. on X1 .
Let then E1 ∈ A1 be such that µ1 (E1 ) = 0 and ϕx1 ∈ L+ (X2 , A2 , µ2 ) for each
x1 ∈ X1 − E1 .
Now, if ϕ̃ ∈ L+ (X1 × X2 , A1 ⊗ A2 ) is an extension of ϕ, we have (cf. 8.1.14)
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = ∫_{X1×X2} ϕ̃ d(µ1 ⊗ µ2).
Moreover, for each x1 ∈ π_{X1}(Dϕ), ϕ̃_{x1} is an element of L^+(X2, A2) (cf. 8.4.2b) which is obviously an extension of ϕ_{x1} for each x1 ∈ X1 − E1, and hence we have (cf. 8.1.14)
∫_{X2} ϕ_{x1} dµ2 = ∫_{X2} ϕ̃_{x1} dµ2, ∀x1 ∈ X1 − E1.
Thus, the function
X1 − E1 ∋ x1 ↦ ∫_{X2} ϕ_{x1} dµ2 ∈ [0, ∞]
is the restriction of the function ψ_1^{ϕ̃} (cf. 8.4.8 with ϕ replaced by ϕ̃) to X1 − E1, and hence (cf. 6.2.3) it is an element of L^+(X1, A1, µ1) and (cf. 8.1.14)
∫_{X1} (∫_{X2} ϕ_{x1}(x2) dµ2(x2)) dµ1(x1) = ∫_{X1} ψ_1^{ϕ̃} dµ1.
Then, 8.4.8 (with ϕ replaced by ϕ̃) implies that
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = ∫_{X1×X2} ϕ̃ d(µ1 ⊗ µ2) = ∫_{X1} ψ_1^{ϕ̃} dµ1 = ∫_{X1} (∫_{X2} ϕ(x1, x2) dµ2(x2)) dµ1(x1).
In a similar way it can be proved that
∫_{X1×X2} ϕ d(µ1 ⊗ µ2) = ∫_{X2} (∫_{X1} ϕ(x1, x2) dµ1(x1)) dµ2(x2).
From Fubini’s theorem we can derive the results about double series that we
present in 8.4.14. These results, it must be said, can be obtained by more elementary
means (cf. e.g. Apostol, 1974, th. 8.42).
(b) If the conditions in part a are satisfied, then all the series written below are convergent and
Σ_{n=1}^∞ (Σ_{s=1}^∞ ϕ(n, s)) = Σ_{s=1}^∞ (Σ_{n=1}^∞ ϕ(n, s)) = Σ_{k=1}^∞ ϕ(σ(k)).
Σ_{s=1}^∞ |ϕ(n, s)| and |Σ_{n=1}^∞ ϕ(n, s)| ≤ Σ_{n=1}^∞ |ϕ(n, s)|), and from 8.3.10a we have the equalities
Σ_{n=1}^∞ (Σ_{s=1}^∞ ϕ(n, s)) = ∫_N (∫_N ϕ(n, s) dµ(s)) dµ(n),
Σ_{s=1}^∞ (Σ_{n=1}^∞ ϕ(n, s)) = ∫_N (∫_N ϕ(n, s) dµ(n)) dµ(s),
Σ_{k=1}^∞ ϕ(σ(k)) = ∫_N (ϕ ◦ σ) dµ.
Now, the equalities we need to prove follow from 8.4.10 since from 8.3.11c we have
∫_N (ϕ ◦ σ) dµ = ∫_{N×N} ϕ dµ̃ = ∫_{N×N} ϕ d(µ ⊗ µ).
(b) If the conditions in part a are satisfied, then the series that may appear below
are convergent and
Σ_{n∈I} (Σ_{(n,s)∈I_n} α_{(n,s)}) = Σ_{(n,s)∈J} α_{(n,s)}.
Note that, for the various sums or series of this statement to be defined properly,
an ordering must be assumed in the various index sets. However, the orderings we
use in part a are immaterial in view of 5.4.3, and the orderings we use in part b are
immaterial in view of 4.1.8b because, if the conditions in part a are satisfied, then
all the series that may appear in part b are absolutely convergent.
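The content of part b — that for an absolutely convergent double series the row sums, the column sums, and any enumeration of N × N give the same value — can be illustrated numerically; the summand and the diagonal enumeration below are illustrative choices.

```python
# Absolutely convergent double series: summing by rows, by columns, or
# along a diagonal enumeration of N x N (a stand-in for a bijection
# sigma: N -> N x N) gives the same value. The summand is hypothetical.

def phi(n, s):
    return (-1) ** (n + s) / (2.0 ** n * 2.0 ** s)

N = 40   # truncation level (illustrative)
by_rows = sum(sum(phi(n, s) for s in range(1, N)) for n in range(1, N))
by_cols = sum(sum(phi(n, s) for n in range(1, N)) for s in range(1, N))

# enumerate {1,...,N-1}^2 diagonal by diagonal, each pair exactly once
diagonal = [(n, d - n) for d in range(2, 2 * N)
            for n in range(1, d) if n < N and 0 < d - n < N]
by_diagonal = sum(phi(n, s) for n, s in diagonal)

print(abs(by_rows - by_cols) < 1e-12)      # → True
print(abs(by_rows - by_diagonal) < 1e-12)  # → True
```

Absolute convergence is what makes the orderings immaterial, in agreement with the remark above about 4.1.8b and 5.4.3.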
In this section we prove the Riesz–Markov theorem for compact metric spaces, which
will play an essential role in our proof of the spectral theorem for unitary operators
(from which we will deduce the spectral theorem for self-adjoint operators).
8.5.1 Definition. Let M be a linear manifold in the linear space F (X) (cf. 3.1.10c)
and let L be a linear functional with DL = M . The linear functional L is said to
be positive if 0 ≤ Lϕ whenever ϕ ∈ M is such that 0X ≤ ϕ.
Notice that, if L is positive and ϕ, ψ ∈ M are such that ϕ ≤ ψ, then Lϕ ≤ Lψ
since Lψ = Lϕ + L(ψ − ϕ) and 0X ≤ ψ − ϕ.
8.5.2 Remark. Let (X, d) be a compact metric space and µ a finite measure on
the Borel σ-algebra A(d). For C(X) (cf. 3.1.10e) we have C(X) ⊂ L1 (X, A(d), µ)
by 6.2.8, 2.8.14 and 8.2.6. Thus, we can define the mapping
Lµ : C(X) → C
    ϕ ↦ Lµϕ := ∫_X ϕ dµ,
Proof. Existence: For each G ∈ Td, the family {ϕ ∈ C(X) : ϕ ≺ G} is not empty (it contains the function 0X; for ϕ ≺ G, cf. 2.5.10). Thus, we can define the function
ν : Td → [0, ∞]
    G ↦ ν(G) := sup{Lϕ : ϕ ∈ C(X) and ϕ ≺ G}. (1)
If G1 , G2 ∈ Td are so that G1 ⊂ G2 , then
{ϕ ∈ C(X) : ϕ ≺ G1 } ⊂ {ϕ ∈ C(X) : ϕ ≺ G2 },
and this implies that
if G1 , G2 ∈ Td are so that G1 ⊂ G2 , then ν(G1 ) ≤ ν(G2 ). (2)
We recall that, for ϕ ∈ C(X) and G ∈ Td , ϕ ≺ G implies ϕ ≤ 1X ; thus, since
1X ≺ X and L is positive, we have
ν(X) = L1X . (3)
Since ∪_{n=1}^∞ En ⊂ ∪_{n=1}^∞ Gn, in view of 4 this implies
µ∗(∪_{n=1}^∞ En) ≤ Σ_{n=1}^∞ µ∗(En) + ε.
We also have
ϕ = Σ_{i=1}^n ψi ϕ. (14)
Then we have
Lϕ = Σ_{i=1}^n L(ψi ϕ) (15)
≤ Σ_{i=1}^n (yi + ε) Lψi = Σ_{i=1}^n (|a| + yi + ε) Lψi − |a| Σ_{i=1}^n Lψi (16)
≤ Σ_{i=1}^n (|a| + yi + ε) µ(G̃i) − |a| µ(X) (17)
≤ Σ_{i=1}^n (|a| + yi + ε) (µ(Ei) + ε/n) − |a| µ(X) (18)
≤ Σ_{i=1}^n (yi + ε) µ(Ei) + (ε/n) Σ_{i=1}^n (|a| + yi + ε) (19)
≤ Σ_{i=1}^n (yi − ε) µ(Ei) + 2εµ(X) + ε|a| + εb + ε² (20)
≤ ∫_X ϕ dµ + ε(2µ(X) + |a| + b + ε), (21)
where: 15 holds by 14 and the linearity of L; 16 holds by the linearity and the positivity of L, since ψiϕ ≤ (yi + ε)ψi by the definition of G̃i; 17 holds by 1 and by 13; 18 holds by 12 since G̃i ⊂ Gi; 19 holds because Σ_{i=1}^n µ(Ei) = µ(∪_{i=1}^n Ei) = µ(X);
20 holds because Σ_{i=1}^n µ(Ei) = µ(X) and because yi ≤ b for i = 1, ..., n; 21 holds because
Σ_{i=1}^n (yi − ε) χ_{Ei} ≤ Σ_{i=1}^n y_{i−1} χ_{Ei} ≤ ϕ,
Σ_{i=1}^n (yi − ε) µ(Ei) = ∫_X (Σ_{i=1}^n (yi − ε) χ_{Ei}) dµ.
Since ε was arbitrary and µ(X) < ∞, 11 is established and the proof of existence
is complete.
Uniqueness: Let µ̃ be a finite measure on A(d) such that
Lϕ = ∫_X ϕ dµ̃, ∀ϕ ∈ C(X).
For every closed set F , by 2.5.7 there exists a sequence {ϕn } in C(X) so that
∀x ∈ X, 0 ≤ ϕn (x) ≤ 1 and ϕn (x) → χF (x) as n → ∞.
Then, by Lebesgue’s dominated convergence theorem (cf. 8.2.11, with 1X as domi-
nating function),
µ̃(F) = ∫_X χ_F dµ̃ = lim_{n→∞} ∫_X ϕn dµ̃
= lim_{n→∞} Lϕn = lim_{n→∞} ∫_X ϕn dµ = ∫_X χ_F dµ = µ(F).
In view of 7.4.2, this proves that µ̃ = µ.
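The construction (1) used in the existence proof can be made concrete numerically: taking for L the integral against Lebesgue measure on [0, 1] (approximated here by a midpoint Riemann sum), the supremum over continuous ϕ ≺ G recovers the measure of an open interval G. A rough Python sketch, in which the endpoints, ramp widths, and grid size are all hypothetical choices:

```python
# The set function (1): nu(G) = sup{L(phi) : phi in C(X), phi < G}.
# Here L is integration over [0, 1] against Lebesgue measure
# (approximated by a midpoint Riemann sum), G = (a, b), and the sup is
# approached by trapezoidal phi with support inside (a, b).

def L(phi, grid=100000):
    h = 1.0 / grid
    return sum(phi((k + 0.5) * h) for k in range(grid)) * h

a, b = 0.2, 0.7

def trapezoid(delta):
    # continuous, 0 <= phi <= 1, support [a + delta, b - delta] in (a, b)
    def phi(x):
        return max(0.0, min(1.0, (x - a - delta) / delta,
                            (b - delta - x) / delta))
    return phi

approximations = [L(trapezoid(d)) for d in (0.1, 0.01, 0.001)]
print(approximations)   # increases toward m((a, b)) = b - a = 0.5
```

Each trapezoid contributes b − a − 3δ, so the supremum over admissible ϕ is exactly the Lebesgue measure of the interval, as the theorem asserts.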
Chapter 9
Lebesgue Measure
Then there exists a unique measure µF on the Borel σ-algebra A(dR ) on R (cf.
6.1.22 and 2.1.4) such that
µF ((a, b]) = F (b) − F (a), for all a, b ∈ R so that a < b.
The measure µF is σ-finite and is called the Lebesgue–Stieltjes measure associated
to F .
Proof. Recall that I9 denotes a semialgebra on R such that A(dR) = A(I9) (cf. 6.1.25), and define the function
ν : I9 → [0, ∞]
    E ↦ ν(E) :=
        0                          if E = ∅,
        F(b) − F(a)                if E = (a, b] with a, b ∈ R s.t. a < b,
        F(b) − lim_{n→∞} F(−n)     if E = (−∞, b] with b ∈ R,
        lim_{n→∞} F(n) − F(a)      if E = (a, ∞) with a ∈ R
(notice that lim_{n→∞} F(−n) and lim_{n→∞} F(n) do exist by 5.2.5). We will show that ν satisfies conditions a, b, c, d, e of 7.3.3.
continuous)
∀ε > 0, ∃nε ∈ N s.t. ν((a, b]) − ν((a + 1/nε, b]) = F(a + 1/nε) − F(a) < ε.
If E = (−∞, b] (with b ∈ R): ∀n ∈ N, (−n, b] ∈ S, (−n, b] ⊂ (−∞, b], (−n, b] is compact; if lim_{n→∞} F(−n) ∈ R, then (from the definition of limit)
∀ε > 0, ∃nε ∈ N s.t. ν((−∞, b]) − ν((−nε, b]) = F(−nε) − lim_{n→∞} F(−n) < ε;
if lim_{n→∞} F(−n) = −∞, then ν((−∞, b]) = ∞ and (cf. 5.3.2c, 5.3.2a, 5.2.5)
sup_{n≥1} ν((−n, b]) = sup_{n≥1} (F(b) − F(−n)) = F(b) − inf_{n≥1} F(−n)
F is right continuous)
∀ε > 0, ∃nε ∈ N s.t. ν((−∞, b + 1/nε]) − ν((−∞, b]) = F(b + 1/nε) − F(b) < ε.
If E = (a, ∞) (with a ∈ R) and ν((a, ∞)) < ∞, simply notice that (a, ∞)◦ =
(a, ∞).
e: We have R = ∪_{n∈Z} (n, n + 1] and ν((n, n + 1]) = F(n + 1) − F(n) < ∞ for all n ∈ Z.
Since ν satisfies conditions a, b, c, d, e and A(dR ) = A(I9 ), 7.3.3 implies that
there exists a unique measure µF which is an extension of ν, and that µF is σ-finite.
Since µF extends ν, we have
∀a, b ∈ R so that a < b, µF ((a, b]) = ν((a, b]) = F (b) − F (a).
Suppose now that µ is a measure on A(dR ) such that
∀a, b ∈ R so that a < b, µ((a, b]) = F (b) − F (a).
Then we have, by 7.1.4b:
∀b ∈ R, µ((−∞, b]) = µ(∪_{n=1}^∞ (−n, b]) = lim_{n→∞} µ((−n, b]) = F(b) − lim_{n→∞} F(−n);
∀a ∈ R, µ((a, ∞)) = µ(∪_{n=1}^∞ (a, n]) = lim_{n→∞} µ((a, n]) = lim_{n→∞} F(n) − F(a).
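A quick numerical sketch of the set function ν underlying the Lebesgue–Stieltjes construction, with the hypothetical choice F = arctan (increasing and continuous, hence in particular right continuous): ν is additive over a splitting of (a, b] into half-open pieces, as a measure must be.

```python
# nu((a, b]) := F(b) - F(a) with the hypothetical distribution
# function F = atan; additivity over a partition of (a, b] into
# half-open pieces follows by telescoping.
import math

F = math.atan

def nu(a, b):
    return F(b) - F(a)   # nu((a, b])

a, b = -2.0, 3.0
cuts = [a, -1.0, 0.5, 2.0, b]   # (a, b] as a union of (c_i, c_{i+1}]
pieces = sum(nu(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1))

print(abs(pieces - nu(a, b)) < 1e-12)   # → True (telescoping)
```

The half-open intervals matter: the pieces (c_i, c_{i+1}] are pairwise disjoint and their union is exactly (a, b], which is why the telescoping identity is additivity and not merely an approximation.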
The Lebesgue measure m is the only measure on A(dR ) with the property:
∀a, b ∈ R so that a < b, m((a, b)) = b − a.
(or the same proposition with (a, b) replaced by [a, b) or by [a, b]).
Proof. We know that m is the only measure on A(dR ) such that
∀a, b ∈ R so that a < b, m((a, b]) = b − a.
Then we have, by 7.1.4b and 7.1.4c, for all a, b ∈ R so that a < b:
m((a, b)) = m(∪_{n=1}^∞ (a, b − 1/n]) = lim_{n→∞} m((a, b − 1/n]) = lim_{n→∞} (b − 1/n − a) = b − a;
m([a, b)) = m(∩_{n=1}^∞ (a − 1/n, b)) = lim_{n→∞} m((a − 1/n, b)) = lim_{n→∞} (b − a + 1/n) = b − a;
m([a, b]) = m(∩_{n=1}^∞ [a, b + 1/n)) = lim_{n→∞} m([a, b + 1/n)) = lim_{n→∞} (b + 1/n − a) = b − a.
Similarly we have, for all a ∈ R:
m({a}) = m(∩_{n=1}^∞ (a − 1/n, a]) = lim_{n→∞} m((a − 1/n, a]) = lim_{n→∞} 1/n = 0;
m((−∞, a)) = m(∪_{n=1}^∞ (−n, a)) = lim_{n→∞} m((−n, a)) = lim_{n→∞} (a + n) = ∞;
m((a, ∞)) = m(∪_{n=1}^∞ (a, n)) = lim_{n→∞} m((a, n)) = lim_{n→∞} (n − a) = ∞;
m((−∞, a]) = m([a, ∞)) = ∞ since (−∞, a) ⊂ (−∞, a] and (a, ∞) ⊂ [a, ∞)
(cf. 7.1.2a).
Suppose now that m̃ is a measure on A(dR ) such that
∀a, b ∈ R so that a < b, m̃((a, b)) = b − a.
Then
∀a, b ∈ R so that a < b,
m̃((a, b]) = m̃(∩_{n=1}^∞ (a, b + 1/n)) = lim_{n→∞} m̃((a, b + 1/n)) = lim_{n→∞} (b + 1/n − a) = b − a,
and hence m̃ = m by the uniqueness property of m quoted above. The proofs for
(a, b) replaced by [a, b) or by [a, b] are analogous.
The function
ρ : Rn × Rn → R
((x1 , ..., xn ), (y1 , ..., yn )) 7→ ρ((x1 , ..., xn ), (y1 , ..., yn ))
:= max{|xk − yk | : k = 1, ..., n}
is a distance on Rn and Tρ = Tdn (cf. the proof of 6.1.31). For each (x1 , ..., xn ) ∈ G,
since G ∈ Tρ there exists ε > 0 so that, for (y1 , ..., yn ) ∈ Rn ,
[|xk − yk | < ε for k = 1, ..., n] ⇒ (y1 , ..., yn ) ∈ G;
then, if r ∈ N is so that 1/2^r < ε, the half-open interval defined by Hr that contains
(x1 , ..., xn ) must be contained in G, and therefore either this half-open interval is
contained in an interval Isq with q ∈ Js and s < r or this half-open interval itself
is an interval Irp with p ∈ Jr . This shows that each point of G is contained in an
element of the family {Irp : r ∈ N, p ∈ Jr }, and hence that
G ⊂ ∪_{r=1}^∞ ∪_{p∈Jr} I_{rp}.
The first result of this section is that Lebesgue measure is invariant under transla-
tion.
then
(b) for each ϕ ∈ L^+(R^n, A(dn), mn) (or ϕ ∈ L^1(R^n, A(dn), mn)), if we define
ϕc : Dϕ + (c1, ..., cn) → [0, ∞] (or C)
    (x1, ..., xn) ↦ ϕc(x1, ..., xn) := ϕ(x1 − c1, ..., xn − cn)
The function
µc : A(dn ) → [0, ∞]
E 7→ µc (E) := m(τc−1 (E))
is a measure on A(dn ) (cf. 8.3.11 with µ1 := mn and π := τc ) and we have, for all
(a1 , ..., an ), (b1 , ..., bn ) ∈ Rn so that ak < bk for k = 1, ..., n,
(1/|c|) µc((a, b)) = (1/|c|) |c| (b − a) = b − a,
since m((ca, cb)) = c(b − a) if c > 0 and m((cb, ca)) = (−c)(b − a) if c < 0. Then, by the uniqueness asserted in 9.1.3 we have (1/|c|) µc = m, i.e.
∀E ∈ A(dR ), m(cE) = µc (E) = |c|m(E).
b: Since ϕc = ϕ ◦ δc and (|c|m)(E) = m(δc−1 (E)) for each E ∈ A(dR ), the
assertions of the statement follow from 8.3.11 (and from 8.3.5b with a := |c|, µ := m,
ν the null measure).
9.2.3 Remark. We denote by GL(n, R) the family of all injective linear operators
on the linear space Rn . If A, B ∈ GL(n, R) then AB ∈ GL(n, R) (cf. 1.2.14B and
3.2.4), and if A ∈ GL(n, R) then A−1 ∈ GL(n, R) (cf. 3.2.6b). Our next result
requires a few facts which are known from linear algebra (cf. e.g. Munkres, 1991,
Chapter 1). Every A ∈ GL(n, R) determines a unique matrix [Aik ] so that
A(x1, ..., xn) = (x′_1, ..., x′_n) with x′_i = Σ_{k=1}^n A_{ik} x_k for i = 1, ..., n, ∀(x1, ..., xn) ∈ R^n;
from this it is clear that A is a continuous mapping; also, denoting by det A the
determinant of the matrix [Aik ], we have:
• A1(x1, ..., xn) := (x1, ..., x_{i−1}, c x_i, x_{i+1}, ..., xn), with c ∈ R − {0},
• A2(x1, ..., xn) := (x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn), with c ∈ R and i ≠ k,
• A3(x1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_{k−1}, x_k, x_{k+1}, ..., xn) := (x1, ..., x_{i−1}, x_k, x_{i+1}, ..., x_{k−1}, x_i, x_{k+1}, ..., xn).
then
and
∫_{R^n} (ϕ ◦ A^{−1}) dmn = |det A| ∫_{R^n} ϕ dmn.
Proof. a: For each ψ ∈ L+ (Rn , A(dn )) and each T ∈ GL(n, R), the continuity of
T implies that ψ ◦ T ∈ L+ (Rn , A(dn )) (cf. 6.2.8 and 6.2.5).
For each ψ ∈ L+ (Rn , A(dn )) we have:
where 8.4.11a, 8.4.8 and 9.2.2 (with c replaced by 1c ) have been used;
for c ∈ R and i, k = 1, ..., n so that i 6= k, assuming i < n (otherwise, it is easy
to simplify the calculation below),
∫_{R^n} ψ(x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn) dmn(x1, ..., xn)
= ∫_{R^{n−i}} (∫_R (∫_{R^{i−1}} ψ(x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn) dm_{i−1}(x1, ..., x_{i−1})) dm(x_i)) dm_{n−i}(x_{i+1}, ..., xn)
= ∫_{R^{n−i}} (∫_{R^{i−1}} (∫_R ψ(x1, ..., x_{i−1}, x_i + c x_k, x_{i+1}, ..., xn) dm(x_i)) dm_{i−1}(x1, ..., x_{i−1})) dm_{n−i}(x_{i+1}, ..., xn)
= ∫_{R^{n−i}} (∫_{R^{i−1}} (∫_R ψ(x1, ..., xn) dm(x_i)) dm_{i−1}(x1, ..., x_{i−1})) dm_{n−i}(x_{i+1}, ..., xn)
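The behaviour m_n(A(E)) = |det A| m_n(E) for the elementary operators A1 and A2 can be checked by a crude Monte Carlo count; the sample size, the seed, and the parameter c below are arbitrary illustrative choices.

```python
# m_2(A(E)) = |det A| m_2(E) for E = [0,1)^2, checked by Monte Carlo:
# draw points in a box containing A(E) and count how often A^{-1}
# lands in E. Sample size, seed, and c are illustrative choices.
import random

random.seed(0)
SAMPLES = 200000
c = 2.5

def measure_of_image(apply_inverse, box):
    (x0, x1), (y0, y1) = box
    hits = 0
    for _ in range(SAMPLES):
        x, y = random.uniform(x0, x1), random.uniform(y0, y1)
        u, v = apply_inverse(x, y)
        if 0.0 <= u < 1.0 and 0.0 <= v < 1.0:
            hits += 1
    return hits / SAMPLES * (x1 - x0) * (y1 - y0)

# A1(x, y) = (c x, y), det A1 = c;  A1^{-1}(x, y) = (x / c, y)
m1 = measure_of_image(lambda x, y: (x / c, y), ((0.0, c), (0.0, 1.0)))
# A2(x, y) = (x + c y, y), det A2 = 1;  A2^{-1}(x, y) = (x - c y, y)
m2 = measure_of_image(lambda x, y: (x - c * y, y), ((0.0, 1.0 + c), (0.0, 1.0)))

print(m1, m2)   # near |det A1| = 2.5 and |det A2| = 1.0
```

The shear A2 has determinant one, so it leaves Lebesgue measure invariant even though it moves every point with x_k ≠ 0.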
In this section we see how the Riemann integral can be subsumed in the Lebesgue
integral, when this is defined with respect to the Lebesgue measure on a bounded
interval.
Proof. Since ϕ is bounded and m[a,b] ([a, b]) = b−a < ∞, ϕ ∈ L1 ([a, b], A[a,b] , m[a,b] )
by 8.2.6. Then Re ϕ, Im ϕ ∈ L1 ([a, b], A[a,b] , m[a,b] ) (cf. 8.2.3).
Suppose now that ϕ is Riemann integrable, denote by ϕ̃ either Re ϕ or Im ϕ, and let I := ∫_a^b ϕ̃(x) dx. Then, using the symbols introduced in 9.3.2,
∀n ∈ N, ∃Pn ∈ P s.t. I − 1/n < s_{Pn}(ϕ̃).
Thus, I = lim_{n→∞} s_{Pn}(ϕ̃). Define now a sequence {P′n} in P by letting
P′1 := P1 and P′_{n+1} := P_{n+1} ∪ P′n for n ∈ N,
where P_{n+1} ∪ P′n denotes the partition of [a, b] that is obtained by reordering the union of the families P_{n+1} and P′n. For each n ∈ N, write
{t_0^n, t_1^n, ..., t_{N_n}^n} := P′n and m_i^n(ϕ̃) := inf{ϕ̃(t) : t ∈ [t_{i−1}^n, t_i^n]} for i = 1, ..., N_n,
and define ψn := Σ_{i=1}^{N_n} m_i^n(ϕ̃) χ_{(t_{i−1}^n, t_i^n]}. Then, for each n ∈ N:
(a) ψn ≤ ψ_{n+1} since P′_{n+1} is a refinement of P′n, and ψn ≤ ϕ̃;
(b) s_{Pn}(ϕ̃) ≤ s_{P′n}(ϕ̃) since P′n is a refinement of Pn;
(c) ψn is obviously A_{[a,b]}-measurable, ψn ∈ L^1([a, b], A_{[a,b]}, m_{[a,b]}) since ψn is bounded, and ∫_{[a,b]} ψn dm_{[a,b]} = s_{P′n}(ϕ̃).
From (a) and from 5.2.4 it follows that we can define the function
ψ : [a, b] → R
x 7→ ψ(x) := lim ψn (x)
n→∞
and that ψ ≤ ϕ̃. By 6.2.20c, ψ is A_{[a,b]}-measurable. From (b) it follows that
s_{P′n}(ϕ̃) → I as n → ∞,
and this and (c) imply that
I = lim_{n→∞} ∫_{[a,b]} ψn dm_{[a,b]} = ∫_{[a,b]} ψ dm_{[a,b]}
by 8.2.11 (with dominating function any constant function which majorizes |ϕ|).
We can show in a similar way that there exists χ ∈ L1 ([a, b], A[a,b] , m[a,b] ) such
that
ϕ̃ ≤ χ and ∫_{[a,b]} χ dm_{[a,b]} = I.
Then we have
∫_{[a,b]} (χ − ψ) dm_{[a,b]} = ∫_{[a,b]} χ dm_{[a,b]} − ∫_{[a,b]} ψ dm_{[a,b]} = 0.
Since 0 ≤ χ − ψ, by 8.1.18a we have ψ(x) = χ(x) m[a,b] -a.e. on [a, b], and hence
ψ(x) = ϕ̃(x) m[a,b] -a.e on [a, b] since ψ ≤ ϕ̃ ≤ χ. From this we obtain, by 8.2.7,
∫_{[a,b]} ϕ̃ dm_{[a,b]} = ∫_{[a,b]} ψ dm_{[a,b]} = I = ∫_a^b ϕ̃(x) dx.
Chapter 10
Hilbert Spaces
In this chapter we study inner product spaces, and Hilbert spaces in particular,
which we only consider over the complex field C. While linear operators in Hilbert
spaces are studied in later chapters in connection with the concept of adjoint opera-
tor, we present here what more can be said about the concepts previously introduced
for linear operators when the linear spaces in which they are defined are actually
inner product or Hilbert spaces.
(ᾱ denotes the complex conjugate of a complex number α). We point out that conditions sf2 and sf3 are consistent only when condition sf1 is assumed.
A sesquilinear form ψ is said to be on X if Mψ = X.
The function φ is called an inner product for the linear space (X, σ, µ). An inner
product is also called a scalar product.
10.1.4 Remarks.
(a) It is immediately clear that, in every inner product space X, conditions ip1 and
ip2 imply the following condition:
(ip5 ) (αf1 + βf2 |g) = ᾱ (f1 |g) + β̄ (f2 |g), ∀α, β ∈ C, ∀f1 , f2 , g ∈ X.
Thus, an inner product for X is a sesquilinear form on X.
(b) The reader should be aware that some define an inner product with condition
ip1 replaced by condition
(ip′1 ) (f |αg1 + βg2 ) = ᾱ (f |g1 ) + β̄ (f |g2 ), ∀α, β ∈ C, ∀f, g1 , g2 ∈ X.
Then, condition ip5 gets replaced by condition
(ip′5 ) (αf1 + βf2 |g) = α (f1 |g) + β (f2 |g), ∀α, β ∈ C, ∀f1 , f2 , g ∈ X.
Of course, the two definitions of an inner product are fully equivalent. However,
care must be taken not to mix formulae obtained on the basis of different
definitions.
10.1.5 Examples.
(a) Let ℓf denote the family of all the sequences in C that have just a finite number of non-zero elements, i.e.
ℓf := {{xn} ∈ F(N) : xn ≠ 0 only for a finite number of indices n}.
Obviously, ℓf is a linear manifold in the linear space F (N) (cf. 3.1.10c), and
therefore it is a linear space over C (cf. 3.1.3). It is immediately clear that the
function
φ : ℓf × ℓf → C
    ({xn}, {yn}) ↦ φ({xn}, {yn}) := Σ_{n=1}^∞ x̄n yn
and it is immediately clear that this function has properties ip1 , ip2 , ip3 of
10.1.3. As to property ip4 , we note first that if ϕ ∈ C(a, b) is such that ϕ(x) = 0
m-a.e. on [a, b] then ϕ(x) = 0 for all x ∈ [a, b]. Indeed, as can be easily seen, if
for ϕ ∈ C(a, b) there exists x0 ∈ (a, b) so that ϕ(x0 ) 6= 0, then there exists δ > 0
so that (x0 −δ, x0 +δ) ⊂ [a, b] and ϕ(x) 6= 0 for all x ∈ (x0 −δ, x0 +δ), and hence
it cannot be that ϕ(x) = 0 m-a.e. on [a, b], since m((x0 − δ, x0 + δ)) = 2δ > 0.
Now, for ϕ ∈ C(a, b), (ϕ|ϕ) = 0 implies ϕ(x) = 0 m-a.e. on [a, b] by 8.1.12a,
and hence ϕ = 0C(a,b) . This shows that φ has property ip4 .
It is worth remarking that this example can be formulated without recourse
to Lebesgue integration, but using Riemann integration instead. In fact, 9.3.3
implies that
φ(ϕ, ψ) = ∫_a^b ϕ̄(x) ψ(x) dx, ∀ϕ, ψ ∈ C(a, b),
since ϕψ ∈ C(a, b) for all ϕ, ψ ∈ C(a, b) and the elements of C(a, b) are Riemann-
integrable. Moreover, the argument presented above to prove property ip4 of
φ can be replaced with an argument suited to the definition of φ by means of
Riemann integrals.
(c) The linear space S(R) (cf. 3.1.10h) is an inner product space, with the inner
product defined by
φ : S(R) × S(R) → C
    (ϕ, ψ) ↦ φ(ϕ, ψ) := ∫_R ϕ̄ψ dm
10.1.6 Remark. Let (X, σ, µ, φ) be an inner product space and M a linear manifold in the linear space (X, σ, µ). It is immediate to see that (M, σ_{M×M}, µ_{C×M}, φ_{M×M}) is an inner product space, since (M, σ_{M×M}, µ_{C×M}) is a linear space over C (cf. 3.1.3) and conditions ip1, ip2, ip3, ip4 of 10.1.3 hold trivially if X is replaced by M.
10.1.7 Proposition. Let f, g be two elements of an inner product space X. Then:
(a) |(f|g)| ≤ √(f|f) √(g|g)
(by √(f|f) we mean the non-negative square root of (f|f), which is non-negative by ip3); this is called the Schwarz inequality;
(b) we have |(f|g)| = √(f|f) √(g|g) iff the set {f, g} is linearly dependent.
Proof. As a preliminary step, we note that if f ≠ 0X then (f|f) ≠ 0, by property ip4, and that, by properties ip1 and ip5,
(g − ((f|g)/(f|f)) f | g − ((f|g)/(f|f)) f) = (g|g) − |(f|g)|²/(f|f). (∗)
a: If f = 0X we have (cf. 10.1.2b)
|(f|g)| = 0 = 0 √(g|g) = √(f|f) √(g|g).
If f ≠ 0X, from (∗) we have, by property ip3,
0 ≤ (g|g) − |(f|g)|²/(f|f),
and hence
|(f|g)| ≤ √(f|f) √(g|g).
b: If the set {f, g} is linearly dependent, then there exist α, β ∈ C so that (α, β) ≠ (0, 0) and αf + βg = 0X; assuming for instance α ≠ 0, we have f = −(β/α) g and hence
|(f|g)| = |β/α| |(g|g)| = |β/α| (g|g) = √((−(β/α)g | −(β/α)g)) √(g|g) = √(f|f) √(g|g).
If |(f|g)| = √(f|f) √(g|g) and f ≠ 0X (f = 0X would make the set {f, g} linearly dependent in any case), from (∗) we have
(g − ((f|g)/(f|f)) f | g − ((f|g)/(f|f)) f) = 0,
and hence
g − ((f|g)/(f|f)) f = 0X,
which shows that the set {f, g} is linearly dependent.
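Both parts of 10.1.7 can be checked numerically in C³; the inner product below is written with the conjugation on the first argument (consistent with the polarization identity 10.1.10b used later in this chapter), and the vectors are arbitrary examples.

```python
# Schwarz inequality and its equality criterion in C^3; the vectors
# are arbitrary examples, and (f|g) carries the conjugation on f.

def ip(f, g):
    return sum(fk.conjugate() * gk for fk, gk in zip(f, g))

def norm(f):
    return abs(ip(f, f)) ** 0.5

f = [1 + 2j, -0.5j, 3.0]
g = [2 - 1j, 4.0, 1j]

print(abs(ip(f, g)) <= norm(f) * norm(g))   # → True (strict here)

h = [(2 + 1j) * fk for fk in f]             # {f, h} linearly dependent
print(abs(abs(ip(f, h)) - norm(f) * norm(h)) < 1e-9)   # → True
```

For the independent pair {f, g} the inequality is strict; scaling f by a complex constant produces a dependent pair and equality, exactly as part (b) asserts.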
νφ : X → R
    f ↦ νφ(f) := ‖f‖φ := √(f|f)
Proof. For νφ, condition no2 of 4.1.1 follows from properties ip1 and ip5 of an inner product:
‖αf‖φ = √(αf|αf) = |α| √(f|f) = |α| ‖f‖φ, ∀α ∈ C, ∀f ∈ X.
Condition no3 of 4.1.1 is actually the same as condition ip4 of 10.1.3. It remains to verify condition no1 of 4.1.1. Now, for all f, g ∈ X,
‖f + g‖φ² = (f + g|f + g) = ‖f‖φ² + 2 Re (f|g) + ‖g‖φ².
Thus we have, by the Schwarz inequality (cf. 10.1.7a),
‖f + g‖φ² ≤ ‖f‖φ² + 2 |(f|g)| + ‖g‖φ² ≤ ‖f‖φ² + 2 ‖f‖φ ‖g‖φ + ‖g‖φ² = (‖f‖φ + ‖g‖φ)²,
which implies
‖f + g‖φ ≤ ‖f‖φ + ‖g‖φ.
10.1.11 Proposition. Let X1 and X2 be inner product spaces. For a linear oper-
ator A ∈ O(X1 , X2 ), the following conditions are equivalent:
(a) (Af |Ag)2 = (f |g)1 , ∀f, g ∈ DA ;
(b) kAf k2 = kf k1 , ∀f ∈ DA
(we have indexed by 1 and 2 the inner products and the norms in X1 and X2
respectively, as we will do whenever a similar situation arises).
Proof. a ⇒ b: This follows immediately from the definition of νφ in 10.1.8.
b ⇒ a: If condition b holds true then, by 10.1.10b, we have for all f, g ∈ DA
(Af|Ag)2 = Σ_{n=1}^4 (1/(4iⁿ)) ‖Af + iⁿAg‖₂²
= Σ_{n=1}^4 (1/(4iⁿ)) ‖A(f + iⁿg)‖₂² = Σ_{n=1}^4 (1/(4iⁿ)) ‖f + iⁿg‖₁² = (f|g)1.
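The polarization identity 10.1.10b that drives this proof can be verified directly for vectors in C²; the example vectors below are arbitrary, and the inner product is linear in its second argument.

```python
# The polarization identity 10.1.10b in C^2: (f|g) equals
# sum_{n=1}^{4} (1 / (4 i^n)) ||f + i^n g||^2.  Example vectors are
# arbitrary; the inner product carries the conjugation on f.

def ip(f, g):
    return sum(fk.conjugate() * gk for fk, gk in zip(f, g))

def norm_sq(f):
    return ip(f, f).real

f = [1 + 1j, 2 - 3j]
g = [-2j, 0.5 + 1j]

polarized = sum(norm_sq([fk + (1j ** n) * gk for fk, gk in zip(f, g)])
                / (4 * 1j ** n) for n in range(1, 5))

print(abs(polarized - ip(f, g)) < 1e-9)   # → True
```

Because the identity recovers the inner product from norms alone, a norm-preserving operator automatically preserves inner products, which is the content of the implication b ⇒ a above.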
10.1.13 Remarks.
(a) We saw in 10.1.12 that if a norm is derived from an inner product as in 10.1.8
then it satisfies the parallelogram law.
The converse is also true, namely if a norm ν for a linear space (X, σ, µ) over
C is such that
ν(f + g)2 + ν(f − g)2 = 2ν(f )2 + 2ν(g)2 , ∀f, g ∈ X, (∗)
then there exists a unique inner product φ for (X, σ, µ) so that ν = νφ . The
idea of the proof is as follows. If an inner product φ does exist so that ν = νφ ,
then
φ(f, g) = Σ_{n=1}^4 (1/(4iⁿ)) ν(f + iⁿg)², ∀f, g ∈ X,
must be true, in view of 10.1.10b. Thus, one is led to define the function
φ : X × X → C
    (f, g) ↦ φ(f, g) := Σ_{n=1}^4 (1/(4iⁿ)) ν(f + iⁿg)²,
and to check that this function has properties ip1 , ip2 , ip3 , ip4 of 10.1.3; in
this check, properties no1 , no2 , no3 of 4.1.1 and condition (∗) are used (cf.
Weidmann, 1980, p.10–11). After that, one notes that, for every f ∈ X,
νφ(f)² = φ(f, f) = Σ_{n=1}^4 (1/(4iⁿ)) ν(f + iⁿf)² = (1/4) (2/i + 0/(−1) + 2/(−i) + 4/1) ν(f)² = ν(f)².
We do not give the details of the aforementioned checks because we shall not
use this result.
(b) There are norms which do not satisfy the parallelogram law and which therefore
cannot be derived from any inner product. Such is e.g. the norm defined in
4.3.6a.
Proof. The equality between the two least upper bounds of the statement is obvi-
ous.
Now assume A bounded. Then, by 10.1.7a and 4.2.5b,
| (f |Ag) | ≤ kf kkAgk ≤ kAkkf kkgk, ∀f ∈ Y, ∀g ∈ DA ;
thus, k ≤ kAk and therefore k < ∞.
Conversely, assume k < ∞. Then,
‖Ah‖ = ((Ah|Ah)/(‖Ah‖ ‖h‖)) ‖h‖ ≤ k ‖h‖, ∀h ∈ DA − NA,
and this implies
kAf k ≤ kkf k, ∀f ∈ DA ;
thus, A is bounded, k ∈ BA and therefore kAk ≤ k (cf. 4.2.4).
The statement follows from the two arguments above.
10.1.15 Remark. Let (X, σ, µ, φ) be an inner product space. From 10.1.8 and
4.1.3 we have that the function
dφ := dνφ : X × X → R
    (f, g) ↦ dφ(f, g) := νφ(f − g) = √(φ(f − g, f − g))
is a distance on X. Whenever we use metric concepts in an inner product space,
we will refer to this distance. For instance, if we say that the inner product space
X is complete or separable we mean that the metric space (X, dφ ) is such.
If M is a linear manifold in X, it is immediately clear that we obtain the same
metric space by first defining the inner product space (M, σ_{M×M}, µ_{C×M}, φ_{M×M}) (cf. 10.1.6) and then the metric space (M, d_{φ_{M×M}}), or by first defining the metric space (X, dφ) and then the metric subspace (M, (dφ)M) (cf. 2.1.3). Thus, there can
be no ambiguity when we refer to M as a metric space.
If an isomorphism from X1 onto X2 exists, then the two inner product spaces X1
and X2 are said to be isomorphic.
If the two inner product spaces X1 and X2 are the same, an isomorphism from
X1 onto X2 is called an automorphism of X1 .
the structure of an inner product space. However, 10.1.19 proves that the injectivity part of condition is1 and the whole of condition is2 are in fact redundant.
10.1.19 Theorem. Let X1 and X2 be inner product spaces and U a mapping from
X1 to X2 such that DU is a linear manifold in X1 and
(U(f)|U(g))2 = (f|g)1, ∀f, g ∈ DU.
Then U is an injective linear operator.
10.1.20 Theorem. Let X1 and X2 be inner product spaces and U a mapping from
X1 to X2 . The following conditions are equivalent:
(a) U is an isomorphism from X1 onto X2 ;
(b) DU = X1 , RU = X2 , U is a linear operator, and
kU f k2 = kf k1 , ∀f ∈ X1 ;
(c) DU = X1 , RU = X2 , and
(U(f)|U(g))2 = (f|g)1, ∀f, g ∈ X1.
Proof. Suppose that {f1, ..., fn} is a subset of S and (α1, ..., αn) ∈ Cⁿ is so that Σ_{i=1}^n αi fi = 0X. For k = 1, ..., n we have
0 = (fk | Σ_{i=1}^n αi fi) = Σ_{i=1}^n αi (fk|fi) = αk ‖fk‖²;
Proof. We have
‖Σ_{i=1}^{n} fi ‖² = (Σ_{i=1}^{n} fi | Σ_{k=1}^{n} fk ) = Σ_{i=1}^{n} (fi |fi ) + Σ_{i=1}^{n} Σ_{k≠i} (fi |fk ) = Σ_{i=1}^{n} ‖fi ‖².
10.2.5 Examples.
(a) For each k ∈ N, let δk be the element of the inner product space ℓf (cf. 10.1.5a)
defined by
δk := {δk,n },
i.e. δk is the sequence whose elements are all zero but the k-th, which is one.
The family {δk }k∈N is an o.n.s. in ℓf , since it is obvious that (δk |δl ) = δk,l for
all k, l ∈ N.
(b) We define a family {un }n∈Z of elements of the inner product space C(0, 2π) (cf.
10.1.5b) by
un (x) := (1/√(2π)) e^{inx} , ∀x ∈ [0, 2π], ∀n ∈ Z.
The family {un }n∈Z is an o.n.s. in C(0, 2π) since (un |un ) = 1 is obvious and, for n ≠ m,
(um |un ) = (1/(2π)) ∫_0^{2π} e^{i(n−m)x} dx = (1/(2π)) ∫_0^{2π} cos((n − m)x) dx + (i/(2π)) ∫_0^{2π} sin((n − m)x) dx = 0.
For each n ∈ N we define the elements vn and wn of C(0, 2π) by
vn (x) := (1/√π) cos nx and wn (x) := (1/√π) sin nx, ∀x ∈ [0, 2π].
Since
vn = (1/√2)(un + u−n ) and wn = (1/(√2 i))(un − u−n ), ∀n ∈ N,
a straightforward computation shows that the family {u0 } ∪ {vn }n∈N ∪ {wn }n∈N
is an o.n.s., and 3.1.7 implies that
L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N ) ⊂ L{un }n∈Z .
However, since
un = (1/√2)(vn + iwn ) and u−n = (1/√2)(vn − iwn ), ∀n ∈ N,
3.1.7 implies also that
L{un }n∈Z ⊂ L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N ).
Thus,
L{un }n∈Z = L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N ).
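As a quick numerical sanity check (a sketch, not part of the text), the orthonormality relations (um |un ) = δm,n can be observed by approximating the integral above with a Riemann sum:

```python
# The functions u_n(x) = e^{inx}/sqrt(2*pi) form an o.n.s. in C(0, 2*pi)
# under (f|g) = integral over [0, 2*pi] of conj(f(x)) * g(x) dx,
# approximated here by a uniform Riemann sum.
import cmath, math

def u(n):
    return lambda x: cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

def inner(f, g, steps=4096):
    h = 2 * math.pi / steps
    return sum(f(k * h).conjugate() * g(k * h) * h for k in range(steps))

# (u_n | u_m) should be approximately delta_{n,m}.
print(abs(inner(u(2), u(2))))   # ~1
print(abs(inner(u(2), u(3))))   # ~0
```

The uniform grid makes the cancellation in (u2 |u3 ) essentially exact, since the sum is a full geometric sum of roots of unity.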
(e) if {gn }n∈I is an orthogonal set such that gn 6= 0X and gn is a linear combination
of f1 , ..., fn for each n ∈ I, then for each n ∈ I there exists αn ∈ C so that
gn = αn un .
then by 3.1.7 and by proposition Pn−1 we should have fn ∈ L{u1 , ..., un−1 } =
L{f1 , ..., fn−1 } and hence {fn }n∈I would not be a linearly independent set); this
and proposition Pn−1 imply that the family {u1 , ..., un } can be consistently defined.
Furthermore, kun k = 1 holds trivially and, for l = 1, ..., n − 1,
(ul |un ) = ‖fn − Σ_{k=1}^{n−1} (uk |fn ) uk ‖⁻¹ ((ul |fn ) − Σ_{k=1}^{n−1} (uk |fn ) (ul |uk ))
= ‖fn − Σ_{k=1}^{n−1} (uk |fn ) uk ‖⁻¹ ((ul |fn ) − Σ_{k=1}^{n−1} (uk |fn ) δl,k ) = 0;
these facts and proposition Pn−1 imply that {u1 , ..., un } is an o.n.s. in X. Finally,
the definition of un , 3.1.7 and proposition Pn−1 imply that
L{u1 , ..., un−1 , un } ⊂ L{u1 , ..., un−1 , fn } ⊂ L{f1 , ..., fn−1 , fn },
and also
l
X l
X
(ul |gk ) = βl,j gj |gk = βl,j (gj |gk ) = 0 if l < k.
j=1 j=1
This proves that, for each k ∈ I, αk,l = 0 if l < k and hence that gk = αk,k uk .
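The orthonormalization described above is the Gram–Schmidt process: subtract from each fn its components along the uk already constructed, then normalize. A minimal numerical sketch in C³ (the input vectors are hypothetical):

```python
# Gram-Schmidt in C^n:
# u_n = ||f_n - sum_k (u_k|f_n) u_k||^{-1} (f_n - sum_k (u_k|f_n) u_k).
import math

def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

def gram_schmidt(fs):
    us = []
    for f in fs:
        # Subtract the components along the u_k already constructed.
        v = list(f)
        for u_vec in us:
            c = inner(u_vec, f)
            v = [vi - c * ui for vi, ui in zip(v, u_vec)]
        norm = math.sqrt(inner(v, v).real)
        us.append([vi / norm for vi in v])  # fails iff the f_n are dependent
    return us

fs = [[1 + 0j, 1 + 0j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1j, 1 + 0j]]
us = gram_schmidt(fs)
# Orthonormality check: (u_k|u_l) = delta_{k,l} up to rounding.
for k in range(3):
    for l in range(3):
        print(k, l, abs(inner(us[k], us[l]) - (1 if k == l else 0)) < 1e-10)
```

Each un is a linear combination of f1 , ..., fn , matching the span equalities proved in the text.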
then
dⁿe^{−x²}/dxⁿ = Hn (x) e^{−x²} = Σ_{k=0}^{n} αk x^k e^{−x²}, ∀x ∈ R,
and hence
Hn+1 (x) = gn+1 (x) e^{x²/2} = e^{x²} d^{n+1}e^{−x²}/dx^{n+1} = e^{x²} (d/dx)(Σ_{k=0}^{n} αk x^k e^{−x²}) = Σ_{k=0}^{n} αk (k x^{k−1} − 2x^{k+1} ), ∀x ∈ R;
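With the convention Hn (x) := e^{x²} dⁿe^{−x²}/dxⁿ used here (so Hn is (−1)ⁿ times the physicists' Hermite polynomial), the coefficient recursion αk ↦ αk (k x^{k−1} − 2x^{k+1}) in the display above can be sketched directly on coefficient lists:

```python
# A sketch of the recursion above: if H_n(x) e^{-x^2} is the n-th
# derivative of e^{-x^2}, with H_n(x) = sum_k a_k x^k, then
# H_{n+1}(x) = sum_k a_k (k x^{k-1} - 2 x^{k+1}).
def next_hermite(coeffs):
    """coeffs[k] is the coefficient of x^k in H_n; returns those of H_{n+1}."""
    out = [0] * (len(coeffs) + 1)
    for k, a in enumerate(coeffs):
        if k >= 1:
            out[k - 1] += a * k      # from differentiating x^k
        out[k + 1] += -2 * a         # from x^k times d/dx e^{-x^2}
    return out

H = [[1]]                            # H_0 = 1
for _ in range(3):
    H.append(next_hermite(H[-1]))
# With this convention H_1 = -2x, H_2 = 4x^2 - 2, H_3 = 12x - 8x^3.
print(H[1], H[2], H[3])  # [0, -2] [-2, 0, 4] [0, 12, 0, -8]
```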
Since {gn }n∈I is an orthogonal set, it is obvious that {hn }n∈I is an o.n.s. Moreover,
since hn = cn αn un and cn αn 6= 0 for each n ∈ I, from 3.1.7 it follows that
L{hn }n∈I = L{un }n∈I .
Hence we have
L{hn }n∈I = L{fn }n∈I .
10.2.8 Theorem.
(a) Let {u1 , ..., un } be a finite o.n.s. in X. Then
Σ_{k=1}^{n} |(uk |f )|² ≤ ‖f ‖² , ∀f ∈ X.
(b) Let {ui }i∈I be any o.n.s. in X and, for every f ∈ X, define
If := {i ∈ I : (ui |f ) ≠ 0}.
Then the set If is countable and
Σ_{i∈If } |(ui |f )|² ≤ ‖f ‖² , ∀f ∈ X.
Note that the total ordering in If that is necessary for the definition of the sum or the series Σ_{i∈If } |(ui |f )|² need not be specified in view of 5.4.3. This inequality is called Bessel's inequality. For any f, g ∈ X such that the set If ∩ Ig is denumerable, the series
Σ_{i∈If ∩Ig } (f |ui ) (ui |g)
is absolutely convergent in the Banach space C (cf. 4.1.4 and 2.7.4a). Hence, the total ordering of If ∩ Ig that is necessary for the definition of the series Σ_{i∈If ∩Ig } (f |ui ) (ui |g) need not be specified (cf. 4.1.8b).
Proof. a: We have
0 ≤ (f − Σ_{k=1}^{n} (uk |f ) uk | f − Σ_{l=1}^{n} (ul |f ) ul )
= (f |f ) − Σ_{k=1}^{n} (f |uk ) (uk |f ) − Σ_{l=1}^{n} (ul |f ) (f |ul ) + Σ_{k=1}^{n} Σ_{l=1}^{n} (f |uk ) (ul |f ) δk,l
= ‖f ‖² − Σ_{k=1}^{n} |(uk |f )|².
The result obtained in part a shows that the number of the elements of If,n cannot exceed n² ‖f ‖²; thus, If,n is a finite set for each n ∈ N, and this implies that If is a countable set since If = ∪_{n∈N} If,n (cf. 1.2.10). If If is finite then the inequality of the statement follows from the result obtained in part a. If If is denumerable, let {ik }k∈N := If be an ordering in If ; then (cf. 5.4.1)
Σ_{i∈If } |(ui |f )|² = sup_{n≥1} Σ_{k=1}^{n} |(uik |f )|²,
and the inequality of the statement follows once again from the result obtained in part a.
For any α, β ∈ C, the inequality |αβ| ≤ ½(|α|² + |β|²) follows from 0 ≤ (|α| − |β|)²; then, for all f, g ∈ X we have (whatever ordering is chosen in If ∩ Ig )
Σ_{i∈If ∩Ig } |(f |ui ) (ui |g)| ≤ Σ_{i∈If ∩Ig } ½(|(f |ui )|² + |(ui |g)|²)
≤ ½ Σ_{i∈If ∩Ig } |(f |ui )|² + ½ Σ_{i∈If ∩Ig } |(ui |g)|²
≤ ½ ‖f ‖² + ½ ‖g‖²,
where 5.4.2a, 5.4.5 and 5.4.6 have been used if If ∩ Ig is denumerable. This proves that if If ∩ Ig is denumerable then the series Σ_{i∈If ∩Ig } (f |ui ) (ui |g) is absolutely convergent.
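Bessel's inequality is easy to observe numerically. A sketch in C⁴ (vectors hypothetical), where the o.n.s. {u1 , u2 } does not span the space and the inequality is strict:

```python
# Bessel's inequality: sum_k |(u_k|f)|^2 <= ||f||^2 for any o.n.s.,
# with equality only when f lies in the span of the u_k.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

u1 = [1 + 0j, 0j, 0j, 0j]
u2 = [0j, 1 + 0j, 0j, 0j]
f = [1 + 1j, 2 + 0j, 3j, 1 + 0j]

bessel = sum(abs(inner(u, f)) ** 2 for u in (u1, u2))
norm_sq = inner(f, f).real
print(bessel, norm_sq)  # 6.0 <= 16.0
```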
f: We have
f ∈ S ∩ S ⊥ ⇒ (f |f ) = 0 ⇒ f = 0X . (∗)
If 0X ∈ S then 0X ∈ S ∩ S ⊥ since 0X ∈ S ⊥ follows from 10.1.2b, and hence (∗) proves that S ∩ S ⊥ = {0X }. If 0X ∉ S then (∗) proves that S ∩ S ⊥ = ∅.
10.2.15 Proposition. Let S1 and S2 be two subsets of X such that S1 ⊂ S2⊥ and
X = S1 + S2 . Then S1 = S2⊥ and S2 = S1⊥ .
Proof. We prove that S1 = S2⊥ by proving that S2⊥ ⊂ S1 . For each f ∈ S2⊥ , there
exists a pair (f1 , f2 ) ∈ S1 × S2 so that f = f1 + f2 and hence so that f − f1 = f2 ;
now, f − f1 ∈ S2⊥ since f1 ∈ S1 ⊂ S2⊥ and S2⊥ is a linear manifold (cf. 10.2.13),
while f2 ∈ S2 ; thus, f − f1 = 0X (cf. 10.2.10f), and hence f = f1 ∈ S1 .
Since S1 ⊂ S2⊥ implies S2 ⊂ S1⊥ (cf. 10.2.14), by the same reasoning we can
prove that S2 = S1⊥ .
U (S ⊥ ) = (U (S))⊥ . Indeed,
f ∈ U (S ⊥ ) ⇔ U ⁻¹f ∈ S ⊥ ⇔ [(f |U g) = (U U ⁻¹f |U g) = (U ⁻¹f |g) = 0, ∀g ∈ S] ⇔ f ∈ (U (S))⊥ .
10.3.1 Definition. A Hilbert space is an inner product space (X, σ, µ, φ) such that
the metric space (X, dφ ) is complete (equivalently, such that the normed space
(X, σ, µ, νφ ) is a Banach space).
10.3.2 Theorem. Let M be a linear manifold in a Hilbert space (H, σ, µ, φ). Then the inner product space (M, σM×M , µC×M , φM×M ) (cf. 10.1.6) is a Hilbert space iff M is a closed set in the metric space (H, dφ ). This fully explains why in 4.1.9 a closed linear manifold was called a subspace.
10.3.3 Theorem. If two inner product spaces are isomorphic and one of them is
a Hilbert space, then the other one is also a Hilbert space.
10.3.5 Remark. Let ((X̂, σ̂, µ̂, φ̂), ι) be a completion of an inner product space
(X, σ, µ, φ). Then ι is a linear operator (cf. 10.1.19) and therefore Rι can be
considered as an inner product space (cf. 3.2.2a and 10.1.6). Moreover, ι is injective
(cf. 10.1.19). Thus, condition co1 in 10.3.4 is equivalent to the condition that Rι
be a linear manifold in X̂ and ι be an isomorphism from the inner product space
(X, σ, µ, φ) onto the inner product space (Rι , σ̂Rι ×Rι , µ̂C×Rι , φ̂Rι ×Rι ) (cf. 10.1.6
and 10.1.20).
Since ι is a linear operator, it follows directly from the definitions that ((X̂, dφ̂ ), ι)
is a completion of the metric space (X, dφ ) (cf. 2.6.7).
We shall not use the following theorem; moreover, the completions of inner product spaces that we need will be constructed without using either the statement or the proof of this theorem. For this reason we state it without giving its proof, which can be found e.g. in 4.11 of (Weidmann, 1980).
10.3.6 Theorem. For every inner product space (X, σ, µ, φ), there exists a com-
pletion ((X̂, σ̂, µ̂, φ̂), ι) of (X, σ, µ, φ).
If ((X̃, σ̃, µ̃, φ̃), ω) is also a completion of (X, σ, µ, φ), then there exists an iso-
morphism U from (X̂, σ̂, µ̂, φ̂) onto (X̃, σ̃, µ̃, φ̃) such that U ◦ ι = ω, i.e. such that
U (ι(f )) = ω(f ), ∀f ∈ X.
µ : C × Σ⊕_{n∈I} Hn → Σ⊕_{n∈I} Hn
(α, {fn }) 7→ µ(α, {fn }) := {αfn },
φ : Σ⊕_{n∈I} Hn × Σ⊕_{n∈I} Hn → C
({fn }, {gn }) 7→ φ({fn }, {gn }) := Σ_{n∈I} φn (fn , gn )
(Σ_{n∈I} stands for either Σ_{n=1}^{∞} or Σ_{n=1}^{N} ). The quadruple (Σ⊕_{n∈I} Hn , σ, µ, φ) is a Hilbert space, which is called the direct sum of the family {Hn }n∈I . The symbol Σ⊕_{n∈I} Hn is written as Σ⊕_{n=1}^{N} Hn or as H1 ⊕ · · · ⊕ HN if I = {1, ..., N }, and as Σ⊕_{n=1}^{∞} Hn if I = N.
Proof. We expound the proof for I = N, from which the proof for I = {1, ..., N }
can be obtained by obvious simplifications.
To prove that the definition of σ is consistent, we note first that the inequality
|αβ| ≤ ½(|α|² + |β|²) (i.e. 0 ≤ (|α| − |β|)²), ∀α, β ∈ C (1)
(where 5.4.2a, 5.4.5, 5.4.6 and inequality 2 have been used), which proves that {σn (fn , gn )} ∈ Σ⊕_{n=1}^{∞} Hn .
As to the definition of µ, for α ∈ C and {fn } ∈ Σ⊕_{n=1}^{∞} Hn we have
Σ_{n=1}^{∞} ‖αfn ‖_n² = Σ_{n=1}^{∞} |α|² ‖fn ‖_n² = |α|² Σ_{n=1}^{∞} ‖fn ‖_n² < ∞
(where 5.4.5 has been used), which proves that {µn (α, fn )} ∈ Σ⊕_{n=1}^{∞} Hn .
As to the definition of φ, for {fn }, {gn } ∈ Σ⊕_{n=1}^{∞} Hn we have
Σ_{n=1}^{∞} |(fn |gn )n | ≤ Σ_{n=1}^{∞} ‖fn ‖_n ‖gn ‖_n
≤ Σ_{n=1}^{∞} ½(‖fn ‖_n² + ‖gn ‖_n²) = ½ Σ_{n=1}^{∞} ‖fn ‖_n² + ½ Σ_{n=1}^{∞} ‖gn ‖_n² < ∞
(where 10.1.7a, 5.4.2a, 5.4.5, 5.4.6 and inequality 1 have been used), which proves that the series Σ_{n=1}^{∞} φn (fn , gn ) is absolutely convergent and hence convergent.
Then, it is easy to see that (Σ⊕_{n∈I} Hn , σ, µ, φ) is an inner product space. Properties ls1 and ls2 of 3.1.1 follow directly from the definitions of σ and µ (the zero vector of Σ⊕_{n=1}^{∞} Hn is the sequence {0Hn }, and the opposite of {fn } ∈ Σ⊕_{n=1}^{∞} Hn is the sequence {−fn }), and properties ip1 , ip2 , ip3 , ip4 of 10.1.3 follow from the definition of φ and from the continuity of sum and product in C (for ip1 ) or the continuity of complex conjugation (for ip2 ).
Finally, we prove that the metric space (Σ⊕_{n=1}^{∞} Hn , dφ ) is complete. Let {ϕk } be a Cauchy sequence in Σ⊕_{n=1}^{∞} Hn . This means that ϕk := {fk,n } ∈ Π_{n∈N} Hn and Σ_{n=1}^{∞} ‖fk,n ‖_n² < ∞ for each k ∈ N, and that ∀ε > 0, ∃Nε ∈ N so that
Nε < k, l ⇒ (Σ_{n=1}^{∞} ‖fk,n − fl,n ‖_n²)^{1/2} = dφ (ϕk , ϕl ) < ε. (3)
Thus, for each n ∈ N, {fk,n } (where k is the index within the sequence) is a
Cauchy sequence in Hn . Therefore (since Hn is a complete metric space) there
exists fn ∈ Hn so that fn = limk→∞ fk,n . Moreover, 3 implies that, for each p ∈ N,
Nε < k, l ⇒ Σ_{n=1}^{p} ‖fk,n − fl,n ‖_n² ≤ ε²,
and therefore (in view of the continuity of σn and νφn ) also that, for each p ∈ N,
Nε < k ⇒ Σ_{n=1}^{p} ‖fk,n − fn ‖_n² = lim_{l→∞} Σ_{n=1}^{p} ‖fk,n − fl,n ‖_n² ≤ ε²,
and therefore also that
Nε < k ⇒ Σ_{n=1}^{∞} ‖fk,n − fn ‖_n² ≤ ε². (4)
Now, if we fix k > Nε , 4 implies that the sequence ψk := {fk,n − fn } ∈ Π_{n∈N} Hn is an element of Σ⊕_{n=1}^{∞} Hn , and hence that the sequence ϕ := {fn } ∈ Π_{n∈N} Hn is an element of Σ⊕_{n=1}^{∞} Hn as well since ϕk − ψk = ϕ. Then, 4 can be written as
Nε < k ⇒ dφ (ϕk , ϕ) ≤ ε,
and this shows that the sequence {ϕk } is convergent.
10.3.8 Examples.
(a) We define an inner product φ for a zero linear space (cf. 3.1.10a) by letting
φ(0X , 0X ) := 0. This trivial inner product space is obviously a Hilbert space,
which is called a zero Hilbert space. It is obvious that two zero Hilbert spaces
are isomorphic and that an inner product space which is isomorphic to a zero
Hilbert space is also a zero Hilbert space.
(b) The function
φ: C×C→C
(x1 , x2 ) 7→ φ(x1 , x2 ) := x̄1 x2
is an inner product for the linear space C (cf. 3.1.10b) and dφ = dC (cf. 2.7.4a), as can be immediately seen. Since (C, dC ) is a complete metric space, the inner product space C is a Hilbert space.
(c) Let N ∈ N and let Hn := C for n = 1, ..., N . The Hilbert space Σ⊕_{n=1}^{N} Hn (cf. 10.3.7) is then denoted by CN (this is consistent with the definition CN := C × · · · N times · · · × C given in 1.2.1). Explicitly, the mappings σ, µ, φ are defined by
σ((x1 , ..., xN ), (y1 , ..., yN )) := (x1 + y1 , ..., xN + yN ), ∀(x1 , ..., xN ), (y1 , ..., yN ) ∈ CN ,
µ(α, (x1 , ..., xN )) := (αx1 , ..., αxN ), ∀α ∈ C, ∀(x1 , ..., xN ) ∈ CN ,
φ((x1 , ..., xN ), (y1 , ..., yN )) := Σ_{n=1}^{N} x̄n yn , ∀(x1 , ..., xN ), (y1 , ..., yN ) ∈ CN .
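Consistently with 10.3.7, φ on CN is just the sum of the component inner products φn (x, y) = x̄n yn of the N copies of C. A small sketch (values hypothetical), with the conjugation on the first argument made explicit:

```python
# C^N as the direct sum of N copies of C: the inner product is the sum
# of the component inner products phi_n(x, y) = conj(x) * y.
def phi_component(x, y):
    return x.conjugate() * y

def phi(xs, ys):
    return sum(phi_component(x, y) for x, y in zip(xs, ys))

x = (1 + 2j, 0j, 3 + 0j)
y = (1j, 1 + 0j, 2 + 0j)
# conj(1+2j)*1j + 0 + 3*2 = (2+1j) + 6
print(phi(x, y))  # (8+1j)
```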
Thus, condition no1 of 4.1.1 turns out to be, for the norm νφ ,
(Σ_{n=1}^{∞} |xn + yn |²)^{1/2} ≤ (Σ_{n=1}^{∞} |xn |²)^{1/2} + (Σ_{n=1}^{∞} |yn |²)^{1/2} , ∀{xn }, {yn } ∈ ℓ2 ,
10.3.10 Remark. For a Hilbert space H, the family U(H) is a group with product
of operators as group product, the operator 1H as group identity, the operator U −1
as group inverse of U for every U ∈ U(H) (cf. 10.1.21 and 4.6.2c).
10.3.12 Remarks.
(a) For a Hilbert space H, the set
R := {(A, B) ∈ O(H) × O(H) : ∃U ∈ U(H) such that B = U AU −1 }
defines a relation in O(H) which can be easily seen to be an equivalence relation.
This justifies the term “unitarily equivalent” used in 10.3.11. The fact that R
10.3.14 Remark. All the definitions and the symbols introduced in Section 3.2 for
linear operators can be extended to the family of all linear or antilinear operators,
and it is easy to see that all the results proved in Section 3.2 hold for this wider
family, with only the following exceptions: 3.2.10b4 must be supplemented with
(αA)B = α(AB) = A(ᾱB), ∀α ∈ C − {0}, for every antilinear A and every linear or antilinear B;
3.2.15 is not true for an antilinear operator.
The product of two antilinear operators is a linear operator and the product of
a linear operator and an antilinear one (in either order) is an antilinear operator;
for the sum of two operators to give a linear or an antilinear operator, the two
operators must be both linear or both antilinear.
Moreover, if X and Y are normed spaces over C, all the definitions, the symbols
and the results set out about linear operators in Section 4.2 can be extended to the
family of all linear or antilinear operators (in the extended version of 4.2.7, both
the operators A and B must be either linear or antilinear).
The family (which can be empty) of all antiunitary operators from H1 onto H2 is
denoted by the symbol A(H1 , H2 ). We also write
UA(H1 , H2 ) := U(H1 , H2 ) ∪ A(H1 , H2 ).
For a Hilbert space H, we write
A(H) := A(H, H) and UA(H) := UA(H, H).
An element of A(H) is called an antiunitary operator in H.
10.3.16 Remarks.
(a) The reason why we take antiunitary operators into consideration is that they
play an essential role in Wigner’s theorem (cf. Section 10.9).
(b) If H1 and H2 are Hilbert spaces and V ∈ A(H1 , H2 ), it is immediate to see
that V −1 ∈ A(H2 , H1 ).
(c) If H1 , H2 , H3 are Hilbert spaces, U ∈ UA(H1 , H2 ), and V ∈ UA(H2 , H3 ), it is
immediate to see that V U ∈ UA(H1 , H3 ), and that V U ∈ U(H1 , H3 ) iff U and
V are both unitary or both antiunitary.
(d) For every Hilbert space H, 10.3.10 and remarks b and c above imply that
the family UA(H) is a group with product of operators as group product, the
operator 1H as group identity, the operator T −1 as group inverse of T for every
T ∈ UA(H).
(e) If H1 and H2 are Hilbert spaces and V ∈ A(H1 , H2 ), it is immediate to see that
kV f − V gk2 = kf − gk1 , ∀f, g ∈ H1 .
Thus, V is an isomorphism from the metric space H1 onto the metric space H2 .
(f) The result of 10.2.16 holds true also for an antiunitary operator (if X1 and X2
in that proposition are Hilbert spaces and U is an antiunitary operator, the
proof remains essentially the same).
Proof. The proof is an obvious modification of the proof of 10.1.20, and it follows
from obvious modifications of 10.1.11 and 10.1.19 and their proofs.
10.3.18 Definition. Let H1 and H2 be Hilbert spaces such that the family
A(H1 , H2 ) is not empty. Two linear operators A ∈ O(H1 ) and B ∈ O(H2 ) are said
to be antiunitarily equivalent if there exists V ∈ A(H1 , H2 ) so that B = V AV −1 .
10.3.19 Remark. Let H1 and H2 be Hilbert spaces such that the family
UA(H1 , H2 ) is not empty. For A ∈ O(H1 ), B ∈ O(H2 ), U ∈ A(H1 , H2 ), sup-
pose that B = U AU −1 . Then it is easy to check that all the conditions listed in
4.6.4 still hold true. Indeed, conditions from a to h depend on U being a bijection
from H1 onto H2 , and condition i depends on U being an isomorphism of metric
spaces (the mapping TU defined in 4.6.3 is now an antiunitary operator from the
Hilbert space H1 ⊕ H1 onto the Hilbert space H2 ⊕ H2 ; note that conditions e, f, g,
h, i are still consistent because the image of a linear manifold under an antiunitary
operator is a linear manifold, as can be easily seen). Furthermore, it is easy to check
that propositions from a to e in 4.6.5 still hold true, while propositions from f to i
get replaced by:
10.3.20 Definition. Let H1 and H2 be Hilbert spaces such that the family
UA(H1 , H2 ) is not empty. Two linear operators A ∈ O(H1 ) and B ∈ O(H2 )
are said to be unitarily-antiunitarily equivalent if there exists T ∈ UA(H1 , H2 ) so
that B = T AT −1 .
10.3.21 Remark. For a Hilbert space H, it is easy to check that the relation in O(H) of unitary-antiunitary equivalence is indeed an equivalence relation, in analogy with what we saw in 10.3.12a. This is linked to the group structure of UA(H) (cf. 10.3.16d).
∀f ∈ H, ∃!(f1 , f2 ) ∈ M × M ⊥ so that f = f1 + f2 .
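For a one-dimensional subspace M = V {u} of C³ the decomposition can be computed explicitly: f1 = (u|f ) u and f2 = f − f1 . A numerical sketch (vectors hypothetical):

```python
# Orthogonal decomposition in C^3 with M = span{u} for a unit vector u:
# f = f1 + f2 with f1 = (u|f) u in M and f2 = f - f1 in M-perp.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

u = [1 / 2 ** 0.5 + 0j, 1j / 2 ** 0.5, 0j]   # unit vector spanning M
f = [1 + 0j, 2j, 3 + 0j]

c = inner(u, f)
f1 = [c * ui for ui in u]                     # component in M
f2 = [fi - g for fi, g in zip(f, f1)]         # component in M-perp
print(abs(inner(u, f2)))  # ~0: f2 is orthogonal to M
```

The uniqueness asserted by the theorem corresponds to the fact that f1 is forced to be the orthogonal projection of f onto M.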
10.4.2 Remarks.
(a) The existence part of the statement of 10.4.1 can be rephrased as follows (cf.
3.1.8): if M is a subspace of a Hilbert space H, then H = M + M ⊥ .
(b) In 10.4.1, the condition that the linear manifold M be closed is essential. This is proved by the following counterexample. In the Hilbert space ℓ2 (cf. 10.3.8d), ℓf is a linear manifold and it is not a closed set since the closure of ℓf is ℓ2 while ℓf ≠ ℓ2 (cf. 2.3.9c). Now, for any Hilbert space H and any linear manifold M in H such that M̄ = H and M ≠ H, we have M ⊥ = H⊥ = {0H } (cf. 10.2.11 and 10.2.10a) and hence H ≠ M + M ⊥ .
(c) In 10.4.1, the condition that the inner product space H be complete is essential. This is proved by the following counterexample. For a, b ∈ R, let c ∈ (a, b) and define the subsets M (a, c) and M (c, b) of the inner product space C(a, b) (cf. 10.1.5b) by letting:
M (a, c) := {ϕ ∈ C(a, b) : ϕ(x) = 0, ∀x ∈ (c, b)};
M (c, b) := {ϕ ∈ C(a, b) : ϕ(x) = 0, ∀x ∈ (a, c)}.
Obviously, M (c, b) ⊂ M (a, c)⊥ . We will prove by contraposition the inclusion M (a, c)⊥ ⊂ M (c, b), i.e. that
[ϕ ∈ C(a, b), ∃x0 ∈ (a, c) s.t. ϕ(x0 ) ≠ 0] ⇒ [∃ψ ∈ M (a, c) s.t. (ϕ|ψ) ≠ 0].
Assume that ϕ ∈ C(a, b) and x0 ∈ (a, c) are so that ϕ(x0 ) ≠ 0, and suppose e.g. that Re ϕ(x0 ) > 0 (the argument would be analogous if Re ϕ(x0 ) < 0, or Im ϕ(x0 ) > 0, or Im ϕ(x0 ) < 0); since Re ϕ is a continuous function (cf. 2.7.6), there exists ε > 0 so that Re ϕ(x) > 0 for all x ∈ (x0 − ε, x0 + ε) ∩ [a, b], and we can choose ε so that (x0 − ε, x0 + ε) ⊂ (a, c); then the function
ψ : [a, b] → C
x 7→ ψ(x) := 0 if x ∉ (x0 − ε, x0 + ε),
x − (x0 − ε) if x ∈ (x0 − ε, x0 ),
−x + (x0 + ε) if x ∈ [x0 , x0 + ε)
S ⊥ = {0X } ⇔ S ⊥⊥ = X,
M ⊥ = {0H } ⇔ M̄ = H
by corollary c.
10.4.5 Remark. In all the corollaries of 10.4.1 proved in 10.4.4, the condition that
the inner product space H be complete is essential. We prove this by a counterex-
ample, which shows that if H were not complete then the statement of corollary
10.4.4d would not be true. Since each corollary listed in 10.4.4 implies the following
one (see the proof of 10.4.4), this actually shows that if H were not complete then
no corollary listed in 10.4.4 would be true.
In the inner product space ℓf (cf. 10.1.5a), which is not a Hilbert space (cf.
10.3.8d), let
M := { {xn } ∈ ℓf : Σ_{n=1}^{∞} (1/n) xn = 0 }.
10.4.6 Corollary. Let A be a linear operator in a Hilbert space. Then, the spectrum
of A is a closed subset of C.
The results we prove in 10.4.8, which are sometimes known as the Riesz–Fischer theorem, are corollaries of the next theorem, which is an extension of 10.2.3.
10.4.7 Theorem. Let {fn } be a sequence in an inner product space X and suppose that (fn |fm ) = 0 if n ≠ m. Then:
(a) if the series Σ_{n=1}^{∞} fn is convergent then the series Σ_{n=1}^{∞} ‖fn ‖² is convergent in R, i.e. Σ_{n=1}^{∞} ‖fn ‖² < ∞, and Σ_{n=1}^{∞} ‖fn ‖² = ‖Σ_{n=1}^{∞} fn ‖²;
(b) if X is a Hilbert space and Σ_{n=1}^{∞} ‖fn ‖² < ∞ then the series Σ_{n=1}^{∞} fn is convergent.
Proof. For each n ∈ N, let sn := Σ_{k=1}^{n} fk and σn := Σ_{k=1}^{n} ‖fk ‖². We recall that the series Σ_{n=1}^{∞} fn is said to be convergent if the sequence {sn } is convergent, and that we write Σ_{n=1}^{∞} fn := limn→∞ sn when {sn } is convergent (cf. 2.1.10). Similarly, the series Σ_{n=1}^{∞} ‖fn ‖² is said to be convergent in R if the sequence {σn } is convergent in R, and we write Σ_{n=1}^{∞} ‖fn ‖² := limn→∞ σn when {σn } is convergent (cf. 5.4.1).
a: Assume that the series Σ_{n=1}^{∞} fn is convergent. Then the continuity of the norm (cf. 4.1.6a) implies that the sequence {‖sn ‖²} is convergent in R and limn→∞ ‖sn ‖² = ‖limn→∞ sn ‖² (cf. 2.4.2). Since ‖sn ‖² = σn by 10.2.3, this means that the series Σ_{n=1}^{∞} ‖fn ‖² is convergent in R, i.e. Σ_{n=1}^{∞} ‖fn ‖² < ∞ (cf. 5.4.1), and Σ_{n=1}^{∞} ‖fn ‖² = ‖Σ_{n=1}^{∞} fn ‖².
b: Assume Σ_{n=1}^{∞} ‖fn ‖² < ∞, i.e. that the sequence {σn } is convergent in R. Then {σn } is a Cauchy sequence (cf. 2.6.2). Since |σm − σn | = Σ_{k=n+1}^{m} ‖fk ‖² = ‖Σ_{k=n+1}^{m} fk ‖² = ‖sm − sn ‖² for all m, n ∈ N such that n < m (cf. 10.2.3), this implies that {sn } is a Cauchy sequence as well, and hence a convergent sequence since X is a complete metric space.
10.4.8 Corollaries. Let {un }n∈N be an o.n.s. in an inner product space X, and
let {αn } be a sequence in C. Then:
(a) if the series Σ_{n=1}^{∞} αn un is convergent then Σ_{n=1}^{∞} |αn |² < ∞ and Σ_{n=1}^{∞} |αn |² = ‖Σ_{n=1}^{∞} αn un ‖²;
(b) if X is a Hilbert space and Σ_{n=1}^{∞} |αn |² < ∞ then the series Σ_{n=1}^{∞} αn un is convergent;
(c) if the series Σ_{n=1}^{∞} αn un is convergent then αk = (uk | Σ_{n=1}^{∞} αn un ) for all k ∈ N.
Proof. Letting fn := αn un for all n ∈ N, statements a and b follow immediately
from 10.4.7.
c: If the series Σ_{n=1}^{∞} αn un is convergent, then the continuity of the inner product (cf. 10.1.16c) implies that
(uk | Σ_{n=1}^{∞} αn un ) = (uk | limn→∞ Σ_{l=1}^{n} αl ul ) = limn→∞ Σ_{l=1}^{n} αl δk,l = αk , ∀k ∈ N.
10.4.9 Proposition. Let {fn } be a sequence in a Hilbert space such that (fn |fm ) = 0 if n ≠ m, and let β be a bijection from N onto N. Then the series Σ_{n=1}^{∞} fβ(n) is convergent iff the series Σ_{n=1}^{∞} fn is convergent. If these series are convergent then their sums are the same, i.e. Σ_{n=1}^{∞} fβ(n) = Σ_{n=1}^{∞} fn .
Proof. By 10.4.7, the series Σ_{n=1}^{∞} fβ(n) is convergent iff Σ_{n=1}^{∞} ‖fβ(n) ‖² < ∞ and the series Σ_{n=1}^{∞} fn is convergent iff Σ_{n=1}^{∞} ‖fn ‖² < ∞. Now, Σ_{n=1}^{∞} ‖fβ(n) ‖² = Σ_{n=1}^{∞} ‖fn ‖² by 5.4.3.
Suppose that the series Σ_{n=1}^{∞} fβ(n) and Σ_{n=1}^{∞} fn are convergent. Then, by the continuity of the inner product,
(fk | Σ_{n=1}^{∞} fβ(n) − Σ_{n=1}^{∞} fn ) = Σ_{n=1}^{∞} (fk |fβ(n) ) − Σ_{n=1}^{∞} (fk |fn ) = (fk |fk ) − (fk |fk ) = 0, ∀k ∈ N.
Hence,
Σ_{n=1}^{∞} fβ(n) − Σ_{n=1}^{∞} fn ∈ ({fn }n∈N )⊥ = (V {fn }n∈N )⊥
by 10.2.11. We also have
Σ_{n=1}^{∞} fβ(n) − Σ_{n=1}^{∞} fn = limn→∞ Σ_{k=1}^{n} fβ(k) − limn→∞ Σ_{k=1}^{n} fk ∈ V {fn }n∈N
by 4.1.13, 2.3.10, and 3.1.7. Then, Σ_{n=1}^{∞} fβ(n) = Σ_{n=1}^{∞} fn by 10.2.10f.
Proof. By 10.4.7, condition a is true iff Σ_{(n,s)∈N×N} ‖fn,s ‖² < ∞ (cf. 5.4.7 for the symbol Σ_{(n,s)∈N×N} ‖fn,s ‖²). Further, condition b is true iff Σ_{s=1}^{∞} ‖fn,s ‖² < ∞ for all n ∈ N and Σ_{n=1}^{∞} ‖Σ_{s=1}^{∞} fn,s ‖² < ∞, since (Σ_{s=1}^{∞} fn,s | Σ_{t=1}^{∞} fm,t ) = Σ_{s=1}^{∞} Σ_{t=1}^{∞} (fn,s |fm,t ) = 0 if n ≠ m; also, if Σ_{s=1}^{∞} ‖fn,s ‖² < ∞ then ‖Σ_{s=1}^{∞} fn,s ‖² = Σ_{s=1}^{∞} ‖fn,s ‖² by 10.4.7; hence, condition b is true iff Σ_{n=1}^{∞} Σ_{s=1}^{∞} ‖fn,s ‖² < ∞. Then, conditions a and b are equivalent by 5.4.7.
Suppose that the series of conditions a and b are convergent. Then, by the same procedure as in 10.4.9, we see that
(fm,t | Σ_{(n,s)∈N×N} fn,s − Σ_{n=1}^{∞} Σ_{s=1}^{∞} fn,s ) = (fm,t |fm,t ) − (fm,t |fm,t ) = 0, ∀(m, t) ∈ N × N,
and hence that Σ_{(n,s)∈N×N} fn,s = Σ_{n=1}^{∞} (Σ_{s=1}^{∞} fn,s ).
10.4.11 Remark. By using 10.3.6, it is easy to see that the statements of 10.4.9
and 10.4.10 hold true even if the inner product space of the statements, which
we denote here by X, is not a Hilbert space. Simply, let (X̂, ι) be a completion
of X, substitute the vectors fn or fn,s with ι(fn ) or ι(fn,s ), and note that the series Σ_{n=1}^{∞} fn (for instance) is convergent (in the metric space X) iff the series Σ_{n=1}^{∞} ι(fn ) is convergent (in the metric space X̂) and the sum Σ_{n=1}^{∞} ι(fn ) is an element of Rι .
We present here the Riesz–Fréchet theorem and a result about bounded sesquilinear
forms which follows from it. The Riesz–Fréchet theorem is actually a corollary of
the orthogonal decomposition theorem, since its proof relies on 10.4.4a. The Riesz–
Fréchet theorem is also known as the Riesz representation theorem, but we prefer
to call it Riesz–Fréchet theorem in order to distinguish it from several other “Riesz
representation theorems”. For the same reason, we called the theorem in 8.5.3 the Riesz–Markov theorem, which is often named after the first author only.
10.5.1 Proposition. Let h be a vector of an inner product space X and M a linear
manifold in X. Then the function
Fh : M → C
f 7→ Fh f := (h|f )
is a continuous linear functional (for the definition of a linear functional, cf. 3.2.1),
kFh k ≤ khk and kFh k = khk if h ∈ M .
Proof. The function Fh is a linear operator by property ip1 of an inner product.
Moreover, by 10.1.9 we have
|Fh f | ≤ khkkf k, ∀f ∈ M,
and this shows that the linear operator Fh is bounded, and hence continuous (cf.
4.2.2), and that kFh k ≤ khk (cf. 4.2.4). If h = 0X then kFh k = 0. If h 6= 0X and
h ∈ M then |Fh h| = khkkhk shows that kFh k ≥ khk and hence that kFh k = khk.
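In CN the Riesz–Fréchet correspondence F ↔ h can be made explicit: evaluating F on an orthonormal basis recovers h componentwise, since F (ek ) = (h|ek ) is the conjugate of the k-th component of h. A sketch (the functional and vectors are hypothetical):

```python
# Riesz-Frechet in C^3: the functional F(f) = (h|f) determines h, and
# conversely h can be read off from F via an orthonormal basis.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

h = [1 + 1j, 2j, 0.5]

def F(f):
    return inner(h, f)        # the functional represented by h

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# F(e_k) = (h|e_k) = conj(h_k), so conjugating recovers h:
h_recovered = [F(e).conjugate() for e in basis]
print(h_recovered)
```

This also makes the norm equality ‖F ‖ = ‖h‖ of 10.5.1 plausible: the supremum of |F f |/‖f ‖ is attained at f = h.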
and we have
F (f − (F f /F g) g) = 0, i.e. f − (F f /F g) g ∈ NF , ∀f ∈ H,
F f = (h|f ) = (h′ |f ) , ∀f ∈ H.
10.5.3 Remarks.
(a) The plan for the proof of 10.5.2 is prompted by the following considerations. If the theorem is true, then NF = {h}⊥ and hence
NF⊥ = {h}⊥⊥ = V {h}
(cf. 10.4.4b and 4.1.15). Thus, if we assume that the theorem is true and that NF ≠ H and hence h ≠ 0H , for any non-zero element g of NF⊥ there exists α ∈ C such that α ≠ 0 and g = αh, and hence also such that F g = (h|g) = ᾱ⁻¹ (g|g), which implies ᾱ⁻¹ = F g/‖g‖². Therefore, if the theorem is true and NF ≠ H, we must have h = α⁻¹ g, with α⁻¹ the complex conjugate of F g/‖g‖², for any non-zero element g of NF⊥ .
(b) In 10.5.2, the condition that the inner product space H be complete is essential. This is readily seen as follows. Let H be a Hilbert space and M a linear manifold in H such that M ≠ H and M̄ = H (such are e.g. ℓ2 and ℓf , cf. 10.3.8d). Then, M can be regarded as an inner product space (cf. 10.1.6), which is not complete by 2.6.8. Let g ∈ H be such that g ∉ M . Then the function
M ∋ f 7→ F f := (g|f ) ∈ C
is a continuous linear functional on M for which there exists no h ∈ M such that
F f = (h|f ) , ∀f ∈ M.
Proof. For both the definitions of ψ given in the statement, the function ψ is a
sesquilinear form since A is a linear operator and an inner product is a sesquilinear
form. Moreover, for both definitions,
|ψ(f, g)| ≤ kAkkf kkgk, ∀(f, g) ∈ DA × DA ,
by 10.1.7a and 4.2.5b.
Proof. Existence: Let m ≥ 0 be such that |ψ(f, g)| ≤ mkf kkgk, ∀f, g ∈ H. For
each f ∈ H, define the function
Ff : H → C
g 7→ Ff (g) := ψ(f, g),
which is a linear functional in view of property sf2 of ψ (cf. 10.1.1), and is continuous
since (cf. 4.2.2)
|Ff g| ≤ (mkf k)kgk, ∀g ∈ H;
hence, by 10.5.2,
∃!hf ∈ H such that ψ(f, g) = Ff g = (hf |g) , ∀g ∈ H.
Then, we can define the mapping
A:H→H
f 7→ Af := hf if hf ∈ H is such that (hf |g) = ψ(f, g), ∀g ∈ H,
which is obviously such that ψ(f, g) = (Af |g), ∀f, g ∈ H.
The mapping A is a linear operator since, for all α, β ∈ C and f1 , f2 ∈ H,
(αAf1 + βAf2 |g) = α (Af1 |g) + β (Af2 |g)
= αψ(f1 , g) + βψ(f2 , g)
= ψ(αf1 + βf2 , g) = (A(αf1 + βf2 )|g) , ∀g ∈ H,
in view of property sf3 of ψ, and hence αAf1 + βAf2 = A(αf1 + βf2 ). Moreover,
| (Af |g) | = |ψ(f, g)| ≤ mkf kkgk, ∀f, g ∈ H,
proves that A is bounded, in view of 10.1.14.
Finally, we note that the function
ψ̃ : H × H → C
(f, g) 7→ ψ̃(f, g) := ψ(g, f )
is obviously a bounded sesquilinear form on H. Therefore, what was proved above
implies that there exists B ∈ B(H) such that
ψ̃(f, g) = (Bf |g) , ∀f, g ∈ H,
and hence such that
ψ(g, f ) = (g|Bf ) , ∀f, g ∈ H.
Uniqueness: If A, A′ ∈ OE (H) are such that
ψ(f, g) = (Af |g) = (A′ f |g) , ∀f, g ∈ H,
then A = A′ by 10.2.12. And similarly for the uniqueness of B.
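In C² the correspondence ψ ↔ A of this theorem is concrete: the values ψ(ej , ek ) = (A ej |ek ) on an orthonormal basis determine the matrix of A, with A[k][j] = conj(ψ(ej , ek )) for the convention used here (inner product conjugate-linear in its first argument). A sketch with a hypothetical bounded form ψ:

```python
# Recovering the operator A with psi(f, g) = (Af|g) from a bounded
# sesquilinear form psi on C^2.
def inner(f, g):
    return sum(x.conjugate() * y for x, y in zip(f, g))

# A hypothetical bounded sesquilinear form (conjugate-linear in f).
def psi(f, g):
    return (2 * f[0].conjugate() * g[0]
            + 1j * f[0].conjugate() * g[1]
            + f[1].conjugate() * g[0])

e = [[1, 0], [0, 1]]
# (A e_j | e_k) = psi(e_j, e_k) forces A[k][j] = conj(psi(e_j, e_k)).
A = [[psi(e[j], e[k]).conjugate() for j in range(2)] for k in range(2)]

def apply(A, f):
    return [sum(A[k][j] * f[j] for j in range(2)) for k in range(2)]

f, g = [1 + 1j, 2], [0, 1j]
print(psi(f, g), inner(apply(A, f), g))  # equal
```

The operator B of the theorem would be obtained the same way from the swapped form ψ̃(f, g) := ψ(g, f ).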
10.6.1 Proposition. Let {ui }i∈I be any o.n.s. in H. The family of indices If := {i ∈ I : (ui |f ) ≠ 0} is countable, for every f ∈ H. If If is denumerable then the series Σ_{i∈If } (ui |f ) ui is convergent and its sum is the same whatever ordering is chosen in If for the definition of this series. Thus, we can define
Σ_{i∈I} (ui |f ) ui := Σ_{i∈If } (ui |f ) ui , ∀f ∈ H,
Proof. For every f ∈ H, it was proved in 10.2.8b that If is countable. Now, suppose that If is denumerable. Since
((ui |f ) ui | (uk |f ) uk ) = (f |ui ) (uk |f ) (ui |uk ) = 0 if i ≠ k,
10.4.9 proves that the choice of the ordering in If , which is necessary for the def-
P
inition of the series i∈If (ui |f ) ui , is immaterial both for the convergence of the
series and, in case of convergence, for its sum. Moreover, it was proved in 10.2.8b
that, for whatever ordering in If ,
X
| (ui |f ) |2 ≤ kf k2 ,
i∈If
and hence 10.4.8b implies that the series Σ_{i∈If } (ui |f ) ui is convergent. From 4.1.13, 2.3.10, 3.1.7 we have
Σ_{i∈If } (ui |f ) ui ∈ V {ui }i∈I .
For each k ∈ I we also have, using the continuity of the inner product if If is denumerable,
(uk | f − Σ_{i∈If } (ui |f ) ui ) = (uk |f ) − Σ_{i∈If } (ui |f ) δk,i
= 0 if k ∉ If , and = (uk |f ) − (uk |f ) = 0 if k ∈ If .
Proof. For the part of the statement concerning convergence of the series and independence of the sums from the orderings, cf. 10.2.8b.
Thus,
Σ_{i∈If ∩Ig } (f |ui ) (ui |g) = (Σ_{i∈If } (ui |f ) ui | Σ_{k∈Ig } (uk |g) uk ).
If If ∪ Ig is finite then this equality follows solely from properties ip1 , ip2 , ip5 of an inner product.
By letting g := f in the equality above, for every f ∈ H we have
Σ_{i∈If } |(ui |f )|² = ‖Σ_{i∈If } (ui |f ) ui ‖².
10.6.4 Theorem. Let M be a subspace of H and {ui }i∈I an o.n.s. in H such that
ui ∈ M for all i ∈ I. Then, the following conditions are equivalent:
(a) {ui }i∈I is complete in M ;
(b) f = Σ_{i∈I} (ui |f ) ui , ∀f ∈ M ;
(c) (f |g) = Σ_{i∈I} (f |ui ) (ui |g), ∀f, g ∈ M ;
(d) ‖f ‖² = Σ_{i∈I} |(ui |f )|² , ∀f ∈ M ;
(e) [f ∈ M and (ui |f ) = 0, ∀i ∈ I] ⇒ f = 0H .
If M = H, the equality in condition b is called Fourier expansion and the equalities
in conditions c and d are called Parseval’s identities.
c ⇒ d: Set g := f in condition c.
d ⇒ e: We prove this by contraposition. Assume that f ∈ M exists such that
f 6= 0H and (ui |f ) = 0 for all i ∈ I. Then,
‖f ‖² ≠ 0 = Σ_{i∈I} |(ui |f )|².
10.6.5 Remarks.
(a) The equivalence of conditions a and e in 10.6.4 can be rephrased as follows:
an o.n.s. {ui }i∈I in H is complete in a subspace M of H iff {ui }i∈I ⊂ M and
({ui }i∈I )⊥ ∩ M = {0H }; in particular, {ui }i∈I is a c.o.n.s. in H iff ({ui }i∈I )⊥ =
{0H }.
(b) Suppose that {ui }i∈I is a c.o.n.s. in H and M is a linear manifold in H such
that {ui }i∈I ⊂ M . Then M is dense in H. Indeed,
{ui }i∈I ⊂ M ⇒ L{ui }i∈I ⊂ M ⇒ H = V {ui }i∈I = \overline{L{ui }i∈I } ⊂ M̄ ⇒ M̄ = H
(cf. 3.1.6c, 4.1.13, 2.3.9d).
10.6.6 Corollaries.
(a) Assume that, for N ∈ N, an o.n.s. {u_1, ..., u_N} exists in H. Then,
V{u_1, ..., u_N} = {Σ_{n=1}^N α_n u_n : (α_1, ..., α_N) ∈ C^N}.
Hence,
V{u_1, ..., u_N} ⊂ {Σ_{n=1}^N α_n u_n : (α_1, ..., α_N) ∈ C^N}.
Since the opposite inclusion is obvious, we have the equality of the statement.
b: First, we note that the condition {α_n} ∈ ℓ² is necessary and sufficient for the series Σ_{n=1}^∞ α_n u_n to converge (cf. 10.4.8a,b). Then, since the o.n.s. {u_n}_{n∈N} is obviously complete in the subspace V{u_n}_{n∈N}, 10.6.4 proves that
f = Σ_{n=1}^∞ (u_n|f) u_n, ∀f ∈ V{u_n}_{n∈N}.
Since the opposite inclusion follows from 4.1.13, 2.3.10, 3.1.7, we have the equality of the statement.
10.6.7 Examples.
(a) The family {e_1, ..., e_N} is an o.n.s. in the Hilbert space C^N (cf. 10.3.8c) since it is obvious that (e_k|e_l) = δ_{k,l} for k, l = 1, ..., N, and it is complete by 10.6.4 because, for every f := (f_1, ..., f_N) ∈ C^N, the equality f = Σ_{k=1}^N f_k e_k = Σ_{k=1}^N (e_k|f) e_k proves that condition b of 10.6.4 holds.
(b) The family {δ_k}_{k∈N}, which is an o.n.s. in ℓ_f (cf. 10.2.5a), is obviously an o.n.s. in the Hilbert space ℓ² (cf. 10.3.8d) as well, and it is complete by 10.6.4 because, for every f := {x_n}_{n∈N} ∈ ℓ², (δ_k|f) = x_k for all k ∈ N, so that [(δ_k|f) = 0, ∀k ∈ N] implies f = 0_{ℓ²}; this proves that condition e of 10.6.4 holds.
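For example b, the Fourier coefficients with respect to {δ_k} are just the entries of the sequence, and the tail of Parseval's sum measures the error of a truncated expansion. A small numerical sketch (the sequence x_n = 1/n and the truncation point are hypothetical choices, not from the text):

```python
import numpy as np

# Model an element of l^2 by its first N entries: x_n = 1/n.
N = 10_000
x = 1.0 / np.arange(1, N + 1)

# (delta_k|x) = x_k, so truncating the Fourier expansion over {delta_k}
# after 100 terms leaves exactly the tail of Parseval's sum as the
# squared error.
partial = np.zeros(N)
partial[:100] = x[:100]
tail = np.sum(x[100:] ** 2)          # sum_{n>100} |(delta_n|x)|^2
assert np.isclose(np.linalg.norm(x - partial) ** 2, tail)
```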
10.6.8 Proposition. Let H1 and H2 be Hilbert spaces such that the family
UA(H1 , H2 ) is not empty, and let U ∈ UA(H1 , H2 ). Then:
b: For any family {u_i}_{i∈I} of vectors of H_1, suppose that f ∈ H_2 is such that
(U u_i|f)_2 = 0, ∀i ∈ I.
Then
(u_i|U⁻¹f)_1 = 0, ∀i ∈ I.
If {ui }i∈I is a c.o.n.s. in H1 , this implies U −1 f = 0H1 (cf. 10.6.4) and hence
f = 0H2 since U is a linear or antilinear operator. In view of statement a and of
10.6.4, this proves that {U ui }i∈I is a c.o.n.s. in H2 if {ui }i∈I is a c.o.n.s. in H1 .
10.6.9 Proposition. Let H1 and H2 be Hilbert spaces such that a c.o.n.s. {ui }i∈I
exists in H1 and a c.o.n.s. {vi }i∈I exists in H2 which are indexed by the same set I
of indices. For every f ∈ H_1, the set I_f := {i ∈ I : (u_i|f)_1 ≠ 0} is countable. If I_f
is denumerable then the series Σ_{i∈I_f} (u_i|f)_1 v_i and Σ_{i∈I_f} (f|u_i)_1 v_i are convergent
and their sums are independent from the orderings chosen in I_f for their definitions.
The mapping
U : H_1 → H_2
f ↦ U f := Σ_{i∈I} (u_i|f)_1 v_i := Σ_{i∈I_f} (u_i|f)_1 v_i
Proof. We set out the proof for U , from which the proof for V can be obtained by
obvious modifications.
For each f ∈ H_1, it was proved in 10.2.8b that I_f is countable and Σ_{i∈I_f} |(u_i|f)_1|² < ∞; then, if I_f is denumerable, 10.4.9 implies that the convergence of the series Σ_{i∈I_f} (u_i|f)_1 v_i and its sum do not depend on the ordering chosen in I_f, and the series is convergent by 10.4.8b.
For each g ∈ H_2, the same arguments as above prove that I_g is countable and that if I_g is denumerable then the series Σ_{i∈I_g} (v_i|g)_2 u_i is convergent and its sum is independent from the ordering chosen in I_g, and we can define the vector f of H_1 by
f := Σ_{i∈I_g} (v_i|g)_2 u_i;
we have
(u_i|f)_1 = (v_i|g)_2 if i ∈ I_g, and (u_i|f)_1 = 0 if i ∉ I_g
(cf. 10.4.8c); thus, I_f = I_g and
U f = Σ_{i∈I_f} (u_i|f)_1 v_i = Σ_{i∈I_g} (v_i|g)_2 v_i = g
since {vi }i∈I is a c.o.n.s. in H2 (cf. 10.6.4b). This proves that RU = H2 . Moreover,
for all f, h ∈ H1 ,
(U f|U h)_2 = Σ_{i∈I_f} Σ_{k∈I_h} (f|u_i)_1 (u_k|h)_1 (v_i|v_k)_2
= Σ_{i∈I_f∩I_h} (f|u_i)_1 (u_i|h)_1 = (f|h)_1
since {ui }i∈I is a c.o.n.s. in H1 (cf. 10.6.4c). In view of 10.1.20, this proves that
U ∈ U(H1 , H2 ) (in the proof for V , 10.1.20 must be replaced by 10.3.17).
Since U is an isomorphism, it is injective and the proof of surjectivity given
above for U proves also the part of the statement concerning U −1 .
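In finite dimension the construction of 10.6.9 is easy to reproduce numerically. A sketch (assuming NumPy; the two complete orthonormal systems are randomly generated, not taken from the text) showing that f ↦ Σ_i (u_i|f)_1 v_i maps u_k to v_k and preserves inner products:

```python
import numpy as np

def random_onb(n, seed):
    # Columns of the unitary factor Q form a c.o.n.s. of C^n.
    g = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(g.standard_normal((n, n)) + 1j * g.standard_normal((n, n)))
    return [Q[:, i] for i in range(n)]

n = 4
u = random_onb(n, 1)          # c.o.n.s. {u_i} in H1 := C^4
v = random_onb(n, 2)          # c.o.n.s. {v_i} in H2 := C^4

def U(f):
    # U f := sum_i (u_i|f) v_i   (np.vdot conjugates its first argument)
    return sum(np.vdot(ui, f) * vi for ui, vi in zip(u, v))

rng = np.random.default_rng(3)
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# U maps u_k to v_k and preserves inner products: (Uf|Uh) = (f|h).
assert np.allclose(U(u[0]), v[0])
assert np.isclose(np.vdot(U(f), U(h)), np.vdot(f, h))
```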
10.6.10 Remark. Suppose that a c.o.n.s. {ui }i∈I exists in a Hilbert space H.
Then the mapping
V : H → H
f ↦ V f := Σ_{i∈I} (f|u_i) u_i
is an element of A(H) (cf. 10.6.9) and V 2 = 1H , as can be easily seen. Thus, every
antiunitary operator in H is the product of a unitary operator multiplied by V . In
fact, for A ∈ A(H), A = (AV )V and AV ∈ U(H) (cf. 10.3.16c).
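A finite-dimensional sketch of the remark (assuming NumPy; the c.o.n.s. is randomly generated, not from the text): the mapping V f := Σ_i (f|u_i) u_i is antilinear and squares to the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
u = [Q[:, i] for i in range(3)]       # a c.o.n.s. of C^3

def V(f):
    # V f := sum_i (f|u_i) u_i; note (f|u_i) = conj (u_i|f), so V is antilinear.
    return sum(np.vdot(f, ui) * ui for ui in u)

f = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# V is an involution: V(Vf) = f.
assert np.allclose(V(V(f)), f)
# V is antilinear: V(alpha f) = conj(alpha) V f.
alpha = 2 - 3j
assert np.allclose(V(alpha * f), np.conj(alpha) * V(f))
```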
If the axiom of choice is assumed (in its equivalent form known as Zorn's lemma), it can be proved that a c.o.n.s. exists in every non-zero Hilbert space (cf. e.g. Weidmann, 1980, th. 3.10). However, it is possible to prove that there exists a c.o.n.s. in every separable non-zero Hilbert space without using the axiom of choice.
In this section, we give the proof of the existence of a c.o.n.s. in this reduced form
only, because in our opinion the idea of a c.o.n.s. is really useful in separable Hilbert
spaces only (mainly because, as we see below, a c.o.n.s. is countable iff the Hilbert
space is separable). The importance of a theorem which proves the existence of a
c.o.n.s. is that it justifies all the procedures in which complete orthonormal systems
are used.
10.7.1 Theorem. Suppose that a Hilbert space H is separable and non-zero. Then
a countable c.o.n.s. exists in H.
Since {fnk }k∈I is a linearly independent subset of H, 10.2.6 implies that there exists
an o.n.s. {un }n∈I in H such that
L{un }n∈I = L{fnk }k∈I ,
and hence such that
L{un}n∈I = LS.
Then V{u_n}_{n∈I} is the closure of L{u_n}_{n∈I} = LS, and hence it contains the closure of S, which is H (cf. 4.1.13, 3.1.6b, 2.3.9d); thus, V{u_n}_{n∈I} = H.
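The orthonormalization step invoked from 10.2.6 is, in effect, a Gram–Schmidt procedure applied to a countable spanning family, discarding linearly dependent vectors. A minimal numerical sketch (assuming NumPy; the family of vectors is a hypothetical example):

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize, discarding vectors that are (numerically) dependent."""
    ons = []
    for f in vectors:
        g = f - sum(np.vdot(w, f) * w for w in ons)
        norm = np.linalg.norm(g)
        if norm > tol:
            ons.append(g / norm)
    return ons

# A spanning family with a repetition/dependency:
fam = [np.array([1.0, 0, 0]), np.array([2.0, 0, 0]),
       np.array([1.0, 1, 0]), np.array([0.0, 1, 1])]
u = gram_schmidt(fam)

assert len(u) == 3                                   # dependent vector dropped
G = np.array([[np.vdot(a, b) for b in u] for a in u])
assert np.allclose(G, np.eye(3))                     # orthonormal system
```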
Proof. Since M is a subspace, it can be regarded as a Hilbert space on its own (cf.
10.1.6 and 10.3.2), and it is not a zero Hilbert space since M ≠ {0_H}. Moreover,
M is separable (cf. 2.3.20 and 10.1.15). Then, 10.7.1 proves that there exists a
countable c.o.n.s. in the Hilbert space M , and hence a countable o.n.s. in H which
is complete in the subspace M (cf. 10.6.5c).
10.7.3 Corollary. Suppose that a Hilbert space H is separable, and let {ui }i∈I be
an o.n.s. in H. Then there exists a c.o.n.s. in H which contains {ui }i∈I .
Proof. If ({ui }i∈I )⊥ = {0H } then {ui }i∈I is a c.o.n.s. in H (cf. 10.6.5a). Now
assume ({u_i}_{i∈I})⊥ ≠ {0_H}. Then, 10.2.13 and 10.7.2 imply that there exists an
o.n.s. {vj }j∈J in H such that V {vj }j∈J = ({ui }i∈I )⊥ . It is obvious that the family
{ui }i∈I ∪ {vj }j∈J is an o.n.s. in H. Moreover,
({vj }j∈J )⊥ = (V {vj }j∈J )⊥ = (({ui }i∈I )⊥ )⊥
(cf. 10.2.11), and hence
({ui }i∈I ∪ {vj }j∈J )⊥ = ({ui }i∈I )⊥ ∩ ({vj }j∈J )⊥
= ({ui }i∈I )⊥ ∩ (({ui }i∈I )⊥ )⊥ = {0H }
(cf. 10.2.10c,f). This proves that {ui }i∈I ∪ {vj }j∈J is a c.o.n.s. in H (cf. 10.6.5a).
10.7.4 Remark. In the proof of the orthogonal decomposition theorem that was
given in 10.4.1, the axiom of choice (cf. 1.2.22) was used in the construction of the
sequence {g_n} in M which was such that ‖f − g_n‖ → d. Now, corollary 10.7.2 makes
it possible to prove the orthogonal decomposition theorem without resorting to the
axiom of choice, if the Hilbert space is separable. Indeed, if the Hilbert space H
is separable and M is a non-zero subspace of H, 10.7.2 proves that there exists an
o.n.s. {ui }i∈I in H which is complete in M (this o.n.s. is countable, but this has
no relevance here). Since 10.6.1 proves that
Σ_{i∈I} (u_i|f) u_i ∈ V{u_i}_{i∈I} and f − Σ_{i∈I} (u_i|f) u_i ∈ (V{u_i}_{i∈I})⊥, ∀f ∈ H,
then for each f ∈ H we actually have a pair (f_1, f_2) ∈ M × M⊥ such that f = f_1 + f_2
if we define
f_1 := Σ_{i∈I} (u_i|f) u_i and f_2 := f − Σ_{i∈I} (u_i|f) u_i
(the uniqueness of the pair can then be proved as in 10.4.1).
The next two theorems round off our exposition of the relation between separa-
bility of a Hilbert space and countability of orthonormal systems. Theorem 10.7.5
is the converse of theorem 10.7.1.
Proof. Assume that there exists a countable c.o.n.s. {un }n∈I in H. We set out
the proof of the separability of H for I denumerable, from which the proof for I
finite can be derived easily. Let then I := N, and fix f ∈ H and ε > 0. Since
V {un }n∈N = H, 4.1.13 and 2.3.12 imply that
∃f_ε ∈ L{u_n}_{n∈N} such that ‖f − f_ε‖ < ε/2,
and 3.1.7 implies that
∃N_ε ∈ N, ∃(α_1^ε, ..., α_{N_ε}^ε) ∈ C^{N_ε} such that f_ε = Σ_{n=1}^{N_ε} α_n^ε u_n.
Since Q is dense in R (cf. 2.3.16), there exist (a_1^ε, ..., a_{N_ε}^ε), (b_1^ε, ..., b_{N_ε}^ε) ∈ Q^{N_ε} such that
|Re α_n^ε − a_n^ε| < ε/(4N_ε) and |Im α_n^ε − b_n^ε| < ε/(4N_ε), ∀n ∈ {1, ..., N_ε},
and hence such that
‖f − Σ_{n=1}^{N_ε} (a_n^ε + i b_n^ε) u_n‖ ≤ ‖f − f_ε‖ + ‖f_ε − Σ_{n=1}^{N_ε} (a_n^ε + i b_n^ε) u_n‖
< ε/2 + Σ_{n=1}^{N_ε} |α_n^ε − (a_n^ε + i b_n^ε)|
≤ ε/2 + Σ_{n=1}^{N_ε} |Re α_n^ε − a_n^ε| + Σ_{n=1}^{N_ε} |Im α_n^ε − b_n^ε| < ε.
This proves that the subset H_0 of H defined by
H_0 := {Σ_{n=1}^N (a_n + i b_n) u_n : N ∈ N, (a_1, ..., a_N), (b_1, ..., b_N) ∈ Q^N}
is dense in H; since H_0 is countable, H is separable.
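The rational-approximation step of the proof can be illustrated numerically: approximating the real and imaginary parts of each coefficient to within ε/(4N_ε) keeps the total coefficient error below ε/2. A sketch (the coefficients α_n are hypothetical values; `Fraction.limit_denominator` produces the rational approximants):

```python
from fractions import Fraction
import math

import numpy as np

eps = 1e-3

# Hypothetical coefficients alpha_n of f_eps in span{u_1, ..., u_Neps}:
alpha = np.array([0.7 - 1.3j, math.pi + 0.5j])
Neps = len(alpha)
tol = eps / (4 * Neps)

# A denominator bound D > 1/tol guarantees |x - p/q| < 1/D < tol for the
# best rational approximation with denominator <= D.
D = math.ceil(1 / tol) + 1
rat = np.array([complex(float(Fraction(z.real).limit_denominator(D)),
                        float(Fraction(z.imag).limit_denominator(D)))
                for z in alpha])

# Since {u_n} is orthonormal, ||f_eps - sum (a_n + i b_n) u_n|| is at most
# the sum of the coefficient errors, which is below eps/2 by construction.
err = float(np.sum(np.abs(alpha - rat)))
assert err < eps / 2
```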
10.7.6 Remark. From 10.7.5 and 10.6.7 we have that the Hilbert spaces CN and
ℓ2 (cf. 10.3.8c,d) are separable.
Proof. If H is a zero Hilbert space then there is nothing to prove. Now assume
that H is a non-zero separable Hilbert space. Then there exists a countable subset
S of H which is dense in H. Let {ui }i∈I be an o.n.s. in H and define
S_i := {f ∈ S : ‖u_i − f‖ < 1/√2}, ∀i ∈ I.
In view of 2.3.12, Si 6= ∅ for each i ∈ I. Then, by the axiom of choice (cf. 1.2.22)
there exists a mapping ϕ : I → S such that ϕ(i) ∈ Si for each i ∈ I. Moreover,
S_i ∩ S_k = ∅ if i ≠ k;
in fact, if f ∈ S existed such that f ∈ Si ∩ Sk , we would have
‖u_i − u_k‖ ≤ ‖u_i − f‖ + ‖f − u_k‖ < 2/√2 = √2,
while we have, if i ≠ k,
‖u_i − u_k‖ = √((u_i − u_k|u_i − u_k)) = √((u_i|u_i) + (u_k|u_k)) = √2.
Then, the mapping ϕ is injective and hence ϕ is a bijection from I onto Rϕ , which
is a countable set since it is a subset of S (cf. 1.2.10). Thus, I is countable and so
is {ui }i∈I .
10.7.8 Proposition. Suppose that a set X is not finite. Then there exists an
injection i : N → X.
(a) If an o.n.s. in the Hilbert space H is a linear basis in the linear space (H, σ, µ),
then it is a finite set.
(b) If a c.o.n.s. in the Hilbert space H is finite, then it is a linear basis in the linear
space (H, σ, µ).
Proof. a: The proof is by contradiction. Assume that {ui }i∈I is an o.n.s. in the
Hilbert space H, and that it is not finite. Then there exists an injection i : N → I
(cf. 10.7.8) and we can define the vector f := Σ_{n=1}^∞ (1/n) u_{i(n)} (cf. 10.4.8b). Next,
assume that {u_i}_{i∈I} is a linear basis in the linear space (H, σ, µ). Then there exist a finite subfamily {u_{i_1}, ..., u_{i_n}} of {u_i}_{i∈I} and (α_1, ..., α_n) ∈ C^n so that
f = Σ_{k=1}^n α_k u_{i_k},
and hence so that
(ui |f ) = 0, ∀i ∈ I − {i1 , ..., in }.
But this is in contradiction to
(u_{i(n)}|f) = 1/n, ∀n ∈ N,
which holds true by 10.4.8c.
b: If a c.o.n.s. in the Hilbert space H is finite, then 10.2.2 and 10.6.4b prove that it is a linear basis in the linear space (H, σ, µ).
10.7.10 Theorem. Let H be a separable and non-zero Hilbert space. Then a count-
able c.o.n.s. S exists in H and:
(a) if S is finite then every other c.o.n.s. in H is finite and contains the same
number of vectors as S;
(b) if S is denumerable then every other c.o.n.s. in H is denumerable.
10.7.12 Remarks.
(a) From 10.6.7a we have that the orthogonal dimension of the Hilbert space CN is
finite and equal to N , and from 10.6.7b we have that the orthogonal dimension
of the Hilbert space ℓ2 is denumerable.
(b) If M is a subspace of a separable Hilbert space whose orthogonal dimension
is finite, then the orthogonal dimension of M is finite as well. This follows
immediately from 10.7.2 and 10.7.3.
10.7.14 Theorem. Let H1 and H2 be Hilbert spaces, and suppose that H1 is sep-
arable. Then the following conditions are equivalent:
(a) H2 is separable and the orthogonal dimensions of H1 and H2 are equal;
(b) U(H1 , H2 ) is not empty (i.e. H1 and H2 are isomorphic);
(c) A(H1 , H2 ) is not empty.
If H1 is not a zero space and if these conditions are satisfied, then a mapping
T : H1 → H2 is a unitary (or antiunitary) operator iff there exist a c.o.n.s. {un }n∈I
in H1 and a c.o.n.s. {vn }n∈I in H2 , with I := {1, ..., N } or I := N, so that
T f = Σ_{n∈I} (u_n|f)_1 v_n (or T f = Σ_{n∈I} (f|u_n)_1 v_n), ∀f ∈ H_1,
where Σ_{n∈I} stands for Σ_{n=1}^N or Σ_{n=1}^∞.
Proof. The first half of the statement is trivial if H1 is a zero space. Then we
assume that H1 is not a zero space.
The implications “a ⇒ b” and “a ⇒ c”, as well as the “if” part of the second
half of the statement are proved by 10.6.9.
The implications “b ⇒ a” and “c ⇒ a” are proved by 10.6.8b.
As to the “only if” part of the second half of the statement, let T ∈ UA(H1 , H2 )
and let {un }n∈I be a c.o.n.s. in H1 . We may assume I := {1, ..., N } or I := N by
10.7.7. If we define vn := T un for all n ∈ I, then {T un}n∈I is a c.o.n.s. in H2 by
10.6.8b and we have, in view of 10.6.4b,
T f = T(Σ_{n∈I} (u_n|f)_1 u_n) = Σ_{n∈I} (u_n|f)_1 T u_n = Σ_{n∈I} (u_n|f)_1 v_n, ∀f ∈ H_1,
if T ∈ U(H_1, H_2) or
T f = T(Σ_{n∈I} (u_n|f)_1 u_n) = Σ_{n∈I} (f|u_n)_1 T u_n = Σ_{n∈I} (f|u_n)_1 v_n, ∀f ∈ H_1,
if T ∈ A(H1 , H2 ), since T is a linear or antilinear operator and (if I = N) since
T is a continuous mapping in either case (cf. 10.1.21 and 4.6.2.d, or 10.3.16e, and
2.4.5).
10.7.15 Remarks.
(a) From 10.7.14 and 10.7.12 we have that any non-zero separable Hilbert space is
isomorphic either to CN for a suitable N ∈ N or to ℓ2 .
(b) If the orthogonal dimension of a separable Hilbert space H is finite and equal
to N , for every c.o.n.s. {u1 , ..., uN } in H the mapping
U : H → CN
f ↦ U f := ((u_1|f), ..., (u_N|f))
is a unitary operator from H onto CN . This follows immediately from 10.6.9
with {ui }i∈I := {u1 , ..., uN } and {vi }i∈I := {e1 , ..., eN } (cf. 10.6.7a).
(c) If the orthogonal dimension of a separable Hilbert space H is denumerable, for
every c.o.n.s. {un }n∈N in H the mapping
U : H → ℓ2
f ↦ U f := {(u_n|f)}
is a unitary operator from H onto ℓ2 . This follows immediately from 10.6.9
with {ui }i∈I := {un }n∈N and {vi }i∈I := {δn }n∈N (cf. 10.6.7b). In fact, for
each f ∈ H, the sequence ξ := {(u_n|f)} is an element of ℓ² by 10.2.8b, and Σ_{n=1}^∞ |(u_n|f)|² < ∞ implies that
‖ξ − Σ_{n=1}^N (u_n|f) δ_n‖² = Σ_{n=N+1}^∞ |(u_n|f)|² → 0 as N → ∞.
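In a finite-dimensional model, the coordinate mapping f ↦ {(u_n|f)} of remarks b and c is just multiplication by the adjoint of the matrix whose columns are the u_n, and its unitarity can be checked directly. A sketch (assuming NumPy; a randomly generated real orthonormal basis):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))   # columns: a c.o.n.s. {u_n}

f = rng.standard_normal(N)
g = rng.standard_normal(N)

# U f := ((u_1|f), ..., (u_N|f)) computed all at once as Q^T f (real case).
xi, eta = Q.T @ f, Q.T @ g

assert np.isclose(np.dot(xi, eta), np.dot(f, g))        # (Uf|Ug) = (f|g)
assert np.isclose(np.linalg.norm(xi), np.linalg.norm(f))
```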
(d) The reason why, notwithstanding remark a, other separable Hilbert spaces are
worth studying besides CN and ℓ2 is that there are problems which can be
formulated in separable Hilbert spaces and which, although in their abstract
form they are the same in all isomorphic Hilbert spaces, are actually easier to
solve or even to phrase in certain Hilbert spaces than in others.
Proof. If M = {0_X} then all the conditions of the statement are true. Therefore, suppose M ≠ {0_X}.
a ⇒ [b, c, d]: Assume condition a, let N be the linear dimension of
(M, σM×M , µC×M ), and let {f1 , ..., fN } be a linear basis in M . Then by 10.2.6
there exists an o.n.s. {u1 , ..., uN } in X such that
L{u1, ..., uN } = L{f1 , ..., fN } = M.
Now suppose that {gn } is a Cauchy sequence in M . Then,
∀n ∈ N, ∃(α_1^{(n)}, ..., α_N^{(n)}) ∈ C^N such that g_n = Σ_{k=1}^N α_k^{(n)} u_k,
and, by 10.2.3,
Σ_{k=1}^N |α_k^{(n)} − α_k^{(m)}|² = ‖g_n − g_m‖² → 0 as n, m → ∞.
Thus, for all k ∈ {1, ..., N}, the sequence {α_k^{(n)}} is a Cauchy sequence in C and therefore there exists α_k ∈ C such that α_k = lim_{n→∞} α_k^{(n)}. By the continuity of
vector sum and of scalar multiplication, this implies that
Σ_{k=1}^N α_k u_k = lim_{n→∞} g_n.
This proves that the metric space (M, dφM ×M ) is complete and consequently that
(M, σ_{M×M}, µ_{C×M}, φ_{M×M}) is a Hilbert space. From 2.6.6a it follows that M is a closed subset of X, i.e. a subspace of X.
This implies that f = 0M . By 10.6.4, this proves that {u1 , ..., uN } is a c.o.n.s.
in the Hilbert space (M, σM×M , µC×M , φM×M ). Therefore, the Hilbert space
(M, σM×M , µC×M , φM×M ) is separable (cf. 10.7.5) and its orthogonal dimension
is N .
b ⇒ a: This is proved by 10.7.9b.
Proof. From 10.8.1 we obtain immediately that X is a separable Hilbert space and
that its orthogonal dimension is finite.
Let M be a linear manifold in X. If M = {0X } then M is a subspace. Now
suppose M ≠ {0_X}. If we had proved in Section 3.1 that every linear manifold in a
finite-dimensional linear space was finite-dimensional then we would have that M
is a subspace of X by 10.8.1. However we did not prove that result and therefore
we must take a different tack. From 2.3.20 we have that M is separable. Hence,
there exists a countable subset S of M such that M is contained in the closure of S (cf. 2.3.13). Proceeding
as in the proof of 10.7.1 we see that there exists an o.n.s. {u_n}_{n∈I} in X such that
L{u_n}_{n∈I} = LS,
and hence such that M is contained in the closure of LS, which coincides with the closure of L{u_n}_{n∈I}.
Now, {un }n∈I is a linearly independent subset of X (cf. 10.2.2) and hence it must
be a finite set (cf. 3.1.16). Then, L{un }n∈I is a finite-dimensional linear manifold
in X and hence it is a subspace of X by 10.8.1. Thus,
L{u_n}_{n∈I} coincides with its closure,
and hence
M ⊂ L{u_n}_{n∈I}.
Since M is a linear manifold containing S, we have
L{u_n}_{n∈I} = LS ⊂ M
and hence M = L{u_n}_{n∈I}. This proves that M is a subspace of X.
10.8.3 Proposition.
(A) Let A be a linear operator from an inner product space X to a normed space
Y and suppose that the linear manifold DA is finite-dimensional. Then A is
bounded, and hence continuous.
(B) We say that a Hilbert space is finite-dimensional if it is separable and its or-
thogonal dimension is finite; in view of 10.7.9b and 10.8.2, this is equivalent
to its being finite dimensional as a linear space.
Every linear operator in a finite-dimensional Hilbert space is bounded.
Proof. A: From 10.8.1 we have that DA is a separable Hilbert space and that its
orthogonal dimension is finite. Then, let {u1 , ..., uN } be a c.o.n.s. in the Hilbert
space DA , and define
K := max{kAun k : n ∈ {1, ..., N }}.
We have, for all f ∈ DA ,
‖Af‖ = ‖A Σ_{n=1}^N (u_n|f) u_n‖ ≤ Σ_{n=1}^N |(u_n|f)| ‖Au_n‖
≤ K Σ_{n=1}^N |(u_n|f)| ≤ K√N (Σ_{n=1}^N |(u_n|f)|²)^{1/2} = K√N ‖f‖,
where 10.6.4b,d and the Schwarz inequality in CN (cf. 10.3.8c) have been used.
This proves that A is bounded, and hence continuous (cf. 4.2.2).
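The bound obtained in the proof can be observed numerically. A sketch (assuming NumPy; the operator is a random matrix on a 5-dimensional space, real entries for simplicity, and the c.o.n.s. is the standard basis, so K is the largest column norm):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
A = rng.standard_normal((N, N))       # a linear operator, in the standard basis

# K := max_n ||A e_n|| is the largest column norm of A.
K = max(np.linalg.norm(A[:, n]) for n in range(N))

# ||A f|| <= K * sqrt(N) * ||f|| for every f, as in the proof.
for _ in range(100):
    f = rng.standard_normal(N)
    assert np.linalg.norm(A @ f) <= K * np.sqrt(N) * np.linalg.norm(f) + 1e-12
```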
B: Let A be a linear operator in a finite-dimensional Hilbert space. The domain
DA is a linear manifold in H, and hence it is a subspace of H (cf. 10.8.2). Therefore,
the orthogonal dimension of DA is finite (cf. 10.7.12b). Then, proceeding as in the
proof of statement A, we see that the operator A is bounded.
O_E(H) ∋ A ↦ U A U⁻¹ ∈ O_E(C^N)
is an isomorphism from the associative algebra OE (H) (cf. 3.3.7) onto the asso-
ciative algebra OE (CN ). The composition of this isomorphism with the one from
OE (CN ) onto M(N ) mentioned above is obviously an isomorphism ΦU from OE (H)
onto M(N ). Now, for A ∈ OE (H), ΦU (A) is the element [αik ] of M(N ) such that,
for all (x_1, ..., x_N) ∈ C^N,
U A U⁻¹(x_1, ..., x_i, ..., x_N) = (Σ_{k=1}^N α_{1k} x_k, ..., Σ_{k=1}^N α_{ik} x_k, ..., Σ_{k=1}^N α_{Nk} x_k);
we also have
U A U⁻¹(x_1, ..., x_i, ..., x_N) = U A Σ_{k=1}^N x_k u_k = U Σ_{k=1}^N x_k A u_k
= ((u_1|Σ_{k=1}^N x_k A u_k), ..., (u_i|Σ_{k=1}^N x_k A u_k), ..., (u_N|Σ_{k=1}^N x_k A u_k))
= (Σ_{k=1}^N (u_1|Au_k) x_k, ..., Σ_{k=1}^N (u_i|Au_k) x_k, ..., Σ_{k=1}^N (u_N|Au_k) x_k);
this proves that ΦU (A) = [(ui |Auk )]. We underline the fact that the isomorphism
ΦU depends in a crucial way on the c.o.n.s. {u1 , ..., uN } in H that was chosen in
order to define the isomorphism U .
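The formula Φ_U(A) = [(u_i|Au_k)] can be checked in coordinates: if Q is the matrix whose columns are the chosen c.o.n.s., then U = Q* and the matrix of A in that basis is Q*AQ. A sketch (assuming NumPy; the basis and the operator are randomly generated):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Q, _ = np.linalg.qr(M)
u = [Q[:, k] for k in range(N)]       # a c.o.n.s. {u_1, ..., u_N} in H = C^N

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

# Matrix elements Phi_U(A)[i, k] = (u_i | A u_k).
Phi = np.array([[np.vdot(ui, A @ uk) for uk in u] for ui in u])

# Equivalently Phi = U A U^{-1} with U f := ((u_1|f), ..., (u_N|f)), i.e. U = Q*.
U = Q.conj().T
assert np.allclose(Phi, U @ A @ np.linalg.inv(U))
```

Note how Φ_U depends on the chosen basis: a different Q yields a different, though similar, matrix.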
difference that we find it more convenient to arrange his reasoning in two theorems
instead of one.
10.9.2 Proposition. Let H and H′ be Hilbert spaces and suppose that there exists a
linear or an antilinear operator U : H → H′ which fulfils either one of the following
conditions:
(U f |U g) = (f |g) , ∀f, g ∈ H, if U is linear;
(U f |U g) = (g|f ) , ∀f, g ∈ H, if U is antilinear.
Then the mapping
ΦU : Hq → Hq′
[f] ↦ Φ_U([f]) := [U f]
10.9.3 Theorem. Let H and H′ be Hilbert spaces and suppose that there exists a
mapping Φ : Hq → Hq′ which has the following properties:
Φ(a[f ]) = aΦ([f ]), ∀a ∈ [0, ∞), ∀f ∈ H;
τ ′ (Φ([f ]), Φ([g])) = τ ([f ], [g]), ∀f, g ∈ H.
Then the following properties are true:
(A) The mapping Φ is injective.
(B) There exists a linear or an antilinear operator U : H → H′ which is so that
ΦU = Φ, i.e. [U f ] = Φ([f ]) for all f ∈ H,
and which fulfils either one of the following conditions:
(U f |U g) = (f |g) , ∀f, g ∈ H, if U is linear;
(U f |U g) = (g|f ) , ∀f, g ∈ H, if U is antilinear.
(C) If the mapping Φ is surjective onto Hq′ then the operator U is surjective onto
H′ and hence it is a unitary or an antiunitary operator, i.e. U ∈ UA(H, H′ ).
(D) If H is not one-dimensional as a linear space, and if a mapping V : H → H′
is such that
V (f + g) = V f + V g, ∀f, g ∈ H,
and
[V f ] = Φ([f ]), ∀f ∈ H,
then there exists z ∈ T so that V f = zU f for all f ∈ H.
Proof. A: We have
‖Φ([f])‖² = τ′(Φ([f]), Φ([f])) = τ([f], [f]) = ‖[f]‖², ∀f ∈ H. (1)
This implies that, for f ∈ H,
Φ([f ]) = [0H′ ] iff f = 0H . (2)
Moreover, if f, g ∈ H − {0H } are so that Φ([f ]) = Φ([g]), then 1 implies that
τ′(Φ([f]), Φ([g])) = ‖[f]‖² = ‖[g]‖²,
and hence
‖f‖ ‖g‖ = ‖[f]‖ ‖[g]‖ = τ′(Φ([f]), Φ([g])) = τ([f], [g]) = |(f|g)|;
by 10.1.7b and 3.1.14, this implies that
∃z ∈ T such that f = zg, i.e. [f ] = [g].
This proves that the mapping Φ is injective.
B: We prove in what follows the existence of U with the required properties. If
H is a zero Hilbert space, this is true trivially. In the other cases, the proof is by
construction.
First we assume that, as a linear space, H is one-dimensional, and we choose an
element u ∈ H such that ‖u‖ = 1. Then,
∀f ∈ H, ∃!α ∈ C so that f = αu.
Therefore, if we choose u′ ∈ Φ([u]), we can define the mappings
U_1 : H → H′
αu ↦ U_1(αu) := αu′
and
U_2 : H → H′
αu ↦ U_2(αu) := ᾱu′.
The mapping U1 is linear since
U1 (α(βu) + γ(δu)) = (αβ + γδ)u′ = α(βu′ ) + γ(δu′ )
= αU1 (βu) + γU1 (δu), ∀α, β, γ, δ ∈ C;
we also have
[U1 (αu)] = [αu′ ] = |α|[u′ ] = |α|Φ([u]) = Φ(|α|[u]) = Φ([αu]), ∀α ∈ C,
and
(U_1(αu)|U_1(βu)) = (αu′|βu′) = ᾱβ = (αu|βu), ∀α, β ∈ C,
which is true since 1 implies that
‖u′‖ = ‖Φ([u])‖ = ‖[u]‖ = ‖u‖ = 1.
Thus, both the mappings U1 and U2 have the properties required for U in the
statement, and the proof for the one-dimensional case is concluded.
In what follows we assume that H, as a linear space, is neither zero-dimensional
nor one-dimensional. During the construction of U we shall need the results that
we collect in the following preliminary remarks.
Remark 1: Suppose that f ∈ H is so that U f has been defined and [U f ] = Φ([f ]).
If g ∈ H is also so that U g has been defined and [U g] = Φ([g]), then
|(Uf|Ug)| = τ′([Uf], [Ug]) = τ′(Φ([f]), Φ([g])) = τ([f], [g]) = |(f|g)|. (3)
Hence, for g := f ,
‖Uf‖ = ‖f‖. (4)
Now suppose f ≠ 0_H and also that, for all α ∈ C, U(αf) has been defined and
[U (αf )] = Φ([αf ]); then
U (αf ) ∈ Φ([αf ]) = |α|Φ([f ]) = |α|[U f ] = [αU f ], ∀α ∈ C,
and hence
∀α ∈ C, ∃z ∈ T so that U (αf ) = zαU f,
and hence
∀α ∈ C, ∃!αf ∈ C so that |αf | = |α| and U (αf ) = αf U f, (5)
where uniqueness holds since Uf ≠ 0_{H′} (cf. 4). This defines the function
C ∋ α ↦ χ_f(α) := α_f ∈ C.
Obviously, χf (1) = 1.
Remark 2: Let {ui }i∈I be a finite o.n.s. in H. If we choose u′i ∈ Φ([ui ]) for each
i ∈ I, then {u′i }i∈I is an o.n.s. in H′ since
|(u_i′|u_j′)| = τ′(Φ([u_i]), Φ([u_j])) = τ([u_i], [u_j]) = |(u_i|u_j)|, ∀i, j ∈ I.
If f = Σ_{i∈I} α_i u_i with α_i ∈ C for all i ∈ I, then for each f′ ∈ Φ([f]) we have
f′ = Σ_{i∈I} α_i′ u_i′ with α_i′ ∈ C such that |α_i′| = |α_i| for all i ∈ I. In fact, 1 implies
‖f′‖² = ‖f‖² = Σ_{i∈I} |α_i|²,
and also, with α_i′ := (u_i′|f′) for all i ∈ I,
‖f′‖² = ‖f′ − Σ_{i∈I} α_i′ u_i′ + Σ_{i∈I} α_i′ u_i′‖²
= ‖f′ − Σ_{i∈I} α_i′ u_i′‖² + ‖Σ_{i∈I} α_i′ u_i′‖² = ‖f′ − Σ_{i∈I} α_i′ u_i′‖² + Σ_{i∈I} |α_i′|²
Now suppose that g ′′ ∈ Φ([g]) is such that u′ + g ′′ ∈ Φ([u + g]). Then there exists
z ∈ T so that
u′ + g ′′ = z(u′ + g ′ ), i.e. (1 − z)u′ = zg ′ − g ′′ ;
since
| (u′ |g ′ ) | = | (u′ |g ′′ ) | = τ ′ (Φ([u]), Φ([g])) = τ ([u], [g]) = | (u|g) | = 0,
we have
(1 − z)u′ = 0H′ and zg ′ = g ′′ ,
and hence z = 1 and g ′′ = g ′ . Thus, 6 is proved.
Moreover, we note that
∃!g ′ ∈ Φ([0H ]) such that u′ + g ′ ∈ Φ([u + 0H ]);
indeed, the vector g ′ := 0H′ satisfies the above condition trivially since Φ([0H ]) =
[0H′ ] (cf. 2), and it is the only one that does so since the equivalence class [0H′ ]
contains just the vector 0H′ .
Thus, we have proved that
∀g ∈ {u}⊥, ∃!g ′ ∈ Φ([g]) such that u′ + g ′ ∈ Φ([u + g]).
Therefore we can define, for each g ∈ {u}⊥ ,
U g := g ′ and U (u + g) := u′ + g ′
if g ′ is the unique element of Φ([g]) such that u′ + g ′ ∈ Φ([u + g]). Since g ′ = 0H′ if
g = 0H (see above), this entails
U 0H = 0H′ and U u = u′ ,
and hence
U (u + g) = U u + U g, ∀g ∈ {u}⊥. (7)
We point out that
[U g] = Φ([g]), ∀g ∈ {u}⊥ , (8)
and
[U (u + g)] = Φ([u + g]), ∀g ∈ {u}⊥ . (9)
Step 3: In this step we prove that, for each v ∈ {u}⊥ such that kvk = 1,
either χv (α) = α for all α ∈ C or χv (α) = α for all α ∈ C. (10)
Let g and h be elements of {u}⊥ . From 7, 9, 3 we have
| (U u + U g|U u + U h) |2 = | (u + g|u + h) |2 ;
from 8, 9, 3 we have
(U u|U g) = (U u|U h) = 0;
Let v be an element of {u}⊥ such that kvk = 1. Then {u}⊥ = {αv : α ∈ C}.
For g, h ∈ {u}⊥ , by 8, 4, 5, 16, 17, 18 we have, if β, γ ∈ C are so that g = βv and
h = γv:
U (g + h) = U (βv + γv) = U ((β + γ)v) = χv (β + γ)U v
= (χv (β) + χv (γ))U v = χv (β)U v + χv (γ)U v
= U (βv) + U (γv) = U (g) + U (h);
(U g|U h) = (χ_{u_1}(β_1) U u_1 + χ_{u_1}(β_2) U u_2 | χ_{u_1}(γ_1) U u_1 + χ_{u_1}(γ_2) U u_2)
= χ_{u_1}(β̄_1) χ_{u_1}(γ_1) + χ_{u_1}(β̄_2) χ_{u_1}(γ_2) = χ_{u_1}(β̄_1 γ_1 + β̄_2 γ_2) (28)
= χ_{u_1}((β_1 u_1 + β_2 u_2 | γ_1 u_1 + γ_2 u_2)) = χ_{u_1}((g|h)).
Now we are ready to prove 19, 20, 21 for all g, h ∈ {u}⊥. Fix g0 ∈ {u}⊥ − {0H } and
define u_1 := (1/‖g_0‖) g_0. For each h ∈ {u}⊥ there exists u_2 ∈ {u}⊥ so that {u_1, u_2} is
an o.n.s. and h ∈ L{u1 , u2 }: u2 is obtained as in 10.2.6 with I := {1, 2}, f1 := g0 ,
and either f2 := h if h is not a multiple of g0 or f2 any element of {u}⊥ which is
not a multiple of g0 if h is a multiple of g0 . Then, from 26 we obtain
U (g0 + h) = U g0 + U h, ∀h ∈ {u}⊥ .
Since g0 is an arbitrary element of {u}⊥ − {0H } and since 19 holds trivially for all
h ∈ {u}⊥ if g = 0H (recall that U 0H = 0H′ ), this proves 19 for all g, h ∈ {u}⊥.
Furthermore, from 27 we obtain 20 for all h ∈ {u}⊥, with χ := χu1 . Finally, from
28 we obtain
(U g0 |U h) = χu1 ((g0 |h)), ∀h ∈ {u}⊥ .
If g is any element of {u}⊥ − {0H } different from g0 , proceeding as above we obtain
(U g|U h) = χ_{u_{1,g}}((g|h)), ∀h ∈ {u}⊥,
with u_{1,g} := (1/‖g‖) g. However, 27 proves that χ_h = χ_{u_1} for all h ∈ {u}⊥ − {0_H}.
Thus we have
(U g|U h) = χu1 ((g|h)), ∀g ∈ {u}⊥ − {0H }, ∀h ∈ {u}⊥ .
Since 21 holds trivially for all h ∈ {u}⊥ if g = 0H , this proves 21 for all g, h ∈ {u}⊥,
with χ := χu1 . Finally, recall 10.
Step 5: In this step we define U f for all f ∈ H and conclude the proof of part
B of the statement.
From 10.4.1 we have
∀f ∈ H, ∃!(f1 , f2 ) ∈ V {u} × (V {u})⊥ so that f = f1 + f2 .
Since V {u} = {αu : α ∈ C} (cf. 4.1.15) and (V {u})⊥ = {u}⊥ (cf. 10.2.11), we have
∀f ∈ H, ∃!(α, g) ∈ C × {u}⊥ so that f = αu + g.
Therefore, we can define U f for all f ∈ H by letting
U (αu + g) := χ(α)U u + U g, ∀α ∈ C, ∀g ∈ {u}⊥ ,
where χ is the function of step 4. For α = 0 or α = 1, this definition coincides with
the ones already given in step 2 since χ(0) = 0 and χ(1) = 1 (cf. 7).
We have already noted that [U g] = Φ([g]) for all g ∈ {u}⊥ (cf. 8). Moreover,
for every α ∈ C − {0} and every g ∈ {u}⊥ we have
[U(αu + g)] = [χ(α)(U u + U(α⁻¹g))] = |χ(α)| [U(u + α⁻¹g)]
= |α| Φ([u + α⁻¹g]) = Φ([αu + g]),
where 20, 7, 9 have been used. Thus,
[U f ] = Φ([f ]), ∀f ∈ H.
Finally, for f1 , f2 ∈ H we have, if α1 , α2 ∈ C and g1 , g2 ∈ {u}⊥ are so that f1 =
α1 u + g1 and f2 = α2 u + g2 ,
U (f1 + f2 ) = U ((α1 + α2 )u + (g1 + g2 ))
= χ(α1 + α2 )U u + U (g1 + g2 )
= χ(α1 )U u + U g1 + χ(α2 )U u + U g2
= U (α1 u + g1 ) + U (α2 u + g2 ) = U f1 + U f2 ,
where 16 and 19 have been used; we also have
U (αf1 ) = U (αα1 u + αg1 ) = χ(αα1 )U u + U (αg1 )
= χ(α)(χ(α1 )U u + U g1 ) = χ(α)U f1 , ∀α ∈ C,
10.9.5 Proposition. Let H and H′ be non-zero Hilbert spaces and suppose that the
family UA(H, H′ ) is not empty. For every U ∈ UA(H, H′ ), the mapping
ωU : Ĥ → Ĥ′
[u] ↦ ω_U([u]) := [U u]
Thus, Φω has all the properties that were assumed for Φ in 10.9.3. Then, Φω is
injective by 10.9.3A and so is ω. Thus, ω is an isomorphism from Ĥ onto Ĥ′ .
Further, if f′ ∈ H′ − {0_{H′}} then (1/‖f′‖) f′ ∈ H̃′ and hence (since ω is surjective onto Ĥ′) there exists u ∈ H̃ so that
ω([u]) = [(1/‖f′‖) f′],
and hence so that
Φ_ω([‖f′‖ u]) = ‖f′‖ Φ_ω([u]) = ‖f′‖ ω([u]) = ‖f′‖ [(1/‖f′‖) f′] = [f′].
This proves that the mapping Φω is surjective onto Hq′ . Then, by 10.9.3B,C there
exists U ∈ UA(H, H′ ) such that
[U f ] = Φω ([f ]), ∀f ∈ H,
and hence such that
[U u] = Φω ([u]) = ω([u]), ∀u ∈ H̃.
If V ∈ UA(H, H′) is such that ω_V = ω, then we have:
[V f] = [‖f‖ V((1/‖f‖) f)] = ‖f‖ [V((1/‖f‖) f)] = ‖f‖ ω_V([(1/‖f‖) f])
= ‖f‖ ω([(1/‖f‖) f]) = Φ_ω([f]), ∀f ∈ H − {0_H};
10.9.7 Remarks.
(a) If H and H′ are non-zero Hilbert spaces and ω is an isomorphism from Ĥ onto
Ĥ′ , we say that an element U of UA(H, H′ ) implements ω if ωU = ω.
Suppose that H and H′ are non-zero Hilbert spaces and that the projective
spaces (Ĥ, τ ), (Ĥ′ , τ ′ ) are isomorphic. Further, suppose that H and H′ are
not one-dimensional as linear spaces. If U is an element of UA(H, H′ ) which
implements an isomorphism ω from Ĥ onto Ĥ′ , then it is clear from 10.9.5 and
10.9.6 that another element V of UA(H, H′ ) implements ω iff there exists z ∈ T
so that V = zU . Then, the operators in UA(H, H′ ) which implement a given
isomorphism are either all unitary or all antiunitary.
(b) Let H be a non-zero Hilbert space. It is obvious that the mapping
UA(H) ∋ U ↦ ω_U ∈ Aut Ĥ
is a homomorphism from the group UA(H) (cf. 10.3.16d) to the group Aut Ĥ.
Wigner’s theorem proves that this homomorphism is surjective onto Aut Ĥ.
Chapter 11
L2 Hilbert Spaces
This chapter deals with concrete realizations of the abstract concept of a Hilbert space; these realizations are used in most applications of Hilbert space theory.
11.1 L2 (X, A, µ)
a condition for the equivalence class [ϕ] even though it is expressed through a
particular representative of it. Now, the above implication follows from 8.1.17c.
Then, 11.1.2a proves that L2 (X, A, µ) is a linear manifold in M (X, A, µ).
To prove the consistency of the definition of the function φ, first we note
that 11.1.2b implies that ϕψ ∈ L1 (X, A, µ) for all ϕ, ψ ∈ L2 (X, A, µ). Next, if
ϕ′ , ϕ, ψ ′ , ψ ∈ L2 (X, A, µ) are so that ϕ′ ∼ ϕ and ψ ′ ∼ ψ, then
ϕ′(x)ψ′(x) = ϕ(x)ψ(x) µ-a.e. on D_{ϕψ} ∩ D_{ϕ′ψ′}
(the proof is similar to the one given in 8.2.13 for ϕ′ + ψ′ ∼ ϕ + ψ) and hence
∫_X ϕ̄′ψ′ dµ = ∫_X ϕ̄ψ dµ
by 8.2.7. Thus, the number ∫_X ϕ̄ψ dµ depends only on the equivalence classes [ϕ] and [ψ], and not on the particular representatives ϕ and ψ through which it is obtained.
As to the properties listed in 10.1.3 which φ must have in order to be an inner
product, ip1 follows from 8.2.9, ip2 follows from the fact that
∫_X ϕ̄ψ dµ is the complex conjugate of ∫_X ψ̄ϕ dµ, ∀ϕ, ψ ∈ L²(X, A, µ)
(cf. 8.2.3), and ip3 is obvious. Finally, for ϕ ∈ L²(X, A, µ) we have
∫_X |ϕ|² dµ = ([ϕ]|[ϕ]) = 0 ⇒ ϕ(x) = 0 µ-a.e. on D_ϕ ⇒ [ϕ] = 0_{L²(X,A,µ)}
(cf. 8.1.18a), and this shows that φ has property ip4.
and then writes L2 (X, A, µ) := L2 (X, A, µ)/∼ for the quotient set. Then one
defines vector sum, scalar multiplication and inner product by

[ϕ] + [ψ] := [ϕ + ψ], α[ϕ] := [αϕ], ([ϕ]|[ψ]) := ∫_X ϕ̄ψ dµ.

This way of defining the inner product space L2 (X, A, µ) is more frequent than
ours.
In a similar way one can give an alternative but equivalent definition of the
normed space L1 (X, A, µ) (cf. 8.2.15).
11.1.6 Theorem. The normed space L1 (X, A, µ) is a Banach space and the inner
product space L2 (X, A, µ) is a Hilbert space.
Proof. In what follows, p stands for either 1 or 2. We need to prove that the metric
space (Lp (X, A, µ), d) is complete, where d is the distance on Lp (X, A, µ) defined
by
d([ϕ], [ψ]) := ‖[ϕ] − [ψ]‖ = ( ∫_X |ϕ − ψ|^p dµ )^{1/p} , ∀[ϕ], [ψ] ∈ Lp (X, A, µ).
Then, let {[ϕn ]} be a Cauchy sequence in Lp (X, A, µ), and for each ε > 0 let Nε ∈ N
be so that

n, m ≥ Nε ⇒ ‖[ϕn ] − [ϕm ]‖ < ε.
11.1.7 Remark. From the proof of 11.1.6 we obtain the following result:
if {[ϕn ]} is a convergent sequence in L1 (X, A, µ) or in L2 (X, A, µ) and
[ϕ] := lim_{n→∞} [ϕn ], then there exists a subsequence {[ϕnk ]} so that

ϕ(x) = lim_{k→∞} ϕnk (x) µ-a.e. on X.
Proof. Since S(X, A, µ) and L2 (X, A, µ) are linear manifolds in the linear space
M (X, A, µ), the same holds true for their intersection (cf. 3.1.5). Then S 2 (X, A, µ)
is a linear manifold in the linear space L2 (X, A, µ) as well (cf. 3.1.4b).
Let [ϕ] ∈ L2 (X, A, µ) and assume that the representative ϕ is an element of
M(X, A) (cf. 8.2.12). Then, by 6.2.27 there exists a sequence {ψn } in S(X, A)
such that
|ψn (x)| ≤ |ϕ(x)|, ∀x ∈ X, ∀n ∈ N,
lim_{n→∞} ψn (x) = ϕ(x), ∀x ∈ X,
and hence such that
|ψn (x) − ϕ(x)|² ≤ 4|ϕ(x)|², ∀x ∈ X,
lim_{n→∞} |ψn (x) − ϕ(x)|² = 0, ∀x ∈ X.
Then ψn ∈ L2 (X, A, µ) by 8.2.5, and by 8.2.11 we have
lim_{n→∞} ‖[ψn ] − [ϕ]‖ = lim_{n→∞} ( ∫_X |ψn − ϕ|² dµ )^{1/2} = 0.

In view of 2.3.12, this proves that S² (X, A, µ) is dense in L2 (X, A, µ).
is consistent.
The definition of the mapping
V : L2 (X1 , A1 , µ1 ) → L2 (X2 , A2 , µ2 )
[ψ] 7→ V [ψ] := [ψ ◦ π −1 ]
is also consistent.
The mappings U and V are unitary operators and V = U −1 .
By 1.2.16b, this proves that U is injective and U −1 = V , and therefore also that
RU = L2 (X1 , A1 , µ1 ). Moreover, for all [ϕ], [ψ] ∈ L2 (X2 , A2 , µ2 ) we have
∫_{X1} (ϕ ∘ π)(ψ ∘ π) dµ1 = ∫_{X2} ϕψ dµ2
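Though its full statement falls on a page not reproduced here, 11.1.10 is used below through the identity just displayed, and that identity can be probed numerically. A sketch, assuming π(x) = 2π(x − a)/(b − a) (the map used for L2 (a, b) in the next section) and arbitrary sample functions `phi`, `psi`; the inner products conjugate the first factor, as usual:

```python
import numpy as np

a, b = 1.0, 4.0
x = np.linspace(a, b, 200001)                  # grid on X1 = [a, b]
t = 2*np.pi*(x - a)/(b - a)                    # t = pi(x): grid on X2 = [0, 2*pi]
integ = lambda y, g: np.sum((y[1:] + y[:-1]) / 2 * np.diff(g))   # trapezoid rule

phi = lambda s: np.cos(s) + 1j*np.sin(2*s)     # arbitrary sample functions on [0, 2*pi]
psi = lambda s: np.exp(1j*s)

# integral over X1 of conj(phi o pi)(psi o pi) d(mu_1), mu_1 = Lebesgue on [a, b]
lhs = integ(np.conj(phi(t))*psi(t), x)
# integral over X2 of conj(phi) psi d(mu_2), mu_2 = ((b-a)/(2*pi)) * Lebesgue on [0, 2*pi]
rhs = (b - a)/(2*np.pi) * integ(np.conj(phi(t))*psi(t), t)
print(abs(lhs - rhs))  # ~ 0: composition with pi preserves inner products
```

The agreement is exact up to quadrature rounding, because µ2 was chosen as the image of µ1 under π.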
11.2 L2 (a, b)
In this section, a and b are two real numbers such that a < b.
We write M(a, b) := M(X, A, µ), L1 (a, b) := L1 (X, A, µ), L2 (a, b) := L2 (X, A, µ),
L2 (a, b) := L2 (X, A, µ) if X := [a, b], A := (A(dR ))[a,b] , µ := m[a,b] ,
where m[a,b] is the Lebesgue measure on [a, b] (cf. 9.3.1). Moreover, we denote
(A(dR ))[a,b] by the symbol A[a,b] .
11.2.1 Theorem. The inclusion C(a, b) ⊂ L2 (a, b) holds true. If the mapping ι is
defined by
ι : C(a, b) → L2 (a, b)
ϕ 7→ ι(ϕ) := [ϕ],
then the pair (L2 (a, b), ι) is a completion of the inner product space C(a, b) (for
which, cf. 10.1.5b).
Proof. For every ϕ ∈ C(a, b), ϕ ∈ M(a, b) (cf. 6.2.8). Moreover, the function |ϕ|2
is bounded since it is an element of C(a, b) (cf. 3.1.10f), and therefore |ϕ|2 ∈ L1 (a, b)
(cf. 8.2.6), and hence ϕ ∈ L2 (a, b).
If the inner products in C(a, b) and in L2 (a, b) are denoted by φ and by φ̂
respectively, directly from their definitions we have
φ̂(ι(ϕ), ι(ψ)) = ∫_[a,b] ϕψ dm[a,b] = φ(ϕ, ψ), ∀ϕ, ψ ∈ C(a, b).
11.2.2 Remarks.
(a) Since (ι(C(a, b)))⁻ = L2 (a, b) and ι(C(a, b)) ≠ L2 (a, b) (here (·)⁻ denotes
closure), ι(C(a, b)) is not a closed subset of L2 (a, b) (cf. 2.3.9c). Hence, the inner
product space ι(C(a, b)) (cf. 10.3.5) is not a Hilbert space (cf. 2.6.6a). Since
the inner product spaces ι(C(a, b)) and C(a, b) are isomorphic (cf. 10.3.5), this
furnishes another proof (besides the one given in 10.4.2c) that the inner product
space C(a, b) is not a Hilbert space (cf. 10.1.21).
(b) The mapping ι of 11.2.1 is a linear operator and it is injective (cf. 10.3.5). Thus,
if ϕ, ψ ∈ C(a, b) are such that [ϕ] = [ψ] then ϕ = ψ. Namely, if an element of
L2 (a, b) contains a continuous function then this function is the only continuous
one it contains.
(c) The mapping ι of 11.2.1 is not surjective onto L2 (a, b). To see this, let x0 ∈ (a, b)
and consider the element [χ[a,x0 ] ] of L2 (a, b). If ϕ ∈ [χ[a,x0 ] ] and Dϕ = [a, b],
then for each n ∈ N large enough there exists x′n ∈ (x0 − 1/n, x0 ) such that
ϕ(x′n ) = 1 (otherwise we should have ϕ(x) ≠ χ[a,x0 ] (x) for all x ∈ (x0 − 1/n, x0 ),
and this would be in contradiction with ϕ ∼ χ[a,x0 ] ), and similarly there exists
x′′n ∈ (x0 , x0 + 1/n) such that ϕ(x′′n ) = 0. Hence, there exist two sequences {x′n }
and {x′′n } in [a, b] such that lim_{n→∞} x′n = lim_{n→∞} x′′n = x0 , lim_{n→∞} ϕ(x′n ) = 1,
lim_{n→∞} ϕ(x′′n ) = 0. By 2.4.2, this proves that ϕ is not continuous at x0 , and
hence that ϕ ∉ C(a, b). Another proof that ι is not surjective is by contraposition
as follows: if ι were surjective onto L2 (a, b) then ι would be an isomorphism from
C(a, b) onto L2 (a, b) and hence C(a, b) would be a Hilbert space (cf. 10.1.21),
which is not true.
Proof. The proof is based on 11.1.10. Let X1 := [a, b], A1 := A[a,b] , µ1 := m[a,b] ,
X2 := [0, 2π], A2 := A[0,2π] , µ2 := ((b − a)/2π) m[0,2π] (for which, cf. 8.3.5b with
µ := m[0,2π] and ν the null measure on A[0,2π] ). In view of 9.2.1a and 9.2.2a we
have, for every E ∈ A[0,2π] ,

((b − a)/2π) m[0,2π] (E) = ((b − a)/2π) m(E) = m( ((b − a)/2π) E + a )
  = m[a,b] ( ((b − a)/2π) E + a ) = m[a,b] (π⁻¹(E)).
Then, 11.1.10 proves that the mapping

W : L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) → L2 (a, b)
    [ϕ] ↦ W [ϕ] := [ϕ ∘ π]

is a unitary operator. Since M([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) = M(0, 2π)
and L1 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) = L1 (0, 2π) (cf. 8.3.5b), and since
trivially, for E ∈ A[0,2π] , ((b − a)/2π) m[0,2π] (E) = 0 iff m[0,2π] (E) = 0, we have
L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ) = L2 (0, 2π). Thus, the mapping

T : L2 (0, 2π) → L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] )
    [ϕ] ↦ T [ϕ] := [(2π/(b − a))^{1/2} ϕ]

is defined consistently and RT = L2 ([0, 2π], A[0,2π] , ((b − a)/2π) m[0,2π] ). In view
of 8.3.5b, we also have

∫_[0,2π] (2π/(b − a))^{1/2} ϕ (2π/(b − a))^{1/2} ψ d( ((b − a)/2π) m[0,2π] ) = ∫_[0,2π] ϕψ dm[0,2π] .
wn (x) := (2/(b − a))^{1/2} sin( 2πn (x − a)/(b − a) ), ∀x ∈ [a, b].

Both the families {[un ]}n∈Z and {[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N are complete or-
thonormal systems in the Hilbert space L2 (a, b).
Proof. First we consider the special case a := 0 and b := 2π. We already know that
both the families {un }n∈Z and {u0 } ∪ {vn }n∈N ∪ {wn }n∈N are orthonormal systems
in the inner product space C(0, 2π) (cf. 10.2.5b). Hence the two families of the
statement are orthonormal systems in L2 (0, 2π) as well, owing to property co1 of
10.3.4 possessed by the mapping ι of 11.2.1. To prove that these orthonormal
systems are complete in L2 (0, 2π), we first prove that L{[un ]}n∈Z is dense in Rι .
Then, fix ϕ ∈ C(0, 2π) and ε > 0. For
n ∈ N large enough, let χn be the element of C(0, 2π) defined by

χn (x) := nx          if 0 ≤ x < 1/n,
          1           if 1/n ≤ x ≤ 2π − 1/n,
          n(2π − x)   if 2π − 1/n < x ≤ 2π.
Then, fix k ∈ N so that ‖[ϕ] − [ϕk ]‖ < ε/2. Since ϕk (0) = 0 = ϕk (2π), ϕk can be
identified in an obvious way with an element of C(T) and conversely any trigono-
metric polynomial can be identified with an element of L{un }n∈Z by 3.1.7 (for C(T)
and the trigonometric polynomials, cf. 4.3.6c). Then, 4.3.7 implies that there exists
ψ ∈ L{un }n∈Z such that

sup{|ϕk (x) − ψ(x)| : x ∈ [0, 2π)} < ε/(2(b − a)^{1/2})

(cf. 2.3.12), and hence such that

‖[ϕk ] − [ψ]‖ = ( ∫_[a,b] |ϕk − ψ|² dm )^{1/2} ≤ ( (b − a) ε²/(4(b − a)) )^{1/2} = ε/2,
Now, [ψ] ∈ L{[un ]}n∈Z by 3.1.7 since ι is a linear operator. By 2.3.12, this
proves that L{[un ]}n∈Z is dense in Rι . Since Rι is dense in L2 (0, 2π) (cf. 11.2.1),
L{[un ]}n∈Z is dense in L2 (0, 2π) by 2.3.14. Thus we have
V {[un ]}n∈Z = (L{[un ]}n∈Z)⁻ = L2 (0, 2π)
(cf. 4.1.13), and also
V ({[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N ) = L2 (0, 2π)
since
L{un }n∈Z = L({u0 } ∪ {vn }n∈N ∪ {wn }n∈N )
(cf. 10.2.5b) implies
L{[un]}n∈Z = L({[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N )
in view of 3.1.7 and of the linearity of ι. This proves that the two orthonormal
systems of the statement are complete in L2 (a, b) if a := 0 and b := 2π.
For any a, b ∈ R such that a < b, the two families of vectors of the statement
can be obtained from the same families for a := 0 and b := 2π by means of the
unitary operator U of 11.2.3. In view of 10.6.8b, this proves that they are complete
orthonormal systems in L2 (a, b).
11.2.5 Remark. In view of 10.7.5, 11.2.4 proves that the Hilbert space L2 (a, b) is
separable.
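The completeness result 11.2.4 lends itself to a numerical check. A sketch, assuming un (x) = (b − a)^{−1/2} e^{2πin(x−a)/(b−a)} — the definition of un falls on a page not reproduced here, and this normalization is chosen only to match the wn displayed above:

```python
import numpy as np

a, b = -1.0, 3.0
x = np.linspace(a, b, 400001)
dx = x[1] - x[0]
integ = lambda y: np.sum((y[1:] + y[:-1]) / 2) * dx   # trapezoid rule on [a, b]

# hypothetical normalization, consistent with the w_n of 11.2.4
u = lambda n: (b - a)**-0.5 * np.exp(2j*np.pi*n*(x - a)/(b - a))

# orthonormality: (u_m|u_n) = delta_mn (the inner product conjugates the first factor)
gram_err = max(abs(integ(np.conj(u(m))*u(n)) - (1.0 if m == n else 0.0))
               for m in range(-2, 3) for n in range(-2, 3))

# truncated Parseval identity for a smooth (b-a)-periodic sample function
phi = np.exp(np.cos(2*np.pi*(x - a)/(b - a)))
norm2 = integ(np.abs(phi)**2)
coeff2 = sum(abs(integ(np.conj(u(n))*phi))**2 for n in range(-30, 31))
print(gram_err, norm2 - coeff2)  # both ~ 0
```

For a smooth periodic sample the Fourier coefficients decay superexponentially, so 61 terms already exhaust the norm to machine precision; for a rough ϕ the truncated sum would only approach ‖ϕ‖² from below.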
sn (x) := (2/(b − a))^{1/2} sin( πn (x − a)/(b − a) ), ∀x ∈ [a, b].

Both the families {[u]} ∪ {[cn ]}n∈N and {[sn ]}n∈N are complete orthonormal systems
in the Hilbert space L2 (a, b).
(cf. 8.3.11c and note that if ψ ∈ L1 (−2π, 2π) then ψ[0,2π] ∈ L1 (0, 2π); in the
integrals we simply denote by ψ the restrictions of ψ that are actually used). Hence,
if ψ is an even function, i.e. if Dψ = [−2π, 2π] and (ψ ◦ π)(x) = ψ(−x) = ψ(x) for
all x ∈ [−2π, 0], then
∫_[−2π,2π] ψ dm[−2π,2π] = ∫_[−2π,2π] χ[−2π,0) ψ dm[−2π,2π] + ∫_[−2π,2π] χ[0,2π] ψ dm[−2π,2π]
  = ∫_[−2π,0] ψ dm[−2π,0] + ∫_[0,2π] ψ dm[0,2π] = 2 ∫_[0,2π] ψ dm[0,2π] .
If ψ is an odd function, i.e. if Dψ = [−2π, 2π] and (ψ ◦ π)(x) = ψ(−x) = −ψ(x) for
all x ∈ [−2π, 0], then
∫_[−2π,2π] ψ dm[−2π,2π] = ∫_[−2π,0] ψ dm[−2π,0] + ∫_[0,2π] ψ dm[0,2π]
  = − ∫_[0,2π] ψ dm[0,2π] + ∫_[0,2π] ψ dm[0,2π] = 0.
Now we consider the c.o.n.s. {[u0 ]} ∪ {[vn ]}n∈N ∪ {[wn ]}n∈N of 11.2.4, in the special
case a := −2π and b := 2π. We note that, for each n ∈ N,
wn (x) = (−1)^n (1/(2π))^{1/2} sin(nx/2), ∀x ∈ [−2π, 2π],
and hence

σn (x) = (−1)^n √2 wn (x), ∀x ∈ [0, 2π].
For all n, k ∈ N we have

∫_[0,2π] σn σk dm[0,2π] = (−1)^n (−1)^k 2 ∫_[0,2π] wn wk dm[0,2π]
  = (−1)^n (−1)^k ∫_[−2π,2π] wn wk dm[−2π,2π] = δnk ,
where the second equality holds because the function wn wk is even and the third
holds because {[wn ]}n∈N is an o.n.s. in L2 (−2π, 2π). Thus, {[σn ]}n∈N is an o.n.s.
in L2 (0, 2π). For each [ϕ] ∈ L2 (0, 2π), assuming for convenience that for the repre-
sentative ϕ we have Dϕ = [0, 2π] (cf. 8.2.12), we define the function
ϕ̃ : [−2π, 2π] → C
x ↦ ϕ̃(x) := −ϕ(−x) if x ∈ [−2π, 0), ϕ(x) if x ∈ [0, 2π].
From 8.3.11c we have that ϕ ◦ π ∈ M(−2π, 0) and |ϕ ◦ π|2 ∈ L1 (−2π, 0); it is then
easy to see that ϕ̃ ∈ L2 (−2π, 2π). Then we have
2 ∫_[0,2π] |ϕ|² dm[0,2π] = ∫_[−2π,2π] |ϕ̃|² dm[−2π,2π]
  = | ∫_[−2π,2π] u0 ϕ̃ dm[−2π,2π] |² + Σ_{n=1}^∞ | ∫_[−2π,2π] vn ϕ̃ dm[−2π,2π] |²
    + Σ_{n=1}^∞ | ∫_[−2π,2π] wn ϕ̃ dm[−2π,2π] |²
  = Σ_{n=1}^∞ | 2 ∫_[0,2π] wn ϕ̃ dm[0,2π] |² ,
where the first equality holds because the function |ϕ̃|2 is even, the second holds
by 10.6.4d, the third holds because the functions u0 ϕ̃ and vn ϕ̃ are odd and the
functions wn ϕ̃ are even. Thus we have
∫_[0,2π] |ϕ|² dm[0,2π] = Σ_{n=1}^∞ | √2 ∫_[0,2π] wn ϕ̃ dm[0,2π] |² = Σ_{n=1}^∞ | ∫_[0,2π] σn ϕ̃ dm[0,2π] |² .
This proves that condition 10.6.4d (with M := L2 (0, 2π)) holds true for the o.n.s.
{[σn ]}n∈N , which is therefore a c.o.n.s. in L2 (0, 2π).
Now, if U is the unitary operator of 11.2.3 then {U [σn ]}n∈N is a c.o.n.s. in
L2 (a, b) by 10.6.8b, and we note that
U [σn ] = [sn ], ∀n ∈ N.
(For the family {[u]} ∪ {[cn ]}n∈N one argues analogously with the even extension ϕ̃
of ϕ, so that the functions u0 ϕ̃ and vn ϕ̃ are even and the functions wn ϕ̃ are odd.)
11.3 L2 (R)
We write M(R) := M(X, A, µ), L1 (R) := L1 (X, A, µ), L2 (R) := L2 (X, A, µ),
L2 (R) := L2 (X, A, µ) if X := R, A := A(dR ), µ := m, where m is the Lebesgue
measure on R.
11.3.1 Proposition. The inclusion S(R) ⊂ L2 (R) holds true (for S(R), cf. 3.1.10h
and 10.1.5c). The mapping
ι : S(R) → L2 (R)
ϕ 7→ ι(ϕ) := [ϕ]
is a linear operator and
(ι(ϕ)|ι(ψ)) L2 (R) = (ϕ|ψ)S(R) , ∀ϕ, ψ ∈ S(R).
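The system {hn }n∈I used below is obtained in 10.2.7 by orthonormalizing fn = ξ^n e^{−ξ²/2}; up to signs this produces the standard normalized Hermite functions, which is the (unverified against the text) assumption behind the following numerical sketch of their orthonormality in L2 (R):

```python
import numpy as np

x = np.linspace(-15, 15, 300001)
dx = x[1] - x[0]
integ = lambda y: np.sum((y[1:] + y[:-1]) / 2) * dx   # trapezoid rule

h = [np.pi**-0.25 * np.exp(-x**2/2)]        # h_0, normalized Gaussian
h.append(np.sqrt(2.0) * x * h[0])           # h_1
for n in range(1, 6):                       # standard three-term recursion
    h.append(np.sqrt(2.0/(n + 1)) * x * h[n] - np.sqrt(n/(n + 1.0)) * h[n - 1])

gram = np.array([[integ(h[m]*h[n]) for n in range(7)] for m in range(7)])
print(np.max(np.abs(gram - np.eye(7))))  # ~ 0: the h_n are orthonormal
```

The recursion is the numerically stable way to generate normalized Hermite functions; expanding fn and orthonormalizing directly would lose precision quickly.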
Proof. It is obvious that the family {[hn ]}n∈I is an o.n.s. in L2 (R), since {hn }n∈I
is an o.n.s. in S(R) and the mapping ι of 11.3.1 preserves the inner product. To
prove that {[hn ]}n∈I is complete in L2 (R) we proceed as follows. We prove below
that
({[fn ]}n∈I )⊥ = {0L2 (R) },
where {fn }n∈I is as in 10.2.7, i.e. fn = ξ^n e^{−ξ²/2} for all n ∈ I. From the equality
L{hn }n∈I = L{fn }n∈I
in S(R) (cf. 10.2.7), we obtain the equality
L{[hn ]}n∈I = L{[fn ]}n∈I
in L2 (R), by 3.1.7 and the linearity of the mapping ι of 11.3.1. Then we have
({[hn ]}n∈I )⊥ = (L{[hn ]}n∈I )⊥ = (L{[fn ]}n∈I )⊥ = ({[fn ]}n∈I )⊥ = {0L2 (R) }
(cf. 10.2.11), and this proves that {[hn ]}n∈I is a c.o.n.s. in L2 (R) (cf. 10.6.5a).
Now we prove that ({[fn ]}n∈I )⊥ = {0L2 (R) }. Then, let [ϕ] ∈ L2 (R) be such that
([ϕ]|[fn ]) = 0, ∀n ∈ I.
We assume for convenience that for the representative ϕ we have Dϕ = R (cf.
8.2.12), and we write ϕ1 := Re ϕ and ϕ2 := Im ϕ. In what follows, fix i = 1 or
i = 2. We have ϕi ∈ L2 (R) (cf. 6.2.12 and 8.1.17b) and
([ϕi ]|[fn ]) = 0, ∀n ∈ I,
because fn is a real function. For any a ∈ R, the function e^{iaξ} e^{−ξ²/2} is an element of
S(R) (cf. 5 in 3.1.10h) and hence of L2 (R) (cf. 11.3.1). Then ϕi e^{iaξ} e^{−ξ²/2} ∈ L1 (R)
(cf. 11.1.2b) and
∫_R ϕi e^{iaξ} e^{−ξ²/2} dm = ∫_R ϕi (x) ( Σ_{n=0}^∞ (iax)^n /n! ) e^{−x²/2} dm(x)
  = ∫_R lim_{N→∞} Σ_{n=0}^N ϕi (x) ((iax)^n /n!) e^{−x²/2} dm(x)
  = lim_{N→∞} Σ_{n=0}^N ((ia)^n /n!) ∫_R ϕi (x) x^n e^{−x²/2} dm(x)
  = lim_{N→∞} Σ_{n=0}^N ((ia)^n /n!) ([ϕi ]|[fn ]) = 0,   (1)
where the third equality holds true by 8.2.11, with the function |ϕi | e^{|aξ|} e^{−ξ²/2} as
dominating function. Indeed,

| Σ_{n=0}^N ϕi (x) ((iax)^n /n!) e^{−x²/2} | ≤ |ϕi (x)| Σ_{n=0}^∞ (|ax|^n /n!) e^{−x²/2}
  = |ϕi (x)| e^{|ax|} e^{−x²/2} , ∀x ∈ R, ∀N ∈ N;
moreover, e^{|aξ|} e^{−ξ²/2} ∈ L2 (R) (it can be seen that e^{2|aξ|} e^{−ξ²} ∈ L1 (R) in the same
way as for an element of S(R) in 10.1.5c) and therefore |ϕi | e^{|aξ|} e^{−ξ²/2} ∈ L1 (R) (cf.
11.1.2b). For each l ∈ N, let the function ψl : R → C be the periodic function with
period 2l that is defined as follows for x ∈ [−l, l):
ψl (x) := 1   if ϕi (x) > 0,
          0   if ϕi (x) = 0,
          −1  if ϕi (x) < 0.
We have ψl ∈ M(R) since, if we denote by ϕi,0 the restriction of ϕi to [−l, l),
then the three sets ϕi,0^{−1}((0, ∞)), ϕi,0^{−1}({0}), ϕi,0^{−1}((−∞, 0)) are elements of A(dR ) (cf.
6.2.3, 6.2.13a, 6.1.19a) and the same is true for their translations (cf. 9.2.1a). We
also have
lim_{l→∞} ϕi (x) ψl (x) e^{−x²/2} = |ϕi (x)| e^{−x²/2} , ∀x ∈ R,
since ϕi (x)ψl (x) = |ϕi (x)| for all x ∈ [−l, l), and
|ϕi (x) ψl (x) e^{−x²/2}| ≤ |ϕi (x) e^{−x²/2}|, ∀x ∈ R,
since |ψl (x)| ≤ 1 for all x ∈ R. Hence, by 8.2.11 with the function |ϕi | e^{−ξ²/2} (which
is an element of L1 (R) by 11.1.2b) as dominating function, we have

lim_{l→∞} ∫_R ϕi ψl e^{−ξ²/2} dm = ∫_R |ϕi | e^{−ξ²/2} dm.
Therefore, if we fix ε > 0, there exists r ∈ N such that

| ∫_R |ϕi | e^{−ξ²/2} dm − ∫_R ϕi ψr e^{−ξ²/2} dm | < ε.   (2)
Now, the restriction of ψr to the interval [0, 2r) is obviously an element of L2 (0, 2r),
and therefore there is a function q := Σ_{k=−m}^m αk e^{i(π/r)kξ} such that

∫_[0,2r) |ψr − q|² dm = ∫_[0,2r] |ψr − q|² dm < ε² ( 2 Σ_{n=0}^∞ e^{−4r²n²} )^{−1}
(cf. 10.6.4b in the Hilbert space L2 (0, 2r), with the c.o.n.s. {[uk ]}k∈Z of 11.2.4).
Now, (ψr − q) e^{−ξ²/2} is an element of L2 (R) since both ψr and q are bounded functions,
and

‖[(ψr − q) e^{−ξ²/2}]‖² = ∫_R |ψr − q|² e^{−ξ²} dm
  = Σ_{n=0}^∞ ∫_[2rn,2r(n+1)) |ψr − q|² e^{−ξ²} dm + Σ_{n=0}^∞ ∫_[−2r(n+1),−2rn) |ψr − q|² e^{−ξ²} dm
  ≤ Σ_{n=0}^∞ e^{−4r²n²} ∫_[2rn,2r(n+1)) |ψr − q|² dm + Σ_{n=0}^∞ e^{−4r²n²} ∫_[−2r(n+1),−2rn) |ψr − q|² dm   (3)
  = 2 Σ_{n=0}^∞ e^{−4r²n²} ∫_[0,2r) |ψr − q|² dm < ε²,
where the second equality holds by 8.3.4a and the last equality holds because ψr
and q are periodic functions with period 2r. Moreover, from (1) we have

∫_R ϕi q e^{−ξ²/2} dm = Σ_{k=−m}^m αk ∫_R ϕi e^{i(π/r)kξ} e^{−ξ²/2} dm = 0.   (4)
11.3.4 Remark. In view of 10.7.5, 11.3.3 proves that the Hilbert space L2 (R) is
separable.
11.3.5 Theorem. The pair (L2 (R), ι), with ι defined as in 11.3.1, is a completion
of the inner product space S(R).
Proof. We already know that ι fulfils condition co1 of 10.3.4 (cf. 11.3.1). Condition
co2 follows from 11.3.3 and 10.6.5b.
11.3.6 Remarks.
(a) By reasoning as in 11.2.2a, we can see that the inner product space S(R) (cf.
10.1.5c) is not a Hilbert space.
(b) The mapping ι of 11.3.1 is injective (cf. 10.1.19). This means that if an element
of L2 (R) contains an element ϕ of S(R), then ϕ is the only element of S(R) it
contains. As a rule, when we denote by [ϕ] an element of Rι , the representative
ϕ by which we mark the equivalence class is meant to be the element of S(R)
that is contained by the class.
(f̌)^{(l)} = i^l (ξ^l f )ˇ, ∀l ∈ N.
Proof. Let {tn } be a sequence in R − {0} such that tn → 0. For all x ∈ R, we have

lim_{n→∞} (1/tn )(f̂(x + tn ) − f̂(x)) = lim_{n→∞} (2π)^{−1/2} ∫_R (1/tn )(e^{−i tn y} − 1) e^{−ixy} f (y) dm(y)
  = (2π)^{−1/2} ∫_R (−iy) e^{−ixy} f (y) dm(y) = −i (ξf )ˆ(x),

where the limit may be taken under the integral sign by 8.2.11, with |y||f (y)| as
dominating function; indeed

|e^{iα} − 1| ≤ |α|, ∀α ∈ R,

and then |(1/tn )(e^{−i tn y} − 1) e^{−ixy} f (y)| ≤ |y||f (y)| for all y ∈ R, with ξf ∈ L1 (R).
Hence (f̂)^{(1)} = −i (ξf )ˆ.
This proves by induction the part of the statement about f̂. The proof for f̌ is
analogous.
11.4.3 Remark. We recall (cf. 10.1.5c) that we have, for all ϕ ∈ S(R):
(a) ϕ ∈ L1 (R);
(b) ∫_R ϕ dm = lim_{n→∞} ∫_{−n}^{n} ϕ(x) dx
(the integrals on the right hand side of this equation are Riemann integrals).
Proof. Preliminarily we note that, for all k ∈ N, ϕ^{(k)} ∈ S(R) (cf. 3.1.10h-1) and
hence ϕ^{(k)} ∈ L1 (R) (cf. 11.4.3a). Thus, the statement is consistent.
For all x ∈ R, we have

(ϕ^{(1)})ˆ(x) =⁽¹⁾ (2π)^{−1/2} lim_{n→∞} ∫_{−n}^{n} e^{−ixy} ϕ^{(1)}(y) dy
  =⁽²⁾ (2π)^{−1/2} lim_{n→∞} ( e^{−ixn} ϕ(n) − e^{ixn} ϕ(−n) + ix ∫_{−n}^{n} e^{−ixy} ϕ(y) dy )
  =⁽³⁾ ix (2π)^{−1/2} lim_{n→∞} ∫_{−n}^{n} e^{−ixy} ϕ(y) dy =⁽⁴⁾ ix ϕ̂(x),

where 1 and 4 hold true by 11.4.3b, 2 is integration by parts for the Riemann
integrals, 3 holds true because lim_{n→∞} ϕ(±n) = 0. This proves that

(ϕ^{(1)})ˆ = iξ ϕ̂.
In the same way we can prove that, for each k ∈ N, if

(ϕ^{(k)})ˆ = (iξ)^k ϕ̂

then

(ϕ^{(k+1)})ˆ = (iξ)^{k+1} ϕ̂.
This proves by induction the part of the statement about ϕ̂. The proof for ϕ̌ is
analogous.
Proof. Preliminarily we note that (ξ^l ϕ)^{(k)} ∈ S(R) (cf. 3.1.10h-1,4) and hence
(ξ^l ϕ)^{(k)} ∈ L1 (R) (cf. 11.4.3a), for all k, l ∈ N. Moreover, ϕ̂ and ϕ̌ are elements
of C∞ (R) since ξ^l ϕ ∈ L1 (R) for all l ∈ N (cf. 11.4.2). Thus, the statement is
consistent.
We fix k, l ∈ N. From 11.4.2 we have

ξ^k (ϕ̂)^{(l)} = (−i)^l ξ^k (ξ^l ϕ)ˆ.   (1)

Since ξ^l ϕ ∈ S(R), we can write the first equality in 11.4.4 with ϕ replaced by ξ^l ϕ,
to obtain

((ξ^l ϕ)^{(k)})ˆ = (iξ)^k (ξ^l ϕ)ˆ.   (2)

Now, (1) and (2) yield the first equality of the statement. The proof of the second
equality is analogous.
Proof. As already noted in the proof of 11.4.5, ϕ̂ and ϕ̌ are elements of C ∞ (R).
Moreover we have, for all k = 0, 1, 2, ... and all l = 0, 1, 2, ...,

sup{|x^{k+1} (ϕ̂)^{(l)}(x)| : x ∈ R} =⁽¹⁾ sup{|((ξ^l ϕ)^{(k+1)})ˆ(x)| : x ∈ R}
  ≤⁽²⁾ (2π)^{−1/2} ∫_R |(ξ^l ϕ)^{(k+1)}| dm <⁽³⁾ ∞,

where: 1 holds true by 11.4.5 if l ∈ N or by 11.4.4 if l = 0; 2 follows from 8.2.10; 3
holds true in view of 11.4.3a since (ξ^l ϕ)^{(k+1)} ∈ S(R) (cf. 3.1.10h-1,4). Then,

lim_{x→±∞} x^k (ϕ̂)^{(l)}(x) = lim_{x→±∞} (1/x) x^{k+1} (ϕ̂)^{(l)}(x) = 0.
11.4.8 Lemma. For each a ∈ (0, ∞), let the function γa be defined by

γa : R → C
x ↦ γa (x) := e^{−(1/2) a x²}.

Then, γa ∈ S(R) and

γ̂a (x) = γ̌a (x) = a^{−1/2} e^{−(1/2) a^{−1} x²} , ∀x ∈ R.
Proof. We fix a ∈ (0, ∞). It is obvious that γa ∈ S(R). Then γ̂a ∈ C∞ (R) and

(γ̂a )^{(1)} = −i (ξγa )ˆ   (1)

(cf. 11.4.2). Moreover, from

γa′ (x) = −a x γa (x), ∀x ∈ R,

we obtain

(ξγa )ˆ = −a^{−1} (γa^{(1)})ˆ = −a^{−1} iξ γ̂a ,   (2)

in view of 11.4.4. From (1) and (2), we see that

(γ̂a )^{(1)} + a^{−1} ξ γ̂a = 0.
Moreover,

γ̂a (0) = (2π)^{−1/2} ∫_R e^{−(1/2) a x²} dm(x) = (2π)^{−1/2} (2a^{−1})^{1/2} √π = a^{−1/2}

(cf. 11.4.7). Now, there exists a unique element ϕ of C∞ (R) such that

ϕ′(x) + a^{−1} x ϕ(x) = 0, ∀x ∈ R, and ϕ(0) = a^{−1/2} ,

and ϕ is defined by

ϕ(x) := a^{−1/2} e^{−(1/2) a^{−1} x²} .

Thus,

γ̂a (x) = a^{−1/2} e^{−(1/2) a^{−1} x²} , ∀x ∈ R.
As to γ̌a , we note that (f̄ )ˆ equals the complex conjugate of f̌ for every f ∈ L1 (R),
since complex conjugation commutes with integration (cf. 8.2.3); since γa and γ̂a
are real-valued, this yields γ̌a = γ̂a .
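The closed form of 11.4.8 can also be checked by direct quadrature; a numerical sketch, with arbitrary choices of a, grid and sample points:

```python
import numpy as np

a = 0.7
y = np.linspace(-40, 40, 400001)
dy = y[1] - y[0]
gamma = np.exp(-a*y**2/2)                      # gamma_a(y) = exp(-a*y^2/2)

def ft(xval):
    # (2*pi)**-0.5 * integral of exp(-i*xval*y)*gamma_a(y) dy, trapezoid rule
    z = np.exp(-1j*xval*y) * gamma
    return (2*np.pi)**-0.5 * np.sum(z[:-1] + z[1:]) / 2 * dy

# compare with the claimed closed form a**-0.5 * exp(-x^2/(2a))
err = max(abs(ft(xv) - a**-0.5*np.exp(-xv**2/(2*a))) for xv in [0.0, 0.5, 1.3, -2.0])
print(err)  # ~ 0
```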
Proof. The first equation of the statement is proved by the following equalities,
where x is a fixed but arbitrary element of R:

(ϕ̂)ˇ(x) = (2π)^{−1/2} ∫_R e^{ixt} ϕ̂(t) dm(t)
  =⁽¹⁾ (2π)^{−1/2} lim_{n→∞} ∫_R e^{−(1/2) n^{−2} t²} e^{ixt} ( (2π)^{−1/2} ∫_R e^{−ity} ϕ(y) dm(y) ) dm(t)
  =⁽²⁾ (2π)^{−1/2} lim_{n→∞} ∫_R ϕ(y) ( (2π)^{−1/2} ∫_R e^{−i(y−x)t} e^{−(1/2) n^{−2} t²} dm(t) ) dm(y)
  =⁽³⁾ (2π)^{−1/2} lim_{n→∞} ∫_R ϕ(y) n e^{−(1/2) n² (y−x)²} dm(y)
  =⁽⁴⁾ (2π)^{−1/2} lim_{n→∞} ∫_R ϕ((1/n)s + x) e^{−(1/2) s²} dm(s)
  =⁽⁵⁾ (2π)^{−1/2} ∫_R ϕ(x) e^{−(1/2) s²} dm(s)
  =⁽⁶⁾ (2π)^{−1/2} ϕ(x) (2π)^{1/2} = ϕ(x).
The explanations of the above equalities are as follows:
1 holds true in view of 8.2.11 with dominating function |ϕ̂|, which is an element
of L1 (R) in view of 11.4.6 and 11.4.3a;
2 holds true in view of 8.4.9 and 8.4.10c, because
|e^{−(1/2) n^{−2} t²} e^{ixt} e^{−ity} ϕ(y)| = e^{−(1/2) n^{−2} t²} |ϕ(y)|, ∀(y, t) ∈ R²,
and because γn−2 and ϕ are elements of S(R) and hence of L1 (R) (cf. 11.4.3a);
3 follows from 11.4.8;
4 follows from the change of variable
s := n(y − x),
in view of 9.2.1 and 9.2.2;
5 holds in view of 8.2.11, since ϕ is continuous and hence

lim_{n→∞} ϕ((1/n)s + x) e^{−(1/2) s²} = ϕ(x) e^{−(1/2) s²} , ∀s ∈ R,

and since

|ϕ((1/n)s + x)| e^{−(1/2) s²} ≤ sup{|ϕ(y)| : y ∈ R} e^{−(1/2) s²} , ∀s ∈ R, ∀n ∈ N.
Proof. Preliminarily we note that ϕ̄ψ, (ϕ̂)*ψ̂, (ϕ̌)*ψ̌ are elements of S(R) (in view of
3.1.10h-2,6 and 11.4.6) and hence of L1 (R) (in view of 11.4.3a).
The first equation of the statement is proved by the following equalities:

∫_R ϕ̄ψ dm =⁽¹⁾ ∫_R ϕ̄ (ψ̂)ˇ dm
  = ∫_R ϕ̄(x) ( (2π)^{−1/2} ∫_R e^{ixy} ψ̂(y) dm(y) ) dm(x)
  =⁽²⁾ ∫_R ψ̂(y) ( (2π)^{−1/2} ∫_R e^{ixy} ϕ̄(x) dm(x) ) dm(y)
  =⁽³⁾ ∫_R ψ̂(y) ( (2π)^{−1/2} ∫_R e^{−ixy} ϕ(x) dm(x) )* dm(y)
  = ∫_R (ϕ̂)* ψ̂ dm.
The explanations of the above equalities are as follows:
1 holds true by 11.4.9;
2 holds true in view of 8.4.9 and 8.4.10c, because

|ϕ̄(x) e^{ixy} ψ̂(y)| = |ϕ(x)||ψ̂(y)|, ∀(x, y) ∈ R²,
3 holds true because complex conjugation commutes with integration (cf. 8.2.3).
The proof of the second equation of the statement is analogous.
These definitions are consistent in view of 11.4.6. The mappings F̂ and F̌ are
obviously linear operators on the linear space S(R) (cf. 3.1.10h). The statement of
11.4.9 can be written as
F̌ F̂ = F̂ F̌ = 1S(R) .
In view of 1.2.16b, this implies that both F̂ and F̌ are injective and that
F̌ = F̂ −1 and F̂ = F̌ −1 .
Since RF̂ = DF̂ −1 and RF̌ = DF̌ −1 , both F̂ and F̌ are surjective. By means of the
inner product for S(R) (cf. 10.1.5c), the statement of 11.4.10 can be written as
(F̂ ϕ|F̂ ψ) = (F̌ ϕ|F̌ ψ) = (ϕ|ψ), ∀ϕ, ψ ∈ S(R).
Therefore, F̂ and F̌ are automorphisms of the inner product space S(R) (cf.
10.1.17).
11.4.12 Theorem. There exists a unique operator F ∈ B(L2 (R)) such that
F [ϕ] = [ϕ̂], ∀ϕ ∈ S(R).
The operator F is a unitary operator in L2 (R). The operator F −1 is the unique
element of B(L2 (R)) such that
F −1 [ϕ] = [ϕ̌], ∀ϕ ∈ S(R),
or equivalently such that
F −1 [ϕ̂] = [ϕ], ∀ϕ ∈ S(R).
The operator F is called the Fourier transform on L2 (R).
Proof. We recall that the pair (L2 (R), ι), with ι defined as in 11.3.1, is a completion
of the inner product space S(R) (cf. 11.3.5). We define the mapping
F0 : Rι → L2 (R)
[ϕ] 7→ F0 [ϕ] := [ϕ̂]
(cf. 11.3.6b). Clearly, F0 is a linear operator in L2 (R). Moreover, from 11.4.9 we
have
F0 [ϕ̌] = [ϕ], ∀ϕ ∈ S(R), (1)
Then, by 8.2.11 (with sup{|ψ ′ (s)| : s ∈ R}|f | as dominating function, cf. 3.1.10h-7)
we have
lim_{n→∞} (1/tn )(ϕ(x + tn ) − ϕ(x)) = ∫_R ψ′(x − y) f (y) dm(y).
11.4.15 Definitions. For each t ∈ R and each f ∈ L2 (R), we define the functions

f^t : Df → C, x ↦ f^t (x) := e^{itx} f (x)

and

f_{−t} : Df − t → C, x ↦ f_{−t}(x) := f (x + t)

(the definition of f_{−t} is consistent with the definition of ϕc given in 9.2.1b, while
the definition of f^t has nothing to do with the definition of ϕc given in 9.2.2). It is
obvious that f^t ∈ L2 (R), while f_{−t} ∈ L2 (R) follows from 9.2.1b.
It is obvious that, for f, g ∈ L2 (R),

f ∼ g ⇒ f^t ∼ g^t ,

while the implication

f ∼ g ⇒ f_{−t} ∼ g_{−t}

follows from 9.2.1a.
In view of the remarks above, for each t ∈ R we can define the mappings
Ut : L2 (R) → L2 (R)
[f ] 7→ Ut [f ] := [f t ]
and
Vt : L2 (R) → L2 (R)
[f ] 7→ Vt [f ] := [f−t ].
It is obvious that Ut and Vt are linear operators. Moreover,

‖Ut [f ]‖ = ‖[f ]‖, ∀[f ] ∈ L2 (R),

is obvious, while

‖Vt [f ]‖ = ‖[f ]‖, ∀[f ] ∈ L2 (R),

follows from 9.2.1b. Thus, Ut and Vt are elements of B(L2 (R)).
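The isometry of Ut and Vt can be illustrated on a grid; a sketch with an arbitrary sample f and an arbitrary t:

```python
import numpy as np

x = np.linspace(-30, 30, 600001)
dx = x[1] - x[0]
def norm(y):
    # trapezoid approximation of the L2(R) norm, for rapidly decaying samples
    w = np.abs(y)**2
    return np.sqrt(np.sum(w[:-1] + w[1:]) / 2 * dx)

f = lambda s: np.exp(-s**2) * (1 + 1j*s)       # arbitrary rapidly decaying sample
t = 1.7
nf   = norm(f(x))
nUtf = norm(np.exp(1j*t*x) * f(x))             # U_t[f] = [f^t], f^t(x) = e^{itx} f(x)
nVtf = norm(f(x + t))                          # V_t[f] = [f_{-t}], f_{-t}(x) = f(x + t)
print(nUtf - nf, nVtf - nf)  # both ~ 0
```

Multiplication by e^{itx} leaves |f| pointwise unchanged, while translation invariance of Lebesgue measure handles Vt; the tiny residuals come only from the finite window and quadrature.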
and

|(e^{i tn x} − e^{i t0 x}) f (x)|² ≤ 4|f (x)|², ∀x ∈ Df .
Then, by 8.2.11 (with 4|f |² as dominating function) we have

lim_{n→∞} ∫_R |e^{i tn x} f (x) − e^{i t0 x} f (x)|² dm(x) = 0,

or

lim_{n→∞} ‖Utn [f ] − Ut0 [f ]‖ = 0.
11.4.18 Lemma. Let f ∈ L2 (R), let a, b ∈ R be such that a < b, and suppose that
f (x) = 0 m-a.e. on Df − [a, b].
Then, for each ε1 > 0 and each ε2 > 0, there exists ϕ ∈ Cc∞ (R) so that

‖[f ] − [ϕ]‖ ≤ ε1

and

supp ϕ ⊂ [a − ε2 , b + ε2 ].
ψn : R → C
x ↦ ψn (x) := kn exp((x² − n^{−2})^{−1}) if x ∈ In , 0 if x ∉ In ,

where kn ∈ R is so that ∫_R ψn dm = 1. It is easy to see that ψn ∈ C∞ (R). Hence
ψn ∈ S(R).
Now, we fix ε1 > 0. Since the mapping
R ∋ t 7→ Vt [f ] ∈ L2 (R)
is continuous (cf. 11.4.16), there exists δ > 0 so that

|t| < δ ⇒ ‖[f ] − [ft ]‖ < ε1 .

Moreover, we fix ε2 > 0 and then n ∈ N such that n^{−1} < min{δ, ε2 }.
We note that f ∈ L1 (R) (this follows from 11.1.2b with ϕ := χ[a,b] and ψ := f )
and define the function
ϕ : R → C
x ↦ ϕ(x) := ∫_R ψn (s) f (x − s) dm(s),
which is an element of C ∞ (R) (cf. 11.4.14). Moreover, if x 6∈ [a − ε2 , b + ε2 ] then
s ∈ In ⇒ |s| < ε2 ⇒ x − s 6∈ [a, b],
and hence
ψn (s)f (x − s) = 0 for m-a.e. s ∈ Df + x,
and hence
ϕ(x) = 0.
This proves that supp ϕ ⊂ [a − ε2 , b + ε2 ], and hence also that ϕ ∈ Cc∞ (R).
Now we want to prove that

‖[f ] − [ϕ]‖ ≤ ε1 .   (1)

In view of the Schwarz inequality (cf. 10.1.9) we have

‖[f ] − [ϕ]‖ = sup{|([h]|[f ] − [ϕ])| : [h] ∈ L2 (R) s.t. ‖[h]‖ = 1}.   (2)
We fix [h] ∈ L2 (R) such that ‖[h]‖ = 1. We have

([h]|[ϕ]) = ∫_R h̄(x) ( ∫_R ψn (s) f (x − s) dm(s) ) dm(x).
We note that the function

R ∋ x ↦ ∫_R ψn (s) |f (x − s)| dm(s) ∈ [0, ∞)
is an element of Cc∞ (R) (by the same argument as above, with f replaced by |f |)
and hence of L2 (R), and hence
∫_R |h(x)| ( ∫_R ψn (s) |f (x − s)| dm(s) ) dm(x) < ∞.
Moreover we have

([h]|[f ]) = ∫_{In} ψn (s) ([h]|[f ]) dm(s)

since ∫_{In} ψn dm = 1. Therefore we have

([h]|[f ] − [ϕ]) = ∫_{In} ψn (s) ([h]|[f ] − [fs ]) dm(s),
and hence

|([h]|[f ] − [ϕ])| ≤⁽³⁾ ∫_{In} ψn (s) |([h]|[f ] − [fs ])| dm(s)
  ≤⁽⁴⁾ ∫_{In} ψn (s) ‖[f ] − [fs ]‖ dm(s) ≤⁽⁵⁾ ε1 ∫_{In} ψn dm = ε1 ,
11.4.19 Corollary. The family ι(Cc∞ (R)) (with ι defined as in 11.3.1) is dense in
L2 (R).
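The mollification behind 11.4.18 and 11.4.19 can be sketched numerically: convolving χ[a,b] with the bump ψn of the proof of 11.4.18 produces a smooth function supported in [a − 1/n, b + 1/n] and L²-close to χ[a,b]. Here In := (−1/n, 1/n) is an assumption, since the definition of In falls on a page not reproduced here:

```python
import numpy as np

a, b, n = -1.0, 1.0, 20
x = np.linspace(-3, 3, 6001)
dx = x[1] - x[0]

s = np.linspace(-1.0/n, 1.0/n, 101)            # grid on I_n, same spacing as x
core = np.zeros_like(s)
ins = s**2 < (1.0/n)**2 - 1e-9                 # strict interior of I_n
core[ins] = np.exp(1.0/(s[ins]**2 - (1.0/n)**2))
kn = 1.0 / (np.sum(core) * (s[1] - s[0]))      # normalization: integral of psi_n = 1
psi = kn * core

f = ((x >= a) & (x <= b)).astype(float)        # indicator of [a, b]
phi = np.convolve(f, psi, mode="same") * (s[1] - s[0])   # phi(x) ~ int psi_n(t) f(x-t) dt

supp = x[np.abs(phi) > 1e-12]
l2dist = np.sqrt(np.sum((phi - f)**2) * dx)
print(supp.min(), supp.max(), l2dist)
```

The support stays within [a − 1/n, b + 1/n] (up to one grid step), phi equals 1 deep inside [a, b], and the L² distance shrinks as n grows, which is the content of the density corollary.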
11.4.20 Corollary. Let f ∈ L2 (R), let a, b ∈ R be such that a < b, and suppose
that
f (x) = 0 m-a.e. on Df − [a, b].
Then, for each ε > 0 there exists ϕ ∈ Cc∞ (R) so that
k[f ] − [ϕ]k < ε
and
supp ϕ ⊂ (a, b).
11.4.21 Corollary. Let a, b ∈ R be such that a < b. The family ι(C0∞ (a, b)) (with
ι defined as in 11.2.1) is dense in L2 (a, b).
Proof. This follows immediately from 11.4.20, since each element of L2 (a, b) is
extended trivially by an element of L2 (R) which satisfies the condition of 11.4.20,
and each element ϕ of Cc∞ (R) such that supp ϕ ⊂ (a, b) becomes an element of
C0∞ (a, b) when it is restricted to [a, b].
Proof. We define fn := χ[−n,n] f for all n ∈ N. For each n ∈ N, let ϕn ∈ Cc∞ (R) be
such that
‖[fn ] − [ϕn ]‖ < 1/n and supp ϕn ⊂ (−n, n)
(ϕn with these properties exists by 11.4.20).
From the condition f ∈ L2 (R) we obtain
‖[f ] − [ϕn ]‖ ≤ ‖[f ] − [fn ]‖ + ‖[fn ] − [ϕn ]‖ → 0 as n → ∞,   (1)

since

lim_{n→∞} ∫_R |f − fn |² dm = 0
therefore we have

∫_R |f − ϕn | dm = ∫_[−n,n] |f − ϕn | dm + ∫_{R−[−n,n]} |f | dm → 0 as n → ∞.   (2)
by the continuity of F and since ϕn ∈ S(R) for all n ∈ N. In view of 11.1.7, this
implies that there exists a subsequence {ϕ̂nk } of the sequence {ϕ̂n } so that
h(x) = lim ϕ̂nk (x) m-a.e. on R.
k→∞
Moreover, from (2) we have

|f̂(x) − ϕ̂n (x)| ≤ (2π)^{−1/2} ∫_R |f − ϕn | dm → 0 as n → ∞, ∀x ∈ R
11.4.23 Remark. For all [f ] ∈ L2 (R), on the basis of 11.4.22 we can find a formula
which yields F [f ] more directly than the mere definition of F does. Indeed, let
[f ] ∈ L2 (R), let {an } and {bn } be sequences in R such that
an < bn for all n ∈ N, lim an = −∞, lim bn = ∞,
and define
fn := χ[an ,bn ] f, ∀n ∈ N.
For all n ∈ N, fn ∈ L2 (R) is obvious and fn ∈ L1 (R) follows from 11.1.2b. Moreover,
lim_{n→∞} ‖[f ] − [fn ]‖ = 0 follows from 8.2.11 (with |f |² as dominating function).
Then, in view of the continuity of F and of 11.4.22, we have
F [f ] = lim_{n→∞} F [fn ] = lim_{n→∞} [f̂n ],

with

f̂n (x) = (2π)^{−1/2} ∫_{[an ,bn ]} e^{−ixy} f (y) dm(y), ∀x ∈ R, ∀n ∈ N.
The sequences {an } and {bn } can be chosen in order to make the computation of
the limit above as easy as possible.
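For f := χ[−1,1] the procedure needs no limit at all, since f already has compact support, and the integral has the familiar closed form (2/π)^{1/2} sin x/x; a numerical sketch:

```python
import numpy as np

y = np.linspace(-1, 1, 200001)   # [a_n, b_n] = [-1, 1] already suffices here
dy = y[1] - y[0]

def fhat(xval):
    # (2*pi)**-0.5 * integral over [-1, 1] of exp(-i*xval*y) dy, trapezoid rule
    z = np.exp(-1j*xval*y)
    return (2*np.pi)**-0.5 * np.sum(z[:-1] + z[1:]) / 2 * dy

err = max(abs(fhat(xv) - (2/np.pi)**0.5*np.sin(xv)/xv) for xv in [0.3, 1.0, 2.5])
print(err)  # ~ 0
```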
Chapter 12
Adjoint Operators
In this chapter we study the idea of adjoint operator, which is in a sense the main
tool for dealing with linear operators in Hilbert space. Throughout the chapter, H
denotes an abstract Hilbert space. We recall that O(H) denotes the family of all
linear operators in H (cf. 3.2.1).
Proof. a: This follows from 12.1.5c since (W (GA ))⊥ is a subspace of H ⊕ H (cf.
10.2.13).
b: We have

(W (GA† ))⊥ =⁽¹⁾ (W ((W (GA ))⊥ ))⊥ =⁽²⁾ ((W ²(GA ))⊥ )⊥ =⁽³⁾ (GA ⊥)⊥ =⁽⁴⁾ (GA )⁻,

where: 1 holds by 12.1.5c; 2 holds by 10.2.16; 3 holds because W ² = −1H⊕H and
GA is a linear manifold in H ⊕ H (cf. 3.2.15a); 4 holds by 10.4.4c (here (·)⁻ denotes
closure). Now, A is closable iff (GA )⁻ is the graph of a mapping, and (W (GA† ))⊥
is the graph of a mapping iff (DA† )⁻ = H (cf. 12.1.5b). Thus, A is closable iff
(DA† )⁻ = H.
If A is closable, and hence (DA† )⁻ = H, then (cf. 12.1.5c)

GA†† = (W (GA† ))⊥ = (GA )⁻ = GĀ ,

and hence A†† = Ā because two mappings are equal iff their graphs are equal.
Moreover,

G(Ā)† = (W (GĀ ))⊥ = (W ((GA )⁻))⊥ =⁽⁵⁾ ((W (GA ))⁻)⊥ =⁽⁶⁾ (W (GA ))⊥ = GA† ,

where: 5 holds by 10.1.21, 4.6.2d, 2.3.21a; 6 holds by 10.2.11. This proves that
(Ā)† = A† .
c: This follows immediately from result b, since A is closed iff [A is closable and
A = Ā].
12.1.8 Theorem. Let A ∈ O(H) be such that (DA )⁻ = H and NA = {0H } (thus, the
operators A† and A⁻¹ are defined). Then, (DA⁻¹ )⁻ = H iff NA† = {0H } (thus, the
operator (A⁻¹)† is defined iff the operator (A† )⁻¹ is defined). If these conditions
hold true, then

(A⁻¹)† = (A† )⁻¹.
Proof. The parenthetical remarks of the statement are true by 12.1.1 and by 3.2.6a.
We have

(RA )⁻ = H ⇔ RA ⊥ = {0H }
This proves that A ∈ B(H), and hence that A is closed (cf. 4.4.3).
Second, we suppose {xn } ∉ ℓ². We choose n0 ∈ N such that xn0 ≠ 0 and define

fk := −xk un0 + xn0 uk , ∀k ∈ N;
clearly,
fk ∈ DA and Afk = (−xk xn0 + xn0 xk )u = 0H , ∀k ∈ N.
Now, let g ∈ DA† ; then
0 = (Afk |g) = (fk |A† g) = −x̄k (un0 |A† g) + x̄n0 (uk |A† g), ∀k ∈ N;
12.2.1 Theorem. Let A ∈ OE (H) (for OE (H), cf. 3.2.12). Then the operator A†
is bounded.
(for FA† g , cf. 10.5.1; for H̃, cf. 10.9.4). This proves that

∀f ∈ H, ∃mf ∈ [0, ∞) such that |FA† g f | ≤ mf , ∀g ∈ DA† ∩ H̃,

and hence, by 4.2.13 and 10.5.1, that

∃m ∈ [0, ∞) such that ‖A† g‖ = ‖FA† g ‖ ≤ m, ∀g ∈ DA† ∩ H̃,

and hence that

∃m ∈ [0, ∞) such that ‖A† (‖g‖⁻¹ g)‖ ≤ m, i.e. ‖A† g‖ ≤ m‖g‖, ∀g ∈ DA† − {0H },

and hence that the operator A† is bounded.
Proof. a ⇒ (b and d): Assuming condition a, by 4.2.6 there exists Ã ∈ B(H) such
that A ⊂ Ã, since (DA )⁻ = H. Then the function
ψ : H×H→C
(f, g) 7→ ψ(f, g) := (Ãf |g)
is a bounded sesquilinear form on H (cf. 10.5.5), and hence by 10.5.6 there exists
B ∈ B(H) such that
(Ãf |g) = (f |Bg) , ∀f, g ∈ H,
and hence such that
(Af |g) = (f |Bg) , ∀f ∈ DA , ∀g ∈ H = DB .
By 12.1.3B, this implies B ⊂ A† and hence B = A† . Further, we have kAk = kÃk
by 4.2.6d and kÃk = kBk by 10.1.14.
b ⇒ c: This is obvious.
c ⇒ a: Assuming condition c, by 12.2.1 we have that A†† is bounded. Since
A ⊂ A†† (cf. 12.1.6b), by 4.2.5a we obtain that A is bounded.
thus, the matrix ΦU (A† ) is the complex conjugate of the transpose of the matrix
ΦU (A).
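This finite-dimensional fact is easy to check numerically. The following sketch (an illustration added here, not part of the text) uses Python with NumPy, taking H = C⁴ with the standard inner product, and verifies that the conjugate transpose satisfies the defining property (Af |g) = (f |A† g) of the adjoint:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A_dag = A.conj().T  # conjugate transpose of the matrix of A

f = rng.normal(size=n) + 1j * rng.normal(size=n)
g = rng.normal(size=n) + 1j * rng.normal(size=n)

# np.vdot conjugates its first argument, matching the convention (f|g)
inner = lambda x, y: np.vdot(x, y)

# the defining property of the adjoint: (Af|g) = (f|A† g)
assert np.isclose(inner(A @ f, g), inner(f, A_dag @ g))
```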
(a) A† + B † ⊂ (A + B)† ;
(b) if B ∈ B(H) then A† + B † = (A + B)† .
Proof. For all α ∈ C, we have DαA = H because DαA = DA , and also (cf. 12.1.3A)
i.e. D(αA)† ⊂ DᾱA† , which (in view of what was proved above) implies ᾱA† = (αA)† .
b: In view of what was proved above we already know that 0A† ⊂ (0A)† .
Moreover,
proves that OH ⊂ (0A)† (cf. 12.1.3B), and hence that OH = (0A)† . Now, D0A† =
DA† and DA† = H iff A is bounded (cf. 12.2.2).
12.3.3 Remark. The equality 1†H = 1H (for the operator 1H , cf. 3.2.5) follows
immediately from 12.1.3B with A := ψ := 1H . Then, for every adjointable operator
A in H and every α ∈ C, from 12.3.1b and 12.3.2 we obtain
(A + α1H )† = A† + ᾱ1H
(a) A† B † ⊂ (BA)† ;
(b) if B ∈ B(H) then A† B † = (BA)† .
g ∈ D(BA)† ⇒ [ (Af |B † g) = (f |(BA)† g) , ∀f ∈ DA ]
⇒ B † g ∈ DA∗ = DA† ⇒ g ∈ DA† B † ,
where 12.1.3A, the equality DBA = DA , and the definition of DA∗ have been used.
This proves that D(BA)† ⊂ DA† B † , which (in view of result a) implies A† B † =
(BA)† .
Proof. From 12.1.4 we have DA† = H and A†† ⊂ A† . From 12.1.6b we have
A ⊂ A†† .
12.4.4 Remarks.
12.4.6 Remarks.
(a) A self-adjoint operator is closed (cf. 12.1.6a).
(b) Suppose that A and B are self-adjoint operators in H and that A ⊂ B. Then
B = B † ⊂ A† = A by 12.1.4, and hence A = B.
Proof. The operator A is injective in view of 3.2.6a. Then, 12.1.8 implies that
DA−1 = H and (A−1 )† = (A† )−1 = A−1 ,
i.e. that A−1 is s.a..
Proof. a ⇒ (b and c): Suppose that A is e.s.a.. Then it is symmetric, and hence it
is closable and Ā = A†† (cf. 12.4.4a). Thus, DA = DA† = H and (Ā)† = (A†† )† =
A† = A†† = Ā (cf. 12.1.6b).
Now suppose that B is a s.a. operator in H such that A ⊂ B. From 12.1.4 we
have B † ⊂ A† and then A†† ⊂ B †† . Since B †† = B † = B and A†† = A† , this implies
B = A†† = Ā. This proves that condition c is true.
b ⇒ a: If A is closable, then DA† = H and A†† = Ā by 12.1.6b. If Ā is s.a.,
we also have A†† = Ā = (Ā)† = (A†† )† = A† (cf. 12.1.6b). Thus, A ⊂ A† and
A†† = A† .
Proof. The equality RA−λ1H = H implies that, for each g ∈ DA† , there exists
f ∈ DA−λ1H so that
A† g − λg = Af − λf,
and hence, since A ⊂ A† , so that
(A† − λ1H )(g − f ) = 0H .
Now, A† − λ1H = (A − λ̄1H )† by 12.3.3, N(A−λ̄1H )† = R⊥A−λ̄1H by 12.1.7, R⊥A−λ̄1H =
{0H } since R̄A−λ̄1H = H (cf. 10.4.4d). Thus, g − f = 0H and hence g ∈ DA−λ1H =
DA . This proves that DA† ⊂ DA and hence that A is s.a..
Proof. Suppose that {fn } is a sequence in RA†† +i1H and that there exists f ∈ H
so that limn→∞ fn = f . Then there exists a sequence {gn } in DA†† +i1H = DA†† so
that (A†† + i1H )gn = fn for all n ∈ N. Now, for every g ∈ DA†† ,
k(A†† + i1H )gk2 = kA†† gk2 + i (A†† g|g) − i (g|A†† g) + kgk2 = kA†† gk2 + kgk2 ≥ kgk2
since A†† is symmetric; thus kgn − gm k ≤ kfn − fm k for all n, m ∈ N, and hence {gn } is a Cauchy sequence and converges to some g ∈ H. Then limn→∞ A†† gn = limn→∞ (fn − ign ) = f − ig, and since the operator A†† is closed we have g ∈ DA†† and A†† g = f − ig, i.e. (A†† + i1H )g = f ,
and hence f ∈ RA†† +i1H . This proves that RA†† +i1H is closed (cf. 2.3.4).
The proof for RA†† −i1H is analogous.
f = 0H .
by 12.4.16. Then A†† is s.a. by 12.4.14 (with A replaced by A†† and λ := i), since
it is a symmetric operator (cf. 12.4.4a). Thus, A†† = (A†† )† = A† (cf. 12.1.6b).
(a) A is self-adjoint;
(b) A is closed and NA† +i1H = NA† −i1H = {0H };
(c) RA+i1H = RA−i1H = H.
by 12.4.17.
b ⇒ c: Assuming condition b, we have
R̄A+i1H = R̄A−i1H = H
(cf. 12.1.7 and 10.4.4d), and the ranges RA+i1H and RA−i1H are closed by 12.4.16
since A = A†† (the operator A being closed); hence
RA+i1H = RA−i1H = H.
c ⇒ a: This follows directly from 12.4.14 (with λ := i).
12.4.19 Remark. Self-adjoint operators are, among symmetric operators, the im-
portant ones because the spectral theorem holds true for them. One is often given
an operator A which for some reason is known to be symmetric even if its adjoint is
not known (e.g., A might have been proved to be symmetric by 12.4.3), and wants
to find out if A is self-adjoint, or at least essentially self-adjoint. Condition 12.4.17c
is a criterion for deciding whether a symmetric operator is essentially self-adjoint
in which only the operator itself appears, and condition 12.4.18c is the same for
self-adjointness. If the operator A is found to be essentially self-adjoint, then it has
a unique self-adjoint extension, which is Ā (cf. 12.4.11), and it is often possible to
learn the relevant properties of the self-adjoint extension of A without explicitly
constructing Ā or A†† , but relying instead on the explicit form of A and on the
abstract properties of closures and adjoints. We point out that it is usually easier to
find essentially self-adjoint operators than self-adjoint ones because there are usually
many essentially self-adjoint operators that are restrictions of the same self-adjoint
operator (cf. 12.4.13). It is worth mentioning that there exist symmetric oper-
ators that have many self-adjoint extensions and others that have no self-adjoint
extension.
(A) Apσ(A) ⊂ R.
(B) Assuming σp (A) ≠ ∅, suppose λ1 , λ2 ∈ σp (A) and λ1 ≠ λ2 . Then,
NA−λ1 1H ⊂ N⊥A−λ2 1H .
since λ1 , λ2 ∈ R (cf. 4.5.8 and result A) and A is symmetric (cf. 12.4.3), and hence
(f1 |f2 ) = 0.
C: This follows from B, in view of 10.7.7. Indeed, we can construct an o.n.s. in
H by choosing an element of NA−λ1H ∩ H̃ for each λ ∈ σp (A).
Proof. a ⇒ b: Since the operator A is closed (cf. 12.4.6a), this follows directly
from 4.5.12.
b ⇒ a: If λ ∉ R, then λ ∈ ρ(A) by 12.4.21a. Now assume λ ∈ R and RA−λ1H =
H. Then the operator A − λ1H is s.a. (cf. 12.3.3) and in view of 12.1.7 we have
NA−λ1H = R⊥A−λ1H = {0H },
which implies that the operator A − λ1H is injective and (A − λ1H )−1 is s.a. (cf.
12.4.8). Then the equalities
D(A−λ1H )−1 = RA−λ1H = H
imply that the operator (A − λ1H )−1 is bounded (cf. 12.4.7). Therefore, we have
λ ∈ ρ(A).
12.4.24 Proposition. Suppose that A is a symmetric operator and that there exist
a c.o.n.s. {un }n∈N in H and a sequence {λn } in R so that
un ∈ DA and Aun = λn un , ∀n ∈ N.
Then:
DA† = {g ∈ H : Σ∞n=1 λ2n |(un |g)|2 < ∞} and
A† g = Σ∞n=1 λn (un |g) un , ∀g ∈ DA† ;
Proof. We have
g ∈ DA† ⇒ [ (un |A† g) = (Aun |g) = λn (un |g) , ∀n ∈ N] ⇒
[ Σ∞n=1 λ2n |(un |g)|2 = Σ∞n=1 |(un |A† g)|2 < ∞ and
A† g = Σ∞n=1 (un |A† g) un = Σ∞n=1 λn (un |g) un ]
This proves that the operator A† is symmetric (cf. 12.4.3), and hence that A† = A††
since we already know that A†† ⊂ A† (cf. 12.4.2). Thus, the operator A is e.s.a..
It is obvious that {λn }n∈N ⊂ σp (A) ⊂ σp (A† ). If λ ∈ σp (A† ) existed such that
λ ≠ λn for all n ∈ N, then by 12.4.20B there would exist f ∈ DA† so that f ≠ 0H
and (un |f ) = 0 for all n ∈ N, and hence the o.n.s. {un }n∈N would not be complete
(cf. 10.6.4e). This proves that σp (A† ) ⊂ {λn }n∈N and hence that
σp (A) = σp (A† ) = {λn }n∈N .
The inclusion {λn }n∈N ⊂ σ(A† ) is true because {λn }n∈N = σp (A† ) ⊂ σ(A† ) and
σ(A† ) is a closed subset of C (cf. 10.4.6). Now let λ ∈ C − {λn }n∈N ; then (cf.
2.3.10),
∃ε > 0 such that |λ − λn | ≥ ε, ∀n ∈ N,
and this implies that
∃ε > 0 s.t. k(A† − λ1H )gk2 = Σ∞n=1 |λn − λ|2 |(un |g)|2 ≥ ε2 Σ∞n=1 |(un |g)|2 = ε2 kgk2 , ∀g ∈ DA†
(cf. 10.6.4b, 10.4.8a, 10.6.4d), and this implies that λ ∈ C − Apσ(A† ) (cf. 4.2.3),
i.e. λ ∈ C − σ(A† ) (cf. 12.4.21b). This proves that σ(A† ) ⊂ {λn }n∈N , and hence
that
σ(A† ) = {λn }n∈N .
Finally, the equation σ(A) = σ(A† ) follows from 4.5.11 since the operator A is
closable and Ā = A†† = A† (cf. 12.4.4a).
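A finite-dimensional sketch of the situation of 12.4.24 can be added here (an illustration, not part of the text; Python with NumPy, H = C⁵, and an arbitrarily chosen real sequence of eigenvalues): for a matrix with an orthonormal basis of eigenvectors and real eigenvalues, the operator acts by the eigenvector expansion and the spectrum is exactly the set of eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 5
# a random orthonormal basis {u_k} of C^5 (columns of Q)
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
lam = np.array([1.0, 2.0, 2.0, 5.0, 7.0])     # real eigenvalues (arbitrary)
A = Q @ np.diag(lam) @ Q.conj().T             # A u_k = lam_k u_k, A Hermitian

g = rng.normal(size=n) + 1j * rng.normal(size=n)
# A g = sum_k lam_k (u_k|g) u_k
expansion = sum(lam[k] * np.vdot(Q[:, k], g) * Q[:, k] for k in range(n))
assert np.allclose(A @ g, expansion)

# the spectrum is exactly the set of eigenvalues
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), np.sort(lam))
```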
12.4.25 Examples. The examples we examine here are operators in the Hilbert
space L2 (a, b). Most of the elements of L2 (a, b) that we use in these examples are
equivalence classes which contain an element of C(a, b), and we find it pointless to
distinguish always between the symbol ϕ for an element of C(a, b) and the symbol
[ϕ] for the element of L2 (a, b) that contains ϕ. In fact, if ϕ ∈ C(a, b) then ϕ is the
only continuous function in the equivalence class of [ϕ] (cf. 11.2.2b) and therefore it
is unambiguously identified with [ϕ]. This is useful for avoiding some cumbersome
notation. In the same spirit, we use the same symbol for a subset of C(a, b) and
its image under the mapping ι defined in 11.2.1. For instance, in what follows we
regard the set
C01 (a, b) := {ϕ ∈ C 1 (a, b) : ϕ(a) = ϕ(b) = 0}
(for C 1 (a, b), cf. 3.1.10f) as a subset of L2 (a, b). Clearly, C01 (a, b) is a linear manifold
in L2 (a, b). Also, C01 (a, b) is dense in L2 (a, b) by 10.6.5b, since {sn }n∈N ⊂ C01 (a, b), where
{sn }n∈N is the c.o.n.s. in L2 (a, b) defined in 11.2.6.
For any θ ∈ [0, 2π) we define:
DAθ := {ϕ ∈ C 1 (a, b) : ϕ(b) = eiθ ϕ(a)},
Aθ : DAθ → L2 (a, b)
ϕ ↦ Aθ ϕ := −iϕ′ .
Clearly, the mapping Aθ is a linear operator in L2 (a, b), and it is adjointable since
C01 (a, b) ⊂ DAθ . Moreover
(Aθ ϕ|ψ) = i ∫ab ϕ̄′ (x)ψ(x)dx = i(ϕ̄(b)ψ(b) − ϕ̄(a)ψ(a)) − i ∫ab ϕ̄(x)ψ ′ (x)dx
= −i ∫ab ϕ̄(x)ψ ′ (x)dx = (ϕ|Aθ ψ) , ∀ϕ, ψ ∈ DAθ ,
where: the first and last equalities hold because an inner product of elements of
C(a, b) can be written as a Riemann integral (cf. 10.1.5b) and because the conjugate
of ϕ′ is the derivative of ϕ̄ (cf. 1.2.21); the second equality is integration by parts;
the third equality holds because ϕ̄(b)ψ(b) = e−iθ ϕ̄(a)eiθ ψ(a) = ϕ̄(a)ψ(a) for all
ϕ, ψ ∈ DAθ . By 12.4.3, this proves that
the operator Aθ is symmetric. Let eθ be the element of C(a, b) defined by
eθ (x) := exp(iθ(x − a)/(b − a)), ∀x ∈ [a, b].
If {un }n∈Z is the c.o.n.s. in L2 (a, b) defined in 11.2.4, it is obvious that the family
{eθ un }n∈Z is an o.n.s. in L2 (a, b). Moreover, for [ϕ] ∈ L2 (a, b), ēθ ϕ ∈ L2 (a, b) and
([eθ un ]|[ϕ]) = ([un ]|[ēθ ϕ]) , ∀n ∈ Z;
therefore, in view of 10.6.4e,
[([eθ un ]|[ϕ]) = 0, ∀n ∈ Z] ⇒ [exp(−iθ(x − a)/(b − a))ϕ(x) = 0 m-a.e. on [a, b]]
(we have used 10.1.7a for the elements 1[a,b] and ϕ of L2 (a, b)). Thus, the operator
Sλ is bounded for each λ ∈ C.
Now, for each λ ∈ C we have RSλ ⊂ DD and
(D − λ1L2 (a,b) )Sλ ϕ = ϕ, ∀ϕ ∈ C(a, b) = DSλ , (1)
and also, for all ψ ∈ DD and x ∈ [a, b],
(Sλ (D − λ1L2 (a,b) )ψ)(x) = i exp(iλx) ∫ax exp(−iλs)(−iψ ′ (s) − λψ(s))ds
= exp(iλx)(exp(−iλx)ψ(x) + iλ ∫ax exp(−iλs)ψ(s)ds) − iλ exp(iλx) ∫ax exp(−iλs)ψ(s)ds = ψ(x),
which proves that
Sλ (D − λ1L2 (a,b) )ψ = ψ, ∀ψ ∈ DD−λ1L2 (a,b) . (2)
By 1.2.16b, 1 and 2 imply that, for each λ ∈ C,
D − λ1L2 (a,b) is injective and (D − λ1L2 (a,b) )−1 = Sλ .
Since DSλ = C(a, b) and C(a, b) is dense in L2 (a, b), this proves that
ρ(D) = C and hence σ(D) = ∅.
Furthermore, for each λ ∈ C we have
Sλ (B − λ1L2 (a,b) )ψ = ψ, ∀ψ ∈ DB−λ1L2 (a,b) ,
since B ⊂ D. By 1.2.16a, this implies that, for each λ ∈ C,
B − λ1L2 (a,b) is injective and (B − λ1L2 (a,b) )−1 ⊂ Sλ ,
and hence (cf. 4.2.5a) also that (B − λ1L2 (a,b) )−1 is bounded. This proves that
Apσ(B) = ∅.
(cf. 12.1.7), and hence RB−λ1L2 (a,b) ≠ L2 (a, b) (cf. 10.4.4d). This proves that
All the operators examined above are defined by the same rule (cf. 1.2.1);
actually, they are all restrictions of the operator C. It is therefore clear that their
various properties depend entirely on the domains on which they are defined.
Finally, we examine some “second order” derivation operators. We have the
inclusion {u} ∪ {cn }n∈N ⊂ DBC , where {u} ∪ {cn }n∈N is the c.o.n.s. in L2 (a, b)
defined in 11.2.6, and hence DBC = L2 (a, b) (cf. 10.6.5b). Furthermore, we have
BC ⊂ C † B † ⊂ (BC)† by 12.3.4a. Thus, the operator BC is symmetric. Moreover,
BCu = 0L2 (a,b) and BCcn = (nπ/(b − a))2 cn , ∀n ∈ N.
Similarly, relying on the c.o.n.s. {sn }n∈N defined in 11.2.6, we can prove that the
operator CB is e.s.a., that the elements of {sn }n∈N are eigenvectors of CB, and
that
σp (CB) = σp ((CB)† ) = σ((CB)† ) = σ(CB) = {(nπ/(b − a))2 }n∈N .
Similarly, relying on the c.o.n.s. {eθ un }n∈Z defined above, we can prove that the
operator A2θ is e.s.a. for any θ ∈ [0, 2π), that the elements of {eθ un }n∈Z are eigen-
vectors of A2θ , and that
σp (A2θ ) = σp ((A2θ )† ) = σ((A2θ )† ) = σ(A2θ ) = {((2πn + θ)/(b − a))2 }n∈Z .
All these “second order” derivation operators are defined by the same rule.
Hence, the diversity of their spectra depends on the diversity of their domains.
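This can be illustrated numerically (a sketch added here, not part of the text; Python with NumPy, and assuming (a, b) = (0, 1)): a finite-difference discretization of the "second order" derivation with Dirichlet-type boundary values ϕ(a) = ϕ(b) = 0 reproduces the eigenvalues (nπ/(b − a))2 found above for the operator with eigenvectors {sn }n∈N .

```python
import numpy as np

# discrete -d²/dx² on (0, 1) with Dirichlet boundary conditions:
# tridiagonal matrix (2, -1, -1)/h² on N interior grid points
N = 800
h = 1.0 / (N + 1)
T = (np.diag(2.0 * np.ones(N))
     - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h**2

# the lowest eigenvalues approach (n π)², n = 1, 2, 3, as N grows
eigs = np.sort(np.linalg.eigvalsh(T))[:3]
exact = np.array([(n * np.pi) ** 2 for n in (1, 2, 3)])
assert np.allclose(eigs, exact, rtol=1e-4)
```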
and this implies A−1 ⊂ A† by 12.1.3B, and hence A−1 = A† since DA−1 = RA = H
by the definition of an automorphism.
b ⇒ c: Assuming condition b, we have A† A = 1H since DA = H and AA† = 1RA
(cf. 3.2.6b). Now, A−1 = A† implies that A−1 is closed (cf. 12.1.6a). Hence A is
closed (cf. 4.4.7), hence A is bounded by 12.2.3, hence DA† = H by 12.2.2, hence
RA = DA−1 = H, and hence AA† = 1H .
c ⇒ d: Assuming condition c, A† A = 1H implies DA = H and RA† = H, and
AA† = 1H implies DA† = H (cf. 1.2.13Ab,Ac). Further we have
(f |g) = (A(A† f )|g) = (A† f |A† g) , ∀f, g ∈ H,
= k(U † − λ̄1H )f k2 , ∀λ ∈ C, ∀f ∈ H,
where 12.1.3A and 12.3.3 have been used. This implies
NU−λ1H = NU † −λ̄1H , ∀λ ∈ C.
Proof. In view of 12.4.20A we have −i ∉ σp (A), and hence the operator
A + i1H is injective. We have
D(A+i1H )−1 = RA+i1H and R(A+i1H )−1 = DA+i1H = DA = DA−i1H
(cf. 1.2.11a), and from these equalities we obtain
DV = RA+i1H and RV = RA−i1H .
For each f ∈ DV , we set g := (A + i1H )−1 f (note that DV = D(A+i1H )−1 ); then,
g ∈ DA+i1H = DA = DA−i1H and f = (A + i1H )g, (1)
and also
V f = (A − i1H )(A + i1H )−1 f = (A − i1H )g; (2)
since
k(A ± i1H )gk2 = kAgk2 ± i (Ag|g) ∓ i (g|Ag) + kgk2 = kAgk2 + kgk2
(cf. 12.4.3a), from 2 we have
kV f k = k(A − i1H )gk = k(A + i1H )gk = kf k;
moreover, from 1 and 2 we have
(V − 1H )f = (A − i1H )g − (A + i1H )g = −2ig = −2i(A + i1H )−1 f (3)
and also
(V + 1H )f = 2Ag = 2A(A + i1H )−1 f. (4)
Since
DV −1H = DV = RA+i1H = D(A+i1H )−1 ,
3 implies that
V − 1H = −2i(A + i1H )−1 , (5)
which implies (cf. 1.2.11b) that the operator V − 1H is injective and hence that
1 ∉ σp (V ), and also that
(V − 1H )−1 = −(1/(2i))(A + i1H ).
From R(A+i1H )−1 = DA we have
DA(A+i1H )−1 = D(A+i1H )−1 = DV = DV +1H
(cf. 1.2.13Ad); then, 4 implies that
V + 1H = 2A(A + i1H )−1 . (6)
Now, 5 and 6 imply that
−i(V + 1H )(V − 1H )−1 = A(A + i1H )−1 (A + i1H ) = A,
where the last equality holds because
(A + i1H )−1 (A + i1H ) = 1DA+i1H = 1DA .
Finally, since
kV f k = kf k, ∀f ∈ DV ,
the operator V is unitary iff DV = RV = H (cf. 10.1.20), i.e. iff
RA+i1H = RA−i1H = H,
i.e. iff the operator A is s.a. (cf. 12.4.18).
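In finite dimension every Hermitian matrix is self-adjoint, so the Cayley transform is always unitary there; the identities established in the proof above can then be checked numerically. A sketch (an addition, not part of the text; Python with NumPy, H = C⁴):

```python
import numpy as np

rng = np.random.default_rng(2)

n = 4
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (B + B.conj().T) / 2                 # a self-adjoint (Hermitian) A
I = np.eye(n)

# Cayley transform V = (A - i1)(A + i1)^{-1}
V = (A - 1j * I) @ np.linalg.inv(A + 1j * I)

assert np.allclose(V.conj().T @ V, I)                      # V is unitary
assert not np.any(np.isclose(np.linalg.eigvals(V), 1.0))   # 1 is not an eigenvalue of V
# A is recovered as -i(V + 1)(V - 1)^{-1}
assert np.allclose(-1j * (V + I) @ np.linalg.inv(V - I), A)
```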
The next theorem must be added to what was proved in 4.6.5 (also, cf. 10.3.19)
about the unitary-antiunitary equivalence of operators.
12.5.4 Theorem. Let H1 and H2 be isomorphic Hilbert spaces, let A ∈ O(H1 ) and
B ∈ O(H2 ), and suppose that there exists U ∈ UA(H1 , H2 ) so that B = U AU −1 .
Then:
(a) if DA = H1 then DB = H2 and B † = U A† U −1 ;
(b) if A is symmetric then B is symmetric;
(c) if A is self-adjoint then B is self-adjoint;
(d) if A is essentially self-adjoint then B is essentially self-adjoint.
Proof. We already know that B(H) is a Banach algebra over C (cf. 4.3.5). The
definition of the mapping ι of the statement is consistent because
A† ∈ B(H), ∀A ∈ B(H),
by 12.2.2. Now we prove that the mapping ι of the statement has all the properties
listed in 12.6.1.
c∗1 : this follows from 12.3.1b.
c∗2 : this follows from 12.3.2.
c∗3 : this follows from 12.3.4b.
c∗4 : this follows from 12.1.6b.
c∗5 : For A ∈ B(H) we have
kAf k2 = (Af |Af ) = (f |A† Af ) ≤ kf kkA† Af k ≤ kf kkA† Akkf k, ∀f ∈ H,
by 12.1.3A, 10.1.7a, 4.2.5b, and this proves that kAk2 ≤ kA† Ak. We also have
|(f |A† Ag)| = |(Af |Ag)| ≤ kAf kkAgk ≤ kAk2 kf kkgk, ∀f, g ∈ H,
for the same reasons as above, and this proves that kA† Ak ≤ kAk2 (cf. 10.1.14).
The last two assertions of the statement follow directly from 12.6.2 and from
12.6.3.
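The C*-property c∗5 can be checked numerically for B(H) with H = C⁶, where the operator norm is the largest singular value (a sketch added here, not part of the text; Python with NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)

n = 6
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# operator (spectral) norm = largest singular value
op_norm = lambda M: np.linalg.norm(M, 2)

# the C*-identity ||A† A|| = ||A||²
assert np.isclose(op_norm(A.conj().T @ A), op_norm(A) ** 2)
```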
were bounded, then we should have A, B ∈ B(H) (cf. 12.4.7) and condition HCCR
would be
AB − BA = i1H . (1)
This would imply the equations
An B − BAn = inAn−1 , ∀n ∈ N. (2)
Indeed, 1 is 2 for n = 1 (recall that A0 := 1H , cf. 3.3.1) and, assuming that 2 is
true for a given n ∈ N, we have
An BA − BAn+1 = inAn ,
which in view of 1 can be written as
An (AB − i1H ) − BAn+1 = inAn ,
or
An+1 B − BAn+1 = i(n + 1)An .
This proves 2 by induction. From 2 we would have, by 12.6.3 (also, cf. 12.6.4),
nkAkn−1 = kinAn−1 k ≤ kAkn kBk + kBkkAkn, ∀n ∈ N,
which would imply (note that 1 implies A ≠ OH )
n ≤ 2kAkkBk, ∀n ∈ N,
which is a contradiction.
This also shows that the relation
AB − BA = i1H
(which is clearly stronger than HCCR) is an impossible relation for two self-adjoint
operators A and B, since it would imply DA = DB = H, and hence that both the
operators A and B are bounded (cf. 12.4.7). We mention the fact that there are
pairs of self-adjoint operators which satisfy HCCR (cf. 20.1.3b and 20.1.7).
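In finite dimension there is an even shorter obstruction to AB − BA = i1H , consistent with the norm argument above: the trace of any commutator of matrices vanishes, while tr(i1H ) = i · dim H ≠ 0. A numerical sketch (an addition, not part of the text; Python with NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)

n = 5
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

comm = A @ B - B @ A
# tr(AB - BA) = 0 for every pair of matrices ...
assert np.isclose(np.trace(comm), 0.0)
# ... while tr(i·1) = i·n is never zero
assert not np.isclose(np.trace(1j * np.eye(n)), 0.0)
```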
12.6.6 Remarks.
(a) Here we make some remarks about linear operators in a one-dimensional Hilbert
space which could also be deduced from 10.8.4. Thus, we suppose in what
follows that H is a one-dimensional Hilbert space.
Since {0H } and H are the only linear manifolds in H, the domain of every non-
trivial linear operator in H must be H. Moreover, every linear operator in H
is bounded (cf. 10.8.3). Thus, the family of non-trivial linear operators in H is
B(H).
For α ∈ C, we define the mapping
Aα : H → H
f ↦ Aα f := αf.
(b) For a one-dimensional Hilbert space H, theorem 12.5.3 on the Cayley transform
of a self-adjoint operator can be rephrased as follows, in view of what was seen
in remark a:
for all x ∈ R, (x − i)/(x + i) ∈ T and (x − i)/(x + i) ≠ 1;
if we define the function
ϕ : R → T − {1}
x ↦ ϕ(x) := (x − i)/(x + i),
then we have
x = −i(ϕ(x) + 1)/(ϕ(x) − 1), ∀x ∈ R,
and hence the function ϕ is injective.
Of course, all this can be proved directly, without going through 12.5.3. The
name of Cayley transform was originally given to the function ϕ.
Now, let z ∈ T − {1} and write z = exp iθ with 0 < θ < 2π. Then,
−i(z + 1)/(z − 1) = −i(exp(iθ/2) + exp(−iθ/2))/(exp(iθ/2) − exp(−iθ/2)) = −cos(θ/2)/sin(θ/2) ∈ R
and
ϕ(−cos(θ/2)/sin(θ/2)) = (cos(θ/2) + i sin(θ/2))/(cos(θ/2) − i sin(θ/2)) = exp(iθ/2)/exp(−iθ/2) = exp iθ = z.
This proves that the function ϕ is a bijection from R onto T − {1} and that its
inverse is the function
ψ : T − {1} → R
z ↦ ψ(z) := −i(z + 1)/(z − 1).
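The bijection ϕ and its inverse ψ can be verified numerically on a grid of real points (a sketch added here, not part of the text; Python with NumPy):

```python
import numpy as np

x = np.linspace(-50.0, 50.0, 1001)
phi = (x - 1j) / (x + 1j)        # ϕ(x) = (x - i)/(x + i)

assert np.allclose(np.abs(phi), 1.0)       # ϕ(x) ∈ T
assert not np.any(np.isclose(phi, 1.0))    # ϕ(x) ≠ 1

psi = -1j * (phi + 1) / (phi - 1)          # ψ(z) = -i(z + 1)/(z - 1)
assert np.allclose(psi, x)                 # ψ inverts ϕ on R
```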
12.6.7 Proposition. Let X be a non-empty set. For the Banach algebra FB (X)
(cf. 4.3.6a), the mapping
ι : FB (X) → FB (X)
ϕ ↦ ι(ϕ) := ϕ̄
is defined consistently, and FB (X) is a C ∗ -algebra with this mapping as involution.
If A is a σ-algebra on X, the Banach algebra MB (X, A) (cf. 6.2.29) is a C ∗ -
algebra with the restriction ιMB (X,A) as involution.
If a distance is defined on X, the Banach algebra CB (X) (cf. 4.3.6b) is a C ∗ -
algebra with the restriction ιCB (X) as involution.
Proof. It is obvious that the mapping ι is defined consistently and that it satisfies
all the conditions listed in 12.6.1. For instance, as to condition c∗5 we have, for all
ϕ ∈ FB (X),
[|ϕ(x)| ≤ kϕk∞ , ∀x ∈ X] ⇒
[|ϕ̄(x)ϕ(x)| = |ϕ(x)|2 ≤ kϕk2∞ , ∀x ∈ X] ⇒ kϕ̄ϕk∞ ≤ kϕk2∞
and
[|ϕ(x)|2 = |ϕ̄(x)ϕ(x)| ≤ kϕ̄ϕk∞ , ∀x ∈ X] ⇒ kϕk∞ ≤ (kϕ̄ϕk∞ )1/2 ⇒ kϕk2∞ ≤ kϕ̄ϕk∞ .
The restrictions ιMB (X,A) and ιCB (X) are defined because MB (X, A) and CB (X)
are subsets of FB (X). Moreover, ι(MB (X, A)) ⊂ MB (X, A) (cf. 6.2.17),
ι(CB (X)) ⊂ CB (X), and it is obvious that ιMB (X,A) and ιCB (X) have the prop-
erties of an involution.
Chapter 13
In the first half of this chapter we study orthogonal projections, which are the
building blocks of unitary and of self-adjoint operators, as the spectral theorems
show. Orthogonal projections enter our formulation of the spectral theorems in
the guise of projection valued measures, which we study in the second half of this
chapter.
Throughout this chapter, H denotes an abstract Hilbert space.
δM : H → M × M ⊥
f ↦ δM (f ) := (f1 , f2 ) if (f1 , f2 ) ∈ M × M ⊥ is such that f = f1 + f2 .
πM : M × M ⊥ → M
(f, g) ↦ πM (f, g) := f,
and we call orthogonal projection onto M the composition of πM with δM , i.e. the
mapping PM defined by PM := πM ◦ δM . Thus, PM is a mapping from H to M .
However, it is convenient to consider H instead of M as the final set of the mapping
PM (cf. 1.2.1). Clearly, the mapping PM can be defined directly as follows:
PM : H → H
f ↦ PM f := f ′ if f ′ ∈ M and f − f ′ ∈ M ⊥ .
13.1.4 Remark.
(a) For every projection A in H we have, in view of 13.1.3c,
RA ∈ S (H) and A = PRA .
We also have, in view of 13.1.3c,e,
PR⊥A = 1H − A and hence R⊥A = R1H −A .
(b) In view of 13.1.3c, the mapping
S (H) ∋ M 7→ PM ∈ P(H)
is injective (if M, N ∈ S (H) are such that PM = PN then M = RPM = RPN =
N ) and hence it is bijective from S (H) onto P(H), and the mapping
P(H) ∋ A 7→ RA ∈ S (H)
is defined consistently and it is the inverse of the preceding mapping (cf. remark
a).
in view of 13.1.3c. This proves that A is symmetric (cf. 12.4.3) and hence self-
adjoint (since DA = H) and also that A2 = A.
b ⇒ a: We assume condition b. For every f ∈ H, we have obviously
f = Af + (f − Af )
and
Af ∈ RA and hence Af ∈ R̄A ;
we also have
(f − Af |Ag) = (A(f − Af )|g) = (Af − A2 f |g) = 0, ∀g ∈ H,
and this proves that f − Af ∈ R⊥A and hence f − Af ∈ (R̄A )⊥ (cf. 10.2.11); since
R̄A ∈ S (H) (cf. 3.2.2a and 4.1.12), all this can be written as
δR̄A (f ) = (Af, f − Af ),
and this implies Af = PR̄A f . This proves that A = PR̄A , and hence condition
a.
13.1.7 Remarks.
(a) For any normed space X, an operator A ∈ OE (X) is called a projection in X if
A = A2 . From 13.1.5 we see that a projection in H is an orthogonal projection
iff it is self-adjoint. Besides, from 13.1.6 we see that a projection in H is an
orthogonal projection iff it is bounded with norm not greater than one.
The only projections in H that we consider in this book are orthogonal projec-
tions. For this reason, we may sometimes use the word projection to mean an
orthogonal projection.
(b) The plan for the proof of b ⇒ a in 13.1.5 was suggested by the fact that if
A ∈ P(H) then A = PRA = PR̄A (cf. 13.1.4.a). We point out that we could
not have set out to prove the equation A = PRA , because we did not know yet
that RA was a subspace. However, we did know that R̄A was a subspace and
therefore it was sensible to set out to prove that A = PR̄A .
The plan for the proof of b ⇒ a in 13.1.6 was suggested by the fact that if
A ∈ P(H) then 1H − A = PR⊥A (cf. 13.1.4a), and hence N1H −A = R⊥⊥A = R̄A
(cf. 13.1.3b and 10.4.4a), and hence A = PR̄A = PM if we write M := N1H −A
(cf. 13.1.4a). We point out that the first thing we proved was that N1H −A was
a subspace.
(c) In view of 13.1.5, for each A ∈ P(H) we have
(f |Af ) = (f |A2 f ) = (Af |Af ) = kAf k2 , ∀f ∈ H.
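The characterizations in 13.1.5–13.1.7 are easy to verify numerically in finite dimension. The sketch below (an addition, not part of the text; Python with NumPy, H = C⁶) builds the orthogonal projection onto a two-dimensional subspace M from a matrix Q with orthonormal columns spanning M, and checks idempotence, self-adjointness, unit operator norm, and the identity of remark c:

```python
import numpy as np

rng = np.random.default_rng(5)

n, k = 6, 2
Q, _ = np.linalg.qr(rng.normal(size=(n, k)) + 1j * rng.normal(size=(n, k)))
P = Q @ Q.conj().T    # orthogonal projection onto the column span M of Q

assert np.allclose(P @ P, P)                    # P² = P
assert np.allclose(P, P.conj().T)               # P† = P
assert np.isclose(np.linalg.norm(P, 2), 1.0)    # operator norm ||P|| = 1

f = rng.normal(size=n) + 1j * rng.normal(size=n)
# remark c: (f|Pf) = ||Pf||²
assert np.isclose(np.vdot(f, P @ f), np.linalg.norm(P @ f) ** 2)
```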
13.1.8 Theorem. Let H1 and H2 be isomorphic Hilbert spaces, let A ∈ P(H1 ) and
B ∈ O(H2 ), and suppose that there exists U ∈ UA(H1 , H2 ) so that B = U AU −1 .
Then B ∈ P(H2 ). In fact,
U (RA ) ∈ S (H2 ) and B = PU(RA ) .
Proof. Assume DA closed and A closed, let P denote the orthogonal projection
onto DA , and consider the operator AP . Clearly, DAP = H. Moreover, if two
vectors f, g of H and a sequence {fn } in H are so that
fn → f and AP fn → g
then
P fn → P f
13.1.11 Corollary. If H is separable then for every A ∈ P(H)− {OH } there exists
a countable o.n.s. {ui }i∈I in H so that
X
Af = (ui |f ) ui , ∀f ∈ H.
i∈I
13.1.12 Definition. For u ∈ H̃ (for H̃, cf. 10.9.4) we write Au := PV {u} . In view
of 13.1.10, we have
DAu = H and Au f = (u|f ) u, ∀f ∈ H.
The operator Au is said to be a one-dimensional projection in H.
13.1.13 Remarks.
(a) If u, v ∈ H̃ are such that u ÷ v (for the relation ÷ in H, cf. 10.9.1) then there
exists z ∈ T such that u = zv, and hence
Au f = (zv|f ) zv = z̄z (v|f ) v = (v|f ) v = Av f, ∀f ∈ H,
i.e. Au = Av . Conversely, if u, v ∈ H̃ are such that Au = Av then in particular
u = Au u = Av u = (v|u) v,
and hence u ÷ v. Therefore, the mapping
Ĥ ∋ [u] 7→ Au ∈ P(H)
(for Ĥ, cf. 10.9.4) can be defined consistently and it is injective; hence, it is a
bijection from Ĥ onto the family of all one-dimensional projections in H.
(b) If H1 and H2 are isomorphic Hilbert spaces, for u ∈ H̃1 and U ∈ UA(H1 , H2 )
we have U Au U −1 = AUu . Indeed,
U Au U −1 f = U ((u|U −1 f ) u) = (u|U −1 f ) U u = (U u|f ) U u = AUu f, ∀f ∈ H,
if U is unitary, and
U Au U −1 f = U ((u|U −1 f ) u) = (U −1 f |u) U u = (U u|f ) U u = AUu f, ∀f ∈ H,
if U is antiunitary.
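The one-dimensional projection Au and the phase invariance of remark a can be checked numerically (a sketch added here, not part of the text; Python with NumPy, H = C⁵):

```python
import numpy as np

rng = np.random.default_rng(6)

n = 5
u = rng.normal(size=n) + 1j * rng.normal(size=n)
u = u / np.linalg.norm(u)                  # a unit vector u
f = rng.normal(size=n) + 1j * rng.normal(size=n)

# A_v f = (v|f) v; np.vdot conjugates its first argument
A_u = lambda v, w: np.vdot(v, w) * v

z = np.exp(1j * 0.7)                       # a phase z ∈ T
assert np.allclose(A_u(u, f), A_u(z * u, f))       # A_u = A_{zu}
assert np.allclose(A_u(u, A_u(u, f)), A_u(u, f))   # A_u is idempotent
```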
In this section we examine some conditions under which orthogonal projections can
be constructed out of other orthogonal projections. The bijection between S (H)
and P(H) examined in 13.1.4b translates relations between subspaces into relations
between orthogonal projections. Examples of this can be found in 13.2.1, 13.2.4,
13.2.8, 13.2.9.
Then,
f ∈ M ⇒ f = PM f = PM PN f + PM (1H − PN )f
= PM PN f + PM PN ⊥ f ∈ (M ∩ N ) + (M ∩ N ⊥ ).
Since the inclusion (M ∩ N ) + (M ∩ N ⊥ ) ⊂ M is obvious, this proves the equation
M = (M ∩ N ) + (M ∩ N ⊥ ).
Assuming condition b, we also have PN PM ⊥ = PM ⊥ PN (cf. the second remark in
the statement), and this implies (proceeding as above) the equation
M ⊥ = (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ).
In view of 10.4.2a, this proves the equation
H = (M ∩ N ) + (M ∩ N ⊥ ) + (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ). (1)
Now, in view of 10.2.10b and 10.2.13, we have
M ∩ N ⊥ ⊂ N ⊥ ⊂ (M ∩ N )⊥ and (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ) ⊂ M ⊥ ⊂ (M ∩ N )⊥ ,
and hence
(M ∩ N ⊥ ) + (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ) ⊂ (M ∩ N )⊥ . (2)
In view of 10.2.15, 1 and 2 imply condition c.
c ⇒ d: Assuming condition c, by 10.4.2a we have
H = (M ∩ N ) + (M ∩ N ⊥ ) + (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ),
and hence
H = (M ∩ N ) + (M ∩ N ⊥ ) + M ⊥ (3)
since (M ⊥ ∩ N ) + (M ⊥ ∩ N ⊥ ) ⊂ M ⊥ . Now,
(M ∩ N ) + (M ∩ N ⊥ ) ⊂ M = (M ⊥ )⊥ (4)
(cf. 10.4.4a). In view of 10.2.15, 3 and 4 imply that
(M ∩ N ) + (M ∩ N ⊥ ) = (M ⊥ )⊥ = M.
d ⇒ (a and e): Assuming condition d we have that
∀g ∈ M, ∃!(g1 , g2 ) ∈ (M ∩ N ) × (M ∩ N ⊥ ) so that g = g1 + g2
(the uniqueness of (g1 , g2 ) as above follows from g1 ∈ N and g2 ∈ N ⊥ , cf. 10.4.1).
This and 10.4.1 imply that, for every f ∈ H,
∃!(f1 , f2 , f3 ) ∈ (M ∩ N ) × (M ∩ N ⊥ ) × M ⊥ so that f = f1 + f2 + f3 ;
then we have:
δM∩N (f ) = (f1 , f2 + f3 )
For any M, N ∈ S (H) we have M ∩ N ∈ S (H) (cf. 4.1.10) and hence we can
consider the orthogonal projection PM∩N . If PN PM = PM PN then 13.2.1 proves
that PM∩N = PN PM . The next theorem shows how PM∩N can be obtained from
PM and PN in the general case. In its proof, we follow von Neumann faithfully
(Neumann, 1950).
since
(PM PN )h (PN PM )k = PM (PN PM )h−1 PN (PN PM )k
= PM (PN PM )k+h−1 = A2k+2h−1
and
(PM PN )h PM (PN PM )k = PM (PN PM )h (PN PM )k = A2k+2h+1 .
This proves that, for all m, n ∈ N and all f ∈ H,
(Am f |An f ) = (Am+n−s f |f ) ,
with s = 1 if m and n have the same parity and s = 0 if m and n have different
parity, and hence that
kAm f − An f k2 = (Am f |Am f ) + (An f |An f ) − (Am f |An f ) − (An f |Am f )
= (A2m−1 f |f ) + (A2n−1 f |f ) − 2 (Am+n−s f |f ) (5)
= (A2m−1 f |f ) + (A2n−1 f |f ) − 2 (A2km,n −1 f |f ) ,
with km,n ∈ N such that 2km,n − 1 = m + n − s (note that m + n − s is always odd).
Moreover, for all i ∈ N and all f ∈ H, we have
(A2i−1 f |f ) = (Ai f |Ai f ) = kAi f k2 and
(A2i+1 f |f ) = (Ai+1 f |Ai+1 f ) = kAi+1 f k2 ;
now, Ai+1 f = PM Ai f if i is even and Ai+1 f = PN Ai f if i is odd; therefore (cf.
13.1.3d or 13.1.6), in any case,
kAi+1 f k ≤ kAi f k.
This proves that, for every f ∈ H, the sequence of non-negative real numbers
{(A2i−1 f |f )} is monotone non-increasing, and hence that it is convergent, and
hence (cf. 2.6.2) that
∀ε > 0, ∃Nε ∈ N s.t. Nε < i, j ⇒ | (A2i−1 f |f ) − (A2j−1 f |f ) | < ε;
from this and from 5 we have that
∀ε > 0, ∃Nε ∈ N s.t. Nε < m, n ⇒
kAm f − An f k2 ≤ | (A2m−1 f |f ) − (A2km,n −1 f |f ) | +
| (A2n−1 f |f ) − (A2km,n −1 f |f ) | < 2ε
(note that Nε < m, n implies Nε < km,n ); since H is a complete metric space, this
proves that the sequence {An f } is convergent.
Thus, we can define the mapping
A : H → H
f ↦ Af := limn→∞ An f.
It is easy to see that the mapping A is a linear operator by the continuity of vector
sum and of scalar multiplication. Further, we have
(Af |Af ) = limn→∞ (An f |An f ) = limn→∞ (A2n−1 f |f ) = (limn→∞ A2n−1 f |f ) = (Af |f ) , ∀f ∈ H,
by the continuity of inner product and by 2.1.7b (in relation to the subsequence
{A2n−1 f } of the sequence {An f }). In view of 12.4.3, this proves that the operator
A is symmetric, and hence self-adjoint since DA = H. Then, from the equation
above we also have
(A2 f |f ) = (Af |f ) , ∀f ∈ H,
which proves the equation A2 = A, in view of 10.2.12. Thus (cf. 13.1.5), the
operator A is an orthogonal projection.
Now, in view of 13.1.3c we have
f ∈ M ∩ N ⇒ [PM f = f and PN f = f ] ⇒
[An f = f, ∀n ∈ N] ⇒ Af = f ⇒ f ∈ RA ,
and conversely
f ∈ RA ⇒ f = Af = limn→∞ A2n f ⇒ PN f = limn→∞ PN A2n f = limn→∞ A2n f = f ⇒ f ∈ N
as well as
f ∈ RA ⇒ f = Af = limn→∞ A2n+1 f ⇒ PM f = limn→∞ PM A2n+1 f = limn→∞ A2n+1 f = f ⇒ f ∈ M
(we have used 2.1.7b in relation to the subsequences {A2n f } and {A2n+1 f } of the
sequence {An f }). This proves that RA = M ∩ N , and hence that A = PM∩N (cf.
13.1.4a).
Finally we have, as already noted,
PM∩N f = Af = limn→∞ A2n f = limn→∞ (PN PM )n f, ∀f ∈ H.
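Von Neumann's formula can be watched at work in R³ (a numerical sketch added here, not part of the text; Python with NumPy). With M the plane spanned by e1, e2 and N the plane spanned by e1, e2 + e3, the intersection M ∩ N is the span of e1, and the two projections do not commute:

```python
import numpy as np

def proj(cols):
    # orthogonal projection onto the column span of the given matrix
    Q, _ = np.linalg.qr(cols)
    return Q @ Q.T

PM = proj(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]))  # M = span{e1, e2}
PN = proj(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]))  # N = span{e1, e2+e3}

assert not np.allclose(PM @ PN, PN @ PM)   # the projections do not commute

f = np.array([2.0, 3.0, 5.0])
g = f.copy()
for _ in range(200):                        # iterate (P_N P_M)^n f
    g = PN @ (PM @ g)

# the limit is the projection of f onto M ∩ N = span{e1}
assert np.allclose(g, [2.0, 0.0, 0.0], atol=1e-8)
```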
whence
PM PN + PM PN PM = 2PM PN and PM PN PM + PN PM = 2PN PM ,
whence
PM PN = PN PM .
Substituting PM PN for PN PM in 6, we obtain PM PN = PN .
b ⇒ c: In view of 13.1.5 and 12.3.4b, condition b implies
PN = PN† = (PM PN )† = PN† PM† = PN PM .
c ⇒ d: Assuming condition c, in view of 13.1.3d or 13.1.6 we have
kPN f k = kPN PM f k ≤ kPM f k, ∀f ∈ H,
and hence condition d by 13.1.7c.
d ⇒ e: Assuming condition d, in view of 13.1.3c and 13.1.7c we have
f ∈ N ⇒ kf k = kPN f k ≤ kPM f k ⇒ kf k = kPM f k ⇒ f ∈ M,
where the second implication is true because kPM f k ≤ kf k for all f ∈ H (cf. 13.1.3d
or 13.1.6).
e ⇒ (a and f ): Assuming condition e, 10.4.3 implies that
∀g ∈ M, ∃!(g1 , g2 ) ∈ N × (M ∩ N ⊥ ) so that g = g1 + g2 .
This and 10.4.1 imply that, for every f ∈ H,
∃!(f1 , f2 , f3 ) ∈ N × (M ∩ N ⊥ ) × M ⊥ so that f = f1 + f2 + f3 ;
then we have:
δM∩N ⊥ (f ) = (f2 , f1 + f3 )
since f1 ∈ N = N ⊥⊥ ⊂ (M ∩ N ⊥ )⊥ and f3 ∈ M ⊥ ⊂ (M ∩ N ⊥ )⊥ , and hence
f1 + f3 ∈ (M ∩ N ⊥ )⊥ ;
δN (f ) = (f1 , f2 + f3 )
since f2 ∈ M ∩ N ⊥ ⊂ N ⊥ and f3 ∈ M ⊥ ⊂ N ⊥ , and hence f2 + f3 ∈ N ⊥ ;
δM (f ) = (f1 + f2 , f3 )
since f1 ∈ N ⊂ M and f2 ∈ M ∩ N ⊥ ⊂ M , and hence f1 + f2 ∈ M ; therefore we
have
PM∩N ⊥ f = f2 = (f1 + f2 ) − f1 = PM f − PN f = (PM − PN )f.
This proves the equation PM − PN = PM∩N ⊥ , which obviously implies condition a.
Finally, the equation PM − PN = PM∩N ⊥ is equivalent to the equation
RPM −PN = M ∩ N ⊥ , in view of 13.1.3c and 13.1.4a.
for N, M ∈ S (H), N ≤ M if N ⊂ M.
The l.u.b. and the g.l.b. exist for every family {Mi }i∈I of elements of S (H), and
they are
sup{Mi }i∈I = V (∪i∈I Mi ) and inf{Mi }i∈I = ∩i∈I Mi
(the first equation is proved by 4.1.11a,b,c). In view of the bijection existing between
S (H) and P(H) (cf. 13.1.4b), we can obviously define a partial ordering in P(H)
as follows:
for P, Q ∈ P(H), P ≤ Q if RP ⊂ RQ .
and
inf{Pi}i∈I = PM̌ if M̌ := ∩_{i∈I} RPi.
13.2.6 Remark. Let {Pi }i∈I be an arbitrary family of elements of P(H). Since
inf{Pi }i∈I ≤ Pk for all k ∈ I, we have
need not hold true, and indeed they do not in general, not even when the elements
of the family are so that Pi Pj = Pj Pi for all i, j ∈ I, as is shown by the family
of one-dimensional projections {Au1 , Au2 } with {u1 , u2 } an o.n.s. in H; in fact,
inf{Au1 , Au2 } = OH , while for the vector f := u1 + u2 we have (f |Aui f ) = 1 for
i = 1, 2. In 13.2.7 we prove that statement 7 is true if the family {Pi }i∈I is closed
under multiplication, i.e. if the product Pi Pj belongs to the family for all i, j ∈ I.
This result is important in the theory of projection valued measures (it is used in
the proof of 13.4.2).
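The counterexample mentioned above can be verified numerically. The sketch below is our own code: it takes the o.n.s. {u1, u2} in R^2, for which inf{Au1, Au2} = OH, while both quadratic forms at f := u1 + u2 equal 1, so the quadratic form of the infimum projection is not the infimum of the quadratic forms.

```python
# Our own 2-dimensional check of the counterexample with an o.n.s. {u1, u2}.
def rank_one(u, f):
    """Orthogonal projection onto span{u}: (u|f) u, for real vectors."""
    c = sum(a * b for a, b in zip(u, f))
    return tuple(c * a for a in u)

def form(u, f):
    """Quadratic form (f | A_u f)."""
    return sum(a * b for a, b in zip(f, rank_one(u, f)))

u1, u2 = (1.0, 0.0), (0.0, 1.0)
f = (1.0, 1.0)                          # f = u1 + u2

assert form(u1, f) == 1.0 and form(u2, f) == 1.0
# A_u1 A_u2 = O_H, so inf{A_u1, A_u2} = O_H, whose quadratic form at f is 0,
# not inf{1, 1} = 1:
assert rank_one(u1, rank_one(u2, f)) == (0.0, 0.0)
```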
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 400
13.2.7 Theorem. Let {Pi }i∈I be a family of elements of P(H) and suppose that
∀i, j ∈ I, ∃k ∈ I such that Pi Pj = Pk .
Then,
∃!P ∈ P(H) so that (f |P f ) = inf{(f |Pi f )}i∈I , ∀f ∈ H.
This unique orthogonal projection P is the orthogonal projection inf{Pi }i∈I .
13.2.8 Theorem. For a sequence {Mn } in S (H), the following conditions are
equivalent (we write Pn := PMn , ∀n ∈ N):
(a) the series Σ_{n=1}^∞ Pn f is convergent for all f ∈ H and the mapping
P : H → H, f ↦ Pf := Σ_{n=1}^∞ Pn f
is an orthogonal projection;
(b) Σ_{n=1}^∞ ‖Pn f‖² ≤ ‖f‖², ∀f ∈ H;
(c) Pi Pk = OH if i ≠ k;
(d) Mk ⊂ Mi⊥ if i ≠ k.
If the above conditions are satisfied, the subset of H defined by
{f ∈ H : there exists a sequence {fn} in ∪_{n=1}^∞ Mn such that
fn ∈ Mn for all n ∈ N, Σ_{n=1}^∞ fn is convergent, f = Σ_{n=1}^∞ fn}
is called the orthogonal sum of the sequence of subspaces {Mn} and is denoted by
the symbol Σ_{n=1}^{∞⊕} Mn.
If the above conditions are satisfied, then:
(e) RP = Σ_{n=1}^{∞⊕} Mn = V(∪_{n=1}^∞ Mn), and hence Σ_{n=1}^{∞⊕} Mn ∈ S(H);
(f) if β is a bijection from N onto N then Σ_{n=1}^∞ Pβ(n) f = Pf, ∀f ∈ H.
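Conditions c, d and statement f can be illustrated in a finite-dimensional model (our own code; pairwise-disjoint coordinate sets play the role of the mutually orthogonal subspaces Mn): mutually orthogonal coordinate projections sum, in any order, to the projection onto the orthogonal sum of their ranges.

```python
# Finite-dimensional sketch (our own notation) of conditions c, d and statement f.
def proj(S, f):
    return tuple(x if i in S else 0 for i, x in enumerate(f))

def add(f, g):
    return tuple(a + b for a, b in zip(f, g))

E = [{0}, {1, 2}, {3}]             # pairwise disjoint: Mk ⊂ Mi⊥ for i != k
f = (1.0, 2.0, 3.0, 4.0)

s = (0,) * 4
for Ek in E:
    s = add(s, proj(Ek, f))
# The sum is the projection onto the union (the orthogonal sum of the Mn):
assert s == proj({0, 1, 2, 3}, f)

# Reordering the summands (statement f, with a bijection beta) changes nothing:
t = (0,) * 4
for Ek in reversed(E):
    t = add(t, proj(Ek, f))
assert t == s
```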
(cf. 13.1.7c). By 12.4.3, this proves that the operator P is symmetric, and hence
self-adjoint since DP = H. We also have
(Pf|Pf) = Σ_{i=1}^∞ Σ_{k=1}^∞ (Pi f|Pk f) = Σ_{i=1}^∞ (Pi f|Pi f)
= Σ_{i=1}^∞ ‖Pi f‖² = (f|Pf), ∀f ∈ H,
and hence
(f|P²f) = (f|Pf), ∀f ∈ H,
since fk ∈ Mk ⊂ Mi⊥ = NPi if i ≠ k, and hence Pi fk = 0H if i ≠ k, while Pi fi = fi
holds for all i ∈ N (cf. 13.1.3b,c). In view of 13.1.3c, this proves the inclusion
Σ_{n=1}^{∞⊕} Mn ⊂ RP and hence the equation RP = Σ_{n=1}^{∞⊕} Mn. This equation implies
Σ_{n=1}^{∞⊕} Mn ∈ S(H) (cf. 13.1.4a). Next, the inclusion Σ_{n=1}^{∞⊕} Mn ⊂ V(∪_{n=1}^∞ Mn) is
obvious since V(∪_{n=1}^∞ Mn) is a subspace and it contains ∪_{n=1}^∞ Mn (also, cf. 2.3.4).
On the other hand, the inclusion Mk ⊂ Σ_{n=1}^{∞⊕} Mn is obvious for all k ∈ N; then
the inclusion V(∪_{n=1}^∞ Mn) ⊂ Σ_{n=1}^{∞⊕} Mn follows from 4.1.11c since Σ_{n=1}^{∞⊕} Mn is a
subspace.
Finally, if β is a bijection from N onto N, then the series Σ_{n=1}^∞ Pβ(n) f is convergent and
Σ_{n=1}^∞ Pβ(n) f = Σ_{n=1}^∞ Pn f = Pf, ∀f ∈ H,
by 10.4.9 since (Pi f|Pk f) = 0 if i ≠ k.
13.2.9 Corollary. For a finite family {M1 , ..., MN } of subspaces of H, the following
conditions are equivalent (we write Pn := PMn , ∀n ∈ {1, ..., N }):
(a) Σ_{n=1}^N Pn ∈ P(H);
(b) Pi Pk = OH if i ≠ k;
(c) Mk ⊂ Mi⊥ if i ≠ k.
If the above conditions are satisfied, the subset Σ_{n=1}^{N⊕} Mn of H defined by
Σ_{n=1}^{N⊕} Mn := M1 + ... + MN
(cf. 3.1.8) is called the orthogonal sum of the family of subspaces {M1, ..., MN} and
the following equations are true (we write P := Σ_{n=1}^N Pn):
(d) RP = Σ_{n=1}^{N⊕} Mn = V(∪_{n=1}^N Mn), and hence Σ_{n=1}^{N⊕} Mn ∈ S(H).
13.2.10 Remarks.
(a) If the conditions in 13.2.8 are satisfied then
Σ_{n=1}^{∞⊕} Mn = {f ∈ H : there exists a sequence {fn} in ∪_{n=1}^∞ Mn such that
fn ∈ Mn for all n ∈ N, Σ_{n=1}^∞ ‖fn‖² < ∞, f = Σ_{n=1}^∞ fn}.
This follows immediately from 10.4.7. If f ∈ Σ_{n=1}^{∞⊕} Mn then the sequence {fn}
such that fn ∈ Mn and f = Σ_{n=1}^∞ fn is unique. In fact, suppose that {gn} is
another sequence such that gn ∈ Mn and f = Σ_{n=1}^∞ gn. Then,
fk = Σ_{n=1}^∞ Pk fn = Pk Σ_{n=1}^∞ fn = Pk f = Pk Σ_{n=1}^∞ gn = Σ_{n=1}^∞ Pk gn = gk, ∀k ∈ N,
where we have used the continuity of Pk (cf. 13.1.3d) and the equations
Pk fn = δk,n fn and Pk gn = δk,n gn, ∀k, n ∈ N,
which follow from 13.1.3b,c.
Similarly, if the conditions in 13.2.9 are satisfied then, for f ∈ Σ_{n=1}^{N⊕} Mn, the
N-tuple {f1, ..., fN} such that fn ∈ Mn and f = Σ_{n=1}^N fn is unique.
(b) If the conditions in 13.2.8 are satisfied, the orthogonal projection P is called
the series of the sequence of projections {Pn} and is denoted by the symbol
Σ_{n=1}^∞ Pn, i.e. one writes Σ_{n=1}^∞ Pn := P. However, unless
(cf. 13.1.3d); this implies that the sequence {Σ_{k=1}^n Pk} is not a Cauchy sequence
in the normed space B(H), and hence that it is not convergent (cf. 2.6.2).
(c) From 13.2.8f we have that the projection Σ_{i∈I} Pi can be defined unambiguously
for any countable family {Pi}i∈I of projections such that Pi Pj = OH if i ≠ j.
Indeed, if I is denumerable, we define
Σ_{i∈I} Pi := Σ_{n=1}^∞ Pi(n)
E ↦ μ_f^Q(E) := (f|Q(E)f) (= ‖Q(E)f‖²)
g: For every f ∈ H, the function μ_f^{P0} has property af1 of 7.1.1 in view of result
μ_f^{P0}(∪_{n=1}^N En) = 0, ∀f ∈ H,
by 7.1.2b, and this implies P0(∪_{n=1}^N En) = OH.
(a) P is a p.v.m. on A;
(b) μ_f^P is a measure on A and μ_f^P(X) = ‖f‖², ∀f ∈ H;
(c) μ_u^P is a probability measure on A (i.e. μ_u^P is a measure on A and μ_u^P(X) = 1),
∀u ∈ H̃.
b ⇒ c: This is obvious.
c ⇒ a: Assuming condition c, for every finite and disjoint family {E1 , ..., En } of
elements of A we have
(u|P(∪_{k=1}^n Ek)u) = μ_u^P(∪_{k=1}^n Ek) = Σ_{k=1}^n μ_u^P(Ek)
= Σ_{k=1}^n (u|P(Ek)u) = (u|Σ_{k=1}^n P(Ek)u), ∀u ∈ H̃,
and hence P(∪_{k=1}^n Ek) = Σ_{k=1}^n P(Ek) by 10.2.12. We also have
(u|P(X)u) = μ_u^P(X) = 1 = (u|1H u), ∀u ∈ H̃,
c: If P(En) = OH then μ_f^P(En) = 0 for all f ∈ H, and hence
μ_f^P(∪_{n=1}^∞ En) = 0, ∀f ∈ H,
by 7.1.4a, and this implies P(∪_{n=1}^∞ En) = OH.
of elements of S s.t. E = ∪_{n=1}^N En
Proof. From 7.3.2 we have that, for every f ∈ H, there exists a measure μf on
A(A0) which is an extension of μ_f^{P0} and which is defined by
An ∩ Bl ∈ A0, ∀(n, l) ∈ N × N,
(Am ∩ Bk) ∩ (An ∩ Bl) = ∅ if (m, k) ≠ (n, l),
E ⊂ (∪_{n=1}^∞ An) ∩ (∪_{l=1}^∞ Bl) = ∪_{(n,l)∈N×N} (An ∩ Bl),
(Σ_{n=1}^∞ P0(An))(Σ_{l=1}^∞ P0(Bl))f = Σ_{n=1}^∞ Σ_{l=1}^∞ P0(An)P0(Bl)f
= Σ_{(n,l)∈N×N} P0(An ∩ Bl)f, ∀f ∈ H
(f|P(E)f) = μf(E) = μ_f^{P0}(E) = (f|P0(E)f), ∀f ∈ H, ∀E ∈ A0,
hence μ_f^{P̃} = μf for each f ∈ H by the uniqueness asserted in 7.3.2 for a σ-finite
premeasure (μ_f^{P0} is finite since μ_f^{P0}(X) = ‖f‖²). Therefore we have
(f|P̃(E)f) = μ_f^{P̃}(E) = μf(E) = (f|P(E)f), ∀f ∈ H, ∀E ∈ A(A0),
P (E) = P̃ (E), ∀E ∈ S.
Then P = P̃ .
instance) to A0(S) and μ_f^P is a measure on A(S) (note that A0(S) ⊂ A(S) since
A(S) = A(A0(S)), cf. 6.1.18). Then, 13.4.2 implies that there exists a unique
p.v.m. on A(A0(S)), i.e. on A(S), which extends P0. Therefore, P = P̃.
(q1) for every finite and disjoint family {E1, ..., En} of elements of S such that
∪_{k=1}^n Ek ∈ S,
Q(∪_{k=1}^n Ek) = Σ_{k=1}^n Q(Ek);
μ_f^Q; hence, condition q4 implies that μ_f^{P0} satisfies condition a of 7.1.6 (if E, F ∈ S
and F ⊂ E then μ_f^Q(F) ≤ μ_f^Q(E) since μ_f^Q is the restriction of an additive function,
cf. 7.1.2a); since μ_f^{P0}(X) = ‖f‖² < ∞, this implies that μ_f^{P0} is a premeasure (cf.
7.1.6). Then, 13.4.2 implies that there exists a unique p.v.m. P on A(A0(S))
which extends P0. Since A(A0(S)) = A(S) (cf. 6.1.18), P is a p.v.m. on A(S).
The uniqueness of P follows from 13.4.3.
13.5.2 Proposition. Let (X1 , A1 ) and (X2 , A2 ) be measurable spaces and let P be
a p.v.m. on the σ-algebra A1 ⊗ A2 (which is a σ-algebra on X1 × X2 , cf. 6.1.28).
Then the mappings
P1 : A1 → P(H)
E1 7→ P1 (E1 ) := P (E1 × X2 )
and
P2 : A2 → P(H)
E2 7→ P2 (E2 ) := P (X1 × E2 )
are projection valued measures, they commute, and
P1 (E1 )P2 (E2 ) = P (E1 × E2 ), ∀E1 ∈ A1 , ∀E2 ∈ A2
(recall that E1 × E2 ∈ A1 ⊗ A2 for all E1 ∈ A1 and E2 ∈ A2 , cf. 6.1.30a).
Proof. For every family {E1,i}i∈I of elements of A1 we have (∪_{i∈I} E1,i) × X2 =
∪_{i∈I} (E1,i × X2). For E1, F1 ∈ A1, if E1 ∩ F1 = ∅ then (E1 × X2) ∩ (F1 × X2) = ∅.
Then, it is obvious that P1 has the properties of a p.v.m. on A1 since P is a p.v.m.
on A1 ⊗ A2 . And similarly for P2 . From property 13.3.2d of P we have that P1 and
P2 commute. Finally, for all E1 ∈ A1 and E2 ∈ A2 ,
P1 (E1 )P2 (E2 ) = P (E1 × X2 )P (X1 × E2 )
= P ((E1 × X2 ) ∩ (X1 × E2 )) = P (E1 × E2 ),
by property 13.3.2c of P .
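The marginal construction of 13.5.2 can be sketched on finite sets (the code and names below are ours): identifying each projection with the set of points of X1 × X2 whose coordinates it keeps, composition of commuting projections becomes intersection of index sets, and P1(E1)P2(E2) = P(E1 × E2) is immediate.

```python
# Our own sketch on finite sets of the marginals of a product p.v.m.
X1, X2 = {"a", "b"}, {0, 1, 2}

def P(E):
    """The p.v.m. of the sketch: a subset E of X1 x X2 indexes a coordinate projection."""
    return set(E)

def P1(E1):
    """Marginal: P1(E1) := P(E1 x X2)."""
    return P({(x1, x2) for x1 in E1 for x2 in X2})

def P2(E2):
    """Marginal: P2(E2) := P(X1 x E2)."""
    return P({(x1, x2) for x1 in X1 for x2 in E2})

E1, E2 = {"a"}, {0, 2}
# Composition of commuting coordinate projections = intersection of index sets:
assert P1(E1) & P2(E2) == P({(x1, x2) for x1 in E1 for x2 in E2})
# and the marginals commute:
assert P1(E1) & P2(E2) == P2(E2) & P1(E1)
```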
13.5.3 Theorem. Let (X1 , d1 ) and (X2 , d2 ) be complete and separable metric
spaces, let P1 be a p.v.m. on the Borel σ-algebra A(d1 ), let P2 be a p.v.m. on
the Borel σ-algebra A(d2 ) (both P1 and P2 with values in P(H)), and suppose that
P1 and P2 commute. Then there exists a p.v.m. P on the σ-algebra A(d1 ) ⊗ A(d2 )
(which is the same as the Borel σ-algebra A(d1 × d2 ), where d1 × d2 denotes the
product distance on X1 × X2 , cf. 6.1.31, 2.7.1, 2.7.2) such that
P (E1 × E2 ) = P1 (E1 )P2 (E2 ), ∀E1 ∈ A(d1 ), ∀E2 ∈ A(d2 ).
The p.v.m. P is the unique p.v.m. on A(d1 ) ⊗ A(d2 ) such that
P (E1 × X2 ) = P1 (E1 ), ∀E1 ∈ A(d1 ), and P (X1 × E2 ) = P2 (E2 ), ∀E2 ∈ A(d2 ).
The p.v.m. P is called the product of P1 and P2 .
We can assume that the sets E1,k and E2,k are non-empty for all k ∈ {1, ..., n}.
Then we have Ep = ∪_{k=1}^n Ep,k for p = 1, 2. Since every σ-algebra is a semialgebra,
6.1.4 implies that, for p = 1, 2, there exists a finite and disjoint family {Fp,j}j∈Jp
of elements of A(dp) so that
∀k ∈ {1, ..., n}, ∃Jp,k ⊂ Jp such that Ep,k = ∪_{j∈Jp,k} Fp,j.
Then we have
E1,k × E2,k = ∪_{(i,j)∈J1,k×J2,k} (F1,i × F2,j), ∀k ∈ {1, ..., n}. (14)
Clearly, we can assume that Fp,j is non-empty for all j ∈ Jp and for p = 1, 2. Then,
the condition
(E1,k × E2,k) ∩ (E1,h × E2,h) = ∅ if k ≠ h
and 14 imply the condition
(J1,k × J2,k) ∩ (J1,h × J2,h) = ∅ if k ≠ h. (15)
Moreover, we can assume Ep = ∪_{j∈Jp} Fp,j for p = 1, 2 (if this is not already true,
we can replace Jp with ∪_{k=1}^n Jp,k). Then we have
∪_{(i,j)∈J1×J2} (F1,i × F2,j) = E1 × E2 = ∪_{k=1}^n (E1,k × E2,k)
= ∪_{k=1}^n ∪_{(i,j)∈J1,k×J2,k} (F1,i × F2,j). (16)
Now, the inclusion ∪_{k=1}^n (J1,k × J2,k) ⊂ J1 × J2 is obvious. For (i, j) ∈ J1 × J2, let
(x1, x2) ∈ F1,i × F2,j; then 16 implies that there exist k ∈ {1, ..., n} and (l, m) ∈
J1,k × J2,k so that (x1, x2) ∈ F1,l × F2,m; since the families {F1,j}j∈J1 and {F2,j}j∈J2
are disjoint, this forces i = l and j = m, and hence (i, j) ∈ J1,k × J2,k. Therefore
J1 × J2 = ∪_{k=1}^n (J1,k × J2,k).
q2 : Let E1 ×E2 and F1 ×F2 be elements of S such that (E1 ×E2 )∩(F1 ×F2 ) = ∅.
Then at least one of the two conditions
E1 ∩ F1 = ∅ and E2 ∩ F2 = ∅
is true, and hence (cf. 13.3.2a,c) at least one of the two conditions
P1 (E1 )P1 (F1 ) = P1 (E1 ∩ F1 ) = OH and P2 (E2 )P2 (F2 ) = P2 (E2 ∩ F2 ) = OH
is true, and hence
Q(E1 × E2 )Q(F1 × F2 ) = P1 (E1 )P2 (E2 )P1 (F1 )P2 (F2 )
= P1 (E1 )P1 (F1 )P2 (E2 )P2 (F2 ) = OH .
q3 : We have X1 × X2 ∈ S and Q(X1 × X2 ) = P1 (X1 )P2 (X2 ) = 1H 1H = 1H .
q4: We fix f ∈ H, E1 × E2 ∈ S, ε ∈ (0, ∞). For i = 1, 2, in view of the fact that
the measure μ_f^{Pi} is finite and the metric space (Xi, di) is complete and separable,
there exists a compact subset Fi of Ei such that μ_f^{Pi}(Ei) − μ_f^{Pi}(Fi) < ε/2.
Now, F1 × F2 ∈ S, F1 × F2 ⊂ E1 × E2, F1 × F2 is compact in the metric space
(X1 × X2, d1 × d2) (cf. 2.8.10), and hence F1 × F2 coincides with its closure (cf. 2.8.6). We have
E1 × E2 = (E1 × (E2 − F2 )) ∪ ((E1 − F1 ) × F2 ) ∪ (F1 × F2 ),
E1 − F1 ∈ A(d1 ),
E2 − F2 ∈ A(d2 );
then, by property q1 of Q already proved,
Q(E1 × E2 ) = Q(E1 × (E2 − F2 )) + Q((E1 − F1 ) × F2 ) + Q(F1 × F2 ),
and hence
μ_f^Q(E1 × E2) − μ_f^Q(F1 × F2) = μ_f^Q(E1 × (E2 − F2)) + μ_f^Q((E1 − F1) × F2);
moreover,
μ_f^Q(E1 × (E2 − F2)) = ‖P1(E1)P2(E2 − F2)f‖² ≤ ‖P2(E2 − F2)f‖²
= μ_f^{P2}(E2 − F2) = μ_f^{P2}(E2) − μ_f^{P2}(F2) < ε/2
and
μ_f^Q((E1 − F1) × F2) = ‖P1(E1 − F1)P2(F2)f‖² = ‖P2(F2)P1(E1 − F1)f‖²
≤ ‖P1(E1 − F1)f‖² = μ_f^{P1}(E1 − F1) = μ_f^{P1}(E1) − μ_f^{P1}(F1) < ε/2,
and therefore
|μ_f^Q(E1 × E2) − μ_f^Q(F1 × F2)| < ε.
Thus, the mapping Q satisfies all the conditions of 13.4.4, and hence there exists a
unique p.v.m. P on A(S) which is an extension of Q. Now, A(S) = A(d1 ) ⊗ A(d2 )
(cf. 6.1.30a).
Finally, suppose that P̃ is a p.v.m. on A(d1 ) ⊗ A(d2 ) such that
P̃ (E1 × X2 ) = P1 (E1 ), ∀E1 ∈ A(d1 ), and P̃ (X1 × E2 ) = P2 (E2 ), ∀E2 ∈ A(d2 ).
Then, in view of 13.3.2c,
P̃(E1 × E2) = P̃((E1 × X2) ∩ (X1 × E2)) = P̃(E1 × X2)P̃(X1 × E2)
= P1(E1)P2(E2) = Q(E1 × E2), ∀E1 ∈ A(d1), ∀E2 ∈ A(d2),
and hence P̃ is an extension of Q, and hence P̃ = P .
Our version of the spectral theorem for self-adjoint operators (cf. 15.2.1) relates
self-adjoint operators to projection valued measures on the Borel σ-algebra A(dR )
(which is a σ-algebra on R, cf. 6.1.22 and 2.1.4). In other books, this theorem is
often phrased so that it relates self-adjoint operators to spectral families. These
two versions of the spectral theorem are completely equivalent. In this section we
prove the equivalence of the notions of a spectral family and of a p.v.m. on A(dR ).
However, the results of this section are not needed in other parts of the present
book.
13.6.2 Proposition. Let P be a p.v.m. on the σ-algebra A(dR ). Then the mapping
T : R → P(H)
x 7→ T (x) := P ((−∞, x])
is a spectral family.
Proof. We prove that the mapping T has all the properties of a spectral family.
sf1 : This follows immediately from 13.3.2e.
sf2: We fix x ∈ R and f ∈ H. We have obviously
(−∞, x + 1/(n+1)] ⊂ (−∞, x + 1/n], ∀n ∈ N, and (−∞, x] = ∩_{n=1}^∞ (−∞, x + 1/n],
where the equalities hold by 13.1.7c since T(x + δn) − T(x) and T(x + 1/nε) − T(x)
are orthogonal projections (cf. sf1 and 13.2.4).
sf3: We fix f ∈ H. By 13.3.2a and 13.3.6b we have
0H = P(∅)f = lim_{n→∞} P((−∞, −n])f = lim_{n→∞} T(−n)f.
Now, let {xn} be a sequence in R such that xn → −∞ as n → ∞, and fix ε > 0. Let nε ∈ N
be such that
‖T(−nε)f‖ < ε
and let Nε ∈ N be such that
n > Nε ⇒ xn < −nε.
Then, for n > Nε we have T(xn) ≤ T(−nε) in view of property sf1, and hence
‖T(xn)f‖² = (f|T(xn)f) ≤ (f|T(−nε)f) = ‖T(−nε)f‖² < ε².
sf4: We fix f ∈ H. By property pvm2 of P and by 13.3.6a we have
f = P(R)f = lim_{n→∞} P((−∞, n])f = lim_{n→∞} T(n)f.
Now, let {xn} be a sequence in R such that xn → ∞ as n → ∞, and fix ε > 0. Let nε ∈ N
be such that
‖f − T(nε)f‖ < ε
and let Nε ∈ N be such that
n > Nε ⇒ xn > nε.
Then, for n > Nε we have T(nε) ≤ T(xn) in view of property sf1, and hence
‖f − T(xn)f‖² = (f|f − T(xn)f) ≤ (f|f − T(nε)f) = ‖f − T(nε)f‖² < ε²,
where the equalities hold by 13.1.7c because 1H − T(xn) and 1H − T(nε) are
orthogonal projections (cf. 13.1.3e).
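For a p.v.m. concentrated on finitely many points of R, the spectral family T(x) = P((−∞, x]) is a right-continuous step function, and properties sf1–sf4 can be checked directly. The discrete sketch below is our own code, identifying each projection with the index set of its range.

```python
# Our own discrete sketch: a p.v.m. concentrated on the points lam_i of R,
# with T(x) := P((-inf, x]) the projection onto the coordinates with lam_i <= x.
lam = [-1.0, 0.5, 0.5, 2.0]        # points carrying the p.v.m., one per coordinate

def T(x):
    """Index set of the range of P((-inf, x])."""
    return {i for i, l in enumerate(lam) if l <= x}

assert T(-2.0) == set()            # sf3: T(x) -> O_H as x -> -inf
assert T(3.0) == {0, 1, 2, 3}      # sf4: T(x) -> 1_H as x -> +inf
assert T(0.0) <= T(1.0)            # sf1: monotonicity
assert T(0.5) == T(0.5 + 1e-9)     # sf2: right-continuity at a jump point
assert T(0.5 - 1e-9) != T(0.5)     # but not left-continuity
assert T(2.0) - T(0.5) == {3}      # range of P((a, b]) = T(b) - T(a)
```

The last assertion is the discrete form of the link P((a, b]) = T(b) − T(a) used in the uniqueness argument of 13.6.3.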
13.6.3 Theorem. Let T be a spectral family. Then there exists a unique p.v.m. P
on the σ-algebra A(dR ) such that
T (x) = P ((−∞, x]), ∀x ∈ R.
= (f|Q((−∞, b])f), ∀b ∈ R;
μf((a, ∞)) = lim_{n→∞} Ff(n) − Ff(a) = (f|f) − Ff(a) = (f|Q((a, ∞))f), ∀a ∈ R
(we have lim_{n→∞} Ff(−n) = 0 by property sf3 of T and lim_{n→∞} Ff(n) = (f|f) by
property sf4 of T). Thus, we have
μf(E) = (f|Q(E)f), ∀E ∈ S.
Now we can prove that the mapping Q satisfies conditions q1 , q2 , q3 of 13.4.1.
q1: For every finite and disjoint family {E1, ..., En} of elements of S such that
∪_{k=1}^n Ek ∈ S, we have
(f|Q(∪_{k=1}^n Ek)f) = μf(∪_{k=1}^n Ek) = Σ_{k=1}^n μf(Ek)
= Σ_{k=1}^n (f|Q(Ek)f) = (f|Σ_{k=1}^n Q(Ek)f), ∀f ∈ H,
and hence
Q(∪_{k=1}^n Ek) = Σ_{k=1}^n Q(Ek),
by 10.2.12.
the restrictions of μf to A0(S) (note that A0(S) ⊂ A(dR) since A(dR) = A(S) =
A(A0(S)), cf. 6.1.25 and 6.1.18) owing to the uniqueness asserted in 7.3.1, since
the restrictions of μ_f^{P0} and of μf to S are the same (both of them are equal to
μ_f^Q). Hence, μ_f^{P0} is a premeasure on A0(S). Then, 13.4.2 implies that there exists
a unique p.v.m. P on A(A0(S)) = A(dR) which is an extension of P0. Thus, P is
also an extension of Q and we have in particular
P((−∞, x]) = Q((−∞, x]) = T(x), ∀x ∈ R.
Finally, suppose that P̃ is a p.v.m. on A(dR ) such that
P̃ ((−∞, x]) = T (x), ∀x ∈ R.
Then we have
P̃ ((−∞, a]) + P̃ ((a, ∞)) = P̃ (R) = 1H , ∀a ∈ R,
and hence
P̃ ((a, ∞)) = 1H − T (a), ∀a ∈ R.
We also have, for all a, b ∈ R so that a < b,
P̃ ((a, b]) = P̃ ((−∞, b] ∩ (a, ∞)) = P̃ ((−∞, b])P̃ ((a, ∞))
= T (b)(1H − T (a)) = T (b) − T (a)
(cf. 13.3.2c). This proves that
P̃ (E) = Q(E) = P (E), ∀E ∈ S,
and this implies P̃ = P by 13.4.3.
13.6.4 Remark. Some define a spectral family replacing “continuity from the
right” in sf2 with “continuity from the left” (defined in an obvious way). Clearly
the two definitions are not the same, but they are equivalent in the following sense:
the spectral theorem (in the formulation in which spectral families instead of pro-
jection valued measures are used) says that for any given self-adjoint operator there
exists a unique spectral family for each type (i.e. either continuous from the right
or continuous from the left) so that the operator “is the integral of the function ξ
(cf. 11.3.2) with respect to that family”. Actually, in order to prove the existence of
a spectral family which does this trick, one could dispense with condition sf2 altogether
and only require condition sf1 in the definition of a spectral family. However, the
spectral family (thus redefined) associated to a given self-adjoint operator would
not be unique. In a way, right continuity or left continuity are “normalization con-
ditions”. Obviously, a p.v.m. P on A(dR ) determines and is determined uniquely
by a “left continuous” spectral family T in a way similar to what was seen in 13.6.2
and in 13.6.3, and the link condition is
T (x) = P ((−∞, x)), ∀x ∈ R.
Chapter 14
The spectral theorems for unitary and for self-adjoint operators will be presented
in the next chapter. They consist in the representation of a unitary or a self-adjoint
operator as an integral with respect to a projection valued measure. In this chapter
we investigate the idea of an integral with respect to a projection valued measure
and study the properties of this kind of integral.
14.1.1 Theorem. There exists a unique mapping JˆP : MB (X, A) → B(H) such
that:
(a) JˆP (χE ) = P (E), ∀E ∈ A;
(b) JˆP is a linear operator;
(c) JˆP is continuous.
In addition, the following conditions are true:
(d) JˆP(ϕ1ϕ2) = JˆP(ϕ1)JˆP(ϕ2), ∀ϕ1, ϕ2 ∈ MB(X, A);
(e) JˆP(ϕ̄) = (JˆP(ϕ))†, ∀ϕ ∈ MB(X, A);
(f) (f|JˆP(ϕ)f) = ∫_X ϕ dμ_f^P, ∀f ∈ H, ∀ϕ ∈ MB(X, A);
(h) if A ∈ B(H) is so that AP (E) = P (E)A for all E ∈ A, then AJˆP (ϕ) = JˆP (ϕ)A
for all ϕ ∈ MB (X, A).
Proof. We begin with a preliminary remark. For n, m ∈ N, let {α1 , ..., αn } and
{β1 , ..., βm } be families of elements of C, let {E1 , ..., En } and {F1 , ..., Fm } be disjoint
families of elements of A, and suppose that
Σ_{k=1}^n αk χEk = Σ_{l=1}^m βl χFl.
The same proof as the one given in 8.1.1 (with µ replaced by P ) shows that
Σ_{k=1}^n αk P(Ek) = Σ_{l=1}^m βl P(Fl).
Now let ψ ∈ S(X, A). Then there are n ∈ N, a family {α1, ..., αn} of elements of C,
and a disjoint family {E1, ..., En} of elements of A so that ψ = Σ_{k=1}^n αk χEk. We
define the operator
A_ψ^P := Σ_{k=1}^n αk P(Ek),
is a linear operator. Moreover, for every ψ ∈ S(X, A) we have (cf. 13.3.2b, 13.2.9,
10.2.3)
‖A_ψ^P f‖² = Σ_{k=1}^n |αk|² ‖P(Ek)f‖² ≤ ‖ψ‖∞² Σ_{k=1}^n ‖P(Ek)f‖²
= ‖ψ‖∞² ‖P(∪_{k=1}^n Ek)f‖² ≤ ‖ψ‖∞² ‖f‖², ∀f ∈ H,
and hence ‖A_ψ^P‖ ≤ ‖ψ‖∞. This proves that
‖A_ψ^P‖ ≤ ‖ψ‖∞, ∀ψ ∈ S(X, A),
and hence that the linear operator A^P is bounded. Since S(X, A) is dense in
MB(X, A) and B(H) is a Banach space, by 4.2.6 there exists a unique bounded
(and hence continuous) linear operator
JˆP : MB(X, A) → B(H)
which is an extension of A^P, i.e. such that JˆP(ψ) = A_ψ^P for all ψ ∈ S(X, A), and
hence such that
JˆP(χE) = A_{χE}^P = P(E), ∀E ∈ A.
Then
ψ1ψ2 = Σ_{k=1}^n Σ_{l=1}^m αk βl χ_{Ek∩Fl},
whence JˆP(ψ1ψ2) = A_{ψ1ψ2}^P = A_{ψ1}^P A_{ψ2}^P = JˆP(ψ1)JˆP(ψ2).
Now, for ϕ1, ϕ2 ∈ MB(X, A) let {ψ1,n} and {ψ2,n} be sequences in S(X, A) such
that ϕ1 = lim_{n→∞} ψ1,n and ϕ2 = lim_{n→∞} ψ2,n (in the ‖ ‖∞ norm); then ϕ1ϕ2 =
lim_{n→∞} ψ1,nψ2,n in view of 4.3.3, and hence
JˆP(ϕ1ϕ2) (1)= lim_{n→∞} JˆP(ψ1,nψ2,n) = lim_{n→∞} JˆP(ψ1,n)JˆP(ψ2,n)
(2)= (lim_{n→∞} JˆP(ψ1,n))(lim_{n→∞} JˆP(ψ2,n)) (3)= JˆP(ϕ1)JˆP(ϕ2),
JˆP(ψ̄) = A_{ψ̄}^P = Σ_{k=1}^n ᾱk P(Ek) = (Σ_{k=1}^n αk P(Ek))† = (A_ψ^P)† = (JˆP(ψ))†.
(f|JˆP(ψ)f) = (f|A_ψ^P f) = Σ_{k=1}^n αk (f|P(Ek)f) = Σ_{k=1}^n αk μ_f^P(Ek) = ∫_X ψ dμ_f^P, ∀f ∈ H.
in view of conditions a and b, and hence J = Jˆµ by the uniqueness asserted in 4.2.6,
since S(X, A) is dense in MB(X, A) and both J and Jˆµ are continuous.
This shows that there exists a close analogy between the mappings JˆP and Jˆµ.
Owing to this analogy, for ϕ ∈ MB(X, A) the operator JˆP(ϕ) is called the integral
of ϕ with respect to P and it is often denoted as follows:
∫_X ϕ dP := JˆP(ϕ).
On the basis of the results of the previous section, in this section we extend the
notion of an integral with respect to a projection valued measure, to measurable
functions which are not necessarily bounded nor necessarily defined on the whole
of X.
As before, (X, A) denotes an abstract measurable space, H denotes an abstract
Hilbert space, and P denotes a projection valued measure on A with values in
P(H).
It is obvious that, if
Q(x) P-a.e. on E
then (cf. 7.1.9)
Q(x) μ_f^P-a.e. on E, ∀f ∈ H.
whence
|α|P -sup|ϕ| ≤ P -sup|αϕ|;
therefore,
P -sup|αϕ| = |α|P -sup|ϕ|.
For every ψ ∈ L∞ (X, A, P ) we have ϕ + ψ ∈ M(X, A, P ) and ϕψ ∈ M(X, A, P )
by 14.2.3a. Now, let E be as before and let F ∈ A be such that
P (F ) = OH and |ψ(x)| ≤ P -sup|ψ|, ∀x ∈ Dψ − F ;
then E ∪ F ∈ A and P (E ∪ F ) = OH (cf. 13.3.2h), and also
|ϕ(x) + ψ(x)| ≤ |ϕ(x)| + |ψ(x)| ≤ P -sup|ϕ| + P -sup|ψ|,
∀x ∈ (Dϕ − E) ∩ (Dψ − F ) = Dϕ+ψ − (E ∪ F ),
which proves that
ϕ + ψ ∈ L∞ (X, A, P ) and P -sup|ϕ + ψ| ≤ P -sup|ϕ| + P -sup|ψ|;
moreover
|ϕ(x)ψ(x)| = |ϕ(x)||ψ(x)| ≤ (P -sup|ϕ|)(P -sup|ψ|),
∀x ∈ (Dϕ − E) ∩ (Dψ − F ) = Dϕψ − (E ∪ F ),
which proves that
ϕψ ∈ L∞ (X, A, P ) and P -sup|ϕψ| ≤ (P -sup|ϕ|)(P -sup|ψ|).
It is obvious that
ϕ ∈ L∞ (X, A, P ) and P -sup|ϕ| = P -sup|ϕ|.
Finally, suppose that there exists E ∈ A such that E ≠ ∅ and P(E) = OH. Then
the family of functions L∞(X, A, P) is neither an associative algebra nor a linear space,
for the same reason why M(X, A, P) is not (cf. the proof of 8.2.2). Therefore,
the function L∞(X, A, P) ∋ ϕ ↦ P-sup|ϕ| ∈ R cannot be a norm. Moreover
χE ∈ MB(X, A), χE ≠ 0X, and P-sup|χE| = 0; this proves that the function
MB(X, A) ∋ ϕ ↦ P-sup|ϕ| ∈ R is not a norm.
and hence JˆP (ϕe ) = JˆP (ϕ′e ) (cf. 10.2.12). This proves that the mapping J˜P is
defined consistently. It is obvious that J˜P is an extension of JˆP .
Now we prove the conditions listed in the statement. For ϕ ∈ L∞ (X, A, P ), we
denote by ϕe an element of MB (X, A) such that ϕe (x) = ϕ(x) P -a.e. on Dϕ .
a: This follows at once from 14.1.1.a, since J˜P is an extension of JˆP .
b: Let α, β ∈ C and ϕ, ψ ∈ L∞ (X, A, P ). Then, αϕe + βψe ∈ MB (X, A) and
(αϕe + βψe )(x) = (αϕ + βψ)(x) P -a.e. on Dαϕ+βψ
by 14.2.3c, and hence (cf. 14.1.1b)
J˜P (αϕ + βψ) = JˆP (αϕe + βψe ) = αJˆP (ϕe ) + β JˆP (ψe ) = αJ˜P (ϕ) + β J˜P (ψ).
c: Let ϕ, ψ ∈ L∞ (X, A, P ). Then ϕe ψe ∈ MB (X, A) and
(ϕe ψe )(x) = (ϕψ)(x) P -a.e. on Dϕψ
by 14.2.3c, and hence (cf. 14.1.1d)
J˜P (ϕψ) = JˆP (ϕe ψe ) = JˆP (ϕe )JˆP (ψe ) = J˜P (ϕ)J˜P (ψ).
and hence J̃P(ϕ) = J̃P(ϕ′) by 10.2.12. Conversely, if ϕ, ϕ′ ∈ L∞(X, A, P) are such
that J̃P(ϕ) = J̃P(ϕ′), then (cf. conditions b and h)
P-sup|ϕ − ϕ′| = ‖J̃P(ϕ − ϕ′)‖ = ‖J̃P(ϕ) − J̃P(ϕ′)‖ = 0,
and hence (cf. 14.2.5)
ϕ(x) − ϕ′(x) = 0, i.e. ϕ(x) = ϕ′(x), P-a.e. on Dϕ−ϕ′ = Dϕ ∩ Dϕ′.
14.2.8 Remark. We denote by M (X, A, P ) the quotient set defined by the equiv-
alence relation ∼ in M(X, A, P ) (cf. 14.2.3.b). On the basis of 14.2.3a,c, it is easy
to see that M (X, A, P ) becomes an abelian associative algebra if we define
[ϕ] + [ψ] := [ϕ + ψ], ∀[ϕ], [ψ] ∈ M (X, A, P ),
α[ϕ] := [αϕ], ∀α ∈ C, ∀[ϕ] ∈ M (X, A, P ),
[ϕ][ψ] := [ϕψ], ∀[ϕ], [ψ] ∈ M (X, A, P )
(there is a close analogy between M (X, A, P ) and M (X, A, µ), cf. 8.2.13).
We can define a subset of M (X, A, P ) by
L∞ (X, A, P ) := {[ϕ] ∈ M (X, A, P ) : ϕ ∈ L∞ (X, A, P )}.
Indeed, if ϕ ∈ L∞ (X, A, P ) and ϕ′ ∈ [ϕ], let E ∈ A be such that
P (E) = OH and ∃m ∈ [0, ∞) s.t. |ϕ(x)| ≤ m, ∀x ∈ Dϕ − E,
and let F ∈ A be such that
P (F ) = OH and ϕ′ (x) = ϕ(x), ∀x ∈ (Dϕ′ ∩ Dϕ ) − F ;
then,
|ϕ′ (x)| ≤ m, ∀x ∈ ((Dϕ′ ∩ Dϕ ) − F ) ∩ (Dϕ − E) = Dϕ′ − ((X − Dϕ ) ∪ F ∪ E),
and this proves that ϕ′ ∈ L∞ (X, A, P ), in view of 13.3.2h. Thus, the condition
ϕ ∈ L∞ (X, A, P ) is actually a condition for the equivalence class [ϕ] even though
it is expressed through a particular element of the class. On the basis of 14.2.5, it
is easy to see that L∞ (X, A, P ) is a subalgebra (cf. 3.3.2) of M (X, A, P ), and that
it becomes a normed algebra if we define a norm by
k[ϕ]k := P -sup|ϕ|, ∀[ϕ] ∈ L∞ (X, A, P ).
14.2.10 Remarks.
(a) Let ϕ ∈ M(X, A, P ). For each n ∈ N, we define the set
En := |ϕ|−1 ([0, n]),
which is an element of A (cf. 6.2.17 and 6.2.13a with n := 3). It is obvious
that the sequence {χEn ϕ} is ϕ-convergent. This proves that the family of ϕ-
convergent sequences is not empty.
(b) If ψ ∈ L∞(X, A, P) then |ψ(x)| ≤ P-sup|ψ| μ_f^P-a.e. on Dψ for all f ∈ H
(cf. 14.2.5), and hence ψ ∈ L²(X, A, μ_f^P) for all f ∈ H (cf. 8.2.6). Thus, if
ϕ ∈ M(X, A, P) and {ϕn} is a ϕ-convergent sequence, then ϕn ∈ L²(X, A, μ_f^P)
for all f ∈ H and all n ∈ N.
(a) f ∈ DP(ϕ);
(b) the sequence {[ϕn]} is convergent in the Hilbert space L²(X, A, μ_f^P);
(c) the sequence {J̃P(ϕn)f} is convergent in the Hilbert space H.
If these conditions are satisfied, then:
(d) [ϕ] = lim_{n→∞} [ϕn] in the Hilbert space L²(X, A, μ_f^P);
(e) if {ϕ′n} is any ϕ-convergent sequence, then
lim_{n→∞} J̃P(ϕ′n)f = lim_{n→∞} J̃P(ϕn)f.
Proof. a ⇒ (b and d): Since the sequence {ϕn} is ϕ-convergent, for any f ∈ H we
have:
lim_{n→∞} |ϕn(x) − ϕ(x)|² = 0 μ_f^P-a.e. on ∩_{n=1}^∞ (Dϕ ∩ Dϕn) = ∩_{n=1}^∞ Dϕn−ϕ;
∃k1, k2 ∈ [0, ∞) such that
|ϕn(x) − ϕ(x)|² ≤ 2|ϕn(x)|² + 2|ϕ(x)|² ≤ 2(k1 + 1)|ϕ(x)|² + 2k2
μ_f^P-a.e. on Dϕn−ϕ, ∀n ∈ N
this implies
∫_X |ϕ|² dμ_f^P ≤ M
and hence
‖lim_{k→∞} J̃P(ϕk)f − J̃P(ϕ′n)f‖
≤ ‖lim_{k→∞} J̃P(ϕk)f − J̃P(ϕn)f‖ + ‖J̃P(ϕn)f − J̃P(ϕ′n)f‖ → 0 as n → ∞,
which is condition e.
Now, we fix f ∈ H and write gn := P(En)f for each n ∈ N. Then gn = J̃P(χEn)f
(cf. 14.2.7a), and hence (cf. 14.2.12)
μ_{gn}^P(E) = ∫_E χEn dμ_f^P, ∀E ∈ A,
and hence (cf. 8.3.4b and 8.1.17b)
∫_X |ϕ|² dμ_{gn}^P = ∫_X |ϕ|² χEn dμ_f^P ≤ n² ∫_X 1X dμ_f^P = n² μ_f^P(X) < ∞,
14.2.14 Theorem. Let ϕ ∈ M(X, A, P). Then there exists a unique linear operator
JϕP in H such that:
(a) D_{JϕP} = DP(ϕ);
(b) (f|JϕP f) = ∫_X ϕ dμ_f^P, ∀f ∈ DP(ϕ)
since αJ̃P(ϕn)f + βJ̃P(ϕn)g = J̃P(ϕn)(αf + βg), this implies that αf + βg ∈ DP(ϕ)
(cf. 14.2.11, c ⇒ a), and that
JϕP(αf + βg) = αJϕP f + βJϕP g.
This proves that JϕP is a linear operator.
Further we have, for every f ∈ DP(ϕ),
(f|JϕP f) (1)= lim_{n→∞} (f|J̃P(ϕn)f) (2)= lim_{n→∞} ∫_X ϕn dμ_f^P (3)= ∫_X ϕ dμ_f^P,
where 1 holds by 10.1.16c and 2 by 14.2.7e; as to 3, from [ϕ] = lim_{n→∞} [ϕn] in the
Hilbert space L²(X, A, μ_f^P) (cf. 14.2.11, a ⇒ d) we have
|∫_X ϕn dμ_f^P − ∫_X ϕ dμ_f^P| = |∫_X 1X(ϕn − ϕ) dμ_f^P|
≤ (∫_X 1X dμ_f^P)^{1/2} (∫_X |ϕn − ϕ|² dμ_f^P)^{1/2} → 0 as n → ∞
by the Schwarz inequality in L²(X, A, μ_f^P).
This proves that condition b is satisfied.
The uniqueness of the linear operator in H which satisfies conditions a and b
follows from 14.2.13 and 10.2.12.
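A discrete sketch of the operator JϕP (our own construction, not the book's): take X = {1, 2, 3, ...}, P({n}) the projection onto the n-th coordinate of ℓ², and the unbounded function ϕ(n) = n. Then f ∈ DP(ϕ) iff Σ n²|fn|² < ∞, and on the domain condition b reads (f|JϕP f) = Σ n|fn|². With exact rational arithmetic on a truncation:

```python
# Our own discrete sketch of J_phi^P with phi(n) = n on X = {1, 2, 3, ...}.
from fractions import Fraction

N = 200
f = [Fraction(1, n * n) for n in range(1, N + 1)]   # f_n = 1/n^2 (truncated)

# Condition b on the truncation, computed two ways:
lhs = sum(fn * (n * fn) for n, fn in enumerate(f, start=1))   # (f | J_phi f)
rhs = sum(n * fn * fn for n, fn in enumerate(f, start=1))     # int phi d mu_f
assert lhs == rhs

# The domain condition: sum n^2 |f_n|^2 = sum 1/n^2 stays below 2 (it increases
# to pi^2/6), so this f, extended to all of X, lies in D_P(phi):
assert sum(n * n * fn * fn for n, fn in enumerate(f, start=1)) < 2
```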
In what follows, we prove conditions d and e.
d: For every f ∈ DP(ϕ), we have
‖JϕP f‖² (4)= lim_{n→∞} ‖J̃P(ϕn)f‖² (5)= lim_{n→∞} ∫_X |ϕn|² dμ_f^P (6)= ∫_X |ϕ|² dμ_f^P,
where 4 holds by 4.1.6a and 5 by 14.2.7f; as to 6, from [ϕ] = lim_{n→∞} [ϕn] in the
Hilbert space L²(X, A, μ_f^P) we have ‖[ϕ]‖²_{L²(X,A,μ_f^P)} = lim_{n→∞} ‖[ϕn]‖²_{L²(X,A,μ_f^P)}.
e: We have:
f ∈ DP(ϕ) (7)⇒ {J̃P(ϕn)f} is convergent (8)⇒
[{AJ̃P(ϕn)f} is convergent and A lim_{n→∞} J̃P(ϕn)f = lim_{n→∞} AJ̃P(ϕn)f] (9)⇒
[{J̃P(ϕn)Af} is convergent and AJϕP f = lim_{n→∞} J̃P(ϕn)Af] (10)⇒
[Af ∈ DP(ϕ) and AJϕP f = JϕP Af],
where: 7 holds by 14.2.11 (a ⇒ c); 8 holds because A ∈ B(H); 9 holds by 14.2.7g;
10 holds by 14.2.11 (c ⇒ a). Since D_{AJϕP} = DP(ϕ), this proves condition e (cf.
3.2.3 and 3.2.4).
14.2.15 Theorem. For all ϕ ∈ M(X, A, P), the operator JϕP is adjointable and
(JϕP)† = Jϕ̄P.
Proof. For every ϕ ∈ M(X, A, P), 14.2.13 shows that the operator JϕP is adjointable.
Now, let ϕ ∈ M(X, A, P) and let {ϕn} be a ϕ-convergent sequence. The sequence
{ϕ̄n} is obviously ϕ̄-convergent, and hence (cf. 14.2.14c and 14.2.7d)
(Jϕ̄P f|g) = lim_{n→∞} (J̃P(ϕ̄n)f|g) = lim_{n→∞} (f|(J̃P(ϕ̄n))†g)
= lim_{n→∞} (f|J̃P(ϕn)g) = (f|JϕP g), ∀f ∈ DP(ϕ̄), ∀g ∈ DP(ϕ)
For each n ∈ N, define the set En := |ϕ|⁻¹([0, n]), which is an element of A (cf.
14.2.10a), and the vector fn := J̃P(χEn ϕ̄)g (note that χEn ϕ̄ ∈ L∞(X, A, P)); then
(cf. 14.2.12)
μ_{fn}^P(E) = ∫_E χEn |ϕ|² dμ_g^P, ∀E ∈ A,
and hence fn ∈ DP(ϕ). The sequence {χEn ϕ} is ϕ-convergent (cf. 14.2.10a), and
hence
JϕP fn (1)= lim_{k→∞} J̃P(χEk ϕ)J̃P(χEn ϕ̄)g (2)= J̃P(χEn |ϕ|²)g, ∀n ∈ N,
where 1 holds by 14.2.14c and 2 by 14.2.7c, since χEk χEn = χEn if k ≥ n. Then,
(h|fn) = (g|JϕP fn) = (g|J̃P(χEn |ϕ|²)g)
(3)= ∫_X χEn |ϕ|² dμ_g^P (4)= ‖J̃P(χEn ϕ̄)g‖² = ‖fn‖², ∀n ∈ N,
where 3 holds by 14.2.7e and 4 by 14.2.7f. Then the Schwarz inequality yields
‖fn‖² = (h|fn) ≤ ‖h‖‖fn‖, ∀n ∈ N,
and hence
‖fn‖ ≤ ‖h‖, ∀n ∈ N,
and hence
∫_X χEn |ϕ|² dμ_g^P = ‖fn‖² ≤ ‖h‖², ∀n ∈ N,
and hence g ∈ DP(ϕ). This proves the inclusion D_{(JϕP)†} ⊂ DP(ϕ̄), and hence that
(JϕP)† = Jϕ̄P.
Proof. Equivalence of a, b, c: We know that JϕP is closed (cf. 14.2.16) and that
DP (ϕ) = H (cf. 14.2.13). Then, the implication a ⇒ b is true by 4.4.4 and 2.3.9c,
and the implication b ⇒ a is true by 12.2.3. In view of this, the implications a ⇒ c
and b ⇒ c are obvious. The implications c ⇒ a and c ⇒ b are obvious.
a ⇒ d: We prove this by contraposition. We suppose that ϕ ∉ L∞(X, A, P).
Then we have
in fact, if we had
and hence
∫_X |ϕ|² dμ_{fn}^P = ∫_{Fn} |ϕ|² dμ_{fn}^P ≤ (kn + 1)² ∫_{Fn} 1X dμ_{fn}^P < ∞,
where 3 holds by 14.2.14d and 4 by 8.1.17b. Since fn ≠ 0H for all n ∈ N, this proves
that the operator JϕP is not bounded.
d ⇒ (a and e): If ϕ ∈ L∞ (X, A, P ), then ϕn := ϕ for each n ∈ N defines an
obviously ϕ-convergent sequence. In view of 14.2.14c, this proves that JϕP = J˜P (ϕ),
and therefore also that the operator JϕP is bounded.
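The equivalence just proved can be seen concretely in finite dimensions, where every ϕ is P-essentially bounded and the norm of the (then bounded) operator J̃P(ϕ) equals the largest |ϕ(x)| over the atoms of the p.v.m. A small sketch in our own notation:

```python
# Our own finite-dimensional sketch: J_P(phi) is diagonal multiplication by phi,
# and its operator norm is P-sup|phi| = max |phi(x)| over the atoms.
import math

phi = [3.0, -1.0, 2.0]            # phi on X = {0, 1, 2}

def apply(f):
    return [p * x for p, x in zip(phi, f)]

def norm(f):
    return math.sqrt(sum(x * x for x in f))

# The norm is attained at the basis vector of the largest |phi(x)|:
e0 = [1.0, 0.0, 0.0]
assert norm(apply(e0)) / norm(e0) == 3.0
# and no unit vector does better: ||J f|| <= max|phi| ||f||
f = [0.6, 0.8, 0.0]
assert norm(apply(f)) <= 3.0 * norm(f) + 1e-12
```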
D_{JP(ϕ)+JP(ψ)} = DP(ϕ) (2)= DP(ϕ + ψ),
where 2 holds because DP(ψ) = H (cf. 14.2.17).
Next, let {ϕn } be a ϕ-convergent sequence. Then {ϕn + ψ} is a (ϕ + ψ)-
convergent sequence since the condition
∃k1 , k2 ∈ [0, ∞) such that |ϕn (x)|2 ≤ k1 |ϕ(x)|2 + k2 P -a.e. on Dϕ ∩ Dϕn , ∀n ∈ N,
implies that there exist k1 , k2 ∈ [0, ∞) such that
$$\begin{aligned}
|\varphi_n(x)+\psi(x)|^2 &\overset{(3)}{\le} 2|\varphi_n(x)|^2 + 2|\psi(x)|^2 \le 2k_1|\varphi(x)|^2 + 2k_2 + 2|\psi(x)|^2 \\
&\overset{(4)}{\le} 4k_1|\varphi(x)+\psi(x)|^2 + 4k_1|\psi(x)|^2 + 2k_2 + 2|\psi(x)|^2 \\
&\le 4k_1|\varphi(x)+\psi(x)|^2 + (4k_1+2)(P\text{-}\sup|\psi|)^2 + 2k_2
\end{aligned}$$
$P$-a.e. on $D_\varphi \cap D_{\varphi_n} \cap D_\psi = D_{\varphi+\psi} \cap D_{\varphi_n+\psi}$, $\forall n \in \mathbb N$,
where 3 holds by inequality 2 in the proof of 10.3.7 and 4 holds by the inequality
(thereby derived)
|α|2 = |α + β − β|2 ≤ 2|α + β|2 + 2|β|2 , ∀α, β ∈ C.
This yields
$$\begin{aligned}
(J_P(\varphi)+J_P(\psi))f &\overset{(5)}{=} \lim_{n\to\infty}\tilde J_P(\varphi_n)f + \tilde J_P(\psi)f = \lim_{n\to\infty}(\tilde J_P(\varphi_n)+\tilde J_P(\psi))f \\
&\overset{(6)}{=} \lim_{n\to\infty}\tilde J_P(\varphi_n+\psi)f \overset{(7)}{=} J_P(\varphi+\psi)f, \quad \forall f \in D_P(\varphi),
\end{aligned}$$
where: 5 holds by 14.2.14c and 14.2.17e; 6 holds by 14.2.7b; 7 holds by 14.2.14c.
Proof. Let {ψn } be a ϕ-convergent sequence. Then limn→∞ [ψn ] = [ϕ] in the
Hilbert space $L^2(X,\mathcal A,\mu^P_f)$ (cf. 14.2.11), and hence
$$\|J_P(\varphi_n)f - J_P(\psi_n)f\|^2 \overset{(1)}{=} \|J_P(\varphi_n-\psi_n)f\|^2 \overset{(2)}{=} \int_X|\varphi_n-\psi_n|^2\,d\mu^P_f = \|[\varphi_n]-[\psi_n]\|^2_{L^2(X,\mathcal A,\mu^P_f)} \xrightarrow[n\to\infty]{} 0,$$
where 1 holds by 14.3.1 and 2 by 14.2.14d. Then,
$$\|J_P(\varphi_n)f - J_P(\varphi)f\| \le \|J_P(\varphi_n)f - J_P(\psi_n)f\| + \|J_P(\psi_n)f - J_P(\varphi)f\| \xrightarrow[n\to\infty]{} 0$$
since JP (ϕ)f = limn→∞ JP (ψn )f (cf. 14.2.14c and 14.2.17e).
Proof. We have
$$f \in D_P(\varphi)\cap D_P(\psi) \Rightarrow \varphi,\psi \in L^2(X,\mathcal A,\mu^P_f) \Rightarrow \varphi+\psi \in L^2(X,\mathcal A,\mu^P_f) \Rightarrow f \in D_P(\varphi+\psi),$$
and hence
$$[\varphi+\psi] = \lim_{n\to\infty}[\varphi_n+\psi_n] \quad \text{in the Hilbert space } L^2(X,\mathcal A,\mu^P_f).$$
Now, $J_P(\varphi_n+\psi_n) = \tilde J_P(\varphi_n+\psi_n) = \tilde J_P(\varphi_n) + \tilde J_P(\psi_n)$ (cf. 14.2.17e and 14.2.7b), and hence
$$J_P(\varphi+\psi)f = \lim_{n\to\infty}(\tilde J_P(\varphi_n)+\tilde J_P(\psi_n))f$$
(cf. 14.2.14c).
Moreover,
$$\tilde J_P(\psi_n) = \hat J_P(\psi_n) = \sum_{k=2}^{n2^n}\frac{k-1}{2^n}\,P\!\left(\varphi^{-1}\!\left(\left[\frac{k-1}{2^n},\frac{k}{2^n}\right)\right)\right) + n\,P(\varphi^{-1}([n,\infty])), \quad \forall n \in \mathbb N$$
(cf. 14.2.18).
We point out that, if ϕ ∈ L∞ (X, A, P ) is such that 0 ≤ ϕ(x) P -a.e. on Dϕ , then
for the sequence {ψn } considered above we have
P -sup|ψn − ϕ| → 0 as n → ∞
and hence
$$\mu^P_g(E) = \|P(E)J_P(\varphi)f\|^2 = \left\|P(E)\lim_{n\to\infty}\tilde J_P(\varphi_n)f\right\|^2 = \lim_{n\to\infty}\|P(E)\tilde J_P(\varphi_n)f\|^2 = \lim_{n\to\infty}\|\tilde J_P(\chi_E\varphi_n)f\|^2 = \lim_{n\to\infty}\int_X|\chi_E\varphi_n|^2\,d\mu^P_f$$
and
Thus we have
f ∈ DJP (ψ)JP (ϕ) ⇔ [f ∈ DP (ϕ) and f ∈ DP (ψϕ)],
or DJP (ψ)JP (ϕ) = DP (ϕ) ∩ DP (ψϕ).
Now we prove the part of the statement about the operators. We note that
from the part of the statement about the domains we have DJP (ψ)JP (ϕ) ⊂ DP (ψϕ).
Thus, we need to prove that JP (ψ)JP (ϕ)f = JP (ψϕ)f for all f ∈ DJP (ψ)JP (ϕ) .
First we assume ψ ∈ L∞ (X, A, P ). If {ϕn } is a ϕ-convergent sequence then
the sequence {ψϕn } is (ψϕ)-convergent, as can be seen easily. Hence for every
f ∈ DP (ϕ) we have f ∈ DJP (ψ)JP (ϕ) (in view of 14.2.17) and
$$J_P(\psi)J_P(\varphi)f \overset{(1)}{=} \tilde J_P(\psi)\lim_{n\to\infty}\tilde J_P(\varphi_n)f \overset{(2)}{=} \lim_{n\to\infty}\tilde J_P(\psi)\tilde J_P(\varphi_n)f \overset{(3)}{=} \lim_{n\to\infty}\tilde J_P(\psi\varphi_n)f \overset{(4)}{=} J_P(\psi\varphi)f;$$
where: 1 holds by 14.2.17e and 14.2.14c, since f ∈ DP (ϕ); 2 holds because J˜P (ψ)
is continuous; 3 holds by 14.2.7c; 4 holds by 14.2.14c, since f ∈ DP (ψϕ).
Next, let ψ be any element of M(X, A, P ), let {ψn } be a ψ-convergent sequence,
and let $f \in D_{J_P(\psi)J_P(\varphi)}$; this implies $f \in D_P(\varphi)\cap D_P(\psi\varphi)$, or $\varphi \in L^2(X,\mathcal A,\mu^P_f)$
and $\psi\varphi \in L^2(X,\mathcal A,\mu^P_f)$; since $\psi_n \in L^\infty(X,\mathcal A,P)$, we have also $\psi_n\varphi \in L^2(X,\mathcal A,\mu^P_f)$
for all $n \in \mathbb N$. Since $\{\psi_n\}$ is a $\psi$-convergent sequence, we have
$$\lim_{n\to\infty}|\psi_n(x)\varphi(x) - \psi(x)\varphi(x)|^2 = 0 \quad \mu^P_f\text{-a.e. on } D_{\psi\varphi}\cap\bigcap_{n=1}^{\infty}D_{\psi_n\varphi},$$
In view of 14.3.3 (recall that f ∈ DP (ψϕ) and f ∈ DP (ψn ϕ) for all n ∈ N), this
yields
lim JP (ψn ϕ)f = JP (ψϕ)f ;
n→∞
moreover, JP (ψn ϕ)f = JP (ψn )JP (ϕ)f in view of what was proved above (since
ψn ∈ L∞ (X, A, P ) and f ∈ DP (ϕ)), and hence
lim JP (ψn ϕ)f = lim JP (ψn )JP (ϕ)f = JP (ψ)JP (ϕ)f
n→∞ n→∞
Proof. This follows at once from 14.3.9, since ϕ ∈ L∞ (X, A, P ) entails DP (ϕ) = H
(cf. 14.2.17) and hence DJP (ψ)JP (ϕ) = DP (ψϕ).
where 1 holds by 14.2.7a and 14.2.17e, and 2 holds by 14.3.10; similarly, letting
ψn := χEn ψ, we have ψn ∈ L∞ (X, A, P ) and
Moreover, we have
and
or (note that $\varphi_n+\psi_n \in L^\infty(X,\mathcal A,P)$ implies $\varphi_n+\psi_n \in L^2(X,\mathcal A,\mu^P_f)$, cf. 14.2.10b)
Thus, we have constructed a sequence {gn } in DP (ϕ) ∩ DP (ψ) = DJP (ϕ)+JP (ψ)
which is such that
This implies f ∈ DJP (ϕ)+JP (ψ) (cf. 4.4.10). Since f was an arbitrary element of
DP (ϕ + ψ), we have
Proof. From 14.2.16 and 14.3.9 we have that JP (ψϕ) is a closed extension of
JP (ψ)JP (ϕ). Therefore, the operator JP (ψ)JP (ϕ) is closable (cf. 4.4.11b) and
(cf. 4.4.10)
$$\overline{J_P(\psi)J_P(\varphi)} \subset J_P(\psi\varphi).$$
Now we fix f ∈ DP (ψϕ). For each n ∈ N, we define the set
En := |ϕ|−1 ([0, n]),
which is an element of A, and we define the vector gn := P (En )f . In the proof of
14.2.13, we saw that f = limn→∞ gn and that gn ∈ DP (ϕ) for each n ∈ N. Letting
ϕn := χEn ϕ, we have ϕn ∈ L∞ (X, A, P ) and
$$J_P(\varphi)g_n \overset{(1)}{=} J_P(\varphi)J_P(\chi_{E_n})f \overset{(2)}{=} J_P(\varphi_n)f, \quad \forall n \in \mathbb N,$$
where 1 holds by 14.2.7a and 14.2.17e, and 2 holds by 14.3.10. Moreover, we have
$$\lim_{n\to\infty}|\psi(x)\varphi_n(x) - \psi(x)\varphi(x)|^2 = 0, \quad \forall x \in D_{\psi\varphi},$$
and
|ψ(x)ϕn (x) − ψ(x)ϕ(x)|2 ≤ 4|ψ(x)ϕ(x)|2 , ∀x ∈ Dψϕ , ∀n ∈ N;
then, by 8.2.11 (recall that $\psi\varphi \in L^2(X,\mathcal A,\mu^P_f)$ since $f \in D_P(\psi\varphi)$) we have
$$\lim_{n\to\infty}\int_X|\psi\varphi_n - \psi\varphi|^2\,d\mu^P_f = 0,$$
or ($|\psi(x)\varphi_n(x)| \le |\psi(x)\varphi(x)|$, $\forall x \in D_{\psi\varphi}$, implies $\psi\varphi_n \in L^2(X,\mathcal A,\mu^P_f)$)
This implies f ∈ DJP (ψ)JP (ϕ) (cf. 4.4.10). Since f was an arbitrary element of
DP (ψϕ), we have
DP (ψϕ) ⊂ DJP (ψ)JP (ϕ) ,
and hence
$$\overline{J_P(\psi)J_P(\varphi)} = J_P(\psi\varphi).$$
14.3.13 Remark. For every ϕ ∈ M(X, A, P ) and every α ∈ C − {0}, the operator
αJP (ϕ) is closed; this is true because αJP (ϕ) = JP (αϕ) (cf. 14.3.5 and 14.2.16), but
more generally because αA is closed for any closed operator A and every α ∈ C − {0}
(as can be seen easily). If α = 0 then αJP (ϕ) is closed iff ϕ ∈ L∞ (X, A, P ) (cf.
14.2.17, 14.2.13, 4.4.3, 4.4.4).
Proof. a: First we point out that ϕ−1 ({0}) ∈ ADϕ ⊂ A. Then, for f ∈ H we have
$$\begin{aligned}
f \in N_{J_P(\varphi)} &\overset{(1)}{\Leftrightarrow} [f \in D_P(\varphi) \text{ and } \|J_P(\varphi)f\| = 0] \Leftrightarrow \int_X|\varphi|^2\,d\mu^P_f = 0 \\
&\overset{(2)}{\Leftrightarrow} \varphi(x) = 0\ \mu^P_f\text{-a.e. on } D_\varphi \overset{(3)}{\Leftrightarrow} \mu^P_f(D_\varphi - \varphi^{-1}(\{0\})) = 0 \\
&\Leftrightarrow P(D_\varphi - \varphi^{-1}(\{0\}))f = 0_H \Leftrightarrow f = P(X)f \overset{(4)}{=} P(D_\varphi)f = P(\varphi^{-1}(\{0\}))f \\
&\overset{(5)}{\Leftrightarrow} f \in R_{P(\varphi^{-1}(\{0\}))},
\end{aligned}$$
where: 1 holds by definition of DP (ϕ) and by 14.2.14d; 2 holds by 8.1.18a; 3 holds
because ϕ−1 ({0}) ∈ ADϕ (cf. the last part of 7.1.10); 4 holds because P (X − Dϕ ) =
OH ; 5 holds by 13.1.3c.
b ⇔ c: This follows from a and from 3.2.6a.
c ⇔ d: This is true (by an argument similar to the argument used in the last
part of 7.1.10) because ϕ−1 ({0}) ∈ ADϕ .
e: We assume condition c and note that
$$D_{\frac{1}{\varphi}} = D_\varphi - \varphi^{-1}(\{0\});$$
then $D_{\frac{1}{\varphi}} \in \mathcal A$ and $X - D_{\frac{1}{\varphi}} = (X - D_\varphi)\cup\varphi^{-1}(\{0\})$, whence $P(X - D_{\frac{1}{\varphi}}) = O_H$
by 13.3.2h; in view of 6.2.17, this proves that $\frac{1}{\varphi} \in M(X,\mathcal A,P)$. Moreover,
$$\varphi(x)\frac{1}{\varphi}(x) = \frac{1}{\varphi}(x)\varphi(x) = 1, \quad \forall x \in D_{\frac{1}{\varphi}} = D_{\varphi\frac{1}{\varphi}} = D_{\frac{1}{\varphi}\varphi};$$
this implies $\varphi\frac{1}{\varphi}, \frac{1}{\varphi}\varphi \in L^\infty(X,\mathcal A,P)$ and hence (cf. 14.2.17e and 14.2.7a,i)
$$J_P\!\left(\varphi\frac{1}{\varphi}\right) = J_P\!\left(\frac{1}{\varphi}\varphi\right) = \tilde J_P(1_X) = P(X) = 1_H;$$
then, by 14.3.9,
$$J_P(\varphi)J_P\!\left(\frac{1}{\varphi}\right) \subset 1_H, \quad J_P\!\left(\frac{1}{\varphi}\right)J_P(\varphi) \subset 1_H,$$
$$D_{J_P(\varphi)J_P(1/\varphi)} = D_P\!\left(\frac{1}{\varphi}\right), \quad D_{J_P(1/\varphi)J_P(\varphi)} = D_P(\varphi);$$
by 1.2.16b, this implies $(J_P(\varphi))^{-1} = J_P\!\left(\frac{1}{\varphi}\right)$.
ρ(JP (ϕ)) = {λ ∈ C : JP (ϕ) − λ1H is injective and (JP (ϕ) − λ1H )−1 is bounded}
or equivalently
Now, JP (ϕ) − λ1H = JP (ϕ − λ) for all λ ∈ C (cf. the proof of 14.3.2); therefore, if
JP (ϕ) − λ1H is injective then (cf. 14.3.14)
$$\frac{1}{\varphi-\lambda} \in M(X,\mathcal A,P) \quad\text{and}\quad (J_P(\varphi)-\lambda 1_H)^{-1} = J_P\!\left(\frac{1}{\varphi-\lambda}\right),$$
and hence $R_{J_P(\varphi)-\lambda 1_H} = D_P\!\left(\frac{1}{\varphi-\lambda}\right)$, and hence $\overline{R_{J_P(\varphi)-\lambda 1_H}} = H$ (cf. 14.2.13).
Proof. a ⇒ b: We prove (not b)⇒(not a). Assuming condition (not b), there exists
ε ∈ (0, ∞) so that P (ϕ−1 (B(λ, ε))) = OH , and hence so that
$$\mu^P_f(\varphi^{-1}(B(\lambda,\varepsilon))) = \|P(\varphi^{-1}(B(\lambda,\varepsilon)))f\|^2 = 0, \quad \forall f \in H,$$
(cf. 14.3.2, 8.3.3a, 8.1.17b). By 4.2.3 and 14.4.1, this proves that λ ∈ ρ(JP (ϕ)), i.e.
condition (not a).
b ⇒ a: We prove (not a)⇒(not b). In view of 14.4.1, condition (not a) implies
that
and hence, in view of the equality JP (ϕ)− λ1H = JP (ϕ− λ) (cf. the proof of 14.3.2)
and of 14.3.14, that
$$\frac{1}{\varphi-\lambda} \in M(X,\mathcal A,P) \quad\text{and}\quad J_P\!\left(\frac{1}{\varphi-\lambda}\right) \text{ is bounded},$$
and hence, in view of 14.2.17, that
$$\exists m \in (0,\infty) \text{ such that } \left|\frac{1}{\varphi(x)-\lambda}\right| \le m \quad P\text{-a.e. on } D_{\frac{1}{\varphi-\lambda}};$$
(note that ϕ−1 (σ(JP (ϕ))) ∈ A by 10.4.6 and by 6.2.13c with G := KdC ), and hence
σ(JP (ϕ)) ≠ ∅.
Proof. For each λ ∈ C − σ(JP (ϕ)), 14.4.2 implies that there exists ε ∈ (0, ∞) such
that
P (ϕ−1 (B(λ, ε))) = OH ;
this condition implies
B(λ, ε) ⊂ C − σ(JP (ϕ));
indeed, if z ∈ B(λ, ε) then there exists η ∈ (0, ∞) such that B(z, η) ⊂ B(λ, ε), and
hence such that ϕ−1 (B(z, η)) ⊂ ϕ−1 (B(λ, ε)), and hence (cf. 13.3.2e) such that
P (ϕ−1 (B(z, η))) = OH ;
in view of 14.4.2, this implies z ∈ C − σ(JP (ϕ)).
Now, for each λ ∈ C − σ(JP (ϕ)) let ελ ∈ (0, ∞) be such that
P (ϕ−1 (B(λ, ελ ))) = OH .
Since $B(\lambda,\varepsilon_\lambda) \subset \mathbb C - \sigma(J_P(\varphi))$ for all $\lambda \in \mathbb C - \sigma(J_P(\varphi))$, we have obviously
$$\mathbb C - \sigma(J_P(\varphi)) = \bigcup_{\lambda\in\mathbb C-\sigma(J_P(\varphi))}B(\lambda,\varepsilon_\lambda).$$
Since (C, dC ) is a separable metric space (cf. 2.7.4a), by 2.3.18 there exists a count-
able subset {λn }n∈I of C − σ(JP (ϕ)) such that
$$\mathbb C - \sigma(J_P(\varphi)) = \bigcup_{n\in I}B(\lambda_n,\varepsilon_{\lambda_n}),$$
and hence such that
$$D_\varphi - \varphi^{-1}(\sigma(J_P(\varphi))) = \varphi^{-1}(\mathbb C - \sigma(J_P(\varphi))) = \bigcup_{n\in I}\varphi^{-1}(B(\lambda_n,\varepsilon_{\lambda_n}));$$
then (cf. 13.3.6c) $P(D_\varphi - \varphi^{-1}(\sigma(J_P(\varphi)))) = O_H$, or equivalently
$$P(\varphi^{-1}(\sigma(J_P(\varphi)))) = P(D_\varphi) = P(D_\varphi) + P(X - D_\varphi) = P(X) = 1_H.$$
Obviously, this implies $\sigma(J_P(\varphi)) \ne \emptyset$ (otherwise, $\varphi^{-1}(\sigma(J_P(\varphi))) = \emptyset$ and hence
$P(\varphi^{-1}(\sigma(J_P(\varphi)))) = O_H$).
Proof. If the operator JP (ϕ) is bounded then JP (ϕ) ∈ B(H) (cf. 14.2.17), and
hence σ(JP (ϕ)) is a bounded subset of C by 4.5.10.
Conversely, suppose that σ(JP (ϕ)) is bounded and let m ∈ [0, ∞) be such that
|z| ≤ m, ∀z ∈ σ(JP (ϕ));
then (cf. 14.4.4)
$$\begin{aligned}
\|J_P(\varphi)f\|^2 &= \int_{\varphi^{-1}(\sigma(J_P(\varphi)))}|\varphi|^2\,d\mu^P_f \le m^2\int_{\varphi^{-1}(\sigma(J_P(\varphi)))}1_X\,d\mu^P_f \\
&= m^2\int_X 1_X\,d\mu^P_f = m^2\mu^P_f(X) = m^2\|f\|^2, \quad \forall f \in D_P(\varphi),
\end{aligned}$$
and this proves that $J_P(\varphi)$ is bounded.
In view of this, the equivalence of conditions a and b follows directly from 4.5.7, and
so does the part of the statement about eigenspaces (for which, cf. also 13.1.3c).
In this section, (X, A, µ) stands for an abstract measure space. At variance with
what was done in Section 11.1, we denote the elements of L2 (X, A, µ) with the
letters f , g,....
For ϕ ∈ M(X, A, µ), we define the mapping from L2 (X, A, µ) to itself
Mϕ : DMϕ → L2 (X, A, µ)
[f ] 7→ Mϕ [f ] := [ϕf ],
with
$$D_{M_\varphi} := \{[f] \in L^2(X,\mathcal A,\mu) : \varphi f \in L^2(X,\mathcal A,\mu)\} = \left\{[f] \in L^2(X,\mathcal A,\mu) : \int_X|\varphi f|^2\,d\mu < \infty\right\}$$
(note that $\varphi f \in M(X,\mathcal A,\mu)$ for all $f \in L^2(X,\mathcal A,\mu)$, in view of 8.2.2). It is easy
to see that Mϕ is a linear operator (DMϕ is a linear manifold in L2 (X, A, µ) by
11.1.2a).
For E ∈ A, we write PE := MχE . We have:
$$D_{P_E} = L^2(X,\mathcal A,\mu) \ \text{ since } \int_X|\chi_Ef|^2\,d\mu \le \int_X|f|^2\,d\mu < \infty, \quad \forall [f] \in L^2(X,\mathcal A,\mu);$$
$$([f]|P_E[f]) = \int_X\chi_E|f|^2\,d\mu \in \mathbb R, \quad \forall [f] \in L^2(X,\mathcal A,\mu), \ \text{ hence } P_E = P_E^\dagger \text{ (cf. 12.4.3)};$$
$$P_E([f]) = [\chi_Ef] = [\chi_E^2f] = P_E^2[f], \quad \forall [f] \in L^2(X,\mathcal A,\mu), \ \text{ hence } P_E = P_E^2.$$
This proves that PE is a projection (cf. 13.1.5).
Now, we define the mapping
P : A → P(L2 (X, A, µ))
E 7→ P (E) := PE .
For all [f ] ∈ L2 (X, A, µ), we have:
$$\mu^P_{[f]}(X) = ([f]|P_X[f]) = \|[f]\|^2, \quad \forall [f] \in L^2(X,\mathcal A,\mu);$$
$$\mu^P_{[f]}(E) = ([f]|P_E[f]) = \int_X\chi_E|f|^2\,d\mu = \int_E|f|^2\,d\mu, \quad \forall E \in \mathcal A.$$
In view of 8.3.4a and 13.3.5, this proves that P is a projection valued measure on
A. If F ∈ A is such that µ(F) = 0, then $\mu^P_{[f]}(F) = 0$ for all $[f] \in L^2(X,\mathcal A,\mu)$ (cf.
8.3.4a) and hence $P(F) = O_{L^2(X,\mathcal A,\mu)}$. Therefore, M(X, A, µ) ⊂ M(X, A, P ).
For ϕ ∈ M(X, A, µ), we have
$$[f] \in D_{M_\varphi} \Leftrightarrow \int_X|\varphi|^2|f|^2\,d\mu < \infty \overset{(1)}{\Leftrightarrow} \int_X|\varphi|^2\,d\mu^P_{[f]} < \infty \Leftrightarrow [f] \in D_P(\varphi),$$
where 1 holds by 8.3.4b; moreover, we have
$$([f]|M_\varphi[f]) = \int_X\varphi|f|^2\,d\mu \overset{(2)}{=} \int_X\varphi\,d\mu^P_{[f]}, \quad \forall [f] \in D_{M_\varphi},$$
where 2 holds by 8.3.4c. This proves that Mϕ = JP (ϕ), by the uniqueness asserted
in 14.2.14.
Now we assume that the measure µ is σ-finite, i.e. that there exists a countable
family $\{E_n\}_{n\in I}$ of elements of $\mathcal A$ so that $X = \bigcup_{n\in I}E_n$ and $\mu(E_n) < \infty$ for all
$n \in I$ (this implies that $\chi_{E_n} \in L^2(X,\mathcal A,\mu)$ for all $n \in I$). If $F \in \mathcal A$ is such that
P (F ) = OL2 (X,A,µ) , then
$$\mu(F\cap E_n) = \int_X\chi_{F\cap E_n}\,d\mu = \int_X\chi_F|\chi_{E_n}|^2\,d\mu = ([\chi_{E_n}]|P(F)[\chi_{E_n}]) = 0, \quad \forall n \in I,$$
and hence
$$\mu(F) = \mu\!\left(\bigcup_{n\in I}(F\cap E_n)\right) \le \sum_{n\in I}\mu(F\cap E_n) = 0$$
(cf. 7.1.4a), and hence µ(F ) = 0. Now, let E be an element of A and, for each
x ∈ E, let Q(x) be a proposition. Then,
[Q(x) P -a.e. on E] is equivalent to [Q(x) µ-a.e. on E].
Thus, M(X, A, µ) = M(X, A, P ) and all the statements of Sects. 14.2, 14.3,
14.4 hold true with JP (ϕ) replaced by Mϕ , the projection valued measure P re-
placed by the measure µ, “P -a.e.” replaced by “µ-a.e.” (L∞ (X, A, µ) is defined as
L∞ (X, A, P ) was, with P replaced by µ).
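As a hedged numerical sketch (not from the book; the space and function are illustrative assumptions), the simplest σ-finite case makes the correspondence Mϕ = JP (ϕ) concrete: take X := {0, ..., 4} with the counting measure, so that L²(X, A, µ) is C⁵, Mϕ is the diagonal matrix diag(ϕ(0), ..., ϕ(4)), and P (E) is multiplication by χE:

```python
import numpy as np

# Hypothetical finite model: X = {0,...,4} with counting measure, so
# L^2(X, A, mu) is C^5 and M_phi acts as the diagonal matrix diag(phi(x)).
phi = np.array([0.5, -1.0, 2.0, 3.5, 2.0])
M_phi = np.diag(phi)

f = np.array([1.0, 2.0, 0.0, -1.0, 1.0])
assert np.allclose(M_phi @ f, phi * f)       # (M_phi f)(x) = phi(x) f(x)

# P(E) = M_{chi_E} is a projection; here E := phi^{-1}({2.0}).
P_E = np.diag((phi == 2.0).astype(float))
assert np.allclose(P_E @ P_E, P_E) and np.allclose(P_E.T, P_E)

# The eigenvalues of M_phi are exactly the values taken by phi.
assert np.allclose(np.sort(np.linalg.eigvalsh(M_phi)), np.sort(phi))
```

In this toy model the spectrum of Mϕ reduces to the finite set of values of ϕ, mirroring the essential-range description of σ(JP (ϕ)) in Section 14.4.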
In some cases, there are relations between integrals constructed with respect to two
different projection valued measures. In this section we examine two important
cases of this kind.
$$\mu^{P_2}_f(X_2) \overset{(1)}{=} \mu^{P_1}_f(D_\pi) = \mu^{P_1}_f(X_1) = \|f\|^2,$$
and hence DP1 (ϕ ◦ π) = DP2 (ϕ). Moreover, in view of 14.2.14b and 8.3.11c we have
$$(f|J_{P_2}(\varphi)f) = \int_{X_2}\varphi\,d\mu^{P_2}_f = \int_{X_1}(\varphi\circ\pi)\,d\mu^{P_1}_f = (f|J_{P_1}(\varphi\circ\pi)f), \quad \forall f \in D_{P_2}(\varphi).$$
This proves the equality JP1 (ϕ ◦ π) = JP2 (ϕ), in view of 14.2.13 and 10.2.12.
14.6.2 Theorem. Let H1 and H2 be isomorphic Hilbert spaces and suppose that
U ∈ UA(H1 , H2 ) (for UA(H1 , H2 ), cf. 10.3.15). Let (X, A) be a measurable space
and let P1 be a projection valued measure on A with values in P(H1 ). Then the
mapping
P2 : A → P(H2 )
E 7→ P2 (E) := U P1 (E)U −1
is a projection valued measure on A. We have M(X, A, P2 ) = M(X, A, P1 ) and,
for all ϕ ∈ M(X, A, P1 ),
$$J_{P_2}(\varphi) = UJ_{P_1}(\varphi)U^{-1} \quad \text{if } U \in \mathcal U(H_1,H_2),$$
$$J_{P_2}(\bar\varphi) = UJ_{P_1}(\varphi)U^{-1} \quad \text{if } U \in \mathcal A(H_1,H_2).$$
and hence
$$D_{P_2}(\bar\varphi) = D_{P_2}(\varphi) = \{f \in H_2 : U^{-1}f \in D_{P_1}(\varphi)\} = D_{J_{P_1}(\varphi)U^{-1}} = D_{UJ_{P_1}(\varphi)U^{-1}}.$$
Moreover, for every $f \in D_{P_2}(\varphi)(= D_{P_2}(\bar\varphi))$, in view of 14.2.14b and of 1 we have
(since $U^{-1}f \in D_{P_1}(\varphi) = D_{P_1}(\bar\varphi)$):
$$(f|J_{P_2}(\varphi)f) = \int_X\varphi\,d\mu^{P_2}_f = \int_X\varphi\,d\mu^{P_1}_{U^{-1}f} = \left(U^{-1}f\middle|J_{P_1}(\varphi)U^{-1}f\right) = \left(f\middle|UJ_{P_1}(\varphi)U^{-1}f\right)$$
if $U \in \mathcal U(H_1,H_2)$;
$$(f|J_{P_2}(\bar\varphi)f) = \int_X\bar\varphi\,d\mu^{P_2}_f = \overline{\left(U^{-1}f\middle|J_{P_1}(\varphi)U^{-1}f\right)} \overset{(2)}{=} \left(J_{P_1}(\varphi)U^{-1}f\middle|U^{-1}f\right) = \left(f\middle|UJ_{P_1}(\varphi)U^{-1}f\right),$$
Chapter 15
Spectral Theorems
The proof we give of the spectral theorem for unitary operators rests on the Fejér–
Riesz lemma which is proved in 15.1.2, on the Stone–Weierstrass theorem for the
unit circle proved in 4.3.7, on the Riesz–Markov theorem for positive linear func-
tionals proved in 8.5.3, and on the characterization of the family of bounded Borel
functions provided in 6.3.4.
We recall that P denotes the family of trigonometric polynomials on the unit
circle T, that P is a subalgebra of the associative algebra C(T), that C(T) = CB (T)
since the metric subspace (T, dT ) of the metric space (C, dC ) is compact, and hence
that C(T) is a normed algebra (cf. 4.3.6a,c). We note that obviously $\bar p \in \mathcal P$ for all
$p \in \mathcal P$.
Throughout this section, H denotes an abstract Hilbert space. We recall that
B(H) is a C ∗ -algebra (cf. 12.6.4).
15.1.2 Lemma (The Fejér–Riesz lemma). Let p ∈ P be such that 0 ≤ p(z) for
all z ∈ T. Then,
$\exists q \in \mathcal P$ such that $p = \bar q q$.
Proof. Since p ∈ P,
$\exists N \ge 0$, $\exists(\alpha_0,\alpha_1,\alpha_{-1},\ldots,\alpha_N,\alpha_{-N}) \in \mathbb C^{2N+1}$ so that
$$p(z) = \sum_{k=-N}^{N}\alpha_kz^k, \quad \forall z \in \mathbb T.$$
and hence
$$\sum_{k=-N}^{N}(\alpha_k - \bar\alpha_{-k})z^{k+N} = 0, \quad \forall z \in \mathbb T,$$
and hence
$$\alpha_k = \bar\alpha_{-k}, \quad \forall k \in \{0,\pm1,\ldots,\pm N\}.$$
This implies that both αN and α−N are non-zero (if one of them were zero then
the other one would be zero as well, and thus we should have |αN | + |α−N | = 0).
Therefore, zero cannot be a root of the polynomial P defined by
$$P(z) := \sum_{k=-N}^{N}\alpha_kz^{k+N}, \quad \forall z \in \mathbb C,$$
Since the set of the roots inside the unit circle must be the same on the two sides
of this equation and so must be their multiplicities (or, equivalently, since the fac-
torization of a polynomial with respect to its roots is unique), this implies that
$\{\lambda_i\}_{i\in I} = \{\bar\mu_j^{-1}\}_{j\in J}$, and hence the sets of indices $I$ and $J$ can be identified, and
also that $r_i = s_i$ for all $i \in I$, and hence $\sum_{i\in I}r_i = \sum_{i\in I}s_i = N$. Thus there exists
$(\nu_1,\ldots,\nu_N) \in \mathbb C^N$ (the components of this $N$-tuple are the roots of $P$ outside the
unit circle, each of them repeated as many times as its multiplicity) so that
$$P(z) = c\prod_{k=1}^{N}(z - \bar\nu_k^{-1})\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb C.$$
Now we suppose that p is not strictly positive, i.e. that there exists z ∈ T such that
p(z) = 0. For every n ∈ N, we define the trigonometric polynomial $p_n := p + \frac{1}{n}$ and
the polynomial
$$P_n(z) := \sum_{k=-N}^{N}\left(\alpha_k + \frac{1}{n}\,\delta_{0,k}\right)z^{k+N}, \quad \forall z \in \mathbb C.$$
Since pn (z) > 0 and pn (z) = z −N Pn (z) for all z ∈ T, proceeding as above we see
that
$$P_n(z) = c(n)\prod_{k=1}^{N}\left(z - \overline{\nu_k(n)}^{-1}\right)\prod_{k=1}^{N}(z - \nu_k(n)), \quad \forall z \in \mathbb C,$$
where c(n) ∈ C and the components of the N -tuple (ν1 (n), ..., νN (n)) are the roots
of Pn outside the unit circle, repeated as many times as their multiplicities. Since
the roots of a polynomial depend continuously on the coefficients of the polynomial
(cf. e.g. Horn and Johnson, 2013, th.D.1.), the sequence {νk (n)} converges to a
root νk of P for each k ∈ {1, ..., N }. Then we have
$$P(z) = \lim_{n\to\infty}P_n(z) = c\prod_{k=1}^{N}(z - \bar\nu_k^{-1})\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb C,$$
where c := limn→∞ c(n); indeed, the sequence {c(n)} is convergent since, for z0 ∈ C
such that P (z0 ) 6= 0, for n large enough we have
$$\prod_{k=1}^{N}\left(z_0 - \overline{\nu_k(n)}^{-1}\right)\prod_{k=1}^{N}(z_0 - \nu_k(n)) \ne 0$$
and
$$c(n) = P_n(z_0)\left(\prod_{k=1}^{N}\left(z_0 - \overline{\nu_k(n)}^{-1}\right)\prod_{k=1}^{N}(z_0 - \nu_k(n))\right)^{-1}.$$
Although it is not relevant for the present proof, we note that what we have just
seen proves that every root of P on the unit circle has even multiplicity.
Thus, as a consequence of the hypothesis $p(z) \ge 0$ for all $z \in \mathbb T$, there exist
$c \in \mathbb C$ and $(\nu_1,\ldots,\nu_N) \in \mathbb C^N$ so that
$$P(z) = c\prod_{k=1}^{N}(z - \bar\nu_k^{-1})\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb C,$$
and hence so that
$$p(z) = z^{-N}P(z) = c(-1)^N\prod_{k=1}^{N}\bar\nu_k^{-1}\prod_{k=1}^{N}(z^{-1} - \bar\nu_k)\prod_{k=1}^{N}(z - \nu_k) = c_2\prod_{k=1}^{N}\overline{(z - \nu_k)}\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb T,$$
with $c_2 \in \mathbb C$. Since there exists $z \in \mathbb T$ such that $p(z) > 0$, $c_2 > 0$ must be true.
Then, the trigonometric polynomial q defined by
$$q(z) := \sqrt{c_2}\prod_{k=1}^{N}(z - \nu_k), \quad \forall z \in \mathbb T,$$
is such that $p = \bar q q$.
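The lemma can be checked numerically in a simple case. This sketch (an illustration, not part of the proof; the specific polynomial is an assumption) uses the nonnegative trigonometric polynomial p(z) = 2 + z + z⁻¹ on T, for which a Fejér–Riesz factor is q(z) = 1 + z:

```python
import numpy as np

# Sample the unit circle T and verify p = |q|^2 pointwise for
# p(z) = 2 + z + 1/z (which equals 2 + 2cos(theta) >= 0 on T) and q(z) = 1 + z.
z = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 361))
p = 2 + z + 1 / z
q = 1 + z
assert np.allclose(p.imag, 0.0)              # p is real on T
assert np.all(p.real >= -1e-12)              # p is nonnegative on T
assert np.allclose(p.real, np.abs(q) ** 2)   # p = q-bar q on T
```

Note that this example also exhibits the boundary behaviour discussed in the proof: p vanishes at z = −1, a root of q on the unit circle, which accordingly gives p a root of even multiplicity there.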
Proof. In view of 15.1.1a and 15.1.4, the mapping φ̂U is a bounded linear operator
from the normed space C(T) to the Banach space B(H). Since $\overline{\mathcal P} = C(\mathbb T)$ (cf. 4.3.7),
4.2.6 implies that there exists a unique linear operator φU : C(T) → B(H) which is
an extension of φ̂U and which is bounded, i.e. continuous. This proves that there
exists a unique mapping φU : C(T) → B(H) which has properties a, b, c.
Now we prove the additional properties of φU .
d: In view of 4.2.6d, the norm of the linear operator φU equals the norm of
the linear operator φ̂U . Now, 15.1.4 implies that the latter is not greater than one.
Thus, we have condition d (cf. 4.2.5b).
(A) There exists a unique projection valued measure P on the Borel σ-algebra A(dT )
on T (cf. 6.1.22; as before, dT denotes the restriction of the distance dC to
T × T), with values in P(H), such that U = JζP , where ζ is the function
defined by
ζ : T→T
z 7→ ζ(z) := z
and JζP is the operator defined in 14.2.14.
Equivalently, there exists a unique projection valued measure P on A(dT ), with
values in P(H), such that
$$(f|Uf) = \int_{\mathbb T}\zeta\,d\mu^P_f, \quad \forall f \in H.$$
(note that MB (T, A(dT )) ⊂ L1 (T, A(dT ), µf ) for all f ∈ H, in view of 8.2.6).
We want to prove that:
∀ϕ ∈ MB (T, A(dT )), ∃!Bϕ ∈ B(H) such that (f |Bϕ g) = ψϕ (f, g), ∀f, g ∈ H.
To this end, we define the family
V1 := {ϕ ∈ MB (T, A(dT )) : ψϕ is a bounded sesquilinear form}
(for a bounded sesquilinear form, cf. 10.1.1 and 10.5.4).
If ϕ ∈ C(T) then
$$\psi_\varphi(f,g) = \sum_{n=1}^{4}\frac{1}{4i^n}\left(f + i^ng\middle|\phi_U(\varphi)(f + i^ng)\right) \overset{(1)}{=} (f|\phi_U(\varphi)g), \quad \forall f,g \in H$$
Then, ϕ ∈ MB (T, A(dT )) (cf. 6.3.4a). Moreover, 8.2.11 (with the constant function
mT as dominating function) implies that
$$\int_{\mathbb T}\varphi\,d\mu_f = \lim_{n\to\infty}\int_{\mathbb T}\varphi_n\,d\mu_f, \quad \forall f \in H,$$
and hence
$$\psi_\varphi(f,g) = \lim_{n\to\infty}\sum_{k=1}^{4}\frac{1}{4i^k}\int_{\mathbb T}\varphi_n\,d\mu_{f+i^kg} \quad (2)$$
Now suppose also that ϕn ∈ V1 for all n ∈ N. Then 2 implies that ψϕ is a sesquilinear
form, since so is ψϕn for all n ∈ N. Moreover, for all u, v ∈ H such that kuk =
kvk = 1,
$$|\psi_\varphi(u,v)| \le \frac{1}{4}\sum_{n=1}^{4}\int_{\mathbb T}|\varphi|\,d\mu_{u+i^nv} \le \frac{m}{4}\sum_{n=1}^{4}\|u + i^nv\|^2 \le 4m$$
(for 3, cf. 8.3.5a with a1 := |α|2 , µ1 := µf , µk the null measure on A(dT ) for k > 1);
in view of the uniqueness asserted in 8.5.3, this proves that
µαf = |α|2 µf .
In particular, for every f ∈ H we have µf +in f = |1 + in |2 µf for n = 1, 2, 3, 4; thus,
µf −f is the null measure, µf +f = 4µf , µf +if = µf −if ;
this yields
$$\sum_{n=1}^{4}\frac{1}{4i^n}\int_{\mathbb T}\varphi\,d\mu_{f+i^nf} = \frac{1}{4}\,4\int_{\mathbb T}\varphi\,d\mu_f = \int_{\mathbb T}\varphi\,d\mu_f, \quad \forall\varphi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T})),$$
and hence
$$(f|B_\varphi f) = \psi_\varphi(f,f) = \int_{\mathbb T}\varphi\,d\mu_f, \quad \forall\varphi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T})).$$
Step 4: Suppose $\varphi \in F(\mathbb T)$ and that $\{\varphi_n\}$ is a sequence in $M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$ such
that $\varphi_n \xrightarrow{\text{ubp}} \varphi$. Then $\varphi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$ and
$$(f|B_\varphi g) = \psi_\varphi(f,g) = \lim_{n\to\infty}\psi_{\varphi_n}(f,g) = \lim_{n\to\infty}(f|B_{\varphi_n}g), \quad \forall f,g \in H.$$
This follows from what we saw in step 2.
Step 5: For every ϕ ∈ C(T) we have (cf. step 3)
$$(f|B_\varphi f) = \int_{\mathbb T}\varphi\,d\mu_f = (f|\phi_U(\varphi)f), \quad \forall f \in H,$$
whence Bϕ = φU (ϕ) by 10.2.12.
Step 6: For every ϕ ∈ MB (T, A(dT )) we have (cf. step 3)
$$\left(f\middle|B_\varphi^\dagger f\right) = \overline{(f|B_\varphi f)} = \overline{\int_{\mathbb T}\varphi\,d\mu_f} \overset{(4)}{=} \int_{\mathbb T}\bar\varphi\,d\mu_f = (f|B_{\bar\varphi}f), \quad \forall f \in H$$
where 8 holds (in view of step 4) because $\{\psi_n\varphi\}$ is a sequence in $M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$
such that $\psi_n\varphi \xrightarrow{\text{ubp}} \psi\varphi$; in fact, if $m \in [0,\infty)$ is such that $|\psi_n(z)| \le m$ for all $z \in \mathbb T$,
then $|(\psi_n\varphi)(z)| \le m\|\varphi\|_\infty$ for all $z \in \mathbb T$. Therefore, $B_\psi B_\varphi = B_{\psi\varphi}$.
Thus, V2 ⊂ F (T), C(T) ⊂ V2 , and V2 is ubp closed. Hence, MB (T, A(dT )) ⊂ V2 ,
or
Bψ Bϕ = Bψϕ , ∀ϕ ∈ C(T), ∀ψ ∈ MB (T, A(dT )).
This implies that
$$B_\varphi B_\psi \overset{(9)}{=} B_{\bar\varphi}^\dagger B_{\bar\psi}^\dagger \overset{(10)}{=} (B_{\bar\psi}B_{\bar\varphi})^\dagger = B_{\overline{\psi\varphi}}^\dagger \overset{(11)}{=} B_{\psi\varphi} = B_{\varphi\psi}, \quad \forall\varphi \in C(\mathbb T),\ \forall\psi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$$
(for 9 and 11, cf. step 6; 10 holds by 12.3.4b).
Now we define the family
V3 := {ϕ ∈ MB (T, A(dT )) : Bϕ Bψ = Bϕψ , ∀ψ ∈ MB (T, A(dT ))}.
The last thing proved implies C(T) ⊂ V3 .
Next, suppose $\varphi \in F(\mathbb T)$ and that $\{\varphi_n\}$ is a sequence in $V_3$ such that $\varphi_n \xrightarrow{\text{ubp}} \varphi$.
Then, proceeding exactly as above we have that, for all $\psi \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$,
$$(f|B_\varphi B_\psi g) = \lim_{n\to\infty}(f|B_{\varphi_n}B_\psi g) = \lim_{n\to\infty}(f|B_{\varphi_n\psi}g) \overset{(12)}{=} (f|B_{\varphi\psi}g), \quad \forall f,g \in H,$$
where 12 holds (in view of step 4) because $\{\varphi_n\psi\}$ is a sequence in $M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$
such that $\varphi_n\psi \xrightarrow{\text{ubp}} \varphi\psi$. Therefore, $B_\varphi B_\psi = B_{\varphi\psi}$.
Thus, V3 ⊂ F (T), C(T) ⊂ V3 , and V3 is ubp closed. Hence, MB (T, A(dT )) ⊂ V3 ,
or
Bϕ Bψ = Bϕψ , ∀ψ ∈ MB (T, A(dT )), ∀ϕ ∈ MB (T, A(dT )).
Step 8: For every $E \in \mathcal A(d_{\mathbb T})$, we have $\chi_E \in M_B(\mathbb T,\mathcal A(d_{\mathbb T}))$. Since $\overline{\chi_E} = \chi_E$,
we have $B_{\chi_E}^\dagger = B_{\chi_E}$ (cf. step 6). Since $\chi_E^2 = \chi_E$, we have $B_{\chi_E}^2 = B_{\chi_E}$ (cf. step 7).
Thus, BχE ∈ P(H) by 13.1.5.
Now, we define the mapping
P : A(dT ) → P(H)
E 7→ P (E) := BχE .
For every f ∈ H, we have (cf. step 3)
$$\mu^P_f(E) = (f|B_{\chi_E}f) = \int_{\mathbb T}\chi_E\,d\mu_f = \mu_f(E), \quad \forall E \in \mathcal A(d_{\mathbb T});$$
Now, we have
$$D_U = H = \left\{f \in H : \int_{\mathbb T}|\zeta|^2\,d\mu^P_f < \infty\right\}$$
since the function $\zeta$ is bounded and the measure $\mu^P_f$ is finite for all $f \in H$, and also
$$(f|Uf) \overset{(13)}{=} \left(f\middle|\hat\phi_U(\zeta)f\right) = (f|\phi_U(\zeta)f) \overset{(14)}{=} \int_{\mathbb T}\zeta\,d\mu_f = \int_{\mathbb T}\zeta\,d\mu^P_f, \quad \forall f \in H,$$
where 13 holds by the very definition of φ̂U and 14 by 15.1.5b. By the uniqueness
asserted in 14.2.14, this is equivalent to U = JζP .
This concludes the proof that a projection valued measure P exists as in the
statement.
Step 9: Here we prove that the projection valued measure as in the statement is
unique. To this end, suppose that Q is a projection valued measure on A(dT ), with
values in P(H), such that
$$(f|Uf) = \int_{\mathbb T}\zeta\,d\mu^Q_f, \quad \forall f \in H.$$
Since the equality $D_U = \left\{f \in H : \int_{\mathbb T}|\zeta|^2\,d\mu^Q_f < \infty\right\}$ is obvious, this is equivalent to
$$J_\zeta^Q = U.$$
In view of 14.2.17e (and of the fact that the mapping J˜Q is an extension of the
mapping JˆQ , cf. 14.2.7), this can be written as
JˆQ (ζ) = U.
Now, in view of 14.1.1b,d,e (and of the fact that $\zeta^{-1} = \bar\zeta$ and $U^\dagger = U^{-1}$, cf.
12.5.1b), this implies that
JˆQ (p) = φ̂U (p), ∀p ∈ P.
Moreover, the mapping JˆQ is continuous (cf. 14.1.1c), and so is its restriction to
C(T). Then, the uniqueness asserted in 15.1.5 implies that
JˆQ (ϕ) = φU (ϕ), ∀ϕ ∈ C(T),
and hence (cf. 14.1.1f)
$$\int_{\mathbb T}\varphi\,d\mu^Q_f = \left(f\middle|\hat J_Q(\varphi)f\right) = (f|\phi_U(\varphi)f) = \int_{\mathbb T}\varphi\,d\mu_f, \quad \forall\varphi \in C(\mathbb T),\ \forall f \in H;$$
and hence
$$(f|AB_\varphi g) = \left(A^\dagger f\middle|B_\varphi g\right) = \lim_{n\to\infty}\left(A^\dagger f\middle|B_{\varphi_n}g\right) = \lim_{n\to\infty}(f|AB_{\varphi_n}g) = \lim_{n\to\infty}(f|B_{\varphi_n}Ag) = (f|B_\varphi Ag), \quad \forall f,g \in H,$$
whence ABϕ = Bϕ A.
Thus, V4 ⊂ F (T), C(T) ⊂ V4 , and V4 is ubp closed. Hence, MB (T, A(dT )) ⊂ V4 ,
or
ABϕ = Bϕ A, ∀ϕ ∈ MB (T, A(dT )).
Then, in particular
AP (E) = ABχE = BχE A = P (E)A, ∀E ∈ A(dT ).
The spectral theorem for self-adjoint operators is deduced from the spectral theorem
for unitary operators, by means of the Cayley transform (in both its incarnations,
as an operator and as a function).
Throughout this section, H denotes an abstract Hilbert space.
P 7→ JξP
is a bijection from the family of all projection valued measures on the Borel σ-
algebra A(dR ), with values in P(H), onto the family of all self-adjoint operators in
H. Indeed, 14.3.17 proves that the operator JξP is self-adjoint for every projection
valued measure P on A(dR ). Further, 15.2.1A proves that, for every self-adjoint
operator A, there exists one and only one projection valued measure P on A(dR )
such that A = JξP .
For a given projection valued measure P on A(dR ), we sometimes denote by AP
the operator JξP . Conversely, for a given self-adjoint operator we always denote by
P A the unique projection valued measure P on A(dR ) which is so that A = JξP , and
we call it the projection valued measure of A. Thus, the bijection discussed above
is defined by
P 7→ AP ,
while its inverse is the bijection from the family of all self-adjoint operators onto
the family of all projection valued measures on A(dR ) which is defined by
A 7→ P A .
For every self-adjoint operator $A$, from the more general results of Chapter 14 we
have the following results, since $A = J_\xi^{P^A}$:
For convenience, we collect the results (also the ones already known before this
chapter) for the spectrum and the point spectrum of a self-adjoint operator in the
next two theorems, after defining two numbers which are of great importance in
quantum mechanics.
Moreover,
(e) $N_{A-\lambda 1_H} = R_{P^A(\{\lambda\})}$, $\forall\lambda \in \mathbb R$;
thus, if λ ∈ σp (A) then P A ({λ}) is the projection onto the corresponding eigenspace.
The next theorem can be proved directly (cf. e.g. Simmons, 1963, Chapter 11).
Instead, we deduce it from the results proved in this section.
Proof. We know that σ(A) 6= ∅ (cf. 15.2.2d). Now let λ ∈ σ(A). Since σ(A) =
Apσ(A) (cf. 12.4.21b) and since every linear operator in H is bounded (cf. 10.8.3B),
we have that the operator A − λ1H is not injective, i.e. that λ ∈ σp (A). This proves
that σp (A) is a non-empty set, and also (in view of 4.5.8) that
σp (A) = σ(A).
In view of 12.4.20B, σp (A) must be a finite set: if it were not, then by choosing an
element of NA−λ1H ∩ H̃ for each λ ∈ σp (A) we could construct a non-finite o.n.s.
in H and hence (cf. 10.7.3) there would exist a non-finite c.o.n.s. in H, contrary to
the hypothesis that H is finite-dimensional. Thus, we can write
{λ1 , ..., λN } := σp (A).
In view of 15.2.5 we have
$P_n := P^A(\{\lambda_n\}) \ne O_H$, $\forall n \in \{1,\ldots,N\}$.
Moreover, we have
$P_iP_j = P^A(\{\lambda_i\})P^A(\{\lambda_j\}) = O_H$ if $i \ne j$
(cf. 13.3.2b) and also
$$\sum_{n=1}^{N}P_n = \sum_{n=1}^{N}P^A(\{\lambda_n\}) = P^A(\sigma_p(A)) = P^A(\sigma(A)) = 1_H$$
(cf. 15.2.2d). Finally, we note that DA = H in view of 10.8.3B and 12.4.7, and that
Pn f = P A ({λn })f ∈ NA−λn 1H and hence APn f = λn Pn f, ∀f ∈ H
(cf. 15.2.5e). This yields
$$Af = \sum_{n=1}^{N}AP_nf = \sum_{n=1}^{N}\lambda_nP_nf, \quad \forall f \in H,$$
or $A = \sum_{n=1}^{N}\lambda_nP_n$.
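The finite-dimensional decomposition A = Σₙ λₙPₙ can be checked numerically; this is a hedged sketch (the matrix is an illustrative assumption, not from the text) that builds the spectral projections of a Hermitian matrix from an orthonormal eigenbasis:

```python
import numpy as np

# Illustrative self-adjoint operator on C^3 (a Hermitian matrix).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
eigvals, eigvecs = np.linalg.eigh(A)     # orthonormal eigenbasis

# Group eigenvectors by (rounded) eigenvalue and build the projections P_n.
decomp = {}
for lam, v in zip(np.round(eigvals, 10), eigvecs.T):
    decomp.setdefault(lam, []).append(v)
projections = {lam: sum(np.outer(v, v) for v in vs) for lam, vs in decomp.items()}

# The P_n sum to the identity and reconstruct A as sum lambda_n P_n.
total = sum(projections.values())
assert np.allclose(total, np.eye(3))
A_rebuilt = sum(lam * P for lam, P in projections.items())
assert np.allclose(A_rebuilt, A)
```

Grouping by eigenvalue (rather than taking rank-one projections per eigenvector) mirrors the theorem's Pₙ := Pᴬ({λₙ}), which projects onto the full eigenspace of each distinct eigenvalue.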
(cf. 15.3.2a,b), and hence χE (A) = P A (E) (cf. 10.2.12). Obviously, this equation
cannot be used for the construction of the projection valued measure P A by means
of A, since it is actually based on the previous existence of P A .
15.3.4 Examples.
(A) We set (X, A, µ) := (R, A(dR ), m) in the discussion of Section 14.5; we recall
that m denotes the Lebesgue measure on R. Thus, L2 (R, A, µ) = L2 (R). The
projection valued measure P of Section 14.5 is now defined on A(dR ) and
we define the operator Q := JξP , which is a self-adjoint operator in L2 (R).
This operator is denoted by Q since in non-relativistic quantum mechanics it
(B) Suppose that H is a separable Hilbert space and let A be a self-adjoint operator
in H. In view of 12.4.20C, σp (A) is a countable set and hence σp (A) ∈ A(dR ).
The following conditions are equivalent:
(a) P A (R − σp (A)) = OH , or equivalently P A (σp (A)) = 1H ;
(b) there exists a family {(λn , Pn )}n∈I , with I = {1, ..., N } or I = N, so that
$\lambda_n \in \mathbb R$, $P_n \in \mathcal P(H)$, $P_n \ne O_H$, $\forall n \in I$,
$\lambda_i \ne \lambda_j$ and $P_iP_j = O_H$ if $i \ne j$,
$\sum_{n\in I}P_nf = f$, $\forall f \in H$,
$D_A = \left\{f \in H : \sum_{n\in I}\lambda_n^2\|P_nf\|^2 < \infty\right\}$,
$Af = \sum_{n\in I}\lambda_nP_nf$, $\forall f \in D_A$
(we note that, if $I = \mathbb N$, the series $\sum_{n\in I}\lambda_nP_nf$ is convergent for all
$f \in D_A$, in view of 13.2.8d and 10.4.7b).
Indeed, if condition a is true then we define $\{\lambda_n\}_{n\in I} := \sigma_p(A)$, with the condition
$\lambda_i \ne \lambda_j$ if $i \ne j$ and with $I := \{1,\ldots,N\}$ or $I := \mathbb N$ as the case may be,
and Pn := P A ({λn }) for each n ∈ I. Then we have:
$\lambda_n \in \mathbb R$, $P_n \in \mathcal P(H)$, $P_n \ne O_H$ (cf. 15.2.5c), $\forall n \in I$;
$P_iP_j = O_H$ if $i \ne j$ and $\sum_{n\in I}P_nf = P^A(\sigma_p(A))f = f$, $\forall f \in H$
and hence
$$D_A = \left\{f \in H : \int_{\mathbb R}\xi^2\,d\mu^{P^A}_f < \infty\right\} = \left\{f \in H : \sum_{n\in I}\lambda_n^2\|P_nf\|^2 < \infty\right\},$$
by 15.2.2a and 8.3.8. Thus, if $I = \mathbb N$, the series $\sum_{n\in I}\lambda_nP_nf$ is convergent for
all $f \in D_A$, and, for either $I = \{1,\ldots,N\}$ or $I = \mathbb N$, we can define the mapping
all f ∈ DA , and, for either I = {1, ..., N } or I = N, we can define the mapping
$$B : D_A \to H, \quad f \mapsto Bf := \sum_{n\in I}\lambda_nP_nf$$
i.e. $N_{A-\lambda_k1_H} = R_{P_k}$ (cf. 13.1.3c). Since $P_n \ne O_H$ for all $n \in I$, this proves
that
Therefore,
$$f = \sum_{n\in I}P^A(\{\lambda_n\})f = P^A(\sigma_p(A))f, \quad \forall f \in H,$$
Jn := {j ∈ J : µj = λn }
$$(f|\varphi(A)f) = \int_{\mathbb R}\varphi\,d\mu^{P^A}_f = \sum_{n\in I}\varphi(\lambda_n)\|P_nf\|^2 = \left(f\,\middle|\,\sum_{n\in I}\varphi(\lambda_n)P_nf\right), \quad \forall f \in D_{\varphi(A)};$$
since the mapping $D_{\varphi(A)} \ni f \mapsto \sum_{n\in I}\varphi(\lambda_n)P_nf \in H$ is obviously a linear
operator (its definition is consistent by 10.4.7b), in view of 10.2.12 this implies that
$$\varphi(A)f = \sum_{n\in I}\varphi(\lambda_n)P_nf, \quad \forall f \in D_{\varphi(A)}.$$
(C) If the Hilbert space H is finite-dimensional then 15.2.8 proves that condition b
of example B holds true for every self-adjoint operator A in H. Then condition
a holds true as well (this was also seen directly in the proof of 15.2.8), and so
does condition c. Thus, for every self-adjoint operator A in a finite-dimensional
Hilbert space H there exists a c.o.n.s. in H whose elements are eigenvectors of
A.
(D) Let M be a subspace of H. The mapping
$$P : \mathcal A(d_{\mathbb R}) \to \mathcal P(H), \quad E \mapsto P(E) := \chi_E(0)P_{M^\perp} + \chi_E(1)P_M$$
is a projection valued measure in view of 13.3.5. Indeed, for every $f \in H$, $\mu^P_f$
is the measure $\mu$ defined in 8.3.8 with
$$I := \{1,2\}, \quad x_1 := 0, \quad x_2 := 1, \quad a_1 := (f|P_{M^\perp}f), \quad a_2 := (f|P_Mf);$$
moreover, this entails $\mu^P_f(\mathbb R) = a_1 + a_2 = \|f\|^2 - \|P_Mf\|^2 + \|P_Mf\|^2 = \|f\|^2$.
The operator $A^P$ is the projection $P_M$ since (cf. 8.3.8 and 15.2.2a,b)
$$\int_{\mathbb R}\xi^2\,d\mu^P_f = \int_{\mathbb R}\xi\,d\mu^P_f = 0a_1 + 1a_2 = (f|P_Mf),$$
and hence
$$\left\{f \in H : \int_{\mathbb R}\xi^2\,d\mu^P_f < \infty\right\} = H = D_{P_M}$$
and
$$\int_{\mathbb R}\xi\,d\mu^P_f = (f|P_Mf), \quad \forall f \in H.$$
Then,
$$p(A) = \sum_{k=0}^{N}\alpha_kA^k \quad (\text{we define } A^0 := 1_H).$$
Let $q$ be a non-trivial polynomial, i.e. there exist $M \ge 1$ and $(\beta_0,\beta_1,\ldots,\beta_M) \in \mathbb C^{M+1}$
with $\beta_M \ne 0$ so that
$$q = \sum_{i=0}^{M}\beta_i\xi^i.$$
If the roots of q are not elements of σ_p(A), then 1/q ∈ M(R, A(d_R), P^A) (where 1/q is defined as in 1.2.19), the operator Σ_{i=0}^{M} β_i A^i is injective and
(1/q)(A) = ( Σ_{i=0}^{M} β_i A^i )^{−1}.
If, further, the roots of q are not elements of σ(A), then (letting p/q := p·(1/q))
(p/q)(A) = ( Σ_{k=0}^{N} α_k A^k ) ( Σ_{i=0}^{M} β_i A^i )^{−1}.
since obviously D_{A^{k+1}} ⊂ D_{A^k} for all k ∈ N. Now, it is easy to prove that there exists a bounded interval I so that
(1/2)|α_N| |x|^N ≤ |p(x)|, ∀x ∈ R − I;
therefore,
f ∈ D_{p(A)} ⇒ p ∈ L²(R, A(d_R), µ_f^{P^A}) ⇒ ξ^N ∈ L²(R, A(d_R), µ_f^{P^A}) ⇒ f ∈ D_{ξ^N(A)} = D_{A^N}.
This proves the first part of the statement. In what follows, we prove the second
part.
If the roots of q are not elements of σ_p(A), then (cf. 15.2.5c)
P^A(q^{−1}({0})) = O_H
and hence (cf. 14.3.14)
the operator q(A) is injective, 1/q ∈ M(R, A(d_R), P^A), (1/q)(A) = (q(A))^{−1};
now, q(A) = Σ_{i=0}^{M} β_i A^i in view of the first part of the statement.
To prove the last part of the statement, let {λ_1, ..., λ_M} be the roots of q (each value is repeated as many times as its multiplicity); then
1/q = β_M^{−1} (1/(ξ − λ_1)) ··· (1/(ξ − λ_M)).
Now, suppose λi 6∈ σ(A) for all i ∈ {1, ..., M }. Then, for each i ∈ {1, ..., M },
the operator A − λi 1H is injective and the operator (A − λi 1H )−1 is bounded (cf.
12.4.21b, 4.5.2, 4.5.3); moreover, A − λi 1H = (ξ − λi )(A) (cf. the first part of the
statement); then, 14.3.14 and 14.2.17 imply that
1/(ξ − λ_i) ∈ L^∞(R, A(d_R), P^A).
Thus, 1/q ∈ L^∞(R, A(d_R), P^A) (cf. 14.2.5) and hence, in view of 14.3.10,
(p/q)(A) = J_{P^A}(p·(1/q)) = J_{P^A}(p) J_{P^A}(1/q) = p(A) (1/q)(A),
or
(p/q)(A) = ( Σ_{k=0}^{N} α_k A^k ) ( Σ_{i=0}^{M} β_i A^i )^{−1},
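The identity (p/q)(A) = p(A) q(A)^{−1} can be checked numerically. The sketch below (our own illustration, with C^3 standing in for H and p(x) = x² + 1, q(x) = x − 5, whose root 5 is not in σ(A)) compares the functional calculus, i.e. applying p/q to the eigenvalues, with the matrix expression p(A) q(A)^{−1}.

```python
import numpy as np

rng = np.random.default_rng(1)
Q = np.linalg.qr(rng.standard_normal((3, 3)))[0]
A = Q @ np.diag([1.0, 2.0, 3.0]) @ Q.T        # self-adjoint, sigma(A) = {1, 2, 3}

lam, V = np.linalg.eigh(A)
pq_A = V @ np.diag((lam**2 + 1) / (lam - 5)) @ V.T   # (p/q)(A) via functional calculus

I = np.eye(3)
p_A = A @ A + I                               # p(A) = A^2 + 1_H
q_A_inv = np.linalg.inv(A - 5 * I)            # q(A)^{-1}, exists since 5 is not in sigma(A)
assert np.allclose(pq_A, p_A @ q_A_inv)
```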
0 ≤ (f|Bf), ∀f ∈ D_B, and A = B².
ψ(C) = ξ²(C) = A.
X_1 := R, A_1 := A(d_R), P_1 := P^C, X_2 := R, A_2 := A(d_R), π := ψ
now,
and hence
by 13.3.2c and the equality P^C([0, ∞)) = 1_H. Thus, P^B = P^C and hence (cf. 15.2.2)
B = J_ξ^{P^B} = J_ξ^{P^C} = C.
σ(ϕ(A)) = ϕ(σ(A)).
15.4.1 Theorem. Let H1 and H2 be isomorphic Hilbert spaces and suppose that
U ∈ UA(H1 , H2 ). Let A1 and A2 be self-adjoint operators in H1 and in H2 respec-
tively. Then the following conditions are equivalent:
(a) P A2 (E) = U P A1 (E)U −1 , ∀E ∈ A(dR );
(b) A2 = U A1 U −1 .
Proof. a ⇒ b: This follows immediately from 14.6.2.
b ⇒ a: We define the mapping
Q : A(d_R) → P(H_2)
E ↦ Q(E) := U P^{A_1}(E) U^{−1}.
Then (cf. 14.6.2) Q is a projection valued measure on A(d_R) and
J_ξ^Q = U J_ξ^{P^{A_1}} U^{−1} = U A_1 U^{−1} = A_2.
Thus, Q = P^{A_2} by the definition of P^{A_2}, and hence condition a holds.
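The equivalence in 15.4.1 can be sketched numerically. In the following illustration of ours (C^3 stands in for both Hilbert spaces, U is a real orthogonal matrix, and `spectral_projection` is a helper name we introduce), A_2 = U A_1 U^{−1} implies P^{A_2}(E) = U P^{A_1}(E) U^{−1} for E = [0, ∞).

```python
import numpy as np

rng = np.random.default_rng(2)
Q = np.linalg.qr(rng.standard_normal((3, 3)))[0]
A1 = Q @ np.diag([-1.0, 0.5, 2.0]) @ Q.T      # sigma(A1) = {-1, 0.5, 2}
U = np.linalg.qr(rng.standard_normal((3, 3)))[0]   # a unitary (here real orthogonal)
A2 = U @ A1 @ U.T

def spectral_projection(A, E):
    """P^A(E): sum of eigenprojections of A whose eigenvalue satisfies predicate E."""
    lam, V = np.linalg.eigh(A)
    cols = V[:, [bool(E(l)) for l in lam]]
    return cols @ cols.T

E = lambda x: x >= 0.0                        # the Borel set E = [0, infinity)
P1 = spectral_projection(A1, E)
P2 = spectral_projection(A2, E)
assert np.allclose(P2, U @ P1 @ U.T)          # P^{A2}(E) = U P^{A1}(E) U^{-1}
```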
Chapter 16
The subject of this chapter is fundamental for quantum mechanics. Indeed, con-
tinuous one-parameter unitary groups and Stone’s theorem are the mathematical
basis for the description of time evolution of conservative and reversible quantum
systems (cf. Section 19.6).
Moreover, if G is a Lie group which is considered to be a symmetry group for
a quantum system, then a continuous one-parameter unitary group is found to be
associated with each element of the Lie algebra of G, and the generators of these
one-parameter groups are self-adjoint operators which are interpreted as observables
representing the elements of the Lie algebra. However, this topic is outside the scope
of this book (cf. e.g. Thaller, 1992, 2.3.1).
Throughout this section, H denotes an abstract Hilbert space. We recall that U(H)
denotes the group of unitary operators in H (cf. 10.3.9 and 10.3.10).
Obviously, if X = C then these definitions agree with the ones given in 1.2.21 (cf.
2.7.6 and 2.4.2).
We also have
lim_{n→∞} ( (1/t_n)(ϕ_{t_n}(x) − 1) − ix ) = 0, ∀x ∈ R
(since (d/dt) e^{itx} |_{t=0} = ix, ∀x ∈ R), and
| (1/t_n)(ϕ_{t_n}(x) − 1) − ix | ≤ | (1/t_n)(e^{i t_n x} − 1) | + |x| ≤ 2|x|, ∀x ∈ R, ∀n ∈ N
(we have used the inequality |e^{iα} − 1| ≤ |α|, ∀α ∈ R). Since ξ ∈ L²(R, A(d_R), µ_f^{P^A}) for all f ∈ D_A, by 8.2.11 (with 4ξ² as dominating function) we have
lim_{n→∞} || (1/t_n)(U_f(t_n) − U_f(0)) − iAf || = 0, ∀f ∈ D_A.
16.1.7 Remark. For every self-adjoint operator A in H, 16.1.6 and 16.1.5e show that the mapping U^A defined by
R ∋ t ↦ U^A(t) := ϕ_t(A) ∈ U(H)
is a c.o.p.u.g. and that it is the only c.o.p.u.g. U in H which satisfies with A the condition
(sa-ug) the mapping U_f is differentiable at 0 and (dU_f/dt)|_0 = iAf, ∀f ∈ D_A.
We point out that, for each t ∈ R, U^A(t) is the unique linear operator in H such that D_{U^A(t)} = H and
( f | U^A(t) f ) = ∫_R ϕ_t dµ_f^{P^A}, ∀f ∈ H
(a) Let λ ∈ R. Then, the operator B := A + λ1_H is self-adjoint and the following conditions are true:
P^B(E) = P^A(E − λ), ∀E ∈ A(d_R)
(we recall that E − λ := {x − λ : x ∈ E}, cf. 9.2.1a);
U^B(t) = e^{iλt} U^A(t), ∀t ∈ R.
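The relation U^B(t) = e^{iλt} U^A(t) for B := A + λ1_H can be verified numerically. In this sketch of ours, C^3 stands in for H and U^A(t) = ϕ_t(A) is realized by applying e^{it(·)} to the eigenvalues (the helper `U_group` is our own name).

```python
import numpy as np

def U_group(A, t):
    """U^A(t) = e^{itA}, computed via the spectral decomposition of Hermitian A."""
    lam_, V = np.linalg.eigh(A)
    return V @ np.diag(np.exp(1j * t * lam_)) @ V.conj().T

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3))
A = (M + M.T) / 2                             # self-adjoint
lam, t = 0.7, 1.3
B = A + lam * np.eye(3)                       # B := A + lambda * 1_H
U_A, U_B = U_group(A, t), U_group(B, t)
assert np.allclose(U_B, np.exp(1j * lam * t) * U_A)   # U^B(t) = e^{i lam t} U^A(t)
```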
(b) Let µ ∈ R − {0}. Then, the operator C := µA is self-adjoint and the following
conditions are true:
B = ψ_λ(A).
X_1 := R, A_1 := A(d_R), P_1 := P^A, X_2 := R, A_2 := A(d_R), π := ψ_λ,
and hence P_2 = P^B,
to obtain
C = γ_µ(A).
X_1 := R, A_1 := A(d_R), P_1 := P^A, X_2 := R, A_2 := A(d_R), π := γ_µ,
and hence P_2 = P^C,
to obtain
(a) D = H;
(b) U (t)f ∈ D, ∀f ∈ D, ∀t ∈ R;
(c) the mapping Uf is differentiable at 0, ∀f ∈ D.
is differentiable at 0 and its derivative at 0 is the number (U(0)f|f); hence, for any sequence {t_n} in R − {0} such that t_n → 0, we have
||f||² = (U(0)f|f) = lim_{n→∞} (1/t_n) ∫_0^{t_n} (U(x)f|f) dx = lim_{n→∞} (B_{t_n} f|f), (3)
where 11 holds because U (t0 ) ∈ B(H) and 12 holds because U (t0 )Ct = Ct U (t0 ) for
all t ∈ R − {0}. Finally we note that, for f ∈ H,
DA = {f ∈ H : Uf is differentiable at 0};
16.1.11 Remarks.
(a) The mapping
A ↦ U^A
(cf. 16.1.7) is a surjection, and hence a bijection from the family of all self-adjoint operators in H onto the family of all c.o.p.u.g.'s in H.
For a c.o.p.u.g. U, the self-adjoint operator A such that U = U^A is called the generator of U.
(b) For every self-adjoint operator A, it is obvious that the mapping
The theorem we present in this section determines when the results of the previous
section can be expressed entirely within the Banach algebra structure of B(H).
moreover, we have
|| U^A(t) − Σ_{k=0}^{n} (1/k!) (it)^k A^k || (10)= || J_{P^A}(ϕ_t) − Σ_{k=0}^{n} (1/k!) (it)^k J_{P^A}(ξ^k) ||
(11)= || J̃_{P^A}( ϕ_t − Σ_{k=0}^{n} (1/k!) (it)^k ξ^k ) ||
(12)= P^A-sup | ϕ_t − Σ_{k=0}^{n} (1/k!) (it)^k ξ^k |
(13)≤ s_n(t), ∀n ∈ N,
where 10 holds by 15.3.5, 11 holds by 14.2.17e and 14.2.7b since 3 implies that ξ ∈ L^∞(R, A(d_R), P^A), 12 holds by 14.2.7h, 13 holds in view of 3. In view of 9, this proves that
|| U^A(t) − Σ_{k=0}^{n} (1/k!) (it)^k A^k || → 0 as n → ∞, ∀t ∈ R,
i.e. condition c.
b ⇒ d: We assume condition b and fix m ∈ (0, ∞) as above. Let t0 ∈ R and
let {tn } be any sequence in R − {0} such that tn → 0. Proceeding as before (using
16.1.5c, which now reads U A (t0 )A = AU A (t0 ) since A ∈ B(H), and also 4.2.9 and
the equation kU (t0 )k = 1) we see that
|| (1/t_n)(U^A(t_0 + t_n) − U^A(t_0)) − iA U^A(t_0) ||
≤ || (1/t_n)(U^A(t_n) − 1_H) − iA || (14)= || J̃_{P^A}( (1/t_n)(ϕ_{t_n} − 1_R) − iξ ) ||
= P^A-sup | (1/t_n)(ϕ_{t_n} − 1_R) − iξ |, ∀n ∈ N.
Next we fix ε > 0; since (d/ds) e^{is} |_{s=0} = i, there exists δ_ε > 0 such that
0 < s < δ_ε ⇒ | (1/s)(e^{is} − 1) − i | < ε/m;
now let N_ε ∈ N be such that
n > N_ε ⇒ |t_n| < δ_ε/m;
then, for every x ∈ [−m, m] such that x ≠ 0, we have
| (1/t_n)(e^{i t_n x} − 1) − ix | = |x| | (1/(t_n x))(e^{i t_n x} − 1) − i | < |x| (ε/m) ≤ ε for n > N_ε;
hence, in view of 3 (also, note that (1/t_n)(e^{i t_n x} − 1) − ix = 0 for x = 0), we have
n > N_ε ⇒ P^A-sup | (1/t_n)(ϕ_{t_n} − 1_R) − iξ | < ε,
and hence, in view of 14,
n > N_ε ⇒ || (1/t_n)(U^A(t_0 + t_n) − U^A(t_0)) − iA U^A(t_0) || < ε.
16.2.2 Remark. From 12.6.1 we have that the restriction of the mapping A 7→ U A
(cf. 16.1.7 and 16.1.11a) to the family of all bounded self-adjoint operators is a
bijection from this family onto the family of all norm-continuous c.o.p.u.g.’s.
16.3.1 Theorem. Let H1 and H2 be isomorphic Hilbert spaces and suppose that
V ∈ UA(H1 , H2 ). Let A1 and A2 be self-adjoint operators in H1 and in H2 respec-
tively. Then the following conditions are equivalent:
(a) A2 = V A1 V −1 ;
(b) U A2 (t) = V U A1 (t)V −1 , ∀t ∈ R, if V ∈ U(H1 , H2 ), or
U A2 (−t) = V U A1 (t)V −1 , ∀t ∈ R, if V ∈ A(H1 , H2 ).
Now,
B := V A_1 V^{−1};
The main theorem of this section (cf. 16.4.11) is a special case of a much more
general theorem proved by Valentine Bargmann (Bargmann, 1954). In the analy-
sis of time evolution of conservative and reversible quantum systems (cf. Section
19.6) and of symmetries (for an example, cf. Section 20.3), one is led to consider
what we call continuous one-parameter groups of automorphisms. The special case
of Bargmann’s theorem we consider here is the essential link between these and
c.o.p.u.g.’s.
16.4.2 Remarks.
(a) For any mapping R ∋ t 7→ ωt ∈ Aut Ĥ, by Wigner’s theorem (cf. 10.9.6) we
have that, for each t ∈ R, there exists a family of operators Ut ∈ UA(H) which
are such that
ωUt = ωt , i.e. [Ut u] = ωt ([u]), ∀u ∈ H̃,
and that, given an operator of this family, all the others are multiples of this
one by a factor in T. Hence, for each t ∈ R, either all the operators Ut ∈ UA(H)
which are such that ωUt = ωt are unitary or all of them are antiunitary.
(b) Let R ∋ t 7→ ωt ∈ Aut Ĥ be a homomorphism from the additive group R to
Aut Ĥ.
First, we have (cf. 1.3.3 and 1.3.5b):
ω0 = idĤ ;
ω−t = ωt−1 , ∀t ∈ R;
ωt1 ◦ ωt2 = ωt2 ◦ ωt1 , ∀t1 , t2 ∈ R.
Next, for each t ∈ R and any choice of U_t and of U_{t/2} in UA(H) such that ω_{U_t} = ω_t and ω_{U_{t/2}} = ω_{t/2}, we have
ω_{U_{t/2}²} = ω_{t/2} ◦ ω_{t/2} = ω_t;
then (cf. a) there exists z ∈ T so that U_t = z U_{t/2}², and hence U_t ∈ U(H) (cf. 10.3.16c). Thus, the operators U_t ∈ UA(H) such that ω_{U_t} = ω_t are unitary, for each t ∈ R.
(1 holds by 16.4.4b), or
τ([w], ω_{s_n}([v])) → τ([w], [v]) as n → ∞.
Then, for all u, v ∈ H̃, for every t ∈ R, and for every sequence {t_n} in R such that t_n → t, we have
τ([u], ω_{t_n}([v])) = τ(ω_{−t}([u]), ω_{t_n−t}([v])) → τ(ω_{−t}([u]), [v]) = τ([u], ω_t([v])) as n → ∞
Proof. We fix h ∈ H̃ and δ ∈ (0, 1) throughout the proof. We divide the proof into
five steps.
Step 1: Here we define a ∈ (0, ∞) and V_t ∈ U(H) for all t ∈ (−a, a) so that
V_0 = 1_H and ω_t = ω_{V_t}, ∀t ∈ (−a, a).
The function
R ∋ r 7→ ρr := τ ([h], ωr ([h])) ∈ [0, 1]
is continuous and ρ0 = 1, in view of conditions c and a. Hence, we can choose
a ∈ (0, ∞) so that
r ∈ (−a, a) ⇒ δ < ρr ≤ 1.
For each r ∈ (−a, a), there exists a unique Vr ∈ U(H) so that ωVr = ωr and
(h|Vr h) = | (h|Vr h) | = ρr (1)
(in view of condition b, there exists U_r ∈ U(H) such that ω_{U_r} = ω_r; then, we define V_r := z_r U_r, with z_r := | (h|U_r h) | (h|U_r h)^{−1}; the uniqueness of V_r is obvious, since for any other V_r′ ∈ U(H) such that ω_{V_r′} = ω_r we would have (h|V_r′ h) = zρ_r with z ≠ 1, cf. 16.4.2a). Clearly,
V0 = 1H .
Step 2: Here we prove two auxiliary relations.
For all u ∈ H̃ and r, s ∈ (−a, a), we define
d_{r,s}(u) := d(ω_r([u]), ω_s([u]));
σ_{r,s}(u) := (V_r u|V_s u);
z_{r,s}(u) := V_s u − σ_{r,s}(u) V_r u.
We have
(V_r u|z_{r,s}(u)) = 0,
and hence
1 = ||V_s u||² = ||z_{r,s}(u) + σ_{r,s}(u) V_r u||² = ||z_{r,s}(u)||² + |σ_{r,s}(u)|²,
and hence
||z_{r,s}(u)||² = 1 − |σ_{r,s}(u)|² (2)= 1 − τ(ω_r([u]), ω_s([u]))² ≤ d_{r,s}(u)², (3)
where 2 holds by 16.4.4a. Moreover, we have
||V_s u − V_r u||² = 2 − 2 Re (V_r u|V_s u) ≤ 2|1 − (V_r u|V_s u)| = 2|1 − σ_{r,s}(u)|. (4)
Step 3: Here we prove that, for every t ∈ (−a, a) and every sequence {t_n} in (−a, a) such that t_n → t, we have d_{t,t_n}(u) → 0 as n → ∞, for all u ∈ H̃.
Indeed, for each u ∈ H̃, by condition c we have
τ(ω_t([u]), ω_{t_n}([u])) → τ(ω_t([u]), ω_t([u])) = 1 as n → ∞
(we have used the continuity at t of the function s ↦ τ(ω_t([u]), ω_s([u]))); therefore,
d_{t,t_n}(u) = d(ω_t([u]), ω_{t_n}([u])) = 2^{1/2} (1 − τ(ω_t([u]), ω_{t_n}([u])))^{1/2} → 0 as n → ∞.
Step 4: Here we prove that, for every t ∈ (−a, a) and every sequence {t_n} in (−a, a) such that t_n → t, we have ||V_{t_n} h − V_t h|| → 0 as n → ∞.
For all r, s ∈ (−a, a), we have
(h|z_{r,s}(h)) = (h|V_s h) − σ_{r,s}(h) (h|V_r h) (5)= ρ_s − σ_{r,s}(h) ρ_r
(5 holds in view of 1), and hence
1 − σ_{r,s}(h) = ρ_r^{−1} (ρ_r − ρ_s + (h|z_{r,s}(h))),
and hence
||V_s h − V_r h||² ≤ 2|1 − σ_{r,s}(h)| (6)≤ 2ρ_r^{−1} (|ρ_r − ρ_s| + |(h|z_{r,s}(h))|)
(7)≤ 2ρ_r^{−1} (|τ([h], ω_r([h])) − τ([h], ω_s([h]))| + ||h|| ||z_{r,s}(h)||)
(8)≤ 2ρ_r^{−1} (d(ω_r([h]), ω_s([h])) + d_{r,s}(h)) = 4ρ_r^{−1} d_{r,s}(h) < 4δ^{−1} d_{r,s}(h),
We need the next four results in the proof of 16.4.11, which is the above-
mentioned special case of Bargmann’s general theorem.
16.4.7 Lemma. Let U, V ∈ U(H) and let {U_n}, {V_n} be sequences in U(H) such that:
U_n f → U f as n → ∞, ∀f ∈ H;
V_n f → V f as n → ∞, ∀f ∈ H.
Then:
U_n V_n^{−1} f → U V^{−1} f as n → ∞, ∀f ∈ H;
V_n^{−1} f → V^{−1} f as n → ∞, ∀f ∈ H;
U_n V_n f → U V f as n → ∞, ∀f ∈ H.
Thus,
U_n V_n^{−1} f → U V^{−1} f as n → ∞, ∀f ∈ H. (1)
Since 2 is true, we can substitute V^{−1} for V and V_n^{−1} for V_n in the statement, and thus obtain from 1
U_n V_n f → U V f as n → ∞, ∀f ∈ H.
Proof. It is outside the scope of this book to prove this result, which is a special
case of a theorem of topology about liftings (cf. e.g. Greenberg and Harper, 1981,
6.1 and 6.6).
Proof. First we note that the integral which defines the function λ exists by 2.8.14
and 8.2.6.
Let a, b ∈ R be so that a < b and ϕ(t) = 0 for all t ∈ R − [a, b]. Then,
λ(x) = ∫_{[a,b]} ξ(x, t) ϕ(t) dm(t), ∀x ∈ R.
This proves that the function λ is continuous at x0 , and hence that it is continuous
since x0 was arbitrary.
Let a, b ∈ R be so that a < b and ϕ(s) = 0 for all s ∈ R − [a, b]. Then, for each
y ∈ R,
ϕ(t − y) = 0, ∀t ∈ R − [a + y, b + y].
We fix y0 ∈ R and d ∈ (0, ∞), and define the interval
I(y0 , d) := [a + y0 − d, b + y0 + d];
then, for each y ∈ [y0 − d, y0 + d],
ϕ(t − y) = 0, ∀t ∈ R − I(y0 , d).
Since ϕ′ (s) = 0 for all s 6∈ [a, b], the same reasoning as above proves that, for each
y ∈ [y0 − d, y0 + d],
ϕ′ (t − y) = 0, ∀t ∈ R − I(y0 , d).
Hence, for each h ∈ R − {0} such that |h| ≤ d, we have (cf. 1)
(1/h)(ψ(x_0, y_0 + h) − ψ(x_0, y_0)) + ∫_R ξ(x_0, t) ϕ′(t − y_0) dm(t)
(2)= ∫_{I(y_0,d)} ( (1/h)(χ(t, y_0 + h) − χ(t, y_0)) − (∂χ/∂y)(t, y_0) ) dm(t)
(note that the function χ depends on x0 , which however is fixed).
Now, the function
I(y_0, d) × [y_0 − d, y_0 + d] ∋ (t, y) ↦ (∂χ/∂y)(t, y) ∈ C
is continuous (cf. 1); hence it is uniformly continuous (cf. 2.8.7 and 2.8.15), and hence for each ε ∈ (0, ∞) there exists δ_ε ∈ (0, ∞) such that δ_ε < d and
|y − y_0| < δ_ε ⇒ [ d_2((t, y), (t, y_0)) < δ_ε, ∀t ∈ I(y_0, d) ]
(3)⇒ [ |(∂χ/∂y)(t, y) − (∂χ/∂y)(t, y_0)| < ε, ∀t ∈ I(y_0, d) ].
Proof. The mapping R ∋ t 7→ ωt ∈ Aut Ĥ we are now considering has all the
properties assumed for the mapping considered in 16.4.6 (cf. 16.4.2b and ag2 ).
Then there exists a ∈ (0, ∞) and a mapping
(−a, a) ∋ t 7→ Vt ∈ U(H)
with the properties listed in 16.4.6.
We divide the proof of the theorem into five steps.
Step 1: Here we define a mapping R ∋ t ↦ T(t) ∈ U(H) such that
T(0) = 1_H and ω_t = ω_{T(t)}, ∀t ∈ R.
We set b := a/2. Then,
∀t ∈ R, ∃!(k(t), r(t)) ∈ Z × [0, b) such that t = k(t)b + r(t).
Thus we can define the mapping
T : R → U(H)
t ↦ T(t) := V_b^{k(t)} V_{r(t)}
(we recall that V_b^0 := 1_H). We see that T(0) = 1_H. Moreover, in view of ag_1,
ω_{T(t)}([u]) = [V_b^{k(t)} V_{r(t)} u] = (ω_b ◦ ··· k(t) times ··· ◦ ω_b ◦ ω_{r(t)})([u]) = ω_{k(t)b+r(t)}([u]) = ω_t([u]), ∀u ∈ H̃, ∀t ∈ R,
and hence
ω_{T(t)} = ω_t, ∀t ∈ R.
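The decomposition t = k(t)b + r(t) with k(t) ∈ Z and r(t) ∈ [0, b) used in Step 1 is just floor division with remainder; a minimal sketch (our own, with `k_r` a name we introduce):

```python
import math

def k_r(t, b):
    """Unique (k, r) with k in Z, r in [0, b), and t = k*b + r."""
    k = math.floor(t / b)
    return k, t - k * b

b = 0.5                       # plays the role of b := a/2 for some a > 0
for t in (1.3, -0.2, 0.0, 2.0):
    k, r = k_r(t, b)
    assert 0.0 <= r < b and abs(k * b + r - t) < 1e-12
```

Note that for negative t the integer part k(t) is negative while the remainder r(t) stays in [0, b), which is exactly what the definition of T requires.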
(1 holds for n > n0 ) since r(tn ) − r(t) = tn − t for n > n0 . This proves that the
mapping t 7→ T (t)f is continuous at t.
In the second place we suppose that t ∈ R is such that t = k(t)b; hence, T(t) = V_b^{k(t)}. First let {t_n} be a sequence in R such that t_n → t and such that there exists n_1 ∈ N so that
n > n_1 ⇒ k(t)b = t ≤ t_n < (k(t) + 1)b ⇒ T(t_n) = V_b^{k(t)} V_{r(t_n)};
then,
||T(t_n)f − T(t)f|| (2)= ||V_b^{k(t)} V_{r(t_n)} f − V_b^{k(t)} f|| = ||V_{r(t_n)} f − f|| → 0 as n → ∞
(2 holds for n > n_1) since r(t_n) = t_n − t for n > n_1; by the argument used in the proof of 2.4.2 (b ⇒ a), this implies that
∀ε > 0, ∃δ_ε^+ > 0 such that t ≤ s < t + δ_ε^+ ⇒ ||T(t)f − T(s)f|| < ε.
Next let {t_n} be a sequence in R such that t_n → t and such that there exists n_2 ∈ N so that
n > n_2 ⇒ (k(t) − 1)b ≤ t_n < t = k(t)b ⇒ T(t_n) = V_b^{k(t)−1} V_{r(t_n)};
then t − t_n = b − r(t_n) for n > n_2 and hence
[t_n → t] ⇒ [r(t_n) → b] ⇒ [V_{r(t_n)} f → V_b f as n → ∞],
and hence, since V_b^{k(t)−1} ∈ B(H),
T(t_n)f = V_b^{k(t)−1} V_{r(t_n)} f (3)→ V_b^{k(t)−1} V_b f = T(t)f as n → ∞
(3 holds for n > n_2); by the argument used in the proof of 2.4.2 (b ⇒ a), this implies that
∀ε > 0, ∃δ_ε^− > 0 such that t − δ_ε^− < s < t ⇒ ||T(t)f − T(s)f|| < ε.
Letting δ_ε := min{δ_ε^+, δ_ε^−}, we have proved that
∀ε > 0, ∃δ_ε > 0 such that |s − t| < δ_ε ⇒ ||T(t)f − T(s)f|| < ε,
thus, the function µ is continuous (cf. 2.4.2). Therefore, by 16.4.8 there exists a
continuous function ξ : R2 → R such that
ξ(0, 0) = 0 and µ(r, s) = eiξ(r,s) , ∀(r, s) ∈ R2 . (6)
The function
R3 ∋ (r, s, t) 7→ ξ(r, s) + ξ(r + s, t) − ξ(s, t) − ξ(r, s + t) ∈ R
is obviously continuous; then, since (R3 , d3 ) is a connected metric space (cf. 2.9.10),
its range can only be either R or an interval or a singleton set (cf. 2.9.6); now, 5
implies that
∀(r, s, t) ∈ R3 , ∃nr,s,t ∈ Z such that
ξ(r, s) + ξ(r + s, t) − ξ(s, t) − ξ(r, s + t) = 2nr,s,t π;
hence,
∃n ∈ Z such that ξ(r, s) + ξ(r + s, t) − ξ(s, t) − ξ(r, s + t) = 2nπ, ∀(r, s, t) ∈ R3 ;
if we set r = s = t = 0 in this, we see that n = 0 since ξ(0, 0) = 0; thus,
ξ(r, s) + ξ(r + s, t) = ξ(s, t) + ξ(r, s + t), ∀(r, s, t) ∈ R3 . (7)
If we set r = s = 0 in this, we obtain
ξ(0, t) = 0, ∀t ∈ R. (8)
Step 4: The exponent ξ0 .
Throughout this step we fix a real function ϕ ∈ C_c(R) which is differentiable at all points of R, and also such that the derivative ϕ′ is continuous and ∫_R ϕ dm = 1. A possible choice is
ϕ(x) := (1/2π)(cos x + 1) if x ∈ [−π, π], and ϕ(x) := 0 if x ∉ [−π, π].
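The stated properties of this choice of ϕ can be checked numerically; in this sketch of ours, the integral is approximated by a midpoint Riemann sum on a fine grid.

```python
import numpy as np

def phi(x):
    """phi(x) = (cos x + 1)/(2*pi) on [-pi, pi], 0 elsewhere."""
    return np.where(np.abs(x) <= np.pi, (np.cos(x) + 1) / (2 * np.pi), 0.0)

dx = 1e-4
x = np.arange(-4.0, 4.0, dx) + dx / 2         # midpoints of a grid covering supp(phi)
integral = float(np.sum(phi(x)) * dx)
assert abs(integral - 1.0) < 1e-6             # int_R phi dm = 1

# phi and its derivative -sin(x)/(2*pi) both vanish at x = +-pi, so phi is
# continuously differentiable on all of R.
assert phi(np.array([3.5]))[0] == 0.0
```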
We define the function
R ∋ r ↦ λ(r) := − ∫_R ξ(r, t) ϕ(t) dm(t) ∈ R; (9)
this function is continuous, in view of 16.4.9; moreover, λ(0) = 0 in view of 8.
Next we define the mapping
R ∋ t 7→ W (t) := eiλ(t) T (t) ∈ U(H).
In view of step 1 we have
W (0) = 1H and ωt = ωW (t) , ∀t ∈ R; (10)
in view of step 2 and of 10.1.16b we have that
the mapping R ∋ t 7→ W (t)f ∈ H is continuous, ∀f ∈ H. (11)
In view of 4 and 6 we see that
W (r)W (s) = ei(λ(r)+λ(s)−λ(r+s)+ξ(r,s)) W (r + s), ∀(r, s) ∈ R2 ,
or
W (r)W (s) = eiξ0 (r,s) W (r + s), ∀(r, s) ∈ R2 , (12)
where ξ0 is the function defined by
R2 ∋ (r, s) 7→ ξ0 (r, s) := ξ(r, s) + λ(r) + λ(s) − λ(r + s) ∈ R; (13)
the function R ∋ r ↦ (∂ξ_0/∂s)(r, 0) ∈ R is continuous. (19)
Step 5: Here we prove the statement of the theorem.
We define the function
R² ∋ (r, s) ↦ ψ(r, s) := (∂ξ_0/∂s)(r, s) ∈ R.
If we differentiate 14 with respect to t at t = 0 we obtain
ψ(r + s, 0) = ψ(s, 0) + ψ(r, s), ∀(r, s) ∈ R². (20)
From 19 and 20 we have that the function ψ is continuous.
Next we define the function
R ∋ t ↦ λ_0(t) := ∫_0^t ψ(r, 0) dr ∈ R
(the symbol ∫_a^b has in the present proof the same meaning it has in the proof of 16.1.10), which is continuous.
For all (t_1, t_2) ∈ R², we have
λ_0(t_1 + t_2) − λ_0(t_1) − λ_0(t_2)
= ∫_0^{t_1+t_2} ψ(r, 0) dr − ∫_0^{t_1} ψ(r, 0) dr − ∫_0^{t_2} ψ(r, 0) dr
= ∫_{t_1}^{t_1+t_2} ψ(r, 0) dr − ∫_0^{t_2} ψ(r, 0) dr (21)= ∫_0^{t_2} ψ(t_1 + r, 0) dr − ∫_0^{t_2} ψ(r, 0) dr
(22)= ∫_0^{t_2} ψ(t_1, r) dr (23)= ξ_0(t_1, t_2) − ξ_0(t_1, 0) (24)= ξ_0(t_1, t_2),
where 21 holds by a change of variable (cf. the explanation of 7 and 8 in the proof
of 16.1.10), 22 holds in view of 20, 23 holds by Riemann’s fundamental theorem of
calculus, 24 holds in view of 15.
Finally, we define the mapping
R ∋ t 7→ U (t) := eiλ0 (t) W (t) ∈ U(H).
In view of 10 we have
ωt = ωU(t) , ∀t ∈ R.
In view of 11 and of 10.1.16b we have that
the mapping R ∋ t 7→ U (t)f ∈ H is continuous, ∀f ∈ H;
moreover, in view of 12 and of the equation proved above we have
U (t1 )U (t2 ) = ei(λ0 (t1 )+λ0 (t2 )−λ0 (t1 +t2 )+ξ0 (t1 ,t2 )) U (t1 + t2 )
= U (t1 + t2 ), ∀t1 , t2 ∈ R;
thus, the mapping U is a c.o.p.u.g.
July 25, 2013 17:28 WSPC - Proceedings Trim Size: 9.75in x 6.5in icmp12-master
Chapter 17
The results of operations performed on elements of O(H) (cf. 3.2.1) can be misun-
derstood if they are interpreted as if O(H) were an algebra, since O(H) is not an
algebra and not even a linear space (cf. 3.2.11). This is true in particular for the
commutator of two elements of O(H).
17.1.1 Definitions.
(a) Let A, B be elements of O(H), i.e. linear operators in H. The commutator of
A and B is the linear operator denoted by the symbol [A, B] and defined by
[A, B] := AB − BA.
We note that
D[A,B] = {f ∈ DA ∩ DB : Af ∈ DB and Bf ∈ DA }.
(b) Two elements A and B of B(H) are said to commute if AB = BA, i.e. if
[A, B] = OH .
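For elements of B(H) the definitions above are unproblematic; a minimal numerical illustration (our own, with 2×2 matrices standing in for B(H), where every linear operator is bounded and everywhere defined):

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, -1.0]])

def commutator(A, B):
    """[A, B] := AB - BA, defined on all of H when A, B are in B(H)."""
    return A @ B - B @ A

# These particular A and B do not commute: [A, B] != O_H ...
assert not np.allclose(commutator(A, B), 0)
# ... while every operator commutes with itself and with 1_H.
assert np.allclose(commutator(A, A), 0)
assert np.allclose(commutator(A, np.eye(2)), 0)
```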
The definition given in 17.1.1b is the natural one for elements of B(H), since
B(H) is an algebra (cf. 4.3.5). It might be thought that this definition could be
generalized meaningfully to arbitrary elements of O(H) in a direct way, by saying
that two elements A and B of O(H) commute if [A, B] ⊂ OH . However, this
definition would not be very useful. Firstly, it is clear that the content of the
relation [A, B] ⊂ OH depends on the size of D[A,B] (it is even void if D[A,B] = {0H }).
Moreover, if A and B are self-adjoint elements of B(H) then the relation [A, B] = OH
529
has consequences (cf. 17.1.4 and 17.1.7) which are not granted in general by the
relation [A, B] ⊂ OH when A and B are self-adjoint elements of O(H) which are
not defined on the whole of H (we recall that, for a self-adjoint operator A, DA = H
is equivalent to A ∈ B(H), cf. 12.4.7); this is proved by examples (cf. 17.1.8). In
general, the condition [A, B] ⊂ OH does not seem to lead to interesting results for
self-adjoint operators A and B which are not in B(H). The main task of this section
is to find a commutativity condition for self-adjoint operators which has the same
consequences whether or not the operators are in B(H).
We start off by noting that there is a condition of commutativity which has
already played a role in previous chapters.
17.1.3 Remarks.
(a) Let (X, A) be a measurable space and let P : A → P(H) be a projection
valued measure. If B ∈ B(H) is such that [B, P (E)] = OH for all E ∈ A, then
B commutes with JϕP for all ϕ ∈ M(X, A, P ) (cf. 14.2.14e).
(b) The previous remark implies that, if A is a self-adjoint operator in H and
B ∈ B(H) is such that [B, P A (E)] = OH for all E ∈ A(dR ), then B commutes
with ϕ(A) for all ϕ ∈ M(R, A(dR ), P A ); in particular, B commutes with A
since A = ξ(A).
(c) If A is a self-adjoint operator in H and B ∈ B(H) commutes with A, then
[B, P A (E)] = OH for all E ∈ A(dR ) (cf. 15.2.1B).
(d) Remarks b and c imply that, if A is a self-adjoint operator in H and B ∈ B(H)
commutes with A, then B commutes with ϕ(A) for all ϕ ∈ M(R, A(dR ), P A ).
We note that, while the definition provided in 17.1.1b sets up a relation in B(H)
which is obviously symmetric, for A ∈ O(H) and B ∈ B(H) the condition BA ⊂ AB
is asymmetric if DA 6= H. The implication equivalent to BA ⊂ AB that is written
in 17.1.2 makes this asymmetry immediately clear. If neither of two linear operators
A and B is an element of B(H) then we do not try at all to define anything like
commutativity for A and B, unless both operators are self-adjoint. Indeed, 17.1.4
proves that if both A ∈ O(H) and B ∈ B(H) are self-adjoint then the condition
BA ⊂ AB is in fact equivalent to a relation in which A and B have equal roles,
and suggests how this condition can be generalized to a symmetric relation between
any kind of self-adjoint operators (cf. 17.1.5). After that, 17.1.7 shows that this
generalization is a meaningful condition of commutativity for self-adjoint operators.
In view of 17.1.4, the following definition is consistent with the definition given
in 17.1.2.
17.1.6 Remarks.
(cf. 13.3.2d).
(c) Let A and B be commuting (in the sense of 17.1.5) self-adjoint operators,
let ϕ be a real element of M(R, A(dR ), P A ), let ψ be a real element of
M(R, A(dR ), P B ). Then the operators ϕ(A) and ψ(B) are self-adjoint and
they commute (in the sense of 17.1.5). This follows immediately from 15.3.8.
Every self-adjoint operator defines the family of operators that contains the
operator itself and the ranges of the projection valued measure and of the continuous
one-parameter unitary group determined by the operator. The next theorem proves
that two self-adjoint operators commute (in the sense of 17.1.5) iff any bounded
element of the family defined as above by one of them commutes (in the sense of
17.1.2) with any element of the family defined as above by the other one.
then we have
Af = −i lim_{n→∞} g_n
(cf. 16.1.6); since P^B(F) ∈ B(H), this implies that
the sequence {P^B(F) g_n} is convergent and P^B(F) Af = −i lim_{n→∞} P^B(F) g_n;
From 17.1.7 we see that if two self-adjoint operators A and B commute (in the
sense defined in 17.1.5) then [A, B] ⊂ OH . It is almost obvious that the converse
cannot be true in general because two self-adjoint operators A and B can be such
that [A, B] ⊂ OH , but with D[A,B] so small that the relation [A, B] ⊂ OH is of no
consequence. Now, one might conjecture that, if [A, B] ⊂ OH and D[A,B] is dense in
H, then A and B should commute. Example a in 17.1.8 proves that this conjecture
is false. Then one might go one step further and conjecture that if D[A,B] is not
only dense in H but also large enough so that two self-adjoint operators A and B
are uniquely determined by their restrictions to D[A,B] , then [A, B] ⊂ OH could be
17.1.8 Examples.
(a) The Hilbert space of this example is L2 (a, b). As in 12.4.25, here we do not
distinguish between a symbol ϕ for an element of C(a, b) and the symbol [ϕ] for
the element of L2 (a, b) that contains ϕ. Accordingly, the family of functions
C0∞ (a, b) defined in 11.4.17 is identified with the subset ι(C0∞ (a, b)) of L2 (a, b).
We consider the operators A0 and A1 defined as Aθ in 12.4.25, with θ := 0 and
θ := 1. It is obvious that
[A0 , A1 ] ⊂ OL2 (a,b) .
Moreover, it is obvious that C0∞ (a, b) ⊂ D[A0 ,A1 ] . Now, C0∞ (a, b) is dense in
L2 (a, b) (cf. 11.4.21) and hence so is D[A0 ,A1 ] . Then we have
O_{L²(a,b)} = O_{L²(a,b)}^† ⊂ [A_0, A_1]^†
(cf. 12.1.4; the equation O_{L²(a,b)} = O_{L²(a,b)}^† follows e.g. from 12.1.3B), and hence
[A_0, A_1]^† = O_{L²(a,b)},
and hence
A_1^† A_0^† − A_0^† A_1^† ⊂ (A_0 A_1)^† − (A_1 A_0)^† ⊂ (A_0 A_1 − A_1 A_0)^† = O_{L²(a,b)}
(cf. 12.3.4a and 12.3.1a). Thus, A_0^† and A_1^† are self-adjoint operators (since A_0 and A_1 are essentially self-adjoint, cf. 12.4.25) such that
[A_0^†, A_1^†] ⊂ O_{L²(a,b)}
and
D_{[A_0^†, A_1^†]} is dense in L²(a, b)
(note that D_{[A_0, A_1]} ⊂ D_{[A_0^†, A_1^†]} since A_0 ⊂ A_0^† and A_1 ⊂ A_1^†).
Now, the conditions of 15.3.4B hold true for both the self-adjoint operators A_0^† and A_1^† (cf. 12.4.25). The number 0 is an eigenvalue of A_0^† and its eigenspace is the one-dimensional subspace generated by the element u of L²(a, b) defined by
u(x) := (1/(b − a))^{1/2}, ∀x ∈ [a, b];
therefore we have
P^{A_0^†}({0}) ϕ = (u|ϕ) u, ∀ϕ ∈ L²(a, b)
(we identify the symbols ϕ and [ϕ] also for an element ϕ of L²(a, b)). The number 1/(b − a) is an eigenvalue of A_1^† and its eigenspace is the one-dimensional subspace generated by the element v of L²(a, b) defined by
v(x) := (1/(b − a))^{1/2} exp( i (x − a)/(b − a) ), ∀x ∈ [a, b];
therefore we have
P^{A_1^†}({1/(b − a)}) ϕ = (v|ϕ) v, ∀ϕ ∈ L²(a, b).
Thus, we have
P^{A_0^†}({0}) P^{A_1^†}({1/(b − a)}) v = (u|v) u
and
P^{A_1^†}({1/(b − a)}) P^{A_0^†}({0}) v = (u|v) (v|u) v.
Now,
(u|v) = (1/(b − a)) ∫_a^b exp( i (x − a)/(b − a) ) dx = ∫_0^1 e^{is} ds ≠ 0,
and hence
[ P^{A_0^†}({0}), P^{A_1^†}({1/(b − a)}) ] ≠ O_{L²(a,b)}.
This proves that the self-adjoint operators A†0 and A†1 do not commute (in the
sense defined in 17.1.5).
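The computation above can be checked numerically. In this sketch of ours, (a, b) = (0, 1) and L²(a, b) is replaced by a fine grid; the rank-one projections onto u(x) = 1 and v(x) = exp(i(x − a)/(b − a)) are applied to v exactly as in the text.

```python
import numpy as np

N = 2000
x = np.linspace(0.0, 1.0, N, endpoint=False)
dx = 1.0 / N
u = np.ones(N)                                # u(x) = 1 on (0, 1)
v = np.exp(1j * x)                            # v(x) = exp(i(x - a)/(b - a))
inner = lambda f, g: np.sum(f.conj() * g) * dx   # grid approximation of (f|g)

uv = inner(u, v)                              # approximates int_0^1 e^{is} ds
assert abs(uv) > 0.1                          # (u|v) != 0

# The two orders of application give different vectors, so the spectral
# projections do not commute:
Pu_Pv_v = uv * u                              # P^{A0+}({0}) P^{A1+}({1/(b-a)}) v = (u|v) u
Pv_Pu_v = uv * inner(v, u) * v                # P^{A1+}({1/(b-a)}) P^{A0+}({0}) v = (u|v)(v|u) v
assert not np.allclose(Pu_Pv_v, Pv_Pu_v)
```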
We point out that D_{[A_0^†, A_1^†]}, though dense in L²(a, b), cannot be such that the restrictions of A_0^† and A_1^† to D_{[A_0^†, A_1^†]} are essentially self-adjoint, since these restrictions are equal but the self-adjoint operators A_0^† and A_1^† are not (cf. 12.4.11c and 12.4.13).
(b) This example is due to Edward Nelson (cf. Reed and Simon, 1980, 1972, p. 306), and its key point is the proof of the following proposition.
There exists a Hilbert space K, a linear manifold D dense in K, and two linear
operators A and B in K so that:
(a) DA = DB = D, A(D) ⊂ D, B(D) ⊂ D;
(b) ABf − BAf = 0K , ∀f ∈ D;
(c) A and B are essentially self-adjoint;
(d) ∃f ∈ D such that U A (1)U B (1)f 6= U B (1)U A (1)f .
We do not prove this proposition. A scheme of its proof can be found at p.273–
274 of (Reed and Simon, 1980, 1972).
We note that from conditions a and b we have
AB − BA ⊂ OK ,
and hence (in view of 12.1.4), since condition a implies D_{BA−AB} = D and hence that D_{BA−AB} is dense in K,
O_K = O_K^† ⊂ (BA − AB)^†,
and hence
(BA − AB)^† = O_K.
Then from condition c we have (in view of 12.1.6b, 12.3.4a, 12.3.1a)
Ā B̄ − B̄ Ā = A^† B^† − B^† A^† ⊂ (BA)^† − (AB)^† ⊂ (BA − AB)^† = O_K.
However, condition d proves (in view of 17.1.7) that the self-adjoint operators
A and B do not commute (in the sense defined in 17.1.5).
Two self-adjoint operators commute (in the sense of 17.1.5) iff they are functions
of a third self-adjoint operator (cf. 17.1.10 a ⇔ c). The difficult part of this
equivalence is proved by the next theorem. The main idea for the proof we provide
is drawn from Section 130 of (Riesz and Sz.-Nagy, 1972). We will write this proof
in full detail even at the risk of belabouring the obvious.
[0, 1) × [0, 1) =
σ_1^1 σ_4^1
σ_2^1 σ_3^1
(a 2 × 2 array of subsquares);
for m > 1, supposing that {σ_n^m}_{n=1,...,4^m} has already been defined, we define {σ_n^{m+1}}_{n=1,...,4^{m+1}} by
σ_n^m =
σ_{4n−3}^{m+1} σ_{4n}^{m+1}
σ_{4n−2}^{m+1} σ_{4n−1}^{m+1}
(a 2 × 2 array of subsquares), for all n ∈ {1, ..., 4^m}. (3)
It can be easily proved by induction that, for all m ∈ N, for all n ∈ {1, ..., 4^m}, for all l ∈ N such that m < l,
ι_n^m = ∪_{s∈I(m,l)} ι_s^l and σ_n^m = ∪_{s∈I(m,l)} σ_s^l, (4)
For all m ∈ N and for all l ∈ N such that m ≤ l, in view of 4 we have, for n = 1, ..., 4^m and s = 1, ..., 4^l:
either ι_s^l ⊂ ι_n^m or ι_s^l ∩ ι_n^m = ∅; (5)
ι_s^l ⊂ ι_n^m iff σ_s^l ⊂ σ_n^m; (7)
ι_s^l ∩ ι_n^m = ∅ iff σ_s^l ∩ σ_n^m = ∅; (8)
For l, m ∈ N, let I_1 and I_2 be subfamilies of {1, ..., 4^l} and of {1, ..., 4^m} respectively. First we note that if l = m then (in view of 5)
∪_{s∈I_1} ι_s^m = ∪_{n∈I_2} ι_n^m ⇒ I_1 = I_2 and hence ∪_{s∈I_1} σ_s^m = ∪_{n∈I_2} σ_n^m. (9)
∀s ∈ I_1, ∃n ∈ I_2 s.t. ι_s^l ⊂ ι_n^m and hence (in view of 7) s.t. σ_s^l ⊂ σ_n^m;
and hence (in view of 4) there would exist t ∈ {1, ..., 4^l} such that
σ_t^l ⊂ σ_n^m − ∪_{s∈I_1} σ_s^l,
Step 3: We prove that there exists a projection valued measure T on A(d_R) such that T(ι_n^m) = P(σ_n^m), ∀m ∈ N, ∀n ∈ {1, ..., 4^m}.
Let S be the collection of subsets of [0, 1) whose elements are the empty set and
all the intervals [a, b) such that 0 ≤ a < b ≤ 1 and
a = 0 or a = n_a 4^{−m_a}, b = 1 or b = n_b 4^{−m_b} (12)
with m_a, m_b ∈ N and n_a, n_b elements of N which are not multiples of 4 (equivalently, if a ≠ 0 then m_a is the least positive integer so that a = n_a 4^{−m_a} with n_a ∈ N, and similarly for m_b if b ≠ 1). It is obvious that S is a semialgebra on [0, 1).
and we define
Q(E) := Σ_{n∈I(E)} P(σ_n^{m(E)}) = P( ∪_{n∈I(E)} σ_n^{m(E)} ).
We now prove that the mapping Q satisfies all the conditions of 13.4.4.
q1 : Let {E1 , ..., EN } be a disjoint family of elements of S such that
  E := ⋃_{k=1}^N E_k ∈ S.
We define m := max{m(E), m(E_1), ..., m(E_N)}. In view of 4 there are subsets of {1, ..., 4^m}, I and I_k for k = 1, ..., N, so that
  E = ⋃_{s∈I} ι_s^m  and  E_k = ⋃_{s∈I_k} ι_s^m for k = 1, ..., N;
moreover,
  ⋃_{s∈I} ι_s^m = E = ⋃_{k=1}^N E_k = ⋃_{k=1}^N (⋃_{s∈I_k} ι_s^m)
implies (in view of 9) I = ⋃_{k=1}^N I_k, and hence
  ⋃_{n∈I(E)} σ_n^{m(E)} = ⋃_{s∈I} σ_s^m = ⋃_{k=1}^N (⋃_{s∈I_k} σ_s^m) = ⋃_{k=1}^N (⋃_{r∈I(E_k)} σ_r^{m(E_k)}),
and hence
  Q(E) = P(⋃_{n∈I(E)} σ_n^{m(E)}) = Σ_{k=1}^N P(⋃_{r∈I(E_k)} σ_r^{m(E_k)}) = Σ_{k=1}^N Q(E_k),
where the second equality is true because P is a projection valued measure (if k ≠ h then σ_r^{m(E_k)} ∩ σ_t^{m(E_h)} = ∅ in view of 8 since ι_r^{m(E_k)} ∩ ι_t^{m(E_h)} = ∅, for all r ∈ I(E_k) and t ∈ I(E_h)).
and hence
  lim_{m→∞} μ_f^Q([b − 4^{−m}, b)) = 0.
To conclude the proof that condition q4 holds true, we note that obviously
  \overline{[a, b − 4^{−m})} = [a, b − 4^{−m}] ⊂ [a, b), ∀m > m_0,
and recall 2.8.7.
Step 4: We prove the statement of the theorem.
We define the self-adjoint operator B := J_ξ^T. Clearly,
  T = P^B.   (14)
For all m ∈ N and n ∈ {1, ..., 4^m}, we denote by (x_n^m, y_n^m) the bottom-left corner of the square σ_n^m. For each m ∈ N, we define the function
  ρ_m := Σ_{n=1}^{4^m} x_n^m χ_{ι_n^m},
which is obviously an element of L^∞(R, A(dR), T). We note that
  ρ_m(x) ≤ ρ_{m+1}(x), ∀x ∈ R, ∀m ∈ N;
indeed, fix x ∈ [0, 1) and for each m ∈ N let n_m(x) ∈ {1, ..., 4^m} be such that x ∈ ι_{n_m(x)}^m; then,
  ι_{n_{m+1}(x)}^{m+1} ⊂ ι_{n_m(x)}^m
17.1.11 Definition. Let A1 and A2 be commuting (in the sense of 17.1.5) self-
adjoint operators in H, and let P be the projection valued measure of 17.1.10b. For
a function ϕ ∈ M(R^2, A(d^2), P), we write
  ϕ(A_1, A_2) := J_ϕ^P
and we say that this operator is a function of A1 and A2 . This name is justified by
the fact that ϕ(A1 , A2 ) is often the closure of the function ϕ of A1 and A2 defined
in an obvious direct way. In 17.1.12 we examine two instances of this.
17.1.12 Proposition. Let A1 and A2 be commuting (in the sense of 17.1.5) self-
adjoint operators in H.
  R^2 ∋ (x_1, x_2) ↦ ϕ(x_1, x_2) := x_1 + x_2 ∈ R,
  R^2 ∋ (x_1, x_2) ↦ ψ(x_1, x_2) := x_1 x_2 ∈ R,
  R^2 ∋ (x_1, x_2) ↦ π_i(x_1, x_2) := x_i ∈ R,
and the operator B_i := J_{π_i}^P. The operators B_1 and B_2 are self-adjoint and
The operator A_1 + A_2 is adjointable since D_{ϕ(A_1,A_2)} ⊂ D_{A_1+A_2} (cf. 4.4.10) and \overline{D_{ϕ(A_1,A_2)}} = H (cf. 14.2.13). Moreover, the operator ϕ(A_1, A_2) is self-adjoint (cf. 14.3.17). Therefore, A_1 + A_2 is essentially self-adjoint by 12.4.11. Then we have \overline{A_1 + A_2} = (A_1 + A_2)^† (cf. 12.1.6b).
If A_1 ∈ B(H) then (cf. 12.3.1b) (A_1 + A_2)^† = A_1^† + A_2^† = A_1 + A_2.
b: We note that π1 π2 = π2 π1 = ψ. Then, 1 and 14.3.12 imply that
The next result has an important role in the discussion of compatible quantum
observables (cf. 19.5.23 and 19.5.24f). It may be interesting to note that in a way
it extends to a pair of commuting self-adjoint operators what happens for a single
self-adjoint operator (cf. 15.2.4 a ⇒ d).
17.1.13 Theorem. Let A1 and A2 be commuting (in the sense of 17.1.5) self-
adjoint operators in H, and let λ1 ∈ σ(A1 ). Then, for every ε > 0, there exist
λ2 ∈ σ(A2 ) and uε ∈ DA1 ∩ DA2 ∩ H̃ so that
  |⟨A_i⟩_{u_ε} − λ_i| ≤ ε and Δ_{u_ε} A_i ≤ 2ε, for i = 1, 2
(for ⟨A⟩_u and Δ_u A, cf. 15.2.3).
Proof. We fix ε ∈ (0, ∞). Then P^{A_1}((λ_1 − ε, λ_1 + ε)) ≠ O_H (cf. 15.2.4). We define the mapping
  Q : A(dR) → P(H)
  E ↦ Q(E) := P^{A_1}((λ_1 − ε, λ_1 + ε)) P^{A_2}(E);
we point out that this definition is consistent (in view of 13.2.1) because A1 and A2
commute. We note that, for each f ∈ H and all E ∈ A(dR ),
  μ_f^Q(E) = (P^{A_1}((λ_1 − ε, λ_1 + ε))f | P^{A_2}(E) P^{A_1}((λ_1 − ε, λ_1 + ε))f) = μ_g^{P^{A_2}}(E)
Then, by 2.3.16 and 2.3.18 there is a countable subset {µn }n∈J of G so that
  G = ⋃_{n∈J} (μ_n − ε_{μ_n}, μ_n + ε_{μ_n}).
and
  ‖A_i u_ε − λ_i u_ε‖^2 = ∫_{(λ_i−ε, λ_i+ε)} |ξ − λ_i|^2 dμ_{u_ε}^{P^{A_i}} ≤ ε^2 μ_{u_ε}^{P^{A_i}}((λ_i − ε, λ_i + ε)) = ε^2.
17.1.14 Example. Suppose that H is a separable Hilbert space and let A1 and A2
be self-adjoint operators in H such that conditions a, b, c of 15.3.4B hold true for
both of them. Then the following conditions are equivalent:
(α) A1 and A2 commute (in the sense defined in 17.1.5);
(β) if, for k = 1, 2, {(λ_n^k, P_n^k)}_{n∈I_k} is the family associated with A_k as {(λ_n, P_n)}_{n∈I} was associated with A in 15.3.4B, then [P_n^1, P_l^2] = O_H for all (n, l) ∈ I_1 × I_2;
(γ) there exists a c.o.n.s. {vj }j∈J in H whose elements are eigenvectors of both A1
and A2 .
Indeed, the equivalence of conditions α and β follows at once from
  P^{A_k}(E)f = Σ_{n∈I_E^k} P_n^k f, ∀f ∈ H, ∀E ∈ A(dR),
with I_E^k := {n ∈ I_k : λ_n^k ∈ E}, for k = 1, 2 (cf. 15.3.4B).
Moreover, if condition β is true then {P_n^1 P_l^2}_{(n,l)∈I_1×I_2} is a family of projections (cf. 13.2.1) which is so that
  (P_n^1 P_l^2)(P_m^1 P_j^2) = O_H if (n, l) ≠ (m, j);   (1)
if we set I_0 := {(n, l) ∈ I_1 × I_2 : P_n^1 P_l^2 ≠ O_H}, we have
  Σ_{(n,l)∈I_0} P_n^1 P_l^2 f = Σ_{n∈I_1} P_n^1 (Σ_{l∈I_2} P_l^2 f) = Σ_{n∈I_1} P_n^1 f = f, ∀f ∈ H;   (2)
for each (n, l) ∈ I_0, we fix an o.n.s. {u_{n,l,s}}_{s∈I_{n,l}} which is complete in the subspace R_{P_n^1 P_l^2} (cf. 10.7.2); then the set ⋃_{(n,l)∈I_0} {u_{n,l,s}}_{s∈I_{n,l}} is an o.n.s. in H in view of 1 (cf. 13.2.8d and 13.2.9c) and it is complete in H by 10.6.4 (with M := H) since (cf. 2 and 13.1.10)
  f = Σ_{(n,l)∈I_0} P_n^1 P_l^2 f = Σ_{(n,l)∈I_0} Σ_{s∈I_{n,l}} (u_{n,l,s}|f) u_{n,l,s}, ∀f ∈ H;
moreover, all the elements of this c.o.n.s. are eigenvectors of both A_1 and A_2, since
  R_{P_n^1 P_l^2} = R_{P_n^1} ∩ R_{P_l^2}, ∀(n, l) ∈ I_0
(cf. 13.2.1e) and since all the non-null elements of R_{P_n^k} are eigenvectors of A_k, for all n ∈ I_k and for k = 1, 2 (cf. 15.3.4B). This proves that condition β implies condition γ.
Conversely, assume that condition γ is true. Then (cf. 15.3.4B):
  ∀n ∈ I_1, ∃J_n^1 ⊂ J s.t. P_n^1 f = Σ_{j∈J_n^1} (v_j|f) v_j, ∀f ∈ H;
  ∀l ∈ I_2, ∃J_l^2 ⊂ J s.t. P_l^2 f = Σ_{j∈J_l^2} (v_j|f) v_j, ∀f ∈ H.
(if J_n^1 ∩ J_l^2 = ∅ then the sum of the series is defined to be 0_H), and this proves that condition β is true.
Now suppose that A_1 and A_2 commute and let {(λ_n^k, P_n^k)}_{n∈I_k} be as in condition β, for k = 1, 2. For each G ∈ A(d^2), let J_G := {(n, l) ∈ I_0 : (λ_n^1, λ_l^2) ∈ G} and let P_G be the projection defined by
  P_G f := Σ_{(n,l)∈J_G} P_n^1 P_l^2 f, ∀f ∈ H.
hence, μ_f^P is a measure (cf. 8.3.8 with (X, A) := (R^2, A(d^2)), I := I_0, x_{(n,l)} := (λ_n^1, λ_l^2), a_{(n,l)} := ‖P_n^1 P_l^2 f‖^2) and
  μ_f^P(R^2) = Σ_{(n,l)∈I_0} ‖P_n^1 P_l^2 f‖^2 = ‖Σ_{(n,l)∈I_0} P_n^1 P_l^2 f‖^2 = ‖f‖^2.
Therefore, P is a projection valued measure (cf. 13.3.5). Furthermore, for every E ∈ A(dR),
  (f|P(E × R)f) = Σ_{n∈I_E^1} Σ_{l∈I_2} ‖P_n^1 P_l^2 f‖^2 = Σ_{n∈I_E^1} Σ_{l∈I_2} ‖P_l^2 P_n^1 f‖^2
    = Σ_{n∈I_E^1} ‖P_n^1 f‖^2 = μ_f^{P^{A_1}}(E) = (f|P^{A_1}(E)f), ∀f ∈ H.
  (f|ϕ(A_1, A_2)f) = ∫_{R^2} ϕ dμ_f^P = Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) ‖P_n^1 P_l^2 f‖^2
    = (f | Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) P_n^1 P_l^2 f), ∀f ∈ D_{ϕ(A_1,A_2)};
since the mapping D_{ϕ(A_1,A_2)} ∋ f ↦ Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) P_n^1 P_l^2 f is obviously a linear operator (its definition is consistent by 10.4.7b), in view of 10.2.12 this implies that
  ϕ(A_1, A_2)f = Σ_{(n,l)∈I_0} ϕ(λ_n^1, λ_l^2) P_n^1 P_l^2 f, ∀f ∈ D_{ϕ(A_1,A_2)}.
this case to the study of the two operators A^M and A^{M^⊥}? The answer is clearly in the affirmative if D_A = H. Indeed, if D_A = H then for each f ∈ D_A we have the unique representation
  f = f_1 + f_2,
where f_1 ∈ M and f_2 ∈ M^⊥ (cf. 10.4.1), from which it follows that
  Af = A^M f_1 + A^{M^⊥} f_2.
However, if D_A ≠ H then we do not have in general
  D_A = (D_A ∩ M) + (D_A ∩ M^⊥)
(cf. 3.1.8 for the sum of two subsets of a linear space). In fact, we have the following
proposition, which is preliminary to the idea of a reducing subspace.
and hence
f ∈ D ⇒ [PM f ∈ D and PM ⊥ f ∈ D] ⇒
[PM f ∈ D ∩ M and PM ⊥ f ∈ D ∩ M ⊥ ] ⇒
[∃(f1 , f2 ) ∈ (D ∩ M ) × (D ∩ M ⊥ ) so that f = f1 + f2 ] ⇒
f ∈ (D ∩ M ) + (D ∩ M ⊥ ).
This proves the inclusion
D ⊂ (D ∩ M ) + (D ∩ M ⊥ ).
On the other hand, the inclusion
(D ∩ M ) + (D ∩ M ⊥ ) ⊂ D
is obvious.
17.2.3 Theorem. Let A ∈ O(H) and M ∈ S (H). The following conditions are
equivalent:
(a) M and M ⊥ are invariant subspaces for A (i.e. Af ∈ M , ∀f ∈ DA ∩ M , and
Ag ∈ M ⊥ , ∀g ∈ DA ∩ M ⊥ ) and PM (DA ) ⊂ DA ;
(b) PM (DA ) ⊂ DA and there exist A1 ∈ O(M ), A2 ∈ O(M ⊥ ) so that
DA1 = PM (DA ), DA2 = PM ⊥ (DA ),
Af = A1 PM f + A2 PM ⊥ f , ∀f ∈ DA ;
(c) PM A ⊂ APM (i.e. PM commutes with A, in the sense defined in 17.1.2).
If these conditions are satisfied, then
(d) the operators A_1 and A_2 are uniquely determined by condition b; in fact,
  A_1 = A^M and A_2 = A^{M^⊥}.
This proves condition b, with A_1 := A^M and A_2 := A^{M^⊥}.
b ⇒ c: We assume condition b. Then we have
f ∈ DPM A ⇒ f ∈ DA ⇒ PM f ∈ DA ,
i.e. DPM A ⊂ DAPM . Moreover, for every f ∈ DPM A (= DA ) we have
PM Af = PM (A1 PM f + A2 PM ⊥ f ) = A1 PM f
(since A1 PM f ∈ M and A2 PM ⊥ f ∈ M ⊥ , cf. 13.1.3b,c), and also
  AP_M f = A_1 P_M^2 f + A_2 P_{M^⊥} P_M f = A_1 P_M f,
and hence
PM Af = APM f.
c ⇒ a: We assume condition c. Then we have DPM A ⊂ DAPM and hence
f ∈ DA ⇒ f ∈ DPM A ⇒ f ∈ DAPM ⇒ PM f ∈ DA ,
i.e. PM (DA ) ⊂ DA . Moreover we have
  f ∈ D_A ∩ M ⇒ Af (1)= AP_M f (2)= P_M Af ∈ M
(1 holds true because f ∈ M and 2 because f ∈ D_A = D_{P_M A}), and also
  f ∈ D_A ∩ M^⊥ ⇒ Af (3)= AP_{M^⊥} f = Af − AP_M f (4)= Af − P_M Af (5)= P_{M^⊥} Af ∈ M^⊥
(3 holds true because f ∈ M^⊥, 4 because f ∈ D_A = D_{P_M A} and hence f ∈ D_{AP_M}, 5 because f ∈ D_{P_M A}).
d: We suppose that P_M(D_A) ⊂ D_A and that A_1 ∈ O(M), A_2 ∈ O(M^⊥) are so that condition b holds true. Then condition a holds true as well and we have
  D_{A_1} = P_M(D_A) = D_A ∩ M = D_{A^M},
in view of 17.2.2, and
  A_1 f = A_1 P_M f = A_1 P_M^2 f + A_2 P_{M^⊥} P_M f = AP_M f = A^M f, ∀f ∈ D_{A_1}.
This proves that A_1 = A^M. The proof of the equation A_2 = A^{M^⊥} is similar.
Proof. Let (0_M, g) ∈ \overline{G_{A^M}}; then there exists a sequence {(f_n, g_n)} in G_{A^M} so that
  f_n → 0_M and g_n → g;
now, (f_n, g_n) ∈ G_A for all n ∈ N and hence (since 0_M = 0_H) (0_H, g) ∈ \overline{G_A}, and hence (since A is closable) g = 0_H = 0_M. By 4.4.11a, this proves that A^M is closable.
We have P_M A ⊂ AP_M by hypothesis. Let f ∈ D_{\overline{A}} (= D_{P_M \overline{A}}); then there exists a sequence {f_n} in D_A so that
  f_n → f, {Af_n} is convergent, \overline{A}f = lim_{n→∞} Af_n
(cf. 4.4.10); now, the sequence {P_M f_n} is in D_A and (since P_M is continuous)
  P_M f_n → P_M f, {P_M Af_n} is convergent, i.e. {AP_M f_n} is convergent;
therefore,
  P_M f ∈ D_{\overline{A}} and \overline{A}P_M f = lim_{n→∞} AP_M f_n = lim_{n→∞} P_M Af_n = P_M lim_{n→∞} Af_n = P_M \overline{A}f.
  (A^M)^† = (A^†)^M,
where (A^M)^† denotes the adjoint of the operator A^M in the Hilbert space M (hence, (A^M)^† ∈ O(M)).
  g = g_1 + g_2 with g_1 ∈ D_A ∩ M and g_2 ∈ D_A ∩ M^⊥,
  (f|g_1) = (f|g_2) = 0,
  (f|g) = 0.
Since g was an arbitrary element of D_A and D_A^⊥ = {0_H} (cf. 10.4.4d), this proves that
(3 holds in view of 17.2.3b,d; 4 holds because A^{M^⊥} P_{M^⊥} f ∈ M^⊥ and g ∈ M; 5 holds because P_{M^⊥} f ∈ M^⊥ and (A^M)^† g ∈ M, since (A^M)^† denotes the adjoint of the operator A^M in the Hilbert space M); therefore g ∈ D_{A^†} and hence g ∈ D_{A^†} ∩ M = D_{(A^†)^M}. This proves that
  D_{(A^M)^†} ⊂ D_{(A^†)^M},
and hence, in view of 2, that (A^†)^M = (A^M)^†.
17.2.9 Proposition. Let A ∈ B(H) and M ∈ S (H). The following conditions are
equivalent:
(a) A is reduced by M ;
(b) M is an invariant subspace for both A and A† (i.e. Af ∈ M and A† f ∈ M ,
∀f ∈ M ; recall that DA† = H, cf. 12.2.2).
  ‖A^M f‖_M = ‖Af‖_H = ‖f‖_H = ‖f‖_M, ∀f ∈ M.
  A^{−1} f = A^† f ∈ M
  f = A(A^{−1} f) = A^M(A^{−1} f).
Proof. The parenthetic equivalence in condition b follows from 13.1.5 and 17.2.9.
The parenthetic equivalence in condition c follows from 17.2.11.
Condition a is P_M A ⊂ AP_M and hence, in view of 17.1.4, it is equivalent to condition a of 17.1.7 with B := P_M. Condition b is P_M P^A(E) = P^A(E) P_M for all E ∈ A(dR), and hence it is condition b of 17.1.7 with B := P_M. Condition c is P_M U^A(t) = U^A(t) P_M for all t ∈ R, and hence it is condition e of 17.1.7 with B := P_M. This proves that conditions a, b, c are equivalent.
d: We assume conditions a and b. Then we have
  P^{A,M}(E) ∈ P(M), ∀E ∈ A(dR),
by 17.2.12, and also
  μ_f^{P^{A,M}} = μ_f^{P^A} and μ_f^{P^{A,M}}(R) = (f|1_H f)_H = (f|1_M f)_M = ‖f‖_M^2, ∀f ∈ M.
In view of 13.3.5, this proves that P^{A,M} is a projection valued measure with values in P(M). In view of 15.2.2e, we have, for f ∈ M,
  ∫_R ξ^2 dμ_f^{P^{A,M}} < ∞ iff ∫_R ξ^2 dμ_f^{P^A} < ∞ iff f ∈ D_A iff f ∈ D_A ∩ M = D_{A^M},
and also
  (f|A^M f)_M = (f|Af)_H = ∫_R ξ dμ_f^{P^A} = ∫_R ξ dμ_f^{P^{A,M}}, ∀f ∈ D_{A^M}.
This proves that P^{A,M} is the projection valued measure of the self-adjoint operator A^M (cf. 15.2.1).
e: We assume conditions a and c. Then, for every t ∈ R, (U^A(t))^M is a linear operator in M and (cf. 16.1.7)
  (f|(U^A(t))^M f)_M = (f|U^A(t)f)_H = ∫_R ϕ_t dμ_f^{P^A} = ∫_R ϕ_t dμ_f^{P^{A,M}}, ∀f ∈ M.
By 16.1.7, this proves that (U^A(t))^M = U^{A^M}(t) for all t ∈ R.
In the next two theorems, the first statements generalize the content of 17.2.3b,d.
On the basis of these theorems, the study of the structure of a (closed) operator can
be carried out through the investigation of its reducing subspaces and its restrictions
to the intersections of its domain with them.
and suppose that an operator A in H is reduced by Mn for all n ∈ {1, ..., N }. Writing
A_n := A^{M_n} for all n ∈ {1, ..., N}, we have:
(a) D_A = {f ∈ H : P_n f ∈ D_{A_n}, ∀n ∈ {1, ..., N}},
  Af = Σ_{n=1}^N A_n P_n f, ∀f ∈ D_A;
(b) A is closed iff An is a closed operator in the Hilbert space Mn , ∀n ∈ {1, ..., N };
(c) A is adjointable iff An is an adjointable operator in the Hilbert space Mn ,
∀n ∈ {1, ..., N };
if these conditions hold true, then
  D_{A^†} = {f ∈ H : P_n f ∈ D_{A_n^†}, ∀n ∈ {1, ..., N}},
  A^† f = Σ_{n=1}^N A_n^† P_n f, ∀f ∈ D_{A^†},
where A_n^† denotes the adjoint of A_n in the Hilbert space M_n;
(d) A is symmetric iff An is symmetric, ∀n ∈ {1, ..., N };
A is self-adjoint iff An is self-adjoint, ∀n ∈ {1, ..., N };
A is essentially self-adjoint iff An is essentially self-adjoint, ∀n ∈ {1, ..., N };
(e) A ∈ B(H) iff An ∈ B(Mn ), ∀n ∈ {1, ..., N };
(f ) A ∈ U(H) iff An ∈ U(Mn ), ∀n ∈ {1, ..., N };
(g) A ∈ P(H) iff An ∈ P(Mn ), ∀n ∈ {1, ..., N }.
Moreover, if A is self-adjoint then:
(h) P^A(E) = Σ_{n=1}^N P^{A_n}(E) P_n, ∀E ∈ A(dR);
(i) U^A(t) = Σ_{n=1}^N U^{A_n}(t) P_n, ∀t ∈ R.
since DAn ⊂ DA and DA is a linear manifold. This proves the part of statement a
about D_A. Moreover, from 2 we have
  Af = Σ_{n=1}^N P_n Af = Σ_{n=1}^N A_n P_n f, ∀f ∈ D_A.
since D_{A_n^†} = D_{(A^†)^{M_n}} ⊂ D_{A^†} and D_{A^†} is a linear manifold. This proves the part of the statement about D_{A^†}. Moreover, from 4 we have
  A^† f = Σ_{n=1}^N P_n A^† f = Σ_{n=1}^N A_n^† P_n f, ∀f ∈ D_{A^†}.
d: The “only if” parts of statement d follow from 17.2.8. The “if” parts follow
from results a and c.
e: The “only if” part of statement e follows from 17.2.9c.
Now we assume An ∈ B(Mn ), for all n ∈ {1, ..., N }. From result a we have
D_A = H and also (cf. 4.2.5b)
  ‖Af‖^2 = Σ_{n=1}^N ‖A_n P_n f‖^2 ≤ max{‖A_n‖^2 : n ∈ {1, ..., N}} Σ_{n=1}^N ‖P_n f‖^2
    = max{‖A_n‖^2 : n ∈ {1, ..., N}} ‖f‖^2, ∀f ∈ H.
f: The “only if” part of statement f follows from 17.2.10.
Now we assume A_n ∈ U(M_n), for all n ∈ {1, ..., N}. From results a and c we have D_A = D_{A^†} = H, and also (in view of 12.5.1)
  AA^† f = Σ_{n=1}^N A_n P_n (Σ_{k=1}^N A_k^† P_k f) = Σ_{n=1}^N A_n A_n^† P_n f = Σ_{n=1}^N P_n f = f, ∀f ∈ H,
since D_{A_n} ⊂ D_A and D_A is a linear manifold, the first condition in 2 implies that
  Σ_{k=1}^n P_k f ∈ D_A and A(Σ_{k=1}^n P_k f) = Σ_{k=1}^n A_k P_k f, ∀n ∈ N;
now, the sequence {Σ_{k=1}^n P_k f} is convergent (cf. 13.2.8); moreover, the second condition in 2 implies that the series Σ_{n=1}^∞ AP_n f is convergent (cf. 10.4.7b), and hence that the sequence {A Σ_{k=1}^n P_k f} is convergent; since A is supposed to be closed, this implies that
  f = lim_{n→∞} Σ_{k=1}^n P_k f ∈ D_A.
The opposite inclusion follows from result a and 10.4.7a. This concludes the proof
of the “only if” part of statement b.
Now we suppose that A_n is closed, for all n ∈ N, and that
  D_A = {f ∈ H : P_n f ∈ D_{A_n}, ∀n ∈ N, and Σ_{n=1}^∞ ‖A_n P_n f‖^2 < ∞}.
e: The “only if” part of statement e follows from 17.2.9c and 4.4.3.
Now we assume that A is closed, An ∈ B(Mn ) for all n ∈ N, and
m := sup{kAn k : n ∈ N} < ∞.
Then we have
  Σ_{n=1}^∞ ‖A_n P_n f‖^2 ≤ m^2 Σ_{n=1}^∞ ‖P_n f‖^2 = m^2 ‖f‖^2, ∀f ∈ H,
and hence D_A = H, in view of result b. Moreover, in view of result a, we have (cf. 10.4.7a)
  ‖Af‖^2 = ‖Σ_{n=1}^∞ A_n P_n f‖^2 = Σ_{n=1}^∞ ‖A_n P_n f‖^2 ≤ m^2 ‖f‖^2, ∀f ∈ H.
Thus, A ∈ B(H).
f, g, h, i: The proofs of these statements are analogous to those of statements f,
g, h, i of 17.2.14, on the basis of results a, b, c.
To conclude this section, we present an example which shows that an invariant subspace may fail to be a reducing subspace even when its orthogonal complement is invariant as well.
since the equivalence class PM1 [u0 ] does not contain any continuous function (cf.
11.2.2c).
We note that not even the self-adjoint operator A0 (cf. 12.4.25) is reduced by
the subspace M1 . Indeed, if A0 were reduced by M1 then we should have
  P_{M_1} A_0 ⊂ A_0 P_{M_1},
and hence
  P_{M_1}[u_0] = P_{M_1} P^{A_0}({0})[u_0] = P^{A_0}({0}) P_{M_1}[u_0] = ([u_0]|P_{M_1}[u_0]) [u_0],
which cannot be true since PM1 [u0 ] does not contain any continuous function.
17.3 Irreducibility
We see that:
  D_A = H = {f ∈ H : ∫_R ξ^2 dμ_f^P < ∞},
  (f|Af) = (f|λ1_H f) = λ‖f‖^2 = ∫_R ξ dμ_f^P, ∀f ∈ H.
This proves that P^A = P (cf. 15.2.1), and hence that condition a is true.
17.3.3 Theorem. Let {Ai }i∈I be a set of self-adjoint operators in H. The following
conditions are equivalent:
(a) the set {Ai }i∈I is irreducible;
(b) if A is a self-adjoint operator in H which commutes with Ai (in the sense defined
in 17.1.5) for all i ∈ I, then there exists λ ∈ R so that A = λ1H .
Proof. a ⇒ b: Let A be a self-adjoint operator in H which commutes with A_i for all i ∈ I. Then,
  P^A(E) A_i ⊂ A_i P^A(E), ∀i ∈ I, ∀E ∈ A(dR)
(cf. 17.1.7b with B := A_i), and hence, if we assume condition a,
  P^A(E) ∈ {O_H, 1_H}, ∀E ∈ A(dR).
By 17.3.2, this implies that there exists λ ∈ R so that A = λ1_H.
b ⇒ a: Let P ∈ P(H) be such that
P Ai ⊂ Ai P, ∀i ∈ I.
Then P commutes with Ai (in the sense defined in 17.1.5) for all i ∈ I, by 17.1.4,
and hence, if we assume condition b,
∃λ ∈ R so that P = λ1H .
Since P is a projection, from P = P 2 (cf. 13.1.5) we have λ = λ2 , and hence
P ∈ {OH , 1H }.
17.3.5 Corollary (Schur's lemma). Let {A_i}_{i∈I} be an irreducible set of elements
of B(H) such that
  ∀i ∈ I, ∃j ∈ I so that A_i^† = A_j.
Then,
[B ∈ B(H) and BAi = Ai B, ∀i ∈ I] ⇒ [∃α ∈ C so that B = α1H ].
Chapter 18
Statistical operators were devised by John von Neumann in order to represent the
most general statistical ensembles of a given quantum system (cf. Neumann, 1932,
Chapter IV). In this representation, those particular ensembles which von Neu-
mann denoted as homogeneous (and which we call pure states in Chapter 19) are
represented by one-dimensional projections, which are a special case of statistical
operators.
In this chapter we study statistical operators. Before that, we need to study the
polar decomposition for elements of B(H) and a subset of B(H) which is called the
trace class. As usual, H denotes an abstract Hilbert space throughout the chapter.
In this section we find a decomposition for elements of B(H) which is the general-
ization of the decomposition z = |z| exp(i arg z) for a complex number z. First, we
must find the right analogues of a positive number and of the absolute value |z| of a complex number z.
18.1.2 Remarks.
(a) If an operator A ∈ B(H) is positive then (f |Af ) ∈ R for all f ∈ H, and hence
A is self-adjoint (cf. 12.4.3).
(b) If an operator A ∈ B(H) is positive then there exists a unique positive operator B ∈ B(H) such that A = B^2 (cf. 15.3.9). The operator B will be denoted by the symbol A^{1/2}.
If A ∈ B(H) is positive and T ∈ B(H) is such that [T, A] = O_H, then [T, A^{1/2}] = O_H (cf. 15.3.9).
(c) If an operator A ∈ B(H) is positive then the operator U AU −1 is a positive
Proof. a: We have
  ‖|A|f‖^2 = (f||A|^2 f) = (f|A^† A f) = ‖Af‖^2, ∀f ∈ H.
The definitions given in 18.1.1 and in 18.1.3 are generalizations from C to B(H).
Indeed, if H is a one-dimensional Hilbert space then every complex number can
be identified with an element of B(H) (cf. 12.6.6a). In this identification, positive
numbers are identified with positive operators; moreover, if Aα is the operator
that corresponds to the complex number α, then |Aα | corresponds to |α|. In the
decomposition z = |z| exp(i arg z) for a complex number z, the number exp(i arg z)
is an element of T and hence it can be identified with an element of U(H) (cf.
12.6.6a). However, in order to obtain the decomposition for elements of B(H) we
are after, the right generalization of T is wider than U(H) when H is not a one-
dimensional Hilbert space.
18.1.5 Definitions. An operator U ∈ B(H) is called an isometry if ‖Uf‖ = ‖f‖ for all f ∈ H, while it is more generally called a partial isometry, or it is said to be partially isometric, if ‖Uf‖ = ‖f‖ for all f ∈ N_U^⊥.
If U ∈ B(H) is partially isometric then I(U) := N_U^⊥ is called the initial subspace of U (I(U) is actually a subspace of H by 10.2.13).
Each element of U(H) is obviously an isometry. Each element of P(H) is a
partial isometry in view of 13.1.3b,c and 10.4.4a.
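As a finite-dimensional sketch of the definition (our own numerical illustration, using numpy), a one-dimensional orthogonal projection is a partial isometry: it preserves norms on N_U^⊥ but not on all of H.

```python
import numpy as np

# A rank-one orthogonal projection P on C^2 (P f = (u|f) u, ||u|| = 1).
u = np.array([1.0, 1.0]) / np.sqrt(2.0)
P = np.outer(u, u.conj())

# On N_P^perp = span{u}, P preserves norms ...
f = 3.0 * u
assert np.isclose(np.linalg.norm(P @ f), np.linalg.norm(f))

# ... but on a generic vector it does not, so P is a partial isometry
# that is not an isometry.
g = np.array([1.0, 0.0])
assert np.linalg.norm(P @ g) < np.linalg.norm(g)
```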
c: First we point out that I(U ) and F (U ) can be considered Hilbert spaces
since they are subspaces of H (cf. 10.3.2). Next we notice that the linear operator
UI(U) is surjective onto RU since U f = U PI(U) f for all f ∈ H (cf. the preliminary
remark). Then, statement c holds true in view of 10.1.20.
d: We have
  (f|U^† U f) = ‖Uf‖^2 = ‖U P_{I(U)} f‖^2 = ‖P_{I(U)} f‖^2 = (f|P_{I(U)} f), ∀f ∈ H.
is a unitary operator from the Hilbert space I(U^†) onto the Hilbert space F(U^†). Now,
  I(U^†) = N_{U^†}^⊥ = \overline{R_U} = F(U).
Moreover, from
  N_U = N_{U^{††}} = R_{U^†}^⊥
(cf. 12.1.6b and 12.1.7) we have
  F(U^†) = \overline{R_{U^†}} = R_{U^†} = N_U^⊥ = I(U)
by 10.4.4a since R_{U^†} is a subspace of H, in view of statements e and b (written with U^† in place of U).
18.1.7 Theorem. Let A ∈ B(H). Then there exists a unique partially isometric operator U ∈ B(H) such that
  A = U|A| and N_U = N_A.
Moreover,
  R_U = \overline{R_A} and |A| = U^† A.
The equality A = U|A| is called the polar decomposition of A.
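For matrices, the theorem can be checked numerically through the singular value decomposition; the following sketch is our construction (not the book's proof): it produces |A| and the factor U and verifies A = U|A| and |A| = U^†A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# SVD: A = W diag(S) Vh with S >= 0.
W, S, Vh = np.linalg.svd(A)

# |A| = (A^+ A)^{1/2} = Vh^+ diag(S) Vh, and U := W Vh (unitary here,
# since a generic A is invertible; in general U is a partial isometry).
absA = Vh.conj().T @ np.diag(S) @ Vh
U = W @ Vh

assert np.allclose(U @ absA, A)           # A = U |A|
assert np.allclose(U.conj().T @ A, absA)  # |A| = U^+ A
```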
hence such that h = lim_{n→∞} V g_n = Ṽ g; thus, h ∈ R_{Ṽ}. This proves the inclusion \overline{R_V} ⊂ R_{Ṽ}, and hence (in view of 3) the equalities
  R_{Ṽ} = \overline{R_V} = \overline{R_A}.   (4)
Now we set M := \overline{R_{|A|}} and define the operator
  U := Ṽ P_M,
which is an element of B(H) (note that D_{Ṽ} = M). In what follows we prove that U satisfies the conditions of the statement.
From the definition of U and from 4 we have
  R_U = R_{Ṽ} = \overline{R_A}.
From the definitions of U, Ṽ , V , from 4 and from 13.1.3c we have
U |A|g = Ṽ |A|g = V |A|g = Ag, ∀g ∈ H,
i.e. A = U |A|.
Moreover, we have
  N_U (5)= N_{P_M} (6)= (\overline{R_{|A|}})^⊥ (7)= R_{|A|}^⊥ (8)= N_{|A|} (9)= N_A,   (10)
where 5 follows from 2, 6 from 13.1.3b, 7 from 10.2.11, 8 from 12.1.7 (since |A| is self-adjoint), 9 from 18.1.4c.
Furthermore, from 10 and from 10.4.4c we have
  N_U^⊥ = R_{|A|}^{⊥⊥} = \overline{R_{|A|}} = M,   (11)
and hence, in view of the definition of U and of 13.1.3c,
  U f = Ṽ f, ∀f ∈ N_U^⊥,
and hence, in view of 2,
  ‖Uf‖ = ‖f‖, ∀f ∈ N_U^⊥.
Thus, the operator U is partially isometric.
Finally, from U |A| = A we have
U † U |A| = U † A;
now, from 18.1.6d and 11, we have U † U = PM and hence (in view of 13.1.3c)
|A| = U † A.
Uniqueness: Suppose that T is a partially isometric element of B(H) such that
A = T |A| and NT = NA .
Let f ∈ R|A| and let g ∈ H be so that f = |A|g; then,
U f = U |A|g = Ag = T |A|g = T f.
18.1.9 Remark. The analogy between the symbols |A| for A ∈ B(H) and |z| for
z ∈ C must not induce the reader to expect other properties for |A| than the ones
discussed above. To see this, for any u, v ∈ H̃ (cf. 10.9.4) we define the mapping
  A_{u,v} : H → H
  f ↦ A_{u,v} f := (u|f) v.
We notice that, if u = v, then A_{u,v} = A_u (the one-dimensional projection defined in 13.1.12). It is obvious that A_{u,v} ∈ B(H) (use the Schwarz inequality, cf. 10.1.9).
Moreover, the equation
  (A_{u,v} f|g) = \overline{(u|f)} (v|g) = (f|A_{v,u} g), ∀f, g ∈ H,
proves that A_{u,v}^† = A_{v,u} (cf. 12.1.3B). Then, the equation
  A_{u,v}^† A_{u,v} f = (u|f) u = A_u f, ∀f ∈ H,
proves that A_{u,v}^† A_{u,v} = A_u, and hence (since A_u is positive and A_u^2 = A_u) that
  |A_{u,v}| = A_u.
By the same token, we also have
  |A_{u,v}^†| = |A_{v,u}| = A_v.
Moreover, from
  A_{v,u} A_{u,v} = A_u
we have
  |A_{v,u} A_{u,v}| = A_u,
while
  |A_{v,u}| |A_{u,v}| = A_v A_u.
Thus, the properties
  |z̄| = |z|, ∀z ∈ C, and |zw| = |z||w|, ∀z, w ∈ C,
have no counterparts in B(H).
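A small numerical sketch of the remark (ours, with numpy): for orthogonal unit vectors u, v, the operators A_{u,v} show that |A^†| ≠ |A| and |BC| ≠ |B||C| in general.

```python
import numpy as np

def absval(A):
    """|A| = (A^+ A)^{1/2} via diagonalization of the positive A^+ A."""
    w, V = np.linalg.eigh(A.conj().T @ A)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

# A_{u,v} f = (u|f) v, i.e. the outer product v u^+, for orthonormal u, v.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
A_uv = np.outer(v, u.conj())
A_vu = np.outer(u, v.conj())  # this is A_{u,v}^+

# |A_{u,v}| = A_u (projection onto u), while |A_{u,v}^+| = A_v.
assert np.allclose(absval(A_uv), np.outer(u, u.conj()))
assert not np.allclose(absval(A_uv.conj().T), absval(A_uv))

# |A_{v,u} A_{u,v}| = A_u, but |A_{v,u}| |A_{u,v}| = A_v A_u = 0 here.
assert np.allclose(absval(A_vu @ A_uv), np.outer(u, u.conj()))
assert np.allclose(absval(A_vu) @ absval(A_uv), np.zeros((2, 2)))
```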
In this and in the next section, the Hilbert space H is assumed to be separable. All
definitions, statements and proofs are written on the hypothesis that the orthogonal
dimension of H is denumerable. If the orthogonal dimension of H were finite then all the arguments presented would be simplified in an obvious way and some conditions would become trivial.
18.2.1 Theorem. Let A be a positive element of B(H) and let {un }n∈N be a c.o.n.s.
in H.
If {v_n}_{n∈N} is another c.o.n.s. in H then
  Σ_{n=1}^∞ (u_n|Au_n) = Σ_{n=1}^∞ (v_n|Av_n)
tr(A + B) = tr A + tr B;
tr(aA) = a tr A;
(c) if B ∈ B(H) is such that (f |Af ) ≤ (f |Bf ) for all f ∈ H, then B is positive and
tr A ≤ tr B;
tr(U AU −1 ) = tr A;
tr(V AV † ) ≤ tr A;
tr(V AV † ) = tr A;
  (f|V A V^† f) = (V^† f|A V^† f) ≥ 0, ∀f ∈ H.
  = Σ_{i∈I} (u_i|V A V^† u_i),   (2)
since R_V^⊥ = N_{V^†} (cf. 12.1.7). We point out that 1 holds true by an easy corollary to 5.4.7 (in 5.4.7, take a_{n,m} := 0 for n > 2). Now, the restriction of V^† to R_V is a unitary operator from the Hilbert space R_V onto the Hilbert space N_V^⊥ (cf. 18.1.6f), and hence {V^† u_i}_{i∈I} is a c.o.n.s. in the Hilbert space N_V^⊥ (cf. 10.6.5c and 10.6.8b), and hence it is an o.n.s. in H which is complete in the subspace N_V^⊥ (cf. 10.6.5c). If V is an isometry then N_V = {0_H}, and hence {V^† u_i}_{i∈I} is a c.o.n.s. in H, and hence (in view of 2)
  tr A = Σ_{i∈I} (V^† u_i|A V^† u_i) = Σ_{i∈I} (u_i|V A V^† u_i) = tr(V A V^†).
f: Let {vi }i∈I be an o.n.s. in H and let {un }n∈N be a c.o.n.s. in H which
contains {v_i}_{i∈I} (cf. 10.7.3). Then,
  Σ_{i∈I} ‖Av_i‖^2 ≤ Σ_{n=1}^∞ ‖Au_n‖^2 = Σ_{n=1}^∞ (Σ_{k=1}^∞ |(u_k|Au_n)|^2)
    (3)≤ Σ_{n=1}^∞ [Σ_{k=1}^∞ (u_k|Au_k)(u_n|Au_n)]
    (4)= Σ_{n=1}^∞ [(Σ_{k=1}^∞ (u_k|Au_k))(u_n|Au_n)]
    (5)= (Σ_{k=1}^∞ (u_k|Au_k))(Σ_{n=1}^∞ (u_n|Au_n)) = (tr A)(tr A),
where 4 and 5 hold true by 5.4.5 and 3 by 5.4.2a since, for all k, n ∈ N,
  |(u_k|Au_n)|^2 = |(A^{1/2} u_k|A^{1/2} u_n)|^2 ≤ ‖A^{1/2} u_k‖^2 ‖A^{1/2} u_n‖^2 = (u_k|Au_k)(u_n|Au_n).
  = tr(U^† V |A| V^† U) ≤ tr(V |A| V^†) ≤ tr |A|,
in view of 18.2.2e (once for V and once for U^†, which is partially isometric by 18.1.6e). Moreover,
  Σ_{n=1}^∞ ‖|A|^{1/2} u_n‖^2 = Σ_{n=1}^∞ (u_n||A|u_n) = tr |A|.
(cf. 18.1.4a), and hence that ‖A‖ ≤ tr |A|, which is statement d. From this we obviously have
  A = O_H if tr |A| = 0,
which is statement c.
e: Let A ∈ T (H) and let U be a partially isometric element of B(H) such that
A = U |A| and NU = NA
(cf. 18.1.7). Then A† = |A|U † (cf. 12.6.4) and hence
  A^† = U^† U |A| U^†
by 13.1.3c, since U^† U is the orthogonal projection onto the subspace
  N_U^⊥ = N_A^⊥ = N_{|A|}^⊥
(cf. 18.1.6d and 18.1.4c) and since the equality N_{|A|} = R_{|A|}^⊥ (cf. 12.1.7) implies the inclusion
  N_{|A|}^⊥ ⊃ R_{|A|}
(cf. 10.2.10d). Thus,
  (A^†)^† A^† = A A^† = U|A|U^† U|A|U^† = (U|A|U^†)^2.
In view of 18.2.2e, the operator U|A|U^† is positive. Therefore,
  U|A|U^† = |A^†|.
Then, by 18.2.2e once more,
tr |A† | = tr(U |A|U † ) ≤ tr |A|.
This proves that A† ∈ T (H).
Now, since in the reasoning above A was an arbitrary element of T (H), we can
replace A with A† and obtain
tr |A| = tr |(A† )† | ≤ tr |A† |,
and hence
tr |A† | = tr |A|.
f: If U ∈ UA(H) and A ∈ B(H), then |U AU −1 | = U |A|U −1 (cf. 18.1.4d), and
hence
tr |U AU −1 | = tr(U |A|U −1 ) = tr |A|
(cf. 18.2.2d). Therefore, if A ∈ T (H) then U AU −1 ∈ T (H).
(cf. 4.2.5b), and hence we can define the operator (1_H − B^2)^{1/2}. We have
  (B ± i(1_H − B^2)^{1/2})^† = B ∓ i(1_H − B^2)^{1/2}
and hence
  (B ± i(1_H − B^2)^{1/2})^† (B ± i(1_H − B^2)^{1/2}) = B^2 ∓ i(1_H − B^2)^{1/2} B ± iB(1_H − B^2)^{1/2} + 1_H − B^2 = 1_H,
since [B, (1_H − B^2)^{1/2}] = O_H (cf. 18.1.2b). Similarly, we have
  (B ± i(1_H − B^2)^{1/2})(B ± i(1_H − B^2)^{1/2})^† = 1_H.
In view of 12.5.1, this proves that B + i(1_H − B^2)^{1/2} and B − i(1_H − B^2)^{1/2} are unitary operators. Moreover,
  (1/2)(B + i(1_H − B^2)^{1/2}) + (1/2)(B − i(1_H − B^2)^{1/2}) = B.
Thus, there exist V_1, V_2 ∈ U(H) so that B = (1/2)(V_1 + V_2).
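The construction just given can be checked numerically; in the sketch below (our illustration, with values chosen only for concreteness) B is a 2×2 self-adjoint matrix with ‖B‖ ≤ 1 and V_1, V_2 are built as B ± i(1_H − B^2)^{1/2}.

```python
import numpy as np

# A self-adjoint B with ||B|| <= 1 (values chosen only for illustration).
B = np.array([[0.3, 0.2], [0.2, -0.1]])

# (1_H - B^2)^{1/2} via the spectral theorem for the positive 1 - B^2.
w, V = np.linalg.eigh(np.eye(2) - B @ B)
root = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

V1 = B + 1j * root
V2 = B - 1j * root

# V1 and V2 are unitary, and their mean is B.
assert np.allclose(V1.conj().T @ V1, np.eye(2))
assert np.allclose(V2 @ V2.conj().T, np.eye(2))
assert np.allclose((V1 + V2) / 2.0, B)
```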
Next we notice that, for all A ∈ B(H) − {O_H}:
  A = ‖A‖ [ (1/2)‖A‖^{−1}(A + A^†) − i (1/2)‖A‖^{−1} i(A − A^†) ];
  (1/2)‖A‖^{−1}(A + A^†) and (1/2)‖A‖^{−1} i(A − A^†) are self-adjoint;
  ‖(1/2)‖A‖^{−1}(A + A^†)‖ ≤ 1.
18.2.7 Theorem. Suppose that A ∈ T (H) and B ∈ B(H). Then BA ∈ T (H) and
AB ∈ T (H).
Proof. Let U ∈ U(H). The operator U −1 |A|U is positive (cf. 18.1.2c) and
(U −1 |A|U )2 = U −1 |A|2 U = U −1 A† AU = (AU )† (AU )
(cf. 12.6.4 and 12.5.1b). Therefore,
U −1 |A|U = |AU | and hence tr |AU | = tr |A| < ∞
(cf. 18.2.2d). Moreover,
|A|2 = A† A = A† U † U A = (U A)† (U A)
(cf. 12.5.1c) proves that
|A| = |U A| and hence tr |U A| = tr |A| < ∞.
Since U was an arbitrary element of U(H), this proves that
AU, U A ∈ T (H), ∀U ∈ U(H),
and this proves the statement, in view of 18.2.6 and 18.2.4a,b.
If I = N, the first series is convergent with respect to the norm for B(H) defined in 4.2.11a. The one-dimensional projection A_{u_n} is defined as in 13.1.12.
Proof. We set
  E_k := ( (1/(k+1))‖A‖, (1/k)‖A‖ ] and P_k := P^A(E_k), ∀k ∈ N.
Since σ(A) ⊂ [0, ‖A‖] (cf. 15.3.9 and 4.5.10), we have
  P^A({0})f + Σ_{k=1}^∞ P_k f = P^A(σ(A))f = f, ∀f ∈ H   (1)
(cf. 15.2.2d).
For all k ∈ N, the subspace M_k := R_{P_k} is finite-dimensional. Indeed, if we fix k ∈ N then for all f ∈ M_k we have
  μ_f^{P^A}(R − E_k) = ‖P^A(R − E_k)f‖^2 = ‖P^A(R − E_k)P^A(E_k)f‖^2 = 0
(cf. 15.2.2e and 8.1.11a). In view of 18.2.2f, this proves that each o.n.s. contained in M_k must be finite, and hence that the orthogonal dimension of M_k is finite.
For all k ∈ N, we have
[A, Pk ] = OH
(cf. 15.2.1B). Hence, the operator A is reduced by the subspace Mk (cf. 17.2.4)
and Ak := AMk is a self-adjoint operator in the Hilbert space Mk (cf. 17.2.8).
Therefore, if Mk 6= {0H } then there exists an o.n.s. {vk,i }i∈Ik which is complete
in the subspace Mk and whose elements are eigenvectors of Ak (cf. 15.3.4C and
10.6.5c), i.e. so that
For each i ∈ Ik , it is obvious that µk,i is an eigenvalue of A; then, µk,i ∈ [0, ∞); more-
over, Pk P A ({0}) = OH (cf. 13.3.2b) implies vk,i ∈ NA⊥ by 13.2.9 (since P A ({0}) is
the orthogonal projection onto NA , cf. 15.2.5e), and hence µk,i ∈ (0, ∞). Further,
we have
  AP_k f = A_k P_k f = Σ_{i∈I_k} (v_{k,i}|A_k P_k f)_{M_k} v_{k,i} = Σ_{i∈I_k} (P_k A v_{k,i}|f)_H v_{k,i} = Σ_{i∈I_k} μ_{k,i} (v_{k,i}|f) v_{k,i}, ∀f ∈ H   (2)
(cf. 10.6.4b). Letting J := {k ∈ N : M_k ≠ {0_H}}, from 1 and 2 and from the continuity of A we infer that
  Af = AP^A({0})f + Σ_{k=1}^∞ AP_k f = Σ_{k∈J} (Σ_{i∈I_k} μ_{k,i} (v_{k,i}|f) v_{k,i}), ∀f ∈ H.   (3)
Now let I := {1, ..., N} or I := N be so that there is a bijection from I onto the set ⋃_{k∈J} I_k, and for each n ∈ I let
Then {u_n}_{n∈I} is obviously an o.n.s. in H (since M_k ⊂ M_h^⊥ if k ≠ h), {λ_n}_{n∈I} is a family of elements of (0, ∞), and 3 can be written as
  Af = Σ_{n∈I} λ_n (u_n|f) u_n, ∀f ∈ H,   (4)
in view of 10.4.10 (note that every series which may appear in 3 is convergent in view of 13.2.8 and 10.6.1).
Now let {wj }j∈N be a c.o.n.s. in H which contains {un }n∈I (cf. 10.7.3). Then,
18.2.9 Corollary. Let A ∈ T(H) and suppose that A ≠ OH. Then there exist two
orthonormal systems {un}n∈I and {vn}n∈I (with I := {1, ..., N} or I := N) in H
and a family {λn}n∈I of elements of (0, ∞) (not necessarily different from each
other) so that (denoting by Σ_{n∈I} either Σ_{n=1}^N or Σ_{n=1}^∞)
$$A = \sum_{n \in I} \lambda_n A_{u_n,v_n}, \qquad |A| = \sum_{n \in I} \lambda_n A_{u_n}, \qquad \operatorname{tr}|A| = \sum_{n \in I} \lambda_n.$$
If I = N, the first two series are convergent with respect to the norm for B(H)
defined in 4.2.11a. The operator Aun ,vn is defined as in 18.1.9.
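In finite dimension the data of the corollary can be read off a singular value decomposition: the singular values are the λn and the singular vectors supply {un} and {vn}. The following Python sketch is a numerical illustration only, with an arbitrarily chosen matrix standing in for A (in finite dimension every operator is trace class); it checks the three identities of the statement, with A_{u,v} the rank-one operator f ↦ (u|f)v of 18.1.9.

```python
import numpy as np

# An arbitrary invertible 3x3 matrix standing in for A.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 2.0]])

# SVD: A = V diag(lam) U^dagger, i.e. A f = sum_n lam_n (u_n|f) v_n,
# so A = sum_n lam_n A_{u_n,v_n} with A_{u,v} f := (u|f) v.
V, lam, Uh = np.linalg.svd(A)
U = Uh.conj().T                       # columns of U are the u_n

A_rebuilt = sum(l * np.outer(V[:, k], U[:, k].conj()) for k, l in enumerate(lam))
assert np.allclose(A, A_rebuilt)      # A = sum_n lam_n A_{u_n,v_n}

absA = U @ np.diag(lam) @ U.conj().T  # |A| = sqrt(A^dagger A) = sum_n lam_n A_{u_n}
assert np.allclose(absA @ absA, A.T @ A)
assert np.isclose(np.trace(absA), lam.sum())   # tr|A| = sum_n lam_n
```

The polar decomposition A = U_pol|A| used in the proof corresponds to U_pol = V U† here, so that vn = U_pol un.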
(cf. 18.2.8 with |A| in place of A). If I = N, the first series is convergent with
respect to the norm for B(H) defined in 4.2.11a. Since N_{|A|} = NA (cf. 18.1.4c) and
since un ∈ N_{|A|}^⊥ (note that |A|un = λn un and then use 12.4.20B), we have
$$u_n \in N_U^{\perp}, \quad \forall n \in I.$$
Now, the restriction of the operator U to the subspace N_U^⊥ is a unitary operator
from the Hilbert space N_U^⊥ onto the Hilbert space RU (cf. 18.1.6c). Thus, if we
set vn := U un for all n ∈ I, {vn}n∈I is an o.n.s. in H (cf. 10.6.5c and 10.6.8a).
Moreover,
$$U A_{u_n} = A_{u_n,v_n}, \quad \forall n \in I,$$
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 588
18.2.10 Theorem. Let A ∈ T(H) and let {vn}n∈N be a c.o.n.s. in H. Then the
series Σ_{n=1}^∞ (vn|Avn) is absolutely convergent and hence it is convergent. The sum
of this series is independent of the c.o.n.s. {vn}n∈N in H chosen to compute it, and
it is called the trace of A and denoted by tr A. Thus,
$$\operatorname{tr} A := \sum_{n=1}^{\infty} (w_n|Aw_n)$$
for whichever c.o.n.s. {wn}n∈N in H. It is obvious that, if A is positive, this
definition agrees with the one given in 18.2.1.
The following inequalities hold true:
(a) |tr BA| ≤ ‖B‖ tr |A|, ∀B ∈ B(H);
(b) tr |BA| ≤ ‖B‖ tr |A|, ∀B ∈ B(H);
(c) |tr A| ≤ tr |A|.
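The basis-independence of the trace and inequality (c) are easy to test numerically; the sketch below is illustrative only, with a randomly generated matrix standing in for A and a second c.o.n.s. obtained from a QR factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Two c.o.n.s. of C^4: the standard basis and the columns of a random unitary Q.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

tr_standard = sum(A[k, k] for k in range(n))                      # sum (e_k|A e_k)
tr_rotated = sum(Q[:, k].conj() @ A @ Q[:, k] for k in range(n))  # sum (q_k|A q_k)
assert np.allclose(tr_standard, tr_rotated)

# Inequality (c): |tr A| <= tr|A|, the sum of the singular values of A.
assert abs(np.trace(A)) <= np.linalg.svd(A, compute_uv=False).sum() + 1e-12
```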
(1 holds true by the Schwarz inequality in ℓ², cf. 10.2.8b and 10.3.8d; 2 holds true
by 10.6.4d with M := H; 3 holds true by 18.1.6a and 4.2.5b). Then we have
$$\sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \bigr| \Bigr) \le \sum_{n \in I} \lambda_n = \operatorname{tr}|A| < \infty, \tag{4}$$
and hence
$$\begin{aligned}
\sum_{k=1}^{\infty} |(v_k|Av_k)| &= \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| |A| v_k\bigr) \bigr| = \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| P |A| v_k\bigr) \bigr| \\
&= \sum_{k=1}^{\infty} \Bigl| \sum_{n \in I} \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \Bigr| \\
&\overset{(5)}{\le} \sum_{k=1}^{\infty} \sum_{n \in I} \bigl| \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \bigr| \\
&\overset{(6)}{=} \sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl| \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \bigr| \Bigr) < \infty
\end{aligned}$$
(5 holds true by 5.4.2a, and by 5.4.10 if I = N; 6 holds true by 5.4.6 if I = {1, ..., N}
or by 5.4.7 if I = N). Thus, the series Σ_{k=1}^∞ (vk|Avk) is absolutely convergent and
hence it is convergent by 4.1.8b. Moreover, we have
$$\begin{aligned}
\sum_{k=1}^{\infty} (v_k|Av_k) &= \sum_{k=1}^{\infty} \bigl(U^{\dagger} v_k \big| |A| v_k\bigr) = \sum_{k=1}^{\infty} \bigl(U^{\dagger} v_k \big| P |A| v_k\bigr) \\
&= \sum_{k=1}^{\infty} \Bigl( \sum_{n \in I} \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \Bigr) \\
&\overset{(7)}{=} \sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl(U^{\dagger} v_k \big| u_n\bigr) \bigl(u_n \big| |A| v_k\bigr) \Bigr) \\
&\overset{(8)}{=} \sum_{n \in I} \Bigl( \sum_{k=1}^{\infty} \bigl(|A| u_n \big| v_k\bigr) \bigl(v_k \big| U u_n\bigr) \Bigr) = \sum_{n \in I} \bigl(|A| u_n \big| U u_n\bigr)
\end{aligned}$$
(7 holds true by 8.4.14b, since 4 proves that the conditions in 8.4.14a are satisfied;
8 holds true by 10.6.4c with M := H). Since the last term of this equation
is independent of the choice of the c.o.n.s. {vn}n∈N, this proves that the sum of
the series Σ_{k=1}^∞ (vk|Avk) is independent as well (note that, if I = N, the series
Σ_{n=1}^∞ (|A|un|U un) is shown to be convergent by the way the last term of the
equation above has been obtained).
Now we prove the inequalities of the statement.
a: Let {wj}j∈N be a c.o.n.s. in H which contains the o.n.s. {un}n∈I (cf. 10.7.3).
If j ∈ N is such that wj ∉ {un}n∈I, then
$$(w_j|u_n) = 0, \quad \forall n \in I,$$
and hence
$$w_j \in N_{|A|} = N_A$$
$$|BA| = V^{\dagger} B A$$
Proof. a: For the function tr, property lo₁ of 3.2.1 is obvious and properties lo₂,
lo₃ follow directly from the property ip₁ of an inner product and from the continuity
of sum and product in C. The continuity of tr follows from 18.2.10c and from 4.2.2.
b: This follows directly from 12.1.3A, from property ip₂ of an inner product,
and from the continuity of complex conjugation.
where 1 holds true by 12.5.1c and 2 because {U vn}n∈N is a c.o.n.s. in H (cf. 10.6.8b).
In view of 18.2.6 and property a, this proves that
$$\operatorname{tr}(AB) = \operatorname{tr}(BA), \quad \forall B \in B(H).$$
d: This follows immediately from property c.
e: Let A ∈ T(H) and V ∈ A(H). If {vn}n∈N is a c.o.n.s. in H then
$$\operatorname{tr}(V A V^{-1}) = \sum_{n=1}^{\infty} \bigl(v_n \big| V A V^{-1} v_n\bigr) = \sum_{n=1}^{\infty} \bigl(A V^{-1} v_n \big| V^{-1} v_n\bigr) = \operatorname{tr} A,$$
since {V⁻¹vn}n∈N is a c.o.n.s. in H (cf. 10.6.8b).
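These trace identities can be tested numerically in finite dimension; the sketch below is illustrative only, with randomly generated matrices, and it takes V unitary (the text's property e concerns V ∈ A(H); the unitary case already follows from the cyclic property tr(AB) = tr(BA)).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Cyclic property: tr(AB) = tr(BA).
assert np.allclose(np.trace(A @ B), np.trace(B @ A))

# Invariance under conjugation by a unitary V (from a QR factorization).
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
assert np.allclose(np.trace(V @ A @ np.linalg.inv(V)), np.trace(A))
```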
for each A ∈ B(H) and for each o.n.s. {un }n∈I in H which is complete in the
subspace RP ;
(d) if P ∈ T (H) then
0 ≤ tr(P A) ≤ tr A
for each positive element A of B(H).
Proof. In what follows, let {un }n∈I be a c.o.n.s. in H which is complete in the
subspace RP (cf. 10.7.2) and let {wj }j∈N be a c.o.n.s. in H which contains {un }n∈I
(cf. 10.7.3). Then,
wj ∈ NP if j ∈ N is such that wj ∉ {un}n∈I
(cf. 13.1.10).
a and b: We notice that P is positive, in view of 13.1.7c. Then we have
$$\operatorname{tr}|P| = \operatorname{tr} P = \sum_{j=1}^{\infty} (w_j|P w_j) = \sum_{n \in I} (u_n|P u_n) = \sum_{n \in I} (u_n|u_n)$$
18.2.14 Theorem. The normed space (T (H), ν1 ) (i.e. the linear space T (H) with
the norm ν1 , cf. 18.2.5) is a Banach space.
18.2.15 Theorem. Let {un }n∈I and {vn }n∈I be families of elements of H̃ (cf.
10.9.4) and let {λn }n∈I be a family of elements of C, with I := {1, ..., N } or I := N.
If I = N, suppose that Σ_{n=1}^∞ |λn| < ∞.
If I = {1, ..., N} then the operator defined by
$$A := \sum_{n=1}^{N} \lambda_n A_{u_n,v_n}$$
is an element of T (H).
If I = N then the series Σ_{n=1}^∞ λn A_{un,vn} is convergent in the normed space
(T(H), ν₁) (and hence also in the normed space B(H) with respect to the norm for
B(H) defined in 4.2.11a) and therefore the operator defined by
$$A := \sum_{n=1}^{\infty} \lambda_n A_{u_n,v_n}$$
is an element of T(H).
In both cases we have (denoting by Σ_{n∈I} either Σ_{n=1}^N or Σ_{n=1}^∞)
$$\operatorname{tr}(AB) = \operatorname{tr}(BA) = \sum_{n \in I} \lambda_n (u_n|Bv_n), \quad \forall B \in B(H),$$
Proof. First we recall that, for u, v ∈ H̃, we have |A_{u,v}| = A_u (cf. 18.1.9), and
hence tr |A_{u,v}| = 1 (cf. 18.2.12b), and hence A_{u,v} ∈ T(H). Moreover, if {wn}n∈N is
a c.o.n.s. in H which contains {u} (cf. 10.7.3), then we have
$$\operatorname{tr}(B A_{u,v}) = \sum_{n=1}^{\infty} (w_n|B A_{u,v} w_n) = (u|Bv), \quad \forall B \in B(H).$$
Since T (H) is a linear manifold in B(H) and since the function tr is a linear func-
tional (cf. 18.2.11a), this proves the whole statement for I = {1, ..., N }.
Now we suppose I = N. We notice that, in the normed space (T(H), ν₁), the
series Σ_{n=1}^∞ λn A_{un,vn} is absolutely convergent since
$$\nu_1(\lambda_n A_{u_n,v_n}) = |\lambda_n| \operatorname{tr}|A_{u_n,v_n}| = |\lambda_n|, \quad \forall n \in \mathbb{N}.$$
Then, in view of 18.2.14 and 4.1.8b, the series Σ_{n=1}^∞ λn A_{un,vn} is convergent in the
normed space (T(H), ν₁), and hence also in the normed space B(H) with respect to
the norm for B(H) defined in 4.2.11a (cf. 18.2.5a). For all B ∈ B(H), we have (cf.
18.2.10b)
$$\operatorname{tr}\Bigl|BA - B \sum_{k=1}^{n} \lambda_k A_{u_k,v_k}\Bigr| \le \|B\| \operatorname{tr}\Bigl|A - \sum_{k=1}^{n} \lambda_k A_{u_k,v_k}\Bigr| \xrightarrow[n \to \infty]{} 0;$$
therefore, in view of the continuity of the linear functional tr (cf. 18.2.11a), we have
$$\operatorname{tr}(BA) = \lim_{n \to \infty} \operatorname{tr}\Bigl(B \sum_{k=1}^{n} \lambda_k A_{u_k,v_k}\Bigr) = \lim_{n \to \infty} \sum_{k=1}^{n} \lambda_k (u_k|Bv_k) = \sum_{n=1}^{\infty} \lambda_n (u_n|Bv_n);$$
18.2.16 Remark. In view of 18.2.15, the series of operators which appear in 18.2.8
and in 18.2.9 (if I = N) are convergent not only with respect to the norm defined
in 4.2.11a but also with respect to the norm ν1 .
18.2.17 Proposition. Let M and N be subspaces of H, let T₁ := P_M, and let
$$T_{2h} := (P_N P_M)^h, \qquad T_{2h+1} := P_M (P_N P_M)^h, \quad \forall h \in \mathbb{N}.$$
Then
$$\operatorname{tr}(B P_{M \cap N} A P_{M \cap N}) = \lim_{k \to \infty} \operatorname{tr}(B T_k A T_k^{\dagger}), \quad \forall A \in T(H), \ \forall B \in B(H),$$
and hence (for B := 1_H)
$$\operatorname{tr}(P_{M \cap N} A P_{M \cap N}) = \lim_{k \to \infty} \operatorname{tr}(T_k A T_k^{\dagger}), \quad \forall A \in T(H).$$
If A is a positive element of T(H) and tr(P_{M∩N} A P_{M∩N}) ≠ 0, then
$$\operatorname{tr}(T_k A T_k^{\dagger}) \ne 0, \quad \forall k \in \mathbb{N}.$$
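The proposition rests on the strong convergence of the alternating products Tk to P_{M∩N} (cited from 13.2.2). A small numerical sketch, illustrative only and with arbitrarily chosen subspaces of ℝ³, shows tr(B Tk A Tk†) approaching tr(B P_{M∩N} A P_{M∩N}):

```python
import numpy as np

# Arbitrary subspaces of R^3: M = span{e1, e2}, N = span{e1, (e2+e3)/sqrt(2)};
# then M ∩ N = span{e1}.
PM = np.diag([1.0, 1.0, 0.0])
w = np.array([0.0, 1.0, 1.0]) / np.sqrt(2.0)
PN = np.diag([1.0, 0.0, 0.0]) + np.outer(w, w)
P_int = np.diag([1.0, 0.0, 0.0])      # P_{M ∩ N}

def T(k):
    """T_1 = P_M, T_{2h} = (P_N P_M)^h, T_{2h+1} = P_M (P_N P_M)^h."""
    h, r = divmod(k, 2)
    C = np.linalg.matrix_power(PN @ PM, h)
    return PM @ C if r == 1 else C

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3))
A = X @ X.T                           # positive, hence trace class
B = rng.standard_normal((3, 3))

target = np.trace(B @ P_int @ A @ P_int)
approx = np.trace(B @ T(60) @ A @ T(60).T)
assert np.isclose(target, approx, atol=1e-6)
```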
Proof. If A = OH then the statement is trivially true. In what follows, we assume
A ∈ T (H)−{OH }, we fix B ∈ B(H), and we set P := PM∩N . Let {un }n∈I , {vn }n∈I ,
{λn}n∈I be with respect to A as in 18.2.9. In view of 18.2.11c and 18.2.15, we have
$$\operatorname{tr}(BPAP) = \operatorname{tr}(PBPA) = \sum_{n \in I} \lambda_n (u_n|PBPv_n) = \sum_{n \in I} \lambda_n (Pu_n|BPv_n),$$
and
$$\operatorname{tr}(B T_k A T_k^{\dagger}) = \operatorname{tr}(T_k^{\dagger} B T_k A) = \sum_{n \in I} \lambda_n (T_k u_n|B T_k v_n), \quad \forall k \in \mathbb{N}.$$
Moreover, by 13.2.2 (and by the continuity of B) we have
$$(Pu_n|BPv_n) = \lim_{k \to \infty} (T_k u_n|B T_k v_n), \quad \forall n \in I.$$
If I = {1, ..., N}, this proves that
$$\operatorname{tr}(BPAP) = \lim_{k \to \infty} \operatorname{tr}(B T_k A T_k^{\dagger}).$$
Now we suppose I = N. We notice that
$$|(T_k u_n|B T_k v_n)| \le \|T_k u_n\| \|B T_k v_n\| \le \|B\|, \quad \forall n \in \mathbb{N}, \ \forall k \in \mathbb{N}$$
(cf. 10.1.9, 4.2.5b, 4.2.9), and that
$$\sum_{n=1}^{\infty} |\lambda_n| \|B\| = \|B\| \sum_{n=1}^{\infty} |\lambda_n| < \infty.$$
Then, by 8.3.10a and 8.2.11 (with the sequence {|λn|‖B‖} as dominating function)
we have
$$\operatorname{tr}(BPAP) = \sum_{n=1}^{\infty} \lambda_n (Pu_n|BPv_n) = \lim_{k \to \infty} \sum_{n=1}^{\infty} \lambda_n (T_k u_n|B T_k v_n) = \lim_{k \to \infty} \operatorname{tr}(B T_k A T_k^{\dagger}).$$
Finally, we suppose that A is positive. Then the operator Tk A Tk† is positive for all
k ∈ N, as can be seen easily. Therefore, if k ∈ N exists so that tr(Tk A Tk†) = 0 then
Tk A Tk† = OH (cf. 18.2.4c), and hence Tm A Tm† = OH for all m > k since
$$\forall m > k, \ \exists S_{m,k} \in B(H) \ \text{s.t.} \ T_m A T_m^{\dagger} = S_{m,k} T_k A T_k^{\dagger} S_{m,k}^{\dagger},$$
and hence lim_{k→∞} tr(Tk A Tk†) = 0, in contradiction with tr(P_{M∩N} A P_{M∩N}) ≠ 0.
Statistical operators are nothing else than positive trace class operators which are
normalized with respect to the norm ν1 for T (H) (i.e., their trace is one). Thus, the
results we prove in this section are essentially exercises about positive trace class
operators and they are of interest especially in view of the role played by statistical
operators in quantum mechanics.
Throughout this section, H denotes a separable Hilbert space whose orthogonal
dimension is denumerable. For a finite-dimensional Hilbert space, everything holds
in an obviously simplified fashion.
18.3.2 Remarks.
(a) If W ∈ W(H) and U ∈ UA(H), then U W U⁻¹ ∈ W(H). This follows from
18.2.2d.
(b) For each u ∈ H̃, the one-dimensional projection Au is a statistical operator. In
fact Au is positive (so are all orthogonal projections, in view of 13.1.7c) and
tr Au = 1 (cf. 18.2.12b). From 18.2.12c we have
tr(BAu ) = (u|Bu) , ∀B ∈ B(H).
In view of 18.2.12a,b, the one-dimensional projections are the only orthogonal
projections which are statistical operators.
(c) If W ∈ W(H) then, in view of 18.2.8, there exist an o.n.s. {un }n∈I (with
I := {1, ..., N } or I := N) and a family {λn }n∈I of elements of (0, ∞) so that
$$W = \sum_{n \in I} \lambda_n A_{u_n} \quad \text{and} \quad \sum_{n \in I} \lambda_n = \operatorname{tr} W = 1; \tag{1}$$
thus λn ∈ (0, 1] for all n ∈ I. If I = N then the first of these series is convergent
with respect to the norm for B(H) defined in 4.2.11a and also with respect to
the norm ν1 for T (H) (cf. 18.2.16), and we have
$$Wf = \sum_{n=1}^{\infty} \lambda_n (u_n|f)\, u_n, \quad \forall f \in H,$$
cf. 1.2.1); therefore, this family is uniquely determined (if {un }n∈I is required,
as above, to be an o.n.s.). The family {Aun }n∈I is uniquely determined iff the
eigenspaces of all non-zero eigenvalues of W are one-dimensional (if this is true
then Aun is the orthogonal projection on the eigenspace corresponding to λn ).
However, even in this case, a decomposition of W as in 1 is not unique if the
family {un }n∈I is not required to be an o.n.s. but only to consist of elements
of H̃, unless W is a one-dimensional projection. This will be proved in 18.3.7.
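Concretely, on C² a statistical operator is a positive matrix of unit trace, and decomposition 1 is its spectral decomposition. The sketch below is a numerical illustration only (the weights 0.75 and 0.25 are arbitrary):

```python
import numpy as np

# W = 0.75 A_{e1} + 0.25 A_{e2}: a mixture of the standard basis states of C^2.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
W = 0.75 * np.outer(e1, e1) + 0.25 * np.outer(e2, e2)

lam, U = np.linalg.eigh(W)
assert np.all((lam > 0) & (lam <= 1))   # eigenvalues lambda_n lie in (0, 1]
assert np.isclose(lam.sum(), 1.0)       # tr W = 1

# tr(BW) is the corresponding mixture of the expectations (u_n|B u_n).
B = np.array([[1.0, 2.0], [2.0, -1.0]])
assert np.isclose(np.trace(B @ W), 0.75 * (e1 @ B @ e1) + 0.25 * (e2 @ B @ e2))
```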
Proof. Let {un }n∈I be an o.n.s. in H (with I := {1, ..., N } or I := N) and {λn }n∈I
a family of elements of (0, 1] so that
$$W = \sum_{n \in I} \lambda_n A_{u_n} \quad \text{and} \quad \sum_{n \in I} \lambda_n = 1,$$
as in 18.3.2c. We have
$$W^2 = \sum_{n \in I} \lambda_n^2 A_{u_n}$$
since A_{un} A_{um} = δ_{n,m} A_{un} for all n, m ∈ I (if I = N, we have used also the continuity
of the operator product in B(H), cf. 4.3.5 and 4.3.3). We notice that λn² ≤ λn and
hence Σ_{n∈I} λn² < ∞. Then, in view of 18.2.15, W² ∈ T(H) and
$$1 = \operatorname{tr} W^2 = \sum_{n \in I} \lambda_n^2 (u_n|u_n) = \sum_{n \in I} \lambda_n^2.$$
Therefore,
$$\sum_{n \in I} (\lambda_n - \lambda_n^2) = 0$$
and hence
$$\lambda_n \in \{0, 1\}, \quad \forall n \in I.$$
This implies I = {1} and hence W = Au1 .
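The computation above shows that tr W² = Σ λn² equals 1 exactly when W is a one-dimensional projection, while for any proper mixture tr W² < 1. A numerical sketch, illustrative only:

```python
import numpy as np

def purity(W):
    """tr W^2, which equals sum_n lambda_n^2 over the spectral weights of W."""
    return float(np.trace(W @ W).real)

u = np.array([1.0, 1.0]) / np.sqrt(2.0)
pure = np.outer(u, u)              # W = A_u, a one-dimensional projection
mixed = np.diag([0.5, 0.5])        # W = (1/2)A_{e1} + (1/2)A_{e2}

assert np.isclose(purity(pure), 1.0)   # tr W^2 = 1 only in the pure case
assert np.isclose(purity(mixed), 0.5)  # sum lambda_n^2 = 1/4 + 1/4 < 1
```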
is an element of W(H);
(c) for all B ∈ B(H),
$$\operatorname{tr}(BW) = \sum_{n \in I} w_n \operatorname{tr}(B W_n).$$
Proof. a: If I = N then the series Σ_{n=1}^∞ wn Wn is absolutely convergent in the
normed space (T(H), ν₁) since
$$\nu_1(w_n W_n) = w_n \operatorname{tr} W_n = w_n, \quad \forall n \in \mathbb{N},$$
and hence it is convergent in this normed space (cf. 18.2.14 and 4.1.8b). Then, this
series is convergent also with respect to the norm for B(H) defined in 4.2.11a (cf.
18.2.5a).
b: From 18.2.4a,b (if I = {1, ..., N }) or from result a (if I = N) we have
W ∈ T (H). Moreover,
$$(f|Wf) = \sum_{n \in I} w_n (f|W_n f) \ge 0, \quad \forall f \in H$$
by 18.2.11a.
c: If I = {1, ..., N }, this follows from the linearity of the function tr (cf.
18.2.11a). Now we suppose that I = N and fix B ∈ B(H). Then the series
Σ_{n=1}^∞ wn B Wn is absolutely convergent in the normed space (T(H), ν₁) since
in view of the continuity of the operator product with respect to the norm defined
in 4.2.11a. Hence we have
$$\operatorname{tr}(BW) = \operatorname{tr}\Bigl(\sum_{n=1}^{\infty} w_n B W_n\Bigr) = \sum_{n=1}^{\infty} w_n \operatorname{tr}(B W_n),$$
in view of the continuity of the function tr with respect to the norm ν1 (cf. 18.2.11a).
18.3.5 Corollary. Let I := {1, ..., N} or I := N, let {un}n∈I be a family of ele-
ments of H̃, let {wn}n∈I be a family of elements of (0, 1] such that Σ_{n∈I} wn = 1.
Then:
(a) if I = N, the series Σ_{n=1}^∞ wn A_{un} is convergent in the normed space (T(H), ν₁)
and also with respect to the norm for B(H) defined in 4.2.11a;
(b) the operator
$$W := \sum_{n \in I} w_n A_{u_n}$$
is an element of W(H);
(c) for all B ∈ B(H),
$$\operatorname{tr}(BW) = \sum_{n \in I} w_n \operatorname{tr}(B A_{u_n}) = \sum_{n \in I} w_n (u_n|B u_n).$$
(a) W ∈ W(H);
(b) there exist a family {un}n∈I (with I := {1, ..., N} or I := N) of elements of H̃
and a family {wn}n∈I of elements of (0, 1] so that
$$A_{u_i} \ne A_{u_k} \ \text{if} \ i \ne k, \qquad \sum_{n \in I} w_n = 1, \qquad Wf = \sum_{n \in I} w_n A_{u_n} f, \quad \forall f \in H.$$
(a) the representation of W as in 18.3.6b is unique (i.e. the families {Aun }n∈I and
{wn }n∈I as in 18.3.6b are uniquely determined);
(b) W is a one-dimensional projection.
with {un}n∈I an o.n.s. in H and {wn}n∈I a family of elements of (0, 1] such that
Σ_{n∈I} wn = 1. We suppose that W is not a one-dimensional projection. Then the
index set I must contain more than one element, and we define the vectors
$$v_1 := 2^{-1/2}(u_1 + u_2) \quad \text{and} \quad v_2 := 2^{-1/2}(u_1 - u_2),$$
which are elements of H̃. It is easy to see that
$$A_{u_1} + A_{u_2} = A_{v_1} + A_{v_2}.$$
We set J := I − {1, 2}. If w₁ = w₂, we have
$$W = w_1 A_{v_1} + w_2 A_{v_2} + \sum_{n \in J} w_n A_{u_n} \tag{2}$$
(if I = {1, 2} then Σ_{n∈J} wn A_{un} := OH). If w₁ ≠ w₂ and (for instance) w₁ < w₂,
we have
$$W = w_1 A_{v_1} + w_1 A_{v_2} + (w_2 - w_1) A_{u_2} + \sum_{n \in J} w_n A_{u_n}. \tag{3}$$
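The identity A_{u1} + A_{u2} = A_{v1} + A_{v2} used above is easy to verify numerically. The sketch below is illustrative only, taking {u1, u2} to be the standard basis of C² and w1 = w2 = 1/2 as in decomposition 2; it exhibits two distinct families of unit vectors yielding the same statistical operator:

```python
import numpy as np

u1, u2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
v1 = (u1 + u2) / np.sqrt(2.0)
v2 = (u1 - u2) / np.sqrt(2.0)

# A_{u1} + A_{u2} = A_{v1} + A_{v2}
assert np.allclose(np.outer(u1, u1) + np.outer(u2, u2),
                   np.outer(v1, v1) + np.outer(v2, v2))

# With w1 = w2 = 1/2, two distinct families give the same W (here W = I/2).
W_u = 0.5 * np.outer(u1, u1) + 0.5 * np.outer(u2, u2)
W_v = 0.5 * np.outer(v1, v1) + 0.5 * np.outer(v2, v2)
assert np.allclose(W_u, W_v)
```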
Proof. Let {un}n∈I be an o.n.s. in H which is complete in the subspace RP and
let {vn}n∈N be a c.o.n.s. in H which contains {un}n∈I (cf. 10.7.3).
a: We have
$$\operatorname{tr}(PW) = \operatorname{tr}(WP) = \sum_{n=1}^{\infty} (v_n|W P v_n) = \sum_{n \in I} (u_n|W u_n)$$
18.3.9 Proposition. Let W ∈ W(H) and P ∈ P(H). Then the following condi-
tions are equivalent:
(a) tr(P W ) = 1;
(b) RW ⊂ RP ;
(c) PW = W;
(d) PWP = W;
a ⇒ b: Condition a implies
$$\sum_{n \in I} \lambda_n (u_n|P u_n) = \operatorname{tr}(PW) = 1,$$
and hence (since λn > 0 for each n ∈ I and Σ_{n∈I} λn = 1)
$$\|P u_n\|^2 = (u_n|P u_n) = 1, \quad \forall n \in I,$$
and hence (cf. 13.1.3c)
$$u_n \in R_P, \quad \forall n \in I.$$
Since RW ⊂ V{un}n∈I, this proves that RW ⊂ RP.
b ⇒ c: We assume condition b. We have
$$u_n = \lambda_n^{-1} W u_n, \quad \forall n \in I,$$
and hence
$$u_n \in R_P, \quad \forall n \in I,$$
and hence
$$P A_{u_n} = A_{u_n}, \quad \forall n \in I$$
(cf. 13.1.3c). This implies P W = W (if I = N, use e.g. the continuity of the
operator product in B(H)).
c ⇒ d: We have
$$PW = W \ \Rightarrow\ WP = (PW)^{\dagger} = W,$$
$$PW = W \ \Rightarrow\ PWP = WP = W.$$
d ⇒ a: Condition d implies
18.3.10 Corollary. Let W ∈ W(H) and u ∈ H̃. Then the following conditions are
equivalent:
(a) tr(Au W ) = 1;
(b) W = Au .
$$u_n \in V\{u\}, \quad \forall n \in I,$$
and hence
$$I = \{1\}, \quad \lambda_1 = 1, \quad A_{u_1} = A_u,$$
and hence W = Au .
b ⇒ a: If W = Au then
tr(Au W ) = tr Au = 1.
18.3.11 Corollary. Let W ∈ W(H) and P ∈ P(H). Then the following conditions
are equivalent:
(a) tr(P W ) = 0;
(b) RW ⊂ NP ;
(c) P W = OH ;
(d) P W P = OH .
18.3.12 Proposition. Let W ∈ W(H) and let {Pn } be a sequence in P(H) such
that Pi Pk = OH if i ≠ k. Then
$$\operatorname{tr}\Bigl(\Bigl(\sum_{n=1}^{\infty} P_n\Bigr) W\Bigr) = \sum_{n=1}^{\infty} \operatorname{tr}(P_n W)$$
(the orthogonal projection Σ_{n=1}^∞ Pn is defined as in 13.2.10b).
Proof. The range of the function µ_W^P is indeed a subset of [0, 1], in view of 18.3.8b.
The rest of the statement follows immediately from the definition of a projection
valued measure and from 18.3.12.
If condition a holds true then we have (in view of 2, and since wn > 0 for all n ∈ I)
$$\int_{\mathbb{R}} \xi^2\, d\mu_{u_n}^{P^A} < \infty, \quad \forall n \in I,$$
and hence (cf. 15.2.2e)
$$u_n \in D_A, \quad \forall n \in I,$$
and also
$$\|A u_n\|^2 = \int_{\mathbb{R}} \xi^2\, d\mu_{u_n}^{P^A}, \quad \forall n \in I,$$
and hence (cf. 2) also
$$\sum_{n \in I} w_n \|A u_n\|^2 < \infty.$$
Thus, condition b holds true.
If condition b holds true then we have (in view of 15.2.2e and 2)
$$\int_{\mathbb{R}} \xi^2\, d\mu_W^{P^A} = \sum_{n \in I} w_n \|A u_n\|^2 < \infty,$$
and this proves that condition a holds true.
In what follows we assume that conditions a and b are satisfied.
If I = {1, ..., N} then D_{AW} = H since
$$R_W \subset L\{u_1, ..., u_N\} \subset D_A,$$
and also
$$AWf = \sum_{n=1}^{N} w_n (u_n|f)\, A u_n, \quad \forall f \in H. \tag{3}$$
Now we suppose I = N. We fix f ∈ H. Then,
$$\sum_{n=1}^{N} w_n (u_n|f)\, u_n \in D_A \quad \text{and} \quad A\Bigl(\sum_{n=1}^{N} w_n (u_n|f)\, u_n\Bigr) = \sum_{n=1}^{N} w_n (u_n|f)\, A u_n, \quad \forall N \in \mathbb{N}.$$
Moreover, the inequality
$$\sum_{n=1}^{\infty} \|w_n (u_n|f)\, u_n\| \overset{(4)}{\le} \sum_{n=1}^{\infty} w_n \|f\| = \|f\| \sum_{n=1}^{\infty} w_n < \infty$$
(4 holds true by the Schwarz inequality) proves that the series Σ_{n=1}^∞ wn (un|f) un
is convergent (cf. 4.1.8b). Similarly, the inequalities
$$\sum_{n=1}^{\infty} \|w_n (u_n|f)\, A u_n\| \overset{(5)}{\le} \sum_{n=1}^{\infty} w_n \|f\| \|A u_n\| \overset{(6)}{\le} \|f\| \Bigl(\sum_{n=1}^{\infty} w_n\Bigr)^{1/2} \Bigl(\sum_{n=1}^{\infty} w_n \|A u_n\|^2\Bigr)^{1/2} < \infty$$
(5 holds true by the Schwarz inequality in H; 6 holds true by the Schwarz inequality
in ℓ² for the two sequences {wn^{1/2}} and {wn^{1/2}‖Aun‖}, cf. 10.3.8d) prove that the series
Σ_{n=1}^∞ wn (un|f) Aun is convergent. Since the operator A is closed (cf. 12.4.6a), this
implies that
$$\sum_{n=1}^{\infty} w_n (u_n|f)\, u_n \in D_A \quad \text{and} \quad A\Bigl(\sum_{n=1}^{\infty} w_n (u_n|f)\, u_n\Bigr) = \sum_{n=1}^{\infty} w_n (u_n|f)\, A u_n.$$
Since f was an arbitrary element of H, this proves that
$$Wf \in D_A \ \text{for all} \ f \in H, \ \text{i.e.} \ D_{AW} = H,$$
and
$$AWf = \sum_{n=1}^{\infty} w_n (u_n|f)\, A u_n, \quad \forall f \in H. \tag{7}$$
In what follows, I can be either {1, ..., N} or N. We define the set of indices
$$J := \{n \in I : A u_n \ne 0_H\}.$$
If J = ∅ then from either 3 or 7 we have AW = OH and hence AW ∈ T(H), and
also
$$\operatorname{tr}(AW) = 0 = \sum_{n \in I} w_n (u_n|A u_n) = \sum_{n \in I} w_n \langle A \rangle_{u_n}.$$
If J ≠ ∅, we define
$$v_n := \|A u_n\|^{-1} A u_n, \quad \forall n \in J;$$
then from either 3 or 7 we have
$$AWf = \sum_{n \in J} w_n \|A u_n\| A_{u_n,v_n} f, \quad \forall f \in H,$$
and hence (cf. 18.2.15; note that, if I = N, 6 proves that Σ_{n=1}^∞ wn ‖Aun‖ < ∞)
AW ∈ T(H) and
$$\operatorname{tr}(AW) = \sum_{n \in J} w_n \|A u_n\| (u_n|v_n) = \sum_{n \in I} w_n (u_n|A u_n) = \sum_{n \in I} w_n \langle A \rangle_{u_n}.$$
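For a bounded self-adjoint A the formula tr(AW) = Σ wn⟨A⟩_{un} can be checked directly in finite dimension; the sketch below is illustrative only, with arbitrarily chosen weights and a random symmetric matrix standing in for A:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 3))
A = (X + X.T) / 2.0                               # bounded self-adjoint A on C^3

u = [np.eye(3)[:, k] for k in range(3)]           # o.n.s. {u_n}
w = np.array([0.5, 0.3, 0.2])                     # weights with sum 1
W = sum(wk * np.outer(uk, uk) for wk, uk in zip(w, u))

lhs = np.trace(A @ W)                                  # <A>_W = tr(AW)
rhs = sum(wk * (uk @ A @ uk) for wk, uk in zip(w, u))  # sum_n w_n <A>_{u_n}
assert np.isclose(lhs, rhs)
```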
$$= \langle A^2 \rangle_W - 2\langle A \rangle_W^2 + \langle A \rangle_W^2 = \langle A^2 \rangle_W - \langle A \rangle_W^2.$$
Proof. Let W ∈ W(H), let {un }n∈I and {wn }n∈I be as in 18.3.16, and let A be a
self-adjoint operator in H. If A is bounded then DA = H (cf. 12.4.7) and hence
un ∈ DA , ∀n ∈ I;
moreover,
$$\sum_{n \in I} w_n \|A u_n\|^2 \le \sum_{n \in I} w_n \|A\|^2 \|u_n\|^2 = \|A\|^2 \sum_{n \in I} w_n < \infty$$
18.3.19 Remarks.
(a) Let A be a self-adjoint operator in H and W ∈ W(H). If conditions a and b in
18.3.16 hold true, then
AW ∈ T (H) and hAiW = tr(AW ).
If A is bounded then A ∈ B(H) (cf. 12.4.7) and hence we have
W A ∈ T (H) and hAiW = tr(W A),
by 18.2.7 and by 18.2.11c respectively.
If A is not bounded then DA 6= H (cf. 12.4.7), and hence DW A 6= H, and hence
the operator W A cannot be trace class and the formula tr(W A) is meaningless.
(b) We recall that, for a statistical operator W, the decomposition W =
Σ_{n∈I} wn A_{un} as in 18.3.6b is not unique unless W is a one-dimensional pro-
jection (cf. 18.3.7). From 18.3.16 we have that, if a self-adjoint operator A is
computable in W, then
$$u_n \in D_A \ \text{for all} \ n \in I \quad \text{and} \quad \sum_{n \in I} w_n \|A u_n\|^2 < \infty$$
Chapter 19
In this chapter we examine how the theory of Hilbert space operators is used in
quantum mechanics. This chapter is not meant to be a short treatise on quantum
mechanics, since only the basic mathematical structure of the quantum theories is
discussed and no applications are provided.
The predictions that are provided by quantum mechanics are in general sta-
tistical ones. And indeed, in what follows, quantum mechanics is presented as a
theoretical scheme which can account for the probabilistic distributions of mea-
surements, in experiments where measurements are repeatedly carried out on a
large number of suitably-prepared copies of a physical system. The probabilities
are interpreted as the theoretical predictions of the relative frequencies with which
results are obtained when measurements are made on a large number of identically-
prepared copies of the physical system under consideration. Quantum mechanics
shares a good deal of its theoretical framework with other statistical theories, e.g.
classical statistical mechanics and theories of games of chance (we will refer to all
these theories as “classical statistical theories”). In the first section of this chapter
we give an outline of this shared framework, which we call a “general statistical
theory”. For the abstract concepts we introduce, we use the names that are com-
monly used for them in quantum mechanics. In the second section we examine how
this general statistical theory is implemented in the classical theories, and in the
third section how it is implemented in the quantum theories. In the fourth and fifth
sections, other topics are examined which are specific to quantum mechanics: state
reduction, compatibility of observables, uncertainty relations.
Up to Section 19.5 we think of time as standing still: the time intervals between
operational procedures are always supposed to be sufficiently small that there is no
need to consider the internal time evolution of the system. At times, this is indicated
by the use of the locution “immediately after”. In Section 19.6 we examine time
evolution in non-relativistic quantum mechanics.
Since we are mainly concerned with the mathematical aspects of the foundations of
quantum mechanics, we could set off in an axiomatic way by simply saying that we
are given two abstract sets Π and Σ and a function p : Π × Σ → [0, 1], specifying
that Π represents the family of all propositions pertaining to a physical system and
Σ the family of all states of the system, and that p(π, σ) is the probability that
the proposition π is true when the system is in the state σ. However, we prefer to
explain by what kind of reasoning these abstract objects are brought about in what
we call a general statistical theory.
19.1.1 Definitions. A state preparation (or, simply, a state) is a collection of
instructions for a set of physical operations to be performed on an array of objects,
so that:
the operations can be repeated, at least in principle, an indefinite number of
times;
the objects are macroscopic bodies, in the sense that the instructions are governed
by standard classical logic.
A proposition is an event so that:
the occurrence or non-occurrence of the event is to be decided immediately after
a state preparation has been performed;
when the event occurs, it takes place in a macroscopic device, in the sense that
the procedure for ascertaining whether the event has occurred is governed by
standard classical logic;
when the event occurs, it leaves a long-lasting record in the device and its occur-
rence is ascertained by verifying this record, which has an objective meaning (the
record can be read by any number of scientific observers and all of them agree
about its meaning).
When the procedures which define a state σ and a proposition π are implemented,
they have obviously space and time positions. However, it is assumed that these
“absolute” positions are immaterial (and therefore they are not specified in the
definitions of σ and π) and that, if π is to be decided immediately after σ, then
the relative space positions of π and σ are suitable for an “interaction” between σ
and π to take place (suitable, that is, according to the picture one has of a possible
“interaction” between σ and π).
For a given proposition π and a given state σ, we define the following course
of action: we perform the operations prescribed by σ and immediately after we
ascertain whether the event π has occurred, and we do this a large number N of
times. If we implement this course of action twice, and if N′_π (respectively, N″_π)
denotes the number of times when π has occurred in the first (respectively, second)
implementation, we cannot expect N′_π and N″_π to be equal in general. We say
that π and σ are correlated when, as N grows, the difference between the relative
frequencies N′_π/N and N″_π/N approaches zero and the relative frequencies approach a
limit (clearly, the term “approach” has here an informal meaning and so has the
term “limit”). When π and σ are correlated, the limit is called the probability of
the occurrence of the proposition π immediately after the state preparation σ or,
simply, the probability of π in σ.
We say that we have a physical system (or, simply, a system) if we have a family
Σ of states and a family Π of propositions so that π and σ are correlated for each
pair (π, σ) ∈ Π × Σ and if we think that the resulting probabilities are liable to
be organized in a consistent theory. If (π, σ) ∈ Π × Σ, a single implementation of
the operations prescribed by σ is said to be a copy of the system prepared in the
state σ (or, simply, a copy in σ), and the ascertainment whether π has occurred
immediately after an implementation of the operations prescribed by σ is said to
be the determination of π for a copy in σ; if π has occurred, then π is said to be
true in that copy. We say that we have a statistical theory of the system if we have
a theoretical scheme whereby a function p : Π × Σ → [0, 1] can be obtained so that
p(π, σ) is, for all (π, σ) ∈ Π × Σ, the probability of the occurrence of π immediately
after the state preparation σ. The function p is called a probability function. If
the theory supplies such a function, then p(π, σ) is the theoretical prediction of the
relative frequency Nπ/N, where Nπ is the number of copies in which a proposition
π has turned out to be true out of N copies of the system prepared in a state σ,
provided that N is large enough. It is important to note that, although a state σ
and a proposition π are procedures to be operated on a single copy of the system,
the number p(π, σ) that the theory assigns to the pair (π, σ) can be compared with
the experimental results only if we have a large (hypothetically infinite) collection
of copies of the system, for each of which the determination of π is carried out
immediately after the copy has been prepared in the state σ. A large collection of
copies, all prepared in the state σ, is sometimes called an ensemble representing σ.
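The informal notion of two runs of relative frequencies approaching a common limit can be mimicked with a toy simulation; the sketch below is illustrative only (the value p = 0.3 and the sample sizes are arbitrary), drawing two independent ensembles and comparing their frequencies:

```python
import numpy as np

# Toy ensemble: N copies prepared in a state sigma for which a hypothetical
# theory assigns p(pi, sigma) = 0.3; pi is determined once for each copy.
rng = np.random.default_rng(4)
p = 0.3
for N in (100, 10_000, 1_000_000):
    f1 = (rng.random(N) < p).mean()   # first run:  N'_pi / N
    f2 = (rng.random(N) < p).mean()   # second run: N''_pi / N

# For the largest N, both frequencies are close to p and to each other.
assert abs(f1 - p) < 0.01 and abs(f2 - p) < 0.01 and abs(f1 - f2) < 0.02
```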
19.1.2 Remarks.
(a) While in the classical statistical theories it is often obvious what is to be con-
sidered the physical system under consideration (e.g. the gas contained in a
vessel, a coin, a pair of dice, a roulette table), this is not so in the quantum
theories, and in our opinion it is convenient to consider a quantum system as an
“interaction channel” between a definite set of states and a definite set of propo-
sitions, according to the definition given in 19.1.1. For instance, if the state is
to switch on a “source” to the left of a Stern–Gerlach apparatus (the source and
the Stern–Gerlach apparatus are thus the objects which appear in the abstract
definition of state) and the proposition is the event defined by the reaction of a
detector to the right of the Stern–Gerlach apparatus, then the physical system
is called a spin-half (for instance) particle. As another example, if the state is to
operate an accelerator in a given mode and to arrange magnetic analysers and
collimating slits in a given way, and the proposition is once again the reaction of
a detector, then the case may be that the physical system is called a meson. In
(d) Experimental evidence supports the assumption that only the relative positions
in space (as specified in 19.1.1) and time (a proposition immediately after a
state) are important. Actually, this is perhaps the first invariance law discovered
in the history of physics. This fact has allowed physics to be tackled as an
experimental science.
19.1.3 Definition. The event that defines a proposition π can be used to define
another event, which defines another proposition; this new proposition is called the
negation of π and denoted by the symbol ¬π, and the event that defines ¬π is the
non-occurrence of the event that defines π.
19.1.4 Remarks.
(a) If there is a statistical theory for the physical system defined by a collection Π
of propositions and a collection Σ of states, then the theory is consistent only
if, for the probability function p, we have
p(¬π, σ) = 1 − p(π, σ), ∀π ∈ Π, ∀σ ∈ Σ. (1)
(b) We point out that nothing has been said about the possibility, for two proposi-
tions π and π ′ , of defining the event that is said to have occurred if and only if
both the events that define π and π ′ have occurred (such new event would define
the proposition “π and π ′ ”) or of defining the event that is said to have occurred
if and only if at least one of the events that define π and π ′ has occurred (such
new event would define the proposition “π or π ′ ”). Indeed, this requires that it
is feasible to determine both π and π ′ for a single copy prepared in any state.
This feasibility is actually assumed in all classical statistical theories. We will
see that this is one of the aspects in which the quantum theories differ from the
classical statistical ones.
19.1.5 Definitions. Let Π and Σ be families of propositions and states so that
Π × Σ defines a physical system for which a probability function p is given. We
define an equivalence relation RΠ in Π by letting
RΠ := {(π ′ , π ′′ ) ∈ Π × Π : p(π ′ , σ) = p(π ′′ , σ), ∀σ ∈ Σ},
and similarly we define an equivalence relation RΣ in Σ by letting
RΣ := {(σ ′ , σ ′′ ) ∈ Σ × Σ : p(π, σ ′ ) = p(π, σ ′′ ), ∀π ∈ Π}.
We denote by Π̂ and Σ̂ the quotient sets which are thus defined, we denote by π̂
and σ̂, and still call propositions and states, the equivalence classes containing π ∈ Π
and σ ∈ Σ, and we define the function
p̂ : Π̂ × Σ̂ → [0, 1]
(π̂, σ̂) 7→ p̂(π̂, σ̂) := p(π, σ),
which we still call a probability function. Obviously, we have
p̂(π̂ ′ , σ̂) = p̂(π̂ ′′ , σ̂), ∀σ̂ ∈ Σ̂ ⇒ π̂ ′ = π̂ ′′ and
p̂(π̂, σ̂ ′ ) = p̂(π̂, σ̂ ′′ ), ∀π̂ ∈ Π̂ ⇒ σ̂ ′ = σ̂ ′′ .
19.1.6 Remarks.
(a) It is clear that, for a physical system defined by a family Π of propositions
and a family Σ of states, there may be state preparations σ ′ , σ ′′ ∈ Σ which are
different (i.e. they are different collections of instructions which refer to dif-
ferent arrays of objects), but which nonetheless lead to the same experimental
statistical results in the sense that, for every π ∈ Π, the probability of π in
σ ′ equals the probability of π in σ ′′ . This means that the differences between
σ ′ and σ ′′ are immaterial as far as the statistical study of the physical system
under consideration goes. And a similar remark can be made for the elements
of Π. Thus, a statistical theory of the system must contain mathematical rep-
resentations of the quotient sets Π̂, Σ̂ defined above, through which a formula
which defines the function p̂ must then be written.
(b) From condition 1 of 19.1.4 we see that, for each π̂ ∈ Π̂, the family of all negations
of the representatives of π̂ constitute an equivalence class, which we still call
the negation of π̂ and denote by the symbol ¬π̂.
19.1.7 Definitions. For every physical system, two (trivial) propositions always
exist, which we denote by the symbols π0 and π1 . The proposition π0 is defined
by the event which is said to occur if and only if no copy of the system has been
prepared: this event never occurs if π0 is determined immediately after a state
preparation has been performed. The proposition π1 is defined by the event which
is said to occur if and only if a copy of the system has been prepared: this event
always occurs if π1 is determined immediately after a state preparation has been
performed. Clearly, the equivalence classes π̂0 and π̂1 are characterized by the
following conditions
p̂(π̂0 , σ̂) = 0, ∀σ̂ ∈ Σ̂, and p̂(π̂1 , σ̂) = 1, ∀σ̂ ∈ Σ̂,
where Σ denotes as usual the family of states that defines the system.
19.1.9 Remarks.
(a) In 19.1.8, the measurable space (X, A) provides a representation of the events
into which the measurement of a physical quantity can be analysed. In our view,
an X-valued physical quantity is defined by an ideal apparatus which comprises
a dial, which is represented by X (i.e. there is a mapping, not necessarily
injective, from the dial to X), and a pointer. Immediately after a copy of the
system has been prepared in a state in such a way that an “interaction” between
the copy and the apparatus can happen (i.e., the state preparation must be
implemented in a suitable spatial position with respect to the apparatus), the
apparatus gives an X-result by the position of the pointer on the dial, which
identifies a point of the dial (since the apparatus is an ideal one) and hence a
point of X. The pointer and the dial are assumed to be macroscopic objects,
to wit the ascertainment of the possible positions of the pointer is governed by
standard classical logic. An element E of A is a subset of X to which it is
deemed sensible to assign the probability that, in a given state, the X-result is
a point of E (a probability is always considered in this book to be a normalized
measure on a σ-algebra, cf. 7.1.7); for instance, if X is endowed with a distance
then a natural choice for A is the Borel σ-algebra on X (cf. 6.1.22).
All this leads to an X-valued observable α if, for each E ∈ A, we define α(E) to
be the event which is said to have occurred if and only if the X-result given by
the apparatus has been an element of E and if we assume that this event defines
a proposition of the system. In fact, the condition that µ^α_σ be a probability
measure on A for each σ ∈ Σ can be accounted for as follows. We assume
that all propositions α(E), for E ∈ A, can be determined simultaneously for
any single copy prepared in any state. Moreover, we assume that, if {En } is a
sequence in A such that Ei ∩ Ek = ∅ for i ≠ k, then the proposition α(∪_{n=1}^∞ En)
is true in a copy prepared in a state if and only if there exists exactly one En such
that α(En ) is true in that copy. Actually, the basis for these two assumptions
is the macroscopic nature assumed before for the pointer and the dial. Now let
{En } be a sequence in A such that Ei ∩ Ek = ∅ for i ≠ k; if we determine all
propositions α(En ), for n ∈ N, as well as the proposition α(∪_{n=1}^∞ En) (which
is possible on account of the first assumption above) for a large number N of
copies of the system prepared in a state σ, and if we denote by Nn the number
of copies in which the proposition α(En ) is true and by NU the number of copies
in which the proposition α(∪_{n=1}^∞ En) is true, then we have (on account of the
second assumption above)
∑_{n=1}^∞ Nn/N = NU/N
(the series on the left hand side is actually a sum); since p(π, σ) is the theoretical
prediction of the relative frequency of a proposition π ∈ Π being true in a large
number of copies all prepared in the state σ, the consistency of the theory leads
to the equation
∑_{n=1}^∞ µ^α_σ(En) = ∑_{n=1}^∞ p(α(En), σ) = p(α(∪_{n=1}^∞ En), σ) = µ^α_σ(∪_{n=1}^∞ En).
Thus, the function µ^α_σ is σ-additive. Moreover, since we have assumed that the
apparatus (which is an ideal one) always gives an X-result immediately after a
copy has been prepared in some suitable state, the proposition α(X) is always
true in every state, and therefore the consistency of the theory leads to the
condition µ^α_σ(X) = 1 for every state σ. Thus, µ^α_σ is a probability measure on A
for every state σ.
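The counting argument above can be sketched numerically. The simulation below is illustrative only (the six-point dial X, the events En and the uniform state preparation are hypothetical): for pairwise disjoint En, each copy whose result falls in ∪En falls in exactly one En, so the relative frequencies are exactly additive.

```python
import random

random.seed(0)

# Hypothetical dial: X = {1, ..., 6} (an ideal die), with A = all subsets of X.
# Pairwise disjoint events E1, E2, E3 in A:
events = [{1, 2}, {3}, {5, 6}]
union = set().union(*events)

N = 10_000                                   # number of prepared copies
results = [random.randint(1, 6) for _ in range(N)]

# N_n = copies in which alpha(E_n) is true; N_U = copies for alpha(U E_n).
N_n = [sum(1 for x in results if x in E) for E in events]
N_U = sum(1 for x in results if x in union)

# Each copy landing in the union lands in exactly one E_n, hence
# sum_n N_n / N = N_U / N exactly, which is the counting form of sigma-additivity.
assert sum(N_n) == N_U
```

Dividing the counts by N gives the empirical counterpart of the σ-additivity of µ^α_σ.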
As to the assumption that A = A(dR )X when X is a Borel subset of R, we note
that intervals are most naturally associated with a dial which is represented by
a subset of R, and that A(dR ) is the σ-algebra on R generated by the family of
intervals (cf. 6.1.25). And similarly for the general case A(dn )X .
The determination of all propositions α(E) (i.e. of α(E) for all E ∈ A) for a
copy, prepared in some state, can be performed in actual fact by determining
only some propositions, on account of the assumption that the ascertainment
of the position of the pointer is governed by classical logic (e.g., if X is such that
{x} ∈ A and if α({x}) is true in a copy, then the proposition α(E) is true in
that copy if and only if x ∈ E). The determination of all propositions α(E) for
a copy is said to be a measurement of the observable α in that copy.
(b) The “position of the pointer on the dial” may be actually implemented by an
apparatus which does not comprise a needle-like object over a graduated scale:
it may be the blackening of a grain in a photographic plate (and then X is a
subset of R2 ), or the formation of a bubble in a bubble-chamber (and then X
is a subset of R3 ), or the digital reading of an instrument (and then X is a
subset of R). In any case, the apparatus that was considered above was clearly
an ideal one inasmuch as the position of the pointer was supposed to identify
a point of the dial. But reference to ideal instruments is a common feature
of all mathematical physics (however, we shall see that in quantum mechanics
we do not need an ideal apparatus in order to get exact measurements, if the
observable is quantized; nor do we need an ideal apparatus in the game of dice
or in the game of roulette).
(c) The analysis which was carried out in remark a was aimed at showing that it is
reasonable to represent an instrument, which can measure a physical quantity,
by a mapping α : A → Π which is so that the function µ^α_σ is a probability
measure on A for every state σ, where A is a σ-algebra which represents the
sensible parts of the dial of the instrument. However, even when (X, A) is
chosen in a conservative way (e.g. (X, A) = (R, A(dR )), since the dial of most
measuring instruments can be identified with some part of R), it would be
hard to justify in general the assumption that any mapping α : A → Π for
which µ^α_σ is a probability measure for all σ ∈ Σ should be taken to represent a
measuring instrument, and therefore should be considered a bona-fide X-valued
observable.
is a probability measure for each σ̂ ∈ Σ̂. We still call such mappings X-valued
observables.
Proof. The equalities of the statement follow from the following facts, which are
true because µ^α̂_σ̂ is a probability measure: for each σ̂ ∈ Σ̂,
19.1.12 Remark. Throughout the rest of this chapter, we always assume that we
are dealing with a fixed, although general, physical system for which we assume
that a probability function is defined. The symbols Σ and Π always denote the
families of states and propositions which define the system.
As a rule we drop the carets in the symbols Σ̂, Π̂, σ̂, π̂, α̂ and we leave it to the
reader to understand whether we refer to an equivalence class or to a representative
of it. If σ is an equivalence class of states, a representative of σ is sometimes called
an implementation of σ; and similarly for propositions.
19.1.14 Remark. If α and ϕ are as in the statement of 19.1.13, then ϕ(α) can be
considered an X2 -valued observable, which is called the function of α according to
ϕ. Indeed, there exists a measuring instrument which is represented by ϕ(α) (cf.
19.1.9c), since ϕ(α) can be interpreted as the X2 -valued observable that is defined
operationally by the same apparatus that defines α (cf. 19.1.9a), in which however
a change of scale has been made: while the dial of the apparatus is represented
by X1 when the apparatus is related to α, the dial is represented by X2 when the
apparatus is related to ϕ(α). Assuming first Dϕ = X1 , if a point of the dial is
represented by x ∈ X1 when the scale that defines α is used, then that same point
of the dial is represented by ϕ(x) ∈ X2 when the scale that defines ϕ(α) is used;
thus, in a copy prepared in some state and for any E ∈ A2 , the operational meaning
of the proposition ϕ(α)(E) is so that the proposition ϕ(α)(E) is true if and only if
the X2 -value given by the apparatus that defines ϕ(α) is in E, and this is true if
and only if the X1 -value given by the apparatus that defines α is in ϕ−1 (E), and
this is true (by the operational meaning of the proposition α(ϕ−1 (E))) if and only
if the proposition α(ϕ−1 (E)) is true; thus, the proposition ϕ(α)(E) must coincide
with the proposition α(ϕ−1 (E)). We notice that, in the reasoning just made, there
was no need for ϕ to be injective (e.g., in the game of roulette all observables are
functions of the observable α which assigns a number from 0 to 36 to any copy of
the system, and for instance the observable which assigns the colour-values “rouge”,
“noir” or “nul” is defined by a non-injective function). If Dϕ ≠ X1 we can extend ϕ
to a function ϕ̃ defined on the whole of X1 in any way that makes it A1 -measurable,
and repeat the reasoning for this extension ϕ̃. For each E ∈ A2 we have
p(α(ϕ̃⁻¹(E)), σ) = µ^α_σ(ϕ̃⁻¹(E))
= µ^α_σ(ϕ̃⁻¹(E) ∩ Dϕ) + µ^α_σ(ϕ̃⁻¹(E) ∩ (X1 − Dϕ))
= µ^α_σ(ϕ̃⁻¹(E) ∩ Dϕ) = µ^α_σ(ϕ⁻¹(E)) = p(α(ϕ⁻¹(E)), σ), ∀σ ∈ Σ,
where the monotonicity of µ^α_σ and the condition µ^α_σ(X1 − Dϕ) = 0 have been used;
this proves that α(ϕ̃⁻¹(E)) = α(ϕ⁻¹(E)). Thus, the reasoning we made before
can indeed be referred to the extension ϕ̃, but the observable ϕ̃(α) we obtain does
not depend on the extension we use: it depends only on ϕ, and therefore can be
denoted by the symbol ϕ(α). Furthermore, it can be defined directly through ϕ as
in the statement of 19.1.13.
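The change-of-scale reasoning above can be sketched with the roulette example; the colour assignment below is the standard one but should be read as a hypothetical encoding, and the check is that the proposition ϕ(α)(E) is true exactly when the X1-result lies in ϕ⁻¹(E):

```python
# Hypothetical roulette observable: X1 = {0, ..., 36}; the colour observable
# phi(alpha) uses the non-injective phi : X1 -> X2 = {"rouge", "noir", "nul"}.
X1 = set(range(37))
reds = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}

def phi(x):
    if x == 0:
        return "nul"
    return "rouge" if x in reds else "noir"

def preimage(E):
    """phi^{-1}(E): the numbers whose colour lies in E."""
    return {x for x in X1 if phi(x) in E}

# phi(alpha)(E) := alpha(phi^{-1}(E)): an X1-result x makes phi(alpha)(E)
# true if and only if x lies in phi^{-1}(E).
E = {"rouge"}
assert all((phi(x) in E) == (x in preimage(E)) for x in X1)
assert preimage({"rouge", "noir"}) == X1 - {0}   # every non-zero number is coloured
```

Note that ϕ here is not injective, exactly as in the remark.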
In what follows, we are concerned mainly with R-valued observables, which are
simply called observables.
The set of all possible results for α, i.e. the set spα defined by
spα := {λ ∈ R : ∀ε > 0, ∃σ ∈ Σ so that µ^α_σ((λ − ε, λ + ε)) ≠ 0},
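As an illustrative sketch (not part of the text), the defining condition for spα can be checked for a single hypothetical state whose measure is a sum of two point masses; a finite sample of values of ε stands in for the quantifier ∀ε > 0:

```python
# One hypothetical state sigma with a discrete measure mu^alpha_sigma:
# mu = 0.5*delta_0 + 0.5*delta_1 on (R, A(d_R)).
atoms = {0.0: 0.5, 1.0: 0.5}

def mu(a, b):
    """mu^alpha_sigma((a, b)) for an open interval (a, b)."""
    return sum(w for lam, w in atoms.items() if a < lam < b)

def in_spectrum(lam, eps_list=(1.0, 0.1, 0.01, 0.001)):
    # With a single state, the condition reduces to mu((lam-eps, lam+eps)) != 0
    # for every eps > 0 (sampled here on a finite list).
    return all(mu(lam - e, lam + e) != 0 for e in eps_list)

assert in_spectrum(0.0) and in_spectrum(1.0)   # the point masses are possible results
assert not in_spectrum(0.5)                    # a small interval around 0.5 has measure 0
```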
19.1.16 Remarks.
(a) If a number λ ∈ R happens to be so that, for an observable α, there exists
σ ∈ Σ such that
µ^α_σ({λ}) ≠ 0, (2)
then obviously λ must be considered a possible result for α from an operational
point of view: in N repetitions of the measurement of α in copies of the system
prepared in the state σ, the result λ occurs so often that its relative frequency
will approach a non-null number as N grows. In this case, it is obvious (owing
to the monotonicity of µ^α_σ) that λ fulfills the condition that we have given in
19.1.15 to characterize a possible result for α, and λ is said to be an exact result
for α.
However, condition 2 need not be fulfilled by every number which can occur as
the result obtained in the measurement of α in some copy: in N repetitions of
the measurement of α in a copy prepared in a state, a number λ can indeed
occur, but so seldom that its relative frequency will approach zero as N grows.
This is indeed what we expect to happen if λ belongs to what our theoretical
image of the system depicts as a continuum of possible results (unless state
preparations are assumed to exist that are so “precise” as to pinpoint a value
of an observable amid a continuum of possible values; such state preparations
are not realistic; however, classical mechanics is indeed based on such state
19.1.17 Proposition. For every observable α, the spectrum spα is a closed subset
of R and we have µ^α_σ(R − spα) = 0, or equivalently µ^α_σ(spα) = 1, for all σ ∈ Σ.
µ^α_σ((µ − η, µ + η)) = 0, ∀σ ∈ Σ,
µ^α_σ(R − spα) = 0, ∀σ ∈ Σ,
which is equivalent to µ^α_σ(spα) = 1 for all σ ∈ Σ, since µ^α_σ is a probability measure.
property the family is required to have in 19.1.18. Then, if 3 is true, all the elements
of the family {λn }n∈I are exact results for α (cf. 19.1.16a).
We cannot say that in general every possible result for α is an element of
{λn }n∈I , but we do have spα = cl({λn }n∈I ), the closure of {λn }n∈I . Indeed,
cl({λn }n∈I ) ⊂ spα holds since {λn }n∈I ⊂ spα is obvious and spα is closed, while
for λ ∈ R we have (by the monotonicity of µ^α_σ)
λ ∉ cl({λn }n∈I ) ⇒
[∃ε > 0 s.t. (λ − ε, λ + ε) ⊂ R − {λn }n∈I and hence s.t.
µ^α_σ((λ − ε, λ + ε)) = 0, ∀σ ∈ Σ],
19.1.22 Remarks.
(a) Suppose that α is a discrete observable with a finite family {λk }k∈I of possible
results. Then µ^α_σ({λk }k∈I ) = 1 for all σ ∈ Σ and the results obtained in any
collection of measurements of α are bound to be elements of the family {λk }k∈I .
Now suppose that measurements of α are performed in N copies of the system,
all prepared in the same state σ ∈ Σ. Two important quantities connected with
these measurements are the average of the results and the standard deviation
of the results, which are defined respectively by
Aσ,N (α) := ∑_{k∈I} λk Nk/N and Dσ,N (α) := (∑_{k∈I} (λk − Aσ,N (α))² Nk/N)^{1/2},
if Nk denotes the number of copies for which the result λk has been obtained.
For N large enough, the theoretical predictions of Aσ,N and Dσ,N are respectively
Aσ,th (α) := ∑_{k∈I} λk p(α({λk }), σ) and
Dσ,th (α) := (∑_{k∈I} (λk − Aσ,th (α))² p(α({λk }), σ))^{1/2}
(cf. 8.3.9 and 8.3.8). Thus, for a discrete observable with a finite number of
possible results, the expected result and the uncertainty defined in 19.1.20 are
the theoretical predictions of the average and of the standard deviation of the
results obtained in a large number of measurements.
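As a numerical sketch (the observable, its values λk and the probabilities below are made up), the empirical Aσ,N and Dσ,N approach the theoretical Aσ,th and Dσ,th for large N:

```python
import random

random.seed(1)

# Hypothetical discrete observable: results lambda_k with p(alpha({lambda_k}), sigma).
lam = [-1.0, 0.0, 2.0]
prob = [0.25, 0.50, 0.25]

A_th = sum(l * p for l, p in zip(lam, prob))                       # expected result
D_th = sum((l - A_th) ** 2 * p for l, p in zip(lam, prob)) ** 0.5  # uncertainty

N = 200_000
results = random.choices(lam, weights=prob, k=N)
A_N = sum(results) / N                                             # empirical average
D_N = (sum((x - A_N) ** 2 for x in results) / N) ** 0.5            # empirical std dev

# For large N the measured statistics approach the theoretical predictions:
assert abs(A_N - A_th) < 0.02 and abs(D_N - D_th) < 0.02
```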
The analysis above cannot be carried out for an observable with an infinite
number of possible results, since for an observable α of this kind there might be
possible results λ such that p(α({λ}), σ) = 0 for all σ ∈ Σ (this would represent
the existence of a continuum of possible results). One could argue that for
every actual measuring instrument there exists a finite set which contains all the
results that the instrument can produce, and therefore every actual measuring
instrument must be represented by an observable with only a finite number
of possible results. Thus, one could be tempted into discarding observables
with an infinite number of possible results on the grounds that they are not
realistic. However, physical theories can hardly ever be formulated in terms of
actual measuring instruments and the use of idealized observables is common
practice in physics (for instance, the position and the velocity of a particle
are observables which in both classical and quantum mechanics are
not discrete, even though no actual measuring instrument can pinpoint their
alleged values better than assigning them to intervals related to the resolution
of the instrument; moreover, as to velocity in classical mechanics, no actual
instrument can really compute a derivative). Hence, idealized observables must
be taken into consideration. The idealistic import of this is lessened by the fact
that every observable α can be considered as the limit of a sequence of realistic
observables in the sense explained below.
Let α be an observable. For each n ∈ N, let En be a bounded interval, let
{Fn,k }k∈In be a finite partition of En such that Fn,k is an interval for all k ∈ In ,
let λn,k be a non-null element of Fn,k for all k ∈ In ; further, assume that
En ⊂ En+1 for each n ∈ N, that ∪_{n=1}^∞ En = R, and that lim_{n→∞} ℓn = 0
if ℓn denotes the maximum length of the intervals of the family {Fn,k }k∈In .
For instance, we could have En := [−n, n + 1/2ⁿ), In := {0, ±1, ±2, ..., ±n2ⁿ},
Fn,k := [k/2ⁿ, (k + 1)/2ⁿ), λn,k := (k + 1/2)/2ⁿ. We define the function
ξn := ∑_{k∈In} λn,k χ_{Fn,k}
and the observable αn := ξn (α). The observable αn is discrete and it has a
finite number of possible results since
µ^{αn}_σ({λn,k }k∈In ∪ {0}) = µ^α_σ(ξn⁻¹({λn,k }k∈In ∪ {0})) = µ^α_σ(R) = 1, ∀σ ∈ Σ.
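The dyadic step functions of the example above can be sketched as follows (the choices of En, Fn,k and λn,k are those assumed in the example):

```python
import math

def xi(n, x):
    """Step function xi_n: lambda_{n,k} on F_{n,k} = [k/2^n, (k+1)/2^n), 0 outside E_n."""
    if not (-n <= x < n + 2.0 ** -n):      # outside E_n = [-n, n + 1/2^n)
        return 0.0
    k = math.floor(x * 2 ** n)             # index of the dyadic cell containing x
    return (k + 0.5) / 2 ** n              # lambda_{n,k} = (k + 1/2)/2^n, non-null

x = 0.7
ns = (1, 4, 8, 12)
ell = [2.0 ** -n for n in ns]              # maximal interval length ell_n
errs = [abs(xi(n, x) - x) for n in ns]

# |xi_n(x) - x| <= ell_n on E_n, so xi_n(x) -> x as n -> infinity:
assert all(e <= l for e, l in zip(errs, ell))
```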
Proposition. Let σ ∈ Σ. Then the sequences {Aσ,th (αn )} and {Dσ,th (αn )}
are convergent iff ∫_R ξ² dµ^α_σ < ∞; if these sequences are convergent then
lim_{n→∞} Aσ,th (αn ) = ∫_R ξ dµ^α_σ and
lim_{n→∞} Dσ,th (αn ) = (∫_R (ξ − ∫_R ξ dµ^α_σ)² dµ^α_σ)^{1/2}.
and
Dσ,th (αn ) := (∑_{k∈In} (λn,k − Aσ,th (αn ))² p(αn ({λn,k }), σ))^{1/2}
= (∫_R (ξn − Aσ,th (αn ))² dµ^α_σ)^{1/2} = (∫_R ξn² dµ^α_σ − (Aσ,th (αn ))²)^{1/2},
since p(αn ({λn,k }), σ) = p(α(ξn⁻¹({λn,k })), σ) = µ^α_σ(Fn,k ) for all k ∈ In (the
equalities above are in agreement with what is proved more generally in
19.1.23).
First we assume ∫_R ξ² dµ^α_σ < ∞. Because the measure µ^α_σ is finite, we have
ξ ∈ L¹(R, A(dR ), µ^α_σ) (cf. 11.1.3) and hence |ξ| + ℓn ∈ L¹(R, A(dR ), µ^α_σ) since
1R ∈ L¹(R, A(dR ), µ^α_σ) (cf. 8.2.6). Moreover,
|ξn (x)| ≤ |x| + ℓn and ξn (x) → x as n → ∞, ∀x ∈ R.
Then, by Lebesgue's dominated convergence theorem (cf. 8.2.11) we have
∫_R ξn dµ^α_σ → ∫_R ξ dµ^α_σ as n → ∞.
Also, we have (|ξ| + ℓn )² ∈ L¹(R, A(dR ), µ^α_σ) since 1R ∈ L²(R, A(dR ), µ^α_σ) (cf.
11.1.2a). Moreover,
ξn²(x) ≤ (|x| + ℓn )² and ξn²(x) → x² as n → ∞, ∀x ∈ R.
Then, by Lebesgue's dominated convergence theorem we have
∫_R ξn² dµ^α_σ → ∫_R ξ² dµ^α_σ as n → ∞
and hence
(∫_R ξn² dµ^α_σ − (∫_R ξn dµ^α_σ)²)^{1/2} → (∫_R ξ² dµ^α_σ − (∫_R ξ dµ^α_σ)²)^{1/2}
= (∫_R (ξ − ∫_R ξ dµ^α_σ)² dµ^α_σ)^{1/2}.
Next and conversely we assume that the sequences {Aσ,th (αn )} and
{Dσ,th (αn )} are convergent. Then the sequence {∫_R ξn² dµ^α_σ} is convergent since
(Dσ,th (αn ))² + (Aσ,th (αn ))² = ∫_R ξn² dµ^α_σ,
and hence (cf. 2.1.9)
∃M ∈ [0, ∞) such that ∫_R ξn² dµ^α_σ ≤ M, ∀n ∈ N.
By Fatou's lemma (cf. 8.1.20), this implies that ∫_R ξ² dµ^α_σ ≤ M.
(b) Suppose that an observable α and a state σ ∈ Σ are so that α is evaluable in σ
and ∆σ α = 0. Then ∫_R (ξ − ⟨α⟩σ)² dµ^α_σ = 0, and hence (cf. 8.1.12a) x − ⟨α⟩σ = 0
µ^α_σ-a.e. on R, and hence µ^α_σ(R − {⟨α⟩σ}) = 0, and hence µ^α_σ({⟨α⟩σ}) = 1. This
means that there is a result λ which is obtained with certainty for any number
of copies prepared in the state σ (then, obviously, ⟨α⟩σ = λ).
Conversely, suppose that for σ ∈ Σ there is λ ∈ R such that µ^α_σ({λ}) = 1. Then
µ^α_σ(R − {λ}) = 0. Thus, µ^α_σ is the Dirac measure in λ and we have (cf. 8.3.6)
∫_R ξ² dµ^α_σ = λ² < ∞ (hence, α is evaluable in σ), ⟨α⟩σ = λ, ∆σ α = 0.
Proof. Since µ^{ϕ(α)}_σ(E) = µ^α_σ(ϕ⁻¹(E)) for all E ∈ A(dR ), we obtain the statement
from 8.3.11 (π is there what ϕ is here).
Proof. If α is bounded then there exists k ∈ [0, ∞) such that ξ²(x) = x² ≤ k for
all x ∈ spα , and hence µ^α_σ-a.e. on R for every σ ∈ Σ, since µ^α_σ(R − spα ) = 0 (cf.
19.1.17). The result then follows from 8.2.6.
(∆σ απ )² = (0 − ⟨απ⟩σ)² µ^{απ}_σ({0}) + (1 − ⟨απ⟩σ)² µ^{απ}_σ({1})
Classical statistical theories, although very diverse, have some common features,
some of which we set out here axiomatically. As before, we denote by Σ and Π the
families of states and of propositions that define a fixed physical system, which is
assumed in this section to be described by a classical statistical theory. By Σ, Π
and p we denote what was denoted by Σ̂, Π̂, p̂ in 19.1.5 (cf. 19.1.12).
19.2.2 Remark. The reason behind axiom C1 is that, in a classical theory, the
determination of any proposition π for any copy prepared in any state σ is held to be
implementable in such an “unobtrusive” way that, immediately after the proposition
has been determined, the copy can still be considered as if it had just been prepared
in the state σ. That is to say, recalling that π stands for an equivalence class, there
is an event which belongs to the class π and which requires an interaction, between
a copy prepared in σ and the apparatus in which the event possibly occurs, which
involves so little transfers of e.g. energy, momentum, angular momentum that they
can be considered negligible, so that it is as if nothing had happened to the copy,
which therefore can be considered still in the state σ. This makes it possible to
determine two propositions, one immediately after the other, in the same copy and
assume that the determination of the first of them has had no influence on the
outcome of the determination of the second. Moreover, it makes it possible to
consider immaterial the order in which the two propositions are determined.
In any case, the main role of the condition {s} ∈ A is to make it possible to claim that
there exists a probability measure µs on A such that
p(π, s) = µs (Sπ ), ∀π ∈ Π
(and indeed in 19.2.5c we will see that this condition implies that µs is the Dirac
measure in s, which requires the condition {s} ∈ A in order to be defined).
then
⟨α⟩σ = ∫_S ϕα dµσ and ∆σ α = (∫_S (ϕα − ⟨α⟩σ)² dµσ)^{1/2};
µ^α_s(E) = p(α(E), s), ∀E ∈ A(dR )
(cf. 19.1.8). Hence, in view of 19.2.3 we have µ^α_s(E) ∈ {0, 1} for all E ∈ A(dR ). By
8.3.7, this implies that there exists αs ∈ R so that µ^α_s is the Dirac measure in αs ,
and this implies (cf. 8.3.6) that
∫_R ξ² dµ^α_s = αs² < ∞, ⟨α⟩s = ∫_R ξ dµ^α_s = αs , ∆s α = (∫_R (x − ⟨α⟩s)² dµ^α_s(x))^{1/2} = 0.
Now we prove that spα = cl(Rϕα ), the closure of the range of ϕα . If λ ∈ Rϕα then
there exists s ∈ S such that s ∈ ϕα⁻¹({λ}) and hence
µ^α_s({λ}) = µs (Sα({λ}) ) = µs (ϕα⁻¹({λ})) = 1;
this proves that Rϕα ⊂ spα and hence cl(Rϕα ) ⊂ spα since spα is closed (cf. 19.1.17).
If conversely λ ∉ cl(Rϕα ) then there exists ε > 0 such that ϕα (s) ∉ (λ − ε, λ + ε), and
hence µs (ϕα⁻¹((λ − ε, λ + ε))) = 0, for all s ∈ S (since ϕα (s) ∉ (λ − ε, λ + ε) is
equivalent to s ∉ ϕα⁻¹((λ − ε, λ + ε))); then we have, for every σ ∈ Σ,
µ^α_σ((λ − ε, λ + ε)) = µσ (ϕα⁻¹((λ − ε, λ + ε)))
= ∫_S χ_{ϕα⁻¹((λ−ε,λ+ε))} (s) dµσ (s)
= ∫_S µs (ϕα⁻¹((λ − ε, λ + ε))) dµσ (s) = 0,
we note that if µ^α_σ(R − Dψ ) = 0 for all σ ∈ Σ then obviously µ^α_s(R − Dψ ) = 0 for
all s ∈ S. Finally, we note that if µ^α_s(R − Dψ ) = 0 for all s ∈ S then Sα(R−Dψ ) = ∅,
and hence µ^α_σ(R − Dψ ) = µσ (Sα(R−Dψ ) ) = 0 for all σ ∈ Σ.
Thus, the condition Rϕα ⊂ Dψ holds true if and only if the observable ψ(α) can
be defined (cf. 19.1.23). In this case, ψ ◦ ϕα is an A-measurable function from S to
R such that Dψ◦ϕα = S, and we have
ϕψ(α) (s) = ⟨ψ(α)⟩s = ∫_R ψ dµ^α_s = ψ(⟨α⟩s) = ψ(ϕα (s)) = (ψ ◦ ϕα )(s), ∀s ∈ S,
where the first equation holds by statement b, the second by 19.1.23, and the third
because µ^α_s is the Dirac measure in ⟨α⟩s (cf. statement a).
π ≤ π ′ if Sπ ⊂ Sπ′ .
For each pair {π, π ′ } of elements of Π, the g.l.b. exists and we have inf{π, π ′ } =
π ∧ π ′ , and the l.u.b. exists and we have sup{π, π ′ } = π ∨ π ′ . Further, we have:
π ∧ (π ′ ∨ π ′′ ) = (π ∧ π ′ ) ∨ (π ∧ π ′′ ) and
π ∨ (π ′ ∧ π ′′ ) = (π ∨ π ′ ) ∧ (π ∨ π ′′ ), ∀π, π ′ , π ′′ ∈ Π
π0 ≤ π and π ≤ π1 , ∀π ∈ Π;
π ∧ (¬π) = π0 and π ∨ (¬π) = π1 , ∀π ∈ Π;
¬(¬π) = π, ∀π ∈ Π;
π ≤ π ′ ⇒ ¬π ′ ≤ ¬π, ∀π, π ′ ∈ Π
the family Σ of all states to the family of all probability measures on A, so that A
is the σ-algebra generated by the family {Sπ : π ∈ Π} and
p(π, σ) = µσ (Sπ ), ∀π ∈ Π, ∀σ ∈ Σ
(in this subsection, we denote by (S, A) an abstract measurable space and therefore
we must denote the family of microstates by a different symbol than the symbol S
used before; in what follows the family of microstates is denoted by the symbol Σ0 ).
Further, there is an injective mapping α 7→ ϕα from the family of all observables
to the family of all A-measurable real functions so that for each observable α we
have:
• Sα(E) = ϕα⁻¹(E), ∀E ∈ A(dR );
• α is a bounded observable iff ϕα is a bounded function;
• for a state σ, α is evaluable in σ iff ∫_S ϕα² dµσ < ∞;
• for a state σ, if α is evaluable in σ then
⟨α⟩σ = ∫_S ϕα dµσ and ∆σ α = (∫_S (ϕα (s) − ⟨α⟩σ)² dµσ (s))^{1/2};
19.2.9 Remark. If we consider only one observable in the general statistical theory
of Section 19.1 (and therefore in a quantum theory as a special case), we can note a
similarity between the nature of the probabilities that played a role in that situation
(cf. 19.1.8 and 19.1.9) and the nature of probabilities in a classical statistical theory.
In fact, for a state σ, while the nature of the probability p(π, σ) for a general
proposition π is completely unspecified in the general statistical theory (and indeed
p(π, σ) will be obtained in a quantum theory by an algorithm altogether different
from the one used in a classical theory, cf. 19.3.1 and 19.2.3), if a fixed observable
α is considered then there is a σ-algebra A so that an element E of A represents
the proposition “the position of the pointer is in the section of the dial identified
with E” and the probability of this proposition is µ^α_σ(E), where µ^α_σ is a probability
measure on A, and this is similar to what happens in a classical statistical theory.
Actually, this is due to the classical nature we assumed for the dial and the pointer
19.3.1 Axiom (Axiom Q1). A quantum theory is a statistical theory for which
a separable Hilbert space H is assumed to exist so that:
19.3.2 Remarks.
(a) For all P ∈ P(H) and W ∈ W(H) we have 0 ≤ tr(P W ) ≤ 1 (cf. 18.3.8b).
Thus, condition c in 19.3.1 is consistent with the fact that p is a probability
function.
(b) The structure which emerges from 19.3.1 is a truly statistical one. In a statis-
tical theory, the probabilistic aspects become trivial only when there is a pair
proposition-state (π, σ) such that the probability p(π, σ) is either 0 or 1: the
proposition π is then either never true or always true in all copies of the system
prepared in the state σ. Consider then a proposition π such that Pπ ≠ OH and
Pπ ≠ 1H (this is possible if the dimension of H is greater than one, which we
assume), and a state σ such that Wσ = Au , with u ∈ H̃ (cf. 18.3.2b). Then we
have
p(π, σ) = (u|Pπ u) = ‖Pπ u‖²,
and hence p(π, σ) ≠ 0 and p(π, σ) ≠ 1 whenever u ∉ NPπ ∪ RPπ (cf. 13.1.3c).
Now, there are infinitely many operators Au such that u ∉ NPπ ∪ RPπ .
(c) There are quantum theories, which are said to be “with superselection rules”, for
which the mappings of conditions a and b in 19.3.1 are not surjective. These
theories are outside the scope of this book. Thus, all quantum theories we
discuss are “without superselection rules”.
Proof. We have
tr(Pπ0 Wσ ) = p(π0 , σ) = 0 and tr(Pπ1 Wσ ) = p(π1 , σ) = 1, ∀σ ∈ Σ.
Since the mapping Σ ∋ σ ↦ Wσ ∈ W(H) is surjective, this implies (cf. 18.3.2b)
(u|Pπ0 u) = 0 = (u|OH u) and (u|Pπ1 u) = 1 = (u|1H u) , ∀u ∈ H̃,
and hence
19.3.5 Remarks.
(a) We always assume that the dimension of the Hilbert space H in 19.3.1 is greater
than one, for otherwise the only projections in H would be OH and 1H and
hence the only propositions of the system would be the trivial propositions π0
and π1 .
(b) Let σ ∈ Σ be a state such that Wσ is not a one dimensional projection. Then (cf.
18.3.6) there exist countable families {un }n∈I of elements of H̃ and {wn }n∈I of
elements of (0, 1], so that I contains more than one index, Aui ≠ Auk if i ≠ k,
∑_{n∈I} wn = 1, and
Wσ f = ∑_{n∈I} wn Aun f, ∀f ∈ H. (1)
If we denote by σn the element of Σ such that Wσn = Aun , then we have (cf.
18.3.5c)
p(π, σ) = tr(Pπ Wσ ) = ∑_{n∈I} wn tr(Pπ Aun ) = ∑_{n∈I} wn p(π, σn ), ∀π ∈ Π.
(cf. 18.3.4). In this case, if σn denotes the state which is such that Wσn = Wn ,
the state σ is said to be a mixture of the family {σn }n∈I of states, and the
elements of the family {wn }n∈I are said to be the weights of the decomposition.
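A minimal 2×2 numerical sketch of the decomposition p(π, σ) = ∑n wn p(π, σn); the vectors, weights and proposition below are hypothetical, and real coefficients are used for simplicity:

```python
def proj(u):
    """One-dimensional projection A_u for a real unit vector u (2x2 matrix)."""
    return [[u[i] * u[j] for j in range(2)] for i in range(2)]

def tr_prod(A, B):
    """tr(A B) for 2x2 matrices."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

u1, u2 = [1.0, 0.0], [0.0, 1.0]          # hypothetical orthonormal unit vectors
w = [0.3, 0.7]                           # weights, w1 + w2 = 1
P = proj([2 ** -0.5, 2 ** -0.5])         # P_pi: projection onto (u1 + u2)/sqrt(2)

# W_sigma = w1 A_{u1} + w2 A_{u2}, a statistical operator:
W = [[w[0] * proj(u1)[i][j] + w[1] * proj(u2)[i][j] for j in range(2)]
     for i in range(2)]

p_mixture = tr_prod(P, W)
p_parts = sum(wn * tr_prod(P, proj(un)) for wn, un in zip(w, [u1, u2]))
assert abs(p_mixture - p_parts) < 1e-12  # p(pi, sigma) = sum_n w_n p(pi, sigma_n)
```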
(c) A state σ such that Wσ is a one-dimensional projection cannot be decomposed
into a mixture of other states (cf. 18.3.7). Thus, the probabilities p(π, σ) that
arise in connection with σ are not mixtures of probabilities intrinsic to the
quantum theory that is being discussed and probabilities of a different kind;
they are, that is, purely quantum probabilities. For this reason, a state σ ∈ Σ
such that Wσ is a one-dimensional projection is said to be a purely quantum
state, or simply a pure state. Since the mapping Ĥ ∋ [u] ↦ Au ∈ P(H)
is a bijection from the family Ĥ of all rays of H onto the family of all one-
dimensional projections in H (cf. 13.1.13a), if we denote by Σ0 the family of
all pure states we have a bijection Σ0 ∋ σ ↦ [uσ ] ∈ Ĥ, where [uσ ] denotes, for
any σ ∈ Σ0 , the ray such that Wσ = Auσ , i.e. such that (cf. 18.3.2b)
(d) Suppose we are given a countable family {σn }n∈I of pure states, and for each
n ∈ I let un be an element of H̃ such that Wσn = Aun . Moreover, suppose we
are given a family {αn }n∈I of complex numbers so that ∑_{n∈I} αn un converges
(if it is a series) and ‖∑_{n∈I} αn un ‖ = 1. Then, the bijectivity of the mapping
Σ0 ∋ σ ↦ [uσ ] ∈ Ĥ allows considering the pure state σp which is such that
[uσp ] = [∑_{n∈I} αn un ], i.e. such that Wσp = Au with u := ∑_{n∈I} αn un . This
state is said to be a coherent superposition of the family {σn } of pure states.
Note that, in spite of its name, the state σp actually depends not only on the
family {σn } but also on the choice of the representative un in each equivalence
class [un ].
The bijectivity of the mapping Σ0 ∋ σ ↦ [uσ ] ∈ Ĥ is called the superposition
principle.
We point out that, on the basis of the family {σn }n∈I of pure states considered
above, we can obtain a mixed state for any family {wn }n∈I of elements of (0, 1]
such that ∑_{n∈I} wn = 1, defined as the state σm such that
Wσm f = ∑_{n∈I} wn Wσn f = ∑_{n∈I} wn Aun f, ∀f ∈ H.
Clearly, this state depends only on the equivalence classes [un ] and not on their
representatives.
(e) Suppose we are given an o.n.s. {un }n∈I in H and a family {αn }n∈I of complex
numbers so that ∑_{n∈I} |αn |² = 1. Then we can consider the pure state σp ∈ Σ0
which is such that Wσp = Au , with u := ∑_{n∈I} αn un , or else we can consider the
mixed state σm which is such that Wσm f = ∑_{n∈I} |αn |² Aun f for each f ∈ H.
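The difference between the coherent superposition σp and the mixture σm of remark (e) can be sketched in a 2×2 example (hypothetical basis vectors, α1 = α2 = 1/√2, real coefficients for simplicity): testing the proposition with Pπ = Au separates the two states.

```python
def proj(u):
    """One-dimensional projection A_u for a real unit vector u (2x2 matrix)."""
    return [[u[i] * u[j] for j in range(2)] for i in range(2)]

def tr_prod(A, B):
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

u1, u2 = [1.0, 0.0], [0.0, 1.0]              # an o.n.s. (hypothetical)
a = 2 ** -0.5                                # alpha_1 = alpha_2 = 1/sqrt(2)
u = [a * u1[i] + a * u2[i] for i in range(2)]

W_pure = proj(u)                             # W_{sigma_p} = A_u (coherent superposition)
W_mix = [[0.5 * (proj(u1)[i][j] + proj(u2)[i][j])   # W_{sigma_m}, weights |alpha_n|^2
          for j in range(2)] for i in range(2)]

P = proj(u)                                  # proposition pi with P_pi = A_u
p_pure, p_mix = tr_prod(P, W_pure), tr_prod(P, W_mix)
assert abs(p_pure - 1.0) < 1e-12             # certain in the superposition
assert abs(p_mix - 0.5) < 1e-12              # only probability 1/2 in the mixture
```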
(f) For a proposition π ∈ Π we have (cf. 19.3.4 and 19.1.7, recalling that we “drop
the carets” in conformity with 19.1.12)
Pπ ≠ OH ⇔ π ≠ π0 ⇔ [∃σ ∈ Σ s.t. p(π, σ) ≠ 0].
Moreover, for σ ∈ Σ0 we have p(π, σ) = (uσ |Pπ uσ ) = ‖Pπ uσ ‖² (cf. remark c),
and therefore (cf. 13.1.3c)
p(π, σ) = 1 ⇔ uσ ∈ RPπ and p(π, σ) = 0 ⇔ uσ ∈ NPπ ;
these equivalences show that, for each σ ∈ Σ0 , there are propositions π ∈ Π
such that p(π, σ) ∉ {0, 1} (e.g., assume π such that Pπ = Au , with u ∈ H̃ and
|(u|uσ )| ∉ {0, 1}); from the first equivalence we also have
Pπ ≠ OH ⇔ RPπ ≠ {0H } ⇔ [∃σ ∈ Σ0 s.t. p(π, σ) = 1].
For a proposition π ∈ Π and a state σ ∈ Σ we have (cf. 18.3.9 and 18.3.11)
p(π, σ) = 1 ⇔ RWσ ⊂ RPπ ⇔ Pπ Wσ = Wσ ⇔ Pπ Wσ Pπ = Wσ
and
p(π, σ) = 0 ⇔ RWσ ⊂ NPπ ⇔ Pπ Wσ = OH ⇔ Pπ Wσ Pπ = OH .
If Pπ is a one-dimensional projection, i.e. Pπ = Au with u ∈ H̃, then for a state
σ ∈ Σ we have p(π, σ) = 1 if and only if σ is a pure state and [uσ ] = [u] (cf.
18.3.10).
(g) If, for two propositions π, π ′ ∈ Π, we have
{σ ∈ Σ0 : p(π, σ) = 1} = {σ ∈ Σ0 : p(π ′ , σ) = 1},
then π = π ′ . In fact, the above condition can be written as
{u ∈ H̃ : (u|Pπ u) = 1} = {u ∈ H̃ : (u|Pπ′ u) = 1},
and this can be written as
{u ∈ H̃ : ‖Pπ u‖ = ‖u‖} = {u ∈ H̃ : ‖Pπ′ u‖ = ‖u‖},
and this is equivalent to RPπ = RPπ′ , in view of 13.1.3c. Then, Pπ = Pπ′ and
hence π = π ′ .
19.3.6 Definitions. Let (X, A) be a measurable space and α an X-valued observable.
We define the projection valued mapping
Pα : A → P(H), E ↦ Pα (E) := P_{α(E)} .
For every u ∈ H̃, the function µ^{Pα}_u (cf. Section 13.3 for the definition of µ^P_u)
is a probability measure on A since µ^{Pα}_u = µ^α_{σu} if σu is the pure state such that
Wσu = Au :
µ^{Pα}_u(E) = (u|P_{α(E)} u) = tr(P_{α(E)} Au ) = p(α(E), σu ) = µ^α_{σu}(E), ∀E ∈ A.
19.3.8 Remark. If the assumption is made that every mapping α : A(dR ) → Π, for
which µ^α_σ is a probability measure for all σ ∈ Σ, must be considered an observable,
then every self-adjoint operator in H represents an observable. Indeed, if A is a
self-adjoint operator in H, we can define the mapping αA : A(dR ) → Π by letting
αA (E) be the proposition such that P_{αA(E)} = P^A(E), for all E ∈ A(dR ). Then we
have, for each σ ∈ Σ,
µ^{αA}_σ(E) = p(αA (E), σ) = tr(P^A(E)Wσ ), ∀E ∈ A(dR );
now, this is the projection valued measure of the self-adjoint operator Pπ (cf.
15.3.4D and 13.1.3e); thus Aαπ = Pπ . Furthermore, for every projection P ∈ P(H)
November 17, 2014 17:34 World Scientific Book - 9.75in x 6.5in HilbertSpace page 643
there exists a proposition π such that P = Pπ, owing to the surjectivity of the map-
ping Π ∋ π ↦ Pπ ∈ P(H).
and hence (cf. 15.2.2) J^{Pα}_ϕ = A_{ϕ(α)}. If, further, (X, A) = (R, A(dR)), then (cf.
15.3.1)
ϕ(Aα) = J^{P^{Aα}}_ϕ = J^{Pα}_ϕ = A_{ϕ(α)}.
c: Assume first that α is a discrete observable, and let {λn}_{n∈I} be a countable
family of real numbers such that µ^α_σ({λn}_{n∈I}) = 1 for all σ ∈ Σ and such that (cf.
19.1.19)
∀n ∈ I, ∃σ ∈ Σ such that µ^α_σ({λn}) ≠ 0.
Then µ^α_σ(R − {λn}_{n∈I}) = 0 for all σ ∈ Σ, and hence (by the monotonicity of µ^α_σ)
µ^α_σ({λ}) = 0 for all σ ∈ Σ and for each λ ∈ R − {λn}_{n∈I}; moreover, the displayed
condition gives (cf. 19.3.5f)
P^{Aα}({λn}) ≠ OH, ∀n ∈ I.
Thus, {λn}_{n∈I} = σp(Aα) by 15.2.5, and hence µ^α_σ(R − σp(Aα)) = 0 for all σ ∈ Σ,
and hence (cf. 19.3.5f) P^{Aα}(R − σp(Aα)) = OH, and hence (cf. 15.3.4B) there
exists a c.o.n.s. in H whose elements are eigenvectors of Aα.
Assume, next and conversely, that there exists a c.o.n.s. in H whose elements
are eigenvectors of Aα. Then (cf. 15.3.4B) P^{Aα}(R − σp(Aα)) = OH, and hence (cf.
19.3.5f) µ^α_σ(R − σp(Aα)) = 0 for all σ ∈ Σ. Since σp(Aα) is countable (cf. 12.4.20C),
this proves that the observable α is discrete.
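In a finite-dimensional H every self-adjoint operator is discrete in this sense, and the c.o.n.s. of eigenvectors can be exhibited directly; a small numerical sketch (the matrix is randomly generated, purely for illustration):

```python
import numpy as np

# In finite dimension every self-adjoint operator is "discrete": eigh returns
# a complete orthonormal system of eigenvectors (illustrative sketch).
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                      # a self-adjoint operator on R^4
evals, V = np.linalg.eigh(A)           # columns of V: eigenvector c.o.n.s.

orthonormal = bool(np.allclose(V.T @ V, np.eye(4)))
# P^A(R - sigma_p(A)) = O_H: the eigenprojections resolve the identity.
resolution = sum(np.outer(V[:, k], V[:, k]) for k in range(4))
complete = bool(np.allclose(resolution, np.eye(4)))
```

The genuinely non-discrete case (continuous spectrum) only appears in infinite dimension, which is why the finite model is a caricature.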
(a) For an observable α, a real number λ, and a state σ, the number µ^α_σ({λ}) is the
probability of obtaining λ as result for α in the state σ. For α and λ we have
(cf. 15.2.5 and 19.3.5f, and recall that P^{Aα} = Pα)
ple since it is easy to see that, for each n ∈ I, statistical operators W ex-
ist so that tr(P_{ϕ(α)}((nδ − δ/2, nδ + δ/2))W) = 1, and hence states σ so that
µ^{ϕ(α)}_σ((nδ − δ/2, nδ + δ/2)) = 1 (however, it may be difficult to attain procedures
which define such states in practice; among these states there are the states
σ for which µ^{ϕ(α)}_σ({nδ}) = 1; states σ for which only the milder condition
µ^{ϕ(α)}_σ((nδ − δ/2, nδ + δ/2)) = 1 is required may be easier to implement). Now,
of 19.3.9 with Dψ ∈ A(dR), can give an observable ψ(α) which is not discrete.
This can be seen from
µ^{ψ(α)}_σ(ψ({λn}_{n∈I})) = µ^α_σ(ψ⁻¹(ψ({λn}_{n∈I}))) ≥ µ^α_σ({λn}_{n∈I}) = 1, ∀σ ∈ Σ
(we have used the monotonicity of µ^α_σ and {λn}_{n∈I} ⊂ ψ⁻¹(ψ({λn}_{n∈I}))). Thus,
the discreteness of the observable α is a property which is shared by all functions
of α.
What we have just seen shows that a discrete observable is an observable so
that at least in principle there are realistic states (i.e. preparation procedures
which do not demand absolute precision for their implementation) in which an
exact result is obtainable with certainty. An observable is said to be quantized
if it is discrete. This idea was expressed by John von Neumann as follows: “In
the method of observation of classical mechanics ... we assign to each quantity
α in each state [what is meant here is ’in each microstate’] a completely deter-
mined value. At the same time, however, we recognize that each conceivable
measuring apparatus, as a consequence of the imperfections of human means
of observations (which result in the reading of the position of a pointer or in
locating the blackening of a photographic plate with only limited accuracy), can
furnish this value only with a certain (never vanishing) margin of error. This
margin of error can, by sufficient refinement of the method of measurement, be
made arbitrarily close to zero but it is never exactly zero. One expects that
this will also be true in quantum theory for those quantities which ... are not
quantized; for example, for the cartesian coordinates of an electron (which can
take on every value between −∞ and +∞, and whose operators have continuous
spectra [what is meant here is that their point spectra are empty]). On the other
hand, for those quantities which ... are ’quantized’, the converse is true: since
these are capable of assuming only discrete values, it suffices to observe them
with just sufficient precision that no doubt can exist as to which one of these
’quantized’ values is occurring. That value is then as good as ’observed’ with
absolute precision. ... This division into quantized and unquantized quantities
corresponds ... to the division into quantities α with an operator Aα that has
a pure discrete spectrum [what is meant here is that P^{Aα}(R − σp(Aα)) = OH],
and into such quantities for which this is not the case. And it was for the
former, and only for these, that we found a possibility of an absolutely precise
measurement — while the latter could be observed only with arbitrarily good
(but never absolute) precision” (Neumann, 1932, p.221–222).
(b) Let α be an observable, and suppose that λ ∈ σc(Aα) (cf. 12.4.22). Then the
result λ can never be obtained exactly with certainty, because λ ∉ σp(Aα).
However, from 19.3.10a and 19.3.5f we have that
∀ε > 0, ∃σ ∈ Σ0 such that µ^α_σ((λ − ε, λ + ε)) = 1.
This means that the result λ can be obtained with certainty with arbitrarily
good precision. Thus, σc (Aα ) can be interpreted as representing a continuum
of possible results for α. To obtain one of these results with absolute precision
would require an absolutely precise preparation procedure (the situation is in
a certain sense opposite to the one discussed in remark a). The treatment of
quantum mechanics based on Hilbert space does not allow these rather idealis-
tic procedures, which are instead part of the treatments of quantum mechanics
that use the notion of “improper eigenfunction” to represent them. Now let von
Neumann speak. “It should be observed that the introduction of an eigenfunc-
tion which is ’improper’, i.e. which does not belong to Hilbert space, gives a less
good approach to reality than our treatment here. For such a method pretends
the existence of such states in which quantities with continuous spectra take
on certain values exactly, although this never occurs. Although such idealiza-
tions have often been advanced, we believe that it is necessary to discard them
on these grounds, in addition to their mathematical untenability” (Neumann,
1932, p.223). We point out that, in this respect, quantum mechanics in Hilbert
space is a construction which requires a smaller amount of idealization than
classical mechanics, which has at its core states (the microstates) in which all
quantities take on exact values with certainty.
What was considered as “mathematically untenable” by von Neumann in 1932
was Dirac’s notion of bras and kets (Dirac, 1958, 1947, 1935, 1930), which
was actually systematized later by the mathematical theory of “rigged Hilbert
spaces”. However, this theory relies heavily on von Neumann’s spectral theorem
and “we must emphasize that we regard the spectral theorem as sufficient for
any argument where a nonrigorous approach might rely on Dirac notation; thus,
we only recommend the abstract rigged space approach to readers with a strong
emotional attachment to the Dirac formalism” (Reed and Simon, 1980, 1972,
p.244).
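The situation discussed at the beginning of this remark, a result λ in the continuous spectrum obtainable with certainty only to within a precision ε, can be caricatured in a finite-dimensional model; the grid, λ and ε below are illustrative choices:

```python
import numpy as np

# Finite caricature (illustrative only): discretize a "position"-like
# observable as an operator which is diagonal in the chosen basis, with
# diagonal entries x, and build a state whose spectral measure is
# concentrated in (lam - eps, lam + eps): the result lam is then obtained
# with certainty to within the precision eps, though never exactly.
x = np.linspace(0.0, 1.0, 1001)        # grid of possible results
lam, eps = 0.5, 0.01

window = np.abs(x - lam) < eps         # indices with x in (lam-eps, lam+eps)
u = window.astype(float)
u /= np.linalg.norm(u)                 # unit vector supported in the window

mu_window = float(np.sum(u**2 * window))            # mu_u((lam-eps, lam+eps))
mean = float(np.sum(x * u**2))                      # mean result <alpha>_u
spread = float(np.sqrt(np.sum(x**2 * u**2) - mean**2))  # Delta_u alpha
```

Shrinking ε narrows the window and sharpens the state, mimicking the "arbitrarily good but never absolute" precision described in the text.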
the results then follow from the definitions given in 19.1.20, 19.1.21, 18.3.14, and
from 18.3.16.
b: Since α² := ξ²(α), from 19.3.9 we have A_{α²} = ξ²(Aα); since ξ²(Aα) = (Aα)²
(cf. 15.3.5), we have A_{α²} = (Aα)². Then the results follow from the results in part a
and from 18.3.17.
c: The results follow from the results in part a and from 18.3.16.
d: If σ is a pure state, then Wσ = Auσ (cf. 19.3.5c). Hence the results are the
particularization of the results of part c to the case of I containing just one index
(cf. also the definitions of hAiu and ∆u A in 15.2.3).
19.3.14 Remarks.
(a) If α is not a bounded observable then D_{Aα} ≠ H (cf. 19.3.10b) and therefore
Wσ Aα is not an element of T(H) and we cannot write ⟨α⟩σ = tr(Wσ Aα) even
if α is evaluable in σ (cf. also 18.3.19a).
(b) The results of 19.3.13c,d show that, if a mixed state σ ∈ Σ is the mixture of
a countable family {σn }n∈I of pure states with weights {wn }n∈I , then for an
observable α which is evaluable in σ we have that α is evaluable in every pure
state σn and
⟨α⟩σ = Σ_{n∈I} wn ⟨α⟩_{σn}.
This supports the idea (cf. 19.3.5b) that σ can be implemented using implemen-
tations of the states σn , by the procedure which is put into effect by carrying
out with probability wn the plan of action σn (this procedure is not precise,
because each time it is put into effect we do not know which plan of action σn is
actually going into effect, but it is not utterly at random, because the probabil-
ities wn are defined). However, we remind the reader that the decomposition of
a mixed state into a mixture is never unique, and thus σ cannot be interpreted
as being necessarily implemented by this procedure: in fact, as an equivalence
class, σ contains all the procedures that can be constructed as above, on the
basis of any decomposition of σ into a mixture of other states.
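The mixture formula ⟨α⟩_σ = Σ_n w_n ⟨α⟩_{σ_n} recalled above can be verified numerically in a toy model (the observable, vectors and weights below are arbitrary illustrative data):

```python
import numpy as np

# Sketch: for a mixed state W = sum_n w_n A_{u_n}, the mean
# <alpha>_sigma = tr(W A) is the weighted average of the pure-state means.
A = np.array([[1.0, 2.0],
              [2.0, -1.0]])                       # a bounded observable on R^2
u1 = np.array([1.0, 0.0])
u2 = np.array([1.0, 1.0]) / np.sqrt(2.0)
w1, w2 = 0.3, 0.7                                 # weights, w1 + w2 = 1

W = w1 * np.outer(u1, u1) + w2 * np.outer(u2, u2) # statistical operator
mean_mixed = float(np.trace(W @ A))               # <alpha>_sigma
mean_pure  = w1 * float(u1 @ A @ u1) + w2 * float(u2 @ A @ u2)
```

Linearity of the trace is the whole content of the identity, which is why it holds for every decomposition of W into a mixture.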
19.3.15 Remarks. The results we have obtained for a quantum theory are consis-
tent with the results we obtained for a general statistical theory in Section 19.1. This
could be checked systematically. We examine here five instances of this consistency.
(a) For an observable α we have spα = σ(Aα ) (cf. 19.3.10a). Then spα is closed
because such is the spectrum of every operator in H (cf. 10.4.6), and this is
consistent with 19.1.17.
(b) For an observable α and a function ϕ as in 19.3.9 we have A_{ϕ(α)} = ϕ(Aα).
Then, for a pure state σ, 19.3.13d, 15.3.2 and µ^{P^{Aα}}_{uσ} = µ^α_σ (cf. 19.3.6) imply
19.1.23.
(c) If an observable α is bounded then the operator Aα is bounded (cf. 19.3.10b),
and hence Aα is computable in Wσ for every σ ∈ Σ (cf. 18.3.18), and hence α
is evaluable in every σ ∈ Σ (cf. 19.3.13a). This is consistent with 19.1.25. In a
quantum theory we can also prove the converse of 19.1.25: if an observable α
is evaluable in every state, then α is evaluable in every pure state, and hence
D_{Aα} = H (cf. 19.3.13d), and hence α is bounded (cf. 19.3.10b).
(d) For each π ∈ Π we have A_{απ} = Pπ (cf. 19.3.8). Then, since Pπ is bounded (cf.
13.1.3d), A_{απ} is computable in Wσ for every σ ∈ Σ (cf. 18.3.18), and hence απ
is evaluable in every σ ∈ Σ (cf. 19.3.13a). Moreover, for each σ ∈ Σ, 19.3.13a
implies that
(a) λ ∈ σ(Aα);
(b) ∀ε > 0, ∃σε ∈ Σ0 such that α is evaluable in σε, |⟨α⟩_{σε} − λ| < ε, ∆_{σε}α < 2ε.
19.3.19 Remarks.
(a) The result in 19.3.18 confirms the interpretation that was made in 19.3.12a
of σp (Aα ), for an observable α in quantum mechanics: a real number λ is an
eigenvalue of Aα if and only if there exists a pure state σ so that λ is the result
that is always obtained when α is measured for any number of copies prepared in
σ; in fact (cf. 19.1.22b) the meaning of ∆σ α = 0 is that the same result is always
obtained for any number of measurements (then, of course, this result is also
the mean result). It is also clear from 19.3.13d that, for a pure state σ in which
α is evaluable, we have ⟨α⟩σ = λ and ∆σα = 0 if and only if λ is an eigenvalue
of Aα and uσ is an eigenvector of Aα corresponding to λ; and indeed this is
true if and only if (cf. 15.2.5e and 13.1.3c) µ^α_σ({λ}) = ‖P^{Aα}({λ})uσ‖² = 1, in
agreement with what was seen in 19.3.12a.
(b) For an observable α, a pure state σ, a real number λ, in remark a we saw that
µ^α_σ({λ}) = 1 if and only if λ is an eigenvalue of Aα and uσ is an eigenvector of
Aα corresponding to λ. More generally we have (cf. 19.3.5c)
µ^α_σ({λ}) = (uσ|P^{Aα}({λ})uσ).
Thus, if λ ∈ σp(Aα) and {u_{λ,d}}_{d∈Iλ} is an o.n.s. in H which is complete in
N_{Aα−λ1H}, i.e. so that V{u_{λ,d}}_{d∈Iλ} = N_{Aα−λ1H}, we have (cf. 15.2.5e and
13.1.10)
µ^α_σ({λ}) = Σ_{d∈Iλ} |(u_{λ,d}|uσ)|².
If the dimension of N_{Aα−λ1H} is one, i.e. if λ is a non-degenerate eigenvalue of
Aα, we have
µ^α_σ({λ}) = |(uλ|uσ)|²,
by 19.3.5f, since µ^α_σ({λ}) = p(α({λ}), σ) and P_{α({λ})} = P^{Aα}({λ}).
d: Condition c obviously implies P^{Aα}({λ}) ≠ OH and hence λ ∈ σp(Aα) (cf.
15.2.5). In 19.1.22b it was proved that condition b implies ⟨α⟩σ = λ.
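The expression of µ^α_σ({λ}) through an o.n.s. of eigenvectors, discussed in 19.3.19b, can be checked in a small example with a degenerate eigenvalue (toy operator and state, not from the text):

```python
import numpy as np

# Sketch: for lam in sigma_p(A) with orthonormal eigenvectors {u_{lam,d}},
# mu_sigma({lam}) = sum_d |(u_{lam,d}|u_sigma)|^2 = ||P^A({lam}) u_sigma||^2.
A = np.diag([2.0, 2.0, 5.0])            # lam = 2 is a degenerate eigenvalue
u_lam = [np.array([1.0, 0.0, 0.0]),     # o.n.s. spanning N_{A - 2*1_H}
         np.array([0.0, 1.0, 0.0])]
u_sigma = np.array([3.0, 0.0, 4.0]) / 5.0   # a unit vector (pure state)

mu = sum(abs(v @ u_sigma) ** 2 for v in u_lam)         # Born rule, summed over d
P_lam = sum(np.outer(v, v) for v in u_lam)             # P^A({2})
mu_proj = float(np.linalg.norm(P_lam @ u_sigma) ** 2)  # same number
```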
19.3.21 Remark. From 19.3.20 we have that, for an observable α and a state σ
in which α is evaluable, ∆σα = 0 is possible if and only if σp(Aα) ≠ ∅; moreover, if
σp(Aα) ≠ ∅ then ∆σα = 0 holds if and only if there exists an eigenvalue λ of Aα
such that µ^α_σ({λ}) = 1, namely an eigenvalue of Aα which is the result that is always
obtained when α is measured in any number of copies prepared in σ.
From 19.3.13c we also have that an observable α is evaluable in a state σ with
∆σα = 0 if and only if every collection of pure states, into a mixture of which σ can be
decomposed, consists of states represented by eigenvectors of Aα corresponding
to ⟨α⟩σ, which is then the eigenvalue λ of Aα such that µ^α_σ({λ}) = 1, or equivalently
such that RWσ ⊂ R_{P^{Aα}({λ})}. If the state σ is pure, we have ∆σα = 0 if and only
if uσ is an eigenvector of Aα; if this holds true, then ⟨α⟩σ is the eigenvalue of Aα
to which uσ corresponds. Thus, we have derived the results of 19.3.19a as a special
case of the results obtained in the present remark.
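The equivalence between ∆_σα = 0 and u_σ being an eigenvector of A_α, for a pure state, is easy to verify numerically (illustrative matrix and vectors):

```python
import numpy as np

# Sketch: for a pure state u, Delta_u(alpha)^2 = (u|A^2 u) - (u|A u)^2
# vanishes exactly when u is an eigenvector of A (toy matrix, illustration).
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])                     # eigenvectors (1, +-1)/sqrt(2)

def variance(A, u):
    """Squared uncertainty (Delta_u alpha)^2 for a unit vector u."""
    mean = float(u @ A @ u)
    return float(u @ A @ A @ u) - mean ** 2

u_eig = np.array([1.0, 1.0]) / np.sqrt(2.0)    # eigenvector, eigenvalue 1
u_not = np.array([1.0, 0.0])                   # not an eigenvector

var_eig, var_not = variance(A, u_eig), variance(A, u_not)
```

Zero variance means the same result in every measurement, which is exactly the "exact result with certainty" of the text.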
Pπ′ := U Pπ U⁻¹, ∀π ∈ Π,
The subject of this section is sometimes known as von Neumann’s and Lüders’
reduction postulates. We start by examining in 19.4.1 two experiments which we
consider to be paradigmatic of what we later analyse in the abstract.
As before, in this and in the following sections Σ and Π denote the families
of equivalence classes (cf. 19.1.12) of states and propositions of a given quantum
system (i.e., a system described by a quantum theory), and H denotes the Hilbert
space in which they are represented as summarized in 19.3.22.
19.4.1 Remarks.
emulsion). Even when these catastrophic events do not occur, for other determi-
nation techniques, the analysis of the process of determination of a proposition
in the physics of microparticles (initiated by Werner Heisenberg) leads to the
conclusion that the determination of a proposition is a process which is bound
to alter in a substantial way the copy for which the determination is carried
out. As a matter of fact, an alteration takes place in classical physics too, but
in classical physics it is assumed that the determination of any proposition in
any state can always be implemented by probing the copy in such a way that
the alteration of the copy is negligible (cf. 19.2.2). Since this is not the case for
microparticles, in quantum mechanics (which deals mostly with microparticles)
we must acknowledge that a proposition is true in a copy, or it is not true,
only upon its determination, and not in general also immediately after that.
However the case may be that the experimental set-up which is used for the
determination of a proposition π can be modified so that it selects copies for
which π is certainly true: if π is determined for any number of copies “emerg-
ing” from the modified set-up, then π will be found to be true in all of them.
In what follows, we provide two examples of this sort.
(b) As a first example, we consider the method depicted in fig. 1 (all figures are
on page 656) for determining the magnitude of the linear momentum (in what
follows, briefly, “momentum”) of a charged particle. To the left of the screen S1
a particle of known charge e is produced which, after passing through the narrow
openings O1 and O2 in the screens S1 and S2, is deflected by a uniform magnetic
field H⃗, which is present to the right of the screen S2 and orthogonal to the
plane of the drawing. In D there is a detector. If the particle is detected in the
region D, the magnitude of the momentum of the particle is determined to be
pD = eHrD (in suitable units), where rD is as in fig. 1. In fact, if the particle is
classical (i.e., a charged particle which is not a microparticle) then its trajectory
is a circle with a radius depending on the momentum as in the formula just used;
thus, from the region of localization of the particle we can deduce the magnitude
of its momentum, and indeed the fact that detection of a charged classical
particle in D corresponds to the magnitude pD = eHrD is uncontentious. If
the particle is not a classical particle, but it is a microparticle instead, the whole
description given above, which is based on the idea of a trajectory, is meaningless
(for a microparticle the concept of a trajectory loses its meaning, as first pointed
out by Werner Heisenberg); the observable “magnitude of momentum” is then
defined as the observable to which the result pD := eHrD is ascribed if detection
in the region D occurs (and other results in other similar experiments); indeed,
if the particle is a microparticle, the experimental arrangement described above
is one of those which give an empirical meaning to the concept of momentum
of a microparticle. One could ask the question: “how can I know that the
macroscopic event that happened in a detector located in D (as for instance
the blackening of a spot of a photographic plate or the click of a detector) is due
S3 , with the apertures of the “new” screens S1′ and S2′ on the line of the hypo-
thetical beam coming out of A (we are now using, as “new” source of copies, the
source to the left of the screen S1 plus the first Stern–Gerlach device with the
photographic plate replaced by the screen S3 ) and with a “new” photographic
plate, we see that all the copies that are detected by the “new” photographic
plate leave marks in the upper region of the plate. As in remark b, if the
state σ in which the copies are prepared to the left of the screen S1 is so that
p(πz+ , σ) 6= 0, then σ plus the modified Stern–Gerlach device of fig. 4 amounts
to a new state preparation procedure σ ′ which is so that RWσ′ ⊂ RPπz+ (cf.
19.3.5f). Now, a spin one-half microparticle is wholly described by a quantum
theory the Hilbert space of which is not two-dimensional. However, if one is
interested in studying spin (beside sz , there are other spin observables, one
for each direction in three-dimensional space) and nothing else, then one can
give a partial description of a spin one-half microparticle in a two-dimensional
Hilbert space, e.g. C2 (at the opposite end, if spin is disregarded one can give
a partial description of the same microparticle in L2 (R3 )). In that case, the
projection Pπz+ is one-dimensional and one can conclude that, whatever the
state σ to the left of S1 , if p(πz+ , σ) 6= 0 then the copies that are selected by
the procedure described above are in the pure state σ ′ represented by the ray
[uσ′ ] of C2 (cf. 19.3.5c) which is so that Auσ′ = Pπz+ ; indeed, p(πz+ , σ ′ ) = 1
implies now Wσ′ = Pπz+ since Pπz+ is now one-dimensional (cf. 19.3.5f). Thus,
in the partial description in which just the spin observables are represented,
the procedure described above can be interpreted, provided p(πz+ , σ) 6= 0, as
an implementation of the pure state represented by the ray which contains the
normalized eigenvectors of the self-adjoint operator A_{sz} corresponding to the
eigenvalue 1/2. If a large number N of copies are prepared in the state σ to the
left of S1 , this procedure selects approximately p(πz+ , σ)N copies which are in
this pure state.
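A minimal numerical sketch of the two-dimensional spin description, assuming the partial description in C² with A_{s_z} = diag(1/2, −1/2); the incoming state and the number N of copies are invented for illustration:

```python
import numpy as np

# Two-dimensional spin sketch: P_{pi_z+} projects onto the +1/2 eigenvector
# of A_{s_z} = diag(1/2, -1/2). Incoming state and N are illustrative.
P_zplus = np.array([[1.0, 0.0],
                    [0.0, 0.0]])               # projection onto +1/2 eigenspace
u_sigma = np.array([0.6, 0.8])                 # some incoming pure state

p = float(np.linalg.norm(P_zplus @ u_sigma) ** 2)   # p(pi_z+, sigma)
N = 10_000
selected = p * N          # approximate size of the selected subensemble

# Every selected copy is in the pure state [u'] with A_{u'} = P_{pi_z+}:
u_out = P_zplus @ u_sigma / np.linalg.norm(P_zplus @ u_sigma)
```

Whatever the incoming state, the outgoing ray is the same: this is the one-dimensionality of P_{π_z+} at work.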
19.4.2 Definition. We say that we have a filter for a proposition π ∈ Π if we
have, for every state preparation σ ∈ Σ such that p(π, σ) 6= 0, an experimental set-
up which can be added to a definite experimental implementation of σ and which
affects a collection of copies prepared in σ as follows:
some copies are “absorbed” or “destroyed” by the set-up (i.e., “after” the setup,
no effect can be observed that can be related to those copies);
there is the probability p(π, σ) that a copy will not be absorbed;
there is a state (as an equivalence class) σ ′ which depends on σ and for which
p(π, σ ′ ) = 1, so that if a copy has not been absorbed then it is in σ ′ (a copy which
has not been absorbed by the set-up is said to have gone through the filter ).
Thus, if p(π, σ) 6= 0 then there is a state σ ′ so that p(π, σ ′ ) = 1 and so that the
experimental implementation of σ combined with the experimental set-up of the
filter amounts to an experimental implementation of σ ′ .
19.4.3 Remarks.
(a) In 19.4.2 it is not asserted that a filter exists for every proposition π. Indeed,
such a claim would be an axiom. However, an even stronger assumption will
actually be made in 19.4.6.
(b) In 19.4.2 it is not maintained that a filter produces copies of the system. Rather,
we can say that a filter selects and modifies copies of the system. In fact, the
definition of a filter implies that, if a state preparation procedure σ ∈ Σ is
activated then the filter affects the copy so that the copy is either absorbed or
modified into a new copy (i.e., a copy in a new state). If p(π, σ) 6= 0, we can say
that the filter transforms the state σ into a new state σ ′ . This transformation
is called a state reduction. Note that, in a given experimental situation, σ and
σ ′ are represented by different ensembles: if we have an ensemble consisting of
a large number N of copies prepared in σ, then “after” the filter we have a new
ensemble consisting of approximately p(π, σ)N copies prepared in σ ′ .
(c) For a proposition π ∈ Π there may exist essentially different filters. In fact,
if p(π, σ) 6= 0, the state σ ′ is only subject to the condition RWσ′ ⊂ RPπ (cf.
19.3.5f). Thus, there may exist different experimental set-ups which act as
filters for the same proposition but lead to different state-reductions.
(d) It is expedient to define an equivalence relation in the family of filters for a
proposition π ∈ Π, by defining two filters equivalent if they transform in the
same way any state σ ∈ Σ such that p(π, σ) 6= 0 (it is obvious that this defines
an equivalence relation). An equivalence class is still called a filter. A repre-
sentative of an equivalence class is sometimes called an implementation of the
filter.
(e) If π ∈ Π is such that Pπ is a one-dimensional projection, that is to say Pπ = Au
with u ∈ H̃, then just one filter (as an equivalence class) can exist, because
p(π, σ ′ ) = 1 then implies Wσ′ = Au (cf. 19.3.5f). This can be rephrased as
follows: if π is represented by a one-dimensional projection Au, then the only
state that can be obtained by supplementing any state with a filter for π is the
pure state represented by the ray [u] (cf. 19.3.5c).
(f) Suppose we have, for a proposition π ∈ Π, an experimental implementation of π
which includes a detector so that the event which defines π is declared to have
occurred when the detector “clicks”. Then it is often possible to convert this
apparatus into a filter for π by replacing the detector with a suitably oriented
screen in which an aperture is opened in the shape of the detector. This is in
fact what was done in the two examples of 19.4.1, which are examples of how
filters can be obtained by modifying pieces of equipment originally designed for
determining propositions.
Wσ′ = W_{σ,π} := (1/tr(Pπ Wσ Pπ)) Pπ Wσ Pπ = (1/tr(Pπ Wσ)) Pπ Wσ Pπ.
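The reduction formula above can be sketched numerically; the projection and statistical operator below are illustrative, and the function name is ours:

```python
import numpy as np

# Sketch of the ideal-filter reduction: W' = P W P / tr(P W),
# using tr(P W P) = tr(P^2 W) = tr(P W).
def luders(P, W):
    """State reduction by an ideal filter; requires tr(P W) != 0."""
    p = float(np.trace(P @ W))          # = p(pi, sigma)
    return P @ W @ P / p, p

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])         # a two-dimensional projection
W = np.diag([0.5, 0.3, 0.2])            # a statistical operator (trace 1)

W_red, p = luders(P, W)
p_after = float(np.trace(P @ W_red))    # p(pi, sigma') = 1 after the filter
```

The test below checks the three facts used in remark 19.4.5a: the reduced operator has trace one, the filter passes copies with probability p(π, σ), and the proposition is certainly true afterwards.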
19.4.5 Remarks.
(a) The condition that defines an ideal filter in 19.4.4 is consistent. Indeed, for
every projection P ∈ P(H) and every statistical operator W ∈ W(H) we have:
P W P ∈ T(H) by 18.2.7;
0 ≤ (P f|W P f) = (f|P W P f), ∀f ∈ H, since P = P†;
tr(P W P) = tr(P²W) = tr(P W) by 18.2.11c, since P = P²;
this shows that, if tr(P W) ≠ 0, then (1/tr(P W P)) P W P ∈ W(H). Also, recall that
tr(Pπ Wσ) = p(π, σ). Furthermore, it is clear that p(π, σ′) = 1 since Pπ² = Pπ
implies Pπ W_{σ,π} = W_{σ,π} (cf. 19.3.5f).
An ideal filter can be regarded as a filter which alters any “incoming” state σ as
little as possible. In fact, for the “outgoing” state σ ′ we must have RWσ′ ⊂ RPπ
(cf. 19.4.3c), and the operator Pπ Wσ Pπ is so to speak just the operator Wσ
“reduced” to the subspace RPπ .
(b) If an ideal filter for a proposition π exists then it is clearly unique (as an
equivalence class), owing to the injectivity of the mapping σ 7→ Wσ .
(c) For a proposition π which is represented by a one-dimensional projection, only
one filter can exist (cf. 19.4.3e). Actually, if a filter exists then it is the ideal
filter. Indeed, if a filter for π exists then it transforms every state σ ∈ Σ
such that p(π, σ) 6= 0 into the state σ ′ which is represented by the statistical
operator Wσ′ = Au, if Pπ = Au with u ∈ H̃ (cf. 19.4.3e). Now, for each
W ∈ W(H) and each u ∈ H̃ so that tr(Au W) ≠ 0 we have R_{Au W Au} = V{u}
(notice that Au W Au ≠ OH since (u|Au W Au u) = (u|W u) = tr(Au W)) and hence
(1/tr(Au W)) Au W Au = Au (this follows easily from 18.3.2c). Thus Wσ′ = W_{σ,π}.
(d) If the ideal filter exists for a proposition π, then it transforms each pure state σ
such that p(π, σ) ≠ 0 into the pure state represented by the ray [(1/‖Pπ uσ‖) Pπ uσ]
(cf. 19.3.5c). Indeed, for P ∈ P(H) and u ∈ H̃ we have
P Au P f = (u|P f) P u = (P u|f) P u, ∀f ∈ H,
tr(P Au) = (u|P u) = ‖P u‖² (cf. 18.3.2b),
and hence, if tr(P Au) ≠ 0, (1/tr(P Au)) P Au P = Au′ with u′ := (1/‖P u‖) P u.
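The computation in remark d can be reproduced with concrete toy vectors (all data illustrative):

```python
import numpy as np

# Sketch of remark d: P A_u P / tr(P A_u) equals A_{u'} with u' = P u / ||P u||.
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])               # a projection
u = np.array([0.6, 0.8])                 # a unit vector
A_u = np.outer(u, u)                     # one-dimensional projection A_u

reduced = P @ A_u @ P / np.trace(P @ A_u)
u_prime = P @ u / np.linalg.norm(P @ u)
A_u_prime = np.outer(u_prime, u_prime)   # should coincide with `reduced`
```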
(e) For a proposition π and a state σ, the ideal filter for π (if it exists) transforms σ
into itself, i.e. we have σ′ = σ in 19.4.4, if and only if p(π, σ) = 1. This follows
at once from the equivalence between p(π, σ) = 1 and Pπ Wσ Pπ = Wσ (cf.
19.3.5f). Indeed, p(π, σ) = 1 implies Pπ Wσ Pπ = Wσ and tr(Pπ Wσ ) = 1, and
hence Wσ,π = Pπ Wσ Pπ = Wσ . Conversely, since obviously Pπ Wσ,π Pπ = Wσ,π ,
Wσ,π = Wσ implies Pπ Wσ Pπ = Wσ and hence p(π, σ) = 1.
19.4.6 Axiom (Axiom Q2). The ideal filter exists for every proposition π ∈ Π.
19.4.7 Remarks.
(a) Axiom Q2 is a version of what is sometimes called Lüders’ reduction axiom. A
milder version of the axiom would be to assume that a filter exists for every
proposition represented by a one-dimensional projection. This milder version
would be a version of what is sometimes called von Neumann’s reduction axiom,
or projection postulate.
We point out that in our approach to quantum mechanics, in which states
correspond to ensembles of copies prepared in a definite way, the transformation
of a state σ into a pure state σ ′ such that [uσ′ ] = [u], upon action of a filter for
the proposition represented by a one-dimensional projection Au , is an immediate
consequence of the definition of filter (cf. 19.4.3e) and it does not need to be
assumed. However, it is not obvious that a filter does exist for every one-
dimensional proposition (even less, that a filter exists for every proposition).
(b) For all u, v ∈ H̃, axiom Q2 implies that there exists an experimental set-up
which can be used in conjunction with an apparatus which implements the pure
state σ represented by the ray [v] (cf. 19.3.5c) so that, when the set-up is used,
there is the probability | (u|v) |2 that a copy prepared in σ is modified into a
copy in the pure state σ ′ represented by the ray [u]. Indeed, any implementation
of the filter for the proposition π represented by the one-dimensional projection
Au is such an experimental set-up, since
| (u|v) |2 = (v|Au v) = p(π, σ)
(cf. 19.4.2 and 19.4.3e). For this reason, the number | (u|v) |2 is called the
transition probability from the pure state represented by v to the pure state
represented by u. We point out that the transition probability from one pure
state to another is one if and only if the two states coincide (cf. 10.1.7b and
13.1.13a; also, this is a special case of 19.4.5e).
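The transition probability |(u|v)|² can be illustrated in C²; note that it is insensitive to the phase of the representatives of the rays (vectors chosen arbitrarily):

```python
import numpy as np

# Sketch: the transition probability |(u|v)|^2 between pure states depends
# only on the rays [u], [v], and equals 1 exactly when the states coincide.
def transition(u, v):
    """|(u|v)|^2 for unit vectors u, v (vdot conjugates its first argument)."""
    return abs(np.vdot(u, v)) ** 2

v = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
u_same  = np.exp(1j * 0.7) * v                 # same ray, different phase
u_other = np.array([1.0, 0.0], dtype=complex)  # a different ray

t_same, t_other = transition(u_same, v), transition(u_other, v)
```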
19.4.9 Remarks.
(a) If we have a first kind implementation of a proposition, this happens notwith-
standing the cautionary remarks of 19.4.1a. Some assume that there are first
kind implementations for all propositions, but we do not make this assumption.
(b) Clearly, a first kind (respectively an ideal) implementation of a proposition π is
a collection of procedures which amounts to a filter (respectively an ideal filter)
for π if they are supplemented with devices which absorb all the copies in which
¬π has been found to be true (i.e., in which π has not been found to be true).
(c) If a proposition π is represented by a one-dimensional projection then a first
kind implementation of π is necessarily an ideal one (cf. 19.4.5c).
19.4.11 Remarks. Wolfgang Pauli introduced the distinction between first and
second kind measurements (Pauli, 1933), when he distinguished between two types
of measurements. The first type of measurement brings (or leaves) the copy of the
system into a state in which the observable that has been measured surely gives the
result that has been the outcome of the measurement if it is measured a second time.
The second type of measurement either destroys the copy or else changes its state
arbitrarily. For an example of each type, we quote from Josef M. Jauch (note that
Jauch calls “value” what we call “result”). “First we consider the measurement
of the position of some elementary particle by a counter with a finite sensitive
volume. After the measurement has been performed and the counter has recorded
the presence of a particle inside its sensitive volume, we know for certain that the
particle, at the instant of the triggering, is actually inside the sensitive volume. By
this we mean the following: Suppose we repeated the measurement immediately
after it has occurred (this is of course an idealization, since counters are notorious
for having a dead time after they are triggered), then we would with certainty
observe the particle inside the volume of the counter. In the second example, we
consider a momentum measurement with a counter which analyzes the pulse height
of a recoil particle. Here the situation is quite different. The experiment will permit
us to determine the value of the momentum only before the collision occurred. If
we repeat the measurement immediately after it has occurred, then we find that the
momentum of the particle will have a quite different value from its measured value.
The very act of measurement has changed the momentum, and it is this change
which produced the observable effect. We shall call a measurement which will give
the same value when immediately repeated a measurement of the first kind. The
second example is then a measurement of the second kind ” (Jauch, 1968, p.165).
19.4.12 Remarks.
(a) In what follows we assume that α is a discrete observable. Then the self-
adjoint operator Aα that represents α is the operator determined by a family
{(λn , Pn )}n∈I as A was in 15.3.4B (cf. 19.3.10c). Since {λn }n∈I = σp (Aα ),
{λn }n∈I is the family of all exact results for α (cf. 19.3.12a); moreover, Pn =
P Aα ({λn }) = Pα({λn }) for each n ∈ I. In what follows we consider a definite
state σ ∈ Σ.
First, suppose that we have an ideal measurement of α in an ensemble repre-
senting σ, i.e. in a large number N of copies of the system all prepared in the
state σ, and that, for a definite n ∈ I, a device is installed which absorbs all the
copies in which the result λn has not been found, i.e. in which the proposition
α({λn }) is not true. Then we have an ideal filter for α({λn }) (cf. 19.4.9b).
This selects, from the original ensemble of copies, a subensemble containing ap-
proximately p(α({λn }), σ)N copies which are in the state σn′ represented by the
statistical operator
W_{σn′} = W_{σ,α({λn})} = (1/tr(P_{α({λn})} Wσ)) P_{α({λn})} Wσ P_{α({λn})} = (1/tr(Pn Wσ)) Pn Wσ Pn,
provided this subensemble is not empty, i.e. provided p(α({λn}), σ) =
tr(Pn Wσ) ≠ 0 (cf. 19.4.3b and 19.4.4).
Next suppose that we are in a different situation, and that we have just one copy
which had been previously prepared in σ and in which an ideal measurement
of α has given the exact result λn . Then immediately after the measurement
the copy is in the state σn′ . Indeed, since we are considering the copy after the
proposition α({λn }) has been determined to be true in it, there is no need to
select the copy since everything is as if the copy had gone through an ideal filter
for α({λn }) (if we had provided a device that would absorb the copies in which
α({λn }) was not true, our copy would not have been absorbed).
Suppose once again that we have an ideal measurement of α in a copy pre-
pared in σ, but this time the result obtained has not been recorded; i.e.,
there has been a result which was necessarily one of the numbers in {λn }n∈I
(since a measurement of α means that all propositions α({λn }) have been
determined, and hence one of them has been found to be true because the
elements of {λn }n∈I are the only numbers that can be obtained as results
in view of the fact that P Aα (R − {λn }n∈I ) = OH and this implies that
p(α(R − {λn }n∈I ), σ) = tr(P Aα (R − {λn }n∈I )Wσ ) = 0), but the measuring
apparatus has failed to keep record of the result (if we include ourselves as ob-
servers in the measuring apparatus, this could mean that we have not registered
the result in our memories or elsewhere). Then we only know that immediately
after the measurement the copy has probability
tr(Pn Wσ ) = tr(Pα({λn }) Wσ ) = p(α({λn }), σ)
of being in the state σn′ , and thus we must conclude (cf. 19.3.5b) that the state
of the copy after the measurement is the mixed state σ ′′ represented by the
statistical operator Wσ′′ defined by
Wσ′′ f := Σn∈I0 (tr(Pn Wσ)) Wσn′ f = Σn∈I Pn Wσ Pn f, ∀f ∈ H,
in fact, our ignorance is smaller than it was in the previous case, and we modify
the probabilities of the previous case as we should do if they were classical
probabilities. Proceeding as before and observing that
Pα({λk}k∈IE) = P Aα({λk}k∈IE) = Σk∈IE Pk
(cf. 15.3.4B), we see that immediately after the measurement the copy is in the
state σE′′ represented by the statistical operator WσE′′ defined by
WσE′′ f := (1/Σk∈IE tr(Pk Wσ)) Σn∈IE Pn Wσ Pn f, ∀f ∈ H
(note that tr((Σk∈IE Pk)Wσ) = Σk∈IE tr(Pk Wσ) by 18.3.12 and that
tr WσE′′ = 1 since tr(Pn Wσ Pn) = tr(Pn Wσ)).
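The effect of an unrecorded (non-selective) ideal measurement, σ → σ′′ with Wσ′′ = Σn Pn Wσ Pn, can be sketched numerically; the state and projections below are our own toy choices, not taken from the text:

```python
import numpy as np

# A pure superposition on C^2: Wσ = |u><u| with u = (e1 + e2)/sqrt(2).
u = np.array([1.0, 1.0]) / np.sqrt(2)
W = np.outer(u, u)

# Eigenprojections P_1, P_2 of a discrete observable (the standard basis).
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

# Non-selective ideal measurement: the off-diagonal terms of Wσ are erased.
W_after = sum(Pn @ W @ Pn for Pn in P)

print(np.allclose(W_after, np.diag([0.5, 0.5])))   # True
print(round(np.trace(W_after), 2))                 # 1.0
```

The pure superposition becomes the mixture that assigns each result its Born probability, which is exactly the "classical ignorance" reading given above.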
We must underline the fact that, in the last two cases considered above (when
σ was transformed into σ′′ or σE′′), there is a measuring apparatus which
“interacts” with a copy of the system in such a way as to produce an exact result,
and that only the recording section of the apparatus is defective. Indeed, if in
the last case considered above the apparatus was only capable of determining
whether the proposition α({λn}n∈IE) was true, then we would only have an
ideal determination of this proposition (and not an ideal measurement of α)
and, after an “interaction” with the apparatus in which this proposition was
determined to be true, the copy would be in the state σE′ represented by the
statistical operator WσE′ defined by
WσE′ f = Wσ,α({λk}k∈IE) f
= (1/tr(Pα({λk}k∈IE) Wσ)) Pα({λk}k∈IE) Wσ Pα({λk}k∈IE) f
= (1/Σk∈IE tr(Pk Wσ)) Σn,m∈IE Pn Wσ Pm f, ∀f ∈ H,
which is clearly not the same as WσE′′ (and hence σE′ is not the same as σE′′).
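The difference between WσE′ (determination of the coarse proposition only, with the cross terms Pn Wσ Pm kept) and WσE′′ (ideal measurement of α followed by coarse selection, cross terms erased) is easy to exhibit on small matrices; the following are our own toy choices:

```python
import numpy as np

P1 = np.diag([1.0, 0.0, 0.0])
P2 = np.diag([0.0, 1.0, 0.0])
PE = P1 + P2                         # Pα({λk}k∈IE) with IE = {1, 2}

u = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
W = np.outer(u, u)                   # a pure statistical operator on C^3

norm = np.trace(PE @ W)
W_prime = PE @ W @ PE / norm                         # WσE′: cross terms survive
W_doubleprime = (P1 @ W @ P1 + P2 @ W @ P2) / norm   # WσE′′: cross terms erased

print(np.allclose(W_prime, W_doubleprime))   # False: the two reductions differ
```

Both are statistical operators with trace one, but only WσE′ retains the coherence between the retained results.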
Finally, suppose that we have ideal measurements of α in an ensemble of N
copies all prepared in σ. We have already seen what happens if we make a
selection by keeping just those copies in which a particular exact result has
been obtained. If instead no selection is made, then after the measurements we
have an ensemble which still contains N copies, all of them in the state σ ′′ . If
only a coarse selection is made by keeping just those copies for which a result
has been obtained that belongs to a definite subset E of R, then after the
measurements and the selection we have an ensemble which contains approximately
(Σk∈IE tr(Pk Wσ)) N copies, all of them in the state σE′′.
All the transformations of σ into another state considered above (σn′, σ′′, σE′′, σE′)
are called state reductions (for the transformations of σ into σn′ or into σE′, this
name was already known from 19.4.3b).
(b) We suppose here that α is an observable which is not discrete and, for the sake
of simplicity, we also suppose that Aα has no eigenvalues, i.e. (cf. 19.3.12a)
that there are no real numbers which are exact results for α. What happens
then if an ideal measurement of α is carried out? Naturally, a result is ob-
tained which is identified with a real number λ, but there is no state in which
this result has non-zero probability of being obtained, since α({λ}) = π0 for
each λ ∈ R. Indeed, in N repetitions of the measurement of α we will obtain
N results, but each of them occurs so seldom that its relative frequency approaches
zero as N grows (cf. 19.1.16a). However, an observable with no exact results
(or, more generally, a non discrete observable) is an idealization which is useful
(in some respects, even essential) on the theoretical level but which on the
operational level actually stands for a sequence of more realistic discrete ob-
servables which correspond to more realistic measuring instruments and which
can be assumed to be functions of α, as for instance the observables αn defined
in 19.1.22a. In order to perform a non-fictional measurement of α, we must
actually measure one of these more realistic discrete observables, for instance
one of the observables αn , and hence the analysis of remark a applies.
As already observed in 19.1.22a, the relation between the observable α and
the more realistic discrete observables which approximate α is conceptually
similar to the one that exists, in classical mechanics, between derivatives used
to represent values of speed and the way speed is actually measured.
In discussions about quantum mechanics the issue is often addressed of whether two
observables are compatible with each other, something which is often regarded as
being equivalent to the condition that they can be measured simultaneously. How-
ever, it is not always clear what is meant by a “simultaneous measurement”. And
indeed the idea of an interaction of a copy of a quantum system with two measuring
instruments at the same time does not seem experimentally very sound. A perhaps
more promising idea might be that two observables α and β are simultaneously mea-
surable if a measurement of α followed immediately by a measurement of β yields
the same results as when the order of the α and β measurements is reversed. In the
first part of this section we endeavour to deal with this topic on mainly statistical
grounds.
In the second part of this section we discuss uncertainty relations, an issue which
in the early days of quantum mechanics seemed to involve deep epistemological
and even philosophical questions. However, a strict statistical interpretation of
uncertainty relations as presented here is quite unproblematic.
As usual, states, propositions, observables are referred to a given quantum sys-
tem (cf. also 19.1.12) and they are represented as summarized in 19.3.22.
19.5.1 Remarks.
Proof. First we notice that the denominator in the statement is non-zero since
tr(Pπ′′ Pπ′ Wσ Pπ′ Pπ′′ ) = p(π ′′ , π ′ , σ) (cf. 19.5.1a). Next, from 19.4.4 we have that
the state σ̃ is represented by the statistical operator Wσ′ ,π′′ with Wσ′ = Wσ,π′ , and
hence by the statistical operator
(1/tr(Pπ′′ Wσ,π′ Pπ′′)) Pπ′′ Wσ,π′ Pπ′′ = (1/tr(Pπ′′ Pπ′ Wσ Pπ′ Pπ′′)) Pπ′′ Pπ′ Wσ Pπ′ Pπ′′.
From Pπ′′² = Pπ′′ we have Pπ′′ Wσ̃ = Wσ̃, and hence p(π′′, σ̃) = 1 by 19.3.5f.
We prove now the equivalence between conditions a and b.
a ⇒ b: Assume condition a. Then, for each σ ∈ Σ such that p(π′′, π′, σ) ≠ 0,
we have
Wσ̃ = Wσ̃,π′
(cf. 19.4.5e). This equality is true in particular for each pure state σ ∈ Σ0 such
that ∥Pπ′′ Pπ′ uσ∥² = p(π′′, π′, σ) ≠ 0 (cf. 19.5.1a), for which it can be written as
(cf. 19.4.5d)
Aũσ = Aũ′σ
with
ũσ := (1/∥Pπ′′ Pπ′ uσ∥) Pπ′′ Pπ′ uσ and ũ′σ := (1/∥Pπ′ Pπ′′ Pπ′ uσ∥) Pπ′ Pπ′′ Pπ′ uσ,
and this implies (cf. 13.1.13a) that there exists α ∈ C so that
Pπ′′ Pπ′ uσ = αPπ′ Pπ′′ Pπ′ uσ ;
applying Pπ′ to the left of both sides of this equality we get
Pπ′ Pπ′′ Pπ′ uσ = αPπ′ Pπ′′ Pπ′ uσ ,
and hence α = 1 since Pπ′ Pπ′′ Pπ′ uσ ≠ 0H. Owing to the bijection that exists from
Σ0 onto the family of all rays of H (cf. 19.3.5c), this proves that
Pπ′′ Pπ′ u = Pπ′ Pπ′′ Pπ′ u for each u ∈ H̃ such that Pπ′′ Pπ′ u ≠ 0H;
since the same is trivially true for each u ∈ H̃ such that Pπ′′ Pπ′ u = 0H, we have
Pπ′′ Pπ′ = Pπ′ Pπ′′ Pπ′ .
By taking the adjoints of both sides we get
Pπ′ Pπ′′ = Pπ′ Pπ′′ Pπ′
(cf. 12.3.4b), and hence [Pπ′ , Pπ′′ ] = OH , and hence condition b by 19.5.3.
b ⇒ a: If π ′ and π ′′ are compatible, then [Pπ′ , Pπ′′ ] = OH by 19.5.3, and hence
Pπ′ Wσ̃ = Wσ̃ , and hence p(π ′ , σ̃) = 1 by 19.3.5f.
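The identity on which the proof turns, namely that Pπ′′Pπ′ = Pπ′Pπ′′Pπ′ holds exactly when the two projections commute, can be checked on small matrices; these are our own toy examples, not objects from the text:

```python
import numpy as np

def proj(v):
    """Orthogonal projection onto the line spanned by v."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

P1 = np.diag([1.0, 0.0])        # plays the role of Pπ′
P2_comm = np.diag([0.0, 1.0])   # commutes with P1
P2_skew = proj([1.0, 1.0])      # does not commute with P1

for P2 in (P2_comm, P2_skew):
    commutes = np.allclose(P1 @ P2, P2 @ P1)
    identity = np.allclose(P2 @ P1, P1 @ P2 @ P1)
    print(commutes, identity)   # True True, then False False
```

For the skew pair, P2 @ P1 is not even self-adjoint, so it cannot equal the self-adjoint operator P1 @ P2 @ P1.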
The probability of the occurrence oπ′ ,π′′ is p(π ′′ , π ′ , σ), and the probability of the
occurrence oπ′′ ,π′ is p(π ′ , π ′′ , σ) (cf. 19.5.1b).
The following conditions are equivalent:
(a) there exists a proposition π ∈ Π such that p(π ′′ , π ′ , σ) = p(π, σ) for each σ ∈ Σ;
(b) π ′ and π ′′ are compatible.
If these conditions are satisfied, then the proposition π (as an equivalence class) is
unique and we have:
19.5.6 Remarks.
(a) The equivalence between conditions a and b in 19.5.5 shows that we cannot ac-
cept all occurrences related to a quantum system as bona fide events which define
propositions. Indeed, the meaning of condition a is that the occurrence oπ′ ,π′′
is actually a quantum event which defines a proposition, and the equivalence
between conditions a and b shows that this is true if and only if π ′ and π ′′ are
compatible. Condition c shows that if π ′ and π ′′ are compatible then both the
occurrences oπ′ ,π′′ and oπ′′ ,π′ are implementations of the same proposition π.
Thus, if π ′ and π ′′ are compatible, we can say that an event in the equivalence
class of π is the “simultaneous occurrence” of the events that define π ′ and π ′′ ;
actually, the experimental determinations of π′ and π′′ will require determining
first one of them and then, immediately afterwards, the other one; however, the
order is immaterial since oπ′ ,π′′ and oπ′′ ,π′ define propositions which are in the
same equivalence class. This equivalence class, which we have denoted by π up
to now, will be denoted by the symbol π ′ ∧ π ′′ henceforth (thus, this symbol im-
plies that π ′ and π ′′ are compatible and that there exist ideal implementations
of them).
(b) If π ′ and π ′′ are compatible propositions and ideal implementations of them are
available, then the proposition we have denoted by π ′ ∧ π ′′ is represented by
the orthogonal projection Pπ′ ∧π′′ = Pπ′′ Pπ′ (cf. 19.5.5d), i.e. by the orthogonal
projection defined by the subspace RPπ′ ∩ RPπ′′ (cf. 13.2.1e).
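For commuting projections the product is indeed the orthogonal projection onto the intersection of the ranges; a quick numerical check on C³ (our own choice of subspaces, not from the text):

```python
import numpy as np

Pp = np.diag([1.0, 1.0, 0.0])   # projection onto span{e1, e2} (for π′)
Pq = np.diag([0.0, 1.0, 1.0])   # projection onto span{e2, e3} (for π′′)

P_and = Pq @ Pp                 # candidate for Pπ′∧π′′

print(np.allclose(P_and, P_and @ P_and))              # True: idempotent
print(np.allclose(P_and, P_and.T))                    # True: self-adjoint
print(np.allclose(P_and, np.diag([0.0, 1.0, 0.0])))   # True: projects onto span{e2}
```

The range of the product is span{e1, e2} ∩ span{e2, e3} = span{e2}, as the last check confirms.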
(c) We remark that, for two propositions π ′ and π ′′ , the operator Pπ′′ Pπ′ is an
orthogonal projection if and only if π ′ and π ′′ are compatible (cf. 19.5.3 and
13.2.1). However, for any pair of propositions π ′ , π ′′ there is always (i.e., with no
conditions on π ′ , π ′′ ) an orthogonal projection which is defined by the subspace
RPπ′ ∩ RPπ′′ (cf. 4.1.10), and hence there is always a proposition, which we
still denote by π, such that RPπ = RPπ′ ∩ RPπ′′ , since the mapping of 19.3.1b
is bijective. For a state σ we have
p(π, σ) = 1 ⇔ RWσ ⊂ RPπ = RPπ′ ∩ RPπ′′ ⇔ p(π ′ , σ) = p(π ′′ , σ) = 1
(cf. 19.3.5f). Thus, π is certainly true in a state if and only if both π ′ and π ′′
are certainly true in that state. We note that, if π′ and π′′ were propositions
in a classical theory, then the classical proposition π ′ ∧ π ′′ (defined in 19.2.1)
would be certainly true in a state if and only if both π ′ and π ′′ were certainly
true in that state. Indeed, for a state σ, in a classical theory we would have (cf.
19.2.8 and the proof of 19.2.7)
p(π ′ ∧ π ′′ , σ) = µσ (Sπ′ ∧π′′ ) = µσ (Sπ′ ∩ Sπ′′ ) = 1 ⇔
[p(π ′ , σ) = µσ (Sπ′ ) = 1 and p(π ′′ , σ) = µσ (Sπ′′ ) = 1];
in fact, one implication follows immediately from the monotonicity of µσ and
for the other one we have
µσ (Sπ′ ) = µσ (Sπ′′ ) = 1 ⇒ µσ (S − Sπ′ ) = µσ (S − Sπ′′ ) = 0 ⇒
µσ (S − (Sπ′ ∩ Sπ′′ )) = µσ ((S − Sπ′ ) ∪ (S − Sπ′′ )) = 0 ⇒ µσ (Sπ′ ∩ Sπ′′ ) = 1.
This could suggest interpreting π as the proposition “π ′ and π ′′ ” also in the
quantum theory. However, if pursued in the quantum theory, this interpretation
must not lead to thinking that in general π ′ and π ′′ can be determined in the
same copies (as instead they could in a classical theory); actually, p(π ′ , σ) = 1
means that π ′ is found to be true in all copies of an ensemble representing σ
and p(π ′′ , σ) = 1 means that π ′′ is found to be true in all copies of a different
ensemble representing σ. Moreover, determining π ′ in a copy and then π ′′ in the
resulting copy is a procedure which is not in general equivalent to determining
first π ′′ and then π ′ , as 19.5.3 shows. However, if π ′ and π ′′ are compatible and
if ideal implementations are available for both of them, then we saw in 19.5.6a
that an ideal determination of one of them in a copy immediately followed by
a determination of the other one in the resulting copy defines an event which
lies in the equivalence class of π. Thus, when π ′ and π ′′ are compatible there
are experimentally reasonable grounds for interpreting the proposition π as the
proposition “π ′ and π ′′ ”. In any case, we will reserve the symbol π ′ ∧ π ′′ for
the case of compatible propositions π ′ , π ′′ for which ideal implementations are
available.
(d) Suppose that two propositions π ′ and π ′′ are compatible and that ideal imple-
mentations are available for both of them. Then the pairs π ′ and ¬π ′′ , ¬π ′ and
π ′′ , ¬π ′ and ¬π ′′ are all compatible; this follows at once from 19.5.3 and 19.3.4.
Thus, for every state σ, the probabilities for the joint results of π ′ and π ′′ are
independent of the order in which the determinations are made, i.e.
p(π ∗ , π ∗∗ , σ) = p(π ∗∗ , π ∗ , σ) for π ∗ = π ′ , ¬π ′ and π ∗∗ = π ′′ , ¬π ′′ .
19.5.8 Remark. To understand better the meaning of the results obtained so far
in this section, it is useful to examine what we should have if, in the situations
discussed, we were considering a classical statistical theory (for which we refer to
Section 19.2).
In a classical statistical theory, the action of an ideal filter for a proposition π
would be to transform any state σ such that µσ(Sπ) = p(π, σ) ≠ 0 into the state σ′
represented by the probability measure µσ,π on A defined by
µσ,π(E) := (1/µσ(Sπ)) µσ(E ∩ Sπ), ∀E ∈ A.
Note that this obviously defines a probability measure and that p(π, σ ′ ) =
µσ,π (Sπ ) = 1; thus, the reduction from µσ to µσ,π would indeed represent the
action of a filter for π; moreover, µσ,π is obtained from the original measure µσ
by altering it to the least degree consistent with the condition µσ,π (Sπ ) = 1, as an
ideal filter should do.
Then, for two propositions π ′ , π ′′ and a state σ in a classical statistical theory,
if p(π′, σ) ≠ 0 we should have, reasoning as in 19.5.1,
p(π ′′ , π ′ , σ) = p(π ′′ , σ ′ )p(π ′ , σ),
where σ ′ would be the state represented by the probability measure µσ,π′ , and hence
p(π ′′ , π ′ , σ) = µσ,π′ (Sπ′′ )µσ (Sπ′ ) = µσ (Sπ′′ ∩ Sπ′ );
since p(π ′ , σ) = 0 implies that the occurrence o defined in 19.5.1 can never happen
and hence p(π ′′ , π ′ , σ) = 0, and also implies µσ (Sπ′′ ∩ Sπ′ ) = 0 (by the monotonicity
of µσ ), we should have
p(π ′′ , π ′ , σ) = µσ (Sπ′′ ∩ Sπ′ )
whatever the value of p(π ′ , σ). And similarly we should have
p(π ′ , π ′′ , σ) = µσ (Sπ′ ∩ Sπ′′ ).
Thus, in a classical statistical theory we should have
p(π ′′ , π ′ , σ) = p(π ′ , π ′′ , σ)
for every pair of propositions and every state, in contrast with the result of 19.5.3.
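The contrast with 19.5.3 can be seen on a two-dimensional toy model (our own numbers): for non-commuting projections the sequential probabilities tr(Pπ′′ Pπ′ Wσ Pπ′ Pπ′′) and tr(Pπ′ Pπ′′ Wσ Pπ′′ Pπ′) differ, whereas the classical expression µσ(Sπ′′ ∩ Sπ′) is symmetric by construction:

```python
import numpy as np

def proj(v):
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

Pp = proj([1.0, 0.0])   # Pπ′
Pq = proj([1.0, 1.0])   # Pπ′′, not commuting with Pπ′
W = proj([0.0, 1.0])    # a pure state Wσ

p_qp = np.trace(Pq @ Pp @ W @ Pp @ Pq)   # first π′, then π′′
p_pq = np.trace(Pp @ Pq @ W @ Pq @ Pp)   # first π′′, then π′

print(round(p_qp, 3), round(p_pq, 3))    # 0.0 0.25
```

Here the first filter annihilates the state in one order but not in the other, so no classical joint measure could reproduce both sequential probabilities.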
As to 19.5.4, in a classical statistical theory a copy, initially prepared in a state
σ, after going through an ideal filter for a proposition π ′ and through an ideal filter
for a proposition π ′′ would be in the state σ̃ represented by the probability measure
µσ̃ on A defined by
µσ̃(E) := (1/µσ(Sπ′′ ∩ Sπ′)) µσ(E ∩ Sπ′′ ∩ Sπ′), ∀E ∈ A,
19.5.10 Proposition. Two observables α1 and α2 are compatible if and only if the
operators Aα1 and Aα2 commute.
Proof. This result follows from 19.5.3, from the definitions of the operators Aα1
and Aα2 (cf. 19.3.6), and from the definition of commutativity for two self-adjoint
operators (cf. 17.1.5).
19.5.11 Remarks.
(a) Suppose that we have an R2 -valued observable α. Then α represents a measur-
ing instrument which yields a result by the position of a pointer in a dial which
is represented by R2 (cf. 19.1.9a).
We can define the mapping
α1 : A(dR) → Π, E ↦ α1(E) := α(E × R),
and we see that α1 is an observable since α1 = ϕ1(α), with
ϕ1 : R2 → R, (x1, x2) ↦ ϕ1(x1, x2) := x1
pieces of equipment that define α1 (E1 ) and α2 (E2 ), and hence by the measuring
instruments represented by α1 and α2 , and for which we have (cf. 19.5.5)
p(α2 (E2 ), α1 (E1 ), σ) = p(α1 (E1 ) ∧ α2 (E2 ), σ), ∀σ ∈ Σ,
and hence
α(E1 × E2 ) = α1 (E1 ) ∧ α2 (E2 ).
This gives an operational interpretation to the proposition α(E) on the basis
of the measuring instruments represented by α1 and α2 , for each E ∈ S :=
{E1 × E2 : (E1 , E2 ) ∈ A(dR ) × A(dR )}. In particular, for each (x1 , x2 ) ∈ R2 we
can say that the determination of the proposition α({(x1 , x2 )}) is, in any state,
“the simultaneous determination” of the propositions α1 ({x1 }) and α2 ({x2 }),
in the sense specified in 19.5.6a.
The reason why we define the R2-valued observable α on A(d2) and not just
on S is that we want the probability functions µ^α_σ to be bona fide measures
and hence to be defined on a σ-algebra (S is just a semialgebra and A(d2) is
the σ-algebra generated by S, cf. 6.1.30a and 6.1.32). However, an operational
meaning for the proposition α(E) for each E ∈ A(d2) cannot be inferred from
the operational interpretation given above to all propositions α(E) with E ∈ S,
because there is no constructive procedure for obtaining each element of A(d2)
starting from elements of S. Still, we know that, for every σ ∈ Σ, the measure
µ^α_σ is uniquely determined by its values on S (this follows from 6.1.18, from the
uniqueness asserted in 7.3.1A, and from the uniqueness asserted in 7.3.2 for a
σ-finite premeasure); in this respect, the operational grounds found above for the
propositions α(E) with E ∈ S provide operational grounds for the probability
measures µ^α_σ.
(c) Suppose that we have an R2 -valued observable α and a function ϕ : Dϕ → R
such that Dϕ ∈ A(d2 ), Pα (R2 − Dϕ ) = OH , ϕ is A(d2 )Dϕ -measurable. We can
define the observable ϕ(α) (cf. 19.1.13, 19.1.14, 19.3.9), which is supported by
the same measuring instrument that defines α: if a measurement of α yields the
result (x1 , x2 ) ∈ R2 then we attribute the result ϕ(x1 , x2 ) to ϕ(α). Consider
now the two compatible observables α1 and α2 that are related to α as above:
either α1 and α2 are obtained from α as in remark a, or α is obtained from α1
and α2 as in remark b. Then the observable ϕ(α) can be considered a function
of α1 and α2 : if a “simultaneous measurement” of α1 and α2 brings out the
pair of results x1 , x2 then the result (x1 , x2 ), as an element of R2 , is assigned
to α and hence the result ϕ(x1 , x2 ) is assigned to ϕ(α). For this reason, the
observable ϕ(α) is also called the function of α1 , α2 according to ϕ and denoted
by the symbol ϕ(α1 , α2 ). Thus ϕ(α1 , α2 ) := ϕ(α) and we have
P Aϕ(α1 ,α2 ) (E) = P Aϕ(α) (E) = Pα (ϕ−1 (E)) = P ϕ(Aα1 ,Aα2 ) (E), ∀E ∈ A(dR )
(cf. the proof of 19.3.9, 17.1.11, 15.2.7, noticing that the relation between the
pairs of commuting self-adjoint operators Aα1 , Aα2 and the projection valued
measure Pα is the same as the one between the pair A1 , A2 and P in 17.1.10b),
and hence (cf. 15.2.2) Aϕ(α1 ,α2 ) = ϕ(Aα1 , Aα2 ). This extends the function
preserving property of the representation of observables by self-adjoint operators
that was noted in 19.3.9.
Suppose in particular that we have two compatible observables α1 and α2 ,
that we have ideal implementations of all the propositions in the ranges of α1
and α2 , and that we wish to define, using the measuring instruments that are
represented by α1 and α2 , a new observable to which the result x1 + x2 (or
x1 x2 ) is assigned when the “simultaneous” results x1 and x2 are obtained for
α1 and α2 respectively. Then, from what we saw above and from 17.1.12 it
follows that this new observable is represented by the self-adjoint extension of
the essentially self-adjoint operator A1 +A2 (or A1 A2 ), which actually coincides
with A1 + A2 (or A1 A2 ) whenever A2 is bounded.
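As a finite-dimensional sketch of the last point (our own matrices, with both operators bounded so no domain or extension issues arise): for commuting self-adjoint operators, the observable that returns x1 + x2 is represented by the sum, whose spectrum consists of sums of simultaneous eigenvalues:

```python
import numpy as np

A1 = np.diag([1.0, 1.0, 2.0])
A2 = np.diag([5.0, 7.0, 7.0])

assert np.allclose(A1 @ A2, A2 @ A1)   # compatibility: the operators commute

S = A1 + A2                            # represents the "x1 + x2" observable
print(np.linalg.eigvalsh(S))           # [6. 8. 9.] = {1+5, 1+7, 2+7}
```

In infinite dimensions with A2 unbounded this sum must be replaced by the self-adjoint extension discussed above; the finite case only illustrates the spectral bookkeeping.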
19.5.12 Remark. The results of 19.5.11a,b are based on the equivalence between
conditions a and b in 17.1.10, and can be summarised as follows: two observables
α1 and α2 are compatible if and only if there exists an R2 -valued observable α such
that
α1 (E) = α(E × R) and α2 (E) = α(R × E), ∀E ∈ A(dR )
(actually, for the “only if” part we have to assume that there are ideal implementa-
tions of all the propositions in the ranges of α1 and α2 ). This gives, in our opinion,
a nice characterization of the compatibility of two observables.
However, in standard quantum mechanics textbooks, the only X-valued observables
that are considered are observables (i.e., R-valued ones). Now, it is possible to give
a characterization of the compatibility of two observables in which only observables are used.
This is accomplished on the basis of the equivalence between conditions a and c in
17.1.10. Indeed, if two observables α1 and α2 are functions of an observable β, then
by 19.3.9 the self-adjoint operators Aα1 and Aα2 are functions of the self-adjoint
operator Aβ , and hence Aα1 and Aα2 commute by c ⇒ a in 17.1.10, and hence
α1 and α2 are compatible by 19.5.10. If conversely two observables α1 and α2 are
compatible, then the self-adjoint operators Aα1 and Aα2 commute by 19.5.10, and
hence there are a self-adjoint operator B and two functions ϕi so that Aαi = ϕi (B)
for i = 1, 2, by a ⇒ c in 17.1.10; now, it would be hard to give in general an
operational meaning (as instead we did for the mapping α in 19.5.11b) to the map-
ping β : A(dR ) → Π which is defined by letting β(E) be the proposition such that
Pβ(E) = P B (E), for all E ∈ A(dR ); this is due to the fact that the construction of
the projection valued measure P B out of the projection valued measures P A1 and
P A2 , in the proof of 17.1.9, is utterly abstract (whereas condition b in 17.1.10 relates
directly the projection valued measure P to the projection valued measures P A1
and P A2 ); however, every self-adjoint operator is taken to represent an observable
in standard quantum mechanics textbooks, and hence according to their rules we
can say that there exists an observable β which is represented by the self-adjoint
for i = 1, 2 (cf. 15.3.8 and 19.1.13). Thus, within the rules of standard quantum
mechanics textbooks, two observables α1 and α2 are compatible if and only if there
exists an observable β of which both α1 and α2 are functions.
19.5.13 Proposition. For a proposition π ∈ Π, a discrete observable α, a state
σ ∈ Σ, we denote by p(π, α, σ) the probability that π is true in a copy which is
produced by an ideal measurement of α with any result, carried out in a copy initially
prepared in the state σ. Thus, p(π, α, σ) is the theoretical prediction of the relative
frequency of π being found true in an ensemble of copies which, after being prepared
in σ, have gone through an ideal measurement of α without being selected according
to any particular set of results for α.
The following conditions are equivalent:
(a) p(π, α, σ) = p(π, σ), ∀σ ∈ Σ;
(b) π and α(E) are compatible, ∀E ∈ A(dR ).
Proof. Let {(λn , Pn )}n∈I be the family related to the self-adjoint operator Aα as
in 15.3.4B with A := Aα (cf. 19.3.10c). From 19.4.12a we see that, for every σ ∈ Σ,
p(π, α, σ) = p(π, σ′′) = tr(Pπ Wσ′′) = Σn∈I tr(Pπ Pn Wσ Pn)
(the third equality follows from 18.3.4c).
We prove now the equivalence between conditions a and b.
a ⇒ b: Assuming condition a, we have in particular
p(π, α, σ) = p(π, σ), ∀σ ∈ Σ0 ,
which is equivalent to
Σn∈I tr(Pπ Pn Au Pn) = tr(Pπ Au), ∀u ∈ H̃.
We note that, if I is infinite, the series Σn∈I Pn Pπ Pn f is convergent for each f ∈ H
by 10.4.7b, since
(Pi Pπ Pi f |Pj Pπ Pj f) = (Pπ Pi f |Pi Pj Pπ Pj f) = 0 if i ≠ j,
∥Pn Pπ Pn f∥ ≤ ∥Pn f∥ (cf. 13.1.3d),
Σn∈I ∥Pn f∥² < ∞ (cf. 13.2.8);
thus, we can define the operator
Σn∈I Pn Pπ Pn : H → H, f ↦ (Σn∈I Pn Pπ Pn) f := Σn∈I Pn Pπ Pn f.
Then we have
(u | (Σn∈I Pn Pπ Pn) u) = Σn∈I (u |Pn Pπ Pn u) = Σn∈I tr(Pn Pπ Pn Au)
= Σn∈I tr(Pπ Pn Au Pn) = tr(Pπ Au) = (u |Pπ u), ∀u ∈ H̃,
where we have used 18.2.11c and 18.3.12 (note that Pn Pπ ∈ P(H) for each n ∈ I
by 13.2.1, and that (Pi Pπ)(Pk Pπ) = Pi Pk Pπ = OH if i ≠ k) and the equality
Σn∈I Pn = 1H (cf. 15.3.4B).
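The equality p(π, α, σ) = Σn tr(Pπ Pn Wσ Pn) and its relation to compatibility can be probed numerically; the projections and the state below are our own toy choices, and only a single state is tested here (condition a of 19.5.13 quantifies over all states):

```python
import numpy as np

def proj(v):
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]   # eigenprojections Pn of Aα
W = proj([1.0, 1.0])                              # a pure state Wσ

P_compatible = np.diag([1.0, 0.0])                # commutes with every Pn
P_incompatible = proj([1.0, 1.0])                 # does not

for Ppi in (P_compatible, P_incompatible):
    lhs = sum(np.trace(Ppi @ Pn @ W @ Pn) for Pn in P)   # p(π, α, σ)
    rhs = np.trace(Ppi @ W)                              # p(π, σ)
    print(round(lhs, 3), round(rhs, 3))
# 0.5 0.5   (compatible: the unrecorded measurement does not disturb π)
# 0.5 1.0   (incompatible: it does)
```

For the incompatible proposition, the unrecorded measurement of α destroys the coherence that made π certain in the state σ.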
19.5.14 Corollary. For a discrete observable α and any observable β, the following
conditions are equivalent:
Proof. The result follows immediately from 19.5.13 and the definition of compati-
bility for two observables.
19.5.16 Remark. Let α be a discrete observable and let {(λn , Pn )}n∈I be the
family related to the self-adjoint operator Aα as in 15.3.4B with A := Aα (cf.
19.3.10c). Suppose that the projection Pn is one-dimensional, i.e. that there
exists un ∈ H̃ such that Pn = Aun , for each n ∈ I, and that we have a pro-
cedure for carrying out a first kind measurement of α. If a first kind measure-
ment of α is made in a copy of the system prepared in a state σ ∈ Σ and if
the result λn is obtained (the elements of {λn }n∈I are the only numbers that
can be obtained as results, since P Aα (R − {λn }n∈I ) = OH and this implies
p(α(R − {λn }n∈I ), σ) = tr(P Aα (R − {λn }n∈I )Wσ ) = 0), then immediately after the
measurement we have a copy in the pure state represented by the ray [un ], whatever
the state σ was; this follows from 19.4.3e, since Pα({λn }) = P Aα ({λn }) = Pn (cf.
15.3.4B). We also note that, for i ≠ j, Pi Pj = OH implies (ui |uj) = 0, and also
that
f = P Aα(R)f = Σn∈I Pn f = Σn∈I (un |f) un, ∀f ∈ H
(cf. 15.3.4B). This proves that the family {un }n∈I is a c.o.n.s. in H (cf. 10.6.4).
Thus, if we have a discrete observable α such that the self-adjoint operator
Aα has one-dimensional eigenspaces and a procedure for a first kind measurement
of α, we actually have a procedure for preparing pure states, and a great deal of
them (one for each element of a c.o.n.s. in H). However, observables with these
characteristics are seldom available. More often, their function in preparing pure
states is fulfilled by a set of observables with the features specified in 19.5.17, as is
explained in 19.5.18.
19.5.17 Definition. Let {α1, α2, ..., αℓ} be a finite family of discrete observables
and, for k = 1, 2, ..., ℓ, let {(λ^k_n, P^k_n)}n∈Ik be the family associated with the
self-adjoint operator Aαk as the family {(λn, Pn)}n∈I was associated with the
self-adjoint operator A in 15.3.4B. The family {α1, α2, ..., αℓ} is said to be a
complete set of compatible observables if the observables of the family are pairwise
compatible and if the projection P^1_n1 P^2_n2 · · · P^ℓ_nℓ is either one-dimensional
or the operator OH, for all (n1, n2, ..., nℓ) ∈ I1 × I2 × · · · × Iℓ (the operator
P^1_n1 P^2_n2 · · · P^ℓ_nℓ is a projection by 19.5.10, 17.1.14, 13.2.1).
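A toy complete set of compatible observables on C^4 can be built explicitly (our own construction, not from the text): each observable alone has two-dimensional eigenspaces, yet every product of eigenprojections is one-dimensional:

```python
import numpy as np

# Eigenprojections of two commuting discrete observables on C^4.
P1 = [np.diag([1.0, 1.0, 0.0, 0.0]), np.diag([0.0, 0.0, 1.0, 1.0])]  # for α1
P2 = [np.diag([1.0, 0.0, 1.0, 0.0]), np.diag([0.0, 1.0, 0.0, 1.0])]  # for α2

# Pairwise compatibility: all the eigenprojections commute.
assert all(np.allclose(Pa @ Pb, Pb @ Pa) for Pa in P1 for Pb in P2)

ranks = sorted(int(np.linalg.matrix_rank(Pa @ Pb)) for Pa in P1 for Pb in P2)
print(ranks)   # [1, 1, 1, 1]: every joint eigenprojection is one-dimensional
```

Neither observable alone singles out a pure state, but a joint result (n1, n2) does, which is the point of the definition.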
19.5.18 Remark. Let the family {α1 , α2 , ..., αℓ } be as in 19.5.17, and suppose
that it is a complete set of compatible observables. Suppose further that proce-
dures are available for performing ideal measurements of all observables αk . If ideal
measurements are made for all observables αk , one immediately after the other
in whichever order, in a copy of the system initially prepared in whatever state
σ, and if λ^1_n1, λ^2_n2, ..., λ^ℓ_nℓ are the results obtained, then immediately after
the ℓ measurements we have a copy which is in the pure state represented by the ray
[un1,n2,...,nℓ] if P^1_n1 P^2_n2 · · · P^ℓ_nℓ = Aun1,n2,...,nℓ. Indeed, reasoning as in
19.5.1a we see that the probability of obtaining the results λ^1_n1, λ^2_n2, ..., λ^ℓ_nℓ
was, before the measurements, tr(P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ); thus, if the results
λ^1_n1, λ^2_n2, ..., λ^ℓ_nℓ have actually been obtained then P^1_n1 P^2_n2 · · · P^ℓ_nℓ ≠ OH
and hence the projection P^1_n1 P^2_n2 · · · P^ℓ_nℓ is one-dimensional; then, reasoning as
in the proof of 19.5.4 we see that after the ℓ measurements we have a copy which is
in the state represented by the statistical operator
(1/tr(P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ)) P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ P^1_n1 P^2_n2 · · · P^ℓ_nℓ,
which is the same as Aun1,n2,...,nℓ, for whatever state σ such that
tr(P^1_n1 P^2_n2 · · · P^ℓ_nℓ Wσ) ≠ 0 (cf. 19.4.5c). This gives us a method for preparing
pure states, one for each element of a c.o.n.s. in H. To see this, define
J := {(n1, n2, ..., nℓ) ∈ I1 × I2 × · · · × Iℓ : P^1_n1 P^2_n2 · · · P^ℓ_nℓ ≠ OH}
and let un1,n2,...,nℓ ∈ H̃ be such that P^1_n1 P^2_n2 · · · P^ℓ_nℓ = Aun1,n2,...,nℓ for
(n1, n2, ..., nℓ) ∈ J. The condition P^k_nk P^k_n′k = OH if nk ≠ n′k (cf. 15.3.4B)
implies that
(un1,n2,...,nℓ | un′1,n′2,...,n′ℓ) = 0 if (n1, n2, ..., nℓ) ≠ (n′1, n′2, ..., n′ℓ);
moreover, the condition 1H = P Aαk(R) = Σnk∈Ik P^k_nk (cf. 15.3.4B) implies that
f = Σn1∈I1 Σn2∈I2 · · · Σnℓ∈Iℓ P^1_n1 P^2_n2 · · · P^ℓ_nℓ f
  = Σ(n1,n2,...,nℓ)∈J (un1,n2,...,nℓ | f) un1,n2,...,nℓ, ∀f ∈ H.
This proves that the family {un1,n2,...,nℓ}(n1,n2,...,nℓ)∈J is a c.o.n.s. in H (cf. 10.6.4).
19.5.19 Proposition. Let α and β be two observables and σ a state in which both
α and β are evaluable, and let {un }n∈I and {wn }n∈I be as in 19.3.13c so that
Wσ f = Σn∈I wn Aun f for all f ∈ H. Then:
un ∈ DAα ∩ DAβ, ∀n ∈ I;
∆σα ∆σβ ≥ (1/2) Σn∈I wn |(Aα un |Aβ un) − (Aβ un |Aα un)|.
If in particular σ is a pure state, then:
uσ ∈ DAα ∩ DAβ and ∆σα ∆σβ ≥ (1/2) |(Aα uσ |Aβ uσ) − (Aβ uσ |Aα uσ)|.
Proof. From 19.3.13c we have un ∈ DAα ∩ DAβ for all n ∈ I. For the product
∆σ α∆σ β we have
∆σα ∆σβ = (Σn∈I wn ∥Aα un − ⟨α⟩σ un∥²)^(1/2) (Σn∈I wn ∥Aβ un − ⟨β⟩σ un∥²)^(1/2)
≥ Σn∈I wn ∥Aα un − ⟨α⟩σ un∥ ∥Aβ un − ⟨β⟩σ un∥;
the equality follows from 19.3.13c and the inequality is the Schwarz inequality in
CN if I contains N elements or in ℓ2 if I is denumerable (cf. 10.3.8c,d; if I is
denumerable, the sequences {√wn ∥Aα un − ⟨α⟩σ un∥} and {√wn ∥Aβ un − ⟨β⟩σ un∥}
are elements of ℓ2, cf. 19.3.13c). Further, for each n ∈ I we have (using the fact
that the operators Aα and Aβ are symmetric, cf. 12.4.3c)
∥Aα un − ⟨α⟩σ un∥ ∥Aβ un − ⟨β⟩σ un∥
≥ |(Aα un − ⟨α⟩σ un |Aβ un − ⟨β⟩σ un)|
≥ |Im (Aα un − ⟨α⟩σ un |Aβ un − ⟨β⟩σ un)|
= (1/2) |(Aα un − ⟨α⟩σ un |Aβ un − ⟨β⟩σ un) − (Aβ un − ⟨β⟩σ un |Aα un − ⟨α⟩σ un)|
= (1/2) |(Aα un |Aβ un) − (Aβ un |Aα un)|.
Thus, the first part of the statement is proved. The second part follows immediately
from the first since Wσ = Auσ if σ is a pure state.
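The pure-state inequality of 19.5.19 can be checked numerically in a finite-dimensional toy model. The following sketch (the matrices and the state vector are illustrative choices, not taken from the text) uses the Pauli matrices on \(\mathbb{C}^2\) in the roles of \(A_\alpha\) and \(A_\beta\):

```python
import numpy as np

# Toy finite-dimensional check of 19.5.19 for a pure state.
# Illustrative choices: A, B are Pauli matrices; u is a normalized vector.
A = np.array([[0, 1], [1, 0]], dtype=complex)     # sigma_x
B = np.array([[0, -1j], [1j, 0]], dtype=complex)  # sigma_y
u = np.array([1.0, 0.0], dtype=complex)           # normalized vector u_sigma

def uncertainty(Op, u):
    # Delta = ||Op u - <Op> u|| for the pure state represented by u
    mean = np.vdot(u, Op @ u).real
    return np.linalg.norm(Op @ u - mean * u)

lhs = uncertainty(A, u) * uncertainty(B, u)
rhs = 0.5 * abs(np.vdot(A @ u, B @ u) - np.vdot(B @ u, A @ u))
assert lhs >= rhs - 1e-12  # the uncertainty inequality holds
```

For this particular choice both sides equal 1, so the bound is attained with equality.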
19.5.20 Corollary. Let α and β be two observables and σ a state in which both α
and β are evaluable, and also such that Aα Aβ Wσ ∈ T (H) and Aβ Aα Wσ ∈ T (H).
Then:
\[
[A_\alpha, A_\beta] W_\sigma \in T(H) \quad\text{and}\quad \Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \left| \operatorname{tr}([A_\alpha, A_\beta] W_\sigma) \right|.
\]
If in particular σ is a pure state, the above conditions for σ are equivalent to the
one condition uσ ∈ D[Aα ,Aβ ] and, if they are fulfilled, the following inequality holds:
\[
\Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \left| \left(u_\sigma \middle| [A_\alpha, A_\beta] u_\sigma\right) \right|.
\]
Proof. Let {un }n∈I and {wn }n∈I be as in 19.5.19, with {un }n∈I an o.n.s. in H
(cf. 18.3.2c); then un ∈ RWσ for each n ∈ I. Since DAα Aβ Wσ = DAβ Aα Wσ = H,
we have Aβ un ∈ DAα and Aα un ∈ DAβ , and hence un ∈ D[Aα ,Aβ ] for each n ∈ I.
Then, since the operators Aα and Aβ are symmetric, from 19.5.19 we obtain
\[
\Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \sum_{n \in I} w_n \left| \left(A_\alpha u_n \middle| A_\beta u_n\right) - \left(A_\beta u_n \middle| A_\alpha u_n\right) \right|
\ge \frac{1}{2} \left| \sum_{n \in I} w_n \left(u_n \middle| [A_\alpha, A_\beta] u_n\right) \right|.
\]
If in particular σ is a pure state, then uσ ∈ D[Aα ,Aβ ] and
\[
\Delta_\sigma\alpha\,\Delta_\sigma\beta \ge \frac{1}{2} \left| \left(u_\sigma \middle| [A_\alpha, A_\beta] u_\sigma\right) \right|.
\]
In the general case, from 18.2.4a,b we have [Aα , Aβ ]Wσ ∈ T (H), and we can com-
pute tr([Aα , Aβ ]Wσ ) by means of a c.o.n.s. in H which contains {un }n∈I (cf. 10.7.3);
then we have
\[
\operatorname{tr}([A_\alpha, A_\beta] W_\sigma) = \sum_{n \in I} \left(u_n \middle| [A_\alpha, A_\beta] W_\sigma u_n\right) = \sum_{n \in I} w_n \left(u_n \middle| [A_\alpha, A_\beta] u_n\right).
\]
Finally, if σ is a pure state and uσ ∈ D[Aα ,Aβ ] , then uσ ∈ DAα ∩ DAβ and hence
both α and β are evaluable in σ (cf. 19.3.13d); moreover,
Aα Aβ Wσ f = (uσ |f ) Aα Aβ uσ and Aβ Aα Wσ f = (uσ |f ) Aβ Aα uσ , ∀f ∈ H,
and this proves that \(A_\alpha A_\beta W_\sigma \in T(H)\) and \(A_\beta A_\alpha W_\sigma \in T(H)\). Indeed, if \(A_\alpha A_\beta u_\sigma \ne 0_H\) then \(A_\alpha A_\beta W_\sigma = \lambda A_{u,v}\) with \(\lambda := \|A_\alpha A_\beta u_\sigma\|\), \(u := u_\sigma\), \(v := \lambda^{-1} A_\alpha A_\beta u_\sigma\), and hence \(A_\alpha A_\beta W_\sigma \in T(H)\) in view of 18.2.15; and similarly for \(A_\beta A_\alpha W_\sigma\).
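The trace form of the bound in 19.5.20 can likewise be checked on a genuinely mixed statistical operator in a toy model (all matrices below are illustrative choices, not from the text):

```python
import numpy as np

# Toy check of 19.5.20: Delta*Delta >= (1/2)|tr([A,B] W)| for a mixed W.
A = np.array([[0, 1], [1, 0]], dtype=complex)     # sigma_x
B = np.array([[0, -1j], [1j, 0]], dtype=complex)  # sigma_y
W = np.diag([0.75, 0.25]).astype(complex)         # positive, trace one

def spread(Op, W):
    # Delta_sigma = sqrt(tr(Op^2 W) - tr(Op W)^2) for the state with operator W
    mean = np.trace(Op @ W).real
    return np.sqrt(np.trace(Op @ Op @ W).real - mean ** 2)

comm = A @ B - B @ A
lhs = spread(A, W) * spread(B, W)
rhs = 0.5 * abs(np.trace(comm @ W))
assert lhs >= rhs - 1e-12
```

Here lhs = 1 and rhs = 1/2, so the inequality is strict for this mixture.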
Proof. Since β is bounded, β is evaluable in every state (cf. 19.3.15c), the operator
Aβ is bounded, and DAβ = H (cf. 19.3.10b). For each pure state σ ∈ Σ0 , in view
of 19.3.13d we have
|hβiσ | = | (uσ |Aβ uσ ) | ≤ kAβ uσ k
by the Schwarz inequality, and hence (cf. 4.2.5b)
∆σ β = kAβ uσ − hβiσ uσ k ≤ 2kAβ uσ k ≤ 2kAβ k.
If kAβ k = 0 then we have ∆σ β = 0 and hence ∆σ α∆σ β = 0 for each state σ ∈ Σ0
in which \(\alpha\) is evaluable. Assuming \(\|A_\beta\| \ne 0\), 19.3.16 implies that for every \(\varepsilon > 0\) there exists a pure state \(\sigma_\varepsilon \in \Sigma_0\) such that \(\alpha\) is evaluable in \(\sigma_\varepsilon\) and \(\Delta_{\sigma_\varepsilon}\alpha < \frac{\varepsilon}{2\|A_\beta\|}\), and hence such that
\[
\Delta_{\sigma_\varepsilon}\alpha\,\Delta_{\sigma_\varepsilon}\beta < \frac{\varepsilon}{2\|A_\beta\|}\, 2\|A_\beta\| = \varepsilon.
\]
Proof. First we notice that, for every self-adjoint operator \(A\) in \(H\), condition sa-ug in 16.1.6 and the continuity of the inner product imply that, for \(g \in H\) and \(f \in D_A\), the function \(\mathbb{R} \ni t \mapsto \left(g \middle| U^A_f(t)\right)\) is differentiable at 0 and
\[
\frac{d}{dt}\left(g \middle| U^A_f(t)\right)\Big|_0 = (g|iAf);
\]
and similarly \(\frac{d}{dt}\left(U^A_f(t) \middle| g\right)\big|_0 = (iAf|g)\).
For \(f \in D_{A_\alpha} \cap D_{A_\beta}\), from 19.5.10 and 17.1.7 we have
\[
\left(U^{A_\alpha}_f(-t) \middle| A_\beta f\right) = \left(U^{A_\alpha}(-t) f \middle| A_\beta f\right)
= \left(f \middle| U^{A_\alpha}(t) A_\beta f\right) = \left(f \middle| A_\beta U^{A_\alpha}(t) f\right) = \left(A_\beta f \middle| U^{A_\alpha}_f(t)\right)
\]
(recall that \(U^{A_\alpha}(-t) = U^{A_\alpha}(t)^{-1} = U^{A_\alpha}(t)^\dagger\), cf. 16.1.1), and hence
\[
(-iA_\alpha f | A_\beta f) = \frac{d}{dt}\left(U^{A_\alpha}_f(-t) \middle| A_\beta f\right)\Big|_0 = \frac{d}{dt}\left(A_\beta f \middle| U^{A_\alpha}_f(t)\right)\Big|_0 = (A_\beta f | iA_\alpha f).
\]
19.5.23 Proposition. Let α1 and α2 be two compatible observables. Then for each
possible result λ1 for α1 and each ε > 0 there exist a possible result λ2 for α2 and
a pure state σε ∈ Σ0 so that
αk is evaluable in σε , |hαk iσε − λk | < ε, ∆σε αk < 2ε, for k = 1, 2.
Proof. Everything follows from 17.1.13 and 19.5.10 since, for each observable \(\alpha\): \(\sigma(A_\alpha)\) is the set \(\operatorname{sp}\alpha\) of all possible results for \(\alpha\) (cf. 19.3.10a); \(\alpha\) is evaluable in a pure state \(\sigma \in \Sigma_0\) if and only if \(u_\sigma \in D_{A_\alpha}\) (cf. 19.3.13d); and, if \(\alpha\) is evaluable in a pure state \(\sigma \in \Sigma_0\), then \(\langle\alpha\rangle_\sigma = \langle A_\alpha\rangle_{u_\sigma}\) and \(\Delta_\sigma\alpha = \Delta_{u_\sigma} A_\alpha\) (cf. 19.3.13d).
19.5.24 Remarks.
(a) We saw in 19.3.16 that, for each observable α, the uncertainty ∆σ α can be made
arbitrarily small by a suitable choice of the state σ. One can wonder if a similar
possibility exists for two observables α and β, i.e. if the following proposition
is true
P : ∀ε > 0, ∃σε ∈ Σ so that α and β are evaluable in σε and ∆σε α∆σε β < ε.
We must emphasize the fact that, whether proposition P is true or not, for
any state σ the product ∆σ α∆σ β has for us only the statistical meaning that is
based on the interpretation of ∆σ α as the theoretical prediction of the standard
deviation of the results obtained when measuring an observable α in a large
number of copies all prepared in σ (cf. 19.1.22a). In particular, considering the
product ∆σ α∆σ β does not imply for us any idea of carrying out measurements
of α and of β in the same copies of the quantum system. In fact, an experimental
test for the value of ∆σ α∆σ β rests on measuring α in a large collection of copies
and hence (cf. 19.5.10) the self-adjoint operators Aα and Aβ can fail to commute
(in the sense defined in 17.1.5), but nonetheless be such that [Aα , Aβ ] ⊂ OH ,
with a mathematically very meaningful domain D[Aα ,Aβ ] to boot (cf. 17.1.8).
It must be granted that, if α and β are bounded, then α and β are compatible
if and only if [Aα , Aβ ] = OH (cf. 19.3.10b, 19.5.10, 17.1.6a); but in this case
19.5.20 is of no real use since in this case the truthfulness of proposition P is
assured by 19.5.21. What is true in general is that if α and β are compatible
then [Aα , Aβ ] ⊂ OH (cf. 19.5.10 and 17.1.7h), but it would be sensible to use
this fact together with 19.5.20 only if we did not have the stronger result of
19.5.22, which shows that for compatible α and β the result of 19.5.19 does not
exert any constraint on ∆σ α∆σ β for any state σ in which α and β are evalu-
able (without the additional condition on σ that we should need if we were to
use 19.5.20). Actually, 19.5.23 shows that if α and β are compatible then an
even stronger proposition than proposition P is true. We point out that, while
for the results previously obtained about the compatibility of two observables we
had to assume that an ideal measurement was available for at least one of them
(cf. 19.5.13 and 19.5.14), this assumption is not required in 19.5.23. We notice
that the result of 19.5.23 holds trivially for every pair of classical observables;
indeed, in the classical case, for each microstate s ∈ S we have ∆s α = 0 for
each observable α (cf. 19.2.6a). Thus, two compatible quantum observables ex-
hibit once again a behaviour similar to the one they would display if they were
any pair of classical observables. The behaviour of two compatible quantum
observables is not in general equal, but only similar to the one of two classical
observables because we do not assume that for every quantum observable α
and for every possible result λ for α there is a state σ such that hαiσ = λ and
∆σ α = 0 (in our treatment of quantum mechanics, there is such a state if and
only if \(\sigma_p(A_\alpha) \ne \emptyset\) and \(\lambda \in \sigma_p(A_\alpha)\), cf. 19.3.21; there would be such a state
for every observable α and every λ ∈ σ(Aα ) if we admitted in our treatment
the absolute precision state preparations represented by elements which do not
belong to the Hilbert space that we mentioned in 19.3.12b).
(g) An observable α is discrete if and only if there exists a c.o.n.s. {vj }j∈J in H so
that, letting σj be the pure state such that uσj = vj , α is evaluable in σj and
∆σj α = 0, for all j ∈ J; this follows from 19.3.10c and from the fact that an
observable α is evaluable in a pure state σ and ∆σ α = 0 if and only if uσ is an
eigenvector of Aα (cf. 19.3.21). Thus, for a discrete observable there are many
pure states (one for each element of a c.o.n.s. in H) in which α behaves as a
classical observable does in a microstate.
Let α and β be discrete observables. Then α and β are compatible if and only
if there exists a c.o.n.s. {vj }j∈J in H so that, letting σj be the pure state such
that uσj = vj , α and β are evaluable in σj and ∆σj α = ∆σj β = 0 (which
is a stronger result than ∆σj α∆σj β = 0) for all j ∈ J (cf. 19.5.10, 17.1.14,
19.3.21). Thus, if two discrete observables are compatible then there are many
pure states (one for every element of a c.o.n.s. in H) in which both of them
behave as a pair of classical observables do in a microstate.
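The statement about compatible discrete observables can be illustrated in a finite-dimensional toy model: two commuting Hermitian matrices (an illustrative choice, standing in for \(A_\alpha\), \(A_\beta\)) admit a common orthonormal eigenbasis, and in each basis state both uncertainties vanish:

```python
import numpy as np

# Two commuting "discrete observables" (toy Hermitian matrices) and the
# common eigenbasis in whose states both uncertainties are zero.
A = np.diag([1.0, 2.0, 2.0])
B = np.diag([3.0, 3.0, 5.0])
assert np.allclose(A @ B, B @ A)  # compatibility (commutation)
for j in range(3):
    v = np.zeros(3)
    v[j] = 1.0                     # j-th vector of the common c.o.n.s.
    for Op in (A, B):
        mean = v @ Op @ v
        # Delta = ||Op v - <Op> v|| vanishes on an eigenvector
        assert np.linalg.norm(Op @ v - mean * v) < 1e-12
```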
19.6.1 Remark. For a given quantum system, let σ be a state preparation proce-
dure and suppose that it is carried out at a definite instant of time t0 . In all the
sections preceding, a copy prepared in σ was used in a second procedure (the deter-
mination of a proposition, the measurement of an observable, the passage through
a filter) which took place immediately after time t0 . However, at least in princi-
ple it is possible to wait for a positive time interval t before activating the second
procedure, and carry out this second procedure at time t0 + t; if this is done, the
second procedure takes place immediately after the new first procedure that can be
described as follows: perform procedure σ and wait for the time interval t. Now,
this new first procedure is not in general equivalent to the procedure σ. We assume
that this new first procedure is still a state preparation procedure, which we denote
by σt , and we say that the state σ at time t0 evolves into the state σt at time t0 + t.
We also assume that σt is a pure state whenever σ is a pure state; thus, we have
the mapping Γt defined by
Σ0 ∋ σ 7→ Γt (σ) := σt ∈ Σ0 .
In what follows we confine our attention to quantum systems for which Γt does
not depend on t0 but only on the time interval t (this was already anticipated by
the symbol Γt , where t0 does not appear); these systems are called conservative.
Also, we confine our attention to quantum systems for which the mapping Γt is a
bijection from Σ0 onto itself, for every positive t; these systems are called reversible.
Completely isolated quantum systems are experimentally seen to be conservative
and reversible. We denote the identity mapping of Σ0 by Γ0 and write Γ−t := (Γt )−1
for every positive t; for every pure state σ and any time t0 , if we prepare the state
Γ−t (σ) at time t0 − t and we wait until time t0 , then at time t0 we have a copy of
the system in the state σ. For every pair of positive t1 , t2 we have Γt1 ◦ Γt2 = Γt1 +t2 ;
this is simply due to the fact that waiting for the time interval t2 and then for the
time interval t1 is the same as waiting for the time interval t1 + t2 (and to the fact
that Γt depends only on the time interval t). Then, it is easy to prove that we have
Γt1 ◦ Γt2 = Γt1 +t2 for all t1 , t2 ∈ R. Further, we assume that, for every pair of
pure states σ1 , σ2 and every positive t, the transition probability (cf. 19.4.7b) from
Γt (σ1 ) to Γt (σ2 ) is the same as the transition probability from σ1 to σ2 ; indeed,
the probability that a copy prepared at time t0 in a pure state σ1 gets modified
(by the action of a suitable filter) to become as if it had been prepared at time t0
in another pure state σ2 is experimentally seen to be the same immediately after
t0 as at any later time. Since Γ−t = (Γt )−1 , this entails that the same is true for
negative t. Next, we assume that for every pure state σ the transition probability
from the pure state Γt (σ) to the pure state σ approaches one as t approaches zero;
the meaning of this continuity condition is obvious.
Now, since Γt is a bijection from the family Σ0 of pure states onto itself, in view
of the bijection between Σ0 and the projective Hilbert space Ĥ defined in 19.3.5c
there exists, for all t ∈ R, a unique mapping ωt : Ĥ → Ĥ which is a bijection from
Ĥ onto itself and also such that
[uΓt (σ) ] = ωt ([uσ ]), ∀σ ∈ Σ0 .
Since Γt preserves the transition probability between pure states, we have
\[
\tau(\omega_t([u_{\sigma_1}]), \omega_t([u_{\sigma_2}])) = \left| \left(u_{\Gamma_t(\sigma_1)} \middle| u_{\Gamma_t(\sigma_2)}\right) \right| = \left| (u_{\sigma_1} | u_{\sigma_2}) \right| = \tau([u_{\sigma_1}], [u_{\sigma_2}]), \quad \forall \sigma_1, \sigma_2 \in \Sigma_0, \forall t \in \mathbb{R},
\]
where τ is the function defined in 10.9.1; thus, ωt is an automorphism of the pro-
jective Hilbert space (Ĥ, τ ), for all t ∈ R (cf. 10.9.4). Moreover, the condition
Γt1 ◦ Γt2 = Γt1 +t2 , ∀t1 , t2 ∈ R,
is obviously equivalent to the condition
ωt1 ◦ ωt2 = ωt1 +t2 , ∀t1 , t2 ∈ R.
Furthermore, the continuity condition assumed above is obviously equivalent to the
condition that
the function R ∋ t 7→ τ ([u], ωt ([u])) ∈ [0, 1] is continuous at 0, ∀u ∈ H̃.
Therefore, in view of 16.4.5, the mapping
R ∋ t 7→ ωt ∈ Aut Ĥ
is a continuous one-parameter group of automorphisms. Consequently, in view of
16.4.11, there exists a continuous one-parameter unitary group U in H so that
ωt ([u]) = [U (t)u], ∀u ∈ H̃, ∀t ∈ R,
i.e. \(W_{\sigma_t} = U^H(t) W_\sigma U^H(t)^{-1}\). We point out that, although this equality has been obtained on the basis of a particular decomposition of \(\sigma\) into a mixture of pure states, there is no trace of that particular decomposition in the final result, as must be the case since that decomposition is not unique unless \(\sigma\) is a pure state (cf. 19.3.5b,c).
We also note that \(U^H(t) W_\sigma U^H(t)^{-1}\) is indeed a statistical operator by 18.3.2a. Now we notice that, for every positive \(t\), the mapping \(\Sigma \ni \sigma \mapsto \sigma_t \in \Sigma\) turns out to be a bijection from \(\Sigma\) onto itself, because the mapping of 19.3.1a is a bijection and the mapping \(W(H) \ni W \mapsto U^H(t) W U^H(t)^{-1} \in W(H)\) is a bijection from \(W(H)\) onto itself, as can easily be seen. Thus, for every state \(\sigma \in \Sigma\) and every positive \(t\) we can
define σ−t as the state that evolves into the state σ at any time t0 if it is prepared
at time \(t_0 - t\); clearly, we have
\[
W_{\sigma_{-t}} = U^H(t)^{-1} W_\sigma U^H(t) = U^H(-t) W_\sigma U^H(-t)^{-1}.
\]
Thus, we have
\[
W_{\sigma_t} = U^H(t) W_\sigma U^H(t)^{-1}, \quad \forall \sigma \in \Sigma, \forall t \in \mathbb{R}.
\]
This outcome of the assumptions above can be stated as the axiom below.
19.6.2 Axiom (Axiom Q3). There are quantum systems for which there exists
a self-adjoint operator H (in the Hilbert space in which the system is represented)
so that for every t0 ∈ R, every positive t, and every state σ ∈ Σ, a copy prepared at
time t0 in σ becomes after the time interval t the same as a copy prepared at time
\(t_0 + t\) in the state \(\sigma_t\) represented by the statistical operator
\[
W_{\sigma_t} := U^H(t) W_\sigma U^H(t)^{-1},
\]
and a copy prepared at time \(t_0 - t\) in the state \(\sigma_{-t}\) represented by the statistical operator
\[
W_{\sigma_{-t}} := U^H(-t) W_\sigma U^H(-t)^{-1}
\]
becomes after the time interval \(t\) the same as a copy prepared at time \(t_0\) in \(\sigma\).
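The evolution prescribed by axiom Q3 can be simulated directly in a finite-dimensional toy model (the matrix \(H\) and the statistical operator below are illustrative choices): \(U^H(t) = e^{itH}\) is built from the spectral decomposition of \(H\), and one can check the group property \(U^H(t_1)U^H(t_2) = U^H(t_1+t_2)\) and the preservation of the trace:

```python
import numpy as np

# Toy self-adjoint generator H and the one-parameter unitary group exp(itH).
H = np.array([[1.0, 0.5], [0.5, -1.0]])
vals, vecs = np.linalg.eigh(H)

def U(t):
    # U^H(t) = exp(itH) via the spectral decomposition of H
    return vecs @ np.diag(np.exp(1j * t * vals)) @ vecs.conj().T

W = np.diag([0.6, 0.4]).astype(complex)   # toy statistical operator
Wt = U(1.3) @ W @ U(1.3).conj().T          # W_{sigma_t}
assert abs(np.trace(Wt) - 1) < 1e-12       # still a statistical operator
assert np.allclose(U(0.7) @ U(0.6), U(1.3))  # group property
```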
19.6.3 Remarks.
(a) In 19.6.1 we proved that the assumptions made there implied axiom Q3, and it
is easy to see that axiom Q3 implies those assumptions (13.1.13b and 16.4.3a
must be used). Then, we see in particular that the quantum systems for which
axiom Q3 holds are the conservative and reversible quantum systems. In what
follows we consider only conservative and reversible quantum systems.
(b) The self-adjoint operator H of axiom Q3 is unique up to an additive multiple
of the unit operator. This is already clear from 19.6.1. In any case, to see it
directly, assume that H ′ is a self-adjoint operator which plays the same role as
\(H\) in axiom Q3. Then we have
\[
U^{H'}(t) W U^{H'}(t)^{-1} = U^H(t) W U^H(t)^{-1}, \quad \forall W \in W(H), \forall t \in \mathbb{R},
\]
and hence
\[
\omega_{U^H(t)} = \omega_{U^{H'}(t)}, \quad \forall t \in \mathbb{R}.
\]
From 16.4.3b we see that this implies that there exists \(k \in \mathbb{R}\) so that
\[
U^{H'}(t) = e^{ikt} U^H(t), \quad \forall t \in \mathbb{R},
\]
and hence \(H' = H + k 1_H\).
(c) The self-adjoint operator −H is called the Hamiltonian of the system, and
it is interpreted as the self-adjoint operator which represents the observable
“energy” of the system. This is consistent with its being unique only up to an
additive multiple of the unit operator, since physically the observable energy
of any system is only defined up to an additive constant (note that, for k ∈ R,
σ(H + k1H ) = σ(H) + k and σp (H + k1H ) = σp (H) + k, as is obvious from
15.2.4b and 15.2.5b; then, cf. 19.3.10a and 19.3.12a).
19.6.4 Remark. The relationship between a state and the state which at any time
evolves from it or has evolved into it, as implied by axiom Q3, is strictly causal, in
spite of the acausal character of quantum mechanics when it is referred to a single
copy of a system, as reflected in the impossibility of making more than statistical
statements about the results to be expected from determinations of propositions
or from measurements of observables. Thus, when referred to ensembles and not
to single copies, quantum mechanics is as deterministic as classical mechanics if
the quantum system is conservative and reversible, hence in particular if it is a
completely isolated system. An altogether different change of state happens when
there is state reduction (cf. 19.4.3b), produced by the interaction of copies of the
system with a filter or with a measuring instrument in a first kind measurement
(cf. 19.4.8 and 19.4.10). We point out that the number of copies in an ensemble
representing a state does not change in the time evolution of axiom Q3, while it
does in a state reduction.
19.6.5 Remark. For a quantum system whose time evolution is determined by a
self-adjoint operator H as in axiom Q3, for each state σ ∈ Σ activated at any time
t0 we can define the mapping R ∋ t 7→ σt ∈ Σ, which is called the trajectory of the
state σ. For a pure state σ, the trajectory of σ corresponds to the mapping
\[
\mathbb{R} \ni t \mapsto u_\sigma(t) := U^H_{u_\sigma}(t) = U^H(t) u_\sigma \in \tilde{H}
\]
(cf. 19.6.1; for \(U^H_{u_\sigma}\), cf. 16.1.1). Now, for \(u_\sigma \in D_H\) we have (cf. 16.1.5b)
\[
u_\sigma(t) \in D_H \quad\text{and}\quad \frac{d u_\sigma(t)}{dt} = i H u_\sigma(t), \quad \forall t \in \mathbb{R}.
\]
Thus, this is the condition that is obeyed by the pure states whose representatives
(as in 19.3.5c) are rays which lie in DH , and this is the abstract form of what is
known as the Schrödinger equation. In many specific cases, the Hilbert space H is
a space of equivalence classes of functions on Rn and H is a differential operator;
then uσ becomes a function (actually, an equivalence class of functions) and the
Schrödinger equation is often written as
\[
\frac{\partial u_\sigma}{\partial t}(x_1, \dots, x_n, t) = i H u_\sigma(x_1, \dots, x_n, t);
\]
however, the use of the symbol \(\frac{\partial u_\sigma}{\partial t}\) is misleading, since \(\frac{d u_\sigma(t)}{dt}\) has the meaning that is defined in 16.1.3 (with the limit taken with respect to the distance defined
in 10.1.15). Finding the continuous one-parameter unitary group U H is sometimes
dubbed “solving the Schrödinger equation”; however, it must be noted that the
trajectories of all states are known if U H is known, while only the trajectories of
the pure states represented by vectors in DH appear in the Schrödinger equation,
and it is physically impossible to have DH = H (cf. 19.3.11).
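The abstract Schrödinger equation can be checked numerically in a finite-dimensional sketch (the matrix \(H\) is an illustrative choice, and a central finite difference stands in for the limit of 16.1.3):

```python
import numpy as np

# Toy check that the trajectory u(t) = U^H(t) u0, with U^H(t) = exp(itH)
# built from the spectral decomposition of H, satisfies du/dt = iHu.
H = np.array([[2.0, 1.0], [1.0, -1.0]])
vals, vecs = np.linalg.eigh(H)

def u(t, u0):
    return vecs @ (np.exp(1j * t * vals) * (vecs.conj().T @ u0))

u0 = np.array([1.0, 0.0], dtype=complex)
t0, dt = 0.4, 1e-5
# central difference approximation of du/dt at t0
deriv = (u(t0 + dt, u0) - u(t0 - dt, u0)) / (2 * dt)
assert np.allclose(deriv, 1j * H @ u(t0, u0), atol=1e-6)
```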
19.6.6 Proposition. If the time evolution of a quantum system is determined by
a self-adjoint operator H as in axiom Q3 and the energy of the system is a discrete
observable, then for every pure state \(\sigma \in \Sigma_0\) we have
\[
u_\sigma(t) := U^H(t) u_\sigma = \sum_{n \in I} e^{it\lambda_n} P_n u_\sigma, \quad \forall t \in \mathbb{R},
\]
where \(\{\lambda_n\}_{n \in I}\) is the family of eigenvalues of \(H\) and \(\{P_n\}_{n \in I}\) the corresponding family of eigenprojections (cf. 15.3.4B).
Proof. From 19.3.10c we have that the conditions of 15.3.4B hold true for the self-
adjoint operator H, since −H represents the observable energy (cf. 19.6.3c). The
result then follows from 16.1.6 and from the explicit forms of the operator ϕ(A) in
15.3.4B.
19.6.7 Remark. The result of 19.6.6 shows why, if the energy of a quantum system
is a discrete observable, knowing the eigenvalues of H and a c.o.n.s. comprised of
eigenvectors of H allows one to “solve the Schrödinger equation”.
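A toy numerical illustration of 19.6.6 (the matrix \(H\) is an illustrative choice): the evolution computed from the spectral decomposition agrees with the sum over eigenvalues and eigenprojections:

```python
import numpy as np

# "Solving the Schroedinger equation" for a toy H with discrete spectrum:
# U^H(t) u = sum_n exp(i t lambda_n) P_n u, with P_n the eigenprojections.
H = np.array([[0.0, 1.0], [1.0, 0.0]])
vals, vecs = np.linalg.eigh(H)
u0 = np.array([1.0, 0.0], dtype=complex)
t = 0.9
expected = vecs @ np.diag(np.exp(1j * t * vals)) @ vecs.conj().T @ u0
via_projections = sum(
    np.exp(1j * t * lam) * np.outer(v, v.conj()) @ u0
    for lam, v in zip(vals, vecs.T)
)
assert np.allclose(expected, via_projections)
```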
19.6.10 Remark. The result of 19.6.9 shows why the point spectrum of the Hamil-
tonian of a quantum system is of interest: the eigenvectors represent stationary
states of the system. The typical reaction of an atom to outside stimuli is to transform its state from one stationary state to another, emitting light whose frequency is proportional to the difference between the corresponding eigenvalues.
Now, this implies condition b by the very definitions given in 19.1.21 and in 19.1.20.
b ⇒ c: This is obvious.
c ⇒ d: This is obvious.
d ⇒ e: Assume condition d. Recalling that for a pure state σ ∈ Σ0 we have
uσt = U H (t)uσ (cf. 19.6.1), from 19.3.13d we have
\[
U^H(t) u \in D_{A_\alpha} \quad\text{and}\quad \left(U^H(t) u \middle| A_\alpha U^H(t) u\right) = (u | A_\alpha u), \quad \forall t \in \mathbb{R}, \forall u \in \tilde{H} \cap D_{A_\alpha},
\]
which is equivalent to
\[
D_{A_\alpha} \subset D_{U^H(t)^{-1} A_\alpha U^H(t)} \quad\text{and}\quad \left(u \middle| U^H(t)^{-1} A_\alpha U^H(t) u\right) = (u | A_\alpha u), \quad \forall t \in \mathbb{R}, \forall u \in \tilde{H} \cap D_{A_\alpha}.
\]
19.6.13 Remark. The way of describing the time evolution of a conservative and
reversible quantum system that has been discussed in this section is called the
Schrödinger picture. There is a mathematically equivalent way of doing the same,
which can at times be useful for practical calculations.
For each proposition \(\pi \in \Pi\) and each state \(\sigma \in \Sigma\), we see that (cf. 18.2.11c)
\[
p(\pi, \sigma_t) = \operatorname{tr}(P_\pi U^H(t) W_\sigma U^H(t)^{-1}) = \operatorname{tr}(U^H(t)^{-1} P_\pi U^H(t) W_\sigma) = p(\pi_t, \sigma), \quad \forall t \in \mathbb{R},
\]
if we define πt as the proposition such that Pπt = U H (t)−1 Pπ U H (t) (this operator
is an orthogonal projection in view of 13.1.8). Similarly, if α is an observable, σ is
a state, and α is evaluable in σt for some t ∈ R, we see that (cf. 19.3.13a)
\[
\langle\alpha\rangle_{\sigma_t} = \operatorname{tr}(A_\alpha U^H(t) W_\sigma U^H(t)^{-1}) = \operatorname{tr}(A_{\alpha,t} W_\sigma),
\]
if we define \(A_{\alpha,t} := U^H(t)^{-1} A_\alpha U^H(t)\) (this operator is self-adjoint in view of
12.5.4c). This mathematical way of dealing with time evolution is called the Heisen-
berg picture.
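The agreement between the two pictures rests only on the cyclicity of the trace, \(\operatorname{tr}(A U W U^{-1}) = \operatorname{tr}((U^{-1} A U) W)\); a toy numerical sketch (all matrices are illustrative choices):

```python
import numpy as np

# Schroedinger picture (evolve the state) vs Heisenberg picture (evolve
# the observable): expectation values coincide by cyclicity of the trace.
H = np.array([[1.0, 0.3], [0.3, -0.5]])
vals, vecs = np.linalg.eigh(H)
U = vecs @ np.diag(np.exp(1j * 0.8 * vals)) @ vecs.conj().T  # U^H(0.8)
A = np.array([[2.0, 1j], [-1j, 0.0]])            # toy observable
W = np.diag([0.7, 0.3]).astype(complex)           # toy statistical operator
schrodinger = np.trace(A @ U @ W @ U.conj().T)
heisenberg = np.trace(U.conj().T @ A @ U @ W)
assert np.isclose(schrodinger, heisenberg)
```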
Chapter 20
Now, conditions a’ and c are equivalent in view of 16.3.1, and conditions b’ and c
are equivalent in view of 15.4.1. Thus, conditions a, b, c are equivalent. It can be
proved in a similar way that conditions a, d, e are equivalent.
\[
\mathbb{R} \ni t \mapsto \left(g \middle| U^T(t) f\right) \in \mathbb{C}
\]
is differentiable at 0 and
\[
\frac{d}{dt}\left(g \middle| U^T(t) f\right)\Big|_0 = i (g | T f),
\]
and similarly
\[
\frac{d}{dt}\left(U^T(-t) f \middle| g\right)\Big|_0 = i (T f | g).
\]
After this preliminary remark, now we give the proofs of statements a and b.
a: Condition a in 20.1.1 (cf. also 16.1.1) implies that
\[
\left(U^A(-t) f \middle| U^B(s) g\right) = e^{-its} \left(U^B(-s) f \middle| U^A(t) g\right), \quad \forall f, g \in H, \forall (s,t) \in \mathbb{R}^2,
\]
and hence, for \(f \in D_A \cap D_B\) (differentiating first with respect to \(s\) at 0 with \(g := f\), by the preliminary remark, and then with respect to \(t\) at 0),
\[
(A f | B f) = -i \frac{d}{dt}\left(U^A(-t) f \middle| B f\right)\Big|_0
= -i \frac{d}{dt}\left(-t \left(f \middle| U^A(t) f\right) + \left(B f \middle| U^A(t) f\right)\right)\Big|_0
= i (f | f) + (B f | A f),
\]
and hence
\[
(A f | B f) - (B f | A f) = i \|f\|^2.
\]
In view of 19.5.19 (the proof of 19.5.19 is actually effective for every pair of self-
adjoint operators and every statistical operator in which they are both computable;
cf. also 19.3.13a), this proves condition a.
b: Condition c in 20.1.1 is equivalent to
\[
U^A(t) B = (B - t 1_H) U^A(t), \quad \forall t \in \mathbb{R}
\]
(cf. 3.2.10b1). For all \(g \in D_B\) and \(f \in D_{AB-BA}\), this implies that
\[
\left(g \middle| U^A(t) B f\right) = \left(B g \middle| U^A(t) f\right) - t \left(g \middle| U^A(t) f\right), \quad \forall t \in \mathbb{R},
\]
Proof. The operators \(A^M\) and \(B^M\) are self-adjoint operators in the Hilbert space \(M\) (cf. 17.2.8). Moreover, the operators \(U^A(t)\) and \(U^B(t)\) are reduced by \(M\) for all \(t \in \mathbb{R}\), and the mappings
\[
\mathbb{R} \ni t \mapsto (U^A(t))^M \quad\text{and}\quad \mathbb{R} \ni t \mapsto (U^B(t))^M
\]
are continuous one-parameter unitary groups whose generators are \(A^M\) and \(B^M\) (cf. 17.2.13). Now, condition a in 20.1.1 implies obviously that
(a) The operator Q defined in 15.3.4A is a self-adjoint operator in the Hilbert space
L2 (R). The continuous one-parameter unitary group U Q is so that
is continuous for all [f ] ∈ L2 (R) (cf. 11.4.16). Then, if we denote by P the generator
of this continuous one-parameter group, we have
P = F −1 QF
by 16.3.1.
The set \(D\) is obviously a linear manifold in \(L^2(\mathbb{R})\) and \(D \subset D_Q\); moreover \(\overline{D} = L^2(\mathbb{R})\) in view of 11.3.3 and 10.6.5b (or in view of 11.4.19). The restriction \(Q_D\) of \(Q\) to \(D\) is a symmetric operator (cf. e.g. 12.4.3); moreover, for each \(\varphi \in S(\mathbb{R})\),
the two functions
R ∋ x 7→ ϕ± (x) := (x ± i)−1 ϕ(x) ∈ C
are obviously elements of S(R) and
(QD ± i1L2 (R) )[ϕ± ] = [ϕ];
in view of 12.4.17, this proves that the operator QD is essentially self-adjoint. There-
fore, the operator F −1 QD F is also essentially self-adjoint (cf. 12.5.4d). We see that
DF −1 QD F = {[f ] ∈ L2 (R) : F [f ] ∈ D}.
Now, for [f ] ∈ L2 (R), we have
F [f ] ∈ D ⇒ [∃[g] ∈ D s.t. F [f ] = [g]] ⇒ [∃[g] ∈ D s.t. [f ] = F −1 [g]] ⇒ [f ] ∈ D
and
[f ] ∈ D ⇒ F [f ] ∈ D
(cf. 11.4.6). Therefore,
DF −1 QD F = D.
Moreover we have
F −1 QD F [ϕ] = [(ξ ϕ̂)ˇ] = −i[((ϕ̂)ˇ)(1) ] = −i[ϕ′ ] = P0 [ϕ], ∀ϕ ∈ S(R)
(cf. 11.4.2 and 11.4.9). This proves that
F −1 QD F = P0 .
Then P0 is essentially self-adjoint (cf. 12.5.4d) and P is its unique self-adjoint
extension, since P is self-adjoint and QD ⊂ Q implies P0 ⊂ P (cf. 12.4.11c). This
concludes the proof of statement b.
c: For all \((s,t) \in \mathbb{R}^2\) and all \([f] \in L^2(\mathbb{R})\), we have
\[
(f_{-s})^t(x) = e^{itx} f(x+s) = e^{-its} e^{it(x+s)} f(x+s) = e^{-its} (f^t)_{-s}(x), \quad \forall x \in D_{f_{-s}},
\]
and hence
\[
U^Q(t) U^P(s) [f] = e^{-its} U^P(s) U^Q(t) [f].
\]
This proves statement c.
20.1.8 Remark. In view of 20.1.7c, 20.1.3b, 12.6.5, we know that either operator
P or Q or both operators P and Q must be non-bounded. Then both P and Q are
non-bounded, in view of their unitary equivalence (cf. 20.1.7b) and of 4.6.5b.
This can be proved more directly in the following way. The operator Q is not
bounded in view of 14.2.17 (cf. also Section 14.5 and 15.3.4A). Then the operator
P is not bounded because Q and P are unitarily equivalent.
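A standard trace argument, sketched numerically below, shows why no pair of operators on a finite-dimensional space can satisfy \([Q, P] = i 1_H\): the trace of any commutator of matrices vanishes, while \(\operatorname{tr}(i\,1) = in \ne 0\). The random matrices are purely illustrative:

```python
import numpy as np

# tr(QP - PQ) = 0 for any square matrices Q, P, while tr(i * I_n) = i*n != 0;
# hence the commutation relation [Q, P] = i*1 forces an infinite-dimensional
# space, consistent with the unboundedness of Q and P noted above.
rng = np.random.default_rng(0)
n = 4
Q = rng.standard_normal((n, n))
P = rng.standard_normal((n, n))
assert abs(np.trace(Q @ P - P @ Q)) < 1e-12
assert np.trace(1j * np.eye(n)) != 0
```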
Neumann proved (Neumann, 1931) what is now called the Stone–von Neumann
uniqueness theorem. This theorem was the real final proof of the equivalence of
Heisenberg’s and Schrödinger’s formulations of quantum mechanics.
In section 20.3 we use the Stone–von Neumann theorem in our discussion of the
position and linear momentum observables for a non-relativistic quantum particle.
There are various proofs of the Stone–von Neumann uniqueness theorem. The
proof we present here is von Neumann’s original one, mainly because this proof is
a nice opportunity to put into action several theorems we saw in previous chapters.
Some facts we use in the proof are collected as preliminary remarks in 20.2.2, part
of the proof is set forth as a lemma in 20.2.3, and the theorem is stated and proved
in 20.2.4. Before all that, in 20.2.1 we compute an integral which has an important
role in the proof of 20.2.3.
(d) Suppose that the operators A and B are reduced by a non-trivial subspace M of
H. Then the operator T is reduced by M and the restriction T M of T to M (cf.
17.2.1) is the same as the element T (M) of B(M ) which is defined by AM and
B M as T is defined by A and B in statement a (recall that the pair AM , B M is
a representation of WCR in the Hilbert space M , cf. 20.1.6).
(e) If the orthogonal dimension of the subspace RT is one, then the pair of self-
adjoint operators A, B is jointly irreducible, and hence the pair A, B is an ir-
reducible representation of WCR.
Then 10.5.6 implies that there exists a unique operator T ∈ B(H) such that
\[
(f | T g) = \int_{\mathbb{R}^2} \gamma(s,t) (f | W(s,t) g)\, dm_2(s,t), \quad \forall f, g \in H.
\]
We have, for all \(f, g \in H\),
\[
(T g | f) = \overline{(f | T g)} \overset{(2)}{=} \int_{\mathbb{R}^2} \overline{\gamma(s,t) (f | W(s,t) g)}\, dm_2(s,t)
\overset{(3)}{=} \int_{\mathbb{R}^2} \overline{\gamma(s,t)}\, (g | W(-s,-t) f)\, dm_2(s,t)
\overset{(4)}{=} \int_{\mathbb{R}^2} \overline{\gamma(-s,-t)}\, (g | W(s,t) f)\, dm_2(s,t) \overset{(5)}{=} (g | T f),
\]
where 2 holds true because complex conjugation commutes with integration (cf.
8.2.3), 3 holds true by 20.2.2b, 4 by 9.2.4b (with A(s, t) := (−s, −t)), 5 by the
equality γ(−s, −t) = γ(s, t). This proves that the operator T is self-adjoint (cf.
12.4.3).
Now we want to prove that \(T \ne O_H\). We assume to the contrary that \(T = O_H\) and we fix \(f, g \in H\). Then we have, for all \((s', t') \in \mathbb{R}^2\),
\[
0 = \left(f \middle| W(-s',-t') T W(s',t') g\right) \overset{(6)}{=} \left(W(s',t') f \middle| T W(s',t') g\right)
= \int_{\mathbb{R}^2} \gamma(s,t) \left(W(s',t') f \middle| W(s,t) W(s',t') g\right) dm_2(s,t)
\]
\[
\overset{(7)}{=} \int_{\mathbb{R}^2} \gamma(s,t)\, e^{i(st'-ts')} (f | W(s,t) g)\, dm_2(s,t)
\overset{(8)}{=} \int_{\mathbb{R}} e^{it's} \left( \int_{\mathbb{R}} e^{-is't} \gamma(s,t) (f | W(s,t) g)\, dm(t) \right) dm(s),
\]
where: 6 is obvious; 7 follows from 20.2.2a,b; 8 holds true by 8.4.10c since 1 implies
that the function
\[
\mathbb{R}^2 \ni (s,t) \mapsto \gamma(s,t)\, e^{i(st'-ts')} (f | W(s,t) g) \in \mathbb{C}
\]
is an element of L1 (R2 , A(d2 ), m2 ). For all s′ ∈ R, 1 and 11.4.7 imply that the
function
\[
\mathbb{R} \ni t \mapsto e^{-is't} \gamma(s,t) (f | W(s,t) g) \in \mathbb{C}
\]
is an element of L1 (R) for all s ∈ R; then we can define the function
\[
\mathbb{R} \ni s \mapsto \varphi_{s'}(s) := \int_{\mathbb{R}} e^{-is't} \gamma(s,t) (f | W(s,t) g)\, dm(t) \in \mathbb{C},
\]
which is an element of L1 (R) in view of 8.4.10b; thus, the result obtained above by
the equalities from 6 to 8 can be written as
ϕ̌s′ (t′ ) = 0, ∀t′ ∈ R, ∀s′ ∈ R. (9)
Moreover, for all s′ ∈ R, 1 implies also that
\[
|\varphi_{s'}(s)| \le (2\pi)^{-1} e^{-\frac{1}{4}s^2} \|f\| \|g\| \int_{\mathbb{R}} e^{-\frac{1}{4}t^2}\, dm(t), \quad \forall s \in \mathbb{R},
\]
and hence that ϕs′ ∈ L2 (R) (cf. also 11.4.7). Then, in view of 11.4.22 we can write
9 as
F −1 [ϕs′ ] = 0L2 (R) , ∀s′ ∈ R,
and this implies that
[ϕs′ ] = 0L2 (R) , ∀s′ ∈ R. (10)
For all \(s' \in \mathbb{R}\), the function \(\varphi_{s'}\) is continuous, as can be proved by 8.2.11 with
\[
\mathbb{R} \ni t \mapsto (2\pi)^{-1} \|f\| \|g\|\, e^{-\frac{1}{4}t^2} \in [0, \infty)
\]
as dominating function. Then (cf. 11.3.6d) 10 implies that
ϕs′ (s) = 0, ∀s ∈ R, ∀s′ ∈ R,
or
\[
\int_{\mathbb{R}} e^{-is't} \gamma(s,t) (f | W(s,t) g)\, dm(t) = 0, \quad \forall s' \in \mathbb{R}, \forall s \in \mathbb{R}. \tag{11}
\]
Now we fix s ∈ R; 1 and 11.4.7 imply that the function
R ∋ t 7→ βs (t) := γ(s, t) (f |W (s, t)g) ∈ C
is an element of L2 (R) ∩ L1 (R); then, in view of 11.4.22, 11 implies that
F [βs ] = [β̂s ] = 0L2 (R) ,
and this implies that
[βs ] = 0L2 (R) ;
\[
\overset{(13)}{=} \int_{\mathbb{R}^2} \gamma(s',t') \left( \int_{\mathbb{R}^2} \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, e^{\frac{1}{2}i[s(\tilde{t}-t'-t) - t(\tilde{s}-s'-s) + s'(\tilde{t}-t') - t'(\tilde{s}-s')]} \left(f \middle| W(\tilde{s},\tilde{t}) g\right) dm_2(\tilde{s},\tilde{t}) \right) dm_2(s',t')
\]
\[
\overset{(14)}{=} \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \gamma(s',t')\, \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, e^{\frac{1}{2}i[s(\tilde{t}-t'-t) - t(\tilde{s}-s'-s) + s'\tilde{t} - t'\tilde{s}]}\, dm_2(s',t') \right) \left(f \middle| W(\tilde{s},\tilde{t}) g\right) dm_2(\tilde{s},\tilde{t}),
\]
where:
12 follows from a direct computation on the basis of the definition of T and of
20.2.2a;
13 follows from the change of variable (s̃, t̃) := (s′ + s + s′′ , t′ + t + t′′ ), in view
of 9.2.1b;
14 holds true by 8.4.8 and 8.4.10c, since
\[
\int_{\mathbb{R}^2} \gamma(s',t') \left( \int_{\mathbb{R}^2} \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, dm_2(\tilde{s},\tilde{t}) \right) dm_2(s',t')
\overset{(15)}{=} \left( \int_{\mathbb{R}^2} \gamma(u,v)\, dm_2(u,v) \right)^2 < \infty
\]
(15 follows from the change of variable \((u,v) := (\tilde{s}-s'-s,\, \tilde{t}-t'-t)\)).
Moreover we have, for all \((\tilde{s},\tilde{t}) \in \mathbb{R}^2\),
\[
\int_{\mathbb{R}^2} \gamma(s',t')\, \gamma(\tilde{s}-s'-s,\, \tilde{t}-t'-t)\, e^{\frac{1}{2}i[s(\tilde{t}-t'-t) - t(\tilde{s}-s'-s) + s'\tilde{t} - t'\tilde{s}]}\, dm_2(s',t')
\]
\[
\overset{(16)}{=} \int_{\mathbb{R}^2} \gamma(\hat{s}-s,\, \hat{t}-t)\, \gamma(\tilde{s}-\hat{s},\, \tilde{t}-\hat{t})\, e^{\frac{1}{2}i[s(\tilde{t}-\hat{t}) - t(\tilde{s}-\hat{s}) + (\hat{s}-s)\tilde{t} - (\hat{t}-t)\tilde{s}]}\, dm_2(\hat{s},\hat{t})
\]
\[
= (2\pi)^{-2}\, e^{-\frac{1}{4}(s^2+t^2)}\, e^{-\frac{1}{4}(\tilde{s}^2+\tilde{t}^2)} \int_{\mathbb{R}^2} e^{\frac{1}{2}[-\hat{s}^2 + ((s+\tilde{s}) + i(t+\tilde{t}))\hat{s} - \hat{t}^2 + ((t+\tilde{t}) - i(s+\tilde{s}))\hat{t}]}\, dm_2(\hat{s},\hat{t})
\overset{(17)}{=} 2\pi\, \gamma(s,t)\, \gamma(\tilde{s},\tilde{t}),
where:
16 follows from the change of variable (ŝ, t̂) := (s′ + s, t′ + t);
17 holds true because
\[
\int_{\mathbb{R}^2} e^{\frac{1}{2}[-\hat{s}^2 + ((s+\tilde{s}) + i(t+\tilde{t}))\hat{s} - \hat{t}^2 + ((t+\tilde{t}) - i(s+\tilde{s}))\hat{t}]}\, dm_2(\hat{s},\hat{t})
\overset{(18)}{=} \int_{\mathbb{R}^2} e^{-\frac{1}{2}\left(\hat{s} - \frac{1}{2}((s+\tilde{s}) + i(t+\tilde{t}))\right)^2 - \frac{1}{2}\left(\hat{t} - \frac{1}{2}((t+\tilde{t}) - i(s+\tilde{s}))\right)^2}\, dm_2(\hat{s},\hat{t})
\]
\[
\overset{(19)}{=} \int_{\mathbb{R}} e^{-\frac{1}{2}\left(\hat{s} - \frac{1}{2}((s+\tilde{s}) + i(t+\tilde{t}))\right)^2} dm(\hat{s}) \int_{\mathbb{R}} e^{-\frac{1}{2}\left(\hat{t} - \frac{1}{2}((t+\tilde{t}) - i(s+\tilde{s}))\right)^2} dm(\hat{t})
\overset{(20)}{=} 2\pi
\]
(18 holds true because (a + ib)2 + (b − ia)2 = 0 for all a, b ∈ R; 19 holds true by
20.2.1, 8.4.9, 8.4.10c; 20 follows from 20.2.1).
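The Gaussian integral invoked at steps 19 and 20 (the content of 20.2.1) can be checked numerically: even for a complex shift \(z\), the integral of \(e^{-(x-z)^2/2}\) over \(\mathbb{R}\) equals \(\sqrt{2\pi}\), by a contour-shift argument. A Riemann-sum sketch (the grid parameters are illustrative):

```python
import numpy as np

# Numerical check: integral over R of exp(-(x - z)^2 / 2) = sqrt(2*pi),
# even for a complex shift z (here z = 1 + 2i, an arbitrary choice).
z = 1.0 + 2.0j
dx = 0.01
x = np.arange(-25.0, 25.0, dx)
integral = np.sum(np.exp(-0.5 * (x - z) ** 2)) * dx
assert abs(integral - np.sqrt(2 * np.pi)) < 1e-6
```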
Therefore we have
\[
(f | T W(s,t) T g) = 2\pi\, \gamma(s,t) \int_{\mathbb{R}^2} \gamma(\tilde{s},\tilde{t}) \left(f \middle| W(\tilde{s},\tilde{t}) g\right) dm_2(\tilde{s},\tilde{t}) = 2\pi\, \gamma(s,t)\, (f | T g).
\]
Since \(f\) and \(g\) were arbitrary elements of \(H\) and \((s,t)\) was an arbitrary element of \(\mathbb{R}^2\), this proves that
\[
T W(s,t) T = 2\pi\, \gamma(s,t)\, T, \quad \forall (s,t) \in \mathbb{R}^2.
\]
c: If we set s := t := 0 in statement b, we obtain
T 2 = T.
Since T is a self-adjoint element of B(H) (cf. statement a), this proves that T is an
orthogonal projection (cf. 13.1.5).
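The step from \(T^2 = T\) and \(T = T^\dagger\) to "\(T\) is an orthogonal projection" can be illustrated on matrices; a minimal numerical sketch (the projection is an arbitrary illustrative choice):

```python
import numpy as np

# A self-adjoint idempotent matrix is an orthogonal projection:
# here T projects C^3 onto the span of (1, 1, 0)/sqrt(2).
v = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
T = np.outer(v, v)
assert np.allclose(T, T.conj().T)  # self-adjoint
assert np.allclose(T @ T, T)       # idempotent
# orthogonality of range and kernel: T f is orthogonal to f - T f
f = np.array([0.3, -1.2, 2.0])
assert abs(np.vdot(T @ f, f - T @ f)) < 1e-12
```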
For all \((s_1,t_1), (s_2,t_2) \in \mathbb{R}^2\) and all \(f, g \in R_T\), we have
\[
(W(s_1,t_1) f | W(s_2,t_2) g) \overset{(21)}{=} (W(s_1,t_1) T f | W(s_2,t_2) T g)
\overset{(22)}{=} (f | T W(-s_1,-t_1) W(s_2,t_2) T g)
\]
\[
\overset{(23)}{=} e^{\frac{1}{2}i(-s_1 t_2 + t_1 s_2)} (f | T W(-s_1+s_2,\, -t_1+t_2) T g)
\overset{(24)}{=} e^{-\frac{1}{2}i(s_1 t_2 - t_1 s_2) - \frac{1}{4}(s_2-s_1)^2 - \frac{1}{4}(t_2-t_1)^2} (f | g),
\]
where 21 holds true by 13.1.3c, 22 by 20.2.2b, 23 by 20.2.2a, 24 by statement b and
13.1.3c.
d: For all g ∈ M we have, in view of 20.2.2.c,
\[
(f | T g) = \int_{\mathbb{R}^2} \gamma(s,t) (f | W(s,t) g)\, dm_2(s,t) = 0, \quad \forall f \in M^\perp,
\]
and hence T g ∈ M ⊥⊥ . Since M = M ⊥⊥ (cf. 10.4.4a), this proves that M is an
invariant subspace for T , and hence that T is reduced by M (cf. 17.2.9).
The operator W (s, t) is reduced by M , for all (s, t) ∈ R2 (cf. 20.2.2c). Now we
recall that
U^{A^M}(t) = (U^A(t))^M and U^{B^M}(t) = (U^B(t))^M, ∀t ∈ R,

and hence T^M = T^{(M)} (we have denoted by a subscript whether a given inner product is to be regarded as pertaining to the Hilbert space H or to the Hilbert space M).
e: We prove statement e by contraposition. We assume that there exists a non-trivial subspace M of H such that A and B are reduced by M. Then A and B are reduced by the non-trivial subspace M⊥ as well. Then T^M and T^{M⊥} are non-null orthogonal projections in the Hilbert spaces M and M⊥ respectively (cf. statements a, c, d). In view of 13.1.3c, there exist two normalized vectors u₁, u₂ such that:

u₁ ∈ R_{T^M}, and hence Tu₁ = T^M u₁ = u₁;
u₂ ∈ R_{T^{M⊥}}, and hence Tu₂ = T^{M⊥} u₂ = u₂.

Therefore, {u₁, u₂} is an o.n.s. contained in R_T (cf. 13.1.3c). Hence R_T cannot be a one-dimensional subspace (cf. e.g. 10.7.3).
Proof. a: Let W(s,t) and W̃(s,t) be defined as in 20.2.2 for all (s,t) ∈ R², with respect to the pair A, B and the pair Ã, B̃ respectively. Moreover, let T and T̃ be defined as in 20.2.3, with respect to A, B and Ã, B̃ respectively. Since the projections T and T̃ are non-zero (cf. 20.2.3a,c), we can fix two normalized vectors u ∈ R_T and ũ ∈ R_{T̃}. Since the operators A and B are reduced by the subspace

M_u := V{W(s,t)u : (s,t) ∈ R²}

(cf. 20.2.2d) and since M_u cannot be {0_H} (because W(0,0) = 1_H), the equality M_u = H must be true. Similarly, M_ũ = H̃.
For all L ∈ N, all (α₁, ..., α_L) ∈ C^L, all (s₁, t₁, ..., s_L, t_L) ∈ R^{2L}, we have

‖∑_{l=1}^L α_l W(s_l,t_l)u‖²_H = ∑_{h,l=1}^L ᾱ_h α_l (W(s_h,t_h)u | W(s_l,t_l)u)_H
(1) = ∑_{h,l=1}^L ᾱ_h α_l (W̃(s_h,t_h)ũ | W̃(s_l,t_l)ũ)_{H̃}
= ‖∑_{l=1}^L α_l W̃(s_l,t_l)ũ‖²_{H̃},
where 1 holds true in view of 20.2.3c. Then we have, for all N, M ∈ N, all (β₁, ..., β_N) ∈ C^N, all (γ₁, ..., γ_M) ∈ C^M, all (s₁, t₁, ..., s_N, t_N) ∈ R^{2N}, all (x₁, y₁, ..., x_M, y_M) ∈ R^{2M},

∑_{n=1}^N β_n W(s_n,t_n)u = ∑_{m=1}^M γ_m W(x_m,y_m)u ⇒
‖∑_{n=1}^N β_n W(s_n,t_n)u − ∑_{m=1}^M γ_m W(x_m,y_m)u‖_H = 0 ⇒
‖∑_{n=1}^N β_n W̃(s_n,t_n)ũ − ∑_{m=1}^M γ_m W̃(x_m,y_m)ũ‖_{H̃} = 0 ⇒
∑_{n=1}^N β_n W̃(s_n,t_n)ũ = ∑_{m=1}^M γ_m W̃(x_m,y_m)ũ.
Therefore we can define a mapping

V₀ : L{W(s,t)u : (s,t) ∈ R²} → L{W̃(s,t)ũ : (s,t) ∈ R²}

by letting

V₀(∑_{n=1}^N α_n W(s_n,t_n)u) := ∑_{n=1}^N α_n W̃(s_n,t_n)ũ,
∀N ∈ N, ∀(α₁, ..., α_N) ∈ C^N, ∀(s₁, t₁, ..., s_N, t_N) ∈ R^{2N}

(cf. 3.1.7). It is obvious that V₀ is a linear operator from H to H̃ and that

R_{V₀} = L{W̃(s,t)ũ : (s,t) ∈ R²}.
In view of 4.1.13 and 4.6.6, there exists V ∈ U(H, H̃) such that
V0 ⊂ V and V −1 (W̃ (s, t)ũ) = W (s, t)u, ∀(s, t) ∈ R2 .
Then we have, for all (s′, t′) ∈ R²,

V W(s′,t′) V⁻¹ (W̃(s,t)ũ) = V W(s′,t′) W(s,t) u
(2) = V e^{(i/2)(s′t − t′s)} W(s′+s, t′+t) u
= e^{(i/2)(s′t − t′s)} W̃(s′+s, t′+t) ũ
(3) = W̃(s′,t′)(W̃(s,t)ũ), ∀(s,t) ∈ R²

(2 and 3 hold true in view of 20.2.2a), and hence by linearity

V W(s′,t′) V⁻¹ f̃ = W̃(s′,t′) f̃, ∀f̃ ∈ L{W̃(s,t)ũ : (s,t) ∈ R²},

and hence

V W(s′,t′) V⁻¹ = W̃(s′,t′)

in view of 4.2.6. Therefore we have

V U^A(t) V⁻¹ = V W(0,t) V⁻¹ = W̃(0,t) = U^Ã(t), ∀t ∈ R,

and

V U^B(s) V⁻¹ = V W(s,0) V⁻¹ = W̃(s,0) = U^B̃(s), ∀s ∈ R.

By 16.3.1, these conditions are equivalent to

V A V⁻¹ = Ã and V B V⁻¹ = B̃.
b: In view of 20.2.3e, we prove statement b by proving that the orthogonal projection T defined as in 20.2.3 with A := Q and B := P (where Q and P are the operators discussed in 20.1.7) is such that the orthogonal dimension of the subspace R_T is one. In what follows, for simplicity we do not distinguish between the symbol f for an element of 𝓛²(R) and the symbol [f] for the element of L²(R) that contains f.
For all (s,t) ∈ R² and all g ∈ L²(R) we have

(W(s,t)g)(x) = e^{−(i/2)st} (g^t)_{−s}(x) = e^{it(x + s/2)} g(x + s), ∀x ∈ D_g − s.
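With this explicit formula, the Weyl relation of 20.2.2a becomes an exact pointwise identity, which can be verified on a grid; this is only a sketch, and the test function is an arbitrary choice with D_g = R.

```python
import numpy as np

def W(s, t, g):
    # (W(s,t)g)(x) = e^{it(x+s/2)} g(x+s)
    return lambda x: np.exp(1j * t * (x + s / 2)) * g(x + s)

g = lambda x: np.exp(-x**2) * (1 + 1j * x)  # arbitrary test function defined on all of R
x = np.linspace(-5.0, 5.0, 1001)
s, t, sp, tp = 0.7, -0.4, -1.3, 0.9

# W(s',t') W(s,t) = e^{(i/2)(s't - t's)} W(s'+s, t'+t), cf. 20.2.2a
lhs = W(sp, tp, W(s, t, g))(x)
rhs = np.exp(0.5j * (sp * t - tp * s)) * W(sp + s, tp + t, g)(x)
assert np.allclose(lhs, rhs)
```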
Now we fix f, g ∈ L²(R), and suppose that the representative g ∈ 𝓛²(R) is such that D_g = R (cf. 8.2.12). We have

∫_R |f(x)(W(s,t)g)(x)| dm(x) ≤ ‖f‖ ‖W(s,t)g‖ = ‖f‖ ‖g‖, ∀(s,t) ∈ R²,

by the Schwarz inequality (cf. 10.1.9) for the elements |f| and |W(s,t)g| of L²(R), and hence

∫_{R²} γ(s,t) (∫_R |f(x)(W(s,t)g)(x)| dm(x)) dm₂(s,t) ≤ ‖f‖ ‖g‖ ∫_{R²} γ(s,t) dm₂(s,t) < ∞.
Then, by Tonelli's theorem (cf. 8.4.8) followed by Fubini's theorem (cf. 8.4.10c with μ₁ := m₂ and μ₂ := m) we have

(f | T g) = ∫_{R²} γ(s,t) (∫_R f̄(x)(W(s,t)g)(x) dm(x)) dm₂(s,t)
= ∫_R f̄(x) (∫_{R²} γ(s,t)(W(s,t)g)(x) dm₂(s,t)) dm(x).
or

{W(s,t)u_i : (s,t) ∈ R²} ⊂ {W(s,t)u_k : (s,t) ∈ R²}⊥,

and hence

M_k = {W(s,t)u_k : (s,t) ∈ R²}⊥⊥ ⊂ {W(s,t)u_i : (s,t) ∈ R²}⊥ = M_i⊥
(cf. 10.4.4b, 10.2.10b, 10.2.11). For each n ∈ I, M_n is a reducing subspace for A and B (cf. 20.2.2d) and hence for the operator W(s,t) as well, for all (s,t) ∈ R² (cf. 20.2.2c; the operator W(s,t) is defined by A, B as in 20.2.2). Then it is obvious that the subspace ∑⊕_{n∈I} M_n is an invariant subspace for the operator W(s,t), for all (s,t) ∈ R². In view of 17.2.9 and 20.2.2b, this implies that ∑⊕_{n∈I} M_n is a reducing subspace for W(s,t) for all (s,t) ∈ R², and hence also for A and B (cf. 20.2.2c). Therefore, the subspace

M₀ := (∑⊕_{n∈I} M_n)⊥
(u₀ | u_n) = 0, ∀n ∈ I

(note that u_n = W(0,0)u_n ∈ M_n for all n ∈ I), we have a contradiction with the fact that {u_n}_{n∈I} is a c.o.n.s. in R_T (cf. 10.6.4). This proves that M₀ = {0_K} and hence that

∑⊕_{n∈I} M_n = K.
Now we prove that, for each n ∈ I, the pair of self-adjoint operators A^{M_n}, B^{M_n} is jointly irreducible by proving that the projection T^{(M_n)} is one-dimensional (cf. 20.2.3e). Indeed, T^{(M_n)} = T^{M_n} (cf. 20.2.3d) and hence

T^{(M_n)} f = T f = ∑_{k∈I} (u_k | f)_H u_k = (u_n | f)_H u_n = (u_n | f)_{M_n} u_n, ∀f ∈ M_n

(the second equality follows from 13.1.10), and this proves that T^{(M_n)} is a one-dimensional projection in the Hilbert space M_n (cf. 13.1.12).
20.2.5 Remarks.
(a) We stated and proved part c of 20.2.4 even though we do not use it in this book, because we thought it better to reproduce the whole content of von Neumann's article. The other parts of the Stone–von Neumann theorem play an essential role in Section 20.3.
(b) For any irreducible representation of WCR, the Hilbert space H in which it is defined is separable and of denumerable orthogonal dimension. Indeed statements a and b in 20.2.4 imply that H and L²(R) are isomorphic; moreover the Hilbert space L²(R) is separable (cf. 11.3.4) and of denumerable orthogonal dimension (cf. 11.3.3); then so is H, in view of 10.7.14.
for each pure state σ and for each sequence {gn } in G which converges to the identity
of G. When interpreted in terms of transition probabilities, this assumption follows
from the idea that the difference between the description of physical reality given
by an inertial observer O and the description given by the observer g(O) becomes
negligible when g is close enough to the identity of G.
We recall that a proposition is an event which does or does not occur in a
macroscopic device. Therefore each inertial observer describes this event in his
own way, with respect to his own frame of reference. We assume that, for an
inertial observer O and each g ∈ G, if O represents a given proposition π by an
orthogonal projection P_π in H then the inertial observer g(O) will represent the same proposition π by an orthogonal projection P_π^g which will not in general be the same as P_π, while g(O) will represent by the same projection P_π the proposition that he describes (with respect to his own frame of reference) in the same way as O describes π (with respect to O's own frame of reference). The relation between P_π and P_π^g follows from the relation obtained above between the representations of pure states given by O and by g(O). In fact, the principle of relativity implies that the probability p(π, σ) is the same for O and g(O), for all pure states σ. This implies that

P_π^g = U_g P_π U_g⁻¹,
for each proposition π. We point out that ω̃_g depends on ω_g and not on the particular element U_g of UA(H) (among those which implement ω_g) that has been used to define ω̃_g, because in U_g P U_g⁻¹ an arbitrary multiplicative factor in front of U_g is immaterial.
We consider an X-valued observable α, where (X, A) is a measurable space.
The equivalence of the descriptions of physical reality given by all inertial observers,
embodied in the principle of relativity, accounts for the assumption that all inertial
observers represent the dial of the measuring instrument that underlies α by the
same measurable space (X, A) (cf. 19.1.9a,b). Since the pointer and the dial are
macroscopic objects, the position of the pointer on the dial is described by each
inertial observer by means of his own frame of reference. Therefore, if an inertial
observer O represents a position of the pointer on the dial by a point x of X then the inertial observer g(O) (for any g ∈ G) will represent the same position by a point x_g of X which will not in general be the same as x. We assume that, for each g ∈ G, there exists a bijective measurable mapping t^α_g from X onto itself such that

x_g = t^α_g(x), ∀x ∈ X.

These mappings satisfy

t^α_{g₁} ∘ t^α_{g₂} = t^α_{g₁g₂}, ∀g₁, g₂ ∈ G.
We observe that, for E ∈ A, the symbol α(E) denotes different propositions when
it is used by different inertial observers, since it denotes the proposition determined
by the event “the pointer of the measuring instrument is in the section of the dial
represented by E”, but which section of the dial is represented by E depends on
the observer. The proposition denoted as α(E) by an inertial observer O is in fact the proposition denoted as α(t^α_g(E)) by the observer g(O), for all g ∈ G. However, O and g(O) represent the X-valued observable α by the same projection valued measure

P^α : A → P(H)
E ↦ P^α(E) := P_{α(E)}.
Indeed we assumed above that, if O represents a proposition π by a projection P_π, then g(O) represents by the same projection P_π the proposition (in general different from π) that is described by g(O) as π is described by O. Now we fix E ∈ A and consider the proposition that is denoted as α(E) by O. The representation of this
proposition given by O is P_{α(E)}. According to what we saw above, the representation of this proposition given by g(O) must be

ω̃_g(P_{α(E)}),

and it must be

P_{α(t^α_g(E))}

as well. Thus, consistency requires that

U_g P_{α(E)} U_g⁻¹ = ω̃_g(P_{α(E)}) = P_{α(t^α_g(E))}, ∀E ∈ A, ∀g ∈ G,
where U_g is an implementation of g. This condition is called a Galilei-covariance relation and the X-valued observable α is said to be Galilei-covariant. If the relation above is written for a subgroup G₀ of the kinematic Galilei group G, then it is called a covariance relation with respect to G₀.
It may be the case that, for a Galilei-covariant X-valued observable α, the mapping t^α_g is the identity mapping of X for all g ∈ G. This means that the representations of the positions of the pointer on the dial are the same for all inertial observers. Then the X-valued observable α is said to be Galilei-invariant and this case of the covariance condition is called a Galilei-invariance condition.
We remark that, if α is an observable (i.e., an R-observable), then the covariance condition can be written as

U_g P^{A_α}(E) U_g⁻¹ = P^{A_α}(t^α_g(E)), ∀E ∈ A(d_R), ∀g ∈ G,
Then

∃!μ ∈ R such that ϕ(x,y) = e^{iμxy}, ∀(x,y) ∈ R².

Proof. In view of 16.2.3, conditions a and b imply that there exists a function

R ∋ y ↦ a(y) ∈ R

such that

ϕ(x,y) = e^{ia(y)x}, ∀x ∈ R, ∀y ∈ R. (1)
20.3.2 Proposition. Let H be a Hilbert space which is neither a zero- nor a one-dimensional linear space.
For a mapping R² ∋ (s,v) ↦ ω_{(s,v)} ∈ Aut Ĥ, the following conditions are equivalent:

(a) the mapping R² ∋ (s,v) ↦ ω_{(s,v)} ∈ Aut Ĥ is a homomorphism from the additive group R² to the group Aut Ĥ and the following implication holds true:

[(s,v) ∈ R², {(s_n,v_n)} a sequence in R², lim_{n→∞} (s_n,v_n) = (0,0)] ⇒

(b) there exist μ ∈ R and two continuous one-parameter unitary groups U¹_μ, U²_μ in H such that

U²_μ(v) U¹_μ(s) = e^{iμsv} U¹_μ(s) U²_μ(v), ∀(s,v) ∈ R²,

and

ω_{(s,v)}([u]) = [U¹_μ(s) U²_μ(v) u], ∀u ∈ H̃, ∀(s,v) ∈ R².
(cf. 10.9.6). Since z_{s,v} is uniquely determined by this condition (recall that U¹ and U² have been fixed), we have the function

R² ∋ (s,v) ↦ ϕ(s,v) := z_{s,v} ∈ T,

which is such that

U²(v) U¹(s) = ϕ(s,v) U¹(s) U²(v), ∀(s,v) ∈ R².
We see that, for all s, s′ ∈ R and all v ∈ R,

ϕ(s,v)⁻¹ ϕ(s+s′,v) U¹(s+s′) U²(v) = ϕ(s,v)⁻¹ U²(v) U¹(s+s′)
= ϕ(s,v)⁻¹ U²(v) U¹(s) U¹(s′) = U¹(s) U²(v) U¹(s′)
= ϕ(s′,v) U¹(s) U¹(s′) U²(v) = ϕ(s′,v) U¹(s+s′) U²(v),

and hence

ϕ(s+s′,v) = ϕ(s,v) ϕ(s′,v).

Similarly we can prove that

ϕ(s,v+v′) = ϕ(s,v) ϕ(s,v′), ∀v, v′ ∈ R, ∀s ∈ R.
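The exponential ϕ(s,v) = e^{iμsv} singled out by the proposition above does satisfy both of these functional equations; a quick numerical confirmation (μ and the sample points are arbitrary):

```python
import numpy as np

mu = -0.8  # arbitrary real parameter
phi = lambda s, v: np.exp(1j * mu * s * v)

s, sp, v, vp = 1.3, -2.1, 0.4, 0.9
# Additivity in the first variable and in the second variable
assert np.isclose(phi(s + sp, v), phi(s, v) * phi(sp, v))
assert np.isclose(phi(s, v + vp), phi(s, v) * phi(s, vp))
```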
Moreover, let (s,v) ∈ R² and a sequence {(s_n,v_n)} in R² be such that (s,v) = lim_{n→∞} (s_n,v_n), and fix u ∈ H̃. By condition ug2 in 16.1.1 and by 16.4.7, we have

lim_{n→∞} U¹(s_n) U²(v_n) u = U¹(s) U²(v) u and lim_{n→∞} U²(v_n) U¹(s_n) u = U²(v) U¹(s) u,
must exhibit some connection with “external space”. Now we prove that this requisite is equivalent to the joint irreducibility of the pair of continuous one-parameter unitary groups U¹_μ, U²_μ. Indeed, suppose that requisite qp2 is fulfilled. For each orthogonal projection P in H there exists a proposition π such that P_π = P (this is the surjectivity of the mapping in 19.3.1b), and hence there exists the yes-no observable α_π, for which A_{α_π} = P (cf. 19.3.8). Moreover, in the range of the projection valued measure of the self-adjoint operator P there are only the projections O_H, 1_H, P, 1_H − P (cf. 19.3.8), and the only projections which are multiples of the identity operator are O_H and 1_H. In view of all this, requisite qp2 entails that, for P ∈ P(H), the following implications are true:

[U¹_μ(s) P U¹_μ(−s) = U²_μ(v) P U²_μ(−v) = P, ∀(s,v) ∈ R²] ⇒
[U¹_μ(s) U²_μ(v) P U²_μ(−v) U¹_μ(−s) = P, ∀(s,v) ∈ R²] ⇒ P ∈ {O_H, 1_H},

and this is the condition that the pair U¹_μ, U²_μ is jointly irreducible (cf. 17.3.1).
Conversely, suppose that the pair U¹_μ, U²_μ is jointly irreducible and that an observable α is Galilei-invariant. Then we have

U¹_μ(s) P_{α(E)} U¹_μ(−s) = U²_μ(v) P_{α(E)} U²_μ(−v) = P_{α(E)}, ∀E ∈ A(d_R), ∀(s,v) ∈ R²,

and hence, by the irreducibility of the pair U¹_μ, U²_μ,

P^{A_α}(E) = P_{α(E)} ∈ {O_H, 1_H}, ∀E ∈ A(d_R),

and hence, by 17.3.2,

∃λ ∈ R such that A_α = λ1_H.

Thus, requisite qp2 is fulfilled.
In view of the discussion above we assume that, if the homomorphism from R² to Aut Ĥ of the quantum-particle model is implemented by a real number μ and a pair of continuous one-parameter unitary groups U¹_μ, U²_μ as in 7 and 8, then this pair is jointly irreducible. The next proposition proves that this implies μ ≠ 0.
20.3.3 Proposition. Let μ ∈ R and let an irreducible pair U¹_μ, U²_μ of continuous one-parameter unitary groups in a Hilbert space H be such that

U²_μ(v) U¹_μ(s) = e^{iμsv} U¹_μ(s) U²_μ(v), ∀(s,v) ∈ R².

If H is neither a zero- nor a one-dimensional linear space then μ ≠ 0.

Proof. The proof is by contraposition. Since the pair U¹_μ, U²_μ is jointly irreducible, the following implication holds true:

[B ∈ B(H) and [B, U¹_μ(s)] = [B, U²_μ(v)] = O_H, ∀(s,v) ∈ R²] ⇒ [∃α ∈ C such that B = α1_H]

(cf. 17.3.5). Now suppose μ = 0. Then, for all (s,v) ∈ R², U¹_μ(s) and U²_μ(v) satisfy the first condition for B above, and hence U¹_μ(s) and U²_μ(v) are multiples of 1_H. Since the pair U¹_μ, U²_μ is jointly irreducible, this implies that H is either a zero- or a one-dimensional linear space.
Thus, requisites qp1 and qp2 are fulfilled if we have a non-zero real number μ and an irreducible pair of continuous one-parameter unitary groups U¹_μ, U²_μ with property 7. The next proposition proves that an irreducible pair of continuous one-parameter unitary groups with this property does exist, for each μ ≠ 0.
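A concrete pair with property 7 can be sketched as follows: take U¹ to act by translation and U²_μ by multiplication by e^{iμvx}. This is essentially the Schrödinger representation of 20.1.7 in disguise; the realization below is an illustration under that assumption, not the construction used in the text.

```python
import numpy as np

mu = 2.5  # any non-zero real number

# U1(s): translation by s; U2(v): multiplication by e^{i*mu*v*x}
U1 = lambda s: (lambda f: (lambda x: f(x - s)))
U2 = lambda v: (lambda f: (lambda x: np.exp(1j * mu * v * x) * f(x)))

f = lambda x: np.exp(-x**2) * (x + 2j)  # arbitrary test function
x = np.linspace(-4.0, 4.0, 801)
s, v = 0.6, -1.1

# Check the commutation relation U2(v) U1(s) = e^{i*mu*s*v} U1(s) U2(v)
lhs = U2(v)(U1(s)(f))(x)
rhs = np.exp(1j * mu * s * v) * U1(s)(U2(v)(f))(x)
assert np.allclose(lhs, rhs)
```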
and of g_{(0,v)}(O) are in the same place). Well, this is exactly what would happen if
the system were a classical particle and its position were being measured (by means
of detectors suitable for classical particles). By this analogy with the classical case,
the quantum observable described above is given the name of “position” and is
denoted by q.
We imagine the observable “momentum” of a quantum particle in one dimension
as the abstract representation of a pair of detectors which are placed, each time a
measurement is made, on either side of the apparatus that prepares a copy of the
system; no forces act on these detectors and therefore they move with constant ve-
locities with respect to all inertial observers before a copy has been prepared; more-
over, they are so that one and only one of them reacts by changing its velocity after
a copy of the system has been prepared (this too is one of the particle-like aspects
of a quantum particle). Now suppose that a copy has been prepared. Then each
inertial observer assigns, as result to “his” observable “momentum”, the difference
between the values of the momentum of the detector that has reacted, measured
by him (with respect to his own frame of reference) before and after the reaction
(the detectors are classical objects and therefore each of them has a momentum at
all times, in the classical sense). If an inertial observer O assigns the result y to “his” observable “momentum”, then on the basis of the same reaction the inertial observer g_{(s,0)}(O) (for any s ∈ R) will assign the same result to “his” observable “momentum” (the frames of reference of O and of g_{(s,0)}(O) are stationary with respect to each other), while the inertial observer g_{(0,v)}(O) (for any v ∈ R) will assign a different result. The idea that the particle “has mass m” is supported by the experimental evidence that the result assigned by g_{(0,v)}(O) is y + mv, where m is a positive number independent of v. Well, this is exactly what would happen if the system were a classical particle of mass m and its momentum were being measured (by techniques suitable for classical particles). By this analogy with the classical case, the quantum particle is said to “have mass m” and the quantum observable described above is given the name of “momentum” and is denoted by p.
These observations give the transformations t^q_{(s,0)}, t^q_{(0,v)}, t^p_{(s,0)}, t^p_{(0,v)} (for all s, v ∈ R) to be used in the covariance conditions for the observables q and p with respect to the subgroups S and V of the kinematic Galilei group G (and hence, with respect to any other element of G). They are:

t^q_{(s,0)}(x) = x + s, ∀x ∈ R, ∀s ∈ R;
t^q_{(0,v)}(x) = x, ∀x ∈ R, ∀v ∈ R;
t^p_{(s,0)}(y) = y, ∀y ∈ R, ∀s ∈ R;
t^p_{(0,v)}(y) = y + mv, ∀y ∈ R, ∀v ∈ R.
Then we see that conditions 10, 11, 12, 13 are nothing else than the covariance conditions for the observables q and p with respect to S and V, since U^B(−s) and U^A(μv) are implementations of ω_{(s,0)} and ω_{(0,v)} respectively (cf. 9).
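The four transformations listed above compose as required by the relation t^α_{g₁} ∘ t^α_{g₂} = t^α_{g₁g₂}; a minimal sketch, assuming the additive composition law (s₁,v₁)(s₂,v₂) = (s₁+s₂, v₁+v₂) of the kinematic group behind the R² homomorphism used above:

```python
m = 1.7  # mass, arbitrary positive value

# Readings transform as above: position picks up the translation, momentum picks up m*v
t_q = lambda s, v: (lambda x: x + s)
t_p = lambda s, v: (lambda y: y + m * v)

g1, g2 = (0.4, -1.2), (2.0, 0.7)
g12 = (g1[0] + g2[0], g1[1] + g2[1])  # assumed group law of the kinematic Galilei group

x0, y0 = 3.3, -0.8
assert abs(t_q(*g1)(t_q(*g2)(x0)) - t_q(*g12)(x0)) < 1e-12
assert abs(t_p(*g1)(t_p(*g2)(y0)) - t_p(*g12)(y0)) < 1e-12
```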
The outcome of the discussion above is that the structure of the quantum particle model for a definite mass m is equivalent to the structure made up by a pair of self-adjoint operators A, B which are an irreducible representation of WCR, together with a non-zero real number μ and a pair of self-adjoint operators A_q and A_p which satisfy conditions 10, 11, 12, 13 with the pair A, B. The operators A_q and A_p represent the observables position and momentum of the quantum particle, while the pair A, B and the number μ are related as in 9 to the homomorphism from R² to Aut Ĥ that represents the action of the kinematic Galilei group in the quantum particle model.
The question of existence and uniqueness of implementations of these structures
will be addressed on the basis of the next proposition.
(D) Let μ₁, μ₂ ∈ R − {0} and suppose μ₁ ≠ ±μ₂. Then, there does not exist any unitary or antiunitary operator V in H such that

V A V⁻¹ = A and V(μ₂⁻¹ m B)V⁻¹ = μ₁⁻¹ m B.

Now, it seems that not only do we have pairs which fit our scheme, but we have too many of them: what value of μ and which pair (A + k₁1_H, μ⁻¹mB + k₂1_H) should be used to represent a quantum particle of mass m?
For a fixed value of μ ∈ R − {0}, 20.3.5b1 shows that all the pairs related to that value of μ are unitarily equivalent to each other. If we transform, by means of a unitary operator, a pair related to a value of μ to another related to the same value, perhaps we want to transform the operators U^B(−s) and U^A(μv) as well, since they are implementations of the automorphisms ω_{(s,0)} and ω_{(0,v)} respectively. Then 20.3.5b2 shows that these operators get just multiplied by factors in T, and hence in the new representation the same automorphisms ω_{(s,0)} and ω_{(0,v)} are implemented as in the old one. In view of all this and of 19.3.23, we consider two pairs with the same value of μ to be equivalent for the description of position and momentum of a quantum particle of mass m.
For a fixed value of μ ∈ R − {0}, 20.3.5b1, c1 show that all the pairs defined by a value of μ are antiunitarily equivalent to all the pairs defined by the opposite value. If we transform, by means of an antiunitary operator, a pair defined by a value of μ into another defined by the opposite value, perhaps also in this case we want to transform the operators U^B(−s) and U^A(μv). Then 20.3.5b2, c2 show that these operators, besides being multiplied by inessential multiplicative factors in T, get changed into U^B(−s) and U^A(−μv); now, these operators implement the automorphisms ω_{(s,0)} and ω_{(0,−v)}. Thus it appears that, in the new representation, the direction of the flow of time has been reversed. However, since we do not want to study time evolution, in view of 19.3.23 we consider pairs defined by opposite values of μ to be equivalent.
Finally, 20.3.5D (together with 20.3.5b1, c1) shows that, if μ₁, μ₂ ∈ R − {0} are such that μ₁ ≠ ±μ₂, then no pair defined by μ₂ is either unitarily or antiunitarily equivalent to any pair defined by μ₁.
In view of all this, for a given irreducible representation A, B of WCR, we need only consider the pairs
but we must consider all of them. For each μ > 0, they implement in inequivalent ways our quantum particle model of mass m, with the assignments
and with the kinematic Galilei group represented by the automorphism of Ĥ defined by
In addition, we recall that the Stone–von Neumann uniqueness theorem (cf. 20.2.4a) implies that, if a pair Ã, B̃ is a different irreducible representation of WCR, then for each μ ∈ R − {0} the pair (Ã, μ⁻¹mB̃) is unitarily equivalent to the pair (A, μ⁻¹mB), and so is the pair U^Ã, U^B̃ to the pair U^A, U^B. Thus, nothing is gained by considering irreducible representations of WCR different from A, B.
Since the quantum models defined by different positive values of μ are not unitarily or antiunitarily equivalent, the question is now what value of μ should be used to represent a quantum particle of mass m. Mathematical reasoning cannot help us here, and in fact we must turn to experimental outcomes. Indeed suppose that, for a definite positive value of μ, we have the representation

μ := ℏ⁻¹m,

where ℏ := (2π)⁻¹h and h is Planck's constant. Thus, also on the basis of experimental physics, the quantum particle model of mass m is given by

A_q := A,
A_p := ℏB,
ω_{(s,v)}([u]) := [U^B(−s) U^A(ℏ⁻¹mv) u], ∀u ∈ H̃, ∀(s,v) ∈ R².
20.3.6 Remarks.
(a) The discussion above shows that the Hilbert space, in which a non-relativistic quantum particle without internal degrees of freedom is represented, is necessarily separable and of denumerable dimension.
(b) In the representation of a quantum particle of mass m obtained above, the value m of the mass does not have a role in the operators A_q and A_p which represent the observables position and momentum. However it does in the implementation of the homomorphism from R² to Aut Ĥ which represents the kinematic Galilei group. On the basis of 20.3.5D it is easy to see that implementations related to different values of m are not unitarily or antiunitarily equivalent.
(c) Historically, the first mathematical representation of a quantum particle of mass m was obtained in what is now called the Schrödinger representation of WCR. In this representation we have

H := L²(R), A := Q, B := P,

where Q and P are the operators discussed in 20.1.7, and hence

A_q := Q,
A_p := ℏP,
ω_{(s,v)}([f]) := [U^P(−s) U^Q(ℏ⁻¹mv) f],
for each ray [f] in L²(R) and each (s,v) ∈ R²

(here, for f ∈ 𝓛²(R), the element [f] of L²(R) is denoted by the same symbol f; here, for a unit vector f of L²(R), [f] denotes the ray that contains f). More explicitly, for all f ∈ L²(R) and all (s,v) ∈ R², we have (assuming for simplicity D_f = R, cf. 8.2.12)

(U^P(−s) U^Q(ℏ⁻¹mv) f)(x) = e^{iℏ⁻¹mv(x−s)} f(x−s), ∀x ∈ R

(cf. 20.1.7).
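The explicit formula can be checked against the composition of the two groups, with U^Q(a) acting as multiplication by e^{iax} and U^P(t) as translation x ↦ x + t (the conventions assumed in this sketch; toy values stand in for ℏ and m):

```python
import numpy as np

hbar, m = 1.0, 2.0  # toy values in place of the physical constants

UQ = lambda a: (lambda f: (lambda x: np.exp(1j * a * x) * f(x)))  # multiplication group
UP = lambda t: (lambda f: (lambda x: f(x + t)))                   # translation group

f = lambda x: np.exp(-x**2 / 2)  # arbitrary test function
x = np.linspace(-5.0, 5.0, 1001)
s, v = 0.8, 1.5

# (U^P(-s) U^Q(m v / hbar) f)(x) should equal e^{i m v (x-s)/hbar} f(x-s)
lhs = UP(-s)(UQ(m * v / hbar)(f))(x)
rhs = np.exp(1j * m * v / hbar * (x - s)) * f(x - s)
assert np.allclose(lhs, rhs)
```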
If a pure state σ is represented by a ray [f_σ] in L²(R), it is possible to put a direct statistical interpretation on the function |f_σ|². In fact, from 15.3.4A and from Section 14.5 we see that

P_{q(E)} f_σ = P^Q(E) f_σ = χ_E f_σ, ∀E ∈ A(d_R),

and hence

p(q(E), σ) = (f_σ | P_{q(E)} f_σ) = ∫_R χ_E |f_σ|² dm, ∀E ∈ A(d_R).
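For instance, with a normalized Gaussian wave function (an arbitrary sample state, not one singled out by the text) the probability of finding the position in E = [0,1] is ∫_E |f_σ|² dm = erf(1)/2, which a direct numerical integration reproduces:

```python
import numpy as np
from math import erf

f_sigma = lambda x: np.pi ** (-0.25) * np.exp(-x**2 / 2)  # sample normalized wave function

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
density = np.abs(f_sigma(x)) ** 2  # |f_sigma|^2 = pi^{-1/2} e^{-x^2}

E = (x >= 0.0) & (x <= 1.0)  # the Borel set E = [0, 1]
p = np.sum(density[E]) * dx  # p(q(E), sigma)

assert abs(p - 0.5 * erf(1.0)) < 1e-3
```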
and hence

p(p(E), σ) = (f_σ | P_{p(E)} f_σ) = (f̃_σ | P^Q(ℏ⁻¹E) f̃_σ) = ∫_R χ_{ℏ⁻¹E} |f̃_σ|² dm, ∀E ∈ A(d_R),
Bibliography
Apostol, T. M. (1974). Mathematical Analysis, 2nd edn. (Addison-Wesley Publishing Company, Reading).
Bargmann, V. (1954). On the Unitary Ray Representations of Continuous Groups (Annals of Mathematics 59), pp. 1–46.
Bargmann, V. (1964). Note on Wigner's Theorem on Symmetry Operations (Journal of Mathematical Physics 5), pp. 862–868.
Berberian, S. K. (1999). Fundamentals of Real Analysis (Springer, New York).
Dirac, P. A. M. (1958, 1947, 1935, 1930). The Principles of Quantum Mechanics (Clarendon Press, Oxford).
Greenberg, M. J. and Harper, J. R. (1981). Algebraic Topology: A First Course (Addison-Wesley Publishing Company, Redwood City, California).
Heisenberg, W. (1925). Über Quantentheoretische Umdeutung Kinematischer und Mechanischer Beziehungen (Zeitschr. f. Phys. 33), pp. 879–893.
Hewitt, E. and Stromberg, K. (1965). Real and Abstract Analysis (Springer-Verlag, New York).
Hilbert, D., Neumann, J. v., and Nordheim, L. (1927). Über die Grundlagen der Quantenmechanik (Mathematische Annalen 98(1)), pp. 1–30.
Holevo, A. S. (1982). Probabilistic and Statistical Aspects of Quantum Theory (North-Holland Publishing Company, Amsterdam); second English edition published by Scuola Normale Superiore, Pisa, 2011.
Horn, R. A. and Johnson, C. R. (2013). Matrix Analysis, 2nd edn. (Cambridge University Press).
Jauch, J. M. (1968). Foundations of Quantum Mechanics (Addison-Wesley Publishing Company, Reading, Massachusetts).
Jordan, P. (1926). Über Kanonische Transformationen in der Quantenmechanik (Zeitschr. f. Phys. 37), pp. 383–386.
Mackey, G. W. (1978). Unitary Group Representations in Physics, Probability, and Number Theory (The Benjamin/Cummings Publishing Company, Reading, Massachusetts).
Munkres, J. R. (1991). Analysis on Manifolds (Addison-Wesley Publishing Company, Redwood City, California).
Parthasarathy, K. R. (2005). Introduction to Probability and Measure (Hindustan Book Agency (India), New Delhi).
Pauli, W. (1933). Die Allgemeinen Prinzipien der Wellenmechanik (Handbuch der Physik 24), pp. 83–272.
Reed, M. and Simon, B. (1980, 1972). Methods of Modern Mathematical Physics I: Functional Analysis (Academic Press, New York).