Bruce K. Driver - Analysis Tools With Examples

August 6, 2004 File:anal.tex
Springer
Berlin Heidelberg NewYork
Hong Kong London
Milan Paris Tokyo
Contents
Part I Background Material
1 Introduction / User Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Set Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 A Brief Review of Real and Complex Numbers . . . . . . . . . . . . 9
3.1 The Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1.1 The Decimal Representation of a Real Number . . . . . . . . 14
3.2 The Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4 Limits and Sums. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1 Limsups, Liminfs and Extended Limits . . . . . . . . . . . . . . . . . . . . . 19
4.2 Sums of positive functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.3 Sums of complex functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.4 Iterated sums and the Fubini and Tonelli Theorems . . . . . . . . . . 30
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.5.1 Limit Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.5.2 Dominated Convergence Theorem Problems . . . . . . . . . . 33
5 $\ell^p$-spaces, Minkowski and Hölder Inequalities . . . . . . . . . . . . 37
5.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Part II Metric, Banach, and Hilbert Space Basics
6 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.1 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.2 Completeness in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.3 Supplementary Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.3.1 Word of Caution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.3.2 Riemannian Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
7 Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
7.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
7.2 Bounded Linear Operators Basics . . . . . . . . . . . . . . . . . . . . . . . . . 60
7.3 General Sums in Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
7.4 Inverting Elements in L(X) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
7.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
8 Hilbert Space Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
8.1 Hilbert Space Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
8.2 Some Spectral Theory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
8.3 Compact Operators on a Hilbert Space . . . . . . . . . . . . . . . . . . . . . 95
8.3.1 The Spectral Theorem for Self Adjoint Compact
Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.4 Supplement 1: Converse of the Parallelogram Law . . . . . . . . . . . 101
8.5 Supplement 2. Non-complete inner product spaces . . . . . . . . . . . 103
8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
9 Hölder Spaces as Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
9.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Part III Calculus and Ordinary Differential Equations in Banach Spaces
10 The Riemann Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
10.1 The Fundamental Theorem of Calculus . . . . . . . . . . . . . . . . . . . . . 119
10.2 Integral Operators as Examples of Bounded Operators . . . . . . . 123
10.3 Linear Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . 125
10.4 Classical Weierstrass Approximation Theorem. . . . . . . . . . . . . . . 130
10.5 Iterated Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
10.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
11 Ordinary Differential Equations in a Banach Space . . . . . . . . 143
11.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
11.2 Uniqueness Theorem and Continuous Dependence on Initial
Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
11.3 Local Existence (Non-Linear ODE) . . . . . . . . . . . . . . . . . . . . . . . . 147
11.4 Global Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
11.5 Semi-Group Properties of time independent flows . . . . . . . . . . . . 156
11.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
12 Banach Space Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
12.1 The Differential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
12.2 Product and Chain Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
12.3 Partial Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
12.4 Higher Order Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
12.5 Inverse and Implicit Function Theorems . . . . . . . . . . . . . . . . . . . . 175
12.6 Smooth Dependence of ODEs on Initial Conditions* . . . . . . . . 182
12.7 Existence of Periodic Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
12.8 Contraction Mapping Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
12.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
12.9.1 Alternate construction of g. To be made into an exercise. 191
Part IV Topological Spaces
13 Topological Space Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
13.1 Constructing Topologies and Checking Continuity . . . . . . . . . . . 196
13.2 Product Spaces I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
13.3 Closure operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
13.4 Countability Axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
13.5 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
13.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
13.6.1 General Topological Space Problems . . . . . . . . . . . . . . . . . 213
13.6.2 Connectedness Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
13.6.3 Metric Spaces as Topological Spaces . . . . . . . . . . . . . . . . . 215
14 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
14.1 Metric Space Compactness Criteria . . . . . . . . . . . . . . . . . . . . . . . . 218
14.2 Compact Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
14.3 Local and $\sigma$-Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
14.4 Function Space Compactness Criteria . . . . . . . . . . . . . . . . . . . . . . 228
14.5 Tychonoff's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
14.6 Banach-Alaoglu's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
14.6.1 Weak and Strong Topologies . . . . . . . . . . . . . . . . . . . . . . . . 235
14.7 Weak Convergence in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . 237
14.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
14.8.1 Ascoli-Arzela Theorem Problems . . . . . . . . . . . . . . . . . . . . 240
14.8.2 Tychonoff's Theorem Problem . . . . . . . . . . . . . . . . . . . . . . 242
15 Locally Compact Hausdorff Spaces . . . . . . . . . . . . . . . . . . . . . . . . 243
15.1 Locally compact form of Urysohn's Metrization Theorem . . . . . 248
15.2 Partitions of Unity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
15.3 $C_0(X)$ and the Alexanderov Compactification . . . . . . . . . . . . . . . 255
15.4 Stone-Weierstrass Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
15.5 *More on Separation Axioms: Normal Spaces . . . . . . . . . . . . . . . 263
15.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
16 Baire Category Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
16.1 Metric Space Baire Category Theorem . . . . . . . . . . . . . . . . . . . . . 269
16.2 Locally Compact Hausdorff Space Baire Category Theorem . . . 270
16.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Part V Lebesgue Integration Theory
17 Introduction: What are measures and why measurable
sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
17.1 The problem with Lebesgue measure . . . . . . . . . . . . . . . . . . . . . 280
18 Measurability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
18.1 Algebras and $\sigma$-Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
18.2 Measurable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
18.2.1 More general pointwise limits . . . . . . . . . . . . . . . . . . . . . . . 297
18.3 $\sigma$-Function Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
18.4 Product $\sigma$-Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
18.4.1 Factoring of Measurable Maps . . . . . . . . . . . . . . . . . . . . . . 308
18.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
19 Measures and Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
19.1 Example of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
19.1.1 ADD: Examples of Measures . . . . . . . . . . . . . . . . . . . . . . . . 316
19.2 Integrals of Simple functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
19.3 Integrals of positive functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
19.4 Integrals of Complex Valued Functions . . . . . . . . . . . . . . . . . . . . . 326
19.5 Measurability on Complete Measure Spaces . . . . . . . . . . . . . . . . . 335
19.6 Comparison of the Lebesgue and the Riemann Integral . . . . . . . 336
19.7 Determining Classes of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . 339
19.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
20 Multiple Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
20.1 Fubini-Tonelli's Theorem and Product Measure . . . . . . . . . . . . . 346
20.2 Lebesgue Measure on $\mathbb{R}^d$ and the Change of Variables Theorem . 354
20.3 The Polar Decomposition of Lebesgue Measure . . . . . . . . . . . . . . 365
20.4 More proofs of the classical Weierstrass approximation
Theorem 10.35 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
20.5 More Spherical Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
20.6 Sard's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
20.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
21 $L^p$-spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
21.1 Jensen's Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
21.2 Modes of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
21.3 Completeness of $L^p$-spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
21.3.1 Summary: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
21.4 Converse of Hölder's Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
21.5 Uniform Integrability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
21.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
22 Approximation Theorems and Convolutions . . . . . . . . . . . . . . . 415
22.1 Density Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
22.2 Convolution and Young's Inequalities . . . . . . . . . . . . . . . . . . . . . . 422
22.2.1 Smooth Partitions of Unity . . . . . . . . . . . . . . . . . . . . . . . . . 432
22.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
Part VI Further Hilbert and Banach Space Techniques
23 $L^2$-Hilbert Spaces Techniques and Fourier Series . . . . . . . . . 439
23.1 $L^2$-Orthonormal Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
23.2 Hilbert Schmidt Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
23.3 Fourier Series Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
23.3.1 Dirichlet, Fejer and Kernels . . . . . . . . . . . . . . . . . . . . . . . . . 447
23.3.2 The Dirichlet Problems on D and the Poisson Kernel . . 452
23.4 Weak $L^2$-Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
23.5 *Conditional Expectation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
23.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
23.7 Fourier Series Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
23.8 Conditional Expectation Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 466
24 Complex Measures, Radon-Nikodym Theorem and the Dual of $L^p$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
24.1 The Radon-Nikodym Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
24.2 The Structure of Signed Measures . . . . . . . . . . . . . . . . . . . . . . . . . 476
24.2.1 Hahn Decomposition Theorem . . . . . . . . . . . . . . . . . . . . . . 477
24.2.2 Jordan Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
24.3 Complex Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
24.4 Absolute Continuity on an Algebra . . . . . . . . . . . . . . . . . . . . . . . . 486
24.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
25 Three Fundamental Principles of Banach Spaces . . . . . . . . . . . 491
25.1 The Hahn-Banach Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
25.1.1 Hahn Banach Theorem Problems . . . . . . . . . . . . . . . . . . 499
25.1.2 *Quotient spaces, adjoints, and more reflexivity . . . . . . . 500
25.2 The Open Mapping Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
25.3 Uniform Boundedness Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
25.3.1 Applications to Fourier Series . . . . . . . . . . . . . . . . . . . . . . . 512
25.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
25.4.1 More Examples of Banach Spaces . . . . . . . . . . . . . . . . . . . 514
25.4.2 Hahn-Banach Theorem Problems . . . . . . . . . . . . . . . . . . . . 514
25.4.3 Open Mapping and Closed Operator Problems . . . . . . . . 515
25.4.4 Weak Topology and Convergence Problems . . . . . . . . . . . 516
26 Weak and Strong Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
26.1 Basic Definitions and Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
26.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
27 Bochner Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
Part VII Construction and Differentiation of Measures
28 Examples of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
28.1 Extending Premeasures to Measures . . . . . . . . . . . . . . . . . . . . . . . 541
28.1.1 Regularity and Density Results . . . . . . . . . . . . . . . . . . . . . 543
28.2 The Riesz-Markov Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
28.2.1 Regularity Results For Radon Measures . . . . . . . . . . . . . . 548
28.2.2 The dual of $C_0(X)$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554
28.3 Classifying Radon Measures on $\mathbb{R}$ . . . . . . . . . . . . . . . . . . . . . . . . . . 557
28.3.1 Classifying Radon Measures on $\mathbb{R}$ using Theorem 28.2 . . 558
28.3.2 Classifying Radon Measures on $\mathbb{R}$ using the
Riesz-Markov Theorem 28.16 . . . . . . . . . . . . . . . . . . . . . . . 561
28.3.3 The Lebesgue-Stieltjes Integral . . . . . . . . . . . . . . . . . . . . . . 563
28.4 Kolmogorov's Existence of Measure on Product Spaces . . . . . . 565
29 Probability Measures on Lusin Spaces . . . . . . . . . . . . . . . . . . . . . 569
29.1 Weak Convergence Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
29.2 Haar Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
29.3 Hausdorff Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
29.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
29.4.1 The Laws of Large Number Exercises . . . . . . . . . . . . . . . . 575
30 Lebesgue Differentiation and the Fundamental Theorem
of Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
30.1 A Covering Lemma and Averaging Operators . . . . . . . . . . . . . . . 578
30.2 Maximal Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
30.3 Lebesgue Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
30.4 The Fundamental Theorem of Calculus . . . . . . . . . . . . . . . . . . . . . 584
30.4.1 Increasing Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
30.4.2 Functions of Bounded Variation . . . . . . . . . . . . . . . . . . . . . 588
30.4.3 Alternative method to Proving Theorem 30.29 . . . . . . . . 599
30.5 The connection of Weak and pointwise derivatives . . . . . . . . . . . 601
30.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
31 Constructing Measures Via Caratheodory . . . . . . . . . . . . . . . . . 607
31.1 Construction of Premeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608
31.1.1 Extending Premeasures to $\mathcal{A}_\sigma$ . . . . . . . . . . . . . . . . . . . . . . . 609
31.2 Outer Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
31.3 *The $\sigma$-Finite Extension Theorem . . . . . . . . . . . . . . . . . . . . . . . 612
31.4 General Extension and Construction Theorem. . . . . . . . . . . . . . . 616
31.4.1 Extensions of General Premeasures . . . . . . . . . . . . . . . . . . 618
31.5 Proof of the Riesz-Markov Theorem 28.16 . . . . . . . . . . . . . . . . . . 621
31.6 More Motivation of Caratheodory's Construction Theorem
31.17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
32 The Daniell Stone Construction of Integration and
Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
32.0.1 Examples of Daniell Integrals . . . . . . . . . . . . . . . . . . . . . . . 629
32.1 Extending a Daniell Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
32.2 The Structure of $L^1(I)$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640
32.3 Relationship to Measure Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
32.4 Extensions of premeasures to measures . . . . . . . . . . . . . . . . . . . . . 647
32.4.1 A Useful Version: BRUCE: delete this if incorporated
above. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
32.5 Riesz Representation Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
32.6 The General Riesz Representation by Daniell Integrals (Move
Later?) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656
32.7 Regularity Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
32.8 Metric space regularity results resisted . . . . . . . . . . . . . . . . . . . . . 664
32.9 General Product Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665
32.10 Daniell Integral approach to dual spaces . . . . . . . . . . . . . . . . . . . . 668
33 Class Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
33.1 Monotone Class and $\pi$-$\lambda$ Theorems . . . . . . . . . . . . . . . . . . . . . . . 671
33.1.1 Some other proofs of previously proved theorems . . . . . . 674
33.2 Regularity of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
33.2.1 Another proof of Theorem 28.22. . . . . . . . . . . . . . . . . . . . . 679
33.2.2 Second Proof of Theorem 22.13 . . . . . . . . . . . . . . . . . . . . . 680
Part VIII The Fourier Transform and Generalized Functions
34 Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683
34.1 Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
34.2 Schwartz Test Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687
34.3 Fourier Inversion Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
34.4 Summary of Basic Properties of $\mathcal{F}$ and $\mathcal{F}^{-1}$ . . . . . . . . . . . . . . . . 693
34.5 Fourier Transforms of Measures and Bochner's Theorem . . . . . . 693
34.6 Supplement: Heisenberg Uncertainty Principle . . . . . . . . . . . . . . 697
34.6.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
34.6.2 More Proofs of the Fourier Inversion Theorem . . . . . . . . 700
35 Constant Coefficient partial differential equations . . . . . . . . . . 703
35.1 Elliptic examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 704
35.2 Poisson Semi-Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706
35.3 Heat Equation on $\mathbb{R}^n$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707
35.4 Wave Equation on $\mathbb{R}^n$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712
35.5 Elliptic Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
35.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723
36 Elementary Generalized Functions / Distribution Theory . . 725
36.1 Distributions on $U \subset_o \mathbb{R}^n$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725
36.2 Examples of distributions and related computations . . . . . . . . . . 726
36.3 Other classes of test functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
36.4 Compactly supported distributions . . . . . . . . . . . . . . . . . . . . . . . . 739
36.5 Tempered Distributions and the Fourier Transform . . . . . . . . . . 742
36.6 Wave Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 749
36.7 Appendix: Topology on $C_c^\infty(U)$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 754
37 Convolutions involving distributions . . . . . . . . . . . . . . . . . . . . . . . 759
37.1 Tensor Product of Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 759
37.2 Elliptic Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 769
Part IX Appendices
A Multinomial Theorems and Calculus Results. . . . . . . . . . . . . . . 5
A.1 Multinomial Theorems and Product Rules . . . . . . . . . . . . . . . . . . 5
A.2 Taylor's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
B Zorn's Lemma and the Hausdorff Maximal Principle . . . . . . 11
C Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Part I
Background Material
1
Introduction / User Guide
Not written as of yet. Topics to mention.
1. A better and more general integral.
a) Convergence Theorems
b) Integration over diverse collection of sets. (See probability theory.)
c) Integration relative to different weights or densities including singular
weights.
d) Characterization of dual spaces.
e) Completeness.
2. Infinite dimensional Linear algebra.
3. ODE and PDE.
4. Harmonic and Fourier Analysis.
5. Probability Theory
2
Set Operations
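Let $\mathbb{N}$ denote the positive integers, $\mathbb{N}_0 := \mathbb{N} \cup \{0\}$ be the non-negative integers and $\mathbb{Z} = \mathbb{N}_0 \cup (-\mathbb{N})$ the positive and negative integers including $0$, $\mathbb{Q}$ the rational numbers, $\mathbb{R}$ the real numbers (see Chapter 3 below), and $\mathbb{C}$ the complex numbers. We will also use $\mathbb{F}$ to stand for either of the fields $\mathbb{R}$ or $\mathbb{C}$.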
Notation 2.1 Given two sets $X$ and $Y$, let $Y^X$ denote the collection of all functions $f : X \to Y$. If $X = \mathbb{N}$, we will say that $f \in Y^{\mathbb{N}}$ is a sequence with values in $Y$ and often write $f_n$ for $f(n)$ and express $f$ as $\{f_n\}_{n=1}^\infty$. If $X = \{1, 2, \dots, N\}$, we will write $Y^N$ in place of $Y^{\{1,2,\dots,N\}}$ and denote $f \in Y^N$ by $f = (f_1, f_2, \dots, f_N)$ where $f_n = f(n)$.
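Notation 2.2 More generally if $\{X_\alpha : \alpha \in A\}$ is a collection of non-empty sets, let $X_A = \prod_{\alpha \in A} X_\alpha$ and $\pi_\alpha : X_A \to X_\alpha$ be the canonical projection map defined by $\pi_\alpha(x) = x_\alpha$. If $X_\alpha = X$ for some fixed space $X$, then we will write $\prod_{\alpha \in A} X_\alpha$ as $X^A$ rather than $X_A$.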
Recall that an element $x \in X_A$ is a "choice function," i.e. an assignment $x_\alpha := x(\alpha) \in X_\alpha$ for each $\alpha \in A$. The axiom of choice (see Appendix B) states that $X_A \neq \emptyset$ provided that $X_\alpha \neq \emptyset$ for each $\alpha \in A$.
Notation 2.3 Given a set $X$, let $2^X$ denote the power set of $X$, that is, the collection of all subsets of $X$ including the empty set.
The reason for writing the power set of $X$ as $2^X$ is that if we think of $2$ as meaning $\{0, 1\}$, then an element $a \in 2^X = \{0, 1\}^X$ is completely determined by the set
\[ A := \{x \in X : a(x) = 1\} \subset X. \]
In this way elements in $\{0, 1\}^X$ are in one to one correspondence with subsets of $X$.
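The correspondence between $\{0,1\}^X$ and subsets of $X$ can be checked by brute force on a small finite set. The following is only an illustrative sketch: the three-point set `X` and the helper names `subset_from_indicator` and `indicator_from_subset` are assumptions made for this example and do not come from the text.

```python
from itertools import product


def subset_from_indicator(X, a):
    """Return A = {x in X : a(x) = 1} for a function a: X -> {0, 1} given as a dict."""
    return frozenset(x for x in X if a[x] == 1)


def indicator_from_subset(X, A):
    """Return the function a: X -> {0, 1} with a(x) = 1 exactly when x is in A."""
    return {x: 1 if x in A else 0 for x in X}


X = ["p", "q", "r"]  # hypothetical three-point set

# Enumerate every element of {0,1}^X as a dict x -> 0/1.
indicators = [dict(zip(X, bits)) for bits in product((0, 1), repeat=len(X))]
subsets = {subset_from_indicator(X, a) for a in indicators}

# Distinct indicator functions give distinct subsets, and the two maps are
# inverse to each other, so {0,1}^X and the power set have the same size 2^|X|.
assert len(subsets) == 2 ** len(X)
assert all(indicator_from_subset(X, subset_from_indicator(X, a)) == a
           for a in indicators)
```

For an infinite set $X$ no such enumeration is possible, but the two maps are defined in the same way.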
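For $A \in 2^X$ let
\[ A^c := X \setminus A = \{x \in X : x \notin A\} \]
and more generally if $A, B \subset X$ let
\[ B \setminus A := \{x \in B : x \notin A\} = B \cap A^c. \]
We also define the symmetric difference of $A$ and $B$ by
\[ A \triangle B := (B \setminus A) \cup (A \setminus B). \]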
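As usual if $\{A_\alpha\}_{\alpha \in I}$ is an indexed collection of subsets of $X$ we define the union and the intersection of this collection by
\[ \cup_{\alpha \in I} A_\alpha := \{x \in X : \exists\, \alpha \in I \ni x \in A_\alpha\} \]
and
\[ \cap_{\alpha \in I} A_\alpha := \{x \in X : x \in A_\alpha \ \forall\, \alpha \in I\}. \]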
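Notation 2.4 We will also write $\coprod_{\alpha \in I} A_\alpha$ for $\cup_{\alpha \in I} A_\alpha$ in the case that $\{A_\alpha\}_{\alpha \in I}$ are pairwise disjoint, i.e. $A_\alpha \cap A_\beta = \emptyset$ if $\alpha \neq \beta$.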
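Notice that $\cup$ is closely related to $\exists$ and $\cap$ is closely related to $\forall$. For example let $\{A_n\}_{n=1}^\infty$ be a sequence of subsets from $X$ and define
\[ \{A_n \text{ i.o.}\} := \{x \in X : \#\{n : x \in A_n\} = \infty\} \]
and
\[ \{A_n \text{ a.a.}\} := \{x \in X : x \in A_n \text{ for all } n \text{ sufficiently large}\}. \]
(One should read $\{A_n \text{ i.o.}\}$ as $A_n$ infinitely often and $\{A_n \text{ a.a.}\}$ as $A_n$ almost always.) Then $x \in \{A_n \text{ i.o.}\}$ iff
\[ \forall N \in \mathbb{N} \ \exists\, n \geq N \ni x \in A_n \]
and this may be expressed as
\[ \{A_n \text{ i.o.}\} = \cap_{N=1}^\infty \cup_{n \geq N} A_n. \]
Similarly, $x \in \{A_n \text{ a.a.}\}$ iff
\[ \exists\, N \in \mathbb{N} \ni \forall n \geq N, \ x \in A_n \]
which may be written as
\[ \{A_n \text{ a.a.}\} = \cup_{N=1}^\infty \cap_{n \geq N} A_n. \]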
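The two formulas can be evaluated exactly for a sequence of sets that is eventually periodic, since then only finitely many distinct tails occur. The sketch below rests on that assumption: the sequence is given as a finite `prefix` followed by a non-empty `cycle` repeated forever, and the function names are hypothetical, chosen for this illustration only.

```python
def tail_sets(prefix, cycle, N):
    """Return the distinct sets A_n with n >= N (1-indexed).

    The sequence is A_1, ..., A_L = prefix, followed by cycle repeated forever,
    so a tail consists of the rest of the prefix plus every set in the cycle.
    """
    L = len(prefix)
    if N > L:
        return [set(A) for A in cycle]
    return [set(A) for A in list(prefix)[N - 1:] + list(cycle)]


def infinitely_often(prefix, cycle):
    """{A_n i.o.}: intersection over N of the union over n >= N of A_n."""
    L = len(prefix)
    # For N > L + 1 the tail union equals the one for N = L + 1.
    unions = [set().union(*tail_sets(prefix, cycle, N)) for N in range(1, L + 2)]
    return set.intersection(*unions)


def almost_always(prefix, cycle):
    """{A_n a.a.}: union over N of the intersection over n >= N of A_n."""
    L = len(prefix)
    inters = [set.intersection(*tail_sets(prefix, cycle, N)) for N in range(1, L + 2)]
    return set().union(*inters)


# A_1 = {1,2,3,4}, then {1,2}, {2,3}, {1,2}, {2,3}, ...
io = infinitely_often([{1, 2, 3, 4}], [{1, 2}, {2, 3}])
aa = almost_always([{1, 2, 3, 4}], [{1, 2}, {2, 3}])
assert io == {1, 2, 3}   # 4 occurs only once; 1, 2, 3 occur infinitely often
assert aa == {2}         # only 2 lies in every A_n from some point on
```

In this example $\{A_n \text{ a.a.}\} \subset \{A_n \text{ i.o.}\}$, as the definitions require for any sequence of sets.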
Definition 2.5. A set $X$ is said to be countable if $X$ is empty or there is an injective function $f : X \to \mathbb{N}$, otherwise $X$ is said to be uncountable.
Lemma 2.6 (Basic Properties of Countable Sets).
1. If $A \subset X$ is a subset of a countable set $X$ then $A$ is countable.
2. Any infinite subset $\Lambda \subset \mathbb{N}$ is in one to one correspondence with $\mathbb{N}$.
3. A non-empty set $X$ is countable iff there exists a surjective map, $g : \mathbb{N} \to X$.
4. If $X$ and $Y$ are countable then $X \times Y$ is countable.
5. Suppose for each $m \in \mathbb{N}$ that $A_m$ is a countable subset of a set $X$, then $A = \cup_{m=1}^\infty A_m$ is countable. In short, the countable union of countable sets is still countable.
6. If $X$ is an infinite set and $Y$ is a set with at least two elements, then $Y^X$ is uncountable. In particular $2^X$ is uncountable for any infinite set $X$.
Proof. 1. If $f : X \to \mathbb{N}$ is an injective map then so is the restriction, $f|_A$, of $f$ to the subset $A$.
2. Let $f(1) = \min \Lambda$ and define $f$ inductively by
\[ f(n + 1) = \min\left(\Lambda \setminus \{f(1), \dots, f(n)\}\right). \]
Since $\Lambda$ is infinite the process continues indefinitely. The function $f : \mathbb{N} \to \Lambda$ defined this way is a bijection.
3. If $g : \mathbb{N} \to X$ is a surjective map, let
\[ f(x) = \min g^{-1}(\{x\}) = \min\{n \in \mathbb{N} : g(n) = x\}. \]
Then $f : X \to \mathbb{N}$ is injective which combined with item 2. (taking $\Lambda = f(X)$) shows $X$ is countable. Conversely if $f : X \to \mathbb{N}$ is injective let $x_0 \in X$ be a fixed point and define $g : \mathbb{N} \to X$ by $g(n) = f^{-1}(n)$ for $n \in f(X)$ and $g(n) = x_0$ otherwise.
4. Let us first construct a bijection, $h$, from $\mathbb{N}$ to $\mathbb{N} \times \mathbb{N}$. To do this put the elements of $\mathbb{N} \times \mathbb{N}$ into an array of the form
\[
\begin{pmatrix}
(1,1) & (1,2) & (1,3) & \dots \\
(2,1) & (2,2) & (2,3) & \dots \\
(3,1) & (3,2) & (3,3) & \dots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\]
and then "count" these elements by counting the sets $\{(i, j) : i + j = k\}$ one at a time. For example let $h(1) = (1, 1)$, $h(2) = (2, 1)$, $h(3) = (1, 2)$, $h(4) = (3, 1)$, $h(5) = (2, 2)$, $h(6) = (1, 3)$ and so on. If $f : \mathbb{N} \to X$ and $g : \mathbb{N} \to Y$ are surjective functions, then the function $(f \times g) \circ h : \mathbb{N} \to X \times Y$ is surjective where $(f \times g)(m, n) := (f(m), g(n))$ for all $(m, n) \in \mathbb{N} \times \mathbb{N}$.
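5. If $A = \emptyset$ then $A$ is countable by definition so we may assume $A \neq \emptyset$. Without loss of generality we may assume $A_1 \neq \emptyset$ and by replacing $A_m$ by $A_1$ if necessary we may also assume $A_m \neq \emptyset$ for all $m$. For each $m \in \mathbb{N}$ let $a_m : \mathbb{N} \to A_m$ be a surjective function and then define $f : \mathbb{N} \times \mathbb{N} \to \cup_{m=1}^\infty A_m$ by $f(m, n) := a_m(n)$. The function $f$ is surjective and hence so is the composition, $f \circ h : \mathbb{N} \to \cup_{m=1}^\infty A_m$, where $h : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$ is the bijection defined above.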
6. Let us begin by showing $2^{\mathbb{N}} = \{0,1\}^{\mathbb{N}}$ is uncountable. For sake of contradiction suppose $f : \mathbb{N} \to \{0,1\}^{\mathbb{N}}$ is a surjection and write $f(n)$ as $(f_1(n), f_2(n), f_3(n), \dots)$. Now define $a \in \{0,1\}^{\mathbb{N}}$ by $a_n := 1 - f_n(n)$. By construction $f_n(n) \neq a_n$ for all $n$ and so $a \notin f(\mathbb{N})$. This contradicts the assumption that $f$ is surjective and shows $2^{\mathbb{N}}$ is uncountable. For the general case, since $Y_0^X \subset Y^X$ for any subset $Y_0 \subset Y$, if $Y_0^X$ is uncountable then so is $Y^X$. In this way we may assume $Y_0$ is a two point set which may as well be $Y_0 = \{0,1\}$. Moreover, since $X$ is an infinite set we may find an injective map $x : \mathbb{N} \to X$ and use this to set up an injection, $i : 2^{\mathbb{N}} \to 2^X$ by setting $i(A) := \{x_n : n \in A\} \subset X$ for all $A \subset \mathbb{N}$. If $2^X$ were countable we could find an injective map $f : 2^X \to \mathbb{N}$ (see item 3.), in which case $f \circ i : 2^{\mathbb{N}} \to \mathbb{N}$ would be injective as well and $2^{\mathbb{N}}$ would be countable. However this is impossible since we have already seen that $2^{\mathbb{N}}$ is uncountable.
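The counting scheme in item 4. can be sketched in a few lines of Python. This is only an illustration: the function name `antidiagonal_pairs` is made up here, and the order within each set $\{(i,j) : i+j=k\}$ is chosen to match the values $h(1), \dots, h(6)$ listed above.

```python
from itertools import count, islice

def antidiagonal_pairs():
    """Yield h(1), h(2), ... by walking the sets {(i, j) : i + j = k} for k = 2, 3, ..."""
    for k in count(2):
        # Within one anti-diagonal, i runs from k - 1 down to 1, as in the text's example.
        for i in range(k - 1, 0, -1):
            yield (i, k - i)

first_six = list(islice(antidiagonal_pairs(), 6))
first_thousand = list(islice(antidiagonal_pairs(), 1000))
print(first_six)
```

Every pair $(i,j)$ appears exactly once, on the anti-diagonal $k = i + j$, which is why the enumeration is a bijection.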
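We end this section with some notation which will be used frequently in the sequel.
Notation 2.7 If $f : X \to Y$ is a function and $\mathcal{E} \subset 2^Y$ let
\[ f^{-1}\mathcal{E} := f^{-1}(\mathcal{E}) := \{ f^{-1}(E) \mid E \in \mathcal{E} \}. \]
If $\mathcal{G} \subset 2^X$, let
\[ f_*\mathcal{G} := \{ A \in 2^Y \mid f^{-1}(A) \in \mathcal{G} \}. \]
Definition 2.8. Let $\mathcal{E} \subset 2^X$ be a collection of sets, $A \subset X$, $i_A : A \to X$ be the inclusion map ($i_A(x) = x$ for all $x \in A$) and
\[ \mathcal{E}_A = i_A^{-1}(\mathcal{E}) = \{ A \cap E : E \in \mathcal{E} \}. \]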
2.1 Exercises
Let $f : X \to Y$ be a function and $\{A_i\}_{i \in I}$ be an indexed family of subsets of $Y$, verify the following assertions.
Exercise 2.1. $(\cap_{i \in I} A_i)^c = \cup_{i \in I} A_i^c$.
Exercise 2.2. Suppose that $B \subset Y$, show that $B \setminus (\cup_{i \in I} A_i) = \cap_{i \in I} (B \setminus A_i)$.
Exercise 2.3. $f^{-1}(\cup_{i \in I} A_i) = \cup_{i \in I} f^{-1}(A_i)$.
Exercise 2.4. $f^{-1}(\cap_{i \in I} A_i) = \cap_{i \in I} f^{-1}(A_i)$.
Exercise 2.5. Find a counterexample which shows that $f(C \cap D) = f(C) \cap f(D)$ need not hold.
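Before proving these identities it can help to test them on a small finite example. The sketch below is not part of the text: the helpers `image` and `preimage`, the set `X` and the map `f(x) = x*x` are all hypothetical choices. It checks Exercises 2.3 and 2.4 on one instance and exhibits one possible counterexample for Exercise 2.5.

```python
def image(f, subset):
    """Forward image f(subset)."""
    return {f(x) for x in subset}

def preimage(f, domain, subset):
    """Inverse image f^{-1}(subset), computed by scanning the finite domain."""
    return {x for x in domain if f(x) in subset}

X = {-2, -1, 0, 1, 2}
f = lambda x: x * x          # not injective on X

# Inverse images commute with unions and intersections (Exercises 2.3 and 2.4).
A1, A2 = {0, 1}, {1, 4}
union_ok = preimage(f, X, A1 | A2) == preimage(f, X, A1) | preimage(f, X, A2)
inter_ok = preimage(f, X, A1 & A2) == preimage(f, X, A1) & preimage(f, X, A2)

# Forward images need not commute with intersections (Exercise 2.5).
C, D = {-1, -2}, {1, 2}
img_of_inter = image(f, C & D)            # C and D are disjoint
inter_of_imgs = image(f, C) & image(f, D)
print(union_ok, inter_ok, img_of_inter, inter_of_imgs)
```

The failure in the last two lines comes entirely from `f` not being injective: distinct points of `C` and `D` share the same image.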
3
A Brief Review of Real and Complex Numbers
Although it is assumed that the reader of this book is familiar with the properties of the real numbers, $\mathbb{R}$, nevertheless I feel it is instructive to define them here and sketch the development of their basic properties. It will most certainly be assumed that the reader is familiar with basic algebraic properties of the natural numbers $\mathbb{N}$ and the ordered field of rational numbers,
\[ \mathbb{Q} = \left\{ \frac{m}{n} : m, n \in \mathbb{Z},\ n \neq 0 \right\}. \]
As usual, for $q \in \mathbb{Q}$, we define
\[ |q| = \begin{cases} q & \text{if } q \geq 0 \\ -q & \text{if } q \leq 0. \end{cases} \]
Notice that if $q \in \mathbb{Q}$ and $|q| \leq n^{-1} := \frac{1}{n}$ for all $n$, then $q = 0$. Indeed, if $q \neq 0$, then $|q| = \frac{m}{n}$ for some $m, n \in \mathbb{N}$ and hence $|q| \geq \frac{1}{n}$. A similar argument shows $q \geq 0$ iff $q \geq -\frac{1}{n}$ for all $n \in \mathbb{N}$. These trivial remarks will be used in the future without further reference.
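Definition 3.1. A sequence $\{q_n\}_{n=1}^\infty \subset \mathbb{Q}$ converges to $q \in \mathbb{Q}$ if $|q - q_n| \to 0$ as $n \to \infty$, i.e. if for all $N \in \mathbb{N}$, $|q - q_n| \leq \frac{1}{N}$ for a.a. $n$. As usual if $\{q_n\}_{n=1}^\infty$ converges to $q$ we will write $q_n \to q$ as $n \to \infty$ or $q = \lim_{n\to\infty} q_n$.
Definition 3.2. A sequence $\{q_n\}_{n=1}^\infty \subset \mathbb{Q}$ is Cauchy if $|q_n - q_m| \to 0$ as $m, n \to \infty$. More precisely we require for each $N \in \mathbb{N}$ that $|q_m - q_n| \leq \frac{1}{N}$ for a.a. pairs $(m,n)$.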
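Exercise 3.1. Show that all convergent sequences $\{q_n\}_{n=1}^\infty \subset \mathbb{Q}$ are Cauchy and that all Cauchy sequences $\{q_n\}_{n=1}^\infty$ are bounded, i.e. there exists $M \in \mathbb{N}$ such that
\[ |q_n| \leq M \text{ for all } n \in \mathbb{N}. \]
Exercise 3.2. Suppose $\{q_n\}_{n=1}^\infty$ and $\{r_n\}_{n=1}^\infty$ are Cauchy sequences in $\mathbb{Q}$.
1. Show $\{q_n + r_n\}_{n=1}^\infty$ and $\{q_n \cdot r_n\}_{n=1}^\infty$ are Cauchy.
Now assume that $\{q_n\}_{n=1}^\infty$ and $\{r_n\}_{n=1}^\infty$ are convergent sequences in $\mathbb{Q}$.
2. Show $\{q_n + r_n\}_{n=1}^\infty$ and $\{q_n \cdot r_n\}_{n=1}^\infty$ are convergent in $\mathbb{Q}$ and
\[ \lim_{n\to\infty} (q_n + r_n) = \lim_{n\to\infty} q_n + \lim_{n\to\infty} r_n \quad\text{and}\quad \lim_{n\to\infty} (q_n r_n) = \lim_{n\to\infty} q_n \cdot \lim_{n\to\infty} r_n. \]
3. If we further assume $q_n \leq r_n$ for all $n$, show $\lim_{n\to\infty} q_n \leq \lim_{n\to\infty} r_n$. (It suffices to consider the case where $q_n = 0$ for all $n$.)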
The rational numbers suffer from the defect that they are not complete, i.e. not all Cauchy sequences are convergent. In fact, according to Corollary 3.14 below, most Cauchy sequences of rational numbers do not converge to a rational number.
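Exercise 3.3. Use the following outline to construct a Cauchy sequence $\{q_n\}_{n=1}^\infty \subset \mathbb{Q}$ which is not convergent in $\mathbb{Q}$.
1. Recall that there is no element $q \in \mathbb{Q}$ such that $q^2 = 2$.$^1$ To each $n \in \mathbb{N}$ let $m_n \in \mathbb{N}$ be chosen so that
\[ \frac{m_n^2}{n^2} < 2 < \frac{(m_n + 1)^2}{n^2} \tag{3.1} \]
and let $q_n := \frac{m_n}{n}$.
2. Verify that $q_n^2 \to 2$ as $n \to \infty$ and that $\{q_n\}_{n=1}^\infty$ is a Cauchy sequence in $\mathbb{Q}$.
3. Show $\{q_n\}_{n=1}^\infty$ does not have a limit in $\mathbb{Q}$.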
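The sequence in Exercise 3.3 is easy to compute exactly, which may help in guessing the estimates needed for item 2. In the sketch below (an illustration, not the author's solution) the name `q` is chosen here, and `math.isqrt` is used to find the integer $m_n$ of Eq. (3.1); this works because $2n^2$ is never a perfect square.

```python
from fractions import Fraction
from math import isqrt

def q(n):
    """q_n = m_n / n, where m_n is the integer with m_n^2 < 2 n^2 < (m_n + 1)^2."""
    m = isqrt(2 * n * n)      # floor of sqrt(2 n^2); equality m^2 = 2 n^2 cannot occur
    return Fraction(m, n)

ns = (1, 10, 100, 1000)
gaps = [2 - q(n) ** 2 for n in ns]   # exact rational values of 2 - q_n^2
for n, gap in zip(ns, gaps):
    print(n, q(n), float(gap))
```

From Eq. (3.1) one gets $0 < 2 - q_n^2 < (2 m_n + 1)/n^2$, which is the bound the printed gaps reflect.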
3.1 The Real Numbers
Let $\mathcal{C}$ denote the collection of Cauchy sequences $a = \{a_n\}_{n=1}^\infty \subset \mathbb{Q}$ and say $a, b \in \mathcal{C}$ are equivalent (write $a \sim b$) iff $\lim_{n\to\infty} |a_n - b_n| = 0$. (The reader should check that $\sim$ is an equivalence relation.)
Definition 3.3. A real number is an equivalence class, $\bar{a} := \{ b \in \mathcal{C} : b \sim a \}$ associated to some element $a \in \mathcal{C}$. The collection of real numbers will be denoted by $\mathbb{R}$. For $q \in \mathbb{Q}$, let $i(q) = \bar{a}$ where $a$ is the constant sequence $a_n = q$ for all $n \in \mathbb{N}$. We will simply write $0$ for $i(0)$ and $1$ for $i(1)$.
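Exercise 3.4. Given $\bar{a}, \bar{b} \in \mathbb{R}$ show that the definitions
\[ -\bar{a} = \overline{(-a)}, \quad \bar{a} + \bar{b} := \overline{(a + b)} \quad\text{and}\quad \bar{a} \cdot \bar{b} := \overline{a \cdot b} \]
are well defined. Here $-a$, $a + b$ and $a \cdot b$ denote the sequences $\{-a_n\}_{n=1}^\infty$, $\{a_n + b_n\}_{n=1}^\infty$ and $\{a_n \cdot b_n\}_{n=1}^\infty$ respectively. Further verify that with these operations, $\mathbb{R}$ becomes a field and the map $i : \mathbb{Q} \to \mathbb{R}$ is an injective homomorphism of fields. Hint: if $\bar{a} \neq 0$ show that $\bar{a}$ may be represented by a sequence $a \in \mathcal{C}$ with $|a_n| \geq \frac{1}{N}$ for all $n$ and some $N \in \mathbb{N}$. For this representative show the sequence $a^{-1} := \{a_n^{-1}\}_{n=1}^\infty \in \mathcal{C}$. The multiplicative inverse to $\bar{a}$ may now be constructed as:
\[ \frac{1}{\bar{a}} = \bar{a}^{-1} := \overline{\{a_n^{-1}\}_{n=1}^\infty}. \]
$^1$ This fact also shows that the intermediate value theorem (see Theorem 13.50 below) fails when working with continuous functions defined over $\mathbb{Q}$.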
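Definition 3.4. Let $\bar{a}, \bar{b} \in \mathbb{R}$. Then
1. $\bar{a} > 0$ if there exists an $N \in \mathbb{N}$ such that $a_n > \frac{1}{N}$ for a.a. $n$.
2. $\bar{a} \geq 0$ iff either $\bar{a} > 0$ or $\bar{a} = 0$. Equivalently (as the reader should verify), $\bar{a} \geq 0$ iff for all $N \in \mathbb{N}$, $a_n \geq -\frac{1}{N}$ for a.a. $n$.
3. Write $\bar{a} > \bar{b}$ or $\bar{b} < \bar{a}$ if $\bar{a} - \bar{b} > 0$.
4. Write $\bar{a} \geq \bar{b}$ or $\bar{b} \leq \bar{a}$ if $\bar{a} - \bar{b} \geq 0$.
Exercise 3.5. Show "$\geq$" makes $\mathbb{R}$ into a linearly ordered field and the map $i : \mathbb{Q} \to \mathbb{R}$ preserves order. Namely if $\bar{a}, \bar{b} \in \mathbb{R}$ then
1. exactly one of the following relations holds: $\bar{a} < \bar{b}$ or $\bar{a} > \bar{b}$ or $\bar{a} = \bar{b}$.
2. If $\bar{a} \geq 0$ and $\bar{b} \geq 0$ then $\bar{a} + \bar{b} \geq 0$ and $\bar{a} \cdot \bar{b} \geq 0$.
3. If $q, r \in \mathbb{Q}$ then $q \leq r$ iff $i(q) \leq i(r)$.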
The absolute value of a real number $\bar{a}$ is defined analogously to that of a rational number by
\[ |\bar{a}| = \begin{cases} \bar{a} & \text{if } \bar{a} \geq 0 \\ -\bar{a} & \text{if } \bar{a} < 0. \end{cases} \]
Observe this definition is consistent with our previous definition of the absolute value on $\mathbb{Q}$, namely $i(|q|) = |i(q)|$. Also notice that $\bar{a} = 0$ (i.e. $a \sim 0$ where $0$ denotes the constant sequence of all zeros) iff for all $N \in \mathbb{N}$, $|a_n| \leq \frac{1}{N}$ for a.a. $n$. This is equivalent to saying $|\bar{a}| \leq i\left(\frac{1}{N}\right)$ for all $N \in \mathbb{N}$ iff $\bar{a} = 0$.
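Definition 3.5. A sequence $\{\bar{a}_n\}_{n=1}^\infty \subset \mathbb{R}$ converges to $\bar{a} \in \mathbb{R}$ if $|\bar{a} - \bar{a}_n| \to 0$ as $n \to \infty$, i.e. if for all $N \in \mathbb{N}$, $|\bar{a} - \bar{a}_n| \leq i\left(\frac{1}{N}\right)$ for a.a. $n$. As before (for rational numbers) if $\{\bar{a}_n\}_{n=1}^\infty$ converges to $\bar{a}$ we will write $\bar{a}_n \to \bar{a}$ as $n \to \infty$ or $\bar{a} = \lim_{n\to\infty} \bar{a}_n$.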
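Exercise 3.6. Given $\bar{a}, \bar{b} \in \mathbb{R}$ show
\[ |\bar{a}\bar{b}| = |\bar{a}|\,|\bar{b}| \quad\text{and}\quad |\bar{a} + \bar{b}| \leq |\bar{a}| + |\bar{b}|. \]
The latter inequality being referred to as the triangle inequality.
By Exercise 3.6,
\[ |\bar{a}| = |\bar{a} - \bar{b} + \bar{b}| \leq |\bar{a} - \bar{b}| + |\bar{b}| \]
and hence
\[ |\bar{a}| - |\bar{b}| \leq |\bar{a} - \bar{b}| \]
and by reversing the roles of $\bar{a}$ and $\bar{b}$ we also have
\[ -\left( |\bar{a}| - |\bar{b}| \right) = |\bar{b}| - |\bar{a}| \leq |\bar{b} - \bar{a}| = |\bar{a} - \bar{b}|. \]
Therefore, $\left| |\bar{a}| - |\bar{b}| \right| \leq |\bar{a} - \bar{b}|$ and consequently if $\{\bar{a}_n\}_{n=1}^\infty \subset \mathbb{R}$ converges to $\bar{a} \in \mathbb{R}$ then
\[ \left| |\bar{a}_n| - |\bar{a}| \right| \leq |\bar{a}_n - \bar{a}| \to 0 \text{ as } n \to \infty. \]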
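Remark 3.6. The field $i(\mathbb{Q})$ is dense in $\mathbb{R}$ in the sense that if $\bar{a} \in \mathbb{R}$ there exists $\{q_n\}_{n=1}^\infty \subset \mathbb{Q}$ such that $i(q_n) \to \bar{a}$ as $n \to \infty$. Indeed, simply let $q_n = a_n$ where $a$ represents $\bar{a}$. Since $a$ is a Cauchy sequence, to any $N \in \mathbb{N}$ there exists $M \in \mathbb{N}$ such that
\[ -\frac{1}{N} \leq a_m - a_n \leq \frac{1}{N} \text{ for all } m, n \geq M \]
and therefore
\[ -i\left(\frac{1}{N}\right) \leq i(a_m) - \bar{a} \leq i\left(\frac{1}{N}\right) \text{ for all } m \geq M. \]
This shows
\[ |i(q_m) - \bar{a}| = |i(a_m) - \bar{a}| \leq i\left(\frac{1}{N}\right) \text{ for all } m \geq M \]
and since $N$ is arbitrary it follows that $i(q_m) \to \bar{a}$ as $m \to \infty$.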
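Definition 3.7. A sequence $\{\bar{a}_n\}_{n=1}^\infty \subset \mathbb{R}$ is Cauchy if $|\bar{a}_n - \bar{a}_m| \to 0$ as $m, n \to \infty$. More precisely we require for each $N \in \mathbb{N}$ that $|\bar{a}_m - \bar{a}_n| \leq i\left(\frac{1}{N}\right)$ for a.a. pairs $(m,n)$.
Exercise 3.7. The analogues of the results in Exercises 3.1 and 3.2 hold with $\mathbb{Q}$ replaced by $\mathbb{R}$. (We now say a subset $\Lambda \subset \mathbb{R}$ is bounded if there exists $M \in \mathbb{N}$ such that $|\lambda| \leq i(M)$ for all $\lambda \in \Lambda$.)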
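For the purposes of real analysis the most important property of $\mathbb{R}$ is that it is complete.
Theorem 3.8. The ordered field $\mathbb{R}$ is complete, i.e. all Cauchy sequences in $\mathbb{R}$ are convergent.
Proof. Suppose that $\{\bar{a}(m)\}_{m=1}^\infty$ is a Cauchy sequence in $\mathbb{R}$. By Remark 3.6, we may choose $q_m \in \mathbb{Q}$ such that
\[ |\bar{a}(m) - i(q_m)| \leq i(m^{-1}) \text{ for all } m \in \mathbb{N}. \]
Given $N \in \mathbb{N}$, choose $M \in \mathbb{N}$ such that $|\bar{a}(m) - \bar{a}(n)| \leq i(N^{-1})$ for all $m, n \geq M$. Then
\[ |i(q_m) - i(q_n)| \leq |i(q_m) - \bar{a}(m)| + |\bar{a}(m) - \bar{a}(n)| + |\bar{a}(n) - i(q_n)| \leq i(m^{-1}) + i(n^{-1}) + i(N^{-1}) \]
and therefore
\[ |q_m - q_n| \leq m^{-1} + n^{-1} + N^{-1} \text{ for all } m, n \geq M. \]
It now follows that $q = \{q_m\}_{m=1}^\infty \in \mathcal{C}$ and therefore $q$ represents a point $\bar{q} \in \mathbb{R}$. Using Remark 3.6 and the triangle inequality,
\[ |\bar{a}(m) - \bar{q}| \leq |\bar{a}(m) - i(q_m)| + |i(q_m) - \bar{q}| \leq i(m^{-1}) + |i(q_m) - \bar{q}| \to 0 \text{ as } m \to \infty \]
and therefore $\lim_{m\to\infty} \bar{a}(m) = \bar{q}$.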
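Definition 3.9. A number $M \in \mathbb{R}$ is an upper bound for a set $\Lambda \subset \mathbb{R}$ if $\lambda \leq M$ for all $\lambda \in \Lambda$ and a number $m \in \mathbb{R}$ is a lower bound for a set $\Lambda \subset \mathbb{R}$ if $\lambda \geq m$ for all $\lambda \in \Lambda$. Upper and lower bounds need not exist. If $\Lambda$ has an upper (lower) bound, $\Lambda$ is said to be bounded from above (below).
Theorem 3.10. To each non-empty set $\Lambda \subset \mathbb{R}$ which is bounded from above (below) there is a unique least upper bound denoted by $\sup \Lambda \in \mathbb{R}$ (respectively greatest lower bound denoted by $\inf \Lambda \in \mathbb{R}$).
Proof. Suppose $\Lambda$ is bounded from above and for each $n \in \mathbb{N}$, let $m_n \in \mathbb{Z}$ be the smallest integer such that $i\left(\frac{m_n}{2^n}\right)$ is an upper bound for $\Lambda$. The sequence $q_n := \frac{m_n}{2^n}$ is Cauchy because $q_m \in [q_n - 2^{-n}, q_n] \cap \mathbb{Q}$ for all $m \geq n$, i.e.
\[ |q_m - q_n| \leq 2^{-\min(m,n)} \to 0 \text{ as } m, n \to \infty. \]
Passing to the limit, $n \to \infty$, in the inequality $i(q_n) \geq \lambda$, which is valid for all $\lambda \in \Lambda$, implies
\[ \bar{q} = \lim_{n\to\infty} i(q_n) \geq \lambda \text{ for all } \lambda \in \Lambda. \]
Thus $\bar{q}$ is an upper bound for $\Lambda$. If there were another upper bound $M \in \mathbb{R}$ for $\Lambda$ such that $M < \bar{q}$, it would follow that $M \leq i(q_n) < \bar{q}$ for some $n$. But this is a contradiction because $\{q_n\}_{n=1}^\infty$ is a decreasing sequence, $i(q_n) \geq i(q_m)$ for all $m \geq n$ and therefore $i(q_n) \geq \bar{q}$ for all $n$. Therefore $\bar{q}$ is the unique least upper bound for $\Lambda$. The existence of lower bounds is proved analogously.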
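The dyadic construction in this proof can be run on a concrete set. The following is a sketch under stated assumptions, not the text's own algorithm: the example set $\Lambda = \{q \in \mathbb{Q} : q^2 < 2\}$, the predicate `is_upper_bound` and the function `dyadic_upper_bounds` are all chosen here for illustration. It uses the observation that $m_{n+1}$ is either $2m_n - 1$ or $2m_n$, which is exactly why $q_{n+1} \in [q_n - 2^{-(n+1)}, q_n]$.

```python
from fractions import Fraction

def is_upper_bound(x):
    """Hypothetical example: x bounds {q in Q : q*q < 2} from above iff x > 0 and x*x >= 2."""
    return x > 0 and x * x >= 2

def dyadic_upper_bounds(is_ub, m0, steps):
    """Return [q_0, ..., q_steps] with q_n = m_n / 2^n the smallest dyadic upper bound.

    Assumes m0 is the smallest integer upper bound of the set.
    """
    m, qs = m0, [Fraction(m0)]
    for n in range(1, steps + 1):
        # (2m - 2) / 2^n = (m - 1) / 2^(n-1) is not an upper bound, so only two candidates remain.
        m = 2 * m - 1 if is_ub(Fraction(2 * m - 1, 2 ** n)) else 2 * m
        qs.append(Fraction(m, 2 ** n))
    return qs

qs = dyadic_upper_bounds(is_upper_bound, 2, 20)
print([float(x) for x in qs[:6]], float(qs[-1]))
```

For this choice of set the values decrease towards $\sqrt{2}$, which is the least upper bound the theorem promises.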
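Proposition 3.11. If $\{a_n\}_{n=1}^\infty \subset \mathbb{R}$ is an increasing (decreasing) sequence which is bounded from above (below), then $\{a_n\}_{n=1}^\infty$ is convergent and
\[ \lim_{n\to\infty} a_n = \sup\{a_n : n \in \mathbb{N}\} \quad \left( \lim_{n\to\infty} a_n = \inf\{a_n : n \in \mathbb{N}\} \right). \]
If $\Lambda \subset \mathbb{R}$ is a set bounded from above then there exists $\{\lambda_n\} \subset \Lambda$ such that $\lambda_n \uparrow M := \sup \Lambda$ as $n \to \infty$, i.e. $\{\lambda_n\}$ is increasing and $\lim_{n\to\infty} \lambda_n = M$.
Proof. Let $M := \sup\{a_n : n \in \mathbb{N}\}$, then for each $N \in \mathbb{N}$ there must exist $m \in \mathbb{N}$ such that $M - i(N^{-1}) < a_m \leq M$. Since $a_n$ is increasing, it follows that
\[ M - i(N^{-1}) < a_n \leq M \text{ for all } n \geq m. \]
From this we conclude that $\lim a_n$ exists and $\lim a_n = M$. If $M = \sup \Lambda$, for each $n \in \mathbb{N}$ we may choose $\lambda_n \in \Lambda$ such that
\[ M - i(n^{-1}) < \lambda_n \leq M. \tag{3.2} \]
By replacing $\lambda_n$ by $\max\{\lambda_1, \dots, \lambda_n\}$$^2$ if necessary we may assume that $\lambda_n$ is increasing in $n$. It now follows easily from Eq. (3.2) that $\lim_{n\to\infty} \lambda_n = M$.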
3.1.1 The Decimal Representation of a Real Number
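Let $\alpha \in \mathbb{R}$ or $\mathbb{Q}$, $m, n \in \mathbb{Z}$ and $S := \sum_{k=n}^m \alpha^k$. If $\alpha = 1$ then $\sum_{k=n}^m \alpha^k = m - n + 1$ while for $\alpha \neq 1$,
\[ \alpha S - S = \alpha^{m+1} - \alpha^n \]
and solving for $S$ gives the important geometric summation formula,
\[ \sum_{k=n}^m \alpha^k = \frac{\alpha^{m+1} - \alpha^n}{\alpha - 1} \text{ if } \alpha \neq 1. \tag{3.3} \]
Taking $\alpha = 10^{-1}$ in Eq. (3.3) implies
\[ \sum_{k=n}^m 10^{-k} = \frac{10^{-(m+1)} - 10^{-n}}{10^{-1} - 1} = \frac{1}{10^{n-1}} \cdot \frac{1 - 10^{-(m-n+1)}}{9} \]
and in particular, for all $M \geq n$,
\[ \lim_{m\to\infty} \sum_{k=n}^m 10^{-k} = \frac{1}{9 \cdot 10^{n-1}} \geq \sum_{k=n}^M 10^{-k}. \]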
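Eq. (3.3) and its specialization to $\alpha = 10^{-1}$ can be sanity-checked with exact rational arithmetic. The helper names and the sample values $n = 3$, $m = 12$ below are arbitrary choices made for this sketch.

```python
from fractions import Fraction

def geometric_sum(alpha, n, m):
    """Direct evaluation of sum_{k=n}^{m} alpha^k."""
    return sum(alpha ** k for k in range(n, m + 1))

def closed_form(alpha, n, m):
    """Right-hand side of Eq. (3.3); requires alpha != 1."""
    return (alpha ** (m + 1) - alpha ** n) / (alpha - 1)

alpha = Fraction(1, 10)
n, m = 3, 12
lhs = geometric_sum(alpha, n, m)
rhs = closed_form(alpha, n, m)
limit = Fraction(1, 9 * 10 ** (n - 1))   # value of the infinite sum starting at k = n
print(lhs == rhs, limit - lhs)
```

The difference `limit - lhs` is exactly $10^{-m}/9$, which makes the inequality in the last display visible.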
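Let $\mathbb{D}$ denote those sequences $\alpha \in \{0, 1, 2, \dots, 9\}^{\mathbb{Z}}$ with the following properties:
1. there exists $N \in \mathbb{N}$ such that $\alpha_{-n} = 0$ for all $n \geq N$ and
2. $\alpha_n \neq 0$ for some $n \in \mathbb{Z}$.
$^2$ The notation, $\max \Lambda$, denotes $\sup \Lambda$ along with the assertion that $\sup \Lambda \in \Lambda$. Similarly, $\min \Lambda = \inf \Lambda$ along with the assertion that $\inf \Lambda \in \Lambda$.
Associated to each $\alpha \in \mathbb{D}$ is the sequence $a = a(\alpha)$ defined by
\[ a_n := \sum_{k=-\infty}^n \alpha_k 10^{-k}. \]
Since for $m > n$,
\[ |a_m - a_n| = \left| \sum_{k=n+1}^m \alpha_k 10^{-k} \right| \leq 9 \sum_{k=n+1}^m 10^{-k} \leq 9 \cdot \frac{1}{9 \cdot 10^n} = \frac{1}{10^n}, \]
it follows that
\[ |a_m - a_n| \leq \frac{1}{10^{\min(m,n)}} \to 0 \text{ as } m, n \to \infty. \]
Therefore $a = a(\alpha) \in \mathcal{C}$ and we may define a map $D : \{\pm 1\} \times \mathbb{D} \to \mathbb{R}$ by $D(\varepsilon, \alpha) = \varepsilon\, \overline{a(\alpha)}$. As is customary we will denote $D(\varepsilon, \alpha) = \varepsilon\, \overline{a(\alpha)}$ as
\[ \varepsilon \cdot \alpha_m \dots \alpha_0 . \alpha_1 \alpha_2 \dots \alpha_n \dots \tag{3.4} \]
where $m$ is the largest integer in $\mathbb{Z}$ such that $\alpha_k = 0$ for all $k < m$. If $m > 0$ the expression in Eq. (3.4) should be interpreted as
\[ \varepsilon \cdot 0.0 \dots 0\, \alpha_m \alpha_{m+1} \dots . \]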
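An element $\alpha \in \mathbb{D}$ has a tail of all 9's starting at $N \in \mathbb{N}$ if $\alpha_n = 9$ for all $n \geq N$ and $\alpha_{N-1} \neq 9$. If $\alpha$ has a tail of 9's starting at $N \in \mathbb{N}$, then for $n > N$, using Eq. (3.3),
\[ a_n(\alpha) = \sum_{k=-\infty}^{N-1} \alpha_k 10^{-k} + 9 \sum_{k=N}^n 10^{-k} = \sum_{k=-\infty}^{N-1} \alpha_k 10^{-k} + \frac{9}{10^{N-1}} \cdot \frac{1 - 10^{-(n-N+1)}}{9} \to \sum_{k=-\infty}^{N-1} \alpha_k 10^{-k} + 10^{-(N-1)} \text{ as } n \to \infty. \]
If $\alpha'$ is the digits in the decimal expansion of $\sum_{k=-\infty}^{N-1} \alpha_k 10^{-k} + 10^{-(N-1)}$, then
\[ \alpha' \in \mathbb{D}' := \{ \alpha \in \mathbb{D} : \alpha \text{ does not have a tail of all 9's} \} \]
and we have just shown that $D(\varepsilon, \alpha) = D(\varepsilon, \alpha')$. In particular this implies
\[ D(\{\pm 1\} \times \mathbb{D}') = D(\{\pm 1\} \times \mathbb{D}). \tag{3.5} \]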
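Theorem 3.12 (Decimal Representation). The map
\[ D : \{\pm 1\} \times \mathbb{D}' \to \mathbb{R} \setminus \{0\} \]
is a bijection.
Proof. Suppose $D(\varepsilon, \alpha) = D(\delta, \beta)$ for some $(\varepsilon, \alpha)$ and $(\delta, \beta)$ in $\{\pm 1\} \times \mathbb{D}$. Since $D(\varepsilon, \alpha) > 0$ if $\varepsilon = 1$ and $D(\varepsilon, \alpha) < 0$ if $\varepsilon = -1$ it follows that $\varepsilon = \delta$. Let $a = a(\alpha)$ and $b = a(\beta)$ be the sequences associated to $\alpha$ and $\beta$ respectively. Suppose that $\alpha \neq \beta$ and let $j \in \mathbb{Z}$ be the position where $\alpha$ and $\beta$ first disagree, i.e. $\alpha_n = \beta_n$ for all $n < j$ while $\alpha_j \neq \beta_j$. For sake of definiteness suppose $\beta_j > \alpha_j$. Then for $n > j$ we have
\[ b_n - a_n = (\beta_j - \alpha_j) 10^{-j} + \sum_{k=j+1}^n (\beta_k - \alpha_k) 10^{-k} \geq 10^{-j} - 9 \sum_{k=j+1}^n 10^{-k} \geq 10^{-j} - 9 \cdot \frac{1}{9 \cdot 10^j} = 0. \]
Therefore $b_n - a_n \geq 0$ for all $n$ and $\lim (b_n - a_n) = 0$ iff $\beta_j = \alpha_j + 1$ and $\alpha_k = 9$ and $\beta_k = 0$ for all $k > j$. In summary, $D(\varepsilon, \alpha) = D(\delta, \beta)$ with $\alpha \neq \beta$ implies either $\alpha$ or $\beta$ has an infinite tail of nines which shows that $D$ is injective when restricted to $\{\pm 1\} \times \mathbb{D}'$. To see that $D$ is surjective it suffices to show any $\bar{b} \in \mathbb{R}$ with $0 < \bar{b} < 1$ is in the range of $D$. For each $n \in \mathbb{N}$, let $a_n = .\alpha_1 \dots \alpha_n$ with $\alpha_i \in \{0, 1, 2, \dots, 9\}$ such that
\[ i(a_n) < \bar{b} \leq i(a_n) + i(10^{-n}). \tag{3.6} \]
Since $a_{n+1} = a_n + \alpha_{n+1} 10^{-(n+1)}$ for some $\alpha_{n+1} \in \{0, 1, 2, \dots, 9\}$, we see that $a_{n+1} = .\alpha_1 \dots \alpha_n \alpha_{n+1}$, i.e. the first $n$ digits in the decimal expansion of $a_{n+1}$ are the same as in the decimal expansion of $a_n$. Hence this defines $\alpha_n$ uniquely for all $n \geq 1$. By setting $\alpha_n = 0$ when $n \leq 0$, we have constructed from $\bar{b}$ an element $\alpha \in \mathbb{D}$. Because of Eq. (3.6), $D(1, \alpha) = \bar{b}$.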
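The surjectivity argument is constructive, and for a rational $0 < b < 1$ the digits determined by Eq. (3.6) can be computed exactly. The sketch below is illustrative only: the names `truncation` and `digits` are invented here, and `b` is restricted to `Fraction` inputs. Note that the strict inequality on the left of (3.6) selects the expansion with a tail of 9's whenever two expansions exist, which is harmless because of Eq. (3.5).

```python
from fractions import Fraction
from math import ceil

def truncation(b, n):
    """The multiple a_n of 10^-n with a_n < b <= a_n + 10^-n, as in Eq. (3.6)."""
    return Fraction(ceil(b * 10 ** n) - 1, 10 ** n)

def digits(b, count):
    """First `count` digits alpha_1, alpha_2, ... determined by Eq. (3.6) for 0 < b < 1."""
    out, prev = [], Fraction(0)
    for n in range(1, count + 1):
        a_n = truncation(b, n)
        out.append(int((a_n - prev) * 10 ** n))   # alpha_n = (a_n - a_{n-1}) * 10^n
        prev = a_n
    return out

print(digits(Fraction(1, 3), 5))
print(digits(Fraction(1, 2), 5))
```

For `Fraction(1, 2)` the construction produces the digits of $0.4999\ldots$ rather than $0.5000\ldots$, an instance of the tail-of-9's phenomenon discussed before Theorem 3.12.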
Notation 3.13 From now on we will identify $\mathbb{Q}$ with $i(\mathbb{Q}) \subset \mathbb{R}$ and elements in $\mathbb{R}$ with their decimal expansions.
To summarize, we have constructed a complete ordered field $\mathbb{R}$ containing $\mathbb{Q}$ as a dense subset. Moreover every element in $\mathbb{R}$ (modulo those of the form $m 10^{-n}$ for some $m \in \mathbb{Z}$ and $n \in \mathbb{N}$) has a unique decimal expansion.
Corollary 3.14. The set $(0,1) := \{a \in \mathbb{R} : 0 < a < 1\}$ is uncountable while $\mathbb{Q} \cap (0,1)$ is countable.
Proof. By Theorem 3.12, the set $\{0, 1, 2, \dots, 8\}^{\mathbb{N}}$ can be mapped injectively into $(0,1)$ and therefore it follows from Lemma 2.6 that $(0,1)$ is uncountable. For each $m \in \mathbb{N}$, let $A_m := \left\{ \frac{n}{m} : n \in \mathbb{N} \text{ with } n < m \right\}$. Since $\mathbb{Q} \cap (0,1) = \cup_{m=1}^\infty A_m$ and $\#(A_m) < \infty$ for all $m$, another application of Lemma 2.6 shows $\mathbb{Q} \cap (0,1)$ is countable.
3.2 The Complex Numbers
Definition 3.15 (Complex Numbers). Let $\mathbb{C} = \mathbb{R}^2$ equipped with multiplication rule
\[ (a,b)(c,d) := (ac - bd, bc + ad) \tag{3.7} \]
and the usual rule for vector addition. As is standard we will write $0 = (0,0)$, $1 = (1,0)$ and $i = (0,1)$ so that every element $z$ of $\mathbb{C}$ may be written as $z = (x,y) = x1 + yi$ which in the future will be written simply as $z = x + iy$. If $z = x + iy$, let $\operatorname{Re} z = x$ and $\operatorname{Im} z = y$.
Writing $z = a + ib$ and $w = c + id$, the multiplication rule in Eq. (3.7) becomes
\[ (a + ib)(c + id) := (ac - bd) + i(bc + ad) \tag{3.8} \]
and in particular $1^2 = 1$ and $i^2 = -1$.
Proposition 3.16. The complex numbers $\mathbb{C}$ with the above multiplication rule satisfy the usual definitions of a field. For example $wz = zw$ and $z(w_1 + w_2) = zw_1 + zw_2$, etc. Moreover if $z = a + ib \neq 0$, $z$ has a multiplicative inverse given by
\[ z^{-1} = \frac{a}{a^2 + b^2} - i \frac{b}{a^2 + b^2}. \tag{3.9} \]
Proof. The proof is a straightforward verification. Only the last assertion will be verified here. Suppose $z = a + ib \neq 0$, we wish to find $w = c + id$ such that $zw = 1$ and this happens by Eq. (3.8) iff
\[ ac - bd = 1 \text{ and} \tag{3.10} \]
\[ bc + ad = 0. \tag{3.11} \]
Solving these equations for $c$ and $d$ gives $c = \frac{a}{a^2 + b^2}$ and $d = -\frac{b}{a^2 + b^2}$ as claimed.
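Definition 3.15 and Eq. (3.9) can be checked mechanically by representing complex numbers as pairs, exactly as in the definition. The function names `mul` and `inv` are chosen for this sketch, and rational entries are used so that the arithmetic is exact.

```python
from fractions import Fraction

def mul(z, w):
    """Multiplication rule of Eq. (3.7): (a, b)(c, d) = (ac - bd, bc + ad)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)

def inv(z):
    """Inverse from Eq. (3.9); requires z != (0, 0)."""
    a, b = z
    r2 = a * a + b * b
    return (a / r2, -b / r2)

one = (Fraction(1), Fraction(0))
i = (Fraction(0), Fraction(1))
z = (Fraction(3), Fraction(4))

print(mul(i, i))          # i^2
print(mul(z, inv(z)))     # z * z^{-1}
```

Here `mul(z, inv(z))` reproduces Eqs. (3.10) and (3.11): the first component is $ac - bd = 1$ and the second is $bc + ad = 0$.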
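Notation 3.17 (Conjugation and Modulus) If $z = a + ib$ with $a, b \in \mathbb{R}$ let $\bar{z} = a - ib$ and
\[ |z| := \sqrt{z \bar{z}} = \sqrt{a^2 + b^2} = \sqrt{|\operatorname{Re} z|^2 + |\operatorname{Im} z|^2}. \]
See Exercise 3.8 for the existence of the square root as a positive real number. Notice that
\[ \operatorname{Re} z = \frac{1}{2}(z + \bar{z}) \quad\text{and}\quad \operatorname{Im} z = \frac{1}{2i}(z - \bar{z}). \tag{3.12} \]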
Proposition 3.18. Complex conjugation and the modulus operators satisfy the following properties.
1. $\bar{\bar{z}} = z$,
2. $\overline{zw} = \bar{z}\bar{w}$ and $\overline{z + w} = \bar{z} + \bar{w}$.
3. $|\bar{z}| = |z|$
4. $|zw| = |z|\,|w|$ and in particular $|z^n| = |z|^n$ for all $n \in \mathbb{N}$.
5. $|\operatorname{Re} z| \leq |z|$ and $|\operatorname{Im} z| \leq |z|$
6. $|z + w| \leq |z| + |w|$.
7. $z = 0$ iff $|z| = 0$.
8. If $z \neq 0$ then $z^{-1} := \frac{\bar{z}}{|z|^2}$ (also written as $\frac{1}{z}$) is the inverse of $z$.
9. $|z^{-1}| = |z|^{-1}$ and more generally $|z^n| = |z|^n$ for all $n \in \mathbb{Z}$.
Proof. All of these properties are direct computations except for possibly the triangle inequality in item 6 which is verified by the following computation;
\[
\begin{aligned}
|z + w|^2 &= (z + w)\overline{(z + w)} = |z|^2 + |w|^2 + w\bar{z} + \bar{w}z \\
&= |z|^2 + |w|^2 + w\bar{z} + \overline{w\bar{z}} \\
&= |z|^2 + |w|^2 + 2\operatorname{Re}(w\bar{z}) \leq |z|^2 + |w|^2 + 2|z|\,|w| \\
&= (|z| + |w|)^2.
\end{aligned}
\]
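Definition 3.19. A sequence $\{z_n\}_{n=1}^\infty \subset \mathbb{C}$ is Cauchy if $|z_n - z_m| \to 0$ as $m, n \to \infty$ and is convergent to $z \in \mathbb{C}$ if $|z - z_n| \to 0$ as $n \to \infty$. As usual if $\{z_n\}_{n=1}^\infty$ converges to $z$ we will write $z_n \to z$ as $n \to \infty$ or $z = \lim_{n\to\infty} z_n$.
Theorem 3.20. The complex numbers are complete, i.e. all Cauchy sequences are convergent.
Proof. This follows from the completeness of real numbers and the easily proved observations that if $z_n = a_n + ib_n \in \mathbb{C}$, then
1. $\{z_n\}_{n=1}^\infty \subset \mathbb{C}$ is Cauchy iff $\{a_n\}_{n=1}^\infty \subset \mathbb{R}$ and $\{b_n\}_{n=1}^\infty \subset \mathbb{R}$ are Cauchy and
2. $z_n \to z = a + ib$ as $n \to \infty$ iff $a_n \to a$ and $b_n \to b$ as $n \to \infty$.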
3.3 Exercises
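Exercise 3.8. Show to every $a \in \mathbb{R}$ with $a \geq 0$ there exists a unique number $b \in \mathbb{R}$ such that $b \geq 0$ and $b^2 = a$. Of course we will call $b = \sqrt{a}$. Also show that $a \to \sqrt{a}$ is an increasing function on $[0, \infty)$. Hint: To construct $b = \sqrt{a}$ for $a > 0$, to each $n \in \mathbb{N}$ let $m_n \in \mathbb{N}_0$ be chosen so that
\[ \frac{m_n^2}{n^2} < a \leq \frac{(m_n + 1)^2}{n^2}, \quad\text{i.e.}\quad i\left(\frac{m_n^2}{n^2}\right) < a \leq i\left(\frac{(m_n + 1)^2}{n^2}\right) \]
and let $q_n := \frac{m_n}{n}$. Then show $b = \overline{\{q_n\}_{n=1}^\infty} \in \mathbb{R}$ satisfies $b > 0$ and $b^2 = a$.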
4
Limits and Sums
4.1 Limsups, Liminfs and Extended Limits
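Notation 4.1 The extended real numbers is the set $\bar{\mathbb{R}} := \mathbb{R} \cup \{\pm\infty\}$, i.e. it is $\mathbb{R}$ with two new points called $\infty$ and $-\infty$. We use the following conventions, $\pm\infty \cdot 0 = 0$, $\pm\infty \cdot a = \pm\infty$ if $a \in \mathbb{R}$ with $a > 0$, $\pm\infty \cdot a = \mp\infty$ if $a \in \mathbb{R}$ with $a < 0$, $\pm\infty + a = \pm\infty$ for any $a \in \mathbb{R}$, $\infty + \infty = \infty$ and $-\infty - \infty = -\infty$ while $\infty - \infty$ is not defined. A sequence $a_n \in \bar{\mathbb{R}}$ is said to converge to $\infty$ ($-\infty$) if for all $M \in \mathbb{R}$ there exists $m \in \mathbb{N}$ such that $a_n \geq M$ ($a_n \leq M$) for all $n \geq m$.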
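Lemma 4.2. Suppose $\{a_n\}_{n=1}^\infty$ and $\{b_n\}_{n=1}^\infty$ are convergent sequences in $\bar{\mathbb{R}}$, then:
1. If $a_n \leq b_n$ for a.a. $n$ then $\lim_{n\to\infty} a_n \leq \lim_{n\to\infty} b_n$.
2. If $c \in \mathbb{R}$, $\lim_{n\to\infty} (c a_n) = c \lim_{n\to\infty} a_n$.
3. $\{a_n + b_n\}_{n=1}^\infty$ is convergent and
\[ \lim_{n\to\infty} (a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n \tag{4.1} \]
provided the right side is not of the form $\infty - \infty$.
4. $\{a_n b_n\}_{n=1}^\infty$ is convergent and
\[ \lim_{n\to\infty} (a_n b_n) = \lim_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n \tag{4.2} \]
provided the right hand side is not of the form $\pm\infty \cdot 0$ or $0 \cdot (\pm\infty)$.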
Before going to the proof consider the simple example where $a_n = n$ and $b_n = -\alpha n$ with $\alpha > 0$. Then
\[ \lim (a_n + b_n) = \begin{cases} \infty & \text{if } \alpha < 1 \\ 0 & \text{if } \alpha = 1 \\ -\infty & \text{if } \alpha > 1 \end{cases} \]
while
\[ \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n \text{ "=" } \infty - \infty. \]
This shows that the requirement that the right side of Eq. (4.1) is not of form $\infty - \infty$ is necessary in Lemma 4.2. Similarly by considering the examples $a_n = n$ and $b_n = n^{-\alpha}$ with $\alpha > 0$ shows the necessity for assuming the right hand side of Eq. (4.2) is not of the form $\infty \cdot 0$.
Proof. The proofs of items 1. and 2. are left to the reader.
Proof of Eq. (4.1). Let $a := \lim_{n\to\infty} a_n$ and $b = \lim_{n\to\infty} b_n$. Case 1., suppose $b = \infty$ in which case we must assume $a > -\infty$. In this case, for every $M > 0$, there exists $N$ such that $b_n \geq M$ and $a_n \geq a - 1$ for all $n \geq N$ and this implies
\[ a_n + b_n \geq M + a - 1 \text{ for all } n \geq N. \]
Since $M$ is arbitrary it follows that $a_n + b_n \to \infty$ as $n \to \infty$. The cases where $b = -\infty$ or $a = \pm\infty$ are handled similarly. Case 2. If $a, b \in \mathbb{R}$, then for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that
\[ |a - a_n| \leq \varepsilon \text{ and } |b - b_n| \leq \varepsilon \text{ for all } n \geq N. \]
Therefore,
\[ |a + b - (a_n + b_n)| = |a - a_n + b - b_n| \leq |a - a_n| + |b - b_n| \leq 2\varepsilon \]
for all $n \geq N$. Since $\varepsilon > 0$ is arbitrary, it follows that $\lim_{n\to\infty} (a_n + b_n) = a + b$.
Proof of Eq. (4.2). It will be left to the reader to prove the case where $\lim a_n$ and $\lim b_n$ exist in $\mathbb{R}$. I will only consider the case where $a = \lim_{n\to\infty} a_n \neq 0$ and $\lim_{n\to\infty} b_n = \infty$ here. Let us also suppose that $a > 0$ (the case $a < 0$ is handled similarly) and let $\alpha := \min\left(\frac{a}{2}, 1\right)$. Given any $M < \infty$, there exists $N \in \mathbb{N}$ such that $a_n \geq \alpha$ and $b_n \geq M$ for all $n \geq N$ and for this choice of $N$, $a_n b_n \geq M\alpha$ for all $n \geq N$. Since $\alpha > 0$ is fixed and $M$ is arbitrary it follows that $\lim_{n\to\infty} (a_n b_n) = \infty$ as desired.
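For any subset $\Lambda \subset \bar{\mathbb{R}}$, let $\sup \Lambda$ and $\inf \Lambda$ denote the least upper bound and greatest lower bound of $\Lambda$ respectively. The convention being that $\sup \Lambda = \infty$ if $\infty \in \Lambda$ or $\Lambda$ is not bounded from above and $\inf \Lambda = -\infty$ if $-\infty \in \Lambda$ or $\Lambda$ is not bounded from below. We will also use the conventions that $\sup \emptyset = -\infty$ and $\inf \emptyset = +\infty$.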
Notation 4.3 Suppose that $\{x_n\}_{n=1}^\infty \subset \bar{\mathbb{R}}$ is a sequence of numbers. Then
\[ \liminf_{n\to\infty} x_n = \lim_{n\to\infty} \inf\{x_k : k \geq n\} \text{ and} \tag{4.3} \]
\[ \limsup_{n\to\infty} x_n = \lim_{n\to\infty} \sup\{x_k : k \geq n\}. \tag{4.4} \]
We will also write $\underline{\lim}$ for $\liminf$ and $\overline{\lim}$ for $\limsup$.
Remark 4.4. Notice that if $a_n := \inf\{x_k : k \geq n\}$ and $b_n := \sup\{x_k : k \geq n\}$, then $\{a_n\}$ is an increasing sequence while $\{b_n\}$ is a decreasing sequence. Therefore the limits in Eq. (4.3) and Eq. (4.4) always exist in $\bar{\mathbb{R}}$ and
\[ \liminf_{n\to\infty} x_n = \sup_n \inf\{x_k : k \geq n\} \text{ and} \]
\[ \limsup_{n\to\infty} x_n = \inf_n \sup\{x_k : k \geq n\}. \]
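The monotonicity noted in Remark 4.4 is easy to observe numerically. The sketch below uses the sample sequence $x_n = (-1)^n (1 + 1/n)$, which is an assumption of this example and not taken from the text. For this particular sequence the infimum and supremum of each tail are attained at the first odd, respectively even, index of the tail, so a finite window computes them exactly; for a general sequence a finite window would only give an approximation.

```python
from fractions import Fraction

def x(n):
    """Sample sequence x_n = (-1)^n (1 + 1/n), n >= 1 (an assumption of this sketch)."""
    return (-1) ** n * (1 + Fraction(1, n))

N = 200
xs = [x(n) for n in range(1, N + 1)]          # xs[n - 1] holds x_n

# tail_inf[n - 1] = inf{x_k : k >= n} and tail_sup[n - 1] = sup{x_k : k >= n} for n <= 150.
tail_inf = [min(xs[n:]) for n in range(N - 50)]
tail_sup = [max(xs[n:]) for n in range(N - 50)]

print(float(tail_inf[0]), float(tail_inf[-1]))   # increases towards liminf = -1
print(float(tail_sup[0]), float(tail_sup[-1]))   # decreases towards limsup = +1
```

Here $\liminf x_n = -1 < 1 = \limsup x_n$, so by Proposition 4.5 below the sequence has no limit.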
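The following proposition contains some basic properties of liminfs and limsups.
Proposition 4.5. Let $\{a_n\}_{n=1}^\infty$ and $\{b_n\}_{n=1}^\infty$ be two sequences of real numbers. Then
1. $\liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n$ and $\lim_{n\to\infty} a_n$ exists in $\bar{\mathbb{R}}$ iff
\[ \liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n \in \bar{\mathbb{R}}. \]
2. There is a subsequence $\{a_{n_k}\}_{k=1}^\infty$ of $\{a_n\}_{n=1}^\infty$ such that $\lim_{k\to\infty} a_{n_k} = \limsup_{n\to\infty} a_n$. Similarly, there is a subsequence $\{a_{n_k}\}_{k=1}^\infty$ of $\{a_n\}_{n=1}^\infty$ such that $\lim_{k\to\infty} a_{n_k} = \liminf_{n\to\infty} a_n$.
3.
\[ \limsup_{n\to\infty} (a_n + b_n) \leq \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n \tag{4.5} \]
whenever the right side of this equation is not of the form $\infty - \infty$.
4. If $a_n \geq 0$ and $b_n \geq 0$ for all $n \in \mathbb{N}$, then
\[ \limsup_{n\to\infty} (a_n b_n) \leq \limsup_{n\to\infty} a_n \cdot \limsup_{n\to\infty} b_n, \tag{4.6} \]
provided the right hand side of (4.6) is not of the form $0 \cdot \infty$ or $\infty \cdot 0$.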
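Proof. Item 1. will be proved here leaving the remaining items as an exercise to the reader. Since
\[ \inf\{a_k : k \geq n\} \leq \sup\{a_k : k \geq n\} \quad \forall n, \]
\[ \liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n. \]
Now suppose that $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = a \in \mathbb{R}$. Then for all $\varepsilon > 0$, there is an integer $N$ such that
\[ a - \varepsilon \leq \inf\{a_k : k \geq N\} \leq \sup\{a_k : k \geq N\} \leq a + \varepsilon, \]
i.e.
\[ a - \varepsilon \leq a_k \leq a + \varepsilon \text{ for all } k \geq N. \]
Hence by the definition of the limit, $\lim_{k\to\infty} a_k = a$. If $\liminf_{n\to\infty} a_n = \infty$, then we know for all $M \in (0, \infty)$ there is an integer $N$ such that
\[ M \leq \inf\{a_k : k \geq N\} \]
and hence $\lim_{n\to\infty} a_n = \infty$. The case where $\limsup_{n\to\infty} a_n = -\infty$ is handled similarly.
Conversely, suppose that $\lim_{n\to\infty} a_n = A \in \bar{\mathbb{R}}$ exists. If $A \in \mathbb{R}$, then for every $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that $|A - a_n| \leq \varepsilon$ for all $n \geq N(\varepsilon)$, i.e.
\[ A - \varepsilon \leq a_n \leq A + \varepsilon \text{ for all } n \geq N(\varepsilon). \]
From this we learn that
\[ A - \varepsilon \leq \liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n \leq A + \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, it follows that
\[ A \leq \liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n \leq A, \]
i.e. that $A = \liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n$. If $A = \infty$, then for all $M > 0$ there exists $N = N(M)$ such that $a_n \geq M$ for all $n \geq N$. This shows that $\liminf_{n\to\infty} a_n \geq M$ and since $M$ is arbitrary it follows that
\[ \infty \leq \liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n. \]
The proof for the case $A = -\infty$ is analogous to the $A = \infty$ case.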
4.2 Sums of positive functions
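In this and the next few sections, let $X$ and $Y$ be two sets. We will write $\alpha \subset\subset X$ to denote that $\alpha$ is a finite subset of $X$ and write $2^X_f$ for those $\alpha \subset\subset X$.
Definition 4.6. Suppose that $a : X \to [0, \infty]$ is a function and $F \subset X$ is a subset, then
\[ \sum_F a = \sum_{x \in F} a(x) := \sup\left\{ \sum_{x \in \alpha} a(x) : \alpha \subset\subset F \right\}. \]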
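Remark 4.7. Suppose that $X = \mathbb{N} = \{1, 2, 3, \dots\}$ and $a : X \to [0, \infty]$, then
\[ \sum_{\mathbb{N}} a = \sum_{n=1}^\infty a(n) := \lim_{N\to\infty} \sum_{n=1}^N a(n). \]
Indeed for all $N$, $\sum_{n=1}^N a(n) \leq \sum_{\mathbb{N}} a$, and thus passing to the limit we learn that
\[ \sum_{n=1}^\infty a(n) \leq \sum_{\mathbb{N}} a. \]
Conversely, if $\alpha \subset\subset \mathbb{N}$, then for all $N$ large enough so that $\alpha \subset \{1, 2, \dots, N\}$, we have $\sum_\alpha a \leq \sum_{n=1}^N a(n)$ which upon passing to the limit implies that
\[ \sum_\alpha a \leq \sum_{n=1}^\infty a(n). \]
Taking the supremum over $\alpha$ in the previous equation shows
\[ \sum_{\mathbb{N}} a \leq \sum_{n=1}^\infty a(n). \]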
Remark 4.8. Suppose $a : X \to [0, \infty]$ and $\sum_X a < \infty$, then $\{x \in X : a(x) > 0\}$ is at most countable. To see this first notice that for any $\varepsilon > 0$, the set $\{x : a(x) \geq \varepsilon\}$ must be finite for otherwise $\sum_X a = \infty$. Thus
\[ \{x \in X : a(x) > 0\} = \bigcup_{k=1}^\infty \{x : a(x) \geq 1/k\} \]
which shows that $\{x \in X : a(x) > 0\}$ is a countable union of finite sets and thus countable by Lemma 2.6.
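Lemma 4.9. Suppose that $a, b : X \to [0, \infty]$ are two functions, then
\[ \sum_X (a + b) = \sum_X a + \sum_X b \quad\text{and}\quad \sum_X \lambda a = \lambda \sum_X a \]
for all $\lambda \geq 0$.
I will only prove the first assertion, the second being easy. Let $\alpha \subset\subset X$ be a finite set, then
\[ \sum_\alpha (a + b) = \sum_\alpha a + \sum_\alpha b \leq \sum_X a + \sum_X b \]
which after taking sups over $\alpha$ shows that
\[ \sum_X (a + b) \leq \sum_X a + \sum_X b. \]
Similarly, if $\alpha, \beta \subset\subset X$, then
\[ \sum_\alpha a + \sum_\beta b \leq \sum_{\alpha \cup \beta} a + \sum_{\alpha \cup \beta} b = \sum_{\alpha \cup \beta} (a + b) \leq \sum_X (a + b). \]
Taking sups over $\alpha$ and $\beta$ then shows that
\[ \sum_X a + \sum_X b \leq \sum_X (a + b). \]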
Lemma 4.10. Let $X$ and $Y$ be sets, $R \subset X \times Y$ and suppose that $a : R \to \bar{\mathbb{R}}$ is a function. Let
\[ {}_xR := \{y \in Y : (x,y) \in R\} \quad\text{and}\quad R_y := \{x \in X : (x,y) \in R\}. \]
Then
\[ \sup_{(x,y) \in R} a(x,y) = \sup_{x \in X} \sup_{y \in {}_xR} a(x,y) = \sup_{y \in Y} \sup_{x \in R_y} a(x,y) \text{ and} \]
\[ \inf_{(x,y) \in R} a(x,y) = \inf_{x \in X} \inf_{y \in {}_xR} a(x,y) = \inf_{y \in Y} \inf_{x \in R_y} a(x,y). \]
(Recall the conventions: $\sup \emptyset = -\infty$ and $\inf \emptyset = +\infty$.)
Proof. Let $M = \sup_{(x,y) \in R} a(x,y)$, $N_x := \sup_{y \in {}_xR} a(x,y)$. Then $a(x,y) \leq M$ for all $(x,y) \in R$ implies $N_x = \sup_{y \in {}_xR} a(x,y) \leq M$ and therefore that
\[ \sup_{x \in X} \sup_{y \in {}_xR} a(x,y) = \sup_{x \in X} N_x \leq M. \tag{4.7} \]
Similarly for any $(x,y) \in R$,
\[ a(x,y) \leq N_x \leq \sup_{x \in X} N_x = \sup_{x \in X} \sup_{y \in {}_xR} a(x,y) \]
and therefore
\[ M = \sup_{(x,y) \in R} a(x,y) \leq \sup_{x \in X} \sup_{y \in {}_xR} a(x,y). \tag{4.8} \]
Equations (4.7) and (4.8) show that
\[ \sup_{(x,y) \in R} a(x,y) = \sup_{x \in X} \sup_{y \in {}_xR} a(x,y). \]
The assertions involving infimums are proved analogously or follow from what we have just proved applied to the function $-a$.
Fig. 4.1. The $x$ and $y$ slices of a set $R \subset X \times Y$.
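Theorem 4.11 (Monotone Convergence Theorem for Sums). Suppose that $f_n : X \to [0, \infty]$ is an increasing sequence of functions and
\[ f(x) := \lim_{n\to\infty} f_n(x) = \sup_n f_n(x). \]
Then
\[ \lim_{n\to\infty} \sum_X f_n = \sum_X f. \]
Proof. We will give two proofs.
First proof. Let
\[ 2^X_f := \{A \subset X : A \subset\subset X\}. \]
Then
\[
\begin{aligned}
\lim_{n\to\infty} \sum_X f_n &= \sup_n \sum_X f_n = \sup_n \sup_{\alpha \in 2^X_f} \sum_\alpha f_n = \sup_{\alpha \in 2^X_f} \sup_n \sum_\alpha f_n \\
&= \sup_{\alpha \in 2^X_f} \lim_{n\to\infty} \sum_\alpha f_n = \sup_{\alpha \in 2^X_f} \sum_\alpha \lim_{n\to\infty} f_n \\
&= \sup_{\alpha \in 2^X_f} \sum_\alpha f = \sum_X f.
\end{aligned}
\]
Second Proof. Let $S_n = \sum_X f_n$ and $S = \sum_X f$. Since $f_n \leq f_m \leq f$ for all $n \leq m$, it follows that
\[ S_n \leq S_m \leq S \]
which shows that $\lim_{n\to\infty} S_n$ exists and is less than or equal to $S$, i.e.
\[ A := \lim_{n\to\infty} \sum_X f_n \leq \sum_X f. \tag{4.9} \]
Noting that $\sum_\alpha f_n \leq \sum_X f_n = S_n \leq A$ for all $\alpha \subset\subset X$, we have in particular
\[ \sum_\alpha f_n \leq A \text{ for all } n \text{ and } \alpha \subset\subset X. \]
Letting $n$ tend to infinity in this equation shows that
\[ \sum_\alpha f \leq A \text{ for all } \alpha \subset\subset X \]
and then taking the sup over all $\alpha \subset\subset X$ gives
\[ \sum_X f \leq A = \lim_{n\to\infty} \sum_X f_n \tag{4.10} \]
which combined with Eq. (4.9) proves the theorem.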
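Lemma 4.12 (Fatou's Lemma for Sums). Suppose that $f_n : X \to [0, \infty]$ is a sequence of functions, then
\[ \sum_X \liminf_{n\to\infty} f_n \leq \liminf_{n\to\infty} \sum_X f_n. \]
Proof. Define $g_k := \inf_{n \geq k} f_n$ so that $g_k \uparrow \liminf_{n\to\infty} f_n$ as $k \to \infty$. Since $g_k \leq f_n$ for all $n \geq k$,
\[ \sum_X g_k \leq \sum_X f_n \text{ for all } n \geq k \]
and therefore
\[ \sum_X g_k \leq \liminf_{n\to\infty} \sum_X f_n \text{ for all } k. \]
We may now use the monotone convergence theorem to let $k \to \infty$ to find
\[ \sum_X \liminf_{n\to\infty} f_n = \sum_X \lim_{k\to\infty} g_k \overset{\text{MCT}}{=} \lim_{k\to\infty} \sum_X g_k \leq \liminf_{n\to\infty} \sum_X f_n. \]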
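The inequality in Fatou's lemma can be strict. A standard illustration (added here, not taken from the text) takes $X = \mathbb{N}$ and lets $f_n$ be the indicator function of the single point $n$; the notation $1_{\{n\}}$ is introduced only for this example.

```latex
% Illustrative example (assumption: X = \mathbb{N}, f_n = indicator of the point n).
\[
f_n(x) := 1_{\{n\}}(x) =
\begin{cases}
1 & \text{if } x = n \\
0 & \text{if } x \neq n
\end{cases}
\qquad (x \in \mathbb{N}).
\]
% For fixed x we have f_n(x) = 0 as soon as n > x, hence \liminf_n f_n(x) = 0,
% while \sum_X f_n = 1 for every n.
\[
\sum_X \liminf_{n\to\infty} f_n = 0 \;<\; 1 = \liminf_{n\to\infty} \sum_X f_n .
\]
```

The "mass" of $f_n$ escapes to infinity, which is exactly the behaviour a dominating summable function rules out in Theorem 4.19.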
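Remark 4.13. If $A = \sum_X a < \infty$, then for all $\varepsilon > 0$ there exists $\alpha_\varepsilon \subset\subset X$ such that
\[ A \geq \sum_\alpha a \geq A - \varepsilon \]
for all $\alpha \subset\subset X$ containing $\alpha_\varepsilon$ or equivalently,
\[ \left| A - \sum_\alpha a \right| \leq \varepsilon \tag{4.11} \]
for all $\alpha \subset\subset X$ containing $\alpha_\varepsilon$. Indeed, choose $\alpha_\varepsilon$ so that $\sum_{\alpha_\varepsilon} a \geq A - \varepsilon$.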
4.3 Sums of complex functions
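Definition 4.14. Suppose that $a : X \to \mathbb{C}$ is a function, we say that
\[ \sum_X a = \sum_{x \in X} a(x) \]
exists and is equal to $A \in \mathbb{C}$, if for all $\varepsilon > 0$ there is a finite subset $\alpha_\varepsilon \subset X$ such that for all $\alpha \subset\subset X$ containing $\alpha_\varepsilon$ we have
\[ \left| A - \sum_\alpha a \right| \leq \varepsilon. \]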
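The following lemma is left as an exercise to the reader.
Lemma 4.15. Suppose that $a, b : X \to \mathbb{C}$ are two functions such that $\sum_X a$ and $\sum_X b$ exist, then $\sum_X (a + \lambda b)$ exists for all $\lambda \in \mathbb{C}$ and
\[ \sum_X (a + \lambda b) = \sum_X a + \lambda \sum_X b. \]
Definition 4.16 (Summable). We call a function $a : X \to \mathbb{C}$ summable if
\[ \sum_X |a| < \infty. \]
Proposition 4.17. Let $a : X \to \mathbb{C}$ be a function, then $\sum_X a$ exists iff $\sum_X |a| < \infty$, i.e. iff $a$ is summable. Moreover if $a$ is summable, then
\[ \left| \sum_X a \right| \leq \sum_X |a|. \]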
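Proof. If $\sum_X |a| < \infty$, then $\sum_X (\operatorname{Re} a)^\pm < \infty$ and $\sum_X (\operatorname{Im} a)^\pm < \infty$ and hence by Remark 4.13 these sums exist in the sense of Definition 4.14. Therefore by Lemma 4.15, $\sum_X a$ exists and
\[ \sum_X a = \sum_X (\operatorname{Re} a)^+ - \sum_X (\operatorname{Re} a)^- + i \left( \sum_X (\operatorname{Im} a)^+ - \sum_X (\operatorname{Im} a)^- \right). \]
Conversely, if $\sum_X |a| = \infty$ then, because $|a| \leq |\operatorname{Re} a| + |\operatorname{Im} a|$, we must have
\[ \sum_X |\operatorname{Re} a| = \infty \quad\text{or}\quad \sum_X |\operatorname{Im} a| = \infty. \]
Thus it suffices to consider the case where $a : X \to \mathbb{R}$ is a real function. Write $a = a^+ - a^-$ where
\[ a^+(x) = \max(a(x), 0) \quad\text{and}\quad a^-(x) = \max(-a(x), 0). \tag{4.12} \]
Then $|a| = a^+ + a^-$ and
\[ \infty = \sum_X |a| = \sum_X a^+ + \sum_X a^- \]
which shows that either $\sum_X a^+ = \infty$ or $\sum_X a^- = \infty$. Suppose, without loss of generality, that $\sum_X a^+ = \infty$. Let $X' := \{x \in X : a(x) \geq 0\}$, then we know that $\sum_{X'} a = \infty$ which means there are finite subsets $\alpha_n \subset X' \subset X$ such that $\sum_{\alpha_n} a \geq n$ for all $n$. Thus if $\alpha \subset\subset X$ is any finite set, it follows that $\lim_{n\to\infty} \sum_{\alpha_n \cup \alpha} a = \infty$, and therefore $\sum_X a$ can not exist as a number in $\mathbb{R}$.
Finally if $a$ is summable, write $\sum_X a = \rho e^{i\theta}$ with $\rho \geq 0$ and $\theta \in \mathbb{R}$, then
\[
\begin{aligned}
\left| \sum_X a \right| = \rho = e^{-i\theta} \sum_X a &= \sum_X e^{-i\theta} a = \sum_X \operatorname{Re}\left[ e^{-i\theta} a \right] \\
&\leq \sum_X \left( \operatorname{Re}\left[ e^{-i\theta} a \right] \right)^+ \leq \sum_X \left| \operatorname{Re}\left[ e^{-i\theta} a \right] \right| \leq \sum_X \left| e^{-i\theta} a \right| \leq \sum_X |a|.
\end{aligned}
\]
Alternatively, this may be proved by approximating $\sum_X a$ by a finite sum and then using the triangle inequality of $|\cdot|$.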
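Remark 4.18. Suppose that $X = \mathbb{N}$ and $a : \mathbb{N} \to \mathbb{C}$ is a sequence, then it is not necessarily true that
\[ \sum_{n=1}^\infty a(n) = \sum_{n \in \mathbb{N}} a(n). \tag{4.13} \]
This is because
\[ \sum_{n=1}^\infty a(n) = \lim_{N\to\infty} \sum_{n=1}^N a(n) \]
depends on the ordering of the sequence $a$ whereas $\sum_{n \in \mathbb{N}} a(n)$ does not. For example, take $a(n) = (-1)^n / n$ then $\sum_{n \in \mathbb{N}} |a(n)| = \infty$, i.e. $\sum_{n \in \mathbb{N}} a(n)$ does not exist while $\sum_{n=1}^\infty a(n)$ does exist. On the other hand, if
\[ \sum_{n \in \mathbb{N}} |a(n)| = \sum_{n=1}^\infty |a(n)| < \infty \]
then Eq. (4.13) is valid.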
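The order dependence mentioned in this remark can be seen numerically for $a(n) = (-1)^n/n$. The sketch below compares the natural ordering with one particular rearrangement (one odd-index term followed by two even-index terms); the rearrangement and the function names are choices made for this illustration. The limits $-\ln 2$ and $-\tfrac{1}{2}\ln 2$ used for comparison are the classical values for these two orderings.

```python
from math import log

def a(n):
    """The sequence from Remark 4.18."""
    return (-1) ** n / n

def natural_order_sum(N):
    """Partial sum a(1) + a(2) + ... + a(N)."""
    return sum(a(n) for n in range(1, N + 1))

def rearranged_sum(K):
    """K blocks of the form a(2k - 1) + a(4k - 2) + a(4k); every index is used exactly once."""
    total = 0.0
    for k in range(1, K + 1):
        total += a(2 * k - 1) + a(4 * k - 2) + a(4 * k)
    return total

s_natural = natural_order_sum(200_000)
s_rearranged = rearranged_sum(200_000)
print(s_natural, -log(2))
print(s_rearranged, -log(2) / 2)
```

Both computations add up the same terms, only in a different order, yet the partial sums approach different values; this cannot happen for a summable $a$.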
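Theorem 4.19 (Dominated Convergence Theorem for Sums). Suppose that $f_n : X \to \mathbb{C}$ is a sequence of functions on $X$ such that $f(x) = \lim_{n\to\infty} f_n(x) \in \mathbb{C}$ exists for all $x \in X$. Further assume there is a dominating function $g : X \to [0, \infty)$ such that
\[ |f_n(x)| \leq g(x) \text{ for all } x \in X \text{ and } n \in \mathbb{N} \tag{4.14} \]
and that $g$ is summable. Then
\[ \lim_{n\to\infty} \sum_{x \in X} f_n(x) = \sum_{x \in X} f(x). \tag{4.15} \]
Proof. Notice that $|f| = \lim |f_n| \leq g$ so that $f$ is summable. By considering the real and imaginary parts of $f$ separately, it suffices to prove the theorem in the case where $f$ is real. By Fatou's Lemma,
\[ \sum_X (g \pm f) = \sum_X \liminf_{n\to\infty} (g \pm f_n) \leq \liminf_{n\to\infty} \sum_X (g \pm f_n) = \sum_X g + \liminf_{n\to\infty} \left( \pm \sum_X f_n \right). \]
Since $\liminf_{n\to\infty} (-a_n) = -\limsup_{n\to\infty} a_n$, we have shown,
\[ \sum_X g \pm \sum_X f \leq \sum_X g + \begin{cases} \liminf_{n\to\infty} \sum_X f_n \\ -\limsup_{n\to\infty} \sum_X f_n \end{cases} \]
and therefore
\[ \limsup_{n\to\infty} \sum_X f_n \leq \sum_X f \leq \liminf_{n\to\infty} \sum_X f_n. \]
This shows that $\lim_{n\to\infty} \sum_X f_n$ exists and is equal to $\sum_X f$.
Proof. (Second Proof.) Passing to the limit in Eq. (4.14) shows that $|f| \leq g$ and in particular that $f$ is summable. Given $\varepsilon > 0$, let $\alpha \subset\subset X$ such that
\[ \sum_{X \setminus \alpha} g \leq \varepsilon. \]
Then for $\beta \subset\subset X$ such that $\alpha \subset \beta$,
\[
\begin{aligned}
\left| \sum_\beta f - \sum_\beta f_n \right| = \left| \sum_\beta (f - f_n) \right| &\leq \sum_\beta |f - f_n| = \sum_\alpha |f - f_n| + \sum_{\beta \setminus \alpha} |f - f_n| \\
&\leq \sum_\alpha |f - f_n| + 2 \sum_{\beta \setminus \alpha} g \leq \sum_\alpha |f - f_n| + 2\varepsilon
\end{aligned}
\]
and hence that
\[ \left| \sum_\beta f - \sum_\beta f_n \right| \leq \sum_\alpha |f - f_n| + 2\varepsilon. \]
Since this last equation is true for all such $\beta \subset\subset X$, we learn that
\[ \left| \sum_X f - \sum_X f_n \right| \leq \sum_\alpha |f - f_n| + 2\varepsilon \]
which then implies that
\[ \limsup_{n\to\infty} \left| \sum_X f - \sum_X f_n \right| \leq \limsup_{n\to\infty} \sum_\alpha |f - f_n| + 2\varepsilon = 2\varepsilon. \]
Because $\varepsilon > 0$ is arbitrary we conclude that
\[ \limsup_{n\to\infty} \left| \sum_X f - \sum_X f_n \right| = 0, \]
which is the same as Eq. (4.15).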
Remark 4.20. Theorem 4.19 may easily be generalized as follows. Suppose $f_n, g_n, g$ are summable functions on $X$ such that $f_n \to f$ and $g_n \to g$ pointwise, $|f_n| \leq g_n$ and $\sum_X g_n \to \sum_X g$ as $n \to \infty$. Then $f$ is summable and Eq. (4.15) still holds. For the proof we use Fatou's Lemma to again conclude
\[ \sum_X (g \pm f) = \sum_X \liminf_{n\to\infty} (g_n \pm f_n) \leq \liminf_{n\to\infty} \sum_X (g_n \pm f_n) = \sum_X g + \liminf_{n\to\infty} \left( \pm \sum_X f_n \right) \]
and then proceed exactly as in the first proof of Theorem 4.19.
4.4 Iterated sums and the Fubini and Tonelli Theorems
Let $X$ and $Y$ be two sets. The proof of the following lemma is left to the reader.
Lemma 4.21. Suppose that $a : X \to \mathbb{C}$ is a function and $F \subset X$ is a subset such that $a(x) = 0$ for all $x \notin F$. Then $\sum_F a$ exists iff $\sum_X a$ exists and when the sums exist,
\[ \sum_X a = \sum_F a. \]
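Theorem 4.22 (Tonelli's Theorem for Sums). Suppose that $a : X \times Y \to [0, \infty]$, then
\[ \sum_{X \times Y} a = \sum_X \sum_Y a = \sum_Y \sum_X a. \]
Proof. It suffices to show, by symmetry, that
\[ \sum_{X \times Y} a = \sum_X \sum_Y a. \]
Let $\Lambda \subset\subset X \times Y$. Then for any $\alpha \subset\subset X$ and $\beta \subset\subset Y$ such that $\Lambda \subset \alpha \times \beta$, we have
\[ \sum_\Lambda a \leq \sum_{\alpha \times \beta} a = \sum_\alpha \sum_\beta a \leq \sum_\alpha \sum_Y a \leq \sum_X \sum_Y a, \]
i.e. $\sum_\Lambda a \leq \sum_X \sum_Y a$. Taking the sup over $\Lambda$ in this last equation shows
\[ \sum_{X \times Y} a \leq \sum_X \sum_Y a. \]
For the reverse inequality, for each $x \in X$ choose $\beta_n^x \subset\subset Y$, increasing in $n$, such that
\[ \sum_{y \in Y} a(x,y) = \lim_{n\to\infty} \sum_{y \in \beta_n^x} a(x,y). \]
If $\alpha \subset\subset X$ is a given finite subset of $X$, then
\[ \sum_{y \in Y} a(x,y) = \lim_{n\to\infty} \sum_{y \in \beta_n} a(x,y) \text{ for all } x \in \alpha \]
where $\beta_n := \cup_{x \in \alpha} \beta_n^x \subset\subset Y$. Hence
\[
\begin{aligned}
\sum_{x \in \alpha} \sum_{y \in Y} a(x,y) &= \sum_{x \in \alpha} \lim_{n\to\infty} \sum_{y \in \beta_n} a(x,y) = \lim_{n\to\infty} \sum_{x \in \alpha} \sum_{y \in \beta_n} a(x,y) \\
&= \lim_{n\to\infty} \sum_{(x,y) \in \alpha \times \beta_n} a(x,y) \leq \sum_{X \times Y} a.
\end{aligned}
\]
Since $\alpha$ is arbitrary, it follows that
\[ \sum_{x \in X} \sum_{y \in Y} a(x,y) = \sup_{\alpha \subset\subset X} \sum_{x \in \alpha} \sum_{y \in Y} a(x,y) \leq \sum_{X \times Y} a \]
which completes the proof.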
Theorem 4.23 (Fubini's Theorem for Sums). Now suppose that $a : X \times Y \to \mathbb{C}$ is a summable function, i.e. by Theorem 4.22 any one of the following equivalent conditions hold:
1. $\sum_{X \times Y} |a| < \infty$,
2. $\sum_X \sum_Y |a| < \infty$ or
3. $\sum_Y \sum_X |a| < \infty$.
Then
\[ \sum_{X \times Y} a = \sum_X \sum_Y a = \sum_Y \sum_X a. \]
Proof. If $a : X \times Y \to \mathbb{R}$ is real valued the theorem follows by applying Theorem 4.22 to $a^\pm$, the positive and negative parts of $a$. The general result holds for complex valued functions $a$ by applying the real version just proved to the real and imaginary parts of $a$.
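The summability hypothesis in Fubini's theorem cannot simply be dropped. The sketch below uses a standard example on $X = Y = \mathbb{N}$ (chosen for this illustration, not taken from the text): $a(m,n) = 1$ if $n = m$, $a(m,n) = -1$ if $n = m + 1$, and $a(m,n) = 0$ otherwise. Each row and each column has at most two nonzero entries, so the inner sums below are exact finite sums.

```python
def a(m, n):
    """Hypothetical example: +1 on the diagonal, -1 just to the right of it, 0 elsewhere."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

def row_sum(m):
    """sum over n of a(m, n); the only nonzero entries are at n = m and n = m + 1."""
    return sum(a(m, n) for n in range(1, m + 2))

def col_sum(n):
    """sum over m of a(m, n); the only nonzero entries are at m = n - 1 and m = n."""
    return sum(a(m, n) for m in range(1, n + 1))

K = 50
rows_first = sum(row_sum(m) for m in range(1, K + 1))   # sum_m sum_n a(m, n), truncated at K
cols_first = sum(col_sum(n) for n in range(1, K + 1))   # sum_n sum_m a(m, n), truncated at K
print(rows_first, cols_first)
```

The two iterated sums stay at 0 and 1 for every truncation level `K`, so they have different limits; here $\sum_{X \times Y} |a| = \infty$, so there is no conflict with Theorem 4.23.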
4.5 Exercises
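Exercise 4.1. Now suppose for each $n \in \mathbb{N} := \{1, 2, \dots\}$ that $f_n : X \to \mathbb{R}$ is a function. Let
\[ D := \{x \in X : \lim_{n\to\infty} f_n(x) = +\infty\} \]
show that
\[ D = \bigcap_{M=1}^\infty \bigcup_{N=1}^\infty \bigcap_{n \geq N} \{x \in X : f_n(x) \geq M\}. \tag{4.16} \]
Exercise 4.2. Let $f_n : X \to \mathbb{R}$ be as in the last problem. Let
\[ C := \{x \in X : \lim_{n\to\infty} f_n(x) \text{ exists in } \mathbb{R}\}. \]
Find an expression for $C$ similar to the expression for $D$ in (4.16). (Hint: use the Cauchy criteria for convergence.)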
4.5.1 Limit Problems
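Exercise 4.3. Show $\liminf_{n\to\infty} (-a_n) = -\limsup_{n\to\infty} a_n$.
Exercise 4.4. Suppose that $\limsup_{n\to\infty} a_n = M \in \bar{\mathbb{R}}$, show that there is a subsequence $\{a_{n_k}\}_{k=1}^\infty$ of $\{a_n\}_{n=1}^\infty$ such that $\lim_{k\to\infty} a_{n_k} = M$.
Exercise 4.5. Show that
\[ \limsup_{n\to\infty} (a_n + b_n) \leq \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n \tag{4.17} \]
provided that the right side of Eq. (4.17) is well defined, i.e. no $\infty - \infty$ or $-\infty + \infty$ type expressions. (It is OK to have $\infty + \infty = \infty$ or $-\infty - \infty = -\infty$, etc.)
Exercise 4.6. Suppose that $a_n \geq 0$ and $b_n \geq 0$ for all $n \in \mathbb{N}$. Show
\[ \limsup_{n\to\infty} (a_n b_n) \leq \limsup_{n\to\infty} a_n \cdot \limsup_{n\to\infty} b_n, \tag{4.18} \]
provided the right hand side of (4.18) is not of the form $0 \cdot \infty$ or $\infty \cdot 0$.
Exercise 4.7. Prove Lemma 4.15.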
4.5 Exercises 33
4.5.2 Dominated Convergence Theorem Problems
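Notation 4.24 For $u_0 \in \mathbb{R}^n$ and $\delta > 0$, let $B_{u_0}(\delta) := \{x \in \mathbb{R}^n : |x - u_0| < \delta\}$ be the ball in $\mathbb{R}^n$ centered at $u_0$ with radius $\delta$.
Exercise 4.9. Suppose $U \subset \mathbb{R}^n$ is a set and $u_0 \in U$ is a point such that $U \cap (B_{u_0}(\delta) \setminus \{u_0\}) \ne \emptyset$ for all $\delta > 0$. Let $G : U \setminus \{u_0\} \to \mathbb{C}$ be a function on $U \setminus \{u_0\}$. Show that $\lim_{u\to u_0} G(u)$ exists and is equal to $\lambda \in \mathbb{C}$,$^1$ iff for all sequences $\{u_n\}_{n=1}^{\infty} \subset U \setminus \{u_0\}$ which converge to $u_0$ (i.e. $\lim_{n\to\infty} u_n = u_0$) we have $\lim_{n\to\infty} G(u_n) = \lambda$.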
Exercise 4.10. Suppose that $Y$ is a set, $U \subset \mathbb{R}^n$ is a set, and $f : U \times Y \to \mathbb{C}$ is a function satisfying:
1. For each $y \in Y$, the function $u \in U \to f(u,y)$ is continuous on $U$.$^2$
2. There is a summable function $g : Y \to [0,\infty)$ such that
\[ |f(u,y)| \le g(y) \text{ for all } y \in Y \text{ and } u \in U. \]
Show that
\[ F(u) := \sum_{y\in Y} f(u,y) \tag{4.19} \]
is a continuous function for $u \in U$.
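Exercise 4.11. Suppose that $Y$ is a set, $J = (a,b) \subset \mathbb{R}$ is an interval, and $f : J \times Y \to \mathbb{C}$ is a function satisfying:
1. For each $y \in Y$, the function $u \to f(u,y)$ is differentiable on $J$,
2. There is a summable function $g : Y \to [0,\infty)$ such that
\[ \left| \frac{\partial}{\partial u} f(u,y) \right| \le g(y) \text{ for all } y \in Y \text{ and } u \in J. \]
3. There is a $u_0 \in J$ such that $\sum_{y\in Y} |f(u_0,y)| < \infty$.
Show:
a) for all $u \in J$ that $\sum_{y\in Y} |f(u,y)| < \infty$.
b) Let $F(u) := \sum_{y\in Y} f(u,y)$, show $F$ is differentiable on $J$ and that
\[ \dot{F}(u) = \sum_{y\in Y} \frac{\partial}{\partial u} f(u,y). \]
(Hint: Use the mean value theorem.)
$^1$ More explicitly, $\lim_{u\to u_0} G(u) = \lambda$ means for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $|G(u) - \lambda| < \varepsilon$ whenever $u \in U \cap (B_{u_0}(\delta) \setminus \{u_0\})$.
$^2$ To say $g := f(\cdot, y)$ is continuous on $U$ means that $g : U \to \mathbb{C}$ is continuous relative to the metric on $\mathbb{R}^n$ restricted to $U$.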
Exercise 4.12 (Differentiation of Power Series). Suppose $R > 0$ and $\{a_n\}_{n=0}^{\infty}$ is a sequence of complex numbers such that $\sum_{n=0}^{\infty} |a_n| r^n < \infty$ for all $r \in (0,R)$. Show, using Exercise 4.11, $f(x) := \sum_{n=0}^{\infty} a_n x^n$ is continuously differentiable for $x \in (-R,R)$ and
\[ f'(x) = \sum_{n=0}^{\infty} n a_n x^{n-1} = \sum_{n=1}^{\infty} n a_n x^{n-1}. \]
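Exercise 4.13. Show the functions
\[ e^x := \sum_{n=0}^{\infty} \frac{x^n}{n!}, \tag{4.20} \]
\[ \sin x := \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} \text{ and} \tag{4.21} \]
\[ \cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} \tag{4.22} \]
are infinitely differentiable and they satisfy
\[ \frac{d}{dx} e^x = e^x \text{ with } e^0 = 1, \]
\[ \frac{d}{dx} \sin x = \cos x \text{ with } \sin(0) = 0, \]
\[ \frac{d}{dx} \cos x = -\sin x \text{ with } \cos(0) = 1. \]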
Exercise 4.14. Continue the notation of Exercise 4.13.
1. Use the product and the chain rule to show,
\[ \frac{d}{dx}\left[ e^{-x} e^{(x+y)} \right] = 0 \]
and conclude from this, that $e^{-x} e^{(x+y)} = e^{y}$ for all $x, y \in \mathbb{R}$. In particular taking $y = 0$ this implies that $e^{-x} = 1/e^{x}$ and hence that $e^{(x+y)} = e^{x} e^{y}$. Use this result to show $e^{x} \uparrow \infty$ as $x \uparrow \infty$ and $e^{x} \downarrow 0$ as $x \downarrow -\infty$.
Remark: since $e^{x} \ge \sum_{n=0}^{N} \frac{x^n}{n!}$ when $x \ge 0$, it follows that $\lim_{x\to\infty} \frac{x^n}{e^x} = 0$ for any $n \in \mathbb{N}$, i.e. $e^x$ grows at a rate faster than any polynomial in $x$ as $x \to \infty$.
2. Use the product rule to show
\[ \frac{d}{dx}\left( \cos^2 x + \sin^2 x \right) = 0 \]
and use this to conclude that $\cos^2 x + \sin^2 x = 1$ for all $x \in \mathbb{R}$.
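Exercise 4.15. Let $\{a_n\}_{n=-\infty}^{\infty}$ be a summable sequence of complex numbers, i.e. $\sum_{n=-\infty}^{\infty} |a_n| < \infty$. For $t \ge 0$ and $x \in \mathbb{R}$, define
\[ F(t,x) = \sum_{n=-\infty}^{\infty} a_n e^{-tn^2} e^{inx}, \]
where as usual $e^{ix} = \cos(x) + i\sin(x)$; this is motivated by replacing $x$ in Eq. (4.20) by $ix$ and comparing the result to Eqs. (4.21) and (4.22). Show:
1. $F(t,x)$ is continuous for $(t,x) \in [0,\infty)\times\mathbb{R}$. Hint: Let $Y = \mathbb{Z}$ and $u = (t,x)$ and use Exercise 4.10.
2. $\partial F(t,x)/\partial t$, $\partial F(t,x)/\partial x$ and $\partial^2 F(t,x)/\partial x^2$ exist for $t > 0$ and $x \in \mathbb{R}$. Hint: Let $Y = \mathbb{Z}$ and $u = t$ for computing $\partial F(t,x)/\partial t$ and $u = x$ for computing $\partial F(t,x)/\partial x$ and $\partial^2 F(t,x)/\partial x^2$ via Exercise 4.11. In computing the $t$ derivative, you should let $\varepsilon > 0$ and apply Exercise 4.11 with $t = u > \varepsilon$ and then afterwards let $\varepsilon \downarrow 0$.
3. $F$ satisfies the heat equation, namely
\[ \partial F(t,x)/\partial t = \partial^2 F(t,x)/\partial x^2 \text{ for } t > 0 \text{ and } x \in \mathbb{R}. \]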
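5
$\ell^p$-spaces, Minkowski and Hölder Inequalities
In this chapter, let $\mu : X \to (0,\infty)$ be a given function. Let $\mathbb{F}$ denote either $\mathbb{R}$ or $\mathbb{C}$. For $p \in (0,\infty)$ and $f : X \to \mathbb{F}$, let
\[ \|f\|_p := \left( \sum_{x\in X} |f(x)|^p \mu(x) \right)^{1/p} \]
and for $p = \infty$ let
\[ \|f\|_\infty = \sup\{ |f(x)| : x \in X \}. \]
Also, for $p > 0$, let
\[ \ell^p(\mu) = \{ f : X \to \mathbb{F} : \|f\|_p < \infty \}. \]
In the case where $\mu(x) = 1$ for all $x \in X$ we will simply write $\ell^p(X)$ for $\ell^p(\mu)$.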
Definition 5.1. A norm on a vector space $Z$ is a function $\|\cdot\| : Z \to [0,\infty)$ such that
1. (Homogeneity) $\|\lambda f\| = |\lambda| \|f\|$ for all $\lambda \in \mathbb{F}$ and $f \in Z$.
2. (Triangle inequality) $\|f + g\| \le \|f\| + \|g\|$ for all $f, g \in Z$.
3. (Positive definite) $\|f\| = 0$ implies $f = 0$.
A function $p : Z \to [0,\infty)$ satisfying properties 1. and 2. but not necessarily 3. above will be called a semi-norm on $Z$.
A pair $(Z, \|\cdot\|)$ where $Z$ is a vector space and $\|\cdot\|$ is a norm on $Z$ is called a normed vector space.
The rest of this section is devoted to the proof of the following theorem.
Theorem 5.2. For $p \in [1,\infty]$, $(\ell^p(\mu), \|\cdot\|_p)$ is a normed vector space.
Proof. The only difficulty is the proof of the triangle inequality, which is the content of Minkowski's Inequality proved in Theorem 5.8 below.
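For a finite set $X$ the weighted norm is a finite sum and is easy to experiment with. The following Python sketch assumes a three-point set $X$ with weights and functions chosen only for illustration; the helper name `lp_norm` is hypothetical. It checks the triangle inequality for several $p \ge 1$ and shows that it can fail for $p = 1/2$, which is why Theorem 5.2 is restricted to $p \in [1,\infty]$.

```python
import math


def lp_norm(f, mu, p):
    """Weighted l^p norm of f over the finite index set mu.keys().

    f and mu are dicts keyed by the points of X, with mu[x] > 0.
    p is a positive number or math.inf.
    """
    if p == math.inf:
        return max(abs(f[x]) for x in mu)
    return sum(abs(f[x]) ** p * mu[x] for x in mu) ** (1.0 / p)


# Illustrative data (not from the text).
mu = {"a": 1.0, "b": 2.0, "c": 0.5}
f = {"a": 3.0, "b": -1.0, "c": 4.0}
g = {"a": -2.0, "b": 5.0, "c": 1.0}
h = {x: f[x] + g[x] for x in mu}

exponents = [1, 1.5, 2, 3, math.inf]
triangle_holds = all(
    lp_norm(h, mu, p) <= lp_norm(f, mu, p) + lp_norm(g, mu, p) + 1e-12
    for p in exponents
)

# For p = 1/2 and counting measure on two points the triangle inequality fails.
one = {0: 1.0, 1: 1.0}
e0 = {0: 1.0, 1: 0.0}
e1 = {0: 0.0, 1: 1.0}
lhs = lp_norm({0: 1.0, 1: 1.0}, one, 0.5)          # norm of e0 + e1
rhs = lp_norm(e0, one, 0.5) + lp_norm(e1, one, 0.5)

print(triangle_holds, lhs, rhs)
```

Here `lhs` is $(1+1)^2 = 4$ while `rhs` is $1 + 1 = 2$, so $\|\cdot\|_{1/2}$ is not a norm.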
Proposition 5.3. Let $f : [0,\infty) \to [0,\infty)$ be a continuous strictly increasing function such that $f(0) = 0$ (for simplicity) and $\lim_{s\to\infty} f(s) = \infty$. Let $g = f^{-1}$ and for $s, t \ge 0$ let
\[ F(s) = \int_0^s f(s')\,ds' \text{ and } G(t) = \int_0^t g(t')\,dt'. \]
Then for all $s, t \ge 0$,
\[ st \le F(s) + G(t) \]
and equality holds iff $t = f(s)$.
Proof. Let
\[ A_s := \{(\sigma,\tau) : 0 \le \tau \le f(\sigma) \text{ for } 0 \le \sigma \le s\} \text{ and} \]
\[ B_t := \{(\sigma,\tau) : 0 \le \sigma \le g(\tau) \text{ for } 0 \le \tau \le t\}, \]
then as one sees from Figure 5.1, $[0,s]\times[0,t] \subset A_s \cup B_t$. (In the figure: $s = 3$, $t = 1$, $A_3$ is the region under $t = f(s)$ for $0 \le s \le 3$ and $B_1$ is the region to the left of the curve $s = g(t)$ for $0 \le t \le 1$.) Hence if $m$ denotes the area of a region in the plane, then
\[ st = m([0,s]\times[0,t]) \le m(A_s) + m(B_t) = F(s) + G(t). \]
As it stands, this proof is a bit on the intuitive side. However, it will become rigorous if one takes $m$ to be Lebesgue measure on the plane which will be introduced later.
We can also give a calculus proof of this theorem under the additional assumption that $f$ is $C^1$. (This restricted version of the theorem is all we need in this section.) To do this fix $t \ge 0$ and let
\[ h(s) = st - F(s) = \int_0^s (t - f(\sigma))\,d\sigma. \]
If $\sigma > g(t) = f^{-1}(t)$, then $t - f(\sigma) < 0$ and hence if $s > g(t)$, we have
\[ h(s) = \int_0^s (t - f(\sigma))\,d\sigma = \int_0^{g(t)} (t - f(\sigma))\,d\sigma + \int_{g(t)}^s (t - f(\sigma))\,d\sigma \le \int_0^{g(t)} (t - f(\sigma))\,d\sigma = h(g(t)). \]
Combining this with $h(0) = 0$ we see that $h(s)$ takes its maximum at some point $s \in (0, g(t)]$ and hence at a point where $0 = h'(s) = t - f(s)$. The only solution to this equation is $s = g(t)$ and we have thus shown
\[ st - F(s) = h(s) \le \int_0^{g(t)} (t - f(\sigma))\,d\sigma = h(g(t)) \]
with equality when $s = g(t)$. To finish the proof we must show $\int_0^{g(t)} (t - f(\sigma))\,d\sigma = G(t)$. This is verified by making the change of variables $\sigma = g(\tau)$ and then integrating by parts as follows:
\[ \int_0^{g(t)} (t - f(\sigma))\,d\sigma = \int_0^t (t - f(g(\tau)))\,g'(\tau)\,d\tau = \int_0^t (t - \tau)\,g'(\tau)\,d\tau = \int_0^t g(\tau)\,d\tau = G(t). \]
Fig. 5.1. A picture proof of Proposition 5.3.
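Definition 5.4. The conjugate exponent $q \in [1,\infty]$ to $p \in [1,\infty]$ is $q := \frac{p}{p-1}$ with the conventions that $q = \infty$ if $p = 1$ and $q = 1$ if $p = \infty$. Notice that $q$ is characterized by any of the following identities:
\[ \frac{1}{p} + \frac{1}{q} = 1, \quad 1 + \frac{q}{p} = q, \quad p - \frac{p}{q} = 1 \quad\text{and}\quad q(p-1) = p. \tag{5.1} \]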
Lemma 5.5. Let $p \in (1,\infty)$ and $q := \frac{p}{p-1} \in (1,\infty)$ be the conjugate exponent. Then
\[ st \le \frac{s^p}{p} + \frac{t^q}{q} \text{ for all } s, t \ge 0 \]
with equality if and only if $t^q = s^p$.
Proof. Let $F(s) = \frac{s^p}{p}$ for $p > 1$. Then $f(s) = s^{p-1} = t$ and $g(t) = t^{\frac{1}{p-1}} = t^{q-1}$, wherein we have used $q - 1 = p/(p-1) - 1 = 1/(p-1)$. Therefore $G(t) = t^q/q$ and hence by Proposition 5.3,
\[ st \le \frac{s^p}{p} + \frac{t^q}{q} \]
with equality iff $t = s^{p-1}$, i.e. $t^q = s^{q(p-1)} = s^p$. For those who do not want to use Proposition 5.3, here is a direct calculus proof. Fix $t > 0$ and let
\[ h(s) := st - \frac{s^p}{p}. \]
Then $h(0) = 0$, $\lim_{s\to\infty} h(s) = -\infty$ and $h'(s) = t - s^{p-1}$ which equals zero iff $s = t^{\frac{1}{p-1}}$. Since
\[ h\left(t^{\frac{1}{p-1}}\right) = t^{\frac{1}{p-1}}\, t - \frac{t^{\frac{p}{p-1}}}{p} = t^{\frac{p}{p-1}} - \frac{t^{\frac{p}{p-1}}}{p} = t^q\left(1 - \frac{1}{p}\right) = \frac{t^q}{q}, \]
it follows from the first derivative test that
\[ \max h = \max\left\{ h(0),\, h\left(t^{\frac{1}{p-1}}\right) \right\} = \max\left\{ 0, \frac{t^q}{q} \right\} = \frac{t^q}{q}. \]
So we have shown
\[ st - \frac{s^p}{p} \le \frac{t^q}{q} \text{ with equality iff } t = s^{p-1}. \]
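As a numerical sanity check of Lemma 5.5, the following Python sketch evaluates the gap $\frac{s^p}{p} + \frac{t^q}{q} - st$ on a grid and at a point of the equality curve $t = s^{p-1}$. The exponent $p = 3$, the grid and the helper name `young_gap` are illustrative choices, not part of the text.

```python
def young_gap(s, t, p):
    """Return s**p/p + t**q/q - s*t, where q = p/(p-1) is the conjugate exponent."""
    q = p / (p - 1.0)
    return s ** p / p + t ** q / q - s * t


p = 3.0
grid = [0.25 * k for k in range(21)]  # s, t in {0, 0.25, ..., 5}
worst = min(young_gap(s, t, p) for s in grid for t in grid)

# On the curve t = s**(p-1) the inequality should be an equality.
s = 1.5
eq_gap = young_gap(s, s ** (p - 1.0), p)

print(worst, eq_gap)
```

The smallest gap on the grid is nonnegative up to rounding, and the gap vanishes (up to rounding) when $t = s^{p-1}$, in agreement with the equality statement of the lemma.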
Theorem 5.6 (Hölder's inequality). Let $p, q \in [1,\infty]$ be conjugate exponents. For all $f, g : X \to \mathbb{F}$,
\[ \|fg\|_1 \le \|f\|_p \cdot \|g\|_q. \tag{5.2} \]
If $p \in (1,\infty)$ and $f$ and $g$ are not identically zero, then equality holds in Eq. (5.2) iff
\[ \left( \frac{|f|}{\|f\|_p} \right)^p = \left( \frac{|g|}{\|g\|_q} \right)^q. \tag{5.3} \]
Proof. The proof of Eq. (5.2) for $p \in \{1, \infty\}$ is easy and will be left to the reader. The cases where $\|f\|_p = 0$ or $\infty$ or $\|g\|_q = 0$ or $\infty$ are easily dealt with and are also left to the reader. So we will assume that $p \in (1,\infty)$ and $0 < \|f\|_p, \|g\|_q < \infty$. Letting $s = |f(x)|/\|f\|_p$ and $t = |g(x)|/\|g\|_q$ in Lemma 5.5 implies
\[ \frac{|f(x) g(x)|}{\|f\|_p \|g\|_q} \le \frac{1}{p} \frac{|f(x)|^p}{\|f\|_p^p} + \frac{1}{q} \frac{|g(x)|^q}{\|g\|_q^q} \]
with equality iff
\[ \frac{|f(x)|^p}{\|f\|_p^p} = s^p = t^q = \frac{|g(x)|^q}{\|g\|_q^q}. \tag{5.4} \]
Multiplying this equation by $\mu(x)$ and then summing on $x$ gives
\[ \frac{\|fg\|_1}{\|f\|_p \|g\|_q} \le \frac{1}{p} + \frac{1}{q} = 1 \]
with equality iff Eq. (5.4) holds for all $x \in X$, i.e. iff Eq. (5.3) holds.
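Definition 5.7. For a complex number $\lambda \in \mathbb{C}$, let
\[ \operatorname{sgn}(\lambda) = \begin{cases} \frac{\lambda}{|\lambda|} & \text{if } \lambda \ne 0 \\ 0 & \text{if } \lambda = 0. \end{cases} \]
For $\lambda, \mu \in \mathbb{C}$ we will write $\operatorname{sgn}(\lambda) \circeq \operatorname{sgn}(\mu)$ if $\operatorname{sgn}(\lambda) = \operatorname{sgn}(\mu)$ or $\lambda\mu = 0$.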
Theorem 5.8 (Minkowski's Inequality). If $1 \le p \le \infty$ and $f, g \in \ell^p(\mu)$ then
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{5.5} \]
Moreover, assuming $f$ and $g$ are not identically zero, equality holds in Eq. (5.5) iff
$\operatorname{sgn}(f) \circeq \operatorname{sgn}(g)$ when $p = 1$ and
$f = cg$ for some $c > 0$ when $p \in (1,\infty)$.
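Proof. For $p = 1$,
\[ \|f + g\|_1 = \sum_X |f + g|\mu \le \sum_X (|f|\mu + |g|\mu) = \sum_X |f|\mu + \sum_X |g|\mu \]
with equality iff
\[ |f| + |g| = |f + g| \iff \operatorname{sgn}(f) \circeq \operatorname{sgn}(g). \]
For $p = \infty$,
\[ \|f + g\|_\infty = \sup_X |f + g| \le \sup_X (|f| + |g|) \le \sup_X |f| + \sup_X |g| = \|f\|_\infty + \|g\|_\infty. \]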
Now assume that $p \in (1,\infty)$. Since
\[ |f + g|^p \le (2\max(|f|, |g|))^p = 2^p \max(|f|^p, |g|^p) \le 2^p (|f|^p + |g|^p) \]
it follows that
\[ \|f + g\|_p^p \le 2^p \left( \|f\|_p^p + \|g\|_p^p \right) < \infty. \]
Eq. (5.5) is easily verified if $\|f + g\|_p = 0$, so we may assume $\|f + g\|_p > 0$. Multiplying the inequality,
\[ |f + g|^p = |f + g|\,|f + g|^{p-1} \le |f|\,|f + g|^{p-1} + |g|\,|f + g|^{p-1} \tag{5.6} \]
by $\mu$, then summing on $x$ and applying Hölder's inequality on each term gives
\[ \sum_X |f + g|^p \mu \le \sum_X |f|\,|f + g|^{p-1}\mu + \sum_X |g|\,|f + g|^{p-1}\mu \le (\|f\|_p + \|g\|_p) \left\| |f + g|^{p-1} \right\|_q. \tag{5.7} \]
Since $q(p-1) = p$, as in Eq. (5.1),
\[ \left\| |f + g|^{p-1} \right\|_q^q = \sum_X (|f + g|^{p-1})^q \mu = \sum_X |f + g|^p \mu = \|f + g\|_p^p. \tag{5.8} \]
Combining Eqs. (5.7) and (5.8) shows
\[ \|f + g\|_p^p \le (\|f\|_p + \|g\|_p)\, \|f + g\|_p^{p/q} \tag{5.9} \]
and solving this equation for $\|f + g\|_p$ (making use of Eq. (5.1)) implies Eq. (5.5). Now suppose that $f$ and $g$ are not identically zero and $p \in (1,\infty)$. Equality holds in Eq. (5.5) iff equality holds in Eq. (5.9) iff equality holds in Eq. (5.7) and Eq. (5.6). The latter happens iff
\[ \operatorname{sgn}(f) \circeq \operatorname{sgn}(g) \text{ and } \left( \frac{|f|}{\|f\|_p} \right)^p = \frac{|f + g|^p}{\|f + g\|_p^p} = \left( \frac{|g|}{\|g\|_p} \right)^p, \tag{5.10} \]
wherein we have used
\[ \left( \frac{|f + g|^{p-1}}{\left\| |f + g|^{p-1} \right\|_q} \right)^q = \frac{|f + g|^p}{\|f + g\|_p^p}. \]
Finally Eq. (5.10) is equivalent to $|f| = c|g|$ with $c = (\|f\|_p / \|g\|_p) > 0$ and this equality along with $\operatorname{sgn}(f) \circeq \operatorname{sgn}(g)$ implies $f = cg$.
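Hölder's and Minkowski's inequalities, together with their equality cases, can be checked numerically on a finite set $X$. The following Python sketch is only an illustration under stated assumptions: the six-point set, the random weights and functions, the exponent $p = 2.5$ and the helper name `norm` are all hypothetical choices.

```python
import random


def norm(f, mu, p):
    """Weighted l^p norm of the list f with positive weights mu, for finite p."""
    return sum(abs(fx) ** p * m for fx, m in zip(f, mu)) ** (1.0 / p)


random.seed(0)
n = 6
mu = [random.uniform(0.1, 2.0) for _ in range(n)]
f = [random.uniform(-1.0, 1.0) for _ in range(n)]
g = [random.uniform(-1.0, 1.0) for _ in range(n)]
p = 2.5
q = p / (p - 1.0)  # conjugate exponent

# Hoelder: ||fg||_1 <= ||f||_p ||g||_q
fg = [x * y for x, y in zip(f, g)]
holder_ok = norm(fg, mu, 1.0) <= norm(f, mu, p) * norm(g, mu, q) + 1e-12

# Minkowski: ||f + g||_p <= ||f||_p + ||g||_p
f_plus_g = [x + y for x, y in zip(f, g)]
minkowski_ok = norm(f_plus_g, mu, p) <= norm(f, mu, p) + norm(g, mu, p) + 1e-12

# Equality in Hoelder when |k|^q = |f|^p, e.g. k = |f|^(p-1).
k = [abs(x) ** (p - 1.0) for x in f]
fk = [x * y for x, y in zip(f, k)]
holder_gap = norm(f, mu, p) * norm(k, mu, q) - norm(fk, mu, 1.0)

# Equality in Minkowski when f = c g with c > 0.
c = 3.0
cg = [c * y for y in g]
mink_gap = norm(cg, mu, p) + norm(g, mu, p) - norm([x + y for x, y in zip(cg, g)], mu, p)

print(holder_ok, minkowski_ok, holder_gap, mink_gap)
```

Both gaps are zero up to rounding, matching conditions (5.3) and $f = cg$ respectively.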
5.1 Exercises
Exercise 5.1. Generalize Proposition 5.3 as follows. Let $a \in [-\infty, 0]$ and $f : \mathbb{R} \cap [a,\infty) \to [0,\infty)$ be a continuous strictly increasing function such that $\lim_{s\to\infty} f(s) = \infty$, $f(a) = 0$ if $a > -\infty$ or $\lim_{s\to-\infty} f(s) = 0$ if $a = -\infty$. Also let $g = f^{-1}$, $b = f(0) \ge 0$,
\[ F(s) = \int_0^s f(s')\,ds' \text{ and } G(t) = \int_0^t g(t')\,dt'. \]
Then for all $s, t \ge 0$,
\[ st \le F(s) + G(t \vee b) \le F(s) + G(t) \]
and equality holds iff $t = f(s)$. In particular, taking $f(s) = e^s$, prove Young's inequality stating
\[ st \le e^s + (t \vee 1)\ln(t \vee 1) - (t \vee 1) \le e^s + t\ln t - t, \]
where $s \vee t := \max(s,t)$. Hint: Refer to Figures 5.2 and 5.3.
Fig. 5.2. Comparing areas when $t \ge b$ goes the same way as in the text.
Fig. 5.3. When t b, notice that g(t) 0 but G(t) 0. Also notice that G(t) is
no longer needed to estimate st.
Part II
Metric, Banach, and Hilbert Space Basics
6
Metric Spaces
Definition 6.1. A function $d : X \times X \to [0,\infty)$ is called a metric if
1. (Symmetry) $d(x,y) = d(y,x)$ for all $x, y \in X$
2. (Non-degenerate) $d(x,y) = 0$ if and only if $x = y \in X$
3. (Triangle inequality) $d(x,z) \le d(x,y) + d(y,z)$ for all $x, y, z \in X$.
As primary examples, any normed space $(X, \|\cdot\|)$ (see Definition 5.1) is a metric space with $d(x,y) := \|x - y\|$. Thus the space $\ell^p(\mu)$ (as in Theorem 5.2) is a metric space for all $p \in [1,\infty]$. Also any subset of a metric space is a metric space. For example a surface $\Sigma$ in $\mathbb{R}^3$ is a metric space with the distance between two points on $\Sigma$ being the usual distance in $\mathbb{R}^3$.
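On a finite set of points the three axioms of Definition 6.1 can be checked by brute force. The following Python sketch assumes a small hypothetical point set in the plane; `is_metric` is an illustrative helper, not something defined in the text. It confirms the axioms for the Euclidean distance and shows that the squared Euclidean distance fails the triangle inequality.

```python
import itertools
import math


def is_metric(points, d, tol=1e-12):
    """Brute-force check of symmetry, non-degeneracy and the triangle inequality."""
    for x, y in itertools.product(points, repeat=2):
        if abs(d(x, y) - d(y, x)) > tol:
            return False                      # symmetry fails
        if (d(x, y) <= tol) != (x == y):
            return False                      # non-degeneracy fails
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False                      # triangle inequality fails
    return True


# Illustrative points; three of them are collinear on purpose.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 2.0), (3.0, 4.0)]


def euclid(x, y):
    return math.dist(x, y)                    # available in Python 3.8+


def squared(x, y):
    return math.dist(x, y) ** 2


print(is_metric(pts, euclid), is_metric(pts, squared))
```

For the collinear points $(0,0)$, $(1,0)$, $(2,0)$ the squared distance gives $4 > 1 + 1$, so it is symmetric and non-degenerate but not a metric.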
Definition 6.2. Let $(X,d)$ be a metric space. The open ball $B(x,\delta) \subset X$ centered at $x \in X$ with radius $\delta > 0$ is the set
\[ B(x,\delta) := \{y \in X : d(x,y) < \delta\}. \]
We will often also write $B(x,\delta)$ as $B_x(\delta)$. We also define the closed ball centered at $x \in X$ with radius $\delta > 0$ as the set $C_x(\delta) := \{y \in X : d(x,y) \le \delta\}$.
Definition 6.3. A sequence $\{x_n\}_{n=1}^{\infty}$ in a metric space $(X,d)$ is said to be convergent if there exists a point $x \in X$ such that $\lim_{n\to\infty} d(x, x_n) = 0$. In this case we write $\lim_{n\to\infty} x_n = x$ or $x_n \to x$ as $n \to \infty$.
Exercise 6.1. Show that $x$ in Definition 6.3 is necessarily unique.
Definition 6.4. A set $E \subset X$ is bounded if $E \subset B(x,R)$ for some $x \in X$ and $R < \infty$. A set $F \subset X$ is closed iff every convergent sequence $\{x_n\}_{n=1}^{\infty}$ which is contained in $F$ has its limit back in $F$. A set $V \subset X$ is open iff $V^c$ is closed. We will write $F \sqsubset X$ to indicate $F$ is a closed subset of $X$ and $V \subset_o X$ to indicate that $V$ is an open subset of $X$. We also let $\tau_d$ denote the collection of open subsets of $X$ relative to the metric $d$.
Definition 6.5. A subset $A \subset X$ is a neighborhood of $x$ if there exists an open set $V \subset_o X$ such that $x \in V \subset A$. We will say that $A \subset X$ is an open neighborhood of $x$ if $A$ is open and $x \in A$.
Exercise 6.2. Let $\mathcal{F}$ be a collection of closed subsets of $X$, show $\cap\mathcal{F} := \cap_{F\in\mathcal{F}} F$ is closed. Also show that finite unions of closed sets are closed, i.e. if $\{F_k\}_{k=1}^{n}$ are closed sets then $\cup_{k=1}^{n} F_k$ is closed. (By taking complements, this shows that the collection of open sets, $\tau_d$, is closed under finite intersections and arbitrary unions.)
The following continuity facts of the metric d will be used frequently in
the remainder of this book.
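Lemma 6.6. For any non empty subset $A \subset X$, let $d_A(x) := \inf\{d(x,a) \mid a \in A\}$, then
\[ |d_A(x) - d_A(y)| \le d(x,y) \quad \forall\, x, y \in X \tag{6.1} \]
and in particular if $x_n \to x$ in $X$ then $d_A(x_n) \to d_A(x)$ as $n \to \infty$. Moreover the set $F_\varepsilon := \{x \in X \mid d_A(x) \ge \varepsilon\}$ is closed in $X$.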
Proof. Let $a \in A$ and $x, y \in X$, then
\[ d_A(x) \le d(x,a) \le d(x,y) + d(y,a). \]
Taking the infimum over $a$ in the above equation shows that
\[ d_A(x) \le d(x,y) + d_A(y) \quad \forall\, x, y \in X. \]
Therefore, $d_A(x) - d_A(y) \le d(x,y)$ and by interchanging $x$ and $y$ we also have that $d_A(y) - d_A(x) \le d(x,y)$ which implies Eq. (6.1). If $x_n \to x \in X$, then by Eq. (6.1),
\[ |d_A(x) - d_A(x_n)| \le d(x, x_n) \to 0 \text{ as } n \to \infty \]
so that $\lim_{n\to\infty} d_A(x_n) = d_A(x)$. Now suppose that $\{x_n\}_{n=1}^{\infty} \subset F_\varepsilon$ and $x_n \to x$ in $X$, then
\[ d_A(x) = \lim_{n\to\infty} d_A(x_n) \ge \varepsilon \]
since $d_A(x_n) \ge \varepsilon$ for all $n$. This shows that $x \in F_\varepsilon$ and hence $F_\varepsilon$ is closed.
Corollary 6.7. The function $d$ satisfies,
\[ |d(x,y) - d(x',y')| \le d(y,y') + d(x,x'). \]
In particular $d : X \times X \to [0,\infty)$ is "continuous" in the sense that $d(x,y)$ is close to $d(x',y')$ if $x$ is close to $x'$ and $y$ is close to $y'$. (The notion of continuity will be developed shortly.)
Proof. By Lemma 6.6 for single point sets and the triangle inequality for the absolute value of real numbers,
\[ |d(x,y) - d(x',y')| \le |d(x,y) - d(x,y')| + |d(x,y') - d(x',y')| \le d(y,y') + d(x,x'). \]
Example 6.8. Let $x \in X$ and $\delta > 0$, then $C_x(\delta)$ and $B_x(\delta)^c$ are closed subsets of $X$. For example if $\{y_n\}_{n=1}^{\infty} \subset C_x(\delta)$ and $y_n \to y \in X$, then $d(y_n, x) \le \delta$ for all $n$ and using Corollary 6.7 it follows $d(y,x) \le \delta$, i.e. $y \in C_x(\delta)$. A similar proof shows $B_x(\delta)^c$ is closed, see Exercise 6.3.
Exercise 6.3. Show that $V \subset X$ is open iff for every $x \in V$ there is a $\delta > 0$ such that $B_x(\delta) \subset V$. In particular show $B_x(\delta)$ is open for all $x \in X$ and $\delta > 0$. Hint: by definition $V$ is not open iff $V^c$ is not closed.
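Lemma 6.9 (Approximating open sets from the inside by closed sets). Let $A$ be a closed subset of $X$ and $F_\varepsilon := \{x \in X \mid d_A(x) \ge \varepsilon\} \sqsubset X$ be as in Lemma 6.6. Then $F_\varepsilon \uparrow A^c$ as $\varepsilon \downarrow 0$.
Proof. It is clear that $d_A(x) = 0$ for $x \in A$ so that $F_\varepsilon \subset A^c$ for each $\varepsilon > 0$ and hence $\cup_{\varepsilon>0} F_\varepsilon \subset A^c$. Now suppose that $x \in A^c \subset_o X$. By Exercise 6.3 there exists an $\varepsilon > 0$ such that $B_x(\varepsilon) \subset A^c$, i.e. $d(x,y) \ge \varepsilon$ for all $y \in A$. Hence $x \in F_\varepsilon$ and we have shown that $A^c \subset \cup_{\varepsilon>0} F_\varepsilon$. Finally it is clear that $F_\varepsilon \subset F_{\varepsilon'}$ whenever $\varepsilon' \le \varepsilon$.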
Definition 6.10. Given a set $A$ contained in a metric space $X$, let $\bar{A} \subset X$ be the closure of $A$ defined by
\[ \bar{A} := \{x \in X : \exists\, \{x_n\} \subset A \ni x = \lim_{n\to\infty} x_n\}. \]
That is to say $\bar{A}$ contains all limit points of $A$. We say $A$ is dense in $X$ if $\bar{A} = X$, i.e. every element $x \in X$ is a limit of a sequence of elements from $A$.
Exercise 6.4. Given $A \subset X$, show $\bar{A}$ is a closed set and in fact
\[ \bar{A} = \cap\{F : A \subset F \subset X \text{ with } F \text{ closed}\}. \tag{6.2} \]
That is to say $\bar{A}$ is the smallest closed set containing $A$.
6.1 Continuity
Suppose that $(X,\rho)$ and $(Y,d)$ are two metric spaces and $f : X \to Y$ is a function.
Definition 6.11. A function $f : X \to Y$ is continuous at $x \in X$ if for all $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ d(f(x), f(x')) < \varepsilon \text{ provided that } \rho(x,x') < \delta. \tag{6.3} \]
The function $f$ is said to be continuous if $f$ is continuous at all points $x \in X$.
The following lemma gives two other characterizations of continuity of a
function at a point.
Lemma 6.12 (Local Continuity Lemma). Suppose that $(X,\rho)$ and $(Y,d)$ are two metric spaces and $f : X \to Y$ is a function defined in a neighborhood of a point $x \in X$. Then the following are equivalent:
1. $f$ is continuous at $x \in X$.
2. For all neighborhoods $A \subset Y$ of $f(x)$, $f^{-1}(A)$ is a neighborhood of $x \in X$.
3. For all sequences $\{x_n\}_{n=1}^{\infty} \subset X$ such that $x = \lim_{n\to\infty} x_n$, $\{f(x_n)\}$ is convergent in $Y$ and
\[ \lim_{n\to\infty} f(x_n) = f\left( \lim_{n\to\infty} x_n \right). \]
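Proof. 1 $\Longrightarrow$ 2. If $A \subset Y$ is a neighborhood of $f(x)$, there exists $\varepsilon > 0$ such that $B_{f(x)}(\varepsilon) \subset A$ and because $f$ is continuous there exists a $\delta > 0$ such that Eq. (6.3) holds. Therefore
\[ B_x(\delta) \subset f^{-1}\left( B_{f(x)}(\varepsilon) \right) \subset f^{-1}(A) \]
showing $f^{-1}(A)$ is a neighborhood of $x$.
2 $\Longrightarrow$ 3. Suppose that $\{x_n\}_{n=1}^{\infty} \subset X$ and $x = \lim_{n\to\infty} x_n$. Then for any $\varepsilon > 0$, $B_{f(x)}(\varepsilon)$ is a neighborhood of $f(x)$ and so $f^{-1}(B_{f(x)}(\varepsilon))$ is a neighborhood of $x$ which must contain $B_x(\delta)$ for some $\delta > 0$. Because $x_n \to x$, it follows that $x_n \in B_x(\delta) \subset f^{-1}(B_{f(x)}(\varepsilon))$ for a.a. $n$ and this implies $f(x_n) \in B_{f(x)}(\varepsilon)$ for a.a. $n$, i.e. $d(f(x), f(x_n)) < \varepsilon$ for a.a. $n$. Since $\varepsilon > 0$ is arbitrary it follows that $\lim_{n\to\infty} f(x_n) = f(x)$.
3 $\Longrightarrow$ 1. We will show not 1. $\Longrightarrow$ not 3. If $f$ is not continuous at $x$, there exists an $\varepsilon > 0$ such that for all $n \in \mathbb{N}$ there exists a point $x_n \in X$ with $\rho(x_n, x) < \frac{1}{n}$ yet $d(f(x_n), f(x)) \ge \varepsilon$. Hence $x_n \to x$ as $n \to \infty$ yet $f(x_n)$ does not converge to $f(x)$.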
Here is a global version of the previous lemma.
Lemma 6.13 (Global Continuity Lemma). Suppose that $(X,\rho)$ and $(Y,d)$ are two metric spaces and $f : X \to Y$ is a function defined on all of $X$. Then the following are equivalent:
1. $f$ is continuous.
2. $f^{-1}(V) \in \tau_\rho$ for all $V \in \tau_d$, i.e. $f^{-1}(V)$ is open in $X$ if $V$ is open in $Y$.
3. $f^{-1}(C)$ is closed in $X$ if $C$ is closed in $Y$.
4. For all convergent sequences $\{x_n\} \subset X$, $\{f(x_n)\}$ is convergent in $Y$ and
\[ \lim_{n\to\infty} f(x_n) = f\left( \lim_{n\to\infty} x_n \right). \]
Proof. Since $f^{-1}(A^c) = \left[ f^{-1}(A) \right]^c$, it is easily seen that 2. and 3. are equivalent. So because of Lemma 6.12 it only remains to show 1. and 2. are equivalent. If $f$ is continuous and $V \subset Y$ is open, then for every $x \in f^{-1}(V)$, $V$ is a neighborhood of $f(x)$ and so $f^{-1}(V)$ is a neighborhood of $x$. Hence $f^{-1}(V)$ is a neighborhood of all of its points and from this and Exercise 6.3 it follows that $f^{-1}(V)$ is open. Conversely, if $x \in X$ and $A \subset Y$ is a neighborhood of $f(x)$ then there exists $V \subset_o Y$ such that $f(x) \in V \subset A$. Hence $x \in f^{-1}(V) \subset f^{-1}(A)$ and by assumption $f^{-1}(V)$ is open, showing $f^{-1}(A)$ is a neighborhood of $x$. Therefore $f$ is continuous at $x$ and since $x \in X$ was arbitrary, $f$ is continuous.
Example 6.14. The function $d_A$ defined in Lemma 6.6 is continuous for each $A \subset X$. In particular, if $A = \{x\}$, it follows that $y \in X \to d(y,x)$ is continuous for each $x \in X$.
Exercise 6.5. Use Example 6.14 and Lemma 6.13 to recover the results of Example 6.8.
The next result shows that there are lots of continuous functions on a
metric space (X, d) .
Lemma 6.15 (Urysohn's Lemma for Metric Spaces). Let $(X,d)$ be a metric space and suppose that $A$ and $B$ are two disjoint closed subsets of $X$. Then
\[ f(x) = \frac{d_B(x)}{d_A(x) + d_B(x)} \text{ for } x \in X \tag{6.4} \]
defines a continuous function, $f : X \to [0,1]$, such that $f(x) = 1$ for $x \in A$ and $f(x) = 0$ if $x \in B$.
Proof. By Lemma 6.6, $d_A$ and $d_B$ are continuous functions on $X$. Since $A$ and $B$ are closed, $d_A(x) > 0$ if $x \notin A$ and $d_B(x) > 0$ if $x \notin B$. Since $A \cap B = \emptyset$, $d_A(x) + d_B(x) > 0$ for all $x$ and $(d_A + d_B)^{-1}$ is continuous as well. The remaining assertions about $f$ are all easy to verify.
Sometimes Urysohn's lemma will be used in the following form. Suppose $F \subset V \subset X$ with $F$ being closed and $V$ being open, then there exists $f \in C(X,[0,1])$ such that $f = 1$ on $F$ while $f = 0$ on $V^c$. This of course follows from Lemma 6.15 by taking $A = F$ and $B = V^c$.
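Formula (6.4) is completely explicit, so it can be evaluated directly. The following Python sketch assumes $X = \mathbb{R}$ with the usual metric and the illustrative closed sets $A = [0,1]$ and $B = [2,3]$; the function names are hypothetical.

```python
def dist_to_interval(x, lo, hi):
    """Distance from the real number x to the closed interval [lo, hi]."""
    return max(lo - x, x - hi, 0.0)


def urysohn(x, A=(0.0, 1.0), B=(2.0, 3.0)):
    """f(x) = d_B(x) / (d_A(x) + d_B(x)) for the disjoint closed intervals A and B."""
    dA = dist_to_interval(x, *A)
    dB = dist_to_interval(x, *B)
    return dB / (dA + dB)  # the denominator is positive because A and B are disjoint


samples = [-1.0 + 0.05 * k for k in range(101)]  # grid on [-1, 4]
values = [urysohn(x) for x in samples]

print(urysohn(0.5), urysohn(1.5), urysohn(2.5))
```

The function equals $1$ on $A$, equals $0$ on $B$, and interpolates continuously in between (for instance it equals $1/2$ at the midpoint $x = 1.5$ of the gap).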
6.2 Completeness in Metric Spaces
Definition 6.16 (Cauchy sequences). A sequence $\{x_n\}_{n=1}^{\infty}$ in a metric space $(X,d)$ is Cauchy provided that
\[ \lim_{m,n\to\infty} d(x_n, x_m) = 0. \]
Exercise 6.6. Show that convergent sequences are always Cauchy sequences. The converse is not always true. For example, let $X = \mathbb{Q}$ be the set of rational numbers and $d(x,y) = |x - y|$. Choose a sequence $\{x_n\}_{n=1}^{\infty} \subset \mathbb{Q}$ which converges to $\sqrt{2} \in \mathbb{R}$, then $\{x_n\}_{n=1}^{\infty}$ is $(\mathbb{Q},d)$-Cauchy but not $(\mathbb{Q},d)$-convergent. The sequence does converge in $\mathbb{R}$ however.
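One concrete choice of such a rational sequence (an assumption made here for illustration; the text does not specify one) is the Newton iteration $x_{k+1} = \frac{1}{2}(x_k + 2/x_k)$ with $x_1 = 1$. The following Python sketch computes it in exact rational arithmetic, so every term really lies in $\mathbb{Q}$.

```python
from fractions import Fraction


def newton_sqrt2(n):
    """First n Newton iterates for x**2 = 2 starting at 1, as exact rationals."""
    xs = [Fraction(1)]
    for _ in range(n - 1):
        x = xs[-1]
        xs.append((x + 2 / x) / 2)
    return xs


xs = newton_sqrt2(6)
# Distances between consecutive terms shrink rapidly, as for a Cauchy sequence.
gaps = [abs(xs[k + 1] - xs[k]) for k in range(len(xs) - 1)]
# No term squares to 2 exactly, since sqrt(2) is irrational.
never_exact = all(x * x != 2 for x in xs)

print(xs[:4], float(xs[-1]))
```

The terms $1, \frac{3}{2}, \frac{17}{12}, \frac{577}{408}, \dots$ cluster ever more tightly, but their only possible limit is $\sqrt{2}$, which is not in $\mathbb{Q}$.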
Definition 6.17. A metric space $(X,d)$ is complete if all Cauchy sequences are convergent sequences.
Exercise 6.7. Let $(X,d)$ be a complete metric space. Let $A \subset X$ be a subset of $X$ viewed as a metric space using $d|_{A\times A}$. Show that $(A, d|_{A\times A})$ is complete iff $A$ is a closed subset of $X$.
Example 6.18. Examples 2. through 4. of complete metric spaces will be verified in Chapter 7 below.
1. $X = \mathbb{R}$ and $d(x,y) = |x - y|$, see Theorem 3.8 above.
2. $X = \mathbb{R}^n$ and $d(x,y) = \|x - y\|_2 = \left( \sum_{i=1}^{n} (x_i - y_i)^2 \right)^{1/2}$.
3. $X = \ell^p(\mu)$ for $p \in [1,\infty]$ and any weight function $\mu : X \to (0,\infty)$.
4. $X = C([0,1],\mathbb{R})$, the space of continuous functions from $[0,1]$ to $\mathbb{R}$, and
\[ d(f,g) := \max_{t\in[0,1]} |f(t) - g(t)|. \]
This is a special case of Lemma 7.3 below.
5. Let $X = C([0,1],\mathbb{R})$ and
\[ d(f,g) := \int_0^1 |f(t) - g(t)|\,dt. \]
You are asked in Exercise 7.11 to verify that $(X,d)$ is a metric space which is not complete.
Exercise 6.8 (Completions of Metric Spaces). Suppose that $(X,d)$ is a (not necessarily complete) metric space. Using the following outline show there exists a complete metric space $(\bar{X}, \bar{d})$ and an isometric map $i : X \to \bar{X}$ such that $i(X)$ is dense in $\bar{X}$, see Definition 6.10.
1. Let $\mathcal{C}$ denote the collection of Cauchy sequences $a = \{a_n\}_{n=1}^{\infty} \subset X$. Given two elements $a, b \in \mathcal{C}$ show $d_{\mathcal{C}}(a,b) := \lim_{n\to\infty} d(a_n, b_n)$ exists, $d_{\mathcal{C}}(a,b) \ge 0$ for all $a, b \in \mathcal{C}$ and $d_{\mathcal{C}}$ satisfies the triangle inequality,
\[ d_{\mathcal{C}}(a,c) \le d_{\mathcal{C}}(a,b) + d_{\mathcal{C}}(b,c) \text{ for all } a, b, c \in \mathcal{C}. \]
Thus $(\mathcal{C}, d_{\mathcal{C}})$ would be a metric space if it were true that $d_{\mathcal{C}}(a,b) = 0$ iff $a = b$. This however is false, for example if $a_n = b_n$ for all $n \ge 100$, then $d_{\mathcal{C}}(a,b) = 0$ while $a$ need not equal $b$.
2. Define two elements $a, b \in \mathcal{C}$ to be equivalent (write $a \sim b$) whenever $d_{\mathcal{C}}(a,b) = 0$. Show "$\sim$" is an equivalence relation on $\mathcal{C}$ and that $d_{\mathcal{C}}(a',b') = d_{\mathcal{C}}(a,b)$ if $a \sim a'$ and $b \sim b'$. (Hint: see Corollary 6.7.)
3. Given $a \in \mathcal{C}$ let $\bar{a} := \{b \in \mathcal{C} : b \sim a\}$ denote the equivalence class containing $a$ and let $\bar{X} := \{\bar{a} : a \in \mathcal{C}\}$ denote the collection of such equivalence classes. Show that $\bar{d}(\bar{a}, \bar{b}) := d_{\mathcal{C}}(a,b)$ is well defined on $\bar{X} \times \bar{X}$ and verify $(\bar{X}, \bar{d})$ is a metric space.
4. For $x \in X$ let $i(x) = \bar{a}$ where $a$ is the constant sequence, $a_n = x$ for all $n$. Verify that $i : X \to \bar{X}$ is an isometric map and that $i(X)$ is dense in $\bar{X}$.
5. Verify $(\bar{X}, \bar{d})$ is complete. Hint: if $\{\bar{a}(m)\}_{m=1}^{\infty}$ is a Cauchy sequence in $\bar{X}$ choose $b_m \in X$ such that $\bar{d}(i(b_m), \bar{a}(m)) \le 1/m$. Then show $\bar{a}(m) \to \bar{b}$ where $b = \{b_m\}_{m=1}^{\infty}$.
6.3 Supplementary Remarks
6.3.1 Word of Caution
Example 6.19. Let $(X,d)$ be a metric space. It is always true that $\overline{B_x(\varepsilon)} \subset C_x(\varepsilon)$ since $C_x(\varepsilon)$ is a closed set containing $B_x(\varepsilon)$. However, it is not always true that $\overline{B_x(\varepsilon)} = C_x(\varepsilon)$. For example let $X = \{1,2\}$ and $d(1,2) = 1$, then $B_1(1) = \{1\}$, $\overline{B_1(1)} = \{1\}$ while $C_1(1) = X$. For another counterexample, take
\[ X = \{(x,y) \in \mathbb{R}^2 : x = 0 \text{ or } x = 1\} \]
with the usual Euclidean metric coming from the plane. Then
\[ B_{(0,0)}(1) = \{(0,y) \in \mathbb{R}^2 : |y| < 1\}, \]
\[ \overline{B_{(0,0)}(1)} = \{(0,y) \in \mathbb{R}^2 : |y| \le 1\}, \text{ while} \]
\[ C_{(0,0)}(1) = \overline{B_{(0,0)}(1)} \cup \{(1,0)\}. \]
In spite of the above examples, Lemmas 6.20 and 6.21 below show that for certain metric spaces of interest it is true that $\overline{B_x(\varepsilon)} = C_x(\varepsilon)$.
Lemma 6.20. Suppose that $(X, |\cdot|)$ is a normed vector space and $d$ is the metric on $X$ defined by $d(x,y) = |x - y|$. Then
\[ \overline{B_x(\varepsilon)} = C_x(\varepsilon) \text{ and} \]
\[ \operatorname{bd}(B_x(\varepsilon)) = \{y \in X : d(x,y) = \varepsilon\}, \]
where the boundary operation, $\operatorname{bd}(\cdot)$, is defined in Definition 13.29 below.
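Proof. We must show that $C := C_x(\varepsilon) \subset \overline{B_x(\varepsilon)} =: \bar{B}$. For $y \in C$, let $v = y - x$, then
\[ |v| = |y - x| = d(x,y) \le \varepsilon. \]
Let $\alpha_n = 1 - 1/n$ so that $\alpha_n \uparrow 1$ as $n \to \infty$. Let $y_n = x + \alpha_n v$, then $d(x, y_n) = \alpha_n d(x,y) < \varepsilon$, so that $y_n \in B_x(\varepsilon)$ and $d(y, y_n) = (1 - \alpha_n)|v| \to 0$ as $n \to \infty$. This shows that $y_n \to y$ as $n \to \infty$ and hence that $y \in \bar{B}$.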
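The approximating sequence $y_n = x + \alpha_n v$ from the proof can be traced numerically. The following Python sketch assumes $X = \mathbb{R}^2$ with the Euclidean norm and an illustrative center, radius and boundary point; none of these values come from the text.

```python
import math

x = (1.0, -2.0)          # center of the ball (illustrative)
eps = 5.0                # radius (illustrative)
v = (3.0, 4.0)           # |v| = 5 = eps, so y = x + v lies on the sphere
y = (x[0] + v[0], x[1] + v[1])


def y_n(n):
    """Point x + alpha_n * v with alpha_n = 1 - 1/n, as in the proof."""
    alpha = 1.0 - 1.0 / n
    return (x[0] + alpha * v[0], x[1] + alpha * v[1])


# (n, distance to the center, distance to y) for a few values of n.
rows = [(n, math.dist(x, y_n(n)), math.dist(y, y_n(n))) for n in (1, 10, 100, 1000)]
for row in rows:
    print(row)
```

Each $y_n$ lies strictly inside the open ball, while its distance to $y$ is $\varepsilon/n$ and tends to zero, so the boundary point $y$ belongs to the closure of $B_x(\varepsilon)$.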
Fig. 6.1. An almost length minimizing curve joining x to y.
6.3.2 Riemannian Metrics
This subsection is not completely self contained and may safely be skipped.
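Lemma 6.21. Suppose that $X$ is a Riemannian (or sub-Riemannian) manifold and $d$ is the metric on $X$ defined by
\[ d(x,y) = \inf\{\ell(\sigma) : \sigma(0) = x \text{ and } \sigma(1) = y\} \]
where $\ell(\sigma)$ is the length of the curve $\sigma$. We define $\ell(\sigma) = \infty$ if $\sigma$ is not piecewise smooth.
Then
\[ \overline{B_x(\varepsilon)} = C_x(\varepsilon) \text{ and} \]
\[ \operatorname{bd}(B_x(\varepsilon)) = \{y \in X : d(x,y) = \varepsilon\} \]
where the boundary operation, $\operatorname{bd}(\cdot)$, is defined in Definition 13.29 below.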
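Proof. Let $C := C_x(\varepsilon)$ and $\bar{B} := \overline{B_x(\varepsilon)}$. We will show that $C \subset \bar{B}$ by showing $\bar{B}^c \subset C^c$. Suppose that $y \in \bar{B}^c$ and choose $\delta > 0$ such that $B_y(\delta) \cap \bar{B} = \emptyset$. In particular this implies that
\[ B_y(\delta) \cap B_x(\varepsilon) = \emptyset. \]
We will finish the proof by showing that $d(x,y) \ge \varepsilon + \delta > \varepsilon$ and hence that $y \in C^c$. This will be accomplished by showing: if $d(x,y) < \varepsilon + \delta$ then $B_y(\delta) \cap B_x(\varepsilon) \ne \emptyset$. If $d(x,y) < \max(\varepsilon,\delta)$ then either $x \in B_y(\delta)$ or $y \in B_x(\varepsilon)$. In either case $B_y(\delta) \cap B_x(\varepsilon) \ne \emptyset$. Hence we may assume that $\max(\varepsilon,\delta) \le d(x,y) < \varepsilon + \delta$. Let $\alpha > 0$ be a number such that
\[ \max(\varepsilon,\delta) \le d(x,y) < \alpha < \varepsilon + \delta \]
and choose a curve $\sigma$ from $x$ to $y$ such that $\ell(\sigma) < \alpha$. Also choose $0 < \delta' < \delta$ such that $0 < \alpha - \delta' < \varepsilon$, which can be done since $\alpha - \delta < \varepsilon$. Let $k(t) = d(y, \sigma(t))$, a continuous function on $[0,1]$, and therefore $k([0,1]) \subset \mathbb{R}$ is a connected set which contains $0$ and $d(x,y)$. Therefore there exists $t_0 \in [0,1]$ such that $d(y, \sigma(t_0)) = k(t_0) = \delta'$. Let $z = \sigma(t_0) \in B_y(\delta)$, then
\[ d(x,z) \le \ell(\sigma|_{[0,t_0]}) = \ell(\sigma) - \ell(\sigma|_{[t_0,1]}) < \alpha - d(z,y) = \alpha - \delta' < \varepsilon \]
and therefore $z \in B_x(\varepsilon) \cap B_y(\delta) \ne \emptyset$.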
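Remark 6.22. Suppose again that $X$ is a Riemannian (or sub-Riemannian) manifold and
\[ d(x,y) = \inf\{\ell(\sigma) : \sigma(0) = x \text{ and } \sigma(1) = y\}. \]
Let $\sigma$ be a curve from $x$ to $y$ and let $\varepsilon = \ell(\sigma) - d(x,y)$. Then for all $0 \le u < v \le 1$,
\[ d(x,y) + \varepsilon = \ell(\sigma) = \ell(\sigma|_{[0,u]}) + \ell(\sigma|_{[u,v]}) + \ell(\sigma|_{[v,1]}) \ge d(x,\sigma(u)) + \ell(\sigma|_{[u,v]}) + d(\sigma(v),y) \]
and therefore, using the triangle inequality,
\[ \ell(\sigma|_{[u,v]}) \le d(x,y) + \varepsilon - d(x,\sigma(u)) - d(\sigma(v),y) \le d(\sigma(u),\sigma(v)) + \varepsilon. \]
This leads to the following conclusions. If $\sigma$ is within $\varepsilon$ of a length minimizing curve from $x$ to $y$ then $\sigma|_{[u,v]}$ is within $\varepsilon$ of a length minimizing curve from $\sigma(u)$ to $\sigma(v)$. In particular if $\sigma$ is a length minimizing curve from $x$ to $y$ then $\sigma|_{[u,v]}$ is a length minimizing curve from $\sigma(u)$ to $\sigma(v)$.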
6.4 Exercises
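Exercise 6.9. Let $(X,d)$ be a metric space. Suppose that $\{x_n\}_{n=1}^{\infty} \subset X$ is a sequence and set $\varepsilon_n := d(x_n, x_{n+1})$. Show that for $m > n$ that
\[ d(x_n, x_m) \le \sum_{k=n}^{m-1} \varepsilon_k \le \sum_{k=n}^{\infty} \varepsilon_k. \]
Conclude from this that if
\[ \sum_{k=1}^{\infty} \varepsilon_k = \sum_{n=1}^{\infty} d(x_n, x_{n+1}) < \infty \]
then $\{x_n\}_{n=1}^{\infty}$ is Cauchy. Moreover, show that if $\{x_n\}_{n=1}^{\infty}$ is a convergent sequence and $x = \lim_{n\to\infty} x_n$ then
\[ d(x, x_n) \le \sum_{k=n}^{\infty} \varepsilon_k. \]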
Exercise 6.10. Show that $(X,d)$ is a complete metric space iff every sequence $\{x_n\}_{n=1}^{\infty} \subset X$ such that $\sum_{n=1}^{\infty} d(x_n, x_{n+1}) < \infty$ is a convergent sequence in $X$. You may find it useful to prove the following statements in the course of the proof.
1. If $\{x_n\}$ is a Cauchy sequence, then there is a subsequence $y_j := x_{n_j}$ such that $\sum_{j=1}^{\infty} d(y_{j+1}, y_j) < \infty$.
2. If $\{x_n\}_{n=1}^{\infty}$ is Cauchy and there exists a subsequence $y_j := x_{n_j}$ of $\{x_n\}$ such that $x = \lim_{j\to\infty} y_j$ exists, then $\lim_{n\to\infty} x_n$ also exists and is equal to $x$.
Exercise 6.11. Suppose that $f : [0,\infty) \to [0,\infty)$ is a $C^2$-function such that $f(0) = 0$, $f' > 0$ and $f'' \le 0$ and $(X,\rho)$ is a metric space. Show that $d(x,y) = f(\rho(x,y))$ is a metric on $X$. In particular show that
\[ d(x,y) := \frac{\rho(x,y)}{1 + \rho(x,y)} \]
is a metric on $X$. (Hint: use calculus to verify that $f(a+b) \le f(a) + f(b)$ for all $a, b \in [0,\infty)$.)
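Before proving Exercise 6.11 it can be reassuring to test the claim numerically. The following Python sketch assumes $X \subset \mathbb{R}$ is a small illustrative set of points with $\rho(x,y) = |x - y|$ and checks, by brute force, the subadditivity of $f(a) = a/(1+a)$ from the hint and the triangle inequality for $d = \rho/(1+\rho)$. This is an experiment, not a proof.

```python
import itertools


def f(a):
    """The concave function a / (1 + a) on [0, infinity)."""
    return a / (1.0 + a)


def rho(x, y):
    return abs(x - y)


def d(x, y):
    return f(rho(x, y))


pts = [-3.0, -1.0, 0.0, 0.5, 2.0, 10.0, 1000.0]  # illustrative sample of X
grid = [0.1 * k for k in range(101)]             # a, b in [0, 10]

subadditive = all(f(a + b) <= f(a) + f(b) + 1e-12 for a in grid for b in grid)
triangle_ok = all(
    d(x, z) <= d(x, y) + d(y, z) + 1e-12
    for x, y, z in itertools.product(pts, repeat=3)
)
bounded = all(d(x, y) < 1.0 for x, y in itertools.product(pts, repeat=2))

print(subadditive, triangle_ok, bounded)
```

Note that the new metric $d$ is bounded by $1$ even though $\rho$ is unbounded.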
Exercise 6.12. Let $\{(X_n, d_n)\}_{n=1}^{\infty}$ be a sequence of metric spaces, $X := \prod_{n=1}^{\infty} X_n$, and for $x = (x(n))_{n=1}^{\infty}$ and $y = (y(n))_{n=1}^{\infty}$ in $X$ let
\[ d(x,y) = \sum_{n=1}^{\infty} 2^{-n} \frac{d_n(x(n), y(n))}{1 + d_n(x(n), y(n))}. \]
Show:
1. $(X,d)$ is a metric space,
2. a sequence $\{x_k\}_{k=1}^{\infty} \subset X$ converges to $x \in X$ iff $x_k(n) \to x(n) \in X_n$ as $k \to \infty$ for each $n \in \mathbb{N}$ and
3. $X$ is complete if $X_n$ is complete for all $n$.
Exercise 6.13. Suppose $(X,\rho)$ and $(Y,d)$ are metric spaces and $A$ is a dense subset of $X$.
1. Show that if $F : X \to Y$ and $G : X \to Y$ are two continuous functions such that $F = G$ on $A$ then $F = G$ on $X$. Hint: consider the set $C := \{x \in X : F(x) = G(x)\}$.
2. Suppose $f : A \to Y$ is a function which is uniformly continuous, i.e. for every $\varepsilon > 0$ there exists a $\delta > 0$ such that
\[ d(f(a), f(b)) < \varepsilon \text{ for all } a, b \in A \text{ with } \rho(a,b) < \delta. \]
Show there is a unique continuous function $F : X \to Y$ such that $F = f$ on $A$. Hint: each point $x \in X$ is a limit of a sequence consisting of elements from $A$.
3. Let $X = \mathbb{R} = Y$ and $A = \mathbb{Q} \subset X$, find a function $f : \mathbb{Q} \to \mathbb{R}$ which is continuous on $\mathbb{Q}$ but does not extend to a continuous function on $\mathbb{R}$.
7
Banach Spaces
Let $(X, \|\cdot\|)$ be a normed vector space and $d(x,y) := \|x - y\|$ be the associated metric on $X$. We say $\{x_n\}_{n=1}^{\infty} \subset X$ converges to $x \in X$ (and write $\lim_{n\to\infty} x_n = x$ or $x_n \to x$) if
\[ 0 = \lim_{n\to\infty} d(x, x_n) = \lim_{n\to\infty} \|x - x_n\|. \]
Similarly $\{x_n\}_{n=1}^{\infty} \subset X$ is said to be a Cauchy sequence if
\[ 0 = \lim_{m,n\to\infty} d(x_m, x_n) = \lim_{m,n\to\infty} \|x_m - x_n\|. \]
Definition 7.1 (Banach space). A normed vector space $(X, \|\cdot\|)$ is a Banach space if the associated metric space $(X,d)$ is complete, i.e. all Cauchy sequences are convergent.
Remark 7.2. Since $\|x\| = d(x,0)$, it follows from Lemma 6.6 that $\|\cdot\|$ is a continuous function on $X$ and that
\[ |\|x\| - \|y\|| \le \|x - y\| \text{ for all } x, y \in X. \]
It is also easily seen that the vector addition and scalar multiplication are continuous on any normed space as the reader is asked to verify in Exercise 7.5. These facts will often be used in the sequel without further mention.
7.1 Examples
Lemma 7.3. Suppose that $X$ is a set, then the bounded functions, $\ell^\infty(X)$, on $X$ form a Banach space with the norm
\[ \|f\| = \|f\|_\infty = \sup_{x\in X} |f(x)|. \]
Moreover if $X$ is a metric space (more generally a topological space, see Chapter 13) the set $BC(X) \subset \ell^\infty(X) = B(X)$ is a closed subspace of $\ell^\infty(X)$ and hence is also a Banach space.
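Proof. Let $\{f_n\}_{n=1}^{\infty} \subset \ell^\infty(X)$ be a Cauchy sequence. Since for any $x \in X$, we have
\[ |f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty \tag{7.1} \]
which shows that $\{f_n(x)\}_{n=1}^{\infty} \subset \mathbb{F}$ is a Cauchy sequence of numbers. Because $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$) is complete, $f(x) := \lim_{n\to\infty} f_n(x)$ exists for all $x \in X$. Passing to the limit $n \to \infty$ in Eq. (7.1) implies
\[ |f(x) - f_m(x)| \le \liminf_{n\to\infty} \|f_n - f_m\|_\infty \]
and taking the supremum over $x \in X$ of this inequality implies
\[ \|f - f_m\|_\infty \le \liminf_{n\to\infty} \|f_n - f_m\|_\infty \to 0 \text{ as } m \to \infty \]
showing $f_m \to f$ in $\ell^\infty(X)$. For the second assertion, suppose that $\{f_n\}_{n=1}^{\infty} \subset BC(X) \subset \ell^\infty(X)$ and $f_n \to f \in \ell^\infty(X)$. We must show that $f \in BC(X)$, i.e. that $f$ is continuous. To this end let $x, y \in X$, then
\[ |f(x) - f(y)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - f(y)| \le 2\|f - f_n\|_\infty + |f_n(x) - f_n(y)|. \]
Thus if $\varepsilon > 0$, we may choose $n$ large so that $2\|f - f_n\|_\infty < \varepsilon/2$ and then for this $n$ there exists an open neighborhood $V_x$ of $x \in X$ such that $|f_n(x) - f_n(y)| < \varepsilon/2$ for $y \in V_x$. Thus $|f(x) - f(y)| < \varepsilon$ for $y \in V_x$ showing the limiting function $f$ is continuous.
Here is an application of this theorem.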
Theorem 7.4 (Metric Space Tietze Extension Theorem). Let $(X,d)$ be a metric space, $D$ be a closed subset of $X$, $-\infty < a < b < \infty$ and $f \in C(D,[a,b])$. (Here we are viewing $D$ as a metric space with metric $d_D := d|_{D\times D}$.) Then there exists $F \in C(X,[a,b])$ such that $F|_D = f$.
Proof.
1. By scaling and translation (i.e. by replacing $f$ by $(b - a)^{-1}(f - a)$), it suffices to prove Theorem 7.4 with $a = 0$ and $b = 1$.
2. Suppose $\alpha \in (0, 1]$ and $f : D \to [0, \alpha]$ is a continuous function. Let $A := f^{-1}([0, \frac{1}{3}\alpha])$ and $B := f^{-1}([\frac{2}{3}\alpha, \alpha])$. By Lemma 6.15 there exists a function $\tilde g \in C(X, [0, 1])$ such that $\tilde g = 0$ on $A$ and $\tilde g = 1$ on $B$. Letting $g := \frac{\alpha}{3}\tilde g$, we have $g \in C(X, [0, \frac{\alpha}{3}])$ such that $g = 0$ on $A$ and $g = \frac{\alpha}{3}$ on $B$. Further notice that
\[ 0 \le f(x) - g(x) \le \tfrac{2}{3}\alpha \text{ for all } x \in D. \]
3. Now suppose $f : D \to [0, 1]$ is a continuous function as in step 1. Let $g_1 \in C(X, [0, 1/3])$ be as in step 2 with $\alpha = 1$ (see Figure 7.1) and let $f_1 := f - g_1|_D \in C(D, [0, 2/3])$. Apply step 2 with $\alpha = 2/3$ and $f = f_1$ to find $g_2 \in C(X, [0, \frac{1}{3}\cdot\frac{2}{3}])$ such that $f_2 := f - (g_1 + g_2)|_D \in C(D, [0, (\frac{2}{3})^2])$. Continue this way inductively to find $g_n \in C(X, [0, \frac{1}{3}(\frac{2}{3})^{n-1}])$ such that
\[ f - \sum_{n=1}^N g_n|_D =: f_N \in C\left(D, \left[0, \left(\tfrac{2}{3}\right)^N\right]\right). \tag{7.2} \]
4. Define $F := \sum_{n=1}^\infty g_n$. Since
\[ \sum_{n=1}^\infty \|g_n\|_\infty \le \sum_{n=1}^\infty \frac{1}{3}\left(\frac{2}{3}\right)^{n-1} = \frac{1}{3}\cdot\frac{1}{1 - \frac{2}{3}} = 1, \]
the series defining $F$ is uniformly convergent so $F \in C(X, [0, 1])$. Passing to the limit in Eq. (7.2) shows $f = F|_D$.
Fig. 7.1. Reducing $f$ by subtracting off a globally defined function $g_1 \in C(X, [0, \frac{1}{3}])$.
Theorem 7.5 (Completeness of $\ell^p(\mu)$). Let $X$ be a set and $\mu : X \to (0, \infty)$ be a given function. Then for any $p \in [1, \infty]$, $(\ell^p(\mu), \|\cdot\|_p)$ is a Banach space.
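Proof. We have already proved this for $p = \infty$ in Lemma 7.3 so we now assume that $p \in [1, \infty)$. Let $\{f_n\}_{n=1}^\infty \subset \ell^p(\mu)$ be a Cauchy sequence. Since $|g(x)|^p \mu(x) \le \|g\|_p^p$ for every $g \in \ell^p(\mu)$, we have for any $x \in X$,
\[ |f_n(x) - f_m(x)| \le \frac{1}{\mu(x)^{1/p}}\|f_n - f_m\|_p \to 0 \text{ as } m, n \to \infty. \]
It follows that $\{f_n(x)\}_{n=1}^\infty$ is a Cauchy sequence of numbers and $f(x) := \lim_{n\to\infty} f_n(x)$ exists for all $x \in X$. By Fatou's Lemma,
\[ \|f_n - f\|_p^p = \sum_X \mu \cdot \liminf_{m\to\infty} |f_n - f_m|^p \le \liminf_{m\to\infty} \sum_X \mu \cdot |f_n - f_m|^p = \liminf_{m\to\infty} \|f_n - f_m\|_p^p \to 0 \text{ as } n \to \infty. \]
This then shows that $f = (f - f_n) + f_n \in \ell^p(\mu)$ (being the sum of two $\ell^p$ functions) and that $f_n \to f$ in $\ell^p(\mu)$.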
Remark 7.6. Let $X$ be a set, $Y$ be a Banach space and $\ell^\infty(X, Y)$ denote the bounded functions $f : X \to Y$ equipped with the norm
\[ \|f\| = \|f\|_\infty = \sup_{x \in X} \|f(x)\|_Y. \]
If $X$ is a metric space (or a general topological space, see Chapter 13), let $BC(X, Y)$ denote those $f \in \ell^\infty(X, Y)$ which are continuous. The same proof used in Lemma 7.3 shows that $\ell^\infty(X, Y)$ is a Banach space and that $BC(X, Y)$ is a closed subspace of $\ell^\infty(X, Y)$. Similarly, if $1 \le p < \infty$ we may define
\[ \ell^p(X, Y) = \left\{ f : X \to Y : \|f\|_p = \left( \sum_{x \in X} \|f(x)\|_Y^p \right)^{1/p} < \infty \right\}. \]
The same proof as in Theorem 7.5 would then show that $(\ell^p(X, Y), \|\cdot\|_p)$ is a Banach space.
7.2 Bounded Linear Operators Basics
Definition 7.7. Let $X$ and $Y$ be normed spaces and $T : X \to Y$ be a linear map. Then $T$ is said to be bounded provided there exists $C < \infty$ such that $\|T(x)\|_Y \le C\|x\|_X$ for all $x \in X$. We denote the best constant by $\|T\|_{op} = \|T\|_{L(X,Y)}$, i.e.
\[ \|T\|_{L(X,Y)} = \sup_{x \ne 0} \frac{\|T(x)\|_Y}{\|x\|_X} = \sup \{ \|T(x)\|_Y : \|x\|_X = 1 \}. \]
The number $\|T\|_{L(X,Y)}$ is called the operator norm of $T$.
In the future, we will usually drop the garnishing on the norms and simply write $\|x\|_X$ as $\|x\|$, $\|T\|_{L(X,Y)}$ as $\|T\|$, etc. The reader should be able to determine the norm that is to be used by context.
Proposition 7.8. Suppose that $X$ and $Y$ are normed spaces and $T : X \to Y$ is a linear map. Then the following are equivalent:
1. $T$ is continuous.
2. $T$ is continuous at 0.
3. $T$ is bounded.
Proof. 1. $\Rightarrow$ 2. is trivial. 2. $\Rightarrow$ 3. If $T$ is continuous at 0 then there exists $\delta > 0$ such that $\|T(x)\| \le 1$ if $\|x\| \le \delta$. Therefore for any nonzero $x \in X$, $\|T(\delta x/\|x\|)\| \le 1$, which implies that $\|T(x)\| \le \frac{1}{\delta}\|x\|$ and hence $\|T\| \le \frac{1}{\delta} < \infty$. 3. $\Rightarrow$ 1. Let $x \in X$ and $\varepsilon > 0$ be given. Then
\[ \|Ty - Tx\| = \|T(y - x)\| \le \|T\| \|y - x\| < \varepsilon \]
provided $\|y - x\| < \varepsilon/\|T\| := \delta$.
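For the next three exercises, let $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ and $T : X \to Y$ be a linear transformation so that $T$ is given by matrix multiplication by an $m \times n$ matrix. Let us identify the linear transformation $T$ with this matrix.
Exercise 7.1. Assume the norms on $X$ and $Y$ are the $\ell^1$ norms, i.e. for $x \in \mathbb{R}^n$, $\|x\| = \sum_{j=1}^n |x_j|$. Then the operator norm of $T$ is given by
\[ \|T\| = \max_{1 \le j \le n} \sum_{i=1}^m |T_{ij}|. \]
Exercise 7.2. Assume the norms on $X$ and $Y$ are the $\ell^\infty$ norms, i.e. for $x \in \mathbb{R}^n$, $\|x\| = \max_{1 \le j \le n} |x_j|$. Then the operator norm of $T$ is given by
\[ \|T\| = \max_{1 \le i \le m} \sum_{j=1}^n |T_{ij}|. \]
Exercise 7.3. Assume the norms on $X$ and $Y$ are the $\ell^2$ norms, i.e. for $x \in \mathbb{R}^n$, $\|x\|^2 = \sum_{j=1}^n x_j^2$. Show $\|T\|^2$ is the largest eigenvalue of the matrix $T^{tr}T : \mathbb{R}^n \to \mathbb{R}^n$. Hint: Use the spectral theorem for symmetric matrices.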
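The formulas in Exercises 7.1 and 7.2 can be sanity-checked numerically. The following is a minimal sketch, not a proof: the helper names (`op_norm_1`, `op_norm_inf`, `apply`) and the sample matrix are illustrative assumptions. It evaluates the maximal column sum and maximal row sum and exhibits a unit vector at which each bound is attained.

```python
def norm_1(x):
    """l^1 norm of a vector."""
    return sum(abs(v) for v in x)

def norm_inf(x):
    """l^infinity norm of a vector."""
    return max(abs(v) for v in x)

def apply(T, x):
    """Matrix-vector product T x, with T stored as a list of rows."""
    return [sum(t * v for t, v in zip(row, x)) for row in T]

def op_norm_1(T):
    """Maximal absolute column sum (the formula of Exercise 7.1)."""
    rows, cols = len(T), len(T[0])
    return max(sum(abs(T[i][j]) for i in range(rows)) for j in range(cols))

def op_norm_inf(T):
    """Maximal absolute row sum (the formula of Exercise 7.2)."""
    return max(sum(abs(t) for t in row) for row in T)

# Hypothetical 2 x 3 example matrix.
T = [[1.0, -2.0, 3.0],
     [4.0, 0.0, -1.0]]

e1 = [1.0, 0.0, 0.0]       # l^1 unit vector picking out the largest column
signs = [1.0, -1.0, 1.0]   # l^infinity unit vector matching the signs of row 1

print(op_norm_1(T), norm_1(apply(T, e1)))
print(op_norm_inf(T), norm_inf(apply(T, signs)))
```

In both cases the supremum in the definition of $\|T\|$ is attained: at a standard basis vector for the $\ell^1$ norm and at a sign vector for the $\ell^\infty$ norm.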
Notation 7.9 Let $L(X, Y)$ denote the bounded linear operators from $X$ to $Y$ and $L(X) = L(X, X)$. If $Y = \mathbb{F}$ we write $X^*$ for $L(X, \mathbb{F})$ and call $X^*$ the (continuous) dual space to $X$.
Lemma 7.10. Let $X, Y$ be normed spaces, then the operator norm $\|\cdot\|$ on $L(X, Y)$ is a norm. Moreover if $Z$ is another normed space and $T : X \to Y$ and $S : Y \to Z$ are linear maps, then $\|ST\| \le \|S\|\|T\|$, where $ST := S \circ T$.
Proof. As usual, the main point in checking the operator norm is a norm is to verify the triangle inequality, the other axioms being easy to check. If $A, B \in L(X, Y)$ then the triangle inequality is verified as follows:
\[ \|A + B\| = \sup_{x \ne 0} \frac{\|Ax + Bx\|}{\|x\|} \le \sup_{x \ne 0} \frac{\|Ax\| + \|Bx\|}{\|x\|} \le \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|} + \sup_{x \ne 0} \frac{\|Bx\|}{\|x\|} = \|A\| + \|B\|. \]
For the second assertion, we have for $x \in X$, that
\[ \|STx\| \le \|S\|\|Tx\| \le \|S\|\|T\|\|x\|. \]
From this inequality and the definition of $\|ST\|$, it follows that $\|ST\| \le \|S\|\|T\|$.
The reader is asked to prove the following continuity lemma in Exercise 7.9.
Lemma 7.11. Let $X$, $Y$ and $Z$ be normed spaces. Then the maps
\[ (S, x) \in L(X, Y) \times X \to Sx \in Y \]
and
\[ (S, T) \in L(X, Y) \times L(Y, Z) \to ST \in L(X, Z) \]
are continuous relative to the norms
\[ \|(S, x)\|_{L(X,Y) \times X} := \|S\|_{L(X,Y)} + \|x\|_X \]
and
\[ \|(S, T)\|_{L(X,Y) \times L(Y,Z)} := \|S\|_{L(X,Y)} + \|T\|_{L(Y,Z)} \]
on $L(X, Y) \times X$ and $L(X, Y) \times L(Y, Z)$ respectively.
Proposition 7.12. Suppose that $X$ is a normed vector space and $Y$ is a Banach space. Then $(L(X, Y), \|\cdot\|_{op})$ is a Banach space. In particular the dual space $X^*$ is always a Banach space.
Proof. Let $\{T_n\}_{n=1}^\infty$ be a Cauchy sequence in $L(X, Y)$. Then for each $x \in X$,
\[ \|T_n x - T_m x\| \le \|T_n - T_m\| \|x\| \to 0 \text{ as } m, n \to \infty, \]
showing $\{T_n x\}_{n=1}^\infty$ is Cauchy in $Y$. Using the completeness of $Y$, there exists an element $Tx \in Y$ such that
\[ \lim_{n\to\infty} \|T_n x - Tx\| = 0. \]
The map $T : X \to Y$ is a linear map, since for $x, x' \in X$ and $\lambda \in \mathbb{F}$ we have
\[ T(x + \lambda x') = \lim_{n\to\infty} T_n(x + \lambda x') = \lim_{n\to\infty} [T_n x + \lambda T_n x'] = Tx + \lambda Tx', \]
wherein we have used the continuity of the vector space operations in the last equality. Moreover,
\[ \|Tx - T_n x\| \le \|Tx - T_m x\| + \|T_m x - T_n x\| \le \|Tx - T_m x\| + \|T_m - T_n\| \|x\| \]
and therefore
\[ \|Tx - T_n x\| \le \liminf_{m\to\infty} \left( \|Tx - T_m x\| + \|T_m - T_n\| \|x\| \right) = \|x\| \cdot \liminf_{m\to\infty} \|T_m - T_n\|. \]
Hence
\[ \|T - T_n\| \le \liminf_{m\to\infty} \|T_m - T_n\| \to 0 \text{ as } n \to \infty. \]
Thus we have shown that $T_n \to T$ in $L(X, Y)$ as desired.
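The following characterization of a Banach space will sometimes be useful in the sequel.
Theorem 7.13. A normed space $(X, \|\cdot\|)$ is a Banach space iff for every sequence $\{x_n\}_{n=1}^\infty \subset X$, $\sum_{n=1}^\infty \|x_n\| < \infty$ implies that $\lim_{N\to\infty} \sum_{n=1}^N x_n = s$ exists in $X$ (that is to say every absolutely convergent series is a convergent series in $X$). As usual we will denote $s$ by $\sum_{n=1}^\infty x_n$.
Proof. (This is very similar to Exercise 6.10.) ($\Rightarrow$) If $X$ is complete and $\sum_{n=1}^\infty \|x_n\| < \infty$ then the sequence $s_N := \sum_{n=1}^N x_n$ for $N \in \mathbb{N}$ is Cauchy because (for $N > M$)
\[ \|s_N - s_M\| \le \sum_{n=M+1}^N \|x_n\| \to 0 \text{ as } M, N \to \infty. \]
Therefore $s = \sum_{n=1}^\infty x_n := \lim_{N\to\infty} \sum_{n=1}^N x_n$ exists in $X$. ($\Leftarrow$) Suppose that $\{x_n\}_{n=1}^\infty$ is a Cauchy sequence and let $\{y_k = x_{n_k}\}_{k=1}^\infty$ be a subsequence of $\{x_n\}_{n=1}^\infty$ such that $\sum_{n=1}^\infty \|y_{n+1} - y_n\| < \infty$. By assumption
\[ y_{N+1} - y_1 = \sum_{n=1}^N (y_{n+1} - y_n) \to s = \sum_{n=1}^\infty (y_{n+1} - y_n) \in X \text{ as } N \to \infty. \]
This shows that $\lim_{N\to\infty} y_N$ exists and is equal to $x := y_1 + s$. Since $\{x_n\}_{n=1}^\infty$ is Cauchy,
\[ \|x - x_n\| \le \|x - y_k\| + \|y_k - x_n\| \to 0 \text{ as } k, n \to \infty, \]
showing that $\lim_{n\to\infty} x_n$ exists and is equal to $x$.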
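Example 7.14. Here is another proof of Proposition 7.12 which makes use of Theorem 7.13. Suppose that $T_n \in L(X, Y)$ is a sequence of operators such that $\sum_{n=1}^\infty \|T_n\| < \infty$. Then
\[ \sum_{n=1}^\infty \|T_n x\| \le \sum_{n=1}^\infty \|T_n\| \|x\| < \infty \]
and therefore by the completeness of $Y$, $Sx := \sum_{n=1}^\infty T_n x = \lim_{N\to\infty} S_N x$ exists in $Y$, where $S_N := \sum_{n=1}^N T_n$. The reader should check that $S : X \to Y$ so defined is linear. Since
\[ \|Sx\| = \lim_{N\to\infty} \|S_N x\| \le \lim_{N\to\infty} \sum_{n=1}^N \|T_n x\| \le \sum_{n=1}^\infty \|T_n\| \|x\|, \]
$S$ is bounded and
\[ \|S\| \le \sum_{n=1}^\infty \|T_n\|. \tag{7.3} \]
Similarly,
\[ \|Sx - S_M x\| = \lim_{N\to\infty} \|S_N x - S_M x\| \le \lim_{N\to\infty} \sum_{n=M+1}^N \|T_n\| \|x\| = \sum_{n=M+1}^\infty \|T_n\| \|x\| \]
and therefore
\[ \|S - S_M\| \le \sum_{n=M+1}^\infty \|T_n\| \to 0 \text{ as } M \to \infty. \]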
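For the remainder of this section let $X$ be an infinite set, $\mu : X \to (0, \infty)$ be a given function and $p, q \in [1, \infty]$ such that $q = p/(p - 1)$. It will also be convenient to define $\delta_x : X \to \mathbb{R}$ for $x \in X$ by
\[ \delta_x(y) = \begin{cases} 1 & \text{if } y = x \\ 0 & \text{if } y \ne x. \end{cases} \]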
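Notation 7.15 Let $c_0(X)$ denote those functions $f \in \ell^\infty(X)$ which vanish at $\infty$, i.e. for every $\varepsilon > 0$ there exists a finite subset $\Lambda_\varepsilon \subset X$ such that $|f(x)| < \varepsilon$ whenever $x \notin \Lambda_\varepsilon$. Also let $c_f(X)$ denote those functions $f : X \to \mathbb{F}$ with finite support, i.e.
\[ c_f(X) := \{ f \in \ell^\infty(X) : \#(\{x \in X : f(x) \ne 0\}) < \infty \}. \]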
Exercise 7.4. Show $c_f(X)$ is a dense subspace of the Banach spaces $(\ell^p(\mu), \|\cdot\|_p)$ for $1 \le p < \infty$, while the closure of $c_f(X)$ inside the Banach space $(\ell^\infty(X), \|\cdot\|_\infty)$ is $c_0(X)$. Note from this it follows that $c_0(X)$ is a closed subspace of $\ell^\infty(X)$. (See Proposition 15.23 below where this last assertion is proved in a more general context.)
Theorem 7.16. Let $X$ be any set, $\mu : X \to (0, \infty)$ be a function, $p \in [1, \infty]$, $q := p/(p - 1)$ be the conjugate exponent and for $f \in \ell^q(\mu)$ define $\varphi_f : \ell^p(\mu) \to \mathbb{F}$ by
\[ \varphi_f(g) := \sum_{x \in X} f(x) g(x) \mu(x). \]
Then
1. $\varphi_f(g)$ is well defined and $\varphi_f \in \ell^p(\mu)^*$.
2. The map
\[ f \in \ell^q(\mu) \to \varphi_f \in \ell^p(\mu)^* \tag{7.4} \]
is an isometric linear map of Banach spaces.
3. If $p \in [1, \infty)$, then the map in Eq. (7.4) is also surjective and hence $\ell^p(\mu)^*$ is isometrically isomorphic to $\ell^q(\mu)$.
4. When $p = \infty$, the map
\[ f \in \ell^1(\mu) \to \varphi_f \in c_0(X)^* \]
is isometric and surjective, i.e. $\ell^1(\mu)$ is isometrically isomorphic to $c_0(X)^*$.
(See Theorem 25.13 below for a continuation of this theorem.)
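Proof.
1. By Hölder's inequality,
\[ \sum_{x \in X} |f(x)| |g(x)| \mu(x) \le \|f\|_q \|g\|_p, \]
which shows that $\varphi_f$ is well defined. The map $\varphi_f : \ell^p(\mu) \to \mathbb{F}$ is linear by the linearity of sums and since
\[ |\varphi_f(g)| = \left| \sum_{x \in X} f(x) g(x) \mu(x) \right| \le \sum_{x \in X} |f(x)| |g(x)| \mu(x) \le \|f\|_q \|g\|_p, \]
we learn that
\[ \|\varphi_f\|_{\ell^p(\mu)^*} \le \|f\|_q. \tag{7.5} \]
Therefore $\varphi_f \in \ell^p(\mu)^*$.
2. The map in Eq. (7.4) is linear in $f$ by the linearity properties of infinite sums. For $p \in (1, \infty)$, define $g(x) = \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1}$ where
\[ \operatorname{sgn}(z) := \begin{cases} \frac{z}{|z|} & \text{if } z \ne 0 \\ 0 & \text{if } z = 0. \end{cases} \]
Then
\[ \|g\|_p^p = \sum_{x \in X} |f(x)|^{(q-1)p} \mu(x) = \sum_{x \in X} |f(x)|^{\left(\frac{p}{p-1} - 1\right)p} \mu(x) = \sum_{x \in X} |f(x)|^q \mu(x) = \|f\|_q^q \]
and
\[ \varphi_f(g) = \sum_{x \in X} f(x) \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1} \mu(x) = \sum_{x \in X} |f(x)| |f(x)|^{q-1} \mu(x) = \|f\|_q^{q\left(\frac{1}{q} + \frac{1}{p}\right)} = \|f\|_q \|f\|_q^{q/p} = \|f\|_q \|g\|_p. \]
Hence $\|\varphi_f\|_{\ell^p(\mu)^*} \ge \|f\|_q$, which combined with Eq. (7.5) shows $\|\varphi_f\|_{\ell^p(\mu)^*} = \|f\|_q$. For $p = \infty$, let $g(x) = \overline{\operatorname{sgn}(f(x))}$, then $\|g\|_\infty = 1$ and
\[ |\varphi_f(g)| = \sum_{x \in X} f(x) \overline{\operatorname{sgn}(f(x))} \mu(x) = \sum_{x \in X} |f(x)| \mu(x) = \|f\|_1 \|g\|_\infty, \]
which shows $\|\varphi_f\|_{\ell^\infty(\mu)^*} \ge \|f\|_{\ell^1(\mu)}$. Combining this with Eq. (7.5) shows $\|\varphi_f\|_{\ell^\infty(\mu)^*} = \|f\|_{\ell^1(\mu)}$. For $p = 1$,
\[ |\varphi_f(\delta_x)| = \mu(x) |f(x)| = |f(x)| \|\delta_x\|_1 \]
and therefore $\|\varphi_f\|_{\ell^1(\mu)^*} \ge |f(x)|$ for all $x \in X$. Hence $\|\varphi_f\|_{\ell^1(\mu)^*} \ge \|f\|_\infty$, which combined with Eq. (7.5) shows $\|\varphi_f\|_{\ell^1(\mu)^*} = \|f\|_\infty$.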
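3. and 4. Suppose that $p \in [1, \infty)$ and $\varphi \in \ell^p(\mu)^*$ or $p = \infty$ and $\varphi \in c_0(X)^*$. We wish to find $f \in \ell^q(\mu)$ such that $\varphi = \varphi_f$. If such an $f$ exists, then $\varphi(\delta_x) = f(x)\mu(x)$ and so we must define $f(x) := \varphi(\delta_x)/\mu(x)$. As a preliminary estimate,
\[ |f(x)| = \frac{|\varphi(\delta_x)|}{\mu(x)} \le \frac{\|\varphi\|_{\ell^p(\mu)^*} \|\delta_x\|_{\ell^p(\mu)}}{\mu(x)} = \frac{\|\varphi\|_{\ell^p(\mu)^*} [\mu(x)]^{1/p}}{\mu(x)} = \|\varphi\|_{\ell^p(\mu)^*} [\mu(x)]^{-1/q}. \]
When $p = 1$ and $q = \infty$, this implies $\|f\|_\infty \le \|\varphi\|_{\ell^1(\mu)^*} < \infty$. If $p \in (1, \infty]$ and $\Lambda \subset X$ is a finite set, then
\[ \begin{aligned} \|f\|^q_{\ell^q(\Lambda,\mu)} := \sum_{x \in \Lambda} |f(x)|^q \mu(x) &= \sum_{x \in \Lambda} f(x) \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1} \mu(x) \\ &= \sum_{x \in \Lambda} \frac{\varphi(\delta_x)}{\mu(x)} \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1} \mu(x) \\ &= \sum_{x \in \Lambda} \varphi(\delta_x) \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1} \\ &= \varphi\left( \sum_{x \in \Lambda} \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1} \delta_x \right) \\ &\le \|\varphi\|_{\ell^p(\mu)^*} \left\| \sum_{x \in \Lambda} \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1} \delta_x \right\|_p. \end{aligned} \]
Since
\[ \left\| \sum_{x \in \Lambda} \overline{\operatorname{sgn}(f(x))}\, |f(x)|^{q-1} \delta_x \right\|_p = \left( \sum_{x \in \Lambda} |f(x)|^{(q-1)p} \mu(x) \right)^{1/p} = \left( \sum_{x \in \Lambda} |f(x)|^q \mu(x) \right)^{1/p} = \|f\|^{q/p}_{\ell^q(\Lambda,\mu)}, \]
which is also valid for $p = \infty$ provided $\|f\|^{1/\infty}_{\ell^1(\Lambda,\mu)} := 1$, combining the last two displayed equations shows
\[ \|f\|^q_{\ell^q(\Lambda,\mu)} \le \|\varphi\|_{\ell^p(\mu)^*} \|f\|^{q/p}_{\ell^q(\Lambda,\mu)}, \]
and solving this inequality for $\|f\|_{\ell^q(\Lambda,\mu)}$ (using $q - q/p = 1$) implies $\|f\|_{\ell^q(\Lambda,\mu)} \le \|\varphi\|_{\ell^p(\mu)^*}$. Taking the supremum of this inequality over finite $\Lambda \subset X$ shows $\|f\|_{\ell^q(\mu)} \le \|\varphi\|_{\ell^p(\mu)^*}$, i.e. $f \in \ell^q(\mu)$. Since $\varphi$ and $\varphi_f$ agree on $c_f(X)$ and $c_f(X)$ is a dense subspace of $\ell^p(\mu)$ for $p < \infty$ and $c_f(X)$ is a dense subspace of $c_0(X)$ when $p = \infty$, it follows that $\varphi = \varphi_f$.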
7.3 General Sums in Banach Spaces
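Definition 7.17. Suppose $X$ is a normed space.
1. Suppose that $\{x_n\}_{n=1}^\infty$ is a sequence in $X$, then we say $\sum_{n=1}^\infty x_n$ converges in $X$ and $\sum_{n=1}^\infty x_n = s$ if
\[ \lim_{N\to\infty} \sum_{n=1}^N x_n = s \text{ in } X. \]
2. Suppose that $\{x_\alpha : \alpha \in A\}$ is a given collection of vectors in $X$. We say the sum $\sum_{\alpha \in A} x_\alpha$ converges in $X$ and write $s = \sum_{\alpha \in A} x_\alpha \in X$ if for all $\varepsilon > 0$ there exists a finite set $\Gamma_\varepsilon \subset A$ such that $\left\| s - \sum_{\alpha \in \Lambda} x_\alpha \right\| < \varepsilon$ for any finite set $\Lambda \subset A$ such that $\Gamma_\varepsilon \subset \Lambda$.
Warning: As usual if $X$ is a Banach space and $\sum_{\alpha \in A} \|x_\alpha\| < \infty$ then $\sum_{\alpha \in A} x_\alpha$ exists in $X$, see Exercise 7.13. However, unlike the case of real valued sums, the existence of $\sum_{\alpha \in A} x_\alpha$ does not imply $\sum_{\alpha \in A} \|x_\alpha\| < \infty$. See Proposition 8.19 below, from which one may manufacture counter-examples to this false premise.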
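Lemma 7.18. Suppose that $\{x_\alpha \in X : \alpha \in A\}$ is a given collection of vectors in a normed space $X$.
1. If $s = \sum_{\alpha \in A} x_\alpha \in X$ exists and $T : X \to Y$ is a bounded linear map between normed spaces, then $\sum_{\alpha \in A} T x_\alpha$ exists in $Y$ and
\[ Ts = T \sum_{\alpha \in A} x_\alpha = \sum_{\alpha \in A} T x_\alpha. \]
2. If $s = \sum_{\alpha \in A} x_\alpha$ exists in $X$ then for every $\varepsilon > 0$ there exists a finite set $\Gamma_\varepsilon \subset A$ such that $\left\| \sum_{\alpha \in \Lambda} x_\alpha \right\| < \varepsilon$ for all finite $\Lambda \subset A \setminus \Gamma_\varepsilon$.
3. If $s = \sum_{\alpha \in A} x_\alpha$ exists in $X$, the set $\Gamma := \{\alpha \in A : x_\alpha \ne 0\}$ is at most countable. Moreover if $\Gamma$ is infinite and $\{\alpha_n\}_{n=1}^\infty$ is an enumeration of $\Gamma$, then
\[ s = \sum_{n=1}^\infty x_{\alpha_n} := \lim_{N\to\infty} \sum_{n=1}^N x_{\alpha_n}. \tag{7.6} \]
4. If we further assume that $X$ is a Banach space and suppose for all $\varepsilon > 0$ there exists a finite set $\Gamma_\varepsilon \subset A$ such that $\left\| \sum_{\alpha \in \Lambda} x_\alpha \right\| < \varepsilon$ whenever $\Lambda \subset A \setminus \Gamma_\varepsilon$ is finite, then $\sum_{\alpha \in A} x_\alpha$ exists in $X$.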
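Proof.
1. Let $\Gamma_\varepsilon$ be as in Definition 7.17 and $\Lambda \subset A$ be a finite set such that $\Gamma_\varepsilon \subset \Lambda$. Then
\[ \left\| Ts - \sum_{\alpha \in \Lambda} T x_\alpha \right\| \le \|T\| \left\| s - \sum_{\alpha \in \Lambda} x_\alpha \right\| < \|T\| \varepsilon, \]
which shows that $\sum_{\alpha \in A} T x_\alpha$ exists and is equal to $Ts$.
2. Suppose that $s = \sum_{\alpha \in A} x_\alpha$ exists and $\varepsilon > 0$. Let $\Gamma_\varepsilon \subset A$ be as in Definition 7.17. Then for finite $\Lambda \subset A \setminus \Gamma_\varepsilon$,
\[ \left\| \sum_{\alpha \in \Lambda} x_\alpha \right\| = \left\| \sum_{\alpha \in \Gamma_\varepsilon \cup \Lambda} x_\alpha - \sum_{\alpha \in \Gamma_\varepsilon} x_\alpha \right\| \le \left\| \sum_{\alpha \in \Gamma_\varepsilon \cup \Lambda} x_\alpha - s \right\| + \left\| \sum_{\alpha \in \Gamma_\varepsilon} x_\alpha - s \right\| < 2\varepsilon. \]
3. If $s = \sum_{\alpha \in A} x_\alpha$ exists in $X$, for each $n \in \mathbb{N}$ there exists a finite subset $\Gamma_n \subset A$ such that $\left\| \sum_{\alpha \in \Lambda} x_\alpha \right\| < \frac{1}{n}$ for all finite $\Lambda \subset A \setminus \Gamma_n$. Without loss of generality we may assume $x_\alpha \ne 0$ for all $\alpha \in \Gamma_n$. Let $\Gamma_\infty := \cup_{n=1}^\infty \Gamma_n$, a countable subset of $A$. Then for any $\beta \notin \Gamma_\infty$, we have $\{\beta\} \cap \Gamma_n = \emptyset$ and therefore
\[ \|x_\beta\| = \left\| \sum_{\alpha \in \{\beta\}} x_\alpha \right\| \le \frac{1}{n} \to 0 \text{ as } n \to \infty. \]
Hence $\Gamma = \Gamma_\infty$ is at most countable. Let $\{\alpha_n\}_{n=1}^\infty$ be an enumeration of $\Gamma$ and define $\gamma_N := \{\alpha_n : 1 \le n \le N\}$. Since for any $M \in \mathbb{N}$, $\gamma_N$ will eventually contain $\Gamma_M$ for $N$ sufficiently large, we have
\[ \limsup_{N\to\infty} \left\| s - \sum_{n=1}^N x_{\alpha_n} \right\| \le \frac{1}{M} \to 0 \text{ as } M \to \infty. \]
Therefore Eq. (7.6) holds.
4. For $n \in \mathbb{N}$, let $\Gamma_n \subset A$ be a finite set such that $\left\| \sum_{\alpha \in \Lambda} x_\alpha \right\| < \frac{1}{n}$ for all finite $\Lambda \subset A \setminus \Gamma_n$. Define $\gamma_n := \cup_{k=1}^n \Gamma_k \subset A$ and $s_n := \sum_{\alpha \in \gamma_n} x_\alpha$. Then for $m > n$,
\[ \|s_m - s_n\| = \left\| \sum_{\alpha \in \gamma_m \setminus \gamma_n} x_\alpha \right\| \le 1/n \to 0 \text{ as } m, n \to \infty. \]
Therefore $\{s_n\}_{n=1}^\infty$ is Cauchy and hence convergent in $X$, because $X$ is a Banach space. Let $s := \lim_{n\to\infty} s_n$. Then for finite $\Lambda \subset A$ such that $\gamma_n \subset \Lambda$, we have
\[ \left\| s - \sum_{\alpha \in \Lambda} x_\alpha \right\| \le \|s - s_n\| + \left\| \sum_{\alpha \in \Lambda \setminus \gamma_n} x_\alpha \right\| \le \|s - s_n\| + \frac{1}{n}. \]
Since the right side of this equation goes to zero as $n \to \infty$, it follows that $\sum_{\alpha \in A} x_\alpha$ exists and is equal to $s$.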

7.4 Inverting Elements in L(X)
Definition 7.19. A linear map $T : X \to Y$ is an isometry if $\|Tx\|_Y = \|x\|_X$ for all $x \in X$. $T$ is said to be invertible if $T$ is a bijection and $T^{-1}$ is bounded.
Notation 7.20 We will write $GL(X, Y)$ for those $T \in L(X, Y)$ which are invertible. If $X = Y$ we simply write $L(X)$ and $GL(X)$ for $L(X, X)$ and $GL(X, X)$ respectively.
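Proposition 7.21. Suppose $X$ is a Banach space and $\Lambda \in L(X) := L(X, X)$ satisfies $\sum_{n=0}^\infty \|\Lambda^n\| < \infty$. Then $I - \Lambda$ is invertible and
\[ (I - \Lambda)^{-1} = \text{“}\frac{1}{I - \Lambda}\text{”} = \sum_{n=0}^\infty \Lambda^n \quad \text{and} \quad \left\| (I - \Lambda)^{-1} \right\| \le \sum_{n=0}^\infty \|\Lambda^n\|. \]
In particular if $\|\Lambda\| < 1$ then the above formula holds and
\[ \left\| (I - \Lambda)^{-1} \right\| \le \frac{1}{1 - \|\Lambda\|}. \]
Proof. Since $L(X)$ is a Banach space and $\sum_{n=0}^\infty \|\Lambda^n\| < \infty$, it follows from Theorem 7.13 that
\[ S := \lim_{N\to\infty} S_N := \lim_{N\to\infty} \sum_{n=0}^N \Lambda^n \]
exists in $L(X)$. Moreover, by Lemma 7.11,
\[ (I - \Lambda) S = (I - \Lambda) \lim_{N\to\infty} S_N = \lim_{N\to\infty} (I - \Lambda) S_N = \lim_{N\to\infty} (I - \Lambda) \sum_{n=0}^N \Lambda^n = \lim_{N\to\infty} (I - \Lambda^{N+1}) = I \]
and similarly $S(I - \Lambda) = I$. This shows that $(I - \Lambda)^{-1}$ exists and is equal to $S$. Moreover, $(I - \Lambda)^{-1}$ is bounded because
\[ \left\| (I - \Lambda)^{-1} \right\| = \|S\| \le \sum_{n=0}^\infty \|\Lambda^n\|. \]
If we further assume $\|\Lambda\| < 1$, then $\|\Lambda^n\| \le \|\Lambda\|^n$ and
\[ \sum_{n=0}^\infty \|\Lambda^n\| \le \sum_{n=0}^\infty \|\Lambda\|^n = \frac{1}{1 - \|\Lambda\|} < \infty. \]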
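The Neumann series of Proposition 7.21 is easy to observe in finite dimensions. The sketch below is only an illustration under stated assumptions: $X = \mathbb{R}^2$ with the $\ell^\infty$ operator norm (maximal row sum), a hypothetical matrix `LAM` with norm less than one, and hand-rolled helpers (`matmul`, `neumann_sum`, and so on) that are not part of the text. It forms a partial sum $S_N = \sum_{n=0}^N \Lambda^n$, checks that $(I - \Lambda) S_N$ is numerically the identity, and compares $\|S_N\|$ with $1/(1 - \|\Lambda\|)$.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def op_norm_inf(A):
    """Operator norm induced by the l^infinity norm: maximal absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def neumann_sum(L, terms):
    """Partial sum I + L + L^2 + ... + L^terms."""
    n = len(L)
    S = identity(n)       # running partial sum, starts at L^0
    power = identity(n)   # current power of L
    for _ in range(terms):
        power = matmul(power, L)
        S = add(S, power)
    return S

# Hypothetical example with ||LAM|| = 0.3 < 1 in the row-sum norm.
LAM = [[0.2, 0.1],
       [0.0, 0.3]]

S = neumann_sum(LAM, 80)
residual = sub(matmul(sub(identity(2), LAM), S), identity(2))
bound = 1.0 / (1.0 - op_norm_inf(LAM))

print(op_norm_inf(residual), op_norm_inf(S), bound)
```

The residual $(I - \Lambda) S_N - I = -\Lambda^{N+1}$ decays geometrically, which is why a modest number of terms already reproduces the inverse to machine precision in this example.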
Corollary 7.22. Let $X$ and $Y$ be Banach spaces. Then $GL(X, Y)$ is an open (possibly empty) subset of $L(X, Y)$. More specifically, if $A \in GL(X, Y)$ and $B \in L(X, Y)$ satisfies
\[ \|B - A\| < \|A^{-1}\|^{-1} \tag{7.7} \]
then $B \in GL(X, Y)$,
\[ B^{-1} = \sum_{n=0}^\infty \left[ I_X - A^{-1}B \right]^n A^{-1} \in L(Y, X), \tag{7.8} \]
\[ \left\| B^{-1} \right\| \le \|A^{-1}\| \frac{1}{1 - \|A^{-1}\| \|A - B\|} \tag{7.9} \]
and
\[ \left\| B^{-1} - A^{-1} \right\| \le \frac{\|A^{-1}\|^2 \|A - B\|}{1 - \|A^{-1}\| \|A - B\|}. \tag{7.10} \]
In particular the map
\[ A \in GL(X, Y) \to A^{-1} \in GL(Y, X) \tag{7.11} \]
is continuous.
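Proof. Let $A$ and $B$ be as above, then
\[ B = A - (A - B) = A\left[ I_X - A^{-1}(A - B) \right] = A(I_X - \Lambda) \]
where $\Lambda : X \to X$ is given by
\[ \Lambda := A^{-1}(A - B) = I_X - A^{-1}B. \]
Now
\[ \|\Lambda\| = \left\| A^{-1}(A - B) \right\| \le \|A^{-1}\| \|A - B\| < \|A^{-1}\| \|A^{-1}\|^{-1} = 1. \]
Therefore $I - \Lambda$ is invertible and hence so is $B$ (being the product of invertible elements) with
\[ B^{-1} = (I_X - \Lambda)^{-1} A^{-1} = \left[ I_X - A^{-1}(A - B) \right]^{-1} A^{-1}. \]
Taking norms of the previous equation gives
\[ \left\| B^{-1} \right\| \le \left\| (I_X - \Lambda)^{-1} \right\| \|A^{-1}\| \le \|A^{-1}\| \frac{1}{1 - \|\Lambda\|} \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\| \|A - B\|}, \]
which is the bound in Eq. (7.9). The bound in Eq. (7.10) holds because
\[ \left\| B^{-1} - A^{-1} \right\| = \left\| B^{-1}(A - B)A^{-1} \right\| \le \left\| B^{-1} \right\| \left\| A^{-1} \right\| \|A - B\| \le \frac{\|A^{-1}\|^2 \|A - B\|}{1 - \|A^{-1}\| \|A - B\|}. \]
For an application of these results to linear ordinary differential equations, see Section 10.3.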
7.5 Exercises
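Exercise 7.5. Let $(X, \|\cdot\|)$ be a normed space over $\mathbb{F}$ ($\mathbb{R}$ or $\mathbb{C}$). Show the map
\[ (\lambda, x, y) \in \mathbb{F} \times X \times X \to x + \lambda y \in X \]
is continuous relative to the norm on $\mathbb{F} \times X \times X$ defined by
\[ \|(\lambda, x, y)\|_{\mathbb{F} \times X \times X} := |\lambda| + \|x\| + \|y\|. \]
(See Exercise 13.25 for more on the metric associated to this norm.) Also show that $\|\cdot\| : X \to [0, \infty)$ is continuous.
Exercise 7.6. Let $X = \mathbb{N}$ and for $p, q \in [1, \infty)$ let $\|\cdot\|_p$ denote the $\ell^p(\mathbb{N})$ norm. Show $\|\cdot\|_p$ and $\|\cdot\|_q$ are inequivalent norms for $p \ne q$ by showing
\[ \sup_{f \ne 0} \frac{\|f\|_p}{\|f\|_q} = \infty \text{ if } p < q. \]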
Exercise 7.7. Suppose that $(X, \|\cdot\|)$ is a normed space and $S \subset X$ is a linear subspace.
1. Show the closure $\bar{S}$ of $S$ is also a linear subspace.
2. Now suppose that $X$ is a Banach space. Show that $S$ with the inherited norm from $X$ is a Banach space iff $S$ is closed.
Exercise 7.8. Folland Problem 5.9. Showing $C^k([0, 1])$ is a Banach space.
Exercise 7.9. Suppose that $X$, $Y$ and $Z$ are Banach spaces and $Q : X \times Y \to Z$ is a bilinear form, i.e. we are assuming $x \in X \to Q(x, y) \in Z$ is linear for each $y \in Y$ and $y \in Y \to Q(x, y) \in Z$ is linear for each $x \in X$. Show $Q$ is continuous relative to the product norm, $\|(x, y)\|_{X \times Y} := \|x\|_X + \|y\|_Y$, on $X \times Y$ iff there is a constant $M < \infty$ such that
\[ \|Q(x, y)\|_Z \le M \|x\|_X \cdot \|y\|_Y \text{ for all } (x, y) \in X \times Y. \tag{7.12} \]
Then apply this result to prove Lemma 7.11.
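Exercise 7.10. Let $d : C(\mathbb{R}) \times C(\mathbb{R}) \to [0, \infty)$ be defined by
\[ d(f, g) = \sum_{n=1}^\infty 2^{-n} \frac{\|f - g\|_n}{1 + \|f - g\|_n}, \]
where $\|f\|_n := \sup\{|f(x)| : |x| \le n\} = \max\{|f(x)| : |x| \le n\}$.
1. Show that $d$ is a metric on $C(\mathbb{R})$.
2. Show that a sequence $\{f_n\}_{n=1}^\infty \subset C(\mathbb{R})$ converges to $f \in C(\mathbb{R})$ as $n \to \infty$ iff $f_n$ converges to $f$ uniformly on bounded subsets of $\mathbb{R}$.
3. Show that $(C(\mathbb{R}), d)$ is a complete metric space.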
Exercise 7.11. Let $X = C([0, 1], \mathbb{R})$ and for $f \in X$, let
\[ \|f\|_1 := \int_0^1 |f(t)|\, dt. \]
Show that $(X, \|\cdot\|_1)$ is a normed space and show by example that this space is not complete. Hint: For the last assertion find a sequence $\{f_n\}_{n=1}^\infty \subset X$ which is "trying" to converge to the function $f = 1_{[\frac{1}{2}, 1]} \notin X$.
Exercise 7.12. Let $(X, \|\cdot\|_1)$ be the normed space in Exercise 7.11. Compute the closure of $A$ when
1. $A = \{f \in X : f(1/2) = 0\}$.
2. $A = \left\{ f \in X : \sup_{t \in [0,1]} f(t) \le 5 \right\}$.
3. $A = \left\{ f \in X : \int_0^{1/2} f(t)\, dt = 0 \right\}$.
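Exercise 7.13. Suppose $\{x_\alpha \in X : \alpha \in A\}$ is a given collection of vectors in a Banach space $X$. Show $\sum_{\alpha \in A} x_\alpha$ exists in $X$ and
\[ \left\| \sum_{\alpha \in A} x_\alpha \right\| \le \sum_{\alpha \in A} \|x_\alpha\| \]
if $\sum_{\alpha \in A} \|x_\alpha\| < \infty$. That is to say "absolute convergence" implies convergence in a Banach space.
Exercise 7.14. Suppose $X$ is a Banach space and $\{f_n : n \in \mathbb{N}\}$ is a sequence in $X$ such that $\lim_{n\to\infty} f_n = f \in X$. Show $s_N := \frac{1}{N} \sum_{n=1}^N f_n$ for $N \in \mathbb{N}$ is still a convergent sequence and
\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^N f_n = \lim_{N\to\infty} s_N = f. \]
Exercise 7.15 (Dominated Convergence Theorem Again). Let $X$ be a Banach space, $A$ be a set and suppose $f_n : A \to X$ is a sequence of functions such that $f(\alpha) := \lim_{n\to\infty} f_n(\alpha)$ exists for all $\alpha \in A$. Further assume there exists a summable function $g : A \to [0, \infty)$ such that $\|f_n(\alpha)\| \le g(\alpha)$ for all $\alpha \in A$. Show $\sum_{\alpha \in A} f(\alpha)$ exists in $X$ and
\[ \lim_{n\to\infty} \sum_{\alpha \in A} f_n(\alpha) = \sum_{\alpha \in A} f(\alpha). \]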
8 Hilbert Space Basics
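Definition 8.1. Let $H$ be a complex vector space. An inner product on $H$ is a function, $\langle \cdot | \cdot \rangle : H \times H \to \mathbb{C}$, such that
1. $\langle ax + by | z \rangle = a\langle x | z \rangle + b\langle y | z \rangle$, i.e. $x \to \langle x | z \rangle$ is linear.
2. $\overline{\langle x | y \rangle} = \langle y | x \rangle$.
3. $\|x\|^2 := \langle x | x \rangle \ge 0$ with equality $\|x\|^2 = 0$ iff $x = 0$.
Notice that combining properties (1) and (2) shows that $x \to \langle z | x \rangle$ is conjugate linear for fixed $z \in H$, i.e.
\[ \langle z | ax + by \rangle = \bar{a}\langle z | x \rangle + \bar{b}\langle z | y \rangle. \]
The following identity will be used frequently in the sequel without further mention,
\[ \|x + y\|^2 = \langle x + y | x + y \rangle = \|x\|^2 + \|y\|^2 + \langle x | y \rangle + \langle y | x \rangle = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}\langle x | y \rangle. \tag{8.1} \]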
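Theorem 8.2 (Schwarz Inequality). Let $(H, \langle \cdot | \cdot \rangle)$ be an inner product space, then for all $x, y \in H$
\[ |\langle x | y \rangle| \le \|x\| \|y\| \]
and equality holds iff $x$ and $y$ are linearly dependent.
Proof. If $y = 0$, the result holds trivially. So assume that $y \ne 0$ and observe; if $x = \alpha y$ for some $\alpha \in \mathbb{C}$, then $\langle x | y \rangle = \alpha \|y\|^2$ and hence
\[ |\langle x | y \rangle| = |\alpha| \|y\|^2 = \|x\| \|y\|. \]
Now suppose that $x \in H$ is arbitrary, let $z := x - \|y\|^{-2}\langle x | y \rangle y$. (So $z$ is $x$ minus its orthogonal projection onto $y$, i.e. the component of $x$ orthogonal to $y$, see Figure 8.1.) Then
Fig. 8.1. The picture behind the proof of the Schwarz inequality.
\[ 0 \le \|z\|^2 = \left\| x - \frac{\langle x | y \rangle}{\|y\|^2} y \right\|^2 = \|x\|^2 + \frac{|\langle x | y \rangle|^2}{\|y\|^4}\|y\|^2 - 2\operatorname{Re}\left\langle x \,\Big|\, \frac{\langle x | y \rangle}{\|y\|^2} y \right\rangle = \|x\|^2 - \frac{|\langle x | y \rangle|^2}{\|y\|^2}, \]
from which it follows that $0 \le \|y\|^2 \|x\|^2 - |\langle x | y \rangle|^2$ with equality iff $z = 0$ or equivalently iff $x = \|y\|^{-2}\langle x | y \rangle y$.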
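The computation in the proof of the Schwarz inequality can be replayed numerically in $\mathbb{C}^2$. This is a small sketch under assumptions: the vectors `x` and `y` are arbitrary illustrative choices and `inner` implements the convention of Definition 8.1 (linear in the first slot, conjugate linear in the second). It checks that $z = x - \|y\|^{-2}\langle x | y \rangle y$ is orthogonal to $y$ and that $\|z\|^2 = \|x\|^2 - |\langle x | y \rangle|^2 / \|y\|^2$.

```python
def inner(x, y):
    """<x|y> on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

# Hypothetical sample vectors in C^2.
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

# z is the part of x left over after removing its component along y.
coeff = inner(x, y) / norm(y) ** 2
z = [a - coeff * b for a, b in zip(x, y)]

lhs = norm(z) ** 2
rhs = norm(x) ** 2 - abs(inner(x, y)) ** 2 / norm(y) ** 2

print(abs(inner(x, y)), norm(x) * norm(y))
print(abs(inner(z, y)), lhs, rhs)
```

Since $\|z\|^2 \ge 0$, the printed identity is exactly what forces $|\langle x | y \rangle| \le \|x\| \|y\|$.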
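Corollary 8.3. Let $(H, \langle \cdot | \cdot \rangle)$ be an inner product space and $\|x\| := \sqrt{\langle x | x \rangle}$. Then the Hilbertian norm, $\|\cdot\|$, is a norm on $H$. Moreover $\langle \cdot | \cdot \rangle$ is continuous on $H \times H$, where $H$ is viewed as the normed space $(H, \|\cdot\|)$.
Proof. If $x, y \in H$, then, using Schwarz's inequality,
\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}\langle x | y \rangle \le \|x\|^2 + \|y\|^2 + 2\|x\|\|y\| = (\|x\| + \|y\|)^2. \]
Taking the square root of this inequality shows $\|\cdot\|$ satisfies the triangle inequality.
Checking that $\|\cdot\|$ satisfies the remaining axioms of a norm is now routine and will be left to the reader. If $x, x', y, y' \in H$, then
\[ \begin{aligned} |\langle x | y \rangle - \langle x' | y' \rangle| &= |\langle x - x' | y \rangle + \langle x' | y - y' \rangle| \\ &\le \|y\|\|x - x'\| + \|x'\|\|y - y'\| \\ &\le \|y\|\|x - x'\| + (\|x\| + \|x - x'\|)\|y - y'\| \\ &= \|y\|\|x - x'\| + \|x\|\|y - y'\| + \|x - x'\|\|y - y'\|, \end{aligned} \]
from which it follows that $\langle \cdot | \cdot \rangle$ is continuous.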
Definition 8.4. Let $(H, \langle \cdot | \cdot \rangle)$ be an inner product space, we say $x, y \in H$ are orthogonal and write $x \perp y$ iff $\langle x | y \rangle = 0$. More generally if $A \subset H$ is a set, $x \in H$ is orthogonal to $A$ (write $x \perp A$) iff $\langle x | y \rangle = 0$ for all $y \in A$. Let $A^\perp = \{x \in H : x \perp A\}$ be the set of vectors orthogonal to $A$. A subset $S \subset H$ is an orthogonal set if $x \perp y$ for all distinct elements $x, y \in S$. If $S$ further satisfies $\|x\| = 1$ for all $x \in S$, then $S$ is said to be an orthonormal set.
Proposition 8.5. Let $(H, \langle \cdot | \cdot \rangle)$ be an inner product space then
1. (Parallelogram Law)
\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2 \tag{8.2} \]
for all $x, y \in H$.
2. (Pythagorean Theorem) If $S \subset H$ is a finite orthogonal set, then
\[ \left\| \sum_{x \in S} x \right\|^2 = \sum_{x \in S} \|x\|^2. \tag{8.3} \]
3. If $A \subset H$ is a set, then $A^\perp$ is a closed linear subspace of $H$.
Remark 8.6. See Proposition 8.48 for the "converse" of the parallelogram law.
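Proof. I will assume that $H$ is a complex Hilbert space, the real case being easier. Items 1. and 2. are proved by the following elementary computations;
\[ \|x + y\|^2 + \|x - y\|^2 = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}\langle x | y \rangle + \|x\|^2 + \|y\|^2 - 2\operatorname{Re}\langle x | y \rangle = 2\|x\|^2 + 2\|y\|^2, \]
and
\[ \left\| \sum_{x \in S} x \right\|^2 = \left\langle \sum_{x \in S} x \,\Big|\, \sum_{y \in S} y \right\rangle = \sum_{x, y \in S} \langle x | y \rangle = \sum_{x \in S} \langle x | x \rangle = \sum_{x \in S} \|x\|^2. \]
Item 3. is a consequence of the continuity of $\langle \cdot | \cdot \rangle$ and the fact that
\[ A^\perp = \cap_{x \in A} \operatorname{Nul}(\langle \cdot | x \rangle) \]
where $\operatorname{Nul}(\langle \cdot | x \rangle) = \{y \in H : \langle y | x \rangle = 0\}$ is a closed subspace of $H$.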
Definition 8.7. A Hilbert space is an inner product space $(H, \langle \cdot | \cdot \rangle)$ such that the induced Hilbertian norm is complete.
Example 8.8. Suppose $X$ is a set and $\mu : X \to (0, \infty)$, then $H := \ell^2(\mu)$ is a Hilbert space when equipped with the inner product,
\[ \langle f | g \rangle := \sum_{x \in X} f(x) \overline{g(x)} \mu(x). \]
In Exercise 8.7 you will show every Hilbert space $H$ is "equivalent" to a Hilbert space of this form with $\mu \equiv 1$.
More examples of Hilbert spaces will be given later after we develop the Lebesgue integral, see Example 23.1 below.
Definition 8.9. A subset $C$ of a vector space $X$ is said to be convex if for all $x, y \in C$ the line segment $[x, y] := \{tx + (1 - t)y : 0 \le t \le 1\}$ joining $x$ to $y$ is contained in $C$ as well. (Notice that any vector subspace of $X$ is convex.)
Theorem 8.10 (Best Approximation Theorem). Suppose that $H$ is a Hilbert space and $M \subset H$ is a closed convex subset of $H$. Then for any $x \in H$ there exists a unique $y \in M$ such that
\[ \|x - y\| = d(x, M) = \inf_{z \in M} \|x - z\|. \]
Moreover, if $M$ is a vector subspace of $H$, then the point $y$ may also be characterized as the unique point in $M$ such that $(x - y) \perp M$.
Proof. Uniqueness. By replacing $M$ by $M - x := \{m - x : m \in M\}$ we may assume $x = 0$. Let $\delta := d(0, M) = \inf_{m \in M} \|m\|$ and $y, z \in M$, see Figure 8.2.
Fig. 8.2. The geometry of convex sets.
By the parallelogram law and the convexity of $M$,
\[ 2\|y\|^2 + 2\|z\|^2 = \|y + z\|^2 + \|y - z\|^2 = 4\left\| \frac{y + z}{2} \right\|^2 + \|y - z\|^2 \ge 4\delta^2 + \|y - z\|^2. \tag{8.4} \]
Hence if $\|y\| = \|z\| = \delta$, then $2\delta^2 + 2\delta^2 \ge 4\delta^2 + \|y - z\|^2$, so that $\|y - z\|^2 = 0$. Therefore, if a minimizer for $d(0, \cdot)|_M$ exists, it is unique.
Existence. Let $y_n \in M$ be chosen such that $\|y_n\| = \delta_n \to \delta \equiv d(0, M)$. Taking $y = y_m$ and $z = y_n$ in Eq. (8.4) shows
\[ 2\delta_m^2 + 2\delta_n^2 \ge 4\delta^2 + \|y_n - y_m\|^2. \]
Passing to the limit $m, n \to \infty$ in this equation implies,
\[ 2\delta^2 + 2\delta^2 \ge 4\delta^2 + \limsup_{m,n\to\infty} \|y_n - y_m\|^2, \]
i.e. $\limsup_{m,n\to\infty} \|y_n - y_m\|^2 = 0$. Therefore, by completeness of $H$, $\{y_n\}_{n=1}^\infty$ is convergent. Because $M$ is closed, $y := \lim_{n\to\infty} y_n \in M$ and because the norm is continuous,
\[ \|y\| = \lim_{n\to\infty} \|y_n\| = \delta = d(0, M). \]
So $y$ is the desired point in $M$ which is closest to 0.
Now suppose $M$ is a closed subspace of $H$ and $x \in H$. Let $y \in M$ be the closest point in $M$ to $x$. Then for $w \in M$, the function
\[ g(t) := \|x - (y + tw)\|^2 = \|x - y\|^2 - 2t\operatorname{Re}\langle x - y | w \rangle + t^2\|w\|^2 \]
has a minimum at $t = 0$ and therefore $0 = g'(0) = -2\operatorname{Re}\langle x - y | w \rangle$. Since $w \in M$ is arbitrary, this implies that $(x - y) \perp M$.
Finally suppose $y \in M$ is any point such that $(x - y) \perp M$. Then for $z \in M$, by the Pythagorean theorem,
\[ \|x - z\|^2 = \|x - y + y - z\|^2 = \|x - y\|^2 + \|y - z\|^2 \ge \|x - y\|^2, \]
which shows $d(x, M)^2 \ge \|x - y\|^2$. That is to say $y$ is the point in $M$ closest to $x$.
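For a concrete closed convex set the best approximation can be written down and tested directly. The sketch below makes several assumptions that are not in the text: $H = \mathbb{R}^3$ with the Euclidean norm, $M = [0, 1]^3$ (a closed convex box), and the standard fact that the nearest point of a box is obtained by clamping each coordinate. It compares the distance from $x$ to the clamped point $y$ with the distance from $x$ to many sampled points of $M$.

```python
import random

def project_onto_box(x, lo=0.0, hi=1.0):
    """Coordinatewise clamp: the nearest point of the box [lo, hi]^n to x."""
    return [min(max(v, lo), hi) for v in x]

def dist(x, y):
    """Euclidean distance on R^n."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

# Hypothetical point outside the box M = [0, 1]^3.
x = [1.7, -0.4, 0.25]
y = project_onto_box(x)

# Sample points of M with a fixed seed so the check is reproducible.
rng = random.Random(0)
samples = [[rng.random() for _ in x] for _ in range(1000)]
closest_sampled = min(dist(x, z) for z in samples)

print(y, dist(x, y), closest_sampled)
```

No sampled point of $M$ beats $y$, in agreement with the existence and uniqueness of the minimizer in Theorem 8.10; a sampling test of this kind is only evidence, not a proof.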
Definition 8.11. Suppose that $A : H \to H$ is a bounded operator. The adjoint of $A$, denoted $A^*$, is the unique operator $A^* : H \to H$ such that $\langle Ax | y \rangle = \langle x | A^* y \rangle$. (The proof that $A^*$ exists and is unique will be given in Proposition 8.16 below.) A bounded operator $A : H \to H$ is self-adjoint or Hermitian if $A = A^*$.
Definition 8.12. Let $H$ be a Hilbert space and $M \subset H$ be a closed subspace. The orthogonal projection of $H$ onto $M$ is the function $P_M : H \to H$ such that for $x \in H$, $P_M(x)$ is the unique element in $M$ such that $(x - P_M(x)) \perp M$.
Theorem 8.13 (Projection Theorem). Let $H$ be a Hilbert space and $M \subset H$ be a closed subspace. The orthogonal projection $P_M$ satisfies:
1. $P_M$ is linear and hence we will write $P_M x$ rather than $P_M(x)$.
2. $P_M^2 = P_M$ ($P_M$ is a projection).
3. $P_M^* = P_M$ ($P_M$ is self-adjoint).
4. $\operatorname{Ran}(P_M) = M$ and $\operatorname{Nul}(P_M) = M^\perp$.
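Proof.
1. Let $x_1, x_2 \in H$ and $\alpha \in \mathbb{F}$, then $P_M x_1 + \alpha P_M x_2 \in M$ and
\[ P_M x_1 + \alpha P_M x_2 - (x_1 + \alpha x_2) = [P_M x_1 - x_1 + \alpha(P_M x_2 - x_2)] \in M^\perp, \]
showing $P_M x_1 + \alpha P_M x_2 = P_M(x_1 + \alpha x_2)$, i.e. $P_M$ is linear.
2. Obviously $\operatorname{Ran}(P_M) = M$ and $P_M x = x$ for all $x \in M$. Therefore $P_M^2 = P_M$.
3. Let $x, y \in H$, then since $(x - P_M x)$ and $(y - P_M y)$ are in $M^\perp$,
\[ \langle P_M x | y \rangle = \langle P_M x | P_M y + y - P_M y \rangle = \langle P_M x | P_M y \rangle = \langle P_M x + (x - P_M x) | P_M y \rangle = \langle x | P_M y \rangle. \]
4. We have already seen $\operatorname{Ran}(P_M) = M$, and $P_M x = 0$ iff $x = x - 0 \in M^\perp$, i.e. $\operatorname{Nul}(P_M) = M^\perp$.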
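Corollary 8.14. If $M \subset H$ is a proper closed subspace of a Hilbert space $H$, then $H = M \oplus M^\perp$.
Proof. Given $x \in H$, let $y = P_M x$ so that $x - y \in M^\perp$. Then $x = y + (x - y) \in M + M^\perp$. If $x \in M \cap M^\perp$, then $x \perp x$, i.e. $\|x\|^2 = \langle x | x \rangle = 0$. So $M \cap M^\perp = \{0\}$.
Exercise 8.1. Suppose $M$ is a subset of $H$, then $M^{\perp\perp} = \overline{\operatorname{span}(M)}$.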
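Theorem 8.15 (Riesz Theorem). Let $H^*$ be the dual space of $H$ (Notation 7.9). The map
\[ z \in H \xrightarrow{\ j\ } \langle \cdot | z \rangle \in H^* \tag{8.5} \]
is a conjugate linear$^1$ isometric isomorphism.
Proof. The map $j$ is conjugate linear by the axioms of the inner products. Moreover, for $x, z \in H$,
\[ |\langle x | z \rangle| \le \|x\| \|z\| \text{ for all } x \in H \]
with equality when $x = z$. This implies that $\|jz\|_{H^*} = \|\langle \cdot | z \rangle\|_{H^*} = \|z\|$. Therefore $j$ is isometric and this implies $j$ is injective. To finish the proof we must show that $j$ is surjective. So let $f \in H^*$ which we assume, without loss of generality, is non-zero. Then $M = \operatorname{Nul}(f)$ is a closed proper subspace of $H$. Since, by Corollary 8.14, $H = M \oplus M^\perp$, $f : H/M \cong M^\perp \to \mathbb{F}$ is a linear isomorphism. This shows that $\dim(M^\perp) = 1$ and hence $H = M \oplus \mathbb{F}x_0$ where $x_0 \in M^\perp \setminus \{0\}$.$^2$ Choose $z = \lambda x_0 \in M^\perp$ such that $f(x_0) = \langle x_0 | z \rangle$, i.e. $\lambda = \overline{f(x_0)}/\|x_0\|^2$. Then for $x = m + \alpha x_0$ with $m \in M$ and $\alpha \in \mathbb{F}$,
\[ f(x) = \alpha f(x_0) = \alpha \langle x_0 | z \rangle = \langle \alpha x_0 | z \rangle = \langle m + \alpha x_0 | z \rangle = \langle x | z \rangle, \]
which shows that $f = jz$.
$^1$ Recall that $j$ is conjugate linear if
\[ j(z_1 + \alpha z_2) = jz_1 + \bar{\alpha}\, jz_2 \]
for all $z_1, z_2 \in H$ and $\alpha \in \mathbb{C}$.
$^2$ Alternatively, choose $x_0 \in M^\perp \setminus \{0\}$ such that $f(x_0) = 1$. For $x \in M^\perp$ we have $f(x - \lambda x_0) = 0$ provided that $\lambda := f(x)$. Therefore $x - \lambda x_0 \in M \cap M^\perp = \{0\}$, i.e. $x = \lambda x_0$. This again shows that $M^\perp$ is spanned by $x_0$.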
Proposition 8.16 (Adjoints). Let $H$ and $K$ be Hilbert spaces and $A : H \to K$ be a bounded operator. Then there exists a unique bounded operator $A^* : K \to H$ such that
\[ \langle Ax | y \rangle_K = \langle x | A^* y \rangle_H \text{ for all } x \in H \text{ and } y \in K. \tag{8.6} \]
Moreover, for all $A, B \in L(H, K)$ and $\lambda \in \mathbb{C}$,
1. $(A + \lambda B)^* = A^* + \bar{\lambda} B^*$,
2. $A^{**} := (A^*)^* = A$,
3. $\|A^*\| = \|A\|$ and
4. $\|A^* A\| = \|A\|^2$.
5. If $K = H$, then $(AB)^* = B^* A^*$. In particular $A \in L(H)$ has a bounded inverse iff $A^*$ has a bounded inverse and $(A^*)^{-1} = (A^{-1})^*$.
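Proof. For each $y \in K$, the map $x \to \langle Ax | y \rangle_K$ is in $H^*$ and therefore there exists, by Theorem 8.15, a unique vector $z \in H$ (we will denote this $z$ by $A^*(y)$) such that
\[ \langle Ax | y \rangle_K = \langle x | z \rangle_H \text{ for all } x \in H. \]
This shows there is a unique map $A^* : K \to H$ such that $\langle Ax | y \rangle_K = \langle x | A^*(y) \rangle_H$ for all $x \in H$ and $y \in K$.
To see $A^*$ is linear, let $y_1, y_2 \in K$ and $\lambda \in \mathbb{C}$, then for any $x \in H$,
\[ \langle Ax | y_1 + \lambda y_2 \rangle_K = \langle Ax | y_1 \rangle_K + \bar{\lambda}\langle Ax | y_2 \rangle_K = \langle x | A^*(y_1) \rangle_H + \bar{\lambda}\langle x | A^*(y_2) \rangle_H = \langle x | A^*(y_1) + \lambda A^*(y_2) \rangle_H \]
and by the uniqueness of $A^*(y_1 + \lambda y_2)$ we find
\[ A^*(y_1 + \lambda y_2) = A^*(y_1) + \lambda A^*(y_2). \]
This shows $A^*$ is linear and so we will now write $A^* y$ instead of $A^*(y)$. Since
\[ \langle A^* y | x \rangle_H = \overline{\langle x | A^* y \rangle_H} = \overline{\langle Ax | y \rangle_K} = \langle y | Ax \rangle_K, \]
it follows that $A^{**} = A$. The assertion that $(A + \lambda B)^* = A^* + \bar{\lambda} B^*$ is Exercise 8.2.
Items 3. and 4. Making use of Schwarz's inequality (Theorem 8.2), we have
\[ \begin{aligned} \|A^*\| &= \sup_{k \in K : \|k\| = 1} \|A^* k\| = \sup_{k \in K : \|k\| = 1}\ \sup_{h \in H : \|h\| = 1} |\langle A^* k | h \rangle| \\ &= \sup_{h \in H : \|h\| = 1}\ \sup_{k \in K : \|k\| = 1} |\langle k | Ah \rangle| = \sup_{h \in H : \|h\| = 1} \|Ah\| = \|A\|, \end{aligned} \]
so that $\|A^*\| = \|A\|$. Since
\[ \|A^* A\| \le \|A^*\| \|A\| = \|A\|^2 \]
and
\[ \|A\|^2 = \sup_{h \in H : \|h\| = 1} \|Ah\|^2 = \sup_{h \in H : \|h\| = 1} |\langle Ah | Ah \rangle| = \sup_{h \in H : \|h\| = 1} |\langle h | A^* A h \rangle| \le \sup_{h \in H : \|h\| = 1} \|A^* A h\| = \|A^* A\|, \tag{8.7} \]
we also have $\|A^* A\| \le \|A\|^2 \le \|A^* A\|$, which shows $\|A\|^2 = \|A^* A\|$.
Alternatively, from Eq. (8.7),
\[ \|A\|^2 \le \|A^* A\| \le \|A\| \|A^*\|, \tag{8.8} \]
which then implies $\|A\| \le \|A^*\|$. Replacing $A$ by $A^*$ in this last inequality shows $\|A^*\| \le \|A\|$ and hence that $\|A^*\| = \|A\|$. Using this identity back in Eq. (8.8) proves $\|A\|^2 = \|A^* A\|$.
Now suppose that $K = H$. Then
\[ \langle ABh | k \rangle = \langle Bh | A^* k \rangle = \langle h | B^* A^* k \rangle, \]
which shows $(AB)^* = B^* A^*$. If $A^{-1}$ exists then
\[ (A^{-1})^* A^* = (A A^{-1})^* = I^* = I \quad \text{and} \quad A^* (A^{-1})^* = (A^{-1} A)^* = I^* = I. \]
This shows that $A^*$ is invertible and $(A^*)^{-1} = (A^{-1})^*$. Similarly if $A^*$ is invertible then so is $A = A^{**}$.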
Exercise 8.2. Let $H, K, M$ be Hilbert spaces, $A, B \in L(H, K)$, $C \in L(K, M)$ and $\lambda \in \mathbb{C}$. Show $(A + \lambda B)^* = A^* + \bar{\lambda} B^*$ and $(CA)^* = A^* C^* \in L(M, H)$.
Exercise 8.3. Let $H = \mathbb{C}^n$ and $K = \mathbb{C}^m$ equipped with the usual inner products, i.e. $\langle z | w \rangle_H = z \cdot \bar{w}$ for $z, w \in H$. Let $A$ be an $m \times n$ matrix thought of as a linear operator from $H$ to $K$. Show the matrix associated to $A^* : K \to H$ is the conjugate transpose of $A$.
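The claim of Exercise 8.3 can be checked numerically before it is proved. The following sketch assumes a hypothetical $2 \times 3$ complex matrix `A` and test vectors `x`, `y`; the helper names are illustrative. It verifies the defining relation $\langle Ax | y \rangle_K = \langle x | A^* y \rangle_H$ of Eq. (8.6) with $A^*$ taken to be the conjugate transpose.

```python
def inner(x, y):
    """Usual inner product on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product, with A stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def conj_transpose(A):
    """Conjugate transpose of an m x n matrix, returned as n x m."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Hypothetical operator from H = C^3 to K = C^2 and sample vectors.
A = [[1 + 1j, 2 + 0j, -1j],
     [0j, 3 - 2j, 4 + 0j]]
x = [1j, 2 + 0j, 1 - 1j]   # element of H
y = [2 + 1j, -1 + 0j]      # element of K

left = inner(apply(A, x), y)                   # <Ax|y>_K
right = inner(x, apply(conj_transpose(A), y))  # <x|A*y>_H

print(left, right)
```

Agreement for one pair of vectors is of course not a proof, but expanding both sides as double sums over the entries $A_{ij}$ gives the general argument.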
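Lemma 8.17. Suppose $A : H \to K$ is a bounded operator, then:
1. $\operatorname{Nul}(A^*) = \operatorname{Ran}(A)^\perp$.
2. $\overline{\operatorname{Ran}(A)} = \operatorname{Nul}(A^*)^\perp$.
3. If $K = H$ and $V \subset H$ is an $A$-invariant subspace (i.e. $A(V) \subset V$), then $V^\perp$ is $A^*$-invariant.
Proof. An element $y \in K$ is in $\operatorname{Nul}(A^*)$ iff $0 = \langle A^* y | x \rangle = \langle y | Ax \rangle$ for all $x \in H$, which happens iff $y \in \operatorname{Ran}(A)^\perp$. By Exercise 8.1, $\overline{\operatorname{Ran}(A)} = \operatorname{Ran}(A)^{\perp\perp}$, and so by the first item, $\overline{\operatorname{Ran}(A)} = \operatorname{Nul}(A^*)^\perp$. Now suppose $A(V) \subset V$ and $y \in V^\perp$, then
\[ \langle A^* y | x \rangle = \langle y | Ax \rangle = 0 \text{ for all } x \in V, \]
which shows $A^* y \in V^\perp$.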
8.1 Hilbert Space Basis
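Proposition 8.18 (Bessel's Inequality). Let $T$ be an orthonormal set, then for any $x \in H$,
\[ \sum_{v \in T} |\langle x | v \rangle|^2 \le \|x\|^2. \tag{8.9} \]
In particular the set $T_x := \{v \in T : \langle x | v \rangle \ne 0\}$ is at most countable for all $x \in H$.
Proof. Let $\Gamma \subset T$ be any finite set. Then
\[ 0 \le \left\| x - \sum_{v \in \Gamma} \langle x | v \rangle v \right\|^2 = \|x\|^2 - 2\operatorname{Re} \sum_{v \in \Gamma} \langle x | v \rangle \langle v | x \rangle + \sum_{v \in \Gamma} |\langle x | v \rangle|^2 = \|x\|^2 - \sum_{v \in \Gamma} |\langle x | v \rangle|^2, \]
showing that $\sum_{v \in \Gamma} |\langle x | v \rangle|^2 \le \|x\|^2$. Taking the supremum of this inequality over finite $\Gamma \subset T$ then proves Eq. (8.9).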
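Proposition 8.19. Suppose $T \subset H$ is an orthogonal set. Then $s = \sum_{v \in T} v$ exists in $H$ (see Definition 7.17) iff $\sum_{v \in T} \|v\|^2 < \infty$. (In particular $T$ must be at most a countable set.) Moreover, if $\sum_{v \in T} \|v\|^2 < \infty$, then
1. $\|s\|^2 = \sum_{v \in T} \|v\|^2$ and
2. $\langle s | x \rangle = \sum_{v \in T} \langle v | x \rangle$ for all $x \in H$.
Similarly if $\{v_n\}_{n=1}^\infty$ is an orthogonal set, then $s = \sum_{n=1}^\infty v_n$ exists in $H$ iff $\sum_{n=1}^\infty \|v_n\|^2 < \infty$. In particular if $\sum_{n=1}^\infty v_n$ exists, then it is independent of rearrangements of $\{v_n\}_{n=1}^\infty$.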
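Proof. Suppose $s = \sum_{v \in T} v$ exists. Then there exists a finite set $\Gamma \subset T$ such that
\[ \sum_{v \in \Lambda} \|v\|^2 = \left\| \sum_{v \in \Lambda} v \right\|^2 \le 1 \]
for all finite $\Lambda \subset T \setminus \Gamma$, wherein the first equality we have used the Pythagorean theorem. Taking the supremum over such $\Lambda$ shows that $\sum_{v \in T \setminus \Gamma} \|v\|^2 \le 1$ and therefore
\[ \sum_{v \in T} \|v\|^2 \le 1 + \sum_{v \in \Gamma} \|v\|^2 < \infty. \]
Conversely, suppose that $\sum_{v \in T} \|v\|^2 < \infty$. Then for all $\varepsilon > 0$ there exists a finite set $\Gamma_\varepsilon \subset T$ such that if $\Lambda \subset T \setminus \Gamma_\varepsilon$ is finite,
\[ \left\| \sum_{v \in \Lambda} v \right\|^2 = \sum_{v \in \Lambda} \|v\|^2 < \varepsilon^2. \tag{8.10} \]
Hence by Lemma 7.18, $\sum_{v \in T} v$ exists.
For item 1, let $\Gamma_\varepsilon$ be as above and set $s_\varepsilon := \sum_{v \in \Gamma_\varepsilon} v$. Then
\[ |\|s\| - \|s_\varepsilon\|| \le \|s - s_\varepsilon\| \le \varepsilon \]
and by Eq. (8.10),
\[ 0 \le \sum_{v \in T} \|v\|^2 - \|s_\varepsilon\|^2 = \sum_{v \notin \Gamma_\varepsilon} \|v\|^2 \le \varepsilon^2. \]
Letting $\varepsilon \downarrow 0$ we deduce from the previous two equations that $\|s_\varepsilon\| \to \|s\|$ and $\|s_\varepsilon\|^2 \to \sum_{v \in T} \|v\|^2$ as $\varepsilon \downarrow 0$ and therefore $\|s\|^2 = \sum_{v \in T} \|v\|^2$.
Item 2. is a special case of Lemma 7.18. For the final assertion, let $s_N := \sum_{n=1}^N v_n$ and suppose that $\lim_{N\to\infty} s_N = s$ exists in $H$ and in particular $\{s_N\}_{N=1}^\infty$ is Cauchy. So for $N > M$,
\[ \sum_{n=M+1}^N \|v_n\|^2 = \|s_N - s_M\|^2 \to 0 \text{ as } M, N \to \infty, \]
which shows that $\sum_{n=1}^\infty \|v_n\|^2$ is convergent, i.e. $\sum_{n=1}^\infty \|v_n\|^2 < \infty$.
Alternative proof of item 1. We could use the last result to prove Item 1. Indeed, if $\sum_{v \in T} \|v\|^2 < \infty$, then $T$ is countable and so we may write $T = \{v_n\}_{n=1}^\infty$. Then $s = \lim_{N\to\infty} s_N$ with $s_N$ as above. Since the norm, $\|\cdot\|$, is continuous on $H$,
\[ \|s\|^2 = \lim_{N\to\infty} \|s_N\|^2 = \lim_{N\to\infty} \left\| \sum_{n=1}^N v_n \right\|^2 = \lim_{N\to\infty} \sum_{n=1}^N \|v_n\|^2 = \sum_{n=1}^\infty \|v_n\|^2 = \sum_{v \in T} \|v\|^2. \]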
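Corollary 8.20. Suppose $H$ is a Hilbert space, $\beta \subset H$ is an orthonormal set and $M = \overline{\mathrm{span}\,\beta}$. Then
$$P_M x = \sum_{u\in\beta}\langle x|u\rangle u, \tag{8.11}$$
$$\sum_{u\in\beta}|\langle x|u\rangle|^2 = \|P_M x\|^2 \text{ and} \tag{8.12}$$
$$\sum_{u\in\beta}\langle x|u\rangle\langle u|y\rangle = \langle P_M x|y\rangle \tag{8.13}$$
for all $x, y \in H$.
Proof. By Bessel's inequality, $\sum_{u\in\beta}|\langle x|u\rangle|^2 \le \|x\|^2$ for all $x \in H$ and hence by Proposition 8.19, $Px := \sum_{u\in\beta}\langle x|u\rangle u$ exists in $H$ and for all $x, y \in H$,
$$\langle Px|y\rangle = \sum_{u\in\beta}\langle \langle x|u\rangle u \,|\, y\rangle = \sum_{u\in\beta}\langle x|u\rangle\langle u|y\rangle. \tag{8.14}$$
Taking $y \in \beta$ in Eq. (8.14) gives $\langle Px|y\rangle = \langle x|y\rangle$, i.e. that $\langle x - Px|y\rangle = 0$ for all $y \in \beta$. So $(x - Px) \perp \mathrm{span}\,\beta$ and by continuity we also have $(x - Px) \perp M = \overline{\mathrm{span}\,\beta}$. Since $Px$ is also in $M$, it follows from the definition of $P_M$ that $Px = P_M x$, proving Eq. (8.11). Equations (8.12) and (8.13) now follow from (8.14), Proposition 8.19 and the fact that $\langle P_M x|y\rangle = \langle P_M^2 x|y\rangle = \langle P_M x|P_M y\rangle$ for all $x, y \in H$.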
Exercise 8.4. Let $(H, \langle\cdot|\cdot\rangle)$ be a Hilbert space and suppose that $\{P_n\}_{n=1}^\infty$ is a sequence of orthogonal projection operators on $H$ such that $P_n(H) \subset P_{n+1}(H)$ for all $n$. Let $M := \bigcup_{n=1}^\infty P_n(H)$ (a subspace of $H$) and let $P$ denote orthogonal projection onto $\overline{M}$. Show $\lim_{n\to\infty} P_n x = Px$ for all $x \in H$. Hint: first prove the result for $x \in M^\perp$, then for $x \in M$ and then for $x \in \overline{M}$.
Definition 8.21 (Basis). Let $H$ be a Hilbert space. A basis $\beta$ of $H$ is a maximal orthonormal subset $\beta \subset H$.
Proposition 8.22. Every Hilbert space has an orthonormal basis.
Proof. Let $T$ be the collection of all orthonormal subsets of $H$ ordered by inclusion. If $\Phi \subset T$ is linearly ordered then $\bigcup\Phi$ is an upper bound. By Zorn's Lemma (see Theorem B.7) there exists a maximal element $\beta \in T$.
An orthonormal set $\beta \subset H$ is said to be complete if $\beta^\perp = \{0\}$. That is to say if $\langle x|u\rangle = 0$ for all $u \in \beta$ then $x = 0$.
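Lemma 8.23. Let $\beta$ be an orthonormal subset of $H$ then the following are equivalent:
1. $\beta$ is a basis,
2. $\beta$ is complete and
3. $\overline{\mathrm{span}\,\beta} = H$.
Proof. (1. $\iff$ 2.) If $\beta$ is not complete, then there exists a unit vector $x \in \beta^\perp \setminus \{0\}$. The set $\beta \cup \{x\}$ is an orthonormal set properly containing $\beta$, so $\beta$ is not maximal. Conversely, if $\beta$ is not maximal, there exists an orthonormal set $\beta_1 \subset H$ such that $\beta \subsetneq \beta_1$. Then if $x \in \beta_1 \setminus \beta$, we have $\langle x|u\rangle = 0$ for all $u \in \beta$, showing $\beta$ is not complete.
(2. $\iff$ 3.) If $\beta$ is not complete and $x \in \beta^\perp \setminus \{0\}$, then $\overline{\mathrm{span}\,\beta} \subset x^\perp$ which is a proper subspace of $H$. Conversely if $\overline{\mathrm{span}\,\beta}$ is a proper subspace of $H$, then $\beta^\perp = \overline{\mathrm{span}\,\beta}^{\,\perp}$ is a non-trivial subspace by Corollary 8.14 and $\beta$ is not complete.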
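Theorem 8.24. Let $\beta \subset H$ be an orthonormal set. Then the following are equivalent:
1. $\beta$ is complete, i.e. $\beta$ is an orthonormal basis for $H$.
2. $x = \sum_{u\in\beta}\langle x|u\rangle u$ for all $x \in H$.
3. $\langle x|y\rangle = \sum_{u\in\beta}\langle x|u\rangle\langle u|y\rangle$ for all $x, y \in H$.
4. $\|x\|^2 = \sum_{u\in\beta}|\langle x|u\rangle|^2$ for all $x \in H$.
Proof. Let $M = \overline{\mathrm{span}\,\beta}$ and $P = P_M$.
(1) $\Rightarrow$ (2) By Corollary 8.20, $\sum_{u\in\beta}\langle x|u\rangle u = P_M x$. Therefore
$$x - \sum_{u\in\beta}\langle x|u\rangle u = x - P_M x \in M^\perp = \beta^\perp = \{0\}.$$
(2) $\Rightarrow$ (3) is a consequence of Proposition 8.19.
(3) $\Rightarrow$ (4) is obvious, just take $y = x$.
(4) $\Rightarrow$ (1) If $x \in \beta^\perp$, then by 4), $\|x\| = 0$, i.e. $x = 0$. This shows that $\beta$ is complete.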
Suppose $\alpha := \{u_n\}_{n=1}^\infty$ is a collection of vectors in an inner product space $(H, \langle\cdot|\cdot\rangle)$. The standard Gram-Schmidt process produces from $\alpha$ an orthonormal subset, $\beta = \{v_n\}_{n=1}^\infty$, such that every element $u_n \in \alpha$ is a finite linear combination of elements from $\beta$. Recall the procedure is to define $v_n$ inductively by setting
$$\tilde v_{n+1} := u_{n+1} - \sum_{j=1}^{n}\langle u_{n+1}|v_j\rangle v_j = u_{n+1} - P_n u_{n+1}$$
where $P_n$ is orthogonal projection onto $M_n := \mathrm{span}(\{v_k\}_{k=1}^n)$. If $\tilde v_{n+1} = 0$, let $v_{n+1} = 0$, otherwise set $v_{n+1} := \|\tilde v_{n+1}\|^{-1}\tilde v_{n+1}$. Finally re-index the resulting sequence so as to throw out those $v_n$ with $v_n = 0$. The result is an orthonormal subset, $\beta \subset H$, with the desired properties.
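The Gram-Schmidt procedure just described can be sketched in a few lines for finite lists of vectors in $\mathbb{C}^d$. This is a minimal illustration rather than a robust numerical routine: the names `inner` and `gram_schmidt` and the tolerance `tol` (used to decide when a residual counts as zero in floating point) are assumptions made for the example.

```python
import math

def inner(x, y):
    """Inner product <x|y>: linear in x, conjugate linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list with the same span as `vectors`."""
    basis = []
    for u in vectors:
        # v_tilde = u - sum_j <u|v_j> v_j, i.e. u minus its projection
        v_tilde = list(u)
        for v in basis:
            c = inner(u, v)
            v_tilde = [a - c * b for a, b in zip(v_tilde, v)]
        norm = math.sqrt(inner(v_tilde, v_tilde).real)
        if norm > tol:  # throw out (numerically) zero vectors
            basis.append([a / norm for a in v_tilde])
    return basis

u = [[1, 1, 0], [2, 2, 0], [1, 0, 1j]]   # second vector depends on the first
beta = gram_schmidt(u)
print(len(beta))
```

The second input vector is a multiple of the first, so it is discarded and `beta` contains two orthonormal vectors.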
Definition 8.25. A subset, $\Lambda$, of a normed space $X$ is said to be total if $\mathrm{span}(\Lambda)$ is dense in $X$.
Remark 8.26. Suppose that $\{u_n\}_{n=1}^\infty$ is a total subset of $H$. Let $\{v_n\}_{n=1}^\infty$ be the vectors found by performing Gram-Schmidt on the set $\{u_n\}_{n=1}^\infty$. Then $\beta := \{v_n\}_{n=1}^\infty$ is an orthonormal basis for $H$. Indeed, if $h \in H$ is orthogonal to $\beta$ then $h$ is orthogonal to $\{u_n\}_{n=1}^\infty$ and hence also to $\overline{\mathrm{span}\{u_n\}_{n=1}^\infty} = H$. In particular $h$ is orthogonal to itself and so $h = 0$. This generalizes the corresponding results for finite dimensional inner product spaces.
Proposition 8.27. A Hilbert space H is separable (BRUCE: has separable
been dened yet?) i H has a countable orthonormal basis H. Moreover,
if H is separable, all orthonormal bases of H are countable. (See Proposition
4.14 in Conways, A Course in Functional Analysis, for a more general
version of this proposition.)
Proof. Let | H be a countable dense set | = u
n
n=1
. By Gram-
Schmidt process there exists = v
n
n=1
an orthonormal set such that
spanv
n
: n = 1, 2 . . . , N spanu
n
: n = 1, 2 . . . , N. So if x[v
n
) = 0 for
all n then x[u
n
) = 0 for all n. Since | H is dense we may choose w
k
|
such that x = lim
k
w
k
and therefore x[x) = lim
k
x[w
k
) = 0. That is to
say x = 0 and is complete. Conversely if H is a countable orthonormal
basis, then the countable set
| =
_
_
_
u
a
u
u : a
u
+i : #u : a
u
,= 0 <
_
_
_
is dense in H. Finally let = u
n
n=1
be an orthonormal basis and
1
H
be another orthonormal basis. Then the sets
B
n
= v
1
: v[u
n
) , = 0
are countable for each n N and hence B :=
n=1
B
n
is a countable subset
of
1
. Suppose there exists v
1
B, then v[u
n
) = 0 for all n and since
= u
n
n=1
is an orthonormal basis, this implies v = 0 which is impossible
since |v| = 1. Therefore
1
B = and hence
1
= B is countable.
Notation 8.28 If f : X C and g : Y C are two functions, let f g :
X Y C be dened by f g (x, y) := f (x) g (y) .
Proposition 8.29. Suppose X and Y are sets and : X (0, ) and :
Y (0, ) are given weight functions. If
2
() and
2
() are
orthonormal bases, then
:= f g : f and g
is an orthonormal basis for
2
( ) .
Proof. Let f, f
t

2
() and g, g
t

2
() , then by the Tonellis Theorem
4.22 for sums and Holders inequality,
XY
f g f
t
g
t
ff
t
gg
t
|f|
2
()
|f
t
|
2
()
|g|
2
()
|g
t
|
2
()
= 1 < .
So by Fubinis Theorem 4.23 for sums,
f g[f
t
g
t
)
2
()
=
X
f

f
t

Y
g g
t
= f[f
t
)
2
()
g[g
t
)
2
()
=
f,f

g,g
.
Therefore, is an orthonormal subset of
2
( ). So it only remains to
show is complete. We will give two proofs of this fact. Let F
2
().
In the rst proof we will verify item 4. of Theorem 8.24 while in the second
we will verify item 1 of Theorem 8.24.
First Proof. By Tonellis Theorem,
xX
(x)
yY
(y) [F(x, y)[
2
= |F|
2
2
()
<
and since > 0, it follows that
yY
[F(x, y)[
2
(y) < for all x X,
i.e. F(x, )
2
() for all x X. By the completeness of ,
yY
[F(x, y)[
2
(y) = F (x, ) [F (x, ))
2
()
=
F (x, ) [g)
2
()
2
and therefore,
|F|
2
2
()
=
xX
(x)
yY
(y) [F(x, y)[
2
=
xX
F (x, ) [g)
2
()
2
(x) . (8.15)
and in particular, x F (x, ) [g)
2
()
is in
2
() . So by the completeness of
and the Fubini and Tonelli theorems, we nd
F (x, ) [g)
2
()
2
(x) =
xX
F (x, ) [g)
2
()

f (x) (x)
2
=
xX
_
_
yY
F (x, y) g (y) (y)
_
_
f (x) (x)
2
=
(x,y)XY
F (x, y) f g (x, y) (x, y)
2
=
F[f g)
2
()
2
.
Combining this result with Eq. (8.15) shows
|F|
2
2
()
=
f, g
F[f g)
2
()
2
as desired.
Second Proof. Suppose, for all f and g that F[f g) = 0, i.e.
0 = F[f g)
2
()
=
xX
(x)
yY
(y) F(x, y)

f(x) g(y)
=
xX
(x) F(x, )[g)
2
()

f(x). (8.16)
Since
xX
F(x, )[g)
2
()
2
(x)
xX
(x)
yY
[F(x, y)[
2
(y) < , (8.17)
it follows from Eq. (8.16) and the completeness of that F(x, )[g)
2
()
= 0
for all x X. By the completeness of we conclude that F(x, y) = 0 for all
(x, y) X Y.
Definition 8.30. A linear map $U : H \to K$ is an isometry if $\|Ux\|_K = \|x\|_H$ for all $x \in H$ and $U$ is unitary if $U$ is also surjective.
Exercise 8.5. Let $U : H \to K$ be a linear map, show the following are equivalent:
1. $U : H \to K$ is an isometry,
2. $\langle Ux|Ux'\rangle_K = \langle x|x'\rangle_H$ for all $x, x' \in H$, (see Eq. (8.33) below)
3. $U^* U = \mathrm{id}_H$.
Exercise 8.6. Let $U : H \to K$ be a linear map, show the following are equivalent:
1. $U : H \to K$ is unitary
2. $U^* U = \mathrm{id}_H$ and $U U^* = \mathrm{id}_K$.
3. $U$ is invertible and $U^{-1} = U^*$.
Exercise 8.7. Let $H$ be a Hilbert space. Use Theorem 8.24 to show there exists a set $X$ and a unitary map $U : H \to \ell^2(X)$. Moreover, if $H$ is separable and $\dim(H) = \infty$, then $X$ can be taken to be $\mathbb{N}$ so that $H$ is unitarily equivalent to $\ell^2 = \ell^2(\mathbb{N})$.
8.2 Some Spectral Theory
For this section let H and K be two Hilbert spaces over C.
Exercise 8.8. Suppose $A : H \to H$ is a bounded self-adjoint operator. Show:
1. If $\lambda$ is an eigenvalue of $A$, i.e. $Ax = \lambda x$ for some $x \in H \setminus \{0\}$, then $\lambda \in \mathbb{R}$.
2. If $\lambda$ and $\mu$ are two distinct eigenvalues of $A$ with eigenvectors $x$ and $y$ respectively, then $x \perp y$.
Unlike in finite dimensions, it is possible that an operator on a complex Hilbert space may have no eigenvalues, see Example 8.36 and Lemma 8.37 below for a couple of examples. For this reason it is useful to generalize the notion of an eigenvalue as follows.
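Definition 8.31. Suppose $X$ is a Banach space over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$) and $A \in L(X)$. We say $\lambda \in \mathbb{F}$ is in the spectrum of $A$ if $A - \lambda I$ does not have a bounded inverse (see footnote 3). The spectrum will be denoted by $\sigma(A) \subset \mathbb{F}$. The resolvent set for $A$ is $\rho(A) := \mathbb{F} \setminus \sigma(A)$.
Remark 8.32. If $\lambda$ is an eigenvalue of $A$, then $A - \lambda I$ is not injective and hence not invertible. Therefore any eigenvalue of $A$ is in the spectrum of $A$. If $H$ is a Hilbert space and $A \in L(H)$, it follows from item 5. of Proposition 8.16 that $\lambda \in \sigma(A)$ iff $\bar\lambda \in \sigma(A^*)$, i.e.
$$\sigma(A^*) = \{\bar\lambda : \lambda \in \sigma(A)\}.$$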
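Exercise 8.9. Suppose $X$ is a complex Banach space and $A \in GL(X)$. Show
$$\sigma(A^{-1}) = \sigma(A)^{-1} := \{\lambda^{-1} : \lambda \in \sigma(A)\}.$$
If we further assume $A$ is both invertible and isometric, i.e. $\|Ax\| = \|x\|$ for all $x \in X$, then show
$$\sigma(A) \subset S^1 := \{z \in \mathbb{C} : |z| = 1\}.$$
Hint: working formally,
$$(A^{-1} - \lambda^{-1})^{-1} = \frac{1}{\frac{1}{A} - \frac{1}{\lambda}} = \frac{1}{\frac{\lambda - A}{A\lambda}} = \frac{A\lambda}{\lambda - A},$$
from which you might expect that $(A^{-1} - \lambda^{-1})^{-1} = -\lambda A (A - \lambda)^{-1}$ if $\lambda \in \rho(A)$.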
Exercise 8.10. Suppose $X$ is a Banach space and $A \in L(X)$. Use Corollary 7.22 to show $\sigma(A)$ is a closed subset of $\{\lambda \in \mathbb{F} : |\lambda| \le \|A\| := \|A\|_{L(X)}\}$.
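Lemma 8.33. Suppose that $A \in L(H)$ is a normal operator, i.e. $0 = [A, A^*] := AA^* - A^*A$. Then $\lambda \in \sigma(A)$ iff
$$\inf_{\|\psi\|=1}\|(A - \lambda 1)\psi\| = 0. \tag{8.18}$$
In other words, $\lambda \in \sigma(A)$ iff there is an "approximate sequence of eigenvectors" for $(A, \lambda)$, i.e. there exists $\psi_n \in H$ such that $\|\psi_n\| = 1$ and $A\psi_n - \lambda\psi_n \to 0$ as $n \to \infty$.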
Footnote 3. It will follow by the open mapping Theorem 25.19 or the closed graph Theorem 25.22 that the word bounded may be omitted from this definition.
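Proof. By replacing $A$ by $A - \lambda I$ we may assume that $\lambda = 0$. If $0 \notin \sigma(A)$, then
$$\inf_{\|\psi\|=1}\|A\psi\| = \inf_{\psi\ne 0}\frac{\|A\psi\|}{\|\psi\|} = \inf_{\psi\ne 0}\frac{\|\psi\|}{\|A^{-1}\psi\|} = 1/\|A^{-1}\| > 0.$$
Now suppose that $\inf_{\|\psi\|=1}\|A\psi\| = \varepsilon > 0$ or equivalently we have
$$\|A\psi\| \ge \varepsilon\|\psi\|$$
for all $\psi \in H$. Because $A$ is normal,
$$\|A\psi\|^2 = \langle A\psi|A\psi\rangle = \langle A^*A\psi|\psi\rangle = \langle AA^*\psi|\psi\rangle = \langle A^*\psi|A^*\psi\rangle = \|A^*\psi\|^2.$$
Therefore we also have
$$\|A^*\psi\| = \|A\psi\| \ge \varepsilon\|\psi\| \quad \forall\, \psi \in H. \tag{8.19}$$
This shows in particular that $A$ and $A^*$ are injective, $\mathrm{Ran}(A)$ is closed and hence by Lemma 8.17
$$\mathrm{Ran}(A) = \overline{\mathrm{Ran}(A)} = \mathrm{Nul}(A^*)^\perp = \{0\}^\perp = H.$$
Therefore $A$ is algebraically invertible and the inverse is bounded by Eq. (8.19).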
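Lemma 8.34. Suppose that $A \in L(H)$ is self-adjoint (i.e. $A = A^*$) then
$$\sigma(A) \subset \left[-\|A\|_{op}, \|A\|_{op}\right] \subset \mathbb{R}.$$
Proof. Writing $\lambda = \alpha + i\beta$ with $\alpha, \beta \in \mathbb{R}$, then
$$\|(A + \alpha + i\beta)\psi\|^2 = \|(A+\alpha)\psi\|^2 + |\beta|^2\|\psi\|^2 + 2\,\mathrm{Re}((A+\alpha)\psi, i\beta\psi) = \|(A+\alpha)\psi\|^2 + |\beta|^2\|\psi\|^2 \tag{8.20}$$
wherein we have used
$$\mathrm{Re}\left[i\beta((A+\alpha)\psi, \psi)\right] = -\beta\,\mathrm{Im}((A+\alpha)\psi, \psi) = 0$$
since
$$((A+\alpha)\psi, \psi) = (\psi, (A+\alpha)\psi) = \overline{((A+\alpha)\psi, \psi)}.$$
Eq. (8.20) along with Lemma 8.33 shows that $\lambda \notin \sigma(A)$ if $\beta \ne 0$, i.e. $\sigma(A) \subset \mathbb{R}$. The fact that $\sigma(A)$ is now contained in $\left[-\|A\|_{op}, \|A\|_{op}\right]$ is a consequence of Exercise 8.10.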
Remark 8.35. It is not true that $\sigma(A) \subset \mathbb{R}$ implies $A = A^*$. For example let
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$
on $H = \mathbb{C}^2$, then $\sigma(A) = \{0\}$ yet $A \ne A^*$.
Example 8.36. Let S L(H) be a (not necessarily) normal operator. The
proof of Lemma 8.33 gives (S) if Eq. (8.18) holds. However the converse
is not always valid unless S is normal. For example, let S :
2
2
be the shift,
S(
1
,
2
, . . . ) = (0,
1
,
2
, . . . ). Then for any D := z C : [z[ < 1 ,
|(S ) | = |S | [|S| [[ ||[ = (1 [[) ||
and so there does not exists an approximate sequence of eigenvectors for
(S, ) . However, as we will now show, (S) =

D.
To prove this it suces to show by Remark 8.32 and Exercise 8.9 that
D (S
) . For if this is the case then

D (S
)

D and hence (S) =

D
since

D is invariant under complex conjugation.
A simple computation shows,
S
(
1
,
2
, . . . ) = (
2
,
3
, . . . )
and = (
1
,
2
, . . . ) is an eigenvector for S
with eigenvalue C i
0 = (S
I) (
1
,
2
, . . . ) = (
2
1
,
3
2
, . . . ).
Solving these equation shows
2
=
1
,
3
=
2
=
2
1
, . . . ,
n
=
n1
1
, . . . .
Hence if D, we may let
1
= 1 above to nd
S
(1, ,
2
, . . . ) = (1, ,
2
, . . . )
where (1, ,
2
, . . . )
2
. Thus we have shown is an eigenvalue for S
for
all D and hence D (S
).
Lemma 8.37. Let $H = \ell^2(\mathbb{Z})$ and let $A : H \to H$ be defined by
$$Af(k) = i\,(f(k+1) - f(k-1)) \text{ for all } k \in \mathbb{Z}.$$
Then:
1. $A$ is a bounded self-adjoint operator.
2. $A$ has no eigenvalues.
3. $\sigma(A) = [-2, 2]$.
Proof. For another (simpler) proof of this lemma, see Exercise 23.8 below.
1. Since
|Af|
2
|f ( + 1)|
2
+|f ( 1)|
2
= 2 |f|
2
,
|A|
op
2 < . Moreover, for f, g
2
(Z) ,
Af[g) =
k
i (f (k + 1) f (k 1)) g (k)
=
k
if (k) g (k 1)
k
if (k) g (k + 1)
=
k
f (k) Ag (k) = f[Ag),
which shows A = A
.
2. From Lemma 8.34, we know that $\sigma(A) \subset [-2, 2]$. If $\lambda \in [-2, 2]$ and $f \in H$ satisfies $Af = \lambda f$, then
$$f(k+1) = -i\lambda f(k) + f(k-1) \text{ for all } k \in \mathbb{Z}. \tag{8.21}$$
This is a second order difference equation which can be solved analogously to second order ordinary differential equations. The idea is to start by looking for a solution of the form $f(k) = \alpha^k$. Then Eq. (8.21) becomes $\alpha^{k+1} = -i\lambda\alpha^k + \alpha^{k-1}$ or equivalently
$$\alpha^2 + i\lambda\alpha - 1 = 0.$$
So we will have a solution if $\alpha \in \{\alpha_\pm\}$ where
$$\alpha_\pm = \frac{-i\lambda \pm \sqrt{4 - \lambda^2}}{2}.$$
For $|\lambda| \ne 2$, there are two distinct roots and the general solution to Eq. (8.21) is of the form
$$f(k) = c_+\alpha_+^k + c_-\alpha_-^k \tag{8.22}$$
for some constants $c_\pm \in \mathbb{C}$, and for $|\lambda| = 2$, the general solution has the form
$$f(k) = c\,\alpha_+^k + d\,k\,\alpha_+^k. \tag{8.23}$$
Since in all cases, $|\alpha_\pm|^2 = \frac{1}{4}\left(\lambda^2 + 4 - \lambda^2\right) = 1$, it follows that neither of these functions, $f$, will be in $\ell^2(\mathbb{Z})$ unless they are identically zero. This shows that $A$ has no eigenvalues.
3. The above argument suggests a method for constructing approximate eigenfunctions. Namely, let $\lambda \in [-2, 2]$ and define $f_n(k) := 1_{|k|\le n}\,\alpha^k$ where $\alpha = \alpha_+$. Then a simple computation shows
$$\lim_{n\to\infty}\frac{\|(A - \lambda I)f_n\|_2}{\|f_n\|_2} = 0 \tag{8.24}$$
and therefore $\lambda \in \sigma(A)$.
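The two computations in this proof can be explored numerically. The sketch below is an illustration only: the helper names `alpha_plus` and `residual_ratio` are assumptions, and the sample values of $\lambda$ and $n$ are arbitrary. It checks that the root $\alpha_+$ of $\alpha^2 + i\lambda\alpha - 1 = 0$ has modulus one, and evaluates the quotient in Eq. (8.24) for truncated sequences $f_n(k) = 1_{|k|\le n}\alpha_+^k$.

```python
import cmath
import math

def alpha_plus(lam):
    """Root alpha_+ of alpha**2 + 1j*lam*alpha - 1 = 0."""
    return (-1j * lam + cmath.sqrt(4 - lam ** 2)) / 2

def residual_ratio(lam, n):
    """||(A - lam I) f_n||_2 / ||f_n||_2 for f_n(k) = 1_{|k|<=n} alpha_+^k."""
    a = alpha_plus(lam)

    def f(k):
        return a ** k if abs(k) <= n else 0.0

    # (A - lam I) f_n vanishes for |k| > n + 1, so a finite sum suffices
    num = sum(abs(1j * (f(k + 1) - f(k - 1)) - lam * f(k)) ** 2
              for k in range(-n - 1, n + 2))
    den = sum(abs(f(k)) ** 2 for k in range(-n, n + 1))
    return math.sqrt(num / den)

for lam in (-2.0, -1.3, 0.0, 0.7, 2.0):
    a = alpha_plus(lam)
    print(lam, abs(a), abs(a ** 2 + 1j * lam * a - 1))

for n in (1, 10, 100, 1000):
    print(n, residual_ratio(0.7, n))
```

Only the four boundary sites $k = \pm n, \pm(n+1)$ contribute to the numerator, each with modulus one, so for $n \ge 1$ the printed ratios should agree with $2/\sqrt{2n+1}$ and decay to zero, consistent with Eq. (8.24).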
Exercise 8.11. Verify Eq. (8.24). Also show by explicit computations that
lim
n
|(AI) f
n
|
2
|f
n
|
2
,= 0
if / [2, 2] .
The next couple of results will be needed for the next section.
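Theorem 8.38 (Rayleigh quotient). Suppose $T \in L(H) := L(H, H)$ is a bounded self-adjoint operator, then
$$\|T\| = \sup_{f\ne 0}\frac{|\langle f|Tf\rangle|}{\|f\|^2}.$$
Moreover if there exists a non-zero element $f \in H$ such that
$$\frac{|\langle Tf|f\rangle|}{\|f\|^2} = \|T\|,$$
then $f$ is an eigenvector of $T$ with $Tf = \lambda f$ and $\lambda \in \{\pm\|T\|\}$.
Proof. Let
$$M := \sup_{f\ne 0}\frac{|\langle f|Tf\rangle|}{\|f\|^2}.$$
We wish to show $M = \|T\|$. Since
$$|\langle f|Tf\rangle| \le \|f\|\|Tf\| \le \|T\|\|f\|^2,$$
we see $M \le \|T\|$. Conversely let $f, g \in H$ and compute
$$\langle f+g|T(f+g)\rangle - \langle f-g|T(f-g)\rangle = \langle f|Tg\rangle + \langle g|Tf\rangle + \langle f|Tg\rangle + \langle g|Tf\rangle = 2\left[\langle f|Tg\rangle + \langle Tg|f\rangle\right] = 2\left[\langle f|Tg\rangle + \overline{\langle f|Tg\rangle}\right] = 4\,\mathrm{Re}\langle f|Tg\rangle.$$
Therefore, if $\|f\| = \|g\| = 1$, it follows that
$$|\mathrm{Re}\langle f|Tg\rangle| \le \frac{M}{4}\left\{\|f+g\|^2 + \|f-g\|^2\right\} = \frac{M}{4}\left\{2\|f\|^2 + 2\|g\|^2\right\} = M.$$
By replacing $f$ by $e^{i\theta}f$ where $\theta$ is chosen so that $e^{i\theta}\langle f|Tg\rangle$ is real, we find
$$|\langle f|Tg\rangle| \le M \text{ for all } \|f\| = \|g\| = 1.$$
Hence
$$\|T\| = \sup_{\|f\|=\|g\|=1}|\langle f|Tg\rangle| \le M.$$
If $f \in H \setminus \{0\}$ and $\|T\| = |\langle Tf|f\rangle|/\|f\|^2$ then, using Schwarz's inequality,
$$\|T\| = \frac{|\langle Tf|f\rangle|}{\|f\|^2} \le \frac{\|Tf\|}{\|f\|} \le \|T\|. \tag{8.25}$$
This implies $|\langle Tf|f\rangle| = \|Tf\|\|f\|$ and forces equality in Schwarz's inequality. So by Theorem 8.2, $Tf$ and $f$ are linearly dependent, i.e. $Tf = \lambda f$ for some $\lambda \in \mathbb{C}$. Substituting this into (8.25) shows that $|\lambda| = \|T\|$. Since $T$ is self-adjoint,
$$\lambda\|f\|^2 = \langle \lambda f|f\rangle = \langle Tf|f\rangle = \langle f|Tf\rangle = \langle f|\lambda f\rangle = \bar\lambda\langle f|f\rangle = \bar\lambda\|f\|^2,$$
which implies that $\lambda \in \mathbb{R}$ and therefore, $\lambda \in \{\pm\|T\|\}$.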
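A finite dimensional illustration of the Rayleigh quotient may help. The sketch below is a toy example under stated assumptions: the $2\times 2$ real symmetric matrix `T` (with eigenvalues 1 and 3, so operator norm 3) and the grid of real unit vectors are chosen purely for illustration, and `rayleigh` is a hypothetical helper name.

```python
import math

T = [[2.0, 1.0], [1.0, 2.0]]   # real symmetric, eigenvalues 1 and 3

def rayleigh(f):
    """Rayleigh quotient <f|Tf> / ||f||^2 for a real vector f."""
    Tf = [sum(T[i][j] * f[j] for j in range(2)) for i in range(2)]
    return sum(f[i] * Tf[i] for i in range(2)) / sum(x * x for x in f)

# Sample unit vectors (cos t, sin t) and pick the one maximizing |<f|Tf>|
angles = [k * math.pi / 1800 for k in range(1800)]
best = max(angles, key=lambda t: abs(rayleigh([math.cos(t), math.sin(t)])))
f = [math.cos(best), math.sin(best)]

print(rayleigh(f), f)
```

On this grid the maximum is attained at $t = \pi/4$, where the quotient equals the norm 3 and the maximizer is proportional to the eigenvector $(1, 1)$, in line with the second assertion of the theorem.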
8.3 Compact Operators on a Hilbert Space
In this section let $H$ and $B$ be Hilbert spaces and $U := \{x \in H : \|x\| < 1\}$ be the unit ball in $H$. Recall from Definition 14.16 (BRUCE: forward reference. Think about correct placement of this section.) that a bounded operator, $K : H \to B$, is compact iff $\overline{K(U)}$ is compact in $B$. Equivalently, for all bounded sequences $\{x_n\}_{n=1}^\infty \subset H$, the sequence $\{Kx_n\}_{n=1}^\infty$ has a convergent subsequence in $B$. Because of Theorem 14.15, if $\dim(H) = \infty$ and $K : H \to B$ is invertible, then $K$ is not compact.
Definition 8.39. $K : H \to B$ is said to have finite rank if $\mathrm{Ran}(K) \subset B$ is finite dimensional.
The following result is a simple consequence of Corollaries 14.13 and 14.14.
Corollary 8.40. If $K : H \to B$ is a finite rank operator, then $K$ is compact. In particular if either $\dim(H) < \infty$ or $\dim(B) < \infty$ then any bounded operator $K : H \to B$ is finite rank and hence compact.
Lemma 8.41. Let $\mathcal{K} := \mathcal{K}(H, B)$ denote the compact operators from $H$ to $B$. Then $\mathcal{K}(H, B)$ is a norm closed subspace of $L(H, B)$.
Proof. The fact that / is a vector subspace of L(H, B) will be left to the
reader. To nish the proof, we must show that K L(H, B) is compact if
there exists K
n
/(H, B) such that lim
n
|K
n
K|
op
= 0.
First Proof. Given > 0, choose N = N() such that |K
N
K| < .
Using the fact that K
N
U is precompact, choose a nite subset U such
that min
x
|y K
N
x| < for all y K
N
(U) . Then for z = Kx
0
K(U)
and x ,
|z Kx| = |(K K
N
)x
0
+K
N
(x
0
x) + (K
N
K)x|
2 +|K
N
x
0
K
N
x|.
Therefore min
x
|z Kx| < 3, which shows K(U) is 3 bounded for all
> 0, so K(U) is totally bounded and hence precompact.
Second Proof. Suppose x
n
n=1
is a bounded sequence in H. By com-
pactness, there is a subsequence
_
x
1
n
_
n=1
of x
n
n=1
such that
_
K
1
x
1
n
_
n=1
is convergent in B. Working inductively, we may construct subsequences
x
n
n=1

_
x
1
n
_
n=1

_
x
2
n
_
n=1
x
m
n
n=1
. . .
such that K
m
x
m
n
n=1
is convergent in B for each m. By the usual Cantors
diagonalization procedure, let y
n
:= x
n
n
, then y
n
n=1
is a subsequence of
x
n
n=1
such that K
m
y
n
n=1
is convergent for all m. Since
|Ky
n
Ky
l
| |(K K
m
) y
n
| +|K
m
(y
n
y
l
)| +|(K
m
K) y
l
)|
2 |K K
m
| +|K
m
(y
n
y
l
)| ,
lim sup
n,l
|Ky
n
Ky
l
| 2 |K K
m
| 0 as m ,
which shows Ky
n
n=1
is Cauchy and hence convergent.
Proposition 8.42. A bounded operator $K : H \to B$ is compact iff there exist finite rank operators, $K_n : H \to B$, such that $\|K - K_n\| \to 0$ as $n \to \infty$.
Proof. Since K(U) is compact it contains a countable dense subset and
from this it follows that K (H) is a separable subspace of B. Let
n
be an
orthonormal basis for K (H) B and
P
N
y =
N
n=1
y[
n
)
n
be the orthogonal projection of y onto span
n
N
n=1
. Then lim
N
|P
N
y
y| = 0 for all y K(H). Dene K
n
:= P
n
K a nite rank operator on H.
For sake of contradiction suppose that
limsup
n
|K K
n
| = > 0,
in which case there exists x
n
k
U such that |(K K
n
k
)x
n
k
| for all n
k
.
Since K is compact, by passing to a subsequence if necessary, we may assume
Kx
n
k
n
k
=1
is convergent in B. Letting y := lim
k
Kx
n
k
,
|(K K
n
k
)x
n
k
| = |(1 P
n
k
)Kx
n
k
|
|(1 P
n
k
)(Kx
n
k
y)| +|(1 P
n
k
)y|
|Kx
n
k
y| +|(1 P
n
k
)y| 0 as k .
But this contradicts the assumption that is positive and hence we must
have lim
n
|K K
n
| = 0, i.e. K is an operator norm limit of nite rank
operators. The converse direction follows from Corollary 8.40 and Lemma
8.41.
Corollary 8.43. If K is compact then so is K
.
Proof. First Proof. Let K
n
= P
n
K be as in the proof of Proposition
8.42, then K
n
= K
P
n
is still nite rank. Furthermore, using Proposition
8.16,
|K
n
| = |K K
n
| 0 as n
showing K
is a limit of nite rank operators and hence compact.

Second Proof. Let x
n
n=1
be a bounded sequence in B, then
|K
x
n
K
x
m
|
2
= x
n
x
m
[KK
(x
n
x
m
)) 2C |KK
(x
n
x
m
)|
(8.26)
where C is a bound on the norms of the x
n
. Since K
x
n
n=1
is also a bounded
sequence, by the compactness of K there is a subsequence x
t
n
of the x
n
such that KK
x
t
n
is convergent and hence by Eq. (8.26), so is the sequence
K
x
t
n
.
8.3.1 The Spectral Theorem for Self Adjoint Compact Operators
For the rest of this section, $K \in \mathcal{K}(H) := \mathcal{K}(H, H)$ will be a self-adjoint compact operator or S.A.C.O. for short. Because of Proposition 8.42, we might expect compact operators to behave very much like finite dimensional matrices. This is typically the case as we will see below.
Example 8.44 (Model S.A.C.O.). Let $H = \ell^2$ and $K$ be the diagonal matrix
$$K = \begin{pmatrix} \lambda_1 & 0 & 0 & \cdots \\ 0 & \lambda_2 & 0 & \cdots \\ 0 & 0 & \lambda_3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix},$$
where $\lim_{n\to\infty}|\lambda_n| = 0$ and $\lambda_n \in \mathbb{R}$. Then $K$ is a self-adjoint compact operator. This assertion was proved in Example 14.17.
The main theorem (Theorem 8.46) of this subsection states that up to unitary equivalence, Example 8.44 is essentially the most general example of an S.A.C.O.
Proposition 8.45. Let $K$ be a S.A.C.O., then either $\lambda = \|K\|$ or $\lambda = -\|K\|$ is an eigenvalue of $K$.
Proof. Without loss of generality we may assume that K is non-zero since
otherwise the result is trivial. By Theorem 8.38, there exists u
n
H such that
|u
n
| = 1 and
[u
n
[Ku
n
)[
|u
n
|
2
= [u
n
[Ku
n
)[ |K| as n . (8.27)
By passing to a subsequence if necessary, we may assume that :=
lim
n
u
n
[Ku
n
) exists and |K|. By passing to a further subse-
quence if necessary, we may assume, using the compactness of K, that Ku
n
is convergent as well. We now compute:
0 |Ku
n
u
n
|
2
= |Ku
n
|
2
2Ku
n
[u
n
) +
2

2
2Ku
n
[u
n
) +
2
2
2
2
+
2
= 0 as n .
Hence
Ku
n
u
n
0 as n (8.28)
and therefore
u := lim
n
u
n
=
1
lim
n
Ku
n
exists. By the continuity of the inner product, |u| = 1 ,= 0. By passing to the
limit in Eq. (8.28) we nd that Ku = u.
Theorem 8.46 (Compact Operator Spectral Theorem). Suppose that $K : H \to H$ is a non-zero S.A.C.O., then
1. there exists at least one eigenvalue $\lambda \in \{\pm\|K\|\}$.
2. There are at most countably many non-zero eigenvalues, $\{\lambda_n\}_{n=1}^N$, where $N = \infty$ is allowed. (Unless $K$ is finite rank (i.e. $\dim \mathrm{Ran}(K) < \infty$), $N$ will be infinite.)
3. The $\lambda_n$'s (including multiplicities) may be arranged so that $|\lambda_n| \ge |\lambda_{n+1}|$ for all $n$. If $N = \infty$ then $\lim_{n\to\infty}|\lambda_n| = 0$. (In particular any eigenspace for $K$ with non-zero eigenvalue is finite dimensional.)
4. The eigenvectors $\{\phi_n\}_{n=1}^N$ can be chosen to be an O.N. set such that $H = \overline{\mathrm{span}\{\phi_n\}} \oplus \mathrm{Nul}(K)$.
5. Using the $\{\phi_n\}_{n=1}^N$ above,
$$Kf = \sum_{n=1}^N \lambda_n\langle f|\phi_n\rangle\phi_n \text{ for all } f \in H. \tag{8.29}$$
6. The spectrum of $K$ is $\sigma(K) = \{0\} \cup \{\lambda_n : n < N + 1\}$ if $\dim H = \infty$, otherwise $\sigma(K) = \{\lambda_n : n \le N\}$ with $N \le \dim H$.
Proof. We will nd
n
s and
n
s recursively. Let
1
|K| and
1
H such that K
1
=
1
1
as in Proposition 8.45.
Take M
1
= span(
1
) so K(M
1
) M
1
. By Lemma 8.17, KM
1
M
1
.
Dene K
1
: M
1
M
1
via K
1
= K[
M
1
. Then K
1
is again a compact
operator. If K
1
= 0, we are done. If K
1
,= 0, by Proposition 8.45 there exists
2
|K
1
| and
2
M
1
such that |
2
| = 1 and K
1
2
= K
2
=
2
2
.
Let M
2
:= span(
1
,
2
).
Again K (M
2
) M
2
and hence K
2
:= K[
M
2
: M
2
M
2
is compact and
if K
2
= 0 we are done. When K
2
,= 0, we apply Proposition 8.45 again to nd
3
|K|
2
and
3
M
2
such that |
3
| = 1 and K
2
3
= K
3
=
3
3
.
Continuing this way indenitely or until we reach a point where K
n
= 0,
we construct a sequence
n
N
n=1
of eigenvalues and orthonormal eigenvectors
N
n=1
such that [
n
[ [
n+1
[ with the further property that
[
n
[ = sup
1,2,...n1]
|K|
||
. (8.30)
When N < , the remaining results in the theorem are easily veried. So
from now on let us assume that N = .
If := lim
n
[
n
[ > 0, then
_
1
n

n
_
n=1
is a bounded sequence in H.
Hence, by the compactness of K, there exists a subsequence n
k
: k N of
N such that
_
n
k
=
1
n
k
K
n
k
_
k=1
is a convergent. However, since
n
k
k=1
is an orthonormal set, this is impossible and hence we must conclude that
:= lim
n
[
n
[ = 0.
Let M := span
n
n=1
. Then K(M) M and hence, by Lemma 8.17,
K(M
) M
. Using Eq. (8.30),

|K[
M
|
_
_
K[
M
n
_
_
= [
n
[ 0 as n
showing K[M
0. Dene P
0
to be orthogonal projection onto M
. Then
for f H,
f = P
0
f + (1 P
0
)f = P
0
f +
n=1
f[
n
)
n
and
Kf = KP
0
f +K
n=1
f[
n
)
n
=
n=1
n
f[
n
)
n
which proves Eq. (8.29).
Since
n
n=1
(K) and (K) is closed, it follows that 0 (K) and
hence
n
n=1
0 (K). Suppose that z /
n
n=1
0 and let d
be the distance between z and
n
n=1
0. Notice that d > 0 because
lim
n
n
= 0.
A few simple computations show that:
(K zI)f =
n=1
f[
n
)(
n
z)
n
zP
0
f,
(K z)
1
exists,
(K zI)
1
f =
n=1
f[
n
)(
n
z)
1
n
z
1
P
0
f,
and
|(K zI)
1
f|
2
=
n=1
[f[
n
)[
2
1
[
n
z[
2
+
1
[z[
2
|P
0
f|
2
_
1
d
_
2
_

n=1
[f[
n
)[
2
+|P
0
f|
2
_
=
1
d
2
|f|
2
.
We have thus shown that (K zI)
1
exists, |(K zI)
1
| d
1
< and
hence z / (K).
Theorem 8.47 (Structure of Compact Operators). Let K : H B
be a compact operator. Then there exists N N , orthonormal subsets
N
n=1
H and
n
N
n=1
B and a sequence
n
N
n=1
1
+
such that
1

2
. . . (with lim
n
n
= 0 if N = ), |
n
| 1 for all n and
Kf =
N
n=1
n
f[
n
)
n
for all f H. (8.31)
Proof. Since K
K is a self-adjoint compact operator, Theorem 8.46 im-

plies there exists an orthonormal set
n
N
n=1
H and positive numbers
N
n=1
such that
K
K =
N
n=1
n
[
n
)
n
for all H.
Let A be the positive square root of K
K dened by
A :=
N
n=1
_
n
[
n
)
n
for all H.
A simple computation shows, A
2
= K
K, and therefore,
|A|
2
= A[A) =
[A
2
_
= [K
K) = K[K) = |K|
2
for all H. Hence we may dene a unitary operator, u : Ran(A) Ran(K)
by the formula
uA = K for all H.
We then have
K = uA =
N
n=1
_
n
[
n
)u
n
(8.32)
which proves the result with
n
:= u
n
and
n
=
n
.
It is instructive to nd
n
explicitly and to verify Eq. (8.32) by brute force.
Since
n
=
1/2
n
A
n
,
n
=
1/2
n
uA
n
=
1/2
n
K
n
and
K
n
[K
m
) =
n
[K
K
m
) =
n
mn
.
This veries that
n
N
n=1
is an orthonormal set. Moreover,
N
n=1
_
n
[
n
)
n
=
N
n=1
_
n
[
n
)
1/2
n
K
n
= K
N
n=1
[
n
)
n
= K
since

N
n=1
[
n
)
n
= P where P is orthogonal projection onto Nul(K)
.
Second Proof. Let K = u[K[ be the polar decomposition of K. Then [K[
is self-adjoint and compact, by Corollary ?? below, and hence by Theorem
8.46 there exists an orthonormal basis
n
N
n=1
for Nul([K[)
= Nul(K)
such that [K[

n
=
n
n
,
1

2
. . . and lim
n
n
= 0 if N = . For
f H,
Kf = u[K[
N
n=1
f[
n
)
n
=
N
n=1
f[
n
)u[K[
n
=
N
n=1
n
f[
n
)u
n
which is Eq. (8.31) with
n
:= u
n
.
8.4 Supplement 1: Converse of the Parallelogram Law
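Proposition 8.48 (Parallelogram Law Converse). If $(X, \|\cdot\|)$ is a normed space such that Eq. (8.2) holds for all $x, y \in X$, then there exists a unique inner product $\langle\cdot|\cdot\rangle$ on $X$ such that $\|x\| := \sqrt{\langle x|x\rangle}$ for all $x \in X$. In this case we say that $\|\cdot\|$ is a Hilbertian norm.
Proof. If $\|\cdot\|$ is going to come from an inner product $\langle\cdot|\cdot\rangle$, it follows from Eq. (8.1) that
$$2\,\mathrm{Re}\langle x|y\rangle = \|x+y\|^2 - \|x\|^2 - \|y\|^2$$
and
$$-2\,\mathrm{Re}\langle x|y\rangle = \|x-y\|^2 - \|x\|^2 - \|y\|^2.$$
Subtracting these two equations gives the "polarization identity,"
$$4\,\mathrm{Re}\langle x|y\rangle = \|x+y\|^2 - \|x-y\|^2.$$
Replacing $y$ by $iy$ in this equation then implies that
$$4\,\mathrm{Im}\langle x|y\rangle = \|x+iy\|^2 - \|x-iy\|^2$$
from which we find
$$\langle x|y\rangle = \frac{1}{4}\sum_{\varepsilon\in G}\varepsilon\|x + \varepsilon y\|^2 \tag{8.33}$$
where $G = \{\pm 1, \pm i\}$ is a cyclic subgroup of $S^1 \subset \mathbb{C}$. Hence, if $\langle\cdot|\cdot\rangle$ is going to exist we must define it by Eq. (8.33) and the uniqueness has been proved.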
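As a sanity check, Eq. (8.33) can be tested numerically in $\mathbb{C}^3$ with its standard inner product. This is a small sketch under assumptions: the vectors `x` and `y` are arbitrary sample data, and `inner`, `norm_sq` and `polarization` are hypothetical helper names. The inner product is taken linear in the first slot and conjugate linear in the second, which is the convention under which the formula reproduces $\langle x|y\rangle$.

```python
def inner(x, y):
    """Inner product <x|y>: linear in x, conjugate linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm ||x||^2."""
    return inner(x, x).real

def polarization(x, y):
    """Right-hand side of the polarization identity, Eq. (8.33)."""
    G = (1, -1, 1j, -1j)
    return sum(e * norm_sq([a + e * b for a, b in zip(x, y)]) for e in G) / 4

x = [1 + 2j, -1j, 3.0]
y = [0.5, 2 - 1j, 1j]

print(polarization(x, y), inner(x, y))
```

The two printed values should agree up to rounding error, showing that the norm alone recovers the inner product.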
For existence, dene x[y) by Eq. (8.33) in which case,
x[x) =
1
4
G
|x +x|
2
=
1
4
_
|2x|
2
+i|x +ix|
2
i|x ix|
2
= |x|
2
+
i
4
1 +i[
2
|x|
2
i
4
1 i[
2
|x|
2
= |x|
2
.
So to nish the proof, it only remains to show that x[y) dened by Eq. (8.33)
is an inner product.
Since
4y[x) =
G
|y +x|
2
=
G
| (y +x) |
2
=
G
|y +
2
x|
2
= |y +x|
2
| y +x|
2
+i|iy x|
2
i| iy x|
2
= |x +y|
2
|x y|
2
+i|x iy|
2
i|x +iy|
2
= 4x[y)
it suces to show x x[y) is linear for all y H. (The rest of this proof may
safely be skipped by the reader.) For this we will need to derive an identity
from Eq. (8.2). To do this we make use of Eq. (8.2) three times to nd
|x +y +z|
2
= |x +y z|
2
+ 2|x +y|
2
+ 2|z|
2
= |x y z|
2
2|x z|
2
2|y|
2
+ 2|x +y|
2
+ 2|z|
2
= |y +z x|
2
2|x z|
2
2|y|
2
+ 2|x +y|
2
+ 2|z|
2
= |y +z +x|
2
+ 2|y +z|
2
+ 2|x|
2
2|x z|
2
2|y|
2
+ 2|x +y|
2
+ 2|z|
2
.
Solving this equation for |x +y +z|
2
gives
|x +y +z|
2
= |y +z|
2
+|x +y|
2
|x z|
2
+|x|
2
+|z|
2
|y|
2
. (8.34)
Using Eq. (8.34), for x, y, z H,
4 Rex +z[y) = |x +z +y|
2
|x +z y|
2
= |y +z|
2
+|x +y|
2
|x z|
2
+|x|
2
+|z|
2
|y|
2
_
|z y|
2
+|x y|
2
|x z|
2
+|x|
2
+|z|
2
|y|
2
_
= |z +y|
2
|z y|
2
+|x +y|
2
|x y|
2
= 4 Rex[y) + 4 Rez[y). (8.35)
Now suppose that G, then since [[ = 1,
4x[y) =
1
4
G
|x +y|
2
=
1
4
G
|x +
1
y|
2
=
1
4
G
|x +y|
2
= 4x[y) (8.36)
where in the third inequality, the substitution was made in the sum.
So Eq. (8.36) says ix[y) = ix[y) and x[y) = x[y). Therefore
Imx[y) = Re (ix[y)) = Reix[y)
which combined with Eq. (8.35) shows
Imx +z[y) = Reix iz[y) = Reix[y) + Reiz[y)
= Imx[y) + Imz[y)
and therefore (again in combination with Eq. (8.35)),
x +z[y) = x[y) +z[y) for all x, y H.
Because of this equation and Eq. (8.36) to nish the proof that x x[y) is
linear, it suces to show x[y) = x[y) for all > 0. Now if = m N,
then
mx[y) = x + (m1)x[y) = x[y) +(m1)x[y)
so that by induction mx[y) = mx[y). Replacing x by x/m then shows that
x[y) = mm
1
x[y) so that m
1
x[y) = m
1
x[y) and so if m, n N, we nd
n
m
x[y) = n
1
m
x[y) =
n
m
x[y)
so that x[y) = x[y) for all > 0 and . By continuity, it now follows
that x[y) = x[y) for all > 0.
8.5 Supplement 2. Non-complete inner product spaces
Part of Theorem 8.24 goes through when H is a not necessarily complete inner
product space. We have the following proposition.
Proposition 8.49. Let $(H, \langle\cdot|\cdot\rangle)$ be a not necessarily complete inner product space and $\beta \subset H$ be an orthonormal set. Then the following two conditions are equivalent:
1. $x = \sum_{u\in\beta}\langle x|u\rangle u$ for all $x \in H$.
2. $\|x\|^2 = \sum_{u\in\beta}|\langle x|u\rangle|^2$ for all $x \in H$.
Moreover, either of these two conditions implies that $\beta \subset H$ is a maximal orthonormal set. However $\beta \subset H$ being a maximal orthonormal set is not sufficient (without completeness of $H$) to show that items 1. and 2. hold!
Proof. As in the proof of Theorem 8.24, 1) implies 2). For 2) implies 1)
let and consider
_
_
_
_
_
x
u
x[u)u
_
_
_
_
_
2
= |x|
2
2
u
[x[u)[
2
+
u
[x[u)[
2
= |x|
2
u
[x[u)[
2
.
Since |x|
2
=

u
[x[u)[
2
, it follows that for every > 0 there exists

such that for all such that
,
_
_
_
_
_
x
u
x[u)u
_
_
_
_
_
2
= |x|
2
u
[x[u)[
2
<
showing that x =

u
x[u)u. Suppose x = (x
1
, x
2
, . . . , x
n
, . . . )
. If 2)
is valid then |x|
2
= 0, i.e. x = 0. So is maximal. Let us now construct a
counterexample to prove the last assertion. Take H = Spane
i
i=1

2
and
let u
n
= e
1
(n+1)e
n+1
for n = 1, 2 . . . . Applying Gram-Schmidt to u
n
n=1
we construct an orthonormal set = u
n
n=1
H. I now claim that H
is maximal. Indeed if x = (x
1
, x
2
, . . . , x
n
, . . . )
then x u
n
for all n, i.e.
0 = x[ u
n
) = x
1
(n + 1)x
n+1
.
Therefore x
n+1
= (n + 1)
1
x
1
for all n. Since x Spane
i
i=1
, x
N
= 0 for
some N suciently large and therefore x
1
= 0 which in turn implies that
x
n
= 0 for all n. So x = 0 and hence is maximal in H. On the other hand,
is not maximal in
2
. In fact the above argument shows that
in
2
is given
by the span of v = (1,
1
2
,
1
3
,
1
4
,
1
5
, . . . ). Let P be the orthogonal projection of
2
onto the Span() = v
. Then
n=1
x[u
n
)u
n
= Px = x
x[v)
|v|
2
v,
so that
n=1
x[u
n
)u
n
= x i x Span() = v

2
. For example if x =
(1, 0, 0, . . . ) H (or more generally for x = e
i
for any i), x / v
and hence
i=1
x[u
n
)u
n
,= x.
8.6 Exercises
Exercise 8.12. Prove Theorem 14.43. Hint: Let H
0
:= spanx
n
: n N a
separable Hilbert subspace of H. Let
m
m=1
H
0
be an orthonormal basis
and use Cantors diagonalization argument to nd a subsequence y
k
:= x
n
k
such that c
m
:= lim
k
y
k
[
m
) exists for all m N. Finish the proof by
appealing to Proposition 14.42. (BRUCE: forward reference.)
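Definition 8.50. We say a sequence $\{x_n\}_{n=1}^\infty$ of a Hilbert space, $H$, converges weakly to $x \in H$ (and denote this by writing $x_n \xrightarrow{w} x \in H$ as $n \to \infty$) iff $\lim_{n\to\infty}\langle x_n, y\rangle = \langle x, y\rangle$ for all $y \in H$.
Exercise 8.13. Suppose that $\{x_n\}_{n=1}^\infty \subset H$ and $x_n \xrightarrow{w} x \in H$ as $n \to \infty$. Show $x_n \to x$ as $n \to \infty$ (i.e. $\lim_{n\to\infty}\|x - x_n\| = 0$) iff $\lim_{n\to\infty}\|x_n\| = \|x\|$. (BRUCE: weak convergence has not been defined yet.)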
Exercise 8.14 (Banach-Saks). Suppose that x
n
n=1
H, x
n
w
x H as
n , and c := sup
n
|x
n
| < .
4
Show there exists a subsequence, y
k
= x
n
k
such that
lim
N
_
_
_
_
_
x
1
N
N
k=1
y
k
_
_
_
_
_
= 0,
i.e.
1
N
N
k=1
y
k
x as N . Hints: 1. show it suces to assume x = 0
and then choose y
k
k=1
so that [y
k
[y
l
)[ l
1
(or even smaller if you like)
for all k l.
Exercise 8.15 (The Mean Ergodic Theorem). Let $U : H \to H$ be a unitary operator on a Hilbert space $H$, $M = \mathrm{Nul}(U - I)$, $P = P_M$ be orthogonal projection onto $M$, and $S_n = \frac{1}{n}\sum_{k=0}^{n-1}U^k$. Show $S_n \to P_M$ strongly, by which we mean $\lim_{n\to\infty}S_n x = P_M x$ for all $x \in H$.
Hints: 1. Show $H$ is the orthogonal direct sum of $M$ and $\overline{\mathrm{Ran}(U - I)}$ by first showing $\mathrm{Nul}(U^* - I) = \mathrm{Nul}(U - I)$ and then using Lemma 8.17. 2. Verify the result for $x \in \mathrm{Nul}(U - I)$ and $x \in \mathrm{Ran}(U - I)$. 3. Use a limiting argument to verify the result for $x \in \overline{\mathrm{Ran}(U - I)}$.
See Definition 14.36 and the exercises in Section 25.4 for more on the notion of weak and strong convergence.
$^4$ The assumption that $c < \infty$ is superfluous because of the uniform boundedness principle, see Theorem 25.27 below.
9
Hölder Spaces as Banach Spaces
In this section, we will assume that the reader has basic knowledge of the Riemann integral and differentiability properties of functions. The results used here may be found in Part III below. (BRUCE: there are forward references in this section.)
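Notation 9.1 Let $\Omega$ be an open subset of $\mathbb{R}^d$, and let $BC(\Omega)$ and $BC(\bar\Omega)$ be the bounded continuous functions on $\Omega$ and $\bar\Omega$ respectively. By identifying $f \in BC(\bar\Omega)$ with $f|_\Omega \in BC(\Omega)$, we will consider $BC(\bar\Omega)$ as a subset of $BC(\Omega)$. For $u \in BC(\Omega)$ and $0 < \beta \le 1$ let
\[
\|u\|_u := \sup_{x\in\Omega} |u(x)| \quad\text{and}\quad [u]_\beta := \sup_{\substack{x,y\in\Omega \\ x\neq y}} \left\{ \frac{|u(x) - u(y)|}{|x - y|^\beta} \right\}.
\]
If $[u]_\beta < \infty$, then $u$ is Hölder continuous with Hölder exponent$^1$ $\beta$. The collection of $\beta$-Hölder continuous functions on $\Omega$ will be denoted by
\[
C^{0,\beta}(\Omega) := \{u \in BC(\Omega) : [u]_\beta < \infty\}
\]
and for $u \in C^{0,\beta}(\Omega)$ let
\[
\|u\|_{C^{0,\beta}(\Omega)} := \|u\|_u + [u]_\beta. \tag{9.1}
\]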
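As a rough numerical illustration of the seminorm $[u]_\beta$, the following sketch evaluates the largest Hölder quotient over a finite grid. The choice $u(x) = \sqrt{x}$ on $[0, 1]$, the grid size, and the helper name `holder_seminorm` are assumptions made only for this example; a finite grid can only give a lower bound for the true supremum.

```python
import math

def holder_seminorm(u, points, beta):
    """Largest quotient |u(x) - u(y)| / |x - y|**beta over all pairs of grid points."""
    best = 0.0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            best = max(best, abs(u(x) - u(y)) / abs(x - y) ** beta)
    return best

grid = [k / 200 for k in range(201)]            # sample points in [0, 1]
half = holder_seminorm(math.sqrt, grid, 0.5)    # stays bounded as the grid is refined
lip = holder_seminorm(math.sqrt, grid, 1.0)     # grows without bound as the grid is refined
print(half, lip)
```

On this grid the $\beta = 1/2$ quotient stays near $1$, while the $\beta = 1$ (Lipschitz) quotient is already large, which is consistent with $\sqrt{x}$ being Hölder continuous with exponent $1/2$ but not Lipschitz near $0$.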
Remark 9.2. If $u : \Omega \to \mathbb{C}$ and $[u]_\beta < \infty$ for some $\beta > 1$, then $u$ is constant on each connected component of $\Omega$. Indeed, if $x \in \Omega$ and $h \in \mathbb{R}^d$ then
\[
\left| \frac{u(x + th) - u(x)}{t} \right| \le [u]_\beta\, |h|^\beta\, |t|^\beta / |t| \to 0 \text{ as } t \to 0,
\]
which shows $\partial_h u(x) = 0$ for all $x \in \Omega$. If $y \in \Omega$ is in the same connected component as $x$, then by Exercise 22.8 below there exists a smooth curve $\sigma : [0, 1] \to \Omega$ such that $\sigma(0) = x$ and $\sigma(1) = y$. So by the fundamental theorem of calculus and the chain rule,
\[
u(y) - u(x) = \int_0^1 \frac{d}{dt} u(\sigma(t))\, dt = \int_0^1 0\, dt = 0.
\]
This is why we do not talk about Hölder spaces with Hölder exponents larger than 1.
$^1$ If $\beta = 1$, $u$ is said to be Lipschitz continuous.
Lemma 9.3. Suppose $u \in C^1(\Omega) \cap BC(\Omega)$ and $\partial_i u \in BC(\Omega)$ for $i = 1, 2, \dots, d$, then $u \in C^{0,1}(\Omega)$, i.e. $[u]_1 < \infty$.
The proof of this lemma is left to the reader as Exercise 9.1.
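Theorem 9.4. Let $\Omega$ be an open subset of $\mathbb{R}^d$. Then
1. Under the identification of $u \in BC(\bar\Omega)$ with $u|_\Omega \in BC(\Omega)$, $BC(\bar\Omega)$ is a closed subspace of $BC(\Omega)$.
2. Every element $u \in C^{0,\beta}(\Omega)$ has a unique extension to a continuous function (still denoted by $u$) on $\bar\Omega$. Therefore we may identify $C^{0,\beta}(\Omega)$ with $C^{0,\beta}(\bar\Omega) \subset BC(\bar\Omega)$. (In particular we may consider $C^{0,\beta}(\Omega)$ and $C^{0,\beta}(\bar\Omega)$ to be the same when $\beta > 0$.)
3. The function $u \in C^{0,\beta}(\Omega) \to \|u\|_{C^{0,\beta}(\Omega)} \in [0, \infty)$ is a norm on $C^{0,\beta}(\Omega)$ which makes $C^{0,\beta}(\Omega)$ into a Banach space.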
Proof. 1. The first item is trivial since for $u \in BC(\bar\Omega)$, the sup-norm of $u$ on $\bar\Omega$ agrees with the sup-norm on $\Omega$ and $BC(\bar\Omega)$ is complete in this norm.
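2. Suppose that $[u]_\beta < \infty$ and $x_0 \in \operatorname{bd}(\Omega)$. Let $\{x_n\}_{n=1}^\infty \subset \Omega$ be a sequence such that $x_0 = \lim_{n\to\infty} x_n$. Then
\[
|u(x_n) - u(x_m)| \le [u]_\beta\, |x_n - x_m|^\beta \to 0 \text{ as } m, n \to \infty,
\]
showing $\{u(x_n)\}_{n=1}^\infty$ is Cauchy so that $\bar u(x_0) := \lim_{n\to\infty} u(x_n)$ exists. If $\{y_n\}_{n=1}^\infty \subset \Omega$ is another sequence converging to $x_0$, then
\[
|u(x_n) - u(y_n)| \le [u]_\beta\, |x_n - y_n|^\beta \to 0 \text{ as } n \to \infty,
\]
showing $\bar u(x_0)$ is well defined. In this way we define $\bar u(x)$ for all $x \in \operatorname{bd}(\Omega)$ and let $\bar u(x) = u(x)$ for $x \in \Omega$. Since a similar limiting argument shows
\[
|\bar u(x) - \bar u(y)| \le [u]_\beta\, |x - y|^\beta \text{ for all } x, y \in \bar\Omega,
\]
it follows that $\bar u$ is still continuous and $[\bar u]_\beta = [u]_\beta$. In the sequel we will abuse notation and simply denote $\bar u$ by $u$.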
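3. For $u, v \in C^{0,\beta}(\Omega)$,
\[
[v + u]_\beta = \sup_{\substack{x,y\in\Omega \\ x\neq y}} \left\{ \frac{|v(y) + u(y) - v(x) - u(x)|}{|x - y|^\beta} \right\}
\le \sup_{\substack{x,y\in\Omega \\ x\neq y}} \left\{ \frac{|v(y) - v(x)| + |u(y) - u(x)|}{|x - y|^\beta} \right\} \le [v]_\beta + [u]_\beta
\]
and for $\lambda \in \mathbb{C}$ it is easily seen that $[\lambda u]_\beta = |\lambda|\, [u]_\beta$. This shows $[\cdot]_\beta$ is a semi-norm (see Definition 5.1) on $C^{0,\beta}(\Omega)$ and therefore $\|\cdot\|_{C^{0,\beta}(\Omega)}$ defined in Eq. (9.1) is a norm. To see that $C^{0,\beta}(\Omega)$ is complete, let $\{u_n\}_{n=1}^\infty$ be a $C^{0,\beta}(\Omega)$-Cauchy sequence. Since $BC(\bar\Omega)$ is complete, there exists $u \in BC(\bar\Omega)$ such that $\|u - u_n\|_u \to 0$ as $n \to \infty$. For $x, y \in \Omega$ with $x \neq y$,
\[
\frac{|u(x) - u(y)|}{|x - y|^\beta} = \lim_{n\to\infty} \frac{|u_n(x) - u_n(y)|}{|x - y|^\beta} \le \limsup_{n\to\infty}\, [u_n]_\beta \le \lim_{n\to\infty} \|u_n\|_{C^{0,\beta}(\Omega)} < \infty,
\]
and so we see that $u \in C^{0,\beta}(\Omega)$. Similarly,
\[
\frac{|u(x) - u_n(x) - (u(y) - u_n(y))|}{|x - y|^\beta} = \lim_{m\to\infty} \frac{|(u_m - u_n)(x) - (u_m - u_n)(y)|}{|x - y|^\beta} \le \limsup_{m\to\infty}\, [u_m - u_n]_\beta \to 0 \text{ as } n \to \infty,
\]
showing $[u - u_n]_\beta \to 0$ as $n \to \infty$ and therefore $\lim_{n\to\infty} \|u - u_n\|_{C^{0,\beta}(\Omega)} = 0$.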
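Notation 9.5 Since $\Omega$ and $\bar\Omega$ are locally compact Hausdorff spaces, we may define $C_0(\Omega)$ and $C_0(\bar\Omega)$ as in Definition 15.22. We will also let
\[
C_0^{0,\beta}(\Omega) := C^{0,\beta}(\Omega) \cap C_0(\Omega) \quad\text{and}\quad C_0^{0,\beta}(\bar\Omega) := C^{0,\beta}(\Omega) \cap C_0(\bar\Omega).
\]
It has already been shown in Proposition 15.23 that $C_0(\Omega)$ and $C_0(\bar\Omega)$ are closed subspaces of $BC(\Omega)$ and $BC(\bar\Omega)$ respectively. The next proposition describes the relation between $C_0(\Omega)$ and $C_0(\bar\Omega)$.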
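Proposition 9.6. Each $u \in C_0(\Omega)$ has a unique extension to a continuous function on $\bar\Omega$ given by $\bar u = u$ on $\Omega$ and $\bar u = 0$ on $\operatorname{bd}(\Omega)$, and the extension $\bar u$ is in $C_0(\bar\Omega)$. Conversely if $u \in C_0(\bar\Omega)$ and $u|_{\operatorname{bd}(\Omega)} = 0$, then $u|_\Omega \in C_0(\Omega)$. In this way we may identify $C_0(\Omega)$ with those $u \in C_0(\bar\Omega)$ such that $u|_{\operatorname{bd}(\Omega)} = 0$.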
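Proof. Any extension of $u \in C_0(\Omega)$ to an element $\bar u \in C(\bar\Omega)$ is necessarily unique, since $\Omega$ is dense inside $\bar\Omega$. So define $\bar u = u$ on $\Omega$ and $\bar u = 0$ on $\operatorname{bd}(\Omega)$. We must show $\bar u$ is continuous on $\bar\Omega$ and $\bar u \in C_0(\bar\Omega)$. For the continuity assertion it is enough to show $\bar u$ is continuous at all points in $\operatorname{bd}(\Omega)$. For any $\varepsilon > 0$, by assumption, the set $K_\varepsilon := \{x \in \Omega : |u(x)| \ge \varepsilon\}$ is a compact subset of $\Omega$. Since $\operatorname{bd}(\Omega) = \bar\Omega \setminus \Omega$, $\operatorname{bd}(\Omega) \cap K_\varepsilon = \emptyset$ and therefore the distance, $\delta := d(K_\varepsilon, \operatorname{bd}(\Omega))$, between $K_\varepsilon$ and $\operatorname{bd}(\Omega)$ is positive. So if $x \in \operatorname{bd}(\Omega)$ and $y \in \bar\Omega$ and $|y - x| < \delta$, then $|\bar u(x) - \bar u(y)| = |\bar u(y)| < \varepsilon$, which shows $\bar u : \bar\Omega \to \mathbb{C}$ is continuous. This also shows $\{|\bar u| \ge \varepsilon\} = \{|u| \ge \varepsilon\} = K_\varepsilon$ is compact in $\Omega$ and hence also in $\bar\Omega$. Since $\varepsilon > 0$ was arbitrary, this shows $\bar u \in C_0(\bar\Omega)$. Conversely if $u \in C_0(\bar\Omega)$ is such that $u|_{\operatorname{bd}(\Omega)} = 0$ and $\varepsilon > 0$, then $K_\varepsilon := \{x \in \bar\Omega : |u(x)| \ge \varepsilon\}$ is a compact subset of $\bar\Omega$ which is contained in $\Omega$ since $\operatorname{bd}(\Omega) \cap K_\varepsilon = \emptyset$. Therefore $K_\varepsilon$ is a compact subset of $\Omega$, showing $u|_\Omega \in C_0(\Omega)$.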
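Definition 9.7. Let $\Omega$ be an open subset of $\mathbb{R}^d$, $k \in \mathbb{N} \cup \{0\}$ and $\beta \in (0, 1]$. Let $BC^k(\Omega)$ ($BC^k(\bar\Omega)$) denote the set of $k$ times continuously differentiable functions $u$ on $\Omega$ such that $\partial^\alpha u \in BC(\Omega)$ ($\partial^\alpha u \in BC(\bar\Omega)$)$^2$ for all $|\alpha| \le k$. Similarly, let $BC^{k,\beta}(\Omega)$ denote those $u \in BC^k(\Omega)$ such that $[\partial^\alpha u]_\beta < \infty$ for all $|\alpha| = k$. For $u \in BC^k(\Omega)$ let
\[
\|u\|_{C^k(\Omega)} = \sum_{|\alpha| \le k} \|\partial^\alpha u\|_u \quad\text{and}\quad
\|u\|_{C^{k,\beta}(\Omega)} = \sum_{|\alpha| \le k} \|\partial^\alpha u\|_u + \sum_{|\alpha| = k} [\partial^\alpha u]_\beta.
\]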
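Theorem 9.8. The spaces $BC^k(\Omega)$ and $BC^{k,\beta}(\Omega)$ equipped with $\|\cdot\|_{C^k(\Omega)}$ and $\|\cdot\|_{C^{k,\beta}(\Omega)}$ respectively are Banach spaces, $BC^k(\bar\Omega)$ is a closed subspace of $BC^k(\Omega)$, and $BC^{k,\beta}(\Omega) \subset BC^k(\bar\Omega)$. Also
\[
C_0^{k,\beta}(\Omega) = C_0^{k,\beta}(\bar\Omega) = \{u \in BC^{k,\beta}(\Omega) : \partial^\alpha u \in C_0(\Omega)\ \forall\, |\alpha| \le k\}
\]
is a closed subspace of $BC^{k,\beta}(\Omega)$.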
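Proof. Suppose that $\{u_n\}_{n=1}^\infty \subset BC^k(\Omega)$ is a Cauchy sequence, then $\{\partial^\alpha u_n\}_{n=1}^\infty$ is a Cauchy sequence in $BC(\Omega)$ for $|\alpha| \le k$. Since $BC(\Omega)$ is complete, there exists $g_\alpha \in BC(\Omega)$ such that $\lim_{n\to\infty} \|\partial^\alpha u_n - g_\alpha\|_u = 0$ for all $|\alpha| \le k$. Letting $u := g_0$, we must show $u \in C^k(\Omega)$ and $\partial^\alpha u = g_\alpha$ for all $|\alpha| \le k$. This will be done by induction on $|\alpha|$. If $|\alpha| = 0$ there is nothing to prove. Suppose that we have verified $u \in C^l(\Omega)$ and $\partial^\alpha u = g_\alpha$ for all $|\alpha| \le l$ for some $l < k$. Then for $x \in \Omega$, $i \in \{1, 2, \dots, d\}$ and $t \in \mathbb{R}$ sufficiently small,
\[
\partial^\alpha u_n(x + t e_i) = \partial^\alpha u_n(x) + \int_0^t \partial_i \partial^\alpha u_n(x + \tau e_i)\, d\tau.
\]
Letting $n \to \infty$ in this equation gives
\[
\partial^\alpha u(x + t e_i) = \partial^\alpha u(x) + \int_0^t g_{\alpha + e_i}(x + \tau e_i)\, d\tau,
\]
from which it follows that $\partial_i \partial^\alpha u(x)$ exists for all $x \in \Omega$ and $\partial_i \partial^\alpha u = g_{\alpha + e_i}$. This completes the induction argument and also the proof that $BC^k(\Omega)$ is complete. It is easy to check that $BC^k(\bar\Omega)$ is a closed subspace of $BC^k(\Omega)$ and, by using Exercise 9.1 and Theorem 9.4, that $BC^{k,\beta}(\Omega)$ is a subspace of $BC^k(\bar\Omega)$. The fact that $C_0^{k,\beta}(\Omega)$ is a closed subspace of $BC^{k,\beta}(\Omega)$ is a consequence of Proposition 15.23. To prove $BC^{k,\beta}(\Omega)$ is complete, let $\{u_n\}_{n=1}^\infty \subset BC^{k,\beta}(\Omega)$ be a $\|\cdot\|_{C^{k,\beta}(\Omega)}$-Cauchy sequence. By the completeness of $BC^k(\Omega)$ just proved, there exists $u \in BC^k(\Omega)$ such that $\lim_{n\to\infty} \|u - u_n\|_{C^k(\Omega)} = 0$. An application of Theorem 9.4 then shows $\lim_{n\to\infty} \|\partial^\alpha u_n - \partial^\alpha u\|_{C^{0,\beta}(\Omega)} = 0$ for $|\alpha| = k$ and therefore $\lim_{n\to\infty} \|u - u_n\|_{C^{k,\beta}(\Omega)} = 0$.
$^2$ To say $\partial^\alpha u \in BC(\bar\Omega)$ means that $\partial^\alpha u \in BC(\Omega)$ and $\partial^\alpha u$ extends to a continuous function on $\bar\Omega$.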
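The reader is asked to supply the proof of the following lemma.
Lemma 9.9. The following inclusions hold. For any $\beta \in [0, 1]$,
\[
BC^{k+1,0}(\Omega) \subset BC^{k,1}(\Omega) \subset BC^{k,\beta}(\Omega),
\]
\[
BC^{k+1,0}(\bar\Omega) \subset BC^{k,1}(\bar\Omega) \subset BC^{k,\beta}(\Omega).
\]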
9.1 Exercises
Part III
Calculus and Ordinary Differential Equations in Banach Spaces
10
The Riemann Integral
In this chapter, the Riemann integral for Banach space valued functions is defined and developed. Our exposition will be brief, since the Lebesgue integral and the Bochner Lebesgue integral will subsume the content of this chapter. In Definition 14.1 below, we will give a general notion of a compact subset of a topological space. However, by Corollary 14.9 below, when we are working with subsets of $\mathbb{R}^d$ this definition is equivalent to the following definition.
Definition 10.1. A subset $A \subset \mathbb{R}^d$ is said to be compact if $A$ is closed and bounded.
Theorem 10.2. Suppose that $K \subset \mathbb{R}^d$ is a compact set and $f \in C(K, X)$. Then
1. Every sequence $\{u_n\}_{n=1}^\infty \subset K$ has a convergent subsequence.
2. The function $f$ is uniformly continuous on $K$, namely for every $\varepsilon > 0$ there exists a $\delta > 0$ only depending on $\varepsilon$ such that $\|f(u) - f(v)\| < \varepsilon$ whenever $u, v \in K$ and $|u - v| < \delta$, where $|\cdot|$ is the standard Euclidean norm on $\mathbb{R}^d$.
Proof.
1. (This is a special case of Theorem 14.7 and Corollary 14.9 below.) Since $K$ is bounded, $K \subset [-R, R]^d$ for some sufficiently large $R$. Let $t_n$ be the first component of $u_n$ so that $t_n \in [-R, R]$ for all $n$. Let $J_1 = [0, R]$ if $t_n \in [0, R]$ for infinitely many $n$, otherwise let $J_1 = [-R, 0]$. Similarly split $J_1$ in half and let $J_2 \subset J_1$ be one of the halves such that $t_n \in J_2$ for infinitely many $n$. Continue this way inductively to find a nested sequence of intervals $J_1 \supset J_2 \supset J_3 \supset J_4 \supset \dots$ such that the length of $J_k$ is $2^{-(k-1)} R$ and for each $k$, $t_n \in J_k$ for infinitely many $n$. We may now choose a subsequence, $\{n_k\}_{k=1}^\infty$ of $\{n\}_{n=1}^\infty$, such that $\tau_k := t_{n_k} \in J_k$ for all $k$. The sequence $\{\tau_k\}_{k=1}^\infty$ is Cauchy and hence convergent. Thus by replacing $\{u_n\}_{n=1}^\infty$ by a subsequence if necessary we may assume the first component of $\{u_n\}_{n=1}^\infty$ is convergent. Repeating this argument for the second, then the third and all the way through the $d^{\text{th}}$ components of $\{u_n\}_{n=1}^\infty$, we may, by passing to further subsequences, assume all of the components of $u_n$ are convergent. But this implies $\lim u_n = u$ exists and since $K$ is closed, $u \in K$.
2. (This is a special case of Exercise 14.6 below.) If $f$ were not uniformly continuous on $K$, there would exist an $\varepsilon > 0$ and sequences $\{u_n\}_{n=1}^\infty$ and $\{v_n\}_{n=1}^\infty$ in $K$ such that
\[
\|f(u_n) - f(v_n)\| \ge \varepsilon \text{ while } \lim_{n\to\infty} |u_n - v_n| = 0.
\]
By passing to subsequences if necessary we may assume that $\lim_{n\to\infty} u_n$ and $\lim_{n\to\infty} v_n$ exist. Since $\lim_{n\to\infty} |u_n - v_n| = 0$, we must have
\[
\lim_{n\to\infty} u_n = u = \lim_{n\to\infty} v_n
\]
for some $u \in K$. Since $f$ is continuous, vector addition is continuous and the norm is continuous, we may now conclude that
\[
\lim_{n\to\infty} \|f(u_n) - f(v_n)\| = \|f(u) - f(u)\| = 0,
\]
which is a contradiction.
For the remainder of the chapter, let $[a, b]$ be a fixed compact interval and $X$ be a Banach space. The collection $\mathcal{S} = \mathcal{S}([a, b], X)$ of step functions, $f : [a, b] \to X$, consists of those functions $f$ which may be written in the form
\[
f(t) = x_0 1_{[a, t_1]}(t) + \sum_{i=1}^{n-1} x_i 1_{(t_i, t_{i+1}]}(t), \tag{10.1}
\]
where $\pi := \{a = t_0 < t_1 < \dots < t_n = b\}$ is a partition of $[a, b]$ and $x_i \in X$. For $f$ as in Eq. (10.1), let
\[
I(f) := \sum_{i=0}^{n-1} (t_{i+1} - t_i)\, x_i \in X. \tag{10.2}
\]
Exercise 10.1. Show that $I(f)$ is well defined, independent of how $f$ is represented as a step function. (Hint: show that adding a point to a partition $\pi$ of $[a, b]$ does not change the right side of Eq. (10.2).) Also verify that $I : \mathcal{S} \to X$ is a linear operator.
Notation 10.3 Let $\bar{\mathcal{S}}$ denote the closure of $\mathcal{S}$ inside the Banach space, $\ell^\infty([a, b], X)$, as defined in Remark 7.6.
The following simple Bounded Linear Transformation theorem will often be used in the sequel to define linear transformations.
Theorem 10.4 (B. L. T. Theorem). Suppose that $Z$ is a normed space, $X$ is a Banach space, and $\mathcal{S} \subset Z$ is a dense linear subspace of $Z$. If $T : \mathcal{S} \to X$ is a bounded linear transformation (i.e. there exists $C < \infty$ such that $\|Tz\| \le C \|z\|$ for all $z \in \mathcal{S}$), then $T$ has a unique extension to an element $\bar T \in L(Z, X)$ and this extension still satisfies
\[
\|\bar T z\| \le C \|z\| \text{ for all } z \in \bar{\mathcal{S}}.
\]
Exercise 10.2. Prove Theorem 10.4.
Proposition 10.5 (Riemann Integral). The linear function $I : \mathcal{S} \to X$ extends uniquely to a continuous linear operator $\bar I$ from $\bar{\mathcal{S}}$ to $X$ and this operator satisfies
\[
\|\bar I(f)\| \le (b - a)\, \|f\|_\infty \text{ for all } f \in \bar{\mathcal{S}}. \tag{10.3}
\]
Furthermore, $C([a, b], X) \subset \bar{\mathcal{S}} \subset \ell^\infty([a, b], X)$ and for $f \in C([a, b], X)$, $\bar I(f)$ may be computed as
\[
\bar I(f) = \lim_{|\pi| \to 0} \sum_{i=0}^{n-1} f(c_i^\pi)(t_{i+1} - t_i), \tag{10.4}
\]
where $\pi := \{a = t_0 < t_1 < \dots < t_n = b\}$ denotes a partition of $[a, b]$, $|\pi| = \max\{|t_{i+1} - t_i| : i = 0, \dots, n - 1\}$ is the mesh size of $\pi$ and $c_i^\pi$ may be chosen arbitrarily inside $[t_i, t_{i+1}]$. See Figure 10.1.
Fig. 10.1. The usual picture associated to the Riemann integral.
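Proof. Taking the norm of Eq. (10.2) and using the triangle inequality shows
\[
\|I(f)\| \le \sum_{i=0}^{n-1} (t_{i+1} - t_i) \|x_i\| \le \sum_{i=0}^{n-1} (t_{i+1} - t_i) \|f\|_\infty \le (b - a) \|f\|_\infty. \tag{10.5}
\]
The existence of $\bar I$ satisfying Eq. (10.3) is a consequence of Theorem 10.4. Given $f \in C([a, b], X)$, $\pi := \{a = t_0 < t_1 < \dots < t_n = b\}$ a partition of $[a, b]$, and $c_i^\pi \in [t_i, t_{i+1}]$ for $i = 0, 1, 2, \dots, n - 1$, let $f_\pi \in \mathcal{S}$ be defined by
\[
f_\pi(t) := f(c_0^\pi) 1_{[t_0, t_1]}(t) + \sum_{i=1}^{n-1} f(c_i^\pi) 1_{(t_i, t_{i+1}]}(t).
\]
Then by the uniform continuity of $f$ on $[a, b]$ (Theorem 10.2), $\lim_{|\pi| \to 0} \|f - f_\pi\|_\infty = 0$ and therefore $f \in \bar{\mathcal{S}}$. Moreover,
\[
\bar I(f) = \lim_{|\pi| \to 0} I(f_\pi) = \lim_{|\pi| \to 0} \sum_{i=0}^{n-1} f(c_i^\pi)(t_{i+1} - t_i).
\]
If $f_n \in \mathcal{S}$ and $f \in \bar{\mathcal{S}}$ are such that $\lim_{n\to\infty} \|f - f_n\|_\infty = 0$, then for $a \le \alpha < \beta \le b$, $1_{(\alpha, \beta]} f_n \in \mathcal{S}$ and $\lim_{n\to\infty} \|1_{(\alpha, \beta]} f - 1_{(\alpha, \beta]} f_n\|_\infty = 0$. This shows $1_{(\alpha, \beta]} f \in \bar{\mathcal{S}}$ whenever $f \in \bar{\mathcal{S}}$.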
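The limit in Eq. (10.4) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes $X = \mathbb{R}^2$, the path $f(t) = (\cos t, t^2)$ on $[0, 1]$, a uniform partition, and midpoints as the tags $c_i$; the helper name `riemann_sum` is hypothetical.

```python
import math

def riemann_sum(f, partition, tags):
    """Sum of f(c_i) * (t_{i+1} - t_i) for a function f with values in R^m."""
    total = [0.0] * len(f(partition[0]))
    for t0, t1, c in zip(partition, partition[1:], tags):
        total = [s + v * (t1 - t0) for s, v in zip(total, f(c))]
    return total

def f(t):
    return (math.cos(t), t * t)      # a continuous path in X = R^2

n = 2000
partition = [i / n for i in range(n + 1)]                        # uniform partition of [0, 1]
tags = [(a + b) / 2 for a, b in zip(partition, partition[1:])]   # midpoints as the tags c_i
approx = riemann_sum(f, partition, tags)
print(approx)                        # compare with (sin 1, 1/3)
```

Any other choice of tags inside the subintervals should give the same limit as the mesh size shrinks, which is the content of Eq. (10.4).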
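Notation 10.6 For $f \in \bar{\mathcal{S}}$ and $a \le \alpha \le \beta \le b$ we will denote $\bar I(1_{(\alpha, \beta]} f)$ by $\int_\alpha^\beta f(t)\, dt$ or $\int_{(\alpha, \beta]} f(t)\, dt$. Also following the usual convention, if $a \le \beta \le \alpha \le b$, we will let
\[
\int_\alpha^\beta f(t)\, dt = -\bar I(1_{(\beta, \alpha]} f) = -\int_\beta^\alpha f(t)\, dt.
\]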
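The next lemma, whose proof is left to the reader, contains some of the many familiar properties of the Riemann integral.
Lemma 10.7. For $f \in \bar{\mathcal{S}}([a, b], X)$ and $\alpha, \beta, \gamma \in [a, b]$, the Riemann integral satisfies:
1. $\left\| \int_\alpha^\beta f(t)\, dt \right\|_X \le (\beta - \alpha) \sup\{\|f(t)\| : \alpha \le t \le \beta\}$.
2. $\int_\alpha^\gamma f(t)\, dt = \int_\alpha^\beta f(t)\, dt + \int_\beta^\gamma f(t)\, dt$.
3. The function $G(t) := \int_a^t f(\tau)\, d\tau$ is continuous on $[a, b]$.
4. If $Y$ is another Banach space and $T \in L(X, Y)$, then $Tf \in \bar{\mathcal{S}}([a, b], Y)$ and
\[
T\left( \int_\alpha^\beta f(t)\, dt \right) = \int_\alpha^\beta T f(t)\, dt.
\]
5. The function $t \to \|f(t)\|_X$ is in $\bar{\mathcal{S}}([a, b], \mathbb{R})$ and
\[
\left\| \int_a^b f(t)\, dt \right\|_X \le \int_a^b \|f(t)\|_X\, dt.
\]
6. If $f, g \in \bar{\mathcal{S}}([a, b], \mathbb{R})$ and $f \le g$, then
\[
\int_a^b f(t)\, dt \le \int_a^b g(t)\, dt.
\]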
Remark 10.8 (BRUCE: todo?). Perhaps the Riemann-Stieltjes integral, Lemma 28.38, should be done here. Maybe this should be done in the more general context of Banach valued functions in preparation of T. Lyons' rough path analysis. The point would be to let $X_t$ take values in a Banach space and assume that $X_t$ had finite variation. Then define
\[
\mu_X(t) := \sup_\pi \sum_l \left\| X_{t \wedge t_l} - X_{t \wedge t_{l-1}} \right\|.
\]
Then we could define
\[
\int_0^T Z_t\, dX_t := \lim_{|\pi| \to 0} \sum_l Z_{t_{l-1}} \left( X_{t \wedge t_l} - X_{t \wedge t_{l-1}} \right)
\]
for continuous operator valued paths, $Z_t$. This integral would then satisfy the estimates
\[
\left\| \int_0^T Z_t\, dX_t \right\| \le \int_0^T \|Z_t\|\, d\mu_X(t) \le \sup_{0 \le t \le T} \|Z_t\|\, \mu_X(T).
\]
10.1 The Fundamental Theorem of Calculus
Our next goal is to show that our Riemann integral interacts well with differentiation, namely that the fundamental theorem of calculus holds. Before doing this we will need a couple of basic definitions and results of differential calculus; the next few results will be treated in greater detail in Chapter 12.
Definition 10.9. Let $(a, b) \subset \mathbb{R}$. A function $f : (a, b) \to X$ is differentiable at $t \in (a, b)$ iff
\[
L := \lim_{h \to 0} \left( h^{-1} [f(t + h) - f(t)] \right) = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h}
\]
exists in $X$. The limit $L$, if it exists, will be denoted by $\dot f(t)$ or $\frac{df}{dt}(t)$. We also say that $f \in C^1((a, b) \to X)$ if $f$ is differentiable at all points $t \in (a, b)$ and $\dot f \in C((a, b) \to X)$.
As for the case of real valued functions, the derivative operator $\frac{d}{dt}$ is easily seen to be linear. The next two results have proofs very similar to their real valued function analogues.
Lemma 10.10 (Product Rules). Suppose that $t \to U(t) \in L(X)$, $t \to V(t) \in L(X)$ and $t \to x(t) \in X$ are differentiable at $t = t_0$, then
1. $\frac{d}{dt}\big|_{t_0} [U(t)\, x(t)] \in X$ exists and
\[
\frac{d}{dt}\Big|_{t_0} [U(t)\, x(t)] = \left[ \dot U(t_0)\, x(t_0) + U(t_0)\, \dot x(t_0) \right]
\]
and
2. $\frac{d}{dt}\big|_{t_0} [U(t)\, V(t)] \in L(X)$ exists and
\[
\frac{d}{dt}\Big|_{t_0} [U(t)\, V(t)] = \left[ \dot U(t_0)\, V(t_0) + U(t_0)\, \dot V(t_0) \right].
\]
3. If $U(t_0)$ is invertible, then $t \to U(t)^{-1}$ is differentiable at $t = t_0$ and
\[
\frac{d}{dt}\Big|_{t_0} U(t)^{-1} = -U(t_0)^{-1}\, \dot U(t_0)\, U(t_0)^{-1}. \tag{10.6}
\]
Proof. The reader is asked to supply the proof of the first two items in Exercise 10.9. Before proving item 3., let us assume that $U(t)^{-1}$ is differentiable; then using the product rule we would learn
\[
0 = \frac{d}{dt}\Big|_{t_0} I = \frac{d}{dt}\Big|_{t_0} \left[ U(t)^{-1} U(t) \right] = \left[ \frac{d}{dt}\Big|_{t_0} U(t)^{-1} \right] U(t_0) + U(t_0)^{-1}\, \dot U(t_0).
\]
Solving this equation for $\frac{d}{dt}\big|_{t_0} U(t)^{-1}$ gives the formula in Eq. (10.6). The problem with this argument is that we have not yet shown $t \to U(t)^{-1}$ is differentiable at $t_0$. Here is the formal proof. Since $U(t)$ is differentiable at $t_0$, $U(t) \to U(t_0)$ as $t \to t_0$ and by Corollary 7.22, $U(t_0 + h)$ is invertible for $h$ near $0$ and
\[
U(t_0 + h)^{-1} \to U(t_0)^{-1} \text{ as } h \to 0.
\]
Therefore, using Lemma 7.11, we may let $h \to 0$ in the identity
\[
\frac{U(t_0 + h)^{-1} - U(t_0)^{-1}}{h} = U(t_0 + h)^{-1} \left( \frac{U(t_0) - U(t_0 + h)}{h} \right) U(t_0)^{-1}
\]
to learn
\[
\lim_{h \to 0} \frac{U(t_0 + h)^{-1} - U(t_0)^{-1}}{h} = -U(t_0)^{-1}\, \dot U(t_0)\, U(t_0)^{-1}.
\]
Proposition 10.11 (Chain Rule). Suppose $s \to x(s) \in X$ is differentiable at $s = s_0$ and $t \to T(t) \in \mathbb{R}$ is differentiable at $t = t_0$ and $T(t_0) = s_0$, then $t \to x(T(t))$ is differentiable at $t_0$ and
\[
\frac{d}{dt}\Big|_{t_0} x(T(t)) = x'(T(t_0))\, T'(t_0).
\]
The proof of the chain rule is essentially the same as the real valued function case, see Exercise 10.10.
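Proposition 10.12. Suppose that $f : [a, b] \to X$ is a continuous function such that $\dot f(t)$ exists and is equal to zero for $t \in (a, b)$. Then $f$ is constant.
Proof. Let $\varepsilon > 0$ and $\alpha \in (a, b)$ be given. (We will later let $\varepsilon \downarrow 0$.) By the definition of the derivative, for all $\tau \in (a, b)$ there exists $\delta_\tau > 0$ such that
\[
\|f(t) - f(\tau)\| = \left\| f(t) - f(\tau) - \dot f(\tau)(t - \tau) \right\| \le \varepsilon |t - \tau| \text{ if } |t - \tau| < \delta_\tau. \tag{10.7}
\]
Let
\[
A = \{t \in [\alpha, b] : \|f(t) - f(\alpha)\| \le \varepsilon (t - \alpha)\} \tag{10.8}
\]
and $t_0$ be the least upper bound for $A$. We will now use a standard argument which is referred to as continuous induction to show $t_0 = b$. Eq. (10.7) with $\tau = \alpha$ shows $t_0 > \alpha$ and a simple continuity argument shows $t_0 \in A$, i.e.
\[
\|f(t_0) - f(\alpha)\| \le \varepsilon (t_0 - \alpha). \tag{10.9}
\]
For the sake of contradiction, suppose that $t_0 < b$. By Eqs. (10.7) and (10.9),
\[
\|f(t) - f(\alpha)\| \le \|f(t) - f(t_0)\| + \|f(t_0) - f(\alpha)\| \le \varepsilon (t_0 - \alpha) + \varepsilon (t - t_0) = \varepsilon (t - \alpha)
\]
for $0 \le t - t_0 < \delta_{t_0}$, which violates the definition of $t_0$ being an upper bound. Thus we have shown $b \in A$ and hence
\[
\|f(b) - f(\alpha)\| \le \varepsilon (b - \alpha).
\]
Since $\varepsilon > 0$ was arbitrary we may let $\varepsilon \downarrow 0$ in the last equation to conclude $f(b) = f(\alpha)$. Since $\alpha \in (a, b)$ was arbitrary it follows that $f(b) = f(\alpha)$ for all $\alpha \in (a, b]$ and then by continuity for all $\alpha \in [a, b]$, i.e. $f$ is constant.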
Remark 10.13. The usual real variable proof of Proposition 10.12 makes use of Rolle's theorem, which in turn uses the extreme value theorem. This latter theorem is not available to vector valued functions. However with the aid of the Hahn Banach Theorem 25.4 below and Lemma 10.7, it is possible to reduce the proof of Proposition 10.12 and the proof of the Fundamental Theorem of Calculus 10.14 to the real valued case, see Exercise 25.4.
Theorem 10.14 (Fundamental Theorem of Calculus). Suppose that $f \in C([a, b], X)$. Then
1. $\frac{d}{dt} \int_a^t f(\tau)\, d\tau = f(t)$ for all $t \in (a, b)$.
2. Now assume that $F \in C([a, b], X)$, $F$ is continuously differentiable on $(a, b)$ (i.e. $\dot F(t)$ exists and is continuous for $t \in (a, b)$) and $\dot F$ extends to a continuous function on $[a, b]$ which is still denoted by $\dot F$. Then
\[
\int_a^b \dot F(t)\, dt = F(b) - F(a).
\]
Proof. Let $h > 0$ be a small number and consider
\[
\left\| \int_a^{t+h} f(\tau)\, d\tau - \int_a^t f(\tau)\, d\tau - f(t) h \right\| = \left\| \int_t^{t+h} (f(\tau) - f(t))\, d\tau \right\| \le \int_t^{t+h} \|f(\tau) - f(t)\|\, d\tau \le h\, \varepsilon(h),
\]
where $\varepsilon(h) := \max_{\tau \in [t, t+h]} \|f(\tau) - f(t)\|$. Combining this with a similar computation when $h < 0$ shows, for all $h \in \mathbb{R}$ sufficiently small, that
\[
\left\| \int_a^{t+h} f(\tau)\, d\tau - \int_a^t f(\tau)\, d\tau - f(t) h \right\| \le |h|\, \varepsilon(h),
\]
where now $\varepsilon(h) := \max_{\tau \in [t - |h|, t + |h|]} \|f(\tau) - f(t)\|$. By continuity of $f$ at $t$, $\varepsilon(h) \to 0$ and hence $\frac{d}{dt} \int_a^t f(\tau)\, d\tau$ exists and is equal to $f(t)$. For the second item, set $G(t) := \int_a^t \dot F(\tau)\, d\tau - F(t)$. Then $G$ is continuous by Lemma 10.7 and $\dot G(t) = 0$ for all $t \in (a, b)$ by item 1. An application of Proposition 10.12 shows $G$ is a constant and in particular $G(b) = G(a)$, i.e. $\int_a^b \dot F(\tau)\, d\tau - F(b) = -F(a)$.
Corollary 10.15 (Mean Value Inequality). Suppose that $f : [a, b] \to X$ is a continuous function such that $\dot f(t)$ exists for $t \in (a, b)$ and $\dot f$ extends to a continuous function on $[a, b]$. Then
\[
\|f(b) - f(a)\| \le \int_a^b \|\dot f(t)\|\, dt \le (b - a) \left\| \dot f \right\|_\infty. \tag{10.10}
\]
Proof. By the fundamental theorem of calculus, $f(b) - f(a) = \int_a^b \dot f(t)\, dt$ and then by Lemma 10.7,
\[
\|f(b) - f(a)\| = \left\| \int_a^b \dot f(t)\, dt \right\| \le \int_a^b \|\dot f(t)\|\, dt \le \int_a^b \left\| \dot f \right\|_\infty dt = (b - a) \left\| \dot f \right\|_\infty.
\]
Corollary 10.16 (Change of Variable Formula). Suppose that $f \in C([a, b], X)$ and $T : [c, d] \to (a, b)$ is a continuous function such that $T(s)$ is continuously differentiable for $s \in (c, d)$ and $T'(s)$ extends to a continuous function on $[c, d]$. Then
\[
\int_c^d f(T(s))\, T'(s)\, ds = \int_{T(c)}^{T(d)} f(t)\, dt.
\]
Proof. For $t \in (a, b)$ define $F(t) := \int_{T(c)}^t f(\tau)\, d\tau$. Then $F \in C^1((a, b), X)$ and by the fundamental theorem of calculus and the chain rule,
\[
\frac{d}{ds} F(T(s)) = F'(T(s))\, T'(s) = f(T(s))\, T'(s).
\]
Integrating this equation on $s \in [c, d]$ and using the fundamental theorem of calculus again gives
\[
\int_c^d f(T(s))\, T'(s)\, ds = F(T(d)) - F(T(c)) = \int_{T(c)}^{T(d)} f(t)\, dt.
\]
10.2 Integral Operators as Examples of Bounded
Operators
In the examples to follow all integrals are the standard Riemann integrals and
we will make use of the following notation.
Notation 10.17 Given an open set $U \subset \mathbb{R}^d$, let $C_c(U)$ denote the collection of real valued continuous functions $f$ on $U$ such that
\[
\operatorname{supp}(f) := \overline{\{x \in U : f(x) \neq 0\}}
\]
is a compact subset of $U$.
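Example 10.18. Suppose that $K : [0, 1] \times [0, 1] \to \mathbb{C}$ is a continuous function. For $f \in C([0, 1])$, let
\[
Tf(x) = \int_0^1 K(x, y) f(y)\, dy.
\]
Since
\[
|Tf(x) - Tf(z)| \le \int_0^1 |K(x, y) - K(z, y)|\, |f(y)|\, dy \le \|f\|_\infty \max_y |K(x, y) - K(z, y)| \tag{10.11}
\]
and the latter expression tends to $0$ as $x \to z$ by uniform continuity of $K$, it follows that $Tf \in C([0, 1])$ and, by the linearity of the Riemann integral, $T : C([0, 1]) \to C([0, 1])$ is a linear map. Moreover,
\[
|Tf(x)| \le \int_0^1 |K(x, y)|\, |f(y)|\, dy \le \int_0^1 |K(x, y)|\, dy \cdot \|f\|_\infty \le A \|f\|_\infty
\]
where
\[
A := \sup_{x \in [0, 1]} \int_0^1 |K(x, y)|\, dy < \infty. \tag{10.12}
\]
This shows $\|T\| \le A < \infty$ and therefore $T$ is bounded. We may in fact show $\|T\| = A$. To do this let $x_0 \in [0, 1]$ be such that
\[
\sup_{x \in [0, 1]} \int_0^1 |K(x, y)|\, dy = \int_0^1 |K(x_0, y)|\, dy.
\]
Such an $x_0$ can be found since, using a similar argument to that in Eq. (10.11), $x \to \int_0^1 |K(x, y)|\, dy$ is continuous. Given $\varepsilon > 0$, let
\[
f_\varepsilon(y) := \frac{\overline{K(x_0, y)}}{\sqrt{\varepsilon + |K(x_0, y)|^2}}
\]
and notice that $\lim_{\varepsilon \downarrow 0} \|f_\varepsilon\|_\infty = 1$ and
\[
\|T f_\varepsilon\|_\infty \ge |T f_\varepsilon(x_0)| = T f_\varepsilon(x_0) = \int_0^1 \frac{|K(x_0, y)|^2}{\sqrt{\varepsilon + |K(x_0, y)|^2}}\, dy.
\]
Therefore,
\[
\|T\| \ge \lim_{\varepsilon \downarrow 0} \frac{1}{\|f_\varepsilon\|_\infty} \int_0^1 \frac{|K(x_0, y)|^2}{\sqrt{\varepsilon + |K(x_0, y)|^2}}\, dy = \lim_{\varepsilon \downarrow 0} \int_0^1 \frac{|K(x_0, y)|^2}{\sqrt{\varepsilon + |K(x_0, y)|^2}}\, dy = A
\]
since
\[
0 \le |K(x_0, y)| - \frac{|K(x_0, y)|^2}{\sqrt{\varepsilon + |K(x_0, y)|^2}} = \frac{|K(x_0, y)|}{\sqrt{\varepsilon + |K(x_0, y)|^2}} \left[ \sqrt{\varepsilon + |K(x_0, y)|^2} - |K(x_0, y)| \right] \le \sqrt{\varepsilon + |K(x_0, y)|^2} - |K(x_0, y)|
\]
and the latter expression tends to zero uniformly in $y$ as $\varepsilon \downarrow 0$.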
We may also consider other norms on $C([0, 1])$. Let (for now) $L^1([0, 1])$ denote $C([0, 1])$ with the norm
\[
\|f\|_1 = \int_0^1 |f(x)|\, dx,
\]
then $T : L^1([0, 1], dm) \to C([0, 1])$ is bounded as well. Indeed, let $M = \sup\{|K(x, y)| : x, y \in [0, 1]\}$, then
\[
|(Tf)(x)| \le \int_0^1 |K(x, y) f(y)|\, dy \le M \|f\|_1,
\]
which shows $\|Tf\|_\infty \le M \|f\|_1$ and hence
\[
\|T\|_{L^1 \to C} \le \max\{|K(x, y)| : x, y \in [0, 1]\} < \infty.
\]
We can in fact show that $\|T\| = M$ as follows. Let $(x_0, y_0) \in [0, 1]^2$ satisfy $|K(x_0, y_0)| = M$. Then given $\varepsilon > 0$, there exists a neighborhood $U = I \times J$ of $(x_0, y_0)$ such that $|K(x, y) - K(x_0, y_0)| < \varepsilon$ for all $(x, y) \in U$. Let $f \in C_c(J, [0, \infty))$ be such that $\int_0^1 f(x)\, dx = 1$. Choose $\alpha \in \mathbb{C}$ such that $|\alpha| = 1$ and $\alpha K(x_0, y_0) = M$, then
\[
|(Tf)(x_0)| = \left| \int_0^1 K(x_0, y) f(y)\, dy \right| = \left| \int_J K(x_0, y) f(y)\, dy \right| \ge \operatorname{Re} \int_J \alpha K(x_0, y) f(y)\, dy \ge \int_J (M - \varepsilon) f(y)\, dy = (M - \varepsilon) \|f\|_{L^1}
\]
and hence
\[
\|Tf\|_C \ge (M - \varepsilon) \|f\|_{L^1},
\]
showing that $\|T\| \ge M - \varepsilon$. Since $\varepsilon > 0$ is arbitrary, we learn that $\|T\| \ge M$ and hence $\|T\| = M$.
One may also view $T$ as a map $T : C([0, 1]) \to L^1([0, 1])$, in which case one may show
\[
\|T\|_{C \to L^1} \le \int_0^1 \max_y |K(x, y)|\, dx < \infty.
\]
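The identity $\|T\| = A$ for $T : C([0,1]) \to C([0,1])$ can be checked numerically on a discretized kernel. The sketch below is an illustration only: the kernel $K(x, y) = x - y$, the midpoint quadrature, the grid size, and the value of $\varepsilon$ are all assumptions chosen for this example. It computes the discrete analogue of $A$ from Eq. (10.12) and then evaluates the ratio $|Tf_\varepsilon(x_0)| / \|f_\varepsilon\|_\infty$ for the test function $f_\varepsilon$ used in the example.

```python
import math

def K(x, y):
    return x - y                      # hypothetical continuous kernel on [0, 1]^2

N = 400
ys = [(j + 0.5) / N for j in range(N)]   # midpoint quadrature nodes in y
xs = [i / N for i in range(N + 1)]       # sample points in x

def apply_T(f, x):
    """Midpoint-rule approximation of (Tf)(x) = int_0^1 K(x, y) f(y) dy."""
    return sum(K(x, y) * f(y) for y in ys) / N

# Discrete analogue of A = sup_x int_0^1 |K(x, y)| dy and a maximizer x0.
row_integrals = [sum(abs(K(x, y)) for y in ys) / N for x in xs]
A = max(row_integrals)
x0 = xs[row_integrals.index(A)]

eps = 1e-6
def f_eps(y):
    k = K(x0, y)
    return k / math.sqrt(eps + k * k)    # the kernel is real, so conj(K) = K

sup_f = max(abs(f_eps(y)) for y in ys)
ratio = abs(apply_T(f_eps, x0)) / sup_f
print(A, ratio)
```

For this kernel $A = 1/2$, and the printed ratio should sit just below $A$, in line with the argument that $f_\varepsilon$ nearly attains the operator norm as $\varepsilon \downarrow 0$.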
10.3 Linear Ordinary Differential Equations
Let $X$ be a Banach space, $J = (a, b) \subset \mathbb{R}$ be an open interval with $0 \in J$, $h \in C(J \to X)$ and $A \in C(J \to L(X))$. In this section we are going to consider the ordinary differential equation
\[
\dot y(t) = A(t) y(t) + h(t) \text{ where } y(0) = x \in X, \tag{10.13}
\]
where $y$ is an unknown function in $C^1(J \to X)$. This equation may be written in its equivalent (as the reader should verify) integral form, namely we are looking for $y \in C(J, X)$ such that
\[
y(t) = x + \int_0^t h(\tau)\, d\tau + \int_0^t A(\tau) y(\tau)\, d\tau. \tag{10.14}
\]
In what follows, we will abuse notation and use $\|\cdot\|$ to denote the operator norm on $L(X)$ associated to the norm, $\|\cdot\|$, on $X$ and let $\|\varphi\|_\infty := \sup_{t \in J} \|\varphi(t)\|$ for $\varphi \in BC(J, X)$ or $BC(J, L(X))$.
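Notation 10.19 For $t \in \mathbb{R}$ and $n \in \mathbb{N}$, let
\[
\Delta_n(t) = \begin{cases} \{(\tau_1, \dots, \tau_n) \in \mathbb{R}^n : 0 \le \tau_1 \le \dots \le \tau_n \le t\} & \text{if } t \ge 0 \\ \{(\tau_1, \dots, \tau_n) \in \mathbb{R}^n : t \le \tau_n \le \dots \le \tau_1 \le 0\} & \text{if } t \le 0 \end{cases}
\]
and also write $d\tau = d\tau_1 \dots d\tau_n$ and
\[
\int_{\Delta_n(t)} f(\tau_1, \dots, \tau_n)\, d\tau := (-1)^{n \cdot 1_{t < 0}} \int_0^t d\tau_n \int_0^{\tau_n} d\tau_{n-1} \dots \int_0^{\tau_2} d\tau_1\, f(\tau_1, \dots, \tau_n).
\]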
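Lemma 10.20. Suppose that $\psi \in C(\mathbb{R}, \mathbb{R})$, then
\[
(-1)^{n \cdot 1_{t < 0}} \int_{\Delta_n(t)} \psi(\tau_1) \dots \psi(\tau_n)\, d\tau = \frac{1}{n!} \left( \int_0^t \psi(\tau)\, d\tau \right)^n. \tag{10.15}
\]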
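Proof. Let $\Psi(t) := \int_0^t \psi(\tau)\, d\tau$. The proof will go by induction on $n$. The case $n = 1$ is easily verified since
\[
(-1)^{1 \cdot 1_{t < 0}} \int_{\Delta_1(t)} \psi(\tau_1)\, d\tau_1 = \int_0^t \psi(\tau)\, d\tau = \Psi(t).
\]
Now assume the truth of Eq. (10.15) for $n - 1$ for some $n \ge 2$, then
\[
\begin{aligned}
(-1)^{n \cdot 1_{t < 0}} \int_{\Delta_n(t)} \psi(\tau_1) \dots \psi(\tau_n)\, d\tau
&= \int_0^t d\tau_n \int_0^{\tau_n} d\tau_{n-1} \dots \int_0^{\tau_2} d\tau_1\, \psi(\tau_1) \dots \psi(\tau_n) \\
&= \int_0^t d\tau_n\, \frac{\Psi^{n-1}(\tau_n)}{(n - 1)!}\, \psi(\tau_n) = \int_0^t d\tau_n\, \frac{\Psi^{n-1}(\tau_n)}{(n - 1)!}\, \dot\Psi(\tau_n) \\
&= \int_0^{\Psi(t)} \frac{u^{n-1}}{(n - 1)!}\, du = \frac{\Psi^n(t)}{n!},
\end{aligned}
\]
wherein we made the change of variables, $u = \Psi(\tau_n)$, in the second to last equality.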
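A quick numerical sanity check of Eq. (10.15) follows the same recursion as the induction above: with $I_0 \equiv 1$ and $I_k(s) = \int_0^s \psi(\tau) I_{k-1}(\tau)\, d\tau$, the iterated integral is $I_n(t)$. The sketch below assumes $\psi = \cos$, $n = 3$ and a negative endpoint $t = -1.5$ purely for illustration, and uses a cumulative trapezoid rule; the function name `iterated_integral` is hypothetical.

```python
import math

def iterated_integral(psi, t, n, steps=4000):
    """Approximate int_0^t dτ_n int_0^{τ_n} dτ_{n-1} ... int_0^{τ_2} dτ_1 psi(τ_1)...psi(τ_n)."""
    h = t / steps                          # signed step, so t < 0 gives an oriented integral
    grid = [k * h for k in range(steps + 1)]
    inner = [1.0] * (steps + 1)            # I_0 is identically 1
    for _ in range(n):
        g = [psi(s) * v for s, v in zip(grid, inner)]
        acc, out = 0.0, [0.0]
        for a, b in zip(g, g[1:]):
            acc += 0.5 * h * (a + b)       # cumulative trapezoid rule
            out.append(acc)
        inner = out                        # I_k(s) = int_0^s psi(τ) I_{k-1}(τ) dτ
    return inner[-1]

t, n = -1.5, 3
lhs = iterated_integral(math.cos, t, n)
rhs = math.sin(t) ** n / math.factorial(n)   # (int_0^t cos)^n / n!
print(lhs, rhs)
```

The oriented iterated integral computed here is the left side of Eq. (10.15), sign factor included, so the two printed numbers should agree up to quadrature error.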
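Remark 10.21. Eq. (10.15) is equivalent to
\[
\int_{\Delta_n(t)} \psi(\tau_1) \dots \psi(\tau_n)\, d\tau = \frac{1}{n!} \left( \int_{\Delta_1(t)} \psi(\tau)\, d\tau \right)^n
\]
and another way to understand this equality is to view $\int_{\Delta_n(t)} \psi(\tau_1) \dots \psi(\tau_n)\, d\tau$ as a multiple integral (see Chapter 20 below) rather than an iterated integral. Indeed, taking $t > 0$ for simplicity and letting $S_n$ be the permutation group on $\{1, 2, \dots, n\}$ we have
\[
[0, t]^n = \bigcup_{\sigma \in S_n} \{(\tau_1, \dots, \tau_n) \in \mathbb{R}^n : 0 \le \tau_{\sigma 1} \le \dots \le \tau_{\sigma n} \le t\}
\]
with the union being essentially disjoint. Therefore, making a change of variables and using the fact that $\psi(\tau_1) \dots \psi(\tau_n)$ is invariant under permutations, we find
\[
\begin{aligned}
\left( \int_0^t \psi(\tau)\, d\tau \right)^n &= \int_{[0, t]^n} \psi(\tau_1) \dots \psi(\tau_n)\, d\tau \\
&= \sum_{\sigma \in S_n} \int_{\{(\tau_1, \dots, \tau_n) \in \mathbb{R}^n : 0 \le \tau_{\sigma 1} \le \dots \le \tau_{\sigma n} \le t\}} \psi(\tau_1) \dots \psi(\tau_n)\, d\tau \\
&= \sum_{\sigma \in S_n} \int_{\{(s_1, \dots, s_n) \in \mathbb{R}^n : 0 \le s_1 \le \dots \le s_n \le t\}} \psi(s_{\sigma^{-1} 1}) \dots \psi(s_{\sigma^{-1} n})\, ds \\
&= \sum_{\sigma \in S_n} \int_{\{(s_1, \dots, s_n) \in \mathbb{R}^n : 0 \le s_1 \le \dots \le s_n \le t\}} \psi(s_1) \dots \psi(s_n)\, ds \\
&= n! \int_{\Delta_n(t)} \psi(\tau_1) \dots \psi(\tau_n)\, d\tau.
\end{aligned}
\]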
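Theorem 10.22. Let $\varphi \in BC(J, X)$, then the integral equation
\[
y(t) = \varphi(t) + \int_0^t A(\tau) y(\tau)\, d\tau \tag{10.16}
\]
has a unique solution given by
\[
y(t) = \varphi(t) + \sum_{n=1}^\infty (-1)^{n \cdot 1_{t < 0}} \int_{\Delta_n(t)} A(\tau_n) \dots A(\tau_1) \varphi(\tau_1)\, d\tau \tag{10.17}
\]
and this solution satisfies the bound
\[
\|y\|_\infty \le \|\varphi\|_\infty\, e^{\int_J \|A(\tau)\|\, d\tau}.
\]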
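Proof. Define $\Lambda : BC(J, X) \to BC(J, X)$ by
\[
(\Lambda y)(t) = \int_0^t A(\tau) y(\tau)\, d\tau.
\]
Then $y$ solves Eq. (10.16) iff $y = \varphi + \Lambda y$ or equivalently iff $(I - \Lambda) y = \varphi$. An induction argument shows
\[
\begin{aligned}
(\Lambda^n \varphi)(t) &= \int_0^t d\tau_n\, A(\tau_n) (\Lambda^{n-1} \varphi)(\tau_n) \\
&= \int_0^t d\tau_n \int_0^{\tau_n} d\tau_{n-1}\, A(\tau_n) A(\tau_{n-1}) (\Lambda^{n-2} \varphi)(\tau_{n-1}) \\
&\;\;\vdots \\
&= \int_0^t d\tau_n \int_0^{\tau_n} d\tau_{n-1} \dots \int_0^{\tau_2} d\tau_1\, A(\tau_n) \dots A(\tau_1) \varphi(\tau_1) \\
&= (-1)^{n \cdot 1_{t < 0}} \int_{\Delta_n(t)} A(\tau_n) \dots A(\tau_1) \varphi(\tau_1)\, d\tau.
\end{aligned}
\]
Taking norms of this equation and using the triangle inequality along with Lemma 10.20 gives
\[
\|(\Lambda^n \varphi)(t)\| \le \|\varphi\|_\infty \int_{\Delta_n(t)} \|A(\tau_n)\| \dots \|A(\tau_1)\|\, d\tau \le \|\varphi\|_\infty \frac{1}{n!} \left( \int_{\Delta_1(t)} \|A(\tau)\|\, d\tau \right)^n \le \|\varphi\|_\infty \frac{1}{n!} \left( \int_J \|A(\tau)\|\, d\tau \right)^n.
\]
Therefore,
\[
\|\Lambda^n\|_{op} \le \frac{1}{n!} \left( \int_J \|A(\tau)\|\, d\tau \right)^n \tag{10.18}
\]
and
\[
\sum_{n=0}^\infty \|\Lambda^n\|_{op} \le e^{\int_J \|A(\tau)\|\, d\tau} < \infty,
\]
where $\|\cdot\|_{op}$ denotes the operator norm on $L(BC(J, X))$. An application of Proposition 7.21 now shows $(I - \Lambda)^{-1} = \sum_{n=0}^\infty \Lambda^n$ exists and
\[
\left\| (I - \Lambda)^{-1} \right\|_{op} \le e^{\int_J \|A(\tau)\|\, d\tau}.
\]
It is now only a matter of working through the notation to see that these assertions prove the theorem.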
Corollary 10.23. Suppose $h \in C(J \to X)$ and $x \in X$, then there exists a unique solution, $y \in C^1(J, X)$, to the linear ordinary differential Eq. (10.13).
Proof. Let
\[
\varphi(t) = x + \int_0^t h(\tau)\, d\tau.
\]
By applying Theorem 10.22 with this $\varphi$ and with $J$ replaced by any open interval $J_0$ such that $0 \in J_0$ and $\bar J_0$ is a compact subinterval$^1$ of $J$, there exists a unique solution $y_{J_0}$ to Eq. (10.13) which is valid for $t \in J_0$. By uniqueness of solutions, if $J_1$ is a subinterval of $J$ such that $J_0 \subset J_1$ and $\bar J_1$ is a compact subinterval of $J$, we have $y_{J_1} = y_{J_0}$ on $J_0$. Because of this observation, we may construct a solution $y$ to Eq. (10.13) which is defined on the full interval $J$ by setting $y(t) = y_{J_0}(t)$ for any $J_0$ as above which also contains $t \in J$.
Corollary 10.24. Suppose that $A \in L(X)$ is independent of time, then the solution to
\[
\dot y(t) = A y(t) \text{ with } y(0) = x
\]
is given by $y(t) = e^{tA} x$ where
\[
e^{tA} = \sum_{n=0}^\infty \frac{t^n}{n!} A^n. \tag{10.19}
\]
Moreover,
\[
e^{(t+s)A} = e^{tA} e^{sA} \text{ for all } s, t \in \mathbb{R}. \tag{10.20}
\]
Proof. The first assertion is a simple consequence of Eq. (10.17) and Lemma 10.20 with $\psi = 1$. The assertion in Eq. (10.20) may be proved by explicit computation but the following proof is more instructive. Given $x \in X$, let $y(t) := e^{(t+s)A} x$. By the chain rule,
\[
\frac{d}{dt} y(t) = \frac{d}{d\tau}\Big|_{\tau = t+s} e^{\tau A} x = A e^{\tau A} x \big|_{\tau = t+s} = A e^{(t+s)A} x = A y(t) \text{ with } y(0) = e^{sA} x.
\]
The unique solution to this equation is given by
\[
y(t) = e^{tA} y(0) = e^{tA} e^{sA} x.
\]
This completes the proof since, by definition, $y(t) = e^{(t+s)A} x$.
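The series in Eq. (10.19) and the group law in Eq. (10.20) are easy to test numerically in finite dimensions. The sketch below is an illustration under assumptions: $X = \mathbb{R}^2$, the matrix $A$ is the generator of plane rotations, and the series is truncated after 40 terms; the helper names `mat_mul` and `exp_tA` are hypothetical.

```python
import math

def mat_mul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def exp_tA(A, t, terms=40):
    """Partial sum of the series sum_n (t^n / n!) A^n from Eq. (10.19)."""
    d = len(A)
    term = [[float(i == j) for j in range(d)] for i in range(d)]   # n = 0 term: identity
    total = [row[:] for row in term]
    for n in range(1, terms):
        term = [[t / n * v for v in row] for row in mat_mul(term, A)]   # now (tA)^n / n!
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, term)]
    return total

A = [[0.0, 1.0], [-1.0, 0.0]]        # generator of rotations; e^{tA} is a rotation matrix
s, t = 0.7, 1.1
lhs = exp_tA(A, s + t)
rhs = mat_mul(exp_tA(A, t), exp_tA(A, s))
print(lhs[0][0], rhs[0][0], math.cos(s + t))
```

For this particular $A$ the exponential is $\begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}$, so Eq. (10.20) reduces to the angle addition formulas.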
We also have the following converse to this corollary whose proof is outlined in Exercise 10.20 below.
Theorem 10.25. Suppose that $T_t \in L(X)$ for $t \ge 0$ satisfies
1. (Semi-group property.) $T_0 = \mathrm{Id}_X$ and $T_t T_s = T_{t+s}$ for all $s, t \ge 0$.
2. (Norm Continuity) $t \to T_t$ is continuous at $0$, i.e. $\|T_t - I\|_{L(X)} \to 0$ as $t \downarrow 0$.
Then there exists $A \in L(X)$ such that $T_t = e^{tA}$ where $e^{tA}$ is defined in Eq. (10.19).
$^1$ We do this so that $\varphi|_{J_0}$ will be bounded.
10.4 Classical Weierstrass Approximation Theorem
Definition 10.26 (Support). Let $f : X \to Z$ be a function from a metric space $(X, \rho)$ to a vector space $Z$. The support of $f$ is the closed subset, $\operatorname{supp}(f)$, of $X$ defined by
\[
\operatorname{supp}(f) := \overline{\{x \in X : f(x) \neq 0\}}.
\]
Example 10.27. For example if $f : \mathbb{R} \to \mathbb{R}$ is defined by $f(x) = \sin(x) 1_{[0, 4\pi]}(x) \in \mathbb{R}$, then
\[
\{f \neq 0\} = (0, 4\pi) \setminus \{\pi, 2\pi, 3\pi\}
\]
and therefore $\operatorname{supp}(f) = [0, 4\pi]$.
For the remainder of this section, $Z$ will be used to denote a Banach space.
Definition 10.28 (Convolution). For $f, g \in C(\mathbb{R})$ with either $f$ or $g$ having compact support, we define the convolution of $f$ and $g$ by
\[
f * g(x) = \int_{\mathbb{R}} f(x - y) g(y)\, dy = \int_{\mathbb{R}} f(y) g(x - y)\, dy.
\]
We will also use this definition when one of the functions, either $f$ or $g$, takes values in a Banach space $Z$.
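Lemma 10.29 (Approximate $\delta$-sequences). Suppose that $\{q_n\}_{n=1}^\infty$ is a sequence of non-negative continuous real valued functions on $\mathbb{R}$ with compact support that satisfy
\[
\int_{\mathbb{R}} q_n(x)\, dx = 1 \text{ and} \tag{10.21}
\]
\[
\lim_{n\to\infty} \int_{|x| \ge \varepsilon} q_n(x)\, dx = 0 \text{ for all } \varepsilon > 0. \tag{10.22}
\]
If $f \in BC(\mathbb{R}, Z)$, then
\[
q_n * f(x) := \int_{\mathbb{R}} q_n(y) f(x - y)\, dy
\]
converges to $f$ uniformly on compact subsets of $\mathbb{R}$.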
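Proof. Let $x \in \mathbb{R}$, then because of Eq. (10.21),
\[
\|q_n * f(x) - f(x)\| = \left\| \int_{\mathbb{R}} q_n(y) (f(x - y) - f(x))\, dy \right\| \le \int_{\mathbb{R}} q_n(y) \|f(x - y) - f(x)\|\, dy.
\]
Let $M = \sup\{\|f(x)\| : x \in \mathbb{R}\}$. Then for any $\varepsilon > 0$, using Eq. (10.21),
\[
\begin{aligned}
\|q_n * f(x) - f(x)\| &\le \int_{|y| \le \varepsilon} q_n(y) \|f(x - y) - f(x)\|\, dy + \int_{|y| > \varepsilon} q_n(y) \|f(x - y) - f(x)\|\, dy \\
&\le \sup_{|w| \le \varepsilon} \|f(x + w) - f(x)\| + 2M \int_{|y| > \varepsilon} q_n(y)\, dy.
\end{aligned}
\]
So if $K$ is a compact subset of $\mathbb{R}$ (for example a large interval) we have
\[
\sup_{x \in K} \|q_n * f(x) - f(x)\| \le \sup_{|w| \le \varepsilon,\, x \in K} \|f(x + w) - f(x)\| + 2M \int_{|y| > \varepsilon} q_n(y)\, dy
\]
and hence by Eq. (10.22),
\[
\limsup_{n\to\infty} \sup_{x \in K} \|q_n * f(x) - f(x)\| \le \sup_{|w| \le \varepsilon,\, x \in K} \|f(x + w) - f(x)\|.
\]
This finishes the proof since the right member of this equation tends to $0$ as $\varepsilon \downarrow 0$ by uniform continuity of $f$ on compact subsets of $\mathbb{R}$.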
Let $q_n : \mathbb{R} \to [0, \infty)$ be defined by
\[
q_n(x) := \frac{1}{c_n} (1 - x^2)^n 1_{|x| \le 1} \text{ where } c_n := \int_{-1}^1 (1 - x^2)^n\, dx. \tag{10.23}
\]
Figure 10.2 displays the key features of the functions $q_n$.
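Lemma 10.30. The sequence $\{q_n\}_{n=1}^\infty$ is an approximate $\delta$-sequence, i.e. they satisfy Eqs. (10.21) and (10.22).
Proof. By construction, $q_n \in C_c(\mathbb{R}, [0, \infty))$ for each $n$ and Eq. (10.21) holds. Since
\[
\begin{aligned}
\int_{|x| \ge \varepsilon} q_n(x)\, dx &= \frac{2 \int_\varepsilon^1 (1 - x^2)^n\, dx}{2 \int_0^\varepsilon (1 - x^2)^n\, dx + 2 \int_\varepsilon^1 (1 - x^2)^n\, dx} \\
&\le \frac{\int_\varepsilon^1 \frac{x}{\varepsilon} (1 - x^2)^n\, dx}{\int_0^\varepsilon \frac{x}{\varepsilon} (1 - x^2)^n\, dx} = \frac{(1 - x^2)^{n+1} \big|_\varepsilon^1}{(1 - x^2)^{n+1} \big|_0^\varepsilon} = \frac{(1 - \varepsilon^2)^{n+1}}{1 - (1 - \varepsilon^2)^{n+1}} \to 0 \text{ as } n \to \infty,
\end{aligned}
\]
the proof is complete.
Fig. 10.2. A plot of $q_1$, $q_{50}$, and $q_{100}$. The most peaked curve is $q_{100}$ and the least is $q_1$. The total area under each of these curves is one.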
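Notation 10.31 Let $\mathbb{Z}_+ := \mathbb{N} \cup \{0\}$ and for $x \in \mathbb{R}^d$ and $\alpha \in \mathbb{Z}_+^d$ let $x^\alpha = \prod_{i=1}^{d} x_i^{\alpha_i}$ and $|\alpha| = \sum_{i=1}^{d} \alpha_i$. A polynomial on $\mathbb{R}^d$ with values in $Z$ is a function $p : \mathbb{R}^d \to Z$ of the form
$$p(x) = \sum_{\alpha : |\alpha| \le N} p_\alpha x^\alpha \quad\text{with } p_\alpha \in Z \text{ and } N \in \mathbb{Z}_+.$$
If $p_\alpha \ne 0$ for some $\alpha$ such that $|\alpha| = N$, then we define $\deg(p) := N$ to be the degree of $p$. If $Z$ is a complex Banach space, the function $p$ has a natural extension to $z \in \mathbb{C}^d$, namely $p(z) = \sum_{\alpha : |\alpha| \le N} p_\alpha z^\alpha$ where $z^\alpha = \prod_{i=1}^{d} z_i^{\alpha_i}$.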
Given a compact subset $K \subset \mathbb{R}^d$ and $f \in C(K, \mathbb{C})$$^2$, we are going to show, in the Weierstrass approximation Theorem 10.35 below, that $f$ may be uniformly approximated by polynomial functions on $K$. The next theorem addresses this question when $K$ is a compact subinterval of $\mathbb{R}$.
Theorem 10.32 (Weierstrass Approximation Theorem). Suppose $-\infty < a < b < \infty$, $J = [a, b]$ and $f \in C(J, Z)$. Then there exist polynomials $p_n$ on $\mathbb{R}$ such that $p_n \to f$ uniformly on $J$.
$^2$ Note that $f$ is automatically bounded because if not there would exist $u_n \in K$ such that $\lim_{n\to\infty} |f(u_n)| = \infty$. Using Theorem 10.2 we may, by passing to a subsequence if necessary, assume $u_n \to u \in K$ as $n \to \infty$. Now the continuity of $f$ would then imply
$$\infty = \lim_{n\to\infty} |f(u_n)| = |f(u)|$$
which is absurd since $f$ takes values in $\mathbb{C}$.
Proof. By replacing $f$ by $F$ where
$$F(t) := f(a + t(b - a)) - [f(a) + t(f(b) - f(a))] \quad\text{for } t \in [0, 1],$$
it suffices to assume $a = 0$, $b = 1$ and $f(0) = f(1) = 0$. Furthermore we may now extend $f$ to a continuous function on all of $\mathbb{R}$ by setting $f \equiv 0$ on $\mathbb{R} \setminus [0, 1]$.
With $q_n$ defined as in Eq. (10.23), let $f_n(x) := (q_n * f)(x)$ and recall from Lemma 10.29 that $f_n(x) \to f(x)$ as $n \to \infty$ with the convergence being uniform in $x \in [0, 1]$. This completes the proof since $f_n$ is equal to a polynomial function on $[0, 1]$. Indeed, there are polynomials, $a_k(y)$, such that
$$(1 - (x - y)^2)^n = \sum_{k=0}^{2n} a_k(y)\, x^k,$$
and therefore, for $x \in [0, 1]$,
$$\begin{aligned}
f_n(x) &= \int_{\mathbb{R}} q_n(x - y) f(y)\,dy \\
&= \frac{1}{c_n} \int_{[0,1]} f(y) \left[ (1 - (x - y)^2)^n\, 1_{|x-y|\le 1} \right] dy \\
&= \frac{1}{c_n} \int_{[0,1]} f(y)\,(1 - (x - y)^2)^n\,dy \\
&= \frac{1}{c_n} \int_{[0,1]} f(y) \sum_{k=0}^{2n} a_k(y)\, x^k\,dy = \sum_{k=0}^{2n} A_k x^k
\end{aligned}$$
where
$$A_k = \frac{1}{c_n} \int_{[0,1]} f(y)\, a_k(y)\,dy.$$
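The construction in this proof can be tried out numerically. The sketch below is illustrative only: the test function $f(y) = \sin(\pi y)$ on $[0,1]$ (extended by zero), the quadrature rule, and all function names are assumptions made for the example, not choices made in the text. It evaluates $f_n = q_n * f$ on a grid in $[0,1]$ and reports the largest deviation from $f$.

```python
import math


def f(y):
    """Illustrative test function with f(0) = f(1) = 0, extended by 0 outside [0, 1]."""
    return math.sin(math.pi * y) if 0.0 <= y <= 1.0 else 0.0


def midpoint_integral(g, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def landau_smooth(x, n, c_n):
    """f_n(x) = (1/c_n) * integral over [0,1] of f(y) (1 - (x-y)^2)^n dy, for x in [0, 1]."""
    val = midpoint_integral(lambda y: f(y) * (1.0 - (x - y) ** 2) ** n, 0.0, 1.0)
    return val / c_n


def sup_error(n, grid=21):
    """Largest value of |f_n(x) - f(x)| over an evenly spaced grid in [0, 1]."""
    c_n = midpoint_integral(lambda u: (1.0 - u * u) ** n, -1.0, 1.0)
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(landau_smooth(x, n, c_n) - f(x)) for x in xs)


for n in (5, 20, 80):
    print(n, sup_error(n))
```

For $x, y \in [0,1]$ the indicator $1_{|x-y|\le 1}$ is always one, which is why it is dropped in `landau_smooth`, exactly as in the third line of the displayed computation. The reported errors should decrease with $n$, although slowly; the proof gives uniform convergence, not a rate.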
Lemma 10.33. Suppose $J = [a, b]$ is a compact subinterval of $\mathbb{R}$ and $K$ is a compact subset of $\mathbb{R}^{d-1}$, then the linear mapping $R : C(J \times K, Z) \to C(J, C(K, Z))$ defined by $(Rf)(t) = f(t, \cdot) \in C(K, Z)$ for $t \in J$ is an isometric isomorphism of Banach spaces.
Proof. By uniform continuity of $f$ on $J \times K$ (see Theorem 10.2),
$$\|(Rf)(t) - (Rf)(s)\|_{C(K,Z)} = \max_{y\in K} \|f(t, y) - f(s, y)\|_Z \to 0 \text{ as } s \to t$$
which shows that $Rf$ is indeed in $C(J \to C(K, Z))$. Moreover,
$$\|Rf\|_{C(J\to C(K,Z))} = \max_{t\in J} \|(Rf)(t)\|_{C(K,Z)} = \max_{t\in J} \max_{y\in K} \|f(t, y)\|_Z = \|f\|_{C(J\times K,Z)},$$
showing $R$ is isometric and therefore injective.
To see that $R$ is surjective, let $F \in C(J \to C(K, Z))$ and define $f(t, y) := F(t)(y)$. Since
$$\begin{aligned}
\|f(t, y) - f(s, y')\|_Z &\le \|f(t, y) - f(s, y)\|_Z + \|f(s, y) - f(s, y')\|_Z \\
&\le \|F(t) - F(s)\|_{C(K,Z)} + \|F(s)(y) - F(s)(y')\|_Z,
\end{aligned}$$
it follows by the continuity of $t \to F(t)$ and $y \to F(s)(y)$ that
$$\|f(t, y) - f(s, y')\|_Z \to 0 \text{ as } (t, y) \to (s, y').$$
This shows $f \in C(J \times K, Z)$ and thus completes the proof because $Rf = F$ by construction.
Corollary 10.34 (Weierstrass Approximation Theorem). Let $d \in \mathbb{N}$, $J_i = [a_i, b_i]$ be compact subintervals of $\mathbb{R}$ for $i = 1, 2, \dots, d$, $J := J_1 \times \dots \times J_d$ and $f \in C(J, Z)$. Then there exist polynomials $p_n$ on $\mathbb{R}^d$ such that $p_n \to f$ uniformly on $J$.
Proof. The proof will be by induction on $d$ with the case $d = 1$ being the content of Theorem 10.32. Now suppose that $d > 1$ and the theorem holds with $d$ replaced by $d - 1$. Let $K := J_2 \times \dots \times J_d$, $Z_0 = C(K, Z)$, $R : C(J_1 \times K, Z) \to C(J_1, Z_0)$ be as in Lemma 10.33 and $F := Rf$. By Theorem 10.32, for any $\varepsilon > 0$ there exists a polynomial function
$$p(t) = \sum_{k=0}^{n} c_k t^k$$
with $c_k \in Z_0 = C(K, Z)$ such that $\|F - p\|_{C(J_1, Z_0)} < \varepsilon$. By the induction hypothesis, there exist polynomial functions $q_k : K \to Z$ such that
$$\|c_k - q_k\|_{Z_0} < \frac{\varepsilon}{n\,(|a_1| + |b_1|)^k}.$$
It is now easily verified (you check) that the polynomial function,
$$\rho(x) := \sum_{k=0}^{n} x_1^k\, q_k(x_2, \dots, x_d) \quad\text{for } x \in J$$
satisfies $\|f - \rho\|_{C(J,Z)} < 2\varepsilon$ and this completes the induction argument and hence the proof.
The reader is referred to Chapter 20 for two more alternative proofs of this corollary.
Theorem 10.35 (Weierstrass Approximation Theorem). Suppose that $K \subset \mathbb{R}^d$ is a compact subset and $f \in C(K, \mathbb{C})$. Then there exist polynomials $p_n$ on $\mathbb{R}^d$ such that $p_n \to f$ uniformly on $K$.
Proof. Choose $\lambda > 0$ and $b \in \mathbb{R}^d$ such that
$$K_0 := \lambda K - b := \{\lambda x - b : x \in K\} \subset B_d$$
where $B_d := (0, 1)^d$. The function $F(y) := f\left(\lambda^{-1}(y + b)\right)$ for $y \in K_0$ is in $C(K_0, \mathbb{C})$ and if $\hat{p}_n(y)$ are polynomials on $\mathbb{R}^d$ such that $\hat{p}_n \to F$ uniformly on $K_0$ then $p_n(x) := \hat{p}_n(\lambda x - b)$ are polynomials on $\mathbb{R}^d$ such that $p_n \to f$ uniformly on $K$. Hence we may now assume that $K$ is a compact subset of $B_d$. Let $g \in C(K \cup B_d^c)$ be defined by
$$g(x) = \begin{cases} f(x) & \text{if } x \in K \\ 0 & \text{if } x \in B_d^c \end{cases}$$
and then use the Tietze extension Theorem 7.4 (applied to the real and imaginary parts of $g$) to find a continuous function $F \in C(\mathbb{R}^d, \mathbb{C})$ such that $F = g$ on $K \cup B_d^c$. If $p_n$ are polynomials on $\mathbb{R}^d$ such that $p_n \to F$ uniformly on $[0, 1]^d$ then $p_n$ also converges to $f$ uniformly on $K$. Hence, by replacing $f$ by $F$, we may now assume that $f \in C(\mathbb{R}^d, \mathbb{C})$, $K = \bar{B}_d = [0, 1]^d$, and $f \equiv 0$ on $B_d^c$. The result now follows by an application of Corollary 10.34 with $Z = \mathbb{C}$.
Remark 10.36. The mapping $(x, y) \in \mathbb{R}^d \times \mathbb{R}^d \to z = x + iy \in \mathbb{C}^d$ is an isomorphism of vector spaces. Letting $\bar{z} = x - iy$ as usual, we have $x = \frac{z + \bar{z}}{2}$ and $y = \frac{z - \bar{z}}{2i}$. Therefore under this identification any polynomial $p(x, y)$ on $\mathbb{R}^d \times \mathbb{R}^d$ may be written as a polynomial $q$ in $(z, \bar{z})$, namely
$$q(z, \bar{z}) = p\left(\frac{z + \bar{z}}{2}, \frac{z - \bar{z}}{2i}\right).$$
Conversely a polynomial $q$ in $(z, \bar{z})$ may be thought of as a polynomial $p$ in $(x, y)$, namely $p(x, y) = q(x + iy, x - iy)$.
Corollary 10.37 (Complex Weierstrass Approximation Theorem). Suppose that $K \subset \mathbb{C}^d$ is a compact set and $f \in C(K, \mathbb{C})$. Then there exist polynomials $p_n(z, \bar{z})$ for $z \in \mathbb{C}^d$ such that $\sup_{z\in K} |p_n(z, \bar{z}) - f(z)| \to 0$ as $n \to \infty$.
Proof. This is an immediate consequence of Theorem 10.35 and Remark 10.36.
Example 10.38. Let $K = S^1 = \{z \in \mathbb{C} : |z| = 1\}$ and $\mathcal{A}$ be the set of polynomials in $(z, \bar{z})$ restricted to $S^1$. Then $\mathcal{A}$ is dense in $C(S^1)$.$^3$ Since $\bar{z} = z^{-1}$ on $S^1$, we have shown polynomials in $z$ and $z^{-1}$ are dense in $C(S^1)$. This example generalizes in an obvious way to $K = (S^1)^d \subset \mathbb{C}^d$.
$^3$ Note that it is easy to extend $f \in C(S^1)$ to a function $F \in C(\mathbb{C})$ by setting $F(z) = |z|\, f\left(\frac{z}{|z|}\right)$ for $z \ne 0$ and $F(0) = 0$. So this special case does not require the Tietze extension theorem.
Exercise 10.4. Suppose $-\infty < a < b < \infty$ and $f \in C([a, b], \mathbb{C})$ satisfies
$$\int_a^b f(t)\, t^n\,dt = 0 \quad\text{for } n = 0, 1, 2, \dots.$$
Show $f \equiv 0$.
Exercise 10.5. Suppose $f \in C(\mathbb{R}, \mathbb{C})$ is a $2\pi$-periodic function (i.e. $f(x + 2\pi) = f(x)$ for all $x \in \mathbb{R}$) and
$$\int_0^{2\pi} f(x)\, e^{inx}\,dx = 0 \quad\text{for all } n \in \mathbb{Z},$$
show again that $f \equiv 0$. Hint: Use Example 10.38 to show that any $2\pi$-periodic continuous function $g$ on $\mathbb{R}$ is the uniform limit of trigonometric polynomials of the form
$$p(x) = \sum_{k=-n}^{n} p_k e^{ikx} \quad\text{with } p_k \in \mathbb{C} \text{ for all } k.$$
10.5 Iterated Integrals
Theorem 10.39 (Baby Fubini Theorem). Let $a_i, b_i \in \mathbb{R}$ with $a_i \ne b_i$ for $i = 1, 2, \dots, n$, let $f(t_1, t_2, \dots, t_n) \in Z$ be a continuous function of $(t_1, t_2, \dots, t_n)$ where $t_i$ is between $a_i$ and $b_i$ for each $i$, and for any given permutation, $\sigma$, of $\{1, 2, \dots, n\}$ let
$$I_\sigma(f) := \int_{a_{\sigma 1}}^{b_{\sigma 1}} dt_{\sigma 1} \dots \int_{a_{\sigma n}}^{b_{\sigma n}} dt_{\sigma n}\, f(t_1, t_2, \dots, t_n). \tag{10.24}$$
Then $I_\sigma(f)$ is well defined and independent of $\sigma$, i.e. the order of iterated integrals is irrelevant under these hypotheses.
Proof. Let $J_i := [\min(a_i, b_i), \max(a_i, b_i)]$, $J := J_1 \times \dots \times J_n$ and $|J_i| := \max(a_i, b_i) - \min(a_i, b_i)$. Using the uniform continuity of $f$ (Theorem 10.2) and the continuity of the Riemann integral, it is easy to prove (compare with the proof of Lemma 10.33) that the map
$$(t_1, \dots, \hat{t}_{\sigma n}, \dots, t_n) \in (J_1 \times \dots \times \hat{J}_{\sigma n} \times \dots \times J_n) \to \int_{a_{\sigma n}}^{b_{\sigma n}} dt_{\sigma n}\, f(t_1, t_2, \dots, t_n)$$
is continuous, where the hat is used to denote a missing element from a list. From this remark, it follows that each of the integrals in Eq. (10.24) is well defined and hence so is $I_\sigma(f)$. Moreover by an induction argument using Lemma 10.33 and the boundedness of the Riemann integral, we have the estimate,
$$\|I_\sigma(f)\|_Z \le \left( \prod_{i=1}^{n} |J_i| \right) \|f\|_{C(J,Z)}. \tag{10.25}$$
Now suppose $\tau$ is another permutation. Because of Eq. (10.25), $I_\sigma$ and $I_\tau$ are bounded operators on $C(J, Z)$ and so to show $I_\sigma = I_\tau$ it suffices to show they are equal on the dense set of polynomial functions (see Corollary 10.34) in $C(J, Z)$. Moreover by linearity, it suffices to show $I_\sigma(f) = I_\tau(f)$ when $f$ has the form
$$f(t_1, t_2, \dots, t_n) = t_1^{k_1} \dots t_n^{k_n} z$$
for some $k_i \in \mathbb{N}_0$ and $z \in Z$. However for this function, explicit computations show
$$I_\sigma(f) = I_\tau(f) = \left( \prod_{i=1}^{n} \frac{b_i^{k_i+1} - a_i^{k_i+1}}{k_i + 1} \right) z.$$
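As a small numerical illustration of Theorem 10.39 in the case $n = 2$ and $Z = \mathbb{R}$, the sketch below integrates in both orders with a midpoint rule and also compares a monomial against the closed form at the end of the proof. The integrands, the rectangle, and the helper names are assumptions chosen for the example.

```python
import math


def midpoint(g, a, b, steps=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def iterated_s_outer(f, a1, b1, a2, b2):
    """Integrate in t (inner) first, then in s (outer)."""
    return midpoint(lambda s: midpoint(lambda t: f(s, t), a2, b2), a1, b1)


def iterated_t_outer(f, a1, b1, a2, b2):
    """Integrate in s (inner) first, then in t (outer)."""
    return midpoint(lambda t: midpoint(lambda s: f(s, t), a1, b1), a2, b2)


def smooth(s, t):
    """An illustrative continuous integrand on the rectangle."""
    return math.exp(s * t) + s * t * t


def monomial(s, t):
    """The monomial s^2 t^3, i.e. k_1 = 2 and k_2 = 3 in the notation of the proof."""
    return s ** 2 * t ** 3


a1, b1, a2, b2 = 0.0, 1.0, -1.0, 2.0
print(iterated_s_outer(smooth, a1, b1, a2, b2), iterated_t_outer(smooth, a1, b1, a2, b2))

# Closed form from the proof: product over i of (b_i^(k_i+1) - a_i^(k_i+1)) / (k_i + 1).
closed_form = ((b1 ** 3 - a1 ** 3) / 3.0) * ((b2 ** 4 - a2 ** 4) / 4.0)
print(iterated_s_outer(monomial, a1, b1, a2, b2), closed_form)
```

With a fixed quadrature grid both orders reduce to the same finite double sum, so the agreement here only mirrors the theorem at the discrete level; the comparison with the closed form is the more informative check.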
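Proposition 10.40 (Equality of Mixed Partial Derivatives). Let $Q = (a, b) \times (c, d)$ be an open rectangle in $\mathbb{R}^2$ and $f \in C(Q, Z)$. Assume that $\frac{\partial}{\partial t} f(s, t)$, $\frac{\partial}{\partial s} f(s, t)$ and $\frac{\partial}{\partial t}\frac{\partial}{\partial s} f(s, t)$ exist and are continuous for $(s, t) \in Q$, then $\frac{\partial}{\partial s}\frac{\partial}{\partial t} f(s, t)$ exists for $(s, t) \in Q$ and
$$\frac{\partial}{\partial s}\frac{\partial}{\partial t} f(s, t) = \frac{\partial}{\partial t}\frac{\partial}{\partial s} f(s, t) \quad\text{for } (s, t) \in Q. \tag{10.26}$$
Proof. Fix $(s_0, t_0) \in Q$. By two applications of Theorem 10.14,
$$\begin{aligned}
f(s, t) &= f(s_0, t) + \int_{s_0}^{s} \frac{\partial}{\partial\sigma} f(\sigma, t)\,d\sigma \\
&= f(s_0, t) + \int_{s_0}^{s} \frac{\partial}{\partial\sigma} f(\sigma, t_0)\,d\sigma + \int_{s_0}^{s} d\sigma \int_{t_0}^{t} d\tau\, \frac{\partial}{\partial\tau}\frac{\partial}{\partial\sigma} f(\sigma, \tau)
\end{aligned} \tag{10.27}$$
and then by Fubini's Theorem 10.39 we learn
$$f(s, t) = f(s_0, t) + \int_{s_0}^{s} \frac{\partial}{\partial\sigma} f(\sigma, t_0)\,d\sigma + \int_{t_0}^{t} d\tau \int_{s_0}^{s} d\sigma\, \frac{\partial}{\partial\tau}\frac{\partial}{\partial\sigma} f(\sigma, \tau).$$
Differentiating this equation in $t$ and then in $s$ (again using two more applications of Theorem 10.14) shows Eq. (10.26) holds.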
10.6 Exercises
Throughout these problems, $(X, \|\cdot\|)$ is a Banach space.
Exercise 10.6. Show $f = (f_1, \dots, f_n) \in \bar{\mathcal{S}}([a, b], \mathbb{R}^n)$ iff $f_i \in \bar{\mathcal{S}}([a, b], \mathbb{R})$ for $i = 1, 2, \dots, n$ and
$$\int_a^b f(t)\,dt = \left( \int_a^b f_1(t)\,dt, \dots, \int_a^b f_n(t)\,dt \right).$$
Here $\mathbb{R}^n$ is to be equipped with the usual Euclidean norm. Hint: Use Lemma 10.7 to prove the forward implication.
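Exercise 10.7. Give another proof of Proposition 10.40 which does not use Fubini's Theorem 10.39 as follows.
1. By a simple translation argument we may assume $(0, 0) \in Q$ and we are trying to prove Eq. (10.26) holds at $(s, t) = (0, 0)$.
2. Let $h(s, t) := \frac{\partial}{\partial t}\frac{\partial}{\partial s} f(s, t)$ and
$$G(s, t) := \int_0^s d\sigma \int_0^t d\tau\, h(\sigma, \tau)$$
so that Eq. (10.27) (with $s_0 = 0$ and $t_0 = 0$) states
$$f(s, t) = f(0, t) + \int_0^s \frac{\partial}{\partial\sigma} f(\sigma, t_0)\,d\sigma + G(s, t)$$
and differentiating this equation at $t = 0$ shows
$$\frac{\partial}{\partial t} f(s, 0) = \frac{\partial}{\partial t} f(0, 0) + \frac{\partial}{\partial t} G(s, 0). \tag{10.28}$$
Now show using the definition of the derivative that
$$\frac{\partial}{\partial t} G(s, 0) = \int_0^s d\sigma\, h(\sigma, 0). \tag{10.29}$$
Hint: Consider
$$G(s, t) - t \int_0^s d\sigma\, h(\sigma, 0) = \int_0^s d\sigma \int_0^t d\tau\, [h(\sigma, \tau) - h(\sigma, 0)].$$
3. Now differentiate Eq. (10.28) in $s$ using Theorem 10.14 to finish the proof.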
Exercise 10.8. Give another proof of Eq. (10.24) in Theorem 10.39 based on Proposition 10.40. To do this let $t_0 \in (c, d)$ and $s_0 \in (a, b)$ and define
$$G(s, t) := \int_{t_0}^{t} d\tau \int_{s_0}^{s} d\sigma\, f(\sigma, \tau).$$
Show $G$ satisfies the hypothesis of Proposition 10.40 which combined with two applications of the fundamental theorem of calculus implies
$$\frac{\partial}{\partial t}\frac{\partial}{\partial s} G(s, t) = \frac{\partial}{\partial s}\frac{\partial}{\partial t} G(s, t) = f(s, t).$$
Use two more applications of the fundamental theorem of calculus along with the observation that $G = 0$ if $t = t_0$ or $s = s_0$ to conclude
$$G(s, t) = \int_{s_0}^{s} d\sigma \int_{t_0}^{t} d\tau\, \frac{\partial}{\partial\tau}\frac{\partial}{\partial\sigma} G(\sigma, \tau) = \int_{s_0}^{s} d\sigma \int_{t_0}^{t} d\tau\, f(\sigma, \tau). \tag{10.30}$$
Finally let $s = b$ and $t = d$ in Eq. (10.30) and then let $s_0 \downarrow a$ and $t_0 \downarrow c$ to prove Eq. (10.24).
Exercise 10.9 (Product Rule). Prove items 1. and 2. of Lemma 10.10. This
can be modeled on the standard proof for real valued functions.
Exercise 10.10 (Chain Rule). Prove the chain rule in Proposition 10.11. Again this may be modeled on the standard proof for real valued functions.
Exercise 10.11. To each $A \in L(X)$, we may define $L_A, R_A : L(X) \to L(X)$ by
$$L_A B = AB \quad\text{and}\quad R_A B = BA \quad\text{for all } B \in L(X).$$
Show $L_A, R_A \in L(L(X))$ and that
$$\|L_A\|_{L(L(X))} = \|A\|_{L(X)} = \|R_A\|_{L(L(X))}.$$
Exercise 10.12. Suppose that $A : \mathbb{R} \to L(X)$ is a continuous function and $U, V : \mathbb{R} \to L(X)$ are the unique solutions to the linear differential equations
$$\dot{V}(t) = A(t)V(t) \quad\text{with } V(0) = I \tag{10.31}$$
and
$$\dot{U}(t) = -U(t)A(t) \quad\text{with } U(0) = I. \tag{10.32}$$
Prove that $V(t)$ is invertible and that $V^{-1}(t) = U(t)$$^4$, where by abuse of notation I am writing $V^{-1}(t)$ for $[V(t)]^{-1}$. Hints: 1) show $\frac{d}{dt}[U(t)V(t)] = 0$ (which is sufficient if $\dim(X) < \infty$) and 2) show that $y(t) := V(t)U(t)$ solves a linear ordinary differential equation that has $y \equiv \mathrm{Id}$ as an obvious solution. (The results of Exercise 10.11 may be useful here.) Then use the uniqueness of solutions to linear ODEs.
$^4$ The fact that $U(t)$ must be defined as in Eq. (10.32) follows from Lemma 10.10.
Exercise 10.13. Suppose that $(X, \|\cdot\|)$ is a Banach space, $J = (a, b)$ with $-\infty \le a < b \le \infty$ and $f_n : \mathbb{R} \to X$ are continuously differentiable functions such that there exists a summable sequence $\{a_n\}_{n=1}^{\infty}$ satisfying
$$\|f_n(t)\| + \|\dot{f}_n(t)\| \le a_n \quad\text{for all } t \in J \text{ and } n \in \mathbb{N}.$$
Show:
1. $\sup\left\{ \left\| \frac{f_n(t+h) - f_n(t)}{h} \right\| : (t, h) \in J \times \mathbb{R} \text{ with } t + h \in J \text{ and } h \ne 0 \right\} \le a_n.$
2. The function $F : \mathbb{R} \to X$ defined by
$$F(t) := \sum_{n=1}^{\infty} f_n(t) \quad\text{for all } t \in J$$
is differentiable and for $t \in J$,
$$\dot{F}(t) = \sum_{n=1}^{\infty} \dot{f}_n(t).$$
Exercise 10.14. Suppose that $A \in L(X)$. Show directly that:
1. $e^{tA}$ defined in Eq. (10.19) is convergent in $L(X)$ when equipped with the operator norm.
2. $e^{tA}$ is differentiable in $t$ and that $\frac{d}{dt} e^{tA} = A e^{tA}$.
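Exercise 10.15. Suppose that $A \in L(X)$ and $v \in X$ is an eigenvector of $A$ with eigenvalue $\lambda$, i.e. that $Av = \lambda v$. Show $e^{tA} v = e^{t\lambda} v$. Also show that if $X = \mathbb{R}^n$ and $A$ is a diagonalizable $n \times n$ matrix with
$$A = S D S^{-1} \quad\text{with } D = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$$
then $e^{tA} = S e^{tD} S^{-1}$ where $e^{tD} = \mathrm{diag}(e^{t\lambda_1}, \dots, e^{t\lambda_n})$. Here $\mathrm{diag}(\lambda_1, \dots, \lambda_n)$ denotes the diagonal matrix $\Lambda$ such that $\Lambda_{ii} = \lambda_i$ for $i = 1, 2, \dots, n$.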
Exercise 10.16. Suppose that $A, B \in L(X)$ and $[A, B] := AB - BA = 0$. Show that $e^{(A+B)} = e^A e^B$.
Exercise 10.17. Suppose $A \in C(\mathbb{R}, L(X))$ satisfies $[A(t), A(s)] = 0$ for all $s, t \in \mathbb{R}$. Show
$$y(t) := e^{\left(\int_0^t A(\tau)\,d\tau\right)} x$$
is the unique solution to $\dot{y}(t) = A(t)y(t)$ with $y(0) = x$.
Exercise 10.18. Compute $e^{tA}$ when
$$A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
and use the result to prove the formula
$$\cos(s + t) = \cos s \cos t - \sin s \sin t.$$
Hint: Sum the series and use $e^{tA} e^{sA} = e^{(t+s)A}$.
Exercise 10.19. Compute $e^{tA}$ when
$$A = \begin{pmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{pmatrix}$$
with $a, b, c \in \mathbb{R}$. Use your result to compute $e^{t(\lambda I + A)}$ where $\lambda \in \mathbb{R}$ and $I$ is the $3 \times 3$ identity matrix. Hint: Sum the series.
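Exercise 10.20. Prove Theorem 10.25 using the following outline.
1. Using the right continuity at 0 and the semi-group property for $T_t$, show there are constants $M$ and $C$ such that $\|T_t\|_{L(X)} \le M C^t$ for all $t > 0$.
2. Show $t \in [0, \infty) \to T_t \in L(X)$ is continuous.
3. For $\varepsilon > 0$, let $S_\varepsilon := \frac{1}{\varepsilon} \int_0^\varepsilon T_\tau\,d\tau \in L(X)$. Show $S_\varepsilon \to I$ as $\varepsilon \downarrow 0$ and conclude from this that $S_\varepsilon$ is invertible when $\varepsilon > 0$ is sufficiently small. For the remainder of the proof fix such a small $\varepsilon > 0$.
4. Show
$$T_t S_\varepsilon = \frac{1}{\varepsilon} \int_t^{t+\varepsilon} T_\tau\,d\tau$$
and conclude from this that
$$\lim_{t\downarrow 0} \left( \frac{T_t - I}{t} \right) S_\varepsilon = \frac{1}{\varepsilon} (T_\varepsilon - \mathrm{Id}_X).$$
5. Using the fact that $S_\varepsilon$ is invertible, conclude $A = \lim_{t\downarrow 0} t^{-1}(T_t - I)$ exists in $L(X)$ and that
$$A = \frac{1}{\varepsilon} (T_\varepsilon - I)\, S_\varepsilon^{-1}.$$
6. Now show, using the semigroup property and step 4., that $\frac{d}{dt} T_t = A T_t$ for all $t > 0$.
7. Using step 5, show $\frac{d}{dt} e^{-tA} T_t = 0$ for all $t > 0$ and therefore $e^{-tA} T_t = e^{-0A} T_0 = I$.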
Exercise 10.21 (Duhamel's Principle I). Suppose that $A : \mathbb{R} \to L(X)$ is a continuous function and $V : \mathbb{R} \to L(X)$ is the unique solution to the linear differential equation in Eq. (10.31). Let $x \in X$ and $h \in C(\mathbb{R}, X)$ be given. Show that the unique solution to the differential equation:
$$\dot{y}(t) = A(t)y(t) + h(t) \quad\text{with } y(0) = x \tag{10.33}$$
is given by
$$y(t) = V(t)x + V(t) \int_0^t V(\tau)^{-1} h(\tau)\,d\tau. \tag{10.34}$$
Hint: compute $\frac{d}{dt}[V^{-1}(t)y(t)]$ (see Exercise 10.12) when $y$ solves Eq. (10.33).
Exercise 10.22 (Duhamel's Principle II). Suppose that $A : \mathbb{R} \to L(X)$ is a continuous function and $V : \mathbb{R} \to L(X)$ is the unique solution to the linear differential equation in Eq. (10.31). Let $W_0 \in L(X)$ and $H \in C(\mathbb{R}, L(X))$ be given. Show that the unique solution to the differential equation:
$$\dot{W}(t) = A(t)W(t) + H(t) \quad\text{with } W(0) = W_0 \tag{10.35}$$
is given by
$$W(t) = V(t)W_0 + V(t) \int_0^t V(\tau)^{-1} H(\tau)\,d\tau. \tag{10.36}$$
11
Ordinary Differential Equations in a Banach Space
Let $X$ be a Banach space, $U \subset_o X$, $J = (a, b) \ni 0$ and $Z \in C(J \times U, X)$. The function $Z$ is to be interpreted as a time dependent vector-field on $U \subset X$. In this section we will consider the ordinary differential equation (ODE for short)
$$\dot{y}(t) = Z(t, y(t)) \quad\text{with } y(0) = x \in U. \tag{11.1}$$
The reader should check that any solution $y \in C^1(J, U)$ to Eq. (11.1) gives a solution $y \in C(J, U)$ to the integral equation:
$$y(t) = x + \int_0^t Z(\tau, y(\tau))\,d\tau \tag{11.2}$$
and conversely if $y \in C(J, U)$ solves Eq. (11.2) then $y \in C^1(J, U)$ and $y$ solves Eq. (11.1).
Remark 11.1. For notational simplicity we have assumed that the initial condition for the ODE in Eq. (11.1) is taken at $t = 0$. There is no loss in generality in doing this since $\tilde{y}$ solves
$$\frac{d\tilde{y}}{dt}(t) = \tilde{Z}(t, \tilde{y}(t)) \quad\text{with } \tilde{y}(t_0) = x \in U$$
if and only if $y(t) := \tilde{y}(t + t_0)$ solves Eq. (11.1) with $Z(t, x) = \tilde{Z}(t + t_0, x)$.
11.1 Examples
Let $X = \mathbb{R}$, $Z(x) = x^n$ with $n \in \mathbb{N}$ and consider the ordinary differential equation
$$\dot{y}(t) = Z(y(t)) = y^n(t) \quad\text{with } y(0) = x \in \mathbb{R}. \tag{11.3}$$
If $y$ solves Eq. (11.3) with $x \ne 0$, then $y(t)$ is not zero for $t$ near 0. Therefore up to the first time $y$ possibly hits 0, we must have
$$t = \int_0^t \frac{\dot{y}(\tau)}{y(\tau)^n}\,d\tau = \int_{y(0)}^{y(t)} u^{-n}\,du = \begin{cases} \dfrac{[y(t)]^{1-n} - x^{1-n}}{1 - n} & \text{if } n > 1 \\[2ex] \ln\left| \dfrac{y(t)}{x} \right| & \text{if } n = 1 \end{cases}$$
and solving these equations for $y(t)$ implies
$$y(t) = y(t, x) = \begin{cases} \dfrac{x}{\sqrt[n-1]{1 - (n-1)\,t\,x^{n-1}}} & \text{if } n > 1 \\[2ex] e^t x & \text{if } n = 1. \end{cases} \tag{11.4}$$
The reader should verify by direct calculation that $y(t, x)$ defined above does indeed solve Eq. (11.3). The above argument shows that these are the only possible solutions to the Equations in (11.3).
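The direct calculation suggested above can be complemented by a quick numerical check. The following sketch (function names and the finite-difference step are illustrative assumptions) evaluates formula (11.4) for $n > 1$, estimates the residual $|\dot{y} - y^n|$ by a central difference, and computes the time $(n-1)^{-1} x^{1-n}$ at which the denominator in (11.4) vanishes.

```python
def y_exact(t, x, n):
    """Formula (11.4) for n > 1; only valid while 1 - (n-1) t x^(n-1) > 0."""
    d = 1.0 - (n - 1) * t * x ** (n - 1)
    if d <= 0.0:
        raise ValueError("t is outside the interval on which (11.4) is defined")
    return x / d ** (1.0 / (n - 1))


def residual(t, x, n, h=1e-6):
    """|y'(t) - y(t)^n| with y' estimated by a central difference."""
    dy = (y_exact(t + h, x, n) - y_exact(t - h, x, n)) / (2.0 * h)
    return abs(dy - y_exact(t, x, n) ** n)


def blow_up_time(x, n):
    """Time at which the denominator in (11.4) vanishes, i.e. (n-1)^(-1) x^(1-n)."""
    return 1.0 / ((n - 1) * x ** (n - 1))


print(residual(0.3, 1.0, 2), residual(0.1, 2.0, 3))
print(blow_up_time(1.0, 2), blow_up_time(2.0, 3))
```

For instance, with $n = 2$ and $x = 1$ the formula reduces to $y(t) = 1/(1-t)$, which is defined only for $t < 1$; asking `y_exact` for a later time raises an error rather than returning a value.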
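Notice that when $n = 1$, the solution exists for all time while for $n > 1$, we must require
$$1 - (n-1)\,t\,x^{n-1} > 0$$
or equivalently that
$$t < \frac{1}{(n-1)\,x^{n-1}} \text{ if } x^{n-1} > 0 \quad\text{and}\quad t > -\frac{1}{(n-1)\,|x|^{n-1}} \text{ if } x^{n-1} < 0.$$
Moreover for $n > 1$, $y(t, x)$ blows up as $t$ approaches $(n-1)^{-1} x^{1-n}$. The reader should also observe that, at least for $s$ and $t$ close to 0,
$$y(t, y(s, x)) = y(t + s, x) \tag{11.5}$$
for each of the solutions above. Indeed, if $n = 1$ Eq. (11.5) is equivalent to the well known identity, $e^t e^s = e^{t+s}$ and for $n > 1$,
$$\begin{aligned}
y(t, y(s, x)) &= \frac{y(s, x)}{\sqrt[n-1]{1 - (n-1)\,t\,y(s, x)^{n-1}}} \\
&= \frac{\dfrac{x}{\sqrt[n-1]{1 - (n-1)\,s\,x^{n-1}}}}{\sqrt[n-1]{1 - (n-1)\,t \left[ \dfrac{x}{\sqrt[n-1]{1 - (n-1)\,s\,x^{n-1}}} \right]^{n-1}}} \\
&= \frac{\dfrac{x}{\sqrt[n-1]{1 - (n-1)\,s\,x^{n-1}}}}{\sqrt[n-1]{1 - (n-1)\,t\, \dfrac{x^{n-1}}{1 - (n-1)\,s\,x^{n-1}}}} \\
&= \frac{x}{\sqrt[n-1]{1 - (n-1)\,s\,x^{n-1} - (n-1)\,t\,x^{n-1}}} \\
&= \frac{x}{\sqrt[n-1]{1 - (n-1)(s + t)\,x^{n-1}}} = y(t + s, x).
\end{aligned}$$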
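Now suppose $Z(x) = |x|^\alpha$ with $0 < \alpha < 1$ and we now consider the ordinary differential equation
$$\dot{y}(t) = Z(y(t)) = |y(t)|^\alpha \quad\text{with } y(0) = x \in \mathbb{R}. \tag{11.6}$$
Working as above we find, if $x \ne 0$ that
$$t = \int_0^t \frac{\dot{y}(\tau)}{|y(\tau)|^\alpha}\,d\tau = \int_{y(0)}^{y(t)} |u|^{-\alpha}\,du = \frac{[y(t)]^{1-\alpha} - x^{1-\alpha}}{1 - \alpha},$$
where $u^{1-\alpha} := |u|^{1-\alpha}\, \mathrm{sgn}(u)$. Since $\mathrm{sgn}(y(t)) = \mathrm{sgn}(x)$ the previous equation implies
$$\mathrm{sgn}(x)(1 - \alpha)t = \mathrm{sgn}(x) \left[ \mathrm{sgn}(y(t))\,|y(t)|^{1-\alpha} - \mathrm{sgn}(x)\,|x|^{1-\alpha} \right] = |y(t)|^{1-\alpha} - |x|^{1-\alpha}$$
and therefore,
$$y(t, x) = \mathrm{sgn}(x) \left( |x|^{1-\alpha} + \mathrm{sgn}(x)(1 - \alpha)t \right)^{\frac{1}{1-\alpha}} \tag{11.7}$$
is uniquely determined by this formula until the first time $t$ where $|x|^{1-\alpha} + \mathrm{sgn}(x)(1 - \alpha)t = 0$.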
As before $y(t) = 0$ is a solution to Eq. (11.6) when $x = 0$, however it is far from being the unique solution. For example letting $x \downarrow 0$ in Eq. (11.7) gives a function
$$y(t, 0+) = ((1 - \alpha)t)^{\frac{1}{1-\alpha}}$$
which solves Eq. (11.6) for $t > 0$. Moreover if we define
$$y(t) := \begin{cases} ((1 - \alpha)t)^{\frac{1}{1-\alpha}} & \text{if } t > 0 \\ 0 & \text{if } t \le 0, \end{cases}$$
(for example if $\alpha = 1/2$ then $y(t) = \frac{1}{4} t^2\, 1_{t\ge 0}$) then the reader may easily check $y$ also solves Eq. (11.6). Furthermore, $y_a(t) := y(t - a)$ also solves Eq. (11.6) for all $a \ge 0$, see Figure 11.1 below.
With these examples in mind, let us now go to the general theory. The case of linear ODEs has already been studied in Section 10.3 above.
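The failure of uniqueness for $\alpha = 1/2$ is easy to see numerically. The sketch below (the function names, the grid of test times, and the finite-difference step are illustrative assumptions) checks that each shifted function $y_a(t) = \frac{1}{4}(t-a)^2\, 1_{t\ge a}$ satisfies $\dot{y} = |y|^{1/2}$ up to discretization error while all of them start from $y_a(0) = 0$.

```python
import math


def y_a(t, a=0.0):
    """y_a(t) = (t - a)^2 / 4 for t >= a and 0 otherwise (the case alpha = 1/2)."""
    return 0.25 * (t - a) ** 2 if t >= a else 0.0


def residual(t, a, h=1e-6):
    """|y_a'(t) - sqrt(|y_a(t)|)| with the derivative estimated by a central difference."""
    dy = (y_a(t + h, a) - y_a(t - h, a)) / (2.0 * h)
    return abs(dy - math.sqrt(abs(y_a(t, a))))


for a in (0.0, 0.5, 1.0):
    worst = max(residual(k / 10.0, a) for k in range(-10, 31))
    print(a, y_a(0.0, a), worst)
```

Each choice of $a \ge 0$ gives a different function with the same initial value, so the initial value problem (11.6) with $x = 0$ has infinitely many solutions.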
11.2 Uniqueness Theorem and Continuous Dependence
on Initial Data
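Fig. 11.1. Three different solutions to the ODE $\dot{y}(t) = |y(t)|^{1/2}$ with $y(0) = 0$.
Lemma 11.2. Gronwall's Lemma. Suppose that $f$, $\varepsilon$, and $k$ are non-negative functions of a real variable $t$ such that
$$f(t) \le \varepsilon(t) + \left| \int_0^t k(\tau) f(\tau)\,d\tau \right|. \tag{11.8}$$
Then
$$f(t) \le \varepsilon(t) + \left| \int_0^t k(\tau)\varepsilon(\tau)\, e^{\left| \int_\tau^t k(s)\,ds \right|}\,d\tau \right|, \tag{11.9}$$
and in particular if $\varepsilon$ and $k$ are constants we find that
$$f(t) \le \varepsilon e^{k|t|}. \tag{11.10}$$
Proof. I will only prove the case $t \ge 0$. The case $t \le 0$ can be derived by applying the $t \ge 0$ case to $\tilde{f}(t) = f(-t)$, $\tilde{k}(t) = k(-t)$ and $\tilde{\varepsilon}(t) = \varepsilon(-t)$. Set $F(t) = \int_0^t k(\tau) f(\tau)\,d\tau$. Then by (11.8),
$$\dot{F} = kf \le k\varepsilon + kF.$$
Hence,
$$\frac{d}{dt}\left( e^{-\int_0^t k(s)\,ds} F \right) = e^{-\int_0^t k(s)\,ds} (\dot{F} - kF) \le k\varepsilon\, e^{-\int_0^t k(s)\,ds}.$$
Integrating this last inequality from 0 to $t$ and then solving for $F$ yields:
$$F(t) \le e^{\int_0^t k(s)\,ds} \cdot \int_0^t d\tau\, k(\tau)\varepsilon(\tau)\, e^{-\int_0^\tau k(s)\,ds} = \int_0^t d\tau\, k(\tau)\varepsilon(\tau)\, e^{\int_\tau^t k(s)\,ds}.$$
But by the definition of $F$ we have that
$$f \le \varepsilon + F,$$
and hence the last two displayed equations imply (11.9). Equation (11.10) follows from (11.9) by a simple integration.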
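For the reader's convenience, the "simple integration" can be spelled out as follows, for constant $\varepsilon$ and $k > 0$ and $t \ge 0$ (the case $t \le 0$ is symmetric and the case $k = 0$ is trivial):

```latex
\begin{aligned}
\varepsilon + \int_0^t k\,\varepsilon\, e^{k(t-\tau)}\,d\tau
  &= \varepsilon + \varepsilon \left[ -e^{k(t-\tau)} \right]_{\tau=0}^{\tau=t} \\
  &= \varepsilon + \varepsilon \left( e^{kt} - 1 \right)
   = \varepsilon\, e^{kt}.
\end{aligned}
```

So the right side of (11.9) collapses to $\varepsilon e^{k|t|}$, which is (11.10).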
Corollary 11.3 (Continuous Dependence on Initial Data). Let $U \subset_o X$, $0 \in (a, b)$ and $Z : (a, b) \times U \to X$ be a continuous function which is $K$-Lipschitz on $U$, i.e. $\|Z(t, x) - Z(t, x')\| \le K\|x - x'\|$ for all $x$ and $x'$ in $U$. Suppose $y_1, y_2 : (a, b) \to U$ solve
$$\frac{dy_i(t)}{dt} = Z(t, y_i(t)) \quad\text{with } y_i(0) = x_i \text{ for } i = 1, 2. \tag{11.11}$$
Then
$$\|y_2(t) - y_1(t)\| \le \|x_2 - x_1\|\, e^{K|t|} \quad\text{for } t \in (a, b) \tag{11.12}$$
and in particular, there is at most one solution to Eq. (11.1) under the above Lipschitz assumption on $Z$.
Proof. Let $f(t) := \|y_2(t) - y_1(t)\|$. Then by the fundamental theorem of calculus,
$$\begin{aligned}
f(t) &= \left\| y_2(0) - y_1(0) + \int_0^t (\dot{y}_2(\tau) - \dot{y}_1(\tau))\,d\tau \right\| \\
&\le f(0) + \left| \int_0^t \|Z(\tau, y_2(\tau)) - Z(\tau, y_1(\tau))\|\,d\tau \right| \\
&\le \|x_2 - x_1\| + K \left| \int_0^t f(\tau)\,d\tau \right|.
\end{aligned}$$
Therefore by Gronwall's inequality we have,
$$\|y_2(t) - y_1(t)\| = f(t) \le \|x_2 - x_1\|\, e^{K|t|}.$$
11.3 Local Existence (Non-Linear ODE)
We now show that Eq. (11.1) has a unique solution when $Z$ satisfies the Lipschitz condition in Eq. (11.14). See Exercise 14.21 below for another existence theorem.
Theorem 11.4 (Local Existence). Let $T > 0$, $J = (-T, T)$, $x_0 \in X$, $r > 0$ and
$$C(x_0, r) := \{x \in X : \|x - x_0\| \le r\}$$
be the closed $r$-ball centered at $x_0 \in X$. Assume
$$M = \sup\{\|Z(t, x)\| : (t, x) \in J \times C(x_0, r)\} < \infty \tag{11.13}$$
and there exists $K < \infty$ such that
$$\|Z(t, x) - Z(t, y)\| \le K\|x - y\| \quad\text{for all } x, y \in C(x_0, r) \text{ and } t \in J. \tag{11.14}$$
Let $T_0 < \min\{r/M, T\}$ and $J_0 := (-T_0, T_0)$, then for each $x \in B(x_0, r - MT_0)$ there exists a unique solution $y(t) = y(t, x)$ to Eq. (11.2) in $C(J_0, C(x_0, r))$. Moreover $y(t, x)$ is jointly continuous in $(t, x)$, $y(t, x)$ is differentiable in $t$, $\dot{y}(t, x)$ is jointly continuous for all $(t, x) \in J_0 \times B(x_0, r - MT_0)$ and satisfies Eq. (11.1).
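Proof. The uniqueness assertion has already been proved in Corollary 11.3. To prove existence, let $C_r := C(x_0, r)$, $Y := C(J_0, C(x_0, r))$ and
$$S_x(y)(t) := x + \int_0^t Z(\tau, y(\tau))\,d\tau. \tag{11.15}$$
With this notation, Eq. (11.2) becomes $y = S_x(y)$, i.e. we are looking for a fixed point of $S_x$. If $y \in Y$, then
$$\begin{aligned}
\|S_x(y)(t) - x_0\| &\le \|x - x_0\| + \left| \int_0^t \|Z(\tau, y(\tau))\|\,d\tau \right| \le \|x - x_0\| + M|t| \\
&\le \|x - x_0\| + MT_0 \le r - MT_0 + MT_0 = r,
\end{aligned}$$
showing $S_x(Y) \subset Y$ for all $x \in B(x_0, r - MT_0)$. Moreover if $y, z \in Y$,
$$\begin{aligned}
\|S_x(y)(t) - S_x(z)(t)\| &= \left\| \int_0^t [Z(\tau, y(\tau)) - Z(\tau, z(\tau))]\,d\tau \right\| \\
&\le \left| \int_0^t \|Z(\tau, y(\tau)) - Z(\tau, z(\tau))\|\,d\tau \right| \\
&\le K \left| \int_0^t \|y(\tau) - z(\tau)\|\,d\tau \right|.
\end{aligned} \tag{11.16}$$
Let $y_0(t, x) = x$ and $y_n(\cdot, x) \in Y$ be defined inductively by
$$y_n(\cdot, x) := S_x(y_{n-1}(\cdot, x)) = x + \int_0^{\cdot} Z(\tau, y_{n-1}(\tau, x))\,d\tau. \tag{11.17}$$
Using the estimate in Eq. (11.16) repeatedly we find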
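$$\begin{aligned}
\|y_{n+1}(t) - y_n(t)\| &\le K \left| \int_0^t \|y_n(\tau) - y_{n-1}(\tau)\|\,d\tau \right| \\
&\le K^2 \left| \int_0^t dt_1 \left| \int_0^{t_1} dt_2\, \|y_{n-1}(t_2) - y_{n-2}(t_2)\| \right| \right| \\
&\ \ \vdots \\
&\le K^n \left| \int_0^t dt_1 \left| \int_0^{t_1} dt_2 \dots \left| \int_0^{t_{n-1}} dt_n\, \|y_1(t_n) - y_0(t_n)\| \right| \dots \right| \right| \\
&\le K^n \|y_1(\cdot, x) - y_0(\cdot, x)\|_\infty \int_{\Delta_n(t)} d\tau \\
&= \frac{K^n |t|^n}{n!} \|y_1(\cdot, x) - y_0(\cdot, x)\|_\infty
\end{aligned} \tag{11.18}$$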
wherein we have also made use of Lemma 10.20. Combining this estimate with
$$\|y_1(t, x) - y_0(t, x)\| = \left\| \int_0^t Z(\tau, x)\,d\tau \right\| \le \left| \int_0^t \|Z(\tau, x)\|\,d\tau \right| \le M_0,$$
where
$$M_0 = \max\left\{ \int_0^{T_0} \|Z(\tau, x)\|\,d\tau,\ \int_{-T_0}^{0} \|Z(\tau, x)\|\,d\tau \right\} \le MT_0,$$
shows
$$\|y_{n+1}(t, x) - y_n(t, x)\| \le M_0 \frac{K^n |t|^n}{n!} \le M_0 \frac{K^n T_0^n}{n!}$$
and this implies
$$\sum_{n=0}^{\infty} \|y_{n+1}(\cdot, x) - y_n(\cdot, x)\|_{\infty, J_0} \le \sum_{n=0}^{\infty} M_0 \frac{K^n T_0^n}{n!} = M_0 e^{KT_0} < \infty$$
where
$$\|y_{n+1}(\cdot, x) - y_n(\cdot, x)\|_{\infty, J_0} := \sup\{\|y_{n+1}(t, x) - y_n(t, x)\| : t \in J_0\}.$$
So $y(t, x) := \lim_{n\to\infty} y_n(t, x)$ exists uniformly for $t \in J_0$ and using Eq. (11.14) we also have
$$\sup\{\|Z(t, y(t)) - Z(t, y_{n-1}(t))\| : t \in J_0\} \le K \|y(\cdot, x) - y_{n-1}(\cdot, x)\|_{\infty, J_0} \to 0 \text{ as } n \to \infty.$$
Now passing to the limit in Eq. (11.17) shows $y$ solves Eq. (11.2). From this equation it follows that $y(t, x)$ is differentiable in $t$ and $y$ satisfies Eq. (11.1).
The continuity of $y(t, x)$ follows from Corollary 11.3 and the mean value inequality (Corollary 10.15):
$$\begin{aligned}
\|y(t, x) - y(t', x')\| &\le \|y(t, x) - y(t, x')\| + \|y(t, x') - y(t', x')\| \\
&= \|y(t, x) - y(t, x')\| + \left\| \int_{t'}^{t} Z(\tau, y(\tau, x'))\,d\tau \right\| \\
&\le \|y(t, x) - y(t, x')\| + \left| \int_{t'}^{t} \|Z(\tau, y(\tau, x'))\|\,d\tau \right| \\
&\le \|x - x'\|\, e^{KT} + \left| \int_{t'}^{t} \|Z(\tau, y(\tau, x'))\|\,d\tau \right| \\
&\le \|x - x'\|\, e^{KT} + M|t - t'|.
\end{aligned} \tag{11.19}$$
The continuity of $\dot{y}(t, x)$ is now a consequence of Eq. (11.1) and the continuity of $y$ and $Z$.
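The iteration in Eq. (11.17) can be carried out exactly in a simple case. The sketch below is an illustration under stated assumptions, not the general construction: it takes $X = \mathbb{R}$, $Z(t, y) = y$ and $x = 1$, so every Picard iterate is a polynomial in $t$ whose coefficients can be stored as exact fractions. The function names are hypothetical.

```python
from fractions import Fraction


def picard_step(coeffs):
    """One Picard step for y' = y, y(0) = 1, acting on polynomial coefficients.

    coeffs[k] is the coefficient of t^k in y_{n-1}; the returned list represents
    1 + integral from 0 to t of y_{n-1}(tau) d tau.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] += 1  # add the initial condition x = 1
    return integrated


def picard_iterates(n):
    """Coefficients of the n-th Picard iterate y_n, starting from y_0(t) = 1."""
    y = [Fraction(1)]
    for _ in range(n):
        y = picard_step(y)
    return y


def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return float(sum(c * Fraction(t) ** k for k, c in enumerate(coeffs)))


print(picard_iterates(3))
print(evaluate(picard_iterates(10), 1.0))
```

In this example $y_n(t) = \sum_{k=0}^{n} t^k / k!$, so $y_{n+1}(t) - y_n(t) = t^{n+1}/(n+1)!$, which is consistent with the factorial decay in Eq. (11.18) (here $K = 1$), and the iterates converge to $e^t$.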
Corollary 11.5. Let $J = (a, b) \ni 0$ and suppose $Z \in C(J \times X, X)$ satisfies
$$\|Z(t, x) - Z(t, y)\| \le K\|x - y\| \quad\text{for all } x, y \in X \text{ and } t \in J. \tag{11.20}$$
Then for all $x \in X$, there is a unique solution $y(t, x)$ (for $t \in J$) to Eq. (11.1). Moreover $y(t, x)$ and $\dot{y}(t, x)$ are jointly continuous in $(t, x)$.
Proof. Let $J_0 = (a_0, b_0) \ni 0$ be a precompact subinterval of $J$ and $Y := BC(J_0, X)$. By compactness, $M := \sup_{t\in\bar{J}_0} \|Z(t, 0)\| < \infty$ which combined with Eq. (11.20) implies
$$\sup_{t\in\bar{J}_0} \|Z(t, x)\| \le M + K\|x\| \quad\text{for all } x \in X.$$
Using this estimate and Lemma 10.7 one easily shows $S_x(Y) \subset Y$ for all $x \in X$. The proof of Theorem 11.4 now goes through without any further change.
11.4 Global Properties
Definition 11.6 (Local Lipschitz Functions). Let $U \subset_o X$, $J$ be an open interval and $Z \in C(J \times U, X)$. The function $Z$ is said to be locally Lipschitz in $x$ if for all $x \in U$ and all compact intervals $I \subset J$ there exists $K = K(x, I) < \infty$ and $\varepsilon = \varepsilon(x, I) > 0$ such that $B(x, \varepsilon(x, I)) \subset U$ and
$$\|Z(t, x_1) - Z(t, x_0)\| \le K(x, I)\|x_1 - x_0\| \quad \forall\, x_0, x_1 \in B(x, \varepsilon(x, I)) \ \&\ t \in I. \tag{11.21}$$
For the rest of this section, we will assume $J$ is an open interval containing 0, $U$ is an open subset of $X$ and $Z \in C(J \times U, X)$ is a locally Lipschitz function.
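Lemma 11.7. Let $Z \in C(J \times U, X)$ be a locally Lipschitz function in $x$ and $E$ be a compact subset of $U$ and $I$ be a compact subset of $J$. Then there exists $\varepsilon > 0$ such that $Z(t, x)$ is bounded for $(t, x) \in I \times E_\varepsilon$ and $Z(t, x)$ is $K$-Lipschitz on $E_\varepsilon$ for all $t \in I$, where
$$E_\varepsilon := \{x \in U : \mathrm{dist}(x, E) < \varepsilon\}.$$
Proof. Let $\varepsilon(x, I)$ and $K(x, I)$ be as in Definition 11.6 above. Since $E$ is compact, there exists a finite subset $\Lambda \subset E$ such that $E \subset V := \cup_{x\in\Lambda} B(x, \varepsilon(x, I)/2)$. If $y \in V$, there exists $x \in \Lambda$ such that $\|y - x\| < \varepsilon(x, I)/2$ and therefore
$$\begin{aligned}
\|Z(t, y)\| &\le \|Z(t, x)\| + K(x, I)\|y - x\| \le \|Z(t, x)\| + K(x, I)\varepsilon(x, I)/2 \\
&\le \sup_{x\in\Lambda,\ t\in I} \{\|Z(t, x)\| + K(x, I)\varepsilon(x, I)/2\} =: M < \infty.
\end{aligned}$$
This shows $Z$ is bounded on $I \times V$. Let
$$\varepsilon := d(E, V^c) \wedge \frac{1}{2} \min_{x\in\Lambda} \varepsilon(x, I)$$
and notice that $\varepsilon > 0$ since $E$ is compact, $V^c$ is closed and $E \cap V^c = \emptyset$. If $y, z \in E_\varepsilon$ and $\|y - z\| < \varepsilon$, then as before there exists $x \in \Lambda$ such that $\|y - x\| < \varepsilon(x, I)/2$. Therefore
$$\|z - x\| \le \|z - y\| + \|y - x\| < \varepsilon + \varepsilon(x, I)/2 \le \varepsilon(x, I)$$
and since $y, z \in B(x, \varepsilon(x, I))$, it follows that
$$\|Z(t, y) - Z(t, z)\| \le K(x, I)\|y - z\| \le K_0\|y - z\|$$
where $K_0 := \max_{x\in\Lambda} K(x, I) < \infty$. On the other hand if $y, z \in E_\varepsilon$ and $\|y - z\| \ge \varepsilon$, then
$$\|Z(t, y) - Z(t, z)\| \le 2M \le \frac{2M}{\varepsilon}\|y - z\|.$$
Thus if we let $K := \max\{2M/\varepsilon, K_0\}$, we have shown
$$\|Z(t, y) - Z(t, z)\| \le K\|y - z\| \quad\text{for all } y, z \in E_\varepsilon \text{ and } t \in I.$$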
Proposition 11.8 (Maximal Solutions). Let $Z \in C(J \times U, X)$ be a locally Lipschitz function in $x$ and let $x \in U$ be fixed. Then there is an interval $J_x = (a(x), b(x))$ with $a \in [-\infty, 0)$ and $b \in (0, \infty]$ and a $C^1$-function $y : J \to U$ with the following properties:
1. $y$ solves the ODE in Eq. (11.1).
2. If $\tilde{y} : \tilde{J} = (\tilde{a}, \tilde{b}) \to U$ is another solution of Eq. (11.1) (we assume that $0 \in \tilde{J}$) then $\tilde{J} \subset J$ and $\tilde{y} = y|_{\tilde{J}}$.
The function $y : J \to U$ is called the maximal solution to Eq. (11.1).
Proof. Suppose that $y_i : J_i = (a_i, b_i) \to U$, $i = 1, 2$, are two solutions to Eq. (11.1). We will start by showing that $y_1 = y_2$ on $J_1 \cap J_2$. To do this$^1$ let $J_0 = (a_0, b_0)$ be chosen so that $0 \in \bar{J}_0 \subset J_1 \cap J_2$, and let $E := y_1(\bar{J}_0) \cup y_2(\bar{J}_0)$, a compact subset of $X$. Choose $\varepsilon > 0$ as in Lemma 11.7 so that $Z$ is Lipschitz on $E_\varepsilon$. Then $y_1|_{J_0}, y_2|_{J_0} : J_0 \to E_\varepsilon$ both solve Eq. (11.1) and therefore are equal by Corollary 11.3. Since $J_0 = (a_0, b_0)$ was chosen arbitrarily so that $[a_0, b_0] \subset J_1 \cap J_2$, we may conclude that $y_1 = y_2$ on $J_1 \cap J_2$. Let $(y_\alpha, J_\alpha = (a_\alpha, b_\alpha))_{\alpha\in A}$ denote the possible solutions to (11.1) such that $0 \in J_\alpha$. Define $J_x = \cup J_\alpha$ and set $y = y_\alpha$ on $J_\alpha$. We have just checked that $y$ is well defined and the reader may easily check that this function $y : J_x \to U$ satisfies all the conclusions of the theorem.
Notation 11.9 For each $x \in U$, let $J_x = (a(x), b(x))$ be the maximal interval on which Eq. (11.1) may be solved, see Proposition 11.8. Set $\mathcal{D}(Z) := \cup_{x\in U} (J_x \times \{x\}) \subset J \times U$ and let $\phi : \mathcal{D}(Z) \to U$ be defined by $\phi(t, x) = y(t)$ where $y$ is the maximal solution to Eq. (11.1). (So for each $x \in U$, $\phi(\cdot, x)$ is the maximal solution to Eq. (11.1).)
Proposition 11.10. Let $Z \in C(J \times U, X)$ be a locally Lipschitz function in $x$ and $y : J_x = (a(x), b(x)) \to U$ be the maximal solution to Eq. (11.1). If $b(x) < b$, then either $\limsup_{t\uparrow b(x)} \|Z(t, y(t))\| = \infty$ or $y(b(x)-) := \lim_{t\uparrow b(x)} y(t)$ exists and $y(b(x)-) \notin U$. Similarly, if $a(x) > a$, then either $\limsup_{t\downarrow a(x)} \|y(t)\| = \infty$ or $y(a(x)+) := \lim_{t\downarrow a(x)} y(t)$ exists and $y(a(x)+) \notin U$.
Proof. Suppose that $b(x) < b$ and $M := \limsup_{t\uparrow b(x)} \|Z(t, y(t))\| < \infty$. Then there is a $b_0 \in (0, b(x))$ such that $\|Z(t, y(t))\| \le 2M$ for all $t \in (b_0, b(x))$. Thus, by the usual fundamental theorem of calculus argument,
$$\|y(t) - y(t')\| \le \left| \int_{t'}^{t} \|Z(\tau, y(\tau))\|\,d\tau \right| \le 2M|t - t'|$$
for all $t, t' \in (b_0, b(x))$. From this it is easy to conclude that $y(b(x)-) = \lim_{t\uparrow b(x)} y(t)$ exists. If $y(b(x)-) \in U$, by the local existence Theorem 11.4, there exists $\delta > 0$ and $w \in C^1((b(x) - \delta, b(x) + \delta), U)$ such that
$$\dot{w}(t) = Z(t, w(t)) \quad\text{and}\quad w(b(x)) = y(b(x)-).$$
Now define $\tilde{y} : (a, b(x) + \delta) \to U$ by
$$\tilde{y}(t) = \begin{cases} y(t) & \text{if } t \in J_x \\ w(t) & \text{if } t \in [b(x), b(x) + \delta). \end{cases}$$
The reader may now easily show $\tilde{y}$ solves the integral Eq. (11.2) and hence also solves Eq. (11.1) for $t \in (a(x), b(x) + \delta)$.$^2$ But this violates the maximality of $y$ and hence we must have that $y(b(x)-) \notin U$. The assertions for $t$ near $a(x)$ are proved similarly.
$^1$ Here is an alternate proof of the uniqueness. Let
$$T \equiv \sup\{t \in [0, \min\{b_1, b_2\}) : y_1 = y_2 \text{ on } [0, t]\}.$$
($T$ is the first positive time after which $y_1$ and $y_2$ disagree.) Suppose, for sake of contradiction, that $T < \min\{b_1, b_2\}$. Notice that $y_1(T) = y_2(T) =: x'$. Applying the local uniqueness theorem to $y_1(\cdot - T)$ and $y_2(\cdot - T)$ thought of as functions from $(-\delta, \delta) \to B(x', \varepsilon(x'))$ for some $\delta$ sufficiently small, we learn that $y_1(\cdot - T) = y_2(\cdot - T)$ on $(-\delta, \delta)$. But this shows that $y_1 = y_2$ on $[0, T + \delta)$ which contradicts the definition of $T$. Hence we must have that $T = \min\{b_1, b_2\}$, i.e. $y_1 = y_2$ on $J_1 \cap J_2 \cap [0, \infty)$. A similar argument shows that $y_1 = y_2$ on $J_1 \cap J_2 \cap (-\infty, 0]$ as well.
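Example 11.11. Let $X = \mathbb{R}^2$, $J = \mathbb{R}$, $U = \{(x, y) \in \mathbb{R}^2 : 0 < r < 1\}$ where $r^2 = x^2 + y^2$ and
$$Z(x, y) = \frac{1}{r}(x, y) + \frac{1}{1 - r^2}(-y, x).$$
Then the unique solution $(x(t), y(t))$ to
$$\frac{d}{dt}(x(t), y(t)) = Z(x(t), y(t)) \quad\text{with } (x(0), y(0)) = \left(\tfrac{1}{2}, 0\right)$$
is given by
$$(x(t), y(t)) = \left(t + \frac{1}{2}\right) \left( \cos\left(\frac{1}{1/2 - t}\right), \sin\left(\frac{1}{1/2 - t}\right) \right)$$
for $t \in J_{(1/2, 0)} = (-1/2, 1/2)$. Notice that $\|Z(x(t), y(t))\| \to \infty$ as $t \uparrow 1/2$ and $\mathrm{dist}((x(t), y(t)), U^c) \to 0$ as $t \uparrow 1/2$.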
Example 11.12. (Not worked out completely.) Let $X = U = \ell^2$, $\psi \in C^\infty(\mathbb{R}^2)$ be a smooth function such that $\psi = 1$ in a neighborhood of the line segment joining $(1, 0)$ to $(0, 1)$ and being supported within the $1/10$-neighborhood of this segment. Choose $a_n \uparrow \infty$ and $b_n \uparrow \infty$ and define
$$Z(x) = \sum_{n=1}^{\infty} a_n\, \psi(b_n(x_n, x_{n+1}))\,(e_{n+1} - e_n). \tag{11.22}$$
For any $x \in \ell^2$, only a finite number of terms are non-zero in the above sum in a neighborhood of $x$. Therefore $Z : \ell^2 \to \ell^2$ is a smooth and hence locally Lipschitz vector field. Let $(y(t), J = (a, b))$ denote the maximal solution to
$$\dot{y}(t) = Z(y(t)) \quad\text{with } y(0) = e_1.$$
Then if the $a_n$ and $b_n$ are chosen appropriately, then $b < \infty$ and there will exist $t_n \uparrow b$ such that $y(t_n)$ is approximately $e_n$ for all $n$. So again $y(t_n)$ does not have a limit yet $\sup_{t\in[0,b)} \|y(t)\| < \infty$. The idea is that $Z$ is constructed to blow the particle from $e_1$ to $e_2$ to $e_3$ to $e_4$ etc. etc. with the time it takes to travel from $e_n$ to $e_{n+1}$ being on order $1/2^n$. The vector field in Eq. (11.22) is a first approximation at such a vector field, it may have to be adjusted a little more to provide an honest example. In this example, we are having problems because $y(t)$ is going off in dimensions.
$^2$ See the argument in Proposition 11.13 for a slightly different method of extending $y$ which avoids the use of the integral equation (11.2).
Here is another version of Proposition 11.10 which is more useful when
dim(X) < .
Proposition 11.13. Let Z C(J U, X) be a locally Lipschitz function in
x and y : J
x
= (a(x), b(x)) U be the maximal solution to Eq. (11.1).
1. If b(x) < b, then for every compact subset K U there exists T
K
< b(x)
such that y(t) / K for all t [T
K
, b(x)).
2. When dim(X) < , we may write this condition as: if b(x) < b, then
either
limsup
tb(x)
|y(t)| = or liminf
tb(x)
dist(y(t), U
c
) = 0.
Proof. 1) Suppose that $b(x) < b$ and, for sake of contradiction, there exists a compact set $K \subset U$ and $t_n \uparrow b(x)$ such that $y(t_n) \in K$ for all $n$. Since $K$ is compact, by passing to a subsequence if necessary, we may assume $y_\infty := \lim_{n \to \infty} y(t_n)$ exists in $K \subset U$. By the local existence Theorem 11.4, there exists $T_0 > 0$ and $\delta > 0$ such that for each $x' \in B(y_\infty, \delta)$ there exists a unique solution $w(\cdot, x') \in C^1((-T_0, T_0), U)$ solving
\[ \dot{w}(t, x') = Z(t, w(t, x')) \text{ and } w(0, x') = x'. \]
Now choose $n$ sufficiently large so that $t_n \in (b(x) - T_0/2, b(x))$ and $y(t_n) \in B(y_\infty, \delta)$. Define $\tilde{y} : (a(x), b(x) + T_0/2) \to U$ by
\[ \tilde{y}(t) = \begin{cases} y(t) & \text{if } t \in J_x \\ w(t - t_n, y(t_n)) & \text{if } t \in (t_n - T_0, b(x) + T_0/2), \end{cases} \]
wherein we have used $(t_n - T_0, b(x) + T_0/2) \subset (t_n - T_0, t_n + T_0)$. By uniqueness of solutions to ODEs $\tilde{y}$ is well defined, $\tilde{y} \in C^1((a(x), b(x) + T_0/2), X)$ and $\tilde{y}$ solves the ODE in Eq. (11.1). But this violates the maximality of $y$.
2) For each $n \in \mathbb{N}$ let
\[ K_n := \{x \in U : \|x\| \le n \text{ and } \operatorname{dist}(x, U^c) \ge 1/n\}. \]
Then $K_n \uparrow U$ and each $K_n$ is a closed bounded set and hence compact if $\dim(X) < \infty$. Therefore if $b(x) < b$, by item 1., there exists $T_n \in [0, b(x))$ such that $y(t) \notin K_n$ for all $t \in [T_n, b(x))$, or equivalently $\|y(t)\| > n$ or $\operatorname{dist}(y(t), U^c) < 1/n$ for all $t \in [T_n, b(x))$.
Remark 11.14 (This remark is still rather rough.). In general it is not true that the functions $a$ and $b$ are continuous. For example, let $U$ be the region in $\mathbb{R}^2$ described in polar coordinates by $r > 0$ and $0 < \theta < 3\pi/2$ and $Z(x, y) = (0, -1)$ as in Figure 11.2 below. Then $b(x, y) = y$ for all $x \ge 0$ and $y > 0$ while $b(x, y) = \infty$ for all $x < 0$ and $y \in \mathbb{R}$, which shows $b$ is discontinuous. On the other hand notice that
\[ \{b > t\} = \{x < 0\} \cup \{(x, y) : x \ge 0,\ y > t\} \]
is an open set for all $t > 0$. An example of a vector field for which $b(x)$ is discontinuous is given in the top left hand corner of Figure 11.2. The map
\[ \psi(r(\cos\theta, \sin\theta)) := \left( \ln r,\ \tan\left( \tfrac{2}{3}\theta - \tfrac{\pi}{2} \right) \right) \]
would allow the reader to find an example on $\mathbb{R}^2$ if so desired. Some calculations show that $Z$ transferred to $\mathbb{R}^2$ by the map $\psi$ is given by the new vector field
\[ \tilde{Z}(x, y) = -e^{-x} \left( \sin\left( \tfrac{3\pi}{8} + \tfrac{3}{4}\tan^{-1}(y) \right),\ \cos\left( \tfrac{3\pi}{8} + \tfrac{3}{4}\tan^{-1}(y) \right) \right). \]
Fig. 11.2. Manufacturing vector fields where $b(x)$ is discontinuous.
Theorem 11.15 (Global Continuity). Let $Z \in C(J \times U, X)$ be a locally Lipschitz function in $x$. Then $\mathcal{D}(Z)$ is an open subset of $J \times U$ and the functions $\phi : \mathcal{D}(Z) \to U$ and $\dot{\phi} : \mathcal{D}(Z) \to X$ are continuous. More precisely, for all $x_0 \in U$ and all open intervals $J_0$ such that $0 \in J_0 \subset\subset J_{x_0}$ there exists $\delta = \delta(x_0, J_0, Z) > 0$ and $C = C(x_0, J_0, Z) < \infty$ such that for all $x \in B(x_0, \delta)$, $J_0 \subset J_x$ and
\[ \|\phi(\cdot, x) - \phi(\cdot, x_0)\|_{BC(J_0, U)} \le C \|x - x_0\|. \tag{11.23} \]
Proof. Let $|J_0| = b_0 - a_0$, $I = \bar{J}_0$ and $E := y(\bar{J}_0)$, a compact subset of $U$, and let $\varepsilon > 0$ and $K < \infty$ be given as in Lemma 11.7, i.e. $K$ is the Lipschitz constant for $Z$ on $E_\varepsilon$. Also recall the notation: $\Delta_1(t) = [0, t]$ if $t > 0$ and $\Delta_1(t) = [t, 0]$ if $t < 0$. Suppose that $x \in E_\varepsilon$; then by Corollary 11.3,
\[ \|\phi(t, x) - \phi(t, x_0)\| \le \|x - x_0\| e^{K|t|} \le \|x - x_0\| e^{K|J_0|} \tag{11.24} \]
for all $t \in J_0 \cap J_x$ such that $\phi(\Delta_1(t), x) \subset E_\varepsilon$. Letting $\delta := \varepsilon e^{-K|J_0|}/2$, and assuming $x \in B(x_0, \delta)$, the previous equation implies
\[ \|\phi(t, x) - \phi(t, x_0)\| \le \varepsilon/2 < \varepsilon \quad \forall\, t \in J_0 \cap J_x \text{ such that } \phi(\Delta_1(t), x) \subset E_\varepsilon. \]
This estimate further shows that $\phi(t, x)$ remains bounded and strictly away from the boundary of $U$ for all such $t$. Therefore, it follows from Proposition 11.8 and "continuous induction" (see footnote 3) that $J_0 \subset J_x$ and Eq. (11.24) is valid for all $t \in J_0$. This proves Eq. (11.23) with $C := e^{K|J_0|}$.
Suppose that $(t_0, x_0) \in \mathcal{D}(Z)$ and let $0 \in J_0 \subset\subset J_{x_0}$ be such that $t_0 \in J_0$, and let $\delta$ be as above. Then we have just shown $J_0 \times B(x_0, \delta) \subset \mathcal{D}(Z)$, which proves $\mathcal{D}(Z)$ is open. Furthermore, since the evaluation map
\[ (t_0, y) \in J_0 \times BC(J_0, U) \xrightarrow{\ e\ } y(t_0) \in X \]
is continuous (as the reader should check) it follows that $\phi = e \circ (x \to \phi(\cdot, x)) : J_0 \times B(x_0, \delta) \to U$ is also continuous, being the composition of continuous maps. The continuity of $\dot{\phi}(t_0, x)$ is a consequence of the continuity of $\phi$ and the differential equation (11.1). Alternatively, using Eq. (11.2),
\[ \begin{aligned} \|\phi(t_0, x) - \phi(t, x_0)\| &\le \|\phi(t_0, x) - \phi(t_0, x_0)\| + \|\phi(t_0, x_0) - \phi(t, x_0)\| \\ &\le C \|x - x_0\| + \left| \int_t^{t_0} \|Z(\tau, \phi(\tau, x_0))\| \, d\tau \right| \\ &\le C \|x - x_0\| + M |t_0 - t| \end{aligned} \]
where $C$ is the constant in Eq. (11.23) and $M = \sup_{\tau \in J_0} \|Z(\tau, \phi(\tau, x_0))\| < \infty$. This clearly shows $\phi$ is continuous.
Footnote 3. See the argument in the proof of Proposition 10.12.

11.5 Semi-Group Properties of time independent flows
To end this chapter we investigate the semi-group property of the flow associated to the vector-field $Z$. It will be convenient to introduce the following suggestive notation. For $(t, x) \in \mathcal{D}(Z)$, set $e^{tZ}(x) = \phi(t, x)$. So the path $t \to e^{tZ}(x)$ is the maximal solution to
\[ \frac{d}{dt} e^{tZ}(x) = Z(e^{tZ}(x)) \text{ with } e^{0Z}(x) = x. \]
This exponential notation will be justified shortly. It is convenient to have the following conventions.
Notation 11.16 We write $f : X \to X$ to mean a function defined on some open subset $D(f) \subset X$. The open set $D(f)$ will be called the domain of $f$. Given two functions $f : X \to X$ and $g : X \to X$ with domains $D(f)$ and $D(g)$ respectively, we define the composite function $f \circ g : X \to X$ to be the function with domain
\[ D(f \circ g) = \{x \in X : x \in D(g) \text{ and } g(x) \in D(f)\} = g^{-1}(D(f)) \]
given by the rule $f \circ g(x) = f(g(x))$ for all $x \in D(f \circ g)$. We now write $f = g$ iff $D(f) = D(g)$ and $f(x) = g(x)$ for all $x \in D(f) = D(g)$. We will also write $f \subset g$ iff $D(f) \subset D(g)$ and $g|_{D(f)} = f$.
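The bookkeeping of domains in this notation can be mimicked in a few lines of Python. The sketch below is only an illustration and is not from the text: the class name `PartialMap` and its methods are hypothetical, and finite sets stand in for the open domains $D(f)$, so all topology is ignored.

```python
class PartialMap:
    """A function together with an explicit domain (a finite stand-in for D(f))."""

    def __init__(self, table):
        self.table = dict(table)

    @property
    def domain(self):
        return set(self.table)

    def __call__(self, x):
        return self.table[x]

    def compose(self, g):
        """Return self o g, with domain {x in D(g) : g(x) in D(self)}."""
        return PartialMap(
            {x: self(g(x)) for x in g.domain if g(x) in self.domain}
        )

    def __eq__(self, other):
        # f = g iff the domains agree and the values agree on them.
        return self.table == other.table

    def is_restriction_of(self, g):
        """True when self is contained in g: D(self) is a subset of D(g)
        and g restricted to D(self) equals self."""
        return self.domain <= g.domain and all(
            g(x) == self(x) for x in self.domain
        )
```

With `f = PartialMap({1: 10, 2: 20})` and `g = PartialMap({0: 1, 5: 3, 7: 2})`, the composite `f.compose(g)` has domain `{0, 7}`, because `g(5) = 3` does not lie in the domain of `f`.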
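Theorem 11.17. For fixed $t \in \mathbb{R}$ we consider $e^{tZ}$ as a function from $X$ to $X$ with domain $D(e^{tZ}) = \{x \in U : (t, x) \in \mathcal{D}(Z)\}$, where $D(\phi) = \mathcal{D}(Z) \subset \mathbb{R} \times U$, and $\mathcal{D}(Z)$ and $\phi$ are defined in Notation 11.9. Conclusions:
1. If $t, s \in \mathbb{R}$ and $t \cdot s \ge 0$, then $e^{tZ} \circ e^{sZ} = e^{(t+s)Z}$.
2. If $t \in \mathbb{R}$, then $e^{tZ} \circ e^{-tZ} = Id_{D(e^{-tZ})}$.
3. For arbitrary $t, s \in \mathbb{R}$, $e^{tZ} \circ e^{sZ} \subset e^{(t+s)Z}$.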
Proof. Item 1. For simplicity assume that $t, s \ge 0$. The case $t, s \le 0$ is left to the reader. Suppose that $x \in D(e^{tZ} \circ e^{sZ})$. Then by assumption $x \in D(e^{sZ})$ and $e^{sZ}(x) \in D(e^{tZ})$. Define the path $y(\tau)$ via:
\[ y(\tau) = \begin{cases} e^{\tau Z}(x) & \text{if } 0 \le \tau \le s \\ e^{(\tau - s)Z}(e^{sZ}(x)) & \text{if } s \le \tau \le t + s. \end{cases} \]
It is easy to check that $y$ solves $\dot{y}(\tau) = Z(y(\tau))$ with $y(0) = x$. But since $\tau \to e^{\tau Z}(x)$ is the maximal solution, we must have that $x \in D(e^{(t+s)Z})$ and $y(t + s) = e^{(t+s)Z}(x)$. That is, $e^{(t+s)Z}(x) = e^{tZ} \circ e^{sZ}(x)$. Hence we have shown that $e^{tZ} \circ e^{sZ} \subset e^{(t+s)Z}$. To finish the proof of item 1. it suffices to show that $D(e^{(t+s)Z}) \subset D(e^{tZ} \circ e^{sZ})$. Take $x \in D(e^{(t+s)Z})$; then clearly $x \in D(e^{sZ})$. Set $y(\tau) = e^{(\tau + s)Z}(x)$, defined for $0 \le \tau \le t$. Then $y$ solves
\[ \dot{y}(\tau) = Z(y(\tau)) \text{ with } y(0) = e^{sZ}(x). \]
But since $\tau \to e^{\tau Z}(e^{sZ}(x))$ is the maximal solution to the above initial value problem we must have that $y(\tau) = e^{\tau Z}(e^{sZ}(x))$, and in particular at $\tau = t$, $e^{(t+s)Z}(x) = e^{tZ}(e^{sZ}(x))$. This shows that $x \in D(e^{tZ} \circ e^{sZ})$ and in fact $e^{(t+s)Z} \subset e^{tZ} \circ e^{sZ}$.
Item 2. Let $x \in D(e^{-tZ})$ and again assume for simplicity that $t \ge 0$. Set $y(\tau) = e^{(\tau - t)Z}(x)$, defined for $0 \le \tau \le t$. Notice that $y(0) = e^{-tZ}(x)$ and $\dot{y}(\tau) = Z(y(\tau))$. This shows that $y(\tau) = e^{\tau Z}(e^{-tZ}(x))$ and in particular that $x \in D(e^{tZ} \circ e^{-tZ})$ and $e^{tZ} \circ e^{-tZ}(x) = x$. This proves item 2.
Item 3. I will only consider the case that $s < 0$ and $t + s \ge 0$; the other cases are handled similarly. Write $u$ for $t + s$, so that $t = -s + u$. We know that $e^{tZ} = e^{uZ} \circ e^{-sZ}$ by item 1. Therefore
\[ e^{tZ} \circ e^{sZ} = (e^{uZ} \circ e^{-sZ}) \circ e^{sZ}. \]
Notice in general, one has $(f \circ g) \circ h = f \circ (g \circ h)$ (you prove). Hence, the above displayed equation and item 2. imply that
\[ e^{tZ} \circ e^{sZ} = e^{uZ} \circ (e^{-sZ} \circ e^{sZ}) = e^{(t+s)Z} \circ I_{D(e^{sZ})} \subset e^{(t+s)Z}. \]
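Items 1 and 2 can be checked by hand in a case where the flow is known in closed form. The Python sketch below is an illustration only, under the assumption $X = U = \mathbb{R}$ and $Z(x) = x^2$, for which $e^{tZ}(x) = x/(1 - tx)$ with $D(e^{tZ}) = \{x : 1 - tx > 0\}$. The function names are hypothetical.

```python
def in_domain(t, x):
    """x lies in D(e^{tZ}) for Z(x) = x**2 exactly when 1 - t*x > 0."""
    return 1.0 - t * x > 0.0


def flow(t, x):
    """e^{tZ}(x) = x / (1 - t*x), the maximal solution of y' = y**2, y(0) = x."""
    if not in_domain(t, x):
        raise ValueError("x is not in the domain of e^{tZ}")
    return x / (1.0 - t * x)


def in_domain_of_composite(t, s, x):
    """x lies in D(e^{tZ} o e^{sZ}) iff x is in D(e^{sZ}) and e^{sZ}(x) is in D(e^{tZ})."""
    return in_domain(s, x) and in_domain(t, flow(s, x))
```

For $t = 0.3$, $s = 0.2$ and $x = 1.5$ both sides of item 1 are defined and agree, while $x = 2.5$ lies in neither domain. For item 2 with $t = 1$, the composite $e^{Z} \circ e^{-Z}$ is the identity, but only on $D(e^{-Z}) = \{x > -1\}$.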
The following result is a trivial but conceptually illuminating partial converse to Theorem 11.17.
Proposition 11.18 (Flows and Complete Vector Fields). Suppose $U \subset_o X$, $\phi \in C(\mathbb{R} \times U, U)$ and $\phi_t(x) = \phi(t, x)$. Suppose $\phi$ satisfies:
1. $\phi_0 = I_U$,
2. $\phi_t \circ \phi_s = \phi_{t+s}$ for all $t, s \in \mathbb{R}$, and
3. $Z(x) := \dot{\phi}(0, x)$ exists for all $x \in U$ and $Z \in C(U, X)$ is locally Lipschitz.
Then $\phi_t = e^{tZ}$.
Proof. Let $x \in U$ and $y(t) := \phi_t(x)$. Then using Item 2.,
\[ \dot{y}(t) = \frac{d}{ds}\Big|_0 y(t + s) = \frac{d}{ds}\Big|_0 \phi_{(t+s)}(x) = \frac{d}{ds}\Big|_0 \phi_s \circ \phi_t(x) = Z(y(t)). \]
Since $y(0) = x$ by Item 1. and $Z$ is locally Lipschitz by Item 3., we know by uniqueness of solutions to ODEs (Corollary 11.3) that $\phi_t(x) = y(t) = e^{tZ}(x)$.
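As a sanity check of the three hypotheses, consider the rotation flow on $\mathbb{R}^2$. The Python sketch below is an assumption-laden illustration, not from the text: it takes $\phi_t$ to be rotation by the angle $t$, for which the generator should be $Z(x_1, x_2) = (-x_2, x_1)$, and it approximates $\dot{\phi}(0, x)$ by a central difference. All names are hypothetical.

```python
import math


def phi(t, x):
    """Rotation of the point x = (x1, x2) by the angle t."""
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def Z(x):
    """Candidate generator of the rotation flow."""
    return (-x[1], x[0])


def time_derivative_at_zero(x, h=1e-6):
    """Central difference approximation of d/dt phi(t, x) at t = 0."""
    plus, minus = phi(h, x), phi(-h, x)
    return ((plus[0] - minus[0]) / (2 * h), (plus[1] - minus[1]) / (2 * h))
```

One can confirm numerically that `phi(t, phi(s, x))` agrees with `phi(t + s, x)` and that `time_derivative_at_zero(x)` is close to `Z(x)`, in line with $\phi_t = e^{tZ}$.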
11.6 Exercises
Exercise 11.1. Find a vector field $Z$ such that $e^{(t+s)Z}$ is not contained in $e^{tZ} \circ e^{sZ}$.
Definition 11.19. A locally Lipschitz function $Z : U \subset_o X \to X$ is said to be a complete vector field if $\mathcal{D}(Z) = \mathbb{R} \times U$. That is, for any $x \in U$, $t \to e^{tZ}(x)$ is defined for all $t \in \mathbb{R}$.
Exercise 11.2. Suppose that $Z : X \to X$ is a locally Lipschitz function. Assume there is a constant $C > 0$ such that
\[ \|Z(x)\| \le C(1 + \|x\|) \text{ for all } x \in X. \]
Then $Z$ is complete. Hint: use Gronwall's Lemma 11.2 and Proposition 11.10.
Exercise 11.3. Suppose $y$ is a solution to $\dot{y}(t) = |y(t)|^{1/2}$ with $y(0) = 0$. Show there exists $a, b \in [0, \infty]$ such that
\[ y(t) = \begin{cases} \frac{1}{4}(t - b)^2 & \text{if } t \ge b \\ 0 & \text{if } -a < t < b \\ -\frac{1}{4}(t + a)^2 & \text{if } t \le -a. \end{cases} \]
Exercise 11.4. Using the fact that the solutions to Eq. (11.3) are never 0 if $x \ne 0$, show that $y(t) = 0$ is the only solution to Eq. (11.3) with $y(0) = 0$.
Exercise 11.5 (Higher Order ODE). Let $X$ be a Banach space, $\mathcal{U} \subset_o X^n$ and $f \in C(J \times \mathcal{U}, X)$ be a locally Lipschitz function in $x = (x_1, \dots, x_n)$. Show the $n^{\text{th}}$ order ordinary differential equation,
\[ y^{(n)}(t) = f(t, y(t), \dot{y}(t), \dots, y^{(n-1)}(t)) \text{ with } y^{(k)}(0) = y_0^k \text{ for } k < n \tag{11.25} \]
where $(y_0^0, \dots, y_0^{n-1})$ is given in $\mathcal{U}$, has a unique solution for small $t \in J$. Hint: let $\mathbf{y}(t) = (y(t), \dot{y}(t), \dots, y^{(n-1)}(t))$ and rewrite Eq. (11.25) as a first order ODE of the form
\[ \dot{\mathbf{y}}(t) = Z(t, \mathbf{y}(t)) \text{ with } \mathbf{y}(0) = (y_0^0, \dots, y_0^{n-1}). \]
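The reduction in the hint is also how higher order equations are usually handled numerically. The following Python sketch is only an illustration under stated assumptions: it treats the scalar case $n = 2$, uses the classical fourth order Runge-Kutta scheme (a standard method, not something taken from the text), and the names `first_order_field` and `rk4` are hypothetical.

```python
import math


def first_order_field(f):
    """Turn y'' = f(t, y, y') into the field Z(t, (y, y')) = (y', f(t, y, y'))."""
    def Z(t, state):
        y, dy = state
        return (dy, f(t, y, dy))
    return Z


def rk4(Z, state, t0, t1, steps):
    """Integrate state' = Z(t, state) from t0 to t1 with classical RK4."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = Z(t, state)
        k2 = Z(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k1)))
        k3 = Z(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k2)))
        k4 = Z(t + h, tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        t += h
    return state
```

For instance, $\ddot{y} = -y$ with $y(0) = 1$, $\dot{y}(0) = 0$ becomes the planar system with state $(y, \dot{y})$, and integrating to $t = 1$ should reproduce $(\cos 1, -\sin 1)$ to high accuracy.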
Exercise 11.6. Use the results of Exercises 10.19 and 11.5 to solve
\[ \ddot{y}(t) - 2\dot{y}(t) + y(t) = 0 \text{ with } y(0) = a \text{ and } \dot{y}(0) = b. \]
Hint: The $2 \times 2$ matrix associated to this system, $A$, has only one eigenvalue $1$ and may be written as $A = I + B$ where $B^2 = 0$.
Exercise 11.7 (Non-Homogeneous ODE). Suppose that $U \subset_o X$ is open and $Z : \mathbb{R} \times U \to X$ is a continuous function. Let $J = (a, b)$ be an interval and $t_0 \in J$. Suppose that $y \in C^1(J, U)$ is a solution to the "non-homogeneous" differential equation:
\[ \dot{y}(t) = Z(t, y(t)) \text{ with } y(t_0) = x \in U. \tag{11.26} \]
Define $Y \in C^1(J - t_0, \mathbb{R} \times U)$ by $Y(t) := (t + t_0, y(t + t_0))$. Show that $Y$ solves the "homogeneous" differential equation
\[ \dot{Y}(t) = \tilde{Z}(Y(t)) \text{ with } Y(0) = (t_0, x), \tag{11.27} \]
where $\tilde{Z}(t, x) := (1, Z(t, x))$. Conversely, suppose that $Y \in C^1(J - t_0, \mathbb{R} \times U)$ is a solution to Eq. (11.27). Show that $Y(t) = (t + t_0, y(t + t_0))$ for some $y \in C^1(J, U)$ satisfying Eq. (11.26). (In this way the theory of non-homogeneous O.D.E.'s may be reduced to the theory of homogeneous O.D.E.'s.)
Exercise 11.8 (Differential Equations with Parameters). Let $W$ be another Banach space, $U \times V \subset_o X \times W$ and $Z \in C(U \times V, X)$ be a locally Lipschitz function on $U \times V$. For each $(x, w) \in U \times V$, let $t \in J_{x,w} \to \phi(t, x, w)$ denote the maximal solution to the ODE
\[ \dot{y}(t) = Z(y(t), w) \text{ with } y(0) = x. \tag{11.28} \]
Prove
\[ \mathcal{D} := \{(t, x, w) \in \mathbb{R} \times U \times V : t \in J_{x,w}\} \tag{11.29} \]
is open in $\mathbb{R} \times U \times V$ and $\phi$ and $\dot{\phi}$ are continuous functions on $\mathcal{D}$.
Hint: If $y(t)$ solves the differential equation in (11.28), then $v(t) := (y(t), w)$ solves the differential equation,
\[ \dot{v}(t) = \tilde{Z}(v(t)) \text{ with } v(0) = (x, w), \tag{11.30} \]
where $\tilde{Z}(x, w) := (Z(x, w), 0) \in X \times W$, and let $\psi(t, (x, w)) := v(t)$. Now apply Theorem 11.15 to the differential equation (11.30).
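Exercise 11.9 (Abstract Wave Equation). For $A \in L(X)$ and $t \in \mathbb{R}$, let
\[ \cos(tA) := \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} t^{2n} A^{2n} \quad \text{and} \quad \frac{\sin(tA)}{A} := \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} t^{2n+1} A^{2n}. \]
Show that the unique solution $y \in C^2(\mathbb{R}, X)$ to
\[ \ddot{y}(t) + A^2 y(t) = 0 \text{ with } y(0) = y_0 \text{ and } \dot{y}(0) = \dot{y}_0 \in X \tag{11.31} \]
is given by
\[ y(t) = \cos(tA) y_0 + \frac{\sin(tA)}{A} \dot{y}_0. \]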
Remark 11.20. Exercise 11.9 can be done by direct verification. Alternatively and more instructively, rewrite Eq. (11.31) as a first order ODE using Exercise 11.5. In doing so you will be led to compute $e^{tB}$ where $B \in L(X \times X)$ is given by
\[ B = \begin{pmatrix} 0 & I \\ -A^2 & 0 \end{pmatrix}, \]
where we are writing elements of $X \times X$ as column vectors, $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. You should then show
\[ e^{tB} = \begin{pmatrix} \cos(tA) & \frac{\sin(tA)}{A} \\ -A\sin(tA) & \cos(tA) \end{pmatrix} \]
where
\[ A\sin(tA) := \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} t^{2n+1} A^{2(n+1)}. \]
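The block formula for $e^{tB}$ is easy to test numerically in the simplest case. The Python sketch below is an illustration only and assumes $X = \mathbb{R}$, so that $A$ is a nonzero real number $a$ and $B$ is an honest $2 \times 2$ matrix; the exponential is computed from a truncated power series. The helper names are hypothetical.

```python
import math


def mat_mul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_exp(M, terms=40):
    """Truncated power series sum_{n < terms} M^n / n! for a 2x2 matrix M."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        power = mat_mul(power, M)
        power = [[entry / n for entry in row] for row in power]  # M^n / n!
        result = [[result[i][j] + power[i][j] for j in range(2)]
                  for i in range(2)]
    return result


def exp_tB_series(t, a):
    """e^{tB} for B = [[0, 1], [-a^2, 0]], from the power series."""
    return mat_exp([[0.0, t], [-t * a * a, 0.0]])


def exp_tB_closed_form(t, a):
    """The block formula of Remark 11.20 specialised to a scalar a != 0."""
    c, s = math.cos(t * a), math.sin(t * a)
    return [[c, s / a], [-a * s, c]]
```

Comparing `exp_tB_series(0.7, 2.0)` with `exp_tB_closed_form(0.7, 2.0)` entry by entry gives agreement up to rounding error, which is consistent with the claimed formula but of course does not prove it for general $A \in L(X)$.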
Exercise 11.10 (Duhamel's Principle for the Abstract Wave Equation). Continue the notation in Exercise 11.9, but now consider the ODE,
\[ \ddot{y}(t) + A^2 y(t) = f(t) \text{ with } y(0) = y_0 \text{ and } \dot{y}(0) = \dot{y}_0 \in X \tag{11.32} \]
where $f \in C(\mathbb{R}, X)$. Show the unique solution to Eq. (11.32) is given by
\[ y(t) = \cos(tA) y_0 + \frac{\sin(tA)}{A} \dot{y}_0 + \int_0^t \frac{\sin((t - \tau)A)}{A} f(\tau) \, d\tau. \tag{11.33} \]
Hint: Again this could be proved by direct calculation. However it is more instructive to deduce Eq. (11.33) from Exercise 10.21 and the comments in Remark 11.20.
12
Banach Space Calculus
In this section, $X$ and $Y$ will be Banach spaces and $U$ will be an open subset of $X$.
Notation 12.1 ($\varepsilon$, $O$, and $o$ notation) Let $0 \in U \subset_o X$, and $f : U \to Y$ be a function. We will write:
1. $f(x) = \varepsilon(x)$ if $\lim_{x \to 0} \|f(x)\| = 0$.
2. $f(x) = O(x)$ if there are constants $C < \infty$ and $r > 0$ such that $\|f(x)\| \le C\|x\|$ for all $x \in B(0, r)$. This is equivalent to the condition that $\limsup_{x \to 0} \left( \|x\|^{-1} \|f(x)\| \right) < \infty$, where
\[ \limsup_{x \to 0} \frac{\|f(x)\|}{\|x\|} := \lim_{r \downarrow 0} \sup \left\{ \frac{\|f(x)\|}{\|x\|} : 0 < \|x\| \le r \right\}. \]
3. $f(x) = o(x)$ if $f(x) = \varepsilon(x) O(x)$, i.e. $\lim_{x \to 0} \|f(x)\| / \|x\| = 0$.
Example 12.2. Here are some examples of properties of these symbols.
1. A function $f : U \subset_o X \to Y$ is continuous at $x_0 \in U$ if $f(x_0 + h) = f(x_0) + \varepsilon(h)$.
2. If $f(x) = \varepsilon(x)$ and $g(x) = \varepsilon(x)$ then $f(x) + g(x) = \varepsilon(x)$.
Now let $g : Y \to Z$ be another function where $Z$ is another Banach space.
3. If $f(x) = O(x)$ and $g(y) = o(y)$ then $g \circ f(x) = o(x)$.
4. If $f(x) = \varepsilon(x)$ and $g(y) = \varepsilon(y)$ then $g \circ f(x) = \varepsilon(x)$.
12.1 The Differential
Definition 12.3. A function $f : U \subset_o X \to Y$ is differentiable at $x_0 \in U$ if there exists a linear transformation $\Lambda \in L(X, Y)$ such that
\[ f(x_0 + h) - f(x_0) - \Lambda h = o(h). \tag{12.1} \]
We denote $\Lambda$ by $f'(x_0)$ or $Df(x_0)$ if it exists. As with continuity, $f$ is differentiable on $U$ if $f$ is differentiable at all points in $U$.
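Remark 12.4. The linear transformation $\Lambda$ in Definition 12.3 is necessarily unique. Indeed if $\Lambda_1$ is another linear transformation such that Eq. (12.1) holds with $\Lambda$ replaced by $\Lambda_1$, then
\[ (\Lambda - \Lambda_1) h = o(h), \]
i.e.
\[ \limsup_{h \to 0} \frac{\|(\Lambda - \Lambda_1) h\|}{\|h\|} = 0. \]
On the other hand, by definition of the operator norm,
\[ \limsup_{h \to 0} \frac{\|(\Lambda - \Lambda_1) h\|}{\|h\|} = \|\Lambda - \Lambda_1\|. \]
The last two equations show that $\Lambda = \Lambda_1$.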
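Definition 12.3 can be made tangible with a quick numerical experiment. The following Python sketch is an illustration under assumptions that are not in the text: $X = Y = \mathbb{R}^2$ with the Euclidean norm, the sample map $f(x_1, x_2) = (x_1^2, x_1 x_2)$, and its Jacobian as the candidate for $\Lambda$. The function names are hypothetical.

```python
import math


def f(x):
    """Sample map f(x1, x2) = (x1**2, x1*x2) from R^2 to R^2."""
    return (x[0] ** 2, x[0] * x[1])


def Df(x, h):
    """Candidate differential: the Jacobian of f at x applied to h."""
    return (2 * x[0] * h[0], x[1] * h[0] + x[0] * h[1])


def remainder_ratio(x, h):
    """||f(x + h) - f(x) - Df(x) h|| / ||h||, which should tend to 0 with h."""
    fx = f(x)
    fxh = f((x[0] + h[0], x[1] + h[1]))
    lin = Df(x, h)
    rem = (fxh[0] - fx[0] - lin[0], fxh[1] - fx[1] - lin[1])
    return math.hypot(*rem) / math.hypot(*h)
```

For this $f$ the remainder is exactly $(h_1^2, h_1 h_2)$, so along $h = (s, s)$ the ratio equals $s$ and visibly tends to zero, which is what Eq. (12.1) demands.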
Exercise 12.1. Show that a function $f : (a, b) \to X$ is differentiable at $t \in (a, b)$ in the sense of Definition 10.9 iff it is differentiable in the sense of Definition 12.3. Also show $Df(t)v = v \dot{f}(t)$ for all $v \in \mathbb{R}$.
Example 12.5. If $T \in L(X, Y)$ and $x, h \in X$, then
\[ T(x + h) - T(x) - Th = 0 \]
which shows $T'(x) = T$ for all $x \in X$.
Example 12.6. Assume that $GL(X, Y)$ is non-empty. Then by Corollary 7.22, $GL(X, Y)$ is an open subset of $L(X, Y)$ and the inverse map $f : GL(X, Y) \to GL(Y, X)$, defined by $f(A) := A^{-1}$, is continuous. We will now show that $f$ is differentiable and
\[ f'(A)B = -A^{-1} B A^{-1} \text{ for all } B \in L(X, Y). \]
This is a consequence of the identity,
\[ f(A + H) - f(A) = (A + H)^{-1} (A - (A + H)) A^{-1} = -(A + H)^{-1} H A^{-1} \]
which may be used to find the estimate,
\[ \begin{aligned} \left\| f(A + H) - f(A) + A^{-1} H A^{-1} \right\| &= \left\| \left[ A^{-1} - (A + H)^{-1} \right] H A^{-1} \right\| \\ &\le \left\| A^{-1} - (A + H)^{-1} \right\| \|H\| \left\| A^{-1} \right\| \\ &\le \frac{\|A^{-1}\|^3 \|H\|^2}{1 - \|A^{-1}\| \|H\|} = O\left( \|H\|^2 \right) \end{aligned} \]
wherein we have used the bound in Eq. (7.10) of Corollary 7.22 for the last inequality.
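The quadratic size of the remainder can be observed numerically. The Python sketch below is an illustration under assumptions not made in the text: it works with real $2 \times 2$ matrices and uses the Frobenius norm as a convenient stand-in for the operator norm (the two are equivalent in finite dimensions). All helper names are hypothetical.

```python
import math


def mul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv(A):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def add(P, Q, sign=1.0):
    """P + sign * Q for 2x2 matrices."""
    return [[P[i][j] + sign * Q[i][j] for j in range(2)] for i in range(2)]


def frob(P):
    """Frobenius norm, used here in place of the operator norm."""
    return math.sqrt(sum(e * e for row in P for e in row))


def remainder_norm(A, H):
    """|| f(A + H) - f(A) + A^{-1} H A^{-1} || for f(A) = A^{-1}."""
    Ainv = inv(A)
    linear = mul(mul(Ainv, H), Ainv)          # A^{-1} H A^{-1}
    diff = add(inv(add(A, H)), Ainv, -1.0)    # f(A + H) - f(A)
    return frob(add(diff, linear))
```

Shrinking $H$ by a factor of 10 should shrink `remainder_norm(A, H)` by roughly a factor of 100, as the $O(\|H\|^2)$ estimate predicts.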
12.2 Product and Chain Rules
The following theorem summarizes some basic properties of the differential.
Theorem 12.7. The differential $D$ has the following properties:
1. Linearity: $D$ is linear, i.e. $D(f + \lambda g) = Df + \lambda Dg$.
2. Product Rule: If $f : U \subset_o X \to Y$ and $A : U \subset_o X \to L(Y, Z)$ are differentiable at $x_0$ then so is $x \to (Af)(x) := A(x) f(x)$ and
\[ D(Af)(x_0) h = (DA(x_0) h) f(x_0) + A(x_0) Df(x_0) h. \]
3. Chain Rule: If $f : U \subset_o X \to V \subset_o Y$ is differentiable at $x_0 \in U$, and $g : V \subset_o Y \to Z$ is differentiable at $y_0 := f(x_0)$, then $g \circ f$ is differentiable at $x_0$ and $(g \circ f)'(x_0) = g'(y_0) f'(x_0)$.
4. Converse Chain Rule: Suppose that $f : U \subset_o X \to V \subset_o Y$ is continuous at $x_0 \in U$, $g : V \subset_o Y \to Z$ is differentiable at $y_0 := f(x_0)$, $g'(y_0)$ is invertible, and $g \circ f$ is differentiable at $x_0$; then $f$ is differentiable at $x_0$ and
\[ f'(x_0) := [g'(y_0)]^{-1} (g \circ f)'(x_0). \tag{12.2} \]
Proof. Linearity. Let $f, g : U \subset_o X \to Y$ be two functions which are differentiable at $x_0 \in U$ and $\lambda \in \mathbb{R}$, then
\[ \begin{aligned} (f + \lambda g)(x_0 + h) &= f(x_0) + Df(x_0) h + o(h) + \lambda (g(x_0) + Dg(x_0) h + o(h)) \\ &= (f + \lambda g)(x_0) + (Df(x_0) + \lambda Dg(x_0)) h + o(h), \end{aligned} \]
which implies that $(f + \lambda g)$ is differentiable at $x_0$ and that
\[ D(f + \lambda g)(x_0) = Df(x_0) + \lambda Dg(x_0). \]
Product Rule. The computation,
\[ \begin{aligned} A(x_0 + h) f(x_0 + h) &= (A(x_0) + DA(x_0) h + o(h))(f(x_0) + f'(x_0) h + o(h)) \\ &= A(x_0) f(x_0) + A(x_0) f'(x_0) h + [DA(x_0) h] f(x_0) + o(h), \end{aligned} \]
verifies the product rule holds. This may also be considered as a special case of Proposition 12.9.
Chain Rule. Using $f(x_0 + h) - f(x_0) = O(h)$ (see Eq. (12.1)) and $o(O(h)) = o(h)$,
\[ \begin{aligned} (g \circ f)(x_0 + h) &= g(f(x_0)) + g'(f(x_0))(f(x_0 + h) - f(x_0)) + o(f(x_0 + h) - f(x_0)) \\ &= g(f(x_0)) + g'(f(x_0))(Df(x_0) h + o(h)) + o(f(x_0 + h) - f(x_0)) \\ &= g(f(x_0)) + g'(f(x_0)) Df(x_0) h + o(h). \end{aligned} \]
Converse Chain Rule. Since $g$ is differentiable at $y_0 = f(x_0)$ and $g'(y_0)$ is invertible,
\[ \begin{aligned} g(f(x_0 + h)) - g(f(x_0)) &= g'(f(x_0))(f(x_0 + h) - f(x_0)) + o(f(x_0 + h) - f(x_0)) \\ &= g'(f(x_0)) \left[ f(x_0 + h) - f(x_0) + o(f(x_0 + h) - f(x_0)) \right]. \end{aligned} \]
And since $g \circ f$ is differentiable at $x_0$,
\[ (g \circ f)(x_0 + h) - g(f(x_0)) = (g \circ f)'(x_0) h + o(h). \]
Comparing these two equations shows that
\[ f(x_0 + h) - f(x_0) + o(f(x_0 + h) - f(x_0)) = g'(f(x_0))^{-1} \left[ (g \circ f)'(x_0) h + o(h) \right] \]
which is equivalent to
\[ \begin{aligned} f(x_0 + h) - f(x_0) &= g'(f(x_0))^{-1} \left[ (g \circ f)'(x_0) h + o(h) \right] - o(f(x_0 + h) - f(x_0)) \\ &= g'(f(x_0))^{-1} (g \circ f)'(x_0) h + o(h) + o(f(x_0 + h) - f(x_0)). \end{aligned} \tag{12.3} \]
Using the continuity of $f$, $f(x_0 + h) - f(x_0)$ is close to $0$ if $h$ is close to zero, and hence
\[ \|o(f(x_0 + h) - f(x_0))\| \le \tfrac{1}{2} \|f(x_0 + h) - f(x_0)\| \tag{12.4} \]
for all $h$ sufficiently close to $0$. (We may replace $\tfrac{1}{2}$ by any number $\alpha > 0$ above.) Taking the norm of both sides of Eq. (12.3) and making use of Eq. (12.4) shows, for $h$ close to $0$, that
\[ \|f(x_0 + h) - f(x_0)\| \le \|g'(f(x_0))^{-1} (g \circ f)'(x_0)\| \|h\| + o(\|h\|) + \tfrac{1}{2} \|f(x_0 + h) - f(x_0)\|. \]
Solving for $\|f(x_0 + h) - f(x_0)\|$ in this last equation shows that
\[ f(x_0 + h) - f(x_0) = O(h). \tag{12.5} \]
(This is an improvement, since the continuity of $f$ only guaranteed that $f(x_0 + h) - f(x_0) = \varepsilon(h)$.) Because of Eq. (12.5), we now know that $o(f(x_0 + h) - f(x_0)) = o(h)$, which combined with Eq. (12.3) shows that
\[ f(x_0 + h) - f(x_0) = g'(f(x_0))^{-1} (g \circ f)'(x_0) h + o(h), \]
i.e. $f$ is differentiable at $x_0$ and $f'(x_0) = g'(f(x_0))^{-1} (g \circ f)'(x_0)$.
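Corollary 12.8 (Chain Rule). Suppose that $\sigma : (a, b) \to U \subset_o X$ is differentiable at $t \in (a, b)$ and $f : U \subset_o X \to Y$ is differentiable at $\sigma(t) \in U$. Then $f \circ \sigma$ is differentiable at $t$ and
\[ d(f \circ \sigma)(t)/dt = f'(\sigma(t)) \dot{\sigma}(t). \]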
Proposition 12.9 (Product Rule II). Suppose that $X := X_1 \times \cdots \times X_n$ with each $X_i$ being a Banach space and $T : X_1 \times \cdots \times X_n \to Y$ is a multilinear map, i.e.
\[ x_i \in X_i \to T(x_1, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_n) \in Y \]
is linear when $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$ are held fixed. Then the following are equivalent:
1. $T$ is continuous.
2. $T$ is continuous at $0 \in X$.
3. There exists a constant $C < \infty$ such that
\[ \|T(x)\|_Y \le C \prod_{i=1}^{n} \|x_i\|_{X_i} \tag{12.6} \]
for all $x = (x_1, \dots, x_n) \in X$.
4. $T$ is differentiable at all $x \in X_1 \times \cdots \times X_n$.
Moreover, if $T$ is differentiable, the differential of $T$ is given by
\[ T'(x) h = \sum_{i=1}^{n} T(x_1, \dots, x_{i-1}, h_i, x_{i+1}, \dots, x_n) \tag{12.7} \]
where $h = (h_1, \dots, h_n) \in X$.
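Proof. Let us equip $X$ with the norm
\[ \|x\|_X := \max_i \left\{ \|x_i\|_{X_i} \right\}. \]
If $T$ is continuous then $T$ is continuous at $0$. If $T$ is continuous at $0$, using $T(0) = 0$, there exists a $\delta > 0$ such that $\|T(x)\|_Y \le 1$ whenever $\|x\|_X \le \delta$. Now if $x \in X$ is arbitrary with $x_i \ne 0$ for all $i$ (if some $x_i = 0$ then $T(x) = 0$ and Eq. (12.6) is trivial), let $x' := \delta \left( \|x_1\|_{X_1}^{-1} x_1, \dots, \|x_n\|_{X_n}^{-1} x_n \right)$. Then $\|x'\|_X \le \delta$ and hence
\[ \left\| \left( \delta^n \prod_{i=1}^{n} \|x_i\|_{X_i}^{-1} \right) T(x_1, \dots, x_n) \right\|_Y = \|T(x')\| \le 1 \]
from which Eq. (12.6) follows with $C = \delta^{-n}$.
Now suppose that Eq. (12.6) holds. For $x, h \in X$ and $\varepsilon \in \{0, 1\}^n$ let $|\varepsilon| = \sum_{i=1}^{n} \varepsilon_i$ and
\[ x^\varepsilon(h) := ((1 - \varepsilon_1) x_1 + \varepsilon_1 h_1, \dots, (1 - \varepsilon_n) x_n + \varepsilon_n h_n) \in X. \]
By the multi-linearity of $T$,
\[ \begin{aligned} T(x + h) &= T(x_1 + h_1, \dots, x_n + h_n) = \sum_{\varepsilon \in \{0,1\}^n} T(x^\varepsilon(h)) \\ &= T(x) + \sum_{i=1}^{n} T(x_1, \dots, x_{i-1}, h_i, x_{i+1}, \dots, x_n) + \sum_{\varepsilon \in \{0,1\}^n : |\varepsilon| \ge 2} T(x^\varepsilon(h)). \end{aligned} \tag{12.8} \]
From Eq. (12.6),
\[ \left\| \sum_{\varepsilon \in \{0,1\}^n : |\varepsilon| \ge 2} T(x^\varepsilon(h)) \right\| = O\left( \|h\|^2 \right), \]
and so it follows from Eq. (12.8) that $T'(x)$ exists and is given by Eq. (12.7). This completes the proof since it is trivial to check that $T$ being differentiable at $x \in X$ implies continuity of $T$ at $x \in X$.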
Exercise 12.2. Let $\det : L(\mathbb{R}^n) \to \mathbb{R}$ be the determinant function on $n \times n$ matrices and for $A \in L(\mathbb{R}^n)$ we will let $A_i$ denote the $i^{\text{th}}$ column of $A$ and write $A = (A_1 | A_2 | \dots | A_n)$.
1. Show $\det{}'(A)$ exists for all $A \in L(\mathbb{R}^n)$ and
\[ \det{}'(A) H = \sum_{i=1}^{n} \det(A_1 | \dots | A_{i-1} | H_i | A_{i+1} | \dots | A_n) \tag{12.9} \]
for all $H \in L(\mathbb{R}^n)$. Hint: recall that $\det(A)$ is a multilinear function of its columns.
2. Use Eq. (12.9) along with basic properties of the determinant to show $\det{}'(I) H = \operatorname{tr}(H)$.
3. Suppose now that $A \in GL(\mathbb{R}^n)$, show
\[ \det{}'(A) H = \det(A) \operatorname{tr}(A^{-1} H). \]
Hint: Notice that $\det(A + H) = \det(A) \det\left( I + A^{-1} H \right)$.
4. If $A \in L(\mathbb{R}^n)$, show $\det\left( e^A \right) = e^{\operatorname{tr}(A)}$. Hint: use the previous item and Corollary 12.8 to show
\[ \frac{d}{dt} \det\left( e^{tA} \right) = \det\left( e^{tA} \right) \operatorname{tr}(A). \]
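Before attempting the proofs, it can be reassuring to test the formulas of items 1 to 3 on a concrete matrix. The Python sketch below is a numerical check only, restricted to the case $n = 2$, and is not a substitute for the exercise; the helper names are hypothetical. It compares the column formula (12.9), the trace formula of item 3 (written with the adjugate so that no inverse is needed), and a central difference quotient of $\det$.

```python
def det2(M):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def column_formula(A, H):
    """Right side of Eq. (12.9) for n = 2: swap in one column of H at a time."""
    first = [[H[0][0], A[0][1]], [H[1][0], A[1][1]]]
    second = [[A[0][0], H[0][1]], [A[1][0], H[1][1]]]
    return det2(first) + det2(second)


def trace_formula(A, H):
    """det(A) tr(A^{-1} H), computed as tr(adj(A) H)."""
    adj = [[A[1][1], -A[0][1]], [-A[1][0], A[0][0]]]
    return sum(adj[i][k] * H[k][i] for i in range(2) for k in range(2))


def difference_quotient(A, H, s=1e-6):
    """Central difference approximation of d/ds det(A + s H) at s = 0."""
    plus = [[A[i][j] + s * H[i][j] for j in range(2)] for i in range(2)]
    minus = [[A[i][j] - s * H[i][j] for j in range(2)] for i in range(2)]
    return (det2(plus) - det2(minus)) / (2 * s)
```

For $A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$ and $H = \begin{pmatrix} 1 & -1 \\ 0.5 & 2 \end{pmatrix}$ the three quantities agree, and with $A = I$ the column formula reduces to $\operatorname{tr}(H)$.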
Definition 12.10. Let $X$ and $Y$ be Banach spaces and let $L_1(X, Y) := L(X, Y)$ and for $k \ge 2$ let $L_k(X, Y)$ be defined inductively by $L_{k+1}(X, Y) = L(X, L_k(X, Y))$. For example $L_2(X, Y) = L(X, L(X, Y))$ and $L_3(X, Y) = L(X, L(X, L(X, Y)))$.
Suppose $f : U \subset_o X \to Y$ is a function. If $f$ is differentiable on $U$, then it makes sense to ask if $f' = Df : U \to L(X, Y) = L_1(X, Y)$ is differentiable. If $Df$ is differentiable on $U$ then $f'' = D^2 f := DDf : U \to L_2(X, Y)$. Similarly we define $f^{(n)} = D^n f : U \to L_n(X, Y)$ inductively.
Definition 12.11. Given $k \in \mathbb{N}$, let $C^k(U, Y)$ denote those functions $f : U \to Y$ such that $f^{(j)} := D^j f : U \to L_j(X, Y)$ exists and is continuous for $j = 1, 2, \dots, k$.
Example 12.12. Let us continue on with Example 12.6 but now let $X = Y$ to simplify the notation. So $f : GL(X) \to GL(X)$ is the map $f(A) = A^{-1}$ and
\[ f'(A) = -L_{A^{-1}} R_{A^{-1}}, \text{ i.e. } f' = -L_f R_f, \]
where $L_A B = AB$ and $R_A B = BA$ for all $A, B \in L(X)$. As the reader may easily check, the maps
\[ A \in L(X) \to L_A, R_A \in L(L(X)) \]
are linear and bounded. So by the chain and the product rule we find $f''(A)$ exists for all $A \in GL(X)$ and
\[ f''(A) B = -L_{f'(A)B} R_f - L_f R_{f'(A)B}. \]
More explicitly
\[ [f''(A) B] C = A^{-1} B A^{-1} C A^{-1} + A^{-1} C A^{-1} B A^{-1}. \tag{12.10} \]
Working inductively one shows $f : GL(X) \to GL(X)$ defined by $f(A) := A^{-1}$ is $C^\infty$.
12.3 Partial Derivatives
Definition 12.13 (Partial or Directional Derivative). Let $f : U \subset_o X \to Y$ be a function, $x_0 \in U$, and $v \in X$. We say that $f$ is differentiable at $x_0$ in the direction $v$ iff $\frac{d}{dt}\big|_0 (f(x_0 + tv)) =: (\partial_v f)(x_0)$ exists. We call $(\partial_v f)(x_0)$ the directional or partial derivative of $f$ at $x_0$ in the direction $v$.
Notice that if $f$ is differentiable at $x_0$, then $\partial_v f(x_0)$ exists and is equal to $f'(x_0) v$, see Corollary 12.8.
Proposition 12.14. Let $f : U \subset_o X \to Y$ be a continuous function and $D \subset X$ be a dense subspace of $X$. Assume $\partial_v f(x)$ exists for all $x \in U$ and $v \in D$, and there exists a continuous function $A : U \to L(X, Y)$ such that $\partial_v f(x) = A(x) v$ for all $v \in D$ and $x \in U \cap D$. Then $f \in C^1(U, Y)$ and $Df = A$.
Proof. Let $x_0 \in U$, $\varepsilon > 0$ such that $B(x_0, 2\varepsilon) \subset U$ and $M := \sup\{\|A(x)\| : x \in B(x_0, 2\varepsilon)\} < \infty$ (see footnote 1). For $x \in B(x_0, \varepsilon) \cap D$ and $v \in D \cap B(0, \varepsilon)$, by the fundamental theorem of calculus,
\[ f(x + v) - f(x) = \int_0^1 \frac{d f(x + tv)}{dt} \, dt = \int_0^1 (\partial_v f)(x + tv) \, dt = \int_0^1 A(x + tv) \, v \, dt. \tag{12.11} \]
For general $x \in B(x_0, \varepsilon)$ and $v \in B(0, \varepsilon)$, choose $x_n \in B(x_0, \varepsilon) \cap D$ and $v_n \in D \cap B(0, \varepsilon)$ such that $x_n \to x$ and $v_n \to v$. Then
\[ f(x_n + v_n) - f(x_n) = \int_0^1 A(x_n + tv_n) \, v_n \, dt \tag{12.12} \]
holds for all $n$. The left side of this last equation tends to $f(x + v) - f(x)$ by the continuity of $f$. For the right side of Eq. (12.12) we have
\[ \left\| \int_0^1 A(x + tv) \, v \, dt - \int_0^1 A(x_n + tv_n) \, v_n \, dt \right\| \le \int_0^1 \|A(x + tv) - A(x_n + tv_n)\| \|v\| \, dt + M \|v - v_n\|. \]
It now follows by the continuity of $A$, the fact that $\|A(x + tv) - A(x_n + tv_n)\| \le 2M$, and the dominated convergence theorem that the right side of Eq. (12.12) converges to $\int_0^1 A(x + tv) \, v \, dt$. Hence Eq. (12.11) is valid for all $x \in B(x_0, \varepsilon)$ and $v \in B(0, \varepsilon)$. We also see that
\[ f(x + v) - f(x) - A(x) v = \varepsilon(v) v, \tag{12.13} \]
where $\varepsilon(v) := \int_0^1 [A(x + tv) - A(x)] \, dt$. Now
\[ \|\varepsilon(v)\| \le \int_0^1 \|A(x + tv) - A(x)\| \, dt \le \max_{t \in [0,1]} \|A(x + tv) - A(x)\| \to 0 \text{ as } v \to 0, \]
by the continuity of $A$. Thus, we have shown that $f$ is differentiable and that $Df(x) = A(x)$.

Footnote 1. It should be noted well that, unlike in finite dimensions, closed and bounded sets need not be compact, so it is not sufficient to choose $\varepsilon$ sufficiently small so that $B(x_0, 2\varepsilon) \subset U$. Here is a counter example. Let $X \equiv H$ be a Hilbert space and $\{e_n\}_{n=1}^{\infty}$ be an orthonormal set. Define $f(x) \equiv \sum_{n=1}^{\infty} n \phi(\|x - e_n\|)$, where $\phi$ is any continuous function on $\mathbb{R}$ such that $\phi(0) = 1$ and $\phi$ is supported in $(-1, 1)$. Notice that $\|e_n - e_m\|^2 = 2$ for all $m \ne n$, so that $\|e_n - e_m\| = \sqrt{2}$. Using this fact it is rather easy to check that for any $x_0 \in H$, there is an $\varepsilon > 0$ such that for all $x \in B(x_0, \varepsilon)$, only one term in the sum defining $f$ is non-zero. Hence, $f$ is continuous. However, $f(e_n) = n \to \infty$ as $n \to \infty$.
Corollary 12.15. Suppose now that $X = \mathbb{R}^d$ and $f : U \subset_o X \to Y$ is a continuous function such that $\partial_i f(x) := \partial_{e_i} f(x)$ exists and is continuous on $U$ for $i = 1, 2, \dots, d$, where $\{e_i\}_{i=1}^{d}$ is the standard basis for $\mathbb{R}^d$. Then $f \in C^1(U, Y)$ and $Df(x) e_i = \partial_i f(x)$ for all $i$.
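Proof. For $x \in U$, let $A(x) : \mathbb{R}^d \to Y$ be the unique linear map such that $A(x) e_i = \partial_i f(x)$ for $i = 1, 2, \dots, d$. Then $A : U \to L(\mathbb{R}^d, Y)$ is a continuous map. Now let $v \in \mathbb{R}^d$ and $v^{(i)} := (v_1, v_2, \dots, v_i, 0, \dots, 0)$ for $i = 1, 2, \dots, d$ and $v^{(0)} := 0$. Then for $t \in \mathbb{R}$ near $0$, using the fundamental theorem of calculus and the definition of $\partial_i f(x)$,
\[ \begin{aligned} f(x + tv) - f(x) &= \sum_{i=1}^{d} \left[ f\left( x + tv^{(i)} \right) - f\left( x + tv^{(i-1)} \right) \right] \\ &= \sum_{i=1}^{d} \int_0^1 \frac{d}{ds} f\left( x + tv^{(i-1)} + s t v_i e_i \right) ds \\ &= \sum_{i=1}^{d} t v_i \int_0^1 \partial_i f\left( x + tv^{(i-1)} + s t v_i e_i \right) ds \\ &= \sum_{i=1}^{d} t v_i \int_0^1 A\left( x + tv^{(i-1)} + s t v_i e_i \right) e_i \, ds. \end{aligned} \]
Using the continuity of $A$, it now follows that
\[ \begin{aligned} \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t} &= \sum_{i=1}^{d} v_i \lim_{t \to 0} \int_0^1 A\left( x + tv^{(i-1)} + s t v_i e_i \right) e_i \, ds \\ &= \sum_{i=1}^{d} v_i \int_0^1 A(x) e_i \, ds = A(x) v \end{aligned} \]
which shows $\partial_v f(x)$ exists and $\partial_v f(x) = A(x) v$. The result now follows from an application of Proposition 12.14.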
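The identity $\partial_v f(x) = \sum_i v_i \partial_i f(x)$ established in the proof can be observed numerically. The Python sketch below is an illustration under assumptions not in the text: $d = 2$, $Y = \mathbb{R}$, the sample function $f(x, y) = \sin(x) e^{y}$, and central differences in place of exact derivatives. All names are hypothetical.

```python
import math


def f(p):
    """Sample smooth function on R^2."""
    x, y = p
    return math.sin(x) * math.exp(y)


def partial(f, p, i, h=1e-5):
    """Central difference approximation of the i-th partial derivative at p."""
    up, down = list(p), list(p)
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2 * h)


def directional(f, p, v, h=1e-5):
    """Central difference approximation of d/dt f(p + t v) at t = 0."""
    up = [a + h * b for a, b in zip(p, v)]
    down = [a - h * b for a, b in zip(p, v)]
    return (f(up) - f(down)) / (2 * h)


def from_partials(f, p, v):
    """A(p) v assembled from the partial derivatives, as in the proof."""
    return sum(v[i] * partial(f, p, i) for i in range(len(p)))
```

At $p = (0.3, -0.2)$ and $v = (2, -1)$, both `directional(f, p, v)` and `from_partials(f, p, v)` should be close to the exact value $(2\cos 0.3 - \sin 0.3)\, e^{-0.2}$.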
12.4 Higher Order Derivatives
It is somewhat inconvenient to work with the Banach spaces $L_k(X, Y)$ in Definition 12.10. For this reason we will introduce an isomorphic Banach space, $M_k(X, Y)$.
Definition 12.16. For $k \in \{1, 2, 3, \dots\}$, let $M_k(X, Y)$ denote the set of functions $f : X^k \to Y$ such that
1. For $i \in \{1, 2, \dots, k\}$, $v \in X \to f\langle v_1, v_2, \dots, v_{i-1}, v, v_{i+1}, \dots, v_k \rangle \in Y$ is linear (see footnote 2) for all $\{v_i\}_{i=1}^{k} \subset X$.
2. The norm $\|f\|_{M_k(X,Y)}$ should be finite, where
\[ \|f\|_{M_k(X,Y)} := \sup \left\{ \frac{\|f\langle v_1, v_2, \dots, v_k \rangle\|_Y}{\|v_1\| \|v_2\| \cdots \|v_k\|} : \{v_i\}_{i=1}^{k} \subset X \setminus \{0\} \right\}. \]
Lemma 12.17. There are linear operators $j_k : L_k(X, Y) \to M_k(X, Y)$ defined inductively as follows: $j_1 = Id_{L(X,Y)}$ (notice that $M_1(X, Y) = L_1(X, Y) = L(X, Y)$) and
\[ (j_{k+1} A)\langle v_0, v_1, \dots, v_k \rangle = (j_k(A v_0))\langle v_1, v_2, \dots, v_k \rangle \quad \forall\, v_i \in X. \]
(Notice that $A v_0 \in L_k(X, Y)$.) Moreover, the maps $j_k$ are isometric isomorphisms.
Proof. To get a feeling for what $j_k$ is let us write out $j_2$ and $j_3$ explicitly. If $A \in L_2(X, Y) = L(X, L(X, Y))$, then $(j_2 A)\langle v_1, v_2 \rangle = (A v_1) v_2$ and if $A \in L_3(X, Y) = L(X, L(X, L(X, Y)))$, $(j_3 A)\langle v_1, v_2, v_3 \rangle = ((A v_1) v_2) v_3$ for all $v_i \in X$. It is easily checked that $j_k$ is linear for all $k$. We will now show by induction that $j_k$ is an isometry and in particular that $j_k$ is injective. Clearly this is true if $k = 1$ since $j_1$ is the identity map. For $A \in L_{k+1}(X, Y)$,
\[ \begin{aligned} \|j_{k+1} A\|_{M_{k+1}(X,Y)} &:= \sup \left\{ \frac{\|(j_k(A v_0))\langle v_1, v_2, \dots, v_k \rangle\|_Y}{\|v_0\| \|v_1\| \|v_2\| \cdots \|v_k\|} : \{v_i\}_{i=0}^{k} \subset X \setminus \{0\} \right\} \\ &= \sup \left\{ \frac{\|j_k(A v_0)\|_{M_k(X,Y)}}{\|v_0\|} : v_0 \in X \setminus \{0\} \right\} \\ &= \sup \left\{ \frac{\|A v_0\|_{L_k(X,Y)}}{\|v_0\|} : v_0 \in X \setminus \{0\} \right\} \\ &= \|A\|_{L(X, L_k(X,Y))} := \|A\|_{L_{k+1}(X,Y)}, \end{aligned} \]
wherein for the second to last equality we have used the induction hypothesis. This shows that $j_{k+1}$ is an isometry provided $j_k$ is an isometry. To finish the proof it suffices to show that $j_k$ is surjective for all $k$. Again this is true for $k = 1$. Suppose that $j_k$ is invertible for some $k \ge 1$. Given $f \in M_{k+1}(X, Y)$ we must produce $A \in L_{k+1}(X, Y) = L(X, L_k(X, Y))$ such that $j_{k+1} A = f$. If such an equation is to hold, then for $v_0 \in X$, we would have $j_k(A v_0) = f\langle v_0, \cdots \rangle$. That is, $A v_0 = j_k^{-1}(f\langle v_0, \cdots \rangle)$. It is easily checked that $A$ so defined is linear, bounded, and $j_{k+1} A = f$.
From now on we will identify $L_k$ with $M_k$ without further mention. In particular, we will view $D^k f$ as a function on $U$ with values in $M_k(X, Y)$.
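In programming terms, $j_2$ is just "uncurrying": a map that eats $v_1$ and returns a linear functional of $v_2$ is repackaged as a function of the pair $(v_1, v_2)$. The Python sketch below illustrates this under the assumption $X = \mathbb{R}^n$, $Y = \mathbb{R}$, with the nested map built from a matrix $M$ via $A(v_1)(v_2) = v_1^{T} M v_2$. The names `nested_from_matrix` and `j2` are hypothetical.

```python
def nested_from_matrix(M):
    """An element A of L(X, L(X, R)) for X = R^n.

    A(v1) is the linear functional v2 -> v1^T M v2.
    """
    def A(v1):
        row = [sum(v1[i] * M[i][j] for i in range(len(v1)))
               for j in range(len(M[0]))]
        return lambda v2: sum(r * c for r, c in zip(row, v2))
    return A


def j2(A):
    """The identification j_2: (j_2 A)<v1, v2> = (A v1) v2."""
    return lambda v1, v2: A(v1)(v2)
```

With `M = [[1, 2], [3, 4]]`, the bilinear form `j2(nested_from_matrix(M))` sends `((1, 0), (0, 1))` to the matrix entry `2`.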
Footnote 2. I will routinely write $f\langle v_1, v_2, \dots, v_k \rangle$ rather than $f(v_1, v_2, \dots, v_k)$ when the function $f$ depends on each of the variables linearly, i.e. $f$ is a multi-linear function.
Theorem 12.18 (Differentiability). Suppose $k \in \{1, 2, \dots\}$ and $D$ is a dense subspace of $X$, $f : U \subset_o X \to Y$ is a function such that $(\partial_{v_1} \partial_{v_2} \cdots \partial_{v_l} f)(x)$ exists for all $x \in D \cap U$, $\{v_i\}_{i=1}^{l} \subset D$, and $l = 1, 2, \dots, k$. Further assume there exist continuous functions $A_l : U \subset_o X \to M_l(X, Y)$ such that $(\partial_{v_1} \partial_{v_2} \cdots \partial_{v_l} f)(x) = A_l(x)\langle v_1, v_2, \dots, v_l \rangle$ for all $x \in D \cap U$, $\{v_i\}_{i=1}^{l} \subset D$, and $l = 1, 2, \dots, k$. Then $D^l f(x)$ exists and is equal to $A_l(x)$ for all $x \in U$ and $l = 1, 2, \dots, k$.
Proof. We will prove the theorem by induction on $k$. We have already proved the theorem when $k = 1$, see Proposition 12.14. Now suppose that $k > 1$ and that the statement of the theorem holds when $k$ is replaced by $k - 1$. Hence we know that $D^l f(x) = A_l(x)$ for all $x \in U$ and $l = 1, 2, \dots, k - 1$. We are also given that
\[ (\partial_{v_1} \partial_{v_2} \cdots \partial_{v_k} f)(x) = A_k(x)\langle v_1, v_2, \dots, v_k \rangle \quad \forall\, x \in U \cap D,\ \{v_i\} \subset D. \tag{12.14} \]
Now we may write $(\partial_{v_2} \cdots \partial_{v_k} f)(x)$ as $(D^{k-1} f)(x)\langle v_2, v_3, \dots, v_k \rangle$ so that Eq. (12.14) may be written as
\[ \partial_{v_1} (D^{k-1} f)(x)\langle v_2, v_3, \dots, v_k \rangle = A_k(x)\langle v_1, v_2, \dots, v_k \rangle \quad \forall\, x \in U \cap D,\ \{v_i\} \subset D. \tag{12.15} \]
So by the fundamental theorem of calculus, we have that
\[ \left( (D^{k-1} f)(x + v_1) - (D^{k-1} f)(x) \right)\langle v_2, v_3, \dots, v_k \rangle = \int_0^1 A_k(x + t v_1)\langle v_1, v_2, \dots, v_k \rangle \, dt \tag{12.16} \]
for all $x \in U \cap D$ and $\{v_i\} \subset D$ with $v_1$ sufficiently small. By the same argument given in the proof of Proposition 12.14, Eq. (12.16) remains valid for all $x \in U$ and $\{v_i\} \subset X$ with $v_1$ sufficiently small. We may write this last equation alternatively as,
\[ (D^{k-1} f)(x + v_1) - (D^{k-1} f)(x) = \int_0^1 A_k(x + t v_1)\langle v_1, \cdots \rangle \, dt. \tag{12.17} \]
Hence
\[ (D^{k-1} f)(x + v_1) - (D^{k-1} f)(x) - A_k(x)\langle v_1, \cdots \rangle = \int_0^1 [A_k(x + t v_1) - A_k(x)]\langle v_1, \cdots \rangle \, dt \]
from which we get the estimate,
\[ \|(D^{k-1} f)(x + v_1) - (D^{k-1} f)(x) - A_k(x)\langle v_1, \cdots \rangle\| \le \varepsilon(v_1) \|v_1\| \tag{12.18} \]
where $\varepsilon(v_1) := \int_0^1 \|A_k(x + t v_1) - A_k(x)\| \, dt$. Notice by the continuity of $A_k$ that $\varepsilon(v_1) \to 0$ as $v_1 \to 0$. Thus it follows from Eq. (12.18) that $D^{k-1} f$ is differentiable and that $(D^k f)(x) = A_k(x)$.
Example 12.19. Let $f : GL(X, Y) \to GL(Y, X)$ be defined by $f(A) := A^{-1}$. We assume that $GL(X, Y)$ is not empty. Then $f$ is infinitely differentiable and
\[ (D^k f)(A)\langle V_1, V_2, \dots, V_k \rangle = (-1)^k \sum_{\sigma} A^{-1} V_{\sigma(1)} A^{-1} V_{\sigma(2)} A^{-1} \cdots A^{-1} V_{\sigma(k)} A^{-1}, \tag{12.19} \]
where the sum is over all permutations $\sigma$ of $\{1, 2, \dots, k\}$.
Let me check Eq. (12.19) in the case that $k = 2$, at a point $B \in GL(X, Y)$. Notice that we have already shown that $(\partial_{V_1} f)(B) = Df(B) V_1 = -B^{-1} V_1 B^{-1}$. Using the product rule we find that
\[ (\partial_{V_2} \partial_{V_1} f)(B) = B^{-1} V_2 B^{-1} V_1 B^{-1} + B^{-1} V_1 B^{-1} V_2 B^{-1} =: A_2(B)\langle V_1, V_2 \rangle. \]
Notice that $\|A_2(B)\langle V_1, V_2 \rangle\| \le 2 \|B^{-1}\|^3 \|V_1\| \|V_2\|$, so that $\|A_2(B)\| \le 2 \|B^{-1}\|^3 < \infty$. Hence $A_2 : GL(X, Y) \to M_2(L(X, Y), L(Y, X))$. Also
\[ \begin{aligned} \|(A_2(B) - A_2(C))\langle V_1, V_2 \rangle\| &\le 2 \|B^{-1} V_2 B^{-1} V_1 B^{-1} - C^{-1} V_2 C^{-1} V_1 C^{-1}\| \\ &\le 2 \|B^{-1} V_2 B^{-1} V_1 B^{-1} - B^{-1} V_2 B^{-1} V_1 C^{-1}\| \\ &\quad + 2 \|B^{-1} V_2 B^{-1} V_1 C^{-1} - B^{-1} V_2 C^{-1} V_1 C^{-1}\| \\ &\quad + 2 \|B^{-1} V_2 C^{-1} V_1 C^{-1} - C^{-1} V_2 C^{-1} V_1 C^{-1}\| \\ &\le 2 \|B^{-1}\|^2 \|V_2\| \|V_1\| \|B^{-1} - C^{-1}\| \\ &\quad + 2 \|B^{-1}\| \|C^{-1}\| \|V_2\| \|V_1\| \|B^{-1} - C^{-1}\| \\ &\quad + 2 \|C^{-1}\|^2 \|V_2\| \|V_1\| \|B^{-1} - C^{-1}\|. \end{aligned} \]
This shows that
\[ \|A_2(B) - A_2(C)\| \le 2 \|B^{-1} - C^{-1}\| \left\{ \|B^{-1}\|^2 + \|B^{-1}\| \|C^{-1}\| + \|C^{-1}\|^2 \right\}. \]
Since $B \to B^{-1}$ is differentiable and hence continuous, it follows that $A_2(B)$ is also continuous in $B$. Hence by Theorem 12.18 $D^2 f(B)$ exists and is given as in Eq. (12.19).
Example 12.20. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is a $C^\infty$ function and $F(x) := \int_0^1 f(x(t)) \, dt$ for $x \in X := C([0, 1], \mathbb{R})$ equipped with the norm $\|x\| := \max_{t \in [0,1]} |x(t)|$. Then $F : X \to \mathbb{R}$ is also infinitely differentiable and
\[ (D^k F)(x)\langle v_1, v_2, \dots, v_k \rangle = \int_0^1 f^{(k)}(x(t)) v_1(t) \cdots v_k(t) \, dt, \tag{12.20} \]
for all $x \in X$ and $v_i \in X$.
To verify this example, notice that
\[ \begin{aligned} (\partial_v F)(x) &:= \frac{d}{ds}\Big|_0 F(x + sv) = \frac{d}{ds}\Big|_0 \int_0^1 f(x(t) + s v(t)) \, dt \\ &= \int_0^1 \frac{d}{ds}\Big|_0 f(x(t) + s v(t)) \, dt = \int_0^1 f'(x(t)) v(t) \, dt. \end{aligned} \]
Similar computations show that
\[ (\partial_{v_1} \partial_{v_2} \cdots \partial_{v_k} F)(x) = \int_0^1 f^{(k)}(x(t)) v_1(t) \cdots v_k(t) \, dt =: A_k(x)\langle v_1, v_2, \dots, v_k \rangle. \]
Now for $x, y \in X$,
\[ \begin{aligned} |A_k(x)\langle v_1, v_2, \dots, v_k \rangle - A_k(y)\langle v_1, v_2, \dots, v_k \rangle| &\le \int_0^1 |f^{(k)}(x(t)) - f^{(k)}(y(t))| \cdot |v_1(t) \cdots v_k(t)| \, dt \\ &\le \prod_{i=1}^{k} \|v_i\| \int_0^1 |f^{(k)}(x(t)) - f^{(k)}(y(t))| \, dt, \end{aligned} \]
which shows that
\[ \|A_k(x) - A_k(y)\| \le \int_0^1 |f^{(k)}(x(t)) - f^{(k)}(y(t))| \, dt. \]
This last expression is easily seen to go to zero as $y \to x$ in $X$. Hence $A_k$ is continuous. Thus we may apply Theorem 12.18 to conclude that Eq. (12.20) is valid.
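Formula (12.20) with $k = 1$ can be tested numerically. The Python sketch below is an illustration under assumptions that go beyond the text: it takes $f = \sin$, replaces the integral over $[0, 1]$ by a midpoint rule, and compares the formula $\int_0^1 f'(x(t)) v(t) \, dt$ with a central difference quotient of $s \to F(x + sv)$. The function names are hypothetical.

```python
import math


def integrate(g, n=2000):
    """Midpoint rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))


def F(x):
    """F(x) = integral of sin(x(t)) dt, for a function x on [0, 1]."""
    return integrate(lambda t: math.sin(x(t)))


def dF(x, v):
    """Eq. (12.20) with k = 1 and f = sin: integral of cos(x(t)) v(t) dt."""
    return integrate(lambda t: math.cos(x(t)) * v(t))


def dF_numeric(x, v, s=1e-5):
    """Central difference quotient of s -> F(x + s v) at s = 0."""
    plus = F(lambda t: x(t) + s * v(t))
    minus = F(lambda t: x(t) - s * v(t))
    return (plus - minus) / (2 * s)
```

With, say, $x(t) = t^2$ and $v(t) = 1 + t$, the two quantities `dF(x, v)` and `dF_numeric(x, v)` agree to many digits.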
12.5 Inverse and Implicit Function Theorems
In this section, let $X$ be a Banach space, $R > 0$, $U = B = B(0, R) \subset X$ and $\varepsilon : U \to X$ be a continuous function such that $\varepsilon(0) = 0$. Our immediate goal is to give a sufficient condition on $\varepsilon$ so that $F(x) := x + \varepsilon(x)$ is a homeomorphism from $U$ to $F(U)$ with $F(U)$ being an open subset of $X$. Let's start by looking at the one dimensional case first. So for the moment assume that $X = \mathbb{R}$, $U = (-1, 1)$, and $\varepsilon : U \to \mathbb{R}$ is $C^1$. Then $F$ will be injective iff $F$ is either strictly increasing or decreasing. Since we are thinking that $F$ is a "small" perturbation of the identity function we will assume that $F$ is strictly increasing, i.e. $F' = 1 + \varepsilon' > 0$. This positivity condition is not so easily interpreted for operators on a Banach space. However the condition that $|\varepsilon'| \le \alpha < 1$ is easily interpreted in the Banach space setting and it implies $1 + \varepsilon' > 0$.
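Lemma 12.21. Suppose that $U = B = B(0, R)$ ($R > 0$) is a ball in $X$ and $\epsilon : B \to X$ is a $C^1$ function such that $\|D\epsilon\| \le \alpha < \infty$ on $U$. Then
\[
\|\epsilon(x) - \epsilon(y)\| \le \alpha \|x - y\| \quad \text{for all } x, y \in U. \tag{12.21}
\]
Proof. By the fundamental theorem of calculus and the chain rule:
\[
\epsilon(y) - \epsilon(x) = \int_0^1 \frac{d}{dt}\, \epsilon(x + t(y - x))\,dt = \int_0^1 [D\epsilon(x + t(y - x))](y - x)\,dt.
\]
Therefore, by the triangle inequality and the assumption that $\|D\epsilon(x)\| \le \alpha$ on $B$,
\[
\|\epsilon(y) - \epsilon(x)\| \le \int_0^1 \|D\epsilon(x + t(y - x))\|\,dt \cdot \|y - x\| \le \alpha \|y - x\|.
\]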
Remark 12.22. It is easily checked that if $\epsilon : U = B(0, R) \to X$ is $C^1$ and satisfies (12.21), then $\|D\epsilon\| \le \alpha$ on $U$.
Using the above remark and the analogy to the one dimensional example, one is led to the following proposition.
Proposition 12.23. Suppose $\alpha \in (0, 1)$, $R > 0$, $U = B(0, R) \subset_o X$ and $\epsilon : U \to X$ is a continuous function such that $\epsilon(0) = 0$ and
\[
\|\epsilon(x) - \epsilon(y)\| \le \alpha \|x - y\| \quad \forall\, x, y \in U. \tag{12.22}
\]
Then $F : U \to X$ defined by $F(x) := x + \epsilon(x)$ for $x \in U$ satisfies:
1. $F$ is an injective map and $G = F^{-1} : V := F(U) \to U$ is continuous.
2. If $x_0 \in U$, $z_0 = F(x_0)$ and $r > 0$ are such that $B(x_0, r) \subset U$, then
\[
B(z_0, (1 - \alpha) r) \subset F(B(x_0, r)) \subset B(z_0, (1 + \alpha) r). \tag{12.23}
\]
In particular, for all $r \le R$,
\[
B(0, (1 - \alpha) r) \subset F(B(0, r)) \subset B(0, (1 + \alpha) r), \tag{12.24}
\]
see Figure 12.1 below.
3. $V := F(U)$ is an open subset of $X$ and $F : U \to V$ is a homeomorphism.
Proof.
1. Using the definition of $F$ and the estimate in Eq. (12.22),
\[
\begin{aligned}
\|x - y\| &= \|(F(x) - F(y)) - (\epsilon(x) - \epsilon(y))\| \\
&\le \|F(x) - F(y)\| + \|\epsilon(x) - \epsilon(y)\| \\
&\le \|F(x) - F(y)\| + \alpha \|x - y\|
\end{aligned}
\]
for all $x, y \in U$. This implies
\[
\|x - y\| \le (1 - \alpha)^{-1} \|F(x) - F(y)\| \tag{12.25}
\]
which shows $F$ is injective on $U$ and hence shows the inverse function $G = F^{-1} : V := F(U) \to U$ is well defined. Moreover, replacing $x, y$ in Eq. (12.25) by $G(x)$ and $G(y)$ respectively with $x, y \in V$ shows
\[
\|G(x) - G(y)\| \le (1 - \alpha)^{-1} \|x - y\| \quad \text{for all } x, y \in V. \tag{12.26}
\]
Hence $G$ is Lipschitz on $V$ and hence continuous.
Fig. 12.1. Nesting of $F(B(x_0, r))$ between $B(z_0, (1 - \alpha) r)$ and $B(z_0, (1 + \alpha) r)$.
2. Let $x_0 \in U$, $r > 0$ and $z_0 = F(x_0) = x_0 + \epsilon(x_0)$ be as in item 2. The second inclusion in Eq. (12.23) follows from the simple computation:
\[
\begin{aligned}
\|F(x_0 + h) - z_0\| &= \|h + \epsilon(x_0 + h) - \epsilon(x_0)\| \\
&\le \|h\| + \|\epsilon(x_0 + h) - \epsilon(x_0)\| \\
&\le (1 + \alpha) \|h\| < (1 + \alpha) r
\end{aligned}
\]
for all $h \in B(0, r)$. To prove the first inclusion in Eq. (12.23) we must find, for every $z \in B(z_0, (1 - \alpha) r)$, an $h \in B(0, r)$ such that $z = F(x_0 + h)$, or equivalently an $h \in B(0, r)$ solving
\[
z - z_0 = F(x_0 + h) - F(x_0) = h + \epsilon(x_0 + h) - \epsilon(x_0).
\]
Let $k := z - z_0$ and for $h \in B(0, r)$, let $\delta(h) := \epsilon(x_0 + h) - \epsilon(x_0)$. With this notation it suffices to show that for each $k \in B(0, (1 - \alpha) r)$ there exists $h \in B(0, r)$ such that $k = h + \delta(h)$. Notice that $\delta(0) = 0$ and
\[
\|\delta(h_1) - \delta(h_2)\| = \|\epsilon(x_0 + h_1) - \epsilon(x_0 + h_2)\| \le \alpha \|h_1 - h_2\| \tag{12.27}
\]
for all $h_1, h_2 \in B(0, r)$. We are now going to solve the equation $k = h + \delta(h)$ for $h$ by the method of successive approximations, starting with $h_0 = 0$ and then defining $h_n$ inductively by
\[
h_{n+1} = k - \delta(h_n). \tag{12.28}
\]
A simple induction argument using Eq. (12.27) shows that
\[
\|h_{n+1} - h_n\| \le \alpha^n \|k\| \quad \text{for all } n \in \mathbb{N}_0
\]
and in particular that
\[
\|h_N\| = \Big\| \sum_{n=0}^{N-1} (h_{n+1} - h_n) \Big\| \le \sum_{n=0}^{N-1} \|h_{n+1} - h_n\| \le \sum_{n=0}^{N-1} \alpha^n \|k\| = \frac{1 - \alpha^N}{1 - \alpha} \|k\|. \tag{12.29}
\]
Since $\|k\| < (1 - \alpha) r$, this implies that $\|h_N\| < r$ for all $N$, showing the approximation procedure is well defined. Let
\[
h := \lim_{N \to \infty} h_N = \sum_{n=0}^{\infty} (h_{n+1} - h_n) \in X,
\]
which exists since the sum in the previous equation is absolutely convergent. Passing to the limit in Eqs. (12.29) and (12.28) shows that $\|h\| \le (1 - \alpha)^{-1} \|k\| < r$ and $h = k - \delta(h)$, i.e. $h \in B(0, r)$ solves $k = h + \delta(h)$ as desired.
3. Given $x_0 \in U$, the first inclusion in Eq. (12.23) shows that $z_0 = F(x_0)$ is in the interior of $F(U)$. Since $z_0 \in F(U)$ was arbitrary, it follows that $V = F(U)$ is open. The continuity of the inverse function has already been proved in item 1.
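The successive approximation scheme used in item 2 of the proof can be tried numerically. The following is a minimal sketch on $X = \mathbb{R}$, assuming a hypothetical perturbation `delta(h) = 0.5 * sin(h)` (not from the text), which vanishes at 0 and is Lipschitz with constant `alpha = 0.5`. The function name `solve_perturbed_identity` is likewise an illustrative choice.

```python
import math

def solve_perturbed_identity(k, delta, alpha, max_iter=200, tol=1e-14):
    """Solve k = h + delta(h) by iterating h_{n+1} = k - delta(h_n) from h_0 = 0.

    Assumes delta(0) = 0 and |delta(a) - delta(b)| <= alpha * |a - b|
    with 0 < alpha < 1, as in the Lipschitz estimate for delta.
    """
    h = 0.0
    for n in range(max_iter):
        h_next = k - delta(h)
        # The induction step in the proof: |h_{n+1} - h_n| <= alpha**n * |k|.
        # A tiny slack absorbs floating point rounding.
        assert abs(h_next - h) <= alpha ** n * abs(k) + 1e-15
        if abs(h_next - h) < tol:
            return h_next
        h = h_next
    return h

alpha = 0.5

def delta(h):
    # Hypothetical perturbation: delta(0) = 0, Lipschitz constant 0.5.
    return alpha * math.sin(h)

k = 0.3
h = solve_perturbed_identity(k, delta, alpha)
residual = abs(h + delta(h) - k)
```

The limit also respects the bound $\|h\| \le (1 - \alpha)^{-1} \|k\|$ obtained by passing to the limit in the proof.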
For the remainder of this section let $X$ and $Y$ be two Banach spaces, $U \subset_o X$, $k \ge 1$, and $f \in C^k(U, Y)$.
Lemma 12.24. Suppose $x_0 \in U$, $R > 0$ is such that $B^X(x_0, R) \subset U$ and $T : B^X(x_0, R) \to Y$ is a $C^1$ function such that $T'(x_0)$ is invertible. Let
\[
\alpha(R) := \sup_{x \in B^X(x_0, R)} \left\| T'(x_0)^{-1} T'(x) - I \right\|_{L(X)} \tag{12.30}
\]
and $\epsilon \in C^1\left(B^X(0, R), X\right)$ be defined by
\[
\epsilon(h) = T'(x_0)^{-1} [T(x_0 + h) - T(x_0)] - h \tag{12.31}
\]
so that
\[
T(x_0 + h) = T(x_0) + T'(x_0)(h + \epsilon(h)). \tag{12.32}
\]
Then $\epsilon(h) = o(h)$ as $h \to 0$ and
\[
\|\epsilon(h') - \epsilon(h)\| \le \alpha(R) \|h' - h\| \quad \text{for all } h, h' \in B^X(0, R). \tag{12.33}
\]
If $\alpha(R) < 1$ (which may be achieved by shrinking $R$ if necessary), then $T'(x)$ is invertible for all $x \in B^X(x_0, R)$ and
\[
\sup_{x \in B^X(x_0, R)} \left\| T'(x)^{-1} \right\|_{L(Y, X)} \le \frac{1}{1 - \alpha(R)} \left\| T'(x_0)^{-1} \right\|_{L(Y, X)}. \tag{12.34}
\]
Proof. By definition of $T'(x_0)$ and using that $T'(x_0)^{-1}$ exists,
\[
T(x_0 + h) - T(x_0) = T'(x_0) h + o(h),
\]
from which it follows that $\epsilon(h) = o(h)$. In fact by the fundamental theorem of calculus,
\[
\epsilon(h) = \int_0^1 \left( T'(x_0)^{-1} T'(x_0 + th) - I \right) h\,dt,
\]
but we will not use this here. Let $h, h' \in B^X(0, R)$ and apply the fundamental theorem of calculus to $t \to T(x_0 + h + t(h' - h))$ to conclude
\[
\begin{aligned}
\epsilon(h') - \epsilon(h) &= T'(x_0)^{-1} [T(x_0 + h') - T(x_0 + h)] - (h' - h) \\
&= \left[ \int_0^1 \left( T'(x_0)^{-1} T'(x_0 + h + t(h' - h)) - I \right) dt \right] (h' - h).
\end{aligned}
\]
Taking norms of this equation gives
\[
\|\epsilon(h') - \epsilon(h)\| \le \left[ \int_0^1 \left\| T'(x_0)^{-1} T'(x_0 + h + t(h' - h)) - I \right\| dt \right] \|h' - h\| \le \alpha(R) \|h' - h\|.
\]
It only remains to prove Eq. (12.34), so suppose now that $\alpha(R) < 1$. Then by Proposition 7.21, $T'(x_0)^{-1} T'(x) = I - \left( I - T'(x_0)^{-1} T'(x) \right)$ is invertible and
\[
\left\| \left[ T'(x_0)^{-1} T'(x) \right]^{-1} \right\| \le \frac{1}{1 - \alpha(R)} \quad \text{for all } x \in B^X(x_0, R).
\]
Since $T'(x) = T'(x_0) \left[ T'(x_0)^{-1} T'(x) \right]$, this implies $T'(x)$ is invertible and
\[
\left\| T'(x)^{-1} \right\| = \left\| \left[ T'(x_0)^{-1} T'(x) \right]^{-1} T'(x_0)^{-1} \right\| \le \frac{1}{1 - \alpha(R)} \left\| T'(x_0)^{-1} \right\| \quad \text{for all } x \in B^X(x_0, R).
\]
Theorem 12.25 (Inverse Function Theorem). Suppose $U \subset_o X$, $k \ge 1$ and $T \in C^k(U, Y)$ such that $T'(x)$ is invertible for all $x \in U$. Further assume $x_0 \in U$ and $R > 0$ such that $B^X(x_0, R) \subset U$.
1. For all $r \le R$,
\[
T(B^X(x_0, r)) \subset T(x_0) + T'(x_0) B^X(0, (1 + \alpha(r)) r). \tag{12.35}
\]
2. If we further assume that
\[
\alpha(R) := \sup_{x \in B^X(x_0, R)} \left\| T'(x_0)^{-1} T'(x) - I \right\| < 1,
\]
which may always be achieved by taking $R$ sufficiently small, then
\[
T(x_0) + T'(x_0) B^X(0, (1 - \alpha(r)) r) \subset T(B^X(x_0, r)) \tag{12.36}
\]
for all $r \le R$, see Figure 12.2.
3. $T : U \to Y$ is an open mapping, in particular $V := T(U) \subset_o Y$.
4. Again if $R$ is sufficiently small so that $\alpha(R) < 1$, then $T|_{B^X(x_0, R)} : B^X(x_0, R) \to T(B^X(x_0, R))$ is invertible and $T|_{B^X(x_0, R)}^{-1} : T\left(B^X(x_0, R)\right) \to B^X(x_0, R)$ is a $C^k$ map.
5. If $T$ is injective, then $T^{-1} : V \to U$ is also a $C^k$ map and
\[
\left( T^{-1} \right)'(y) = \left[ T'(T^{-1}(y)) \right]^{-1} \quad \text{for all } y \in V.
\]
Fig. 12.2. The nesting of $T(B^X(x_0, r))$ between $T(x_0) + T'(x_0) B^X(0, (1 - \alpha(r)) r)$ and $T(x_0) + T'(x_0) B^X(0, (1 + \alpha(r)) r)$.
Proof. Let $\epsilon \in C^1\left(B^X(0, R), X\right)$ be as defined in Eq. (12.31).
1. Using Eqs. (12.32) and (12.24),
\[
\begin{aligned}
T\left(B^X(x_0, r)\right) &= T(x_0) + T'(x_0)(I + \epsilon)\left(B^X(0, r)\right) \qquad (12.37) \\
&\subset T(x_0) + T'(x_0) B^X(0, (1 + \alpha(r)) r).
\end{aligned}
\]
2. Now assume $\alpha(R) < 1$, then by Eqs. (12.37) and (12.24),
\[
T(x_0) + T'(x_0) B^X(0, (1 - \alpha(r)) r) \subset T(x_0) + T'(x_0)(I + \epsilon)\left(B^X(0, r)\right) = T\left(B^X(x_0, r)\right).
\]
3. Notice that $h \in X \to T(x_0) + T'(x_0) h \in Y$ is a homeomorphism. The fact that $T$ is an open map follows easily from Eq. (12.36), which shows that $T(x_0)$ is in the interior of $T(W)$ for any $W \subset_o X$ with $x_0 \in W$.
4. The fact that $T|_{B^X(x_0, R)} : B^X(x_0, R) \to T(B^X(x_0, R))$ is invertible with a continuous inverse follows from Eq. (12.32) and Proposition 12.23. It now follows from the converse to the chain rule, Theorem 12.7, that $g := T|_{B^X(x_0, R)}^{-1} : T\left(B^X(x_0, R)\right) \to B^X(x_0, R)$ is differentiable and
\[
g'(y) = [T'(g(y))]^{-1} \quad \text{for all } y \in T\left(B^X(x_0, R)\right).
\]
This equation shows $g$ is $C^1$. Now suppose that $k \ge 2$. Since $T' \in C^{k-1}(B, L(X))$ and $i(A) := A^{-1}$ is a smooth map by Example 12.19, $g' = i \circ T' \circ g$ is $C^1$, i.e. $g$ is $C^2$. If $k \ge 3$, we may use the same argument to now show $g$ is $C^3$. Continuing this way inductively, we learn $g$ is $C^k$.
5. Since differentiability and smoothness are local properties, the assertion in item 5 follows directly from what has already been proved.
Theorem 12.26 (Implicit Function Theorem). Suppose that $X$, $Y$, and $W$ are three Banach spaces, $k \ge 1$, $A \subset X \times Y$ is an open set, $(x_0, y_0)$ is a point in $A$, and $f : A \to W$ is a $C^k$ map such that $f(x_0, y_0) = 0$. Assume that $D_2 f(x_0, y_0) := D(f(x_0, \cdot))(y_0) : Y \to W$ is a bounded invertible linear transformation. Then there is an open neighborhood $U_0$ of $x_0$ in $X$ such that for all connected open neighborhoods $U$ of $x_0$ contained in $U_0$, there is a unique continuous function $u : U \to Y$ such that $u(x_0) = y_0$, $(x, u(x)) \in A$ and $f(x, u(x)) = 0$ for all $x \in U$. Moreover $u$ is necessarily $C^k$ and
\[
Du(x) = -D_2 f(x, u(x))^{-1} D_1 f(x, u(x)) \quad \text{for all } x \in U. \tag{12.38}
\]
Proof. By replacing $f$ by $(x, y) \to D_2 f(x_0, y_0)^{-1} f(x, y)$ if necessary, we may assume without loss of generality that $W = Y$ and $D_2 f(x_0, y_0) = I_Y$. Define $F : A \to X \times Y$ by $F(x, y) := (x, f(x, y))$ for all $(x, y) \in A$. Notice that, acting on column vectors in $X \times Y$,
\[
DF(x, y) = \begin{bmatrix} I & 0 \\ D_1 f(x, y) & D_2 f(x, y) \end{bmatrix},
\]
which is invertible iff $D_2 f(x, y)$ is invertible, and if $D_2 f(x, y)$ is invertible then
\[
DF(x, y)^{-1} = \begin{bmatrix} I & 0 \\ -D_2 f(x, y)^{-1} D_1 f(x, y) & D_2 f(x, y)^{-1} \end{bmatrix}.
\]
Since $D_2 f(x_0, y_0) = I$ is invertible, the inverse function theorem guarantees that there exist neighborhoods $U_0$ of $x_0$ and $V_0$ of $y_0$ such that $U_0 \times V_0 \subset A$, $F(U_0 \times V_0)$ is open in $X \times Y$, and $F|_{U_0 \times V_0}$ has a $C^k$ inverse which we call $F^{-1}$. Let $\pi_2(x, y) := y$ for all $(x, y) \in X \times Y$ and define the $C^k$ function $u_0$ on $U_0$ by $u_0(x) := \pi_2 \circ F^{-1}(x, 0)$. Since $F^{-1}(x, 0) = (\tilde{x}, u_0(x))$ iff
\[
(x, 0) = F(\tilde{x}, u_0(x)) = (\tilde{x}, f(\tilde{x}, u_0(x))),
\]
it follows that $x = \tilde{x}$ and $f(x, u_0(x)) = 0$. Thus
\[
(x, u_0(x)) = F^{-1}(x, 0) \in U_0 \times V_0 \subset A
\]
and $f(x, u_0(x)) = 0$ for all $x \in U_0$. Moreover, $u_0$ is $C^k$, being the composition of the $C^k$ functions $x \to (x, 0)$, $F^{-1}$, and $\pi_2$. So if $U \subset U_0$ is a connected set containing $x_0$, we may define $u := u_0|_U$ to show the existence of the functions $u$ as described in the statement of the theorem. The only statement left to prove is the uniqueness of such a function $u$. Suppose that $u_1 : U \to Y$ is another continuous function such that $u_1(x_0) = y_0$, $(x, u_1(x)) \in A$ and $f(x, u_1(x)) = 0$ for all $x \in U$. Let
\[
O := \{ x \in U \mid u(x) = u_1(x) \} = \{ x \in U \mid u_0(x) = u_1(x) \}.
\]
Clearly $O$ is a (relatively) closed subset of $U$ which is not empty since $x_0 \in O$. Because $U$ is connected, if we show that $O$ is also an open set we will have shown that $O = U$, or equivalently that $u_1 = u_0$ on $U$. So suppose that $x \in O$, i.e. $u_0(x) = u_1(x)$. For $\tilde{x}$ near $x \in U$,
\[
0 = 0 - 0 = f(\tilde{x}, u_1(\tilde{x})) - f(\tilde{x}, u_0(\tilde{x})) = R(\tilde{x})(u_1(\tilde{x}) - u_0(\tilde{x})) \tag{12.39}
\]
where
\[
R(\tilde{x}) := \int_0^1 D_2 f(\tilde{x}, u_0(\tilde{x}) + t(u_1(\tilde{x}) - u_0(\tilde{x})))\,dt. \tag{12.40}
\]
From Eq. (12.40) and the continuity of $u_0$ and $u_1$, $\lim_{\tilde{x} \to x} R(\tilde{x}) = D_2 f(x, u_0(x))$, which is invertible (see footnote 3). Thus $R(\tilde{x})$ is invertible for all $\tilde{x}$ sufficiently close to $x$, which combined with Eq. (12.39) implies that $u_1(\tilde{x}) = u_0(\tilde{x})$ for all $\tilde{x}$ sufficiently close to $x$. Since $x \in O$ was arbitrary, we have shown that $O$ is open.
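As a quick sanity check of Eq. (12.38), here is a worked instance. The function $f(x, y) = x^2 + y^2 - 1$ on $\mathbb{R} \times \mathbb{R}$ and the base point $(0, 1)$ are illustrative choices, not taken from the text.

```latex
\[
f(x, y) = x^2 + y^2 - 1, \qquad (x_0, y_0) = (0, 1), \qquad D_2 f(0, 1) = 2y\big|_{y=1} = 2 \neq 0 .
\]
% The hypotheses hold, and on U_0 = (-1, 1) the implicit function is explicit:
\[
u(x) = \sqrt{1 - x^2}, \qquad f(x, u(x)) = x^2 + (1 - x^2) - 1 = 0 .
\]
% Eq. (12.38) with D_1 f = 2x and D_2 f = 2y predicts
\[
Du(x) = -\bigl(2 u(x)\bigr)^{-1} (2x) = -\frac{x}{\sqrt{1 - x^2}},
\]
% which agrees with differentiating u directly:
\[
\frac{d}{dx} \sqrt{1 - x^2} = -\frac{x}{\sqrt{1 - x^2}} .
\]
```

Note that $D_2 f(x, u(x)) = 2\sqrt{1 - x^2}$ degenerates as $x \to \pm 1$, which is why the neighborhood cannot be extended beyond $(-1, 1)$ in this example.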
12.6 Smooth Dependence of ODEs on Initial
Conditions*
In this subsection, let $X$ be a Banach space, $U \subset_o X$ and $J$ be an open interval with $0 \in J$.
Lemma 12.27. If $Z \in C(J \times U, X)$ is such that $D_x Z(t, x)$ exists for all $(t, x) \in J \times U$ and $D_x Z(t, x) \in C(J \times U, X)$, then $Z$ is locally Lipschitz in $x$, see Definition 11.6.
Proof. Suppose $I \subset J$ is a compact interval and $x \in U$. By the continuity of $DZ$, for every $t \in I$ there is an open neighborhood $N_t$ of $t \in I$ and $\epsilon_t > 0$ such that $B(x, \epsilon_t) \subset U$ and
\[
\sup \{ \|D_x Z(t', x')\| : (t', x') \in N_t \times B(x, \epsilon_t) \} < \infty.
\]
By the compactness of $I$, there exists a finite subset $\Lambda \subset I$ such that $I \subset \bigcup_{t \in \Lambda} N_t$. Let $\epsilon(x, I) := \min \{ \epsilon_t : t \in \Lambda \}$ and
\[
K(x, I) := \sup \{ \|DZ(t, x')\| : (t, x') \in I \times B(x, \epsilon(x, I)) \} < \infty.
\]
Then by the fundamental theorem of calculus and the triangle inequality,
\[
\|Z(t, x_1) - Z(t, x_0)\| \le \left( \int_0^1 \|D_x Z(t, x_0 + s(x_1 - x_0))\|\,ds \right) \|x_1 - x_0\| \le K(x, I) \|x_1 - x_0\|
\]
for all $x_0, x_1 \in B(x, \epsilon(x, I))$ and $t \in I$.
Footnote 3. Notice that $DF(x, u_0(x))$ is invertible for all $x \in U_0$ since $F|_{U_0 \times V_0}$ has a $C^1$ inverse. Therefore $D_2 f(x, u_0(x))$ is also invertible for all $x \in U_0$.
Theorem 12.28 (Smooth Dependence of ODEs on Initial Conditions). Let $X$ be a Banach space, $U \subset_o X$, $Z \in C(\mathbb{R} \times U, X)$ such that $D_x Z \in C(\mathbb{R} \times U, X)$, and let $\phi : \mathcal{D}(Z) \subset \mathbb{R} \times X \to X$ denote the maximal solution operator to the ordinary differential equation
\[
\dot{y}(t) = Z(t, y(t)) \quad \text{with } y(0) = x \in U, \tag{12.41}
\]
see Notation 11.9 and Theorem 11.15. Then $\phi \in C^1(\mathcal{D}(Z), U)$, $\partial_t D_x \phi(t, x)$ exists and is continuous for $(t, x) \in \mathcal{D}(Z)$, and $D_x \phi(t, x)$ satisfies the linear differential equation
\[
\frac{d}{dt} D_x \phi(t, x) = [(D_x Z)(t, \phi(t, x))] D_x \phi(t, x) \quad \text{with } D_x \phi(0, x) = I_X \tag{12.42}
\]
for $t \in J_x$.
Proof. Let $x_0 \in U$ and $J$ be an open interval such that $0 \in J$ and the closure $\bar{J}$ is a compact subset of $J_{x_0}$, let $y_0 := y(\cdot, x_0)|_J$ and
\[
O_\epsilon := \{ y \in BC(J, U) : \|y - y_0\|_\infty < \epsilon \} \subset_o BC(J, X).
\]
By Lemma 12.27, $Z$ is locally Lipschitz and therefore Theorem 11.15 is applicable. By Eq. (11.23) of Theorem 11.15, there exist $\epsilon > 0$ and $\delta > 0$ such that $G : B(x_0, \delta) \to O_\epsilon$ defined by $G(x) := \phi(\cdot, x)|_J$ is continuous. By Lemma 12.29 below, for $\epsilon > 0$ sufficiently small the function $F : O_\epsilon \to BC(J, X)$ defined by
\[
F(y) := y - \int_0^{\cdot} Z(t, y(t))\,dt \tag{12.43}
\]
is $C^1$ and
\[
DF(y) v = v - \int_0^{\cdot} D_y Z(t, y(t)) v(t)\,dt. \tag{12.44}
\]
By the existence and uniqueness Theorem 10.22 for linear ordinary differential equations, $DF(y)$ is invertible for any $y \in BC(J, U)$. By the definition of $\phi$, $F(G(x)) = h(x)$ for all $x \in B(x_0, \delta)$, where $h : X \to BC(J, X)$ is defined by $h(x)(t) = x$ for all $t \in J$, i.e. $h(x)$ is the constant path at $x$. Since $h$ is a bounded linear map, $h$ is smooth and $Dh(x) = h$ for all $x \in X$. We may now apply the converse to the chain rule in Theorem 12.7 to conclude $G \in C^1(B(x_0, \delta), O_\epsilon)$ and $DG(x) = [DF(G(x))]^{-1} Dh(x)$, or equivalently $DF(G(x)) DG(x) = h$, which in turn is equivalent to
\[
D_x \phi(t, x) - \int_0^t [D_x Z(\tau, \phi(\tau, x))] D_x \phi(\tau, x)\,d\tau = I_X.
\]
As usual this equation implies $D_x \phi(t, x)$ is differentiable in $t$, $D_x \phi(t, x)$ is continuous in $(t, x)$, and $D_x \phi(t, x)$ satisfies Eq. (12.42).
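Lemma 12.29. Continuing the notation used in the proof of Theorem 12.28 (in particular $Y := BC(J, X)$), further let
\[
f(y) := \int_0^{\cdot} Z(\tau, y(\tau))\,d\tau \quad \text{for } y \in O_\epsilon.
\]
Then $f \in C^1(O_\epsilon, Y)$ and for all $y \in O_\epsilon$,
\[
f'(y) h = \int_0^{\cdot} D_x Z(\tau, y(\tau)) h(\tau)\,d\tau =: \Lambda_y h.
\]
Proof. Let $h \in Y$ be sufficiently small and $\tau \in J$. Then by the fundamental theorem of calculus,
\[
Z(\tau, y(\tau) + h(\tau)) - Z(\tau, y(\tau)) = \int_0^1 D_x Z(\tau, y(\tau) + r h(\tau))\, h(\tau)\,dr
\]
and therefore,
\[
\begin{aligned}
(f(y + h) - f(y) - \Lambda_y h)(t)
&= \int_0^t [Z(\tau, y(\tau) + h(\tau)) - Z(\tau, y(\tau)) - D_x Z(\tau, y(\tau)) h(\tau)]\,d\tau \\
&= \int_0^t d\tau \int_0^1 dr\, [D_x Z(\tau, y(\tau) + r h(\tau)) - D_x Z(\tau, y(\tau))] h(\tau).
\end{aligned}
\]
Therefore,
\[
\|f(y + h) - f(y) - \Lambda_y h\|_\infty \le \|h\|_\infty\, \delta(h) \tag{12.45}
\]
where
\[
\delta(h) := \int_J d\tau \int_0^1 dr\, \|D_x Z(\tau, y(\tau) + r h(\tau)) - D_x Z(\tau, y(\tau))\|.
\]
With the aid of Lemma 12.27 and Lemma 11.7,
\[
(r, \tau, h) \in [0, 1] \times J \times Y \to \|D_x Z(\tau, y(\tau) + r h(\tau))\|
\]
is bounded for small $h$ provided $\epsilon > 0$ is sufficiently small. Thus it follows from the dominated convergence theorem that $\delta(h) \to 0$ as $h \to 0$, and hence Eq. (12.45) implies $f'(y)$ exists and is given by $\Lambda_y$. Similarly,
\[
\|f'(y + h) - f'(y)\|_{op} \le \int_J \|D_x Z(\tau, y(\tau) + h(\tau)) - D_x Z(\tau, y(\tau))\|\,d\tau \to 0 \text{ as } h \to 0,
\]
showing $f'$ is continuous.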
Remark 12.30. If $Z \in C^k(U, X)$, then an inductive argument shows that $\phi \in C^k(\mathcal{D}(Z), X)$. For example if $Z \in C^2(U, X)$ then $(y(t), u(t)) := (\phi(t, x), D_x \phi(t, x))$ solves the ODE
\[
\frac{d}{dt} (y(t), u(t)) = \tilde{Z}((y(t), u(t))) \quad \text{with } (y(0), u(0)) = (x, \mathrm{Id}_X),
\]
where $\tilde{Z}$ is the $C^1$ vector field defined by
\[
\tilde{Z}(x, u) = (Z(x), D_x Z(x) u).
\]
Therefore Theorem 12.28 may be applied to this equation to deduce: $D_x^2 \phi(t, x)$ and $D_x^2 \dot{\phi}(t, x)$ exist and are continuous. We may now differentiate Eq. (12.42) to find that $D_x^2 \phi(t, x)$ satisfies the ODE
\[
\frac{d}{dt} D_x^2 \phi(t, x) = \left[ \left( \partial_{D_x \phi(t, x)} D_x Z \right)(t, \phi(t, x)) \right] D_x \phi(t, x) + [(D_x Z)(t, \phi(t, x))] D_x^2 \phi(t, x)
\]
with $D_x^2 \phi(0, x) = 0$.
12.7 Existence of Periodic Solutions
A detailed discussion of the inverse function theorem on Banach and Fréchet spaces may be found in Richard Hamilton's "The Inverse Function Theorem of Nash and Moser." The applications in this section are taken from this paper. In what follows we say $f \in C^k_{2\pi}(\mathbb{R}, (c, d))$ if $f \in C^k(\mathbb{R}, (c, d))$ and $f$ is $2\pi$-periodic, i.e. $f(x + 2\pi) = f(x)$ for all $x \in \mathbb{R}$.
Theorem 12.31 (Taken from Hamilton, p. 110.). Let $p : U := (a, b) \to V := (c, d)$ be a smooth function with $p' > 0$ on $(a, b)$. For every $g \in C^\infty_{2\pi}(\mathbb{R}, (c, d))$ there exists a unique function $y \in C^\infty_{2\pi}(\mathbb{R}, (a, b))$ such that
\[
\dot{y}(t) + p(y(t)) = g(t).
\]
Proof. Let $\tilde{V} := C^0_{2\pi}(\mathbb{R}, (c, d)) \subset_o C^0_{2\pi}(\mathbb{R}, \mathbb{R})$ and $\tilde{U} \subset_o C^1_{2\pi}(\mathbb{R}, (a, b))$ be given by
\[
\tilde{U} := \left\{ y \in C^1_{2\pi}(\mathbb{R}, \mathbb{R}) : a < y(t) < b \ \&\ c < \dot{y}(t) + p(y(t)) < d \ \ \forall\, t \right\}.
\]
The proof will be completed by showing $P : \tilde{U} \to \tilde{V}$ defined by
\[
P(y)(t) = \dot{y}(t) + p(y(t)) \quad \text{for } y \in \tilde{U} \text{ and } t \in \mathbb{R}
\]
is bijective. Note that if $P(y)$ is smooth then so is $y$.
Step 1. The differential of $P$ is given by $P'(y) h = \dot{h} + p'(y) h$, see Exercise 12.8. We will now show that the linear mapping $P'(y)$ is invertible. Indeed let $f = p'(y) > 0$, then the general solution to the equation $\dot{h} + f h = k$ is given by
\[
h(t) = e^{-\int_0^t f(\tau)\,d\tau} h_0 + \int_0^t e^{-\int_\tau^t f(s)\,ds} k(\tau)\,d\tau
\]
where $h_0$ is a constant. We wish to choose $h_0$ so that $h(2\pi) = h_0$, i.e. so that
\[
h_0 \left( 1 - e^{-c(f)} \right) = \int_0^{2\pi} e^{-\int_\tau^{2\pi} f(s)\,ds} k(\tau)\,d\tau
\]
where
\[
c(f) = \int_0^{2\pi} f(\tau)\,d\tau = \int_0^{2\pi} p'(y(\tau))\,d\tau > 0.
\]
The unique solution $h \in C^1_{2\pi}(\mathbb{R}, \mathbb{R})$ to $P'(y) h = k$ is given by
\[
h(t) = \left( 1 - e^{-c(f)} \right)^{-1} e^{-\int_0^t f(s)\,ds} \int_0^{2\pi} e^{-\int_\tau^{2\pi} f(s)\,ds} k(\tau)\,d\tau + \int_0^t e^{-\int_\tau^t f(s)\,ds} k(\tau)\,d\tau.
\]
Therefore $P'(y)$ is invertible for all $y$. Hence by the inverse function Theorem 12.25, $P : \tilde{U} \to \tilde{V}$ is an open mapping which is locally invertible.
Step 2. Let us now prove $P : \tilde{U} \to \tilde{V}$ is injective. For this suppose $y_1, y_2 \in \tilde{U}$ are such that $P(y_1) = g = P(y_2)$ and let $z = y_2 - y_1$. Since
\[
\dot{z}(t) + p(y_2(t)) - p(y_1(t)) = g(t) - g(t) = 0,
\]
if $t_m \in \mathbb{R}$ is a point where $z$ takes on its maximum, then $\dot{z}(t_m) = 0$ and hence
\[
p(y_2(t_m)) - p(y_1(t_m)) = 0.
\]
Since $p$ is increasing this implies $y_2(t_m) = y_1(t_m)$ and hence $z(t_m) = 0$. This shows $z(t) \le 0$ for all $t$, and a similar argument using a minimizer of $z$ shows $z(t) \ge 0$ for all $t$. So we conclude $y_1 = y_2$.
Step 3. Let $W := P(\tilde{U})$; we wish to show $W = \tilde{V}$. By Step 1, we know $W$ is an open subset of $\tilde{V}$, and since $\tilde{V}$ is connected, to finish the proof it suffices to show $W$ is relatively closed in $\tilde{V}$. So suppose $y_j \in \tilde{U}$ are such that $g_j := P(y_j) \to g \in \tilde{V}$. We must now show $g \in W$, i.e. $g = P(y)$ for some $y \in \tilde{U}$. If $t_m$ is a maximizer of $y_j$, then $\dot{y}_j(t_m) = 0$ and hence $g_j(t_m) = p(y_j(t_m)) < d$, and therefore $y_j(t_m) < b$ because $p$ is increasing. A similar argument for the minimizers then allows us to conclude that $\mathrm{Ran}(p \circ y_j) \subset \mathrm{Ran}(g_j)$, which is a compact subset of $(c, d)$, for all $j$. Since $g_j$ is converging uniformly to $g$, there exist $c < \gamma < \delta < d$ such that $\mathrm{Ran}(p \circ y_j) \subset \mathrm{Ran}(g_j) \subset [\gamma, \delta]$ for all $j$. Again since $p' > 0$,
\[
\mathrm{Ran}(y_j) \subset p^{-1}([\gamma, \delta]) = [\alpha, \beta] \subset (a, b) \quad \text{for all } j.
\]
In particular $\sup \{ |\dot{y}_j(t)| : t \in \mathbb{R} \text{ and } j \} < \infty$ since
\[
\dot{y}_j(t) = g_j(t) - p(y_j(t)) \in [\gamma, \delta] - [\gamma, \delta] \tag{12.46}
\]
which is a compact subset of $\mathbb{R}$. The Ascoli-Arzela Theorem (see Theorem 14.29 below) now allows us to assume, by passing to a subsequence if necessary, that $y_j$ is converging uniformly to some $y \in C^0_{2\pi}(\mathbb{R}, [\alpha, \beta])$. It now follows that
\[
\dot{y}_j(t) = g_j(t) - p(y_j(t)) \to g - p(y)
\]
uniformly in $t$. Hence we conclude that $y \in C^1_{2\pi}(\mathbb{R}, \mathbb{R}) \cap C^0_{2\pi}(\mathbb{R}, [\alpha, \beta])$, $\dot{y}_j \to \dot{y}$ and $P(y) = g$. This has proved that $g \in W$ and hence that $W$ is relatively closed in $\tilde{V}$.
12.8 Contraction Mapping Principle
Some of the arguments used in this chapter and in Chapter 11 may be abstracted to a general principle of finding fixed points on a complete metric space. This is the content of this section.
Theorem 12.32 (Contraction Mapping Principle). Suppose that $(X, \rho)$ is a complete metric space and $S : X \to X$ is a contraction, i.e. there exists $\alpha \in (0, 1)$ such that $\rho(S(x), S(y)) \le \alpha \rho(x, y)$ for all $x, y \in X$. Then $S$ has a unique fixed point in $X$, i.e. there exists a unique point $x \in X$ such that $S(x) = x$.
Proof. For uniqueness suppose that $x$ and $x'$ are two fixed points of $S$, then
\[
\rho(x, x') = \rho(S(x), S(x')) \le \alpha \rho(x, x').
\]
Therefore $(1 - \alpha) \rho(x, x') \le 0$, which implies that $\rho(x, x') = 0$ since $1 - \alpha > 0$. Thus $x = x'$. For existence, let $x_0 \in X$ be any point in $X$ and define $x_n \in X$ inductively by $x_{n+1} = S(x_n)$ for $n \ge 0$. We will show that $x := \lim_{n \to \infty} x_n$ exists in $X$, and because $S$ is continuous this will imply
\[
x = \lim_{n \to \infty} x_{n+1} = \lim_{n \to \infty} S(x_n) = S(\lim_{n \to \infty} x_n) = S(x),
\]
showing $x$ is a fixed point of $S$. So to finish the proof, because $X$ is complete, it suffices to show $\{x_n\}_{n=1}^\infty$ is a Cauchy sequence in $X$. An easy inductive computation shows, for $n \ge 1$, that
\[
\rho(x_{n+1}, x_n) = \rho(S(x_n), S(x_{n-1})) \le \alpha \rho(x_n, x_{n-1}) \le \cdots \le \alpha^n \rho(x_1, x_0).
\]
Another inductive argument using the triangle inequality shows, for $m > n$, that
\[
\rho(x_m, x_n) \le \rho(x_m, x_{m-1}) + \rho(x_{m-1}, x_n) \le \cdots \le \sum_{k=n}^{m-1} \rho(x_{k+1}, x_k).
\]
Combining the last two inequalities gives (using again that $\alpha \in (0, 1)$),
\[
\rho(x_m, x_n) \le \sum_{k=n}^{m-1} \alpha^k \rho(x_1, x_0) \le \rho(x_1, x_0)\, \alpha^n \sum_{l=0}^{\infty} \alpha^l = \rho(x_1, x_0) \frac{\alpha^n}{1 - \alpha}.
\]
This last estimate shows that $\rho(x_m, x_n) \to 0$ as $m, n \to \infty$, i.e. $\{x_n\}_{n=0}^\infty$ is a Cauchy sequence.
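The existence argument is constructive, and the final estimate gives an a priori stopping rule: letting $m \to \infty$ shows $\rho(x, x_n) \le \rho(x_1, x_0)\, \alpha^n / (1 - \alpha)$. The following is a minimal sketch of that iteration, assuming the illustrative choice $S = \cos$ on the complete metric space $X = [0, 1]$ (not an example from the text). There $\cos$ maps $[0, 1]$ into itself and $|\cos'| \le \sin 1 < 1$, so $\alpha = \sin 1$ is a valid contraction constant. The name `fixed_point` is hypothetical.

```python
import math

def fixed_point(S, x0, alpha, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = S(x_n) for a contraction S with constant alpha.

    Stops once the a priori bound alpha**n / (1 - alpha) * |x_1 - x_0|
    on the distance from x_n to the fixed point drops below tol.
    Returns the iterate x_n and the number of steps n.
    """
    x = x0
    d0 = abs(S(x0) - x0)  # rho(x_1, x_0)
    for n in range(max_iter):
        if alpha ** n / (1 - alpha) * d0 < tol:
            return x, n
        x = S(x)
    raise RuntimeError("a priori bound did not reach tol within max_iter steps")

alpha = math.sin(1.0)  # sup of |cos'| = |sin| on [0, 1]
x_star, steps = fixed_point(math.cos, 1.0, alpha)
```

Because the stopping rule uses only $\alpha$ and $\rho(x_1, x_0)$, the number of steps is known before iterating; in practice the observed convergence is usually faster than this worst-case bound.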
Corollary 12.33 (Contraction Mapping Principle II). Suppose that $(X, \rho)$ is a complete metric space and $S : X \to X$ is a continuous map such that $S^{(n)}$ is a contraction for some $n \in \mathbb{N}$. Here
\[
S^{(n)} := \underbrace{S \circ S \circ \cdots \circ S}_{n \text{ times}}
\]
and we are assuming there exists $\alpha \in (0, 1)$ such that $\rho(S^{(n)}(x), S^{(n)}(y)) \le \alpha \rho(x, y)$ for all $x, y \in X$. Then $S$ has a unique fixed point in $X$.
Proof. Let $T := S^{(n)}$, then $T : X \to X$ is a contraction and hence $T$ has a unique fixed point $x \in X$. Since any fixed point of $S$ is also a fixed point of $T$, we see that if $S$ has a fixed point then it must be $x$. Now
\[
T(S(x)) = S^{(n)}(S(x)) = S(S^{(n)}(x)) = S(T(x)) = S(x),
\]
which shows that $S(x)$ is also a fixed point of $T$. Since $T$ has only one fixed point, we must have that $S(x) = x$. So we have shown that $x$ is a fixed point of $S$ and this fixed point is unique.
Lemma 12.34. Suppose that $(X, \rho)$ is a complete metric space, $n \in \mathbb{N}$, $Z$ is a topological space, and $\alpha \in (0, 1)$. Suppose for each $z \in Z$ there is a map $S_z : X \to X$ with the following properties:
Contraction property: $\rho(S_z^{(n)}(x), S_z^{(n)}(y)) \le \alpha \rho(x, y)$ for all $x, y \in X$ and $z \in Z$.
Continuity in $z$: For each $x \in X$ the map $z \in Z \to S_z(x) \in X$ is continuous.
By Corollary 12.33 above, for each $z \in Z$ there is a unique fixed point $G(z) \in X$ of $S_z$.
Conclusion: The map $G : Z \to X$ is continuous.
Proof. Let $T_z := S_z^{(n)}$. If $z, w \in Z$, then
\[
\begin{aligned}
\rho(G(z), G(w)) &= \rho(T_z(G(z)), T_w(G(w))) \\
&\le \rho(T_z(G(z)), T_w(G(z))) + \rho(T_w(G(z)), T_w(G(w))) \\
&\le \rho(T_z(G(z)), T_w(G(z))) + \alpha \rho(G(z), G(w)).
\end{aligned}
\]
Solving this inequality for $\rho(G(z), G(w))$ gives
\[
\rho(G(z), G(w)) \le \frac{1}{1 - \alpha}\, \rho(T_z(G(z)), T_w(G(z))).
\]
Since $w \to T_w(G(z))$ is continuous, it follows from the above inequality that $G(w) \to G(z)$ as $w \to z$, i.e. $G$ is continuous.
12.9 Exercises
Exercise 12.3. Suppose that $A : \mathbb{R} \to L(X)$ is a continuous function and $V : \mathbb{R} \to L(X)$ is the unique solution to the linear differential equation
\[
\dot{V}(t) = A(t) V(t) \quad \text{with } V(0) = I. \tag{12.47}
\]
Assuming that $V(t)$ is invertible for all $t \in \mathbb{R}$, show that $V^{-1}(t) := [V(t)]^{-1}$ must solve the differential equation
\[
\frac{d}{dt} V^{-1}(t) = -V^{-1}(t) A(t) \quad \text{with } V^{-1}(0) = I. \tag{12.48}
\]
See Exercise 10.12 as well.
Exercise 12.4 (Differential Equations with Parameters). Let $W$ be another Banach space, $U \times V \subset_o X \times W$ and $Z \in C^1(U \times V, X)$. For each $(x, w) \in U \times V$, let $t \in J_{x,w} \to \phi(t, x, w)$ denote the maximal solution to the ODE
\[
\dot{y}(t) = Z(y(t), w) \quad \text{with } y(0) = x \tag{12.49}
\]
and
\[
\mathcal{D} := \{ (t, x, w) \in \mathbb{R} \times U \times V : t \in J_{x,w} \}
\]
as in Exercise 11.8.
1. Prove that $\phi$ is $C^1$ and that $D_w \phi(t, x, w)$ solves the differential equation:
\[
\frac{d}{dt} D_w \phi(t, x, w) = (D_x Z)(\phi(t, x, w), w)\, D_w \phi(t, x, w) + (D_w Z)(\phi(t, x, w), w)
\]
with $D_w \phi(0, x, w) = 0 \in L(W, X)$. Hint: See the hint for Exercise 11.8 with the reference to Theorem 11.15 being replaced by Theorem 12.28.
2. Also show with the aid of Duhamel's principle (Exercise 10.22) and Theorem 12.28 that
\[
D_w \phi(t, x, w) = D_x \phi(t, x, w) \int_0^t D_x \phi(\tau, x, w)^{-1} (D_w Z)(\phi(\tau, x, w), w)\,d\tau.
\]
Exercise 12.5. (Differential of $e^A$) Let $f : L(X) \to GL(X)$ be the exponential function $f(A) = e^A$. Prove that $f$ is differentiable and that
\[
Df(A) B = \int_0^1 e^{(1 - t) A} B e^{t A}\,dt. \tag{12.50}
\]
Hint: Let $B \in L(X)$ and define $w(t, s) = e^{t(A + sB)}$ for all $t, s \in \mathbb{R}$. Notice that
\[
dw(t, s)/dt = (A + sB) w(t, s) \quad \text{with } w(0, s) = I \in L(X). \tag{12.51}
\]
Use Exercise 12.4 to conclude that $w$ is $C^1$ and that $w'(t, 0) := dw(t, s)/ds|_{s=0}$ satisfies the differential equation
\[
\frac{d}{dt} w'(t, 0) = A w'(t, 0) + B e^{tA} \quad \text{with } w'(0, 0) = 0 \in L(X). \tag{12.52}
\]
Solve this equation by Duhamel's principle (Exercise 10.22) and then apply Proposition 12.14 to conclude that $f$ is differentiable with differential given by Eq. (12.50).
Exercise 12.6 (Local ODE Existence). Let $S_x$ be defined as in Eq. (11.15) from the proof of Theorem 11.4. Verify that $S_x$ satisfies the hypothesis of Corollary 12.33. In particular we could have used Corollary 12.33 to prove Theorem 11.4.
Exercise 12.7 (Local ODE Existence Again). Let $J = (-1, 1)$, $Z \in C^1(X, X)$, $Y := BC(J, X)$ and for $y \in Y$ and $s \in J$ let $y_s \in Y$ be defined by $y_s(t) := y(st)$. Use the following outline to prove the ODE
\[
\dot{y}(t) = Z(y(t)) \quad \text{with } y(0) = x \tag{12.53}
\]
has a unique solution for small $t$ and this solution is $C^1$ in $x$.
1. If $y$ solves Eq. (12.53) then $y_s$ solves
\[
\dot{y}_s(t) = s Z(y_s(t)) \quad \text{with } y_s(0) = x
\]
or equivalently
\[
y_s(t) = x + s \int_0^t Z(y_s(\tau))\,d\tau. \tag{12.54}
\]
Notice that when $s = 0$, the unique solution to this equation is $y_0(t) = x$.
2. Let $F : J \times Y \to J \times Y$ be defined by
\[
F(s, y) := \left( s,\ t \to y(t) - s \int_0^t Z(y(\tau))\,d\tau \right).
\]
Show the differential of $F$ is given by
\[
F'(s, y)(a, v) = \left( a,\ t \to v(t) - s \int_0^t Z'(y(\tau)) v(\tau)\,d\tau - a \int_0^t Z(y(\tau))\,d\tau \right).
\]
3. Verify $F'(0, y) : \mathbb{R} \times Y \to \mathbb{R} \times Y$ is invertible for all $y \in Y$ and notice that $F(0, y) = (0, y)$.
4. For $x \in X$, let $C_x \in Y$ be the constant path at $x$, i.e. $C_x(t) = x$ for all $t \in J$. Use the inverse function Theorem 12.25 to conclude there exists $\epsilon > 0$ and a $C^1$ map $\phi : (-\epsilon, \epsilon) \times B(x_0, \epsilon) \to Y$ such that
\[
F(s, \phi(s, x)) = (s, C_x) \quad \text{for all } (s, x) \in (-\epsilon, \epsilon) \times B(x_0, \epsilon).
\]
5. Show, for $|s| < \epsilon$, that $y_s(t) := \phi(s, x)(t)$ satisfies Eq. (12.54). Now define $y(t, x) = \phi(\epsilon/2, x)(2t/\epsilon)$ and show $y(t, x)$ solves Eq. (12.53) for $|t| < \epsilon/2$ and $x \in B(x_0, \epsilon)$.
Exercise 12.8. Show $P$ defined in Theorem 12.31 is continuously differentiable and $P'(y) h = \dot{h} + p'(y) h$.
Exercise 12.9. Embedded sub-manifold problems.
Exercise 12.10. Lagrange Multiplier problems.
12.9.1 Alternate construction of g. To be made into an exercise.
Suppose $U \subset_o X$ and $f : U \to Y$ is a $C^2$ function. Then we are looking for a function $g(y)$ such that $f(g(y)) = y$. Fix an $x_0 \in U$ and $y_0 = f(x_0) \in Y$. Suppose such a $g$ exists and let $x(t) = g(y_0 + th)$ for some $h \in Y$. Then differentiating $f(x(t)) = y_0 + th$ implies
\[
\frac{d}{dt} f(x(t)) = f'(x(t)) \dot{x}(t) = h,
\]
i.e.
\[
\dot{x}(t) = [f'(x(t))]^{-1} h = Z(h, x(t)) \quad \text{with } x(0) = x_0 \tag{12.55}
\]
where $Z(h, x) = [f'(x)]^{-1} h$. Conversely if $x$ solves Eq. (12.55) we have $\frac{d}{dt} f(x(t)) = h$ and hence that
\[
f(x(1)) = y_0 + h.
\]
Thus if we define
\[
g(y_0 + h) := e^{Z(h, \cdot)}(x_0),
\]
then $f(g(y_0 + h)) = y_0 + h$ for all $h$ sufficiently small. This shows $f$ is an open mapping.
Part IV
Topological Spaces
13
Topological Space Basics
Using the metric space results above as motivation we will axiomatize the notion of being an open set to more general settings.
Definition 13.1. A collection of subsets $\tau$ of $X$ is a topology if
1. $\emptyset, X \in \tau$.
2. $\tau$ is closed under arbitrary unions, i.e. if $V_\alpha \in \tau$ for $\alpha \in I$ then $\bigcup_{\alpha \in I} V_\alpha \in \tau$.
3. $\tau$ is closed under finite intersections, i.e. if $V_1, \ldots, V_n \in \tau$ then $V_1 \cap \cdots \cap V_n \in \tau$.
A pair $(X, \tau)$ where $\tau$ is a topology on $X$ will be called a topological space.
Notation 13.2 Let $(X, \tau)$ be a topological space.
1. The elements, $V \in \tau$, are called open sets. We will often write $V \subset_o X$ to indicate $V$ is an open subset of $X$.
2. A subset $F \subset X$ is closed if $F^c$ is open and we will write $F \sqsubset X$ if $F$ is a closed subset of $X$.
3. An open neighborhood of a point $x \in X$ is an open set $V \subset X$ such that $x \in V$. Let $\tau_x = \{ V \in \tau : x \in V \}$ denote the collection of open neighborhoods of $x$.
4. A subset $W \subset X$ is a neighborhood of $x$ if there exists $V \in \tau_x$ such that $V \subset W$.
5. A collection $\eta \subset \tau_x$ is called a neighborhood base at $x \in X$ if for all $V \in \tau_x$ there exists $W \in \eta$ such that $W \subset V$.
The notation $\tau_x$ should not be confused with
\[
\tau_{\{x\}} := i_{\{x\}}^{-1}(\tau) = \{ \{x\} \cap V : V \in \tau \} = \{ \emptyset, \{x\} \}.
\]
Example 13.3.
1. Let $(X, d)$ be a metric space; we write $\tau_d$ for the collection of $d$-open sets in $X$. We have already seen that $\tau_d$ is a topology, see Exercise 6.2. The collection of sets $\eta = \{ B_x(\epsilon) : \epsilon \in \mathbb{D} \}$, where $\mathbb{D}$ is any dense subset of $(0, 1]$, is a neighborhood base at $x$.
2. Let $X$ be any set, then $\tau = 2^X$ is the discrete topology on $X$. In this topology all subsets of $X$ are both open and closed. At the opposite extreme we have the trivial topology, $\tau = \{ \emptyset, X \}$. In this topology only the empty set and $X$ are open (closed).
3. Let $X = \{1, 2, 3\}$, then $\tau = \{ \emptyset, X, \{2, 3\} \}$ is a topology on $X$ which does not come from a metric.
4. Again let $X = \{1, 2, 3\}$. Then $\tau = \{ \{1\}, \{2, 3\}, \emptyset, X \}$ is a topology, and the sets $X$, $\{1\}$, $\{2, 3\}$, $\emptyset$ are open and closed. The sets $\{1, 2\}$ and $\{1, 3\}$ are neither open nor closed.
Fig. 13.1. A topology.
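On a finite set the axioms of Definition 13.1 can be checked mechanically, which makes items 3 and 4 of the example easy to confirm. The following is a minimal sketch; the helper names `is_topology` and `closed_sets` are illustrative and not part of the text.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the topology axioms for a finite family tau of subsets of a finite set X."""
    X = frozenset(X)
    tau = {frozenset(V) for V in tau}
    if frozenset() not in tau or X not in tau:
        return False
    # For a finite family, closure under pairwise unions and intersections
    # implies closure under arbitrary unions and finite intersections.
    return all((V | W) in tau and (V & W) in tau for V, W in combinations(tau, 2))

def closed_sets(X, tau):
    """Complements of the open sets."""
    X = frozenset(X)
    return {X - frozenset(V) for V in tau}

X = {1, 2, 3}
tau3 = [set(), X, {2, 3}]        # item 3 of the example
tau4 = [{1}, {2, 3}, set(), X]   # item 4 of the example
not_tau = [set(), X, {1}, {2}]   # fails: {1} | {2} = {1, 2} is missing

open4 = {frozenset(V) for V in tau4}
closed4 = closed_sets(X, tau4)
```

In `tau4` the open and closed families coincide, matching the claim that every open set there is also closed, while `{1, 2}` belongs to neither family.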
Definition 13.4. Let $(X, \tau_X)$ and $(Y, \tau_Y)$ be topological spaces. A function $f : X \to Y$ is continuous if
\[
f^{-1}(\tau_Y) := \left\{ f^{-1}(V) : V \in \tau_Y \right\} \subset \tau_X.
\]
We will also say that $f$ is $\tau_X / \tau_Y$-continuous or $(\tau_X, \tau_Y)$-continuous. Let $C(X, Y)$ denote the set of continuous functions from $X$ to $Y$.
Exercise 13.1. Show $f : X \to Y$ is continuous iff $f^{-1}(C)$ is closed in $X$ for all closed subsets $C$ of $Y$.
Definition 13.5. A map $f : X \to Y$ between topological spaces is called a homeomorphism provided that $f$ is bijective, $f$ is continuous and $f^{-1} : Y \to X$ is continuous. If there exists $f : X \to Y$ which is a homeomorphism, we say that $X$ and $Y$ are homeomorphic. (As topological spaces $X$ and $Y$ are essentially the same.)
13.1 Constructing Topologies and Checking Continuity
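Proposition 13.6. Let $\mathcal{E}$ be any collection of subsets of $X$. Then there exists a unique smallest topology $\tau(\mathcal{E})$ which contains $\mathcal{E}$.
Proof. Since $2^X$ is a topology and $\mathcal{E} \subset 2^X$, $\mathcal{E}$ is always a subset of a topology. It is now easily seen that
\[
\tau(\mathcal{E}) := \bigcap \{ \tau : \tau \text{ is a topology and } \mathcal{E} \subset \tau \}
\]
is a topology which is clearly the smallest possible topology containing $\mathcal{E}$.
The following proposition gives an explicit description of $\tau(\mathcal{E})$.
Proposition 13.7. Let $X$ be a set and $\mathcal{E} \subset 2^X$. For simplicity of notation, assume that $X, \emptyset \in \mathcal{E}$. (If this is not the case simply replace $\mathcal{E}$ by $\mathcal{E} \cup \{X, \emptyset\}$.) Then
\[
\tau(\mathcal{E}) := \{ \text{arbitrary unions of finite intersections of elements from } \mathcal{E} \}. \tag{13.1}
\]
Proof. Let $\tau$ be given as in the right side of Eq. (13.1). From the definition of a topology any topology containing $\mathcal{E}$ must contain $\tau$ and hence $\mathcal{E} \subset \tau \subset \tau(\mathcal{E})$. The proof will be completed by showing $\tau$ is a topology. The validation of $\tau$ being a topology is routine except for showing that $\tau$ is closed under taking finite intersections. Let $V, W \in \tau$, which by definition may be expressed as
\[
V = \bigcup_{\alpha \in A} V_\alpha \quad \text{and} \quad W = \bigcup_{\beta \in B} W_\beta,
\]
where $V_\alpha$ and $W_\beta$ are sets which are finite intersections of elements from $\mathcal{E}$. Then
\[
V \cap W = \left( \bigcup_{\alpha \in A} V_\alpha \right) \cap \left( \bigcup_{\beta \in B} W_\beta \right) = \bigcup_{(\alpha, \beta) \in A \times B} V_\alpha \cap W_\beta.
\]
Since for each $(\alpha, \beta) \in A \times B$, $V_\alpha \cap W_\beta$ is still a finite intersection of elements from $\mathcal{E}$, $V \cap W \in \tau$, showing $\tau$ is closed under taking finite intersections.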
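For a finite set, the description in Eq. (13.1) translates directly into a two-stage closure: first close the generating family under intersections, then close the result under unions. The sketch below assumes a finite ground set; the name `generate_topology` and the sample family `E` are illustrative choices, not from the text.

```python
from itertools import combinations

def generate_topology(X, E):
    """Return the topology generated by E on a finite set X:
    all unions of finite intersections of elements of E,
    with X and the empty set adjoined first.
    """
    X = frozenset(X)
    base = {frozenset(S) for S in E} | {X, frozenset()}

    # Stage 1: close under finite intersections.
    changed = True
    while changed:
        changed = False
        for V, W in combinations(list(base), 2):
            if (V & W) not in base:
                base.add(V & W)
                changed = True

    # Stage 2: close under unions (finitely many suffice on a finite set).
    tau = set(base)
    changed = True
    while changed:
        changed = False
        for V, W in combinations(list(tau), 2):
            if (V | W) not in tau:
                tau.add(V | W)
                changed = True
    return tau

X = {1, 2, 3}
E = [{1, 2}, {2, 3}]
tau = generate_topology(X, E)
expected = {frozenset(), frozenset({2}), frozenset({1, 2}),
            frozenset({2, 3}), frozenset({1, 2, 3})}
```

Here the only new set produced by stage 1 is $\{1, 2\} \cap \{2, 3\} = \{2\}$, and stage 2 adds nothing further, so the generated topology has five elements.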
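Definition 13.8. Let $(X, \tau)$ be a topological space. We say that $\mathcal{S} \subset \tau$ is a sub-base for the topology $\tau$ iff $\tau = \tau(\mathcal{S})$ and $X = \bigcup \mathcal{S} := \bigcup_{V \in \mathcal{S}} V$. We say $\mathcal{V} \subset \tau$ is a base for the topology $\tau$ iff $\mathcal{V}$ is a sub-base with the property that every element $V \in \tau$ may be written as
\[
V = \bigcup \{ B \in \mathcal{V} : B \subset V \}.
\]
Exercise 13.2. Suppose that $\mathcal{S}$ is a sub-base for a topology $\tau$ on a set $X$.
1. Show $\mathcal{V} := \mathcal{S}_f$ ($\mathcal{S}_f$ is the collection of finite intersections of elements from $\mathcal{S}$) is a base for $\tau$.
2. Show $\mathcal{S}$ is itself a base for $\tau$ iff
\[
V_1 \cap V_2 = \bigcup \{ S \in \mathcal{S} : S \subset V_1 \cap V_2 \}
\]
for every pair of sets $V_1, V_2 \in \mathcal{S}$.
Fig. 13.2. Fitting balls in the intersection.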
Remark 13.9. Let $(X, d)$ be a metric space, then $\mathcal{E} = \{B_x(\delta) : x \in X \text{ and } \delta > 0\}$ is a base for $\tau_d$, the topology associated to the metric $d$. This is the content of Exercise 6.3.
Let us check directly that $\mathcal{E}$ is a base for a topology. Suppose that $x, y \in X$ and $\varepsilon, \delta > 0$. If $z \in B(x, \delta) \cap B(y, \varepsilon)$, then
\[ B(z, \alpha) \subset B(x, \delta) \cap B(y, \varepsilon) \tag{13.2} \]
where $\alpha = \min\{\delta - d(x, z),\ \varepsilon - d(y, z)\}$, see Figure 13.2. This is a formal consequence of the triangle inequality. For example let us show that $B(z, \alpha) \subset B(x, \delta)$. By the definition of $\alpha$, we have that $\alpha \le \delta - d(x, z)$ or that $d(x, z) \le \delta - \alpha$. Hence if $w \in B(z, \alpha)$, then
\[ d(x, w) \le d(x, z) + d(z, w) \le \delta - \alpha + d(z, w) < \delta - \alpha + \alpha = \delta, \]
which shows that $w \in B(x, \delta)$. Similarly we show that $w \in B(y, \varepsilon)$ as well.
Owing to Exercise 13.2, this shows $\mathcal{E}$ is a base for a topology. We do not need to use Exercise 13.2 here since in fact Equation (13.2) may be generalized to finite intersections of balls. Namely if $x_i \in X$, $\delta_i > 0$ and $z \in \bigcap_{i=1}^n B(x_i, \delta_i)$, then
\[ B(z, \alpha) \subset \bigcap_{i=1}^n B(x_i, \delta_i) \tag{13.3} \]
where now $\alpha := \min\{\delta_i - d(x_i, z) : i = 1, 2, \dots, n\}$. By Eq. (13.3) it follows that any finite intersection of open balls may be written as a union of open balls.
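The radius used in Eq. (13.2) can be sanity-checked numerically. The sketch below assumes the Euclidean metric on the plane (the remark itself holds in any metric space) and samples points of the small ball around z; `dist` and `check_ball_fit` are hypothetical helper names.

```python
import math
import random

def dist(p, q):
    """Euclidean distance in the plane (an assumed concrete metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def check_ball_fit(x, delta, y, eps, z, trials=2000, seed=0):
    """Sample w in B(z, alpha) and confirm that w lies in B(x, delta) and B(y, eps)."""
    assert dist(x, z) < delta and dist(y, z) < eps, "z must lie in both balls"
    # Radius from the remark: alpha = min(delta - d(x, z), eps - d(y, z)).
    alpha = min(delta - dist(x, z), eps - dist(y, z))
    rng = random.Random(seed)
    for _ in range(trials):
        r = 0.999 * alpha * rng.random()   # stay strictly inside B(z, alpha)
        t = 2 * math.pi * rng.random()
        w = (z[0] + r * math.cos(t), z[1] + r * math.sin(t))
        if not (dist(x, w) < delta and dist(y, w) < eps):
            return False
    return True

ok = check_ball_fit((0.0, 0.0), 1.0, (1.0, 0.0), 1.0, (0.5, 0.2))
```

A sampled check of this kind is of course no proof; the triangle-inequality argument in the remark is what guarantees the inclusion.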
Exercise 13.3. Suppose $f : X \to Y$ is a function and $\tau_X$ and $\tau_Y$ are topologies on $X$ and $Y$ respectively. Show
\[ f^{-1}\tau_Y := \{f^{-1}(V) \subset X : V \in \tau_Y\} \quad\text{and}\quad f_*\tau_X := \{V \subset Y : f^{-1}(V) \in \tau_X\} \]
(as in Notation 2.7) are also topologies on $X$ and $Y$ respectively.
Remark 13.10. Let $f : X \to Y$ be a function. Given a topology $\tau_Y \subset 2^Y$, the topology $\tau_X := f^{-1}(\tau_Y)$ is the smallest topology on $X$ such that $f$ is $(\tau_X, \tau_Y)$-continuous. Similarly, if $\tau_X$ is a topology on $X$ then $\tau_Y = f_*\tau_X$ is the largest topology on $Y$ such that $f$ is $(\tau_X, \tau_Y)$-continuous.
Definition 13.11. Let $(X, \tau)$ be a topological space and $A$ a subset of $X$. The relative topology or induced topology on $A$ is the collection of sets
\[ \tau_A = i_A^{-1}(\tau) = \{A \cap V : V \in \tau\}, \]
where $i_A : A \to X$ is the inclusion map as in Definition 2.8.
Lemma 13.12. The relative topology, $\tau_A$, is a topology on $A$. Moreover a subset $B \subset A$ is $\tau_A$-closed iff there is a closed subset, $C$, of $X$ such that $B = C \cap A$.
Proof. The first assertion is a consequence of Exercise 13.3. For the second, $B \subset A$ is $\tau_A$-closed iff $A \setminus B = A \cap V$ for some $V \in \tau$, which is equivalent to $B = A \setminus (A \cap V) = A \cap V^c$ for some $V \in \tau$.
Exercise 13.4. Show if $(X, d)$ is a metric space and $\tau = \tau_d$ is the topology coming from $d$, then $(\tau_d)_A$ is the topology induced by making $A$ into a metric space using the metric $d|_{A \times A}$.
Lemma 13.13. Suppose that $(X, \tau_X)$, $(Y, \tau_Y)$ and $(Z, \tau_Z)$ are topological spaces. If $f : (X, \tau_X) \to (Y, \tau_Y)$ and $g : (Y, \tau_Y) \to (Z, \tau_Z)$ are continuous functions then $g \circ f : (X, \tau_X) \to (Z, \tau_Z)$ is continuous as well.
Proof. This is easy since by assumption $g^{-1}(\tau_Z) \subset \tau_Y$ and $f^{-1}(\tau_Y) \subset \tau_X$ so that
\[ (g \circ f)^{-1}(\tau_Z) = f^{-1}\big(g^{-1}(\tau_Z)\big) \subset f^{-1}(\tau_Y) \subset \tau_X. \]
The following elementary lemma turns out to be extremely useful because it may be used to greatly simplify the verification that a given function is continuous.
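Lemma 13.14. Suppose that $f : X \to Y$ is a function, $\mathcal{E} \subset 2^Y$ and $A \subset Y$, then
\[ \tau\big(f^{-1}(\mathcal{E})\big) = f^{-1}(\tau(\mathcal{E})) \quad\text{and} \tag{13.4} \]
\[ \tau(\mathcal{E}_A) = (\tau(\mathcal{E}))_A. \tag{13.5} \]
Moreover, if $\tau_Y = \tau(\mathcal{E})$ and $\tau_X$ is a topology on $X$, then $f$ is $(\tau_X, \tau_Y)$-continuous iff $f^{-1}(\mathcal{E}) \subset \tau_X$.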
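Proof. We will give two proofs of Eq. (13.4). The first proof is more constructive than the second, but the second proof will work in the context of $\sigma$-algebras to be developed later.
First Proof. There is no harm (as the reader should verify) in replacing $\mathcal{E}$ by $\mathcal{E} \cup \{\emptyset, Y\}$ if necessary so that we may assume that $\emptyset, Y \in \mathcal{E}$. By Proposition 13.7, the general element $V$ of $\tau(\mathcal{E})$ is an arbitrary union of finite intersections of elements from $\mathcal{E}$. Since $f^{-1}$ preserves all of the set operations, it follows that $f^{-1}\tau(\mathcal{E})$ consists of sets which are arbitrary unions of finite intersections of elements from $f^{-1}\mathcal{E}$, which is precisely $\tau\big(f^{-1}(\mathcal{E})\big)$ by another application of Proposition 13.7.
Second Proof. By Exercise 13.3, $f^{-1}(\tau(\mathcal{E}))$ is a topology and since $\mathcal{E} \subset \tau(\mathcal{E})$, $f^{-1}(\mathcal{E}) \subset f^{-1}(\tau(\mathcal{E}))$. It now follows that $\tau(f^{-1}(\mathcal{E})) \subset f^{-1}(\tau(\mathcal{E}))$. For the reverse inclusion notice that
\[ f_*\tau\big(f^{-1}(\mathcal{E})\big) = \big\{B \subset Y : f^{-1}(B) \in \tau\big(f^{-1}(\mathcal{E})\big)\big\} \]
is a topology which contains $\mathcal{E}$ and thus $\tau(\mathcal{E}) \subset f_*\tau\big(f^{-1}(\mathcal{E})\big)$. Hence if $B \in \tau(\mathcal{E})$ we know that $f^{-1}(B) \in \tau\big(f^{-1}(\mathcal{E})\big)$, i.e. $f^{-1}(\tau(\mathcal{E})) \subset \tau\big(f^{-1}(\mathcal{E})\big)$, and Eq. (13.4) has been proved. Applying Eq. (13.4) with $X = A$ and $f = i_A$ being the inclusion map implies
\[ (\tau(\mathcal{E}))_A = i_A^{-1}(\tau(\mathcal{E})) = \tau(i_A^{-1}(\mathcal{E})) = \tau(\mathcal{E}_A). \]
Lastly if $f^{-1}\mathcal{E} \subset \tau_X$, then $f^{-1}\tau(\mathcal{E}) = \tau\big(f^{-1}\mathcal{E}\big) \subset \tau_X$, which shows $f$ is $(\tau_X, \tau_Y)$-continuous.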
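The practical content of Lemma 13.14 is that continuity only has to be tested on a generating collection. The sketch below illustrates this on small finite spaces; the spaces, the maps `f` and `g`, and the helper names are assumptions made for the example, and the topology `tau_Y` generated by `E` is written out by hand.

```python
def preimage(f, V, X):
    """f^{-1}(V) for a function f given as a dict on the finite set X."""
    return frozenset(x for x in X if f[x] in V)

def continuous_via_generators(f, X, tau_X, E):
    """Criterion of the lemma: only preimages of the generating sets are tested."""
    return all(preimage(f, V, X) in tau_X for V in E)

def continuous_via_all_open_sets(f, X, tau_X, tau_Y):
    """Definition of continuity: preimages of all open sets are tested."""
    return all(preimage(f, V, X) in tau_X for V in tau_Y)

X = frozenset({1, 2, 3})
tau_X = {frozenset(s) for s in (set(), {2}, {1, 2}, {2, 3}, {1, 2, 3})}

Y = frozenset("abc")
E = [frozenset("ab"), frozenset("bc")]                             # generators on Y
tau_Y = {frozenset(s) for s in ("", "b", "ab", "bc", "abc")}       # tau(E), by hand

f = {1: "a", 2: "b", 3: "c"}   # preimages of generators are {1, 2} and {2, 3}
g = {1: "b", 2: "a", 3: "c"}   # preimage of {b, c} is {1, 3}, which is not open
```

For both maps the two tests agree: `f` passes both and `g` fails both.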
Corollary 13.15. If $(X, \tau)$ is a topological space and $f : X \to \mathbb{R}$ is a function then the following are equivalent:
1. $f$ is $(\tau, \tau_{\mathbb{R}})$-continuous,
2. $f^{-1}((a, b)) \in \tau$ for all $-\infty < a < b < \infty$,
3. $f^{-1}((a, \infty)) \in \tau$ and $f^{-1}((-\infty, b)) \in \tau$ for all $a, b \in \mathbb{Q}$.
(We are using $\tau_{\mathbb{R}}$ to denote the standard topology on $\mathbb{R}$ induced by the metric $d(x, y) = |x - y|$.)
Proof. Apply Lemma 13.14 with appropriate choices of $\mathcal{E}$.
Definition 13.16. Let $(X, \tau_X)$ and $(Y, \tau_Y)$ be topological spaces. A function $f : X \to Y$ is continuous at a point $x \in X$ if for every open neighborhood $V$ of $f(x)$ there is an open neighborhood $U$ of $x$ such that $U \subset f^{-1}(V)$. See Figure 13.3.
Exercise 13.5. Show $f : X \to Y$ is continuous (Definition 13.16) iff $f$ is continuous at all points $x \in X$.
Definition 13.17. Given topological spaces $(X, \tau)$ and $(Y, \tau')$ and a subset $A \subset X$. We say a function $f : A \to Y$ is continuous iff $f$ is $\tau_A/\tau'$-continuous.
Fig. 13.3. Checking that a function is continuous at $x \in X$.
Definition 13.18. Let $(X, \tau)$ be a topological space and $A \subset X$. A collection of subsets $\mathcal{U} \subset \tau$ is an open cover of $A$ if $A \subset \bigcup \mathcal{U} := \bigcup_{U \in \mathcal{U}} U$.
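Proposition 13.19 (Localizing Continuity). Let $(X, \tau)$ and $(Y, \tau')$ be topological spaces and $f : X \to Y$ be a function.
1. If $f$ is continuous and $A \subset X$ then $f|_A : A \to Y$ is continuous.
2. Suppose there exists an open cover, $\mathcal{U} \subset \tau$, of $X$ such that $f|_A$ is continuous for all $A \in \mathcal{U}$, then $f$ is continuous.
Proof. 1. If $f : X \to Y$ is continuous, $f^{-1}(V) \in \tau$ for all $V \in \tau'$ and therefore
\[ f|_A^{-1}(V) = A \cap f^{-1}(V) \in \tau_A \quad\text{for all } V \in \tau'. \]
2. Let $V \in \tau'$, then
\[ f^{-1}(V) = \bigcup_{A \in \mathcal{U}} \big(f^{-1}(V) \cap A\big) = \bigcup_{A \in \mathcal{U}} f|_A^{-1}(V). \tag{13.6} \]
Since each $A \in \mathcal{U}$ is open, $\tau_A \subset \tau$ and by assumption, $f|_A^{-1}(V) \in \tau_A \subset \tau$. Hence Eq. (13.6) shows $f^{-1}(V)$ is a union of $\tau$-open sets and hence is also $\tau$-open.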
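Exercise 13.6 (A Baby Extension Theorem). Suppose $V \in \tau$ and $f : V \to \mathbb{C}$ is a continuous function. Further assume there is a closed subset $C$ such that $\{x \in V : f(x) \ne 0\} \subset C \subset V$, then $F : X \to \mathbb{C}$ defined by
\[ F(x) = \begin{cases} f(x) & \text{if } x \in V \\ 0 & \text{if } x \notin V \end{cases} \]
is continuous.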
Exercise 13.7 (Building Continuous Functions). Prove the following variant of item 2. of Proposition 13.19. Namely, suppose there exists a finite collection $\mathcal{F}$ of closed subsets of $X$ such that $X = \bigcup_{A \in \mathcal{F}} A$ and $f|_A$ is continuous for all $A \in \mathcal{F}$, then $f$ is continuous. Give an example showing that the assumption that $\mathcal{F}$ is finite cannot be eliminated. Hint: consider $f^{-1}(C)$ where $C$ is a closed subset of $Y$.
13.2 Product Spaces I
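Definition 13.20. Let $X$ be a set and suppose there is a collection of topological spaces $\{(Y_\alpha, \tau_\alpha) : \alpha \in A\}$ and functions $f_\alpha : X \to Y_\alpha$ for all $\alpha \in A$. Let $\tau(f_\alpha : \alpha \in A)$ denote the smallest topology on $X$ such that each $f_\alpha$ is continuous, i.e.
\[ \tau(f_\alpha : \alpha \in A) = \tau\Big(\bigcup_{\alpha} f_\alpha^{-1}(\tau_\alpha)\Big). \]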
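Proposition 13.21 (Topologies Generated by Functions). Assuming the notation in Definition 13.20 and additionally let $(Z, \tau_Z)$ be a topological space and $g : Z \to X$ be a function. Then $g$ is $(\tau_Z, \tau(f_\alpha : \alpha \in A))$-continuous iff $f_\alpha \circ g$ is $(\tau_Z, \tau_\alpha)$-continuous for all $\alpha \in A$.
Proof. ($\Rightarrow$) If $g$ is $(\tau_Z, \tau(f_\alpha : \alpha \in A))$-continuous, then the composition $f_\alpha \circ g$ is $(\tau_Z, \tau_\alpha)$-continuous by Lemma 13.13. ($\Leftarrow$) Let
\[ \tau_X = \tau(f_\alpha : \alpha \in A) = \tau\Big(\bigcup_{\alpha \in A} f_\alpha^{-1}(\tau_\alpha)\Big). \]
If $f_\alpha \circ g$ is $(\tau_Z, \tau_\alpha)$-continuous for all $\alpha$, then
\[ g^{-1} f_\alpha^{-1}(\tau_\alpha) \subset \tau_Z \quad \forall\, \alpha \in A \]
and therefore
\[ g^{-1}\Big(\bigcup_{\alpha \in A} f_\alpha^{-1}(\tau_\alpha)\Big) = \bigcup_{\alpha \in A} g^{-1} f_\alpha^{-1}(\tau_\alpha) \subset \tau_Z. \]
Hence
\[ g^{-1}(\tau_X) = g^{-1}\Big(\tau\Big(\bigcup_{\alpha \in A} f_\alpha^{-1}(\tau_\alpha)\Big)\Big) = \tau\Big(g^{-1}\Big(\bigcup_{\alpha \in A} f_\alpha^{-1}(\tau_\alpha)\Big)\Big) \subset \tau_Z, \]
which shows that $g$ is $(\tau_Z, \tau_X)$-continuous.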
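Let $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ be a collection of topological spaces, $X = X_A = \prod_{\alpha \in A} X_\alpha$ and $\pi_\alpha : X_A \to X_\alpha$ be the canonical projection map as in Notation 2.2.
Definition 13.22. The product topology $\tau = \otimes_{\alpha \in A} \tau_\alpha$ is the smallest topology on $X_A$ such that each projection $\pi_\alpha$ is continuous. Explicitly, $\tau$ is the topology generated by the collection of sets,
\[ \mathcal{E} = \{\pi_\alpha^{-1}(V_\alpha) : \alpha \in A,\ V_\alpha \in \tau_\alpha\} = \bigcup_{\alpha \in A} \pi_\alpha^{-1}\tau_\alpha. \tag{13.7} \]
Applying Proposition 13.21 in this setting implies the following proposition.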
Proposition 13.23. Suppose $Y$ is a topological space and $f : Y \to X_A$ is a map. Then $f$ is continuous iff $\pi_\alpha \circ f : Y \to X_\alpha$ is continuous for all $\alpha \in A$. In particular if $A = \{1, 2, \dots, n\}$ so that $X_A = X_1 \times X_2 \times \dots \times X_n$ and $f(y) = (f_1(y), f_2(y), \dots, f_n(y)) \in X_1 \times X_2 \times \dots \times X_n$, then $f : Y \to X_A$ is continuous iff $f_i : Y \to X_i$ is continuous for all $i$.
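Proposition 13.24. Suppose that $(X, \tau)$ is a topological space and $\{f_n\} \subset X^A$ (see Notation 2.2) is a sequence. Then $f_n \to f$ in the product topology of $X^A$ iff $f_n(\alpha) \to f(\alpha)$ for all $\alpha \in A$.
Proof. Since $\pi_\alpha$ is continuous, if $f_n \to f$ then $f_n(\alpha) = \pi_\alpha(f_n) \to \pi_\alpha(f) = f(\alpha)$ for all $\alpha \in A$. Conversely, $f_n(\alpha) \to f(\alpha)$ for all $\alpha \in A$ iff $\pi_\alpha(f_n) \to \pi_\alpha(f)$ for all $\alpha \in A$. Therefore if $V = \pi_\alpha^{-1}(V_\alpha) \in \mathcal{E}$ (with $\mathcal{E}$ as in Eq. (13.7)) and $f \in V$, then $\pi_\alpha(f) \in V_\alpha$ and $\pi_\alpha(f_n) \in V_\alpha$ for a.a. $n$ and hence $f_n \in V$ for a.a. $n$. The same then holds for any finite intersection of such sets $V$ containing $f$, and every open neighborhood of $f$ contains such a finite intersection. This shows that $f_n \to f$ as $n \to \infty$.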
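Proposition 13.25. Suppose that $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ is a collection of topological spaces and $\otimes_{\alpha \in A} \tau_\alpha$ is the product topology on $X := \prod_{\alpha \in A} X_\alpha$.
1. If $\mathcal{E}_\alpha \subset \tau_\alpha$ generates $\tau_\alpha$ for each $\alpha \in A$, then
\[ \otimes_{\alpha \in A} \tau_\alpha = \tau\Big(\bigcup_{\alpha \in A} \pi_\alpha^{-1}(\mathcal{E}_\alpha)\Big). \tag{13.8} \]
2. If $\mathcal{B}_\alpha \subset \tau_\alpha$ is a base for $\tau_\alpha$ for each $\alpha$, then the collection of sets, $\mathcal{B}$, of the form
\[ V = \bigcap_{\alpha \in \Lambda} \pi_\alpha^{-1} V_\alpha = \prod_{\alpha \in \Lambda} V_\alpha \times \prod_{\alpha \notin \Lambda} X_\alpha =: V_\Lambda \times X_{A \setminus \Lambda}, \tag{13.9} \]
where $\Lambda \subset\subset A$ (i.e. $\Lambda$ is a finite subset of $A$) and $V_\alpha \in \mathcal{B}_\alpha$ for all $\alpha \in \Lambda$, is a base for $\otimes_{\alpha \in A} \tau_\alpha$.
Proof. 1. Since
\[ \bigcup_\alpha \pi_\alpha^{-1}\mathcal{E}_\alpha \subset \bigcup_\alpha \pi_\alpha^{-1}\tau_\alpha = \bigcup_\alpha \pi_\alpha^{-1}(\tau(\mathcal{E}_\alpha)) = \bigcup_\alpha \tau(\pi_\alpha^{-1}\mathcal{E}_\alpha) \subset \tau\Big(\bigcup_\alpha \pi_\alpha^{-1}\mathcal{E}_\alpha\Big), \]
it follows that
\[ \tau\Big(\bigcup_\alpha \pi_\alpha^{-1}\mathcal{E}_\alpha\Big) \subset \otimes_\alpha \tau_\alpha \subset \tau\Big(\bigcup_\alpha \pi_\alpha^{-1}\mathcal{E}_\alpha\Big). \]
2. Now let $\mathcal{U} = \big[\bigcup_\alpha \pi_\alpha^{-1}\tau_\alpha\big]_f$ denote the collection of sets consisting of finite intersections of elements from $\bigcup_\alpha \pi_\alpha^{-1}\tau_\alpha$. Notice that $\mathcal{U}$ may be described as those sets in Eq. (13.9) where $V_\alpha \in \tau_\alpha$ for all $\alpha \in \Lambda$. By Exercise 13.2, $\mathcal{U}$ is a base for the product topology, $\otimes_{\alpha \in A} \tau_\alpha$. Hence for $W \in \otimes_{\alpha \in A} \tau_\alpha$ and $x \in W$, there exists a $V \in \mathcal{U}$ of the form in Eq. (13.9) such that $x \in V \subset W$. Since $\mathcal{B}_\alpha$ is a base for $\tau_\alpha$, there exists $U_\alpha \in \mathcal{B}_\alpha$ such that $x_\alpha \in U_\alpha \subset V_\alpha$ for each $\alpha \in \Lambda$. With this notation, the set $U_\Lambda \times X_{A \setminus \Lambda} \in \mathcal{B}$ and $x \in U_\Lambda \times X_{A \setminus \Lambda} \subset V \subset W$. This shows that every open set in $X$ may be written as a union of elements from $\mathcal{B}$, i.e. $\mathcal{B}$ is a base for the product topology.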
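Notation 13.26 Let $\mathcal{E}_i \subset 2^{X_i}$ be a collection of subsets of a set $X_i$ for each $i = 1, 2, \dots, n$. We will write, by abuse of notation, $\mathcal{E}_1 \times \mathcal{E}_2 \times \dots \times \mathcal{E}_n$ for the collection of subsets of $X_1 \times \dots \times X_n$ of the form $A_1 \times A_2 \times \dots \times A_n$ with $A_i \in \mathcal{E}_i$ for all $i$. That is, we are identifying $(A_1, A_2, \dots, A_n)$ with $A_1 \times A_2 \times \dots \times A_n$.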
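Corollary 13.27. Suppose $A = \{1, 2, \dots, n\}$ so $X = X_1 \times X_2 \times \dots \times X_n$.
1. If $\mathcal{E}_i \subset 2^{X_i}$, $\tau_i = \tau(\mathcal{E}_i)$ and $X_i \in \mathcal{E}_i$ for each $i$, then
\[ \tau_1 \otimes \tau_2 \otimes \dots \otimes \tau_n = \tau(\mathcal{E}_1 \times \mathcal{E}_2 \times \dots \times \mathcal{E}_n) \tag{13.10} \]
and in particular
\[ \tau_1 \otimes \tau_2 \otimes \dots \otimes \tau_n = \tau(\tau_1 \times \dots \times \tau_n). \tag{13.11} \]
2. Furthermore if $\mathcal{B}_i \subset \tau_i$ is a base for the topology $\tau_i$ for each $i$, then $\mathcal{B}_1 \times \dots \times \mathcal{B}_n$ is a base for the product topology, $\tau_1 \otimes \tau_2 \otimes \dots \otimes \tau_n$.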
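Proof. (The proof is a minor variation on the proof of Proposition 13.25.)
1. Let $\big[\bigcup_{i \in A} \pi_i^{-1}(\mathcal{E}_i)\big]_f$ denote the collection of sets which are finite intersections from $\bigcup_{i \in A} \pi_i^{-1}(\mathcal{E}_i)$, then, using $X_i \in \mathcal{E}_i$ for all $i$,
\[ \bigcup_{i \in A} \pi_i^{-1}(\mathcal{E}_i) \subset \mathcal{E}_1 \times \mathcal{E}_2 \times \dots \times \mathcal{E}_n \subset \Big[\bigcup_{i \in A} \pi_i^{-1}(\mathcal{E}_i)\Big]_f. \]
Therefore
\[ \tau = \tau\Big(\bigcup_{i \in A} \pi_i^{-1}(\mathcal{E}_i)\Big) \subset \tau(\mathcal{E}_1 \times \mathcal{E}_2 \times \dots \times \mathcal{E}_n) \subset \tau\Big(\Big[\bigcup_{i \in A} \pi_i^{-1}(\mathcal{E}_i)\Big]_f\Big) = \tau. \]
2. Observe that $\tau_1 \times \dots \times \tau_n$ is closed under finite intersections and generates $\tau_1 \otimes \tau_2 \otimes \dots \otimes \tau_n$, therefore $\tau_1 \times \dots \times \tau_n$ is a base for the product topology. The proof that $\mathcal{B}_1 \times \dots \times \mathcal{B}_n$ is also a base for $\tau_1 \otimes \tau_2 \otimes \dots \otimes \tau_n$ follows the same method used to prove item 2. in Proposition 13.25.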
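Lemma 13.28. Let $(X_i, d_i)$ for $i = 1, \dots, n$ be metric spaces, $X := X_1 \times \dots \times X_n$ and for $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ in $X$ let
\[ d(x, y) = \sum_{i=1}^n d_i(x_i, y_i). \tag{13.12} \]
Then the topology, $\tau_d$, associated to the metric $d$ is the product topology on $X$, i.e.
\[ \tau_d = \tau_{d_1} \otimes \tau_{d_2} \otimes \dots \otimes \tau_{d_n}. \]
Proof. Let $\rho(x, y) = \max\{d_i(x_i, y_i) : i = 1, 2, \dots, n\}$. Then $\rho$ is equivalent to $d$ and hence $\tau_\rho = \tau_d$. Moreover if $\varepsilon > 0$ and $x = (x_1, x_2, \dots, x_n) \in X$, then
\[ B^\rho_x(\varepsilon) = B^{d_1}_{x_1}(\varepsilon) \times \dots \times B^{d_n}_{x_n}(\varepsilon). \]
By Remark 13.9,
\[ \mathcal{E} := \{B^\rho_x(\varepsilon) : x \in X \text{ and } \varepsilon > 0\} \]
is a base for $\tau_\rho$ and by Proposition 13.25 $\mathcal{E}$ is also a base for $\tau_{d_1} \otimes \tau_{d_2} \otimes \dots \otimes \tau_{d_n}$. Therefore,
\[ \tau_{d_1} \otimes \tau_{d_2} \otimes \dots \otimes \tau_{d_n} = \tau(\mathcal{E}) = \tau_\rho = \tau_d. \]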
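The equivalence of the sum metric and the maximum metric used in the proof is stated without computation. Writing $\rho$ for the maximum metric $\max_i d_i(x_i, y_i)$, a short derivation (a sketch using only the definition of $d$ in Eq. (13.12)) shows the constant $c = n$ works in the sense of Definition 13.54:

```latex
\[
\rho(x,y) = \max_{1 \le i \le n} d_i(x_i,y_i)
\;\le\; \sum_{i=1}^{n} d_i(x_i,y_i) = d(x,y)
\;\le\; n \max_{1 \le i \le n} d_i(x_i,y_i) = n\,\rho(x,y).
\]
```

In particular $n^{-1}\rho \le d \le n\rho$, so the two metrics have the same open sets.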
13.3 Closure operations
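Definition 13.29. Let $(X, \tau)$ be a topological space and $A$ be a subset of $X$.
1. The closure of $A$ is the smallest closed set $\bar{A}$ containing $A$, i.e.
\[ \bar{A} := \bigcap \{F : A \subset F \sqsubset X\}, \]
where $F \sqsubset X$ means that $F$ is a closed subset of $X$. (Because of Exercise 6.4 this is consistent with Definition 6.10 for the closure of a set in a metric space.)
2. The interior of $A$ is the largest open set $A^o$ contained in $A$, i.e.
\[ A^o = \bigcup \{V \in \tau : V \subset A\}. \]
(With this notation the definition of a neighborhood of $x \in X$ may be stated as: $A \subset X$ is a neighborhood of a point $x \in X$ if $x \in A^o$.)
3. The accumulation points of $A$ is the set
\[ \operatorname{acc}(A) = \{x \in X : V \cap A \setminus \{x\} \ne \emptyset \text{ for all } V \in \tau_x\}. \]
4. The boundary of $A$ is the set $\operatorname{bd}(A) := \bar{A} \setminus A^o$.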
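On a finite topological space the four operations just defined can be computed by brute force. The sketch below uses a small assumed example (a five-element topology on {1, 2, 3}); the function names are hypothetical and simply mirror the definitions.

```python
def interior(A, tau):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    out = frozenset()
    for V in tau:
        if V <= A:
            out |= V
    return out

def closure(A, X, tau):
    """Intersection of all closed sets (complements of open sets) containing A."""
    X, A = frozenset(X), frozenset(A)
    out = X
    for V in tau:
        F = X - V
        if A <= F:
            out &= F
    return out

def accumulation_points(A, X, tau):
    """Points x such that every open V containing x meets A in a point other than x."""
    A = frozenset(A)
    return frozenset(
        x for x in X
        if all((V & A) - {x} for V in tau if x in V)
    )

def boundary(A, X, tau):
    """Closure minus interior."""
    return closure(A, X, tau) - interior(A, tau)

X = frozenset({1, 2, 3})
tau = [frozenset(s) for s in (set(), {2}, {1, 2}, {2, 3}, {1, 2, 3})]
```

For A = {2} in this space, the closure is all of X, the interior is {2}, and the accumulation points and the boundary are both {1, 3}.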
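Remark 13.30. The relationships between the interior and the closure of a set are:
\[ (A^o)^c = \bigcap \{V^c : V \in \tau \text{ and } V \subset A\} = \bigcap \{C : C \text{ is closed, } C \supset A^c\} = \overline{A^c} \]
and similarly, $(\bar{A})^c = (A^c)^o$. Hence the boundary of $A$ may be written as
\[ \operatorname{bd}(A) := \bar{A} \setminus A^o = \bar{A} \cap (A^o)^c = \bar{A} \cap \overline{A^c}, \tag{13.13} \]
which is to say $\operatorname{bd}(A)$ consists of the points in both the closure of $A$ and the closure of $A^c$.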
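Proposition 13.31. Let $A \subset X$ and $x \in X$.
1. If $V \subset_o X$ and $A \cap V = \emptyset$ then $\bar{A} \cap V = \emptyset$.
2. $x \in \bar{A}$ iff $V \cap A \ne \emptyset$ for all $V \in \tau_x$.
3. $x \in \operatorname{bd}(A)$ iff $V \cap A \ne \emptyset$ and $V \cap A^c \ne \emptyset$ for all $V \in \tau_x$.
4. $\bar{A} = A \cup \operatorname{acc}(A)$.
Proof. 1. Since $A \cap V = \emptyset$, $A \subset V^c$ and since $V^c$ is closed, $\bar{A} \subset V^c$. That is to say $\bar{A} \cap V = \emptyset$.
2. By Remark 13.30 [1], $\bar{A} = ((A^c)^o)^c$ so $x \in \bar{A}$ iff $x \notin (A^c)^o$, which happens iff $V \not\subset A^c$ for all $V \in \tau_x$, i.e. iff $V \cap A \ne \emptyset$ for all $V \in \tau_x$.
3. This assertion easily follows from item 2. and Eq. (13.13).
4. Item 4. is an easy consequence of the definition of $\operatorname{acc}(A)$ and item 2.
[1] Here is another direct proof of item 2. which goes by showing $x \notin \bar{A}$ iff there exists $V \in \tau_x$ such that $V \cap A = \emptyset$. If $x \notin \bar{A}$ then $V = (\bar{A})^c \in \tau_x$ and $V \cap A \subset V \cap \bar{A} = \emptyset$. Conversely if there exists $V \in \tau_x$ such that $A \cap V = \emptyset$ then by item 1. $\bar{A} \cap V = \emptyset$.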
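Lemma 13.32. Let $A \subset Y \subset X$, $\bar{A}^Y$ denote the closure of $A$ in $Y$ with its relative topology and $\bar{A} = \bar{A}^X$ be the closure of $A$ in $X$, then $\bar{A}^Y = \bar{A}^X \cap Y$.
Proof. Using Lemma 13.12,
\[ \bar{A}^Y = \bigcap \{B \sqsubset Y : A \subset B\} = \bigcap \{C \cap Y : A \subset C \sqsubset X\} = Y \cap \Big(\bigcap \{C : A \subset C \sqsubset X\}\Big) = Y \cap \bar{A}^X. \]
Alternative proof. Let $x \in Y$, then $x \in \bar{A}^Y$ iff $V \cap A \ne \emptyset$ for all $V \in \tau_Y$ such that $x \in V$. This happens iff for all $U \in \tau_x$, $U \cap Y \cap A = U \cap A \ne \emptyset$, which happens iff $x \in \bar{A}^X$. That is to say $\bar{A}^Y = \bar{A}^X \cap Y$.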
The support of a function may now be defined as in Definition 10.26 above.
Definition 13.33 (Support). Let $f : X \to Y$ be a function from a topological space $(X, \tau_X)$ to a vector space $Y$. Then we define the support of $f$ by
\[ \operatorname{supp}(f) := \overline{\{x \in X : f(x) \ne 0\}}, \]
a closed subset of $X$.
The next result is included for completeness but will not be used in the sequel so may be omitted.
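Lemma 13.34. Suppose that $f : X \to Y$ is a map between topological spaces. Then the following are equivalent:
1. $f$ is continuous.
2. $f(\bar{A}) \subset \overline{f(A)}$ for all $A \subset X$.
3. $\overline{f^{-1}(B)} \subset f^{-1}(\bar{B})$ for all $B \subset Y$.
Proof. If $f$ is continuous, then $f^{-1}\big(\overline{f(A)}\big)$ is closed and since $A \subset f^{-1}(f(A)) \subset f^{-1}\big(\overline{f(A)}\big)$ it follows that $\bar{A} \subset f^{-1}\big(\overline{f(A)}\big)$. From this inclusion we learn that $f(\bar{A}) \subset \overline{f(A)}$ so that (1) implies (2). Now assume (2), then for $B \subset Y$ (taking $A = f^{-1}(\bar{B})$) we have
\[ f\big(\overline{f^{-1}(B)}\big) \subset f\big(\overline{f^{-1}(\bar{B})}\big) \subset \overline{f(f^{-1}(\bar{B}))} \subset \bar{B} \]
and therefore
\[ \overline{f^{-1}(B)} \subset f^{-1}(\bar{B}). \tag{13.14} \]
This shows that (2) implies (3). Finally if Eq. (13.14) holds for all $B$, then when $B$ is closed this shows that
\[ \overline{f^{-1}(B)} \subset f^{-1}(\bar{B}) = f^{-1}(B) \subset \overline{f^{-1}(B)}, \]
which shows that
\[ \overline{f^{-1}(B)} = f^{-1}(B). \]
Therefore $f^{-1}(B)$ is closed whenever $B$ is closed, which implies that $f$ is continuous.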
13.4 Countability Axioms
Definition 13.35. Let $(X, \tau)$ be a topological space. A sequence $\{x_n\}_{n=1}^\infty \subset X$ converges to a point $x \in X$ if for all $V \in \tau_x$, $x_n \in V$ almost always (abbreviated a.a.), i.e. $\#(\{n : x_n \notin V\}) < \infty$. We will write $x_n \to x$ as $n \to \infty$ or $\lim_{n \to \infty} x_n = x$ when $x_n$ converges to $x$.
Example 13.36. Let $X = \{1, 2, 3\}$ and $\tau = \{X, \emptyset, \{1, 2\}, \{2, 3\}, \{2\}\}$ and $x_n = 2$ for all $n$. Then $x_n \to x$ for every $x \in X$. So limits need not be unique!
Definition 13.37 (First Countable). A topological space, $(X, \tau)$, is first countable iff every point $x \in X$ has a countable neighborhood base as defined in Notation 13.2.
Example 13.38. All metric spaces, $(X, d)$, are first countable. Indeed, if $x \in X$ then $\{B(x, \tfrac{1}{n}) : n \in \mathbb{N}\}$ is a countable neighborhood base at $x \in X$.
Exercise 13.8. Suppose $X$ is an uncountable set and let $V \in \tau$ iff $V^c$ is finite or countable or $V = \emptyset$. Show $\tau$ is a topology on $X$ which is closed under countable intersections and that $(X, \tau)$ is not first countable.
Exercise 13.9. Let $\{0, 1\}$ be equipped with the discrete topology and $X = \{0, 1\}^{\mathbb{R}}$ be equipped with the product topology, $\tau$. Show $(X, \tau)$ is not first countable.
The spaces described in Exercises 13.8 and 13.9 are examples of topological spaces which are not metrizable, i.e. the topology is not induced by any metric on $X$. As for metric spaces, when $\tau$ is first countable, we may formulate many topological notions in terms of sequences.
Proposition 13.39. If $f : X \to Y$ is continuous at $x \in X$ and $\lim_{n \to \infty} x_n = x \in X$, then $\lim_{n \to \infty} f(x_n) = f(x) \in Y$. Moreover, if there exists a countable neighborhood base $\eta$ of $x \in X$, then $f$ is continuous at $x$ iff $\lim_{n \to \infty} f(x_n) = f(x)$ for all sequences $\{x_n\}_{n=1}^\infty \subset X$ such that $x_n \to x$ as $n \to \infty$.
Proof. If $f : X \to Y$ is continuous and $W \in \tau_Y$ is a neighborhood of $f(x) \in Y$, then there exists a neighborhood $V$ of $x \in X$ such that $f(V) \subset W$. Since $x_n \to x$, $x_n \in V$ a.a. and therefore $f(x_n) \in f(V) \subset W$ a.a., i.e. $f(x_n) \to f(x)$ as $n \to \infty$. Conversely suppose that $\eta := \{W_n\}_{n=1}^\infty$ is a countable neighborhood base at $x$ and $\lim_{n \to \infty} f(x_n) = f(x)$ for all sequences $\{x_n\}_{n=1}^\infty \subset X$ such that $x_n \to x$. By replacing $W_n$ by $W_1 \cap \dots \cap W_n$ if necessary, we may assume that $\{W_n\}_{n=1}^\infty$ is a decreasing sequence of sets. If $f$ were not continuous at $x$ then there exists $V \in \tau_{f(x)}$ such that $x \notin \big[f^{-1}(V)\big]^o$. Therefore, $W_n$ is not a subset of $f^{-1}(V)$ for all $n$. Hence for each $n$, we may choose $x_n \in W_n \setminus f^{-1}(V)$. This sequence then has the property that $x_n \to x$ as $n \to \infty$ while $f(x_n) \notin V$ for all $n$ and hence $\lim_{n \to \infty} f(x_n) \ne f(x)$.
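Lemma 13.40. Suppose there exists $\{x_n\}_{n=1}^\infty \subset A$ such that $x_n \to x$, then $x \in \bar{A}$. Conversely if $(X, \tau)$ is a first countable space (like a metric space) then if $x \in \bar{A}$ there exists $\{x_n\}_{n=1}^\infty \subset A$ such that $x_n \to x$.
Proof. Suppose $\{x_n\}_{n=1}^\infty \subset A$ and $x_n \to x \in X$. Since $\bar{A}^c$ is an open set, if $x \in \bar{A}^c$ then $x_n \in \bar{A}^c \subset A^c$ a.a., contradicting the assumption that $\{x_n\}_{n=1}^\infty \subset A$. Hence $x \in \bar{A}$. For the converse we now assume that $(X, \tau)$ is first countable and that $\{V_n\}_{n=1}^\infty$ is a countable neighborhood base at $x$ such that $V_1 \supset V_2 \supset V_3 \supset \dots$. By Proposition 13.31, $x \in \bar{A}$ iff $V \cap A \ne \emptyset$ for all $V \in \tau_x$. Hence $x \in \bar{A}$ implies there exists $x_n \in V_n \cap A$ for all $n$. It is now easily seen that $x_n \to x$ as $n \to \infty$.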
Definition 13.41. A topological space, $(X, \tau)$, is second countable if there exists a countable base $\mathcal{V}$ for $\tau$, i.e. $\mathcal{V} \subset \tau$ is a countable set such that for every $W \in \tau$,
\[ W = \bigcup \{V : V \in \mathcal{V} \text{ such that } V \subset W\}. \]
Definition 13.42. A subset $D$ of a topological space $X$ is dense if $\bar{D} = X$. A topological space is said to be separable if it contains a countable dense subset, $D$.
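Example 13.43. The following are examples of countable dense sets.
1. The rational numbers, $\mathbb{Q}$, are dense in $\mathbb{R}$ equipped with the usual topology.
2. More generally, $\mathbb{Q}^d$ is a countable dense subset of $\mathbb{R}^d$ for any $d \in \mathbb{N}$.
3. Even more generally, for any function $\mu : \mathbb{N} \to (0, \infty)$, $\ell^p(\mu)$ is separable for all $1 \le p < \infty$. For example, let $\Gamma \subset \mathbb{F}$ be a countable dense set, then
\[ D := \{x \in \ell^p(\mu) : x_i \in \Gamma \text{ for all } i \text{ and } \#\{j : x_j \ne 0\} < \infty\}. \]
The set $\Gamma$ can be taken to be $\mathbb{Q}$ if $\mathbb{F} = \mathbb{R}$ or $\mathbb{Q} + i\mathbb{Q}$ if $\mathbb{F} = \mathbb{C}$.
4. If $(X, d)$ is a metric space which is separable then every subset $Y \subset X$ is also separable in the induced topology.
To prove 4. above, let $A = \{x_n\}_{n=1}^\infty \subset X$ be a countable dense subset of $X$. Let $d_Y(x) = \inf\{d(x, y) : y \in Y\}$ be the distance from $x$ to $Y$ and recall that $d_Y : X \to [0, \infty)$ is continuous. Let $\varepsilon_n = \max\{d_Y(x_n), \tfrac{1}{n}\} > 0$ and for each $n$ let $y_n \in B_{x_n}(2\varepsilon_n) \cap Y$ (this set is non-empty because $d_Y(x_n) < 2\varepsilon_n$). Then if $y \in Y$ and $\varepsilon > 0$ we may choose $n \in \mathbb{N}$ such that $d(y, x_n) \le \varepsilon_n < \varepsilon/3$. Then $d(y_n, x_n) \le 2\varepsilon_n < 2\varepsilon/3$ and therefore
\[ d(y, y_n) \le d(y, x_n) + d(x_n, y_n) < \varepsilon. \]
This shows that $B := \{y_n\}_{n=1}^\infty$ is a countable dense subset of $Y$.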
Exercise 13.10. Show $\ell^\infty(\mathbb{N})$ is not separable.
Exercise 13.11. Show every second countable topological space $(X, \tau)$ is separable. Show the converse is not true by showing $X := \mathbb{R}$ with $\tau = \{\emptyset\} \cup \{V \subset \mathbb{R} : 0 \in V\}$ is a separable, first countable but not a second countable topological space.
Exercise 13.12. Every separable metric space, $(X, d)$, is second countable.
Exercise 13.13. Suppose $\mathcal{E} \subset 2^X$ is a countable collection of subsets of $X$, then $\tau = \tau(\mathcal{E})$ is a second countable topology on $X$.
13.5 Connectedness
Definition 13.44. $(X, \tau)$ is disconnected if there exist non-empty open sets $U$ and $V$ of $X$ such that $U \cap V = \emptyset$ and $X = U \cup V$. We say $\{U, V\}$ is a disconnection of $X$. The topological space $(X, \tau)$ is called connected if it is not disconnected, i.e. if there is no disconnection of $X$. If $A \subset X$ we say $A$ is connected iff $(A, \tau_A)$ is connected where $\tau_A$ is the relative topology on $A$. Explicitly, $A$ is disconnected in $(X, \tau)$ iff there exist $U, V \in \tau$ such that $U \cap A \ne \emptyset$, $V \cap A \ne \emptyset$, $A \cap U \cap V = \emptyset$ and $A \subset U \cup V$.
The reader should check that the following statement is an equivalent definition of connectivity. A topological space $(X, \tau)$ is connected iff the only sets $A \subset X$ which are both open and closed are the sets $X$ and $\emptyset$. This version of the definition is often used in practice.
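The "open and closed" formulation of connectedness is easy to test on finite spaces. The following sketch assumes the topology is given as an explicit finite family of sets; `is_connected` is a hypothetical helper name.

```python
def is_connected(X, tau):
    """A finite space is connected iff its only clopen sets are X and the empty set."""
    X = frozenset(X)
    tau = {frozenset(V) for V in tau}
    # A set is clopen when both it and its complement are open.
    clopen = [V for V in tau if (X - V) in tau]
    return all(V == X or not V for V in clopen)

# A connected three-point space.
X1 = {1, 2, 3}
tau1 = [set(), {2}, {1, 2}, {2, 3}, {1, 2, 3}]

# The discrete two-point space, disconnected by {1} and {2}.
X2 = {1, 2}
tau2 = [set(), {1}, {2}, {1, 2}]

connected_1 = is_connected(X1, tau1)
connected_2 = is_connected(X2, tau2)
```

In the second space the set {1} is both open and closed, so {1} and {2} form a disconnection.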
Remark 13.45. Let $A \subset Y \subset X$. Then $A$ is connected in $X$ iff $A$ is connected in $Y$.
Proof. Since
\[ \tau_A := \{V \cap A : V \subset_o X\} = \{V \cap A \cap Y : V \subset_o X\} = \{U \cap A : U \subset_o Y\}, \]
the relative topology on $A$ inherited from $X$ is the same as the relative topology on $A$ inherited from $Y$. Since connectivity is a statement about the relative topologies on $A$, $A$ is connected in $X$ iff $A$ is connected in $Y$.
The following elementary but important lemma is left as an exercise to the reader.
Lemma 13.46. Suppose that $f : X \to Y$ is a continuous map between topological spaces. Then $f(X) \subset Y$ is connected if $X$ is connected.
Here is a typical way these connectedness ideas are used.
Example 13.47. Suppose that $f : X \to Y$ is a continuous map between two topological spaces, the space $X$ is connected and the space $Y$ is "$T_1$," i.e. $\{y\}$ is a closed set for all $y \in Y$ as in Definition 15.35 below. Further assume $f$ is locally constant, i.e. for all $x \in X$ there exists an open neighborhood $V$ of $x$ in $X$ such that $f|_V$ is constant. Then $f$ is constant, i.e. $f(X) = \{y_0\}$ for some $y_0 \in Y$. To prove this, let $y_0 \in f(X)$ and let $W := f^{-1}(\{y_0\})$. Since $\{y_0\} \subset Y$ is a closed set and since $f$ is continuous, $W \subset X$ is also closed. Since $f$ is locally constant, $W$ is open as well and since $X$ is connected it follows that $W = X$, i.e. $f(X) = \{y_0\}$.
As a concrete application of this result, suppose that $X$ is a connected open subset of $\mathbb{R}^d$ and $f : X \to \mathbb{R}$ is a $C^1$ function such that $\nabla f \equiv 0$. If $x \in X$ and $\varepsilon > 0$ are such that $B(x, \varepsilon) \subset X$, we have, for any $|v| < \varepsilon$ and $t \in [-1, 1]$, that
\[ \frac{d}{dt} f(x + tv) = \nabla f(x + tv) \cdot v = 0. \]
Therefore $f(x + v) = f(x)$ for all $|v| < \varepsilon$ and this shows $f$ is locally constant. Hence, by what we have just proved, $f$ is constant on $X$.
Theorem 13.48 (Properties of Connected Sets). Let $(X, \tau)$ be a topological space.
1. If $B \subset X$ is a connected set and $X$ is the disjoint union of two open sets $U$ and $V$, then either $B \subset U$ or $B \subset V$.
2. If $A \subset X$ is connected,
a) then $\bar{A}$ is connected.
b) More generally, if $A$ is connected and $B \subset \operatorname{acc}(A)$, then $A \cup B$ is connected as well. (Recall that $\operatorname{acc}(A)$, the set of accumulation points of $A$, was defined in Definition 13.29 above.)
3. If $\{E_\alpha\}_{\alpha \in A}$ is a collection of connected sets such that $\bigcap_{\alpha \in A} E_\alpha \ne \emptyset$, then $Y := \bigcup_{\alpha \in A} E_\alpha$ is connected as well.
4. Suppose $A, B \subset X$ are non-empty connected subsets of $X$ such that $\bar{A} \cap B \ne \emptyset$, then $A \cup B$ is connected in $X$.
5. Every point $x \in X$ is contained in a unique maximal connected subset $C_x$ of $X$ and this subset is closed. The set $C_x$ is called the connected component of $x$.
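Proof.
1. Since $B$ is the disjoint union of the relatively open sets $B \cap U$ and $B \cap V$, we must have $B \cap U = B$ or $B \cap V = B$, for otherwise $\{B \cap U, B \cap V\}$ would be a disconnection of $B$.
2. a. Let $Y = \bar{A}$ be equipped with the relative topology from $X$. Suppose that $U, V \subset_o Y$ form a disconnection of $Y = \bar{A}$. Then by 1. either $A \subset U$ or $A \subset V$. Say that $A \subset U$. Since $U$ is both open and closed in $Y$, it follows that $Y = \bar{A} \subset U$. Therefore $V = \emptyset$ and we have a contradiction to the assumption that $\{U, V\}$ is a disconnection of $Y = \bar{A}$. Hence we must conclude that $Y = \bar{A}$ is connected as well.
b. Now let $Y = A \cup B$ with $B \subset \operatorname{acc}(A)$, then
\[ \bar{A}^Y = \bar{A} \cap Y = (A \cup \operatorname{acc}(A)) \cap Y = A \cup B. \]
Because $A$ is connected in $Y$, by (2a) $Y = A \cup B = \bar{A}^Y$ is also connected.
3. Let $Y := \bigcup_{\alpha \in A} E_\alpha$. By Remark 13.45, we know that $E_\alpha$ is connected in $Y$ for each $\alpha \in A$. If $\{U, V\}$ were a disconnection of $Y$, by item (1), either $E_\alpha \subset U$ or $E_\alpha \subset V$ for all $\alpha$. Let $\Lambda = \{\alpha \in A : E_\alpha \subset U\}$, then $U = \bigcup_{\alpha \in \Lambda} E_\alpha$ and $V = \bigcup_{\alpha \in A \setminus \Lambda} E_\alpha$. (Notice that neither $\Lambda$ nor $A \setminus \Lambda$ can be empty since $U$ and $V$ are not empty.) Since
\[ \emptyset = U \cap V = \bigcup_{\alpha \in \Lambda,\ \beta \in \Lambda^c} (E_\alpha \cap E_\beta) \supset \bigcap_{\alpha \in A} E_\alpha \ne \emptyset, \]
we have reached a contradiction and hence no such disconnection exists.
4. (A good example to keep in mind here is $X = \mathbb{R}$, $A = (0, 1)$ and $B = [1, 2)$.) For sake of contradiction suppose that $\{U, V\}$ were a disconnection of $Y = A \cup B$. By item (1) either $A \subset U$ or $A \subset V$, say $A \subset U$, in which case $B \subset V$. Since $Y = A \cup B$ we must have $A = U$ and $B = V$ and so we may conclude: $A$ and $B$ are disjoint subsets of $Y$ which are both open and closed. This implies
\[ A = \bar{A}^Y = \bar{A} \cap Y = \bar{A} \cap (A \cup B) = A \cup \big(\bar{A} \cap B\big) \]
and therefore
\[ \emptyset = A \cap B = \big[A \cup \big(\bar{A} \cap B\big)\big] \cap B = \bar{A} \cap B \ne \emptyset, \]
which gives us the desired contradiction.
5. Let $\mathcal{C}$ denote the collection of connected subsets $C \subset X$ such that $x \in C$. Then by item 3., the set $C_x := \bigcup \mathcal{C}$ is also a connected subset of $X$ which contains $x$ and clearly this is the unique maximal connected set containing $x$. Since $\bar{C}_x$ is also connected by item (2) and $C_x$ is maximal, $C_x = \bar{C}_x$, i.e. $C_x$ is closed.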
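Theorem 13.49 (The Connected Subsets of $\mathbb{R}$). The connected subsets of $\mathbb{R}$ are intervals.
Proof. Suppose that $A \subset \mathbb{R}$ is a connected subset and that $a, b \in A$ with $a < b$. If there exists $c \in (a, b)$ such that $c \notin A$, then $U := (-\infty, c) \cap A$ and $V := (c, \infty) \cap A$ would form a disconnection of $A$. Hence $(a, b) \subset A$. Let $\alpha := \inf(A)$ and $\beta := \sup(A)$ and choose $\alpha_n, \beta_n \in A$ such that $\alpha_n < \beta_n$ and $\alpha_n \downarrow \alpha$ and $\beta_n \uparrow \beta$ as $n \to \infty$. By what we have just shown, $(\alpha_n, \beta_n) \subset A$ for all $n$ and hence $(\alpha, \beta) = \bigcup_{n=1}^\infty (\alpha_n, \beta_n) \subset A$. From this it follows that $A = (\alpha, \beta)$, $[\alpha, \beta)$, $(\alpha, \beta]$ or $[\alpha, \beta]$, i.e. $A$ is an interval.
Conversely suppose that $A$ is an interval, and for sake of contradiction, suppose that $\{U, V\}$ is a disconnection of $A$ with $a \in U$, $b \in V$. After relabelling $U$ and $V$ if necessary we may assume that $a < b$. Since $A$ is an interval, $[a, b] \subset A$. Let $p = \sup([a, b] \cap U)$, then because $U$ and $V$ are open, $a < p < b$. Now $p$ cannot be in $U$ for otherwise $\sup([a, b] \cap U) > p$, and $p$ cannot be in $V$ for otherwise $p < \sup([a, b] \cap U)$. From this it follows that $p \notin U \cup V$ and hence $A \ne U \cup V$, contradicting the assumption that $\{U, V\}$ is a disconnection.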
Theorem 13.50 (Intermediate Value Theorem). Suppose that $(X, \tau)$ is a connected topological space and $f : X \to \mathbb{R}$ is a continuous map. Then $f$ satisfies the intermediate value property. Namely, for every pair $x, y \in X$ such that $f(x) < f(y)$ and $c \in (f(x), f(y))$, there exists $z \in X$ such that $f(z) = c$.
Proof. By Lemma 13.46, $f(X)$ is a connected subset of $\mathbb{R}$. So by Theorem 13.49, $f(X)$ is a subinterval of $\mathbb{R}$ and this completes the proof.
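In the special case where X is an interval of the real line, the point z promised by the intermediate value property can be approximated by bisection. This is only an illustration under that assumption (the theorem itself is non-constructive and holds for any connected space); `intermediate_point` and the sample function are hypothetical.

```python
def intermediate_point(f, x, y, c, tol=1e-12, max_iter=200):
    """Approximate z between x and y with f(z) = c, assuming f is continuous
    on the interval between x and y and f(x) < c < f(y)."""
    if not f(x) < c < f(y):
        raise ValueError("need f(x) < c < f(y)")
    lo, hi = x, y            # invariant: f(lo) < c <= f(hi)
    for _ in range(max_iter):
        if abs(hi - lo) <= tol:
            break
        mid = (lo + hi) / 2
        if f(mid) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def cubic(t):
    return t**3 - t

z = intermediate_point(cubic, 0.0, 2.0, 3.0)
```

The invariant f(lo) < c <= f(hi) is exactly the hypothesis of the theorem applied to ever smaller subintervals, so continuity forces f(z) to be close to c once the interval is short.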
Definition 13.51. A topological space $X$ is path connected if to every pair of points $\{x_0, x_1\} \subset X$ there exists a continuous path, $\sigma \in C([0, 1], X)$, such that $\sigma(0) = x_0$ and $\sigma(1) = x_1$. The space $X$ is said to be locally path connected if for each $x \in X$, there is an open neighborhood $V \subset X$ of $x$ which is path connected.
Proposition 13.52. Let $X$ be a topological space.
1. If $X$ is path connected then $X$ is connected.
2. If $X$ is connected and locally path connected, then $X$ is path connected.
3. If $X$ is any connected open subset of $\mathbb{R}^n$, then $X$ is path connected.
Proof. The reader is asked to prove this proposition in Exercises 13.20 to 13.22 below.
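Proposition 13.53 (Stability of Connectedness Under Products). Let $(X_\alpha, \tau_\alpha)$ be connected topological spaces. Then the product space $X_A = \prod_{\alpha \in A} X_\alpha$ equipped with the product topology is connected.
Proof. Let us begin with the case of two factors, namely assume that $X$ and $Y$ are connected topological spaces, then we will show that $X \times Y$ is connected as well. Given $x \in X$, let $f_x : Y \to X \times Y$ be the map $f_x(y) = (x, y)$ and notice that $f_x$ is continuous since $\pi_X \circ f_x(y) = x$ and $\pi_Y \circ f_x(y) = y$ are continuous maps. From this we conclude that $\{x\} \times Y = f_x(Y)$ is connected by Lemma 13.46. A similar argument shows $X \times \{y\}$ is connected for all $y \in Y$.
Let $p = (x_0, y_0) \in X \times Y$ and $C_p$ denote the connected component of $p$. Since $\{x_0\} \times Y$ is connected and $p \in \{x_0\} \times Y$ it follows that $\{x_0\} \times Y \subset C_p$ and hence $C_p$ is also the connected component of $(x_0, y)$ for all $y \in Y$. Similarly, $X \times \{y\} \subset C_{(x_0, y)} = C_p$ because $X \times \{y\}$ is connected, and therefore $X \times \{y\} \subset C_p$. So we have shown $(x, y) \in C_p$ for all $x \in X$ and $y \in Y$, see Figure 13.4. By induction the theorem holds whenever $A$ is a finite set, i.e. for products of a finite number of connected spaces.
Fig. 13.4. This picture illustrates why the connected component of $p$ in $X \times Y$ must contain all points of $X \times Y$.
For the general case, again choose a point $p \in X_A$ and again let $C = C_p$ be the connected component of $p$. Recall that $C_p$ is closed and therefore if $C_p$ is a proper subset of $X_A$, then $X_A \setminus C_p$ is a non-empty open set. By the definition of the product topology, this would imply that $X_A \setminus C_p$ contains an open set of the form
\[ V := \bigcap_{\alpha \in \Lambda} \pi_\alpha^{-1}(V_\alpha) = V_\Lambda \times X_{A \setminus \Lambda} \]
where $\Lambda \subset\subset A$ and $V_\alpha \in \tau_\alpha$ for all $\alpha \in \Lambda$. We will now show that no such $V$ can exist and hence $X_A = C_p$, i.e. $X_A$ is connected.
Define $\varphi : X_\Lambda \to X_A$ by $\varphi(y) = x$ where
\[ x_\alpha = \begin{cases} y_\alpha & \text{if } \alpha \in \Lambda \\ p_\alpha & \text{if } \alpha \notin \Lambda. \end{cases} \]
If $\alpha \in \Lambda$, $\pi_\alpha \circ \varphi(y) = y_\alpha = \pi_\alpha(y)$ and if $\alpha \in A \setminus \Lambda$ then $\pi_\alpha \circ \varphi(y) = p_\alpha$, so that in every case $\pi_\alpha \circ \varphi : X_\Lambda \to X_\alpha$ is continuous and therefore $\varphi$ is continuous. Since $X_\Lambda$ is a product of a finite number of connected spaces it is connected, and thus so is the continuous image, $\varphi(X_\Lambda) = X_\Lambda \times \{p_\alpha\}_{\alpha \in A \setminus \Lambda} \subset X_A$. Now $p \in \varphi(X_\Lambda)$ and $\varphi(X_\Lambda)$ is connected implies that $\varphi(X_\Lambda) \subset C$. On the other hand one easily sees that
\[ \emptyset \ne V \cap \varphi(X_\Lambda) \subset V \cap C, \]
contradicting the assumption that $V \subset C^c$.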
13.6 Exercises
13.6.1 General Topological Space Problems
Exercise 13.14. Let $V$ be an open subset of $\mathbb{R}$. Show $V$ may be written as a disjoint union of open intervals $J_n = (a_n, b_n)$, where $a_n, b_n \in \mathbb{R} \cup \{\pm\infty\}$ for $n = 1, 2, \dots < N$ with $N = \infty$ possible.
Exercise 13.15. Let $(X, \tau)$ and $(Y, \tau')$ be topological spaces, $f : X \to Y$ be a function, $\mathcal{U}$ be an open cover of $X$ and $\{F_j\}_{j=1}^n$ be a finite cover of $X$ by closed sets.
1. If $A \subset X$ is any set and $f : X \to Y$ is $(\tau, \tau')$-continuous then $f|_A : A \to Y$ is $(\tau_A, \tau')$-continuous.
2. Show $f : X \to Y$ is $(\tau, \tau')$-continuous iff $f|_U : U \to Y$ is $(\tau_U, \tau')$-continuous for all $U \in \mathcal{U}$.
3. Show $f : X \to Y$ is $(\tau, \tau')$-continuous iff $f|_{F_j} : F_j \to Y$ is $(\tau_{F_j}, \tau')$-continuous for all $j = 1, 2, \dots, n$.
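Exercise 13.16. Suppose that $X$ is a set, $\{(Y_\alpha, \tau_\alpha) : \alpha \in A\}$ is a family of topological spaces and $f_\alpha : X \to Y_\alpha$ is a given function for all $\alpha \in A$. Assuming that $\mathcal{S}_\alpha \subset \tau_\alpha$ is a sub-base for the topology $\tau_\alpha$ for each $\alpha \in A$, show $\mathcal{S} := \bigcup_{\alpha \in A} f_\alpha^{-1}(\mathcal{S}_\alpha)$ is a sub-base for the topology $\tau := \tau(f_\alpha : \alpha \in A)$.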
13.6.2 Connectedness Problems
Exercise 13.17. Show any non-trivial interval in $\mathbb{Q}$ is disconnected.
Exercise 13.18. Suppose $a < b$ and $f : (a, b) \to \mathbb{R}$ is a non-decreasing function. Show if $f$ satisfies the intermediate value property (see Theorem 13.50), then $f$ is continuous.
Exercise 13.19. Suppose $-\infty < a < b \le \infty$ and $f : [a, b) \to \mathbb{R}$ is a strictly increasing continuous function. By Lemma 13.46, $f([a, b))$ is an interval and since $f$ is strictly increasing it must be of the form $[c, d)$ for some $c \in \mathbb{R}$ and $d \in \bar{\mathbb{R}}$ with $c < d$. Show the inverse function $f^{-1} : [c, d) \to [a, b)$ is continuous and is strictly increasing. In particular if $n \in \mathbb{N}$, apply this result to $f(x) = x^n$ for $x \in [0, \infty)$ to construct the positive $n$-th root of a real number. Compare with Exercise 3.8.
Exercise 13.20. Prove item 1. of Proposition 13.52. Hint: show $X$ is not connected implies $X$ is not path connected.
Exercise 13.21. Prove item 2. of Proposition 13.52. Hint: fix $x_0 \in X$ and let $W$ denote the set of $x \in X$ such that there exists $\sigma \in C([0, 1], X)$ satisfying $\sigma(0) = x_0$ and $\sigma(1) = x$. Then show $W$ is both open and closed.
Exercise 13.22. Prove item 3. of Proposition 13.52.
Exercise 13.23. Let
\[ X := \{(x, y) \in \mathbb{R}^2 : y = \sin(x^{-1})\} \cup \{(0, 0)\} \]
equipped with the relative topology induced from the standard topology on $\mathbb{R}^2$. Show $X$ is connected but not path connected.
13.6.3 Metric Spaces as Topological Spaces
Definition 13.54. Two metrics $d$ and $\rho$ on a set $X$ are said to be equivalent if there exists a constant $c \in (0, \infty)$ such that $c^{-1}\rho \le d \le c\rho$.
Exercise 13.24. Suppose that $d$ and $\rho$ are two metrics on $X$.
1. Show $\tau_d = \tau_\rho$ if $d$ and $\rho$ are equivalent.
2. Show by example that it is possible for $\tau_d = \tau_\rho$ even though $d$ and $\rho$ are inequivalent.
Exercise 13.25. Let $(X_i, d_i)$ for $i = 1, \dots, n$ be a finite collection of metric spaces and for $1 \le p \le \infty$ and $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ in $X := \prod_{i=1}^n X_i$, let
\[ \rho_p(x, y) = \begin{cases} \big(\sum_{i=1}^n [d_i(x_i, y_i)]^p\big)^{1/p} & \text{if } p \ne \infty \\ \max_i d_i(x_i, y_i) & \text{if } p = \infty. \end{cases} \]
1. Show $(X, \rho_p)$ is a metric space for $p \in [1, \infty]$. Hint: Minkowski's inequality.
2. Show for any $p, q \in [1, \infty]$, the metrics $\rho_p$ and $\rho_q$ are equivalent. Hint: This can be done with explicit estimates or you could use Theorem 14.12 below.
Notation 13.55 Let $X$ be a set and $p := \{p_n\}_{n=0}^\infty$ be a family of semi-metrics on $X$, i.e. $p_n : X \times X \to [0, \infty)$ are functions satisfying the assumptions of a metric except for the assertion that $p_n(x, y) = 0$ implies $x = y$. Further assume that $p_n(x, y) \le p_{n+1}(x, y)$ for all $n$ and if $p_n(x, y) = 0$ for all $n \in \mathbb{N}$ then $x = y$. Given $n \in \mathbb{N}$ and $x \in X$ let
\[ B_n(x, \varepsilon) := \{y \in X : p_n(x, y) < \varepsilon\}. \]
We will write $\tau(p)$ for the smallest topology on $X$ such that $p_n(x, \cdot) : X \to [0, \infty)$ is continuous for all $n \in \mathbb{N}$ and $x \in X$, i.e. $\tau(p) := \tau(p_n(x, \cdot) : n \in \mathbb{N} \text{ and } x \in X)$.
Exercise 13.26. Using Notation 13.55, show that the collection of balls,
\[ \mathcal{B} := \{B_n(x, \varepsilon) : n \in \mathbb{N},\ x \in X \text{ and } \varepsilon > 0\}, \]
forms a base for the topology $\tau(p)$. Hint: Use Exercise 13.16 to show $\mathcal{B}$ is a sub-base for the topology $\tau(p)$ and then use Exercise 13.2 to show $\mathcal{B}$ is in fact a base for the topology $\tau(p)$.
Exercise 13.27 (A minor variant of Exercise 6.12). Let $p_n$ be as in Notation 13.55 and
\[ d(x, y) := \sum_{n=0}^\infty 2^{-n} \frac{p_n(x, y)}{1 + p_n(x, y)}. \]
Show $d$ is a metric on $X$ and $\tau_d = \tau(p)$. Conclude that a sequence $\{x_k\}_{k=1}^\infty \subset X$ converges to $x \in X$ iff
\[ \lim_{k \to \infty} p_n(x_k, x) = 0 \quad\text{for all } n \in \mathbb{N}. \]
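Exercise 13.28. Let $\{(X_n, d_n)\}_{n=1}^\infty$ be a sequence of metric spaces, $X := \prod_{n=1}^\infty X_n$, and for $x = (x(n))_{n=1}^\infty$ and $y = (y(n))_{n=1}^\infty$ in $X$ let
\[ d(x, y) = \sum_{n=1}^\infty 2^{-n} \frac{d_n(x(n), y(n))}{1 + d_n(x(n), y(n))}. \]
(See Exercise 6.12.) Moreover, let $\pi_n : X \to X_n$ be the projection maps. Show
\[ \tau_d = \otimes_{n=1}^\infty \tau_{d_n} := \tau(\{\pi_n : n \in \mathbb{N}\}). \]
That is, show the $d$-metric topology is the same as the product topology on $X$. Suggestions: 1) show $\pi_n$ is $\tau_d$-continuous for each $n$ and 2) show for each $x \in X$ that $d(x, \cdot)$ is $\otimes_{n=1}^\infty \tau_{d_n}$-continuous. For the second assertion notice that $d(x, \cdot) = \sum_{n=1}^\infty f_n$ where
\[ f_n = 2^{-n} \Big(\frac{d_n(x(n), \cdot)}{1 + d_n(x(n), \cdot)}\Big) \circ \pi_n. \]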
14
Compactness
Definition 14.1. The subset $A$ of a topological space $(X, \tau)$ is said to be compact if every open cover (Definition 13.18) of $A$ has a finite sub-cover, i.e. if $\mathcal{U}$ is an open cover of $A$ there exists $\mathcal{U}_0 \subset\subset \mathcal{U}$ such that $\mathcal{U}_0$ is a cover of $A$. (We will write $A \sqsubset\sqsubset X$ to denote that $A \subset X$ and $A$ is compact.) A subset $A \subset X$ is precompact if $\bar{A}$ is compact.
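Proposition 14.2. Suppose that $K \subset X$ is a compact set and $F \subset K$ is a closed subset. Then $F$ is compact. If $\{K_i\}_{i=1}^n$ is a finite collection of compact subsets of $X$ then $K = \cup_{i=1}^n K_i$ is also a compact subset of $X$.
Proof. Let $\mathcal{U} \subset \tau$ be an open cover of $F$, then $\mathcal{U} \cup \{F^c\}$ is an open cover of $K$. The cover $\mathcal{U} \cup \{F^c\}$ of $K$ has a finite subcover which we denote by $\mathcal{U}_0 \cup \{F^c\}$ where $\mathcal{U}_0 \subset\subset \mathcal{U}$. Since $F \cap F^c = \emptyset$, it follows that $\mathcal{U}_0$ is the desired subcover of $F$. For the second assertion suppose $\mathcal{U} \subset \tau$ is an open cover of $K$. Then $\mathcal{U}$ covers each compact set $K_i$ and therefore there exists a finite subset $\mathcal{U}_i \subset\subset \mathcal{U}$ for each $i$ such that $K_i \subset \cup\, \mathcal{U}_i$. Then $\mathcal{U}_0 := \cup_{i=1}^n \mathcal{U}_i$ is a finite cover of $K$.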
Exercise 14.1 (Suggested by Michael Gurvich). Show by example that the intersection of two compact sets need not be compact. (This pathology disappears if one assumes the topology is Hausdorff, see Definition 15.2 below.)
Exercise 14.2. Suppose $f : X \to Y$ is continuous and $K \subset X$ is compact, then $f(K)$ is a compact subset of $Y$. Give an example of a continuous map, $f : X \to Y$, and a compact subset $K$ of $Y$ such that $f^{-1}(K)$ is not compact.
Exercise 14.3 (Dini's Theorem). Let $X$ be a compact topological space and $f_n : X \to [0,\infty)$ be a sequence of continuous functions such that $f_n(x) \downarrow 0$ as $n \to \infty$ for each $x \in X$. Show that in fact $f_n \downarrow 0$ uniformly in $x$, i.e. $\sup_{x \in X} f_n(x) \downarrow 0$ as $n \to \infty$. Hint: Given $\varepsilon > 0$, consider the open sets $V_n := \{x \in X : f_n(x) < \varepsilon\}$.
Definition 14.3. A collection $\mathcal{F}$ of closed subsets of a topological space $(X, \tau)$ has the finite intersection property if $\cap \mathcal{F}_0 \ne \emptyset$ for all $\mathcal{F}_0 \subset\subset \mathcal{F}$.
The notion of compactness may be expressed in terms of closed sets as
follows.
Proposition 14.4. A topological space $X$ is compact iff every family of closed sets $\mathcal{F} \subset 2^X$ having the finite intersection property satisfies $\cap \mathcal{F} \ne \emptyset$.
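Proof. ($\Rightarrow$) Suppose that $X$ is compact and $\mathcal{F} \subset 2^X$ is a collection of closed sets such that $\cap \mathcal{F} = \emptyset$. Let
\[
\mathcal{U} = \mathcal{F}^c := \{ C^c : C \in \mathcal{F} \} \subset \tau,
\]
then $\mathcal{U}$ is a cover of $X$ and hence has a finite subcover, $\mathcal{U}_0$. Let $\mathcal{F}_0 = \mathcal{U}_0^c \subset\subset \mathcal{F}$, then $\cap \mathcal{F}_0 = \emptyset$ so that $\mathcal{F}$ does not have the finite intersection property. ($\Leftarrow$) If $X$ is not compact, there exists an open cover $\mathcal{U}$ of $X$ with no finite subcover. Let
\[
\mathcal{F} = \mathcal{U}^c := \{ U^c : U \in \mathcal{U} \},
\]
then $\mathcal{F}$ is a collection of closed sets with the finite intersection property while $\cap \mathcal{F} = \emptyset$.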
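As a concrete illustration of the closed-set criterion (an example added here, not taken from the text), the real line with its usual topology fails it, which gives another way to see that $\mathbb{R}$ is not compact.

```latex
% Closed sets F_n in X = \mathbb{R} with the finite intersection property
% but empty total intersection:
\[
F_n := [n, \infty), \qquad
F_{n_1} \cap \dots \cap F_{n_k} = [\max_i n_i, \infty) \ne \emptyset, \qquad
\bigcap_{n=1}^{\infty} F_n = \emptyset .
\]
% The complements form an open cover of \mathbb{R} with no finite subcover:
\[
U_n := F_n^c = (-\infty, n), \qquad
\bigcup_{n=1}^{\infty} U_n = \mathbb{R}, \qquad
U_{n_1} \cup \dots \cup U_{n_k} = (-\infty, \max_i n_i) \ne \mathbb{R} .
\]
```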
Exercise 14.4. Let $(X, \tau)$ be a topological space. Show that $A \subset X$ is compact iff $(A, \tau_A)$ is a compact topological space.
14.1 Metric Space Compactness Criteria
Let $(X, d)$ be a metric space and for $x \in X$ and $\varepsilon > 0$ let
\[
B'_x(\varepsilon) := B_x(\varepsilon) \setminus \{x\}
\]
be the ball centered at $x$ of radius $\varepsilon > 0$ with $x$ deleted. Recall from Definition 13.29 that a point $x \in X$ is an accumulation point of a subset $E \subset X$ if $\emptyset \ne E \cap V \setminus \{x\}$ for all open neighborhoods, $V$, of $x$. The proof of the following elementary lemma is left to the reader.
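Lemma 14.5. Let $E \subset X$ be a subset of a metric space $(X, d)$. Then the following are equivalent:
1. $x \in X$ is an accumulation point of $E$.
2. $B'_x(\varepsilon) \cap E \ne \emptyset$ for all $\varepsilon > 0$.
3. $B_x(\varepsilon) \cap E$ is an infinite set for all $\varepsilon > 0$.
4. There exists $\{x_n\}_{n=1}^\infty \subset E \setminus \{x\}$ with $\lim_{n\to\infty} x_n = x$.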
Definition 14.6. A metric space $(X, d)$ is $\varepsilon$-bounded ($\varepsilon > 0$) if there exists a finite cover of $X$ by balls of radius $\varepsilon$ and it is totally bounded if it is $\varepsilon$-bounded for all $\varepsilon > 0$.
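Two standard examples (added here for illustration, with the usual metric on $(0,1)$ and the discrete metric on $\mathbb{N}$ as assumptions) show that total boundedness and completeness are independent conditions.

```latex
% (0,1) with d(x,y) = |x-y| is totally bounded: given \varepsilon > 0 pick an
% integer N >= 2 with 1/N < \varepsilon; every x in (0,1) is within 1/N of some k/N.
\[
(0,1) \subset \bigcup_{k=1}^{N-1} B_{k/N}(\varepsilon), \qquad \tfrac{1}{N} < \varepsilon .
\]
% It is not complete: x_n = 1/n is Cauchy in (0,1) but has no limit in (0,1).
%
% \mathbb{N} with the discrete metric d(m,n) = 1 for m \ne n is complete
% (Cauchy sequences are eventually constant) but not \varepsilon-bounded for \varepsilon \le 1,
% since each ball of radius \varepsilon \le 1 is a single point:
\[
B_m(\varepsilon) = \{m\} \quad \text{for } 0 < \varepsilon \le 1 .
\]
```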
Theorem 14.7. Let $(X, d)$ be a metric space. The following are equivalent.
(a) $X$ is compact.
(b) Every infinite subset of $X$ has an accumulation point.
(c) Every sequence $\{x_n\}_{n=1}^\infty \subset X$ has a convergent subsequence.
(d) $X$ is totally bounded and complete.
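Proof. The proof will consist of showing that $a \Rightarrow b \Rightarrow c \Rightarrow d \Rightarrow a$.
($a \Rightarrow b$) We will show that not $b$ $\Rightarrow$ not $a$. Suppose there exists an infinite subset $E \subset X$ which has no accumulation points. Then for all $x \in X$ there exists $\delta_x > 0$ such that $V_x := B_x(\delta_x)$ satisfies $(V_x \setminus \{x\}) \cap E = \emptyset$. Clearly $\mathcal{V} = \{V_x\}_{x \in X}$ is a cover of $X$, yet $\mathcal{V}$ has no finite sub-cover. Indeed, for each $x \in X$, $V_x \cap E \subset \{x\}$ and hence if $\Lambda \subset\subset X$, $\cup_{x \in \Lambda} V_x$ can only contain a finite number of points from $E$ (namely $\Lambda \cap E$). Thus for any $\Lambda \subset\subset X$, $E \not\subset \cup_{x \in \Lambda} V_x$ and in particular $X \ne \cup_{x \in \Lambda} V_x$. (See Figure 14.1.)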
Fig. 14.1. The construction of an open cover with no finite sub-cover.
($b \Rightarrow c$) Let $\{x_n\}_{n=1}^\infty \subset X$ be a sequence and $E := \{x_n : n \in \mathbb{N}\}$. If $\#(E) < \infty$, then $\{x_n\}_{n=1}^\infty$ has a subsequence $\{x_{n_k}\}_{k=1}^\infty$ which is constant and hence convergent. On the other hand if $\#(E) = \infty$ then by assumption $E$ has an accumulation point and hence by Lemma 14.5, $\{x_n\}_{n=1}^\infty$ has a convergent subsequence.
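($c \Rightarrow d$) Suppose $\{x_n\}_{n=1}^\infty \subset X$ is a Cauchy sequence. By assumption there exists a subsequence $\{x_{n_k}\}_{k=1}^\infty$ which is convergent to some point $x \in X$. Since $\{x_n\}_{n=1}^\infty$ is Cauchy it follows that $x_n \to x$ as $n \to \infty$, showing $X$ is complete.
We now show that $X$ is totally bounded. Let $\varepsilon > 0$ be given and choose an arbitrary point $x_1 \in X$. If possible choose $x_2 \in X$ such that $d(x_2, x_1) \ge \varepsilon$, then if possible choose $x_3 \in X$ such that $d_{\{x_1, x_2\}}(x_3) \ge \varepsilon$ and continue inductively choosing points $\{x_j\}_{j=1}^n \subset X$ such that $d_{\{x_1, \dots, x_{n-1}\}}(x_n) \ge \varepsilon$. (See Figure 14.2.) This process must terminate, for otherwise we would produce a sequence $\{x_n\}_{n=1}^\infty \subset X$ which can have no convergent subsequences. Indeed, the $x_n$ have been chosen so that $d(x_n, x_m) \ge \varepsilon > 0$ for every $m \ne n$ and hence no subsequence of $\{x_n\}_{n=1}^\infty$ can be Cauchy. Once the process has terminated with points $x_1, \dots, x_n$, every point of $X$ lies within $\varepsilon$ of some $x_j$, so the balls $B_{x_j}(\varepsilon)$ form a finite cover of $X$.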
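($d \Rightarrow a$) For sake of contradiction, assume there exists an open cover $\mathcal{V} = \{V_\alpha\}_{\alpha \in A}$ of $X$ with no finite subcover. Since $X$ is totally bounded, for each $n \in \mathbb{N}$ there exists $\Lambda_n \subset\subset X$ such that
\[
X = \bigcup_{x \in \Lambda_n} B_x(1/n) \subset \bigcup_{x \in \Lambda_n} C_x(1/n).
\]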
Fig. 14.2. Constructing a set without an accumulation point.
Choose $x_1 \in \Lambda_1$ such that no finite subset of $\mathcal{V}$ covers $K_1 := C_{x_1}(1)$. Since $K_1 = \cup_{x \in \Lambda_2} K_1 \cap C_x(1/2)$, there exists $x_2 \in \Lambda_2$ such that $K_2 := K_1 \cap C_{x_2}(1/2)$ can not be covered by a finite subset of $\mathcal{V}$, see Figure 14.3. Continuing this way inductively, we construct sets $K_n = K_{n-1} \cap C_{x_n}(1/n)$ with $x_n \in \Lambda_n$ such that no $K_n$ can be covered by a finite subset of $\mathcal{V}$. Now choose $y_n \in K_n$ for each $n$. Since $\{K_n\}_{n=1}^\infty$ is a decreasing sequence of closed sets such that $\operatorname{diam}(K_n) \le 2/n$, it follows that $\{y_n\}$ is Cauchy and hence convergent with
\[
y = \lim_{n\to\infty} y_n \in \bigcap_{m=1}^\infty K_m .
\]
Since $\mathcal{V}$ is a cover of $X$, there exists $V \in \mathcal{V}$ such that $y \in V$. Since $K_n \downarrow \{y\}$ and $\operatorname{diam}(K_n) \to 0$, it now follows that $K_n \subset V$ for some $n$ large. But this violates the assertion that $K_n$ can not be covered by a finite subset of $\mathcal{V}$.
Fig. 14.3. Nested Sequence of cubes.
Corollary 14.8. Any compact metric space (X, d) is second countable and
hence also separable by Exercise 13.11. (See Example 15.25 below for an ex-
ample of a compact topological space which is not separable.)
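Proof. To each integer $n$, there exists $\Lambda_n \subset\subset X$ such that $X = \cup_{x \in \Lambda_n} B(x, 1/n)$. The collection of open balls,
\[
\mathcal{V} := \bigcup_{n \in \mathbb{N}} \bigcup_{x \in \Lambda_n} \{ B(x, 1/n) \}
\]
forms a countable basis for the metric topology on $X$. To check this, suppose that $x_0 \in X$ and $\varepsilon > 0$ are given and choose $n \in \mathbb{N}$ such that $1/n < \varepsilon/2$ and $x \in \Lambda_n$ such that $d(x_0, x) < 1/n$. Then $B(x, 1/n) \subset B(x_0, \varepsilon)$ because for $y \in B(x, 1/n)$,
\[
d(y, x_0) \le d(y, x) + d(x, x_0) < 2/n < \varepsilon .
\]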
Corollary 14.9. The compact subsets of $\mathbb{R}^n$ are the closed and bounded sets.
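Proof. This is a consequence of Theorem 10.2 and Theorem 14.7. Here is another proof. If $K$ is closed and bounded then $K$ is complete (being a closed subset of a complete space) and $K$ is contained in $[-M, M]^n$ for some positive integer $M$. For $\delta > 0$, let
\[
\Lambda_\delta = \delta \mathbb{Z}^n \cap [-M, M]^n := \{ \delta x : x \in \mathbb{Z}^n \text{ and } \delta |x_i| \le M \text{ for } i = 1, 2, \dots, n \}.
\]
We will show, for any given $\varepsilon > 0$, by choosing $\delta > 0$ sufficiently small, that
\[
K \subset [-M, M]^n \subset \bigcup_{x \in \Lambda_\delta} B(x, \varepsilon) \tag{14.1}
\]
which shows that $K$ is totally bounded. Hence by Theorem 14.7, $K$ is compact.
Suppose that $y \in [-M, M]^n$, then there exists $x \in \Lambda_\delta$ such that $|y_i - x_i| \le \delta$ for $i = 1, 2, \dots, n$. Hence
\[
d^2(x, y) = \sum_{i=1}^n (y_i - x_i)^2 \le n \delta^2
\]
which shows that $d(x, y) \le \sqrt{n}\, \delta$. Hence if we choose $\delta < \varepsilon / \sqrt{n}$ we have shown that $d(x, y) < \varepsilon$, i.e. Eq. (14.1) holds.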
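To make the choice of $\delta$ concrete, here is the estimate worked out for one set of values; the numbers $n = 2$, $M = 1$, $\varepsilon = 1/10$ and $\delta = 1/20$ are assumptions chosen only for illustration.

```latex
% With n = 2 and \varepsilon = 1/10, any \delta < \varepsilon/\sqrt{2} \approx 0.0707 works; take \delta = 1/20.
\[
\Lambda_\delta = \tfrac{1}{20}\mathbb{Z}^2 \cap [-1,1]^2
 = \bigl\{ (j/20,\, k/20) : j, k \in \mathbb{Z},\ |j|, |k| \le 20 \bigr\},
\qquad \#(\Lambda_\delta) = 41^2 = 1681 .
\]
% Every y in [-1,1]^2 has a grid point x with |y_i - x_i| \le \delta, hence
\[
d(x,y) \le \sqrt{2}\,\delta = \frac{\sqrt{2}}{20} \approx 0.0707 < \frac{1}{10} = \varepsilon ,
\]
% so the 1681 balls B(x, 1/10), x in \Lambda_\delta, cover [-1,1]^2.
```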
Example 14.10. Let $X = \ell^p(\mathbb{N})$ with $p \in [1, \infty)$ and $\rho \in \ell^p(\mathbb{N})$ such that $\rho(k) \ge 0$ for all $k \in \mathbb{N}$. The set
\[
K := \{ x \in X : |x(k)| \le \rho(k) \text{ for all } k \in \mathbb{N} \}
\]
is compact. To prove this, let $\{x_n\}_{n=1}^\infty \subset K$ be a sequence. By compactness of closed bounded sets in $\mathbb{C}$, for each $k \in \mathbb{N}$ there is a subsequence of $\{x_n(k)\}_{n=1}^\infty \subset \mathbb{C}$ which is convergent. By Cantor's diagonalization trick, we may choose a subsequence $\{y_n\}_{n=1}^\infty$ of $\{x_n\}_{n=1}^\infty$ such that $y(k) := \lim_{n\to\infty} y_n(k)$ exists for all $k \in \mathbb{N}$.$^1$ Since $|y_n(k)| \le \rho(k)$ for all $n$ it follows that $|y(k)| \le \rho(k)$, i.e. $y \in K$. Finally
\[
\lim_{n\to\infty} \|y - y_n\|_p^p = \lim_{n\to\infty} \sum_{k=1}^\infty |y(k) - y_n(k)|^p = \sum_{k=1}^\infty \lim_{n\to\infty} |y(k) - y_n(k)|^p = 0
\]
wherein we have used the Dominated convergence theorem. (Note $|y(k) - y_n(k)|^p \le 2^p \rho^p(k)$ and $\rho^p$ is summable.) Therefore $y_n \to y$ and we are done.
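Alternatively, we can prove $K$ is compact by showing that $K$ is closed and totally bounded. It is simple to show $K$ is closed, for if $\{x_n\}_{n=1}^\infty \subset K$ is a convergent sequence in $X$, $x := \lim_{n\to\infty} x_n$, then
\[
|x(k)| \le \lim_{n\to\infty} |x_n(k)| \le \rho(k) \quad \forall\, k \in \mathbb{N}.
\]
This shows that $x \in K$ and hence $K$ is closed. To see that $K$ is totally bounded, let $\varepsilon > 0$ and choose $N$ such that $\left( \sum_{k=N+1}^\infty |\rho(k)|^p \right)^{1/p} < \varepsilon$. Since $\prod_{k=1}^N C_{\rho(k)}(0) \subset \mathbb{C}^N$ is closed and bounded, it is compact. Therefore there exists a finite subset $\Lambda \subset \prod_{k=1}^N C_{\rho(k)}(0)$ such that
\[
\prod_{k=1}^N C_{\rho(k)}(0) \subset \bigcup_{z \in \Lambda} B^N_z(\varepsilon)
\]
where $B^N_z(\varepsilon)$ is the open ball centered at $z \in \mathbb{C}^N$ relative to the $\ell^p(\{1, 2, 3, \dots, N\})$ norm. For each $z \in \Lambda$, let $\tilde{z} \in X$ be defined by $\tilde{z}(k) = z(k)$ if $k \le N$ and $\tilde{z}(k) = 0$ for $k \ge N + 1$. I now claim that
\[
K \subset \bigcup_{z \in \Lambda} B_{\tilde{z}}(2\varepsilon) \tag{14.2}
\]
which, when verified, shows $K$ is totally bounded. To verify Eq. (14.2), let $x \in K$ and write $x = u + v$ where $u(k) = x(k)$ for $k \le N$ and $u(k) = 0$ for $k > N$. Then by construction $u \in B_{\tilde{z}}(\varepsilon)$ for some $z \in \Lambda$ and
\[
\|v\|_p \le \left( \sum_{k=N+1}^\infty |\rho(k)|^p \right)^{1/p} < \varepsilon .
\]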
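$^1$ The argument is as follows. Let $\{n^1_j\}_{j=1}^\infty$ be a subsequence of $\mathbb{N} = \{n\}_{n=1}^\infty$ such that $\lim_{j\to\infty} x_{n^1_j}(1)$ exists. Now choose a subsequence $\{n^2_j\}_{j=1}^\infty$ of $\{n^1_j\}_{j=1}^\infty$ such that $\lim_{j\to\infty} x_{n^2_j}(2)$ exists and similarly $\{n^3_j\}_{j=1}^\infty$ of $\{n^2_j\}_{j=1}^\infty$ such that $\lim_{j\to\infty} x_{n^3_j}(3)$ exists. Continue on this way inductively to get
\[
\{n\}_{n=1}^\infty \supset \{n^1_j\}_{j=1}^\infty \supset \{n^2_j\}_{j=1}^\infty \supset \{n^3_j\}_{j=1}^\infty \supset \dots
\]
such that $\lim_{j\to\infty} x_{n^k_j}(k)$ exists for all $k \in \mathbb{N}$. Let $m_j := n^j_j$ so that eventually $\{m_j\}_{j=1}^\infty$ is a subsequence of $\{n^k_j\}_{j=1}^\infty$ for all $k$. Therefore, we may take $y_j := x_{m_j}$.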
So we have
\[
\|x - \tilde{z}\|_p = \|u + v - \tilde{z}\|_p \le \|u - \tilde{z}\|_p + \|v\|_p < 2\varepsilon .
\]
Exercise 14.5 (Extreme value theorem). Let $(X, \tau)$ be a compact topological space and $f : X \to \mathbb{R}$ be a continuous function. Show $-\infty < \inf f \le \sup f < \infty$ and there exists $a, b \in X$ such that $f(a) = \inf f$ and $f(b) = \sup f$.$^2$ Hint: use Exercise 14.2 and Corollary 14.9.
Exercise 14.6 (Uniform Continuity). Let $(X, d)$ be a compact metric space, $(Y, \rho)$ be a metric space and $f : X \to Y$ be a continuous function. Show that $f$ is uniformly continuous, i.e. if $\varepsilon > 0$ there exists $\delta > 0$ such that $\rho(f(y), f(x)) < \varepsilon$ if $x, y \in X$ with $d(x, y) < \delta$. Hint: you could follow the argument in the proof of Theorem 10.2.
Definition 14.11. Let $L$ be a vector space. We say that two norms, $|\cdot|$ and $\|\cdot\|$, on $L$ are equivalent if there exist constants $\alpha, \beta \in (0, \infty)$ such that
\[
\|f\| \le \alpha |f| \text{ and } |f| \le \beta \|f\| \text{ for all } f \in L.
\]
Theorem 14.12. Let $L$ be a finite dimensional vector space. Then any two norms $|\cdot|$ and $\|\cdot\|$ on $L$ are equivalent. (This is typically not true for norms on infinite dimensional spaces, see for example Exercise 7.6.)
Proof. Let $\{f_i\}_{i=1}^n$ be a basis for $L$ and define a new norm on $L$ by
\[
\left\| \sum_{i=1}^n a_i f_i \right\|_2 := \sqrt{ \sum_{i=1}^n |a_i|^2 } \quad \text{for } a_i \in \mathbb{F}.
\]
By the triangle inequality for the norm $|\cdot|$, we find
\[
\left| \sum_{i=1}^n a_i f_i \right| \le \sum_{i=1}^n |a_i|\, |f_i| \le \sqrt{ \sum_{i=1}^n |f_i|^2 } \sqrt{ \sum_{i=1}^n |a_i|^2 } \le M \left\| \sum_{i=1}^n a_i f_i \right\|_2
\]
where $M = \sqrt{ \sum_{i=1}^n |f_i|^2 }$. Thus we have
\[
|f| \le M \|f\|_2
\]
for all $f \in L$ and this inequality shows that $|\cdot|$ is continuous relative to $\|\cdot\|_2$. Since the normed space $(L, \|\cdot\|_2)$ is homeomorphic and isomorphic to $\mathbb{F}^n$ with the standard euclidean norm, the closed bounded set, $S := \{ f \in L : \|f\|_2 = 1 \} \subset L$, is a compact subset of $L$ relative to $\|\cdot\|_2$. Therefore by Exercise 14.5 there exists $f_0 \in S$ such that
\[
m = \inf \{ |f| : f \in S \} = |f_0| > 0.
\]
Hence given $0 \ne f \in L$, then $\frac{f}{\|f\|_2} \in S$ so that
\[
m \le \left| \frac{f}{\|f\|_2} \right| = |f| \frac{1}{\|f\|_2}
\]
or equivalently
\[
\|f\|_2 \le \frac{1}{m} |f| .
\]
This shows that $|\cdot|$ and $\|\cdot\|_2$ are equivalent norms. Similarly one shows that $\|\cdot\|$ and $\|\cdot\|_2$ are equivalent and hence so are $|\cdot|$ and $\|\cdot\|$.
$^2$ Here is a proof if $X$ is a metric space. Let $\{x_n\}_{n=1}^\infty \subset X$ be a sequence such that $f(x_n) \uparrow \sup f$. By compactness of $X$ we may assume, by passing to a subsequence if necessary, that $x_n \to b \in X$ as $n \to \infty$. By continuity of $f$, $f(b) = \sup f$.
Corollary 14.13. If $(L, \|\cdot\|)$ is a finite dimensional normed space, then $A \subset L$ is compact iff $A$ is closed and bounded relative to the given norm, $\|\cdot\|$.
Corollary 14.14. Every finite dimensional normed vector space $(L, \|\cdot\|)$ is complete. In particular any finite dimensional subspace of a normed vector space is automatically closed.
Proof. If $\{f_n\}_{n=1}^\infty \subset L$ is a Cauchy sequence, then $\{f_n\}_{n=1}^\infty$ is bounded and hence has a convergent subsequence, $g_k = f_{n_k}$, by Corollary 14.13. It is now routine to show $\lim_{n\to\infty} f_n = f := \lim_{k\to\infty} g_k$.
Theorem 14.15. Suppose that $(X, \|\cdot\|)$ is a normed vector space in which the unit ball, $V := B_0(1)$, is precompact. Then $\dim X < \infty$.
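Proof. Since $\bar{V}$ is compact, we may choose $\Lambda \subset\subset X$ such that
\[
\bar{V} \subset \bigcup_{x \in \Lambda} \left( x + \tfrac{1}{2} V \right) \tag{14.3}
\]
where, for any $\delta > 0$,
\[
\delta V := \{ \delta x : x \in V \} = B_0(\delta).
\]
Let $Y := \operatorname{span}(\Lambda)$, then Eq. (14.3) implies,
\[
V \subset \bar{V} \subset Y + \tfrac{1}{2} V.
\]
Multiplying this equation by $\tfrac{1}{2}$ then shows
\[
\tfrac{1}{2} V \subset \tfrac{1}{2} Y + \tfrac{1}{4} V = Y + \tfrac{1}{4} V
\]
and hence
\[
V \subset Y + \tfrac{1}{2} V \subset Y + Y + \tfrac{1}{4} V = Y + \tfrac{1}{4} V.
\]
Continuing this way inductively then shows that
\[
V \subset Y + \frac{1}{2^n} V \text{ for all } n \in \mathbb{N}. \tag{14.4}
\]
Indeed, if Eq. (14.4) holds, then
\[
V \subset Y + \tfrac{1}{2} V \subset Y + \tfrac{1}{2} \left( Y + \frac{1}{2^n} V \right) = Y + \frac{1}{2^{n+1}} V.
\]
Hence if $x \in V$, there exists $y_n \in Y$ and $z_n \in B_0(2^{-n})$ such that $y_n + z_n \to x$. Since $\lim_{n\to\infty} z_n = 0$, it follows that $x = \lim_{n\to\infty} y_n \in \bar{Y}$. Since $\dim Y \le \#(\Lambda) < \infty$, Corollary 14.14 implies $Y = \bar{Y}$ and so we have shown that $V \subset Y$. Since for any nonzero $x \in X$, $\frac{1}{2\|x\|} x \in V \subset Y$, we have $x \in Y$ for all $x \in X$, i.e. $X = Y$.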
Exercise 14.7. Suppose $(Y, \|\cdot\|_Y)$ is a normed space and $(X, \|\cdot\|_X)$ is a finite dimensional normed space. Show every linear transformation $T : X \to Y$ is necessarily bounded.
14.2 Compact Operators
Definition 14.16. Let $A : X \to Y$ be a bounded operator between two Banach spaces. Then $A$ is compact if $A[B_X(0, 1)]$ is precompact in $Y$ or equivalently for any $\{x_n\}_{n=1}^\infty \subset X$ such that $\|x_n\| \le 1$ for all $n$ the sequence $y_n := A x_n \in Y$ has a convergent subsequence.
Example 14.17. Let $X = \ell^2 = Y$ and $\lambda_n \in \mathbb{C}$ such that $\lim_{n\to\infty} \lambda_n = 0$, then $A : X \to Y$ defined by $(Ax)(n) = \lambda_n x(n)$ is compact.
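Proof. Suppose $\{x_j\}_{j=1}^\infty \subset \ell^2$ such that $\|x_j\|^2 = \sum |x_j(n)|^2 \le 1$ for all $j$. By Cantor's diagonalization argument, there exists $\{j_k\} \subset \{j\}$ such that, for each $n$, $\tilde{x}_k(n) = x_{j_k}(n)$ converges to some $\tilde{x}(n) \in \mathbb{C}$ as $k \to \infty$. By Fatou's Lemma 4.12,
\[
\sum_{n=1}^\infty |\tilde{x}(n)|^2 = \sum_{n=1}^\infty \liminf_{k\to\infty} |\tilde{x}_k(n)|^2 \le \liminf_{k\to\infty} \sum_{n=1}^\infty |\tilde{x}_k(n)|^2 \le 1,
\]
which shows $\tilde{x} \in \ell^2$.
Let $\lambda^*_M = \max_{n \ge M} |\lambda_n|$. Then
\begin{align*}
\|A\tilde{x}_k - A\tilde{x}\|^2 &= \sum_{n=1}^\infty |\lambda_n|^2 |\tilde{x}_k(n) - \tilde{x}(n)|^2 \\
&\le \sum_{n=1}^M |\lambda_n|^2 |\tilde{x}_k(n) - \tilde{x}(n)|^2 + |\lambda^*_M|^2 \sum_{n=M+1}^\infty |\tilde{x}_k(n) - \tilde{x}(n)|^2 \\
&\le \sum_{n=1}^M |\lambda_n|^2 |\tilde{x}_k(n) - \tilde{x}(n)|^2 + |\lambda^*_M|^2 \|\tilde{x}_k - \tilde{x}\|^2 \\
&\le \sum_{n=1}^M |\lambda_n|^2 |\tilde{x}_k(n) - \tilde{x}(n)|^2 + 4 |\lambda^*_M|^2 .
\end{align*}
Passing to the limit in this inequality then implies
\[
\limsup_{k\to\infty} \|A\tilde{x}_k - A\tilde{x}\|^2 \le 4 |\lambda^*_M|^2 \to 0 \text{ as } M \to \infty
\]
and this completes the proof that $A$ is a compact operator.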
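For a concrete feel for the tail estimate in this proof, consider the multiplier sequence $\lambda_n = 1/n$; this specific sequence is an assumption chosen only for illustration.

```latex
% With \lambda_n = 1/n the tail maximum is explicit:
\[
\lambda^*_M = \max_{n \ge M} \frac{1}{n} = \frac{1}{M},
\qquad
\limsup_{k \to \infty} \| A\tilde{x}_k - A\tilde{x} \|^2 \le 4\, |\lambda^*_M|^2 = \frac{4}{M^2}
\xrightarrow[M \to \infty]{} 0 .
\]
% By contrast, for the constant sequence \lambda_n = 1 the operator A is the identity on \ell^2,
% and \lambda^*_M = 1 for every M, so the bound 4 |\lambda^*_M|^2 = 4 gives no information.
```

The contrast is consistent with Theorem 14.15: the identity on $\ell^2$ cannot be compact, since that would make the unit ball of an infinite dimensional space precompact.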
Lemma 14.18. If $X \xrightarrow{A} Y \xrightarrow{B} Z$ are bounded operators such that either $A$ or $B$ is compact then the composition $BA : X \to Z$ is also compact.
Proof. Let $B_X(0, 1)$ be the open unit ball in $X$. If $A$ is compact and $B$ is bounded, then $BA(B_X(0, 1)) \subset B(\overline{A B_X(0, 1)})$ which is compact since the image of compact sets under continuous maps are compact. Hence we conclude that $\overline{BA(B_X(0, 1))}$ is compact, being a closed subset of the compact set $B(\overline{A B_X(0, 1)})$. If $A$ is continuous and $B$ is compact, then $A(B_X(0, 1))$ is a bounded set and so by the compactness of $B$, $BA(B_X(0, 1))$ is a precompact subset of $Z$, i.e. $BA$ is compact.
14.3 Local and $\sigma$-Compactness
Notation 14.19 If $X$ is a topological space and $Y$ is a normed space, let
\[
BC(X, Y) := \{ f \in C(X, Y) : \sup_{x \in X} \|f(x)\|_Y < \infty \}
\]
and
\[
C_c(X, Y) := \{ f \in C(X, Y) : \operatorname{supp}(f) \text{ is compact} \}.
\]
If $Y = \mathbb{R}$ or $\mathbb{C}$ we will simply write $C(X)$, $BC(X)$ and $C_c(X)$ for $C(X, Y)$, $BC(X, Y)$ and $C_c(X, Y)$ respectively.
Remark 14.20. Let $X$ be a topological space and $Y$ be a Banach space. By combining Exercise 14.2 and Theorem 14.7 it follows that $C_c(X, Y) \subset BC(X, Y)$.
Definition 14.21 (Local and $\sigma$-compactness). Let $(X, \tau)$ be a topological space.
1. $(X, \tau)$ is locally compact if for all $x \in X$ there exists an open neighborhood $V \subset X$ of $x$ such that $\bar{V}$ is compact. (Alternatively, in light of Definition 13.29 (also see Definition 6.5), this is equivalent to requiring that to each $x \in X$ there exists a compact neighborhood $N_x$ of $x$.)
2. $(X, \tau)$ is $\sigma$-compact if there exist compact sets $K_n \subset X$ such that $X = \cup_{n=1}^\infty K_n$. (Notice that we may assume, by replacing $K_n$ by $K_1 \cup K_2 \cup \dots \cup K_n$ if necessary, that $K_n \uparrow X$.)
Example 14.22. Any open subset $U \subset \mathbb{R}^n$ is a locally compact and $\sigma$-compact metric space. The proof of local compactness is easy and is left to the reader. To see that $U$ is $\sigma$-compact, for $k \in \mathbb{N}$, let
\[
K_k := \{ x \in U : |x| \le k \text{ and } d_{U^c}(x) \ge 1/k \}.
\]
Then $K_k$ is a closed and bounded subset of $\mathbb{R}^n$ and hence compact. Moreover $K^o_k \uparrow U$ as $k \to \infty$ since$^3$
\[
K^o_k \supset \{ x \in U : |x| < k \text{ and } d_{U^c}(x) > 1/k \} \uparrow U \text{ as } k \to \infty.
\]
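The exhaustion can be written out explicitly in the simplest case; the choice $U = (0,1) \subset \mathbb{R}$ below is an assumption made only to illustrate the construction.

```latex
% U = (0,1), U^c = (-\infty, 0] \cup [1, \infty), so for x in U the distance to U^c is
\[
d_{U^c}(x) = \min(x,\, 1 - x).
\]
% The condition |x| \le k is automatic on U, so for k \ge 2
\[
K_k = \{ x \in (0,1) : \min(x, 1-x) \ge 1/k \} = \bigl[ \tfrac{1}{k},\, 1 - \tfrac{1}{k} \bigr],
\qquad K_1 = \emptyset ,
\]
% and the interiors increase to U:
\[
K_k^o = \bigl( \tfrac{1}{k},\, 1 - \tfrac{1}{k} \bigr) \uparrow (0,1) \quad \text{as } k \to \infty .
\]
```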
Exercise 14.8. If $(X, \tau)$ is locally compact and second countable, then there is a countable basis $\mathcal{B}_0$ for the topology consisting of precompact open sets. Use this to show $(X, \tau)$ is $\sigma$-compact.
Exercise 14.9. Every separable locally compact metric space is $\sigma$-compact.
Exercise 14.10. Every $\sigma$-compact metric space is second countable (or equivalently separable), see Corollary 14.8.
Exercise 14.11. Suppose that $(X, d)$ is a metric space and $U \subset X$ is an open subset.
1. If $X$ is locally compact then $(U, d)$ is locally compact.
2. If $X$ is $\sigma$-compact then $(U, d)$ is $\sigma$-compact. Hint: Mimic Example 14.22, replacing $\{x \in \mathbb{R}^n : |x| \le k\}$ by compact sets $X_k \sqsubset\sqsubset X$ such that $X_k \uparrow X$.
Lemma 14.23. Let $(X, \tau)$ be locally and $\sigma$-compact. Then there exist compact sets $K_n \uparrow X$ such that $K_n \subset K^o_{n+1} \subset K_{n+1}$ for all $n$.
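Proof. Suppose that $C \subset X$ is a compact set. For each $x \in C$ let $V_x \subset_o X$ be an open neighborhood of $x$ such that $\bar{V}_x$ is compact. Then $C \subset \cup_{x \in C} V_x$ so there exists $\Lambda \subset\subset C$ such that
\[
C \subset \bigcup_{x \in \Lambda} V_x \subset \bigcup_{x \in \Lambda} \bar{V}_x =: K.
\]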
3
In fact this is an equality, but we will not need this here.
Then $K$ is a compact set, being a finite union of compact subsets of $X$, and $C \subset \cup_{x \in \Lambda} V_x \subset K^o$. Now let $C_n \subset X$ be compact sets such that $C_n \uparrow X$ as $n \to \infty$. Let $K_1 = C_1$ and then choose a compact set $K_2$ such that $C_2 \subset K^o_2$. Similarly, choose a compact set $K_3$ such that $K_2 \cup C_3 \subset K^o_3$ and continue inductively to find compact sets $K_n$ such that $K_n \cup C_{n+1} \subset K^o_{n+1}$ for all $n$. Then $\{K_n\}_{n=1}^\infty$ is the desired sequence.
Remark 14.24. Lemma 14.23 may also be stated as saying there exist precompact open sets $\{G_n\}_{n=1}^\infty$ such that $G_n \subset \bar{G}_n \subset G_{n+1}$ for all $n$ and $G_n \uparrow X$ as $n \to \infty$. Indeed if $\{G_n\}_{n=1}^\infty$ are as above, let $K_n := \bar{G}_n$ and if $\{K_n\}_{n=1}^\infty$ are as in Lemma 14.23, let $G_n := K^o_n$.
Proposition 14.25. Suppose $X$ is a locally compact metric space and $U \subset_o X$ and $K \sqsubset\sqsubset U$. Then there exists $V \subset_o X$ such that $K \subset V \subset \bar{V} \subset U \subset X$ and $\bar{V}$ is compact.
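Proof. (This is done more generally in Proposition 15.7 below.) By local compactness of $X$, for each $x \in K$ there exists $\varepsilon_x > 0$ such that $\overline{B_x(\varepsilon_x)}$ is compact and by shrinking $\varepsilon_x$ if necessary we may assume,
\[
\overline{B_x(\varepsilon_x)} \subset C_x(\varepsilon_x) \subset B_x(2\varepsilon_x) \subset U
\]
for each $x \in K$. By compactness of $K$, there exists $\Lambda \subset\subset K$ such that $K \subset \cup_{x \in \Lambda} B_x(\varepsilon_x) =: V$. Notice that $\bar{V} \subset \cup_{x \in \Lambda} \overline{B_x(\varepsilon_x)} \subset U$ and $\bar{V}$ is a closed subset of the compact set $\cup_{x \in \Lambda} \overline{B_x(\varepsilon_x)}$ and hence compact as well.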
Definition 14.26. Let $U$ be an open subset of a topological space $(X, \tau)$. We will write $f \prec U$ to mean a function $f \in C_c(X, [0, 1])$ such that $\operatorname{supp}(f) := \overline{\{f \ne 0\}} \subset U$.
Lemma 14.27 (Urysohn's Lemma for Metric Spaces). Let $X$ be a locally compact metric space and $K \sqsubset\sqsubset U \subset_o X$. Then there exists $f \prec U$ such that $f = 1$ on $K$. In particular, if $K$ is compact and $C$ is closed in $X$ such that $K \cap C = \emptyset$, there exists $f \in C_c(X, [0, 1])$ such that $f = 1$ on $K$ and $f = 0$ on $C$.
Proof. Let $V$ be as in Proposition 14.25 and then use Lemma 6.15 to find a function $f \in C(X, [0, 1])$ such that $f = 1$ on $K$ and $f = 0$ on $V^c$. Then $\operatorname{supp}(f) \subset \bar{V} \subset U$ and hence $f \prec U$.
14.4 Function Space Compactness Criteria
In this section, let (X, ) be a topological space.
Definition 14.28. Let $\mathcal{F} \subset C(X)$.
1. $\mathcal{F}$ is equicontinuous at $x \in X$ iff for all $\varepsilon > 0$ there exists $U \in \tau_x$ such that $|f(y) - f(x)| < \varepsilon$ for all $y \in U$ and $f \in \mathcal{F}$.
2. $\mathcal{F}$ is equicontinuous if $\mathcal{F}$ is equicontinuous at all points $x \in X$.
3. $\mathcal{F}$ is pointwise bounded if $\sup \{ |f(x)| : f \in \mathcal{F} \} < \infty$ for all $x \in X$.
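Two standard families on $X = [0,1]$ illustrate the definition; both examples, and the Lipschitz constant $L$, are assumptions introduced here for illustration rather than taken from the text.

```latex
% (i) A uniformly Lipschitz family is equicontinuous: if |f(y) - f(x)| \le L |y - x|
%     for all f in the family, then U = B_x(\varepsilon / L) works for every f at once.
\[
|y - x| < \frac{\varepsilon}{L} \;\Longrightarrow\; |f(y) - f(x)| \le L\,|y - x| < \varepsilon .
\]
% (ii) The family f_n(x) = x^n, n \in \mathbb{N}, is pointwise bounded (by 1) but is not
%      equicontinuous at x = 1: for any fixed y in [0,1),
\[
\sup_{n} |f_n(1) - f_n(y)| = \sup_n \,(1 - y^n) = 1 ,
\]
%      so no neighborhood of 1 works for \varepsilon = 1/2.
```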
Theorem 14.29 (Ascoli-Arzela Theorem). Let $(X, \tau)$ be a compact topological space and $\mathcal{F} \subset C(X)$. Then $\mathcal{F}$ is precompact in $C(X)$ iff $\mathcal{F}$ is equicontinuous and pointwise bounded.
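Proof. ($\Leftarrow$) Since $C(X) \subset \ell^\infty(X)$ is a complete metric space, we must show $\mathcal{F}$ is totally bounded. Let $\varepsilon > 0$ be given. By equicontinuity, for all $x \in X$, there exists $V_x \in \tau_x$ such that $|f(y) - f(x)| < \varepsilon/2$ if $y \in V_x$ and $f \in \mathcal{F}$. Since $X$ is compact we may choose $\Lambda \subset\subset X$ such that $X = \cup_{x \in \Lambda} V_x$. We have now decomposed $X$ into "blocks" $\{V_x\}_{x \in \Lambda}$ such that each $f \in \mathcal{F}$ is constant to within $\varepsilon$ on $V_x$. Since $\sup \{ |f(x)| : x \in \Lambda \text{ and } f \in \mathcal{F} \} < \infty$, it is now evident that
\[
M = \sup \{ |f(x)| : x \in X \text{ and } f \in \mathcal{F} \} \le \sup \{ |f(x)| : x \in \Lambda \text{ and } f \in \mathcal{F} \} + \varepsilon < \infty.
\]
Let $\mathbb{D} := \{ k\varepsilon/2 : k \in \mathbb{Z} \} \cap [-M, M]$. If $f \in \mathcal{F}$ and $\phi \in \mathbb{D}^\Lambda$ (i.e. $\phi : \Lambda \to \mathbb{D}$ is a function) is chosen so that $|\phi(x) - f(x)| \le \varepsilon/2$ for all $x \in \Lambda$, then
\[
|f(y) - \phi(x)| \le |f(y) - f(x)| + |f(x) - \phi(x)| < \varepsilon \quad \forall\, x \in \Lambda \text{ and } y \in V_x.
\]
From this it follows that $\mathcal{F} = \bigcup \{ \mathcal{F}_\phi : \phi \in \mathbb{D}^\Lambda \}$ where, for $\phi \in \mathbb{D}^\Lambda$,
\[
\mathcal{F}_\phi := \{ f \in \mathcal{F} : |f(y) - \phi(x)| < \varepsilon \text{ for } y \in V_x \text{ and } x \in \Lambda \}.
\]
Let $\Gamma := \{ \phi \in \mathbb{D}^\Lambda : \mathcal{F}_\phi \ne \emptyset \}$ and for each $\phi \in \Gamma$ choose $f_\phi \in \mathcal{F}_\phi$. For $f \in \mathcal{F}_\phi$, $x \in \Lambda$ and $y \in V_x$ we have
\[
|f(y) - f_\phi(y)| \le |f(y) - \phi(x)| + |\phi(x) - f_\phi(y)| < 2\varepsilon.
\]
So $\|f - f_\phi\|_\infty < 2\varepsilon$ for all $f \in \mathcal{F}_\phi$ showing that $\mathcal{F}_\phi \subset B_{f_\phi}(2\varepsilon)$. Therefore,
\[
\mathcal{F} = \bigcup_{\phi \in \Gamma} \mathcal{F}_\phi \subset \bigcup_{\phi \in \Gamma} B_{f_\phi}(2\varepsilon)
\]
and because $\varepsilon > 0$ was arbitrary we have shown that $\mathcal{F}$ is totally bounded.
($\Rightarrow$) (*The rest of this proof may safely be skipped.) Since $\|\cdot\|_\infty : C(X) \to [0, \infty)$ is a continuous function on $C(X)$ it is bounded on any compact subset $\mathcal{F} \subset C(X)$. This shows that $\sup \{ \|f\|_\infty : f \in \mathcal{F} \} < \infty$ which clearly implies that $\mathcal{F}$ is pointwise bounded.$^4$ Suppose $\mathcal{F}$ were not equicontinuous at some point $x \in X$, that is to say there exists $\varepsilon > 0$ such that for all $V \in \tau_x$,
\[
\sup_{y \in V} \sup_{f \in \mathcal{F}} |f(y) - f(x)| > \varepsilon.^5
\]
Equivalently said, to each $V \in \tau_x$ we may choose
\[
f_V \in \mathcal{F} \text{ and } x_V \in V \ \ni\ |f_V(x) - f_V(x_V)| \ge \varepsilon. \tag{14.5}
\]
Set $\mathcal{C}_V = \overline{\{ f_W : W \in \tau_x \text{ and } W \subset V \}}^{\|\cdot\|_\infty} \subset \mathcal{F}$ and notice for any $\mathcal{V} \subset\subset \tau_x$ that
\[
\bigcap_{V \in \mathcal{V}} \mathcal{C}_V \supseteq \mathcal{C}_{\cap \mathcal{V}} \ne \emptyset,
\]
so that $\{\mathcal{C}_V\}_{V \in \tau_x}$, a collection of closed subsets of $\mathcal{F}$, has the finite intersection property.$^6$ Since $\mathcal{F}$ is compact, it follows that there exists some
\[
f \in \bigcap_{V \in \tau_x} \mathcal{C}_V \ne \emptyset.
\]
Since $f$ is continuous, there exists $V \in \tau_x$ such that $|f(x) - f(y)| < \varepsilon/3$ for all $y \in V$. Because $f \in \mathcal{C}_V$, there exists $W \subset V$ such that $\|f - f_W\| < \varepsilon/3$. We now arrive at a contradiction;
\begin{align*}
\varepsilon \le |f_W(x) - f_W(x_W)| &\le |f_W(x) - f(x)| + |f(x) - f(x_W)| + |f(x_W) - f_W(x_W)| \\
&< \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon.
\end{align*}
$^4$ One could also prove that $\mathcal{F}$ is pointwise bounded by considering the continuous evaluation maps $e_x : C(X) \to \mathbb{R}$ given by $e_x(f) = f(x)$ for all $x \in X$.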
Exercise 14.12. Give an alternative proof of the implication, ($\Leftarrow$), in Theorem 14.29 by showing every subsequence $\{f_n : n \in \mathbb{N}\} \subset \mathcal{F}$ has a convergent sub-sequence.
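$^5$ If $X$ is first countable we could finish the proof with the following argument. Let $\{V_n\}_{n=1}^\infty$ be a neighborhood base at $x$ such that $V_1 \supset V_2 \supset V_3 \supset \dots$. By the assumption that $\mathcal{F}$ is not equicontinuous at $x$, there exist $f_n \in \mathcal{F}$ and $x_n \in V_n$ such that $|f_n(x) - f_n(x_n)| \ge \varepsilon$ for all $n$. Since $\mathcal{F}$ is a compact metric space, by passing to a subsequence if necessary we may assume that $f_n$ converges uniformly to some $f \in \mathcal{F}$. Because $x_n \to x$ as $n \to \infty$ we learn that
\begin{align*}
\varepsilon \le |f_n(x) - f_n(x_n)| &\le |f_n(x) - f(x)| + |f(x) - f(x_n)| + |f(x_n) - f_n(x_n)| \\
&\le 2 \|f_n - f\| + |f(x) - f(x_n)| \to 0 \text{ as } n \to \infty,
\end{align*}
which is a contradiction.
$^6$ If we are willing to use Nets described in Appendix C below we could finish the proof as follows. Since $\mathcal{F}$ is compact, the net $\{f_V\}_{V \in \tau_x} \subset \mathcal{F}$ has a cluster point $f \in \mathcal{F} \subset C(X)$. Choose a subnet $\{g_\alpha\}_{\alpha \in A}$ of $\{f_V\}_{V \in \tau_x}$ such that $g_\alpha \to f$ uniformly. Then, since $x_V \to x$ implies $x_{V_\alpha} \to x$, we may conclude from Eq. (14.5) that
\[
\varepsilon \le |g_\alpha(x) - g_\alpha(x_{V_\alpha})| \to |f(x) - f(x)| = 0,
\]
which is a contradiction.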
Exercise 14.13. Suppose $k \in C([0, 1]^2, \mathbb{R})$ and for $f \in C([0, 1], \mathbb{R})$, let
\[
Kf(x) := \int_0^1 k(x, y) f(y)\, dy \text{ for all } x \in [0, 1].
\]
Show $K$ is a compact operator on $(C([0, 1], \mathbb{R}), \|\cdot\|_\infty)$.
The following result is a corollary of Lemma 14.23 and Theorem 14.29.
Corollary 14.30 (Locally Compact Ascoli-Arzela Theorem). Let $(X, \tau)$ be a locally compact and $\sigma$-compact topological space and $\{f_m\} \subset C(X)$ be a pointwise bounded sequence of functions such that $\{f_m|_K\}$ is equicontinuous for any compact subset $K \subset X$. Then there exists a subsequence $\{m_n\} \subset \{m\}$ such that $\{g_n := f_{m_n}\}_{n=1}^\infty \subset C(X)$ is a sequence which is uniformly convergent on compact subsets of $X$.
Proof. Let $\{K_n\}_{n=1}^\infty$ be the compact subsets of $X$ constructed in Lemma 14.23. We may now apply Theorem 14.29 repeatedly to find a nested family of subsequences
\[
\{f_m\} \supset \{g^1_m\} \supset \{g^2_m\} \supset \{g^3_m\} \supset \dots
\]
such that the sequence $\{g^n_m\}_{m=1}^\infty \subset C(X)$ is uniformly convergent on $K_n$. Using Cantor's trick, define the subsequence $\{h_n\}$ of $\{f_m\}$ by $h_n := g^n_n$. Then $\{h_n\}$ is uniformly convergent on $K_l$ for each $l \in \mathbb{N}$. Now if $K \subset X$ is an arbitrary compact set, there exists $l < \infty$ such that $K \subset K^o_l \subset K_l$ and therefore $\{h_n\}$ is uniformly convergent on $K$ as well.
Proposition 14.31. Let $\Omega \subset_o \mathbb{R}^d$ such that $\bar{\Omega}$ is compact and $0 \le \alpha < \beta \le 1$. Then the inclusion map $i : C^\beta(\Omega) \to C^\alpha(\Omega)$ is a compact operator. See Chapter 9 and Lemma 9.9 for the notation being used here.
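Let $\{u_n\}_{n=1}^\infty \subset C^\beta(\bar{\Omega})$ such that $\|u_n\|_{C^\beta} \le 1$, i.e. $\|u_n\|_\infty \le 1$ and
\[
|u_n(x) - u_n(y)| \le |x - y|^\beta \text{ for all } x, y \in \bar{\Omega}.
\]
By the Arzela-Ascoli Theorem 14.29, there exists a subsequence $\{\tilde{u}_n\}_{n=1}^\infty$ of $\{u_n\}_{n=1}^\infty$ and $u \in C^0(\bar{\Omega})$ such that $\tilde{u}_n \to u$ in $C^0$. Since
\[
|u(x) - u(y)| = \lim_{n\to\infty} |\tilde{u}_n(x) - \tilde{u}_n(y)| \le |x - y|^\beta,
\]
$u \in C^\beta$ as well. Define $g_n := u - \tilde{u}_n \in C^\beta$, then
\[
[g_n]_\beta + \|g_n\|_{C^0} = \|g_n\|_{C^\beta} \le 2
\]
and $g_n \to 0$ in $C^0$. To finish the proof we must show that $g_n \to 0$ in $C^\alpha$. Given $\delta > 0$,
\[
[g_n]_\alpha = \sup_{x \ne y} \frac{|g_n(x) - g_n(y)|}{|x - y|^\alpha} \le A_n + B_n
\]
where
\begin{align*}
A_n &= \sup \left\{ \frac{|g_n(x) - g_n(y)|}{|x - y|^\alpha} : x \ne y \text{ and } |x - y| \le \delta \right\} \\
&= \sup \left\{ \frac{|g_n(x) - g_n(y)|}{|x - y|^\beta} \cdot |x - y|^{\beta - \alpha} : x \ne y \text{ and } |x - y| \le \delta \right\} \\
&\le \delta^{\beta - \alpha} \cdot [g_n]_\beta \le 2 \delta^{\beta - \alpha}
\end{align*}
and
\[
B_n = \sup \left\{ \frac{|g_n(x) - g_n(y)|}{|x - y|^\alpha} : |x - y| > \delta \right\} \le 2 \delta^{-\alpha} \|g_n\|_{C^0} \to 0 \text{ as } n \to \infty.
\]
Therefore,
\[
\limsup_{n\to\infty} [g_n]_\alpha \le \limsup_{n\to\infty} A_n + \limsup_{n\to\infty} B_n \le 2 \delta^{\beta - \alpha} + 0 \to 0 \text{ as } \delta \downarrow 0.
\]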
This proposition generalizes to the following theorem which the reader is asked
to prove in Exercise 14.22 below.
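Theorem 14.32. Let $\Omega$ be a precompact open subset of $\mathbb{R}^d$, $\alpha, \beta \in [0, 1]$ and $k, j \in \mathbb{N}_0$. If $j + \beta > k + \alpha$, then $C^{j,\beta}(\bar{\Omega})$ is compactly contained in $C^{k,\alpha}(\bar{\Omega})$.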
14.5 Tychonoff's Theorem
The goal of this section is to show that arbitrary products of compact spaces are still compact. Before going to the general case of an arbitrary number of factors let us start with only two factors.
Proposition 14.33. Suppose that $X$ and $Y$ are non-empty compact topological spaces, then $X \times Y$ is compact in the product topology.
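Proof. Let $\mathcal{U}$ be an open cover of $X \times Y$. Then for each $(x, y) \in X \times Y$ there exists $U \in \mathcal{U}$ such that $(x, y) \in U$. By definition of the product topology, there also exist $V_x \in \tau^X_x$ and $W_y \in \tau^Y_y$ such that $V_x \times W_y \subset U$. Therefore $\mathcal{V} := \{ V_x \times W_y : (x, y) \in X \times Y \}$ is also an open cover of $X \times Y$. We will now show that $\mathcal{V}$ has a finite sub-cover, say $\mathcal{V}_0 \subset\subset \mathcal{V}$. Assuming this is proved for the moment, this implies that $\mathcal{U}$ also has a finite subcover because each $V \in \mathcal{V}_0$ is contained in some $U_V \in \mathcal{U}$. So to complete the proof it suffices to show every cover $\mathcal{V}$ of the form $\mathcal{V} = \{ V_\alpha \times W_\alpha : \alpha \in A \}$ where $V_\alpha \subset_o X$ and $W_\alpha \subset_o Y$ has a finite subcover. Given $x \in X$, let $f_x : Y \to X \times Y$ be the map $f_x(y) = (x, y)$ and notice that $f_x$ is continuous since $\pi_X \circ f_x(y) = x$ and $\pi_Y \circ f_x(y) = y$ are continuous maps. From this we conclude that $\{x\} \times Y = f_x(Y)$ is compact. Similarly, it follows that $X \times \{y\}$ is compact for all $y \in Y$. Since $\mathcal{V}$ is a cover of $\{x\} \times Y$, there exists $\Gamma_x \subset\subset A$ such that $\{x\} \times Y \subset \cup_{\alpha \in \Gamma_x} (V_\alpha \times W_\alpha)$; without loss of generality we may assume that $\Gamma_x$ is chosen so that $x \in V_\alpha$ for all $\alpha \in \Gamma_x$. Let $U_x := \cap_{\alpha \in \Gamma_x} V_\alpha \subset_o X$, and notice that
\[
\bigcup_{\alpha \in \Gamma_x} (V_\alpha \times W_\alpha) \supset \bigcup_{\alpha \in \Gamma_x} (U_x \times W_\alpha) = U_x \times Y, \tag{14.6}
\]
see Figure 14.4 below. Since $\{U_x\}_{x \in X}$ is now an open cover of $X$ and $X$ is compact, there exists $\Lambda \subset\subset X$ such that $X = \cup_{x \in \Lambda} U_x$. The finite subcollection, $\mathcal{V}_0 := \{ V_\alpha \times W_\alpha : \alpha \in \cup_{x \in \Lambda} \Gamma_x \}$, of $\mathcal{V}$ is the desired finite subcover. Indeed using Eq. (14.6),
\[
\cup \mathcal{V}_0 = \bigcup_{x \in \Lambda} \bigcup_{\alpha \in \Gamma_x} (V_\alpha \times W_\alpha) \supset \bigcup_{x \in \Lambda} (U_x \times Y) = X \times Y.
\]
Fig. 14.4. Constructing the open set $U_x$.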
The results of Exercises 14.23 and 13.28 prove Tychonoff's Theorem for a countable product of compact metric spaces. We now state the general version of the theorem.
Theorem 14.34 (Tychonoff's Theorem). Let $\{X_\alpha\}_{\alpha \in A}$ be a collection of non-empty compact spaces. Then $X := X_A = \prod_{\alpha \in A} X_\alpha$ is compact in the product space topology. (Compare with Exercise 14.23 which covers the special case of a countable product of compact metric spaces.)
Proof. (The proof is taken from Loomis [14] which followed Bourbaki. Remark 14.35 below should help the reader understand the strategy of the proof to follow.) The proof requires a form of "induction" known as Zorn's lemma which is equivalent to the axiom of choice, see Theorem B.7 of Appendix B below.
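For $\alpha \in A$ let $\pi_\alpha$ denote the projection map from $X$ to $X_\alpha$. Suppose that $\mathcal{F}$ is a family of closed subsets of $X$ which has the finite intersection property, see Definition 14.3. By Proposition 14.4 the proof will be complete if we can show $\cap \mathcal{F} \ne \emptyset$.
The first step is to apply Zorn's lemma to construct a maximal collection, $\mathcal{F}_0$, of (not necessarily closed) subsets of $X$ with the finite intersection property such that $\mathcal{F} \subset \mathcal{F}_0$. To do this, let $\Gamma := \{ \mathcal{G} \subset 2^X : \mathcal{F} \subset \mathcal{G} \text{ and } \mathcal{G} \text{ has the finite intersection property} \}$ equipped with the partial order, $\mathcal{G}_1 < \mathcal{G}_2$ if $\mathcal{G}_1 \subset \mathcal{G}_2$. If $\Phi$ is a linearly ordered subset of $\Gamma$, then $\mathcal{G} := \cup \Phi$ is an upper bound for $\Phi$ which still has the finite intersection property as the reader should check. So by Zorn's lemma, $\Gamma$ has a maximal element $\mathcal{F}_0$. The maximal $\mathcal{F}_0$ has the following properties.
1. $\mathcal{F}_0$ is closed under finite intersections. Indeed, if we let $(\mathcal{F}_0)_f$ denote the collection of all finite intersections of elements from $\mathcal{F}_0$, then $(\mathcal{F}_0)_f$ has the finite intersection property and contains $\mathcal{F}_0$. Since $\mathcal{F}_0$ is maximal, this implies $(\mathcal{F}_0)_f = \mathcal{F}_0$.
2. If $B \subset X$ and $B \cap F \ne \emptyset$ for all $F \in \mathcal{F}_0$ then $B \in \mathcal{F}_0$. For if not $\mathcal{F}_0 \cup \{B\}$ would still satisfy the finite intersection property and would properly contain $\mathcal{F}_0$ and this would violate the maximality of $\mathcal{F}_0$.
3. For each $\alpha \in A$,
\[
\pi_\alpha(\mathcal{F}_0) := \{ \pi_\alpha(F) \subset X_\alpha : F \in \mathcal{F}_0 \}
\]
has the finite intersection property. Indeed, if $\{F_i\}_{i=1}^n \subset \mathcal{F}_0$, then $\cap_{i=1}^n \pi_\alpha(F_i) \supset \pi_\alpha(\cap_{i=1}^n F_i) \ne \emptyset$.
Since $X_\alpha$ is compact, property 3. above along with Proposition 14.4 implies $\cap_{F \in \mathcal{F}_0} \overline{\pi_\alpha(F)} \ne \emptyset$. Since this is true for each $\alpha \in A$, using the axiom of choice, there exists $p \in X$ such that $p_\alpha = \pi_\alpha(p) \in \cap_{F \in \mathcal{F}_0} \overline{\pi_\alpha(F)}$ for all $\alpha \in A$. The proof will be completed by showing $\cap \mathcal{F} \ne \emptyset$ by showing $p \in \cap \mathcal{F}$.
Since $C := \cap \{ \bar{F} : F \in \mathcal{F}_0 \} \subset \cap \mathcal{F}$, it suffices to show $p \in C$. Let $U$ be an open neighborhood of $p$ in $X$. By the definition of the product topology (or item 2. of Proposition 13.25), there exists $\Lambda \subset\subset A$ and open sets $U_\alpha \subset X_\alpha$ for all $\alpha \in \Lambda$ such that $p \in \cap_{\alpha \in \Lambda} \pi_\alpha^{-1}(U_\alpha) \subset U$. Since $p_\alpha \in \cap_{F \in \mathcal{F}_0} \overline{\pi_\alpha(F)}$ and $p_\alpha \in U_\alpha$ for all $\alpha \in \Lambda$, it follows that $U_\alpha \cap \pi_\alpha(F) \ne \emptyset$ for all $F \in \mathcal{F}_0$ and all $\alpha \in \Lambda$. This then implies $\pi_\alpha^{-1}(U_\alpha) \cap F \ne \emptyset$ for all $F \in \mathcal{F}_0$ and all $\alpha \in \Lambda$. By property 2.$^7$ above we conclude that $\pi_\alpha^{-1}(U_\alpha) \in \mathcal{F}_0$ for all $\alpha \in \Lambda$ and then by property 1. that $\cap_{\alpha \in \Lambda} \pi_\alpha^{-1}(U_\alpha) \in \mathcal{F}_0$. In particular
\[
\emptyset \ne F \cap \left( \bigcap_{\alpha \in \Lambda} \pi_\alpha^{-1}(U_\alpha) \right) \subset F \cap U \text{ for all } F \in \mathcal{F}_0
\]
which shows $p \in \bar{F}$ for each $F \in \mathcal{F}_0$, i.e. $p \in C$.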
$^7$ Here is where we use that $\mathcal{F}_0$ is maximal among the collection of all, not just closed, sets having the finite intersection property and containing $\mathcal{F}$.
Remark 14.35. Consider the following simple example where $X = [-1, 1] \times [-1, 1]$ and $\mathcal{F} = \{F_1, F_2\}$ as in Figure 14.5. Notice that $\pi_i(F_1) \cap \pi_i(F_2) = [-1, 1]$ for each $i$ and so gives no help in trying to find the $i$-th coordinate of one of the two points in $F_1 \cap F_2$. This is why it is necessary to introduce the collection $\mathcal{F}_0$ in the proof of Theorem 14.34. In this case one might take $\mathcal{F}_0$ to be the collection of all subsets $F \subset X$ such that $p \in F$. We then have $\cap_{F \in \mathcal{F}_0} \pi_i(F) = \{p_i\}$, so the $i$-th coordinate of $p$ may now be determined by observing the sets, $\{ \pi_i(F) : F \in \mathcal{F}_0 \}$.
Fig. 14.5. Here $\mathcal{F} = \{F_1, F_2\}$ where $F_1$ and $F_2$ are the two parabolic arcs and $F_1 \cap F_2 = \{p, q\}$.
14.6 Banach-Alaoglu's Theorem
14.6.1 Weak and Strong Topologies
Definition 14.36. Let $X$ and $Y$ be normed vector spaces and $L(X, Y)$ the normed space of bounded linear transformations from $X$ to $Y$.
1. The weak topology on $X$ is the topology generated by $X^*$, i.e. the smallest topology on $X$ such that every element $f \in X^*$ is continuous.
2. The weak-$*$ topology on $X^*$ is the topology generated by $X$, i.e. the smallest topology on $X^*$ such that the maps $f \in X^* \to f(x) \in \mathbb{C}$ are continuous for all $x \in X$.
3. The strong operator topology on $L(X, Y)$ is the smallest topology such that $T \in L(X, Y) \to Tx \in Y$ is continuous for all $x \in X$.
4. The weak operator topology on $L(X, Y)$ is the smallest topology such that $T \in L(X, Y) \to f(Tx) \in \mathbb{C}$ is continuous for all $x \in X$ and $f \in Y^*$.
Let us be a little more precise about the topologies described in the above definitions.
1. The weak topology has a neighborhood base at $x_0 \in X$ consisting of sets of the form
$$N = \bigcap_{i=1}^n \{x \in X : |f_i(x) - f_i(x_0)| < \varepsilon\}$$
where $f_i \in X^*$ and $\varepsilon > 0$.
2. The weak-$*$ topology on $X^*$ has a neighborhood base at $f \in X^*$ consisting of sets of the form
$$N := \bigcap_{i=1}^n \{g \in X^* : |f(x_i) - g(x_i)| < \varepsilon\}$$
where $x_i \in X$ and $\varepsilon > 0$.
3. The strong operator topology on $L(X, Y)$ has a neighborhood base at $T \in L(X, Y)$ consisting of sets of the form
$$N := \bigcap_{i=1}^n \{S \in L(X, Y) : \|Sx_i - Tx_i\| < \varepsilon\}$$
where $x_i \in X$ and $\varepsilon > 0$.
4. The weak operator topology on $L(X, Y)$ has a neighborhood base at $T \in L(X, Y)$ consisting of sets of the form
$$N := \bigcap_{i=1}^n \{S \in L(X, Y) : |f_i(Sx_i - Tx_i)| < \varepsilon\}$$
where $x_i \in X$, $f_i \in Y^*$ and $\varepsilon > 0$.
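Theorem 14.37 (Alaoglu's Theorem). If $X$ is a normed space the unit ball in $X^*$ is weak-$*$ compact. (Also see Theorem 14.44 and Proposition 26.16.)
Proof. For all $x \in X$ let $D_x = \{z \in \mathbb{C} : |z| \le \|x\|\}$. Then $D_x \subset \mathbb{C}$ is a compact set and so by Tychonoff's Theorem $\Omega := \prod_{x \in X} D_x$ is compact in the product topology. If $f \in C^* := \{f \in X^* : \|f\| \le 1\}$, then $|f(x)| \le \|f\| \|x\| \le \|x\|$, which implies that $f(x) \in D_x$ for all $x \in X$, i.e. $C^* \subset \Omega$. The topology on $C^*$ inherited from the weak-$*$ topology on $X^*$ is the same as the relative topology coming from the product topology on $\Omega$. So to finish the proof it suffices to show $C^*$ is a closed subset of the compact space $\Omega$. To prove this let $\pi_x(f) = f(x)$ be the projection maps. Then
$$\begin{aligned}
C^* &= \{f \in \Omega : f \text{ is linear}\} \\
&= \{f \in \Omega : f(x + cy) - f(x) - cf(y) = 0 \text{ for all } x, y \in X \text{ and } c \in \mathbb{C}\} \\
&= \bigcap_{x, y \in X} \bigcap_{c \in \mathbb{C}} \{f \in \Omega : f(x + cy) - f(x) - cf(y) = 0\} \\
&= \bigcap_{x, y \in X} \bigcap_{c \in \mathbb{C}} (\pi_{x+cy} - \pi_x - c\pi_y)^{-1}(\{0\})
\end{aligned}$$
which is closed because $(\pi_{x+cy} - \pi_x - c\pi_y) : \Omega \to \mathbb{C}$ is continuous.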
Theorem 14.38 (Alaoglu's Theorem for separable spaces). Suppose that $X$ is a separable Banach space, $C^* := \{f \in X^* : \|f\| \le 1\}$ is the closed unit ball in $X^*$ and $\{x_n\}_{n=1}^\infty$ is a countable dense subset of $C := \{x \in X : \|x\| \le 1\}$. Then
$$\rho(f, g) := \sum_{n=1}^\infty \frac{1}{2^n} |f(x_n) - g(x_n)| \tag{14.7}$$
defines a metric on $C^*$ which is compatible with the weak-$*$ topology on $C^*$, $\tau_{C^*} := (\tau_{w^*})_{C^*} = \{V \cap C^* : V \in \tau_{w^*}\}$. Moreover $(C^*, \rho)$ is a compact metric space.
Proof. The routine check that $\rho$ is a metric is left to the reader. Let $\tau_\rho$ be the topology on $C^*$ induced by $\rho$. For any $g \in X^*$ and $n \in \mathbb{N}$, the map $f \in X^* \to (f(x_n) - g(x_n)) \in \mathbb{C}$ is $\tau_{w^*}$-continuous and since the sum in Eq. (14.7) is uniformly convergent for $f \in C^*$, it follows that $f \to \rho(f, g)$ is $\tau_{C^*}$-continuous. This implies the open balls relative to $\rho$ are contained in $\tau_{C^*}$ and therefore $\tau_\rho \subset \tau_{C^*}$. We now wish to prove $\tau_{C^*} \subset \tau_\rho$. Since $\tau_{C^*}$ is the topology generated by $\{\hat{x}|_{C^*} : x \in C\}$, where $\hat{x}(f) := f(x)$, it suffices to show $\hat{x}$ is $\tau_\rho$-continuous for all $x \in C$. But given $x \in C$ there exists a subsequence $y_k := x_{n_k}$ of $\{x_n\}_{n=1}^\infty$ such that $x = \lim_{k \to \infty} y_k$. Since
$$\sup_{f \in C^*} |\hat{x}(f) - \hat{y}_k(f)| = \sup_{f \in C^*} |f(x - y_k)| \le \|x - y_k\| \to 0 \text{ as } k \to \infty,$$
$\hat{y}_k \to \hat{x}$ uniformly on $C^*$ and using that $\hat{y}_k$ is $\tau_\rho$-continuous for all $k$ (as is easily checked) we learn $\hat{x}$ is also $\tau_\rho$-continuous. Hence $\tau_{C^*} = \tau(\hat{x}|_{C^*} : x \in X) \subset \tau_\rho$.
The compactness assertion follows from Theorem 14.37. The compactness assertion may also be verified directly using: 1) sequential compactness is equivalent to compactness for metric spaces and 2) Cantor's diagonalization argument as in the proof of Theorem 14.44. (See Proposition 26.16 below.)
14.7 Weak Convergence in Hilbert Spaces
Suppose $H$ is an infinite dimensional Hilbert space and $\{x_n\}_{n=1}^\infty$ is an orthonormal subset of $H$. Then, by Eq. (8.1), $\|x_n - x_m\|^2 = 2$ for all $m \neq n$ and in particular, $\{x_n\}_{n=1}^\infty$ has no convergent subsequences. From this we conclude that $C := \{x \in H : \|x\| \le 1\}$, the closed unit ball in $H$, is not compact. To overcome this problem it is sometimes useful to introduce a weaker topology on $H$ having the property that $C$ is compact.
Definition 14.39. Let $(X, \|\cdot\|)$ be a Banach space and $X^*$ be its continuous dual. The weak topology, $\tau_w$, on $X$ is the topology generated by $X^*$. If $\{x_n\}_{n=1}^\infty \subset X$ is a sequence we will write $x_n \xrightarrow{w} x$ as $n \to \infty$ to mean that $x_n \to x$ in the weak topology.
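Because $\tau_w = \tau(X^*) \subset \tau_{\|\cdot\|} := \tau(\{\|x - \cdot\| : x \in X\})$, it is harder for a function $f : X \to \mathbb{F}$ to be continuous in the $\tau_w$-topology than in the norm topology, $\tau_{\|\cdot\|}$. In particular if $\varphi : X \to \mathbb{F}$ is a linear functional which is $\tau_w$-continuous, then $\varphi$ is $\tau_{\|\cdot\|}$-continuous and hence $\varphi \in X^*$.
Exercise 14.14. Show the vector space operations of $X$ are continuous in the weak topology, i.e. show:
1. $(x, y) \in X \times X \to x + y \in X$ is $(\tau_w \otimes \tau_w, \tau_w)$-continuous and
2. $(\lambda, x) \in \mathbb{F} \times X \to \lambda x \in X$ is $(\tau_{\mathbb{F}} \otimes \tau_w, \tau_w)$-continuous.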
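Proposition 14.40. Let $\{x_n\}_{n=1}^\infty \subset X$ be a sequence, then $x_n \xrightarrow{w} x \in X$ as $n \to \infty$ iff $\varphi(x) = \lim_{n \to \infty} \varphi(x_n)$ for all $\varphi \in X^*$.
Proof. By definition of $\tau_w$, we have $x_n \xrightarrow{w} x \in X$ iff for all finite sets $\Gamma \subset\subset X^*$ and $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that $|\varphi(x) - \varphi(x_n)| < \varepsilon$ for all $n \ge N$ and $\varphi \in \Gamma$. This latter condition is easily seen to be equivalent to $\varphi(x) = \lim_{n \to \infty} \varphi(x_n)$ for all $\varphi \in X^*$.
The topological space $(X, \tau_w)$ is still Hausdorff as follows from the Hahn-Banach Theorem, see Theorem 25.6 below. For the moment we will concentrate on the special case where $X = H$ is a Hilbert space in which case $H^* = \{\varphi_z := \langle \cdot | z \rangle : z \in H\}$, see Theorem 8.15. If $x, y \in H$ and $z := y - x \neq 0$, then
$$0 < \varepsilon := \|z\|^2 = \varphi_z(z) = \varphi_z(y) - \varphi_z(x).$$
Thus
$$V_x := \{w \in H : |\varphi_z(x) - \varphi_z(w)| < \varepsilon/2\} \text{ and } V_y := \{w \in H : |\varphi_z(y) - \varphi_z(w)| < \varepsilon/2\}$$
are disjoint sets from $\tau_w$ which contain $x$ and $y$ respectively. This shows that $(H, \tau_w)$ is a Hausdorff space. In particular, this shows that weak limits are unique if they exist.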
Remark 14.41. Suppose that $H$ is an infinite dimensional Hilbert space and $\{x_n\}_{n=1}^\infty$ is an orthonormal subset of $H$. Then Bessel's inequality (Proposition 8.18) implies $x_n \xrightarrow{w} 0 \in H$ as $n \to \infty$. This points out the fact that if $x_n \xrightarrow{w} x \in H$ as $n \to \infty$, it is no longer necessarily true that $\|x\| = \lim_{n \to \infty} \|x_n\|$. However we do always have $\|x\| \le \liminf_{n \to \infty} \|x_n\|$ because
$$\|x\|^2 = \lim_{n \to \infty} \langle x_n | x \rangle \le \liminf_{n \to \infty} [\|x_n\| \|x\|] = \|x\| \liminf_{n \to \infty} \|x_n\|.$$
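The phenomenon in Remark 14.41 can be seen numerically in a finite truncation. The following is only an illustrative sketch: it truncates $\ell^2$ to its first `dim` coordinates, uses the standard basis vectors as the orthonormal sequence, and pairs them with the fixed vector $y = (1, 1/2, 1/3, \dots)$. The names `inner`, `basis_vector`, `dim` and the choice of `y` are assumptions made for this example and do not come from the text.

```python
import math

def inner(x, y):
    """Real inner product of two coordinate lists of equal length."""
    return sum(a * b for a, b in zip(x, y))

def basis_vector(n, dim):
    """The n-th standard basis vector (0-indexed) of R^dim."""
    return [1.0 if k == n else 0.0 for k in range(dim)]

dim = 200  # truncation level, a hypothetical choice for this sketch
y = [1.0 / (k + 1) for k in range(dim)]  # fixed test vector with square-summable entries

# <e_n | y> = 1/(n+1) tends to 0, as weak convergence to 0 requires.
pairings = [inner(basis_vector(n, dim), y) for n in range(dim)]

# ||e_n|| = 1 for every n, so the norms do not converge to ||0|| = 0.
norms = [math.sqrt(inner(basis_vector(n, dim), basis_vector(n, dim)))
         for n in range(dim)]

# ||e_n - e_m||^2 = 2 for n != m, so no subsequence is norm-Cauchy.
e3, e7 = basis_vector(3, dim), basis_vector(7, dim)
difference = [a - b for a, b in zip(e3, e7)]
dist_sq = inner(difference, difference)
```

The pairings shrink while the norms stay at 1, which is the strict inequality $\|x\| < \liminf_n \|x_n\|$ with $x = 0$.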
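Proposition 14.42. Let $H$ be a Hilbert space, $\beta \subset H$ be an orthonormal basis for $H$ and $\{x_n\}_{n=1}^\infty \subset H$ be a bounded sequence, then the following are equivalent:
1. $x_n \xrightarrow{w} x \in H$ as $n \to \infty$.
2. $\langle x | y \rangle = \lim_{n \to \infty} \langle x_n | y \rangle$ for all $y \in H$.
3. $\langle x | y \rangle = \lim_{n \to \infty} \langle x_n | y \rangle$ for all $y \in \beta$.
Moreover, if $c_y := \lim_{n \to \infty} \langle x_n | y \rangle$ exists for all $y \in \beta$, then $\sum_{y \in \beta} |c_y|^2 < \infty$ and $x_n \xrightarrow{w} x := \sum_{y \in \beta} c_y y \in H$ as $n \to \infty$.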
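Proof. 1. $\Longrightarrow$ 2. This is a consequence of Theorem 8.15 and Proposition 14.40. 2. $\Longrightarrow$ 3. is trivial. 3. $\Longrightarrow$ 1. Let $M := \sup_n \|x_n\|$ and $H_0$ denote the algebraic span of $\beta$. Then for $y \in H$ and $z \in H_0$,
$$|\langle x - x_n | y \rangle| \le |\langle x - x_n | z \rangle| + |\langle x - x_n | y - z \rangle| \le |\langle x - x_n | z \rangle| + 2M \|y - z\|.$$
Passing to the limit in this equation implies $\limsup_{n \to \infty} |\langle x - x_n | y \rangle| \le 2M \|y - z\|$, which shows $\limsup_{n \to \infty} |\langle x - x_n | y \rangle| = 0$ since $H_0$ is dense in $H$. To prove the last assertion, let $\Gamma \subset\subset \beta$ be a finite set. Then by Bessel's inequality (Proposition 8.18),
$$\sum_{y \in \Gamma} |c_y|^2 = \lim_{n \to \infty} \sum_{y \in \Gamma} |\langle x_n | y \rangle|^2 \le \liminf_{n \to \infty} \|x_n\|^2 \le M^2.$$
Since $\Gamma \subset\subset \beta$ was arbitrary, we conclude that $\sum_{y \in \beta} |c_y|^2 \le M^2 < \infty$ and hence we may define $x := \sum_{y \in \beta} c_y y$. By construction we have
$$\langle x | y \rangle = c_y = \lim_{n \to \infty} \langle x_n | y \rangle \text{ for all } y \in \beta$$
and hence $x_n \xrightarrow{w} x \in H$ as $n \to \infty$ by what we have just proved.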
Theorem 14.43. Suppose $\{x_n\}_{n=1}^\infty$ is a bounded sequence in a Hilbert space, $H$. Then there exists a subsequence $y_k := x_{n_k}$ of $\{x_n\}_{n=1}^\infty$ and $x \in H$ such that $y_k \xrightarrow{w} x$ as $k \to \infty$.
Proof. This is a consequence of Proposition 14.42 and Cantor's diagonalization argument, which is left to the reader, see Exercise 8.12.
Theorem 14.44 (Alaoglu's Theorem for Hilbert Spaces). Suppose that $H$ is a separable Hilbert space, $C := \{x \in H : \|x\| \le 1\}$ is the closed unit ball in $H$ and $\{e_n\}_{n=1}^\infty$ is an orthonormal basis for $H$. Then
$$\rho(x, y) := \sum_{n=1}^\infty \frac{1}{2^n} |\langle x - y | e_n \rangle| \tag{14.8}$$
defines a metric on $C$ which is compatible with the weak topology on $C$, $\tau_C := (\tau_w)_C = \{V \cap C : V \in \tau_w\}$. Moreover $(C, \rho)$ is a compact metric space. (This theorem is extended to Banach spaces in Theorems 14.37 and 14.38 above.)
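Proof. The routine check that $\rho$ is a metric is left to the reader. Let $\tau_\rho$ be the topology on $C$ induced by $\rho$. For any $y \in H$ and $n \in \mathbb{N}$, the map $x \in H \to \langle x - y | e_n \rangle = \langle x | e_n \rangle - \langle y | e_n \rangle$ is $\tau_w$-continuous and since the sum in Eq. (14.8) is uniformly convergent for $x, y \in C$, it follows that $x \to \rho(x, y)$ is $\tau_C$-continuous. This implies the open balls relative to $\rho$ are contained in $\tau_C$ and therefore $\tau_\rho \subset \tau_C$. For the converse inclusion, let $z \in H$, let $x \to \varphi_z(x) = \langle x | z \rangle$ be an element of $H^*$, and for $N \in \mathbb{N}$ let $z_N := \sum_{n=1}^N \langle z | e_n \rangle e_n$. Then $\varphi_{z_N} = \sum_{n=1}^N \overline{\langle z | e_n \rangle} \varphi_{e_n}$ is $\tau_\rho$-continuous, being a finite linear combination of the $\varphi_{e_n}$ which are easily seen to be $\tau_\rho$-continuous. Because $z_N \to z$ as $N \to \infty$ it follows that
$$\sup_{x \in C} |\varphi_z(x) - \varphi_{z_N}(x)| = \|z - z_N\| \to 0 \text{ as } N \to \infty.$$
Therefore $\varphi_z|_C$ is $\tau_\rho$-continuous as well and hence $\tau_C = \tau(\varphi_z|_C : z \in H) \subset \tau_\rho$. The last assertion follows directly from Theorem 14.43 and the fact that sequential compactness is equivalent to compactness for metric spaces.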
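To make the metric in Eq. (14.8) concrete, here is a small sketch that evaluates a truncated version of $\rho$ on coordinate vectors, i.e. on lists whose $n$-th entry plays the role of $\langle x | e_n \rangle$. The truncation to `dim` coordinates and the helper names `rho` and `basis_vector` are assumptions made for illustration. Python lists are 0-indexed, so the weight $2^{-n}$ of the text appears as `2 ** (n + 1)` in the denominator.

```python
def rho(x, y):
    """Truncated Eq. (14.8): x and y are lists of coordinates <x|e_n>, <y|e_n>."""
    return sum(abs(a - b) / 2 ** (n + 1) for n, (a, b) in enumerate(zip(x, y)))

def basis_vector(n, dim):
    """The n-th standard basis vector (0-indexed) of R^dim."""
    return [1.0 if k == n else 0.0 for k in range(dim)]

dim = 30  # hypothetical truncation level
zero = [0.0] * dim

# rho(e_n, 0) = 2^-(n+1) -> 0: the orthonormal sequence converges to 0 in rho,
# even though ||e_n - 0|| = 1 for every n.
distances = [rho(basis_vector(n, dim), zero) for n in range(dim)]
```

This matches the theorem's claim that $\rho$ metrizes the weak topology on the unit ball: the sequence $e_n$ converges weakly to $0$ and its $\rho$-distance to $0$ decays geometrically.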
14.8 Exercises
Exercise 14.16. Let $C$ be a closed proper subset of $\mathbb{R}^n$ and $x \in \mathbb{R}^n \setminus C$. Show there exists a $y \in C$ such that $d(x, y) = d_C(x)$.
Exercise 14.17. Let $\mathbb{F} = \mathbb{R}$ in this problem and $A \subset \ell^2(\mathbb{N})$ be defined by
$$A = \{x \in \ell^2(\mathbb{N}) : x(n) \ge 1 + 1/n \text{ for some } n \in \mathbb{N}\} = \bigcup_{n=1}^\infty \{x \in \ell^2(\mathbb{N}) : x(n) \ge 1 + 1/n\}.$$
Show $A$ is a closed subset of $\ell^2(\mathbb{N})$ with the property that $d_A(0) = 1$ while there is no $y \in A$ such that $d(0, y) = 1$. (Remember that in general an infinite union of closed sets need not be closed.)
Exercise 14.18. Let $p \in [1, \infty]$ and $X$ be an infinite set. Show directly, without using Theorem 14.15, the closed unit ball in $\ell^p(X)$ is not compact.
14.8.1 Ascoli-Arzela Theorem Problems
Exercise 14.19. Let $(X, \tau)$ be a compact topological space and $\mathcal{F} := \{f_n\}_{n=1}^\infty \subset C(X)$ be a sequence of functions which are equicontinuous and pointwise convergent. Show $f(x) := \lim_{n \to \infty} f_n(x)$ is continuous and that $\lim_{n \to \infty} \|f - f_n\|_\infty = 0$, i.e. $f_n \to f$ uniformly as $n \to \infty$.
Solution to Exercise (14.19). By the Arzela-Ascoli Theorem 14.29, there exists a subsequence, $g_k := f_{n_k}$ of $\mathcal{F}$ which is uniformly convergent to a function $g \in C(X)$. Since also $g_k \to f$ pointwise by assumption, it follows that $f = g \in C(X)$.
Now suppose, for the sake of contradiction, that $f_n$ does not converge uniformly to $f$. In this case there exists $\varepsilon > 0$ and a subsequence, $g_k := f_{n_k}$ of $\mathcal{F}$, such that $\|f - g_k\|_\infty \ge \varepsilon$ for all $k$. However, another application of the Arzela-Ascoli Theorem 14.29 shows there is a further subsequence, $\{g_{k_l}\}_{l=1}^\infty$ of $\{g_k\}_{k=1}^\infty$ such that $\lim_{l \to \infty} \|f - g_{k_l}\|_\infty = 0$, which leads to the contradiction that $0 = \lim_{l \to \infty} \|f - g_{k_l}\|_\infty \ge \varepsilon > 0$.
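Exercise 14.20. Let $T \in (0, \infty)$ and $\mathcal{F} \subset C([0, T])$ be a family of functions such that:
1. $\dot{f}(t)$ exists for all $t \in (0, T)$ and $f \in \mathcal{F}$,
2. $\sup_{f \in \mathcal{F}} |f(0)| < \infty$, and
3. $M := \sup_{f \in \mathcal{F}} \sup_{t \in (0, T)} |\dot{f}(t)| < \infty$.
Show $\mathcal{F}$ is precompact in the Banach space $C([0, T])$ equipped with the norm $\|f\|_\infty = \sup_{t \in [0, T]} |f(t)|$.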
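Exercise 14.21 (Peano's Existence Theorem). Suppose $Z : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^d$ is a bounded continuous function. Then for each $T < \infty$ (see footnote 8) there exists a solution to the differential equation
$$\dot{x}(t) = Z(t, x(t)) \text{ for } -T < t < T \text{ with } x(0) = x_0. \tag{14.9}$$
Do this by filling in the following outline for the proof.
1. Given $\varepsilon > 0$, show there exists a unique function $x_\varepsilon \in C([-\varepsilon, \infty) \to \mathbb{R}^d)$ such that $x_\varepsilon(t) := x_0$ for $-\varepsilon \le t \le 0$ and
$$x_\varepsilon(t) = x_0 + \int_0^t Z(\tau, x_\varepsilon(\tau - \varepsilon)) \, d\tau \text{ for all } t \ge 0. \tag{14.10}$$
Here
$$\int_0^t Z(\tau, x_\varepsilon(\tau - \varepsilon)) \, d\tau = \left( \int_0^t Z_1(\tau, x_\varepsilon(\tau - \varepsilon)) \, d\tau, \dots, \int_0^t Z_d(\tau, x_\varepsilon(\tau - \varepsilon)) \, d\tau \right)$$
where $Z = (Z_1, \dots, Z_d)$ and the integrals are either the Lebesgue or the Riemann integral since they are equal on continuous functions. Hint: For $t \in [0, \varepsilon]$, it follows from Eq. (14.10) that
$$x_\varepsilon(t) = x_0 + \int_0^t Z(\tau, x_0) \, d\tau.$$
Now that $x_\varepsilon(t)$ is known for $t \in [-\varepsilon, \varepsilon]$ it can be found by integration for $t \in [\varepsilon, 2\varepsilon]$. The process can be repeated.
2. Then use Exercise 14.20 to show there exists $\{\varepsilon_k\}_{k=1}^\infty \subset (0, \infty)$ such that $\lim_{k \to \infty} \varepsilon_k = 0$ and $x_{\varepsilon_k}$ converges to some $x \in C([0, T])$ with respect to the sup-norm: $\|x\|_\infty = \sup_{t \in [0, T]} |x(t)|$. Also show for this sequence that
$$\lim_{k \to \infty} \sup_{\varepsilon_k \le \tau \le T} |x_{\varepsilon_k}(\tau - \varepsilon_k) - x(\tau)| = 0.$$
3. Pass to the limit (with justification) in Eq. (14.10) with $\varepsilon$ replaced by $\varepsilon_k$ to show $x$ satisfies
$$x(t) = x_0 + \int_0^t Z(\tau, x(\tau)) \, d\tau \quad \forall \, t \in [0, T].$$
4. Conclude from this that $\dot{x}(t)$ exists for $t \in (0, T)$ and that $x$ solves Eq. (14.9).
5. Apply what you have just proved to the ODE,
$$\dot{y}(t) = -Z(-t, y(t)) \text{ for } 0 \le t < T \text{ with } y(0) = x_0.$$
Then extend $x(t)$ above to $(-T, T)$ by setting $x(t) = y(-t)$ if $t \in (-T, 0]$. Show $x$ so defined solves Eq. (14.9) for $t \in (-T, T)$.
Footnote 8. Using Corollary 14.30, we may in fact allow $T = \infty$.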
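The delayed approximations $x_\varepsilon$ of Eq. (14.10) can be sketched numerically in the scalar case $d = 1$. This is a rough illustration, not a solution of the exercise: the integral is replaced by a forward Riemann sum on a grid of step $h = \varepsilon / m$, and the function name `delayed_solution`, the parameter `steps_per_delay` (the integer $m$), and the sample field `Z_sample` are all hypothetical choices made for this example.

```python
import math

def delayed_solution(Z, x0, eps, T, steps_per_delay=20):
    """Approximate x_eps on [0, T] by forward Riemann sums of Eq. (14.10).

    Returns the step size h and the list of values xs[i] ~ x_eps(i * h).
    The delayed argument x_eps(t - eps) is read off the grid, using
    x_eps = x0 on [-eps, 0].
    """
    h = eps / steps_per_delay
    n_steps = int(round(T / h))
    xs = [x0]
    for i in range(n_steps):
        t = i * h
        j = i - steps_per_delay          # grid index of t - eps
        delayed = xs[j] if j >= 0 else x0
        xs.append(xs[-1] + h * Z(t, delayed))
    return h, xs

def Z_sample(t, x):
    """A bounded continuous field with |Z| <= 2 (an assumed example)."""
    return math.cos(x) + math.sin(t)

h, xs = delayed_solution(Z_sample, x0=0.0, eps=0.1, T=1.0)

# Each increment is h * Z(...), so |x(t) - x(s)| <= M |t - s| with M = sup|Z| = 2.
max_increment = max(abs(b - a) for a, b in zip(xs, xs[1:]))
```

The bound on `max_increment` holds uniformly in $\varepsilon$, which is the equicontinuity that lets Exercise 14.20 be applied in step 2 of the outline.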
Exercise 14.22. Prove Theorem 14.32. Hint: First prove the inclusion $C^{j, \beta}(\bar{\Omega}) \subset C^{j, \alpha}(\bar{\Omega})$ is compact if $0 \le \alpha < \beta \le 1$. Then use Lemma 14.18 repeatedly to handle all of the other cases.
14.8.2 Tychonoff's Theorem Problem
Exercise 14.23 (Tychonoff's Theorem for Compact Metric Spaces). Let us continue the notation used in Exercise 6.12. Further assume that the spaces $X_n$ are compact for all $n$. Show, without using Theorem 14.34, $(X, d)$ is compact. Hint: Either use Cantor's method to show every sequence $\{x_m\}_{m=1}^\infty \subset X$ has a convergent subsequence or alternatively show $(X, d)$ is complete and totally bounded. (Compare with Example 14.10.)
15 Locally Compact Hausdorff Spaces
In this section $X$ will always be a topological space with topology $\tau$. We are now interested in restrictions on $\tau$ in order to ensure there are plenty of continuous functions. One such restriction is to assume $\tau = \tau_d$ is the topology induced from a metric on $X$. For example the results in Lemma 6.15 and Theorem 7.4 above show that metric spaces have lots of continuous functions.
The main thrust of this section is to study locally compact (and $\sigma$-compact) Hausdorff spaces as defined in Definitions 15.2 and 14.21. We will see again that this class of topological spaces has an ample supply of continuous functions. We will start out with the notion of a Hausdorff topology. The following example shows a pathology which occurs when there are not enough open sets in a topology.
Example 15.1. As in Example 13.36, let
$$X := \{1, 2, 3\} \text{ with } \tau := \{X, \emptyset, \{1, 2\}, \{2, 3\}, \{2\}\}.$$
Example 13.36 shows limits need not be unique in this space and moreover it is easy to verify that the only continuous functions, $f : X \to \mathbb{R}$, are necessarily constant.
Definition 15.2 (Hausdorff Topology). A topological space, $(X, \tau)$, is Hausdorff if for each pair of distinct points, $x, y \in X$, there exist disjoint open neighborhoods, $U$ and $V$ of $x$ and $y$ respectively. (Metric spaces are typical examples of Hausdorff spaces.)
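For finite spaces, both the topology axioms and the Hausdorff condition of Definition 15.2 can be checked by brute force. The sketch below does this for the three-point space of Example 15.1 and, for comparison, for the discrete topology on the same set. The function names `is_topology` and `is_hausdorff` are introduced here only for illustration.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the topology axioms for a finite family tau of subsets of X."""
    if frozenset() not in tau or X not in tau:
        return False
    # For a finite family, closure under pairwise unions and intersections suffices.
    return all((U | V) in tau and (U & V) in tau for U in tau for V in tau)

def is_hausdorff(X, tau):
    """Check that every pair of distinct points has disjoint open neighborhoods."""
    for x, y in combinations(sorted(X), 2):
        separated = any(
            x in U and y in V and not (U & V) for U in tau for V in tau
        )
        if not separated:
            return False
    return True

# The space of Example 15.1.
X = frozenset({1, 2, 3})
tau = {frozenset(), X, frozenset({1, 2}), frozenset({2, 3}), frozenset({2})}

# The discrete topology (all subsets of X) for comparison.
discrete = {
    frozenset(s) for r in range(len(X) + 1) for s in combinations(sorted(X), r)
}
```

Here `is_hausdorff(X, tau)` fails because every open set containing $1$ also contains $2$, while the discrete topology passes.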
Remark 15.3. When $\tau$ is Hausdorff the "pathologies" appearing in Example 15.1 do not occur. Indeed if $x_n \to x \in X$ and $y \in X \setminus \{x\}$ we may choose $V \in \tau_x$ and $W \in \tau_y$ such that $V \cap W = \emptyset$. Then $x_n \in V$ a.a. implies $x_n \notin W$ for all but a finite number of $n$ and hence $x_n \nrightarrow y$, so limits are unique.
Proposition 15.4. Let $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ be Hausdorff topological spaces. Then the product space $X_A = \prod_{\alpha \in A} X_\alpha$ equipped with the product topology is Hausdorff.
Proof. Let $x, y \in X_A$ be distinct points. Then there exists $\alpha \in A$ such that $\pi_\alpha(x) = x_\alpha \neq y_\alpha = \pi_\alpha(y)$. Since $X_\alpha$ is Hausdorff, there exist disjoint open sets $U, V \subset X_\alpha$ such that $\pi_\alpha(x) \in U$ and $\pi_\alpha(y) \in V$. Then $\pi_\alpha^{-1}(U)$ and $\pi_\alpha^{-1}(V)$ are disjoint open sets in $X_A$ containing $x$ and $y$ respectively.
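Proposition 15.5. Suppose that $(X, \tau)$ is a Hausdorff space, $K \sqsubset\sqsubset X$ (i.e. $K$ is a compact subset of $X$) and $x \in K^c$. Then there exist $U, V \in \tau$ such that $U \cap V = \emptyset$, $x \in U$ and $K \subset V$. In particular $K$ is closed. (So compact subsets of Hausdorff topological spaces are closed.) More generally if $K$ and $F$ are two disjoint compact subsets of $X$, there exist disjoint open sets $U, V \in \tau$ such that $K \subset V$ and $F \subset U$.
Proof. Because $X$ is Hausdorff, for all $y \in K$ there exist $V_y \in \tau_y$ and $U_y \in \tau_x$ such that $V_y \cap U_y = \emptyset$. The cover $\{V_y\}_{y \in K}$ of $K$ has a finite subcover, $\{V_y\}_{y \in \Lambda}$ for some finite set $\Lambda \subset\subset K$. Let $V = \bigcup_{y \in \Lambda} V_y$ and $U = \bigcap_{y \in \Lambda} U_y$, then $U, V \in \tau$ satisfy $x \in U$, $K \subset V$ and $U \cap V = \emptyset$. This shows that $K^c$ is open and hence that $K$ is closed. Suppose that $K$ and $F$ are two disjoint compact subsets of $X$. For each $x \in F$ there exist disjoint open sets $U_x$ and $V_x$ such that $K \subset V_x$ and $x \in U_x$. Since $\{U_x\}_{x \in F}$ is an open cover of $F$, there exists a finite subset $\Lambda$ of $F$ such that $F \subset U := \bigcup_{x \in \Lambda} U_x$. The proof is completed by defining $V := \bigcap_{x \in \Lambda} V_x$.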
Exercise 15.1. Show any finite set $X$ admits exactly one Hausdorff topology $\tau$.
Exercise 15.2. Let $(X, \tau)$ and $(Y, \tau_Y)$ be topological spaces.
1. Show $\tau$ is Hausdorff iff $\Delta := \{(x, x) : x \in X\}$ is a closed set in $X \times X$ equipped with the product topology $\tau \otimes \tau$.
2. Suppose $\tau$ is Hausdorff and $f, g : Y \to X$ are continuous maps. If $\overline{\{f = g\}}^{Y} = Y$ then $f = g$. Hint: make use of the map $f \times g : Y \to X \times X$ defined by $(f \times g)(y) = (f(y), g(y))$.
Exercise 15.3. Give an example of a topological space which has a non-closed compact subset.
Proposition 15.6. Suppose that $X$ is a compact topological space, $Y$ is a Hausdorff topological space, and $f : X \to Y$ is a continuous bijection. Then $f$ is a homeomorphism, i.e. $f^{-1} : Y \to X$ is continuous as well.
Proof. Since closed subsets of compact sets are compact, continuous images of compact subsets are compact and compact subsets of Hausdorff spaces are closed, it follows that $(f^{-1})^{-1}(C) = f(C)$ is closed in $Y$ for all closed subsets $C$ of $X$. Thus $f^{-1}$ is continuous.
The next two results show that locally compact Hausdorff spaces have plenty of open sets and plenty of continuous functions.
Proposition 15.7. Suppose $X$ is a locally compact Hausdorff space, $U \subset_o X$ and $K \sqsubset\sqsubset U$. Then there exists $V \subset_o X$ such that $K \subset V \subset \bar{V} \subset U \subset X$ and $\bar{V}$ is compact. (Compare with Proposition 14.25 above.)
Proof. By local compactness, for all $x \in K$, there exists $U_x \in \tau_x$ such that $\bar{U}_x$ is compact. Since $K$ is compact, there exists a finite set $\Lambda \subset\subset K$ such that $\{U_x\}_{x \in \Lambda}$ is a cover of $K$. The set $O = U \cap (\bigcup_{x \in \Lambda} U_x)$ is an open set such that $K \subset O \subset U$ and $O$ is precompact since $\bar{O}$ is a closed subset of the compact set $\bigcup_{x \in \Lambda} \bar{U}_x$. ($\bigcup_{x \in \Lambda} \bar{U}_x$ is compact because it is a finite union of compact sets.) So by replacing $U$ by $O$ if necessary, we may assume that $\bar{U}$ is compact. Since $\bar{U}$ is compact and $\mathrm{bd}(U) = \bar{U} \cap U^c$ is a closed subset of $\bar{U}$, $\mathrm{bd}(U)$ is compact. Because $\mathrm{bd}(U) \subset U^c$, it follows that $\mathrm{bd}(U) \cap K = \emptyset$, so by Proposition 15.5, there exist disjoint open sets $V$ and $W$ such that $K \subset V$ and $\mathrm{bd}(U) \subset W$. By replacing $V$ by $V \cap U$ if necessary we may further assume that $K \subset V \subset U$, see Figure 15.1. Because $\bar{U} \cap W^c$ is a closed set containing $V$ and $\mathrm{bd}(U) \cap W^c = \emptyset$,
$$\bar{V} \subset \bar{U} \cap W^c = (U \cup \mathrm{bd}(U)) \cap W^c = U \cap W^c \subset U \subset \bar{U}.$$
Since $\bar{U}$ is compact it follows that $\bar{V}$ is compact and the proof is complete.
Fig. 15.1. The construction of $V$.
The following Lemma is analogous to Lemma 14.27.
Lemma 15.8 (Urysohn's Lemma for LCH Spaces). Let $X$ be a locally compact Hausdorff space and $K \sqsubset\sqsubset U \subset_o X$. Then there exists $f \prec U$ (see Definition 14.26) such that $f = 1$ on $K$. In particular, if $K$ is compact and $C$ is closed in $X$ such that $K \cap C = \emptyset$, there exists $f \in C_c(X, [0, 1])$ such that $f = 1$ on $K$ and $f = 0$ on $C$.
Proof. For notational ease later it is more convenient to construct $g := 1 - f$ rather than $f$. To motivate the proof, suppose $g \in C(X, [0, 1])$ is such that $g = 0$ on $K$ and $g = 1$ on $U^c$. For $r > 0$, let $U_r = \{g < r\}$. Then for $0 < r < s \le 1$, $U_r \subset \{g \le r\} \subset U_s$ and since $\{g \le r\}$ is closed this implies
$$K \subset U_r \subset \bar{U}_r \subset \{g \le r\} \subset U_s \subset U.$$
Therefore associated to the function $g$ is the collection of open sets $\{U_r\}_{r > 0} \subset \tau$ with the property that $K \subset U_r \subset \bar{U}_r \subset U_s \subset U$ for all $0 < r < s \le 1$ and $U_r = X$ if $r > 1$. Finally let us notice that we may recover the function $g$ from the family $\{U_r\}_{r > 0}$ by the formula
$$g(x) = \inf\{r > 0 : x \in U_r\}. \tag{15.1}$$
The idea of the proof to follow is to turn these remarks around and define $g$ by Eq. (15.1).
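Step 1. (Construction of the $U_r$.) Let
$$\mathbb{D} := \{k 2^{-n} : k = 1, 2, \dots, 2^n, \ n = 1, 2, \dots\}$$
be the dyadic rationals in $(0, 1]$. Use Proposition 15.7 to find a precompact open set $U_1$ such that $K \subset U_1 \subset \bar{U}_1 \subset U$. Apply Proposition 15.7 again to construct an open set $U_{1/2}$ such that
$$K \subset U_{1/2} \subset \bar{U}_{1/2} \subset U_1$$
and similarly use Proposition 15.7 to find open sets $U_{1/4}, U_{3/4} \subset_o X$ such that
$$K \subset U_{1/4} \subset \bar{U}_{1/4} \subset U_{1/2} \subset \bar{U}_{1/2} \subset U_{3/4} \subset \bar{U}_{3/4} \subset U_1.$$
Likewise there exist open sets $U_{1/8}, U_{3/8}, U_{5/8}, U_{7/8}$ such that
$$K \subset U_{1/8} \subset \bar{U}_{1/8} \subset U_{1/4} \subset \bar{U}_{1/4} \subset U_{3/8} \subset \bar{U}_{3/8} \subset U_{1/2} \subset \bar{U}_{1/2} \subset U_{5/8} \subset \bar{U}_{5/8} \subset U_{3/4} \subset \bar{U}_{3/4} \subset U_{7/8} \subset \bar{U}_{7/8} \subset U_1.$$
Continuing this way inductively, one shows there exist precompact open sets $\{U_r\}_{r \in \mathbb{D}} \subset \tau$ such that
$$K \subset U_r \subset \bar{U}_r \subset U_s \subset U_1 \subset \bar{U}_1 \subset U$$
for all $r, s \in \mathbb{D}$ with $0 < r < s \le 1$.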
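Step 2. Let $U_r := X$ if $r > 1$ and define
$$g(x) = \inf\{r \in \mathbb{D} \cup (1, 2) : x \in U_r\},$$
see Figure 15.2. Then $g(x) \in [0, 1]$ for all $x \in X$, and $g(x) = 0$ for $x \in K$ since $x \in K \subset U_r$ for all $r \in \mathbb{D}$. If $x \in U_1^c$, then $x \notin U_r$ for all $r \in \mathbb{D}$ and hence $g(x) = 1$. Therefore $f := 1 - g$ is a function such that $f = 1$ on $K$ and $\{f \neq 0\} = \{g \neq 1\} \subset U_1 \subset \bar{U}_1 \subset U$ so that $\mathrm{supp}(f) = \overline{\{f \neq 0\}} \subset \bar{U}_1 \subset U$ is a compact subset of $U$. Thus it only remains to show $f$, or equivalently $g$, is continuous.
Fig. 15.2. Determining $g$ from $\{U_r\}$.
Since $\mathcal{E} = \{(\alpha, \infty), (-\infty, \alpha) : \alpha \in \mathbb{R}\}$ generates the standard topology on $\mathbb{R}$, to prove $g$ is continuous it suffices to show $\{g < \alpha\}$ and $\{g > \alpha\}$ are open sets for all $\alpha \in \mathbb{R}$. But $g(x) < \alpha$ iff there exists $r \in \mathbb{D} \cup (1, 2)$ with $r < \alpha$ such that $x \in U_r$. Therefore
$$\{g < \alpha\} = \bigcup \{U_r : r \in \mathbb{D} \cup (1, 2), \ r < \alpha\}$$
which is open in $X$. If $\alpha \ge 1$, $\{g > \alpha\} = \emptyset$ and if $\alpha < 0$, $\{g > \alpha\} = X$. If $\alpha \in [0, 1)$, then $g(x) > \alpha$ iff there exists $r \in \mathbb{D}$ such that $r > \alpha$ and $x \notin U_r$. Now if $r > \alpha$ and $x \notin U_r$ then for $s \in \mathbb{D} \cap (\alpha, r)$, $x \notin \bar{U}_s \subset U_r$. Thus we have shown that
$$\{g > \alpha\} = \bigcup \left\{ (\bar{U}_s)^c : s \in \mathbb{D}, \ s > \alpha \right\}$$
which is again an open subset of $X$.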
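Formula (15.1) can be tried out on a concrete nested family. The sketch below is an illustration under assumed data, not the general construction: it takes $X = \mathbb{R}$, $K = [-1, 1]$, $U = (-3, 3)$ and the hypothetical choice $U_r = (-(1 + r), 1 + r)$ for $r \le 1$, which satisfies $K \subset U_r \subset \bar{U}_r \subset U_s \subset U$ for $r < s \le 1$. The function `g_level` takes the infimum only over dyadics with denominator $2^{\text{level}}$, so it overestimates $g$ by at most $2^{-\text{level}}$.

```python
def in_U(x, r):
    """Membership in U_r = (-(1 + r), 1 + r) for r <= 1, and U_r = R for r > 1."""
    return True if r > 1 else abs(x) < 1 + r

def g_level(x, level):
    """inf{ r : x in U_r } over dyadics k / 2**level in (0, 1]; 1.0 if none works.

    The fallback 1.0 corresponds to the infimum over r in (1, 2), where U_r = X.
    """
    for k in range(1, 2 ** level + 1):
        r = k / 2 ** level
        if in_U(x, r):
            return r
    return 1.0

def g_exact(x):
    """The limit: 0 on K = [-1, 1], 1 off U_1 = (-2, 2), linear in between."""
    return min(1.0, max(0.0, abs(x) - 1.0))

sample_points = [-3.0, -1.7, -1.0, 0.0, 0.5, 1.25, 1.999, 2.0, 5.0]
errors = [abs(g_level(x, 10) - g_exact(x)) for x in sample_points]
```

For this family the recovered function is the familiar piecewise linear cutoff, so $f = 1 - g$ equals $1$ on $K$ and vanishes off $(-2, 2)$.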
Theorem 15.9 (Locally Compact Tietze Extension Theorem). Let $(X, \tau)$ be a locally compact Hausdorff space, $K \sqsubset\sqsubset U \subset_o X$, $f \in C(K, \mathbb{R})$, $a = \min f(K)$ and $b = \max f(K)$. Then there exists $F \in C(X, [a, b])$ such that $F|_K = f$. Moreover given $c \in [a, b]$, $F$ can be chosen so that $\mathrm{supp}(F - c) = \overline{\{F \neq c\}} \subset U$.
The proof of this theorem is similar to Theorem 7.4 and will be left to the reader, see Exercise 15.6.
15.1 Locally compact form of Urysohn's Metrization Theorem
Definition 15.10 (Polish spaces). A Polish space is a separable topological space $(X, \tau)$ which admits a complete metric, $\rho$, such that $\tau = \tau_\rho$.
Notation 15.11 Let $Q := [0, 1]^{\mathbb{N}}$ denote the (infinite dimensional) unit cube in $\mathbb{R}^{\mathbb{N}}$. For $a, b \in Q$ let
$$d(a, b) := \sum_{n=1}^\infty \frac{1}{2^n} |a_n - b_n|. \tag{15.2}$$
The metric introduced in Exercise 14.23 would be defined, in this context, as $\tilde{d}(a, b) := \sum_{n=1}^\infty \frac{1}{2^n} \frac{|a_n - b_n|}{1 + |a_n - b_n|}$. Since $1 \le 1 + |a_n - b_n| \le 2$, it follows that $\tilde{d} \le d \le 2\tilde{d}$. So the metrics $d$ and $\tilde{d}$ are equivalent and in particular the topologies induced by $d$ and $\tilde{d}$ are the same. By Exercise 13.28, the $d$-topology on $Q$ is the same as the product topology and by Tychonoff's Theorem 14.34 or by Exercise 14.23, $(Q, d)$ is a compact metric space.
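The comparison $\tilde{d} \le d \le 2\tilde{d}$ can be checked on truncated sequences. The following sketch evaluates both sums over the first `terms` coordinates. The two sample sequences `a` and `b` are arbitrary points of $[0, 1]^{\text{terms}}$ chosen for illustration, and `d_tilde` is simply a name for $\tilde{d}$. As lists are 0-indexed, the weight $2^{-n}$ of Eq. (15.2) appears as `2 ** (n + 1)` in the denominator.

```python
def d(a, b):
    """Truncated Eq. (15.2): weighted l^1 distance between coordinate lists."""
    return sum(abs(x - y) / 2 ** (n + 1) for n, (x, y) in enumerate(zip(a, b)))

def d_tilde(a, b):
    """Truncated version of the metric from Exercise 14.23 in this setting."""
    return sum(
        abs(x - y) / (1 + abs(x - y)) / 2 ** (n + 1)
        for n, (x, y) in enumerate(zip(a, b))
    )

terms = 40  # hypothetical truncation level
a = [(n % 7) / 6 for n in range(terms)]            # entries in [0, 1]
b = [((3 * n + 1) % 5) / 4 for n in range(terms)]  # entries in [0, 1]

lower, middle, upper = d_tilde(a, b), d(a, b), 2 * d_tilde(a, b)
```

The upper bound uses that all coordinates lie in $[0, 1]$, so that $1 + |a_n - b_n| \le 2$; it would fail for unbounded coordinates.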
Theorem 15.12. To every separable metric space $(X, \rho)$, there exists a continuous injective map $G : X \to Q$ such that $G : X \to G(X) \subset Q$ is a homeomorphism. Moreover if the metric, $\rho$, is also complete, then $G(X)$ is a $G_\delta$ set, i.e. $G(X)$ is the countable intersection of open subsets of $(Q, d)$. In short, any separable metrizable space $X$ is homeomorphic to a subset of $(Q, d)$ and if $X$ is a Polish space then $X$ is homeomorphic to a $G_\delta$ subset of $(Q, d)$.
Proof. (See Rogers and Williams [19], Theorem 82.5 on p. 106.) By replacing $\rho$ by $\frac{\rho}{1 + \rho}$ if necessary, we may assume that $0 \le \rho < 1$. Let $D = \{a_n\}_{n=1}^\infty$ be a countable dense subset of $X$ and define
$$G(x) = (\rho(x, a_1), \rho(x, a_2), \rho(x, a_3), \dots) \in Q$$
and
$$\gamma(x, y) = d(G(x), G(y)) = \sum_{n=1}^\infty \frac{1}{2^n} |\rho(x, a_n) - \rho(y, a_n)|$$
for $x, y \in X$. To prove the first assertion, we must show $G$ is injective and $\gamma$ is a metric on $X$ which is compatible with the topology determined by $\rho$.
If $G(x) = G(y)$, then $\rho(x, a) = \rho(y, a)$ for all $a \in D$. Since $D$ is a dense subset of $X$, we may choose $\alpha_k \in D$ such that
$$0 = \lim_{k \to \infty} \rho(x, \alpha_k) = \lim_{k \to \infty} \rho(y, \alpha_k) = \rho(y, x)$$
and therefore $x = y$. A simple argument using the dominated convergence theorem shows $y \to \gamma(x, y)$ is $\rho$-continuous, i.e. $\gamma(x, y)$ is small if $\rho(x, y)$ is small. Conversely,
$$\begin{aligned}
\rho(x, y) &\le \rho(x, a_n) + \rho(y, a_n) = 2\rho(x, a_n) + \rho(y, a_n) - \rho(x, a_n) \\
&\le 2\rho(x, a_n) + |\rho(x, a_n) - \rho(y, a_n)| \le 2\rho(x, a_n) + 2^n \gamma(x, y).
\end{aligned}$$
Hence if $\varepsilon > 0$ is given, we may choose $n$ so that $2\rho(x, a_n) < \varepsilon/2$ and so if $\gamma(x, y) < 2^{-(n+1)} \varepsilon$, it will follow that $\rho(x, y) < \varepsilon$. This shows $\tau_\gamma = \tau_\rho$. Since $G : (X, \gamma) \to (Q, d)$ is isometric, $G$ is a homeomorphism.
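Now suppose that $(X, \rho)$ is a complete metric space. Let $S := G(X)$ and $\sigma$ be the metric on $S$ defined by $\sigma(G(x), G(y)) = \rho(x, y)$ for all $x, y \in X$. Then $(S, \sigma)$ is a complete metric space (being the isometric image of a complete metric space) and by what we have just proved, $\tau_\sigma = (\tau_d)_S$. Consequently, if $u \in S$ and $\varepsilon > 0$ is given, we may find $\delta'(\varepsilon)$ such that $B_\sigma(u, \delta'(\varepsilon)) \subset B_d(u, \varepsilon)$. Taking $\delta(\varepsilon) = \min(\delta'(\varepsilon), \varepsilon)$, we have $\mathrm{diam}_d(B_d(u, \delta(\varepsilon))) < \varepsilon$ and $\mathrm{diam}_\sigma(B_d(u, \delta(\varepsilon))) < \varepsilon$ where
$$\mathrm{diam}_\sigma(A) := \sup\{\sigma(u, v) : u, v \in A\} \text{ and } \mathrm{diam}_d(A) := \sup\{d(u, v) : u, v \in A\}.$$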
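Let $\bar{S}$ denote the closure of $S$ inside of $(Q, d)$ and for each $n \in \mathbb{N}$ let
$$\mathcal{N}_n := \{N \in \tau_d : \mathrm{diam}_d(N) \vee \mathrm{diam}_\sigma(N \cap S) < 1/n\}$$
and let $U_n := \bigcup \mathcal{N}_n \in \tau_d$. From the previous paragraph, it follows that $S \subset U_n$ and therefore $S \subset \bar{S} \cap (\bigcap_{n=1}^\infty U_n)$.
Conversely if $u \in \bar{S} \cap (\bigcap_{n=1}^\infty U_n)$ and $n \in \mathbb{N}$, there exists $N_n \in \mathcal{N}_n$ such that $u \in N_n$. Moreover, since $N_1 \cap \dots \cap N_n$ is an open neighborhood of $u \in \bar{S}$, there exists $u_n \in N_1 \cap \dots \cap N_n \cap S$ for each $n \in \mathbb{N}$. From the definition of $\mathcal{N}_n$, we have $\lim_{n \to \infty} d(u, u_n) = 0$ and $\sigma(u_n, u_m) \le \max(n^{-1}, m^{-1}) \to 0$ as $m, n \to \infty$. Since $(S, \sigma)$ is complete, it follows that $\{u_n\}_{n=1}^\infty$ is convergent in $(S, \sigma)$ to some element $u_0 \in S$. Since $(S, d_S)$ has the same topology as $(S, \sigma)$ it follows that $d(u_n, u_0) \to 0$ as well and thus that $u = u_0 \in S$. We have now shown $S = \bar{S} \cap (\bigcap_{n=1}^\infty U_n)$. This completes the proof because we may write $\bar{S} = \bigcap_{n=1}^\infty S_{1/n}$ where $S_{1/n} := \{u \in Q : d(u, \bar{S}) < 1/n\}$ and therefore
$$S = \Big(\bigcap_{n=1}^\infty U_n\Big) \cap \Big(\bigcap_{n=1}^\infty S_{1/n}\Big)$$
is a $G_\delta$ set.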
Theorem 15.13 (Urysohn Metrization Theorem for LCH's). Every second countable locally compact Hausdorff space, $(X, \tau)$, is metrizable, i.e. there is a metric $\rho$ on $X$ such that $\tau = \tau_\rho$. Moreover, $\rho$ may be chosen so that $X$ is isometric to a subset $Q_0 \subset Q$ equipped with the metric $d$ in Eq. (15.2). In this metric $X$ is totally bounded and hence the completion of $X$ (which is isometric to $\bar{Q}_0 \subset Q$) is compact. (Also see Theorem 15.43.)
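Proof. Let $\mathcal{B}$ be a countable base for $\tau$ and set
$$\Gamma := \{(U, V) \in \mathcal{B} \times \mathcal{B} \mid \bar{U} \subset V \text{ and } \bar{U} \text{ is compact}\}.$$
To each $O \in \tau$ and $x \in O$ there exist $(U, V) \in \Gamma$ such that $x \in U \subset V \subset O$. Indeed, since $\mathcal{B}$ is a base for $\tau$, there exists $V \in \mathcal{B}$ such that $x \in V \subset O$. Now apply Proposition 15.7 to find $U' \subset_o X$ such that $x \in U' \subset \bar{U}' \subset V$ with $\bar{U}'$ being compact. Since $\mathcal{B}$ is a base for $\tau$, there exists $U \in \mathcal{B}$ such that $x \in U \subset U'$ and since $\bar{U} \subset \bar{U}'$, $\bar{U}$ is compact so $(U, V) \in \Gamma$. In particular this shows that $\mathcal{B}' := \{U \in \mathcal{B} : (U, V) \in \Gamma \text{ for some } V \in \mathcal{B}\}$ is still a base for $\tau$. If $\Gamma$ is a finite set, then $\mathcal{B}'$ is finite and $\tau$ only has a finite number of elements as well. Since $(X, \tau)$ is Hausdorff, it follows that $X$ is a finite set. Letting $\{x_n\}_{n=1}^N$ be an enumeration of $X$, define $T : X \to Q$ by $T(x_n) = e_n$ for $n = 1, 2, \dots, N$ where $e_n = (0, 0, \dots, 0, 1, 0, \dots)$, with the 1 occurring in the $n^{\text{th}}$ spot. Then $\rho(x, y) := d(T(x), T(y))$ for $x, y \in X$ is the desired metric.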
So we may now assume that $\Gamma$ is an infinite set and let $\{(U_n, V_n)\}_{n=1}^\infty$ be an enumeration of $\Gamma$. By Urysohn's Lemma 15.8 there exists $f_{U,V} \in C(X, [0, 1])$ such that $f_{U,V} = 0$ on $\bar{U}$ and $f_{U,V} = 1$ on $V^c$. Let $\mathcal{F} := \{f_{U,V} \mid (U, V) \in \Gamma\}$ and set $f_n := f_{U_n, V_n}$, an enumeration of $\mathcal{F}$. We will now show that
$$\rho(x, y) := \sum_{n=1}^\infty \frac{1}{2^n} |f_n(x) - f_n(y)|$$
is the desired metric on $X$. The proof will involve a number of steps.
1. ($\rho$ is a metric on $X$.) It is routine to show $\rho$ satisfies the triangle inequality and $\rho$ is symmetric. If $x, y \in X$ are distinct points then there exists $(U_{n_0}, V_{n_0}) \in \Gamma$ such that $x \in U_{n_0}$ and $V_{n_0} \subset O := \{y\}^c$. Since $f_{n_0}(x) = 0$ and $f_{n_0}(y) = 1$, it follows that $\rho(x, y) \ge 2^{-n_0} > 0$.
2. (Let $\tau_0 = \tau(f_n : n \in \mathbb{N})$, then $\tau = \tau_0 = \tau_\rho$.) As usual we have $\tau_0 \subset \tau$. Since, for each $x \in X$, $y \to \rho(x, y)$ is $\tau_0$-continuous (being the uniformly convergent sum of continuous functions), it follows that $B_x(\varepsilon) := \{y \in X : \rho(x, y) < \varepsilon\} \in \tau_0$ for all $x \in X$ and $\varepsilon > 0$. Thus $\tau_\rho \subset \tau_0 \subset \tau$.
Suppose that $O \in \tau$ and $x \in O$. Let $(U_{n_0}, V_{n_0}) \in \Gamma$ be such that $x \in U_{n_0}$ and $V_{n_0} \subset O$. Then $f_{n_0}(x) = 0$ and $f_{n_0} = 1$ on $O^c$. Therefore if $y \in X$ and $f_{n_0}(y) < 1$, then $y \in O$ so $x \in \{f_{n_0} < 1\} \subset O$. This shows that $O$ may be written as a union of elements from $\tau_0$ and therefore $O \in \tau_0$. So $\tau \subset \tau_0$ and hence $\tau = \tau_0$. Moreover, if $y \in B_x(2^{-n_0})$ then $2^{-n_0} > \rho(x, y) \ge 2^{-n_0} f_{n_0}(y)$ and therefore $x \in B_x(2^{-n_0}) \subset \{f_{n_0} < 1\} \subset O$. This shows $O$ is $\rho$-open and hence $\tau_\rho \subset \tau_0 \subset \tau \subset \tau_\rho$.
3. ($X$ is isometric to some $Q_0 \subset Q$.) Let $T : X \to Q$ be defined by $T(x) = (f_1(x), f_2(x), \dots, f_n(x), \dots)$. Then $T$ is an isometry by the very definitions of $d$ and $\rho$ and therefore $X$ is isometric to $Q_0 := T(X)$. Since $Q_0$ is a subset of the compact metric space $(Q, d)$, $Q_0$ is totally bounded and therefore $X$ is totally bounded.
BRUCE: Add Stone Chech Compactication results.
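15.2 Partitions of Unity
Definition 15.14. Let $(X, \tau)$ be a topological space and $X_0 \subset X$ be a set. A collection of sets $\{B_\alpha\}_{\alpha \in A} \subset 2^X$ is locally finite on $X_0$ if for all $x \in X_0$, there is an open neighborhood $N_x \in \tau$ of $x$ such that $\#\{\alpha \in A : B_\alpha \cap N_x \neq \emptyset\} < \infty$.
Definition 15.15. Suppose that $\mathcal{U}$ is an open cover of $X_0 \subset X$. A collection $\{\varphi_\alpha\}_{\alpha \in A} \subset C(X, [0, 1])$ ($N = \infty$ is allowed here) is a partition of unity on $X_0$ subordinate to the cover $\mathcal{U}$ if:
1. for all $\alpha$ there is a $U \in \mathcal{U}$ such that $\mathrm{supp}(\varphi_\alpha) \subset U$,
2. the collection of sets, $\{\mathrm{supp}(\varphi_\alpha)\}_{\alpha \in A}$, is locally finite on $X$, and
3. $\sum_{\alpha \in A} \varphi_\alpha = 1$ on $X_0$.
Notice by item 2. that, for each $x \in X$, there is a neighborhood $N_x$ such that
$$\Lambda := \{\alpha \in A : \mathrm{supp}(\varphi_\alpha) \cap N_x \neq \emptyset\}$$
is a finite set. Therefore, $\sum_{\alpha \in A} \varphi_\alpha|_{N_x} = \sum_{\alpha \in \Lambda} \varphi_\alpha|_{N_x}$, which shows the sum $\sum_{\alpha \in A} \varphi_\alpha$ is well defined and defines a continuous function on $N_x$ and therefore on $X$ since continuity is a local property. We will summarize these last comments by saying the sum, $\sum_{\alpha \in A} \varphi_\alpha$, is locally finite.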
Proposition 15.16 (Partitions of Unity: The Compact Case). Suppose that $X$ is a locally compact Hausdorff space, $K \subset X$ is a compact set and $\mathcal{U} = \{U_j\}_{j=1}^n$ is an open cover of $K$. Then there exists a partition of unity $\{h_j\}_{j=1}^n$ of $K$ such that $h_j \prec U_j$ for all $j = 1, 2, \dots, n$.
Proof. For all $x \in K$ choose a precompact open neighborhood, $V_x$, of $x$ such that $\bar{V}_x \subset U_j$ for some $j$. Since $K$ is compact, there exists a finite subset, $\Lambda$, of $K$ such that $K \subset \bigcup_{x \in \Lambda} V_x$. Let
$$F_j = \bigcup \{\bar{V}_x : x \in \Lambda \text{ and } \bar{V}_x \subset U_j\}.$$
Then $F_j$ is compact, $F_j \subset U_j$ for all $j$, and $K \subset \bigcup_{j=1}^n F_j$. By Urysohn's Lemma 15.8 there exists $f_j \prec U_j$ such that $f_j = 1$ on $F_j$ for $j = 1, 2, \dots, n$ and by convention let $f_{n+1} \equiv 1$. We will now give two methods to finish the proof.
Method 1. Let $h_1 = f_1$, $h_2 = f_2(1 - h_1) = f_2(1 - f_1)$,
$$h_3 = f_3(1 - h_1 - h_2) = f_3(1 - f_1 - (1 - f_1) f_2) = f_3(1 - f_1)(1 - f_2)$$
and continue on inductively to define
$$h_k = (1 - h_1 - \dots - h_{k-1}) f_k = f_k \cdot \prod_{j=1}^{k-1} (1 - f_j) \quad \forall \, k = 2, 3, \dots, n \tag{15.3}$$
and to show
$$h_{n+1} = (1 - h_1 - \dots - h_n) \cdot 1 = 1 \cdot \prod_{j=1}^n (1 - f_j). \tag{15.4}$$
From these equations it clearly follows that $h_j \in C_c(X, [0, 1])$ and that $\mathrm{supp}(h_j) \subset \mathrm{supp}(f_j) \subset U_j$, i.e. $h_j \prec U_j$. Since $\prod_{j=1}^n (1 - f_j) = 0$ on $K$, $\sum_{j=1}^n h_j = 1$ on $K$ and $\{h_j\}_{j=1}^n$ is the desired partition of unity.
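Method 2. Let $g := \sum_{j=1}^n f_j \in C_c(X)$. Then $g \ge 1$ on $K$ and hence $K \subset \{g > \frac{1}{2}\}$. Choose $\varphi \in C_c(X, [0, 1])$ such that $\varphi = 1$ on $K$ and $\mathrm{supp}(\varphi) \subset \{g > \frac{1}{2}\}$ and define $f_0 := 1 - \varphi$. Then $f_0 = 0$ on $K$, $f_0 = 1$ if $g \le \frac{1}{2}$ and therefore
$$f_0 + f_1 + \dots + f_n = f_0 + g > 0$$
on $X$. The desired partition of unity may be constructed as
$$h_j(x) = \frac{f_j(x)}{f_0(x) + \dots + f_n(x)}.$$
Indeed $\mathrm{supp}(h_j) = \mathrm{supp}(f_j) \subset U_j$, $h_j \in C_c(X, [0, 1])$ and on $K$,
$$h_1 + \dots + h_n = \frac{f_1 + \dots + f_n}{f_0 + f_1 + \dots + f_n} = \frac{f_1 + \dots + f_n}{f_1 + \dots + f_n} = 1.$$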
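The product formula of Method 1, $h_k = f_k \prod_{j < k} (1 - f_j)$, is easy to test on the real line. The sketch below assumes the hypothetical data $X = \mathbb{R}$, $K = [0, 3]$, $U_1 = (-1, 2)$, $U_2 = (1, 4)$, with piecewise linear functions standing in for the Urysohn functions: `f1` equals 1 on $F_1 = [0, 1.5]$ and `f2` equals 1 on $F_2 = [1.5, 3]$. The helper names `trapezoid` and `partition_of_unity` are introduced only for this example.

```python
def trapezoid(a, b, c, d):
    """Continuous function: 0 outside (a, d), 1 on [b, c], linear in between."""
    def f(x):
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        if x <= c:
            return 1.0
        return (d - x) / (d - c)
    return f

def partition_of_unity(fs):
    """Method 1 (Eq. 15.3): h_k = f_k * prod_{j < k} (1 - f_j)."""
    def make_h(k):
        def h(x):
            value = fs[k](x)
            for j in range(k):
                value *= 1.0 - fs[j](x)
            return value
        return h
    return [make_h(k) for k in range(len(fs))]

f1 = trapezoid(-0.5, 0.0, 1.5, 1.9)  # supp(f1) = [-0.5, 1.9] inside U_1 = (-1, 2)
f2 = trapezoid(1.1, 1.5, 3.0, 3.5)   # supp(f2) = [1.1, 3.5] inside U_2 = (1, 4)
hs = partition_of_unity([f1, f2])

# Sample K = [0, 3] on a grid and record the sums h_1 + h_2.
sums_on_K = [sum(h(i / 100) for h in hs) for i in range(301)]
```

On $K$ at least one $f_j$ equals 1 at every point, so the product $\prod_j (1 - f_j)$ vanishes there and the sums equal 1 up to rounding, as Eq. (15.4) predicts.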
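Proposition 15.17. Let $(X, \tau)$ be a locally compact and $\sigma$-compact Hausdorff space. Suppose that $\mathcal{U} \subset \tau$ is an open cover of $X$. Then we may construct two locally finite open covers $\mathcal{V} = \{V_i\}_{i=1}^N$ and $\mathcal{W} = \{W_i\}_{i=1}^N$ of $X$ ($N = \infty$ is allowed here) such that:
1. $W_i \subset \bar{W}_i \subset V_i \subset \bar{V}_i$ and $\bar{V}_i$ is compact for all $i$.
2. For each $i$ there exists $U \in \mathcal{U}$ such that $\bar{V}_i \subset U$.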
Proof. By Remark 14.24, there exists an open cover $\mathcal{G} = \{G_n\}_{n=1}^\infty$ of $X$ such that $G_n \subset \bar{G}_n \subset G_{n+1}$. Then $X = \bigcup_{k=1}^\infty (\bar{G}_k \setminus \bar{G}_{k-1})$, where by convention $G_{-1} = G_0 = \emptyset$. For the moment fix $k \ge 1$. For each $x \in \bar{G}_k \setminus G_{k-1}$, let $U_x \in \mathcal{U}$ be chosen so that $x \in U_x$ and by Proposition 15.7 choose an open neighborhood $N_x$ of $x$ such that $\bar{N}_x \subset U_x \cap (G_{k+1} \setminus \bar{G}_{k-2})$, see Figure 15.3 below. Since $\{N_x\}_{x \in \bar{G}_k \setminus G_{k-1}}$ is an open cover of the compact set $\bar{G}_k \setminus G_{k-1}$, there exists a finite subset $\Gamma_k \subset \{N_x\}_{x \in \bar{G}_k \setminus G_{k-1}}$ which also covers $\bar{G}_k \setminus G_{k-1}$.
By construction, for each $W \in \Gamma_k$, there is a $U \in \mathcal{U}$ such that $\bar{W} \subset U \cap (G_{k+1} \setminus \bar{G}_{k-2})$ and by another application of Proposition 15.7, there exists an open set $V_W$ such that $\bar{W} \subset V_W \subset \bar{V}_W \subset U \cap (G_{k+1} \setminus \bar{G}_{k-2})$. We now choose an enumeration $\{W_i\}_{i=1}^N$ of the countable open cover, $\bigcup_{k=1}^\infty \Gamma_k$, of $X$ and define $V_i = V_{W_i}$. Then the collections $\{W_i\}_{i=1}^N$ and $\{V_i\}_{i=1}^N$ are easily checked to satisfy all the conclusions of the proposition. In particular notice that for each $k$, $V_i \cap G_k \neq \emptyset$ for only a finite number of $i$'s.
Fig. 15.3. Constructing the $\{W_i\}_{i=1}^N$.
Theorem 15.18 (Partitions of Unity for $\sigma$-Compact LCH Spaces). Let $(X,\tau)$ be locally compact, $\sigma$-compact and Hausdorff and let $\mathcal{U} \subset \tau$ be an open cover of $X$. Then there exists a partition of unity $\{h_i\}_{i=1}^N$ ($N = \infty$ is allowed here) subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_i)$ is compact for all $i$.
Proof. Let $\mathcal{V} = \{V_i\}_{i=1}^N$ and $\mathcal{W} = \{W_i\}_{i=1}^N$ be open covers of $X$ with the properties described in Proposition 15.17. By Urysohn's Lemma 15.8, there exists $f_i \prec V_i$ such that $f_i = 1$ on $\bar{W}_i$ for each $i$. As in the proof of Proposition 15.16 there are two methods to finish the proof.
Method 1. Define $h_1 = f_1$, $h_j$ by Eq. (15.3) for all other $j$. Then as in Eq. (15.4), for all $n < N + 1$,
\[ 1 - \sum_{j=1}^{n} h_j = \prod_{j=1}^{n} (1 - f_j). \]
For each $x \in X$ we have $f_j(x) = 1$ for some $j$, so the right side vanishes at $x$ once $n \ge j$, and hence $\sum_{j=1}^{N} h_j = 1$ on $X$. As in the proof of Proposition 15.16, it is easily checked that $\{h_i\}_{i=1}^N$ is the desired partition of unity.
Method 2. Let $f := \sum_{i=1}^N f_i$, a locally finite sum, so that $f \in C(X)$. Since $\{W_i\}_{i=1}^N$ is a cover of $X$, $f \ge 1$ on $X$ so that $1/f \in C(X)$ as well. The functions $h_i := f_i / f$ for $i = 1, 2, \dots, N$ give the desired partition of unity.
Lemma 15.19. Let $(X,\tau)$ be a locally compact Hausdorff space.
1. A subset $E \subset X$ is closed iff $E \cap K$ is closed for all $K \sqsubset\sqsubset X$.
2. Let $\{C_\alpha\}_{\alpha \in A}$ be a locally finite collection of closed subsets of $X$, then $C = \bigcup_{\alpha \in A} C_\alpha$ is closed in $X$. (Recall that in general closed sets are only closed under finite unions.)
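Proof. 1. Since compact subsets of Hausdorff spaces are closed, $E \cap K$ is closed if $E$ is closed and $K$ is compact. Now suppose that $E \cap K$ is closed for all compact subsets $K \subset X$ and let $x \in E^c$. Since $X$ is locally compact, there exists a precompact open neighborhood, $V$, of $x$.$^{1}$ By assumption $E \cap \bar{V}$ is closed so $x \in (E \cap \bar{V})^c$, an open subset of $X$. By Proposition 15.7 there exists an open set $U$ such that $x \in U \subset \bar{U} \subset (E \cap \bar{V})^c$, see Figure 15.4. Let $W := U \cap V$. Since
\[ W \cap E = U \cap V \cap E \subset U \cap \bar{V} \cap E = \emptyset, \]
and $W$ is an open neighborhood of $x$ and $x \in E^c$ was arbitrary, we have shown $E^c$ is open hence $E$ is closed.

Fig. 15.4. Showing $E^c$ is open.

2. Let $K$ be a compact subset of $X$ and for each $x \in K$ let $N_x$ be an open neighborhood of $x$ such that $\#\{\alpha \in A : C_\alpha \cap N_x \ne \emptyset\} < \infty$. Since $K$ is compact, there exists a finite subset $\Lambda \subset K$ such that $K \subset \bigcup_{x \in \Lambda} N_x$. Letting $\Lambda_0 := \{\alpha \in A : C_\alpha \cap K \ne \emptyset\}$, then
\[ \#(\Lambda_0) \le \sum_{x \in \Lambda} \#\{\alpha \in A : C_\alpha \cap N_x \ne \emptyset\} < \infty \]
and hence $K \cap \left(\bigcup_{\alpha \in A} C_\alpha\right) = K \cap \left(\bigcup_{\alpha \in \Lambda_0} C_\alpha\right)$. The set $\bigcup_{\alpha \in \Lambda_0} C_\alpha$ is a finite union of closed sets and hence closed. Therefore, $K \cap \left(\bigcup_{\alpha \in A} C_\alpha\right)$ is closed and by item 1. it follows that $\bigcup_{\alpha \in A} C_\alpha$ is closed as well.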
$^{1}$ If $X$ were a metric space we could finish the proof as follows. If there does not exist an open neighborhood of $x$ which is disjoint from $E$, then there would exist $x_n \in E$ such that $x_n \to x$. Since $E \cap \bar{V}$ is closed and $x_n \in E \cap \bar{V}$ for all large $n$, it follows (see Exercise 6.4) that $x \in E \cap \bar{V}$ and in particular that $x \in E$. But we chose $x \in E^c$.
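Corollary 15.20. Let $(X,\tau)$ be a locally compact and $\sigma$-compact Hausdorff space and $\mathcal{U} = \{U_\alpha\}_{\alpha \in A} \subset \tau$ be an open cover of $X$. Then there exists a partition of unity $\{h_\alpha\}_{\alpha \in A}$ subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_\alpha) \subset U_\alpha$ for all $\alpha \in A$. (Notice that we do not assert that $h_\alpha$ has compact support. However if $\bar{U}_\alpha$ is compact then $\operatorname{supp}(h_\alpha)$ will be compact.)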
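Proof. By the $\sigma$-compactness of $X$, we may choose a countable subset, $\{\alpha_i\}_{i=1}^N$ ($N = \infty$ allowed here), of $A$ such that $\{U_i := U_{\alpha_i}\}_{i=1}^N$ is still an open cover of $X$. Let $\{g_j\}_{j=1}^\infty$ be a partition of unity$^{2}$ subordinate to the cover $\{U_i\}_{i=1}^N$ as in Theorem 15.18. Define $\tilde{\Gamma}_k := \{j : \operatorname{supp}(g_j) \subset U_k\}$ and $\Gamma_k := \tilde{\Gamma}_k \setminus \left(\bigcup_{j=1}^{k-1} \tilde{\Gamma}_j\right)$, where by convention $\tilde{\Gamma}_0 = \emptyset$. Then
\[ \mathbb{N} = \bigcup_{k=1}^\infty \tilde{\Gamma}_k = \coprod_{k=1}^\infty \Gamma_k. \]
If $\Gamma_k = \emptyset$ let $\tilde{h}_k := 0$, otherwise let $\tilde{h}_k := \sum_{j \in \Gamma_k} g_j$, a locally finite sum. Then
\[ \sum_{k=1}^N \tilde{h}_k = \sum_{j=1}^\infty g_j = 1. \]
By Item 2. of Lemma 15.19, $\bigcup_{j \in \Gamma_k} \operatorname{supp}(g_j)$ is closed and therefore,
\[ \operatorname{supp}(\tilde{h}_k) = \overline{\{\tilde{h}_k \ne 0\}} = \overline{\bigcup_{j \in \Gamma_k} \{g_j \ne 0\}} \subset \bigcup_{j \in \Gamma_k} \operatorname{supp}(g_j) \subset U_k \]
and hence $\tilde{h}_k \prec U_k$ and the sum $\sum_{k=1}^N \tilde{h}_k$ is still locally finite. (Why?) The desired partition of unity is now formed by letting $h_{\alpha_k} := \tilde{h}_k$ for $k < N + 1$ and $h_\alpha \equiv 0$ if $\alpha \notin \{\alpha_i\}_{i=1}^N$.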
Corollary 15.21. Let $(X,\tau)$ be a locally compact and $\sigma$-compact Hausdorff space and $A, B$ be disjoint closed subsets of $X$. Then there exists $f \in C(X,[0,1])$ such that $f = 1$ on $A$ and $f = 0$ on $B$. In fact $f$ can be chosen so that $\operatorname{supp}(f) \subset B^c$.

Proof. Let $U_1 = A^c$ and $U_2 = B^c$, then $\{U_1, U_2\}$ is an open cover of $X$. By Corollary 15.20 there exist $h_1, h_2 \in C(X,[0,1])$ such that $\operatorname{supp}(h_i) \subset U_i$ for $i = 1, 2$ and $h_1 + h_2 = 1$ on $X$. The function $f = h_2$ satisfies the desired properties.
15.3 $C_0(X)$ and the Alexanderov Compactification

Definition 15.22. Let $(X,\tau)$ be a topological space. A continuous function $f : X \to \mathbb{C}$ is said to vanish at infinity if $\{|f| \ge \varepsilon\}$ is compact in $X$ for all $\varepsilon > 0$. The functions, $f \in C(X)$, vanishing at infinity will be denoted by $C_0(X)$. (Notice that $C_0(X) = C(X)$ whenever $X$ is compact.)

$^{2}$ So as to simplify the indexing we assume there are a countable number of $g_j$'s. This can always be arranged by taking $g_k \equiv 0$ for large $k$ if necessary.
Proposition 15.23. Let $X$ be a topological space, $BC(X)$ be the space of bounded continuous functions on $X$ with the supremum norm topology. Then
1. $C_0(X)$ is a closed subspace of $BC(X)$.
2. If we further assume that $X$ is a locally compact Hausdorff space, then $C_0(X) = \overline{C_c(X)}$.
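Proof.
1. If $f \in C_0(X)$, $K_1 := \{|f| \ge 1\}$ is a compact subset of $X$ and therefore $f(K_1)$ is a compact and hence bounded subset of $\mathbb{C}$ and so $M := \sup_{x \in K_1} |f(x)| < \infty$. Therefore $\|f\|_\infty \le M \vee 1 < \infty$ showing $f \in BC(X)$.

Now suppose $f_n \in C_0(X)$ and $f_n \to f$ in $BC(X)$. Let $\varepsilon > 0$ be given and choose $n$ sufficiently large so that $\|f - f_n\|_\infty \le \varepsilon/2$. Since
\[ |f| \le |f_n| + |f - f_n| \le |f_n| + \|f - f_n\|_\infty \le |f_n| + \varepsilon/2, \]
\[ \{|f| \ge \varepsilon\} \subset \{|f_n| + \varepsilon/2 \ge \varepsilon\} = \{|f_n| \ge \varepsilon/2\}. \]
Because $\{|f| \ge \varepsilon\}$ is a closed subset of the compact set $\{|f_n| \ge \varepsilon/2\}$, $\{|f| \ge \varepsilon\}$ is compact and we have shown $f \in C_0(X)$.

2. Since $C_0(X)$ is a closed subspace of $BC(X)$ and $C_c(X) \subset C_0(X)$, we always have $\overline{C_c(X)} \subset C_0(X)$. Now suppose that $f \in C_0(X)$ and let $K_n := \{|f| \ge \tfrac1n\} \sqsubset\sqsubset X$. By Lemma 15.8 we may choose $\phi_n \in C_c(X,[0,1])$ such that $\phi_n \equiv 1$ on $K_n$. Define $f_n := \phi_n f \in C_c(X)$. Then
\[ \|f - f_n\|_u = \|(1 - \phi_n) f\|_\infty \le \frac1n \to 0 \text{ as } n \to \infty. \]
This shows that $f \in \overline{C_c(X)}$.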
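The cutoff argument in item 2 can be tried out numerically. The sketch below assumes $X = \mathbb{R}$ and the example function $f(x) = 1/(1+x^2) \in C_0(\mathbb{R})$, for which $K_n = \{|f| \ge 1/n\} = [-\sqrt{n-1}, \sqrt{n-1}]$; the piecewise linear cutoff `cutoff(n)` is one hypothetical choice of $\phi_n$, and the supremum is only sampled on a finite grid.

```python
import math

def f(x):
    # Example element of C_0(R); an assumption made for illustration.
    return 1.0 / (1.0 + x * x)

def cutoff(n):
    """A choice of phi_n: equal to 1 on K_n = [-r, r], 0 outside [-r-1, r+1]."""
    r = math.sqrt(n - 1)
    def phi(x):
        return min(1.0, max(0.0, r + 1.0 - abs(x)))
    return phi

def sup_error(n, lo=-50.0, hi=50.0, steps=20001):
    """Grid approximation of sup |f - phi_n f| over [lo, hi]."""
    phi = cutoff(n)
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        worst = max(worst, abs(f(x) - phi(x) * f(x)))
    return worst
```

On the grid, `sup_error(n)` stays below `1/n`, as the estimate $\|(1-\phi_n) f\|_\infty \le 1/n$ predicts: the difference is supported off $K_n$, where $|f| < 1/n$.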
Proposition 15.24 (Alexanderov Compactification). Suppose that $(X,\tau)$ is a non-compact locally compact Hausdorff space. Let $X^* = X \cup \{\infty\}$, where $\{\infty\}$ is a new symbol not in $X$. The collection of sets,
\[ \tau^* = \tau \cup \{X^* \setminus K : K \sqsubset\sqsubset X\} \subset 2^{X^*}, \]
is a topology on $X^*$ and $(X^*, \tau^*)$ is a compact Hausdorff space. Moreover $f \in C(X)$ extends continuously to $X^*$ iff $f = g + c$ with $g \in C_0(X)$ and $c \in \mathbb{C}$, in which case the extension is given by $f(\infty) = c$.
Proof. 1. ($\tau^*$ is a topology.) Let $\mathcal{F} := \{F \subset X^* : X^* \setminus F \in \tau^*\}$, i.e. $F \in \mathcal{F}$ iff $F$ is a compact subset of $X$ or $F = F_0 \cup \{\infty\}$ with $F_0$ being a closed subset of $X$. Since the finite union of compact (closed) subsets is compact (closed), it is easily seen that $\mathcal{F}$ is closed under finite unions. Because arbitrary intersections of closed subsets of $X$ are closed and closed subsets of compact subsets of $X$ are compact, it is also easily checked that $\mathcal{F}$ is closed under arbitrary intersections. Therefore $\mathcal{F}$ satisfies the axioms of the closed subsets associated to a topology and hence $\tau^*$ is a topology.

2. ($(X^*, \tau^*)$ is a Hausdorff space.) It suffices to show any point $x \in X$ can be separated from $\infty$. To do this use Proposition 15.7 to find an open precompact neighborhood, $U$, of $x$. Then $U$ and $V := X^* \setminus \bar{U}$ are disjoint open subsets of $X^*$ such that $x \in U$ and $\infty \in V$.

3. ($(X^*, \tau^*)$ is compact.) Suppose that $\mathcal{U} \subset \tau^*$ is an open cover of $X^*$. Since $\mathcal{U}$ covers $\infty$, there exists a compact set $K \subset X$ such that $X^* \setminus K \in \mathcal{U}$. Clearly $X$ is covered by $\mathcal{U}_0 := \{V \setminus \{\infty\} : V \in \mathcal{U}\}$ and by the definition of $\tau^*$ (or using $(X^*, \tau^*)$ is Hausdorff), $\mathcal{U}_0$ is an open cover of $X$. In particular $\mathcal{U}_0$ is an open cover of $K$ and since $K$ is compact there exists $\Lambda \subset\subset \mathcal{U}$ such that $K \subset \bigcup \{V \setminus \{\infty\} : V \in \Lambda\}$. It is now easily checked that $\Lambda \cup \{X^* \setminus K\} \subset \mathcal{U}$ is a finite subcover of $X^*$.

4. (Continuous functions on $C(X^*)$ statements.) Let $i : X \to X^*$ be the inclusion map. Then $i$ is continuous and open, i.e. $i(V)$ is open in $X^*$ for all $V$ open in $X$. If $f \in C(X^*)$, then $g = f|_X - f(\infty) = f \circ i - f(\infty)$ is continuous on $X$. Moreover, for all $\varepsilon > 0$ there exists an open neighborhood $V \in \tau^*$ of $\infty$ such that
\[ |g(x)| = |f(x) - f(\infty)| < \varepsilon \text{ for all } x \in V \setminus \{\infty\}. \]
Since $V$ is an open neighborhood of $\infty$, there exists a compact subset, $K \subset X$, such that $V = X^* \setminus K$. By the previous equation we see that $\{x \in X : |g(x)| \ge \varepsilon\} \subset K$, so $\{|g| \ge \varepsilon\}$ is compact and we have shown $g$ vanishes at $\infty$.

Conversely if $g \in C_0(X)$, extend $g$ to $X^*$ by setting $g(\infty) = 0$. Given $\varepsilon > 0$, the set $K = \{|g| \ge \varepsilon\}$ is compact, hence $X^* \setminus K$ is open in $X^*$. Since $g(X^* \setminus K) \subset (-\varepsilon, \varepsilon)$ we have shown that $g$ is continuous at $\infty$. Since $g$ is also continuous at all points in $X$ it follows that $g$ is continuous on $X^*$. Now if $f = g + c$ with $c \in \mathbb{C}$ and $g \in C_0(X)$, it follows by what we just proved that defining $f(\infty) = c$ extends $f$ to a continuous function on $X^*$.
Example 15.25. Let $X$ be an uncountable set and $\tau$ be the discrete topology on $X$. Let $(X^* = X \cup \{\infty\}, \tau^*)$ be the one point compactification of $X$. The smallest dense subset of $X^*$ is the uncountable set $X$. Hence $X^*$ is a compact but non-separable and hence non-metrizable space.
Exercise 15.4. Let $X := \{0,1\}^{\mathbb{R}}$ and $\tau$ be the product topology on $X$ where $\{0,1\}$ is equipped with the discrete topology. Show $(X,\tau)$ is separable. (Combining this with Exercise 13.9 and Tychonoff's Theorem 14.34, we see that $(X,\tau)$ is compact and separable but not first countable.)
Solution to Exercise (15.4). We begin by observing that a basic open neighborhood of $g \in X$ is of the form
\[ V_\Lambda := \{f \in X : f = g \text{ on } \Lambda\} \]
where $\Lambda \subset\subset \mathbb{R}$. Therefore to see that $X$ is separable, we must find a countable set $D \subset X$ such that for any $g \in X$ ($g : \mathbb{R} \to \{0,1\}$) and any $\Lambda \subset\subset \mathbb{R}$, there exists $f \in D$ such that $f = g$ on $\Lambda$.

Kevin Costello's construction. Let
\[ M_{m,k} := 1_{[k/m,\,(k+1)/m)} \]
be the characteristic function of the interval $[k/m, (k+1)/m)$ and let $D \subset \{0,1\}^{\mathbb{R}}$ be the set of all finite sums of $M_{m,k}$ which still have range in $\{0,1\}$, i.e. the set of sums over disjoint intervals.

Now suppose $g \in \{0,1\}^{\mathbb{R}}$ and $\Lambda \subset\subset \mathbb{R}$. Let
\[ S := \{x \in \Lambda : g(x) = 0\} \text{ and } T = \{x \in \Lambda : g(x) = 1\}. \]
Then $\Lambda = S \coprod T$ and we may take intervals $J_t := [k/m, (k+1)/m) \ni t$ for each $t \in T$ which are small enough to be disjoint and not contain any points in $S$. Then $f = \sum_{t \in T} 1_{J_t} \in D$ and $f = g$ on $\Lambda$ showing $f \in V_\Lambda$.
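The construction above can be sketched in code for a finite set $\Lambda$ of rational points. The function name `costello_approximant` and the choice of a single common mesh $m$ for all intervals are assumptions made for simplicity; the text allows a different interval length for each $t$. Points of $T$ that land in the same interval share it, so the sum is taken over the distinct intervals.

```python
from fractions import Fraction
from math import floor

def costello_approximant(g_on_lambda):
    """Given g restricted to a finite set (dict: point -> 0 or 1), return
    (m, ks, f) where f is the sum of the indicators of [k/m, (k+1)/m), k in ks,
    and f agrees with g at every listed point."""
    S = [x for x, v in g_on_lambda.items() if v == 0]
    T = [x for x, v in g_on_lambda.items() if v == 1]
    m = 1
    # Refine the mesh until no interval of length 1/m holds points of both S and T.
    # This terminates because S and T are finite and disjoint.
    while {floor(t * m) for t in T} & {floor(s * m) for s in S}:
        m += 1
    ks = sorted({floor(t * m) for t in T})
    index_set = set(ks)

    def f(x):
        # x lies in [k/m, (k+1)/m) exactly when floor(x * m) == k.
        return 1 if floor(x * m) in index_set else 0

    return m, ks, f

# Usage with hypothetical data: g = 1 at 1/3 and -7/4, g = 0 at 1/2 and 2.
g_values = {Fraction(1, 3): 1, Fraction(1, 2): 0, Fraction(-7, 4): 1, Fraction(2): 0}
m, ks, f = costello_approximant(g_values)
```

Since only countably many pairs $(m, \text{finite set of } k)$ exist, the functions produced this way range over a countable family, which is the point of the exercise.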
The next proposition gathers a number of results involving countability
assumptions which have appeared in the exercises.
Proposition 15.26 (Summary). Let $(X,\tau)$ be a topological space.
1. If $(X,\tau)$ is second countable, then $(X,\tau)$ is separable; see Exercise 13.11.
2. If $(X,\tau)$ is separable and metrizable then $(X,\tau)$ is second countable; see Exercise 13.12.
3. If $(X,\tau)$ is locally compact and metrizable then $(X,\tau)$ is $\sigma$-compact iff $(X,\tau)$ is separable; see Exercises 14.10 and 14.11.
4. If $(X,\tau)$ is locally compact and second countable, then $(X,\tau)$ is $\sigma$-compact, see Exercise 14.8.
5. If $(X,\tau)$ is locally compact and metrizable, then $(X,\tau)$ is $\sigma$-compact iff $(X,\tau)$ is separable, see Exercises 14.9 and 14.10.
6. There exist spaces, $(X,\tau)$, which are both compact and separable but not first countable and in particular not metrizable, see Exercise 15.4.
15.4 Stone-Weierstrass Theorem
We now wish to generalize Theorem 10.35 to more general topological spaces. We will first need some definitions.

Definition 15.27. Let $X$ be a topological space and $\mathcal{A} \subset C(X) = C(X,\mathbb{R})$ or $C(X,\mathbb{C})$ be a collection of functions. Then
1. $\mathcal{A}$ is said to separate points if for all distinct points $x, y \in X$ there exists $f \in \mathcal{A}$ such that $f(x) \ne f(y)$.
2. $\mathcal{A}$ is an algebra if $\mathcal{A}$ is a vector subspace of $C(X)$ which is closed under pointwise multiplication. (Note well: we do not assume $1 \in \mathcal{A}$.)
3. $\mathcal{A} \subset C(X,\mathbb{R})$ is called a lattice if $f \vee g := \max(f,g)$ and $f \wedge g := \min(f,g)$ are in $\mathcal{A}$ for all $f, g \in \mathcal{A}$.
4. $\mathcal{A} \subset C(X,\mathbb{C})$ is closed under conjugation if $\bar{f} \in \mathcal{A}$ whenever $f \in \mathcal{A}$.
Remark 15.28. If $X$ is a topological space such that $C(X,\mathbb{R})$ separates points then $X$ is Hausdorff. Indeed if $x, y \in X$ and $f \in C(X,\mathbb{R})$ are such that $f(x) \ne f(y)$, then $f^{-1}(I)$ and $f^{-1}(J)$ are disjoint open sets containing $x$ and $y$ respectively when $I$ and $J$ are disjoint open intervals containing $f(x)$ and $f(y)$ respectively.
Lemma 15.29. If $\mathcal{A}$ is a closed sub-algebra of $BC(X,\mathbb{R})$ then $|f| \in \mathcal{A}$ for all $f \in \mathcal{A}$ and $\mathcal{A}$ is a lattice.

Proof. Let $f \in \mathcal{A}$ and let $M = \sup_{x \in X} |f(x)|$. Using Theorem 10.35 or Exercise 15.12, there are polynomials $p_n(t)$ such that
\[ \lim_{n \to \infty} \sup_{|t| \le M} \big| |t| - p_n(t) \big| = 0. \]
By replacing $p_n$ by $p_n - p_n(0)$ if necessary we may assume that $p_n(0) = 0$. Since $\mathcal{A}$ is an algebra, it follows that $f_n = p_n(f) \in \mathcal{A}$ and $|f| \in \mathcal{A}$, because $|f|$ is the uniform limit of the $f_n$'s. Since
\[ f \vee g = \tfrac12 (f + g + |f - g|) \quad \text{and} \quad f \wedge g = \tfrac12 (f + g - |f - g|), \]
we have shown $\mathcal{A}$ is a lattice.
Lemma 15.30. Let $\mathcal{A} \subset C(X,\mathbb{R})$ be an algebra which separates points and suppose $x$ and $y$ are distinct points of $X$. If there exist $f, g \in \mathcal{A}$ such that
\[ f(x) \ne 0 \text{ and } g(y) \ne 0, \tag{15.5} \]
then
\[ V := \{(f(x), f(y)) : f \in \mathcal{A}\} = \mathbb{R}^2. \tag{15.6} \]

Proof. It is clear that $V$ is a non-zero subspace of $\mathbb{R}^2$. If $\dim(V) = 1$, then $V = \operatorname{span}(a,b)$ for some $(a,b) \in \mathbb{R}^2$ which, necessarily by Eq. (15.5), satisfy $a \ne 0 \ne b$. Since $(a,b) = (f(x), f(y))$ for some $f \in \mathcal{A}$ and $f^2 \in \mathcal{A}$, it follows that $(a^2, b^2) = (f^2(x), f^2(y)) \in V$ as well. Since $\dim V = 1$, $(a,b)$ and $(a^2,b^2)$ are linearly dependent and therefore
\[ 0 = \det \begin{pmatrix} a & b \\ a^2 & b^2 \end{pmatrix} = ab^2 - a^2 b = ab(b - a) \]
which implies that $a = b$. But this then implies that $f(x) = f(y)$ for all $f \in \mathcal{A}$, violating the assumption that $\mathcal{A}$ separates points. Therefore we conclude that $\dim(V) = 2$, i.e. $V = \mathbb{R}^2$.
Theorem 15.31 (Stone-Weierstrass Theorem). Suppose $X$ is a locally compact Hausdorff space and $\mathcal{A} \subset C_0(X,\mathbb{R})$ is a closed subalgebra which separates points. For $x \in X$ let
\[ \mathcal{A}_x := \{f(x) : f \in \mathcal{A}\} \quad \text{and} \quad \mathcal{I}_x = \{f \in C_0(X,\mathbb{R}) : f(x) = 0\}. \]
Then exactly one of the following two cases holds.
1. $\mathcal{A} = C_0(X,\mathbb{R})$ or
2. there exists a unique point $x_0 \in X$ such that $\mathcal{A} = \mathcal{I}_{x_0}$.
Moreover, case 1. holds iff $\mathcal{A}_x = \mathbb{R}$ for all $x \in X$ and case 2. holds iff there exists a point $x_0 \in X$ such that $\mathcal{A}_{x_0} = \{0\}$.
Proof. If there exists $x_0$ such that $\mathcal{A}_{x_0} = \{0\}$ ($x_0$ is unique since $\mathcal{A}$ separates points) then $\mathcal{A} \subset \mathcal{I}_{x_0}$. If such an $x_0$ exists let $\mathcal{C} = \mathcal{I}_{x_0}$ and if $\mathcal{A}_x = \mathbb{R}$ for all $x$, set $\mathcal{C} = C_0(X,\mathbb{R})$. Let $f \in \mathcal{C}$ be given. By Lemma 15.30, for all $x, y \in X$ such that $x \ne y$, there exists $g_{xy} \in \mathcal{A}$ such that $f = g_{xy}$ on $\{x, y\}$.$^{3}$ When $X$ is compact the basic idea of the proof is contained in the following identity,
\[ f(z) = \inf_{x \in X} \sup_{y \in X} g_{xy}(z) \text{ for all } z \in X. \tag{15.7} \]
To prove this identity, let $g_x := \sup_{y \in X} g_{xy}$ and notice that $g_x \ge f$ since $g_{xy}(y) = f(y)$ for all $y \in X$. Moreover, $g_x(x) = f(x)$ for all $x \in X$ since $g_{xy}(x) = f(x)$ for all $x$. Therefore,
\[ \inf_{x \in X} \sup_{y \in X} g_{xy} = \inf_{x \in X} g_x = f. \]
The rest of the proof is devoted to replacing the inf and the sup above by min and max over finite sets at the expense of Eq. (15.7) becoming only an approximate identity. We also have to modify Eq. (15.7) slightly to take care of the non-compact case.
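Claim. Given $\varepsilon > 0$ and $x \in X$ there exists $g_x \in \mathcal{A}$ such that $g_x(x) = f(x)$ and $f < g_x + \varepsilon$ on $X$.

To prove this, let $V_y$ be an open neighborhood of $y$ such that $|f - g_{xy}| < \varepsilon$ on $V_y$; in particular $f < \varepsilon + g_{xy}$ on $V_y$. Also let $g_{x,\infty}$ be any fixed element in $\mathcal{A}$ such that $g_{x,\infty}(x) = f(x)$ and let
\[ K = \left\{|f| \ge \tfrac{\varepsilon}{2}\right\} \cup \left\{|g_{x,\infty}| \ge \tfrac{\varepsilon}{2}\right\}. \tag{15.8} \]
Since $K$ is compact, there exists $\Lambda \subset\subset K$ such that $K \subset \bigcup_{y \in \Lambda} V_y$. Define
\[ g_x(z) = \max\{g_{xy}(z) : y \in \Lambda \cup \{\infty\}\}. \]
Since
\[ f < \varepsilon + g_{xy} \le \varepsilon + g_x \text{ on } V_y, \]
for any $y \in \Lambda$, and
\[ f < \tfrac{\varepsilon}{2} < \varepsilon + g_{x,\infty} \le g_x + \varepsilon \text{ on } K^c, \]
$f < \varepsilon + g_x$ on $X$ and by construction $f(x) = g_x(x)$, see Figure 15.5. This completes the proof of the claim.

$^{3}$ If $\mathcal{A}_{x_0} = \{0\}$ and $x = x_0$ or $y = x_0$, then $g_{xy}$ exists merely by the fact that $\mathcal{A}$ separates points.

Fig. 15.5. Constructing the "dominating approximates," $g_x$ for each $x \in X$.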
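To complete the proof of the theorem, let $g_\infty$ be a fixed element of $\mathcal{A}$ such that $f < g_\infty + \varepsilon$ on $X$; for example let $g_\infty = g_{x_0} \in \mathcal{A}$ for some fixed $x_0 \in X$. For each $x \in X$, let $U_x$ be a neighborhood of $x$ such that $|f - g_x| < \varepsilon$ on $U_x$. Choose
\[ \Gamma \subset\subset F := \left\{|f| \ge \tfrac{\varepsilon}{2}\right\} \cup \left\{|g_\infty| \ge \tfrac{\varepsilon}{2}\right\} \]
such that $F \subset \bigcup_{x \in \Gamma} U_x$ ($\Gamma$ exists since $F$ is compact) and define
\[ g = \min\{g_x : x \in \Gamma \cup \{\infty\}\} \in \mathcal{A}. \]
Then, for $x \in F$, $g_x < f + \varepsilon$ on $U_x$ and hence $g < f + \varepsilon$ on $\bigcup_{x \in \Gamma} U_x \supset F$. Likewise,
\[ g \le g_\infty < \varepsilon/2 < f + \varepsilon \text{ on } F^c. \]
Therefore we have now shown,
\[ f < g + \varepsilon \text{ and } g < f + \varepsilon \text{ on } X, \]
i.e. $|f - g| < \varepsilon$ on $X$. Since $\varepsilon > 0$ is arbitrary it follows that $f \in \bar{\mathcal{A}} = \mathcal{A}$ and so $\mathcal{A} = \mathcal{C}$.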
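Corollary 15.32 (Complex Stone-Weierstrass Theorem). Let $X$ be a locally compact Hausdorff space. Suppose $\mathcal{A} \subset C_0(X,\mathbb{C})$ is closed in the uniform topology, separates points, and is closed under complex conjugation. Then either $\mathcal{A} = C_0(X,\mathbb{C})$ or
\[ \mathcal{A} = \mathcal{I}^{\mathbb{C}}_{x_0} := \{f \in C_0(X,\mathbb{C}) : f(x_0) = 0\} \]
for some $x_0 \in X$.

Proof. Since
\[ \operatorname{Re} f = \frac{f + \bar{f}}{2} \quad \text{and} \quad \operatorname{Im} f = \frac{f - \bar{f}}{2i}, \]
$\operatorname{Re} f$ and $\operatorname{Im} f$ are both in $\mathcal{A}$. Therefore
\[ \mathcal{A}_{\mathbb{R}} = \{\operatorname{Re} f, \operatorname{Im} f : f \in \mathcal{A}\} \]
is a real sub-algebra of $C_0(X,\mathbb{R})$ which separates points. Therefore either $\mathcal{A}_{\mathbb{R}} = C_0(X,\mathbb{R})$ or $\mathcal{A}_{\mathbb{R}} = \mathcal{I}_{x_0} \cap C_0(X,\mathbb{R})$ for some $x_0$ and hence $\mathcal{A} = C_0(X,\mathbb{C})$ or $\mathcal{I}^{\mathbb{C}}_{x_0}$ respectively.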
As an easy application, Theorem 15.31 and Corollary 15.32 imply Theorem
10.35 and Corollary 10.37 respectively. Here are a few more applications.
Example 15.33. Let $f \in C([a,b])$ be a positive function which is injective. Then functions of the form $\sum_{k=1}^N a_k f^k$ with $a_k \in \mathbb{C}$ and $N \in \mathbb{N}$ are dense in $C([a,b])$. For example if $a = 1$ and $b = 2$, then one may take $f(x) = x^\alpha$ for any $\alpha \ne 0$, or $f(x) = e^x$, etc.
Exercise 15.5. Let $(X,d)$ be a separable compact metric space. Show that $C(X)$ is also separable. Hint: Let $E \subset X$ be a countable dense set and then consider the algebra, $\mathcal{A} \subset C(X)$, generated by $\{d(x,\cdot)\}_{x \in E}$.
Example 15.34. Let $X = [0,\infty)$, $\lambda > 0$ be fixed, $\mathcal{A}$ be the real algebra generated by $t \to e^{-\lambda t}$. So the general element $f \in \mathcal{A}$ is of the form $f(t) = p(e^{-\lambda t})$, where $p(x)$ is a polynomial function in $x$ with real coefficients (and zero constant term, since $\mathcal{A}$ is not assumed to contain the constants). Since $\mathcal{A} \subset C_0(X,\mathbb{R})$ separates points and $e^{-\lambda t} \in \mathcal{A}$ is pointwise positive, $\bar{\mathcal{A}} = C_0(X,\mathbb{R})$.

As an application of Example 15.34, suppose that $g \in C_c(X,\mathbb{R})$ satisfies,
\[ \int_0^\infty g(t)\, e^{-\lambda t}\, dt = 0 \text{ for all } \lambda > 0. \tag{15.9} \]
(Note well that the integral in Eq. (15.9) is really over a finite interval since $g$ is compactly supported.) Equation (15.9) along with linearity of the Riemann integral implies
\[ \int_0^\infty g(t) f(t)\, dt = 0 \text{ for all } f \in \mathcal{A}. \]
We may now choose $f_n \in \mathcal{A}$ such that $f_n \to g$ uniformly and therefore, using the continuity of the Riemann integral under uniform convergence (see Proposition 10.5),
\[ 0 = \lim_{n \to \infty} \int_0^\infty g(t) f_n(t)\, dt = \int_0^\infty g^2(t)\, dt. \]
From this last equation it is easily deduced, using the continuity of $g$, that $g \equiv 0$. See Theorem 22.12 below, where this is done in greater generality.
15.5 *More on Separation Axioms: Normal Spaces
(This section may safely be omitted on the first reading.)

Definition 15.35 ($T_0$ – $T_2$ Separation Axioms). Let $(X,\tau)$ be a topological space. The topology $\tau$ is said to be:
1. $T_0$ if for $x \ne y$ in $X$ there exists $V \in \tau$ such that $x \in V$ and $y \notin V$, or $V \in \tau$ such that $y \in V$ but $x \notin V$.
2. $T_1$ if for every $x, y \in X$ with $x \ne y$ there exists $V \in \tau$ such that $x \in V$ and $y \notin V$. Equivalently, $\tau$ is $T_1$ iff all one point subsets of $X$ are closed.$^{4}$
3. $T_2$ if it is Hausdorff.

Note $T_2$ implies $T_1$ which implies $T_0$. The topology in Example 15.1 is $T_0$ but not $T_1$. If $X$ is a finite set and $\tau$ is a $T_1$ topology on $X$ then $\tau = 2^X$. To prove this let $x \in X$ be fixed. Then for every $y \ne x$ in $X$ there exists $V_y \in \tau$ such that $x \in V_y$ while $y \notin V_y$. Thus $\{x\} = \bigcap_{y \ne x} V_y \in \tau$, showing $\tau$ contains all one point subsets of $X$ and therefore all subsets of $X$. So we have to look to infinite sets for an example of a $T_1$ topology which is not $T_2$.
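The claim that a $T_1$ topology on a finite set must be discrete can be confirmed by brute force on a small set. The sketch below enumerates every family of subsets of a three point set; the set `X` and the helper names `is_topology` and `is_T1` are illustrative assumptions. For a finite family, closure under pairwise unions and intersections is enough to be a topology.

```python
from itertools import combinations

X = frozenset({0, 1, 2})
# All 8 subsets of X.
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

def is_topology(tau):
    """Check the topology axioms for a finite family of subsets of X."""
    if frozenset() not in tau or X not in tau:
        return False
    return all((a | b) in tau and (a & b) in tau for a in tau for b in tau)

def is_T1(tau):
    """For every ordered pair x != y there is an open set containing x but not y."""
    return all(any(x in v and y not in v for v in tau)
               for x in X for y in X if x != y)

# Enumerate all 2^8 families of subsets and keep those that are topologies.
topologies = []
for r in range(len(subsets) + 1):
    for family in combinations(subsets, r):
        tau = frozenset(family)
        if is_topology(tau):
            topologies.append(tau)

t1_topologies = [tau for tau in topologies if is_T1(tau)]
```

Only the discrete topology survives the `is_T1` filter, in line with the argument $\{x\} = \bigcap_{y \ne x} V_y$ given above.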
Example 15.36. Let $X$ be any infinite set and let $\tau = \{A \subset X : \#(A^c) < \infty\} \cup \{\emptyset\}$, the so called cofinite topology. This topology is $T_1$ because if $x \ne y$ in $X$, then $V = \{x\}^c \in \tau$ with $x \notin V$ while $y \in V$. This topology however is not $T_2$. Indeed if $U, V \in \tau$ are open sets such that $x \in U$, $y \in V$ and $U \cap V = \emptyset$ then $U \subset V^c$. But this implies $\#(U) < \infty$ which is impossible unless $U = \emptyset$, which is impossible since $x \in U$.

The uniqueness of limits of sequences which occurs for Hausdorff topologies (see Remark 15.3) need not occur for $T_1$ spaces. For example, let $X = \mathbb{N}$ and $\tau$ be the cofinite topology on $X$ as in Example 15.36. Then $x_n = n$ is a sequence in $X$ such that $x_n \to x$ as $n \to \infty$ for all $x \in \mathbb{N}$. For the most part we will avoid these pathologies in the future by only considering Hausdorff topologies.
$^{4}$ If one point subsets are closed and $x \ne y$ in $X$ then $V := \{x\}^c$ is an open set containing $y$ but not $x$. Conversely if $\tau$ is $T_1$ and $x \in X$ there exists $V_y \in \tau$ such that $y \in V_y$ and $x \notin V_y$ for all $y \ne x$. Therefore, $\{x\}^c = \bigcup_{y \ne x} V_y \in \tau$.
Definition 15.37 (Normal Spaces: $T_4$ Separation Axiom). A topological space $(X,\tau)$ is said to be normal or $T_4$ if:
1. $X$ is Hausdorff and
2. for any two closed disjoint subsets $A, B \subset X$ there exist disjoint open sets $V, W \subset X$ such that $A \subset V$ and $B \subset W$.

Example 15.38. By Lemma 6.15 and Corollary 15.21 it follows that metric spaces and topological spaces which are locally compact, $\sigma$-compact and Hausdorff (in particular compact Hausdorff spaces) are normal. Indeed, in each case if $A, B$ are disjoint closed subsets of $X$, there exists $f \in C(X,[0,1])$ such that $f = 1$ on $A$ and $f = 0$ on $B$. Now let $U = \{f > \tfrac12\}$ and $V = \{f < \tfrac12\}$.
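Remark 15.39. A topological space, $(X,\tau)$, is normal iff for any $C \subset W \subset X$ with $C$ being closed and $W$ being open there exists an open set $U \subset_o X$ such that
\[ C \subset U \subset \bar{U} \subset W. \]
To prove this first suppose $X$ is normal. Since $W^c$ is closed and $C \cap W^c = \emptyset$, there exist disjoint open sets $U$ and $V$ such that $C \subset U$ and $W^c \subset V$. Therefore $C \subset U \subset V^c \subset W$ and since $V^c$ is closed, $C \subset U \subset \bar{U} \subset V^c \subset W$.

For the converse direction suppose $A$ and $B$ are disjoint closed subsets of $X$. Then $A \subset B^c$ and $B^c$ is open, and so by assumption there exists $U \subset_o X$ such that $A \subset U \subset \bar{U} \subset B^c$ and by the same token there exists $W \subset_o X$ such that $\bar{U} \subset W \subset \bar{W} \subset B^c$. Taking complements of the last expression implies
\[ B \subset \bar{W}^c \subset W^c \subset \bar{U}^c. \]
Let $V = \bar{W}^c$. Then $A \subset U \subset_o X$, $B \subset V \subset_o X$ and $U \cap V \subset U \cap W^c = \emptyset$.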
Theorem 15.40 (Urysohn's Lemma for Normal Spaces). Let $X$ be a normal space. Assume $A, B$ are disjoint closed subsets of $X$. Then there exists $f \in C(X,[0,1])$ such that $f = 0$ on $A$ and $f = 1$ on $B$.

Proof. To make the notation match Lemma 15.8, let $U = A^c$ and $K = B$. Then $K \subset U$ and it suffices to produce a function $f \in C(X,[0,1])$ such that $f = 1$ on $K$ and $\operatorname{supp}(f) \subset U$. The proof is now identical to that for Lemma 15.8 except we now use Remark 15.39 in place of Proposition 15.7.

Theorem 15.41 (Tietze Extension Theorem). Let $(X,\tau)$ be a normal space, $D$ be a closed subset of $X$, $-\infty < a < b < \infty$ and $f \in C(D,[a,b])$. Then there exists $F \in C(X,[a,b])$ such that $F|_D = f$.

Proof. The proof is identical to that of Theorem 7.4 except we now use Theorem 15.40 in place of Lemma 6.15.

Corollary 15.42. Suppose that $X$ is a normal topological space, $D \subset X$ is closed, $f \in C(D,\mathbb{R})$. Then there exists $F \in C(X)$ such that $F|_D = f$.

Proof. Let $g = \arctan(f) \in C(D, (-\tfrac{\pi}{2}, \tfrac{\pi}{2}))$. Then by the Tietze extension theorem, there exists $G \in C(X, [-\tfrac{\pi}{2}, \tfrac{\pi}{2}])$ such that $G|_D = g$. Let $B := G^{-1}(\{-\tfrac{\pi}{2}, \tfrac{\pi}{2}\})$, a closed subset of $X$; then $B \cap D = \emptyset$. By Urysohn's lemma (Theorem 15.40) there exists $h \in C(X,[0,1])$ such that $h \equiv 1$ on $D$ and $h = 0$ on $B$, and in particular $hG \in C(X, (-\tfrac{\pi}{2}, \tfrac{\pi}{2}))$ and $(hG)|_D = g$. The function $F := \tan(hG) \in C(X)$ is an extension of $f$.
Theorem 15.43 (Urysohn Metrization Theorem for Normal Spaces). Every second countable normal space, $(X,\tau)$, is metrizable, i.e. there is a metric $\rho$ on $X$ such that $\tau = \tau_\rho$. Moreover, $\rho$ may be chosen so that $X$ is isometric to a subset $Q_0 \subset Q$ ($Q$ is as in Notation 15.11) equipped with the metric $d$ in Eq. (15.2). In this metric $X$ is totally bounded and hence the completion of $X$ (which is isometric to $\bar{Q}_0 \subset Q$) is compact.
Proof. (The proof here will be very similar to the proof of Theorem 15.13.) Let $\mathcal{B}$ be a countable base for $\tau$ and set
\[ \Gamma := \{(U,V) \in \mathcal{B} \times \mathcal{B} \mid \bar{U} \subset V\}. \]
To each $O \in \tau$ and $x \in O$ there exists $(U,V) \in \Gamma$ such that $x \in U \subset V \subset O$. Indeed, since $\mathcal{B}$ is a base for $\tau$, there exists $V \in \mathcal{B}$ such that $x \in V \subset O$. Because $\{x\} \cap V^c = \emptyset$, there exist disjoint open sets $\tilde{U}$ and $W$ such that $x \in \tilde{U}$, $V^c \subset W$ and $\tilde{U} \cap W = \emptyset$. Choose $U \in \mathcal{B}$ such that $x \in U \subset \tilde{U}$. Since $U \subset \tilde{U} \subset W^c$, $\bar{U} \subset W^c \subset V$ and hence $(U,V) \in \Gamma$. See Figure 15.6 below. In particular this shows that
\[ \mathcal{B}_0 := \{U \in \mathcal{B} : (U,V) \in \Gamma \text{ for some } V \in \mathcal{B}\} \]
is still a base for $\tau$.

Fig. 15.6. Constructing $(U,V) \in \Gamma$.

If $\Gamma$ is a finite set, the previous comment shows that $\tau$ only has a finite number of elements as well. Since $(X,\tau)$ is Hausdorff, it follows that $X$ is a finite set. Letting $\{x_n\}_{n=1}^N$ be an enumeration of $X$, define $T : X \to Q$ by $T(x_n) = e_n$ for $n = 1, 2, \dots, N$ where $e_n = (0, 0, \dots, 0, 1, 0, \dots)$, with the 1 occurring in the $n$-th spot. Then $\rho(x,y) := d(T(x), T(y))$ for $x, y \in X$ is the desired metric.

So we may now assume that $\Gamma$ is an infinite set and let $\{(U_n, V_n)\}_{n=1}^\infty$ be an enumeration of $\Gamma$. By Urysohn's Lemma for normal spaces (Theorem 15.40) there exists $f_{U,V} \in C(X,[0,1])$ such that $f_{U,V} = 0$ on $\bar{U}$ and $f_{U,V} = 1$ on $V^c$. Let $\mathcal{F} := \{f_{U,V} \mid (U,V) \in \Gamma\}$ and set $f_n := f_{U_n,V_n}$, an enumeration of $\mathcal{F}$. The proof that
\[ \rho(x,y) := \sum_{n=1}^\infty \frac{1}{2^n} |f_n(x) - f_n(y)| \]
is the desired metric on $X$ now follows exactly as the corresponding argument in the proof of Theorem 15.13.
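To get a feel for the metric $\rho$, here is a toy sketch on $X = [0,1]$. It assumes a finite, hypothetical family of pairs $U = [0,a)$, $V = [0,b)$ with $a < b$ taken from a dyadic grid, so the sum is finite rather than an enumeration of a full base; the piecewise linear `ramp(a, b)` plays the role of $f_{U,V}$, being $0$ on $\bar{U} = [0,a]$ and $1$ on $V^c = [b,1]$.

```python
from itertools import combinations

def ramp(a, b):
    """Continuous f with f = 0 on [0, a] and f = 1 on [b, 1] (needs a < b)."""
    return lambda x: min(1.0, max(0.0, (x - a) / (b - a)))

# Hypothetical finite stand-in for the enumeration (U_n, V_n).
grid = [k / 8 for k in range(9)]
fs = [ramp(a, b) for a, b in combinations(grid, 2)]  # combinations yields a < b

def rho(x, y):
    """Finite analogue of rho(x, y) = sum_n 2^{-n} |f_n(x) - f_n(y)|."""
    return sum(abs(f(x) - f(y)) / 2 ** n for n, f in enumerate(fs, start=1))
```

Symmetry and the triangle inequality are inherited termwise from $|\cdot|$; positivity for $x \ne y$ needs the family to separate points, which holds here because `ramp(0, 1)` is the identity on $[0,1]$.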
15.6 Exercises
Exercise 15.6. Prove Theorem 15.9. Hints:
1. By Proposition 15.7, there exists a precompact open set $V$ such that $K \subset V \subset \bar{V} \subset U$. Now suppose that $f : K \to [0,\alpha]$ is continuous with $\alpha \in (0,1]$ and let $A := f^{-1}([0, \tfrac13 \alpha])$ and $B := f^{-1}([\tfrac23 \alpha, 1])$. Appeal to Lemma 15.8 to find a function $g \in C(X,[0,\alpha/3])$ such that $g = \alpha/3$ on $B$ and $\operatorname{supp}(g) \subset V \setminus A$.
2. Now follow the argument in the proof of Theorem 7.4 to construct $F \in C(X,[a,b])$ such that $F|_K = f$.
3. For $c \in [a,b]$, choose $\phi \prec U$ such that $\phi = 1$ on $K$ and replace $F$ by $F_c := \phi F + (1 - \phi) c$.
Exercise 15.7 (Stereographic Projection). Let $X = \mathbb{R}^n$, $X^* := X \cup \{\infty\}$ be the one point compactification of $X$, $S^n := \{y \in \mathbb{R}^{n+1} : |y| = 1\}$ be the unit sphere in $\mathbb{R}^{n+1}$ and $N = (0, \dots, 0, 1) \in \mathbb{R}^{n+1}$. Define $f : S^n \to X^*$ by $f(N) = \infty$, and for $y \in S^n \setminus \{N\}$ let $f(y) = b \in \mathbb{R}^n$ be the unique point such that $(b, 0)$ is on the line containing $N$ and $y$, see Figure 15.7 below. Find a formula for $f$ and show $f : S^n \to X^*$ is a homeomorphism. (So the one point compactification of $\mathbb{R}^n$ is homeomorphic to the $n$-sphere.)
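As a numerical companion to this exercise, the following sketch implements one candidate formula away from the north pole. It is derived here, not taken from the text: a point on the line through $N$ and $y$ is $N + s(y - N)$, and its last coordinate vanishes when $s = 1/(1 - y_{n+1})$. The function names are hypothetical, and the continuity claims at $N$ and $\infty$ still have to be proved by hand.

```python
import math

def stereographic(y):
    """Map y in S^n minus the north pole to b in R^n, with (b, 0) on the line through N and y."""
    *head, last = y
    return [c / (1.0 - last) for c in head]

def inverse_stereographic(b):
    """Candidate inverse: b in R^n -> point on S^n minus the north pole."""
    s = sum(c * c for c in b)  # |b|^2
    return [2.0 * c / (s + 1.0) for c in b] + [(s - 1.0) / (s + 1.0)]

# Usage: round trip for a sample point of R^3 (so the sphere is S^3).
b = [0.3, -1.2, 2.0]
y = inverse_stereographic(b)
b_again = stereographic(y)
```

As $|b| \to \infty$ the last coordinate $(|b|^2 - 1)/(|b|^2 + 1)$ tends to $1$, which is consistent with sending $\infty$ to $N$.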
Exercise 15.8. Let $(X,\tau)$ be a locally compact Hausdorff space. Show $(X,\tau)$ is separable iff $(X^*, \tau^*)$ is separable.

Exercise 15.9. Show by example that there exists a locally compact metric space $(X,d)$ such that the one point compactification, $(X^* := X \cup \{\infty\}, \tau^*)$, is not metrizable. Hint: use Exercise 15.8.

Fig. 15.7. Stereographic projection and the one point compactification of $\mathbb{R}^n$.

Exercise 15.10. Suppose $(X,d)$ is a locally compact and $\sigma$-compact metric space. Show the one point compactification, $(X^* := X \cup \{\infty\}, \tau^*)$, is metrizable.

Exercise 15.11. In this problem, suppose Theorem 15.31 has only been proved when $X$ is compact. Show that it is possible to prove Theorem 15.31 by using Proposition 15.24 to reduce the non-compact case to the compact case.
Hints:
1. If $\mathcal{A}_x = \mathbb{R}$ for all $x \in X$ let $X^* = X \cup \{\infty\}$ be the one point compactification of $X$.
2. If $\mathcal{A}_{x_0} = \{0\}$ for some $x_0 \in X$, let $Y := X \setminus \{x_0\}$ and $Y^* = Y \cup \{\infty\}$ be the one point compactification of $Y$.
3. For $f \in \mathcal{A}$ define $f(\infty) = 0$. In this way $\mathcal{A}$ may be considered to be a sub-algebra of $C(X^*,\mathbb{R})$ in case 1. or a sub-algebra of $C(Y^*,\mathbb{R})$ in case 2.
Exercise 15.12. Let $M < \infty$, show there are polynomials $p_n(t)$ such that
\[ \lim_{n \to \infty} \sup_{|t| \le M} \big| |t| - p_n(t) \big| = 0 \]
using the following outline.
1. Let $f(x) = \sqrt{1 - x}$ for $|x| \le 1$ and use Taylor's theorem with integral remainder (see Eq. A.15 of Appendix A), or analytic function theory if you know it, to show there are constants$^{5}$ $c_n > 0$ for $n \in \mathbb{N}$ such that
\[ \sqrt{1 - x} = 1 - \sum_{n=1}^\infty c_n x^n \text{ for all } |x| < 1. \tag{15.10} \]
2. Let $q_m(x) := 1 - \sum_{n=1}^m c_n x^n$. Use (15.10) to show $\sum_{n=1}^\infty c_n = 1$ and conclude from this that
\[ \lim_{m \to \infty} \sup_{|x| \le 1} \left| \sqrt{1 - x} - q_m(x) \right| = 0. \tag{15.11} \]
3. Let $1 - x = t^2/M^2$, i.e. $x = 1 - t^2/M^2$, then
\[ \lim_{m \to \infty} \sup_{|t| \le M} \left| \frac{|t|}{M} - q_m(1 - t^2/M^2) \right| = 0 \]
so that $p_m(t) := M q_m(1 - t^2/M^2)$ are the desired polynomials.

$^{5}$ In fact $c_n := \frac{(2n-3)!!}{2^n n!}$, but this is not needed.
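The outline can be explored numerically. The sketch below assumes the closed form $c_n = (2n-3)!!/(2^n n!)$ from the footnote, computed through the recurrence $c_1 = \tfrac12$, $c_n = c_{n-1}(2n-3)/(2n)$ (a direct consequence of that closed form), and samples the supremum on a finite grid; the function names are illustrative.

```python
def sqrt_series_coeffs(m):
    """Coefficients c_1, ..., c_m in sqrt(1 - x) = 1 - sum_n c_n x^n."""
    c = [0.5]
    for n in range(2, m + 1):
        c.append(c[-1] * (2 * n - 3) / (2 * n))
    return c

def p(t, M, coeffs):
    """p_m(t) = M * q_m(1 - t^2 / M^2), with q_m evaluated by Horner's rule."""
    x = 1.0 - (t / M) ** 2
    s = 0.0
    for cn in reversed(coeffs):
        s = (s + cn) * x          # builds sum_{n=1}^m c_n x^n
    return M * (1.0 - s)

def sup_error(M, m, steps=2001):
    """Grid approximation of sup_{|t| <= M} | |t| - p_m(t) |."""
    coeffs = sqrt_series_coeffs(m)
    worst = 0.0
    for i in range(steps):
        t = -M + 2.0 * M * i / (steps - 1)
        worst = max(worst, abs(abs(t) - p(t, M, coeffs)))
    return worst
```

Convergence is slow: the worst error sits at $t = 0$ and equals $M \sum_{n > m} c_n$, which decays only on the order of $m^{-1/2}$, so this is an existence argument rather than a practical approximation scheme.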
Exercise 15.13. Given a continuous function $f : \mathbb{R} \to \mathbb{C}$ which is $2\pi$-periodic and $\varepsilon > 0$, show there exists a trigonometric polynomial, $p(\theta) = \sum_{n=-N}^{N} \alpha_n e^{in\theta}$, such that $|f(\theta) - p(\theta)| < \varepsilon$ for all $\theta \in \mathbb{R}$. Hint: show that there exists a unique function $F \in C(S^1)$ such that $f(\theta) = F(e^{i\theta})$ for all $\theta \in \mathbb{R}$.

Remark 15.44. Exercise 15.13 generalizes to $2\pi$-periodic functions on $\mathbb{R}^d$, i.e. functions such that $f(\theta + 2\pi e_i) = f(\theta)$ for all $i = 1, 2, \dots, d$ where $\{e_i\}_{i=1}^d$ is the standard basis for $\mathbb{R}^d$. A trigonometric polynomial $p(\theta)$ is a function of $\theta \in \mathbb{R}^d$ of the form
\[ p(\theta) = \sum_{n \in \Gamma} \alpha_n e^{i n \cdot \theta} \]
where $\Gamma$ is a finite subset of $\mathbb{Z}^d$. The assertion is again that these trigonometric polynomials are dense in the $2\pi$-periodic functions relative to the supremum norm.
16 Baire Category Theorem

Definition 16.1. Let $(X,\tau)$ be a topological space. A set $E \subset X$ is said to be nowhere dense if $(\bar{E})^o = \emptyset$, i.e. $\bar{E}$ has empty interior.

Notice that $E$ is nowhere dense is equivalent to
\[ X = \left( (\bar{E})^o \right)^c = \overline{(\bar{E})^c} = \overline{(E^c)^o}. \]
That is to say $E$ is nowhere dense iff $E^c$ has dense interior.
16.1 Metric Space Baire Category Theorem
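Theorem 16.2 (Baire Category Theorem). Let $(X,\rho)$ be a complete metric space.
1. If $\{V_n\}_{n=1}^\infty$ is a sequence of dense open sets, then $G := \bigcap_{n=1}^\infty V_n$ is dense in $X$.
2. If $\{E_n\}_{n=1}^\infty$ is a sequence of nowhere dense sets, then $\bigcup_{n=1}^\infty E_n \subset \bigcup_{n=1}^\infty \bar{E}_n \subsetneq X$ and in particular $X \ne \bigcup_{n=1}^\infty E_n$.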
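Proof. 1. We must show that $\bar{G} = X$ which is equivalent to showing that $W \cap G \ne \emptyset$ for all non-empty open sets $W \subset X$. Since $V_1$ is dense, $W \cap V_1 \ne \emptyset$ and hence there exist $x_1 \in X$ and $\varepsilon_1 > 0$ such that
\[ \overline{B(x_1, \varepsilon_1)} \subset W \cap V_1. \]
Since $V_2$ is dense, $B(x_1,\varepsilon_1) \cap V_2 \ne \emptyset$ and hence there exist $x_2 \in X$ and $\varepsilon_2 > 0$ such that
\[ \overline{B(x_2, \varepsilon_2)} \subset B(x_1, \varepsilon_1) \cap V_2. \]
Continuing this way inductively, we may choose $\{x_n \in X \text{ and } \varepsilon_n > 0\}_{n=1}^\infty$ such that
\[ \overline{B(x_n, \varepsilon_n)} \subset B(x_{n-1}, \varepsilon_{n-1}) \cap V_n \quad \forall n. \]
Furthermore we can clearly do this construction in such a way that $\varepsilon_n \downarrow 0$ as $n \uparrow \infty$. Hence $\{x_n\}_{n=1}^\infty$ is a Cauchy sequence and $x = \lim_{n \to \infty} x_n$ exists in $X$ since $X$ is complete. Since $\overline{B(x_n,\varepsilon_n)}$ is closed, $x \in \overline{B(x_n,\varepsilon_n)} \subset V_n$ so that $x \in V_n$ for all $n$ and hence $x \in G$. Moreover, $x \in \overline{B(x_1,\varepsilon_1)} \subset W \cap V_1$ implies $x \in W$ and hence $x \in W \cap G$ showing $W \cap G \ne \emptyset$.

2. The second assertion is equivalent to showing
\[ \emptyset \ne \left( \bigcup_{n=1}^\infty \bar{E}_n \right)^c = \bigcap_{n=1}^\infty \left( \bar{E}_n \right)^c = \bigcap_{n=1}^\infty (E_n^c)^o. \]
As we have observed, $E_n$ is nowhere dense is equivalent to $(E_n^c)^o$ being a dense open set, hence by part 1), $\bigcap_{n=1}^\infty (E_n^c)^o$ is dense in $X$ and hence not empty.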
Example 16.3. Suppose that $X$ is a countable set and $\rho$ is a metric on $X$ for which no single point set is open. Then $(X,\rho)$ is not complete. Indeed we may assume $X = \mathbb{N}$ and let $E_n := \{n\} \subset \mathbb{N}$ for all $n \in \mathbb{N}$. Then $E_n$ is closed and by assumption it has empty interior. Since $X = \bigcup_{n \in \mathbb{N}} E_n$, it follows from the Baire Category Theorem 16.2 that $(X,\rho)$ can not be complete.
16.2 Locally Compact Hausdorff Space Baire Category Theorem

Here is another version of the Baire Category theorem when $X$ is a locally compact Hausdorff space.

Theorem 16.4. Let $X$ be a locally compact Hausdorff space.

Proposition 16.5.
1. If $\{V_n\}_{n=1}^\infty$ is a sequence of dense open sets, then $G := \bigcap_{n=1}^\infty V_n$ is dense in $X$.
2. If $\{E_n\}_{n=1}^\infty$ is a sequence of nowhere dense sets, then $X \ne \bigcup_{n=1}^\infty E_n$.
Proof. As in the previous proof, the second assertion is a consequence of the first. To finish the proof, it suffices to show $G \cap W \ne \emptyset$ for all non-empty open sets $W \subset X$. Since $V_1$ is dense, there exists $x_1 \in V_1 \cap W$ and by Proposition 15.7 there exists $U_1 \subset_o X$ such that $x_1 \in U_1 \subset \bar{U}_1 \subset V_1 \cap W$ with $\bar{U}_1$ being compact. Similarly, there exists a non-empty open set $U_2$ such that $U_2 \subset \bar{U}_2 \subset U_1 \cap V_2$. Working inductively, we may find non-empty open sets $\{U_k\}_{k=1}^\infty$ such that $U_k \subset \bar{U}_k \subset U_{k-1} \cap V_k$. Since $\bigcap_{k=1}^n \bar{U}_k = \bar{U}_n \ne \emptyset$ for all $n$, the finite intersection characterization of $\bar{U}_1$ being compact implies that
\[ \emptyset \ne \bigcap_{k=1}^\infty \bar{U}_k \subset G \cap W. \]
Definition 16.6. A subset $E \subset X$ is meager or of the first category if $E = \bigcup_{n=1}^\infty E_n$ where each $E_n$ is nowhere dense. And a set $R \subset X$ is called residual if $R^c$ is meager.

Remarks 16.7 Readers who already know some measure theory may want to think of meager as being the topological analogue of sets of measure 0 and residual as being the topological analogue of sets of full measure. (This analogy should not be taken too seriously, see Exercise 19.19.)
1. $R$ is residual iff $R$ contains a countable intersection of dense open sets. Indeed if $R$ is a residual set, then there exist nowhere dense sets $\{E_n\}$ such that
\[ R^c = \bigcup_{n=1}^\infty E_n \subset \bigcup_{n=1}^\infty \bar{E}_n. \]
Taking complements of this equation shows that
\[ \bigcap_{n=1}^\infty \bar{E}_n^c \subset R, \]
i.e. $R$ contains a set of the form $\bigcap_{n=1}^\infty V_n$ with each $V_n$ ($= \bar{E}_n^c$) being an open dense subset of $X$.
Conversely, if $\bigcap_{n=1}^\infty V_n \subset R$ with each $V_n$ being an open dense subset of $X$, then $R^c \subset \bigcup_{n=1}^\infty V_n^c$ and hence $R^c = \bigcup_{n=1}^\infty E_n$ where each $E_n = R^c \cap V_n^c$ is a nowhere dense subset of $X$.
2. A countable union of meager sets is meager and any subset of a meager set is meager.
3. A countable intersection of residual sets is residual.
Remarks 16.8 The Baire Category Theorems may now be stated as follows. If $X$ is a complete metric space or $X$ is a locally compact Hausdorff space, then
1. all residual sets are dense in $X$ and
2. $X$ is not meager.
It should also be remarked that incomplete metric spaces may be meager. For example, let $X \subset C([0,1])$ be the subspace of polynomial functions on $[0,1]$ equipped with the supremum norm. Then $X = \bigcup_{n=1}^\infty E_n$ where $E_n \subset X$ denotes the subspace of polynomials of degree less than or equal to $n$. You are asked to show in Exercise 16.1 below that $E_n$ is nowhere dense for all $n$. Hence $X$ is meager and the empty set is residual in $X$.
Here is an application of Theorem 16.2.
Theorem 16.9. Let $\mathcal{N} \subset C([0,1],\mathbb{R})$ be the set of nowhere differentiable functions. (Here a function $f$ is said to be differentiable at 0 if $f'(0) := \lim_{t \downarrow 0} \frac{f(t)-f(0)}{t}$ exists and at 1 if $f'(1) := \lim_{t \uparrow 1} \frac{f(1)-f(t)}{1-t}$ exists.) Then $\mathcal{N}$ is a residual set, so the generic continuous function is nowhere differentiable.
Proof. If $f \notin \mathcal{N}$, then $f'(x_0)$ exists for some $x_0 \in [0,1]$ and by the definition of the derivative and compactness of $[0,1]$, there exists $n \in \mathbb{N}$ such that $|f(x)-f(x_0)| \le n|x-x_0|$ for all $x \in [0,1]$. Thus if we define
$$E_n := \{ f \in C([0,1]) : \exists\, x_0 \in [0,1] \ni |f(x)-f(x_0)| \le n|x-x_0| \ \forall\, x \in [0,1] \},$$
then we have just shown $\mathcal{N}^c \subset E := \cup_{n=1}^\infty E_n$. So to finish the proof it suffices to show (for each $n$) $E_n$ is a closed subset of $C([0,1],\mathbb{R})$ with empty interior.
1. To prove $E_n$ is closed, let $\{f_m\}_{m=1}^\infty \subset E_n$ be a sequence of functions such that there exists $f \in C([0,1],\mathbb{R})$ such that $\|f-f_m\|_\infty \to 0$ as $m \to \infty$. Since $f_m \in E_n$, there exists $x_m \in [0,1]$ such that
$$|f_m(x)-f_m(x_m)| \le n|x-x_m| \quad \forall\, x \in [0,1]. \tag{16.1}$$
Since $[0,1]$ is a compact metric space, by passing to a subsequence if necessary, we may assume $x_0 = \lim_{m\to\infty} x_m \in [0,1]$ exists. Passing to the limit in Eq. (16.1), making use of the uniform convergence of $f_m \to f$ to show $\lim_{m\to\infty} f_m(x_m) = f(x_0)$, implies
$$|f(x)-f(x_0)| \le n|x-x_0| \quad \forall\, x \in [0,1]$$
and therefore that $f \in E_n$. This shows $E_n$ is a closed subset of $C([0,1],\mathbb{R})$.
2. To finish the proof, we will show $E_n^0 = \emptyset$ by showing for each $f \in E_n$ and $\varepsilon > 0$ given, there exists $g \in C([0,1],\mathbb{R}) \setminus E_n$ such that $\|f-g\|_\infty < \varepsilon$. We now construct $g$. Since $[0,1]$ is compact and $f$ is continuous there exists $N \in \mathbb{N}$ such that $|f(x)-f(y)| < \varepsilon/2$ whenever $|y-x| \le 1/N$. Let $k$ denote the piecewise linear function on $[0,1]$ such that $k(\frac{m}{N}) = f(\frac{m}{N})$ for $m = 0,1,\dots,N$ and $k''(x) = 0$ for $x \notin \pi_N := \{m/N : m = 0,1,\dots,N\}$. Then it is easily seen that $\|f-k\|_\infty < \varepsilon/2$ and for $x \in (\frac{m}{N}, \frac{m+1}{N})$ that
$$|k'(x)| = \frac{|f(\frac{m+1}{N}) - f(\frac{m}{N})|}{\frac{1}{N}} < N\varepsilon/2.$$
We now make $k$ rougher by adding a small wiggly function $h$ which we define as follows. Let $M \in \mathbb{N}$ be chosen so that $\varepsilon M > n + N\varepsilon/2$ and define $h$ uniquely by $h(\frac{m}{M}) = (-1)^m \varepsilon/2$ for $m = 0,1,\dots,M$ and $h''(x) = 0$ for $x \notin \pi_M$. Then $\|h\|_\infty = \varepsilon/2$ and $|h'(x)| = \varepsilon M > n + N\varepsilon/2$ for $x \notin \pi_M$. See Figure 16.1 below. Finally define $g := k + h$. Then
$$\|f-g\|_\infty \le \|f-k\|_\infty + \|h\|_\infty < \varepsilon/2 + \varepsilon/2 = \varepsilon$$
and
$$|g'(x)| \ge |h'(x)| - |k'(x)| > (n + N\varepsilon/2) - N\varepsilon/2 = n \quad \forall\, x \notin \pi_M \cup \pi_N.$$
It now follows from this last equation and the mean value theorem that for any $x_0 \in [0,1]$,
$$\left| \frac{g(x)-g(x_0)}{x-x_0} \right| > n$$
for all $x \in [0,1]$ with $x \neq x_0$ sufficiently close to $x_0$. This shows $g \notin E_n$ and so the proof is complete.
Fig. 16.1. Constructing a rough approximation, g, to a continuous function f.
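The construction of $g = k + h$ in the proof above can be checked numerically. The following Python sketch is illustrative only: the sample function $f(x) = x^2$, the values $n = 3$ and $\varepsilon = 1/10$, and the names `piecewise_linear` and `slope_magnitudes` are assumptions of this example, not part of the text. It also assumes $M$ is chosen so that the slope $\varepsilon M$ of the zig-zag $h$ exceeds $n$ plus the bound $N\varepsilon/2$ on the slopes of $k$. Exact rational arithmetic is used so the comparisons are not affected by rounding.

```python
from fractions import Fraction

def piecewise_linear(nodes, values):
    """Piecewise linear interpolant through the points (nodes[i], values[i])."""
    def p(x):
        for i in range(len(nodes) - 1):
            if nodes[i] <= x <= nodes[i + 1]:
                t = (x - nodes[i]) / (nodes[i + 1] - nodes[i])
                return (1 - t) * values[i] + t * values[i + 1]
        raise ValueError("x must lie in [0, 1]")
    return p

def f(x):
    return x * x              # sample continuous function (an assumption)

n = 3                         # the level n defining E_n
eps = Fraction(1, 10)

# |f(x) - f(y)| <= 2|x - y| on [0, 1], so N = 50 keeps the oscillation below eps/2.
N = 50
k_nodes = [Fraction(m, N) for m in range(N + 1)]
k = piecewise_linear(k_nodes, [f(x) for x in k_nodes])
k_slope_bound = N * eps / 2

# Zig-zag h taking the values +-eps/2 at the points m/M; its slopes are +-eps*M.
M = 60
assert eps * M > n + k_slope_bound
h_nodes = [Fraction(m, M) for m in range(M + 1)]
h = piecewise_linear(h_nodes, [(-1) ** m * eps / 2 for m in range(M + 1)])

def g(x):
    return k(x) + h(x)

# g is linear between consecutive points of the merged partition.
breaks = sorted(set(k_nodes) | set(h_nodes))
slope_magnitudes = [abs((g(b) - g(a)) / (b - a)) for a, b in zip(breaks, breaks[1:])]

# Sup-norm distance between f and g, sampled on a grid containing all breakpoints.
grid = [Fraction(j, 600) for j in range(601)]
sup_error = max(abs(f(x) - g(x)) for x in grid)

print(float(sup_error), float(min(slope_magnitudes)))
```

The grid only samples the sup norm, so this is a sanity check of the estimates rather than a proof; the slope check, by contrast, covers every linear piece of $g$.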
Here is an application of the Baire Category Theorem 16.4.
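Proposition 16.10. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is a function such that $f'(x)$ exists for all $x \in \mathbb{R}$. Let
$$U := \bigcup_{\varepsilon > 0} \Big\{ x \in \mathbb{R} : \sup_{|y| < \varepsilon} |f'(x+y)| < \infty \Big\}.$$
Then $U$ is a dense open set. (It is not true that $U = \mathbb{R}$ in general, see Example 30.27 below.)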
Proof. It is easily seen from the definition of $U$ that $U$ is open. Let $W \subset_o \mathbb{R}$ be a non-empty open subset of $\mathbb{R}$. For $k \in \mathbb{N}$, let
$$E_k := \Big\{ x \in W : |f(y)-f(x)| \le k|y-x| \text{ when } |y-x| \le \tfrac{1}{k} \Big\} = \bigcap_{z : |z| \le k^{-1}} \{ x \in W : |f(x+z)-f(x)| \le k|z| \},$$
which is a closed subset of $W$ since $f$ is continuous. Moreover, if $x \in W$ and $M = |f'(x)|$, then
$$|f(y)-f(x)| = |f'(x)(y-x) + o(y-x)| \le (M+1)|y-x|$$
for $y$ close to $x$. (Here $o(y-x)$ denotes a function such that $\lim_{y \to x} o(y-x)/(y-x) = 0$.) In particular, this shows that $x \in E_k$ for all $k$ sufficiently large. Therefore $W = \cup_{k=1}^\infty E_k$ and since $W$ is not meager by the Baire category Theorem 16.4, some $E_k$ has non-empty interior. That is, there exists $x_0 \in E_k \subset W$ and $\varepsilon > 0$ such that
$$J := (x_0 - \varepsilon, x_0 + \varepsilon) \subset E_k \subset W.$$
For $x \in J$, we have $|f(x+z)-f(x)| \le k|z|$ provided that $|z| \le k^{-1}$ and therefore that $|f'(x)| \le k$ for $x \in J$. Therefore $x_0 \in U \cap W$, showing $U$ is dense.
Remark 16.11. This proposition generalizes to functions $f : \mathbb{R}^n \to \mathbb{R}^m$ in an obvious way.
For our next application of Theorem 16.2, let $X := BC^\infty((-1,1))$ denote the set of smooth functions $f$ on $(-1,1)$ such that $f$ and all of its derivatives are bounded. In the metric
$$\rho(f,g) := \sum_{k=0}^\infty 2^{-k} \frac{\|f^{(k)} - g^{(k)}\|_\infty}{1 + \|f^{(k)} - g^{(k)}\|_\infty} \quad \text{for } f, g \in X,$$
$X$ becomes a complete metric space.
Theorem 16.12. Given an increasing sequence of positive numbers $\{M_n\}_{n=1}^\infty$, the set
$$\mathcal{F} := \Big\{ f \in X : \limsup_{n \to \infty} \Big| \frac{f^{(n)}(0)}{M_n} \Big| \ge 1 \Big\}$$
is dense in $X$. In particular, there is a dense set of $f \in X$ such that the power series expansion of $f$ at 0 has zero radius of convergence.
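Proof. Step 1. Let $n \in \mathbb{N}$. Choose $g \in C_c^\infty((-1,1))$ such that $\|g\|_\infty < 2^{-n}$ while $g'(0) = 2M_n$ and define
$$f_n(x) := \int_0^x dt_{n-1} \int_0^{t_{n-1}} dt_{n-2} \dots \int_0^{t_2} dt_1\, g(t_1).$$
Then for $k < n$,
$$f_n^{(k)}(x) = \int_0^x dt_{n-k-1} \int_0^{t_{n-k-1}} dt_{n-k-2} \dots \int_0^{t_2} dt_1\, g(t_1),$$
$f_n^{(n)}(x) = g'(x)$, $f_n^{(n)}(0) = 2M_n$ and $f_n^{(k)}$ satisfies
$$\big\| f_n^{(k)} \big\|_\infty \le \frac{2^{-n}}{(n-1-k)!} \le 2^{-n} \quad \text{for } k < n.$$
Consequently,
$$\rho(f_n, 0) = \sum_{k=0}^\infty 2^{-k} \frac{\|f_n^{(k)}\|_\infty}{1 + \|f_n^{(k)}\|_\infty} \le \sum_{k=0}^{n-1} 2^{-k}\, 2^{-n} + \sum_{k=n}^\infty 2^{-k} \cdot 1 \le 2\,(2^{-n} + 2^{-n}) = 4 \cdot 2^{-n}.$$
Thus we have constructed $f_n \in X$ such that $\lim_{n\to\infty} \rho(f_n, 0) = 0$ while $f_n^{(n)}(0) = 2M_n$ for all $n$.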
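Step 2. The set
$$G_n := \bigcup_{m \ge n} \Big\{ f \in X : \big| f^{(m)}(0) \big| > M_m \Big\}$$
is a dense open subset of $X$. The fact that $G_n$ is open is clear. To see that $G_n$ is dense, let $g \in X$ be given and define $g_m := g + \varepsilon_m f_m$ where $\varepsilon_m := \operatorname{sgn}(g^{(m)}(0))$ (with the convention $\operatorname{sgn}(0) := 1$). Then
$$\big| g_m^{(m)}(0) \big| = \big| g^{(m)}(0) \big| + \big| f_m^{(m)}(0) \big| \ge 2M_m > M_m \quad \text{for all } m.$$
Therefore, $g_m \in G_n$ for all $m \ge n$ and since
$$\rho(g_m, g) = \rho(f_m, 0) \to 0 \quad \text{as } m \to \infty$$
it follows that $g \in \bar{G}_n$.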
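Step 3. By the Baire Category theorem, $\cap_{n=1}^\infty G_n$ is a dense subset of $X$. This completes the proof of the first assertion since
$$\mathcal{F} = \Big\{ f \in X : \limsup_{n\to\infty} \Big| \frac{f^{(n)}(0)}{M_n} \Big| \ge 1 \Big\} \supset \bigcap_{m=1}^\infty \Big\{ f \in X : \Big| \frac{f^{(n)}(0)}{M_n} \Big| \ge 1 \text{ for some } n \ge m \Big\} \supset \bigcap_{m=1}^\infty G_m.$$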
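Step 4. Take $M_n = (n!)^2$ and recall that the power series expansion for $f$ near 0 is given by $\sum_{n=0}^\infty \frac{f^{(n)}(0)}{n!} x^n$. This series can not converge for any $f \in \mathcal{F}$ and any $x \neq 0$ because
$$\limsup_{n\to\infty} \Big| \frac{f^{(n)}(0)}{n!} x^n \Big| = \limsup_{n\to\infty} \Big| \frac{f^{(n)}(0)}{(n!)^2}\, n!\, x^n \Big| = \limsup_{n\to\infty} \Big| \frac{f^{(n)}(0)}{(n!)^2} \Big| \cdot \lim_{n\to\infty} n!\, |x^n| = \infty$$
where we have used $\lim_{n\to\infty} n!\,|x^n| = \infty$ and $\limsup_{n\to\infty} \big| \frac{f^{(n)}(0)}{(n!)^2} \big| \ge 1$.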
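Remark 16.13. Given a sequence of real numbers $\{a_n\}_{n=0}^\infty$ there always exists $f \in X$ such that $f^{(n)}(0) = a_n$. To construct such a function $f$, let $\phi \in C_c^\infty(-1,1)$ be a function such that $\phi = 1$ in a neighborhood of 0 and $\varepsilon_n \in (0,1)$ be chosen so that $\varepsilon_n \downarrow 0$ as $n \to \infty$ and $\sum_{n=0}^\infty |a_n|\, \varepsilon_n^n < \infty$. The desired function $f$ can then be defined by
$$f(x) = \sum_{n=0}^\infty \frac{a_n}{n!} x^n \phi(x/\varepsilon_n) =: \sum_{n=0}^\infty g_n(x). \tag{16.2}$$
The fact that $f$ is well defined and continuous follows from the estimate:
$$|g_n(x)| = \Big| \frac{a_n}{n!} x^n \phi(x/\varepsilon_n) \Big| \le \frac{\|\phi\|_\infty}{n!} |a_n|\, \varepsilon_n^n$$
and the assumption that $\sum_{n=0}^\infty |a_n|\, \varepsilon_n^n < \infty$. The estimate
$$|g_n'(x)| = \Big| \frac{a_n}{(n-1)!} x^{n-1} \phi(x/\varepsilon_n) + \frac{a_n}{n!\, \varepsilon_n} x^n \phi'(x/\varepsilon_n) \Big| \le \frac{\|\phi\|_\infty}{(n-1)!} |a_n|\, \varepsilon_n^{n-1} + \frac{\|\phi'\|_\infty}{n!} |a_n|\, \varepsilon_n^n \le (\|\phi\|_\infty + \|\phi'\|_\infty)\, |a_n|\, \varepsilon_n^n$$
and the assumption that $\sum_{n=0}^\infty |a_n|\, \varepsilon_n^n < \infty$ shows $f \in C^1(-1,1)$ and $f'(x) = \sum_{n=0}^\infty g_n'(x)$. Similar arguments show $f \in C_c^k(-1,1)$ and $f^{(k)}(x) = \sum_{n=0}^\infty g_n^{(k)}(x)$ for all $x$ and $k \in \mathbb{N}$. This completes the proof since, using $\phi(x/\varepsilon_n) = 1$ for $x$ in a neighborhood of 0, $g_n^{(k)}(0) = \delta_{k,n}\, a_k$ and hence
$$f^{(k)}(0) = \sum_{n=0}^\infty g_n^{(k)}(0) = a_k.$$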
16.3 Exercises
Exercise 16.1. Let $(X, \|\cdot\|)$ be a normed space and $E \subset X$ be a subspace.
1. If $E$ is a closed and proper subspace of $X$ then $E$ is nowhere dense.
2. If $E$ is a proper finite dimensional subspace of $X$ then $E$ is nowhere dense.
Exercise 16.2. Now suppose that $(X, \|\cdot\|)$ is an infinite dimensional Banach space. Show that $X$ can not have a countable algebraic basis. More explicitly, there is no countable subset $S \subset X$ such that every element $x \in X$ may be written as a finite linear combination of elements from $S$. Hint: make use of Exercise 16.1 and the Baire category theorem.
Part V
Lebesgue Integration Theory
17
Introduction: What are measures and why
measurable sets
Definition 17.1 (Preliminary). A measure $\mu$ on a set $X$ is a function $\mu : 2^X \to [0,\infty]$ such that
1. $\mu(\emptyset) = 0$
2. If $\{A_i\}_{i=1}^N$ is a finite ($N < \infty$) or countable ($N = \infty$) collection of subsets of $X$ which are pair-wise disjoint (i.e. $A_i \cap A_j = \emptyset$ if $i \neq j$) then
$$\mu\big(\cup_{i=1}^N A_i\big) = \sum_{i=1}^N \mu(A_i).$$
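Example 17.2. Suppose that $X$ is any set and $x \in X$ is a point. For $A \subset X$, let
$$\delta_x(A) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$$
Then $\mu = \delta_x$ is a measure on $X$ called the Dirac delta measure at $x$.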
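Example 17.3. Suppose that $\mu$ is a measure on $X$ and $\lambda > 0$, then $\lambda \cdot \mu$ is also a measure on $X$. Moreover, if $\{\mu_\alpha\}_{\alpha \in J}$ are all measures on $X$, then $\mu = \sum_{\alpha \in J} \mu_\alpha$, i.e.
$$\mu(A) = \sum_{\alpha \in J} \mu_\alpha(A) \quad \text{for all } A \subset X$$
is a measure on $X$. (See Section 2 for the meaning of this sum.) To prove this we must show that $\mu$ is countably additive. Suppose that $\{A_i\}_{i=1}^\infty$ is a collection of pair-wise disjoint subsets of $X$, then
$$\sum_{i=1}^\infty \mu(A_i) = \sum_{i=1}^\infty \sum_{\alpha \in J} \mu_\alpha(A_i) = \sum_{\alpha \in J} \sum_{i=1}^\infty \mu_\alpha(A_i) = \sum_{\alpha \in J} \mu_\alpha\big(\cup_{i=1}^\infty A_i\big) = \mu\big(\cup_{i=1}^\infty A_i\big)$$
wherein the second equality we used Theorem 4.22 and in the third we used the fact that $\mu_\alpha$ is a measure.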
Example 17.4. Suppose that $X$ is a set and $\lambda : X \to [0,\infty]$ is a function. Then
$$\mu := \sum_{x \in X} \lambda(x)\, \delta_x$$
is a measure, explicitly
$$\mu(A) = \sum_{x \in A} \lambda(x)$$
for all $A \subset X$.
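As a small illustration of Examples 17.2 and 17.4, the following Python sketch models a measure on a finite set as a function of subsets. The names `dirac` and `weighted_counting_measure`, the set `X` and the weights are hypothetical choices made for this example; on a finite set only finite additivity can be checked, so this is a sanity check rather than a proof.

```python
from fractions import Fraction

def dirac(x):
    """Dirac delta measure at x: delta_x(A) = 1 if x is in A, else 0."""
    return lambda A: 1 if x in A else 0

def weighted_counting_measure(weights):
    """mu(A) = sum of weights[x] over x in A, i.e. mu = sum_x weights[x] * delta_x."""
    return lambda A: sum(weights[x] for x in A)

X = {"a", "b", "c", "d"}
weights = {"a": Fraction(1, 2), "b": Fraction(0), "c": Fraction(3), "d": Fraction(1, 4)}
mu = weighted_counting_measure(weights)

A, B = {"a", "b"}, {"c"}                      # two disjoint subsets of X
additive = mu(A | B) == mu(A) + mu(B)

# mu evaluated through the weighted sum of Dirac measures.
as_dirac_sum = sum(weights[x] * dirac(x)(A | B) for x in X)

print(mu(set()), mu(A | B), additive, as_dirac_sum == mu(A | B))
```

Rational weights are used so that the additivity comparison is exact rather than subject to floating point rounding.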
17.1 The problem with Lebesgue measure
So far all of the examples of measures given above are counting type measures, i.e. a weighted count of the number of points in a set. We certainly are going to want other types of measures too. In particular, it will be of great interest to have a measure on $\mathbb{R}$ (called Lebesgue measure) which measures the length of a subset of $\mathbb{R}$. Unfortunately as the next theorem shows, there is no such reasonable measure of length if we insist on measuring all subsets of $\mathbb{R}$.
Theorem 17.5. There is no measure $\mu : 2^{\mathbb{R}} \to [0,\infty]$ such that
1. $\mu([a,b)) = (b-a)$ for all $a < b$ and
2. $\mu$ is translation invariant, i.e. $\mu(A + x) = \mu(A)$ for all $x \in \mathbb{R}$ and $A \in 2^{\mathbb{R}}$, where
$$A + x := \{ y + x : y \in A \} \subset \mathbb{R}.$$
In fact the theorem is still true even if (1) is replaced by the weaker condition that $0 < \mu((0,1]) < \infty$.
The counting measure $\kappa(A) = \#(A)$ is translation invariant. However $\kappa((0,1]) = \infty$ in this case and so $\kappa$ does not satisfy condition 1.
Proof. First proof. Let us identify $[0,1)$ with the unit circle $S^1 := \{ z \in \mathbb{C} : |z| = 1 \}$ by the map
$$\phi(t) = e^{i2\pi t} = (\cos 2\pi t + i \sin 2\pi t) \in S^1$$
for $t \in [0,1)$. Using this identification we may use $\mu$ to define a function $\nu$ on $2^{S^1}$ by $\nu(\phi(A)) = \mu(A)$ for all $A \subset [0,1)$. This new function is a measure on $S^1$ with the property that $0 < \nu(S^1) < \infty$. For $z \in S^1$ and $N \subset S^1$ let
$$zN := \{ zn \in S^1 : n \in N \}, \tag{17.1}$$
that is to say $e^{i\theta} N$ is $N$ rotated counter clockwise by angle $\theta$. We now claim that $\nu$ is invariant under these rotations, i.e.
$$\nu(zN) = \nu(N) \tag{17.2}$$
for all $z \in S^1$ and $N \subset S^1$. To verify this, write $N = \phi(A)$ and $z = \phi(t)$ for some $t \in [0,1)$ and $A \subset [0,1)$. Then
$$\phi(t)\phi(A) = \phi(t + A \bmod 1)$$
where for $A \subset [0,1)$ and $t \in [0,1)$,
$$t + A \bmod 1 := \{ a + t \bmod 1 \in [0,1) : a \in A \} = \big(t + A \cap \{a < 1-t\}\big) \cup \big((t-1) + A \cap \{a \ge 1-t\}\big).$$
Thus
$$\begin{aligned}
\nu(\phi(t)\phi(A)) &= \mu(t + A \bmod 1) \\
&= \mu\big( (t + A \cap \{a < 1-t\}) \cup ((t-1) + A \cap \{a \ge 1-t\}) \big) \\
&= \mu(t + A \cap \{a < 1-t\}) + \mu((t-1) + A \cap \{a \ge 1-t\}) \\
&= \mu(A \cap \{a < 1-t\}) + \mu(A \cap \{a \ge 1-t\}) \\
&= \mu\big( (A \cap \{a < 1-t\}) \cup (A \cap \{a \ge 1-t\}) \big) \\
&= \mu(A) = \nu(\phi(A)).
\end{aligned}$$
Therefore it suffices to prove that there is no finite non-trivial measure $\nu$ on $S^1$ such that Eq. (17.2) holds. To do this we will construct a non-measurable set $N = \phi(A)$ for some $A \subset [0,1)$. Let
$$R := \{ z = e^{i2\pi t} : t \in \mathbb{Q} \} = \{ z = e^{i2\pi t} : t \in [0,1) \cap \mathbb{Q} \}$$
be a countable subgroup of $S^1$. As above $R$ acts on $S^1$ by rotations and divides $S^1$ up into equivalence classes, where $z, w \in S^1$ are equivalent if $z = rw$ for some $r \in R$. Choose (using the axiom of choice) one representative point $n$ from each of these equivalence classes and let $N \subset S^1$ be the set of these representative points. Then every point $z \in S^1$ may be uniquely written as $z = nr$ with $n \in N$ and $r \in R$. That is to say
$$S^1 = \coprod_{r \in R} (rN) \tag{17.3}$$
where $\coprod_\alpha A_\alpha$ is used to denote the union of pair-wise disjoint sets $\{A_\alpha\}$. By Eqs. (17.2) and (17.3),
$$\nu(S^1) = \sum_{r \in R} \nu(rN) = \sum_{r \in R} \nu(N).$$
The right member from this equation is either 0 or $\infty$: 0 if $\nu(N) = 0$ and $\infty$ if $\nu(N) > 0$. In either case it is not equal to $\nu(S^1) \in (0,\infty)$. Thus we have reached the desired contradiction.
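Proof. Second proof of Theorem 17.5. For $N \subset [0,1)$ and $\alpha \in [0,1)$, let
$$N^\alpha = N + \alpha \bmod 1 = \{ a + \alpha \bmod 1 \in [0,1) : a \in N \} = \big(\alpha + N \cap \{a < 1-\alpha\}\big) \cup \big((\alpha - 1) + N \cap \{a \ge 1-\alpha\}\big).$$
Then
$$\begin{aligned}
\mu(N^\alpha) &= \mu(\alpha + N \cap \{a < 1-\alpha\}) + \mu((\alpha-1) + N \cap \{a \ge 1-\alpha\}) \\
&= \mu(N \cap \{a < 1-\alpha\}) + \mu(N \cap \{a \ge 1-\alpha\}) \\
&= \mu\big( (N \cap \{a < 1-\alpha\}) \cup (N \cap \{a \ge 1-\alpha\}) \big) \\
&= \mu(N).
\end{aligned} \tag{17.4}$$
We will now construct a bad set $N$ which coupled with Eq. (17.4) will lead to a contradiction. Set
$$Q_x := \{ x + r \in \mathbb{R} : r \in \mathbb{Q} \} = x + \mathbb{Q}.$$
Notice that $Q_x \cap Q_y \neq \emptyset$ implies that $Q_x = Q_y$. Let $\mathcal{O} = \{ Q_x : x \in \mathbb{R} \}$ be the orbit space of the $\mathbb{Q}$ action. For all $A \in \mathcal{O}$ choose $f(A) \in [0,1/3) \cap A$ and define $N = f(\mathcal{O})$. (We have used the Axiom of choice here, i.e. $\prod_{A \in \mathcal{O}} (A \cap [0,1/3)) \neq \emptyset$.) Then observe:
1. $f(A) = f(B)$ implies that $A \cap B \neq \emptyset$ which implies that $A = B$ so that $f$ is injective.
2. $\mathcal{O} = \{ Q_n : n \in N \}$.
Let $R$ be the countable set,
$$R := \mathbb{Q} \cap [0,1).$$
We now claim that
$$N^r \cap N^s = \emptyset \text{ if } r \neq s \text{ and} \tag{17.5}$$
$$[0,1) = \cup_{r \in R} N^r. \tag{17.6}$$
Indeed, if $x \in N^r \cap N^s \neq \emptyset$ then $x = r + n \bmod 1$ and $x = s + n' \bmod 1$ for some $n, n' \in N$, so that $n - n' \in \mathbb{Q}$, i.e. $Q_n = Q_{n'}$. That is to say, $n = f(Q_n) = f(Q_{n'}) = n'$ and hence that $s = r \bmod 1$, but $s, r \in [0,1)$ implies that $s = r$. Furthermore, if $x \in [0,1)$ and $n := f(Q_x)$, then $x - n = r \in \mathbb{Q}$ and $x \in N^{r \bmod 1}$. Now that we have constructed $N$, we are ready for the contradiction. By Equations (17.4)–(17.6) we find
$$1 = \mu([0,1)) = \sum_{r \in R} \mu(N^r) = \sum_{r \in R} \mu(N) = \begin{cases} \infty & \text{if } \mu(N) > 0 \\ 0 & \text{if } \mu(N) = 0, \end{cases}$$
which is certainly inconsistent. Incidentally we have just produced an example of a so called non-measurable set.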
Because of Theorem 17.5, it is necessary to modify Definition 17.1. Theorem 17.5 points out that we will have to give up the idea of trying to measure all subsets of $\mathbb{R}$ but only measure some sub-collections of measurable sets. This leads us to the notion of $\sigma$-algebra discussed in the next chapter. Our revised notion of a measure will appear in Definition 19.1 of Chapter 19 below.
18
Measurability
18.1 Algebras and $\sigma$-Algebras
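Definition 18.1. A collection of subsets $\mathcal{A}$ of a set $X$ is an algebra if
1. $\emptyset, X \in \mathcal{A}$
2. $A \in \mathcal{A}$ implies that $A^c \in \mathcal{A}$
3. $\mathcal{A}$ is closed under finite unions, i.e. if $A_1, \dots, A_n \in \mathcal{A}$ then $A_1 \cup \dots \cup A_n \in \mathcal{A}$.
In view of conditions 1. and 2., 3. is equivalent to
3'. $\mathcal{A}$ is closed under finite intersections.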
Definition 18.2. A collection of subsets $\mathcal{M}$ of $X$ is a $\sigma$-algebra (or sometimes called a $\sigma$-field) if $\mathcal{M}$ is an algebra which is also closed under countable unions, i.e. if $\{A_i\}_{i=1}^\infty \subset \mathcal{M}$, then $\cup_{i=1}^\infty A_i \in \mathcal{M}$. (Notice that since $\mathcal{M}$ is also closed under taking complements, $\mathcal{M}$ is also closed under taking countable intersections.) A pair $(X, \mathcal{M})$, where $X$ is a set and $\mathcal{M}$ is a $\sigma$-algebra on $X$, is called a measurable space.
The reader should compare these definitions with that of a topology in Definition 13.1. Recall that the elements of a topology are called open sets. Analogously, elements of an algebra $\mathcal{A}$ or a $\sigma$-algebra $\mathcal{M}$ will be called measurable sets.
Example 18.3. Here are some examples of algebras.
1. $\mathcal{M} = 2^X$, then $\mathcal{M}$ is a topology, an algebra and a $\sigma$-algebra.
2. Let $X = \{1,2,3\}$, then $\tau = \{\emptyset, X, \{2,3\}\}$ is a topology on $X$ which is not an algebra.
3. $\tau = \mathcal{A} = \{\{1\}, \{2,3\}, \emptyset, X\}$ is a topology, an algebra, and a $\sigma$-algebra on $X$. The sets $X$, $\{1\}$, $\{2,3\}$, $\emptyset$ are open and closed. The sets $\{1,2\}$ and $\{1,3\}$ are neither open nor closed and are not measurable.
The reader should compare this example with Example 13.3.
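Proposition 18.4. Let $\mathcal{E}$ be any collection of subsets of $X$. Then there exists a unique smallest algebra $\mathcal{A}(\mathcal{E})$ and $\sigma$-algebra $\sigma(\mathcal{E})$ which contains $\mathcal{E}$.
Proof. The proof is the same as the analogous Proposition 13.6 for topologies, i.e.
$$\mathcal{A}(\mathcal{E}) := \bigcap \{ \mathcal{A} : \mathcal{A} \text{ is an algebra such that } \mathcal{E} \subset \mathcal{A} \}$$
and
$$\sigma(\mathcal{E}) := \bigcap \{ \mathcal{M} : \mathcal{M} \text{ is a } \sigma\text{-algebra such that } \mathcal{E} \subset \mathcal{M} \}.$$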
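Example 18.5. Suppose $X = \{1,2,3\}$ and $\mathcal{E} = \{\emptyset, X, \{1,2\}, \{1,3\}\}$, see Figure 18.1.
Fig. 18.1. A collection of subsets.
Then
$$\tau(\mathcal{E}) = \{\emptyset, X, \{1\}, \{1,2\}, \{1,3\}\}$$
$$\mathcal{A}(\mathcal{E}) = \sigma(\mathcal{E}) = 2^X.$$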
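On a finite set the generated topology and the generated algebra in Example 18.5 can be computed by brute force. The sketch below is only an illustration: the function name `close_under` is hypothetical, and it relies on the fact that on a finite set closure under pairwise unions and intersections already gives closure under arbitrary ones (so the algebra it returns is also the generated $\sigma$-algebra).

```python
def close_under(X, E, use_complements):
    """Smallest family containing E, the empty set and X that is closed under
    pairwise unions and intersections (and complements if requested)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in E}
    while True:
        new = set()
        for A in family:
            if use_complements:
                new.add(X - A)
            for B in family:
                new.add(A | B)
                new.add(A & B)
        if new <= family:          # nothing new was produced: fixed point reached
            return family
        family |= new

X = {1, 2, 3}
E = [{1, 2}, {1, 3}]
topology = close_under(X, E, use_complements=False)   # tau(E)
algebra = close_under(X, E, use_complements=True)     # A(E) = sigma(E)

print(sorted(sorted(A) for A in topology))
print(len(algebra))
```

The iteration terminates because the family can only grow inside the finite power set of `X`.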
The next proposition is the analogue to Proposition 13.7 for topologies and enables us to give an explicit description of $\mathcal{A}(\mathcal{E})$. On the other hand it should be noted that $\sigma(\mathcal{E})$ typically does not admit a simple concrete description.
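Proposition 18.6. Let $X$ be a set and $\mathcal{E} \subset 2^X$. Let $\mathcal{E}^c := \{ A^c : A \in \mathcal{E} \}$ and $\mathcal{E}_c := \mathcal{E} \cup \{X, \emptyset\} \cup \mathcal{E}^c$. Then
$$\mathcal{A}(\mathcal{E}) := \{ \text{finite unions of finite intersections of elements from } \mathcal{E}_c \}. \tag{18.1}$$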
Proof. Let $\mathcal{A}$ denote the right member of Eq. (18.1). From the definition of an algebra, it is clear that $\mathcal{E} \subset \mathcal{A} \subset \mathcal{A}(\mathcal{E})$. Hence to finish the proof it suffices to show $\mathcal{A}$ is an algebra. The proofs of these assertions are routine except for possibly showing that $\mathcal{A}$ is closed under complementation. To check $\mathcal{A}$ is closed under complementation, let $Z \in \mathcal{A}$ be expressed as
$$Z = \bigcup_{i=1}^N \bigcap_{j=1}^K A_{ij}$$
where $A_{ij} \in \mathcal{E}_c$. Therefore, writing $B_{ij} = A_{ij}^c \in \mathcal{E}_c$, we find that
$$Z^c = \bigcap_{i=1}^N \bigcup_{j=1}^K B_{ij} = \bigcup_{j_1, \dots, j_N = 1}^K (B_{1j_1} \cap B_{2j_2} \cap \dots \cap B_{Nj_N}) \in \mathcal{A}$$
wherein we have used the fact that $B_{1j_1} \cap B_{2j_2} \cap \dots \cap B_{Nj_N}$ is a finite intersection of sets from $\mathcal{E}_c$.
Remark 18.7. One might think that in general $\sigma(\mathcal{E})$ may be described as the countable unions of countable intersections of sets in $\mathcal{E}_c$. However this is in general false, since if
$$Z = \bigcup_{i=1}^\infty \bigcap_{j=1}^\infty A_{ij}$$
with $A_{ij} \in \mathcal{E}_c$, then
$$Z^c = \bigcup_{j_1 = 1, j_2 = 1, \dots, j_N = 1, \dots}^\infty \Big( \bigcap_{\ell=1}^\infty A^c_{\ell, j_\ell} \Big)$$
which is now an uncountable union. Thus the above description is not correct. In general it is complicated to explicitly describe $\sigma(\mathcal{E})$, see Proposition 1.23 on page 39 of Folland for details. Also see Proposition 18.13 below.
Exercise 18.1. Let $\tau$ be a topology on a set $X$ and $\mathcal{A} = \mathcal{A}(\tau)$ be the algebra generated by $\tau$. Show $\mathcal{A}$ is the collection of subsets of $X$ which may be written as a finite union of sets of the form $F \cap V$ where $F$ is closed and $V$ is open.
The following notion will be useful in the sequel and plays an analogous role for algebras as a base (Definition 13.8) does for a topology.
Definition 18.8. A set $\mathcal{E} \subset 2^X$ is said to be an elementary family or elementary class provided that
• $\emptyset \in \mathcal{E}$
• $\mathcal{E}$ is closed under finite intersections
• if $E \in \mathcal{E}$, then $E^c$ is a finite disjoint union of sets from $\mathcal{E}$. (In particular $X = \emptyset^c$ is a finite disjoint union of elements from $\mathcal{E}$.)
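Example 18.9. Let $X = \mathbb{R}$, then
$$\mathcal{E} := \big\{ (a,b] \cap \mathbb{R} : a, b \in \bar{\mathbb{R}} \big\} = \{ (a,b] : a \in [-\infty,\infty) \text{ and } a < b < \infty \} \cup \{ (a,\infty) : a \in \mathbb{R} \} \cup \{\emptyset, \mathbb{R}\}$$
is an elementary family.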
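Exercise 18.2. Let $\mathcal{A} \subset 2^X$ and $\mathcal{B} \subset 2^Y$ be elementary families. Show the collection
$$\mathcal{E} = \mathcal{A} \times \mathcal{B} = \{ A \times B : A \in \mathcal{A} \text{ and } B \in \mathcal{B} \}$$
is also an elementary family.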
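Proposition 18.10. Suppose $\mathcal{E} \subset 2^X$ is an elementary family, then $\mathcal{A} = \mathcal{A}(\mathcal{E})$ consists of sets which may be written as finite disjoint unions of sets from $\mathcal{E}$.
Proof. This could be proved making use of Proposition 18.6. However it is easier to give a direct proof. Let $\mathcal{A}$ denote the collection of sets which may be written as finite disjoint unions of sets from $\mathcal{E}$. Clearly $\mathcal{E} \subset \mathcal{A} \subset \mathcal{A}(\mathcal{E})$ so it suffices to show $\mathcal{A}$ is an algebra since $\mathcal{A}(\mathcal{E})$ is the smallest algebra containing $\mathcal{E}$. By the properties of $\mathcal{E}$, we know that $\emptyset, X \in \mathcal{A}$. Now suppose that $A_i = \coprod_{F \in \Lambda_i} F \in \mathcal{A}$ where, for $i = 1,2,\dots,n$, $\Lambda_i$ is a finite collection of disjoint sets from $\mathcal{E}$. Then
$$\bigcap_{i=1}^n A_i = \bigcap_{i=1}^n \Big( \coprod_{F \in \Lambda_i} F \Big) = \bigcup_{(F_1, \dots, F_n) \in \Lambda_1 \times \dots \times \Lambda_n} (F_1 \cap F_2 \cap \dots \cap F_n)$$
and this is a disjoint (you check) union of elements from $\mathcal{E}$. Therefore $\mathcal{A}$ is closed under finite intersections. Similarly, if $A = \coprod_{F \in \Lambda} F$ with $\Lambda$ being a finite collection of disjoint sets from $\mathcal{E}$, then $A^c = \cap_{F \in \Lambda} F^c$. Since by assumption $F^c \in \mathcal{A}$ for $F \in \Lambda \subset \mathcal{E}$ and $\mathcal{A}$ is closed under finite intersections, it follows that $A^c \in \mathcal{A}$.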
Definition 18.11. Let $X$ be a set. We say that a family of sets $\mathcal{F} \subset 2^X$ is a partition of $X$ if distinct members of $\mathcal{F}$ are disjoint and if $X$ is the union of the sets in $\mathcal{F}$.
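Example 18.12. Let $X$ be a set and $\mathcal{E} = \{A_1, \dots, A_n\}$ where $A_1, \dots, A_n$ is a partition of $X$. In this case
$$\mathcal{A}(\mathcal{E}) = \sigma(\mathcal{E}) = \tau(\mathcal{E}) = \{ \cup_{i \in \Lambda} A_i : \Lambda \subset \{1,2,\dots,n\} \}$$
where $\cup_{i \in \Lambda} A_i := \emptyset$ when $\Lambda = \emptyset$. Notice that
$$\#(\mathcal{A}(\mathcal{E})) = \#(2^{\{1,2,\dots,n\}}) = 2^n.$$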
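Proposition 18.13. Suppose that $\mathcal{M} \subset 2^X$ is a $\sigma$-algebra and $\mathcal{M}$ is at most a countable set. Then there exists a unique finite partition $\mathcal{F}$ of $X$ such that $\mathcal{F} \subset \mathcal{M}$ and every element $B \in \mathcal{M}$ is of the form
$$B = \cup \{ A \in \mathcal{F} : A \subset B \}. \tag{18.2}$$
In particular $\mathcal{M}$ is actually a finite set and $\#(\mathcal{M}) = 2^n$ for some $n \in \mathbb{N}$.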
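Proof. For each $x \in X$ let
$$A_x = \cap \{ A \in \mathcal{M} : x \in A \} \in \mathcal{M},$$
wherein we have used $\mathcal{M}$ is a countable $\sigma$-algebra to insure $A_x \in \mathcal{M}$. Hence $A_x$ is the smallest set in $\mathcal{M}$ which contains $x$. Let $C = A_x \cap A_y$. If $x \notin C$ then $A_x \setminus C \subset A_x$ is an element of $\mathcal{M}$ which contains $x$ and since $A_x$ is the smallest member of $\mathcal{M}$ containing $x$, we must have that $C = \emptyset$. Similarly if $y \notin C$ then $C = \emptyset$. Therefore if $C \neq \emptyset$, then $x, y \in A_x \cap A_y \in \mathcal{M}$ and $A_x \cap A_y \subset A_x$ and $A_x \cap A_y \subset A_y$ from which it follows that $A_x = A_x \cap A_y = A_y$. This shows that $\mathcal{F} = \{ A_x : x \in X \} \subset \mathcal{M}$ is a (necessarily countable) partition of $X$ for which Eq. (18.2) holds for all $B \in \mathcal{M}$. Enumerate the elements of $\mathcal{F}$ as $\mathcal{F} = \{P_n\}_{n=1}^N$ where $N \in \mathbb{N}$ or $N = \infty$. If $N = \infty$, then the correspondence
$$a \in \{0,1\}^{\mathbb{N}} \to A_a = \cup \{ P_n : a_n = 1 \} \in \mathcal{M}$$
is bijective and therefore, by Lemma 2.6, $\mathcal{M}$ is uncountable. Thus any countable $\sigma$-algebra is necessarily finite. This finishes the proof modulo the uniqueness assertion which is left as an exercise to the reader.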
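The atoms $A_x$ used in the proof of Proposition 18.13 are easy to compute for a finite $\sigma$-algebra. The following Python sketch is an illustration under assumed data: the function name `atoms`, the set `X` and the three blocks are hypothetical choices. It builds the $\sigma$-algebra of all unions of the blocks, recovers the partition from it, and checks Eq. (18.2) and the count $\#(\mathcal{M}) = 2^n$.

```python
from itertools import combinations

def atoms(X, sigma_algebra):
    """For each x intersect all members containing x; return the distinct results."""
    result = set()
    for x in X:
        A_x = frozenset(X)
        for A in sigma_algebra:
            if x in A:
                A_x &= A
        result.add(A_x)
    return result

X = frozenset(range(1, 7))
blocks = [frozenset({1}), frozenset({2, 3}), frozenset({4, 5, 6})]  # hypothetical partition

# The sigma-algebra consisting of all unions of blocks (including the empty union).
M = {frozenset().union(*choice)
     for r in range(len(blocks) + 1)
     for choice in combinations(blocks, r)}

P = atoms(X, M)
# Eq. (18.2): every B in M is the union of the atoms it contains.
is_union_of_atoms = all(B == frozenset().union(*[A for A in P if A <= B]) for B in M)

print(len(P), len(M), is_union_of_atoms)
```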
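Example 18.14. Let $X = \mathbb{R}$ and
$$\mathcal{E} = \{ (a,\infty) : a \in \mathbb{R} \} \cup \{\mathbb{R}, \emptyset\} = \{ (a,\infty) \cap \mathbb{R} : a \in \bar{\mathbb{R}} \} \subset 2^{\mathbb{R}}.$$
Notice that $\mathcal{E}_f = \mathcal{E}$ and that $\mathcal{E}$ is closed under unions, which shows that $\tau(\mathcal{E}) = \mathcal{E}$, i.e. $\mathcal{E}$ is already a topology. Since $(a,\infty)^c = (-\infty,a]$ we find that $\mathcal{E}_c = \{ (a,\infty), (-\infty,a] : -\infty \le a < \infty \} \cup \{\mathbb{R}, \emptyset\}$. Noting that
$$(a,\infty) \cap (-\infty,b] = (a,b]$$
it follows that $\mathcal{A}(\mathcal{E}) = \mathcal{A}(\tilde{\mathcal{E}})$ where
$$\tilde{\mathcal{E}} := \big\{ (a,b] \cap \mathbb{R} : a, b \in \bar{\mathbb{R}} \big\}.$$
Since $\tilde{\mathcal{E}}$ is an elementary family of subsets of $\mathbb{R}$, Proposition 18.10 implies $\mathcal{A}(\mathcal{E})$ may be described as being those sets which are finite disjoint unions of sets from $\tilde{\mathcal{E}}$. The $\sigma$-algebra, $\sigma(\mathcal{E})$, generated by $\mathcal{E}$ is very complicated. Here are some sets in $\sigma(\mathcal{E})$, most of which are not in $\mathcal{A}(\mathcal{E})$.
(a) $(a,b) = \cup_{n=1}^\infty (a, b - \frac{1}{n}] \in \sigma(\mathcal{E})$.
(b) All of the standard open subsets of $\mathbb{R}$ are in $\sigma(\mathcal{E})$.
(c) $\{x\} = \cap_n (x - \frac{1}{n}, x] \in \sigma(\mathcal{E})$
(d) $[a,b] = \{a\} \cup (a,b] \in \sigma(\mathcal{E})$
(e) Any countable subset of $\mathbb{R}$ is in $\sigma(\mathcal{E})$.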
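Remark 18.15. In the above example, one may replace $\mathcal{E}$ by $\mathcal{E} = \{ (a,\infty) : a \in \mathbb{Q} \} \cup \{\mathbb{R}, \emptyset\}$, in which case $\mathcal{A}(\mathcal{E})$ may be described as being those sets which are finite disjoint unions of sets from the following list
$$\{ (a,\infty), (-\infty,a], (a,b] : a, b \in \mathbb{Q} \} \cup \{\emptyset, \mathbb{R}\}.$$
This shows that $\mathcal{A}(\mathcal{E})$ is a countable set, a useful fact which will be needed later.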
Notation 18.16 For a general topological space $(X, \tau)$, the Borel $\sigma$-algebra is the $\sigma$-algebra $\mathcal{B}_X := \sigma(\tau)$ on $X$. In particular if $X = \mathbb{R}^n$, $\mathcal{B}_{\mathbb{R}^n}$ will be used to denote the Borel $\sigma$-algebra on $\mathbb{R}^n$ when $\mathbb{R}^n$ is equipped with its standard Euclidean topology.
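Exercise 18.3. Verify the $\sigma$-algebra, $\mathcal{B}_{\mathbb{R}}$, is generated by any of the following collections of sets:
1. $\{ (a,\infty) : a \in \mathbb{R} \}$, 2. $\{ (a,\infty) : a \in \mathbb{Q} \}$ or 3. $\{ [a,\infty) : a \in \mathbb{Q} \}$.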
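Proposition 18.17. If $\tau$ is a second countable topology on $X$ and $\mathcal{E}$ is a countable collection of subsets of $X$ such that $\tau = \tau(\mathcal{E})$, then $\mathcal{B}_X := \sigma(\tau) = \sigma(\mathcal{E})$, i.e. $\sigma(\tau(\mathcal{E})) = \sigma(\mathcal{E})$.
Proof. Let $\mathcal{E}_f$ denote the collection of subsets of $X$ which are finite intersections of elements from $\mathcal{E}$ along with $X$ and $\emptyset$. Notice that $\mathcal{E}_f$ is still countable (you prove). A set $Z$ is in $\tau(\mathcal{E})$ iff $Z$ is an arbitrary union of sets from $\mathcal{E}_f$. Therefore $Z = \cup_{A \in \mathcal{F}} A$ for some subset $\mathcal{F} \subset \mathcal{E}_f$ which is necessarily countable. Since $\mathcal{E}_f \subset \sigma(\mathcal{E})$ and $\sigma(\mathcal{E})$ is closed under countable unions it follows that $Z \in \sigma(\mathcal{E})$ and hence that $\tau(\mathcal{E}) \subset \sigma(\mathcal{E})$. Lastly, since $\mathcal{E} \subset \tau(\mathcal{E}) \subset \sigma(\mathcal{E})$, $\sigma(\mathcal{E}) \subset \sigma(\tau(\mathcal{E})) \subset \sigma(\mathcal{E})$.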
18.2 Measurable Functions
Our notion of a measurable function will be analogous to that for a continuous function. For motivational purposes, suppose $(X, \mathcal{M}, \mu)$ is a measure space and $f : X \to \mathbb{R}_+$. Roughly speaking, in the next Chapter we are going to define $\int_X f\, d\mu$ as a certain limit of sums of the form,
$$\sum_{0 < a_1 < a_2 < a_3 < \dots}^\infty a_i\, \mu\big(f^{-1}(a_i, a_{i+1}]\big).$$
For this to make sense we will need to require $f^{-1}((a,b]) \in \mathcal{M}$ for all $a < b$. Because of Lemma 18.22 below, this last condition is equivalent to the condition $f^{-1}(\mathcal{B}_{\mathbb{R}}) \subset \mathcal{M}$.
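Definition 18.18. Let $(X, \mathcal{M})$ and $(Y, \mathcal{F})$ be measurable spaces. A function $f : X \to Y$ is measurable if $f^{-1}(\mathcal{F}) \subset \mathcal{M}$. We will also say that $f$ is $\mathcal{M}/\mathcal{F}$-measurable or $(\mathcal{M}, \mathcal{F})$-measurable.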
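Example 18.19 (Characteristic Functions). Let $(X, \mathcal{M})$ be a measurable space and $A \subset X$. We define the characteristic function $1_A : X \to \mathbb{R}$ by
$$1_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$$
If $A \in \mathcal{M}$, then $1_A$ is $(\mathcal{M}, 2^{\mathbb{R}})$-measurable because $1_A^{-1}(W)$ is either $\emptyset$, $X$, $A$ or $A^c$ for any $W \subset \mathbb{R}$. Conversely, if $\mathcal{F}$ is any $\sigma$-algebra on $\mathbb{R}$ containing a set $W \subset \mathbb{R}$ such that $1 \in W$ and $0 \in W^c$, then $A \in \mathcal{M}$ if $1_A$ is $(\mathcal{M}, \mathcal{F})$-measurable. This is because $A = 1_A^{-1}(W) \in \mathcal{M}$.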
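Exercise 18.4. Suppose $f : X \to Y$ is a function, $\mathcal{F} \subset 2^Y$ and $\mathcal{M} \subset 2^X$. Show $f^{-1}\mathcal{F}$ and $f_*\mathcal{M}$ (see Notation 2.7) are algebras ($\sigma$-algebras) provided $\mathcal{F}$ and $\mathcal{M}$ are algebras ($\sigma$-algebras).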
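Remark 18.20. Let $f : X \to Y$ be a function. Given a $\sigma$-algebra $\mathcal{F} \subset 2^Y$, the $\sigma$-algebra $\mathcal{M} := f^{-1}(\mathcal{F})$ is the smallest $\sigma$-algebra on $X$ such that $f$ is $(\mathcal{M}, \mathcal{F})$-measurable. Similarly, if $\mathcal{M}$ is a $\sigma$-algebra on $X$ then $\mathcal{F} = f_*\mathcal{M}$ is the largest $\sigma$-algebra on $Y$ such that $f$ is $(\mathcal{M}, \mathcal{F})$-measurable.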
Recall from Definition 2.8 that for $\mathcal{E} \subset 2^X$ and $A \subset X$ that
$$\mathcal{E}_A = i_A^{-1}(\mathcal{E}) = \{ A \cap E : E \in \mathcal{E} \}$$
where $i_A : A \to X$ is the inclusion map. Because of Exercise 13.3, when $\mathcal{E} = \mathcal{M}$ is an algebra ($\sigma$-algebra), $\mathcal{M}_A$ is an algebra ($\sigma$-algebra) on $A$ and we call $\mathcal{M}_A$ the relative or induced algebra ($\sigma$-algebra) on $A$.
The next two Lemmas are direct analogues of their topological counterparts in Lemmas 13.13 and 13.14. For completeness, the proofs will be given even though they are the same as those for Lemmas 13.13 and 13.14.
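Lemma 18.21. Suppose that $(X, \mathcal{M})$, $(Y, \mathcal{F})$ and $(Z, \mathcal{G})$ are measurable spaces. If $f : (X, \mathcal{M}) \to (Y, \mathcal{F})$ and $g : (Y, \mathcal{F}) \to (Z, \mathcal{G})$ are measurable functions then $g \circ f : (X, \mathcal{M}) \to (Z, \mathcal{G})$ is measurable as well.
Proof. By assumption $g^{-1}(\mathcal{G}) \subset \mathcal{F}$ and $f^{-1}(\mathcal{F}) \subset \mathcal{M}$ so that
$$(g \circ f)^{-1}(\mathcal{G}) = f^{-1}\big(g^{-1}(\mathcal{G})\big) \subset f^{-1}(\mathcal{F}) \subset \mathcal{M}.$$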
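Lemma 18.22. Suppose that $f : X \to Y$ is a function and $\mathcal{E} \subset 2^Y$ and $A \subset Y$ then
$$\sigma\big(f^{-1}(\mathcal{E})\big) = f^{-1}(\sigma(\mathcal{E})) \text{ and} \tag{18.3}$$
$$(\sigma(\mathcal{E}))_A = \sigma(\mathcal{E}_A). \tag{18.4}$$
(Similar assertions hold with $\sigma(\cdot)$ being replaced by $\mathcal{A}(\cdot)$.) Moreover, if $\mathcal{F} = \sigma(\mathcal{E})$ and $\mathcal{M}$ is a $\sigma$-algebra on $X$, then $f$ is $(\mathcal{M}, \mathcal{F})$-measurable iff $f^{-1}(\mathcal{E}) \subset \mathcal{M}$.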
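Proof. By Exercise 18.4, $f^{-1}(\sigma(\mathcal{E}))$ is a $\sigma$-algebra and since $\mathcal{E} \subset \sigma(\mathcal{E})$, $f^{-1}(\mathcal{E}) \subset f^{-1}(\sigma(\mathcal{E}))$. It now follows that $\sigma(f^{-1}(\mathcal{E})) \subset f^{-1}(\sigma(\mathcal{E}))$. For the reverse inclusion, notice that
$$f_* \sigma\big(f^{-1}(\mathcal{E})\big) = \big\{ B \subset Y : f^{-1}(B) \in \sigma\big(f^{-1}(\mathcal{E})\big) \big\}$$
is a $\sigma$-algebra which contains $\mathcal{E}$ and thus $\sigma(\mathcal{E}) \subset f_* \sigma\big(f^{-1}(\mathcal{E})\big)$. Hence if $B \in \sigma(\mathcal{E})$ we know that $f^{-1}(B) \in \sigma\big(f^{-1}(\mathcal{E})\big)$, i.e. $f^{-1}(\sigma(\mathcal{E})) \subset \sigma\big(f^{-1}(\mathcal{E})\big)$ and Eq. (18.3) has been proved. Applying Eq. (18.3) with $X = A$ and $f = i_A$ being the inclusion map implies
$$(\sigma(\mathcal{E}))_A = i_A^{-1}(\sigma(\mathcal{E})) = \sigma(i_A^{-1}(\mathcal{E})) = \sigma(\mathcal{E}_A).$$
Lastly if $f^{-1}\mathcal{E} \subset \mathcal{M}$, then $f^{-1}\sigma(\mathcal{E}) = \sigma\big(f^{-1}\mathcal{E}\big) \subset \mathcal{M}$ which shows $f$ is $(\mathcal{M}, \mathcal{F})$-measurable.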
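Equation (18.3) can be verified by brute force when $X$ and $Y$ are finite. The Python sketch below is only a sanity check on assumed data: the function names `sigma` and `preimage`, the map $x \mapsto x \bmod 3$ and the generating family `E` are hypothetical choices for this example.

```python
def sigma(X, E):
    """Sigma-algebra on the finite set X generated by the family E."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in E}
    while True:
        new = {X - A for A in family} | {A | B for A in family for B in family}
        if new <= family:          # closed under complements and unions
            return family
        family |= new

def preimage(f, X, B):
    """f^{-1}(B) as a subset of X."""
    return frozenset(x for x in X if f(x) in B)

X = range(6)
Y = range(3)
f = lambda x: x % 3                # hypothetical map X -> Y
E = [{0}, {0, 1}]                  # hypothetical generating family on Y

lhs = sigma(X, [preimage(f, X, B) for B in E])     # sigma(f^{-1}(E))
rhs = {preimage(f, X, B) for B in sigma(Y, E)}     # f^{-1}(sigma(E))

print(lhs == rhs, len(lhs))
```

Closure under complements and pairwise unions suffices here because every family of subsets of a finite set is finite.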
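Corollary 18.23. Suppose that $(X, \mathcal{M})$ is a measurable space. Then the following conditions on a function $f : X \to \mathbb{R}$ are equivalent:
1. $f$ is $(\mathcal{M}, \mathcal{B}_{\mathbb{R}})$-measurable,
2. $f^{-1}((a,\infty)) \in \mathcal{M}$ for all $a \in \mathbb{R}$,
3. $f^{-1}((a,\infty)) \in \mathcal{M}$ for all $a \in \mathbb{Q}$,
4. $f^{-1}((-\infty,a]) \in \mathcal{M}$ for all $a \in \mathbb{R}$.
Proof. This is an exercise in using Lemma 18.22 and is the content of Exercise 18.8.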
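Here is yet another way to generate $\sigma$-algebras. (Compare with the analogous topological Definition 13.20.)
Definition 18.24 ($\sigma$-Algebras Generated by Functions). Let $X$ be a set and suppose there is a collection of measurable spaces $\{ (Y_\alpha, \mathcal{F}_\alpha) : \alpha \in A \}$ and functions $f_\alpha : X \to Y_\alpha$ for all $\alpha \in A$. Let $\sigma(f_\alpha : \alpha \in A)$ denote the smallest $\sigma$-algebra on $X$ such that each $f_\alpha$ is measurable, i.e.
$$\sigma(f_\alpha : \alpha \in A) = \sigma\big(\cup_\alpha f_\alpha^{-1}(\mathcal{F}_\alpha)\big).$$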
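Proposition 18.25. Assuming the notation in Definition 18.24 and additionally let $(Z, \mathcal{M})$ be a measurable space and $g : Z \to X$ be a function. Then $g$ is $(\mathcal{M}, \sigma(f_\alpha : \alpha \in A))$-measurable iff $f_\alpha \circ g$ is $(\mathcal{M}, \mathcal{F}_\alpha)$-measurable for all $\alpha \in A$.
Proof. This proof is essentially the same as the proof of the topological analogue in Proposition 13.21. ($\Rightarrow$) If $g$ is $(\mathcal{M}, \sigma(f_\alpha : \alpha \in A))$-measurable, then the composition $f_\alpha \circ g$ is $(\mathcal{M}, \mathcal{F}_\alpha)$-measurable by Lemma 18.21. ($\Leftarrow$) Let
$$\mathcal{G} = \sigma(f_\alpha : \alpha \in A) = \sigma\Big( \cup_{\alpha \in A} f_\alpha^{-1}(\mathcal{F}_\alpha) \Big).$$
If $f_\alpha \circ g$ is $(\mathcal{M}, \mathcal{F}_\alpha)$-measurable for all $\alpha$, then
$$g^{-1} f_\alpha^{-1}(\mathcal{F}_\alpha) \subset \mathcal{M} \quad \forall\, \alpha \in A$$
and therefore
$$g^{-1}\Big( \cup_{\alpha \in A} f_\alpha^{-1}(\mathcal{F}_\alpha) \Big) = \cup_{\alpha \in A}\, g^{-1} f_\alpha^{-1}(\mathcal{F}_\alpha) \subset \mathcal{M}.$$
Hence
$$g^{-1}(\mathcal{G}) = g^{-1}\Big( \sigma\Big( \cup_{\alpha \in A} f_\alpha^{-1}(\mathcal{F}_\alpha) \Big) \Big) = \sigma\Big( g^{-1}\Big( \cup_{\alpha \in A} f_\alpha^{-1}(\mathcal{F}_\alpha) \Big) \Big) \subset \mathcal{M}$$
which shows that $g$ is $(\mathcal{M}, \mathcal{G})$-measurable.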
Definition 18.26. A function $f : X \to Y$ between two topological spaces is Borel measurable if $f^{-1}(\mathcal{B}_Y) \subset \mathcal{B}_X$.
Proposition 18.27. Let $X$ and $Y$ be two topological spaces and $f : X \to Y$ be a continuous function. Then $f$ is Borel measurable.
Proof. Using Lemma 18.22 and $\mathcal{B}_Y = \sigma(\tau_Y)$,
$$f^{-1}(\mathcal{B}_Y) = f^{-1}(\sigma(\tau_Y)) = \sigma(f^{-1}(\tau_Y)) \subset \sigma(\tau_X) = \mathcal{B}_X.$$
Definition 18.28. Given measurable spaces $(X, \mathcal{M})$ and $(Y, \mathcal{F})$ and a subset $A \subset X$. We say a function $f : A \to Y$ is measurable iff $f$ is $\mathcal{M}_A/\mathcal{F}$-measurable.
Proposition 18.29 (Localizing Measurability). Let $(X, \mathcal{M})$ and $(Y, \mathcal{F})$ be measurable spaces and $f : X \to Y$ be a function.
1. If $f$ is measurable and $A \subset X$ then $f|_A : A \to Y$ is measurable.
2. Suppose there exist $A_n \in \mathcal{M}$ such that $X = \cup_{n=1}^\infty A_n$ and $f|_{A_n}$ is $\mathcal{M}_{A_n}$-measurable for all $n$, then $f$ is $\mathcal{M}$-measurable.
Proof. As the reader will notice, the proof given below is essentially identical to the proof of Proposition 13.19 which is the topological analogue of this proposition.
1. If $f : X \to Y$ is measurable, $f^{-1}(B) \in \mathcal{M}$ for all $B \in \mathcal{F}$ and therefore
$$f|_A^{-1}(B) = A \cap f^{-1}(B) \in \mathcal{M}_A \quad \text{for all } B \in \mathcal{F}.$$
2. If $B \in \mathcal{F}$, then
$$f^{-1}(B) = \cup_{n=1}^\infty \big( f^{-1}(B) \cap A_n \big) = \cup_{n=1}^\infty f|_{A_n}^{-1}(B).$$
Since each $A_n \in \mathcal{M}$, $\mathcal{M}_{A_n} \subset \mathcal{M}$ and so the previous displayed equation shows $f^{-1}(B) \in \mathcal{M}$.
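Proposition 18.30. If $(X, \mathcal{M})$ is a measurable space, then
$$f = (f_1, f_2, \dots, f_n) : X \to \mathbb{R}^n$$
is $(\mathcal{M}, \mathcal{B}_{\mathbb{R}^n})$-measurable iff $f_i : X \to \mathbb{R}$ is $(\mathcal{M}, \mathcal{B}_{\mathbb{R}})$-measurable for each $i$. In particular, a function $f : X \to \mathbb{C}$ is $(\mathcal{M}, \mathcal{B}_{\mathbb{C}})$-measurable iff $\operatorname{Re} f$ and $\operatorname{Im} f$ are $(\mathcal{M}, \mathcal{B}_{\mathbb{R}})$-measurable.
Proof. This is formally a consequence of Corollary 18.65 and Proposition 18.60 below. Nevertheless it is instructive to give a direct proof now. Let $\tau = \tau_{\mathbb{R}^n}$ denote the usual topology on $\mathbb{R}^n$ and $\pi_i : \mathbb{R}^n \to \mathbb{R}$ be projection onto the $i$-th factor. Since $\pi_i$ is continuous, $\pi_i$ is $\mathcal{B}_{\mathbb{R}^n}/\mathcal{B}_{\mathbb{R}}$-measurable and therefore if $f : X \to \mathbb{R}^n$ is measurable then so is $f_i = \pi_i \circ f$. Now suppose $f_i : X \to \mathbb{R}$ is measurable for all $i = 1,2,\dots,n$. Let
$$\mathcal{E} := \{ (a,b) : a, b \in \mathbb{Q}^n \ni a < b \},$$
where, for $a, b \in \mathbb{R}^n$, we write $a < b$ iff $a_i < b_i$ for $i = 1,2,\dots,n$ and let
$$(a,b) = (a_1,b_1) \times \dots \times (a_n,b_n).$$
Since $\mathcal{E} \subset \tau$ and every element $V \in \tau$ may be written as a (necessarily) countable union of elements from $\mathcal{E}$, we have $\sigma(\mathcal{E}) \subset \mathcal{B}_{\mathbb{R}^n} = \sigma(\tau) \subset \sigma(\mathcal{E})$, i.e. $\sigma(\mathcal{E}) = \mathcal{B}_{\mathbb{R}^n}$. (This part of the proof is essentially a direct proof of Corollary 18.65 below.) Because
$$f^{-1}((a,b)) = f_1^{-1}((a_1,b_1)) \cap f_2^{-1}((a_2,b_2)) \cap \dots \cap f_n^{-1}((a_n,b_n)) \in \mathcal{M}$$
for all $a, b \in \mathbb{Q}^n$ with $a < b$, it follows that $f^{-1}\mathcal{E} \subset \mathcal{M}$ and therefore
$$f^{-1}\mathcal{B}_{\mathbb{R}^n} = f^{-1}\sigma(\mathcal{E}) = \sigma\big(f^{-1}\mathcal{E}\big) \subset \mathcal{M}.$$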
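Corollary 18.31. Let $(X, \mathcal{M})$ be a measurable space and $f, g : X \to \mathbb{C}$ be $(\mathcal{M}, \mathcal{B}_{\mathbb{C}})$-measurable functions. Then $f \pm g$ and $f \cdot g$ are also $(\mathcal{M}, \mathcal{B}_{\mathbb{C}})$-measurable.
Proof. Define $F : X \to \mathbb{C} \times \mathbb{C}$, $A_\pm : \mathbb{C} \times \mathbb{C} \to \mathbb{C}$ and $M : \mathbb{C} \times \mathbb{C} \to \mathbb{C}$ by $F(x) = (f(x), g(x))$, $A_\pm(w,z) = w \pm z$ and $M(w,z) = wz$. Then $A_\pm$ and $M$ are continuous and hence $(\mathcal{B}_{\mathbb{C}^2}, \mathcal{B}_{\mathbb{C}})$-measurable. Also $F$ is $(\mathcal{M}, \mathcal{B}_{\mathbb{C}} \otimes \mathcal{B}_{\mathbb{C}}) = (\mathcal{M}, \mathcal{B}_{\mathbb{C}^2})$-measurable since $\pi_1 \circ F = f$ and $\pi_2 \circ F = g$ are $(\mathcal{M}, \mathcal{B}_{\mathbb{C}})$-measurable. Therefore $A_\pm \circ F = f \pm g$ and $M \circ F = f \cdot g$, being the composition of measurable functions, are also measurable.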
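Lemma 18.32. Let $\alpha \in \mathbb{C}$, $(X, \mathcal{M})$ be a measurable space and $f : X \to \mathbb{C}$ be a $(\mathcal{M}, \mathcal{B}_{\mathbb{C}})$-measurable function. Then
$$F(x) := \begin{cases} \frac{1}{f(x)} & \text{if } f(x) \neq 0 \\ \alpha & \text{if } f(x) = 0 \end{cases}$$
is measurable.
Proof. Define $i : \mathbb{C} \to \mathbb{C}$ by
$$i(z) = \begin{cases} \frac{1}{z} & \text{if } z \neq 0 \\ \alpha & \text{if } z = 0. \end{cases}$$
For any open set $V \subset \mathbb{C}$ we have
$$i^{-1}(V) = \big( i^{-1}(V) \setminus \{0\} \big) \cup \big( i^{-1}(V) \cap \{0\} \big).$$
Because $i$ is continuous except at $z = 0$, $i^{-1}(V) \setminus \{0\}$ is an open set and hence in $\mathcal{B}_{\mathbb{C}}$. Moreover, $i^{-1}(V) \cap \{0\} \in \mathcal{B}_{\mathbb{C}}$ since $i^{-1}(V) \cap \{0\}$ is either the empty set or the one point set $\{0\}$. Therefore $i^{-1}(\tau_{\mathbb{C}}) \subset \mathcal{B}_{\mathbb{C}}$ and hence $i^{-1}(\mathcal{B}_{\mathbb{C}}) = i^{-1}(\sigma(\tau_{\mathbb{C}})) = \sigma(i^{-1}(\tau_{\mathbb{C}})) \subset \mathcal{B}_{\mathbb{C}}$ which shows that $i$ is Borel measurable. Since $F = i \circ f$ is the composition of measurable functions, $F$ is also measurable.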
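We will often deal with functions $f : X \to \bar{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$. When talking about measurability in this context we will refer to the $\sigma$-algebra on $\bar{\mathbb{R}}$ defined by
$$\mathcal{B}_{\bar{\mathbb{R}}} := \sigma\big( \{ [a,\infty] : a \in \mathbb{R} \} \big). \tag{18.5}$$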
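Proposition 18.33 (The Structure of $\mathcal{B}_{\bar{\mathbb{R}}}$). Let $\mathcal{B}_{\mathbb{R}}$ and $\mathcal{B}_{\bar{\mathbb{R}}}$ be as above, then
$$\mathcal{B}_{\bar{\mathbb{R}}} = \{ A \subset \bar{\mathbb{R}} : A \cap \mathbb{R} \in \mathcal{B}_{\mathbb{R}} \}. \tag{18.6}$$
In particular $\{\infty\}, \{-\infty\} \in \mathcal{B}_{\bar{\mathbb{R}}}$ and $\mathcal{B}_{\mathbb{R}} \subset \mathcal{B}_{\bar{\mathbb{R}}}$.
Proof. Let us first observe that
$$\{-\infty\} = \cap_{n=1}^\infty [-\infty, -n) = \cap_{n=1}^\infty [-n, \infty]^c \in \mathcal{B}_{\bar{\mathbb{R}}},$$
$$\{\infty\} = \cap_{n=1}^\infty [n, \infty] \in \mathcal{B}_{\bar{\mathbb{R}}} \text{ and } \mathbb{R} = \bar{\mathbb{R}} \setminus \{\pm\infty\} \in \mathcal{B}_{\bar{\mathbb{R}}}.$$
Letting $i : \mathbb{R} \to \bar{\mathbb{R}}$ be the inclusion map,
$$\begin{aligned}
i^{-1}(\mathcal{B}_{\bar{\mathbb{R}}}) &= \sigma\big( i^{-1}\big( \{ [a,\infty] : a \in \bar{\mathbb{R}} \} \big) \big) = \sigma\big( \{ i^{-1}([a,\infty]) : a \in \bar{\mathbb{R}} \} \big) \\
&= \sigma\big( \{ [a,\infty] \cap \mathbb{R} : a \in \bar{\mathbb{R}} \} \big) = \sigma\big( \{ [a,\infty) : a \in \mathbb{R} \} \big) = \mathcal{B}_{\mathbb{R}}.
\end{aligned}$$
Thus we have shown
$$\mathcal{B}_{\mathbb{R}} = i^{-1}(\mathcal{B}_{\bar{\mathbb{R}}}) = \{ A \cap \mathbb{R} : A \in \mathcal{B}_{\bar{\mathbb{R}}} \}.$$
This implies:
1. $A \in \mathcal{B}_{\bar{\mathbb{R}}} \Longrightarrow A \cap \mathbb{R} \in \mathcal{B}_{\mathbb{R}}$ and
2. if $A \subset \bar{\mathbb{R}}$ is such that $A \cap \mathbb{R} \in \mathcal{B}_{\mathbb{R}}$ there exists $B \in \mathcal{B}_{\bar{\mathbb{R}}}$ such that $A \cap \mathbb{R} = B \cap \mathbb{R}$. Because $A \,\Delta\, B \subset \{\pm\infty\}$ and $\{\infty\}, \{-\infty\} \in \mathcal{B}_{\bar{\mathbb{R}}}$ we may conclude that $A \in \mathcal{B}_{\bar{\mathbb{R}}}$ as well.
This proves Eq. (18.6).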
The proofs of the next two corollaries are left to the reader, see Exercises
18.5 and 18.6.
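Corollary 18.34. Let $(X, \mathcal{M})$ be a measurable space and $f : X \to \bar{\mathbb{R}}$ be a function. Then the following are equivalent:
1. $f$ is $(\mathcal{M}, \mathcal{B}_{\bar{\mathbb{R}}})$-measurable,
2. $f^{-1}((a, \infty]) \in \mathcal{M}$ for all $a \in \mathbb{R}$,
3. $f^{-1}([-\infty, a]) \in \mathcal{M}$ for all $a \in \mathbb{R}$,
4. $f^{-1}(\{-\infty\}) \in \mathcal{M}$, $f^{-1}(\{\infty\}) \in \mathcal{M}$ and $f^0 : X \to \mathbb{R}$ defined by
$$f^0(x) := \begin{cases} f(x) & \text{if } f(x) \in \mathbb{R} \\ 0 & \text{if } f(x) \in \{\pm\infty\} \end{cases}$$
is measurable.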
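Corollary 18.35. Let $(X, \mathcal{M})$ be a measurable space, $f, g : X \to \bar{\mathbb{R}}$ be functions and define $f \cdot g : X \to \bar{\mathbb{R}}$ and $(f + g) : X \to \bar{\mathbb{R}}$ using the conventions $0 \cdot \infty = 0$ and $(f + g)(x) = 0$ if $f(x) = \infty$ and $g(x) = -\infty$ or $f(x) = -\infty$ and $g(x) = \infty$. Then $f \cdot g$ and $f + g$ are measurable functions on $X$ if both $f$ and $g$ are measurable.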
Exercise 18.5. Prove Corollary 18.34 noting that the equivalence of items 1. to 3. is a direct analogue of Corollary 18.23. Use Proposition 18.33 to handle item 4.
Exercise 18.6. Prove Corollary 18.35.
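Proposition 18.36 (Closure under sups, infs and limits). Suppose that $(X, \mathcal{M})$ is a measurable space and $f_j : (X, \mathcal{M}) \to \bar{\mathbb{R}}$ for $j \in \mathbb{N}$ is a sequence of $\mathcal{M}/\mathcal{B}_{\bar{\mathbb{R}}}$-measurable functions. Then
$$\sup_j f_j, \quad \inf_j f_j, \quad \limsup_{j \to \infty} f_j \quad \text{and} \quad \liminf_{j \to \infty} f_j$$
are all $\mathcal{M}/\mathcal{B}_{\bar{\mathbb{R}}}$-measurable functions. (Note that this result is in general false when $(X, \mathcal{M})$ is a topological space and measurable is replaced by continuous in the statement.)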
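Proof. Define $g_+(x) := \sup_j f_j(x)$, then
$$\{x : g_+(x) \le a\} = \{x : f_j(x) \le a \ \forall\, j\} = \bigcap_j \{x : f_j(x) \le a\} \in \mathcal{M}$$
so that $g_+$ is measurable. Similarly if $g_-(x) = \inf_j f_j(x)$ then
$$\{x : g_-(x) \ge a\} = \bigcap_j \{x : f_j(x) \ge a\} \in \mathcal{M}.$$
Since
$$\limsup_{j \to \infty} f_j = \inf_n \sup\{f_j : j \ge n\} \quad \text{and} \quad \liminf_{j \to \infty} f_j = \sup_n \inf\{f_j : j \ge n\}$$
we are done by what we have already proved.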
Definition 18.37. Given a function $f : X \to \bar{\mathbb{R}}$ let $f_+(x) := \max\{f(x), 0\}$ and $f_-(x) := \max(-f(x), 0) = -\min(f(x), 0)$. Notice that $f = f_+ - f_-$.
Corollary 18.38. Suppose $(X, \mathcal{M})$ is a measurable space and $f : X \to \bar{\mathbb{R}}$ is a function. Then $f$ is measurable iff $f_{\pm}$ are measurable.
Proof. If $f$ is measurable, then Proposition 18.36 implies $f_{\pm}$ are measurable. Conversely if $f_{\pm}$ are measurable then so is $f = f_+ - f_-$.
18.2.1 More general pointwise limits
Lemma 18.39. Suppose that $(X, \mathcal{M})$ is a measurable space, $(Y, d)$ is a metric space and $f_j : X \to Y$ is $(\mathcal{M}, \mathcal{B}_Y)$-measurable for all $j$. Also assume that for each $x \in X$, $f(x) = \lim_{n \to \infty} f_n(x)$ exists. Then $f : X \to Y$ is also $(\mathcal{M}, \mathcal{B}_Y)$-measurable.
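Proof. Let $V \in \tau_d$ and $W_m := \{y \in Y : d_{V^c}(y) > 1/m\}$ for $m = 1, 2, \dots$. Then $W_m \in \tau_d$,
$$W_m \subset \bar{W}_m \subset \{y \in Y : d_{V^c}(y) \ge 1/m\} \subset V$$
for all $m$ and $W_m \uparrow V$ as $m \to \infty$. The proof will be completed by verifying the identity
$$f^{-1}(V) = \bigcup_{m=1}^{\infty} \bigcup_{N=1}^{\infty} \bigcap_{n \ge N} f_n^{-1}(W_m) \in \mathcal{M}.$$
If $x \in f^{-1}(V)$ then $f(x) \in V$ and hence $f(x) \in W_m$ for some $m$. Since $f_n(x) \to f(x)$, $f_n(x) \in W_m$ for almost all $n$. That is, $x \in \bigcup_{m=1}^{\infty} \bigcup_{N=1}^{\infty} \bigcap_{n \ge N} f_n^{-1}(W_m)$. Conversely, when $x \in \bigcup_{m=1}^{\infty} \bigcup_{N=1}^{\infty} \bigcap_{n \ge N} f_n^{-1}(W_m)$ there exists an $m$ such that $f_n(x) \in W_m \subset \bar{W}_m$ for almost all $n$. Since $f_n(x) \to f(x) \in \bar{W}_m \subset V$, it follows that $x \in f^{-1}(V)$.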
Remark 18.40. In the previous Lemma 18.39 it is possible to let $(Y, \tau)$ be any topological space which has the "regularity" property that if $V \in \tau$ there exist $W_m \in \tau$ such that $W_m \subset \bar{W}_m \subset V$ and $V = \bigcup_{m=1}^{\infty} W_m$. Moreover, some extra condition is necessary on the topology $\tau$ in order for Lemma 18.39 to be correct. For example, let $Y = \{1, 2, 3\}$ and $\tau = \{Y, \emptyset, \{1, 2\}, \{2, 3\}, \{2\}\}$ as in Example 13.36, and let $X = \{a, b\}$ with the trivial $\sigma$-algebra. Let $f_j(a) = f_j(b) = 2$ for all $j$, then $f_j$ is constant and hence measurable. Let $f(a) = 1$ and $f(b) = 2$, then $f_j \to f$ as $j \to \infty$ with $f$ being non-measurable. Notice that the Borel $\sigma$-algebra on $Y$ is $2^Y$.
18.3 $\sigma$-Function Algebras
In this subsection, we are going to relate $\sigma$-algebras of subsets of a set $X$ to certain algebras of functions on $X$. We will begin this endeavor after proving the simple but very useful approximation Theorem 18.42 below.
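Definition 18.41. Let $(X, \mathcal{M})$ be a measurable space. A function $\varphi : X \to \mathbb{F}$ ($\mathbb{F}$ denotes either $\mathbb{R}$, $\mathbb{C}$ or $[0, \infty] \subset \bar{\mathbb{R}}$) is a simple function if $\varphi$ is $\mathcal{M}$-$\mathcal{B}_{\mathbb{F}}$ measurable and $\varphi(X)$ contains only finitely many elements.
Any such simple function can be written as
$$\varphi = \sum_{i=1}^{n} \lambda_i 1_{A_i} \quad \text{with } A_i \in \mathcal{M} \text{ and } \lambda_i \in \mathbb{F}. \tag{18.7}$$
Indeed, take $\lambda_1, \lambda_2, \dots, \lambda_n$ to be an enumeration of the range of $\varphi$ and $A_i = \varphi^{-1}(\{\lambda_i\})$. Note that this argument shows that any simple function may be written intrinsically as
$$\varphi = \sum_{y \in \mathbb{F}} y\, 1_{\varphi^{-1}(\{y\})}. \tag{18.8}$$
The next theorem shows that simple functions are "pointwise dense" in the space of measurable functions.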
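Theorem 18.42 (Approximation Theorem). Let $f : X \to [0, \infty]$ be measurable and define, see Figure 18.2,
$$\varphi_n(x) := \sum_{k=0}^{2^{2n}-1} \frac{k}{2^n} 1_{f^{-1}\left(\left(\frac{k}{2^n}, \frac{k+1}{2^n}\right]\right)}(x) + 2^n 1_{f^{-1}((2^n, \infty])}(x) = \sum_{k=0}^{2^{2n}-1} \frac{k}{2^n} 1_{\left\{\frac{k}{2^n} < f \le \frac{k+1}{2^n}\right\}}(x) + 2^n 1_{\{f > 2^n\}}(x),$$
then $\varphi_n \le f$ for all $n$, $\varphi_n(x) \uparrow f(x)$ for all $x \in X$ and $\varphi_n \uparrow f$ uniformly on the sets $X_M := \{x \in X : f(x) \le M\}$ with $M < \infty$. Moreover, if $f : X \to \mathbb{C}$ is a measurable function, then there exist simple functions $\varphi_n$ such that $\lim_{n \to \infty} \varphi_n(x) = f(x)$ for all $x$ and $|\varphi_n| \uparrow |f|$ as $n \to \infty$.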
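Fig. 18.2. Constructing simple functions approximating a function, $f : X \to [0, \infty]$.
Proof. Since
$$\left(\frac{k}{2^n}, \frac{k+1}{2^n}\right] = \left(\frac{2k}{2^{n+1}}, \frac{2k+1}{2^{n+1}}\right] \cup \left(\frac{2k+1}{2^{n+1}}, \frac{2k+2}{2^{n+1}}\right],$$
if $x \in f^{-1}\left(\left(\frac{2k}{2^{n+1}}, \frac{2k+1}{2^{n+1}}\right]\right)$ then $\varphi_n(x) = \varphi_{n+1}(x) = \frac{2k}{2^{n+1}}$, and if $x \in f^{-1}\left(\left(\frac{2k+1}{2^{n+1}}, \frac{2k+2}{2^{n+1}}\right]\right)$ then $\varphi_n(x) = \frac{2k}{2^{n+1}} < \frac{2k+1}{2^{n+1}} = \varphi_{n+1}(x)$. Similarly
$$(2^n, \infty] = (2^n, 2^{n+1}] \cup (2^{n+1}, \infty],$$
and so for $x \in f^{-1}((2^{n+1}, \infty])$, $\varphi_n(x) = 2^n < 2^{n+1} = \varphi_{n+1}(x)$ and for $x \in f^{-1}((2^n, 2^{n+1}])$, $\varphi_{n+1}(x) \ge 2^n = \varphi_n(x)$. Therefore $\varphi_n \le \varphi_{n+1}$ for all $n$. It is clear by construction that $\varphi_n(x) \le f(x)$ for all $x$ and that $0 \le f(x) - \varphi_n(x) \le 2^{-n}$ if $x \in X_{2^n}$. Hence we have shown that $\varphi_n(x) \uparrow f(x)$ for all $x \in X$ and $\varphi_n \uparrow f$ uniformly on bounded sets.
For the second assertion, first assume that $f : X \to \mathbb{R}$ is a measurable function and choose $\varphi_n^{\pm}$ to be simple functions such that $\varphi_n^{\pm} \uparrow f_{\pm}$ as $n \to \infty$ and define $\varphi_n = \varphi_n^+ - \varphi_n^-$. Then
$$|\varphi_n| = \varphi_n^+ + \varphi_n^- \le \varphi_{n+1}^+ + \varphi_{n+1}^- = |\varphi_{n+1}|$$
and clearly $|\varphi_n| = \varphi_n^+ + \varphi_n^- \uparrow f_+ + f_- = |f|$ and $\varphi_n = \varphi_n^+ - \varphi_n^- \to f_+ - f_- = f$ as $n \to \infty$.
Now suppose that $f : X \to \mathbb{C}$ is measurable. We may now choose simple functions $u_n$ and $v_n$ such that $|u_n| \uparrow |\operatorname{Re} f|$, $|v_n| \uparrow |\operatorname{Im} f|$, $u_n \to \operatorname{Re} f$ and $v_n \to \operatorname{Im} f$ as $n \to \infty$. Let $\varphi_n = u_n + i v_n$, then
$$|\varphi_n|^2 = u_n^2 + v_n^2 \uparrow |\operatorname{Re} f|^2 + |\operatorname{Im} f|^2 = |f|^2$$
and $\varphi_n = u_n + i v_n \to \operatorname{Re} f + i \operatorname{Im} f = f$ as $n \to \infty$.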
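The dyadic construction in Theorem 18.42 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `phi` and the sample values are illustrative assumptions. It evaluates $\varphi_n$ at a point where $f$ takes the value $y \in [0, \infty]$, using exact rational arithmetic.

```python
import math
from fractions import Fraction

def phi(n, y):
    """Value of phi_n at a point where f takes the value y, 0 <= y <= infinity."""
    top = 2 ** n
    if y > top:                      # the point lies in f^{-1}((2^n, infinity])
        return Fraction(top)
    if y == 0:                       # no interval (k/2^n, (k+1)/2^n] contains 0
        return Fraction(0)
    k = math.ceil(y * top) - 1       # the unique k with k/2^n < y <= (k+1)/2^n
    return Fraction(k, top)

# Hypothetical sample values of f, including the value infinity
samples = [Fraction(0), Fraction(3, 4), Fraction(7, 3), Fraction(10), math.inf]
table = {y: [phi(n, y) for n in range(1, 6)] for y in samples}
```

For each sample value the list `table[y]` is non-decreasing in $n$ and bounded by $y$, and once $y \le 2^n$ the error $y - \varphi_n$ is at most $2^{-n}$, matching the uniform convergence on the sets $X_M$.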
For the rest of this section let X be a given set.
Definition 18.43 (Bounded Convergence). We say that a sequence of functions $f_n$ from $X$ to $\mathbb{R}$ or $\mathbb{C}$ converges boundedly to a function $f$ if $\lim_{n \to \infty} f_n(x) = f(x)$ for all $x \in X$ and
$$\sup\{|f_n(x)| : x \in X \text{ and } n = 1, 2, \dots\} < \infty.$$
Definition 18.44. A function algebra $H$ on $X$ is a linear subspace of $\ell^{\infty}(X, \mathbb{R})$ which contains $1$ and is closed under pointwise multiplication, i.e. $H$ is a subalgebra of $\ell^{\infty}(X, \mathbb{R})$ which contains $1$. If $H$ is further closed under bounded convergence then $H$ is said to be a $\sigma$-function algebra.
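Example 18.45. Suppose $\mathcal{M}$ is a $\sigma$-algebra on $X$, then
$$\ell^{\infty}(\mathcal{M}, \mathbb{R}) := \{f \in \ell^{\infty}(X, \mathbb{R}) : f \text{ is } \mathcal{M}/\mathcal{B}_{\mathbb{R}}\text{-measurable}\} \tag{18.9}$$
is a $\sigma$-function algebra. The next theorem will show that these are the only examples of $\sigma$-function algebras. (See Exercise 18.7 below for examples of function algebras on $X$.)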
Notation 18.46 If $H \subset \ell^{\infty}(X, \mathbb{R})$ is a function algebra, let
$$\mathcal{M}(H) := \{A \subset X : 1_A \in H\}. \tag{18.10}$$
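Theorem 18.47. Let $H$ be a $\sigma$-function algebra on a set $X$. Then
1. $\mathcal{M}(H)$ is a $\sigma$-algebra on $X$.
2. $H = \ell^{\infty}(\mathcal{M}(H), \mathbb{R})$.
3. The map
$$\mathcal{M} \in \{\sigma\text{-algebras on } X\} \longrightarrow \ell^{\infty}(\mathcal{M}, \mathbb{R}) \in \{\sigma\text{-function algebras on } X\} \tag{18.11}$$
is bijective with inverse given by $H \to \mathcal{M}(H)$.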
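Proof. Let $\mathcal{M} := \mathcal{M}(H)$.
1. Since $0, 1 \in H$, $\emptyset, X \in \mathcal{M}$. If $A \in \mathcal{M}$ then, since $H$ is a linear subspace of $\ell^{\infty}(X, \mathbb{R})$, $1_{A^c} = 1 - 1_A \in H$ which shows $A^c \in \mathcal{M}$. If $\{A_n\}_{n=1}^{\infty} \subset \mathcal{M}$, then since $H$ is an algebra,
$$1_{\bigcap_{n=1}^{N} A_n} = \prod_{n=1}^{N} 1_{A_n} =: f_N \in H$$
for all $N \in \mathbb{N}$. Because $H$ is closed under bounded convergence it follows that
$$1_{\bigcap_{n=1}^{\infty} A_n} = \lim_{N \to \infty} f_N \in H$$
and this implies $\bigcap_{n=1}^{\infty} A_n \in \mathcal{M}$. Hence we have shown $\mathcal{M}$ is a $\sigma$-algebra.
2. Since $H$ is an algebra, $p(f) \in H$ for any $f \in H$ and any polynomial $p$ on $\mathbb{R}$. The Weierstrass approximation Theorem 10.35 asserts that polynomials on $\mathbb{R}$ are uniformly dense in the space of continuous functions on any compact subinterval of $\mathbb{R}$. Hence if $f \in H$ and $\varphi \in C(\mathbb{R})$, there exist polynomials $p_n$ on $\mathbb{R}$ such that $p_n \circ f(x)$ converges to $\varphi \circ f(x)$ uniformly (and hence boundedly) in $x \in X$ as $n \to \infty$. Therefore $\varphi \circ f \in H$ for all $f \in H$ and $\varphi \in C(\mathbb{R})$, and in particular $|f| \in H$ and $f_{\pm} := \frac{|f| \pm f}{2} \in H$ if $f \in H$. Fix an $\alpha \in \mathbb{R}$ and for $n \in \mathbb{N}$ let $\varphi_n(t) := (t - \alpha)_+^{1/n}$, where $(t - \alpha)_+ := \max\{t - \alpha, 0\}$. Then $\varphi_n \in C(\mathbb{R})$ and $\varphi_n(t) \to 1_{t > \alpha}$ as $n \to \infty$, and the convergence is bounded when $t$ is restricted to any compact subset of $\mathbb{R}$. Hence if $f \in H$ it follows that $1_{f > \alpha} = \lim_{n \to \infty} \varphi_n(f) \in H$ for all $\alpha \in \mathbb{R}$, i.e. $\{f > \alpha\} \in \mathcal{M}$ for all $\alpha \in \mathbb{R}$. Therefore if $f \in H$ then $f \in \ell^{\infty}(\mathcal{M}, \mathbb{R})$ and we have shown $H \subset \ell^{\infty}(\mathcal{M}, \mathbb{R})$. Conversely if $f \in \ell^{\infty}(\mathcal{M}, \mathbb{R})$, then for any $\alpha < \beta$, $\{\alpha < f \le \beta\} \in \mathcal{M} = \mathcal{M}(H)$ and so by the definition of $\mathcal{M}(H)$, $1_{\{\alpha < f \le \beta\}} \in H$. Combining this remark with the approximation Theorem 18.42 and the fact that $H$ is closed under bounded convergence shows that $f \in H$. Hence we have shown $\ell^{\infty}(\mathcal{M}, \mathbb{R}) \subset H$, which combined with $H \subset \ell^{\infty}(\mathcal{M}, \mathbb{R})$ already proved shows $H = \ell^{\infty}(\mathcal{M}(H), \mathbb{R})$.
3. Items 1. and 2. show the map in Eq. (18.11) is surjective. To see the map is injective suppose $\mathcal{M}$ and $\mathcal{F}$ are two $\sigma$-algebras on $X$ such that $\ell^{\infty}(\mathcal{M}, \mathbb{R}) = \ell^{\infty}(\mathcal{F}, \mathbb{R})$, then
$$\mathcal{M} = \{A \subset X : 1_A \in \ell^{\infty}(\mathcal{M}, \mathbb{R})\} = \{A \subset X : 1_A \in \ell^{\infty}(\mathcal{F}, \mathbb{R})\} = \mathcal{F}.$$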
Notation 18.48 Suppose $M$ is a subset of $\ell^{\infty}(X, \mathbb{R})$.
1. Let $H(M)$ denote the smallest subspace of $\ell^{\infty}(X, \mathbb{R})$ which contains $M$ and the constant functions and is closed under bounded convergence.
2. Let $H_{\sigma}(M)$ denote the smallest $\sigma$-function algebra containing $M$.
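Theorem 18.49. Suppose $M$ is a subset of $\ell^{\infty}(X, \mathbb{R})$, then $H_{\sigma}(M) = \ell^{\infty}(\sigma(M), \mathbb{R})$, or in other words the following diagram commutes:
$$\begin{array}{ccc}
M & \longrightarrow & \sigma(M) \\
M \in \{\text{Multiplicative Subsets}\} & \longrightarrow & \{\sigma\text{-algebras}\} \ni \mathcal{M} \\
\downarrow & & \downarrow \\
H_{\sigma}(M) \in \{\sigma\text{-function algebras}\} & = & \{\sigma\text{-function algebras}\} \ni \ell^{\infty}(\mathcal{M}, \mathbb{R}).
\end{array}$$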
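Proof. Since $\ell^{\infty}(\sigma(M), \mathbb{R})$ is a $\sigma$-function algebra which contains $M$ it follows that
$$H_{\sigma}(M) \subset \ell^{\infty}(\sigma(M), \mathbb{R}).$$
For the opposite inclusion, let
$$\mathcal{M} = \mathcal{M}(H_{\sigma}(M)) := \{A \subset X : 1_A \in H_{\sigma}(M)\}.$$
By Theorem 18.47, $M \subset H_{\sigma}(M) = \ell^{\infty}(\mathcal{M}, \mathbb{R})$, which implies that every $f \in M$ is $\mathcal{M}$-measurable. This then implies $\sigma(M) \subset \mathcal{M}$ and therefore
$$\ell^{\infty}(\sigma(M), \mathbb{R}) \subset \ell^{\infty}(\mathcal{M}, \mathbb{R}) = H_{\sigma}(M).$$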
Definition 18.50 (Multiplicative System). A collection of bounded real or complex valued functions, $M$, on a set $X$ is called a multiplicative system if $f \cdot g \in M$ whenever $f$ and $g$ are in $M$.
Theorem 18.51 (Dynkin's Multiplicative System Theorem). Suppose $M \subset \ell^{\infty}(X, \mathbb{R})$ is a multiplicative system, then
$$H(M) = H_{\sigma}(M) = \ell^{\infty}(\sigma(M), \mathbb{R}). \tag{18.12}$$
This can also be stated as follows.
Suppose $H$ is a linear subspace of $\ell^{\infty}(X, \mathbb{R})$ such that: $1 \in H$, $H$ is closed under bounded convergence, and $M \subset H$. Then $H$ contains all bounded real valued $\sigma(M)$-measurable functions, i.e. $\ell^{\infty}(\sigma(M), \mathbb{R}) \subset H$.
(In words, the smallest subspace of bounded real valued functions on $X$ which contains $M$ and is closed under bounded convergence is the same as the space of bounded real valued $\sigma(M)$-measurable functions on $X$.)
Proof. The assertion that $H_{\sigma}(M) = \ell^{\infty}(\sigma(M), \mathbb{R})$ has already been proved in Theorem 18.49. Since any $\sigma$-function algebra containing $M$ is also a subspace of $\ell^{\infty}(X, \mathbb{R})$ which contains the constant functions and is closed under bounded convergence (compare with Exercise 18.13), it follows that $H(M) \subset H_{\sigma}(M)$. To complete the proof it suffices to show the inclusion $H(M) \subset H_{\sigma}(M)$ is an equality. We will accomplish this below by showing $H(M)$ is also a $\sigma$-function algebra.
For any $f \in H := H(M)$ let
$$H_f := \{g \in H : fg \in H\} \subset H$$
and notice that $H_f$ is a linear subspace of $\ell^{\infty}(X, \mathbb{R})$ which is closed under bounded convergence. Moreover if $f \in M$, then $M \subset H_f$ since $M$ is multiplicative. Therefore $H_f = H$ and we have shown that $fg \in H$ whenever $f \in M$ and $g \in H$. Given this it now follows that $M \subset H_f$ for any $f \in H$ and by the same reasoning just used, $H_f = H$. Since $f \in H$ is arbitrary, we have shown $fg \in H$ for all $f, g \in H$, i.e. $H$ is an algebra, which by the definition of $H(M)$ in Notation 18.48 contains the constant functions, i.e. $H(M)$ is a $\sigma$-function algebra.
Theorem 18.52 (Complex Multiplicative System Theorem). Suppose $H$ is a complex linear subspace of $\ell^{\infty}(X, \mathbb{C})$ such that: $1 \in H$, $H$ is closed under complex conjugation, and $H$ is closed under bounded convergence. If $M \subset H$ is a multiplicative system which is closed under conjugation, then $H$ contains all bounded complex valued $\sigma(M)$-measurable functions, i.e. $\ell^{\infty}(\sigma(M), \mathbb{C}) \subset H$.
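Proof. Let $M_0 = \operatorname{span}_{\mathbb{C}}(M \cup \{1\})$ be the complex span of $M$. As the reader should verify, $M_0$ is an algebra, $M_0 \subset H$, $M_0$ is closed under complex conjugation and $\sigma(M_0) = \sigma(M)$. Let $H^{\mathbb{R}} := H \cap \ell^{\infty}(X, \mathbb{R})$ and $M_0^{\mathbb{R}} = M_0 \cap \ell^{\infty}(X, \mathbb{R})$. Then (you verify) $M_0^{\mathbb{R}}$ is a multiplicative system, $M_0^{\mathbb{R}} \subset H^{\mathbb{R}}$ and $H^{\mathbb{R}}$ is a linear space containing $1$ which is closed under bounded convergence. Therefore by Theorem 18.51, $\ell^{\infty}\left(\sigma\left(M_0^{\mathbb{R}}\right), \mathbb{R}\right) \subset H^{\mathbb{R}}$. Since $H$ and $M_0$ are complex linear spaces closed under complex conjugation, for any $f \in H$ or $f \in M_0$, the functions $\operatorname{Re} f = \frac{1}{2}\left(f + \bar{f}\right)$ and $\operatorname{Im} f = \frac{1}{2i}\left(f - \bar{f}\right)$ are in $H$ or $M_0$ respectively. Therefore $H = H^{\mathbb{R}} + iH^{\mathbb{R}}$, $M_0 = M_0^{\mathbb{R}} + iM_0^{\mathbb{R}}$, $\sigma\left(M_0^{\mathbb{R}}\right) = \sigma(M_0) = \sigma(M)$ and
$$\ell^{\infty}(\sigma(M), \mathbb{C}) = \ell^{\infty}\left(\sigma\left(M_0^{\mathbb{R}}\right), \mathbb{R}\right) + i\,\ell^{\infty}\left(\sigma\left(M_0^{\mathbb{R}}\right), \mathbb{R}\right) \subset H^{\mathbb{R}} + iH^{\mathbb{R}} = H.$$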
Exercise 18.7 (Algebra analogue of Theorem 18.47). Call a function algebra $H \subset \ell^{\infty}(X, \mathbb{R})$ a simple function algebra if the range of each function $f \in H$ is a finite subset of $\mathbb{R}$. Prove there is a one to one correspondence between algebras $\mathcal{A}$ on a set $X$ and simple function algebras $H$ on $X$.
Definition 18.53. A collection of subsets, $\mathcal{C}$, of $X$ is a multiplicative class (or a $\pi$-class) if $\mathcal{C}$ is closed under finite intersections.
Corollary 18.54. Suppose $H$ is a subspace of $\ell^{\infty}(X, \mathbb{R})$ which is closed under bounded convergence and $1 \in H$. If $\mathcal{C} \subset 2^X$ is a multiplicative class such that $1_A \in H$ for all $A \in \mathcal{C}$, then $H$ contains all bounded $\sigma(\mathcal{C})$-measurable functions.
Proof. Let $M = \{1\} \cup \{1_A : A \in \mathcal{C}\}$. Then $M \subset H$ is a multiplicative system and the proof is completed with an application of Theorem 18.51.
Corollary 18.55. Suppose that $(X, d)$ is a metric space and $\mathcal{B}_X = \sigma(\tau_d)$ is the Borel $\sigma$-algebra on $X$ and $H$ is a subspace of $\ell^{\infty}(X, \mathbb{R})$ such that $BC(X, \mathbb{R}) \subset H$ and $H$ is closed under bounded convergence [1]. Then $H$ contains all bounded $\mathcal{B}_X$-measurable real valued functions on $X$. (This may be stated as follows: the smallest vector space of bounded functions which is closed under bounded convergence and contains $BC(X, \mathbb{R})$ is the space of bounded $\mathcal{B}_X$-measurable real valued functions on $X$.)
[1] Recall that $BC(X, \mathbb{R})$ are the bounded continuous functions on $X$.
Proof. Let $V \in \tau_d$ be an open subset of $X$ and for $n \in \mathbb{N}$ let
$$f_n(x) := \min(n \cdot d_{V^c}(x), 1) \quad \text{for all } x \in X.$$
Notice that $f_n = \varphi_n \circ d_{V^c}$ where $\varphi_n(t) = \min(nt, 1)$ (see Figure 18.3), which is continuous, and hence $f_n \in BC(X, \mathbb{R})$ for all $n$. Furthermore, $f_n$ converges boundedly to $1_{\{d_{V^c} > 0\}} = 1_V$ as $n \to \infty$ and therefore $1_V \in H$ for all $V \in \tau_d$. Since $\tau_d$ is a $\pi$-class, the result now follows by an application of Corollary 18.54.
Fig. 18.3. Plots of $\varphi_1$, $\varphi_2$ and $\varphi_3$.
Here are some more variants of Corollary 18.55.
Proposition 18.56. Let $(X, d)$ be a metric space, $\mathcal{B}_X = \sigma(\tau_d)$ be the Borel $\sigma$-algebra and assume there exist compact sets $K_k \subset X$ such that $K_k^o \uparrow X$. Suppose that $H$ is a subspace of $\ell^{\infty}(X, \mathbb{R})$ such that $C_c(X, \mathbb{R}) \subset H$ ($C_c(X, \mathbb{R})$ is the space of continuous functions with compact support) and $H$ is closed under bounded convergence. Then $H$ contains all bounded $\mathcal{B}_X$-measurable real valued functions on $X$.
Proof. Let $k$ and $n$ be positive integers and set $\psi_{n,k}(x) = \min(1, n \cdot d_{(K_k^o)^c}(x))$. Then $\psi_{n,k} \in C_c(X, \mathbb{R})$ and $\{\psi_{n,k} \neq 0\} \subset K_k^o$. Let $H_{n,k}$ denote those bounded $\mathcal{B}_X$-measurable functions, $f : X \to \mathbb{R}$, such that $\psi_{n,k} f \in H$. It is easily seen that $H_{n,k}$ is closed under bounded convergence and that $H_{n,k}$ contains $BC(X, \mathbb{R})$, and therefore by Corollary 18.55, $\psi_{n,k} f \in H$ for all bounded measurable functions $f : X \to \mathbb{R}$. Since $\psi_{n,k} f \to 1_{K_k^o} f$ boundedly as $n \to \infty$, $1_{K_k^o} f \in H$ for all $k$, and similarly $1_{K_k^o} f \to f$ boundedly as $k \to \infty$ and therefore $f \in H$.
Lemma 18.57. Suppose that $(X, \tau)$ is a locally compact second countable Hausdorff space [2]. Then:
1. every open subset $U \subset X$ is $\sigma$-compact. In fact $U$ is still a locally compact second countable Hausdorff space.
2. If $F \subset X$ is a closed set, there exist open sets $V_n \subset X$ such that $V_n \downarrow F$ as $n \to \infty$.
3. To each open set $U \subset X$ there exist $f_n \prec U$ (i.e. $f_n \in C_c(U, [0, 1])$) such that $\lim_{n \to \infty} f_n = 1_U$.
4. $\mathcal{B}_X = \sigma(C_c(X, \mathbb{R}))$, i.e. the $\sigma$-algebra generated by $C_c(X)$ is the Borel $\sigma$-algebra on $X$.
[2] For example any separable locally compact metric space and in particular any open subset of $\mathbb{R}^n$.
Proof.
1. Let $U$ be an open subset of $X$, $\mathcal{V}$ be a countable base for $\tau$ and
$$\mathcal{V}^U := \{W \in \mathcal{V} : \bar{W} \subset U \text{ and } \bar{W} \text{ is compact}\}.$$
For each $x \in U$, by Proposition 15.7, there exists an open neighborhood $V$ of $x$ such that $\bar{V} \subset U$ and $\bar{V}$ is compact. Since $\mathcal{V}$ is a base for the topology $\tau$, there exists $W \in \mathcal{V}$ such that $x \in W \subset V$. Because $\bar{W} \subset \bar{V}$, it follows that $\bar{W}$ is compact and hence $W \in \mathcal{V}^U$. As $x \in U$ was arbitrary, $U = \bigcup \mathcal{V}^U$. This shows $\mathcal{V}^U$ is a countable basis for the topology on $U$ and that $U$ is still locally compact.
Let $\{W_n\}_{n=1}^{\infty}$ be an enumeration of $\mathcal{V}^U$ and set $K_n := \bigcup_{k=1}^{n} \bar{W}_k$. Then $K_n \uparrow U$ as $n \to \infty$ and $K_n$ is compact for each $n$. This shows $U$ is $\sigma$-compact. (See Exercise 14.7.)
2. Let $\{K_n\}_{n=1}^{\infty}$ be compact subsets of $F^c$ such that $K_n \uparrow F^c$ as $n \to \infty$ and set $V_n := K_n^c = X \setminus K_n$. Then $V_n \downarrow F$ and by Proposition 15.5, $V_n$ is open for each $n$.
3. Let $U \subset X$ be an open set and $\{K_n\}_{n=1}^{\infty}$ be compact subsets of $U$ such that $K_n \uparrow U$. By Urysohn's Lemma 15.8, there exist $f_n \prec U$ such that $f_n = 1$ on $K_n$. These functions satisfy $1_U = \lim_{n \to \infty} f_n$.
4. By item 3., $1_U$ is $\sigma(C_c(X, \mathbb{R}))$-measurable for all $U \in \tau$ and hence $\tau \subset \sigma(C_c(X, \mathbb{R}))$. Therefore $\mathcal{B}_X = \sigma(\tau) \subset \sigma(C_c(X, \mathbb{R}))$. The converse inclusion holds because continuous functions are always Borel measurable.
Here is a variant of Corollary 18.55.
Corollary 18.58. Suppose that $(X, \tau)$ is a second countable locally compact Hausdorff space and $\mathcal{B}_X = \sigma(\tau)$ is the Borel $\sigma$-algebra on $X$. If $H$ is a subspace of $\ell^{\infty}(X, \mathbb{R})$ which is closed under bounded convergence and contains $C_c(X, \mathbb{R})$, then $H$ contains all bounded $\mathcal{B}_X$-measurable real valued functions on $X$.
Proof. By Item 3. of Lemma 18.57, for every $U \in \tau$ the characteristic function, $1_U$, may be written as a bounded pointwise limit of functions from $C_c(X, \mathbb{R})$. Therefore $1_U \in H$ for all $U \in \tau$. Since $\tau$ is a $\pi$-class, the proof is finished with an application of Corollary 18.54.
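18.4 Product $\sigma$-Algebras
Let $\{(X_{\alpha}, \mathcal{M}_{\alpha})\}_{\alpha \in A}$ be a collection of measurable spaces, $X = X_A = \prod_{\alpha \in A} X_{\alpha}$ and $\pi_{\alpha} : X_A \to X_{\alpha}$ be the canonical projection map as in Notation 2.2.
Definition 18.59 (Product $\sigma$-Algebra). The product $\sigma$-algebra, $\otimes_{\alpha \in A} \mathcal{M}_{\alpha}$, is the smallest $\sigma$-algebra on $X$ such that each $\pi_{\alpha}$ for $\alpha \in A$ is measurable, i.e.
$$\otimes_{\alpha \in A} \mathcal{M}_{\alpha} := \sigma(\pi_{\alpha} : \alpha \in A) = \sigma\left( \bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{M}_{\alpha}) \right).$$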
Applying Proposition 18.25 in this setting implies the following proposition.
Proposition 18.60. Suppose $Y$ is a measurable space and $f : Y \to X = X_A$ is a map. Then $f$ is measurable iff $\pi_{\alpha} \circ f : Y \to X_{\alpha}$ is measurable for all $\alpha \in A$. In particular if $A = \{1, 2, \dots, n\}$ so that $X = X_1 \times X_2 \times \dots \times X_n$ and $f(y) = (f_1(y), f_2(y), \dots, f_n(y)) \in X_1 \times X_2 \times \dots \times X_n$, then $f : Y \to X_A$ is measurable iff $f_i : Y \to X_i$ is measurable for all $i$.
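Proposition 18.61. Suppose that $(X_{\alpha}, \mathcal{M}_{\alpha})_{\alpha \in A}$ is a collection of measurable spaces and $\mathcal{E}_{\alpha} \subset \mathcal{M}_{\alpha}$ generates $\mathcal{M}_{\alpha}$ for each $\alpha \in A$, then
$$\otimes_{\alpha \in A} \mathcal{M}_{\alpha} = \sigma\left( \bigcup_{\alpha \in A} \pi_{\alpha}^{-1}(\mathcal{E}_{\alpha}) \right). \tag{18.13}$$
Moreover, suppose that $A$ is either finite or countably infinite, $X_{\alpha} \in \mathcal{E}_{\alpha}$ for each $\alpha \in A$, and $\mathcal{M}_{\alpha} = \sigma(\mathcal{E}_{\alpha})$ for each $\alpha \in A$. Then the product $\sigma$-algebra satisfies
$$\otimes_{\alpha \in A} \mathcal{M}_{\alpha} = \sigma\left( \left\{ \prod_{\alpha \in A} E_{\alpha} : E_{\alpha} \in \mathcal{E}_{\alpha} \text{ for all } \alpha \in A \right\} \right). \tag{18.14}$$
In particular if $A = \{1, 2, \dots, n\}$, then $X = X_1 \times X_2 \times \dots \times X_n$ and
$$\mathcal{M}_1 \otimes \mathcal{M}_2 \otimes \dots \otimes \mathcal{M}_n = \sigma(\mathcal{M}_1 \times \mathcal{M}_2 \times \dots \times \mathcal{M}_n),$$
where $\mathcal{M}_1 \times \mathcal{M}_2 \times \dots \times \mathcal{M}_n$ is as defined in Notation 13.26.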
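Proof. Since $\bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{E}_{\alpha}) \subset \bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{M}_{\alpha})$, it follows that
$$\mathcal{F} := \sigma\left( \bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{E}_{\alpha}) \right) \subset \sigma\left( \bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{M}_{\alpha}) \right) = \otimes_{\alpha \in A} \mathcal{M}_{\alpha}.$$
Conversely,
$$\mathcal{F} \supset \sigma(\pi_{\alpha}^{-1}(\mathcal{E}_{\alpha})) = \pi_{\alpha}^{-1}(\sigma(\mathcal{E}_{\alpha})) = \pi_{\alpha}^{-1}(\mathcal{M}_{\alpha})$$
holds for all $\alpha$, which implies that $\bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{M}_{\alpha}) \subset \mathcal{F}$ and hence that $\otimes_{\alpha \in A} \mathcal{M}_{\alpha} \subset \mathcal{F}$.
We now prove Eq. (18.14). Since we are assuming that $X_{\alpha} \in \mathcal{E}_{\alpha}$ for each $\alpha \in A$, we see that
$$\bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{E}_{\alpha}) \subset \left\{ \prod_{\alpha \in A} E_{\alpha} : E_{\alpha} \in \mathcal{E}_{\alpha} \text{ for all } \alpha \in A \right\}$$
and therefore by Eq. (18.13)
$$\otimes_{\alpha \in A} \mathcal{M}_{\alpha} = \sigma\left( \bigcup_{\alpha} \pi_{\alpha}^{-1}(\mathcal{E}_{\alpha}) \right) \subset \sigma\left( \left\{ \prod_{\alpha \in A} E_{\alpha} : E_{\alpha} \in \mathcal{E}_{\alpha} \text{ for all } \alpha \in A \right\} \right).$$
This last statement is true independent of whether $A$ is countable or not. For the reverse inclusion it suffices to notice that since $A$ is countable,
$$\prod_{\alpha \in A} E_{\alpha} = \bigcap_{\alpha \in A} \pi_{\alpha}^{-1}(E_{\alpha}) \in \otimes_{\alpha \in A} \mathcal{M}_{\alpha}$$
and hence
$$\sigma\left( \left\{ \prod_{\alpha \in A} E_{\alpha} : E_{\alpha} \in \mathcal{E}_{\alpha} \text{ for all } \alpha \in A \right\} \right) \subset \otimes_{\alpha \in A} \mathcal{M}_{\alpha}.$$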
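Remark 18.62. One cannot relax the assumption that $X_{\alpha} \in \mathcal{E}_{\alpha}$ in the second part of Proposition 18.61. For example, if $X_1 = X_2 = \{1, 2\}$ and $\mathcal{E}_1 = \mathcal{E}_2 = \{\{1\}\}$, then $\sigma(\mathcal{E}_1 \times \mathcal{E}_2) = \{\emptyset, X_1 \times X_2, \{(1, 1)\}, \{(1, 1)\}^c\}$ while $\sigma(\sigma(\mathcal{E}_1) \times \sigma(\mathcal{E}_2)) = 2^{X_1 \times X_2}$.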
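The counterexample of Remark 18.62 lives on a four point set, so both $\sigma$-algebras can be enumerated by brute force. The sketch below is illustrative only; the helper `generated_sigma_algebra` is a hypothetical name, and on a finite set closure under complements and pairwise unions is all that is needed.

```python
from itertools import product

def generated_sigma_algebra(universe, generators):
    """Smallest family containing the generators that is closed under
    complements and (finite) unions; on a finite set this is the sigma-algebra."""
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    while True:
        closure = {universe - a for a in family} | {a | b for a in family for b in family}
        if closure <= family:        # nothing new was produced: family is closed
            return family
        family |= closure

X1 = X2 = {1, 2}
X = set(product(X1, X2))
E1 = E2 = [{1}]

# sigma(E1 x E2): generated by the single rectangle {1} x {1}
small = generated_sigma_algebra(X, [set(product(a, b)) for a in E1 for b in E2])

# sigma(sigma(E1) x sigma(E2)): generated by all rectangles with measurable sides
S1 = generated_sigma_algebra(X1, E1)
S2 = generated_sigma_algebra(X2, E2)
big = generated_sigma_algebra(X, [set(product(a, b)) for a in S1 for b in S2])
```

Here `small` has four elements while `big` is the full power set of $X_1 \times X_2$ with sixteen elements, so the two $\sigma$-algebras differ.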
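Theorem 18.63. Let $\{X_{\alpha}\}_{\alpha \in A}$ be a sequence of sets where $A$ is at most countable. Suppose for each $\alpha \in A$ we are given a countable set $\mathcal{E}_{\alpha} \subset 2^{X_{\alpha}}$. Let $\tau_{\alpha} = \tau(\mathcal{E}_{\alpha})$ be the topology on $X_{\alpha}$ generated by $\mathcal{E}_{\alpha}$ and $X$ be the product space $\prod_{\alpha \in A} X_{\alpha}$ equipped with the product topology $\tau := \otimes_{\alpha \in A} \tau(\mathcal{E}_{\alpha})$. Then the Borel $\sigma$-algebra $\mathcal{B}_X = \sigma(\tau)$ is the same as the product $\sigma$-algebra:
$$\mathcal{B}_X = \otimes_{\alpha \in A} \mathcal{B}_{X_{\alpha}},$$
where $\mathcal{B}_{X_{\alpha}} = \sigma(\tau(\mathcal{E}_{\alpha})) = \sigma(\mathcal{E}_{\alpha})$ for all $\alpha \in A$.
In particular if $A = \{1, 2, \dots, n\}$ and each $(X_i, \tau_i)$ is a second countable topological space, then
$$\mathcal{B}_X := \sigma(\tau_1 \otimes \tau_2 \otimes \dots \otimes \tau_n) = \sigma(\mathcal{B}_{X_1} \times \dots \times \mathcal{B}_{X_n}) =: \mathcal{B}_{X_1} \otimes \dots \otimes \mathcal{B}_{X_n}.$$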
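Proof. By Proposition 13.25, the topology $\tau$ may be described as the smallest topology containing $\mathcal{E} = \bigcup_{\alpha \in A} \pi_{\alpha}^{-1}(\mathcal{E}_{\alpha})$. Now $\mathcal{E}$ is the countable union of countable sets so is still countable. Therefore by Proposition 18.17 and Proposition 18.61,
$$\mathcal{B}_X = \sigma(\tau) = \sigma(\tau(\mathcal{E})) = \sigma(\mathcal{E}) = \otimes_{\alpha \in A} \sigma(\mathcal{E}_{\alpha}) = \otimes_{\alpha \in A} \sigma(\tau_{\alpha}) = \otimes_{\alpha \in A} \mathcal{B}_{X_{\alpha}}.$$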
Corollary 18.64. If $(X_i, d_i)$ are separable metric spaces for $i = 1, \dots, n$, then
$$\mathcal{B}_{X_1} \otimes \dots \otimes \mathcal{B}_{X_n} = \mathcal{B}_{(X_1 \times \dots \times X_n)}$$
where $\mathcal{B}_{X_i}$ is the Borel $\sigma$-algebra on $X_i$ and $\mathcal{B}_{(X_1 \times \dots \times X_n)}$ is the Borel $\sigma$-algebra on $X_1 \times \dots \times X_n$ equipped with the metric topology associated to the metric $d(x, y) = \sum_{i=1}^{n} d_i(x_i, y_i)$ where $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$.
Proof. This is a combination of the results in Lemma 13.28, Exercise 13.12 and Theorem 18.63.
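Because all norms on finite dimensional spaces are equivalent, the usual Euclidean norm on $\mathbb{R}^m \times \mathbb{R}^n$ is equivalent to the "product" norm defined by
$$\|(x, y)\|_{\mathbb{R}^m \times \mathbb{R}^n} = \|x\|_{\mathbb{R}^m} + \|y\|_{\mathbb{R}^n}.$$
Hence by Lemma 13.28, the Euclidean topology on $\mathbb{R}^{m+n}$ is the same as the product topology on $\mathbb{R}^{m+n} \cong \mathbb{R}^m \times \mathbb{R}^n$. Here we are identifying $\mathbb{R}^m \times \mathbb{R}^n$ with $\mathbb{R}^{m+n}$ by the map
$$(x, y) \in \mathbb{R}^m \times \mathbb{R}^n \longrightarrow (x_1, \dots, x_m, y_1, \dots, y_n) \in \mathbb{R}^{m+n}.$$
These comments along with Corollary 18.64 prove the following result.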
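Corollary 18.65. After identifying $\mathbb{R}^m \times \mathbb{R}^n$ with $\mathbb{R}^{m+n}$ as above and letting $\mathcal{B}_{\mathbb{R}^n}$ denote the Borel $\sigma$-algebra on $\mathbb{R}^n$, we have
$$\mathcal{B}_{\mathbb{R}^{m+n}} = \mathcal{B}_{\mathbb{R}^m} \otimes \mathcal{B}_{\mathbb{R}^n} \quad \text{and} \quad \mathcal{B}_{\mathbb{R}^n} = \overbrace{\mathcal{B}_{\mathbb{R}} \otimes \dots \otimes \mathcal{B}_{\mathbb{R}}}^{n\text{-times}}.$$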
18.4.1 Factoring of Measurable Maps
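Lemma 18.66. Suppose that $(Y, \mathcal{F})$ is a measurable space and $F : X \to Y$ is a map. Then to every $(\sigma(F), \mathcal{B}_{\bar{\mathbb{R}}})$-measurable function, $H : X \to \bar{\mathbb{R}}$, there is a $(\mathcal{F}, \mathcal{B}_{\bar{\mathbb{R}}})$-measurable function $h : Y \to \bar{\mathbb{R}}$ such that $H = h \circ F$.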
Proof. First suppose that $H = 1_A$ where $A \in \sigma(F) = F^{-1}(\mathcal{F})$. Let $B \in \mathcal{F}$ be such that $A = F^{-1}(B)$, then $1_A = 1_{F^{-1}(B)} = 1_B \circ F$ and hence the Lemma is valid in this case with $h = 1_B$. More generally if $H = \sum a_i 1_{A_i}$ is a simple function, then there exist $B_i \in \mathcal{F}$ such that $1_{A_i} = 1_{B_i} \circ F$ and hence $H = h \circ F$ with $h := \sum a_i 1_{B_i}$, a simple function on $Y$. For a general $(\sigma(F), \mathcal{B}_{\bar{\mathbb{R}}})$-measurable function, $H$, from $X \to \bar{\mathbb{R}}$, choose simple functions $H_n$ converging to $H$. Let $h_n$ be simple functions on $Y$ such that $H_n = h_n \circ F$. Then it follows that
$$H = \lim_{n \to \infty} H_n = \limsup_{n \to \infty} H_n = \limsup_{n \to \infty} h_n \circ F = h \circ F$$
where $h := \limsup_{n \to \infty} h_n$ is a measurable function from $Y$ to $\bar{\mathbb{R}}$.
The following is an immediate corollary of Proposition 18.25 and Lemma
18.66.
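Corollary 18.67. Let $X$ and $A$ be sets, and suppose for $\alpha \in A$ we are given a measurable space $(Y_{\alpha}, \mathcal{F}_{\alpha})$ and a function $f_{\alpha} : X \to Y_{\alpha}$. Let $Y := \prod_{\alpha \in A} Y_{\alpha}$, $\mathcal{F} := \otimes_{\alpha \in A} \mathcal{F}_{\alpha}$ be the product $\sigma$-algebra on $Y$ and $\mathcal{M} := \sigma(f_{\alpha} : \alpha \in A)$ be the smallest $\sigma$-algebra on $X$ such that each $f_{\alpha}$ is measurable. Then the function $F : X \to Y$ defined by $[F(x)]_{\alpha} := f_{\alpha}(x)$ for each $\alpha \in A$ is $(\mathcal{M}, \mathcal{F})$-measurable, and a function $H : X \to \bar{\mathbb{R}}$ is $(\mathcal{M}, \mathcal{B}_{\bar{\mathbb{R}}})$-measurable iff there exists a $(\mathcal{F}, \mathcal{B}_{\bar{\mathbb{R}}})$-measurable function $h$ from $Y$ to $\bar{\mathbb{R}}$ such that $H = h \circ F$.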
18.5 Exercises
Exercise 18.8. Prove Corollary 18.23. Hint: See Exercise 18.3.
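Exercise 18.9. If $\mathcal{M}$ is the $\sigma$-algebra generated by $\mathcal{E} \subset 2^X$, then $\mathcal{M}$ is the union of the $\sigma$-algebras generated by countable subsets $\mathcal{F} \subset \mathcal{E}$.
Exercise 18.10. Let $(X, \mathcal{M})$ be a measurable space and $f_n : X \to \mathbb{F}$ be a sequence of measurable functions on $X$. Show that $\{x : \lim_{n \to \infty} f_n(x) \text{ exists in } \mathbb{F}\} \in \mathcal{M}$.
Exercise 18.11. Show that every monotone function $f : \mathbb{R} \to \mathbb{R}$ is $(\mathcal{B}_{\mathbb{R}}, \mathcal{B}_{\mathbb{R}})$-measurable.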
Exercise 18.12. Show by example that the supremum of an uncountable
family of measurable functions need not be measurable. (Folland problem 2.6
on p. 48.)
Exercise 18.13. Let $X = \{1, 2, 3, 4\}$, $A = \{1, 2\}$, $B = \{2, 3\}$ and $M := \{1_A, 1_B\}$. Show $H_{\sigma}(M) \neq H(M)$ in this case.
19
Measures and Integration
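Definition 19.1. A measure $\mu$ on a measurable space $(X, \mathcal{M})$ is a function $\mu : \mathcal{M} \to [0, \infty]$ such that
1. $\mu(\emptyset) = 0$ and
2. (Finite Additivity) If $\{A_i\}_{i=1}^{n} \subset \mathcal{M}$ are pairwise disjoint, i.e. $A_i \cap A_j = \emptyset$ when $i \neq j$, then
$$\mu\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} \mu(A_i).$$
3. (Continuity) If $A_n \in \mathcal{M}$ and $A_n \uparrow A$, then $\mu(A_n) \uparrow \mu(A)$.
We call a triple $(X, \mathcal{M}, \mu)$, where $(X, \mathcal{M})$ is a measurable space and $\mu : \mathcal{M} \to [0, \infty]$ is a measure, a measure space.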
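Remark 19.2. Properties 2) and 3) in Definition 19.1 are equivalent to the following condition. If $\{A_i\}_{i=1}^{\infty} \subset \mathcal{M}$ are pairwise disjoint then
$$\mu\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} \mu(A_i). \tag{19.1}$$
To prove this assume that Properties 2) and 3) in Definition 19.1 hold and $\{A_i\}_{i=1}^{\infty} \subset \mathcal{M}$ are pairwise disjoint. Letting $B_n := \bigcup_{i=1}^{n} A_i \uparrow B := \bigcup_{i=1}^{\infty} A_i$, we have
$$\mu\left( \bigcup_{i=1}^{\infty} A_i \right) = \mu(B) \overset{(3)}{=} \lim_{n \to \infty} \mu(B_n) \overset{(2)}{=} \lim_{n \to \infty} \sum_{i=1}^{n} \mu(A_i) = \sum_{i=1}^{\infty} \mu(A_i).$$
Conversely, if Eq. (19.1) holds we may take $A_j = \emptyset$ for all $j > n$ to see that Property 2) of Definition 19.1 holds. Also if $A_n \uparrow A$, let $B_n := A_n \setminus A_{n-1}$ with $A_0 := \emptyset$. Then $\{B_n\}_{n=1}^{\infty}$ are pairwise disjoint, $A_n = \bigcup_{j=1}^{n} B_j$ and $A = \bigcup_{j=1}^{\infty} B_j$. So if Eq. (19.1) holds we have
$$\mu(A) = \mu\left( \bigcup_{j=1}^{\infty} B_j \right) = \sum_{j=1}^{\infty} \mu(B_j) = \lim_{n \to \infty} \sum_{j=1}^{n} \mu(B_j) = \lim_{n \to \infty} \mu\left( \bigcup_{j=1}^{n} B_j \right) = \lim_{n \to \infty} \mu(A_n).$$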
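Proposition 19.3 (Basic properties of measures). Suppose that $(X, \mathcal{M}, \mu)$ is a measure space and $E, F \in \mathcal{M}$ and $\{E_j\}_{j=1}^{\infty} \subset \mathcal{M}$, then:
1. $\mu(E) \le \mu(F)$ if $E \subset F$.
2. $\mu\left( \bigcup_j E_j \right) \le \sum_j \mu(E_j)$.
3. If $\mu(E_1) < \infty$ and $E_j \downarrow E$, i.e. $E_1 \supset E_2 \supset E_3 \supset \dots$ and $E = \bigcap_j E_j$, then $\mu(E_j) \downarrow \mu(E)$ as $j \to \infty$.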
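Proof.
1. Since $F = E \cup (F \setminus E)$,
$$\mu(F) = \mu(E) + \mu(F \setminus E) \ge \mu(E).$$
2. Let $E := \bigcup_j E_j$ and $\tilde{E}_j = E_j \setminus (E_1 \cup \dots \cup E_{j-1})$ so that the $\tilde{E}_j$'s are pairwise disjoint and $E = \bigcup_j \tilde{E}_j$. Since $\tilde{E}_j \subset E_j$ it follows from Remark 19.2 and part (1) that
$$\mu(E) = \sum_j \mu(\tilde{E}_j) \le \sum_j \mu(E_j).$$
3. Define $D_i := E_1 \setminus E_i$, then $D_i \uparrow E_1 \setminus E$ which implies that
$$\mu(E_1) - \mu(E) = \lim_{i \to \infty} \mu(D_i) = \mu(E_1) - \lim_{i \to \infty} \mu(E_i)$$
which shows that $\lim_{i \to \infty} \mu(E_i) = \mu(E)$.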
Definition 19.4. A set $E \subset X$ is a null set if $E \in \mathcal{M}$ and $\mu(E) = 0$. If $P$ is some "property" which is either true or false for each $x \in X$, we will use the terminology $P$ a.e. (to be read $P$ almost everywhere) to mean
$$E := \{x \in X : P \text{ is false for } x\}$$
is a null set. For example if $f$ and $g$ are two measurable functions on $(X, \mathcal{M}, \mu)$, $f = g$ a.e. means that $\mu(f \neq g) = 0$.
Definition 19.5. A measure space $(X, \mathcal{M}, \mu)$ is complete if every subset of a null set is in $\mathcal{M}$, i.e. for all $F \subset X$ such that $F \subset E \in \mathcal{M}$ with $\mu(E) = 0$ we have $F \in \mathcal{M}$.
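Proposition 19.6 (Completion of a Measure). Let $(X, \mathcal{M}, \mu)$ be a measure space. Set
$$\mathcal{N} = \mathcal{N}^{\mu} := \{N \subset X : \exists\, F \in \mathcal{M} \text{ such that } N \subset F \text{ and } \mu(F) = 0\},$$
$$\bar{\mathcal{M}} = \bar{\mathcal{M}}^{\mu} := \{A \cup N : A \in \mathcal{M} \text{ and } N \in \mathcal{N}\} \quad \text{and}$$
$$\bar{\mu}(A \cup N) := \mu(A) \text{ for } A \in \mathcal{M} \text{ and } N \in \mathcal{N},$$
see Fig. 19.1. Then $\bar{\mathcal{M}}$ is a $\sigma$-algebra, $\bar{\mu}$ is a well defined measure on $\bar{\mathcal{M}}$, $\bar{\mu}$ is the unique measure on $\bar{\mathcal{M}}$ which extends $\mu$ on $\mathcal{M}$, and $(X, \bar{\mathcal{M}}, \bar{\mu})$ is a complete measure space. The $\sigma$-algebra $\bar{\mathcal{M}}$ is called the completion of $\mathcal{M}$ relative to $\mu$, and $\bar{\mu}$ is called the completion of $\mu$.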
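Fig. 19.1. Completing a $\sigma$-algebra.
Proof. Clearly $X, \emptyset \in \bar{\mathcal{M}}$. Let $A \in \mathcal{M}$ and $N \in \mathcal{N}$ and choose $F \in \mathcal{M}$ such that $N \subset F$ and $\mu(F) = 0$. Since $N^c = (F \setminus N) \cup F^c$,
$$(A \cup N)^c = A^c \cap N^c = A^c \cap \left( (F \setminus N) \cup F^c \right) = [A^c \cap (F \setminus N)] \cup [A^c \cap F^c]$$
where $[A^c \cap (F \setminus N)] \in \mathcal{N}$ and $[A^c \cap F^c] \in \mathcal{M}$. Thus $\bar{\mathcal{M}}$ is closed under complements. If $A_i \in \mathcal{M}$ and $N_i \subset F_i \in \mathcal{M}$ are such that $\mu(F_i) = 0$, then $\bigcup (A_i \cup N_i) = \left( \bigcup A_i \right) \cup \left( \bigcup N_i \right) \in \bar{\mathcal{M}}$ since $\bigcup A_i \in \mathcal{M}$, $\bigcup N_i \subset \bigcup F_i$ and $\mu\left( \bigcup F_i \right) \le \sum \mu(F_i) = 0$. Therefore, $\bar{\mathcal{M}}$ is a $\sigma$-algebra. Suppose $A \cup N_1 = B \cup N_2$ with $A, B \in \mathcal{M}$ and $N_1, N_2 \in \mathcal{N}$, and choose $F_2 \in \mathcal{M}$ with $N_2 \subset F_2$ and $\mu(F_2) = 0$. Then $A \subset A \cup N_1 \subset A \cup N_1 \cup F_2 = B \cup F_2$ which shows that
$$\mu(A) \le \mu(B) + \mu(F_2) = \mu(B).$$
Similarly, we show that $\mu(B) \le \mu(A)$ so that $\mu(A) = \mu(B)$ and hence $\bar{\mu}(A \cup N) := \mu(A)$ is well defined. It is left as an exercise to show $\bar{\mu}$ is a measure, i.e. that it is countably additive.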
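Many theorems in the sequel will require some control on the size of a measure $\mu$. The relevant notion for our purposes (and most purposes) is that of a $\sigma$-finite measure defined next.
Definition 19.7. Suppose $X$ is a set, $\mathcal{E} \subset \mathcal{M} \subset 2^X$ and $\mu : \mathcal{M} \to [0, \infty]$ is a function. The function $\mu$ is $\sigma$-finite on $\mathcal{E}$ if there exist $E_n \in \mathcal{E}$ such that $\mu(E_n) < \infty$ and $X = \bigcup_{n=1}^{\infty} E_n$. If $\mathcal{M}$ is a $\sigma$-algebra and $\mu$ is a measure on $\mathcal{M}$ which is $\sigma$-finite on $\mathcal{M}$ we will say $(X, \mathcal{M}, \mu)$ is a $\sigma$-finite measure space.
The reader should check that if $\mu$ is a finitely additive measure on an algebra, $\mathcal{M}$, then $\mu$ is $\sigma$-finite on $\mathcal{M}$ iff there exist $X_n \in \mathcal{M}$ such that $X_n \uparrow X$ and $\mu(X_n) < \infty$.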
19.1 Example of Measures
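Most $\sigma$-algebras and $\sigma$-additive measures are somewhat difficult to describe and define. However, one special case is fairly easy to understand. Namely suppose that $\mathcal{F} \subset 2^X$ is a countable or finite partition of $X$ and $\mathcal{M} \subset 2^X$ is the $\sigma$-algebra which consists of the collection of sets $A \subset X$ such that
$$A = \bigcup \{\alpha \in \mathcal{F} : \alpha \subset A\}. \tag{19.2}$$
It is easily seen that $\mathcal{M}$ is a $\sigma$-algebra.
Any measure $\mu : \mathcal{M} \to [0, \infty]$ is determined uniquely by its values on $\mathcal{F}$. Conversely, if we are given any function $\lambda : \mathcal{F} \to [0, \infty]$ we may define, for $A \in \mathcal{M}$,
$$\mu(A) = \sum_{\alpha \in \mathcal{F} \,:\, \alpha \subset A} \lambda(\alpha) = \sum_{\alpha \in \mathcal{F}} \lambda(\alpha) 1_{\alpha \subset A}$$
where $1_{\alpha \subset A}$ is one if $\alpha \subset A$ and zero otherwise. We may check that $\mu$ is a measure on $\mathcal{M}$. Indeed, if $A = \bigsqcup_{i=1}^{\infty} A_i$ is a disjoint union of sets $A_i \in \mathcal{M}$ and $\alpha \in \mathcal{F}$, then $\alpha \subset A$ iff $\alpha \subset A_i$ for one and hence exactly one $A_i$. Therefore $1_{\alpha \subset A} = \sum_{i=1}^{\infty} 1_{\alpha \subset A_i}$ and hence
$$\mu(A) = \sum_{\alpha \in \mathcal{F}} \lambda(\alpha) 1_{\alpha \subset A} = \sum_{\alpha \in \mathcal{F}} \lambda(\alpha) \sum_{i=1}^{\infty} 1_{\alpha \subset A_i} = \sum_{i=1}^{\infty} \sum_{\alpha \in \mathcal{F}} \lambda(\alpha) 1_{\alpha \subset A_i} = \sum_{i=1}^{\infty} \mu(A_i)$$
as desired. Thus we have shown that there is a one to one correspondence between measures $\mu$ on $\mathcal{M}$ and functions $\lambda : \mathcal{F} \to [0, \infty]$.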
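The partition construction above is easy to experiment with on a finite set. The following sketch is illustrative only: the partition `blocks`, the weights `weight` and the helper names are assumptions made for the example, not taken from the text.

```python
from fractions import Fraction

# Hypothetical partition F of X = {0, ..., 5} and weights lambda on its blocks
blocks = [frozenset({0, 1}), frozenset({2}), frozenset({3, 4, 5})]
weight = {blocks[0]: Fraction(1, 2), blocks[1]: Fraction(2), blocks[2]: Fraction(1, 3)}

def in_sigma_algebra(A):
    """A belongs to M iff A is the union of the blocks it contains, cf. Eq. (19.2)."""
    A = frozenset(A)
    contained = [b for b in blocks if b <= A]
    return A == frozenset().union(*contained)

def mu(A):
    """mu(A) = sum of lambda(alpha) over the blocks alpha contained in A."""
    A = frozenset(A)
    if not in_sigma_algebra(A):
        raise ValueError("A is not a union of blocks of the partition")
    return sum((weight[b] for b in blocks if b <= A), Fraction(0))
```

For instance `mu({0, 1, 2})` equals `mu({0, 1}) + mu({2})`, while `{0}` is rejected because it is not a union of blocks and so is not in $\mathcal{M}$.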
The construction of measures will be covered in Chapters 31 and 32 below. However, let us record here the existence of an interesting class of measures.
Theorem 19.8. To every right continuous non-decreasing function $F : \mathbb{R} \to \mathbb{R}$ there exists a unique measure $\mu_F$ on $\mathcal{B}_{\mathbb{R}}$ such that
$$\mu_F((a, b]) = F(b) - F(a) \quad \forall\ -\infty < a \le b < \infty. \tag{19.3}$$
Moreover, if $A \in \mathcal{B}_{\mathbb{R}}$ then
$$\mu_F(A) = \inf\left\{ \sum_{i=1}^{\infty} (F(b_i) - F(a_i)) : A \subset \bigcup_{i=1}^{\infty} (a_i, b_i] \right\} \tag{19.4}$$
$$= \inf\left\{ \sum_{i=1}^{\infty} (F(b_i) - F(a_i)) : A \subset \bigsqcup_{i=1}^{\infty} (a_i, b_i] \right\}. \tag{19.5}$$
In fact the map $F \to \mu_F$ is a one to one correspondence between right continuous non-decreasing functions $F$ with $F(0) = 0$ on one hand and measures $\mu$ on $\mathcal{B}_{\mathbb{R}}$ such that $\mu(J) < \infty$ on any bounded set $J \in \mathcal{B}_{\mathbb{R}}$ on the other.
Proof. See Section 28.3 below or Theorem 28.40 below.
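Equation (19.3) can be explored on half open intervals without any of the machinery needed for the full construction. The sketch below is only an illustration under stated assumptions: the function `F` (the identity plus a unit jump at $x = 1$) and the helper `mu_F` are hypothetical choices, and only intervals $(a, b]$ are handled.

```python
from fractions import Fraction

def F(x):
    """A right continuous non-decreasing function with a unit jump at x = 1
    (a hypothetical example, not taken from the text)."""
    return Fraction(x) + (1 if x >= 1 else 0)

def mu_F(a, b):
    """mu_F((a, b]) = F(b) - F(a) for a <= b, as in Eq. (19.3)."""
    if a > b:
        raise ValueError("need a <= b")
    return F(b) - F(a)

# The intervals (1 - 1/n, 1] decrease to the one point set {1}, so these
# values decrease towards the size of the jump of F at 1.
shrinking = [mu_F(1 - Fraction(1, n), 1) for n in (1, 10, 100, 1000)]
```

Additivity over adjacent intervals, `mu_F(0, 1) + mu_F(1, 2) == mu_F(0, 2)`, is immediate from the telescoping form of Eq. (19.3), and the list `shrinking` suggests that a jump of $F$ shows up as a point mass of $\mu_F$.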
Example 19.9. The most important special case of Theorem 19.8 is when $F(x) = x$, in which case we write $m$ for $\mu_F$. The measure $m$ is called Lebesgue measure.
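Theorem 19.10. Lebesgue measure $m$ is invariant under translations, i.e. for $B \in \mathcal{B}_{\mathbb{R}}$ and $x \in \mathbb{R}$,
$$m(x + B) = m(B). \tag{19.6}$$
Moreover, $m$ is the unique measure on $\mathcal{B}_{\mathbb{R}}$ such that $m((0, 1]) = 1$ and Eq. (19.6) holds for $B \in \mathcal{B}_{\mathbb{R}}$ and $x \in \mathbb{R}$. Moreover, $m$ has the scaling property
$$m(\lambda B) = |\lambda|\, m(B) \tag{19.7}$$
where $\lambda \in \mathbb{R}$, $B \in \mathcal{B}_{\mathbb{R}}$ and $\lambda B := \{\lambda x : x \in B\}$.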
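Proof. Let $m_x(B) := m(x + B)$, then one easily shows that $m_x$ is a measure on $\mathcal{B}_{\mathbb{R}}$ such that $m_x((a, b]) = b - a$ for all $a < b$. Therefore, $m_x = m$ by the uniqueness assertion in Theorem 19.8. For the converse, suppose that $m$ is translation invariant and $m((0, 1]) = 1$. Given $n \in \mathbb{N}$, we have
$$(0, 1] = \bigcup_{k=1}^{n} \left( \frac{k-1}{n}, \frac{k}{n} \right] = \bigcup_{k=1}^{n} \left( \frac{k-1}{n} + \left( 0, \frac{1}{n} \right] \right).$$
Therefore,
$$1 = m((0, 1]) = \sum_{k=1}^{n} m\left( \frac{k-1}{n} + \left( 0, \frac{1}{n} \right] \right) = \sum_{k=1}^{n} m\left( \left( 0, \frac{1}{n} \right] \right) = n \cdot m\left( \left( 0, \frac{1}{n} \right] \right).$$
That is to say
$$m\left( \left( 0, \frac{1}{n} \right] \right) = 1/n.$$
Similarly, $m((0, \frac{l}{n}]) = l/n$ for all $l, n \in \mathbb{N}$ and therefore by the translation invariance of $m$,
$$m((a, b]) = b - a \quad \text{for all } a, b \in \mathbb{Q} \text{ with } a < b.$$
Finally for $a, b \in \mathbb{R}$ such that $a < b$, choose $a_n, b_n \in \mathbb{Q}$ with $a < a_n < b_n$ such that $a_n \downarrow a$ and $b_n \downarrow b$. For each fixed $k$, $(a_n, b_k] \uparrow (a, b_k]$ as $n \to \infty$ and so $m((a, b_k]) = \lim_{n \to \infty} (b_k - a_n) = b_k - a$. Since $(a, b_k] \downarrow (a, b]$ as $k \to \infty$ and $m((a, b_1]) < \infty$, Proposition 19.3 gives
$$m((a, b]) = \lim_{k \to \infty} m((a, b_k]) = \lim_{k \to \infty} (b_k - a) = b - a,$$
i.e. $m$ is Lebesgue measure. To prove Eq. (19.7) we may assume that $\lambda \neq 0$ since the case $\lambda = 0$ is trivial to prove. Now let $m_{\lambda}(B) := |\lambda|^{-1} m(\lambda B)$. It is easily checked that $m_{\lambda}$ is again a measure on $\mathcal{B}_{\mathbb{R}}$ which satisfies
$$m_{\lambda}((a, b]) = \lambda^{-1} m((\lambda a, \lambda b]) = \lambda^{-1}(\lambda b - \lambda a) = b - a$$
if $\lambda > 0$ and
$$m_{\lambda}((a, b]) = |\lambda|^{-1} m([\lambda b, \lambda a)) = -|\lambda|^{-1} (\lambda b - \lambda a) = b - a$$
if $\lambda < 0$. Hence $m_{\lambda} = m$.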
We are now going to develop integration theory relative to a measure. The integral defined in the case of Lebesgue measure, $m$, will be an extension of the standard Riemann integral on $\mathbb{R}$.
19.1.1 ADD: Examples of Measures
BRUCE: ADD details.
1. Product measure for the flipping of a coin.
2. Haar measure.
3. Measure on embedded submanifolds, i.e. Hausdorff measure.
4. Wiener measure.
5. Gibbs states.
6. Measure associated to self-adjoint operators and classifying them.
19.2 Integrals of Simple functions
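Let $(X, \mathcal{M}, \mu)$ be a fixed measure space in this section.
Definition 19.11. Let $\mathbb{F} = \mathbb{C}$ or $[0, \infty)$ and suppose that $\varphi : X \to \mathbb{F}$ is a simple function as in Definition 18.41. If $\mathbb{F} = \mathbb{C}$ assume further that $\mu(\varphi^{-1}(\{y\})) < \infty$ for all $y \neq 0$ in $\mathbb{C}$. For such functions $\varphi$, define $I_{\mu}(\varphi)$ by
$$I_{\mu}(\varphi) = \sum_{y \in \mathbb{F}} y\, \mu(\varphi^{-1}(\{y\})).$$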
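Proposition 19.12. Let $\lambda \in \mathbb{F}$ and $\varphi$ and $\psi$ be two simple functions, then $I_{\mu}$ satisfies:
1.
$$I_{\mu}(\lambda \varphi) = \lambda I_{\mu}(\varphi). \tag{19.8}$$
2.
$$I_{\mu}(\varphi + \psi) = I_{\mu}(\psi) + I_{\mu}(\varphi).$$
3. If $\varphi$ and $\psi$ are non-negative simple functions such that $\varphi \le \psi$ then
$$I_{\mu}(\varphi) \le I_{\mu}(\psi).$$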
Proof. Let us write = y for the set
1
(y) X and ( = y) for
( = y) = (
1
(y)) so that
I
() =
yF
y( = y).
We will also write = a, = b for
1
(a)
1
(b). This notation is
more intuitive for the purposes of this proof. Suppose that F then
I
() =
yF
y ( = y) =
yF
y ( = y/)
=
zF
z ( = z) = I
()
provided that ,= 0. The case = 0 is clear, so we have proved 1. Suppose
that and are two simple functions, then
I
( +) =
zF
z ( + = z)
=
zF
z (
wF
= w, = z w)
=
zF
z
wF
( = w, = z w)
=
z,wF
(z +w)( = w, = z)
=
zF
z ( = z) +
wF
w ( = w)
= I
() +I
().
which proves 2. For 3. if and are non-negative simple functions such that

I
() =
a0
a( = a) =
a,b0
a( = a, = b)
a,b0
b( = a, = b) =
b0
b( = b) = I
(),
wherein the third inequality we have used = a, = b = if a > b.
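The three properties in Proposition 19.12 can be checked by hand on a finite measure space. The following is a minimal sketch, assuming a hypothetical three-point space with point masses; the names `integrate_simple`, `mu`, `phi` and `psi` are illustrative and not part of the text. Exact rational arithmetic is used so that the identities hold with equality.

```python
from fractions import Fraction


def integrate_simple(phi, mu):
    """I_mu(phi): sum over the values y of phi of y * mu({phi = y})."""
    level_sets = {}
    for x, y in phi.items():
        # accumulate the mass of the level set {phi = y}
        level_sets[y] = level_sets.get(y, 0) + mu[x]
    return sum(y * mass for y, mass in level_sets.items())


# Hypothetical measure space X = {a, b, c} with point masses mu({x}).
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(2)}
phi = {"a": Fraction(1), "b": Fraction(1), "c": Fraction(4)}
psi = {"a": Fraction(3), "b": Fraction(2), "c": Fraction(4)}  # phi <= psi pointwise

lam = Fraction(5, 7)
scaled = {x: lam * y for x, y in phi.items()}
total = {x: phi[x] + psi[x] for x in phi}

homogeneous = integrate_simple(scaled, mu) == lam * integrate_simple(phi, mu)
additive = integrate_simple(total, mu) == (
    integrate_simple(phi, mu) + integrate_simple(psi, mu)
)
monotone = integrate_simple(phi, mu) <= integrate_simple(psi, mu)
```

Grouping by level sets, rather than summing $\varphi(x)\mu(\{x\})$ directly, mirrors the definition of $I_\mu$ in terms of $\mu(\varphi = y)$.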
19.3 Integrals of positive functions
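Definition 19.13. Let $L^+ = L^+(\mathcal{M}) = \{f : X \to [0,\infty] : f \text{ is measurable}\}$. Define
\[ \int_X f(x)\, d\mu(x) = \int_X f\, d\mu := \sup\{I_\mu(\varphi) : \varphi \text{ is simple and } \varphi \le f\}. \]
We say that $f \in L^+$ is integrable if $\int_X f\, d\mu < \infty$. If $A \in \mathcal{M}$, let
\[ \int_A f(x)\, d\mu(x) = \int_A f\, d\mu := \int_X 1_A f\, d\mu. \]

Remark 19.14. Because of item 3. of Proposition 19.12, if $\varphi$ is a non-negative simple function, $\int_X \varphi\, d\mu = I_\mu(\varphi)$ so that $\int_X$ is an extension of $I_\mu$. This extension still has the monotonicity property of $I_\mu$; namely if $0 \le f \le g$ then
\begin{align*}
\int_X f\, d\mu &= \sup\{I_\mu(\varphi) : \varphi \text{ is simple and } \varphi \le f\} \\
&\le \sup\{I_\mu(\varphi) : \varphi \text{ is simple and } \varphi \le g\} \le \int_X g\, d\mu.
\end{align*}
Similarly if $c > 0$,
\[ \int_X c f\, d\mu = c \int_X f\, d\mu. \]
Also notice that if $f$ is integrable, then $\mu(\{f = \infty\}) = 0$.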
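Lemma 19.15 (Sums as Integrals). Let $X$ be a set and $\rho : X \to [0,\infty]$ be a function, let $\mu = \sum_{x \in X} \rho(x)\delta_x$ on $\mathcal{M} = 2^X$, i.e.
\[ \mu(A) = \sum_{x \in A} \rho(x). \]
If $f : X \to [0,\infty]$ is a function (which is necessarily measurable), then
\[ \int_X f\, d\mu = \sum_X f\rho. \]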
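Proof. Suppose that $\varphi : X \to [0,\infty)$ is a simple function, then $\varphi = \sum_{z \in [0,\infty)} z 1_{\{\varphi = z\}}$ and
\begin{align*}
\sum_X \varphi\rho &= \sum_{x \in X} \rho(x) \sum_{z \in [0,\infty)} z 1_{\{\varphi = z\}}(x) = \sum_{z \in [0,\infty)} z \sum_{x \in X} \rho(x) 1_{\{\varphi = z\}}(x) \\
&= \sum_{z \in [0,\infty)} z\, \mu(\{\varphi = z\}) = \int_X \varphi\, d\mu.
\end{align*}
So if $\varphi : X \to [0,\infty)$ is a simple function such that $\varphi \le f$, then
\[ \int_X \varphi\, d\mu = \sum_X \varphi\rho \le \sum_X f\rho. \]
Taking the sup over $\varphi$ in this last equation then shows that
\[ \int_X f\, d\mu \le \sum_X f\rho. \]
For the reverse inequality, let $\Lambda \subset\subset X$ be a finite set and $N \in (0,\infty)$. Set $f^N(x) = \min\{N, f(x)\}$ and let $\varphi_{N,\Lambda}$ be the simple function given by $\varphi_{N,\Lambda}(x) := 1_\Lambda(x) f^N(x)$. Because $\varphi_{N,\Lambda}(x) \le f(x)$,
\[ \sum_\Lambda f^N \rho = \sum_X \varphi_{N,\Lambda}\rho = \int_X \varphi_{N,\Lambda}\, d\mu \le \int_X f\, d\mu. \]
Since $f^N \uparrow f$ as $N \to \infty$, we may let $N \to \infty$ in this last equation to conclude
\[ \sum_\Lambda f\rho \le \int_X f\, d\mu. \]
Since $\Lambda$ is arbitrary, this implies
\[ \sum_X f\rho \le \int_X f\, d\mu. \]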
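Theorem 19.16 (Monotone Convergence Theorem). Suppose $f_n \in L^+$ is a sequence of functions such that $f_n \uparrow f$ ($f$ is necessarily in $L^+$), then
\[ \int f_n \uparrow \int f \text{ as } n \to \infty. \]

Proof. Since $f_n \le f_m \le f$ for all $n \le m < \infty$,
\[ \int f_n \le \int f_m \le \int f, \]
from which it follows that $\int f_n$ is increasing in $n$ and
\[ \lim_{n\to\infty} \int f_n \le \int f. \tag{19.9} \]
For the opposite inequality, let $\varphi : X \to [0,\infty)$ be a simple function such that $0 \le \varphi \le f$, $\alpha \in (0,1)$ and $X_n := \{f_n \ge \alpha\varphi\}$. Notice that $X_n \uparrow X$ and $f_n \ge \alpha 1_{X_n}\varphi$ and so by definition of $\int f_n$,
\[ \int f_n \ge \int \alpha 1_{X_n}\varphi = \alpha \int 1_{X_n}\varphi. \tag{19.10} \]
Then using the continuity property of $\mu$,
\begin{align*}
\lim_{n\to\infty} \int 1_{X_n}\varphi &= \lim_{n\to\infty} \int 1_{X_n} \sum_{y > 0} y 1_{\{\varphi = y\}} \\
&= \lim_{n\to\infty} \sum_{y > 0} y\, \mu(X_n \cap \{\varphi = y\}) = \sum_{y > 0} y \lim_{n\to\infty} \mu(X_n \cap \{\varphi = y\}) \\
&= \sum_{y > 0} y\, \mu(\{\varphi = y\}) = \int \varphi.
\end{align*}
This identity allows us to let $n \to \infty$ in Eq. (19.10) to conclude
\[ \int_X \varphi \le \frac{1}{\alpha} \lim_{n\to\infty} \int f_n. \]
Since this is true for all non-negative simple functions $\varphi$ with $\varphi \le f$,
\[ \int f = \sup\Big\{\int_X \varphi : \varphi \text{ is simple and } \varphi \le f\Big\} \le \frac{1}{\alpha} \lim_{n\to\infty} \int f_n. \]
Because $\alpha \in (0,1)$ was arbitrary, it follows that $\int f \le \lim_{n\to\infty} \int f_n$, which combined with Eq. (19.9) proves the theorem.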
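The following simple lemma will be used often in the sequel.

Lemma 19.17 (Chebyshev's Inequality). Suppose that $f \ge 0$ is a measurable function, then for any $\varepsilon > 0$,
\[ \mu(f \ge \varepsilon) \le \frac{1}{\varepsilon} \int_X f\, d\mu. \tag{19.11} \]
In particular if $\int_X f\, d\mu < \infty$ then $\mu(f = \infty) = 0$ (i.e. $f < \infty$ a.e.) and the set $\{f > 0\}$ is $\sigma$-finite.

Proof. Since $1_{\{f \ge \varepsilon\}} \le 1_{\{f \ge \varepsilon\}} \frac{1}{\varepsilon} f \le \frac{1}{\varepsilon} f$,
\[ \mu(f \ge \varepsilon) = \int_X 1_{\{f \ge \varepsilon\}}\, d\mu \le \int_X 1_{\{f \ge \varepsilon\}} \frac{1}{\varepsilon} f\, d\mu \le \frac{1}{\varepsilon} \int_X f\, d\mu. \]
If $M := \int_X f\, d\mu < \infty$, then
\[ \mu(f = \infty) \le \mu(f \ge n) \le \frac{M}{n} \to 0 \text{ as } n \to \infty \]
and $\{f \ge 1/n\} \uparrow \{f > 0\}$ with $\mu(f \ge 1/n) \le nM < \infty$ for all $n$.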
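Chebyshev's inequality (19.11) is easy to test on a weighted finite set, where by Lemma 19.15 the integral is just a weighted sum. The sketch below assumes a hypothetical measure with point masses $2^{-k}$ at $k = 1, \dots, 10$ and the function $f(k) = k^2$; all names are illustrative.

```python
from fractions import Fraction


def measure_of(pred, f, mu):
    """mu({x : pred(f(x))}) for a measure given by point masses."""
    return sum(mu[x] for x in f if pred(f[x]))


def integral(f, mu):
    """Integral of f against point masses, i.e. the weighted sum of f."""
    return sum(f[x] * mu[x] for x in f)


# Hypothetical data: point masses 2**-k at k = 1..10 and f(k) = k**2.
mu = {k: Fraction(1, 2 ** k) for k in range(1, 11)}
f = {k: Fraction(k * k) for k in mu}


def chebyshev_holds(eps):
    """Check mu(f >= eps) <= (1/eps) * integral of f."""
    return measure_of(lambda v: v >= eps, f, mu) <= integral(f, mu) / eps


checks = [chebyshev_holds(Fraction(e)) for e in (1, 4, 10, 50, 200)]
```

For the largest threshold the set $\{f \ge \varepsilon\}$ is empty, so the left side is $0$ and the bound holds trivially.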
Corollary 19.18. If $f_n \in L^+$ is a sequence of functions then
\[ \int \sum_{n=1}^\infty f_n = \sum_{n=1}^\infty \int f_n. \]
In particular, if $\sum_{n=1}^\infty \int f_n < \infty$ then $\sum_{n=1}^\infty f_n < \infty$ a.e.

Proof. First off we show that
\[ \int (f_1 + f_2) = \int f_1 + \int f_2 \]
by choosing non-negative simple functions $\varphi_n$ and $\psi_n$ such that $\varphi_n \uparrow f_1$ and $\psi_n \uparrow f_2$. Then $(\varphi_n + \psi_n)$ is simple as well and $(\varphi_n + \psi_n) \uparrow (f_1 + f_2)$ so by the monotone convergence theorem,
\begin{align*}
\int (f_1 + f_2) &= \lim_{n\to\infty} \int (\varphi_n + \psi_n) = \lim_{n\to\infty} \Big( \int \varphi_n + \int \psi_n \Big) \\
&= \lim_{n\to\infty} \int \varphi_n + \lim_{n\to\infty} \int \psi_n = \int f_1 + \int f_2.
\end{align*}
Now to the general case. Let $g_N := \sum_{n=1}^N f_n$ and $g = \sum_{n=1}^\infty f_n$, then $g_N \uparrow g$ and so again by the monotone convergence theorem and the additivity just proved,
\begin{align*}
\sum_{n=1}^\infty \int f_n &:= \lim_{N\to\infty} \sum_{n=1}^N \int f_n = \lim_{N\to\infty} \int \sum_{n=1}^N f_n \\
&= \lim_{N\to\infty} \int g_N = \int g =: \int \sum_{n=1}^\infty f_n.
\end{align*}

Remark 19.19. It is in the proof of this corollary (i.e. the linearity of the integral) that we really make use of the assumption that all of our functions are measurable. In fact the definition of $\int f\, d\mu$ makes sense for all functions $f : X \to [0,\infty]$, not just measurable functions. Moreover the monotone convergence theorem holds in this generality with no change in the proof. However, in the proof of Corollary 19.18, we use the approximation Theorem 18.42 which relies heavily on the measurability of the functions to be approximated.

The following Lemma and the next Corollary are simple applications of Corollary 19.18.
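Lemma 19.20 (The First Borel-Cantelli Lemma). Let $(X, \mathcal{M}, \mu)$ be a measure space, $A_n \in \mathcal{M}$, and set
\[ \{A_n \text{ i.o.}\} = \{x \in X : x \in A_n \text{ for infinitely many } n\text{'s}\} = \bigcap_{N=1}^\infty \bigcup_{n \ge N} A_n. \]
If $\sum_{n=1}^\infty \mu(A_n) < \infty$ then $\mu(\{A_n \text{ i.o.}\}) = 0$.

Proof. (First Proof.) Let us first observe that
\[ \{A_n \text{ i.o.}\} = \Big\{x \in X : \sum_{n=1}^\infty 1_{A_n}(x) = \infty\Big\}. \]
Hence if $\sum_{n=1}^\infty \mu(A_n) < \infty$ then
\[ \infty > \sum_{n=1}^\infty \mu(A_n) = \sum_{n=1}^\infty \int_X 1_{A_n}\, d\mu = \int_X \sum_{n=1}^\infty 1_{A_n}\, d\mu \]
implies that $\sum_{n=1}^\infty 1_{A_n}(x) < \infty$ for $\mu$-a.e. $x$. That is to say $\mu(\{A_n \text{ i.o.}\}) = 0$.

(Second Proof.) Of course we may give a strictly measure theoretic proof of this fact:
\[ \mu(A_n \text{ i.o.}) = \lim_{N\to\infty} \mu\Big(\bigcup_{n \ge N} A_n\Big) \le \lim_{N\to\infty} \sum_{n \ge N} \mu(A_n) \]
and the last limit is zero since $\sum_{n=1}^\infty \mu(A_n) < \infty$.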
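Corollary 19.21. Suppose that $(X, \mathcal{M}, \mu)$ is a measure space and $\{A_n\}_{n=1}^\infty \subset \mathcal{M}$ is a collection of sets such that $\mu(A_i \cap A_j) = 0$ for all $i \neq j$, then
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty \mu(A_n). \]

Proof. Since
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \int_X 1_{\cup_{n=1}^\infty A_n}\, d\mu \quad\text{and}\quad \sum_{n=1}^\infty \mu(A_n) = \int_X \sum_{n=1}^\infty 1_{A_n}\, d\mu, \]
it suffices to show
\[ \sum_{n=1}^\infty 1_{A_n} = 1_{\cup_{n=1}^\infty A_n} \quad \mu\text{-a.e.} \tag{19.12} \]
Now $\sum_{n=1}^\infty 1_{A_n} \ge 1_{\cup_{n=1}^\infty A_n}$ and $\sum_{n=1}^\infty 1_{A_n}(x) \neq 1_{\cup_{n=1}^\infty A_n}(x)$ iff $x \in A_i \cap A_j$ for some $i \neq j$, that is
\[ \Big\{x : \sum_{n=1}^\infty 1_{A_n}(x) \neq 1_{\cup_{n=1}^\infty A_n}(x)\Big\} = \bigcup_{i < j} A_i \cap A_j \]
and the latter set has measure 0, being the countable union of sets of measure zero. This proves Eq. (19.12) and hence the corollary.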
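Notation 19.22 If $m$ is Lebesgue measure on $\mathcal{B}_{\mathbb{R}}$, $f$ is a non-negative Borel measurable function and $a < b$ with $a, b \in \bar{\mathbb{R}}$, we will often write $\int_a^b f(x)\, dx$ or $\int_a^b f\, dm$ for $\int_{(a,b] \cap \mathbb{R}} f\, dm$.

Example 19.23. Suppose $-\infty < a < b < \infty$, $f \in C([a,b],[0,\infty))$ and $m$ is Lebesgue measure on $\mathbb{R}$. Also let $\pi_k = \{a = a_0^k < a_1^k < \cdots < a_{n_k}^k = b\}$ be a sequence of refining partitions (i.e. $\pi_k \subset \pi_{k+1}$ for all $k$) such that
\[ \operatorname{mesh}(\pi_k) := \max\{|a_j^k - a_{j-1}^k| : j = 1, \dots, n_k\} \to 0 \text{ as } k \to \infty. \]
For each $k$, let
\[ f_k(x) = f(a) 1_{\{a\}} + \sum_{l=0}^{n_k - 1} \min\{f(x) : a_l^k \le x \le a_{l+1}^k\}\, 1_{(a_l^k, a_{l+1}^k]}(x), \]
then $f_k \uparrow f$ as $k \to \infty$ and so by the monotone convergence theorem,
\begin{align*}
\int_a^b f\, dm := \int_{[a,b]} f\, dm &= \lim_{k\to\infty} \int_a^b f_k\, dm \\
&= \lim_{k\to\infty} \sum_{l=0}^{n_k - 1} \min\{f(x) : a_l^k \le x \le a_{l+1}^k\}\, m\big((a_l^k, a_{l+1}^k]\big) \\
&= \int_a^b f(x)\, dx.
\end{align*}
The latter integral being the Riemann integral.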
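We can use the above result to integrate some non-Riemann integrable functions:

Example 19.24. For all $\lambda > 0$,
\[ \int_0^\infty e^{-\lambda x}\, dm(x) = \lambda^{-1} \quad\text{and}\quad \int_{\mathbb{R}} \frac{1}{1 + x^2}\, dm(x) = \pi. \]
The proofs of these identities are similar. By the monotone convergence theorem, Example 19.23 and the fundamental theorem of calculus for Riemann integrals (or see Theorem 10.14 above or Theorem 19.40 below),
\begin{align*}
\int_0^\infty e^{-\lambda x}\, dm(x) &= \lim_{N\to\infty} \int_0^N e^{-\lambda x}\, dm(x) = \lim_{N\to\infty} \int_0^N e^{-\lambda x}\, dx \\
&= -\lim_{N\to\infty} \frac{1}{\lambda} e^{-\lambda x}\Big|_0^N = \lambda^{-1}
\end{align*}
and
\begin{align*}
\int_{\mathbb{R}} \frac{1}{1 + x^2}\, dm(x) &= \lim_{N\to\infty} \int_{-N}^N \frac{1}{1 + x^2}\, dm(x) = \lim_{N\to\infty} \int_{-N}^N \frac{1}{1 + x^2}\, dx \\
&= \lim_{N\to\infty} \big[\tan^{-1}(N) - \tan^{-1}(-N)\big] = \pi.
\end{align*}
Let us also consider the functions $x^{-p}$,
\begin{align*}
\int_{(0,1]} \frac{1}{x^p}\, dm(x) &= \lim_{n\to\infty} \int_0^1 1_{(\frac{1}{n},1]}(x) \frac{1}{x^p}\, dm(x) \\
&= \lim_{n\to\infty} \int_{\frac{1}{n}}^1 \frac{1}{x^p}\, dx = \lim_{n\to\infty} \frac{x^{-p+1}}{1 - p}\Big|_{1/n}^1 \\
&= \begin{cases} \frac{1}{1-p} & \text{if } p < 1 \\ \infty & \text{if } p > 1. \end{cases}
\end{align*}
If $p = 1$ we find
\[ \int_{(0,1]} \frac{1}{x^p}\, dm(x) = \lim_{n\to\infty} \int_{\frac{1}{n}}^1 \frac{1}{x}\, dx = \lim_{n\to\infty} \ln(x)\Big|_{1/n}^1 = \infty. \]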
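The three regimes for $x^{-p}$ on $(0,1]$ can be seen numerically from the truncated integrals over $[1/n, 1]$, which increase in $n$ as the monotone convergence theorem requires. This is a small sketch using only the antiderivative computed above; the function name `truncated_integral` is an illustrative choice.

```python
import math


def truncated_integral(p, n):
    """Riemann integral of x**(-p) over [1/n, 1], from the antiderivative."""
    if p == 1:
        return math.log(n)  # ln(1) - ln(1/n)
    # [x**(1-p) / (1-p)] evaluated from 1/n to 1
    return (1 - n ** (p - 1)) / (1 - p)


# Increasing truncations for an integrable case (p < 1) and two
# non-integrable cases (p = 1 and p > 1).
table = {
    p: [truncated_integral(p, n) for n in (10, 10 ** 3, 10 ** 6)]
    for p in (0.5, 1.0, 2.0)
}
```

For $p = 0.5$ the values approach $1/(1-p) = 2$, while for $p = 1$ and $p = 2$ they grow like $\ln n$ and $n - 1$ respectively.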
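Example 19.25. Let $\{r_n\}_{n=1}^\infty$ be an enumeration of the points in $\mathbb{Q} \cap [0,1]$ and define
\[ f(x) = \sum_{n=1}^\infty 2^{-n} \frac{1}{\sqrt{|x - r_n|}} \]
with the convention that
\[ \frac{1}{\sqrt{|x - r_n|}} = 5 \text{ if } x = r_n. \]
Since, by Theorem 19.40,
\begin{align*}
\int_0^1 \frac{1}{\sqrt{|x - r_n|}}\, dx &= \int_{r_n}^1 \frac{1}{\sqrt{x - r_n}}\, dx + \int_0^{r_n} \frac{1}{\sqrt{r_n - x}}\, dx \\
&= 2\sqrt{x - r_n}\Big|_{r_n}^1 - 2\sqrt{r_n - x}\Big|_0^{r_n} = 2\big(\sqrt{1 - r_n} + \sqrt{r_n}\big) \le 4,
\end{align*}
we find
\[ \int_{[0,1]} f(x)\, dm(x) = \sum_{n=1}^\infty 2^{-n} \int_{[0,1]} \frac{1}{\sqrt{|x - r_n|}}\, dx \le \sum_{n=1}^\infty 2^{-n} \cdot 4 = 4 < \infty. \]
In particular, $m(f = \infty) = 0$, i.e. $f < \infty$ for almost every $x \in [0,1]$, and this implies that
\[ \sum_{n=1}^\infty 2^{-n} \frac{1}{\sqrt{|x - r_n|}} < \infty \text{ for a.e. } x \in [0,1]. \]
This result is somewhat surprising since the singularities of the summands form a dense subset of $[0,1]$.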
Proposition 19.26. Suppose that $f \ge 0$ is a measurable function. Then $\int_X f\, d\mu = 0$ iff $f = 0$ a.e. Also if $f, g \ge 0$ are measurable functions such that $f \le g$ a.e. then $\int f\, d\mu \le \int g\, d\mu$. In particular if $f = g$ a.e. then $\int f\, d\mu = \int g\, d\mu$.

Proof. If $f = 0$ a.e. and $\varphi \le f$ is a simple function then $\varphi = 0$ a.e. This implies that $\mu(\varphi^{-1}(\{y\})) = 0$ for all $y > 0$ and hence $\int_X \varphi\, d\mu = 0$ and therefore $\int_X f\, d\mu = 0$. Conversely, if $\int f\, d\mu = 0$, then by Lemma 19.17,
\[ \mu(f \ge 1/n) \le n \int f\, d\mu = 0 \text{ for all } n. \]
Therefore, $\mu(f > 0) \le \sum_{n=1}^\infty \mu(f \ge 1/n) = 0$, i.e. $f = 0$ a.e. For the second assertion let $E$ be the exceptional set where $f > g$, i.e. $E := \{x \in X : f(x) > g(x)\}$. By assumption $E$ is a null set and $1_{E^c} f \le 1_{E^c} g$ everywhere. Because $g = 1_{E^c} g + 1_E g$ and $1_E g = 0$ a.e.,
\[ \int g\, d\mu = \int 1_{E^c} g\, d\mu + \int 1_E g\, d\mu = \int 1_{E^c} g\, d\mu \]
and similarly $\int f\, d\mu = \int 1_{E^c} f\, d\mu$. Since $1_{E^c} f \le 1_{E^c} g$ everywhere,
\[ \int f\, d\mu = \int 1_{E^c} f\, d\mu \le \int 1_{E^c} g\, d\mu = \int g\, d\mu. \]
Corollary 19.27. Suppose that $\{f_n\}$ is a sequence of non-negative measurable functions and $f$ is a measurable function such that $f_n \uparrow f$ off a null set, then
\[ \int f_n \uparrow \int f \text{ as } n \to \infty. \]

Proof. Let $E \subset X$ be a null set such that $f_n 1_{E^c} \uparrow f 1_{E^c}$ as $n \to \infty$. Then by the monotone convergence theorem and Proposition 19.26,
\[ \int f_n = \int f_n 1_{E^c} \uparrow \int f 1_{E^c} = \int f \text{ as } n \to \infty. \]
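Lemma 19.28 (Fatou's Lemma). If $f_n : X \to [0,\infty]$ is a sequence of measurable functions then
\[ \int \liminf_{n\to\infty} f_n \le \liminf_{n\to\infty} \int f_n. \]

Proof. Define $g_k := \inf_{n \ge k} f_n$ so that $g_k \uparrow \liminf_{n\to\infty} f_n$ as $k \to \infty$. Since $g_k \le f_n$ for all $k \le n$,
\[ \int g_k \le \int f_n \text{ for all } n \ge k \]
and therefore
\[ \int g_k \le \liminf_{n\to\infty} \int f_n \text{ for all } k. \]
We may now use the monotone convergence theorem to let $k \to \infty$ to find
\[ \int \liminf_{n\to\infty} f_n = \int \lim_{k\to\infty} g_k \overset{\text{MCT}}{=} \lim_{k\to\infty} \int g_k \le \liminf_{n\to\infty} \int f_n. \]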
19.4 Integrals of Complex Valued Functions
Definition 19.29. A measurable function $f : X \to \bar{\mathbb{R}}$ is integrable if $f_+ := f 1_{\{f \ge 0\}}$ and $f_- = -f 1_{\{f \le 0\}}$ are integrable. We write $\mathrm{L}^1(\mu; \mathbb{R})$ for the space of real valued integrable functions. For $f \in \mathrm{L}^1(\mu; \mathbb{R})$, let
\[ \int f\, d\mu = \int f_+\, d\mu - \int f_-\, d\mu. \]

Convention: If $f, g : X \to \bar{\mathbb{R}}$ are two measurable functions, let $f + g$ denote the collection of measurable functions $h : X \to \bar{\mathbb{R}}$ such that $h(x) = f(x) + g(x)$ whenever $f(x) + g(x)$ is well defined, i.e. is not of the form $\infty - \infty$ or $-\infty + \infty$. We use a similar convention for $f - g$. Notice that if $f, g \in \mathrm{L}^1(\mu; \mathbb{R})$ and $h_1, h_2 \in f + g$, then $h_1 = h_2$ a.e. because $|f| < \infty$ and $|g| < \infty$ a.e.

Notation 19.30 (Abuse of notation) We will sometimes denote the integral $\int_X f\, d\mu$ by $\mu(f)$. With this notation we have $\mu(A) = \mu(1_A)$ for all $A \in \mathcal{M}$.

Remark 19.31. Since
\[ f_\pm \le |f| \le f_+ + f_-, \]
a measurable function $f$ is integrable iff $\int |f|\, d\mu < \infty$. Hence
\[ \mathrm{L}^1(\mu; \mathbb{R}) := \Big\{f : X \to \bar{\mathbb{R}} : f \text{ is measurable and } \int_X |f|\, d\mu < \infty\Big\}. \]
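If $f, g \in \mathrm{L}^1(\mu; \mathbb{R})$ and $f = g$ a.e. then $f_\pm = g_\pm$ a.e. and so it follows from Proposition 19.26 that $\int f\, d\mu = \int g\, d\mu$. In particular if $f, g \in \mathrm{L}^1(\mu; \mathbb{R})$ we may define
\[ \int_X (f + g)\, d\mu = \int_X h\, d\mu \]
where $h$ is any element of $f + g$.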
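Proposition 19.32. The map
\[ f \in \mathrm{L}^1(\mu; \mathbb{R}) \to \int_X f\, d\mu \in \mathbb{R} \]
is linear and has the monotonicity property: $\int f\, d\mu \le \int g\, d\mu$ for all $f, g \in \mathrm{L}^1(\mu; \mathbb{R})$ such that $f \le g$ a.e.

Proof. Let $f, g \in \mathrm{L}^1(\mu; \mathbb{R})$ and $a, b \in \mathbb{R}$. By modifying $f$ and $g$ on a null set, we may assume that $f, g$ are real valued functions. We have $af + bg \in \mathrm{L}^1(\mu; \mathbb{R})$ because
\[ |af + bg| \le |a|\,|f| + |b|\,|g| \in \mathrm{L}^1(\mu; \mathbb{R}). \]
If $a < 0$, then
\[ (af)_+ = -a f_- \quad\text{and}\quad (af)_- = -a f_+ \]
so that
\[ \int af = -a \int f_- + a \int f_+ = a\Big(\int f_+ - \int f_-\Big) = a \int f. \]
A similar calculation works for $a > 0$ and the case $a = 0$ is trivial so we have shown that
\[ \int af = a \int f. \]
Now set $h = f + g$. Since $h = h_+ - h_-$,
\[ h_+ - h_- = f_+ - f_- + g_+ - g_- \]
or
\[ h_+ + f_- + g_- = h_- + f_+ + g_+. \]
Therefore,
\[ \int h_+ + \int f_- + \int g_- = \int h_- + \int f_+ + \int g_+ \]
and hence
\[ \int h = \int h_+ - \int h_- = \int f_+ + \int g_+ - \int f_- - \int g_- = \int f + \int g. \]
Finally if $f_+ - f_- = f \le g = g_+ - g_-$ then $f_+ + g_- \le g_+ + f_-$ which implies that
\[ \int f_+ + \int g_- \le \int g_+ + \int f_- \]
or equivalently that
\[ \int f = \int f_+ - \int f_- \le \int g_+ - \int g_- = \int g. \]
The monotonicity property is also a consequence of the linearity of the integral, the fact that $f \le g$ a.e. implies $0 \le g - f$ a.e., and Proposition 19.26.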
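Definition 19.33. A measurable function $f : X \to \mathbb{C}$ is integrable if $\int_X |f|\, d\mu < \infty$. Analogously to the real case, let
\[ \mathrm{L}^1(\mu; \mathbb{C}) := \Big\{f : X \to \mathbb{C} : f \text{ is measurable and } \int_X |f|\, d\mu < \infty\Big\} \]
denote the complex valued integrable functions. Because $\max(|\operatorname{Re} f|, |\operatorname{Im} f|) \le |f| \le \sqrt{2} \max(|\operatorname{Re} f|, |\operatorname{Im} f|)$, $\int |f|\, d\mu < \infty$ iff
\[ \int |\operatorname{Re} f|\, d\mu + \int |\operatorname{Im} f|\, d\mu < \infty. \]
For $f \in \mathrm{L}^1(\mu; \mathbb{C})$ define
\[ \int f\, d\mu = \int \operatorname{Re} f\, d\mu + i \int \operatorname{Im} f\, d\mu. \]
It is routine to show the integral is still linear on $\mathrm{L}^1(\mu; \mathbb{C})$ (prove!). In the remainder of this section, let $\mathrm{L}^1(\mu)$ be either $\mathrm{L}^1(\mu; \mathbb{C})$ or $\mathrm{L}^1(\mu; \mathbb{R})$. If $A \in \mathcal{M}$ and $f \in \mathrm{L}^1(\mu; \mathbb{C})$ or $f : X \to [0,\infty]$ is a measurable function, let
\[ \int_A f\, d\mu := \int_X 1_A f\, d\mu. \]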
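Proposition 19.34. Suppose that $f \in \mathrm{L}^1(\mu; \mathbb{C})$, then
\[ \Big|\int_X f\, d\mu\Big| \le \int_X |f|\, d\mu. \tag{19.13} \]

Proof. Start by writing $\int_X f\, d\mu = R e^{i\theta}$ with $R \ge 0$. We may assume that $R = \big|\int_X f\, d\mu\big| > 0$ since otherwise there is nothing to prove. Since
\[ R = e^{-i\theta} \int_X f\, d\mu = \int_X e^{-i\theta} f\, d\mu = \int_X \operatorname{Re}\big(e^{-i\theta} f\big)\, d\mu + i \int_X \operatorname{Im}\big(e^{-i\theta} f\big)\, d\mu, \]
it must be that $\int_X \operatorname{Im}\big[e^{-i\theta} f\big]\, d\mu = 0$. Using the monotonicity in Proposition 19.26,
\[ \Big|\int_X f\, d\mu\Big| = \int_X \operatorname{Re}\big(e^{-i\theta} f\big)\, d\mu \le \int_X \big|\operatorname{Re}\big(e^{-i\theta} f\big)\big|\, d\mu \le \int_X |f|\, d\mu. \]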
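Proposition 19.35. Let $f, g \in \mathrm{L}^1(\mu)$, then
1. The set $\{f \neq 0\}$ is $\sigma$-finite; in fact $\{|f| \ge \frac{1}{n}\} \uparrow \{f \neq 0\}$ and $\mu(|f| \ge \frac{1}{n}) < \infty$ for all $n$.
2. The following are equivalent
a) $\int_E f = \int_E g$ for all $E \in \mathcal{M}$
b) $\int_X |f - g| = 0$
c) $f = g$ a.e.

Proof. 1. By Chebyshev's inequality, Lemma 19.17,
\[ \mu\big(|f| \ge \tfrac{1}{n}\big) \le n \int_X |f|\, d\mu < \infty \]
for all $n$.
2. (a) $\Longrightarrow$ (c) Notice that
\[ \int_E f = \int_E g \iff \int_E (f - g) = 0 \]
for all $E \in \mathcal{M}$. Taking $E = \{\operatorname{Re}(f - g) > 0\}$ and using $1_E \operatorname{Re}(f - g) \ge 0$, we learn that
\[ 0 = \operatorname{Re} \int_E (f - g)\, d\mu = \int 1_E \operatorname{Re}(f - g) \Longrightarrow 1_E \operatorname{Re}(f - g) = 0 \text{ a.e.} \]
This implies that $1_E = 0$ a.e. which happens iff
\[ \mu(\{\operatorname{Re}(f - g) > 0\}) = \mu(E) = 0. \]
Similarly $\mu(\operatorname{Re}(f - g) < 0) = 0$ so that $\operatorname{Re}(f - g) = 0$ a.e. Similarly, $\operatorname{Im}(f - g) = 0$ a.e. and hence $f - g = 0$ a.e., i.e. $f = g$ a.e. (c) $\Longrightarrow$ (b) is clear and so is (b) $\Longrightarrow$ (a) since
\[ \Big|\int_E f - \int_E g\Big| \le \int |f - g| = 0. \]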
Definition 19.36. Let $(X, \mathcal{M}, \mu)$ be a measure space and $L^1(\mu) = L^1(X, \mathcal{M}, \mu)$ denote the set of $\mathrm{L}^1(\mu)$ functions modulo the equivalence relation; $f \sim g$ iff $f = g$ a.e. We make this into a normed space using the norm
\[ \|f - g\|_{L^1} = \int |f - g|\, d\mu \]
and into a metric space using $\rho_1(f, g) = \|f - g\|_{L^1}$.

Warning: in the future we will often not make much of a distinction between $L^1(\mu)$ and $\mathrm{L}^1(\mu)$. On occasion this can be dangerous and this danger will be pointed out when necessary.

Remark 19.37. More generally we may define $L^p(\mu) = L^p(X, \mathcal{M}, \mu)$ for $p \in [1,\infty)$ as the set of measurable functions $f$ such that
\[ \int_X |f|^p\, d\mu < \infty \]
modulo the equivalence relation; $f \sim g$ iff $f = g$ a.e.
We will see in Chapter 21 that
\[ \|f\|_{L^p} = \Big(\int |f|^p\, d\mu\Big)^{1/p} \text{ for } f \in L^p(\mu) \]
is a norm and $(L^p(\mu), \|\cdot\|_{L^p})$ is a Banach space in this norm.
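Theorem 19.38 (Dominated Convergence Theorem). Suppose $f_n, g_n, g \in \mathrm{L}^1(\mu)$, $f_n \to f$ a.e., $|f_n| \le g_n \in \mathrm{L}^1(\mu)$, $g_n \to g$ a.e. and $\int_X g_n\, d\mu \to \int_X g\, d\mu$. Then $f \in \mathrm{L}^1(\mu)$ and
\[ \int_X f\, d\mu = \lim_{n\to\infty} \int_X f_n\, d\mu. \]
(In most typical applications of this theorem $g_n = g \in \mathrm{L}^1(\mu)$ for all $n$.)

Proof. Notice that $|f| = \lim_{n\to\infty} |f_n| \le \lim_{n\to\infty} |g_n| \le g$ a.e. so that $f \in \mathrm{L}^1(\mu)$. By considering the real and imaginary parts of $f$ separately, it suffices to prove the theorem in the case where $f$ is real. By Fatou's Lemma,
\begin{align*}
\int_X (g \pm f)\, d\mu &= \int_X \liminf_{n\to\infty} (g_n \pm f_n)\, d\mu \le \liminf_{n\to\infty} \int_X (g_n \pm f_n)\, d\mu \\
&= \lim_{n\to\infty} \int_X g_n\, d\mu + \liminf_{n\to\infty} \Big(\pm \int_X f_n\, d\mu\Big) \\
&= \int_X g\, d\mu + \liminf_{n\to\infty} \Big(\pm \int_X f_n\, d\mu\Big).
\end{align*}
Since $\liminf_{n\to\infty} (-a_n) = -\limsup_{n\to\infty} a_n$, we have shown
\[ \int_X g\, d\mu \pm \int_X f\, d\mu \le \int_X g\, d\mu + \begin{cases} \liminf_{n\to\infty} \int_X f_n\, d\mu \\ -\limsup_{n\to\infty} \int_X f_n\, d\mu \end{cases} \]
and therefore
\[ \limsup_{n\to\infty} \int_X f_n\, d\mu \le \int_X f\, d\mu \le \liminf_{n\to\infty} \int_X f_n\, d\mu. \]
This shows that $\lim_{n\to\infty} \int_X f_n\, d\mu$ exists and is equal to $\int_X f\, d\mu$.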
Exercise 19.1. Give another proof of Proposition 19.34 by first proving Eq. (19.13) with f being a cylinder function in which case the triangle inequality for complex numbers will do the trick. Then use the approximation Theorem 18.42 along with the dominated convergence Theorem 19.38 to handle the general case.
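Corollary 19.39. Let $\{f_n\}_{n=1}^\infty \subset \mathrm{L}^1(\mu)$ be a sequence such that $\sum_{n=1}^\infty \|f_n\|_{\mathrm{L}^1(\mu)} < \infty$, then $\sum_{n=1}^\infty f_n$ is convergent a.e. and
\[ \int_X \Big(\sum_{n=1}^\infty f_n\Big)\, d\mu = \sum_{n=1}^\infty \int_X f_n\, d\mu. \]

Proof. The condition $\sum_{n=1}^\infty \|f_n\|_{\mathrm{L}^1(\mu)} < \infty$ is equivalent to $\sum_{n=1}^\infty |f_n| \in \mathrm{L}^1(\mu)$. Hence $\sum_{n=1}^\infty f_n$ is almost everywhere convergent and if $S_N := \sum_{n=1}^N f_n$, then
\[ |S_N| \le \sum_{n=1}^N |f_n| \le \sum_{n=1}^\infty |f_n| \in \mathrm{L}^1(\mu). \]
So by the dominated convergence theorem,
\begin{align*}
\int_X \Big(\sum_{n=1}^\infty f_n\Big)\, d\mu &= \int_X \lim_{N\to\infty} S_N\, d\mu = \lim_{N\to\infty} \int_X S_N\, d\mu \\
&= \lim_{N\to\infty} \sum_{n=1}^N \int_X f_n\, d\mu = \sum_{n=1}^\infty \int_X f_n\, d\mu.
\end{align*}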
Theorem 19.40 (The Fundamental Theorem of Calculus). Suppose $-\infty < a < b < \infty$, $f \in C((a,b), \mathbb{R}) \cap \mathrm{L}^1((a,b), m)$ and $F(x) := \int_a^x f(y)\, dm(y)$. Then
1. $F \in C([a,b], \mathbb{R}) \cap C^1((a,b), \mathbb{R})$.
2. $F'(x) = f(x)$ for all $x \in (a,b)$.
3. If $G \in C([a,b], \mathbb{R}) \cap C^1((a,b), \mathbb{R})$ is an anti-derivative of $f$ on $(a,b)$ (i.e. $f = G'|_{(a,b)}$) then
\[ \int_a^b f(x)\, dm(x) = G(b) - G(a). \]

Proof. Since $F(x) := \int_{\mathbb{R}} 1_{(a,x)}(y) f(y)\, dm(y)$, $\lim_{x \to z} 1_{(a,x)}(y) = 1_{(a,z)}(y)$ for $m$-a.e. $y$ and $|1_{(a,x)}(y) f(y)| \le 1_{(a,b)}(y) |f(y)|$ is an $\mathrm{L}^1$ function, it follows from the dominated convergence Theorem 19.38 that $F$ is continuous on $[a,b]$. Simple manipulations show,
\begin{align*}
\Big|\frac{F(x+h) - F(x)}{h} - f(x)\Big| &= \frac{1}{|h|} \begin{cases} \big|\int_x^{x+h} [f(y) - f(x)]\, dm(y)\big| & \text{if } h > 0 \\ \big|\int_{x+h}^x [f(y) - f(x)]\, dm(y)\big| & \text{if } h < 0 \end{cases} \\
&\le \frac{1}{|h|} \begin{cases} \int_x^{x+h} |f(y) - f(x)|\, dm(y) & \text{if } h > 0 \\ \int_{x+h}^x |f(y) - f(x)|\, dm(y) & \text{if } h < 0 \end{cases} \\
&\le \sup\{|f(y) - f(x)| : y \in [x - |h|, x + |h|]\}
\end{align*}
and the latter expression, by the continuity of $f$, goes to zero as $h \to 0$. This shows $F' = f$ on $(a,b)$. For the converse direction, we have by assumption that $G'(x) = F'(x)$ for $x \in (a,b)$. Therefore by the mean value theorem, $F - G = C$ for some constant $C$. Hence
\begin{align*}
\int_a^b f(x)\, dm(x) &= F(b) = F(b) - F(a) \\
&= (G(b) + C) - (G(a) + C) = G(b) - G(a).
\end{align*}
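Example 19.41. The following limit holds,
\[ \lim_{n\to\infty} \int_0^n \Big(1 - \frac{x}{n}\Big)^n dm(x) = 1. \]
Let $f_n(x) = (1 - \frac{x}{n})^n 1_{[0,n]}(x)$ and notice that $\lim_{n\to\infty} f_n(x) = e^{-x}$. We will now show
\[ 0 \le f_n(x) \le e^{-x} \text{ for all } x \ge 0. \]
It suffices to consider $x \in [0,n]$. Let $g(x) = e^x f_n(x)$, then for $x \in (0,n)$,
\[ \frac{d}{dx} \ln g(x) = 1 + n \frac{1}{(1 - \frac{x}{n})} \Big(-\frac{1}{n}\Big) = 1 - \frac{1}{(1 - \frac{x}{n})} \le 0 \]
which shows that $\ln g(x)$ and hence $g(x)$ is decreasing on $[0,n]$. Therefore $g(x) \le g(0) = 1$, i.e.
\[ 0 \le f_n(x) \le e^{-x}. \]
From Example 19.24, we know
\[ \int_0^\infty e^{-x}\, dm(x) = 1 < \infty, \]
so that $e^{-x}$ is an integrable function on $[0,\infty)$. Hence by the dominated convergence theorem,
\begin{align*}
\lim_{n\to\infty} \int_0^n \Big(1 - \frac{x}{n}\Big)^n dm(x) &= \lim_{n\to\infty} \int_0^\infty f_n(x)\, dm(x) \\
&= \int_0^\infty \lim_{n\to\infty} f_n(x)\, dm(x) = \int_0^\infty e^{-x}\, dm(x) = 1.
\end{align*}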
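As a sanity check on Example 19.41, the integrals can also be computed in closed form: the substitution $u = 1 - x/n$ gives $\int_0^n (1 - x/n)^n\, dx = n/(n+1)$, which indeed tends to $1$. The sketch below compares this with a midpoint-rule approximation and samples the domination $f_n \le e^{-x}$ on a grid; the helper names and the grid are illustrative assumptions, not part of the text.

```python
import math


def f_n(n, x):
    """(1 - x/n)**n on [0, n], extended by zero elsewhere."""
    return (1 - x / n) ** n if 0 <= x <= n else 0.0


def exact_integral(n):
    """Integral of f_n over [0, n]; u = 1 - x/n gives n * (1/(n+1))."""
    return n / (n + 1)


def midpoint(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


# Domination by exp(-x), sampled on a grid (small slack for rounding).
dominated = all(
    f_n(n, x) <= math.exp(-x) + 1e-12
    for n in (1, 2, 5, 50, 1000)
    for x in [k / 10 for k in range(0, 201)]
)

integrals = [exact_integral(n) for n in (1, 10, 100, 10 ** 4)]
numeric_10 = midpoint(lambda x: f_n(10, x), 0.0, 10.0)
```

The grid check is only evidence, not a proof; the monotonicity argument for $g(x) = e^x f_n(x)$ in the example is what establishes the bound for all $x$.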
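Example 19.42 (Integration of Power Series). Suppose $R > 0$ and $\{a_n\}_{n=0}^\infty$ is a sequence of complex numbers such that $\sum_{n=0}^\infty |a_n| r^n < \infty$ for all $r \in (0, R)$. Then
\[ \int_\alpha^\beta \Big(\sum_{n=0}^\infty a_n x^n\Big)\, dm(x) = \sum_{n=0}^\infty a_n \int_\alpha^\beta x^n\, dm(x) = \sum_{n=0}^\infty a_n \frac{\beta^{n+1} - \alpha^{n+1}}{n+1} \]
for all $-R < \alpha < \beta < R$. Indeed this follows from Corollary 19.39 since
\begin{align*}
\sum_{n=0}^\infty \int_\alpha^\beta |a_n|\, |x|^n\, dm(x) &\le \sum_{n=0}^\infty \Big(\int_0^{|\beta|} |a_n|\, |x|^n\, dm(x) + \int_0^{|\alpha|} |a_n|\, |x|^n\, dm(x)\Big) \\
&\le \sum_{n=0}^\infty |a_n| \frac{|\beta|^{n+1} + |\alpha|^{n+1}}{n+1} \le 2r \sum_{n=0}^\infty |a_n| r^n < \infty
\end{align*}
where $r = \max(|\beta|, |\alpha|)$.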
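For a concrete instance of Example 19.42, take the geometric series $a_n = 1$ with $R = 1$, so that $\sum a_n x^n = 1/(1-x)$ and the left side is $\ln\frac{1-\alpha}{1-\beta}$. The following sketch sums the integrated series termwise and compares it with this closed form; the endpoints $\alpha = -0.5$, $\beta = 0.75$ and the truncation at 400 terms are arbitrary illustrative choices.

```python
import math


def termwise_integral(coeff, alpha, beta, terms):
    """Partial sum of a_n * (beta**(n+1) - alpha**(n+1)) / (n+1)."""
    return sum(
        coeff(n) * (beta ** (n + 1) - alpha ** (n + 1)) / (n + 1)
        for n in range(terms)
    )


alpha, beta = -0.5, 0.75  # must satisfy -R < alpha < beta < R with R = 1
series_value = termwise_integral(lambda n: 1.0, alpha, beta, 400)
closed_form = math.log((1 - alpha) / (1 - beta))  # integral of 1/(1-x)
```

The discarded tail is bounded by a geometric series in $r = \max(|\alpha|, |\beta|) = 0.75$, so 400 terms are far more than needed at double precision.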
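Corollary 19.43 (Differentiation Under the Integral). Suppose that $J \subset \mathbb{R}$ is an open interval and $f : J \times X \to \mathbb{C}$ is a function such that
1. $x \to f(t, x)$ is measurable for each $t \in J$.
2. $f(t_0, \cdot) \in \mathrm{L}^1(\mu)$ for some $t_0 \in J$.
3. $\frac{\partial f}{\partial t}(t, x)$ exists for all $(t, x)$.
4. There is a function $g \in \mathrm{L}^1(\mu)$ such that $\big|\frac{\partial f}{\partial t}(t, \cdot)\big| \le g$ for each $t \in J$.
Then $f(t, \cdot) \in \mathrm{L}^1(\mu)$ for all $t \in J$ (i.e. $\int_X |f(t, x)|\, d\mu(x) < \infty$), $t \to \int_X f(t, x)\, d\mu(x)$ is a differentiable function on $J$ and
\[ \frac{d}{dt} \int_X f(t, x)\, d\mu(x) = \int_X \frac{\partial f}{\partial t}(t, x)\, d\mu(x). \]

Proof. (The proof is essentially the same as for sums.) By considering the real and imaginary parts of $f$ separately, we may assume that $f$ is real. Also notice that
\[ \frac{\partial f}{\partial t}(t, x) = \lim_{n\to\infty} n\big(f(t + n^{-1}, x) - f(t, x)\big) \]
and therefore $x \to \frac{\partial f}{\partial t}(t, x)$ is a sequential limit of measurable functions and hence is measurable for all $t \in J$. By the mean value theorem,
\[ |f(t, x) - f(t_0, x)| \le g(x)\, |t - t_0| \text{ for all } t \in J \tag{19.14} \]
and hence
\[ |f(t, x)| \le |f(t, x) - f(t_0, x)| + |f(t_0, x)| \le g(x)\, |t - t_0| + |f(t_0, x)|. \]
This shows $f(t, \cdot) \in \mathrm{L}^1(\mu)$ for all $t \in J$. Let $G(t) := \int_X f(t, x)\, d\mu(x)$, then
\[ \frac{G(t) - G(t_0)}{t - t_0} = \int_X \frac{f(t, x) - f(t_0, x)}{t - t_0}\, d\mu(x). \]
By assumption,
\[ \lim_{t \to t_0} \frac{f(t, x) - f(t_0, x)}{t - t_0} = \frac{\partial f}{\partial t}(t_0, x) \text{ for all } x \in X \]
and by Eq. (19.14),
\[ \Big|\frac{f(t, x) - f(t_0, x)}{t - t_0}\Big| \le g(x) \text{ for all } t \in J \text{ and } x \in X. \]
Therefore, we may apply the dominated convergence theorem to conclude
\begin{align*}
\lim_{n\to\infty} \frac{G(t_n) - G(t_0)}{t_n - t_0} &= \lim_{n\to\infty} \int_X \frac{f(t_n, x) - f(t_0, x)}{t_n - t_0}\, d\mu(x) \\
&= \int_X \lim_{n\to\infty} \frac{f(t_n, x) - f(t_0, x)}{t_n - t_0}\, d\mu(x) \\
&= \int_X \frac{\partial f}{\partial t}(t_0, x)\, d\mu(x)
\end{align*}
for all sequences $t_n \in J \setminus \{t_0\}$ such that $t_n \to t_0$. Therefore, $\dot{G}(t_0) = \lim_{t \to t_0} \frac{G(t) - G(t_0)}{t - t_0}$ exists and
\[ \dot{G}(t_0) = \int_X \frac{\partial f}{\partial t}(t_0, x)\, d\mu(x). \]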
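Example 19.44. Recall from Example 19.24 that
\[ \lambda^{-1} = \int_{[0,\infty)} e^{-\lambda x}\, dm(x) \text{ for all } \lambda > 0. \]
Let $\varepsilon > 0$. For $\lambda \ge 2\varepsilon > 0$ and $n \in \mathbb{N}$ there exists $C_n(\varepsilon) < \infty$ such that
\[ 0 \le \Big(-\frac{d}{d\lambda}\Big)^n e^{-\lambda x} = x^n e^{-\lambda x} \le C_n(\varepsilon) e^{-\varepsilon x}. \]
Using this fact, Corollary 19.43 and induction gives
\begin{align*}
n!\, \lambda^{-n-1} = \Big(-\frac{d}{d\lambda}\Big)^n \lambda^{-1} &= \int_{[0,\infty)} \Big(-\frac{d}{d\lambda}\Big)^n e^{-\lambda x}\, dm(x) \\
&= \int_{[0,\infty)} x^n e^{-\lambda x}\, dm(x).
\end{align*}
That is $n! = \lambda^{n+1} \int_{[0,\infty)} x^n e^{-\lambda x}\, dm(x)$. Recall that
\[ \Gamma(t) := \int_{[0,\infty)} x^{t-1} e^{-x}\, dx \text{ for } t > 0. \]
(The reader should check that $\Gamma(t) < \infty$ for all $t > 0$.) Taking $\lambda = 1$, we have just shown that $\Gamma(n + 1) = n!$ for all $n \in \mathbb{N}$.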
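The identity $n!\,\lambda^{-n-1} = \int_{[0,\infty)} x^n e^{-\lambda x}\, dm(x)$ from Example 19.44 can be checked numerically. The following is a rough sketch, assuming a midpoint rule on the truncated interval $[0, 60]$ (the cutoff and step count are ad hoc choices that make the neglected tail negligible for the small $n$ and the $\lambda \ge 1$ used here); `moment_integral` is a hypothetical helper name.

```python
import math


def moment_integral(n, lam, upper=60.0, steps=100000):
    """Midpoint approximation of the integral of x**n * exp(-lam*x) on [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th cell
        total += x ** n * math.exp(-lam * x)
    return h * total


lam = 2.0
# lam**(n+1) times the integral should reproduce n!
recovered = {n: lam ** (n + 1) * moment_integral(n, lam) for n in (0, 1, 3, 5)}
gamma_values = {n: math.gamma(n + 1) for n in (0, 1, 3, 5)}
```

Both `recovered[n]` and `gamma_values[n]` should agree with `math.factorial(n)` up to quadrature and rounding error.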
Remark 19.45. Corollary 19.43 may be generalized by allowing the hypothesis to hold for $x \in X \setminus E$ where $E \in \mathcal{M}$ is a fixed null set, i.e. $E$ must be independent of $t$. Consider what happens if we formally apply Corollary 19.43 to $g(t) := \int_0^\infty 1_{x \le t}\, dm(x)$,
\[ \dot{g}(t) = \frac{d}{dt} \int_0^\infty 1_{x \le t}\, dm(x) \overset{?}{=} \int_0^\infty \frac{\partial}{\partial t} 1_{x \le t}\, dm(x). \]
The last integral is zero since $\frac{\partial}{\partial t} 1_{x \le t} = 0$ unless $t = x$, in which case it is not defined. On the other hand $g(t) = t$ so that $\dot{g}(t) = 1$. (The reader should decide which hypothesis of Corollary 19.43 has been violated in this example.)
19.5 Measurability on Complete Measure Spaces
In this subsection we will discuss a couple of measurability results concerning completions of measure spaces.

Proposition 19.46. Suppose that $(X, \mathcal{M}, \mu)$ is a complete measure space$^{1}$ and $f : X \to \mathbb{R}$ is measurable.
1. If $g : X \to \mathbb{R}$ is a function such that $f(x) = g(x)$ for $\mu$-a.e. $x$, then $g$ is measurable.
2. If $f_n : X \to \mathbb{R}$ are measurable and $f : X \to \mathbb{R}$ is a function such that $\lim_{n\to\infty} f_n = f$, $\mu$-a.e., then $f$ is measurable as well.

Proof. 1. Let $E = \{x : f(x) \neq g(x)\}$ which is assumed to be in $\mathcal{M}$ and $\mu(E) = 0$. Then $g = 1_{E^c} f + 1_E g$ since $f = g$ on $E^c$. Now $1_{E^c} f$ is measurable so $g$ will be measurable if we show $1_E g$ is measurable. For this consider,
\[ (1_E g)^{-1}(A) = \begin{cases} E^c \cup (1_E g)^{-1}(A \setminus \{0\}) & \text{if } 0 \in A \\ (1_E g)^{-1}(A) & \text{if } 0 \notin A. \end{cases} \tag{19.15} \]
Since $(1_E g)^{-1}(B) \subset E$ if $0 \notin B$ and $\mu(E) = 0$, it follows by completeness of $\mathcal{M}$ that $(1_E g)^{-1}(B) \in \mathcal{M}$ if $0 \notin B$. Therefore Eq. (19.15) shows that $1_E g$ is measurable.
2. Let $E = \{x : \lim_{n\to\infty} f_n(x) \neq f(x)\}$; by assumption $E \in \mathcal{M}$ and $\mu(E) = 0$. Since $g := 1_{E^c} f = \lim_{n\to\infty} 1_{E^c} f_n$, $g$ is measurable. Because $f = g$ on $E^c$ and $\mu(E) = 0$, $f = g$ a.e. so by part 1. $f$ is also measurable.

The above results are in general false if $(X, \mathcal{M}, \mu)$ is not complete. For example, let $X = \{0, 1, 2\}$, $\mathcal{M} = \{\{0\}, \{1, 2\}, X, \emptyset\}$, and $\mu = \delta_0$. Take $g(0) = 0$, $g(1) = 1$, $g(2) = 2$, then $g = 0$ a.e. yet $g$ is not measurable.

$^{1}$ Recall this means that if $N \subset X$ is a set such that $N \subset A \in \mathcal{M}$ and $\mu(A) = 0$, then $N \in \mathcal{M}$ as well.
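Lemma 19.47. Suppose that $(X, \mathcal{M}, \mu)$ is a measure space and $\bar{\mathcal{M}}$ is the completion of $\mathcal{M}$ relative to $\mu$ and $\bar\mu$ is the extension of $\mu$ to $\bar{\mathcal{M}}$. Then a function $f : X \to \mathbb{R}$ is $(\bar{\mathcal{M}}, \mathcal{B} = \mathcal{B}_{\mathbb{R}})$-measurable iff there exists a function $g : X \to \mathbb{R}$ that is $(\mathcal{M}, \mathcal{B})$-measurable such that $E = \{x : f(x) \neq g(x)\} \in \bar{\mathcal{M}}$ and $\bar\mu(E) = 0$, i.e. $f(x) = g(x)$ for $\bar\mu$-a.e. $x$. Moreover for such a pair $f$ and $g$, $f \in \mathrm{L}^1(\bar\mu)$ iff $g \in \mathrm{L}^1(\mu)$ and in which case
\[ \int_X f\, d\bar\mu = \int_X g\, d\mu. \]

Proof. Suppose first that such a function $g$ exists so that $\bar\mu(E) = 0$. Since $g$ is also $(\bar{\mathcal{M}}, \mathcal{B})$-measurable, we see from Proposition 19.46 that $f$ is $(\bar{\mathcal{M}}, \mathcal{B})$-measurable. Conversely if $f$ is $(\bar{\mathcal{M}}, \mathcal{B})$-measurable, by considering $f_\pm$ we may assume that $f \ge 0$. Choose $(\bar{\mathcal{M}}, \mathcal{B})$-measurable simple functions $\varphi_n \ge 0$ such that $\varphi_n \uparrow f$ as $n \to \infty$. Writing $\varphi_n = \sum a_k 1_{A_k}$ with $A_k \in \bar{\mathcal{M}}$, we may choose $B_k \in \mathcal{M}$ such that $B_k \subset A_k$ and $\bar\mu(A_k \setminus B_k) = 0$. Letting $\tilde\varphi_n := \sum a_k 1_{B_k}$ we have produced a $(\mathcal{M}, \mathcal{B})$-measurable simple function $\tilde\varphi_n \ge 0$ such that $E_n := \{\varphi_n \neq \tilde\varphi_n\}$ has zero $\bar\mu$-measure. Since $\bar\mu(\cup_n E_n) \le \sum_n \bar\mu(E_n)$, there exists $F \in \mathcal{M}$ such that $\cup_n E_n \subset F$ and $\mu(F) = 0$. It now follows that
\[ 1_{F^c} \tilde\varphi_n = 1_{F^c} \varphi_n \uparrow g := 1_{F^c} f \text{ as } n \to \infty. \]
This shows that $g = 1_{F^c} f$ is $(\mathcal{M}, \mathcal{B})$-measurable and that $\{f \neq g\} \subset F$ has $\bar\mu$-measure zero. Since $f = g$, $\bar\mu$-a.e., $\int_X f\, d\bar\mu = \int_X g\, d\bar\mu$, so to prove the asserted equality of integrals it suffices to prove
\[ \int_X g\, d\bar\mu = \int_X g\, d\mu. \tag{19.16} \]
Because $\bar\mu = \mu$ on $\mathcal{M}$, Eq. (19.16) is easily verified for non-negative $\mathcal{M}$-measurable simple functions. Then by the monotone convergence theorem and the approximation Theorem 18.42 it holds for all $\mathcal{M}$-measurable functions $g : X \to [0,\infty]$. The rest of the assertions follow in the standard way by considering $(\operatorname{Re} g)_\pm$ and $(\operatorname{Im} g)_\pm$.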
19.6 Comparison of the Lebesgue and the Riemann Integral
For the rest of this chapter, let $-\infty < a < b < \infty$ and $f : [a,b] \to \mathbb{R}$ be a bounded function. A partition of $[a,b]$ is a finite subset $\pi \subset [a,b]$ containing $\{a, b\}$. To each partition
\[ \pi = \{a = t_0 < t_1 < \cdots < t_n = b\} \tag{19.17} \]
of $[a,b]$ let
\[ \operatorname{mesh}(\pi) := \max\{|t_j - t_{j-1}| : j = 1, \dots, n\}, \]
\[ M_j = \sup\{f(x) : t_{j-1} \le x \le t_j\}, \quad m_j = \inf\{f(x) : t_{j-1} \le x \le t_j\}, \]
\[ G_\pi = f(a) 1_{\{a\}} + \sum_{j=1}^n M_j 1_{(t_{j-1}, t_j]}, \quad g_\pi = f(a) 1_{\{a\}} + \sum_{j=1}^n m_j 1_{(t_{j-1}, t_j]} \]
and
\[ S_\pi f = \sum M_j (t_j - t_{j-1}) \quad\text{and}\quad s_\pi f = \sum m_j (t_j - t_{j-1}). \]
Notice that
\[ S_\pi f = \int_a^b G_\pi\, dm \quad\text{and}\quad s_\pi f = \int_a^b g_\pi\, dm. \]
The upper and lower Riemann integrals are defined respectively by
\[ \overline{\int_a^b} f(x)\, dx = \inf_\pi S_\pi f \quad\text{and}\quad \underline{\int_a^b} f(x)\, dx = \sup_\pi s_\pi f. \]

Definition 19.48. The function $f$ is Riemann integrable iff $\overline{\int_a^b} f = \underline{\int_a^b} f \in \mathbb{R}$, in which case the Riemann integral $\int_a^b f$ is defined to be the common value:
\[ \int_a^b f(x)\, dx = \overline{\int_a^b} f(x)\, dx = \underline{\int_a^b} f(x)\, dx. \]
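The proof of the following Lemma is left to the reader as Exercise 19.20.

Lemma 19.49. If $\pi'$ and $\pi$ are two partitions of $[a,b]$ and $\pi \subset \pi'$ then
\[ G_\pi \ge G_{\pi'} \ge f \ge g_{\pi'} \ge g_\pi \]
and
\[ S_\pi f \ge S_{\pi'} f \ge s_{\pi'} f \ge s_\pi f. \]
There exists an increasing sequence of partitions $\{\pi_k\}_{k=1}^\infty$ such that $\operatorname{mesh}(\pi_k) \downarrow 0$ and
\[ S_{\pi_k} f \downarrow \overline{\int_a^b} f \quad\text{and}\quad s_{\pi_k} f \uparrow \underline{\int_a^b} f \text{ as } k \to \infty. \]
If we let
\[ G := \lim_{k\to\infty} G_{\pi_k} \quad\text{and}\quad g := \lim_{k\to\infty} g_{\pi_k} \tag{19.18} \]
then by the dominated convergence theorem,
\[ \int_{[a,b]} g\, dm = \lim_{k\to\infty} \int_{[a,b]} g_{\pi_k} = \lim_{k\to\infty} s_{\pi_k} f = \underline{\int_a^b} f(x)\, dx \tag{19.19} \]
and
\[ \int_{[a,b]} G\, dm = \lim_{k\to\infty} \int_{[a,b]} G_{\pi_k} = \lim_{k\to\infty} S_{\pi_k} f = \overline{\int_a^b} f(x)\, dx. \tag{19.20} \]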
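The monotone behaviour of the upper and lower sums in Lemma 19.49 can be observed directly. The sketch below is restricted, as a simplifying assumption, to an increasing function, for which the supremum and infimum over each cell $[t_{j-1}, t_j]$ are attained at the right and left endpoints; it uses the nested dyadic partitions of $[0,1]$ and $f(x) = x^2$ with exact rational arithmetic. The names are illustrative.

```python
from fractions import Fraction


def darboux_sums(f, a, b, k):
    """Upper and lower sums of an increasing f on the dyadic partition with 2**k cells."""
    n = 2 ** k
    t = [a + (b - a) * Fraction(j, n) for j in range(n + 1)]
    # For increasing f: M_j = f(t_j) and m_j = f(t_{j-1}).
    upper = sum(f(t[j]) * (t[j] - t[j - 1]) for j in range(1, n + 1))
    lower = sum(f(t[j - 1]) * (t[j] - t[j - 1]) for j in range(1, n + 1))
    return upper, lower


def square(x):
    return x * x


sums = [darboux_sums(square, Fraction(0), Fraction(1), k) for k in range(0, 8)]
uppers = [u for u, _ in sums]
lowers = [l for _, l in sums]
```

Here the upper sums decrease and the lower sums increase toward the common value $1/3$, and the gap between them on the partition with $n$ cells is exactly $(f(1) - f(0))/n$.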
Notation 19.50 For $x \in [a,b]$, let
\[ H(x) = \limsup_{y \to x} f(y) := \lim_{\varepsilon \downarrow 0} \sup\{f(y) : |y - x| \le \varepsilon,\ y \in [a,b]\} \]
and
\[ h(x) = \liminf_{y \to x} f(y) := \lim_{\varepsilon \downarrow 0} \inf\{f(y) : |y - x| \le \varepsilon,\ y \in [a,b]\}. \]

Lemma 19.51. The functions $H, h : [a,b] \to \mathbb{R}$ satisfy:
1. $h(x) \le f(x) \le H(x)$ for all $x \in [a,b]$ and $h(x) = H(x)$ iff $f$ is continuous at $x$.
2. If $\{\pi_k\}_{k=1}^\infty$ is any increasing sequence of partitions such that $\operatorname{mesh}(\pi_k) \downarrow 0$ and $G$ and $g$ are defined as in Eq. (19.18), then
\[ G(x) = H(x) \ge f(x) \ge h(x) = g(x) \quad \forall\, x \notin \pi := \bigcup_{k=1}^\infty \pi_k. \tag{19.21} \]
(Note $\pi$ is a countable set.)
3. $H$ and $h$ are Borel measurable.

Proof. Let $G_k := G_{\pi_k} \downarrow G$ and $g_k := g_{\pi_k} \uparrow g$.
1. It is clear that $h(x) \le f(x) \le H(x)$ for all $x$ and $H(x) = h(x)$ iff $\lim_{y \to x} f(y)$ exists and is equal to $f(x)$. That is $H(x) = h(x)$ iff $f$ is continuous at $x$.
2. For $x \notin \pi$,
\[ G_k(x) \ge H(x) \ge f(x) \ge h(x) \ge g_k(x) \quad \forall\, k \]
and letting $k \to \infty$ in this equation implies
\[ G(x) \ge H(x) \ge f(x) \ge h(x) \ge g(x) \quad \forall\, x \notin \pi. \tag{19.22} \]
Moreover, given $\varepsilon > 0$ and $x \notin \pi$,
\[ \sup\{f(y) : |y - x| \le \varepsilon,\ y \in [a,b]\} \ge G_k(x) \]
for all $k$ large enough, since eventually $G_k(x)$ is the supremum of $f(y)$ over some interval contained in $[x - \varepsilon, x + \varepsilon]$. Again letting $k \to \infty$ implies $\sup_{|y - x| \le \varepsilon} f(y) \ge G(x)$ and therefore that
\[ H(x) = \limsup_{y \to x} f(y) \ge G(x) \]
for all $x \notin \pi$. Combining this equation with Eq. (19.22) then implies $H(x) = G(x)$ if $x \notin \pi$. A similar argument shows that $h(x) = g(x)$ if $x \notin \pi$ and hence Eq. (19.21) is proved.
3. The functions $G$ and $g$ are limits of measurable functions and hence measurable. Since $H = G$ and $h = g$ except possibly on the countable set $\pi$, both $H$ and $h$ are also Borel measurable. (You justify this statement.)
Theorem 19.52. Let $f : [a,b] \to \mathbb{R}$ be a bounded function. Then
\[ \overline{\int_a^b} f = \int_{[a,b]} H\, dm \quad\text{and}\quad \underline{\int_a^b} f = \int_{[a,b]} h\, dm \tag{19.23} \]
and the following statements are equivalent:
1. $H(x) = h(x)$ for $m$-a.e. $x$,
2. the set
\[ E := \{x \in [a,b] : f \text{ is discontinuous at } x\} \]
is an $m$-null set.
3. $f$ is Riemann integrable.
If $f$ is Riemann integrable then $f$ is Lebesgue measurable$^{2}$, i.e. $f$ is $\mathcal{L}/\mathcal{B}$-measurable where $\mathcal{L}$ is the Lebesgue $\sigma$-algebra and $\mathcal{B}$ is the Borel $\sigma$-algebra on $[a,b]$. Moreover if we let $\bar{m}$ denote the completion of $m$, then
\[ \int_{[a,b]} H\, dm = \int_a^b f(x)\, dx = \int_{[a,b]} f\, d\bar{m} = \int_{[a,b]} h\, dm. \tag{19.24} \]

Proof. Let $\{\pi_k\}_{k=1}^\infty$ be an increasing sequence of partitions of $[a,b]$ as described in Lemma 19.49 and let $G$ and $g$ be defined as in Lemma 19.51. Since $m(\pi) = 0$, $H = G$ a.e., and Eq. (19.23) is a consequence of Eqs. (19.19) and (19.20). From Eq. (19.23), $f$ is Riemann integrable iff
\[ \int_{[a,b]} H\, dm = \int_{[a,b]} h\, dm \]
and because $h \le f \le H$ this happens iff $h(x) = H(x)$ for $m$-a.e. $x$. Since $E = \{x : H(x) \neq h(x)\}$, this last condition is equivalent to $E$ being an $m$-null set. In light of these results and Eq. (19.21), the remaining assertions including Eq. (19.24) are now consequences of Lemma 19.47.

Notation 19.53 In view of this theorem we will often write $\int_a^b f(x)\, dx$ for $\int_a^b f\, dm$.
19.7 Determining Classes of Measures
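Definition 19.54 ($\sigma$-finite). Let $X$ be a set and $\mathcal{E} \subset \mathcal{F} \subset 2^X$. We say that a function $\mu : \mathcal{F} \to [0,\infty]$ is $\sigma$-finite on $\mathcal{E}$ if there exist $X_n \in \mathcal{E}$ such that $X_n \uparrow X$ and $\mu(X_n) < \infty$ for all $n$.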
$^{2}$ $f$ need not be Borel measurable.
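Theorem 19.55 (Uniqueness). Suppose that $\mathcal{C} \subset 2^X$ is a $\pi$-class (see Definition 18.53), $\mathcal{M} = \sigma(\mathcal{C})$ and $\mu$ and $\nu$ are two measures on $\mathcal{M}$. If $\mu$ and $\nu$ are $\sigma$-finite on $\mathcal{C}$ and $\mu = \nu$ on $\mathcal{C}$, then $\mu = \nu$ on $\mathcal{M}$.

Proof. We begin first with the special case where $\mu(X) < \infty$ and therefore also
\[ \nu(X) = \lim_{n\to\infty} \nu(X_n) = \lim_{n\to\infty} \mu(X_n) = \mu(X) < \infty. \]
Let
\[ \mathbb{H} := \{f \in \ell^\infty(\mathcal{M}, \mathbb{R}) : \mu(f) = \nu(f)\}. \]
Then $\mathbb{H}$ is a linear subspace which is closed under bounded convergence (by the dominated convergence theorem), contains $1$ and contains the multiplicative system, $\mathbb{M} := \{1_C : C \in \mathcal{C}\}$. Therefore, by Theorem 18.51 or Corollary 18.54, $\mathbb{H} = \ell^\infty(\mathcal{M}, \mathbb{R})$ and hence $\mu = \nu$. For the general case, let $X_n^1, X_n^2 \in \mathcal{C}$ be chosen so that $X_n^1 \uparrow X$ and $X_n^2 \uparrow X$ as $n \to \infty$ and $\mu(X_n^1) + \nu(X_n^2) < \infty$ for all $n$. Then $X_n := X_n^1 \cap X_n^2 \in \mathcal{C}$ increases to $X$ and $\nu(X_n) = \mu(X_n) < \infty$ for all $n$. For each $n \in \mathbb{N}$, define two measures $\mu_n$ and $\nu_n$ on $\mathcal{M}$ by
\[ \mu_n(A) := \mu(A \cap X_n) \quad\text{and}\quad \nu_n(A) = \nu(A \cap X_n). \]
Then, as the reader should verify, $\mu_n$ and $\nu_n$ are finite measures on $\mathcal{M}$ such that $\mu_n = \nu_n$ on $\mathcal{C}$. Therefore, by the special case just proved, $\mu_n = \nu_n$ on $\mathcal{M}$. Finally, using the continuity properties of measures,
\[ \mu(A) = \lim_{n\to\infty} \mu(A \cap X_n) = \lim_{n\to\infty} \nu(A \cap X_n) = \nu(A) \]
for all $A \in \mathcal{M}$.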
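As an immediate consequence we have the following corollaries.

Corollary 19.56. Suppose that $(X, \tau)$ is a topological space, $\mathcal{B}_X = \sigma(\tau)$ is the Borel $\sigma$-algebra on $X$ and $\mu$ and $\nu$ are two measures on $\mathcal{B}_X$ which are $\sigma$-finite on $\tau$. If $\mu = \nu$ on $\tau$ then $\mu = \nu$ on $\mathcal{B}_X$, i.e. $\mu \equiv \nu$.

Corollary 19.57. Suppose that $\mu$ and $\nu$ are two measures on $\mathcal{B}_{\mathbb{R}^n}$ which are finite on bounded sets and such that $\mu(A) = \nu(A)$ for all sets $A$ of the form
\[ A = (a, b] = (a_1, b_1] \times \cdots \times (a_n, b_n] \]
with $a, b \in \mathbb{R}^n$ and $a < b$, i.e. $a_i < b_i$ for all $i$. Then $\mu = \nu$ on $\mathcal{B}_{\mathbb{R}^n}$.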
Proposition 19.58. Suppose that $(X, d)$ is a metric space, $\mu$ and $\nu$ are two measures on $\mathcal{B}_X := \sigma(\tau_d)$ which are finite on bounded measurable subsets of $X$ and
\[ \int_X f\, d\mu = \int_X f\, d\nu \tag{19.25} \]
for all $f \in BC_b(X, \mathbb{R})$ where
\[ BC_b(X, \mathbb{R}) = \{f \in BC(X, \mathbb{R}) : \operatorname{supp}(f) \text{ is bounded}\}. \tag{19.26} \]
Then $\mu \equiv \nu$.

Proof. To prove this fix an $o \in X$ and let
\[ \psi_R(x) = ([R + 1 - d(x, o)] \wedge 1) \vee 0 \tag{19.27} \]
so that $\psi_R \in BC_b(X, [0,1])$, $\operatorname{supp}(\psi_R) \subset B(o, R + 2)$ and $\psi_R \uparrow 1$ as $R \to \infty$. Let $\mathbb{H}_R$ denote the space of bounded real valued $\mathcal{B}_X$-measurable functions $f$ such that
\[ \int_X \psi_R f\, d\mu = \int_X \psi_R f\, d\nu. \tag{19.28} \]
Then $\mathbb{H}_R$ is closed under bounded convergence and because of Eq. (19.25) contains $BC(X, \mathbb{R})$. Therefore by Corollary 18.55, $\mathbb{H}_R$ contains all bounded measurable functions on $X$. Take $f = 1_A$ in Eq. (19.28) with $A \in \mathcal{B}_X$, and then use the monotone convergence theorem to let $R \to \infty$. The result is $\mu(A) = \nu(A)$ for all $A \in \mathcal{B}_X$.
Here is another version of Proposition 19.58.

Proposition 19.59. Suppose that $(X, d)$ is a metric space, $\mu$ and $\nu$ are two measures on $\mathcal{B}_X = \sigma(\tau_d)$ which are both finite on compact sets. Further assume there exist compact sets $K_k \subset X$ such that $K_k^o \uparrow X$. If
\[ \int_X f\, d\mu = \int_X f\, d\nu \tag{19.29} \]
for all $f \in C_c(X, \mathbb{R})$ then $\mu \equiv \nu$.

Proof. Let $\psi_{n,k}$ be defined as in the proof of Proposition 18.56 and let $\mathbb{H}_{n,k}$ denote those bounded $\mathcal{B}_X$-measurable functions $f : X \to \mathbb{R}$ such that
\[ \int_X f \psi_{n,k}\, d\mu = \int_X f \psi_{n,k}\, d\nu. \]
By assumption $BC(X, \mathbb{R}) \subset \mathbb{H}_{n,k}$ and one easily checks that $\mathbb{H}_{n,k}$ is closed under bounded convergence. Therefore, by Corollary 18.55, $\mathbb{H}_{n,k}$ contains all bounded measurable functions. In particular for $A \in \mathcal{B}_X$,
\[ \int_X 1_A \psi_{n,k}\, d\mu = \int_X 1_A \psi_{n,k}\, d\nu. \]
Letting $n \to \infty$ in this equation, using the dominated convergence theorem, one shows
\[ \int_X 1_A 1_{K_k^o}\, d\mu = \int_X 1_A 1_{K_k^o}\, d\nu \]
holds for each $k$. Finally using the monotone convergence theorem we may let $k \to \infty$ to conclude
\[ \mu(A) = \int_X 1_A\, d\mu = \int_X 1_A\, d\nu = \nu(A) \]
for all $A \in \mathcal{B}_X$.
19.8 Exercises
Exercise 19.2. Let be a measure on an algebra / 2
X
, then (A) +
(B) = (A B) +(A B) for all A, B /.
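Exercise 19.3 (From problem 12 on p. 27 of Folland.). Let $(X, \mathcal{M}, \mu)$ be a finite measure space and for $A, B \in \mathcal{M}$ let $\rho(A,B) = \mu(A \Delta B)$ where $A \Delta B = (A \setminus B) \cup (B \setminus A)$. It is clear that $\rho(A,B) = \rho(B,A)$. Show:
1. $\rho$ satisfies the triangle inequality:
\[ \rho(A,C) \le \rho(A,B) + \rho(B,C) \text{ for all } A, B, C \in \mathcal{M}. \]
2. Define $A \sim B$ iff $\mu(A \Delta B) = 0$ and notice that $\rho(A,B) = 0$ iff $A \sim B$. Show $\sim$ is an equivalence relation.
3. Let $\mathcal{M}/\!\sim$ denote $\mathcal{M}$ modulo the equivalence relation $\sim$, and let $[A] := \{ B \in \mathcal{M} : B \sim A \}$. Show that $\bar\rho([A],[B]) := \rho(A,B)$ gives a well defined metric on $\mathcal{M}/\!\sim$.
4. Similarly show $\tilde\mu([A]) = \mu(A)$ is a well defined function on $\mathcal{M}/\!\sim$ and show $\tilde\mu : (\mathcal{M}/\!\sim) \to \mathbb{R}_+$ is continuous.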
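Exercise 19.4. Suppose that $\mu_n : \mathcal{M} \to [0,\infty]$ are measures on $\mathcal{M}$ for $n \in \mathbb{N}$. Also suppose that $\mu_n(A)$ is increasing in $n$ for all $A \in \mathcal{M}$. Prove that $\mu : \mathcal{M} \to [0,\infty]$ defined by $\mu(A) := \lim_{n\to\infty} \mu_n(A)$ is also a measure.
Exercise 19.5. Now suppose that $\Lambda$ is some index set and for each $\lambda \in \Lambda$, $\mu_\lambda : \mathcal{M} \to [0,\infty]$ is a measure on $\mathcal{M}$. Define $\mu : \mathcal{M} \to [0,\infty]$ by $\mu(A) = \sum_{\lambda \in \Lambda} \mu_\lambda(A)$ for each $A \in \mathcal{M}$. Show that $\mu$ is also a measure.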
Exercise 19.6. Let $(X, \mathcal{M}, \mu)$ be a measure space and $\rho : X \to [0,\infty]$ be a measurable function. For $A \in \mathcal{M}$, set $\nu(A) := \int_A \rho \, d\mu$.
1. Show $\nu : \mathcal{M} \to [0,\infty]$ is a measure.
2. Let $f : X \to [0,\infty]$ be a measurable function, show
\[ \int_X f \, d\nu = \int_X f \rho \, d\mu. \tag{19.30} \]
Hint: first prove the relationship for characteristic functions, then for simple functions, and then for general positive measurable functions.
3. Show that a measurable function $f : X \to \mathbb{C}$ is in $L^1(\nu)$ iff $|f| \rho \in L^1(\mu)$ and if $f \in L^1(\nu)$ then Eq. (19.30) still holds.
Notation 19.60 It is customary to informally describe $\nu$ defined in Exercise 19.6 by writing $d\nu = \rho \, d\mu$.
Exercise 19.7. Let $(X, \mathcal{M}, \mu)$ be a measure space, $(Y, \mathcal{F})$ be a measurable space and $f : X \to Y$ be a measurable map. Define a function $\nu : \mathcal{F} \to [0,\infty]$ by $\nu(A) := \mu(f^{-1}(A))$ for all $A \in \mathcal{F}$.
1. Show $\nu$ is a measure. (We will write $\nu = f_* \mu$ or $\nu = \mu \circ f^{-1}$.)
2. Show
\[ \int_Y g \, d\nu = \int_X (g \circ f) \, d\mu \tag{19.31} \]
for all measurable functions $g : Y \to [0,\infty]$. Hint: see the hint from Exercise 19.6.
3. Show a measurable function $g : Y \to \mathbb{C}$ is in $L^1(\nu)$ iff $g \circ f \in L^1(\mu)$ and that Eq. (19.31) holds for all $g \in L^1(\nu)$.
Exercise 19.8. Let $F : \mathbb{R} \to \mathbb{R}$ be a $C^1$-function such that $F'(x) > 0$ for all $x \in \mathbb{R}$ and $\lim_{x \to \pm\infty} F(x) = \pm\infty$. (Notice that $F$ is strictly increasing so that $F^{-1} : \mathbb{R} \to \mathbb{R}$ exists and moreover, by the inverse function theorem, that $F^{-1}$ is a $C^1$-function.) Let $m$ be Lebesgue measure on $\mathcal{B}_{\mathbb{R}}$ and
\[ \mu(A) = m(F(A)) = m\left( (F^{-1})^{-1}(A) \right) = \left( F^{-1}_* m \right)(A) \]
for all $A \in \mathcal{B}_{\mathbb{R}}$. Show $d\mu = F' \, dm$. Use this result to prove the change of variable formula,
\[ \int_{\mathbb{R}} h \circ F \cdot F' \, dm = \int_{\mathbb{R}} h \, dm \tag{19.32} \]
which is valid for all Borel measurable functions $h : \mathbb{R} \to [0,\infty]$.
Hint: Start by showing $d\mu = F' \, dm$ on sets of the form $A = (a,b]$ with $a, b \in \mathbb{R}$ and $a < b$. Then use the uniqueness assertions in Theorem 19.8 (or see Corollary 19.57) to conclude $d\mu = F' \, dm$ on all of $\mathcal{B}_{\mathbb{R}}$. To prove Eq. (19.32) apply Exercise 19.7 with $g = h \circ F$ and $f = F^{-1}$.
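Exercise 19.9. Let $(X, \mathcal{M}, \mu)$ be a measure space and $\{A_n\}_{n=1}^\infty \subset \mathcal{M}$, show
\[ \mu(\{A_n \text{ a.a.}\}) \le \liminf_{n\to\infty} \mu(A_n) \]
and if $\mu\left( \cup_{m \ge n} A_m \right) < \infty$ for some $n$, then
\[ \mu(\{A_n \text{ i.o.}\}) \ge \limsup_{n\to\infty} \mu(A_n). \]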
Exercise 19.10. BRUCE: Delete this exercise which is contained in Lemma 19.17. Suppose $(X, \mathcal{M}, \mu)$ is a measure space and $f : X \to [0,\infty]$ is a measurable function such that $\int_X f \, d\mu < \infty$. Show $\mu(\{f = \infty\}) = 0$ and the set $\{f > 0\}$ is $\sigma$-finite.
Exercise 19.11. Folland 2.13 on p. 52. Hint: Fatou times two.
Exercise 19.12. Folland 2.14 on p. 52. BRUCE: delete this exercise
Exercise 19.13. Give examples of measurable functions $\{f_n\}$ on $\mathbb{R}$ such that $f_n$ decreases to $0$ uniformly yet $\int f_n \, dm = \infty$ for all $n$. Also give an example of a sequence of measurable functions $\{g_n\}$ on $[0,1]$ such that $g_n \to 0$ while $\int g_n \, dm = 1$ for all $n$.
Exercise 19.14. Folland 2.19 on p. 59. (This problem is essentially covered
in the previous exercise.)
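Exercise 19.15. Suppose $\{a_n\}_{n=-\infty}^{\infty} \subset \mathbb{C}$ is a summable sequence (i.e. $\sum_{n=-\infty}^{\infty} |a_n| < \infty$), then $f(\theta) := \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ is a continuous function for $\theta \in \mathbb{R}$ and
\[ a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta} \, d\theta. \]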
Exercise 19.16. For any function $f \in L^1(m)$, show $x \in \mathbb{R} \to \int_{(-\infty,x]} f(t) \, dm(t)$ is continuous in $x$. Also find a finite measure, $\mu$, on $\mathcal{B}_{\mathbb{R}}$ such that $x \to \int_{(-\infty,x]} f(t) \, d\mu(t)$ is not continuous.
Exercise 19.17. Folland 2.28 on p. 60.
Exercise 19.18. Folland 2.31b and 2.31e on p. 60. (The answer in 2.31b is wrong by a factor of $-1$ and the sum is on $k = 1$ to $\infty$. In part e, $s$ should be taken to be $a$. You may also freely use the Taylor series expansion
\[ (1 - z)^{-1/2} = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{2^n n!} z^n = \sum_{n=0}^{\infty} \frac{(2n)!}{4^n (n!)^2} z^n \text{ for } |z| < 1.) \]
Exercise 19.19. There exist meager (see Definition 16.6 and Theorem 16.4) subsets of $\mathbb{R}$ which have full Lebesgue measure, i.e. whose complement is a Lebesgue null set. (This is Folland 5.27. Hint: Consider the generalized Cantor sets discussed on p. 39 of Folland.)
20
Multiple Integrals
In this chapter we will introduce iterated integrals and product measures. We
are particularly interested in when it is permissible to interchange the order
of integration in multiple integrals.
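Example 20.1. As an example let $X = [1,\infty)$ and $Y = [0,1]$ equipped with their Borel $\sigma$-algebras and let $\mu = \nu = m$, where $m$ is Lebesgue measure. The iterated integrals of the function $f(x,y) := e^{-xy} - 2e^{-2xy}$ satisfy,
\[ \int_0^1 \left[ \int_1^\infty \left( e^{-xy} - 2e^{-2xy} \right) dx \right] dy = \int_0^1 e^{-y} \left( \frac{1 - e^{-y}}{y} \right) dy \in (0,\infty) \]
and
\[ \int_1^\infty \left[ \int_0^1 \left( e^{-xy} - 2e^{-2xy} \right) dy \right] dx = -\int_1^\infty e^{-x} \left[ \frac{1 - e^{-x}}{x} \right] dx \in (-\infty, 0) \]
and therefore are not equal. Hence it is not always true that the order of integration is irrelevant.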
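The sign discrepancy in Example 20.1 can be checked numerically. The following is only a rough sketch: the helper names `midpoint` and `inner`, the step count, and the truncation of $[1,\infty)$ at $x = 40$ are choices made here for illustration and are not part of the text.

```python
import math

def midpoint(g, a, b, n=20_000):
    """Composite midpoint rule; it never evaluates g at the endpoints."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def inner(t):
    # e^{-t} (1 - e^{-t}) / t: the closed form of both inner integrals, up to sign
    return math.exp(-t) * (1.0 - math.exp(-t)) / t

# Integrate in x first, then in y over [0, 1].
dx_then_dy = midpoint(inner, 0.0, 1.0)
# Integrate in y first, then in x over [1, infinity), truncated at x = 40.
dy_then_dx = -midpoint(inner, 1.0, 40.0)

print(dx_then_dy, dy_then_dx)
```

The first number is positive and the second is negative, in agreement with the two displayed integrals.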
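Lemma 20.2. Let $\mathbb{F}$ be either $[0,\infty)$, $\mathbb{R}$ or $\mathbb{C}$. Suppose $(X, \mathcal{M})$ and $(Y, \mathcal{N})$ are two measurable spaces and $f : X \times Y \to \mathbb{F}$ is a $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{F}})$-measurable function, then for each $y \in Y$,
\[ x \to f(x,y) \text{ is } (\mathcal{M}, \mathcal{B}_{\mathbb{F}}) \text{-measurable}, \tag{20.1} \]
for each $x \in X$,
\[ y \to f(x,y) \text{ is } (\mathcal{N}, \mathcal{B}_{\mathbb{F}}) \text{-measurable}. \tag{20.2} \]
Proof. Suppose that $E = A \times B \in \mathcal{E} := \mathcal{M} \times \mathcal{N}$ and $f = 1_E$. Then
\[ f(x,y) = 1_{A \times B}(x,y) = 1_A(x) 1_B(y) \]
from which it follows that Eqs. (20.1) and (20.2) hold for this function. Let $\mathcal{H}$ be the collection of all bounded $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{F}})$-measurable functions on $X \times Y$ such that Eqs. (20.1) and (20.2) hold; here we assume $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. Because measurable functions are closed under taking linear combinations and pointwise limits, $\mathcal{H}$ is a linear subspace of $\ell^\infty(\mathcal{M} \otimes \mathcal{N}, \mathbb{F})$ which is closed under bounded convergence and contains $1_E$ for all $E$ in the $\pi$-class $\mathcal{E}$. Therefore by Corollary 18.54, $\mathcal{H} = \ell^\infty(\mathcal{M} \otimes \mathcal{N}, \mathbb{F})$.
For a general $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{F}})$-measurable function $f : X \times Y \to \mathbb{F}$ and $M \in \mathbb{N}$, let $f_M := 1_{|f| \le M} f \in \ell^\infty(\mathcal{M} \otimes \mathcal{N}, \mathbb{F})$. Then Eqs. (20.1) and (20.2) hold with $f$ replaced by $f_M$ and hence for $f$ as well by letting $M \to \infty$.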
Notation 20.3 (Iterated Integrals) If $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are two measure spaces and $f : X \times Y \to \mathbb{C}$ is a $\mathcal{M} \otimes \mathcal{N}$-measurable function, the iterated integrals of $f$ (when they make sense) are:
\[ \int_X d\mu(x) \int_Y d\nu(y) f(x,y) := \int_X \left[ \int_Y f(x,y) \, d\nu(y) \right] d\mu(x) \]
and
\[ \int_Y d\nu(y) \int_X d\mu(x) f(x,y) := \int_Y \left[ \int_X f(x,y) \, d\mu(x) \right] d\nu(y). \]
Notation 20.4 Suppose that $f : X \to \mathbb{C}$ and $g : Y \to \mathbb{C}$ are functions, let $f \otimes g$ denote the function on $X \times Y$ given by
\[ f \otimes g(x,y) = f(x) g(y). \]
Notice that if $f, g$ are measurable, then $f \otimes g$ is $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{C}})$-measurable. To prove this let $F(x,y) = f(x)$ and $G(x,y) = g(y)$ so that $f \otimes g = F \cdot G$ will be measurable provided that $F$ and $G$ are measurable. Now $F = f \circ \pi_1$ where $\pi_1 : X \times Y \to X$ is the projection map. This shows that $F$ is the composition of measurable functions and hence measurable. Similarly one shows that $G$ is measurable.
20.1 Fubini-Tonelli's Theorem and Product Measure
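Theorem 20.5. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces and $f$ is a nonnegative $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\bar{\mathbb{R}}})$-measurable function, then for each $y \in Y$,
\[ x \to f(x,y) \text{ is } \mathcal{M} - \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{20.3} \]
for each $x \in X$,
\[ y \to f(x,y) \text{ is } \mathcal{N} - \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{20.4} \]
\[ x \to \int_Y f(x,y) \, d\nu(y) \text{ is } \mathcal{M} - \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{20.5} \]
\[ y \to \int_X f(x,y) \, d\mu(x) \text{ is } \mathcal{N} - \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{20.6} \]
and
\[ \int_X d\mu(x) \int_Y d\nu(y) f(x,y) = \int_Y d\nu(y) \int_X d\mu(x) f(x,y). \tag{20.7} \]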
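Proof. Suppose that $E = A \times B \in \mathcal{E} := \mathcal{M} \times \mathcal{N}$ and $f = 1_E$. Then
\[ f(x,y) = 1_{A \times B}(x,y) = 1_A(x) 1_B(y) \]
and one sees that Eqs. (20.3) and (20.4) hold. Moreover
\[ \int_Y f(x,y) \, d\nu(y) = \int_Y 1_A(x) 1_B(y) \, d\nu(y) = 1_A(x) \nu(B), \]
so that Eq. (20.5) holds and we have
\[ \int_X d\mu(x) \int_Y d\nu(y) f(x,y) = \nu(B) \mu(A). \tag{20.8} \]
Similarly,
\[ \int_X f(x,y) \, d\mu(x) = \mu(A) 1_B(y) \text{ and } \int_Y d\nu(y) \int_X d\mu(x) f(x,y) = \nu(B) \mu(A) \]
from which it follows that Eqs. (20.6) and (20.7) hold in this case as well.
For the moment let us further assume that $\mu(X) < \infty$ and $\nu(Y) < \infty$ and let $\mathcal{H}$ be the collection of all bounded $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{R}})$-measurable functions on $X \times Y$ such that Eqs. (20.3)-(20.7) hold. Using the fact that measurable functions are closed under pointwise limits and the dominated convergence theorem (the dominating function always being a constant), one easily shows that $\mathcal{H}$ is closed under bounded convergence. Since we have just verified that $1_E \in \mathcal{H}$ for all $E$ in the $\pi$-class $\mathcal{E}$, it follows by Corollary 18.54 that $\mathcal{H}$ is the space of all bounded $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{R}})$-measurable functions on $X \times Y$. Finally if $f : X \times Y \to [0,\infty]$ is a $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\bar{\mathbb{R}}})$-measurable function, let $f_M = M \wedge f$ so that $f_M \uparrow f$ as $M \to \infty$ and Eqs. (20.3)-(20.7) hold with $f$ replaced by $f_M$ for all $M \in \mathbb{N}$. Repeated use of the monotone convergence theorem allows us to pass to the limit $M \to \infty$ in these equations to deduce the theorem in the case $\mu$ and $\nu$ are finite measures.
For the $\sigma$-finite case, choose $X_n \in \mathcal{M}$, $Y_n \in \mathcal{N}$ such that $X_n \uparrow X$, $Y_n \uparrow Y$, $\mu(X_m) < \infty$ and $\nu(Y_n) < \infty$ for all $m, n \in \mathbb{N}$. Then define $\mu_m(A) = \mu(X_m \cap A)$ and $\nu_n(B) = \nu(Y_n \cap B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$, or equivalently $d\mu_m = 1_{X_m} d\mu$ and $d\nu_n = 1_{Y_n} d\nu$. By what we have just proved, Eqs. (20.3)-(20.7) hold with $\mu$ replaced by $\mu_m$ and $\nu$ by $\nu_n$ for all $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\bar{\mathbb{R}}})$-measurable functions $f : X \times Y \to [0,\infty]$. The validity of Eqs. (20.3)-(20.7) then follows by passing to the limits $m \to \infty$ and then $n \to \infty$, making use of the monotone convergence theorem in the form,
\[ \int_X u \, d\mu_m = \int_X u 1_{X_m} \, d\mu \uparrow \int_X u \, d\mu \text{ as } m \to \infty \]
and
\[ \int_Y v \, d\nu_n = \int_Y v 1_{Y_n} \, d\nu \uparrow \int_Y v \, d\nu \text{ as } n \to \infty \]
for all $u \in L^+(X, \mathcal{M})$ and $v \in L^+(Y, \mathcal{N})$.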
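Corollary 20.6. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces. Then there exists a unique measure $\pi$ on $\mathcal{M} \otimes \mathcal{N}$ such that $\pi(A \times B) = \mu(A) \nu(B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$. Moreover $\pi$ is given by
\[ \pi(E) = \int_X d\mu(x) \int_Y d\nu(y) 1_E(x,y) = \int_Y d\nu(y) \int_X d\mu(x) 1_E(x,y) \tag{20.9} \]
for all $E \in \mathcal{M} \otimes \mathcal{N}$ and $\pi$ is $\sigma$-finite.
Proof. Notice that any measure $\pi$ such that $\pi(A \times B) = \mu(A) \nu(B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$ is necessarily $\sigma$-finite. Indeed, let $X_n \in \mathcal{M}$ and $Y_n \in \mathcal{N}$ be chosen so that $\mu(X_n) < \infty$, $\nu(Y_n) < \infty$, $X_n \uparrow X$ and $Y_n \uparrow Y$, then $X_n \times Y_n \in \mathcal{M} \otimes \mathcal{N}$, $X_n \times Y_n \uparrow X \times Y$ and $\pi(X_n \times Y_n) < \infty$ for all $n$. The uniqueness assertion is a consequence of Theorem 19.55 or see Theorem 33.6 below with $\mathcal{E} = \mathcal{M} \times \mathcal{N}$. For the existence, it suffices to observe, using the monotone convergence theorem, that $\pi$ defined in Eq. (20.9) is a measure on $\mathcal{M} \otimes \mathcal{N}$. Moreover this measure satisfies $\pi(A \times B) = \mu(A) \nu(B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$ from Eq. (20.8).
Notation 20.7 The measure $\pi$ is called the product measure of $\mu$ and $\nu$ and will be denoted by $\mu \otimes \nu$.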
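Theorem 20.8 (Tonelli's Theorem). Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces and $\pi = \mu \otimes \nu$ is the product measure on $\mathcal{M} \otimes \mathcal{N}$. If $f \in L^+(X \times Y, \mathcal{M} \otimes \mathcal{N})$, then $f(\cdot, y) \in L^+(X, \mathcal{M})$ for all $y \in Y$, $f(x, \cdot) \in L^+(Y, \mathcal{N})$ for all $x \in X$,
\[ \int_Y f(\cdot, y) \, d\nu(y) \in L^+(X, \mathcal{M}), \quad \int_X f(x, \cdot) \, d\mu(x) \in L^+(Y, \mathcal{N}) \]
and
\[ \int_{X \times Y} f \, d\pi = \int_X d\mu(x) \int_Y d\nu(y) f(x,y) \tag{20.10} \]
\[ = \int_Y d\nu(y) \int_X d\mu(x) f(x,y). \tag{20.11} \]
Proof. By Theorem 20.5 and Corollary 20.6, the theorem holds when $f = 1_E$ with $E \in \mathcal{M} \otimes \mathcal{N}$. Using the linearity of all of the statements, the theorem is also true for non-negative simple functions. Then using the monotone convergence theorem repeatedly along with the approximation Theorem 18.42, one deduces the theorem for general $f \in L^+(X \times Y, \mathcal{M} \otimes \mathcal{N})$.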
The following convention will be in force for the rest of this chapter.
Convention: If $(X, \mathcal{M}, \mu)$ is a measure space and $f : X \to \mathbb{C}$ is a measurable but non-integrable function, i.e. $\int_X |f| \, d\mu = \infty$, by convention we will define $\int_X f \, d\mu := 0$. However if $f$ is a non-negative function (i.e. $f : X \to [0,\infty]$) which is non-integrable, we will still write $\int_X f \, d\mu = \infty$.
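Theorem 20.9 (Fubini's Theorem). Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces, $\pi = \mu \otimes \nu$ is the product measure on $\mathcal{M} \otimes \mathcal{N}$ and $f : X \times Y \to \mathbb{C}$ is a $\mathcal{M} \otimes \mathcal{N}$-measurable function. Then the following three conditions are equivalent:
\[ \int_{X \times Y} |f| \, d\pi < \infty, \text{ i.e. } f \in L^1(\pi), \tag{20.12} \]
\[ \int_X \left( \int_Y |f(x,y)| \, d\nu(y) \right) d\mu(x) < \infty \text{ and} \tag{20.13} \]
\[ \int_Y \left( \int_X |f(x,y)| \, d\mu(x) \right) d\nu(y) < \infty. \tag{20.14} \]
If any one (and hence all) of these conditions hold, then $f(x, \cdot) \in L^1(\nu)$ for $\mu$-a.e. $x$, $f(\cdot, y) \in L^1(\mu)$ for $\nu$-a.e. $y$, $\int_Y f(\cdot, y) \, d\nu(y) \in L^1(\mu)$, $\int_X f(x, \cdot) \, d\mu(x) \in L^1(\nu)$ and Eqs. (20.10) and (20.11) are still valid.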
Proof. The equivalence of Eqs. (20.12)-(20.14) is a direct consequence of Tonelli's Theorem 20.8. Now suppose $f \in L^1(\pi)$ is a real valued function and let
\[ E := \left\{ x \in X : \int_Y |f(x,y)| \, d\nu(y) = \infty \right\}. \tag{20.15} \]
Then by Tonelli's theorem, $x \to \int_Y |f(x,y)| \, d\nu(y)$ is measurable and hence $E \in \mathcal{M}$. Moreover Tonelli's theorem implies
\[ \int_X \left[ \int_Y |f(x,y)| \, d\nu(y) \right] d\mu(x) = \int_{X \times Y} |f| \, d\pi < \infty \]
which implies that $\mu(E) = 0$. Let $f_\pm$ be the positive and negative parts of $f$, then using the above convention we have
\[ \begin{aligned} \int_Y f(x,y) \, d\nu(y) &= \int_Y 1_{E^c}(x) f(x,y) \, d\nu(y) \\ &= \int_Y 1_{E^c}(x) \left[ f_+(x,y) - f_-(x,y) \right] d\nu(y) \\ &= \int_Y 1_{E^c}(x) f_+(x,y) \, d\nu(y) - \int_Y 1_{E^c}(x) f_-(x,y) \, d\nu(y). \end{aligned} \tag{20.16} \]
Noting that $1_{E^c}(x) f_\pm(x,y) = (1_{E^c} \otimes 1_Y \cdot f_\pm)(x,y)$ is a positive $\mathcal{M} \otimes \mathcal{N}$-measurable function, it follows from another application of Tonelli's theorem that $x \to \int_Y f(x,y) \, d\nu(y)$ is $\mathcal{M}$-measurable, being the difference of two measurable functions. Moreover
\[ \int_X \left| \int_Y f(x,y) \, d\nu(y) \right| d\mu(x) \le \int_X \left[ \int_Y |f(x,y)| \, d\nu(y) \right] d\mu(x) < \infty, \]
which shows $\int_Y f(\cdot, y) \, d\nu(y) \in L^1(\mu)$. Integrating Eq. (20.16) on $x$ and using Tonelli's theorem repeatedly implies,
\[ \begin{aligned} \int_X \left[ \int_Y f(x,y) \, d\nu(y) \right] d\mu(x) &= \int_X d\mu(x) \int_Y d\nu(y) 1_{E^c}(x) f_+(x,y) - \int_X d\mu(x) \int_Y d\nu(y) 1_{E^c}(x) f_-(x,y) \\ &= \int_Y d\nu(y) \int_X d\mu(x) 1_{E^c}(x) f_+(x,y) - \int_Y d\nu(y) \int_X d\mu(x) 1_{E^c}(x) f_-(x,y) \\ &= \int_Y d\nu(y) \int_X d\mu(x) f_+(x,y) - \int_Y d\nu(y) \int_X d\mu(x) f_-(x,y) \\ &= \int_{X \times Y} f_+ \, d\pi - \int_{X \times Y} f_- \, d\pi = \int_{X \times Y} (f_+ - f_-) \, d\pi = \int_{X \times Y} f \, d\pi \end{aligned} \tag{20.17} \]
which proves Eq. (20.10) holds.
Now suppose that $f = u + iv$ is complex valued and again let $E$ be as in Eq. (20.15). Just as above we still have $E \in \mathcal{M}$ and $\mu(E) = 0$. By our convention,
\[ \begin{aligned} \int_Y f(x,y) \, d\nu(y) &= \int_Y 1_{E^c}(x) f(x,y) \, d\nu(y) = \int_Y 1_{E^c}(x) \left[ u(x,y) + i v(x,y) \right] d\nu(y) \\ &= \int_Y 1_{E^c}(x) u(x,y) \, d\nu(y) + i \int_Y 1_{E^c}(x) v(x,y) \, d\nu(y) \end{aligned} \]
which is measurable in $x$ by what we have just proved. Similarly one shows $\int_Y f(\cdot, y) \, d\nu(y) \in L^1(\mu)$ and Eq. (20.10) still holds by a computation similar to that done in Eq. (20.17). The assertions pertaining to Eq. (20.11) may be proved in the same way.
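Notation 20.10 Given $E \subset X \times Y$ and $x \in X$, let
\[ {}_x E := \{ y \in Y : (x,y) \in E \}. \]
Similarly if $y \in Y$ is given let
\[ E_y := \{ x \in X : (x,y) \in E \}. \]
If $f : X \times Y \to \mathbb{C}$ is a function let $f_x = f(x, \cdot)$ and $f^y := f(\cdot, y)$ so that $f_x : Y \to \mathbb{C}$ and $f^y : X \to \mathbb{C}$.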
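Theorem 20.11. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are complete $\sigma$-finite measure spaces. Let $(X \times Y, \mathcal{L}, \lambda)$ be the completion of $(X \times Y, \mathcal{M} \otimes \mathcal{N}, \mu \otimes \nu)$. If $f$ is $\mathcal{L}$-measurable and (a) $f \ge 0$ or (b) $f \in L^1(\lambda)$ then $f_x$ is $\mathcal{N}$-measurable for $\mu$-a.e. $x$ and $f^y$ is $\mathcal{M}$-measurable for $\nu$-a.e. $y$ and in case (b) $f_x \in L^1(\nu)$ and $f^y \in L^1(\mu)$ for $\mu$-a.e. $x$ and $\nu$-a.e. $y$ respectively. Moreover,
\[ \left( x \to \int_Y f_x \, d\nu \right) \in L^1(\mu) \text{ and } \left( y \to \int_X f^y \, d\mu \right) \in L^1(\nu) \]
and
\[ \int_{X \times Y} f \, d\lambda = \int_Y d\nu \int_X d\mu \, f = \int_X d\mu \int_Y d\nu \, f. \]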
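Proof. If $E \in \mathcal{M} \otimes \mathcal{N}$ is a $\mu \otimes \nu$ null set (i.e. $(\mu \otimes \nu)(E) = 0$), then
\[ 0 = (\mu \otimes \nu)(E) = \int_X \nu({}_x E) \, d\mu(x) = \int_Y \mu(E_y) \, d\nu(y). \]
This shows that
\[ \mu(\{ x : \nu({}_x E) \ne 0 \}) = 0 \text{ and } \nu(\{ y : \mu(E_y) \ne 0 \}) = 0, \]
i.e. $\nu({}_x E) = 0$ for $\mu$-a.e. $x$ and $\mu(E_y) = 0$ for $\nu$-a.e. $y$. If $h$ is $\mathcal{L}$-measurable and $h = 0$ for $\lambda$-a.e., then there exists $E \in \mathcal{M} \otimes \mathcal{N}$ such that $\{ (x,y) : h(x,y) \ne 0 \} \subset E$ and $(\mu \otimes \nu)(E) = 0$. Therefore $|h(x,y)| \le 1_E(x,y)$ and $(\mu \otimes \nu)(E) = 0$. Since
\[ \{ h_x \ne 0 \} = \{ y \in Y : h(x,y) \ne 0 \} \subset {}_x E \text{ and } \{ h^y \ne 0 \} = \{ x \in X : h(x,y) \ne 0 \} \subset E_y \]
we learn, using the completeness of $\nu$ and $\mu$, that for $\mu$-a.e. $x$ and $\nu$-a.e. $y$, $\{ h_x \ne 0 \} \in \mathcal{N}$, $\{ h^y \ne 0 \} \in \mathcal{M}$, $\nu(\{ h_x \ne 0 \}) = 0$ and $\mu(\{ h^y \ne 0 \}) = 0$. This implies $\int_Y h(x,y) \, d\nu(y)$ exists and equals $0$ for $\mu$-a.e. $x$ and similarly that $\int_X h(x,y) \, d\mu(x)$ exists and equals $0$ for $\nu$-a.e. $y$. Therefore
\[ 0 = \int_{X \times Y} h \, d\lambda = \int_Y \left( \int_X h \, d\mu \right) d\nu = \int_X \left( \int_Y h \, d\nu \right) d\mu. \]
For general $f \in L^1(\lambda)$, we may choose $g \in L^1(\mathcal{M} \otimes \mathcal{N}, \mu \otimes \nu)$ such that $f(x,y) = g(x,y)$ for $\lambda$-a.e. $(x,y)$. Define $h := f - g$. Then $h = 0$, $\lambda$-a.e. Hence by what we have just proved and Theorem 20.8, $f = g + h$ has the following properties:
1. For $\mu$-a.e. $x$, $y \to f(x,y) = g(x,y) + h(x,y)$ is in $L^1(\nu)$ and
\[ \int_Y f(x,y) \, d\nu(y) = \int_Y g(x,y) \, d\nu(y). \]
2. For $\nu$-a.e. $y$, $x \to f(x,y) = g(x,y) + h(x,y)$ is in $L^1(\mu)$ and
\[ \int_X f(x,y) \, d\mu(x) = \int_X g(x,y) \, d\mu(x). \]
From these assertions and Theorem 20.8, it follows that
\[ \begin{aligned} \int_X d\mu(x) \int_Y d\nu(y) f(x,y) &= \int_X d\mu(x) \int_Y d\nu(y) g(x,y) \\ &= \int_Y d\nu(y) \int_X d\mu(x) g(x,y) \\ &= \int_{X \times Y} g(x,y) \, d(\mu \otimes \nu)(x,y) \\ &= \int_{X \times Y} f(x,y) \, d\lambda(x,y). \end{aligned} \]
Similarly it is shown that
\[ \int_Y d\nu(y) \int_X d\mu(x) f(x,y) = \int_{X \times Y} f(x,y) \, d\lambda(x,y). \]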
The previous theorems have obvious generalizations to products of any finite number of $\sigma$-finite measure spaces. For example the following theorem holds.
Theorem 20.12. Suppose $\{ (X_i, \mathcal{M}_i, \mu_i) \}_{i=1}^n$ are $\sigma$-finite measure spaces and $X := X_1 \times \cdots \times X_n$. Then there exists a unique measure, $\pi$, on $(X, \mathcal{M}_1 \otimes \cdots \otimes \mathcal{M}_n)$ such that
\[ \pi(A_1 \times \cdots \times A_n) = \mu_1(A_1) \cdots \mu_n(A_n) \text{ for all } A_i \in \mathcal{M}_i. \]
(This measure and its completion will be denoted by $\mu_1 \otimes \cdots \otimes \mu_n$.) If $f : X \to [0,\infty]$ is a $\mathcal{M}_1 \otimes \cdots \otimes \mathcal{M}_n$-measurable function then
\[ \int_X f \, d\pi = \int_{X_{\sigma(1)}} d\mu_{\sigma(1)}(x_{\sigma(1)}) \cdots \int_{X_{\sigma(n)}} d\mu_{\sigma(n)}(x_{\sigma(n)}) \, f(x_1, \dots, x_n) \tag{20.18} \]
where $\sigma$ is any permutation of $\{1, 2, \dots, n\}$. This equation also holds for any $f \in L^1(\pi)$ and moreover, $f \in L^1(\pi)$ iff
\[ \int_{X_{\sigma(1)}} d\mu_{\sigma(1)}(x_{\sigma(1)}) \cdots \int_{X_{\sigma(n)}} d\mu_{\sigma(n)}(x_{\sigma(n)}) \, |f(x_1, \dots, x_n)| < \infty \]
for some (and hence all) permutations, $\sigma$.
This theorem can be proved by the same methods as in the two factor case, see Exercise 20.5. Alternatively, one can use the theorems already proved and induction on $n$, see Exercise 20.6 in this regard.
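Example 20.13. In this example we will show
\[ \lim_{M \to \infty} \int_0^M \frac{\sin x}{x} \, dx = \pi/2. \tag{20.19} \]
To see this write $\frac{1}{x} = \int_0^\infty e^{-tx} \, dt$ and use Fubini-Tonelli to conclude that
\[ \begin{aligned} \int_0^M \frac{\sin x}{x} \, dx &= \int_0^M \left[ \int_0^\infty e^{-tx} \sin x \, dt \right] dx \\ &= \int_0^\infty \left[ \int_0^M e^{-tx} \sin x \, dx \right] dt \\ &= \int_0^\infty \frac{1}{1 + t^2} \left( 1 - t e^{-Mt} \sin M - e^{-Mt} \cos M \right) dt \\ &\to \int_0^\infty \frac{1}{1 + t^2} \, dt = \frac{\pi}{2} \text{ as } M \to \infty, \end{aligned} \]
wherein we have used the dominated convergence theorem to pass to the limit.
The next example is a refinement of this result.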
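Example 20.14. We have
\[ \int_0^\infty \frac{\sin x}{x} e^{-\Lambda x} \, dx = \frac{1}{2}\pi - \arctan \Lambda \text{ for all } \Lambda > 0 \tag{20.20} \]
and for $\Lambda, M \in [0,\infty)$,
\[ \left| \int_0^M \frac{\sin x}{x} e^{-\Lambda x} \, dx - \frac{1}{2}\pi + \arctan \Lambda \right| \le C \frac{e^{-M\Lambda}}{M} \tag{20.21} \]
where $C = \max_{x \ge 0} \frac{1 + x}{1 + x^2} = \frac{1}{2\sqrt{2} - 2} \cong 1.2$. In particular Eq. (20.19) is valid.
To verify these assertions, first notice that by the fundamental theorem of calculus,
\[ |\sin x| = \left| \int_0^x \cos y \, dy \right| \le \left| \int_0^x |\cos y| \, dy \right| \le \left| \int_0^x 1 \, dy \right| = |x| \]
so $\left| \frac{\sin x}{x} \right| \le 1$ for all $x \ne 0$. Making use of the identity
\[ \int_0^\infty e^{-tx} \, dt = 1/x \]
and Fubini's theorem,
\[ \begin{aligned} \int_0^M \frac{\sin x}{x} e^{-\Lambda x} \, dx &= \int_0^M dx \, \sin x \, e^{-\Lambda x} \int_0^\infty e^{-tx} \, dt \\ &= \int_0^\infty dt \int_0^M dx \, \sin x \, e^{-(\Lambda + t)x} \\ &= \int_0^\infty \frac{1 - (\cos M + (\Lambda + t) \sin M) e^{-M(\Lambda + t)}}{(\Lambda + t)^2 + 1} \, dt \\ &= \int_0^\infty \frac{1}{(\Lambda + t)^2 + 1} \, dt - \int_0^\infty \frac{\cos M + (\Lambda + t) \sin M}{(\Lambda + t)^2 + 1} e^{-M(\Lambda + t)} \, dt \\ &= \frac{1}{2}\pi - \arctan \Lambda - \varepsilon(M, \Lambda) \end{aligned} \tag{20.22} \]
where
\[ \varepsilon(M, \Lambda) = \int_0^\infty \frac{\cos M + (\Lambda + t) \sin M}{(\Lambda + t)^2 + 1} e^{-M(\Lambda + t)} \, dt. \]
Since
\[ \left| \frac{\cos M + (\Lambda + t) \sin M}{(\Lambda + t)^2 + 1} \right| \le \frac{1 + (\Lambda + t)}{(\Lambda + t)^2 + 1} \le C, \]
\[ |\varepsilon(M, \Lambda)| \le C \int_0^\infty e^{-M(\Lambda + t)} \, dt = C \frac{e^{-M\Lambda}}{M}. \]
This estimate along with Eq. (20.22) proves Eq. (20.21) from which Eq. (20.19) follows by taking $\Lambda \to 0$ and Eq. (20.20) follows (using the dominated convergence theorem again) by letting $M \to \infty$.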
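The error bound in Eq. (20.21) lends itself to a quick numerical sanity check. The sketch below uses a hand-rolled composite Simpson rule; the names `simpson`, `integrand` and `truncated`, the grid size, and the sample values of $\Lambda$ and $M$ are illustrative assumptions, not part of the text.

```python
import math

def simpson(g, a, b, n=20_000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    total += 4 * sum(g(a + k * h) for k in range(1, n, 2))
    total += 2 * sum(g(a + k * h) for k in range(2, n, 2))
    return total * h / 3

def integrand(x, lam):
    # (sin x / x) e^{-lam x}, extended continuously by the value 1 at x = 0
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return sinc * math.exp(-lam * x)

def truncated(M, lam):
    """Approximates the integral over [0, M] appearing in Eq. (20.21)."""
    return simpson(lambda x: integrand(x, lam), 0.0, M)

C = 1.0 / (2.0 * math.sqrt(2.0) - 2.0)   # the constant C of Example 20.14
lam, M = 1.0, 5.0
target = math.pi / 2 - math.atan(lam)     # right side of Eq. (20.20)
error = abs(truncated(M, lam) - target)
bound = C * math.exp(-M * lam) / M        # right side of Eq. (20.21)
far_error = abs(truncated(40.0, lam) - target)

print(error, bound, far_error)
```

For these sample values the observed error stays below the bound, and pushing the cutoff out to $M = 40$ reproduces the limit in Eq. (20.20) to within the quadrature accuracy.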
20.2 Lebesgue Measure on $\mathbb{R}^d$ and the Change of Variables Theorem
Notation 20.15 Let
\[ m^d := \overbrace{m \otimes \cdots \otimes m}^{d \text{ times}} \text{ on } \mathcal{B}_{\mathbb{R}^d} = \overbrace{\mathcal{B}_{\mathbb{R}} \otimes \cdots \otimes \mathcal{B}_{\mathbb{R}}}^{d \text{ times}} \]
be the $d$-fold product of Lebesgue measure $m$ on $\mathcal{B}_{\mathbb{R}}$. We will also use $m^d$ to denote its completion and let $\mathcal{L}_d$ be the completion of $\mathcal{B}_{\mathbb{R}^d}$ relative to $m^d$. A subset $A \in \mathcal{L}_d$ is called a Lebesgue measurable set and $m^d$ is called $d$-dimensional Lebesgue measure, or just Lebesgue measure for short.
Definition 20.16. A function $f : \mathbb{R}^d \to \mathbb{R}$ is Lebesgue measurable if $f^{-1}(\mathcal{B}_{\mathbb{R}}) \subset \mathcal{L}_d$.
Notation 20.17 I will often be sloppy in the sequel and write $m$ for $m^d$ and $dx$ for $dm(x) = dm^d(x)$, i.e.
\[ \int_{\mathbb{R}^d} f(x) \, dx = \int_{\mathbb{R}^d} f \, dm = \int_{\mathbb{R}^d} f \, dm^d. \]
Hopefully the reader will understand the meaning from the context.
Theorem 20.18. Lebesgue measure $m^d$ is translation invariant. Moreover $m^d$ is the unique translation invariant measure on $\mathcal{B}_{\mathbb{R}^d}$ such that $m^d((0,1]^d) = 1$.
Proof. Let $A = J_1 \times \cdots \times J_d$ with $J_i \in \mathcal{B}_{\mathbb{R}}$ and $x \in \mathbb{R}^d$. Then
\[ x + A = (x_1 + J_1) \times (x_2 + J_2) \times \cdots \times (x_d + J_d) \]
and therefore by translation invariance of $m$ on $\mathcal{B}_{\mathbb{R}}$ we find that
\[ m^d(x + A) = m(x_1 + J_1) \cdots m(x_d + J_d) = m(J_1) \cdots m(J_d) = m^d(A) \]
and hence $m^d(x + A) = m^d(A)$ for all $A \in \mathcal{B}_{\mathbb{R}^d}$ by Corollary 19.57. From this fact we see that the measures $m^d(x + \cdot)$ and $m^d(\cdot)$ have the same null sets. Using this it is easily seen that $m(x + A) = m(A)$ for all $A \in \mathcal{L}_d$. The proof of the second assertion is Exercise 20.7.
Exercise 20.1. In this problem you are asked to show there is no reasonable notion of Lebesgue measure on an infinite dimensional Hilbert space. To be more precise, suppose $H$ is an infinite dimensional Hilbert space and $m$ is a countably additive measure on $\mathcal{B}_H$ which is invariant under translations and satisfies $m(B_0(\varepsilon)) > 0$ for all $\varepsilon > 0$. Show $m(V) = \infty$ for all non-empty open subsets $V \subset H$.
Theorem 20.19 (Change of Variables Theorem). Let $\Omega \subset_o \mathbb{R}^d$ be an open set and $T : \Omega \to T(\Omega) \subset_o \mathbb{R}^d$ be a $C^1$-diffeomorphism,$^1$ see Figure 20.1. Then for any Borel measurable function, $f : T(\Omega) \to [0,\infty]$,
\[ \int_\Omega f(T(x)) \, |\det T'(x)| \, dx = \int_{T(\Omega)} f(y) \, dy, \tag{20.23} \]
where $T'(x)$ is the linear transformation on $\mathbb{R}^d$ defined by $T'(x) v := \frac{d}{dt}\big|_0 T(x + tv)$. More explicitly, viewing vectors in $\mathbb{R}^d$ as columns, $T'(x)$ may be represented by the matrix
\[ T'(x) = \begin{bmatrix} \partial_1 T_1(x) & \dots & \partial_d T_1(x) \\ \vdots & \ddots & \vdots \\ \partial_1 T_d(x) & \dots & \partial_d T_d(x) \end{bmatrix}, \tag{20.24} \]
i.e. the $i$-$j$ matrix entry of $T'(x)$ is given by $T'(x)_{ij} = \partial_j T_i(x)$ where $T(x) = (T_1(x), \dots, T_d(x))^{\mathrm{tr}}$ and $\partial_i = \partial / \partial x_i$.
$^1$ That is, $T : \Omega \to T(\Omega) \subset_o \mathbb{R}^d$ is a continuously differentiable bijection and the inverse map $T^{-1} : T(\Omega) \to \Omega$ is also continuously differentiable.
Fig. 20.1. The geometric setup of Theorem 20.19.
Remark 20.20. Theorem 20.19 is best remembered as the statement: if we make the change of variables $y = T(x)$, then $dy = |\det T'(x)| \, dx$. As usual, you must also change the limits of integration appropriately, i.e. if $x$ ranges through $\Omega$ then $y$ must range through $T(\Omega)$.
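Proof. The proof will be by induction on $d$. The case $d = 1$ was essentially done in Exercise 19.8. Nevertheless, for the sake of completeness let us give a proof here. Suppose $d = 1$ and $a < \alpha < \beta < b$ are such that $[a,b]$ is a compact subinterval of $\Omega$. Then $|\det T'| = |T'|$ and
\[ \int_{[a,b]} 1_{T((\alpha,\beta])}(T(x)) \, |T'(x)| \, dx = \int_{[a,b]} 1_{(\alpha,\beta]}(x) \, |T'(x)| \, dx = \int_\alpha^\beta |T'(x)| \, dx. \]
If $T'(x) > 0$ on $[a,b]$, then
\[ \int_\alpha^\beta |T'(x)| \, dx = \int_\alpha^\beta T'(x) \, dx = T(\beta) - T(\alpha) = m(T((\alpha,\beta])) = \int_{T([a,b])} 1_{T((\alpha,\beta])}(y) \, dy \]
while if $T'(x) < 0$ on $[a,b]$, then
\[ \int_\alpha^\beta |T'(x)| \, dx = -\int_\alpha^\beta T'(x) \, dx = T(\alpha) - T(\beta) = m(T((\alpha,\beta])) = \int_{T([a,b])} 1_{T((\alpha,\beta])}(y) \, dy. \]
Combining the previous three equations shows
\[ \int_{[a,b]} f(T(x)) \, |T'(x)| \, dx = \int_{T([a,b])} f(y) \, dy \tag{20.25} \]
whenever $f$ is of the form $f = 1_{T((\alpha,\beta])}$ with $a < \alpha < \beta < b$. An application of Dynkin's multiplicative system Theorem 18.51 then implies that Eq. (20.25) holds for every bounded measurable function $f : T([a,b]) \to \mathbb{R}$. (Observe that $|T'(x)|$ is continuous and hence bounded for $x$ in the compact interval $[a,b]$.) From Exercise 13.14, $\Omega = \coprod_{n=1}^N (a_n, b_n)$ where $a_n, b_n \in \mathbb{R} \cup \{\pm\infty\}$ for $n = 1, 2, \dots < N$ with $N = \infty$ possible. Hence if $f : T(\Omega) \to \mathbb{R}_+$ is a Borel measurable function and $a_n < \alpha_k < \beta_k < b_n$ with $\alpha_k \downarrow a_n$ and $\beta_k \uparrow b_n$, then by what we have already proved and the monotone convergence theorem
\[ \begin{aligned} \int_\Omega 1_{(a_n,b_n)} \cdot (f \circ T) \cdot |T'| \, dm &= \int_\Omega \left( 1_{T((a_n,b_n))} \cdot f \right) \circ T \cdot |T'| \, dm \\ &= \lim_{k \to \infty} \int_\Omega \left( 1_{T([\alpha_k,\beta_k])} \cdot f \right) \circ T \cdot |T'| \, dm \\ &= \lim_{k \to \infty} \int_{T(\Omega)} 1_{T([\alpha_k,\beta_k])} \cdot f \, dm \\ &= \int_{T(\Omega)} 1_{T((a_n,b_n))} \cdot f \, dm. \end{aligned} \]
Summing this equality on $n$ then shows Eq. (20.23) holds.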
To carry out the induction step, we now suppose $d > 1$ and suppose the theorem is valid with $d$ being replaced by $d - 1$. For notational compactness, let us write vectors in $\mathbb{R}^d$ as row vectors rather than column vectors. Nevertheless, the matrix associated to the differential, $T'(x)$, will always be taken to be given as in Eq. (20.24).
Case 1. Suppose $T(x)$ has the form
\[ T(x) = (x_i, T_2(x), \dots, T_d(x)) \tag{20.26} \]
or
\[ T(x) = (T_1(x), \dots, T_{d-1}(x), x_i) \tag{20.27} \]
for some $i \in \{1, \dots, d\}$. For definiteness we will assume $T$ is as in Eq. (20.26); the case of $T$ in Eq. (20.27) may be handled similarly. For $t \in \mathbb{R}$, let $i_t : \mathbb{R}^{d-1} \to \mathbb{R}^d$ be the inclusion map defined by
\[ i_t(w) := w_t := (w_1, \dots, w_{i-1}, t, w_{i+1}, \dots, w_{d-1}), \]
$\Omega_t$ be the (possibly empty) open subset of $\mathbb{R}^{d-1}$ defined by
\[ \Omega_t := \left\{ w \in \mathbb{R}^{d-1} : (w_1, \dots, w_{i-1}, t, w_{i+1}, \dots, w_{d-1}) \in \Omega \right\} \]
and $T_t : \Omega_t \to \mathbb{R}^{d-1}$ be defined by
\[ T_t(w) = (T_2(w_t), \dots, T_d(w_t)), \]
see Figure 20.2. Expanding $\det T'(w_t)$ along the first row of the matrix $T'(w_t)$ shows
\[ |\det T'(w_t)| = |\det T_t'(w)|. \]
Fig. 20.2. In this picture $d = i = 3$ and $\Omega$ is an egg-shaped region with an egg-shaped hole. The picture indicates the geometry associated with the map $T$ and slicing the set $\Omega$ along planes where $x_3 = t$.
Now by the Fubini-Tonelli Theorem and the induction hypothesis,
\[ \begin{aligned} \int_\Omega f \circ T \, |\det T'| \, dm &= \int_{\mathbb{R}^d} 1_\Omega \cdot f \circ T \, |\det T'| \, dm \\ &= \int_{\mathbb{R}^d} 1_\Omega(w_t) \, (f \circ T)(w_t) \, |\det T'(w_t)| \, dw \, dt \\ &= \int_{\mathbb{R}} \left[ \int_{\Omega_t} (f \circ T)(w_t) \, |\det T'(w_t)| \, dw \right] dt \\ &= \int_{\mathbb{R}} \left[ \int_{\Omega_t} f(t, T_t(w)) \, |\det T_t'(w)| \, dw \right] dt \\ &= \int_{\mathbb{R}} \left[ \int_{T_t(\Omega_t)} f(t, z) \, dz \right] dt = \int_{\mathbb{R}} \left[ \int_{\mathbb{R}^{d-1}} 1_{T(\Omega)}(t, z) f(t, z) \, dz \right] dt \\ &= \int_{T(\Omega)} f(y) \, dy \end{aligned} \]
wherein the last two equalities we have used Fubini-Tonelli along with the identity
\[ T(\Omega) = \coprod_{t \in \mathbb{R}} T(i_t(\Omega_t)) = \coprod_{t \in \mathbb{R}} \{ (t, z) : z \in T_t(\Omega_t) \}. \]
Case 2. (Eq. (20.23) is true locally.) Suppose that $T : \Omega \to \mathbb{R}^d$ is a general map as in the statement of the theorem and $x_0 \in \Omega$ is an arbitrary point. We will now show there exists an open neighborhood $W \subset \Omega$ of $x_0$ such that
\[ \int_W f \circ T \, |\det T'| \, dm = \int_{T(W)} f \, dm \]
holds for all Borel measurable functions, $f : T(W) \to [0,\infty]$. Let $M_i$ be the $1$-$i$ minor of $T'(x_0)$, i.e. the determinant of $T'(x_0)$ with the first row and $i$-th column removed. Since
\[ 0 \ne \det T'(x_0) = \sum_{i=1}^d (-1)^{i+1} \partial_i T_1(x_0) \cdot M_i, \]
there must be some $i$ such that $M_i \ne 0$. Fix an $i$ such that $M_i \ne 0$ and let
\[ S(x) := (x_i, T_2(x), \dots, T_d(x)). \tag{20.28} \]
Observe that $|\det S'(x_0)| = |M_i| \ne 0$. Hence by the inverse function Theorem 12.25, there exists an open neighborhood $W$ of $x_0$ such that $W \subset_o \Omega$ and $S(W) \subset_o \mathbb{R}^d$ and $S : W \to S(W)$ is a $C^1$-diffeomorphism. Let $R : S(W) \to T(W) \subset_o \mathbb{R}^d$ be the $C^1$-diffeomorphism defined by
\[ R(z) := T \circ S^{-1}(z) \text{ for all } z \in S(W). \]
Because
\[ (T_1(x), \dots, T_d(x)) = T(x) = R(S(x)) = R((x_i, T_2(x), \dots, T_d(x))) \]
for all $x \in W$, if
\[ (z_1, z_2, \dots, z_d) = S(x) = (x_i, T_2(x), \dots, T_d(x)) \]
then
\[ R(z) = \left( T_1\left( S^{-1}(z) \right), z_2, \dots, z_d \right). \tag{20.29} \]
Observe that $S$ is a map of the form in Eq. (20.26), $R$ is a map of the form in Eq. (20.27), $T'(x) = R'(S(x)) S'(x)$ (by the chain rule) and (by the multiplicative property of the determinant)
\[ |\det T'(x)| = |\det R'(S(x))| \, |\det S'(x)| \quad \forall \, x \in W. \]
So if $f : T(W) \to [0,\infty]$ is a Borel measurable function, two applications of the results in Case 1 show,
\[ \begin{aligned} \int_W f \circ T \cdot |\det T'| \, dm &= \int_W (f \circ R \cdot |\det R'|) \circ S \cdot |\det S'| \, dm \\ &= \int_{S(W)} f \circ R \cdot |\det R'| \, dm = \int_{R(S(W))} f \, dm \\ &= \int_{T(W)} f \, dm \end{aligned} \]
and Case 2 is proved.
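Case 3. (General Case.) Let $f : T(\Omega) \to [0,\infty]$ be a general non-negative Borel measurable function and let
\[ K_n := \{ x \in \Omega : \operatorname{dist}(x, \Omega^c) \ge 1/n \text{ and } |x| \le n \}. \]
Then each $K_n$ is a compact subset of $\Omega$ and $K_n \uparrow \Omega$ as $n \to \infty$. Using the compactness of $K_n$ and Case 2, for each $n \in \mathbb{N}$ there is a finite open cover $\mathcal{W}_n$ of $K_n$ such that $W \subset \Omega$ and Eq. (20.23) holds with $\Omega$ replaced by $W$ for each $W \in \mathcal{W}_n$. Let $\{W_i\}_{i=1}^\infty$ be an enumeration of $\cup_{n=1}^\infty \mathcal{W}_n$ and set $\tilde{W}_1 = W_1$ and $\tilde{W}_i := W_i \setminus (W_1 \cup \cdots \cup W_{i-1})$ for all $i \ge 2$. Then $\Omega = \coprod_{i=1}^\infty \tilde{W}_i$ and by repeated use of Case 2,
\[ \begin{aligned} \int_\Omega f \circ T \, |\det T'| \, dm &= \sum_{i=1}^\infty \int_\Omega 1_{\tilde{W}_i} \cdot (f \circ T) \cdot |\det T'| \, dm \\ &= \sum_{i=1}^\infty \int_{W_i} \left[ \left( 1_{T(\tilde{W}_i)} f \right) \circ T \right] \cdot |\det T'| \, dm \\ &= \sum_{i=1}^\infty \int_{T(W_i)} 1_{T(\tilde{W}_i)} \cdot f \, dm = \sum_{i=1}^\infty \int_{T(\Omega)} 1_{T(\tilde{W}_i)} \cdot f \, dm \\ &= \int_{T(\Omega)} f \, dm. \end{aligned} \]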
Remark 20.21. When $d = 1$, one often learns the change of variables formula as
\[ \int_a^b f(T(x)) \, T'(x) \, dx = \int_{T(a)}^{T(b)} f(y) \, dy \tag{20.30} \]
where $f : [a,b] \to \mathbb{R}$ is a continuous function and $T$ is a $C^1$-function defined in a neighborhood of $[a,b]$. If $T' > 0$ on $(a,b)$ then $T((a,b)) = (T(a), T(b))$ and Eq. (20.30) implies Eq. (20.23) with $\Omega = (a,b)$. On the other hand if $T' < 0$ on $(a,b)$ then $T((a,b)) = (T(b), T(a))$ and Eq. (20.30) is equivalent to
\[ \int_{(a,b)} f(T(x)) \, (-|T'(x)|) \, dx = -\int_{T(b)}^{T(a)} f(y) \, dy = -\int_{T((a,b))} f(y) \, dy \]
which again implies Eq. (20.23). On the other hand Eq. (20.30) is more general than Eq. (20.23) since it does not require $T$ to be injective. The standard proof of Eq. (20.30) is as follows. For $z \in T([a,b])$, let
\[ F(z) := \int_{T(a)}^z f(y) \, dy. \]
Then by the chain rule and the fundamental theorem of calculus,
\[ \begin{aligned} \int_a^b f(T(x)) \, T'(x) \, dx &= \int_a^b F'(T(x)) \, T'(x) \, dx = \int_a^b \frac{d}{dx} \left[ F(T(x)) \right] dx \\ &= F(T(x)) \big|_a^b = \int_{T(a)}^{T(b)} f(y) \, dy. \end{aligned} \]
An application of Dynkin's multiplicative systems theorem (in the form of Corollary 18.55) now shows that Eq. (20.30) holds for all bounded measurable functions $f$ on $(a,b)$. Then by the usual truncation argument, it also holds for all positive measurable functions on $(a,b)$.
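As a concrete illustration of the one dimensional formula (20.23), the sketch below compares both sides numerically for one increasing map and one decreasing map. The maps $T(x) = x^3 + x$ on $(0,1)$ and $T(x) = 1/x$ on $(1,2)$, the integrands, and the helper `simpson` are all choices made here for illustration.

```python
import math

def simpson(g, a, b, n=2_000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    total += 4 * sum(g(a + k * h) for k in range(1, n, 2))
    total += 2 * sum(g(a + k * h) for k in range(2, n, 2))
    return total * h / 3

# Increasing case: T(x) = x^3 + x maps (0, 1) onto (0, 2), with T'(x) = 3x^2 + 1 > 0.
lhs_inc = simpson(lambda x: math.cos(x**3 + x) * abs(3 * x**2 + 1), 0.0, 1.0)
rhs_inc = simpson(math.cos, 0.0, 2.0)

# Decreasing case: T(x) = 1/x maps (1, 2) onto (1/2, 1), with T'(x) = -1/x^2 < 0.
# The absolute value |T'| is what makes the two sides agree here.
lhs_dec = simpson(lambda x: (1.0 / x) ** 2 * abs(-1.0 / x**2), 1.0, 2.0)
rhs_dec = simpson(lambda y: y**2, 0.5, 1.0)

print(lhs_inc, rhs_inc, lhs_dec, rhs_dec)
```

In the decreasing case both sides equal $7/24$; dropping the absolute value would flip the sign of the left side, which is exactly the orientation issue discussed in Remark 20.21.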
Example 20.22. Continuing the setup in Theorem 20.19, if $A \in \mathcal{B}_\Omega$, then
\[ \begin{aligned} m(T(A)) &= \int_{\mathbb{R}^d} 1_{T(A)}(y) \, dy = \int_{\mathbb{R}^d} 1_{T(A)}(Tx) \, |\det T'(x)| \, dx \\ &= \int_{\mathbb{R}^d} 1_A(x) \, |\det T'(x)| \, dx \end{aligned} \]
wherein the second equality we have made the change of variables, $y = T(x)$. Hence we have shown
\[ d(m \circ T) = |\det T'(\cdot)| \, dm. \]
In particular if $T \in GL(d, \mathbb{R}) = GL(\mathbb{R}^d)$, the space of $d \times d$ invertible matrices, then $m \circ T = |\det T| \, m$, i.e.
\[ m(T(A)) = |\det T| \, m(A) \text{ for all } A \in \mathcal{B}_{\mathbb{R}^d}. \tag{20.31} \]
This equation also shows that $m \circ T$ and $m$ have the same null sets and hence the equality in Eq. (20.31) is valid for any $A \in \mathcal{L}_d$.
Exercise 20.2. Show that $f \in L^1(T(\Omega), m^d)$ iff
\[ \int_\Omega |f \circ T| \, |\det T'| \, dm < \infty \]
and if $f \in L^1(T(\Omega), m^d)$, then Eq. (20.23) holds.
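Example 20.23 (Polar Coordinates). Suppose $T : (0,\infty) \times (0, 2\pi) \to \mathbb{R}^2$ is defined by
\[ x = T(r, \theta) = (r \cos\theta, r \sin\theta), \]
i.e. we are making the change of variable,
\[ x_1 = r \cos\theta \text{ and } x_2 = r \sin\theta \text{ for } 0 < r < \infty \text{ and } 0 < \theta < 2\pi. \]
In this case
\[ T'(r, \theta) = \begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix} \]
and therefore
\[ dx = |\det T'(r, \theta)| \, dr \, d\theta = r \, dr \, d\theta. \]
Observing that
\[ \mathbb{R}^2 \setminus T((0,\infty) \times (0, 2\pi)) = \ell := \{ (x, 0) : x \ge 0 \} \]
has $m^2$-measure zero, it follows from the change of variables Theorem 20.19 that
\[ \int_{\mathbb{R}^2} f(x) \, dx = \int_0^{2\pi} d\theta \int_0^\infty dr \, r \cdot f(r(\cos\theta, \sin\theta)) \tag{20.32} \]
for any Borel measurable function $f : \mathbb{R}^2 \to [0,\infty]$.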
Example 20.24 (Holomorphic Change of Variables). Suppose that $f : \Omega \subset_o \mathbb{C} \cong \mathbb{R}^2 \to \mathbb{C}$ is an injective holomorphic function such that $f'(z) \ne 0$ for all $z \in \Omega$. We may express $f$ as
\[ f(x + iy) = U(x,y) + i V(x,y) \]
for all $z = x + iy \in \Omega$. Hence if we make the change of variables,
\[ w = u + iv = f(x + iy) = U(x,y) + i V(x,y) \]
then
\[ du \, dv = \left| \det \begin{bmatrix} U_x & U_y \\ V_x & V_y \end{bmatrix} \right| dx \, dy = |U_x V_y - U_y V_x| \, dx \, dy. \]
Recalling that $U$ and $V$ satisfy the Cauchy Riemann equations, $U_x = V_y$ and $U_y = -V_x$ with $f' = U_x + i V_x$, we learn
\[ U_x V_y - U_y V_x = U_x^2 + V_x^2 = |f'|^2. \]
Therefore
\[ du \, dv = |f'(x + iy)|^2 \, dx \, dy. \]
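The identity $U_x V_y - U_y V_x = |f'|^2$ can be spot-checked with finite differences. In the sketch below the test functions $f(z) = z^2$ and $f(z) = e^z$, the sample point, and the step size `h` are illustrative choices; the helper `jacobian_det` is hypothetical and simply approximates the real Jacobian determinant of $f$ viewed as a map of $\mathbb{R}^2$.

```python
import cmath

def jacobian_det(f, x, y, h=1e-6):
    """Central-difference estimate of U_x V_y - U_y V_x for f = U + iV at x + iy."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)   # U_x + i V_x
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)   # U_y + i V_y
    return fx.real * fy.imag - fy.real * fx.imag

x, y = 0.7, -1.3
z = complex(x, y)

# f(z) = z^2 has f'(z) = 2z.
det_square = jacobian_det(lambda w: w * w, x, y)
expected_square = abs(2 * z) ** 2

# f(z) = e^z has f'(z) = e^z.
det_exp = jacobian_det(cmath.exp, x, y)
expected_exp = abs(cmath.exp(z)) ** 2

print(det_square, expected_square, det_exp, expected_exp)
```

In both cases the numerically estimated determinant matches $|f'(z)|^2$ up to the discretisation error of the difference quotients.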
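Example 20.25. In this example we will evaluate the integral
\[ I := \iint_\Omega \left( x^4 - y^4 \right) dx \, dy \]
where
\[ \Omega = \left\{ (x,y) : 1 < x^2 - y^2 < 2, \ 0 < xy < 1 \right\}, \]
see Figure 20.3. We are going to do this by making the change of variables,
Fig. 20.3. The region $\Omega$ consists of the two curved rectangular regions shown.
\[ (u, v) := T(x,y) = \left( x^2 - y^2, xy \right), \]
in which case
\[ du \, dv = \left| \det \begin{bmatrix} 2x & -2y \\ y & x \end{bmatrix} \right| dx \, dy = 2 \left( x^2 + y^2 \right) dx \, dy. \]
Notice that
\[ \left( x^4 - y^4 \right) dx \, dy = \left( x^2 - y^2 \right) \left( x^2 + y^2 \right) dx \, dy = u \left( x^2 + y^2 \right) dx \, dy = \frac{1}{2} u \, du \, dv. \]
The function $T$ is not injective on $\Omega$ but it is injective on each of its connected components. Let $D$ be the connected component in the first quadrant so that $\Omega = -D \cup D$ and $T(\pm D) = (1,2) \times (0,1)$. The change of variables theorem then implies
\[ I_\pm := \iint_{\pm D} \left( x^4 - y^4 \right) dx \, dy = \frac{1}{2} \iint_{(1,2) \times (0,1)} u \, du \, dv = \frac{1}{2} \frac{u^2}{2} \Big|_1^2 \cdot 1 = \frac{3}{4} \]
and therefore $I = I_+ + I_- = 2 \cdot (3/4) = 3/2$.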
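The value $I_+ = 3/4$ can be cross-checked without any change of variables by summing the integrand over a fine grid on the first quadrant component $D$. This is a crude sketch: the bounding box $[1, 1.6] \times [0, 1]$ (which contains $D$ because $x^2 < 1 + \sqrt{2}$ and $y < 1/x$ there), the grid size, and the function name are choices made here, and the result is only accurate to a couple of decimal places.

```python
def integral_over_D(n=800):
    """Midpoint Riemann sum of x^4 - y^4 over D = {1 < x^2 - y^2 < 2, 0 < xy < 1, x > 0}."""
    x_lo, x_hi, y_lo, y_hi = 1.0, 1.6, 0.0, 1.0   # a box containing D
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            # Only grid cells whose midpoint lies in D contribute.
            if 1.0 < x * x - y * y < 2.0 and 0.0 < x * y < 1.0:
                total += x**4 - y**4
    return total * hx * hy

I_plus = integral_over_D()
print(I_plus, 2 * I_plus)
```

The two printed values should be close to $3/4$ and $3/2$ respectively, in agreement with the computation in Example 20.25.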
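Exercise 20.3 (Spherical Coordinates). Let $T : (0,\infty) \times (0,\pi) \times (0, 2\pi) \to \mathbb{R}^3$ be defined by
\[ \begin{aligned} T(r, \varphi, \theta) &= (r \sin\varphi \cos\theta, \ r \sin\varphi \sin\theta, \ r \cos\varphi) \\ &= r (\sin\varphi \cos\theta, \ \sin\varphi \sin\theta, \ \cos\varphi), \end{aligned} \]
see Figure 20.4. By making the change of variables $x = T(r, \varphi, \theta)$, show
Fig. 20.4. The relation of $x$ to $(r, \varphi, \theta)$ in spherical coordinates.
\[ \int_{\mathbb{R}^3} f(x) \, dx = \int_0^\pi d\varphi \int_0^{2\pi} d\theta \int_0^\infty dr \, r^2 \sin\varphi \cdot f(T(r, \varphi, \theta)) \]
for any Borel measurable function, $f : \mathbb{R}^3 \to [0,\infty]$.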
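Lemma 20.26. Let $a > 0$ and
\[ I_d(a) := \int_{\mathbb{R}^d} e^{-a|x|^2} \, dm(x). \]
Then $I_d(a) = (\pi/a)^{d/2}$.
Proof. By Tonelli's theorem and induction,
\[ \begin{aligned} I_d(a) &= \int_{\mathbb{R}^{d-1} \times \mathbb{R}} e^{-a|y|^2} e^{-at^2} \, m_{d-1}(dy) \, dt \\ &= I_{d-1}(a) I_1(a) = I_1^d(a). \end{aligned} \tag{20.33} \]
So it suffices to compute:
\[ I_2(a) = \int_{\mathbb{R}^2} e^{-a|x|^2} \, dm(x) = \int_{\mathbb{R}^2 \setminus \{0\}} e^{-a(x_1^2 + x_2^2)} \, dx_1 \, dx_2. \]
Using polar coordinates, see Eq. (20.32), we find,
\[ \begin{aligned} I_2(a) &= \int_0^\infty dr \, r \int_0^{2\pi} d\theta \, e^{-ar^2} = 2\pi \int_0^\infty r e^{-ar^2} \, dr \\ &= 2\pi \lim_{M \to \infty} \int_0^M r e^{-ar^2} \, dr = 2\pi \lim_{M \to \infty} \frac{e^{-ar^2}}{-2a} \Big|_0^M = \frac{2\pi}{2a} = \pi/a. \end{aligned} \]
This shows that $I_2(a) = \pi/a$ and the result now follows from Eq. (20.33).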
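A numerical spot check of Lemma 20.26 for $d = 1$ and $d = 2$ is sketched below. The value $a = 2$, the truncation of the real line to $[-10, 10]$ and of the radial integral to $[0, 10]$, and the helper `simpson` are assumptions of this illustration only.

```python
import math

def simpson(g, a, b, n=4_000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    total += 4 * sum(g(a + k * h) for k in range(1, n, 2))
    total += 2 * sum(g(a + k * h) for k in range(2, n, 2))
    return total * h / 3

a = 2.0

# I_1(a): the tails beyond |t| = 10 are of order e^{-200} and are ignored.
I1 = simpson(lambda t: math.exp(-a * t * t), -10.0, 10.0)

# I_2(a) via the polar formula: 2 pi times the radial integral of r e^{-a r^2}.
I2_polar = 2 * math.pi * simpson(lambda r: r * math.exp(-a * r * r), 0.0, 10.0)

print(I1 ** 2, I2_polar, math.pi / a)
```

The product $I_1(a)^2$ and the polar computation of $I_2(a)$ both agree with $\pi/a$, which is the content of Eq. (20.33) for $d = 2$.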
20.3 The Polar Decomposition of Lebesgue Measure
Let
S
d1
= x 1
d
: [x[
2
:=
d
i=1
x
2
i
= 1
be the unit sphere in 1
d
equipped with its Borel algebra, B
S
d1 and
: 1
d
0 (0, ) S
d1
be dened by (x) := ([x[ , [x[
1
x). The inverse
map,
1
: (0, ) S
d1
1
d
0 , is given by
1
(r, ) = r. Since
and
1
are continuous, they are both Borel measurable. For E B
S
d1 and
a > 0, let
E
a
:= r : r (0, a] and E =
1
((0, a] E) B
R
d.
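Definition 20.27. For $E\in\mathcal{B}_{S^{d-1}}$, let $\sigma(E) := d\cdot m(E_1)$. We call $\sigma$ the surface measure on $S^{d-1}$.

It is easy to check that $\sigma$ is a measure. Indeed if $E\in\mathcal{B}_{S^{d-1}}$, then $E_1 = \Phi^{-1}((0,1]\times E)\in\mathcal{B}_{\mathbb{R}^d}$ so that $m(E_1)$ is well defined. Moreover if $E = \coprod_{i=1}^\infty E_i$ is a disjoint union, then $E_1 = \coprod_{i=1}^\infty (E_i)_1$ and
\[ \sigma(E) = d\cdot m(E_1) = \sum_{i=1}^\infty d\cdot m((E_i)_1) = \sum_{i=1}^\infty \sigma(E_i). \]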
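The intuition behind this definition is as follows. If $E\subset S^{d-1}$ is a set and $\varepsilon > 0$ is a small number, then the volume of
\[ (1,1+\varepsilon]\cdot E = \{r\omega : r\in(1,1+\varepsilon] \text{ and } \omega\in E\} \]
should be approximately given by $m((1,1+\varepsilon]\cdot E)\cong\sigma(E)\,\varepsilon$, see Figure 20.5 below. On the other hand
\[ m((1,1+\varepsilon]E) = m(E_{1+\varepsilon}\setminus E_1) = \left\{(1+\varepsilon)^d - 1\right\} m(E_1). \]
Therefore we expect the area of $E$ should be given by
\[ \sigma(E) = \lim_{\varepsilon\downarrow 0}\frac{\left\{(1+\varepsilon)^d - 1\right\} m(E_1)}{\varepsilon} = d\cdot m(E_1). \]

Fig. 20.5. Motivating the definition of surface measure for a sphere.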
The following theorem is motivated by Example 20.23 and Exercise 20.3.
Theorem 20.28 (Polar Coordinates). If $f:\mathbb{R}^d\to[0,\infty]$ is a $(\mathcal{B}_{\mathbb{R}^d},\mathcal{B})$-measurable function then
\[ \int_{\mathbb{R}^d} f(x)\,dm(x) = \int_{(0,\infty)\times S^{d-1}} f(r\omega)\,r^{d-1}\,dr\,d\sigma(\omega). \tag{20.34} \]
In particular if $f:\mathbb{R}_+\to\mathbb{R}_+$ is measurable then
\[ \int_{\mathbb{R}^d} f(|x|)\,dx = \int_0^\infty f(r)\,dV(r) \tag{20.35} \]
where $V(r) = m(B(0,r)) = r^d\,m(B(0,1)) = d^{-1}\sigma\left(S^{d-1}\right) r^d$.
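Proof. By Exercise 19.7,
\[ \int_{\mathbb{R}^d} f\,dm = \int_{\mathbb{R}^d\setminus\{0\}}\left(f\circ\Phi^{-1}\right)\circ\Phi\,dm = \int_{(0,\infty)\times S^{d-1}}\left(f\circ\Phi^{-1}\right) d(\Phi_* m) \tag{20.36} \]
and therefore to prove Eq. (20.34) we must work out the measure $\Phi_* m$ on $\mathcal{B}_{(0,\infty)}\otimes\mathcal{B}_{S^{d-1}}$ defined by
\[ \Phi_* m(A) := m\left(\Phi^{-1}(A)\right) \quad\forall\, A\in\mathcal{B}_{(0,\infty)}\otimes\mathcal{B}_{S^{d-1}}. \tag{20.37} \]
If $A = (a,b]\times E$ with $0 < a < b$ and $E\in\mathcal{B}_{S^{d-1}}$, then
\[ \Phi^{-1}(A) = \{r\omega : r\in(a,b] \text{ and } \omega\in E\} = bE_1\setminus aE_1 \]
wherein we have used $E_a = aE_1$ in the last equality. Therefore by the basic scaling properties of $m$ and the fundamental theorem of calculus,
\[ (\Phi_* m)((a,b]\times E) = m(bE_1\setminus aE_1) = m(bE_1) - m(aE_1) = b^d m(E_1) - a^d m(E_1) = d\cdot m(E_1)\int_a^b r^{d-1}\,dr. \tag{20.38} \]
Letting $d\rho(r) = r^{d-1}dr$, i.e.
\[ \rho(J) = \int_J r^{d-1}\,dr \quad\forall\, J\in\mathcal{B}_{(0,\infty)}, \tag{20.39} \]
Eq. (20.38) may be written as
\[ (\Phi_* m)((a,b]\times E) = \rho((a,b])\cdot\sigma(E) = (\rho\otimes\sigma)((a,b]\times E). \tag{20.40} \]
Since
\[ \mathcal{E} = \{(a,b]\times E : 0 < a < b \text{ and } E\in\mathcal{B}_{S^{d-1}}\} \]
is a $\pi$-class (in fact it is an elementary class) such that $\sigma(\mathcal{E}) = \mathcal{B}_{(0,\infty)}\otimes\mathcal{B}_{S^{d-1}}$, it follows from Theorem 19.55 and Eq. (20.40) that $\Phi_* m = \rho\otimes\sigma$. Using this result in Eq. (20.36) gives
\[ \int_{\mathbb{R}^d} f\,dm = \int_{(0,\infty)\times S^{d-1}}\left(f\circ\Phi^{-1}\right) d(\rho\otimes\sigma) \]
which combined with Tonelli's Theorem 20.8 proves Eq. (20.34).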
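Corollary 20.29. The surface area $\sigma(S^{d-1})$ of the unit sphere $S^{d-1}\subset\mathbb{R}^d$ is
\[ \sigma(S^{d-1}) = \frac{2\pi^{d/2}}{\Gamma(d/2)} \tag{20.41} \]
where $\Gamma$ is the gamma function given by
\[ \Gamma(x) := \int_0^\infty u^{x-1}e^{-u}\,du. \tag{20.42} \]
Moreover, $\Gamma(1/2) = \sqrt{\pi}$, $\Gamma(1) = 1$ and $\Gamma(x+1) = x\Gamma(x)$ for $x > 0$.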

Proof. Using Theorem 20.28 we find
\[ I_d(1) = \int_0^\infty dr\, r^{d-1}e^{-r^2}\int_{S^{d-1}} d\sigma = \sigma(S^{d-1})\int_0^\infty r^{d-1}e^{-r^2}\,dr. \]
We simplify this last integral by making the change of variables $u = r^2$ so that $r = u^{1/2}$ and $dr = \frac12 u^{-1/2}du$. The result is
\[ \int_0^\infty r^{d-1}e^{-r^2}\,dr = \int_0^\infty u^{\frac{d-1}{2}}e^{-u}\,\frac12 u^{-1/2}\,du = \frac12\int_0^\infty u^{\frac d2 - 1}e^{-u}\,du = \frac12\Gamma(d/2). \tag{20.43} \]
Combining the last two equations with Lemma 20.26, which states that $I_d(1) = \pi^{d/2}$, we conclude that
\[ \pi^{d/2} = I_d(1) = \frac12\sigma(S^{d-1})\Gamma(d/2) \]
which proves Eq. (20.41). Example 19.24 implies $\Gamma(1) = 1$ and from Eq. (20.43),
\[ \Gamma(1/2) = 2\int_0^\infty e^{-r^2}\,dr = \int_{-\infty}^\infty e^{-r^2}\,dr = I_1(1) = \sqrt{\pi}. \]
The relation, $\Gamma(x+1) = x\Gamma(x)$, is the consequence of the following integration by parts argument:
\[ \Gamma(x+1) = \int_0^\infty e^{-u}u^{x+1}\,\frac{du}{u} = \int_0^\infty u^x\left(-\frac{d}{du}e^{-u}\right)du = x\int_0^\infty u^{x-1}e^{-u}\,du = x\,\Gamma(x). \]
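As a quick illustration of Corollary 20.29, the following Python sketch (not from the original text; the function names are hypothetical) evaluates Eq. (20.41) with the standard library's `math.gamma` and recovers the volume of a ball through $V(r) = d^{-1}\sigma(S^{d-1})r^d$ from Theorem 20.28.

```python
import math


def sphere_area(d):
    """Surface measure of the unit sphere S^(d-1) in R^d, Eq. (20.41)."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)


def ball_volume(d, r=1.0):
    """V(r) = m(B(0, r)) = sigma(S^(d-1)) * r^d / d, as in Theorem 20.28."""
    return sphere_area(d) * r ** d / d


if __name__ == "__main__":
    for d in range(1, 6):
        print(d, sphere_area(d), ball_volume(d))
```

For $d = 2$ and $d = 3$ this reproduces the familiar values $2\pi$ and $4\pi$ for the circumference of the unit circle and the area of the unit sphere.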
20.4 More proofs of the classical Weierstrass
approximation Theorem 10.35
In each of these proofs we will use the reduction explained in the previous proof of Theorem 10.35 to reduce to the case where $f\in C([0,1]^d)$. The first proof we will give here is based on the "weak law of large numbers." The second will be another approximate $\delta$-function argument.
Proof of Theorem 10.35. Let $\mathbf{0} := (0,0,\dots,0)$, $\mathbf{1} := (1,1,\dots,1)$ and $[\mathbf{0},\mathbf{1}] := [0,1]^d$. By considering the real and imaginary parts of $f$ separately, it suffices to assume $f\in C([\mathbf{0},\mathbf{1}],\mathbb{R})$. For $x\in[0,1]$, let $\nu_x$ be the measure on $\{0,1\}$ such that $\nu_x(\{0\}) = 1-x$ and $\nu_x(\{1\}) = x$. Then
\[ \int_{\{0,1\}} y\,d\nu_x(y) = 0\cdot(1-x) + 1\cdot x = x \quad\text{and} \tag{20.44} \]
\[ \int_{\{0,1\}} (y-x)^2\,d\nu_x(y) = x^2(1-x) + (1-x)^2\cdot x = x(1-x). \tag{20.45} \]
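For $x\in[\mathbf{0},\mathbf{1}]$ let $\mu_x = \nu_{x_1}\otimes\cdots\otimes\nu_{x_d}$ be the product of $\nu_{x_1},\dots,\nu_{x_d}$ on $\Omega := \{0,1\}^d$. Alternatively the measure $\mu_x$ may be described by
\[ \mu_x(\{\varepsilon\}) = \prod_{i=1}^d (1-x_i)^{1-\varepsilon_i}x_i^{\varepsilon_i} \tag{20.46} \]
for $\varepsilon\in\Omega$. Notice that $\mu_x(\{\varepsilon\})$ is a degree $d$ polynomial in $x$ for each $\varepsilon\in\Omega$.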
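For $n\in\mathbb{N}$ and $x\in[\mathbf{0},\mathbf{1}]$, let $\mu_x^n$ denote the $n$-fold product of $\mu_x$ with itself on $\Omega^n$, $X_i(\omega) = \omega_i\in\Omega\subset\mathbb{R}^d$ for $\omega\in\Omega^n$ and let
\[ S_n = (S_n^1,\dots,S_n^d) := (X_1 + X_2 + \cdots + X_n)/n, \]
so $S_n:\Omega^n\to\mathbb{R}^d$. The reader is asked to verify (Exercise 20.4) that
\[ \int_{\Omega^n} S_n\,d\mu_x^n := \left(\int_{\Omega^n} S_n^1\,d\mu_x^n,\dots,\int_{\Omega^n} S_n^d\,d\mu_x^n\right) = (x_1,\dots,x_d) = x \tag{20.47} \]
and
\[ \int_{\Omega^n} |S_n - x|^2\,d\mu_x^n = \frac1n\sum_{i=1}^d x_i(1-x_i) \le \frac dn. \tag{20.48} \]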
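From these equations it follows that $S_n$ is concentrating near $x$ as $n\to\infty$, a manifestation of the law of large numbers. Therefore it is reasonable to expect
\[ p_n(x) := \int_{\Omega^n} f(S_n)\,d\mu_x^n \tag{20.49} \]
should approach $f(x)$ as $n\to\infty$. Let $\varepsilon > 0$ be given, $M = \sup\{|f(x)| : x\in[\mathbf{0},\mathbf{1}]\}$ and
\[ \delta_\varepsilon = \sup\{|f(y) - f(x)| : x,y\in[\mathbf{0},\mathbf{1}] \text{ and } |y-x|\le\varepsilon\}. \]
By uniform continuity of $f$ on $[\mathbf{0},\mathbf{1}]$, $\lim_{\varepsilon\downarrow0}\delta_\varepsilon = 0$. Using these definitions and the fact that $\mu_x^n(\Omega^n) = 1$,
\begin{align*}
|f(x) - p_n(x)| &= \left|\int_{\Omega^n}(f(x) - f(S_n))\,d\mu_x^n\right| \le \int_{\Omega^n}|f(x) - f(S_n)|\,d\mu_x^n \\
&\le \int_{\{|S_n - x| > \varepsilon\}}|f(x) - f(S_n)|\,d\mu_x^n + \int_{\{|S_n - x|\le\varepsilon\}}|f(x) - f(S_n)|\,d\mu_x^n \\
&\le 2M\mu_x^n(|S_n - x| > \varepsilon) + \delta_\varepsilon. \tag{20.50}
\end{align*}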
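By Chebyshev's inequality and Eq. (20.48),
\[ \mu_x^n(|S_n - x| > \varepsilon) \le \frac{1}{\varepsilon^2}\int_{\Omega^n}|S_n - x|^2\,d\mu_x^n \le \frac{d}{n\varepsilon^2}, \]
and therefore, Eq. (20.50) yields the estimate
\[ \|f - p_n\|_\infty \le \frac{2dM}{n\varepsilon^2} + \delta_\varepsilon \]
and hence
\[ \limsup_{n\to\infty}\|f - p_n\|_\infty \le \delta_\varepsilon \to 0 \text{ as } \varepsilon\downarrow0. \]
This completes the proof since, using Eq. (20.46),
\[ p_n(x) = \sum_{\omega\in\Omega^n} f(S_n(\omega))\,\mu_x^n(\{\omega\}) = \sum_{\omega\in\Omega^n} f(S_n(\omega))\prod_{i=1}^n \mu_x(\{\omega_i\}) \]
is a polynomial of degree $nd$ in $x\in\mathbb{R}^d$.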
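When $d = 1$ the polynomial $p_n$ of Eq. (20.49) is the classical Bernstein polynomial, since $S_n$ takes the value $k/n$ with probability $\binom{n}{k}x^k(1-x)^{n-k}$ under $\mu_x^n$. The Python sketch below is an illustration only (the function names and the test function are assumptions, not part of the original text); it evaluates $p_n$ and measures the uniform error on a grid.

```python
import math


def bernstein(f, n, x):
    """p_n(x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k), the d = 1 case of (20.49)."""
    total = 0.0
    for k in range(n + 1):
        total += f(k / n) * math.comb(n, k) * x ** k * (1 - x) ** (n - k)
    return total


def sup_error(f, n, grid=200):
    """Maximum of |f - p_n| over an evenly spaced grid in [0, 1]."""
    return max(abs(f(i / grid) - bernstein(f, n, i / grid))
               for i in range(grid + 1))


if __name__ == "__main__":
    def kink(t):
        # A continuous but non-differentiable test function.
        return abs(t - 0.5)

    for n in (5, 20, 80, 320):
        print(n, sup_error(kink, n))
```

The printed errors shrink slowly, roughly like $n^{-1/2}$ for this test function, which is consistent with the $\delta_\varepsilon$ versus $d/(n\varepsilon^2)$ trade-off in the proof.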
Exercise 20.4. Verify Eqs. (20.47) and (20.48). This is most easily done using Eqs. (20.44) and (20.45) and Fubini's theorem repeatedly. (Of course Fubini's theorem here is overkill since these are only finite sums after all. Nevertheless it is convenient to use this formulation.)
The second proof requires the next two lemmas.
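Lemma 20.30 (Approximate $\delta$-sequences). Suppose that $\{Q_n\}_{n=1}^\infty$ is a sequence of positive functions on $\mathbb{R}^d$ such that
\[ \int_{\mathbb{R}^d} Q_n(x)\,dx = 1 \quad\text{and} \tag{20.51} \]
\[ \lim_{n\to\infty}\int_{|x|\ge\varepsilon} Q_n(x)\,dx = 0 \text{ for all } \varepsilon > 0. \tag{20.52} \]
For $f\in BC(\mathbb{R}^d)$, $Q_n * f$ converges to $f$ uniformly on compact subsets of $\mathbb{R}^d$.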
Proof. The proof is exactly the same as the proof of Lemma 10.29; it is only necessary to replace $\mathbb{R}$ by $\mathbb{R}^d$ everywhere in the proof.

Define
\[ Q_n:\mathbb{R}^d\to[0,\infty) \text{ by } Q_n(x) = q_n(x_1)\cdots q_n(x_d), \tag{20.53} \]
where $q_n$ is defined in Eq. (10.23).
Lemma 20.31. The sequence $\{Q_n\}_{n=1}^\infty$ is an approximate $\delta$-sequence, i.e. they satisfy Eqs. (20.51) and (20.52).
Proof. The fact that $Q_n$ integrates to one is an easy consequence of Tonelli's theorem and the fact that $q_n$ integrates to one. Since all norms on $\mathbb{R}^d$ are equivalent, we may assume that $|x| = \max\{|x_i| : i = 1,2,\dots,d\}$ when proving Eq. (20.52). With this norm
\[ \left\{x\in\mathbb{R}^d : |x|\ge\varepsilon\right\} = \bigcup_{i=1}^d\left\{x\in\mathbb{R}^d : |x_i|\ge\varepsilon\right\} \]
and therefore by Tonelli's theorem,
\[ \int_{\{|x|\ge\varepsilon\}} Q_n(x)\,dx \le \sum_{i=1}^d\int_{\{|x_i|\ge\varepsilon\}} Q_n(x)\,dx = d\int_{\{t\in\mathbb{R} : |t|\ge\varepsilon\}} q_n(t)\,dt \]
which tends to zero as $n\to\infty$ by Lemma 10.30.
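Proof of Theorem 10.35. Again we assume $f\in C(\mathbb{R}^d,\mathbb{C})$ and $f\equiv0$ on $Q_d^c$ where $Q_d := (0,1)^d$. Let $Q_n(x)$ be defined as in Eq. (20.53). Then by Lemmas 20.31 and 20.30, $p_n(x) := (Q_n * f)(x)\to f(x)$ uniformly for $x\in[\mathbf{0},\mathbf{1}]$ as $n\to\infty$. So to finish the proof it only remains to show $p_n(x)$ is a polynomial when $x\in[\mathbf{0},\mathbf{1}]$. For $x\in[\mathbf{0},\mathbf{1}]$,
\begin{align*}
p_n(x) &= \int_{\mathbb{R}^d} Q_n(x-y)f(y)\,dy \\
&= \int_{[\mathbf{0},\mathbf{1}]} f(y)\prod_{i=1}^d\left[c_n^{-1}\left(1-(x_i-y_i)^2\right)^n 1_{|x_i-y_i|\le1}\right]dy \\
&= \int_{[\mathbf{0},\mathbf{1}]} f(y)\prod_{i=1}^d\left[c_n^{-1}\left(1-(x_i-y_i)^2\right)^n\right]dy,
\end{align*}
where the indicator may be dropped because $|x_i - y_i|\le1$ whenever $x,y\in[\mathbf{0},\mathbf{1}]$. Since the product in the above integrand is a polynomial in $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$, it follows easily that $p_n(x)$ is a polynomial in $x$.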
20.5 More Spherical Coordinates
In this section we will define spherical coordinates in all dimensions. Along the way we will develop an explicit method for computing surface integrals on spheres. As usual when $n = 2$ define spherical coordinates $(r,\theta)\in(0,\infty)\times[0,2\pi)$ so that
\[ \begin{pmatrix} x_1\\ x_2\end{pmatrix} = \begin{pmatrix} r\cos\theta\\ r\sin\theta\end{pmatrix} = T_2(\theta,r). \]
For $n = 3$ we let $x_3 = r\cos\phi_1$ and then
\[ \begin{pmatrix} x_1\\ x_2\end{pmatrix} = T_2(\theta, r\sin\phi_1), \]
as can be seen from Figure 20.6, so that
\[ \begin{pmatrix} x_1\\ x_2\\ x_3\end{pmatrix} = \begin{pmatrix} T_2(\theta,r\sin\phi_1)\\ r\cos\phi_1\end{pmatrix} = \begin{pmatrix} r\sin\phi_1\cos\theta\\ r\sin\phi_1\sin\theta\\ r\cos\phi_1\end{pmatrix} =: T_3(\theta,\phi_1,r). \]

Fig. 20.6. Setting up polar coordinates in two and three dimensions.
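We continue to work inductively this way to define
\[ \begin{pmatrix} x_1\\ \vdots\\ x_n\\ x_{n+1}\end{pmatrix} = \begin{pmatrix} T_n(\theta,\phi_1,\dots,\phi_{n-2},r\sin\phi_{n-1})\\ r\cos\phi_{n-1}\end{pmatrix} = T_{n+1}(\theta,\phi_1,\dots,\phi_{n-2},\phi_{n-1},r). \]
So for example,
\begin{align*}
x_1 &= r\sin\phi_2\sin\phi_1\cos\theta\\
x_2 &= r\sin\phi_2\sin\phi_1\sin\theta\\
x_3 &= r\sin\phi_2\cos\phi_1\\
x_4 &= r\cos\phi_2
\end{align*}
and more generally,
\begin{align*}
x_1 &= r\sin\phi_{n-2}\cdots\sin\phi_2\sin\phi_1\cos\theta\\
x_2 &= r\sin\phi_{n-2}\cdots\sin\phi_2\sin\phi_1\sin\theta\\
x_3 &= r\sin\phi_{n-2}\cdots\sin\phi_2\cos\phi_1\\
&\;\;\vdots\\
x_{n-2} &= r\sin\phi_{n-2}\sin\phi_{n-3}\cos\phi_{n-4}\\
x_{n-1} &= r\sin\phi_{n-2}\cos\phi_{n-3}\\
x_n &= r\cos\phi_{n-2}. \tag{20.54}
\end{align*}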
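By the change of variables formula,
\[ \int_{\mathbb{R}^n} f(x)\,dm(x) = \int_0^\infty dr\int_{0\le\phi_i\le\pi,\ 0\le\theta\le2\pi} d\phi_1\cdots d\phi_{n-2}\,d\theta\;\Delta_n(\theta,\phi_1,\dots,\phi_{n-2},r)\,f(T_n(\theta,\phi_1,\dots,\phi_{n-2},r)) \tag{20.55} \]
where
\[ \Delta_n(\theta,\phi_1,\dots,\phi_{n-2},r) := \left|\det T_n'(\theta,\phi_1,\dots,\phi_{n-2},r)\right|. \]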
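Proposition 20.32. The Jacobian, $\Delta_n$, is given by
\[ \Delta_n(\theta,\phi_1,\dots,\phi_{n-2},r) = r^{n-1}\sin^{n-2}\phi_{n-2}\cdots\sin^2\phi_2\sin\phi_1. \tag{20.56} \]
If $f$ is a function on $rS^{n-1}$, the sphere of radius $r$ centered at $0$ inside of $\mathbb{R}^n$, then
\begin{align*}
\int_{rS^{n-1}} f(x)\,d\sigma(x) &= r^{n-1}\int_{S^{n-1}} f(r\omega)\,d\sigma(\omega)\\
&= \int_{0\le\phi_i\le\pi,\ 0\le\theta\le2\pi} f(T_n(\theta,\phi_1,\dots,\phi_{n-2},r))\,\Delta_n(\theta,\phi_1,\dots,\phi_{n-2},r)\,d\phi_1\cdots d\phi_{n-2}\,d\theta. \tag{20.57}
\end{align*}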
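Proof. We are going to compute $\Delta_n$ inductively. Letting $\rho := r\sin\phi_{n-1}$ and writing $\frac{\partial T_n}{\partial\xi}$ for $\frac{\partial T_n}{\partial\xi}(\theta,\phi_1,\dots,\phi_{n-2},\rho)$ we have
\begin{align*}
\Delta_{n+1}(\theta,\phi_1,\dots,\phi_{n-2},\phi_{n-1},r)
&= \left|\det\begin{pmatrix}
\frac{\partial T_n}{\partial\theta} & \frac{\partial T_n}{\partial\phi_1} & \cdots & \frac{\partial T_n}{\partial\phi_{n-2}} & \frac{\partial T_n}{\partial\rho}\,r\cos\phi_{n-1} & \frac{\partial T_n}{\partial\rho}\sin\phi_{n-1}\\
0 & 0 & \cdots & 0 & -r\sin\phi_{n-1} & \cos\phi_{n-1}
\end{pmatrix}\right|\\
&= r\left(\cos^2\phi_{n-1} + \sin^2\phi_{n-1}\right)\Delta_n(\theta,\phi_1,\dots,\phi_{n-2},\rho)\\
&= r\,\Delta_n(\theta,\phi_1,\dots,\phi_{n-2},r\sin\phi_{n-1}),
\end{align*}
i.e.
\[ \Delta_{n+1}(\theta,\phi_1,\dots,\phi_{n-2},\phi_{n-1},r) = r\,\Delta_n(\theta,\phi_1,\dots,\phi_{n-2},r\sin\phi_{n-1}). \tag{20.58} \]
To arrive at this result we have expanded the determinant along the bottom row. Starting with $\Delta_2(\theta,r) = r$ already derived in Example 20.23, Eq. (20.58) implies,
\begin{align*}
\Delta_3(\theta,\phi_1,r) &= r\,\Delta_2(\theta,r\sin\phi_1) = r^2\sin\phi_1\\
\Delta_4(\theta,\phi_1,\phi_2,r) &= r\,\Delta_3(\theta,\phi_1,r\sin\phi_2) = r^3\sin^2\phi_2\sin\phi_1\\
&\;\;\vdots\\
\Delta_n(\theta,\phi_1,\dots,\phi_{n-2},r) &= r^{n-1}\sin^{n-2}\phi_{n-2}\cdots\sin^2\phi_2\sin\phi_1
\end{align*}
which proves Eq. (20.56). Equation (20.57) now follows from Eqs. (20.34), (20.55) and (20.56).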
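The closed form (20.56) can be compared against a finite-difference Jacobian. The Python sketch below is illustrative only: the names `T`, `delta`, `det` and `numeric_jacobian_abs_det` are hypothetical, and the step size `h` is an assumption. It builds $T_n$ by the recursion $T_{n+1} = (T_n(\dots, r\sin\phi_{n-1}), r\cos\phi_{n-1})$ used above.

```python
import math


def T(theta, phis, r):
    """Spherical coordinate map T_n, with n = len(phis) + 2."""
    if not phis:
        return [r * math.cos(theta), r * math.sin(theta)]
    *inner, last = phis
    return T(theta, inner, r * math.sin(last)) + [r * math.cos(last)]


def delta(phis, r):
    """Closed form (20.56): r^(n-1) * prod_k sin(phi_k)^k."""
    value = r ** (len(phis) + 1)
    for k, phi in enumerate(phis, start=1):
        value *= math.sin(phi) ** k
    return value


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            for j in range(col, n):
                a[i][j] -= factor * a[col][j]
    return result


def numeric_jacobian_abs_det(theta, phis, r, h=1e-6):
    """|det T_n'| estimated with central differences."""
    point = [theta, *phis, r]
    rows = []
    for i in range(len(point)):
        plus, minus = point[:], point[:]
        plus[i] += h
        minus[i] -= h
        fp = T(plus[0], plus[1:-1], plus[-1])
        fm = T(minus[0], minus[1:-1], minus[-1])
        # rows[i] holds dT/d(point[i]); transposing leaves |det| unchanged.
        rows.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    return abs(det(rows))


if __name__ == "__main__":
    args = (0.7, [1.1, 0.4], 2.0)  # a point in the n = 4 chart
    print(numeric_jacobian_abs_det(*args), delta(args[1], args[2]))
```

The two printed numbers should agree up to the finite-difference error.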
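As a simple application, Eq. (20.57) implies
\begin{align*}
\sigma(S^{n-1}) &= \int_{0\le\phi_i\le\pi,\ 0\le\theta\le2\pi}\sin^{n-2}\phi_{n-2}\cdots\sin^2\phi_2\sin\phi_1\,d\phi_1\cdots d\phi_{n-2}\,d\theta\\
&= 2\pi\prod_{k=1}^{n-2}\gamma_k = \sigma(S^{n-2})\,\gamma_{n-2} \tag{20.59}
\end{align*}
where $\gamma_k := \int_0^\pi\sin^k\phi\,d\phi$. If $k\ge1$, we have by integration by parts that,
\begin{align*}
\gamma_k &= \int_0^\pi\sin^k\phi\,d\phi = -\int_0^\pi\sin^{k-1}\phi\,d\cos\phi = 2\delta_{k,1} + (k-1)\int_0^\pi\sin^{k-2}\phi\cos^2\phi\,d\phi\\
&= 2\delta_{k,1} + (k-1)\int_0^\pi\sin^{k-2}\phi\left(1-\sin^2\phi\right)d\phi = 2\delta_{k,1} + (k-1)\left[\gamma_{k-2} - \gamma_k\right]
\end{align*}
and hence $\gamma_k$ satisfies $\gamma_0 = \pi$, $\gamma_1 = 2$ and the recursion relation
\[ \gamma_k = \frac{k-1}{k}\gamma_{k-2} \text{ for } k\ge2. \]
Hence we may conclude
\[ \gamma_0 = \pi,\ \gamma_1 = 2,\ \gamma_2 = \frac12\pi,\ \gamma_3 = \frac23\,2,\ \gamma_4 = \frac34\frac12\pi,\ \gamma_5 = \frac45\frac23\,2,\ \gamma_6 = \frac56\frac34\frac12\pi \]
and more generally by induction that
\[ \gamma_{2k} = \pi\frac{(2k-1)!!}{(2k)!!} \quad\text{and}\quad \gamma_{2k+1} = 2\frac{(2k)!!}{(2k+1)!!}. \]
Indeed,
\[ \gamma_{2(k+1)+1} = \frac{2k+2}{2k+3}\gamma_{2k+1} = \frac{2k+2}{2k+3}\,2\frac{(2k)!!}{(2k+1)!!} = 2\frac{[2(k+1)]!!}{(2(k+1)+1)!!} \]
and
\[ \gamma_{2(k+1)} = \frac{2k+1}{2k+2}\gamma_{2k} = \frac{2k+1}{2k+2}\,\pi\frac{(2k-1)!!}{(2k)!!} = \pi\frac{(2k+1)!!}{(2k+2)!!}. \]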
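The recursion relation in Eq. (20.59) may be written as
\[ \sigma(S^n) = \sigma\left(S^{n-1}\right)\gamma_{n-1} \tag{20.60} \]
which combined with $\sigma\left(S^1\right) = 2\pi$ implies
\begin{align*}
\sigma\left(S^1\right) &= 2\pi,\\
\sigma(S^2) &= 2\pi\cdot\gamma_1 = 2\pi\cdot2,\\
\sigma(S^3) &= 2\pi\cdot2\cdot\gamma_2 = 2\pi\cdot2\cdot\frac12\pi = \frac{2^2\pi^2}{2!!},\\
\sigma(S^4) &= \frac{2^2\pi^2}{2!!}\cdot\gamma_3 = \frac{2^2\pi^2}{2!!}\cdot2\,\frac23 = \frac{2^3\pi^2}{3!!},\\
\sigma(S^5) &= 2\pi\cdot2\cdot\frac12\pi\cdot\frac23\,2\cdot\frac34\frac12\pi = \frac{2^3\pi^3}{4!!},\\
\sigma(S^6) &= 2\pi\cdot2\cdot\frac12\pi\cdot\frac23\,2\cdot\frac34\frac12\pi\cdot\frac45\frac23\,2 = \frac{2^4\pi^3}{5!!}
\end{align*}
and more generally that
\[ \sigma(S^{2n}) = \frac{2(2\pi)^n}{(2n-1)!!} \quad\text{and}\quad \sigma(S^{2n+1}) = \frac{(2\pi)^{n+1}}{(2n)!!} \tag{20.61} \]
which is verified inductively using Eq. (20.60). Indeed,
\[ \sigma(S^{2n+1}) = \sigma(S^{2n})\,\gamma_{2n} = \frac{2(2\pi)^n}{(2n-1)!!}\,\pi\frac{(2n-1)!!}{(2n)!!} = \frac{(2\pi)^{n+1}}{(2n)!!} \]
and
\[ \sigma(S^{2(n+1)}) = \sigma(S^{2n+2}) = \sigma(S^{2n+1})\,\gamma_{2n+1} = \frac{(2\pi)^{n+1}}{(2n)!!}\,2\frac{(2n)!!}{(2n+1)!!} = \frac{2(2\pi)^{n+1}}{(2n+1)!!}. \]
Using
\[ (2n)!! = 2n\,(2(n-1))\cdots(2\cdot1) = 2^n n! \]
we may write $\sigma(S^{2n+1}) = \frac{2\pi^{n+1}}{n!}$ which shows that Eqs. (20.41) and (20.61) are in agreement. We may also write the formula in Eq. (20.61) as
\[ \sigma(S^n) = \begin{cases} \dfrac{2(2\pi)^{n/2}}{(n-1)!!} & \text{for } n \text{ even},\\[2ex] \dfrac{(2\pi)^{\frac{n+1}{2}}}{(n-1)!!} & \text{for } n \text{ odd}. \end{cases} \]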
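The three descriptions of $\sigma(S^n)$ obtained so far (the recursion (20.60), the double-factorial formula (20.61) and the gamma-function formula (20.41)) can be compared numerically. The following Python sketch is an illustration with hypothetical function names, not part of the original text.

```python
import math


def gamma_k(k):
    """gamma_k = int_0^pi sin^k, from gamma_0 = pi, gamma_1 = 2 and
    the recursion gamma_k = ((k - 1) / k) * gamma_{k-2}."""
    value = math.pi if k % 2 == 0 else 2.0
    for j in range(2 + k % 2, k + 1, 2):
        value *= (j - 1) / j
    return value


def double_factorial(m):
    """m!! with the convention 0!! = 1!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def sphere_area_recursive(n):
    """sigma(S^n) from sigma(S^1) = 2*pi and Eq. (20.60)."""
    area = 2.0 * math.pi
    for m in range(2, n + 1):
        area *= gamma_k(m - 1)
    return area


def sphere_area_closed(n):
    """sigma(S^n) from Eq. (20.61)."""
    if n % 2 == 0:
        return 2.0 * (2.0 * math.pi) ** (n // 2) / double_factorial(n - 1)
    return (2.0 * math.pi) ** ((n + 1) // 2) / double_factorial(n - 1)


def sphere_area_gamma(n):
    """sigma(S^n) from Eq. (20.41) with d = n + 1."""
    d = n + 1
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, sphere_area_recursive(n), sphere_area_closed(n),
              sphere_area_gamma(n))
```

Each printed row should show the same value three times, up to rounding.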
20.6 Sard's Theorem
See p. 538 of Taylor and references. Also see Milnor's topology book. Add in Brouwer's fixed point theorem here as well. Also Spivak's calculus on manifolds.
Theorem 20.33. Let $U\subset_o\mathbb{R}^m$, $f\in C^\infty(U,\mathbb{R}^d)$ and $C := \{x\in U : \operatorname{rank}(f'(x)) < n\}$ be the set of critical points of $f$. Then the set of critical values, $f(C)$, is a Borel measurable subset of $\mathbb{R}^d$ of Lebesgue measure $0$.
Remark 20.34. This result clearly extends to manifolds.
For simplicity in the proof given below it will be convenient to use the norm, $|x| := \max_i|x_i|$. Recall that if $f\in C^1(U,\mathbb{R}^d)$ and $p\in U$, then
\[ f(p+x) = f(p) + \int_0^1 f'(p+tx)x\,dt = f(p) + f'(p)x + \int_0^1\left[f'(p+tx) - f'(p)\right]x\,dt \]
so that if
\[ R(p,x) := f(p+x) - f(p) - f'(p)x = \int_0^1\left[f'(p+tx) - f'(p)\right]x\,dt \]
we have
\[ |R(p,x)| \le |x|\int_0^1\left|f'(p+tx) - f'(p)\right|dt = |x|\,\varepsilon(p,x). \]
By uniform continuity, it follows for any compact subset $K\subset U$ that
\[ \sup\{|\varepsilon(p,x)| : p\in K \text{ and } |x|\le\delta\}\to0 \text{ as } \delta\downarrow0. \]
Proof. Notice that if x U C, then f
t
(x) : 1
m
1
n
is surjective,
which is an open condition, so that U C is an open subset of U. This shows
C is relatively closed in U, i.e. there exists

C 1
m
such that C =

C U.
Let K
n
U be compact subsets of U such that K
n
U, then K
n
C C
and K
n
C = K
n

C is compact for each n. Therefore, f(K
n
C) f(C)
i.e. f(C) =
n
f(K
n
C) is a countable union of compact sets and therefore
is Borel measurable. Moreover, since m(f(C)) = lim
n
m(f(K
n
C)), it
suces to show m(f(K)) = 0 for all compact subsets K C. Case 1. (n m)
Let K = [a, a + ] be a cube contained in U and by scaling the domain we
may assume = (1, 1, 1, . . . , 1). For N N and j S
N
:= 0, 1, . . . , N 1
n
let K
j
= j/N + [a, a + /N] so that K =
jS
N
K
j
with K
o
j
K
o
j
= if
j ,= j
t
. Let Q
j
: j = 1 . . . , M be the collection of those K
j
: j S
N
which
intersect C. For each j, let p
j
Q
j
C and for x Q
j
p
j
we have
f(p
j
+x) = f(p
j
) +f
t
(p
j
)x +R
j
(x)
where [R
j
(x)[
j
(N)/N and (N) := max
j

j
(N) 0 as N . Now
m(f(Q
j
)) = m(f(p
j
) + (f
t
(p
j
) +R
j
) (Q
j
p
j
))
= m((f
t
(p
j
) +R
j
) (Q
j
p
j
))
= m(O
j
(f
t
(p
j
) +R
j
) (Q
j
p
j
)) (20.62)
where O
j
SO(n) is chosen so that O
j
f
t
(p
j
)1
n
1
m1
0 . Now
O
j
f
t
(p
j
)(Q
j
p
j
) is contained in 0 where 1
m1
is a cube cen-
tered at 0 1
m1
with side length at most 2 [f
t
(p
j
)[ /N 2M/N where
M = max
pK
[f
t
(p)[ . It now follows that O
j
(f
t
(p
j
) +R
j
) (Q
j
p
j
) is con-
tained the set of all points within (N)/N of 0 and in particular
O
j
(f
t
(p
j
) +R
j
) (Q
j
p
j
) (1 +(N)/N) [(N)/N, (N)/N].
From this inclusion and Eq. (20.62) it follows that
m(f(Q
j
))
_
2
M
N
(1 +(N)/N)
_
m1
2(N)/N
= 2
m
M
m1
[(1 +(N)/N)]
m1
(N)
1
N
m
and therefore,
m(f(C K))
j
m(f(Q
j
)) N
n
2
m
M
m1
[(1 +(N)/N)]
m1
(N)
1
N
m
= 2
n
M
n1
[(1 +(N)/N)]
n1
(N)
1
N
mn
0 as N
since m n. This proves the easy case since we may write U as a countable
union of cubes K as above.
Remark. The case (m < n) also follows from the case m = n as follows.
When m < n, C = U and we must show m(f(U)) = 0. Letting F : U
1
nm
1
n
be the map F(x, y) = f(x). Then F
t
(x, y)(v, w) = f
t
(x)v, and
hence C
F
:= U 1
nm
. So if the assertion holds for m = n we have
m(f(U)) = m(F(U 1
nm
)) = 0.
Case 2. (m > n) This is the hard case and the case we will need in the co-area
formula to be proved later. Here I will follow the proof in Milnor. Let
C
i
:= x U :
f(x) = 0 when [[ i
so that C C
1
C
2
C
3
. . . . The proof is by induction on n and goes
by the following steps:
1. m(f(C C
1
)) = 0.
2. m(f(C
i
C
i+1
)) = 0 for all i 1.
3. m(f(C
i
)) = 0 for all i suciently large.
Step 1. If m = 1, there is nothing to prove since C = C
1
so we may assume
m 2. Suppose that x C C
1
, then f
t
(p) ,= 0 and so by reordering the
components of x and f(p) if necessary we may assume that
1
f
1
(p) ,= 0 where
we are writing f(p)/x
i
as
i
f (p) . The map h(x) := (f
1
(x), x
2
, . . . , x
n
) has
dierential
h
t
(p) =
_
1
f
1
(p)
2
f
1
(p) . . .
n
f
1
(p)
0 1 0 0
.
.
.
.
.
.
.
.
.
.
.
.
0 0 0 1
_
_
which is not singular. So by the implicit function theorem, there exists there
exists V
p
such that h : V h(V )
h(p)
is a dieomorphism and in
particular f
1
(x)/x
1
,= 0 for x V and hence V U C
1
. Consider the
map g := f h
1
: V
t
:= h(V ) 1
m
, which satises
(f
1
(x), f
2
(x), . . . , f
m
(x)) = f(x) = g(h(x)) = g((f
1
(x), x
2
, . . . , x
n
))
which implies g(t, y) = (t, u(t, y)) for (t, y) V
t
:= h(V )
h(p)
, see Figure
20.7 below where p = x and m = p. Since
Fig. 20.7. Making a change of variable so as to apply induction.
g
t
(t, y) =
_
1 0
t
u(t, y)
y
u(t, y)
_
it follows that (t, y) is a critical point of g i y C
t
t
the set of critical points
of y u(t, y). Since h is a dieomorphism we have C
t
:= h(C V ) are the
critical points of g in V
t
and
f(C V ) = g(C
t
) =
t
[t u
t
(C
t
t
)] .
By the induction hypothesis, m
m1
(u
t
(C
t
t
)) = 0 for all t, and therefore by
Fubinis theorem,
m(f(C V )) =
_
R
m
m1
(u
t
(C
t
t
))1
V
t
,=
dt = 0.
Since CC
1
may be covered by a countable collection of open sets V as above,
it follows that m(f(C C
1
)) = 0. Step 2. Suppose that p C
k
C
k+1
, then
there is an such that [[ = k + 1 such that
f(p) = 0 while
f(p) = 0
for all [[ k. Again by permuting coordinates we may assume that
1
,= 0
and
f
1
(p) ,= 0. Let w(x) :=
e1
f
1
(x), then w(p) = 0 while
1
w(p) ,= 0.
So again the implicit function theorem there exists V
p
such that h(x) :=
(w(x) , x
2
, . . . , x
n
) maps V V
t
:= h(V )
h(p)
in a dieomorphic way and
in particular
1
w(x) ,= 0 on V so that V UC
k+1
. As before, let g := f h
1
and notice that C
t
k
:= h(C
k
V ) 0 1
n1
and
f(C
k
V ) = g(C
t
k
) = g (C
t
k
)
where g := g[
(0]R
n1
)V
. Clearly C
t
k
is contained in the critical points of g,
and therefore, by induction
0 = m( g(C
t
k
)) = m(f(C
k
V )).
Since C
k
C
k+1
is covered by a countable collection of such open sets, it follows
that
m(f(C
k
C
k+1
)) = 0 for all k 1.
Step 3. Suppose that Q is a closed cube with edge length contained in U
and k > n/m1. We will show m(f(QC
k
)) = 0 and since Q is arbitrary it
will follows that m(f(C
k
)) = 0 as desired. By Taylors theorem with (integral)
remainder, it follows for x Q C
k
and h such that x +h Q that
f(x +h) = f(x) +R(x, h)
where
[R(x, h)[ c |h|
k+1
where c = c(Q, k). Now subdivide Q into r
n
cubes of edge size /r and let
Q
t
be one of the cubes in this subdivision such that Q
t
C
k
,= and let
x Q
t
C
k
. It then follows that f(Q
t
) is contained in a cube centered at
f(x) 1
m
with side length at most 2c (/r)
k+1
and hence volume at most
(2c)
m
(/r)
m(k+1)
. Therefore, f(QC
k
) is contained in the union of at most
r
n
cubes of volume (2c)
m
(/r)
m(k+1)
and hence meach
m(f(Q C
k
)) (2c)
m
(/r)
m(k+1)
r
n
= (2c)
m
m(k+1)
r
nm(k+1)
0 as r
provided that n m(k + 1) < 0, i.e. provided k > n/m1.
20.7 Exercises
Exercise 20.5. Prove Theorem 20.12. Suggestion: to get started define
\[ \pi(A) := \int_{X_1} d\mu_1(x_1)\cdots\int_{X_n} d\mu_n(x_n)\,1_A(x_1,\dots,x_n) \]
and then show Eq. (20.18) holds. Use the case of two factors as the model of your proof.
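Exercise 20.6. Let $(X_j,\mathcal{M}_j,\mu_j)$ for $j = 1,2,3$ be $\sigma$-finite measure spaces. Let $F:(X_1\times X_2)\times X_3\to X_1\times X_2\times X_3$ be defined by
\[ F((x_1,x_2),x_3) = (x_1,x_2,x_3). \]
1. Show $F$ is $((\mathcal{M}_1\otimes\mathcal{M}_2)\otimes\mathcal{M}_3,\ \mathcal{M}_1\otimes\mathcal{M}_2\otimes\mathcal{M}_3)$-measurable and $F^{-1}$ is $(\mathcal{M}_1\otimes\mathcal{M}_2\otimes\mathcal{M}_3,\ (\mathcal{M}_1\otimes\mathcal{M}_2)\otimes\mathcal{M}_3)$-measurable. That is
\[ F:((X_1\times X_2)\times X_3,\ (\mathcal{M}_1\otimes\mathcal{M}_2)\otimes\mathcal{M}_3)\to(X_1\times X_2\times X_3,\ \mathcal{M}_1\otimes\mathcal{M}_2\otimes\mathcal{M}_3) \]
is a "measure theoretic isomorphism."
2. Let $\pi := F_*[(\mu_1\otimes\mu_2)\otimes\mu_3]$, i.e. $\pi(A) = [(\mu_1\otimes\mu_2)\otimes\mu_3](F^{-1}(A))$ for all $A\in\mathcal{M}_1\otimes\mathcal{M}_2\otimes\mathcal{M}_3$. Then $\pi$ is the unique measure on $\mathcal{M}_1\otimes\mathcal{M}_2\otimes\mathcal{M}_3$ such that
\[ \pi(A_1\times A_2\times A_3) = \mu_1(A_1)\mu_2(A_2)\mu_3(A_3) \]
for all $A_i\in\mathcal{M}_i$. We will write $\pi := \mu_1\otimes\mu_2\otimes\mu_3$.
3. Let $f:X_1\times X_2\times X_3\to[0,\infty]$ be a $(\mathcal{M}_1\otimes\mathcal{M}_2\otimes\mathcal{M}_3,\ \mathcal{B}_{\bar{\mathbb{R}}})$-measurable function. Verify that the identity,
\[ \int_{X_1\times X_2\times X_3} f\,d\pi = \int_{X_3} d\mu_3(x_3)\int_{X_2} d\mu_2(x_2)\int_{X_1} d\mu_1(x_1)\,f(x_1,x_2,x_3), \]
makes sense and is correct.
4. (Optional.) Also show the above identity holds for any one of the six possible orderings of the iterated integrals.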
Exercise 20.7. Prove the second assertion of Theorem 20.18. That is show $m^d$ is the unique translation invariant measure on $\mathcal{B}_{\mathbb{R}^d}$ such that $m^d((0,1]^d) = 1$. Hint: Look at the proof of Theorem 19.10.
Exercise 20.8. (Part of Folland Problem 2.46 on p. 69.) Let $X = [0,1]$, $\mathcal{M} = \mathcal{B}_{[0,1]}$ be the Borel $\sigma$-field on $X$, $m$ be Lebesgue measure on $[0,1]$ and $\nu$ be counting measure, $\nu(A) = \#(A)$. Finally let $D = \{(x,x)\in X^2 : x\in X\}$ be the diagonal in $X^2$. Show
\[ \int_X\left[\int_X 1_D(x,y)\,d\nu(y)\right]dm(x) \ne \int_X\left[\int_X 1_D(x,y)\,dm(x)\right]d\nu(y) \]
by explicitly computing both sides of this equation.
Exercise 20.9. Folland Problem 2.48 on p. 69. (Counter example related to
Fubini Theorem involving counting measures.)
Exercise 20.10. Folland Problem 2.50 on p. 69 pertaining to area under a
curve. (Note the /B
R
should be /B
R
in this problem.)
Exercise 20.11. Folland Problem 2.55 on p. 77. (Explicit integrations.)
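Exercise 20.12. Folland Problem 2.56 on p. 77. Let $f\in L^1((0,a),dm)$, $g(x) = \int_x^a\frac{f(t)}{t}\,dt$ for $x\in(0,a)$, show $g\in L^1((0,a),dm)$ and
\[ \int_0^a g(x)\,dx = \int_0^a f(t)\,dt. \]
Exercise 20.13. Show $\int_0^\infty\left|\frac{\sin x}{x}\right|dm(x) = \infty$. So $\frac{\sin x}{x}\notin L^1([0,\infty),m)$ and $\int_0^\infty\frac{\sin x}{x}\,dm(x)$ is not defined as a Lebesgue integral.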
Exercise 20.14. Folland Problem 2.57 on p. 77.
Exercise 20.15. Folland Problem 2.58 on p. 77.
Exercise 20.16. Folland Problem 2.60 on p. 77. Properties of the $\Gamma$-function.
Exercise 20.17. Folland Problem 2.61 on p. 77. Fractional integration.
Exercise 20.18. Folland Problem 2.62 on p. 80. Rotation invariance of sur-
face measure on S
n1
.
Exercise 20.19. Folland Problem 2.64 on p. 80. On the integrability of $|x|^a\,|\log|x||^b$ for $x$ near $0$ and $x$ near $\infty$ in $\mathbb{R}^n$.
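Exercise 20.20. Show, using Problem 20.18, that
\[ \int_{S^{d-1}}\omega_i\omega_j\,d\sigma(\omega) = \frac1d\,\delta_{ij}\,\sigma\left(S^{d-1}\right). \]
Hint: show $\int_{S^{d-1}}\omega_i^2\,d\sigma(\omega)$ is independent of $i$ and therefore
\[ \int_{S^{d-1}}\omega_i^2\,d\sigma(\omega) = \frac1d\sum_{j=1}^d\int_{S^{d-1}}\omega_j^2\,d\sigma(\omega). \]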
21 $L^p$-spaces
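Let $(X,\mathcal{M},\mu)$ be a measure space and for $0 < p < \infty$ and a measurable function $f:X\to\mathbb{C}$ let
\[ \|f\|_p := \left(\int_X|f|^p\,d\mu\right)^{1/p}. \tag{21.1} \]
When $p = \infty$, let
\[ \|f\|_\infty = \inf\{a\ge0 : \mu(|f| > a) = 0\}. \tag{21.2} \]
For $0 < p\le\infty$, let
\[ L^p(X,\mathcal{M},\mu) = \{f:X\to\mathbb{C} : f \text{ is measurable and } \|f\|_p < \infty\}/\sim \]
where $f\sim g$ iff $f = g$ a.e. Notice that $\|f-g\|_p = 0$ iff $f\sim g$ and if $f\sim g$ then $\|f\|_p = \|g\|_p$. In general we will (by abuse of notation) use $f$ to denote both the function $f$ and the equivalence class containing $f$.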
Remark 21.1. Suppose that $\|f\|_\infty\le M$, then for all $a > M$, $\mu(|f| > a) = 0$ and therefore $\mu(|f| > M) = \lim_{n\to\infty}\mu(|f| > M + 1/n) = 0$, i.e. $|f(x)|\le M$ for $\mu$-a.e. $x$. Conversely, if $|f|\le M$ a.e. and $a > M$ then $\mu(|f| > a) = 0$ and hence $\|f\|_\infty\le M$. This leads to the identity:
\[ \|f\|_\infty = \inf\{a\ge0 : |f(x)|\le a \text{ for } \mu\text{-a.e. } x\}. \]
The next theorem is a generalization of Theorem 5.6 to general integrals and the proof is essentially identical to the proof of Theorem 5.6.

Theorem 21.2 (Hölder's inequality). Suppose that $1\le p\le\infty$ and $q := \frac{p}{p-1}$, or equivalently $p^{-1} + q^{-1} = 1$. If $f$ and $g$ are measurable functions then
\[ \|fg\|_1 \le \|f\|_p\cdot\|g\|_q. \tag{21.3} \]
Assuming $p\in(1,\infty)$ and $\|f\|_p\cdot\|g\|_q < \infty$, equality holds in Eq. (21.3) iff $|f|^p$ and $|g|^q$ are linearly dependent as elements of $L^1$ which happens iff
\[ |g|^q\,\|f\|_p^p = \|g\|_q^q\,|f|^p \text{ a.e.} \tag{21.4} \]
Proof. The cases where $\|f\|_p = 0$ or $\infty$ or $\|g\|_q = 0$ or $\infty$ are easy to deal with and are left to the reader. So we will now assume that $0 < \|f\|_p, \|g\|_q < \infty$. Let $s = |f|/\|f\|_p$ and $t = |g|/\|g\|_q$, then Lemma 5.5 implies
\[ \frac{|fg|}{\|f\|_p\|g\|_q} \le \frac1p\frac{|f|^p}{\|f\|_p^p} + \frac1q\frac{|g|^q}{\|g\|_q^q} \tag{21.5} \]
with equality iff $|g/\|g\|_q| = |f|^{p-1}/\|f\|_p^{(p-1)} = |f|^{p/q}/\|f\|_p^{p/q}$, i.e. $|g|^q\|f\|_p^p = \|g\|_q^q|f|^p$. Integrating Eq. (21.5) implies
\[ \frac{\|fg\|_1}{\|f\|_p\|g\|_q} \le \frac1p + \frac1q = 1 \]
with equality iff Eq. (21.4) holds. The proof is finished since it is easily checked that equality holds in Eq. (21.3) when $|f|^p = c|g|^q$ or $|g|^q = c|f|^p$ for some constant $c$.
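On a finite set with point masses, both sides of Eq. (21.3) are finite sums, so the inequality and its equality case (21.4) can be checked directly. The Python sketch below is illustrative only; `lp_norm`, `holder_sides` and the sample data are assumptions, not part of the original text.

```python
import math


def lp_norm(f, weights, p):
    """||f||_p on a finite set where point i carries mass weights[i]."""
    if math.isinf(p):
        # Essential supremum: ignore points of zero mass.
        return max((abs(v) for v, w in zip(f, weights) if w > 0), default=0.0)
    return sum(abs(v) ** p * w for v, w in zip(f, weights)) ** (1.0 / p)


def holder_sides(f, g, weights, p):
    """Return (||fg||_1, ||f||_p * ||g||_q) with q conjugate to p."""
    if p == 1:
        q = math.inf
    elif math.isinf(p):
        q = 1.0
    else:
        q = p / (p - 1.0)
    lhs = sum(abs(a * b) * w for a, b, w in zip(f, g, weights))
    rhs = lp_norm(f, weights, p) * lp_norm(g, weights, q)
    return lhs, rhs


if __name__ == "__main__":
    f, g, w = [1.0, -2.0, 3.0], [4.0, 5.0, -6.0], [0.5, 1.0, 2.0]
    for p in (1.0, 1.5, 2.0, 3.0, math.inf):
        print(p, holder_sides(f, g, w, p))
    # Equality case: |g|^q proportional to |f|^p, here p = 3 and q = 3/2.
    print(holder_sides(f, [v * v for v in f], w, 3.0))
```

In the last line $|g|^{3/2} = |f|^3$, so the two returned numbers coincide up to rounding, as Eq. (21.4) predicts.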
The following corollary is an easy extension of Hölder's inequality.

Corollary 21.3. Suppose that $f_i:X\to\mathbb{C}$ are measurable functions for $i = 1,\dots,n$ and $p_1,\dots,p_n$ and $r$ are positive numbers such that $\sum_{i=1}^n p_i^{-1} = r^{-1}$, then
\[ \left\|\prod_{i=1}^n f_i\right\|_r \le \prod_{i=1}^n\|f_i\|_{p_i}. \]
Proof. To prove this inequality, start with $n = 2$, then for any $p\in[1,\infty]$,
\[ \|fg\|_r^r = \int_X|f|^r|g|^r\,d\mu \le \|f^r\|_p\,\|g^r\|_{p^*} \]
where $p^* = \frac{p}{p-1}$ is the conjugate exponent. Let $p_1 = pr$ and $p_2 = p^*r$ so that $p_1^{-1} + p_2^{-1} = r^{-1}$ as desired. Then the previous equation states that
\[ \|fg\|_r \le \|f\|_{p_1}\|g\|_{p_2} \]
as desired. The general case is now proved by induction. Indeed,
\[ \left\|\prod_{i=1}^{n+1} f_i\right\|_r = \left\|\prod_{i=1}^n f_i\cdot f_{n+1}\right\|_r \le \left\|\prod_{i=1}^n f_i\right\|_q\|f_{n+1}\|_{p_{n+1}} \]
where $q^{-1} + p_{n+1}^{-1} = r^{-1}$. Since $\sum_{i=1}^n p_i^{-1} = q^{-1}$, we may now use the induction hypothesis to conclude
\[ \left\|\prod_{i=1}^n f_i\right\|_q \le \prod_{i=1}^n\|f_i\|_{p_i}, \]
which combined with the previous displayed equation proves the generalized form of Hölder's inequality.
Theorem 21.4 (Minkowski's Inequality). If $1\le p\le\infty$ and $f,g\in L^p$ then
\[ \|f+g\|_p \le \|f\|_p + \|g\|_p. \tag{21.6} \]
Moreover, assuming $f$ and $g$ are not identically zero, equality holds in Eq. (21.6) iff $\operatorname{sgn}(f)\circeq\operatorname{sgn}(g)$ a.e. (see the notation in Definition 5.7) when $p = 1$ and $f = cg$ a.e. for some $c > 0$ for $p\in(1,\infty)$.
Proof. When $p = \infty$, $|f|\le\|f\|_\infty$ a.e. and $|g|\le\|g\|_\infty$ a.e. so that $|f+g|\le|f|+|g|\le\|f\|_\infty+\|g\|_\infty$ a.e. and therefore
\[ \|f+g\|_\infty \le \|f\|_\infty + \|g\|_\infty. \]
When $p < \infty$,
\[ |f+g|^p \le (2\max(|f|,|g|))^p = 2^p\max(|f|^p,|g|^p) \le 2^p(|f|^p+|g|^p), \]
\[ \|f+g\|_p^p \le 2^p\left(\|f\|_p^p + \|g\|_p^p\right) < \infty. \]
In case $p = 1$,
\[ \|f+g\|_1 = \int_X|f+g|\,d\mu \le \int_X|f|\,d\mu + \int_X|g|\,d\mu \]
with equality iff $|f|+|g| = |f+g|$ a.e. which happens iff $\operatorname{sgn}(f)\circeq\operatorname{sgn}(g)$ a.e. In case $p\in(1,\infty)$, we may assume $\|f+g\|_p$, $\|f\|_p$ and $\|g\|_p$ are all positive since otherwise the theorem is easily verified. Now
\[ |f+g|^p = |f+g|\,|f+g|^{p-1} \le (|f|+|g|)\,|f+g|^{p-1} \]
with equality iff $\operatorname{sgn}(f)\circeq\operatorname{sgn}(g)$. Integrating this equation and applying Hölder's inequality with $q = p/(p-1)$ gives
\begin{align*}
\int_X|f+g|^p\,d\mu &\le \int_X|f|\,|f+g|^{p-1}\,d\mu + \int_X|g|\,|f+g|^{p-1}\,d\mu\\
&\le (\|f\|_p + \|g\|_p)\,\big\||f+g|^{p-1}\big\|_q \tag{21.7}
\end{align*}
with equality iff
\[ \operatorname{sgn}(f)\circeq\operatorname{sgn}(g) \text{ and } \left(\frac{|f|}{\|f\|_p}\right)^p = \frac{|f+g|^p}{\|f+g\|_p^p} = \left(\frac{|g|}{\|g\|_p}\right)^p \text{ a.e.} \tag{21.8} \]
Moreover, since $(p-1)q = p$,
\[ \big\||f+g|^{p-1}\big\|_q^q = \int_X\left(|f+g|^{p-1}\right)^q d\mu = \int_X|f+g|^p\,d\mu. \tag{21.9} \]
Combining Eqs. (21.7) and (21.9) implies
\[ \|f+g\|_p^p \le \|f\|_p\|f+g\|_p^{p/q} + \|g\|_p\|f+g\|_p^{p/q} \tag{21.10} \]
with equality iff Eq. (21.8) holds which happens iff $f = cg$ a.e. with $c > 0$. Solving for $\|f+g\|_p$ in Eq. (21.10) gives Eq. (21.6).
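The next theorem gives another example of using Hölder's inequality.

Theorem 21.5. Suppose that $(X,\mathcal{M},\mu)$ and $(Y,\mathcal{N},\nu)$ are $\sigma$-finite measure spaces, $p\in[1,\infty]$, $q = p/(p-1)$ and $k:X\times Y\to\mathbb{C}$ is a $\mathcal{M}\otimes\mathcal{N}$-measurable function. Assume there exist finite constants $C_1$ and $C_2$ such that
\[ \int_X|k(x,y)|\,d\mu(x) \le C_1 \text{ for } \nu\text{-a.e. } y \quad\text{and}\quad \int_Y|k(x,y)|\,d\nu(y) \le C_2 \text{ for } \mu\text{-a.e. } x. \]
If $f\in L^p(\nu)$, then
\[ \int_Y|k(x,y)f(y)|\,d\nu(y) < \infty \text{ for } \mu\text{-a.e. } x, \]
$x\to Kf(x) := \int_Y k(x,y)f(y)\,d\nu(y)\in L^p(\mu)$ and
\[ \|Kf\|_{L^p(\mu)} \le C_1^{1/p}C_2^{1/q}\|f\|_{L^p(\nu)}. \tag{21.11} \]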
Proof. Suppose $p\in(1,\infty)$ to begin with and let $q = p/(p-1)$, then by Hölder's inequality,
\begin{align*}
\int_Y|k(x,y)f(y)|\,d\nu(y) &= \int_Y|k(x,y)|^{1/q}|k(x,y)|^{1/p}|f(y)|\,d\nu(y)\\
&\le \left[\int_Y|k(x,y)|\,d\nu(y)\right]^{1/q}\left[\int_Y|k(x,y)|\,|f(y)|^p\,d\nu(y)\right]^{1/p}\\
&\le C_2^{1/q}\left[\int_Y|k(x,y)|\,|f(y)|^p\,d\nu(y)\right]^{1/p}.
\end{align*}
Therefore,
\begin{align*}
\left\|\int_Y|k(\cdot,y)f(y)|\,d\nu(y)\right\|_{L^p(\mu)}^p &= \int_X d\mu(x)\left[\int_Y|k(x,y)f(y)|\,d\nu(y)\right]^p\\
&\le C_2^{p/q}\int_X d\mu(x)\int_Y d\nu(y)\,|k(x,y)|\,|f(y)|^p\\
&= C_2^{p/q}\int_Y d\nu(y)\,|f(y)|^p\int_X d\mu(x)\,|k(x,y)|\\
&\le C_2^{p/q}C_1\int_Y d\nu(y)\,|f(y)|^p = C_2^{p/q}C_1\|f\|_{L^p(\nu)}^p,
\end{align*}
wherein we used Tonelli's theorem in the third line. From this it follows that $\int_Y|k(x,y)f(y)|\,d\nu(y) < \infty$ for $\mu$-a.e. $x$,
\[ x\to Kf(x) := \int_Y k(x,y)f(y)\,d\nu(y)\in L^p(\mu) \]
and that Eq. (21.11) holds.

Similarly if $p = \infty$,
\[ \int_Y|k(x,y)f(y)|\,d\nu(y) \le \|f\|_{L^\infty(\nu)}\cdot\int_Y|k(x,y)|\,d\nu(y) \le C_2\|f\|_{L^\infty(\nu)} \text{ for } \mu\text{-a.e. } x, \]
so that $\|Kf\|_{L^\infty(\mu)}\le C_2\|f\|_{L^\infty(\nu)}$. If $p = 1$, then
\[ \int_X d\mu(x)\int_Y d\nu(y)\,|k(x,y)f(y)| = \int_Y d\nu(y)\,|f(y)|\int_X d\mu(x)\,|k(x,y)| \le C_1\int_Y d\nu(y)\,|f(y)| \]
which shows $\|Kf\|_{L^1(\mu)}\le C_1\|f\|_{L^1(\nu)}$.
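When $X$ and $Y$ are finite sets with counting measure, $k$ is a matrix, $C_1$ is its largest absolute column sum and $C_2$ its largest absolute row sum, so Eq. (21.11) becomes a bound on matrix-vector products. The Python sketch below is an illustration under that assumption, restricted to $1 < p < \infty$; the function name and the sample data are hypothetical.

```python
def schur_bound_sides(k, f, p):
    """Return (||Kf||_p, C1^(1/p) * C2^(1/q) * ||f||_p) for a matrix kernel.

    k[x][y] is the kernel, rows indexed by x and columns by y; both index
    sets carry counting measure. Requires 1 < p < infinity.
    """
    rows, cols = len(k), len(k[0])
    # C1 bounds the sum over x for each fixed y (column sums).
    c1 = max(sum(abs(k[x][y]) for x in range(rows)) for y in range(cols))
    # C2 bounds the sum over y for each fixed x (row sums).
    c2 = max(sum(abs(k[x][y]) for y in range(cols)) for x in range(rows))
    kf = [sum(k[x][y] * f[y] for y in range(cols)) for x in range(rows)]
    q = p / (p - 1.0)
    norm_kf = sum(abs(v) ** p for v in kf) ** (1.0 / p)
    norm_f = sum(abs(v) ** p for v in f) ** (1.0 / p)
    return norm_kf, c1 ** (1.0 / p) * c2 ** (1.0 / q) * norm_f


if __name__ == "__main__":
    kernel = [[1.0, -2.0, 0.0],
              [3.0, 0.5, 1.0]]
    vector = [1.0, -1.0, 2.0]
    for p in (1.5, 2.0, 4.0):
        print(p, schur_bound_sides(kernel, vector, p))
```

For each $p$ the first printed number should not exceed the second.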
21.1 Jensen's Inequality
Definition 21.6. A function $\phi:(a,b)\to\mathbb{R}$ is convex if for all $a < x_0 < x_1 < b$ and $t\in[0,1]$, $\phi(x_t)\le t\phi(x_1) + (1-t)\phi(x_0)$ where $x_t = tx_1 + (1-t)x_0$.

Example 21.7. The functions $\exp(x)$ and $-\log(x)$ are convex and, for $p > 0$, $x^p$ is convex on $(0,\infty)$ iff $p\ge1$, as follows from Corollary 21.9 below which in part states that any $\phi\in C^2((a,b),\mathbb{R})$ such that $\phi''\ge0$ is convex.
The following Proposition is clearly motivated by Figure 21.1.
BRUCE: See the Appendix (page 500) of Revuz and Yor for facts and
better proofs of facts about convex functions.
Proposition 21.8. Suppose $\phi:(a,b)\to\mathbb{R}$ is a convex function, then
1. For all $u,v,w,z\in(a,b)$ such that $u < z$, $w\in[u,z)$ and $v\in(u,z]$,
\[ \frac{\phi(v)-\phi(u)}{v-u} \le \frac{\phi(z)-\phi(w)}{z-w}. \tag{21.12} \]
2. For each $c\in(a,b)$, the right and left sided derivatives $\phi'_\pm(c)$ exist in $\mathbb{R}$ and if $a < u < v < b$, then
\[ \phi'_+(u) \le \phi'_-(v) \le \phi'_+(v). \tag{21.13} \]
3. The function $\phi$ is continuous and differentiable except on an at most countable subset of $(a,b)$.
4. For all $t\in(a,b)$ and $\beta\in[\phi'_-(t),\phi'_+(t)]$, $\phi(x)\ge\phi(t)+\beta(x-t)$ for all $x\in(a,b)$. In particular,
\[ \phi(x) \ge \phi(t) + \phi'_\pm(t)(x-t) \text{ for all } x,t\in(a,b). \]

Fig. 21.1. A convex function along with two chords corresponding to x0 = 2 and x1 = 4 and x0 = 5 and x1 = 2.

Proof. 1a) Suppose first that $u < v = w < z$, in which case Eq. (21.12) is equivalent to
\[ (\phi(v)-\phi(u))(z-v) \le (\phi(z)-\phi(v))(v-u) \]
which after solving for $\phi(v)$ is equivalent to the following equation holding:
\[ \phi(v) \le \phi(z)\frac{v-u}{z-u} + \phi(u)\frac{z-v}{z-u}. \]
But this last equation states that $\phi(v)\le\phi(z)t + \phi(u)(1-t)$ where $t = \frac{v-u}{z-u}$ and $v = tz + (1-t)u$ and hence is valid by the definition of $\phi$ being convex.
1b) Now assume $u = w < v < z$, in which case Eq. (21.12) is equivalent to
\[ (\phi(v)-\phi(u))(z-u) \le (\phi(z)-\phi(u))(v-u) \]
which after solving for $\phi(v)$ is equivalent to
\[ \phi(v)(z-u) \le \phi(z)(v-u) + \phi(u)(z-v) \]
which is equivalent to
\[ \phi(v) \le \phi(z)\frac{v-u}{z-u} + \phi(u)\frac{z-v}{z-u}. \]
Again this equation is valid by the convexity of $\phi$.
1c) $u < w < v = z$, in which case Eq. (21.12) is equivalent to
\[ (\phi(z)-\phi(u))(z-w) \le (\phi(z)-\phi(w))(z-u) \]
and this is equivalent to the inequality,
\[ \phi(w) \le \phi(z)\frac{w-u}{z-u} + \phi(u)\frac{z-w}{z-u} \]
which again is true by the convexity of $\phi$.
1) General case. If $u < w < v < z$, then by 1a-1c)
\[ \frac{\phi(z)-\phi(w)}{z-w} \ge \frac{\phi(v)-\phi(w)}{v-w} \ge \frac{\phi(v)-\phi(u)}{v-u} \]
and if $u < v < w < z$
\[ \frac{\phi(z)-\phi(w)}{z-w} \ge \frac{\phi(w)-\phi(v)}{w-v} \ge \frac{\phi(w)-\phi(u)}{w-u} \ge \frac{\phi(v)-\phi(u)}{v-u}. \]
We have now taken care of all possible cases.
2) On the set $a < w < z < b$, Eq. (21.12) shows that $(\phi(z)-\phi(w))/(z-w)$ is an increasing function of $w$ and an increasing function of $z$, and therefore $\phi'_\pm(x)$ exists for all $x\in(a,b)$. Also from Eq. (21.12) we learn that
\[ \phi'_+(u) \le \frac{\phi(z)-\phi(w)}{z-w} \text{ for all } a < u < w < z < b, \tag{21.14} \]
\[ \frac{\phi(v)-\phi(u)}{v-u} \le \phi'_-(z) \text{ for all } a < u < v < z < b, \tag{21.15} \]
and letting $w\uparrow z$ in the first equation also implies that
\[ \phi'_+(u) \le \phi'_-(z) \text{ for all } a < u < z < b. \]
The inequality, $\phi'_-(z)\le\phi'_+(z)$, is also an easy consequence of Eq. (21.12).
3) Since $\phi(x)$ has both left and right finite derivatives, it follows that $\phi$ is continuous. (For an alternative proof, see Rudin.) Since $z\to\phi'_-(z)$ is an increasing function, it has at most a countable set of discontinuities. If $\phi'_-$ is continuous at $u$, then by Eq. (21.13),
\[ \phi'_+(u) \le \lim_{v\downarrow u}\phi'_-(v) = \phi'_-(u) \le \phi'_+(u) \]
so that $\phi'_-(u) = \phi'_+(u)$ and $\phi$ is differentiable at $u$.
4) Given $t$, let $\beta\in[\phi'_-(t),\phi'_+(t)]$, then by Eqs. (21.14) and (21.15),
\[ \frac{\phi(t)-\phi(u)}{t-u} \le \phi'_-(t) \le \beta \le \phi'_+(t) \le \frac{\phi(z)-\phi(t)}{z-t} \]
for all $a < u < t < z < b$. Item 4. now follows.
Corollary 21.9. Suppose $\phi:(a,b)\to\mathbb{R}$ is differentiable, then $\phi$ is convex iff $\phi'$ is non-decreasing. In particular if $\phi\in C^2(a,b)$ then $\phi$ is convex iff $\phi''\ge0$.
Proof. By Proposition 21.8, if $\phi$ is convex then $\phi'$ is non-decreasing. Conversely if $\phi'$ is non-decreasing then by the mean value theorem,
\[ \frac{\phi(x_1)-\phi(c)}{x_1-c} = \phi'(\xi_1) \text{ for some } \xi_1\in(c,x_1) \]
and
\[ \frac{\phi(c)-\phi(x_0)}{c-x_0} = \phi'(\xi_2) \text{ for some } \xi_2\in(x_0,c). \]
Hence
\[ \frac{\phi(x_1)-\phi(c)}{x_1-c} \ge \frac{\phi(c)-\phi(x_0)}{c-x_0} \]
for all $x_0 < c < x_1$. Solving this inequality for $\phi(c)$ gives
\[ \phi(c) \le \frac{c-x_0}{x_1-x_0}\phi(x_1) + \frac{x_1-c}{x_1-x_0}\phi(x_0) \]
showing $\phi$ is convex.
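Theorem 21.10 (Jensen's Inequality). Suppose that $(X,\mathcal{M},\mu)$ is a probability space, i.e. $\mu$ is a positive measure and $\mu(X) = 1$. Also suppose that $f\in L^1(\mu)$, $f:X\to(a,b)$, and $\phi:(a,b)\to\mathbb{R}$ is a convex function. Then
\[ \phi\left(\int_X f\,d\mu\right) \le \int_X\phi(f)\,d\mu \]
where if $\phi\circ f\notin L^1(\mu)$, then $\phi\circ f$ is integrable in the extended sense and $\int_X\phi(f)\,d\mu = \infty$.
Proof. Let $t = \int_X f\,d\mu\in(a,b)$ and let $\beta\in\mathbb{R}$ be such that $\phi(s)-\phi(t)\ge\beta(s-t)$ for all $s\in(a,b)$. Then integrating the inequality, $\phi(f)-\phi(t)\ge\beta(f-t)$, implies that
\[ 0 \le \int_X\phi(f)\,d\mu - \phi(t) = \int_X\phi(f)\,d\mu - \phi\left(\int_X f\,d\mu\right). \]
Moreover, if $\phi(f)$ is not integrable, then $\phi(f)\ge\phi(t)+\beta(f-t)$ which shows that the negative part of $\phi(f)$ is integrable. Therefore, $\int_X\phi(f)\,d\mu = \infty$ in this case.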
Example 21.11. The convex functions in Example 21.7 lead to the following inequalities,
\[ \exp\left(\int_X f\,d\mu\right) \le \int_X e^{f}\,d\mu, \tag{21.16} \]
\[ \int_X \log(|f|)\,d\mu \le \log\left(\int_X |f|\,d\mu\right) \]
and for $p\ge 1$,
\[ \left|\int_X f\,d\mu\right|^p \le \left(\int_X |f|\,d\mu\right)^p \le \int_X |f|^p\,d\mu. \]
The last equation may also easily be derived using Hölder's inequality. As a special case of the first equation, we get another proof of Lemma 5.5. Indeed, more generally, suppose $p_i,s_i>0$ for $i=1,2,\dots,n$ and $\sum_{i=1}^n\frac{1}{p_i}=1$, then
\[ s_1\cdots s_n = e^{\sum_{i=1}^n \ln s_i} = e^{\sum_{i=1}^n \frac{1}{p_i}\ln s_i^{p_i}} \le \sum_{i=1}^n \frac{1}{p_i}e^{\ln s_i^{p_i}} = \sum_{i=1}^n \frac{s_i^{p_i}}{p_i} \tag{21.17} \]
where the inequality follows from Eq. (21.16) with $X=\{1,2,\dots,n\}$, $\mu=\sum_{i=1}^n\frac{1}{p_i}\delta_i$ and $f(i):=\ln s_i^{p_i}$. Of course Eq. (21.17) may be proved directly using the convexity of the exponential function.
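The case $n=2$ of Eq. (21.17) is the familiar Young inequality. The sketch below writes it out together with two numerical instances; the particular exponents and values are illustrative choices only.

```latex
% n = 2 in Eq. (21.17), with conjugate exponents 1/p + 1/q = 1 and s, t > 0:
\[
  st \;\le\; \frac{s^{p}}{p} + \frac{t^{q}}{q}.
\]
% Illustrative check 1: p = q = 2, s = 1, t = 3:
%   st = 3  and  1/2 + 9/2 = 5.
% Illustrative check 2: p = 3, q = 3/2, s = 2, t = 1:
%   st = 2  and  8/3 + 2/3 = 10/3.
```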
21.2 Modes of Convergence
As usual let $(X,\mathcal{M},\mu)$ be a fixed measure space, assume $1\le p\le\infty$ and let $\{f_n\}_{n=1}^\infty\cup\{f\}$ be a collection of complex valued measurable functions on $X$. We have the following notions of convergence and Cauchy sequences.

Definition 21.12.
1. $f_n\to f$ a.e. if there is a set $E\in\mathcal{M}$ such that $\mu(E)=0$ and $\lim_{n\to\infty}1_{E^c}f_n=1_{E^c}f$.
2. $f_n\to f$ in $\mu$-measure if $\lim_{n\to\infty}\mu(|f_n-f|>\varepsilon)=0$ for all $\varepsilon>0$. We will abbreviate this by saying $f_n\to f$ in $L^0$ or by $f_n\xrightarrow{\mu}f$.
3. $f_n\to f$ in $L^p$ iff $f\in L^p$ and $f_n\in L^p$ for all $n$, and $\lim_{n\to\infty}\|f_n-f\|_p=0$.

Definition 21.13.
1. $\{f_n\}$ is a.e. Cauchy if there is a set $E\in\mathcal{M}$ such that $\mu(E)=0$ and $\{1_{E^c}f_n\}$ is a pointwise Cauchy sequence.
2. $\{f_n\}$ is Cauchy in $\mu$-measure (or $L^0$-Cauchy) if $\lim_{m,n\to\infty}\mu(|f_n-f_m|>\varepsilon)=0$ for all $\varepsilon>0$.
3. $\{f_n\}$ is Cauchy in $L^p$ if $\lim_{m,n\to\infty}\|f_n-f_m\|_p=0$.
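Lemma 21.14 (Chebyshev's inequality again). Let $p\in[1,\infty)$ and $f\in L^p$, then
\[ \mu(|f|\ge\varepsilon) \le \frac{1}{\varepsilon^p}\|f\|_p^p \text{ for all } \varepsilon>0. \]
In particular if $\{f_n\}\subset L^p$ is $L^p$-convergent (Cauchy) then $\{f_n\}$ is also convergent (Cauchy) in measure.

Proof. By Chebyshev's inequality (19.11),
\[ \mu(|f|\ge\varepsilon) = \mu(|f|^p\ge\varepsilon^p) \le \frac{1}{\varepsilon^p}\int_X|f|^p\,d\mu = \frac{1}{\varepsilon^p}\|f\|_p^p \]
and therefore if $\{f_n\}$ is $L^p$-Cauchy, then
\[ \mu(|f_n-f_m|\ge\varepsilon) \le \frac{1}{\varepsilon^p}\|f_n-f_m\|_p^p \to 0 \text{ as } m,n\to\infty \]
showing $\{f_n\}$ is $L^0$-Cauchy. A similar argument holds for the $L^p$-convergent case.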
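[Figure omitted.] A sequence of functions where $f_n\to 0$ a.e., $f_n\nrightarrow 0$ in $L^1$, $f_n\xrightarrow{m}0$.

[Figure omitted.] A sequence of functions where $f_n\to 0$ a.e., yet $f_n\nrightarrow 0$ in $L^1$ or in measure.

[Figure omitted.] A sequence of functions where $f_n\to 0$ a.e., $f_n\xrightarrow{m}0$ but $f_n\nrightarrow 0$ in $L^1$.

[Figure omitted.] A sequence of functions where $f_n\to 0$ in $L^1$, $f_n\nrightarrow 0$ a.e., and $f_n\xrightarrow{m}0$.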
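The following standard sequences exhibit the three behaviours just described. They are offered as one possible set of concrete examples (with $m$ Lebesgue measure); they are assumptions chosen for illustration and are not claimed to be the sequences drawn in the original figures.

```latex
% (a) On [0,1]: converges a.e. and in measure, but not in L^1.
\[
  f_n = n\,1_{(0,1/n)}:\qquad f_n(x)\to 0 \ \forall x,\quad
  m(|f_n|>\varepsilon)\le \tfrac1n\to 0,\quad \|f_n\|_1 = 1 .
\]
% (b) On the real line: converges everywhere, but neither in L^1 nor in measure.
\[
  f_n = 1_{[n,n+1]}:\qquad f_n(x)\to 0 \ \forall x,\quad
  m(|f_n|>\tfrac12) = 1,\quad \|f_n\|_1 = 1 .
\]
% (c) On [0,1], the "typewriter" sequence: write n = 2^k + j with 0 <= j < 2^k.
\[
  f_n = 1_{[j2^{-k},\,(j+1)2^{-k}]}:\qquad \|f_n\|_1 = 2^{-k}\to 0,
\]
% so f_n -> 0 in L^1 and in measure, yet for every x in [0,1] the values
% f_n(x) equal 1 infinitely often and 0 infinitely often: no a.e. limit.
```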
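Lemma 21.15. Suppose $a_n\in\mathbb{C}$ and $|a_{n+1}-a_n|\le\varepsilon_n$ and $\sum_{n=1}^\infty\varepsilon_n<\infty$. Then $\lim_{n\to\infty}a_n=a\in\mathbb{C}$ exists and $|a-a_n|\le\delta_n:=\sum_{k=n}^\infty\varepsilon_k$.

Proof. (This is a special case of Exercise 6.9.) Let $m>n$ then
\[ |a_m-a_n| = \left|\sum_{k=n}^{m-1}(a_{k+1}-a_k)\right| \le \sum_{k=n}^{m-1}|a_{k+1}-a_k| \le \sum_{k=n}^{\infty}\varepsilon_k := \delta_n. \tag{21.18} \]
So $|a_m-a_n|\le\delta_{\min(m,n)}\to 0$ as $m,n\to\infty$, i.e. $\{a_n\}$ is Cauchy. Let $m\to\infty$ in (21.18) to find $|a-a_n|\le\delta_n$.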
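To see that the tail bound $\delta_n$ in Lemma 21.15 cannot be improved in general, consider the following small example. The sequence below is a hypothetical choice made for illustration.

```latex
% Illustrative assumption: eps_n = 2^{-n} and a_n = sum_{k=1}^{n-1} 2^{-k} = 1 - 2^{1-n}.
\begin{align*}
|a_{n+1}-a_n| &= 2^{-n} = \varepsilon_n, &
\delta_n &= \sum_{k=n}^{\infty} 2^{-k} = 2^{1-n},\\
a &= \lim_{n\to\infty} a_n = 1, &
|a-a_n| &= 2^{1-n} = \delta_n .
\end{align*}
% Here the estimate |a - a_n| <= delta_n holds with equality for every n.
```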
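Theorem 21.16. Suppose $\{f_n\}$ is $L^0$-Cauchy. Then there exists a subsequence $g_j=f_{n_j}$ of $\{f_n\}$ such that $\lim g_j:=f$ exists a.e. and $f_n\xrightarrow{\mu}f$ as $n\to\infty$. Moreover if $g$ is a measurable function such that $f_n\xrightarrow{\mu}g$ as $n\to\infty$, then $f=g$ a.e.

Proof. Let $\varepsilon_n>0$ such that $\sum_{n=1}^\infty\varepsilon_n<\infty$ ($\varepsilon_n=2^{-n}$ would do) and set $\delta_n=\sum_{k=n}^\infty\varepsilon_k$. Choose $g_j=f_{n_j}$ such that $\{n_j\}$ is a subsequence of $\mathbb{N}$ and
\[ \mu(\{|g_{j+1}-g_j|>\varepsilon_j\}) \le \varepsilon_j. \]
Let $E_j=\{|g_{j+1}-g_j|>\varepsilon_j\}$,
\[ F_N = \bigcup_{j=N}^\infty E_j = \bigcup_{j=N}^\infty\{|g_{j+1}-g_j|>\varepsilon_j\} \]
and
\[ E := \bigcap_{N=1}^\infty F_N = \bigcap_{N=1}^\infty\bigcup_{j=N}^\infty E_j = \{|g_{j+1}-g_j|>\varepsilon_j \text{ i.o.}\}. \]
Then $\mu(E)=0$ by Lemma 19.20 or the computation
\[ \mu(E) \le \sum_{j=N}^\infty\mu(E_j) \le \sum_{j=N}^\infty\varepsilon_j = \delta_N \to 0 \text{ as } N\to\infty. \]
If $x\notin F_N$, i.e. $|g_{j+1}(x)-g_j(x)|\le\varepsilon_j$ for all $j\ge N$, then by Lemma 21.15, $f(x)=\lim_{j\to\infty}g_j(x)$ exists and $|f(x)-g_j(x)|\le\delta_j$ for all $j\ge N$. Therefore, since $E^c=\bigcup_{N=1}^\infty F_N^c$, $\lim_{j\to\infty}g_j(x)=f(x)$ exists for all $x\notin E$. Moreover, $\{x:|f(x)-g_j(x)|>\delta_j\}\subset F_j$ for all $j\ge N$ and hence
\[ \mu(|f-g_j|>\delta_j) \le \mu(F_j) \le \delta_j \to 0 \text{ as } j\to\infty. \]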
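Therefore $g_j\xrightarrow{\mu}f$ as $j\to\infty$. Since
\[ \{|f_n-f|>\varepsilon\} = \{|f-g_j+g_j-f_n|>\varepsilon\} \subset \{|f-g_j|>\varepsilon/2\}\cup\{|g_j-f_n|>\varepsilon/2\}, \]
\[ \mu(\{|f_n-f|>\varepsilon\}) \le \mu(\{|f-g_j|>\varepsilon/2\}) + \mu(|g_j-f_n|>\varepsilon/2) \]
and
\[ \mu(\{|f_n-f|>\varepsilon\}) \le \limsup_{j\to\infty}\mu(|g_j-f_n|>\varepsilon/2) \to 0 \text{ as } n\to\infty. \]
If there is another function $g$ such that $f_n\xrightarrow{\mu}g$ as $n\to\infty$, then arguing as above
\[ \mu(|f-g|>\varepsilon) \le \mu(\{|f-f_n|>\varepsilon/2\}) + \mu(|g-f_n|>\varepsilon/2) \to 0 \text{ as } n\to\infty. \]
Hence
\[ \mu(|f-g|>0) = \mu\left(\bigcup_{n=1}^\infty\{|f-g|>\tfrac1n\}\right) \le \sum_{n=1}^\infty\mu(|f-g|>\tfrac1n) = 0, \]
i.e. $f=g$ a.e.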
Corollary 21.17 (Dominated Convergence Theorem). Suppose $\{f_n\}$, $\{g_n\}$, and $g$ are in $L^1$ and $f\in L^0$ are functions such that
\[ |f_n|\le g_n \text{ a.e.},\quad f_n\xrightarrow{\mu}f,\quad g_n\xrightarrow{\mu}g,\quad\text{and}\quad \int g_n\to\int g \text{ as } n\to\infty. \]
Then $f\in L^1$ and $\lim_{n\to\infty}\|f-f_n\|_1=0$, i.e. $f_n\to f$ in $L^1$. In particular $\lim_{n\to\infty}\int f_n=\int f$.

Proof. First notice that $|f|\le g$ a.e. and hence $f\in L^1$ since $g\in L^1$. To see that $|f|\le g$, use Theorem 21.16 to find subsequences $\{f_{n_k}\}$ and $\{g_{n_k}\}$ of $\{f_n\}$ and $\{g_n\}$ respectively which are almost everywhere convergent. Then
\[ |f| = \lim_{k\to\infty}|f_{n_k}| \le \lim_{k\to\infty}g_{n_k} = g \text{ a.e.} \]
If (for sake of contradiction) $\lim_{n\to\infty}\|f-f_n\|_1\ne 0$ there exists $\varepsilon>0$ and a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ such that
\[ \int|f-f_{n_k}| \ge \varepsilon \text{ for all } k. \tag{21.19} \]
Using Theorem 21.16 again, we may assume (by passing to a further subsequence if necessary) that $f_{n_k}\to f$ and $g_{n_k}\to g$ almost everywhere. Noting, $|f-f_{n_k}|\le g+g_{n_k}\to 2g$ and $\int(g+g_{n_k})\to\int 2g$, an application of the dominated convergence Theorem 19.38 implies $\lim_{k\to\infty}\int|f-f_{n_k}|=0$ which contradicts Eq. (21.19).
Exercise 21.1 (Fatou's Lemma). If $f_n\ge 0$ and $f_n\to f$ in measure, then $\int f\le\liminf_{n\to\infty}\int f_n$.
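Theorem 21.18 (Egoroff's Theorem). Suppose $\mu(X)<\infty$ and $f_n\to f$ a.e. Then for all $\varepsilon>0$ there exists $E\in\mathcal{M}$ such that $\mu(E)<\varepsilon$ and $f_n\to f$ uniformly on $E^c$. In particular $f_n\xrightarrow{\mu}f$ as $n\to\infty$.

Proof. Let $f_n\to f$ a.e. Then $\mu(\{|f_n-f|>\tfrac1k \text{ i.o. } n\})=0$ for all $k>0$, i.e.
\[ \lim_{N\to\infty}\mu\left(\bigcup_{n\ge N}\{|f_n-f|>\tfrac1k\}\right) = \mu\left(\bigcap_{N=1}^\infty\bigcup_{n\ge N}\{|f_n-f|>\tfrac1k\}\right) = 0. \]
Let $E_k:=\bigcup_{n\ge N_k}\{|f_n-f|>\tfrac1k\}$ and choose an increasing sequence $\{N_k\}_{k=1}^\infty$ such that $\mu(E_k)<\varepsilon 2^{-k}$ for all $k$. Setting $E:=\bigcup_k E_k$, $\mu(E)<\sum_k\varepsilon 2^{-k}=\varepsilon$ and if $x\notin E$, then $|f_n-f|\le\tfrac1k$ for all $n\ge N_k$ and all $k$. That is $f_n\to f$ uniformly on $E^c$.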
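A concrete instance of Egoroff's theorem may help fix ideas. The functions $f_n(x)=x^n$ on $[0,1]$ with Lebesgue measure $m$ are an illustrative assumption, not an example from the text.

```latex
% Illustrative assumption: X = [0,1], mu = m (Lebesgue measure), f_n(x) = x^n.
% Then f_n -> 0 on [0,1), hence a.e., but not uniformly on [0,1):
\[
  \sup_{0\le x<1} |f_n(x)| = 1 \quad\text{for every } n .
\]
% Given eps in (0,1), remove the small set E = (1 - eps/2, 1], m(E) = eps/2 < eps:
\[
  \sup_{x\in E^{c}} |f_n(x)| = \left(1-\tfrac{\varepsilon}{2}\right)^{n}
  \longrightarrow 0 \quad\text{as } n\to\infty ,
\]
% so the convergence is uniform off a set of measure less than eps.
```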
Exercise 21.2. Show that Egoroff's Theorem remains valid when the assumption $\mu(X)<\infty$ is replaced by the assumption that $|f_n|\le g\in L^1$ for all $n$. Hint: make use of Theorem 21.18 applied to $f_n|_{X_k}$ where $X_k:=\{|g|\ge k^{-1}\}$.
21.3 Completeness of $L^p$-spaces
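Theorem 21.19. Let $\|\cdot\|_\infty$ be as defined in Eq. (21.2), then $(L^\infty(X,\mathcal{M},\mu),\|\cdot\|_\infty)$ is a Banach space. A sequence $\{f_n\}_{n=1}^\infty\subset L^\infty$ converges to $f\in L^\infty$ iff there exists $E\in\mathcal{M}$ such that $\mu(E)=0$ and $f_n\to f$ uniformly on $E^c$. Moreover, bounded simple functions are dense in $L^\infty$.

Proof. By Minkowski's Theorem 21.4, $\|\cdot\|_\infty$ satisfies the triangle inequality. The reader may easily check the remaining conditions that ensure $\|\cdot\|_\infty$ is a norm. Suppose that $\{f_n\}_{n=1}^\infty\subset L^\infty$ is a sequence such that $f_n\to f\in L^\infty$, i.e. $\|f-f_n\|_\infty\to 0$ as $n\to\infty$. Then for all $k\in\mathbb{N}$, there exists $N_k<\infty$ such that
\[ \mu\left(|f-f_n|>k^{-1}\right) = 0 \text{ for all } n\ge N_k. \]
Let
\[ E = \bigcup_{k=1}^\infty\bigcup_{n\ge N_k}\left\{|f-f_n|>k^{-1}\right\}. \]
Then $\mu(E)=0$ and for $x\in E^c$, $|f(x)-f_n(x)|\le k^{-1}$ for all $n\ge N_k$. This shows that $f_n\to f$ uniformly on $E^c$. Conversely, if there exists $E\in\mathcal{M}$ such that $\mu(E)=0$ and $f_n\to f$ uniformly on $E^c$, then for any $\varepsilon>0$,
\[ \mu(|f-f_n|\ge\varepsilon) = \mu(\{|f-f_n|\ge\varepsilon\}\cap E^c) = 0 \]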
for all $n$ sufficiently large. That is to say $\limsup_{n\to\infty}\|f-f_n\|_\infty\le\varepsilon$ for all $\varepsilon>0$. The density of simple functions follows from the approximation Theorem 18.42. So the last item to prove is the completeness of $L^\infty$ for which we will use Theorem 7.13.

Suppose that $\{f_n\}_{n=1}^\infty\subset L^\infty$ is a sequence such that $\sum_{n=1}^\infty\|f_n\|_\infty<\infty$. Let $M_n:=\|f_n\|_\infty$, $E_n:=\{|f_n|>M_n\}$, and $E:=\bigcup_{n=1}^\infty E_n$ so that $\mu(E)=0$. Then
\[ \sum_{n=1}^\infty\sup_{x\in E^c}|f_n(x)| \le \sum_{n=1}^\infty M_n < \infty \]
which shows that $S_N(x)=\sum_{n=1}^N f_n(x)$ converges uniformly to $S(x):=\sum_{n=1}^\infty f_n(x)$ on $E^c$, i.e. $\lim_{n\to\infty}\|S-S_n\|_\infty=0$.

Alternatively, suppose $\varepsilon_{m,n}:=\|f_m-f_n\|_\infty\to 0$ as $m,n\to\infty$. Let $E_{m,n}=\{|f_n-f_m|>\varepsilon_{m,n}\}$ and $E:=\bigcup E_{m,n}$, then $\mu(E)=0$ and
\[ \sup_{x\in E^c}|f_m(x)-f_n(x)| \le \varepsilon_{m,n} \to 0 \text{ as } m,n\to\infty. \]
Therefore, $f:=\lim_{n\to\infty}f_n$ exists on $E^c$ and the limit is uniform on $E^c$. Letting $f=\lim_{n\to\infty}1_{E^c}f_n$, it then follows that $\lim_{n\to\infty}\|f_n-f\|_\infty=0$.
Theorem 21.20 (Completeness of $L^p(\mu)$). For $1\le p\le\infty$, $L^p(\mu)$ equipped with the $L^p$-norm, $\|\cdot\|_p$ (see Eq. (21.1)), is a Banach space.

Proof. By Minkowski's Theorem 21.4, $\|\cdot\|_p$ satisfies the triangle inequality. As above the reader may easily check the remaining conditions that ensure $\|\cdot\|_p$ is a norm. So we are left to prove the completeness of $L^p(\mu)$ for $1\le p<\infty$, the case $p=\infty$ being done in Theorem 21.19.

Let $\{f_n\}_{n=1}^\infty\subset L^p(\mu)$ be a Cauchy sequence. By Chebyshev's inequality (Lemma 21.14), $\{f_n\}$ is $L^0$-Cauchy (i.e. Cauchy in measure) and by Theorem 21.16 there exists a subsequence $\{g_j\}$ of $\{f_n\}$ such that $g_j\to f$ a.e. By Fatou's Lemma,
\[ \|g_j-f\|_p^p = \int\liminf_{k\to\infty}|g_j-g_k|^p\,d\mu \le \liminf_{k\to\infty}\int|g_j-g_k|^p\,d\mu = \liminf_{k\to\infty}\|g_j-g_k\|_p^p \to 0 \text{ as } j\to\infty. \]
In particular, $\|f\|_p\le\|g_j-f\|_p+\|g_j\|_p<\infty$ so that $f\in L^p$ and $g_j\xrightarrow{L^p}f$. The proof is finished because,
\[ \|f_n-f\|_p \le \|f_n-g_j\|_p+\|g_j-f\|_p \to 0 \text{ as } j,n\to\infty. \]
The $L^p(\mu)$-norm controls two types of behaviors of $f$, namely the "behavior at infinity" and the behavior of "local singularities." So in particular, if $f$ blows up at a point $x_0\in X$, then locally near $x_0$ it is harder for $f$ to be in $L^p(\mu)$ as $p$ increases. On the other hand a function $f\in L^p(\mu)$ is allowed to decay at "infinity" slower and slower as $p$ increases. With these insights in mind, we should not in general expect $L^p(\mu)\subset L^q(\mu)$ or $L^q(\mu)\subset L^p(\mu)$. However, there are two notable exceptions. (1) If $\mu(X)<\infty$, then there is no behavior at infinity to worry about and $L^q(\mu)\subset L^p(\mu)$ for all $q\ge p$ as is shown in Corollary 21.21 below. (2) If $\mu$ is counting measure, i.e. $\mu(A)=\#(A)$, then all functions in $L^p(\mu)$ for any $p$ can not blow up on a set of positive measure, so there are no local singularities. In this case $L^p(\mu)\subset L^q(\mu)$ for all $q\ge p$, see Corollary 21.25 below.
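The two competing effects can be made explicit with power functions on $(0,\infty)$ with Lebesgue measure $m$. The functions below are an illustrative choice, not part of the original discussion.

```latex
% Illustrative assumption: a > 0, 0 < p < infinity, Lebesgue measure on (0, infinity).
\begin{align*}
\int_0^1 x^{-ap}\,dx < \infty &\iff ap < 1
  &&\text{(local singularity at } 0\text{: need } p < 1/a\text{)},\\
\int_1^\infty x^{-ap}\,dx < \infty &\iff ap > 1
  &&\text{(decay at infinity: need } p > 1/a\text{)}.
\end{align*}
% Example with a = 1/2:
%   x^{-1/2} 1_{(0,1)}        is in L^1 but not in L^2 (singularity too strong for p = 2);
%   x^{-1/2} 1_{(1,infinity)} is in L^4 but not in L^2 (decay too slow for p = 2).
```

So increasing $p$ makes singularities harder to tolerate and slow decay easier, which is why neither inclusion between $L^p(m)$ and $L^q(m)$ holds on $(0,\infty)$.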
Corollary 21.21. If $\mu(X)<\infty$ and $0<p<q\le\infty$, then $L^q(\mu)\subset L^p(\mu)$, the inclusion map is bounded and in fact
\[ \|f\|_p \le [\mu(X)]^{\left(\frac1p-\frac1q\right)}\|f\|_q. \]

Proof. Take $a\in[1,\infty]$ such that
\[ \frac1p = \frac1a+\frac1q, \text{ i.e. } a = \frac{pq}{q-p}. \]
Then by Corollary 21.3,
\[ \|f\|_p = \|f\cdot 1\|_p \le \|f\|_q\cdot\|1\|_a = \mu(X)^{1/a}\|f\|_q = \mu(X)^{\left(\frac1p-\frac1q\right)}\|f\|_q. \]
The reader may easily check this final formula is correct even when $q=\infty$ provided we interpret $1/p-1/\infty$ to be $1/p$.
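Proposition 21.22. Suppose that $0<p_0<p_1\le\infty$, $\lambda\in(0,1)$ and $p_\lambda\in(p_0,p_1)$ be defined by
\[ \frac{1}{p_\lambda} = \frac{1-\lambda}{p_0}+\frac{\lambda}{p_1} \tag{21.20} \]
with the interpretation that $\lambda/p_1=0$ if $p_1=\infty$.[1] Then $L^{p_\lambda}\subset L^{p_0}+L^{p_1}$, i.e. every function $f\in L^{p_\lambda}$ may be written as $f=g+h$ with $g\in L^{p_0}$ and $h\in L^{p_1}$. For $1\le p_0<p_1\le\infty$ and $f\in L^{p_0}+L^{p_1}$ let
\[ \|f\| := \inf\left\{\|g\|_{p_0}+\|h\|_{p_1} : f=g+h\right\}. \]
Then $(L^{p_0}+L^{p_1},\|\cdot\|)$ is a Banach space and the inclusion map from $L^{p_\lambda}$ to $L^{p_0}+L^{p_1}$ is bounded; in fact $\|f\|\le 2\|f\|_{p_\lambda}$ for all $f\in L^{p_\lambda}$.

[1] A little algebra shows that $\lambda$ may be computed in terms of $p_0$, $p_\lambda$ and $p_1$ by
\[ \lambda = \frac{p_1}{p_\lambda}\cdot\frac{p_\lambda-p_0}{p_1-p_0}, \]
which is to be read as $\lambda=1-p_0/p_\lambda$ when $p_1=\infty$.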
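Proof. Let $M>0$, then the local singularities of $f$ are contained in the set $E:=\{|f|>M\}$ and the behavior of $f$ at "infinity" is solely determined by $f$ on $E^c$. Hence let $g=f1_E$ and $h=f1_{E^c}$ so that $f=g+h$. By our earlier discussion we expect that $g\in L^{p_0}$ and $h\in L^{p_1}$ and this is the case since,
\[ \|g\|_{p_0}^{p_0} = \int|f|^{p_0}1_{|f|>M} = M^{p_0}\int\left|\frac{f}{M}\right|^{p_0}1_{|f|>M} \le M^{p_0}\int\left|\frac{f}{M}\right|^{p_\lambda}1_{|f|>M} \le M^{p_0-p_\lambda}\|f\|_{p_\lambda}^{p_\lambda} < \infty \]
and
\[ \|h\|_{p_1}^{p_1} = \left\|f1_{|f|\le M}\right\|_{p_1}^{p_1} = \int|f|^{p_1}1_{|f|\le M} = M^{p_1}\int\left|\frac{f}{M}\right|^{p_1}1_{|f|\le M} \le M^{p_1}\int\left|\frac{f}{M}\right|^{p_\lambda}1_{|f|\le M} \le M^{p_1-p_\lambda}\|f\|_{p_\lambda}^{p_\lambda} < \infty. \]
Moreover this shows
\[ \|f\| \le M^{1-p_\lambda/p_0}\|f\|_{p_\lambda}^{p_\lambda/p_0}+M^{1-p_\lambda/p_1}\|f\|_{p_\lambda}^{p_\lambda/p_1}. \]
Taking $M=c\|f\|_{p_\lambda}$ for a constant $c>0$ then gives
\[ \|f\| \le \left(c^{1-p_\lambda/p_0}+c^{1-p_\lambda/p_1}\right)\|f\|_{p_\lambda} \]
and then taking $c=1$ shows $\|f\|\le 2\|f\|_{p_\lambda}$. The proof that $(L^{p_0}+L^{p_1},\|\cdot\|)$ is a Banach space is left as Exercise 21.7 to the reader.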
Corollary 21.23 (Interpolation of $L^p$-norms). Suppose that $0<p_0<p_1\le\infty$, $\lambda\in(0,1)$ and $p_\lambda\in(p_0,p_1)$ be defined as in Eq. (21.20), then $L^{p_0}\cap L^{p_1}\subset L^{p_\lambda}$ and
\[ \|f\|_{p_\lambda} \le \|f\|_{p_0}^{1-\lambda}\|f\|_{p_1}^{\lambda}. \tag{21.21} \]
Further assume $1\le p_0<p_\lambda<p_1\le\infty$, and for $f\in L^{p_0}\cap L^{p_1}$ let
\[ \|f\| := \|f\|_{p_0}+\|f\|_{p_1}. \]
Then $(L^{p_0}\cap L^{p_1},\|\cdot\|)$ is a Banach space and the inclusion map of $L^{p_0}\cap L^{p_1}$ into $L^{p_\lambda}$ is bounded, in fact
\[ \|f\|_{p_\lambda} \le \max\left(\lambda^{-1},(1-\lambda)^{-1}\right)\left(\|f\|_{p_0}+\|f\|_{p_1}\right). \tag{21.22} \]
The heuristic explanation of this corollary is that if $f\in L^{p_0}\cap L^{p_1}$, then $f$ has local singularities no worse than an $L^{p_1}$ function and behavior at infinity no worse than an $L^{p_0}$ function. Hence $f\in L^{p_\lambda}$ for any $p_\lambda$ between $p_0$ and $p_1$.

Proof. Let $\lambda$ be determined as above, $a=p_0/(1-\lambda)$ and $b=p_1/\lambda$, so that $1/a+1/b=1/p_\lambda$ by Eq. (21.20). Then by Corollary 21.3,
\[ \|f\|_{p_\lambda} = \left\||f|^{1-\lambda}|f|^{\lambda}\right\|_{p_\lambda} \le \left\||f|^{1-\lambda}\right\|_a\left\||f|^{\lambda}\right\|_b = \|f\|_{p_0}^{1-\lambda}\|f\|_{p_1}^{\lambda}. \]
It is easily checked that $\|\cdot\|$ is a norm on $L^{p_0}\cap L^{p_1}$. To show this space is complete, suppose that $\{f_n\}\subset L^{p_0}\cap L^{p_1}$ is a $\|\cdot\|$-Cauchy sequence. Then $\{f_n\}$ is both $L^{p_0}$ and $L^{p_1}$-Cauchy. Hence there exist $f\in L^{p_0}$ and $g\in L^{p_1}$ such that $\lim_{n\to\infty}\|f-f_n\|_{p_0}=0$ and $\lim_{n\to\infty}\|g-f_n\|_{p_1}=0$. By Chebyshev's inequality (Lemma 21.14) $f_n\to f$ and $f_n\to g$ in measure and therefore by Theorem 21.16, $f=g$ a.e. It now is clear that $\lim_{n\to\infty}\|f-f_n\|=0$. The estimate in Eq. (21.22) is left as Exercise 21.6 to the reader.
Remark 21.24. Combining Proposition 21.22 and Corollary 21.23 gives
\[ L^{p_0}\cap L^{p_1} \subset L^{p_\lambda} \subset L^{p_0}+L^{p_1} \]
for $0<p_0<p_1\le\infty$, $\lambda\in(0,1)$ and $p_\lambda\in(p_0,p_1)$ as in Eq. (21.20).
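Corollary 21.25. Suppose now that $\mu$ is counting measure on $X$. Then $L^p(\mu)\subset L^q(\mu)$ for all $0<p<q\le\infty$ and $\|f\|_q\le\|f\|_p$.

Proof. Suppose that $0<p<q=\infty$, then
\[ \|f\|_\infty^p = \sup\{|f(x)|^p : x\in X\} \le \sum_{x\in X}|f(x)|^p = \|f\|_p^p, \]
i.e. $\|f\|_\infty\le\|f\|_p$ for all $0<p<\infty$. For $0<p<q<\infty$, since $|f(x)|\le\|f\|_\infty\le\|f\|_p$ for every $x\in X$,
\[ \|f\|_q^q = \sum_{x\in X}|f(x)|^{q-p}|f(x)|^p \le \|f\|_\infty^{q-p}\|f\|_p^p \le \|f\|_p^q. \]

Summary:
1. Since $\mu(|f|>\varepsilon)\le\varepsilon^{-p}\|f\|_p^p$, $L^p$-convergence implies $L^0$-convergence.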
2. $L^0$-convergence implies almost everywhere convergence for some subsequence.
3. If $\mu(X)<\infty$ then almost everywhere convergence implies uniform convergence off certain sets of small measure and in particular we have $L^0$-convergence.
4. If $\mu(X)<\infty$, then $L^q\subset L^p$ for all $p\le q$ and $L^q$-convergence implies $L^p$-convergence.
5. $L^{p_0}\cap L^{p_1}\subset L^q\subset L^{p_0}+L^{p_1}$ for any $q\in(p_0,p_1)$.
6. If $p\le q$, then $\ell^p\subset\ell^q$ and $\|f\|_q\le\|f\|_p$.
21.4 Converse of Hölder's Inequality
Throughout this section we assume $(X,\mathcal{M},\mu)$ is a $\sigma$-finite measure space, $q\in[1,\infty]$ and $p\in[1,\infty]$ are conjugate exponents, i.e. $p^{-1}+q^{-1}=1$. For $g\in L^q$, let $\varphi_g\in(L^p)^*$ be given by
\[ \varphi_g(f) = \int gf\,d\mu =: \langle g,f\rangle. \tag{21.23} \]
By Hölder's inequality
\[ |\varphi_g(f)| \le \int|gf|\,d\mu \le \|g\|_q\|f\|_p \tag{21.24} \]
which implies that
\[ \|\varphi_g\|_{(L^p)^*} := \sup\{|\varphi_g(f)| : \|f\|_p=1\} \le \|g\|_q. \tag{21.25} \]
Proposition 21.26 (Converse of Hölder's Inequality). Let $(X,\mathcal{M},\mu)$ be a $\sigma$-finite measure space and $1\le p\le\infty$ as above. For all $g\in L^q$,
\[ \|g\|_q = \|\varphi_g\|_{(L^p)^*} := \sup\left\{|\varphi_g(f)| : \|f\|_p=1\right\} \tag{21.26} \]
and for any measurable function $g:X\to\mathbb{C}$,
\[ \|g\|_q = \sup\left\{\int_X|g|f\,d\mu : \|f\|_p=1 \text{ and } f\ge 0\right\}. \tag{21.27} \]

Proof. We begin by proving Eq. (21.26). Assume first that $q<\infty$ so $p>1$. Then
\[ |\varphi_g(f)| = \left|\int gf\,d\mu\right| \le \int|gf|\,d\mu \le \|g\|_q\|f\|_p \]
and equality occurs in the first inequality when $\operatorname{sgn}(gf)$ is constant a.e. while equality in the second occurs, by Theorem 21.2, when $|f|^p=c|g|^q$ for some constant $c>0$. So let $f:=\overline{\operatorname{sgn}(g)}|g|^{q/p}$ which for $p=\infty$ is to be interpreted as $f=\overline{\operatorname{sgn}(g)}$, i.e. $|g|^{q/\infty}\equiv 1$. When $p=\infty$,
\[ |\varphi_g(f)| = \int_X g\,\overline{\operatorname{sgn}(g)}\,d\mu = \|g\|_{L^1(\mu)} = \|g\|_1\|f\|_\infty \]
which shows that $\|\varphi_g\|_{(L^\infty)^*}\ge\|g\|_1$. If $p<\infty$, then
\[ \|f\|_p^p = \int|f|^p = \int|g|^q = \|g\|_q^q \]
while
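\[ \varphi_g(f) = \int gf\,d\mu = \int|g||g|^{q/p}\,d\mu = \int|g|^q\,d\mu = \|g\|_q^q. \]
Hence
\[ \frac{|\varphi_g(f)|}{\|f\|_p} = \frac{\|g\|_q^q}{\|g\|_q^{q/p}} = \|g\|_q^{q\left(1-\frac1p\right)} = \|g\|_q. \]
This shows that $\|\varphi_g\|\ge\|g\|_q$ which combined with Eq. (21.25) implies Eq. (21.26).

The last case to consider is $p=1$ and $q=\infty$. Let $M:=\|g\|_\infty$ and choose $X_n\in\mathcal{M}$ such that $X_n\uparrow X$ as $n\to\infty$ and $\mu(X_n)<\infty$ for all $n$. For any $\varepsilon>0$, $\mu(|g|\ge M-\varepsilon)>0$ and $X_n\cap\{|g|\ge M-\varepsilon\}\uparrow\{|g|\ge M-\varepsilon\}$. Therefore, $\mu(X_n\cap\{|g|\ge M-\varepsilon\})>0$ for $n$ sufficiently large. Let
\[ f = \overline{\operatorname{sgn}(g)}\,1_{X_n\cap\{|g|\ge M-\varepsilon\}}, \]
then
\[ \|f\|_1 = \mu(X_n\cap\{|g|\ge M-\varepsilon\}) \in (0,\infty) \]
and
\[ |\varphi_g(f)| = \int_{X_n\cap\{|g|\ge M-\varepsilon\}}\overline{\operatorname{sgn}(g)}\,g\,d\mu = \int_{X_n\cap\{|g|\ge M-\varepsilon\}}|g|\,d\mu \ge (M-\varepsilon)\mu(X_n\cap\{|g|\ge M-\varepsilon\}) = (M-\varepsilon)\|f\|_1. \]
Since $\varepsilon>0$ is arbitrary, it follows from this equation that $\|\varphi_g\|_{(L^1)^*}\ge M=\|g\|_\infty$.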
Now for the proof of Eq. (21.27). The key new point is that we no longer are assuming that $g\in L^q$. Let $M(g)$ denote the right member in Eq. (21.27) and set $g_n:=1_{X_n\cap\{|g|\le n\}}g$. Then $|g_n|\uparrow|g|$ as $n\to\infty$ and it is clear that $M(g_n)$ is increasing in $n$. Therefore using Lemma 4.10 and the monotone convergence theorem,
\begin{align*}
\lim_{n\to\infty}M(g_n) = \sup_n M(g_n) &= \sup_n\sup\left\{\int_X|g_n|f\,d\mu : \|f\|_p=1 \text{ and } f\ge 0\right\}\\
&= \sup\left\{\sup_n\int_X|g_n|f\,d\mu : \|f\|_p=1 \text{ and } f\ge 0\right\}\\
&= \sup\left\{\lim_{n\to\infty}\int_X|g_n|f\,d\mu : \|f\|_p=1 \text{ and } f\ge 0\right\}\\
&= \sup\left\{\int_X|g|f\,d\mu : \|f\|_p=1 \text{ and } f\ge 0\right\} = M(g).
\end{align*}
Since $g_n\in L^q$ for all $n$ and $M(g_n)=\|\varphi_{g_n}\|_{(L^p)^*}$ (as you should verify), it follows from Eq. (21.26) that $M(g_n)=\|g_n\|_q$. When $q<\infty$ (by the monotone convergence theorem) and when $q=\infty$ (directly from the definitions) one learns that $\lim_{n\to\infty}\|g_n\|_q=\|g\|_q$. Combining this fact with $\lim_{n\to\infty}M(g_n)=M(g)$ just proved shows $M(g)=\|g\|_q$.
As an application we can derive a sweeping generalization of Minkowski's inequality. (See Reed and Simon, Vol II. Appendix IX.4 for a more thorough discussion of complex interpolation theory.)

Theorem 21.27 (Minkowski's Inequality for Integrals). Let $(X,\mathcal{M},\mu)$ and $(Y,\mathcal{N},\nu)$ be $\sigma$-finite measure spaces and $1\le p\le\infty$. If $f$ is a $\mathcal{M}\otimes\mathcal{N}$-measurable function, then $y\to\|f(\cdot,y)\|_{L^p(\mu)}$ is measurable and
1. if $f$ is a positive $\mathcal{M}\otimes\mathcal{N}$-measurable function, then
\[ \left\|\int_Y f(\cdot,y)\,d\nu(y)\right\|_{L^p(\mu)} \le \int_Y\|f(\cdot,y)\|_{L^p(\mu)}\,d\nu(y). \tag{21.28} \]
2. If $f:X\times Y\to\mathbb{C}$ is a $\mathcal{M}\otimes\mathcal{N}$-measurable function and $\int_Y\|f(\cdot,y)\|_{L^p(\mu)}\,d\nu(y)<\infty$ then
   a) for $\mu$-a.e. $x$, $f(x,\cdot)\in L^1(\nu)$,
   b) the $\mu$-a.e. defined function, $x\to\int_Y f(x,y)\,d\nu(y)$, is in $L^p(\mu)$ and
   c) the bound in Eq. (21.28) holds.
Proof. For $p\in[1,\infty]$, let $F_p(y):=\|f(\cdot,y)\|_{L^p(\mu)}$. If $p\in[1,\infty)$
\[ F_p(y) = \|f(\cdot,y)\|_{L^p(\mu)} = \left(\int_X|f(x,y)|^p\,d\mu(x)\right)^{1/p} \]
is a measurable function on $Y$ by Fubini's theorem. To see that $F_\infty$ is measurable, let $X_n\in\mathcal{M}$ such that $X_n\uparrow X$ and $\mu(X_n)<\infty$ for all $n$. Then by Exercise 21.5,
\[ F_\infty(y) = \lim_{n\to\infty}\lim_{p\to\infty}\|f(\cdot,y)1_{X_n}\|_{L^p(\mu)} \]
which shows that $F_\infty$ is $(Y,\mathcal{N})$-measurable as well. This shows that the integral on the right side of Eq. (21.28) is well defined.

Now suppose that $f\ge 0$, $q=p/(p-1)$ and $g\in L^q(\mu)$ such that $g\ge 0$ and $\|g\|_{L^q(\mu)}=1$. Then by Tonelli's theorem and Hölder's inequality,
\begin{align*}
\int_X\left[\int_Y f(x,y)\,d\nu(y)\right]g(x)\,d\mu(x) &= \int_Y d\nu(y)\int_X d\mu(x)\,f(x,y)g(x)\\
&\le \|g\|_{L^q(\mu)}\int_Y\|f(\cdot,y)\|_{L^p(\mu)}\,d\nu(y)\\
&= \int_Y\|f(\cdot,y)\|_{L^p(\mu)}\,d\nu(y).
\end{align*}
Therefore by the converse to Hölder's inequality (Proposition 21.26),
\begin{align*}
\left\|\int_Y f(\cdot,y)\,d\nu(y)\right\|_{L^p(\mu)} &= \sup\left\{\int_X\left[\int_Y f(x,y)\,d\nu(y)\right]g(x)\,d\mu(x) : \|g\|_{L^q(\mu)}=1 \text{ and } g\ge 0\right\}\\
&\le \int_Y\|f(\cdot,y)\|_{L^p(\mu)}\,d\nu(y)
\end{align*}
proving Eq. (21.28) in this case.

Now let $f:X\times Y\to\mathbb{C}$ be as in item 2) of the theorem. Applying the first part of the theorem to $|f|$ shows
\[ \int_Y|f(x,y)|\,d\nu(y) < \infty \text{ for } \mu\text{-a.e. } x, \]
i.e. $f(x,\cdot)\in L^1(\nu)$ for $\mu$-a.e. $x$. Since $\left|\int_Y f(x,y)\,d\nu(y)\right|\le\int_Y|f(x,y)|\,d\nu(y)$ it follows by item 1) that
\[ \left\|\int_Y f(\cdot,y)\,d\nu(y)\right\|_{L^p(\mu)} \le \left\|\int_Y|f(\cdot,y)|\,d\nu(y)\right\|_{L^p(\mu)} \le \int_Y\|f(\cdot,y)\|_{L^p(\mu)}\,d\nu(y). \]
Hence the function, $x\in X\to\int_Y f(x,y)\,d\nu(y)$, is in $L^p(\mu)$ and the bound in Eq. (21.28) holds.
Here is an application of Minkowski's inequality for integrals. In this theorem we will be using the convention that $x^{-1/\infty}:=1$.

Theorem 21.28 (Theorem 6.20 in Folland). Suppose that $k:(0,\infty)\times(0,\infty)\to\mathbb{C}$ is a measurable function such that $k$ is homogeneous of degree $-1$, i.e. $k(\lambda x,\lambda y)=\lambda^{-1}k(x,y)$ for all $\lambda>0$. If, for some $p\in[1,\infty]$,
\[ C_p := \int_0^\infty|k(x,1)|\,x^{-1/p}\,dx < \infty \]
then for $f\in L^p((0,\infty),m)$, $k(x,\cdot)f(\cdot)\in L^1((0,\infty),m)$ for $m$-a.e. $x$. Moreover, the $m$-a.e. defined function
\[ (Kf)(x) = \int_0^\infty k(x,y)f(y)\,dy \tag{21.29} \]
is in $L^p((0,\infty),m)$ and
\[ \|Kf\|_{L^p((0,\infty),m)} \le C_p\|f\|_{L^p((0,\infty),m)}. \]
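Proof. By the homogeneity of $k$, $k(x,y)=x^{-1}k(1,\frac{y}{x})$. Using this relation and making the change of variables, $y=zx$, gives
\begin{align*}
\int_0^\infty|k(x,y)f(y)|\,dy &= \int_0^\infty\left|x^{-1}k(1,\tfrac{y}{x})f(y)\right|dy\\
&= \int_0^\infty x^{-1}|k(1,z)f(xz)|\,x\,dz = \int_0^\infty|k(1,z)f(xz)|\,dz.
\end{align*}
Since
\[ \|f(\cdot\,z)\|_{L^p((0,\infty),m)}^p = \int_0^\infty|f(yz)|^p\,dy = \int_0^\infty|f(x)|^p\,\frac{dx}{z}, \]
\[ \|f(\cdot\,z)\|_{L^p((0,\infty),m)} = z^{-1/p}\|f\|_{L^p((0,\infty),m)}. \]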
Using Minkowski's inequality for integrals then shows
\begin{align*}
\left\|\int_0^\infty|k(\cdot,y)f(y)|\,dy\right\|_{L^p((0,\infty),m)} &\le \int_0^\infty|k(1,z)|\,\|f(\cdot\,z)\|_{L^p((0,\infty),m)}\,dz\\
&= \|f\|_{L^p((0,\infty),m)}\int_0^\infty|k(1,z)|\,z^{-1/p}\,dz\\
&= C_p\|f\|_{L^p((0,\infty),m)} < \infty.
\end{align*}
This shows that $Kf$ in Eq. (21.29) is well defined for $m$-a.e. $x$. The proof is finished by observing
\[ \|Kf\|_{L^p((0,\infty),m)} \le \left\|\int_0^\infty|k(\cdot,y)f(y)|\,dy\right\|_{L^p((0,\infty),m)} \le C_p\|f\|_{L^p((0,\infty),m)} \]
for all $f\in L^p((0,\infty),m)$.
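A classical kernel to which Theorem 21.28 applies is the Hilbert kernel. The sketch below evaluates the constant $C_p$ for it; the choice of kernel is an illustrative assumption, and the closed form relies on the standard Beta-type integral $\int_0^\infty \frac{x^{s-1}}{1+x}\,dx=\frac{\pi}{\sin(\pi s)}$ for $0<s<1$.

```latex
% Illustrative assumption: k(x,y) = 1/(x+y), which is symmetric and
% homogeneous of degree -1: k(lambda x, lambda y) = lambda^{-1} k(x,y).
\[
  C_p = \int_0^\infty \frac{x^{-1/p}}{x+1}\,dx
      = \frac{\pi}{\sin\!\big(\pi(1-\tfrac1p)\big)}
      = \frac{\pi}{\sin(\pi/p)}, \qquad 1 < p < \infty .
\]
% The integral diverges for p = 1 (at x = 0) and for p = infinity (at x = infinity).
% For 1 < p < infinity the theorem therefore gives the bound
\[
  \Big\| \int_0^\infty \frac{f(y)}{\,\cdot + y\,}\,dy \Big\|_{L^p((0,\infty),m)}
  \;\le\; \frac{\pi}{\sin(\pi/p)}\,\|f\|_{L^p((0,\infty),m)},
\]
% and in particular the constant is pi when p = 2.
```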
The following theorem is a strengthening of Proposition 21.26. It may be skipped on the first reading.

Theorem 21.29 (Converse of Hölder's Inequality II). Assume that $(X,\mathcal{M},\mu)$ is a $\sigma$-finite measure space, $q,p\in[1,\infty]$ are conjugate exponents and let $S_f$ denote the set of simple functions $\phi$ on $X$ such that $\mu(\phi\ne 0)<\infty$. Let $g:X\to\mathbb{C}$ be a measurable function such that $\phi g\in L^1(\mu)$ for all $\phi\in S_f$,[2] and define
\[ M_q(g) := \sup\left\{\left|\int_X\phi g\,d\mu\right| : \phi\in S_f \text{ with } \|\phi\|_p=1\right\}. \tag{21.30} \]
If $M_q(g)<\infty$ then $g\in L^q(\mu)$ and $M_q(g)=\|g\|_q$.
Proof. Let $X_n\in\mathcal{M}$ be sets such that $\mu(X_n)<\infty$ and $X_n\uparrow X$ as $n\uparrow\infty$. Suppose that $q=1$ and hence $p=\infty$. Choose simple functions $\phi_n$ on $X$ such that $|\phi_n|\le 1$ and $\overline{\operatorname{sgn}(g)}=\lim_{n\to\infty}\phi_n$ in the pointwise sense. Then $1_{X_m}\phi_n\in S_f$ and therefore
\[ \left|\int_X 1_{X_m}\phi_n g\,d\mu\right| \le M_q(g) \]
for all $m,n$. By assumption $1_{X_m}g\in L^1(\mu)$ and therefore by the dominated convergence theorem we may let $n\to\infty$ in this equation to find
\[ \int_X 1_{X_m}|g|\,d\mu \le M_q(g) \]
for all $m$. The monotone convergence theorem then implies that
\[ \int_X|g|\,d\mu = \lim_{m\to\infty}\int_X 1_{X_m}|g|\,d\mu \le M_q(g) \]
[2] This is equivalent to requiring $1_A g\in L^1(\mu)$ for all $A\in\mathcal{M}$ such that $\mu(A)<\infty$.
showing $g\in L^1(\mu)$ and $\|g\|_1\le M_q(g)$. Since Hölder's inequality implies that $M_q(g)\le\|g\|_1$, we have proved the theorem in case $q=1$. For $q>1$, we will begin by assuming that $g\in L^q(\mu)$. Since $p\in[1,\infty)$ we know that $S_f$ is a dense subspace of $L^p(\mu)$ and therefore, using that $\varphi_g$ is continuous on $L^p(\mu)$,
\[ M_q(g) = \sup\left\{\left|\int_X\phi g\,d\mu\right| : \phi\in L^p(\mu) \text{ with } \|\phi\|_p=1\right\} = \|g\|_q \]
where the last equality follows by Proposition 21.26. So it remains to show that if $\phi g\in L^1$ for all $\phi\in S_f$ and $M_q(g)<\infty$ then $g\in L^q(\mu)$. For $n\in\mathbb{N}$, let $g_n:=1_{X_n}1_{|g|\le n}g$. Then $g_n\in L^q(\mu)$, in fact $\|g_n\|_q\le n\mu(X_n)^{1/q}<\infty$. So by the previous paragraph, $\|g_n\|_q=M_q(g_n)$ and hence
\begin{align*}
\|g_n\|_q &= \sup\left\{\left|\int_X\phi 1_{X_n}1_{|g|\le n}g\,d\mu\right| : \phi\in S_f \text{ with } \|\phi\|_p=1\right\}\\
&\le M_q(g)\left\|\phi 1_{X_n}1_{|g|\le n}\right\|_p \le M_q(g)\cdot 1 = M_q(g)
\end{align*}
wherein the second to last inequality we have made use of the definition of $M_q(g)$ and the fact that $\phi 1_{X_n}1_{|g|\le n}\in S_f$. If $q\in(1,\infty)$, an application of the monotone convergence theorem (or Fatou's Lemma) along with the continuity of the norm, $\|\cdot\|_p$, implies
\[ \|g\|_q = \lim_{n\to\infty}\|g_n\|_q \le M_q(g) < \infty. \]
If $q=\infty$, then $\|g_n\|_\infty\le M_q(g)<\infty$ for all $n$ implies $|g_n|\le M_q(g)$ a.e. which then implies that $|g|\le M_q(g)$ a.e. since $|g|=\lim_{n\to\infty}|g_n|$. That is $g\in L^\infty(\mu)$ and $\|g\|_\infty\le M_\infty(g)$.
21.5 Uniform Integrability
This section will address the question as to what extra conditions are needed in order that an $L^0$-convergent sequence is $L^p$-convergent.

Notation 21.30. For $f\in L^1(\mu)$ and $E\in\mathcal{M}$, let
\[ \mu(f:E) := \int_E f\,d\mu \]
and more generally if $A,B\in\mathcal{M}$ let
\[ \mu(f:A,B) := \int_{A\cap B}f\,d\mu. \]

Lemma 21.31. Suppose $g\in L^1(\mu)$, then for any $\varepsilon>0$ there exists a $\delta>0$ such that $\mu(|g|:E)<\varepsilon$ whenever $\mu(E)<\delta$.
Proof. If the Lemma is false, there would exist $\varepsilon>0$ and sets $E_n$ such that $\mu(E_n)\to 0$ while $\mu(|g|:E_n)\ge\varepsilon$ for all $n$. Since $|1_{E_n}g|\le|g|\in L^1$ and for any $\delta\in(0,1)$, $\mu(1_{E_n}|g|>\delta)\le\mu(E_n)\to 0$ as $n\to\infty$, the dominated convergence theorem of Corollary 21.17 implies $\lim_{n\to\infty}\mu(|g|:E_n)=0$. This contradicts $\mu(|g|:E_n)\ge\varepsilon$ for all $n$ and the proof is complete.
Suppose that $\{f_n\}_{n=1}^\infty$ is a sequence of measurable functions which converge in $L^1(\mu)$ to a function $f$. Then for $E\in\mathcal{M}$ and $n\in\mathbb{N}$,
\[ |\mu(f_n:E)| \le |\mu(f-f_n:E)|+|\mu(f:E)| \le \|f-f_n\|_1+|\mu(f:E)|. \]
Let $\varepsilon_N:=\sup_{n>N}\|f-f_n\|_1$, then $\varepsilon_N\downarrow 0$ as $N\uparrow\infty$ and
\[ \sup_n|\mu(f_n:E)| \le \sup_{n\le N}|\mu(f_n:E)|\vee(\varepsilon_N+|\mu(f:E)|) \le \varepsilon_N+\mu(g_N:E), \tag{21.31} \]
where $g_N=|f|+\sum_{n=1}^N|f_n|\in L^1$. From Lemma 21.31 and Eq. (21.31) one easily concludes,
\[ \forall\,\varepsilon>0\ \exists\,\delta>0 \ni \sup_n|\mu(f_n:E)|<\varepsilon \text{ when } \mu(E)<\delta. \tag{21.32} \]

Definition 21.32. Functions $\{f_n\}_{n=1}^\infty\subset L^1(\mu)$ satisfying Eq. (21.32) are said to be uniformly integrable.
Remark 21.33. Let $\{f_n\}$ be real functions satisfying Eq. (21.32), $E$ be a set where $\mu(E)<\delta$ and $E_n=E\cap\{f_n\ge 0\}$. Then $\mu(E_n)<\delta$ so that $\mu(f_n^+:E)=\mu(f_n:E_n)<\varepsilon$ and similarly $\mu(f_n^-:E)<\varepsilon$. Therefore if Eq. (21.32) holds then
\[ \sup_n\mu(|f_n|:E) < 2\varepsilon \text{ when } \mu(E)<\delta. \tag{21.33} \]
Similar arguments work for the complex case by looking at the real and imaginary parts of $f_n$. Therefore $\{f_n\}_{n=1}^\infty\subset L^1(\mu)$ is uniformly integrable iff
\[ \forall\,\varepsilon>0\ \exists\,\delta>0 \ni \sup_n\mu(|f_n|:E)<\varepsilon \text{ when } \mu(E)<\delta. \tag{21.34} \]
Lemma 21.34. Assume that $\mu(X)<\infty$, then $\{f_n\}$ is uniformly bounded in $L^1(\mu)$ (i.e. $K=\sup_n\|f_n\|_1<\infty$) and $\{f_n\}$ is uniformly integrable iff
\[ \lim_{M\to\infty}\sup_n\mu(|f_n|:|f_n|\ge M) = 0. \tag{21.35} \]

Proof. Since $\{f_n\}$ is uniformly bounded in $L^1(\mu)$, $\mu(|f_n|\ge M)\le K/M$. So if (21.34) holds and $\varepsilon>0$ is given, we may choose $M$ sufficiently large so that $\mu(|f_n|\ge M)<\delta(\varepsilon)$ for all $n$ and therefore,
\[ \sup_n\mu(|f_n|:|f_n|\ge M) \le \varepsilon. \]
Since $\varepsilon$ is arbitrary, we conclude that Eq. (21.35) must hold. Conversely, suppose that Eq. (21.35) holds, then automatically $K=\sup_n\mu(|f_n|)<\infty$ because
\begin{align*}
\mu(|f_n|) &= \mu(|f_n|:|f_n|\ge M)+\mu(|f_n|:|f_n|<M)\\
&\le \sup_n\mu(|f_n|:|f_n|\ge M)+M\mu(X) < \infty.
\end{align*}
Moreover,
\begin{align*}
\mu(|f_n|:E) &= \mu(|f_n|:|f_n|\ge M,E)+\mu(|f_n|:|f_n|<M,E)\\
&\le \sup_n\mu(|f_n|:|f_n|\ge M)+M\mu(E).
\end{align*}
So given $\varepsilon>0$ choose $M$ so large that $\sup_n\mu(|f_n|:|f_n|\ge M)<\varepsilon/2$ and then take $\delta=\varepsilon/(2M)$.
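To see condition (21.35) at work, it may help to compare two sequences on $[0,1]$ with Lebesgue measure $m$. Both sequences are illustrative assumptions chosen for this sketch.

```latex
% Illustrative assumptions: X = [0,1], mu = m,
%   f_n = n 1_{(0,1/n)}   and   g_n = sqrt(n) 1_{(0,1/n)}.
% (i) f_n is bounded in L^1 but NOT uniformly integrable:
\[
  \|f_n\|_1 = 1, \qquad
  \mu(|f_n| : |f_n|\ge M) =
  \begin{cases} 1, & n \ge M,\\ 0, & n < M, \end{cases}
  \qquad\text{so}\quad \sup_n \mu(|f_n| : |f_n|\ge M) = 1 \ \ \forall M .
\]
% (ii) g_n satisfies (21.35):
\[
  \mu(|g_n| : |g_n|\ge M) =
  \begin{cases} n^{-1/2}, & \sqrt{n} \ge M,\\ 0, & \sqrt{n} < M, \end{cases}
  \qquad\text{so}\quad \sup_n \mu(|g_n| : |g_n|\ge M) \le \frac1M \to 0 .
\]
```

Consistently with this, $f_n\to 0$ a.e. without converging in $L^1$, whereas $\|g_n\|_1=n^{-1/2}\to 0$.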
Remark 21.35. It is not in general true that if $\{f_n\}\subset L^1(\mu)$ is uniformly integrable then $\sup_n\mu(|f_n|)<\infty$. For example take $X=\{*\}$ and $\mu(\{*\})=1$. Let $f_n(*)=n$. Since for $\delta<1$ a set $E\subset X$ such that $\mu(E)<\delta$ is in fact the empty set, we see that Eq. (21.33) holds in this example. However, for finite measure spaces without "atoms", for every $\delta>0$ we may find a finite partition of $X$ by sets $\{E_\ell\}_{\ell=1}^k$ with $\mu(E_\ell)<\delta$. Then if Eq. (21.33) holds with $2\varepsilon=1$, then
\[ \mu(|f_n|) = \sum_{\ell=1}^k\mu(|f_n|:E_\ell) \le k \]
showing that $\mu(|f_n|)\le k$ for all $n$.

The following lemma gives concrete necessary and sufficient conditions for verifying that a sequence of functions is uniformly bounded and uniformly integrable.
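Lemma 21.36. Suppose that $\mu(X)<\infty$, and $\Lambda\subset L^0(X)$ is a collection of functions.
1. If there exists a non-decreasing function $\phi:\mathbb{R}_+\to\mathbb{R}_+$ such that $\lim_{x\to\infty}\phi(x)/x=\infty$ and
\[ K := \sup_{f\in\Lambda}\mu(\phi(|f|)) < \infty \tag{21.36} \]
then
\[ \lim_{M\to\infty}\sup_{f\in\Lambda}\mu\left(|f|1_{|f|\ge M}\right) = 0. \tag{21.37} \]
2. Conversely if Eq. (21.37) holds, there exists a non-decreasing continuous function $\phi:\mathbb{R}_+\to\mathbb{R}_+$ such that $\phi(0)=0$, $\lim_{x\to\infty}\phi(x)/x=\infty$ and Eq. (21.36) is valid.

Proof. 1. Let $\phi$ be as in item 1. above and set $\varepsilon_M:=\sup_{x\ge M}\frac{x}{\phi(x)}\to 0$ as $M\to\infty$ by assumption. Then for $f\in\Lambda$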
p
-spaces
([f[ : [f[ M) = (
[f[
([f[)
([f[) : [f[ M)
M
(([f[) : [f[ M)

M
(([f[)) K
M
and hence
lim
M
sup
f
_
[f[ 1
]f]M
_
lim
M
K
M
= 0.
2. By assumption,
M
:= sup
f
_
[f[ 1
]f]M
_
0 as M . Therefore
we may choose M
n
such that
n=0
(n + 1)
Mn
<
where by convention M
0
:= 0. Now dene so that (0) = 0 and
t
(x) =
n=0
(n + 1) 1
(Mn,Mn+1]
(x),
i.e.
(x) =
_
x
0
t
(y)dy =
n=0
(n + 1) (x M
n+1
x M
n
) .
By construction is continuous, (0) = 0,
t
(x) is increasing (so is convex)
and
t
(x) (n + 1) for x M
n
. In particular
(x)
x

(M
n
) + (n + 1)x
x
n + 1 for x M
n
from which we conclude lim
x
(x)/x = . We also have
t
(x) (n + 1)
on [0, M
n+1
] and therefore
(x) (n + 1)x for x M
n+1
.
So for f ,
(([f[)) =
n=0
_
([f[)1
(Mn,Mn+1]
([f[)
_
n=0
(n + 1)
_
[f[ 1
(Mn,Mn+1]
([f[)
_
n=0
(n + 1)
_
[f[ 1
]f]Mn
_
n=0
(n + 1)
Mn
and hence
sup
f
(([f[))
n=0
(n + 1)
Mn
< .
Theorem 21.37 (Vitali Convergence Theorem). (Folland 6.15) Suppose that $1\le p<\infty$. A sequence $\{f_n\}\subset L^p$ is Cauchy iff
1. $\{f_n\}$ is $L^0$-Cauchy,
2. $\{|f_n|^p\}$ is uniformly integrable.
3. For all $\varepsilon>0$, there exists a set $E\in\mathcal{M}$ such that $\mu(E)<\infty$ and $\int_{E^c}|f_n|^p\,d\mu<\varepsilon$ for all $n$. (This condition is vacuous when $\mu(X)<\infty$.)

Proof. ($\Longrightarrow$) Suppose $\{f_n\}\subset L^p$ is Cauchy. Then (1) $\{f_n\}$ is $L^0$-Cauchy by Lemma 21.14. (2) By completeness of $L^p$, there exists $f\in L^p$ such that $\|f_n-f\|_p\to 0$ as $n\to\infty$. By the mean value theorem,
\[ \left||f|^p-|f_n|^p\right| \le p\left(\max(|f|,|f_n|)\right)^{p-1}\left||f|-|f_n|\right| \le p(|f|+|f_n|)^{p-1}\left||f|-|f_n|\right| \]
and therefore by Hölder's inequality,
\begin{align*}
\int\left||f|^p-|f_n|^p\right|d\mu &\le p\int(|f|+|f_n|)^{p-1}\left||f|-|f_n|\right|d\mu \le p\int(|f|+|f_n|)^{p-1}|f-f_n|\,d\mu\\
&\le p\|f-f_n\|_p\left\|(|f|+|f_n|)^{p-1}\right\|_q = p\left\||f|+|f_n|\right\|_p^{p/q}\|f-f_n\|_p\\
&\le p(\|f\|_p+\|f_n\|_p)^{p/q}\|f-f_n\|_p
\end{align*}
where $q:=p/(p-1)$. This shows that $\int\left||f|^p-|f_n|^p\right|d\mu\to 0$ as $n\to\infty$.[3] By the remarks prior to Definition 21.32, $\{|f_n|^p\}$ is uniformly integrable. To verify (3), for $M>0$ and $n\in\mathbb{N}$ let $E_M=\{|f|\ge M\}$ and $E_M(n)=\{|f_n|\ge M\}$. Then $\mu(E_M)\le\frac{1}{M^p}\|f\|_p^p<\infty$ and by the dominated convergence theorem,
\[ \int_{E_M^c}|f|^p\,d\mu = \int|f|^p1_{|f|<M}\,d\mu \to 0 \text{ as } M\to 0. \]
Moreover,
\[ \left\|f_n1_{E_M^c}\right\|_p \le \left\|f1_{E_M^c}\right\|_p+\left\|(f_n-f)1_{E_M^c}\right\|_p \le \left\|f1_{E_M^c}\right\|_p+\|f_n-f\|_p. \tag{21.38} \]
So given $\varepsilon>0$, choose $N$ sufficiently large such that for all $n\ge N$, $\|f-f_n\|_p^p<\varepsilon$. Then choose $M$ sufficiently small such that $\int_{E_M^c}|f|^p\,d\mu<\varepsilon$ and $\int_{E_M^c(n)}|f_n|^p\,d\mu<\varepsilon$ for all $n=1,2,\dots,N-1$. Letting $E:=E_M\cup E_M(1)\cup\dots\cup E_M(N-1)$, we have
\[ \mu(E)<\infty,\quad \int_{E^c}|f_n|^p\,d\mu<\varepsilon \text{ for } n\le N-1 \]
and by Eq. (21.38)
[3] Here is an alternative proof. Let $h_n\equiv\left||f_n|^p-|f|^p\right|\le|f_n|^p+|f|^p=:g_n\in L^1$ and $g\equiv 2|f|^p$. Then $g_n\xrightarrow{\mu}g$, $h_n\xrightarrow{\mu}0$ and $\int g_n\to\int g$. Therefore by the dominated convergence theorem in Corollary 21.17, $\lim_{n\to\infty}\int h_n\,d\mu=0$.
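\[ \int_{E^c}|f_n|^p\,d\mu < \left(\varepsilon^{1/p}+\varepsilon^{1/p}\right)^p \le 2^p\varepsilon \text{ for } n\ge N. \]
Therefore we have found $E\in\mathcal{M}$ such that $\mu(E)<\infty$ and
\[ \sup_n\int_{E^c}|f_n|^p\,d\mu \le 2^p\varepsilon \]
which verifies (3) since $\varepsilon>0$ was arbitrary.

($\Longleftarrow$) Now suppose $\{f_n\}\subset L^p$ satisfies conditions (1) - (3). Let $\varepsilon>0$, $E$ be as in (3) and
\[ A_{mn} := \{x\in E : |f_m(x)-f_n(x)|\ge\varepsilon\}. \]
Then
\[ \|(f_n-f_m)1_{E^c}\|_p \le \|f_n1_{E^c}\|_p+\|f_m1_{E^c}\|_p < 2\varepsilon^{1/p} \]
and
\begin{align*}
\|f_n-f_m\|_p &\le \|(f_n-f_m)1_{E^c}\|_p+\|(f_n-f_m)1_{E\setminus A_{mn}}\|_p+\|(f_n-f_m)1_{A_{mn}}\|_p\\
&\le \|(f_n-f_m)1_{E\setminus A_{mn}}\|_p+\|(f_n-f_m)1_{A_{mn}}\|_p+2\varepsilon^{1/p}. \tag{21.39}
\end{align*}
Using properties (1) and (3) and $1_{E\cap\{|f_m-f_n|<\varepsilon\}}|f_m-f_n|^p\le\varepsilon^p1_E\in L^1$, the dominated convergence theorem in Corollary 21.17 implies
\[ \|(f_n-f_m)1_{E\setminus A_{mn}}\|_p^p = \int 1_{E\cap\{|f_m-f_n|<\varepsilon\}}|f_m-f_n|^p \to 0 \text{ as } m,n\to\infty, \]
which combined with Eq. (21.39) implies
\[ \limsup_{m,n\to\infty}\|f_n-f_m\|_p \le \limsup_{m,n\to\infty}\|(f_n-f_m)1_{A_{mn}}\|_p+2\varepsilon^{1/p}. \]
Finally
\[ \|(f_n-f_m)1_{A_{mn}}\|_p \le \|f_n1_{A_{mn}}\|_p+\|f_m1_{A_{mn}}\|_p \le 2\delta(\varepsilon) \]
where
\[ \delta(\varepsilon) := \sup_n\sup\{\|f_n1_E\|_p : E\in\mathcal{M} \ni \mu(E)\le\varepsilon\}. \]
By property (2), $\delta(\varepsilon)\to 0$ as $\varepsilon\to 0$. Therefore
\[ \limsup_{m,n\to\infty}\|f_n-f_m\|_p \le 2\varepsilon^{1/p}+0+2\delta(\varepsilon) \to 0 \text{ as } \varepsilon\downarrow 0 \]
and therefore $\{f_n\}$ is $L^p$-Cauchy.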
Here is another version of Vitali's Convergence Theorem.

Theorem 21.38 (Vitali Convergence Theorem). (This is problem 9 on p. 133 in Rudin.) Assume that $\mu(X)<\infty$, $\{f_n\}$ is uniformly integrable, $f_n\to f$ a.e. and $|f|<\infty$ a.e., then $f\in L^1(\mu)$ and $f_n\to f$ in $L^1(\mu)$.

Proof. Let $\varepsilon>0$ be given and choose $\delta>0$ as in Eq. (21.33). Now use Egoroff's Theorem 21.18 to choose a set $E^c$ where $\{f_n\}$ converges uniformly on $E^c$ and $\mu(E)<\delta$. By uniform convergence on $E^c$, there is an integer $N<\infty$ such that $|f_n-f_m|\le 1$ on $E^c$ for all $m,n\ge N$. Letting $m\to\infty$, we learn that
\[ |f_N-f| \le 1 \text{ on } E^c. \]
Therefore $|f|\le|f_N|+1$ on $E^c$ and hence
\begin{align*}
\mu(|f|) &= \mu(|f|:E^c)+\mu(|f|:E)\\
&\le \mu(|f_N|)+\mu(X)+\mu(|f|:E).
\end{align*}
Now by Fatou's lemma,
\[ \mu(|f|:E) \le \liminf_{n\to\infty}\mu(|f_n|:E) \le 2\varepsilon < \infty \]
by Eq. (21.33). This shows that $f\in L^1$. Finally
\begin{align*}
\mu(|f-f_n|) &= \mu(|f-f_n|:E^c)+\mu(|f-f_n|:E)\\
&\le \mu(|f-f_n|:E^c)+\mu(|f|+|f_n|:E)\\
&\le \mu(|f-f_n|:E^c)+4\varepsilon
\end{align*}
and so by the dominated convergence theorem we learn that
\[ \limsup_{n\to\infty}\mu(|f-f_n|) \le 4\varepsilon. \]
Since $\varepsilon>0$ was arbitrary this completes the proof.
Theorem 21.39 (Vitali again). Suppose that $f_n\to f$ in $\mu$-measure and Eq. (21.35) holds, then $f_n\to f$ in $L^1$.

Proof. This could of course be proved using Theorem 21.38 after passing to subsequences to get $\{f_n\}$ to converge a.e. However I wish to give another proof. First off, by Fatou's lemma, $f\in L^1(\mu)$. Now let
\[ \phi_K(x) = x1_{|x|\le K}+K\operatorname{sgn}(x)1_{|x|>K}. \]
Then $\phi_K(f_n)\xrightarrow{\mu}\phi_K(f)$ because $|\phi_K(f)-\phi_K(f_n)|\le|f-f_n|$ and since
\[ |f-f_n| \le |f-\phi_K(f)|+|\phi_K(f)-\phi_K(f_n)|+|\phi_K(f_n)-f_n| \]
we have that
\begin{align*}
\mu|f-f_n| &\le \mu|f-\phi_K(f)|+\mu|\phi_K(f)-\phi_K(f_n)|+\mu|\phi_K(f_n)-f_n|\\
&\le \mu(|f|:|f|\ge K)+\mu|\phi_K(f)-\phi_K(f_n)|+\mu(|f_n|:|f_n|\ge K).
\end{align*}
Therefore by the dominated convergence theorem
\[ \limsup_{n\to\infty}\mu|f-f_n| \le \mu(|f|:|f|\ge K)+\limsup_{n\to\infty}\mu(|f_n|:|f_n|\ge K). \]
This last expression goes to zero as $K\to\infty$ by uniform integrability.
21.6 Exercises
Definition 21.40. The essential range of $f$, $\operatorname{essran}(f)$, consists of those $\lambda\in\mathbb{C}$ such that $\mu(|f-\lambda|<\varepsilon)>0$ for all $\varepsilon>0$.

Definition 21.41. Let $(X,\tau)$ be a topological space and $\nu$ be a measure on $\mathcal{B}_X=\sigma(\tau)$. The support of $\nu$, $\operatorname{supp}(\nu)$, consists of those $x\in X$ such that $\nu(V)>0$ for all open neighborhoods, $V$, of $x$.
Exercise 21.3. Let (X, ) be a second countable topological space and be
a measure on B
X
the Borel algebra on X. Show
1. supp() is a closed set. (This is actually true on all topological spaces.)
2. (X supp()) = 0 and use this to conclude that W := X supp()
is the largest open set in X such that (W) = 0. Hint: let | be
a countable base for the topology . Show that W may be written as a
union of elements from V 1 with the property that (V ) = 0.
Exercise 21.4. Prove the following facts about essran(f).
1. Let = f
:= f
1
a Borel measure on C. Show essran(f) = supp().
2. essran(f) is a closed set and f(x) essran(f) for almost every x, i.e.
(f / essran(f)) = 0.
3. If F C is a closed set such that f(x) F for almost every x then
essran(f) F. So essran(f) is the smallest closed set F such that f(x) F
for almost every x.
4. |f|
= sup[[ : essran(f) .
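Exercise 21.5. Let $f \in L^p \cap L^\infty$ for some $p < \infty$. Show $\|f\|_\infty = \lim_{q \to \infty} \|f\|_q$. If we further assume $\mu(X) < \infty$, show $\|f\|_\infty = \lim_{q \to \infty} \|f\|_q$ for all measurable functions $f : X \to \mathbb{C}$. In particular, $f \in L^\infty$ iff $\lim_{q \to \infty} \|f\|_q < \infty$. Hints: Use Corollary 21.23 to show $\limsup_{q \to \infty} \|f\|_q \le \|f\|_\infty$ and to show $\liminf_{q \to \infty} \|f\|_q \ge \|f\|_\infty$, let $M < \|f\|_\infty$ and make use of Chebyshev's inequality.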
Exercise 21.6. Prove Eq. (21.22) in Corollary 21.23. (Part of Folland 6.3 on p. 186.) Hint: Use the inequality, with $a, b \ge 1$ with $a^{-1} + b^{-1} = 1$ chosen appropriately,
\[ st \le \frac{s^a}{a} + \frac{t^b}{b}, \]
(see Lemma 5.5 for Eq. (21.17)) applied to the right side of Eq. (21.21).
Exercise 21.7. Complete the proof of Proposition 21.22 by showing $(L^p + L^r, \|\cdot\|)$ is a Banach space. Hint: you may find using Theorem 7.13 is helpful here.
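Exercise 21.9. By making the change of variables, $u = \ln x$, prove the following facts:
\[ \int_0^{1/2} x^{-a} |\ln x|^b \, dx < \infty \iff a < 1 \text{ or } a = 1 \text{ and } b < -1 \]
\[ \int_2^{\infty} x^{-a} |\ln x|^b \, dx < \infty \iff a > 1 \text{ or } a = 1 \text{ and } b < -1 \]
\[ \int_0^{1} x^{-a} |\ln x|^b \, dx < \infty \iff a < 1 \text{ and } b > -1 \]
\[ \int_1^{\infty} x^{-a} |\ln x|^b \, dx < \infty \iff a > 1 \text{ and } b > -1. \]
Suppose $0 < p_0 < p_1 \le \infty$ and $m$ is Lebesgue measure on $(0, \infty)$. Use the above results to manufacture a function $f$ on $(0, \infty)$ such that $f \in L^p((0, \infty), m)$ iff (a) $p \in (p_0, p_1)$, (b) $p \in [p_0, p_1]$ and (c) $p = p_0$.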
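As a sketch of how the substitution plays out for the first integral, assume the integrand is $x^{-a}|\ln x|^b$ (the sign convention on the exponent is an assumption, since the exponents are hard to read in this copy) and substitute $u = -\ln x$ so that $u > 0$ on $(0, 1/2)$:

```latex
% Substitution u = -\ln x on (0, 1/2): x = e^{-u}, dx = -e^{-u}\,du,
% and x \downarrow 0 corresponds to u \uparrow \infty.
\[
\int_0^{1/2} x^{-a} |\ln x|^{b} \, dx
  = \int_{\ln 2}^{\infty} e^{a u}\, u^{b}\, e^{-u} \, du
  = \int_{\ln 2}^{\infty} e^{-(1-a)u}\, u^{b} \, du .
\]
% If a < 1 the exponential decay wins and the integral is finite for every b.
% If a = 1 the integrand is u^b, which is integrable at infinity iff b < -1.
% If a > 1 the integrand grows exponentially and the integral diverges.
```

The other three cases follow the same pattern; on $(0,1)$ and $(1,\infty)$ the extra condition $b > -1$ comes from the behavior of $|\ln x|^b$ near $x = 1$.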
Exercise 21.11. Folland 6.10 on p. 186. Use the strong form of Theorem
19.38.
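Exercise 21.12. Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be $\sigma$-finite measure spaces, $f \in L^2(\nu)$ and $k \in L^2(\mu \otimes \nu)$. Show
\[ \int |k(x, y) f(y)| \, d\nu(y) < \infty \text{ for } \mu\text{-a.e. } x. \]
Let $Kf(x) := \int_Y k(x, y) f(y) \, d\nu(y)$ when the integral is defined. Show $Kf \in L^2(\mu)$ and $K : L^2(\nu) \to L^2(\mu)$ is a bounded operator with $\|K\|_{op} \le \|k\|_{L^2(\mu \otimes \nu)}$.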
Exercise 21.13. Folland 6.27 on p. 196. Hint: Theorem 21.28.
22
Approximation Theorems and Convolutions
22.1 Density Theorems
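In this section, $(X, \mathcal{M}, \mu)$ will be a measure space and $\mathcal{A}$ will be a subalgebra of $\mathcal{M}$.
Notation 22.1 Suppose $(X, \mathcal{M}, \mu)$ is a measure space and $\mathcal{A} \subset \mathcal{M}$ is a subalgebra of $\mathcal{M}$. Let $S(\mathcal{A})$ denote those simple functions $\phi : X \to \mathbb{C}$ such that $\phi^{-1}(\{\lambda\}) \in \mathcal{A}$ for all $\lambda \in \mathbb{C}$ and let $S_f(\mathcal{A}, \mu)$ denote those $\phi \in S(\mathcal{A})$ such that $\mu(\phi \ne 0) < \infty$.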
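Remark 22.2. For $\phi \in S_f(\mathcal{A}, \mu)$ and $p \in [1, \infty)$, $|\phi|^p = \sum_{z \ne 0} |z|^p 1_{\{\phi = z\}}$ and hence
\[ \int |\phi|^p \, d\mu = \sum_{z \ne 0} |z|^p \mu(\phi = z) < \infty \tag{22.1} \]
so that $S_f(\mathcal{A}, \mu) \subset L^p(\mu)$. Conversely if $\phi \in S(\mathcal{A}) \cap L^p(\mu)$, then from Eq. (22.1) it follows that $\mu(\phi = z) < \infty$ for all $z \ne 0$ and therefore $\mu(\phi \ne 0) < \infty$. Hence we have shown, for any $1 \le p < \infty$,
\[ S_f(\mathcal{A}, \mu) = S(\mathcal{A}) \cap L^p(\mu). \]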
Lemma 22.3 (Simple Functions are Dense). The simple functions, $S_f(\mathcal{M}, \mu)$, form a dense subspace of $L^p(\mu)$ for all $1 \le p < \infty$.
Proof. Let $\{\phi_n\}_{n=1}^\infty$ be the simple functions in the approximation Theorem 18.42. Since $|\phi_n| \le |f|$ for all $n$, $\phi_n \in S_f(\mathcal{M}, \mu)$ and
\[ |f - \phi_n|^p \le (|f| + |\phi_n|)^p \le 2^p |f|^p \in L^1(\mu). \]
Therefore, by the dominated convergence theorem,
\[ \lim_{n \to \infty} \int |f - \phi_n|^p \, d\mu = \int \lim_{n \to \infty} |f - \phi_n|^p \, d\mu = 0. \]
The goal of this section is to find a number of other dense subspaces of $L^p(\mu)$ for $p \in [1, \infty)$. The next theorem is the key result of this section.
Theorem 22.4 (Density Theorem). Let $p \in [1, \infty)$, $(X, \mathcal{M}, \mu)$ be a measure space and $M$ be an algebra of bounded $\mathbb{F}$-valued ($\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$) measurable functions such that
1. $M \subset L^p(\mu, \mathbb{F})$ and $\sigma(M) = \mathcal{M}$.
2. There exists $\psi_k \in M$ such that $\psi_k \to 1$ boundedly.
3. If $\mathbb{F} = \mathbb{C}$ we further assume that $M$ is closed under complex conjugation.
Then to every function $f \in L^p(\mu, \mathbb{F})$, there exist $\phi_n \in M$ such that $\lim_{n \to \infty} \|f - \phi_n\|_{L^p(\mu)} = 0$, i.e. $M$ is dense in $L^p(\mu, \mathbb{F})$.
Proof. Fix $k \in \mathbb{N}$ for the moment and let $H$ denote those bounded $\mathcal{M}$-measurable functions, $f : X \to \mathbb{F}$, for which there exists $\{\phi_n\}_{n=1}^\infty \subset M$ such that $\lim_{n \to \infty} \|\psi_k f - \phi_n\|_{L^p(\mu)} = 0$. A routine check shows $H$ is a subspace of $\ell^\infty(\mathcal{M}, \mathbb{F})$ such that $1 \in H$, $M \subset H$ and $H$ is closed under complex conjugation if $\mathbb{F} = \mathbb{C}$. Moreover, $H$ is closed under bounded convergence. To see this suppose $f_n \in H$ and $f_n \to f$ boundedly. Then, by the dominated convergence theorem, $\lim_{n \to \infty} \|\psi_k (f - f_n)\|_{L^p(\mu)} = 0$.$^1$ (Take the dominating function to be $g = [2C|\psi_k|]^p$ where $C$ is a constant bounding all of the $\{|f_n|\}_{n=1}^\infty$.) We may now choose $\phi_n \in M$ such that $\|\phi_n - \psi_k f_n\|_{L^p(\mu)} \le \frac{1}{n}$, then
\[ \limsup_{n \to \infty} \|\psi_k f - \phi_n\|_{L^p(\mu)} \le \limsup_{n \to \infty} \|\psi_k (f - f_n)\|_{L^p(\mu)} + \limsup_{n \to \infty} \|\psi_k f_n - \phi_n\|_{L^p(\mu)} = 0 \tag{22.2} \]
which implies $f \in H$. An application of Dynkin's Multiplicative System Theorem 18.51 if $\mathbb{F} = \mathbb{R}$ or Theorem 18.52 if $\mathbb{F} = \mathbb{C}$ now shows $H$ contains all bounded measurable functions on $X$.
Let $f \in L^p(\mu)$ be given. The dominated convergence theorem implies $\lim_{k \to \infty} \|\psi_k 1_{\{|f| \le k\}} f - f\|_{L^p(\mu)} = 0$. (Take the dominating function to be $g = [2C|f|]^p$ where $C$ is a bound on all of the $|\psi_k|$.) Using this and what we have just proved, there exists $\phi_k \in M$ such that
\[ \|\psi_k 1_{\{|f| \le k\}} f - \phi_k\|_{L^p(\mu)} \le \frac{1}{k}. \]
The same line of reasoning used in Eq. (22.2) now implies $\lim_{k \to \infty} \|f - \phi_k\|_{L^p(\mu)} = 0$.
$^1$ It is at this point that the proof would break down if $p = \infty$.
Definition 22.5. Let $(X, \tau)$ be a topological space and $\mu$ be a measure on $\mathcal{B}_X = \sigma(\tau)$. A locally integrable function is a Borel measurable function $f : X \to \mathbb{C}$ such that $\int_K |f| \, d\mu < \infty$ for all compact subsets $K \subset X$. We will write $L^1_{loc}(\mu)$ for the space of locally integrable functions. More generally we say $f \in L^p_{loc}(\mu)$ iff $\|1_K f\|_{L^p(\mu)} < \infty$ for all compact subsets $K \subset X$.
Definition 22.6. Let $(X, \tau)$ be a topological space. A $K$-finite measure on $X$ is a Borel measure $\mu$ such that $\mu(K) < \infty$ for all compact subsets $K \subset X$.
Lebesgue measure on $\mathbb{R}$ is an example of a $K$-finite measure while counting measure on $\mathbb{R}$ is not a $K$-finite measure.
Example 22.7. Suppose that $\mu$ is a $K$-finite measure on $\mathcal{B}_{\mathbb{R}^d}$. An application of Theorem 22.4 shows $C_c(\mathbb{R}^d, \mathbb{C})$ is dense in $L^p(\mathbb{R}^d, \mathcal{B}_{\mathbb{R}^d}, \mu; \mathbb{C})$. To apply Theorem 22.4, let $M := C_c(\mathbb{R}^d, \mathbb{C})$ and $\psi_k(x) := \psi(x/k)$ where $\psi \in C_c(\mathbb{R}^d, \mathbb{C})$ with $\psi(x) = 1$ in a neighborhood of $0$. The proof is completed by showing $\sigma(M) = \sigma(C_c(\mathbb{R}^d, \mathbb{C})) = \mathcal{B}_{\mathbb{R}^d}$, which follows directly from Lemma 18.57.
We may also give a more down to earth proof as follows. Let $x_0 \in \mathbb{R}^d$, $R > 0$, $A := B(x_0, R)^c$ and $f_n(x) := d_A^{1/n}(x)$. Then $f_n \in M$ and $f_n \to 1_{B(x_0, R)}$ as $n \to \infty$ which shows $1_{B(x_0, R)}$ is $\sigma(M)$-measurable, i.e. $B(x_0, R) \in \sigma(M)$. Since $x_0 \in \mathbb{R}^d$ and $R > 0$ were arbitrary, $\sigma(M) = \mathcal{B}_{\mathbb{R}^d}$.
More generally we have the following result.
Theorem 22.8. Let $(X, \tau)$ be a second countable locally compact Hausdorff space and $\mu : \mathcal{B}_X \to [0, \infty]$ be a $K$-finite measure. Then $C_c(X)$ (the space of continuous functions with compact support) is dense in $L^p(\mu)$ for all $p \in [1, \infty)$. (See also Proposition 28.23 below.)
Proof. Let $M := C_c(X)$ and use Item 3. of Lemma 18.57 to find functions $\psi_k \in M$ such that $\psi_k \to 1$ boundedly as $k \to \infty$. The result now follows from an application of Theorem 22.4 along with the aid of item 4. of Lemma 18.57.
Exercise 22.1. Show that $BC(\mathbb{R}, \mathbb{C})$ is not dense in $L^\infty(\mathbb{R}, \mathcal{B}_{\mathbb{R}}, m; \mathbb{C})$. Hence the hypothesis that $p < \infty$ in Theorem 22.4 can not be removed.
Corollary 22.9. Suppose $X \subset \mathbb{R}^n$ is an open set, $\mathcal{B}_X$ is the Borel $\sigma$-algebra on $X$ and $\mu$ is a $K$-finite measure on $(X, \mathcal{B}_X)$. Then $C_c(X)$ is dense in $L^p(\mu)$ for all $p \in [1, \infty)$.
Corollary 22.10. Suppose that $X$ is a compact subset of $\mathbb{R}^n$ and $\mu$ is a finite measure on $(X, \mathcal{B}_X)$, then polynomials are dense in $L^p(X, \mu)$ for all $1 \le p < \infty$.
Proof. Consider $X$ to be a metric space with the usual metric induced from $\mathbb{R}^n$. Then $X$ is a locally compact separable metric space and therefore $C_c(X, \mathbb{C}) = C(X, \mathbb{C})$ is dense in $L^p(\mu)$ for all $p \in [1, \infty)$. Since, by the dominated convergence theorem, uniform convergence implies $L^p(\mu)$ convergence, it follows from the Weierstrass approximation theorem (see Theorem 10.35 and Corollary 10.37 or Theorem 15.31 and Corollary 15.32) that polynomials are also dense in $L^p(\mu)$.
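Lemma 22.11. Let $(X, \tau)$ be a second countable locally compact Hausdorff space and $\mu : \mathcal{B}_X \to [0, \infty]$ be a $K$-finite measure on $X$. If $h \in L^1_{loc}(\mu)$ is a function such that
\[ \int_X f h \, d\mu = 0 \text{ for all } f \in C_c(X) \tag{22.3} \]
then $h(x) = 0$ for $\mu$-a.e. $x$. (See also Corollary 28.26 below.)
Proof. Let $d\nu(x) = |h(x)| \, d\mu(x)$, then $\nu$ is a $K$-finite measure on $X$ and hence $C_c(X)$ is dense in $L^1(\nu)$ by Theorem 22.8. Notice that
\[ \int_X f \cdot \operatorname{sgn}(h) \, d\nu = \int_X f h \, d\mu = 0 \text{ for all } f \in C_c(X). \tag{22.4} \]
Let $\{K_k\}_{k=1}^\infty$ be a sequence of compact sets such that $K_k \uparrow X$ as in Lemma 14.23. Then $1_{K_k} \overline{\operatorname{sgn}(h)} \in L^1(\nu)$ and therefore there exists $f_m \in C_c(X)$ such that $f_m \to 1_{K_k} \overline{\operatorname{sgn}(h)}$ in $L^1(\nu)$. So by Eq. (22.4),
\[ \nu(K_k) = \int_X 1_{K_k} \, d\nu = \lim_{m \to \infty} \int_X f_m \operatorname{sgn}(h) \, d\nu = 0. \]
Since $K_k \uparrow X$ as $k \to \infty$, $0 = \nu(X) = \int_X |h| \, d\mu$, i.e. $h(x) = 0$ for $\mu$-a.e. $x$.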
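As an application of Lemma 22.11 and Example 15.34, we will show that the Laplace transform is injective.
Theorem 22.12 (Injectivity of the Laplace Transform). For $f \in L^1([0, \infty), dx)$, the Laplace transform of $f$ is defined by
\[ \mathcal{L}f(\lambda) := \int_0^\infty e^{-\lambda x} f(x) \, dx \text{ for all } \lambda > 0. \]
If $\mathcal{L}f(\lambda) \equiv 0$ then $f(x) = 0$ for $m$-a.e. $x$.
Proof. Suppose that $f \in L^1([0, \infty), dx)$ is such that $\mathcal{L}f(\lambda) \equiv 0$. Let $g \in C_0([0, \infty), \mathbb{R})$ and $\varepsilon > 0$ be given. By Example 15.34 we may choose $\{a_\lambda\}_{\lambda > 0}$ such that $\#(\{\lambda > 0 : a_\lambda \ne 0\}) < \infty$ and
\[ \Big| g(x) - \sum_{\lambda > 0} a_\lambda e^{-\lambda x} \Big| < \varepsilon \text{ for all } x \ge 0. \]
Then
\[ \Big| \int_0^\infty g(x) f(x) \, dx \Big| = \Big| \int_0^\infty \Big( g(x) - \sum_{\lambda > 0} a_\lambda e^{-\lambda x} \Big) f(x) \, dx \Big| \le \int_0^\infty \Big| g(x) - \sum_{\lambda > 0} a_\lambda e^{-\lambda x} \Big| \, |f(x)| \, dx \le \varepsilon \|f\|_1. \]
Since $\varepsilon > 0$ is arbitrary, it follows that $\int_0^\infty g(x) f(x) \, dx = 0$ for all $g \in C_0([0, \infty), \mathbb{R})$. The proof is finished by an application of Lemma 22.11.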
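The first equality in the estimate of the proof uses only the hypothesis that the Laplace transform of $f$ vanishes identically. A short sketch of that step, with $a_\lambda$ the finitely many nonzero coefficients chosen in the proof:

```latex
% Each exponential, integrated against f, is a value of the Laplace transform.
\[
\int_0^\infty \Big( \sum_{\lambda > 0} a_\lambda e^{-\lambda x} \Big) f(x)\,dx
  = \sum_{\lambda > 0} a_\lambda \int_0^\infty e^{-\lambda x} f(x)\,dx
  = \sum_{\lambda > 0} a_\lambda \, \mathcal{L}f(\lambda)
  = 0 .
\]
% The sum is finite, so exchanging it with the integral needs no limit theorem.
% Subtracting this zero from \int_0^\infty g f\,dx gives the stated equality.
```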
Here is another variant of Theorem 22.8.
Theorem 22.13. Let $(X, d)$ be a metric space, $\tau_d$ be the topology on $X$ generated by $d$ and $\mathcal{B}_X = \sigma(\tau_d)$ be the Borel $\sigma$-algebra. Suppose $\mu : \mathcal{B}_X \to [0, \infty]$ is a measure which is $\sigma$-finite on $\tau_d$ and let $BC_f(X)$ denote the bounded continuous functions on $X$ such that $\mu(f \ne 0) < \infty$. Then $BC_f(X)$ is a dense subspace of $L^p(\mu)$ for any $p \in [1, \infty)$.
Proof. Let $X_k \in \tau_d$ be open sets such that $X_k \uparrow X$ and $\mu(X_k) < \infty$ and let
\[ \psi_k(x) = \min(1, k \cdot d_{X_k^c}(x)) = \phi_k(d_{X_k^c}(x)), \]
see Figure 22.1 below.
Fig. 22.1. The plot of $\phi_n$ for $n = 1$, $2$, and $4$. Notice that $\phi_n \to 1_{(0, \infty)}$.
It is easily verified that $M := BC_f(X)$ is an algebra, $\psi_k \in M$ for all $k$ and $\psi_k \to 1$ boundedly as $k \to \infty$. Given $V \in \tau_d$ and $k, n \in \mathbb{N}$, let
\[ f_{k,n}(x) := \min(1, n \cdot d_{(V \cap X_k)^c}(x)). \]
Then $\{f_{k,n} \ne 0\} = V \cap X_k$ so $f_{k,n} \in BC_f(X)$. Moreover
\[ \lim_{k \to \infty} \lim_{n \to \infty} f_{k,n} = \lim_{k \to \infty} 1_{V \cap X_k} = 1_V \]
which shows $V \in \sigma(M)$ and hence $\sigma(M) = \mathcal{B}_X$. The proof is now completed by an application of Theorem 22.4.
Exercise 22.2. (BRUCE: Should drop this exercise.) Suppose that $(X, d)$ is a metric space, $\mu$ is a measure on $\mathcal{B}_X := \sigma(\tau_d)$ which is finite on bounded measurable subsets of $X$. Show $BC_b(X, \mathbb{R})$, defined in Eq. (19.26), is dense in $L^p(\mu)$. Hints: let $\psi_k$ be as defined in Eq. (19.27) which incidentally may be used to show $\sigma(BC_b(X, \mathbb{R})) = \sigma(BC(X, \mathbb{R}))$. Then use the argument in the proof of Corollary 18.55 to show $\sigma(BC(X, \mathbb{R})) = \mathcal{B}_X$.
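Theorem 22.14. Suppose $p \in [1, \infty)$, $\mathcal{A} \subset \mathcal{M}$ is an algebra such that $\sigma(\mathcal{A}) = \mathcal{M}$ and $\mu$ is $\sigma$-finite on $\mathcal{A}$. Then $S_f(\mathcal{A}, \mu)$ is dense in $L^p(\mu)$. (See also Remark 28.7 below.)
Proof. Let $M := S_f(\mathcal{A}, \mu)$. By assumption there exists $X_k \in \mathcal{A}$ such that $\mu(X_k) < \infty$ and $X_k \uparrow X$ as $k \to \infty$. If $A \in \mathcal{A}$, then $X_k \cap A \in \mathcal{A}$ and $\mu(X_k \cap A) < \infty$ so that $1_{X_k \cap A} \in M$. Therefore $1_A = \lim_{k \to \infty} 1_{X_k \cap A}$ is $\sigma(M)$-measurable for every $A \in \mathcal{A}$. So we have shown that $\mathcal{A} \subset \sigma(M) \subset \mathcal{M}$ and therefore $\mathcal{M} = \sigma(\mathcal{A}) \subset \sigma(M) \subset \mathcal{M}$, i.e. $\sigma(M) = \mathcal{M}$. The theorem now follows from Theorem 22.4 after observing $\psi_k := 1_{X_k} \in M$ and $\psi_k \to 1$ boundedly.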
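Theorem 22.15 (Separability of $L^p$-Spaces). Suppose, $p \in [1, \infty)$, $\mathcal{A} \subset \mathcal{M}$ is a countable algebra such that $\sigma(\mathcal{A}) = \mathcal{M}$ and $\mu$ is $\sigma$-finite on $\mathcal{A}$. Then $L^p(\mu)$ is separable and
\[ \mathbb{D} = \Big\{ \sum a_j 1_{A_j} : a_j \in \mathbb{Q} + i\mathbb{Q}, \ A_j \in \mathcal{A} \text{ with } \mu(A_j) < \infty \Big\} \]
is a countable dense subset.
Proof. It is left to the reader to check $\mathbb{D}$ is dense in $S_f(\mathcal{A}, \mu)$ relative to the $L^p(\mu)$ norm. The proof is then complete since $S_f(\mathcal{A}, \mu)$ is a dense subspace of $L^p(\mu)$ by Theorem 22.14.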
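Example 22.16. The collection of functions of the form $\phi = \sum_{k=1}^n c_k 1_{(a_k, b_k]}$ with $a_k, b_k \in \mathbb{Q}$ and $a_k < b_k$ are dense in $L^p(\mathbb{R}, \mathcal{B}_{\mathbb{R}}, m; \mathbb{C})$ and $L^p(\mathbb{R}, \mathcal{B}_{\mathbb{R}}, m; \mathbb{C})$ is separable for any $p \in [1, \infty)$. To prove this simply apply Theorem 22.14 with $\mathcal{A}$ being the algebra on $\mathbb{R}$ generated by the half open intervals $(a, b] \cap \mathbb{R}$ with $a < b$ and $a, b \in \mathbb{Q} \cup \{\pm\infty\}$, i.e. $\mathcal{A}$ consists of sets of the form $\coprod_{k=1}^n (a_k, b_k] \cap \mathbb{R}$, where $a_k, b_k \in \mathbb{Q} \cup \{\pm\infty\}$.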
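Exercise 22.3. Show $L^\infty([0, 1], \mathcal{B}_{\mathbb{R}}, m; \mathbb{C})$ is not separable. Hint: Suppose $\Gamma$ is a dense subset of $L^\infty([0, 1], \mathcal{B}_{\mathbb{R}}, m; \mathbb{C})$ and for $\lambda \in (0, 1)$, let $f_\lambda(x) := 1_{[0, \lambda]}(x)$. For each $\lambda \in (0, 1)$, choose $g_\lambda \in \Gamma$ such that $\|f_\lambda - g_\lambda\|_\infty < 1/2$ and then show the map $\lambda \in (0, 1) \to g_\lambda \in \Gamma$ is injective. Use this to conclude that $\Gamma$ must be uncountable.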
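Corollary 22.17 (Riemann Lebesgue Lemma). Suppose that $f \in L^1(\mathbb{R}, m)$, then
\[ \lim_{\lambda \to \pm\infty} \int_{\mathbb{R}} f(x) e^{i\lambda x} \, dm(x) = 0. \]
Proof. By Example 22.16, given $\varepsilon > 0$ there exists $\phi = \sum_{k=1}^n c_k 1_{(a_k, b_k]}$ with $a_k, b_k \in \mathbb{R}$ such that
\[ \int_{\mathbb{R}} |f - \phi| \, dm < \varepsilon. \]
Notice that
\[ \int_{\mathbb{R}} \phi(x) e^{i\lambda x} \, dm(x) = \int_{\mathbb{R}} \sum_{k=1}^n c_k 1_{(a_k, b_k]}(x) e^{i\lambda x} \, dm(x) = \sum_{k=1}^n c_k \int_{a_k}^{b_k} e^{i\lambda x} \, dm(x) \]
\[ = \sum_{k=1}^n c_k (i\lambda)^{-1} e^{i\lambda x} \Big|_{a_k}^{b_k} = (i\lambda)^{-1} \sum_{k=1}^n c_k \big( e^{i\lambda b_k} - e^{i\lambda a_k} \big) \to 0 \text{ as } |\lambda| \to \infty. \]
Combining these two equations with
\[ \Big| \int_{\mathbb{R}} f(x) e^{i\lambda x} \, dm(x) \Big| \le \Big| \int_{\mathbb{R}} (f(x) - \phi(x)) e^{i\lambda x} \, dm(x) \Big| + \Big| \int_{\mathbb{R}} \phi(x) e^{i\lambda x} \, dm(x) \Big| \]
\[ \le \int_{\mathbb{R}} |f - \phi| \, dm + \Big| \int_{\mathbb{R}} \phi(x) e^{i\lambda x} \, dm(x) \Big| \le \varepsilon + \Big| \int_{\mathbb{R}} \phi(x) e^{i\lambda x} \, dm(x) \Big| \]
we learn that
\[ \limsup_{|\lambda| \to \infty} \Big| \int_{\mathbb{R}} f(x) e^{i\lambda x} \, dm(x) \Big| \le \varepsilon + \limsup_{|\lambda| \to \infty} \Big| \int_{\mathbb{R}} \phi(x) e^{i\lambda x} \, dm(x) \Big| = \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, this completes the proof of the Riemann Lebesgue lemma.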
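For a concrete instance of the decay, one may take the single indicator $f = 1_{(0,1]}$ (a choice made here purely for illustration), where the integral can be computed in closed form:

```latex
% Explicit computation for f = 1_{(0,1]} and \lambda \neq 0.
\[
\int_{\mathbb{R}} 1_{(0,1]}(x)\, e^{i\lambda x}\, dm(x)
  = \int_0^1 e^{i\lambda x}\, dx
  = \frac{e^{i\lambda} - 1}{i\lambda},
\qquad
\Big| \frac{e^{i\lambda} - 1}{i\lambda} \Big|
  = \frac{2\,|\sin(\lambda/2)|}{|\lambda|}
  \le \frac{2}{|\lambda|} .
\]
% The modulus uses |e^{i\lambda} - 1| = |e^{i\lambda/2} - e^{-i\lambda/2}| = 2|\sin(\lambda/2)|.
% The bound 2/|\lambda| tends to 0 as |\lambda| \to \infty.
```

The general case in the proof reduces to finite linear combinations of such terms plus an $L^1$-small error.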
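Corollary 22.18. Suppose $\mathcal{A} \subset \mathcal{M}$ is an algebra such that $\sigma(\mathcal{A}) = \mathcal{M}$ and $\mu$ is $\sigma$-finite on $\mathcal{A}$. Then for every $B \in \mathcal{M}$ such that $\mu(B) < \infty$ and $\varepsilon > 0$ there exists $D \in \mathcal{A}$ such that $\mu(B \triangle D) < \varepsilon$. (See also Remark 28.7 below.)
Proof. By Theorem 22.14, there exists a collection, $\{A_i\}_{i=1}^n$, of pairwise disjoint subsets of $\mathcal{A}$ and $\lambda_i \in \mathbb{R}$ such that $\int_X |1_B - f| \, d\mu < \varepsilon$ where $f = \sum_{i=1}^n \lambda_i 1_{A_i}$. Let $A_0 := X \setminus \cup_{i=1}^n A_i \in \mathcal{A}$, then
\[ \int_X |1_B - f| \, d\mu = \sum_{i=0}^n \int_{A_i} |1_B - f| \, d\mu \]
\[ = \mu(A_0 \cap B) + \sum_{i=1}^n \Big[ \int_{A_i \cap B} |1_B - \lambda_i| \, d\mu + \int_{A_i \setminus B} |1_B - \lambda_i| \, d\mu \Big] \]
\[ = \mu(A_0 \cap B) + \sum_{i=1}^n \big[ |1 - \lambda_i| \, \mu(B \cap A_i) + |\lambda_i| \, \mu(A_i \setminus B) \big] \tag{22.5} \]
\[ \ge \mu(A_0 \cap B) + \sum_{i=1}^n \min\{ \mu(B \cap A_i), \mu(A_i \setminus B) \} \tag{22.6} \]
where the last inequality is a consequence of the fact that $1 \le |\lambda_i| + |1 - \lambda_i|$. Let
\[ \alpha_i = \begin{cases} 0 & \text{if } \mu(B \cap A_i) < \mu(A_i \setminus B) \\ 1 & \text{if } \mu(B \cap A_i) \ge \mu(A_i \setminus B) \end{cases} \]
and $g = \sum_{i=1}^n \alpha_i 1_{A_i} = 1_D$ where
\[ D := \cup \{ A_i : i > 0 \ \& \ \alpha_i = 1 \} \in \mathcal{A}. \]
Equation (22.5) with $\lambda_i$ replaced by $\alpha_i$ and $f$ by $g$ implies
\[ \int_X |1_B - 1_D| \, d\mu = \mu(A_0 \cap B) + \sum_{i=1}^n \min\{ \mu(B \cap A_i), \mu(A_i \setminus B) \}. \]
The latter expression, by Eq. (22.6), is bounded by $\int_X |1_B - f| \, d\mu < \varepsilon$ and therefore,
\[ \mu(B \triangle D) = \int_X |1_B - 1_D| \, d\mu < \varepsilon. \]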
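The passage from Eq. (22.5) to Eq. (22.6) rests on an elementary inequality. A sketch of it, writing $a$ and $b$ for two nonnegative numbers (here standing in for the measures of $B \cap A_i$ and $A_i \setminus B$) and $\lambda$ for a scalar:

```latex
% For a, b \ge 0 and any scalar \lambda:
\[
|1 - \lambda|\, a + |\lambda|\, b
  \;\ge\; \big( |1 - \lambda| + |\lambda| \big) \min\{a, b\}
  \;\ge\; |(1 - \lambda) + \lambda| \, \min\{a, b\}
  \;=\; \min\{a, b\} .
\]
% First step: replace a and b by the smaller of the two.
% Second step: triangle inequality, which is the fact 1 \le |\lambda| + |1 - \lambda|.
% Equality holds for \lambda = 1 when a \le b and for \lambda = 0 when b \le a,
% which is why the coefficients \alpha_i \in \{0, 1\} attain the lower bound.
```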
Remark 22.19. We have to assume that $\mu(B) < \infty$ as the following example shows. Let $X = \mathbb{R}$, $\mathcal{M} = \mathcal{B}$, $\mu = m$, $\mathcal{A}$ be the algebra generated by half open intervals of the form $(a, b]$, and $B = \cup_{n=1}^\infty (2n, 2n+1]$. It is easily checked that for every $D \in \mathcal{A}$, $m(B \triangle D) = \infty$.
22.2 Convolution and Young's Inequalities
Throughout this section we will be solely concerned with $d$-dimensional Lebesgue measure, $m$, and we will simply write $L^p$ for $L^p(\mathbb{R}^d, m)$.
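Definition 22.20 (Convolution). Let $f, g : \mathbb{R}^d \to \mathbb{C}$ be measurable functions. We define
\[ f * g(x) = \int_{\mathbb{R}^d} f(x - y) g(y) \, dy \tag{22.7} \]
whenever the integral is defined, i.e. either $f(x - \cdot) g(\cdot) \in L^1(\mathbb{R}^d, m)$ or $f(x - \cdot) g(\cdot) \ge 0$. Notice that the condition that $f(x - \cdot) g(\cdot) \in L^1(\mathbb{R}^d, m)$ is equivalent to writing $|f| * |g|(x) < \infty$. By convention, if the integral in Eq. (22.7) is not defined, let $f * g(x) := 0$.
Notation 22.21 Given a multi-index $\alpha \in \mathbb{Z}_+^d$, let $|\alpha| = \alpha_1 + \cdots + \alpha_d$,
\[ x^\alpha := \prod_{j=1}^d x_j^{\alpha_j}, \quad \text{and} \quad \partial_x^\alpha = \Big( \frac{\partial}{\partial x} \Big)^\alpha := \prod_{j=1}^d \Big( \frac{\partial}{\partial x_j} \Big)^{\alpha_j}. \]
For $z \in \mathbb{R}^d$ and $f : \mathbb{R}^d \to \mathbb{C}$, let $\tau_z f : \mathbb{R}^d \to \mathbb{C}$ be defined by $\tau_z f(x) = f(x - z)$.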
Remark 22.22 (The Significance of Convolution).
1. Suppose that $f, g \in L^1(m)$ are positive functions and let $\mu$ be the measure on $(\mathbb{R}^d)^2$ defined by
\[ d\mu(x, y) := f(x) g(y) \, dm(x) \, dm(y). \]
Then if $h : \mathbb{R}^d \to [0, \infty]$ is a measurable function we have
\[ \int_{(\mathbb{R}^d)^2} h(x + y) \, d\mu(x, y) = \int_{(\mathbb{R}^d)^2} h(x + y) f(x) g(y) \, dm(x) \, dm(y) \]
\[ = \int_{(\mathbb{R}^d)^2} h(x) f(x - y) g(y) \, dm(x) \, dm(y) = \int_{\mathbb{R}^d} h(x) f * g(x) \, dm(x). \]
In other words, this shows the measure $(f * g) m$ is the same as $S_* \mu$ where $S(x, y) := x + y$. In probability lingo, the distribution of a sum of two independent (i.e. product measure) random variables is the convolution of the individual distributions.
2. Suppose that $L = \sum_{|\alpha| \le k} a_\alpha \partial^\alpha$ is a constant coefficient differential operator and suppose that we can solve (uniquely) the equation $Lu = g$ in the form
\[ u(x) = Kg(x) := \int_{\mathbb{R}^d} k(x, y) g(y) \, dy \]
where $k(x, y)$ is an "integral kernel." (This is a natural sort of assumption since, in view of the fundamental theorem of calculus, integration is the inverse operation to differentiation.) Since $\tau_z L = L \tau_z$ for all $z \in \mathbb{R}^d$, (this is another way to characterize constant coefficient differential operators) and $L^{-1} = K$ we should have $\tau_z K = K \tau_z$. Writing out this equation then says
\[ \int_{\mathbb{R}^d} k(x - z, y) g(y) \, dy = (Kg)(x - z) = \tau_z Kg(x) = (K \tau_z g)(x) \]
\[ = \int_{\mathbb{R}^d} k(x, y) g(y - z) \, dy = \int_{\mathbb{R}^d} k(x, y + z) g(y) \, dy. \]
Since $g$ is arbitrary we conclude that $k(x - z, y) = k(x, y + z)$. Taking $y = 0$ then gives
\[ k(x, z) = k(x - z, 0) =: \rho(x - z). \]
We thus find that $Kg = \rho * g$. Hence we expect the convolution operation to appear naturally when solving constant coefficient partial differential equations. More about this point later.
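To make the probabilistic reading of item 1 concrete, here is a small worked case in dimension $d = 1$, taking $f = g = 1_{[0,1]}$ (the density of a uniform random variable on $[0,1]$; this choice is only for illustration):

```latex
% Convolution of two uniform densities on [0,1].
\[
(1_{[0,1]} * 1_{[0,1]})(x)
  = \int_{\mathbb{R}} 1_{[0,1]}(x - y)\, 1_{[0,1]}(y)\, dy
  = m\big( [\max(0, x-1), \min(1, x)] \big)
  = \begin{cases}
      x      & \text{if } 0 \le x \le 1, \\
      2 - x  & \text{if } 1 \le x \le 2, \\
      0      & \text{otherwise.}
    \end{cases}
\]
% The integrand is 1 exactly when 0 \le y \le 1 and x - 1 \le y \le x.
% The result is the triangular density of the sum of two independent
% uniform random variables on [0,1]; it integrates to 1.
```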
Proposition 22.23. Suppose $p \in [1, \infty]$, $f \in L^1$ and $g \in L^p$, then $f * g(x)$ exists for almost every $x$, $f * g \in L^p$ and
\[ \|f * g\|_p \le \|f\|_1 \|g\|_p. \]
Proof. This follows directly from Minkowski's inequality for integrals, Theorem 21.27.
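A sketch of how Minkowski's inequality for integrals is applied here, using the translation notation $\tau_y g(x) = g(x - y)$ from Notation 22.21 and the change of variables $y \to x - y$ in the definition of the convolution:

```latex
% Write the convolution as an integral of translates of g, weighted by f.
\[
f * g(x) = \int_{\mathbb{R}^d} f(y)\, g(x - y)\, dy
         = \int_{\mathbb{R}^d} f(y)\, \tau_y g(x)\, dy ,
\]
% then move the L^p norm (in x) inside the y-integral:
\[
\| f * g \|_p
  \le \int_{\mathbb{R}^d} |f(y)|\, \| \tau_y g \|_p \, dy
  = \int_{\mathbb{R}^d} |f(y)|\, \| g \|_p \, dy
  = \| f \|_1 \, \| g \|_p .
\]
% The middle equality is translation invariance of Lebesgue measure.
```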
Proposition 22.24. Suppose that $p \in [1, \infty)$, then $\tau_z : L^p \to L^p$ is an isometric isomorphism and for $f \in L^p$, $z \in \mathbb{R}^d \to \tau_z f \in L^p$ is continuous.
Proof. The assertion that $\tau_z : L^p \to L^p$ is an isometric isomorphism follows from translation invariance of Lebesgue measure and the fact that $\tau_{-z} \circ \tau_z = id$. For the continuity assertion, observe that
\[ \|\tau_z f - \tau_y f\|_p = \|\tau_{-y}(\tau_z f - \tau_y f)\|_p = \|\tau_{z - y} f - f\|_p \]
from which it follows that it is enough to show $\tau_z f \to f$ in $L^p$ as $z \to 0 \in \mathbb{R}^d$.
When $f \in C_c(\mathbb{R}^d)$, $\tau_z f \to f$ uniformly and since $K := \cup_{|z| \le 1} \operatorname{supp}(\tau_z f)$ is compact, it follows by the dominated convergence theorem that $\tau_z f \to f$ in $L^p$ as $z \to 0 \in \mathbb{R}^d$. For general $g \in L^p$ and $f \in C_c(\mathbb{R}^d)$,
\[ \|\tau_z g - g\|_p \le \|\tau_z g - \tau_z f\|_p + \|\tau_z f - f\|_p + \|f - g\|_p = \|\tau_z f - f\|_p + 2\|f - g\|_p \]
and thus
\[ \limsup_{z \to 0} \|\tau_z g - g\|_p \le \limsup_{z \to 0} \|\tau_z f - f\|_p + 2\|f - g\|_p = 2\|f - g\|_p. \]
Because $C_c(\mathbb{R}^d)$ is dense in $L^p$, the term $\|f - g\|_p$ may be made as small as we please.
Exercise 22.4. Let $p \in [1, \infty]$ and $\|\tau_z - I\|_{L(L^p(m))}$ be the operator norm of $\tau_z - I$. Show $\|\tau_z - I\|_{L(L^p(m))} = 2$ for all $z \in \mathbb{R}^d \setminus \{0\}$ and conclude from this that $z \in \mathbb{R}^d \to \tau_z \in L(L^p(m))$ is not continuous.
Hints: 1) Show $\|\tau_z - I\|_{L(L^p(m))} = \|\tau_{|z| e_1} - I\|_{L(L^p(m))}$. 2) Let $z = t e_1$ with $t > 0$ and look for $f \in L^p(m)$ such that $\tau_z f$ is approximately equal to $-f$. (In fact, if $p = \infty$, you can find $f \in L^\infty(m)$ such that $\tau_z f = -f$.)
(BRUCE: add on a problem somewhere showing that $\sigma(\tau_z) = S^1 \subset \mathbb{C}$. This is very simple to prove if $p = 2$ by using the Fourier transform.)
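Definition 22.25. Suppose that $(X, \tau)$ is a topological space and $\mu$ is a measure on $\mathcal{B}_X = \sigma(\tau)$. For a measurable function $f : X \to \mathbb{C}$ we define the essential support of $f$ by
\[ \operatorname{supp}_\mu(f) = \{ x \in X : \mu(\{ y \in V : f(y) \ne 0 \}) > 0 \ \forall \text{ neighborhoods } V \text{ of } x \}. \tag{22.8} \]
Equivalently, $x \notin \operatorname{supp}_\mu(f)$ iff there exists an open neighborhood $V$ of $x$ such that $1_V f = 0$ a.e.
It is not hard to show that if $\operatorname{supp}(\mu) = X$ (see Definition 21.41) and $f \in C(X)$ then $\operatorname{supp}_\mu(f) = \operatorname{supp}(f) := \overline{\{f \ne 0\}}$, see Exercise 22.7.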
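Lemma 22.26. Suppose $(X, \tau)$ is second countable and $f : X \to \mathbb{C}$ is a measurable function and $\mu$ is a measure on $\mathcal{B}_X$. Then $W := X \setminus \operatorname{supp}_\mu(f)$ may be described as the largest open set $W$ such that $f 1_W(x) = 0$ for $\mu$-a.e. $x$. Equivalently put, $C := \operatorname{supp}_\mu(f)$ is the smallest closed subset of $X$ such that $f = f 1_C$ a.e.
Proof. To verify that the two descriptions of $\operatorname{supp}_\mu(f)$ are equivalent, suppose $\operatorname{supp}_\mu(f)$ is defined as in Eq. (22.8) and $W := X \setminus \operatorname{supp}_\mu(f)$. Then
\[ W = \{ x \in X : \exists \, \tau \ni V \ni x \text{ such that } \mu(\{ y \in V : f(y) \ne 0 \}) = 0 \} \]
\[ = \cup \{ V \subset_o X : \mu(f 1_V \ne 0) = 0 \} \]
\[ = \cup \{ V \subset_o X : f 1_V = 0 \text{ for } \mu\text{-a.e.} \}. \]
So to finish the argument it suffices to show $\mu(f 1_W \ne 0) = 0$. To do this let $\mathcal{U}$ be a countable base for $\tau$ and set
\[ \mathcal{U}_f := \{ V \in \mathcal{U} : f 1_V = 0 \text{ a.e.} \}. \]
Then it is easily seen that $W = \cup \mathcal{U}_f$ and since $\mathcal{U}_f$ is countable
\[ \mu(f 1_W \ne 0) \le \sum_{V \in \mathcal{U}_f} \mu(f 1_V \ne 0) = 0. \]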
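Lemma 22.27. Suppose $f, g, h : \mathbb{R}^d \to \mathbb{C}$ are measurable functions and assume that $x$ is a point in $\mathbb{R}^d$ such that $|f| * |g|(x) < \infty$ and $|f| * (|g| * |h|)(x) < \infty$, then
1. $f * g(x) = g * f(x)$
2. $f * (g * h)(x) = (f * g) * h(x)$
3. If $z \in \mathbb{R}^d$ and $\tau_z(|f| * |g|)(x) = |f| * |g|(x - z) < \infty$, then
\[ \tau_z(f * g)(x) = \tau_z f * g(x) = f * \tau_z g(x) \]
4. If $x \notin \operatorname{supp}_m(f) + \operatorname{supp}_m(g)$ then $f * g(x) = 0$ and in particular,
\[ \operatorname{supp}_m(f * g) \subset \overline{\operatorname{supp}_m(f) + \operatorname{supp}_m(g)} \]
where in defining $\operatorname{supp}_m(f * g)$ we will use the convention that "$f * g(x) \ne 0$" when $|f| * |g|(x) = \infty$.
Proof. For item 1.,
\[ |f| * |g|(x) = \int_{\mathbb{R}^d} |f|(x - y) |g|(y) \, dy = \int_{\mathbb{R}^d} |f|(y) |g|(x - y) \, dy = |g| * |f|(x) \]
where in the second equality we made use of the fact that Lebesgue measure is invariant under the transformation $y \to x - y$. Similar computations prove all of the remaining assertions of the first three items of the lemma.
Item 4. Since $f * g(x) = \tilde{f} * \tilde{g}(x)$ if $f = \tilde{f}$ and $g = \tilde{g}$ a.e. we may, by replacing $f$ by $f 1_{\operatorname{supp}_m(f)}$ and $g$ by $g 1_{\operatorname{supp}_m(g)}$ if necessary, assume that $\{f \ne 0\} \subset \operatorname{supp}_m(f)$ and $\{g \ne 0\} \subset \operatorname{supp}_m(g)$. So if $x \notin (\operatorname{supp}_m(f) + \operatorname{supp}_m(g))$ then $x \notin (\{f \ne 0\} + \{g \ne 0\})$ and for all $y \in \mathbb{R}^d$, either $x - y \notin \{f \ne 0\}$ or $y \notin \{g \ne 0\}$. That is to say either $x - y \in \{f = 0\}$ or $y \in \{g = 0\}$ and hence $f(x - y) g(y) = 0$ for all $y$ and therefore $f * g(x) = 0$. This shows that $f * g = 0$ on $\mathbb{R}^d \setminus \overline{(\operatorname{supp}_m(f) + \operatorname{supp}_m(g))}$ and therefore
\[ \mathbb{R}^d \setminus \overline{(\operatorname{supp}_m(f) + \operatorname{supp}_m(g))} \subset \mathbb{R}^d \setminus \operatorname{supp}_m(f * g), \]
i.e. $\operatorname{supp}_m(f * g) \subset \overline{\operatorname{supp}_m(f) + \operatorname{supp}_m(g)}$.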
Remark 22.28. Let $A, B$ be closed sets of $\mathbb{R}^d$, it is not necessarily true that $A + B$ is still closed. For example, take
\[ A = \{ (x, y) : x > 0 \text{ and } y \ge 1/x \} \text{ and } B = \{ (x, y) : x < 0 \text{ and } y \ge 1/|x| \}, \]
then every point of $A + B$ has a positive $y$-component and hence is not zero. On the other hand, for $x > 0$ we have $(x, 1/x) + (-x, 1/x) = (0, 2/x) \in A + B$ for all $x$ and hence $0 \in \overline{A + B}$ showing $A + B$ is not closed. Nevertheless if one of the sets $A$ or $B$ is compact, then $A + B$ is closed again. Indeed, if $A$ is compact and $x_n = a_n + b_n \in A + B$ and $x_n \to x \in \mathbb{R}^d$, then by passing to a subsequence if necessary we may assume $\lim_{n \to \infty} a_n = a \in A$ exists. In this case
\[ \lim_{n \to \infty} b_n = \lim_{n \to \infty} (x_n - a_n) = x - a \in B \]
exists as well, showing $x = a + b \in A + B$ where $b := x - a$.
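Proposition 22.29. Suppose that $p, q \in [1, \infty]$ and $p$ and $q$ are conjugate exponents, $f \in L^p$ and $g \in L^q$, then $f * g \in BC(\mathbb{R}^d)$, $\|f * g\|_\infty \le \|f\|_p \|g\|_q$ and if $p, q \in (1, \infty)$ then $f * g \in C_0(\mathbb{R}^d)$.
Proof. The existence of $f * g(x)$ and the estimate $|f * g|(x) \le \|f\|_p \|g\|_q$ for all $x \in \mathbb{R}^d$ is a simple consequence of Hölder's inequality and the translation invariance of Lebesgue measure. In particular this shows $\|f * g\|_\infty \le \|f\|_p \|g\|_q$. By relabeling $p$ and $q$ if necessary we may assume that $p \in [1, \infty)$. Since
\[ \|\tau_z(f * g) - f * g\|_u = \|\tau_z f * g - f * g\|_u \le \|\tau_z f - f\|_p \|g\|_q \to 0 \text{ as } z \to 0 \]
it follows that $f * g$ is uniformly continuous. Finally if $p, q \in (1, \infty)$, we learn from Lemma 22.27 and what we have just proved that $f_m * g_m \in C_c(\mathbb{R}^d)$ where $f_m = f 1_{\{|x| \le m\}}$ and $g_m = g 1_{\{|x| \le m\}}$ are the truncations of $f$ and $g$ to the ball of radius $m$ (so that both have compact support). Moreover,
\[ \|f * g - f_m * g_m\|_\infty \le \|f * g - f_m * g\|_\infty + \|f_m * g - f_m * g_m\|_\infty \]
\[ \le \|f - f_m\|_p \|g\|_q + \|f_m\|_p \|g - g_m\|_q \]
\[ \le \|f - f_m\|_p \|g\|_q + \|f\|_p \|g - g_m\|_q \to 0 \text{ as } m \to \infty \]
showing, with the aid of Proposition 15.23, $f * g \in C_0(\mathbb{R}^d)$.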
Theorem 22.30 (Young's Inequality). Let $p, q, r \in [1, \infty]$ satisfy
\[ \frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}. \tag{22.9} \]
If $f \in L^p$ and $g \in L^q$ then $|f| * |g|(x) < \infty$ for $m$-a.e. $x$ and
\[ \|f * g\|_r \le \|f\|_p \|g\|_q. \tag{22.10} \]
In particular $L^1$ is closed under convolution. (The space $(L^1, *)$ is an example of a "Banach algebra" without unit.)
Remark 22.31. Before going to the formal proof, let us first understand Eq. (22.9) by the following scaling argument. For $\lambda > 0$, let $f_\lambda(x) := f(\lambda x)$, then after a few simple changes of variables we find
\[ \|f_\lambda\|_p = \lambda^{-d/p} \|f\|_p \text{ and } (f * g)_\lambda = \lambda^d f_\lambda * g_\lambda. \]
Therefore if Eq. (22.10) holds for some $p, q, r \in [1, \infty]$, we would also have
\[ \|f * g\|_r = \lambda^{d/r} \|(f * g)_\lambda\|_r \le \lambda^{d/r} \lambda^d \|f_\lambda\|_p \|g_\lambda\|_q = \lambda^{(d + d/r - d/p - d/q)} \|f\|_p \|g\|_q \]
for all $\lambda > 0$. This is only possible if Eq. (22.9) holds.
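The two scaling identities used in the remark can be checked directly. A sketch, with $f_\lambda(x) := f(\lambda x)$ as in the remark and $p < \infty$:

```latex
% Norm scaling: substitute u = \lambda x, so dx = \lambda^{-d} du.
\[
\| f_\lambda \|_p^p
  = \int_{\mathbb{R}^d} |f(\lambda x)|^p \, dx
  = \lambda^{-d} \int_{\mathbb{R}^d} |f(u)|^p \, du
  = \lambda^{-d} \| f \|_p^p ,
\qquad\text{so}\qquad
\| f_\lambda \|_p = \lambda^{-d/p} \| f \|_p .
\]
% Convolution scaling: substitute y = \lambda u, so dy = \lambda^{d} du.
\[
(f * g)_\lambda(x)
  = \int_{\mathbb{R}^d} f(\lambda x - y)\, g(y) \, dy
  = \lambda^{d} \int_{\mathbb{R}^d} f(\lambda (x - u))\, g(\lambda u) \, du
  = \lambda^{d} \, (f_\lambda * g_\lambda)(x) .
\]
% Since the resulting bound must hold for every \lambda > 0, the exponent
% d + d/r - d/p - d/q has to vanish, which is Eq. (22.9).
```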
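Proof. By the usual sorts of arguments, we may assume $f$ and $g$ are positive functions. Let $\alpha, \beta \in [0, 1]$ and $p_1, p_2 \in (0, \infty]$ satisfy $p_1^{-1} + p_2^{-1} + r^{-1} = 1$. Then by Hölder's inequality, Corollary 21.3,
\[ f * g(x) = \int_{\mathbb{R}^d} \big[ f(x - y)^{(1 - \alpha)} g(y)^{(1 - \beta)} \big] f(x - y)^{\alpha} g(y)^{\beta} \, dy \]
\[ \le \Big( \int_{\mathbb{R}^d} f(x - y)^{(1 - \alpha) r} g(y)^{(1 - \beta) r} \, dy \Big)^{1/r} \Big( \int_{\mathbb{R}^d} f(x - y)^{\alpha p_1} \, dy \Big)^{1/p_1} \Big( \int_{\mathbb{R}^d} g(y)^{\beta p_2} \, dy \Big)^{1/p_2} \]
\[ = \Big( \int_{\mathbb{R}^d} f(x - y)^{(1 - \alpha) r} g(y)^{(1 - \beta) r} \, dy \Big)^{1/r} \|f\|_{\alpha p_1}^{\alpha} \|g\|_{\beta p_2}^{\beta}. \]
Taking the $r$-th power of this equation and integrating on $x$ gives
\[ \|f * g\|_r^r \le \int_{\mathbb{R}^d} \Big( \int_{\mathbb{R}^d} f(x - y)^{(1 - \alpha) r} g(y)^{(1 - \beta) r} \, dy \Big) dx \cdot \|f\|_{\alpha p_1}^{\alpha r} \|g\|_{\beta p_2}^{\beta r} \]
\[ = \|f\|_{(1 - \alpha) r}^{(1 - \alpha) r} \|g\|_{(1 - \beta) r}^{(1 - \beta) r} \|f\|_{\alpha p_1}^{\alpha r} \|g\|_{\beta p_2}^{\beta r}. \tag{22.11} \]
Let us now suppose, $(1 - \alpha) r = \alpha p_1$ and $(1 - \beta) r = \beta p_2$, in which case Eq. (22.11) becomes,
\[ \|f * g\|_r^r \le \|f\|_{\alpha p_1}^r \|g\|_{\beta p_2}^r \]
which is Eq. (22.10) with
\[ p := (1 - \alpha) r = \alpha p_1 \text{ and } q := (1 - \beta) r = \beta p_2. \tag{22.12} \]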
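So to finish the proof, it suffices to show $p$ and $q$ are arbitrary indices in $[1, \infty]$ satisfying $p^{-1} + q^{-1} = 1 + r^{-1}$. If $\alpha, \beta, p_1, p_2$ satisfy the relations above, then
\[ \alpha = \frac{r}{r + p_1} \text{ and } \beta = \frac{r}{r + p_2} \]
and
\[ \frac{1}{p} + \frac{1}{q} = \frac{1}{\alpha p_1} + \frac{1}{\beta p_2} = \frac{1}{p_1} \frac{r + p_1}{r} + \frac{1}{p_2} \frac{r + p_2}{r} = \frac{1}{p_1} + \frac{1}{p_2} + \frac{2}{r} = 1 + \frac{1}{r}. \]
Conversely, if $p, q, r$ satisfy Eq. (22.9), then let $\alpha$ and $\beta$ satisfy $p = (1 - \alpha) r$ and $q = (1 - \beta) r$, i.e.
\[ \alpha := \frac{r - p}{r} = 1 - \frac{p}{r} \le 1 \text{ and } \beta = \frac{r - q}{r} = 1 - \frac{q}{r} \le 1. \]
Using Eq. (22.9) we may also express $\alpha$ and $\beta$ as
\[ \alpha = p \Big( 1 - \frac{1}{q} \Big) \ge 0 \text{ and } \beta = q \Big( 1 - \frac{1}{p} \Big) \ge 0 \]
and in particular we have shown $\alpha, \beta \in [0, 1]$. If we now define $p_1 := p/\alpha \in (0, \infty]$ and $p_2 := q/\beta \in (0, \infty]$, then
\[ \frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{r} = \frac{\alpha}{p} + \frac{\beta}{q} + \frac{1}{r} = \Big( 1 - \frac{1}{q} \Big) + \Big( 1 - \frac{1}{p} \Big) + \frac{1}{r} = 2 - \Big( 1 + \frac{1}{r} \Big) + \frac{1}{r} = 1 \]
as desired.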
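To see the bookkeeping of exponents in a specific case, here is a worked instance with $p = q = 4/3$ (values picked here only as an illustration), for which Eq. (22.9) forces $r = 2$:

```latex
% Exponents for Young's inequality with p = q = 4/3.
\[
\frac{1}{p} + \frac{1}{q} = \frac{3}{4} + \frac{3}{4} = \frac{3}{2} = 1 + \frac{1}{r}
  \quad\Longrightarrow\quad r = 2 .
\]
% Auxiliary parameters from the proof:
\[
\alpha = 1 - \frac{p}{r} = 1 - \frac{2}{3} = \frac{1}{3}
       = p\Big(1 - \frac{1}{q}\Big) = \frac{4}{3}\cdot\frac{1}{4},
\qquad
\beta = \frac{1}{3},
\qquad
p_1 = \frac{p}{\alpha} = 4, \quad p_2 = \frac{q}{\beta} = 4 .
\]
% Consistency check of the Hoelder exponents:
\[
\frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{r}
  = \frac{1}{4} + \frac{1}{4} + \frac{1}{2} = 1 .
\]
% Conclusion in this case: \| f * g \|_2 \le \| f \|_{4/3} \, \| g \|_{4/3}.
```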
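Theorem 22.32 (Approximate $\delta$-functions). Let $p \in [1, \infty]$, $\phi \in L^1(\mathbb{R}^d)$, $a := \int_{\mathbb{R}^d} \phi(x) \, dx$, and for $t > 0$ let $\phi_t(x) = t^{-d} \phi(x/t)$. Then
1. If $f \in L^p$ with $p < \infty$ then $\phi_t * f \to af$ in $L^p$ as $t \downarrow 0$.
2. If $f \in BC(\mathbb{R}^d)$ and $f$ is uniformly continuous then $\|\phi_t * f - af\|_\infty \to 0$ as $t \downarrow 0$.
3. If $f \in L^\infty$ and $f$ is continuous on $U \subset_o \mathbb{R}^d$ then $\phi_t * f \to af$ uniformly on compact subsets of $U$ as $t \downarrow 0$.
Proof. Making the change of variables $y = tz$ implies
\[ \phi_t * f(x) = \int_{\mathbb{R}^d} f(x - y) \phi_t(y) \, dy = \int_{\mathbb{R}^d} f(x - tz) \phi(z) \, dz \]
so that
\[ \phi_t * f(x) - af(x) = \int_{\mathbb{R}^d} [f(x - tz) - f(x)] \phi(z) \, dz = \int_{\mathbb{R}^d} [\tau_{tz} f(x) - f(x)] \phi(z) \, dz. \tag{22.13} \]
Hence by Minkowski's inequality for integrals (Theorem 21.27), Proposition 22.24 and the dominated convergence theorem,
\[ \|\phi_t * f - af\|_p \le \int_{\mathbb{R}^d} \|\tau_{tz} f - f\|_p |\phi(z)| \, dz \to 0 \text{ as } t \downarrow 0. \]
Item 2. is proved similarly. Indeed, from Eq. (22.13)
\[ \|\phi_t * f - af\|_\infty \le \int_{\mathbb{R}^d} \|\tau_{tz} f - f\|_\infty |\phi(z)| \, dz \]
which again tends to zero by the dominated convergence theorem because $\lim_{t \downarrow 0} \|\tau_{tz} f - f\|_\infty = 0$ for each fixed $z$ by the uniform continuity of $f$ (the integrand is dominated by $2 \|f\|_\infty |\phi(z)|$).
Item 3. Let $B_R = B(0, R)$ be a large ball in $\mathbb{R}^d$ and $K \subset\subset U$, then
\[ \sup_{x \in K} |\phi_t * f(x) - af(x)| \le \sup_{x \in K} \Big( \Big| \int_{B_R} [f(x - tz) - f(x)] \phi(z) \, dz \Big| + \Big| \int_{B_R^c} [f(x - tz) - f(x)] \phi(z) \, dz \Big| \Big) \]
\[ \le \int_{B_R} |\phi(z)| \, dz \cdot \sup_{x \in K, z \in B_R} |f(x - tz) - f(x)| + 2 \|f\|_\infty \int_{B_R^c} |\phi(z)| \, dz \]
\[ \le \|\phi\|_1 \cdot \sup_{x \in K, z \in B_R} |f(x - tz) - f(x)| + 2 \|f\|_\infty \int_{|z| > R} |\phi(z)| \, dz \]
so that using the uniform continuity of $f$ on compact subsets of $U$,
\[ \limsup_{t \downarrow 0} \sup_{x \in K} |\phi_t * f(x) - af(x)| \le 2 \|f\|_\infty \int_{|z| > R} |\phi(z)| \, dz \to 0 \text{ as } R \to \infty. \]
See Theorem 8.15 of Folland for a statement about almost everywhere convergence.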
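A simple instance of the theorem, in dimension $d = 1$ with the kernel $\phi = \tfrac{1}{2} 1_{[-1,1]}$ (chosen here for illustration; it has $a = 1$), is the familiar moving average:

```latex
% Rescaled kernel: \phi_t(x) = t^{-1}\phi(x/t) = \tfrac{1}{2t} 1_{[-t,t]}(x).
\[
\phi_t * f(x)
  = \int_{\mathbb{R}} \phi_t(x - y)\, f(y)\, dy
  = \frac{1}{2t} \int_{x - t}^{x + t} f(y)\, dy .
\]
% Item 1 then says: for f \in L^p with p < \infty, the averages of f over
% the intervals [x - t, x + t] converge to f in L^p as t \downarrow 0.
% Item 2 says the convergence is uniform when f is bounded and uniformly
% continuous.
```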
Exercise 22.5. Let
\[ f(t) = \begin{cases} e^{-1/t} & \text{if } t > 0 \\ 0 & \text{if } t \le 0. \end{cases} \]
Show $f \in C^\infty(\mathbb{R}, [0, 1])$.
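Lemma 22.33. There exists $\phi \in C_c^\infty(\mathbb{R}^d, [0, \infty))$ such that $\phi(0) > 0$, $\operatorname{supp}(\phi) \subset \bar{B}(0, 1)$ and $\int_{\mathbb{R}^d} \phi(x) \, dx = 1$.
Proof. Define $h(t) = f(1 - t) f(t + 1)$ where $f$ is as in Exercise 22.5. Then $h \in C_c^\infty(\mathbb{R}, [0, 1])$, $\operatorname{supp}(h) \subset [-1, 1]$ and $h(0) = e^{-2} > 0$. Define $c = \int_{\mathbb{R}^d} h(|x|^2) \, dx$. Then $\phi(x) = c^{-1} h(|x|^2)$ is the desired function.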
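The properties of $h$ claimed in the proof can be read off directly from the definition of $f$ in Exercise 22.5. A sketch:

```latex
% f(s) = e^{-1/s} for s > 0 and f(s) = 0 for s \le 0, so 0 \le f < 1.
\[
h(t) = f(1 - t)\, f(t + 1) \neq 0
  \iff 1 - t > 0 \text{ and } t + 1 > 0
  \iff -1 < t < 1 ,
\]
% hence supp(h) = [-1, 1].  At the origin:
\[
h(0) = f(1)\, f(1) = e^{-1} \cdot e^{-1} = e^{-2} .
\]
% Consequently h(|x|^2) \neq 0 iff |x|^2 < 1, so x \mapsto h(|x|^2) is
% supported in the closed unit ball, and 0 < c < \infty.
```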
The reader is asked to prove the following proposition in Exercise 22.9 below.
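Proposition 22.34. Suppose that $f \in L^1_{loc}(\mathbb{R}^d, m)$ and $\phi \in C_c^1(\mathbb{R}^d)$, then $f * \phi \in C^1(\mathbb{R}^d)$ and $\partial_i (f * \phi) = f * \partial_i \phi$. Moreover if $\phi \in C_c^\infty(\mathbb{R}^d)$ then $f * \phi \in C^\infty(\mathbb{R}^d)$.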
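Corollary 22.35 ($C^\infty$-Urysohn's Lemma). Given $K \subset\subset U \subset_o \mathbb{R}^d$, there exists $f \in C_c^\infty(\mathbb{R}^d, [0, 1])$ such that $\operatorname{supp}(f) \subset U$ and $f = 1$ on $K$.
Proof. Let $\phi$ be as in Lemma 22.33, $\phi_t(x) = t^{-d} \phi(x/t)$ be as in Theorem 22.32, $d$ be the standard metric on $\mathbb{R}^d$ and $\varepsilon = d(K, U^c)$. Since $K$ is compact and $U^c$ is closed, $\varepsilon > 0$. Let $V_\delta = \{ x \in \mathbb{R}^d : d(x, K) < \delta \}$ and $f = \phi_{\varepsilon/3} * 1_{V_{\varepsilon/3}}$, then
\[ \operatorname{supp}(f) \subset \overline{\operatorname{supp}(\phi_{\varepsilon/3}) + V_{\varepsilon/3}} \subset \bar{V}_{2\varepsilon/3} \subset U. \]
Since $\bar{V}_{2\varepsilon/3}$ is closed and bounded, $f \in C_c^\infty(U)$ and for $x \in K$,
\[ f(x) = \int_{\mathbb{R}^d} 1_{d(y, K) < \varepsilon/3} \cdot \phi_{\varepsilon/3}(x - y) \, dy = \int_{\mathbb{R}^d} \phi_{\varepsilon/3}(x - y) \, dy = 1. \]
The proof will be finished after the reader (easily) verifies $0 \le f \le 1$.
Here is an application of this corollary whose proof is left to the reader, Exercise 22.10.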
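Lemma 22.36 (Integration by Parts). Suppose $f$ and $g$ are measurable functions on $\mathbb{R}^d$ such that $t \to f(x_1, \dots, x_{i-1}, t, x_{i+1}, \dots, x_d)$ and $t \to g(x_1, \dots, x_{i-1}, t, x_{i+1}, \dots, x_d)$ are continuously differentiable functions on $\mathbb{R}$ for each fixed $x = (x_1, \dots, x_d) \in \mathbb{R}^d$. Moreover assume $f \cdot g$, $\frac{\partial f}{\partial x_i} \cdot g$ and $f \cdot \frac{\partial g}{\partial x_i}$ are in $L^1(\mathbb{R}^d, m)$. Then
\[ \int_{\mathbb{R}^d} \frac{\partial f}{\partial x_i} \cdot g \, dm = - \int_{\mathbb{R}^d} f \cdot \frac{\partial g}{\partial x_i} \, dm. \]
With this result we may give another proof of the Riemann Lebesgue Lemma.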
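Lemma 22.37 (Riemann Lebesgue Lemma). For $f \in L^1(\mathbb{R}^d, m)$ let
\[ \hat{f}(\xi) := (2\pi)^{-d/2} \int_{\mathbb{R}^d} f(x) e^{-i \xi \cdot x} \, dm(x) \]
be the Fourier transform of $f$. Then $\hat{f} \in C_0(\mathbb{R}^d)$ and $\|\hat{f}\|_\infty \le (2\pi)^{-d/2} \|f\|_1$. (The choice of the normalization factor, $(2\pi)^{-d/2}$, in $\hat{f}$ is for later convenience.)
Proof. The fact that $\hat{f}$ is continuous is a simple application of the dominated convergence theorem. Moreover,
\[ |\hat{f}(\xi)| \le (2\pi)^{-d/2} \int_{\mathbb{R}^d} |f(x)| \, dm(x) = (2\pi)^{-d/2} \|f\|_1 \]
so it only remains to see that $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$. First suppose that $f \in C_c^\infty(\mathbb{R}^d)$ and let $\Delta = \sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}$ be the Laplacian on $\mathbb{R}^d$. Notice that $\frac{\partial}{\partial x_j} e^{-i \xi \cdot x} = -i \xi_j e^{-i \xi \cdot x}$ and $\Delta e^{-i \xi \cdot x} = -|\xi|^2 e^{-i \xi \cdot x}$. Using Lemma 22.36 repeatedly,
\[ \int_{\mathbb{R}^d} \Delta^k f(x) e^{-i \xi \cdot x} \, dm(x) = \int_{\mathbb{R}^d} f(x) \Delta_x^k e^{-i \xi \cdot x} \, dm(x) = (-|\xi|^2)^k \int_{\mathbb{R}^d} f(x) e^{-i \xi \cdot x} \, dm(x) = (2\pi)^{d/2} (-|\xi|^2)^k \hat{f}(\xi) \]
for any $k \in \mathbb{N}$. Hence $(2\pi)^{d/2} |\hat{f}(\xi)| \le |\xi|^{-2k} \|\Delta^k f\|_1 \to 0$ as $|\xi| \to \infty$ and $\hat{f} \in C_0(\mathbb{R}^d)$. Suppose that $f \in L^1(m)$ and $f_k \in C_c^\infty(\mathbb{R}^d)$ is a sequence such that $\lim_{k \to \infty} \|f - f_k\|_1 = 0$, then $\lim_{k \to \infty} \|\hat{f} - \hat{f}_k\|_\infty = 0$. Hence $\hat{f} \in C_0(\mathbb{R}^d)$ by an application of Proposition 15.23.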
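A single application of the integration by parts lemma already shows the mechanism behind the decay. A sketch in dimension $d = 1$ for $f \in C_c^\infty(\mathbb{R})$, assuming the sign convention $e^{-i\xi x}$ in the Fourier transform (the sign in the exponent is an assumption here; the decay estimate is the same with either sign):

```latex
% One integration by parts against the exponential.
\[
(2\pi)^{1/2}\, \widehat{f'}(\xi)
  = \int_{\mathbb{R}} f'(x)\, e^{-i\xi x}\, dx
  = -\int_{\mathbb{R}} f(x)\, \frac{d}{dx} e^{-i\xi x}\, dx
  = i\xi \int_{\mathbb{R}} f(x)\, e^{-i\xi x}\, dx
  = (2\pi)^{1/2}\, i\xi\, \hat{f}(\xi) .
\]
% Therefore, for \xi \neq 0,
\[
|\hat{f}(\xi)| = \frac{|\widehat{f'}(\xi)|}{|\xi|}
  \le \frac{(2\pi)^{-1/2}\, \| f' \|_1}{|\xi|}
  \longrightarrow 0 \quad \text{as } |\xi| \to \infty .
\]
% Repeating the step with the Laplacian gives the faster decay
% |\xi|^{-2k} used in the proof.
```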
Corollary 22.38. Let $X \subset \mathbb{R}^d$ be an open set and $\mu$ be a $K$-finite measure on $\mathcal{B}_X$.
1. Then $C_c^\infty(X)$ is dense in $L^p(\mu)$ for all $1 \le p < \infty$.
2. If $h \in L^1_{loc}(\mu)$ satisfies
\[ \int_X f h \, d\mu = 0 \text{ for all } f \in C_c^\infty(X) \tag{22.14} \]
then $h(x) = 0$ for $\mu$-a.e. $x$.
Proof. Let $f \in C_c(X)$, $\phi$ be as in Lemma 22.33, $\phi_t$ be as in Theorem 22.32 and set $\psi_t := \phi_t * (f 1_X)$. Then by Proposition 22.34 $\psi_t \in C^\infty(X)$ and by Lemma 22.27 there exists a compact set $K \subset X$ such that $\operatorname{supp}(\psi_t) \subset K$ for all $t$ sufficiently small. By Theorem 22.32, $\psi_t \to f$ uniformly on $X$ as $t \downarrow 0$.
1. The dominated convergence theorem (with dominating function being $\|f\|_\infty 1_K$), shows $\psi_t \to f$ in $L^p(\mu)$ as $t \downarrow 0$. This proves Item 1., since Theorem 22.8 guarantees that $C_c(X)$ is dense in $L^p(\mu)$.
2. Keeping the same notation as above, the dominated convergence theorem (with dominating function being $\|f\|_\infty |h| 1_K$) implies
\[ 0 = \lim_{t \downarrow 0} \int_X \psi_t h \, d\mu = \int_X \lim_{t \downarrow 0} \psi_t h \, d\mu = \int_X f h \, d\mu. \]
The proof is now finished by an application of Lemma 22.11.
22.2.1 Smooth Partitions of Unity
We have the following smooth variants of Proposition 15.16, Theorem 15.18 and Corollary 15.20. The proofs of these results are the same as their continuous counterparts. One simply uses the smooth version of Urysohn's Lemma of Corollary 22.35 in place of Lemma 15.8.
Proposition 22.39 (Smooth Partitions of Unity for Compacts). Suppose that $X$ is an open subset of $\mathbb{R}^d$, $K \subset X$ is a compact set and $\mathcal{U} = \{U_j\}_{j=1}^n$ is an open cover of $K$. Then there exists a smooth (i.e. $h_j \in C^\infty(X, [0, 1])$) partition of unity $\{h_j\}_{j=1}^n$ of $K$ such that $h_j \prec U_j$ for all $j = 1, 2, \dots, n$.
Theorem 22.40 (Locally Compact Partitions of Unity). Suppose that $X$ is an open subset of $\mathbb{R}^d$ and $\mathcal{U}$ is an open cover of $X$. Then there exists a smooth partition of unity $\{h_i\}_{i=1}^N$ ($N = \infty$ is allowed here) subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_i)$ is compact for all $i$.
Corollary 22.41. Suppose that $X$ is an open subset of $\mathbb{R}^d$ and $\mathcal{U} = \{U_\alpha\}_{\alpha \in A} \subset \tau$ is an open cover of $X$. Then there exists a smooth partition of unity $\{h_\alpha\}_{\alpha \in A}$ subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_\alpha) \subset U_\alpha$ for all $\alpha \in A$. Moreover if $\bar{U}_\alpha$ is compact for each $\alpha \in A$ we may choose $h_\alpha$ so that $h_\alpha \prec U_\alpha$.
22.3 Exercises
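Exercise 22.6. Let $(X, \tau)$ be a topological space, $\mu$ a measure on $\mathcal{B}_X = \sigma(\tau)$ and $f : X \to \mathbb{C}$ be a measurable function. Letting $\nu$ be the measure, $d\nu = |f| \, d\mu$, show $\operatorname{supp}(\nu) = \operatorname{supp}_\mu(f)$, where $\operatorname{supp}(\nu)$ is defined in Definition 21.41.
Exercise 22.7. Let $(X, \tau)$ be a topological space, $\mu$ a measure on $\mathcal{B}_X = \sigma(\tau)$ such that $\operatorname{supp}(\mu) = X$ (see Definition 21.41). Show $\operatorname{supp}_\mu(f) = \operatorname{supp}(f) = \overline{\{f \ne 0\}}$ for all $f \in C(X)$.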
Exercise 22.8. Prove the following strong version of item 3. of Proposition 13.52, namely to every pair of points, $x_0, x_1$, in a connected open subset $V$ of $\mathbb{R}^d$ there exists $\sigma \in C^\infty(\mathbb{R}, V)$ such that $\sigma(0) = x_0$ and $\sigma(1) = x_1$. Hint: First choose a continuous path $\gamma : [0, 1] \to V$ such that $\gamma(t) = x_0$ for $t$ near $0$ and $\gamma(t) = x_1$ for $t$ near $1$ and then use a convolution argument to smooth $\gamma$.
Exercise 22.9. Prove Proposition 22.34 by appealing to Corollary 19.43.
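Exercise 22.10 (Integration by Parts). Suppose that $(x, y) \in \mathbb{R} \times \mathbb{R}^{d-1} \to f(x, y) \in \mathbb{C}$ and $(x, y) \in \mathbb{R} \times \mathbb{R}^{d-1} \to g(x, y) \in \mathbb{C}$ are measurable functions such that for each fixed $y \in \mathbb{R}^{d-1}$, $x \to f(x, y)$ and $x \to g(x, y)$ are continuously differentiable. Also assume $f \cdot g$, $\partial_x f \cdot g$ and $f \cdot \partial_x g$ are integrable relative to Lebesgue measure on $\mathbb{R} \times \mathbb{R}^{d-1}$, where $\partial_x f(x, y) := \frac{d}{dt} f(x + t, y)|_{t=0}$. Show
\[ \int_{\mathbb{R} \times \mathbb{R}^{d-1}} \partial_x f(x, y) \cdot g(x, y) \, dx \, dy = - \int_{\mathbb{R} \times \mathbb{R}^{d-1}} f(x, y) \cdot \partial_x g(x, y) \, dx \, dy. \tag{22.15} \]
(Note: this result and Fubini's theorem proves Lemma 22.36.)
Hints: Let $\psi \in C_c^\infty(\mathbb{R})$ be a function which is $1$ in a neighborhood of $0 \in \mathbb{R}$ and set $\psi_\varepsilon(x) = \psi(\varepsilon x)$. First verify Eq. (22.15) with $f(x, y)$ replaced by $\psi_\varepsilon(x) f(x, y)$ by doing the $x$-integral first. Then use the dominated convergence theorem to prove Eq. (22.15) by passing to the limit, $\varepsilon \downarrow 0$.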
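Exercise 22.11. Let $\mu$ be a finite measure on $\mathcal{B}_{\mathbb{R}^d}$, then $\mathbb{D} := \operatorname{span}\{ e^{i \lambda \cdot x} : \lambda \in \mathbb{R}^d \}$ is a dense subspace of $L^p(\mu)$ for all $1 \le p < \infty$. Hints: By Theorem 22.8, $C_c(\mathbb{R}^d)$ is a dense subspace of $L^p(\mu)$. For $f \in C_c(\mathbb{R}^d)$ and $N \in \mathbb{N}$, let
\[ f_N(x) := \sum_{n \in \mathbb{Z}^d} f(x + 2\pi N n). \]
Show $f_N \in BC(\mathbb{R}^d)$ and $x \to f_N(Nx)$ is $2\pi$-periodic, so by Exercise 15.13, $x \to f_N(Nx)$ can be approximated uniformly by trigonometric polynomials. Use this fact to conclude that $f_N \in \bar{\mathbb{D}}^{L^p(\mu)}$. After this show $f_N \to f$ in $L^p(\mu)$.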
Exercise 22.12. Suppose that $\mu$ and $\nu$ are two finite measures on $\mathbb{R}^d$ such that
\[
\int_{\mathbb{R}^d} e^{i\lambda\cdot x}\,d\mu(x) = \int_{\mathbb{R}^d} e^{i\lambda\cdot x}\,d\nu(x) \tag{22.16}
\]
for all $\lambda \in \mathbb{R}^d$. Show $\mu = \nu$.
Hint: Perhaps the easiest way to do this is to use Exercise 22.11 with the measure $\mu$ being replaced by $\mu + \nu$. Alternatively, use the method of proof of Exercise 22.11 to show Eq. (22.16) implies $\int_{\mathbb{R}^d} f\,d\mu(x) = \int_{\mathbb{R}^d} f\,d\nu(x)$ for all $f \in C_c(\mathbb{R}^d)$ and then apply Corollary 18.58.
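Exercise 22.13. Again let $\mu$ be a finite measure on $\mathcal{B}_{\mathbb{R}^d}$. Further assume that $C_M := \int_{\mathbb{R}^d} e^{M|x|}\,d\mu(x) < \infty$ for all $M \in (0,\infty)$. Let $\mathcal{P}(\mathbb{R}^d)$ be the space of polynomials, $\rho(x) = \sum_{|\alpha|\le N} \rho_\alpha x^\alpha$ with $\rho_\alpha \in \mathbb{C}$, on $\mathbb{R}^d$. (Notice that $|\rho(x)|^p \le C e^{M|x|}$ for some constant $C = C(\rho, p, M)$, so that $\mathcal{P}(\mathbb{R}^d) \subset L^p(\mu)$ for all $1 \le p < \infty$.) Show $\mathcal{P}(\mathbb{R}^d)$ is dense in $L^p(\mu)$ for all $1 \le p < \infty$. Here is a possible outline.
Outline: Fix a $\lambda \in \mathbb{R}^d$ and let $f_n(x) = (\lambda\cdot x)^n/n!$ for all $n \in \mathbb{N}$.
1. Use calculus to verify $\sup_{t\ge 0} t^\alpha e^{-Mt} = (\alpha/M)^\alpha e^{-\alpha}$ for all $\alpha \ge 0$ where $(0/M)^0 := 1$. Use this estimate along with the identity
\[
|\lambda\cdot x|^{pn} \le |\lambda|^{pn}|x|^{pn} = \left(|x|^{pn} e^{-M|x|}\right) |\lambda|^{pn} e^{M|x|}
\]
to find an estimate on $\|f_n\|_p$.
2. Use your estimate on $\|f_n\|_p$ to show $\sum_{n=0}^\infty \|f_n\|_p < \infty$ and conclude
\[
\lim_{N\to\infty} \left\| e^{i\lambda\cdot(\cdot)} - \sum_{n=0}^N i^n f_n \right\|_p = 0.
\]
3. Now finish by appealing to Exercise 22.11.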
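Exercise 22.14. Again let $\mu$ be a finite measure on $\mathcal{B}_{\mathbb{R}^d}$ but now assume there exists an $\varepsilon > 0$ such that $C := \int_{\mathbb{R}^d} e^{\varepsilon|x|}\,d\mu(x) < \infty$. Also let $q > 1$ and $h \in L^q(\mu)$ be a function such that $\int_{\mathbb{R}^d} h(x) x^\alpha\,d\mu(x) = 0$ for all $\alpha \in \mathbb{N}_0^d$. (As mentioned in Exercise 22.13, $\mathcal{P}(\mathbb{R}^d) \subset L^p(\mu)$ for all $1 \le p < \infty$, so $x \to h(x)x^\alpha$ is in $L^1(\mu)$.) Show $h(x) = 0$ for $\mu$-a.e. $x$ using the following outline.
Outline: Fix a $\lambda \in \mathbb{R}^d$, let $f_n(x) = (\lambda\cdot x)^n/n!$ for all $n \in \mathbb{N}$, and let $p = q/(q-1)$ be the conjugate exponent to $q$.
1. Use calculus to verify $\sup_{t\ge 0} t^\alpha e^{-\varepsilon t} = (\alpha/\varepsilon)^\alpha e^{-\alpha}$ for all $\alpha \ge 0$ where $(0/\varepsilon)^0 := 1$. Use this estimate along with the identity
\[
|\lambda\cdot x|^{pn} \le |\lambda|^{pn}|x|^{pn} = \left(|x|^{pn} e^{-\varepsilon|x|}\right) |\lambda|^{pn} e^{\varepsilon|x|}
\]
to find an estimate on $\|f_n\|_p$.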
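2. Use your estimate on $\|f_n\|_p$ to show there exists $\delta > 0$ such that $\sum_{n=0}^\infty \|f_n\|_p < \infty$ when $|\lambda| \le \delta$ and conclude for $|\lambda| \le \delta$ that $e^{i\lambda\cdot x} = L^p(\mu)\text{-}\sum_{n=0}^\infty i^n f_n(x)$. Conclude from this that
\[
\int_{\mathbb{R}^d} h(x) e^{i\lambda\cdot x}\,d\mu(x) = 0 \text{ when } |\lambda| \le \delta.
\]
3. Let $\lambda \in \mathbb{R}^d$ ($|\lambda|$ not necessarily small) and set $g(t) := \int_{\mathbb{R}^d} e^{it\lambda\cdot x} h(x)\,d\mu(x)$ for $t \in \mathbb{R}$. Show $g \in C^\infty(\mathbb{R})$ and
\[
g^{(n)}(t) = \int_{\mathbb{R}^d} (i\lambda\cdot x)^n e^{it\lambda\cdot x} h(x)\,d\mu(x) \text{ for all } n \in \mathbb{N}.
\]
4. Let $T = \sup\{\tau \ge 0 : g|_{[0,\tau]} \equiv 0\}$. By Step 2., $T \ge \delta$. If $T < \infty$, then
\[
0 = g^{(n)}(T) = \int_{\mathbb{R}^d} (i\lambda\cdot x)^n e^{iT\lambda\cdot x} h(x)\,d\mu(x) \text{ for all } n \in \mathbb{N}.
\]
Use Step 3. with $h$ replaced by $e^{iT\lambda\cdot x}h(x)$ to conclude
\[
g(T+t) = \int_{\mathbb{R}^d} e^{i(T+t)\lambda\cdot x} h(x)\,d\mu(x) = 0 \text{ for all } t \le \delta/|\lambda|.
\]
This violates the definition of $T$ and therefore $T = \infty$ and in particular we may take $T = 1$ to learn
\[
\int_{\mathbb{R}^d} h(x) e^{i\lambda\cdot x}\,d\mu(x) = 0 \text{ for all } \lambda \in \mathbb{R}^d.
\]
5. Use Exercise 22.11 to conclude that
\[
\int_{\mathbb{R}^d} h(x) g(x)\,d\mu(x) = 0
\]
for all $g \in L^p(\mu)$. Now choose $g$ judiciously to finish the proof.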
Part VI
Further Hilbert and Banach Space Techniques
23 $L^2$ - Hilbert Spaces Techniques and Fourier Series
This section is concerned with Hilbert spaces presented as in the following
example.
Example 23.1. Let $(X, \mathcal{M}, \mu)$ be a measure space. Then $H := L^2(X, \mathcal{M}, \mu)$ with inner product
\[
\langle f|g\rangle = \int_X f\cdot\bar{g}\,d\mu
\]
is a Hilbert space.
It will be convenient to define
\[
\langle f, g\rangle := \int_X f(x) g(x)\,d\mu(x) \tag{23.1}
\]
for all measurable functions $f, g$ on $X$ such that $fg \in L^1(\mu)$. So with this notation we have $\langle f|g\rangle = \langle f, \bar{g}\rangle$ for all $f, g \in H$.
Exercise 23.1. Let $K : L^2(\nu) \to L^2(\mu)$ be the operator defined in Exercise 21.12. Show $K^* : L^2(\mu) \to L^2(\nu)$ is the operator given by
\[
K^* g(y) = \int_X \bar{k}(x,y) g(x)\,d\mu(x).
\]
23.1 $L^2$-Orthonormal Basis
Example 23.2. 1. Let $H = L^2([-1,1], dm)$, $A := \{1, x, x^2, x^3, \dots\}$ and $\beta \subset H$ be the result of doing the Gram-Schmidt procedure on $A$. By the Stone-Weierstrass theorem or by Exercise 22.13 directly, $A$ is total in $H$. Hence by Remark 8.26, $\beta$ is an orthonormal basis for $H$. The basis, $\beta$, consists of polynomials which up to normalization are the so called Legendre polynomials.
2. Let $H = L^2(\mathbb{R}, e^{-\frac{1}{2}x^2}dx)$ and $A := \{1, x, x^2, x^3, \dots\}$. Again by Exercise 22.13, $A$ is total in $H$ and hence the Gram-Schmidt procedure applied to $A$ produces an orthonormal basis, $\beta$, of polynomial functions for $H$. This basis consists, up to normalizations, of the so called Hermite polynomials on $\mathbb{R}$.
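The Gram-Schmidt step in item 1 can be tried out numerically. The following is only a sketch: the helper names `inner` and `gram_schmidt_monomials` are illustrative choices, not notation from the text, and the inner product is computed by exact polynomial integration with NumPy.

```python
import numpy as np
from numpy.polynomial import Polynomial

def inner(p, q):
    """L^2([-1, 1], dm) inner product of two real polynomials (exact antiderivative)."""
    antiderivative = (p * q).integ()
    return antiderivative(1.0) - antiderivative(-1.0)

def gram_schmidt_monomials(count):
    """Orthonormalize 1, x, x^2, ... (the first `count` elements of A)."""
    basis = []
    for degree in range(count):
        p = Polynomial([0.0] * degree + [1.0])   # the monomial x^degree
        for e in basis:
            p = p - inner(p, e) * e              # subtract the component along e
        basis.append(p / np.sqrt(inner(p, p)))   # normalize to unit length
    return basis

beta = gram_schmidt_monomials(4)

# Legendre polynomial P_2(x) = (3x^2 - 1)/2 divided by its L^2 norm sqrt(2/5).
legendre_2 = Polynomial([-0.5, 0.0, 1.5]) / np.sqrt(2.0 / 5.0)
gram = np.array([[inner(p, q) for q in beta] for p in beta])
```

Under these assumptions `gram` should be close to the identity matrix and `beta[2]` should agree with the normalized Legendre polynomial of degree 2.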
Remark 23.3 (An Interesting Phenomenon). Let $H = L^2([-1,1], dm)$ and $B := \{1, x^3, x^6, x^9, \dots\}$. Then again $B$ is total in $H$ by the same argument as in item 2. of Example 23.2. This is true even though $B$ is a proper subset of $A$. Notice that $A$ is an algebraic basis for the polynomials on $[-1,1]$ while $B$ is not! The following computations may help relieve some of the reader's anxiety. Let $f \in L^2([-1,1], dm)$, then, making the change of variables $x = y^{1/3}$, shows that
\[
\int_{-1}^1 |f(x)|^2\,dx = \int_{-1}^1 \left|f(y^{1/3})\right|^2 \frac{1}{3} y^{-2/3}\,dy = \int_{-1}^1 \left|f(y^{1/3})\right|^2 d\mu(y) \tag{23.2}
\]
where $d\mu(y) = \frac{1}{3} y^{-2/3}\,dy$. Since $\mu([-1,1]) = m([-1,1]) = 2$, $\mu$ is a finite measure on $[-1,1]$ and hence by Exercise 22.13 $A := \{1, x, x^2, x^3, \dots\}$ is total (see Definition 8.25) in $L^2([-1,1], d\mu)$. In particular for any $\varepsilon > 0$ there exists a polynomial $p(y)$ such that
\[
\int_{-1}^1 \left|f(y^{1/3}) - p(y)\right|^2 d\mu(y) < \varepsilon^2.
\]
However, by Eq. (23.2) we have
\[
\varepsilon^2 > \int_{-1}^1 \left|f(y^{1/3}) - p(y)\right|^2 d\mu(y) = \int_{-1}^1 \left|f(x) - p(x^3)\right|^2 dx.
\]
Alternatively, if $f \in C([-1,1])$, then $g(y) = f(y^{1/3})$ is back in $C([-1,1])$. Therefore for any $\varepsilon > 0$, there exists a polynomial $p(y)$ such that
\[
\varepsilon > \|g - p\|_\infty = \sup\{|g(y) - p(y)| : y \in [-1,1]\}
= \sup\{|g(x^3) - p(x^3)| : x \in [-1,1]\}
= \sup\{|f(x) - p(x^3)| : x \in [-1,1]\}.
\]
This gives another proof that the polynomials in $x^3$ are dense in $C([-1,1])$ and hence in $L^2([-1,1])$.
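As a rough numerical illustration of the remark, one can approximate $f(x) = x$ (which is not itself a polynomial in $x^3$) by elements of $\operatorname{span}\{1, x^3, \dots, x^{3(n-1)}\}$ in a discretized $L^2([-1,1])$ norm. The function name `l2_error_in_cubic_powers`, the grid size and the choice of $f$ are assumptions made for this sketch only.

```python
import numpy as np

def l2_error_in_cubic_powers(f, num_terms, grid_size=4001):
    """Least-squares distance from f to span{1, x^3, ..., x^(3(num_terms-1))} on a grid."""
    x = np.linspace(-1.0, 1.0, grid_size)
    design = np.column_stack([x ** (3 * k) for k in range(num_terms)])
    coeffs, *_ = np.linalg.lstsq(design, f(x), rcond=None)
    residual = f(x) - design @ coeffs
    # sqrt(2 * mean square) approximates the L^2([-1, 1], dm) norm of the residual
    return np.sqrt(2.0 * np.mean(residual ** 2))

errors = [l2_error_in_cubic_powers(lambda x: x, n) for n in range(1, 6)]
```

The list `errors` is non-increasing because the subspaces are nested; how fast it decays is not claimed here.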
Exercise 23.2. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces such that $L^2(\mu)$ and $L^2(\nu)$ are separable. If $\{f_n\}_{n=1}^\infty$ and $\{g_m\}_{m=1}^\infty$ are orthonormal bases for $L^2(\mu)$ and $L^2(\nu)$ respectively, then $\beta := \{f_n \otimes g_m : m, n \in \mathbb{N}\}$ is an orthonormal basis for $L^2(\mu\otimes\nu)$. (Recall that $f\otimes g(x,y) := f(x)g(y)$, see Notation 20.4.) Hint: model your proof on the proof of Proposition 8.29.
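Definition 23.4 (External direct sum of Hilbert spaces). Suppose that $\{H_n\}_{n=1}^\infty$ is a sequence of Hilbert spaces. Let $\oplus_{n=1}^\infty H_n$ denote the space of sequences, $f \in \prod_{n=1}^\infty H_n$ such that
\[
\|f\| = \sqrt{\sum_{n=1}^\infty \|f(n)\|_{H_n}^2} < \infty.
\]
It is easily seen that $(\oplus_{n=1}^\infty H_n, \|\cdot\|)$ is a Hilbert space with inner product defined, for all $f, g \in \oplus_{n=1}^\infty H_n$, by
\[
\langle f|g\rangle_{\oplus_{n=1}^\infty H_n} = \sum_{n=1}^\infty \langle f(n)|g(n)\rangle_{H_n}.
\]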
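Exercise 23.3. Suppose $H$ is a Hilbert space and $\{H_n : n \in \mathbb{N}\}$ are closed subspaces of $H$ such that $H_n \perp H_m$ for all $m \neq n$ and if $f \in H$ with $f \perp H_n$ for all $n \in \mathbb{N}$, then $f = 0$. For $f \in \oplus_{n=1}^\infty H_n$, show the sum $\sum_{n=1}^\infty f(n)$ is convergent in $H$ and the map $U : \oplus_{n=1}^\infty H_n \to H$ defined by $Uf := \sum_{n=1}^\infty f(n)$ is unitary.
Exercise 23.4. Suppose $(X, \mathcal{M}, \mu)$ is a measure space and $X = \coprod_{n=1}^\infty X_n$ with $X_n \in \mathcal{M}$ and $\mu(X_n) > 0$ for all $n$. Then $U : L^2(X, \mu) \to \oplus_{n=1}^\infty L^2(X_n, \mu)$ defined by $(Uf)(n) := f 1_{X_n}$ is unitary.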
23.2 Hilbert Schmidt Operators
In this section H and B will be Hilbert spaces.
Proposition 23.5. Let $H$ and $B$ be separable Hilbert spaces, $K : H \to B$ be a bounded linear operator, $\{e_n\}_{n=1}^\infty$ and $\{u_m\}_{m=1}^\infty$ be orthonormal bases for $H$ and $B$ respectively. Then:
1. $\sum_{n=1}^\infty \|Ke_n\|^2 = \sum_{m=1}^\infty \|K^* u_m\|^2$ allowing for the possibility that the sums are infinite. In particular the Hilbert Schmidt norm of $K$,
\[
\|K\|_{HS}^2 := \sum_{n=1}^\infty \|Ke_n\|^2,
\]
is well defined independent of the choice of orthonormal basis $\{e_n\}_{n=1}^\infty$. We say $K : H \to B$ is a Hilbert Schmidt operator if $\|K\|_{HS} < \infty$ and let $HS(H,B)$ denote the space of Hilbert Schmidt operators from $H$ to $B$.
2. For all $K \in L(H,B)$, $\|K\|_{HS} = \|K^*\|_{HS}$ and
\[
\|K\|_{HS} \ge \|K\|_{op} := \sup\{\|Kh\| : h \in H \text{ such that } \|h\| = 1\}.
\]
3. The set $HS(H,B)$ is a subspace of $L(H,B)$ (the bounded operators from $H \to B$), $\|\cdot\|_{HS}$ is a norm on $HS(H,B)$ for which $(HS(H,B), \|\cdot\|_{HS})$ is a Hilbert space, and the corresponding inner product is given by
\[
\langle K_1|K_2\rangle_{HS} = \sum_{n=1}^\infty \langle K_1 e_n|K_2 e_n\rangle. \tag{23.3}
\]
4. If $K : H \to B$ is a bounded finite rank operator, then $K$ is Hilbert Schmidt.
5. Let $P_N x := \sum_{n=1}^N \langle x|e_n\rangle e_n$ be orthogonal projection onto $\operatorname{span}\{e_n : n \le N\} \subset H$ and for $K \in HS(H,B)$, let $K_N := KP_N$. Then
\[
\|K - K_N\|_{op}^2 \le \|K - K_N\|_{HS}^2 \to 0 \text{ as } N \to \infty,
\]
which shows that finite rank operators are dense in $(HS(H,B), \|\cdot\|_{HS})$. In particular, $HS(H,B) \subset \mathcal{K}(H,B)$, the space of compact operators from $H \to B$.
6. If $Y$ is another Hilbert space and $A : Y \to H$ and $C : B \to Y$ are bounded operators, then
\[
\|KA\|_{HS} \le \|K\|_{HS}\|A\|_{op} \text{ and } \|CK\|_{HS} \le \|K\|_{HS}\|C\|_{op},
\]
in particular $HS(H,H)$ is an ideal in $L(H)$.
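In finite dimensions the Hilbert Schmidt norm is the Frobenius norm of a matrix, so items 1. and 2. can be sanity-checked numerically. The sketch below assumes $H = \mathbb{C}^4$ and $B = \mathbb{C}^5$ with a random matrix `K`; the helper `hs_norm_sq` is a name introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
K = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))  # K : C^4 -> C^5

def hs_norm_sq(operator, basis):
    """Sum of ||operator e||^2 over the columns e of `basis` (an orthonormal basis)."""
    return sum(np.linalg.norm(operator @ basis[:, n]) ** 2
               for n in range(basis.shape[1]))

standard = np.eye(4)
# The Q factor of a QR factorization is unitary, so its columns are another orthonormal basis.
rotated, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))

hs_standard = hs_norm_sq(K, standard)
hs_rotated = hs_norm_sq(K, rotated)
hs_adjoint = hs_norm_sq(K.conj().T, np.eye(5))
op_norm = np.linalg.norm(K, 2)   # operator norm = largest singular value
```

The three values `hs_standard`, `hs_rotated` and `hs_adjoint` should coincide up to rounding, and `op_norm` should not exceed the square root of any of them.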
Proof. Items 1. and 2. By Parseval's equality and Fubini's theorem for sums,
\[
\sum_{n=1}^\infty \|Ke_n\|^2 = \sum_{n=1}^\infty\sum_{m=1}^\infty |\langle Ke_n|u_m\rangle|^2 = \sum_{m=1}^\infty\sum_{n=1}^\infty |\langle e_n|K^* u_m\rangle|^2 = \sum_{m=1}^\infty \|K^* u_m\|^2.
\]
This proves $\|K\|_{HS}$ is well defined independent of basis and that $\|K\|_{HS} = \|K^*\|_{HS}$. For $x \in H\setminus\{0\}$, $x/\|x\|$ may be taken to be the first element in an orthonormal basis for $H$ and hence
\[
\left\|K\frac{x}{\|x\|}\right\| \le \|K\|_{HS}.
\]
Multiplying this inequality by $\|x\|$ shows $\|Kx\| \le \|K\|_{HS}\|x\|$ and hence $\|K\|_{op} \le \|K\|_{HS}$.
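Item 3. For $K_1, K_2 \in L(H,B)$,
\[
\|K_1 + K_2\|_{HS} = \sqrt{\sum_{n=1}^\infty \|K_1 e_n + K_2 e_n\|^2}
\le \sqrt{\sum_{n=1}^\infty \left[\|K_1 e_n\| + \|K_2 e_n\|\right]^2}
= \left\|\{\|K_1 e_n\| + \|K_2 e_n\|\}_{n=1}^\infty\right\|_{\ell^2}
\]
\[
\le \left\|\{\|K_1 e_n\|\}_{n=1}^\infty\right\|_{\ell^2} + \left\|\{\|K_2 e_n\|\}_{n=1}^\infty\right\|_{\ell^2}
= \|K_1\|_{HS} + \|K_2\|_{HS}.
\]
From this triangle inequality and the homogeneity properties of $\|\cdot\|_{HS}$, we now easily see that $HS(H,B)$ is a subspace of $L(H,B)$ and $\|\cdot\|_{HS}$ is a norm on $HS(H,B)$. Since
\[
\sum_{n=1}^\infty |\langle K_1 e_n|K_2 e_n\rangle| \le \sum_{n=1}^\infty \|K_1 e_n\|\|K_2 e_n\|
\le \sqrt{\sum_{n=1}^\infty \|K_1 e_n\|^2}\sqrt{\sum_{n=1}^\infty \|K_2 e_n\|^2} = \|K_1\|_{HS}\|K_2\|_{HS},
\]
the sum in Eq. (23.3) is well defined and is easily checked to define an inner product on $HS(H,B)$ such that $\|K\|_{HS}^2 = \langle K|K\rangle_{HS}$.
The proof that $(HS(H,B), \|\cdot\|_{HS})$ is complete is very similar to the proof of Theorem 7.5. Indeed, suppose $\{K_m\}_{m=1}^\infty$ is a $\|\cdot\|_{HS}$-Cauchy sequence in $HS(H,B)$. Because $L(H,B)$ is complete and $\|\cdot\|_{op} \le \|\cdot\|_{HS}$, there exists $K \in L(H,B)$ such that $\|K - K_m\|_{op} \to 0$ as $m \to \infty$. Thus, making use of Fatou's Lemma 4.12,
\[
\|K - K_m\|_{HS}^2 = \sum_{n=1}^\infty \|(K - K_m)e_n\|^2 = \sum_{n=1}^\infty \liminf_{l\to\infty} \|(K_l - K_m)e_n\|^2
\]
\[
\le \liminf_{l\to\infty} \sum_{n=1}^\infty \|(K_l - K_m)e_n\|^2 = \liminf_{l\to\infty} \|K_l - K_m\|_{HS}^2 \to 0 \text{ as } m \to \infty.
\]
Hence $K \in HS(H,B)$ and $\lim_{m\to\infty} \|K - K_m\|_{HS}^2 = 0$.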
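Item 4. Since $\operatorname{Nul}(K^*)^\perp = \overline{\operatorname{Ran}(K)} = \operatorname{Ran}(K)$,
\[
\|K\|_{HS}^2 = \|K^*\|_{HS}^2 = \sum_{n=1}^N \|K^* v_n\|_H^2 < \infty
\]
where $N := \dim \operatorname{Ran}(K)$ and $\{v_n\}_{n=1}^N$ is an orthonormal basis for $\operatorname{Ran}(K) = K(H)$.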
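Item 5. Simply observe,
\[
\|K - K_N\|_{op}^2 \le \|K - K_N\|_{HS}^2 = \sum_{n>N} \|Ke_n\|^2 \to 0 \text{ as } N \to \infty.
\]
Item 6. For $C \in L(B,Y)$ and $K \in L(H,B)$ then
\[
\|CK\|_{HS}^2 = \sum_{n=1}^\infty \|CKe_n\|^2 \le \|C\|_{op}^2 \sum_{n=1}^\infty \|Ke_n\|^2 = \|C\|_{op}^2\|K\|_{HS}^2
\]
and for $A \in L(Y,H)$,
\[
\|KA\|_{HS} = \|A^* K^*\|_{HS} \le \|A^*\|_{op}\|K^*\|_{HS} = \|A\|_{op}\|K\|_{HS}.
\]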
Remark 23.6. The separability assumptions made in Proposition 23.5 are unnecessary. In general, we define
\[
\|K\|_{HS}^2 = \sum_{e\in\beta} \|Ke\|^2
\]
where $\beta \subset H$ is an orthonormal basis. The same proof of Item 1. of Proposition 23.5 shows $\|K\|_{HS}$ is well defined and $\|K\|_{HS} = \|K^*\|_{HS}$. If $\|K\|_{HS}^2 < \infty$, then there exists a countable subset $\beta_0 \subset \beta$ such that $Ke = 0$ if $e \in \beta\setminus\beta_0$. Let $H_0 := \overline{\operatorname{span}(\beta_0)}$ and $B_0 := \overline{K(H_0)}$. Then $K(H) \subset B_0$, $K|_{H_0^\perp} = 0$ and hence by applying the results of Proposition 23.5 to $K|_{H_0} : H_0 \to B_0$ one easily sees that the separability of $H$ and $B$ are unnecessary in Proposition 23.5.
Example 23.7. Let $(X, \mu)$ be a measure space, $H = L^2(X, \mu)$ and
\[
k(x,y) := \sum_{i=1}^n f_i(x) g_i(y)
\]
where $f_i, g_i \in L^2(X, \mu)$ for $i = 1, \dots, n$. Define
\[
(Kf)(x) = \int_X k(x,y) f(y)\,d\mu(y),
\]
then $K : L^2(X,\mu) \to L^2(X,\mu)$ is a finite rank operator and hence Hilbert Schmidt.
Exercise 23.5. Suppose that $(X, \mu)$ is a $\sigma$-finite measure space such that $H = L^2(X,\mu)$ is separable and $k : X\times X \to \mathbb{R}$ is a measurable function, such that
\[
\|k\|_{L^2(X\times X, \mu\otimes\mu)}^2 := \int_{X\times X} |k(x,y)|^2\,d\mu(x)\,d\mu(y) < \infty.
\]
Define, for $f \in H$,
\[
Kf(x) = \int_X k(x,y) f(y)\,d\mu(y),
\]
when the integral makes sense. Show:
1. $Kf(x)$ is defined for $\mu$-a.e. $x$ in $X$.
2. The resulting function $Kf$ is in $H$ and $K : H \to H$ is linear.
3. $\|K\|_{HS} = \|k\|_{L^2(X\times X, \mu\otimes\mu)} < \infty$. (This implies $K \in HS(H,H)$.)
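Example 23.8. Suppose that $\Omega \subset \mathbb{R}^n$ is a bounded set, $\alpha < n$, then the operator $K : L^2(\Omega, m) \to L^2(\Omega, m)$ defined by
\[
Kf(x) := \int_\Omega \frac{1}{|x-y|^\alpha} f(y)\,dy
\]
is compact.
Proof. For $\varepsilon \ge 0$, let
\[
K_\varepsilon f(x) := \int_\Omega \frac{1}{|x-y|^\alpha + \varepsilon} f(y)\,dy = [g_\varepsilon * (1_\Omega f)](x)
\]
where $g_\varepsilon(x) = \frac{1}{|x|^\alpha + \varepsilon} 1_C(x)$ with $C \subset \mathbb{R}^n$ a sufficiently large ball such that $\Omega - \Omega \subset C$. Since $\alpha < n$, it follows that
\[
g_\varepsilon \le g_0 = |\cdot|^{-\alpha} 1_C \in L^1(\mathbb{R}^n, m).
\]
Hence it follows by Proposition 22.23 that
\[
\|(K - K_\varepsilon)f\|_{L^2(\Omega)} \le \|(g_0 - g_\varepsilon) * (1_\Omega f)\|_{L^2(\mathbb{R}^n)}
\le \|g_0 - g_\varepsilon\|_{L^1(\mathbb{R}^n)}\|1_\Omega f\|_{L^2(\mathbb{R}^n)}
= \|g_0 - g_\varepsilon\|_{L^1(\mathbb{R}^n)}\|f\|_{L^2(\Omega)}
\]
which implies
\[
\|K - K_\varepsilon\|_{B(L^2(\Omega))} \le \|g_0 - g_\varepsilon\|_{L^1(\mathbb{R}^n)} = \int_C \left|\frac{1}{|x|^\alpha + \varepsilon} - \frac{1}{|x|^\alpha}\right| dx \to 0 \text{ as } \varepsilon \downarrow 0 \tag{23.4}
\]
by the dominated convergence theorem. For any $\varepsilon > 0$,
\[
\int_{\Omega\times\Omega} \left[\frac{1}{|x-y|^\alpha + \varepsilon}\right]^2 dx\,dy < \infty,
\]
and hence $K_\varepsilon$ is Hilbert Schmidt and hence compact. By Eq. (23.4), $K_\varepsilon \to K$ as $\varepsilon \downarrow 0$ and hence it follows that $K$ is compact as well.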
Exercise 23.6. Let $H := L^2([0,1], m)$, $k(x,y) := \min(x,y)$ for $x, y \in [0,1]$ and define $K : H \to H$ by
\[
Kf(x) = \int_0^1 k(x,y) f(y)\,dy.
\]
By Exercise 23.5, $K$ is a Hilbert Schmidt operator and it is easily seen that $K$ is self-adjoint. Show:
1. If $g \in C^2([0,1])$ with $g(0) = 0 = g'(1)$, then $Kg'' = -g$. Use this to conclude $\langle Kf|g''\rangle = -\langle f|g\rangle$ for all $g \in C_c^\infty((0,1))$ and consequently that $\operatorname{Nul}(K) = \{0\}$.
2. Now suppose that $f \in H$ is an eigenvector of $K$ with eigenvalue $\lambda \neq 0$. Show that there is a version$^1$ of $f$ which is in $C([0,1]) \cap C^2((0,1))$ and this version, still denoted by $f$, solves
\[
\lambda f'' = -f \text{ with } f(0) = f'(1) = 0, \tag{23.5}
\]
where $f'(1) := \lim_{x\uparrow 1} f'(x)$.
3. Use Eq. (23.5) to find all the eigenvalues and eigenfunctions of $K$.
4. Use the results above along with the spectral Theorem 8.46, to show
\[
\left\{\sqrt{2}\sin\left(\left(n + \tfrac{1}{2}\right)\pi x\right) : n \in \mathbb{N}_0\right\}
\]
is an orthonormal basis for $L^2([0,1], m)$.
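A numerical experiment can be used to check one's answers to Exercises 23.5 and 23.6. The sketch below discretizes $K$ with a midpoint rule (the grid size `N` is an arbitrary choice) and compares the largest eigenvalues with the candidate values $((n+\tfrac12)\pi)^{-2}$ suggested by the basis in item 4; it also compares the discrete Hilbert Schmidt norm with $\|k\|_{L^2}^2 = \int_0^1\!\!\int_0^1 \min(x,y)^2\,dx\,dy = 1/6$. This is a sanity check, not a proof.

```python
import numpy as np

N = 400
x = (np.arange(N) + 0.5) / N                  # midpoints of a uniform grid on [0, 1]
K = np.minimum.outer(x, x) / N                # midpoint-rule discretization of K
eigenvalues = np.sort(np.linalg.eigvalsh(K))[::-1]   # K is symmetric; sort descending

n = np.arange(5)
candidate = 1.0 / ((n + 0.5) * np.pi) ** 2    # candidate eigenvalues for n = 0, ..., 4

hs_sq = np.sum(K ** 2)                        # discrete analogue of ||K||_HS^2
```

With these choices the leading entries of `eigenvalues` should be close to `candidate`, and `hs_sq` should be close to $1/6$.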
Exercise 23.7. Let $(X, \mathcal{M}, \mu)$ be a $\sigma$-finite measure space, $a \in L^\infty(\mu)$ and let $A$ be the bounded operator on $H := L^2(\mu)$ defined by $Af(x) = a(x) f(x)$ for all $f \in H$. (We will denote $A$ by $M_a$ in the future.) Show:
1. $\|A\|_{op} = \|a\|_{L^\infty(\mu)}$.
2. $A^* = M_{\bar{a}}$.
3. $\sigma(A) = \operatorname{essran}(a)$ where $\sigma(A)$ is the spectrum of $A$ and $\operatorname{essran}(a)$ is the essential range of $a$, see Definitions 8.31 and 21.40 respectively.
4. Show $\lambda$ is an eigenvalue for $A = M_a$ iff $\mu(a = \lambda) > 0$, i.e. iff $a$ has a "flat spot" of height $\lambda$.
23.3 Fourier Series Considerations
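Throughout this section we will let $d\theta$, $dx$, $d\alpha$, etc. denote Lebesgue measure on $\mathbb{R}^d$ normalized so that the cube, $Q := (-\pi, \pi]^d$, has measure one, i.e. $d\theta = (2\pi)^{-d}\,dm(\theta)$ where $m$ is standard Lebesgue measure on $\mathbb{R}^d$. As usual, for $\alpha \in \mathbb{N}_0^d$, let
\[
D_\theta^\alpha = \left(\frac{1}{i}\right)^{|\alpha|} \frac{\partial^{|\alpha|}}{\partial\theta_1^{\alpha_1}\cdots\partial\theta_d^{\alpha_d}}.
\]
$^1$ A measurable function $g$ is called a version of $f$ iff $g = f$ a.e.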
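Notation 23.9 Let $C_{per}^k(\mathbb{R}^d)$ denote the $2\pi$-periodic functions in $C^k(\mathbb{R}^d)$, that is $f \in C_{per}^k(\mathbb{R}^d)$ iff $f \in C^k(\mathbb{R}^d)$ and $f(\theta + 2\pi e_i) = f(\theta)$ for all $\theta \in \mathbb{R}^d$ and $i = 1, 2, \dots, d$. Further let $\langle\cdot|\cdot\rangle$ denote the inner product on the Hilbert space, $H := L^2([-\pi,\pi]^d)$, given by
\[
\langle f|g\rangle := \int_Q f(\theta)\bar{g}(\theta)\,d\theta = \left(\frac{1}{2\pi}\right)^d \int_Q f(\theta)\bar{g}(\theta)\,dm(\theta)
\]
and define $\varphi_k(\theta) := e^{ik\cdot\theta}$ for all $k \in \mathbb{Z}^d$. For $f \in L^1(Q)$, we will write $\tilde{f}(k)$ for the Fourier coefficient,
\[
\tilde{f}(k) := \langle f|\varphi_k\rangle = \int_Q f(\theta) e^{-ik\cdot\theta}\,d\theta. \tag{23.6}
\]
Since any $2\pi$-periodic function on $\mathbb{R}^d$ may be identified with a function on the $d$-dimensional torus, $\mathbb{T}^d \cong \mathbb{R}^d/(2\pi\mathbb{Z})^d \cong (S^1)^d$, I may also write $C^k(\mathbb{T}^d)$ for $C_{per}^k(\mathbb{R}^d)$ and $L^p(\mathbb{T}^d)$ for $L^p(Q)$ where elements $f \in L^p(Q)$ are to be thought of as their extensions to $2\pi$-periodic functions on $\mathbb{R}^d$.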
Theorem 23.10 (Fourier Series). The functions $\beta := \{\varphi_k : k \in \mathbb{Z}^d\}$ form an orthonormal basis for $H$, i.e. if $f \in H$ then
\[
f = \sum_{k\in\mathbb{Z}^d} \langle f|\varphi_k\rangle\varphi_k = \sum_{k\in\mathbb{Z}^d} \tilde{f}(k)\varphi_k \tag{23.7}
\]
where the convergence takes place in $L^2([-\pi,\pi]^d)$.
Proof. Simple computations show $\beta := \{\varphi_k : k \in \mathbb{Z}^d\}$ is an orthonormal set. We now claim that $\beta$ is an orthonormal basis. To see this recall that $C_c((-\pi,\pi)^d)$ is dense in $L^2((-\pi,\pi)^d, dm)$. Any $f \in C_c((-\pi,\pi))$ may be extended to be a continuous $2\pi$-periodic function on $\mathbb{R}$ and hence by Exercise 15.13 and Remark 15.44, $f$ may uniformly (and hence in $L^2$) be approximated by a trigonometric polynomial. Therefore $\beta$ is a total orthonormal set, i.e. $\beta$ is an orthonormal basis.
This may also be proved by first proving the case $d = 1$ as above and then using Exercise 23.2 inductively to get the result for any $d$.
Exercise 23.8. Let $A$ be the operator defined in Lemma 8.37 and for $g \in L^2(\mathbb{T})$, let $Ug(k) := \tilde{g}(k)$ so that $U : L^2(\mathbb{T}) \to \ell^2(\mathbb{Z})$ is unitary. Show $U^{-1}AU = M_a$ where $a \in C_{per}^\infty(\mathbb{R})$ is a function to be found. Use this representation and the results in Exercise 23.7 to give a simple proof of the results in Lemma 8.37.
23.3.1 Dirichlet, Fejer and Kernels
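Although the sum in Eq. (23.7) is guaranteed to converge relative to the Hilbertian norm on $H$ it certainly need not converge pointwise even if $f \in C_{per}(\mathbb{R}^d)$ as will be proved in Section 25.3.1 below. Nevertheless, if $f$ is sufficiently regular, then the sum in Eq. (23.7) will converge pointwise as we will now show. In the process we will give a direct and constructive proof of the result in Exercise 15.13, see Theorem 23.12 below.
Let us restrict our attention to $d = 1$ here. Consider
\[
f_n(\theta) = \sum_{|k|\le n} \tilde{f}(k)\varphi_k(\theta) = \sum_{|k|\le n} \frac{1}{2\pi}\left[\int_{[-\pi,\pi]} f(x) e^{-ik\cdot x}\,dx\right]\varphi_k(\theta)
\]
\[
= \frac{1}{2\pi}\int_{[-\pi,\pi]} f(x) \sum_{|k|\le n} e^{ik\cdot(\theta - x)}\,dx
= \frac{1}{2\pi}\int_{[-\pi,\pi]} f(x) D_n(\theta - x)\,dx \tag{23.8}
\]
where
\[
D_n(\theta) := \sum_{k=-n}^n e^{ik\theta}
\]
is called the Dirichlet kernel. Letting $\alpha = e^{i\theta/2}$, we have
\[
D_n(\theta) = \sum_{k=-n}^n \alpha^{2k} = \frac{\alpha^{2(n+1)} - \alpha^{-2n}}{\alpha^2 - 1} = \frac{\alpha^{2n+1} - \alpha^{-(2n+1)}}{\alpha - \alpha^{-1}}
= \frac{2i\sin(n + \frac{1}{2})\theta}{2i\sin\frac{1}{2}\theta} = \frac{\sin(n + \frac{1}{2})\theta}{\sin\frac{1}{2}\theta}
\]
and therefore
\[
D_n(\theta) := \sum_{k=-n}^n e^{ik\theta} = \frac{\sin(n + \frac{1}{2})\theta}{\sin\frac{1}{2}\theta}, \tag{23.9}
\]
with the understanding that the right side of this equation is $2n+1$ whenever $\theta \in 2\pi\mathbb{Z}$; see Figure 23.3.1.
[Figure 23.3.1: a plot of $D_1$ and $D_{10}$.]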
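The closed form in Eq. (23.9) is easy to confirm numerically. In the following sketch the function names `dirichlet_sum` and `dirichlet_closed` are illustrative, and the sample points `theta` are chosen to stay away from $2\pi\mathbb{Z}$, where the quotient has to be read as $2n+1$.

```python
import numpy as np

def dirichlet_sum(n, theta):
    """D_n(theta) computed directly from the defining sum over k = -n, ..., n."""
    k = np.arange(-n, n + 1)
    return np.exp(1j * np.outer(theta, k)).sum(axis=1).real

def dirichlet_closed(n, theta):
    """The quotient sin((n + 1/2) theta) / sin(theta / 2) from Eq. (23.9)."""
    return np.sin((n + 0.5) * theta) / np.sin(0.5 * theta)

theta = np.linspace(0.1, 3.0, 50)   # avoids the removable singularities at 2*pi*Z
max_gap = max(np.max(np.abs(dirichlet_sum(n, theta) - dirichlet_closed(n, theta)))
              for n in (1, 10))
```

Here `max_gap` measures the largest discrepancy between the two expressions for $n = 1$ and $n = 10$ and should be at the level of rounding error.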
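Theorem 23.11. Suppose $f \in L^1([-\pi,\pi], dm)$ and $f$ is differentiable at some $\theta \in [-\pi,\pi]$, then $\lim_{n\to\infty} f_n(\theta) = f(\theta)$ where $f_n$ is as in Eq. (23.8).
Proof. Observe that
\[
\frac{1}{2\pi}\int_{[-\pi,\pi]} D_n(\theta - x)\,dx = \frac{1}{2\pi}\int_{[-\pi,\pi]} \sum_{|k|\le n} e^{ik\cdot(\theta - x)}\,dx = 1
\]
and therefore,
\[
f_n(\theta) - f(\theta) = \frac{1}{2\pi}\int_{[-\pi,\pi]} [f(x) - f(\theta)] D_n(\theta - x)\,dx
= \frac{1}{2\pi}\int_{[-\pi,\pi]} [f(\theta - x) - f(\theta)] D_n(x)\,dx
\]
\[
= \frac{1}{2\pi}\int_{[-\pi,\pi]} \left[\frac{f(\theta - x) - f(\theta)}{\sin\frac{1}{2}x}\right] \sin\left(n + \tfrac{1}{2}\right)x\,dx. \tag{23.10}
\]
If $f$ is differentiable at $\theta$, the last expression in Eq. (23.10) tends to $0$ as $n \to \infty$ by the Riemann Lebesgue Lemma (Corollary 22.17 or Lemma 22.37) and the fact that $1_{[-\pi,\pi]}(x)\frac{f(\theta - x) - f(\theta)}{\sin\frac{1}{2}x} \in L^1(dx)$.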
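Despite the Dirichlet kernel not being positive, it still satisfies the approximate $\delta$-sequence property, $\frac{1}{2\pi}D_n \to \delta_0$ as $n \to \infty$, when acting on $C^1$-periodic functions in $\theta$. In order to improve the convergence properties it is reasonable to try to replace $\{f_n : n \in \mathbb{N}_0\}$ by the sequence of averages (see Exercise 7.14),
\[
F_N(\theta) = \frac{1}{N+1}\sum_{n=0}^N f_n(\theta) = \frac{1}{N+1}\sum_{n=0}^N \frac{1}{2\pi}\int_{[-\pi,\pi]} f(x)\sum_{|k|\le n} e^{ik\cdot(\theta - x)}\,dx
= \frac{1}{2\pi}\int_{[-\pi,\pi]} K_N(\theta - x) f(x)\,dx
\]
where
\[
K_N(\theta) := \frac{1}{N+1}\sum_{n=0}^N \sum_{|k|\le n} e^{ik\cdot\theta} \tag{23.11}
\]
is the Fejér kernel.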
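Theorem 23.12. The Fejér kernel $K_N$ in Eq. (23.11) satisfies:
1.
\[
K_N(\theta) = \sum_{n=-N}^N \left[1 - \frac{|n|}{N+1}\right] e^{in\theta} \tag{23.12}
\]
\[
= \frac{1}{N+1}\frac{\sin^2\left(\frac{N+1}{2}\theta\right)}{\sin^2\left(\frac{\theta}{2}\right)}. \tag{23.13}
\]
2. $K_N(\theta) \ge 0$.
3. $\frac{1}{2\pi}\int_{-\pi}^\pi K_N(\theta)\,d\theta = 1$.
4. $\sup_{\varepsilon\le|\theta|\le\pi} K_N(\theta) \to 0$ as $N \to \infty$ for all $\varepsilon > 0$, see Figure 23.1.
5. For any continuous $2\pi$-periodic function $f$ on $\mathbb{R}$, $K_N * f(\theta) \to f(\theta)$ uniformly in $\theta$ as $N \to \infty$, where
\[
K_N * f(\theta) = \frac{1}{2\pi}\int_{-\pi}^\pi K_N(\theta - \alpha) f(\alpha)\,d\alpha
= \sum_{n=-N}^N \left[1 - \frac{|n|}{N+1}\right]\tilde{f}(n) e^{in\theta}. \tag{23.14}
\]
Fig. 23.1. Plots of $K_N(\theta)$ for $N = 2$, $7$ and $13$.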
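The items of Theorem 23.12 can be illustrated numerically. The sketch below compares the averaged sum (23.11) with the closed form (23.13) and evaluates $K_N * f(0)$ for the continuous $2\pi$-periodic extension of $f(\alpha) = |\alpha|$; the function names, the test function and the quadrature size `num_nodes` are assumptions of this example.

```python
import numpy as np

def fejer_average(N, theta):
    """K_N(theta) from the definition (23.11): the average of D_0, ..., D_N."""
    theta = np.asarray(theta, dtype=float)
    total = np.zeros_like(theta)
    for n in range(N + 1):
        k = np.arange(-n, n + 1)
        total += np.exp(1j * np.outer(theta, k)).sum(axis=1).real
    return total / (N + 1)

def fejer_closed(N, theta):
    """The closed form (23.13), valid away from theta in 2*pi*Z."""
    return np.sin(0.5 * (N + 1) * theta) ** 2 / ((N + 1) * np.sin(0.5 * theta) ** 2)

def fejer_mean(f, N, theta, num_nodes=2048):
    """(1/2pi) * integral over [-pi, pi] of K_N(theta - alpha) f(alpha), midpoint rule."""
    alpha = -np.pi + 2 * np.pi * (np.arange(num_nodes) + 0.5) / num_nodes
    return np.mean(fejer_average(N, theta - alpha) * f(alpha))

theta = np.linspace(0.1, 3.0, 40)
gap = np.max(np.abs(fejer_average(7, theta) - fejer_closed(7, theta)))
coarse, fine = fejer_mean(np.abs, 5, 0.0), fejer_mean(np.abs, 50, 0.0)
```

Since $f(0) = 0$, item 5. predicts that `fine` is closer to $0$ than `coarse`; the kernel values themselves are nonnegative as in item 2.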
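Proof. 1. Equation (23.12) is a consequence of the identity,
\[
\sum_{n=0}^N \sum_{|k|\le n} e^{ik\cdot\theta} = \sum_{|k|\le n\le N} e^{ik\cdot\theta} = \sum_{|k|\le N} (N + 1 - |k|) e^{ik\cdot\theta}.
\]
Moreover, letting $\alpha = e^{i\theta/2}$ and using Eq. (3.3) shows
\[
K_N(\theta) = \frac{1}{N+1}\sum_{n=0}^N \sum_{|k|\le n} \alpha^{2k} = \frac{1}{N+1}\sum_{n=0}^N \frac{\alpha^{2n+2} - \alpha^{-2n}}{\alpha^2 - 1}
\]
\[
= \frac{1}{(N+1)(\alpha - \alpha^{-1})}\sum_{n=0}^N \left[\alpha^{2n+1} - \alpha^{-2n-1}\right]
= \frac{1}{(N+1)(\alpha - \alpha^{-1})}\sum_{n=0}^N \left[\alpha\alpha^{2n} - \alpha^{-1}\alpha^{-2n}\right]
\]
\[
= \frac{1}{(N+1)(\alpha - \alpha^{-1})}\left[\alpha\frac{\alpha^{2N+2} - 1}{\alpha^2 - 1} - \alpha^{-1}\frac{\alpha^{-2N-2} - 1}{\alpha^{-2} - 1}\right]
\]
\[
= \frac{1}{(N+1)(\alpha - \alpha^{-1})^2}\left[\alpha^{2(N+1)} - 1 + \alpha^{-2(N+1)} - 1\right]
= \frac{1}{(N+1)(\alpha - \alpha^{-1})^2}\left[\alpha^{(N+1)} - \alpha^{-(N+1)}\right]^2
\]
\[
= \frac{1}{N+1}\frac{\sin^2((N+1)\theta/2)}{\sin^2(\theta/2)}.
\]
Items 2. and 3. follow easily from Eqs. (23.13) and (23.12) respectively. Item 4. is a consequence of the elementary estimate;
\[
\sup_{\varepsilon\le|\theta|\le\pi} K_N(\theta) \le \frac{1}{N+1}\frac{1}{\sin^2\left(\frac{\varepsilon}{2}\right)}
\]
and is clearly indicated in Figure 23.1. Item 5. now follows by the standard approximate $\delta$-function arguments, namely,
\[
|K_N * f(\theta) - f(\theta)| = \frac{1}{2\pi}\left|\int_{-\pi}^\pi K_N(\theta - \alpha)[f(\alpha) - f(\theta)]\,d\alpha\right|
\le \frac{1}{2\pi}\int_{-\pi}^\pi K_N(\alpha)|f(\theta - \alpha) - f(\theta)|\,d\alpha
\]
\[
\le \frac{1}{N+1}\frac{1}{\sin^2\left(\frac{\varepsilon}{2}\right)}\, 2\|f\|_\infty + \frac{1}{2\pi}\int_{|\alpha|\le\varepsilon} K_N(\alpha)|f(\theta - \alpha) - f(\theta)|\,d\alpha
\]
\[
\le \frac{1}{N+1}\frac{1}{\sin^2\left(\frac{\varepsilon}{2}\right)}\, 2\|f\|_\infty + \sup_{|\alpha|\le\varepsilon} |f(\theta - \alpha) - f(\theta)|.
\]
Therefore,
\[
\limsup_{N\to\infty} \|K_N * f - f\|_\infty \le \sup_\theta \sup_{|\alpha|\le\varepsilon} |f(\theta - \alpha) - f(\theta)| \to 0 \text{ as } \varepsilon \downarrow 0,
\]
where the last limit uses the uniform continuity of $f$.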
23.3.2 The Dirichlet Problems on D and the Poisson Kernel
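Let $D := \{z \in \mathbb{C} : |z| < 1\}$ be the open unit disk in $\mathbb{C} \cong \mathbb{R}^2$, write $z \in \mathbb{C}$ as $z = x + iy$ or $z = re^{i\theta}$, and let $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ be the Laplacian acting on $C^2(D)$.
Theorem 23.13 (Dirichlet problem for $D$). To every continuous function $g \in C(\operatorname{bd}(D))$ there exists a unique function $u \in C(\bar{D}) \cap C^2(D)$ solving
\[
\Delta u(z) = 0 \text{ for } z \in D \text{ and } u|_{\partial D} = g. \tag{23.15}
\]
Moreover for $r < 1$, $u$ is given by,
\[
u(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^\pi P_r(\theta - \alpha) u(e^{i\alpha})\,d\alpha =: P_r * u(e^{i\theta}) \tag{23.16}
\]
\[
= \frac{1}{2\pi}\operatorname{Re}\int_{-\pi}^\pi \frac{1 + re^{i(\theta - \alpha)}}{1 - re^{i(\theta - \alpha)}} u(e^{i\alpha})\,d\alpha \tag{23.17}
\]
where $P_r$ is the Poisson kernel defined by
\[
P_r(\delta) := \frac{1 - r^2}{1 - 2r\cos\delta + r^2}.
\]
(The problem posed in Eq. (23.15) is called the Dirichlet problem for $D$.)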
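Proof. In this proof, we are going to be identifying $S^1 = \operatorname{bd}(D) := \{z \in \bar{D} : |z| = 1\}$ with $[-\pi,\pi]/(\pi \sim -\pi)$ by the map $\theta \in [-\pi,\pi] \to e^{i\theta} \in S^1$. Also recall that the Laplacian $\Delta$ may be expressed in polar coordinates as,
\[
\Delta u = r^{-1}\partial_r\left(r\,\partial_r u\right) + \frac{1}{r^2}\partial_\theta^2 u,
\]
where
\[
(\partial_r u)\left(re^{i\theta}\right) = \frac{\partial}{\partial r} u\left(re^{i\theta}\right) \text{ and } (\partial_\theta u)\left(re^{i\theta}\right) = \frac{\partial}{\partial\theta} u\left(re^{i\theta}\right).
\]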
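Uniqueness. Suppose $u$ is a solution to Eq. (23.15) and let
\[
\tilde{g}(k) := \frac{1}{2\pi}\int_{-\pi}^\pi g(e^{i\theta}) e^{-ik\theta}\,d\theta
\]
and
\[
\tilde{u}(r,k) := \frac{1}{2\pi}\int_{-\pi}^\pi u(re^{i\theta}) e^{-ik\theta}\,d\theta \tag{23.18}
\]
be the Fourier coefficients of $g(\theta)$ and $\theta \to u\left(re^{i\theta}\right)$ respectively. Then for $r \in (0,1)$,
\[
r^{-1}\partial_r(r\,\partial_r\tilde{u}(r,k)) = \frac{1}{2\pi}\int_{-\pi}^\pi r^{-1}\partial_r\left(r\,\partial_r u\right)(re^{i\theta}) e^{-ik\theta}\,d\theta
= -\frac{1}{2\pi}\int_{-\pi}^\pi \frac{1}{r^2}\partial_\theta^2 u(re^{i\theta}) e^{-ik\theta}\,d\theta
\]
\[
= -\frac{1}{r^2}\frac{1}{2\pi}\int_{-\pi}^\pi u(re^{i\theta})\,\partial_\theta^2 e^{-ik\theta}\,d\theta
= \frac{1}{r^2} k^2\tilde{u}(r,k)
\]
or equivalently
\[
r\,\partial_r(r\,\partial_r\tilde{u}(r,k)) = k^2\tilde{u}(r,k). \tag{23.19}
\]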
Recall the general solution to
\[
r\,\partial_r(r\,\partial_r y(r)) = k^2 y(r) \tag{23.20}
\]
may be found by trying solutions of the form $y(r) = r^\alpha$ which then implies $\alpha^2 = k^2$ or $\alpha = \pm k$. From this one sees that $\tilde{u}(r,k)$ solving Eq. (23.19) may be written as $\tilde{u}(r,k) = A_k r^{|k|} + B_k r^{-|k|}$ for some constants $A_k$ and $B_k$ when $k \neq 0$. If $k = 0$, the solution to Eq. (23.20) is gotten by simple integration and the result is $\tilde{u}(r,0) = A_0 + B_0\ln r$. Since $\tilde{u}(r,k)$ is bounded near the origin for each $k$ it must be that $B_k = 0$ for all $k \in \mathbb{Z}$. Hence we have shown there exists $A_k \in \mathbb{C}$ such that, for all $r \in (0,1)$,
\[
A_k r^{|k|} = \tilde{u}(r,k) = \frac{1}{2\pi}\int_{-\pi}^\pi u(re^{i\theta}) e^{-ik\theta}\,d\theta. \tag{23.21}
\]
Since all terms of this equation are continuous for $r \in [0,1]$, Eq. (23.21) remains valid for all $r \in [0,1]$ and in particular we have, at $r = 1$, that
\[
A_k = \frac{1}{2\pi}\int_{-\pi}^\pi u(e^{i\theta}) e^{-ik\theta}\,d\theta = \tilde{g}(k).
\]
Hence if $u$ is a solution to Eq. (23.15) then $u$ must be given by
\[
u(re^{i\theta}) = \sum_{k\in\mathbb{Z}} \tilde{g}(k) r^{|k|} e^{ik\theta} \text{ for } r < 1, \tag{23.22}
\]
or equivalently,
\[
u(z) = \sum_{k\in\mathbb{N}_0} \tilde{g}(k) z^k + \sum_{k\in\mathbb{N}} \tilde{g}(-k)\bar{z}^k.
\]
Notice that the theory of the Fourier series implies Eq. (23.22) is valid in the $L^2(d\theta)$-sense. However more is true, since for $r < 1$, the series in Eq. (23.22) is absolutely convergent and in fact defines a $C^\infty$ function (see Exercise 4.11 or Corollary 19.43) which must agree with the continuous function, $\theta \to u\left(re^{i\theta}\right)$, for almost every $\theta$ and hence for all $\theta$. This completes the proof of uniqueness.
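Existence. Given $g \in C(\operatorname{bd}(D))$, let $u$ be defined as in Eq. (23.22). Then, again by Exercise 4.11 or Corollary 19.43, $u \in C^\infty(D)$. So to finish the proof it suffices to show $\lim_{x\to y} u(x) = g(y)$ for all $y \in \operatorname{bd}(D)$. Inserting the formula for $\tilde{g}(k)$ into Eq. (23.22) gives
\[
u(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^\pi P_r(\theta - \alpha) u(e^{i\alpha})\,d\alpha \text{ for all } r < 1
\]
where
\[
P_r(\delta) = \sum_{k\in\mathbb{Z}} r^{|k|} e^{ik\delta} = \sum_{k=0}^\infty r^k e^{ik\delta} + \sum_{k=0}^\infty r^k e^{-ik\delta} - 1
= \operatorname{Re}\left[2\frac{1}{1 - re^{i\delta}} - 1\right] = \operatorname{Re}\left[\frac{1 + re^{i\delta}}{1 - re^{i\delta}}\right]
\]
\[
= \operatorname{Re}\left[\frac{\left(1 + re^{i\delta}\right)\left(1 - re^{-i\delta}\right)}{|1 - re^{i\delta}|^2}\right]
= \operatorname{Re}\left[\frac{1 - r^2 + 2ir\sin\delta}{1 - 2r\cos\delta + r^2}\right] \tag{23.23}
\]
\[
= \frac{1 - r^2}{1 - 2r\cos\delta + r^2}.
\]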
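The Poisson kernel again satisfies the usual approximate $\delta$-function properties (see Figure 2), namely:
1. $P_r(\delta) > 0$ and
\[
\frac{1}{2\pi}\int_{-\pi}^\pi P_r(\theta - \alpha)\,d\alpha = \frac{1}{2\pi}\int_{-\pi}^\pi \sum_{k\in\mathbb{Z}} r^{|k|} e^{ik(\theta - \alpha)}\,d\alpha
= \frac{1}{2\pi}\sum_{k\in\mathbb{Z}} r^{|k|}\int_{-\pi}^\pi e^{ik(\theta - \alpha)}\,d\alpha = 1
\]
and
2.
\[
\sup_{\varepsilon\le|\theta|\le\pi} P_r(\theta) \le \frac{1 - r^2}{1 - 2r\cos\varepsilon + r^2} \to 0 \text{ as } r \uparrow 1.
\]
[Figure 2: a plot of $P_r(\delta)$ for $r = 0.2$, $0.5$ and $0.7$.]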
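Therefore by the same argument used in the proof of Theorem 23.12,
\[
\lim_{r\uparrow 1}\sup_\theta \left|u\left(re^{i\theta}\right) - g\left(e^{i\theta}\right)\right| = \lim_{r\uparrow 1}\sup_\theta \left|(P_r * g)\left(e^{i\theta}\right) - g\left(e^{i\theta}\right)\right| = 0
\]
which certainly implies $\lim_{x\to y} u(x) = g(y)$ for all $y \in \operatorname{bd}(D)$.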
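The series and closed-form expressions for the Poisson kernel in Eq. (23.23), together with the normalization in property 1., can be checked numerically. In this sketch the truncation level `terms` and the quadrature grid `delta` are arbitrary choices made for illustration.

```python
import numpy as np

def poisson_series(r, delta, terms=200):
    """Truncation of the sum over k in Z of r^|k| e^{ik delta}."""
    k = np.arange(-terms, terms + 1)
    return (r ** np.abs(k) * np.exp(1j * np.outer(delta, k))).sum(axis=1).real

def poisson_closed(r, delta):
    """The closed form (1 - r^2) / (1 - 2 r cos(delta) + r^2)."""
    return (1 - r ** 2) / (1 - 2 * r * np.cos(delta) + r ** 2)

# Midpoints of a uniform partition of [-pi, pi], used for a periodic quadrature rule.
delta = -np.pi + 2 * np.pi * (np.arange(1024) + 0.5) / 1024
radii = (0.2, 0.5, 0.7)
gaps = [np.max(np.abs(poisson_series(r, delta) - poisson_closed(r, delta))) for r in radii]
means = [np.mean(poisson_closed(r, delta)) for r in radii]   # approximates (1/2pi) * integral
```

Each entry of `gaps` should be negligible and each entry of `means` should be close to $1$, in agreement with property 1.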
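Remark 23.14 (Harmonic Conjugate). Writing $z = re^{i\theta}$, Eq. (23.17) may be rewritten as
\[
u(z) = \frac{1}{2\pi}\operatorname{Re}\int_{-\pi}^\pi \frac{1 + ze^{-i\alpha}}{1 - ze^{-i\alpha}} u(e^{i\alpha})\,d\alpha
\]
which shows $u = \operatorname{Re} F$ where
\[
F(z) := \frac{1}{2\pi}\int_{-\pi}^\pi \frac{1 + ze^{-i\alpha}}{1 - ze^{-i\alpha}} u(e^{i\alpha})\,d\alpha.
\]
Moreover it follows from Eq. (23.23) that
\[
\operatorname{Im} F(re^{i\theta}) = \frac{1}{\pi}\operatorname{Im}\int_{-\pi}^\pi \frac{r\sin(\theta - \alpha)}{1 - 2r\cos(\theta - \alpha) + r^2} g(e^{i\alpha})\,d\alpha =: (Q_r * u)(e^{i\theta})
\]
where
\[
Q_r(\delta) := \frac{r\sin(\delta)}{1 - 2r\cos(\delta) + r^2}.
\]
From these remarks it follows that $v =: (Q_r * g)(e^{i\theta})$ is the harmonic conjugate of $u$ and $\tilde{P}_r = Q_r$. For more on this point see Section ?? below.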
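23.4 Weak $L^2$-Derivatives
Theorem 23.15 (Weak and Strong Differentiability). Suppose that $f \in L^2(\mathbb{R}^n)$ and $v \in \mathbb{R}^n\setminus\{0\}$. Then the following are equivalent:
1. There exists $\{t_n\}_{n=1}^\infty \subset \mathbb{R}\setminus\{0\}$ such that $\lim_{n\to\infty} t_n = 0$ and
\[
\sup_n \left\|\frac{f(\cdot + t_n v) - f(\cdot)}{t_n}\right\|_2 < \infty.
\]
2. There exists $g \in L^2(\mathbb{R}^n)$ such that $\langle f, \partial_v\phi\rangle = -\langle g, \phi\rangle$ for all $\phi \in C_c^\infty(\mathbb{R}^n)$.
3. There exists $g \in L^2(\mathbb{R}^n)$ and $f_n \in C_c^\infty(\mathbb{R}^n)$ such that $f_n \xrightarrow{L^2} f$ and $\partial_v f_n \xrightarrow{L^2} g$ as $n \to \infty$.
4. There exists $g \in L^2$ such that
\[
\frac{f(\cdot + tv) - f(\cdot)}{t} \xrightarrow{L^2} g \text{ as } t \to 0.
\]
(See Theorem 26.18 for the $L^p$ generalization of this theorem.)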
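Proof. $1. \Longrightarrow 2.$ We may assume, using Theorem 14.43 and passing to a subsequence if necessary, that $\frac{f(\cdot + t_n v) - f(\cdot)}{t_n} \xrightarrow{w} g$ for some $g \in L^2(\mathbb{R}^n)$. Now for $\phi \in C_c^\infty(\mathbb{R}^n)$,
\[
\langle g, \phi\rangle = \lim_{n\to\infty}\left\langle \frac{f(\cdot + t_n v) - f(\cdot)}{t_n}, \phi\right\rangle = \lim_{n\to\infty}\left\langle f, \frac{\phi(\cdot - t_n v) - \phi(\cdot)}{t_n}\right\rangle
= \left\langle f, \lim_{n\to\infty}\frac{\phi(\cdot - t_n v) - \phi(\cdot)}{t_n}\right\rangle = -\langle f, \partial_v\phi\rangle,
\]
wherein we have used the translation invariance of Lebesgue measure and the dominated convergence theorem. $2. \Longrightarrow 3.$ Let $\eta \in C_c^\infty(\mathbb{R}^n, \mathbb{R})$ such that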
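$\int_{\mathbb{R}^n} \eta(x)\,dx = 1$ and let $\eta_m(x) = m^n\eta(mx)$, then by Proposition 22.34, $h_m := \eta_m * f \in C^\infty(\mathbb{R}^n)$ for all $m$ and
\[
\partial_v h_m(x) = \partial_v\eta_m * f(x) = \int_{\mathbb{R}^n} \partial_v\eta_m(x - y) f(y)\,dy = \langle f, -\partial_v[\eta_m(x - \cdot)]\rangle
= \langle g, \eta_m(x - \cdot)\rangle = \eta_m * g(x).
\]
By Theorem 22.32, $h_m \to f \in L^2(\mathbb{R}^n)$ and $\partial_v h_m = \eta_m * g \to g$ in $L^2(\mathbb{R}^n)$ as $m \to \infty$. This shows 3. holds except for the fact that $h_m$ need not have compact support. To fix this let $\psi \in C_c^\infty(\mathbb{R}^n, [0,1])$ such that $\psi = 1$ in a neighborhood of $0$ and let $\psi_\varepsilon(x) = \psi(\varepsilon x)$ and $(\partial_v\psi)_\varepsilon(x) := (\partial_v\psi)(\varepsilon x)$. Then
\[
\partial_v(\psi_\varepsilon h_m) = \partial_v\psi_\varepsilon\, h_m + \psi_\varepsilon\partial_v h_m = \varepsilon(\partial_v\psi)_\varepsilon h_m + \psi_\varepsilon\partial_v h_m
\]
so that $\psi_\varepsilon h_m \to h_m$ in $L^2$ and $\partial_v(\psi_\varepsilon h_m) \to \partial_v h_m$ in $L^2$ as $\varepsilon \downarrow 0$. Let $f_m = \psi_{\varepsilon_m} h_m$ where $\varepsilon_m$ is chosen to be greater than zero but small enough so that
\[
\|\psi_{\varepsilon_m} h_m - h_m\|_2 + \|\partial_v(\psi_{\varepsilon_m} h_m) - \partial_v h_m\|_2 < 1/m.
\]
Then $f_m \in C_c^\infty(\mathbb{R}^n)$, $f_m \to f$ and $\partial_v f_m \to g$ in $L^2$ as $m \to \infty$. $3. \Longrightarrow 4.$ By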
the fundamental theorem of calculus
\[
\frac{\tau_{-tv}f_m(x) - f_m(x)}{t} = \frac{f_m(x + tv) - f_m(x)}{t}
= \frac{1}{t}\int_0^1 \frac{d}{ds} f_m(x + stv)\,ds = \int_0^1 (\partial_v f_m)(x + stv)\,ds. \tag{23.24}
\]
Let
\[
G_t(x) := \int_0^1 \tau_{-stv}g(x)\,ds = \int_0^1 g(x + stv)\,ds
\]
which is defined for almost every $x$ and is in $L^2(\mathbb{R}^n)$ by Minkowski's inequality for integrals, Theorem 21.27. Therefore
\[
\frac{\tau_{-tv}f_m(x) - f_m(x)}{t} - G_t(x) = \int_0^1 [(\partial_v f_m)(x + stv) - g(x + stv)]\,ds
\]
and hence again by Minkowski's inequality for integrals,
\[
\left\|\frac{\tau_{-tv}f_m - f_m}{t} - G_t\right\|_2 \le \int_0^1 \|\tau_{-stv}(\partial_v f_m) - \tau_{-stv}g\|_2\,ds = \int_0^1 \|\partial_v f_m - g\|_2\,ds.
\]
Letting $m \to \infty$ in this equation implies $(\tau_{-tv}f - f)/t = G_t$ a.e. Finally one more application of Minkowski's inequality for integrals implies,
\[
\left\|\frac{\tau_{-tv}f - f}{t} - g\right\|_2 = \|G_t - g\|_2 = \left\|\int_0^1 (\tau_{-stv}g - g)\,ds\right\|_2 \le \int_0^1 \|\tau_{-stv}g - g\|_2\,ds.
\]
By the dominated convergence theorem and Proposition 22.24, the latter term tends to $0$ as $t \to 0$ and this proves 4. The proof is now complete since $4. \Longrightarrow 1.$ is trivial.
23.5 *Conditional Expectation
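In this section let $(\Omega, \mathcal{F}, P)$ be a probability space, i.e. $(\Omega, \mathcal{F}, P)$ is a measure space and $P(\Omega) = 1$. Let $\mathcal{G} \subset \mathcal{F}$ be a sub-sigma-algebra of $\mathcal{F}$ and write $f \in \mathcal{G}_b$ if $f : \Omega \to \mathbb{C}$ is bounded and $f$ is $(\mathcal{G}, \mathcal{B}_{\mathbb{C}})$-measurable. In this section we will write
\[
Ef := \int_\Omega f\,dP.
\]
Definition 23.16 (Conditional Expectation). Let $E_{\mathcal{G}} : L^2(\Omega, \mathcal{F}, P) \to L^2(\Omega, \mathcal{G}, P)$ denote orthogonal projection of $L^2(\Omega, \mathcal{F}, P)$ onto the closed subspace $L^2(\Omega, \mathcal{G}, P)$. For $f \in L^2(\Omega, \mathcal{F}, P)$, we say that $E_{\mathcal{G}}f \in L^2(\Omega, \mathcal{G}, P)$ is the conditional expectation of $f$.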
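Theorem 23.17. Let $(\Omega, \mathcal{F}, P)$ and $\mathcal{G} \subset \mathcal{F}$ be as above and $f, g \in L^2(\Omega, \mathcal{F}, P)$.
1. If $f \ge 0$, $P$-a.e. then $E_{\mathcal{G}}f \ge 0$, $P$-a.e.
2. If $f \ge g$, $P$-a.e. then $E_{\mathcal{G}}f \ge E_{\mathcal{G}}g$, $P$-a.e.
3. $|E_{\mathcal{G}}f| \le E_{\mathcal{G}}|f|$, $P$-a.e.
4. $\|E_{\mathcal{G}}f\|_{L^1} \le \|f\|_{L^1}$ for all $f \in L^2$. So by the B.L.T. Theorem 10.4, $E_{\mathcal{G}}$ extends uniquely to a bounded linear map from $L^1(\Omega, \mathcal{F}, P)$ to $L^1(\Omega, \mathcal{G}, P)$ which we will still denote by $E_{\mathcal{G}}$.
5. If $f \in L^1(\Omega, \mathcal{F}, P)$ then $F = E_{\mathcal{G}}f \in L^1(\Omega, \mathcal{G}, P)$ iff
\[
E(Fh) = E(fh) \text{ for all } h \in \mathcal{G}_b.
\]
6. If $g \in \mathcal{G}_b$ and $f \in L^1(\Omega, \mathcal{F}, P)$, then $E_{\mathcal{G}}(gf) = g\cdot E_{\mathcal{G}}f$, $P$-a.e.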
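Proof. By the definition of orthogonal projection for $h \in \mathcal{G}_b$,
\[
E(fh) = E(f\cdot E_{\mathcal{G}}h) = E(E_{\mathcal{G}}f\cdot h).
\]
So if $f, h \ge 0$ then $0 \le E(fh) = E(E_{\mathcal{G}}f\cdot h)$ and since this holds for all $h \ge 0$ in $\mathcal{G}_b$, $E_{\mathcal{G}}f \ge 0$, $P$-a.e. This proves (1). Item (2) follows by applying item (1) to $f - g$. If $f$ is real, $\pm f \le |f|$ and so by Item (2), $\pm E_{\mathcal{G}}f \le E_{\mathcal{G}}|f|$, i.e. $|E_{\mathcal{G}}f| \le E_{\mathcal{G}}|f|$, $P$-a.e. For complex $f$, let $h \ge 0$ be a bounded and $\mathcal{G}$-measurable function. Then
\[
E[|E_{\mathcal{G}}f|h] = E\left[E_{\mathcal{G}}f\cdot\overline{\operatorname{sgn}(E_{\mathcal{G}}f)}h\right] = E\left[f\cdot\overline{\operatorname{sgn}(E_{\mathcal{G}}f)}h\right]
\le E[|f|h] = E[E_{\mathcal{G}}|f|\cdot h].
\]
Since $h$ is arbitrary, it follows that $|E_{\mathcal{G}}f| \le E_{\mathcal{G}}|f|$, $P$-a.e. Integrating this inequality implies
\[
\|E_{\mathcal{G}}f\|_{L^1} \le E|E_{\mathcal{G}}f| \le E[E_{\mathcal{G}}|f|\cdot 1] = E[|f|] = \|f\|_{L^1}.
\]
Item (5). Suppose $f \in L^1(\Omega, \mathcal{F}, P)$ and $h \in \mathcal{G}_b$. Let $f_n \in L^2(\Omega, \mathcal{F}, P)$ be a sequence of functions such that $f_n \to f$ in $L^1(\Omega, \mathcal{F}, P)$. Then
\[
E(E_{\mathcal{G}}f\cdot h) = E(\lim_{n\to\infty} E_{\mathcal{G}}f_n\cdot h) = \lim_{n\to\infty} E(E_{\mathcal{G}}f_n\cdot h)
= \lim_{n\to\infty} E(f_n\cdot h) = E(f\cdot h). \tag{23.25}
\]
This equation uniquely determines $E_{\mathcal{G}}$, for if $F \in L^1(\Omega, \mathcal{G}, P)$ also satisfies $E(F\cdot h) = E(f\cdot h)$ for all $h \in \mathcal{G}_b$, then taking $h = \overline{\operatorname{sgn}(F - E_{\mathcal{G}}f)}$ in Eq. (23.25) gives
\[
0 = E((F - E_{\mathcal{G}}f)h) = E(|F - E_{\mathcal{G}}f|).
\]
This shows $F = E_{\mathcal{G}}f$, $P$-a.e. Item (6) is now an easy consequence of this characterization, since if $h \in \mathcal{G}_b$,
\[
E[(gE_{\mathcal{G}}f)h] = E[E_{\mathcal{G}}f\cdot hg] = E[f\cdot hg] = E[gf\cdot h] = E[E_{\mathcal{G}}(gf)\cdot h].
\]
Thus $E_{\mathcal{G}}(gf) = g\cdot E_{\mathcal{G}}f$, $P$-a.e.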
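Proposition 23.18. If $\mathcal{G}_0 \subset \mathcal{G}_1 \subset \mathcal{F}$, then
\[
E_{\mathcal{G}_0}E_{\mathcal{G}_1} = E_{\mathcal{G}_1}E_{\mathcal{G}_0} = E_{\mathcal{G}_0}. \tag{23.26}
\]
Proof. Equation (23.26) holds on $L^2(\Omega, \mathcal{F}, P)$ by the basic properties of orthogonal projections. It then holds on $L^1(\Omega, \mathcal{F}, P)$ by continuity and the density of $L^2(\Omega, \mathcal{F}, P)$ in $L^1(\Omega, \mathcal{F}, P)$.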
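On a finite probability space a sub-sigma-algebra is generated by a partition, and orthogonal projection onto the functions that are constant on each block is block-wise $P$-weighted averaging. The following sketch uses that model to illustrate Eq. (23.26) and item 5. of Theorem 23.17; the eight-point sample space, the label arrays `fine` and `coarse`, and the helper `cond_exp` are assumptions made for this example.

```python
import numpy as np

def cond_exp(f, prob, labels):
    """Conditional expectation given the partition {labels == c}:
    on each block, f is replaced by its P-weighted average over that block."""
    out = np.empty(len(f), dtype=float)
    for c in np.unique(labels):
        block = labels == c
        out[block] = np.dot(f[block], prob[block]) / prob[block].sum()
    return out

rng = np.random.default_rng(1)
prob = rng.random(8)
prob /= prob.sum()                              # a probability vector on 8 points
f = rng.standard_normal(8)

fine = np.array([0, 0, 1, 1, 2, 2, 3, 3])       # partition generating G_1
coarse = np.array([0, 0, 0, 0, 1, 1, 1, 1])     # coarser partition generating G_0

e0 = cond_exp(f, prob, coarse)
e0_e1 = cond_exp(cond_exp(f, prob, fine), prob, coarse)
e1_e0 = cond_exp(cond_exp(f, prob, coarse), prob, fine)
h = np.array([2.0, 2.0, -1.0, -1.0, 0.5, 0.5, 3.0, 3.0])   # a G_1-measurable function
```

In this model `e0`, `e0_e1` and `e1_e0` agree, and $E(E_{\mathcal{G}_1}f\cdot h) = E(f\cdot h)$ for the $\mathcal{G}_1$-measurable `h`.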
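Example 23.19. Suppose that $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are two $\sigma$-finite measure spaces. Let $\Omega = X\times Y$, $\mathcal{F} = \mathcal{M}\otimes\mathcal{N}$ and $P(dx, dy) = \rho(x,y)\mu(dx)\nu(dy)$ where $\rho \in L^1(\Omega, \mathcal{F}, \mu\otimes\nu)$ is a positive function such that $\int_{X\times Y} \rho\,d(\mu\otimes\nu) = 1$. Let $\pi_X : \Omega \to X$ be the projection map, $\pi_X(x,y) = x$, and
\[
\mathcal{G} := \sigma(\pi_X) = \pi_X^{-1}(\mathcal{M}) = \{A\times Y : A \in \mathcal{M}\}.
\]
Then $f : \Omega \to \mathbb{R}$ is $\mathcal{G}$-measurable iff $f = F\circ\pi_X$ for some function $F : X \to \mathbb{R}$ which is $\mathcal{M}$-measurable, see Lemma 18.66. For $f \in L^1(\Omega, \mathcal{F}, P)$, we will now show $E_{\mathcal{G}}f = F\circ\pi_X$ where
\[
F(x) = \frac{1}{\bar\rho(x)} 1_{(0,\infty)}(\bar\rho(x))\cdot\int_Y f(x,y)\rho(x,y)\nu(dy),
\]
$\bar\rho(x) := \int_Y \rho(x,y)\nu(dy)$. (By convention, $\int_Y f(x,y)\rho(x,y)\nu(dy) := 0$ if $\int_Y |f(x,y)|\rho(x,y)\nu(dy) = \infty$.)
By Tonelli's theorem, the set
\[
E := \{x \in X : \bar\rho(x) = \infty\} \cup \left\{x \in X : \int_Y |f(x,y)|\rho(x,y)\nu(dy) = \infty\right\}
\]
is a $\mu$-null set. Since
\[
E[|F\circ\pi_X|] = \int_X d\mu(x)\int_Y d\nu(y)|F(x)|\rho(x,y) = \int_X d\mu(x)|F(x)|\bar\rho(x)
\]
\[
= \int_X d\mu(x)\left|\int_Y \nu(dy) f(x,y)\rho(x,y)\right| \le \int_X d\mu(x)\int_Y \nu(dy)|f(x,y)|\rho(x,y) < \infty,
\]
$F\circ\pi_X \in L^1(\Omega, \mathcal{G}, P)$. Let $h = H\circ\pi_X$ be a bounded $\mathcal{G}$-measurable function, then
\[
E[F\circ\pi_X\cdot h] = \int_X d\mu(x)\int_Y d\nu(y) F(x)H(x)\rho(x,y) = \int_X d\mu(x) F(x)H(x)\bar\rho(x)
\]
\[
= \int_X d\mu(x) H(x)\int_Y \nu(dy) f(x,y)\rho(x,y) = E[hf]
\]
and hence $E_{\mathcal{G}}f = F\circ\pi_X$ as claimed.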
This example shows that conditional expectation is a generalization of
the notion of performing integration over a partial subset of the variables
in the integrand. Whereas to compute the expectation, one should integrate
over all of the variables. See also Exercise 23.25 to gain more intuition about
conditional expectations.
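Theorem 23.20 (Jensen's inequality). Let $(\Omega, \mathcal{F}, P)$ be a probability space and $\varphi : \mathbb{R} \to \mathbb{R}$ be a convex function. Assume $f \in L^1(\Omega, \mathcal{F}, P; \mathbb{R})$ is a function such that (for simplicity) $\varphi(f) \in L^1(\Omega, \mathcal{F}, P; \mathbb{R})$, then $\varphi(E_{\mathcal{G}}f) \le E_{\mathcal{G}}[\varphi(f)]$, $P$-a.e.
Proof. Let us first assume that $\varphi$ is $C^1$ and $f$ is bounded. In this case
\[
\varphi(x) - \varphi(x_0) \ge \varphi'(x_0)(x - x_0) \text{ for all } x_0, x \in \mathbb{R}. \tag{23.27}
\]
Taking $x_0 = E_{\mathcal{G}}f$ and $x = f$ in this inequality implies
\[
\varphi(f) - \varphi(E_{\mathcal{G}}f) \ge \varphi'(E_{\mathcal{G}}f)(f - E_{\mathcal{G}}f)
\]
and then applying $E_{\mathcal{G}}$ to this inequality gives
\[
E_{\mathcal{G}}[\varphi(f)] - \varphi(E_{\mathcal{G}}f) = E_{\mathcal{G}}[\varphi(f) - \varphi(E_{\mathcal{G}}f)] \ge \varphi'(E_{\mathcal{G}}f)(E_{\mathcal{G}}f - E_{\mathcal{G}}f) = 0.
\]
The same proof works for general $\varphi$, one need only use Proposition 21.8 to replace Eq. (23.27) by
\[
\varphi(x) - \varphi(x_0) \ge \varphi_-'(x_0)(x - x_0) \text{ for all } x_0, x \in \mathbb{R}
\]
where $\varphi_-'(x_0)$ is the left hand derivative of $\varphi$ at $x_0$. If $f$ is not bounded, apply what we have just proved to $f^M = f 1_{|f|\le M}$, to find
\[
E_{\mathcal{G}}\left[\varphi(f^M)\right] \ge \varphi(E_{\mathcal{G}}f^M). \tag{23.28}
\]
Since $E_{\mathcal{G}} : L^1(\Omega, \mathcal{F}, P; \mathbb{R}) \to L^1(\Omega, \mathcal{F}, P; \mathbb{R})$ is a bounded operator and $f^M \to f$ and $\varphi(f^M) \to \varphi(f)$ in $L^1(\Omega, \mathcal{F}, P; \mathbb{R})$ as $M \to \infty$, there exists $\{M_k\}_{k=1}^\infty$ such that $M_k \uparrow \infty$ and $f^{M_k} \to f$ and $\varphi(f^{M_k}) \to \varphi(f)$, $P$-a.e. So passing to the limit in Eq. (23.28) shows $E_{\mathcal{G}}[\varphi(f)] \ge \varphi(E_{\mathcal{G}}f)$, $P$-a.e.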
23.6 Exercises
Exercise 23.9. Let $(X, \mathcal{M}, \mu)$ be a measure space and $H := L^2(X, \mathcal{M}, \mu)$. Given $f \in L^\infty(\mu)$ let $M_f : H \to H$ be the multiplication operator defined by $M_f g = fg$. Show $M_f^2 = M_f$ iff there exists $A \in \mathcal{M}$ such that $f = 1_A$ a.e.
Exercise 23.10 (Haar Basis). In this problem, let $L^2$ denote $L^2([0,1], m)$
with the standard inner product,
$$\psi(x) = 1_{[0,1/2)}(x) - 1_{[1/2,1)}(x)$$
and for $k, j \in \mathbb{N}_0 := \mathbb{N}\cup\{0\}$ with $0 \le j < 2^k$ let
$$\psi_{kj}(x) = 2^{k/2}\psi(2^k x - j) = 2^{k/2}\left(1_{2^{-k}[j,j+1/2)}(x) - 1_{2^{-k}[j+1/2,j+1)}(x)\right).$$
The following pictures show the graphs of $\psi_{0,0}$, $\psi_{1,0}$, $\psi_{1,1}$, $\psi_{2,0}$, $\psi_{2,1}$, $\psi_{2,2}$
and $\psi_{2,3}$ respectively.
[Figure: plots of $\psi_{0,0}$; $\psi_{1,0}$ and $\psi_{1,1}$; $\psi_{2,0}$, $\psi_{2,1}$, $\psi_{2,2}$ and $\psi_{2,3}$.]
1. Let $M_0 = \mathrm{span}(\{1\})$ and for $n \in \mathbb{N}$ let
$$M_n := \mathrm{span}\left(\{1\}\cup\{\psi_{kj} : 0 \le k < n \text{ and } 0 \le j < 2^k\}\right),$$
where $1$ denotes the constant function 1. Show
$$M_n = \mathrm{span}\left(\{1_{[j2^{-n},(j+1)2^{-n})} : 0 \le j < 2^n\}\right).$$
2. Show $\Psi := \{1\}\cup\{\psi_{kj} : 0 \le k \text{ and } 0 \le j < 2^k\}$ is an orthonormal
set. Hint: show $\psi_{k+1,j} \in M_k^\perp$ for all $0 \le j < 2^{k+1}$ and show
$\{\psi_{kj} : 0 \le j < 2^k\}$ is an orthonormal set for fixed $k$.
3. Show $\cup_{n=1}^\infty M_n$ is a dense subspace of $L^2$ and therefore $\Psi$ is an orthonormal
basis for $L^2$. Hint: see Theorem 22.15.
4. For $f \in L^2$, let
$$H_n f := \langle f|1\rangle 1 + \sum_{k=0}^{n-1}\sum_{j=0}^{2^k-1}\langle f|\psi_{kj}\rangle\psi_{kj}.$$
Show (compare with Exercise 23.25)
$$H_n f = \sum_{j=0}^{2^n-1}\left(2^n\int_{j2^{-n}}^{(j+1)2^{-n}} f(x)\,dx\right)1_{[j2^{-n},(j+1)2^{-n})}$$
and use this to show $\|f - H_n f\|_\infty \to 0$ as $n \to \infty$ for all $f \in C([0,1])$.
Hint: Compute orthogonal projection onto $M_n$ using a judiciously chosen
basis for $M_n$.
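The identity in item 4 can be explored numerically before proving it. The sketch below is only an illustration, under the assumption that $f$ is replaced by a step function on a fine dyadic grid of `N` cells, so that all inner products are exact finite averages; the function names and the test function $x^2$ are hypothetical choices, not part of the exercise.

```python
def psi_kj(k, j, N):
    """Samples of psi_{kj} on the N cells [i/N, (i+1)/N); needs N >= 2**(k+1)."""
    width = N >> k                 # number of grid cells inside 2^{-k}[j, j+1)
    half = width // 2
    amp = 2.0 ** (k / 2)
    vals = [0.0] * N
    start = j * width
    for i in range(start, start + half):
        vals[i] = amp              # left half of the support
    for i in range(start + half, start + width):
        vals[i] = -amp             # right half of the support
    return vals

def inner(f, g):
    """L^2([0,1]) inner product of two real step functions on the same grid."""
    return sum(a * b for a, b in zip(f, g)) / len(f)

def haar_partial_sum(f, n):
    """H_n f = <f|1> 1 + sum over k < n, j < 2^k of <f|psi_kj> psi_kj."""
    N = len(f)
    out = [inner(f, [1.0] * N)] * N
    for k in range(n):
        for j in range(2 ** k):
            psi = psi_kj(k, j, N)
            c = inner(f, psi)
            out = [o + c * s for o, s in zip(out, psi)]
    return out

def dyadic_average(f, n):
    """Replace f by its mean on each interval [j 2^{-n}, (j+1) 2^{-n})."""
    width = len(f) >> n
    out = []
    for j in range(2 ** n):
        cell = f[j * width:(j + 1) * width]
        out.extend([sum(cell) / width] * width)
    return out

N, n = 64, 3
f = [((i + 0.5) / N) ** 2 for i in range(N)]   # samples of x^2 at cell midpoints
max_gap = max(abs(h - a) for h, a in zip(haar_partial_sum(f, n), dyadic_average(f, n)))
```

Up to rounding error `max_gap` vanishes, which is what item 4 predicts: both expressions are the orthogonal projection of $f$ onto $M_n$.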
Exercise 23.11. Let $O(n)$ be the orthogonal group consisting of $n\times n$ real
orthogonal matrices $O$, i.e. $O^{tr}O = I$. For $O \in O(n)$ and $f \in L^2(\mathbb{R}^n)$ let
$U_O f(x) = f(O^{-1}x)$. Show
1. $U_O f$ is well defined, namely if $f = g$ a.e. then $U_O f = U_O g$ a.e.
2. $U_O : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ is unitary and satisfies $U_{O_1}U_{O_2} = U_{O_1O_2}$ for all
$O_1, O_2 \in O(n)$. That is to say the map $O \in O(n) \to U_O \in \mathcal{U}(L^2(\mathbb{R}^n))$, the
unitary operators on $L^2(\mathbb{R}^n)$, is a group homomorphism, i.e. a unitary
representation of $O(n)$.
3. For each $f \in L^2(\mathbb{R}^n)$, the map $O \in O(n) \to U_O f \in L^2(\mathbb{R}^n)$ is continuous.
Take the topology on $O(n)$ to be that inherited from the Euclidean
topology on the vector space of all $n\times n$ matrices. Hint: see the proof of
Proposition 22.24.
Exercise 23.12. Euclidean group representation and its infinitesimal
generators including momentum and angular momentum operators.
Exercise 23.13. Spherical Harmonics.
Exercise 23.14. The gradient and the Laplacian in spherical coordinates.
Exercise 23.15. Legendre polynomials.
23.7 Fourier Series Exercises

Exercise 23.16. Show $\sum_{k=1}^\infty k^{-2} = \pi^2/6$, by taking $f(x) = x$ on $[-\pi,\pi]$ and
computing $\|f\|_2^2$ directly and then in terms of the Fourier coefficients
$\hat f$ of $f$.
Exercise 23.17 (Riemann Lebesgue Lemma for Fourier Series). Show
for $f \in L^1([-\pi,\pi]^d)$ that $\hat f \in c_0(\mathbb{Z}^d)$, i.e. $\hat f : \mathbb{Z}^d \to \mathbb{C}$ and $\lim_{k\to\infty}\hat f(k) = 0$.
Hint: If $f \in H$, this follows from Bessel's inequality. Now use a density
argument.
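Exercise 23.18. Suppose $f \in L^1([-\pi,\pi]^d)$ is a function such that $\hat f \in \ell^1(\mathbb{Z}^d)$
and set
$$g(x) := \sum_{k\in\mathbb{Z}^d}\hat f(k)e^{ik\cdot x} \text{ (pointwise).}$$
1. Show $g \in C_{per}(\mathbb{R}^d)$.
2. Show $g(x) = f(x)$ for $m$-a.e. $x$ in $[-\pi,\pi]^d$. Hint: Show $\hat g(k) = \hat f(k)$ and
then use approximation arguments to show
$$\int_{[-\pi,\pi]^d} f(x)h(x)\,dx = \int_{[-\pi,\pi]^d} g(x)h(x)\,dx \quad \forall\, h \in C([-\pi,\pi]^d)$$
and then refer to Lemma 22.11.
3. Conclude that $f \in L^1([-\pi,\pi]^d)\cap L^\infty([-\pi,\pi]^d)$ and in particular
$f \in L^p([-\pi,\pi]^d)$ for all $p \in [1,\infty]$.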
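Exercise 23.19. Suppose $m \in \mathbb{N}_0$, $\alpha$ is a multi-index such that $|\alpha| \le 2m$ and
$f \in C^{2m}_{per}(\mathbb{R}^d)$ [footnote 2].
1. Using integration by parts, show (using Notation 22.21) that
$$(ik)^\alpha\hat f(k) = \langle\partial^\alpha f|e_k\rangle \text{ for all } k \in \mathbb{Z}^d.$$
Note: This equality implies
$$|\hat f(k)| \le \frac{1}{|k^\alpha|}\|\partial^\alpha f\|_H \le \frac{1}{|k^\alpha|}\|\partial^\alpha f\|_\infty.$$
2. Now let $\Delta f = \sum_{i=1}^d \partial^2 f/\partial x_i^2$. Working as in part 1) show
$$\langle(1-\Delta)^m f|e_k\rangle = (1+|k|^2)^m\hat f(k). \tag{23.29}$$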
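Remark 23.21. Suppose that $m$ is an even integer, $\alpha$ is a multi-index and
$f \in C^{m+|\alpha|}_{per}(\mathbb{R}^d)$, then
$$\left(\sum_{k\in\mathbb{Z}^d}|k^\alpha||\hat f(k)|\right)^2 = \left(\sum_{k\in\mathbb{Z}^d}|\langle\partial^\alpha f|e_k\rangle|(1+|k|^2)^{m/2}(1+|k|^2)^{-m/2}\right)^2$$
$$= \left(\sum_{k\in\mathbb{Z}^d}\left|\langle(1-\Delta)^{m/2}\partial^\alpha f|e_k\rangle\right|(1+|k|^2)^{-m/2}\right)^2$$
$$\le \sum_{k\in\mathbb{Z}^d}\left|\langle(1-\Delta)^{m/2}\partial^\alpha f|e_k\rangle\right|^2\cdot\sum_{k\in\mathbb{Z}^d}(1+|k|^2)^{-m} = C_m\left\|(1-\Delta)^{m/2}\partial^\alpha f\right\|_H^2$$
where $C_m := \sum_{k\in\mathbb{Z}^d}(1+|k|^2)^{-m} < \infty$ iff $m > d/2$. So the smoother $f$ is the
faster $\hat f$ decays at infinity. The next problem is the converse of this assertion
and hence smoothness of $f$ corresponds to decay of $\hat f$ at infinity and vice versa.
2 We view $C_{per}(\mathbb{R})$ as a subspace of $H = L^2([-\pi,\pi])$ by identifying $f \in C_{per}(\mathbb{R})$
with $f|_{[-\pi,\pi]} \in H$.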
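Exercise 23.20 (A Sobolev Imbedding Theorem). Suppose $s \in \mathbb{R}$ and
$\{c_k \in \mathbb{C} : k \in \mathbb{Z}^d\}$ are coefficients such that
$$\sum_{k\in\mathbb{Z}^d}|c_k|^2(1+|k|^2)^s < \infty.$$
Show if $s > \frac{d}{2} + m$, the function $f$ defined by
$$f(x) = \sum_{k\in\mathbb{Z}^d}c_k e^{ik\cdot x}$$
is in $C^m_{per}(\mathbb{R}^d)$. Hint: Work as in the above remark to show
$$\sum_{k\in\mathbb{Z}^d}|c_k||k^\alpha| < \infty \text{ for all } |\alpha| \le m.$$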

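Exercise 23.21 (Poisson Summation Formula). Let $F \in L^1(\mathbb{R}^d)$,
$$E := \left\{x \in \mathbb{R}^d : \sum_{k\in\mathbb{Z}^d}|F(x+2\pi k)| = \infty\right\}$$
and set
$$\hat F(k) := (2\pi)^{-d/2}\int_{\mathbb{R}^d}F(x)e^{-ik\cdot x}\,dx.$$
Further assume $\hat F \in \ell^1(\mathbb{Z}^d)$.
1. Show $m(E) = 0$ and $E + 2\pi k = E$ for all $k \in \mathbb{Z}^d$. Hint: Compute
$\int_{[-\pi,\pi]^d}\sum_{k\in\mathbb{Z}^d}|F(x+2\pi k)|\,dx$.
2. Let
$$f(x) := \begin{cases}\sum_{k\in\mathbb{Z}^d}F(x+2\pi k) & \text{for } x \notin E\\ 0 & \text{if } x \in E.\end{cases}$$
Show $f \in L^1([-\pi,\pi]^d)$ and $\hat f(k) = (2\pi)^{-d/2}\hat F(k)$.
3. Using item 2) and the assumptions on $F$, show
$$f(x) = \sum_{k\in\mathbb{Z}^d}\hat f(k)e^{ik\cdot x} = \sum_{k\in\mathbb{Z}^d}(2\pi)^{-d/2}\hat F(k)e^{ik\cdot x} \text{ for } m\text{-a.e. } x,$$
i.e.
$$\sum_{k\in\mathbb{Z}^d}F(x+2\pi k) = (2\pi)^{-d/2}\sum_{k\in\mathbb{Z}^d}\hat F(k)e^{ik\cdot x} \text{ for } m\text{-a.e. } x \tag{23.30}$$
and from this conclude that $f \in L^1([-\pi,\pi]^d)\cap L^\infty([-\pi,\pi]^d)$.
Hint: see the hint for item 2. of Exercise 23.18.
4. Suppose we now assume that $F \in C(\mathbb{R}^d)$ and $F$ satisfies $|F(x)| \le C(1+|x|)^{-s}$
for some $s > d$ and $C < \infty$. Under these added assumptions on $F$,
show Eq. (23.30) holds for all $x \in \mathbb{R}^d$ and in particular
$$\sum_{k\in\mathbb{Z}^d}F(2\pi k) = (2\pi)^{-d/2}\sum_{k\in\mathbb{Z}^d}\hat F(k).$$
For notational simplicity, in the remaining problems we will assume that
$d = 1$.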
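Exercise 23.22 (Heat Equation 1.). Let $(t,x) \in [0,\infty)\times\mathbb{R} \to u(t,x)$ be a
continuous function such that $u(t,\cdot) \in C_{per}(\mathbb{R})$ for all $t \ge 0$, and $\dot u := u_t$, $u_x$, and
$u_{xx}$ exist and are continuous when $t > 0$. Further assume that $u$ satisfies the
heat equation $\dot u = \frac{1}{2}u_{xx}$. Let $\tilde u(t,k) := \langle u(t,\cdot)|e_k\rangle$ for $k \in \mathbb{Z}$. Show for $t > 0$
and $k \in \mathbb{Z}$ that $\tilde u(t,k)$ is differentiable in $t$ and $\frac{d}{dt}\tilde u(t,k) = -k^2\tilde u(t,k)/2$. Use
this result to show
$$u(t,x) = \sum_{k\in\mathbb{Z}}e^{-\frac{t}{2}k^2}\hat f(k)e^{ikx} \tag{23.31}$$
where $f(x) := u(0,x)$ and as above
$$\hat f(k) = \langle f|e_k\rangle = \int_{-\pi}^{\pi}f(y)e^{-iky}\,\frac{dy}{2\pi} = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(y)e^{-iky}\,dm(y).$$
Notice from Eq. (23.31) that $(t,x) \to u(t,x)$ is $C^\infty$ for $t > 0$.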
Exercise 23.23 (Heat Equation 2.). Let $q_t(x) := \frac{1}{2\pi}\sum_{k\in\mathbb{Z}}e^{-\frac{t}{2}k^2}e^{ikx}$.
Show that Eq. (23.31) may be rewritten as
$$u(t,x) = \int_{-\pi}^{\pi}q_t(x-y)f(y)\,dy$$
and
$$q_t(x) = \sum_{k\in\mathbb{Z}}p_t(x+k2\pi)$$
where $p_t(x) := \frac{1}{\sqrt{2\pi t}}e^{-\frac{1}{2t}x^2}$. Also show $u(t,x)$ may be written as
$$u(t,x) = p_t * f(x) := \int_{\mathbb{R}^d}p_t(x-y)f(y)\,dy.$$
Hint: To show $q_t(x) = \sum_{k\in\mathbb{Z}}p_t(x+k2\pi)$, use the Poisson summation formula
and the Gaussian integration identity,
$$\hat p_t(\omega) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}p_t(x)e^{-i\omega x}\,dx = \frac{1}{\sqrt{2\pi}}e^{-\frac{t}{2}\omega^2}. \tag{23.32}$$
Equation (23.32) will be discussed in Example 34.4 below.
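Before proving the identity $q_t(x) = \sum_{k\in\mathbb{Z}}p_t(x+k2\pi)$, it can be reassuring to compare both sides numerically. The following is a rough sketch only: the truncation level `K` and the sample point `(t, x)` are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def p(t, x):
    """Gaussian heat kernel p_t(x) = exp(-x^2 / (2t)) / sqrt(2 pi t)."""
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

def q_fourier(t, x, K=60):
    """Truncated Fourier form (1/2pi) * sum_{|k|<=K} e^{-t k^2 / 2} e^{ikx}.

    The k and -k terms combine to 2 cos(kx), so the sum is real.
    """
    tail = sum(math.exp(-t * k * k / 2) * math.cos(k * x) for k in range(1, K + 1))
    return (1 + 2 * tail) / (2 * math.pi)

def q_periodized(t, x, K=60):
    """Truncated periodization sum_{|k|<=K} p_t(x + 2 pi k)."""
    return sum(p(t, x + 2 * math.pi * k) for k in range(-K, K + 1))

t, x = 0.5, 1.3
gap = abs(q_fourier(t, x) - q_periodized(t, x))
```

For these values both series converge extremely fast, so `gap` is at the level of floating point rounding. The Fourier form converges quickly for large $t$ and the periodized form for small $t$, which is one practical reason to have both representations.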
Exercise 23.24 (Wave Equation). Let $u \in C^2(\mathbb{R}\times\mathbb{R})$ be such that $u(t,\cdot) \in C_{per}(\mathbb{R})$
for all $t \in \mathbb{R}$. Further assume that $u$ solves the wave equation, $u_{tt} = u_{xx}$.
Let $f(x) := u(0,x)$ and $g(x) = \dot u(0,x)$. Show $\tilde u(t,k) := \langle u(t,\cdot), e_k\rangle$ for
$k \in \mathbb{Z}$ is twice continuously differentiable in $t$ and $\frac{d^2}{dt^2}\tilde u(t,k) = -k^2\tilde u(t,k)$. Use
this result to show
$$u(t,x) = \sum_{k\in\mathbb{Z}}\left(\hat f(k)\cos(kt) + \hat g(k)\frac{\sin kt}{k}\right)e^{ikx} \tag{23.33}$$
with the sum converging absolutely. Also show that $u(t,x)$ may be written as
$$u(t,x) = \frac{1}{2}[f(x+t) + f(x-t)] + \frac{1}{2}\int_{-t}^{t}g(x+\tau)\,d\tau. \tag{23.34}$$
Hint: To show Eq. (23.33) implies (23.34) use
$$\cos kt = \frac{e^{ikt}+e^{-ikt}}{2}, \quad \sin kt = \frac{e^{ikt}-e^{-ikt}}{2i}, \quad\text{and}\quad \frac{e^{ik(x+t)}-e^{ik(x-t)}}{ik} = \int_{-t}^{t}e^{ik(x+\tau)}\,d\tau.$$
23.8 Conditional Expectation Exercises
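Exercise 23.25. Suppose $(\Omega,\mathcal{F},P)$ is a probability space and $\mathcal{A} := \{A_i\}_{i=1}^\infty \subset \mathcal{F}$
is a partition of $\Omega$. (Recall this means $\Omega = \coprod_{i=1}^\infty A_i$.) Let $\mathcal{G}$ be the
$\sigma$-algebra generated by $\mathcal{A}$. Show:
1. $B \in \mathcal{G}$ iff $B = \cup_{i\in\Lambda}A_i$ for some $\Lambda \subset \mathbb{N}$.
2. $g : \Omega \to \mathbb{R}$ is $\mathcal{G}$-measurable iff $g = \sum_{i=1}^\infty\lambda_i 1_{A_i}$ for some $\lambda_i \in \mathbb{R}$.
3. For $f \in L^1(\Omega,\mathcal{F},P)$, let $E(f|A_i) := E[1_{A_i}f]/P(A_i)$ if $P(A_i) \ne 0$ and
$E(f|A_i) = 0$ otherwise. Show
$$E_{\mathcal{G}}f = \sum_{i=1}^\infty E(f|A_i)1_{A_i}.$$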
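The formula in item 3 can be tried out on a small finite example. The sketch below assumes a five-point sample space with hypothetical weights `p`, a hypothetical function `f`, and a partition `blocks` (one block has probability zero on purpose); exact rational arithmetic is used so that the defining property $E[1_B f] = E[1_B E_{\mathcal{G}}f]$ can be checked with equality on each block $B$.

```python
from fractions import Fraction as Fr

def cond_exp_partition(f, p, blocks):
    """E_G f = sum_i E(f|A_i) 1_{A_i}, with E(f|A_i) := 0 when P(A_i) = 0."""
    out = [Fr(0)] * len(p)
    for block in blocks:
        mass = sum(p[w] for w in block)
        value = sum(f[w] * p[w] for w in block) / mass if mass != 0 else Fr(0)
        for w in block:
            out[w] = value
    return out

def expect_on(h, p, block):
    """E[1_B h] for the set B = block."""
    return sum(h[w] * p[w] for w in block)

# Hypothetical data: outcome 3 carries no mass.
p = [Fr(1, 10), Fr(2, 10), Fr(3, 10), Fr(0), Fr(4, 10)]
f = [Fr(3), Fr(-1), Fr(2), Fr(7), Fr(5)]
blocks = [[0, 1], [2], [3], [4]]

g = cond_exp_partition(f, p, blocks)
defining_property = all(expect_on(f, p, b) == expect_on(g, p, b) for b in blocks)
```

Here `g` is constant on each block, which matches item 2, and checking the defining property on the blocks suffices because every set in $\mathcal{G}$ is a union of blocks by item 1.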
24 Complex Measures, Radon-Nikodym Theorem and the Dual of $L^p$
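Definition 24.1. A signed measure $\nu$ on a measurable space $(X,\mathcal{M})$ is a
function $\nu : \mathcal{M} \to \bar{\mathbb{R}}$ such that
1. Either $\nu(\mathcal{M}) := \{\nu(A) : A \in \mathcal{M}\} \subset (-\infty,\infty]$ or $\nu(\mathcal{M}) \subset [-\infty,\infty)$.
2. $\nu$ is countably additive, this is to say if $E = \coprod_{j=1}^\infty E_j$ with $E_j \in \mathcal{M}$, then
$$\nu(E) = \sum_{j=1}^\infty\nu(E_j).$$
If $\nu(E) \in \mathbb{R}$ then the series $\sum_{j=1}^\infty\nu(E_j)$ is absolutely convergent since it is
independent of rearrangements.
3. $\nu(\emptyset) = 0$.
If there exists $X_n \in \mathcal{M}$ such that $|\nu(X_n)| < \infty$ and $X = \cup_{n=1}^\infty X_n$, then
$\nu$ is said to be $\sigma$-finite and if $\nu(\mathcal{M}) \subset \mathbb{R}$ then $\nu$ is said to be a finite signed
measure. Similarly, a countably additive set function $\nu : \mathcal{M} \to \mathbb{C}$ such that
$\nu(\emptyset) = 0$ is called a complex measure.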
Example 24.2. Suppose that $\mu_+$ and $\mu_-$ are two positive measures on $\mathcal{M}$ such
that either $\mu_+(X) < \infty$ or $\mu_-(X) < \infty$, then $\nu = \mu_+ - \mu_-$ is a signed measure.
If both $\mu_+(X)$ and $\mu_-(X)$ are finite then $\nu$ is a finite signed measure and may
also be considered to be a complex measure.
Example 24.3. Suppose that $g : X \to \bar{\mathbb{R}}$ is measurable and either $\int_X g^+\,d\mu$ or
$\int_X g^-\,d\mu < \infty$, then
$$\nu(A) = \int_A g\,d\mu \quad \forall\, A \in \mathcal{M} \tag{24.1}$$
defines a signed measure. This is actually a special case of the last example
with $\mu_\pm(A) := \int_A g^\pm\,d\mu$. Notice that the measures $\mu_\pm$ in this example have
the property that they are concentrated on disjoint sets, namely $\mu_+$ "lives"
on $\{g > 0\}$ and $\mu_-$ "lives" on the set $\{g < 0\}$.
Example 24.4. Suppose that $\mu$ is a positive measure on $(X,\mathcal{M})$ and $g \in L^1(\mu)$,
then $\nu$ given as in Eq. (24.1) is a complex measure on $(X,\mathcal{M})$. Also if
$\{\mu^r_\pm, \mu^i_\pm\}$ is any collection of four positive finite measures on $(X,\mathcal{M})$, then
$$\nu := \mu^r_+ - \mu^r_- + i\left(\mu^i_+ - \mu^i_-\right) \tag{24.2}$$
is a complex measure.
If $\nu$ is given as in Eq. (24.1), then $\nu$ may be written as in Eq. (24.2) with
$d\mu^r_\pm = (\mathrm{Re}\,g)^\pm\,d\mu$ and $d\mu^i_\pm = (\mathrm{Im}\,g)^\pm\,d\mu$.
24.1 The Radon-Nikodym Theorem
Definition 24.5. Let $\nu$ be a complex or signed measure on $(X,\mathcal{M})$. A set
$E \in \mathcal{M}$ is a null set or precisely a $\nu$-null set if $\nu(A) = 0$ for all $A \in \mathcal{M}$ such
that $A \subset E$, i.e. $\nu|_{\mathcal{M}_E} = 0$. Recall that $\mathcal{M}_E := \{A\cap E : A \in \mathcal{M}\} = i_E^{-1}(\mathcal{M})$
is the "trace of $\mathcal{M}$ on $E$".
We will eventually show that every complex and $\sigma$-finite signed measure
$\nu$ may be described as in Eq. (24.1). The next theorem is the first result in
this direction.
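Theorem 24.6 (A Baby Radon-Nikodym Theorem). Suppose $(X,\mathcal{M})$
is a measurable space, $\lambda$ is a positive finite measure on $\mathcal{M}$ and $\nu$ is a complex
measure on $\mathcal{M}$ such that $|\nu(A)| \le \lambda(A)$ for all $A \in \mathcal{M}$. Then $d\nu = \rho\,d\lambda$ where
$|\rho| \le 1$. Moreover if $\nu$ is a positive measure, then $0 \le \rho \le 1$.
Proof. For a simple function, $f \in \mathcal{S}(X,\mathcal{M})$, let $\nu(f) := \sum_{a\in\mathbb{C}}a\,\nu(f = a)$.
Then
$$|\nu(f)| \le \sum_{a\in\mathbb{C}}|a||\nu(f = a)| \le \sum_{a\in\mathbb{C}}|a|\lambda(f = a) = \int_X|f|\,d\lambda.$$
So, by the B.L.T. Theorem 10.4, $\nu$ extends to a continuous linear functional
on $L^1(\lambda)$ satisfying the bounds
$$|\nu(f)| \le \int_X|f|\,d\lambda \le \sqrt{\lambda(X)}\,\|f\|_{L^2(\lambda)} \text{ for all } f \in L^1(\lambda).$$
The Riesz representation Theorem 8.15 then implies there exists a unique
$\rho \in L^2(\lambda)$ such that
$$\nu(f) = \int_X f\rho\,d\lambda \text{ for all } f \in L^2(\lambda).$$
Taking $A \in \mathcal{M}$ and $f = \overline{\mathrm{sgn}(\rho)}1_A$ in this equation shows
$$\int_A|\rho|\,d\lambda = \nu(\overline{\mathrm{sgn}(\rho)}1_A) \le \lambda(A) = \int_A 1\,d\lambda$$
from which it follows that $|\rho| \le 1$, $\lambda$-a.e. If $\nu$ is a positive measure, then
for real $f$, $0 = \mathrm{Im}[\nu(f)] = \int_X\mathrm{Im}\,\rho\,f\,d\lambda$ and taking $f = \mathrm{Im}\,\rho$ shows
$0 = \int_X[\mathrm{Im}\,\rho]^2\,d\lambda$, i.e. $\mathrm{Im}(\rho(x)) = 0$ for $\lambda$-a.e. $x$ and we have shown $\rho$ is real a.e.
Similarly,
$$0 \le \nu(\mathrm{Re}\,\rho < 0) = \int_{\{\mathrm{Re}\,\rho<0\}}\rho\,d\lambda \le 0,$$
shows $\rho \ge 0$ a.e.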
Definition 24.7. Let $\mu$ and $\nu$ be two signed or complex measures on $(X,\mathcal{M})$.
Then:
1. $\mu$ and $\nu$ are mutually singular (written as $\mu \perp \nu$) if there exists $A \in \mathcal{M}$
such that $A$ is a $\nu$-null set and $A^c$ is a $\mu$-null set.
2. The measure $\nu$ is absolutely continuous relative to $\mu$ (written as
$\nu \ll \mu$) provided $\nu(A) = 0$ whenever $A$ is a $\mu$-null set, i.e. all $\mu$-null
sets are $\nu$-null sets as well.
As an example, suppose that $\mu$ is a positive measure and $\rho \in L^1(\mu)$.
Then the measure, $\nu := \rho\mu$ is absolutely continuous relative to $\mu$. Indeed, if
$\mu(A) = 0$ then
$$\nu(A) = \int_A\rho\,d\mu = 0$$
as well.
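Lemma 24.8. If $\mu_1$, $\mu_2$ and $\nu$ are signed measures on $(X,\mathcal{M})$ such that $\mu_1 \perp \nu$
and $\mu_2 \perp \nu$ and $\mu_1 + \mu_2$ is well defined, then $(\mu_1 + \mu_2) \perp \nu$. If $\{\mu_i\}_{i=1}^\infty$ is a
sequence of positive measures such that $\mu_i \perp \nu$ for all $i$ then $\mu = \sum_{i=1}^\infty\mu_i \perp \nu$
as well.
Proof. In both cases, choose $A_i \in \mathcal{M}$ such that $A_i$ is $\nu$-null and $A_i^c$ is
$\mu_i$-null for all $i$. Then by Lemma 24.16, $A := \cup_i A_i$ is still a $\nu$-null set. Since
$$A^c = \cap_i A_i^c \subset A_m^c \text{ for all } m$$
we see that $A^c$ is a $\mu_i$-null set for all $i$ and is therefore a null set for
$\mu = \sum_{i=1}^\infty\mu_i$. This shows that $\mu \perp \nu$.
Throughout the remainder of this section $\mu$ will always be a positive
measure on $(X,\mathcal{M})$.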
Definition 24.9 (Lebesgue Decomposition). Suppose that $\nu$ is a signed
(complex) measure and $\mu$ is a positive measure on $(X,\mathcal{M})$. Two signed
(complex) measures $\nu_a$ and $\nu_s$ form a Lebesgue decomposition of $\nu$ relative to $\mu$
if
1. If $\nu(A) = \infty$ ($\nu(A) = -\infty$) for some $A \in \mathcal{M}$ then $\nu_a(A) \ne -\infty$
($\nu_a(A) \ne +\infty$) and $\nu_s(A) \ne -\infty$ ($\nu_s(A) \ne +\infty$).
2. $\nu = \nu_a + \nu_s$ which is well defined by assumption 1.
3. $\nu_a \ll \mu$ and $\nu_s \perp \mu$.
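Lemma 24.10. Let $\nu$ be a signed (complex) measure and $\mu$ be a positive
measure on $(X,\mathcal{M})$. If there exists a Lebesgue decomposition, $\nu = \nu_s + \nu_a$, of the
measure $\nu$ relative to $\mu$ then it is unique. Moreover:
1. If $\nu$ is positive then $\nu_s$ and $\nu_a$ are positive.
2. If $\nu$ is a $\sigma$-finite measure then so are $\nu_s$ and $\nu_a$.
Proof. Since $\nu_s \perp \mu$, there exists $A \in \mathcal{M}$ such that $\mu(A) = 0$ and $A^c$ is
$\nu_s$-null and because $\nu_a \ll \mu$, $A$ is also a null set for $\nu_a$. So for $C \in \mathcal{M}$,
$\nu_a(C\cap A) = 0$ and $\nu_s(C\cap A^c) = 0$ from which it follows that
$$\nu(C) = \nu(C\cap A) + \nu(C\cap A^c) = \nu_s(C\cap A) + \nu_a(C\cap A^c)$$
and hence,
$$\nu_s(C) = \nu_s(C\cap A) = \nu(C\cap A) \text{ and } \nu_a(C) = \nu_a(C\cap A^c) = \nu(C\cap A^c). \tag{24.3}$$
Item 1. is now obvious from Eq. (24.3).
For Item 2., if $\nu$ is a $\sigma$-finite measure then there exists $X_n \in \mathcal{M}$ such that
$X = \cup_{n=1}^\infty X_n$ and $|\nu(X_n)| < \infty$ for all $n$. Since $\nu(X_n) = \nu_a(X_n) + \nu_s(X_n)$, we
must have $\nu_a(X_n) \in \mathbb{R}$ and $\nu_s(X_n) \in \mathbb{R}$ showing $\nu_a$ and $\nu_s$ are $\sigma$-finite as
well.
For the uniqueness assertion, if we have another decomposition $\nu = \tilde\nu_a + \tilde\nu_s$
with $\tilde\nu_s \perp \mu$ and $\tilde\nu_a \ll \mu$ we may choose $\tilde A \in \mathcal{M}$ such that $\mu(\tilde A) = 0$ and
$\tilde A^c$ is $\tilde\nu_s$-null. Then $B = A\cup\tilde A$ is still a $\mu$-null set and $B^c = A^c\cap\tilde A^c$ is a null
set for both $\nu_s$ and $\tilde\nu_s$. Therefore by the same arguments which proved Eq.
(24.3),
$$\nu_s(C) = \nu(C\cap B) = \tilde\nu_s(C) \text{ and } \nu_a(C) = \nu(C\cap B^c) = \tilde\nu_a(C) \text{ for all } C \in \mathcal{M}.$$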
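Lemma 24.11. Suppose $\mu$ is a positive measure on $(X,\mathcal{M})$ and $f, g : X \to \bar{\mathbb{R}}$
are extended integrable functions such that
$$\int_A f\,d\mu = \int_A g\,d\mu \text{ for all } A \in \mathcal{M}, \tag{24.4}$$
$\int_X f^-\,d\mu < \infty$, $\int_X g^-\,d\mu < \infty$, and the measures $|f|\,d\mu$ and $|g|\,d\mu$ are
$\sigma$-finite. Then $f(x) = g(x)$ for $\mu$-a.e. $x$.
Proof. By assumption there exists $X_n \in \mathcal{M}$ such that $X_n \uparrow X$ and
$\int_{X_n}|f|\,d\mu < \infty$ and $\int_{X_n}|g|\,d\mu < \infty$ for all $n$. Replacing $A$ by $A\cap X_n$ in
Eq. (24.4) implies
$$\int_A 1_{X_n}f\,d\mu = \int_{A\cap X_n}f\,d\mu = \int_{A\cap X_n}g\,d\mu = \int_A 1_{X_n}g\,d\mu$$
for all $A \in \mathcal{M}$. Since $1_{X_n}f$ and $1_{X_n}g$ are in $L^1(\mu)$ for all $n$, this equation
implies $1_{X_n}f = 1_{X_n}g$, $\mu$-a.e. Letting $n \to \infty$ then shows that $f = g$, $\mu$-a.e.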
Remark 24.12. Suppose that $f$ and $g$ are two positive measurable functions on
$(X,\mathcal{M},\mu)$ such that Eq. (24.4) holds. It is not in general true that $f = g$,
$\mu$-a.e. A trivial counterexample is to take $\mathcal{M} = 2^X$, $\mu(A) = \infty$ for all non-empty
$A \in \mathcal{M}$, $f = 1_X$ and $g = 2\cdot 1_X$. Then Eq. (24.4) holds yet $f \ne g$.
Theorem 24.13 (Radon Nikodym Theorem for Positive Measures).
Suppose that $\mu$ and $\nu$ are $\sigma$-finite positive measures on $(X,\mathcal{M})$. Then $\nu$ has
a unique Lebesgue decomposition $\nu = \nu_a + \nu_s$ relative to $\mu$ and there exists
a unique (modulo sets of $\mu$-measure 0) function $\rho : X \to [0,\infty)$ such that
$d\nu_a = \rho\,d\mu$. Moreover, $\nu_s = 0$ iff $\nu \ll \mu$.
Proof. The uniqueness assertions follow directly from Lemmas 24.10 and
24.11.
Existence. (Von Neumann's Proof.) First suppose that $\mu$ and $\nu$ are finite
measures and let $\lambda = \mu + \nu$. By Theorem 24.6, $d\nu = h\,d\lambda$ with $0 \le h \le 1$ and
this implies, for all non-negative measurable functions $f$, that
$$\nu(f) = \lambda(fh) = \mu(fh) + \nu(fh) \tag{24.5}$$
or equivalently
$$\nu(f(1-h)) = \mu(fh). \tag{24.6}$$
Taking $f = 1_{\{h=1\}}$ in Eq. (24.6) shows that
$$\mu(h = 1) = \nu(1_{\{h=1\}}(1-h)) = 0,$$
i.e. $0 \le h(x) < 1$ for $\mu$-a.e. $x$. Let
$$\rho := 1_{\{h<1\}}\frac{h}{1-h}$$
and then take $f = g1_{\{h<1\}}(1-h)^{-1}$ with $g \ge 0$ in Eq. (24.6) to learn
$$\nu(g1_{\{h<1\}}) = \mu(g1_{\{h<1\}}(1-h)^{-1}h) = \mu(\rho g).$$
Hence if we define
$$\nu_a := 1_{\{h<1\}}\nu \text{ and } \nu_s := 1_{\{h=1\}}\nu,$$
we then have $\nu_s \perp \mu$ (since $\nu_s$ "lives" on $\{h = 1\}$ while $\mu(h = 1) = 0$) and
$\nu_a = \rho\mu$ and in particular $\nu_a \ll \mu$. Hence $\nu = \nu_a + \nu_s$ is the desired Lebesgue
decomposition of $\nu$ [footnote 1].
If we further assume that $\nu \ll \mu$, then $\mu(h = 1) = 0$ implies $\nu(h = 1) = 0$
and hence that $\nu_s = 0$ and we conclude that $\nu = \nu_a = \rho\mu$.
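For the $\sigma$-finite case, write $X = \coprod_{n=1}^\infty X_n$ where $X_n \in \mathcal{M}$ are chosen
so that $\mu(X_n) < \infty$ and $\nu(X_n) < \infty$ for all $n$. Let $d\mu_n = 1_{X_n}d\mu$ and
$d\nu_n = 1_{X_n}d\nu$. Then by what we have just proved there exists $\rho_n \in L^1(X,\mu_n) \subset L^1(X,\mu)$
and measure $\nu_n^s$ such that $d\nu_n = \rho_n\,d\mu_n + d\nu_n^s$ with $\nu_n^s \perp \mu_n$.
Since $\mu_n$ and $\nu_n^s$ "live" on $X_n$ (see Eq. (24.3)) there exists $A_n \in \mathcal{M}_{X_n}$ such that
$$\mu(A_n) = \mu_n(A_n) = 0 \text{ and } \nu_n^s(X\setminus A_n) = \nu_n^s(X_n\setminus A_n) = 0.$$
This shows that $\nu_n^s \perp \mu$ for all $n$ and so by Lemma 24.8, $\nu_s := \sum_{n=1}^\infty\nu_n^s$ is
singular relative to $\mu$. Since
$$\nu = \sum_{n=1}^\infty\nu_n = \sum_{n=1}^\infty(\rho_n\mu_n + \nu_n^s) = \sum_{n=1}^\infty(\rho_n 1_{X_n}\mu + \nu_n^s) = \rho\mu + \nu_s,$$
where $\rho := \sum_{n=1}^\infty 1_{X_n}\rho_n$, it follows that $\nu = \nu_a + \nu_s$ with $\nu_a = \rho\mu$ is the
Lebesgue decomposition of $\nu$ relative to $\mu$.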
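Von Neumann's construction becomes completely explicit when $X$ is a finite set and measures are given by point weights. The sketch below is only an illustration under that assumption: the weights `mu` and `nu`, the point names, and the function name `lebesgue_decomposition` are hypothetical. It computes $h = d\nu/d\lambda$ with $\lambda = \mu + \nu$, then $\rho = 1_{\{h<1\}}h/(1-h)$, $\nu_a = 1_{\{h<1\}}\nu$ and $\nu_s = 1_{\{h=1\}}\nu$.

```python
from fractions import Fraction as Fr

def lebesgue_decomposition(mu, nu):
    """Finite-set model of the proof: mu, nu map points to nonnegative weights."""
    rho, nu_a, nu_s = {}, {}, {}
    for x in mu:
        lam = mu[x] + nu[x]                       # lambda = mu + nu
        h = nu[x] / lam if lam != 0 else Fr(0)    # h = d nu / d lambda
        if h < 1:
            rho[x] = h / (1 - h)                  # density of nu_a w.r.t. mu
            nu_a[x], nu_s[x] = nu[x], Fr(0)
        else:
            rho[x] = Fr(0)                        # {h = 1} is mu-null
            nu_a[x], nu_s[x] = Fr(0), nu[x]
    return rho, nu_a, nu_s

# Hypothetical weights: mu vanishes at "c", so nu has a singular part there.
mu = {"a": Fr(1), "b": Fr(2), "c": Fr(0)}
nu = {"a": Fr(3), "b": Fr(0), "c": Fr(5)}
rho, nu_a, nu_s = lebesgue_decomposition(mu, nu)

absolutely_continuous_part_ok = all(nu_a[x] == rho[x] * mu[x] for x in mu)
singular_part_ok = all(mu[x] == 0 for x in mu if nu_s[x] != 0)
```

In this example the singular part is the point mass of weight 5 at `"c"`, and $\rho$ equals $\nu(\{x\})/\mu(\{x\})$ wherever $\mu$ charges the point, as one would expect from the ratio $h/(1-h)$.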
Theorem 24.14 (Dual of $L^p$ spaces). Let $(X,\mathcal{M},\mu)$ be a $\sigma$-finite measure
space and suppose that $p, q \in [1,\infty]$ are conjugate exponents. Then for
$p \in [1,\infty)$, the map $g \in L^q \to \phi_g \in (L^p)^*$ (where $\phi_g = \langle\cdot, g\rangle_\mu$ was defined
in Eq. 21.23) is an isometric isomorphism of Banach spaces. We summarize
this by writing $(L^p)^* = L^q$ for all $1 \le p < \infty$. (The result is in general false
for $p = \infty$ as can be seen from Theorem 25.13 and Lemma 25.14 below.)
1 Here is the motivation for this construction. Suppose that $d\nu = d\nu_s + \rho\,d\mu$ is
the Radon-Nikodym decomposition and $X = A\coprod B$ such that $\nu_s(B) = 0$ and
$\mu(A) = 0$. Then we find
$$\nu_s(f) + \mu(\rho f) = \nu(f) = \lambda(fg) = \nu(fg) + \mu(fg).$$
Letting $f \to 1_A f$ then implies that
$$\nu_s(1_A f) = \nu(1_A fg)$$
which shows that $g = 1$ $\nu$-a.e. on $A$. Also letting $f \to 1_B f$ implies that
$$\mu(\rho 1_B f(1-g)) = \nu(1_B f(1-g)) = \mu(1_B fg) = \mu(fg)$$
which shows that
$$\rho(1-g) = \rho 1_B(1-g) = g \quad \mu\text{-a.e.}$$
This shows that $\rho = \frac{g}{1-g}$ $\mu$-a.e.
Proof. The only result of this theorem which is not covered in Proposition
21.26 is the surjectivity of the map $g \in L^q \to \phi_g \in (L^p)^*$. When $p = 2$,
this surjectivity is a direct consequence of the Riesz Theorem 8.15.
Case 1. We will begin the proof under the extra assumption that $\mu(X) < \infty$,
in which case bounded functions are in $L^p(\mu)$ for all $p$. So let $\phi \in (L^p)^*$.
We need to find $g \in L^q(\mu)$ such that $\phi = \phi_g$. When $p \in [1,2]$, $L^2(\mu) \subset L^p(\mu)$ so
that we may restrict $\phi$ to $L^2(\mu)$ and again the result follows fairly easily from
the Riesz Theorem, see Exercise 24.3 below. To handle general $p \in [1,\infty)$,
define $\nu(A) := \phi(1_A)$. If $A = \coprod_{n=1}^\infty A_n$ with $A_n \in \mathcal{M}$, then
$$\left\|1_A - \sum_{n=1}^N 1_{A_n}\right\|_{L^p} = \left\|1_{\cup_{n=N+1}^\infty A_n}\right\|_{L^p} = \left[\mu\left(\cup_{n=N+1}^\infty A_n\right)\right]^{1/p} \to 0 \text{ as } N \to \infty.$$
Therefore
$$\nu(A) = \phi(1_A) = \sum_{n=1}^\infty\phi(1_{A_n}) = \sum_{n=1}^\infty\nu(A_n)$$
showing $\nu$ is a complex measure [footnote 2]. For $A \in \mathcal{M}$, let $|\nu|(A)$ be the "total
variation" of $A$ defined by
$$|\nu|(A) := \sup\{|\phi(f1_A)| : |f| \le 1\} \tag{24.7}$$
and notice that
$$|\nu(A)| \le |\nu|(A) \le \|\phi\|_{(L^p)^*}\,\mu(A)^{1/p} \text{ for all } A \in \mathcal{M}. \tag{24.8}$$
You are asked to show in Exercise 24.4 that $|\nu|$ is a measure on $(X,\mathcal{M})$. (This
can also be deduced from Lemma 24.29 and Proposition 24.33 below.) By Eq.
(24.8) $|\nu| \ll \mu$, by Theorem 24.6 $d\nu = h\,d|\nu|$ for some $|h| \le 1$ and by Theorem
24.13 $d|\nu| = \rho\,d\mu$ for some $\rho \in L^1(\mu)$. Hence, letting $g = \rho h \in L^1(\mu)$, $d\nu = g\,d\mu$
or equivalently
$$\phi(1_A) = \int_X g1_A\,d\mu \quad \forall\, A \in \mathcal{M}. \tag{24.9}$$
By linearity this equation implies
$$\phi(f) = \int_X gf\,d\mu \tag{24.10}$$
for all simple functions $f$ on $X$. Replacing $f$ by $1_{\{|g|\le M\}}f$ in Eq. (24.10) shows
$$\phi(f1_{\{|g|\le M\}}) = \int_X 1_{\{|g|\le M\}}gf\,d\mu$$
holds for all simple functions $f$ and then by continuity for all $f \in L^p(\mu)$. By
the converse to Holder's inequality (Proposition 21.26) we learn that
$$\left\|1_{\{|g|\le M\}}g\right\|_q = \sup_{\|f\|_p=1}\left|\phi(f1_{\{|g|\le M\}})\right| \le \sup_{\|f\|_p=1}\|\phi\|_{(L^p)^*}\left\|f1_{\{|g|\le M\}}\right\|_p \le \|\phi\|_{(L^p)^*}.$$
Using the monotone convergence theorem we may let $M \to \infty$ in the previous
equation to learn $\|g\|_q \le \|\phi\|_{(L^p)^*}$. With this result, Eq. (24.10) extends by
continuity to hold for all $f \in L^p(\mu)$ and hence we have shown that $\phi = \phi_g$.
2 It is at this point that the proof breaks down when $p = \infty$.
Case 2. Now suppose that $\mu$ is $\sigma$-finite and $X_n \in \mathcal{M}$ are sets such that
$\mu(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. We will identify $f \in L^p(X_n,\mu)$ with
$f1_{X_n} \in L^p(X,\mu)$ and this way we may consider $L^p(X_n,\mu)$ as a subspace of
$L^p(X,\mu)$ for all $n$ and $p \in [1,\infty]$. By Case 1. there exists $g_n \in L^q(X_n,\mu)$ such
that
$$\phi(f) = \int_{X_n}g_n f\,d\mu \text{ for all } f \in L^p(X_n,\mu)$$
and
$$\|g_n\|_q = \sup\left\{|\phi(f)| : f \in L^p(X_n,\mu) \text{ and } \|f\|_{L^p(X_n,\mu)} = 1\right\} \le \|\phi\|_{[L^p(\mu)]^*}.$$
It is easy to see that $g_n = g_m$ a.e. on $X_n\cap X_m$ for all $m, n$ so that
$g := \lim_{n\to\infty}g_n$ exists $\mu$-a.e. By the above inequality and Fatou's lemma,
$\|g\|_q \le \|\phi\|_{[L^p(\mu)]^*} < \infty$ and since $\phi(f) = \int_{X_n}gf\,d\mu$ for all $f \in L^p(X_n,\mu)$ and $n$,
and $\cup_{n=1}^\infty L^p(X_n,\mu)$ is dense in $L^p(X,\mu)$, it follows by continuity that
$\phi(f) = \int_X gf\,d\mu$ for all $f \in L^p(X,\mu)$, i.e. $\phi = \phi_g$.
24.2 The Structure of Signed Measures
Definition 24.15. Let $\nu$ be a signed measure on $(X,\mathcal{M})$ and $E \in \mathcal{M}$, then
1. $E$ is positive if for all $A \in \mathcal{M}$ such that $A \subset E$, $\nu(A) \ge 0$, i.e. $\nu|_{\mathcal{M}_E} \ge 0$.
2. $E$ is negative if for all $A \in \mathcal{M}$ such that $A \subset E$, $\nu(A) \le 0$, i.e. $\nu|_{\mathcal{M}_E} \le 0$.
Lemma 24.16. Suppose that $\nu$ is a signed measure on $(X,\mathcal{M})$. Then
1. Any subset of a positive set is positive.
2. The countable union of positive (negative or null) sets is still positive
(negative or null).
3. Let us now further assume that $\nu(\mathcal{M}) \subset [-\infty,\infty)$ and $E \in \mathcal{M}$ is a set
such that $\nu(E) \in (0,\infty)$. Then there exists a positive set $P \subset E$ such that
$\nu(P) \ge \nu(E)$.
Proof. The first assertion is obvious. If $P_j \in \mathcal{M}$ are positive sets, let
$P = \cup_{n=1}^\infty P_n$. By replacing $P_n$ by the positive set $P_n\setminus\left(\cup_{j=1}^{n-1}P_j\right)$ we may assume
that the $\{P_n\}_{n=1}^\infty$ are pairwise disjoint so that $P = \coprod_{n=1}^\infty P_n$. Now if $E \subset P$ and
$E \in \mathcal{M}$, $E = \coprod_{n=1}^\infty(E\cap P_n)$ so $\nu(E) = \sum_{n=1}^\infty\nu(E\cap P_n) \ge 0$, which shows that
$P$ is positive. The proof for the negative and the null case is analogous.
The idea for proving the third assertion is to keep removing "big" sets of
negative measure from $E$. The set remaining from this procedure will be $P$.
We now proceed to the formal proof. For all $A \in \mathcal{M}$ let
$$n(A) = 1\wedge\sup\{-\nu(B) : B \subset A\}.$$
Since $\nu(\emptyset) = 0$, $n(A) \ge 0$ and $n(A) = 0$ iff $A$ is positive. Choose $A_0 \subset E$
such that $-\nu(A_0) \ge \frac{1}{2}n(E)$ and set $E_1 = E\setminus A_0$, then choose $A_1 \subset E_1$ such
that $-\nu(A_1) \ge \frac{1}{2}n(E_1)$ and set $E_2 = E\setminus(A_0\cup A_1)$. Continue this procedure
inductively, namely if $A_0,\dots,A_{k-1}$ have been chosen let $E_k = E\setminus\left(\coprod_{i=0}^{k-1}A_i\right)$
and choose $A_k \subset E_k$ such that $-\nu(A_k) \ge \frac{1}{2}n(E_k)$. Let
$P := E\setminus\coprod_{k=0}^\infty A_k = \cap_{k=0}^\infty E_k$, then $E = P\cup\coprod_{k=0}^\infty A_k$ and hence
$$(0,\infty) \ni \nu(E) = \nu(P) + \sum_{k=0}^\infty\nu(A_k) = \nu(P) - \sum_{k=0}^\infty(-\nu(A_k)) \le \nu(P). \tag{24.11}$$
From Eq. (24.11) we learn that $\sum_{k=0}^\infty -\nu(A_k) < \infty$ and in particular that
$\lim_{k\to\infty}(-\nu(A_k)) = 0$. Since $0 \le \frac{1}{2}n(E_k) \le -\nu(A_k)$, this also implies
$\lim_{k\to\infty}n(E_k) = 0$. If $A \in \mathcal{M}$ with $A \subset P$, then $A \subset E_k$ for all $k$ and
so, for $k$ large so that $n(E_k) < 1$, we find $-\nu(A) \le n(E_k)$. Letting $k \to \infty$ in
this estimate shows $-\nu(A) \le 0$ or equivalently $\nu(A) \ge 0$. Since $A \subset P$ was
arbitrary, we conclude that $P$ is a positive set such that $\nu(P) \ge \nu(E)$.
24.2.1 Hahn Decomposition Theorem
Definition 24.17. Suppose that $\nu$ is a signed measure on $(X,\mathcal{M})$. A Hahn
decomposition for $\nu$ is a partition $\{P, N = P^c\}$ of $X$ such that $P$ is positive
and $N$ is negative.
Theorem 24.18 (Hahn Decomposition Theorem). Every signed measure
space $(X,\mathcal{M},\nu)$ has a Hahn decomposition, $\{P, N\}$. Moreover, if $\{\tilde P,\tilde N\}$
is another Hahn decomposition, then $P\Delta\tilde P = N\Delta\tilde N$ is a null set, so the
decomposition is unique modulo null sets.
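Proof. Without loss of generality we may assume that $\nu(\mathcal{M}) \subset [-\infty,\infty)$.
If not just consider $-\nu$ instead.
Uniqueness. For any $A \in \mathcal{M}$, we have
$$\nu(A) = \nu(A\cap P) + \nu(A\cap N) \le \nu(A\cap P) \le \nu(P).$$
In particular, taking $A = P\cup\tilde P$, we learn
$$\nu(P) \le \nu(P\cup\tilde P) \le \nu(P)$$
or equivalently that $\nu(P) = \nu(P\cup\tilde P)$. Of course by symmetry we also have
$$\nu(P) = \nu(P\cup\tilde P) = \nu(\tilde P) =: s.$$
Since also,
$$s = \nu(P\cup\tilde P) = \nu(P) + \nu(\tilde P) - \nu(P\cap\tilde P) = 2s - \nu(P\cap\tilde P),$$
we also have $\nu(P\cap\tilde P) = s$. Finally using $P\cup\tilde P = [P\cap\tilde P]\coprod(\tilde P\Delta P)$, we
conclude that
$$s = \nu(P\cup\tilde P) = \nu(P\cap\tilde P) + \nu(\tilde P\Delta P) = s + \nu(\tilde P\Delta P)$$
which shows $\nu(\tilde P\Delta P) = 0$. Thus $N\Delta\tilde N = \tilde P\Delta P$ is a positive set with zero
measure, i.e. $N\Delta\tilde N = \tilde P\Delta P$ is a null set and this proves the uniqueness
assertion.
Existence. Let
$$s := \sup\{\nu(A) : A \in \mathcal{M}\}$$
which is non-negative since $\nu(\emptyset) = 0$. If $s = 0$, we are done since $P = \emptyset$ and
$N = X$ is the desired decomposition. So assume $s > 0$ and choose $A_n \in \mathcal{M}$
such that $\nu(A_n) > 0$ and $\lim_{n\to\infty}\nu(A_n) = s$. By Lemma 24.16 there exist
positive sets $P_n \subset A_n$ such that $\nu(P_n) \ge \nu(A_n)$. Then $s \ge \nu(P_n) \ge \nu(A_n) \to s$
as $n \to \infty$ implies that $s = \lim_{n\to\infty}\nu(P_n)$. The set $P := \cup_{n=1}^\infty P_n$ is a positive
set being the union of positive sets and since $P_n \subset P$ for all $n$,
$$\nu(P) \ge \nu(P_n) \to s \text{ as } n \to \infty.$$
This shows that $\nu(P) \ge s$ and hence by the definition of $s$, $s = \nu(P) < \infty$.
I now claim that $N = P^c$ is a negative set and therefore, $\{P, N\}$ is the
desired Hahn decomposition. If $N$ were not negative, we could find $E \subset N = P^c$
such that $\nu(E) > 0$. We then would have
$$\nu(P\cup E) = \nu(P) + \nu(E) = s + \nu(E) > s$$
which contradicts the definition of $s$.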
24.2.2 Jordan Decomposition
Theorem 24.19 (Jordan Decomposition). If $\nu$ is a signed measure on
$(X,\mathcal{M})$, there exist unique positive measures $\nu_\pm$ on $(X,\mathcal{M})$ such that
$\nu_+ \perp \nu_-$ and $\nu = \nu_+ - \nu_-$. This decomposition is called the Jordan decomposition
of $\nu$.
Proof. Let $\{P, N\}$ be a Hahn decomposition for $\nu$ and define
$$\nu_+(E) := \nu(P\cap E) \text{ and } \nu_-(E) := -\nu(N\cap E) \quad \forall\, E \in \mathcal{M}.$$
Then it is easily verified that $\nu = \nu_+ - \nu_-$ is a Jordan decomposition of $\nu$.
The reader is asked to prove the uniqueness of this decomposition in Exercise
24.10.
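On a finite set, where a signed measure is determined by signed point weights, the Hahn and Jordan decompositions can be written down directly. The following sketch assumes such a finite model; the weights in `nu`, the point names, and the helper names are hypothetical and serve only to illustrate the definitions of $P$, $N$, $\nu_+$ and $\nu_-$ used in the proof.

```python
def jordan_decomposition(weights):
    """weights maps each point x to nu({x}); returns (P, N, nu_plus, nu_minus)."""
    P = {x for x, w in weights.items() if w > 0}   # a positive set
    N = set(weights) - P                           # its complement, a negative set
    nu_plus = {x: (w if x in P else 0) for x, w in weights.items()}
    nu_minus = {x: (-w if x in N else 0) for x, w in weights.items()}
    return P, N, nu_plus, nu_minus

def measure(m, E):
    """Value of the (signed) measure with point weights m on the set E."""
    return sum(m[x] for x in E)

# Hypothetical signed weights on a five-point set.
nu = {"a": 3, "b": -2, "c": 0, "d": 5, "e": -4}
P, N, nu_plus, nu_minus = jordan_decomposition(nu)

E = {"a", "b", "e"}
total_variation_E = measure(nu_plus, E) + measure(nu_minus, E)
```

Points of weight zero may be placed in either $P$ or $N$, which reflects the fact that a Hahn decomposition is unique only modulo null sets, while $\nu_\pm$ do not depend on that choice.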
Definition 24.20. The measure, $|\nu| := \nu_+ + \nu_-$ is called the total variation
of $\nu$. A signed measure is called $\sigma$-finite provided that $\nu_\pm$ (or equivalently
$|\nu| := \nu_+ + \nu_-$) are $\sigma$-finite measures.

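Lemma 24.21. Let $\nu$ be a signed measure on $(X,\mathcal{M})$ and $A \in \mathcal{M}$. If $\nu(A) \in \mathbb{R}$
then $\nu(B) \in \mathbb{R}$ for all $B \subset A$. Moreover, $\nu(A) \in \mathbb{R}$ iff $|\nu|(A) < \infty$. In
particular, $\nu$ is $\sigma$-finite iff $|\nu|$ is $\sigma$-finite. Furthermore if $P, N \in \mathcal{M}$ is a
Hahn decomposition for $\nu$ and $g = 1_P - 1_N$, then $d\nu = g\,d|\nu|$, i.e.
$$\nu(A) = \int_A g\,d|\nu| \text{ for all } A \in \mathcal{M}.$$
Proof. Suppose that $B \subset A$ and $|\nu(B)| = \infty$, then since $\nu(A) = \nu(B) + \nu(A\setminus B)$
we must have $|\nu(A)| = \infty$. Let $P, N \in \mathcal{M}$ be a Hahn decomposition
for $\nu$, then
$$\nu(A) = \nu(A\cap P) + \nu(A\cap N) = |\nu(A\cap P)| - |\nu(A\cap N)| \text{ and}$$
$$|\nu|(A) = \nu(A\cap P) - \nu(A\cap N) = |\nu(A\cap P)| + |\nu(A\cap N)|. \tag{24.12}$$
Therefore $\nu(A) \in \mathbb{R}$ iff $\nu(A\cap P) \in \mathbb{R}$ and $\nu(A\cap N) \in \mathbb{R}$ iff $|\nu|(A) < \infty$.
Finally,
$$\nu(A) = \nu(A\cap P) + \nu(A\cap N) = |\nu|(A\cap P) - |\nu|(A\cap N) = \int_A(1_P - 1_N)\,d|\nu|$$
which shows that $d\nu = g\,d|\nu|$.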
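Lemma 24.22. Suppose that $\mu$ is a positive measure on $(X,\mathcal{M})$ and $g : X \to \bar{\mathbb{R}}$
is an extended $\mu$-integrable function. If $\nu$ is the signed measure $d\nu = g\,d\mu$,
then $d\nu_\pm = g_\pm\,d\mu$ and $d|\nu| = |g|\,d\mu$, where $g_\pm := \max(\pm g, 0)$. We also have
$$|\nu|(A) = \sup\left\{\left|\int_A f\,d\nu\right| : |f| \le 1\right\} \text{ for all } A \in \mathcal{M}. \tag{24.13}$$
Proof. The pair, $P = \{g > 0\}$ and $N = \{g \le 0\} = P^c$ is a Hahn decomposition
for $\nu$. Therefore
$$\nu_+(A) = \nu(A\cap P) = \int_{A\cap P}g\,d\mu = \int_A 1_{\{g>0\}}g\,d\mu = \int_A g_+\,d\mu,$$
$$\nu_-(A) = -\nu(A\cap N) = -\int_{A\cap N}g\,d\mu = -\int_A 1_{\{g\le 0\}}g\,d\mu = \int_A g_-\,d\mu$$
and
$$|\nu|(A) = \nu_+(A) + \nu_-(A) = \int_A g_+\,d\mu + \int_A g_-\,d\mu = \int_A(g_+ + g_-)\,d\mu = \int_A|g|\,d\mu.$$
If $A \in \mathcal{M}$ and $|f| \le 1$, then
$$\left|\int_A f\,d\nu\right| = \left|\int_A f\,d\nu_+ - \int_A f\,d\nu_-\right| \le \left|\int_A f\,d\nu_+\right| + \left|\int_A f\,d\nu_-\right| \le \int_A|f|\,d\nu_+ + \int_A|f|\,d\nu_- = \int_A|f|\,d|\nu| \le |\nu|(A).$$
For the reverse inequality, let $f := 1_P - 1_N$, then
$$\int_A f\,d\nu = \nu(A\cap P) - \nu(A\cap N) = \nu_+(A) + \nu_-(A) = |\nu|(A).$$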
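Definition 24.23. Let $\nu$ be a signed measure on $(X,\mathcal{M})$, let
$$L^1(\nu) := L^1(\nu_+)\cap L^1(\nu_-) = L^1(|\nu|)$$
and for $f \in L^1(\nu)$ we define
$$\int_X f\,d\nu := \int_X f\,d\nu_+ - \int_X f\,d\nu_-.$$
Lemma 24.24. Let $\mu$ be a positive measure on $(X,\mathcal{M})$, $g$ be an extended
integrable function on $(X,\mathcal{M},\mu)$ and $d\nu = g\,d\mu$. Then $L^1(\nu) = L^1(|g|\,d\mu)$ and
for $f \in L^1(\nu)$,
$$\int_X f\,d\nu = \int_X fg\,d\mu.$$
Proof. By Lemma 24.22, $d\nu_+ = g_+\,d\mu$, $d\nu_- = g_-\,d\mu$, and $d|\nu| = |g|\,d\mu$ so
that $L^1(\nu) = L^1(|\nu|) = L^1(|g|\,d\mu)$ and for $f \in L^1(\nu)$,
$$\int_X f\,d\nu = \int_X f\,d\nu_+ - \int_X f\,d\nu_- = \int_X fg_+\,d\mu - \int_X fg_-\,d\mu = \int_X f(g_+ - g_-)\,d\mu = \int_X fg\,d\mu.$$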
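Lemma 24.25. Suppose $\nu$ is a signed measure, $\mu$ is a positive measure and
$\nu = \nu_a + \nu_s$ is a Lebesgue decomposition (see Definition 24.9) of $\nu$ relative to
$\mu$, then $|\nu| = |\nu_a| + |\nu_s|$.
Proof. Let $A \in \mathcal{M}$ be chosen so that $A$ is a null set for $\nu_a$ and $A^c$ is
a null set for $\nu_s$. Let $A = P'\coprod N'$ be a Hahn decomposition of $\nu_s|_{\mathcal{M}_A}$ and
$A^c = \tilde P\coprod\tilde N$ be a Hahn decomposition of $\nu_a|_{\mathcal{M}_{A^c}}$. Let $P = P'\cup\tilde P$ and
$N = N'\cup\tilde N$. Since for $C \in \mathcal{M}$,
$$\nu(C\cap P) = \nu(C\cap P') + \nu(C\cap\tilde P) = \nu_s(C\cap P') + \nu_a(C\cap\tilde P) \ge 0$$
and
$$\nu(C\cap N) = \nu(C\cap N') + \nu(C\cap\tilde N) = \nu_s(C\cap N') + \nu_a(C\cap\tilde N) \le 0$$
we see that $\{P, N\}$ is a Hahn decomposition for $\nu$. It is also easy to see that
$\{P, N\}$ is a Hahn decomposition for both $\nu_s$ and $\nu_a$ as well. Therefore,
$$|\nu|(C) = \nu(C\cap P) - \nu(C\cap N) = \nu_s(C\cap P) - \nu_s(C\cap N) + \nu_a(C\cap P) - \nu_a(C\cap N) = |\nu_s|(C) + |\nu_a|(C).$$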
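Lemma 24.26.
1. Let $\nu$ be a signed measure and $\mu$ be a positive measure on $(X,\mathcal{M})$ such
that $\nu \ll \mu$ and $\nu \perp \mu$, then $\nu \equiv 0$.
2. Suppose that $\nu = \sum_{i=1}^\infty\nu_i$ where $\nu_i$ are positive measures on $(X,\mathcal{M})$ such
that $\nu_i \ll \mu$, then $\nu \ll \mu$.
3. Also if $\nu_1$ and $\nu_2$ are two signed measures such that $\nu_i \ll \mu$ for $i = 1, 2$
and $\nu = \nu_1 + \nu_2$ is well defined, then $\nu \ll \mu$.
Proof. 1. Because $\nu \perp \mu$, there exists $A \in \mathcal{M}$ such that $A$ is a $\nu$-null
set and $B = A^c$ is a $\mu$-null set. Since $B$ is $\mu$-null and $\nu \ll \mu$, $B$ is also
$\nu$-null. This shows by Lemma 24.16 that $X = A\cup B$ is also $\nu$-null, i.e. $\nu$ is
the zero measure. The proofs of items 2. and 3. are easy and will be left to the
reader.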
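Theorem 24.27 (Radon Nikodym Theorem for Signed Measures).
Let $\nu$ be a $\sigma$-finite signed measure and $\mu$ be a $\sigma$-finite positive measure on
$(X,\mathcal{M})$. Then $\nu$ has a unique Lebesgue decomposition $\nu = \nu_a + \nu_s$ relative to
$\mu$ and there exists a unique (modulo sets of $\mu$-measure 0) extended integrable
function $\rho : X \to \bar{\mathbb{R}}$ such that $d\nu_a = \rho\,d\mu$. Moreover, $\nu_s = 0$ iff $\nu \ll \mu$, i.e.
$d\nu = \rho\,d\mu$ iff $\nu \ll \mu$.
Proof. Uniqueness. Is a direct consequence of Lemmas 24.10 and 24.11.
Existence. Let $\nu = \nu_+ - \nu_-$ be the Jordan decomposition of $\nu$. Assume,
without loss of generality, that $\nu_+(X) < \infty$, i.e. $\nu(A) < \infty$ for all $A \in \mathcal{M}$. By
the Radon Nikodym Theorem 24.13 for positive measures there exist functions
$f_\pm : X \to [0,\infty)$ and measures $\lambda_\pm$ such that $\nu_\pm = \mu_{f_\pm} + \lambda_\pm$ with $\lambda_\pm \perp \mu$
(here $\mu_f$ denotes the measure $f\,d\mu$). Since
$$\infty > \nu_+(X) = \mu_{f_+}(X) + \lambda_+(X),$$
$f_+ \in L^1(\mu)$ and $\lambda_+(X) < \infty$ so that $f = f_+ - f_-$ is an extended integrable
function, $d\nu_a := f\,d\mu$ and $\nu_s = \lambda_+ - \lambda_-$ are signed measures. This finishes the
existence proof since
$$\nu = \nu_+ - \nu_- = \mu_{f_+} + \lambda_+ - \left(\mu_{f_-} + \lambda_-\right) = \nu_a + \nu_s$$
and $\nu_s = (\lambda_+ - \lambda_-) \perp \mu$ by Lemma 24.8. For the final statement, if $\nu_s = 0$,
then $d\nu = \rho\,d\mu$ and hence $\nu \ll \mu$. Conversely if $\nu \ll \mu$, then $d\nu_s = d\nu - \rho\,d\mu \ll \mu$,
so by Lemma 24.26, $\nu_s = 0$. Alternatively just use the uniqueness of the
Lebesgue decomposition to conclude $\nu_a = \nu$ and $\nu_s = 0$. Or more directly,
choose $B \in \mathcal{M}$ such that $\mu(B^c) = 0$ and $B$ is a $\nu_s$-null set. Since $\nu \ll \mu$, $B^c$
is also a $\nu$-null set so that, for $A \in \mathcal{M}$,
$$\nu(A) = \nu(A\cap B) = \nu_a(A\cap B) + \nu_s(A\cap B) = \nu_a(A\cap B).$$
Notation 24.28 The function $f$ is called the Radon-Nikodym derivative of $\nu$
relative to $\mu$ and we will denote this function by $\frac{d\nu}{d\mu}$.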
24.3 Complex Measures
Suppose that $\nu$ is a complex measure on $(X,\mathcal{M})$, let $\nu_r := \mathrm{Re}\,\nu$, $\nu_i := \mathrm{Im}\,\nu$
and $\mu := |\nu_r| + |\nu_i|$. Then $\mu$ is a finite positive measure on $\mathcal{M}$ such that
$\nu_r \ll \mu$ and $\nu_i \ll \mu$. By the Radon-Nikodym Theorem 24.27, there exist
real functions $h, k \in L^1(\mu)$ such that $d\nu_r = h\,d\mu$ and $d\nu_i = k\,d\mu$. So letting
$g := h + ik \in L^1(\mu)$,
$$d\nu = (h + ik)\,d\mu = g\,d\mu$$
showing every complex measure may be written as in Eq. (24.1).
Lemma 24.29. Suppose that $\nu$ is a complex measure on $(X,\mathcal{M})$, and for
$i = 1, 2$ let $\mu_i$ be a finite positive measure on $(X,\mathcal{M})$ such that $d\nu = g_i\,d\mu_i$
with $g_i \in L^1(\mu_i)$. Then
$$\int_A|g_1|\,d\mu_1 = \int_A|g_2|\,d\mu_2 \text{ for all } A \in \mathcal{M}.$$
In particular, we may define a positive measure $|\nu|$ on $(X,\mathcal{M})$ by
$$|\nu|(A) = \int_A|g_1|\,d\mu_1 \text{ for all } A \in \mathcal{M}.$$
The finite positive measure $|\nu|$ is called the total variation measure of $\nu$.
Proof. Let $\lambda = \mu_1 + \mu_2$ so that $\mu_i \ll \lambda$. Let $\rho_i = d\mu_i/d\lambda \ge 0$ and $h_i = \rho_i g_i$.
Since
$$\nu(A) = \int_A g_i\,d\mu_i = \int_A g_i\rho_i\,d\lambda = \int_A h_i\,d\lambda \text{ for all } A \in \mathcal{M},$$
$h_1 = h_2$, $\lambda$-a.e. Therefore
$$\int_A|g_1|\,d\mu_1 = \int_A|g_1|\rho_1\,d\lambda = \int_A|h_1|\,d\lambda = \int_A|h_2|\,d\lambda = \int_A|g_2|\rho_2\,d\lambda = \int_A|g_2|\,d\mu_2.$$
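Definition 24.30. Given a complex measure $\nu$, let $\nu_r = \mathrm{Re}\,\nu$ and $\nu_i = \mathrm{Im}\,\nu$
so that $\nu_r$ and $\nu_i$ are finite signed measures such that
$$\nu(A) = \nu_r(A) + i\nu_i(A) \text{ for all } A \in \mathcal{M}.$$
Let $L^1(\nu) := L^1(\nu_r)\cap L^1(\nu_i)$ and for $f \in L^1(\nu)$ define
$$\int_X f\,d\nu := \int_X f\,d\nu_r + i\int_X f\,d\nu_i.$$
Example 24.31. Suppose that $\mu$ is a positive measure on $(X,\mathcal{M})$, $g \in L^1(\mu)$
and $\nu(A) = \int_A g\,d\mu$ as in Example 24.4, then $L^1(\nu) = L^1(|g|\,d\mu)$ and for
$f \in L^1(\nu)$
$$\int_X f\,d\nu = \int_X fg\,d\mu. \tag{24.14}$$
To check Eq. (24.14), notice that $d\nu_r = \mathrm{Re}\,g\,d\mu$ and $d\nu_i = \mathrm{Im}\,g\,d\mu$ so that
(using Lemma 24.24)
$$L^1(\nu) = L^1(\mathrm{Re}\,g\,d\mu)\cap L^1(\mathrm{Im}\,g\,d\mu) = L^1(|\mathrm{Re}\,g|\,d\mu)\cap L^1(|\mathrm{Im}\,g|\,d\mu) = L^1(|g|\,d\mu).$$
If $f \in L^1(\nu)$, then
$$\int_X f\,d\nu := \int_X f\,\mathrm{Re}\,g\,d\mu + i\int_X f\,\mathrm{Im}\,g\,d\mu = \int_X fg\,d\mu.$$
Remark 24.32. Suppose that $\nu$ is a complex measure on $(X,\mathcal{M})$ such that
$d\nu = g\,d\mu$ and as above $d|\nu| = |g|\,d\mu$. Letting
$$\rho := \mathrm{sgn}(g) := \begin{cases}\frac{g}{|g|} & \text{if } |g| \ne 0\\ 1 & \text{if } |g| = 0\end{cases}$$
we see that
$$d\nu = g\,d\mu = \rho|g|\,d\mu = \rho\,d|\nu|$$
and $|\rho| = 1$ and $\rho$ is uniquely defined modulo $|\nu|$-null sets. We will denote $\rho$
by $d\nu/d|\nu|$. With this notation, it follows from Example 24.31 that
$L^1(\nu) = L^1(|\nu|)$ and for $f \in L^1(\nu)$,
$$\int_X f\,d\nu = \int_X f\,\frac{d\nu}{d|\nu|}\,d|\nu|.$$
We now give a number of methods for computing the total variation, $|\nu|$, of
a complex or signed measure $\nu$.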
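Proposition 24.33 (Total Variation). Suppose $\mathcal{A} \subset 2^X$ is an algebra,
$\mathcal{M} = \sigma(\mathcal{A})$, $\nu$ is a complex (or a signed measure which is $\sigma$-finite on $\mathcal{A}$) on
$(X,\mathcal{M})$ and for $E \in \mathcal{M}$ let
$$\mu_0(E) = \sup\left\{\sum_1^n|\nu(E_j)| : E_j \in \mathcal{A}_E \text{ such that } E_i\cap E_j = \delta_{ij}E_i,\ n = 1, 2, \dots\right\}$$
$$\mu_1(E) = \sup\left\{\sum_1^n|\nu(E_j)| : E_j \in \mathcal{M}_E \text{ such that } E_i\cap E_j = \delta_{ij}E_i,\ n = 1, 2, \dots\right\}$$
$$\mu_2(E) = \sup\left\{\sum_1^\infty|\nu(E_j)| : E_j \in \mathcal{M}_E \text{ such that } E_i\cap E_j = \delta_{ij}E_i\right\}$$
$$\mu_3(E) = \sup\left\{\left|\int_E f\,d\nu\right| : f \text{ is measurable with } |f| \le 1\right\}$$
$$\mu_4(E) = \sup\left\{\left|\int_E f\,d\nu\right| : f \in \mathcal{S}_f(\mathcal{A},|\nu|) \text{ with } |f| \le 1\right\}.$$
Then $\mu_0 = \mu_1 = \mu_2 = \mu_3 = \mu_4 = |\nu|$.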
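Proof. Let $\rho = d\nu/d|\nu|$ and recall that $|\rho| = 1$, $|\nu|$-a.e.
Step 1. $\mu_4 \le |\nu| = \mu_3$. If $f$ is measurable with $|f| \le 1$ then
$$\left|\int_E f\,d\nu\right| = \left|\int_E f\rho\,d|\nu|\right| \le \int_E|f|\,d|\nu| \le \int_E 1\,d|\nu| = |\nu|(E)$$
from which we conclude that $\mu_4 \le \mu_3 \le |\nu|$. Taking $f = \bar\rho$ above shows
$$\left|\int_E f\,d\nu\right| = \int_E\bar\rho\rho\,d|\nu| = \int_E 1\,d|\nu| = |\nu|(E)$$
which shows that $|\nu| \le \mu_3$ and hence $|\nu| = \mu_3$.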
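Step 2. $\mu_4 \ge |\nu|$. Let $X_m \in \mathcal{A}$ be chosen so that $|\nu|(X_m) < \infty$ and
$X_m \uparrow X$ as $m \to \infty$. By Theorem 22.15 (or Remark 28.7 or Corollary 32.42
below), there exists $\rho_n \in \mathcal{S}_f(\mathcal{A},|\nu|)$ such that $\rho_n \to \rho 1_{X_m}$ in $L^1(|\nu|)$ and each $\rho_n$
may be written in the form
$$\rho_n = \sum_{k=1}^N z_k 1_{A_k} \tag{24.15}$$
where $z_k \in \mathbb{C}$ and $A_k \in \mathcal{A}$ and $A_k\cap A_j = \emptyset$ if $k \ne j$. I claim that we may
assume that $|z_k| \le 1$ in Eq. (24.15) for if $|z_k| > 1$ and $x \in A_k$,
$$|\rho(x) - z_k| \ge \left|\rho(x) - |z_k|^{-1}z_k\right|.$$
This is evident from Figure 24.1 and formally follows from the fact that
$$\frac{d}{dt}\left|\rho(x) - t|z_k|^{-1}z_k\right|^2 = 2\left[t - \mathrm{Re}\left(|z_k|^{-1}z_k\overline{\rho(x)}\right)\right] \ge 0$$
when $t \ge 1$.
Fig. 24.1. Sliding points to the unit circle.
Therefore if we define
$$w_k := \begin{cases}|z_k|^{-1}z_k & \text{if } |z_k| > 1\\ z_k & \text{if } |z_k| \le 1\end{cases}$$
and $\tilde\rho_n = \sum_{k=1}^N w_k 1_{A_k}$ then
$$|\rho(x) - \rho_n(x)| \ge |\rho(x) - \tilde\rho_n(x)|$$
and therefore $\tilde\rho_n \to \rho 1_{X_m}$ in $L^1(|\nu|)$. So we now assume that $\rho_n$ is as in Eq.
(24.15) with $|z_k| \le 1$. Now
$$\left|\int_E\bar\rho_n\,d\nu - \int_E\bar\rho 1_{X_m}\,d\nu\right| = \left|\int_E\left(\bar\rho_n - \bar\rho 1_{X_m}\right)\rho\,d|\nu|\right| \le \int_E\left|\bar\rho_n - \bar\rho 1_{X_m}\right|\,d|\nu| \to 0 \text{ as } n \to \infty$$
and hence
$$\mu_4(E) \ge \left|\int_E\bar\rho 1_{X_m}\,d\nu\right| = |\nu|(E\cap X_m) \text{ for all } m.$$
Letting $m \uparrow \infty$ in this equation shows $\mu_4 \ge |\nu|$ which combined with step 1.
shows $\mu_3 = \mu_4 = |\nu|$.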
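Step 3. $\mu_0 = \mu_1 = \mu_2 = |\nu|$. Clearly $\mu_0 \le \mu_1 \le \mu_2$. Suppose $\{E_j\}_{j=1}^\infty \subset \mathcal{M}_E$ is a collection of pairwise disjoint sets, then
\[ \sum_{j=1}^\infty |\nu(E_j)| = \sum_{j=1}^\infty \Big| \int_{E_j} \rho\,d|\nu| \Big| \le \sum_{j=1}^\infty |\nu|(E_j) = |\nu|(\cup E_j) \le |\nu|(E) \]
which shows that $\mu_2 \le |\nu| = \mu_4$. So it suffices to show $\mu_4 \le \mu_0$. But if $f \in S_f(\mathcal{A}, |\nu|)$ with $|f| \le 1$, then $f$ may be expressed as $f = \sum_{k=1}^N z_k 1_{A_k}$ with $|z_k| \le 1$ and $A_k \cap A_j = \delta_{kj} A_k$. Therefore,
\[ \Big| \int_E f\,d\nu \Big| = \Big| \sum_{k=1}^N z_k \nu(A_k \cap E) \Big| \le \sum_{k=1}^N |z_k|\,|\nu(A_k \cap E)| \le \sum_{k=1}^N |\nu(A_k \cap E)| \le \mu_0(E). \]
Since this equation holds for all $f \in S_f(\mathcal{A}, |\nu|)$ with $|f| \le 1$, $\mu_4 \le \mu_0$ as claimed.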
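As a sanity check on the partition formulas for the total variation, here is a minimal sketch on a finite set, where a complex measure is just a table of complex point masses. The names `set_partitions`, `nu` and `total_variation_sup`, and the particular weights used in any check, are illustrative assumptions and not part of the text. Only partitions of all of E are enumerated, which is enough here because adding a leftover block can only increase the sum.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + partition


def nu(block, weights):
    """Complex measure of a block: the sum of its point masses."""
    return sum(weights[x] for x in block)


def total_variation_sup(E, weights):
    """Brute-force sup over partitions {E_j} of E of sum_j |nu(E_j)|."""
    return max(
        sum(abs(nu(block, weights)) for block in partition)
        for partition in set_partitions(list(E))
    )
```

On a finite set the supremum should be attained by the partition into singletons, so the result is expected to agree with the sum of the moduli of the point masses, up to floating point rounding.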
Theorem 24.34 (Radon Nikodym Theorem for Complex Measures). Let $\nu$ be a complex measure and $\mu$ be a $\sigma$-finite positive measure on $(X,\mathcal{M})$. Then $\nu$ has a unique Lebesgue decomposition $\nu = \nu_a + \nu_s$ relative to $\mu$ and there exists a unique element $\rho \in L^1(\mu)$ such that $d\nu_a = \rho\,d\mu$. Moreover, $\nu_s = 0$ iff $\nu \ll \mu$, i.e. $d\nu = \rho\,d\mu$ iff $\nu \ll \mu$.
Proof. Uniqueness. This is a direct consequence of Lemmas 24.10 and 24.11.
Existence. Let $g : X \to S^1 \subset \mathbb{C}$ be a function such that $d\nu = g\,d|\nu|$. By Theorem 24.13, there exists $h \in L^1(\mu)$ and a positive measure $|\nu|_s$ such that $|\nu|_s \perp \mu$ and $d|\nu| = h\,d\mu + d|\nu|_s$. Hence we have $d\nu = \rho\,d\mu + d\nu_s$ with $\rho := gh \in L^1(\mu)$ and $d\nu_s := g\,d|\nu|_s$. This finishes the proof since, as is easily verified, $\nu_s \perp \mu$.
24.4 Absolute Continuity on an Algebra
The following results will be needed in Section 30.4 below.
Exercise 24.1. Let $\nu = \nu_r + i\nu_i$ be a complex measure on a measurable space, $(X,\mathcal{M})$. Then $|\nu_r| \le |\nu|$, $|\nu_i| \le |\nu|$ and $|\nu| \le |\nu_r| + |\nu_i|$.
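Exercise 24.2. Let $\nu$ be a signed measure on a measurable space, $(X,\mathcal{M})$. If $A \in \mathcal{M}$ is a set such that there exists $M < \infty$ such that $|\nu(B)| \le M$ for all $B \in \mathcal{M}_A = \{C \cap A : C \in \mathcal{M}\}$, then $|\nu|(A) \le 2M$. If $\nu$ is a complex measure with $A \in \mathcal{M}$ and $M < \infty$ as above, then $|\nu|(A) \le 4M$.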
Lemma 24.35. Let $\nu$ be a complex or a signed measure on $(X,\mathcal{M})$. Then $A \in \mathcal{M}$ is a $\nu$-null set iff $|\nu|(A) = 0$. In particular if $\mu$ is a positive measure on $(X,\mathcal{M})$, $\nu \ll \mu$ iff $|\nu| \ll \mu$.
Proof. In all cases we have $|\nu(A)| \le |\nu|(A)$ for all $A \in \mathcal{M}$, which clearly shows that $|\nu|(A) = 0$ implies $A$ is a $\nu$-null set. Conversely if $A$ is a $\nu$-null set, then, by definition, $\nu|_{\mathcal{M}_A} \equiv 0$ so by Proposition 24.33
\[ |\nu|(A) = \sup\Big\{ \sum_1^\infty |\nu(E_j)| : E_j \in \mathcal{M}_A,\ E_i \cap E_j = \delta_{ij}E_i \Big\} = 0, \]
since $E_j \subset A$ implies $\nu(E_j) = 0$.
Alternate Proofs that $A$ is $\nu$-null implies $|\nu|(A) = 0$.
1) Suppose $\nu$ is a signed measure and $\{P, N = P^c\} \subset \mathcal{M}$ is a Hahn decomposition for $\nu$. Then
\[ |\nu|(A) = \nu(A \cap P) - \nu(A \cap N) = 0. \]
Now suppose that $\nu$ is a complex measure. Then $A$ is a null set for both $\nu_r := \operatorname{Re}\nu$ and $\nu_i := \operatorname{Im}\nu$. Therefore $|\nu|(A) \le |\nu_r|(A) + |\nu_i|(A) = 0$.
2) Here is another proof in the complex case. Let $\rho = d\nu/d|\nu|$, then by the assumption of $A$ being $\nu$-null,
\[ 0 = \nu(B) = \int_B \rho\,d|\nu| \text{ for all } B \in \mathcal{M}_A. \]
This shows that $\rho 1_A = 0$, $|\nu|$-a.e. and hence
\[ |\nu|(A) = \int_A |\rho|\,d|\nu| = \int_X 1_A |\rho|\,d|\nu| = 0. \]
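Theorem 24.36 ($\varepsilon$-$\delta$ Definition of Absolute Continuity). Let $\nu$ be a complex measure and $\mu$ be a positive measure on $(X,\mathcal{M})$. Then $\nu \ll \mu$ iff for all $\varepsilon > 0$ there exists a $\delta > 0$ such that $|\nu(A)| < \varepsilon$ whenever $A \in \mathcal{M}$ and $\mu(A) < \delta$.
Proof. ($\Longleftarrow$) If $\mu(A) = 0$ then $|\nu(A)| < \varepsilon$ for all $\varepsilon > 0$, which shows that $\nu(A) = 0$, i.e. $\nu \ll \mu$.
($\Longrightarrow$) Since $\nu \ll \mu$ iff $|\nu| \ll \mu$ and $|\nu(A)| \le |\nu|(A)$ for all $A \in \mathcal{M}$, it suffices to assume $\nu \ge 0$ with $\nu(X) < \infty$. Suppose for the sake of contradiction there exists $\varepsilon > 0$ and $A_n \in \mathcal{M}$ such that $\nu(A_n) \ge \varepsilon > 0$ while $\mu(A_n) \le \frac{1}{2^n}$. Let
\[ A = \{A_n \text{ i.o.}\} = \bigcap_{N=1}^\infty \bigcup_{n \ge N} A_n \]
so that
\[ \mu(A) = \lim_{N \to \infty} \mu\big(\cup_{n \ge N} A_n\big) \le \lim_{N \to \infty} \sum_{n=N}^\infty \mu(A_n) \le \lim_{N \to \infty} 2^{-(N-1)} = 0. \]
On the other hand,
\[ \nu(A) = \lim_{N \to \infty} \nu\big(\cup_{n \ge N} A_n\big) \ge \liminf_{n \to \infty} \nu(A_n) \ge \varepsilon > 0 \]
showing that $\nu$ is not absolutely continuous relative to $\mu$.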
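Corollary 24.37. Let $\mu$ be a positive measure on $(X,\mathcal{M})$ and $f \in L^1(d\mu)$. Then for all $\varepsilon > 0$ there exists $\delta > 0$ such that $\big| \int_A f\,d\mu \big| < \varepsilon$ for all $A \in \mathcal{M}$ such that $\mu(A) < \delta$.
Proof. Apply Theorem 24.36 to the signed measure $\nu(A) = \int_A f\,d\mu$ for all $A \in \mathcal{M}$.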
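To make the corollary concrete, the following sketch takes Lebesgue measure on (0, 1] and the unbounded but integrable function f(x) = 1/(2 sqrt(x)), and tests only intervals A = (a, b] rather than general measurable sets. The choice of f, the restriction to intervals, and the names `nu`, `delta_for` and `check` are all assumptions made for illustration. For intervals, sqrt(b) - sqrt(a) <= sqrt(b - a), so delta = eps^2 is one admissible choice.

```python
import math


def nu(a, b):
    """nu((a, b]) = integral of 1/(2*sqrt(x)) over (a, b], in closed form."""
    return math.sqrt(b) - math.sqrt(a)


def delta_for(eps):
    """A delta that works for intervals, since sqrt(b) - sqrt(a) <= sqrt(b - a)."""
    return eps ** 2


def check(eps, n=200):
    """Test nu(A) < eps on a grid of intervals A = (a, b] of length < delta."""
    delta = delta_for(eps)
    for i in range(n):
        a = i / n
        for frac in (0.25, 0.5, 0.99):
            b = min(1.0, a + frac * delta)  # interval of length < delta
            if nu(a, b) >= eps:
                return False
    return True
```

The worst case is an interval starting at 0, where f blows up; this is exactly why delta has to shrink like eps^2 rather than like eps.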
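Theorem 24.38 (Absolute Continuity on an Algebra). Let $\nu$ be a complex measure and $\mu$ be a positive measure on $(X,\mathcal{M})$. Suppose that $\mathcal{A} \subset \mathcal{M}$ is an algebra such that $\sigma(\mathcal{A}) = \mathcal{M}$ and that $\mu$ is $\sigma$-finite on $\mathcal{A}$. Then $\nu \ll \mu$ iff for all $\varepsilon > 0$ there exists a $\delta > 0$ such that $|\nu(A)| < \varepsilon$ for all $A \in \mathcal{A}$ which satisfy $\mu(A) < \delta$.
Proof. ($\Longrightarrow$) This implication is a consequence of Theorem 24.36.
($\Longleftarrow$) If $|\nu(A)| < \varepsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$, then by Exercise 24.2, $|\nu|(A) \le 4\varepsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$. Because of this argument, we may now replace $\nu$ by $|\nu|$ and hence we may assume that $\nu$ is a positive finite measure.
Let $\varepsilon > 0$ and $\delta > 0$ be such that $\nu(A) < \varepsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$. Suppose that $B \in \mathcal{M}$ with $\mu(B) < \delta$ and $\alpha \in (0, \delta - \mu(B))$. By Corollary 22.18, there exists $A \in \mathcal{A}$ such that
\[ \mu(A \triangle B) + \nu(A \triangle B) = (\mu + \nu)(A \triangle B) < \alpha. \]
In particular it follows that $\mu(A) \le \mu(B) + \mu(A \triangle B) < \delta$ and hence by assumption $\nu(A) < \varepsilon$. Therefore,
\[ \nu(B) \le \nu(A) + \nu(A \triangle B) < \varepsilon + \alpha \]
and letting $\alpha \downarrow 0$ in this inequality shows $\nu(B) \le \varepsilon$.
Alternative Proof. Let $\varepsilon > 0$ and $\delta > 0$ be such that $\nu(A) < \varepsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$. Suppose that $B \in \mathcal{M}$ with $\mu(B) < \delta$. Use the regularity Theorem 28.6 below (or see Theorem 33.9 or Corollary 32.42) to find $A \in \mathcal{A}_\sigma$ such that $B \subset A$ and $\mu(B) \le \mu(A) < \delta$. Write $A = \cup_n A_n$ with $A_n \in \mathcal{A}$. By replacing $A_n$ by $\cup_{j=1}^n A_j$ if necessary we may assume that $A_n$ is increasing in $n$. Then $\mu(A_n) \le \mu(A) < \delta$ for each $n$ and hence by assumption $\nu(A_n) < \varepsilon$. Since $B \subset A = \cup_n A_n$ it follows that $\nu(B) \le \nu(A) = \lim_{n \to \infty} \nu(A_n) \le \varepsilon$. Thus we have shown that $\nu(B) \le \varepsilon$ for all $B \in \mathcal{M}$ such that $\mu(B) < \delta$.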
24.5 Exercises
Exercise 24.3. Prove Theorem 24.14 for p [1, 2] by directly applying the
Riesz theorem to [
L
2
()
.
Exercise 24.4. Show [[ be dened as in Eq. (24.7) is a positive measure.
Here is an outline.
1. Show
[[ (A) +[[ (B) [[ (A B). (24.16)
when A, B are disjoint sets in /.
2. If A =
n=1
A
n
with A
n
/ then
[[ (A)
n=1
[[ (A
n
). (24.17)
3. From Eqs. (24.16) and (24.17) it follows that [[ is nitely additive, and
hence
[[ (A) =
N
n=1
[[ (A
n
) +[[ (
n>N
A
n
)
N
n=1
[[ (A
n
).
Letting N in this inequality shows [[ (A)

n=1
[[ (A
n
) which
combined with Eq. (24.17) shows [[ is countably additive.
Exercise 24.5. Suppose X is a set, / 2
X
is an algebra, and : / C is
a nitely additive measure. For any A /, let
[[ (A) := sup
_
n
i=1
[ (A
i
)[ : A =
n
i=1
A
i
with A
i
/ and n N
_
.
1. Suppose P := A
i
n
i=1
/ is a partition of A / and B
j
m
j=1
/ is
partition of A which renes P (i.e. for each j there exists an i such that
B
j
A
i
), then
n
i=1
[ (A
i
)[
m
j=1
[ (B
j
)[ . (24.18)
2. The total variation,[[ : / [0, ] , of is a nitely additive measure
on /.
Exercise 24.6. Suppose that
n
n=1
are complex measures on a measurable
space, (X, /) .
1. If

n=1
[
n
[ (X) < , then :=
n=1
n
is a complex measure.
2. If there is a nite positive measure, : /[0, ) such that [
n
(A)[
(A) for all A / and (A) := lim
n
n
(A) exists for all A /,
then is also a complex measure.
Exercise 24.7. Suppose
i
,
i
are nite positive measures on measurable
spaces, (X
i
, /
i
), for i = 1, 2. If
i

i
for i = 1, 2 then
1

2

1

2
and in fact
d(
1
2
)
d(
1
2
)
(x
1
, x
2
) =
1
2
(x
1
, x
2
) :=
1
(x
1
)
2
(x
2
)
where
i
:= d
i
/d
i
for i = 1, 2.
Exercise 24.8. Let X = [0, 1] , / := B
[0,1]
, m be Lebesgue measure and
be counting measure on X. Show
1. m yet there is not function such that dm = d.
2. Counting measure has no Lebesgue decomposition relative to m.
Exercise 24.9. Suppose that is a signed or complex measure on (X, /)
and A
n
/ such that either A
n
A or A
n
A and (A
1
) 1, then show
(A) = lim
n
(A
n
).
Exercise 24.10. Let (X, /) be a measurable space, : /[, ) be a
signed measure, and =
+
be a Jordan decomposition of . If :=
with and being positive measures and (X) < , show
+
and
. Us this result to prove the uniqueness of Jordan decompositions

stated in Theorem 24.19.
Exercise 24.11. Let
1
and
2
be two signed measures on (X, /) which are
assumed to be valued in [, ). Show, [
1
+
2
[ [
1
[ + [
2
[ . Hint: use
Exercise 24.10 along with the observation that
1
+
2
= (
+
1
+
+
2
)(
1
+
2
),
where
i
:= (
i
)
.
Exercise 24.12. Folland Exercise 3.7a on p. 88.
Exercise 24.13. Show Theorem 24.36 may fail if is not nite. (For a hint,
see problem 3.10 on p. 92 of Folland.)
Exercise 24.16. If is a complex measure on (X, /) such that [[ (X) =
(X) , then = [[ .
Exercise 24.17. Suppose is a complex or a signed measure on a measurable
space, (X, /) . Show A / is a - null set i [[ (A) = 0. Use this to
conclude that if is a positive measure, then i [[ .
25
Three Fundamental Principles of Banach
Spaces
25.1 The Hahn-Banach Theorem
Our goal here is to show that the continuous dual, $X^*$, of a Banach space, $X$, is always large. This will be the content of the Hahn-Banach Theorem 25.4 below.
Proposition 25.1. Let $X$ be a complex vector space over $\mathbb{C}$ and let $X_{\mathbb{R}}$ denote $X$ thought of as a real vector space. If $f \in X^*$ and $u = \operatorname{Re} f \in X_{\mathbb{R}}^*$ then
\[ f(x) = u(x) - iu(ix). \tag{25.1} \]
Conversely if $u \in X_{\mathbb{R}}^*$ and $f$ is defined by Eq. (25.1), then $f \in X^*$ and $\|u\|_{X_{\mathbb{R}}^*} = \|f\|_{X^*}$. More generally if $p$ is a semi-norm (see Definition 5.1) on $X$, then
\[ |f| \le p \text{ iff } u \le p. \]
Proof. Let $v(x) = \operatorname{Im} f(x)$, then
\[ v(ix) = \operatorname{Im} f(ix) = \operatorname{Im}(if(x)) = \operatorname{Re} f(x) = u(x). \]
Therefore
\[ f(x) = u(x) + iv(x) = u(x) + iu(-ix) = u(x) - iu(ix). \]
Conversely for $u \in X_{\mathbb{R}}^*$ let $f(x) = u(x) - iu(ix)$. Then
\[ f((a + ib)x) = u(ax + ibx) - iu(iax - bx) = au(x) + bu(ix) - i(au(ix) - bu(x)) \]
while
\[ (a + ib)f(x) = au(x) + bu(ix) + i(bu(x) - au(ix)). \]
So $f$ is complex linear. Because $|u(x)| = |\operatorname{Re} f(x)| \le |f(x)|$, it follows that $\|u\| \le \|f\|$. For $x \in X$ choose $\lambda \in S^1 \subset \mathbb{C}$ such that $|f(x)| = \lambda f(x)$ so
\[ |f(x)| = f(\lambda x) = u(\lambda x) \le \|u\|\,\|\lambda x\| = \|u\|\,\|x\|. \]
Since $x \in X$ is arbitrary, this shows that $\|f\| \le \|u\|$ so $\|f\| = \|u\|$. (See footnote 1.) For the last assertion, it is clear that $|f| \le p$ implies that $u \le |u| \le |f| \le p$. Conversely if $u \le p$ and $x \in X$, choose $\lambda \in S^1 \subset \mathbb{C}$ such that $|f(x)| = \lambda f(x)$. Then
\[ |f(x)| = \lambda f(x) = f(\lambda x) = u(\lambda x) \le p(\lambda x) = p(x) \]
holds for all $x \in X$.
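The reconstruction of a complex-linear functional from its real part, as in Eq. (25.1), can be checked numerically. The sketch below works on X = C^2 with a functional given by two arbitrarily chosen coefficients; the coefficients and the names `f`, `u` and `f_from_u` are assumptions for illustration only.

```python
def f(x, c=(2 - 1j, 0.5 + 3j)):
    """A complex-linear functional on C^2: f(x) = c_1 x_1 + c_2 x_2."""
    return sum(ck * xk for ck, xk in zip(c, x))


def u(x):
    """Real part of f, a real-linear functional on C^2 viewed as R^4."""
    return f(x).real


def f_from_u(x):
    """Rebuild f from u alone via f(x) = u(x) - i u(ix)."""
    ix = tuple(1j * xk for xk in x)
    return u(x) - 1j * u(ix)
```

The point of the identity is that u(ix) = Re(i f(x)) = -Im f(x), so the imaginary part of f is already encoded in the real part evaluated at rotated arguments.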
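Definition 25.2 (Minkowski functional). A function $p : X \to \mathbb{R}$ is a Minkowski functional if
1. $p(x + y) \le p(x) + p(y)$ for all $x, y \in X$ and
2. $p(cx) = cp(x)$ for all $c \ge 0$ and $x \in X$.

Example 25.3. Suppose that $X = \mathbb{R}$ and
\[ p(x) = \inf\{\lambda \ge 0 : x \in \lambda[-1, 2] = [-\lambda, 2\lambda]\}. \]
Notice that if $x \ge 0$, then $p(x) = x/2$ and if $x \le 0$ then $p(x) = -x$, i.e.
\[ p(x) = \begin{cases} x/2 & \text{if } x \ge 0 \\ |x| & \text{if } x \le 0. \end{cases} \]
From this formula it is clear that $p(cx) = cp(x)$ for all $c \ge 0$ but not for $c < 0$. Moreover, $p$ satisfies the triangle inequality, indeed if $p(x) = \lambda$ and $p(y) = \mu$, then $x \in \lambda[-1, 2]$ and $y \in \mu[-1, 2]$ so that
\[ x + y \in \lambda[-1, 2] + \mu[-1, 2] \subset (\lambda + \mu)[-1, 2] \]
which shows that $p(x + y) \le \lambda + \mu = p(x) + p(y)$. To check the last set inclusion let $a, b \in [-1, 2]$, then
\[ \lambda a + \mu b = (\lambda + \mu)\Big( \frac{\lambda}{\lambda + \mu} a + \frac{\mu}{\lambda + \mu} b \Big) \in (\lambda + \mu)[-1, 2] \]
since $[-1, 2]$ is a convex set and $\frac{\lambda}{\lambda + \mu} + \frac{\mu}{\lambda + \mu} = 1$.

Footnote 1. Proof. To understand better why $\|f\| = \|u\|$, notice that
\[ \|f\|^2 = \sup_{\|x\| = 1} |f(x)|^2 = \sup_{\|x\| = 1} \big( |u(x)|^2 + |u(ix)|^2 \big). \]
Suppose that $M = \sup_{\|x\| = 1} |u(x)|$ and this supremum is attained at $x_0 \in X$ with $\|x_0\| = 1$. Replacing $x_0$ by $-x_0$ if necessary, we may assume that $u(x_0) = M$. Since $u$ has a maximum at $x_0$,
\[ 0 = \frac{d}{dt}\Big|_0 u\Big( \frac{x_0 + itx_0}{\|x_0 + itx_0\|} \Big) = \frac{d}{dt}\Big|_0 \Big\{ \frac{1}{|1 + it|} \big( u(x_0) + tu(ix_0) \big) \Big\} = u(ix_0) \]
since $\frac{d}{dt}\big|_0 |1 + it| = \frac{d}{dt}\big|_0 \sqrt{1 + t^2} = 0$. This explains why $\|f\| = \|u\|$.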
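The two defining properties of a Minkowski functional can be spot-checked for the functional of Example 25.3. This is only a finite grid test, not a proof, and the grid and the name `p` are illustrative choices.

```python
def p(x):
    """Minkowski functional of [-1, 2]: smallest lam >= 0 with x in [-lam, 2*lam]."""
    return x / 2 if x >= 0 else -x


# grid of test points in [-3, 3] with spacing 1/4
grid = [k / 4 for k in range(-12, 13)]

# triangle inequality p(x + y) <= p(x) + p(y)
subadditive = all(p(x + y) <= p(x) + p(y) + 1e-12 for x in grid for y in grid)

# positive homogeneity p(c x) = c p(x) for c >= 0
homogeneous = all(
    abs(p(c * x) - c * p(x)) < 1e-12 for x in grid for c in (0, 0.5, 3)
)
```

Homogeneity fails for negative scalars, for instance p(-2) = 2 while -p(2) = -1, which reflects the asymmetry of the interval [-1, 2] about the origin.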
BRUCE: Add in the relationship to convex sets and separation theorems,
see Reed and Simon Vol. 1. for example.
Theorem 25.4 (Hahn-Banach). Let $X$ be a real vector space, $p : X \to \mathbb{R}$ be a Minkowski functional, $M \subset X$ be a subspace, and $f : M \to \mathbb{R}$ be a linear functional such that $f \le p$ on $M$. Then there exists a linear functional $F : X \to \mathbb{R}$ such that $F|_M = f$ and $F \le p$ on $X$.
Proof. Step 1. We show for all $x \in X \setminus M$ there exists an extension $F$ to $M \oplus \mathbb{R}x$ with the desired properties. If $F$ exists and $\alpha = F(x)$, then for all $y \in M$ and $\lambda \in \mathbb{R}$ we must have
\[ f(y) + \lambda\alpha = F(y + \lambda x) \le p(y + \lambda x). \tag{25.2} \]
Dividing this equation by $|\lambda|$ allows us to conclude that Eq. (25.2) is valid for all $y \in M$ and $\lambda \in \mathbb{R}$ iff
\[ f(y) + \varepsilon\alpha \le p(y + \varepsilon x) \text{ for all } y \in M \text{ and } \varepsilon \in \{\pm 1\}. \]
Equivalently put we must have, for all $y, z \in M$, that
\[ \alpha \le p(y + x) - f(y) \quad \text{and} \quad f(z) - p(z - x) \le \alpha. \]
Hence it is possible to find an $\alpha \in \mathbb{R}$ such that Eq. (25.2) holds iff
\[ f(z) - p(z - x) \le p(y + x) - f(y) \text{ for all } y, z \in M. \tag{25.3} \]
(If Eq. (25.3) holds, then $\sup_{z \in M}[f(z) - p(z - x)] \le \inf_{y \in M}[p(y + x) - f(y)]$ and so we may choose $\alpha = \sup_{z \in M}[f(z) - p(z - x)]$ for example.) Now Equation (25.3) is equivalent to having
\[ f(z) + f(y) = f(z + y) \le p(y + x) + p(z - x) \text{ for all } y, z \in M \]
and this last equation is valid because
\[ f(z + y) \le p(z + y) = p(y + x + z - x) \le p(y + x) + p(z - x), \]
wherein we use $f \le p$ on $M$ and the triangle inequality for $p$. In conclusion, if $\alpha := \sup_{z \in M}[f(z) - p(z - x)]$ and $F(y + \lambda x) := f(y) + \lambda\alpha$, then by following the above logic backwards, we have $F|_M = f$ and $F \le p$ on $M \oplus \mathbb{R}x$ showing $F$ is the desired extension.
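The one-dimensional extension step can be traced in a small example. The sketch below assumes X = R^2 with p the l^1 norm, M the horizontal axis, f(t, 0) = t, and the new direction x = (0, 1); these choices and all names are illustrative. The supremum and infimum are approximated over a finite grid of points of M.

```python
def p(v):
    """The l^1 norm on R^2, used here as the Minkowski functional."""
    return abs(v[0]) + abs(v[1])


def f(t):
    """The functional f(t, 0) = t on M = {(t, 0)}; note f <= p on M."""
    return t


x = (0.0, 1.0)                              # direction not in M
ts = [k / 10 for k in range(-50, 51)]       # sample points (t, 0) of M

# Admissible values of alpha = F(x) lie between these two bounds.
lower = max(f(z) - p((z - x[0], 0 - x[1])) for z in ts)
upper = min(p((y + x[0], 0 + x[1])) - f(y) for y in ts)

alpha = lower                               # the choice made in the proof


def F(v):
    """Extension to M + R x: v = (t, 0) + s * x has t = v[0], s = v[1]."""
    return f(v[0]) + v[1] * alpha
```

In this example the admissible interval for alpha works out to [-1, 1], so the extension is far from unique; the proof's choice alpha = -1 gives F(t, s) = t - s, which is dominated by |t| + |s|.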
Step 2. Let us now write $F : X \to \mathbb{R}$ to mean $F$ is defined on a linear subspace $D(F) \subset X$ and $F : D(F) \to \mathbb{R}$ is linear. For $F, G : X \to \mathbb{R}$ we will say $F \prec G$ if $D(F) \subset D(G)$ and $F = G|_{D(F)}$, that is $G$ is an extension of $F$. Let
\[ \mathcal{F} = \{F : X \to \mathbb{R} : f \prec F \text{ and } F \le p \text{ on } D(F)\}. \]
Then $(\mathcal{F}, \prec)$ is a partially ordered set. If $\Phi \subset \mathcal{F}$ is a chain (i.e. a linearly ordered subset of $\mathcal{F}$) then $\Phi$ has an upper bound $G \in \mathcal{F}$ defined by $D(G) = \cup_{F \in \Phi} D(F)$ and $G(x) = F(x)$ for $x \in D(F)$. Then it is easily checked that $D(G)$ is a linear subspace, $G \in \mathcal{F}$, and $F \prec G$ for all $F \in \Phi$. We may now apply Zorn's Lemma (see Theorem B.7, and footnote 2) to conclude there exists a maximal element $F \in \mathcal{F}$. Necessarily, $D(F) = X$ for otherwise we could extend $F$ by Step 1, violating the maximality of $F$. Thus $F$ is the desired extension of $f$.
Corollary 25.5. Suppose that $X$ is a complex vector space, $p : X \to [0, \infty)$ is a semi-norm, $M \subset X$ is a linear subspace, and $f : M \to \mathbb{C}$ is a linear functional such that $|f(x)| \le p(x)$ for all $x \in M$. Then there exists $F \in X'$ ($X'$ is the algebraic dual of $X$) such that $F|_M = f$ and $|F| \le p$.
Proof. Let $u = \operatorname{Re} f$, then $u \le p$ on $M$ and hence by Theorem 25.4, there exists $U \in X'_{\mathbb{R}}$ such that $U|_M = u$ and $U \le p$ on $X$. Define $F(x) = U(x) - iU(ix)$, then as in Proposition 25.1, $F = f$ on $M$ and $|F| \le p$.
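Theorem 25.6. Let $X$ be a normed space, $M \subset X$ be a closed subspace and $x \in X \setminus M$. Then there exists $f \in X^*$ such that $\|f\| = 1$, $f(x) = \delta = d(x, M)$ and $f = 0$ on $M$.
Proof. Define $h : M \oplus \mathbb{C}x \to \mathbb{C}$ by $h(m + \lambda x) := \lambda\delta$ for all $m \in M$ and $\lambda \in \mathbb{C}$. Then
\[ \|h\| := \sup_{m \in M \text{ and } \lambda \neq 0} \frac{|\lambda|\delta}{\|m + \lambda x\|} = \sup_{m \in M \text{ and } \lambda \neq 0} \frac{\delta}{\|x + m/\lambda\|} = \frac{\delta}{\delta} = 1 \]
and by the Hahn Banach theorem there exists $f \in X^*$ such that $f|_{M \oplus \mathbb{C}x} = h$ and $\|f\| \le 1$. Since $1 = \|h\| \le \|f\| \le 1$, it follows that $\|f\| = 1$.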
Corollary 25.7. To each x X, let x X
be dened by x(f) = f(x) for

all f X
. Then the map x X x X
is a linear isometry of Banach

spaces.
Proof. Since
2
The use of Zorns lemma in this step may be avoided in the case that p (x) is a
norm and X may be written as M span() where := |xn
n=1
is a countable
subset of X. In this case, by step (1) and induction, f : M R may be extended to
a linear functional F : Mspan() R with F (x) p (x) for x Mspan().
This function F then extends by continuity to X and gives the desired extension
of f.
[ x(f)[ = [f(x)[ |f|
X
|x|
X
for all f X
,
it follows that | x|
X
|x|
X
. Now applying Theorem 25.6 with M = 0 ,
there exists f X
such that |f| = 1 and [ x(f)[ = f(x) = |x| , which shows

that | x|
X
|x|
X
. This shows that x X x X
is an isometry. Since
isometries are necessarily injective, we are done.
Denition 25.8. A Banach space X is reexive if the map x X x
X
is surjective.
Example 25.9. Every Hilbert space H is reexive. This is a consequence of the
Riesz Theorem 8.15.
Exercise 25.1. Show all nite dimensional Banach spaces are reexive.
Denition 25.10. For subsets, M X and N X
, let
M
0
:= f X
: f[
M
= 0 and
N
:= x X : f(x) = 0 for all f N.

We call M
0
the annihilator of M and N
the backwards annihilator of

N.
Lemma 25.11. Let M X and N X
, then
1. M
0
and N
are always closed subspace of X
and X respectively.
2. If M is a subspace of X, then
_
M
0
_
=

M.
3. If N is a subspace, then

N
_
N
_
0
with equality if X is reexive. See
Proposition 25.16 below.
Proof. Since
M
0
=
xM
Nul( x) and N
=
fM
Nul(f),
M
0
and N
are both formed as an intersection of closed subspaces and hence

are themselves closed subspaces.
If x M, then f(x) = 0 for all f M
0
so that x
_
M
0
_
and hence
M
_
M
0
_
. If x /

M, then there exists (by Theorem 25.6) f X
such that
f[
M
= 0 while f(x) ,= 0, i.e. f M
0
yet f(x) ,= 0. This shows x /
_
M
0
_
and we have shown

_
M
0
_

M. The proof of Item 3. is left to the reader in
Exercise 25.2.
Exercise 25.2. Prove Item 3. of Lemma 25.11. Also show that it is possible
that

N ,=
_
N
_
0
. Hint: let Y be a non-reexive Banach space (see Theorem
7.16 and Theorem 25.13 below) and take N =

X X
= Y
.
Solution to Exercise. 25.2If f N, then f[
N
= 0 by denition of N
.
This shows f
_
N
_
0
and hence N
_
N
_
0
. Since
_
N
_
0
is a closed
subspace it now follows that

N
_
N
_
0
. If there exists f
_
N
_
0

N, then
by the Hahn Banach theorem there exists X
such that [
N
= 0 while
(f) ,= 0. If X is reexive, = x for some x X and we have x(N) = 0 which
is equivalent to x N
. Since f
_
N
_
0
, we arrive at the contradiction,
0 = f (x) = x(f) = (f) ,= 0,
and hence it follows that

N =
_
N
_
0
.
Now let X = Y
where Y is a non-reexive Banach space and let N =
X X
= Y
. Notice that N is a closed subspace of X
and that f
N
Y = X
i 0 = x(f) = f (x) for all x X. This clearly shows that

N
= 0 and therefore
_
N
_
0
= Y
= X
which properly contains N =

X
when X is not reexive.
Proposition 25.12. Suppose X is a Banach space, then X
=

(X
)
_
X
_
0
where
_
X
_
0
= X
: ( x) = 0 for all x X .
In particular X is reexive i X
is reexive.
Proof. Let X
and dene f
by f
(x) := ( x) for all x X

and set
t
:=

f
. For x X (so x X
) we have
t
( x) = ( x)

f
( x) = f
(x) x(f
) = f
(x) f
(x) = 0.
This shows
t

X
0
and we have shown X
=

X
+

X
0
. If

X

X
0
,
then =

f for some f X
and 0 =

f( x) = x(f) = f(x) for all x X, i.e.
f = 0 so = 0. Therefore X
=

X

X
0
as claimed.
If X is reexive, then

X = X
and so

X
0
= 0 showing (X
=
X
=

(X
), i.e. X
is reexive. Conversely if X
is reexive we conclude
that
_
X
_
0
= 0 and therefore
X
= 0
=
_
X
0
_
=

X,
which shows

X is reexive. Here we have used
_
X
0
_
=

X =

X
since

X is a closed subspace of X
.
Theorem 25.13 (Continuation of Theorem 7.16). Let X be an innite
set, : X (0, ) be a function, p [1, ], q := p/ (p 1) be the conjugate
exponent and for f
q
() dene
f
:
p
() F by
f
(g) :=
xX
f (x) g (x) (x) . (25.4)
1.
p
() is reexive for p (1, ) .
2. The map :
1
()
(X)
is not surjective.
3.
1
() and
(X) are not reexive.

See Lemma 25.14 and Exercise 28.3 below for more examples of non-
reexive spaces.
Proof.
1. This basically follows from two applications of item 3 of Theorem 7.16.
More precisely if
p
()
, let

q
()
be dened by

(g) = (
g
)
for g
q
() . Then by item 3., there exists f
p
() such that, for all
g
q
() ,
(
g
) =

(g) =
f
(g) =
g
(f) =

f (
g
) .
Since
p
()
=
g
: g
q
() , this implies that =

f and so
p
() is
reexive.
2. Recall c
0
(X) as dened in Notation 7.15 and is a closed subspace of
(X) , see Exercise 7.4. Let 1
(X) denote the constant function 1

on X. Notice that |1 f|
1 for all f c
0
(X) and therefore, by the
Hahn - Banach Theorem, there exists
(X)
such that (1) = 0

while [
c0(X)
0. Now if =
f
for some f
1
() , then (x) f (x) =
(
x
) = 0 for all x and f would have to be zero. This is absurd.
3. As we have seen
1
()
(X) while
(X)

= c
0
(X)
,=
1
() . Let

(X)
be the linear functional as described above. We view this as

an element of
1
()
by using
(
g
) := (g) for all g
(X) .
Suppose that

=

f for some f
1
() , then
(g) =

(
g
) =

f (
g
) =
g
(f) =
f
(g) .
But was constructed in such a way that ,=
f
for any f
1
() .
It now follows from Proposition 25.12 that
1
()
(X) is also not

reexive.
Exercise 25.3. Suppose p (1, ) and is a nite measure on a mea-
surable space (X, /), then L
p
(X, /, ) is reexive. Hint: model your proof
on the proof of item 1. of Theorem 25.13 making use of Theorem 24.14.
Lemma 25.14. Suppose that (X, o) is a pointed Hausdor topological space
(i.e. o X is a xed point) and is a nite measure on B
X
such that
1. supp() = X while (o) = 0 and
2. there exists f
n
C (X) such that f
n
1
o]
boundedly as n .
(For example suppose X = [0, 1], o = 0, and = m.)
Then the map
g L
1
()
g
L
()
is not surjective and the Banach space L

1
() is not reexive. (In other words,
Theorem 24.14 may fail when p = and L
1
- spaces need not be reexive.)
Proof. Since supp() = X, if f C (X) we have
|f|
L
()
= sup[f (x)[ x X
and we may view C (X) as a closed subspace of L
() . For f C (X) , let

(f) = f (o) . Then ||
C(X)
= 1, and therefore by Corollary 25.5 of the
Hahn-Banach Theorem, there exists an extension (L
())
such that
= [
C(X)
and || = 1.
If =
g
for some g L
1
() then we would have
f (o) = (f) = (f) =
g
(f) =
_
X
fgd for all f C (X) .
Applying this equality to the f
n
n=1
in item 2. of the statement of the lemma
and then passing to the limit using the dominated convergence theorem, we
arrive at the following contradiction;
1 = lim
n
f
n
(o) = lim
n
_
X
f
n
gd =
_
X
1
o]
gd = 0.
Hence we must conclude that ,=
g
for any g L
1
() .
Since, by Theorem 24.14, the map f L
()
f
L
1
()
is an
isometric isomorphism of Banach spaces we may dene L L
1
()
by
L(
f
) := (f) for all f L
() .
If L were to equal g for some g L
1
() , then
(f) = L(
f
) = g (
f
) =
f
(g) =
_
X
fgd
for all f C (X) L
() . But we have just seen this is impossible and

therefore L ,= g for any g L
1
() and thus L
1
() is not reexive.
25.1.1 Hahn Banach Theorem Problems
Exercise 25.4. Give another proof Corollary 10.15 based on Remark 10.13.
Hint: the Hahn Banach theorem implies
|f(b) f(a)| = sup
X
, ,=0
[(f(b)) (f(a))[
||
.
Exercise 25.5. Prove Theorem 10.39 using the following strategy.
1. Use the results from the proof in the text of Theorem 10.39 that
s
_
d
c
f(s, t)dt and t
_
b
a
f(s, t)ds
are continuous maps.
2. For the moment take X = 1 and prove Eq. (10.24) holds by rst proving it
holds when f (s, t) = s
m
t
n
with m, n N
0
. Then use this result along with
Theorem 10.35 to show Eq. (10.24) holds for all f C ([a, b] [c, d], 1) .
3. For the general case, use the special case proved in item 2. along with
Hahn Banach Theorem 25.4.
Exercise 25.6 (Liouvilles Theorem). (This exercise requires knowledge
of complex variables.) Let X be a Banach space and f : C X be a
function which is complex dierentiable at all points z C, i.e. f
t
(z) :=
lim
h0
(f (z +h) f(z) /h exists for all z C. If we further suppose that
M := sup
zC
|f (z)| < ,
then f is constant. Hint: use the Hahn Banach Theorem 25.4 and the fact
the result holds if X = C.
Exercise 25.7. Let M be a nite dimensional subspace of a normed space,
X. Show there exists a closed subspace, N, such that X = M N. Hint:
let = x
1
, . . . , x
n
M be a basis for M and construct N making use of
i
X
which you should construct to satisfy,
i
(x
j
) =
ij
=
_
1 if i = j
0 if i ,= j.
Exercise 25.8. Folland 5.21, p. 160.
Exercise 25.9. Let X be a Banach space such that X
is separable. Show
X is separable as well. (The converse is not true as can be seen by taking
X =
1
(N) .) Hint: use the greedy algorithm, i.e. suppose D X
0 is a
countable dense subset of X
, for D choose x
X such that |x
| = 1
and [(x
)[
1
2
||.
Exercise 25.10. Folland 5.26.
25.1.2 *Quotient spaces, adjoints, and more reexivity
Denition 25.15. Let X and Y be Banach spaces and A : X Y be a linear
operator. The transpose of A is the linear operator A
: Y
dened
by
_
A
f
_
(x) = f(Ax) for f Y
and x X. The null space of A is the

subspace Nul(A) := x X : Ax = 0 X. For M X and N X
let
M
0
:= f X
: f[
M
= 0 and
N
:= x X : f(x) = 0 for all f N.

Proposition 25.16 (Basic properties of transposes and annihilators).
1. |A| =
_
_
A
_
_
and A
x =

Ax for all x X.
2. M
0
and N
are always closed subspaces of X
and X respectively.
3.
_
M
0
_
=

M.
4.

N
_
N
_
0
with equality when X is reexive.
5. Nul(A) = Ran(A
and Nul(A
) = Ran(A)
0
. Moreover, Ran(A) =
Nul(A
and if X is reexive, then Ran(A
) = Nul(A)
0
.
6. X is reexive i X
is reexive. More generally X
=

X

X
0
where
X
0
= X
: ( x) = 0 for all x X .
Proof.
1.
|A| = sup
|x|=1
|Ax| = sup
|x|=1
sup
|f|=1
[f(Ax)[
= sup
|f|=1
sup
|x|=1
f(x)
= sup
|f|=1
_
_
A
f
_
_
=
_
_
A
_
_
.
2. This is an easy consequence of the assumed continuity o all linear func-
tionals involved.
3. If x M, then f(x) = 0 for all f M
0
so that x
_
M
0
_
. Therefore
M
_
M
0
_
. If x /

M, then there exists f X
such that f[
M
= 0
while f(x) ,= 0, i.e. f M
0
yet f(x) ,= 0. This shows x /
_
M
0
_
and we
have shown
_
M
0
_

M.
4. It is again simple to show N
_
N
_
0
and therefore

N
_
N
_
0
.
Moreover, as above if f /

N there exists X
such that [
N
= 0
while (f) ,= 0. If X is reexive, = x for some x X and since
g(x) = (g) = 0 for all g

N, we have x N
. On the other hand,

f(x) = (f) ,= 0 so f /
_
N
_
0
. Thus again
_
N
_
0

N.
5.
Nul(A) = x X : Ax = 0 = x X : f(Ax) = 0 f X
=
_
x X : A
f(x) = 0 f X
_
=
_
x X : g(x) = 0 g Ran(A
)
_
= Ran(A
.
Similarly,
Nul(A
) =
_
f Y
: A
f = 0
_
=
_
f Y
: (A
f)(x) = 0 x X
_
= f Y
: f(Ax) = 0 x X
=
_
f Y
: f[
Ran(A)
= 0
_
= Ran(A)
0
.
6. Let X
and dene f
by f
(x) = ( x) for all x X and set
t
:=

f
. For x X (so x X
) we have
t
( x) = ( x)

f
( x) = f
(x) x(f
) = f
(x) f
(x) = 0.
This shows
t

X
0
and we have shown X
=

X
+

X
0
. If

X

X
0
,
then =

f for some f X
and 0 =

f( x) = x(f) = f(x) for all x X,
i.e. f = 0 so = 0. Therefore X
=

X

X
0
as claimed. If X is
reexive, then

X = X
and so

X
0
= 0 showing X
=

X
, i.e. X
is reexive. Conversely if X
is reexive we conclude that

X
0
= 0 and
therefore X
= 0
=
_
X
0
_
=

X, so that X is reexive.
Alternative proof. Notice that f
= J
, where J : X X
is given
by Jx = x, and the composition
f X

f X

f X
is the identity map since

_
J

f
_
(x) =

f(Jx) =

f( x) = x(f) = f(x) for all
x X. Thus it follows that X
is invertible i J
is its inverse
which can happen i Nul(J
) = 0 . But as above Nul(J
) = Ran(J)
0
which will be zero i Ran(J) = X
and since J is an isometry this

is equivalent to saying Ran(J) = X
. So we have again shown X
is
reexive i X is reexive.
Theorem 25.17. Let $X$ be a Banach space, $M \subset X$ be a proper closed subspace, $X/M$ the quotient space, $\pi : X \to X/M$ the projection map $\pi(x) = x + M$ for $x \in X$ and define the quotient norm on $X/M$ by
\[ \|\pi(x)\|_{X/M} = \|x + M\|_{X/M} = \inf_{m \in M} \|x + m\|_X. \]
Then:
1. $\|\cdot\|_{X/M}$ is a norm on $X/M$.
2. The projection map $\pi : X \to X/M$ has norm 1, $\|\pi\| = 1$.
3. $(X/M, \|\cdot\|_{X/M})$ is a Banach space.
4. If $Y$ is another normed space and $T : X \to Y$ is a bounded linear transformation such that $M \subset \operatorname{Nul}(T)$, then there exists a unique linear transformation $S : X/M \to Y$ such that $T = S \circ \pi$ and moreover $\|T\| = \|S\|$.
Proof. 1) Clearly |x +M| 0 and if |x + M| = 0, then there exists
m
n
M such that |x + m
n
| 0 as n , i.e. x = lim
n
m
n

M = M.
Since x M, x +M = 0 X/M. If c C 0 , x X, then
|cx +M| = inf
mM
|cx +m| = [c[ inf
mM
|x +m/c| = [c[ |x +M|
because m/c runs through M as m runs through M. Let x
1
, x
2
X and
m
1
, m
2
M then
|x
1
+x
2
+M| |x
1
+x
2
+m
1
+m
2
| |x
1
+m
1
| +|x
2
+m
2
|.
Taking inmums over m
1
, m
2
M then implies
|x
1
+x
2
+M| |x
1
+M| +|x
2
+M|.
and we have completed the proof the (X/M, | |) is a normed space. 2) Since
|(x)| = inf
mM
|x +m| |x| for all x X, || 1. To see || = 1, let
x X M so that (x) ,= 0. Given (0, 1), there exists m M such that
|x +m|
1
|(x)| .
Therefore,
|(x +m)|
|x +m|
=
|(x)|
|x +m|

|x +m|
|x +m|
=
which shows || . Since (0, 1) is arbitrary we conclude that |(x)| =
1. 3) Let (x
n
) X/M be a sequence such that

|(x
n
)| < . As above
there exists m
n
M such that |(x
n
)|
1
2
|x
n
+ m
n
| and hence

|x
n
+
m
n
| 2
|(x
n
)| < . Since X is complete, x :=
n=1
(x
n
+ m
n
) exists in
X and therefore by the continuity of ,
(x) =
n=1
(x
n
+m
n
) =
n=1
(x
n
)
showing X/M is complete. 4) The existence of S is guaranteed by the factor
theorem from linear algebra. Moreover |S| = |T| because
|T| = |S | |S| || = |S|
and
|S| = sup
x/ M
|S((x))|
|(x)|
= sup
x/ M
|Tx|
|(x)|
sup
x/ M
|Tx|
|x|
= sup
x,=0
|Tx|
|x|
= |T| .
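For a concrete feel for the quotient norm, the sketch below takes X = R^2 with the Euclidean norm and M the diagonal line spanned by (1, 1), and approximates the infimum over M by a grid search. The space, the subspace, the search radius and the names `norm` and `quotient_norm` are assumptions for illustration. In this setting the quotient norm of x + M is the distance from x to the line, |x_1 - x_2| / sqrt(2).

```python
import math


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def quotient_norm(x, steps=20001, radius=10.0):
    """Approximate inf over t of |x + t(1, 1)| by a grid search in t."""
    best = float("inf")
    for k in range(steps):
        t = -radius + 2 * radius * k / (steps - 1)
        best = min(best, norm((x[0] + t, x[1] + t)))
    return best
```

Since t = 0 is one of the grid points, the computed value never exceeds the norm of x itself, in line with the bound on the projection map in item 2 of the theorem; points of M itself have quotient norm zero.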
Theorem 25.18. Let X be a Banach space. Then
1. Identifying X with

X X
, the weak topology on X
induces the
weak topology on X. More explicitly, the map x X x

X is a
homeomorphism when X is equipped with its weak topology and

X with
the relative topology coming from the weak- topology on X
.
2.

X X
is dense in the weak- topology on X
.
3. Letting C and C
be the closed unit balls in X and X
respectively, then
C := x C
: x C is dense in C
in the weak topology on X

.
.
4. X is reexive i C is weakly compact.
(See Denition 14.36 for the topologies being used here.)
Proof.
1. The weak topology on X
is generated by
_
f : f X
_
= X
(f) : f X
.
So the induced topology on X is generated by
x X x X
x(f) = f(x) : f X
= X
and so the induced topology on X is precisely the weak topology.

2. A basic weak - neighborhood of a point X
is of the form
^ :=
n
k=1
X
: [(f
k
) (f
k
)[ < (25.5)
for some f
k
n
k=1
X
and > 0. be given. We must now nd x X

such that x ^, or equivalently so that
[ x(f
k
) (f
k
)[ = [f
k
(x) (f
k
)[ < for k = 1, 2, . . . , n. (25.6)
In fact we will show there exists x X such that (f
k
) = f
k
(x) for
k = 1, 2, . . . , n. To prove this stronger assertion we may, by discard-
ing some of the f
k
s if necessary, assume that f
k
n
k=1
is a linearly
independent set. Since the f
k
n
k=1
are linearly independent, the map
x X (f
1
(x), . . . , f
n
(x)) C
n
is surjective (why) and hence there
exists x X such that
(f
1
(x), . . . , f
n
(x)) = Tx = ((f
1
) , . . . , (f
n
)) (25.7)
as desired.
3. Let C
and ^ be the weak - open neighborhood of

as in Eq. (25.5). Working as before, given > 0, we need to nd
x C such that Eq. (25.6). It will be left to the reader to verify that
it suces again to assume f
k
n
k=1
is a linearly independent set. (Hint:
Suppose that f
1
, . . . , f
m
were a maximal linearly dependent subset of
f
k
n
k=1
, then each f
k
with k > m may be written as a linear combination
f
1
, . . . , f
m
.) As in the proof of item 2., there exists x X such that
Eq. (25.7) holds. The problem is that x may not be in C. To remedy this,
let N :=
n
k=1
Nul(f
k
) = Nul(T), : X X/N

= C
n
be the projection
map and

f
k
(X/N)
be chosen so that f
k
=

f
k
for k = 1, 2, . . . , n.
Then we have produced x X such that
((f
1
) , . . . , (f
n
)) = (f
1
(x), . . . , f
n
(x)) = (

f
1
((x)), . . . ,

f
n
((x))).
Since
_
f
1
, . . . ,

f
n
_
is a basis for (X/N)
we nd
|(x)| = sup
C
n
\0]
n
i=1
i

f
i
((x))
_
_
n
i=1
i

f
i
_
_
= sup
C
n
\0]
[
n
i=1
i
(f
i
)[
|
n
i=1
i
f
i
|
= sup
C
n
\0]
[(
n
i=1
i
f
i
)[
|
n
i=1
i
f
i
|
|| sup
C
n
\0]
|
n
i=1
i
f
i
|
|
n
i=1
i
f
i
|
= 1.
Hence we have shown |(x)| 1 and therefore for any > 1 there
exists y = x + n X such that |y| < and ((f
1
) , . . . , (f
n
)) =
(f
1
(y), . . . , f
n
(y)). Hence
[(f
i
) f
i
(y/)[
f
i
(y)
1
f
i
(y)
(1
1
) [f
i
(y)[
which can be arbitrarily small (i.e. less than ) by choosing suciently
close to 1.
4. Let

C := x : x C C
. If X is reexive,

C = C
is weak -
compact and hence by item 1., C is weakly compact in X. Conversely
if C is weakly compact, then

C C
is weak compact being the

continuous image of a continuous map. Since the weak topology on
X
is Hausdor, it follows that

C is weak closed and so by item 3,
C
=

C
weak
=

C. So if X
, / || C
=

C, i.e. there exists
x C such that x = / || . This shows = (|| x)
and therefore
X = X
.
25.2 The Open Mapping Theorem
Theorem 25.19 (Open Mapping Theorem). Let X, Y be Banach spaces,
T L(X, Y ). If T is surjective then T is an open mapping, i.e. T(V ) is open
in Y for all open subsets V X.
Proof. For all > 0 let B
X
= x X : |x|
X
< X, B
Y
=
y Y : |y|
Y
< Y and E
= T(B
X
) Y. The proof will be carried out

by proving the following three assertions.
1. There exists > 0 such that B
Y
for all > 0.

2. For the same > 0, B
Y
, i.e. we may remove the closure in assertion

1.
3. The last assertion implies T is an open mapping.
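1. Since $Y = \bigcup_{n=1}^\infty E_n$, the Baire category Theorem 16.2 implies there exists $n$ such that $\overline{E_n}^{\,o} \ne \emptyset$, i.e. there exists $y \in \overline{E_n}$ and $\varepsilon > 0$ such that $\overline{B^Y(y, \varepsilon)} \subset \overline{E_n}$. Suppose $\|y'\| < \varepsilon$; then $y$ and $y + y'$ are in $B^Y(y, \varepsilon) \subset \overline{E_n}$, hence there exist $\tilde x, x \in B^X_n$ such that $\|T\tilde x - (y + y')\|$ and $\|Tx - y\|$ may be made as small as we please, which we abbreviate as follows
$$\|T\tilde x - (y + y')\| \approx 0 \text{ and } \|Tx - y\| \approx 0.$$
Hence by the triangle inequality,
$$\|T(\tilde x - x) - y'\| = \|T\tilde x - (y + y') - (Tx - y)\| \le \|T\tilde x - (y + y')\| + \|Tx - y\| \approx 0$$
with $\tilde x - x \in B^X_{2n}$. This shows that $y' \in \overline{E_{2n}}$ which implies $B^Y(0, \varepsilon) \subset \overline{E_{2n}}$. Since the map $\varphi_\alpha : Y \to Y$ given by $\varphi_\alpha(y) = \frac{\alpha}{2n} y$ is a homeomorphism, $\varphi_\alpha(E_{2n}) = E_\alpha$ and $\varphi_\alpha(B^Y(0, \varepsilon)) = B^Y(0, \frac{\alpha\varepsilon}{2n})$, it follows that $B^Y_{\delta\alpha} \subset \overline{E_\alpha}$ where $\delta := \frac{\varepsilon}{2n} > 0$.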
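2. Let $\delta$ be as in assertion 1., $y \in B^Y_\delta$ and $\alpha_1 \in (\|y\|/\delta, 1)$. Choose $\{\alpha_n\}_{n=2}^\infty \subset (0, \infty)$ such that $\sum_{n=1}^\infty \alpha_n < 1$. Since $y \in B^Y_{\alpha_1\delta} \subset \overline{E_{\alpha_1}} = \overline{T(B^X_{\alpha_1})}$ by assertion 1., there exists $x_1 \in B^X_{\alpha_1}$ such that $\|y - Tx_1\| < \alpha_2\delta$. (Notice that $\|y - Tx_1\|$ can be made as small as we please.) Similarly, since $y - Tx_1 \in B^Y_{\alpha_2\delta} \subset \overline{E_{\alpha_2}} = \overline{T(B^X_{\alpha_2})}$ there exists $x_2 \in B^X_{\alpha_2}$ such that $\|y - Tx_1 - Tx_2\| < \alpha_3\delta$. Continuing this way inductively, there exists $x_n \in B^X_{\alpha_n}$ such that
$$\left\|y - \sum_{k=1}^n Tx_k\right\| < \alpha_{n+1}\delta \text{ for all } n \in \mathbb{N}. \quad (25.8)$$
Since $\sum_{n=1}^\infty \|x_n\| < \sum_{n=1}^\infty \alpha_n < 1$, $x := \sum_{n=1}^\infty x_n$ exists and $\|x\| < 1$, i.e. $x \in B^X_1$. Passing to the limit in Eq. (25.8) shows $\|y - Tx\| = 0$ and hence $y \in T(B^X_1) = E_1$. Therefore we have shown $B^Y_\delta \subset E_1$. The same scaling argument as above then shows $B^Y_{\delta\alpha} \subset E_\alpha$ for all $\alpha > 0$.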
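3. If $x \in V \subset_o X$ and $y = Tx \in TV$ we must show that $TV$ contains a ball $B^Y(y, \varepsilon) = Tx + B^Y_\varepsilon$ for some $\varepsilon > 0$. Now $B^Y(y, \varepsilon) = Tx + B^Y_\varepsilon \subset TV$ iff $B^Y_\varepsilon \subset TV - Tx = T(V - x)$. Since $V - x$ is a neighborhood of $0 \in X$, there exists $\alpha > 0$ such that $B^X_\alpha \subset (V - x)$ and hence by assertion 2.,
$$B^Y_{\delta\alpha} \subset T B^X_\alpha \subset T(V - x) = T(V) - y$$
and therefore $B^Y(y, \varepsilon) \subset TV$ with $\varepsilon := \alpha\delta$.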
Corollary 25.20. If $X, Y$ are Banach spaces and $T \in L(X, Y)$ is invertible (i.e. a bijective linear transformation) then the inverse map, $T^{-1}$, is bounded, i.e. $T^{-1} \in L(Y, X)$. (Note that $T^{-1}$ is automatically linear.)
Definition 25.21. Let $X$ and $Y$ be normed spaces and $T : X \to Y$ be a linear (not necessarily continuous) map.
1. Let $\Gamma : X \to X \times Y$ be the linear map defined by $\Gamma(x) := (x, T(x))$ for all $x \in X$ and let
$$\Gamma(T) = \{(x, T(x)) : x \in X\}$$
be the graph of $T$.
2. The operator $T$ is said to be closed if $\Gamma(T)$ is a closed subset of $X \times Y$.
Exercise 25.11. Let $T : X \to Y$ be a linear map between normed vector spaces. Show $T$ is closed iff for all convergent sequences $\{x_n\}_{n=1}^\infty \subset X$ such that $\{Tx_n\}_{n=1}^\infty \subset Y$ is also convergent, we have $\lim_{n\to\infty} Tx_n = T(\lim_{n\to\infty} x_n)$. (Compare this with the statement that $T$ is continuous iff for every convergent sequence $\{x_n\}_{n=1}^\infty \subset X$ we have $\{Tx_n\}_{n=1}^\infty \subset Y$ is necessarily convergent and $\lim_{n\to\infty} Tx_n = T(\lim_{n\to\infty} x_n)$.)
Theorem 25.22 (Closed Graph Theorem). Let $X$ and $Y$ be Banach spaces and $T : X \to Y$ be a linear map. Then $T$ is continuous iff $T$ is closed.
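Proof. If $T$ is continuous and $(x_n, Tx_n) \to (x, y) \in X \times Y$ as $n \to \infty$ then $Tx_n \to Tx = y$ which implies $(x, y) = (x, Tx) \in \Gamma(T)$. Conversely suppose $T$ is closed, i.e. $\Gamma(T)$ is a closed subspace of $X \times Y$ and is therefore a Banach space in its own right. The map $\pi_2 : X \times Y \to Y$ is continuous and $\pi_1|_{\Gamma(T)} : \Gamma(T) \to X$ is a continuous bijection, which implies $\pi_1|_{\Gamma(T)}^{-1}$ is bounded by the open mapping Theorem 25.19. Therefore $T = \pi_2 \circ \Gamma = \pi_2 \circ \pi_1|_{\Gamma(T)}^{-1}$ is bounded, being the composition of bounded operators, since the following diagram commutes
$$\begin{array}{ccc} & \Gamma(T) & \\ {\scriptstyle \Gamma = \pi_1|_{\Gamma(T)}^{-1}} \nearrow & & \searrow {\scriptstyle \pi_2} \\ X & \xrightarrow{\quad T \quad} & Y. \end{array}$$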
As an application we have the following proposition.
Proposition 25.23. Let $H$ be a Hilbert space. Suppose that $T : H \to H$ is a linear (not necessarily bounded) map such that there exists $T^* : H \to H$ such that
$$\langle Tx | y \rangle = \langle x | T^* y \rangle \quad \forall\, x, y \in H.$$
Then $T$ is bounded.
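Proof. It suffices to show $T$ is closed. To prove this suppose that $x_n \in H$ such that $(x_n, Tx_n) \to (x, y) \in H \times H$. Then for any $z \in H$,
$$\langle Tx_n | z \rangle = \langle x_n | T^* z \rangle \to \langle x | T^* z \rangle = \langle Tx | z \rangle \text{ as } n \to \infty.$$
On the other hand $\lim_{n\to\infty} \langle Tx_n | z \rangle = \langle y | z \rangle$ as well and therefore $\langle Tx | z \rangle = \langle y | z \rangle$ for all $z \in H$. This shows that $Tx = y$ and proves that $T$ is closed.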
Here is another example.
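Example 25.24. Suppose that $\mathcal{M} \subset L^2([0, 1], m)$ is a closed subspace such that each element of $\mathcal{M}$ has a representative in $C([0, 1])$. We will abuse notation and simply write $\mathcal{M} \subset C([0, 1])$. Then
1. There exists $A \in (0, \infty)$ such that $\|f\|_\infty \le A \|f\|_{L^2}$ for all $f \in \mathcal{M}$.
2. For all $x \in [0, 1]$ there exists $g_x \in \mathcal{M}$ such that
$$f(x) = \langle f | g_x \rangle := \int_0^1 f(y) \overline{g_x(y)}\, dy \text{ for all } f \in \mathcal{M}.$$
Moreover we have $\|g_x\| \le A$.
3. The subspace $\mathcal{M}$ is finite dimensional and $\dim(\mathcal{M}) \le A^2$.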
Proof. 1) I will give two proofs of part 1. Each proof requires that we first show that $(\mathcal{M}, \|\cdot\|_\infty)$ is a complete space. To prove this it suffices to show $\mathcal{M}$ is a closed subspace of $C([0, 1])$. So let $\{f_n\} \subset \mathcal{M}$ and $f \in C([0, 1])$ such that $\|f_n - f\|_\infty \to 0$ as $n \to \infty$. Then $\|f_n - f_m\|_{L^2} \le \|f_n - f_m\|_\infty \to 0$ as $m, n \to \infty$, and since $\mathcal{M}$ is closed in $L^2([0, 1])$, $L^2\text{-}\lim_{n\to\infty} f_n = g \in \mathcal{M}$. By passing to a subsequence if necessary we know that $g(x) = \lim_{n\to\infty} f_n(x) = f(x)$ for $m$-a.e. $x$. So $f = g \in \mathcal{M}$.
i) Let $i : (\mathcal{M}, \|\cdot\|_\infty) \to (\mathcal{M}, \|\cdot\|_2)$ be the identity map. Then $i$ is bounded and bijective. By the open mapping theorem, $j = i^{-1}$ is bounded as well. Hence there exists $A < \infty$ such that $\|f\|_\infty = \|j(f)\|_\infty \le A \|f\|_2$ for all $f \in \mathcal{M}$.
ii) Let $j : (\mathcal{M}, \|\cdot\|_2) \to (\mathcal{M}, \|\cdot\|_\infty)$ be the identity map. We will show that $j$ is a closed operator and hence bounded by the closed graph Theorem 25.22. Suppose that $f_n \in \mathcal{M}$ such that $f_n \to f$ in $L^2$ and $f_n = j(f_n) \to g$ in $C([0, 1])$. Then as in the first paragraph, we conclude that $g = f = j(f)$ a.e., showing $j$ is closed. Now finish as in the last line of proof i).
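2) For $x \in [0, 1]$, let $e_x : \mathcal{M} \to \mathbb{C}$ be the evaluation map $e_x(f) = f(x)$. Then
$$|e_x(f)| = |f(x)| \le \|f\|_\infty \le A \|f\|_{L^2}$$
which shows that $e_x \in \mathcal{M}^*$. Hence there exists a unique element $g_x \in \mathcal{M}$ such that
$$f(x) = e_x(f) = \langle f, g_x \rangle \text{ for all } f \in \mathcal{M}.$$
Moreover $\|g_x\|_{L^2} = \|e_x\|_{\mathcal{M}^*} \le A$.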
3) Let $\{f_j\}_{j=1}^n$ be an $L^2$ orthonormal subset of $\mathcal{M}$. Then
$$A^2 \ge \|e_x\|_{\mathcal{M}^*}^2 = \|g_x\|_{L^2}^2 \ge \sum_{j=1}^n |\langle f_j, g_x \rangle|^2 = \sum_{j=1}^n |f_j(x)|^2$$
and integrating this equation over $x \in [0, 1]$ implies that
$$A^2 \ge \sum_{j=1}^n \int_0^1 |f_j(x)|^2\, dx = \sum_{j=1}^n 1 = n$$
which shows that $n \le A^2$. Hence $\dim(\mathcal{M}) \le A^2$.
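The bound $n \le A^2$ can be sanity-checked numerically. The sketch below is only an illustration under assumptions not made in the text: it takes $\mathcal{M}$ to be the span of the cosine system $1, \sqrt{2}\cos(\pi x), \dots, \sqrt{2}\cos((n-1)\pi x)$ (an arbitrary choice of orthonormal family in $L^2([0,1])$) and uses the fact that for such a finite dimensional $\mathcal{M}$ the smallest admissible constant satisfies $A^2 = \sup_x \|g_x\|^2 = \sup_x \sum_j |f_j(x)|^2$. All function names are hypothetical.

```python
import math

def orthonormal_cosines(n):
    """Return n functions orthonormal in L^2([0,1]): 1, sqrt(2)cos(pi x), ..."""
    funcs = [lambda x: 1.0]
    for k in range(1, n):
        funcs.append(lambda x, k=k: math.sqrt(2.0) * math.cos(k * math.pi * x))
    return funcs

def kernel_diagonal(funcs, x):
    """sum_j |f_j(x)|^2, which equals ||g_x||^2 when M = span(funcs)."""
    return sum(f(x) ** 2 for f in funcs)

def best_constant_squared(funcs, grid=2001):
    """Approximate A^2 = sup_x sum_j |f_j(x)|^2 on a uniform grid of [0,1]."""
    return max(kernel_diagonal(funcs, i / (grid - 1)) for i in range(grid))

def inner(f, g, steps=20000):
    """Midpoint rule for the (real) L^2([0,1]) inner product."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        funcs = orthonormal_cosines(n)
        print(n, best_constant_squared(funcs))  # compare with the lower bound n
```

For this particular family the supremum is attained at $x = 0$, so the computed $A^2$ is $2n - 1$, consistent with (and in general larger than) the lower bound $n$.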
Remark 25.25. Keeping the notation in Example 25.24, let $G(x, y) := \overline{g_x(y)}$ for all $x, y \in [0, 1]$. Then
$$f(x) = e_x(f) = \int_0^1 f(y) G(x, y)\, dy \text{ for all } f \in \mathcal{M}.$$
The function $G$ is called the reproducing kernel for $\mathcal{M}$.
The above example generalizes as follows.
Proposition 25.26. Suppose that $(X, \mathcal{M}, \mu)$ is a finite measure space, $p \in [1, \infty)$ and $W$ is a closed subspace of $L^p(\mu)$ such that $W \subset L^p(\mu) \cap L^\infty(\mu)$. Then $\dim(W) < \infty$.
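Proof. Without loss of generality we may assume that $\mu(X) = 1$. As in Example 25.24, we show that $W$ is a closed subspace of $L^\infty(\mu)$ and hence by the open mapping theorem, there exists a constant $A < \infty$ such that $\|f\|_\infty \le A \|f\|_p$ for all $f \in W$. Now if $1 \le p \le 2$, then
$$\|f\|_\infty \le A \|f\|_p \le A \|f\|_2$$
and if $p \in (2, \infty)$, then $\|f\|_p^p \le \|f\|_2^2 \|f\|_\infty^{p-2}$ or equivalently,
$$\|f\|_p \le \|f\|_2^{2/p} \|f\|_\infty^{1-2/p} \le \|f\|_2^{2/p} \left(A \|f\|_p\right)^{1-2/p}.$$
Dividing by $\|f\|_p^{1-2/p}$ and raising both sides to the power $p/2$, we learn that $\|f\|_p \le A^{p/2-1} \|f\|_2$ and therefore that $\|f\|_\infty \le A \cdot A^{p/2-1} \|f\|_2 = A^{p/2} \|f\|_2$, so that in any case there exists a constant $B < \infty$ such that $\|f\|_\infty \le B \|f\|_2$. Let $\{f_n\}_{n=1}^N$ be an orthonormal subset of $W$ and $f = \sum_{n=1}^N c_n f_n$ with $c_n \in \mathbb{C}$, then
$$\left\|\sum_{n=1}^N c_n f_n\right\|_\infty^2 \le B^2 \sum_{n=1}^N |c_n|^2 = B^2 |c|^2$$
where $|c|^2 := \sum_{n=1}^N |c_n|^2$. For each $c \in \mathbb{C}^N$, there is an exceptional set $E_c$ of measure zero such that for $x \notin E_c$,
$$\left|\sum_{n=1}^N c_n f_n(x)\right|^2 \le B^2 |c|^2.$$
Let $D := (\mathbb{Q} + i\mathbb{Q})^N$ and $E = \bigcup_{c \in D} E_c$. Then $\mu(E) = 0$ and for $x \notin E$, $\left|\sum_{n=1}^N c_n f_n(x)\right|^2 \le B^2 |c|^2$ for all $c \in D$. By continuity it then follows for $x \notin E$ that
$$\left|\sum_{n=1}^N c_n f_n(x)\right|^2 \le B^2 |c|^2 \text{ for all } c \in \mathbb{C}^N.$$
Taking $c_n = \overline{f_n(x)}$ in this inequality implies that
$$\left|\sum_{n=1}^N |f_n(x)|^2\right|^2 \le B^2 \sum_{n=1}^N |f_n(x)|^2 \text{ for all } x \notin E$$
and therefore that
$$\sum_{n=1}^N |f_n(x)|^2 \le B^2 \text{ for all } x \notin E.$$
Integrating this equation over $x$ then implies that $N \le B^2$, i.e. $\dim(W) \le B^2$.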
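The interpolation step used for $p > 2$, namely $\|f\|_p^p \le \|f\|_2^2 \|f\|_\infty^{p-2}$, together with the monotonicity of $L^p$ norms on a probability space, can be checked on a toy model. The sketch below assumes a finite set with the uniform probability measure (a stand-in for $(X, \mathcal{M}, \mu)$ with $\mu(X) = 1$); the helper names are hypothetical.

```python
import random

def lp_norm(values, p):
    """L^p norm with respect to the uniform probability measure on len(values) points."""
    n = len(values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

def sup_norm(values):
    """L^infinity norm on a finite set."""
    return max(abs(v) for v in values)

def interpolation_gap(values, p):
    """Return ||f||_2^2 * ||f||_inf^(p-2) - ||f||_p^p, expected to be >= 0 for p > 2."""
    return lp_norm(values, 2) ** 2 * sup_norm(values) ** (p - 2) - lp_norm(values, p) ** p

if __name__ == "__main__":
    random.seed(1)
    f = [random.uniform(-3.0, 3.0) for _ in range(200)]
    for p in (2.5, 3.0, 4.0, 7.0):
        print(p, interpolation_gap(f, p))
```

The gap is non-negative because $|f|^p = |f|^2 |f|^{p-2} \le |f|^2 \|f\|_\infty^{p-2}$ pointwise; the code only confirms this on random data and is not a substitute for the argument.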
25.3 Uniform Boundedness Principle
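Theorem 25.27 (Uniform Boundedness Principle). Let $X$ and $Y$ be normed vector spaces, $\mathcal{A} \subset L(X, Y)$ be a collection of bounded linear operators from $X$ to $Y$,
$$F = F_{\mathcal{A}} = \{x \in X : \sup_{A \in \mathcal{A}} \|Ax\| < \infty\} \text{ and}$$
$$R = R_{\mathcal{A}} = F^c = \{x \in X : \sup_{A \in \mathcal{A}} \|Ax\| = \infty\}. \quad (25.9)$$
1. If $\sup_{A \in \mathcal{A}} \|A\| < \infty$ then $F = X$.
2. If $F$ is not meager, then $\sup_{A \in \mathcal{A}} \|A\| < \infty$.
3. If $X$ is a Banach space, $F$ is not meager iff $\sup_{A \in \mathcal{A}} \|A\| < \infty$. In particular, if $\sup_{A \in \mathcal{A}} \|Ax\| < \infty$ for all $x \in X$ then $\sup_{A \in \mathcal{A}} \|A\| < \infty$.
4. If $X$ is a Banach space, then $\sup_{A \in \mathcal{A}} \|A\| = \infty$ iff $R$ is residual. In particular if $\sup_{A \in \mathcal{A}} \|A\| = \infty$ then $\sup_{A \in \mathcal{A}} \|Ax\| = \infty$ for $x$ in a dense subset of $X$.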
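Proof. 1. If $M := \sup_{A \in \mathcal{A}} \|A\| < \infty$, then $\sup_{A \in \mathcal{A}} \|Ax\| \le M \|x\| < \infty$ for all $x \in X$ showing $F = X$.
2. For each $n \in \mathbb{N}$, let $E_n \subset X$ be the closed sets given by
$$E_n = \{x : \sup_{A \in \mathcal{A}} \|Ax\| \le n\} = \bigcap_{A \in \mathcal{A}} \{x : \|Ax\| \le n\}.$$
Then $F = \bigcup_{n=1}^\infty E_n$ which is assumed to be non-meager and hence there exists an $n \in \mathbb{N}$ such that $E_n$ has non-empty interior. Let $\overline{B_x(\delta)}$ be a closed ball such that $\overline{B_x(\delta)} \subset E_n$. Then for $y \in X$ with $\|y\| = \delta$ we know $x - y \in \overline{B_x(\delta)} \subset E_n$, so that $Ay = Ax - A(x - y)$ and hence for any $A \in \mathcal{A}$,
$$\|Ay\| \le \|Ax\| + \|A(x - y)\| \le n + n = 2n.$$
Hence it follows that $\|A\| \le 2n/\delta$ for all $A \in \mathcal{A}$, i.e. $\sup_{A \in \mathcal{A}} \|A\| \le 2n/\delta < \infty$.
3. If $X$ is a Banach space, $F = X$ is not meager by the Baire Category Theorem 16.2. So item 3. follows from items 1. and 2. and the fact that $F = X$ iff $\sup_{A \in \mathcal{A}} \|Ax\| < \infty$ for all $x \in X$.
4. Item 3. is equivalent to: $F$ is meager iff $\sup_{A \in \mathcal{A}} \|A\| = \infty$. Since $R = F^c$, $R$ is residual iff $F$ is meager, so $R$ is residual iff $\sup_{A \in \mathcal{A}} \|A\| = \infty$.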
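Remarks 25.28 Let $S \subset X$ be the unit sphere in $X$, $f_A(x) = Ax$ for $x \in S$ and $A \in \mathcal{A}$.
1. The assertion "$\sup_{A \in \mathcal{A}} \|Ax\| < \infty$ for all $x \in X$ implies $\sup_{A \in \mathcal{A}} \|A\| < \infty$" may be interpreted as follows. If $\sup_{A \in \mathcal{A}} \|f_A(x)\| < \infty$ for all $x \in S$, then $\sup_{A \in \mathcal{A}} \|f_A\|_\infty < \infty$ where $\|f_A\|_\infty := \sup_{x \in S} \|f_A(x)\| = \|A\|$.
2. If $\dim(X) < \infty$ we may give a simple proof of this assertion. Indeed if $\{e_n\}_{n=1}^N \subset S$ is a basis for $X$ there is a constant $\varepsilon > 0$ such that $\left\|\sum_{n=1}^N \lambda_n e_n\right\| \ge \varepsilon \sum_{n=1}^N |\lambda_n|$ and so the assumption $\sup_{A \in \mathcal{A}} \|f_A(x)\| < \infty$ for all $x \in S$ implies
$$\sup_{A \in \mathcal{A}} \|A\| = \sup_{A \in \mathcal{A}} \sup_{\lambda \ne 0} \frac{\left\|\sum_{n=1}^N \lambda_n A e_n\right\|}{\left\|\sum_{n=1}^N \lambda_n e_n\right\|} \le \sup_{A \in \mathcal{A}} \sup_{\lambda \ne 0} \frac{\sum_{n=1}^N |\lambda_n| \|A e_n\|}{\varepsilon \sum_{n=1}^N |\lambda_n|} \le \varepsilon^{-1} \sup_{A \in \mathcal{A}} \sup_n \|A e_n\| = \varepsilon^{-1} \sup_n \sup_{A \in \mathcal{A}} \|A e_n\| < \infty.$$
Notice that we have used the linearity of each $A \in \mathcal{A}$ in a crucial way.
3. If we drop the linearity assumption, so that $f_A \in C(S, Y)$ for all $A \in \mathcal{A}$ (now just some index set), then it is no longer true that $\sup_{A \in \mathcal{A}} \|f_A(x)\| < \infty$ for all $x \in S$ implies $\sup_{A \in \mathcal{A}} \|f_A\|_\infty < \infty$. The reader is invited to construct a counterexample when $X = \mathbb{R}^2$ and $Y = \mathbb{R}$ by finding a sequence $\{f_n\}_{n=1}^\infty$ of continuous functions on $S^1$ such that $\lim_{n\to\infty} f_n(x) = 0$ for all $x \in S^1$ while $\lim_{n\to\infty} \|f_n\|_{C(S^1)} = \infty$.
4. The assumption that $X$ is a Banach space in item 3. of Theorem 25.27 can not be dropped. For example, let $X \subset C([0, 1])$ be the polynomial functions on $[0, 1]$ equipped with the uniform norm $\|\cdot\|_\infty$ and for $t \in (0, 1]$, let $f_t(x) := (x(t) - x(0))/t$ for all $x \in X$. Then $\lim_{t \to 0} f_t(x) = \frac{d}{dt}\big|_0 x(t)$ and therefore $\sup_{t \in (0,1]} |f_t(x)| < \infty$ for all $x \in X$. If the conclusion of Theorem 25.27 (item 3.) were true we would have $M := \sup_{t \in (0,1]} \|f_t\| < \infty$. This would then imply
$$\left|\frac{x(t) - x(0)}{t}\right| \le M \|x\|_\infty \text{ for all } x \in X \text{ and } t \in (0, 1].$$
Letting $t \downarrow 0$ in this equation gives $|\dot x(0)| \le M \|x\|_\infty$ for all $x \in X$. But taking $x(t) = (1 - t)^n$, for which $\dot x(0) = -n$ and $\|x\|_\infty = 1$, in this inequality shows $M \ge n$ for every $n$, i.e. $M = \infty$.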
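Example 25.29. Suppose that $\{c_n\}_{n=1}^\infty \subset \mathbb{C}$ is a sequence of numbers such that
$$\lim_{N \to \infty} \sum_{n=1}^N a_n c_n \text{ exists in } \mathbb{C} \text{ for all } a \in \ell^1.$$
Then $c \in \ell^\infty$.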
Proof. Let $f_N \in (\ell^1)^*$ be given by $f_N(a) = \sum_{n=1}^N a_n c_n$ and set $M_N := \max\{|c_n| : n = 1, \dots, N\}$. Then
$$|f_N(a)| \le M_N \|a\|_{\ell^1}$$
and by taking $a = e_k$ with $k$ such that $M_N = |c_k|$, we learn that $\|f_N\| = M_N$. Now by assumption, $\lim_{N \to \infty} f_N(a)$ exists for all $a \in \ell^1$ and in particular,
$$\sup_N |f_N(a)| < \infty \text{ for all } a \in \ell^1.$$
So by the uniform boundedness principle, Theorem 25.27,
$$\infty > \sup_N \|f_N\| = \sup_N M_N = \sup\{|c_n| : n = 1, 2, 3, \dots\}.$$
25.3.1 Applications to Fourier Series
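Let $T = S^1$ be the unit circle in $\mathbb{C}$, $\varphi_n(z) := z^n$ for all $n \in \mathbb{Z}$, and $m$ denote the normalized arc length measure on $T$, i.e. if $f : T \to [0, \infty)$ is measurable, then
$$\int_T f(w)\, dw := \int_T f\, dm := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta})\, d\theta.$$
From Section 23.3, we know $\{\varphi_n\}_{n \in \mathbb{Z}}$ is an orthonormal basis for $L^2(T)$. For $n \in \mathbb{N}$ and $z \in T$, let
$$s_n(f, z) := \sum_{k=-n}^n \langle f | \varphi_k \rangle \varphi_k(z) = \int_T f(w)\, d_n(z \bar w)\, dw$$
where
$$d_n(e^{i\theta}) := \sum_{k=-n}^n e^{ik\theta} = \frac{\sin\left((n + \frac{1}{2})\theta\right)}{\sin\left(\frac{1}{2}\theta\right)},$$
see Eqs. (23.8) and (23.9). By Theorem 23.10, for all $f \in L^2(T)$ we know
$$f = L^2(T)\text{-}\lim_{n \to \infty} s_n(f, \cdot).$$
On the other hand the next proposition shows that if we fix $z \in T$, then $\lim_{n \to \infty} s_n(f, z)$ does not even exist for the "typical" $f \in C(T) \subset L^2(T)$.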
Proposition 25.30 (Lack of pointwise convergence). For each $z \in T$, there exists a residual set $R_z \subset C(T)$ such that $\sup_n |s_n(f, z)| = \infty$ for all $f \in R_z$. Recall that $C(T)$ is a complete metric space, hence $R_z$ is a dense subset of $C(T)$.
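Proof. By symmetry considerations, it suffices to assume $z = 1 \in T$. Let $\Lambda_n : C(T) \to \mathbb{C}$ be given by
$$\Lambda_n f := s_n(f, 1) = \int_T f(w)\, d_n(\bar w)\, dw.$$
An application of Corollary 32.68 below shows,
$$\|\Lambda_n\| = \|d_n\|_1 = \int_T |d_n(\bar w)|\, dw = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left|d_n(e^{-i\theta})\right| d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left|\frac{\sin\left((n + \frac{1}{2})\theta\right)}{\sin\left(\frac{1}{2}\theta\right)}\right| d\theta. \quad (25.10)$$
Of course we may prove this directly as follows. Since
$$|\Lambda_n f| = \left|\int_T f(w)\, d_n(\bar w)\, dw\right| \le \int_T |f(w)\, d_n(\bar w)|\, dw \le \|f\|_\infty \int_T |d_n(\bar w)|\, dw,$$
we learn $\|\Lambda_n\| \le \int_T |d_n(\bar w)|\, dw$. For all $\varepsilon > 0$, let
$$f_\varepsilon(w) := \frac{d_n(\bar w)}{\sqrt{d_n^2(\bar w) + \varepsilon}}.$$
Then $\|f_\varepsilon\|_{C(T)} \le 1$ and hence
$$\|\Lambda_n\| \ge \lim_{\varepsilon \downarrow 0} |\Lambda_n f_\varepsilon| = \lim_{\varepsilon \downarrow 0} \int_T \frac{d_n^2(\bar w)}{\sqrt{d_n^2(\bar w) + \varepsilon}}\, dw = \int_T |d_n(\bar w)|\, dw$$
and the verification of Eq. (25.10) is complete.
Using
$$|\sin x| = \left|\int_0^x \cos y\, dy\right| \le \left|\int_0^x |\cos y|\, dy\right| \le |x|$$
in Eq. (25.10) implies that
$$\|\Lambda_n\| \ge \frac{1}{2\pi} \int_{-\pi}^{\pi} \left|\frac{\sin\left((n + \frac{1}{2})\theta\right)}{\frac{1}{2}\theta}\right| d\theta = \frac{2}{\pi} \int_0^{\pi} \left|\sin\left((n + \tfrac{1}{2})\theta\right)\right| \frac{d\theta}{\theta} = \frac{2}{\pi} \int_0^{(n + \frac{1}{2})\pi} |\sin y|\, \frac{dy}{y} \to \infty \text{ as } n \to \infty \quad (25.11)$$
and hence $\sup_n \|\Lambda_n\| = \infty$. So by Theorem 25.27,
$$R_1 = \{f \in C(T) : \sup_n |\Lambda_n f| = \infty\}$$
is a residual set.
See Rudin Chapter 5 for more details.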
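The divergence of the operator norms in Eq. (25.10) can be observed numerically. The following is a rough sketch, not part of the argument: it approximates $\frac{1}{2\pi}\int_{-\pi}^{\pi} |d_n(e^{i\theta})|\, d\theta$ with a midpoint rule, using the cosine-sum form of the Dirichlet kernel to avoid the removable singularity at $\theta = 0$. The function names and the step count are arbitrary choices.

```python
import math

def dirichlet_kernel(n, theta):
    """d_n(e^{i theta}) = sum_{k=-n}^{n} e^{ik theta}, written as a real cosine sum."""
    return 1.0 + 2.0 * sum(math.cos(k * theta) for k in range(1, n + 1))

def lebesgue_constant(n, steps=20000):
    """Midpoint-rule approximation of (1/2pi) * int_{-pi}^{pi} |d_n(e^{i theta})| d theta."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = -math.pi + (j + 0.5) * h
        total += abs(dirichlet_kernel(n, theta))
    return total * h / (2.0 * math.pi)

if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        # The second column is the logarithmic comparison scale (4/pi^2) * log(n).
        print(n, lebesgue_constant(n), (4.0 / math.pi ** 2) * math.log(n))
```

The computed values increase slowly with $n$, in line with the logarithmic growth suggested by the integral $\int_0^{(n + 1/2)\pi} |\sin y|\, dy / y$ in Eq. (25.11).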
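Lemma 25.31. For $f \in L^1(T)$, let
$$\hat f(n) := \langle f, \varphi_n \rangle = \int_T f(w) \bar w^n\, dw.$$
Then $\hat f \in c_0 := C_0(\mathbb{Z})$ (i.e. $\lim_{n \to \infty} \hat f(n) = 0$) and the map $f \in L^1(T) \to \hat f \in c_0$ is a one to one bounded linear transformation into but not onto $c_0$.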
Proof. By Bessel's inequality, $\sum_{n \in \mathbb{Z}} |\hat f(n)|^2 < \infty$ for all $f \in L^2(T)$ and in particular $\lim_{|n| \to \infty} |\hat f(n)| = 0$. Given $f \in L^1(T)$ and $g \in L^2(T)$ we have
$$\left|\hat f(n) - \hat g(n)\right| = \left|\int_T [f(w) - g(w)] \bar w^n\, dw\right| \le \|f - g\|_1$$
and hence
$$\limsup_{n \to \infty} |\hat f(n)| = \limsup_{n \to \infty} |\hat f(n) - \hat g(n)| \le \|f - g\|_1$$
for all $g \in L^2(T)$. Since $L^2(T)$ is dense in $L^1(T)$, it follows that $\limsup_{n \to \infty} |\hat f(n)| = 0$ for all $f \in L^1$, i.e. $\hat f \in c_0$. Since $|\hat f(n)| \le \|f\|_1$, we have $\|\hat f\|_{c_0} \le \|f\|_1$ showing that $\Lambda f := \hat f$ is a bounded linear transformation from $L^1(T)$ to $c_0$. To see that $\Lambda$ is injective, suppose $\hat f = \Lambda f \equiv 0$; then $\int_T f(w) p(w, \bar w)\, dw = 0$ for all polynomials $p$ in $w$ and $\bar w$. By the Stone-Weierstrass and the dominated convergence theorem, this implies that
$$\int_T f(w) g(w)\, dw = 0$$
for all $g \in C(T)$. Lemma 22.11 now implies $f = 0$ a.e. If $\Lambda$ were surjective, the open mapping theorem would imply that $\Lambda^{-1} : c_0 \to L^1(T)$ is bounded. In particular this implies there exists $C < \infty$ such that
$$\|f\|_{L^1} \le C \|\hat f\|_{c_0} \text{ for all } f \in L^1(T). \quad (25.12)$$
Taking $f = d_n$, we find (because $\hat d_n(k) = 1_{|k| \le n}$) that $\|\hat d_n\|_{c_0} = 1$ while (by Eq. (25.11)) $\lim_{n \to \infty} \|d_n\|_{L^1} = \infty$, contradicting Eq. (25.12). Therefore $\mathrm{Ran}(\Lambda) \ne c_0$.
25.4 Exercises
25.4.1 More Examples of Banach Spaces
Exercise 25.12. Let $(X, \mathcal{M})$ be a measurable space and $M(X)$ denote the space of complex measures on $(X, \mathcal{M})$ and for $\mu \in M(X)$ let $\|\mu\| := |\mu|(X)$. Show $(M(X), \|\cdot\|)$ is a Banach space. (Move to Section 30.)
Exercise 25.13. Folland 5.9, p. 155. (Drop this problem, or move to Chapter
9.)
Exercise 25.14. Folland 5.10, p. 155. (Drop this problem, or move later
where it can be done.)
Exercise 25.15. Folland 5.11, p. 155. (Drop this problem, or move to Chapter
9.)
25.4.2 Hahn-Banach Theorem Problems
Exercise 25.16. Let $X$ be a normed vector space. Show a linear functional, $f : X \to \mathbb{C}$, is bounded iff $M := f^{-1}(\{0\})$ is closed. Hint: if $M$ is closed yet $f$ is not continuous, consider $y_n := x_0 - x_n / f(x_n)$ where $x_0 \in X$ such that $f(x_0) = 1$ and $x_n \in X$ such that $\|x_n\| = 1$ and $\lim_{n \to \infty} |f(x_n)| = \infty$.
Exercise 25.17. Let $M$ be a closed subspace of a normed space, $X$, and $x \in X \setminus M$. Show $M \oplus \mathbb{C}x$ is closed. Hint: make use of a $\lambda \in X^*$ which you should construct so that $\lambda(M) = 0$ while $\lambda(x) \ne 0$.
Exercise 25.18. (Uses quotient spaces.) Let $X$ be an infinite dimensional normed vector space. Show:
1. There exists a sequence $\{x_n\}_{n=1}^\infty \subset X$ such that $\|x_n\| = 1$ for all $n$ and $\|x_m - x_n\| \ge \frac{1}{2}$ for all $m \ne n$.
2. Show $X$ is not locally compact.
25.4.3 Open Mapping and Closed Operator Problems
Exercise 25.19. Let $X = \ell^1(\mathbb{N})$,
$$Y = \left\{f \in X : \sum_{n=1}^\infty n |f(n)| < \infty\right\}$$
with $Y$ being equipped with the $\ell^1(\mathbb{N})$-norm, and $T : Y \to X$ be defined by $(Tf)(n) = n f(n)$. Show:
1. $Y$ is a proper dense subspace of $X$ and in particular $Y$ is not complete.
2. $T : Y \to X$ is a closed operator which is not bounded.
3. $T : Y \to X$ is algebraically invertible, $S := T^{-1} : X \to Y$ is bounded and surjective but not open.
Exercise 25.20. Let $X = C([0, 1])$ and $Y = C^1([0, 1]) \subset X$ with both $X$ and $Y$ being equipped with the uniform norm. Let $T : Y \to X$ be the linear map, $Tf = f'$. Here $C^1([0, 1])$ denotes those functions, $f \in C^1((0, 1)) \cap C([0, 1])$ such that
$$f'(1) := \lim_{x \uparrow 1} f'(x) \text{ and } f'(0) := \lim_{x \downarrow 0} f'(x)$$
exist. Show:
1. $Y$ is a proper dense subspace of $X$ and in particular $Y$ is not complete.
2. $T : Y \to X$ is a closed operator which is not bounded.
Exercise 25.22. Let $X$ be a vector space equipped with two norms, $\|\cdot\|_1$ and $\|\cdot\|_2$ such that $\|\cdot\|_1 \le \|\cdot\|_2$ and $X$ is complete relative to both norms. Show there is a constant $C < \infty$ such that $\|\cdot\|_2 \le C \|\cdot\|_1$.
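Exercise 25.23. Show that it is impossible to find a sequence, $\{a_n\}_{n \in \mathbb{N}} \subset (0, \infty)$, with the following property: if $\{\lambda_n\}_{n \in \mathbb{N}}$ is a sequence in $\mathbb{C}$, then $\sum_{n=1}^\infty |\lambda_n| < \infty$ iff $\sup_n a_n^{-1} |\lambda_n| < \infty$. (Poetically speaking, there is no "slowest rate" of decay for the summands of absolutely convergent series.)
Outline: For sake of contradiction suppose such a "magic" sequence $\{a_n\}_{n \in \mathbb{N}} \subset (0, \infty)$ were to exist.
1. For $f \in \ell^\infty(\mathbb{N})$, let $(Tf)(n) := a_n f(n)$ for $n \in \mathbb{N}$. Verify that $Tf \in \ell^1(\mathbb{N})$ and $T : \ell^\infty(\mathbb{N}) \to \ell^1(\mathbb{N})$ is a bounded linear operator.
2. Show $T : \ell^\infty(\mathbb{N}) \to \ell^1(\mathbb{N})$ must be an invertible operator and that $T^{-1} : \ell^1(\mathbb{N}) \to \ell^\infty(\mathbb{N})$ is necessarily bounded, i.e. $T : \ell^\infty(\mathbb{N}) \to \ell^1(\mathbb{N})$ is a homeomorphism.
3. Arrive at a contradiction by showing either that $T^{-1}$ is not bounded or by using the fact that $D$, the set of finitely supported sequences, is dense in $\ell^1(\mathbb{N})$ but not in $\ell^\infty(\mathbb{N})$.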
Exercise 25.24. Folland 5.34, p. 164. (Not a very good problem, delete.)
Exercise 25.25. Folland 5.35, p. 164. (A quotient space exercise.)
Exercise 25.26. Folland 5.36, p. 164. (A quotient space exercise.)
Exercise 25.27. Suppose $T : X \to Y$ is a linear map between two Banach spaces such that $f \circ T \in X^*$ for all $f \in Y^*$. Show $T$ is bounded.
Exercise 25.28. Suppose $T_n : X \to Y$ for $n \in \mathbb{N}$ is a sequence of bounded linear operators between two Banach spaces such that $\lim_{n \to \infty} T_n x$ exists for all $x \in X$. Show $Tx := \lim_{n \to \infty} T_n x$ defines a bounded linear operator from $X$ to $Y$.
Exercise 25.29. Let $X$, $Y$ and $Z$ be Banach spaces and $B : X \times Y \to Z$ be a bilinear map such that $B(x, \cdot) \in L(Y, Z)$ and $B(\cdot, y) \in L(X, Z)$ for all $x \in X$ and $y \in Y$. Show there is a constant $M < \infty$ such that
$$\|B(x, y)\| \le M \|x\| \|y\| \text{ for all } (x, y) \in X \times Y$$
and conclude from this that $B : X \times Y \to Z$ is continuous.
Exercise 25.30. Folland 5.40, p. 165. (Condensation of singularities).
Exercise 25.31. Folland 5.41, p. 165. (Drop this exercise, it is 16.2.)
25.4.4 Weak Topology and Convergence Problems
Definition 25.32. A sequence $\{x_n\}_{n=1}^\infty \subset X$ is weakly Cauchy if for all $V \in \tau_w$ such that $0 \in V$, $x_n - x_m \in V$ for all $m, n$ sufficiently large. Similarly a sequence $\{f_n\}_{n=1}^\infty \subset X^*$ is weak-$*$ Cauchy if for all $V \in \tau_{w^*}$ such that $0 \in V$, $f_n - f_m \in V$ for all $m, n$ sufficiently large.
Remark 25.33. These conditions are equivalent to $\{f(x_n)\}_{n=1}^\infty$ being Cauchy for all $f \in X^*$ and $\{f_n(x)\}_{n=1}^\infty$ being Cauchy for all $x \in X$ respectively.
Exercise 25.32. Let $X$ and $Y$ be Banach spaces. Show:
1. Every weakly Cauchy sequence in $X$ is bounded.
2. Every weak-$*$ Cauchy sequence in $X^*$ is bounded.
3. If $\{T_n\}_{n=1}^\infty \subset L(X, Y)$ converges weakly (or strongly) then $\sup_n \|T_n\|_{L(X,Y)} < \infty$.
Exercise 25.33. Let $X$ be a Banach space, $C := \{x \in X : \|x\| \le 1\}$ and $C^* := \{\lambda \in X^* : \|\lambda\|_{X^*} \le 1\}$ be the closed unit balls in $X$ and $X^*$ respectively.
1. Show $C$ is weakly closed and $C^*$ is weak-$*$ closed in $X$ and $X^*$ respectively.
2. If $E \subset X$ is a norm-bounded set, then the weak closure, $\bar E^w \subset X$, is also norm bounded.
3. If $F \subset X^*$ is a norm-bounded set, then the weak-$*$ closure, $\bar F^{w^*} \subset X^*$, is also norm bounded.
4. Every weak-$*$ Cauchy sequence $\{f_n\} \subset X^*$ is weak-$*$ convergent to some $f \in X^*$.
Exercise 25.35. If $X$ is a separable normed linear space, the weak-$*$ topology on the closed unit ball in $X^*$ is second countable and hence metrizable. (See Theorem 14.38.)
Exercise 25.36. Let $X$ be a Banach space. Show every weakly compact subset of $X$ is norm bounded and every weak-$*$ compact subset of $X^*$ is norm bounded.
Exercise 25.37. A vector subspace of a normed space $X$ is norm-closed iff it is weakly closed. (If $X$ is not reflexive, it is not necessarily true that a norm closed subspace of $X^*$ need be weak-$*$ closed, see Exercise 25.39.) (Hint: this problem only uses the Hahn-Banach Theorem.)
Exercise 25.38. Let $X$ be a Banach space, $\{T_n\}_{n=1}^\infty$ and $\{S_n\}_{n=1}^\infty$ be two sequences of bounded operators on $X$ such that $T_n \to T$ and $S_n \to S$ strongly, and suppose $\{x_n\}_{n=1}^\infty \subset X$ such that $\lim_{n \to \infty} \|x_n - x\| = 0$. Show:
1. $\lim_{n \to \infty} \|T_n x_n - Tx\| = 0$ and that
2. $T_n S_n \to TS$ strongly as $n \to \infty$.
26
Weak and Strong Derivatives
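For this section, let $\Omega$ be an open subset of $\mathbb{R}^d$, $p, q, r \in [1, \infty]$, $L^p(\Omega) = L^p(\Omega, \mathcal{B}_\Omega, m)$ and $L^p_{loc}(\Omega) = L^p_{loc}(\Omega, \mathcal{B}_\Omega, m)$, where $m$ is Lebesgue measure on $\mathcal{B}_{\mathbb{R}^d}$ and $\mathcal{B}_\Omega$ is the Borel $\sigma$-algebra on $\Omega$. If $\Omega = \mathbb{R}^d$, we will simply write $L^p$ and $L^p_{loc}$ for $L^p(\mathbb{R}^d)$ and $L^p_{loc}(\mathbb{R}^d)$ respectively. Also let
$$\langle f, g \rangle := \int_\Omega f g\, dm$$
for any pair of measurable functions $f, g : \Omega \to \mathbb{C}$ such that $fg \in L^1(\Omega)$. For example, by Hölder's inequality, $\langle f, g \rangle$ is defined for $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$ when $q = \frac{p}{p-1}$.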
Definition 26.1. A sequence $\{u_n\}_{n=1}^\infty \subset L^p_{loc}(\Omega)$ is said to converge to $u \in L^p_{loc}(\Omega)$ if $\lim_{n \to \infty} \|u - u_n\|_{L^p(K)} = 0$ for all compact subsets $K \subset \Omega$.
The following simple but useful remark will be used (typically without
further comment) in the sequel.
Remark 26.2. Suppose $r, p, q \in [1, \infty]$ are such that $r^{-1} = p^{-1} + q^{-1}$ and $f_t \to f$ in $L^p(\Omega)$ and $g_t \to g$ in $L^q(\Omega)$ as $t \to 0$, then $f_t g_t \to fg$ in $L^r(\Omega)$. Indeed,
$$\|f_t g_t - fg\|_r = \|(f_t - f) g_t + f (g_t - g)\|_r \le \|f_t - f\|_p \|g_t\|_q + \|f\|_p \|g_t - g\|_q \to 0 \text{ as } t \to 0.$$
26.1 Basic Definitions and Properties
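Definition 26.3 (Weak Differentiability). Let $v \in \mathbb{R}^d$ and $u \in L^p(\Omega)$ ($u \in L^p_{loc}(\Omega)$); then $\partial_v u$ is said to exist weakly in $L^p(\Omega)$ ($L^p_{loc}(\Omega)$) if there exists a function $g \in L^p(\Omega)$ ($g \in L^p_{loc}(\Omega)$) such that
$$\langle u, \partial_v \varphi \rangle = -\langle g, \varphi \rangle \text{ for all } \varphi \in C_c^\infty(\Omega). \quad (26.1)$$
The function $g$, if it exists, will be denoted by $\partial_v^{(w)} u$. Similarly if $\alpha \in \mathbb{N}_0^d$ and $\partial^\alpha$ is as in Notation 22.21, we say $\partial^\alpha u$ exists weakly in $L^p(\Omega)$ ($L^p_{loc}(\Omega)$) iff there exists $g \in L^p(\Omega)$ ($L^p_{loc}(\Omega)$) such that
$$\langle u, \partial^\alpha \varphi \rangle = (-1)^{|\alpha|} \langle g, \varphi \rangle \text{ for all } \varphi \in C_c^\infty(\Omega).$$
More generally if $p(\xi) = \sum_{|\alpha| \le N} a_\alpha \xi^\alpha$ is a polynomial in $\xi \in \mathbb{R}^d$, then $p(\partial) u$ exists weakly in $L^p(\Omega)$ ($L^p_{loc}(\Omega)$) iff there exists $g \in L^p(\Omega)$ ($L^p_{loc}(\Omega)$) such that
$$\langle u, p(-\partial) \varphi \rangle = \langle g, \varphi \rangle \text{ for all } \varphi \in C_c^\infty(\Omega) \quad (26.2)$$
and we denote $g$ by $\text{w-}p(\partial) u$.
By Corollary 22.38, there is at most one $g \in L^1_{loc}(\Omega)$ such that Eq. (26.2) holds, so $\text{w-}p(\partial) u$ is well defined.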
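Lemma 26.4. Let $p(\xi)$ be a polynomial on $\mathbb{R}^d$, $k = \deg(p) \in \mathbb{N}$, and $u \in L^1_{loc}(\Omega)$ such that $p(\partial) u$ exists weakly in $L^1_{loc}(\Omega)$. Then
1. $\mathrm{supp}_m(\text{w-}p(\partial) u) \subset \mathrm{supp}_m(u)$, where $\mathrm{supp}_m(u)$ is the essential support of $u$ relative to Lebesgue measure, see Definition 22.25.
2. If $\deg p = k$ and $u|_U \in C^k(U, \mathbb{C})$ for some open set $U \subset \Omega$, then $\text{w-}p(\partial) u = p(\partial) u$ a.e. on $U$.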
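Proof.
1. Since
$$\langle \text{w-}p(\partial) u, \varphi \rangle = \langle u, p(-\partial) \varphi \rangle = 0 \text{ for all } \varphi \in C_c^\infty(\Omega \setminus \mathrm{supp}_m(u)),$$
an application of Corollary 22.38 shows $\text{w-}p(\partial) u = 0$ a.e. on $\Omega \setminus \mathrm{supp}_m(u)$. So by Lemma 22.26, $\Omega \setminus \mathrm{supp}_m(u) \subset \Omega \setminus \mathrm{supp}_m(\text{w-}p(\partial) u)$, i.e. $\mathrm{supp}_m(\text{w-}p(\partial) u) \subset \mathrm{supp}_m(u)$.
2. Suppose that $u|_U$ is $C^k$ and let $\psi \in C_c^\infty(U)$. (We view $\psi$ as a function in $C_c^\infty(\mathbb{R}^d)$ by setting $\psi \equiv 0$ on $\mathbb{R}^d \setminus U$.) By Corollary 22.35, there exists $\gamma \in C_c^\infty(U)$ such that $0 \le \gamma \le 1$ and $\gamma = 1$ in a neighborhood of $\mathrm{supp}(\psi)$. Then by setting $\gamma u = 0$ on $\mathbb{R}^d \setminus \mathrm{supp}(\gamma)$ we may view $\gamma u \in C_c^k(\mathbb{R}^d)$ and so by standard integration by parts (see Lemma 22.36) and the ordinary product rule,
$$\langle \text{w-}p(\partial) u, \psi \rangle = \langle u, p(-\partial) \psi \rangle = \langle \gamma u, p(-\partial) \psi \rangle = \langle p(\partial)(\gamma u), \psi \rangle = \langle p(\partial) u, \psi \rangle \quad (26.3)$$
wherein the last equality we have used that $\gamma$ is constant on a neighborhood of $\mathrm{supp}(\psi)$. Since Eq. (26.3) is true for all $\psi \in C_c^\infty(U)$, an application of Corollary 22.38 with $h = \text{w-}p(\partial) u - p(\partial) u$ and $\mu = m$ shows $\text{w-}p(\partial) u = p(\partial) u$ a.e. on $U$.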
Notation 26.5 In light of Lemma 26.4 there is no danger in simply writing $p(\partial) u$ for $\text{w-}p(\partial) u$. So in the sequel we will always interpret $p(\partial) u$ in the weak or "distributional" sense.
Example 26.6. Suppose $u(x) = |x|$ for $x \in \mathbb{R}$, then $\partial u(x) = \mathrm{sgn}(x)$ in $L^1_{loc}(\mathbb{R})$ while $\partial^2 u(x) = 2\delta(x)$, so $\partial^2 u(x)$ does not exist weakly in $L^1_{loc}(\mathbb{R})$.
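The two claims about $u(x) = |x|$ can be tested against a single smooth test function. The sketch below is an illustration under assumptions of its own: the test function is a standard bump centered at $0.3$ (an arbitrary choice, shifted so that it is not symmetric about the kink), and integrals are approximated by a midpoint rule whose cells are aligned with the kink at $0$. It checks $\langle u, \varphi' \rangle = -\langle \mathrm{sgn}, \varphi \rangle$ and $\langle \mathrm{sgn}, \varphi' \rangle = -2\varphi(0)$, the latter being the statement $\partial^2 u = 2\delta$ tested against $\varphi$.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported in (center - 1, center + 1)."""
    d = 1.0 - (x - center) ** 2
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def bump_prime(x, center=0.3):
    """Derivative of bump, obtained from the chain rule."""
    s = x - center
    d = 1.0 - s * s
    return bump(x, center) * (-2.0 * s) / (d * d) if d > 0.0 else 0.0

def sgn(x):
    """Sign function with sgn(0) = 0."""
    return (x > 0) - (x < 0)

def integrate(f, a=-2.0, b=2.0, steps=40000):
    """Midpoint rule on [a, b]; with these defaults, 0 is a cell boundary."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    lhs = integrate(lambda x: abs(x) * bump_prime(x))   # <u, phi'>
    rhs = -integrate(lambda x: sgn(x) * bump(x))        # -<sgn, phi>
    print(lhs, rhs)
    second = integrate(lambda x: sgn(x) * bump_prime(x))  # <sgn, phi'>
    print(second, -2.0 * bump(0.0))
```

Since $\varphi \mapsto 2\varphi(0)$ is not given by integration against any $L^1_{loc}$ function, the second identity is consistent with the non-existence of $\partial^2 u$ as a weak derivative in $L^1_{loc}(\mathbb{R})$.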
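Example 26.7. Suppose $d = 2$ and $u(x, y) = 1_{y > x}$. Then $u \in L^1_{loc}(\mathbb{R}^2)$, while $\partial_x 1_{y > x} = -\delta(y - x)$ and $\partial_y 1_{y > x} = \delta(y - x)$, so that neither $\partial_x u$ nor $\partial_y u$ exists weakly. On the other hand $(\partial_x + \partial_y) u = 0$ weakly. To prove these assertions, notice $u \in C^\infty(\mathbb{R}^2 \setminus \Delta)$ where $\Delta = \{(x, x) : x \in \mathbb{R}\}$. So by Lemma 26.4, for any polynomial $p(\xi)$ without constant term, if $p(\partial) u$ exists weakly then $p(\partial) u = 0$. However,
$$\langle u, -\partial_x \varphi \rangle = -\int_{y > x} \varphi_x(x, y)\, dx\, dy = -\int_{\mathbb{R}} \varphi(y, y)\, dy,$$
$$\langle u, -\partial_y \varphi \rangle = -\int_{y > x} \varphi_y(x, y)\, dx\, dy = \int_{\mathbb{R}} \varphi(x, x)\, dx \text{ and}$$
$$\langle u, -(\partial_x + \partial_y) \varphi \rangle = 0,$$
from which it follows that $\partial_x u$ and $\partial_y u$ can not be zero while $(\partial_x + \partial_y) u = 0$.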
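On the other hand if $p(\xi)$ and $q(\xi)$ are two polynomials and $u \in L^1_{loc}(\Omega)$ is a function such that $p(\partial) u$ exists weakly in $L^1_{loc}(\Omega)$ and $q(\partial)[p(\partial) u]$ exists weakly in $L^1_{loc}(\Omega)$ then $(qp)(\partial) u$ exists weakly in $L^1_{loc}(\Omega)$. This is because
$$\langle u, (qp)(-\partial) \varphi \rangle = \langle u, p(-\partial) q(-\partial) \varphi \rangle = \langle p(\partial) u, q(-\partial) \varphi \rangle = \langle q(\partial) p(\partial) u, \varphi \rangle \text{ for all } \varphi \in C_c^\infty(\Omega).$$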
Example 26.8. Let $u(x, y) = 1_{x > 0} + 1_{y > 0}$ in $L^1_{loc}(\mathbb{R}^2)$. Then $\partial_x u(x, y) = \delta(x)$ and $\partial_y u(x, y) = \delta(y)$, so $\partial_x u(x, y)$ and $\partial_y u(x, y)$ do not exist weakly in $L^1_{loc}(\mathbb{R}^2)$. However $\partial_y \partial_x u$ does exist weakly and is the zero function. This shows $\partial_y \partial_x u$ may exist weakly despite the fact that both $\partial_x u$ and $\partial_y u$ do not exist weakly in $L^1_{loc}(\mathbb{R}^2)$.
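Lemma 26.9. Suppose $u \in L^1_{loc}(\Omega)$ and $p(\xi)$ is a polynomial of degree $k$ such that $p(\partial) u$ exists weakly in $L^1_{loc}(\Omega)$; then
$$\langle p(\partial) u, \varphi \rangle = \langle u, p(-\partial) \varphi \rangle \text{ for all } \varphi \in C_c^k(\Omega). \quad (26.4)$$
Note: The point here is that Eq. (26.4) holds for all $\varphi \in C_c^k(\Omega)$, not just $\varphi \in C_c^\infty(\Omega)$.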
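Proof. Let $\varphi \in C_c^k(\Omega)$ and choose $\eta \in C_c^\infty(B(0, 1))$ such that $\int_{\mathbb{R}^d} \eta(x)\, dx = 1$ and let $\eta_\varepsilon(x) := \varepsilon^{-d} \eta(x/\varepsilon)$. Then $\eta_\varepsilon * \varphi \in C_c^\infty(\Omega)$ for $\varepsilon$ sufficiently small and $p(-\partial)[\eta_\varepsilon * \varphi] = \eta_\varepsilon * p(-\partial) \varphi \to p(-\partial) \varphi$ and $\eta_\varepsilon * \varphi \to \varphi$ uniformly on compact sets as $\varepsilon \downarrow 0$. Therefore by the dominated convergence theorem,
$$\langle p(\partial) u, \varphi \rangle = \lim_{\varepsilon \downarrow 0} \langle p(\partial) u, \eta_\varepsilon * \varphi \rangle = \lim_{\varepsilon \downarrow 0} \langle u, p(-\partial)(\eta_\varepsilon * \varphi) \rangle = \langle u, p(-\partial) \varphi \rangle.$$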
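Lemma 26.10 (Product Rule). Let $u \in L^1_{loc}(\Omega)$, $v \in \mathbb{R}^d$ and $\varphi \in C^1(\Omega)$. If $\partial_v^{(w)} u$ exists in $L^1_{loc}(\Omega)$, then $\partial_v^{(w)}(\varphi u)$ exists in $L^1_{loc}(\Omega)$ and
$$\partial_v^{(w)}(\varphi u) = \partial_v \varphi \cdot u + \varphi\, \partial_v^{(w)} u \text{ a.e.}$$
Moreover if $\varphi \in C_c^1(\Omega)$ and $F := \varphi u \in L^1$ (here we define $F$ on $\mathbb{R}^d$ by setting $F = 0$ on $\mathbb{R}^d \setminus \Omega$), then $\partial_v^{(w)} F = \partial_v \varphi \cdot u + \varphi\, \partial_v^{(w)} u$ exists weakly in $L^1(\mathbb{R}^d)$.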
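Proof. Let $\psi \in C_c^\infty(\Omega)$, then using Lemma 26.9,
$$-\langle \varphi u, \partial_v \psi \rangle = -\langle u, \varphi\, \partial_v \psi \rangle = -\langle u, \partial_v(\varphi \psi) - \partial_v \varphi \cdot \psi \rangle = \langle \partial_v^{(w)} u, \varphi \psi \rangle + \langle \partial_v \varphi \cdot u, \psi \rangle = \langle \varphi\, \partial_v^{(w)} u, \psi \rangle + \langle \partial_v \varphi \cdot u, \psi \rangle.$$
This proves the first assertion. To prove the second assertion let $\gamma \in C_c^\infty(\Omega)$ such that $0 \le \gamma \le 1$ and $\gamma = 1$ on a neighborhood of $\mathrm{supp}(\varphi)$. So for $\psi \in C_c^\infty(\mathbb{R}^d)$, using $\partial_v \gamma = 0$ on $\mathrm{supp}(\varphi)$ and $\gamma \psi \in C_c^\infty(\Omega)$, we find
$$\langle F, \partial_v \psi \rangle = \langle \gamma F, \partial_v \psi \rangle = \langle F, \gamma\, \partial_v \psi \rangle = \langle \varphi u, \partial_v(\gamma \psi) - \partial_v \gamma \cdot \psi \rangle = \langle \varphi u, \partial_v(\gamma \psi) \rangle = -\langle \partial_v^{(w)}(\varphi u), \gamma \psi \rangle$$
$$= -\langle \partial_v \varphi \cdot u + \varphi\, \partial_v^{(w)} u, \gamma \psi \rangle = -\langle \partial_v \varphi \cdot u + \varphi\, \partial_v^{(w)} u, \psi \rangle.$$
This shows $\partial_v^{(w)} F = \partial_v \varphi \cdot u + \varphi\, \partial_v^{(w)} u$ as desired.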
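Lemma 26.11. Suppose $q \in [1, \infty)$, $p(\xi)$ is a polynomial in $\xi \in \mathbb{R}^d$ and $u \in L^q_{loc}(\Omega)$. If there exists $\{u_m\}_{m=1}^\infty \subset L^q_{loc}(\Omega)$ such that $p(\partial) u_m$ exists in $L^q_{loc}(\Omega)$ for all $m$ and there exists $g \in L^q_{loc}(\Omega)$ such that for all $\varphi \in C_c^\infty(\Omega)$,
$$\lim_{m \to \infty} \langle u_m, \varphi \rangle = \langle u, \varphi \rangle \text{ and } \lim_{m \to \infty} \langle p(\partial) u_m, \varphi \rangle = \langle g, \varphi \rangle,$$
then $p(\partial) u$ exists in $L^q_{loc}(\Omega)$ and $p(\partial) u = g$.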
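Proof. Since
$$\langle u, p(-\partial) \varphi \rangle = \lim_{m \to \infty} \langle u_m, p(-\partial) \varphi \rangle = \lim_{m \to \infty} \langle p(\partial) u_m, \varphi \rangle = \langle g, \varphi \rangle$$
for all $\varphi \in C_c^\infty(\Omega)$, $p(\partial) u$ exists and is equal to $g \in L^q_{loc}(\Omega)$.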
Conversely we have the following proposition.
Proposition 26.12 (Mollification). Suppose $q \in [1, \infty)$, $p_1(\xi), \dots, p_N(\xi)$ is a collection of polynomials in $\xi \in \mathbb{R}^d$ and $u \in L^q_{loc}(\Omega)$ such that $p_l(\partial) u$ exists weakly in $L^q_{loc}(\Omega)$ for $l = 1, 2, \dots, N$. Then there exists $u_n \in C_c^\infty(\Omega)$ such that $u_n \to u$ in $L^q_{loc}(\Omega)$ and $p_l(\partial) u_n \to p_l(\partial) u$ in $L^q_{loc}(\Omega)$ for $l = 1, 2, \dots, N$.
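Proof. Let $\eta \in C_c^\infty(B(0, 1))$ such that $\int_{\mathbb{R}^d} \eta\, dm = 1$ and $\eta_\varepsilon(x) := \varepsilon^{-d} \eta(x/\varepsilon)$ be as in the proof of Lemma 26.9. For any function $f \in L^1_{loc}(\Omega)$, $\varepsilon > 0$ and $x \in \Omega_\varepsilon := \{y \in \Omega : \mathrm{dist}(y, \Omega^c) > \varepsilon\}$, let
$$f_\varepsilon(x) := f * \eta_\varepsilon(x) := 1_\Omega f * \eta_\varepsilon(x) = \int_\Omega f(y) \eta_\varepsilon(x - y)\, dy.$$
Notice that $f_\varepsilon \in C^\infty(\Omega_\varepsilon)$ and $\Omega_\varepsilon \uparrow \Omega$ as $\varepsilon \downarrow 0$. Given a compact set $K \subset \Omega$ let $K_\varepsilon := \{x \in \Omega : \mathrm{dist}(x, K) \le \varepsilon\}$. Then $K_\varepsilon \downarrow K$ as $\varepsilon \downarrow 0$, there exists $\varepsilon_0 > 0$ such that $K_0 := K_{\varepsilon_0}$ is a compact subset of $\Omega_0 := \Omega_{\varepsilon_0} \subset \Omega$ (see Figure 26.1) and for $x \in K$ and $\varepsilon < \varepsilon_0$,
$$f * \eta_\varepsilon(x) := \int_\Omega f(y) \eta_\varepsilon(x - y)\, dy = \int_{K_\varepsilon} f(y) \eta_\varepsilon(x - y)\, dy.$$
Fig. 26.1. The geometry of $K \subset K_0 \subset \Omega_0 \subset \Omega$.
Therefore, using Theorem 22.32,
$$\|f * \eta_\varepsilon - f\|_{L^q(K)} = \|(1_{K_0} f) * \eta_\varepsilon - 1_{K_0} f\|_{L^q(K)} \le \|(1_{K_0} f) * \eta_\varepsilon - 1_{K_0} f\|_{L^q(\mathbb{R}^d)} \to 0 \text{ as } \varepsilon \downarrow 0.$$
Hence, for all $f \in L^q_{loc}(\Omega)$, $f * \eta_\varepsilon \in C^\infty(\Omega_\varepsilon)$ and
$$\lim_{\varepsilon \downarrow 0} \|f * \eta_\varepsilon - f\|_{L^q(K)} = 0. \quad (26.5)$$
Now let $p(\xi)$ be a polynomial on $\mathbb{R}^d$, $u \in L^q_{loc}(\Omega)$ such that $p(\partial) u \in L^q_{loc}(\Omega)$ and $v_\varepsilon := \eta_\varepsilon * u \in C^\infty(\Omega_\varepsilon)$ as above. Then for $x \in K$ and $\varepsilon < \varepsilon_0$,
$$p(\partial) v_\varepsilon(x) = \int_\Omega u(y)\, p(\partial_x) \eta_\varepsilon(x - y)\, dy = \int_\Omega u(y)\, p(-\partial_y) \eta_\varepsilon(x - y)\, dy = \langle u, p(-\partial) \eta_\varepsilon(x - \cdot) \rangle = \langle p(\partial) u, \eta_\varepsilon(x - \cdot) \rangle = (p(\partial) u) * \eta_\varepsilon(x). \quad (26.6)$$
From Eq. (26.6) we may now apply Eq. (26.5) with $f = u$ and $f = p_l(\partial) u$ for $1 \le l \le N$ to find
$$\|v_\varepsilon - u\|_{L^q(K)} + \sum_{l=1}^N \|p_l(\partial) v_\varepsilon - p_l(\partial) u\|_{L^q(K)} \to 0 \text{ as } \varepsilon \downarrow 0.$$
For $n \in \mathbb{N}$, let
$$K_n := \{x \in \Omega : |x| \le n \text{ and } d(x, \Omega^c) \ge 1/n\}$$
(so $K_n \subset K_{n+1}^o \subset K_{n+1}$ for all $n$ and $K_n \uparrow \Omega$ as $n \to \infty$, or see Lemma 14.23) and choose $\psi_n \in C_c^\infty(K_{n+1}^o, [0, 1])$, using Corollary 22.35, so that $\psi_n = 1$ on a neighborhood of $K_n$. Choose $\varepsilon_n \downarrow 0$ such that $K_{n+1} \subset \Omega_{\varepsilon_n}$ and
$$\|v_{\varepsilon_n} - u\|_{L^q(K_n)} + \sum_{l=1}^N \|p_l(\partial) v_{\varepsilon_n} - p_l(\partial) u\|_{L^q(K_n)} \le 1/n.$$
Then $u_n := \psi_n \cdot v_{\varepsilon_n} \in C_c^\infty(\Omega)$ and since $u_n = v_{\varepsilon_n}$ on a neighborhood of $K_n$ we still have
$$\|u_n - u\|_{L^q(K_n)} + \sum_{l=1}^N \|p_l(\partial) u_n - p_l(\partial) u\|_{L^q(K_n)} \le 1/n. \quad (26.7)$$
Since any compact set $K \subset \Omega$ is contained in $K_n^o$ for all $n$ sufficiently large, Eq. (26.7) implies
$$\lim_{n \to \infty} \left[\|u_n - u\|_{L^q(K)} + \sum_{l=1}^N \|p_l(\partial) u_n - p_l(\partial) u\|_{L^q(K)}\right] = 0.$$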
The following proposition is another variant of Proposition 26.12 which
the reader is asked to prove in Exercise 26.2 below.
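Proposition 26.13. Suppose $q \in [1, \infty)$, $p_1(\xi), \dots, p_N(\xi)$ is a collection of polynomials in $\xi \in \mathbb{R}^d$ and $u \in L^q = L^q(\mathbb{R}^d)$ such that $p_l(\partial) u \in L^q$ for $l = 1, 2, \dots, N$. Then there exists $u_n \in C_c^\infty(\mathbb{R}^d)$ such that
$$\lim_{n \to \infty} \left[\|u_n - u\|_{L^q} + \sum_{l=1}^N \|p_l(\partial) u_n - p_l(\partial) u\|_{L^q}\right] = 0.$$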
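Notation 26.14 (Difference quotients) For $v \in \mathbb{R}^d$ and $h \in \mathbb{R} \setminus \{0\}$ and a function $u : \Omega \to \mathbb{C}$, let
$$\partial_v^h u(x) := \frac{u(x + hv) - u(x)}{h}$$
for those $x \in \Omega$ such that $x + hv \in \Omega$. When $v$ is one of the standard basis elements, $e_i$ for $1 \le i \le d$, we will write $\partial_i^h u(x)$ rather than $\partial_{e_i}^h u(x)$. Also let
$$\nabla^h u(x) := \left(\partial_1^h u(x), \dots, \partial_d^h u(x)\right)$$
be the difference quotient approximation to the gradient.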
Definition 26.15 (Strong Differentiability). Let $v \in \mathbb{R}^d$ and $u \in L^p$; then $\partial_v u$ is said to exist strongly in $L^p$ if $\lim_{h \to 0} \partial_v^h u$ exists in $L^p$. We will denote the limit by $\partial_v^{(s)} u$.
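It is easily verified that if $u \in L^p$, $v \in \mathbb{R}^d$ and $\partial_v^{(s)} u \in L^p$ exists then $\partial_v^{(w)} u$ exists and $\partial_v^{(w)} u = \partial_v^{(s)} u$. The key to checking this assertion is the identity,
$$\langle \partial_v^h u, \varphi \rangle = \int_{\mathbb{R}^d} \frac{u(x + hv) - u(x)}{h} \varphi(x)\, dx = \int_{\mathbb{R}^d} u(x) \frac{\varphi(x - hv) - \varphi(x)}{h}\, dx = \langle u, \partial_{-v}^h \varphi \rangle. \quad (26.8)$$
Hence if $\partial_v^{(s)} u = \lim_{h \to 0} \partial_v^h u$ exists in $L^p$ and $\varphi \in C_c^\infty(\mathbb{R}^d)$, then
$$\langle \partial_v^{(s)} u, \varphi \rangle = \lim_{h \to 0} \langle \partial_v^h u, \varphi \rangle = \lim_{h \to 0} \langle u, \partial_{-v}^h \varphi \rangle = \frac{d}{dh}\Big|_0 \langle u, \varphi(\cdot - hv) \rangle = -\langle u, \partial_v \varphi \rangle$$
wherein Corollary 19.43 has been used in the last equality to bring the derivative past the integral. This shows $\partial_v^{(w)} u$ exists and is equal to $\partial_v^{(s)} u$. What is somewhat more surprising is that the converse assertion also holds: if $\partial_v^{(w)} u$ exists then so does $\partial_v^{(s)} u$. Theorem 26.18 is a generalization of Theorem 23.15 from $L^2$ to $L^p$. For the reader's convenience, let us give a self-contained proof of the version of the Banach-Alaoglu Theorem which will be used in the proof of Theorem 26.18. (This is the same as Theorem 14.38 above.)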
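The difference quotient duality in Eq. (26.8), and the passage to the weak derivative as $h \to 0$, can be checked numerically in one dimension with $v = 1$. The sketch below makes several assumptions of its own: the function $u(x) = |x| e^{-x^2}$ and the bump test function are arbitrary illustrative choices, integrals over $\mathbb{R}$ are truncated to $[-6, 6]$, and a midpoint rule is used. All names are hypothetical.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported in (center - 1, center + 1)."""
    d = 1.0 - (x - center) ** 2
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def bump_prime(x, center=0.3):
    """Derivative of bump, obtained from the chain rule."""
    s = x - center
    d = 1.0 - s * s
    return bump(x, center) * (-2.0 * s) / (d * d) if d > 0.0 else 0.0

def u(x):
    """A sample integrable function with a kink at 0 (an illustrative choice)."""
    return abs(x) * math.exp(-x * x)

def integrate(f, a=-6.0, b=6.0, steps=12000):
    """Midpoint rule on [a, b]."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

def pair_with_quotient_of_u(h):
    """<d^h_v u, phi> with v = 1: the difference quotient acts on u."""
    return integrate(lambda x: (u(x + h) - u(x)) / h * bump(x))

def pair_with_quotient_of_phi(h):
    """<u, d^h_{-v} phi> with v = 1: the difference quotient is moved onto phi."""
    return integrate(lambda x: u(x) * (bump(x - h) - bump(x)) / h)

if __name__ == "__main__":
    for h in (0.05, 0.01, 0.001):
        print(h, pair_with_quotient_of_u(h), pair_with_quotient_of_phi(h))
    print("limit", -integrate(lambda x: u(x) * bump_prime(x)))  # -<u, d_v phi>
```

The two pairings agree for each $h$ because the identity is just a change of variables, and both approach $-\langle u, \partial_v \varphi \rangle$ as $h \to 0$, which is the defining relation for the weak derivative.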
Proposition 26.16 (Weak-$*$ Compactness: Banach-Alaoglu Theorem). Let $X$ be a separable Banach space and $\{f_n\} \subset X^*$ be a bounded sequence; then there exists a subsequence $\{\tilde f_n\} \subset \{f_n\}$ such that $\lim_{n \to \infty} \tilde f_n(x) = f(x)$ for all $x \in X$ with $f \in X^*$.
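Proof. Let $D \subset X$ be a countable linearly independent subset of $X$ such that $\overline{\mathrm{span}(D)} = X$, and let $C := \sup_n \|f_n\| < \infty$. Using Cantor's diagonal trick, choose $\{\tilde f_n\} \subset \{f_n\}$ such that $\lambda_x := \lim_{n \to \infty} \tilde f_n(x)$ exists for all $x \in D$. Define $f : \mathrm{span}(D) \to \mathbb{R}$ by the formula
$$f\left(\sum_{x \in D} a_x x\right) = \sum_{x \in D} a_x \lambda_x$$
where by assumption $\#(\{x \in D : a_x \ne 0\}) < \infty$. Then $f : \mathrm{span}(D) \to \mathbb{R}$ is linear and moreover $\tilde f_n(y) \to f(y)$ for all $y \in \mathrm{span}(D)$. Now
$$|f(y)| = \lim_{n \to \infty} |\tilde f_n(y)| \le \limsup_{n \to \infty} \|\tilde f_n\| \|y\| \le C \|y\| \text{ for all } y \in \mathrm{span}(D).$$
Hence by the B.L.T. Theorem 10.4, $f$ extends uniquely to a bounded linear functional on $X$. We still denote the extension of $f$ by $f \in X^*$. Finally, if $x \in X$ and $y \in \mathrm{span}(D)$,
$$|f(x) - \tilde f_n(x)| \le |f(x) - f(y)| + |f(y) - \tilde f_n(y)| + |\tilde f_n(y) - \tilde f_n(x)| \le \|f\| \|x - y\| + \|\tilde f_n\| \|x - y\| + |f(y) - \tilde f_n(y)|$$
$$\le 2C \|x - y\| + |f(y) - \tilde f_n(y)| \to 2C \|x - y\| \text{ as } n \to \infty.$$
Therefore
$$\limsup_{n \to \infty} |f(x) - \tilde f_n(x)| \le 2C \|x - y\| \to 0 \text{ as } y \to x.$$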
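Corollary 26.17. Let $p \in (1, \infty]$ and $q = \frac{p}{p-1}$. Then to every bounded sequence $\{u_n\}_{n=1}^\infty \subset L^p(\Omega)$ there is a subsequence $\{\tilde u_n\}_{n=1}^\infty$ and an element $u \in L^p(\Omega)$ such that
$$\lim_{n \to \infty} \langle \tilde u_n, g \rangle = \langle u, g \rangle \text{ for all } g \in L^q(\Omega).$$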
Proof. By Theorem 24.14, the map
$$v \in L^p(\Omega) \to \langle v, \cdot \rangle \in (L^q(\Omega))^*$$
is an isometric isomorphism of Banach spaces. By Theorem 22.15, $L^q(\Omega)$ is separable for all $q \in [1, \infty)$ and hence the result now follows from Proposition 26.16.
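Theorem 26.18 (Weak and Strong Differentiability). Suppose $p \in [1, \infty)$, $u \in L^p(\mathbb{R}^d)$ and $v \in \mathbb{R}^d \setminus \{0\}$. Then the following are equivalent:
1. There exists $g \in L^p(\mathbb{R}^d)$ and $\{h_n\}_{n=1}^\infty \subset \mathbb{R} \setminus \{0\}$ such that $\lim_{n \to \infty} h_n = 0$ and
$$\lim_{n \to \infty} \langle \partial_v^{h_n} u, \varphi \rangle = \langle g, \varphi \rangle \text{ for all } \varphi \in C_c^\infty(\mathbb{R}^d).$$
2. $\partial_v^{(w)} u$ exists and is equal to $g \in L^p(\mathbb{R}^d)$, i.e. $\langle u, \partial_v \varphi \rangle = -\langle g, \varphi \rangle$ for all $\varphi \in C_c^\infty(\mathbb{R}^d)$.
3. There exists $g \in L^p(\mathbb{R}^d)$ and $u_n \in C_c^\infty(\mathbb{R}^d)$ such that $u_n \to u$ in $L^p$ and $\partial_v u_n \to g$ in $L^p$ as $n \to \infty$.
4. $\partial_v^{(s)} u$ exists and is equal to $g \in L^p(\mathbb{R}^d)$, i.e. $\partial_v^h u \to g$ in $L^p$ as $h \to 0$.
Moreover if $p \in (1, \infty)$ any one of the equivalent conditions 1. to 4. above is implied by the following condition.
1'. There exists $\{h_n\}_{n=1}^\infty \subset \mathbb{R} \setminus \{0\}$ such that $\lim_{n \to \infty} h_n = 0$ and $\sup_n \left\|\partial_v^{h_n} u\right\|_p < \infty$.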
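Proof. 4. $\Longrightarrow$ 1. is simply the assertion that strong convergence implies weak convergence. 1. $\Longrightarrow$ 2. For $\phi \in C_c^\infty(\mathbb{R}^d)$, Eq. (26.8) and the dominated convergence theorem imply
\[ \langle g, \phi \rangle = \lim_{n\to\infty} \langle \partial_v^{h_n} u, \phi \rangle = \lim_{n\to\infty} \langle u, \partial_{-v}^{h_n} \phi \rangle = -\langle u, \partial_v \phi \rangle. \]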
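2. $\Longrightarrow$ 3. Let $\eta \in C_c^\infty(\mathbb{R}^d, \mathbb{R})$ such that $\int_{\mathbb{R}^d} \eta(x)\,dx = 1$ and let $\eta_m(x) = m^d \eta(mx)$, then by Proposition 22.34, $h_m := \eta_m * u \in C^\infty(\mathbb{R}^d)$ for all $m$ and
\begin{align*}
\partial_v h_m(x) = \partial_v \eta_m * u(x) &= \int_{\mathbb{R}^d} \partial_v \eta_m(x - y) u(y)\,dy \\
&= \langle u, -\partial_v [\eta_m(x - \cdot)] \rangle = \langle g, \eta_m(x - \cdot) \rangle = \eta_m * g(x).
\end{align*}
By Theorem 22.32, $h_m \to u \in L^p(\mathbb{R}^d)$ and $\partial_v h_m = \eta_m * g \to g$ in $L^p(\mathbb{R}^d)$ as $m \to \infty$. This shows 3. holds except for the fact that $h_m$ need not have compact support. To fix this let $\psi \in C_c^\infty(\mathbb{R}^d, [0, 1])$ such that $\psi = 1$ in a neighborhood of $0$ and let $\psi_\varepsilon(x) = \psi(\varepsilon x)$ and $(\partial_v \psi)_\varepsilon(x) := (\partial_v \psi)(\varepsilon x)$. Then
\[ \partial_v(\psi_\varepsilon h_m) = \partial_v \psi_\varepsilon \, h_m + \psi_\varepsilon \partial_v h_m = \varepsilon (\partial_v \psi)_\varepsilon h_m + \psi_\varepsilon \partial_v h_m \]
so that $\psi_\varepsilon h_m \to h_m$ in $L^p$ and $\partial_v(\psi_\varepsilon h_m) \to \partial_v h_m$ in $L^p$ as $\varepsilon \downarrow 0$. Let $u_m = \psi_{\varepsilon_m} h_m$ where $\varepsilon_m$ is chosen to be greater than zero but small enough so that
\[ \| \psi_{\varepsilon_m} h_m - h_m \|_p + \| \partial_v(\psi_{\varepsilon_m} h_m) - \partial_v h_m \|_p < 1/m. \]
Then $u_m \in C_c^\infty(\mathbb{R}^d)$, $u_m \to u$ and $\partial_v u_m \to g$ in $L^p$ as $m \to \infty$. 3. $\Longrightarrow$ 4. By the fundamental theorem of calculus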
\begin{align*}
\partial_v^h u_m(x) &= \frac{u_m(x + hv) - u_m(x)}{h} \\
&= \frac{1}{h} \int_0^1 \frac{d}{ds} u_m(x + shv)\,ds = \int_0^1 (\partial_v u_m)(x + shv)\,ds \tag{26.9}
\end{align*}
and therefore,
\[ \partial_v^h u_m(x) - \partial_v u_m(x) = \int_0^1 [(\partial_v u_m)(x + shv) - \partial_v u_m(x)]\,ds. \]
So by Minkowski's inequality for integrals, Theorem 21.27,
\[ \| \partial_v^h u_m - \partial_v u_m \|_p \le \int_0^1 \| (\partial_v u_m)(\cdot + shv) - \partial_v u_m \|_p\,ds \]
and letting $m \to \infty$ in this equation then implies
\[ \| \partial_v^h u - g \|_p \le \int_0^1 \| g(\cdot + shv) - g \|_p\,ds. \]
By the dominated convergence theorem and Proposition 22.24, the right member of this equation tends to zero as $h \to 0$ and this shows item 4. holds. (1'. $\Longrightarrow$ 1. when $p > 1$) This is a consequence of Corollary 26.17 (or see Theorem 14.38 above) which asserts, by passing to a subsequence if necessary, that $\partial_v^{h_n} u \xrightarrow{w} g$ for some $g \in L^p(\mathbb{R}^d)$.
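Example 26.19. The fact that (1') does not imply the equivalent conditions 1 - 4 in Theorem 26.18 when $p = 1$ is demonstrated by the following example. Let $u := 1_{[0,1]}$, then
\[ \int_{\mathbb{R}} \left| \frac{u(x + h) - u(x)}{h} \right| dx = \frac{1}{|h|} \int_{\mathbb{R}} \left| 1_{[-h, 1-h]}(x) - 1_{[0,1]}(x) \right| dx = 2 \]
for $|h| < 1$. On the other hand the distributional derivative of $u$ is $\partial u(x) = \delta(x) - \delta(x - 1)$ which is not in $L^1$.
Alternatively, suppose there exists $g \in L^1(\mathbb{R}, dm)$ such that
\[ \lim_{n\to\infty} \frac{u(x + h_n) - u(x)}{h_n} = g(x) \text{ in } L^1 \]
for some sequence $\{h_n\}_{n=1}^\infty$ as above. Then for $\phi \in C_c^\infty(\mathbb{R})$ we would have on one hand,
\begin{align*}
\int_{\mathbb{R}} \frac{u(x + h_n) - u(x)}{h_n} \phi(x)\,dx &= \int_{\mathbb{R}} \frac{\phi(x - h_n) - \phi(x)}{h_n} u(x)\,dx \\
&\to -\int_0^1 \phi'(x)\,dx = (\phi(0) - \phi(1)) \text{ as } n \to \infty,
\end{align*}
while on the other hand,
\[ \int_{\mathbb{R}} \frac{u(x + h_n) - u(x)}{h_n} \phi(x)\,dx \to \int_{\mathbb{R}} g(x)\phi(x)\,dx. \]
These two equations imply
\[ \int_{\mathbb{R}} g(x)\phi(x)\,dx = \phi(0) - \phi(1) \text{ for all } \phi \in C_c^\infty(\mathbb{R}) \tag{26.10} \]
and in particular that $\int_{\mathbb{R}} g(x)\phi(x)\,dx = 0$ for all $\phi \in C_c^\infty(\mathbb{R} \setminus \{0, 1\})$. By Corollary 22.38, $g(x) = 0$ for $m$-a.e. $x \in \mathbb{R} \setminus \{0, 1\}$ and hence $g(x) = 0$ for $m$-a.e. $x \in \mathbb{R}$. But this clearly contradicts Eq. (26.10). This example also shows that the unit ball in $L^1(\mathbb{R}, dm)$ is not weakly sequentially compact. Compare with Lemma 25.14 below.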
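The computation at the start of Example 26.19 can be checked mechanically. The following is a minimal sketch, not part of the original text: since $x \to u(x+h)$ is the indicator of $[-h, 1-h]$, the $L^1$ norm of the difference quotient is the length of the symmetric difference of two unit intervals divided by $|h|$. The function names and the sample step sizes are illustrative assumptions.

```python
def overlap_length(a, b, c, d):
    """Length of the intersection of the intervals [a, b] and [c, d]."""
    return max(0.0, min(b, d) - max(a, c))


def l1_norm_of_difference_quotient(h):
    """Exact L^1(R) norm of x -> (u(x + h) - u(x)) / h for u = 1_[0,1]."""
    # x -> u(x + h) is the indicator of [-h, 1 - h], so |u(x + h) - u(x)| is
    # the indicator of the symmetric difference of [-h, 1 - h] and [0, 1].
    common = overlap_length(-h, 1.0 - h, 0.0, 1.0)
    symmetric_difference = 2.0 - 2.0 * common
    return symmetric_difference / abs(h)


# Hypothetical sample steps with 0 < |h| < 1 (dyadic, so the arithmetic is exact).
steps = [0.5, -0.25, 0.125, -0.0625]
norms = [l1_norm_of_difference_quotient(h) for h in steps]
print(norms)
```

The norms stay bounded as $h \to 0$, which is condition (1') with $p = 1$, even though no $L^1$ limit $g$ exists.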
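Corollary 26.20. If $1 \le p < \infty$, $u \in L^p$ such that $\partial_v u \in L^p$, then $\| \partial_v^h u \|_{L^p} \le \| \partial_v u \|_{L^p}$ for all $h \neq 0$ and $v \in \mathbb{R}^d$.
Proof. By Minkowski's inequality for integrals, Theorem 21.27, we may let $m \to \infty$ in Eq. (26.9) to find
\[ \partial_v^h u(x) = \int_0^1 (\partial_v u)(x + shv)\,ds \text{ for a.e. } x \in \mathbb{R}^d \]
and
\[ \| \partial_v^h u \|_{L^p} \le \int_0^1 \| (\partial_v u)(\cdot + shv) \|_{L^p}\,ds = \| \partial_v u \|_{L^p}. \]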
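As a numerical sanity check of Corollary 26.20, here is a rough sketch (not from the original text) in dimension $d = 1$ with $p = 2$, $v = 1$ and the hypothetical choice $u(x) = e^{-x^2}$, for which $\|u'\|_{L^2} = (\pi/2)^{1/4}$. The norms are approximated by a midpoint rule on a truncated interval. The truncation bounds, grid size and step sizes are assumptions made for illustration only.

```python
import math


def lp_norm(func, p, lo=-12.0, hi=12.0, n=24000):
    """Midpoint-rule approximation of the L^p norm of func on [lo, hi]."""
    dx = (hi - lo) / n
    total = sum(abs(func(lo + (k + 0.5) * dx)) ** p for k in range(n))
    return (total * dx) ** (1.0 / p)


def u(x):
    return math.exp(-x * x)


def du(x):
    """Classical derivative of u."""
    return -2.0 * x * math.exp(-x * x)


def difference_quotient(h):
    """The function x -> (u(x + h) - u(x)) / h."""
    return lambda x: (u(x + h) - u(x)) / h


p = 2
derivative_norm = lp_norm(du, p)
quotient_norms = {h: lp_norm(difference_quotient(h), p) for h in (1.0, 0.5, 0.1, -0.1)}
print(derivative_norm, quotient_norms)
```

Each difference-quotient norm comes out below the norm of the derivative, as the corollary predicts, and approaches it as $|h|$ shrinks.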
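Proposition 26.21 (A weak form of Weyl's Lemma). If $u \in L^2(\mathbb{R}^d)$ such that $f := \Delta u \in L^2(\mathbb{R}^d)$ then $\partial^\alpha u \in L^2(\mathbb{R}^d)$ for $|\alpha| \le 2$. Furthermore if $k \in \mathbb{N}_0$ and $\partial^\beta f \in L^2(\mathbb{R}^d)$ for all $|\beta| \le k$, then $\partial^\alpha u \in L^2(\mathbb{R}^d)$ for $|\alpha| \le k + 2$.
Proof. By Proposition 26.13, there exists $u_n \in C_c^\infty(\mathbb{R}^d)$ such that $u_n \to u$ and $\Delta u_n \to \Delta u = f$ in $L^2(\mathbb{R}^d)$. By integration by parts we find
\[ \int_{\mathbb{R}^d} |\nabla(u_n - u_m)|^2\,dm = (-\Delta(u_n - u_m), (u_n - u_m))_{L^2} \to -(f - f, u - u)_{L^2} = 0 \text{ as } m, n \to \infty \]
and hence by item 3. of Theorem 26.18, $\partial_i u \in L^2$ for each $i$. Since
\[ \| \nabla u \|_{L^2}^2 = \lim_{n\to\infty} \int_{\mathbb{R}^d} |\nabla u_n|^2\,dm \quad \text{and} \quad \int_{\mathbb{R}^d} |\nabla u_n|^2\,dm = (-\Delta u_n, u_n)_{L^2} \to -(f, u) \text{ as } n \to \infty \]
we also learn that
\[ \| \nabla u \|_{L^2}^2 = -(f, u) \le \| f \|_{L^2} \cdot \| u \|_{L^2}. \tag{26.11} \]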
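Let us now consider
\begin{align*}
\sum_{i,j=1}^d \int_{\mathbb{R}^d} |\partial_i \partial_j u_n|^2\,dm &= -\sum_{i,j=1}^d \int_{\mathbb{R}^d} \partial_j u_n \, \partial_i^2 \partial_j u_n\,dm \\
&= -\sum_{j=1}^d \int_{\mathbb{R}^d} \partial_j u_n \, \Delta \partial_j u_n\,dm = \sum_{j=1}^d \int_{\mathbb{R}^d} \partial_j^2 u_n \, \Delta u_n\,dm \\
&= \int_{\mathbb{R}^d} |\Delta u_n|^2\,dm = \| \Delta u_n \|_{L^2}^2.
\end{align*}
Replacing $u_n$ by $u_n - u_m$ in this calculation shows
\[ \sum_{i,j=1}^d \int_{\mathbb{R}^d} |\partial_i \partial_j (u_n - u_m)|^2\,dm = \| \Delta(u_n - u_m) \|_{L^2}^2 \to 0 \text{ as } m, n \to \infty \]
and therefore by Lemma 26.4 (also see Exercise 26.4), $\partial_i \partial_j u \in L^2(\mathbb{R}^d)$ for all $i, j$ and
\[ \sum_{i,j=1}^d \int_{\mathbb{R}^d} |\partial_i \partial_j u|^2\,dm = \| \Delta u \|_{L^2}^2 = \| f \|_{L^2}^2. \tag{26.12} \]
Combining Eqs. (26.11) and (26.12) gives the estimate
\begin{align*}
\sum_{|\alpha| \le 2} \| \partial^\alpha u \|_{L^2}^2 &\le \| u \|_{L^2}^2 + \| f \|_{L^2} \cdot \| u \|_{L^2} + \| f \|_{L^2}^2 \\
&= \| u \|_{L^2}^2 + \| \Delta u \|_{L^2} \cdot \| u \|_{L^2} + \| \Delta u \|_{L^2}^2. \tag{26.13}
\end{align*}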
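Let us now further assume $\partial_i f = \partial_i \Delta u \in L^2(\mathbb{R}^d)$. Then for $h \in \mathbb{R} \setminus \{0\}$, $\partial_i^h u \in L^2(\mathbb{R}^d)$ and $\Delta \partial_i^h u = \partial_i^h \Delta u = \partial_i^h f \in L^2(\mathbb{R}^d)$ and hence by Eq. (26.13) and what we have just proved, $\partial^\alpha \partial_i^h u = \partial_i^h \partial^\alpha u \in L^2$ and
\begin{align*}
\sum_{|\alpha| \le 2} \| \partial_i^h \partial^\alpha u \|_{L^2(\mathbb{R}^d)}^2 &\le \| \partial_i^h u \|_{L^2}^2 + \| \partial_i^h f \|_{L^2} \cdot \| \partial_i^h u \|_{L^2} + \| \partial_i^h f \|_{L^2}^2 \\
&\le \| \partial_i u \|_{L^2}^2 + \| \partial_i f \|_{L^2} \cdot \| \partial_i u \|_{L^2} + \| \partial_i f \|_{L^2}^2
\end{align*}
where the last inequality follows from Corollary 26.20. Therefore applying Theorem 26.18 again we learn that $\partial_i \partial^\alpha u \in L^2(\mathbb{R}^d)$ for all $|\alpha| \le 2$ and
\begin{align*}
\sum_{|\alpha| \le 2} \| \partial_i \partial^\alpha u \|_{L^2(\mathbb{R}^d)}^2 &\le \| \partial_i u \|_{L^2}^2 + \| \partial_i f \|_{L^2} \cdot \| \partial_i u \|_{L^2} + \| \partial_i f \|_{L^2}^2 \\
&\le \| \nabla u \|_{L^2}^2 + \| \partial_i f \|_{L^2} \cdot \| \nabla u \|_{L^2} + \| \partial_i f \|_{L^2}^2 \\
&\le \| f \|_{L^2} \cdot \| u \|_{L^2} + \| \partial_i f \|_{L^2} \cdot \sqrt{\| f \|_{L^2} \cdot \| u \|_{L^2}} + \| \partial_i f \|_{L^2}^2.
\end{align*}
The remainder of the proof, which is now an induction argument using the above ideas, is left as an exercise to the reader.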
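Theorem 26.22. Suppose that $\Omega$ is an open subset of $\mathbb{R}^d$ and $V$ is an open precompact subset of $\Omega$.
1. If $1 \le p < \infty$, $u \in L^p(\Omega)$ and $\partial_i u \in L^p(\Omega)$, then $\| \partial_i^h u \|_{L^p(V)} \le \| \partial_i u \|_{L^p(\Omega)}$ for all $0 < |h| < \frac{1}{2} \operatorname{dist}(V, \Omega^c)$.
2. Suppose that $1 < p \le \infty$, $u \in L^p(\Omega)$ and assume there exist constants $C_V < \infty$ and $\varepsilon_V \in (0, \frac{1}{2} \operatorname{dist}(V, \Omega^c))$ such that
\[ \| \partial_i^h u \|_{L^p(V)} \le C_V \text{ for all } 0 < |h| < \varepsilon_V. \]
Then $\partial_i u \in L^p(V)$ and $\| \partial_i u \|_{L^p(V)} \le C_V$. Moreover if $C := \sup_{V \subset\subset \Omega} C_V < \infty$ then in fact $\partial_i u \in L^p(\Omega)$ and $\| \partial_i u \|_{L^p(\Omega)} \le C$.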
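Proof. 1. Let $U \subset_o \Omega$ such that $\bar V \subset U$ and $\bar U$ is a compact subset of $\Omega$. For $u \in C^1(\Omega) \cap L^p(\Omega)$, $x \in V$ and $0 < |h| < \frac{1}{2} \operatorname{dist}(V, U^c)$,
\[ \partial_i^h u(x) = \frac{u(x + h e_i) - u(x)}{h} = \int_0^1 \partial_i u(x + t h e_i)\,dt \]
and in particular,
\[ |\partial_i^h u(x)| \le \int_0^1 |\partial_i u(x + t h e_i)|\,dt. \]
Therefore by Minkowski's inequality for integrals,
\[ \| \partial_i^h u \|_{L^p(V)} \le \int_0^1 \| \partial_i u(\cdot + t h e_i) \|_{L^p(V)}\,dt \le \| \partial_i u \|_{L^p(U)}. \tag{26.14} \]
For general $u \in L^p(\Omega)$ with $\partial_i u \in L^p(\Omega)$, by Proposition 26.12, there exists $u_n \in C_c^\infty(\Omega)$ such that $u_n \to u$ and $\partial_i u_n \to \partial_i u$ in $L^p_{loc}(\Omega)$. Therefore we may replace $u$ by $u_n$ in Eq. (26.14) and then pass to the limit to find
\[ \| \partial_i^h u \|_{L^p(V)} \le \| \partial_i u \|_{L^p(U)} \le \| \partial_i u \|_{L^p(\Omega)}. \]
2. If $\| \partial_i^h u \|_{L^p(V)} \le C_V$ for all $h$ sufficiently small then by Corollary 26.17 there exists $h_n \to 0$ such that $\partial_i^{h_n} u \xrightarrow{w} v \in L^p(V)$. Hence if $\phi \in C_c^\infty(V)$,
\begin{align*}
\int_V v \phi\,dm &= \lim_{n\to\infty} \int_\Omega \partial_i^{h_n} u \, \phi\,dm = -\lim_{n\to\infty} \int_\Omega u \, \partial_i^{-h_n} \phi\,dm \\
&= -\int_\Omega u \, \partial_i \phi\,dm = -\int_V u \, \partial_i \phi\,dm.
\end{align*}
Therefore $\partial_i u = v \in L^p(V)$ and $\| \partial_i u \|_{L^p(V)} \le \| v \|_{L^p(V)} \le C_V$ (see footnote 1). Finally if $C := \sup_{V \subset\subset \Omega} C_V < \infty$, then by the dominated convergence theorem,
\[ \| \partial_i u \|_{L^p(\Omega)} = \lim_{V \uparrow \Omega} \| \partial_i u \|_{L^p(V)} \le C. \]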
We will now give a couple of applications of Theorem 26.18.
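Footnote 1. Here we have used the result that if $f \in L^p$ and $f_n \in L^p$ such that $\langle f_n, \phi \rangle \to \langle f, \phi \rangle$ for all $\phi \in C_c^\infty(V)$, then $\| f \|_{L^p(V)} \le \liminf_{n\to\infty} \| f_n \|_{L^p(V)}$. To prove this, we have with $q = \frac{p}{p-1}$ that
\[ |\langle f, \phi \rangle| = \lim_{n\to\infty} |\langle f_n, \phi \rangle| \le \liminf_{n\to\infty} \| f_n \|_{L^p(V)} \cdot \| \phi \|_{L^q(V)} \]
and therefore,
\[ \| f \|_{L^p(V)} = \sup_{\phi \neq 0} \frac{|\langle f, \phi \rangle|}{\| \phi \|_{L^q(V)}} \le \liminf_{n\to\infty} \| f_n \|_{L^p(V)}. \]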
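Lemma 26.23. Let $v \in \mathbb{R}^d$.
1. If $h \in L^1$ and $\partial_v h$ exists in $L^1$, then $\int_{\mathbb{R}^d} \partial_v h(x)\,dx = 0$.
2. If $p, q, r \in [1, \infty)$ satisfy $r^{-1} = p^{-1} + q^{-1}$, $f \in L^p$ and $g \in L^q$ are functions such that $\partial_v f$ and $\partial_v g$ exist in $L^p$ and $L^q$ respectively, then $\partial_v(fg)$ exists in $L^r$ and $\partial_v(fg) = \partial_v f \cdot g + f \cdot \partial_v g$. Moreover if $r = 1$ we have the integration by parts formula,
\[ \langle \partial_v f, g \rangle = -\langle f, \partial_v g \rangle. \tag{26.15} \]
3. If $p = 1$, $\partial_v f$ exists in $L^1$ and $g \in BC^1(\mathbb{R}^d)$ (i.e. $g \in C^1(\mathbb{R}^d)$ with $g$ and its first derivatives being bounded) then $\partial_v(gf)$ exists in $L^1$ and $\partial_v(fg) = \partial_v f \cdot g + f \cdot \partial_v g$ and again Eq. (26.15) holds.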
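Proof. 1) By item 3. of Theorem 26.18 there exists $h_n \in C_c^\infty(\mathbb{R}^d)$ such that $h_n \to h$ and $\partial_v h_n \to \partial_v h$ in $L^1$. Then
\[ \int_{\mathbb{R}^d} \partial_v h_n(x)\,dx = \frac{d}{dt}\Big|_0 \int_{\mathbb{R}^d} h_n(x + tv)\,dx = \frac{d}{dt}\Big|_0 \int_{\mathbb{R}^d} h_n(x)\,dx = 0 \]
and letting $n \to \infty$ proves the first assertion. 2) Similarly there exists $f_n, g_n \in C_c^\infty(\mathbb{R}^d)$ such that $f_n \to f$ and $\partial_v f_n \to \partial_v f$ in $L^p$ and $g_n \to g$ and $\partial_v g_n \to \partial_v g$ in $L^q$ as $n \to \infty$. So by the standard product rule and Remark 26.2, $f_n g_n \to fg \in L^r$ as $n \to \infty$ and
\[ \partial_v(f_n g_n) = \partial_v f_n \cdot g_n + f_n \cdot \partial_v g_n \to \partial_v f \cdot g + f \cdot \partial_v g \text{ in } L^r \text{ as } n \to \infty. \]
It now follows from another application of Theorem 26.18 that $\partial_v(fg)$ exists in $L^r$ and $\partial_v(fg) = \partial_v f \cdot g + f \cdot \partial_v g$. Eq. (26.15) follows from this product rule and item 1. when $r = 1$. 3) Let $f_n \in C_c^\infty(\mathbb{R}^d)$ such that $f_n \to f$ and $\partial_v f_n \to \partial_v f$ in $L^1$ as $n \to \infty$. Then as above, $g f_n \to g f$ in $L^1$ and $\partial_v(g f_n) \to \partial_v g \cdot f + g \, \partial_v f$ in $L^1$ as $n \to \infty$. In particular if $\phi \in C_c^\infty(\mathbb{R}^d)$, then
\begin{align*}
\langle g f, \partial_v \phi \rangle &= \lim_{n\to\infty} \langle g f_n, \partial_v \phi \rangle = -\lim_{n\to\infty} \langle \partial_v(g f_n), \phi \rangle \\
&= -\lim_{n\to\infty} \langle \partial_v g \cdot f_n + g \, \partial_v f_n, \phi \rangle = -\langle \partial_v g \cdot f + g \, \partial_v f, \phi \rangle.
\end{align*}
This shows $\partial_v(fg)$ exists (weakly) and $\partial_v(fg) = \partial_v f \cdot g + f \cdot \partial_v g$. Again Eq. (26.15) holds in this case by item 1. already proved.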
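Lemma 26.24. Let $p, q, r \in [1, \infty]$ satisfy $p^{-1} + q^{-1} = 1 + r^{-1}$, $f \in L^p$, $g \in L^q$ and $v \in \mathbb{R}^d$.
1. If $\partial_v f$ exists strongly in $L^p$, then $\partial_v(f * g)$ exists strongly in $L^r$ and
\[ \partial_v(f * g) = (\partial_v f) * g. \]
2. If $\partial_v g$ exists strongly in $L^q$, then $\partial_v(f * g)$ exists strongly in $L^r$ and
\[ \partial_v(f * g) = f * \partial_v g. \]
3. If $\partial_v f$ exists weakly in $L^p$ and $g \in C_c^\infty(\mathbb{R}^d)$, then $f * g \in C^\infty(\mathbb{R}^d)$, $\partial_v(f * g)$ exists strongly in $L^r$ and
\[ \partial_v(f * g) = f * \partial_v g = (\partial_v f) * g. \]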
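Proof. Items 1 and 2. By Young's inequality (Theorem 22.30) and simple computations:
\begin{align*}
\left\| \frac{\tau_{-hv}(f * g) - f * g}{h} - (\partial_v f) * g \right\|_r &= \left\| \frac{\tau_{-hv} f * g - f * g}{h} - (\partial_v f) * g \right\|_r \\
&= \left\| \left[ \frac{\tau_{-hv} f - f}{h} - (\partial_v f) \right] * g \right\|_r \\
&\le \left\| \frac{\tau_{-hv} f - f}{h} - (\partial_v f) \right\|_p \| g \|_q
\end{align*}
which tends to zero as $h \to 0$. The second item is proved analogously, or just make use of the fact that $f * g = g * f$ and apply Item 1. Using the fact that $g(x - \cdot) \in C_c^\infty(\mathbb{R}^d)$ and the definition of the weak derivative,
\begin{align*}
f * \partial_v g(x) &= \int_{\mathbb{R}^d} f(y) (\partial_v g)(x - y)\,dy = -\int_{\mathbb{R}^d} f(y) \left( \partial_v g(x - \cdot) \right)(y)\,dy \\
&= \int_{\mathbb{R}^d} \partial_v f(y) g(x - y)\,dy = \partial_v f * g(x).
\end{align*}
Item 3. is a consequence of this equality and items 1. and 2.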
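Item 2 of Lemma 26.24 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it takes $d = 1$, $v = 1$, the non-differentiable $f = 1_{[0,1]}$ and the hypothetical smooth factor $g(x) = e^{-x^2}$. It compares a central difference quotient of $f * g$ with $f * g'$, both computed by a midpoint rule. All function names, grid sizes and sample points are illustrative.

```python
import math


def g(x):
    """Smooth factor: a Gaussian, which has a strong derivative in every L^q."""
    return math.exp(-x * x)


def dg(x):
    """Classical derivative of g."""
    return -2.0 * x * math.exp(-x * x)


def convolve_with_indicator(kernel, x, n=2000):
    """Midpoint-rule value of (f * kernel)(x) for f = 1_[0,1]."""
    dy = 1.0 / n
    return sum(kernel(x - (k + 0.5) * dy) for k in range(n)) * dy


def central_difference(func, x, h=1e-4):
    """Symmetric difference quotient of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


points = [-1.0, 0.0, 0.3, 1.0, 2.5]
lhs = [central_difference(lambda t: convolve_with_indicator(g, t), x) for x in points]
rhs = [convolve_with_indicator(dg, x) for x in points]
print(list(zip(lhs, rhs)))
```

The two lists agree to within the discretization error, reflecting that the derivative may be moved onto whichever factor is differentiable.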
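Proposition 26.25. Let $\Omega = (\alpha, \beta) \subset \mathbb{R}$ be an open interval and $f \in L^1_{loc}(\Omega)$ such that $\partial^{(w)} f = 0$ in $L^1_{loc}(\Omega)$. Then there exists $c \in \mathbb{C}$ such that $f = c$ a.e. More generally, suppose $F : C_c^\infty(\Omega) \to \mathbb{C}$ is a linear functional such that $F(\phi') = 0$ for all $\phi \in C_c^\infty(\Omega)$, where $\phi'(x) = \frac{d}{dx}\phi(x)$, then there exists $c \in \mathbb{C}$ such that
\[ F(\phi) = \langle c, \phi \rangle = \int_\Omega c \, \phi(x)\,dx \text{ for all } \phi \in C_c^\infty(\Omega). \tag{26.16} \]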
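Proof. Before giving a proof of the second assertion, let us show it includes the first. Indeed, if $F(\phi) := \int_\Omega \phi f\,dm$ and $\partial^{(w)} f = 0$, then $F(\phi') = 0$ for all $\phi \in C_c^\infty(\Omega)$ and therefore there exists $c \in \mathbb{C}$ such that
\[ \int_\Omega \phi f\,dm = F(\phi) = c \langle \phi, 1 \rangle = c \int_\Omega \phi\,dm. \]
But this implies $f = c$ a.e. So it only remains to prove the second assertion. Let $\eta \in C_c^\infty(\Omega)$ such that $\int_\Omega \eta\,dm = 1$. Given $\phi \in C_c^\infty(\Omega) \subset C_c^\infty(\mathbb{R})$, let $\psi(x) = \int_{-\infty}^x (\phi(y) - \eta(y) \langle \phi, 1 \rangle)\,dy$. Then $\psi'(x) = \phi(x) - \eta(x) \langle \phi, 1 \rangle$ and $\psi \in C_c^\infty(\Omega)$ as the reader should check. Therefore,
\[ 0 = F(\psi') = F(\phi - \langle \phi, 1 \rangle \eta) = F(\phi) - \langle \phi, 1 \rangle F(\eta) \]
which shows Eq. (26.16) holds with $c = F(\eta)$. This concludes the proof, however it will be instructive to give another proof of the first assertion.
Alternative proof of first assertion. Suppose $f \in L^1_{loc}(\Omega)$ and $\partial^{(w)} f = 0$ and $f_m := f * \eta_m$ as in the proof of Lemma 26.9. Then $f_m' = \partial^{(w)} f * \eta_m = 0$, so $f_m = c_m$ for some constant $c_m \in \mathbb{C}$. By Theorem 22.32, $f_m \to f$ in $L^1_{loc}(\Omega)$ and therefore if $J = [a, b]$ is a compact subinterval of $\Omega$,
\[ |c_m - c_k| = \frac{1}{b - a} \int_J |f_m - f_k|\,dm \to 0 \text{ as } m, k \to \infty. \]
So $\{c_m\}_{m=1}^\infty$ is a Cauchy sequence and therefore $c := \lim_{m\to\infty} c_m$ exists and $f = \lim_{m\to\infty} f_m = c$ a.e.
We will say more about the connection of weak derivatives to pointwise derivatives in Section 30.5 below.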
26.2 Exercises
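Exercise 26.1. Give another proof of Lemma 26.10 based on Proposition 26.12.
Exercise 26.2. Prove Proposition 26.13. Hints: 1. Use $u_\varepsilon$ as defined in the proof of Proposition 26.12 to show it suffices to consider the case where $u \in C^\infty(\mathbb{R}^d) \cap L^q(\mathbb{R}^d)$ with $\partial^\alpha u \in L^q(\mathbb{R}^d)$ for all $\alpha \in \mathbb{N}_0^d$. 2. Then let $\psi \in C_c^\infty(B(0, 1), [0, 1])$ such that $\psi = 1$ on a neighborhood of $0$ and let $u_n(x) := u(x)\psi(x/n)$.
Exercise 26.3. Suppose $p(\xi)$ is a polynomial in $\xi \in \mathbb{R}^d$, $p \in (1, \infty)$, $q := \frac{p}{p-1}$, $u \in L^p$ such that $p(\partial)u \in L^p$ and $v \in L^q$ such that $p(-\partial)v \in L^q$. Show $\langle p(\partial)u, v \rangle = \langle u, p(-\partial)v \rangle$.
Exercise 26.4. Let $p \in [1, \infty)$, $\alpha$ be a multi index (if $\alpha = 0$ let $\partial^0$ be the identity operator on $L^p$),
\[ D(\partial^\alpha) := \{ f \in L^p(\mathbb{R}^n) : \partial^\alpha f \text{ exists weakly in } L^p(\mathbb{R}^n) \} \]
and for $f \in D(\partial^\alpha)$ (the domain of $\partial^\alpha$) let $\partial^\alpha f$ denote the $\alpha$-weak derivative of $f$. (See Definition 26.3.)
1. Show $\partial^\alpha$ is a densely defined operator on $L^p$, i.e. $D(\partial^\alpha)$ is a dense linear subspace of $L^p$ and $\partial^\alpha : D(\partial^\alpha) \to L^p$ is a linear transformation.
2. Show $\partial^\alpha : D(\partial^\alpha) \to L^p$ is a closed operator, i.e. the graph,
\[ \Gamma(\partial^\alpha) := \{ (f, \partial^\alpha f) \in L^p \times L^p : f \in D(\partial^\alpha) \}, \]
is a closed subspace of $L^p \times L^p$.
3. Show $\partial^\alpha : D(\partial^\alpha) \subset L^p \to L^p$ is not bounded unless $\alpha = 0$. (The norm on $D(\partial^\alpha)$ is taken to be the $L^p$-norm.)
Exercise 26.5. Let $p \in [1, \infty)$, $f \in L^p$ and $\alpha$ be a multi index. Show $\partial^\alpha f$ exists weakly (see Definition 26.3) in $L^p$ iff there exists $f_n \in C_c^\infty(\mathbb{R}^n)$ and $g \in L^p$ such that $f_n \to f$ and $\partial^\alpha f_n \to g$ in $L^p$ as $n \to \infty$. Hints: See Exercises 26.2 and 26.4.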
Exercise 26.6. 8.8 on p. 246.
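Exercise 26.7. Assume $n = 1$ and let $\partial = \partial_{e_1}$ where $e_1 = (1) \in \mathbb{R}^1 = \mathbb{R}$.
1. Let $f(x) = |x|$, show $\partial f$ exists weakly in $L^1_{loc}(\mathbb{R})$ and $\partial f(x) = \operatorname{sgn}(x)$ for $m$-a.e. $x$.
2. Show $\partial(\partial f)$ does not exist weakly in $L^1_{loc}(\mathbb{R})$.
3. Generalize item 1. as follows. Suppose $f \in C(\mathbb{R}, \mathbb{R})$ and there exists a finite set $\Lambda := \{ t_1 < t_2 < \cdots < t_N \} \subset \mathbb{R}$ such that $f \in C^1(\mathbb{R} \setminus \Lambda, \mathbb{R})$. Assuming $\partial f \in L^1_{loc}(\mathbb{R})$, show $\partial f$ exists weakly and $\partial^{(w)} f(x) = \partial f(x)$ for $m$-a.e. $x$.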
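A numerical sanity check (not a proof) of item 1 of Exercise 26.7 may help fix the sign conventions. The sketch below tests the defining identity $\langle f, \phi' \rangle = -\langle \operatorname{sgn}, \phi \rangle$ for $f(x) = |x|$ against a single smooth bump $\phi$. The bump, its center and the quadrature grid are hypothetical choices made for illustration.

```python
import math

CENTER = 0.3  # hypothetical choice; the bump is deliberately not symmetric about 0


def phi(x):
    """Smooth bump supported in [CENTER - 1, CENTER + 1]."""
    t = x - CENTER
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))


def dphi(x):
    """Classical derivative of phi."""
    t = x - CENTER
    if abs(t) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * t) / (1.0 - t * t) ** 2


def sgn(x):
    return (x > 0) - (x < 0)


def integrate(func, lo, hi, n=200000):
    """Midpoint rule on [lo, hi]."""
    dx = (hi - lo) / n
    return sum(func(lo + (k + 0.5) * dx) for k in range(n)) * dx


lo, hi = CENTER - 1.0, CENTER + 1.0
pairing_with_dphi = integrate(lambda x: abs(x) * dphi(x), lo, hi)  # <f, phi'>
pairing_with_sgn = integrate(lambda x: sgn(x) * phi(x), lo, hi)    # <sgn, phi>
print(pairing_with_dphi, -pairing_with_sgn)
```

The two printed numbers agree up to quadrature error. A full solution must of course verify the identity for every test function, which follows from integrating by parts separately on $(-\infty, 0)$ and $(0, \infty)$.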
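Exercise 26.8. Suppose that $f \in L^1_{loc}(\Omega)$ and $v \in \mathbb{R}^d$ and $\{e_j\}_{j=1}^n$ is the standard basis for $\mathbb{R}^d$. If $\partial_j f := \partial_{e_j} f$ exists weakly in $L^1_{loc}(\Omega)$ for all $j = 1, 2, \dots, n$ then $\partial_v f$ exists weakly in $L^1_{loc}(\Omega)$ and $\partial_v f = \sum_{j=1}^n v_j \partial_j f$.
Exercise 26.9. Suppose $f \in L^1_{loc}(\mathbb{R}^d)$ and $\partial_v f$ exists weakly and $\partial_v f = 0$ in $L^1_{loc}(\mathbb{R}^d)$ for all $v \in \mathbb{R}^d$. Then there exists $\lambda \in \mathbb{C}$ such that $f(x) = \lambda$ for $m$-a.e. $x \in \mathbb{R}^d$. Hint: See steps 1. and 2. in the outline given in Exercise 26.10 below.
Exercise 26.10 (A generalization of Exercise 26.9). Suppose $\Omega$ is a connected open subset of $\mathbb{R}^d$ and $f \in L^1_{loc}(\Omega)$. If $\partial^\alpha f = 0$ weakly for $\alpha \in \mathbb{Z}_+^n$ with $|\alpha| = N + 1$, then $f(x) = p(x)$ for $m$-a.e. $x$ where $p(x)$ is a polynomial of degree at most $N$. Here is an outline.
1. Suppose $x_0 \in \Omega$ and $\varepsilon > 0$ such that $C := C_{x_0}(\varepsilon) \subset \Omega$ and let $\eta_n$ be a sequence of approximate $\delta$-functions such that $\operatorname{supp}(\eta_n) \subset B_0(1/n)$ for all $n$. Then for $n$ large enough, $\partial^\alpha(f * \eta_n) = (\partial^\alpha f) * \eta_n$ on $C$ for $|\alpha| = N + 1$. Now use Taylor's theorem to conclude there exists a polynomial $p_n$ of degree at most $N$ such that $f * \eta_n = p_n$ on $C$.
2. Show $p := \lim_{n\to\infty} p_n$ exists on $C$ and then let $n \to \infty$ in step 1. to show there exists a polynomial $p$ of degree at most $N$ such that $f = p$ a.e. on $C$.
3. Use Taylor's theorem to show if $p$ and $q$ are two polynomials on $\mathbb{R}^d$ which agree on an open set then $p = q$.
4. Finish the proof with a connectedness argument using the results of steps 2. and 3. above.
Exercise 26.11. Suppose $\Omega \subset_o \mathbb{R}^d$ and $v, w \in \mathbb{R}^d$. Assume $f \in L^1_{loc}(\Omega)$ and that $\partial_v \partial_w f$ exists weakly in $L^1_{loc}(\Omega)$, show $\partial_w \partial_v f$ also exists weakly and $\partial_w \partial_v f = \partial_v \partial_w f$.
Exercise 26.12. Let $d = 2$ and $f(x, y) = 1_{x \ge 0}$. Show $\partial^{(1,1)} f = 0$ weakly in $L^1_{loc}$ despite the fact that $\partial_1 f$ does not exist weakly in $L^1_{loc}$!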
27 Bochner Integral
Part VII Construction and Differentiation of Measures
28 Examples of Measures
In this chapter we are going to state a couple of construction theorems for
measures. The proofs of these theorems will be deferred until the next chapter,
also see Chapter 32. Our goal in this chapter is to apply these construction
theorems to produce a fairly broad class of examples of measures.
28.1 Extending Premeasures to Measures
Throughout this chapter, $X$ will be a given set which will often be taken to be a locally compact Hausdorff space.
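Definition 28.1. Suppose that $\mathcal{E} \subset 2^X$ is a collection of subsets of $X$ and $\mu : \mathcal{E} \to [0, \infty]$ is a function. Then
1. $\mu$ is additive or finitely additive on $\mathcal{E}$ if
\[ \mu(E) = \sum_{i=1}^n \mu(E_i) \tag{28.1} \]
whenever $E = \coprod_{i=1}^n E_i \in \mathcal{E}$ with $E_i \in \mathcal{E}$ for $i = 1, 2, \dots, n < \infty$. If in addition $\mathcal{E} = \mathcal{A}$ is an algebra and $\mu(\emptyset) = 0$, then $\mu$ is a finitely additive measure.
2. $\mu$ is $\sigma$-additive (or countably additive) on $\mathcal{E}$ if item 1. holds even when $n = \infty$. If in addition $\mathcal{E} = \mathcal{A}$ is an algebra and $\mu(\emptyset) = 0$, then $\mu$ is called a premeasure on $\mathcal{A}$.
3. $\mu$ is $\sigma$-sub-additive (finitely sub-additive) on $\mathcal{E}$ if
\[ \mu(E) \le \sum_{i=1}^n \mu(E_i) \]
whenever $E = \bigcup_{i=1}^n E_i \in \mathcal{E}$ with $n \in \mathbb{N} \cup \{\infty\}$ ($n \in \mathbb{N}$).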
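Theorem 28.2. Suppose that $\mathcal{E} \subset 2^X$ is an elementary family (Definition 18.8), $\mathcal{A} = \mathcal{A}(\mathcal{E})$ is the algebra generated by $\mathcal{E}$ (see Proposition 18.10) and $\mu : \mathcal{E} \to [0, \infty]$ is a function such that $\mu(\emptyset) = 0$.
1. If $\mu$ is additive on $\mathcal{E}$, then $\mu$ has a unique extension to a finitely additive measure on $\mathcal{A}$ which will still be denoted by $\mu$.
2. If $\mu$ is also countably sub-additive on $\mathcal{E}$, then $\mu$ is a premeasure on $\mathcal{A}$.
3. If $\mu$ is a premeasure on $\mathcal{A}$ then
\begin{align*}
\bar\mu(A) &= \inf\Big\{ \sum_{n=1}^\infty \mu(E_n) : A \subset \coprod_{n=1}^\infty E_n \text{ with } E_n \in \mathcal{E} \Big\} \tag{28.2} \\
&= \inf\Big\{ \sum_{n=1}^\infty \mu(E_n) : A \subset \bigcup_{n=1}^\infty E_n \text{ with } E_n \in \mathcal{E} \Big\} \tag{28.3}
\end{align*}
extends $\mu$ to a measure $\bar\mu$ on $\sigma(\mathcal{A}) = \sigma(\mathcal{E})$.
4. If we further assume $\mu$ is $\sigma$-finite on $\mathcal{E}$, then $\bar\mu$ is the unique measure on $\sigma(\mathcal{E})$ such that $\bar\mu|_{\mathcal{E}} = \mu$.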
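Proof. Item 1. is Proposition 31.3, item 2. is Proposition 31.5, item 3. is contained in Theorem 31.18 (or see Theorems 31.15 or 32.41 for the $\sigma$-finite case) and item 4. is a consequence of Theorem 19.55. The equivalence of Eqs. (28.2) and (28.3) requires a little comment.
Suppose $\bar\mu$ is defined by Eq. (28.2) and $A \subset \bigcup_{n=1}^\infty E_n$ with $E_n \in \mathcal{E}$ and let $\tilde E_n := E_n \setminus (E_1 \cup \cdots \cup E_{n-1}) \in \mathcal{A}(\mathcal{E})$, where $E_0 := \emptyset$. Then $A \subset \coprod_{n=1}^\infty \tilde E_n$ and by Proposition 18.10 $\tilde E_n = \coprod_{j=1}^{N_n} E_{n,j}$ for some $E_{n,j} \in \mathcal{E}$. Therefore, $A \subset \coprod_{n=1}^\infty \coprod_{j=1}^{N_n} E_{n,j}$ and hence
\[ \bar\mu(A) \le \sum_{n=1}^\infty \sum_{j=1}^{N_n} \mu(E_{n,j}) = \sum_{n=1}^\infty \mu(\tilde E_n) \le \sum_{n=1}^\infty \mu(E_n), \]
which easily implies the equality in Eq. (28.3).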
Example 28.3. The uniqueness assertion in item 4. of Theorem 28.2 may fail if the $\sigma$-finiteness assumption is dropped. For example, let $X = \mathbb{R}$ and $\mathcal{A}$ denote the algebra generated by
\[ \mathcal{E} := \{ (a, b] \cap \mathbb{R} : -\infty \le a \le b \le \infty \}. \]
Then each of the following three distinct measures on $\mathcal{B}_{\mathbb{R}}$ restrict to the same premeasure on $\mathcal{A}$;
1. $\mu_1 = \infty$ except on the empty set,
2. $\mu_2$ is counting measure, and
3. $\mu_3(A) = \mu_2(A \cap D)$ where $D$ is any dense subset of $\mathbb{R}$.
The next exercise is a minor variant of Remark 19.2 and Proposition 19.3.
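Exercise 28.1. Suppose $\mu : \mathcal{A} \to [0, \infty]$ is a finitely additive measure. Show
1. $\mu$ is a premeasure on $\mathcal{A}$ iff $\mu(A_n) \uparrow \mu(A)$ for all $\{A_n\}_{n=1}^\infty \subset \mathcal{A}$ such that $A_n \uparrow A \in \mathcal{A}$.
2. Further assume $\mu$ is finite (i.e. $\mu(X) < \infty$). Then $\mu$ is a premeasure on $\mathcal{A}$ iff $\mu(A_n) \downarrow 0$ for all $\{A_n\}_{n=1}^\infty \subset \mathcal{A}$ such that $A_n \downarrow \emptyset$.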
28.1.1 Regularity and Density Results
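Definition 28.4. Given a collection of subsets, $\mathcal{E}$, of $X$, let $\mathcal{E}_\sigma$ denote the collection of subsets of $X$ which are finite or countable unions of sets from $\mathcal{E}$. Similarly let $\mathcal{E}_\delta$ denote the collection of subsets of $X$ which are finite or countable intersections of sets from $\mathcal{E}$. We also write $\mathcal{E}_{\sigma\delta} = (\mathcal{E}_\sigma)_\delta$ and $\mathcal{E}_{\delta\sigma} = (\mathcal{E}_\delta)_\sigma$, etc.
Lemma 28.5. Suppose that $\mathcal{A} \subset 2^X$ is an algebra. Then:
1. $\mathcal{A}_\sigma$ is closed under taking countable unions and finite intersections.
2. $\mathcal{A}_\delta$ is closed under taking countable intersections and finite unions.
3. $\{ A^c : A \in \mathcal{A}_\sigma \} = \mathcal{A}_\delta$ and $\{ A^c : A \in \mathcal{A}_\delta \} = \mathcal{A}_\sigma$.
Proof. By construction $\mathcal{A}_\sigma$ is closed under countable unions. Moreover if $A = \bigcup_{i=1}^\infty A_i$ and $B = \bigcup_{j=1}^\infty B_j$ with $A_i, B_j \in \mathcal{A}$, then
\[ A \cap B = \bigcup_{i,j=1}^\infty A_i \cap B_j \in \mathcal{A}_\sigma, \]
which shows that $\mathcal{A}_\sigma$ is also closed under finite intersections. Item 3. is straightforward and item 2. follows from items 1. and 3.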
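Theorem 28.6 (Regularity Theorem). Suppose that $\mu$ is a $\sigma$-finite premeasure on an algebra $\mathcal{A}$, $\bar\mu$ is the extension described in Theorem 28.2 and $B \in \sigma(\mathcal{A})$. Then:
1. $\bar\mu(B) = \inf\{ \bar\mu(C) : B \subset C \in \mathcal{A}_\sigma \}$.
2. For any $\varepsilon > 0$ there exists $A \subset B \subset C$ such that $A \in \mathcal{A}_\delta$, $C \in \mathcal{A}_\sigma$ and $\bar\mu(C \setminus A) < \varepsilon$.
3. There exists $A \subset B \subset C$ such that $A \in \mathcal{A}_{\delta\sigma}$, $C \in \mathcal{A}_{\sigma\delta}$ and $\bar\mu(C \setminus A) = 0$.
Proof. 1. The first item is an easy consequence of the third item in Theorem 28.2 with $\mathcal{A} = \mathcal{E}$.
2. Let $X_m \in \mathcal{A}$ such that $\mu(X_m) < \infty$ and $X_m \uparrow X$ as $m \to \infty$ and let $B_m := X_m \cap B$. Then by item 1., there exists $C_m \in \mathcal{A}_\sigma$ such that $B_m \subset C_m$ and $\bar\mu(C_m \setminus B_m) < \varepsilon 2^{-(m+1)}$. So, letting $C = \bigcup_{m=1}^\infty C_m$, $C \in \mathcal{A}_\sigma$ and
\[ \bar\mu(C \setminus B) \le \sum_{m=1}^\infty \bar\mu(C_m \setminus B) \le \sum_{m=1}^\infty \bar\mu(C_m \setminus B_m) < \frac{\varepsilon}{2}. \]
Applying this result to $B^c$ implies there exists $D \in \mathcal{A}_\sigma$ such that $B^c \subset D$ and
\[ \bar\mu(B \setminus D^c) = \bar\mu(D \setminus B^c) < \frac{\varepsilon}{2}. \]
Therefore if we let $A := D^c \in \mathcal{A}_\delta$, then $A \subset B$ and $\bar\mu(B \setminus A) < \varepsilon/2$ and therefore
\[ \bar\mu(C \setminus A) = \bar\mu(B \setminus A) + \bar\mu(C \setminus B) < \varepsilon. \]
3. By item 2 there exist $A_m \subset B \subset C_m$ with $C_m \in \mathcal{A}_\sigma$, $A_m \in \mathcal{A}_\delta$ and $\bar\mu(C_m \setminus A_m) < 1/m$ for all $m$. Letting $A := \bigcup_{m=1}^\infty A_m \in \mathcal{A}_{\delta\sigma}$ and $C := \bigcap_{m=1}^\infty C_m \in \mathcal{A}_{\sigma\delta}$, we have
\[ \bar\mu(C \setminus A) \le \bar\mu(C_m \setminus A_m) \to 0 \text{ as } m \to \infty. \]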
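Remark 28.7. Using this result we may recover Corollary 22.18 and Theorem 22.14 which state, under the assumptions of Theorem 28.6;
1. for every $\varepsilon > 0$ and $B \in \sigma(\mathcal{A})$ such that $\bar\mu(B) < \infty$, there exists $D \in \mathcal{A}$ such that $\bar\mu(B \triangle D) < \varepsilon$.
2. $S_f(\mathcal{A}, \mu)$ is dense in $L^p(\bar\mu)$ for all $1 \le p < \infty$.
Indeed by Theorem 28.6 (also see Corollary 33.10), there exists $C \in \mathcal{A}_\sigma$ such that $B \subset C$ and $\bar\mu(C \setminus B) < \varepsilon$. Now write $C = \bigcup_{n=1}^\infty C_n$ with $C_n \in \mathcal{A}$ for each $n$. By replacing $C_n$ by $\bigcup_{k=1}^n C_k \in \mathcal{A}$ if necessary, we may assume that $C_n \uparrow C$ as $n \to \infty$. Since $C_n \setminus B \uparrow C \setminus B$, $B \setminus C_n \downarrow B \setminus C = \emptyset$ as $n \to \infty$, and $\bar\mu(B \setminus C_1) \le \bar\mu(B) < \infty$, we know that
\[ \lim_{n\to\infty} \bar\mu(C_n \setminus B) = \bar\mu(C \setminus B) < \varepsilon \text{ and } \lim_{n\to\infty} \bar\mu(B \setminus C_n) = \bar\mu(B \setminus C) = 0. \]
Hence for $n$ sufficiently large,
\[ \bar\mu(B \triangle C_n) = \bar\mu(C_n \setminus B) + \bar\mu(B \setminus C_n) < \varepsilon. \]
Hence we are done with the first item by taking $D = C_n \in \mathcal{A}$ for an $n$ sufficiently large.
For the second item, notice that
\[ \int_X |1_B - 1_D|^p\,d\bar\mu = \bar\mu(B \triangle D) < \varepsilon \tag{28.4} \]
from which it easily follows that any simple function in $S_f(\sigma(\mathcal{A}), \bar\mu)$ may be approximated arbitrarily well by an element from $S_f(\mathcal{A}, \mu)$. This completes the proof of item 2. since $S_f(\sigma(\mathcal{A}), \bar\mu)$ is dense in $L^p(\bar\mu)$ by Lemma 22.3.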
28.2 The Riesz-Markov Theorem
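Now suppose that $X$ is a locally compact Hausdorff space and $\mathcal{B} = \mathcal{B}_X$ is the Borel $\sigma$-algebra on $X$. Open subsets of $\mathbb{R}^d$ and locally compact separable metric spaces are examples of such spaces, see Section 14.3.
Definition 28.8. A linear functional $I$ on $C_c(X)$ is positive if $I(f) \ge 0$ for all $f \in C_c(X, [0, \infty))$.
Proposition 28.9. If $I$ is a positive linear functional on $C_c(X)$ and $K$ is a compact subset of $X$, then there exists $C_K < \infty$ such that $|I(f)| \le C_K \|f\|_\infty$ for all $f \in C_c(X)$ with $\operatorname{supp}(f) \subset K$.
Proof. By Urysohn's Lemma 15.8, there exists $\phi \in C_c(X, [0, 1])$ such that $\phi = 1$ on $K$. Then for all $f \in C_c(X, \mathbb{R})$ such that $\operatorname{supp}(f) \subset K$, $|f| \le \|f\|_\infty \phi$ or equivalently $\|f\|_\infty \phi \pm f \ge 0$. Hence $\|f\|_\infty I(\phi) \pm I(f) \ge 0$, which is to say $|I(f)| \le \|f\|_\infty I(\phi)$. Letting $C_K := I(\phi)$, we have shown that $|I(f)| \le C_K \|f\|_\infty$ for all $f \in C_c(X, \mathbb{R})$ with $\operatorname{supp}(f) \subset K$. For general $f \in C_c(X, \mathbb{C})$ with $\operatorname{supp}(f) \subset K$, choose $|\alpha| = 1$ such that $\alpha I(f) \ge 0$. Then
\[ |I(f)| = \alpha I(f) = I(\alpha f) = I(\operatorname{Re}(\alpha f)) \le C_K \| \operatorname{Re}(\alpha f) \|_\infty \le C_K \|f\|_\infty. \]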
Example 28.10. If $\mu$ is a K-finite measure on $X$, then
\[ I_\mu(f) = \int_X f\,d\mu \quad \forall f \in C_c(X) \]
defines a positive linear functional on $C_c(X)$. In the future, we will often simply write $\mu(f)$ for $I_\mu(f)$.
The Riesz-Markov Theorem 28.16 below asserts that every positive linear functional on $C_c(X)$ comes from a K-finite measure $\mu$.
Example 28.11. Let $X = \mathbb{R}$ and $\tau = \tau_d = 2^X$ be the discrete topology on $X$. Now let $\mu(A) = 0$ if $A$ is countable and $\mu(A) = \infty$ otherwise. Since $K \subset X$ is compact iff $\#(K) < \infty$, $\mu$ is a K-finite measure on $X$ and
\[ I_\mu(f) = \int_X f\,d\mu = 0 \text{ for all } f \in C_c(X). \]
This shows that the correspondence $\mu \to I_\mu$ from K-finite measures to positive linear functionals on $C_c(X)$ is not injective without further restriction.
Definition 28.12. Suppose that $\mu$ is a Borel measure on $X$ and $B \in \mathcal{B}_X$. We say $\mu$ is inner regular on $B$ if
\[ \mu(B) = \sup\{ \mu(K) : K \sqsubset\sqsubset B \} \tag{28.5} \]
and $\mu$ is outer regular on $B$ if
\[ \mu(B) = \inf\{ \mu(U) : B \subset U \subset_o X \}. \tag{28.6} \]
The measure $\mu$ is said to be a regular Borel measure on $X$, if it is both inner and outer regular on all Borel measurable subsets of $X$.
Definition 28.13. A measure $\mu : \mathcal{B}_X \to [0, \infty]$ is a Radon measure on $X$ if $\mu$ is a K-finite measure which is inner regular on all open subsets of $X$ and outer regular on all Borel subsets of $X$.
The measure in Example 28.11 is an example of a K-finite measure on $X$ which is not a Radon measure on $X$.
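Example 28.14. If the topology on a set, $X$, is the discrete topology, then a measure $\mu$ on $\mathcal{B}_X$ is a Radon measure iff $\mu$ is of the form
\[ \mu = \sum_{x \in X} \mu_x \delta_x \tag{28.7} \]
where $\mu_x \in [0, \infty)$ for all $x \in X$. To verify this first notice that $\mathcal{B}_X = \tau_X = 2^X$ and hence every measure on $\mathcal{B}_X$ is necessarily outer regular on all subsets of $X$. The measure $\mu$ is K-finite iff $\mu_x := \mu(\{x\}) < \infty$ for all $x \in X$. If $\mu$ is a Radon measure, then for $A \subset X$ we have, by inner regularity,
\[ \mu(A) = \sup\{ \mu(\Lambda) : \Lambda \sqsubset\sqsubset A \} = \sup\Big\{ \sum_{x \in \Lambda} \mu_x : \Lambda \sqsubset\sqsubset A \Big\} = \sum_{x \in A} \mu_x. \]
On the other hand if $\mu$ is given by Eq. (28.7) and $A \subset X$, then
\[ \mu(A) = \sum_{x \in A} \mu_x = \sup\Big\{ \mu(\Lambda) = \sum_{x \in \Lambda} \mu_x : \Lambda \sqsubset\sqsubset A \Big\} \]
showing $\mu$ is inner regular on all (open) subsets of $X$.
Recall from Definition 14.26 that if $U$ is an open subset of $X$, we write $f \prec U$ to mean that $f \in C_c(X, [0, 1])$ with $\operatorname{supp}(f) := \overline{\{f \neq 0\}} \subset U$.
Notation 28.15 Given a positive linear functional, $I$, on $C_c(X)$ define $\mu = \mu_I$ on $\mathcal{B}_X$ by
\[ \mu(U) = \sup\{ I(f) : f \prec U \} \tag{28.8} \]
for all $U \subset_o X$ and then define
\[ \mu(B) = \inf\{ \mu(U) : B \subset U \text{ and } U \text{ is open} \}. \tag{28.9} \]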
Theorem 28.16 (Riesz-Markov Theorem). The map $\mu \to I_\mu$ taking Radon measures on $X$ to positive linear functionals on $C_c(X)$ is bijective. Moreover if $I$ is a positive linear functional on $C_c(X)$, the function $\mu := \mu_I$ defined in Notation 28.15 has the following properties.
1. $\mu$ is a Radon measure on $X$ and the map $I \to \mu_I$ is the inverse to the map $\mu \to I_\mu$.
2. For all compact subsets $K \subset X$,
\[ \mu(K) = \inf\{ I(f) : 1_K \le f \prec X \}. \tag{28.10} \]
3. If $\|I_\mu\|$ denotes the dual norm of $I = I_\mu$ on $C_c(X, \mathbb{R})^*$, then $\|I\| = \mu(X)$. In particular, the linear functional, $I_\mu$, is bounded iff $\mu(X) < \infty$.
Proof. The proof of the surjectivity of the map $\mu \to I_\mu$ and the assertion in item 1. is the content of Theorem 31.21 below.
Injectivity of $\mu \to I_\mu$. Suppose that $\mu$ is a Radon measure on $X$. To each open subset $U \subset X$ let
\[ \mu_0(U) := \sup\{ I_\mu(f) : f \prec U \}. \tag{28.11} \]
It is evident that $\mu_0(U) \le \mu(U)$ because $f \prec U$ implies $f \le 1_U$. Given a compact subset $K \subset U$, Urysohn's Lemma 15.8 implies there exists $f \prec U$ such that $f = 1$ on $K$. Therefore,
\[ \mu(K) \le \int_X f\,d\mu \le \mu_0(U) \le \mu(U). \tag{28.12} \]
By assumption $\mu$ is inner regular on open sets, and therefore taking the supremum of Eq. (28.12) over compact subsets, $K$, of $U$ shows
\[ \mu(U) = \mu_0(U) = \sup\{ I_\mu(f) : f \prec U \}. \tag{28.13} \]
If $\mu$ and $\nu$ are two Radon measures such that $I_\mu = I_\nu$, then by Eq. (28.13) it follows that $\mu = \nu$ on all open sets. Then by outer regularity, $\mu = \nu$ on $\mathcal{B}_X$ and this shows the map $\mu \to I_\mu$ is injective.
Item 2. Let $K \subset X$ be a compact set, then by monotonicity of the integral,
\[ \mu(K) \le \inf\{ I_\mu(f) : f \in C_c(X) \text{ with } f \ge 1_K \}. \tag{28.14} \]
To prove the reverse inequality, choose, by outer regularity, $U \subset_o X$ such that $K \subset U$ and $\mu(U \setminus K) < \varepsilon$. By Urysohn's Lemma 15.8 there exists $f \prec U$ such that $f = 1$ on $K$ and hence,
\[ I_\mu(f) = \int_X f\,d\mu = \mu(K) + \int_{U \setminus K} f\,d\mu \le \mu(K) + \mu(U \setminus K) < \mu(K) + \varepsilon. \]
Consequently,
\[ \inf\{ I_\mu(f) : f \in C_c(X) \text{ with } f \ge 1_K \} < \mu(K) + \varepsilon \]
and because $\varepsilon > 0$ was arbitrary, the reverse inequality in Eq. (28.14) holds and Eq. (28.10) is verified.
Item 3. If $f \in C_c(X)$, then
\[ |I_\mu(f)| \le \int_X |f|\,d\mu = \int_{\operatorname{supp}(f)} |f|\,d\mu \le \|f\|_\infty \mu(\operatorname{supp}(f)) \le \|f\|_\infty \mu(X) \tag{28.15} \]
and thus $\|I_\mu\| \le \mu(X)$. For the reverse inequality let $K$ be a compact subset of $X$ and use Urysohn's Lemma 15.8 again to find a function $f \prec X$ such that $f = 1$ on $K$. By Eq. (28.12) we have
\[ \mu(K) \le \int_X f\,d\mu = I_\mu(f) \le \|I_\mu\| \, \|f\|_\infty = \|I_\mu\|, \]
which by the inner regularity of $\mu$ on open sets implies
\[ \mu(X) = \sup\{ \mu(K) : K \sqsubset\sqsubset X \} \le \|I_\mu\|. \]
Example 28.17 (Discrete Version of Theorem 28.16). Suppose $X$ is a set, $\tau = 2^X$ is the discrete topology on $X$ and for $x \in X$, let $e_x \in C_c(X)$ be defined by $e_x(y) = 1_{\{x\}}(y)$. Let $I$ be a positive linear functional on $C_c(X)$ and define a Radon measure, $\mu$, on $X$ by
\[ \mu(A) := \sum_{x \in A} I(e_x) \text{ for all } A \subset X. \]
Then for $f \in C_c(X)$ (so $f$ is a complex valued function on $X$ supported on a finite set),
\[ \int_X f\,d\mu = \sum_{x \in X} f(x) I(e_x) = I\Big( \sum_{x \in X} f(x) e_x \Big) = I(f), \]
so that $I = I_\mu$. It is easy to see in this example that $\mu$ defined above is the unique regular Radon measure on $X$ such that $I = I_\mu$ while Example 28.11 shows the uniqueness is lost if the regularity assumption is dropped.
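The construction in Example 28.17 is concrete enough to run on a finite set. The following sketch is not from the original text: it takes a hypothetical four-point set and a hypothetical positive functional $I$ given by non-negative weights, recovers $\mu$ from the values $I(e_x)$, and checks that integrating against $\mu$ reproduces $I$. Exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction

# Hypothetical data: a four-point discrete space and non-negative weights
# defining a positive linear functional I(f) = sum_x weights[x] * f(x).
weights = {"a": Fraction(1, 2), "b": Fraction(0), "c": Fraction(3), "d": Fraction(1, 4)}


def I(f):
    """Positive linear functional on finitely supported f (dict: point -> value)."""
    return sum(weights[x] * value for x, value in f.items())


def e(x):
    """The indicator function e_x of the single point x."""
    return {x: 1}


def mu(A):
    """The measure mu(A) = sum over x in A of I(e_x)."""
    return sum(I(e(x)) for x in A)


def integral(f):
    """Integral of f against mu, computed from the point masses mu({x})."""
    return sum(value * mu({x}) for x, value in f.items())


f = {"a": 2, "c": -1, "d": 4}
print(integral(f), I(f), mu({"a", "c"}))
```

Here `integral(f)` and `I(f)` coincide, which is the identity $I = I_\mu$ of the example in this finite setting.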
28.2.1 Regularity Results For Radon Measures
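Proposition 28.18. If $\mu$ is a Radon measure on $X$ then $\mu$ is inner regular on all $\sigma$-finite Borel sets.
Proof. Suppose $A \in \mathcal{B}_X$ and $\mu(A) < \infty$ and $\varepsilon > 0$ is given. By outer regularity of $\mu$, there exists an open set $U \subset_o X$ such that $A \subset U$ and $\mu(U \setminus A) < \varepsilon$. By inner regularity on open sets, there exists a compact set $F \sqsubset\sqsubset U$ such that $\mu(U \setminus F) < \varepsilon$. Again by outer regularity of $\mu$, there exists $V \subset_o X$ such that $(U \setminus A) \subset V$ and $\mu(V) < \varepsilon$. Then $K := F \setminus V$ is a compact set and
\[ K \subset F \setminus (U \setminus A) = F \cap (U \cap A^c)^c = F \cap (U^c \cup A) = F \cap A, \]
see Figure 28.1.
Fig. 28.1. Constructing the compact set $K$.
Since,
\[ \mu(K) = \mu(F) - \mu(F \cap V) \cong \mu(U) \cong \mu(A), \]
or more formally,
\[ \mu(K) = \mu(F) - \mu(F \cap V) \ge \mu(U) - \varepsilon - \mu(F \cap V) \ge \mu(U) - 2\varepsilon \ge \mu(A) - 3\varepsilon, \]
we see that $\mu(A \setminus K) \le 3\varepsilon$. This proves the proposition when $\mu(A) < \infty$.
Now suppose $\mu(A) = \infty$ and there exist $A_n \uparrow A$ as $n \to \infty$ with $\mu(A_n) < \infty$. Then by the first part, there exist compact sets $K_n$ such that $K_n \subset A_n$ and $\mu(A_n \setminus K_n) < 1/n$ or equivalently $\mu(K_n) > \mu(A_n) - 1/n \to \infty$ as $n \to \infty$.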
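Corollary 28.19. Every $\sigma$-finite Radon measure, $\mu$, is a regular Borel measure, i.e. $\mu$ is both outer and inner regular on all Borel subsets.
Notation 28.20 If $(X, \tau)$ is a topological space, let $F_\sigma$ denote the collection of sets formed by taking countable unions of closed sets and $G_\delta = \tau_\delta$ denote the collection of sets formed by taking countable intersections of open sets.
Proposition 28.21. Suppose that $\mu$ is a $\sigma$-finite Radon measure and $B \in \mathcal{B}$. Then
1. For all $\varepsilon > 0$ there exist sets $F \subset B \subset U$ with $F$ closed, $U$ open and $\mu(U \setminus F) < \varepsilon$.
2. There exists $A \in F_\sigma$ and $C \in G_\delta$ such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$.
Proof. 1. Let $X_n \in \mathcal{B}$ such that $X_n \uparrow X$ and $\mu(X_n) < \infty$ and choose open sets $U_n$ such that $B \cap X_n \subset U_n$ and $\mu(U_n \setminus (B \cap X_n)) < \varepsilon 2^{-(n+1)}$. Then $U := \bigcup_{n=1}^\infty U_n$ is an open set such that
\[ \mu(U \setminus B) \le \sum_{n=1}^\infty \mu(U_n \setminus B) \le \sum_{n=1}^\infty \mu(U_n \setminus (B \cap X_n)) < \frac{\varepsilon}{2}. \]
Applying this same result to $B^c$ allows us to find a closed set $F$ such that $B^c \subset F^c$ and
\[ \mu(B \setminus F) = \mu(F^c \setminus B^c) < \frac{\varepsilon}{2}. \]
Thus $F \subset B \subset U$ and $\mu(U \setminus F) < \varepsilon$ as desired.
2. This is a simple consequence of item 1.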
Theorem 28.22. Let $X$ be a locally compact Hausdorff space such that every open set $V \subset_o X$ is $\sigma$-compact, i.e. there exists $K_n \sqsubset\sqsubset V$ such that $V = \bigcup_n K_n$. Then any K-finite measure $\nu$ on $X$ is a Radon measure and in fact is a regular Borel measure. (The reader should check that if $X$ is second countable, then open sets are $\sigma$-compact, see Exercise 14.8. In particular this condition holds for $\mathbb{R}^n$ with the standard topology.)
Proof. By the Riesz-Markov Theorem 28.16, the positive linear functional,
\[ I(f) := \int_X f\,d\nu \text{ for all } f \in C_c(X), \]
may be represented by a Radon measure $\mu$ on $(X, \mathcal{B})$, i.e. such that $I(f) = \int_X f\,d\mu$ for all $f \in C_c(X)$. By Corollary 28.19, $\mu$ is also a regular Borel measure on $(X, \mathcal{B})$. So to finish the proof it suffices to show $\nu = \mu$. We will give two proofs of this statement.
First Proof. The same arguments used in the proof of Lemma 18.57 show $\sigma(C_c(X)) = \mathcal{B}_X$. Let $K$ be a compact subset of $X$ and use Urysohn's Lemma 15.8 to find $\phi \prec X$ such that $\phi \ge 1_K$. By a simple application of the multiplicative system Theorem 18.51 one shows
\[ \int_X \phi f\,d\nu = \int_X \phi f\,d\mu \]
for all bounded $\mathcal{B}_X = \sigma(C_c(X))$-measurable functions on $X$. Taking $f = 1_K$ then shows that $\nu(K) = \mu(K)$ with $K \sqsubset\sqsubset X$. An application of Theorem 19.55 implies $\mu = \nu$ on the $\sigma$-algebra generated by the compact sets. This completes the proof, since, by assumption, this $\sigma$-algebra contains all of the open sets and hence is the Borel $\sigma$-algebra.
Second Proof. Since $\mu$ is a Radon measure on $X$, it follows from Eq. (28.13), that
\[ \mu(U) = \sup\Big\{ \int_X f\,d\mu : f \prec U \Big\} = \sup\Big\{ \int_X f\,d\nu : f \prec U \Big\} \le \nu(U) \tag{28.16} \]
for all open subsets $U$ of $X$. For each compact subset $K \subset U$, there exists, by Urysohn's Lemma 15.8, a function $f \prec U$ such that $f \ge 1_K$. Thus
\[ \nu(K) \le \int_X f\,d\nu = \int_X f\,d\mu \le \mu(U). \tag{28.17} \]
Combining Eqs. (28.16) and (28.17) implies $\nu(K) \le \mu(U) \le \nu(U)$. By assumption there exist compact sets, $K_n \subset U$, such that $K_n \uparrow U$ as $n \to \infty$ and therefore by continuity of $\nu$,
\[ \nu(U) = \lim_{n\to\infty} \nu(K_n) \le \mu(U) \le \nu(U). \]
Hence we have shown, $\nu(U) = \mu(U)$ for all $U \in \tau$.
If $B \in \mathcal{B} = \mathcal{B}_X$ and $\varepsilon > 0$, by Proposition 28.21, there exists $F \subset B \subset U$ such that $F$ is closed, $U$ is open and $\mu(U \setminus F) < \varepsilon$. Since $U \setminus F$ is open, $\nu(U \setminus F) = \mu(U \setminus F) < \varepsilon$ and therefore
\[ \nu(U) - \varepsilon \le \nu(B) \le \nu(U) \quad \text{and} \quad \mu(U) - \varepsilon \le \mu(B) \le \mu(U). \]
Since $\nu(U) = \mu(U)$, $\nu(B) = \infty$ iff $\mu(B) = \infty$ and if $\nu(B) < \infty$ then $|\nu(B) - \mu(B)| < \varepsilon$. Because $\varepsilon > 0$ is arbitrary, we may conclude that $\nu(B) = \mu(B)$ for all $B \in \mathcal{B}$.
Proposition 28.23 (Density of $C_c(X)$ in $L^p(\mu)$). If $\mu$ is a Radon measure on $X$, then $C_c(X)$ is dense in $L^p(\mu)$ for all $1 \le p < \infty$.
Proof. Let $\varepsilon > 0$ and $B \in \mathcal{B}_X$ with $\mu(B) < \infty$. By Proposition 28.18, there exists $K \sqsubset\sqsubset B \subset U \subset_o X$ such that $\mu(U \setminus K) < \varepsilon^p$ and by Urysohn's Lemma 15.8, there exists $f \prec U$ such that $f = 1$ on $K$. This function $f$ satisfies
\[ \| f - 1_B \|_p^p = \int_X |f - 1_B|^p\,d\mu \le \int_{U \setminus K} |f - 1_B|^p\,d\mu \le \mu(U \setminus K) < \varepsilon^p. \]
From this it is easy to conclude that the $L^p$-closure of $C_c(X)$ contains $S_f(\mathcal{B}, \mu)$, the simple functions on $X$ which are in $L^1(\mu)$. Combining this with Lemma 22.3, which asserts that $S_f(\mathcal{B}, \mu)$ is dense in $L^p(\mu)$, completes the proof of the theorem.
Theorem 28.24 (Lusin's Theorem). Suppose $(X,\tau)$ is a locally compact Hausdorff space, $\mathcal{B}_X$ is the Borel $\sigma$-algebra on $X$, and $\mu$ is a Radon measure on $(X,\mathcal{B}_X)$. Also let $\varepsilon>0$ be given. If $f:X\to\mathbb{C}$ is a measurable function such that $\mu(f\ne0)<\infty$, there exists a compact set $K\subset\{f\ne0\}$ such that $f|_K$ is continuous and $\mu(\{f\ne0\}\setminus K)<\varepsilon$. Moreover there exists $\varphi\in C_c(X)$ such that $\mu(f\ne\varphi)<\varepsilon$ and if $f$ is bounded the function $\varphi$ may be chosen so that
\[ \|\varphi\|_u\le\|f\|_u:=\sup_{x\in X}|f(x)|. \]
Proof. Suppose first that $f$ is bounded, in which case
\[ \int_X|f|\,d\mu\le\|f\|_u\,\mu(f\ne0)<\infty. \]
By Proposition 28.23, there exist $f_n\in C_c(X)$ such that $f_n\to f$ in $L^1(\mu)$ as $n\to\infty$. By passing to a subsequence if necessary, we may assume $\|f-f_n\|_1<\varepsilon n^{-1}2^{-n}$ and hence by Chebyshev's inequality (Lemma 19.17),
\[ \mu\left(|f-f_n|>n^{-1}\right)<\varepsilon2^{-n}\ \text{for all } n. \]
Let $E:=\bigcup_{n=1}^\infty\{|f-f_n|>n^{-1}\}$, so that $\mu(E)<\varepsilon$. On $E^c$, $|f-f_n|\le1/n$, i.e. $f_n\to f$ uniformly on $E^c$ and hence $f|_{E^c}$ is continuous. By Proposition 28.18, there exist a compact set $K$ and an open set $V$ such that
\[ K\subset\{f\ne0\}\setminus E\subset V \]
and $\mu(V\setminus K)<\varepsilon$. Notice that
\[ \mu(\{f\ne0\}\setminus K)=\mu((\{f\ne0\}\setminus K)\setminus E)+\mu((\{f\ne0\}\setminus K)\cap E)\le\mu(V\setminus K)+\mu(E)<2\varepsilon. \]
By the Tietze extension Theorem 15.9, there exists $F\in C(X)$ such that $f|_K=F|_K$. By Urysohn's Lemma 15.8 there exists $\psi\prec V$ such that $\psi=1$ on $K$. So letting $\varphi=\psi F\in C_c(X)$, we have $\varphi=f$ on $K$, $\|\varphi\|_\infty\le\|f\|_\infty$ and since $\{\varphi\ne f\}\subset E\cup(V\setminus K)$, $\mu(\varphi\ne f)<3\varepsilon$. This proves the assertions in the theorem when $f$ is bounded.
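Suppose that $f:X\to\mathbb{C}$ is (possibly) unbounded and $\varepsilon>0$ is given. Then $B_N:=\{0<|f|\le N\}\uparrow\{f\ne0\}$ as $N\to\infty$ and therefore for all $N$ sufficiently large,
\[ \mu(\{f\ne0\}\setminus B_N)<\varepsilon/3. \]
Since $\mu$ is a Radon measure, Proposition 28.18 guarantees there is a compact set $C\subset\{f\ne0\}$ such that $\mu(\{f\ne0\}\setminus C)<\varepsilon/3$. Therefore,
\[ \mu(\{f\ne0\}\setminus(B_N\cap C))<2\varepsilon/3. \]
We may now apply the bounded case already proved to the function $1_{B_N\cap C}f$ to find a compact subset $K$ and an open set $V$ such that $K\subset V$,
\[ K\subset\{1_{B_N\cap C}f\ne0\}=B_N\cap C\cap\{f\ne0\}, \]
$\mu((B_N\cap C\cap\{f\ne0\})\setminus K)<\varepsilon/3$, and $\varphi\in C_c(X)$ such that $\varphi=1_{B_N\cap C}f=f$ on $K$. This completes the proof, since
\[ \mu(\{f\ne0\}\setminus K)\le\mu((B_N\cap C\cap\{f\ne0\})\setminus K)+\mu(\{f\ne0\}\setminus(B_N\cap C))<\varepsilon \]
which implies $\mu(f\ne\varphi)<\varepsilon$.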
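Example 28.25. To illustrate Theorem 28.24, suppose that $X=(0,1)$, $\mu=m$ is Lebesgue measure and $f=1_{(0,1)\cap\mathbb{Q}}$. Then Lusin's theorem asserts for any $\varepsilon>0$ there exists a compact set $K\subset(0,1)$ such that $m((0,1)\setminus K)<\varepsilon$ and $f|_K$ is continuous. To see this directly, let $\{r_n\}_{n=1}^\infty$ be an enumeration of the rationals in $(0,1)$,
\[ J_n=(r_n-\varepsilon2^{-n},\,r_n+\varepsilon2^{-n})\cap(0,1)\quad\text{and}\quad W=\bigcup_{n=1}^\infty J_n. \]
Then $W$ is an open subset of $X$ and $m(W)\le\sum_{n=1}^\infty2\varepsilon2^{-n}=2\varepsilon$. Therefore $K_n:=[1/n,1-1/n]\setminus W$ is a compact subset of $X$ and $m(X\setminus K_n)\le\frac{2}{n}+m(W)$. Taking $n$ sufficiently large we have $m(X\setminus K_n)<3\varepsilon$ and $f|_{K_n}\equiv0$, which is of course continuous. Since $\varepsilon>0$ was arbitrary, this is the conclusion of Lusin's theorem.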
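The covering used in Example 28.25 can be checked numerically. The following Python sketch is only an illustration: the particular enumeration of the rationals (by increasing denominator), the truncation to finitely many intervals, and all function names are assumptions that do not come from the text. It computes the exact length of the union of the first 40 intervals $J_n$ and compares it with the bound $2\varepsilon$.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(count):
    """First `count` rationals in (0, 1), listed by increasing denominator."""
    out, q = [], 2
    while len(out) < count:
        for p in range(1, q):
            if gcd(p, q) == 1:
                out.append(Fraction(p, q))
                if len(out) == count:
                    break
        q += 1
    return out

def covering_intervals(eps, count):
    """J_n = (r_n - eps*2^-n, r_n + eps*2^-n) intersected with (0, 1), n = 1..count."""
    intervals = []
    for n, r in enumerate(rationals_in_unit_interval(count), start=1):
        h = eps / 2**n
        intervals.append((max(Fraction(0), r - h), min(Fraction(1), r + h)))
    return intervals

def union_length(intervals):
    """Exact Lebesgue measure of a finite union of open intervals."""
    total, right = Fraction(0), None
    for a, b in sorted(intervals):
        if right is None or a > right:
            total += b - a          # a new connected piece starts here
            right = b
        elif b > right:
            total += b - right      # the interval extends the current piece
            right = b
    return total

eps = Fraction(1, 10)
J = covering_intervals(eps, 40)
m_W = union_length(J)               # measure of the truncated open set W
print(float(m_W), float(2 * eps))
```

Exact rational arithmetic is used so that the comparison with $2\varepsilon$ is not affected by rounding.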
The following result is a slight generalization of Lemma 22.11.
Corollary 28.26. Let $(X,\tau)$ be a locally compact Hausdorff space, $\mu:\mathcal{B}_X\to[0,\infty]$ be a Radon measure on $X$ and $h\in L^1_{loc}(\mu)$. If
\[ \int_X fh\,d\mu=0\ \text{for all } f\in C_c(X) \tag{28.18} \]
then $1_Kh=0$ $\mu$-a.e. for every compact subset $K\subset X$. (BRUCE: either show $h=0$ a.e. or give a counterexample. Also, either prove or give a counterexample to the statement that $d\nu=\rho\,d\mu$ is a Radon measure if $\rho\ge0$ and $\rho$ is in $L^1_{loc}(\mu)$.)
Proof. By considering the real and imaginary parts of $h$ we may assume without loss of generality that $h$ is real valued. Let $K$ be a compact subset of $X$. Then $1_K\operatorname{sgn}(h)\in L^1(\mu)$ and by Proposition 28.23, there exist $f_n\in C_c(X)$ such that $\lim_{n\to\infty}\|f_n-1_K\operatorname{sgn}(h)\|_{L^1(\mu)}=0$. Let $\varphi\in C_c(X,[0,1])$ be such that $\varphi=1$ on $K$ and let $g_n=\varphi\min(1,\max(-1,f_n))$. Since
\[ |g_n-1_K\operatorname{sgn}(h)|\le|f_n-1_K\operatorname{sgn}(h)| \]
we have found $g_n\in C_c(X,\mathbb{R})$ such that $|g_n|\le1_{\operatorname{supp}(\varphi)}$ and $g_n\to1_K\operatorname{sgn}(h)$ in $L^1(\mu)$. By passing to a subsequence if necessary we may assume the convergence happens $\mu$-almost everywhere. Using Eq. (28.18) and the dominated convergence theorem (the dominating function is $|h|1_{\operatorname{supp}(\varphi)}$) we conclude that
\[ 0=\lim_{n\to\infty}\int_Xg_nh\,d\mu=\int_X1_K\operatorname{sgn}(h)h\,d\mu=\int_K|h|\,d\mu \]
which shows $h(x)=0$ for $\mu$-a.e. $x\in K$.
28.2.2 The dual of $C_0(X)$
Definition 28.27. Let $(X,\tau)$ be a locally compact Hausdorff space and $\mathcal{B}=\sigma(\tau)$ be the Borel $\sigma$-algebra. A signed Radon measure is a signed measure $\mu$ on $\mathcal{B}$ such that the measures, $\mu_\pm$, in the Jordan decomposition of $\mu$ are both Radon measures. A complex Radon measure is a complex measure $\mu$ on $\mathcal{B}$ such that $\operatorname{Re}\mu$ and $\operatorname{Im}\mu$ are signed Radon measures.
Example 28.28. Every complex measure on $\mathcal{B}_{\mathbb{R}^d}$ is a Radon measure.
BRUCE: add some more examples and perhaps some exercises here.
BRUCE: Compare and combine with results from Section 32.10.
Proposition 28.29. Suppose $(X,\tau)$ is a topological space and $I\in C_0(X,\mathbb{R})^*$. Then we may write $I=I_+-I_-$ where $I_\pm\in C_0(X,\mathbb{R})^*$ are positive linear functionals.
Proof. For $f\in C_0(X,[0,\infty))$, let
\[ I_+(f):=\sup\{I(g):g\in C_0(X,[0,\infty))\text{ and }g\le f\} \]
and notice that $|I_+(f)|\le\|I\|\,\|f\|$. If $c>0$, then $I_+(cf)=cI_+(f)$. Suppose that $f_1,f_2\in C_0(X,[0,\infty))$ and $g_i\in C_0(X,[0,\infty))$ are such that $g_i\le f_i$; then $g_1+g_2\le f_1+f_2$ so that
\[ I(g_1)+I(g_2)=I(g_1+g_2)\le I_+(f_1+f_2) \]
and therefore
\[ I_+(f_1)+I_+(f_2)\le I_+(f_1+f_2). \tag{28.19} \]
Moreover, if $g\in C_0(X,[0,\infty))$ and $g\le f_1+f_2$, let $g_1=\min(f_1,g)$, so that
\[ 0\le g_2:=g-g_1=\max(g-f_1,0)\le f_2. \]
Hence $I(g)=I(g_1)+I(g_2)\le I_+(f_1)+I_+(f_2)$ for all such $g$ and therefore,
\[ I_+(f_1+f_2)\le I_+(f_1)+I_+(f_2). \tag{28.20} \]
Combining Eqs. (28.19) and (28.20) shows that $I_+(f_1+f_2)=I_+(f_1)+I_+(f_2)$.
For general $f\in C_0(X,\mathbb{R})$, let $I_+(f)=I_+(f_+)-I_+(f_-)$ where $f_+=\max(f,0)$ and $f_-=-\min(f,0)$. (Notice that $f=f_+-f_-$.) If $f=h-g$ with $h,g\in C_0(X,[0,\infty))$, then $g+f_+=h+f_-$ and therefore,
\[ I_+(g)+I_+(f_+)=I_+(h)+I_+(f_-) \]
and hence $I_+(f)=I_+(h)-I_+(g)$. In particular,
\[ I_+(-f)=I_+(f_--f_+)=I_+(f_-)-I_+(f_+)=-I_+(f) \]
so that $I_+(cf)=cI_+(f)$ for all $c\in\mathbb{R}$. Also,
\[ I_+(f+g)=I_+(f_++g_+-(f_-+g_-))=I_+(f_++g_+)-I_+(f_-+g_-) \]
\[ =I_+(f_+)+I_+(g_+)-I_+(f_-)-I_+(g_-)=I_+(f)+I_+(g). \]
Therefore $I_+$ is linear. Moreover,
\[ |I_+(f)|\le\max(|I_+(f_+)|,|I_+(f_-)|)\le\|I\|\max(\|f_+\|,\|f_-\|)=\|I\|\,\|f\| \]
which shows that $\|I_+\|\le\|I\|$. Let $I_-=I_+-I\in C_0(X,\mathbb{R})^*$; then for $f\ge0$,
\[ I_-(f)=I_+(f)-I(f)\ge0 \]
by definition of $I_+$, so $I_-\ge0$ as well.
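A finite toy model may help to see what $I_+$ does. In the sketch below, which is an illustration and not part of the text, $X$ is a finite set, a function is a list of values, and $I(f)=\sum_i w_if_i$ for a hypothetical weight vector `w`. Because $I$ is linear in $g$, the supremum over the box $0\le g\le f$ is attained at a vertex, so it can be found by brute force and compared with the closed form $\sum_i\max(w_i,0)f_i$.

```python
from itertools import product

def I(w, f):
    """The functional I(f) = sum_i w_i f_i on functions over a finite set."""
    return sum(wi * fi for wi, fi in zip(w, f))

def I_plus(w, f):
    """sup{ I(g) : 0 <= g <= f } for f >= 0, searched over the vertices of the box [0, f]."""
    return max(I(w, g) for g in product(*[(0, fi) for fi in f]))

def I_minus(w, f):
    """I_- = I_+ - I, evaluated on f >= 0."""
    return I_plus(w, f) - I(w, f)

w = [2, -3, 1, -1]                  # hypothetical weights of mixed sign
f1 = [1, 2, 0, 4]
f2 = [3, 1, 5, 0]
f_sum = [a + b for a, b in zip(f1, f2)]

print(I_plus(w, f1), I_plus(w, f2), I_plus(w, f_sum), I_minus(w, f1))
```

In this model $I_+$ simply keeps the positive weights and $I_-$ the negative ones, which is the discrete counterpart of a Jordan decomposition.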
Remark 28.30. The above proof works for functionals on linear spaces of bounded functions which are closed under taking $f\vee g$ and $f\wedge g$. As an example, let $\lambda(f)=\int_0^1f(x)\,dx$ for all bounded measurable functions $f:[0,1]\to\mathbb{R}$. By the Hahn-Banach theorem, we may extend $\lambda$ to a linear functional $\Lambda$ on all bounded functions on $[0,1]$ in such a way that $\|\Lambda\|=1$. Let $\Lambda_+$ be as above; then $\Lambda_+=\lambda$ on bounded measurable functions and $\|\Lambda_+\|=1$. Define $\mu(A):=\Lambda_+(1_A)$ for all $A\subset[0,1]$ and notice that if $A$ is measurable, then $\mu(A)=m(A)$. So $\mu$ is a finitely additive extension of $m$ to all subsets of $[0,1]$.
Exercise 28.2. Suppose that $\mu$ is a signed Radon measure and $I=I_\mu$. Let $\mu_+$ and $\mu_-$ be the Radon measures associated to $I_\pm$ with $I_\pm$ being constructed as in the proof of Proposition 28.29. Show that $\mu=\mu_+-\mu_-$ is the Jordan decomposition of $\mu$.
Solution to Exercise (28.2). Let X = P P
c
where P is a positive set for
and P
c
is a negative set. Then for A B
X
,
(P A) =
+
(P A)
(P A)
+
(P A)
+
(A). (28.21)
To nish the proof we need only prove the reverse inequality. To this end let
> 0 and choose K P A U
o
X such that [[ (U K) < . Let
f, g C
c
(U, [0, 1]) with f g, then
I(f) = (f) = (f : K) +(f : U K) (g : K) +O()
(K) +O() (P A) +O() .
Taking the supremum over all such f g, we learn that I
+
(g) (P A) +
O() and then taking the supremum over all such g shows that
+
(U) (P A) +O() .
Taking the inmum over all U
o
X such that P A U shows that
+
(P A) (P A) +O() (28.22)
From Eqs. (28.21) and (28.22) it follows that (P A) =
+
(P A). Since
I
(f) = sup
0gf
I(g)I(f) = sup
0gf
I(g f) = sup
0gf
I(f g) = sup
0hf
I(h)
the same argument applied to I shows that
(P
c
A) =
(P
c
A).
Since
(A) = (P A) +(P
c
A) =
+
(P A)
(P
c
A) and
(A) =
+
(A)
(A)
it follows that
+
(A P) =
(A P
c
) =
(A P).
Taking A = P then shows that
(P) = 0 and taking A = P

c
shows that
+
(P
c
) = 0 and hence
(P A) =
+
(P A) =
+
(A) and
(P
c
A) =
(P
c
A) =
(A)
as was to be proved.
Theorem 28.31. Let $X$ be a locally compact Hausdorff space, $M(X)$ be the space of complex Radon measures on $X$ and for $\mu\in M(X)$ let $\|\mu\|=|\mu|(X)$. Then the map
\[ \mu\in M(X)\to I_\mu\in C_0(X)^* \]
is an isometric isomorphism. Here again $I_\mu(f):=\int_Xf\,d\mu$.
Proof. To show that the map $M(X)\to C_0(X)^*$ is surjective, let $I\in C_0(X)^*$ and write $I=I^{re}+iI^{im}$ for the decomposition into real and imaginary parts. Then further decompose these into their plus and minus parts so
\[ I=I^{re}_+-I^{re}_-+i\left(I^{im}_+-I^{im}_-\right) \]
and let $\mu^{re}_\pm$ and $\mu^{im}_\pm$ be the corresponding positive Radon measures associated to $I^{re}_\pm$ and $I^{im}_\pm$. Then $I=I_\mu$ where
\[ \mu=\mu^{re}_+-\mu^{re}_-+i\left(\mu^{im}_+-\mu^{im}_-\right). \]
To finish the proof it suffices to show $\|I_\mu\|_{C_0(X)^*}=\|\mu\|=|\mu|(X)$. We have
\[ \|I_\mu\|_{C_0(X)^*}=\sup\left\{\left|\int_Xf\,d\mu\right|:f\in C_0(X),\ \|f\|_\infty\le1\right\}\le\sup\left\{\left|\int_Xf\,d\mu\right|:f\text{ measurable and }\|f\|_\infty\le1\right\}=\|\mu\|. \]
To prove the opposite inequality, write $d\mu=g\,d|\mu|$ with $g$ a complex measurable function such that $|g|=1$. By Proposition 28.23, there exist $f_n\in C_c(X)$ such that $f_n\to\bar{g}$ in $L^1(|\mu|)$ as $n\to\infty$. Let $g_n=\varphi(f_n)$ where $\varphi:\mathbb{C}\to\mathbb{C}$ is the continuous function defined by $\varphi(z)=z$ if $|z|\le1$ and $\varphi(z)=z/|z|$ if $|z|\ge1$. Then $|g_n|\le1$ and $g_n\to\bar{g}$ in $L^1(|\mu|)$. Thus
\[ \|\mu\|=|\mu|(X)=\int_Xd|\mu|=\int_X\bar{g}\,d\mu=\lim_{n\to\infty}\int_Xg_n\,d\mu\le\|I_\mu\|_{C_0(X)^*}. \]
Exercise 28.3. Let $(X,\tau)$ be a compact Hausdorff space which supports a positive measure $\nu$ on $\mathcal{B}=\sigma(\tau)$ such that $\nu(X)\ne\sum_{x\in X}\nu(\{x\})$, i.e. $\nu$ is not a counting type measure. (Example: $X=[0,1]$.) Then $C(X)$ is not reflexive.
Hint: recall that $C(X)^*$ is isomorphic to the space of complex Radon measures on $(X,\mathcal{B})$. Using this isomorphism, define $\lambda\in C(X)^{**}$ by
\[ \lambda(\mu)=\sum_{x\in X}\mu(\{x\}) \]
and then show $\lambda\ne\hat{f}$ for any $f\in C(X)$.
Solution to Exercise (28.3). Suppose there exists $f\in C(X)$ such that $\lambda(\mu)=\hat{f}(\mu)=\mu(f)=\int_Xf\,d\mu$ for all complex Radon measures $\mu$. Taking $\mu=\delta_x$ with $x\in X$ then implies
\[ f(x)=\delta_x(f)=\lambda(\delta_x)=\sum_{y\in X}\delta_x(\{y\})=1. \]
This shows $f\equiv1$. However, this $f$ cannot work since
\[ \hat{f}(\nu)=\nu(X)\ne\sum_{x\in X}\nu(\{x\})=\lambda(\nu). \]
28.3 Classifying Radon Measures on $\mathbb{R}$
Throughout this section, let $X=\mathbb{R}$, $\mathcal{E}$ be the elementary class
\[ \mathcal{E}=\{(a,b]\cap\mathbb{R}:-\infty\le a\le b\le\infty\}, \tag{28.23} \]
and $\mathcal{A}=\mathcal{A}(\mathcal{E})$ be the algebra formed by taking finite disjoint unions of elements from $\mathcal{E}$, see Proposition 18.10. The aim of this section is to prove Theorem 19.8 which we restate here for convenience.
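Theorem 28.32. The collection of $K$-finite measures on $(\mathbb{R},\mathcal{B}_{\mathbb{R}})$ is in one to one correspondence with the right continuous non-decreasing functions $F:\mathbb{R}\to\mathbb{R}$ with $F(0)=0$. The correspondence is as follows. If $F:\mathbb{R}\to\mathbb{R}$ is a right continuous non-decreasing function, then there exists a unique measure, $\mu_F$, on $\mathcal{B}_{\mathbb{R}}$ such that
\[ \mu_F((a,b])=F(b)-F(a)\ \text{for all } -\infty<a\le b<\infty \]
and this measure may be defined by
\[ \mu_F(A)=\inf\left\{\sum_{i=1}^\infty(F(b_i)-F(a_i)):A\subset\bigcup_{i=1}^\infty(a_i,b_i]\right\}=\inf\left\{\sum_{i=1}^\infty(F(b_i)-F(a_i)):A\subset\coprod_{i=1}^\infty(a_i,b_i]\right\} \tag{28.24} \]
for all $A\in\mathcal{B}_{\mathbb{R}}$. Conversely if $\mu$ is a $K$-finite measure on $(\mathbb{R},\mathcal{B}_{\mathbb{R}})$, then
\[ F(x):=\begin{cases}-\mu((x,0]) & \text{if } x\le0\\ \mu((0,x]) & \text{if } x\ge0\end{cases} \tag{28.25} \]
is a right continuous non-decreasing function and this map is the inverse to the map $F\to\mu_F$.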
There are three aspects to this theorem; namely the existence of the map $F\to\mu_F$, the surjectivity of the map and the injectivity of this map. Assuming the map $F\to\mu_F$ exists, the surjectivity follows from Eq. (28.25) and the injectivity is an easy consequence of Theorem 19.55. The rest of this section is devoted to giving two proofs for the existence of the map $F\to\mu_F$.
Exercise 28.4. Show by direct means that any measure $\mu=\mu_F$ satisfying Eq. (28.24) is outer regular on all Borel sets. Hint: it suffices to show that if $B:=\bigcup_{i=1}^\infty(a_i,b_i]$, then there exists an open set $V\subset\mathbb{R}$ such that $\mu(V\setminus B)$ is as small as you please.
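28.3.1 Classifying Radon Measures on $\mathbb{R}$ using Theorem 28.2
Proposition 28.33. To each finitely additive measure $\mu:\mathcal{A}\to[0,\infty]$ which is finite on bounded sets there is a unique increasing function $F:\bar{\mathbb{R}}\to\bar{\mathbb{R}}$ such that $F(0)=0$, $F^{-1}(\{-\infty\})\subset\{-\infty\}$, $F^{-1}(\{\infty\})\subset\{\infty\}$ and
\[ \mu((a,b]\cap\mathbb{R})=F(b)-F(a)\ \text{for all } a\le b\text{ in }\bar{\mathbb{R}}. \tag{28.26} \]
Conversely, given an increasing function $F:\bar{\mathbb{R}}\to\bar{\mathbb{R}}$ such that $F^{-1}(\{-\infty\})\subset\{-\infty\}$ and $F^{-1}(\{\infty\})\subset\{\infty\}$, there is a unique finitely additive measure $\mu=\mu_F$ on $\mathcal{A}$ such that the relation in Eq. (28.26) holds.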
Proof. If $F$ is going to exist, then
\[ \mu((0,b]\cap\mathbb{R})=F(b)-F(0)=F(b)\ \text{if } b\in[0,\infty], \]
\[ \mu((a,0])=F(0)-F(a)=-F(a)\ \text{if } a\in[-\infty,0] \]
from which we learn
\[ F(x)=\begin{cases}-\mu((x,0]) & \text{if } x\le0\\ \mu((0,x]\cap\mathbb{R}) & \text{if } x\ge0.\end{cases} \]
Moreover, one easily checks using the additivity of $\mu$ that Eq. (28.26) holds for this $F$. Conversely, suppose $F:\bar{\mathbb{R}}\to\bar{\mathbb{R}}$ is an increasing function such that $F^{-1}(\{-\infty\})\subset\{-\infty\}$ and $F^{-1}(\{\infty\})\subset\{\infty\}$. Define $\mu$ on $\mathcal{E}$ using the formula in Eq. (28.26). The argument will be completed by showing $\mu$ is additive on $\mathcal{E}$ and hence, by Proposition 31.3, has a unique extension to a finitely additive measure on $\mathcal{A}$. Suppose that
\[ (a,b]=\coprod_{i=1}^n(a_i,b_i]. \]
By reordering the $(a_i,b_i]$ if necessary, we may assume that
\[ a=a_1<b_1=a_2<b_2=a_3<\dots<b_{n-1}=a_n<b_n=b. \]
Therefore, by the telescoping series argument,
\[ \mu((a,b])=F(b)-F(a)=\sum_{i=1}^n[F(b_i)-F(a_i)]=\sum_{i=1}^n\mu((a_i,b_i]). \]
Now let $F:\mathbb{R}\to\mathbb{R}$ be an increasing function, $F(\pm\infty):=\lim_{x\to\pm\infty}F(x)$ and $\mu=\mu_F$ be the finitely additive measure on $(\mathbb{R},\mathcal{A})$ described in Proposition 28.33. If $\mu$ happens to be a premeasure on $\mathcal{A}$, then, letting $A_n=(a,b_n]$ with $b_n\downarrow b$ as $n\to\infty$, implies
\[ F(b_n)-F(a)=\mu((a,b_n])\downarrow\mu((a,b])=F(b)-F(a). \]
Since $\{b_n\}_{n=1}^\infty$ was an arbitrary sequence such that $b_n\downarrow b$, we have shown $\lim_{y\downarrow b}F(y)=F(b)$, i.e. $F$ is right continuous. The next proposition shows the converse is true as well. Hence premeasures on $\mathcal{A}$ which are finite on bounded sets are in one to one correspondence with right continuous increasing functions which vanish at 0.
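The role of right continuity can be seen on a two-line example. The sketch below is illustrative only; the two step functions and all names are assumptions chosen for the demonstration. It evaluates $\mu((a,b_n])=F(b_n)-F(a)$ along $b_n=1/n\downarrow b=0$ for a right continuous step function and for a step function that is only left continuous at $0$.

```python
def mu(F, a, b):
    """Finitely additive 'length' of the half-open interval (a, b] induced by F."""
    return F(b) - F(a)

def F_right(x):
    """Step function that is right continuous at 0: F(0) = 1."""
    return 1.0 if x >= 0 else 0.0

def F_left(x):
    """Step function that is NOT right continuous at 0: F(0) = 0 but F(0+) = 1."""
    return 1.0 if x > 0 else 0.0

a, b = -1.0, 0.0
b_n = [1.0 / n for n in range(1, 1001)]     # b_n decreases to b = 0

# Right continuous case: mu((a, b_n]) agrees with mu((a, b]) along the sequence.
right_values = [mu(F_right, a, bn) for bn in b_n]
# Left continuous case: mu((a, b_n]) stays at 1 while mu((a, b]) = 0.
left_values = [mu(F_left, a, bn) for bn in b_n]

print(right_values[-1], mu(F_right, a, b))
print(left_values[-1], mu(F_left, a, b))
```

For `F_left` the sets $(a,b_n]$ decrease to $(a,b]$ but their masses do not converge to the mass of the limit, so the induced finitely additive measure cannot be a premeasure.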
Proposition 28.34. To each right continuous increasing function $F:\mathbb{R}\to\mathbb{R}$ there exists a unique premeasure $\mu=\mu_F$ on $\mathcal{A}$ such that
\[ \mu_F((a,b])=F(b)-F(a)\ \text{for all } -\infty<a<b<\infty. \]
Proof. As above, let F() := lim
x
F(x) and =
F
be as in
Proposition 28.33. Because of Proposition 31.5, to nish the proof it suces
to show is sub-additive on c.
First suppose that < a a,

b
n
> b
n
in which case I := ( a, b] J,
J
n
:= (a
n
,
b
n
]

J
o
n
:= (a
n
,
b
n
) J
n
.
Since

I = [a, b] is compact and

I J
n=1
J
o
n
there exists N < such that
I

I
N
_
n=1
J
o
n

N
_
n=1
J
n
.
Hence by nite sub-additivity of ,
F(b) F( a) = (I)
N
n=1
(

J
n
)
n=1
(

J
n
).
Using the right continuity of F and letting a a in the above inequality,
(J) = ((a, b]) = F(b) F(a)
n=1
J
n
_
=
n=1
(J
n
) +
n=1
(

J
n
J
n
). (28.28)
Given > 0, we may use the right continuity of F to choose

b
n
so that
(

J
n
J
n
) = F(
b
n
) F(b
n
) 2
n
n N.
Using this in Eq. (28.28) shows
(J) = ((a, b])
n=1
(J
n
) +
which veries Eq. (28.27) since > 0 was arbitrary.
The hard work is now done but we still have to check the cases where
a = or b = . For example, suppose that b = so that
J = (a, ) =
n=1
J
n
with J
n
= (a
n
, b
n
] 1. Then
I
M
:= (a, M] = J I
M
=
n=1
J
n
I
M
and so by what we have already proved,
F(M) F(a) = (I
M
)
n=1
(J
n
I
M
)
n=1
(J
n
).
Now let M in this last inequality to nd that
((a, )) = F() F(a)
n=1
(J
n
).
The other cases where a = and b 1 and a = and b = are
handled similarly.
Corollary 28.35. The map $F\to\mu_F$ in Theorem 28.32 exists.
Proof. This is simply a combination of Proposition 28.34 and Theorem 28.2.
28.3.2 Classifying Radon Measures on $\mathbb{R}$ using the Riesz-Markov Theorem 28.16
For the moment let $X$ be an arbitrary set. We are going to start by introducing a simple integral associated to an additive measure, $\mu$, on an algebra $\mathcal{A}\subset2^X$.
Definition 28.36. Let $\mu$ be a finitely additive measure on an algebra $\mathcal{A}\subset2^X$, $\mathbb{S}=\mathbb{S}_f(\mathcal{A},\mu)$ be the collection of simple functions defined in Notation 22.1 and for $f\in\mathbb{S}$ define the integral $I(f)=I_\mu(f)$ by
\[ I_\mu(f)=\sum_{y\in\mathbb{R}}y\,\mu(f=y). \tag{28.29} \]
The same proof used for Proposition 19.12 shows $I_\mu:\mathbb{S}\to\mathbb{R}$ is linear and positive, i.e. $I(f)\ge0$ if $f\ge0$. Taking absolute values of Eq. (28.29) gives
\[ |I(f)|\le\sum_{y\in\mathbb{R}}|y|\,\mu(f=y)\le\|f\|_\infty\,\mu(f\ne0) \tag{28.30} \]
where $\|f\|_\infty=\sup_{x\in X}|f(x)|$. For $A\in\mathcal{A}$, let $\mathbb{S}_A:=\{f\in\mathbb{S}:\{f\ne0\}\subset A\}$. The estimate in Eq. (28.30) implies
\[ |I(f)|\le\mu(A)\,\|f\|_\infty\ \text{for all } f\in\mathbb{S}_A. \tag{28.31} \]
Let $\bar{\mathbb{S}}_A$ denote the closure of $\mathbb{S}_A$ inside $\ell^\infty(X,\mathbb{R})$.
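Proposition 28.37. Let $(\mathcal{A},\mu,I=I_\mu)$ be as in Definition 28.36; then we may extend $I$ to
\[ \tilde{\mathbb{S}}:=\bigcup\{\bar{\mathbb{S}}_A:A\in\mathcal{A}\text{ with }\mu(A)<\infty\} \]
by defining $I(f)=I_A(f)$ when $f\in\bar{\mathbb{S}}_A$ with $\mu(A)<\infty$. Moreover this extension is still positive.
Proof. Because of Eq. (28.31) and the B.L.T. Theorem 10.4, $I$ has a unique extension $I_A$ to $\bar{\mathbb{S}}_A\subset\ell^\infty(X,\mathbb{R})$ for any $A\in\mathcal{A}$ such that $\mu(A)<\infty$. The extension $I_A$ is still positive. Indeed, let $f\in\bar{\mathbb{S}}_A$ with $f\ge0$ and let $f_n\in\mathbb{S}_A$ be a sequence such that $\|f-f_n\|_\infty\to0$ as $n\to\infty$. Then $f_n\vee0\in\mathbb{S}_A$ and
\[ \|f-f_n\vee0\|_\infty\le\|f-f_n\|_\infty\to0\ \text{as } n\to\infty. \]
Therefore, $I_A(f)=\lim_{n\to\infty}I_A(f_n\vee0)\ge0$.
Now suppose that $A,B\in\mathcal{A}$ are sets such that $\mu(A)+\mu(B)<\infty$. Then $\mathbb{S}_A\cup\mathbb{S}_B\subset\mathbb{S}_{A\cup B}$ and so $\bar{\mathbb{S}}_A\cup\bar{\mathbb{S}}_B\subset\bar{\mathbb{S}}_{A\cup B}$. Therefore $I_A(f)=I_{A\cup B}(f)=I_B(f)$ for all $f\in\bar{\mathbb{S}}_A\cap\bar{\mathbb{S}}_B$, and so $I(f):=I_A(f)$ for $f\in\bar{\mathbb{S}}_A$ is well defined.
We now specialize the previous results to the case where $X=\mathbb{R}$, $\mathcal{A}=\mathcal{A}(\mathcal{E})$ with $\mathcal{E}$ as in Eq. (28.23), and $F$ and $\mu$ are as in Proposition 28.33. In this setting, for $f\in\tilde{\mathbb{S}}$, we will write $I(f)$ as $\int_{-\infty}^\infty f\,dF$ or $\int_{-\infty}^\infty f(x)\,dF(x)$ and refer to this integral as the Riemann-Stieltjes integral of $f$ relative to $F$.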
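Lemma 28.38. Using the notation above, the map $f\in\tilde{\mathbb{S}}\to\int_{-\infty}^\infty f\,dF$ is linear, positive and satisfies the estimate
\[ \left|\int_{-\infty}^\infty f\,dF\right|\le(F(b)-F(a))\,\|f\|_\infty \tag{28.32} \]
if $\operatorname{supp}(f)\subset(a,b)$. Moreover $C_c(\mathbb{R},\mathbb{R})\subset\tilde{\mathbb{S}}$.
Proof. The only new point of the lemma is to prove $C_c(\mathbb{R},\mathbb{R})\subset\tilde{\mathbb{S}}$; the remaining assertions follow directly from Proposition 28.37. The fact that $C_c(\mathbb{R},\mathbb{R})\subset\tilde{\mathbb{S}}$ has essentially already been shown in Example 19.23. In more detail, let $f\in C_c(\mathbb{R},\mathbb{R})$ and choose $a<b$ such that $\operatorname{supp}(f)\subset(a,b)$. Then define $f_k\in\mathbb{S}$ as in Example 19.23, i.e.
\[ f_k(x)=\sum_{l=0}^{n_k-1}\min\left\{f(x):a_l^k\le x\le a_{l+1}^k\right\}1_{(a_l^k,a_{l+1}^k]}(x) \]
where $\pi_k=\{a=a_0^k<a_1^k<\dots<a_{n_k}^k=b\}$, for $k=1,2,3,\dots$, is a sequence of refining partitions such that $\operatorname{mesh}(\pi_k)\to0$ as $k\to\infty$. Since $\operatorname{supp}(f)$ is compact and $f$ is continuous, $f$ is uniformly continuous on $\mathbb{R}$. Therefore $\|f-f_k\|_\infty\to0$ as $k\to\infty$, showing $f\in\tilde{\mathbb{S}}$. Incidentally, for $f\in C_c(\mathbb{R},\mathbb{R})$, it follows that
\[ \int_{-\infty}^\infty f\,dF=\lim_{k\to\infty}\sum_{l=0}^{n_k-1}\min\left\{f(x):a_l^k\le x\le a_{l+1}^k\right\}\left[F(a_{l+1}^k)-F(a_l^k)\right]. \tag{28.33} \]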
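The sums in Eq. (28.33) are easy to evaluate numerically. The Python sketch below is a minimal illustration under stated assumptions: the partition is uniform, the integrand is a tent function (so that its minimum over each cell sits at an endpoint), and the names `stieltjes_lower_sum`, `tent`, `F_id` and `F_jump` are hypothetical helpers, not notation from the text.

```python
def stieltjes_lower_sum(f, F, a, b, n):
    """Sum over a uniform partition of [a, b] of (min of f on the cell) * (increment of F).

    Assumption: f is unimodal (non-decreasing, then non-increasing), so its minimum
    over each cell [x_l, x_{l+1}] is attained at one of the two endpoints.
    """
    pts = [a + (b - a) * l / n for l in range(n + 1)]
    return sum(min(f(x0), f(x1)) * (F(x1) - F(x0)) for x0, x1 in zip(pts, pts[1:]))

def tent(x):
    """Continuous, supported in [-1, 1], with tent(0) = 1."""
    return max(0.0, 1.0 - abs(x))

def F_id(x):
    """F(x) = x, which gives Lebesgue measure."""
    return x

def F_jump(x):
    """Right continuous F with a unit jump at 0: Lebesgue measure plus a point mass at 0."""
    return x + (1.0 if x >= 0 else 0.0)

approx_lebesgue = stieltjes_lower_sum(tent, F_id, -2.0, 2.0, 4000)     # near 1 (area of the tent)
approx_with_atom = stieltjes_lower_sum(tent, F_jump, -2.0, 2.0, 4000)  # near 1 + tent(0) = 2
print(approx_lebesgue, approx_with_atom)
```

The second value picks up the extra contribution $f(0)\cdot1$ from the jump of $F$ at $0$, which is the expected behaviour of a Stieltjes integral against a point mass.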
The following exercise is an abstraction of Lemma 28.38.
Exercise 28.5. Continue the notation of Definition 28.36 and Proposition 28.37. Further assume that $X$ is a metric space, there exist open sets $X_n\subset X$ such that $X_n\uparrow X$, and for each $n\in\mathbb{N}$ and $\delta>0$ there exists a finite collection of sets $\{A_i\}_{i=1}^k\subset\mathcal{A}$ such that $\operatorname{diam}(A_i)<\delta$, $\mu(A_i)<\infty$ and $X_n\subset\bigcup_{i=1}^kA_i$. Then $C_c(X,\mathbb{R})\subset\tilde{\mathbb{S}}$ and so $I$ is well defined on $C_c(X,\mathbb{R})$.
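28.3.3 The Lebesgue-Stieltjes Integral
Notation 28.39 Given an increasing function $F:\mathbb{R}\to\mathbb{R}$, let $F(x-)=\lim_{y\uparrow x}F(y)$, $F(x+)=\lim_{y\downarrow x}F(y)$ and $F(\pm\infty)=\lim_{x\to\pm\infty}F(x)\in\bar{\mathbb{R}}$. Since $F$ is increasing all of these limits exist.
Theorem 28.40. If $F:\mathbb{R}\to\mathbb{R}$ is an increasing function (not necessarily right continuous), there exists a unique measure $\mu=\mu_F$ on $\mathcal{B}_{\mathbb{R}}$ such that
\[ \int_{-\infty}^\infty f\,dF=\int_{\mathbb{R}}f\,d\mu\ \text{for all } f\in C_c(\mathbb{R},\mathbb{R}), \tag{28.34} \]
where $\int_{-\infty}^\infty f\,dF$ is as in Lemma 28.38 above. This measure may also be characterized as the unique measure on $\mathcal{B}_{\mathbb{R}}$ such that
\[ \mu((a,b])=F(b+)-F(a+)\ \text{for all } -\infty<a<b<\infty. \tag{28.35} \]
Moreover, if $A\in\mathcal{B}_{\mathbb{R}}$ then
\[ \mu_F(A)=\inf\left\{\sum_{i=1}^\infty(F(b_i+)-F(a_i+)):A\subset\bigcup_{i=1}^\infty(a_i,b_i]\right\}=\inf\left\{\sum_{i=1}^\infty(F(b_i+)-F(a_i+)):A\subset\coprod_{i=1}^\infty(a_i,b_i]\right\}. \tag{28.36} \]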
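Proof. An application of the Riesz-Markov Theorem 28.16 implies there exists a unique measure $\mu$ on $\mathcal{B}_{\mathbb{R}}$ such that Eq. (28.34) is valid. Let $-\infty<a<b<\infty$, $\varepsilon>0$ be small and $\varphi_\varepsilon(x)$ be the function defined in Figure 28.2, i.e. $\varphi_\varepsilon$ is one on $[a+2\varepsilon,b+\varepsilon]$, linearly interpolates to zero on $[b+\varepsilon,b+2\varepsilon]$ and on $[a+\varepsilon,a+2\varepsilon]$, and is zero on $(a+\varepsilon,b+2\varepsilon)^c$. Since $\varphi_\varepsilon\to1_{(a,b]}$ as $\varepsilon\downarrow0$ it follows by the dominated convergence theorem that
\[ \mu((a,b])=\lim_{\varepsilon\downarrow0}\int_{\mathbb{R}}\varphi_\varepsilon\,d\mu=\lim_{\varepsilon\downarrow0}\int_{\mathbb{R}}\varphi_\varepsilon\,dF. \tag{28.37} \]
Fig. 28.2. $\varphi_\varepsilon$.
On the other hand we have
\[ 1_{(a+2\varepsilon,b+\varepsilon]}\le\varphi_\varepsilon\le1_{(a+\varepsilon,b+2\varepsilon]}, \tag{28.38} \]
and therefore applying $I_F$ to both sides of Eq. (28.38) shows
\[ F(b+\varepsilon)-F(a+2\varepsilon)=\int_{\mathbb{R}}1_{(a+2\varepsilon,b+\varepsilon]}\,dF\le\int_{\mathbb{R}}\varphi_\varepsilon\,dF\le\int_{\mathbb{R}}1_{(a+\varepsilon,b+2\varepsilon]}\,dF=F(b+2\varepsilon)-F(a+\varepsilon). \]
Letting $\varepsilon\downarrow0$ in this equation and using Eq. (28.37) shows
\[ F(b+)-F(a+)\le\mu((a,b])\le F(b+)-F(a+). \]
For the last assertion let
\[ \mu^0(A)=\inf\left\{\sum_{i=1}^\infty(F(b_i+)-F(a_i+)):A\subset\coprod_{i=1}^\infty(a_i,b_i]\right\}=\inf\{\mu(B):A\subset B\in\mathcal{A}_\sigma\}, \]
where $\mathcal{A}$ is the algebra generated by the half open intervals on $\mathbb{R}$. By monotonicity of $\mu$, it follows that
\[ \mu^0(A)\ge\mu(A)\ \text{for all } A\in\mathcal{B}. \tag{28.39} \]
For the reverse inequality, let $A\subset V$ with $V\subset\mathbb{R}$ open and notice by Exercise 13.14 that $V=\coprod_{i=1}^\infty(a_i,b_i)$ for some collection of disjoint open intervals in $\mathbb{R}$. Since $(a_i,b_i)\in\mathcal{A}_\sigma$ (as the reader should verify!), it follows that $V\in\mathcal{A}_\sigma$ and therefore,
\[ \mu^0(A)\le\inf\{\mu(V):A\subset V,\ V\subset\mathbb{R}\text{ open}\}=\mu(A). \]
Combining this with Eq. (28.39) shows $\mu^0(A)=\mu(A)$ which is precisely Eq. (28.36).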
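The squeeze between $F(b+\varepsilon)-F(a+2\varepsilon)$ and $F(b+2\varepsilon)-F(a+\varepsilon)$ can be observed numerically. The sketch below is an illustration under assumptions: the Stieltjes integral is approximated by a lower sum on a uniform partition, the integrand is the trapezoid described in the proof, and the specific $F$ (which fails to be right continuous at $0$) and all helper names are made up for the example.

```python
def stieltjes_lower_sum(f, F, a, b, n):
    """Lower Stieltjes sum on a uniform partition; assumes f is unimodal so that
    the minimum of f over each cell is attained at an endpoint."""
    pts = [a + (b - a) * l / n for l in range(n + 1)]
    return sum(min(f(x0), f(x1)) * (F(x1) - F(x0)) for x0, x1 in zip(pts, pts[1:]))

def trapezoid(a, b, eps):
    """phi_eps: 1 on [a+2eps, b+eps], linear down to 0 at a+eps and at b+2eps, 0 elsewhere."""
    def phi(x):
        if x <= a + eps or x >= b + 2 * eps:
            return 0.0
        if x < a + 2 * eps:
            return (x - (a + eps)) / eps
        if x <= b + eps:
            return 1.0
        return ((b + 2 * eps) - x) / eps
    return phi

def F(x):
    """Increasing but NOT right continuous at 0: F(0) = 0 while F(0+) = 1."""
    return x + (1.0 if x > 0 else 0.0)

a, b = -1.0, 0.0
values = [stieltjes_lower_sum(trapezoid(a, b, eps), F, -2.0, 2.0, 8000)
          for eps in (0.2, 0.1, 0.05)]
print(values)            # each value is close to F(b+) - F(a+) = 1 - (-1) = 2
print(F(b) - F(a))       # the naive difference F(b) - F(a) equals 1
```

Here the limit is governed by the right limits of $F$, in agreement with Eq. (28.35), and not by the values $F(b)$ and $F(a)$ themselves.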
Corollary 28.41. The map $F\to\mu_F$ is a one to one correspondence between right continuous non-decreasing functions $F$ such that $F(0)=0$ and Radon (same as $K$-finite) measures on $(\mathbb{R},\mathcal{B}_{\mathbb{R}})$.
28.4 Kolmogorov's Existence of Measure on Products Spaces
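Throughout this section, let $\{(X_\alpha,\tau_\alpha)\}_{\alpha\in A}$ be second countable locally compact Hausdorff spaces and let $X:=\prod_{\alpha\in A}X_\alpha$ be equipped with the product topology, $\tau:=\otimes_{\alpha\in A}\tau_\alpha$. More generally for $\Lambda\subset A$, let $X_\Lambda:=\prod_{\alpha\in\Lambda}X_\alpha$ and $\tau_\Lambda:=\otimes_{\alpha\in\Lambda}\tau_\alpha$, and for $\Lambda\subset\Gamma\subset A$, let $\pi_{\Lambda,\Gamma}:X_\Gamma\to X_\Lambda$ be the projection map; $\pi_{\Lambda,\Gamma}(x)=x|_\Lambda$ for $x\in X_\Gamma$. We will simply write $\pi_\Lambda$ for $\pi_{\Lambda,A}:X\to X_\Lambda$. (Notice that if $\Lambda$ is a finite subset of $A$ then $(X_\Lambda,\tau_\Lambda)$ is still second countable as the reader should verify.) Let $\mathcal{M}=\otimes_{\alpha\in A}\mathcal{B}_\alpha$ be the product $\sigma$-algebra on $X=X_A$ and $\mathcal{B}_\Lambda=\sigma(\tau_\Lambda)$ be the Borel $\sigma$-algebra on $X_\Lambda$.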
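Theorem 28.42 (Kolmogorov's Existence Theorem). Suppose $\{\mu_\Lambda:\Lambda\subset A\text{ finite}\}$ are probability measures on $(X_\Lambda,\mathcal{B}_\Lambda)$ satisfying the following compatibility condition:
\[ (\pi_{\Lambda,\Gamma})_*\mu_\Gamma=\mu_\Lambda\ \text{whenever } \Lambda\subset\Gamma\subset A\text{ with }\Gamma\text{ finite}. \]
Then there exists a unique probability measure, $\mu$, on $(X,\mathcal{M})$ such that $(\pi_\Lambda)_*\mu=\mu_\Lambda$ whenever $\Lambda\subset A$ is finite. Recall, see Exercise 19.8, that the condition $(\pi_\Lambda)_*\mu=\mu_\Lambda$ is equivalent to the statement
\[ \int_XF(\pi_\Lambda(x))\,d\mu(x)=\int_{X_\Lambda}F(y)\,d\mu_\Lambda(y) \tag{28.40} \]
for all finite $\Lambda\subset A$ and all bounded measurable $F:X_\Lambda\to\mathbb{R}$.
We will first prove the theorem in the following special case. The full proof will be given after Exercise 28.6 below.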
Theorem 28.43. Theorem 28.42 holds under the additional assumption that each of the spaces, $\{(X_\alpha,\tau_\alpha)\}_{\alpha\in A}$, is compact, second countable and Hausdorff and $A$ is countable.
Proof. Recall from Theorem 18.63 that the Borel algebra, B
=
(
) , and the product algebra,
, are the same for any A.

By Tychonos Theorem 14.34 and Proposition 15.4, X and X
for any A
are still compact Hausdor spaces which are second countable if is nite.
By the Stone Weierstrass Theorem 15.31,
T := f C(X) : f = F
with F C(X
) and A
is a dense subspace of C(X). For f = F
T, let
I(f) =
_
X
(x)d
(x). (28.41)
Let us verify that I is well dened. Suppose that f may also be expressed as
f = F
t

with
t
A and F
t
C(X
). Let :=
t
and dene
G C (X
) by G := F
,
. Hence, using Exercise 19.8,
_
X
Gd
=
_
X
F
,
d
=
_
X
F d
_
(
,
)
=
_
X
F d
wherein we have used the compatibility condition in the last equality. Simi-
larly, using G = F
t

,
(as the reader should verify), one shows
_
X
Gd
=
_
X
F
t
d
.
Therefore _
X
F
t
d
=
_
X
Gd
=
_
X
F d
,
which shows I in Eq. (28.41) is well dened.
Since [I(f)[ |f|
, the B.L.T. Theorem 10.4 allows us to extend I from

the dense subspace, T, to a continuous linear functional,

I, on C(X). Because
I was positive on T, it is easy to check that

I is still positive on C(X). So by
the Riesz-Markov Theorem 28.16, there exists a Radon measure on B = /
such that

I(f) =
_
X
fd for all f C(X). By the denition of

I in now follows
that _
X
Fd (
=
_
X
d =

I(F
) =
_
X
Fd
for all F C(X
) and A. Since X
is a second countable lo-

cally compact Hausdor space, this identity implies, see Theorem 22.8
1
, that
(
. The uniqueness assertion of the theorem follows from the

fact that the measure is determined uniquely by its values on the algebra
/ :=
A
(B
X
) which generates B = /, see Theorem 19.55.

Exercise 28.6. Let (Y, ) be a locally compact Hausdor space and (Y
=
Y ,
) be the one point compactication of Y. Then

B
Y
:= (
) = A Y
: A Y B
Y
= ()
or equivalently put
B
Y
= B
Y
A : A B
Y
.
Also shows that (Y
= Y ,
) is second countable if (Y, ) was second

countable.
1
Alternatively, use Theorems 28.22 and the uniquness assertion in Markov-Riesz
Theorem 28.16 to conclude ()
= .
Proof. Proof of Theorem 28.42.
Case 1; A is a countable. Let (X
= X
) be the one point

compactication of (X
) . For A, let X
:=

equipped with
the product topology and Borel algebra, B
. Since is at most countable,

the set,
X
:=
,
is a measurable subset of X
. Therefore for each A, we may extend
to a measure,
, on (X
, B
) using the formula,

(B) =
(B X
) for all B X
.
An application of Theorem 28.43 shows there exists a unique probability mea-
sure, , on X
:= X
A
such that (
for all A. Since

X
X =
_
A
and (
= ) =
]
(
) = 0, it follows that (X
X) = 0. Hence
:= [
B
X
is a probability measure on (X, B
X
) . Finally if B B
X
B
X
,
(B) =
(B) = (
(B) =
_
(B)
_
=
_
(B) X
_
=
_
[
1
X
(B)
_
which shows is the required probability measure on B
X
.
Case 2; A is uncountable. By case 1. for each countable or nite subset
A there is a measure
on (X
, B
) such that (
,
)
for all
. By Exercise 18.9,
/=
_
_
(B
) : is a countable subset of A
_
,
i.e. every B /may be written in the form B =
1
(C) for some countable

subset, A, and C B
. For such a B we dene (B) :=
(C) . It is left
to the reader to check that is well dened and that is a measure on /.
(Keep in mind the countable union of countable sets is countable.) If A
and C B
, then
[(
] (C) =
_
(C)
_
:=
(C) ,
i.e. (
as desired.
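Corollary 28.44. Suppose that $\{\mu_\alpha\}_{\alpha\in A}$ are probability measures on $(X_\alpha,\mathcal{B}_\alpha)$ for all $\alpha\in A$ and if $\Lambda\subset A$ is finite let $\mu_\Lambda:=\otimes_{\alpha\in\Lambda}\mu_\alpha$ be the product measure on $(X_\Lambda,\mathcal{B}_\Lambda=\otimes_{\alpha\in\Lambda}\mathcal{B}_\alpha)$. Then there exists a unique probability measure, $\mu$, on $(X,\mathcal{M})$ such that $(\pi_\Lambda)_*\mu=\mu_\Lambda$ for all finite $\Lambda\subset A$. (It is possible to remove the topology from this corollary, see Theorem 32.65 below.)
Exercise 28.7. Prove Corollary 28.44 by showing the measures $\mu_\Lambda:=\otimes_{\alpha\in\Lambda}\mu_\alpha$ satisfy the compatibility condition in Theorem 28.42.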
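On finite coordinate spaces the compatibility condition for product measures can be verified by direct computation. The following Python sketch is a toy illustration only: the three coordinate laws, the index names `"a"`, `"b"`, `"c"`, and the helper functions are assumptions made for the example. It builds $\mu_\Gamma$ and $\mu_\Lambda$ for $\Lambda\subset\Gamma$ and checks that the push-forward of $\mu_\Gamma$ under the coordinate projection equals $\mu_\Lambda$.

```python
from fractions import Fraction
from itertools import product

def product_measure(marginals, index_set):
    """Product of the one-coordinate laws marginals[alpha], alpha in index_set, as {point: mass}."""
    index_set = tuple(index_set)
    measure = {}
    for point in product(*[marginals[alpha].keys() for alpha in index_set]):
        mass = Fraction(1)
        for alpha, x in zip(index_set, point):
            mass *= marginals[alpha][x]
        measure[point] = mass
    return measure

def pushforward(measure, big, small):
    """Image of a measure on X_big under the coordinate projection onto X_small."""
    positions = [tuple(big).index(alpha) for alpha in small]
    image = {}
    for point, mass in measure.items():
        key = tuple(point[i] for i in positions)
        image[key] = image.get(key, Fraction(0)) + mass
    return image

# Hypothetical probability laws on three finite coordinate spaces.
marginals = {
    "a": {0: Fraction(1, 3), 1: Fraction(2, 3)},
    "b": {"H": Fraction(1, 2), "T": Fraction(1, 2)},
    "c": {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)},
}
Gamma, Lambda = ("a", "b", "c"), ("a", "c")
mu_Gamma = product_measure(marginals, Gamma)
mu_Lambda = product_measure(marginals, Lambda)
print(pushforward(mu_Gamma, Gamma, Lambda) == mu_Lambda)
```

Summing out the coordinate `"b"` multiplies every mass by the total mass of that coordinate law, which is one; this is the whole content of the compatibility check in the finite case.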

29
Probability Measures on Lusin Spaces
Definition 29.1 (Lusin spaces). A Lusin space is a topological space $(X,\tau)$ which is homeomorphic to a Borel subset of a compact metric space.
Example 29.2. By Theorem 15.12, every Polish (i.e. complete separable metric) space is a Lusin space. Moreover, any Borel subset of a Lusin space is again a Lusin space.
Definition 29.3. Two measurable spaces, $(X,\mathcal{M})$ and $(Y,\mathcal{N})$, are said to be isomorphic if there exists a bijective map, $f:X\to Y$, such that $f(\mathcal{M})=\mathcal{N}$ and $f^{-1}(\mathcal{N})=\mathcal{M}$, i.e. both $f$ and $f^{-1}$ are measurable.
29.1 Weak Convergence Results
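The following is an application of Theorem 14.7 characterizing compact sets in metric spaces. (BRUCE: add Helly's selection principle here.)
Proposition 29.4. Suppose that $(X,\rho)$ is a complete separable metric space and $\mu$ is a probability measure on $\mathcal{B}=\sigma(\tau_\rho)$. Then for all $\varepsilon>0$, there exists a compact set $K_\varepsilon\subset X$ such that $\mu(K_\varepsilon)\ge1-\varepsilon$.
Proof. Let $\{x_k\}_{k=1}^\infty$ be a countable dense subset of $X$. Then $X=\bigcup_kC_{x_k}(1/n)$ for all $n\in\mathbb{N}$. Hence by continuity of $\mu$, there exists, for all $n\in\mathbb{N}$, $N_n<\infty$ such that $\mu(F_n)\ge1-\varepsilon2^{-n}$ where $F_n:=\bigcup_{k=1}^{N_n}C_{x_k}(1/n)$. Let $K:=\bigcap_{n=1}^\infty F_n$; then
\[ \mu(X\setminus K)=\mu\left(\bigcup_{n=1}^\infty F_n^c\right)\le\sum_{n=1}^\infty\mu(F_n^c)=\sum_{n=1}^\infty(1-\mu(F_n))\le\sum_{n=1}^\infty\varepsilon2^{-n}=\varepsilon \]
so that $\mu(K)\ge1-\varepsilon$. Moreover $K$ is compact since $K$ is closed and totally bounded; $K\subset F_n$ for all $n$ and each $F_n$ is $1/n$-bounded.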
Definition 29.5. A sequence of probability measures $\{P_n\}_{n=1}^\infty$ is said to converge to a probability $P$ if for every $f\in BC(X)$, $P_n(f)\to P(f)$. This is actually weak-* convergence when viewing $P_n\in BC(X)^*$.
Proposition 29.6. The following are equivalent:
1. $P_n\xrightarrow{w}P$ as $n\to\infty$.
2. $P_n(f)\to P(f)$ for every $f\in BC(X)$ which is uniformly continuous.
3. $\limsup_{n\to\infty}P_n(F)\le P(F)$ for all closed $F\subset X$.
4. $\liminf_{n\to\infty}P_n(G)\ge P(G)$ for all open $G\subset X$.
5. $\lim_{n\to\infty}P_n(A)=P(A)$ for all $A\in\mathcal{B}$ such that $P(\operatorname{bd}(A))=0$.
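Proof. 1. $\Longrightarrow$ 2. is obvious. For 2. $\Longrightarrow$ 3., let
\[ \psi(t):=\begin{cases}1 & \text{if } t\le0\\ 1-t & \text{if } 0\le t\le1\\ 0 & \text{if } t\ge1\end{cases} \tag{29.1} \]
and let $f_n(x):=\psi(n\rho(x,F))$. Then $f_n\in BC(X,[0,1])$ is uniformly continuous, $0\le1_F\le f_n$ for all $n$ and $f_n\downarrow1_F$ as $n\to\infty$. Passing to the limit $n\to\infty$ in the equation
\[ 0\le P_n(F)\le P_n(f_m) \]
gives
\[ 0\le\limsup_{n\to\infty}P_n(F)\le P(f_m) \]
and then letting $m\to\infty$ in this inequality implies item 3.
3. $\Longleftrightarrow$ 4. Assuming item 3., let $F=G^c$; then
\[ 1-\liminf_{n\to\infty}P_n(G)=\limsup_{n\to\infty}(1-P_n(G))=\limsup_{n\to\infty}P_n(G^c)\le P(G^c)=1-P(G) \]
which implies 4. Similarly 4. $\Longrightarrow$ 3.
3. $\Longleftrightarrow$ 5. Recall that $\operatorname{bd}(A)=\bar{A}\setminus A^o$, so if $P(\operatorname{bd}(A))=0$ and 3. (and hence also 4.) holds we have
\[ \limsup_{n\to\infty}P_n(A)\le\limsup_{n\to\infty}P_n(\bar{A})\le P(\bar{A})=P(A)\quad\text{and} \]
\[ \liminf_{n\to\infty}P_n(A)\ge\liminf_{n\to\infty}P_n(A^o)\ge P(A^o)=P(A) \]
from which it follows that $\lim_{n\to\infty}P_n(A)=P(A)$. Conversely, let $F\subset X$ be closed and set $F_\delta:=\{x\in X:\rho(x,F)\le\delta\}$. Then
\[ \operatorname{bd}(F_\delta)\subset F_\delta\setminus\{x\in X:\rho(x,F)<\delta\}=A_\delta \]
where $A_\delta:=\{x\in X:\rho(x,F)=\delta\}$. Since the sets $\{A_\delta\}_{\delta>0}$ are all disjoint, we must have
\[ \sum_{\delta>0}P(A_\delta)\le P(X)\le1 \]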
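and in particular the set $\Lambda:=\{\delta>0:P(A_\delta)>0\}$ is at most countable. Let $\delta_n\notin\Lambda$ be chosen so that $\delta_n\downarrow0$ as $n\to\infty$; then
\[ P(F_{\delta_m})=\lim_{n\to\infty}P_n(F_{\delta_m})\ge\limsup_{n\to\infty}P_n(F). \]
Let $m\to\infty$ in this equation to conclude $P(F)\ge\limsup_{n\to\infty}P_n(F)$ as desired.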
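To finish the proof we will now show 3. $\Longrightarrow$ 1. By an affine change of variables it suffices to consider $f\in C(X,(0,1))$ in which case we have
\[ \sum_{i=1}^k\frac{i-1}{k}1_{\left\{\frac{i-1}{k}\le f<\frac{i}{k}\right\}}\le f\le\sum_{i=1}^k\frac{i}{k}1_{\left\{\frac{i-1}{k}\le f<\frac{i}{k}\right\}}. \tag{29.2} \]
Let $F_i:=\left\{\frac{i}{k}\le f\right\}$ and notice that $F_k=\emptyset$; then we have for any probability $P$ that
\[ \sum_{i=1}^k\frac{i-1}{k}[P(F_{i-1})-P(F_i)]\le P(f)\le\sum_{i=1}^k\frac{i}{k}[P(F_{i-1})-P(F_i)]. \tag{29.3} \]
Now
\[ \sum_{i=1}^k\frac{i-1}{k}[P(F_{i-1})-P(F_i)]=\sum_{i=1}^k\frac{i-1}{k}P(F_{i-1})-\sum_{i=1}^k\frac{i-1}{k}P(F_i)=\sum_{i=1}^{k-1}\frac{i}{k}P(F_i)-\sum_{i=1}^k\frac{i-1}{k}P(F_i)=\frac{1}{k}\sum_{i=1}^{k-1}P(F_i) \]
and
\[ \sum_{i=1}^k\frac{i}{k}[P(F_{i-1})-P(F_i)]=\sum_{i=1}^k\frac{i-1}{k}[P(F_{i-1})-P(F_i)]+\sum_{i=1}^k\frac{1}{k}[P(F_{i-1})-P(F_i)]=\frac{1}{k}\sum_{i=1}^{k-1}P(F_i)+\frac{1}{k} \]
so that Eq. (29.3) becomes
\[ \frac{1}{k}\sum_{i=1}^{k-1}P(F_i)\le P(f)\le\frac{1}{k}\sum_{i=1}^{k-1}P(F_i)+\frac{1}{k}. \]
Using this equation with $P=P_n$ and then with $P=P$ we find
\[ \limsup_{n\to\infty}P_n(f)\le\limsup_{n\to\infty}\left[\frac{1}{k}\sum_{i=1}^{k-1}P_n(F_i)+\frac{1}{k}\right]\le\frac{1}{k}\sum_{i=1}^{k-1}P(F_i)+\frac{1}{k}\le P(f)+\frac{1}{k}. \]
Since $k$ is arbitrary,
\[ \limsup_{n\to\infty}P_n(f)\le P(f). \]
This inequality also holds for $1-f$ and this implies $\liminf_{n\to\infty}P_n(f)\ge P(f)$ and hence $\lim_{n\to\infty}P_n(f)=P(f)$ as claimed.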
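The inequalities in items 3 and 4 of Proposition 29.6 can be strict, as the standard example $P_n=\delta_{1/n}\to P=\delta_0$ on $\mathbb{R}$ shows. The Python sketch below illustrates this; the representation of a point mass as a pair of closures and the test function `f` are assumptions made for the example.

```python
import math

def psi(t):
    """The cutoff function from Eq. (29.1)."""
    return 1.0 if t <= 0 else (1.0 - t if t <= 1 else 0.0)

def dirac(point):
    """delta_point as a pair: (integrate a function, mass of a set given by its indicator)."""
    return (lambda f: f(point)), (lambda indicator: 1.0 if indicator(point) else 0.0)

P_int, P_mass = dirac(0.0)                        # the limit P = delta_0
P_n = [dirac(1.0 / n) for n in range(1, 2001)]    # P_n = delta_{1/n}

f = lambda x: math.cos(x) / (1.0 + x * x)         # a bounded continuous test function
closed_F = lambda x: x == 0.0                     # indicator of the closed set F = {0}
open_G = lambda x: 0.0 < x < 2.0                  # indicator of the open set G = (0, 2)

# f_m(x) = psi(m * dist(x, F)) decreases to 1_F; for F = {0}, dist(x, F) = |x|.
f_m = lambda m: (lambda x: psi(m * abs(x)))

print(abs(P_n[-1][0](f) - P_int(f)))              # small: P_n(f) -> P(f)
print(P_n[-1][1](closed_F), P_mass(closed_F))     # 0.0 versus 1.0 (strict in item 3)
print(P_n[-1][1](open_G), P_mass(open_G))         # 1.0 versus 0.0 (strict in item 4)
```

Both sets have boundary containing the point $0$, which carries all the mass of $P$, so item 5 does not apply to them.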
Definition 29.7. Let $X$ be a topological space. A collection of probability measures $\Lambda$ on $(X,\mathcal{B}_X)$ is said to be tight if for every $\varepsilon>0$ there exists a compact set $K_\varepsilon\in\mathcal{B}_X$ such that $P(K_\varepsilon)\ge1-\varepsilon$ for all $P\in\Lambda$.
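Tightness is about mass not escaping to infinity uniformly over the family. The sketch below compares two hypothetical families of discrete probabilities on $\mathbb{R}$, $P_n=\delta_n$ and $P_n=(1-\frac1n)\delta_0+\frac1n\delta_n$, truncated to $n\le N$. Every finite family is tight, so the code cannot exhibit non-tightness literally; what it shows is that the radius needed to capture mass $1-\varepsilon$ grows with $N$ for the first family and stays bounded for the second. All names are illustrative assumptions.

```python
from fractions import Fraction

def mass_in_ball(atoms, r):
    """P([-r, r]) for a discrete probability given as {atom: mass}."""
    return sum((m for x, m in atoms.items() if abs(x) <= r), Fraction(0))

def radius_needed(family, eps):
    """Smallest integer r with P([-r, r]) >= 1 - eps for every P in the (finite) family."""
    r = 0
    while any(mass_in_ball(P, r) < 1 - eps for P in family):
        r += 1
    return r

def escaping(N):
    """P_n = delta_n for n = 1..N: all the mass moves off to infinity."""
    return [{n: Fraction(1)} for n in range(1, N + 1)]

def mostly_at_origin(N):
    """P_n = (1 - 1/n) delta_0 + (1/n) delta_n for n = 1..N."""
    return [{0: 1 - Fraction(1, n), n: Fraction(1, n)} for n in range(1, N + 1)]

eps = Fraction(1, 10)
print(radius_needed(escaping(50), eps), radius_needed(escaping(200), eps))
print(radius_needed(mostly_at_origin(50), eps), radius_needed(mostly_at_origin(200), eps))
```

For the second family a single interval works for all $n$ at once, which is what the definition of tightness asks for with $K_\varepsilon=[-r,r]$.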
Theorem 29.8. Suppose $X$ is a separable metrizable space and $\Lambda=\{P_n\}_{n=1}^\infty$ is a tight sequence of probability measures on $\mathcal{B}_X$. Then there exists a subsequence $\{P_{n_k}\}_{k=1}^\infty$ which is weakly convergent to a probability measure $P$ on $\mathcal{B}_X$.
Proof. First suppose that X is compact. In this case C(X) is a Banach
space which is separable by the Stone Weirstrass theorem, see Exercise 15.5.
By the Riesz theorem, Corollary 32.68, we know that C(X)
is in one to one
correspondence with complex measure on (X, B
X
). We have also seen that
C(X)
is metrizable and the unit ball in C(X)
is weak - * compact, see

Theorem 14.38. Hence there exists a subsequence P
n
k
k=1
which is weak
-* convergent to a probability measure P on X. Alternatively, use the can-
tors diagonalization procedure on a countable dense set C(X) so nd
P
n
k
k=1
such that (f) := lim
k
P
n
k
(f) exists for all f . Then for
g C(X) and f , we have
[P
n
k
(g) P
n
l
(g)[ [P
n
k
(g) P
n
k
(f)[ +[P
n
k
(f) P
n
l
(f)[
+[P
n
l
(f) P
n
l
(g)[
2 |g f|
+[P
n
k
(f) P
n
l
(f)[
which shows
lim sup
k,l
[P
n
k
(g) P
n
l
(g)[ 2 |g f|
.
Letting f tend to g in C(X) shows limsup
k,l
[P
n
k
(g) P
n
l
(g)[ = 0
and hence (g) := lim
k
P
n
k
(g) for all g C(X). It is now clear that
(g) 0 for all g 0 so that is a positive linear functional on X and thus
there is a probability measure P such that (g) = P(g).
General case. By Theorem 15.12 we may assume that X is a subset of
a compact metric space which we will denote by

X. We now extend P
n
to

X
by setting

P
n
(A) :=

P
n
(A

X) for all A B
X
. By what we have just proved,
there is a subsequence
_
P
t
k
:=

P
n
k
_
k=1
such that

P
t
k
converges weakly to a
probability measure

P on

X. The main thing we now have to prove is that

P(X) = 1, this is where the tightness assumption is going to be used. Given
> 0, let K
X be a compact set such that

P
n
(K
) 1 for all n. Since

K
is compact in X it is compact in

X as well and in particular a closed
subset of

X. Therefore by Proposition 29.6
P(K
) lim sup
k
k
(K
) = 1 .
Since > 0 is arbitrary, this shows with X
0
:=
n=1
K
1/n
satises

P(X
0
) = 1.
Because X
0
B
X
B
X
, we may view

P as a measure on B
X
by letting
P(A) :=

P(A X
0
) for all A B
X
. Given a closed subset F X, choose
F

X such that F =

F X. Then
lim sup
k
P
t
k
(F) = lim sup
k
P
t
k
(

F)

P(

F) =

P(

F X
0
) = P(F),
which shows P
t
k
w
P.
29.2 Haar Measures
To be written.
29.3 Hausdorff Measure
To be written.
29.4 Exercises
Exercise 29.1. Let $E \in \mathcal{B}_{\mathbb{R}}$ with $m(E) > 0$. Then for any $\alpha \in (0,1)$ there exists a bounded open interval $J \subset \mathbb{R}$ such that $m(E \cap J) \ge \alpha\, m(J)$.^1 Hints:
1. Reduce to the case where $m(E) \in (0, \infty)$.
2. Approximate $E$ from the outside by an open set $V \subset \mathbb{R}$.
3. Make use of Exercise 13.14, which states that $V$ may be written as a disjoint union of open intervals.

^1 See also the Lebesgue differentiation Theorem 30.13 from which one may prove the much stronger form of this theorem, namely for $m$-a.e. $x \in E$ there exists $r_\alpha(x) > 0$ such that $m(E \cap (x - r, x + r)) \ge \alpha\, m((x - r, x + r))$ for all $r \le r_\alpha(x)$.

Exercise 29.2. Let $F : \mathbb{R} \to \mathbb{R}$ be a right continuous increasing function and $\mu = \mu_F$ be as in Theorem 28.32. For $a < b$, find the values of $\mu(\{a\})$, $\mu([a,b))$, $\mu([a,b])$ and $\mu((a,b))$ in terms of the function $F$.
Exercise 29.3. Suppose that $F \in C^1(\mathbb{R})$ is an increasing function and $\mu_F$ is the unique Borel measure on $\mathbb{R}$ such that $\mu_F((a,b]) = F(b) - F(a)$ for all $a \le b$. Show that $d\mu_F = \rho\, dm$ for some function $\rho \ge 0$. Find $\rho$ explicitly in terms of $F$.

Exercise 29.4. Suppose that $F(x) = e\,1_{x \ge 3} + \pi\,1_{x \ge 7}$ and $\mu_F$ is the unique Borel measure on $\mathbb{R}$ such that $\mu_F((a,b]) = F(b) - F(a)$ for all $a \le b$. Give an explicit description of the measure $\mu_F$.
Exercise 29.5. Let $(X, \mathcal{A}, \mu)$ be as in Definition 28.36 and Proposition 28.37, $Y$ be a Banach space and $S(Y) := S_f(X, \mathcal{A}, \mu; Y)$ be the collection of functions $f : X \to Y$ such that $\#(f(X)) < \infty$, $f^{-1}(\{y\}) \in \mathcal{A}$ for all $y \in Y$ and $\mu(f \ne 0) < \infty$. We may define a linear functional $I : S(Y) \to Y$ by
\[
I(f) = \sum_{y \in Y} y\, \mu(f = y).
\]
Verify the following statements.
1. Let $\|f\|_\infty = \sup_{x \in X} \|f(x)\|_Y$ be the sup norm on $\ell^\infty(X, Y)$, then for $f \in S(Y)$,
\[
\|I(f)\|_Y \le \|f\|_\infty\, \mu(f \ne 0).
\]
Hence if $\mu(X) < \infty$, $I$ extends to a bounded linear transformation from $\bar S(Y) \subset \ell^\infty(X, Y)$ to $Y$.
2. Assuming $(X, \mathcal{A}, \mu)$ satisfies the hypothesis in Exercise 28.5, then $C(X, Y) \subset \bar S(Y)$.
3. Now assume the notation in Section 28.3.3, i.e. $X = [-M, M]$ for some $M \in \mathbb{R}$ and $\mu$ is determined by an increasing function $F$. Let $\pi := \{-M = t_0 < t_1 < \dots < t_n = M\}$ denote a partition of $J := [-M, M]$ along with a choice $c_i \in [t_i, t_{i+1}]$ for $i = 0, 1, 2, \dots, n-1$. For $f \in C([-M, M], Y)$, set
\[
f_\pi := f(c_0) 1_{[t_0, t_1]} + \sum_{i=1}^{n-1} f(c_i) 1_{(t_i, t_{i+1}]}.
\]
Show that $f_\pi \in S$ and
\[
\|f - f_\pi\|_\infty \to 0 \text{ as } |\pi| := \max\{(t_{i+1} - t_i) : i = 0, 1, 2, \dots, n-1\} \to 0.
\]
Conclude from this that
\[
I(f) = \lim_{|\pi| \to 0} \sum_{i=0}^{n-1} f(c_i)\,(F(t_{i+1}) - F(t_i)).
\]
As usual we will write this integral as $\int_{-M}^{M} f\, dF$ and as $\int_{-M}^{M} f(t)\, dt$ if $F(t) = t$.
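The limit in item 3 can be explored numerically. The following is a minimal sketch, assuming scalar-valued $f$, a uniform partition, and left endpoints as the tags $c_i$; the name `stieltjes_sum` and the sample choices $f(t) = t^2$, $F(t) = t^3$ are illustrative and not part of the exercise. For these choices the limit should be $\int_{-1}^{1} t^2 \cdot 3t^2\, dt = 6/5$.

```python
def stieltjes_sum(f, F, M, n):
    """Left-tagged Riemann-Stieltjes sum of f against F on [-M, M].

    Uses the uniform partition t_i = -M + 2*M*i/n and the tags c_i = t_i.
    """
    ts = [-M + 2.0 * M * i / n for i in range(n + 1)]
    return sum(f(ts[i]) * (F(ts[i + 1]) - F(ts[i])) for i in range(n))


if __name__ == "__main__":
    # The sums should approach 6/5 as the mesh 2*M/n shrinks.
    for n in (10, 100, 1000):
        print(n, stieltjes_sum(lambda t: t * t, lambda t: t ** 3, 1.0, n))
```

With $F(t) = t$ the same function reduces to an ordinary left Riemann sum.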
Exercise 29.6. Let $(X, \tau)$ be a locally compact Hausdorff space and $I : C_0(X, \mathbb{R}) \to \mathbb{R}$ be a positive linear functional. Show $I$ is necessarily bounded, i.e. there exists a $C < \infty$ such that $|I(f)| \le C \|f\|_\infty$ for all $f \in C_0(X, \mathbb{R})$.
Outline. (BRUCE: see Nate's proof below and then rewrite this outline to make the problem much easier and to handle more general circumstances.)
1. By the Riesz-Markov Theorem 28.16, there exists a unique Radon measure $\mu$ on $(X, \mathcal{B}_X)$ such that $\mu(f) := \int_X f\, d\mu = I(f)$ for all $f \in C_c(X, \mathbb{R})$. Show $\mu(f) \le I(f)$ for all $f \in C_0(X, [0, \infty))$.
2. Show that if $\mu(X) = \infty$, then there exists a function $f \in C_0(X, [0, \infty))$ such that $\infty = \mu(f) \le I(f)$, contradicting the assumption that $I$ is real valued.
3. By item 2., $\mu(X) < \infty$ and therefore $C_0(X, \mathbb{R}) \subset L^1(\mu)$ and $\mu : C_0(X, \mathbb{R}) \to \mathbb{R}$ is a well defined positive linear functional. Let $J(f) := I(f) - \mu(f)$, which by item 1. is a positive linear functional such that $J|_{C_c(X, \mathbb{R})} \equiv 0$. Show that any positive linear functional, $J$, on $C_0(X, \mathbb{R})$ satisfying these properties must necessarily be zero. Thus $I(f) = \mu(f)$ and $\|I\| = \mu(X) < \infty$ as claimed.
Exercise 29.7. BRUCE (Drop this exercise or move somewhere else, it is already proved in the notes in more general terms.) Suppose that $I : C_c(\mathbb{R}, \mathbb{R}) \to \mathbb{R}$ is a positive linear functional. Show
1. For each compact subset $K \subset \mathbb{R}$ there exists a constant $C_K < \infty$ such that
\[
|I(f)| \le C_K \|f\|_\infty \text{ whenever } \operatorname{supp}(f) \subset K.
\]
2. Show there exists a unique Radon measure $\mu$ on $\mathcal{B}_{\mathbb{R}}$ (the Borel $\sigma$-algebra on $\mathbb{R}$) such that $I(f) = \int_{\mathbb{R}} f\, d\mu$ for all $f \in C_c(\mathbb{R}, \mathbb{R})$.
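29.4.1 The Laws of Large Numbers Exercises
For the rest of the problems of this section, let $\nu$ be a probability measure on $\mathcal{B}_{\mathbb{R}}$ such that $\int_{\mathbb{R}} |x|\, d\nu(x) < \infty$. By Corollary 28.44, there exists a unique measure $\mu$ on $(X := \mathbb{R}^{\mathbb{N}}, \mathcal{B} := \mathcal{B}_{\mathbb{R}^{\mathbb{N}}} = \otimes_{n=1}^\infty \mathcal{B}_{\mathbb{R}})$ such that
\[
\int_X f(x_1, x_2, \dots, x_N)\, d\mu(x) = \int_{\mathbb{R}^N} f(x_1, x_2, \dots, x_N)\, d\nu(x_1) \dots d\nu(x_N) \tag{29.4}
\]
for all $N \in \mathbb{N}$ and bounded measurable functions $f : \mathbb{R}^N \to \mathbb{R}$, i.e. $\mu = \otimes_{n=1}^\infty \mu_n$ with $\mu_n = \nu$ for every $n$. We will also use the following notation:
\[
S_n(x) := \frac{1}{n} \sum_{k=1}^n x_k \text{ for } x \in X,
\]
\[
m := \int_{\mathbb{R}} x\, d\nu(x), \quad
\sigma^2 := \int_{\mathbb{R}} (x - m)^2\, d\nu(x) = \int_{\mathbb{R}} x^2\, d\nu(x) - m^2, \text{ and } \quad
\gamma := \int_{\mathbb{R}} (x - m)^4\, d\nu(x).
\]
As is customary, $m$ is said to be the mean or average of $\nu$ and $\sigma^2$ is the variance of $\nu$.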
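Exercise 29.8 (Weak Law of Large Numbers). Assume $\sigma^2 < \infty$. Show $\int_X S_n\, d\mu = m$,
\[
\|S_n - m\|_2^2 = \int_X (S_n - m)^2\, d\mu = \frac{\sigma^2}{n},
\]
and $\mu(|S_n - m| > \varepsilon) \le \dfrac{\sigma^2}{n \varepsilon^2}$ for all $\varepsilon > 0$ and $n \in \mathbb{N}$.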
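Exercise 29.9 (A simple form of the Strong Law of Large Numbers). Suppose now that $\gamma := \int_{\mathbb{R}} (x - m)^4\, d\nu(x) < \infty$. Show for all $\varepsilon > 0$ and $n \in \mathbb{N}$ that
\[
\|S_n - m\|_4^4 = \int_X (S_n - m)^4\, d\mu = \frac{1}{n^4}\left(n\gamma + 3n(n-1)\sigma^4\right) = \frac{1}{n^2}\left[n^{-1}\gamma + 3\left(1 - n^{-1}\right)\sigma^4\right]
\]
and
\[
\mu(|S_n - m| > \varepsilon) \le \frac{n^{-1}\gamma + 3\left(1 - n^{-1}\right)\sigma^4}{\varepsilon^4 n^2}.
\]
Conclude from the last estimate and the first Borel-Cantelli Lemma 19.20 that $\lim_{n\to\infty} S_n(x) = m$ for $\mu$-a.e. $x \in X$.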
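The fourth moment identity above can be checked exactly for a small discrete $\nu$. The sketch below is only a sanity check under stated assumptions, not a solution of the exercise: it takes $\nu$ to be a finitely supported distribution given as (value, probability) pairs, uses exact rational arithmetic, and enumerates all $n$-tuples, so it is practical only for small $n$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def moments(dist):
    """Return (m, sigma^2, gamma) for a finite distribution [(value, prob), ...]."""
    m = sum(p * v for v, p in dist)
    var = sum(p * (v - m) ** 2 for v, p in dist)
    gamma = sum(p * (v - m) ** 4 for v, p in dist)
    return m, var, gamma


def fourth_moment_of_mean(dist, n):
    """Exact value of E[(S_n - m)^4] by enumerating all n-tuples of i.i.d. samples."""
    m, _, _ = moments(dist)
    total = Fraction(0)
    for outcome in product(dist, repeat=n):
        prob = Fraction(1)
        s = Fraction(0)
        for v, p in outcome:
            prob *= p
            s += v
        total += prob * (s / n - m) ** 4
    return total


def closed_form(dist, n):
    """Right-hand side (n*gamma + 3*n*(n-1)*sigma^4) / n^4 from Exercise 29.9."""
    _, var, gamma = moments(dist)
    return (n * gamma + 3 * n * (n - 1) * var ** 2) / Fraction(n) ** 4


# An arbitrary illustrative distribution on {0, 1, 4}.
example_dist = [
    (Fraction(0), Fraction(1, 3)),
    (Fraction(1), Fraction(1, 2)),
    (Fraction(4), Fraction(1, 6)),
]
```

Comparing `fourth_moment_of_mean(example_dist, n)` with `closed_form(example_dist, n)` for a few small `n` exercises the combinatorial count $3n(n-1)$ of the surviving cross terms.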
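Exercise 29.10. Suppose $\gamma := \int_{\mathbb{R}} (x - m)^4\, d\nu(x) < \infty$ and $m = \int_{\mathbb{R}} x\, d\nu(x) \ne 0$. For $\lambda > 0$ let $T_\lambda : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}^{\mathbb{N}}$ be defined by $T_\lambda(x) = (\lambda x_1, \lambda x_2, \dots, \lambda x_n, \dots)$, $\mu_\lambda = \mu \circ T_\lambda^{-1}$ and
\[
X_\lambda := \left\{ x \in \mathbb{R}^{\mathbb{N}} : \lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^n x_j = \lambda m \right\}.
\]
Show
\[
\mu_\lambda(X_{\lambda'}) = \delta_{\lambda, \lambda'} =
\begin{cases} 1 & \text{if } \lambda = \lambda' \\ 0 & \text{if } \lambda \ne \lambda' \end{cases}
\]
and use this to show if $\lambda \ne 1$, then $d\mu_\lambda \ne \rho\, d\mu$ for any measurable function $\rho : \mathbb{R}^{\mathbb{N}} \to [0, \infty]$.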
30 Lebesgue Differentiation and the Fundamental Theorem of Calculus
BRUCE: replace $\mathbb{R}^n$ by $\mathbb{R}^d$ in this section?
Notation 30.1 In this chapter, let $\mathcal{B} = \mathcal{B}_{\mathbb{R}^n}$ denote the Borel $\sigma$-algebra on $\mathbb{R}^n$ and $m$ be Lebesgue measure on $\mathcal{B}$. If $V$ is an open subset of $\mathbb{R}^n$, let $L^1_{loc}(V) := L^1_{loc}(V, m)$ and simply write $L^1_{loc}$ for $L^1_{loc}(\mathbb{R}^n)$. We will also write $|A|$ for $m(A)$ when $A \in \mathcal{B}$.

Definition 30.2. A collection of measurable sets $\{E_r\}_{r>0} \subset \mathcal{B}$ is said to shrink nicely to $x \in \mathbb{R}^n$ if (i) $E_r \subset \overline{B(x, r)}$ for all $r > 0$ and (ii) there exists $\alpha > 0$ such that $m(E_r) \ge \alpha\, m(B(x, r))$. We will abbreviate this by writing $E_r \downarrow \{x\}$ nicely. (Notice that it is not required that $x \in E_r$ for any $r > 0$.)
The main result of this chapter is the following theorem.

Theorem 30.3. Suppose that $\nu$ is a complex measure on $(\mathbb{R}^n, \mathcal{B})$, then there exists $g \in L^1(\mathbb{R}^n, m)$ and a complex measure $\nu_s$ such that $\nu_s \perp m$, $d\nu = g\, dm + d\nu_s$, and for $m$-a.e. $x$
\[
g(x) = \lim_{r \downarrow 0} \frac{\nu(E_r)}{m(E_r)} \tag{30.1}
\]
for any collection of $\{E_r\}_{r>0} \subset \mathcal{B}$ which shrink nicely to $\{x\}$. (Eq. (30.1) holds for all $x \in \mathcal{L}(g)$, the Lebesgue set of $g$, see Definition 30.11 and Theorem 30.12 below.)

Proof. The existence of $g$ and $\nu_s$ such that $\nu_s \perp m$ and $d\nu = g\, dm + d\nu_s$ is a consequence of the Radon-Nikodym Theorem 24.34. Since
\[
\frac{\nu(E_r)}{m(E_r)} = \frac{1}{m(E_r)} \int_{E_r} g(x)\, dm(x) + \frac{\nu_s(E_r)}{m(E_r)},
\]
Eq. (30.1) is a consequence of Theorem 30.13 and Corollary 30.15 below.

The rest of this chapter will be devoted to filling in the details of the proof of this theorem.
30.1 A Covering Lemma and Averaging Operators
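Lemma 30.4 (Covering Lemma). Let $\mathcal{E}$ be a collection of open balls in $\mathbb{R}^n$ and $U = \bigcup_{B \in \mathcal{E}} B$. If $c < m(U)$, then there exist disjoint balls $B_1, \dots, B_k \in \mathcal{E}$ such that $c < 3^n \sum_{j=1}^k m(B_j)$.

Proof. Choose a compact set $K \subset U$ such that $m(K) > c$ and then let $\mathcal{E}_1 \subset \mathcal{E}$ be a finite subcover of $K$. Choose $B_1 \in \mathcal{E}_1$ to be a ball with largest diameter in $\mathcal{E}_1$. Let $\mathcal{E}_2 = \{A \in \mathcal{E}_1 : A \cap B_1 = \emptyset\}$. If $\mathcal{E}_2$ is not empty, choose $B_2 \in \mathcal{E}_2$ to be a ball with largest diameter in $\mathcal{E}_2$. Similarly let $\mathcal{E}_3 = \{A \in \mathcal{E}_2 : A \cap B_2 = \emptyset\}$ and if $\mathcal{E}_3$ is not empty, choose $B_3 \in \mathcal{E}_3$ to be a ball with largest diameter in $\mathcal{E}_3$. Continue choosing $B_i \in \mathcal{E}$ for $i = 1, 2, \dots, k$ this way until $\mathcal{E}_{k+1}$ is empty, see Figure 30.1 below.

Fig. 30.1. Picking out the large disjoint balls via the "greedy" algorithm.

If $B = B(x_0, r) \subset \mathbb{R}^n$, let $B^* = B(x_0, 3r) \subset \mathbb{R}^n$, that is $B^*$ is the ball concentric with $B$ which has three times the radius of $B$. We will now show $K \subset \bigcup_{i=1}^k B_i^*$. For each $A \in \mathcal{E}_1$ there exists a first $i$ such that $B_i \cap A \ne \emptyset$. In this case $\operatorname{diam}(A) \le \operatorname{diam}(B_i)$ and $A \subset B_i^*$. Therefore $A \subset \bigcup_{i=1}^k B_i^*$ and hence $K \subset \bigcup\{A : A \in \mathcal{E}_1\} \subset \bigcup_{i=1}^k B_i^*$. Hence by sub-additivity,
\[
c < m(K) \le \sum_{i=1}^k m(B_i^*) \le 3^n \sum_{i=1}^k m(B_i).
\]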
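The greedy selection in the proof can be sketched in code for a finite family of balls. This is an illustration under assumptions, not part of the text: balls are represented as hypothetical `(center, radius)` pairs with `center` a tuple of floats, and two open balls are treated as disjoint exactly when the distance between their centers is at least the sum of their radii. Scanning the balls in order of decreasing radius and keeping each one that misses all balls already kept reproduces the choices $B_1, B_2, \dots$ of the proof.

```python
import math


def greedy_disjoint(balls):
    """Pick disjoint balls greedily, largest radius first (as in the proof of Lemma 30.4)."""
    chosen = []
    for center, radius in sorted(balls, key=lambda b: b[1], reverse=True):
        # Open balls B(c, r) and B(c2, r2) are disjoint iff |c - c2| >= r + r2.
        if all(math.dist(center, c2) >= radius + r2 for c2, r2 in chosen):
            chosen.append((center, radius))
    return chosen


def covered_by_tripled(ball, chosen):
    """True if `ball` lies inside B* = B(c2, 3*r2) for some chosen ball B(c2, r2)."""
    center, radius = ball
    return any(math.dist(center, c2) + radius <= 3 * r2 for c2, r2 in chosen)


# A small illustrative family of balls in the plane.
balls = [
    ((0.0, 0.0), 1.0),
    ((1.5, 0.0), 0.8),
    ((0.5, 0.5), 0.3),
    ((3.0, 0.0), 0.5),
    ((3.2, 0.4), 0.45),
    ((-1.2, 0.2), 0.6),
]
chosen = greedy_disjoint(balls)
```

Every ball of the family, chosen or not, should then satisfy `covered_by_tripled(ball, chosen)`, which is the inclusion $A \subset B_i^*$ used in the proof.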
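Definition 30.5. For $f \in L^1_{loc}$, $x \in \mathbb{R}^n$ and $r > 0$ let
\[
(A_r f)(x) = \frac{1}{|B(x, r)|} \int_{B(x, r)} f\, dm \tag{30.2}
\]
where $B(x, r) \subset \mathbb{R}^n$ is the open ball of radius $r$ centered at $x$, and $|A| := m(A)$.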
Lemma 30.6. Let $f \in L^1_{loc}$, then for each $x \in \mathbb{R}^n$, $(0, \infty) \ni r \to (A_r f)(x) \in \mathbb{C}$ is continuous and for each $r > 0$, $\mathbb{R}^n \ni x \to (A_r f)(x) \in \mathbb{C}$ is measurable.

Proof. Recall that $|B(x, r)| = m(E_1) r^n$, where $E_1 := B(0, 1)$, which is continuous in $r$. Also $\lim_{r \to r_0} 1_{B(x, r)}(y) = 1_{B(x, r_0)}(y)$ if $|y - x| \ne r_0$ and since $m(\{y : |y - x| = r_0\}) = 0$ (you prove!), $\lim_{r \to r_0} 1_{B(x, r)}(y) = 1_{B(x, r_0)}(y)$ for $m$-a.e. $y$. So by the dominated convergence theorem,
\[
\lim_{r \to r_0} \int_{B(x, r)} f\, dm = \int_{B(x, r_0)} f\, dm
\]
and therefore
\[
(A_r f)(x) = \frac{1}{m(E_1) r^n} \int_{B(x, r)} f\, dm
\]
is continuous in $r$. Let $g_r(x, y) := 1_{B(x, r)}(y) = 1_{|x - y| < r}$. Then $g_r$ is $\mathcal{B} \otimes \mathcal{B}$-measurable (for example write it as a limit of continuous functions or just notice that $F : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ defined by $F(x, y) := |x - y|$ is continuous) so that by Fubini's theorem
\[
x \to \int_{B(x, r)} f\, dm = \int_{\mathbb{R}^n} g_r(x, y) f(y)\, dm(y)
\]
is $\mathcal{B}$-measurable and hence so is $x \to (A_r f)(x)$.
30.2 Maximal Functions
Definition 30.7. For $f \in L^1(m)$, the Hardy-Littlewood maximal function $Hf$ is defined by
\[
(Hf)(x) = \sup_{r > 0} A_r |f| (x).
\]
Lemma 30.6 allows us to write
\[
(Hf)(x) = \sup_{r \in \mathbb{Q},\, r > 0} A_r |f| (x)
\]
from which it follows that $Hf$ is measurable.
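Theorem 30.8 (Maximal Inequality). If $f \in L^1(m)$ and $\alpha > 0$, then
\[
m(Hf > \alpha) \le \frac{3^n}{\alpha} \|f\|_{L^1}.
\]
This should be compared with Chebyshev's inequality which states that
\[
m(|f| > \alpha) \le \frac{\|f\|_{L^1}}{\alpha}.
\]
Proof. Let $E_\alpha := \{Hf > \alpha\}$. For all $x \in E_\alpha$ there exists $r_x$ such that $A_{r_x} |f| (x) > \alpha$, i.e.
\[
|B_x(r_x)| < \frac{1}{\alpha} \int_{B_x(r_x)} |f|\, dm.
\]
Since $E_\alpha \subset \bigcup_{x \in E_\alpha} B_x(r_x)$, if $c < m(E_\alpha) \le m(\bigcup_{x \in E_\alpha} B_x(r_x))$ then, using Lemma 30.4, there exist $x_1, \dots, x_k \in E_\alpha$ and disjoint balls $B_i = B_{x_i}(r_{x_i})$ for $i = 1, 2, \dots, k$ such that
\[
c < \sum_{i=1}^k 3^n |B_i| < \sum_{i=1}^k \frac{3^n}{\alpha} \int_{B_i} |f|\, dm \le \frac{3^n}{\alpha} \int_{\mathbb{R}^n} |f|\, dm = \frac{3^n}{\alpha} \|f\|_{L^1}.
\]
This shows that $c < 3^n \alpha^{-1} \|f\|_{L^1}$ for all $c < m(E_\alpha)$ which proves $m(E_\alpha) \le 3^n \alpha^{-1} \|f\|_{L^1}$.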
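A discrete analogue can make the maximal inequality concrete. The sketch below is an assumption-laden illustration, not the setting of the theorem: it replaces $\mathbb{R}^n$ with the integers and counting measure, takes a finitely supported sequence `f` (zero outside its index range), and uses centered windows $\{x - r, \dots, x + r\}$ as balls. The same covering argument gives the bound with constant 3, since a tripled window has $6r + 1 \le 3(2r + 1)$ points. The function names are hypothetical.

```python
def maximal_function(f):
    """Centered discrete maximal function of |f| at the indices 0..len(f)-1.

    f is treated as zero outside its index range.
    """
    n = len(f)
    prefix = [0.0]
    for v in f:
        prefix.append(prefix[-1] + abs(v))

    def window_sum(lo, hi):
        # Sum of |f| over the indices lo..hi (inclusive), clipped to 0..n-1.
        lo, hi = max(lo, 0), min(hi, n - 1)
        return prefix[hi + 1] - prefix[lo] if lo <= hi else 0.0

    result = []
    for x in range(n):
        # Radii r >= n - 1 already contain the whole support, so larger r only lowers the average.
        result.append(max(window_sum(x - r, x + r) / (2 * r + 1) for r in range(n)))
    return result


def level_set_size(values, alpha):
    """Counting measure of {values > alpha}."""
    return sum(1 for v in values if v > alpha)


sample = [0, 0, 5, 0, 0, 1, 0, 0, 0, 0]
H = maximal_function(sample)
```

For each `alpha > 0`, `level_set_size(H, alpha)` should not exceed `3 * sum(abs(v) for v in sample) / alpha`, mirroring $m(Hf > \alpha) \le 3^n \alpha^{-1} \|f\|_{L^1}$ with $n = 1$.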
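Theorem 30.9. If $f \in L^1_{loc}$ then $\lim_{r \downarrow 0} (A_r f)(x) = f(x)$ for $m$-a.e. $x \in \mathbb{R}^n$.

Proof. Without loss of generality we may assume $f \in L^1(m)$. We now begin with the special case where $f = g \in L^1(m)$ is also continuous. In this case we find:
\[
|(A_r g)(x) - g(x)| \le \frac{1}{|B(x, r)|} \int_{B(x, r)} |g(y) - g(x)|\, dm(y) \le \sup_{y \in B(x, r)} |g(y) - g(x)| \to 0 \text{ as } r \to 0.
\]
In fact we have shown that $(A_r g)(x) \to g(x)$ as $r \to 0$ uniformly for $x$ in compact subsets of $\mathbb{R}^n$. For general $f \in L^1(m)$,
\[
\begin{aligned}
|A_r f(x) - f(x)| &\le |A_r f(x) - A_r g(x)| + |A_r g(x) - g(x)| + |g(x) - f(x)| \\
&= |A_r (f - g)(x)| + |A_r g(x) - g(x)| + |g(x) - f(x)| \\
&\le H(f - g)(x) + |A_r g(x) - g(x)| + |g(x) - f(x)|
\end{aligned}
\]
and therefore,
\[
\limsup_{r \downarrow 0} |A_r f(x) - f(x)| \le H(f - g)(x) + |g(x) - f(x)|.
\]
So if $\alpha > 0$, then
\[
E_\alpha := \left\{ \limsup_{r \downarrow 0} |A_r f(x) - f(x)| > \alpha \right\} \subset \left\{ H(f - g) > \frac{\alpha}{2} \right\} \cup \left\{ |g - f| > \frac{\alpha}{2} \right\}
\]
and thus
\[
\begin{aligned}
m(E_\alpha) &\le m\left( H(f - g) > \frac{\alpha}{2} \right) + m\left( |g - f| > \frac{\alpha}{2} \right) \\
&\le \frac{3^n}{\alpha/2} \|f - g\|_{L^1} + \frac{1}{\alpha/2} \|f - g\|_{L^1} \\
&\le 2(3^n + 1)\, \alpha^{-1} \|f - g\|_{L^1},
\end{aligned}
\]
where in the second inequality we have used the Maximal inequality (Theorem 30.8) and Chebyshev's inequality. Since this is true for all continuous $g \in C(\mathbb{R}^n) \cap L^1(m)$ and this set is dense in $L^1(m)$, we may make $\|f - g\|_{L^1}$ as small as we please. Hence $m(E_\alpha) = 0$ for every $\alpha > 0$, which shows that
\[
m\left( \left\{ x : \limsup_{r \downarrow 0} |A_r f(x) - f(x)| > 0 \right\} \right) = m\left( \bigcup_{n=1}^\infty E_{1/n} \right) \le \sum_{n=1}^\infty m(E_{1/n}) = 0.
\]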
Corollary 30.10. If $d\mu = g\, dm$ with $g \in L^1_{loc}$ then
\[
\frac{\mu(B(x, r))}{|B(x, r)|} = A_r g(x) \to g(x) \text{ for } m\text{-a.e. } x.
\]
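30.3 Lebesgue Set
Definition 30.11. For $f \in L^1_{loc}(m)$, the Lebesgue set of $f$ is
\[
\mathcal{L}(f) := \left\{ x \in \mathbb{R}^n : \lim_{r \downarrow 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y) - f(x)|\, dy = 0 \right\}
= \left\{ x \in \mathbb{R}^n : \lim_{r \downarrow 0} \left( A_r |f(\cdot) - f(x)| \right)(x) = 0 \right\}.
\]
More generally, if $p \in [1, \infty)$ and $f \in L^p_{loc}(m)$, let
\[
\mathcal{L}^p(f) := \left\{ x \in \mathbb{R}^n : \lim_{r \downarrow 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y) - f(x)|^p\, dy = 0 \right\}.
\]
Theorem 30.12. Suppose $1 \le p < \infty$ and $f \in L^p_{loc}(m)$, then $m\left( \mathbb{R}^n \setminus \mathcal{L}^p(f) \right) = 0$.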
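Proof. For $w \in \mathbb{C}$ define $g_w(x) = |f(x) - w|^p$ and
\[
E_w := \left\{ x : \lim_{r \downarrow 0} (A_r g_w)(x) \ne g_w(x) \right\}.
\]
Then by Theorem 30.9 $m(E_w) = 0$ for all $w \in \mathbb{C}$ and therefore $m(E) = 0$ where
\[
E = \bigcup_{w \in \mathbb{Q} + i\mathbb{Q}} E_w.
\]
By definition of $E$, if $x \notin E$ then
\[
\lim_{r \downarrow 0} (A_r |f(\cdot) - w|^p)(x) = |f(x) - w|^p
\]
for all $w \in \mathbb{Q} + i\mathbb{Q}$. Letting $q := \frac{p}{p-1}$ (so that $p/q = p - 1$) we have
\[
|f(\cdot) - f(x)|^p \le (|f(\cdot) - w| + |w - f(x)|)^p \le 2^{p/q} (|f(\cdot) - w|^p + |w - f(x)|^p) = 2^{p-1} (|f(\cdot) - w|^p + |w - f(x)|^p),
\]
\[
(A_r |f(\cdot) - f(x)|^p)(x) \le 2^{p-1} (A_r |f(\cdot) - w|^p)(x) + 2^{p-1} |w - f(x)|^p
\]
and hence for $x \notin E$,
\[
\limsup_{r \downarrow 0} (A_r |f(\cdot) - f(x)|^p)(x) \le 2^{p-1} |f(x) - w|^p + 2^{p-1} |w - f(x)|^p = 2^p |f(x) - w|^p.
\]
Since this is true for all $w \in \mathbb{Q} + i\mathbb{Q}$, which is dense in $\mathbb{C}$, we see that
\[
\limsup_{r \downarrow 0} (A_r |f(\cdot) - f(x)|^p)(x) = 0 \text{ for all } x \notin E,
\]
i.e. $E^c \subset \mathcal{L}^p(f)$ or equivalently $(\mathcal{L}^p(f))^c \subset E$. So $m(\mathbb{R}^n \setminus \mathcal{L}^p(f)) \le m(E) = 0$.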
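Theorem 30.13 (Lebesgue Differentiation Theorem). If $f \in L^p_{loc}$ and $x \in \mathcal{L}^p(f)$ (so in particular for $m$-a.e. $x$), then
\[
\lim_{r \downarrow 0} \frac{1}{m(E_r)} \int_{E_r} |f(y) - f(x)|^p\, dy = 0
\]
and
\[
\lim_{r \downarrow 0} \frac{1}{m(E_r)} \int_{E_r} f(y)\, dy = f(x)
\]
when $E_r \downarrow \{x\}$ nicely, see Definition 30.2.

Proof. For $x \in \mathcal{L}^p(f)$, by Hölder's inequality (Theorem 21.2) or Jensen's inequality (Theorem 21.10), we have
\[
\begin{aligned}
\left| \frac{1}{m(E_r)} \int_{E_r} f(y)\, dy - f(x) \right|^p
&= \left| \frac{1}{m(E_r)} \int_{E_r} (f(y) - f(x))\, dy \right|^p \\
&\le \frac{1}{m(E_r)} \int_{E_r} |f(y) - f(x)|^p\, dy \\
&\le \frac{1}{\alpha\, m(B(x, r))} \int_{B(x, r)} |f(y) - f(x)|^p\, dy
\end{aligned}
\]
which tends to zero as $r \downarrow 0$ by Theorem 30.12. In the second inequality we have used the fact that $m(\overline{B(x, r)} \setminus B(x, r)) = 0$.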
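Lemma 30.14. Suppose $\lambda$ is a positive $K$-finite measure on $\mathcal{B} := \mathcal{B}_{\mathbb{R}^n}$ such that $\lambda \perp m$. Then for $m$-a.e. $x$,
\[
\lim_{r \downarrow 0} \frac{\lambda(B(x, r))}{m(B(x, r))} = 0.
\]
Proof. Let $A \in \mathcal{B}$ be such that $\lambda(A) = 0$ and $m(A^c) = 0$. By the regularity theorem (see Theorem 28.22, Corollary 32.42 or Exercise 33.4), for all $\varepsilon > 0$ there exists an open set $V_\varepsilon \subset \mathbb{R}^n$ such that $A \subset V_\varepsilon$ and $\lambda(V_\varepsilon) < \varepsilon$. For the rest of this argument, we will assume $m$ has been extended to the Lebesgue measurable sets, $\mathcal{L} := \bar{\mathcal{B}}^m$. Let
\[
F_k := \left\{ x \in A : \limsup_{r \downarrow 0} \frac{\lambda(B(x, r))}{m(B(x, r))} > \frac{1}{k} \right\},
\]
then for $x \in F_k$ choose $r_x > 0$ such that $B_x(r_x) \subset V_\varepsilon$ (see Figure 30.2) and $\frac{\lambda(B(x, r_x))}{m(B(x, r_x))} > \frac{1}{k}$, i.e.
\[
m(B(x, r_x)) < k\, \lambda(B(x, r_x)).
\]
Let $\mathcal{E} = \{B(x, r_x)\}_{x \in F_k}$ and $U := \bigcup_{x \in F_k} B(x, r_x) \subset V_\varepsilon$. Heuristically if all the balls in $\mathcal{E}$ were disjoint and $\mathcal{E}$ were countable, then
\[
m(F_k) \le \sum_{x \in F_k} m(B(x, r_x)) < k \sum_{x \in F_k} \lambda(B(x, r_x)) = k\, \lambda(U) \le k\, \lambda(V_\varepsilon) \le k \varepsilon.
\]
Since $\varepsilon > 0$ is arbitrary this would imply that $F_k \in \mathcal{L}$ and $m(F_k) = 0$. To fix the above argument, suppose that $c < m(U)$ and use the covering lemma to find disjoint balls $B_1, \dots, B_N \in \mathcal{E}$ such that
\[
c < 3^n \sum_{i=1}^N m(B_i) < k 3^n \sum_{i=1}^N \lambda(B_i) \le k 3^n \lambda(U) \le k 3^n \lambda(V_\varepsilon) \le k 3^n \varepsilon.
\]
Since $c < m(U)$ is arbitrary we learn that $m(U) \le k 3^n \varepsilon$. This argument produces open sets $U_\varepsilon$ such that $F_k \subset U_\varepsilon$ and $m(U_\varepsilon) \le k 3^n \varepsilon$ for all $\varepsilon > 0$. Therefore $F_k \subset G := \bigcap_{l=1}^\infty U_{1/l} \in \mathcal{B}$ with $m(G) = 0$ which shows $F_k \in \mathcal{L}$ and $m(F_k) = 0$. Since
\[
F_\infty := \left\{ x \in A : \limsup_{r \downarrow 0} \frac{\lambda(B(x, r))}{m(B(x, r))} > 0 \right\} = \bigcup_{k=1}^\infty F_k \in \mathcal{L},
\]
it also follows that $F_\infty \in \mathcal{L}$ and $m(F_\infty) = 0$. Since
\[
\left\{ x \in \mathbb{R}^n : \limsup_{r \downarrow 0} \frac{\lambda(B(x, r))}{m(B(x, r))} > 0 \right\} \subset F_\infty \cup A^c
\]
and $m(A^c) = 0$, we have shown
\[
m\left( \left\{ x \in \mathbb{R}^n : \limsup_{r \downarrow 0} \frac{\lambda(B(x, r))}{m(B(x, r))} > 0 \right\} \right) = 0.
\]

Fig. 30.2. In this picture we imagine that $\lambda = \sum_{n=1}^\infty n^{-2} \delta_{(1/n, 0)}$ and $A = \mathbb{R}^2 \setminus \{(1/n, 0) : n \in \mathbb{N}\}$. We may approximate $A$ by the open sets $V_N := \mathbb{R}^2 \setminus \{(1/n, 0) : 1 \le n \le N\}$, since $\lambda(V_N) = \sum_{n=N+1}^\infty n^{-2} \to 0$ as $N \to \infty$. (Of course we could simplify matters in this setting by choosing $A = V := \mathbb{R}^2 \setminus (\{(1/n, 0) : 1 \le n \le N\} \cup \{0\})$, but this would not be very enlightening.)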
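Corollary 30.15. Let $\lambda$ be a complex or a $K$-finite signed measure (i.e. $\lambda(K) \in \mathbb{R}$ for all $K \subset\subset \mathbb{R}^n$) such that $\lambda \perp m$. Then for $m$-a.e. $x$,
\[
\lim_{r \downarrow 0} \frac{\lambda(E_r)}{m(E_r)} = 0
\]
whenever $E_r \downarrow \{x\}$ nicely.

Proof. By Exercise 24.17, $\lambda \perp m$ implies $|\lambda| \perp m$. Hence the result follows from Lemma 30.14 and the inequalities,
\[
\frac{|\lambda(E_r)|}{m(E_r)} \le \frac{|\lambda|(E_r)}{\alpha\, m(B(x, r))} \le \frac{|\lambda|(\overline{B(x, r)})}{\alpha\, m(B(x, r))} \le \frac{|\lambda|(B(x, 2r))}{\alpha\, 2^{-n} m(B(x, 2r))}.
\]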
Proposition 30.16. TODO Add in almost everywhere convergence result of convolutions by approximate $\delta$-functions.
30.4 The Fundamental Theorem of Calculus
In this section we will restrict the results above to the one dimensional setting.
The following notation will be in force for the rest of this chapter. (BRUCE:
make sure this notation agrees with the notation in Notation 30.21.)
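Notation 30.17 Let
1. $m$ be one dimensional Lebesgue measure on $\mathcal{B} := \mathcal{B}_{\mathbb{R}}$,
2. $\alpha, \beta$ be numbers in $\bar{\mathbb{R}}$ such that $-\infty \le \alpha < \beta \le \infty$,
3. $\mathcal{A} = \mathcal{A}_{[\alpha, \beta]}$ be the algebra generated by sets of the form $(a, b] \cap [\alpha, \beta]$ with $-\infty \le a < b \le \infty$,
4. $\mathcal{A}_b$ denote those sets in $\mathcal{A}$ which are bounded,
5. and $\mathcal{B}_{[\alpha, \beta]}$ be the Borel $\sigma$-algebra on $[\alpha, \beta] \cap \mathbb{R}$.

Notation 30.18 Given a function $F : \mathbb{R} \to \bar{\mathbb{R}}$ or $F : \mathbb{R} \to \mathbb{C}$, let $F(x-) = \lim_{y \uparrow x} F(y)$, $F(x+) = \lim_{y \downarrow x} F(y)$ and $F(\pm\infty) = \lim_{x \to \pm\infty} F(x)$ whenever the limits exist. Notice that if $F$ is a monotone function then $F(\pm\infty)$ and $F(x\pm)$ exist for all $x$.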
30.4.1 Increasing Functions
Theorem 30.19 (Monotone functions). Let $F : \mathbb{R} \to \mathbb{R}$ be increasing and define $G(x) = F(x+)$. Then:
1. The function $G$ is increasing and right continuous.
2. For $x \in \mathbb{R}$, $G(x) = \lim_{y \downarrow x} F(y-)$.
3. The set $\{x \in \mathbb{R} : F(x+) > F(x-)\}$ is countable and moreover, for each $N > 0$,
\[
\sum_{x \in (-N, N]} [F(x+) - F(x-)] \le F(N) - F(-N) < \infty. \tag{30.3}
\]
4. There exists a unique measure, $\nu_G$, on $\mathcal{B} = \mathcal{B}_{\mathbb{R}}$ such that
\[
\nu_G((a, b]) = G(b) - G(a) \text{ for all } a < b.
\]
5. For $m$-a.e. $x$, $F'(x)$ and $G'(x)$ exist and $F'(x) = G'(x)$. (Notice that $F'(x)$ and $G'(x)$ are non-negative when they exist.)
6. The function $F'$ is in $L^1_{loc}(m)$ and there exists a unique positive measure $\nu_s$ on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$ such that
\[
F(b+) - F(a+) = \int_a^b F'\, dm + \nu_s((a, b]) \text{ for all } -\infty < a < b < \infty.
\]
Moreover the measure $\nu_s$ is singular relative to $m$ and $F' \in L^1(\mathbb{R}, m)$ if $F$ is bounded.
Proof. Item 1. is a consequence of Eq. (28.35) of Theorem 28.40. Nevertheless we will still give a direct proof here as well.
1. The following observation shows $G$ is increasing: if $x < y$ then
\[
F(x-) \le F(x) \le F(x+) = G(x) \le F(y-) \le F(y) \le F(y+) = G(y). \tag{30.4}
\]
Since $G$ is increasing, $G(x) \le G(x+)$. If $y > x$ then $G(x+) \le F(y)$ and hence $G(x+) \le F(x+) = G(x)$, i.e. $G(x+) = G(x)$ which is to say $G$ is right continuous.
2. Since $G(x) \le F(y-) \le F(y)$ for all $y > x$, it follows that
\[
G(x) \le \lim_{y \downarrow x} F(y-) \le \lim_{y \downarrow x} F(y) = G(x)
\]
showing $G(x) = \lim_{y \downarrow x} F(y-)$.
3. By Eq. (30.4), if $x \ne y$ then
\[
(F(x-), F(x+)] \cap (F(y-), F(y+)] = \emptyset.
\]
Therefore, $\{(F(x-), F(x+)]\}_{x \in \mathbb{R}}$ are disjoint, possibly empty, intervals in $\mathbb{R}$. Let $N \in \mathbb{N}$ and $\alpha \subset\subset (-N, N)$ be a finite set, then
\[
\bigcup_{x \in \alpha} (F(x-), F(x+)] \subset (F(-N), F(N)]
\]
and therefore,
\[
\sum_{x \in \alpha} [F(x+) - F(x-)] \le F(N) - F(-N) < \infty.
\]
Since this is true for all $\alpha \subset\subset (-N, N]$, Eq. (30.3) holds. Eq. (30.3) shows
\[
\Gamma_N := \{x \in (-N, N) \mid F(x+) - F(x-) > 0\}
\]
is countable and hence so is
\[
\Gamma := \{x \in \mathbb{R} \mid F(x+) - F(x-) > 0\} = \bigcup_{N=1}^\infty \Gamma_N.
\]
4. Item 4. is a direct consequence of Theorem 28.32. Notice that $\nu_G$ is a finite measure when $F$ and hence $G$ is bounded.
5. By Theorem 30.3,
\[
d\nu_G = g\, dm + d\nu_s,
\]
where $\nu_s \perp m$, $g \in L^1_{loc}(\mathbb{R}, m)$ with $g \in L^1(\mathbb{R}, m)$ if $F$ is bounded, and for $m$-a.e. $x$ and all families $\{E_r\}_{r>0}$ which shrink nicely to $\{x\}$,
\[
g(x) = \lim_{r \downarrow 0} \left( \nu_G(E_r) / m(E_r) \right)
\]
with the limit being independent of the choice of the family $\{E_r\}_{r>0}$. Since $(x, x+r] \downarrow \{x\}$ and $(x-r, x] \downarrow \{x\}$ nicely,
\[
g(x) = \lim_{r \downarrow 0} \frac{\nu_G((x, x+r])}{m((x, x+r])} = \lim_{r \downarrow 0} \frac{G(x+r) - G(x)}{r} = \frac{d}{dx^+} G(x) \tag{30.5}
\]
and
\[
g(x) = \lim_{r \downarrow 0} \frac{\nu_G((x-r, x])}{m((x-r, x])} = \lim_{r \downarrow 0} \frac{G(x) - G(x-r)}{r} = \lim_{r \downarrow 0} \frac{G(x-r) - G(x)}{-r} = \frac{d}{dx^-} G(x) \tag{30.6}
\]
exist and are equal for $m$-a.e. $x$, i.e. $G'(x) = g(x)$ exists for $m$-a.e. $x$.
For $x \in \mathbb{R}$, let
\[
H(x) := G(x) - F(x) = F(x+) - F(x) \ge 0.
\]
Since $F(x) = G(x) - H(x)$, the proof of 5. will be complete once we show $H'(x) = 0$ for $m$-a.e. $x$. From Item 3.,
\[
\Lambda := \{x \in \mathbb{R} : F(x+) > F(x)\} \subset \{x \in \mathbb{R} : F(x+) > F(x-)\}
\]
is a countable set and
\[
\sum_{x \in (-N, N)} H(x) = \sum_{x \in (-N, N)} (F(x+) - F(x)) \le \sum_{x \in (-N, N)} (F(x+) - F(x-)) < \infty
\]
for all $N < \infty$. Therefore $\lambda := \sum_{x \in \mathbb{R}} H(x) \delta_x$ (i.e. $\lambda(A) := \sum_{x \in A} H(x)$ for all $A \in \mathcal{B}_{\mathbb{R}}$) defines a Radon measure on $\mathcal{B}_{\mathbb{R}}$. Since $\lambda(\Lambda^c) = 0$ and $m(\Lambda) = 0$, the measure $\lambda \perp m$. By Corollary 30.15 for $m$-a.e. $x$,
\[
\left| \frac{H(x+r) - H(x)}{r} \right| \le \frac{|H(x+r)| + |H(x)|}{|r|} \le \frac{H(x+|r|) + H(x-|r|) + H(x)}{|r|} \le 2\, \frac{\lambda([x-|r|, x+|r|])}{2|r|}
\]
and the last term goes to zero as $r \to 0$ because $\{[x-r, x+r]\}_{r>0}$ shrinks nicely to $\{x\}$ as $r \downarrow 0$ and $m([x-|r|, x+|r|]) = 2|r|$. Hence we conclude for $m$-a.e. $x$ that $H'(x) = 0$.
6. From Theorem 30.3, item 5. and Eqs. (30.5) and (30.6), $F' = G' \in L^1_{loc}(m)$ and $d\nu_G = F'\, dm + d\nu_s$ where $\nu_s$ is a positive measure such that $\nu_s \perp m$. Applying this equation to an interval of the form $(a, b]$ gives
\[
F(b+) - F(a+) = \nu_G((a, b]) = \int_a^b F'\, dm + \nu_s((a, b]). \tag{30.7}
\]
The uniqueness of $\nu_s$ such that this equation holds is a consequence of Theorem 19.55. As we have already mentioned, when $F$ is bounded then $F' \in L^1(\mathbb{R}, m)$. This can also be seen directly by letting $a \to -\infty$ and $b \to +\infty$ in Eq. (30.7).
Example 30.20. Let $C \subset [0, 1]$ denote the Cantor set constructed as follows. Let $C_1 = [0, 1] \setminus (1/3, 2/3)$, $C_2 := C_1 \setminus [(1/9, 2/9) \cup (7/9, 8/9)]$, etc., so that we keep removing the middle thirds at each stage in the construction. Then
\[
C := \bigcap_{n=1}^\infty C_n = \left\{ x = \sum_{j=1}^\infty a_j 3^{-j} : a_j \in \{0, 2\} \right\}
\]
and
\[
m(C) = 1 - \left( \frac{1}{3} + \frac{2}{9} + \frac{2^2}{3^3} + \dots \right) = 1 - \frac{1}{3} \sum_{n=0}^\infty \left( \frac{2}{3} \right)^n = 1 - \frac{1}{3} \cdot \frac{1}{1 - 2/3} = 0.
\]
Associated to this set is the so called Cantor function $F(x) := \lim_{n\to\infty} f_n(x)$ where the $\{f_n\}_{n=1}^\infty$ are continuous non-decreasing functions such that $f_n(0) = 0$, $f_n(1) = 1$ with the $f_n$ pictured in Figure 30.3 below. From the pictures one sees that $\{f_n\}$ are uniformly Cauchy, hence there exists $F \in C([0, 1])$ such that $F(x) := \lim_{n\to\infty} f_n(x)$. The function $F$ has the following properties,
1. $F$ is continuous and non-decreasing.
2. $F'(x) = 0$ for $m$-a.e. $x \in [0, 1]$ because $F$ is flat on all of the middle third open intervals used to construct the Cantor set $C$ and the total measure of these intervals is 1 as proved above.
3. The measure $\mu$ on $\mathcal{B}_{[0,1]}$ associated to $F$, namely $\mu([0, b]) = F(b)$, is singular relative to Lebesgue measure and $\mu(\{x\}) = 0$ for all $x \in [0, 1]$. Notice that $\mu([0, 1]) = 1$. In particular, the function $F$ certainly does not satisfy the fundamental theorem of calculus despite the fact that $F'(x) = 0$ for a.e. $x$.
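The Cantor function can be evaluated from the ternary expansion of $x$, which gives a quick way to experiment with the example. The sketch below is an illustration under assumptions rather than the construction via the $f_n$ used in the text: it reads ternary digits until the first digit 1 (where $x$ lies in the closure of a removed middle third and $F$ is constant), converts digits 2 to binary digits 1, and truncates after `depth` digits, so values at points of $C$ with infinite expansions are approximations. The names `cantor` and `removed_length` are hypothetical.

```python
from fractions import Fraction


def cantor(x, depth=60):
    """Cantor function on [0, 1], computed from the first `depth` ternary digits of x."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    total, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # first digit 1: F takes the value of that flat piece
            return total + scale
        total += scale * (digit // 2)   # ternary digit 2 becomes binary digit 1
        scale /= 2
    return total


def removed_length(n):
    """Total length of the middle thirds removed in the first n stages."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))
```

Here `removed_length(n)` equals $1 - (2/3)^n$, matching the geometric series for $m(C) = 0$, while `cantor` still climbs from 0 to 1.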
30.4.2 Functions of Bounded Variation
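Our next goal is to prove an analogue of Theorem 30.19 for complex valued $F$. Let $\alpha, \beta \in \bar{\mathbb{R}}$ with $\alpha < \beta$ be fixed. The following notation will be used throughout this section.

Notation 30.21 Let $(X, \mathcal{B})$ denote one of the following four measure spaces: $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$, $\left( (-\infty, \beta], \mathcal{B}_{(-\infty, \beta]} \right)$, $\left( (\alpha, \infty), \mathcal{B}_{(\alpha, \infty)} \right)$ or $\left( (\alpha, \beta], \mathcal{B}_{(\alpha, \beta]} \right)$ and let $\bar X$ denote the closure of $X$ in $\mathbb{R}$ and $\hat X$ denote the closure of $X$ in $\bar{\mathbb{R}} := [-\infty, \infty]$.^1 We further let $\mathcal{A}$ denote the algebra of half open intervals in $X$, i.e. the algebra generated by the sets $\{(a, b] \cap X : -\infty \le a \le b \le \infty\}$. Also let $\mathcal{A}_b$ be those $A \in \mathcal{A}$ which are bounded.

^1 So $\bar X$ is either $\mathbb{R}$, $(-\infty, \beta]$, $[\alpha, \infty)$, or $[\alpha, \beta]$ respectively and $\hat X$ is either $[-\infty, \infty]$, $[-\infty, \beta]$, $[\alpha, \infty]$, or $[\alpha, \beta]$ respectively.

Fig. 30.3. Constructing the Cantor function.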
Denition 30.22. For a x max P
where min := .
For example, if $X = (\alpha, \infty)$, then a partition of $\bar X = [\alpha, \infty)$ is a finite subset, $P$, of $[\alpha, \infty)$ such that $\alpha \in P$ and if $\alpha \le a < b < \infty$, then a partition of $[a, b]$ is a finite subset, $P$, of $[a, b]$ such that $a, b \in P$, see Figure 30.4.
The following proposition will help motivate a number of concepts which we will need to introduce.
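Proposition 30.23. Suppose $\nu$ is a complex measure on $(X, \mathcal{B})$ and $F : \bar X \to \mathbb{C}$ is a function such that
\[
\nu((a, b]) = F(b) - F(a)
\]
for all $a, b \in \bar X$ with $a < b$. (For example one may let $F(x) := \nu((-\infty, x] \cap X)$.) Then
1. $F : \bar X \to \mathbb{C}$ is a right continuous function,
2. For all $a, b \in \bar X$ with $a < b$,
\[
|\nu| (a, b] = \sup_P \sum_{x \in P} |\nu(x, x_+]| = \sup_P \sum_{x \in P} |F(x_+) - F(x)| \tag{30.8}
\]
where the supremum is over all partitions $P$ of $[a, b]$.
3. If $\inf X = -\infty$ then Eq. (30.8) remains valid for $a = -\infty$ and moreover,
\[
|\nu| ((-\infty, b]) = \lim_{a \to -\infty} |\nu| (a, b]. \tag{30.9}
\]
Similar statements hold in case $\sup X = +\infty$ in which case we may take $b = \infty$ above. In particular if $X = \mathbb{R}$, then
\[
|\nu| (\mathbb{R}) = \sup \left\{ \sum_{x \in P} |F(x_+) - F(x)| : P \text{ is a partition of } \mathbb{R} \right\}
= \lim_{\substack{a \to -\infty \\ b \to \infty}} \sup \left\{ \sum_{x \in P} |F(x_+) - F(x)| : P \text{ is a partition of } [a, b] \right\}.
\]
4. $\nu \ll m$ on $X$ iff for all $\varepsilon > 0$ there exists $\delta > 0$ such that
\[
\sum_{i=1}^n |\nu((a_i, b_i])| = \sum_{i=1}^n |F(b_i) - F(a_i)| < \varepsilon \tag{30.10}
\]
whenever $\{(a_i, b_i]\}_{i=1}^n$ are disjoint subintervals of $X$ such that $\sum_{i=1}^n (b_i - a_i) < \delta$.

Fig. 30.4. In this figure, $X = (\alpha, \infty)$ and partitions of $X$ and $[a, b]$ with $[a, b] \subset \bar X$ have been shown. The meaning of $x_+$ is also depicted.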
Proof. 1. The right continuity of $F$ is a consequence of the continuity of $\nu$ under decreasing limits of sets.
2 and 3. When $a, b \in \bar X$, Eq. (30.8) follows from Proposition 24.33 and the fact that $\mathcal{B} = \sigma(\mathcal{A})$. The verification of item 3. is left for Exercise 30.1.
4. Equation (30.10) is a consequence of Theorem 24.38 and the following remarks:
a) $\{(a_i, b_i) \cap X\}_{i=1}^n$ are disjoint intervals iff $\{(a_i, b_i] \cap X\}_{i=1}^n$ are disjoint intervals,
b) $m(X \cap (\bigcup_{i=1}^n (a_i, b_i])) \le \sum_{i=1}^n (b_i - a_i)$, and
c) the general element $A \in \mathcal{A}_b$ is of the form $A = X \cap (\bigcup_{i=1}^n (a_i, b_i])$.
Exercise 30.1. Prove Item 3. of Proposition 30.23.
Definition 30.24 (Total variation of a function). The total variation of a function $F : \bar X \to \mathbb{C}$ on $(a, b] \cap X \subset \bar X$ ($b = \infty$ is allowed here) is defined by
\[
T_F((a, b] \cap X) = \sup_P \sum_{x \in P} |F(x_+) - F(x)|
\]
where the supremum is over all partitions $P$ of $\overline{[a, b] \cap X}$. Also let
\[
T_F(b) := T_F((\inf X, b]) \text{ for all } b \in X.
\]
The function $F$ is said to have bounded variation on $(a, b] \cap X$ if $T_F((a, b] \cap X) < \infty$ and $F$ is said to be of bounded variation, and we write $F \in BV(X)$, if $T_F(X) < \infty$.
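The sums in this definition are easy to compute for a given partition. The sketch below is illustrative only: `variation_sum` is a hypothetical helper that evaluates $\sum_{x \in P} |F(x_+) - F(x)|$ over consecutive points of a finite partition, so it returns a lower bound for the total variation and not the supremum itself.

```python
import math


def variation_sum(F, partition):
    """Sum of |F(x_+) - F(x)| over consecutive points of a finite partition."""
    pts = sorted(partition)
    return sum(abs(F(b) - F(a)) for a, b in zip(pts, pts[1:]))


# sin on [0, 2*pi]: a partition through the turning points attains the variation 4,
# while the coarse partition {0, pi, 2*pi} sees almost none of it.
fine = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2, 2 * math.pi]
coarse = [0.0, math.pi, 2 * math.pi]
```

Adding points to a partition can only increase the sum (triangle inequality), and for a monotone $F$ every partition of $[a, b]$ gives exactly $|F(b) - F(a)|$.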
Definition 30.25 (Absolute continuity). A function $F : \bar X \to \mathbb{C}$ is absolutely continuous if for all $\varepsilon > 0$ there exists $\delta > 0$ such that
\[
\sum_{i=1}^n |F(b_i) - F(a_i)| < \varepsilon \tag{30.11}
\]
whenever $\{(a_i, b_i]\}_{i=1}^n$ are disjoint subintervals of $X$ such that $\sum_{i=1}^n (b_i - a_i) < \delta$.
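Exercise 30.2. Let $F, G : \bar X \to \mathbb{C}$ and $\lambda \in \mathbb{C}$ be given. Show
1. $T_{F+G} \le T_F + T_G$ and $T_{\lambda F} = |\lambda|\, T_F$. Conclude from this that $BV(X)$ is a vector space.
2. $T_{\operatorname{Re} F} \le T_F$, $T_{\operatorname{Im} F} \le T_F$, and $T_F \le T_{\operatorname{Re} F} + T_{\operatorname{Im} F}$. In particular $F \in BV(X)$ iff $\operatorname{Re} F$ and $\operatorname{Im} F$ are in $BV(X)$.
3. If $F : \bar X \to \mathbb{C}$ is absolutely continuous then $F : \bar X \to \mathbb{C}$ is continuous and in fact is uniformly continuous.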
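Lemma 30.26 (Examples). Let $F : \bar X \to \mathbb{F}$ be given, where $\mathbb{F}$ is either $\mathbb{R}$ or $\mathbb{C}$.
1. If $F : \bar X \to \mathbb{R}$ is a monotone function, then $T_F((a, b]) = |F(b) - F(a)|$ for all $a, b \in \bar X$ with $a < b$. So $F \in BV(X)$ iff $F$ is bounded (which will be the case if $X = [\alpha, \beta]$).
2. If $F : [\alpha, \beta] \to \mathbb{C}$ is absolutely continuous then $F \in BV((\alpha, \beta])$.
3. If $F \in C([\alpha, \beta] \to \mathbb{R})$, $F$ is differentiable at all $x \in (\alpha, \beta)$, and $\sup_{x \in (\alpha, \beta)} |F'(x)| = M < \infty$, then $F$ is absolutely continuous^2 and
\[
T_F((a, b]) \le M(b - a) \quad \forall\ \alpha \le a < b \le \beta.
\]
4. Let $f \in L^1(X, m)$ and set
\[
F(x) = \int_{(-\infty, x] \cap \bar X} f\, dm \text{ for all } x \in \bar X. \tag{30.12}
\]
Then $F : \bar X \to \mathbb{C}$ is absolutely continuous.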
Proof.
1. If $F$ is monotone increasing and $P$ is a partition of $(a, b]$ then
\[
\sum_{x \in P} |F(x_+) - F(x)| = \sum_{x \in P} (F(x_+) - F(x)) = F(b) - F(a)
\]
so that $T_F((a, b]) = F(b) - F(a)$. Similarly, one shows
\[
T_F((a, b]) = F(a) - F(b) = |F(b) - F(a)|
\]
if $F$ is monotone decreasing. Also note that $F \in BV(\mathbb{R})$ iff $|F(\infty) - F(-\infty)| < \infty$, where $F(\pm\infty) = \lim_{x \to \pm\infty} F(x)$.
2. Since $F$ is absolutely continuous, there exists $\delta > 0$ such that whenever $a, b \in \bar X$ with $a < b$ and $b - a < \delta$, then
\[
\sum_{x \in P} |F(x_+) - F(x)| \le 1
\]
for all partitions, $P$, of $[a, b]$. This shows that $T_F((a, b]) \le 1$ for all $a < b$ with $b - a < \delta$. Thus using Eq. (30.14), it follows that $T_F((a, b]) \le N < \infty$ provided $N \in \mathbb{N}$ is chosen so that $b - a < N\delta$.
3. Suppose that $\{(a_i, b_i]\}_{i=1}^n$ are disjoint subintervals of $(a, b]$, then by the mean value theorem,
\[
\sum_{i=1}^n |F(b_i) - F(a_i)| \le \sum_{i=1}^n |F'(c_i)| (b_i - a_i) \le M\, m\left( \bigcup_{i=1}^n (a_i, b_i) \right) \le M \sum_{i=1}^n (b_i - a_i) \le M(b - a)
\]
from which it easily follows that $F$ is absolutely continuous. Moreover we may conclude that $T_F((a, b]) \le M(b - a)$.
4. Let $\nu$ be the positive measure $d\nu = |f|\, dm$ on $(a, b]$. Again let $\{(a_i, b_i]\}_{i=1}^n$ be disjoint subintervals of $(a, b]$, then
\[
\sum_{i=1}^n |F(b_i) - F(a_i)| = \sum_{i=1}^n \left| \int_{(a_i, b_i]} f\, dm \right| \le \sum_{i=1}^n \int_{(a_i, b_i]} |f|\, dm
= \int_{\bigcup_{i=1}^n (a_i, b_i]} |f|\, dm = \nu\left( \bigcup_{i=1}^n (a_i, b_i] \right). \tag{30.13}
\]
Since $\nu$ is absolutely continuous relative to $m$, by Theorem 24.38, for all $\varepsilon > 0$ there exists $\delta > 0$ such that $\nu(A) < \varepsilon$ if $m(A) < \delta$. Applying this result with $A = \bigcup_{i=1}^n (a_i, b_i]$, it follows from Eq. (30.13) that $F$ satisfies the definition of being absolutely continuous. Furthermore, Eq. (30.13) also may be used to show
\[
T_F((a, b]) \le \int_{(a, b]} |f|\, dm.
\]

^2 It is proved in Natanson or in Rudin that this is also true if $F \in C([\alpha, \beta])$ is such that $F'(x)$ exists for all $x \in (\alpha, \beta)$ and $F' \in L^1([\alpha, \beta], m)$.
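Example 30.27 (See I. P. Natanson, "Theory of functions of a real variable," p. 269.). In each of the two examples below, $f \in C([-1, 1])$.
1. Let $f(x) = |x|^{3/2} \sin \frac{1}{x}$ with $f(0) = 0$, then $f$ is everywhere differentiable but $f'$ is not bounded near zero. However, $f'$ is in $L^1([-1, 1])$.
2. Let $f(x) = x^2 \cos \frac{\pi}{x^2}$ with $f(0) = 0$, then $f$ is everywhere differentiable but $f' \notin L^1(-\varepsilon, \varepsilon)$ for any $\varepsilon \in (0, 1)$. Indeed, if $0 \notin (\alpha, \beta)$ then
\[
\int_\alpha^\beta f'(x)\, dx = f(\beta) - f(\alpha) = \beta^2 \cos \frac{\pi}{\beta^2} - \alpha^2 \cos \frac{\pi}{\alpha^2}.
\]
Now take $\alpha_n := \sqrt{\frac{2}{4n+1}}$ and $\beta_n = 1/\sqrt{2n}$. Then
\[
\int_{\alpha_n}^{\beta_n} f'(x)\, dx = \frac{1}{2n} \cos 2n\pi - \frac{2}{4n+1} \cos \frac{\pi(4n+1)}{2} = \frac{1}{2n}
\]
and noting that $\{(\alpha_n, \beta_n)\}_{n=1}^\infty$ are all disjoint, we find $\int_0^\varepsilon |f'(x)|\, dx = \infty$.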
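The increments in item 2 can be checked in floating point. The sketch below is a numerical illustration only, with hypothetical helper names `alpha`, `beta` and `lower_bound`: it evaluates $f(\beta_n) - f(\alpha_n)$ and sums these increments, which bound $\int |f'|$ from below over the disjoint intervals $(\alpha_n, \beta_n)$ and grow like half the harmonic series.

```python
import math


def f(x):
    """f(x) = x^2 cos(pi / x^2) with f(0) = 0."""
    return 0.0 if x == 0 else x * x * math.cos(math.pi / (x * x))


def alpha(n):
    return math.sqrt(2.0 / (4 * n + 1))


def beta(n):
    return 1.0 / math.sqrt(2.0 * n)


def lower_bound(N):
    """Sum over n = 1..N of f(beta_n) - f(alpha_n), a lower bound for the integral of |f'|."""
    return sum(f(beta(n)) - f(alpha(n)) for n in range(1, N + 1))
```

Since each increment is $1/(2n)$ up to rounding, `lower_bound(N)` keeps growing without bound as `N` increases, which is the divergence claimed in the example.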
Theorem 30.28. Let $F : \mathbb{R} \to \mathbb{C}$ be any function.
1. For $a < b < c$,
\[
T_F((a, c]) = T_F((a, b]) + T_F((b, c]). \tag{30.14}
\]
Letting $a = -\infty$ in this expression implies
\[
T_F(c) = T_F(b) + T_F((b, c]) \tag{30.15}
\]
and in particular $T_F$ is monotone increasing.
2. Now suppose $F : \mathbb{R} \to \mathbb{R}$ and $F \in BV(\mathbb{R})$. Then the functions $F_\pm := (T_F \pm F)/2$ are bounded and increasing functions.
3. (Optional) A function $F : \mathbb{R} \to \mathbb{R}$ is in $BV$ iff $F = F_+ - F_-$ where $F_\pm$ are bounded increasing functions.
Proof.
1. (Item 1. is a special case of Exercise 24.5. Nevertheless we will give a proof here.) By the triangle inequality, if $P$ and $P'$ are partitions of $[a, c]$ such that $P \subset P'$, then
\[
\sum_{x \in P} |F(x_+) - F(x)| \le \sum_{x \in P'} |F(x_+) - F(x)|.
\]
So if $P$ is a partition of $[a, c]$, then $P \subset P' := P \cup \{b\}$ implies
\[
\begin{aligned}
\sum_{x \in P} |F(x_+) - F(x)| &\le \sum_{x \in P'} |F(x_+) - F(x)| \\
&= \sum_{x \in P' \cap [a, b]} |F(x_+) - F(x)| + \sum_{x \in P' \cap [b, c]} |F(x_+) - F(x)| \\
&\le T_F((a, b]) + T_F((b, c]).
\end{aligned}
\]
Thus we see that
\[
T_F((a, c]) \le T_F((a, b]) + T_F((b, c]).
\]
Similarly if $P_1$ is a partition of $[a, b]$ and $P_2$ is a partition of $[b, c]$, then $P = P_1 \cup P_2$ is a partition of $[a, c]$ and
\[
\sum_{x \in P_1} |F(x_+) - F(x)| + \sum_{x \in P_2} |F(x_+) - F(x)| = \sum_{x \in P} |F(x_+) - F(x)| \le T_F((a, c]).
\]
From this we conclude
\[
T_F((a, b]) + T_F((b, c]) \le T_F((a, c])
\]
which finishes the proof of Eqs. (30.14) and (30.15).
2. By Item 1., for all $a < b$,
\[
T_F(b) - T_F(a) = T_F((a, b]) \ge |F(b) - F(a)| \tag{30.16}
\]
and therefore
\[
T_F(b) \pm F(b) \ge T_F(a) \pm F(a)
\]
which shows that $F_\pm$ are increasing. Moreover from Eq. (30.16), for $b \ge 0$ and $a \le 0$,
\[
|F(b)| \le |F(b) - F(0)| + |F(0)| \le T_F(0, b] + |F(0)| \le T_F(0, \infty) + |F(0)|
\]
and similarly
\[
|F(a)| \le |F(0)| + T_F(-\infty, 0)
\]
which shows that $F$ is bounded by $|F(0)| + T_F(\mathbb{R})$. Therefore the functions, $F_+$ and $F_-$, are bounded as well.
3. By Exercise 30.2 if $F = F_+ - F_-$, then
\[
T_F((a, b]) \le T_{F_+}((a, b]) + T_{F_-}((a, b]) = |F_+(b) - F_+(a)| + |F_-(b) - F_-(a)|
\]
which is bounded showing that $F \in BV$. Conversely if $F$ is of bounded variation, then $F = F_+ - F_-$ where $F_\pm$ are defined as in Item 2.
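The decomposition $F = F_+ - F_-$ with $F_\pm = (T_F \pm F)/2$ can be illustrated on sampled data. The sketch below makes a simplifying assumption that is not in the text: $F$ is known only through its values on an increasing grid, and the variation is computed relative to that grid, so `T` is the cumulative sum of $|F(x_{i+1}) - F(x_i)|$ and not the true $T_F$. The name `jordan_decomposition` is hypothetical.

```python
from fractions import Fraction


def jordan_decomposition(values):
    """Grid version of F_+ = (T + F)/2 and F_- = (T - F)/2.

    `values` are samples F(x_0), ..., F(x_N) along an increasing grid and
    T[j] is the variation of the samples up to index j.
    """
    T = [Fraction(0)]
    for a, b in zip(values, values[1:]):
        T.append(T[-1] + abs(Fraction(b) - Fraction(a)))
    F_plus = [(t + Fraction(v)) / 2 for t, v in zip(T, values)]
    F_minus = [(t - Fraction(v)) / 2 for t, v in zip(T, values)]
    return T, F_plus, F_minus


samples = [0, 3, 1, 4, 4, 2]
T, F_plus, F_minus = jordan_decomposition(samples)
```

Both `F_plus` and `F_minus` come out non-decreasing, and their difference recovers `samples`, which is the discrete counterpart of Eq. (30.16) and item 2.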

Theorem 30.29 (Bounded variation functions). Suppose $F : \bar X \to \mathbb{C}$ is in $BV(X)$, then
1. $F(x+) := \lim_{y \downarrow x} F(y)$ and $F(x-) := \lim_{y \uparrow x} F(y)$ exist for all $x \in \bar X$. By convention, if $X \subset (\alpha, \infty]$ then $F(\alpha-) = F(\alpha)$ and if $X \subset (-\infty, \beta]$ then $F(\beta+) := F(\beta)$. Let $G(x) := F(x+)$ and $G(\pm\infty) = F(\pm\infty)$ where appropriate.
2. If $\inf X = -\infty$, then $F(-\infty) := \lim_{x \to -\infty} F(x)$ exists and if $\sup X = +\infty$ then $F(\infty) := \lim_{x \to \infty} F(x)$ exists.
3. The set of points of discontinuity, $\{x \in X : \lim_{y \to x} F(y) \ne F(x)\}$, of $F$ is at most countable and in particular $G(x) = F(x+) = F(x)$ for all but a countable number of $x \in X$.
4. For $m$-a.e. $x$, $F'(x)$ and $G'(x)$ exist and $F'(x) = G'(x)$.
5. The function $G$ is right continuous on $X$. Moreover, there exists a unique complex measure, $\nu = \nu_F$, on $(X, \mathcal{B})$ such that, for all $a, b \in \bar X$ with $a < b$,
\[
\nu((a, b]) = G(b) - G(a) = F(b+) - F(a+). \tag{30.17}
\]
6. $F' \in L^1(X, m)$ and the Lebesgue decomposition of $\nu$ may be written as
\[
d\nu_F = F'\, dm + d\nu_s \tag{30.18}
\]
where $\nu_s$ is a measure singular to $m$. In particular,
\[
F(b+) - F(a+) = \int_a^b F'\, dm + \nu_s((a, b]) \tag{30.19}
\]
whenever $a, b \in \bar X$ with $a < b$.
7. $\nu_s = 0$ iff $F$ is absolutely continuous on $\bar X$.
Proof. If X ,= 1, extend F to all of 1 by requiring F be constant on each
of the connected components of 1 X
o
. For example if X = [, ] , extend F
to 1 by setting F (x) := F () for x and F (x) = F () for x . With
this extension it is easily seen that T
F
(1) = T
F
(X) and T
F
(x) is constant
on the connected components of 1 X
o
. Thus we may now assume X = 1
and T
F
(1) < . Moreover, by considering the real and imaginary parts of F
separately we may assume F is real valued. So we now assume X = 1 and
F : 1 1 is in BV := BV (1) .
By Theorem 30.37, the functions $F_\pm := (T_F \pm F)/2$ are bounded and increasing functions. Since $F = F_+ - F_-$, items 1. to 4. are now easy consequences of Theorem 30.19 applied to $F_+$ and $F_-$.
Let $G_\pm(x) := F_\pm(x+)$ and $G_\pm(\infty) = F_\pm(\infty)$ and $G_\pm(-\infty) = F_\pm(-\infty)$, then
$$G(x) = F(x+) = G_+(x) - G_-(x)$$
and as in Theorem 30.19 (see Theorem 28.32), there exist unique positive finite measures, $\nu_\pm$, such that
$$\nu_\pm((a, b]) = G_\pm(b) - G_\pm(a) \text{ for all } a < b.$$
Then $\nu := \nu_+ - \nu_-$ is a finite signed measure with the property that
$$\nu((a, b]) = G(b) - G(a) = F(b+) - F(a+) \text{ for all } a < b.$$
Since $\nu_\pm$ have Lebesgue decompositions given by
$$d\nu_\pm = F_\pm'\,dm + d(\nu_\pm)_s$$
with $F_\pm' \in L^1(m)$ and $(\nu_\pm)_s \perp m$, it follows that
$$d\nu = (F_+' - F_-')\,dm + d\nu_s = F'\,dm + d\nu_s$$
with $F' = F_+' - F_-'$ ($m$-a.e.), $F' \in L^1(\mathbb{R}, m)$ and $\nu_s \perp m$, where
$$\nu_s := (\nu_+)_s - (\nu_-)_s.$$
Thus we have proven everything but the uniqueness of the measure $\nu$ satisfying Eq. (30.17).
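Uniqueness of $\nu$. So it only remains to prove that $\nu$ is unique. Suppose that $\tilde{\nu}$ is another such measure such that Eq. (30.17) holds with $\nu$ replaced by $\tilde{\nu}$. Then for $(a, b] \subset \mathbb{R}$,
$$|\nu|(a, b] = \sup_{\mathbb{P}} \sum_{x \in \mathbb{P}} |G(x_+) - G(x)| = |\tilde{\nu}|(a, b]$$
where the supremum is over all partitions of $[a, b]$. This shows that $|\nu| = |\tilde{\nu}|$ on $\mathcal{A} \subset \mathcal{B}$ and so by the measure uniqueness Theorem 19.55, $|\nu| = |\tilde{\nu}|$ on $\mathcal{B}$. It now follows that $|\nu| + \nu$ and $|\tilde{\nu}| + \tilde{\nu}$ are finite positive measures on $\mathcal{B}$ such that, for all $a < b$,
$$(|\nu| + \nu)((a, b]) = |\nu|((a, b]) + (G(b) - G(a)) = |\tilde{\nu}|((a, b]) + (G(b) - G(a)) = (|\tilde{\nu}| + \tilde{\nu})((a, b]).$$
Hence another application of Theorem 19.55 shows
$$|\nu| + \nu = |\tilde{\nu}| + \tilde{\nu} = |\nu| + \tilde{\nu} \text{ on } \mathcal{B},$$
and hence $\nu = \tilde{\nu}$ on $\mathcal{B}$.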
Alternative proofs of uniqueness of $\nu$. The uniqueness may be proved by any number of other means. For example one may directly apply the multiplicative system Theorem 18.51 with $H$ being the collection of bounded measurable functions such that $\int_{\mathbb{R}} f\,d\nu = \int_{\mathbb{R}} f\,d\tilde{\nu}$ and $M$ being the multiplicative system,
$$M := \{1_{(a,b]} : a, b \in \mathbb{R} \text{ with } a < b\}.$$
Alternatively one could apply the monotone class Theorem (Lemma 33.3) with $\mathcal{C} := \{A \in \mathcal{B} : \nu(A) = \tilde{\nu}(A)\}$ and $\mathcal{A}$ the algebra of half open intervals. Or one could use Theorem 33.5, with $\mathcal{D} = \{A \in \mathcal{B} : \nu(A) = \tilde{\nu}(A)\}$ and $\mathcal{C} := \{(a, b] : a, b \in \mathbb{R} \text{ with } a < b\}$.
Corollary 30.30. If $F \in BV(X)$ then $\nu_F \perp m$ iff $F' = 0$ $m$-a.e.
Proof. This is a consequence of Eq. (30.18) and the uniqueness of the Lebesgue decomposition. In more detail, if $F'(x) = 0$ for $m$-a.e. $x$, then by Eq. (30.18), $\nu_F = \nu_s \perp m$. If $\nu_F \perp m$, then by Eq. (30.18), $F'\,dm = d\nu_F - d\nu_s \perp dm$ and by Lemma 24.8 $F'\,dm = 0$, i.e. $F' = 0$ $m$-a.e.
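To see that Corollary 30.30 has content, it may help to look at the classical Cantor function (an illustration that is not taken from the text above): it is continuous, increasing and of bounded variation on $[0, 1]$, it is locally constant off the Cantor set so that $F' = 0$ $m$-a.e., and yet $F(1) - F(0) = 1$, so all of $\nu_F$ sits in the singular part. The following Python sketch approximates it from ternary digits; the name `cantor` and the truncation parameter `depth` are assumptions made for this example only.

```python
def cantor(x, depth=40):
    """Approximate the Cantor function on [0, 1] from the ternary digits of x.

    `depth` (a hypothetical truncation parameter) bounds how many ternary
    digits are inspected; the approximation error is at most 2**-depth.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)      # next ternary digit of x
        x -= digit
        if digit == 1:      # x lies in a removed middle third: F is constant there
            return value + scale
        if digit == 2:
            value += scale
        scale *= 0.5
    return value


# The increment over [0, 1] is 1 ...
total_increment = cantor(1.0) - cantor(0.0)
# ... although the difference quotient vanishes on each removed interval,
# for instance on (1/3, 2/3) and on (1/9, 2/9).
quotient_middle = (cantor(0.6) - cantor(0.4)) / 0.2
quotient_left = (cantor(0.2) - cantor(0.15)) / 0.05
```

The removed intervals have total Lebesgue measure one, which is why the vanishing difference quotients above are the typical behaviour and $F'\,dm$ contributes nothing to $\nu_F$ in Eq. (30.18).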
Corollary 30.31. Let $F : \bar{X} \to \mathbb{C}$ be a right continuous function in $BV(X)$, $\nu_F$ be the associated complex measure and
$$d\nu_F = F'\,dm + d\nu_s \tag{30.20}$$
be its Lebesgue decomposition. Then the following are equivalent,
1. $F$ is absolutely continuous,
2. $\nu_F \ll m$,
3. $\nu_s = 0$, and
4. for all $a, b \in X$ with $a < b$,
$$F(b) - F(a) = \int_{(a,b]} F'(x)\,dm(x). \tag{30.21}$$
Proof. The equivalence of 1. and 2. was established in Proposition 30.23 and the equivalence of 2. and 3. is trivial. (If $\nu_F \ll m$, then $d\nu_s = d\nu_F - F'\,dm \ll dm$ which implies, by Lemma 24.26, that $\nu_s = 0$.) If $\nu_F \ll m$ and $G(x) := F(x+)$, then the identity,
$$F(b) - F(a) = F(b+) - F(a) = \int_a^b F'(x)\,dm(x),$$
implies $F$ is continuous.
(The equivalence of 4. and 1., 2., and 3.) If $F$ is absolutely continuous, then $\nu_s = 0$ and Eq. (30.21) follows from Eq. (30.20). Conversely let
$$\rho(A) := \int_A F'(x)\,dm(x) \text{ for all } A \in \mathcal{B}.$$
Recall by the Radon-Nikodym theorem that $\int_{\mathbb{R}} |F'(x)|\,dm(x) < \infty$ so that $\rho$ is a complex measure on $\mathcal{B}$. So if Eq. (30.21) holds, then $\rho = \nu_F$ on the algebra generated by half open intervals. Therefore $\rho = \nu_F$ as in the uniqueness part of the proof of Theorem 30.29. Therefore $d\nu_F = F'\,dm \ll dm$.
Theorem 30.32 (The fundamental theorem of calculus). Suppose that $F : [\alpha, \beta] \to \mathbb{C}$ is a measurable function. Then the following are equivalent:
1. $F$ is absolutely continuous on $[\alpha, \beta]$.
2. There exists $f \in L^1([\alpha, \beta], dm)$ such that
$$F(x) - F(\alpha) = \int_\alpha^x f\,dm \quad \forall x \in [\alpha, \beta]. \tag{30.22}$$
3. $F'$ exists a.e., $F' \in L^1([\alpha, \beta], dm)$ and
$$F(x) - F(\alpha) = \int_\alpha^x F'\,dm \quad \forall x \in [\alpha, \beta]. \tag{30.23}$$
Proof. 1. $\Rightarrow$ 3. If $F$ is absolutely continuous then $F \in BV((\alpha, \beta])$ and $F$ is continuous on $[\alpha, \beta]$. Hence Eq. (30.23) holds by Corollary 30.31. The assertion 3. $\Rightarrow$ 2. is trivial and we have already seen in Lemma 30.26 that 2. implies 1.
Corollary 30.33 (Integration by parts). Suppose $-\infty < \alpha < \beta < \infty$ and $F, G : [\alpha, \beta] \to \mathbb{C}$ are two absolutely continuous functions. Then
$$\int_\alpha^\beta F' G\,dm = -\int_\alpha^\beta F G'\,dm + FG\big|_\alpha^\beta.$$
Proof. Suppose that $\{(a_i, b_i]\}_{i=1}^n$ is a sequence of disjoint intervals in $[\alpha, \beta]$, then
$$\sum_{i=1}^n |F(b_i)G(b_i) - F(a_i)G(a_i)| \le \sum_{i=1}^n |F(b_i)|\,|G(b_i) - G(a_i)| + \sum_{i=1}^n |F(b_i) - F(a_i)|\,|G(a_i)|$$
$$\le \|F\|_\infty \sum_{i=1}^n |G(b_i) - G(a_i)| + \|G\|_\infty \sum_{i=1}^n |F(b_i) - F(a_i)|.$$
From this inequality, one easily deduces the absolute continuity of the product $FG$ from the absolute continuity of $F$ and $G$. Therefore,
$$FG\big|_\alpha^\beta = \int_\alpha^\beta (FG)'\,dm = \int_\alpha^\beta (F'G + FG')\,dm.$$
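As a sanity check of Corollary 30.33 in a case where $F$ is absolutely continuous but not differentiable everywhere, the following Python sketch compares both sides numerically for $F(x) = |x - 1/2|$ and $G(x) = x^2$ on $[0, 1]$. The choice of functions, the midpoint rule and the helper name `midpoint_integral` are assumptions made for this illustration; they are not part of the text above.

```python
def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def F(x):
    return abs(x - 0.5)            # absolutely continuous, kink at x = 1/2


def dF(x):
    return 1.0 if x > 0.5 else -1.0  # pointwise derivative, defined a.e.


def G(x):
    return x * x


def dG(x):
    return 2.0 * x


alpha, beta = 0.0, 1.0
# Left side: integral of F' G dm.
lhs = midpoint_integral(lambda x: dF(x) * G(x), alpha, beta)
# Right side: -integral of F G' dm plus the boundary term FG evaluated from alpha to beta.
boundary = F(beta) * G(beta) - F(alpha) * G(alpha)
rhs = -midpoint_integral(lambda x: F(x) * dG(x), alpha, beta) + boundary
```

Both sides evaluate to approximately $1/4$, which agrees with the exact computation $-\int_0^{1/2} x^2\,dx + \int_{1/2}^1 x^2\,dx = 1/4$.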
30.4.3 Alternative method to Proving Theorem 30.29
For simplicity assume that $\alpha = -\infty$, $\beta = \infty$, $F \in BV$,
$$\mathcal{A}_b := \{A \in \mathcal{A} : A \text{ is bounded}\},$$
and $\mathbb{S}_c(\mathcal{A})$ denote simple functions of the form $f = \sum_{i=1}^n \lambda_i 1_{A_i}$ with $A_i \in \mathcal{A}_b$. Let $\nu^0 = \nu^0_F$ be the finitely additive set function on $\mathcal{A}$ such that $\nu^0((a, b]) = F(b) - F(a)$ for all $-\infty < a < b < \infty$. As in the case of an increasing function $F$ (see Lemma 28.38 and the text preceding it) we may define a linear functional, $I_F : \mathbb{S}_c(\mathcal{A}) \to \mathbb{C}$, by
$$I_F(f) = \sum_{\lambda \in \mathbb{C}} \lambda\,\nu^0(f = \lambda).$$
If we write $f = \sum_{i=1}^N \lambda_i 1_{(a_i, b_i]}$ with $\{(a_i, b_i]\}_{i=1}^N$ pairwise disjoint subsets of $\mathcal{A}_b$ inside $(a, b]$ we learn
$$|I_F(f)| = \left|\sum_{i=1}^N \lambda_i (F(b_i) - F(a_i))\right| \le \sum_{i=1}^N |\lambda_i|\,|F(b_i) - F(a_i)| \le \|f\|_\infty T_F((a, b]). \tag{30.24}$$
In the usual way this estimate allows us to extend $I_F$ to those compactly supported functions, $\overline{\mathbb{S}_c(\mathcal{A})}$, in the closure of $\mathbb{S}_c(\mathcal{A})$. As usual we will still denote the extension of $I_F$ to $\overline{\mathbb{S}_c(\mathcal{A})}$ by $I_F$ and recall that $\overline{\mathbb{S}_c(\mathcal{A})}$ contains $C_c(\mathbb{R}, \mathbb{C})$. The estimate in Eq. (30.24) still holds for this extension and in particular we have
$$|I(f)| \le T_F(\infty) \cdot \|f\|_\infty \text{ for all } f \in C_c(\mathbb{R}, \mathbb{C}).$$
Therefore $I$ extends uniquely by continuity to an element of $C_0(\mathbb{R}, \mathbb{C})^*$. So by appealing to the complex Riesz Theorem (Corollary 32.68) there exists a unique complex measure $\nu = \nu_F$ such that
$$I_F(f) = \int_{\mathbb{R}} f\,d\nu \text{ for all } f \in C_c(\mathbb{R}). \tag{30.25}$$
This leads to the following theorem.
Theorem 30.34. To each function $F \in BV$ there exists a unique measure $\nu = \nu_F$ on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$ such that Eq. (30.25) holds. Moreover, $F(x+) = \lim_{y \downarrow x} F(y)$ exists for all $x \in \mathbb{R}$ and the measure $\nu$ satisfies
$$\nu((a, b]) = F(b+) - F(a+) \text{ for all } -\infty < a < b < \infty. \tag{30.26}$$
Remark 30.35. By applying Theorem 30.34 to the function $x \to F(-x)$ one shows every $F \in BV$ has left hand limits as well, i.e. $F(x-) = \lim_{y \uparrow x} F(y)$ exists for all $x \in \mathbb{R}$.
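Proof. We must still prove $F(x+)$ exists for all $x \in \mathbb{R}$ and Eq. (30.26) holds. To prove this let $\psi_b$ and $\phi_\varepsilon$ be the functions shown in Figure 30.5 below. The reader should check that $\psi_b \in \overline{\mathbb{S}_c(\mathcal{A})}$. Notice that
$$I_F(\psi_{b+\varepsilon}) = I_F(\psi_\alpha + 1_{(\alpha, b+\varepsilon]}) = I_F(\psi_\alpha) + F(b + \varepsilon) - F(\alpha)$$
and since $\|\phi_\varepsilon - \psi_{b+\varepsilon}\|_\infty = 1$,
$$|I(\phi_\varepsilon) - I_F(\psi_{b+\varepsilon})| = |I_F(\phi_\varepsilon - \psi_{b+\varepsilon})| \le T_F([b + \varepsilon, b + 2\varepsilon]) = T_F(b + 2\varepsilon) - T_F(b + \varepsilon),$$
which implies $O(\varepsilon) := I(\phi_\varepsilon) - I_F(\psi_{b+\varepsilon}) \to 0$ as $\varepsilon \downarrow 0$ because $T_F$ is monotonic. Therefore,
$$I(\phi_\varepsilon) = I_F(\psi_{b+\varepsilon}) + I(\phi_\varepsilon) - I_F(\psi_{b+\varepsilon}) = I_F(\psi_\alpha) + F(b + \varepsilon) - F(\alpha) + O(\varepsilon). \tag{30.27}$$
Because $\phi_\varepsilon$ converges boundedly to $\psi_b$ as $\varepsilon \downarrow 0$, the dominated convergence theorem implies
$$\lim_{\varepsilon \downarrow 0} I(\phi_\varepsilon) = \lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}} \phi_\varepsilon\,d\nu = \int_{\mathbb{R}} \psi_b\,d\nu = \int_{\mathbb{R}} \psi_\alpha\,d\nu + \nu((\alpha, b]).$$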
Fig. 30.5. A couple of functions in $\overline{\mathbb{S}_c(\mathcal{A})}$.
So we may let $\varepsilon \downarrow 0$ in Eq. (30.27) to learn $F(b+)$ exists and
$$\int_{\mathbb{R}} \psi_\alpha\,d\nu + \nu((\alpha, b]) = I_F(\psi_\alpha) + F(b+) - F(\alpha).$$
Similarly this equation holds with $b$ replaced by $a$, i.e.
$$\int_{\mathbb{R}} \psi_\alpha\,d\nu + \nu((\alpha, a]) = I_F(\psi_\alpha) + F(a+) - F(\alpha).$$
Subtracting the last two equations proves Eq. (30.26).
Remark 30.36. Given Theorem 30.34 we may now prove Theorem 30.29 in the same way we proved Theorem 30.19.
30.5 The connection of Weak and pointwise derivatives
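Theorem 30.37. Let $\Omega \subset \mathbb{R}$ be an open interval and $f \in L^1_{loc}(\Omega)$. Then there exists a complex measure $\mu$ on $\mathcal{B}_\Omega$ such that
$$-\langle f, \phi' \rangle = \mu(\phi) := \int_\Omega \phi\,d\mu \text{ for all } \phi \in C_c^\infty(\Omega) \tag{30.28}$$
iff there exists a right continuous function $F$ of bounded variation such that $F = f$ a.e. In this case $\mu = \mu_F$, i.e. $\mu((a, b]) = F(b) - F(a)$ for all $-\infty < a < b < \infty$.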
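Proof. Suppose $f = F$ a.e. where $F$ is as above and let $\mu = \mu_F$ be the associated measure on $\mathcal{B}_\Omega$. Let $G(t) = F(t) - F(-\infty) = \mu((-\infty, t])$, then using Fubini's theorem and the fundamental theorem of calculus,
$$-\langle f, \phi' \rangle = -\langle F, \phi' \rangle = -\langle G, \phi' \rangle = -\int_\Omega \phi'(t) \left[ \int_\Omega 1_{(-\infty, t]}(s)\,d\mu(s) \right] dt$$
$$= -\int_\Omega \int_\Omega \phi'(t) 1_{(-\infty, t]}(s)\,dt\,d\mu(s) = \int_\Omega \phi(s)\,d\mu(s) = \mu(\phi).$$
Conversely if Eq. (30.28) holds for some measure $\mu$, let $F(t) := \mu((-\infty, t])$ then working backwards from above,
$$-\langle f, \phi' \rangle = \mu(\phi) = \int_\Omega \phi(s)\,d\mu(s) = -\int_\Omega \int_\Omega \phi'(t) 1_{(-\infty, t]}(s)\,dt\,d\mu(s) = -\int_\Omega \phi'(t) F(t)\,dt.$$
This shows $\partial^{(w)}(f - F) = 0$ and therefore by Proposition 26.25, $f = F + c$ a.e. for some constant $c \in \mathbb{C}$. Since $F + c$ is right continuous with bounded variation, the proof is complete.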
Proposition 30.38. Let $\Omega \subset \mathbb{R}$ be an open interval and $f \in L^1_{loc}(\Omega)$. Then $\partial^w f$ exists in $L^1_{loc}(\Omega)$ iff $f$ has a continuous version $\tilde{f}$ which is absolutely continuous on all compact subintervals of $\Omega$. Moreover, $\partial^w f = \tilde{f}'$ a.e., where $\tilde{f}'(x)$ is the usual pointwise derivative.
Proof. If $f$ is locally absolutely continuous and $\phi \in C_c^\infty(\Omega)$ with $\mathrm{supp}(\phi) \subset [a, b] \subset \Omega$, then by integration by parts, Corollary 30.33,
$$\int_\Omega f' \phi\,dm = \int_a^b f' \phi\,dm = -\int_a^b f \phi'\,dm + f\phi\big|_a^b = -\int_\Omega f \phi'\,dm.$$
This shows $\partial^w f$ exists and $\partial^w f = f' \in L^1_{loc}(\Omega)$. Now suppose that $\partial^w f$ exists in $L^1_{loc}(\Omega)$ and $a \in \Omega$. Define $F \in C(\Omega)$ by $F(x) := \int_a^x \partial^w f(y)\,dy$. Then $F$ is absolutely continuous on compacts and therefore by the fundamental theorem of calculus for absolutely continuous functions (Theorem 30.32), $F'(x)$ exists and is equal to $\partial^w f(x)$ for a.e. $x \in \Omega$. Moreover, by the first part of the argument, $\partial^w F$ exists and $\partial^w F = \partial^w f$, and so by Proposition 26.25 there is a constant $c$ such that
$$\tilde{f}(x) := F(x) + c = f(x) \text{ for a.e. } x \in \Omega.$$

Definition 30.39. Let $X$ and $Y$ be metric spaces. A function $u : X \to Y$ is said to be Lipschitz if there exists $C < \infty$ such that
$$d_Y(u(x), u(x')) \le C\,d_X(x, x') \text{ for all } x, x' \in X$$
and said to be locally Lipschitz if for all compact subsets $K \subset X$ there exists $C_K < \infty$ such that
$$d_Y(u(x), u(x')) \le C_K\,d_X(x, x') \text{ for all } x, x' \in K.$$
Proposition 30.40. Let $u \in L^1_{loc}(\Omega)$. Then there exists a locally Lipschitz function $\tilde{u} : \Omega \to \mathbb{C}$ such that $\tilde{u} = u$ a.e. iff (weak-$\partial_i u$) $\in L^1_{loc}(\Omega)$ exists and is locally (essentially) bounded for $i = 1, 2, \ldots, d$.
Proof. Suppose u = u a.e. and u is Lipschitz and let p (1, ) and V be a
precompact open set such that

V W and let V
:=
_
x : dist(x,

V )
_
.
Then for < dist(
V ,
c
), V
and therefore there is constant C(V, ) <

such that [ u(y) u(x)[ C(V, ) [y x[ for all x, y V
. So for 0 < [h[ 1

and v 1
d
with [v[ = 1,
_
V
u(x +hv) u(x)

h
p
dx =
_
V
u(x +hv) u(x)

h
p
dx C(V, ) [v[
p
.
Therefore Theorem 26.18 may be applied to conclude
v
u exists in L
p
and
moreover,
lim
h0
u(x +hv) u(x)
h
=
v
u(x) for m a.e. x V.
Since there exists h
n
n=1
1 0 such that lim
n
h
n
= 0 and
[
v
u(x)[ = lim
n
u(x +h
n
v) u(x)
h
n
C(V ) for a.e. x V,

it follows that |
v
u|
C(V ) where C(V ) := lim

0
C(V, ).
Conversely, let
:= x : dist(x,
c
) > and C
c
(B(0, 1), [0, ))
such that
_
R
n
(x)dx = 1,
m
(x) = m
n
(mx) and u
m
:= u
m
as in the
proof of Theorem 26.18. Suppose V
o
with

V and is suciently
small. Then u
m
C
),
v
u
m
=
v
u
m
, [
v
u
m
(x)[ |
v
u|
L
(V
m
1)
=:
C(V, m) < and therefore for x, y

V with [y x[ ,
[u
m
(y) u
m
(x)[ =
_
1
0
d
dt
u
m
(x +t(y x))dt
_
1
0
(y x) u
m
(x +t(y x))dt
_
1
0
[y x[ [u
m
(x +t(y x))[ dt C(V, m) [y x[ .
(30.29)
By passing to a subsequence if necessary, we may assume that lim
m
u
m
(x) =
u(x) for m a.e. x

V and then letting m in Eq. (30.29) implies
[u(y) u(x)[ C(V ) [y x[ for all x, y

V E and [y x[ (30.30)
where E

V is a m null set. Dene u
V
:

V C by u
V
= u on

V E and
u
V
(x) = lim
yx
y/ E
u(y) if x E. Then clearly u
V
= u a.e. on

V and it is easy
to show u
V
is well dened and u
V
:

V C is continuous and still satises
[ u
V
(y) u
V
(x)[ C
V
[y x[ for x, y

V with [y x[ .
Since u
V
is continuous on

V there exists M
V
< such that [ u
V
[ M
V
on
V . Hence if x, y

V with [x y[ , we nd
[ u
V
(y) u
V
(x)[
[y x[

2M
and hence
[ u
V
(y) u
V
(x)[ max
_
C
V
,
2M
V
_
[y x[ for x, y

V
showing u
V
is Lipschitz on

V . To complete the proof, choose precompact
open sets V
n
such that V
n

V
n
V
n+1
for all n and for x V
n
let
u(x) := u
Vn
(x).
Alternative way to construct the function u
V
. For x V E,
[u
m
(x) u(x)[ =
_
V
u(x y)(my)m
n
dy u(x)
_
V
[u(x y/m) u(x)] (y)dy
_
V
[u(x y/m) u(x)[ (y)dy
C
m
_
V
[y[ (y)dy
wherein the last equality we have used Eq. (30.30) with V replaced by V
for
some small > 0. Letting K := C
_
V
[y[ (y)dy < we have shown
|u
m
u|
K/m 0 as m
and consequently
|u
m
u
n
|
= |u
m
u
n
|
2K/m 0 as m .
Therefore, u
n
converges uniformly to a continuous function u
V
.
The next theorem is from Chapter 1. of Mazja [15].
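Theorem 30.41. Let $p \ge 1$ and $\Omega$ be an open subset of $\mathbb{R}^d$, $x \in \mathbb{R}^d$ be written as $x = (y, t) \in \mathbb{R}^{d-1} \times \mathbb{R}$,
$$Y := \left\{ y \in \mathbb{R}^{d-1} : (\{y\} \times \mathbb{R}) \cap \Omega \neq \emptyset \right\}$$
and $u \in L^p(\Omega)$. Then $\partial_t u \in L^p(\Omega)$ iff there is a version $\tilde{u}$ of $u$ such that for a.e. $y \in Y$ the function $t \to \tilde{u}(y, t)$ is absolutely continuous, $\partial_t u(y, t) = \frac{\partial \tilde{u}(y, t)}{\partial t}$ a.e., and $\left\| \frac{\partial \tilde{u}}{\partial t} \right\|_{L^p(\Omega)} < \infty$.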
Proof. For the proof of Theorem 30.41, it suces to consider the case
where = (0, 1)
d
. Write x as x = (y, t) Y (0, 1) = (0, 1)
d1
(0, 1)
and
t
u for the weak derivative
e
d
u. By assumption
_
[
t
u(y, t)[ dydt = |
t
u|
1
|
t
u|
p
<
and so by Fubinis theorem there exists a set of full measure, Y
0
Y, such
that
_
1
0
[
t
u(y, t)[ dt < for y Y
0
.
So for y Y
0
, the function v(y, t) :=
_
t
0

t
u(y, )d is well dened and ab-
solutely continuous in t with

t
v(y, t) =
t
u(y, t) for a.e. t (0, 1). Let
C
c
(Y ) and C
c
((0, 1)) , then integration by parts for absolutely
functions implies
_
1
0
v(y, t) (t)dt =
_
1
0
t
v(y, t)(t)dt for all y Y
0
.
Multiplying both sides of this equation by (y) and integrating in y shows
_
v(x) (t)(y)dydt =
_
t
v(y, t)(t)(y)dydt
=
_
t
u(y, t)(t)(y)dydt.
Using the denition of the weak derivative, this equation may be written as
_
u(x) (t)(y)dydt =
_
t
u(x)(t)(y)dydt
and comparing the last two equations shows
_
[v(x) u(x)] (t)(y)dydt = 0.

Since C
c
(Y ) is arbitrary, this implies there exists a set Y
1
Y
0
of full
measure such that
_
[v(y, t) u(y, t)] (t)dt = 0 for all y Y

1
from which we conclude, using Proposition 26.25, that u(y, t) = v(y, t) +C(y)
for t J
y
where m
d1
(J
y
) = 1, here m
k
denotes k dimensional Lebesgue
measure. In conclusion we have shown that
u(y, t) = u(y, t) :=
_
t
0
t
u(y, )d +C(y) for all y Y
1
and t J
y
. (30.31)
We can be more precise about the formula for u(y, t) by integrating both sides
of Eq. (30.31) on t we learn
C(y) =
_
1
0
dt
_
t
0
u(y, )d
_
1
0
u(y, t)dt
=
_
1
0
(1 )
u(y, )d
_
1
0
u(y, t)dt
=
_
1
0
[(1 t)
t
u(y, t) u(y, t)] dt
and hence
u(y, t) :=
_
t
0
u(y, )d +
_
1
0
[(1 )
u(y, ) u(y, )] d
which is well dened for y Y
0
. For the converse suppose that such a u exists,
then for C
c
() ,
_
u(y, t)
t
(y, t)dydt =
_
u(y, t)
t
(y, t)dtdy
=
_
u(y, t)
t
(y, t)dtdy
wherein we have used integration by parts for absolutely continuous functions.
From this equation we learn the weak derivative
t
u(y, t) exists and is given
by
u(y,t)
t
a.e.
30.6 Exercises
31
Constructing Measures Via Caratheodory
The main goal of this chapter is to prove the two measure construction Theorems 28.2 and 28.16. Throughout this chapter, $X$ will be a given set. The following definition is a continuation of the terminology introduced in Definition 28.1.
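Definition 31.1. Suppose that $\mathcal{E} \subset 2^X$ is a collection of subsets of $X$ and $\mu : \mathcal{E} \to [0, \infty]$ is a function. Then
1. $\mu$ is super-additive (finitely super-additive) on $\mathcal{E}$ if
$$\mu(E) \ge \sum_{i=1}^n \mu(E_i) \tag{31.1}$$
whenever $E = \coprod_{i=1}^n E_i \in \mathcal{E}$ with $n \in \mathbb{N} \cup \{\infty\}$ ($n \in \mathbb{N}$).
2. $\mu$ is monotonic if $\mu(A) \le \mu(B)$ for all $A, B \in \mathcal{E}$ with $A \subset B$.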
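Remark 31.2. If $\mathcal{E} = \mathcal{A}$ is an algebra and $\mu$ is finitely additive on $\mathcal{A}$, then $\mu$ is sub-additive on $\mathcal{A}$ iff
$$\mu(A) \le \sum_{i=1}^\infty \mu(A_i) \text{ for } A = \coprod_{i=1}^\infty A_i \tag{31.2}$$
where $A \in \mathcal{A}$ and $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ are pairwise disjoint sets. Indeed if $A = \bigcup_{i=1}^\infty B_i$ with $A \in \mathcal{A}$ and $B_i \in \mathcal{A}$, then $A = \coprod_{i=1}^\infty A_i$ where $A_i := B_i \setminus (B_1 \cup \dots \cup B_{i-1}) \in \mathcal{A}$ and $B_0 = \emptyset$. Therefore using the monotonicity of $\mu$ and Eq. (31.2)
$$\mu(A) \le \sum_{i=1}^\infty \mu(A_i) \le \sum_{i=1}^\infty \mu(B_i).$$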
31.1 Construction of Premeasures
Proposition 31.3 (Construction of Finitely Additive Measures). Suppose $\mathcal{E} \subset 2^X$ is an elementary family (see Definition 18.8) and $\mathcal{A} = \mathcal{A}(\mathcal{E})$ is the algebra generated by $\mathcal{E}$. Then every additive function $\mu : \mathcal{E} \to [0, \infty]$ extends uniquely to an additive measure (which we still denote by $\mu$) on $\mathcal{A}$.
Proof. Since (by Proposition 18.10) every element $A \in \mathcal{A}$ is of the form $A = \coprod_i E_i$ for a finite collection of $E_i \in \mathcal{E}$, it is clear that if $\mu$ extends to a measure then the extension is unique and must be given by
$$\mu(A) = \sum_i \mu(E_i). \tag{31.3}$$
To prove existence, the main point is to show that $\mu(A)$ in Eq. (31.3) is well defined; i.e. if we also have $A = \coprod_j F_j$ with $F_j \in \mathcal{E}$, then we must show
$$\sum_i \mu(E_i) = \sum_j \mu(F_j). \tag{31.4}$$
But $E_i = \coprod_j (E_i \cap F_j)$ and the property that $\mu$ is additive on $\mathcal{E}$ implies $\mu(E_i) = \sum_j \mu(E_i \cap F_j)$ and hence
$$\sum_i \mu(E_i) = \sum_i \sum_j \mu(E_i \cap F_j) = \sum_{i,j} \mu(E_i \cap F_j).$$
Similarly, or by symmetry,
$$\sum_j \mu(F_j) = \sum_{i,j} \mu(E_i \cap F_j)$$
which combined with the previous equation shows that Eq. (31.4) holds. It is now easy to verify that $\mu$ extended to $\mathcal{A}$ as in Eq. (31.3) is an additive measure on $\mathcal{A}$.
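Proposition 31.4. Suppose that $\mathcal{A} \subset 2^X$ is an algebra and $\mu : \mathcal{A} \to [0, \infty]$ is a finitely additive measure on $\mathcal{A}$. Then $\mu$ is automatically super-additive on $\mathcal{A}$.
Proof. Since
$$A = \left( \coprod_{i=1}^N A_i \right) \cup \left( A \setminus \bigcup_{i=1}^N A_i \right),$$
$$\mu(A) = \sum_{i=1}^N \mu(A_i) + \mu\left( A \setminus \bigcup_{i=1}^N A_i \right) \ge \sum_{i=1}^N \mu(A_i).$$
Letting $N \to \infty$ in this last expression shows that $\mu(A) \ge \sum_{i=1}^\infty \mu(A_i)$.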
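Proposition 31.5. Suppose that $\mathcal{E} \subset 2^X$ is an elementary family, $\mathcal{A} = \mathcal{A}(\mathcal{E})$ and $\mu : \mathcal{A} \to [0, \infty]$ is a finitely additive measure. Then $\mu$ is a premeasure on $\mathcal{A}$ iff $\mu$ is sub-additive on $\mathcal{E}$.
Proof. Clearly if $\mu$ is a premeasure on $\mathcal{A}$ then $\mu$ is $\sigma$-additive and hence sub-additive on $\mathcal{E}$. Because of Proposition 31.4, to prove the converse it suffices to show that the sub-additivity of $\mu$ on $\mathcal{E}$ implies the sub-additivity of $\mu$ on $\mathcal{A}$.
So suppose $A = \coprod_{n=1}^\infty A_n$ with $A \in \mathcal{A}$ and each $A_n \in \mathcal{A}$ which we express as $A = \coprod_{j=1}^k E_j$ with $E_j \in \mathcal{E}$ and $A_n = \coprod_{i=1}^{N_n} E_{n,i}$ with $E_{n,i} \in \mathcal{E}$. Then
$$E_j = A \cap E_j = \coprod_{n=1}^\infty A_n \cap E_j = \coprod_{n=1}^\infty \coprod_{i=1}^{N_n} E_{n,i} \cap E_j$$
which is a countable union and hence by assumption,
$$\mu(E_j) \le \sum_{n=1}^\infty \sum_{i=1}^{N_n} \mu(E_{n,i} \cap E_j).$$
Summing this equation on $j$ and using the finite additivity of $\mu$ shows
$$\mu(A) = \sum_{j=1}^k \mu(E_j) \le \sum_{j=1}^k \sum_{n=1}^\infty \sum_{i=1}^{N_n} \mu(E_{n,i} \cap E_j)$$
$$= \sum_{n=1}^\infty \sum_{i=1}^{N_n} \sum_{j=1}^k \mu(E_{n,i} \cap E_j) = \sum_{n=1}^\infty \sum_{i=1}^{N_n} \mu(E_{n,i}) = \sum_{n=1}^\infty \mu(A_n),$$
which proves (using Remark 31.2) the sub-additivity of $\mu$ on $\mathcal{A}$.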
31.1.1 Extending Premeasures to $\mathcal{A}_\sigma$
Proposition 31.6. Let $\mu$ be a premeasure on an algebra $\mathcal{A}$, then $\mu$ has a unique extension (still called $\mu$) to a countably additive function on $\mathcal{A}_\sigma$. Moreover the extended function $\mu$ satisfies the following properties.$^1$
1. (Continuity) If $A_n \in \mathcal{A}$ and $A_n \uparrow A \in \mathcal{A}_\sigma$, then $\mu(A_n) \uparrow \mu(A)$ as $n \to \infty$.
2. (Strong Additivity) If $A, B \in \mathcal{A}_\sigma$, then
$$\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B). \tag{31.5}$$
3. (Sub-Additivity on $\mathcal{A}_\sigma$) The function $\mu$ is sub-additive on $\mathcal{A}_\sigma$.
$^1$ The remaining results in this proposition may be skipped in which case the reader should also skip Section 31.3.
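Proof. Suppose $\{A_n\}_{n=1}^\infty \subset \mathcal{A}$, $A_0 := \emptyset$, and $A = \bigcup_{n=1}^\infty A_n \in \mathcal{A}_\sigma$. By replacing each $A_n$ by $A_n \setminus (A_1 \cup \dots \cup A_{n-1})$ if necessary we may assume that the collection of sets $\{A_n\}_{n=1}^\infty$ is pairwise disjoint. Hence every element $A \in \mathcal{A}_\sigma$ may be expressed as a disjoint union, $A = \coprod_{n=1}^\infty A_n$ with $A_n \in \mathcal{A}$. With $A$ expressed this way we must define
$$\mu(A) := \sum_{n=1}^\infty \mu(A_n).$$
The proof that $\mu(A)$ is well defined follows the same argument used in the proof of Proposition 31.3. Explicitly, suppose also that $A = \coprod_{k=1}^\infty B_k$ with $B_k \in \mathcal{A}$, then for each $n$, $A_n = \coprod_{k=1}^\infty (A_n \cap B_k)$ and therefore because $\mu$ is a premeasure,
$$\mu(A_n) = \sum_{k=1}^\infty \mu(A_n \cap B_k).$$
Summing this equation on $n$ shows,
$$\sum_{n=1}^\infty \mu(A_n) = \sum_{n=1}^\infty \sum_{k=1}^\infty \mu(A_n \cap B_k) = \sum_{k=1}^\infty \sum_{n=1}^\infty \mu(A_n \cap B_k)$$
wherein the last equality we have used Tonelli's theorem for sums. By symmetry we also have
$$\sum_{k=1}^\infty \mu(B_k) = \sum_{k=1}^\infty \sum_{n=1}^\infty \mu(A_n \cap B_k)$$
and comparing the last two equations gives $\sum_{n=1}^\infty \mu(A_n) = \sum_{k=1}^\infty \mu(B_k)$ which shows the extension of $\mu$ to $\mathcal{A}_\sigma$ is well defined.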
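Countable additivity of $\mu$ on $\mathcal{A}_\sigma$. If $\{A_n\}_{n=1}^\infty$ is a collection of pairwise disjoint elements of $\mathcal{A}_\sigma$, then there exist $A_{ni} \in \mathcal{A}$ such that $A_n = \coprod_{i=1}^\infty A_{ni}$ for all $n$, and therefore,
$$\mu\left( \coprod_{n=1}^\infty A_n \right) = \mu\left( \coprod_{i,n=1}^\infty A_{ni} \right) := \sum_{i,n=1}^\infty \mu(A_{ni}) = \sum_{n=1}^\infty \sum_{i=1}^\infty \mu(A_{ni}) = \sum_{n=1}^\infty \mu(A_n).$$
Again there are no problems in manipulating the above sums since all summands are non-negative.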
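Continuity of $\mu$. Suppose $A_n \in \mathcal{A}$ and $A_n \uparrow A \in \mathcal{A}_\sigma$. Then $A_n = \coprod_{i=1}^n B_i$ and $A = \coprod_{i=1}^\infty B_i$ where $B_n := A_n \setminus (A_1 \cup \dots \cup A_{n-1}) \in \mathcal{A}$. So by definition of $\mu(A)$,
$$\mu(A) = \sum_{i=1}^\infty \mu(B_i) = \lim_{n \to \infty} \sum_{i=1}^n \mu(B_i) = \lim_{n \to \infty} \mu(A_n)$$
which proves the continuity assertion.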
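Strong additivity of $\mu$. Let $A$ and $B$ be in $\mathcal{A}_\sigma$ and choose $A_n, B_n \in \mathcal{A}$ such that $A_n \uparrow A$ and $B_n \uparrow B$ as $n \to \infty$ then
$$\mu(A_n \cup B_n) + \mu(A_n \cap B_n) = \mu(A_n) + \mu(B_n). \tag{31.6}$$
Indeed if $\mu(A_n) + \mu(B_n) = \infty$ the identity is true because $\infty = \infty$ and if $\mu(A_n) + \mu(B_n) < \infty$ the identity follows from the finite additivity of $\mu$ on $\mathcal{A}$ and the set identity,
$$A_n \cup B_n = [A_n \cap B_n] \cup [A_n \setminus (A_n \cap B_n)] \cup [B_n \setminus (A_n \cap B_n)].$$
Since $A_n \cup B_n \uparrow A \cup B$ and $A_n \cap B_n \uparrow A \cap B$, Eq. (31.5) follows by passing to the limit as $n \to \infty$ in Eq. (31.6) while making use of the continuity property of $\mu$.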
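Sub-Additivity on $\mathcal{A}_\sigma$. Suppose $A_n \in \mathcal{A}_\sigma$ and $A = \bigcup_{n=1}^\infty A_n$. Choose $A_{n,j} \in \mathcal{A}$ such that $A_n := \coprod_{j=1}^\infty A_{n,j}$, let $\{B_k\}_{k=1}^\infty$ be an enumeration of the collection of sets, $\{A_{n,j} : n, j \in \mathbb{N}\}$, and define $C_k := B_k \setminus (B_1 \cup \dots \cup B_{k-1}) \in \mathcal{A}$ with the usual convention that $B_0 = \emptyset$. Then $A = \coprod_{k=1}^\infty C_k$ and therefore by the definition of $\mu$ on $\mathcal{A}_\sigma$ and the monotonicity of $\mu$ on $\mathcal{A}$,
$$\mu(A) = \sum_{k=1}^\infty \mu(C_k) \le \sum_{k=1}^\infty \mu(B_k) = \sum_{n=1}^\infty \sum_{j=1}^\infty \mu(A_{n,j}) = \sum_{n=1}^\infty \mu(A_n).$$
In future we will tacitly assume that any premeasure, $\mu$, on an algebra, $\mathcal{A}$, has been extended to $\mathcal{A}_\sigma$ as described in Proposition 31.6.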

31.2 Outer Measures
Definition 31.7. A function $\nu : 2^X \to [0, \infty]$ is an outer measure if $\nu(\emptyset) = 0$, $\nu$ is monotonic and sub-additive.
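Proposition 31.8 (Example of an outer measure.). Let $\mathcal{E} \subset 2^X$ be an arbitrary collection of subsets of $X$ such that $\emptyset, X \in \mathcal{E}$. Let $\rho : \mathcal{E} \to [0, \infty]$ be a function such that $\rho(\emptyset) = 0$. For any $A \subset X$, define
$$\rho^*(A) = \inf\left\{ \sum_{i=1}^\infty \rho(E_i) : A \subset \bigcup_{i=1}^\infty E_i \text{ with } E_i \in \mathcal{E} \right\}. \tag{31.7}$$
Then $\rho^*$ is an outer measure.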
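Proof. It is clear that $\rho^*$ is monotonic and $\rho^*(\emptyset) = 0$. Suppose for $i \in \mathbb{N}$, $A_i \in 2^X$ and $\rho^*(A_i) < \infty$; otherwise there will be nothing to prove. Let $\varepsilon > 0$ and choose $E_{ij} \in \mathcal{E}$ such that $A_i \subset \bigcup_{j=1}^\infty E_{ij}$ and
$$\rho^*(A_i) \ge \sum_{j=1}^\infty \rho(E_{ij}) - 2^{-i}\varepsilon.$$
Since $\bigcup_{i=1}^\infty A_i \subset \bigcup_{i,j=1}^\infty E_{ij}$,
$$\rho^*\left( \bigcup_{i=1}^\infty A_i \right) \le \sum_{i=1}^\infty \sum_{j=1}^\infty \rho(E_{ij}) \le \sum_{i=1}^\infty \left( \rho^*(A_i) + 2^{-i}\varepsilon \right) = \sum_{i=1}^\infty \rho^*(A_i) + \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary in this inequality, we have shown $\rho^*$ is sub-additive.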
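On a finite set the infimum in Eq. (31.7) can be computed by brute force, which makes the construction of Proposition 31.8 easy to experiment with. The sketch below is only an illustration under stated assumptions: the set $X = \{1, 2, 3\}$, the collection $\mathcal{E}$ and the costs in `rho` are made up for the example, and the name `outer_measure` is hypothetical. Since $\mathcal{E}$ is finite and $\rho(\emptyset) = 0$, it suffices to minimise over finite subfamilies of $\mathcal{E}$ (repeating a set never lowers the cost, and a countable cover can be padded with copies of $\emptyset$).

```python
from itertools import combinations

# Hypothetical data: rho is deliberately not additive, since rho(X) = 3
# exceeds the cost 1 + 1 of covering X by {1, 2} and {2, 3}.
X = frozenset({1, 2, 3})
rho = {
    frozenset(): 0,
    frozenset({1, 2}): 1,
    frozenset({2, 3}): 1,
    X: 3,
}


def outer_measure(rho, A):
    """Return the infimum in Eq. (31.7) for a finite collection of sets."""
    A = frozenset(A)
    sets = list(rho)
    best = float("inf")
    for r in range(len(sets) + 1):
        for family in combinations(sets, r):
            union = frozenset().union(*family)
            if A <= union:  # the family covers A
                best = min(best, sum(rho[E] for E in family))
    return best


def subsets(S):
    """All subsets of the finite set S as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


values = {A: outer_measure(rho, A) for A in subsets(X)}
```

In this example `values[X]` is 2 rather than `rho[X]`, which shows that $\rho^*$ need not agree with $\rho$ on $\mathcal{E}$ unless $\rho$ has further properties (compare Lemma 31.9, where $\rho = \mu$ is a premeasure).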
The following lemma is an easy consequence of Proposition 31.6 and the
remarks in the proof of Theorem 28.2.
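Lemma 31.9. Suppose that $\mu$ is a premeasure on an algebra $\mathcal{A}$ and $\mu^*$ is the outer measure associated to $\mu$ as in Proposition 31.8. Then
$$\mu^*(B) = \inf\{\mu(C) : B \subset C \in \mathcal{A}_\sigma\} \quad \forall B \subset X$$
and $\mu^* = \mu$ on $\mathcal{A}$, where $\mu$ has been extended to $\mathcal{A}_\sigma$ as described in Proposition 31.6.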
Lemma 31.10. Suppose $(X, \tau)$ is a locally compact Hausdorff space, $I$ is a positive linear functional on $C_c(X)$, and let $\mu : \tau \to [0, \infty]$ be defined in Eq. (28.8). Then $\mu$ is sub-additive on $\tau$ and the associated outer measure, $\mu^* : 2^X \to [0, \infty]$, associated to $\mu$ as in Proposition 31.8 may be described by
$$\mu^*(E) = \inf\{\mu(U) : E \subset U \subset_o X\}. \tag{31.8}$$
In particular $\mu^* = \mu$ on $\tau$.
Proof. Let $\{U_j\}_{j=1}^\infty \subset \tau$, $U := \bigcup_{j=1}^\infty U_j$, $f \prec U$ and $K = \mathrm{supp}(f)$. Since $K$ is compact, $K \subset \bigcup_{j=1}^n U_j$ for some $n \in \mathbb{N}$ sufficiently large. By Proposition 15.16 (partitions of unity proposition) we may choose $h_j \prec U_j$ such that $\sum_{j=1}^n h_j = 1$ on $K$. Since $f = \sum_{j=1}^n h_j f$ and $h_j f \prec U_j$,
$$I(f) = \sum_{j=1}^n I(h_j f) \le \sum_{j=1}^n \mu(U_j) \le \sum_{j=1}^\infty \mu(U_j).$$
Since this is true for all $f \prec U$ we conclude $\mu(U) \le \sum_{j=1}^\infty \mu(U_j)$ proving the countable sub-additivity of $\mu$ on $\tau$. The remaining assertions are a direct consequence of this sub-additivity.
31.3 *The Finite Extension Theorem
This section may be skipped (at the loss of some motivation), since the results
here will be subsumed by those in Section 31.4 below.
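Notation 31.11 (Inner Measure) If $\mu$ is a finite (i.e. $\mu(X) < \infty$) premeasure on an algebra $\mathcal{A}$, we extend $\mu$ to $\mathcal{A}_\delta$ by defining
$$\mu(A) := \mu(X) - \mu(A^c). \tag{31.9}$$
(Note: $\mu(A^c)$ is defined since $A^c \in \mathcal{A}_\sigma$.) Also let
$$\mu_*(B) := \sup\{\mu(A) : \mathcal{A}_\delta \ni A \subset B\} \quad \forall B \subset X$$
and define
$$\mathcal{M} = \mathcal{M}(\mu) := \{B \subset X : \mu_*(B) = \mu^*(B)\} \tag{31.10}$$
and $\bar{\mu} := \mu^*|_{\mathcal{M}}$. In words, $B$ is in $\mathcal{M}$ iff $B$ may be well approximated from both inside and out by sets $\mu$ can measure.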
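Remark 31.12. If $A \in \mathcal{A}_\sigma \cap \mathcal{A}_\delta$, then $A, A^c \in \mathcal{A}_\sigma$ and so by the strong additivity of $\mu$, $\mu(A) + \mu(A^c) = \mu(X)$ from which it follows that the extension of $\mu$ to $\mathcal{A}_\delta$ is consistent with the extension of $\mu$ to $\mathcal{A}_\sigma$.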
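Lemma 31.13. Let $\mu$ be a finite premeasure on an algebra $\mathcal{A} \subset 2^X$ and continue the setup in Notation 31.11.
1. If $A \in \mathcal{A}_\delta$ and $C \in \mathcal{A}_\sigma$ with $A \subset C$, then
$$\mu(C \setminus A) = \mu(C) - \mu(A). \tag{31.11}$$
2. For all $B \subset X$, $\mu_*(B) = \mu(X) - \mu^*(B^c)$, and
$$\mathcal{M} := \{B \subset X : \mu(X) = \mu^*(B) + \mu^*(B^c)\}. \tag{31.12}$$
3. A subset $B \subset X$ is in $\mathcal{M}$ iff for all $\varepsilon > 0$ there exists $A \in \mathcal{A}_\delta$ and $C \in \mathcal{A}_\sigma$ such that $A \subset B \subset C$ and $\mu(C \setminus A) < \varepsilon$. In particular $\mathcal{A} \subset \mathcal{M}$.
4. $\mu$ is additive on $\mathcal{A}_\delta$.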
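Proof. 1. The strong additivity Eq. (31.5) with $B = C \in \mathcal{A}_\sigma$ and $A$ being replaced by $A^c \in \mathcal{A}_\sigma$ implies
$$\mu(A^c \cup C) + \mu(C \setminus A) = \mu(A^c) + \mu(C).$$
Since $X = A^c \cup C$ and $\mu(A^c) = \mu(X) - \mu(A)$, the previous equality implies Eq. (31.11).
2. For the second assertion we have
$$\mu_*(B) = \sup\{\mu(A) : \mathcal{A}_\delta \ni A \subset B\} = \sup\{\mu(X) - \mu(A^c) : \mathcal{A}_\delta \ni A \subset B\}$$
$$= \sup\{\mu(X) - \mu(C) : \mathcal{A}_\delta \ni C^c \subset B\} = \mu(X) - \inf\{\mu(C) : B^c \subset C \in \mathcal{A}_\sigma\} = \mu(X) - \mu^*(B^c).$$
Thus the condition that $\mu_*(B) = \mu^*(B)$ is equivalent to requiring that
$$\mu(X) = \mu^*(X) = \mu^*(B^c) + \mu^*(B). \tag{31.13}$$
3. By definition $B \in \mathcal{M}$ iff $\mu_*(B) = \mu^*(B)$ which happens iff for each $\varepsilon > 0$ there exists $A \in \mathcal{A}_\delta$ and $C \in \mathcal{A}_\sigma$ such that $A \subset B \subset C$ and $\mu(C) - \mu(A) < \varepsilon$; i.e. by item 1, $\mu(C \setminus A) < \varepsilon$. The containment, $\mathcal{A} \subset \mathcal{M}$, follows from what we have just proved or is a direct consequence of $\mu$ being additive on $\mathcal{A}$ and the fact that $\mu^* = \mu$ on $\mathcal{A}$.
4. Suppose $A, B \in \mathcal{A}_\delta$ are disjoint sets, then the strong additivity of $\mu$ on $\mathcal{A}_\sigma$ (use Eq. (31.5) with $A$ and $B$ being replaced by $A^c$ and $B^c$ respectively) gives
$$2\mu(X) - \mu(A \cup B) = \mu(X) + \mu([A \cup B]^c) = \mu(A^c \cup B^c) + \mu(A^c \cap B^c) = \mu(A^c) + \mu(B^c) = 2\mu(X) - \mu(A) - \mu(B),$$
i.e. $\mu(A \cup B) = \mu(A) + \mu(B)$.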
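Theorem 31.14 (Finite Premeasure Extension Theorem). If $\mu$ is a finite premeasure on an algebra $\mathcal{A}$, then $\mathcal{M} = \mathcal{M}(\mu)$ (as in Eq. (31.10)) is a $\sigma$-algebra, $\mathcal{A} \subset \mathcal{M}$ and $\bar{\mu} = \mu^*|_{\mathcal{M}}$ is a countably additive measure such that $\bar{\mu} = \mu$ on $\mathcal{A}$.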
Proof. By Lemma 31.13, , X / / and from Eq. (31.12) it follows
that /is closed under complementation. Now suppose N 2, 3, . . .
and B
i
/ for i < N. Given > 0, by Lemma 31.13 there exists A
i
B
i

C
i
with A
i
/
and C
i
/
such that (C
i
A
i
) < 2
i
for all i < N. Let
B =
i<N
B
i
, C :=
i<N
C
i
and A :=
i<N
A
i
so that A B C /
.
For the moment assume N < , then A /
, C A = C A
c
/
,
C A =
i<N
(C
i
A)
i<N
(C
i
A
i
) /
and so by the sub-additivity of on /
(Proposition 31.6),
(C A)
i<N
(C
i
A
i
) <
i<N
2
i
< .
Since > 0 was arbitrary, it follows again by Lemma 31.13 that B / and
we have shown / is an algebra.
Now suppose that N = . Because / is an algebra, to show / is a
algebra it suces to show B =
i=1
B
i
/under the additional assumption
that the collection of sets, B
i
i=1
, are also pairwise disjoint in which case
the sets, A
i
i=1
, are pairwise disjoint. Since is additive on /
(Lemma
31.13), for any n N,
n
i=1
(C
i
)
n
i=1
_
(A
i
) +2
i
(
n
i=1
A
i
) +.
This implies, using
(
n
i=1
A
i
) = (X) ([
n
i=1
A
i
]
c
) (X) ,
that
i=1
(C
i
) = lim
n
n
i=1
(C
i
) (X) + < . (31.14)
Let n N and A
n
:=

n
i=1
A
i
/
. Then /
A
n
B C /
,
C A
n
/
and
C A
n
=
i=1
(C
i
A
n
) [
n
i=1
(C
i
A
i
)]
_
i=n+1
C
i
.
Therefore, using the sub-additivity of on /
and the estimate (31.14),

(C A
n
)
n
i=1
(C
i
A
i
) +
i=n+1
(C
i
)
+
i=n+1
(C
i
) as n .
Since > 0 was arbitrary it now follows from Lemma 31.9 that B /.
Moreover, since
(B
i
) (C
i
) (A
i
) + 2
i
,
n
i=1
_
(B
i
) 2
i
i=1
(A
i
) = (A
n
)
(B) .
Letting n in this equation implies
i=1
(B
i
)
(B)
i=1
(B
i
) .
Because > 0 was arbitrary, it follows that

i=1
(B
i
) =
(B) and we
have also shown =
[
,
is a measure on /.
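Exercise 31.1. Keeping the same hypothesis and notation as in Theorem 31.14, suppose $B \in \mathcal{M}$. Show there exists $A \subset B \subset C$ such that $A \in \mathcal{A}_{\delta\sigma}$, $C \in \mathcal{A}_{\sigma\delta}$ and $\bar{\mu}(C \setminus A) = 0$. (Hint: see the proof of Theorem 28.6 where the same statement is proved with $\mathcal{M}$ replaced by $\sigma(\mathcal{A})$.) Conclude from this that $\bar{\mu}$ is the completion of $\bar{\mu}|_{\sigma(\mathcal{A})}$. (See Lemma 19.47 for more about completion of measures.)
Exercise 31.2. Keeping the same hypothesis and notation as in Theorem 31.14, show $\mathcal{M} = \mathcal{M}'$ where $\mathcal{M}'$ consists of those subsets $B \subset X$ such that
$$\mu^*(E) = \mu^*(B \cap E) + \mu^*(B^c \cap E) \quad \forall E \subset X. \tag{31.15}$$
Hint: To verify Eq. (31.15) holds for $B \in \mathcal{M}$, approximate $E \subset X$ from the outside by a set $C \in \mathcal{A}_\sigma$ and then make use of the sub-additivity, the monotonicity of $\mu^*$ and the fact that $\mu^*$ is a measure on $\mathcal{M}$.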
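Theorem 31.15. Suppose that $\mu$ is a $\sigma$-finite premeasure on an algebra $\mathcal{A}$. Then
$$\bar{\mu}(B) := \inf\{\mu(C) : B \subset C \in \mathcal{A}_\sigma\} \quad \forall B \in \sigma(\mathcal{A}) \tag{31.16}$$
defines a measure on $\sigma(\mathcal{A})$ and this measure is the unique measure on $\sigma(\mathcal{A})$ which extends $\mu$.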
Proof. The uniqueness of the extension was already proved in Theorem
19.55. For existence, let X
n
n=1
/ be chosen so that (X
n
) < for all
n and X
n
X as n and let
n
(A) :=
n
(A X
n
) for all A /.
Each
n
is a premeasure (as is easily veried) on / and hence by Theorem
31.14 each
n
has an extension,
n
, to a measure on (/) . Since the measure

n
are increasing, := lim
n

n
is a measure which extends , see Exercise
19.4.
The proof will be completed by verifying that Eq. (31.16) holds by repeat-
ing an argument already used in the proof of Theorem 28.6. Let B (/) ,
B
m
= X
m
B and > 0 be given. By Theorem 31.14, there exists C
m
/
such that B
m
C
m
X
m
and (C
m
B
m
) =
m
(C
m
B
m
) < 2
n
. Then
C :=
m=1
C
m
/
and, as usual,
(C B)
_

_
m=1
(C
m
B)
_
m=1
(C
m
B)
m=1
(C
m
B
m
) < .
Thus
(B) (C) = (B) + (C B) (B) +
which proves the rst item since > 0 was arbitrary.
31.4 General Extension and Construction Theorem
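Exercise 31.2 motivates the following definition.
Definition 31.16. Let $\mu^* : 2^X \to [0, \infty]$ be an outer measure. Define the $\mu^*$-measurable sets to be
$$\mathcal{M}(\mu^*) := \{B \subset X : \mu^*(E) \ge \mu^*(E \cap B) + \mu^*(E \cap B^c) \ \forall E \subset X\}.$$
Because of the sub-additivity of $\mu^*$, we may equivalently define $\mathcal{M}(\mu^*)$ by
$$\mathcal{M}(\mu^*) = \{B \subset X : \mu^*(E) = \mu^*(E \cap B) + \mu^*(E \cap B^c) \ \forall E \subset X\}. \tag{31.17}$$
Theorem 31.17 (Caratheodory's Construction Theorem). Let $\mu^*$ be an outer measure on $X$ and $\mathcal{M} := \mathcal{M}(\mu^*)$. Then $\mathcal{M}$ is a $\sigma$-algebra and $\mu := \mu^*|_{\mathcal{M}}$ is a complete measure.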
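Proof. Clearly $\emptyset, X \in \mathcal{M}$ and if $A \in \mathcal{M}$ then $A^c \in \mathcal{M}$. So to show that $\mathcal{M}$ is an algebra we must show that $\mathcal{M}$ is closed under finite unions, i.e. if $A, B \in \mathcal{M}$ and $E \in 2^X$ then
$$\mu^*(E) \ge \mu^*(E \cap (A \cup B)) + \mu^*(E \setminus (A \cup B)).$$
Using the definition of $\mathcal{M}$ three times, we have
$$\mu^*(E) = \mu^*(E \cap A) + \mu^*(E \setminus A) \tag{31.18}$$
$$= \mu^*(E \cap A \cap B) + \mu^*((E \cap A) \setminus B) + \mu^*((E \setminus A) \cap B) + \mu^*((E \setminus A) \setminus B). \tag{31.19}$$
By the sub-additivity of $\mu^*$ and the set identity,
$$E \cap (A \cup B) = (E \cap A) \cup (E \cap B)$$
$$= [((E \cap A) \setminus B) \cup (E \cap A \cap B)] \cup [((E \cap B) \setminus A) \cup (E \cap A \cap B)]$$
$$= [E \cap A \cap B] \cup [(E \cap A) \setminus B] \cup [(E \setminus A) \cap B],$$
we have
$$\mu^*(E \cap A \cap B) + \mu^*((E \cap A) \setminus B) + \mu^*((E \setminus A) \cap B) \ge \mu^*(E \cap (A \cup B)).$$
Using this inequality in Eq. (31.19) shows
$$\mu^*(E) \ge \mu^*(E \cap (A \cup B)) + \mu^*(E \setminus (A \cup B)) \tag{31.20}$$
which implies $A \cup B \in \mathcal{M}$. So $\mathcal{M}$ is an algebra. Now suppose $A, B \in \mathcal{M}$ are disjoint, then taking $E = A \cup B$ in Eq. (31.18) implies
$$\mu^*(A \cup B) = \mu^*(A) + \mu^*(B)$$
and $\mu = \mu^*|_{\mathcal{M}}$ is finitely additive on $\mathcal{M}$.
We now must show that $\mathcal{M}$ is a $\sigma$-algebra and that $\mu$ is $\sigma$-additive. Let $A_i \in \mathcal{M}$ (without loss of generality assume $A_i \cap A_j = \emptyset$ if $i \neq j$), $B_n = \bigcup_{i=1}^n A_i$, and $B = \bigcup_{j=1}^\infty A_j$, then for $E \subset X$ we have
$$\mu^*(E \cap B_n) = \mu^*(E \cap B_n \cap A_n) + \mu^*(E \cap B_n \cap A_n^c) = \mu^*(E \cap A_n) + \mu^*(E \cap B_{n-1}),$$
and so by induction,
$$\mu^*(E \cap B_n) = \sum_{k=1}^n \mu^*(E \cap A_k). \tag{31.21}$$
Therefore we find that
$$\mu^*(E) = \mu^*(E \cap B_n) + \mu^*(E \cap B_n^c) = \sum_{k=1}^n \mu^*(E \cap A_k) + \mu^*(E \cap B_n^c) \ge \sum_{k=1}^n \mu^*(E \cap A_k) + \mu^*(E \cap B^c)$$
where the last inequality is a consequence of the monotonicity of $\mu^*$ and the fact that $B^c \subset B_n^c$. Letting $n \to \infty$ in this equation shows that
$$\mu^*(E) \ge \sum_{k=1}^\infty \mu^*(E \cap A_k) + \mu^*(E \cap B^c) \ge \mu^*\left( \bigcup_k (E \cap A_k) \right) + \mu^*(E \setminus B) = \mu^*(E \cap B) + \mu^*(E \setminus B) \ge \mu^*(E),$$
wherein we have used the sub-additivity of $\mu^*$ twice. Hence $B \in \mathcal{M}$ and we have shown $\mathcal{M}$ is a $\sigma$-algebra. Since $\mu^*(E) \ge \mu^*(E \cap B_n)$ we may let $n \to \infty$ in Eq. (31.21) to find
$$\mu^*(E) \ge \sum_{k=1}^\infty \mu^*(E \cap A_k).$$
Letting $E = B = \bigcup_k A_k$ in this inequality then implies $\mu^*(B) \ge \sum_{k=1}^\infty \mu^*(A_k)$ and hence, by the sub-additivity of $\mu^*$, $\mu^*(B) = \sum_{k=1}^\infty \mu^*(A_k)$. Therefore, $\mu = \mu^*|_{\mathcal{M}}$ is countably additive on $\mathcal{M}$.
Finally we show $\mu$ is complete. If $N \subset F \in \mathcal{M}$ and $\mu(F) = 0 = \mu^*(F)$, then $\mu^*(N) = 0$ and
$$\mu^*(E) \le \mu^*(E \cap N) + \mu^*(E \cap N^c) = \mu^*(E \cap N^c) \le \mu^*(E),$$
which shows that $N \in \mathcal{M}$.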
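For a finite set $X$ the Caratheodory condition in Eq. (31.17) can be tested directly against every $E \subset X$, which gives a concrete feel for Theorem 31.17. The following Python sketch does this for a made-up outer measure on $X = \{0, 1, 2\}$; the function `mu_star` and the helper names are assumptions introduced only for this illustration. Here `mu_star` charges 1 if a set meets the block $\{0, 1\}$ and 1 if it contains the point 2, which is monotonic and sub-additive with `mu_star` of the empty set equal to 0.

```python
from itertools import combinations

X = frozenset({0, 1, 2})


def subsets(S):
    """All subsets of the finite set S as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mu_star(A):
    # Hypothetical outer measure: cost 1 for touching {0, 1}, cost 1 for containing 2.
    return (1 if A & {0, 1} else 0) + (1 if 2 in A else 0)


def is_measurable(B):
    """Check the Caratheodory condition of Eq. (31.17) against every E in 2^X."""
    return all(mu_star(E) == mu_star(E & B) + mu_star(E - B)
               for E in subsets(X))


# The collection M(mu_star) of measurable sets.
measurable = {B for B in subsets(X) if is_measurable(B)}
```

In this example the measurable sets are exactly $\emptyset$, $\{2\}$, $\{0, 1\}$ and $X$: the outer measure cannot distinguish the points 0 and 1, so the singletons $\{0\}$ and $\{1\}$ fail the splitting condition (take $E = \{0, 1\}$), while on the four measurable sets `mu_star` is additive as the theorem asserts.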
31.4.1 Extensions of General Premeasures
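In this subsection let $X$ be a set, $\mathcal{A}$ be a subalgebra of $2^X$ and $\mu_0 : \mathcal{A} \to [0, \infty]$ be a premeasure on $\mathcal{A}$.
Theorem 31.18. Let $\mathcal{A} \subset 2^X$ be an algebra, $\mu$ be a premeasure on $\mathcal{A}$ and $\mu^*$ be the associated outer measure as defined in Eq. (31.7) with $\rho = \mu$. Let $\mathcal{M} := \mathcal{M}(\mu^*) \supset \sigma(\mathcal{A})$, then:
1. $\mathcal{A} \subset \mathcal{M}(\mu^*)$ and $\mu^*|_{\mathcal{A}} = \mu$.
2. $\bar{\mu} = \mu^*|_{\mathcal{M}}$ is a measure on $\mathcal{M}$ which extends $\mu$.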
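3. If $\nu : \mathcal{M} \to [0, \infty]$ is another measure such that $\nu = \mu$ on $\mathcal{A}$ and $B \in \mathcal{M}$, then $\nu(B) \le \bar{\mu}(B)$ and $\nu(B) = \bar{\mu}(B)$ whenever $\bar{\mu}(B) < \infty$.
4. If $\mu$ is $\sigma$-finite on $\mathcal{A}$ then the extension, $\bar{\mu}$, of $\mu$ to $\mathcal{M}$ is unique and moreover $\mathcal{M} = \overline{\sigma(\mathcal{A})}^{\bar{\mu}|_{\sigma(\mathcal{A})}}$.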
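Proof. Recall from Proposition 31.6 and Lemma 31.9 that $\mu$ extends to a countably additive function on $\mathcal{A}_\sigma$ and $\mu^* = \mu$ on $\mathcal{A}$.
1. Let $A \in \mathcal{A}$ and $E \subset X$ such that $\mu^*(E) < \infty$. Given $\varepsilon > 0$ choose pairwise disjoint sets, $B_j \in \mathcal{A}$, such that $E \subset B := \coprod_{j=1}^\infty B_j$ and
$$\mu^*(E) + \varepsilon \ge \mu(B) = \sum_{j=1}^\infty \mu(B_j).$$
Since $A \cap E \subset \coprod_{j=1}^\infty (B_j \cap A)$ and $E \cap A^c \subset \coprod_{j=1}^\infty (B_j \cap A^c)$, using the sub-additivity of $\mu^*$ and the additivity of $\mu$ on $\mathcal{A}$ we have,
$$\mu^*(E) + \varepsilon \ge \sum_{j=1}^\infty \mu(B_j) = \sum_{j=1}^\infty [\mu(B_j \cap A) + \mu(B_j \cap A^c)] \ge \mu^*(E \cap A) + \mu^*(E \cap A^c).$$
Since $\varepsilon > 0$ is arbitrary this shows that
$$\mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c)$$
and therefore that $A \in \mathcal{M}(\mu^*)$.
2. This is a direct consequence of item 1. and Theorem 31.17.
3. If $A := \coprod_{j=1}^\infty A_j$ with $\{A_j\}_{j=1}^\infty \subset \mathcal{A}$ being a collection of pairwise disjoint sets, then
$$\nu(A) = \sum_{j=1}^\infty \nu(A_j) = \sum_{j=1}^\infty \mu(A_j) = \bar{\mu}(A).$$
This shows $\nu = \mu = \bar{\mu}$ on $\mathcal{A}_\sigma$. Consequently, if $B \in \mathcal{M}$, then
$$\nu(B) \le \inf\{\nu(A) : B \subset A \in \mathcal{A}_\sigma\} = \inf\{\bar{\mu}(A) : B \subset A \in \mathcal{A}_\sigma\} = \mu^*(B) = \bar{\mu}(B). \tag{31.22}$$
If $\bar{\mu}(B) < \infty$ and $\varepsilon > 0$ is given, there exists $A \in \mathcal{A}_\sigma$ such that $B \subset A$ and $\bar{\mu}(A) = \mu(A) \le \bar{\mu}(B) + \varepsilon$. From Eq. (31.22), this implies
$$\nu(A \setminus B) \le \bar{\mu}(A \setminus B) \le \varepsilon.$$
Therefore,
$$\nu(B) \le \bar{\mu}(B) \le \bar{\mu}(A) = \nu(A) = \nu(B) + \nu(A \setminus B) \le \nu(B) + \varepsilon$$
which shows $\bar{\mu}(B) = \nu(B)$ because $\varepsilon > 0$ was arbitrary.
4. For the $\sigma$-finite case, choose $X_j \in \mathcal{M}$ such that $X_j \uparrow X$ and $\bar{\mu}(X_j) < \infty$ then
$$\bar{\mu}(B) = \lim_{j \to \infty} \bar{\mu}(B \cap X_j) = \lim_{j \to \infty} \nu(B \cap X_j) = \nu(B).$$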
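Theorem 31.19 (Regularity Theorem). Suppose that $\mu$ is a $\sigma$-finite premeasure on an algebra $\mathcal{A}$, $\bar\mu$ is the extension described in Theorem 31.18 and $B \in \mathcal{M} := \mathcal{M}(\mu^*)$. Then:
1. $\bar\mu(B) = \inf\{\bar\mu(C) : B \subset C \in \mathcal{A}_\sigma\}$.
2. For any $\varepsilon > 0$ there exists $A \subset B \subset C$ such that $A \in \mathcal{A}_\delta$, $C \in \mathcal{A}_\sigma$ and $\bar\mu(C \setminus A) < \varepsilon$.
3. There exists $A \subset B \subset C$ such that $A \in \mathcal{A}_{\delta\sigma}$, $C \in \mathcal{A}_{\sigma\delta}$ and $\bar\mu(C \setminus A) = 0$.
4. The $\sigma$-algebra, $\mathcal{M}$, is the completion of $\sigma(\mathcal{A})$ with respect to $\bar\mu|_{\sigma(\mathcal{A})}$.
Proof. The proofs of items 1. to 3. are the same as the proofs of the corresponding results in Theorem 28.6 and so will be omitted. Moreover, item 4. is a simple consequence of item 3. and Proposition 19.6.
The following proposition shows that measures may be "restricted" to non-measurable sets.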
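Proposition 31.20. Suppose that $(X, \mathcal{M}, \mu)$ is a probability space and $\Omega \subset X$ is any set. Let $\mathcal{M}_\Omega := \{A \cap \Omega : A \in \mathcal{M}\}$ and set $P(A \cap \Omega) := \mu^*(A \cap \Omega)$. Then $P$ is a measure on the $\sigma$-algebra $\mathcal{M}_\Omega$. Moreover, if $P^*$ is the outer measure generated by $P$, then $P^*(A) = \mu^*(A)$ for all $A \subset \Omega$.
Proof. Let $A, B \in \mathcal{M}$ such that $A \cap B = \emptyset$. Then since $A \in \mathcal{M} \subset \mathcal{M}(\mu^*)$ it follows from Eq. (31.15) with $E := (A \cup B) \cap \Omega$ that
\[
\mu^*((A \cup B) \cap \Omega) = \mu^*((A \cup B) \cap \Omega \cap A) + \mu^*((A \cup B) \cap \Omega \cap A^c) = \mu^*(\Omega \cap A) + \mu^*(B \cap \Omega)
\]
which shows that $P$ is finitely additive. Now suppose $A = \bigsqcup_{j=1}^{\infty} A_j$ with $A_j \in \mathcal{M}$ and let $B_n := \bigsqcup_{j=n+1}^{\infty} A_j \in \mathcal{M}$. By what we have just proved,
\[
\mu^*(A \cap \Omega) = \sum_{j=1}^{n} \mu^*(A_j \cap \Omega) + \mu^*(B_n \cap \Omega) \ge \sum_{j=1}^{n} \mu^*(A_j \cap \Omega).
\]
Passing to the limit as $n \to \infty$ in this last expression and using the sub-additivity of $\mu^*$ we find
\[
\sum_{j=1}^{\infty} \mu^*(A_j \cap \Omega) \ge \mu^*(A \cap \Omega) \ge \sum_{j=1}^{\infty} \mu^*(A_j \cap \Omega).
\]
Thus
\[
\mu^*(A \cap \Omega) = \sum_{j=1}^{\infty} \mu^*(A_j \cap \Omega)
\]
and we have shown that $P = \mu^*|_{\mathcal{M}_\Omega}$ is a measure. Now let $P^*$ be the outer measure generated by $P$. For $A \subset \Omega$, we have
\[
P^*(A) = \inf\{P(B) : A \subset B \in \mathcal{M}_\Omega\} = \inf\{P(B \cap \Omega) : A \subset B \in \mathcal{M}\} = \inf\{\mu^*(B \cap \Omega) : A \subset B \in \mathcal{M}\} \tag{31.23}
\]
and since $\mu^*(B \cap \Omega) \le \mu^*(B)$,
\[
P^*(A) \le \inf\{\mu^*(B) : A \subset B \in \mathcal{M}\} = \inf\{\mu(B) : A \subset B \in \mathcal{M}\} = \mu^*(A).
\]
On the other hand, for $A \subset B \in \mathcal{M}$, we have $A \subset B \cap \Omega$, so $\mu^*(A) \le \mu^*(B \cap \Omega)$ and therefore by Eq. (31.23)
\[
\mu^*(A) \le \inf\{\mu^*(B \cap \Omega) : A \subset B \in \mathcal{M}\} = P^*(A),
\]
and we have shown
\[
\mu^*(A) \le P^*(A) \le \mu^*(A).
\]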
31.5 Proof of the Riesz-Markov Theorem 28.16
This section is devoted to completing the proof of the Riesz-Markov Theorem
28.16.
Theorem 31.21. Suppose $(X, \tau)$ is a locally compact Hausdorff space, $I$ is a positive linear functional on $C_c(X)$ and $\mu := \mu_I$ be as in Notation 28.15. Then $\mu$ is a Radon measure on $X$ such that $I = I_\mu$, i.e.
\[
I(f) = \int_X f \, d\mu \quad \text{for all } f \in C_c(X).
\]
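Proof. Let $\mu : \tau \to [0, \infty]$ be as in Eq. (28.8) and $\mu^* : 2^X \to [0, \infty]$ be the associated outer measure as in Proposition 31.8. As we have seen in Lemma 31.10, $\mu$ is sub-additive on $\tau$ and
\[
\mu^*(E) = \inf\{\mu(U) : E \subset U \subset_o X\}.
\]
By Theorem 31.17, $\mathcal{M} := \mathcal{M}(\mu^*)$ is a $\sigma$-algebra and $\mu^*|_{\mathcal{M}}$ is a measure on $\mathcal{M}$.
To show $\mathcal{B}_X \subset \mathcal{M}$ it suffices to show $U \in \mathcal{M}$ for all $U \in \tau$, i.e. we must show
\[
\mu^*(E) \ge \mu^*(E \cap U) + \mu^*(E \setminus U) \tag{31.24}
\]
for every $E \subset X$ such that $\mu^*(E) < \infty$. First suppose $E$ is open, in which case $E \cap U$ is open as well. Let $f \prec E \cap U$ and $K := \operatorname{supp}(f)$. Then $E \setminus U \subset E \setminus K$ and if $g \prec E \setminus K$ then $f + g \prec E$ (see Figure 31.1) and hence
\[
\mu^*(E) \ge I(f + g) = I(f) + I(g).
\]
Taking the supremum of this inequality over $g \prec E \setminus K$ shows
\[
\mu^*(E) \ge I(f) + \mu^*(E \setminus K) \ge I(f) + \mu^*(E \setminus U).
\]
Taking the supremum of this inequality over $f \prec E \cap U$ shows Eq. (31.24) is valid for $E \in \tau$.
Fig. 31.1. Constructing a function $g$ which approximates $1_{E \setminus U}$.
For general $E \subset X$, let $V \in \tau$ with $E \subset V$, then
\[
\mu^*(V) \ge \mu^*(V \cap U) + \mu^*(V \setminus U) \ge \mu^*(E \cap U) + \mu^*(E \setminus U)
\]
and taking the infimum of this inequality over such $V$ shows Eq. (31.24) is valid for general $E \subset X$. Thus $U \in \mathcal{M}$ for all $U \in \tau$ and therefore $\mathcal{B}_X \subset \mathcal{M}$.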
Up to this point it has been shown that $\mu = \mu^*|_{\mathcal{B}_X}$ is a measure which, by very construction, is outer regular. We now verify that $\mu$ satisfies Eq. (28.10), namely that $\mu(K) = \nu(K)$ for all compact sets $K \subset X$ where
\[
\nu(K) := \inf\{I(f) : f \in C_c(X, [0, 1]) \text{ with } f \ge 1_K\}.
\]
To do this let $f \in C_c(X, [0, 1])$ with $f \ge 1_K$ and $\varepsilon > 0$ be given. Let $U_\varepsilon := \{f > 1 - \varepsilon\}$ and $g \prec U_\varepsilon$, then $g \le (1 - \varepsilon)^{-1} f$ and hence $I(g) \le (1 - \varepsilon)^{-1} I(f)$. Taking the supremum of this inequality over all $g \prec U_\varepsilon$ then gives,
\[
\mu(K) \le \mu(U_\varepsilon) \le (1 - \varepsilon)^{-1} I(f).
\]
Since $\varepsilon > 0$ was arbitrary, we learn $\mu(K) \le I(f)$ for all $1_K \le f \prec X$ and therefore, $\mu(K) \le \nu(K)$. Now suppose that $U \in \tau$ and $K \subset U$. By Urysohn's Lemma 15.8 (also see Lemma 14.27), there exists $f \prec U$ such that $f \ge 1_K$ and therefore
\[
\mu(K) \le \nu(K) \le I(f) \le \mu(U).
\]
By the outer regularity of $\mu$, we have
\[
\mu(K) \le \nu(K) \le \inf\{\mu(U) : K \subset U \subset_o X\} = \mu(K),
\]
i.e.
\[
\mu(K) = \nu(K) = \inf\{I(f) : f \in C_c(X, [0, 1]) \text{ with } f \ge 1_K\}. \tag{31.25}
\]
This equality clearly establishes that $\mu$ is $K$-finite and therefore $C_c(X, [0, \infty)) \subset L^1(\mu)$.
Next we will establish,
\[
I(f) = I_\mu(f) := \int_X f \, d\mu \tag{31.26}
\]
for all $f \in C_c(X)$. By the linearity, it suffices to verify Eq. (31.26) holds for $f \in C_c(X, [0, \infty))$. To do this we will use the "layer cake method" to slice $f$ into thin pieces. Explicitly, fix an $N \in \mathbb{N}$ and for $n \in \mathbb{N}$ let
\[
f_n := \min\Big(\max\Big(f - \frac{n-1}{N}, 0\Big), \frac{1}{N}\Big), \tag{31.27}
\]
see Figure 31.2. It should be clear from Figure 31.2 that $f = \sum_{n=1}^{\infty} f_n$ with the sum actually being a finite sum since $f_n \equiv 0$ for all $n$ sufficiently large. Let $K_0 := \operatorname{supp}(f)$ and $K_n := \{f \ge n/N\}$. Then (again see Figure 31.2) for all $n \in \mathbb{N}$,
\[
1_{K_n} \le N f_n \le 1_{K_{n-1}}
\]
which upon integrating on $\mu$ gives
\[
\mu(K_n) \le N I_\mu(f_n) \le \mu(K_{n-1}). \tag{31.28}
\]
Moreover, if $U$ is any open set containing $K_{n-1}$, then $N f_n \prec U$ and so by Eq. (31.25) and the definition of $\mu$, we have
\[
\mu(K_n) \le N I(f_n) \le \mu(U). \tag{31.29}
\]
From the outer regularity of $\mu$, it follows from Eq. (31.29) that
\[
\mu(K_n) \le N I(f_n) \le \mu(K_{n-1}). \tag{31.30}
\]
As a consequence of Eqs. (31.28) and (31.30), we have
Fig. 31.2. This sequence of figures shows how the function $f_n$ is constructed. The idea is to think of $f$ as describing a "cake" set on a "table," $X$. We then slice the cake into slabs, each of which is placed back on the table. Each of these slabs is described by one of the functions, $f_n$, as in Eq. (31.27).
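\[
N |I_\mu(f_n) - I(f_n)| \le \mu(K_{n-1}) - \mu(K_n) = \mu(K_{n-1} \setminus K_n).
\]
Therefore
\[
|I_\mu(f) - I(f)| = \Big|\sum_{n=1}^{\infty} \big(I_\mu(f_n) - I(f_n)\big)\Big| \le \sum_{n=1}^{\infty} |I_\mu(f_n) - I(f_n)| \le \frac{1}{N} \sum_{n=1}^{\infty} \mu(K_{n-1} \setminus K_n) = \frac{1}{N} \mu(K_0) \to 0 \text{ as } N \to \infty
\]
which establishes Eq. (31.26).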
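The slicing in Eq. (31.27) can be checked pointwise. The following is a minimal sketch, not part of the proof: the helper names `layer` and `slabs` are hypothetical, and the function value at a single point is represented by an exact `Fraction` so that the identity $f = \sum_n f_n$ can be tested without rounding.

```python
from fractions import Fraction


def layer(value, n, N):
    """n-th slab f_n = min(max(f - (n-1)/N, 0), 1/N) at a point where f = value."""
    return min(max(value - Fraction(n - 1, N), Fraction(0)), Fraction(1, N))


def slabs(value, N):
    """List the non-zero slabs at one point; only finitely many are non-zero."""
    out = []
    n = 1
    # f_n vanishes as soon as (n-1)/N >= value, so the loop terminates.
    while Fraction(n - 1, N) < value:
        out.append(layer(value, n, N))
        n += 1
    return out


value, N = Fraction(7, 3), 4
pieces = slabs(value, N)
# The slabs add back up to the original value.
total = sum(pieces, Fraction(0))
# N * f_n equals 1 exactly where f >= n/N, matching 1_{K_n} <= N f_n.
full_slabs = [n for n in range(1, len(pieces) + 1) if N * pieces[n - 1] == 1]
```

Here `total` recovers `value`, and `full_slabs` lists exactly those $n$ with $n/N \le f$ at the chosen point, which is the pointwise content of $1_{K_n} \le N f_n \le 1_{K_{n-1}}$.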
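It now only remains to show $\mu$ is inner regular on open sets to complete the proof. If $U \in \tau$ and $\mu(U) < \infty$, then for any $\varepsilon > 0$ there exists $f \prec U$ such that
\[
\mu(U) < I(f) + \varepsilon = \int_X f \, d\mu + \varepsilon \le \mu(\operatorname{supp}(f)) + \varepsilon.
\]
Hence if $K = \operatorname{supp}(f)$, we have $K \subset U$ and $\mu(U \setminus K) < \varepsilon$ and this shows $\mu$ is inner regular on open sets with finite measure. Finally if $U \in \tau$ and $\mu(U) = \infty$, there exists $f_n \prec U$ such that $I(f_n) \uparrow \infty$ as $n \to \infty$. Then, letting $K_n = \operatorname{supp}(f_n)$, we have $K_n \subset U$ and $\mu(K_n) \ge I(f_n)$ and therefore $\mu(K_n) \to \mu(U) = \infty$.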
31.6 More Motivation of Carathéodory's Construction Theorem 31.17
The next proposition helps to motivate this definition and Carathéodory's construction Theorem 31.17.
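Proposition 31.22. Suppose $\mathcal{E} = \mathcal{M}$ is a $\sigma$-algebra, $\rho = \mu : \mathcal{M} \to [0, \infty]$ is a measure and $\mu^*$ is defined as in Eq. (31.7). Then
1. For $A \subset X$,
\[
\mu^*(A) = \inf\{\mu(B) : B \in \mathcal{M} \text{ and } A \subset B\}.
\]
In particular, $\mu^* = \mu$ on $\mathcal{M}$.
2. $\mathcal{M} \subset \mathcal{M}(\mu^*)$, i.e. if $A \in \mathcal{M}$ and $E \subset X$ then
\[
\mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c). \tag{31.31}
\]
3. Assume further that $\mu$ is $\sigma$-finite on $\mathcal{M}$, then $\mathcal{M}(\mu^*) = \bar{\mathcal{M}} = \bar{\mathcal{M}}^{\mu}$ and $\mu^*|_{\mathcal{M}(\mu^*)} = \bar\mu$ where $(\bar{\mathcal{M}} = \bar{\mathcal{M}}^{\mu}, \bar\mu)$ is the completion of $(\mathcal{M}, \mu)$.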
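Proof. Item 1. If $E_i \in \mathcal{M}$ are such that $A \subset \bigcup_i E_i = B$ and $\tilde{E}_i = E_i \setminus (E_1 \cup \dots \cup E_{i-1})$, then
\[
\sum_i \mu(E_i) \ge \sum_i \mu(\tilde{E}_i) = \mu(B)
\]
so
\[
\mu^*(A) \le \sum_i \mu(\tilde{E}_i) = \mu(B) \le \sum_i \mu(E_i).
\]
Therefore,
\[
\mu^*(A) = \inf\{\mu(B) : B \in \mathcal{M} \text{ and } A \subset B\}.
\]
Item 2. If $\mu^*(E) = \infty$, Eq. (31.31) holds trivially. So assume that $\mu^*(E) < \infty$. Let $\varepsilon > 0$ be given and choose, by Item 1., $B \in \mathcal{M}$ such that $E \subset B$ and $\mu(B) \le \mu^*(E) + \varepsilon$. Then
\[
\mu^*(E) + \varepsilon \ge \mu(B) = \mu(B \cap A) + \mu(B \cap A^c) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c).
\]
Since $\varepsilon > 0$ is arbitrary we are done.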
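Item 3. Let us begin by assuming that $\mu(X) < \infty$. We have already seen that $\mathcal{M} \subset \mathcal{M}(\mu^*)$. Suppose that $A \in 2^X$ satisfies,
\[
\mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c) \quad \forall\, E \in 2^X. \tag{31.32}
\]
By Item 1., there exists $B_n \in \mathcal{M}$ such that $A \subset B_n$ and $\mu^*(B_n) \le \mu^*(A) + \frac{1}{n}$ for all $n \in \mathbb{N}$. Therefore $B = \bigcap_n B_n \supset A$ and $\mu(B) \le \mu^*(A) + \frac{1}{n}$ for all $n$, which implies that $\mu(B) \le \mu^*(A)$, which implies that $\mu(B) = \mu^*(A)$. Similarly there exists $C \in \mathcal{M}$ such that $A^c \subset C$ and $\mu^*(A^c) = \mu(C)$. Taking $E = X$ in Eq. (31.32) shows
\[
\mu(X) = \mu^*(A) + \mu^*(A^c) = \mu(B) + \mu(C)
\]
so
\[
\mu(C^c) = \mu(X) - \mu(C) = \mu(B).
\]
Thus letting $D = C^c$, we have
\[
D \subset A \subset B \quad \text{and} \quad \mu(D) = \mu^*(A) = \mu(B)
\]
so $\mu(B \setminus D) = 0$ and hence
\[
A = D \cup [(B \setminus D) \cap A]
\]
where $D \in \mathcal{M}$ and $(B \setminus D) \cap A \in \mathcal{N}$, showing that $A \in \bar{\mathcal{M}}$ and $\mu^*(A) = \bar\mu(A)$.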
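Now if $\mu$ is $\sigma$-finite, choose $X_n \in \mathcal{M}$ such that $\mu(X_n) < \infty$ and $X_n \uparrow X$. Given $A \in \mathcal{M}(\mu^*)$ set $A_n = X_n \cap A$. By assumption,
\[
\mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c) \quad \forall\, E \in 2^X.
\]
Replace $E$ by $X_n$ to learn,
\[
\mu^*(X_n) = \mu^*(A_n) + \mu^*(X_n \setminus A) = \mu^*(A_n) + \mu^*(X_n \setminus A_n).
\]
The same argument as above produces sets $D_n \subset A_n \subset B_n$ such that $\mu(D_n) = \mu^*(A_n) = \mu(B_n)$. Hence $A_n = D_n \cup N_n$ where $N_n := (B_n \setminus D_n) \cap A_n \in \mathcal{N}$. So we learn that
\[
A = D \cup N := \Big(\bigcup_n D_n\Big) \cup \Big(\bigcup_n N_n\Big) \in \mathcal{M} \vee \mathcal{N} = \bar{\mathcal{M}}.
\]
We also see that $\mu^*(A) = \mu(D)$ since $D \subset A \subset D \cup F$ where $F \in \mathcal{M}$ is such that $N \subset F$ and $\mu(F) = 0$, and
\[
\mu(D) = \mu^*(D) \le \mu^*(A) \le \mu(D \cup F) = \mu(D).
\]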
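Item 3 of Proposition 31.22 can be checked by brute force on a finite set. The sketch below is illustrative only: the set `X`, the $\sigma$-algebra `M` (generated by the partition $\{0,1\}, \{2,3\}$) and the measure values are assumptions chosen for the example, and the helper names are hypothetical. It computes $\mu^*$ via Item 1, collects the sets satisfying the Carathéodory condition, and compares them with the completion $\mathcal{M} \vee \mathcal{N}$.

```python
from itertools import chain, combinations

X = frozenset({0, 1, 2, 3})
# Assumed example: sigma-algebra generated by {0,1} | {2,3}, with mu({2,3}) = 0.
M = {frozenset(): 0.0, frozenset({0, 1}): 1.0, frozenset({2, 3}): 0.0, X: 1.0}


def subsets(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    combos = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in combos]


def outer(A):
    """mu*(A) = inf{mu(B) : B in M and A subset of B} (Item 1)."""
    return min(mu for B, mu in M.items() if A <= B)


def caratheodory_measurable(A):
    """Check mu*(E) = mu*(E & A) + mu*(E - A) for every E subset of X."""
    return all(outer(E) == outer(E & A) + outer(E - A) for E in subsets(X))


measurable = {A for A in subsets(X) if caratheodory_measurable(A)}
# Null sets: subsets of measure-zero members of M, i.e. sets with mu* = 0.
null_sets = {N for N in subsets(X) if outer(N) == 0.0}
completion = {D | N for D in M for N in null_sets}
```

In this example `measurable` and `completion` coincide and contain eight sets, while a set such as $\{0\}$, which splits the atom $\{0,1\}$, fails the Carathéodory condition.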
32 The Daniell-Stone Construction of Integration and Measures
Now that we have developed integration theory relative to a measure on a $\sigma$-algebra, it is time to show how to construct the measures that we have been using. This is a bit technical because there tends to be no "explicit" description of the general element of the typical $\sigma$-algebras. On the other hand, we do know how to explicitly describe algebras which are generated by some class of sets $\mathcal{E} \subset 2^X$. Therefore, we might try to define measures on $\sigma(\mathcal{E})$ by their restrictions to $\mathcal{A}(\mathcal{E})$. Theorem 19.55 or Theorem 33.6 shows this is a plausible method.
So the strategy of this section is as follows: 1) construct finitely additive measure on an algebra, 2) construct "integrals" associated to such finitely additive measures, 3) extend these integrals (Daniell's method) when possible to a larger class of functions, 4) construct a measure from the extended integral (Daniell-Stone construction theorem).
In this chapter, $X$ will be a given set and we will be dealing with certain spaces of extended real valued functions $f : X \to \bar{\mathbb{R}}$ on $X$.
Notation 32.1 Given functions $f, g : X \to \bar{\mathbb{R}}$, let $f + g$ denote the collection of functions $h : X \to \bar{\mathbb{R}}$ such that $h(x) = f(x) + g(x)$ for all $x$ for which $f(x) + g(x)$ is well defined, i.e. not of the form $\infty - \infty$.
For example, if $X = \{1, 2, 3\}$ and $f(1) = \infty$, $f(2) = 2$ and $f(3) = 5$ and $g(1) = g(2) = -\infty$ and $g(3) = 4$, then $h \in f + g$ iff $h(2) = -\infty$ and $h(3) = 9$. The value $h(1)$ may be chosen freely. More generally if $a, b \in \mathbb{R}$ and $f, g : X \to \bar{\mathbb{R}}$ we will write $af + bg$ for the collection of functions $h : X \to \bar{\mathbb{R}}$ such that $h(x) = af(x) + bg(x)$ for those $x \in X$ where $af(x) + bg(x)$ is well defined, with the values of $h(x)$ at the remaining points being arbitrary. It will also be useful to have some explicit representatives for $af + bg$ which we define, for $\alpha \in \bar{\mathbb{R}}$, by
\[
(af + bg)_\alpha(x) =
\begin{cases}
af(x) + bg(x) & \text{when defined} \\
\alpha & \text{otherwise.}
\end{cases}
\tag{32.1}
\]
We will make use of this definition with $\alpha = 0$ and $\alpha = \infty$ below.
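The representative $(af + bg)_\alpha$ of Eq. (32.1) is easy to compute on a finite set. The sketch below is illustrative: the function name `representative`, the dictionary encoding of functions, and the convention $0 \cdot (\pm\infty) = 0$ used when a coefficient vanishes are all assumptions made for this example.

```python
import math

INF = math.inf


def representative(f, g, a=1.0, b=1.0, alpha=0.0):
    """Return (a f + b g)_alpha as a dict x -> value.

    f and g map the points of a finite set X to extended reals. Wherever
    a f(x) + b g(x) has the form inf - inf, the value alpha is used instead.
    """
    h = {}
    for x in f:
        # Assumed convention: 0 * (+-inf) = 0, avoiding Python's nan.
        u = 0.0 if a == 0 else a * f[x]
        v = 0.0 if b == 0 else b * g[x]
        if math.isinf(u) and math.isinf(v) and u != v:
            h[x] = alpha  # undefined sum: pick the prescribed value
        else:
            h[x] = u + v
    return h


# The example from the text: X = {1, 2, 3}.
f = {1: INF, 2: 2.0, 3: 5.0}
g = {1: -INF, 2: -INF, 3: 4.0}
h0 = representative(f, g, alpha=0.0)
h_inf = representative(f, g, alpha=INF)
```

Both `h0` and `h_inf` are members of $f + g$; they agree at the points 2 and 3 and differ only at the point 1, where the sum is undefined.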
Notation 32.2 Given a collection of extended real valued functions $\mathcal{C}$ on $X$, let $\mathcal{C}^+ := \{f \in \mathcal{C} : f \ge 0\}$ denote the subset of positive functions $f \in \mathcal{C}$.
Definition 32.3. A set, $L$, of extended real valued functions on $X$ is an extended vector space (or a vector space for short) if $L$ is closed under scalar multiplication and addition in the following sense: if $f, g \in L$ and $\lambda \in \mathbb{R}$ then $(f + \lambda g) \subset L$. A vector space $L$ is said to be an extended lattice (or a lattice for short) if it is also closed under the lattice operations;
$f \vee g = \max(f, g)$ and $f \wedge g = \min(f, g)$.
A linear functional $I$ on $L$ is a function $I : L \to \mathbb{R}$ such that
\[
I(f + \lambda g) = I(f) + \lambda I(g) \quad \text{for all } f, g \in L \text{ and } \lambda \in \mathbb{R}. \tag{32.2}
\]
A linear functional $I$ is positive if $I(f) \ge 0$ when $f \in L^+$.
Equation (32.2) is to be interpreted as $I(h) = I(f) + \lambda I(g)$ for all $h \in (f + \lambda g)$, and in particular $I$ is required to take the same value on all members of $(f + \lambda g)$.
Remark 32.4. Notice that an extended lattice $L$ is closed under the absolute value operation since $|f| = f \vee 0 - f \wedge 0 = f \vee (-f)$. Also if $I$ is positive on $L$ then $I(f) \le I(g)$ when $f, g \in L$ and $f \le g$. Indeed, $f \le g$ implies $(g - f)_0 \ge 0$, so
\[
0 = I(0) \le I((g - f)_0) = I(g) - I(f)
\]
and hence $I(f) \le I(g)$. If $L$ is a vector space of real-valued functions on $X$, then $L$ is a lattice iff $f^+ = f \vee 0 \in L$ for all $f \in L$. This is because
\[
|f| = f^+ + (-f)^+, \qquad f \vee g = \tfrac{1}{2}(f + g + |f - g|) \qquad \text{and} \qquad f \wedge g = \tfrac{1}{2}(f + g - |f - g|).
\]
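The three identities used in Remark 32.4 are pointwise statements about real numbers, so they can be spot-checked on a finite set. This is a small sketch only; the sample functions `f` and `g` and the helper names are hypothetical, and exact `Fraction` arithmetic is used so the comparisons involve no rounding.

```python
from fractions import Fraction


def join_via_abs(f, g):
    """f v g computed pointwise as (f + g + |f - g|) / 2."""
    return {x: (f[x] + g[x] + abs(f[x] - g[x])) / 2 for x in f}


def meet_via_abs(f, g):
    """f ^ g computed pointwise as (f + g - |f - g|) / 2."""
    return {x: (f[x] + g[x] - abs(f[x] - g[x])) / 2 for x in f}


def abs_via_parts(f):
    """|f| computed as f^+ + (-f)^+ with f^+ = max(f, 0)."""
    return {x: max(f[x], 0) + max(-f[x], 0) for x in f}


# Sample real-valued functions on a three-point set (illustrative values).
f = {x: Fraction(v) for x, v in {"a": 3, "b": -2, "c": 5}.items()}
g = {x: Fraction(v) for x, v in {"a": 1, "b": 4, "c": 5}.items()}

pointwise_max = {x: max(f[x], g[x]) for x in f}
pointwise_min = {x: min(f[x], g[x]) for x in f}
pointwise_abs = {x: abs(f[x]) for x in f}
```

Each identity expresses a lattice operation through addition, scalar multiplication and the positive part, which is why closure under $f \mapsto f^+$ is enough for a vector space of real-valued functions to be a lattice.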
In the remainder of this chapter we fix a sub-lattice, $\mathbb{S} \subset \ell^\infty(X, \mathbb{R})$ and a positive linear functional $I : \mathbb{S} \to \mathbb{R}$.
Definition 32.5 (Property (D)). A non-negative linear functional $I$ on $\mathbb{S}$ is said to be continuous under monotone limits if $I(f_n) \downarrow 0$ for all $\{f_n\}_{n=1}^{\infty} \subset \mathbb{S}^+$ satisfying (pointwise) $f_n \downarrow 0$. A positive linear functional on $\mathbb{S}$ satisfying property (D) is called a Daniell integral on $\mathbb{S}$. We will also write $\mathbb{S}$ as $D(I)$, the domain of $I$.
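Lemma 32.6. Let $I$ be a non-negative linear functional on a lattice $\mathbb{S}$. Then property (D) is equivalent to either of the following two properties:
D$_1$ If $\phi, \phi_n \in \mathbb{S}$ satisfy $\phi_n \le \phi_{n+1}$ for all $n$ and $\phi \le \lim_{n \to \infty} \phi_n$, then $I(\phi) \le \lim_{n \to \infty} I(\phi_n)$.
D$_2$ If $u_j \in \mathbb{S}^+$ and $\phi \in \mathbb{S}$ is such that $\phi \le \sum_{j=1}^{\infty} u_j$ then $I(\phi) \le \sum_{j=1}^{\infty} I(u_j)$.
Proof. (D) $\Rightarrow$ (D$_1$) Let $\phi, \phi_n \in \mathbb{S}$ be as in D$_1$. Then $\phi \wedge \phi_n \uparrow \phi$ and $\phi - (\phi \wedge \phi_n) \downarrow 0$ which implies
\[
I(\phi) - I(\phi \wedge \phi_n) = I(\phi - (\phi \wedge \phi_n)) \downarrow 0.
\]
Hence
\[
I(\phi) = \lim_{n \to \infty} I(\phi \wedge \phi_n) \le \lim_{n \to \infty} I(\phi_n).
\]
(D$_1$) $\Rightarrow$ (D$_2$) Apply (D$_1$) with $\phi_n = \sum_{j=1}^{n} u_j$.
(D$_2$) $\Rightarrow$ (D) Suppose $\phi_n \in \mathbb{S}$ with $\phi_n \downarrow 0$ and let $u_n = \phi_n - \phi_{n+1}$. Then $\sum_{n=1}^{N} u_n = \phi_1 - \phi_{N+1} \uparrow \phi_1$ and hence
\[
I(\phi_1) \le \sum_{n=1}^{\infty} I(u_n) = \lim_{N \to \infty} \sum_{n=1}^{N} I(u_n) = \lim_{N \to \infty} I(\phi_1 - \phi_{N+1}) = I(\phi_1) - \lim_{N \to \infty} I(\phi_{N+1})
\]
from which it follows that $\lim_{N \to \infty} I(\phi_{N+1}) \le 0$. Since $I(\phi_{N+1}) \ge 0$ for all $N$ we conclude that $\lim_{N \to \infty} I(\phi_{N+1}) = 0$.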
32.0.1 Examples of Daniell Integrals
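Proposition 32.7. Suppose that $(X, \tau)$ is a locally compact Hausdorff space and $I$ is a positive linear functional on $\mathbb{S} := C_c(X, \mathbb{R})$. Then for each compact subset $K \subset X$ there is a constant $C_K < \infty$ such that $|I(f)| \le C_K \|f\|_\infty$ for all $f \in C_c(X, \mathbb{R})$ with $\operatorname{supp}(f) \subset K$. Moreover, if $f_n \in C_c(X, [0, \infty))$ and $f_n \downarrow 0$ (pointwise) as $n \to \infty$, then $I(f_n) \downarrow 0$ as $n \to \infty$ and in particular $I$ is necessarily a Daniell integral on $\mathbb{S}$.
Proof. Let $f \in C_c(X, \mathbb{R})$ with $\operatorname{supp}(f) \subset K$. By Lemma 15.8 there exists $\psi_K \prec X$ such that $\psi_K = 1$ on $K$. Since $\|f\|_\infty \psi_K \pm f \ge 0$,
\[
0 \le I(\|f\|_\infty \psi_K \pm f) = \|f\|_\infty I(\psi_K) \pm I(f)
\]
from which it follows that $|I(f)| \le I(\psi_K) \|f\|_\infty$. So the first assertion holds with $C_K = I(\psi_K) < \infty$. Now suppose that $f_n \in C_c(X, [0, \infty))$ and $f_n \downarrow 0$ as $n \to \infty$. Let $K = \operatorname{supp}(f_1)$ and notice that $\operatorname{supp}(f_n) \subset K$ for all $n$. By Dini's Theorem (see Exercise 14.3), $\|f_n\|_\infty \downarrow 0$ as $n \to \infty$ and hence
\[
0 \le I(f_n) \le C_K \|f_n\|_\infty \downarrow 0 \text{ as } n \to \infty.
\]
For example if $X = \mathbb{R}$ and $F$ is an increasing function on $\mathbb{R}$, then $I(f) := \int_{\mathbb{R}} f \, dF$ is a Daniell integral on $C_c(\mathbb{R}, \mathbb{R})$, see Lemma 28.38. However it is not generally true in this case that $I(f_n) \downarrow 0$ for all $f_n \in \mathbb{S}$ ($\mathbb{S}$ is the collection of compactly supported step functions on $\mathbb{R}$) such that $f_n \downarrow 0$. The next example and proposition address this question.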
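Example 32.8. Suppose $F : \mathbb{R} \to \mathbb{R}$ is an increasing function which is not right continuous at $x_0 \in \mathbb{R}$. Then, letting $f_n = 1_{(x_0, x_0 + n^{-1}]} \in \mathbb{S}$, we have $f_n \downarrow 0$ as $n \to \infty$ but
\[
\int_{\mathbb{R}} f_n \, dF = F(x_0 + n^{-1}) - F(x_0) \to F(x_0+) - F(x_0) \ne 0.
\]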
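A concrete instance of Example 32.8 can be sketched as follows. The particular $F$ below (zero on $(-\infty, 0]$ and one on $(0, \infty)$, so $x_0 = 0$) is an assumption chosen for illustration, and the function names are hypothetical.

```python
def F(x):
    """Increasing, but not right continuous at x0 = 0: F(0) = 0 while F(0+) = 1."""
    return 0.0 if x <= 0 else 1.0


def f_n(n, x, x0=0.0):
    """Indicator function of the interval (x0, x0 + 1/n]."""
    return 1.0 if x0 < x <= x0 + 1.0 / n else 0.0


def integral_f_n(n, x0=0.0):
    """Integral of f_n with respect to dF, i.e. F(x0 + 1/n) - F(x0)."""
    return F(x0 + 1.0 / n) - F(x0)


# At any fixed point the indicators eventually vanish ...
values_at_point = [f_n(n, 0.01) for n in (1, 50, 101, 1000)]
# ... but the integrals do not tend to zero.
integrals = [integral_f_n(n) for n in (1, 10, 100, 1000)]
```

So $f_n \downarrow 0$ pointwise while $I(f_n) = 1$ for every $n$: property (D) fails for this $F$ on step functions, which is exactly the obstruction the premeasure hypothesis in the next proposition removes.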
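Proposition 32.9. Let $(\mathcal{A}, \mu, \mathbb{S} = \mathbb{S}_f(\mathcal{A}, \mu), I = I_\mu)$ be as in Definition 28.36. If $\mu$ is a premeasure (Definition 31.1) on $\mathcal{A}$, then
\[
\forall\, f_n \in \mathbb{S} \text{ with } f_n \downarrow 0 \implies I(f_n) \downarrow 0 \text{ as } n \to \infty. \tag{32.3}
\]
Hence $I$ is a Daniell integral on $\mathbb{S}$.
Proof. Let $\varepsilon > 0$ be given. Then
\[
f_n = f_n 1_{\{f_n > \varepsilon f_1\}} + f_n 1_{\{f_n \le \varepsilon f_1\}} \le f_1 1_{\{f_n > \varepsilon f_1\}} + \varepsilon f_1,
\]
\[
I(f_n) \le I\big(f_1 1_{\{f_n > \varepsilon f_1\}}\big) + \varepsilon I(f_1) = \sum_{a > 0} a \, \mu(f_1 = a, f_n > \varepsilon a) + \varepsilon I(f_1),
\]
and hence
\[
\limsup_{n \to \infty} I(f_n) \le \sum_{a > 0} a \limsup_{n \to \infty} \mu(f_1 = a, f_n > \varepsilon a) + \varepsilon I(f_1). \tag{32.4}
\]
Because, for $a > 0$,
\[
\mathcal{A} \ni \{f_1 = a, f_n > \varepsilon a\} \downarrow \emptyset \text{ as } n \to \infty
\]
and $\mu(f_1 = a) < \infty$, $\limsup_{n \to \infty} \mu(f_1 = a, f_n > \varepsilon a) = 0$. Combining this with Eq. (32.4) and making use of the fact that $\varepsilon > 0$ is arbitrary we learn $\limsup_{n \to \infty} I(f_n) = 0$.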
32.1 Extending a Daniell Integral
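In the remainder of this chapter we fix a lattice, $\mathbb{S}$, of bounded functions, $f : X \to \mathbb{R}$, and a positive linear functional $I : \mathbb{S} \to \mathbb{R}$ satisfying Property (D) of Definition 32.5.
Lemma 32.10. Suppose that $\{f_n\}, \{g_n\} \subset \mathbb{S}$.
1. If $f_n \uparrow f$ and $g_n \uparrow g$ with $f, g : X \to (-\infty, \infty]$ such that $f \le g$, then
\[
\lim_{n \to \infty} I(f_n) \le \lim_{n \to \infty} I(g_n). \tag{32.5}
\]
2. If $f_n \downarrow f$ and $g_n \downarrow g$ with $f, g : X \to [-\infty, \infty)$ such that $f \le g$, then Eq. (32.5) still holds.
In particular, in either case if $f = g$, then
\[
\lim_{n \to \infty} I(f_n) = \lim_{n \to \infty} I(g_n).
\]
Proof.
1. Fix $n \in \mathbb{N}$, then $g_k \wedge f_n \uparrow f_n$ as $k \to \infty$ and $g_k \wedge f_n \le g_k$ and hence
\[
I(f_n) = \lim_{k \to \infty} I(g_k \wedge f_n) \le \lim_{k \to \infty} I(g_k).
\]
Passing to the limit $n \to \infty$ in this equation proves Eq. (32.5).
2. Since $-f_n \uparrow (-f)$ and $-g_n \uparrow (-g)$ and $-g \le (-f)$, what we just proved shows
\[
-\lim_{n \to \infty} I(g_n) = \lim_{n \to \infty} I(-g_n) \le \lim_{n \to \infty} I(-f_n) = -\lim_{n \to \infty} I(f_n)
\]
which is equivalent to Eq. (32.5).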
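Definition 32.11. Let
\[
\mathbb{S}_\uparrow = \{f : X \to (-\infty, \infty] : \exists\, f_n \in \mathbb{S} \text{ such that } f_n \uparrow f\}
\]
and
\[
\mathbb{S}_\downarrow = \{f : X \to [-\infty, \infty) : \exists\, f_n \in \mathbb{S} \text{ such that } f_n \downarrow f\}.
\]
Because of Lemma 32.10, for $f \in \mathbb{S}_\uparrow$ and $g \in \mathbb{S}_\downarrow$ we may define
\[
I_\uparrow(f) = \lim_{n \to \infty} I(f_n) \text{ if } \mathbb{S} \ni f_n \uparrow f
\]
and
\[
I_\downarrow(g) = \lim_{n \to \infty} I(g_n) \text{ if } \mathbb{S} \ni g_n \downarrow g.
\]
If $f \in \mathbb{S}_\uparrow \cap \mathbb{S}_\downarrow$, then there exist $f_n, g_n \in \mathbb{S}$ such that $f_n \uparrow f$ and $g_n \downarrow f$. Hence $\mathbb{S} \ni (g_n - f_n) \downarrow 0$ and hence by the continuity property (D),
\[
I_\downarrow(f) - I_\uparrow(f) = \lim_{n \to \infty} [I(g_n) - I(f_n)] = \lim_{n \to \infty} I(g_n - f_n) = 0.
\]
Therefore $I_\downarrow = I_\uparrow$ on $\mathbb{S}_\uparrow \cap \mathbb{S}_\downarrow$.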
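Notation 32.12 Using the above comments we may now simply write $I(f)$ for $I_\uparrow(f)$ or $I_\downarrow(f)$ when $f \in \mathbb{S}_\uparrow$ or $f \in \mathbb{S}_\downarrow$. Henceforth we will view $I$ as a function on $\mathbb{S}_\uparrow \cup \mathbb{S}_\downarrow$.
Again because of Lemma 32.10, $I_\uparrow := I|_{\mathbb{S}_\uparrow}$ and $I_\downarrow := I|_{\mathbb{S}_\downarrow}$ are positive functionals; i.e. if $f \le g$ then $I(f) \le I(g)$.
Exercise 32.1. Show $\mathbb{S}_\downarrow = -\mathbb{S}_\uparrow$ and for $f \in \mathbb{S}_\downarrow \cup \mathbb{S}_\uparrow$ that $I(-f) = -I(f) \in \bar{\mathbb{R}}$.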
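Proposition 32.13. The set $\mathbb{S}_\uparrow$ and the extension of $I$ to $\mathbb{S}_\uparrow$ in Definition 32.11 satisfy:
1. (Monotonicity) $I(f) \le I(g)$ if $f, g \in \mathbb{S}_\uparrow$ with $f \le g$.
2. $\mathbb{S}_\uparrow$ is closed under the lattice operations, i.e. if $f, g \in \mathbb{S}_\uparrow$ then $f \wedge g \in \mathbb{S}_\uparrow$ and $f \vee g \in \mathbb{S}_\uparrow$. Moreover, if $I(f) < \infty$ and $I(g) < \infty$, then $I(f \vee g) < \infty$ and $I(f \wedge g) < \infty$.
3. (Positive Linearity) $I(f + \lambda g) = I(f) + \lambda I(g)$ for all $f, g \in \mathbb{S}_\uparrow$ and $\lambda \ge 0$.
4. $f \in \mathbb{S}_\uparrow^+$ iff there exist $\phi_n \in \mathbb{S}^+$ such that $f = \sum_{n=1}^{\infty} \phi_n$. Moreover, $I(f) = \sum_{m=1}^{\infty} I(\phi_m)$.
5. If $f_n \in \mathbb{S}_\uparrow^+$, then $\sum_{n=1}^{\infty} f_n =: f \in \mathbb{S}_\uparrow^+$ and $I(f) = \sum_{n=1}^{\infty} I(f_n)$.
Remark 32.14. Similar results hold for the extension of $I$ to $\mathbb{S}_\downarrow$ in Definition 32.11.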
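Proof.
1. Monotonicity follows directly from Lemma 32.10.
2. If $f_n, g_n \in \mathbb{S}$ are chosen so that $f_n \uparrow f$ and $g_n \uparrow g$, then $f_n \wedge g_n \uparrow f \wedge g$ and $f_n \vee g_n \uparrow f \vee g$. If we further assume that $I(g) < \infty$, then $f \wedge g \le g$ and hence $I(f \wedge g) \le I(g) < \infty$. In particular it follows that $I(f \wedge 0) \in (-\infty, 0]$ for all $f \in \mathbb{S}_\uparrow$. Combining this with the identity,
\[
I(f) = I(f \wedge 0 + f \vee 0) = I(f \wedge 0) + I(f \vee 0),
\]
shows $I(f) < \infty$ iff $I(f \vee 0) < \infty$. Since $f \vee g \le f \vee 0 + g \vee 0$, if both $I(f) < \infty$ and $I(g) < \infty$ then
\[
I(f \vee g) \le I(f \vee 0) + I(g \vee 0) < \infty.
\]
3. Let $f_n, g_n \in \mathbb{S}$ be chosen so that $f_n \uparrow f$ and $g_n \uparrow g$, then $(f_n + \lambda g_n) \uparrow (f + \lambda g)$ and therefore
\[
I(f + \lambda g) = \lim_{n \to \infty} I(f_n + \lambda g_n) = \lim_{n \to \infty} I(f_n) + \lambda \lim_{n \to \infty} I(g_n) = I(f) + \lambda I(g).
\]
4. Let $f \in \mathbb{S}_\uparrow^+$ and $f_n \in \mathbb{S}$ be chosen so that $f_n \uparrow f$. By replacing $f_n$ by $f_n \vee 0$ if necessary we may assume that $f_n \in \mathbb{S}^+$. Now set $\phi_n = f_n - f_{n-1} \in \mathbb{S}$ for $n = 1, 2, 3, \dots$ with the convention that $f_0 = 0 \in \mathbb{S}$. Then $\sum_{n=1}^{\infty} \phi_n = f$ and
\[
I(f) = \lim_{n \to \infty} I(f_n) = \lim_{n \to \infty} I\Big(\sum_{m=1}^{n} \phi_m\Big) = \lim_{n \to \infty} \sum_{m=1}^{n} I(\phi_m) = \sum_{m=1}^{\infty} I(\phi_m).
\]
Conversely, if $f = \sum_{m=1}^{\infty} \phi_m$ with $\phi_m \in \mathbb{S}^+$, then $f_n := \sum_{m=1}^{n} \phi_m \uparrow f$ as $n \to \infty$ and $f_n \in \mathbb{S}^+$.
5. Using Item 4., $f_n = \sum_{m=1}^{\infty} \phi_{n,m}$ with $\phi_{n,m} \in \mathbb{S}^+$. Thus
\[
f = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \phi_{n,m} = \lim_{N \to \infty} \sum_{m,n \le N} \phi_{n,m} \in \mathbb{S}_\uparrow
\]
and
\[
I(f) = \lim_{N \to \infty} I\Big(\sum_{m,n \le N} \phi_{n,m}\Big) = \lim_{N \to \infty} \sum_{m,n \le N} I(\phi_{n,m}) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} I(\phi_{n,m}) = \sum_{n=1}^{\infty} I(f_n).
\]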
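Definition 32.15. Given an arbitrary function $g : X \to \bar{\mathbb{R}}$, let
\[
\bar{I}(g) = \inf\{I(f) : g \le f \in \mathbb{S}_\uparrow\} \in \bar{\mathbb{R}} \quad \text{and} \quad \underline{I}(g) = \sup\{I(f) : \mathbb{S}_\downarrow \ni f \le g\} \in \bar{\mathbb{R}},
\]
with the convention that $\sup \emptyset = -\infty$ and $\inf \emptyset = +\infty$.
Definition 32.16. A function $g : X \to \bar{\mathbb{R}}$ is integrable if $\underline{I}(g) = \bar{I}(g) \in \mathbb{R}$. Let
\[
L^1(I) := \{g : X \to \bar{\mathbb{R}} : \underline{I}(g) = \bar{I}(g) \in \mathbb{R}\}
\]
and for $g \in L^1(I)$, let $\hat{I}(g)$ denote the common value $\underline{I}(g) = \bar{I}(g)$.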
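Remark 32.17. A function $g : X \to \bar{\mathbb{R}}$ is integrable iff for any $\varepsilon > 0$ there exist $f \in \mathbb{S}_\downarrow \cap L^1(I)$ and $h \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $f \le g \le h$ and $I(h - f) < \varepsilon$. (Equivalently, $f \in \mathbb{S}_\downarrow$ with $I(f) > -\infty$ and $h \in \mathbb{S}_\uparrow$ with $I(h) < \infty$.) Indeed if $g$ is integrable, then $\underline{I}(g) = \bar{I}(g)$ and there exist $f \in \mathbb{S}_\downarrow \cap L^1(I)$ and $h \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $f \le g \le h$ and $0 \le \underline{I}(g) - I(f) < \varepsilon/2$ and $0 \le I(h) - \bar{I}(g) < \varepsilon/2$. Adding these two inequalities implies $0 \le I(h) - I(f) = I(h - f) < \varepsilon$. Conversely, if there exist $f \in \mathbb{S}_\downarrow \cap L^1(I)$ and $h \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $f \le g \le h$ and $I(h - f) < \varepsilon$, then
\[
I(f) = \underline{I}(f) \le \underline{I}(g) \le \underline{I}(h) = I(h) \quad \text{and} \quad I(f) = \bar{I}(f) \le \bar{I}(g) \le \bar{I}(h) = I(h)
\]
and therefore
\[
0 \le \bar{I}(g) - \underline{I}(g) \le I(h) - I(f) = I(h - f) < \varepsilon.
\]
Since $\varepsilon > 0$ is arbitrary, this shows $\bar{I}(g) = \underline{I}(g)$.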
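Proposition 32.18. Given functions $f, g : X \to \bar{\mathbb{R}}$, then:
1. $\bar{I}(\lambda f) = \lambda \bar{I}(f)$ for all $\lambda \ge 0$.
2. (Chebyshev's Inequality.) Suppose $f : X \to [0, \infty]$ is a function and $\alpha \in (0, \infty)$, then $\bar{I}(1_{\{f \ge \alpha\}}) \le \frac{1}{\alpha} \bar{I}(f)$ and if $\bar{I}(f) < \infty$ then $\bar{I}(1_{\{f = \infty\}}) = 0$.
3. $\bar{I}$ is sub-additive, i.e. if $\bar{I}(f) + \bar{I}(g)$ is not of the form $\infty - \infty$ or $-\infty + \infty$, then
\[
\bar{I}(f + g) \le \bar{I}(f) + \bar{I}(g). \tag{32.6}
\]
This inequality is to be interpreted to mean,
\[
\bar{I}(h) \le \bar{I}(f) + \bar{I}(g) \text{ for all } h \in (f + g).
\]
4. $\underline{I}(-g) = -\bar{I}(g)$.
5. $\underline{I}(g) \le \bar{I}(g)$.
6. If $f \le g$ then $\bar{I}(f) \le \bar{I}(g)$ and $\underline{I}(f) \le \underline{I}(g)$.
7. If $g \in \mathbb{S}_\uparrow$ and $I(g) < \infty$ or $g \in \mathbb{S}_\downarrow$ and $I(g) > -\infty$ then $\underline{I}(g) = \bar{I}(g) = I(g)$.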
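Proof.
1. Suppose that $\lambda > 0$ (the $\lambda = 0$ case being trivial), then
\[
\bar{I}(\lambda f) = \inf\{I(h) : \lambda f \le h \in \mathbb{S}_\uparrow\} = \inf\{I(h) : f \le \lambda^{-1} h \in \mathbb{S}_\uparrow\} = \inf\{I(\lambda g) : f \le g \in \mathbb{S}_\uparrow\} = \lambda \inf\{I(g) : f \le g \in \mathbb{S}_\uparrow\} = \lambda \bar{I}(f).
\]
2. For $\alpha \in (0, \infty)$, $\alpha 1_{\{f \ge \alpha\}} \le f$ and therefore,
\[
\alpha \bar{I}(1_{\{f \ge \alpha\}}) = \bar{I}(\alpha 1_{\{f \ge \alpha\}}) \le \bar{I}(f).
\]
Since $N 1_{\{f = \infty\}} \le f$ for all $N \in (0, \infty)$,
\[
N \bar{I}(1_{\{f = \infty\}}) = \bar{I}(N 1_{\{f = \infty\}}) \le \bar{I}(f).
\]
So if $\bar{I}(f) < \infty$, this inequality implies $\bar{I}(1_{\{f = \infty\}}) = 0$ because $N$ is arbitrary.
3. If $\bar{I}(f) + \bar{I}(g) = \infty$ the inequality is trivial so we may assume that $\bar{I}(f), \bar{I}(g) \in [-\infty, \infty)$. If $\bar{I}(f) + \bar{I}(g) = -\infty$ then we may assume, by interchanging $f$ and $g$ if necessary, that $\bar{I}(f) = -\infty$ and $\bar{I}(g) < \infty$. By definition of $\bar{I}$, there exist $f_n \in \mathbb{S}_\uparrow$ and $g_n \in \mathbb{S}_\uparrow$ such that $f \le f_n$ and $g \le g_n$ and $I(f_n) \downarrow -\infty$ and $I(g_n) \downarrow \bar{I}(g)$. Since $f + g \le f_n + g_n \in \mathbb{S}_\uparrow$ (i.e. $h \le f_n + g_n$ for all $h \in (f + g)$, which holds because $f_n, g_n > -\infty$) and
\[
I(f_n + g_n) = I(f_n) + I(g_n) \downarrow -\infty + \bar{I}(g) = -\infty,
\]
it follows that $\bar{I}(f + g) = -\infty$, i.e. $\bar{I}(h) = -\infty$ for all $h \in f + g$. Henceforth we may assume $\bar{I}(f), \bar{I}(g) \in \mathbb{R}$. Let $k \in (f + g)$ and $f \le h_1 \in \mathbb{S}_\uparrow$ and $g \le h_2 \in \mathbb{S}_\uparrow$. Then $k \le h_1 + h_2 \in \mathbb{S}_\uparrow$ because if (for example) $f(x) = \infty$ and $g(x) = -\infty$, then $h_1(x) = \infty$ and $h_2(x) > -\infty$ since $h_2 \in \mathbb{S}_\uparrow$. Thus $h_1(x) + h_2(x) = \infty \ge k(x)$ no matter the value of $k(x)$. It now follows from the definitions that $\bar{I}(k) \le I(h_1) + I(h_2)$ for all $f \le h_1 \in \mathbb{S}_\uparrow$ and $g \le h_2 \in \mathbb{S}_\uparrow$. Therefore,
\[
\bar{I}(k) \le \inf\{I(h_1) + I(h_2) : f \le h_1 \in \mathbb{S}_\uparrow \text{ and } g \le h_2 \in \mathbb{S}_\uparrow\} = \bar{I}(f) + \bar{I}(g)
\]
and since $k \in (f + g)$ is arbitrary we have proven Eq. (32.6).
4. From the definitions and Exercise 32.1,
\[
\underline{I}(-g) = \sup\{I(f) : \mathbb{S}_\downarrow \ni f \le -g\} = \sup\{I(f) : g \le -f \in \mathbb{S}_\uparrow\} = \sup\{I(-h) : g \le h \in \mathbb{S}_\uparrow\} = -\inf\{I(h) : g \le h \in \mathbb{S}_\uparrow\} = -\bar{I}(g).
\]
5. The assertion is trivially true if $\bar{I}(g) = \underline{I}(g) = \infty$ or $\bar{I}(g) = \underline{I}(g) = -\infty$. So we now assume that $\bar{I}(g)$ and $\underline{I}(g)$ are not both $\infty$ or both $-\infty$. Since $0 \in (g - g)$ and $\bar{I}(g - g) \le \bar{I}(g) + \bar{I}(-g)$ (by the sub-additivity in Item 3), Item 4 gives
\[
0 = \bar{I}(0) \le \bar{I}(g) + \bar{I}(-g) = \bar{I}(g) - \underline{I}(g)
\]
provided the right side is well defined, which it is by assumption. So again we deduce that $\underline{I}(g) \le \bar{I}(g)$.
6. If $f \le g$ then
\[
\bar{I}(f) = \inf\{I(h) : f \le h \in \mathbb{S}_\uparrow\} \le \inf\{I(h) : g \le h \in \mathbb{S}_\uparrow\} = \bar{I}(g)
\]
and
\[
\underline{I}(f) = \sup\{I(h) : \mathbb{S}_\downarrow \ni h \le f\} \le \sup\{I(h) : \mathbb{S}_\downarrow \ni h \le g\} = \underline{I}(g).
\]
7. Let $g \in \mathbb{S}_\uparrow$ with $I(g) < \infty$ and choose $g_n \in \mathbb{S}$ such that $g_n \uparrow g$. Then
\[
\bar{I}(g) \ge \underline{I}(g) \ge I(g_n) \to I(g) \text{ as } n \to \infty.
\]
Combining this with
\[
\bar{I}(g) = \inf\{I(f) : g \le f \in \mathbb{S}_\uparrow\} = I(g)
\]
shows
\[
\bar{I}(g) \ge \underline{I}(g) \ge I(g) = \bar{I}(g)
\]
and hence $\underline{I}(g) = I(g) = \bar{I}(g)$. If $g \in \mathbb{S}_\downarrow$ and $I(g) > -\infty$, then by what we have just proved,
\[
\underline{I}(-g) = I(-g) = \bar{I}(-g).
\]
This finishes the proof since $\underline{I}(-g) = -\bar{I}(g)$ and $I(-g) = -I(g)$.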
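Lemma 32.19 (Countable Sub-additivity of $\bar{I}$). Let $f_n : X \to [0, \infty]$ be a sequence of functions and $F := \sum_{n=1}^{\infty} f_n$. Then
\[
\bar{I}(F) = \bar{I}\Big(\sum_{n=1}^{\infty} f_n\Big) \le \sum_{n=1}^{\infty} \bar{I}(f_n). \tag{32.7}
\]
Proof. Suppose $\sum_{n=1}^{\infty} \bar{I}(f_n) < \infty$, for otherwise the result is trivial. Let $\varepsilon > 0$ be given and choose $g_n \in \mathbb{S}_\uparrow^+$ such that $f_n \le g_n$ and $I(g_n) = \bar{I}(f_n) + \varepsilon_n$ where $\sum_{n=1}^{\infty} \varepsilon_n \le \varepsilon$. (For example take $\varepsilon_n \le 2^{-n} \varepsilon$.) Then $\sum_{n=1}^{\infty} g_n =: G \in \mathbb{S}_\uparrow^+$, $F \le G$ and so
\[
\bar{I}(F) \le \bar{I}(G) = I(G) = \sum_{n=1}^{\infty} I(g_n) = \sum_{n=1}^{\infty} \big(\bar{I}(f_n) + \varepsilon_n\big) \le \sum_{n=1}^{\infty} \bar{I}(f_n) + \varepsilon.
\]
Since $\varepsilon > 0$ is arbitrary, the proof is complete.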
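The choice of the error budget in the proof above can be made explicit. As a short worked check (the specific choice $\varepsilon_n = 2^{-n}\varepsilon$ is the example suggested in the proof, written out here as an illustration), the geometric series gives a total error of exactly $\varepsilon$:

```latex
\[
\varepsilon_n := 2^{-n}\varepsilon
\quad\Longrightarrow\quad
\sum_{n=1}^{\infty}\varepsilon_n
  = \varepsilon\sum_{n=1}^{\infty}2^{-n}
  = \varepsilon\cdot\frac{1/2}{1-1/2}
  = \varepsilon .
\]
```

Any summable sequence of positive errors with total at most $\varepsilon$ would serve equally well; the geometric choice is merely the most convenient.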
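Proposition 32.20. The space $L^1(I)$ is an extended lattice and $\hat{I} : L^1(I) \to \mathbb{R}$ is linear in the sense of Definition 32.3.
Proof. Let us begin by showing that $L^1(I)$ is a vector space. Suppose that $g_1, g_2 \in L^1(I)$, and $g \in (g_1 + g_2)$. Given $\varepsilon > 0$ there exist $f_i \in \mathbb{S}_\downarrow \cap L^1(I)$ and $h_i \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $f_i \le g_i \le h_i$ and $I(h_i - f_i) < \varepsilon/2$. Let us now show
\[
f_1(x) + f_2(x) \le g(x) \le h_1(x) + h_2(x) \quad \forall\, x \in X. \tag{32.8}
\]
This is clear at points $x \in X$ where $g_1(x) + g_2(x)$ is well defined. The other case to consider is where $g_1(x) = \infty = -g_2(x)$, in which case $h_1(x) = \infty$ and $f_2(x) = -\infty$ while $h_2(x) > -\infty$ and $f_1(x) < \infty$ because $h_2 \in \mathbb{S}_\uparrow$ and $f_1 \in \mathbb{S}_\downarrow$. Therefore, $f_1(x) + f_2(x) = -\infty$ and $h_1(x) + h_2(x) = \infty$ so that Eq. (32.8) is valid no matter how $g(x)$ is chosen. Since $f_1 + f_2 \in \mathbb{S}_\downarrow \cap L^1(I)$, $h_1 + h_2 \in \mathbb{S}_\uparrow \cap L^1(I)$ and
\[
\hat{I}(g_i) \le I(f_i) + \varepsilon/2 \quad \text{and} \quad -\varepsilon/2 + I(h_i) \le \hat{I}(g_i),
\]
we find
\[
\hat{I}(g_1) + \hat{I}(g_2) - \varepsilon \le I(f_1) + I(f_2) = I(f_1 + f_2) \le \underline{I}(g) \le \bar{I}(g) \le I(h_1 + h_2) = I(h_1) + I(h_2) \le \hat{I}(g_1) + \hat{I}(g_2) + \varepsilon.
\]
Because $\varepsilon > 0$ is arbitrary, we have shown that $g \in L^1(I)$ and $\hat{I}(g_1) + \hat{I}(g_2) = \hat{I}(g)$, i.e. $\hat{I}(g_1 + g_2) = \hat{I}(g_1) + \hat{I}(g_2)$. It is a simple matter to show $\lambda g \in L^1(I)$ and $\hat{I}(\lambda g) = \lambda \hat{I}(g)$ for all $g \in L^1(I)$ and $\lambda \in \mathbb{R}$. For example if $\lambda = -1$ (the most interesting case), choose $f \in \mathbb{S}_\downarrow \cap L^1(I)$ and $h \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $f \le g \le h$ and $I(h - f) < \varepsilon$. Therefore,
\[
\mathbb{S}_\downarrow \cap L^1(I) \ni -h \le -g \le -f \in \mathbb{S}_\uparrow \cap L^1(I)
\]
with $I(-f - (-h)) = I(h - f) < \varepsilon$ and this shows that $-g \in L^1(I)$ and $\hat{I}(-g) = -\hat{I}(g)$. We have now shown that $L^1(I)$ is a vector space of extended real valued functions and $\hat{I} : L^1(I) \to \mathbb{R}$ is linear.
To show $L^1(I)$ is a lattice, let $g_1, g_2 \in L^1(I)$ and $f_i \in \mathbb{S}_\downarrow \cap L^1(I)$ and $h_i \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $f_i \le g_i \le h_i$ and $I(h_i - f_i) < \varepsilon/2$ as above. Then using Proposition 32.13 and Remark 32.14,
\[
\mathbb{S}_\downarrow \cap L^1(I) \ni f_1 \wedge f_2 \le g_1 \wedge g_2 \le h_1 \wedge h_2 \in \mathbb{S}_\uparrow \cap L^1(I).
\]
Moreover,
\[
0 \le h_1 \wedge h_2 - f_1 \wedge f_2 \le h_1 - f_1 + h_2 - f_2,
\]
because, for example, if $h_1 \wedge h_2 = h_1$ and $f_1 \wedge f_2 = f_2$ then
\[
h_1 \wedge h_2 - f_1 \wedge f_2 = h_1 - f_2 \le h_2 - f_2.
\]
Therefore,
\[
I(h_1 \wedge h_2 - f_1 \wedge f_2) \le I(h_1 - f_1 + h_2 - f_2) < \varepsilon
\]
and hence by Remark 32.17, $g_1 \wedge g_2 \in L^1(I)$. Similarly
\[
0 \le h_1 \vee h_2 - f_1 \vee f_2 \le h_1 - f_1 + h_2 - f_2,
\]
because, for example, if $h_1 \vee h_2 = h_1$ and $f_1 \vee f_2 = f_2$ then
\[
h_1 \vee h_2 - f_1 \vee f_2 = h_1 - f_2 \le h_1 - f_1.
\]
Therefore,
\[
I(h_1 \vee h_2 - f_1 \vee f_2) \le I(h_1 - f_1 + h_2 - f_2) < \varepsilon
\]
and hence by Remark 32.17, $g_1 \vee g_2 \in L^1(I)$.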
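Theorem 32.21 (Monotone convergence theorem). If $f_n \in L^1(I)$ and $f_n \uparrow f$, then $f \in L^1(I)$ iff $\lim_{n \to \infty} \hat{I}(f_n) = \sup_n \hat{I}(f_n) < \infty$, in which case $\hat{I}(f) = \lim_{n \to \infty} \hat{I}(f_n)$.
Proof. If $f \in L^1(I)$, then by monotonicity $\hat{I}(f_n) \le \hat{I}(f)$ for all $n$ and therefore $\lim_{n \to \infty} \hat{I}(f_n) \le \hat{I}(f) < \infty$. Conversely, suppose $\ell := \lim_{n \to \infty} \hat{I}(f_n) < \infty$ and let $g := \sum_{n=1}^{\infty} (f_{n+1} - f_n)_0$. The reader should check that $f \le (f_1 + g)_\infty \in (f_1 + g)$. So by Lemma 32.19,
\[
\bar{I}(f) \le \bar{I}((f_1 + g)_\infty) \le \bar{I}(f_1) + \bar{I}(g) \le \bar{I}(f_1) + \sum_{n=1}^{\infty} \bar{I}((f_{n+1} - f_n)_0) = \hat{I}(f_1) + \sum_{n=1}^{\infty} \hat{I}(f_{n+1} - f_n) = \hat{I}(f_1) + \sum_{n=1}^{\infty} \big[\hat{I}(f_{n+1}) - \hat{I}(f_n)\big] = \hat{I}(f_1) + \ell - \hat{I}(f_1) = \ell. \tag{32.9}
\]
Because $f_n \le f$, it follows that $\hat{I}(f_n) = \underline{I}(f_n) \le \underline{I}(f)$ which upon passing to the limit implies $\ell \le \underline{I}(f)$. This inequality and the one in Eq. (32.9) show $\bar{I}(f) \le \ell \le \underline{I}(f)$ and therefore, $f \in L^1(I)$ and $\hat{I}(f) = \ell = \lim_{n \to \infty} \hat{I}(f_n)$.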
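The telescoping step used in Eq. (32.9) can be written out explicitly. The following is a short worked derivation under the hypotheses of the proof, namely $f_n \in L^1(I)$ and $\ell = \lim_{n\to\infty}\hat I(f_n) < \infty$:

```latex
\begin{align*}
\sum_{n=1}^{N}\bigl[\hat I(f_{n+1})-\hat I(f_n)\bigr]
  &= \hat I(f_{N+1})-\hat I(f_1), \\
\sum_{n=1}^{\infty}\bigl[\hat I(f_{n+1})-\hat I(f_n)\bigr]
  &= \lim_{N\to\infty}\hat I(f_{N+1})-\hat I(f_1)
   = \ell-\hat I(f_1).
\end{align*}
```

Adding $\hat I(f_1)$ to both sides gives the value $\ell$ at the end of Eq. (32.9). The terms are non-negative because $f_n \uparrow$, so the partial sums increase and the limit exists in $[0, \infty)$.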
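Lemma 32.22 (Fatou's Lemma). Suppose $\{f_n\} \subset [L^1(I)]^+$, then $\inf_n f_n \in L^1(I)$. If $\liminf_{n \to \infty} \hat{I}(f_n) < \infty$, then $\liminf_{n \to \infty} f_n \in L^1(I)$ and in this case
\[
\hat{I}(\liminf_{n \to \infty} f_n) \le \liminf_{n \to \infty} \hat{I}(f_n).
\]
Proof. Let $g_k := f_1 \wedge \dots \wedge f_k \in L^1(I)$, then $g_k \downarrow g := \inf_n f_n$. Since $-g_k \uparrow -g$, $-g_k \in L^1(I)$ for all $k$ and $\hat{I}(-g_k) \le \hat{I}(0) = 0$, it follows from Theorem 32.21 that $-g \in L^1(I)$ and hence so is $\inf_n f_n = g \in L^1(I)$. By what we have just proved, $u_k := \inf_{n \ge k} f_n \in L^1(I)$ for all $k$. Notice that $u_k \uparrow \liminf_{n \to \infty} f_n$, and by monotonicity that $\hat{I}(u_k) \le \hat{I}(f_k)$ for all $k$. Therefore,
\[
\lim_{k \to \infty} \hat{I}(u_k) = \liminf_{k \to \infty} \hat{I}(u_k) \le \liminf_{k \to \infty} \hat{I}(f_k) < \infty
\]
and by the monotone convergence Theorem 32.21, $\liminf_{n \to \infty} f_n = \lim_{k \to \infty} u_k \in L^1(I)$ and
\[
\hat{I}(\liminf_{n \to \infty} f_n) = \lim_{k \to \infty} \hat{I}(u_k) \le \liminf_{n \to \infty} \hat{I}(f_n).
\]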
Before stating the dominated convergence theorem, it is helpful to remove
some of the annoyances of dealing with extended real valued functions. As we
have done when studying integrals associated to a measure, we can do this by
modifying integrable functions by a null function.
Definition 32.23. A function $n : X \to \bar{\mathbb{R}}$ is a null function if $\bar{I}(|n|) = 0$. A subset $E \subset X$ is said to be a null set if $1_E$ is a null function. Given two functions $f, g : X \to \bar{\mathbb{R}}$ we will write $f = g$ a.e. if $\{f \ne g\}$ is a null set.
Here are some basic properties of null functions and null sets.
Proposition 32.24. Suppose that $n : X \to \bar{\mathbb{R}}$ is a null function and $f : X \to \bar{\mathbb{R}}$ is an arbitrary function. Then
1. $n \in L^1(I)$ and $\hat{I}(n) = 0$.
2. The function $n \cdot f$ is a null function.
3. The set $\{x \in X : n(x) \ne 0\}$ is a null set.
4. If $E$ is a null set and $f \in L^1(I)$, then $1_{E^c} f \in L^1(I)$ and $\hat{I}(f) = \hat{I}(1_{E^c} f)$.
5. If $g \in L^1(I)$ and $f = g$ a.e. then $f \in L^1(I)$ and $\hat{I}(f) = \hat{I}(g)$.
6. If $f \in L^1(I)$, then $E := \{|f| = \infty\}$ is a null set.
Proof.
1. If $n$ is null, using $\pm n \le |n|$ we find $\bar{I}(\pm n) \le \bar{I}(|n|) = 0$, i.e. $\bar{I}(n) \le 0$ and $-\underline{I}(n) = \bar{I}(-n) \le 0$. Thus it follows that $\bar{I}(n) \le 0 \le \underline{I}(n)$ and therefore $n \in L^1(I)$ with $\hat{I}(n) = 0$.
2. Since $|n \cdot f| \le \infty \cdot |n|$, $\bar{I}(|n \cdot f|) \le \bar{I}(\infty \cdot |n|)$. For $k \in \mathbb{N}$, $k|n| \in L^1(I)$ and $\hat{I}(k|n|) = k \hat{I}(|n|) = 0$, so $k|n|$ is a null function. By the monotone convergence Theorem 32.21 and the fact $k|n| \uparrow \infty \cdot |n| \in L^1(I)$ as $k \uparrow \infty$, $\hat{I}(\infty \cdot |n|) = \lim_{k \to \infty} \hat{I}(k|n|) = 0$. Therefore $\infty \cdot |n|$ is a null function and hence so is $n \cdot f$.
3. Since $1_{\{n \ne 0\}} \le \infty \cdot 1_{\{n \ne 0\}} = \infty \cdot |n|$, $\bar{I}(1_{\{n \ne 0\}}) \le \bar{I}(\infty \cdot |n|) = 0$ showing $\{n \ne 0\}$ is a null set.
4. Since $1_E f \in L^1(I)$ and $\hat{I}(1_E f) = 0$,
\[
f 1_{E^c} = (f - 1_E f)_0 \in (f - 1_E f) \subset L^1(I)
\]
and $\hat{I}(f 1_{E^c}) = \hat{I}(f) - \hat{I}(1_E f) = \hat{I}(f)$.
5. Letting $E$ be the null set $\{f \ne g\}$, then $1_{E^c} f = 1_{E^c} g \in L^1(I)$ and $1_E f$ is a null function and therefore, $f = 1_E f + 1_{E^c} f \in L^1(I)$ and
\[
\hat{I}(f) = \hat{I}(1_E f) + \hat{I}(f 1_{E^c}) = \hat{I}(1_{E^c} f) = \hat{I}(1_{E^c} g) = \hat{I}(g).
\]
6. By Proposition 32.20, $|f| \in L^1(I)$ and so by Chebyshev's inequality (Item 2 of Proposition 32.18), $\{|f| = \infty\}$ is a null set.
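Theorem 32.25 (Dominated Convergence Theorem). Suppose that $\{f_n : n \in \mathbb{N}\} \subset L^1(I)$ is such that $f := \lim f_n$ exists pointwise and there exists $g \in L^1(I)$ such that $|f_n| \le g$ for all $n$. Then $f \in L^1(I)$ and
\[
\lim_{n \to \infty} \hat{I}(f_n) = \hat{I}(\lim_{n \to \infty} f_n) = \hat{I}(f).
\]
Proof. By Proposition 32.24, the set $E := \{g = \infty\}$ is a null set and $\hat{I}(1_{E^c} f_n) = \hat{I}(f_n)$ and $\hat{I}(1_{E^c} g) = \hat{I}(g)$. Since
\[
\hat{I}(1_{E^c}(g \pm f_n)) \le 2 \hat{I}(1_{E^c} g) = 2 \hat{I}(g) < \infty,
\]
we may apply Fatou's Lemma 32.22 to find $1_{E^c}(g \pm f) \in L^1(I)$ and
\[
\hat{I}(1_{E^c}(g \pm f)) \le \liminf_{n \to \infty} \hat{I}(1_{E^c}(g \pm f_n)) = \liminf_{n \to \infty} \big\{\hat{I}(1_{E^c} g) \pm \hat{I}(1_{E^c} f_n)\big\} = \liminf_{n \to \infty} \big\{\hat{I}(g) \pm \hat{I}(f_n)\big\}.
\]
Since $f = 1_{E^c} f$ a.e. and $1_{E^c} f = \frac{1}{2} 1_{E^c}\big(g + f - (g - f)\big) \in L^1(I)$, Proposition 32.24 implies $f \in L^1(I)$. So the previous inequality may be written as
\[
\hat{I}(g) \pm \hat{I}(f) = \hat{I}(1_{E^c} g) \pm \hat{I}(1_{E^c} f) = \hat{I}(1_{E^c}(g \pm f)) \le \hat{I}(g) +
\begin{cases}
\liminf_{n \to \infty} \hat{I}(f_n) \\
-\limsup_{n \to \infty} \hat{I}(f_n),
\end{cases}
\]
wherein we have used $\liminf_{n \to \infty} (-a_n) = -\limsup_{n \to \infty} a_n$. These two inequalities imply $\limsup_{n \to \infty} \hat{I}(f_n) \le \hat{I}(f) \le \liminf_{n \to \infty} \hat{I}(f_n)$ which shows that $\lim_{n \to \infty} \hat{I}(f_n)$ exists and is equal to $\hat{I}(f)$.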
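32.2 The Structure of $L^1(I)$
Let $\mathbb{S}_{\uparrow\downarrow}(I)$ denote the collection of functions $f : X \to \bar{\mathbb{R}}$ for which there exist $f_n \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $f_n \downarrow f$ as $n \to \infty$ and $\lim_{n \to \infty} \hat{I}(f_n) > -\infty$. Applying the monotone convergence theorem to $f_1 - f_n$, it follows that $f_1 - f \in L^1(I)$ and hence $-f \in L^1(I)$ so that $\mathbb{S}_{\uparrow\downarrow}(I) \subset L^1(I)$.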
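Lemma 32.26. Let $f : X \to \bar{\mathbb{R}}$ be a function. If $\bar{I}(f) \in \mathbb{R}$, then there exists $g \in \mathbb{S}_{\uparrow\downarrow}(I)$ such that $f \le g$ and $\bar{I}(f) = \hat{I}(g)$. (Consequently, $n : X \to [0, \infty)$ is a positive null function iff there exists $g \in \mathbb{S}_{\uparrow\downarrow}(I)$ such that $g \ge n$ and $\hat{I}(g) = 0$.) Moreover, $f \in L^1(I)$ iff there exists $g \in \mathbb{S}_{\uparrow\downarrow}(I)$ such that $g \ge f$ and $f = g$ a.e.
Proof. By definition of $\bar{I}(f)$ we may choose a sequence of functions $g_k \in \mathbb{S}_\uparrow \cap L^1(I)$ such that $g_k \ge f$ and $\hat{I}(g_k) \downarrow \bar{I}(f)$. By replacing $g_k$ by $g_1 \wedge \dots \wedge g_k$ if necessary ($g_1 \wedge \dots \wedge g_k \in \mathbb{S}_\uparrow \cap L^1(I)$ by Proposition 32.13), we may assume that $g_k$ is a decreasing sequence. Then $\lim_{k \to \infty} g_k =: g \ge f$ and, since $\lim_{k \to \infty} \hat{I}(g_k) = \bar{I}(f) > -\infty$, $g \in \mathbb{S}_{\uparrow\downarrow}(I)$. By the monotone convergence theorem applied to $g_1 - g_k$,
\[
\hat{I}(g_1 - g) = \lim_{k \to \infty} \hat{I}(g_1 - g_k) = \hat{I}(g_1) - \bar{I}(f),
\]
so $\hat{I}(g) = \bar{I}(f)$. Now suppose that $f \in L^1(I)$, then $(g - f)_0 \ge 0$ and
\[
\hat{I}((g - f)_0) = \hat{I}(g) - \hat{I}(f) = \hat{I}(g) - \bar{I}(f) = 0.
\]
Therefore $(g - f)_0$ is a null function and hence so is $\infty \cdot (g - f)_0$. Because
\[
1_{\{f \ne g\}} = 1_{\{f < g\}} \le \infty \cdot (g - f)_0,
\]
$\{f \ne g\}$ is a null set, so if $f \in L^1(I)$ there exists $g \in \mathbb{S}_{\uparrow\downarrow}(I)$ such that $f = g$ a.e. The converse statement has already been proved in Proposition 32.24.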
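Proposition 32.27. Suppose that $I$ and $\mathbb{S}$ are as above and $J$ is another Daniell integral on a vector lattice $\mathbb{T}$ such that $\mathbb{S} \subset \mathbb{T}$ and $I = J|_{\mathbb{S}}$. (We abbreviate this by writing $I \subset J$.) Then $L^1(I) \subset L^1(J)$ and $\hat{I} = \hat{J}$ on $L^1(I)$, or in abbreviated form: if $I \subset J$ then $\hat{I} \subset \hat{J}$.
Proof. From the construction of the extensions, it follows that $\mathbb{S}_\uparrow \subset \mathbb{T}_\uparrow$ and that $I = J$ on $\mathbb{S}_\uparrow$. Similarly, it follows that $\mathbb{S}_{\uparrow\downarrow}(I) \subset \mathbb{T}_{\uparrow\downarrow}(J)$ and $\hat{I} = \hat{J}$ on $\mathbb{S}_{\uparrow\downarrow}(I)$. From Lemma 32.26 we learn, if $n \ge 0$ is an $I$-null function then there exists $g \in \mathbb{S}_{\uparrow\downarrow}(I) \subset \mathbb{T}_{\uparrow\downarrow}(J)$ such that $n \le g$ and $0 = I(g) = J(g)$. This shows that $n$ is also a $J$-null function and in particular every $I$-null set is a $J$-null set. Again by Lemma 32.26, if $f \in L^1(I)$ there exists $g \in \mathbb{S}_{\uparrow\downarrow}(I) \subset \mathbb{T}_{\uparrow\downarrow}(J)$ such that $\{f \ne g\}$ is an $I$-null set and hence a $J$-null set. So by Proposition 32.24, $f \in L^1(J)$ and $I(f) = I(g) = J(g) = J(f)$.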
32.3 Relationship to Measure Theory
Definition 32.28. A function $f : X \to [0,\infty]$ is said to be $I$-measurable (or just measurable) if $f \wedge g \in L^1(I)$ for all $g \in L^1(I)$.
Lemma 32.29. The set of non-negative measurable functions is closed under pairwise minimums and maximums and pointwise limits.
Proof. Suppose that $f, g : X \to [0,\infty]$ are measurable functions. The fact that $f \wedge g$ and $f \vee g$ are measurable (i.e. $(f \wedge g) \wedge h$ and $(f \vee g) \wedge h$ are in $L^1(I)$ for all $h \in L^1(I)$) follows from the identities
\[ (f \wedge g) \wedge h = f \wedge (g \wedge h) \quad\text{and}\quad (f \vee g) \wedge h = (f \wedge h) \vee (g \wedge h) \]
and the fact that $L^1(I)$ is a lattice. If $f_n : X \to [0,\infty]$ is a sequence of measurable functions such that $f = \lim_{n\to\infty} f_n$ exists pointwise, then for $h \in L^1(I)$, we have $h \wedge f_n \to h \wedge f$. By the dominated convergence theorem (using $|h \wedge f_n| \le |h|$) it follows that $h \wedge f \in L^1(I)$. Since $h \in L^1(I)$ is arbitrary we conclude that $f$ is measurable as well.
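The two lattice identities used in the proof of Lemma 32.29 can be checked pointwise. The following is a short verification sketch; the letters $a = f(x)$, $b = g(x)$, $c = h(x)$ are introduced here only for illustration.

```latex
% Pointwise check with a = f(x), b = g(x), c = h(x) (names chosen for this sketch).
\begin{align*}
(a \wedge b) \wedge c &= \min(a, b, c) = a \wedge (b \wedge c), \\
(a \vee b) \wedge c &=
\begin{cases}
c = (a \wedge c) \vee (b \wedge c) & \text{if } c \le a \vee b, \\
a \vee b = (a \wedge c) \vee (b \wedge c) & \text{if } c > a \vee b.
\end{cases}
\end{align*}
```

In the first case one of $a \wedge c$, $b \wedge c$ equals $c$ and the other is at most $c$; in the second case $a \wedge c = a$ and $b \wedge c = b$.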
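Lemma 32.30. A non-negative function $f$ on $X$ is measurable iff $\varphi \wedge f \in L^1(I)$ for all $\varphi \in S$.
Proof. Suppose $f : X \to [0,\infty]$ is a function such that $\varphi \wedge f \in L^1(I)$ for all $\varphi \in S$ and let $g \in S_\uparrow \cap L^1(I)$. Choose $\varphi_n \in S$ such that $\varphi_n \uparrow g$ as $n \to \infty$, then $\varphi_n \wedge f \in L^1(I)$ and by the monotone convergence Theorem 32.21, $\varphi_n \wedge f \uparrow g \wedge f \in L^1(I)$. Similarly, using the dominated convergence Theorem 32.25, it follows that $g \wedge f \in L^1(I)$ for all $g \in S_{\uparrow\downarrow}(I)$. Finally for any $h \in L^1(I)$, there exists $g \in S_{\uparrow\downarrow}(I)$ such that $h = g$ a.e. and hence $h \wedge f = g \wedge f$ a.e. and therefore by Proposition 32.24, $h \wedge f \in L^1(I)$. This completes the proof since the converse direction is trivial.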
Definition 32.31. A set $A \subset X$ is measurable if $1_A$ is measurable and $A$ is integrable if $1_A \in L^1(I)$. Let $\mathcal{R}$ denote the collection of measurable subsets of $X$.
Remark 32.32. Suppose that $f \ge 0$, then $f \in L^1(I)$ iff $f$ is measurable and $I^*(f) < \infty$. Indeed, if $f$ is measurable and $I^*(f) < \infty$, there exists $g \in S_\uparrow \cap L^1(I)$ such that $f \le g$. Since $f$ is measurable, $f = f \wedge g \in L^1(I)$. In particular if $A \in \mathcal{R}$, then $A$ is integrable iff $I^*(1_A) < \infty$.
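Lemma 32.33. The set $\mathcal{R}$ is a ring which is a $\sigma$-algebra if $1$ is measurable. (Notice that $1$ is measurable iff $1 \wedge \varphi \in L^1(I)$ for all $\varphi \in S$. This condition is clearly implied by assuming $1 \wedge \varphi \in S$ for all $\varphi \in S$. This will be the typical case in applications.)
Proof. Suppose that $A, B \in \mathcal{R}$, then $A \cap B$ and $A \cup B$ are in $\mathcal{R}$ by Lemma 32.29 because
\[ 1_{A \cap B} = 1_A \wedge 1_B \quad\text{and}\quad 1_{A \cup B} = 1_A \vee 1_B. \]
If $A_k \in \mathcal{R}$, then the identities,
\[ 1_{\cup_{k=1}^\infty A_k} = \lim_{n\to\infty} 1_{\cup_{k=1}^n A_k} \quad\text{and}\quad 1_{\cap_{k=1}^\infty A_k} = \lim_{n\to\infty} 1_{\cap_{k=1}^n A_k} \]
along with Lemma 32.29 show that $\cup_{k=1}^\infty A_k$ and $\cap_{k=1}^\infty A_k$ are in $\mathcal{R}$ as well. Also if $A, B \in \mathcal{R}$ and $g \in S$, then
\[ g \wedge 1_{A \setminus B} = g \wedge 1_A - g \wedge 1_{A \cap B} + g \wedge 0 \in L^1(I) \tag{32.10} \]
showing that $A \setminus B \in \mathcal{R}$ as well.$^{2}$ Thus we have shown that $\mathcal{R}$ is a ring. If $1 = 1_X$ is measurable it follows that $X \in \mathcal{R}$ and $\mathcal{R}$ becomes a $\sigma$-algebra.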
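Lemma 32.34 (Chebyshev's Inequality). Suppose that $1$ is measurable.
1. If $f \in [L^1(I)]^+$ then, for all $\alpha \in \mathbb{R}$, the set $\{f > \alpha\}$ is measurable. Moreover, if $\alpha > 0$ then $\{f > \alpha\}$ is integrable and $\hat I(1_{\{f > \alpha\}}) \le \alpha^{-1} \hat I(f)$.
2. $\sigma(S) \subset \mathcal{R}$.
Proof.
1. If $\alpha < 0$, $\{f > \alpha\} = X \in \mathcal{R}$ since $1$ is measurable. So now assume that $\alpha \ge 0$. If $\alpha = 0$ let $g = f \in L^1(I)$ and if $\alpha > 0$ let $g = \alpha^{-1} f - (\alpha^{-1} f) \wedge 1$. (Notice that $g$ is a difference of two $L^1(I)$ functions and hence in $L^1(I)$.) The function $g \in [L^1(I)]^+$ has been manufactured so that $\{g > 0\} = \{f > \alpha\}$. Now let $\varphi_n := (ng) \wedge 1 \in [L^1(I)]^+$, then $\varphi_n \uparrow 1_{\{f > \alpha\}}$ as $n \to \infty$ showing $1_{\{f > \alpha\}}$ is measurable and hence that $\{f > \alpha\}$ is measurable. Finally if $\alpha > 0$,
\[ 1_{\{f > \alpha\}} = 1_{\{f > \alpha\}} \wedge (\alpha^{-1} f) \in L^1(I) \]
showing that $\{f > \alpha\}$ is integrable and
\[ \hat I(1_{\{f > \alpha\}}) = \hat I\big(1_{\{f > \alpha\}} \wedge (\alpha^{-1} f)\big) \le \hat I(\alpha^{-1} f) = \alpha^{-1} \hat I(f). \]
2. Since $f \in S^+$ is measurable by (1) and $S = S^+ - S^+$, it follows that any $f \in S$ is measurable, i.e. $\sigma(S) \subset \mathcal{R}$.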
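As a sanity check on the construction in item 1 of the proof of Lemma 32.34, the following sketch evaluates $g = \alpha^{-1} f - (\alpha^{-1} f) \wedge 1$ and $\varphi_n = (ng) \wedge 1$ pointwise. The value $\alpha = 2$ and the sample values of $f(x)$ are illustrative choices, not taken from the text.

```latex
% Illustrative values only: alpha = 2 and three sample values of f(x).
\begin{align*}
f(x) = 5:   &\quad g(x) = \tfrac{5}{2} - \big(\tfrac{5}{2} \wedge 1\big) = \tfrac{3}{2} > 0,
             \quad \varphi_n(x) = \big(\tfrac{3n}{2}\big) \wedge 1 = 1 \text{ for all } n \ge 1, \\
f(x) = 1:   &\quad g(x) = \tfrac{1}{2} - \big(\tfrac{1}{2} \wedge 1\big) = 0,
             \quad \varphi_n(x) = 0 \text{ for all } n, \\
f(x) = 2.1: &\quad g(x) = 1.05 - 1 = 0.05,
             \quad \varphi_n(x) = (0.05\,n) \wedge 1 \uparrow 1 \ (\text{equal to } 1 \text{ once } n \ge 20).
\end{align*}
```

In general $g = (\alpha^{-1} f - 1) \vee 0$, which is positive exactly where $f > \alpha$, so $\varphi_n$ increases to $1_{\{f > \alpha\}}$.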
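$^{2}$ Indeed, for $x \in A \cap B$, $x \in A \setminus B$ and $x \in A^c$, Eq. (32.10) evaluated at $x$ states, respectively, that (writing $g$ for $g(x)$)
\begin{align*}
g \wedge 0 &= g \wedge 1 - g \wedge 1 + g \wedge 0, \\
g \wedge 1 &= g \wedge 1 - g \wedge 0 + g \wedge 0 \quad\text{and} \\
g \wedge 0 &= g \wedge 0 - g \wedge 0 + g \wedge 0,
\end{align*}
all of which are true.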
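Lemma 32.35. Let $1$ be measurable. Define $\mu_\pm : \mathcal{R} \to [0,\infty]$ by
\[ \mu_+(A) = I^*(1_A) \quad\text{and}\quad \mu_-(A) = I_*(1_A). \]
Then $\mu_\pm$ are measures on $\mathcal{R}$ such that $\mu_- \le \mu_+$ and $\mu_-(A) = \mu_+(A)$ whenever $\mu_+(A) < \infty$.
Notice by Remark 32.32 that
\[ \mu_+(A) = \begin{cases} \hat I(1_A) & \text{if } A \text{ is integrable} \\ \infty & \text{if } A \in \mathcal{R} \text{ but } A \text{ is not integrable.} \end{cases} \]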
Proof. Since $1_\emptyset = 0$, $\mu_\pm(\emptyset) = \hat I(0) = 0$ and if $A, B \in \mathcal{R}$, $A \subset B$ then $\mu_+(A) = I^*(1_A) \le I^*(1_B) = \mu_+(B)$ and similarly, $\mu_-(A) = I_*(1_A) \le I_*(1_B) = \mu_-(B)$. Hence $\mu_\pm$ are monotonic. By Remark 32.32 if $\mu_+(A) < \infty$ then $A$ is integrable so
\[ \mu_-(A) = I_*(1_A) = \hat I(1_A) = I^*(1_A) = \mu_+(A). \]
Now suppose that $\{E_j\}_{j=1}^\infty \subset \mathcal{R}$ is a sequence of pairwise disjoint sets and let $E := \cup_{j=1}^\infty E_j \in \mathcal{R}$. If $\mu_+(E_i) = \infty$ for some $i$ then by monotonicity $\mu_+(E) = \infty$ as well. If $\mu_+(E_j) < \infty$ for all $j$ then $f_n := \sum_{j=1}^n 1_{E_j} \in [L^1(I)]^+$ with $f_n \uparrow 1_E$. Therefore, by the monotone convergence theorem, $1_E$ is integrable iff
\[ \lim_{n\to\infty} \hat I(f_n) = \sum_{j=1}^\infty \mu_+(E_j) < \infty \]
in which case $1_E \in L^1(I)$ and $\lim_{n\to\infty} \hat I(f_n) = \hat I(1_E) = \mu_+(E)$. Thus we have shown that $\mu_+$ is a measure and $\mu_-(E) = \mu_+(E)$ whenever $\mu_+(E) < \infty$. The fact that $\mu_-$ is a measure will be shown in the course of the proof of Theorem 32.38.
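Example 32.36. Suppose $X$ is a set, $S = \{0\}$ is the trivial vector space and $I(0) = 0$. Then clearly $I$ is a Daniell integral,
\[ I^*(g) = \begin{cases} \infty & \text{if } g(x) > 0 \text{ for some } x \\ 0 & \text{if } g \le 0 \end{cases} \]
and similarly,
\[ I_*(g) = \begin{cases} -\infty & \text{if } g(x) < 0 \text{ for some } x \\ 0 & \text{if } g \ge 0. \end{cases} \]
Therefore, $L^1(I) = \{0\}$ and for any $A \subset X$ we have $1_A \wedge 0 = 0 \in S$ so that $\mathcal{R} = 2^X$. Since $1_A \notin L^1(I) = \{0\}$ unless $A = \emptyset$, the measure $\mu_+$ in Lemma 32.35 is given by $\mu_+(A) = \infty$ if $A \ne \emptyset$ and $\mu_+(\emptyset) = 0$, i.e. $\mu_+(A) = I^*(1_A)$, while $\mu_- \equiv 0$.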
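Lemma 32.37. For $A \in \mathcal{R}$, let
\[ \alpha(A) := \sup\{\mu_+(B) : B \in \mathcal{R},\ B \subset A \text{ and } \mu_+(B) < \infty\}, \]
then $\alpha$ is a measure on $\mathcal{R}$ such that $\alpha(A) = \mu_+(A)$ whenever $\mu_+(A) < \infty$. If $\nu$ is any measure on $\mathcal{R}$ such that $\nu(B) = \mu_+(B)$ when $\mu_+(B) < \infty$, then $\alpha \le \nu$. Moreover, $\alpha \le \mu_-$.
Proof. Clearly $\alpha(A) = \mu_+(A)$ whenever $\mu_+(A) < \infty$. Now let $A = \cup_{n=1}^\infty A_n$ with $\{A_n\}_{n=1}^\infty \subset \mathcal{R}$ being a collection of pairwise disjoint subsets. Let $B_n \subset A_n$ with $\mu_+(B_n) < \infty$, then $B^N := \cup_{n=1}^N B_n \subset A$ and $\mu_+(B^N) < \infty$ and hence
\[ \alpha(A) \ge \mu_+(B^N) = \sum_{n=1}^N \mu_+(B_n) \]
and since $B_n \subset A_n$ with $\mu_+(B_n) < \infty$ is arbitrary it follows that $\alpha(A) \ge \sum_{n=1}^N \alpha(A_n)$ and hence letting $N \to \infty$ implies $\alpha(A) \ge \sum_{n=1}^\infty \alpha(A_n)$. Conversely, if $B \subset A$ with $\mu_+(B) < \infty$, then $B \cap A_n \subset A_n$ and $\mu_+(B \cap A_n) < \infty$. Therefore,
\[ \mu_+(B) = \sum_{n=1}^\infty \mu_+(B \cap A_n) \le \sum_{n=1}^\infty \alpha(A_n) \]
for all such $B$ and hence $\alpha(A) \le \sum_{n=1}^\infty \alpha(A_n)$. Using the definition of $\alpha$ and the assumption that $\nu(B) = \mu_+(B)$ when $\mu_+(B) < \infty$,
\[ \alpha(A) = \sup\{\nu(B) : B \in \mathcal{R},\ B \subset A \text{ and } \mu_+(B) < \infty\} \le \nu(A), \]
showing $\alpha \le \nu$. Similarly,
\begin{align*}
\alpha(A) &= \sup\{\hat I(1_B) : B \in \mathcal{R},\ B \subset A \text{ and } \mu_+(B) < \infty\} \\
&= \sup\{I_*(1_B) : B \in \mathcal{R},\ B \subset A \text{ and } \mu_+(B) < \infty\} \le I_*(1_A) = \mu_-(A).
\end{align*}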
Theorem 32.38 (Stone). Suppose that $1$ is measurable and $\mu_+$ and $\mu_-$ are as defined in Lemma 32.35, then:
1. $L^1(I) = L^1(X, \mathcal{R}, \mu_+) = L^1(\mu_+)$ and for integrable $f \in L^1(\mu_+)$,
\[ \hat I(f) = \int_X f\,d\mu_+. \tag{32.11} \]
2. If $\nu$ is any measure on $\mathcal{R}$ such that $S \subset L^1(\nu)$ and
\[ \hat I(f) = \int_X f\,d\nu \text{ for all } f \in S \tag{32.12} \]
then $\mu_-(A) \le \nu(A) \le \mu_+(A)$ for all $A \in \mathcal{R}$ with $\mu_-(A) = \nu(A) = \mu_+(A)$ whenever $\mu_+(A) < \infty$.
3. Letting $\alpha$ be as defined in Lemma 32.37, $\mu_- = \alpha$ and hence $\mu_-$ is a measure. (So $\mu_+$ is the maximal and $\mu_-$ is the minimal measure for which Eq. (32.12) holds.)
4. Conversely if $\nu$ is any measure on $\sigma(S)$ such that $\nu(A) = \mu_+(A)$ when $A \in \sigma(S)$ and $\mu_+(A) < \infty$, then Eq. (32.12) is valid.
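Proof.
1. Suppose that $f \in [L^1(I)]^+$, then Lemma 32.34 implies that $f$ is $\mathcal{R}$-measurable. Given $n \in \mathbb{N}$, let
\[ \varphi_n := \sum_{k=1}^{2^{2n}} \frac{k}{2^n} 1_{\{\frac{k}{2^n} < f \le \frac{k+1}{2^n}\}} = 2^{-n} \sum_{k=1}^{2^{2n}} 1_{\{\frac{k}{2^n} < f\}}. \tag{32.13} \]
Then we know $\{\frac{k}{2^n} < f\} \in \mathcal{R}$ and that $1_{\{\frac{k}{2^n} < f\}} = 1_{\{\frac{k}{2^n} < f\}} \wedge \big(\frac{2^n}{k} f\big) \in L^1(I)$, i.e. $\mu_+\big(\frac{k}{2^n} < f\big) < \infty$. Therefore $\varphi_n \in [L^1(I)]^+$ and $\varphi_n \uparrow f$. Suppose that $\nu$ is any measure such that $\nu(A) = \mu_+(A)$ when $\mu_+(A) < \infty$, then by the monotone convergence theorems for $\hat I$ and the Lebesgue integral,
\begin{align*}
\hat I(f) &= \lim_{n\to\infty} \hat I(\varphi_n) = \lim_{n\to\infty} 2^{-n} \sum_{k=1}^{2^{2n}} \hat I\big(1_{\{\frac{k}{2^n} < f\}}\big) = \lim_{n\to\infty} 2^{-n} \sum_{k=1}^{2^{2n}} \mu_+\big(\tfrac{k}{2^n} < f\big) \\
&= \lim_{n\to\infty} 2^{-n} \sum_{k=1}^{2^{2n}} \nu\big(\tfrac{k}{2^n} < f\big) = \lim_{n\to\infty} \int_X \varphi_n\,d\nu = \int_X f\,d\nu. \tag{32.14}
\end{align*}
This shows that $f \in [L^1(\nu)]^+$ and that $\hat I(f) = \int_X f\,d\nu$. Since every $f \in L^1(I)$ is of the form $f = f^+ - f^-$ with $f^\pm \in [L^1(I)]^+$, it follows that $L^1(I) \subset L^1(\mu_+) \subset L^1(\nu) \subset L^1(\alpha)$ and Eq. (32.12) holds for all $f \in L^1(I)$.
Conversely suppose that $f \in [L^1(\mu_+)]^+$. Define $\varphi_n$ as in Eq. (32.13). Chebyshev's inequality implies that $\mu_+\big(\frac{k}{2^n} < f\big) < \infty$ and hence $\{\frac{k}{2^n} < f\}$ is $I$-integrable. Again by the monotone convergence theorem for Lebesgue integrals and the computations in Eq. (32.14),
\[ \infty > \int_X f\,d\mu_+ = \lim_{n\to\infty} \hat I(\varphi_n) \]
and therefore by the monotone convergence theorem for $\hat I$, $f \in L^1(I)$ and
\[ \int_X f\,d\mu_+ = \lim_{n\to\infty} \hat I(\varphi_n) = \hat I(f). \]
2. Suppose that $\nu$ is any measure such that Eq. (32.12) holds. Then by the monotone convergence theorem,
\[ \hat I(f) = \int_X f\,d\nu \text{ for all } f \in S_\uparrow \cup S_\downarrow. \]
Let $A \in \mathcal{R}$ and assume that $\mu_+(A) < \infty$, i.e. $1_A \in L^1(I)$. Then there exists $f \in S_\uparrow \cap L^1(I)$ such that $1_A \le f$ and integrating this inequality relative to $\nu$ implies
\[ \nu(A) = \int_X 1_A\,d\nu \le \int_X f\,d\nu = \hat I(f). \]
Taking the infimum of this inequality over those $f \in S_\uparrow$ such that $1_A \le f$ implies $\nu(A) \le I^*(1_A) = \mu_+(A)$. If $\mu_+(A) = \infty$ this inequality holds trivially. Similarly, if $A \in \mathcal{R}$ and $f \in S_\downarrow$ is such that $0 \le f \le 1_A$, then
\[ \nu(A) = \int_X 1_A\,d\nu \ge \int_X f\,d\nu = \hat I(f). \]
Taking the supremum of this inequality over those $f \in S_\downarrow$ such that $0 \le f \le 1_A$ then implies $\nu(A) \ge \mu_-(A)$. So we have shown that $\mu_- \le \nu \le \mu_+$.
3. By Lemma 32.37, $\nu = \alpha$ is a measure as in (2) satisfying $\alpha \le \mu_-$ and therefore $\mu_- \le \alpha$ and hence we have shown that $\alpha = \mu_-$. This also shows that $\mu_-$ is a measure.
4. This can be done by the same type of argument used in the proof of (1).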
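To make the dyadic approximations of Eq. (32.13) concrete, here is a pointwise evaluation at a point where $f(x) = 0.7$. The value $0.7$ is an illustrative choice, and the computation uses the second expression in Eq. (32.13), which is the one that enters Eq. (32.14).

```latex
% Illustrative evaluation of phi_n(x) = 2^{-n} * #{ k in {1,...,2^{2n}} : k/2^n < f(x) } at f(x) = 0.7.
\begin{align*}
n = 2:&\quad \tfrac{k}{4} < 0.7 \iff k \in \{1, 2\}, &\varphi_2(x) &= \tfrac{2}{4} = 0.5, \\
n = 3:&\quad \tfrac{k}{8} < 0.7 \iff k \in \{1, \dots, 5\}, &\varphi_3(x) &= \tfrac{5}{8} = 0.625, \\
n = 4:&\quad \tfrac{k}{16} < 0.7 \iff k \in \{1, \dots, 11\}, &\varphi_4(x) &= \tfrac{11}{16} = 0.6875.
\end{align*}
```

The values increase towards $f(x) = 0.7$, in line with $\varphi_n \uparrow f$. The first expression in Eq. (32.13) gives the same numbers here, since for $n = 2$ the only $k$ with $\tfrac{k}{4} < 0.7 \le \tfrac{k+1}{4}$ is $k = 2$.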
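Proposition 32.39 (Uniqueness). Suppose that $1$ is measurable and there exists a function $\psi \in L^1(I)$ such that $\psi(x) > 0$ for all $x$. Then there is only one measure $\mu$ on $\sigma(S)$ such that
\[ \hat I(f) = \int_X f\,d\mu \text{ for all } f \in S. \]
Remark 32.40. The existence of a function $\psi \in L^1(I)$ such that $\psi(x) > 0$ for all $x$ is equivalent to the existence of a function $\psi \in S_\uparrow$ such that $\hat I(\psi) < \infty$ and $\psi(x) > 0$ for all $x \in X$. Indeed by Lemma 32.26, if $\psi \in L^1(I)$ there exists $\chi \in S_\uparrow \cap L^1(I)$ such that $\chi \ge \psi$.
Proof. As in Remark 32.40, we may assume $\psi \in S_\uparrow \cap L^1(I)$. The sets $X_n := \{\psi > 1/n\} \in \sigma(S) \subset \mathcal{R}$ satisfy $\mu(X_n) \le n \hat I(\psi) < \infty$. The proof is completed using Theorem 32.38 to conclude, for any $A \in \sigma(S)$, that
\[ \mu_+(A) = \lim_{n\to\infty} \mu_+(A \cap X_n) = \lim_{n\to\infty} \mu_-(A \cap X_n) = \mu_-(A). \]
Since $\mu_- \le \mu \le \mu_+ = \mu_-$, we see that $\mu = \mu_+ = \mu_-$.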
32.4 Extensions of premeasures to measures
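Theorem 32.41. Let $X$ be a set, $\mathcal{A}$ be a subalgebra of $2^X$ and $\mu_0$ be a premeasure on $\mathcal{A}$ which is $\sigma$-finite on $\mathcal{A}$, i.e. there exists $X_n \in \mathcal{A}$ such that $\mu_0(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. Then $\mu_0$ has a unique extension to a measure, $\mu$, on $\mathcal{M} := \sigma(\mathcal{A})$. Moreover, if $A \in \mathcal{M}$ and $\varepsilon > 0$ is given, there exists $B \in \mathcal{A}_\sigma$ such that $A \subset B$ and $\mu(B \setminus A) < \varepsilon$. In particular,
\begin{align*}
\mu(A) &= \inf\{\mu_0(B) : A \subset B \in \mathcal{A}_\sigma\} \tag{32.15} \\
&= \inf\Big\{\sum_{n=1}^\infty \mu_0(A_n) : A \subset \bigcup_{n=1}^\infty A_n \text{ with } A_n \in \mathcal{A}\Big\}. \tag{32.16}
\end{align*}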
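Proof. Let $(\mathcal{A}, \mu_0, I = I_{\mu_0})$ be as in Definition 28.36. By Proposition 32.9, $I$ is a Daniell integral on the lattice $S = S_f(\mathcal{A}, \mu_0)$. It is clear that $1 \wedge \varphi \in S$ for all $\varphi \in S$. Since $1_{X_n} \in S^+$ and $\sum_{n=1}^\infty 1_{X_n} > 0$ on $X$, by Remark 32.45 there exists $\chi \in S_\uparrow$ such that $I(\chi) < \infty$ and $\chi > 0$. So the hypotheses of Theorem 32.44 hold and hence there exists a unique measure $\mu$ on $\mathcal{M}$ such that $I(f) = \int_X f\,d\mu$ for all $f \in S$. Taking $f = 1_A$ with $A \in \mathcal{A}$ and $\mu_0(A) < \infty$ shows $\mu(A) = \mu_0(A)$. For general $A \in \mathcal{A}$, we have
\[ \mu(A) = \lim_{n\to\infty} \mu(A \cap X_n) = \lim_{n\to\infty} \mu_0(A \cap X_n) = \mu_0(A). \]
The fact that $\mu$ is the only extension of $\mu_0$ to $\mathcal{M}$ follows from Theorem 33.6 or Theorem 19.55. It can also be proved using Theorem 32.44. Indeed, if $\nu$ is another measure on $\mathcal{M}$ such that $\nu = \mu$ on $\mathcal{A}$, then $I_\nu = I$ on $S$. Therefore by the uniqueness assertion in Theorem 32.44, $\mu = \nu$ on $\mathcal{M}$. By Eq. (32.20), for $A \in \mathcal{M}$,
\begin{align*}
\mu(A) = I^*(1_A) &= \inf\{I(f) : f \in S_\uparrow \text{ with } 1_A \le f\} \\
&= \inf\Big\{\int_X f\,d\mu : f \in S_\uparrow \text{ with } 1_A \le f\Big\}.
\end{align*}
For the moment suppose $\mu(A) < \infty$ and $\varepsilon > 0$ is given. Choose $f \in S_\uparrow$ such that $1_A \le f$ and
\[ \int_X f\,d\mu = I(f) < \mu(A) + \varepsilon. \tag{32.17} \]
Let $f_n \in S$ be a sequence such that $f_n \uparrow f$ as $n \to \infty$ and for $\alpha \in (0,1)$ set
\[ B_\alpha := \{f > \alpha\} = \cup_{n=1}^\infty \{f_n > \alpha\} \in \mathcal{A}_\sigma. \]
Then $A \subset \{f \ge 1\} \subset B_\alpha$ and by Chebyshev's inequality,
\[ \mu(B_\alpha) \le \alpha^{-1} \int_X f\,d\mu = \alpha^{-1} I(f) \]
which combined with Eq. (32.17) implies $\mu(B_\alpha) < \mu(A) + \varepsilon$ for all $\alpha$ sufficiently close to 1. For such $\alpha$ we then have $A \subset B_\alpha \in \mathcal{A}_\sigma$ and $\mu(B_\alpha \setminus A) = \mu(B_\alpha) - \mu(A) < \varepsilon$. For general $A \in \mathcal{M}$, choose $X_n \uparrow X$ with $X_n \in \mathcal{A}$ and $\mu_0(X_n) < \infty$. Then there exists $B_n \in \mathcal{A}_\sigma$ such that $A \cap X_n \subset B_n$ and $\mu(B_n \setminus (A \cap X_n)) < \varepsilon 2^{-n}$. Define $B := \cup_{n=1}^\infty B_n \in \mathcal{A}_\sigma$. Then
\[ \mu(B \setminus A) = \mu\big(\cup_{n=1}^\infty (B_n \setminus A)\big) \le \sum_{n=1}^\infty \mu(B_n \setminus A) \le \sum_{n=1}^\infty \mu(B_n \setminus (A \cap X_n)) < \varepsilon. \]
Eq. (32.15) is an easy consequence of this result and the fact that $\mu(B) = \mu_0(B)$.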
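The phrase "for all $\alpha$ sufficiently close to 1" in the proof of Theorem 32.41 can be made explicit. The following sketch uses only Eq. (32.17) and Chebyshev's inequality; the threshold $\alpha_0$ is a name introduced here for illustration.

```latex
% alpha_0 is a hypothetical name for the threshold; it lies in [0,1) by Eq. (32.17).
\[
\alpha_0 := \frac{I(f)}{\mu(A) + \varepsilon} \in [0, 1), \qquad
\alpha \in (\alpha_0, 1) \ \Longrightarrow\
\mu(B_\alpha) \le \frac{I(f)}{\alpha} < \mu(A) + \varepsilon .
\]
% Case I(f) > 0: alpha > alpha_0 gives I(f)/alpha < I(f)/alpha_0 = mu(A) + epsilon.
% Case I(f) = 0: mu(B_alpha) = 0 < mu(A) + epsilon for every alpha in (0,1).
```

So any $\alpha$ strictly between $\alpha_0$ and 1 works, and $\alpha_0 < 1$ is exactly the strict inequality in Eq. (32.17).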
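Corollary 32.42 (Regularity of $\mu$). Let $\mathcal{A} \subset 2^X$ be an algebra of sets, $\mathcal{M} = \sigma(\mathcal{A})$ and $\mu : \mathcal{M} \to [0,\infty]$ be a measure on $\mathcal{M}$ which is $\sigma$-finite on $\mathcal{A}$. Then
1. For all $A \in \mathcal{M}$,
\[ \mu(A) = \inf\{\mu(B) : A \subset B \in \mathcal{A}_\sigma\}. \tag{32.18} \]
2. If $A \in \mathcal{M}$ and $\varepsilon > 0$ are given, there exists $B \in \mathcal{A}_\sigma$ such that $A \subset B$ and $\mu(B \setminus A) < \varepsilon$.
3. For all $A \in \mathcal{M}$ and $\varepsilon > 0$ there exists $B \in \mathcal{A}_\delta$ such that $B \subset A$ and $\mu(A \setminus B) < \varepsilon$.
4. For any $B \in \mathcal{M}$ there exists $A \in \mathcal{A}_{\delta\sigma}$ and $C \in \mathcal{A}_{\sigma\delta}$ such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$.
5. The linear space $S := S_f(\mathcal{A}, \mu)$ is dense in $L^p(\mu)$ for all $p \in [1,\infty)$, briefly put, $\overline{S_f(\mathcal{A}, \mu)}^{L^p(\mu)} = L^p(\mu)$.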
Proof. Items 1. and 2. follow by applying Theorem 32.41 to $\mu_0 = \mu|_{\mathcal{A}}$. Items 3. and 4. follow from Items 1. and 2. as in the proof of Corollary 33.10 above. Item 5. This has already been proved in Theorem 22.15 but we will give yet another proof here. When $p = 1$ and $g \in L^1(\mu; \mathbb{R})$, there exists, by Eq. (32.20), $h \in S_\uparrow$ such that $g \le h$ and $\|h - g\|_1 = \int_X (h - g)\,d\mu < \varepsilon$. Let $\{h_n\}_{n=1}^\infty \subset S$ be chosen so that $h_n \uparrow h$ as $n \to \infty$. Then by the dominated convergence theorem, $\|h_n - g\|_1 \to \|h - g\|_1 < \varepsilon$ as $n \to \infty$. Therefore for $n$ large we have $h_n \in S$ with $\|h_n - g\|_1 < \varepsilon$. Since $\varepsilon > 0$ is arbitrary this shows, $\overline{S_f(\mathcal{A}, \mu)}^{L^1(\mu)} = L^1(\mu)$. Now suppose $p > 1$, $g \in L^p(\mu; \mathbb{R})$ and $X_n \in \mathcal{A}$ are sets such that $X_n \uparrow X$ and $\mu(X_n) < \infty$. By the dominated convergence theorem, $1_{X_n} \cdot [(g \wedge n) \vee (-n)] \to g$ in $L^p(\mu)$ as $n \to \infty$, so it suffices to consider $g \in L^p(\mu; \mathbb{R})$ with $\{g \ne 0\} \subset X_n$ and $|g| \le n$ for some large $n \in \mathbb{N}$. By Hölder's inequality, such a $g$ is in $L^1(\mu)$. So if $\varepsilon > 0$, by the $p = 1$ case, we may find $h \in S$ such that $\|h - g\|_1 < \varepsilon$. By replacing $h$ by $(h \wedge n) \vee (-n) \in S$, we may assume $h$ is bounded by $n$ as well and hence
\begin{align*}
\|h - g\|_p^p &= \int_X |h - g|^p\,d\mu = \int_X |h - g|^{p-1} |h - g|\,d\mu \\
&\le (2n)^{p-1} \int_X |h - g|\,d\mu < (2n)^{p-1} \varepsilon.
\end{align*}
Since $\varepsilon > 0$ was arbitrary, this shows $S$ is dense in $L^p(\mu; \mathbb{R})$.
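Remark 32.43. If we drop the $\sigma$-finiteness assumption on $\mu_0$ we may lose the uniqueness assertion in Theorem 32.41. For example, let $X = \mathbb{R}$ with its Borel $\sigma$-algebra $\mathcal{B}_{\mathbb{R}}$, and let $\mathcal{A}$ be the algebra generated by $\mathcal{E} := \{(a, b] \cap \mathbb{R} : -\infty \le a < b \le \infty\}$. Recall $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{E})$. Let $D \subset \mathbb{R}$ be a countable dense set and define $\mu_D(A) := \#(D \cap A)$. Then $\mu_D(A) = \infty$ for all $A \in \mathcal{A}$ such that $A \ne \emptyset$. So if $D' \subset \mathbb{R}$ is another countable dense subset of $\mathbb{R}$, $\mu_{D'} = \mu_D$ on $\mathcal{A}$ while $\mu_D \ne \mu_{D'}$ on $\mathcal{B}_{\mathbb{R}}$. Also notice that $\mu_D$ is $\sigma$-finite on $\mathcal{B}_{\mathbb{R}}$ but not on $\mathcal{A}$.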
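A concrete instance of the non-uniqueness in Remark 32.43 may help. The sets $D = \mathbb{Q}$ and $D' = \mathbb{Q} + \sqrt{2}$ are choices made here for illustration; any two distinct countable dense sets would do.

```latex
% Illustrative choice: D = Q and D' = Q + sqrt(2); note 0 is in D but not in D'.
\[
\mu_D(\{0\}) = \#(\mathbb{Q} \cap \{0\}) = 1 \ne 0 = \#\big((\mathbb{Q} + \sqrt{2}) \cap \{0\}\big) = \mu_{D'}(\{0\}),
\]
\[
\text{while}\quad \mu_D\big((a, b]\big) = \mu_{D'}\big((a, b]\big) = \infty \quad \text{for all } a < b .
\]
```

Every non-empty element of the algebra is a finite disjoint union of such intervals, so the two measures agree on the algebra, yet they differ on the Borel set $\{0\}$.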
It is now possible to use Theorem 32.41 to give a proof of Theorem 19.8, see
subsection 31.4.1 below. However rather than do this now let us give another
application of Theorem 32.41 based on Proposition 32.9 and use the result to
prove Theorem 19.8.
32.4.1 A Useful Version
We are now in a position to state the main construction theorem. The theorem we state here is not as general as possible but it will suffice for our present purposes.
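Theorem 32.44 (Daniell-Stone). Let $S$ be a lattice of bounded functions on a set $X$ such that $1 \wedge \varphi \in S$ for all $\varphi \in S$ and let $I$ be a Daniell integral on $S$. Further assume there exists $\chi \in S_\uparrow$ such that $I(\chi) < \infty$ and $\chi(x) > 0$ for all $x \in X$. Then there exists a unique measure $\mu$ on $\mathcal{M} := \sigma(S)$ such that
\[ I(f) = \int_X f\,d\mu \text{ for all } f \in S. \tag{32.19} \]
Moreover, for all $g \in L^1(X, \mathcal{M}, \mu)$,
\[ \sup\{I(f) : S_\downarrow \ni f \le g\} = \int_X g\,d\mu = \inf\{I(h) : g \le h \in S_\uparrow\}. \tag{32.20} \]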
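Proof. Only a sketch of the proof will be given here. Full details may be found in Section 32 below. Existence. For $g : X \to \bar{\mathbb{R}}$, define
\begin{align*}
I^*(g) &:= \inf\{I(h) : g \le h \in S_\uparrow\}, \\
I_*(g) &:= \sup\{I(f) : S_\downarrow \ni f \le g\}
\end{align*}
and set
\[ L^1(I) := \{g : X \to \bar{\mathbb{R}} : I^*(g) = I_*(g) \in \mathbb{R}\}. \]
For $g \in L^1(I)$, let $\hat I(g) = I^*(g) = I_*(g)$. Then, as shown in Proposition 32.20, $L^1(I)$ is an "extended" vector space and $\hat I : L^1(I) \to \mathbb{R}$ is linear as defined in Definition 32.3 below. By Proposition 32.18, if $f \in S_\uparrow$ with $I(f) < \infty$ then $f \in L^1(I)$. Moreover, $\hat I$ obeys the monotone convergence theorem, Fatou's lemma, and the dominated convergence theorem, see Theorem 32.21, Lemma 32.22 and Theorem 32.25 respectively. Let
\[ \mathcal{R} := \{A \subset X : 1_A \wedge f \in L^1(I) \text{ for all } f \in S\} \]
and for $A \in \mathcal{R}$ set $\mu(A) := I^*(1_A)$. It can then be shown: $\mathcal{R}$ is a $\sigma$-algebra (Lemma 32.33) containing $\sigma(S)$ (Lemma 32.34), $\mu$ is a measure on $\mathcal{R}$ (Lemma 32.35), and Eq. (32.19) holds. In fact it is shown in Theorem 32.38 and Proposition 32.39 below that $L^1(X, \mathcal{M}, \mu) \subset L^1(I)$ and
\[ \hat I(g) = \int_X g\,d\mu \text{ for all } g \in L^1(X, \mathcal{M}, \mu). \]
The assertion in Eq. (32.20) is a consequence of the definition of $L^1(I)$ and $\hat I$ and this last equation. Uniqueness. Suppose that $\nu$ is another measure on $\sigma(S)$ such that
\[ I(f) = \int_X f\,d\nu \text{ for all } f \in S. \]
By the monotone convergence theorem and the definition of $I$ on $S_\uparrow$,
\[ I(f) = \int_X f\,d\nu \text{ for all } f \in S_\uparrow. \]
Therefore if $A \in \sigma(S) \subset \mathcal{R}$,
\begin{align*}
\mu(A) = I^*(1_A) &= \inf\{I(h) : 1_A \le h \in S_\uparrow\} \\
&= \inf\Big\{\int_X h\,d\nu : 1_A \le h \in S_\uparrow\Big\} \ge \int_X 1_A\,d\nu = \nu(A)
\end{align*}
which shows $\nu \le \mu$. If $A \in \sigma(S) \subset \mathcal{R}$ with $\mu(A) < \infty$, then, by Remark 32.32 below, $1_A \in L^1(I)$ and therefore
\begin{align*}
\mu(A) = I^*(1_A) = \hat I(1_A) = I_*(1_A) &= \sup\{I(f) : S_\downarrow \ni f \le 1_A\} \\
&= \sup\Big\{\int_X f\,d\nu : S_\downarrow \ni f \le 1_A\Big\} \le \nu(A).
\end{align*}
Hence $\nu(A) \le \mu(A)$ for all $A \in \sigma(S)$ and $\nu(A) = \mu(A)$ when $\mu(A) < \infty$. To prove $\nu(A) = \mu(A)$ for all $A \in \sigma(S)$, let $X_n := \{\chi \ge 1/n\} \in \sigma(S)$. Since $1_{X_n} \le n\chi$,
\[ \mu(X_n) = \int_X 1_{X_n}\,d\mu \le \int_X n\chi\,d\mu = n I(\chi) < \infty. \]
Since $\chi > 0$ on $X$, $X_n \uparrow X$ and therefore by continuity of $\nu$ and $\mu$,
\[ \nu(A) = \lim_{n\to\infty} \nu(A \cap X_n) = \lim_{n\to\infty} \mu(A \cap X_n) = \mu(A) \]
for all $A \in \sigma(S)$.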
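Remark 32.45. To check the hypothesis in Theorem 32.44 that there exists $\chi \in S_\uparrow$ such that $I(\chi) < \infty$ and $\chi(x) > 0$ for all $x \in X$, it suffices to find $\varphi_n \in S^+$ such that $\sum_{n=1}^\infty \varphi_n > 0$ on $X$. To see this let $M_n := \max(\|\varphi_n\|_\infty, I(\varphi_n), 1)$ and define $\chi := \sum_{n=1}^\infty \frac{1}{M_n 2^n} \varphi_n$, then $\chi \in S_\uparrow$, $0 < \chi \le 1$ and $I(\chi) \le 1 < \infty$.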

32.5 Riesz Representation Theorem
Definition 32.46. Given a second countable locally compact Hausdorff space $(X, \tau)$, let $\mathbb{M}_+$ denote the collection of positive measures, $\mu$, on $\mathcal{B}_X := \sigma(\tau)$ with the property that $\mu(K) < \infty$ for all compact subsets $K \subset X$. Such a measure $\mu$ will be called a Radon measure on $X$. For $\mu \in \mathbb{M}_+$ and $f \in C_c(X, \mathbb{R})$ let $I_\mu(f) := \int_X f\,d\mu$.
Theorem 32.47 (Riesz Representation Theorem). Let $(X, \tau)$ be a second countable$^{3}$ locally compact Hausdorff space. Then the map $\mu \to I_\mu$ taking $\mathbb{M}_+$ to positive linear functionals on $C_c(X, \mathbb{R})$ is bijective. Moreover every measure $\mu \in \mathbb{M}_+$ has the following properties:
1. For all $\varepsilon > 0$ and $B \in \mathcal{B}_X$, there exists $F \subset B \subset U$ such that $U$ is open and $F$ is closed and $\mu(U \setminus F) < \varepsilon$. If $\mu(B) < \infty$, $F$ may be taken to be a compact subset of $X$.
2. For all $B \in \mathcal{B}_X$ there exists $A \in F_\sigma$ and $C \in \tau_\delta$ ($\tau_\delta$ is more conventionally written as $G_\delta$) such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$.
3. For all $B \in \mathcal{B}_X$,
\begin{align*}
\mu(B) &= \inf\{\mu(U) : B \subset U \text{ and } U \text{ is open}\} \tag{32.21} \\
&= \sup\{\mu(K) : K \subset B \text{ and } K \text{ is compact}\}. \tag{32.22}
\end{align*}
4. For all open subsets, $U \subset X$,
\[ \mu(U) = \sup\Big\{\int_X f\,d\mu : f \prec U\Big\} = \sup\{I_\mu(f) : f \prec U\}. \tag{32.23} \]
5. For all compact subsets $K \subset X$,
\[ \mu(K) = \inf\{I_\mu(f) : 1_K \le f \prec X\}. \tag{32.24} \]
6. If $\|I_\mu\|$ denotes the dual norm on $C_c(X, \mathbb{R})^*$, then $\|I_\mu\| = \mu(X)$. In particular $I_\mu$ is bounded iff $\mu(X) < \infty$.
7. $C_c(X, \mathbb{R})$ is dense in $L^p(\mu; \mathbb{R})$ for all $1 \le p < \infty$.
$^{3}$ The second countability is assumed here in order to avoid certain technical issues. Recall from Lemma 18.57 that under these assumptions, $\sigma(S) = \mathcal{B}_X$. Also recall from Urysohn's metrization theorem that $X$ is metrizable. We will later remove the second countability assumption.
Proof. First notice that $I_\mu$ is a positive linear functional on $S := C_c(X, \mathbb{R})$ for all $\mu \in \mathbb{M}_+$ and $S$ is a lattice such that $1 \wedge f \in S$ for all $f \in S$. Proposition 32.7 shows that any positive linear functional, $I$, on $S := C_c(X, \mathbb{R})$ is a Daniell integral on $S$. By Lemma 14.23, there exist compact sets $K_n \subset X$ such that $K_n \uparrow X$. By Urysohn's lemma, there exists $\varphi_n \prec X$ such that $\varphi_n = 1$ on $K_n$. Since $\varphi_n \in S^+$ and $\sum_{n=1}^\infty \varphi_n > 0$ on $X$ it follows from Remark 32.45 that there exists $\chi \in S_\uparrow$ such that $\chi > 0$ on $X$ and $I(\chi) < \infty$. So the hypotheses of the Daniell-Stone Theorem 32.44 hold and hence there exists a unique measure $\mu$ on $\sigma(S) = \mathcal{B}_X$ (Lemma 18.57) such that $I = I_\mu$. Hence the map $\mu \to I_\mu$ taking $\mathbb{M}_+$ to positive linear functionals on $C_c(X, \mathbb{R})$ is bijective. We will now prove the remaining seven assertions of the theorem.
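1. Suppose $\varepsilon > 0$ and $B \in \mathcal{B}_X$ satisfies $\mu(B) < \infty$. Then $1_B \in L^1(\mu)$ so there exist functions $f_n \in C_c(X, \mathbb{R})$ such that $f_n \uparrow f$, $1_B \le f$, and
\[ \int_X f\,d\mu = I(f) < \mu(B) + \varepsilon. \tag{32.25} \]
Let $\alpha \in (0,1)$ and $U_\alpha := \{f > \alpha\} = \cup_{n=1}^\infty \{f_n > \alpha\} \in \tau$. Since $1_B \le f$, $B \subset \{f \ge 1\} \subset U_\alpha$ and by Chebyshev's inequality, $\mu(U_\alpha) \le \alpha^{-1} \int_X f\,d\mu = \alpha^{-1} I(f)$. Combining this estimate with Eq. (32.25) shows $\mu(U_\alpha \setminus B) = \mu(U_\alpha) - \mu(B) < \varepsilon$ for $\alpha$ sufficiently close to 1. For general $B \in \mathcal{B}_X$, by what we have just proved, there exist open sets $U_n \subset X$ such that $B \cap K_n \subset U_n$ and $\mu(U_n \setminus (B \cap K_n)) < \varepsilon 2^{-n}$ for all $n$. Let $U = \cup_{n=1}^\infty U_n$, then $B \subset U \in \tau$ and
\[ \mu(U \setminus B) = \mu\big(\cup_{n=1}^\infty (U_n \setminus B)\big) \le \sum_{n=1}^\infty \mu(U_n \setminus B) \le \sum_{n=1}^\infty \mu(U_n \setminus (B \cap K_n)) \le \sum_{n=1}^\infty \varepsilon 2^{-n} = \varepsilon. \]
Applying this result to $B^c$ shows there exists a closed set $F \subset X$ such that $B^c \subset F^c$ and
\[ \mu(B \setminus F) = \mu(F^c \setminus B^c) < \varepsilon. \]
So we have produced $F \subset B \subset U$ such that $\mu(U \setminus F) = \mu(U \setminus B) + \mu(B \setminus F) < 2\varepsilon$. If $\mu(B) < \infty$, using $B \setminus (K_n \cap F) \downarrow B \setminus F$ as $n \to \infty$, we may choose $n$ sufficiently large so that $\mu(B \setminus (K_n \cap F)) < \varepsilon$. Hence we may replace $F$ by the compact set $F \cap K_n$ if necessary.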
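2. Choose $F_n \subset B \subset U_n$ such that $F_n$ is closed, $U_n$ is open and $\mu(U_n \setminus F_n) < 1/n$. Let $A = \cup_n F_n \in F_\sigma$ and $C := \cap_n U_n \in \tau_\delta$. Then $A \subset B \subset C$ and
\[ \mu(C \setminus A) \le \mu(U_n \setminus F_n) < \frac{1}{n} \to 0 \text{ as } n \to \infty. \]
3. From Item 1, one easily concludes that
\[ \mu(B) = \inf\{\mu(U) : B \subset U \subset_o X\} \]
for all $B \in \mathcal{B}_X$ and
\[ \mu(B) = \sup\{\mu(K) : K \subset B,\ K \text{ compact}\} \]
for all $B \in \mathcal{B}_X$ with $\mu(B) < \infty$. So now suppose $B \in \mathcal{B}_X$ and $\mu(B) = \infty$. Using the notation at the end of the proof of Item 1., we have $\mu(F) = \infty$ and $\mu(F \cap K_n) \uparrow \infty$ as $n \to \infty$. This shows $\sup\{\mu(K) : K \subset B,\ K \text{ compact}\} = \infty = \mu(B)$ as desired.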
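4. For $U \subset_o X$, let
\[ \nu(U) := \sup\{I_\mu(f) : f \prec U\}. \]
It is evident that $\nu(U) \le \mu(U)$ because $f \prec U$ implies $f \le 1_U$. Let $K$ be a compact subset of $U$. By Urysohn's Lemma 15.8, there exists $f \prec U$ such that $f = 1$ on $K$. Therefore,
\[ \mu(K) \le \int_X f\,d\mu \le \nu(U) \tag{32.26} \]
and we have
\[ \mu(K) \le \nu(U) \le \mu(U) \text{ for all } U \subset_o X \text{ and compact } K \subset U. \tag{32.27} \]
By Item 3.,
\[ \mu(U) = \sup\{\mu(K) : K \subset U,\ K \text{ compact}\} \le \nu(U) \le \mu(U) \]
which shows that $\nu(U) = \mu(U)$, i.e. Eq. (32.23) holds.
5. Now suppose $K$ is a compact subset of $X$. From Eq. (32.26),
\[ \mu(K) \le \inf\{I_\mu(f) : 1_K \le f \prec X\} \le \mu(U) \]
for any open subset $U$ such that $K \subset U$. Consequently by Eq. (32.21),
\[ \mu(K) \le \inf\{I_\mu(f) : 1_K \le f \prec X\} \le \inf\{\mu(U) : K \subset U \subset_o X\} = \mu(K). \]
6. For $f \in C_c(X, \mathbb{R})$,
\[ |I_\mu(f)| \le \int_X |f|\,d\mu \le \|f\|_\infty\, \mu(\mathrm{supp}(f)) \le \|f\|_\infty\, \mu(X) \tag{32.28} \]
which shows $\|I_\mu\| \le \mu(X)$. Let $K \subset X$ be compact and $f \prec X$ be such that $f = 1$ on $K$. By Eq. (32.26),
\[ \mu(K) \le \int_X f\,d\mu = I_\mu(f) \le \|I_\mu\| \|f\|_\infty = \|I_\mu\| \]
and therefore,
\[ \mu(X) = \sup\{\mu(K) : K \subset X,\ K \text{ compact}\} \le \|I_\mu\|. \]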
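7. This has already been proved by two methods in Theorem 22.8 but we will give yet another proof here. When $p = 1$ and $g \in L^1(\mu; \mathbb{R})$, there exists, by Eq. (32.20), $h \in S_\uparrow = C_c(X, \mathbb{R})_\uparrow$ such that $g \le h$ and $\|h - g\|_1 = \int_X (h - g)\,d\mu < \varepsilon$. Let $\{h_n\}_{n=1}^\infty \subset S = C_c(X, \mathbb{R})$ be chosen so that $h_n \uparrow h$ as $n \to \infty$. Then by the dominated convergence theorem (notice that $|h_n| \le |h_1| + |h|$), $\|h_n - g\|_1 \to \|h - g\|_1 < \varepsilon$ as $n \to \infty$. Therefore for $n$ large we have $h_n \in C_c(X, \mathbb{R})$ with $\|h_n - g\|_1 < \varepsilon$. Since $\varepsilon > 0$ is arbitrary this shows, $\overline{C_c(X, \mathbb{R})}^{L^1(\mu)} = L^1(\mu)$. Now suppose $p > 1$, $g \in L^p(\mu; \mathbb{R})$ and $\{K_n\}_{n=1}^\infty$ are as above. By the dominated convergence theorem, $1_{K_n} \cdot [(g \wedge n) \vee (-n)] \to g$ in $L^p(\mu)$ as $n \to \infty$, so it suffices to consider $g \in L^p(\mu; \mathbb{R})$ with $\mathrm{supp}(g) \subset K_n$ and $|g| \le n$ for some large $n \in \mathbb{N}$. By Hölder's inequality, such a $g$ is in $L^1(\mu)$. So if $\varepsilon > 0$, by the $p = 1$ case, there exists $h \in S$ such that $\|h - g\|_1 < \varepsilon$. By replacing $h$ by $(h \wedge n) \vee (-n) \in S$, we may assume $h$ is bounded by $n$ in which case
\begin{align*}
\|h - g\|_p^p &= \int_X |h - g|^p\,d\mu = \int_X |h - g|^{p-1} |h - g|\,d\mu \\
&\le (2n)^{p-1} \int_X |h - g|\,d\mu < (2n)^{p-1} \varepsilon.
\end{align*}
Since $\varepsilon > 0$ was arbitrary, this shows $S$ is dense in $L^p(\mu; \mathbb{R})$.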
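Remark 32.48. We may give a direct proof of the fact that $\mu \to I_\mu$ is injective. Indeed, suppose $\mu, \nu \in \mathbb{M}_+$ satisfy $I_\mu(f) = I_\nu(f)$ for all $f \in C_c(X, \mathbb{R})$. By Theorem 22.8, if $A \in \mathcal{B}_X$ is a set such that $\mu(A) + \nu(A) < \infty$, there exists $f_n \in C_c(X, \mathbb{R})$ such that $f_n \to 1_A$ in $L^1(\mu + \nu)$. Since $f_n \to 1_A$ in $L^1(\mu)$ and $L^1(\nu)$,
\[ \mu(A) = \lim_{n\to\infty} I_\mu(f_n) = \lim_{n\to\infty} I_\nu(f_n) = \nu(A). \]
For general $A \in \mathcal{B}_X$, choose compact subsets $K_n \subset X$ such that $K_n \uparrow X$. Then
\[ \mu(A) = \lim_{n\to\infty} \mu(A \cap K_n) = \lim_{n\to\infty} \nu(A \cap K_n) = \nu(A) \]
showing $\mu = \nu$. Therefore the map $\mu \to I_\mu$ is injective.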
Theorem 32.49 (Lusin's Theorem). Suppose $(X, \tau)$ is a locally compact and second countable Hausdorff space, $\mathcal{B}_X$ is the Borel $\sigma$-algebra on $X$, and $\mu$ is a measure on $(X, \mathcal{B}_X)$ which is finite on compact sets of $X$. Also let $\varepsilon > 0$ be given. If $f : X \to \mathbb{C}$ is a measurable function such that $\mu(f \ne 0) < \infty$, there exists a compact set $K \subset \{f \ne 0\}$ such that $f|_K$ is continuous and $\mu(\{f \ne 0\} \setminus K) < \varepsilon$. Moreover there exists $\varphi \in C_c(X)$ such that $\mu(f \ne \varphi) < \varepsilon$ and if $f$ is bounded the function $\varphi$ may be chosen so that $\|\varphi\|_\infty \le \|f\|_\infty := \sup_{x \in X} |f(x)|$.
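Proof. Suppose first that $f$ is bounded, in which case
\[ \int_X |f|\,d\mu \le \|f\|_\infty\, \mu(f \ne 0) < \infty. \]
By Theorem 22.8 or Item 7. of Theorem 32.47, there exists $f_n \in C_c(X)$ such that $f_n \to f$ in $L^1(\mu)$ as $n \to \infty$. By passing to a subsequence if necessary, we may assume $\|f - f_n\|_1 < \varepsilon n^{-1} 2^{-n}$ for all $n$ and thus $\mu\big(|f - f_n| > n^{-1}\big) < \varepsilon 2^{-n}$ for all $n$. Let $E := \cup_{n=1}^\infty \{|f - f_n| > n^{-1}\}$, so that $\mu(E) < \varepsilon$. On $E^c$, $|f - f_n| \le 1/n$, i.e. $f_n \to f$ uniformly on $E^c$ and hence $f|_{E^c}$ is continuous. Let $A := \{f \ne 0\} \setminus E$. By Theorem 32.47 (or see Exercises 33.4 and 33.5) there exists a compact set $K$ and open set $V$ such that $K \subset A \subset V$ and $\mu(V \setminus K) < \varepsilon$. Notice that
\[ \mu(\{f \ne 0\} \setminus K) \le \mu(A \setminus K) + \mu(E) < 2\varepsilon. \]
By the Tietze extension Theorem 15.9, there exists $F \in C(X)$ such that $f = F|_K$. By Urysohn's Lemma 15.8 there exists $\psi \prec V$ such that $\psi = 1$ on $K$. So letting $\varphi = \psi F \in C_c(X)$, we have $\varphi = f$ on $K$, $\|\varphi\|_\infty \le \|f\|_\infty$ and since $\{\varphi \ne f\} \subset E \cup (V \setminus K)$, $\mu(\varphi \ne f) < 3\varepsilon$. This proves the assertions in the theorem when $f$ is bounded. Suppose that $f : X \to \mathbb{C}$ is (possibly) unbounded. By Lemmas 18.57 and 14.23, there exist compact sets $\{K_N\}_{N=1}^\infty$ of $X$ such that $K_N \uparrow X$. Hence $B_N := K_N \cap \{0 < |f| \le N\} \uparrow \{f \ne 0\}$ as $N \to \infty$. Therefore if $\varepsilon > 0$ is given there exists an $N$ such that $\mu(\{f \ne 0\} \setminus B_N) < \varepsilon$. We now apply what we have just proved to $1_{B_N} f$ to find a compact set $K \subset \{1_{B_N} f \ne 0\}$, an open set $V \subset X$ and $\varphi \in C_c(V) \subset C_c(X)$ such that $\mu(V \setminus K) < \varepsilon$, $\mu(\{1_{B_N} f \ne 0\} \setminus K) < \varepsilon$ and $\varphi = f$ on $K$. The proof is now complete since
\[ \{\varphi \ne f\} \subset (\{f \ne 0\} \setminus B_N) \cup (\{1_{B_N} f \ne 0\} \setminus K) \cup (V \setminus K) \]
so that $\mu(\varphi \ne f) < 3\varepsilon$.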
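To illustrate Theorem 32.49, suppose that $X = (0,1)$, $\mu = m$ is Lebesgue measure and $f = 1_{(0,1) \cap \mathbb{Q}}$. Then Lusin's theorem asserts for any $\varepsilon > 0$ there exists a compact set $K \subset (0,1)$ such that $m((0,1) \setminus K) < \varepsilon$ and $f|_K$ is continuous. To see this directly, let $\{r_n\}_{n=1}^\infty$ be an enumeration of the rationals in $(0,1)$,
\[ J_n = (r_n - \varepsilon 2^{-n}, r_n + \varepsilon 2^{-n}) \cap (0,1) \quad\text{and}\quad W = \cup_{n=1}^\infty J_n. \]
Then $W$ is an open subset of $X$ and $m(W) \le \sum_{n=1}^\infty 2\varepsilon 2^{-n} = 2\varepsilon$. Therefore $K_n := [1/n, 1 - 1/n] \setminus W$ is a compact subset of $X$ and $m(X \setminus K_n) \le \frac{2}{n} + m(W)$. Taking $n$ sufficiently large we have $m(X \setminus K_n) < 3\varepsilon$ and $f|_{K_n} \equiv 0$ is continuous. Since $\varepsilon > 0$ is arbitrary, this is the assertion of Lusin's theorem in this case.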
32.6 The General Riesz Representation by Daniell
Integrals (Move Later?)
This section is rather a mess and is certainly not complete. Here is the upshot of what I understand at this point.
When using the Daniell integral to construct measures on locally compact Hausdorff spaces the natural answer is in terms of measures on the Baire $\sigma$-algebra. To get the Rudin or Folland version of the theorem one has to extend this measure to the Borel $\sigma$-algebra. Checking all of the details here seems to be rather painful. Just as painful as giving the full proof in Rudin!! Argh.
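Definition 32.50. Let $X$ be a locally compact Hausdorff space. The Baire $\sigma$-algebra on $X$ is $\mathcal{B}_X^0 := \sigma(C_c(X))$.
Notice that if $f \in C_c(X, \mathbb{R})$ then $f = f_+ - f_-$ with $f_\pm \in C_c(X, \mathbb{R}_+)$. Therefore $\mathcal{B}_X^0$ is generated by sets of the form $K := \{f \ge \alpha\} \subset \mathrm{supp}(f)$ with $\alpha > 0$. Notice that $K$ is compact and $K = \cap_{n=1}^\infty \{f > \alpha - 1/n\}$ showing $K$ is a compact $G_\delta$. Thus we have shown $\mathcal{B}_X^0 \subset \sigma(\text{compact } G_\delta\text{'s})$. For the converse we will need the following exercise.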
Exercise 32.2. Suppose that $X$ is a locally compact Hausdorff space and $K \subset X$ is a compact $G_\delta$, then there exists $f \in C_c(X, [0,1])$ such that $f = 1$ on $K$ and $f < 1$ on $K^c$.
Solution to Exercise (32.2). Let $V_n \subset_o X$ be sets such that $V_n \downarrow K$ as $n \to \infty$ and use Urysohn's Lemma to find $f_n \in C_c(V_n, [0,1])$ such that $f_n = 1$ on $K$. Let $f = \sum_{n=1}^\infty 2^{-n} f_n$. Hence if $x \in K^c$, then $x \notin V_n$ for some $n$ and hence $f(x) < \sum_{n=1}^\infty 2^{-n} = 1$.
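The two claims in the solution can be written out term by term. This is a sketch under the solution's own setup ($f_n \in C_c(V_n, [0,1])$ with $f_n = 1$ on $K$); the index $m$ is introduced here for the particular $V_m$ that misses $x$.

```latex
% m denotes an index (hypothetical name) with x not in V_m, so that f_m(x) = 0.
\[
x \in K:\quad f(x) = \sum_{n=1}^\infty 2^{-n} \cdot 1 = 1; \qquad
x \notin V_m:\quad f(x) = \sum_{n \ne m} 2^{-n} f_n(x) \le 1 - 2^{-m} < 1 .
\]
```

The second estimate gives the strict inequality $f < 1$ on $K^c$, because every $x \in K^c$ lies outside some $V_m$.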
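This exercise shows that $\sigma(\text{compact } G_\delta\text{'s}) \subset \sigma(C_c(X))$. Indeed, if $K$ is a compact $G_\delta$ then by Exercise 32.2, there exists $f \prec X$ such that $f = 1$ on $K$ and $f < 1$ on $K^c$. Therefore $1_K = \lim_{n\to\infty} f^n$ is $\mathcal{B}_X^0$-measurable. Therefore we have proved $\mathcal{B}_X^0 = \sigma(\text{compact } G_\delta\text{'s})$.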
Definition 32.51. Let $(X, \tau)$ be a locally compact topological space. We say that $E \subset X$ is bounded if $E \subset K$ for some compact set $K$ and $E$ is $\sigma$-bounded if $E \subset \cup_n K_n$ for some sequence of compact sets $\{K_n\}_{n=1}^\infty$.
Lemma 32.52. If $A \in \mathcal{B}_X^0$, then either $A$ or $A^c$ is $\sigma$-bounded.
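Proof. Let
\[ \mathcal{F} := \{A \subset X : \text{either } A \text{ or } A^c \text{ is } \sigma\text{-bounded}\}. \]
Clearly $X \in \mathcal{F}$ and $\mathcal{F}$ is closed under complementation. Moreover if $A_i \in \mathcal{F}$ then $A = \cup_i A_i \in \mathcal{F}$. Indeed, if each $A_i$ is $\sigma$-bounded then $A$ is $\sigma$-bounded and if some $A_j^c$ is $\sigma$-bounded then
\[ A^c = \cap_i A_i^c \subset A_j^c \]
is $\sigma$-bounded. Therefore, $\mathcal{F}$ is a $\sigma$-algebra containing the compact $G_\delta$'s and therefore $\mathcal{B}_X^0 \subset \mathcal{F}$.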
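Now the $\sigma$-algebra $\mathcal{B}_X^0$ is called the Baire $\sigma$-algebra and may not necessarily be as large as the Borel $\sigma$-algebra. However if every open subset of $X$ is $\sigma$-compact, then the Borel $\sigma$-algebra $\mathcal{B}_X$ and the Baire $\sigma$-algebra are the same. Indeed, suppose $U \subset_o X$ and $K_n \uparrow U$ with $K_n$ being compact. There exists $f_n \prec U$ such that $f_n = 1$ on $K_n$. Now $f := \lim_{n\to\infty} f_n = 1_U$ showing $U \in \sigma(C_c(X)) = \mathcal{B}_X^0$.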
Lemma 32.53. In Halmos on p.221 it is shown that a compact Baire set is necessarily a compact $G_\delta$.
Proof. Let $K$ be a compact Baire set and let $K\mathcal{G}_\delta$ denote the space of compact $G_\delta$'s. Recall in general that if $\mathcal{F}$ is some collection of subsets of a space $X$, then
\[ \sigma(\mathcal{F}) = \cup\{\sigma(\mathcal{E}) : \mathcal{E} \text{ is a countable subset of } \mathcal{F}\}. \]
This is because the right member of this equation is a $\sigma$-algebra. Therefore, there exist $\{C_n\}_{n=1}^\infty \subset K\mathcal{G}_\delta$ such that $K \in \sigma(\{C_n\}_{n=1}^\infty)$. Let $f_n \in C(X, [0,1])$ be such that $C_n = \{f_n = 0\}$, see Exercise 32.2 above. Now define
\[ d(x, y) := \sum_{n=1}^\infty 2^{-n} |f_n(x) - f_n(y)|. \]
Then $d$ would be a metric on $X$ except for the fact that $d(x, y)$ may be zero even though $x \ne y$. Let $x \sim y$ iff $d(x, y) = 0$ iff $f_n(x) = f_n(y)$ for all $n$. It is easily seen that $\sim$ is an equivalence relation and $Z := X/\!\sim$ with the induced metric $\bar d$ is a metric space. Also let $\pi : X \to Z$ be the canonical projection map. Notice that if $x \in C_n$ then $y \in C_n$ for all $y \sim x$, and therefore $\pi^{-1}(\pi(C_n)) = C_n$ for all $n$. In particular this shows that
\[ K \in \sigma(\{C_n\}_{n=1}^\infty) \subset \{\pi^{-1}(B) : B \subset Z\}, \]
i.e. $K = \pi^{-1}(\pi(K))$. Now $\pi$ is continuous, since if $x \in X$ and $y \in \cap_{k=1}^N \{|f_k(y) - f_k(x)| < \varepsilon\} \subset_o X$ then
\[ \bar d(\pi(x), \pi(y)) = d(x, y) < \varepsilon + 2^{-N+1} \]
which can be made as small as we please. Hence $\pi(K)$ is compact and hence closed in $Z$. Let $W_n := \{z \in Z : \bar d_{\pi(K)}(z) < 1/n\}$, then $W_n$ is open in $Z$ and $W_n \downarrow \pi(K)$ as $n \to \infty$. Let $V_n := \pi^{-1}(W_n)$, open in $X$ since $\pi$ is continuous, then $V_n \downarrow K$ as $n \to \infty$.
The following facts are taken from Halmos, section 50 starting on p. 216.
Theorem 32.54. 1. If $K \subset X$ is compact and $K \subset U \cup V$ with $U, V \in \tau$, then $K = K_1 \cup K_2$ with $K_1 \subset U$ and $K_2 \subset V$ compact.
2. If $K \subset X$ is compact and $F \subset X$ is closed and the two are disjoint, then there exists $f \in C(X, [0,1])$ such that $f = 0$ on $K$ and $f = 1$ on $F$.
3. If $f$ is a real valued continuous function, then for all $c \in \mathbb{R}$ the sets $\{f \ge c\}$, $\{f \le c\}$ and $\{f = c\}$ are closed $G_\delta$.
4. If $K \subset U \subset_o X$ with $K$ compact, then there exists $K \subset U_0 \subset K_0 \subset U$ such that $K_0$ is a compact $G_\delta$ and $U_0$ is a $\sigma$-compact open set.
5. If $X$ is separable, then every compact subset of $X$ is a $G_\delta$. (I think the proof of this point is wrong in Halmos!)
Proof. 1. $K \setminus U$ and $K \setminus V$ are disjoint compact sets and hence there exist two disjoint open sets $U'$ and $V'$ such that
\[ K \setminus U \subset V' \quad\text{and}\quad K \setminus V \subset U'. \]
Let $K_1 := K \setminus V' \subset U$ and $K_2 = K \setminus U' \subset V$. 2. Tietze extension theorem with elementary proof in Halmos. 3. $\{f \le c\} = \cap_{n=1}^\infty \{f < c + 1/n\}$ with similar formulas for the other cases. The converse has already been mentioned. 4. For each $x \in K$, let $V_x$ be an open neighborhood of $x$ such that $\bar V_x \subset U$ is compact, and set $V = \cup_{x \in \Lambda} V_x$ where $\Lambda \subset K$ is a finite set such that $K \subset V$. Since $\bar V = \cup_{x \in \Lambda} \bar V_x$ is compact, we may replace $U$ by $V$ if necessary and assume that $U$ is bounded. Let $f \in C(X, [0,1])$ be such that $f = 0$ on $K$ and $f = 1$ on $U^c$. Take $U_0 = \{f < 1/2\}$ and $K_0 = \{f \le 1/2\}$. Then $K \subset U_0 \subset K_0 \subset U$, $K_0$ is a compact $G_\delta$ and $U_0$ is a $\sigma$-compact open set since $U_0 = \cup_{n=3}^\infty \{f \le 1/2 - 1/n\}$. 5. Let $K \subset X$ be compact, and $D$ be a countable dense subset of $X$. For all $x \notin K$ there exist disjoint open sets $V_x$ and $U_x$ such that $x \in U_x$ and $K \subset V_x$. (I don't see how to finish this off at the moment.)
32.7 Regularity Results
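Proposition 32.55. Let $X$ be a compact Hausdorff space and $\mu$ be a Baire measure on $\mathcal{B}_X^0$. Then for each $A \in \mathcal{B}_X^0$ and $\varepsilon > 0$ there exists $K \subset A \subset V$ where $K$ is a compact $G_\delta$ and $V$ is open, Baire and $\sigma$-compact, such that $\mu(V \setminus K) < \varepsilon$.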
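Proof. Let $I(f) = \int_X f\,d\mu$ for $f \in S := C(X)$, so that $I$ is a Daniell integral on $C(X)$. Since $1 \in S$, the measure produced from $I$ by the Daniell-Stone construction theorem is the same as the measure $\mu$. Hence for $g \in L^1(\mu)$, we have
\begin{align*}
\sup\{I(f) : f \in S_\downarrow \text{ with } f \le g\} &= \int_X g\,d\mu \\
&= \inf\Big\{\int_X h\,d\mu : h \in S_\uparrow \text{ with } g \le h\Big\}.
\end{align*}
Suppose $\varepsilon > 0$ and $B \in \mathcal{B}_X^0$ are given. There exists $h_n \in S$ such that $h_n \uparrow h$, $1_B \le h$, and $\mu(h) < \mu(B) + \varepsilon$. The condition $1_B \le h$ implies $1_B \le 1_{\{h \ge 1\}} \le h$ and hence
\[ \mu(B) \le \mu(h \ge 1) \le \mu(h) < \mu(B) + \varepsilon. \tag{32.29} \]
Moreover, letting
\[ V_m := \cup_{n=1}^\infty \{h_n > 1 - 1/m\} = \cup_{n=1}^\infty \cup_{k=1}^\infty \{h_n \ge 1 - 1/m + 1/k\} \]
(a $\sigma$-compact, open Baire set) we have $V_m \downarrow \{h \ge 1\} \supset B$ hence $\mu(V_m) \downarrow \mu(h \ge 1) \ge \mu(B)$ as $m \to \infty$. Combining this observation with Eq. (32.29), we may choose $m$ sufficiently large so that $B \subset V_m$ and
\[ \mu(V_m \setminus B) = \mu(V_m) - \mu(B) < \varepsilon. \]
Hence there exists $V \in \tau$ such that $B \subset V$ and $\mu(V \setminus B) < \varepsilon$. Similarly, there exists $f \in S_\downarrow$ such that $f \le 1_B$ and $\mu(B) < \mu(f) + \varepsilon$. We clearly may assume that $f \ge 0$. Let $f_n \in S$ be such that $f_n \downarrow f$ as $n \to \infty$. Since $0 \le f \le 1_B$ we have
\[ 0 \le f \le 1_{\{f > 0\}} \le 1_B \]
so that $\{f > 0\} \subset B$ and $\mu(B) < \mu(f > 0) + \varepsilon$. For each $m \in \mathbb{N}$, let
\[ K_m := \cap_{n=1}^\infty \{f_n \ge 1/m\} = \cap_{k=1}^\infty \cap_{n=1}^\infty \{f_n > 1/m - 1/k\}, \]
a compact $G_\delta$, then $K_m \uparrow \{f > 0\}$ as $m \to \infty$. Therefore for large $m$ we will have $\mu(B) < \mu(K_m) + \varepsilon$, i.e. $K_m \subset B$ and $\mu(B \setminus K_m) < \varepsilon$.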
Remark 32.56. The above proof does not in general work when $X$ is a locally compact Hausdorff space and $\mu$ is a finite Baire measure on $\mathcal{B}_X^0$ since it may happen that $\mu \ne \mu_+$, i.e. $\mu_+(X)$ might be infinite, see Example 32.57 below. However, if $\mu_+(X) < \infty$, then the above proof works in this context as well.
Example 32.57. Let $X$ be an uncountable set and $\tau = 2^X$ be the discrete topology on $X$. In this case $K \subset X$ is compact iff $K$ is a finite set. Since every set is open, $K$ is necessarily a $G_\delta$ and hence a Baire set. So $\mathcal{B}_X^0$ is the $\sigma$-algebra generated by the finite subsets of $X$. We may describe $\mathcal{B}_X^0$ by: $A \in \mathcal{B}_X^0$ iff $A$ is countable or $A^c$ is countable. For $A \in \mathcal{B}_X^0$, let
$$\mu(A) = \begin{cases} 0 & \text{if } A \text{ is countable} \\ 1 & \text{if } A^c \text{ is countable.} \end{cases}$$
To see that $\mu$ is a measure suppose that $A$ is the disjoint union of $\{A_n\} \subset \mathcal{B}_X^0$. If $A_n$ is countable for all $n$, then $A$ is countable and $\mu(A) = 0 = \sum_{n=1}^{\infty} \mu(A_n)$. If $A_m^c$ is countable for some $m$, then $A_i \subset A_m^c$ is countable for all $i \ne m$. Therefore $\sum_{n=1}^{\infty} \mu(A_n) = 1$; now $A^c = \bigcap_n A_n^c \subset A_m^c$ is countable as well, so $\mu(A) = 1$. Therefore $\mu$ is a measure.
The measure $\mu$ is clearly a finite Baire measure on $\mathcal{B}_X^0$ which is non-regular. Letting $I(f) = \int_X f\,d\mu$ for all $f \in S = C_c(X)$ (the functions with finite support), we have $I(f) = 0$ for all $f$. If $B \subset X$ is a set such that $B^c$ is countable, there are no functions $f \in (C_c(X))_\uparrow$ such that $1_B \le f$. Therefore $\mu_+(B) = \bar{I}(1_B) = \infty$. That is
$$\mu_+(A) = \begin{cases} 0 & \text{if } A \text{ is countable} \\ \infty & \text{otherwise.} \end{cases}$$
On the other hand, one easily sees that $\mu_-(A) = 0$ for all $A \in \mathcal{B}_X^0$. The measure $\mu_-$ represents $I$ as well.
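Definition 32.58. A Baire measure $\mu$ on a locally compact Hausdorff space is regular if for each $A \in \mathcal{B}_X^0$ ($\mathcal{B}_X^0$ being the Baire $\sigma$-algebra),
$$\mu(A) = \sup\{\mu(K) : K \subset A \text{ and } K \text{ is a compact } G_\delta\}.$$
Proposition 32.59. Let $\mu$ be a Baire measure on $X$ and set
$$\nu(A) := \sup\{\mu(K) : K \subset A \text{ and } K \text{ is a compact } G_\delta\}.$$
Then $\nu(A) = \mu(A)$ for any $\sigma$-bounded set $A$ and $\nu$ is a regular Baire measure on $X$.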
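Proof. Let $A$ be a $\sigma$-bounded set, $\varepsilon > 0$, and $K_n$ be compact $G_\delta$'s (which exist by Theorem 32.54) such that $A \subset \bigcup_n K_n$. By replacing $K_n$ by $\bigcup_{k=1}^{n} K_k$ if necessary, we may assume that $K_n$ is increasing in $n$. By Proposition 32.55, there exist compact $G_\delta$'s, $C_n$, such that $C_n \subset A \cap K_n$ and $\mu(A \cap K_n \setminus C_n) < \varepsilon 2^{-n}$ for all $n$. Let $C^N := \bigcup_{n=1}^{N} C_n$; then $C^N$ is a compact $G_\delta$, $C^N \subset A$ and $\mu(A \cap K_N \setminus C^N) < \varepsilon$ for all $N$. From this equation it follows that $\mu(A \setminus C^N) < \varepsilon$ for large $N$ if $\mu(A) < \infty$ and $\mu(C^N) \to \infty$ if $\mu(A) = \infty$. In either case we conclude that $\nu(A) = \mu(A)$.

Now let us show that $\nu$ is a measure on $\mathcal{B}_X^0$. Suppose $A = \bigsqcup_{n=1}^{\infty} A_n$ and $K_n \subset A_n$ for each $n$ with $K_n$ being a compact $G_\delta$. Then $K^N := \bigcup_{n=1}^{N} K_n$ is also a compact $G_\delta$ and since $K^N \subset A$, it follows that
$$\nu(A) \ge \mu(K^N) = \sum_{n=1}^{N} \mu(K_n).$$
Since the $K_n \subset A_n$ are arbitrary, we learn that $\nu(A) \ge \sum_{n=1}^{N} \nu(A_n)$ for all $N$ and hence letting $N \to \infty$ shows
$$\nu(A) \ge \sum_{n=1}^{\infty} \nu(A_n).$$
We now wish to prove the converse inequality. Owing to the above inequality, it suffices to consider the case where $\sum_{n=1}^{\infty} \nu(A_n) < \infty$. Let $K \subset A$ be a compact $G_\delta$. Then
$$\mu(K) = \sum_{n=1}^{\infty} \mu(K \cap A_n) = \sum_{n=1}^{\infty} \nu(K \cap A_n) \le \sum_{n=1}^{\infty} \nu(A_n)$$
and since $K$ is arbitrary, it follows that $\nu(A) \le \sum_{n=1}^{\infty} \nu(A_n)$. So $\nu$ is a measure. Finally if $A \in \mathcal{B}_X^0$, then
$$\sup\{\nu(K) : K \subset A \text{ and } K \text{ is a compact } G_\delta\} = \sup\{\mu(K) : K \subset A \text{ and } K \text{ is a compact } G_\delta\} = \nu(A),$$
showing $\nu$ is regular.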
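Corollary 32.60. Suppose that $\mu$ is a finite Baire measure on $X$ such that
$$\mu(X) = \sup\{\mu(K) : K \subset X \text{ and } K \text{ is a compact } G_\delta\},$$
then $\mu = \nu$, where $\nu$ is the measure defined from $\mu$ in Proposition 32.59; in particular $\mu$ is regular.
Proof. The assumption asserts that $\mu(X) = \nu(X)$. Since $\mu = \nu$ on the $\pi$-class consisting of the compact $G_\delta$'s, we may apply Theorem 19.55 to learn $\mu = \nu$.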
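Proposition 32.61. Suppose that $\mu$ is a Baire measure on $X$. Then for all $A \in \mathcal{B}_X^0$ which are $\sigma$-bounded and all $\varepsilon > 0$ there exists $V \in \tau \cap \mathcal{B}_X^0$ such that $A \subset V$ and $\mu(V \setminus A) < \varepsilon$. Moreover if $\mu$ is regular then
$$\mu(A) = \inf\{\mu(V) : A \subset V \in \tau \cap \mathcal{B}_X^0\} \tag{32.30}$$
holds for all $A \in \mathcal{B}_X^0$.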
Proof. Suppose A is bounded Baire set. Let K
n
be compact (
s
(which exist by Theorem 32.54) such that A K
n
. By replacing K
n
by
n
k=1
K
k
if necessary, we may assume that K
n
is increasing in n. By Propo-
sition 32.55 (applied to d
n
:= 1
U
0
n
d with U
0
n
an open Baire set such that
K
n
U
0
n
and U
0
n
C
n
where C
n
is a compact Baire set, see Theorem
32.54), there exists open Baire sets V
n
of X such that A K
n
V
n
and
(V
n
A K
n
) < 2
n
for all n. Let V =
n=1
V
n
B
0
X
, A V and
(V A) < . Now suppose that is regular and A B
0
X
. If (A) =
then clearly inf
_
(V ) : A V B
0
X
_
= . So we will now assume that
(A) < . By inner regularity, there exists compact (
t
s, K
n
, such that K
n
,
K
n
A for all n and (A K
n
) 0 as n . Letting B = K
n
A, then
B is a bounded set, (A B) = 0. Since B is bounded there exists an
open Baire V such that B V and (V B) is a small as we please. These
remarks reduce the problem to considering the truth of the proposition for
the null set A B. So we now assume that (A) = 0. If A is bounded we
are done by the rst part of the proposition, so we will now assume that A is
not bounded. By Lemma 32.52, it follows that A
c
is bounded. (I am
a little stuck here, so assume for now that (X) < . in which case we do
not use the fact that A
c
can be assumed to be bounded.) If (X) <
and > 0 is given, by inner regularity there exists a compact Baire subset
K A
c
such that
> (A
c
K) = (K
c
A)
and since A K
c
is an open, Baire set the proof is nished when is a nite
measure.
Example 32.62. 1) Suppose that $X = \mathbb{R}$ with the standard topology and $\mu$ is counting measure on $X$. Then clearly $\mu$ is not finite on all compact sets, so $\mu$ is not a K-finite measure. 2) Let $X = \mathbb{R}$ and $\tau = \tau_d = 2^X$ be the discrete topology on $X$. Now let $\mu(A) = 0$ if $A$ is countable and $\mu(A) = \infty$ otherwise. Then $\mu(K) = 0 < \infty$ if $K$ is $\tau_d$-compact, yet $\mu$ is not inner regular on open sets, i.e. all sets. So again $\mu$ is not Radon. Moreover, the functional
$$I_\mu(f) = \int_X f\,d\mu = 0 \text{ for all } f \in C_c(X).$$
This shows that without the restriction that $\mu$ is Radon in Example 28.17, the correspondence $\mu \to I_\mu$ is not injective.
Theorem 32.63 (Riesz Representation Theorem). Let $X$ be a locally compact Hausdorff space. The map $\mu \to I_\mu$ taking Radon measures on $X$ to positive linear functionals on $C_c(X)$ is bijective. Moreover if $I$ is a positive linear functional on $C_c(X)$, then $I = I_\mu$ where $\mu$ is the unique Radon measure such that $\mu(U) = \sup\{I(f) : f \prec U\}$ for all $U \subset_o X$.
Proof. Given a positive linear functional on C
c
(X), the Daniell - Stone
integral construction theorem gives the existence of a measure on B
0
X
:=
(C
c
(X)) (the Baire algebra) such that
_
X
fd = I(f) for all f C
c
(X)
and for g L
1
(),
supI(f) : S
f g =
_
X
gd = inf I(h) : g h S
with S := C
c
(X, 1). Suppose that K is a compact subset of X and E K
is a Baire set. Let f X be a function such that f = 1 on K, then 1
E
f
implies (E) = I(1
E
) I(f) < . Therefore any bounded (i.e. subset of a
compact set) Baire set E has nite measure. Suppose that K is a compact
Baire set, i.e. a compact (
, and f is as in Exercise 32.2, then

(K)
_
f
n
d = I(f
n
) <
showing is nite on compact Baire sets and by the dominated convergence
theorem that
(K) = lim
n
I(f
n
)
showing is uniquely determined on compact Baire sets. Suppose that A B
0
X
and (A) = I
(1
A
) < . Given > 0, there exists f S
such that 1
A
f
and (f) < (A) +. Let f
n
C
c
(X) such that f
n
f, then 1
A
1
f1]
f
which shows
(A) (f 1)
_
fd = I(f) < (A) +.
Let V
m
:=
n=1
f
n
> 1 1/m , then V
m
is open and V
m
f 1 as
m . Notice that
(V
m
) = lim
n
(f
n
> 1 1/m) (f > 1 1/m)
1
1 1/m
(f) <
1
1 1/m
((A) +)
showing (V
m
) < (A) + for all m large enough. Therefore if A B
0
X
and
(A) < , there exists a Baire open set, V, such A V and (V A) is
as small as we please. Suppose that A B
0
X
is a bounded Baire set,
then using Item 4. of Theorem 32.54 there exists compact (
, K
n
, such that
A K
n
. Hence there exists V
n
open Baire sets such that K
n
A V
n
and
(V
n
K
n
A) < 2
n
for all n. Now let V := V
n
, an open Baire set, then
A V and (V A) < . Hence we have shown if A is bounded then
(A) = inf (V ) : A V
o
X and V is Baire.
Again let Aand K
n
be as above. Replacing K
n
by
n
k=1
K
k
we may also assume
that K
n
as n . Then K
n
A is a bounded Baire set. Let F
n
be a compact
(
such that K
n
A F
n
and choose compact open set V
n
such that
F
n
K
n
A V
n
and (V
n
(F
n
K
n
A)) < 2
n
. ....... In the end the
desired measure should be dened by
(U) = supI(f) : f U for all U
o
X
and for general A B
X
we set
(A) := inf (U) : A U
o
X .
Let us note that if f U and K = supp(f), then there exists K U
0

K
0
U as in Theorem 32.54. Therefore, f 1
K0
and hence I(f) (K
0
)
which shows that
(U) sup(K
0
) : K
0
U and K
0
is a compact (
.
The converse inequality is easily proved by letting g U such that g = 1 on
K
0
. Then (K
0
) I(g) (U) and hence
(U) = sup(K
0
) : K
0
U and K
0
is a compact (
.
Let us note that is sub-additive on open sets ,see p. 314 of Royden. Let
(A) := inf (U) : A U

o
X
Then
is an outer measure as well I think and ^ := A X :
(A) =
0 is closed under countable unions. Moreover if E is Baire measurable and
E ^, then there exists O open (O) < and E O. Hence for all compact
(
, K O, (K) < . Royden uses assumed regularity here to show that

(E) = 0. I dont see how to get this assume regularity at this point.
32.8 Metric space regularity results revisited
Proposition 32.64. Let $(X, d)$ be a metric space and $\mu$ be a measure on $\mathcal{M} = \mathcal{B}_X$ which is $\sigma$-finite on $\tau := \tau_d$.
1. For all $\varepsilon > 0$ and $B \in \mathcal{M}$ there exists an open set $V \in \tau$ and a closed set $F$ such that $F \subset B \subset V$ and $\mu(V \setminus F) \le \varepsilon$.
2. For all $B \in \mathcal{M}$, there exist $A \in F_\sigma$ and $C \in G_\delta$ such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$. Here $F_\sigma$ denotes the collection of subsets of $X$ which may be written as a countable union of closed sets and $G_\delta$ is the collection of subsets of $X$ which may be written as a countable intersection of open sets.
3. The space $BC_f(X)$ of bounded continuous functions $f$ on $X$ such that $\mu(f \ne 0) < \infty$ is dense in $L^p(\mu)$.
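Proof. Let $S := BC_f(X)$, $I(f) := \int_X f\,d\mu$ for $f \in S$ and $X_n \in \tau$ be chosen so that $\mu(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. Then $1 \wedge f \in S$ for all $f \in S$, and if $\varphi_n = 1 \wedge (n\,d_{X_n^c}) \in S^+$, then $\varphi_n \uparrow 1$ as $n \to \infty$ and so by Remark 32.45 there exists $\chi \in S_\uparrow$ such that $\chi > 0$ on $X$ and $I(\chi) < \infty$. Similarly if $V \in \tau$, the function $g_n := 1 \wedge (n\,d_{(X_n \cap V)^c}) \in S$ and $g_n \uparrow 1_V$ as $n \to \infty$, showing $\sigma(S) = \mathcal{B}_X$. If $f_n \in S^+$ and $f_n \downarrow 0$ as $n \to \infty$, it follows by the dominated convergence theorem that $I(f_n) \downarrow 0$ as $n \to \infty$. So the hypotheses of the Daniell-Stone Theorem 32.44 hold and hence $\mu$ is the unique measure on $\mathcal{B}_X$ such that $I = I_\mu$, and for $B \in \mathcal{B}_X$,
$$\mu(B) = \bar{I}(1_B) = \inf\{I(f) : f \in S_\uparrow \text{ with } 1_B \le f\} = \inf\Big\{\int_X f\,d\mu : f \in S_\uparrow \text{ with } 1_B \le f\Big\}.$$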
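Suppose $\varepsilon > 0$ and $B \in \mathcal{B}_X$ are given. There exist $f_n \in BC_f(X)$ such that $f_n \uparrow f$, $1_B \le f$, and $\mu(f) < \mu(B) + \varepsilon$. The condition $1_B \le f$ implies $1_B \le 1_{\{f \ge 1\}} \le f$ and hence that
$$\mu(B) \le \mu(f \ge 1) \le \mu(f) < \mu(B) + \varepsilon. \tag{32.31}$$
Moreover, letting $V_m := \bigcup_{n=1}^{\infty} \{f_n > 1 - 1/m\} \in \tau_d$, we have $V_m \downarrow \{f \ge 1\} \supset B$, hence $\mu(V_m) \downarrow \mu(f \ge 1) \ge \mu(B)$ as $m \to \infty$. Combining this observation with Eq. (32.31), we may choose $m$ sufficiently large so that $B \subset V_m$ and
$$\mu(V_m \setminus B) = \mu(V_m) - \mu(B) < \varepsilon.$$
Hence there exists $V \in \tau$ such that $B \subset V$ and $\mu(V \setminus B) < \varepsilon$. Applying this result to $B^c$ shows there exists a closed set $F \subset X$ such that $B^c \subset F^c$ and
$$\mu(B \setminus F) = \mu(F^c \setminus B^c) < \varepsilon.$$
So we have produced $F \subset B \subset V$ such that $\mu(V \setminus F) = \mu(V \setminus B) + \mu(B \setminus F) < 2\varepsilon$. The second assertion is an easy consequence of the first, and the third follows in a similar manner to any of the proofs of Item 7. in Theorem 32.47.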
32.9 General Product Measures
In this section we drop the topological assumptions used in the last section.
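Theorem 32.65. Let $\{(X_\alpha, \mathcal{M}_\alpha, \mu_\alpha)\}_{\alpha \in A}$ be a collection of probability spaces, that is $\mu_\alpha(X_\alpha) = 1$ for all $\alpha \in A$. Let $X := \prod_{\alpha \in A} X_\alpha$, $\mathcal{M} = \sigma(\pi_\alpha : \alpha \in A)$ and for $\Lambda \subset\subset A$ let $X_\Lambda := \prod_{\alpha \in \Lambda} X_\alpha$, let $\pi_\Lambda : X \to X_\Lambda$ be the projection map $\pi_\Lambda(x) = x|_\Lambda$, and let $\mu_\Lambda := \prod_{\alpha \in \Lambda} \mu_\alpha$ be product measure on $\mathcal{M}_\Lambda := \otimes_{\alpha \in \Lambda} \mathcal{M}_\alpha$. Then there exists a unique measure $\mu$ on $\mathcal{M}$ such that $(\pi_\Lambda)_* \mu = \mu_\Lambda$ for all $\Lambda \subset\subset A$, i.e. if $f : X_\Lambda \to \mathbb{R}$ is a bounded measurable function then
$$\int_X f(\pi_\Lambda(x))\,d\mu(x) = \int_{X_\Lambda} f(y)\,d\mu_\Lambda(y). \tag{32.32}$$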
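Proof. Let $S$ denote the collection of functions $f : X \to \mathbb{R}$ such that there exist $\Lambda \subset\subset A$ and a bounded measurable function $F : X_\Lambda \to \mathbb{R}$ such that $f = F \circ \pi_\Lambda$. For $f = F \circ \pi_\Lambda \in S$, let $I(f) = \int_{X_\Lambda} F\,d\mu_\Lambda$. Let us verify that $I$ is well defined. Suppose that $f$ may also be expressed as $f = G \circ \pi_\Gamma$ with $\Gamma \subset\subset A$ and $G : X_\Gamma \to \mathbb{R}$ bounded and measurable. By replacing $\Gamma$ by $\Gamma \cup \Lambda$ if necessary, we may assume that $\Lambda \subset \Gamma$, so that $G(x, y) = F(x)$ for $x \in X_\Lambda$ and $y \in X_{\Gamma \setminus \Lambda}$. Making use of Fubini's theorem we learn
$$\int_{X_\Gamma} G(z)\,d\mu_\Gamma(z) = \int_{X_\Lambda \times X_{\Gamma \setminus \Lambda}} F(x)\,d\mu_\Lambda(x)\,d\mu_{\Gamma \setminus \Lambda}(y) = \int_{X_\Lambda} F(x)\,d\mu_\Lambda(x) \cdot \int_{X_{\Gamma \setminus \Lambda}} d\mu_{\Gamma \setminus \Lambda}(y)$$
$$= \mu_{\Gamma \setminus \Lambda}(X_{\Gamma \setminus \Lambda}) \cdot \int_{X_\Lambda} F(x)\,d\mu_\Lambda(x) = \int_{X_\Lambda} F(x)\,d\mu_\Lambda(x),$$
wherein we have used the fact that $\mu_\Lambda(X_\Lambda) = 1$ for all $\Lambda \subset\subset A$ since $\mu_\alpha(X_\alpha) = 1$ for all $\alpha \in A$. It is now easy to check that $I$ is a positive linear functional on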
the lattice S. We will now show that I is a Daniell integral. Suppose that f
n

S
+
is a decreasing sequence such that inf
n
I(f
n
) = > 0. We need to show
f := lim
n
f
n
is not identically zero. As in the proof that I is well dened,
there exists
n
A and bounded measurable functions F
n
: X
n
[0, )
such that
n
is increasing in n and f
n
= F
n

n
for each n. For k n, let
F
k
n
: X
k
[0, ) be the bounded measurable function
F
k
n
(x) =
_
X
n\
k
F
n
(x y)d
n\
k
(y)
where xy X
n
is dened by (x y) () = x() if
k
and (x y) () =
y() for
n
k
. By convention we set F
n
n
= F
n
. Since f
n
is decreasing it
follows that F
k
n+1
F
k
n
for all k and n k and therefore F
k
:= lim
n
F
k
n
exists. By Fubinis theorem,
F
k
n
(x) =
_
X
n\
k
F
k+1
n
(x y)d
k+1
\
k
(y) when k + 1 n
and hence letting n in this equation shows
F
k
(x) =
_
X
n\
k
F
k+1
(x y)d
k+1
\
k
(y) (32.33)
for all k. Now
_
X
1
F
1
(x)d
1
(x) = lim
n
_
X
1
F
1
n
(x)d
1
(x) = lim
n
I(f
n
) = > 0
so there exists
x
1
X
1
such that F
1
(x
1
) .
From Eq. (32.33) with k = 1 and x = x
1
it follows that

_
X
2
\
1
F
2
(x
1
y)d
2\1
(y)
and hence there exists
x
2
X
2\1
such that F
2
(x
1
x
2
) .
Working this way inductively using Eq. (32.33) implies there exists
x
i
X
i\i1
such that F
n
(x
1
x
2
x
n
)
for all n. Now F
n
k
F
n
for all k n and in particular for k = n, thus
F
n
(x
1
x
2
x
n
) = F
n
n
(x
1
x
2
x
n
)
F
n
(x
1
x
2
x
n
) (32.34)
for all n. Let x X be any point such that
n
(x) = x
1
x
2
x
n
for all n. From Eq. (32.34) it follows that
f
n
(x) = F
n

n
(x) = F
n
(x
1
x
2
x
n
)
for all n and therefore f(x) := lim
n
f
n
(x) showing f is not zero.
Therefore, I is a Daniell integral and there exists by Theorem 32.47 a unique
measure on (X, (S) = /) such that
I(f) =
_
X
fd for all f S.
Taking f = 1
A

in this equation implies
(A) = I(f) =
1
(A)
and the result is proved.
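The well-definedness step at the start of the proof can be checked by hand on finite spaces. The following is a minimal sketch, assuming three hypothetical finite probability spaces indexed by "a", "b", "c" (the names, the masses, and the helper `integrate` are illustrative and not from the text). It verifies that a function depending only on the coordinates in $\Lambda = \{a, b\}$ has the same integral whether it is integrated against $\mu_\Lambda$ or, viewed as a function on $X_\Gamma$ with $\Gamma = \{a, b, c\}$, against $\mu_\Gamma$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite probability spaces (X_alpha, mu_alpha), stored as {point: mass}.
spaces = {
    "a": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "b": {0: Fraction(1, 3), 1: Fraction(2, 3)},
    "c": {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)},
}

def integrate(F, index_set):
    """Integrate F over X_Lambda against the product measure mu_Lambda.

    F receives a dict {index: point}; index_set plays the role of Lambda.
    """
    idx = sorted(index_set)
    total = Fraction(0)
    for point in product(*(spaces[i] for i in idx)):
        mass = Fraction(1)
        for i, p in zip(idx, point):
            mass *= spaces[i][p]          # product measure of a single point
        total += mass * F(dict(zip(idx, point)))
    return total

def F(x):
    # Depends only on the coordinates in Lambda = {a, b}.
    return x["a"] + 2 * x["b"]

def G(z):
    # The same function viewed on X_Gamma, Gamma = {a, b, c}: G = F o (restriction).
    return F({"a": z["a"], "b": z["b"]})

lhs = integrate(G, {"a", "b", "c"})   # integral of G against mu_Gamma
rhs = integrate(F, {"a", "b"})        # integral of F against mu_Lambda
print(lhs, rhs)
```

Both integrals agree because the extra factor contributes total mass one, which is exactly the Fubini computation in the proof; exact rational arithmetic is used so the comparison is not affected by rounding.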
Remark 32.66. (Notion of kernel needs more explanation here.) The above
theorem may be Jazzed up as follows. Let (X
, /
)
A
be a collection
of measurable spaces. Suppose for each pair A there is a kernel
,
(x, dy) for x X
and y X
\
such that if K A then
,K
(x, dy dz) =
,
(x, dy)
,K
(x y, dz).
Then there exists a unique measure on / such that
_
X
f(
(x))d(x) =
_
X
f(y)d
,
(y)
for all A and f : X
1 bounded and measurable. To prove this asser-

tion, just use the proof of Theorem 32.65 replacing
\
(dy) by
,
(x, dy)
everywhere in the proof.
32.10 Daniell Integral approach to dual spaces
BRUCE: compare and consolidate with Section 28.2.2.
Proposition 32.67. Let $S$ be a vector lattice of bounded real functions on a set $X$. We equip $S$ with the sup-norm topology and suppose $I \in S^*$. Then there exist $I_\pm \in S^*$ which are positive such that $I = I_+ - I_-$.
Proof. For $f \in S^+$, let
$$I_+(f) := \sup\{I(g) : g \in S^+ \text{ and } g \le f\}.$$
One easily sees that $|I_+(f)| \le \|I\| \|f\|$ for all $f \in S^+$ and $I_+(cf) = c I_+(f)$ for all $f \in S^+$ and $c > 0$. Let $f_1, f_2 \in S^+$. Then for any $g_i \in S^+$ such that $g_i \le f_i$, we have $S^+ \ni g_1 + g_2 \le f_1 + f_2$ and hence
$$I(g_1) + I(g_2) = I(g_1 + g_2) \le I_+(f_1 + f_2).$$
Therefore,
$$I_+(f_1) + I_+(f_2) = \sup\{I(g_1) + I(g_2) : S^+ \ni g_i \le f_i\} \le I_+(f_1 + f_2). \tag{32.35}$$
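For the opposite inequality, suppose $g \in S^+$ and $g \le f_1 + f_2$. Let $g_1 = f_1 \wedge g$, then
$$0 \le g_2 := g - g_1 = g - f_1 \wedge g = \begin{cases} 0 & \text{if } g \le f_1 \\ g - f_1 & \text{if } g \ge f_1 \end{cases} \le \begin{cases} 0 & \text{if } g \le f_1 \\ f_1 + f_2 - f_1 & \text{if } g \ge f_1 \end{cases} \le f_2.$$
Since $g = g_1 + g_2$ with $S^+ \ni g_i \le f_i$,
$$I(g) = I(g_1) + I(g_2) \le I_+(f_1) + I_+(f_2)$$
and since $S^+ \ni g \le f_1 + f_2$ was arbitrary, we may conclude
$$I_+(f_1 + f_2) \le I_+(f_1) + I_+(f_2). \tag{32.36}$$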
Combining Eqs. (32.35) and (32.36) shows that
$$I_+(f_1 + f_2) = I_+(f_1) + I_+(f_2) \text{ for all } f_i \in S^+. \tag{32.37}$$
We now extend $I_+$ to $S$ by defining, for $f \in S$,
$$I_+(f) = I_+(f_+) - I_+(f_-)$$
where $f_+ = f \vee 0$ and $f_- = -(f \wedge 0) = (-f) \vee 0$. (Notice that $f = f_+ - f_-$.) We will now show that $I_+$ is linear. If $c \ge 0$, we may use $(cf)_\pm = c f_\pm$ to conclude that
$$I_+(cf) = I_+(c f_+) - I_+(c f_-) = c I_+(f_+) - c I_+(f_-) = c I_+(f).$$
Similarly, using $(-f)_\pm = f_\mp$ it follows that $I_+(-f) = I_+(f_-) - I_+(f_+) = -I_+(f)$. Therefore we have shown
$$I_+(cf) = c I_+(f) \text{ for all } c \in \mathbb{R} \text{ and } f \in S.$$
If $f = u - v$ with $u, v \in S^+$ then
$$v + f_+ = u + f_- \in S^+$$
and so by Eq. (32.37), $I_+(v) + I_+(f_+) = I_+(u) + I_+(f_-)$ or equivalently
$$I_+(f) = I_+(f_+) - I_+(f_-) = I_+(u) - I_+(v). \tag{32.38}$$
Now if $f, g \in S$, then
$$I_+(f + g) = I_+(f_+ + g_+ - (f_- + g_-)) = I_+(f_+ + g_+) - I_+(f_- + g_-)$$
$$= I_+(f_+) + I_+(g_+) - I_+(f_-) - I_+(g_-) = I_+(f) + I_+(g),$$
wherein the second equality we used Eq. (32.38). The last two paragraphs show $I_+ : S \to \mathbb{R}$ is linear. Moreover,
$$|I_+(f)| = |I_+(f_+) - I_+(f_-)| \le \max(|I_+(f_+)|, |I_+(f_-)|) \le \|I\| \max(\|f_+\|, \|f_-\|) = \|I\| \|f\|$$
which shows that $\|I_+\| \le \|I\|$. That is, $I_+$ is a bounded positive linear functional on $S$. Let $I_- = I_+ - I \in S^*$. Then by definition of $I_+(f)$, $I_-(f) = I_+(f) - I(f) \ge 0$ for all $S \ni f \ge 0$. Therefore $I = I_+ - I_-$ with $I_\pm$ being positive linear functionals on $S$.
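The construction of $I_+$ is easy to test when $X$ is a finite set, where every bounded function is a vector and every bounded linear functional is $I(f) = \sum_i w_i f_i$. The sketch below is an illustration only; the weight vector `w` and the test functions are hypothetical choices. It computes $I_+(f) = \sup\{I(g) : 0 \le g \le f\}$ by brute force, using the fact that a linear functional on the box $\{0 \le g \le f\}$ attains its supremum at a vertex, and then checks the additivity (32.37) and the positivity of $I_- = I_+ - I$ on one example.

```python
from itertools import product

# Hypothetical functional on functions f: {0,1,2,3} -> R, I(f) = sum_i w[i] * f[i].
w = [2, -3, 1, -1]

def I(f):
    return sum(wi * fi for wi, fi in zip(w, f))

def I_plus(f):
    """sup{ I(g) : 0 <= g <= f } for f >= 0.

    I is linear, so the supremum over the box [0, f] is attained at a vertex,
    i.e. at some g with g[i] in {0, f[i]} for every i.
    """
    return max(
        I([c * fi for c, fi in zip(choice, f)])
        for choice in product((0, 1), repeat=len(f))
    )

def I_minus(f):
    return I_plus(f) - I(f)

f1 = [1, 2, 0, 4]
f2 = [3, 1, 2, 0]
f_sum = [a + b for a, b in zip(f1, f2)]

print(I_plus(f1), I_plus(f2), I_plus(f_sum))   # additivity on S^+
print(I(f1), I_minus(f1))                      # I = I_plus - I_minus, I_minus >= 0
```

In this finite setting $I_+$ simply keeps the positive weights and $I_-$ the absolute values of the negative ones, which is the discrete analogue of the Jordan decomposition appearing in Proposition 32.69.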

Corollary 32.68. Suppose $X$ is a second countable locally compact Hausdorff space and $I \in C_0(X, \mathbb{R})^*$. Then there exists $\mu = \mu_+ - \mu_-$, where $\mu$ is a finite signed measure on $\mathcal{B}_X$, such that $I(f) = \int_X f\,d\mu$ for all $f \in C_0(X, \mathbb{R})$. Similarly if $I \in C_0(X, \mathbb{C})^*$ there exists a complex measure $\mu$ such that $I(f) = \int_X f\,d\mu$ for all $f \in C_0(X, \mathbb{C})$. TODO: Add in the isometry statement here.
Proof. Let $I = I_+ - I_-$ be the decomposition given as above. Then we know there exist finite measures $\mu_\pm$ such that
$$I_\pm(f) = \int_X f\,d\mu_\pm \text{ for all } f \in C_0(X, \mathbb{R}),$$
and therefore $I(f) = \int_X f\,d\mu$ for all $f \in C_0(X, \mathbb{R})$ where $\mu = \mu_+ - \mu_-$. Moreover the measure $\mu$ is unique. Indeed if $I(f) = \int_X f\,d\mu$ for some finite signed measure $\mu$, then the next result shows that $I_\pm(f) = \int_X f\,d\mu_\pm$ where $\mu_\pm$ is the Hahn decomposition of $\mu$. Now the measures $\mu_\pm$ are uniquely determined by $I_\pm$. The complex case is a consequence of applying the real case just proved to $\operatorname{Re} I$ and $\operatorname{Im} I$.
Proposition 32.69. Suppose that $\mu$ is a signed Radon measure and $I = I_\mu$. Let $\mu_+$ and $\mu_-$ be the Radon measures associated to $I_\pm$; then $\mu = \mu_+ - \mu_-$ is the Jordan decomposition of $\mu$.
Proof. Let X = P P
c
where P is a positive set for and P
c
is a negative
set. Then for A B
X
,
(P A) =
+
(P A)
(P A)
+
(P A)
+
(A). (32.39)
To nish the proof we need only prove the reverse inequality. To this end let
> 0 and choose K P A U
o
X such that [[ (U K) < . Let
f, g C
c
(U, [0, 1]) with f g, then
I(f) = (f) = (f : K) +(f : U K) (g : K) +O()
(K) +O() (P A) +O() .
Taking the supremum over all such f g, we learn that I
+
(g) (P A) +
O() and then taking the supremum over all such g shows that
+
(U) (P A) +O() .
Taking the inmum over all U
o
X such that P A U shows that
+
(P A) (P A) +O() (32.40)
From Eqs. (32.39) and (32.40) it follows that (P A) =
+
(P A). Since
I
(f) = sup
0gf
I(g)I(f) = sup
0gf
I(g f) = sup
0gf
I(f g) = sup
0hf
I(h)
the same argument applied to I shows that
(P
c
A) =
(P
c
A).
Since
(A) = (P A) +(P
c
A) =
+
(P A)
(P
c
A) and
(A) =
+
(A)
(A)
it follows that
+
(A P) =
(A P
c
) =
(A P).
Taking A = P then shows that
(P) = 0 and taking A = P

c
shows that
+
(P
c
) = 0 and hence
(P A) =
+
(P A) =
+
(A) and
(P
c
A) =
(P
c
A) =
(A)
as was to be proved.
33
Class Arguments
33.1 Monotone Class and π-λ Theorems
Definition 33.1. Let $\mathcal{C} \subset 2^X$ be a collection of sets.
1. $\mathcal{C}$ is a monotone class if it is closed under countable increasing unions and countable decreasing intersections,
2. $\mathcal{C}$ is a $\pi$-class if it is closed under finite intersections and
3. $\mathcal{C}$ is a $\lambda$-class if $\mathcal{C}$ satisfies the following properties:
a) $X \in \mathcal{C}$
b) If $A, B \in \mathcal{C}$ and $A \cap B = \emptyset$, then $A \cup B \in \mathcal{C}$. (Closed under disjoint unions.)
c) If $A, B \in \mathcal{C}$ and $A \supset B$, then $A \setminus B \in \mathcal{C}$. (Closed under proper differences.)
d) If $A_n \in \mathcal{C}$ and $A_n \uparrow A$, then $A \in \mathcal{C}$. (Closed under countable increasing unions.)
4. $\mathcal{C}$ is a $\lambda_0$-class if $\mathcal{C}$ satisfies conditions a) - c) but not necessarily d).
Remark 33.2. Notice that every $\lambda$-class is also a monotone class.
(The reader wishing to shortcut this section may jump to Theorem 33.5
where he/she should then only read the second proof.)
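To make the definitions concrete, here is a small sketch on the finite set $X = \{1, 2, 3, 4\}$; the helper names `is_pi_class` and `is_lambda_class` are hypothetical. On a finite set condition d) is automatic, since an increasing sequence in a finite family is eventually constant. The family of subsets of even cardinality is a standard example of a $\lambda$-class which is not a $\pi$-class.

```python
from itertools import combinations

X = frozenset({1, 2, 3, 4})

def is_pi_class(family):
    """Closed under finite (pairwise) intersections."""
    return all((A & B) in family for A in family for B in family)

def is_lambda_class(family):
    """Conditions a), b), c) of the definition; d) is automatic when X is finite."""
    if X not in family:
        return False
    for A in family:
        for B in family:
            if not (A & B) and (A | B) not in family:   # disjoint unions
                return False
            if B <= A and (A - B) not in family:        # proper differences
                return False
    return True

# All subsets of X with an even number of elements.
even = {frozenset(s) for r in (0, 2, 4) for s in combinations(sorted(X), r)}

print(is_lambda_class(even), is_pi_class(even))
```

The failure of the $\pi$-property is visible from $\{1,2\} \cap \{2,3\} = \{2\}$, which has odd cardinality.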
Lemma 33.3 (Monotone Class Theorem). Suppose $\mathcal{A} \subset 2^X$ is an algebra and $\mathcal{C}$ is the smallest monotone class containing $\mathcal{A}$. Then $\mathcal{C} = \sigma(\mathcal{A})$.
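Proof. For $C \in \mathcal{C}$ let
$$\mathcal{C}(C) = \{B \in \mathcal{C} : C \cap B,\ C \cap B^c,\ B \cap C^c \in \mathcal{C}\},$$
then $\mathcal{C}(C)$ is a monotone class. Indeed, if $B_n \in \mathcal{C}(C)$ and $B_n \uparrow B$, then $B_n^c \downarrow B^c$ and so
$$\mathcal{C} \ni C \cap B_n \uparrow C \cap B, \quad \mathcal{C} \ni C \cap B_n^c \downarrow C \cap B^c \quad \text{and} \quad \mathcal{C} \ni B_n \cap C^c \uparrow B \cap C^c.$$
Since $\mathcal{C}$ is a monotone class, it follows that $C \cap B,\ C \cap B^c,\ B \cap C^c \in \mathcal{C}$, i.e. $B \in \mathcal{C}(C)$. This shows that $\mathcal{C}(C)$ is closed under increasing limits and a similar argument shows that $\mathcal{C}(C)$ is closed under decreasing limits. Thus we have shown that $\mathcal{C}(C)$ is a monotone class for all $C \in \mathcal{C}$.

If $A \in \mathcal{A} \subset \mathcal{C}$, then $A \cap B,\ A \cap B^c,\ B \cap A^c \in \mathcal{A} \subset \mathcal{C}$ for all $B \in \mathcal{A}$ and hence it follows that $\mathcal{A} \subset \mathcal{C}(A) \subset \mathcal{C}$. Since $\mathcal{C}$ is the smallest monotone class containing $\mathcal{A}$ and $\mathcal{C}(A)$ is a monotone class containing $\mathcal{A}$, we conclude that $\mathcal{C}(A) = \mathcal{C}$ for any $A \in \mathcal{A}$. Let $B \in \mathcal{C}$ and notice that $A \in \mathcal{C}(B)$ happens iff $B \in \mathcal{C}(A)$. This observation and the fact that $\mathcal{C}(A) = \mathcal{C}$ for all $A \in \mathcal{A}$ implies $\mathcal{A} \subset \mathcal{C}(B) \subset \mathcal{C}$ for all $B \in \mathcal{C}$. Again since $\mathcal{C}$ is the smallest monotone class containing $\mathcal{A}$ and $\mathcal{C}(B)$ is a monotone class we conclude that $\mathcal{C}(B) = \mathcal{C}$ for all $B \in \mathcal{C}$. That is to say, if $A, B \in \mathcal{C}$ then $A \in \mathcal{C} = \mathcal{C}(B)$ and hence $A \cap B,\ A \cap B^c,\ A^c \cap B \in \mathcal{C}$. So $\mathcal{C}$ is closed under complements (since $X \in \mathcal{A} \subset \mathcal{C}$), finite intersections and increasing unions, from which it easily follows that $\mathcal{C}$ is a $\sigma$-algebra.

Let $\mathcal{E} \subset 2^{X \times Y}$ be given by
$$\mathcal{E} = \mathcal{M} \times \mathcal{N} = \{A \times B : A \in \mathcal{M},\ B \in \mathcal{N}\}$$
and recall from Exercise 18.2 that $\mathcal{E}$ is an elementary family. Hence the algebra $\mathcal{A} = \mathcal{A}(\mathcal{E})$ generated by $\mathcal{E}$ consists of sets which may be written as disjoint unions of sets from $\mathcal{E}$.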
Lemma 33.4. If $\mathcal{D}$ is a $\lambda_0$-class which contains a $\pi$-class, $\mathcal{C}$, then $\mathcal{D}$ contains $\mathcal{A}(\mathcal{C})$, the algebra generated by $\mathcal{C}$.
Proof. We will give two proofs of this lemma. The rst proof is construc-
tive and makes use of Proposition 18.6 which tells how to construct /(()
from (. The key to the rst proof is the following claim which will be proved
by induction.
Claim. Let

(
0
= ( and

(
n
denote the collection of subsets of X of the
form
A
c
1
A
c
n
B = B A
1
A
2
A
n
. (33.1)
with A
i
( and B ( X . Then

(
n
T for all n, i.e.

( :=
n=0

(
n
T.
By assumption

(
0
T and when n = 1,
B A
1
= B (A
1
B) T
when A
1
, B ( T since A
1
B ( T. Therefore,

(
1
T. For the
induction step, let B ( X and A
i
( X and let E
n
denote the set
in Eq. (33.1) We now assume

(
n
T and wish to show E
n+1
T, where
E
n+1
= E
n
A
n+1
= E
n
(A
n+1
E
n
).
Because
A
n+1
E
n
= A
c
1
A
c
n
(B A
n+1
)

(
n
T
and (A
n+1
E
n
) E
n

(
n
T, we have E
n+1
T as well. This nishes
the proof of the claim.
Notice that

( is still a multiplicative class and from Proposition 18.6 (using
the fact that ( is a multiplicative class), /(() consists of nite unions of
elements from

(. By applying the claim to

(, A
c
1
A
c
n
T for all A
i

(
and hence
A
1
A
n
= (A
c
1
A
c
n
)
c
T.
Thus we have shown /(() T which completes the proof.
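Second Proof. Without loss of generality, we may assume that $\mathcal{D}$ is the smallest $\lambda_0$-class containing $\mathcal{C}$, for if not just replace $\mathcal{D}$ by the intersection of all $\lambda_0$-classes containing $\mathcal{C}$. Let
$$\mathcal{D}_1 := \{A \in \mathcal{D} : A \cap C \in \mathcal{D}\ \forall\, C \in \mathcal{C}\}.$$
Then $\mathcal{C} \subset \mathcal{D}_1$ and $\mathcal{D}_1$ is also a $\lambda_0$-class as we now check. a) $X \in \mathcal{D}_1$. b) If $A, B \in \mathcal{D}_1$ with $A \cap B = \emptyset$, then $(A \cup B) \cap C = (A \cap C) \sqcup (B \cap C) \in \mathcal{D}$ for all $C \in \mathcal{C}$. c) If $A, B \in \mathcal{D}_1$ with $B \subset A$, then $(A \setminus B) \cap C = A \cap C \setminus (B \cap C) \in \mathcal{D}$ for all $C \in \mathcal{C}$. Since $\mathcal{C} \subset \mathcal{D}_1 \subset \mathcal{D}$ and $\mathcal{D}$ is the smallest $\lambda_0$-class containing $\mathcal{C}$ it follows that $\mathcal{D}_1 = \mathcal{D}$. From this we conclude that if $A \in \mathcal{D}$ and $B \in \mathcal{C}$ then $A \cap B \in \mathcal{D}$. Let
$$\mathcal{D}_2 := \{A \in \mathcal{D} : A \cap D \in \mathcal{D}\ \forall\, D \in \mathcal{D}\}.$$
Then $\mathcal{D}_2$ is a $\lambda_0$-class (as you should check) which, by the above paragraph, contains $\mathcal{C}$. As above this implies that $\mathcal{D} = \mathcal{D}_2$, i.e. we have shown that $\mathcal{D}$ is closed under finite intersections. Since $\lambda_0$-classes are closed under complementation, $\mathcal{D}$ is an algebra and hence $\mathcal{A}(\mathcal{C}) \subset \mathcal{D}$. In fact $\mathcal{D} = \mathcal{A}(\mathcal{C})$.
This Lemma along with the monotone class theorem immediately implies Dynkin's very useful "π-λ theorem."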
Theorem 33.5 (π-λ Theorem). If $\mathcal{D}$ is a $\lambda$-class which contains a $\pi$-class, $\mathcal{C}$, then $\sigma(\mathcal{C}) \subset \mathcal{D}$.
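Proof. First Proof. Since $\mathcal{D}$ is a $\lambda_0$-class, Lemma 33.4 implies that $\mathcal{A}(\mathcal{C}) \subset \mathcal{D}$ and so by Remark 33.2 and Lemma 33.3, $\sigma(\mathcal{C}) \subset \mathcal{D}$. Let us pause to give a second, stand-alone, proof of this Theorem.
Second Proof. Without loss of generality, we may assume that $\mathcal{D}$ is the smallest $\lambda$-class containing $\mathcal{C}$, for if not just replace $\mathcal{D}$ by the intersection of all $\lambda$-classes containing $\mathcal{C}$. Let
$$\mathcal{D}_1 := \{A \in \mathcal{D} : A \cap C \in \mathcal{D}\ \forall\, C \in \mathcal{C}\}.$$
Then $\mathcal{C} \subset \mathcal{D}_1$ and $\mathcal{D}_1$ is also a $\lambda$-class, as we now check. a) $X \in \mathcal{D}_1$. b) If $A, B \in \mathcal{D}_1$ with $A \cap B = \emptyset$, then $(A \cup B) \cap C = (A \cap C) \sqcup (B \cap C) \in \mathcal{D}$ for all $C \in \mathcal{C}$. c) If $A, B \in \mathcal{D}_1$ with $B \subset A$, then $(A \setminus B) \cap C = A \cap C \setminus (B \cap C) \in \mathcal{D}$ for all $C \in \mathcal{C}$. d) If $A_n \in \mathcal{D}_1$ and $A_n \uparrow A$ as $n \to \infty$, then $A_n \cap C \in \mathcal{D}$ for all $C \in \mathcal{C}$ and hence $A_n \cap C \uparrow A \cap C \in \mathcal{D}$. Since $\mathcal{C} \subset \mathcal{D}_1 \subset \mathcal{D}$ and $\mathcal{D}$ is the smallest $\lambda$-class containing $\mathcal{C}$ it follows that $\mathcal{D}_1 = \mathcal{D}$. From this we conclude that if $A \in \mathcal{D}$ and $B \in \mathcal{C}$ then $A \cap B \in \mathcal{D}$.
Let
$$\mathcal{D}_2 := \{A \in \mathcal{D} : A \cap D \in \mathcal{D}\ \forall\, D \in \mathcal{D}\}.$$
Then $\mathcal{D}_2$ is a $\lambda$-class (as you should check) which, by the above paragraph, contains $\mathcal{C}$. As above this implies that $\mathcal{D} = \mathcal{D}_2$, i.e. we have shown that $\mathcal{D}$ is closed under finite intersections. Since $\lambda$-classes are closed under complementation, $\mathcal{D}$ is an algebra which is closed under increasing unions and hence is closed under arbitrary countable unions, i.e. $\mathcal{D}$ is a $\sigma$-algebra. Since $\mathcal{C} \subset \mathcal{D}$ we must have $\sigma(\mathcal{C}) \subset \mathcal{D}$ and in fact $\sigma(\mathcal{C}) = \mathcal{D}$.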
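On a finite set the smallest $\lambda$-class and the generated $\sigma$-algebra can both be computed by brute-force closure, which gives a quick sanity check of the theorem. The sketch below is illustrative only: the functions `lambda_closure` and `sigma_closure` and the families `P` and `Q` are hypothetical names. For a $\pi$-class `P` the two closures coincide, while for a family `Q` that is not closed under intersections the $\lambda$-closure can be strictly smaller.

```python
X = frozenset(range(4))

def lambda_closure(family):
    """Smallest family containing `family` and X that is closed under
    disjoint unions and proper differences (a lambda-class, X being finite)."""
    L = set(family) | {X}
    changed = True
    while changed:
        changed = False
        for A in list(L):
            for B in list(L):
                candidates = []
                if not (A & B):
                    candidates.append(A | B)    # disjoint union
                if B <= A:
                    candidates.append(A - B)    # proper difference
                for E in candidates:
                    if E not in L:
                        L.add(E)
                        changed = True
    return L

def sigma_closure(family):
    """Smallest family containing `family` and X that is closed under
    complements and pairwise unions (a sigma-algebra, X being finite)."""
    S = set(family) | {X}
    changed = True
    while changed:
        changed = False
        for A in list(S):
            for B in list(S):
                for E in (X - A, A | B):
                    if E not in S:
                        S.add(E)
                        changed = True
    return S

# A chain of sets is closed under intersections, hence a pi-class.
P = {frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})}
# Not a pi-class: {0,1} & {1,2} = {1} is missing.
Q = {frozenset({0, 1}), frozenset({1, 2})}

print(len(lambda_closure(P)), len(sigma_closure(P)))
print(len(lambda_closure(Q)), len(sigma_closure(Q)))
```

The second family shows that the $\pi$-class hypothesis cannot simply be dropped: the $\lambda$-class generated by `Q` never produces the singleton $\{1\}$, although the $\sigma$-algebra generated by `Q` contains it.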
33.1.1 Some other proofs of previously proved theorems
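Proof. Other Proof of Corollary 18.54. Let $\mathcal{D} := \{A \subset X : 1_A \in H\}$. Then by assumption $\mathcal{C} \subset \mathcal{D}$ and since $1 \in H$ we know $X \in \mathcal{D}$. If $A, B \in \mathcal{D}$ are disjoint then $1_{A \cup B} = 1_A + 1_B \in H$ so that $A \cup B \in \mathcal{D}$, and if $A, B \in \mathcal{D}$ and $A \subset B$, then $1_{B \setminus A} = 1_B - 1_A \in H$. Finally if $A_n \in \mathcal{D}$ and $A_n \uparrow A$ as $n \to \infty$ then $1_{A_n} \to 1_A$ boundedly so $1_A \in H$ and hence $A \in \mathcal{D}$. So $\mathcal{D}$ is a $\lambda$-class containing $\mathcal{C}$ and hence $\mathcal{D}$ contains $\sigma(\mathcal{C})$. From this it follows that $H$ contains $1_A$ for all $A \in \sigma(\mathcal{C})$ and hence all $\sigma(\mathcal{C})$-measurable simple functions by linearity. The proof is now complete with an application of the approximation Theorem 18.42 along with the assumption that $H$ is closed under bounded convergence.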
Proof. Other Proof of Theorems 18.51 and 18.52. Let F be 1 or C. Let (
be the family of all sets of the form:
B := x X : f
1
(x) R
1
, . . . , f
m
(x) R
m
(33.2)
where m = 1, 2, . . . , and for k = 1, 2, . . . , m, f
k
M and R
k
is an open inter-
val if F = 1 or R
k
is an open rectangle in C if F = C. The family ( is easily
seen to be a system such that (M) = ((). So By Corollary 18.54, to n-
ish the proof it suces to show 1
B
H for all B (. It is easy to construct,
for each k, a uniformly bounded sequence of continuous functions
_
k
n
_
n=1
on F converging to the characteristic function 1
R
k
. By Weierstrass theo-
rem, there exists polynomials p
k
m
(x) such that

p
k
n
(x)
k
n
(x)
1/n for
[x[ |
k
|
in the real case and polynomials p

k
m
(z, z) in z and z such that
p
k
n
(z, z)
k
n
(z)
1/n for [z[ |

k
|
in the complex case. The functions

F
n
:=p
1
n
(f
1
)p
2
n
(f
2
) . . . p
m
n
(f
m
) (real case)
F
n
:=p
1
n
(f
1

f
1
)p
2
n
(f
2
,

f
2
) . . . p
m
n
(f
m
,

f
m
) (complex case)
on X are uniformly bounded, belong to H and converge pointwise to 1
B
as
n , where B is the set in Eq. (33.2). Thus 1
B
H and the proof is
complete.
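Theorem 33.6 (Uniqueness). Suppose that $\mathcal{E} \subset 2^X$ is an elementary class and $\mathcal{M} = \sigma(\mathcal{E})$ (the $\sigma$-algebra generated by $\mathcal{E}$). If $\mu$ and $\nu$ are two measures on $\mathcal{M}$ which are $\sigma$-finite on $\mathcal{E}$ and such that $\mu = \nu$ on $\mathcal{E}$, then $\mu = \nu$ on $\mathcal{M}$.
Proof. Let $\mathcal{A} := \mathcal{A}(\mathcal{E})$ be the algebra generated by $\mathcal{E}$. Since every element of $\mathcal{A}$ is a disjoint union of elements from $\mathcal{E}$, it is clear that $\mu = \nu$ on $\mathcal{A}$. Henceforth we may assume that $\mathcal{E} = \mathcal{A}$. We begin first with the special case where $\mu(X) < \infty$ and hence $\nu(X) = \mu(X) < \infty$. Let
$$\mathcal{C} = \{A \in \mathcal{M} : \mu(A) = \nu(A)\}.$$
The reader may easily check that $\mathcal{C}$ is a monotone class. Since $\mathcal{A} \subset \mathcal{C}$, the monotone class lemma asserts that $\mathcal{M} = \sigma(\mathcal{A}) \subset \mathcal{C} \subset \mathcal{M}$, showing that $\mathcal{C} = \mathcal{M}$ and hence that $\mu = \nu$ on $\mathcal{M}$. For the $\sigma$-finite case, let $X_n \in \mathcal{A}$ be sets such that $\mu(X_n) = \nu(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. For $n \in \mathbb{N}$, let
$$\mu_n(A) := \mu(A \cap X_n) \text{ and } \nu_n(A) = \nu(A \cap X_n) \tag{33.3}$$
for all $A \in \mathcal{M}$. Then one easily checks that $\mu_n$ and $\nu_n$ are finite measures on $\mathcal{M}$ such that $\mu_n = \nu_n$ on $\mathcal{A}$. Therefore, by what we have just proved, $\mu_n = \nu_n$ on $\mathcal{M}$. Hence for all $A \in \mathcal{M}$, using the continuity of measures,
$$\mu(A) = \lim_{n \to \infty} \mu(A \cap X_n) = \lim_{n \to \infty} \nu(A \cap X_n) = \nu(A).$$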
Using Dynkin's π-λ Theorem 33.5 we may strengthen Theorem 33.6 to the following.
Proof. Second Proof of Theorem 19.55. As in the proof of Theorem 33.6, it suffices to consider the case where $\mu$ and $\nu$ are finite measures such that $\mu(X) = \nu(X) < \infty$. In this case the reader may easily verify from the basic properties of measures that
$$\mathcal{D} = \{A \in \mathcal{M} : \mu(A) = \nu(A)\}$$
is a $\lambda$-class. By assumption $\mathcal{C} \subset \mathcal{D}$ and hence by the π-λ theorem, $\mathcal{D}$ contains $\mathcal{M} = \sigma(\mathcal{C})$.
33.2 Regularity of Measures
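Definition 33.7. Suppose that $\mathcal{E}$ is a collection of subsets of $X$. Let $\mathcal{E}_\sigma$ denote the collection of subsets of $X$ which are finite or countable unions of sets from $\mathcal{E}$. Similarly let $\mathcal{E}_\delta$ denote the collection of subsets of $X$ which are finite or countable intersections of sets from $\mathcal{E}$. We also write $\mathcal{E}_{\sigma\delta}$ for $(\mathcal{E}_\sigma)_\delta$ and $\mathcal{E}_{\delta\sigma}$ for $(\mathcal{E}_\delta)_\sigma$, etc.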
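Remark 33.8. Notice that if $\mathcal{A}$ is an algebra and $C = \bigcup_i C_i$ and $D = \bigcup_j D_j$ with $C_i, D_j \in \mathcal{A}_\sigma$, then
$$C \cap D = \bigcup_{i,j} (C_i \cap D_j) \in \mathcal{A}_\sigma$$
so that $\mathcal{A}_\sigma$ is closed under finite intersections.
The following theorem shows how to recover a measure $\mu$ on $\sigma(\mathcal{A})$ from its values on an algebra $\mathcal{A}$.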
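Theorem 33.9 (Regularity Theorem). Let $\mathcal{A} \subset 2^X$ be an algebra of sets, $\mathcal{M} = \sigma(\mathcal{A})$ and $\mu : \mathcal{M} \to [0, \infty]$ be a measure on $\mathcal{M}$ which is $\sigma$-finite on $\mathcal{A}$. Then for all $A \in \mathcal{M}$,
$$\mu(A) = \inf\{\mu(B) : A \subset B \in \mathcal{A}_\sigma\}. \tag{33.4}$$
Moreover, if $A \in \mathcal{M}$ and $\varepsilon > 0$ are given, then there exists $B \in \mathcal{A}_\sigma$ such that $A \subset B$ and $\mu(B \setminus A) \le \varepsilon$.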
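Proof. For $A \subset X$, define
$$\mu^*(A) = \inf\{\mu(B) : A \subset B \in \mathcal{A}_\sigma\}.$$
We are trying to show $\mu^* = \mu$ on $\mathcal{M}$. We will begin by first assuming that $\mu$ is a finite measure, i.e. $\mu(X) < \infty$. Let
$$\mathcal{D} = \{B \in \mathcal{M} : \mu^*(B) = \mu(B)\} = \{B \in \mathcal{M} : \mu^*(B) \le \mu(B)\}.$$
It is clear that $\mathcal{A} \subset \mathcal{D}$, so the finite case will be finished by showing $\mathcal{D}$ is a monotone class. Suppose $B_n \in \mathcal{D}$, $B_n \uparrow B$ as $n \to \infty$ and let $\varepsilon > 0$ be given. Since $\mu^*(B_n) = \mu(B_n)$ there exists $A_n \in \mathcal{A}_\sigma$ such that $B_n \subset A_n$ and $\mu(A_n) \le \mu(B_n) + \varepsilon 2^{-n}$, i.e.
$$\mu(A_n \setminus B_n) \le \varepsilon 2^{-n}.$$
Let $A = \bigcup_n A_n \in \mathcal{A}_\sigma$; then $B \subset A$ and
$$\mu(A \setminus B) = \mu\Big(\bigcup_n (A_n \setminus B)\Big) \le \sum_{n=1}^{\infty} \mu(A_n \setminus B) \le \sum_{n=1}^{\infty} \mu(A_n \setminus B_n) \le \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon.$$
Therefore,
$$\mu^*(B) \le \mu(A) \le \mu(B) + \varepsilon$$
and since $\varepsilon > 0$ was arbitrary it follows that $B \in \mathcal{D}$. Now suppose that $B_n \in \mathcal{D}$ and $B_n \downarrow B$ as $n \to \infty$ so that
$$\mu(B_n) \downarrow \mu(B) \text{ as } n \to \infty.$$
As above choose $A_n \in \mathcal{A}_\sigma$ such that $B_n \subset A_n$ and
$$0 \le \mu(A_n) - \mu(B_n) = \mu(A_n \setminus B_n) \le 2^{-n}.$$
Combining the previous two equations shows that $\lim_{n \to \infty} \mu(A_n) = \mu(B)$. Since $\mu^*(B) \le \mu(A_n)$ for all $n$, we conclude that $\mu^*(B) \le \mu(B)$, i.e. that $B \in \mathcal{D}$. Since $\mathcal{D}$ is a monotone class containing the algebra $\mathcal{A}$, the monotone class theorem asserts that
$$\mathcal{M} = \sigma(\mathcal{A}) \subset \mathcal{D} \subset \mathcal{M},$$
showing that $\mathcal{D} = \mathcal{M}$ and hence that $\mu^* = \mu$ on $\mathcal{M}$.

For the $\sigma$-finite case, let $X_n \in \mathcal{A}$ be sets such that $\mu(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. Let $\mu_n$ be the finite measure on $\mathcal{M}$ defined by $\mu_n(A) := \mu(A \cap X_n)$ for all $A \in \mathcal{M}$. Suppose that $\varepsilon > 0$ and $A \in \mathcal{M}$ are given. By what we have just proved, there exists $B_n \in \mathcal{A}_\sigma$ such that $A \subset B_n$ and
$$\mu((B_n \cap X_n) \setminus (A \cap X_n)) = \mu_n(B_n \setminus A) \le \varepsilon 2^{-n}.$$
Notice that since $X_n \in \mathcal{A} \subset \mathcal{A}_\sigma$, $B_n \cap X_n \in \mathcal{A}_\sigma$ and
$$B := \bigcup_{n=1}^{\infty} (B_n \cap X_n) \in \mathcal{A}_\sigma.$$
Moreover, $A \subset B$ and
$$\mu(B \setminus A) \le \sum_{n=1}^{\infty} \mu((B_n \cap X_n) \setminus A) \le \sum_{n=1}^{\infty} \mu((B_n \cap X_n) \setminus (A \cap X_n)) \le \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon.$$
Since this implies that
$$\mu(A) \le \mu(B) \le \mu(A) + \varepsilon$$
and $\varepsilon > 0$ is arbitrary, this equation shows that Eq. (33.4) holds.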
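Corollary 33.10. Let $\mathcal{A} \subset 2^X$ be an algebra of sets, $\mathcal{M} = \sigma(\mathcal{A})$ and $\mu : \mathcal{M} \to [0, \infty]$ be a measure on $\mathcal{M}$ which is $\sigma$-finite on $\mathcal{A}$. Then for all $A \in \mathcal{M}$ and $\varepsilon > 0$ there exists $B \in \mathcal{A}_\delta$ such that $B \subset A$ and
$$\mu(A \setminus B) < \varepsilon.$$
Furthermore, for any $B \in \mathcal{M}$ there exist $A \in \mathcal{A}_{\delta\sigma}$ and $C \in \mathcal{A}_{\sigma\delta}$ such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$.
Proof. By Theorem 33.9, there exists $C \in \mathcal{A}_\sigma$ such that $A^c \subset C$ and $\mu(C \setminus A^c) \le \varepsilon$. Let $B = C^c \subset A$ and notice that $B \in \mathcal{A}_\delta$ and that $C \setminus A^c = B^c \cap A = A \setminus B$, so that
$$\mu(A \setminus B) = \mu(C \setminus A^c) \le \varepsilon.$$
Finally, given $B \in \mathcal{M}$, we may choose $A_n \in \mathcal{A}_\delta$ and $C_n \in \mathcal{A}_\sigma$ such that $A_n \subset B \subset C_n$ and $\mu(C_n \setminus B) \le 1/n$ and $\mu(B \setminus A_n) \le 1/n$. By replacing $A_N$ by $\bigcup_{n=1}^{N} A_n$ and $C_N$ by $\bigcap_{n=1}^{N} C_n$, we may assume that $A_n \uparrow$ and $C_n \downarrow$ as $n$ increases. Let $A = \bigcup_n A_n \in \mathcal{A}_{\delta\sigma}$ and $C = \bigcap_n C_n \in \mathcal{A}_{\sigma\delta}$; then $A \subset B \subset C$ and
$$\mu(C \setminus A) = \mu(C \setminus B) + \mu(B \setminus A) \le \mu(C_n \setminus B) + \mu(B \setminus A_n) \le 2/n \to 0 \text{ as } n \to \infty.$$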
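For Exercises 33.1 - 33.3 let $\tau \subset 2^X$ be a topology, $\mathcal{M} = \sigma(\tau)$ and $\mu : \mathcal{M} \to [0, \infty)$ be a finite measure, i.e. $\mu(X) < \infty$.
Exercise 33.1. Let
$$\mathcal{D} := \{A \in \mathcal{M} : \mu(A) = \inf\{\mu(V) : A \subset V \in \tau\}\}. \tag{33.5}$$
1. Show $\mathcal{D}$ may be described as the collection of sets $A \in \mathcal{M}$ such that for all $\varepsilon > 0$ there exists $V \in \tau$ such that $A \subset V$ and $\mu(V \setminus A) < \varepsilon$.
2. Show $\mathcal{D}$ is a monotone class.
Exercise 33.2. Give an example of a topology $\tau$ on $X = \{1, 2\}$ and a measure $\mu$ on $\mathcal{M} = \sigma(\tau)$ such that $\mathcal{D}$ defined in Eq. (33.5) is not $\mathcal{M}$.
Exercise 33.3. Suppose now $\tau \subset 2^X$ is a topology with the property that to every closed set $C \subset X$, there exist $V_n \in \tau$ such that $V_n \downarrow C$ as $n \to \infty$. Let $\mathcal{A} = \mathcal{A}(\tau)$ be the algebra generated by $\tau$.
1. With the aid of Exercise 18.1, show that $\mathcal{A} \subset \mathcal{D}$. Therefore by Exercise 33.1 and the monotone class theorem, $\mathcal{D} = \mathcal{M}$, i.e.
$$\mu(A) = \inf\{\mu(V) : A \subset V \in \tau\}.$$
2. Show this result is equivalent to the following statement: for every $\varepsilon > 0$ and $A \in \mathcal{M}$ there exist a closed set $C$ and an open set $V$ such that $C \subset A \subset V$ and $\mu(V \setminus C) < \varepsilon$. (Hint: Apply part 1. to both $A$ and $A^c$.)
Exercise 33.4 (Generalization to the $\sigma$-finite case). Let $\tau \subset 2^X$ be a topology with the property that to every closed set $F \subset X$, there exist $V_n \in \tau$ such that $V_n \downarrow F$ as $n \to \infty$. Also let $\mathcal{M} = \sigma(\tau)$ and $\mu : \mathcal{M} \to [0, \infty]$ be a measure which is $\sigma$-finite on $\tau$.
1. Show that for all $\varepsilon > 0$ and $A \in \mathcal{M}$ there exists an open set $V \in \tau$ and a closed set $F$ such that $F \subset A \subset V$ and $\mu(V \setminus F) \le \varepsilon$.
2. Let $F_\sigma$ denote the collection of subsets of $X$ which may be written as a countable union of closed sets. Use item 1. to show for all $B \in \mathcal{M}$, there exist $C \in \tau_\delta$ ($\tau_\delta$ is customarily written as $G_\delta$) and $A \in F_\sigma$ such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$.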
33.2.1 Another proof of Theorem 28.22
Proof. The main part of this proof is an application of Exercise 33.4. So we begin by checking the hypothesis of this exercise. Suppose that $C\sqsubset X$ is a closed set, then by assumption there exist compact sets $K_n\sqsubset\sqsubset X$ such that $C^{c}=\bigcup_{n=1}^{\infty}K_n$. Letting $V_N:=\bigcap_{n=1}^{N}K_n^{c}\subset_o X$, by taking complements of the last equality we find that $V_N\downarrow C$ as $N\to\infty$. Also by assumption there exist $K_n\sqsubset\sqsubset X$ such that $K_n\uparrow X$ as $n\to\infty$. For each $x\in K_n$, let $V_x\subset_o X$ be a precompact neighborhood of $x$. By compactness of $K_n$ there is a finite set $\Lambda\subset K_n$ such that $K_n\subset V_n:=\bigcup_{x\in\Lambda}V_x$. Since $\bar V_n=\bigcup_{x\in\Lambda}\bar V_x$ is a finite union of compact sets, $\bar V_n$ is compact and hence $\mu(V_n)\le\mu(\bar V_n)<\infty$. Since $X=\bigcup K_n\subset\bigcup V_n$ we learn that $\mu$ is $\sigma$-finite on open sets of $X$. By Exercise 33.4, we conclude that for all $\varepsilon>0$ and $A\in\mathcal{B}_X$ there exist $V\subset_o X$ and a closed set $F\sqsubset X$ such that $F\subset A\subset V$ and $\mu(V\setminus F)<\varepsilon$. For this $F$ and $V$ we have
\[ \mu(A)\le\mu(V)=\mu(A)+\mu(V\setminus A)\le\mu(A)+\mu(V\setminus F)<\mu(A)+\varepsilon \tag{33.6} \]
and
\[ \mu(F)\le\mu(A)=\mu(F)+\mu(A\setminus F)<\mu(F)+\varepsilon. \tag{33.7} \]
From Eq. (33.6) we see that $\mu$ is outer regular on $\mathcal{B}_X$. To finish the proof of inner regularity, let $K_n\sqsubset\sqsubset X$ be such that $K_n\uparrow X$. If $\mu(A)=\infty$, it follows from Eq. (33.7) that $\mu(F)=\infty$. Since $F\cap K_n\uparrow F$, $\mu(F\cap K_n)\uparrow\infty=\mu(A)$ which shows that $\mu$ is inner regular on $A$ because $F\cap K_n$ is a compact subset of $A$ for each $n$. If $\mu(A)<\infty$, we again have $F\cap K_n\uparrow F$ and hence by Eq. (33.7) for $n$ sufficiently large we still have
\[ \mu(F\cap K_n)\le\mu(A)<\mu(F\cap K_n)+\varepsilon \]
from which it follows that $\mu$ is inner regular on $A$.
Exercise 33.5 (Metric Space Examples). Suppose that $(X,d)$ is a metric space and $\tau_d$ is the topology of $d$-open subsets of $X$. To each set $F\subset X$ and $\varepsilon>0$ let
\[ F_\varepsilon=\{x\in X:d_F(x)<\varepsilon\}=\bigcup_{x\in F}B_x(\varepsilon)\in\tau_d. \]
Show that if $F$ is closed, then $F_\varepsilon\downarrow F$ as $\varepsilon\downarrow 0$ and in particular $V_n:=F_{1/n}\in\tau_d$ are open sets decreasing to $F$. Therefore the results of Exercises 33.3 and 33.4 apply to measures on metric spaces with the Borel $\sigma$-algebra, $\mathcal{B}=\sigma(\tau_d)$.
Corollary 33.11. Let $X\subset\mathbb{R}^n$ be an open set and $\mathcal{B}=\mathcal{B}_X$ be the Borel $\sigma$-algebra on $X$ equipped with the standard topology induced by open balls with respect to the Euclidean distance. Suppose that $\mu:\mathcal{B}\to[0,\infty]$ is a measure such that $\mu(K)<\infty$ whenever $K$ is a compact set.
1. Then for all $A\in\mathcal{B}$ and $\varepsilon>0$ there exist a closed set $F$ and an open set $V$ such that $F\subset A\subset V$ and $\mu(V\setminus F)<\varepsilon$.
2. If $\mu(A)<\infty$, the set $F$ in item 1. may be chosen to be compact.
3. For all $A\in\mathcal{B}$ we may compute $\mu(A)$ using
\[ \mu(A)=\inf\{\mu(V):A\subset V\text{ and }V\text{ is open}\} \tag{33.8} \]
\[ \phantom{\mu(A)}=\sup\{\mu(K):K\subset A\text{ and }K\text{ is compact}\}. \tag{33.9} \]
Proof. For $k\in\mathbb{N}$, let
\[ K_k:=\{x\in X:|x|\le k\text{ and }d_{X^{c}}(x)\ge 1/k\}. \tag{33.10} \]
Then $K_k$ is a closed and bounded subset of $\mathbb{R}^n$ and hence compact. Moreover $K_k^{o}\uparrow X$ as $k\to\infty$ since$^{1}$
\[ \{x\in X:|x|<k\text{ and }d_{X^{c}}(x)>1/k\}\subset K_k^{o} \]
and $\{x\in X:|x|<k\text{ and }d_{X^{c}}(x)>1/k\}\uparrow X$ as $k\to\infty$. This shows $\mu$ is $\sigma$-finite on $\tau_X$ and item 1. follows from Exercises 33.4 and 33.5. Now suppose $\mu(A)<\infty$ and $F\subset A\subset V$ are as in item 1. Then $K_k\cap F\uparrow F$ as $k\to\infty$ and therefore, since $\mu(V)<\infty$, $\mu(V\setminus(K_k\cap F))\downarrow\mu(V\setminus F)$ as $k\to\infty$. Hence by choosing $k$ sufficiently large, $\mu(V\setminus(K_k\cap F))<\varepsilon$ and we may replace $F$ by the compact set $F\cap K_k$ and item 1. still holds. This proves item 2.
Item 3. Item 1. easily implies that Eq. (33.8) holds and item 2. implies Eq. (33.9) holds when $\mu(A)<\infty$. So we need only check Eq. (33.9) when $\mu(A)=\infty$. By item 1. there is a closed set $F\subset A$ such that $\mu(A\setminus F)<1$ and in particular $\mu(F)=\infty$. Since $K_n\cap F\uparrow F$, and $K_n\cap F$ is compact, it follows that the right side of Eq. (33.9) is infinite and hence equal to $\mu(A)$.
33.2.2 Second Proof of Theorem 22.13
Proof. Since $\mathcal{S}_f(\mathcal{M},\mu)$ is dense in $L^{p}(\mu)$ it suffices to show any $\phi\in\mathcal{S}_f(\mathcal{M},\mu)$ may be well approximated by $f\in BC_f(X)$. Moreover, to prove this it suffices to show for $A\in\mathcal{M}$ with $\mu(A)<\infty$ that $1_A$ may be well approximated by an $f\in BC_f(X)$. By Exercises 33.4 and 33.5, for any $\varepsilon>0$ there exists a closed set $F$ and an open set $V$ such that $F\subset A\subset V$ and $\mu(V\setminus F)<\varepsilon$. (Notice that $\mu(V)<\mu(A)+\varepsilon<\infty$.) Let $f$ be as in Eq. (6.4), then $f\in BC_f(X)$ and since $|1_A-f|\le 1_{V\setminus F}$,
\[ \int|1_A-f|^{p}\,d\mu\le\int 1_{V\setminus F}\,d\mu=\mu(V\setminus F)\le\varepsilon \tag{33.11} \]
or equivalently
\[ \|1_A-f\|\le\varepsilon^{1/p}. \]
Since $\varepsilon>0$ is arbitrary, we have shown that $1_A$ can be approximated in $L^{p}(\mu)$ arbitrarily well by functions from $BC_f(X)$.
1
In fact this is an equality, but we will not need this here.
Part VIII
The Fourier Transform and Generalized
Functions
34
Fourier Transform
The underlying space in this section is $\mathbb{R}^n$ with Lebesgue measure. The Fourier inversion formula is going to state that
\[ f(x)=\left(\frac{1}{2\pi}\right)^{n}\int_{\mathbb{R}^n}d\xi\,e^{i\xi\cdot x}\int_{\mathbb{R}^n}dy\,f(y)e^{-iy\cdot\xi}. \tag{34.1} \]
If we let $\xi=2\pi\eta$, this may be written as
\[ f(x)=\int_{\mathbb{R}^n}d\eta\,e^{i2\pi\eta\cdot x}\int_{\mathbb{R}^n}dy\,f(y)e^{-i2\pi y\cdot\eta} \]
and we have removed the multiplicative factor of $\left(\frac{1}{2\pi}\right)^{n}$ in Eq. (34.1) at the expense of placing factors of $2\pi$ in the arguments of the exponentials. Another way to avoid writing the $2\pi$'s altogether is to redefine $dx$ and $d\xi$ and this is what we will do here.
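Notation 34.1 Let $m$ be Lebesgue measure on $\mathbb{R}^n$ and define:
\[ \mathbf{d}x=\left(\frac{1}{\sqrt{2\pi}}\right)^{n}dm(x)\quad\text{and}\quad\mathbf{d}\xi:=\left(\frac{1}{\sqrt{2\pi}}\right)^{n}dm(\xi). \]
To be consistent with this new normalization of Lebesgue measure we will redefine $\|f\|_p$ and $\langle f,g\rangle$ as
\[ \|f\|_p=\left(\int_{\mathbb{R}^n}|f(x)|^{p}\,\mathbf{d}x\right)^{1/p}=\left(\left(\frac{1}{2\pi}\right)^{n/2}\int_{\mathbb{R}^n}|f(x)|^{p}\,dm(x)\right)^{1/p} \]
and
\[ \langle f,g\rangle:=\int_{\mathbb{R}^n}f(x)g(x)\,\mathbf{d}x\quad\text{when }fg\in L^{1}. \]
Similarly we will define the convolution relative to these normalizations by $f\bigstar g:=\left(\frac{1}{2\pi}\right)^{n/2}f*g$, i.e.
\[ f\bigstar g(x)=\int_{\mathbb{R}^n}f(x-y)g(y)\,\mathbf{d}y=\int_{\mathbb{R}^n}f(x-y)g(y)\left(\frac{1}{2\pi}\right)^{n/2}dm(y). \]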
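The following notation will also be convenient; given a multi-index $\alpha\in\mathbb{Z}_+^{n}$, let $|\alpha|=\alpha_1+\dots+\alpha_n$,
\[ x^{\alpha}:=\prod_{j=1}^{n}x_j^{\alpha_j},\qquad\partial_x^{\alpha}=\left(\frac{\partial}{\partial x}\right)^{\alpha}:=\prod_{j=1}^{n}\left(\frac{\partial}{\partial x_j}\right)^{\alpha_j} \]
and
\[ D_x^{\alpha}=\left(\frac{1}{i}\right)^{|\alpha|}\left(\frac{\partial}{\partial x}\right)^{\alpha}=\left(\frac{1}{i}\frac{\partial}{\partial x}\right)^{\alpha}. \]
Also let
\[ \langle x\rangle:=(1+|x|^{2})^{1/2} \]
and for $s\in\mathbb{R}$ let
\[ \nu_s(x)=(1+|x|)^{s}. \]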
34.1 Fourier Transform
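Definition 34.2 (Fourier Transform). For $f\in L^{1}$, let
\[ \hat f(\xi)=\mathcal{F}f(\xi):=\int_{\mathbb{R}^n}e^{-ix\cdot\xi}f(x)\,\mathbf{d}x \tag{34.2} \]
\[ g^{\vee}(x)=\mathcal{F}^{-1}g(x)=\int_{\mathbb{R}^n}e^{ix\cdot\xi}g(\xi)\,\mathbf{d}\xi=\mathcal{F}g(-x). \tag{34.3} \]
The next theorem summarizes some more basic properties of the Fourier transform.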
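Theorem 34.3. Suppose that $f,g\in L^{1}$. Then
1. $\hat f\in C_0(\mathbb{R}^n)$ and $\|\hat f\|_\infty\le\|f\|_1$.
2. For $y\in\mathbb{R}^n$, $(\tau_yf)^{\wedge}(\xi)=e^{-iy\cdot\xi}\hat f(\xi)$ where, as usual, $\tau_yf(x):=f(x-y)$.
3. The Fourier transform takes convolution to products, i.e. $(f\bigstar g)^{\wedge}=\hat f\hat g$.
4. For $f,g\in L^{1}$, $\langle\hat f,g\rangle=\langle f,\hat g\rangle$.
5. If $T:\mathbb{R}^n\to\mathbb{R}^n$ is an invertible linear transformation, then
\[ (f\circ T)^{\wedge}(\xi)=|\det T|^{-1}\hat f\big((T^{-1})^{*}\xi\big)\quad\text{and}\quad(f\circ T)^{\vee}(\xi)=|\det T|^{-1}f^{\vee}\big((T^{-1})^{*}\xi\big). \]
6. If $(1+|x|)^{k}f(x)\in L^{1}$, then $\hat f\in C^{k}$ and $\partial^{\alpha}\hat f\in C_0$ for all $|\alpha|\le k$. Moreover,
\[ \partial_\xi^{\alpha}\hat f(\xi)=\mathcal{F}\left[(-ix)^{\alpha}f(x)\right](\xi) \tag{34.4} \]
for all $|\alpha|\le k$.
7. If $f\in C^{k}$ and $\partial^{\alpha}f\in L^{1}$ for all $|\alpha|\le k$, then $(1+|\xi|)^{k}\hat f(\xi)\in C_0$ and
\[ (\partial^{\alpha}f)^{\wedge}(\xi)=(i\xi)^{\alpha}\hat f(\xi) \tag{34.5} \]
for all $|\alpha|\le k$.
8. Suppose $g\in L^{1}(\mathbb{R}^{k})$ and $h\in L^{1}(\mathbb{R}^{n-k})$ and $f=g\otimes h$, i.e.
\[ f(x)=g(x_1,\dots,x_k)h(x_{k+1},\dots,x_n), \]
then $\hat f=\hat g\otimes\hat h$.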
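Proof. Item 1. is the Riemann Lebesgue Lemma 22.37. Items 2. – 5. are proved by the following straightforward computations:
\[ (\tau_yf)^{\wedge}(\xi)=\int_{\mathbb{R}^n}e^{-ix\cdot\xi}f(x-y)\,\mathbf{d}x=\int_{\mathbb{R}^n}e^{-i(x+y)\cdot\xi}f(x)\,\mathbf{d}x=e^{-iy\cdot\xi}\hat f(\xi), \]
\[ \langle\hat f,g\rangle=\int_{\mathbb{R}^n}\hat f(\xi)g(\xi)\,\mathbf{d}\xi=\int_{\mathbb{R}^n}\mathbf{d}\xi\,g(\xi)\int_{\mathbb{R}^n}\mathbf{d}x\,e^{-ix\cdot\xi}f(x) \]
\[ \phantom{\langle\hat f,g\rangle}=\int_{\mathbb{R}^n\times\mathbb{R}^n}\mathbf{d}x\,\mathbf{d}\xi\,e^{-ix\cdot\xi}g(\xi)f(x)=\int_{\mathbb{R}^n}\mathbf{d}x\,\hat g(x)f(x)=\langle f,\hat g\rangle, \]
\[ (f\bigstar g)^{\wedge}(\xi)=\int_{\mathbb{R}^n}e^{-ix\cdot\xi}f\bigstar g(x)\,\mathbf{d}x=\int_{\mathbb{R}^n}e^{-ix\cdot\xi}\left(\int_{\mathbb{R}^n}f(x-y)g(y)\,\mathbf{d}y\right)\mathbf{d}x \]
\[ =\int_{\mathbb{R}^n}\mathbf{d}y\int_{\mathbb{R}^n}\mathbf{d}x\,e^{-ix\cdot\xi}f(x-y)g(y)=\int_{\mathbb{R}^n}\mathbf{d}y\int_{\mathbb{R}^n}\mathbf{d}x\,e^{-i(x+y)\cdot\xi}f(x)g(y) \]
\[ =\int_{\mathbb{R}^n}\mathbf{d}y\,e^{-iy\cdot\xi}g(y)\int_{\mathbb{R}^n}\mathbf{d}x\,e^{-ix\cdot\xi}f(x)=\hat f(\xi)\hat g(\xi) \]
and, letting $y=Tx$ so that $\mathbf{d}x=|\det T|^{-1}\mathbf{d}y$,
\[ (f\circ T)^{\wedge}(\xi)=\int_{\mathbb{R}^n}e^{-ix\cdot\xi}f(Tx)\,\mathbf{d}x=\int_{\mathbb{R}^n}e^{-iT^{-1}y\cdot\xi}f(y)|\det T|^{-1}\,\mathbf{d}y=|\det T|^{-1}\hat f\big((T^{-1})^{*}\xi\big). \]
Item 6. is simply a matter of differentiating under the integral sign, which is easily justified because $(1+|x|)^{k}f(x)\in L^{1}$. Item 7. follows by using Lemma 22.36 repeatedly (i.e. integration by parts) to find
\[ (\partial^{\alpha}f)^{\wedge}(\xi)=\int_{\mathbb{R}^n}\partial_x^{\alpha}f(x)e^{-ix\cdot\xi}\,\mathbf{d}x=(-1)^{|\alpha|}\int_{\mathbb{R}^n}f(x)\partial_x^{\alpha}e^{-ix\cdot\xi}\,\mathbf{d}x \]
\[ =(-1)^{|\alpha|}\int_{\mathbb{R}^n}f(x)(-i\xi)^{\alpha}e^{-ix\cdot\xi}\,\mathbf{d}x=(i\xi)^{\alpha}\hat f(\xi). \]
Since $\partial^{\alpha}f\in L^{1}$ for all $|\alpha|\le k$, it follows that $(i\xi)^{\alpha}\hat f(\xi)=(\partial^{\alpha}f)^{\wedge}(\xi)\in C_0$ for all $|\alpha|\le k$. Since
\[ (1+|\xi|)^{k}\le\left(1+\sum_{i=1}^{n}|\xi_i|\right)^{k}=\sum_{|\alpha|\le k}c_\alpha|\xi^{\alpha}| \]
where $0<c_\alpha<\infty$,
\[ \left|(1+|\xi|)^{k}\hat f(\xi)\right|\le\sum_{|\alpha|\le k}c_\alpha\left|\xi^{\alpha}\hat f(\xi)\right|\to 0\text{ as }\xi\to\infty. \]
Item 8. is a simple application of Fubini's theorem.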
Example 34.4. If $f(x)=e^{-|x|^{2}/2}$ then $\hat f(\xi)=e^{-|\xi|^{2}/2}$, in short
\[ \mathcal{F}e^{-|x|^{2}/2}=e^{-|\xi|^{2}/2}\quad\text{and}\quad\mathcal{F}^{-1}e^{-|\xi|^{2}/2}=e^{-|x|^{2}/2}. \tag{34.6} \]
More generally, for $t>0$ let
\[ p_t(x):=t^{-n/2}e^{-\frac{1}{2t}|x|^{2}} \tag{34.7} \]
then
\[ \hat p_t(\xi)=e^{-\frac{t}{2}|\xi|^{2}}\quad\text{and}\quad(\hat p_t)^{\vee}(x)=p_t(x). \tag{34.8} \]
By Item 8. of Theorem 34.3, to prove Eq. (34.6) it suffices to consider the 1-dimensional case because $e^{-|x|^{2}/2}=\prod_{i=1}^{n}e^{-x_i^{2}/2}$. Let $g(\xi):=\left(\mathcal{F}e^{-x^{2}/2}\right)(\xi)$, then by Eq. (34.4) and Eq. (34.5),
\[ g'(\xi)=\mathcal{F}\left[(-ix)e^{-x^{2}/2}\right](\xi)=i\mathcal{F}\left[\frac{d}{dx}e^{-x^{2}/2}\right](\xi)=i(i\xi)\mathcal{F}\left[e^{-x^{2}/2}\right](\xi)=-\xi g(\xi). \tag{34.9} \]
Lemma 20.26 implies
\[ g(0)=\int_{\mathbb{R}}e^{-x^{2}/2}\,\mathbf{d}x=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}e^{-x^{2}/2}\,dm(x)=1, \]
and so solving Eq. (34.9) with $g(0)=1$ gives $\mathcal{F}\left[e^{-x^{2}/2}\right](\xi)=g(\xi)=e^{-\xi^{2}/2}$ as desired. The assertion that $\mathcal{F}^{-1}e^{-|\xi|^{2}/2}=e^{-|x|^{2}/2}$ follows similarly or by using Eq. (34.3) to conclude,
\[ \mathcal{F}^{-1}\left[e^{-|\xi|^{2}/2}\right](x)=\mathcal{F}\left[e^{-|\xi|^{2}/2}\right](-x)=\mathcal{F}\left[e^{-|\xi|^{2}/2}\right](x)=e^{-|x|^{2}/2}. \]
The results in Eq. (34.8) now follow from Eq. (34.6) and item 5 of Theorem 34.3. For example, since $p_t(x)=t^{-n/2}p_1(x/\sqrt{t})$,
\[ (\hat p_t)(\xi)=t^{-n/2}\left(\sqrt{t}\right)^{n}\hat p_1(\sqrt{t}\,\xi)=e^{-\frac{t}{2}|\xi|^{2}}. \]
This may also be written as $(\hat p_t)(\xi)=t^{-n/2}p_{1/t}(\xi)$. Using this and the fact that $p_t$ is an even function,
\[ (\hat p_t)^{\vee}(x)=\mathcal{F}\hat p_t(-x)=t^{-n/2}\mathcal{F}p_{1/t}(-x)=t^{-n/2}t^{n/2}p_t(-x)=p_t(x). \]
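The Gaussian identity of Example 34.4 can be checked numerically. The following is a minimal sketch for $n=1$, assuming the normalized measure $\mathbf{d}x=(2\pi)^{-1/2}dm(x)$ of Notation 34.1; the helper names `fourier_transform` and `p`, the truncation window and the step count are illustrative choices, not part of the text.

```python
import cmath
import math


def fourier_transform(f, xi, half_width=12.0, steps=4000):
    """Trapezoid-rule approximation of (2*pi)**-0.5 * integral of f(x)*exp(-i*x*xi) dx.

    The integral is truncated to [-half_width, half_width]; this is only
    reasonable for rapidly decaying f (an assumption of this sketch).
    """
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-1j * x * xi)
    return total * h / math.sqrt(2.0 * math.pi)


def p(t, x):
    """One-dimensional p_t(x) = t**-0.5 * exp(-x**2 / (2*t)) from Eq. (34.7)."""
    return t ** -0.5 * math.exp(-x * x / (2.0 * t))


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        for xi in (0.0, 0.7, 1.9):
            approx = fourier_transform(lambda x: p(t, x), xi)
            exact = math.exp(-t * xi * xi / 2.0)  # right side of Eq. (34.8)
            print(t, xi, abs(approx - exact))
```

The printed differences should be at the level of rounding error, since the trapezoid rule is very accurate for Gaussians.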
34.2 Schwartz Test Functions
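Definition 34.5. A function $f\in C(\mathbb{R}^n,\mathbb{C})$ is said to have rapid decay or rapid decrease if
\[ \sup_{x\in\mathbb{R}^n}(1+|x|)^{N}|f(x)|<\infty\quad\text{for }N=1,2,\dots. \]
Equivalently, for each $N\in\mathbb{N}$ there exist constants $C_N<\infty$ such that $|f(x)|\le C_N(1+|x|)^{-N}$ for all $x\in\mathbb{R}^n$. A function $f\in C(\mathbb{R}^n,\mathbb{C})$ is said to have (at most) polynomial growth if there exists $N<\infty$ such that
\[ \sup(1+|x|)^{-N}|f(x)|<\infty, \]
i.e. there exists $N\in\mathbb{N}$ and $C<\infty$ such that $|f(x)|\le C(1+|x|)^{N}$ for all $x\in\mathbb{R}^n$.
Definition 34.6 (Schwartz Test Functions). Let $\mathcal{S}$ denote the space of functions $f\in C^{\infty}(\mathbb{R}^n)$ such that $f$ and all of its partial derivatives have rapid decay and let
\[ \|f\|_{N,\alpha}=\sup_{x\in\mathbb{R}^n}\left|(1+|x|)^{N}\partial^{\alpha}f(x)\right| \]
so that
\[ \mathcal{S}=\left\{f\in C^{\infty}(\mathbb{R}^n):\|f\|_{N,\alpha}<\infty\text{ for all }N\text{ and }\alpha\right\}. \]
Also let $\mathcal{P}$ denote those functions $g\in C^{\infty}(\mathbb{R}^n)$ such that $g$ and all of its derivatives have at most polynomial growth, i.e. $g\in C^{\infty}(\mathbb{R}^n)$ is in $\mathcal{P}$ iff for all multi-indices $\alpha$, there exists $N_\alpha<\infty$ such that
\[ \sup(1+|x|)^{-N_\alpha}|\partial^{\alpha}g(x)|<\infty. \]
(Notice that any polynomial function on $\mathbb{R}^n$ is in $\mathcal{P}$.)
Remark 34.7. Since $C_c^{\infty}(\mathbb{R}^n)\subset\mathcal{S}\subset L^{2}(\mathbb{R}^n)$, it follows that $\mathcal{S}$ is dense in $L^{2}(\mathbb{R}^n)$.
Exercise 34.1. Let
\[ L=\sum_{|\alpha|\le k}a_\alpha(x)\partial^{\alpha} \tag{34.10} \]
with $a_\alpha\in\mathcal{P}$. Show $L(\mathcal{S})\subset\mathcal{S}$ and in particular $\partial^{\alpha}f$ and $x^{\alpha}f$ are back in $\mathcal{S}$ for all multi-indices $\alpha$.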
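Notation 34.8 Suppose that $p(x,\xi)=\sum_{|\alpha|\le N}a_\alpha(x)\xi^{\alpha}$ where each function $a_\alpha(x)$ is a smooth function. We then set
\[ p(x,D_x):=\sum_{|\alpha|\le N}a_\alpha(x)D_x^{\alpha} \]
and if each $a_\alpha(x)$ is also a polynomial in $x$ we will let
\[ p(-D_\xi,\xi):=\sum_{|\alpha|\le N}a_\alpha(-D_\xi)M_{\xi^{\alpha}} \]
where $M_{\xi^{\alpha}}$ is the operation of multiplication by $\xi^{\alpha}$.
Proposition 34.9. Let $p(x,\xi)$ be as above and assume each $a_\alpha(x)$ is a polynomial in $x$. Then for $f\in\mathcal{S}$,
\[ (p(x,D_x)f)^{\wedge}(\xi)=p(-D_\xi,\xi)\hat f(\xi) \tag{34.11} \]
and
\[ p(\xi,D_\xi)\hat f(\xi)=\left[p(D_x,-x)f(x)\right]^{\wedge}(\xi). \tag{34.12} \]
Proof. The identities $(-D_\xi)^{\alpha}e^{-ix\cdot\xi}=x^{\alpha}e^{-ix\cdot\xi}$ and $D_x^{\alpha}e^{ix\cdot\xi}=\xi^{\alpha}e^{ix\cdot\xi}$ imply, for any polynomial function $q$ on $\mathbb{R}^n$,
\[ q(-D_\xi)e^{-ix\cdot\xi}=q(x)e^{-ix\cdot\xi}\quad\text{and}\quad q(D_x)e^{ix\cdot\xi}=q(\xi)e^{ix\cdot\xi}. \tag{34.13} \]
Therefore using Eq. (34.13) repeatedly,
\[ (p(x,D_x)f)^{\wedge}(\xi)=\int_{\mathbb{R}^n}\sum_{|\alpha|\le N}a_\alpha(x)D_x^{\alpha}f(x)\cdot e^{-ix\cdot\xi}\,\mathbf{d}x \]
\[ =\int_{\mathbb{R}^n}\sum_{|\alpha|\le N}D_x^{\alpha}f(x)\cdot a_\alpha(-D_\xi)e^{-ix\cdot\xi}\,\mathbf{d}x \]
\[ =\int_{\mathbb{R}^n}f(x)\sum_{|\alpha|\le N}(-D_x)^{\alpha}\left[a_\alpha(-D_\xi)e^{-ix\cdot\xi}\right]\mathbf{d}x \]
\[ =\int_{\mathbb{R}^n}f(x)\sum_{|\alpha|\le N}a_\alpha(-D_\xi)\left[\xi^{\alpha}e^{-ix\cdot\xi}\right]\mathbf{d}x=p(-D_\xi,\xi)\hat f(\xi) \]
wherein the third equality we have used Lemma 22.36 to do repeated integration by parts, the fact that mixed partial derivatives commute in the fourth, and in the last we have repeatedly used Corollary 19.43 to differentiate under the integral. The proof of Eq. (34.12) is similar:
\[ p(\xi,D_\xi)\hat f(\xi)=p(\xi,D_\xi)\int_{\mathbb{R}^n}f(x)e^{-ix\cdot\xi}\,\mathbf{d}x=\int_{\mathbb{R}^n}f(x)p(\xi,-x)e^{-ix\cdot\xi}\,\mathbf{d}x \]
\[ =\sum_{|\alpha|\le N}\int_{\mathbb{R}^n}f(x)(-x)^{\alpha}a_\alpha(\xi)e^{-ix\cdot\xi}\,\mathbf{d}x \]
\[ =\sum_{|\alpha|\le N}\int_{\mathbb{R}^n}f(x)(-x)^{\alpha}a_\alpha(-D_x)e^{-ix\cdot\xi}\,\mathbf{d}x \]
\[ =\sum_{|\alpha|\le N}\int_{\mathbb{R}^n}e^{-ix\cdot\xi}a_\alpha(D_x)\left[(-x)^{\alpha}f(x)\right]\mathbf{d}x \]
\[ =\left[p(D_x,-x)f(x)\right]^{\wedge}(\xi). \]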
Corollary 34.10. The Fourier transform preserves the space $\mathcal{S}$, i.e. $\mathcal{F}(\mathcal{S})\subset\mathcal{S}$.
Proof. Let $p(x,\xi)=\sum_{|\alpha|\le N}a_\alpha(x)\xi^{\alpha}$ with each $a_\alpha(x)$ being a polynomial function in $x$. If $f\in\mathcal{S}$ then $p(D_x,-x)f\in\mathcal{S}\subset L^{1}$ and so by Eq. (34.12), $p(\xi,D_\xi)\hat f(\xi)$ is bounded in $\xi$, i.e.
\[ \sup_{\xi\in\mathbb{R}^n}|p(\xi,D_\xi)\hat f(\xi)|\le C(p,f)<\infty. \]
Taking $p(x,\xi)=(1+|x|^{2})^{N}\xi^{\alpha}$ with $N\in\mathbb{Z}_+$ in this estimate shows $\hat f(\xi)$ and all of its derivatives have rapid decay, i.e. $\hat f$ is in $\mathcal{S}$.
34.3 Fourier Inversion Formula
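Theorem 34.11 (Fourier Inversion Theorem). Suppose that $f\in L^{1}$ and $\hat f\in L^{1}$ (for example suppose $f\in\mathcal{S}$), then
1. there exists $f_0\in C_0(\mathbb{R}^n)$ such that $f=f_0$ a.e.,
2. $f_0=\mathcal{F}^{-1}\mathcal{F}f$ and $f_0=\mathcal{F}\mathcal{F}^{-1}f$,
3. $f$ and $\hat f$ are in $L^{1}\cap L^{\infty}$ and
4. $\|f\|_2=\|\hat f\|_2$.
In particular, $\mathcal{F}:\mathcal{S}\to\mathcal{S}$ is a linear isomorphism of vector spaces.
Proof. First notice that $\hat f\in C_0(\mathbb{R}^n)\subset L^{\infty}$ and $\hat f\in L^{1}$ by assumption, so that $\hat f\in L^{1}\cap L^{\infty}$. Let $p_t(x):=t^{-n/2}e^{-\frac{1}{2t}|x|^{2}}$ be as in Example 34.4 so that $\hat p_t(\xi)=e^{-\frac{t}{2}|\xi|^{2}}$ and $\hat p_t^{\vee}=p_t$. Define $f_0:=\hat f^{\vee}\in C_0$, then
\[ f_0(x)=(\hat f)^{\vee}(x)=\int_{\mathbb{R}^n}\hat f(\xi)e^{i\xi\cdot x}\,\mathbf{d}\xi=\lim_{t\downarrow 0}\int_{\mathbb{R}^n}\hat f(\xi)e^{i\xi\cdot x}\hat p_t(\xi)\,\mathbf{d}\xi \]
\[ =\lim_{t\downarrow 0}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}f(y)e^{i\xi\cdot(x-y)}\hat p_t(\xi)\,\mathbf{d}\xi\,\mathbf{d}y \]
\[ =\lim_{t\downarrow 0}\int_{\mathbb{R}^n}f(y)p_t(x-y)\,\mathbf{d}y=f(x)\text{ a.e.} \]
wherein we have used Theorem 22.32 in the last equality along with the observations that $p_t(y)=t^{-n/2}p_1(y/\sqrt{t})$ and $\int_{\mathbb{R}^n}p_1(y)\,\mathbf{d}y=1$ so that
\[ L^{1}\text{–}\lim_{t\downarrow 0}\int_{\mathbb{R}^n}f(y)p_t(x-y)\,\mathbf{d}y=f(x). \]
In particular this shows that $f\in L^{1}\cap L^{\infty}$. A similar argument shows that $\mathcal{F}\mathcal{F}^{-1}f=f_0$ as well. Let us now compute the $L^{2}$ norm of $\hat f$,
\[ \|\hat f\|_2^{2}=\int_{\mathbb{R}^n}\hat f(\xi)\overline{\hat f(\xi)}\,\mathbf{d}\xi=\int_{\mathbb{R}^n}\mathbf{d}\xi\,\overline{\hat f(\xi)}\int_{\mathbb{R}^n}\mathbf{d}x\,f(x)e^{-ix\cdot\xi} \]
\[ =\int_{\mathbb{R}^n}\mathbf{d}x\,f(x)\int_{\mathbb{R}^n}\mathbf{d}\xi\,\overline{\hat f(\xi)}e^{-ix\cdot\xi}\quad\text{(by Fubini)} \]
\[ =\int_{\mathbb{R}^n}\mathbf{d}x\,f(x)\overline{f(x)}=\|f\|_2^{2} \]
because $\int_{\mathbb{R}^n}\mathbf{d}\xi\,\overline{\hat f(\xi)}e^{-ix\cdot\xi}=\overline{\mathcal{F}^{-1}\hat f(x)}=\overline{f(x)}$ a.e.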
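As a numerical illustration of Theorem 34.11 (items 2. and 4.), the sketch below applies a discretized $\mathcal{F}$ followed by a discretized $\mathcal{F}^{-1}$ to a rapidly decaying function on $\mathbb{R}$ and compares the result with the original. The test function, the grid and the helper names (`transform`, `recovered`) are assumptions made for this illustration only.

```python
import cmath
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def transform(samples, grid, h, point, sign):
    """Trapezoid sum for (2*pi)**-0.5 * integral samples(x) * exp(sign*i*x*point) dx.

    sign = -1 gives the Fourier transform (34.2), sign = +1 its inverse (34.3).
    """
    last = len(grid) - 1
    total = 0j
    for k, (x, value) in enumerate(zip(grid, samples)):
        weight = 0.5 if k in (0, last) else 1.0
        total += weight * value * cmath.exp(sign * 1j * x * point)
    return total * h / SQRT_2PI


def f(x):
    """A hypothetical Schwartz-class test function."""
    return (1.0 + x) * math.exp(-x * x / 2.0)


half_width, steps = 10.0, 400
h = 2.0 * half_width / steps
grid = [-half_width + k * h for k in range(steps + 1)]
f_samples = [f(x) for x in grid]
# Samples of the transform of f on the same grid in the xi variable.
f_hat = [transform(f_samples, grid, h, xi, -1) for xi in grid]


def recovered(x):
    """Discretized inverse transform of f_hat evaluated at x."""
    return transform(f_hat, grid, h, x, +1)


# Squared L^2 norms; the common (2*pi)**-0.5 factor is omitted on both sides.
norm_f_sq = sum(abs(v) ** 2 for v in f_samples) * h
norm_f_hat_sq = sum(abs(v) ** 2 for v in f_hat) * h

if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.3, 2.0):
        print(x, abs(recovered(x) - f(x)))
    print(abs(norm_f_sq - norm_f_hat_sq))
```

Both the pointwise round-trip error and the difference of the two squared norms should be tiny for this smooth, rapidly decaying choice of $f$; for rougher functions a finer grid would be needed.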
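Corollary 34.12. By the B.L.T. Theorem 10.4, the maps $\mathcal{F}|_{\mathcal{S}}$ and $\mathcal{F}^{-1}|_{\mathcal{S}}$ extend to bounded linear maps $\bar{\mathcal{F}}$ and $\bar{\mathcal{F}}^{-1}$ from $L^{2}\to L^{2}$. These maps satisfy the following properties:
1. $\bar{\mathcal{F}}$ and $\bar{\mathcal{F}}^{-1}$ are unitary and are inverses to one another as the notation suggests.
2. If $f\in L^{2}$, then $\bar{\mathcal{F}}f$ is uniquely characterized as the function $G\in L^{2}$ such that
\[ \langle G,\psi\rangle=\langle f,\hat\psi\rangle\quad\text{for all }\psi\in C_c^{\infty}(\mathbb{R}^n). \]
3. If $f\in L^{1}\cap L^{2}$, then $\bar{\mathcal{F}}f=\hat f$ a.e.
4. For $f\in L^{2}$ we may compute $\bar{\mathcal{F}}$ and $\bar{\mathcal{F}}^{-1}$ by
\[ \bar{\mathcal{F}}f(\xi)=L^{2}\text{–}\lim_{R\to\infty}\int_{|x|\le R}f(x)e^{-ix\cdot\xi}\,\mathbf{d}x\quad\text{and} \tag{34.14} \]
\[ \bar{\mathcal{F}}^{-1}f(\xi)=L^{2}\text{–}\lim_{R\to\infty}\int_{|x|\le R}f(x)e^{ix\cdot\xi}\,\mathbf{d}x. \tag{34.15} \]
5. We may further extend $\bar{\mathcal{F}}$ to a map from $L^{1}+L^{2}\to C_0+L^{2}$ (still denoted by $\bar{\mathcal{F}}$) defined by $\bar{\mathcal{F}}f=\hat h+\bar{\mathcal{F}}g$ where $f=h+g\in L^{1}+L^{2}$. For $f\in L^{1}+L^{2}$, $\bar{\mathcal{F}}f$ may be characterized as the unique function $F\in L^{1}_{loc}(\mathbb{R}^n)$ such that
\[ \langle F,\phi\rangle=\langle f,\hat\phi\rangle\quad\text{for all }\phi\in C_c^{\infty}(\mathbb{R}^n). \tag{34.16} \]
Moreover if Eq. (34.16) holds then $F\in C_0+L^{2}\subset L^{1}_{loc}(\mathbb{R}^n)$ and Eq. (34.16) is valid for all $\phi\in\mathcal{S}$.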
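Proof. 1. and 2. If $f\in L^{2}$ and $\phi_n\in\mathcal{S}$ are such that $\phi_n\to f$ in $L^{2}$, then $\bar{\mathcal{F}}f:=\lim_{n\to\infty}\hat\phi_n$. Since $\phi_n\in\mathcal{S}\subset L^{1}$, we may conclude that $\|\hat\phi_n\|_2=\|\phi_n\|_2$ for all $n$. Thus
\[ \|\bar{\mathcal{F}}f\|_2=\lim_{n\to\infty}\|\hat\phi_n\|_2=\lim_{n\to\infty}\|\phi_n\|_2=\|f\|_2 \]
which shows that $\bar{\mathcal{F}}$ is an isometry from $L^{2}$ to $L^{2}$ and similarly $\bar{\mathcal{F}}^{-1}$ is an isometry. Since $\bar{\mathcal{F}}^{-1}\bar{\mathcal{F}}=\mathcal{F}^{-1}\mathcal{F}=id$ on the dense set $\mathcal{S}$, it follows by continuity that $\bar{\mathcal{F}}^{-1}\bar{\mathcal{F}}=id$ on all of $L^{2}$. Similarly $\bar{\mathcal{F}}\bar{\mathcal{F}}^{-1}=id$, and thus $\bar{\mathcal{F}}^{-1}$ is the inverse of $\bar{\mathcal{F}}$. This proves item 1. Moreover, if $\psi\in C_c^{\infty}(\mathbb{R}^n)$, then
\[ \langle\bar{\mathcal{F}}f,\psi\rangle=\lim_{n\to\infty}\langle\hat\phi_n,\psi\rangle=\lim_{n\to\infty}\langle\phi_n,\hat\psi\rangle=\langle f,\hat\psi\rangle \tag{34.17} \]
and this equation uniquely characterizes $\bar{\mathcal{F}}f$ by Corollary 22.38. Notice that Eq. (34.17) also holds for all $\psi\in\mathcal{S}$.
3. If $f\in L^{1}\cap L^{2}$, we have already seen that $\hat f\in C_0(\mathbb{R}^n)\subset L^{1}_{loc}$ and that $\langle\hat f,\psi\rangle=\langle f,\hat\psi\rangle$ for all $\psi\in C_c^{\infty}(\mathbb{R}^n)$. Combining this with item 2. shows $\langle\hat f-\bar{\mathcal{F}}f,\psi\rangle=0$ for all $\psi\in C_c^{\infty}(\mathbb{R}^n)$ and so again by Corollary 22.38 we conclude that $\hat f-\bar{\mathcal{F}}f=0$ a.e.
4. Let $f\in L^{2}$ and $R<\infty$ and set $f_R(x):=f(x)1_{|x|\le R}$. Then $f_R\in L^{1}\cap L^{2}$ and therefore $\bar{\mathcal{F}}f_R=\hat f_R$. Since $\bar{\mathcal{F}}$ is an isometry and (by the dominated convergence theorem) $f_R\to f$ in $L^{2}$, it follows that
\[ \bar{\mathcal{F}}f=L^{2}\text{–}\lim_{R\to\infty}\bar{\mathcal{F}}f_R=L^{2}\text{–}\lim_{R\to\infty}\hat f_R. \]
5. If $f=h+g\in L^{1}+L^{2}$ and $\phi\in\mathcal{S}$, then by Eq. (34.17) and item 4. of Theorem 34.3,
\[ \langle\hat h+\bar{\mathcal{F}}g,\phi\rangle=\langle h,\hat\phi\rangle+\langle g,\hat\phi\rangle=\langle h+g,\hat\phi\rangle. \tag{34.18} \]
In particular if $h+g=0$ a.e., then $\langle\hat h+\bar{\mathcal{F}}g,\phi\rangle=0$ for all $\phi\in\mathcal{S}$ and since $\hat h+\bar{\mathcal{F}}g\in L^{1}_{loc}$ it follows from Corollary 22.38 that $\hat h+\bar{\mathcal{F}}g=0$ a.e. This shows that $\bar{\mathcal{F}}f$ is well defined independent of how $f\in L^{1}+L^{2}$ is decomposed into the sum of an $L^{1}$ and an $L^{2}$ function. Moreover Eq. (34.18) shows Eq. (34.16) holds with $F=\hat h+\bar{\mathcal{F}}g\in C_0+L^{2}$ and $\phi\in\mathcal{S}$. Now suppose $G\in L^{1}_{loc}$ and $\langle G,\phi\rangle=\langle f,\hat\phi\rangle$ for all $\phi\in C_c^{\infty}(\mathbb{R}^n)$. Then by what we just proved, $\langle G,\phi\rangle=\langle F,\phi\rangle$ for all $\phi\in C_c^{\infty}(\mathbb{R}^n)$ and so another application of Corollary 22.38 shows $G=F\in C_0+L^{2}$.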
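Notation 34.13 Given the results of Corollary 34.12, there is little danger in writing $\hat f$ or $\mathcal{F}f$ for $\bar{\mathcal{F}}f$ when $f\in L^{1}+L^{2}$.
Corollary 34.14. If $f$ and $g$ are $L^{1}$ functions such that $\hat f,\hat g\in L^{1}$, then
\[ \mathcal{F}(f\cdot g)=\hat f\bigstar\hat g\quad\text{and}\quad\mathcal{F}^{-1}(f\cdot g)=f^{\vee}\bigstar g^{\vee}. \]
Since $\mathcal{S}$ is closed under pointwise products and $\mathcal{F}:\mathcal{S}\to\mathcal{S}$ is an isomorphism it follows that $\mathcal{S}$ is closed under convolution as well.
Proof. By Theorem 34.11, $f,g,\hat f,\hat g\in L^{1}\cap L^{\infty}$ and hence $f\cdot g\in L^{1}\cap L^{\infty}$ and $\hat f\bigstar\hat g\in L^{1}\cap L^{\infty}$. Since
\[ \mathcal{F}^{-1}\left(\hat f\bigstar\hat g\right)=\mathcal{F}^{-1}\left(\hat f\right)\cdot\mathcal{F}^{-1}(\hat g)=f\cdot g\in L^{1} \]
we may conclude from Theorem 34.11 that
\[ \hat f\bigstar\hat g=\mathcal{F}\mathcal{F}^{-1}\left(\hat f\bigstar\hat g\right)=\mathcal{F}(f\cdot g). \]
Similarly one shows $\mathcal{F}^{-1}(f\cdot g)=f^{\vee}\bigstar g^{\vee}$.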
Corollary 34.15. Let $p(x,\xi)$ and $p(x,D_x)$ be as in Notation 34.8 with each function $a_\alpha(x)$ being a smooth function of $x\in\mathbb{R}^n$. Then for $f\in\mathcal{S}$,
\[ p(x,D_x)f(x)=\int_{\mathbb{R}^n}p(x,\xi)\hat f(\xi)e^{ix\cdot\xi}\,\mathbf{d}\xi. \tag{34.19} \]
Proof. For $f\in\mathcal{S}$, we have
\[ p(x,D_x)f(x)=p(x,D_x)\left(\mathcal{F}^{-1}\hat f\right)(x)=p(x,D_x)\int_{\mathbb{R}^n}\hat f(\xi)e^{ix\cdot\xi}\,\mathbf{d}\xi \]
\[ =\int_{\mathbb{R}^n}\hat f(\xi)p(x,D_x)e^{ix\cdot\xi}\,\mathbf{d}\xi=\int_{\mathbb{R}^n}\hat f(\xi)p(x,\xi)e^{ix\cdot\xi}\,\mathbf{d}\xi. \]
If $p(x,\xi)$ is a more general function of $(x,\xi)$ than that given in Notation 34.8, the right member of Eq. (34.19) may still make sense, in which case we may use it as a definition of $p(x,D_x)$. A linear operator defined this way is called a pseudo differential operator; such operators turn out to be a useful class to study when working with partial differential equations.
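Corollary 34.16. Suppose $p(\xi)=\sum_{|\alpha|\le N}a_\alpha\xi^{\alpha}$ is a polynomial in $\xi\in\mathbb{R}^n$ and $f\in L^{2}$. Then $p(\partial)f$ exists in $L^{2}$ (see Definition 26.3) iff $\xi\to p(i\xi)\hat f(\xi)\in L^{2}$ in which case
\[ (p(\partial)f)^{\wedge}(\xi)=p(i\xi)\hat f(\xi)\quad\text{for a.e. }\xi. \]
In particular, if $g\in L^{2}$ then $f\in L^{2}$ solves the equation $p(\partial)f=g$ iff $p(i\xi)\hat f(\xi)=\hat g(\xi)$ for a.e. $\xi$.
Proof. By definition $p(\partial)f=g$ in $L^{2}$ iff
\[ \langle g,\phi\rangle=\langle f,p(-\partial)\phi\rangle\quad\text{for all }\phi\in C_c^{\infty}(\mathbb{R}^n). \tag{34.20} \]
It follows from repeated use of Lemma 26.23 that the previous equation is equivalent to
\[ \langle g,\phi\rangle=\langle f,p(-\partial)\phi\rangle\quad\text{for all }\phi\in\mathcal{S}(\mathbb{R}^n). \tag{34.21} \]
This may also be easily proved directly as follows. Choose $\psi\in C_c^{\infty}(\mathbb{R}^n)$ such that $\psi(x)=1$ for $x\in B_0(1)$ and for $\phi\in\mathcal{S}(\mathbb{R}^n)$ let $\phi_n(x):=\psi(x/n)\phi(x)$. By the chain rule and the product rule (Eq. A.5 of Appendix A),
\[ \partial^{\alpha}\phi_n(x)=\sum_{\beta\le\alpha}\binom{\alpha}{\beta}n^{-|\beta|}\left(\partial^{\beta}\psi\right)(x/n)\cdot\partial^{\alpha-\beta}\phi(x), \]
which along with the dominated convergence theorem shows $\phi_n\to\phi$ and $\partial^{\alpha}\phi_n\to\partial^{\alpha}\phi$ in $L^{2}$ as $n\to\infty$. Therefore if Eq. (34.20) holds, we find Eq. (34.21) holds because
\[ \langle g,\phi\rangle=\lim_{n\to\infty}\langle g,\phi_n\rangle=\lim_{n\to\infty}\langle f,p(-\partial)\phi_n\rangle=\langle f,p(-\partial)\phi\rangle. \]
To complete the proof simply observe that $\langle g,\phi\rangle=\langle\hat g,\phi^{\vee}\rangle$ and
\[ \langle f,p(-\partial)\phi\rangle=\langle\hat f,[p(-\partial)\phi]^{\vee}\rangle=\langle\hat f(\xi),p(i\xi)\phi^{\vee}(\xi)\rangle=\langle p(i\xi)\hat f(\xi),\phi^{\vee}(\xi)\rangle \]
for all $\phi\in\mathcal{S}(\mathbb{R}^n)$. From these two observations and the fact that $\mathcal{F}$ is bijective on $\mathcal{S}$, one sees that Eq. (34.21) holds iff $\xi\to p(i\xi)\hat f(\xi)\in L^{2}$ and $\hat g(\xi)=p(i\xi)\hat f(\xi)$ for a.e. $\xi$.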
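34.4 Summary of Basic Properties of $\mathcal{F}$ and $\mathcal{F}^{-1}$
The following table summarizes some of the basic properties of the Fourier transform and its inverse.
    $f$                          <-->   $\hat f$ or $f^{\vee}$
    Smoothness                   <-->   Decay at infinity
    $\partial^{\alpha}$          <-->   Multiplication by $(\pm i\xi)^{\alpha}$
    $\mathcal{S}$                <-->   $\mathcal{S}$
    $L^{2}(\mathbb{R}^n)$        <-->   $L^{2}(\mathbb{R}^n)$
    Convolution                  <-->   Products.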
34.5 Fourier Transforms of Measures and Bochner's Theorem
To motivate the next definition suppose that $\mu$ is a finite measure on $\mathbb{R}^n$ which is absolutely continuous relative to Lebesgue measure, $d\mu(x)=\rho(x)\mathbf{d}x$. Then it is reasonable to require
\[ \hat\mu(\xi):=\hat\rho(\xi)=\int_{\mathbb{R}^n}e^{-i\xi\cdot x}\rho(x)\,\mathbf{d}x=\int_{\mathbb{R}^n}e^{-i\xi\cdot x}\,d\mu(x) \]
and
\[ (\mu\bigstar g)(x):=\rho\bigstar g(x)=\int_{\mathbb{R}^n}g(x-y)\rho(y)\,\mathbf{d}y=\int_{\mathbb{R}^n}g(x-y)\,d\mu(y) \]
when $g:\mathbb{R}^n\to\mathbb{C}$ is a function such that the latter integral is defined, for example assume $g$ is bounded. These considerations lead to the following definitions.
Definition 34.17. The Fourier transform, $\hat\mu$, of a complex measure $\mu$ on $\mathcal{B}_{\mathbb{R}^n}$ is defined by
\[ \hat\mu(\xi)=\int_{\mathbb{R}^n}e^{-i\xi\cdot x}\,d\mu(x) \tag{34.22} \]
and the convolution with a function $g$ is defined by
\[ (\mu\bigstar g)(x)=\int_{\mathbb{R}^n}g(x-y)\,d\mu(y) \]
when the integral is defined.
It follows from the dominated convergence theorem that $\hat\mu$ is continuous. Also by a variant of Exercise 22.12, if $\mu$ and $\nu$ are two complex measures on $\mathcal{B}_{\mathbb{R}^n}$ such that $\hat\mu=\hat\nu$, then $\mu=\nu$. The reader is asked to give another proof of this fact in Exercise 34.4 below.
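Example 34.18. Let $\sigma_t$ be the surface measure on the sphere $S_t$ of radius $t$ centered at zero in $\mathbb{R}^{3}$. Then
\[ \hat\sigma_t(\xi)=4\pi t\frac{\sin t|\xi|}{|\xi|}. \]
Indeed,
\[ \hat\sigma_t(\xi)=\int_{tS^{2}}e^{-ix\cdot\xi}\,d\sigma(x)=t^{2}\int_{S^{2}}e^{-itx\cdot\xi}\,d\sigma(x) \]
\[ =t^{2}\int_{S^{2}}e^{-itx_3|\xi|}\,d\sigma(x)=t^{2}\int_0^{2\pi}d\theta\int_0^{\pi}d\varphi\,\sin\varphi\,e^{-it\cos\varphi|\xi|} \]
\[ =2\pi t^{2}\int_{-1}^{1}e^{-itu|\xi|}\,du=2\pi t^{2}\frac{1}{-it|\xi|}e^{-itu|\xi|}\Big|_{u=-1}^{u=1}=4\pi t^{2}\frac{\sin t|\xi|}{t|\xi|}. \]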
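Definition 34.19. A function $\chi:\mathbb{R}^n\to\mathbb{C}$ is said to be positive (semi) definite iff the matrices $A:=\{\chi(\xi_k-\xi_j)\}_{k,j=1}^{m}$ are positive definite for all $m\in\mathbb{N}$ and $\{\xi_j\}_{j=1}^{m}\subset\mathbb{R}^n$.
Lemma 34.20. If $\chi\in C(\mathbb{R}^n,\mathbb{C})$ is a positive definite function, then
1. $\chi(0)\ge 0$.
2. $\chi(-\xi)=\overline{\chi(\xi)}$ for all $\xi\in\mathbb{R}^n$.
3. $|\chi(\xi)|\le\chi(0)$ for all $\xi\in\mathbb{R}^n$.
4. For all $f\in\mathcal{S}(\mathbb{R}^n)$,
\[ \int_{\mathbb{R}^n\times\mathbb{R}^n}\chi(\xi-\eta)f(\xi)\overline{f(\eta)}\,d\xi\,d\eta\ge 0. \tag{34.23} \]
Proof. Taking $m=1$ and $\xi_1=0$ we learn $\chi(0)|\lambda|^{2}\ge 0$ for all $\lambda\in\mathbb{C}$ which proves item 1. Taking $m=2$, $\xi_1=\xi$ and $\xi_2=\eta$, the matrix
\[ A:=\begin{pmatrix}\chi(0)&\chi(\xi-\eta)\\ \chi(\eta-\xi)&\chi(0)\end{pmatrix} \]
is positive definite from which we conclude $\chi(\xi-\eta)=\overline{\chi(\eta-\xi)}$ (since $A=A^{*}$ by definition) and
\[ 0\le\det\begin{pmatrix}\chi(0)&\chi(\xi-\eta)\\ \chi(\eta-\xi)&\chi(0)\end{pmatrix}=|\chi(0)|^{2}-|\chi(\xi-\eta)|^{2}, \]
and hence $|\chi(\xi)|\le\chi(0)$ for all $\xi$. This proves items 2. and 3. Item 4. follows by approximating the integral in Eq. (34.23) by Riemann sums,
\[ \int_{\mathbb{R}^n\times\mathbb{R}^n}\chi(\xi-\eta)f(\xi)\overline{f(\eta)}\,d\xi\,d\eta=\lim_{\varepsilon\downarrow 0}\varepsilon^{2n}\sum_{\xi,\eta\in(\varepsilon\mathbb{Z}^n)\cap[-\varepsilon^{-1},\varepsilon^{-1}]^{n}}\chi(\xi-\eta)f(\xi)\overline{f(\eta)}\ge 0. \]
The details are left to the reader.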
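Lemma 34.21. If $\mu$ is a finite positive measure on $\mathcal{B}_{\mathbb{R}^n}$, then $\chi:=\hat\mu\in C(\mathbb{R}^n,\mathbb{C})$ is a positive definite function.
Proof. As has already been observed after Definition 34.17, the dominated convergence theorem implies $\hat\mu\in C(\mathbb{R}^n,\mathbb{C})$. Since $\mu$ is a positive measure (and hence real),
\[ \hat\mu(-\xi)=\int_{\mathbb{R}^n}e^{i\xi\cdot x}\,d\mu(x)=\overline{\int_{\mathbb{R}^n}e^{-i\xi\cdot x}\,d\mu(x)}=\overline{\hat\mu(\xi)}. \]
From this it follows that for any $m\in\mathbb{N}$ and $\{\xi_j\}_{j=1}^{m}\subset\mathbb{R}^n$, the matrix $A:=\{\hat\mu(\xi_k-\xi_j)\}_{k,j=1}^{m}$ is self-adjoint. Moreover if $\lambda\in\mathbb{C}^{m}$,
\[ \sum_{k,j=1}^{m}\hat\mu(\xi_k-\xi_j)\lambda_k\bar\lambda_j=\int_{\mathbb{R}^n}\sum_{k,j=1}^{m}e^{-i(\xi_k-\xi_j)\cdot x}\lambda_k\bar\lambda_j\,d\mu(x) \]
\[ =\int_{\mathbb{R}^n}\sum_{k,j=1}^{m}e^{-i\xi_k\cdot x}\lambda_k\overline{e^{-i\xi_j\cdot x}\lambda_j}\,d\mu(x) \]
\[ =\int_{\mathbb{R}^n}\left|\sum_{k=1}^{m}e^{-i\xi_k\cdot x}\lambda_k\right|^{2}d\mu(x)\ge 0 \]
showing $A$ is positive definite.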
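The computation in the proof of Lemma 34.21 can be replayed for a discrete measure. The sketch below takes a hypothetical finite positive measure $\mu=\sum_j w_j\delta_{x_j}$ on $\mathbb{R}$ (the atoms, weights, frequencies and coefficients are arbitrary illustrative values) and checks that the quadratic form $\sum_{k,j}\hat\mu(\xi_k-\xi_j)\lambda_k\bar\lambda_j$ agrees with $\int|\sum_k e^{-i\xi_kx}\lambda_k|^{2}d\mu(x)$ and is therefore non-negative.

```python
import cmath

# Hypothetical discrete positive measure mu = sum_j weights[j] * delta_{atoms[j]}.
atoms = [-1.5, 0.2, 0.9, 3.0]
weights = [0.5, 1.25, 0.75, 2.0]


def mu_hat(xi):
    """Fourier transform of mu as in Eq. (34.22): sum_j w_j * exp(-i*xi*x_j)."""
    return sum(w * cmath.exp(-1j * xi * x) for w, x in zip(weights, atoms))


def quadratic_form(xis, lams):
    """sum_{k,j} mu_hat(xi_k - xi_j) * lam_k * conj(lam_j)."""
    return sum(
        mu_hat(xk - xj) * lk * lj.conjugate()
        for xk, lk in zip(xis, lams)
        for xj, lj in zip(xis, lams)
    )


def integrated_square(xis, lams):
    """Integral of |sum_k exp(-i*xi_k*x) * lam_k|**2 against mu."""
    return sum(
        w * abs(sum(l * cmath.exp(-1j * xi * x) for xi, l in zip(xis, lams))) ** 2
        for w, x in zip(weights, atoms)
    )


xis = [-2.0, 0.0, 0.4, 1.7]
lams = [1 + 2j, -0.5j, 0.3 - 1j, -2.0 + 0j]

if __name__ == "__main__":
    print(quadratic_form(xis, lams), integrated_square(xis, lams))
```

The two printed numbers should coincide up to rounding, with the first having negligible imaginary part; any other choice of frequencies and coefficients should behave the same way.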
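Theorem 34.22 (Bochner's Theorem). Suppose $\chi\in C(\mathbb{R}^n,\mathbb{C})$ is a positive definite function, then there exists a unique positive measure $\mu$ on $\mathcal{B}_{\mathbb{R}^n}$ such that $\chi=\hat\mu$.
Proof. If $\chi(\xi)=\hat\mu(\xi)$, then for $f\in\mathcal{S}$ we would have
\[ \int_{\mathbb{R}^n}f\,d\mu=\int_{\mathbb{R}^n}(f^{\vee})^{\wedge}\,d\mu=\int_{\mathbb{R}^n}f^{\vee}(\xi)\hat\mu(\xi)\,\mathbf{d}\xi. \]
This suggests that we define
\[ I(f):=\int_{\mathbb{R}^n}\chi(\xi)f^{\vee}(\xi)\,\mathbf{d}\xi\quad\text{for all }f\in\mathcal{S}. \]
We will now show $I$ is positive in the sense that if $f\in\mathcal{S}$ and $f\ge 0$ then $I(f)\ge 0$. For general $f\in\mathcal{S}$ we have
\[ I(|f|^{2})=\int_{\mathbb{R}^n}\chi(\xi)\left(|f|^{2}\right)^{\vee}(\xi)\,\mathbf{d}\xi=\int_{\mathbb{R}^n}\chi(\xi)\left(f^{\vee}\bigstar\bar f^{\vee}\right)(\xi)\,\mathbf{d}\xi \]
\[ =\int_{\mathbb{R}^n\times\mathbb{R}^n}\chi(\xi)f^{\vee}(\xi-\eta)\bar f^{\vee}(\eta)\,\mathbf{d}\eta\,\mathbf{d}\xi=\int_{\mathbb{R}^n\times\mathbb{R}^n}\chi(\xi)f^{\vee}(\xi-\eta)\overline{f^{\vee}(-\eta)}\,\mathbf{d}\eta\,\mathbf{d}\xi \]
\[ =\int_{\mathbb{R}^n\times\mathbb{R}^n}\chi(\xi-\eta)f^{\vee}(\xi)\overline{f^{\vee}(\eta)}\,\mathbf{d}\eta\,\mathbf{d}\xi\ge 0. \tag{34.24} \]
For $t>0$ let $p_t(x):=t^{-n/2}e^{-|x|^{2}/2t}\in\mathcal{S}$ and define
\[ I_t(x):=I\bigstar p_t(x):=I(p_t(x-\cdot))=I\left(\left|\sqrt{p_t(x-\cdot)}\right|^{2}\right) \]
which is non-negative by Eq. (34.24) and the fact that $\sqrt{p_t(x-\cdot)}\in\mathcal{S}$. Using
\[ [p_t(x-\cdot)]^{\vee}(\xi)=\int_{\mathbb{R}^n}p_t(x-y)e^{iy\cdot\xi}\,\mathbf{d}y=\int_{\mathbb{R}^n}p_t(y)e^{i(y+x)\cdot\xi}\,\mathbf{d}y=e^{ix\cdot\xi}p_t^{\vee}(\xi)=e^{ix\cdot\xi}e^{-t|\xi|^{2}/2}, \]
\[ \langle I_t,\psi\rangle=\int_{\mathbb{R}^n}I(p_t(x-\cdot))\psi(x)\,\mathbf{d}x=\int_{\mathbb{R}^n}\left(\int_{\mathbb{R}^n}\chi(\xi)[p_t(x-\cdot)]^{\vee}(\xi)\psi(x)\,\mathbf{d}\xi\right)\mathbf{d}x \]
\[ =\int_{\mathbb{R}^n}\left(\int_{\mathbb{R}^n}\chi(\xi)e^{ix\cdot\xi}e^{-t|\xi|^{2}/2}\psi(x)\,\mathbf{d}\xi\right)\mathbf{d}x=\int_{\mathbb{R}^n}\chi(\xi)\psi^{\vee}(\xi)e^{-t|\xi|^{2}/2}\,\mathbf{d}\xi \]
which coupled with the dominated convergence theorem shows
\[ \langle I\bigstar p_t,\psi\rangle\to\int_{\mathbb{R}^n}\chi(\xi)\psi^{\vee}(\xi)\,\mathbf{d}\xi=I(\psi)\text{ as }t\downarrow 0. \]
Hence if $\psi\ge 0$, then $I(\psi)=\lim_{t\downarrow 0}\langle I_t,\psi\rangle\ge 0$.
Let $K\subset\mathbb{R}^n$ be a compact set and $\psi\in C_c^{\infty}(\mathbb{R}^n,[0,\infty))$ be a function such that $\psi=1$ on $K$. If $f\in C_c^{\infty}(\mathbb{R}^n,\mathbb{R})$ is a smooth function with $\mathrm{supp}(f)\subset K$, then $0\le\|f\|_\infty\psi-f\in\mathcal{S}$ and hence
\[ 0\le\langle I,\|f\|_\infty\psi-f\rangle=\|f\|_\infty\langle I,\psi\rangle-\langle I,f\rangle \]
and therefore $\langle I,f\rangle\le\|f\|_\infty\langle I,\psi\rangle$. Replacing $f$ by $-f$ implies $-\langle I,f\rangle\le\|f\|_\infty\langle I,\psi\rangle$ and hence we have proved
\[ |\langle I,f\rangle|\le C(\mathrm{supp}(f))\|f\|_\infty \tag{34.25} \]
for all $f\in\mathcal{D}_{\mathbb{R}^n}:=C_c^{\infty}(\mathbb{R}^n,\mathbb{R})$ where $C(K)$ is a finite constant for each compact subset of $\mathbb{R}^n$. Because of the estimate in Eq. (34.25), it follows that $I|_{\mathcal{D}_{\mathbb{R}^n}}$ has a unique extension $I$ to $C_c(\mathbb{R}^n,\mathbb{R})$ still satisfying the estimates in Eq. (34.25) and moreover this extension is still positive. So by the Riesz – Markov Theorem 32.47, there exists a unique Radon measure $\mu$ on $\mathbb{R}^n$ such that $\langle I,f\rangle=\mu(f)$ for all $f\in C_c(\mathbb{R}^n,\mathbb{R})$.
To finish the proof we must show $\hat\mu(\eta)=\chi(\eta)$ for all $\eta\in\mathbb{R}^n$ given
\[ \mu(f)=\int_{\mathbb{R}^n}\chi(\xi)f^{\vee}(\xi)\,\mathbf{d}\xi\quad\text{for all }f\in C_c^{\infty}(\mathbb{R}^n,\mathbb{R}). \tag{34.26} \]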
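Let $f\in C_c^{\infty}(\mathbb{R}^n,\mathbb{R}_+)$ be a radial function such that $f(0)=1$ and $f(x)$ is decreasing as $|x|$ increases. Let $f_\varepsilon(x):=f(\varepsilon x)$, then by Theorem 34.3,
\[ \mathcal{F}^{-1}\left[e^{-i\eta\cdot x}f_\varepsilon(x)\right](\xi)=\varepsilon^{-n}f^{\vee}\left(\frac{\xi-\eta}{\varepsilon}\right) \]
and therefore, from Eq. (34.26),
\[ \int_{\mathbb{R}^n}e^{-i\eta\cdot x}f_\varepsilon(x)\,d\mu(x)=\int_{\mathbb{R}^n}\chi(\xi)\varepsilon^{-n}f^{\vee}\left(\frac{\xi-\eta}{\varepsilon}\right)\mathbf{d}\xi. \tag{34.27} \]
Because $\int_{\mathbb{R}^n}f^{\vee}(\xi)\,\mathbf{d}\xi=\mathcal{F}f^{\vee}(0)=f(0)=1$, we may apply the approximate $\delta$-function Theorem 22.32 to Eq. (34.27) to find
\[ \int_{\mathbb{R}^n}e^{-i\eta\cdot x}f_\varepsilon(x)\,d\mu(x)\to\chi(\eta)\text{ as }\varepsilon\downarrow 0. \tag{34.28} \]
On the other hand, when $\eta=0$, the monotone convergence theorem implies $\mu(f_\varepsilon)\uparrow\mu(1)=\mu(\mathbb{R}^n)$ and therefore $\mu(\mathbb{R}^n)=\mu(1)=\chi(0)<\infty$. Now knowing that $\mu$ is a finite measure we may use the dominated convergence theorem to conclude
\[ \mu(e^{-i\eta\cdot x}f_\varepsilon(x))\to\mu(e^{-i\eta\cdot x})=\hat\mu(\eta)\text{ as }\varepsilon\downarrow 0 \]
for all $\eta$. Combining this equation with Eq. (34.28) shows $\hat\mu(\eta)=\chi(\eta)$ for all $\eta\in\mathbb{R}^n$.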
34.6 Supplement: Heisenberg Uncertainty Principle
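Suppose that $H$ is a Hilbert space and $A,B$ are two densely defined symmetric operators on $H$. More explicitly, $A$ is a densely defined symmetric linear operator on $H$ means there is a dense subspace $\mathcal{D}_A\subset H$ and a linear map $A:\mathcal{D}_A\to H$ such that $\langle A\varphi|\psi\rangle=\langle\varphi|A\psi\rangle$ for all $\varphi,\psi\in\mathcal{D}_A$. Let $\mathcal{D}_{AB}:=\{\varphi\in H:\varphi\in\mathcal{D}_B\text{ and }B\varphi\in\mathcal{D}_A\}$ and for $\varphi\in\mathcal{D}_{AB}$ let $(AB)\varphi=A(B\varphi)$ with a similar definition of $\mathcal{D}_{BA}$ and $BA$. Moreover, let $\mathcal{D}_C:=\mathcal{D}_{AB}\cap\mathcal{D}_{BA}$ and for $\varphi\in\mathcal{D}_C$, let
\[ C\varphi=\frac{1}{i}[A,B]\varphi=\frac{1}{i}(AB-BA)\varphi. \]
Notice that for $\varphi,\psi\in\mathcal{D}_C$ we have
\[ \langle C\varphi|\psi\rangle=\frac{1}{i}\{\langle AB\varphi|\psi\rangle-\langle BA\varphi|\psi\rangle\}=\frac{1}{i}\{\langle B\varphi|A\psi\rangle-\langle A\varphi|B\psi\rangle\} \]
\[ =\frac{1}{i}\{\langle\varphi|BA\psi\rangle-\langle\varphi|AB\psi\rangle\}=\langle\varphi|C\psi\rangle, \]
so that $C$ is symmetric as well.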
Theorem 34.23 (Heisenberg Uncertainty Principle). Continue the
above notation and assumptions,
1
2
[[C)[
_
|A|
2
[A)
_
|B|
2
[B) (34.29)
for all T
C
. Moreover if || = 1 and equality holds in Eq. (34.29), then
(A[A)I) = i(B [B)I) or
(B [B)I) = i(A[A)I) (34.30)
for some 1.
Proof. By homogeneity (34.29) we may assume that || = 1. Let a :=
[A), b = [B),

A = AaI, and

B = B bI. Then we have still have
[

A,

B] = [AaI, B bI] = iC.
Now
i[C) = [iC) = [[

A,

B]) = [
A

B) +[
B

A)
=
_
A[
B)
B[
A)
_
= 2i Im
A[
B)
from which we learn
[[C)[ = 2
Im
A[
B)
A[
B)
2
_
_
_
A
_
_
_
_
_
_
B
_
_
_ (34.31)
with equality i Re
A[
B) = 0 and

A and

B are linearly dependent, i.e.
i Eq. (34.30) holds. Equation (34.29) now follows from the inequality in Eq.
(34.31) and the identities,
_
_
_
A
_
_
_
2
= |A a|
2
= |A|
2
+a
2
||
2
2a ReA[)
= |A|
2
+a
2
2a
2
= |A|
2
A[)
and similarly
_
_
_
B
_
_
_ = |B|
2
B[).
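Example 34.24. As an example, take $H=L^{2}(\mathbb{R})$, $A=\frac{1}{i}\partial_x$ and $B=M_x$ with $\mathcal{D}_A:=\{f\in H:f'\in H\}$ ($f'$ is the weak derivative) and $\mathcal{D}_B:=\left\{f\in H:\int_{\mathbb{R}}|xf(x)|^{2}\,dx<\infty\right\}$. In this case,
\[ \mathcal{D}_C=\{f\in H:f',\ xf\text{ and }xf'\text{ are in }H\} \]
and $C=-I$ on $\mathcal{D}_C$. Therefore for a unit vector $\psi\in\mathcal{D}_C$,
\[ \frac{1}{2}\le\left\|\frac{1}{i}\psi'-a\psi\right\|_2\cdot\|x\psi-b\psi\|_2 \]
where $a=i\int_{\mathbb{R}}\psi\bar\psi'\,dm$ $^{1}$ and $b=\int_{\mathbb{R}}x|\psi(x)|^{2}\,dm(x)$. Thus we have
\[ \frac{1}{4}=\frac{1}{4}\int_{\mathbb{R}}|\psi|^{2}\,dm\le\int_{\mathbb{R}}(k-a)^{2}\left|\hat\psi(k)\right|^{2}dk\cdot\int_{\mathbb{R}}(x-b)^{2}|\psi(x)|^{2}\,dx. \tag{34.32} \]
Equality occurs if there exists $\lambda\in\mathbb{R}$ such that
\[ i\lambda(x-b)\psi(x)=\left(\frac{1}{i}\partial_x-a\right)\psi(x)\text{ a.e.} \]
Working formally, this gives rise to the ordinary differential equation (in weak form),
\[ \psi_x=[-\lambda(x-b)+ia]\psi \tag{34.33} \]
which has solutions (see Exercise 34.5 below)
\[ \psi=C\exp\left(\int_{\mathbb{R}}[-\lambda(x-b)+ia]\,dx\right)=C\exp\left(-\frac{\lambda}{2}(x-b)^{2}+iax\right). \tag{34.34} \]
Let $\lambda=\frac{1}{2t}$ and choose $C$ so that $\|\psi\|_2=1$ to find
\[ \psi_{t,a,b}(x)=\left(\frac{1}{2\pi t}\right)^{1/4}\exp\left(-\frac{1}{4t}(x-b)^{2}+iax\right) \]
are the functions (called coherent states) which saturate the Heisenberg uncertainty principle in Eq. (34.32).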
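That the coherent states saturate the inequality can be checked numerically. The sketch below, assuming unnormalized Lebesgue measure $dm$ on $\mathbb{R}$ and the closed form $\psi_{t,a,b}$ above, approximates $\|\psi\|_2^{2}$, $\|(x-b)\psi\|_2^{2}$ and $\|(\frac{1}{i}\partial_x-a)\psi\|_2^{2}$ by the trapezoid rule, using a central difference for the derivative. The function names, the window of 12 standard deviations and the step sizes are illustrative assumptions.

```python
import cmath
import math


def coherent_state(x, t, a, b):
    """psi_{t,a,b}(x) = (2*pi*t)**-0.25 * exp(-(x-b)**2/(4*t) + i*a*x)."""
    return (2.0 * math.pi * t) ** -0.25 * cmath.exp(-(x - b) ** 2 / (4.0 * t) + 1j * a * x)


def trapezoid(g, lo, hi, steps=20000):
    """Composite trapezoid rule for a real-valued g on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for k in range(1, steps):
        total += g(lo + k * h)
    return total * h


def uncertainty_data(t, a, b):
    """Return (||psi||^2, ||(x-b) psi||^2, ||(D - a) psi||^2) with D = (1/i) d/dx."""
    lo, hi = b - 12.0 * math.sqrt(t), b + 12.0 * math.sqrt(t)

    def psi(x):
        return coherent_state(x, t, a, b)

    def shifted_momentum(x, eps=1e-5):
        # Central-difference approximation of psi'(x), then ((1/i) psi' - a psi)(x).
        derivative = (psi(x + eps) - psi(x - eps)) / (2.0 * eps)
        return derivative / 1j - a * psi(x)

    norm_sq = trapezoid(lambda x: abs(psi(x)) ** 2, lo, hi)
    position_spread = trapezoid(lambda x: (x - b) ** 2 * abs(psi(x)) ** 2, lo, hi)
    momentum_spread = trapezoid(lambda x: abs(shifted_momentum(x)) ** 2, lo, hi)
    return norm_sq, position_spread, momentum_spread


if __name__ == "__main__":
    norm_sq, pos, mom = uncertainty_data(t=0.7, a=1.5, b=-0.3)
    print(norm_sq, pos, mom, pos * mom)
```

For these states one expects $\|\psi\|_2^{2}\approx 1$, a position spread close to $t$, a momentum spread close to $1/(4t)$, and hence a product close to $1/4$.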
34.6.1 Exercises
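Exercise 34.2. Let $f\in L^{2}(\mathbb{R}^n)$ and $\alpha$ be a multi-index. If $\partial^{\alpha}f$ exists in $L^{2}(\mathbb{R}^n)$ then $\mathcal{F}(\partial^{\alpha}f)=(i\xi)^{\alpha}\hat f(\xi)$ in $L^{2}(\mathbb{R}^n)$ and conversely if $\left(\xi\to\xi^{\alpha}\hat f(\xi)\right)\in L^{2}(\mathbb{R}^n)$ then $\partial^{\alpha}f$ exists.
Exercise 34.3. Suppose $p(\xi)$ is a polynomial in $\xi\in\mathbb{R}^{d}$ and $u\in L^{2}$ such that $p(\partial)u\in L^{2}$. Show
\[ \mathcal{F}(p(\partial)u)(\xi)=p(i\xi)\hat u(\xi)\in L^{2}. \]
Conversely if $u\in L^{2}$ is such that $p(i\xi)\hat u(\xi)\in L^{2}$, show $p(\partial)u\in L^{2}$.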
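$^{1}$ The constant $a$ may also be described as
\[ a=i\int_{\mathbb{R}}\psi\bar\psi'\,dm=\sqrt{2\pi}\,i\int_{\mathbb{R}}\hat\psi(\xi)\overline{\left(\psi'\right)^{\wedge}(\xi)}\,\mathbf{d}\xi=\int_{\mathbb{R}}\xi\left|\hat\psi(\xi)\right|^{2}dm(\xi). \]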
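Exercise 34.4. Suppose $\mu$ is a complex measure on $\mathbb{R}^n$ and $\hat\mu(\xi)$ is its Fourier transform as defined in Definition 34.17. Show $\mu$ satisfies,
\[ \langle\hat\mu,\phi\rangle:=\int_{\mathbb{R}^n}\hat\mu(\xi)\phi(\xi)\,d\xi=\mu(\hat\phi):=\int_{\mathbb{R}^n}\hat\phi\,d\mu\quad\text{for all }\phi\in\mathcal{S} \]
and use this to show that if $\mu$ is a complex measure such that $\hat\mu\equiv 0$, then $\mu\equiv 0$.
Exercise 34.5. Show that $\psi$ described in Eq. (34.34) is the general solution to Eq. (34.33). Hint: Suppose that $\varphi$ is any solution to Eq. (34.33) and $\psi$ is given as in Eq. (34.34) with $C=1$. Consider the weak differential equation solved by $\varphi/\psi$.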
34.6.2 More Proofs of the Fourier Inversion Theorem
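Exercise 34.6. Suppose that $f\in L^{1}(\mathbb{R})$ and assume that $f$ is continuously differentiable in a neighborhood of $0$. Show
\[ \lim_{M\to\infty}\int_{-\infty}^{\infty}\frac{\sin Mx}{x}f(x)\,dx=\pi f(0) \tag{34.35} \]
using the following steps.
1. Use Example 20.14 to deduce,
\[ \lim_{M\to\infty}\int_{-1}^{1}\frac{\sin Mx}{x}\,dx=\lim_{M\to\infty}\int_{-M}^{M}\frac{\sin x}{x}\,dx=\pi. \]
2. Explain why
\[ 0=\lim_{M\to\infty}\int_{|x|\ge 1}\sin Mx\cdot\frac{f(x)}{x}\,dx\quad\text{and}\quad 0=\lim_{M\to\infty}\int_{|x|\le 1}\sin Mx\cdot\frac{f(x)-f(0)}{x}\,dx. \]
3. Add the previous two equations and use part (1) to prove Eq. (34.35).
Exercise 34.7 (Fourier Inversion Formula). Suppose that $f\in L^{1}(\mathbb{R})$ is such that $\hat f\in L^{1}(\mathbb{R})$.
1. Further assume that $f$ is continuously differentiable in a neighborhood of $0$. Show that
\[ \Lambda:=\int_{\mathbb{R}}\hat f(\xi)\,\mathbf{d}\xi=f(0). \]
Hint: by the dominated convergence theorem, $\Lambda=\lim_{M\to\infty}\int_{|\xi|\le M}\hat f(\xi)\,\mathbf{d}\xi$. Now use the definition of $\hat f(\xi)$, Fubini's theorem and Exercise 34.6.
2. Apply part 1. of this exercise with $f$ replaced by $\tau_yf$ for some $y\in\mathbb{R}$ to prove
\[ f(y)=\int_{\mathbb{R}}\hat f(\xi)e^{iy\cdot\xi}\,\mathbf{d}\xi \tag{34.36} \]
provided $f$ is now continuously differentiable near $y$.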
The goal of the next exercises is to give yet another proof of the Fourier inversion formula.
Notation 34.25 For $L>0$, let $C_L^{k}(\mathbb{R})$ denote the space of $C^{k}$, $2L$-periodic functions:
\[ C_L^{k}(\mathbb{R}):=\left\{f\in C^{k}(\mathbb{R}):f(x+2L)=f(x)\text{ for all }x\in\mathbb{R}\right\}. \]
Also let $\langle\cdot,\cdot\rangle_L$ denote the inner product on the Hilbert space $H_L:=L^{2}([-L,L])$ given by
\[ \langle f|g\rangle_L:=\frac{1}{2L}\int_{[-L,L]}f(x)\bar g(x)\,dx. \]
Exercise 34.8. Recall that
_
L
k
(x) := e
ikx/L
: k Z
_
is an orthonormal basis
for H
L
and in particular for f H
L
,
f =
kZ
f[
L
k
)
L
L
k
(34.37)
where the convergence takes place in L
2
([L, L]). Suppose now that f
C
2
L
(1)
2
. Show (by two integration by parts)
f[
L
k
)
L
L
2
k
2
|f
tt
|
where |g|
denote the uniform norm of a function g. Use this to conclude

that the sum in Eq. (34.37) is uniformly convergent and from this conclude
that Eq. (34.37) holds pointwise. BRUCE: it is enough to assume f C
L
(1)
by making use of the identity,
f[
L
k
)
L
=
L
[k[
f
t
[
L
k
)
L
along with the Cauchy Schwarz inequality to see

_
_
k,=0
f[
L
k
)
L
_
_
2
k,=0
f
t
[
L
k
)
L
k,=0
_
L
[k[
_
2
.
2
We view C
2
L
(R) as a subspace of HL by identifying f C
2
L
(R) with f[
[L,L]

HL.
Exercise 34.9 (Fourier Inversion Formula on o). Let f o(1), L > 0
and
f
L
(x) :=
kZ
f(x + 2kL). (34.38)
Show:
1. The sum dening f
L
is convergent and moreover that f
L
C
L
(1).
2. Show f
L
[
L
k
)
L
=
1
2L
f(k/L).
3. Conclude from Exercise 34.8 that
f
L
(x) =
1
2L
kZ
f(k/L)e
ikx/L
for all x 1. (34.39)
4. Show, by passing to the limit, L , in Eq. (34.39) that Eq. (34.36)
holds for all x 1. Hint: Recall that

f o.
Exercise 34.11. Folland 8.14 on p. 254. (Wirtinger's inequality.)
Exercise 34.12. Folland 8.15 on p. 255. (The sampling Theorem. Modify to
agree with notation in notes, see Solution ?? below.)
Exercise 34.15. Folland 8.19 on p. 256. (The Fourier transform of a function whose support has finite measure.)
Exercise 34.16. Folland 8.22 on p. 256. (Bessel functions.)
Exercise 34.17. Folland 8.23 on p. 256. (Hermite Polynomial problems and
Harmonic oscillators.)
Exercise 34.18. Folland 8.31 on p. 263. (Poisson Summation formula prob-
lem.)
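35
Constant Coefficient partial differential equations

Suppose that $p(\xi) = \sum_{|\alpha| \leq k} a_\alpha \xi^\alpha$ with $a_\alpha \in \mathbb{C}$ and
\[
L = p(D_x) := \sum_{|\alpha| \leq k} a_\alpha D_x^\alpha = \sum_{|\alpha| \leq k} a_\alpha \Big( \frac{1}{i} \partial_x \Big)^\alpha . \tag{35.1}
\]
Then for $f \in \mathcal{S}$
\[
\widehat{Lf}(\xi) = p(\xi) \hat{f}(\xi),
\]
that is to say the Fourier transform takes a constant coefficient partial differential operator to multiplication by a polynomial. This fact can often be used to solve constant coefficient partial differential equations. For example suppose $g : \mathbb{R}^n \to \mathbb{C}$ is a given function and we want to find a solution to the equation $Lf = g$. Taking the Fourier transform of both sides of the equation $Lf = g$ would imply $p(\xi) \hat{f}(\xi) = \hat{g}(\xi)$ and therefore $\hat{f}(\xi) = \hat{g}(\xi) / p(\xi)$ provided $p(\xi)$ is never zero. (We will discuss what happens when $p(\xi)$ has zeros a bit more later on.) So we should expect
\[
f(x) = \mathcal{F}^{-1} \Big( \frac{1}{p(\xi)} \hat{g}(\xi) \Big)(x) = \mathcal{F}^{-1} \Big( \frac{1}{p(\xi)} \Big) \bigstar g(x).
\]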
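Definition 35.1. Let $L = p(D_x)$ as in Eq. (35.1). Then we let $\sigma(L) := \mathrm{Ran}(p) \subset \mathbb{C}$ and call $\sigma(L)$ the spectrum of $L$. Given a measurable function $G : \sigma(L) \to \mathbb{C}$, we define (a possibly unbounded operator) $G(L) : L^2(\mathbb{R}^n, m) \to L^2(\mathbb{R}^n, m)$ by
\[
G(L) f := \mathcal{F}^{-1} M_{G \circ p} \mathcal{F} f
\]
where $M_{G \circ p}$ denotes the operation on $L^2(\mathbb{R}^n, m)$ of multiplication by $G \circ p$, i.e.
\[
M_{G \circ p} f = (G \circ p) f
\]
with domain given by those $f \in L^2$ such that $(G \circ p) f \in L^2$.
At a formal level we expect
\[
G(L) f = \mathcal{F}^{-1}(G \circ p) \bigstar f.
\]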
35.1 Elliptic examples
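As a specific example consider the equation
\[
\left( -\Delta + m^2 \right) f = g \tag{35.2}
\]
where $f, g : \mathbb{R}^n \to \mathbb{C}$ and $\Delta = \sum_{i=1}^n \partial^2 / \partial x_i^2$ is the usual Laplacian on $\mathbb{R}^n$. By Corollary 34.16 (i.e. taking the Fourier transform of this equation), solving Eq. (35.2) with $f, g \in L^2$ is equivalent to solving
\[
\left( |\xi|^2 + m^2 \right) \hat{f}(\xi) = \hat{g}(\xi). \tag{35.3}
\]
The unique solution to this latter equation is
\[
\hat{f}(\xi) = \left( |\xi|^2 + m^2 \right)^{-1} \hat{g}(\xi)
\]
and therefore,
\[
f(x) = \mathcal{F}^{-1} \left( \left( |\xi|^2 + m^2 \right)^{-1} \hat{g}(\xi) \right)(x) =: \left( -\Delta + m^2 \right)^{-1} g(x).
\]
We expect
\[
\mathcal{F}^{-1} \left( \left( |\xi|^2 + m^2 \right)^{-1} \hat{g}(\xi) \right)(x) = G_m \bigstar g(x) = \int_{\mathbb{R}^n} G_m(x - y) g(y) \, \mathbf{d}y,
\]
where
\[
G_m(x) := \mathcal{F}^{-1} \left( |\xi|^2 + m^2 \right)^{-1}(x) = \int_{\mathbb{R}^n} \frac{1}{m^2 + |\xi|^2} e^{i x \cdot \xi} \, \mathbf{d}\xi .
\]
At the moment $\mathcal{F}^{-1} \left( |\xi|^2 + m^2 \right)^{-1}$ only makes sense when $n = 1, 2,$ or $3$ because only then is $\left( |\xi|^2 + m^2 \right)^{-1} \in L^2(\mathbb{R}^n)$.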
For now we will restrict our attention to the one dimensional case, $n = 1$, in which case
\[
G_m(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \frac{1}{(\xi + mi)(\xi - mi)} e^{i \xi x} \, d\xi . \tag{35.4}
\]
The function $G_m$ may be computed using standard complex variable contour integration methods to find, for $x \geq 0$,
\[
G_m(x) = \frac{1}{\sqrt{2\pi}} \, 2\pi i \, \frac{e^{i^2 m x}}{2 i m} = \frac{1}{2m} \sqrt{2\pi} \, e^{-mx}
\]
and since $G_m$ is an even function,
\[
G_m(x) = \mathcal{F}^{-1} \left( |\xi|^2 + m^2 \right)^{-1}(x) = \frac{\sqrt{2\pi}}{2m} e^{-m|x|}. \tag{35.5}
\]
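This result is easily verified to be correct, since
\[
\begin{aligned}
\mathcal{F} \Big[ \frac{\sqrt{2\pi}}{2m} e^{-m|x|} \Big](\xi)
&= \frac{\sqrt{2\pi}}{2m} \int_{\mathbb{R}} e^{-m|x|} e^{-i x \cdot \xi} \, \mathbf{d}x \\
&= \frac{1}{2m} \Big( \int_{-\infty}^0 e^{mx} e^{-i x \cdot \xi} \, dx + \int_0^\infty e^{-mx} e^{-i x \cdot \xi} \, dx \Big) \\
&= \frac{1}{2m} \Big( \frac{1}{m + i\xi} + \frac{1}{m - i\xi} \Big) = \frac{1}{m^2 + \xi^2}.
\end{aligned}
\]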
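Hence in conclusion we find that $\left( -\Delta + m^2 \right) f = g$ has solution given by
\[
f(x) = G_m \bigstar g(x) = \frac{\sqrt{2\pi}}{2m} \int_{\mathbb{R}} e^{-m|x - y|} g(y) \, \mathbf{d}y = \frac{1}{2m} \int_{\mathbb{R}} e^{-m|x - y|} g(y) \, dy .
\]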
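The one dimensional solution formula $f(x) = \frac{1}{2m} \int_{\mathbb{R}} e^{-m|x-y|} g(y)\,dy$ can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: the Gaussian right hand side, the midpoint rule, the truncation window and the step sizes are all illustrative choices that do not come from the text.

```python
import math

def green_solution(g, m, x, half_width=40.0, steps=16000):
    """Midpoint-rule value of f(x) = (1/(2m)) * int exp(-m|x-y|) g(y) dy.

    The integral is truncated to [x - half_width, x + half_width]; this is an
    assumption that is harmless when g decays quickly.
    """
    a = x - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        y = a + (j + 0.5) * h
        total += math.exp(-m * abs(x - y)) * g(y)
    return total * h / (2.0 * m)

def residual(g, m, x, dx=0.05):
    """Finite-difference estimate of (-f'' + m^2 f - g)(x); near 0 if f solves the ODE."""
    f_mid = green_solution(g, m, x)
    f_plus = green_solution(g, m, x + dx)
    f_minus = green_solution(g, m, x - dx)
    second_derivative = (f_plus - 2.0 * f_mid + f_minus) / dx ** 2
    return -second_derivative + m * m * f_mid - g(x)

def gaussian(y):
    """Illustrative right hand side g (hypothetical choice)."""
    return math.exp(-y * y)

if __name__ == "__main__":
    # Should print a number that is small compared with g(0.3).
    print(residual(gaussian, m=1.5, x=0.3))
```

The residual is not exactly zero because both the quadrature and the second difference carry discretization error; it should shrink as `dx` decreases and `steps` increases.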
Question. Why do we get a unique answer here given that $f(x) = A \sinh(mx) + B \cosh(mx)$ solves $\left( -\Delta + m^2 \right) f = 0$?
The answer is that such an $f$ is not in $L^2$ unless $f = 0$! More generally it is worth noting that $A \sinh(mx) + B \cosh(mx)$ is not in $\mathcal{P}$ unless $A = B = 0$.
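What about when $m = 0$ in which case $m^2 + \xi^2$ becomes $\xi^2$ which has a zero at $0$. Noting that constants are solutions to $\Delta f = 0$, we might look at
\[
\lim_{m \downarrow 0} \left( G_m(x) - G_m(0) \right) = \lim_{m \downarrow 0} \frac{\sqrt{2\pi}}{2m} \left( e^{-m|x|} - 1 \right) = -\frac{\sqrt{2\pi}}{2} |x|
\]
as a solution, i.e. we might conjecture that
\[
f(x) := -\frac{1}{2} \int_{\mathbb{R}} |x - y| \, g(y) \, dy
\]
solves the equation $-f'' = g$. To verify this we have
\[
f(x) := -\frac{1}{2} \int_{-\infty}^x (x - y) g(y) \, dy - \frac{1}{2} \int_x^\infty (y - x) g(y) \, dy
\]
so that
\[
f'(x) = -\frac{1}{2} \int_{-\infty}^x g(y) \, dy + \frac{1}{2} \int_x^\infty g(y) \, dy \quad \text{and} \quad
f''(x) = -\frac{1}{2} g(x) - \frac{1}{2} g(x).
\]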
35.2 Poisson Semi-Group
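Let us now consider the problem of finding a function $(x_0, x) \in [0, \infty) \times \mathbb{R}^n \to u(x_0, x) \in \mathbb{C}$ such that
\[
\Big( \frac{\partial^2}{\partial x_0^2} + \Delta \Big) u = 0 \quad \text{with } u(0, \cdot) = f \in L^2(\mathbb{R}^n). \tag{35.6}
\]
Let $\hat{u}(x_0, \xi) := \int_{\mathbb{R}^n} u(x_0, x) e^{-i x \cdot \xi} \, \mathbf{d}x$ denote the Fourier transform of $u$ in the $x \in \mathbb{R}^n$ variable. Then Eq. (35.6) becomes
\[
\Big( \frac{\partial^2}{\partial x_0^2} - |\xi|^2 \Big) \hat{u}(x_0, \xi) = 0 \quad \text{with } \hat{u}(0, \xi) = \hat{f}(\xi) \tag{35.7}
\]
and the general solution to this differential equation ignoring the initial condition is of the form
\[
\hat{u}(x_0, \xi) = A(\xi) e^{-x_0 |\xi|} + B(\xi) e^{x_0 |\xi|} \tag{35.8}
\]
for some functions $A(\xi)$ and $B(\xi)$. Let us now impose the extra condition that $u(x_0, \cdot) \in L^2(\mathbb{R}^n)$ or equivalently that $\hat{u}(x_0, \cdot) \in L^2(\mathbb{R}^n)$ for all $x_0 \geq 0$. The solution in Eq. (35.8) will not have this property unless $B(\xi)$ decays very rapidly at $\infty$. The simplest way to achieve this is to assume $B = 0$ in which case we now get a unique solution to Eq. (35.7), namely
\[
\hat{u}(x_0, \xi) = \hat{f}(\xi) e^{-x_0 |\xi|}.
\]
Applying the inverse Fourier transform gives
\[
u(x_0, x) = \mathcal{F}^{-1} \left[ \hat{f}(\xi) e^{-x_0 |\xi|} \right](x) =: \left( e^{-x_0 \sqrt{-\Delta}} f \right)(x)
\]
and moreover
\[
\left( e^{-x_0 \sqrt{-\Delta}} f \right)(x) = P_{x_0} * f(x)
\]
where $P_{x_0}(x) = (2\pi)^{-n/2} \left( \mathcal{F}^{-1} e^{-x_0 |\xi|} \right)(x)$. From Exercise 35.1,
\[
P_{x_0}(x) = (2\pi)^{-n/2} \left( \mathcal{F}^{-1} e^{-x_0 |\xi|} \right)(x) = c_n \frac{x_0}{(x_0^2 + |x|^2)^{(n+1)/2}}
\]
where
\[
c_n = (2\pi)^{-n/2} \, 2^{n/2} \frac{\Gamma((n+1)/2)}{\sqrt{\pi}} = \frac{\Gamma((n+1)/2)}{\pi^{(n+1)/2}} .
\]
Hence we have proved the following proposition.
Proposition 35.2. For $f \in L^2(\mathbb{R}^n)$,
\[
e^{-x_0 \sqrt{-\Delta}} f = P_{x_0} * f \quad \text{for all } x_0 \geq 0
\]
and the function $u(x_0, x) := e^{-x_0 \sqrt{-\Delta}} f(x)$ is $C^\infty$ for $(x_0, x) \in (0, \infty) \times \mathbb{R}^n$ and solves Eq. (35.6).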
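Two properties of the Poisson kernel are easy to test numerically in dimension $n = 1$: the normalizing constant, and the semigroup law $P_s * P_t = P_{s+t}$ that the notation $e^{-x_0\sqrt{-\Delta}}$ suggests. The sketch below assumes the standard constant $c_n = \Gamma((n+1)/2)/\pi^{(n+1)/2}$; the function names, the truncation radius and the midpoint rule are illustrative choices rather than anything taken from the text.

```python
import math

def poisson_constant(n):
    """c_n = Gamma((n+1)/2) / pi^((n+1)/2), the usual Poisson kernel constant."""
    return math.gamma((n + 1) / 2.0) / math.pi ** ((n + 1) / 2.0)

def poisson_kernel_1d(x0, x):
    """P_{x0}(x) = c_1 * x0 / (x0^2 + x^2) in dimension n = 1."""
    return poisson_constant(1) * x0 / (x0 * x0 + x * x)

def convolve_at(s, t, x, radius=200.0, steps=40000):
    """Midpoint-rule value of (P_s * P_t)(x), truncated to |y| <= radius."""
    h = 2.0 * radius / steps
    total = 0.0
    for j in range(steps):
        y = -radius + (j + 0.5) * h
        total += poisson_kernel_1d(s, x - y) * poisson_kernel_1d(t, y)
    return total * h

if __name__ == "__main__":
    # The two printed numbers should agree to several digits.
    print(convolve_at(0.7, 1.3, 0.5), poisson_kernel_1d(2.0, 0.5))
```

The semigroup law is the real-space counterpart of $e^{-s|\xi|} e^{-t|\xi|} = e^{-(s+t)|\xi|}$ on the Fourier side.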
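35.3 Heat Equation on $\mathbb{R}^n$
The heat equation for a function $u : \mathbb{R}_+ \times \mathbb{R}^n \to \mathbb{C}$ is the partial differential equation
\[
\Big( \partial_t - \frac{1}{2} \Delta \Big) u = 0 \quad \text{with } u(0, x) = f(x), \tag{35.9}
\]
where $f$ is a given function on $\mathbb{R}^n$. By Fourier transforming Eq. (35.9) in the $x$ variables only, one finds that (35.9) implies that
\[
\Big( \partial_t + \frac{1}{2} |\xi|^2 \Big) \hat{u}(t, \xi) = 0 \quad \text{with } \hat{u}(0, \xi) = \hat{f}(\xi) \tag{35.10}
\]
and hence that $\hat{u}(t, \xi) = e^{-t|\xi|^2/2} \hat{f}(\xi)$. Inverting the Fourier transform then shows that
\[
u(t, x) = \mathcal{F}^{-1} \left( e^{-t|\xi|^2/2} \hat{f}(\xi) \right)(x) = \left( \mathcal{F}^{-1} \left( e^{-t|\xi|^2/2} \right) \bigstar f \right)(x) =: e^{t\Delta/2} f(x).
\]
From Example 34.4,
\[
\mathcal{F}^{-1} \left( e^{-t|\xi|^2/2} \right)(x) = p_t(x) = t^{-n/2} e^{-\frac{1}{2t}|x|^2}
\]
and therefore,
\[
u(t, x) = \int_{\mathbb{R}^n} p_t(x - y) f(y) \, \mathbf{d}y .
\]
This suggests the following theorem.
Theorem 35.3. Let
\[
\rho(t, x, y) := (2\pi t)^{-n/2} e^{-|x - y|^2 / 2t} \tag{35.11}
\]
be the heat kernel on $\mathbb{R}^n$. Then
\[
\Big( \partial_t - \frac{1}{2} \Delta_x \Big) \rho(t, x, y) = 0 \quad \text{and} \quad \lim_{t \downarrow 0} \rho(t, x, y) = \delta_x(y), \tag{35.12}
\]
where $\delta_x$ is the $\delta$-function at $x$ in $\mathbb{R}^n$. More precisely, if $f$ is a continuous bounded (can be relaxed considerably) function on $\mathbb{R}^n$, then
\[
u(t, x) = \int_{\mathbb{R}^n} \rho(t, x, y) f(y) \, dy
\]
is a solution to Eq. (35.9) where $u(0, x) := \lim_{t \downarrow 0} u(t, x)$.
Proof. Direct computations show that $\left( \partial_t - \frac{1}{2} \Delta_x \right) \rho(t, x, y) = 0$ and an application of Theorem 22.32 shows $\lim_{t \downarrow 0} \rho(t, x, y) = \delta_x(y)$ or equivalently that $\lim_{t \downarrow 0} \int_{\mathbb{R}^n} \rho(t, x, y) f(y) \, dy = f(x)$ uniformly on compact subsets of $\mathbb{R}^n$. This shows that $\lim_{t \downarrow 0} u(t, x) = f(x)$ uniformly on compact subsets of $\mathbb{R}^n$.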
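As a quick check of the heat kernel formula in dimension $n = 1$, one can take $f = \cos$, for which the solution of $\partial_t u = \frac{1}{2} u_{xx}$ with $u(0, x) = \cos x$ is $u(t, x) = e^{-t/2} \cos x$. The sketch below compares this closed form with a midpoint-rule evaluation of $\int \rho(t, x, y) f(y)\,dy$; the truncation to ten standard deviations and the step count are assumptions made for illustration only.

```python
import math

def heat_kernel(t, x, y):
    """rho(t, x, y) = (2 pi t)^(-1/2) exp(-(x - y)^2 / (2t)) in dimension n = 1."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def heat_solution(f, t, x, steps=4000):
    """Midpoint-rule value of u(t, x) = int rho(t, x, y) f(y) dy over x +/- 10 sqrt(t)."""
    radius = 10.0 * math.sqrt(t)
    h = 2.0 * radius / steps
    total = 0.0
    for j in range(steps):
        y = x - radius + (j + 0.5) * h
        total += heat_kernel(t, x, y) * f(y)
    return total * h

if __name__ == "__main__":
    t, x = 0.5, 0.2
    # Numerical convolution versus the closed form exp(-t/2) cos(x).
    print(heat_solution(math.cos, t, x), math.exp(-t / 2.0) * math.cos(x))
```

Evaluating `heat_solution(math.cos, t, x)` for smaller and smaller `t` also illustrates the initial condition $\lim_{t \downarrow 0} u(t, x) = f(x)$.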
This notation suggests that we should be able to compute the solution $g$ to $(m^2 - \Delta) g = f$ using
\[
g(x) = \left( m^2 - \Delta \right)^{-1} f(x) = \int_0^\infty \left( e^{-(m^2 - \Delta) t} f \right)(x) \, dt
= \int_0^\infty \left( e^{-m^2 t} p_{2t} \bigstar f \right)(x) \, dt,
\]
a fact which is easily verified using the Fourier transform. This gives us a method to compute $G_m(x)$ from the previous section, namely
\[
G_m(x) = \int_0^\infty e^{-m^2 t} p_{2t}(x) \, dt = \int_0^\infty (2t)^{-n/2} e^{-m^2 t - \frac{1}{4t}|x|^2} \, dt .
\]
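We make the change of variables, $\lambda = |x|^2 / 4t$ ($t = |x|^2 / 4\lambda$, $dt = -\frac{|x|^2}{4\lambda^2} d\lambda$) to find
\[
\begin{aligned}
G_m(x) &= \int_0^\infty (2t)^{-n/2} e^{-m^2 t - \frac{1}{4t}|x|^2} \, dt
= \int_0^\infty \Big( \frac{|x|^2}{2\lambda} \Big)^{-n/2} e^{-\lambda} e^{-m^2 |x|^2 / 4\lambda} \frac{|x|^2}{(2\lambda)^2} \, d\lambda \\
&= \frac{2^{(n/2 - 2)}}{|x|^{n-2}} \int_0^\infty \lambda^{n/2 - 2} e^{-\lambda} e^{-m^2 |x|^2 / 4\lambda} \, d\lambda . 
\end{aligned}
\tag{35.13}
\]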
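In case $n = 3$, Eq. (35.13) becomes
\[
G_m(x) = \frac{\sqrt{\pi}}{\sqrt{2} \, |x|} \int_0^\infty \frac{1}{\sqrt{\pi \lambda}} e^{-\lambda} e^{-m^2 |x|^2 / 4\lambda} \, d\lambda = \frac{\sqrt{\pi}}{\sqrt{2} \, |x|} e^{-m|x|}
\]
where the last equality follows from Exercise 35.1. Hence when $n = 3$ we have found
\[
\begin{aligned}
\left( m^2 - \Delta \right)^{-1} f(x) = G_m \bigstar f(x)
&= (2\pi)^{-3/2} \int_{\mathbb{R}^3} \frac{\sqrt{\pi}}{\sqrt{2} \, |x - y|} e^{-m|x - y|} f(y) \, dy \\
&= \int_{\mathbb{R}^3} \frac{1}{4\pi |x - y|} e^{-m|x - y|} f(y) \, dy . 
\end{aligned}
\tag{35.14}
\]
The function $\frac{1}{4\pi |x|} e^{-m|x|}$ is called the Yukawa potential.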
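The integral identity borrowed from Exercise 35.1, namely $\int_0^\infty (\pi\lambda)^{-1/2} e^{-\lambda} e^{-m^2 x^2/(4\lambda)}\,d\lambda = e^{-mx}$ for $x \geq 0$, and the constant $1/(4\pi)$ in the Yukawa potential can both be checked numerically. This is a sketch only: the substitution $\lambda = s^2$, the cutoff `upper` and the step count are implementation choices assumed here, not part of the text.

```python
import math

def laplace_type_integral(m, x, upper=8.0, steps=8000):
    """Midpoint-rule value of int_0^inf (pi*lam)^(-1/2) exp(-lam - m^2 x^2/(4 lam)) d lam.

    Substituting lam = s^2 removes the lam^(-1/2) singularity; the integrand
    becomes (2/sqrt(pi)) * exp(-s^2 - m^2 x^2 / (4 s^2)) on (0, upper).
    """
    c = (m * x) ** 2 / 4.0
    h = upper / steps
    total = 0.0
    for j in range(steps):
        s2 = ((j + 0.5) * h) ** 2
        total += math.exp(-s2 - c / s2)
    return 2.0 / math.sqrt(math.pi) * total * h

def yukawa_constant():
    """(2 pi)^(-3/2) * sqrt(pi) / sqrt(2), which should equal 1 / (4 pi)."""
    return (2.0 * math.pi) ** -1.5 * math.sqrt(math.pi) / math.sqrt(2.0)

if __name__ == "__main__":
    m, x = 2.0, 0.7
    print(laplace_type_integral(m, x), math.exp(-m * x))
    print(yukawa_constant(), 1.0 / (4.0 * math.pi))
```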
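Let us work out $G_m(x)$ for $n$ odd. By differentiating Eq. (35.27) of Exercise 35.1 we find
\[
\begin{aligned}
\int_0^\infty d\lambda \, \lambda^{k - 1/2} e^{-\frac{1}{4\lambda} x^2} e^{-\lambda m^2}
&= \int_0^\infty d\lambda \, \frac{1}{\sqrt{\lambda}} e^{-\frac{1}{4\lambda} x^2} \Big( -\frac{d}{da} \Big)^k e^{-\lambda a} \Big|_{a = m^2} \\
&= \Big( -\frac{d}{da} \Big)^k \frac{\sqrt{\pi}}{\sqrt{a}} e^{-\sqrt{a} x} \Big|_{a = m^2} = p_{m,k}(x) e^{-mx}
\end{aligned}
\]
where $p_{m,k}(x)$ is a polynomial in $x$ with $\deg p_{m,k} = k$ with
\[
p_{m,k}(0) = \sqrt{\pi} \Big( -\frac{d}{da} \Big)^k a^{-1/2} \Big|_{a = m^2}
= \sqrt{\pi} \Big( \frac{1}{2} \cdot \frac{3}{2} \cdots \frac{2k - 1}{2} \Big) m^{-(2k+1)}
= m^{-(2k+1)} \sqrt{\pi} \, 2^{-k} (2k - 1)!! .
\]
Letting $k - 1/2 = n/2 - 2$ and $m = 1$ we find $k = \frac{n - 3}{2} \in \mathbb{N}_0$ for $n = 3, 5, \dots$ and we find
\[
\int_0^\infty \lambda^{n/2 - 2} e^{-\frac{1}{4\lambda} x^2} e^{-\lambda} \, d\lambda = p_{1,k}(x) e^{-x} \quad \text{for all } x > 0.
\]
Therefore,
\[
G_m(x) = \frac{2^{(n/2 - 2)}}{|x|^{n-2}} \int_0^\infty \lambda^{n/2 - 2} e^{-\lambda} e^{-m^2 |x|^2 / 4\lambda} \, d\lambda
= \frac{2^{(n/2 - 2)}}{|x|^{n-2}} \, p_{1,(n-3)/2}(m|x|) \, e^{-m|x|} .
\]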
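Now for even $n$, I think we get Bessel functions in the answer. (BRUCE: look this up.) Let us at least work out the asymptotics of $G_m(x)$ for $x \to \infty$. To this end let
\[
\psi(y) := \int_0^\infty \lambda^{n/2 - 2} e^{-(\lambda + \lambda^{-1} y^2)} \, d\lambda = y^{n-2} \int_0^\infty \lambda^{n/2 - 2} e^{-(\lambda y^2 + \lambda^{-1})} \, d\lambda .
\]
The function $f_y(\lambda) := (y^2 \lambda + \lambda^{-1})$ satisfies,
\[
f_y'(\lambda) = \left( y^2 - \lambda^{-2} \right), \quad f_y''(\lambda) = 2\lambda^{-3} \quad \text{and} \quad f_y'''(\lambda) = -6\lambda^{-4}
\]
so by Taylor's theorem with remainder we learn
\[
f_y(\lambda) \cong 2y + y^3 (\lambda - y^{-1})^2 \quad \text{for all } \lambda > 0,
\]
see Figure 35.3 below.
Plot of $f_4$ and its second order Taylor approximation.
So by the usual asymptotics arguments,
\[
\begin{aligned}
\psi(y) &\cong y^{n-2} \int_{(-\varepsilon + y^{-1}, \, y^{-1} + \varepsilon)} \lambda^{n/2 - 2} e^{-(\lambda y^2 + \lambda^{-1})} \, d\lambda \\
&\cong y^{n-2} \int_{(-\varepsilon + y^{-1}, \, y^{-1} + \varepsilon)} \lambda^{n/2 - 2} \exp\left( -2y - y^3 (\lambda - y^{-1})^2 \right) d\lambda \\
&\cong y^{n-2} e^{-2y} \int_{\mathbb{R}} \lambda^{n/2 - 2} \exp\left( -y^3 (\lambda - y^{-1})^2 \right) d\lambda \quad (\text{let } \lambda \to \lambda y^{-1}) \\
&= e^{-2y} y^{n-2} y^{-n/2 + 1} \int_{\mathbb{R}} \lambda^{n/2 - 2} \exp\left( -y (\lambda - 1)^2 \right) d\lambda \\
&= e^{-2y} y^{n-2} y^{-n/2 + 1} \int_{\mathbb{R}} (\lambda + 1)^{n/2 - 2} \exp\left( -y \lambda^2 \right) d\lambda .
\end{aligned}
\]
The point is we are still going to get exponential decay at $\infty$.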
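When $m = 0$, Eq. (35.13) becomes
\[
G_0(x) = \frac{2^{(n/2 - 2)}}{|x|^{n-2}} \int_0^\infty \lambda^{n/2 - 1} e^{-\lambda} \frac{d\lambda}{\lambda} = \frac{2^{(n/2 - 2)}}{|x|^{n-2}} \Gamma(n/2 - 1)
\]
where $\Gamma(x)$ is the gamma function defined in Eq. (20.42). Hence for reasonable functions $f$ (and $n \neq 2$) we expect that (see Proposition 35.4 below)
\[
\begin{aligned}
(-\Delta)^{-1} f(x) = G_0 \bigstar f(x)
&= 2^{(n/2 - 2)} \Gamma(n/2 - 1) (2\pi)^{-n/2} \int_{\mathbb{R}^n} \frac{1}{|x - y|^{n-2}} f(y) \, dy \\
&= \frac{1}{4\pi^{n/2}} \Gamma(n/2 - 1) \int_{\mathbb{R}^n} \frac{1}{|x - y|^{n-2}} f(y) \, dy .
\end{aligned}
\]
The function
\[
G(x) := \frac{1}{4\pi^{n/2}} \Gamma(n/2 - 1) \frac{1}{|x|^{n-2}} \tag{35.15}
\]
is a "Green's function" for $-\Delta$. Recall from Exercise 20.16 that, for $n = 2k$, $\Gamma(\frac{n}{2} - 1) = \Gamma(k - 1) = (k - 2)!$, and for $n = 2k + 1$,
\[
\Gamma\Big( \frac{n}{2} - 1 \Big) = \Gamma(k - 1/2) = \Gamma(k - 1 + 1/2) = \sqrt{\pi} \, \frac{1 \cdot 3 \cdot 5 \cdots (2k - 3)}{2^{k-1}} = \sqrt{\pi} \, \frac{(2k - 3)!!}{2^{k-1}} \quad \text{where } (-1)!! =: 1.
\]
Hence
\[
G(x) = \frac{1}{4} \frac{1}{|x|^{n-2}}
\begin{cases}
\frac{1}{\pi^k} (k - 2)! & \text{if } n = 2k \\
\frac{1}{\pi^k} \frac{(2k - 3)!!}{2^{k-1}} & \text{if } n = 2k + 1
\end{cases}
\]
and in particular when $n = 3$,
\[
G(x) = \frac{1}{4\pi} \frac{1}{|x|}
\]
which is consistent with Eq. (35.14) with $m = 0$.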
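The even/odd case split for the constant in the Green's function can be cross-checked against the gamma function expression $\Gamma(n/2 - 1)/(4\pi^{n/2})$. The helper names below are hypothetical and the snippet is meant only as a sketch of that consistency check.

```python
import math

def double_factorial(k):
    """k!! for odd k >= -1, with the convention (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def green_constant(n):
    """C_n = Gamma(n/2 - 1) / (4 pi^(n/2)), so that G(x) = C_n / |x|^(n-2) for n >= 3."""
    return math.gamma(n / 2.0 - 1.0) / (4.0 * math.pi ** (n / 2.0))

def green_constant_closed_form(n):
    """The same constant computed from the even/odd case split."""
    k = n // 2
    if n % 2 == 0:
        # n = 2k: Gamma(k - 1) = (k - 2)!
        return math.factorial(k - 2) / (4.0 * math.pi ** k)
    # n = 2k + 1: Gamma(k - 1/2) = sqrt(pi) (2k - 3)!! / 2^(k - 1)
    return double_factorial(2 * k - 3) / (2 ** (k - 1) * 4.0 * math.pi ** k)

if __name__ == "__main__":
    for n in range(3, 9):
        print(n, green_constant(n), green_constant_closed_form(n))
```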
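Proposition 35.4. Let $n \geq 3$ and for $x \in \mathbb{R}^n$, let $\rho_t(x) = \rho(t, x, 0) := \left( \frac{1}{2\pi t} \right)^{n/2} e^{-\frac{1}{2t}|x|^2}$ (see Eq. (35.11)) and $G(x)$ be as in Eq. (35.15) so that
\[
G(x) := \frac{C_n}{|x|^{n-2}} = \frac{1}{2} \int_0^\infty \rho_t(x) \, dt \quad \text{for } x \neq 0.
\]
Then
\[
-\Delta (G * u) = -G * \Delta u = u
\]
for all $u \in C_c^2(\mathbb{R}^n)$.
Proof. For $f \in C_c(\mathbb{R}^n)$,
\[
G * f(x) = C_n \int_{\mathbb{R}^n} f(x - y) \frac{1}{|y|^{n-2}} \, dy
\]
is well defined, since
\[
\int_{\mathbb{R}^n} |f(x - y)| \frac{1}{|y|^{n-2}} \, dy \leq M \int_{|y| \leq R + |x|} \frac{1}{|y|^{n-2}} \, dy < \infty
\]
where $M$ is a bound on $f$ and $\mathrm{supp}(f) \subset B(0, R)$. Similarly, for $|x| \leq r$, we have
\[
\sup_{|x| \leq r} |f(x - y)| \frac{1}{|y|^{n-2}} \leq M 1_{\{|y| \leq R + r\}} \frac{1}{|y|^{n-2}} \in L^1(dy),
\]
from which it follows that $G * f$ is a continuous function. Similar arguments show if $f \in C_c^2(\mathbb{R}^n)$, then $G * f \in C^2(\mathbb{R}^n)$ and $\Delta (G * f) = G * \Delta f$. So to finish the proof it suffices to show $G * \Delta u = -u$.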
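For this we now write, making use of Fubini-Tonelli, integration by parts, the fact that $\partial_t \rho_t(y) = \frac{1}{2} \Delta \rho_t(y)$ and the dominated convergence theorem,
\[
\begin{aligned}
G * \Delta u(x) &= \frac{1}{2} \int_{\mathbb{R}^n} \Delta u(x - y) \Big( \int_0^\infty \rho_t(y) \, dt \Big) dy \\
&= \frac{1}{2} \int_0^\infty dt \int_{\mathbb{R}^n} \Delta u(x - y) \rho_t(y) \, dy \\
&= \frac{1}{2} \int_0^\infty dt \int_{\mathbb{R}^n} \Delta_y u(x - y) \rho_t(y) \, dy \\
&= \frac{1}{2} \int_0^\infty dt \int_{\mathbb{R}^n} u(x - y) \Delta_y \rho_t(y) \, dy \\
&= \int_0^\infty dt \int_{\mathbb{R}^n} u(x - y) \frac{d}{dt} \rho_t(y) \, dy \\
&= \lim_{\varepsilon \downarrow 0} \int_\varepsilon^\infty dt \int_{\mathbb{R}^n} u(x - y) \frac{d}{dt} \rho_t(y) \, dy \\
&= \lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}^n} u(x - y) \Big( \int_\varepsilon^\infty \frac{d}{dt} \rho_t(y) \, dt \Big) dy \\
&= -\lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}^n} u(x - y) \rho_\varepsilon(y) \, dy = -u(x),
\end{aligned}
\]
where in the last equality we have used the fact that $\rho_t$ is an approximate $\delta$-sequence.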
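35.4 Wave Equation on $\mathbb{R}^n$
Let us now consider the wave equation on $\mathbb{R}^n$,
\[
0 = \left( \partial_t^2 - \Delta \right) u(t, x) \quad \text{with } u(0, x) = f(x) \text{ and } u_t(0, x) = g(x). \tag{35.16}
\]
Taking the Fourier transform in the $x$ variables gives the following equation
\[
0 = \hat{u}_{tt}(t, \xi) + |\xi|^2 \hat{u}(t, \xi) \quad \text{with } \hat{u}(0, \xi) = \hat{f}(\xi) \text{ and } \hat{u}_t(0, \xi) = \hat{g}(\xi). \tag{35.17}
\]
The solution to these equations is
\[
\hat{u}(t, \xi) = \hat{f}(\xi) \cos(t|\xi|) + \hat{g}(\xi) \frac{\sin t|\xi|}{|\xi|}
\]
and hence we should have
\[
\begin{aligned}
u(t, x) &= \mathcal{F}^{-1} \Big( \hat{f}(\xi) \cos(t|\xi|) + \hat{g}(\xi) \frac{\sin t|\xi|}{|\xi|} \Big)(x) \\
&= \mathcal{F}^{-1} \cos(t|\xi|) \bigstar f(x) + \mathcal{F}^{-1} \frac{\sin t|\xi|}{|\xi|} \bigstar g(x) \\
&= \frac{d}{dt} \mathcal{F}^{-1} \Big[ \frac{\sin t|\xi|}{|\xi|} \Big] \bigstar f(x) + \mathcal{F}^{-1} \Big[ \frac{\sin t|\xi|}{|\xi|} \Big] \bigstar g(x).
\end{aligned}
\tag{35.18}
\]
The question now is how to interpret this equation. In particular what are the inverse Fourier transforms $\mathcal{F}^{-1} \cos(t|\xi|)$ and $\mathcal{F}^{-1} \frac{\sin t|\xi|}{|\xi|}$. Since $\frac{d}{dt} \mathcal{F}^{-1} \frac{\sin t|\xi|}{|\xi|} \bigstar f(x) = \mathcal{F}^{-1} \cos(t|\xi|) \bigstar f(x)$, it really suffices to understand $\mathcal{F}^{-1} \left[ \frac{\sin t|\xi|}{|\xi|} \right]$. The problem we immediately run into here is that $\frac{\sin t|\xi|}{|\xi|} \in L^2(\mathbb{R}^n)$ iff $n = 1$ so that is the case we should start with.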
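Again by complex contour integration methods one can show
\[
\left( \mathcal{F}^{-1} \xi^{-1} \sin t\xi \right)(x) = \frac{\pi}{\sqrt{2\pi}} \left( 1_{x + t > 0} - 1_{(x - t) > 0} \right)
= \frac{\pi}{\sqrt{2\pi}} \left( 1_{x > -t} - 1_{x > t} \right) = \frac{\pi}{\sqrt{2\pi}} 1_{[-t, t]}(x)
\]
where in writing the last line we have assumed that $t \geq 0$. Again this is easily seen to be correct because
\[
\begin{aligned}
\mathcal{F} \Big[ \frac{\pi}{\sqrt{2\pi}} 1_{[-t, t]}(x) \Big](\xi)
&= \frac{1}{2} \int_{\mathbb{R}} 1_{[-t, t]}(x) e^{-i \xi \cdot x} \, dx = \frac{1}{-2i\xi} e^{-i \xi \cdot x} \Big|_{-t}^{t} \\
&= \frac{1}{2i\xi} \left[ e^{i \xi t} - e^{-i \xi t} \right] = \xi^{-1} \sin t\xi .
\end{aligned}
\]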
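Therefore,
\[
\left( \mathcal{F}^{-1} \xi^{-1} \sin t\xi \right) \bigstar f(x) = \frac{1}{2} \int_{-t}^{t} f(x - y) \, dy
\]
and the solution to the one dimensional wave equation is
\[
\begin{aligned}
u(t, x) &= \frac{d}{dt} \frac{1}{2} \int_{-t}^{t} f(x - y) \, dy + \frac{1}{2} \int_{-t}^{t} g(x - y) \, dy \\
&= \frac{1}{2} \left( f(x - t) + f(x + t) \right) + \frac{1}{2} \int_{-t}^{t} g(x - y) \, dy \\
&= \frac{1}{2} \left( f(x - t) + f(x + t) \right) + \frac{1}{2} \int_{x - t}^{x + t} g(y) \, dy .
\end{aligned}
\]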
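We can arrive at this same solution by more elementary means as follows. We first note in the one dimensional case that the wave operator factors, namely
\[
0 = \left( \partial_t^2 - \partial_x^2 \right) u(t, x) = \left( \partial_t - \partial_x \right) \left( \partial_t + \partial_x \right) u(t, x).
\]
Let $U(t, x) := \left( \partial_t + \partial_x \right) u(t, x)$, then the wave equation states $\left( \partial_t - \partial_x \right) U = 0$ and hence by the chain rule $\frac{d}{dt} U(t, x - t) = 0$. So
\[
U(t, x - t) = U(0, x) = g(x) + f'(x)
\]
and replacing $x$ by $x + t$ in this equation shows
\[
\left( \partial_t + \partial_x \right) u(t, x) = U(t, x) = g(x + t) + f'(x + t).
\]
Working similarly, we learn that
\[
\frac{d}{dt} u(t, x + t) = g(x + 2t) + f'(x + 2t)
\]
which upon integration implies
\[
\begin{aligned}
u(t, x + t) &= u(0, x) + \int_0^t \left\{ g(x + 2\tau) + f'(x + 2\tau) \right\} d\tau \\
&= f(x) + \int_0^t g(x + 2\tau) \, d\tau + \frac{1}{2} f(x + 2\tau) \Big|_0^t \\
&= \frac{1}{2} \left( f(x) + f(x + 2t) \right) + \int_0^t g(x + 2\tau) \, d\tau .
\end{aligned}
\]
Replacing $x \to x - t$ in this equation gives
\[
u(t, x) = \frac{1}{2} \left( f(x - t) + f(x + t) \right) + \int_0^t g(x - t + 2\tau) \, d\tau
\]
and then letting $y = x - t + 2\tau$ in the last integral shows again that
\[
u(t, x) = \frac{1}{2} \left( f(x - t) + f(x + t) \right) + \frac{1}{2} \int_{x - t}^{x + t} g(y) \, dy .
\]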
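The one dimensional solution formula $u(t, x) = \frac{1}{2}(f(x - t) + f(x + t)) + \frac{1}{2}\int_{x-t}^{x+t} g(y)\,dy$ is simple enough to evaluate directly. The sketch below does so with a midpoint rule and estimates $u_{tt} - u_{xx}$ by central differences; the initial data, step sizes and function names are illustrative assumptions.

```python
import math

def dalembert(f, g, t, x, steps=2000):
    """u(t, x) = (f(x - t) + f(x + t)) / 2 + (1/2) int_{x-t}^{x+t} g(y) dy (midpoint rule)."""
    h = 2.0 * t / steps
    integral = 0.0
    for j in range(steps):
        integral += g(x - t + (j + 0.5) * h)
    integral *= h
    return 0.5 * (f(x - t) + f(x + t)) + 0.5 * integral

def wave_residual(f, g, t, x, d=1e-2):
    """Central-difference estimate of u_tt - u_xx at (t, x); near 0 for a solution."""
    def u(s, y):
        return dalembert(f, g, s, y)
    u_tt = (u(t + d, x) - 2.0 * u(t, x) + u(t - d, x)) / d ** 2
    u_xx = (u(t, x + d) - 2.0 * u(t, x) + u(t, x - d)) / d ** 2
    return u_tt - u_xx

def initial_profile(y):
    """Illustrative initial displacement f (hypothetical choice)."""
    return math.exp(-y * y)

if __name__ == "__main__":
    # Initial velocity g = sin is likewise just an example.
    print(wave_residual(initial_profile, math.sin, 0.7, 0.4))
```

For exact solutions of the form $F(x - t) + G(x + t)$ the two second differences agree identically, so the printed residual reflects only quadrature and rounding error.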
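When $n > 3$ it is necessary to treat $\mathcal{F}^{-1} \left[ \frac{\sin t|\xi|}{|\xi|} \right]$ as a "distribution" or "generalized function," see Section 36 below. So for now let us take $n = 3$, in which case from Example 34.18 it follows that
\[
\mathcal{F}^{-1} \Big[ \frac{\sin t|\xi|}{|\xi|} \Big] = \frac{t}{4\pi t^2} \sigma_t = t \bar{\sigma}_t \tag{35.19}
\]
where $\bar{\sigma}_t$ is $\frac{1}{4\pi t^2} \sigma_t$, the surface measure on $S_t$ normalized to have total measure one. Hence from Eq. (35.18) the solution to the three dimensional wave equation should be given by
\[
u(t, x) = \frac{d}{dt} \left( t \bar{\sigma}_t * f(x) \right) + t \bar{\sigma}_t * g(x). \tag{35.20}
\]
Using this definition in Eq. (35.20) gives
\[
\begin{aligned}
u(t, x) &= \frac{d}{dt} \Big\{ t \int_{S_t} f(x - y) \, d\bar{\sigma}_t(y) \Big\} + t \int_{S_t} g(x - y) \, d\bar{\sigma}_t(y) \\
&= \frac{d}{dt} \Big\{ t \int_{S_1} f(x - t\omega) \, d\bar{\sigma}_1(\omega) \Big\} + t \int_{S_1} g(x - t\omega) \, d\bar{\sigma}_1(\omega) \\
&= \frac{d}{dt} \Big\{ t \int_{S_1} f(x + t\omega) \, d\bar{\sigma}_1(\omega) \Big\} + t \int_{S_1} g(x + t\omega) \, d\bar{\sigma}_1(\omega).
\end{aligned}
\tag{35.21}
\]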
Proposition 35.5. Suppose $f \in C^3(\mathbb{R}^3)$ and $g \in C^2(\mathbb{R}^3)$, then $u(t, x)$ defined by Eq. (35.21) is in $C^2(\mathbb{R} \times \mathbb{R}^3)$ and is a classical solution of the wave equation in Eq. (35.16).
Proof. The fact that $u \in C^2(\mathbb{R} \times \mathbb{R}^3)$ follows by the usual differentiation under the integral arguments. Suppose we can prove the proposition in the special case that $f \equiv 0$. Then for $f \in C^3(\mathbb{R}^3)$, the function $v(t, x) = t \int_{S_1} f(x + t\omega) \, d\bar{\sigma}_1(\omega)$ solves the wave equation $0 = \left( \partial_t^2 - \Delta \right) v(t, x)$ with $v(0, x) = 0$ and $v_t(0, x) = f(x)$. Differentiating the wave equation in $t$ shows $u = v_t$ also solves the wave equation with $u(0, x) = f(x)$ and $u_t(0, x) = v_{tt}(0, x) = \Delta_x v(0, x) = 0$. These remarks reduce the problem to showing $u$ in Eq. (35.21) with $f \equiv 0$ solves the wave equation. So let
\[
u(t, x) := t \int_{S_1} g(x + t\omega) \, d\bar{\sigma}_1(\omega). \tag{35.22}
\]
We now give two proofs that $u$ solves the wave equation.
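Proof 1. Since solving the wave equation is a local statement and $u(t, x)$ only depends on the values of $g$ in $B(x, t)$, it suffices to consider the case where $g \in C_c^2(\mathbb{R}^3)$. Taking the Fourier transform of Eq. (35.22) in the $x$ variable shows
\[
\begin{aligned}
\hat{u}(t, \xi) &= t \int_{S_1} d\bar{\sigma}_1(\omega) \int_{\mathbb{R}^3} g(x + t\omega) e^{-i \xi \cdot x} \, \mathbf{d}x \\
&= t \int_{S_1} d\bar{\sigma}_1(\omega) \int_{\mathbb{R}^3} g(x) e^{-i \xi \cdot x} e^{i t \omega \cdot \xi} \, \mathbf{d}x = \hat{g}(\xi) \, t \int_{S_1} e^{i t \omega \cdot \xi} \, d\bar{\sigma}_1(\omega) \\
&= \hat{g}(\xi) \, t \, \frac{\sin |t\xi|}{|t\xi|} = \hat{g}(\xi) \frac{\sin(t|\xi|)}{|\xi|}
\end{aligned}
\]
wherein we have made use of Example 34.18. This completes the proof since $\hat{u}(t, \xi)$ solves Eq. (35.17) as desired.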
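Proof 2. Differentiating
\[
S(t, x) := \int_{S_1} g(x + t\omega) \, d\bar{\sigma}_1(\omega)
\]
in $t$ gives
\[
\begin{aligned}
S_t(t, x) &= \frac{1}{4\pi} \int_{S_1} \nabla g(x + t\omega) \cdot \omega \, d\sigma(\omega) \\
&= \frac{1}{4\pi} \int_{B(0,1)} \nabla_\omega \cdot \nabla g(x + t\omega) \, dm(\omega) \\
&= \frac{t}{4\pi} \int_{B(0,1)} \Delta g(x + t\omega) \, dm(\omega) \\
&= \frac{1}{4\pi t^2} \int_{B(0,t)} \Delta g(x + y) \, dm(y) \\
&= \frac{1}{4\pi t^2} \int_0^t dr \int_{|y| = r} \Delta g(x + y) \, d\sigma(y)
\end{aligned}
\]
where we have used the divergence theorem, made the change of variables $y = t\omega$ and used the disintegration formula in Eq. (20.34),
\[
\int_{\mathbb{R}^n} f(x) \, dm(x) = \int_{[0, \infty) \times S^{n-1}} f(r\omega) \, d\sigma(\omega) \, r^{n-1} dr = \int_0^\infty dr \int_{|y| = r} f(y) \, d\sigma(y).
\]
Since $u(t, x) = t S(t, x)$ it follows that
\[
\begin{aligned}
u_{tt}(t, x) &= \frac{\partial}{\partial t} \left[ S(t, x) + t S_t(t, x) \right] \\
&= S_t(t, x) + \frac{\partial}{\partial t} \Big[ \frac{1}{4\pi t} \int_0^t dr \int_{|y| = r} \Delta g(x + y) \, d\sigma(y) \Big] \\
&= S_t(t, x) - \frac{1}{4\pi t^2} \int_0^t dr \int_{|y| = r} \Delta g(x + y) \, d\sigma(y) + \frac{1}{4\pi t} \int_{|y| = t} \Delta g(x + y) \, d\sigma(y) \\
&= S_t(t, x) - S_t(t, x) + \frac{t}{4\pi t^2} \int_{|y| = t} \Delta g(x + y) \, d\sigma(y) \\
&= t \int_{S_1} \Delta g(x + t\omega) \, d\bar{\sigma}_1(\omega) = \Delta u(t, x)
\end{aligned}
\]
as required.
Fig. 35.1. The geometry of the solution to the wave equation in three dimensions. The observer sees a flash at $t = 0$ and $x = 0$ only at time $t = |x|$. The wave propagates sharply with speed 1.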
The solution in Eq. (35.21) exhibits a basic property of wave equations, namely finite propagation speed. To exhibit the finite propagation speed, suppose that $f = 0$ (for simplicity) and $g$ has compact support near the origin, for example think of $g = \delta_0(x)$. Then $x + tw = 0$ for some $w$ iff $|x| = t$. Hence the "wave front" propagates at unit speed and the wave front is sharp. See Figure 35.1 below.
The solution of the two dimensional wave equation may be found using "Hadamard's method of descent" which we now describe. Suppose now that $f$ and $g$ are functions on $\mathbb{R}^2$ which we may view as functions on $\mathbb{R}^3$ which happen not to depend on the third coordinate. We now go ahead and solve the three dimensional wave equation using Eq. (35.21) and $f$ and $g$ as initial conditions. It is easily seen that the solution $u(t, x, y, z)$ is again independent of $z$ and hence is a solution to the two dimensional wave equation. See Figure 35.2 below.
Fig. 35.2. The geometry of the solution to the wave equation in two dimensions. A flash at $0 \in \mathbb{R}^2$ looks like a line of flashes to the fictitious 3-d observer and hence she sees the effect of the flash for $t \geq |x|$. The wave still propagates with speed 1. However there is no longer sharp propagation of the wave front, similar to water waves.
Notice that we still have finite speed of propagation but no longer sharp propagation. The explicit formula for $u$ is given in the next proposition.
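Proposition 35.6. Suppose $f \in C^3(\mathbb{R}^2)$ and $g \in C^2(\mathbb{R}^2)$, then
\[
u(t, x) := \frac{\partial}{\partial t} \Big[ \frac{t}{2\pi} \iint_{D_1} \frac{f(x + tw)}{\sqrt{1 - |w|^2}} \, dm(w) \Big] + \frac{t}{2\pi} \iint_{D_1} \frac{g(x + tw)}{\sqrt{1 - |w|^2}} \, dm(w)
\]
is in $C^2(\mathbb{R} \times \mathbb{R}^2)$ and solves the wave equation in Eq. (35.16).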
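Proof. As usual it suffices to consider the case where $f \equiv 0$. By symmetry $u$ may be written as
\[
u(t, x) = 2t \int_{S_t^+} g(x - y) \, d\bar{\sigma}_t(y) = 2t \int_{S_t^+} g(x + y) \, d\bar{\sigma}_t(y)
\]
where $S_t^+$ is the portion of $S_t$ with $z \geq 0$. The surface $S_t^+$ may be parametrized by $R(u, v) = (u, v, \sqrt{t^2 - u^2 - v^2})$ with $(u, v) \in D_t := \left\{ (u, v) : u^2 + v^2 \leq t^2 \right\}$. In these coordinates we have
\[
\begin{aligned}
4\pi t^2 \, d\bar{\sigma}_t &= \left| \left( -\partial_u \sqrt{t^2 - u^2 - v^2}, \, -\partial_v \sqrt{t^2 - u^2 - v^2}, \, 1 \right) \right| du \, dv \\
&= \left| \Big( \frac{u}{\sqrt{t^2 - u^2 - v^2}}, \, \frac{v}{\sqrt{t^2 - u^2 - v^2}}, \, 1 \Big) \right| du \, dv \\
&= \sqrt{\frac{u^2 + v^2}{t^2 - u^2 - v^2} + 1} \; du \, dv = \frac{|t|}{\sqrt{t^2 - u^2 - v^2}} \, du \, dv
\end{aligned}
\]
and therefore,
\[
\begin{aligned}
u(t, x) &= \frac{2t}{4\pi t^2} \int_{D_t} g\big( x + (u, v, \sqrt{t^2 - u^2 - v^2}) \big) \frac{|t|}{\sqrt{t^2 - u^2 - v^2}} \, du \, dv \\
&= \frac{1}{2\pi} \mathrm{sgn}(t) \int_{D_t} \frac{g(x + (u, v))}{\sqrt{t^2 - u^2 - v^2}} \, du \, dv .
\end{aligned}
\]
This may be written as
\[
\begin{aligned}
u(t, x) &= \frac{1}{2\pi} \mathrm{sgn}(t) \iint_{D_t} \frac{g(x + w)}{\sqrt{t^2 - |w|^2}} \, dm(w) \\
&= \frac{1}{2\pi} \mathrm{sgn}(t) \frac{t^2}{|t|} \iint_{D_1} \frac{g(x + tw)}{\sqrt{1 - |w|^2}} \, dm(w) \\
&= \frac{1}{2\pi} \, t \iint_{D_1} \frac{g(x + tw)}{\sqrt{1 - |w|^2}} \, dm(w).
\end{aligned}
\]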
35.5 Elliptic Regularity
The following theorem is a special case of the main theorem (Theorem 35.11)
of this section.
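Theorem 35.7. Suppose that $M \subset_o \mathbb{R}^n$, $v \in C^\infty(M)$ and $u \in L^1_{loc}(M)$ satisfies $\Delta u = v$ weakly, then $u$ has a (necessarily unique) version $\tilde{u} \in C^\infty(M)$.
Proof. We may always assume $n \geq 3$, by embedding the $n = 1$ and $n = 2$ cases in the $n = 3$ cases. For notational simplicity, assume $0 \in M$ and we will show $u$ is smooth near $0$. To this end let $\theta \in C_c^\infty(M)$ such that $\theta = 1$ in a neighborhood of $0$ and $\alpha \in C_c^\infty(M)$ such that $\mathrm{supp}(\alpha) \subset \{\theta = 1\}$ and $\alpha = 1$ in a neighborhood of $0$ as well, see Figure 35.3. Then formally, we have with $\beta := 1 - \alpha$,
\[
\begin{aligned}
G * (\theta v) &= G * (\theta \Delta u) = G * (\theta \Delta (\alpha u + \beta u)) \\
&= G * (\Delta (\alpha u) + \theta \Delta (\beta u)) = \alpha u + G * (\theta \Delta (\beta u))
\end{aligned}
\]
so that
\[
\alpha u(x) = G * (\theta v)(x) - G * (\theta \Delta (\beta u))(x)
\]
for $x \in \mathrm{supp}(\alpha)$. The last term is formally given by
\[
\begin{aligned}
G * (\theta \Delta (\beta u))(x) &= \int_{\mathbb{R}^n} G(x - y) \theta(y) \Delta(\beta(y) u(y)) \, dy \\
&= \int_{\mathbb{R}^n} \beta(y) \Delta_y \left[ G(x - y) \theta(y) \right] \cdot u(y) \, dy
\end{aligned}
\]
which makes sense for $x$ near $0$. Therefore we find
\[
u(x) = G * (\theta v)(x) - \int_{\mathbb{R}^n} \beta(y) \Delta_y \left[ G(x - y) \theta(y) \right] \cdot u(y) \, dy .
\]
Fig. 35.3. The region $M$ and the cutoff functions, $\theta$ and $\alpha$.
Clearly all of the above manipulations were correct if we know $u$ were $C^2$ to begin with. So for the general case, let $u_n = u * \delta_n$ with $\{\delta_n\}_{n=1}^\infty$ the usual sort of $\delta$-sequence approximation. Then $\Delta u_n = v * \delta_n =: v_n$ away from $\partial M$ and
\[
u_n(x) = G * (\theta v_n)(x) - \int_{\mathbb{R}^n} \beta(y) \Delta_y \left[ G(x - y) \theta(y) \right] \cdot u_n(y) \, dy . \tag{35.23}
\]
Since $u_n \to u$ in $L^1_{loc}(O)$ where $O$ is a sufficiently small neighborhood of $0$, we may pass to the limit in Eq. (35.23) to find $u(x) = \tilde{u}(x)$ for a.e. $x \in O$ where
\[
\tilde{u}(x) := G * (\theta v)(x) - \int_{\mathbb{R}^n} \beta(y) \Delta_y \left[ G(x - y) \theta(y) \right] \cdot u(y) \, dy .
\]
This concludes the proof since $\tilde{u}$ is smooth for $x$ near $0$.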
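Definition 35.8. We say $L = p(D_x)$ as defined in Eq. (35.1) is elliptic if $p_k(\xi) := \sum_{|\alpha| = k} a_\alpha \xi^\alpha$ is zero iff $\xi = 0$. We will also say the polynomial $p(\xi) := \sum_{|\alpha| \leq k} a_\alpha \xi^\alpha$ is elliptic if this condition holds.
Remark 35.9. If $p(\xi) := \sum_{|\alpha| \leq k} a_\alpha \xi^\alpha$ is an elliptic polynomial, then there exists $A < \infty$ such that $\inf_{|\xi| \geq A} |p(\xi)| > 0$. Since $p_k(\xi)$ is everywhere non-zero for $\xi \in S^{n-1}$ and $S^{n-1} \subset \mathbb{R}^n$ is compact, $\varepsilon := \inf_{|\xi| = 1} |p_k(\xi)| > 0$. By homogeneity this implies
\[
|p_k(\xi)| \geq \varepsilon |\xi|^k \quad \text{for all } \xi \in \mathbb{R}^n.
\]
Since
\[
|p(\xi)| = \Big| p_k(\xi) + \sum_{|\alpha| < k} a_\alpha \xi^\alpha \Big| \geq |p_k(\xi)| - \Big| \sum_{|\alpha| < k} a_\alpha \xi^\alpha \Big| \geq \varepsilon |\xi|^k - C \left( 1 + |\xi|^{k-1} \right)
\]
for some constant $C < \infty$ from which it is easily seen that for $A$ sufficiently large,
\[
|p(\xi)| \geq \frac{\varepsilon}{2} |\xi|^k \quad \text{for all } |\xi| \geq A.
\]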
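To make the lower bound $|p(\xi)| \geq \frac{\varepsilon}{2}|\xi|^k$ for $|\xi| \geq A$ concrete, here is a small numerical illustration. The polynomial below is a hypothetical example chosen for this sketch (it does not appear in the text): its principal part is $p_2(\xi) = \xi_1^2 + \xi_2^2$, so $\varepsilon = 1$, and the lower order terms satisfy $|3\xi_1 - 2i\xi_2 + 1| \leq \sqrt{13}\,|\xi| + 1$, which suggests that $A = 8$ is large enough.

```python
import math

def p(xi1, xi2):
    """A sample elliptic polynomial of degree k = 2 (hypothetical example)."""
    return xi1 ** 2 + xi2 ** 2 + 3.0 * xi1 - 2.0j * xi2 + 1.0

def min_ratio(radius, samples=720):
    """Smallest value of |p(xi)| / |xi|^2 over sample points on the circle |xi| = radius."""
    best = float("inf")
    for j in range(samples):
        angle = 2.0 * math.pi * j / samples
        xi1 = radius * math.cos(angle)
        xi2 = radius * math.sin(angle)
        best = min(best, abs(p(xi1, xi2)) / radius ** 2)
    return best

if __name__ == "__main__":
    # With epsilon = 1 the remark predicts a ratio of at least 1/2 once |xi| >= A.
    for radius in (8.0, 10.0, 20.0, 100.0):
        print(radius, min_ratio(radius))
```

Sampling finitely many points on each circle is of course not a proof; it only illustrates how the ratio approaches $\varepsilon = 1$ as $|\xi| \to \infty$.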
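For the rest of this section, let $L = p(D_x)$ be an elliptic operator and $M \subset_o \mathbb{R}^n$. As mentioned at the beginning of this section, the formal solution to $Lu = v$ for $v \in L^2(\mathbb{R}^n)$ is given by
\[
u = L^{-1} v = G * v
\]
where
\[
G(x) := \int_{\mathbb{R}^n} \frac{1}{p(\xi)} e^{i x \cdot \xi} \, d\xi .
\]
Of course this integral may not be convergent because of the possible zeros of $p$ and the fact $\frac{1}{p(\xi)}$ may not decay fast enough at infinity. We will introduce a smooth cut off function $\chi(\xi)$ which is $1$ on $C_0(A) := \{x \in \mathbb{R}^n : |x| \leq A\}$ and $\mathrm{supp}(\chi) \subset C_0(2A)$ where $A$ is as in Remark 35.9. Then for $M > 0$ let
\[
G_M(x) = \int_{\mathbb{R}^n} \frac{(1 - \chi(\xi)) \chi(\xi / M)}{p(\xi)} e^{i x \cdot \xi} \, d\xi, \tag{35.24}
\]
\[
\delta(x) := \chi^\vee(x) = \int_{\mathbb{R}^n} \chi(\xi) e^{i x \cdot \xi} \, d\xi, \quad \text{and} \quad \delta_M(x) = M^n \delta(Mx). \tag{35.25}
\]
Notice $\int_{\mathbb{R}^n} \delta(x) \, dx = \mathcal{F}\delta(0) = \chi(0) = 1$, $\delta \in \mathcal{S}$ since $\chi \in \mathcal{S}$ and
\[
L G_M(x) = \int_{\mathbb{R}^n} (1 - \chi(\xi)) \chi(\xi / M) e^{i x \cdot \xi} \, d\xi = \int_{\mathbb{R}^n} \left[ \chi(\xi / M) - \chi(\xi) \right] e^{i x \cdot \xi} \, d\xi
= \delta_M(x) - \delta(x)
\]
provided $M > 2$.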
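Proposition 35.10. Let $p$ be an elliptic polynomial of degree $m$. The function $G_M$ defined in Eq. (35.24) satisfies the following properties,
1. $G_M \in \mathcal{S}$ for all $M > 0$.
2. $L G_M(x) = M^n \delta(Mx) - \delta(x)$.
3. There exists $G \in C_c^\infty(\mathbb{R}^n \setminus \{0\})$ such that for all multi-indices $\alpha$, $\lim_{M \to \infty} \partial^\alpha G_M(x) = \partial^\alpha G(x)$ uniformly on compact subsets in $\mathbb{R}^n \setminus \{0\}$.
Proof. We have already proved the first two items. For item 3., we notice that
\[
\begin{aligned}
(-x)^\beta D^\alpha G_M(x) &= \int_{\mathbb{R}^n} \frac{(1 - \chi(\xi)) \chi(\xi / M) \xi^\alpha}{p(\xi)} (-D_\xi)^\beta e^{i x \cdot \xi} \, d\xi \\
&= \int_{\mathbb{R}^n} D_\xi^\beta \Big[ \frac{\xi^\alpha (1 - \chi(\xi))}{p(\xi)} \chi(\xi / M) \Big] e^{i x \cdot \xi} \, d\xi \\
&= \int_{\mathbb{R}^n} D_\xi^\beta \frac{\xi^\alpha (1 - \chi(\xi))}{p(\xi)} \cdot \chi(\xi / M) e^{i x \cdot \xi} \, d\xi + R_M(x)
\end{aligned}
\]
where
\[
R_M(x) = \sum_{\gamma < \beta} \binom{\beta}{\gamma} M^{|\gamma| - |\beta|} \int_{\mathbb{R}^n} D_\xi^\gamma \frac{\xi^\alpha (1 - \chi(\xi))}{p(\xi)} \cdot \left( D^{\beta - \gamma} \chi \right)(\xi / M) e^{i x \cdot \xi} \, d\xi .
\]
Using
\[
\Big| D_\xi^\gamma \Big[ \frac{\xi^\alpha}{p(\xi)} (1 - \chi(\xi)) \Big] \Big| \leq C |\xi|^{|\alpha| - m - |\gamma|}
\]
and the fact that
\[
\mathrm{supp}\big( \left( D^{\beta - \gamma} \chi \right)(\xi / M) \big) \subset \{\xi \in \mathbb{R}^n : A \leq |\xi| / M \leq 2A\} = \{\xi \in \mathbb{R}^n : AM \leq |\xi| \leq 2AM\}
\]
we easily estimate
\[
\begin{aligned}
|R_M(x)| &\leq C \sum_{\gamma < \beta} \binom{\beta}{\gamma} M^{|\gamma| - |\beta|} \int_{\{\xi \in \mathbb{R}^n : AM \leq |\xi| \leq 2AM\}} |\xi|^{|\alpha| - m - |\gamma|} \, d\xi \\
&\leq C \sum_{\gamma < \beta} \binom{\beta}{\gamma} M^{|\gamma| - |\beta|} M^{|\alpha| - m - |\gamma| + n} = C M^{|\alpha| - |\beta| - m + n}.
\end{aligned}
\]
Therefore, $R_M \to 0$ uniformly in $x$ as $M \to \infty$ provided $|\beta| > |\alpha| - m + n$. It follows easily now that $G_M \to G$ in $C_c^\infty(\mathbb{R}^n \setminus \{0\})$ and furthermore that
\[
(-x)^\beta D^\alpha G(x) = \int_{\mathbb{R}^n} D_\xi^\beta \frac{\xi^\alpha (1 - \chi(\xi))}{p(\xi)} \cdot e^{i x \cdot \xi} \, d\xi
\]
provided $\beta$ is sufficiently large. In particular we have shown,
\[
D^\alpha G(x) = \frac{1}{|x|^{2k}} \int_{\mathbb{R}^n} (-\Delta_\xi)^k \frac{\xi^\alpha (1 - \chi(\xi))}{p(\xi)} e^{i x \cdot \xi} \, d\xi
\]
provided $m - |\alpha| + 2k > n$, i.e. $k > (n - m + |\alpha|) / 2$. We are now ready to use this result to prove elliptic regularity for the constant coefficient case.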
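Theorem 35.11. Suppose $L = p(D_x)$ is an elliptic differential operator on $\mathbb{R}^n$, $M \subset_o \mathbb{R}^n$, $v \in C^\infty(M)$ and $u \in L^1_{loc}(M)$ satisfies $Lu = v$ weakly, then $u$ has a (necessarily unique) version $\tilde{u} \in C^\infty(M)$.
Proof. For notational simplicity, assume $0 \in M$ and we will show $u$ is smooth near $0$. To this end let $\theta \in C_c^\infty(M)$ such that $\theta = 1$ in a neighborhood of $0$ and $\alpha \in C_c^\infty(M)$ such that $\mathrm{supp}(\alpha) \subset \{\theta = 1\}$, and $\alpha = 1$ in a neighborhood of $0$ as well. Then formally, we have with $\beta := 1 - \alpha$,
\[
\begin{aligned}
G_M * (\theta v) &= G_M * (\theta L u) = G_M * (\theta L (\alpha u + \beta u)) \\
&= G_M * (L (\alpha u) + \theta L (\beta u)) \\
&= \delta_M * (\alpha u) - \delta * (\alpha u) + G_M * (\theta L (\beta u))
\end{aligned}
\]
so that
\[
\delta_M * (\alpha u)(x) = G_M * (\theta v)(x) - G_M * (\theta L (\beta u))(x) + \delta * (\alpha u). \tag{35.26}
\]
Since
\[
\mathcal{F} \left[ G_M * (\theta v) \right](\xi) = \hat{G}_M(\xi) (\theta v)^\wedge(\xi) = \frac{(1 - \chi(\xi)) \chi(\xi / M)}{p(\xi)} (\theta v)^\wedge(\xi)
\to \frac{(1 - \chi(\xi))}{p(\xi)} (\theta v)^\wedge(\xi) \quad \text{as } M \to \infty
\]
with the convergence taking place in $L^2$ (actually in $\mathcal{S}$), it follows that
\[
\begin{aligned}
G_M * (\theta v) \to \text{``} G * (\theta v) \text{''}(x) &:= \int_{\mathbb{R}^n} \frac{(1 - \chi(\xi))}{p(\xi)} (\theta v)^\wedge(\xi) e^{i x \cdot \xi} \, d\xi \\
&= \mathcal{F}^{-1} \Big[ \frac{(1 - \chi(\xi))}{p(\xi)} (\theta v)^\wedge(\xi) \Big](x) \in \mathcal{S}.
\end{aligned}
\]
So passing to the limit, $M \to \infty$, in Eq. (35.26) we learn for almost every $x \in \mathbb{R}^n$,
\[
u(x) = G * (\theta v)(x) - \lim_{M \to \infty} G_M * (\theta L (\beta u))(x) + \delta * (\alpha u)(x)
\]
for a.e. $x \in \mathrm{supp}(\alpha)$. Using the support properties of $\theta$ and $\beta$ we see for $x$ near $0$ that $(\theta L (\beta u))(y) = 0$ unless $y \in \mathrm{supp}(\theta)$ and $y \notin \{\alpha = 1\}$, i.e. unless $y$ is in an annulus centered at $0$. So taking $x$ sufficiently close to $0$, we find $x - y$ stays away from $0$ as $y$ varies through the above mentioned annulus, and therefore
\[
\begin{aligned}
G_M * (\theta L (\beta u))(x) &= \int_{\mathbb{R}^n} G_M(x - y) (\theta L (\beta u))(y) \, dy \\
&= \int_{\mathbb{R}^n} L_y^* \left\{ \theta(y) G_M(x - y) \right\} \cdot (\beta u)(y) \, dy \\
&\to \int_{\mathbb{R}^n} L_y^* \left\{ \theta(y) G(x - y) \right\} \cdot (\beta u)(y) \, dy \quad \text{as } M \to \infty .
\end{aligned}
\]
Therefore we have shown,
\[
u(x) = G * (\theta v)(x) - \int_{\mathbb{R}^n} L_y^* \left\{ \theta(y) G(x - y) \right\} \cdot (\beta u)(y) \, dy + \delta * (\alpha u)(x)
\]
for almost every $x$ in a neighborhood of $0$. (Again it suffices to prove this equation and in particular Eq. (35.26) assuming $u \in C^2(M)$ because of the same convolution argument we have used above.) Since the right side of this equation is the linear combination of smooth functions we have shown $u$ has a smooth version in a neighborhood of $0$.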
Remarks 35.12 We could avoid introducing $G_M(x)$ if $\deg(p) > n$, in which case $\frac{(1 - \chi(\xi))}{p(\xi)} \in L^1$ and so
\[
G(x) := \int_{\mathbb{R}^n} \frac{(1 - \chi(\xi))}{p(\xi)} e^{i x \cdot \xi} \, d\xi
\]
is already a well defined function with $G \in C^\infty(\mathbb{R}^n \setminus \{0\}) \cap BC(\mathbb{R}^n)$. If $\deg(p) < n$, we may consider the operator $L^k = [p(D_x)]^k = p^k(D_x)$ where $k$ is chosen so that $k \cdot \deg(p) > n$. Since $Lu = v$ implies $L^k u = L^{k-1} v$ weakly, we see to prove the hypoellipticity of $L$ it suffices to prove the hypoellipticity of $L^k$.
35.6 Exercises
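Exercise 35.1. Using
\[
\frac{1}{|\xi|^2 + m^2} = \int_0^\infty e^{-\lambda (|\xi|^2 + m^2)} \, d\lambda,
\]
the identity in Eq. (35.5) and Example 34.4, show for $m > 0$ and $x \geq 0$ that
\[
\begin{aligned}
e^{-mx} &= \frac{m}{\sqrt{\pi}} \int_0^\infty d\lambda \, \frac{1}{\sqrt{\lambda}} e^{-\frac{1}{4\lambda} x^2} e^{-\lambda m^2} \quad (\text{let } \lambda \to \lambda / m^2) && (35.27) \\
&= \int_0^\infty d\lambda \, \frac{1}{\sqrt{\pi \lambda}} e^{-\lambda} e^{-\frac{m^2}{4\lambda} x^2}. && (35.28)
\end{aligned}
\]
Use this formula and Example 34.4 to show, in dimension $n$, that
\[
\mathcal{F} \left[ e^{-m|x|} \right](\xi) = 2^{n/2} \frac{\Gamma((n + 1)/2)}{\sqrt{\pi}} \frac{m}{(m^2 + |\xi|^2)^{(n+1)/2}}
\]
where $\Gamma(x)$ is the gamma function defined in Eq. (20.42). (I am not absolutely positive I have got all the constants exactly right, but they should be close.)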
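36
Elementary Generalized Functions / Distribution Theory

This chapter has been highly influenced by Friedlander's book [7].
36.1 Distributions on $U \subset_o \mathbb{R}^n$
Let $U$ be an open subset of $\mathbb{R}^n$ and
\[
C_c^\infty(U) = \bigcup_{K \sqsubset\sqsubset U} C^\infty(K) \tag{36.1}
\]
denote the set of smooth functions on $U$ with compact support in $U$.
Definition 36.1. A sequence $\{\phi_k\}_{k=1}^\infty \subset \mathcal{D}(U)$ converges to $\phi \in \mathcal{D}(U)$, iff there is a compact set $K \sqsubset\sqsubset U$ such that $\mathrm{supp}(\phi_k) \subset K$ for all $k$ and $\phi_k \to \phi$ in $C^\infty(K)$.
Definition 36.2 (Distributions on $U \subset_o \mathbb{R}^n$). A generalized function $T$ on $U \subset_o \mathbb{R}^n$ is a continuous linear functional on $\mathcal{D}(U)$, i.e. $T : \mathcal{D}(U) \to \mathbb{C}$ is linear and $\lim_{k \to \infty} \langle T, \phi_k \rangle = 0$ for all $\{\phi_k\} \subset \mathcal{D}(U)$ such that $\phi_k \to 0$ in $\mathcal{D}(U)$. We denote the space of generalized functions by $\mathcal{D}'(U)$.
Proposition 36.3. Let $T : \mathcal{D}(U) \to \mathbb{C}$ be a linear functional. Then $T \in \mathcal{D}'(U)$ iff for all $K \sqsubset\sqsubset U$, there exist $n \in \mathbb{N}$ and $C < \infty$ such that
\[
|T(\phi)| \leq C p_n(\phi) \quad \text{for all } \phi \in C^\infty(K). \tag{36.2}
\]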
Proof. Suppose that $\{\phi_k\} \subset \mathcal{D}(U)$ such that $\phi_k \to 0$ in $\mathcal{D}(U)$. Let $K$ be a compact set such that $\mathrm{supp}(\phi_k) \subset K$ for all $k$. Since $\lim_{k \to \infty} p_n(\phi_k) = 0$, it follows that if Eq. (36.2) holds that $\lim_{k \to \infty} \langle T, \phi_k \rangle = 0$. Conversely, suppose that there is a compact set $K \sqsubset\sqsubset U$ such that for no choice of $n \in \mathbb{N}$ and $C < \infty$, Eq. (36.2) holds. Then we may choose non-zero $\phi_n \in C^\infty(K)$ such that
\[
|T(\phi_n)| \geq n p_n(\phi_n) \quad \text{for all } n.
\]
Let $\psi_n = \frac{1}{n p_n(\phi_n)} \phi_n \in C^\infty(K)$, then $p_n(\psi_n) = 1/n \to 0$ as $n \to \infty$ which shows that $\psi_n \to 0$ in $\mathcal{D}(U)$. On the other hand $|T(\psi_n)| \geq 1$ so that $\lim_{n \to \infty} \langle T, \psi_n \rangle \neq 0$.
Alternate Proof: The definition of $T$ being continuous is equivalent to $T|_{C^\infty(K)}$ being sequentially continuous for all $K \sqsubset\sqsubset U$. Since $C^\infty(K)$ is a metric space, sequential continuity and continuity are the same thing. Hence $T$ is continuous iff $T|_{C^\infty(K)}$ is continuous for all $K \sqsubset\sqsubset U$. Now $T|_{C^\infty(K)}$ is continuous iff a bound like Eq. (36.2) holds.
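Definition 36.4. Let $Y$ be a topological space and $T_y \in \mathcal{D}'(U)$ for all $y \in Y$. We say that $T_y \to T \in \mathcal{D}'(U)$ as $y \to y_0$ iff
\[
\lim_{y \to y_0} \langle T_y, \phi \rangle = \langle T, \phi \rangle \quad \text{for all } \phi \in \mathcal{D}(U).
\]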
36.2 Examples of distributions and related computations
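Example 36.5. Let $\mu$ be a positive Radon measure on $U$ and $f \in L^1_{loc}(U)$. Define $T_f \in \mathcal{D}'(U)$ by $\langle T_f, \phi \rangle = \int_U \phi f \, d\mu$ for all $\phi \in \mathcal{D}(U)$. Notice that if $\phi \in C^\infty(K)$ then
\[
|\langle T_f, \phi \rangle| \leq \int_U |\phi f| \, d\mu = \int_K |\phi f| \, d\mu \leq C_K \|\phi\|_\infty
\]
where $C_K := \int_K |f| \, d\mu < \infty$. Hence $T_f \in \mathcal{D}'(U)$. Furthermore, the map
\[
f \in L^1_{loc}(U) \to T_f \in \mathcal{D}'(U)
\]
is injective. Indeed, $T_f = 0$ is equivalent to
\[
\int_U \phi f \, d\mu = 0 \quad \text{for all } \phi \in \mathcal{D}(U). \tag{36.3}
\]
By the dominated convergence theorem and the usual convolution argument, this is equivalent to
\[
\int_U \phi f \, d\mu = 0 \quad \text{for all } \phi \in C_c(U). \tag{36.4}
\]
Now fix a compact set $K \sqsubset\sqsubset U$ and $\phi_n \in C_c(U)$ such that $\phi_n \to \overline{\mathrm{sgn}(f)} 1_K$ in $L^1(\mu)$. By replacing $\phi_n$ by $\chi(\phi_n)$ if necessary, where
\[
\chi(z) =
\begin{cases}
z & \text{if } |z| \leq 1 \\
\frac{z}{|z|} & \text{if } |z| \geq 1,
\end{cases}
\]
we may assume that $|\phi_n| \leq 1$. By passing to a further subsequence, we may assume that $\phi_n \to \overline{\mathrm{sgn}(f)} 1_K$ a.e. Thus we have
\[
0 = \lim_{n \to \infty} \int_U \phi_n f \, d\mu = \int_U \overline{\mathrm{sgn}(f)} 1_K f \, d\mu = \int_K |f| \, d\mu .
\]
This shows that $|f(x)| = 0$ for $\mu$-a.e. $x \in K$. Since $K$ is arbitrary and $U$ is the countable union of such compact sets $K$, it follows that $f(x) = 0$ for $\mu$-a.e. $x \in U$.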
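The injectivity may also be proved slightly more directly as follows. As before, it suffices to prove Eq. (36.4) implies that $f(x) = 0$ for $\mu$-a.e. $x$. We may further assume that $f$ is real by considering real and imaginary parts separately. Let $K \sqsubset\sqsubset U$ and $\varepsilon > 0$ be given. Set $A = \{f > 0\} \cap K$, then $\mu(A) < \infty$ and hence since all finite measures on $U$ are Radon, there exists $F \subset A \subset V$ with $F$ compact and $V \subset_o U$ such that $\mu(V \setminus F) < \delta$. By Urysohn's lemma, there exists $\phi \in C_c(V)$ such that $0 \leq \phi \leq 1$ and $\phi = 1$ on $F$. Then by Eq. (36.4)
\[
0 = \int_U \phi f \, d\mu = \int_F \phi f \, d\mu + \int_{V \setminus F} \phi f \, d\mu = \int_F f \, d\mu + \int_{V \setminus F} \phi f \, d\mu
\]
so that
\[
\int_F f \, d\mu = \Big| \int_{V \setminus F} \phi f \, d\mu \Big| \leq \int_{V \setminus F} |f| \, d\mu < \varepsilon
\]
provided that $\delta$ is chosen sufficiently small by the $\varepsilon$-$\delta$ definition of absolute continuity. Similarly, it follows that
\[
0 \leq \int_A f \, d\mu \leq \int_F f \, d\mu + \varepsilon \leq 2\varepsilon .
\]
Since $\varepsilon > 0$ is arbitrary, it follows that $\int_A f \, d\mu = 0$. Since $K$ was arbitrary, we learn that
\[
\int_{\{f > 0\}} f \, d\mu = 0
\]
which shows that $f \leq 0$ $\mu$-a.e. Similarly, one shows that $f \geq 0$ $\mu$-a.e. and hence $f = 0$ $\mu$-a.e.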
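Example 36.6. Let us now assume that $\mu = m$ and write $\langle T_f, \phi \rangle = \int_U \phi f \, dm$. For the moment let us also assume that $U = \mathbb{R}$. Then we have
1. $\lim_{M \to \infty} T_{\sin Mx} = 0$
2. $\lim_{M \to \infty} T_{x^{-1} \sin Mx} = \pi \delta_0$ where $\delta_0$ is the point measure at $0$.
3. If $f \in L^1(\mathbb{R}^n, dm)$ with $\int_{\mathbb{R}^n} f \, dm = 1$ and $f_\varepsilon(x) = \varepsilon^{-n} f(x / \varepsilon)$, then $\lim_{\varepsilon \downarrow 0} T_{f_\varepsilon} = \delta_0$. As a special case, consider $\lim_{\varepsilon \downarrow 0} \frac{\varepsilon}{\pi (x^2 + \varepsilon^2)} = \delta_0$.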
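Definition 36.7 (Multiplication by smooth functions). Suppose that $g \in C^\infty(U)$ and $T \in \mathcal{D}'(U)$ then we define $gT \in \mathcal{D}'(U)$ by
\[
\langle gT, \phi \rangle = \langle T, g\phi \rangle \quad \text{for all } \phi \in \mathcal{D}(U).
\]
It is easily checked that $gT$ is continuous.
Definition 36.8 (Differentiation). For $T \in \mathcal{D}'(U)$ and $i \in \{1, 2, \dots, n\}$ let $\partial_i T \in \mathcal{D}'(U)$ be the distribution defined by
\[
\langle \partial_i T, \phi \rangle = -\langle T, \partial_i \phi \rangle \quad \text{for all } \phi \in \mathcal{D}(U).
\]
Again it is easy to check that $\partial_i T$ is a distribution.
More generally if $L = \sum_{|\alpha| \leq m} a_\alpha \partial^\alpha$ with $a_\alpha \in C^\infty(U)$ for all $\alpha$, then $LT$ is the distribution defined by
\[
\langle LT, \phi \rangle = \Big\langle T, \sum_{|\alpha| \leq m} (-1)^{|\alpha|} \partial^\alpha (a_\alpha \phi) \Big\rangle \quad \text{for all } \phi \in \mathcal{D}(U).
\]
Hence we can talk about distributional solutions to differential equations of the form $LT = S$.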
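Example 36.9. Suppose that $f \in L^1_{loc}$ and $g \in C^\infty(U)$, then $g T_f = T_{gf}$. If further $f \in C^1(U)$, then $\partial_i T_f = T_{\partial_i f}$. If $f \in C^m(U)$, then $L T_f = T_{Lf}$.
Example 36.10. Suppose that $a \in U$, then
\[
\langle \partial_i \delta_a, \phi \rangle = -\partial_i \phi(a)
\]
and more generally we have
\[
\langle L \delta_a, \phi \rangle = \sum_{|\alpha| \leq m} (-1)^{|\alpha|} \partial^\alpha (a_\alpha \phi)(a).
\]
Example 36.11. Consider the distribution $T := T_{|x|}$ for $x \in \mathbb{R}$, i.e. take $U = \mathbb{R}$. Then
\[
\frac{d}{dx} T = T_{\mathrm{sgn}(x)} \quad \text{and} \quad \frac{d^2}{dx^2} T = 2\delta_0 .
\]
More generally, suppose that $f$ is piecewise $C^1$, then
\[
\frac{d}{dx} T_f = T_{f'} + \sum \left( f(x+) - f(x-) \right) \delta_x .
\]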
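The identity $\frac{d^2}{dx^2} T_{|x|} = 2\delta_0$ says that $\int_{\mathbb{R}} |x| \phi''(x)\,dx = 2\phi(0)$ for every test function $\phi$. The sketch below checks this for one smooth compactly supported $\phi$; the particular bump function, the off-center shift, the grid and the use of central differences for $\phi''$ are assumptions made for illustration.

```python
import math

def bump(x):
    """A smooth function supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def shifted_bump(x):
    """Test function phi(x) = bump(x - 0.2), supported in [-0.8, 1.2] (hypothetical choice)."""
    return bump(x - 0.2)

def pair_with_second_derivative(phi, lower=-2.0, upper=2.0, steps=4000):
    """<T'', phi> = <T_{|x|}, phi''> = int |x| phi''(x) dx, with phi'' by central differences."""
    h = (upper - lower) / steps
    total = 0.0
    for j in range(steps):
        x = lower + (j + 0.5) * h
        second = (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / h ** 2
        total += abs(x) * second
    return total * h

if __name__ == "__main__":
    # The two printed numbers should agree to several digits.
    print(pair_with_second_derivative(shifted_bump), 2.0 * shifted_bump(0.0))
```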
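Example 36.12. Consider $T = T_{\ln|x|}$ on $\mathcal{D}(\mathbb{R})$. Then
\[
\begin{aligned}
\langle T', \phi \rangle &= -\int_{\mathbb{R}} \ln|x| \, \phi'(x) \, dx = -\lim_{\varepsilon \downarrow 0} \int_{|x| > \varepsilon} \ln|x| \, \phi'(x) \, dx \\
&= \lim_{\varepsilon \downarrow 0} \int_{|x| > \varepsilon} \frac{1}{x} \phi(x) \, dx + \lim_{\varepsilon \downarrow 0} \left[ \ln \varepsilon \, (\phi(\varepsilon) - \phi(-\varepsilon)) \right] \\
&= \lim_{\varepsilon \downarrow 0} \int_{|x| > \varepsilon} \frac{1}{x} \phi(x) \, dx .
\end{aligned}
\]
We will write $T' = PV \frac{1}{x}$ in the future. Here is another formula for $T'$,
\[
\begin{aligned}
\langle T', \phi \rangle &= \lim_{\varepsilon \downarrow 0} \int_{1 \geq |x| > \varepsilon} \frac{1}{x} \phi(x) \, dx + \int_{|x| > 1} \frac{1}{x} \phi(x) \, dx \\
&= \lim_{\varepsilon \downarrow 0} \int_{1 \geq |x| > \varepsilon} \frac{1}{x} [\phi(x) - \phi(0)] \, dx + \int_{|x| > 1} \frac{1}{x} \phi(x) \, dx \\
&= \int_{1 \geq |x|} \frac{1}{x} [\phi(x) - \phi(0)] \, dx + \int_{|x| > 1} \frac{1}{x} \phi(x) \, dx .
\end{aligned}
\]
Please notice in the last example that $\frac{1}{x} \notin L^1_{loc}(\mathbb{R})$ so that $T_{1/x}$ is not well defined. This is an example of the so called division problem of distributions. Here is another possible interpretation of $\frac{1}{x}$ as a distribution.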
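Example 36.13. Here we try to define $1/x$ as $\lim_{y \downarrow 0} \frac{1}{x \pm iy}$, that is we want to define a distribution $T_\pm$ by
\[
\langle T_\pm, \phi \rangle := \lim_{y \downarrow 0} \int \frac{1}{x \pm iy} \phi(x) \, dx .
\]
Let us compute $T_+$ explicitly,
\[
\begin{aligned}
\lim_{y \downarrow 0} \int_{\mathbb{R}} \frac{1}{x + iy} \phi(x) \, dx
&= \lim_{y \downarrow 0} \int_{|x| \leq 1} \frac{1}{x + iy} \phi(x) \, dx + \lim_{y \downarrow 0} \int_{|x| > 1} \frac{1}{x + iy} \phi(x) \, dx \\
&= \lim_{y \downarrow 0} \int_{|x| \leq 1} \frac{1}{x + iy} [\phi(x) - \phi(0)] \, dx + \phi(0) \lim_{y \downarrow 0} \int_{|x| \leq 1} \frac{1}{x + iy} \, dx
+ \int_{|x| > 1} \frac{1}{x} \phi(x) \, dx \\
&= PV \int_{\mathbb{R}} \frac{1}{x} \phi(x) \, dx + \phi(0) \lim_{y \downarrow 0} \int_{|x| \leq 1} \frac{1}{x + iy} \, dx .
\end{aligned}
\]
Now by deforming the contour we have
\[
\int_{|x| \leq 1} \frac{1}{x + iy} \, dx = \int_{\varepsilon < |x| \leq 1} \frac{1}{x + iy} \, dx + \int_{C_\varepsilon} \frac{1}{z + iy} \, dz
\]
where $C_\varepsilon : z = \varepsilon e^{i\theta}$ with $\theta : \pi \to 0$. Therefore,
\[
\begin{aligned}
\lim_{y \downarrow 0} \int_{|x| \leq 1} \frac{1}{x + iy} \, dx
&= \lim_{y \downarrow 0} \int_{\varepsilon < |x| \leq 1} \frac{1}{x + iy} \, dx + \lim_{y \downarrow 0} \int_{C_\varepsilon} \frac{1}{z + iy} \, dz \\
&= \int_{\varepsilon < |x| \leq 1} \frac{1}{x} \, dx + \int_{C_\varepsilon} \frac{1}{z} \, dz = 0 - \pi i .
\end{aligned}
\]
Hence we have shown that $T_+ = PV \frac{1}{x} - i\pi \delta_0$. Similarly, one shows that $T_- = PV \frac{1}{x} + i\pi \delta_0$. Notice that it follows from these computations that $T_- - T_+ = i 2\pi \delta_0$. Notice that
\[
\frac{1}{x - iy} - \frac{1}{x + iy} = \frac{2iy}{x^2 + y^2}
\]
and hence we conclude that $\lim_{y \downarrow 0} \frac{y}{x^2 + y^2} = \pi \delta_0$, a result that we saw in Example 36.6, item 3.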
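The limit $\lim_{y \downarrow 0} \frac{y}{x^2 + y^2} = \pi\delta_0$ means that $\int_{\mathbb{R}} \frac{y}{x^2 + y^2} \phi(x)\,dx \to \pi\phi(0)$ for each test function $\phi$. The following sketch evaluates the pairing for a bump function supported in $[-1, 1]$; the choice of $\phi$, the substitution $x = y\tan\theta$ and the step count are assumptions introduced here for numerical convenience.

```python
import math

def bump(x):
    """A smooth test function supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def poisson_pairing(phi, y, steps=20000):
    """Value of int_{-1}^{1} y / (x^2 + y^2) * phi(x) dx for phi supported in [-1, 1].

    With x = y * tan(theta) the kernel times dx becomes d(theta), so the
    narrow peak of width y does not have to be resolved directly.
    """
    theta_max = math.atan(1.0 / y)
    h = 2.0 * theta_max / steps
    total = 0.0
    for j in range(steps):
        theta = -theta_max + (j + 0.5) * h
        total += phi(y * math.tan(theta))
    return total * h

if __name__ == "__main__":
    target = math.pi * bump(0.0)
    for y in (1e-1, 1e-2, 1e-3):
        # The difference should shrink roughly in proportion to y.
        print(y, poisson_pairing(bump, y) - target)
```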
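Example 36.14. Suppose that $\mu$ is a complex measure on $\mathbb{R}$ and $F(x) = \mu((-\infty, x])$, then $T_F' = \mu$. Moreover, if $f \in L^1_{loc}(\mathbb{R})$ and $T_f' = \mu$, then $f = F + C$ a.e. for some constant $C$.
Proof. Let $\phi \in \mathcal{D} := \mathcal{D}(\mathbb{R})$, then
\[
\begin{aligned}
-\langle T_F', \phi \rangle = \langle T_F, \phi' \rangle &= \int_{\mathbb{R}} F(x) \phi'(x) \, dx = \int_{\mathbb{R}} dx \int_{\mathbb{R}} d\mu(y) \, \phi'(x) 1_{y \leq x} \\
&= \int_{\mathbb{R}} d\mu(y) \int_{\mathbb{R}} dx \, \phi'(x) 1_{y \leq x} = -\int_{\mathbb{R}} d\mu(y) \phi(y) = -\langle \mu, \phi \rangle
\end{aligned}
\]
by Fubini's theorem and the fundamental theorem of calculus. If $T_f' = \mu$, then $T_{f - F}' = 0$ and the result follows from Corollary 36.16 below.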
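Lemma 36.15. Suppose that $T\in\mathcal{D}'(\mathbb{R}^n)$ is a distribution such that $\partial_iT = 0$ for some $i$, then there exists a distribution $S\in\mathcal{D}'(\mathbb{R}^{n-1})$ such that $\langle T,\phi\rangle = \langle S,\bar\phi_i\rangle$ for all $\phi\in\mathcal{D}(\mathbb{R}^n)$ where
\[ \bar\phi_i = \int_{\mathbb{R}}\tau_{te_i}\phi\,dt\in\mathcal{D}(\mathbb{R}^{n-1}). \]
Proof. To simplify notation, assume that $i = n$ and write $x\in\mathbb{R}^n$ as $x = (y,z)$ with $y\in\mathbb{R}^{n-1}$ and $z\in\mathbb{R}$. Let $\theta\in C_c^\infty(\mathbb{R})$ such that $\int_{\mathbb{R}}\theta(z)\,dz = 1$ and for $\psi\in\mathcal{D}(\mathbb{R}^{n-1})$, let $\psi\otimes\theta(x) = \psi(y)\theta(z)$. The mapping
\[ \psi\in\mathcal{D}(\mathbb{R}^{n-1})\to\psi\otimes\theta\in\mathcal{D}(\mathbb{R}^n) \]
is easily seen to be sequentially continuous and therefore $\langle S,\psi\rangle := \langle T,\psi\otimes\theta\rangle$ defines a distribution in $\mathcal{D}'(\mathbb{R}^{n-1})$. Now suppose that $\phi\in\mathcal{D}(\mathbb{R}^n)$. If $\phi = \partial_nf$ for some $f\in\mathcal{D}(\mathbb{R}^n)$ we would have to have $\int\phi(y,z)\,dz = 0$. This is not generally true, however the function $\phi - \bar\phi\otimes\theta$ does have this property. Define
\[ f(y,z) := \int_{-\infty}^z\left[\phi(y,z') - \bar\phi(y)\theta(z')\right]dz', \]
then $f\in\mathcal{D}(\mathbb{R}^n)$ and $\partial_nf = \phi - \bar\phi\otimes\theta$. Therefore,
\[ 0 = -\langle\partial_nT,f\rangle = \langle T,\partial_nf\rangle = \langle T,\phi\rangle - \langle T,\bar\phi\otimes\theta\rangle = \langle T,\phi\rangle - \langle S,\bar\phi\rangle. \]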
Corollary 36.16. Suppose that $T\in\mathcal{D}'(\mathbb{R}^n)$ is a distribution such that there exists $m\geq 0$ such that
\[ \partial^\alpha T = 0 \text{ for all } |\alpha| = m, \]
then $T = T_p$ where $p(x)$ is a polynomial on $\mathbb{R}^n$ of degree less than or equal to $m-1$, where by convention if $\deg(p) = -1$ then $p := 0$.
Proof. The proof will be by induction on $n$ and $m$. The corollary is trivially true when $m = 0$ and $n$ is arbitrary. Let $n = 1$ and assume the corollary holds for $m = k-1$ with $k\geq 1$. Let $T\in\mathcal{D}'(\mathbb{R})$ such that $0 = \partial^kT = \partial^{k-1}\partial T$. By the induction hypothesis, there exists a polynomial, $q$, of degree at most $k-2$ such that $T' = T_q$. Let $p(x) = \int_0^xq(z)\,dz$, then $p$ is a polynomial of degree at most $k-1$ such that $p' = q$ and hence $T_p' = T_q = T'$. So $(T-T_p)' = 0$ and hence by Lemma 36.15, $T - T_p = T_C$ where $C = \langle T-T_p,\theta\rangle$ and $\theta$ is as in the proof of Lemma 36.15. This proves the result for $n = 1$. For the general induction, suppose there exists $(m,n)\in\mathbb{N}^2$ with $m\geq 0$ and $n\geq 1$ such that the assertion in the corollary holds for pairs $(m',n')$ such that either $n'<n$ or $n' = n$ and $m'\leq m$. Suppose that $T\in\mathcal{D}'(\mathbb{R}^n)$ is a distribution such that
\[ \partial^\alpha T = 0 \text{ for all } |\alpha| = m+1. \]
In particular this implies that $\partial^\alpha\partial_nT = 0$ for all $|\alpha| = m$ and hence by induction $\partial_nT = T_{q_n}$ where $q_n$ is a polynomial of degree at most $m-1$ on $\mathbb{R}^n$. Let $p_n(x) = \int_0^zq_n(y,z')\,dz'$, a polynomial of degree at most $m$ on $\mathbb{R}^n$. The polynomial $p_n$ satisfies, 1) $\partial^\alpha p_n = 0$ if $|\alpha| = m+1$ and $\alpha_n = 0$ and 2) $\partial_np_n = q_n$. Hence $\partial_n(T - T_{p_n}) = 0$ and so by Lemma 36.15,
\[ \langle T - T_{p_n},\phi\rangle = \langle S,\bar\phi_n\rangle \]
for some distribution $S\in\mathcal{D}'(\mathbb{R}^{n-1})$. If $\alpha$ is a multi-index such that $\alpha_n = 0$ and $|\alpha| = m+1$, then
\begin{align*}
0 = \langle\partial^\alpha T - \partial^\alpha T_{p_n},\phi\rangle &= (-1)^{|\alpha|}\langle T - T_{p_n},\partial^\alpha\phi\rangle = (-1)^{|\alpha|}\langle S,\overline{(\partial^\alpha\phi)}_n\rangle \\
&= (-1)^{|\alpha|}\langle S,\partial^\alpha\bar\phi_n\rangle = \langle\partial^\alpha S,\bar\phi_n\rangle,
\end{align*}
and in particular by taking $\phi = \psi\otimes\theta$, we learn that $\langle\partial^\alpha S,\psi\rangle = 0$ for all $\psi\in\mathcal{D}(\mathbb{R}^{n-1})$. Thus by the induction hypothesis, $S = T_r$ for some polynomial $r$ of degree at most $m$ on $\mathbb{R}^{n-1}$. Letting $p(y,z) = p_n(y,z) + r(y)$, a polynomial of degree at most $m$ on $\mathbb{R}^n$, it is easily checked that $T = T_p$.
Example 36.17. Consider the wave equation
\[ (\partial_t - \partial_x)(\partial_t + \partial_x)u(t,x) = \left(\partial_t^2 - \partial_x^2\right)u(t,x) = 0. \]
From this equation one learns that $u(t,x) = f(x+t) + g(x-t)$ solves the wave equation for $f,g\in C^2$. Suppose that $f$ is a bounded Borel measurable function on $\mathbb{R}$ and consider the function $f(x+t)$ as a distribution on $\mathbb{R}^2$. We compute
\begin{align*}
\langle(\partial_t - \partial_x)f(x+t),\phi(x,t)\rangle &= \int_{\mathbb{R}^2}f(x+t)\,(\partial_x - \partial_t)\phi(x,t)\,dx\,dt \\
&= \int_{\mathbb{R}^2}f(x)\left[(\partial_x - \partial_t)\phi\right](x-t,t)\,dx\,dt \\
&= -\int_{\mathbb{R}^2}f(x)\frac{d}{dt}\left[\phi(x-t,t)\right]dx\,dt \\
&= -\int_{\mathbb{R}}f(x)\left[\phi(x-t,t)\right]\Big|_{t=-\infty}^{t=\infty}dx = 0.
\end{align*}
This shows that $(\partial_t - \partial_x)f(x+t) = 0$ in the distributional sense. Similarly, $(\partial_t + \partial_x)g(x-t) = 0$ in the distributional sense. Hence $u(t,x) = f(x+t) + g(x-t)$ solves the wave equation in the distributional sense whenever $f$ and $g$ are bounded Borel measurable functions on $\mathbb{R}$.
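Example 36.18. Consider $f(x) = \ln|x|$ for $x\in\mathbb{R}^2$ and let $T = T_f$. Then, pointwise we have
\[ \nabla\ln|x| = \frac{x}{|x|^2} \quad\text{and}\quad \Delta\ln|x| = \frac{2}{|x|^2} - \frac{2x\cdot x}{|x|^4} = 0. \]
Hence $\Delta f(x) = 0$ for all $x\in\mathbb{R}^2$ except at $x = 0$ where it is not defined. Does this imply that $\Delta T = 0$? No, in fact $\Delta T = 2\pi\delta$ as we shall now prove. By definition of $\Delta T$ and the dominated convergence theorem,
\[ \langle\Delta T,\phi\rangle = \langle T,\Delta\phi\rangle = \int_{\mathbb{R}^2}\ln|x|\,\Delta\phi(x)\,dx = \lim_{\epsilon\downarrow 0}\int_{|x|>\epsilon}\ln|x|\,\Delta\phi(x)\,dx. \]
Using the divergence theorem,
\begin{align*}
\int_{|x|>\epsilon}\ln|x|\,\Delta\phi(x)\,dx
&= -\int_{|x|>\epsilon}\nabla\ln|x|\cdot\nabla\phi(x)\,dx + \int_{\partial\{|x|>\epsilon\}}\ln|x|\,\nabla\phi(x)\cdot n(x)\,dS(x) \\
&= \int_{|x|>\epsilon}\Delta\ln|x|\,\phi(x)\,dx - \int_{\partial\{|x|>\epsilon\}}\nabla\ln|x|\cdot n(x)\,\phi(x)\,dS(x) \\
&\qquad + \int_{\partial\{|x|>\epsilon\}}\ln|x|\,(\nabla\phi(x)\cdot n(x))\,dS(x) \\
&= \int_{\partial\{|x|>\epsilon\}}\ln|x|\,(\nabla\phi(x)\cdot n(x))\,dS(x) - \int_{\partial\{|x|>\epsilon\}}\nabla\ln|x|\cdot n(x)\,\phi(x)\,dS(x),
\end{align*}
where $n(x)$ is the outward pointing normal, i.e. $n(x) = -\hat x := -x/|x|$. Now
\[ \left|\int_{\partial\{|x|>\epsilon\}}\ln|x|\,(\nabla\phi(x)\cdot n(x))\,dS(x)\right| \leq C\left(\ln\epsilon^{-1}\right)2\pi\epsilon\to 0 \text{ as } \epsilon\downarrow 0 \]
where $C$ is a bound on $(\nabla\phi(x)\cdot n(x))$. While
\begin{align*}
\int_{\partial\{|x|>\epsilon\}}\nabla\ln|x|\cdot n(x)\,\phi(x)\,dS(x) &= \int_{\partial\{|x|>\epsilon\}}\frac{\hat x}{|x|}\cdot(-\hat x)\,\phi(x)\,dS(x) \\
&= -\frac{1}{\epsilon}\int_{\partial\{|x|>\epsilon\}}\phi(x)\,dS(x)\to -2\pi\phi(0) \text{ as } \epsilon\downarrow 0.
\end{align*}
Combining these results shows
\[ \langle\Delta T,\phi\rangle = 2\pi\phi(0). \]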
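As a sanity check of $\langle\Delta T,\phi\rangle = 2\pi\phi(0)$, one can evaluate $\int_{\mathbb{R}^2}\ln|x|\,\Delta\phi(x)\,dx$ for a radial function. The sketch below assumes $\phi(x) = e^{-|x|^2}$ (a Schwartz function rather than a compactly supported one, chosen because its Laplacian $(4r^2-4)e^{-r^2}$ is explicit) and truncates the radial integral at a hypothetical radius `R`.

```python
import math

def laplacian_phi(r):
    """Laplacian of phi(x) = exp(-|x|^2) on R^2, written in terms of r = |x|."""
    return (4.0 * r * r - 4.0) * math.exp(-r * r)

def pairing(R=10.0, n=200000):
    """Midpoint approximation of the integral of ln|x| * (Laplacian phi)(x)
    over the disc |x| < R, using polar coordinates (dx = 2*pi*r dr)."""
    h = R / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += math.log(r) * laplacian_phi(r) * 2.0 * math.pi * r * h
    return total
```

Since $\phi(0) = 1$ here, `pairing()` should be close to $2\pi$ even though $\Delta\ln|x| = 0$ away from the origin.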
Exercise 36.1. Carry out a similar computation to that in Example 36.18 to show
\[ \Delta T_{1/|x|} = -4\pi\delta \]
where now $x\in\mathbb{R}^3$.
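Example 36.19. Let $z = x+iy$, and $\bar\partial = \frac{1}{2}(\partial_x + i\partial_y)$. Let $T = T_{1/z}$, then
\[ \bar\partial T = \pi\delta_0 \quad\text{or imprecisely}\quad \bar\partial\frac{1}{z} = \pi\delta(z). \]
Proof. Pointwise we have $\bar\partial\frac{1}{z} = 0$ so we shall work as above. We then have
\begin{align*}
\langle\bar\partial T,\phi\rangle &= -\langle T,\bar\partial\phi\rangle = -\int_{\mathbb{R}^2}\frac{1}{z}\bar\partial\phi(z)\,dm(z) \\
&= -\lim_{\epsilon\downarrow 0}\int_{|z|>\epsilon}\frac{1}{z}\bar\partial\phi(z)\,dm(z) \\
&= \lim_{\epsilon\downarrow 0}\int_{|z|>\epsilon}\bar\partial\frac{1}{z}\,\phi(z)\,dm(z) - \lim_{\epsilon\downarrow 0}\int_{\partial\{|z|>\epsilon\}}\frac{1}{z}\phi(z)\frac{1}{2}\left(n_1(z)+in_2(z)\right)d\sigma(z) \\
&= 0 - \lim_{\epsilon\downarrow 0}\int_{\partial\{|z|>\epsilon\}}\frac{1}{z}\phi(z)\frac{1}{2}\left(\frac{-z}{|z|}\right)d\sigma(z) \\
&= \frac{1}{2}\lim_{\epsilon\downarrow 0}\int_{\partial\{|z|>\epsilon\}}\frac{1}{|z|}\phi(z)\,d\sigma(z) \\
&= \pi\lim_{\epsilon\downarrow 0}\frac{1}{2\pi\epsilon}\int_{\partial\{|z|>\epsilon\}}\phi(z)\,d\sigma(z) = \pi\phi(0).
\end{align*}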
36.3 Other classes of test functions
(For what follows, see Exercises 13.26 and 13.27 of Chapter 18.)
Notation 36.20 Suppose that $X$ is a vector space and $\{p_n\}_{n=0}^\infty$ is a family of semi-norms on $X$ such that $p_n\leq p_{n+1}$ for all $n$ and with the property that $p_n(x) = 0$ for all $n$ implies that $x = 0$. (We allow for $p_n = p_0$ for all $n$ in which case $X$ is a normed vector space.) Let $\tau$ be the smallest topology on $X$ such that $p_n(x-\cdot):X\to[0,\infty)$ is continuous for all $n\in\mathbb{N}$ and $x\in X$. For $n\in\mathbb{N}$, $x\in X$ and $\epsilon>0$ let $B_n(x,\epsilon) := \{y\in X : p_n(x-y)<\epsilon\}$.
Proposition 36.21. The balls $\mathcal{B} := \{B_n(x,\epsilon) : n\in\mathbb{N},\ x\in X \text{ and } \epsilon>0\}$ form a basis for the topology $\tau$. This topology is the same as the topology induced by the metric $d$ on $X$ defined by
\[ d(x,y) = \sum_{n=0}^\infty 2^{-n}\frac{p_n(x-y)}{1+p_n(x-y)}. \]
Moreover, a sequence $\{x_k\}\subset X$ is convergent to $x\in X$ iff $\lim_{k\to\infty}d(x,x_k) = 0$ iff $\lim_{k\to\infty}p_n(x-x_k) = 0$ for all $n\in\mathbb{N}$, and $\{x_k\}\subset X$ is Cauchy in $X$ iff $\lim_{k,l\to\infty}d(x_l,x_k) = 0$ iff $\lim_{k,l\to\infty}p_n(x_l-x_k) = 0$ for all $n\in\mathbb{N}$.
Proof. Suppose that $z\in B_n(x,\epsilon)\cap B_m(y,\delta)$ and assume without loss of generality that $m\geq n$. Then if $p_m(w-z)<\alpha$, we have
\[ p_m(w-y)\leq p_m(w-z)+p_m(z-y)<\alpha+p_m(z-y)<\delta \]
provided that $\alpha\in(0,\delta-p_m(z-y))$ and similarly
\[ p_n(w-x)\leq p_n(w-z)+p_n(z-x)\leq p_m(w-z)+p_n(z-x)<\alpha+p_n(z-x)<\epsilon \]
provided that $\alpha\in(0,\epsilon-p_n(z-x))$. So choosing
\[ \alpha = \frac{1}{2}\min\left(\delta-p_m(z-y),\ \epsilon-p_n(z-x)\right), \]
we have shown that $B_m(z,\alpha)\subset B_n(x,\epsilon)\cap B_m(y,\delta)$. This shows that $\mathcal{B}$ forms a basis for a topology. In detail, $V\subset_oX$ iff for all $x\in V$ there exists $n\in\mathbb{N}$ and $\epsilon>0$ such that $B_n(x,\epsilon) := \{y\in X : p_n(x-y)<\epsilon\}\subset V$. Let $\tau(\mathcal{B})$ be the topology generated by $\mathcal{B}$. Since $|p_n(x-y)-p_n(x-z)|\leq p_n(y-z)$, we see that $p_n(x-\cdot)$ is continuous relative to $\tau(\mathcal{B})$ for each $x\in X$ and $n\in\mathbb{N}$. This shows that $\tau\subset\tau(\mathcal{B})$. On the other hand, since $p_n(x-\cdot)$ is $\tau$-continuous, it follows that $B_n(x,\epsilon) = \{y\in X : p_n(x-y)<\epsilon\}\in\tau$ for all $x\in X$, $\epsilon>0$ and $n\in\mathbb{N}$. This shows that $\mathcal{B}\subset\tau$ and therefore that $\tau(\mathcal{B})\subset\tau$. Thus $\tau = \tau(\mathcal{B})$.
Given $x\in X$ and $\epsilon>0$, let $B_d(x,\epsilon) = \{y\in X : d(x,y)<\epsilon\}$ be a $d$-ball. Choose $N$ large so that $\sum_{n=N+1}^\infty2^{-n}<\epsilon/2$. Then for $y\in B_N(x,\epsilon/4)$ we have
\[ d(x,y)\leq p_N(x-y)\sum_{n=0}^N2^{-n}+\epsilon/2<2\cdot\frac{\epsilon}{4}+\epsilon/2 = \epsilon \]
which shows that $B_N(x,\epsilon/4)\subset B_d(x,\epsilon)$. Conversely, if $d(x,y)<\epsilon$, then
\[ 2^{-n}\frac{p_n(x-y)}{1+p_n(x-y)}<\epsilon \]
which implies that
\[ p_n(x-y)<\frac{2^n\epsilon}{1-2^n\epsilon} =: \delta \]
when $2^n\epsilon<1$, which shows that $B_n(x,\delta)$ contains $B_d(x,\epsilon)$ with $\epsilon$ and $\delta$ as above. This shows that $\tau$ and the topology generated by $d$ are the same. The moreover statements are now easily proved and are left to the reader.
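To make the metric $d$ concrete, here is a small sketch for one particular choice of semi-norms. The space (finitely supported real sequences, stored as Python lists) and the semi-norms $p_n(x) = \max_{k\leq n}|x_k|$ are assumptions made purely for illustration; they are increasing in $n$ and separate points, as Notation 36.20 requires.

```python
def seminorm(x, n):
    """p_n(x) = max(|x_0|, ..., |x_n|) for a finitely supported sequence x."""
    return max((abs(v) for v in x[: n + 1]), default=0.0)

def frechet_metric(x, y):
    """d(x, y) = sum over n >= 0 of 2^{-n} * p_n(x - y) / (1 + p_n(x - y))."""
    length = max(len(x), len(y))
    diff = [
        (x[k] if k < len(x) else 0.0) - (y[k] if k < len(y) else 0.0)
        for k in range(length)
    ]
    total = 0.0
    for n in range(length):
        p = seminorm(diff, n)
        total += 2.0 ** (-n) * p / (1.0 + p)
    # For n >= length the semi-norm no longer changes, so the remaining
    # terms form a geometric series with sum 2^{-(length - 1)} * p / (1 + p).
    p = seminorm(diff, length)
    total += 2.0 ** (-(length - 1)) * p / (1.0 + p)
    return total
```

For instance, with $\epsilon = 0.1$ and $N = 5$ (so that $\sum_{n>N}2^{-n} = 2^{-5}<\epsilon/2$), a sequence within $\epsilon/4$ of $0$ in $p_5$ lies in the $d$-ball of radius $\epsilon$ no matter how large its later entries are, which is the inclusion $B_N(x,\epsilon/4)\subset B_d(x,\epsilon)$ from the proof.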
Exercise 36.2. Keeping the same notation as Proposition 36.21, further assume that $\{p_n'\}_{n\in\mathbb{N}}$ is another family of semi-norms as in Notation 36.20. Then the topology $\tau'$ determined by $\{p_n'\}_{n\in\mathbb{N}}$ is weaker than the topology $\tau$ determined by $\{p_n\}_{n\in\mathbb{N}}$ (i.e. $\tau'\subset\tau$) iff for every $n\in\mathbb{N}$ there is an $m\in\mathbb{N}$ and $C<\infty$ such that $p_n'\leq Cp_m$.
Lemma 36.22. Suppose that $X$ and $Y$ are vector spaces equipped with sequences of norms $\{p_n\}$ and $\{q_n\}$ as in Notation 36.20. Then a linear map $T:X\to Y$ is continuous iff for all $n\in\mathbb{N}$ there exists $C_n<\infty$ and $m_n\in\mathbb{N}$ such that $q_n(Tx)\leq C_np_{m_n}(x)$ for all $x\in X$. In particular, $f\in X^*$ iff $|f(x)|\leq Cp_m(x)$ for some $C<\infty$ and $m\in\mathbb{N}$. (We may also characterize continuity by sequential convergence since both $X$ and $Y$ are metric spaces.)
Proof. Suppose that $T$ is continuous, then $\{x : q_n(Tx)<1\}$ is an open neighborhood of $0$ in $X$. Therefore, there exists $m\in\mathbb{N}$ and $\epsilon>0$ such that $B_m(0,\epsilon)\subset\{x : q_n(Tx)<1\}$. So for $x\in X$ and $\alpha<1$, $\alpha\epsilon x/p_m(x)\in B_m(0,\epsilon)$ and thus
\[ q_n\left(\frac{\alpha\epsilon}{p_m(x)}Tx\right)<1 \implies q_n(Tx)<\frac{1}{\alpha\epsilon}p_m(x) \]
for all $x$. Letting $\alpha\uparrow1$ shows that $q_n(Tx)\leq\frac{1}{\epsilon}p_m(x)$ for all $x\in X$. Conversely, if $T$ satisfies
\[ q_n(Tx)\leq C_np_{m_n}(x) \text{ for all } x\in X, \]
then
\[ q_n(Tx-Tx') = q_n(T(x-x'))\leq C_np_{m_n}(x-x') \text{ for all } x,x'\in X. \]
This shows $Tx'\to Tx$ as $x'\to x$, i.e. that $T$ is continuous.
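Definition 36.23. A Fréchet space is a vector space $X$ equipped with a family $\{p_n\}$ of semi-norms such that $X$ is complete in the associated metric $d$.
Example 36.24. Let $K\sqsubset\sqsubset\mathbb{R}^n$ (i.e. $K$ is a compact subset of $\mathbb{R}^n$) and $C^\infty(K) := \{f\in C_c^\infty(\mathbb{R}^n) : \operatorname{supp}(f)\subset K\}$. For $m\in\mathbb{N}$, let
\[ p_m(f) := \sum_{|\alpha|\leq m}\|\partial^\alpha f\|_\infty. \]
Then $(C^\infty(K),\{p_m\}_{m=1}^\infty)$ is a Fréchet space. Moreover the derivative operators $\{\partial_k\}$ and multiplication by smooth functions are continuous linear maps from $C^\infty(K)$ to $C^\infty(K)$. If $\mu$ is a finite measure on $K$, then $T(f) := \int_K\partial^\alpha f\,d\mu$ is an element of $C^\infty(K)^*$ for any multi-index $\alpha$.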
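Example 36.25. Let $U\subset_o\mathbb{R}^n$ and for $m\in\mathbb{N}$, and a compact set $K\sqsubset\sqsubset U$ let
\[ p_m^K(f) := \sum_{|\alpha|\leq m}\|\partial^\alpha f\|_{\infty,K} := \sum_{|\alpha|\leq m}\max_{x\in K}|\partial^\alpha f(x)|. \]
Choose a sequence $K_m\sqsubset\sqsubset U$ such that $K_m\subset K_{m+1}^o\subset K_{m+1}\sqsubset\sqsubset U$ for all $m$ and set $q_m(f) = p_m^{K_m}(f)$. Then $(C^\infty(U),\{q_m\}_{m=1}^\infty)$ is a Fréchet space and the topology is independent of the choice of sequence of compact sets $K_m$ exhausting $U$. Moreover the derivative operators $\{\partial_k\}$ and multiplication by smooth functions are continuous linear maps from $C^\infty(U)$ to $C^\infty(U)$. If $\mu$ is a finite measure with compact support $K$ in $U$, then $T(f) := \int_K\partial^\alpha f\,d\mu$ is an element of $C^\infty(U)^*$ for any multi-index $\alpha$.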
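Proposition 36.26. A linear functional $T$ on $C^\infty(U)$ is continuous, i.e. $T\in C^\infty(U)^*$, iff there exists a compact set $K\sqsubset\sqsubset U$, $m\in\mathbb{N}$ and $C<\infty$ such that
\[ |\langle T,\phi\rangle|\leq Cp_m^K(\phi) \text{ for all } \phi\in C^\infty(U). \]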
Notation 36.27 Let $\nu_s(x) := (1+|x|)^s$ (or change to $\nu_s(x) = (1+|x|^2)^{s/2} = \langle x\rangle^s$?) for $x\in\mathbb{R}^n$ and $s\in\mathbb{R}$.
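Example 36.28. Let $\mathcal{S}$ denote the space of functions $f\in C^\infty(\mathbb{R}^n)$ such that $f$ and all of its partial derivatives decay faster than $(1+|x|)^{-m}$ for all $m>0$ as in Definition 34.6. Define
\[ p_m(f) = \sum_{|\alpha|\leq m}\|(1+|\cdot|)^m\partial^\alpha f(\cdot)\|_\infty = \sum_{|\alpha|\leq m}\|\nu_m\partial^\alpha f(\cdot)\|_\infty, \]
then $(\mathcal{S},\{p_m\})$ is a Fréchet space. Again the derivative operators $\{\partial_k\}$ and multiplication by functions $f\in\mathcal{P}$ are examples of continuous linear operators on $\mathcal{S}$. For an example of an element $T\in\mathcal{S}^*$, let $\mu$ be a measure on $\mathbb{R}^n$ such that
\[ \int(1+|x|)^{-N}\,d|\mu|(x)<\infty \]
for some $N\in\mathbb{N}$. Then $T(f) := \int_{\mathbb{R}^n}\partial^\alpha f\,d\mu$ defines an element of $\mathcal{S}^*$.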
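Proposition 36.29. The Fourier transform $\mathcal{F}:\mathcal{S}\to\mathcal{S}$ is a continuous linear transformation.
Proof. For the purposes of this proof, it will be convenient to use the semi-norms
\[ p_m'(f) = \sum_{|\alpha|\leq m}\left\|(1+|\cdot|^2)^m\partial^\alpha f(\cdot)\right\|_\infty. \]
This is permissible, since by Exercise 36.2 they give rise to the same topology on $\mathcal{S}$. Let $f\in\mathcal{S}$ and $m\in\mathbb{N}$, then
\begin{align*}
(1+|\xi|^2)^m\partial^\alpha\hat f(\xi) &= (1+|\xi|^2)^m\mathcal{F}\left((-ix)^\alpha f\right)(\xi) \\
&= \mathcal{F}\left[(1-\Delta)^m\left((-ix)^\alpha f\right)\right](\xi)
\end{align*}
and therefore if we let $g = (1-\Delta)^m\left((-ix)^\alpha f\right)\in\mathcal{S}$,
\begin{align*}
\left|(1+|\xi|^2)^m\partial^\alpha\hat f(\xi)\right| &\leq \|g\|_1 = \int_{\mathbb{R}^n}|g(x)|\,dx \\
&= \int_{\mathbb{R}^n}|g(x)|(1+|x|^2)^n\frac{1}{(1+|x|^2)^n}\,dx \\
&\leq C\left\||g(\cdot)|(1+|\cdot|^2)^n\right\|_\infty
\end{align*}
where $C = \int_{\mathbb{R}^n}\frac{1}{(1+|x|^2)^n}\,dx<\infty$. Using the product rule repeatedly, it is not hard to show
\begin{align*}
\left\||g(\cdot)|(1+|\cdot|^2)^n\right\|_\infty &= \left\|(1+|\cdot|^2)^n(1-\Delta)^m\left((-ix)^\alpha f\right)\right\|_\infty \\
&\leq k\sum_{|\beta|\leq 2m}\left\|(1+|\cdot|^2)^{n+|\alpha|/2}\partial^\beta f\right\|_\infty \\
&\leq kp_{2m+n}'(f)
\end{align*}
for some constant $k<\infty$. Combining the last two displayed equations implies that $p_m'(\hat f)\leq Ckp_{2m+n}'(f)$ for all $f\in\mathcal{S}$, and thus $\mathcal{F}$ is continuous.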
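Proposition 36.30. The subspace $C_c^\infty(\mathbb{R}^n)$ is dense in $\mathcal{S}(\mathbb{R}^n)$.
Proof. Let $\theta\in C_c^\infty(\mathbb{R}^n)$ such that $\theta = 1$ in a neighborhood of $0$ and set $\theta_m(x) = \theta(x/m)$ for all $m\in\mathbb{N}$. We will now show for all $f\in\mathcal{S}$ that $\theta_mf$ converges to $f$ in $\mathcal{S}$. The main point is by the product rule,
\begin{align*}
\partial^\alpha(\theta_mf-f)(x) &= \sum_{\beta\leq\alpha}\binom{\alpha}{\beta}\partial^{\alpha-\beta}\theta_m(x)\,\partial^\beta f(x) - \partial^\alpha f(x) \\
&= \sum_{\beta\leq\alpha:\beta\neq\alpha}\binom{\alpha}{\beta}\frac{1}{m^{|\alpha-\beta|}}\left(\partial^{\alpha-\beta}\theta\right)(x/m)\,\partial^\beta f(x) + \left(\theta(x/m)-1\right)\partial^\alpha f(x).
\end{align*}
Since $\max\left\{\|\partial^\beta\theta\|_\infty : \beta\leq\alpha\right\}$ is bounded, and since $\theta(x/m)-1$ vanishes for $|x|\leq cm$ (for some $c>0$) while $\nu_{t+1}\partial^\alpha f$ is bounded, it then follows from the last equation that $\|\nu_t\partial^\alpha(\theta_mf-f)\|_\infty = O(1/m)$ for all $t>0$ and $\alpha$. That is to say $\theta_mf\to f$ in $\mathcal{S}$.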
Lemma 36.31 (Peetre's Inequality). For all $x,y\in\mathbb{R}^n$ and $s\in\mathbb{R}$,
\[ (1+|x+y|)^s\leq\min\left\{(1+|y|)^{|s|}(1+|x|)^s,\ (1+|y|)^s(1+|x|)^{|s|}\right\} \tag{36.5} \]
that is to say $\nu_s(x+y)\leq\nu_{|s|}(x)\nu_s(y)$ and $\nu_s(x+y)\leq\nu_s(x)\nu_{|s|}(y)$ for all $s\in\mathbb{R}$, where $\nu_s(x) = (1+|x|)^s$ as in Notation 36.27. We also have the same results for $\langle x\rangle$, namely
\[ \langle x+y\rangle^s\leq2^{|s|/2}\min\left\{\langle x\rangle^{|s|}\langle y\rangle^s,\ \langle x\rangle^s\langle y\rangle^{|s|}\right\}. \tag{36.6} \]
Proof. By elementary estimates,
\[ (1+|x+y|)\leq1+|x|+|y|\leq(1+|x|)(1+|y|) \]
and so Eq. (36.5) holds if $s\geq0$. Now suppose that $s<0$, then
\[ (1+|x+y|)^s\geq(1+|x|)^s(1+|y|)^s \]
and letting $x\to x+y$ and $y\to-y$ in this inequality implies
\[ (1+|x|)^s\geq(1+|x+y|)^s(1+|y|)^s. \]
This inequality is equivalent to
\[ (1+|x+y|)^s\leq(1+|x|)^s(1+|y|)^{-s} = (1+|x|)^s(1+|y|)^{|s|}. \]
By symmetry we also have
\[ (1+|x+y|)^s\leq(1+|x|)^{|s|}(1+|y|)^s. \]
For the proof of Eq. (36.6),
\begin{align*}
\langle x+y\rangle^2 &= 1+|x+y|^2\leq1+(|x|+|y|)^2 = 1+|x|^2+|y|^2+2|x||y| \\
&\leq1+2|x|^2+2|y|^2\leq2(1+|x|^2)(1+|y|^2) = 2\langle x\rangle^2\langle y\rangle^2.
\end{align*}
From this it follows that $\langle x\rangle^{-2}\leq2\langle x+y\rangle^{-2}\langle y\rangle^2$ and hence
\[ \langle x+y\rangle^{-2}\leq2\langle x\rangle^{-2}\langle y\rangle^2. \]
So if $s\geq0$, then
\[ \langle x+y\rangle^s\leq2^{s/2}\langle x\rangle^s\langle y\rangle^s \]
and
\[ \langle x+y\rangle^{-s}\leq2^{s/2}\langle x\rangle^{-s}\langle y\rangle^s. \]
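Both forms of Peetre's inequality are easy to spot-check on random inputs. The sketch below is only a numerical illustration, not a proof; the helper names, the dimension, and the sampling ranges are arbitrary choices, and a small relative tolerance allows for the equality cases (for example $y = 0$).

```python
import math
import random

def nu(x, s):
    """nu_s(x) = (1 + |x|)^s."""
    return (1.0 + math.sqrt(sum(v * v for v in x))) ** s

def bracket(x, s):
    """<x>^s = (1 + |x|^2)^(s/2)."""
    return (1.0 + sum(v * v for v in x)) ** (s / 2.0)

def peetre_holds(x, y, s, tol=1e-12):
    """Check Eq. (36.5) and Eq. (36.6) for the given x, y and s."""
    xy = [a + b for a, b in zip(x, y)]
    rhs1 = min(nu(y, abs(s)) * nu(x, s), nu(y, s) * nu(x, abs(s)))
    rhs2 = 2.0 ** (abs(s) / 2.0) * min(
        bracket(x, abs(s)) * bracket(y, s),
        bracket(x, s) * bracket(y, abs(s)),
    )
    return nu(xy, s) <= rhs1 * (1.0 + tol) and bracket(xy, s) <= rhs2 * (1.0 + tol)
```

A typical use is to draw a few thousand random triples `(x, y, s)` with both signs of `s` and confirm that `peetre_holds` never fails.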
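Proposition 36.32. Suppose that $f,g\in\mathcal{S}$ then $f*g\in\mathcal{S}$.
Proof. First proof. Since $\mathcal{F}(f*g) = \hat f\hat g\in\mathcal{S}$ it follows that $f*g = \mathcal{F}^{-1}(\hat f\hat g)\in\mathcal{S}$ as well. For the second proof we will make use of Peetre's inequality. We have for any $k,l\in\mathbb{N}$ that
\begin{align*}
\nu_t(x)\left|\partial^\alpha(f*g)(x)\right| &= \nu_t(x)\left|\partial^\alpha f*g(x)\right|\leq\nu_t(x)\int\left|\partial^\alpha f(x-y)\right||g(y)|\,dy \\
&\leq C\nu_t(x)\int\nu_{-k}(x-y)\nu_{-l}(y)\,dy\leq C\nu_t(x)\int\nu_{-k}(x)\nu_k(y)\nu_{-l}(y)\,dy \\
&= C\nu_{t-k}(x)\int\nu_{k-l}(y)\,dy.
\end{align*}
Choosing $k = t$ and $l>t+n$ we learn that
\[ \nu_t(x)\left|\partial^\alpha(f*g)(x)\right|\leq C\int\nu_{k-l}(y)\,dy<\infty \]
showing $\|\nu_t\partial^\alpha(f*g)\|_\infty<\infty$ for all $t\geq0$ and $\alpha\in\mathbb{N}^n$.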
36.4 Compactly supported distributions
Definition 36.33. For a distribution $T\in\mathcal{D}'(U)$ and $V\subset_oU$, we say $T|_V = 0$ if $\langle T,\phi\rangle = 0$ for all $\phi\in\mathcal{D}(V)$.
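Proposition 36.34. Suppose that $\mathcal{V} := \{V_\alpha\}_{\alpha\in A}$ is a collection of open subsets of $U$ such that $T|_{V_\alpha} = 0$ for all $\alpha$, then $T|_W = 0$ where $W = \cup_{\alpha\in A}V_\alpha$.
Proof. Let $\{\psi_\alpha\}_{\alpha\in A}$ be a smooth partition of unity subordinate to $\mathcal{V}$, i.e. $\operatorname{supp}(\psi_\alpha)\subset V_\alpha$ for all $\alpha\in A$, for each point $x\in W$ there exists a neighborhood $N_x\subset_oW$ such that $\#\{\alpha\in A : \operatorname{supp}(\psi_\alpha)\cap N_x\neq\emptyset\}<\infty$ and $1_W = \sum_{\alpha\in A}\psi_\alpha$. Then for $\phi\in\mathcal{D}(W)$, we have $\phi = \sum_{\alpha\in A}\phi\psi_\alpha$ and there are only a finite number of nonzero terms in the sum since $\operatorname{supp}(\phi)$ is compact. Since $\phi\psi_\alpha\in\mathcal{D}(V_\alpha)$ for all $\alpha$,
\[ \langle T,\phi\rangle = \Big\langle T,\sum_{\alpha\in A}\phi\psi_\alpha\Big\rangle = \sum_{\alpha\in A}\langle T,\phi\psi_\alpha\rangle = 0. \]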
Definition 36.35. The support, $\operatorname{supp}(T)$, of a distribution $T\in\mathcal{D}'(U)$ is the relatively closed subset of $U$ determined by
\[ U\setminus\operatorname{supp}(T) = \cup\{V\subset_oU : T|_V = 0\}. \]
By Proposition 36.34, $\operatorname{supp}(T)$ may be described as the smallest (relatively) closed set $F$ such that $T|_{U\setminus F} = 0$.
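Proposition 36.36. If $f\in L^1_{loc}(U)$, then $\operatorname{supp}(T_f) = \operatorname{ess\,supp}(f)$, where
\[ \operatorname{ess\,supp}(f) := \{x\in U : m(\{y\in V : f(y)\neq0\})>0 \text{ for all neighborhoods } V \text{ of } x\} \]
as in Definition 22.25.
Proof. The key point is that $T_f|_V = 0$ iff $f = 0$ a.e. on $V$ and therefore
\[ U\setminus\operatorname{supp}(T_f) = \cup\{V\subset_oU : f1_V = 0 \text{ a.e.}\}. \]
On the other hand,
\begin{align*}
U\setminus\operatorname{ess\,supp}(f) &= \{x\in U : m(\{y\in V : f(y)\neq0\}) = 0 \text{ for some neighborhood } V \text{ of } x\} \\
&= \{x\in U : f1_V = 0 \text{ a.e. for some neighborhood } V \text{ of } x\} \\
&= \cup\{V\subset_oU : f1_V = 0 \text{ a.e.}\}.
\end{align*}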
Definition 36.37. Let $\mathcal{E}'(U) := \{T\in\mathcal{D}'(U) : \operatorname{supp}(T)\subset U \text{ is compact}\}$ be the compactly supported distributions in $\mathcal{D}'(U)$.
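Lemma 36.38. Suppose that $T\in\mathcal{D}'(U)$ and $f\in C^\infty(U)$ is a function such that $K := \operatorname{supp}(T)\cap\operatorname{supp}(f)$ is a compact subset of $U$. Then we may define $\langle T,f\rangle := \langle T,\phi f\rangle$, where $\phi\in\mathcal{D}(U)$ is any function such that $\phi = 1$ on a neighborhood of $K$. Moreover, if $K\sqsubset\sqsubset U$ is a given compact set and $F\sqsubset\sqsubset U$ is a compact set such that $K\subset F^o$, then there exists $m\in\mathbb{N}$ and $C<\infty$ such that
\[ |\langle T,f\rangle|\leq C\sum_{|\beta|\leq m}\left\|\partial^\beta f\right\|_{\infty,F} \tag{36.7} \]
for all $f\in C^\infty(U)$ such that $\operatorname{supp}(T)\cap\operatorname{supp}(f)\subset K$. In particular if $T\in\mathcal{E}'(U)$ then $T$ extends uniquely to a linear functional on $C^\infty(U)$ and there is a compact subset $F\sqsubset\sqsubset U$ such that the estimate in Eq. (36.7) holds for all $f\in C^\infty(U)$.
Proof. Suppose that $\tilde\phi$ is another such cutoff function and let $V$ be an open neighborhood of $K$ such that $\phi = \tilde\phi = 1$ on $V$. Setting $g := (\phi-\tilde\phi)f\in\mathcal{D}(U)$ we see that
\[ \operatorname{supp}(g)\subset\operatorname{supp}(f)\setminus V\subset\operatorname{supp}(f)\setminus K = \operatorname{supp}(f)\setminus\operatorname{supp}(T)\subset U\setminus\operatorname{supp}(T), \]
see Figure 36.1 below. Therefore,
\[ 0 = \langle T,g\rangle = \langle T,(\phi-\tilde\phi)f\rangle = \langle T,\phi f\rangle - \langle T,\tilde\phi f\rangle \]
which shows that $\langle T,f\rangle$ is well defined. Moreover, if $F\sqsubset\sqsubset U$ is a compact set such that $K\subset F^o$ and $\phi\in C_c^\infty(F^o)$ is a function which is $1$ on a neighborhood of $K$, we have
\[ |\langle T,f\rangle| = |\langle T,\phi f\rangle|\leq C\sum_{|\alpha|\leq m}\left\|\partial^\alpha(\phi f)\right\|_\infty\leq C\sum_{|\beta|\leq m}\left\|\partial^\beta f\right\|_{\infty,F} \]
and this estimate holds for all $f\in C^\infty(U)$ such that $\operatorname{supp}(T)\cap\operatorname{supp}(f)\subset K$.
Fig. 36.1. Intersecting the supports.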
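Theorem 36.39. The restriction of $T\in C^\infty(U)^*$ to $C_c^\infty(U)$ defines an element in $\mathcal{E}'(U)$. Moreover the map
\[ T\in C^\infty(U)^*\xrightarrow{\ i\ }T|_{\mathcal{D}(U)}\in\mathcal{E}'(U) \]
is a linear isomorphism of vector spaces. The inverse map is defined as follows. Given $S\in\mathcal{E}'(U)$ and $\theta\in C_c^\infty(U)$ such that $\theta = 1$ on $K = \operatorname{supp}(S)$ then $i^{-1}(S) = \theta S$, where $\theta S\in C^\infty(U)^*$ is defined by
\[ \langle\theta S,\phi\rangle = \langle S,\theta\phi\rangle \text{ for all } \phi\in C^\infty(U). \]
Proof. Suppose that $T\in C^\infty(U)^*$ then there exists a compact set $K\sqsubset\sqsubset U$, $m\in\mathbb{N}$ and $C<\infty$ such that
\[ |\langle T,\phi\rangle|\leq Cp_m^K(\phi) \text{ for all } \phi\in C^\infty(U) \]
where $p_m^K$ is defined in Example 36.25. It is clear using the sequential notion of continuity that $T|_{\mathcal{D}(U)}$ is continuous on $\mathcal{D}(U)$, i.e. $T|_{\mathcal{D}(U)}\in\mathcal{D}'(U)$. Moreover, if $\theta\in C_c^\infty(U)$ such that $\theta = 1$ on a neighborhood of $K$ then
\[ |\langle T,\theta\phi\rangle - \langle T,\phi\rangle| = |\langle T,(\theta-1)\phi\rangle|\leq Cp_m^K((\theta-1)\phi) = 0, \]
which shows $\theta T = T$. Hence $\operatorname{supp}(T) = \operatorname{supp}(\theta T)\subset\operatorname{supp}(\theta)\sqsubset\sqsubset U$ showing that $T|_{\mathcal{D}(U)}\in\mathcal{E}'(U)$. Therefore the map $i$ is well defined and is clearly linear. I also claim that $i$ is injective because if $T\in C^\infty(U)^*$ and $i(T) = T|_{\mathcal{D}(U)}\equiv0$, then $\langle T,\phi\rangle = \langle\theta T,\phi\rangle = \langle T|_{\mathcal{D}(U)},\theta\phi\rangle = 0$ for all $\phi\in C^\infty(U)$. To show $i$ is surjective suppose that $S\in\mathcal{E}'(U)$. By Lemma 36.38 (with $K = \operatorname{supp}(S)$) we know that $S$ extends uniquely to an element $\tilde S$ of $C^\infty(U)^*$ such that $\tilde S|_{\mathcal{D}(U)} = S$, i.e. $i(\tilde S) = S$.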
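Lemma 36.40. The space $\mathcal{E}'(U)$ is a sequentially dense subset of $\mathcal{D}'(U)$.
Proof. Choose $K_n\sqsubset\sqsubset U$ such that $K_n\subset K_{n+1}^o\subset K_{n+1}\uparrow U$ as $n\to\infty$. Let $\theta_n\in C_c^\infty(K_{n+1}^o)$ such that $\theta_n = 1$ on $K_n$. Then for $T\in\mathcal{D}'(U)$, $\theta_nT\in\mathcal{E}'(U)$ and $\theta_nT\to T$ as $n\to\infty$.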
36.5 Tempered Distributions and the Fourier Transform
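The space of tempered distributions $\mathcal{S}'(\mathbb{R}^n)$ is the continuous dual to $\mathcal{S} = \mathcal{S}(\mathbb{R}^n)$. A linear functional $T$ on $\mathcal{S}$ is continuous iff there exists $k\in\mathbb{N}$ and $C<\infty$ such that
\[ |\langle T,\phi\rangle|\leq Cp_k(\phi) := C\sum_{|\alpha|\leq k}\|\nu_k\partial^\alpha\phi\|_\infty \tag{36.8} \]
for all $\phi\in\mathcal{S}$. Since $\mathcal{D} = \mathcal{D}(\mathbb{R}^n)$ is a dense subspace of $\mathcal{S}$ any element $T\in\mathcal{S}'$ is determined by its restriction to $\mathcal{D}$. Moreover, if $T\in\mathcal{S}'$ it is easy to see that $T|_{\mathcal{D}}\in\mathcal{D}'$. Conversely an element $T\in\mathcal{D}'$ satisfying an estimate of the form in Eq. (36.8) for all $\phi\in\mathcal{D}$ extends uniquely to an element of $\mathcal{S}'$. In this way we may view $\mathcal{S}'$ as a subspace of $\mathcal{D}'$.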
Example 36.41. Any compactly supported distribution is tempered, i.e. $\mathcal{E}'(U)\subset\mathcal{S}'(\mathbb{R}^n)$ for any $U\subset_o\mathbb{R}^n$.
One of the virtues of $\mathcal{S}'$ is that we may extend the Fourier transform to $\mathcal{S}'$. Recall that for $L^1$ functions $f$ and $g$ we have the identity,
\[ \langle\hat f,g\rangle = \langle f,\hat g\rangle. \]
This suggests the following definition.
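Definition 36.42. The Fourier and inverse Fourier transform of a tempered distribution $T\in\mathcal{S}'$ are the distributions $\hat T = \mathcal{F}T\in\mathcal{S}'$ and $T^\vee = \mathcal{F}^{-1}T\in\mathcal{S}'$ defined by
\[ \langle\hat T,\phi\rangle = \langle T,\hat\phi\rangle \quad\text{and}\quad \langle T^\vee,\phi\rangle = \langle T,\phi^\vee\rangle \text{ for all } \phi\in\mathcal{S}. \]
Since $\mathcal{F}:\mathcal{S}\to\mathcal{S}$ is a continuous isomorphism with inverse $\mathcal{F}^{-1}$, one easily checks that $\hat T$ and $T^\vee$ are well defined elements of $\mathcal{S}'$ and that $\mathcal{F}^{-1}$ is the inverse of $\mathcal{F}$ on $\mathcal{S}'$.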
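Example 36.43. Suppose that $\mu$ is a complex measure on $\mathbb{R}^n$. Then we may view $\mu$ as an element of $\mathcal{S}'$ via $\langle\mu,\phi\rangle = \int\phi\,d\mu$ for all $\phi\in\mathcal{S}$. Then by Fubini-Tonelli,
\begin{align*}
\langle\hat\mu,\phi\rangle = \langle\mu,\hat\phi\rangle = \int\hat\phi(x)\,d\mu(x) &= \int\left[\int\phi(\xi)e^{-ix\cdot\xi}\,d\xi\right]d\mu(x) \\
&= \int\left[\int\phi(\xi)e^{-ix\cdot\xi}\,d\mu(x)\right]d\xi
\end{align*}
which shows that $\hat\mu$ is the distribution associated to the continuous function $\xi\to\int e^{-ix\cdot\xi}\,d\mu(x)$. We will somewhat abuse notation and identify the distribution $\hat\mu$ with the function $\xi\to\int e^{-ix\cdot\xi}\,d\mu(x)$. When $d\mu(x) = f(x)\,dx$ with $f\in L^1$, we have $\hat\mu = \hat f$, so the definitions are all consistent.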
Corollary 36.44. Suppose that $\mu$ is a complex measure such that $\hat\mu = 0$, then $\mu = 0$. So complex measures on $\mathbb{R}^n$ are uniquely determined by their Fourier transform.
Proof. If $\hat\mu = 0$, then $\mu = 0$ as a distribution, i.e. $\int\phi\,d\mu = 0$ for all $\phi\in\mathcal{S}$ and in particular for all $\phi\in\mathcal{D}$. By Example 36.5 this implies that $\mu$ is the zero measure.
More generally we have the following analogous theorem for compactly supported distributions.
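Theorem 36.45. Let $S\in\mathcal{E}'(\mathbb{R}^n)$, then $\hat S$ is an analytic function and $\hat S(z) = \langle S(x),e^{-ix\cdot z}\rangle$. Also if $\operatorname{supp}(S)\subset B(0,M)$, then $\hat S(z)$ satisfies a bound of the form
\[ \left|\hat S(z)\right|\leq C(1+|z|)^me^{M|\operatorname{Im}z|} \]
for some $m\in\mathbb{N}$ and $C<\infty$. If $S\in\mathcal{D}(\mathbb{R}^n)$, i.e. if $S$ is assumed to be smooth, then for all $m\in\mathbb{N}$ there exists $C_m<\infty$ such that
\[ \left|\hat S(z)\right|\leq C_m(1+|z|)^{-m}e^{M|\operatorname{Im}z|}. \]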
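Proof. The function $h(z) = \langle S(\xi),e^{-iz\cdot\xi}\rangle$ for $z\in\mathbb{C}^n$ is analytic since the map $z\in\mathbb{C}^n\to e^{-iz\cdot\xi}\in C^\infty(\xi\in\mathbb{R}^n)$ is analytic and $S$ is complex linear. Moreover, we have the bound
\begin{align*}
|h(z)| = \left|\langle S(\xi),e^{-iz\cdot\xi}\rangle\right| &\leq C\sum_{|\alpha|\leq m}\left\|\partial_\xi^\alpha e^{-iz\cdot\xi}\right\|_{\infty,B(0,M)} = C\sum_{|\alpha|\leq m}\left\|z^\alpha e^{-iz\cdot\xi}\right\|_{\infty,B(0,M)} \\
&\leq C\sum_{|\alpha|\leq m}|z|^{|\alpha|}\left\|e^{-iz\cdot\xi}\right\|_{\infty,B(0,M)}\leq C(1+|z|)^me^{M|\operatorname{Im}z|}.
\end{align*}
If we now assume that $S\in\mathcal{D}(\mathbb{R}^n)$, then
\begin{align*}
\left|z^\alpha\hat S(z)\right| &= \left|\int_{\mathbb{R}^n}S(\xi)z^\alpha e^{-iz\cdot\xi}\,d\xi\right| = \left|\int_{\mathbb{R}^n}S(\xi)(i\partial_\xi)^\alpha e^{-iz\cdot\xi}\,d\xi\right| \\
&= \left|\int_{\mathbb{R}^n}(-i\partial_\xi)^\alpha S(\xi)e^{-iz\cdot\xi}\,d\xi\right|\leq e^{M|\operatorname{Im}z|}\int_{\mathbb{R}^n}\left|\partial_\xi^\alpha S(\xi)\right|d\xi
\end{align*}
showing
\[ |z^\alpha|\left|\hat S(z)\right|\leq e^{M|\operatorname{Im}z|}\left\|\partial^\alpha S\right\|_1 \]
and therefore
\[ (1+|z|)^m\left|\hat S(z)\right|\leq Ce^{M|\operatorname{Im}z|}\sum_{|\alpha|\leq m}\left\|\partial^\alpha S\right\|_1\leq Ce^{M|\operatorname{Im}z|}. \]
So to finish the proof it suffices to show $h = \hat S$ in the sense of distributions$^1$. For this let $\phi\in\mathcal{D}$, $K\sqsubset\sqsubset\mathbb{R}^n$ be a compact set and for $\epsilon>0$ let
\[ \hat\phi_\epsilon(\xi) = (2\pi)^{-n/2}\epsilon^n\sum_{x\in\epsilon\mathbb{Z}^n}\phi(x)e^{-ix\cdot\xi}. \]
This is a finite sum and, writing $\mathbf{d}x := (2\pi)^{-n/2}dx$ for normalized Lebesgue measure,
\begin{align*}
\sup_{\xi\in K}\left|\partial^\alpha\left(\hat\phi_\epsilon(\xi)-\hat\phi(\xi)\right)\right|
&= \sup_{\xi\in K}\left|\sum_{y\in\epsilon\mathbb{Z}^n}\int_{y+\epsilon(0,1]^n}\left((-iy)^\alpha\phi(y)e^{-iy\cdot\xi}-(-ix)^\alpha\phi(x)e^{-ix\cdot\xi}\right)\mathbf{d}x\right| \\
&\leq\sum_{y\in\epsilon\mathbb{Z}^n}\int_{y+\epsilon(0,1]^n}\sup_{\xi\in K}\left|y^\alpha\phi(y)e^{-iy\cdot\xi}-x^\alpha\phi(x)e^{-ix\cdot\xi}\right|\mathbf{d}x.
\end{align*}
By uniform continuity of $x^\alpha\phi(x)e^{-ix\cdot\xi}$ for $(\xi,x)\in K\times\mathbb{R}^n$ ($\phi$ has compact support),
\[ \delta(\epsilon) = \sup_{\xi\in K}\sup_{y\in\epsilon\mathbb{Z}^n}\sup_{x\in y+\epsilon(0,1]^n}\left|y^\alpha\phi(y)e^{-iy\cdot\xi}-x^\alpha\phi(x)e^{-ix\cdot\xi}\right|\to0 \text{ as } \epsilon\downarrow0 \]
which shows
\[ \sup_{\xi\in K}\left|\partial^\alpha\left(\hat\phi_\epsilon(\xi)-\hat\phi(\xi)\right)\right|\leq C\delta(\epsilon) \]
where $C$ is the volume of a cube in $\mathbb{R}^n$ which contains the support of $\phi$. This shows that $\hat\phi_\epsilon\to\hat\phi$ in $C^\infty(\mathbb{R}^n)$. Therefore,
\begin{align*}
\langle\hat S,\phi\rangle = \langle S,\hat\phi\rangle = \lim_{\epsilon\downarrow0}\langle S,\hat\phi_\epsilon\rangle &= \lim_{\epsilon\downarrow0}(2\pi)^{-n/2}\epsilon^n\sum_{x\in\epsilon\mathbb{Z}^n}\phi(x)\langle S(\xi),e^{-ix\cdot\xi}\rangle \\
&= \lim_{\epsilon\downarrow0}(2\pi)^{-n/2}\epsilon^n\sum_{x\in\epsilon\mathbb{Z}^n}\phi(x)h(x) = \int_{\mathbb{R}^n}\phi(x)h(x)\,\mathbf{d}x = \langle h,\phi\rangle.
\end{align*}
$^1$ This is most easily done using Fubini's Theorem 37.2 for distributions proved below. This proof goes as follows. Let $\theta,\eta\in\mathcal{D}(\mathbb{R}^n)$ such that $\theta = 1$ on a neighborhood of $\operatorname{supp}(S)$ and $\eta = 1$ on a neighborhood of $\operatorname{supp}(\phi)$ then
\begin{align*}
\langle h,\phi\rangle &= \langle\phi(x),\langle S(\xi),e^{-ix\cdot\xi}\rangle\rangle = \langle\eta(x)\phi(x),\langle S(\xi),\theta(\xi)e^{-ix\cdot\xi}\rangle\rangle \\
&= \langle\phi(x),\langle S(\xi),\eta(x)\theta(\xi)e^{-ix\cdot\xi}\rangle\rangle.
\end{align*}
We may now apply Theorem 37.2 to conclude,
\begin{align*}
\langle h,\phi\rangle &= \langle S(\xi),\langle\phi(x),\eta(x)\theta(\xi)e^{-ix\cdot\xi}\rangle\rangle = \langle S(\xi),\theta(\xi)\langle\phi(x),e^{-ix\cdot\xi}\rangle\rangle = \langle S(\xi),\langle\phi(x),e^{-ix\cdot\xi}\rangle\rangle \\
&= \langle S(\xi),\hat\phi(\xi)\rangle.
\end{align*}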
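A one-dimensional instance of the first bound in Theorem 36.45 is easy to test. The sketch below assumes $S = 1_{[-M,M]}$, viewed as a compactly supported distribution on $\mathbb{R}$, and uses the unnormalized transform $\int_{-M}^Me^{-ixz}\,dx = 2\sin(Mz)/z$; for this $S$ the estimate $|\int_{-M}^Me^{-ixz}\,dx|\leq2Me^{M|\operatorname{Im}z|}$ holds, i.e. the theorem's bound with $m = 0$ and $C = 2M$. The function names are illustrative.

```python
import cmath
import math
import random

def s_hat(z, M):
    """Integral of exp(-i*x*z) over [-M, M] for complex z, i.e. 2*sin(M*z)/z."""
    if z == 0:
        return 2.0 * M
    return 2.0 * cmath.sin(M * z) / z

def paley_wiener_bound(z, M):
    """The bound 2*M*exp(M*|Im z|) for the transform of the indicator of [-M, M]."""
    return 2.0 * M * math.exp(M * abs(z.imag))
```

Sampling `z` over a box in the complex plane and comparing `abs(s_hat(z, M))` with `paley_wiener_bound(z, M)` shows the exponential growth in $|\operatorname{Im}z|$ at rate $M$, which is what encodes the size of the support.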
Remark 36.46. Notice that
\[ \partial^\alpha\hat S(z) = \langle S(x),\partial_z^\alpha e^{-ix\cdot z}\rangle = \langle S(x),(-ix)^\alpha e^{-ix\cdot z}\rangle = \langle(-ix)^\alpha S(x),e^{-ix\cdot z}\rangle \]
and $(-ix)^\alpha S(x)\in\mathcal{E}'(\mathbb{R}^n)$. Therefore, we find a bound of the form
\[ \left|\partial^\alpha\hat S(z)\right|\leq C(1+|z|)^{m'}e^{M|\operatorname{Im}z|} \]
where $C$ and $m'$ depend on $\alpha$. In particular, this shows that $\hat S\in\mathcal{P}$, i.e. $\mathcal{S}'$ is preserved under multiplication by $\hat S$.
The converse of this theorem holds as well. For the moment we only have the tools to prove the smooth converse. The general case will follow by using the notion of convolution to regularize a distribution to reduce the question to the smooth case.
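Theorem 36.47. Let $S\in\mathcal{S}(\mathbb{R}^n)$ and assume that $\hat S$ is an analytic function and there exists an $M<\infty$ such that for all $m\in\mathbb{N}$ there exists $C_m<\infty$ such that
\[ \left|\hat S(z)\right|\leq C_m(1+|z|)^{-m}e^{M|\operatorname{Im}z|}. \]
Then $\operatorname{supp}(S)\subset\overline{B(0,M)}$.
Proof. By the Fourier inversion formula (with $\mathbf{d}\xi := (2\pi)^{-n/2}d\xi$),
\[ S(x) = \int_{\mathbb{R}^n}\hat S(\xi)e^{i\xi\cdot x}\,\mathbf{d}\xi \]
and by deforming the contour, we may express this integral as
\[ S(x) = \int_{\mathbb{R}^n+i\eta}\hat S(\xi)e^{i\xi\cdot x}\,\mathbf{d}\xi = \int_{\mathbb{R}^n}\hat S(\xi+i\eta)e^{i(\xi+i\eta)\cdot x}\,\mathbf{d}\xi \]
for any $\eta\in\mathbb{R}^n$. From this last equation it follows that
\begin{align*}
|S(x)| &\leq e^{-\eta\cdot x}\int_{\mathbb{R}^n}\left|\hat S(\xi+i\eta)\right|\mathbf{d}\xi\leq C_me^{-\eta\cdot x}e^{M|\eta|}\int_{\mathbb{R}^n}(1+|\xi+i\eta|)^{-m}\,\mathbf{d}\xi \\
&\leq C_me^{-\eta\cdot x}e^{M|\eta|}\int_{\mathbb{R}^n}(1+|\xi|)^{-m}\,\mathbf{d}\xi\leq\tilde C_me^{-\eta\cdot x}e^{M|\eta|}
\end{align*}
where $\tilde C_m<\infty$ if $m>n$. Letting $\eta = \lambda x$ with $\lambda>0$ we learn
\[ |S(x)|\leq\tilde C_m\exp\left(-\lambda|x|^2+\lambda M|x|\right) = \tilde C_me^{\lambda|x|(M-|x|)}. \tag{36.9} \]
Hence if $|x|>M$, we may let $\lambda\to\infty$ in Eq. (36.9) to show $S(x) = 0$. That is to say $\operatorname{supp}(S)\subset\overline{B(0,M)}$.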
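Let us now pause to work out some specific examples of Fourier transforms of measures.
Example 36.48 (Delta Functions). Let $a\in\mathbb{R}^n$ and $\delta_a$ be the point mass measure at $a$, then
\[ \hat\delta_a(\xi) = e^{-ia\cdot\xi}. \]
In particular it follows that
\[ \mathcal{F}^{-1}e^{-ia\cdot\xi} = \delta_a. \]
To see the content of this formula, let $\phi\in\mathcal{S}$. Then
\[ \int e^{-ia\cdot\xi}\phi^\vee(\xi)\,d\xi = \langle e^{-ia\cdot\xi},\mathcal{F}^{-1}\phi\rangle = \langle\mathcal{F}^{-1}e^{-ia\cdot\xi},\phi\rangle = \langle\delta_a,\phi\rangle = \phi(a) \]
which is precisely the Fourier inversion formula.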
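Example 36.49. Suppose that $p(x)$ is a polynomial. Then
\[ \langle\hat p,\phi\rangle = \langle p,\hat\phi\rangle = \int p(\xi)\hat\phi(\xi)\,d\xi. \]
Now
\begin{align*}
p(\xi)\hat\phi(\xi) &= \int\phi(x)p(\xi)e^{-i\xi\cdot x}\,dx = \int\phi(x)p(i\partial_x)e^{-i\xi\cdot x}\,dx \\
&= \int p(-i\partial_x)\phi(x)e^{-i\xi\cdot x}\,dx = \mathcal{F}\left(p(-i\partial)\phi\right)(\xi)
\end{align*}
which combined with the previous equation implies
\begin{align*}
\langle\hat p,\phi\rangle &= \int\mathcal{F}\left(p(-i\partial)\phi\right)(\xi)\,d\xi = \left(\mathcal{F}^{-1}\mathcal{F}\left(p(-i\partial)\phi\right)\right)(0) = p(-i\partial)\phi(0) \\
&= \langle\delta_0,p(-i\partial)\phi\rangle = \langle p(i\partial)\delta_0,\phi\rangle.
\end{align*}
Thus we have shown that $\hat p = p(i\partial)\delta_0$.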
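Lemma 36.50. Let $p(\xi)$ be a polynomial in $\xi\in\mathbb{R}^n$, $L = p(-i\partial)$ (a constant coefficient partial differential operator) and $T\in\mathcal{S}'$, then
\[ \mathcal{F}p(-i\partial)T = p\hat T. \]
In particular if $T = \delta_0$, we have
\[ \mathcal{F}p(-i\partial)\delta_0 = p\cdot\hat\delta_0 = p. \]
Proof. By definition,
\[ \langle\mathcal{F}LT,\phi\rangle = \langle LT,\hat\phi\rangle = \langle p(-i\partial)T,\hat\phi\rangle = \langle T,p(i\partial)\hat\phi\rangle \]
and
\[ p(i\partial_\xi)\hat\phi(\xi) = p(i\partial_\xi)\int\phi(x)e^{-ix\cdot\xi}\,dx = \int p(x)\phi(x)e^{-ix\cdot\xi}\,dx = (p\phi)^\wedge. \]
Thus
\[ \langle\mathcal{F}LT,\phi\rangle = \langle T,p(i\partial)\hat\phi\rangle = \langle T,(p\phi)^\wedge\rangle = \langle\hat T,p\phi\rangle = \langle p\hat T,\phi\rangle \]
which proves the lemma.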
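Example 36.51. Let $n = 1$, $-\infty<a<b<\infty$, and $d\mu(x) = 1_{[a,b]}(x)\,dx$. Then, with $\mathbf{d}x := (2\pi)^{-1/2}dx$,
\[ \hat\mu(\xi) = \int_a^be^{-i\xi x}\,\mathbf{d}x = \frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi x}}{-i\xi}\Big|_a^b = \frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi b}-e^{-i\xi a}}{-i\xi} = \frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}. \]
So by the inversion formula we may conclude that
\[ \mathcal{F}^{-1}\left(\frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}\right)(x) = 1_{[a,b]}(x) \tag{36.10} \]
in the sense of distributions. This is also true at the level of $L^2$ functions. When $a = -b$ and $b>0$ these formulas reduce to
\[ \mathcal{F}1_{[-b,b]} = \frac{1}{\sqrt{2\pi}}\frac{e^{i\xi b}-e^{-i\xi b}}{i\xi} = \frac{2}{\sqrt{2\pi}}\frac{\sin\xi b}{\xi} \]
and
\[ \mathcal{F}^{-1}\frac{2}{\sqrt{2\pi}}\frac{\sin\xi b}{\xi} = 1_{[-b,b]}. \]
Let us pause to work out Eq. (36.10) by first principles. For $M\in(0,\infty)$ let $\mu_M$ be the complex measure on $\mathbb{R}$ defined by
\[ d\mu_M(\xi) = \frac{1}{\sqrt{2\pi}}1_{|\xi|\leq M}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}\,d\xi, \]
then
\[ \frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi} = \lim_{M\to\infty}\mu_M \text{ in the } \mathcal{S}' \text{ topology.} \]
Hence
\[ \mathcal{F}^{-1}\left(\frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}\right)(x) = \lim_{M\to\infty}\mathcal{F}^{-1}\mu_M \]
and
\[ \mathcal{F}^{-1}\mu_M(x) = \int_{-M}^M\frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}e^{i\xi x}\,\mathbf{d}\xi. \]
Since $\xi\to\frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}e^{i\xi x}$ is a holomorphic function on $\mathbb{C}$ we may deform the contour to any contour in $\mathbb{C}$ starting at $-M$ and ending at $M$. Let $\Gamma_M$ denote the straight line path from $-M$ to $-1$ along the real axis followed by the contour $e^{i\theta}$ for $\theta$ going from $\pi$ to $2\pi$ and then followed by the straight line path from $1$ to $M$. Then
\begin{align*}
\int_{|\xi|\leq M}\frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}e^{i\xi x}\,\mathbf{d}\xi &= \int_{\Gamma_M}\frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}e^{i\xi x}\,\mathbf{d}\xi \\
&= \int_{\Gamma_M}\frac{1}{\sqrt{2\pi}}\frac{e^{i\xi(x-a)}-e^{i\xi(x-b)}}{i\xi}\,\mathbf{d}\xi \\
&= \frac{1}{2\pi i}\int_{\Gamma_M}\frac{e^{i\xi(x-a)}-e^{i\xi(x-b)}}{\xi}\,d\xi.
\end{align*}
By the usual contour methods we find
\[ \lim_{M\to\infty}\frac{1}{2\pi i}\int_{\Gamma_M}\frac{e^{iy\xi}}{\xi}\,d\xi = \begin{cases}1 & \text{if } y>0 \\ 0 & \text{if } y<0\end{cases} \]
and therefore we have
\[ \mathcal{F}^{-1}\left(\frac{1}{\sqrt{2\pi}}\frac{e^{-i\xi a}-e^{-i\xi b}}{i\xi}\right)(x) = \lim_{M\to\infty}\mathcal{F}^{-1}\mu_M(x) = 1_{x>a}-1_{x>b} = 1_{[a,b]}(x). \]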
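The closed form for $\hat\mu$ in Example 36.51 can be confirmed by direct quadrature. This sketch assumes the normalization $\hat\mu(\xi) = (2\pi)^{-1/2}\int_a^be^{-i\xi x}\,dx$ as in the example; the function names and the midpoint rule are illustrative choices.

```python
import cmath
import math

def ft_indicator(xi, a, b, n=20000):
    """Midpoint approximation of (2*pi)^(-1/2) * integral over [a, b] of exp(-i*xi*x) dx."""
    h = (b - a) / n
    s = sum(cmath.exp(-1j * xi * (a + (k + 0.5) * h)) for k in range(n))
    return h * s / math.sqrt(2.0 * math.pi)

def ft_indicator_closed_form(xi, a, b):
    """(2*pi)^(-1/2) * (exp(-i*xi*a) - exp(-i*xi*b)) / (i*xi), valid for xi != 0."""
    numerator = cmath.exp(-1j * xi * a) - cmath.exp(-1j * xi * b)
    return numerator / (1j * xi) / math.sqrt(2.0 * math.pi)
```

In the symmetric case $a = -b$ the closed form is real and equals $\frac{2}{\sqrt{2\pi}}\frac{\sin\xi b}{\xi}$, matching the formula for $\mathcal{F}1_{[-b,b]}$.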
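Example 36.52. Let $\sigma_t$ be the surface measure on the sphere $S_t$ of radius $t$ centered at zero in $\mathbb{R}^3$. Then
\[ \hat\sigma_t(\xi) = 4\pi t\frac{\sin t|\xi|}{|\xi|}. \]
Indeed,
\begin{align*}
\hat\sigma_t(\xi) &= \int_{tS^2}e^{-ix\cdot\xi}\,d\sigma(x) = t^2\int_{S^2}e^{-itx\cdot\xi}\,d\sigma(x) \\
&= t^2\int_{S^2}e^{-itx_3|\xi|}\,d\sigma(x) = t^2\int_0^{2\pi}d\theta\int_0^\pi d\phi\,\sin\phi\,e^{-it\cos\phi|\xi|} \\
&= 2\pi t^2\int_{-1}^1e^{-itu|\xi|}\,du = 2\pi t^2\frac{1}{-it|\xi|}e^{-itu|\xi|}\Big|_{u=-1}^{u=1} = 4\pi t^2\frac{\sin t|\xi|}{t|\xi|}.
\end{align*}
By the inversion formula, it follows that
\[ \mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|} = \frac{t}{4\pi t^2}\sigma_t = t\bar\sigma_t \]
where $\bar\sigma_t$ is $\frac{1}{4\pi t^2}\sigma_t$, the surface measure on $S_t$ normalized to have total measure one.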
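Let us again pause to try to compute this inverse Fourier transform directly. To this end, let $f_M(\xi) := \frac{\sin t|\xi|}{t|\xi|}1_{|\xi|\leq M}$. By the dominated convergence theorem, it follows that $f_M\to\frac{\sin t|\xi|}{t|\xi|}$ in $\mathcal{S}'$, i.e. pointwise on $\mathcal{S}$. Therefore,
\[ \Big\langle\mathcal{F}^{-1}\frac{\sin t|\xi|}{t|\xi|},\phi\Big\rangle = \Big\langle\frac{\sin t|\xi|}{t|\xi|},\mathcal{F}^{-1}\phi\Big\rangle = \lim_{M\to\infty}\langle f_M,\mathcal{F}^{-1}\phi\rangle = \lim_{M\to\infty}\langle\mathcal{F}^{-1}f_M,\phi\rangle \]
and, with $\mathbf{d}\xi := (2\pi)^{-3/2}d\xi$,
\begin{align*}
(2\pi)^{3/2}\mathcal{F}^{-1}f_M(x) &= (2\pi)^{3/2}\int_{\mathbb{R}^3}\frac{\sin t|\xi|}{t|\xi|}1_{|\xi|\leq M}e^{i\xi\cdot x}\,\mathbf{d}\xi \\
&= \int_{r=0}^M\int_{\theta=0}^{2\pi}\int_{\phi=0}^\pi\frac{\sin tr}{tr}e^{ir|x|\cos\phi}r^2\sin\phi\,dr\,d\phi\,d\theta \\
&= \int_{r=0}^M\int_{\theta=0}^{2\pi}\int_{u=-1}^1\frac{\sin tr}{tr}e^{ir|x|u}r^2\,dr\,du\,d\theta \\
&= 2\pi\int_{r=0}^M\frac{\sin tr}{t}\frac{e^{ir|x|}-e^{-ir|x|}}{ir|x|}r\,dr \\
&= \frac{4\pi}{t|x|}\int_{r=0}^M\sin tr\,\sin r|x|\,dr \\
&= \frac{4\pi}{t|x|}\int_{r=0}^M\frac{1}{2}\left(\cos(r(t-|x|))-\cos(r(t+|x|))\right)dr \\
&= \frac{4\pi}{t|x|}\frac{1}{2}\left(\frac{\sin(r(t-|x|))}{t-|x|}-\frac{\sin(r(t+|x|))}{t+|x|}\right)\Big|_{r=0}^M \\
&= \frac{4\pi}{t|x|}\frac{1}{2}\left(\frac{\sin(M(t-|x|))}{t-|x|}-\frac{\sin(M(t+|x|))}{t+|x|}\right).
\end{align*}
Now make use of the fact that $\frac{\sin Mx}{x}\to\pi\delta(x)$ in one dimension to finish the proof.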
36.6 Wave Equation
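Given a distribution $T$ and a test function $\phi$, we wish to define $T*\phi\in C^\infty$ by the formula
\[ T*\phi(x) = \text{``}\int T(y)\phi(x-y)\,dy\text{''} = \langle T,\phi(x-\cdot)\rangle. \]
As motivation for wanting to understand convolutions of distributions let us reconsider the wave equation in $\mathbb{R}^n$,
\begin{align*}
0 &= \left(\partial_t^2-\Delta\right)u(t,x) \text{ with} \\
u(0,x) &= f(x) \text{ and } u_t(0,x) = g(x).
\end{align*}
Taking the Fourier transform in the $x$ variables gives the following equation
\begin{align*}
0 &= \hat u_{tt}(t,\xi)+|\xi|^2\hat u(t,\xi) \text{ with} \\
\hat u(0,\xi) &= \hat f(\xi) \text{ and } \hat u_t(0,\xi) = \hat g(\xi).
\end{align*}
The solution to these equations is
\[ \hat u(t,\xi) = \hat f(\xi)\cos(t|\xi|)+\hat g(\xi)\frac{\sin t|\xi|}{|\xi|} \]
and hence we should have
\begin{align*}
u(t,x) &= \mathcal{F}^{-1}\left(\hat f(\xi)\cos(t|\xi|)+\hat g(\xi)\frac{\sin t|\xi|}{|\xi|}\right)(x) \\
&= \mathcal{F}^{-1}\cos(t|\xi|)*f(x)+\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}*g(x) \\
&= \frac{d}{dt}\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}*f(x)+\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}*g(x).
\end{align*}
The question now is how to interpret this equation. In particular what are the inverse Fourier transforms $\mathcal{F}^{-1}\cos(t|\xi|)$ and $\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}$? Since $\frac{d}{dt}\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}*f(x) = \mathcal{F}^{-1}\cos(t|\xi|)*f(x)$, it really suffices to understand $\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}$. This was worked out in Example 36.51 when $n = 1$ where we found
\begin{align*}
\left(\mathcal{F}^{-1}\xi^{-1}\sin t\xi\right)(x) &= \frac{\pi}{\sqrt{2\pi}}\left(1_{x+t>0}-1_{(x-t)>0}\right) \\
&= \frac{\pi}{\sqrt{2\pi}}\left(1_{x>-t}-1_{x>t}\right) = \frac{\pi}{\sqrt{2\pi}}1_{[-t,t]}(x)
\end{align*}
where in writing the last line we have assumed that $t\geq0$. Therefore,
\[ \left(\mathcal{F}^{-1}\xi^{-1}\sin t\xi\right)*f(x) = \frac{1}{2}\int_{-t}^tf(x-y)\,dy. \]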
Therefore the solution to the one dimensional wave equation is
\begin{align*}
u(t,x) &= \frac{d}{dt}\frac{1}{2}\int_{-t}^tf(x-y)\,dy+\frac{1}{2}\int_{-t}^tg(x-y)\,dy \\
&= \frac{1}{2}\left(f(x-t)+f(x+t)\right)+\frac{1}{2}\int_{-t}^tg(x-y)\,dy \\
&= \frac{1}{2}\left(f(x-t)+f(x+t)\right)+\frac{1}{2}\int_{x-t}^{x+t}g(y)\,dy.
\end{align*}
We can arrive at this same solution by more elementary means as follows. We first note in the one dimensional case that the wave operator factors, namely
\[ 0 = \left(\partial_t^2-\partial_x^2\right)u(t,x) = (\partial_t-\partial_x)(\partial_t+\partial_x)u(t,x). \]
Let $U(t,x) := (\partial_t+\partial_x)u(t,x)$, then the wave equation states $(\partial_t-\partial_x)U = 0$ and hence by the chain rule $\frac{d}{dt}U(t,x-t) = 0$. So
\[ U(t,x-t) = U(0,x) = g(x)+f'(x) \]
and replacing $x$ by $x+t$ in this equation shows
\[ (\partial_t+\partial_x)u(t,x) = U(t,x) = g(x+t)+f'(x+t). \]
Working similarly, we learn that
\[ \frac{d}{dt}u(t,x+t) = g(x+2t)+f'(x+2t) \]
which upon integration implies
\begin{align*}
u(t,x+t) &= u(0,x)+\int_0^t\left\{g(x+2\tau)+f'(x+2\tau)\right\}d\tau \\
&= f(x)+\int_0^tg(x+2\tau)\,d\tau+\frac{1}{2}f(x+2\tau)\Big|_0^t \\
&= \frac{1}{2}\left(f(x)+f(x+2t)\right)+\int_0^tg(x+2\tau)\,d\tau.
\end{align*}
Replacing $x\to x-t$ in this equation then implies
\[ u(t,x) = \frac{1}{2}\left(f(x-t)+f(x+t)\right)+\int_0^tg(x-t+2\tau)\,d\tau. \]
Finally, letting $y = x-t+2\tau$ in the last integral gives
\[ u(t,x) = \frac{1}{2}\left(f(x-t)+f(x+t)\right)+\frac{1}{2}\int_{x-t}^{x+t}g(y)\,dy \]
as derived using the Fourier transform.
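The one dimensional solution formula can be checked against the initial conditions and the equation itself by finite differences. The sketch below assumes that an antiderivative `G` of the initial velocity $g$ is available in closed form, so that $\int_{x-t}^{x+t}g(y)\,dy = G(x+t)-G(x-t)$; the sample data $f(x) = e^{-x^2}$, $g = \cos$, $G = \sin$ and the function names are illustrative only.

```python
import math

def dalembert(f, G, t, x):
    """u(t, x) = (f(x - t) + f(x + t)) / 2 + (G(x + t) - G(x - t)) / 2,
    where G is an antiderivative of the initial velocity g."""
    return 0.5 * (f(x - t) + f(x + t)) + 0.5 * (G(x + t) - G(x - t))

def wave_residual(f, G, t, x, h=1e-3):
    """Second-order finite-difference approximation of u_tt - u_xx at (t, x)."""
    u = lambda s, y: dalembert(f, G, s, y)
    u_tt = (u(t + h, x) - 2.0 * u(t, x) + u(t - h, x)) / (h * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return u_tt - u_xx

def f(x):
    # Sample initial displacement (an assumption for illustration).
    return math.exp(-x * x)
```

With `G = math.sin` (so $g = \cos$), `dalembert(f, math.sin, 0.0, x)` returns `f(x)`, a centered difference in `t` at `t = 0` recovers `math.cos(x)`, and `wave_residual` is zero up to rounding.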
For the three dimensional case we have
\begin{align*}
u(t,x) &= \frac{d}{dt}\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}*f(x)+\mathcal{F}^{-1}\frac{\sin t|\xi|}{|\xi|}*g(x) \\
&= \frac{d}{dt}\left(t\bar\sigma_t*f(x)\right)+t\bar\sigma_t*g(x).
\end{align*}
The question is what is $\mu*g(x)$ where $\mu$ is a measure. To understand the definition, suppose first that $d\mu(x) = \rho(x)\,dx$, then we should have
\[ \mu*g(x) = \rho*g(x) = \int_{\mathbb{R}^n}g(x-y)\rho(y)\,dy = \int_{\mathbb{R}^n}g(x-y)\,d\mu(y). \]
Thus we expect our solution to the wave equation should be given by
\begin{align*}
u(t,x) &= \frac{d}{dt}\left\{t\int_{S_t}f(x-y)\,d\bar\sigma_t(y)\right\}+t\int_{S_t}g(x-y)\,d\bar\sigma_t(y) \\
&= \frac{d}{dt}\left\{t\int_{S_1}f(x-t\omega)\,d\omega\right\}+t\int_{S_1}g(x-t\omega)\,d\omega \\
&= \frac{d}{dt}\left\{t\int_{S_1}f(x+t\omega)\,d\omega\right\}+t\int_{S_1}g(x+t\omega)\,d\omega \tag{36.11}
\end{align*}
where $d\omega := d\bar\sigma_1(\omega)$. Notice the sharp propagation of speed. To understand this suppose that $f = 0$ for simplicity and $g$ has compact support near the origin, for example think of $g = \delta_0(x)$, then $x+t\omega = 0$ for some $\omega$ iff $|x| = t$. Hence the wave front propagates at unit speed in a sharp way. See figure below.
Fig. 36.2. The geometry of the solution to the wave equation in three dimensions.
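Eq. (36.11) with $f = 0$ says $u(t,x) = t\int_{S_1}g(x+t\omega)\,d\omega$, i.e. $t$ times the average of $g$ over the sphere of radius $t$ about $x$. The sketch below approximates that average with a midpoint rule in the azimuthal angle and in $w = \cos(\text{polar angle})$, for which the normalized surface measure is $d\theta\,dw/4\pi$; the grid sizes and names are illustrative assumptions. For $g(x) = x_1^2$ the exact solution is $u = t(x_1^2+t^2/3)$, which one checks satisfies $u_{tt} = \Delta u = 2t$.

```python
import math

def spherical_mean(g, x, t, n_theta=64, n_w=400):
    """Average of g over the sphere of radius t centred at x = (x1, x2, x3)."""
    d_theta = 2.0 * math.pi / n_theta
    d_w = 2.0 / n_w
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        c, s = math.cos(theta), math.sin(theta)
        for j in range(n_w):
            w = -1.0 + (j + 0.5) * d_w      # w = cos(polar angle)
            rad = math.sqrt(1.0 - w * w)    # sin(polar angle)
            total += g(x[0] + t * rad * c, x[1] + t * rad * s, x[2] + t * w)
    return total * d_theta * d_w / (4.0 * math.pi)

def kirchhoff(g, t, x):
    """u(t, x) = t * (spherical mean of g): Eq. (36.11) in the case f = 0."""
    return t * spherical_mean(g, x, t)
```

Only values of $g$ on the sphere $|y-x| = t$ enter, which is the sharp propagation noted in the text.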
We may also use this solution to solve the two dimensional wave equation using Hadamard's method of descent. Indeed, suppose now that $f$ and $g$ are functions on $\mathbb{R}^2$ which we may view as functions on $\mathbb{R}^3$ which do not depend on the third coordinate say. We now go ahead and solve the three dimensional wave equation using Eq. (36.11) and $f$ and $g$ as initial conditions. It is easily seen that the solution $u(t,x,y,z)$ is again independent of $z$ and hence is a solution to the two dimensional wave equation. See figure below.
Fig. 36.3. The geometry of the solution to the wave equation in two dimensions.
Notice that we still have finite speed of propagation but no longer sharp propagation. In fact we can work out the solution analytically as follows. Again for simplicity assume that $f\equiv0$. Then
\begin{align*}
u(t,x,y) &= \frac{t}{4\pi}\int_0^{2\pi}d\theta\int_0^\pi d\phi\,\sin\phi\,g((x,y)+t(\sin\phi\cos\theta,\sin\phi\sin\theta)) \\
&= \frac{t}{2\pi}\int_0^{2\pi}d\theta\int_0^{\pi/2}d\phi\,\sin\phi\,g((x,y)+t(\sin\phi\cos\theta,\sin\phi\sin\theta))
\end{align*}
and letting $u = \sin\phi$, so that $du = \cos\phi\,d\phi = \sqrt{1-u^2}\,d\phi$ we find
\[ u(t,x,y) = \frac{t}{2\pi}\int_0^{2\pi}d\theta\int_0^1\frac{du}{\sqrt{1-u^2}}\,u\,g((x,y)+ut(\cos\theta,\sin\theta)) \]
and then letting $r = ut$ we learn,
\begin{align*}
u(t,x,y) &= \frac{1}{2\pi}\int_0^{2\pi}d\theta\int_0^t\frac{dr}{\sqrt{1-r^2/t^2}}\frac{r}{t}\,g((x,y)+r(\cos\theta,\sin\theta)) \\
&= \frac{1}{2\pi}\int_0^{2\pi}d\theta\int_0^t\frac{dr}{\sqrt{t^2-r^2}}\,r\,g((x,y)+r(\cos\theta,\sin\theta)) \\
&= \frac{1}{2\pi}\iint_{D_t}\frac{g((x,y)+w)}{\sqrt{t^2-|w|^2}}\,dm(w).
\end{align*}
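The two dimensional formula is convenient to evaluate in the angular form $u = \frac{t}{2\pi}\int_0^{2\pi}d\theta\int_0^{\pi/2}d\phi\,\sin\phi\,g(\cdot)$ from the derivation above, since the substitution $r = t\sin\phi$ removes the integrable singularity of $1/\sqrt{t^2-r^2}$ at $r = t$. The sketch below uses a midpoint rule in both angles; grid sizes and names are illustrative assumptions. For $g(x,y) = x^2+y^2$ the exact solution with $u(0) = 0$, $u_t(0) = g$ is $u = t(x^2+y^2)+\frac{2}{3}t^3$, which satisfies $u_{tt} = \Delta u = 4t$.

```python
import math

def descent_solution(g, t, x, y, n_theta=64, n_phi=1000):
    """u(t, x, y) for initial data u = 0, u_t = g, computed as
    (t / (2*pi)) * int_0^{2*pi} int_0^{pi/2} sin(phi) * g((x, y) + t*sin(phi)*(cos, sin)) dphi dtheta."""
    d_theta = 2.0 * math.pi / n_theta
    d_phi = 0.5 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        c, s = math.cos(theta), math.sin(theta)
        for j in range(n_phi):
            sin_phi = math.sin((j + 0.5) * d_phi)
            rho = t * sin_phi               # radius r = t*sin(phi) inside the disc D_t
            total += sin_phi * g(x + rho * c, y + rho * s)
    return t / (2.0 * math.pi) * total * d_theta * d_phi
```

Unlike the three dimensional case, every point of the disc $D_t$ contributes, which is the loss of sharp propagation mentioned in the text.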
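Here is a better alternative derivation of this result. We begin by using symmetry to find
\[ u(t,x) = 2t\int_{S_t^+}g(x-y)\,d\bar\sigma_t(y) = 2t\int_{S_t^+}g(x+y)\,d\bar\sigma_t(y) \]
where $S_t^+$ is the portion of $S_t$ with $z\geq0$. This sphere is parametrized by $R(u,v) = (u,v,\sqrt{t^2-u^2-v^2})$ with $(u,v)\in D_t := \left\{(u,v) : u^2+v^2\leq t^2\right\}$. In these coordinates we have
\begin{align*}
4\pi t^2\,d\bar\sigma_t &= \left|\left(-\partial_u\sqrt{t^2-u^2-v^2},\ -\partial_v\sqrt{t^2-u^2-v^2},\ 1\right)\right|du\,dv \\
&= \left|\left(\frac{u}{\sqrt{t^2-u^2-v^2}},\ \frac{v}{\sqrt{t^2-u^2-v^2}},\ 1\right)\right|du\,dv \\
&= \sqrt{\frac{u^2+v^2}{t^2-u^2-v^2}+1}\,du\,dv = \frac{|t|}{\sqrt{t^2-u^2-v^2}}\,du\,dv
\end{align*}
and therefore,
\begin{align*}
u(t,x) &= \frac{2t}{4\pi t^2}\int_{D_t}g\left(x+(u,v,\sqrt{t^2-u^2-v^2})\right)\frac{|t|}{\sqrt{t^2-u^2-v^2}}\,du\,dv \\
&= \frac{1}{2\pi}\operatorname{sgn}(t)\int_{D_t}\frac{g(x+(u,v))}{\sqrt{t^2-u^2-v^2}}\,du\,dv.
\end{align*}
This may be written as
\[ u(t,x) = \frac{1}{2\pi}\operatorname{sgn}(t)\iint_{D_t}\frac{g((x,y)+w)}{\sqrt{t^2-|w|^2}}\,dm(w) \]
as before. (I should check on the $\operatorname{sgn}(t)$ term.)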
$C_c^\infty(U)$
Let $U$ be an open subset of $\mathbb{R}^n$ and
\[
C_c^\infty(U) = \bigcup_{K \sqsubset\sqsubset U} C^\infty(K) \tag{36.12}
\]
denote the set of smooth functions on $U$ with compact support in $U$. Our goal is to topologize $C_c^\infty(U)$ in a way which is compatible with the topologies defined in Example 36.24 above. This leads us to the inductive limit topology which we now pause to introduce.
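Definition 36.53 (Inductive Limit Topology). Let $X$ be a set, $X_\alpha \subset X$ for $\alpha \in A$ ($A$ is an index set) and assume that $\tau_\alpha \subset 2^{X_\alpha}$ is a topology on $X_\alpha$ for each $\alpha$. Let $i_\alpha : X_\alpha \to X$ denote the inclusion maps. The inductive limit topology on $X$ is the largest topology $\tau$ on $X$ such that $i_\alpha$ is continuous for all $\alpha \in A$. That is to say, $\tau = \bigcap_{\alpha \in A} i_{\alpha *}(\tau_\alpha)$, i.e. a set $U \subset X$ is open ($U \in \tau$) iff $i_\alpha^{-1}(U) = U \cap X_\alpha \in \tau_\alpha$ for all $\alpha \in A$.
Notice that $C \subset X$ is closed iff $C \cap X_\alpha$ is closed in $X_\alpha$ for all $\alpha$. Indeed, $C \subset X$ is closed iff $C^c = X \setminus C \subset X$ is open, iff $C^c \cap X_\alpha = X_\alpha \setminus C$ is open in $X_\alpha$ for all $\alpha \in A$, iff $X_\alpha \cap C = X_\alpha \setminus (X_\alpha \setminus C)$ is closed in $X_\alpha$ for all $\alpha \in A$.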
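Definition 36.54. Let $\mathcal{D}(U)$ denote $C_c^\infty(U)$ equipped with the inductive limit topology arising from writing $C_c^\infty(U)$ as in Eq. (36.12) and using the Fréchet topologies on $C^\infty(K)$ as defined in Example 36.24.
For each $K \sqsubset\sqsubset U$, $C^\infty(K)$ is a closed subset of $\mathcal{D}(U)$. Indeed if $F$ is another compact subset of $U$, then $C^\infty(K) \cap C^\infty(F) = C^\infty(K \cap F)$, which is a closed subset of $C^\infty(F)$. The set $\mathcal{U} \subset \mathcal{D}(U)$ defined by
\[
\mathcal{U} = \Big\{ \psi \in \mathcal{D}(U) : \sum_{|\alpha| \le m} \| \partial^\alpha(\psi - \varphi) \|_\infty < \varepsilon \Big\} \tag{36.13}
\]
for some $\varphi \in \mathcal{D}(U)$ and $\varepsilon > 0$ is an open subset of $\mathcal{D}(U)$. Indeed, if $K \sqsubset\sqsubset U$, then
\[
\mathcal{U} \cap C^\infty(K) = \Big\{ \psi \in C^\infty(K) : \sum_{|\alpha| \le m} \| \partial^\alpha(\psi - \varphi) \|_\infty < \varepsilon \Big\}
\]
is easily seen to be open in $C^\infty(K)$.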
Proposition 36.55. Let $(X, \tau)$ be as described in Definition 36.53 and $f : X \to Y$ be a function where $Y$ is another topological space. Then $f$ is continuous iff $f \circ i_\alpha : X_\alpha \to Y$ is continuous for all $\alpha \in A$.
Proof. Since the composition of continuous maps is continuous, it follows that $f \circ i_\alpha : X_\alpha \to Y$ is continuous for all $\alpha \in A$ if $f : X \to Y$ is continuous. Conversely, if $f \circ i_\alpha$ is continuous for all $\alpha \in A$, then for all $V \subset_o Y$ we have
\[
\tau_\alpha \ni (f \circ i_\alpha)^{-1}(V) = i_\alpha^{-1}(f^{-1}(V)) = f^{-1}(V) \cap X_\alpha \quad \text{for all } \alpha \in A,
\]
showing that $f^{-1}(V) \in \tau$.
Lemma 36.56. Let us continue the notation introduced in Definition 36.53. Suppose further that there exist $\alpha_k \in A$ such that $X'_k := X_{\alpha_k} \uparrow X$ as $k \to \infty$ and for each $\alpha \in A$ there exists a $k \in \mathbb{N}$ such that $X_\alpha \subset X'_k$ and the inclusion map is continuous. Then $\tau = \{ A \subset X : A \cap X'_k \subset_o X'_k \text{ for all } k \}$ and a function $f : X \to Y$ is continuous iff $f|_{X'_k} : X'_k \to Y$ is continuous for all $k$. In short, the inductive limit topologies on $X$ arising from the two collections of subsets $\{X_\alpha\}_{\alpha \in A}$ and $\{X'_k\}_{k \in \mathbb{N}}$ are the same.
Proof. Suppose that $A \subset X$. If $A \in \tau$ then $A \cap X'_k = A \cap X_{\alpha_k} \subset_o X'_k$ by definition. Now suppose that $A \cap X'_k \subset_o X'_k$ for all $k$. For $\alpha \in A$ choose $k$ such that $X_\alpha \subset X'_k$; then $A \cap X_\alpha = (A \cap X'_k) \cap X_\alpha \subset_o X_\alpha$, since $A \cap X'_k$ is open in $X'_k$ and, by the assumption that $X_\alpha$ is continuously embedded in $X'_k$, $V \cap X_\alpha \subset_o X_\alpha$ for all $V \subset_o X'_k$. The characterization of continuous functions is proved similarly.
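Let $K_k \sqsubset\sqsubset U$ for $k \in \mathbb{N}$ be such that $K_k^o \subset K_k \subset K_{k+1}^o \subset K_{k+1}$ for all $k$ and $K_k \uparrow U$ as $k \to \infty$. Then it follows that for any $K \sqsubset\sqsubset U$ there exists a $k$ such that $K \subset K_k^o \subset K_k$. One now checks that $C^\infty(K)$ embeds continuously into $C^\infty(K_k)$ and moreover, $C^\infty(K)$ is a closed subset of $C^\infty(K_{k+1})$. Therefore we may describe $\mathcal{D}(U)$ as $C_c^\infty(U)$ with the inductive limit topology coming from $\bigcup_{k \in \mathbb{N}} C^\infty(K_k)$.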
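Lemma 36.57. Suppose that $\{\varphi_k\}_{k=1}^\infty \subset \mathcal{D}(U)$, then $\varphi_k \to \varphi \in \mathcal{D}(U)$ iff $\varphi_k - \varphi \to 0 \in \mathcal{D}(U)$.
Proof. Let $\varphi \in \mathcal{D}(U)$ and $\mathcal{U} \subset \mathcal{D}(U)$ be a set. We will begin by showing that $\mathcal{U}$ is open in $\mathcal{D}(U)$ iff $\mathcal{U} - \varphi$ is open in $\mathcal{D}(U)$. To this end let $K_k$ be the compact sets described above and choose $k_0$ sufficiently large so that $\varphi \in C^\infty(K_k)$ for all $k \ge k_0$. Now $\mathcal{U} - \varphi \subset \mathcal{D}(U)$ is open iff $(\mathcal{U} - \varphi) \cap C^\infty(K_k)$ is open in $C^\infty(K_k)$ for all $k \ge k_0$. Because $\varphi \in C^\infty(K_k)$, we have $(\mathcal{U} - \varphi) \cap C^\infty(K_k) = \mathcal{U} \cap C^\infty(K_k) - \varphi$, which is open in $C^\infty(K_k)$ iff $\mathcal{U} \cap C^\infty(K_k)$ is open in $C^\infty(K_k)$. Since this is true for all $k \ge k_0$ we conclude that $\mathcal{U} - \varphi$ is an open subset of $\mathcal{D}(U)$ iff $\mathcal{U}$ is open in $\mathcal{D}(U)$. Now $\varphi_k \to \varphi$ in $\mathcal{D}(U)$ iff for all $\varphi \in \mathcal{U} \subset_o \mathcal{D}(U)$, $\varphi_k \in \mathcal{U}$ for almost all $k$, which happens iff $\varphi_k - \varphi \in \mathcal{U} - \varphi$ for almost all $k$. Since $\mathcal{U} - \varphi$ ranges over all open neighborhoods of $0$ when $\mathcal{U}$ ranges over the open neighborhoods of $\varphi$, the result follows.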
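Lemma 36.58. A sequence $\{\varphi_k\}_{k=1}^\infty \subset \mathcal{D}(U)$ converges to $\varphi \in \mathcal{D}(U)$ iff there is a compact set $K \sqsubset\sqsubset U$ such that $\mathrm{supp}(\varphi_k) \subset K$ for all $k$ and $\varphi_k \to \varphi$ in $C^\infty(K)$.
Proof. If $\varphi_k \to \varphi$ in $C^\infty(K)$, then for any open set $\mathcal{V} \subset \mathcal{D}(U)$ with $\varphi \in \mathcal{V}$ we have that $\mathcal{V} \cap C^\infty(K)$ is open in $C^\infty(K)$ and hence $\varphi_k \in \mathcal{V} \cap C^\infty(K) \subset \mathcal{V}$ for almost all $k$. This shows that $\varphi_k \to \varphi \in \mathcal{D}(U)$. For the converse, suppose that there exists $\{\varphi_k\}_{k=1}^\infty \subset \mathcal{D}(U)$ which converges to $\varphi \in \mathcal{D}(U)$ yet there is no compact set $K$ such that $\mathrm{supp}(\varphi_k) \subset K$ for all $k$. Using Lemma 36.57, we may replace $\varphi_k$ by $\varphi_k - \varphi$ if necessary so that we may assume $\varphi_k \to 0$ in $\mathcal{D}(U)$. By passing to subsequences of $\{\varphi_k\}$ and $\{K_k\}$ if necessary, we may also assume there exist $x_k \in K_{k+1} \setminus K_k$ such that $\varphi_k(x_k) \ne 0$ for all $k$. Let $p$ denote the semi-norm on $C_c^\infty(U)$ defined by
\[
p(\varphi) = \sum_{k=0}^\infty \sup\Big\{ \frac{|\varphi(x)|}{|\varphi_k(x_k)|} : x \in K_{k+1} \setminus K_k^o \Big\}.
\]
One then checks that
\[
p(\varphi) \le \Big( \sum_{k=0}^N \frac{1}{|\varphi_k(x_k)|} \Big) \|\varphi\|_\infty
\]
for $\varphi \in C^\infty(K_{N+1})$. This shows that $p|_{C^\infty(K_{N+1})}$ is continuous for all $N$ and hence $p$ is continuous on $\mathcal{D}(U)$. Since $p$ is continuous on $\mathcal{D}(U)$ and $\varphi_k \to 0$ in $\mathcal{D}(U)$, it follows that $\lim_{k\to\infty} p(\varphi_k) = p(\lim_{k\to\infty} \varphi_k) = p(0) = 0$. On the other hand, $p(\varphi_k) \ge 1$ by construction and hence we have arrived at a contradiction. Thus for any convergent sequence $\{\varphi_k\}_{k=1}^\infty \subset \mathcal{D}(U)$ there is a compact set $K \sqsubset\sqsubset U$ such that $\mathrm{supp}(\varphi_k) \subset K$ for all $k$. We will now show that $\{\varphi_k\}_{k=1}^\infty$ is convergent to $\varphi$ in $C^\infty(K)$. To this end let $\mathcal{U} \subset \mathcal{D}(U)$ be the open set described in Eq. (36.13); then $\varphi_k \in \mathcal{U}$ for almost all $k$ and in particular, $\varphi_k \in \mathcal{U} \cap C^\infty(K)$ for almost all $k$. (Letting $\varepsilon > 0$ tend to zero shows that $\mathrm{supp}(\varphi) \subset K$, i.e. $\varphi \in C^\infty(K)$.) Since sets of the form $\mathcal{U} \cap C^\infty(K)$ with $\mathcal{U}$ as in Eq. (36.13) form a neighborhood base for $C^\infty(K)$ at $\varphi$, we conclude that $\varphi_k \to \varphi$ in $C^\infty(K)$.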
Definition 36.59 (Distributions on $U \subset_o \mathbb{R}^n$). A generalized function on $U \subset_o \mathbb{R}^n$ is a continuous linear functional on $\mathcal{D}(U)$. We denote the space of generalized functions by $\mathcal{D}'(U)$.
Proposition 36.60. Let $f : \mathcal{D}(U) \to \mathbb{C}$ be a linear functional. Then the following are equivalent.
1. $f$ is continuous, i.e. $f \in \mathcal{D}'(U)$.
2. For all $K \sqsubset\sqsubset U$, there exist $n \in \mathbb{N}$ and $C < \infty$ such that
\[
|f(\varphi)| \le C p_n(\varphi) \text{ for all } \varphi \in C^\infty(K). \tag{36.14}
\]
3. For all sequences $\{\varphi_k\} \subset \mathcal{D}(U)$ such that $\varphi_k \to 0$ in $\mathcal{D}(U)$, $\lim_{k\to\infty} f(\varphi_k) = 0$.
Proof. 1) $\iff$ 2). If $f$ is continuous, then by definition of the inductive limit topology $f|_{C^\infty(K)}$ is continuous. Hence an estimate of the type in Eq. (36.14) must hold. Conversely if estimates of the type in Eq. (36.14) hold for all compact sets $K$, then $f|_{C^\infty(K)}$ is continuous for all $K \sqsubset\sqsubset U$ and again by the definition of the inductive limit topology, $f$ is continuous on $\mathcal{D}(U)$.
1) $\iff$ 3). By Lemma 36.58, the assertion in item 3. is equivalent to saying that $f|_{C^\infty(K)}$ is sequentially continuous for all $K \sqsubset\sqsubset U$. Since the topology on $C^\infty(K)$ is first countable (being a metric topology), sequential continuity and continuity are the same thing. Hence item 3. is equivalent to the assertion that $f|_{C^\infty(K)}$ is continuous for all $K \sqsubset\sqsubset U$, which is equivalent to the assertion that $f$ is continuous on $\mathcal{D}(U)$.
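Proposition 36.61. The maps $(\lambda, \varphi) \in \mathbb{C} \times \mathcal{D}(U) \to \lambda\varphi \in \mathcal{D}(U)$ and $(\varphi, \psi) \in \mathcal{D}(U) \times \mathcal{D}(U) \to \varphi + \psi \in \mathcal{D}(U)$ are continuous. (Actually, I will have to look up how to decide this.) What is obvious is that all of these operations are sequentially continuous, which is enough for our purposes.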
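37 Convolutions involving distributions
37.1 Tensor Product of Distributions
Let $X \subset_o \mathbb{R}^n$ and $Y \subset_o \mathbb{R}^m$ and $S \in \mathcal{D}'(X)$ and $T \in \mathcal{D}'(Y)$. We wish to define $S \otimes T \in \mathcal{D}'(X \times Y)$. Informally, we should have
\[
\begin{aligned}
\langle S \otimes T, \varphi \rangle &= \int_{X \times Y} S(x) T(y) \varphi(x,y)\, dx\, dy \\
&= \int_X dx\, S(x) \int_Y dy\, T(y) \varphi(x,y) = \int_Y dy\, T(y) \int_X dx\, S(x) \varphi(x,y).
\end{aligned}
\]
Of course we should interpret this last equation as follows,
\[
\langle S \otimes T, \varphi \rangle = \langle S(x), \langle T(y), \varphi(x,y) \rangle \rangle = \langle T(y), \langle S(x), \varphi(x,y) \rangle \rangle. \tag{37.1}
\]
This formula takes on a particularly simple form when $\varphi = u \otimes v$ with $u \in \mathcal{D}(X)$ and $v \in \mathcal{D}(Y)$, in which case
\[
\langle S \otimes T, u \otimes v \rangle = \langle S, u \rangle \langle T, v \rangle. \tag{37.2}
\]
We begin with the following smooth version of the Weierstrass approximation theorem which will be used to show Eq. (37.2) uniquely determines $S \otimes T$.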
Theorem 37.1 (Density Theorem). Suppose that $X \subset_o \mathbb{R}^n$ and $Y \subset_o \mathbb{R}^m$, then $\mathcal{D}(X) \otimes \mathcal{D}(Y)$ is dense in $\mathcal{D}(X \times Y)$.
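Proof. First let us consider the special case where $X = (0,1)^n$ and $Y = (0,1)^m$ so that $X \times Y = (0,1)^{m+n}$. To simplify notation, let $m + n = k$ and $\Omega = (0,1)^k$ and $\pi_i : \Omega \to (0,1)$ be projection onto the $i$-th factor of $\Omega$. Suppose that $\varphi \in C_c^\infty(\Omega)$ and $K = \mathrm{supp}(\varphi)$. We will view $\varphi \in C_c^\infty(\mathbb{R}^k)$ by setting $\varphi = 0$ outside of $\Omega$. Since $K$ is compact, $\pi_i(K) \subset [a_i, b_i]$ for some $0 < a_i < b_i < 1$. Let $a = \min\{a_i : i = 1, \dots, k\}$ and $b = \max\{b_i : i = 1, \dots, k\}$. Then $\mathrm{supp}(\varphi) = K \subset [a,b]^k$. As in the proof of the Weierstrass approximation theorem, let $q_n(t) = c_n (1 - t^2)^n 1_{|t| \le 1}$ where $c_n$ is chosen so that $\int_{\mathbb{R}} q_n(t)\, dt = 1$. Also set $Q_n = q_n \otimes \cdots \otimes q_n$, i.e. $Q_n(x) = \prod_{i=1}^k q_n(x_i)$ for $x \in \mathbb{R}^k$. Let
\[
f_n(x) := Q_n * \varphi(x) = c_n^k \int_{\mathbb{R}^k} \varphi(y) \prod_{i=1}^k (1 - (x_i - y_i)^2)^n 1_{|x_i - y_i| \le 1}\, dy_i. \tag{37.3}
\]
By standard arguments, we know that $\partial^\alpha f_n \to \partial^\alpha \varphi$ uniformly on $\mathbb{R}^k$ as $n \to \infty$. Moreover for $x \in \Omega$, it follows from Eq. (37.3) that
\[
f_n(x) = c_n^k \int_{\Omega} \varphi(y) \prod_{i=1}^k (1 - (x_i - y_i)^2)^n\, dy_i = p_n(x)
\]
where $p_n(x)$ is a polynomial in $x$. Notice that $p_n \in C^\infty((0,1)) \otimes \cdots \otimes C^\infty((0,1))$ so that we are almost there [1]. We need only cut off these functions so that they have compact support. To this end, let $\theta \in C_c^\infty((0,1))$ be a function such that $\theta = 1$ on a neighborhood of $[a,b]$ and define
\[
\varphi_n = (\theta \otimes \cdots \otimes \theta) f_n = (\theta \otimes \cdots \otimes \theta) p_n \in C_c^\infty((0,1)) \otimes \cdots \otimes C_c^\infty((0,1)).
\]
I claim now that $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$. Certainly by construction $\mathrm{supp}(\varphi_n) \subset \mathrm{supp}(\theta)^k \sqsubset\sqsubset \Omega$ for all $n$. Also
\[
\partial^\alpha \varphi_n = \partial^\alpha\big( (\theta \otimes \cdots \otimes \theta) f_n \big) = (\theta \otimes \cdots \otimes \theta)(\partial^\alpha f_n) + R_n \tag{37.4}
\]
where $R_n$ is a sum of terms of the form $\partial^\beta(\theta \otimes \cdots \otimes \theta) \cdot \partial^\gamma f_n$ with $\beta \ne 0$. Since $\partial^\beta(\theta \otimes \cdots \otimes \theta) = 0$ on $[a,b]^k$ and $\partial^\gamma f_n$ converges uniformly to zero on $\mathbb{R}^k \setminus [a,b]^k$, it follows that $R_n \to 0$ uniformly as $n \to \infty$. Combining this with Eq. (37.4) and the fact that $\partial^\alpha f_n \to \partial^\alpha \varphi$ uniformly on $\mathbb{R}^k$ as $n \to \infty$, we see that $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$. This finishes the proof in the case $X = (0,1)^n$ and $Y = (0,1)^m$.
For the general case, let $K = \mathrm{supp}(\varphi) \sqsubset\sqsubset X \times Y$ and $K_1 = \pi_1(K) \sqsubset\sqsubset X$ and $K_2 = \pi_2(K) \sqsubset\sqsubset Y$ where $\pi_1$ and $\pi_2$ are the projections from $X \times Y$ to $X$ and $Y$ respectively. Then $K \subset K_1 \times K_2 \sqsubset\sqsubset X \times Y$. Let $\{V_i\}_{i=1}^a$ and $\{U_j\}_{j=1}^b$ be finite covers of $K_1$ and $K_2$ respectively by open sets $V_i = (a_i, b_i)$ and $U_j = (c_j, d_j)$ with $a_i, b_i \in X$ and $c_j, d_j \in Y$. Also let $\alpha_i \in C_c^\infty(V_i)$ for $i = 1, \dots, a$ and $\beta_j \in C_c^\infty(U_j)$ for $j = 1, \dots, b$ be functions such that $\sum_{i=1}^a \alpha_i = 1$ on a neighborhood of $K_1$ and $\sum_{j=1}^b \beta_j = 1$ on a neighborhood of $K_2$. Then $\varphi = \sum_{i=1}^a \sum_{j=1}^b (\alpha_i \otimes \beta_j)\varphi$ and by what we have just proved (after scaling and translating) each term in this sum, $(\alpha_i \otimes \beta_j)\varphi$, may be written as a limit of elements in $\mathcal{D}(X) \otimes \mathcal{D}(Y)$ in the $\mathcal{D}(X \times Y)$ topology.
[1] One could also construct $f_n \in C^\infty(\mathbb{R})^{\otimes k}$ such that $\partial^\alpha f_n \to \partial^\alpha \varphi$ uniformly as $n \to \infty$ using Fourier series. To this end, let $\tilde{\varphi}$ be the 1-periodic extension of $\varphi$ to $\mathbb{R}^k$. Then $\tilde{\varphi} \in C^\infty_{\mathrm{periodic}}(\mathbb{R}^k)$ and hence it may be written as $\tilde{\varphi}(x) = \sum_{m \in \mathbb{Z}^k} c_m e^{i 2\pi m \cdot x}$ where the $\{c_m : m \in \mathbb{Z}^k\}$ are the Fourier coefficients of $\tilde{\varphi}$, which decay faster than $(1 + |m|)^{-l}$ for any $l > 0$. Thus $f_n(x) := \sum_{m \in \mathbb{Z}^k : |m| \le n} c_m e^{i 2\pi m \cdot x} \in C^\infty(\mathbb{R})^{\otimes k}$ and $\partial^\alpha f_n \to \partial^\alpha \varphi$ uniformly on $\Omega$ as $n \to \infty$.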
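Theorem 37.2 (Distribution-Fubini-Theorem). Let $S \in \mathcal{D}'(X)$, $T \in \mathcal{D}'(Y)$, $\varphi \in \mathcal{D}(X \times Y)$, $h(x) := \langle T(y), \varphi(x,y) \rangle$ and $g(y) := \langle S(x), \varphi(x,y) \rangle$. Then $h = h_\varphi \in \mathcal{D}(X)$, $g = g_\varphi \in \mathcal{D}(Y)$, $\partial^\alpha h(x) = \langle T(y), \partial_x^\alpha \varphi(x,y) \rangle$ and $\partial^\beta g(y) = \langle S(x), \partial_y^\beta \varphi(x,y) \rangle$ for all multi-indices $\alpha$ and $\beta$. Moreover
\[
\langle S(x), \langle T(y), \varphi(x,y) \rangle \rangle = \langle S, h \rangle = \langle T, g \rangle = \langle T(y), \langle S(x), \varphi(x,y) \rangle \rangle. \tag{37.5}
\]
We denote this common value by $\langle S \otimes T, \varphi \rangle$ and call $S \otimes T$ the tensor product of $S$ and $T$. This distribution is uniquely determined by its values on $\mathcal{D}(X) \otimes \mathcal{D}(Y)$ and for $u \in \mathcal{D}(X)$ and $v \in \mathcal{D}(Y)$ we have
\[
\langle S \otimes T, u \otimes v \rangle = \langle S, u \rangle \langle T, v \rangle.
\]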
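Proof. Let $K = \mathrm{supp}(\varphi) \sqsubset\sqsubset X \times Y$ and $K_1 = \pi_1(K)$ and $K_2 = \pi_2(K)$. Then $K_1 \sqsubset\sqsubset X$ and $K_2 \sqsubset\sqsubset Y$ and $K \subset K_1 \times K_2 \subset X \times Y$. If $x \in X$ and $y \notin K_2$, then $\varphi(x,y) = 0$ and more generally $\partial_x^\alpha \varphi(x,y) = 0$ so that $\{ y : \partial_x^\alpha \varphi(x,y) \ne 0 \} \subset K_2$. Thus for all $x \in X$, $\mathrm{supp}(\partial_x^\alpha \varphi(x, \cdot)) \subset K_2 \subset Y$. By the fundamental theorem of calculus,
\[
\partial_y^\beta \varphi(x+v, y) - \partial_y^\beta \varphi(x, y) = \int_0^1 \partial_v^x \partial_y^\beta \varphi(x + \tau v, y)\, d\tau \tag{37.6}
\]
and therefore
\[
\big\| \partial_y^\beta \varphi(x+v, \cdot) - \partial_y^\beta \varphi(x, \cdot) \big\|_\infty \le |v| \int_0^1 \big\| \nabla_x \partial_y^\beta \varphi(x + \tau v, \cdot) \big\|_\infty d\tau \le |v| \big\| \nabla_x \partial_y^\beta \varphi \big\|_\infty \to 0 \text{ as } v \to 0.
\]
This shows that $x \in X \to \varphi(x, \cdot) \in \mathcal{D}(Y)$ is continuous. Thus $h$ is continuous, being the composition of continuous functions. Letting $v = t e_i$ in Eq. (37.6) we find
\[
\frac{\partial_y^\beta \varphi(x + t e_i, y) - \partial_y^\beta \varphi(x, y)}{t} - \frac{\partial}{\partial x_i} \partial_y^\beta \varphi(x,y)
= \int_0^1 \Big[ \frac{\partial}{\partial x_i} \partial_y^\beta \varphi(x + \tau t e_i, y) - \frac{\partial}{\partial x_i} \partial_y^\beta \varphi(x, y) \Big] d\tau
\]
and hence
\[
\Big\| \frac{\partial_y^\beta \varphi(x + t e_i, \cdot) - \partial_y^\beta \varphi(x, \cdot)}{t} - \frac{\partial}{\partial x_i} \partial_y^\beta \varphi(x, \cdot) \Big\|_\infty
\le \int_0^1 \Big\| \frac{\partial}{\partial x_i} \partial_y^\beta \varphi(x + \tau t e_i, \cdot) - \frac{\partial}{\partial x_i} \partial_y^\beta \varphi(x, \cdot) \Big\|_\infty d\tau
\]
which tends to zero as $t \to 0$. Thus we have checked that
\[
\frac{\partial}{\partial x_i} \varphi(x, \cdot) = \mathcal{D}(Y)\text{-}\lim_{t \to 0} \frac{\varphi(x + t e_i, \cdot) - \varphi(x, \cdot)}{t}
\]
and therefore,
\[
\frac{h(x + t e_i) - h(x)}{t} = \Big\langle T, \frac{\varphi(x + t e_i, \cdot) - \varphi(x, \cdot)}{t} \Big\rangle \to \Big\langle T, \frac{\partial}{\partial x_i} \varphi(x, \cdot) \Big\rangle
\]
as $t \to 0$, showing $\partial_i h(x)$ exists and is given by $\langle T, \frac{\partial}{\partial x_i} \varphi(x, \cdot) \rangle$. By what we have proved above, it follows that $\partial_i h(x) = \langle T, \frac{\partial}{\partial x_i} \varphi(x, \cdot) \rangle$ is continuous in $x$. By induction on $|\alpha|$, it follows that $\partial^\alpha h(x)$ exists and is continuous and $\partial^\alpha h(x) = \langle T(y), \partial_x^\alpha \varphi(x,y) \rangle$ for all $\alpha$. Now if $x \notin K_1$, then $\varphi(x, \cdot) \equiv 0$, showing that $\{ x \in X : h(x) \ne 0 \} \subset K_1$ and hence $\mathrm{supp}(h) \subset K_1 \sqsubset\sqsubset X$. Thus $h$ has compact support. This proves all of the assertions made about $h$. The assertions pertaining to the function $g$ are proved analogously. Let $\langle \Gamma, \varphi \rangle = \langle S(x), \langle T(y), \varphi(x,y) \rangle \rangle = \langle S, h_\varphi \rangle$ for $\varphi \in \mathcal{D}(X \times Y)$. Then $\Gamma$ is clearly linear and we have
\[
|\langle \Gamma, \varphi \rangle| = |\langle S, h_\varphi \rangle| \le C \sum_{|\alpha| \le m} \| \partial_x^\alpha h_\varphi \|_{\infty, K_1} = C \sum_{|\alpha| \le m} \| \langle T(y), \partial_x^\alpha \varphi(\cdot, y) \rangle \|_{\infty, K_1}
\]
which combined with the estimate
\[
|\langle T(y), \partial_x^\alpha \varphi(x,y) \rangle| \le C \sum_{|\beta| \le p} \big\| \partial_y^\beta \partial_x^\alpha \varphi(x, y) \big\|_{\infty, K_2}
\]
shows
\[
|\langle \Gamma, \varphi \rangle| \le C \sum_{|\alpha| \le m} \sum_{|\beta| \le p} \big\| \partial_y^\beta \partial_x^\alpha \varphi(x,y) \big\|_{\infty, K_1 \times K_2}.
\]
So $\Gamma$ is continuous, i.e. $\Gamma \in \mathcal{D}'(X \times Y)$, i.e.
\[
\varphi \in \mathcal{D}(X \times Y) \to \langle S(x), \langle T(y), \varphi(x,y) \rangle \rangle
\]
defines a distribution. Similarly,
\[
\varphi \in \mathcal{D}(X \times Y) \to \langle T(y), \langle S(x), \varphi(x,y) \rangle \rangle
\]
also defines a distribution and since both of these distributions agree on the dense subspace $\mathcal{D}(X) \otimes \mathcal{D}(Y)$, it follows they are equal.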
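Theorem 37.3. If $(T, \varphi)$ is a distribution test function pair satisfying one of the following three conditions
1. $T \in \mathcal{E}'(\mathbb{R}^n)$ and $\varphi \in C^\infty(\mathbb{R}^n)$,
2. $T \in \mathcal{D}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{D}(\mathbb{R}^n)$, or
3. $T \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$,
let
\[
T * \varphi(x) = \text{``}\int T(y) \varphi(x - y)\, dy\text{''} = \langle T, \varphi(x - \cdot) \rangle. \tag{37.7}
\]
Then $T * \varphi \in C^\infty(\mathbb{R}^n)$, $\partial^\alpha (T * \varphi) = (\partial^\alpha T) * \varphi = T * (\partial^\alpha \varphi)$ for all $\alpha$ and $\mathrm{supp}(T * \varphi) \subset \mathrm{supp}(T) + \mathrm{supp}(\varphi)$. Moreover if (3) holds then $T * \varphi \in \mathcal{P}$, the space of smooth functions with slow decrease.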
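Proof. I will supply the proof for case (3) since the other cases are similar and easier. Let $h(x) := T * \varphi(x)$. Since $T \in \mathcal{S}'(\mathbb{R}^n)$, there exist $m \in \mathbb{N}$ and $C < \infty$ such that $|\langle T, \varphi \rangle| \le C p_m(\varphi)$ for all $\varphi \in \mathcal{S}$, where $p_m$ is defined in Example 36.28. Therefore,
\[
|h(x) - h(y)| = |\langle T, \varphi(x - \cdot) - \varphi(y - \cdot) \rangle| \le C p_m(\varphi(x - \cdot) - \varphi(y - \cdot))
= C \sum_{|\alpha| \le m} \big\| \nu_m \big( \partial^\alpha \varphi(x - \cdot) - \partial^\alpha \varphi(y - \cdot) \big) \big\|_\infty.
\]
Let $\psi := \partial^\alpha \varphi$, then
\[
\psi(x - z) - \psi(y - z) = \int_0^1 \nabla\psi(y + \tau(x - y) - z) \cdot (x - y)\, d\tau \tag{37.8}
\]
and hence
\[
|\psi(x - z) - \psi(y - z)| \le |x - y| \int_0^1 |\nabla\psi(y + \tau(x - y) - z)|\, d\tau \le C |x - y| \int_0^1 \nu_{-M}(y + \tau(x - y) - z)\, d\tau
\]
for any $M < \infty$. By Peetre's inequality,
\[
\nu_{-M}(y + \tau(x - y) - z) \le \nu_{-M}(z)\, \nu_M(y + \tau(x - y))
\]
so that
\[
|\partial^\alpha \varphi(x - z) - \partial^\alpha \varphi(y - z)| \le C |x - y|\, \nu_{-M}(z) \int_0^1 \nu_M(y + \tau(x - y))\, d\tau \le C(x,y) |x - y|\, \nu_{-M}(z) \tag{37.9}
\]
where $C(x,y)$ is a continuous function of $(x,y)$. Putting all of this together we see that
\[
|h(x) - h(y)| \le \tilde{C}(x,y) |x - y| \to 0 \text{ as } x \to y,
\]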
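showing $h$ is continuous. Let us now compute a partial derivative of $h$. Suppose that $v \in \mathbb{R}^n$ is a fixed vector, then by Eq. (37.8),
\[
\frac{\varphi(x + tv - z) - \varphi(x - z)}{t} - \partial_v \varphi(x - z)
= \int_0^1 \nabla\varphi(x + \tau t v - z) \cdot v\, d\tau - \partial_v \varphi(x - z)
= \int_0^1 \big[ \partial_v \varphi(x + \tau t v - z) - \partial_v \varphi(x - z) \big] d\tau.
\]
This then implies
\[
\Big| \partial_z^\alpha \Big\{ \frac{\varphi(x + tv - z) - \varphi(x - z)}{t} - \partial_v \varphi(x - z) \Big\} \Big|
= \Big| \int_0^1 \partial_z^\alpha \big[ \partial_v \varphi(x + \tau t v - z) - \partial_v \varphi(x - z) \big] d\tau \Big|
\le \int_0^1 \big| \partial_z^\alpha \big[ \partial_v \varphi(x + \tau t v - z) - \partial_v \varphi(x - z) \big] \big|\, d\tau.
\]
But by the same argument as above, it follows that
\[
\big| \partial_z^\alpha \big[ \partial_v \varphi(x + \tau t v - z) - \partial_v \varphi(x - z) \big] \big| \le C(x + \tau t v, x)\, |\tau t v|\, \nu_{-M}(z)
\]
and thus
\[
\Big| \partial_z^\alpha \Big\{ \frac{\varphi(x + tv - z) - \varphi(x - z)}{t} - \partial_v \varphi(x - z) \Big\} \Big|
\le |t|\, |v|\, \nu_{-M}(z) \int_0^1 C(x + \tau t v, x)\, \tau\, d\tau.
\]
Putting this all together shows
\[
\Big\| \nu_M\, \partial_z^\alpha \Big\{ \frac{\varphi(x + tv - z) - \varphi(x - z)}{t} - \partial_v \varphi(x - z) \Big\} \Big\|_\infty = O(t) \to 0 \text{ as } t \to 0.
\]
That is to say $\frac{\varphi(x + tv - \cdot) - \varphi(x - \cdot)}{t} \to \partial_v \varphi(x - \cdot)$ in $\mathcal{S}$ as $t \to 0$. Hence since $T$ is continuous on $\mathcal{S}$, we learn
\[
\partial_v (T * \varphi)(x) = \partial_v \langle T, \varphi(x - \cdot) \rangle = \lim_{t \to 0} \Big\langle T, \frac{\varphi(x + tv - \cdot) - \varphi(x - \cdot)}{t} \Big\rangle
= \langle T, \partial_v \varphi(x - \cdot) \rangle = T * \partial_v \varphi(x).
\]
By the first part of the proof, we know that $\partial_v (T * \varphi)$ is continuous and hence by induction it now follows that $T * \varphi$ is $C^\infty$ and $\partial^\alpha (T * \varphi) = T * \partial^\alpha \varphi$. Since
\[
T * \partial^\alpha \varphi(x) = \langle T(z), (\partial^\alpha \varphi)(x - z) \rangle = (-1)^{|\alpha|} \langle T(z), \partial_z^\alpha \varphi(x - z) \rangle
= \langle \partial_z^\alpha T(z), \varphi(x - z) \rangle = (\partial^\alpha T) * \varphi(x)
\]
the proof is complete except for showing $T * \varphi \in \mathcal{P}$. For the last statement, it suffices to prove $|T * \varphi(x)| \le C \nu_M(x)$ for some $C < \infty$ and $M < \infty$. This goes as follows:
\[
|h(x)| = |\langle T, \varphi(x - \cdot) \rangle| \le C p_m(\varphi(x - \cdot)) = C \sum_{|\alpha| \le m} \| \nu_m\, \partial^\alpha \varphi(x - \cdot) \|_\infty
\]
and using Peetre's inequality, $|\partial^\alpha \varphi(x - z)| \le C \nu_{-m}(x - z) \le C \nu_{-m}(z) \nu_m(x)$ so that
\[
\| \nu_m\, \partial^\alpha \varphi(x - \cdot) \|_\infty \le C \nu_m(x).
\]
Thus it follows that $|T * \varphi(x)| \le C \nu_m(x)$ for some $C < \infty$. If $x \in \mathbb{R}^n \setminus (\mathrm{supp}(T) + \mathrm{supp}(\varphi))$ and $y \in \mathrm{supp}(\varphi)$ then $x - y \notin \mathrm{supp}(T)$, for otherwise $x = x - y + y \in \mathrm{supp}(T) + \mathrm{supp}(\varphi)$. Thus
\[
\mathrm{supp}(\varphi(x - \cdot)) = x - \mathrm{supp}(\varphi) \subset \mathbb{R}^n \setminus \mathrm{supp}(T)
\]
and hence $h(x) = \langle T, \varphi(x - \cdot) \rangle = 0$ for all $x \in \mathbb{R}^n \setminus (\mathrm{supp}(T) + \mathrm{supp}(\varphi))$. This implies that $\{h \ne 0\} \subset \mathrm{supp}(T) + \mathrm{supp}(\varphi)$ and hence
\[
\mathrm{supp}(h) = \overline{\{h \ne 0\}} \subset \mathrm{supp}(T) + \mathrm{supp}(\varphi).
\]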
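As we have seen in the previous theorem, $T * \varphi$ is a smooth function and hence may be used to define a distribution in $\mathcal{D}'(\mathbb{R}^n)$ by
\[
\langle T * \varphi, \psi \rangle = \int T * \varphi(x) \psi(x)\, dx = \int \langle T, \varphi(x - \cdot) \rangle \psi(x)\, dx.
\]
Using the linearity of $T$ we might expect that
\[
\int \langle T, \varphi(x - \cdot) \rangle \psi(x)\, dx = \Big\langle T, \int \varphi(x - \cdot) \psi(x)\, dx \Big\rangle
\]
or equivalently that
\[
\langle T * \varphi, \psi \rangle = \langle T, \tilde{\varphi} * \psi \rangle \tag{37.10}
\]
where $\tilde{\varphi}(x) := \varphi(-x)$.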
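Theorem 37.4. Suppose that $(T, \varphi)$ is a distribution test function pair satisfying one of the three conditions in Theorem 37.3. Then $T * \varphi$ as a distribution may be characterized by
\[
\langle T * \varphi, \psi \rangle = \langle T, \tilde{\varphi} * \psi \rangle \tag{37.11}
\]
for all $\psi \in \mathcal{D}(\mathbb{R}^n)$. Moreover, if $T \in \mathcal{S}'$ and $\varphi \in \mathcal{S}$ then Eq. (37.11) holds for all $\psi \in \mathcal{S}$.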
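Proof. Let us first assume that $T \in \mathcal{D}'$ and $\varphi, \psi \in \mathcal{D}$, and let $\theta \in \mathcal{D}$ be a function such that $\theta = 1$ on a neighborhood of the support of $\psi$. Then
\[
\begin{aligned}
\langle T * \varphi, \psi \rangle &= \int_{\mathbb{R}^n} \langle T, \varphi(x - \cdot) \rangle \psi(x)\, dx = \langle \psi(x), \langle T(y), \varphi(x - y) \rangle \rangle \\
&= \langle \theta(x) \psi(x), \langle T(y), \varphi(x - y) \rangle \rangle \\
&= \langle \psi(x), \theta(x) \langle T(y), \varphi(x - y) \rangle \rangle \\
&= \langle \psi(x), \langle T(y), \theta(x) \varphi(x - y) \rangle \rangle.
\end{aligned}
\]
Now the function $\theta(x) \varphi(x - y)$ is in $\mathcal{D}(\mathbb{R}^n \times \mathbb{R}^n)$, so we may apply Fubini's theorem for distributions to conclude that
\[
\begin{aligned}
\langle T * \varphi, \psi \rangle &= \langle \psi(x), \langle T(y), \theta(x) \varphi(x - y) \rangle \rangle \\
&= \langle T(y), \langle \psi(x), \theta(x) \varphi(x - y) \rangle \rangle \\
&= \langle T(y), \langle \theta(x) \psi(x), \varphi(x - y) \rangle \rangle \\
&= \langle T(y), \langle \psi(x), \varphi(x - y) \rangle \rangle \\
&= \langle T(y), \tilde{\varphi} * \psi(y) \rangle = \langle T, \tilde{\varphi} * \psi \rangle
\end{aligned}
\]
as claimed. If $T \in \mathcal{E}'$, let $\alpha \in \mathcal{D}(\mathbb{R}^n)$ be a function such that $\alpha = 1$ on a neighborhood of $\mathrm{supp}(T)$; then working as above,
\[
\langle T * \varphi, \psi \rangle = \langle \psi(x), \langle T(y), \theta(x) \varphi(x - y) \rangle \rangle = \langle \psi(x), \langle T(y), \alpha(y) \theta(x) \varphi(x - y) \rangle \rangle
\]
and since $\alpha(y) \theta(x) \varphi(x - y) \in \mathcal{D}(\mathbb{R}^n \times \mathbb{R}^n)$ we may apply Fubini's theorem for distributions to conclude again that
\[
\begin{aligned}
\langle T * \varphi, \psi \rangle &= \langle T(y), \langle \psi(x), \alpha(y) \theta(x) \varphi(x - y) \rangle \rangle \\
&= \langle \alpha(y) T(y), \langle \theta(x) \psi(x), \varphi(x - y) \rangle \rangle \\
&= \langle T(y), \langle \psi(x), \varphi(x - y) \rangle \rangle = \langle T, \tilde{\varphi} * \psi \rangle.
\end{aligned}
\]
Now suppose that $T \in \mathcal{S}'$ and $\varphi, \psi \in \mathcal{S}$. Let $\varphi_n, \psi_n \in \mathcal{D}$ be sequences such that $\varphi_n \to \varphi$ and $\psi_n \to \psi$ in $\mathcal{S}$; then using arguments similar to those in the proof of Theorem 37.3, one shows
\[
\langle T * \varphi, \psi \rangle = \lim_{n \to \infty} \langle T * \varphi_n, \psi_n \rangle = \lim_{n \to \infty} \langle T, \tilde{\varphi}_n * \psi_n \rangle = \langle T, \tilde{\varphi} * \psi \rangle.
\]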
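Theorem 37.5. Let $U \subset_o \mathbb{R}^n$, then $\mathcal{D}(U)$ is sequentially dense in $\mathcal{E}'(U)$. When $U = \mathbb{R}^n$ we have that $\mathcal{E}'(\mathbb{R}^n)$ is a dense subspace of $\mathcal{S}'(\mathbb{R}^n) \subset \mathcal{D}'(\mathbb{R}^n)$. Hence we have the following inclusions,
\[
\begin{aligned}
&\mathcal{D}(U) \subset \mathcal{E}'(U) \subset \mathcal{D}'(U), \\
&\mathcal{D}(\mathbb{R}^n) \subset \mathcal{E}'(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n) \subset \mathcal{D}'(\mathbb{R}^n) \text{ and} \\
&\mathcal{D}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n) \subset \mathcal{D}'(\mathbb{R}^n)
\end{aligned}
\]
with all inclusions being dense in the next space up.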
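Proof. The key point is to show $\mathcal{D}(U)$ is dense in $\mathcal{E}'(U)$. Choose $\theta \in C_c^\infty(\mathbb{R}^n)$ such that $\mathrm{supp}(\theta) \subset B(0,1)$, $\theta = \tilde{\theta}$ (i.e. $\theta$ is even) and $\int \theta(x)\, dx = 1$. Let $\theta_m(x) = m^n \theta(mx)$ so that $\mathrm{supp}(\theta_m) \subset B(0, 1/m)$. An element $T \in \mathcal{E}'(U)$ may be viewed as an element of $\mathcal{E}'(\mathbb{R}^n)$ in a natural way. Namely if $\chi \in C_c^\infty(U)$ is such that $\chi = 1$ on a neighborhood of $\mathrm{supp}(T)$, and $\varphi \in C^\infty(\mathbb{R}^n)$, let $\langle T, \varphi \rangle = \langle T, \chi\varphi \rangle$. Define $T_m = T * \theta_m$. It is easily seen that $\mathrm{supp}(T_m) \subset \mathrm{supp}(T) + B(0, 1/m) \subset U$ for all $m$ sufficiently large. Hence $T_m \in \mathcal{D}(U)$ for large enough $m$. Moreover, if $\psi \in \mathcal{D}(U)$, then
\[
\langle T_m, \psi \rangle = \langle T * \theta_m, \psi \rangle = \langle T, \tilde{\theta}_m * \psi \rangle = \langle T, \theta_m * \psi \rangle \to \langle T, \psi \rangle
\]
since $\theta_m * \psi \to \psi$ in $\mathcal{D}(U)$ by standard arguments. If $U = \mathbb{R}^n$, $T \in \mathcal{E}'(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n)$ and $\psi \in \mathcal{S}$, the same argument goes through to show $\langle T_m, \psi \rangle \to \langle T, \psi \rangle$ provided we show $\theta_m * \psi \to \psi$ in $\mathcal{S}(\mathbb{R}^n)$ as $m \to \infty$. This latter is proved by showing for all $\alpha$ and $t > 0$,
\[
\| \nu_t ( \partial^\alpha \theta_m * \psi - \partial^\alpha \psi ) \|_\infty \to 0 \text{ as } m \to \infty,
\]
which is a consequence of the estimates:
\[
\begin{aligned}
|\partial^\alpha \theta_m * \psi(x) - \partial^\alpha \psi(x)| &= |\theta_m * \partial^\alpha \psi(x) - \partial^\alpha \psi(x)| \\
&= \Big| \int \theta_m(y) \big[ \partial^\alpha \psi(x - y) - \partial^\alpha \psi(x) \big] dy \Big| \\
&\le \sup_{|y| \le 1/m} |\partial^\alpha \psi(x - y) - \partial^\alpha \psi(x)| \\
&\le \frac{1}{m} \sup_{|y| \le 1/m} |\nabla \partial^\alpha \psi(x - y)| \\
&\le \frac{1}{m}\, C \sup_{|y| \le 1/m} \nu_{-t}(x - y) \\
&\le \frac{1}{m}\, C\, \nu_{-t}(x) \sup_{|y| \le 1/m} \nu_t(y) \\
&\le \frac{1}{m}\, C \big( 1 + m^{-1} \big)^t \nu_{-t}(x).
\end{aligned}
\]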
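Definition 37.6 (Convolution of Distributions). Suppose that $T \in \mathcal{D}'$ and $S \in \mathcal{E}'$, then define $T * S \in \mathcal{D}'$ by
\[
\langle T * S, \varphi \rangle = \langle T \otimes S, \varphi_+ \rangle
\]
where $\varphi_+(x,y) = \varphi(x + y)$ for all $x, y \in \mathbb{R}^n$. More generally we may define $T * S$ for any two distributions having the property that $\mathrm{supp}(T \otimes S) \cap \mathrm{supp}(\varphi_+) = [\mathrm{supp}(T) \times \mathrm{supp}(S)] \cap \mathrm{supp}(\varphi_+)$ is compact for all $\varphi \in \mathcal{D}$.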
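Proposition 37.7. Suppose that $T \in \mathcal{D}'$ and $S \in \mathcal{E}'$, then $T * S$ is well defined and
\[
\langle T * S, \varphi \rangle = \langle T(x), \langle S(y), \varphi(x + y) \rangle \rangle = \langle S(y), \langle T(x), \varphi(x + y) \rangle \rangle. \tag{37.12}
\]
Moreover, if $T \in \mathcal{S}'$ then $T * S \in \mathcal{S}'$ and $\mathcal{F}(T * S) = \hat{S}\hat{T}$. Recall from Remark 36.46 that $\hat{S} \in \mathcal{P}$ so that $\hat{S}\hat{T} \in \mathcal{S}'$.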
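Proof. Let $\theta \in \mathcal{D}$ be a function such that $\theta = 1$ on a neighborhood of $\mathrm{supp}(S)$; then by Fubini's theorem for distributions,
\[
\begin{aligned}
\langle T \otimes S, \varphi_+ \rangle &= \langle T \otimes S(x,y), \theta(y) \varphi(x + y) \rangle = \langle T(x) S(y), \theta(y) \varphi(x + y) \rangle \\
&= \langle T(x), \langle S(y), \theta(y) \varphi(x + y) \rangle \rangle = \langle T(x), \langle S(y), \varphi(x + y) \rangle \rangle
\end{aligned}
\]
and
\[
\begin{aligned}
\langle T \otimes S, \varphi_+ \rangle &= \langle T(x) S(y), \theta(y) \varphi(x + y) \rangle = \langle S(y), \langle T(x), \theta(y) \varphi(x + y) \rangle \rangle \\
&= \langle S(y), \theta(y) \langle T(x), \varphi(x + y) \rangle \rangle = \langle S(y), \langle T(x), \varphi(x + y) \rangle \rangle
\end{aligned}
\]
proving Eq. (37.12). Suppose that $T \in \mathcal{S}'$, then
\[
|\langle T * S, \varphi \rangle| = |\langle T(x), \langle S(y), \varphi(x + y) \rangle \rangle| \le C \sum_{|\alpha| \le m} \| \nu_m\, \partial_x^\alpha \langle S(y), \varphi(\cdot + y) \rangle \|_\infty
= C \sum_{|\alpha| \le m} \| \nu_m \langle S(y), \partial^\alpha \varphi(\cdot + y) \rangle \|_\infty
\]
and
\[
\begin{aligned}
|\langle S(y), \partial^\alpha \varphi(x + y) \rangle| &\le C \sum_{|\beta| \le p} \sup_{y \in K} \big| \partial^\beta \partial^\alpha \varphi(x + y) \big| \\
&\le C p_{m+p}(\varphi) \sup_{y \in K} \nu_{-m-p}(x + y) \\
&\le C p_{m+p}(\varphi)\, \nu_{-m-p}(x) \sup_{y \in K} \nu_{m+p}(y) \\
&= \tilde{C}\, \nu_{-m-p}(x)\, p_{m+p}(\varphi).
\end{aligned}
\]
Combining the last two displayed equations shows
\[
|\langle T * S, \varphi \rangle| \le C p_{m+p}(\varphi)
\]
which shows that $T * S \in \mathcal{S}'$. We still should check that
\[
\langle T * S, \varphi \rangle = \langle T(x), \langle S(y), \varphi(x + y) \rangle \rangle = \langle S(y), \langle T(x), \varphi(x + y) \rangle \rangle
\]
still holds for all $\varphi \in \mathcal{S}$. This is a matter of showing that all of the expressions are continuous in $\mathcal{S}$ when restricted to $\mathcal{D}$. Explicitly, let $\varphi_n \in \mathcal{D}$ be a sequence of functions such that $\varphi_n \to \varphi$ in $\mathcal{S}$, then
\[
\langle T * S, \varphi \rangle = \lim_{n \to \infty} \langle T * S, \varphi_n \rangle = \lim_{n \to \infty} \langle T(x), \langle S(y), \varphi_n(x + y) \rangle \rangle \tag{37.13}
\]
and
\[
\langle T * S, \varphi \rangle = \lim_{n \to \infty} \langle T * S, \varphi_n \rangle = \lim_{n \to \infty} \langle S(y), \langle T(x), \varphi_n(x + y) \rangle \rangle. \tag{37.14}
\]
So it suffices to show that the maps $\varphi \in \mathcal{S} \to \langle S(y), \varphi(\cdot + y) \rangle \in \mathcal{S}$ and $\varphi \in \mathcal{S} \to \langle T(x), \varphi(x + \cdot) \rangle \in C^\infty(\mathbb{R}^n)$ are continuous. These may be verified by methods similar to what we have been doing, so I will leave the details to the reader. Given these continuity assertions, we may pass to the limits in Eqs. (37.13) and (37.14) to learn
\[
\langle T * S, \varphi \rangle = \langle T(x), \langle S(y), \varphi(x + y) \rangle \rangle = \langle S(y), \langle T(x), \varphi(x + y) \rangle \rangle
\]
still holds for all $\varphi \in \mathcal{S}$. The last and most important point is to show $\mathcal{F}(T * S) = \hat{S}\hat{T}$. Using
\[
\hat{\varphi}(x + y) = \int_{\mathbb{R}^n} \varphi(\xi) e^{-i\xi \cdot (x + y)}\, d\xi = \int_{\mathbb{R}^n} \varphi(\xi) e^{-i y \cdot \xi} e^{-i x \cdot \xi}\, d\xi
= \mathcal{F}\big( \varphi(\xi) e^{-i y \cdot \xi} \big)(x)
\]
and the definition of $\mathcal{F}$ on $\mathcal{S}'$ we learn
\[
\begin{aligned}
\langle \mathcal{F}(T * S), \varphi \rangle &= \langle T * S, \hat{\varphi} \rangle = \langle S(y), \langle T(x), \hat{\varphi}(x + y) \rangle \rangle \\
&= \langle S(y), \langle T(x), \mathcal{F}\big( \varphi(\xi) e^{-i y \cdot \xi} \big)(x) \rangle \rangle \\
&= \langle S(y), \langle \hat{T}(\xi), \varphi(\xi) e^{-i y \cdot \xi} \rangle \rangle.
\end{aligned} \tag{37.15}
\]
Let $\theta \in \mathcal{D}$ be a function such that $\theta = 1$ on a neighborhood of $\mathrm{supp}(S)$ and assume $\varphi \in \mathcal{D}$ for the moment. Then from Eq. (37.15) and Fubini's theorem for distributions we find
\[
\begin{aligned}
\langle \mathcal{F}(T * S), \varphi \rangle &= \langle S(y), \theta(y) \langle \hat{T}(\xi), \varphi(\xi) e^{-i y \cdot \xi} \rangle \rangle \\
&= \langle S(y), \langle \hat{T}(\xi), \varphi(\xi) \theta(y) e^{-i y \cdot \xi} \rangle \rangle \\
&= \langle \hat{T}(\xi), \langle S(y), \varphi(\xi) \theta(y) e^{-i y \cdot \xi} \rangle \rangle \\
&= \langle \hat{T}(\xi), \varphi(\xi) \langle S(y), e^{-i y \cdot \xi} \rangle \rangle \\
&= \langle \hat{T}(\xi), \varphi(\xi) \hat{S}(\xi) \rangle = \langle \hat{S}(\xi) \hat{T}(\xi), \varphi(\xi) \rangle.
\end{aligned} \tag{37.16}
\]
Since $\mathcal{F}(T * S) \in \mathcal{S}'$ and $\hat{S}\hat{T} \in \mathcal{S}'$, we conclude that Eq. (37.16) holds for all $\varphi \in \mathcal{S}$ and hence $\mathcal{F}(T * S) = \hat{S}\hat{T}$ as was to be proved.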
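37.2 Elliptic Regularity
Theorem 37.8 (Hypoellipticity). Suppose that $p(\xi) = \sum_{|\alpha| \le m} a_\alpha \xi^\alpha$ is a polynomial on $\mathbb{R}^n$ and $L$ is the constant coefficient differential operator
\[
L = p\big(\tfrac{1}{i}\partial\big) = \sum_{|\alpha| \le m} a_\alpha \big(\tfrac{1}{i}\partial\big)^\alpha = \sum_{|\alpha| \le m} a_\alpha (-i\partial)^\alpha.
\]
Also assume there exists a distribution $T \in \mathcal{D}'(\mathbb{R}^n)$ such that $R := \delta - LT \in C^\infty(\mathbb{R}^n)$ and $T|_{\mathbb{R}^n \setminus \{0\}} \in C^\infty(\mathbb{R}^n \setminus \{0\})$. Then if $v \in C^\infty(U)$ and $u \in \mathcal{D}'(U)$ solves $Lu = v$ then $u \in C^\infty(U)$. In particular, all solutions $u$ to the equation $Lu = 0$ are smooth.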
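Proof. We must show for each $x_0 \in U$ that $u$ is smooth on a neighborhood of $x_0$. So let $x_0 \in U$ and $\theta \in \mathcal{D}(U)$ be such that $0 \le \theta \le 1$ and $\theta = 1$ on a neighborhood $V$ of $x_0$. Also pick $\alpha \in \mathcal{D}(V)$ such that $0 \le \alpha \le 1$ and $\alpha = 1$ on a neighborhood of $x_0$. Then
\[
\begin{aligned}
\theta u &= \delta * (\theta u) = (LT + R) * (\theta u) = (LT) * (\theta u) + R * (\theta u) \\
&= T * L(\theta u) + R * (\theta u) \\
&= T * \{ \alpha L(\theta u) + (1 - \alpha) L(\theta u) \} + R * (\theta u) \\
&= T * \{ \alpha L u + (1 - \alpha) L(\theta u) \} + R * (\theta u) \\
&= T * (\alpha v) + R * (\theta u) + T * [(1 - \alpha) L(\theta u)].
\end{aligned}
\]
Since $\alpha v \in \mathcal{D}(U)$ and $T \in \mathcal{D}'(\mathbb{R}^n)$ it follows that $T * (\alpha v) \in C^\infty(\mathbb{R}^n)$. Also since $R \in C^\infty(\mathbb{R}^n)$ and $\theta u \in \mathcal{E}'(U)$, $R * (\theta u) \in C^\infty(\mathbb{R}^n)$. So to show $\theta u$, and hence $u$, is smooth near $x_0$ it suffices to show $T * g$ is smooth near $x_0$ where $g := (1 - \alpha) L(\theta u)$. Working formally for the moment,
\[
T * g(x) = \int_{\mathbb{R}^n} T(x - y) g(y)\, dy = \int_{\mathbb{R}^n \setminus \{\alpha = 1\}} T(x - y) g(y)\, dy
\]
which should be smooth for $x$ near $x_0$ since in this case $x - y \ne 0$ when $g(y) \ne 0$. To make this precise, let $\delta > 0$ be chosen so that $\alpha = 1$ on a neighborhood of $B(x_0, \delta)$, so that $\mathrm{supp}(g) \subset B(x_0, \delta)^c$. For $\varphi \in \mathcal{D}(B(x_0, \delta/2))$,
\[
\langle T * g, \varphi \rangle = \langle T(x), \langle g(y), \varphi(x + y) \rangle \rangle = \langle T, h \rangle
\]
where $h(x) := \langle g(y), \varphi(x + y) \rangle$. If $|x| \le \delta/2$,
\[
\mathrm{supp}(\varphi(x + \cdot)) = \mathrm{supp}(\varphi) - x \subset B(x_0, \delta/2) - x \subset B(x_0, \delta)
\]
so that $h(x) = 0$ and hence $\mathrm{supp}(h) \subset B(0, \delta/2)^c$. Hence if we let $\beta \in \mathcal{D}(B(0, \delta/2))$ be a function such that $\beta = 1$ near $0$, we have $\beta h \equiv 0$, and thus
\[
\langle T * g, \varphi \rangle = \langle T, h \rangle = \langle T, h - \beta h \rangle = \langle (1 - \beta) T, h \rangle = \langle [(1 - \beta) T] * g, \varphi \rangle.
\]
Since this last equation is true for all $\varphi \in \mathcal{D}(B(x_0, \delta/2))$, $T * g = [(1 - \beta) T] * g$ on $B(x_0, \delta/2)$ and this finishes the proof since $[(1 - \beta) T] * g \in C^\infty(\mathbb{R}^n)$ because $(1 - \beta) T \in C^\infty(\mathbb{R}^n)$.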
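Definition 37.9. Suppose that $p(\xi) = \sum_{|\alpha| \le m} a_\alpha \xi^\alpha$ is a polynomial on $\mathbb{R}^n$ and $L$ is the constant coefficient differential operator
\[
L = p\big(\tfrac{1}{i}\partial\big) = \sum_{|\alpha| \le m} a_\alpha \big(\tfrac{1}{i}\partial\big)^\alpha = \sum_{|\alpha| \le m} a_\alpha (-i\partial)^\alpha.
\]
Let $\sigma_p(L)(\xi) := \sum_{|\alpha| = m} a_\alpha \xi^\alpha$ and call $\sigma_p(L)$ the principal symbol of $L$. The operator $L$ is said to be elliptic provided that $\sigma_p(L)(\xi) \ne 0$ if $\xi \ne 0$.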
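Theorem 37.10 (Existence of Parametrix). Suppose that $L = p\big(\tfrac{1}{i}\partial\big)$ is an elliptic constant coefficient differential operator, then there exists a distribution $T \in \mathcal{D}'(\mathbb{R}^n)$ such that $R := \delta - LT \in C^\infty(\mathbb{R}^n)$ and $T|_{\mathbb{R}^n \setminus \{0\}} \in C^\infty(\mathbb{R}^n \setminus \{0\})$.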
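Proof. The idea is to try to find $T$ such that $LT = \delta$. Taking the Fourier transform of this equation implies that $p(\xi)\hat{T}(\xi) = 1$ and hence we should try to define $\hat{T}(\xi) = 1/p(\xi)$. The main problem with this definition is that $p(\xi)$ may have zeros. However, these zeros cannot occur for large $\xi$ by the ellipticity assumption. Indeed, let $q(\xi) := \sigma_p(L)(\xi) = \sum_{|\alpha| = m} a_\alpha \xi^\alpha$, $r(\xi) = p(\xi) - q(\xi) = \sum_{|\alpha| < m} a_\alpha \xi^\alpha$ and let $c = \min\{ |q(\xi)| : |\xi| = 1 \} \le \max\{ |q(\xi)| : |\xi| = 1 \} =: C$. Then because $|q(\cdot)|$ is a nowhere vanishing continuous function on the compact set $S := \{ \xi \in \mathbb{R}^n : |\xi| = 1 \}$, $0 < c \le C < \infty$. For $\xi \in \mathbb{R}^n$, let $\hat{\xi} = \xi/|\xi|$ and notice
\[
|p(\xi)| \ge |q(\xi)| - |r(\xi)| \ge c|\xi|^m - |r(\xi)| = |\xi|^m \Big( c - \frac{|r(\xi)|}{|\xi|^m} \Big) > 0
\]
for all $|\xi| \ge M$ with $M$ sufficiently large, since $\lim_{\xi \to \infty} \frac{|r(\xi)|}{|\xi|^m} = 0$. Choose $\theta \in \mathcal{D}(\mathbb{R}^n)$ such that $\theta = 1$ on a neighborhood of $B(0, M)$ and let
\[
h(\xi) = \frac{1 - \theta(\xi)}{p(\xi)} = \frac{\beta(\xi)}{p(\xi)} \in C^\infty(\mathbb{R}^n)
\]
where $\beta = 1 - \theta$. Since $h(\xi)$ is bounded (in fact $\lim_{\xi \to \infty} h(\xi) = 0$), $h \in \mathcal{S}'(\mathbb{R}^n)$ so $T := \mathcal{F}^{-1} h \in \mathcal{S}'(\mathbb{R}^n)$ is well defined. Moreover,
\[
\mathcal{F}(\delta - LT) = 1 - p(\xi) h(\xi) = 1 - \beta(\xi) = \theta(\xi) \in \mathcal{D}(\mathbb{R}^n)
\]
which shows that
\[
R := \delta - LT \in \mathcal{S}(\mathbb{R}^n) \subset C^\infty(\mathbb{R}^n).
\]
So to finish the proof it suffices to show
\[
T|_{\mathbb{R}^n \setminus \{0\}} \in C^\infty(\mathbb{R}^n \setminus \{0\}).
\]
To prove this recall that
\[
\mathcal{F}(x^\alpha T) = (i\partial)^\alpha \hat{T} = (i\partial)^\alpha h.
\]
By the product rule and the fact that any derivative of $\beta$ has compact support in $B(0, M)^c$ and $p$ is non-zero on this set,
\[
\partial^\alpha h = \beta\, \partial^\alpha \frac{1}{p} + r_\alpha
\]
where $r_\alpha \in \mathcal{D}(\mathbb{R}^n)$. Moreover,
\[
\partial_i \frac{1}{p} = -\frac{\partial_i p}{p^2} \quad\text{and}\quad
\partial_j \partial_i \frac{1}{p} = -\partial_j \frac{\partial_i p}{p^2} = -\frac{\partial_j \partial_i p}{p^2} + 2\frac{\partial_i p\, \partial_j p}{p^3}
\]
so that
\[
\Big| \beta(\xi)\, \partial_i \frac{1}{p}(\xi) \Big| \le C |\xi|^{-(m+1)} \quad\text{and}\quad \Big| \beta(\xi)\, \partial_j \partial_i \frac{1}{p}(\xi) \Big| \le C |\xi|^{-(m+2)}.
\]
More generally, one shows inductively that
\[
\Big| \beta(\xi)\, \partial^\alpha \frac{1}{p}(\xi) \Big| \le C |\xi|^{-(m + |\alpha|)}. \tag{37.17}
\]
In particular, if $k \in \mathbb{N}$ is given and $\alpha$ is chosen so that $|\alpha| + m > n + k$, then $|\xi|^k \partial^\alpha h(\xi) \in L^1(\xi)$ and therefore
\[
x^\alpha T = \mathcal{F}^{-1}[(i\partial)^\alpha h] \in C^k(\mathbb{R}^n).
\]
Hence we learn that for any $k \in \mathbb{N}$ we may choose an integer $p$ sufficiently large so that
\[
|x|^{2p} T \in C^k(\mathbb{R}^n).
\]
This shows that $T|_{\mathbb{R}^n \setminus \{0\}} \in C^\infty(\mathbb{R}^n \setminus \{0\})$.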
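Here is the induction argument that proves Eq. (37.17). Let $q_\alpha := p^{|\alpha|+1} \partial^\alpha p^{-1}$ with $q_0 = 1$, then
\[
\partial_i \partial^\alpha p^{-1} = \partial_i \big( p^{-|\alpha|-1} q_\alpha \big) = (-|\alpha| - 1)\, p^{-|\alpha|-2} q_\alpha\, \partial_i p + p^{-|\alpha|-1} \partial_i q_\alpha
\]
so that
\[
q_{\alpha + e_i} = p^{|\alpha|+2}\, \partial_i \partial^\alpha p^{-1} = (-|\alpha| - 1)\, q_\alpha\, \partial_i p + p\, \partial_i q_\alpha.
\]
It follows by induction that $q_\alpha$ is a polynomial in $\xi$ and, letting $d_\alpha := \deg(q_\alpha)$, we have $d_{\alpha + e_i} \le d_\alpha + m - 1$ with $d_0 = 0$. Again by induction this implies $d_\alpha \le |\alpha|(m - 1)$. Therefore
\[
\partial^\alpha p^{-1} = \frac{q_\alpha}{p^{|\alpha|+1}} \sim |\xi|^{d_\alpha - m(|\alpha|+1)} \le |\xi|^{|\alpha|(m-1) - m(|\alpha|+1)} = |\xi|^{-(m + |\alpha|)}
\]
as claimed in Eq. (37.17).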
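The recursion for $q_\alpha$ can be run mechanically in one variable. The following is a minimal sketch, not taken from the text: polynomials are stored as coefficient lists (lowest degree first), the helper names are mine, and the sample symbol $p(\xi) = \xi^2 + 1$ (so $m = 2$) is an arbitrary choice. It computes $q_{k+1} = -(k+1)\, q_k\, p' + p\, q_k'$ starting from $q_0 = 1$ and checks the degree bound $\deg q_k \le k(m-1)$.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two coefficient lists, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_diff(p):
    """Derivative of a coefficient list; the zero polynomial is [0]."""
    return [i * c for i, c in enumerate(p)][1:] or [0]

def trim(p):
    """Drop trailing zero coefficients so that len(p) - 1 is the degree."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def q_polys(p, kmax):
    """Return [q_0, ..., q_kmax] with q_{k+1} = -(k+1) q_k p' + p q_k' and q_0 = 1."""
    dp = poly_diff(p)
    qs = [[1]]
    for k in range(kmax):
        q = qs[-1]
        first = [-(k + 1) * c for c in poly_mul(q, dp)]
        second = poly_mul(p, poly_diff(q))
        qs.append(trim(poly_add(first, second)))
    return qs

p = [1, 0, 1]            # hypothetical example: p(xi) = 1 + xi^2, degree m = 2
m = len(p) - 1
qs = q_polys(p, 5)
for k, q in enumerate(qs):
    print(k, q, "deg =", len(q) - 1, "bound =", k * (m - 1))
```

For this $p$ the first steps can be confirmed by hand: $q_1 = -2\xi$ and $q_2 = 6\xi^2 - 2$, matching $(1/p)' = -2\xi/p^2$ and $(1/p)'' = (6\xi^2 - 2)/p^3$.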
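Part IX Appendices
A Multinomial Theorems and Calculus Results
Given a multi-index $\alpha \in \mathbb{Z}_+^n$, let $|\alpha| = \alpha_1 + \cdots + \alpha_n$, $\alpha! := \alpha_1! \cdots \alpha_n!$,
\[
x^\alpha := \prod_{j=1}^n x_j^{\alpha_j} \quad\text{and}\quad \partial_x^\alpha = \Big( \frac{\partial}{\partial x} \Big)^\alpha := \prod_{j=1}^n \Big( \frac{\partial}{\partial x_j} \Big)^{\alpha_j}.
\]
We also write
\[
\partial_v f(x) := \frac{d}{dt} f(x + tv)\Big|_{t=0}.
\]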
A.1 Multinomial Theorems and Product Rules
For $a = (a_1, a_2, \dots, a_n) \in \mathbb{C}^n$, $m \in \mathbb{N}$ and $(i_1, \dots, i_m) \in \{1, 2, \dots, n\}^m$ let $\hat{\alpha}_j(i_1, \dots, i_m) = \#\{k : i_k = j\}$. Then
\[
\Big( \sum_{i=1}^n a_i \Big)^m = \sum_{i_1, \dots, i_m = 1}^n a_{i_1} \cdots a_{i_m} = \sum_{|\alpha| = m} C(\alpha) a^\alpha
\]
where
\[
C(\alpha) = \#\{ (i_1, \dots, i_m) : \hat{\alpha}_j(i_1, \dots, i_m) = \alpha_j \text{ for } j = 1, 2, \dots, n \}.
\]
I claim that $C(\alpha) = \frac{m!}{\alpha!}$. Indeed, one possibility for such a sequence $(a_{i_1}, \dots, a_{i_m})$ for a given $\alpha$ is gotten by choosing
\[
( \overbrace{a_1, \dots, a_1}^{\alpha_1}, \overbrace{a_2, \dots, a_2}^{\alpha_2}, \dots, \overbrace{a_n, \dots, a_n}^{\alpha_n} ).
\]
Now there are $m!$ permutations of this list. However, only those permutations leading to a distinct list are to be counted. So we must divide $m!$ by the number of permutations which just rearrange the groups of $a_i$'s among themselves for each $i$. There are $\alpha! := \alpha_1! \cdots \alpha_n!$ such permutations. Therefore, $C(\alpha) = m!/\alpha!$ as advertised. So we have proved
\[
\Big( \sum_{i=1}^n a_i \Big)^m = \sum_{|\alpha| = m} \frac{m!}{\alpha!} a^\alpha. \tag{A.1}
\]
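Eq. (A.1) and the counting claim $C(\alpha) = m!/\alpha!$ are easy to test by brute force on small cases. The sketch below is illustrative only; the function names and the sample values are my own choices. It enumerates the multi-indices with $|\alpha| = m$, sums the right side of Eq. (A.1) in exact integer arithmetic, and separately counts index tuples with prescribed multiplicities.

```python
from itertools import product
from math import factorial

def multi_indices(n, m):
    """Yield all alpha in Z_+^n with |alpha| = m."""
    if n == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in multi_indices(n - 1, m - first):
            yield (first,) + rest

def multinomial_coeff(alpha):
    """m!/alpha! with m = |alpha|, computed with exact integer division."""
    coeff = factorial(sum(alpha))
    for aj in alpha:
        coeff //= factorial(aj)
    return coeff

def multinomial_sum(a, m):
    """Right-hand side of Eq. (A.1): sum over |alpha| = m of (m!/alpha!) a^alpha."""
    total = 0
    for alpha in multi_indices(len(a), m):
        term = multinomial_coeff(alpha)
        for ai, aj in zip(a, alpha):
            term *= ai ** aj
        total += term
    return total

def count_sequences(alpha):
    """C(alpha): number of tuples (i_1,...,i_m) in which index j occurs alpha_j times."""
    n, m = len(alpha), sum(alpha)
    return sum(1 for idx in product(range(n), repeat=m)
               if all(idx.count(j) == alpha[j] for j in range(n)))

print(multinomial_sum((2, 3, 5), 4), (2 + 3 + 5) ** 4)
print(count_sequences((2, 1, 1)), multinomial_coeff((2, 1, 1)))
```

Each printed pair should consist of two equal integers.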
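Now suppose that $a, b \in \mathbb{R}^n$ and $\alpha$ is a multi-index; then we have
\[
(a + b)^\alpha = \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!} a^\beta b^{\alpha - \beta} = \sum_{\beta + \delta = \alpha} \frac{\alpha!}{\beta!\,\delta!} a^\beta b^\delta. \tag{A.2}
\]
Indeed, by the standard Binomial formula,
\[
(a_i + b_i)^{\alpha_i} = \sum_{\beta_i \le \alpha_i} \frac{\alpha_i!}{\beta_i!(\alpha_i - \beta_i)!} a_i^{\beta_i} b_i^{\alpha_i - \beta_i}
\]
from which Eq. (A.2) follows. Eq. (A.2) generalizes in the obvious way to
\[
(a_1 + \cdots + a_k)^\alpha = \sum_{\beta_1 + \cdots + \beta_k = \alpha} \frac{\alpha!}{\beta_1! \cdots \beta_k!} a_1^{\beta_1} \cdots a_k^{\beta_k} \tag{A.3}
\]
where $a_1, a_2, \dots, a_k \in \mathbb{R}^n$ and $\alpha \in \mathbb{Z}_+^n$.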
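Now let us consider the product rule for derivatives. Let us begin with the one variable case (write $d^n f$ for $f^{(n)} = \frac{d^n}{dx^n} f$) where we will show by induction that
\[
d^n(fg) = \sum_{k=0}^n \binom{n}{k} d^k f \cdot d^{n-k} g. \tag{A.4}
\]
Indeed assuming Eq. (A.4) we find
\[
\begin{aligned}
d^{n+1}(fg) &= \sum_{k=0}^n \binom{n}{k} d^{k+1} f \cdot d^{n-k} g + \sum_{k=0}^n \binom{n}{k} d^k f \cdot d^{n-k+1} g \\
&= \sum_{k=1}^{n+1} \binom{n}{k-1} d^k f \cdot d^{n-k+1} g + \sum_{k=0}^n \binom{n}{k} d^k f \cdot d^{n-k+1} g \\
&= \sum_{k=1}^{n} \Big[ \binom{n}{k-1} + \binom{n}{k} \Big] d^k f \cdot d^{n-k+1} g + d^{n+1} f \cdot g + f \cdot d^{n+1} g.
\end{aligned}
\]
Since
\[
\begin{aligned}
\binom{n}{k-1} + \binom{n}{k} &= \frac{n!}{(n-k+1)!(k-1)!} + \frac{n!}{(n-k)!\,k!} \\
&= \frac{n!}{(k-1)!\,(n-k)!} \Big[ \frac{1}{n-k+1} + \frac{1}{k} \Big] \\
&= \frac{n!}{(k-1)!\,(n-k)!} \cdot \frac{n+1}{(n-k+1)\,k} = \binom{n+1}{k}
\end{aligned}
\]
the result follows.
Now consider the multi-variable case
\[
\begin{aligned}
\partial^\alpha(fg) &= \Big( \prod_{i=1}^n \partial_i^{\alpha_i} \Big)(fg) = \prod_{i=1}^n \Big[ \sum_{k_i = 0}^{\alpha_i} \binom{\alpha_i}{k_i} \partial_i^{k_i} f \cdot \partial_i^{\alpha_i - k_i} g \Big] \\
&= \sum_{k_1 = 0}^{\alpha_1} \cdots \sum_{k_n = 0}^{\alpha_n} \prod_{i=1}^n \binom{\alpha_i}{k_i} \partial^k f \cdot \partial^{\alpha - k} g = \sum_{k \le \alpha} \binom{\alpha}{k} \partial^k f \cdot \partial^{\alpha - k} g
\end{aligned}
\]
where $k = (k_1, k_2, \dots, k_n)$ and
\[
\binom{\alpha}{k} := \prod_{i=1}^n \binom{\alpha_i}{k_i} = \frac{\alpha!}{k!(\alpha - k)!}.
\]
So we have proved
\[
\partial^\alpha(fg) = \sum_{\beta \le \alpha} \binom{\alpha}{\beta} \partial^\beta f \cdot \partial^{\alpha - \beta} g. \tag{A.5}
\]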
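The one variable product rule, Eq. (A.4), can be checked exactly on polynomials. This is a small sketch under my own conventions (coefficient lists with lowest degree first, helper names invented for the illustration); it compares $d^n(fg)$ with $\sum_k \binom{n}{k} d^k f \cdot d^{n-k} g$ for two sample polynomials.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two coefficient lists, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_diff(p, k=1):
    """k-th derivative of a coefficient list; the zero polynomial is [0]."""
    for _ in range(k):
        p = [i * c for i, c in enumerate(p)][1:] or [0]
    return p

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def leibniz_rhs(f, g, n):
    """Right-hand side of Eq. (A.4): sum_k C(n,k) d^k f * d^(n-k) g."""
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_diff(f, k), poly_diff(g, n - k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return trim(total)

f = [1, 2, 0, 3]   # sample: f(x) = 1 + 2x + 3x^3
g = [0, 1, 4]      # sample: g(x) = x + 4x^2
lhs = trim(poly_diff(poly_mul(f, g), 3))
rhs = leibniz_rhs(f, g, 3)
print(lhs, rhs)
```

The same check could be repeated coordinate by coordinate for Eq. (A.5), since the multi-variable rule is obtained by applying Eq. (A.4) in each variable.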
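A.2 Taylor's Theorem
Theorem A.1. Suppose $X \subset \mathbb{R}^n$ is an open set, $x : [0,1] \to X$ is a $C^1$ path, and $f \in C^N(X, \mathbb{C})$. Let $v_s := x(1) - x(s)$ and $v = v_0 = x(1) - x(0)$, then
\[
f(x(1)) = \sum_{m=0}^{N-1} \frac{1}{m!} (\partial_v^m f)(x(0)) + R_N \tag{A.6}
\]
where
\[
R_N = \frac{1}{(N-1)!} \int_0^1 \big( \partial_{\dot{x}(s)} \partial_{v_s}^{N-1} f \big)(x(s))\, ds = \frac{1}{N!} \int_0^1 \Big( -\frac{d}{ds} \partial_{v_s}^N f \Big)(x(s))\, ds \tag{A.7}
\]
and $0! := 1$.
Proof. By the fundamental theorem of calculus and the chain rule,
\[
f(x(t)) = f(x(0)) + \int_0^t \frac{d}{ds} f(x(s))\, ds = f(x(0)) + \int_0^t \big( \partial_{\dot{x}(s)} f \big)(x(s))\, ds \tag{A.8}
\]
and in particular,
\[
f(x(1)) = f(x(0)) + \int_0^1 \big( \partial_{\dot{x}(s)} f \big)(x(s))\, ds.
\]
This proves Eq. (A.6) when $N = 1$. We will now complete the proof using induction on $N$. Applying Eq. (A.8) with $f$ replaced by $\frac{1}{(N-1)!} \big( \partial_{\dot{x}(s)} \partial_{v_s}^{N-1} f \big)$ gives
\[
\begin{aligned}
\frac{1}{(N-1)!} \big( \partial_{\dot{x}(s)} \partial_{v_s}^{N-1} f \big)(x(s)) &= \frac{1}{(N-1)!} \big( \partial_{\dot{x}(s)} \partial_{v_s}^{N-1} f \big)(x(0)) + \frac{1}{(N-1)!} \int_0^s \big( \partial_{\dot{x}(s)} \partial_{v_s}^{N-1} \partial_{\dot{x}(t)} f \big)(x(t))\, dt \\
&= -\frac{1}{N!} \Big( \frac{d}{ds} \partial_{v_s}^N f \Big)(x(0)) - \frac{1}{N!} \int_0^s \Big( \frac{d}{ds} \partial_{v_s}^N \partial_{\dot{x}(t)} f \Big)(x(t))\, dt
\end{aligned}
\]
wherein we have used the fact that mixed partial derivatives commute to show $\frac{d}{ds} \partial_{v_s}^N f = -N \partial_{\dot{x}(s)} \partial_{v_s}^{N-1} f$. Integrating this equation over $s \in [0,1]$ shows, using the fundamental theorem of calculus and $v_1 = 0$,
\[
\begin{aligned}
R_N &= \frac{1}{N!} \big( \partial_v^N f \big)(x(0)) - \frac{1}{N!} \int_{0 \le t \le s \le 1} \Big( \frac{d}{ds} \partial_{v_s}^N \partial_{\dot{x}(t)} f \Big)(x(t))\, ds\, dt \\
&= \frac{1}{N!} \big( \partial_v^N f \big)(x(0)) + \frac{1}{N!} \int_{0 \le t \le 1} \big( \partial_{v_t}^N \partial_{\dot{x}(t)} f \big)(x(t))\, dt \\
&= \frac{1}{N!} \big( \partial_v^N f \big)(x(0)) + R_{N+1}
\end{aligned}
\]
which completes the inductive proof.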
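Remark A.2. Using Eq. (A.1) with $a_i$ replaced by $v_i \partial_i$ (although $\{v_i \partial_i\}_{i=1}^n$ are not complex numbers they are commuting symbols), we find
\[
\partial_v^m f = \Big( \sum_{i=1}^n v_i \partial_i \Big)^m f = \sum_{|\alpha| = m} \frac{m!}{\alpha!} v^\alpha \partial^\alpha f.
\]
Using this fact we may write Eqs. (A.6) and (A.7) as
\[
f(x(1)) = \sum_{|\alpha| \le N-1} \frac{1}{\alpha!} v^\alpha \partial^\alpha f(x(0)) + R_N
\]
and
\[
R_N = \sum_{|\alpha| = N} \frac{1}{\alpha!} \int_0^1 \Big( -\frac{d}{ds} v_s^\alpha \partial^\alpha f \Big)(x(s))\, ds.
\]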
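Corollary A.3. Suppose $X \subset \mathbb{R}^n$ is an open set which contains $x(s) = (1 - s)x_0 + s x_1$ for $0 \le s \le 1$ and $f \in C^N(X, \mathbb{C})$. Then
\[
f(x_1) = \sum_{m=0}^{N-1} \frac{1}{m!} (\partial_v^m f)(x_0) + \frac{1}{N!} \int_0^1 \big( \partial_v^N f \big)(x(s))\, d\nu_N(s) \tag{A.9}
\]
\[
= \sum_{|\alpha| < N} \frac{1}{\alpha!} \partial^\alpha f(x(0)) (x_1 - x_0)^\alpha + \sum_{\alpha : |\alpha| = N} \frac{1}{\alpha!} \Big[ \int_0^1 \partial^\alpha f(x(s))\, d\nu_N(s) \Big] (x_1 - x_0)^\alpha \tag{A.10}
\]
where $v := x_1 - x_0$ and $d\nu_N$ is the probability measure on $[0,1]$ given by
\[
d\nu_N(s) := N(1 - s)^{N-1}\, ds. \tag{A.11}
\]
If we let $x = x_0$ and $y = x_1 - x_0$ (so $x + y = x_1$), Eq. (A.10) may be written as
\[
f(x + y) = \sum_{|\alpha| < N} \frac{\partial_x^\alpha f(x)}{\alpha!} y^\alpha + \sum_{\alpha : |\alpha| = N} \frac{1}{\alpha!} \Big( \int_0^1 \partial_x^\alpha f(x + s y)\, d\nu_N(s) \Big) y^\alpha. \tag{A.12}
\]
Proof. This is a special case of Theorem A.1. Notice that
\[
v_s = x(1) - x(s) = (1 - s)(x_1 - x_0) = (1 - s) v
\]
and hence
\[
R_N = \frac{1}{N!} \int_0^1 \Big( -\frac{d}{ds} (1 - s)^N \partial_v^N f \Big)(x(s))\, ds = \frac{1}{N!} \int_0^1 \big( \partial_v^N f \big)(x(s))\, N(1 - s)^{N-1}\, ds.
\]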
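In one variable Eq. (A.12) reads $f(x+y) = \sum_{m<N} \frac{f^{(m)}(x)}{m!} y^m + \frac{y^N}{N!} \int_0^1 f^{(N)}(x + sy)\, N(1-s)^{N-1}\, ds$, which is easy to test numerically. The sketch below is an illustration with assumptions of my own: the function name, the midpoint quadrature, and the test function $f = \exp$ (all of whose derivatives are again $\exp$) are not from the text.

```python
import math

def taylor_with_remainder(derivs, x, y, N, n_quad=2000):
    """Return (Taylor polynomial of order N-1 at x evaluated at x+y, integral remainder R_N).

    derivs(m, z) must return the m-th derivative of f at z. The remainder integral
    against d nu_N(s) = N (1-s)^(N-1) ds is approximated by the midpoint rule.
    """
    poly = sum(derivs(m, x) * y ** m / math.factorial(m) for m in range(N))
    h = 1.0 / n_quad
    integral = 0.0
    for j in range(n_quad):
        s = (j + 0.5) * h
        integral += derivs(N, x + s * y) * N * (1.0 - s) ** (N - 1)
    integral *= h
    remainder = y ** N / math.factorial(N) * integral
    return poly, remainder

exp_derivs = lambda m, z: math.exp(z)   # every derivative of exp is exp
poly, rem = taylor_with_remainder(exp_derivs, 0.2, 0.5, 3)
print(poly + rem, math.exp(0.7))
```

The two printed values should agree up to quadrature error, and the remainder term alone shows how far the quadratic Taylor polynomial is from $e^{0.7}$.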
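Example A.4. Let $X=(-1,1)\subset\mathbb{R}$, $\beta\in\mathbb{R}$ and $f(x)=(1-x)^\beta$. The reader should verify
\[
f^{(m)}(x)=(-1)^m\beta(\beta-1)\cdots(\beta-m+1)(1-x)^{\beta-m}
\]
and therefore by Taylor's theorem (Eq. (??) with $x=0$ and $y=x$)
\[
(1-x)^\beta=1+\sum_{m=1}^{N-1}\frac{1}{m!}(-1)^m\beta(\beta-1)\cdots(\beta-m+1)\,x^m+R_N(x) \tag{A.13}
\]
where
\begin{align*}
R_N(x)&=\frac{x^N}{N!}\int_0^1(-1)^N\beta(\beta-1)\cdots(\beta-N+1)(1-sx)^{\beta-N}\,d\nu_N(s)\\
&=\frac{x^N}{N!}(-1)^N\beta(\beta-1)\cdots(\beta-N+1)\int_0^1\frac{N(1-s)^{N-1}}{(1-sx)^{N-\beta}}\,ds.
\end{align*}
Now for $x\in(-1,1)$ and $N>\beta$,
\[
0\le\int_0^1\frac{N(1-s)^{N-1}}{(1-sx)^{N-\beta}}\,ds\le\int_0^1\frac{N(1-s)^{N-1}}{(1-s)^{N-\beta}}\,ds=\int_0^1N(1-s)^{\beta-1}\,ds=\frac{N}{\beta}
\]
and therefore,
\[
|R_N(x)|\le\frac{|x|^N}{(N-1)!}\,\bigl|(\beta-1)\cdots(\beta-N+1)\bigr|=:\rho_N .
\]
Since
\[
\limsup_{N\to\infty}\frac{\rho_{N+1}}{\rho_N}=|x|\cdot\limsup_{N\to\infty}\frac{N-\beta}{N}=|x|<1,
\]
the Ratio test shows that $|R_N(x)|\le\rho_N\to 0$ (exponentially fast) as $N\to\infty$. Therefore by passing to the limit in Eq. (A.13) we have proved
\[
(1-x)^\beta=1+\sum_{m=1}^\infty\frac{(-1)^m}{m!}\beta(\beta-1)\cdots(\beta-m+1)\,x^m \tag{A.14}
\]
which is valid for $|x|<1$ and $\beta\in\mathbb{R}$. An important special case is $\beta=-1$, in which case Eq. (A.14) becomes $\frac{1}{1-x}=\sum_{m=0}^\infty x^m$, the standard geometric series formula. Another useful special case is $\beta=1/2$, in which case Eq. (A.14) becomes
\begin{align}
\sqrt{1-x}&=1+\sum_{m=1}^\infty\frac{(-1)^m}{m!}\,\frac12\Bigl(\frac12-1\Bigr)\cdots\Bigl(\frac12-m+1\Bigr)x^m \notag\\
&=1-\sum_{m=1}^\infty\frac{(2m-3)!!}{2^m\,m!}\,x^m\quad\text{for all }|x|<1. \tag{A.15}
\end{align}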
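The series (A.14) and its special case (A.15) can be checked numerically. The sketch below sums the first few terms and is meant only as an illustration; the function names and the convention $(-1)!! = 1$ (needed for the $m = 1$ term of (A.15)) are assumptions of this example.

```python
import math

def binomial_series(beta, x, n_terms):
    """Partial sum of Eq. (A.14): terms m = 0, ..., n_terms - 1."""
    total = 1.0
    coeff = 1.0  # running value of (-1)^m * beta(beta-1)...(beta-m+1) / m!
    for m in range(1, n_terms):
        coeff *= -(beta - (m - 1)) / m
        total += coeff * x**m
    return total

def double_factorial(k):
    """k!! for odd k >= -1, with the convention (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def sqrt_series(x, n_terms):
    """Partial sum of Eq. (A.15): 1 - sum_{m>=1} (2m-3)!! / (2^m m!) * x^m."""
    return 1.0 - sum(
        double_factorial(2 * m - 3) / (2**m * math.factorial(m)) * x**m
        for m in range(1, n_terms)
    )
```

For example, `binomial_series(-1, 0.5, 60)` approximates the geometric series value $1/(1-0.5)$, and `sqrt_series(0.3, 60)` approximates $\sqrt{0.7}$; both forms of the $\beta = 1/2$ series agree term by term.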
B
Zorn's Lemma and the Hausdorff Maximal Principle
Definition B.1. A partial order $\le$ on $X$ is a relation with the following properties:
(i) If $x\le y$ and $y\le z$ then $x\le z$.
(ii) If $x\le y$ and $y\le x$ then $x=y$.
(iii) $x\le x$ for all $x\in X$.
Example B.2. Let $Y$ be a set and $X=2^Y$. There are two natural partial orders on $X$.
1. Ordered by inclusion, $A\le B$ if $A\subset B$, and
2. Ordered by reverse inclusion, $A\le B$ if $B\subset A$.
Definition B.3. Let $(X,\le)$ be a partially ordered set. We say $X$ is linearly or totally ordered if for all $x,y\in X$ either $x\le y$ or $y\le x$. The real numbers $\mathbb{R}$ with the usual order is a typical example.
Definition B.4. Let $(X,\le)$ be a partially ordered set. We say $x\in X$ is a maximal element if for all $y\in X$, $y\ge x$ implies $y=x$, i.e. there is no element larger than $x$. An upper bound for a subset $E$ of $X$ is an element $x\in X$ such that $x\ge y$ for all $y\in E$.
Example B.5. Let
\[
X=\bigl\{a=\{1\},\ b=\{1,2\},\ c=\{3\},\ d=\{2,4\},\ e=\{2\}\bigr\}
\]
ordered by set inclusion. Then $b$ and $d$ are maximal elements despite the fact that $b\not\subset d$ and $d\not\subset b$, i.e. a maximal element need not be comparable to every other element. We also have:
If $E=\{a,e,c\}$, then $E$ has no upper bound.
If $E=\{a,e\}$, then $b$ is an upper bound.
If $E=\{e\}$, then $b$ and $d$ are upper bounds.
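The claims in Example B.5 are small enough to verify by brute force. The following sketch encodes the five sets and computes maximal elements and upper bounds directly from Definition B.4; the helper names `maximal_elements` and `upper_bounds` are illustrative choices, not notation from the text.

```python
# The five sets from Example B.5, ordered by inclusion.
X = {
    "a": frozenset({1}),
    "b": frozenset({1, 2}),
    "c": frozenset({3}),
    "d": frozenset({2, 4}),
    "e": frozenset({2}),
}

def maximal_elements(poset):
    """Names x such that no element of the poset strictly contains poset[x]."""
    return sorted(
        x for x, sx in poset.items()
        if not any(sx < sy for sy in poset.values())
    )

def upper_bounds(poset, subset):
    """Names x with poset[y] <= poset[x] for every y in subset."""
    return sorted(
        x for x, sx in poset.items()
        if all(poset[y] <= sx for y in subset)
    )
```

Running this shows that `c` is maximal as well as `b` and `d`, and that `e` is itself an upper bound of $\{e\}$ alongside `b` and `d`; neither observation contradicts the example, which lists some but not necessarily all such elements.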
Theorem B.7. The following are equivalent.
1. The axiom of choice: to each collection, $\{X_\alpha\}_{\alpha\in A}$, of non-empty sets there exists a "choice function," $x:A\to\coprod_{\alpha\in A}X_\alpha$ such that $x(\alpha)\in X_\alpha$ for all $\alpha\in A$, i.e. $\prod_{\alpha\in A}X_\alpha\neq\emptyset$.
2. The Hausdorff Maximal Principle: Every partially ordered set has a maximal (relative to the inclusion order) linearly ordered subset.
3. Zorn's Lemma: If $X$ is a partially ordered set such that every linearly ordered subset of $X$ has an upper bound, then $X$ has a maximal element.$^1$
Proof. ($2\Rightarrow 3$) Let $X$ be a partially ordered set as in 3 and let $\mathcal{F}=\{E\subset X: E\text{ is linearly ordered}\}$, which we equip with the inclusion partial ordering. By 2. there exists a maximal element $E\in\mathcal{F}$. By assumption, the linearly ordered set $E$ has an upper bound $x\in X$. The element $x$ is maximal, for if $y\in X$ and $y\ge x$, then $E\cup\{y\}$ is still a linearly ordered set containing $E$. So by maximality of $E$, $E=E\cup\{y\}$, i.e. $y\in E$ and therefore $y\le x$, which combined with $y\ge x$ implies that $y=x$.$^2$
($3\Rightarrow 1$) Let $\{X_\alpha\}_{\alpha\in A}$ be a collection of non-empty sets; we must show $\prod_{\alpha\in A}X_\alpha$ is not empty. Let $\mathcal{G}$ denote the collection of functions $g:D(g)\to\coprod_{\alpha\in A}X_\alpha$ such that $D(g)$ is a subset of $A$, and for all $\alpha\in D(g)$, $g(\alpha)\in X_\alpha$. Notice that $\mathcal{G}$ is not empty, for we may let $\alpha_0\in A$ and $x_0\in X_{\alpha_0}$ and then set $D(g)=\{\alpha_0\}$ and $g(\alpha_0)=x_0$ to construct an element of $\mathcal{G}$. We now put a partial order on $\mathcal{G}$ as follows. We say that $f\le g$ for $f,g\in\mathcal{G}$ provided that $D(f)\subset D(g)$ and $f=g|_{D(f)}$. If $\Phi\subset\mathcal{G}$ is a linearly ordered set, let $D(h)=\bigcup_{g\in\Phi}D(g)$ and for $\alpha\in D(g)$ let $h(\alpha)=g(\alpha)$. Then $h\in\mathcal{G}$ is an upper bound for $\Phi$. So by Zorn's Lemma there exists a maximal element $h\in\mathcal{G}$. To finish the proof we need only show that $D(h)=A$. If this were not the case, then let $\alpha_0\in A\setminus D(h)$ and $x_0\in X_{\alpha_0}$. We may now define $D(\tilde h)=D(h)\cup\{\alpha_0\}$ and
\[
\tilde h(\alpha)=\begin{cases}h(\alpha)&\text{if }\alpha\in D(h)\\ x_0&\text{if }\alpha=\alpha_0.\end{cases}
\]
Then $h\le\tilde h$ while $h\neq\tilde h$, violating the fact that $h$ was a maximal element.

$^1$ If $X$ is a countable set we may prove Zorn's Lemma by induction. Let $\{x_n\}_{n=1}^\infty$ be an enumeration of $X$, and define $E_n\subset X$ inductively as follows. For $n=1$ let $E_1=\{x_1\}$, and if $E_n$ has been chosen, let $E_{n+1}=E_n\cup\{x_{n+1}\}$ if $x_{n+1}$ is an upper bound for $E_n$, otherwise let $E_{n+1}=E_n$. The set $E=\bigcup_{n=1}^\infty E_n$ is a linearly ordered (you check) subset of $X$ and hence by assumption $E$ has an upper bound, $x\in X$. I claim that this element is maximal, for if there exists $y=x_m\in X$ such that $y\ge x$, then $x_m$ would be an upper bound for $E_{m-1}$ and therefore $y=x_m\in E_m\subset E$. That is to say if $y\ge x$, then $y\in E$ and hence $y\le x$, so $y=x$. (Hence we may view Zorn's lemma as a "jazzed up" version of induction.)

$^2$ Similarly one may show that $3\Rightarrow 2$. Let $\mathcal{F}=\{E\subset X:E\text{ is linearly ordered}\}$ and order $\mathcal{F}$ by inclusion. If $\mathcal{M}\subset\mathcal{F}$ is linearly ordered, let $E=\bigcup\mathcal{M}=\bigcup_{A\in\mathcal{M}}A$. If $x,y\in E$ then $x\in A$ and $y\in B$ for some $A,B\in\mathcal{M}$. Now $\mathcal{M}$ is linearly ordered by set inclusion so $A\subset B$ or $B\subset A$, i.e. $x,y\in A$ or $x,y\in B$. Since $A$ and $B$ are linearly ordered we must have either $x\le y$ or $y\le x$, that is to say $E$ is linearly ordered. Hence by 3. there exists a maximal element $E\in\mathcal{F}$, which is the assertion in 2.
($1\Rightarrow 2$) Let $(X,\le)$ be a partially ordered set. Let $\mathcal{F}$ be the collection of linearly ordered subsets of $X$, which we order by set inclusion. Given $x_0\in X$, $\{x_0\}\in\mathcal{F}$ is a linearly ordered set so that $\mathcal{F}\neq\emptyset$. Fix an element $P_0\in\mathcal{F}$. If $P_0$ is not maximal there exists $P_1\in\mathcal{F}$ such that $P_0\subsetneq P_1$. In particular we may choose $x\notin P_0$ such that $P_0\cup\{x\}\in\mathcal{F}$. The idea now is to keep repeating this process of adding points $x\in X$ until we construct a maximal element $P$ of $\mathcal{F}$. We now have to take care of some details. We may assume without loss of generality that $\tilde{\mathcal{F}}=\{P\in\mathcal{F}:P\text{ is not maximal}\}$ is a non-empty set. For $P\in\tilde{\mathcal{F}}$, let $P^*=\{x\in X\setminus P:P\cup\{x\}\in\mathcal{F}\}$. As the above argument shows, $P^*\neq\emptyset$ for all $P\in\tilde{\mathcal{F}}$. Using the axiom of choice, there exists $f\in\prod_{P\in\tilde{\mathcal{F}}}P^*$. We now define $g:\mathcal{F}\to\mathcal{F}$ by
\[
g(P)=\begin{cases}P&\text{if }P\text{ is maximal}\\ P\cup\{f(P)\}&\text{if }P\text{ is not maximal.}\end{cases} \tag{B.1}
\]
The proof is completed by Lemma B.8 below, which shows that $g$ must have a fixed point $P\in\mathcal{F}$. This fixed point is maximal by construction of $g$.
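When $X$ is finite, the map $g$ of Eq. (B.1) can simply be iterated until it stops changing, and no transfinite argument is needed. The sketch below does this for the divisibility order on $\{1,\dots,12\}$. The deterministic rule "add the smallest admissible point" stands in for the choice function $f$, and all names here (`is_chain`, `g`, `maximal_chain`, `divides`) are assumptions of this illustration.

```python
def is_chain(points, leq):
    """True if every two points are comparable under leq."""
    pts = list(points)
    return all(leq(p, q) or leq(q, p) for p in pts for q in pts)

def g(chain, universe, leq):
    """One step of the map in Eq. (B.1) with a deterministic choice function.

    If some x outside `chain` keeps it linearly ordered, add the smallest
    such x (our stand-in for f); otherwise return `chain` unchanged.
    """
    candidates = sorted(
        x for x in universe
        if x not in chain and is_chain(chain | {x}, leq)
    )
    return chain | {candidates[0]} if candidates else chain

def maximal_chain(start, universe, leq):
    """Iterate g until a fixed point is reached (guaranteed for finite universes)."""
    chain = frozenset(start)
    while True:
        nxt = g(chain, universe, leq)
        if nxt == chain:
            return chain
        chain = nxt

divides = lambda a, b: b % a == 0       # a <= b means "a divides b"
universe = frozenset(range(1, 13))
P = maximal_chain({3}, universe, divides)
```

Starting from the chain $\{3\}$ the iteration adds 1, then 6, then 12 and stops at $\{1,3,6,12\}$, a maximal linearly ordered subset. In the infinite case the iteration need not terminate, which is exactly the difficulty Lemma B.8 addresses.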
Lemma B.8. The function $g:\mathcal{F}\to\mathcal{F}$ defined in Eq. (B.1) has a fixed point.$^3$
Proof. The idea of the proof is as follows. Let $P_0\in\mathcal{F}$ be chosen arbitrarily. Notice that $\Phi=\{g^{(n)}(P_0)\}_{n=0}^\infty\subset\mathcal{F}$ is a linearly ordered set and it is therefore easily verified that $P_1=\bigcup_{n=0}^\infty g^{(n)}(P_0)\in\mathcal{F}$. Similarly we may repeat the process to construct $P_2=\bigcup_{n=0}^\infty g^{(n)}(P_1)\in\mathcal{F}$ and $P_3=\bigcup_{n=0}^\infty g^{(n)}(P_2)\in\mathcal{F}$, etc. Then take $P_\infty=\bigcup_{n=0}^\infty P_n$ and start again with $P_0$ replaced by $P_\infty$. Then keep going this way until eventually the sets stop increasing in size, in which case we have found our fixed point. The problem with this strategy is that we may never win. (This is very reminiscent of constructing measurable sets and the way out is to use measure theoretic like arguments.)
Let us now start the formal proof. Again let $P_0\in\mathcal{F}$ and let $\mathcal{F}_1=\{P\in\mathcal{F}:P_0\subset P\}$. Notice that $\mathcal{F}_1$ has the following properties:
1. $P_0\in\mathcal{F}_1$.
2. If $\Phi\subset\mathcal{F}_1$ is a totally ordered (by set inclusion) subset then $\bigcup\Phi\in\mathcal{F}_1$.
3. If $P\in\mathcal{F}_1$ then $g(P)\in\mathcal{F}_1$.
Let us call a general subset $\mathcal{F}'\subset\mathcal{F}$ satisfying these three conditions a tower and let
\[
\mathcal{F}_0=\bigcap\{\mathcal{F}':\mathcal{F}'\text{ is a tower}\}.
\]

$^3$ Here is an easy proof if the elements of $\mathcal{F}$ happened to all be finite sets and there existed a set $P\in\mathcal{F}$ with a maximal number of elements. In this case the condition that $P\subset g(P)$ would imply that $P=g(P)$, otherwise $g(P)$ would have more elements than $P$.

Standard arguments show that $\mathcal{F}_0$ is still a tower and clearly is the smallest tower containing $P_0$. (Morally speaking $\mathcal{F}_0$ consists of all of the sets we were trying to construct in the "idea section" of the proof.) We now claim that $\mathcal{F}_0$ is a linearly ordered subset of $\mathcal{F}$. To prove this let $\Gamma\subset\mathcal{F}_0$ be the linearly ordered set
\[
\Gamma=\{C\in\mathcal{F}_0:\text{for all }A\in\mathcal{F}_0\text{ either }A\subset C\text{ or }C\subset A\}.
\]
Shortly we will show that $\Gamma\subset\mathcal{F}_0$ is a tower and hence that $\mathcal{F}_0=\Gamma$. That is to say $\mathcal{F}_0$ is linearly ordered. Assuming this for the moment let us finish the proof.
Let $P\equiv\bigcup\mathcal{F}_0$, which is in $\mathcal{F}_0$ by property 2 and is clearly the largest element in $\mathcal{F}_0$. By 3. it now follows that $P\subset g(P)\in\mathcal{F}_0$ and by maximality of $P$, we have $g(P)=P$, the desired fixed point. So to finish the proof, we must show that $\Gamma$ is a tower. First off it is clear that $P_0\in\Gamma$ so in particular $\Gamma$ is not empty. For each $C\in\Gamma$ let
\[
\Phi_C:=\{A\in\mathcal{F}_0:\text{either }A\subset C\text{ or }g(C)\subset A\}.
\]
We will begin by showing that $\Phi_C\subset\mathcal{F}_0$ is a tower and therefore that $\Phi_C=\mathcal{F}_0$.
1. $P_0\in\Phi_C$ since $P_0\subset C$ for all $C\in\mathcal{F}_0$.
2. If $\Phi\subset\Phi_C\subset\mathcal{F}_0$ is totally ordered by set inclusion, then $A_\Phi:=\bigcup\Phi\in\mathcal{F}_0$. We must show $A_\Phi\in\Phi_C$, that is that $A_\Phi\subset C$ or $g(C)\subset A_\Phi$. Now if $A\subset C$ for all $A\in\Phi$, then $A_\Phi\subset C$ and hence $A_\Phi\in\Phi_C$. On the other hand if there is some $A\in\Phi$ such that $g(C)\subset A$ then clearly $g(C)\subset A_\Phi$ and again $A_\Phi\in\Phi_C$.
3. Given $A\in\Phi_C$ we must show $g(A)\in\Phi_C$, i.e. that
\[
g(A)\subset C\ \text{ or }\ g(C)\subset g(A). \tag{B.2}
\]
There are three cases to consider: either $A\subsetneq C$, $A=C$, or $g(C)\subset A$. In the case $A=C$, $g(C)=g(A)\subset g(A)$, and if $g(C)\subset A$ then $g(C)\subset A\subset g(A)$, so Eq. (B.2) holds in either of these cases. So assume that $A\subsetneq C$. Since $C\in\Gamma$, either $g(A)\subset C$ (in which case we are done) or $C\subset g(A)$. Hence we may assume that
\[
A\subsetneq C\subset g(A).
\]
Now if $C$ were a proper subset of $g(A)$ it would then follow that $g(A)\setminus A$ would consist of at least two points, which contradicts the definition of $g$. Hence we must have $g(A)=C\subset C$ and again Eq. (B.2) holds, so $\Phi_C$ is a tower.
It is now easy to show $\Gamma$ is a tower. It is again clear that $P_0\in\Gamma$, and Property 2. may be checked for $\Gamma$ in the same way as it was done for $\Phi_C$ above. For Property 3., if $C\in\Gamma$ we may use $\Phi_C=\mathcal{F}_0$ to conclude for all $A\in\mathcal{F}_0$, either $A\subset C\subset g(C)$ or $g(C)\subset A$, i.e. $g(C)\in\Gamma$. Thus $\Gamma$ is a tower and we are done.
C
Nets
In this section (which may be skipped) we develop the notion of nets. Nets are a generalization of sequences. Here is an example which shows that for general topological spaces, sequences are not always adequate.
Example C.1. Equip $\mathbb{C}^{\mathbb{R}}$ with the topology of pointwise convergence, i.e. the product topology, and consider $C(\mathbb{R},\mathbb{C})\subset\mathbb{C}^{\mathbb{R}}$. If $\{f_n\}\subset C(\mathbb{R},\mathbb{C})$ is a sequence such that $f_n\to f\in\mathbb{C}^{\mathbb{R}}$ pointwise, then $f$ is a Borel measurable function. Hence the sequential limits of elements in $C(\mathbb{R},\mathbb{C})$ are necessarily contained in the Borel measurable functions, which are properly contained in $\mathbb{C}^{\mathbb{R}}$. In short the sequential closure of $C(\mathbb{R},\mathbb{C})$ is a proper subset of $\mathbb{C}^{\mathbb{R}}$. On the other hand we have $\overline{C(\mathbb{R},\mathbb{C})}=\mathbb{C}^{\mathbb{R}}$. Indeed a typical open neighborhood of $f\in\mathbb{C}^{\mathbb{R}}$ is of the form
\[
N=\{g\in\mathbb{C}^{\mathbb{R}}:|g(x)-f(x)|<\varepsilon\text{ for }x\in\Lambda\},
\]
where $\varepsilon>0$ and $\Lambda$ is a finite subset of $\mathbb{R}$. Since $N\cap C(\mathbb{R},\mathbb{C})\neq\emptyset$ it follows that $f\in\overline{C(\mathbb{R},\mathbb{C})}$.
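Definition C.2. A directed set $(A,\le)$ is a set with a relation "$\le$" such that
1. $\alpha\le\alpha$ for all $\alpha\in A$.
2. If $\alpha\le\beta$ and $\beta\le\gamma$ then $\alpha\le\gamma$.
3. $A$ is cofinite, i.e. for all $\alpha,\beta\in A$ there exists $\gamma\in A$ such that $\alpha\le\gamma$ and $\beta\le\gamma$.
A net is a function $x:A\to X$ where $A$ is a directed set. We will often denote a net $x$ by $\{x_\alpha\}_{\alpha\in A}$.
Example C.3 (Directed sets).
1. $A=2^X$ ordered by inclusion, i.e. $\alpha\le\beta$ if $\alpha\subset\beta$. If $\alpha\le\beta$ and $\beta\le\gamma$ then $\alpha\subset\beta\subset\gamma$ and hence $\alpha\le\gamma$. Similarly if $\alpha,\beta\in 2^X$ then $\alpha,\beta\le\alpha\cup\beta=:\gamma$.
2. $A=2^X$ ordered by reverse inclusion, i.e. $\alpha\le\beta$ if $\beta\subset\alpha$. If $\alpha\le\beta$ and $\beta\le\gamma$ then $\alpha\supset\beta\supset\gamma$ and so $\alpha\le\gamma$, and if $\alpha,\beta\in A$ then $\alpha,\beta\le\alpha\cap\beta$.
3. Let $A=\mathbb{N}$ equipped with the usual ordering on $\mathbb{N}$. In this case nets are simply sequences.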
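For a small base set, the three conditions of Definition C.2 can be checked exhaustively for the orders in Example C.3. The sketch below does so for the subsets of $\{0,1,2\}$; the helper names `powerset` and `is_directed` and the choice of base set are assumptions made for illustration.

```python
from itertools import combinations

def powerset(base):
    """All subsets of `base`, as frozensets."""
    items = sorted(base)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_directed(elements, leq):
    """Check the three conditions of Definition C.2 by brute force."""
    reflexive = all(leq(a, a) for a in elements)
    transitive = all(
        leq(a, c)
        for a in elements for b in elements for c in elements
        if leq(a, b) and leq(b, c)
    )
    # Condition 3: every pair has a common upper bound in `elements`.
    has_upper_bounds = all(
        any(leq(a, g) and leq(b, g) for g in elements)
        for a in elements for b in elements
    )
    return reflexive and transitive and has_upper_bounds

A = powerset({0, 1, 2})
inclusion = lambda a, b: a <= b            # Example C.3, item 1
reverse_inclusion = lambda a, b: b <= a    # Example C.3, item 2
```

Both orders pass the check on the full power set. By contrast, the two-element family $\{\{0\},\{1\}\}$ ordered by inclusion fails condition 3, since neither set contains the other and no common upper bound is available within the family.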
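Definition C.4. Let $\{x_\alpha\}_{\alpha\in A}\subset X$ be a net. Then:
1. $x_\alpha$ converges to $x\in X$ (written $x_\alpha\to x$) iff for all $V\in\tau_x$, $x_\alpha\in V$ eventually, i.e. there exists $\beta=\beta_V\in A$ such that $x_\alpha\in V$ for all $\alpha\ge\beta$.
2. $x$ is a cluster point of $\{x_\alpha\}_{\alpha\in A}$ if for all $V\in\tau_x$, $x_\alpha\in V$ frequently, i.e. for all $\beta\in A$ there exists $\alpha\ge\beta$ such that $x_\alpha\in V$.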
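Proposition C.5. Let $X$ be a topological space and $E\subset X$. Then
1. $x$ is an accumulation point of $E$ (see Definition 13.29) iff there exists a net $\{x_\alpha\}\subset E\setminus\{x\}$ such that $x_\alpha\to x$.
2. $x\in\bar E$ iff there exists $\{x_\alpha\}\subset E$ such that $x_\alpha\to x$.
Proof.
1. Suppose $x$ is an accumulation point of $E$ and let $A=\tau_x$ be ordered by reverse set inclusion. To each $\alpha\in A=\tau_x$ choose $x_\alpha\in(\alpha\setminus\{x\})\cap E$, which is possible since $x$ is an accumulation point of $E$. Then given $V\in\tau_x$, for all $\alpha\ge V$ (i.e. $\alpha\subset V$), $x_\alpha\in V$ and hence $x_\alpha\to x$. Conversely if $\{x_\alpha\}_{\alpha\in A}\subset E\setminus\{x\}$ and $x_\alpha\to x$ then for all $V\in\tau_x$ there exists $\beta\in A$ such that $x_\alpha\in V$ for all $\alpha\ge\beta$. In particular $x_\alpha\in(E\setminus\{x\})\cap V\neq\emptyset$ and so $x\in\operatorname{acc}(E)$, the accumulation points of $E$.
2. If $\{x_\alpha\}\subset E$ is such that $x_\alpha\to x$ then for all $V\in\tau_x$ there exists $\beta\in A$ such that $x_\alpha\in V\cap E$ for all $\alpha\ge\beta$. In particular $V\cap E\neq\emptyset$ for all $V\in\tau_x$ and this implies $x\in\bar E$. For the converse recall Proposition 13.31 implies $\bar E=E\cup\operatorname{acc}(E)$. If $x\in\operatorname{acc}(E)$ there exists a net $\{x_\alpha\}\subset E$ such that $x_\alpha\to x$ by item 1. If $x\in E$ we may simply take $x_n=x$ for all $n\in A:=\mathbb{N}$.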
Proposition C.6. Let $X$ and $Y$ be two topological spaces and $f:X\to Y$ be a function. Then $f$ is continuous at $x\in X$ iff $f(x_\alpha)\to f(x)$ for all nets $x_\alpha\to x$.
Proof. If $f$ is continuous at $x$ and $x_\alpha\to x$ then for any $V\in\tau_{f(x)}$ there exists $W\in\tau_x$ such that $f(W)\subset V$. Since $x_\alpha\in W$ eventually, $f(x_\alpha)\in V$ eventually and we have shown $f(x_\alpha)\to f(x)$. Conversely, if $f$ is not continuous at $x$ then there exists $W\in\tau_{f(x)}$ such that $f(V)\not\subset W$ for all $V\in\tau_x$. Let $A=\tau_x$ be ordered by reverse set inclusion and for $V\in\tau_x$ choose (axiom of choice) $x_V\in V$ such that $f(x_V)\notin W$. Then $x_V\to x$ since for any $U\in\tau_x$, $x_V\in U$ if $V\ge U$ (i.e. $V\subset U$). On the other hand $f(x_V)\notin W$ for all $V\in\tau_x$, showing $f(x_V)\nrightarrow f(x)$.
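Definition C.7 (Subnet). A net $\langle y_\beta\rangle_{\beta\in B}$ is a subnet of a net $\langle x_\alpha\rangle_{\alpha\in A}$ if there exists a map $\beta\in B\mapsto\alpha_\beta\in A$ such that
1. $y_\beta=x_{\alpha_\beta}$ for all $\beta\in B$ and
2. for all $\alpha_0\in A$ there exists $\beta_0\in B$ such that $\alpha_\beta\ge\alpha_0$ whenever $\beta\ge\beta_0$, i.e. $\alpha_\beta\ge\alpha_0$ eventually.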
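Proposition C.8. A point $x\in X$ is a cluster point of a net $\langle x_\alpha\rangle_{\alpha\in A}$ iff there exists a subnet $\langle y_\beta\rangle_{\beta\in B}$ such that $y_\beta\to x$.
Proof. Suppose $\langle y_\beta\rangle_{\beta\in B}$ is a subnet of $\langle x_\alpha\rangle_{\alpha\in A}$ such that $y_\beta=x_{\alpha_\beta}\to x$. Then for $W\in\tau_x$ and $\alpha_0\in A$ there exists $\beta_0\in B$ such that $y_\beta=x_{\alpha_\beta}\in W$ for all $\beta\ge\beta_0$. Choose $\beta_1\in B$ such that $\alpha_\beta\ge\alpha_0$ for all $\beta\ge\beta_1$, then choose $\beta_2\in B$ such that $\beta_2\ge\beta_0$ and $\beta_2\ge\beta_1$. Then $\alpha_\beta\ge\alpha_0$ and $x_{\alpha_\beta}\in W$ for all $\beta\ge\beta_2$, which implies $x_\alpha\in W$ frequently.
Conversely assume $x$ is a cluster point of a net $\langle x_\alpha\rangle_{\alpha\in A}$. We make $B:=\tau_x\times A$ into a directed set by defining $(U,\gamma)\le(U',\gamma')$ iff $\gamma\le\gamma'$ and $U\supset U'$. For all $(U,\gamma)\in B=\tau_x\times A$, choose $\alpha_{(U,\gamma)}\ge\gamma$ in $A$ such that $y_{(U,\gamma)}=x_{\alpha_{(U,\gamma)}}\in U$; this is possible because $x_\alpha\in U$ frequently. Then if $\alpha_0\in A$, for all $(U',\gamma')\ge(U,\alpha_0)$, i.e. $\gamma'\ge\alpha_0$ and $U'\subset U$, we have $\alpha_{(U',\gamma')}\ge\gamma'\ge\alpha_0$, so $\langle y_{(U,\gamma)}\rangle$ is a subnet. Now if $W\in\tau_x$ is given, then $y_{(U,\gamma)}\in U\subset W$ for all $U\subset W$. Hence fixing $\alpha\in A$ we see if $(U,\gamma)\ge(W,\alpha)$ then $y_{(U,\gamma)}=x_{\alpha_{(U,\gamma)}}\in U\subset W$, showing that $y_{(U,\gamma)}\to x$.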
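Exercise C.1 (Folland #34, p. 121). Let $\langle x_\alpha\rangle_{\alpha\in A}$ be a net in a topological space and for each $\alpha\in A$ let $E_\alpha\equiv\{x_\beta:\beta\ge\alpha\}$. Then $x$ is a cluster point of $\langle x_\alpha\rangle$ iff $x\in\bigcap_{\alpha\in A}\overline{E_\alpha}$.
Solution to Exercise (C.1). If $x$ is a cluster point, then given $W\in\tau_x$ we know $E_\alpha\cap W\neq\emptyset$ for all $\alpha\in A$ since $x_\beta\in W$ frequently; thus $x\in\overline{E_\alpha}$ for all $\alpha$, i.e. $x\in\bigcap_{\alpha\in A}\overline{E_\alpha}$. Conversely if $x$ is not a cluster point of $\langle x_\alpha\rangle$ then there exist $W\in\tau_x$ and $\alpha\in A$ such that $x_\beta\notin W$ for all $\beta\ge\alpha$, i.e. $W\cap E_\alpha=\emptyset$. But this shows $x\notin\overline{E_\alpha}$ and hence $x\notin\bigcap_{\alpha\in A}\overline{E_\alpha}$.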
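Theorem C.9. A topological space $X$ is compact iff every net has a cluster point iff every net has a convergent subnet.
Proof. By Proposition C.8 a net has a cluster point iff it has a convergent subnet, so it suffices to prove the first equivalence. Suppose $X$ is compact, $\langle x_\alpha\rangle_{\alpha\in A}\subset X$ is a net and let $F_\alpha:=\overline{\{x_\beta:\beta\ge\alpha\}}$. Then $F_\alpha$ is closed for all $\alpha\in A$, $F_\alpha\subset F_{\alpha'}$ if $\alpha\ge\alpha'$ and
\[
F_{\alpha_1}\cap\dots\cap F_{\alpha_n}\supset F_\gamma\quad\text{whenever }\gamma\ge\alpha_i\text{ for }i=1,\dots,n.
\]
(Such a $\gamma$ always exists since $A$ is a directed set.) Therefore $F_{\alpha_1}\cap\dots\cap F_{\alpha_n}\neq\emptyset$, i.e. $\{F_\alpha\}_{\alpha\in A}$ has the finite intersection property, and since $X$ is compact this implies there exists $x\in\bigcap_{\alpha\in A}F_\alpha$. By Exercise C.1, it follows that $x$ is a cluster point of $\langle x_\alpha\rangle_{\alpha\in A}$.
Conversely, if $X$ is not compact let $\{U_j\}_{j\in J}$ be an infinite cover with no finite subcover. Let $A$ be the directed set $A=\{\alpha\subset J:\#(\alpha)<\infty\}$ with $\alpha\le\beta$ iff $\alpha\subset\beta$. Define a net $\langle x_\alpha\rangle_{\alpha\in A}$ in $X$ by choosing
\[
x_\alpha\in X\setminus\Bigl(\bigcup_{j\in\alpha}U_j\Bigr)\neq\emptyset\quad\text{for all }\alpha\in A.
\]
This net has no cluster point. To see this suppose $x\in X$ and $j\in J$ is chosen so that $x\in U_j$. Then for all $\alpha\ge\{j\}$ (i.e. $j\in\alpha$), $x_\alpha\notin\bigcup_{i\in\alpha}U_i$ and in particular $x_\alpha\notin U_j$. This shows $x_\alpha\notin U_j$ eventually, so $x_\alpha$ is not frequently in the neighborhood $U_j$ of $x$, and hence $x$ is not a cluster point.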
References
1. Vladimir I. Bogachev, Gaussian measures, Mathematical Surveys and Mono-
graphs, vol. 62, American Mathematical Society, Providence, RI, 1998. MR
2000a:60004
2. Isaac Chavel, Riemannian geometrya modern introduction, Cambridge Uni-
versity Press, Cambridge, 1993. MR 95j:53001
3. , The Laplacian on Riemannian manifolds, Spectral theory and geometry
(Edinburgh, 1998), Cambridge Univ. Press, Cambridge, 1999, pp. 3075. MR
2001c:58029
4. Giuseppe Da Prato and Jerzy Zabczyk, Stochastic equations in innite dimen-
sions, Cambridge University Press, Cambridge, 1992. MR 95g:60073
5. E. B. Davies, Heat kernels and spectral theory, Cambridge Tracts in Mathemat-
ics, vol. 92, Cambridge University Press, Cambridge, 1990. MR 92a:35035
6. , Heat kernel bounds for higher order elliptic operators, Journees
Equations aux Derivees Partielles (Saint-Jean-de-Monts, 1995),

Ecole Poly-
tech., Palaiseau, 1995, pp. Exp. No. III, 11. MR 96i:35020
7. F. G. Friedlander, Introduction to the theory of distributions, second ed., Cam-
bridge University Press, Cambridge, 1998, With additional material by M. Joshi.
MR 2000g:46002: Call No. QA324 .F74 1998
8. Enrico Giusti, Minimal surfaces and functions of bounded variation, Birkhauser
Verlag, Basel, 1984. MR 87a:58041
9. Leonard Gross, Integration and nonlinear transformations in Hilbert space,
Trans. Amer. Math. Soc. 94 (1960), 404440. MR 22 #2883
10. , Abstract Wiener spaces, Proc. Fifth Berkeley Sympos. Math. Statist.
and Probability (Berkeley, Calif., 1965/66), Vol. II: Contributions to Probability
Theory, Part 1, Univ. California Press, Berkeley, Calif., 1967, pp. 3142. MR 35
#3027
11. Ronald B. Guenther and John W. Lee, Partial dierential equations of math-
ematical physics and integral equations, Dover Publications Inc., Mineola, NY,
1996, Corrected reprint of the 1988 original. MR 97e:35001
12. Hui Hsiung Kuo, Gaussian measures in Banach spaces, Springer-Verlag, Berlin,
1975, Lecture Notes in Mathematics, Vol. 463. MR 57 #1628
13. Serge Lang, Real and functional analysis, third ed., Graduate Texts in Mathe-
matics, vol. 142, Springer-Verlag, New York, 1993. MR 94b:00005
20 References
14. Lynn H. Loomis, An introduction to abstract harmonic analysis, D. Van Nos-
trand Company, Inc., Toronto-New York-London, 1953. MR 14,883c
15. Vladimir G. Mazja, Sobolev spaces, Springer Series in Soviet Mathematics,
Springer-Verlag, Berlin, 1985, Translated from the Russian by T. O. Shaposh-
nikova. MR 87g:46056
16. A. Pazy, Semigroups of linear operators and applications to partial dierential
equations, Applied Mathematical Sciences, vol. 44, Springer-Verlag, New York,
1983. MR 85g:47061
17. Michael Reed and Barry Simon, Methods of modern mathematical physics. II.
Fourier analysis, self-adjointness, Academic Press [Harcourt Brace Jovanovich
Publishers], New York, 1975. MR 58 #12429b
18. , Methods of modern mathematical physics. I, second ed., Academic Press
Inc. [Harcourt Brace Jovanovich Publishers], New York, 1980, Functional anal-
ysis. MR 85e:46002
19. L. C. G. Rogers and David Williams, Diusions, Markov processes, and mar-
tingales. Vol. 1, Cambridge Mathematical Library, Cambridge University Press,
Cambridge, 2000, Foundations, Reprint of the second (1994) edition. MR
2001g:60188
20. Laurent Salo-Coste, Aspects of Sobolev-type inequalities, London Mathematical
Society Lecture Note Series, vol. 289, Cambridge University Press, Cambridge,
2002. MR 2003c:46048
21. Robert Schatten, Norm ideals of completely continuous operators, Second print-
ing. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 27, Springer-
Verlag, Berlin, 1970. MR 41 #2449
22. B. Simon, Convergence in trace ideals, Proc. Amer. Math. Soc. 83 (1981), no. 1,
3943. MR 82h:47042
23. Barry Simon, Trace ideals and their applications, London Mathematical Society
Lecture Note Series, vol. 35, Cambridge University Press, Cambridge, 1979. MR
80k:47048
24. S. L. Sobolev and E. R. Dawson, Partial dierential equations of mathematical
physics, Translated from the third Russian edition by E. R. Dawson; English
translation edited by T. A. A. Broadbent, Pergamon Press, Oxford, 1964. MR
31 #2478
25. Elias M. Stein, Singular integrals and dierentiability properties of functions,
Princeton Mathematical Series, No. 30, Princeton University Press, Princeton,
N.J., 1970. MR 44 #7280
26. Frank W. Warner, Foundations of dierentiable manifolds and Lie groups, Grad-
uate Texts in Mathematics, vol. 94, Springer-Verlag, New York, 1983, Corrected
reprint of the 1971 edition. MR 84k:58001
