Heinz Bauer
Measure and
Integration Theory
de Gruyter Studies in Mathematics 26
Editors: Carlos Kenig, Andrew Ranicki, Michael Röckner
Walter de Gruyter
Berlin New York 2001
Author
Heinz Bauer
Mathematisches Institut der Universität Erlangen-Nürnberg
Bismarckstraße 1 1/2
91054 Erlangen
Germany

Translator
Robert B. Burckel
Department of Mathematics
Kansas State University
137 Cardwell Hall
Manhattan, Kansas 66506-2602
USA
Series Editors

Carlos E. Kenig
Department of Mathematics
University of Chicago
5734 University Ave
Chicago, IL 60637
USA

Andrew Ranicki
Department of Mathematics
University of Edinburgh
Mayfield Road
Edinburgh EH9 3JZ
Scotland

Michael Röckner
Fakultät für Mathematik
Universität Bielefeld
Universitätsstraße 25
33615 Bielefeld
Germany
Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.
Bauer, Heinz:
Measure and integration theory / Heinz Bauer. Transl. from the German
by Robert B. Burckel. - Berlin ; New York : de Gruyter, 2001
(De Gruyter studies in mathematics ; 26)
Einheitssacht.: Maß- und Integrationstheorie (engl.)
ISBN 3-11-016719-0
© Copyright 2001 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin, Germany.
All rights reserved including those of translation into foreign languages. No part of this book may be
reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or
any information storage and retrieval system, without permission in writing from the publisher.
Printed in Germany.
Typesetting: Oldřich Ulrych, Prague, Czech Republic.
Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen.
Cover design: Rudolf Hübler, Berlin.
In memoriam
OTTO HAUPT
(5.3.1887 - 10.11.1988)
former Professor of Mathematics
at the University of Erlangen
Preface
they will know. Finally, I thank my publisher Walter de Gruyter & Co., and, above
all, Dr. Manfred Karbe for publishing the translation of my book.
Introduction
Measure theory and integration are closely interwoven theories, both content-wise
and in their historical developments. They form a unit. The development of analy-
sis in the 19th century - here one is thinking especially about the theory of Fourier
series and classical function theory - compelled the creation of a sufficiently gen-
eral concept of the integral that discontinuous functions could also be integrated.
The jump function of P. G. LEJEUNE DIRICHLET should be seen in this light. At
that time only an integration theory due to CAUCHY, a precursor of Riemann's,
was known. And it was not until B. RIEMANN's Habilitation in 1854 (text pub-
lished posthumously in 1867) that Cauchy's ideas were made sufficiently precise
to integrate (certain) discontinuous functions. For the first time the need was felt
for integrability criteria. Parallel to this a "theory of content" was evolving - pri-
marily at the hands of G. PEANO and C. JORDAN - to measure the areas of plane
and the volumes of spatial "figures".
But the decisive breakthrough occurred at the turn of the century, thanks to
the French mathematicians EMILE BOREL and HENRI LEBESGUE. In 1898 Borel -
coming from the direction of function theory - described the "$\sigma$-algebra" of sets
that today bear his name, the Borel sets, and showed how to construct a "measure"
on this $\sigma$-algebra that satisfactorily resolved the problems of measuring content.
In particular, he recognized the significance of the "$\sigma$-additivity" of the measure.
In his thesis (1902) LEBESGUE presented the integral concept, subsequently named
after him, that proved decisive for the development of a general theory. At the same
time he furnished the tools needed to make Borel's ideas more precise. From then
on Lebesgue-Borel measure on the $\sigma$-algebra of Borel sets and Lebesgue measure
on a somewhat larger $\sigma$-algebra - consisting of the sets which are "measurable" in
Lebesgue's sense - became standard methods of analysis.
What was new about Lebesgue's integral concept was not just the way it was
defined, but also - and this was the real reason for its fame - its great versatility as
manifested in the way it behaved with respect to limit operations. Consequently
the convergence theorems are at the center of the integration theory developed by
Lebesgue and his intellectual progeny.
Subsequent developments are characterized by increasing recognition of the
versatility of Lebesgue's concepts in dealing with new demands from mathematics
and its applications. In the course of time (up to 1930) the general (abstract) mea-
sure concept crystallized, and a theory of integration built on it - after Lebesgue's
model.
It is this theory that will be developed here in an introductory fashion, but
far enough that from the platform so erected the reader can easily press ahead to
deeper questions and the manifold applications. Areas in which measure and inte-
gration play a key role are, for example, ergodic theory, spectral theory, harmonic
analysis on locally compact groups, and mathematical economics. But the fore-
most example is probability theory, which uses measure and integration as an
indispensable tool and whose own specific kinds of questions and methods have
in turn helped to shape the former. Even today the development of measure and
integration theory is far from finished.
The book is comprised of four chapters. The first is devoted to the measure
concept and in particular to the Lebesgue-Borel measure and its interplay with
geometry. In the second chapter the integral determined by a measure, and in
particular the Lebesgue integral, the one determined by Lebesgue-Borel measure,
will be introduced and investigated. The short third chapter deals with the product
of measures and the associated integration. An application of this which is very
important in Fourier analysis is the convolution of measures. In the fourth and
last chapter the abstract concept of measure is made more concrete in the form
of Radon measures. As in the original example of Lebesgue-Borel measure, here
the relation of the measure to a topology on the underlying set moves into the
foreground. Essentially two kinds of spaces are allowed: Polish spaces and locally
compact spaces. The topological tools needed for this will mostly be developed in
the text, with the reader occasionally being given only a reference (very specific)
to the standard textbook literature.
The examples accompanying the exposition of a theme have an important func-
tion. They are supposed to illuminate the concepts and illustrate the limitations
of the theory. The reader should therefore work through them with care. Exer-
cises also accompany the exposition. They are not essential to understanding later
developments and, in particular, proofs are not superficially shortened by con-
signing parts to the exercises. But the exercises do serve to deepen the reader's
understanding of the material treated in the text, and working them is strongly
recommended.
Notations
Here we assemble some of the notation and phraseology which will be used in the
text without further comment and which - with but a few exceptions - are in
general use.
Of course, $a^+ \ge 0$ and $a^- \ge 0$ for all $a$. For finitely many $a_1, \ldots, a_n \in \overline{\mathbb{R}}$ the
corresponding expressions $a_1 \vee \ldots \vee a_n$ and $a_1 \wedge \ldots \wedge a_n$ stand for $\max\{a_1, \ldots, a_n\}$
and $\min\{a_1, \ldots, a_n\}$, respectively.
For the set-theoretic operations we use the usual symbols: $\cup$ or $\bigcup$ for union,
$\cap$ or $\bigcap$ for intersection, and the prefix $\complement$ to signify complementation. The set-theoretic
relation of inclusion is written $A \subset B$, and equality of the sets is not
thereby excluded. For the difference set $A \cap \complement B$, the set of all $x \in A$ such that
$x \notin B$, we also write $A \setminus B$. Sets $A$ and $B$ which have an empty intersection, that
is, for which $A \cap B = \emptyset$, are said to be disjoint.
The power set $\mathscr{P}(\Omega)$ of a set $\Omega$ is the set of all subsets of $\Omega$, including the
empty set $\emptyset$. A set $A$ will be called countable if it is either finite or denumerably
infinite. In other words, we will be using "countable" in lieu of the equally popular
expression "at most countable". Obviously the empty set is to be understood as
a finite set. A set will be called non-denumerable or uncountable if it is neither
finite nor denumerable.
Mappings of a set $A$ into a set $B$ will be denoted by $f : A \to B$ or by the
mapping prescription $x \mapsto f(x)$ (with $x \in A$). In case $B = \mathbb{R}$ we speak of a real
function or a real-valued function on $A$. Not universal, but useful for our purposes,
is the designation numerical function on $A$ for mappings $f : A \to \overline{\mathbb{R}}$ into the
extended number line. The restriction of a mapping $f : A \to B$ to a subset $A'$
of $A$ will be denoted by $f \mid A'$. The composition of $f$ with a mapping $g : B \to C$
will be denoted $g \circ f$ and the pre-image or inverse-image of a set $B' \subset B$ under
the mapping $f$ will be denoted $f^{-1}(B')$.
A sequence in a set $A$ is a mapping $f : \mathbb{N} \to A$ of the set $\mathbb{N}$ of natural numbers
into $A$. Designating the image element $f(n)$ by $a_n$, we also write $(a_n)_{n \in \mathbb{N}}$
or simply $(a_n)$ for the mapping $f$. If other index sets, e.g., $\mathbb{Z}_+ =
\{0, 1, \ldots\}$, come up, this notation is appropriately modified to, e.g., $(a_n)_{n \in \mathbb{Z}_+}$ or
$(a_n)_{n = 0, 1, \ldots}$. In the same way finite sets are often exhibited as $(a_i)_{i = 1, \ldots, n}$ with
$n \in \mathbb{N}$. Even more generally, we write mappings $f : I \to A$ of a set $I$ into the
set $A$ as $(a_i)_{i \in I}$, understanding by $a_i$ the element $f(i)$ of $A$. And we then speak of
a family in $A$ (with index set $I$).
If the terms of a sequence $(a_n)_{n \in \mathbb{N}}$ in a set $A$ from some index $n_0 \in \mathbb{N}$ onwards
possess a certain property, that is, if there are but finitely many exceptional indices,
we say that ultimately all terms of the sequence have the property. The popular
phrasing "almost all terms of the sequence possess the property" has to be avoided
in measure theory because there the concept "almost all" is employed in another
sense.
Preface
Introduction
Notations
Bibliography
Symbol Index
Name Index
Subject Index
Chapter I
Measure Theory
To geometrically simple subsets of the line, the plane, and 3-dimensional space,
elementary geometry assigns "numerical measures" called length, area and volume.
At first all that is intuitively clear is how the length of a segment, the area of
a rectangle and the volume of a box should be defined. Proceeding from these we
can determine by elementary geometric methods the lengths, areas, and volumes
of more complicated sets if we accept certain calculational rules for dealing with
such numerical measures.
If one thinks for example about the elementary determination of the area of
a (topologically) open triangle, one begins by decomposing it via one of its altitudes
into two open right triangles and the altitude itself. One further recalls that every
right triangle arises from insertion of a diagonal into an appropriate rectangle.
Every line segment is assigned numerical measure 0 when considered as a surface.
The following two rules of calculation therefore lead to the determination of the
areas of triangles:
(A) If the set $A$ has numerical measure $\alpha$, and $B$ is congruent to $A$, then $B$ also
has numerical measure $\alpha$.
(B) If $A$ and $B$ are disjoint sets with numerical measures $\alpha$ and $\beta$, resp., then
$A \cup B$ has numerical measure $\alpha + \beta$.
The limits of such elementary geometric considerations are already reached in
defining the area of an open disk K, to which end one proceeds thus: A sequence of
open $3 \cdot 2^{n-1}$-gons $E_n$ ($n \in \mathbb{N}$) is inscribed in $K$, with $E_1$ being an open equilateral
triangle, and the vertices of $E_{n+1}$ being those of $E_n$ together with the intersections
of the circle with the radii perpendicular to the sides of $E_n$. Thus $E_{n+1}$ consists
of $E_n$ together with its $3 \cdot 2^{n-1}$ edges and the open isosceles triangles which have
these edges as hypotenuses and vertices on the circle. Since K is the union of all
the En, it looks like a "mosaic of triangles", that is, like a union of disjoint open
triangles and segments (namely, common sides of various triangles). The following
broader formulation of (B) therefore leads to a definition of the area of the disk K:
(C) If $(A_n)$ is a sequence of pairwise disjoint sets, and $A_n$ has numerical measure
$\alpha_n$ ($n \in \mathbb{N}$), then $\bigcup_{n=1}^{\infty} A_n$ has numerical measure $\sum_{n=1}^{\infty} \alpha_n$.
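The exhaustion of the disk by inscribed polygons can be followed numerically. The sketch below is only an illustration: the function name, the choice of the unit circle and the closed formula for the area of a regular polygon are assumptions added here, not part of the text. It lists the areas of $E_1, E_2, \ldots$, whose increments are the areas of the newly added triangles that rule (C) sums up.

```python
import math


def inscribed_polygon_area(n: int, radius: float = 1.0) -> float:
    """Area of the regular (3 * 2**(n-1))-gon E_n inscribed in a circle."""
    sides = 3 * 2 ** (n - 1)
    # A regular polygon with m sides inscribed in a circle of radius r splits
    # into m isosceles triangles with apex angle 2*pi/m at the centre,
    # each of area r**2 * sin(2*pi/m) / 2.
    return 0.5 * sides * radius ** 2 * math.sin(2 * math.pi / sides)


# Areas of E_1, ..., E_20 for the unit disk.
areas = [inscribed_polygon_area(n) for n in range(1, 21)]

# Each increment is the total area of the triangles added in passing
# from E_n to E_(n+1).
increments = [later - earlier for earlier, later in zip(areas, areas[1:])]
```

Under these assumptions the areas increase with $n$ and approach $\pi$, the value that (C) assigns to the open unit disk.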
If we replace $K$ and every $E_n$ by its topological closure, this method would not
lead to a plausible definition of the area of a closed disk $\overline{K}$, because $\overline{K}$ is not the
union of the closures $\overline{E}_n$ of the above constructed polygons $E_n$. A peculiarity and
disadvantage of the elementary geometric procedure is precisely the necessity of
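is a $\sigma$-algebra in $\Omega'$, called the trace of $\mathscr{A}$ in $\Omega'$. In case $\Omega' \in \mathscr{A}$, $\Omega' \cap \mathscr{A}$ consists
simply of all the subsets of $\Omega'$ which are elements of $\mathscr{A}$.
4. Let $\Omega$, $\Omega'$ be sets, $\mathscr{A}'$ a $\sigma$-algebra in $\Omega'$, and $T : \Omega \to \Omega'$ a mapping. Then the
system of sets
(1.5) $T^{-1}(\mathscr{A}') := \{T^{-1}(A') : A' \in \mathscr{A}'\}$
is a $\sigma$-algebra in $\Omega$, as follows from the known behavior of the set-theoretic operations
under inverse mappings (like $T^{-1}$ here).
(1.7) $(A_n)_{n \in \mathbb{N}} \subset \mathscr{A} \implies \bigcap_{n \in \mathbb{N}} A_n \in \mathscr{A}$.
These follow from (1.1)-(1.3) and the identities $\emptyset = \complement\Omega$ and $\bigcap A_n = \complement(\bigcup \complement A_n)$.
Moreover,
$A_1 \cup \ldots \cup A_n = A_1 \cup \ldots \cup A_n \cup \emptyset \cup \emptyset \cup \ldots$
and
$A_1 \cap \ldots \cap A_n = A_1 \cap \ldots \cap A_n \cap \Omega \cap \Omega \cap \ldots$
Therefore, along with any finite number of sets which $\mathscr{A}$ contains, it also contains
their union and their intersection. From this observation and (1.2) follows as well:
(1.8) $A, B \in \mathscr{A} \implies A \setminus B = A \cap \complement B \in \mathscr{A}$.
For constructing $\sigma$-algebras the following theorem is important:
Its proof is just a routine check of properties (1.1)-(1.3). It follows that for every
system $\mathscr{E}$ of subsets of $\Omega$ there is a smallest $\sigma$-algebra $\sigma(\mathscr{E})$ which contains $\mathscr{E}$; that
is, $\sigma(\mathscr{E})$ is a $\sigma$-algebra in $\Omega$ with the defining properties
(i) $\mathscr{E} \subset \sigma(\mathscr{E})$,
(ii) for every $\sigma$-algebra $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$, $\sigma(\mathscr{E}) \subset \mathscr{A}$.
For a proof, consider the system $\Sigma$ of all $\sigma$-algebras $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$;
for example, $\mathscr{P}(\Omega)$ is an element of $\Sigma$. Then $\sigma(\mathscr{E})$ is the intersection of all the
$\mathscr{A} \in \Sigma$, which according to 1.2 possesses all the desired properties.
$\sigma(\mathscr{E})$ is called the $\sigma$-algebra generated by $\mathscr{E}$ (in $\Omega$) and $\mathscr{E}$ is called a generator
of $\sigma(\mathscr{E})$.
Examples. 5. If $\mathscr{E}$ itself is a $\sigma$-algebra in $\Omega$, then $\mathscr{E} = \sigma(\mathscr{E})$.
6. If $\mathscr{E}$ consists of a single set $A \subset \Omega$, then $\sigma(\mathscr{E}) = \{\emptyset, A, \complement A, \Omega\}$.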
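For a finite set $\Omega$ the generated $\sigma$-algebra can be computed by brute force, which makes Example 6 easy to check by machine. The following is a minimal sketch under the assumption that $\Omega$ is finite, so that closure under complements and pairwise unions already yields a $\sigma$-algebra; the function name `generated_sigma_algebra` is hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def generated_sigma_algebra(omega, generator):
    """Smallest system of subsets of a finite omega that contains the
    generator and is closed under complements and unions."""
    omega = frozenset(omega)
    system = {omega} | {frozenset(s) for s in generator}
    while True:
        # Candidates produced by one round of complements and pairwise unions.
        new = {omega - a for a in system}
        new |= {a | b for a, b in combinations(system, 2)}
        if new <= system:
            return system
        system |= new


omega = {1, 2, 3, 4}
single = generated_sigma_algebra(omega, [{1}])
double = generated_sigma_algebra(omega, [{1}, {2}])
```

Here `single` consists of exactly the four sets $\emptyset$, $A$, $\complement A$, $\Omega$ for $A = \{1\}$, in agreement with Example 6, while `double` has the atoms $\{1\}$, $\{2\}$, $\{3, 4\}$ and therefore eight elements.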
1.3 Definition. A system $\mathscr{R}$ of subsets of a set $\Omega$ is called a ring (in $\Omega$) if it has
the following properties:
(1.9) $\emptyset \in \mathscr{R}$;
(1.10) $A, B \in \mathscr{R} \implies A \setminus B \in \mathscr{R}$;
(1.11) $A, B \in \mathscr{R} \implies A \cup B \in \mathscr{R}$.
If in addition
(1.12) $\Omega \in \mathscr{R}$
then $\mathscr{R}$ is called an algebra (in $\Omega$).
A ring contains with each two of its sets (and so, with each finite collection of
its sets) not only their union, but also their intersection. This is because $A \cap B =
A \setminus (A \setminus B)$.
Proof. By definition an algebra has properties (1.1) and (1.11) and (1.10), and from
the latter follows (1.2). The converse follows from the fact that $\emptyset = \complement\Omega$, together
with the set-theoretic identity
$A \setminus B = A \cap \complement B = \complement(B \cup \complement A)$. □
Examples. 8. Every $\sigma$-algebra is an algebra.
9. For any set $\Omega$ the system of all sets $A \subset \Omega$ which are either finite or co-finite
(i.e., have finite complement in $\Omega$) is an algebra, but is a $\sigma$-algebra only if $\Omega$ is
finite.
10. The system of all finite subsets of a set $\Omega$ is a ring, but is an algebra only
if $\Omega$ itself is finite.
11. The smallest ring of subsets of a set $\Omega$ is $\{\emptyset\}$, the system consisting of the empty set alone.
Exercises.
1. For every system $\mathscr{E}$ of subsets of a set $\Omega$ there exists a smallest ring $\rho(\mathscr{E})$
in $\Omega$ which contains $\mathscr{E}$. It is called the ring generated by $\mathscr{E}$. Prove this existence
assertion. Determine $\rho(\mathscr{E})$ and $\sigma(\mathscr{E})$ in the case where $\mathscr{E}$ consists of two subsets
$A$, $B$ of $\Omega$. When does $\rho(\mathscr{E}) = \sigma(\mathscr{E})$ hold in this latter case; when does it hold for
general $\mathscr{E}$?
§2. Dynkin systems
Every Dynkin system $\mathscr{D}$ thus contains the empty set $\emptyset = \complement\Omega$, and then (2.3)
also insures that $\mathscr{D}$ contains the union of every finite, pairwise disjoint collection
of its sets.
The precise connection between the concepts of $\sigma$-algebra and Dynkin system
is elucidated in the following considerations:
2.2 Lemma. Every Dynkin system $\mathscr{D}$ is closed with respect to the formation of
proper complements, meaning that
(2.2') $D, E \in \mathscr{D}$, $D \subset E \implies E \setminus D \in \mathscr{D}$.
Proof. According to what was noted right after definition 2.1, the set $D \cup \complement E$,
being the union of the disjoint sets $D$ and $\complement E$ from $\mathscr{D}$, lies in $\mathscr{D}$. But then the
complement of this set with respect to $\Omega$, that is, $E \cap \complement D = E \setminus D$, lies in $\mathscr{D}$.
Consequently, Dynkin systems can also be defined via properties (2.1), (2.2')
and (2.3).
Proof. What needs to be shown is that every Dynkin system $\mathscr{D}$ which is closed
under finite intersections is a $\sigma$-algebra. Of the defining properties of a $\sigma$-algebra,
only (1.3) needs to be confirmed and we do that thus: According to (2.2') and
the closure hypothesis, $A \setminus B = A \setminus (A \cap B)$ lies in $\mathscr{D}$ whenever $A, B \in \mathscr{D}$. Since
$(A \setminus B) \cap B = \emptyset$ and $A \cup B = (A \setminus B) \cup B$, $\mathscr{D}$ contains the union of any two, hence
the union of any finitely many, of its elements. For any sequence $(D_n)_{n \in \mathbb{N}} \subset \mathscr{D}$,
we have
$\bigcup_{n=1}^{\infty} D_n = \bigcup_{n=0}^{\infty} (D'_{n+1} \setminus D'_n)$
in which $D'_0 := \emptyset$ and $D'_n := D_1 \cup \ldots \cup D_n$ for each $n \in \mathbb{N}$. The sets $D'_{n+1} \setminus D'_n$ are
pairwise disjoint and, thanks to (2.2') and what has already been proved, they lie
in $\mathscr{D}$. According to (2.3) then the union of the sets $D_n$ lies in $\mathscr{D}$. □
Just as for $\sigma$-algebras, algebras and rings, every system $\mathscr{E} \subset \mathscr{P}(\Omega)$ lies in
a smallest Dynkin system. It is, of course, called the Dynkin system generated
by $\mathscr{E}$, and is denoted $\delta(\mathscr{E})$.
The significance of Dynkin systems lies primarily in the following fact:
2.4 Theorem. Every $\mathscr{E} \subset \mathscr{P}(\Omega)$ which is closed with respect to finite intersection
satisfies
(2.4) $\delta(\mathscr{E}) = \sigma(\mathscr{E})$.
Proof. Since every $\sigma$-algebra is a Dynkin system, $\sigma(\mathscr{E})$ is a Dynkin system containing
$\mathscr{E}$ and consequently $\delta(\mathscr{E}) \subset \sigma(\mathscr{E})$. If conversely, $\delta(\mathscr{E})$ were known to be
a $\sigma$-algebra, the dual relation $\sigma(\mathscr{E}) \subset \delta(\mathscr{E})$ would also follow. In view of 2.3 therefore
it suffices to show that $\delta(\mathscr{E})$ is closed under intersection. To prove this, we
introduce for every $D \in \delta(\mathscr{E})$ the system
$\mathscr{D}_D := \{Q \in \mathscr{P}(\Omega) : Q \cap D \in \delta(\mathscr{E})\}$.
A routine check confirms that $\mathscr{D}_D$ is a Dynkin system. For every $E \in \mathscr{E}$ the
hypothesis on $\mathscr{E}$ insures that $\mathscr{E} \subset \mathscr{D}_E$ and therewith that $\delta(\mathscr{E}) \subset \mathscr{D}_E$. Thus for
every $D \in \delta(\mathscr{E})$ and every $E \in \mathscr{E}$ we have $E \cap D \in \delta(\mathscr{E})$; that is, $\mathscr{E} \subset \mathscr{D}_D$, and
consequently $\delta(\mathscr{E}) \subset \mathscr{D}_D$, holding for every $D \in \delta(\mathscr{E})$. But this is just the property
of $\delta(\mathscr{E})$ that had to be confirmed. □
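On a finite set, Theorem 2.4 can be observed directly. The sketch below is an illustration under stated assumptions: $\Omega$ is finite (so countable unions reduce to finite ones), and the helper `closure` with its flag `disjoint_only` is a hypothetical name introduced here. With `disjoint_only=True` it closes a system under complements and unions of disjoint pairs, giving $\delta(\mathscr{E})$; with `disjoint_only=False` it allows all pairwise unions, giving $\sigma(\mathscr{E})$.

```python
from itertools import combinations


def closure(omega, generator, disjoint_only):
    """Generated Dynkin system (disjoint_only=True) or generated
    sigma-algebra (disjoint_only=False) on a finite set omega."""
    omega = frozenset(omega)
    system = {omega} | {frozenset(s) for s in generator}
    while True:
        new = {omega - a for a in system}
        new |= {a | b for a, b in combinations(system, 2)
                if not disjoint_only or not (a & b)}
        if new <= system:
            return system
        system |= new


omega = {1, 2, 3, 4}

# A generator that is closed under finite intersections.
stable = [{1}, {1, 2}, {1, 2, 3}]
delta_stable = closure(omega, stable, disjoint_only=True)
sigma_stable = closure(omega, stable, disjoint_only=False)

# A generator that is not closed under intersection ({1,2} ∩ {2,3} = {2}).
unstable = [{1, 2}, {2, 3}]
delta_unstable = closure(omega, unstable, disjoint_only=True)
sigma_unstable = closure(omega, unstable, disjoint_only=False)
```

For the intersection-stable generator both closures agree, as (2.4) asserts. For the second generator the Dynkin system has only six elements and does not contain $\{2\}$, so the hypothesis of the theorem cannot simply be dropped.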
Exercise.
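Determine the Dynkin system generated by the system $\mathscr{E}$ consisting of just two
subsets $A$, $B$ of $\Omega$. Show that $\delta(\mathscr{E})$ and $\sigma(\mathscr{E})$ coincide just in case one of the sets
$A \cap B$, $A \cap \complement B$, $B \cap \complement A$ or $\complement A \cap \complement B$ is empty.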
and for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$
(3.2) $\mu(\bigcup_{n=1}^{\infty} A_n) = \sum_{n=1}^{\infty} \mu(A_n)$
(for every two and therewith) for every finitely many pairwise disjoint sets $A_1, \ldots,
A_n \in \mathscr{R}$.
Due to (3.1) every premeasure is evidently a content. To see this, you have only
to take $A_{n+1} = A_{n+2} = \ldots = \emptyset$ in (3.2).
Examples. 1. For every ring $\mathscr{R}$ in $\Omega$ and every point $\omega \in \Omega$ the function $\varepsilon_\omega$,
defined on $\mathscr{R}$ by
$\varepsilon_\omega(A) := \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A \end{cases}$
is a premeasure. It is called the premeasure defined by unit mass at $\omega$.
2. Let $\mathscr{A}$ be the $\sigma$-algebra defined in Example 2 of §1, for an uncountable set $\Omega$,
say for $\Omega = \mathbb{R}$. Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is countable. Since of two
disjoint subsets of $\Omega$ at most one can have a countable complement, property (3.2)
is easily confirmed; thus $\mu$ is a premeasure on $\mathscr{A}$.
3. Let $\mathscr{A}$ be the algebra defined in Example 9 of §1, for a countably infinite set $\Omega$.
Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is finite. Then $\mu$ is a content but not
a premeasure. The first assertion has a proof analogous to that in the preceding
example, the second follows from the fact that $\Omega$ is the disjoint union of countably
many 1-element sets.
4. Let $\mu_1, \mu_2, \ldots$ be a sequence of contents (premeasures) on a ring $\mathscr{R}$, and let
$\alpha_1, \alpha_2, \ldots$ be a sequence of non-negative real numbers. Then
$\mu := \sum_{n=1}^{\infty} \alpha_n \mu_n$
§3. Contents, premeasures, measures 9
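Every content $\mu$ on a ring $\mathscr{R}$ enjoys the following further properties (in which
$A, B, A_1, B_1, \ldots \in \mathscr{R}$):
(3.5) $\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B)$;
(3.6) $A \subset B \implies \mu(A) \le \mu(B)$ (isotoneity);
(3.7) $A \subset B$, $\mu(A) < +\infty \implies \mu(B \setminus A) = \mu(B) - \mu(A)$ (subtractivity);
(3.8) $\mu(\bigcup_{i=1}^{n} A_i) \le \sum_{i=1}^{n} \mu(A_i)$ (subadditivity);
for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$
$\sum_{n=1}^{\infty} \mu(A_n) \le \mu(\bigcup_{n=1}^{\infty} A_n)$.
Finally, if $\mu$ is a premeasure on $\mathscr{R}$, then for any sets $A_0, A_1, \ldots \in \mathscr{R}$
(3.10) $A_0 \subset \bigcup_{n=1}^{\infty} A_n \implies \mu(A_0) \le \sum_{n=1}^{\infty} \mu(A_n)$.
Because of $A_0 = \bigcup (A_0 \cap A_n)$ and (3.6), we can assume, in verifying (3.10), that
$A_0 = \bigcup A_n$. Then set $B_1 := A_1$, $B_2 := A_2 \setminus A_1$, ..., $B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$
and proceed as in the proof of (3.8).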
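The passage from the sets $A_n$ to the pairwise disjoint sets $B_n$ used in this argument is mechanical, and a short sketch may make it concrete. The function name `disjointify` and the sample sets are hypothetical choices made for this illustration only.

```python
def disjointify(sets):
    """Return B_1, B_2, ... where B_n consists of the points of A_n that
    lie in none of the earlier sets A_1, ..., A_(n-1)."""
    seen = set()      # union of the sets processed so far
    result = []
    for a in sets:
        result.append(set(a) - seen)
        seen |= set(a)
    return result


A = [{1, 2}, {2, 3}, {1, 3, 4}]
B = disjointify(A)
```

The sets in `B` are pairwise disjoint, satisfy $B_n \subset A_n$, and have the same union as the sets in `A`. These are the three facts on which the reduction rests, since additivity can then be applied to the $B_n$ and isotoneity (3.6) gives $\mu(B_n) \le \mu(A_n)$.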
In particular, we now have
$A = \bigcup_{n=1}^{\infty} B_n$, $A_n = B_1 \cup \ldots \cup B_n$.
follow are still of a rather formal nature. But as early as §6 and then quite a bit
later we will become acquainted with an abundance of important examples.
Proof. Designate the union of all the $A_n$ as $C$. For each $r = 1, \ldots, k$ the sets
$(A_{r+mk})_{m \in \mathbb{N}_0}$ are pairwise disjoint. So if we set
$F_r := \bigcup_{m=0}^{\infty} A_{r+mk}$,
then
$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{r=1}^{k} \mu(F_r)$
From this equality and the preceding inequality the asserted inequality can be read
off.
Exercises.
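1. Let $\Omega$ be a finite, non-empty set. Show that the counting measure $\zeta$ on $\mathscr{P}(\Omega)$
coincides with $\sum_{\omega \in \Omega} \varepsilon_\omega$. Show further that every measure $\mu$ on $\mathscr{P}(\Omega)$ has the form
$\mu = \sum_{\omega \in \Omega} \alpha_\omega \varepsilon_\omega$, with each $\alpha_\omega := \mu(\{\omega\})$.
2. For a finite content $\mu$ on a ring $\mathscr{R}$ establish the following inclusion-exclusion formula
generalizing equality (3.5): For all $n \in \mathbb{N}$, $A_1, \ldots, A_n \in \mathscr{R}$
$\mu(\bigcup_{i=1}^{n} A_i) = \sum_{i=1}^{n} \mu(A_i) - \sum_{1 \le i < j \le n} \mu(A_i \cap A_j) + \sum_{1 \le i < j < k \le n} \mu(A_i \cap A_j \cap A_k)$
$- + \ldots + (-1)^{n-1} \mu(A_1 \cap \ldots \cap A_n)$.
3. For a premeasure $\mu$ on a ring $\mathscr{R}$ in $\Omega$ define
$\overline{\mathscr{R}} := \{A \in \mathscr{P}(\Omega) : A \cap R \in \mathscr{R} \text{ for every } R \in \mathscr{R}\}$,
$\overline{\mu}(A) := \sup\{\mu(R) : R \subset A, R \in \mathscr{R}\}$, for $A \in \overline{\mathscr{R}}$.
Show that $\overline{\mathscr{R}}$ is an algebra in $\Omega$ which contains $\mathscr{R}$, and that $\overline{\mu}$ is a premeasure
on $\overline{\mathscr{R}}$ which extends $\mu$.
4. Suppose that $(\mu_n)_{n \in \mathbb{N}}$ is a sequence of premeasures on a common ring $\mathscr{R}$ which
is isotone, that is, satisfies $\mu_n(A) \le \mu_{n+1}(A)$ for all $A \in \mathscr{R}$, $n \in \mathbb{N}$. Show that via
$\mu(A) := \sup_{n \in \mathbb{N}} \mu_n(A)$ a premeasure $\mu$ is defined on $\mathscr{R}$.
5. Let $\mu$ be a measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and denote by $\mathscr{N}_\mu$ the set of all
$\mu$-null (or $\mu$-negligible) sets, that is, the $N \in \mathscr{A}$ for which $\mu(N) = 0$. Check that
$\mathscr{N}_\mu$ has the following properties:
(a) $\emptyset \in \mathscr{N}_\mu$;
(b) $N \in \mathscr{N}_\mu$, $M \in \mathscr{A}$, $M \subset N \implies M \in \mathscr{N}_\mu$;
(c) $(N_n)_{n \in \mathbb{N}} \subset \mathscr{N}_\mu \implies \bigcup_{n \in \mathbb{N}} N_n \in \mathscr{N}_\mu$.
Subsets of $\mathscr{A}$ with these properties are called $\sigma$-ideals in $\mathscr{A}$. Thus $\mathscr{N}_\mu$ is always
a $\sigma$-ideal. (Cf. Exercise 4 of §1.)
6. Every $\sigma$-ideal $\mathscr{N}$ in a $\sigma$-algebra $\mathscr{A}$ is the $\sigma$-ideal $\mathscr{N}_\mu$ of $\mu$-null sets of an appropriate
measure $\mu$ on $\mathscr{A}$. To get such a $\mu$, define
$\mu(A) := \begin{cases} 0 & \text{if } A \in \mathscr{N} \\ +\infty & \text{if } A \in \mathscr{A} \setminus \mathscr{N}. \end{cases}$
As a special case, on the power set $\mathscr{P}(\Omega)$ of any set $\Omega$ there is a measure $\mu$ such
that $\mu(A) = 0$ precisely if $A$ is a countable subset of $\Omega$.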
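Proof. Let $I = [a, b[$, $J = [a', b'[$ with $a \le b$, $a' \le b'$, and let the corresponding
coordinates of these points be $\alpha_i, \beta_i, \alpha'_i, \beta'_i$. If we let $e$ and $f$ denote those points
in $\mathbb{R}^d$ whose coordinates are $\max\{\alpha_i, \alpha'_i\}$ and $\min\{\beta_i, \beta'_i\}$ ($i = 1, \ldots, d$), respectively,
then $I \cap J = [e, f[$ in case $e < f$ (that is, every coordinate of $e$ is strictly smaller than
the corresponding coordinate of $f$) and otherwise $I \cap J = \emptyset$. Consequently,
$I \cap J$ is already in $\mathscr{I}^d$. Because $J \setminus I = J \setminus (I \cap J)$ and we now have $I \cap J \in \mathscr{I}^d$,
in proving the second claim we may assume that $I \ne \emptyset$ and $I \subset J$. Then $I$ and $J$
determine the points $a, b, a', b'$ uniquely and they satisfy $a' \le a < b \le b'$.
Create new points from $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d)$ by replacing
$\alpha_i$ by $\alpha'_i$ and $\beta_i$ by $\alpha_i$, or by replacing $\alpha_i$ by $\beta_i$ and $\beta_i$ by $\beta'_i$, and do this in all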
§4. Lebesgue premeasure 15
possible ways. More precisely, make such replacements for the $i$ coming from each
non-empty subset of $\{1, \ldots, d\}$. The points so created give rise to at most $3^d - 1$
pairwise disjoint intervals from $\mathscr{I}^d$ whose union is $J \setminus I$. Thus $J \setminus I$ is a figure
and is representable as a finite union of pairwise disjoint sets from $\mathscr{I}^d$. That this
obtains as well for every figure $F = I_1 \cup \ldots \cup I_n \in \mathscr{F}^d$ with $I_1, \ldots, I_n \in \mathscr{I}^d$ can
now be seen as follows:
$F = I_1 \cup (I_2 \setminus I_1) \cup (I_3 \setminus (I_1 \cup I_2)) \cup \ldots \cup (I_n \setminus (I_1 \cup \ldots \cup I_{n-1}))$
exhibits $F$ as a union of $n$ pairwise disjoint sets, each of the form $I \setminus (J_1 \cup \ldots \cup J_m)$
with $I, J_1, \ldots, J_m$ intervals from $\mathscr{I}^d$. Thus it suffices to show that every set of this
form is the union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$. But this
follows from
$I \setminus (J_1 \cup \ldots \cup J_m) = \bigcap_{i=1}^{m} (I \setminus J_i)$
when, using what has already been proved, we write each $I \setminus J_i$ as a union of
finitely many pairwise disjoint intervals from $\mathscr{I}^d$ and distribute the intersection
through these unions. □
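The first step of this proof, the description of $I \cap J$ through coordinatewise maxima and minima, translates directly into code. In the sketch below a right half-open interval $[a, b[$ of $\mathbb{R}^d$ is represented, as an assumption of this illustration, by a pair of coordinate tuples `(a, b)`; the function name `intersect` is likewise hypothetical.

```python
def intersect(I, J):
    """Intersection of the right half-open intervals I = [a, b[ and
    J = [a', b'[ of R^d, each given as a pair of coordinate tuples.
    Returns the pair (e, f) describing [e, f[, or None if it is empty."""
    (a, b), (a2, b2) = I, J
    e = tuple(max(x, y) for x, y in zip(a, a2))   # larger lower corner
    f = tuple(min(x, y) for x, y in zip(b, b2))   # smaller upper corner
    # [e, f[ is non-empty exactly when e is strictly below f in every coordinate.
    if all(x < y for x, y in zip(e, f)):
        return (e, f)
    return None


overlap = intersect(((0, 0), (2, 2)), ((1, 1), (3, 3)))
touching = intersect(((0, 0), (1, 1)), ((1, 0), (2, 1)))
```

The second call shows why half-open intervals are convenient: two boxes that share only a face have empty intersection, so adjacent intervals tile a figure without overlap.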
Proof. The only thing that is not obvious is property (1.10) of a ring, according to
which along with any sets $F, G \in \mathscr{F}^d$ their difference $F \setminus G$ must also be in $\mathscr{F}^d$.
By definition there exist intervals $I'_1, \ldots, I'_m, I''_1, \ldots, I''_n \in \mathscr{I}^d$ such that
$F = \bigcup_{i=1}^{m} I'_i$ and $G = \bigcup_{j=1}^{n} I''_j$.
But then
$F \setminus G = \bigcup_{i=1}^{m} \bigcap_{j=1}^{n} (I'_i \setminus I''_j)$
and so it only has to be shown that each set $\bigcap_{j=1}^{n} (I'_i \setminus I''_j)$ is a figure. According to 4.1
$I'_i \setminus I''_j$ is always a figure. So it further suffices to demonstrate that the intersection
of two (whence, of any finite number of) figures is itself a figure. If however $F$
and $G$ are two figures represented as above, then thanks to distributivity $F \cap G$
is just the union of the sets $I'_i \cap I''_j$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$), which by another
appeal to 4.1 is a figure. □
By definition every figure is a union of finitely many intervals from $\mathscr{I}^d$. Consequently,
$\mathscr{F}^d \subset \mathscr{R}$ for every ring $\mathscr{R}$ in $\mathbb{R}^d$ such that $\mathscr{I}^d \subset \mathscr{R}$. So theorem 4.2 really
says that $\mathscr{F}^d$ is the ring generated by $\mathscr{I}^d$.
Our geometric intuition now suggests the validity of the following theorem:
4.3 Theorem. There exists exactly one content $\lambda$ on $\mathscr{F}^d$ with the property that
$\lambda(I)$ coincides with the $d$-dimensional elementary content of $I$, for each $I \in \mathscr{I}^d$.
This content is real-valued.
$\lambda(I_j) = \sum_{i=1}^{m} \lambda(I_j \cap J_i)$ ($j = 1, \ldots, n$).
Together these last two equations entail the equality $\sum_{j=1}^{n} \lambda(I_j) = \sum_{i=1}^{m} \lambda(J_i)$.
(e) Thus for every $F \in \mathscr{F}^d$ the number $\sum_{j=1}^{n} \lambda(I_j)$ is independent of the special
representation
$F = I_1 \cup \ldots \cup I_n$
of $F$ as a union of finitely many pairwise disjoint $I_1, \ldots, I_n \in \mathscr{I}^d$. Therefore the
decree
$\lambda(F) := \lambda(I_1) + \ldots + \lambda(I_n)$
well defines an extension, to be denoted still by $\lambda$, of the original function on $\mathscr{I}^d$ to
one on $\mathscr{F}^d$. This function is real-valued, non-negative, and according to (d) finitely-
additive. Since $\emptyset \in \mathscr{I}^d$ and $\lambda(\emptyset) = 0$, a content with the sought-for properties is
at hand.
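The independence from the chosen representation can be checked on a small example. The sketch assumes, as is usual, that the elementary content of $[a, b[$ is the product of its edge lengths $\beta_i - \alpha_i$; the representation of intervals as pairs of coordinate tuples and the names `content` and `figure_content` are hypothetical. The code does not verify that the given intervals are pairwise disjoint, which remains the caller's responsibility.

```python
from math import prod


def content(interval):
    """Elementary content of [a, b[: the product of the edge lengths."""
    a, b = interval
    return prod(y - x for x, y in zip(a, b))


def figure_content(disjoint_intervals):
    """Content of a figure given as a list of pairwise disjoint intervals."""
    return sum(content(i) for i in disjoint_intervals)


# Two decompositions of the same L-shaped figure in the plane.
horizontal_cut = [((0, 0), (2, 1)), ((0, 1), (1, 2))]
vertical_cut = [((0, 0), (1, 2)), ((1, 0), (2, 1))]

area_h = figure_content(horizontal_cut)
area_v = figure_content(vertical_cut)
```

Both decompositions give the same value, as step (e) guarantees for every figure.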
Proof. Because $\lambda$ is finite, 3.2 says that we only need to prove the continuity of $\lambda$
at $\emptyset$. To this end, let $(F_n)$ be an antitone sequence of figures from $\mathscr{F}^d$. We will
show that from the assumption that
$\delta := \lim_{n \to \infty} \lambda(F_n) = \inf_{n \in \mathbb{N}} \lambda(F_n) > 0$
follows
$\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset$.
Each $F_n$ being a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$, it
should be clear that by a slight leftward shift of the right endpoints of each of
these intervals a new figure $G_n \in \mathscr{F}^d$ is created, whose topological closure $\overline{G}_n$ is
still a subset of $F_n$, and
$\lambda(F_n) - \lambda(G_n) \le 2^{-n}\delta$.
If we set $H_n := G_1 \cap \ldots \cap G_n$, then $(H_n)$ is a sequence of sets from $\mathscr{F}^d$ satisfying
$H_n \supset H_{n+1}$, $\overline{H}_n \subset \overline{G}_n \subset F_n$ for all $n$. Because $F_n$ is bounded its closed subset $\overline{H}_n$
is compact. As soon as we succeed in showing that each $H_n$ is not empty, it will
follow from the finite-intersection property of compacts (WILLARD [1970], p. 118,
KELLEY [1955], p. 136) that $\bigcap_{n \in \mathbb{N}} \overline{H}_n \ne \emptyset$ and so a fortiori $\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset$. So let us
prove that no $H_n$ is empty. For every $n \in \mathbb{N}$
(*) $\lambda(H_n) \ge \lambda(F_n) - (1 - 2^{-n})\delta$,
as we will confirm by induction. The inequality holds for $n = 1$ because $H_1 = G_1$,
and by choice of $G_1$, $\lambda(F_1) - \lambda(G_1) \le 2^{-1}\delta$. Suppose the inequality valid for
Exercises.
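1. Show that on $\mathscr{F}^1$ there is exactly one content $\mu$ that assigns to the right half-
open interval $[\alpha, \beta[$, $\alpha, \beta \in \mathbb{R}$, the following values
$\mu([\alpha, \beta[) = \begin{cases} 1 & \text{if } \alpha < 0 \le \beta \\ 0 & \text{in all other cases.} \end{cases}$
Is $\mu$ $\sigma$-additive?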
2. Two intervals $I_0, J \in \mathscr{I}^d$ with $I_0 \subset J$ are given. Prove the existence of $k \le 2d$
intervals $I_1, \ldots, I_k \in \mathscr{I}^d$ with the following two properties: (i) $I_0 \cup \ldots \cup I_j \in \mathscr{I}^d$
for each $j \in \{0, \ldots, k\}$; (ii) $J = I_0 \cup \ldots \cup I_k$. [Hint: Proceed by induction on the
dimension $d$.]
Lebesgue premeasure is not a measure because its domain of definition, the ring
of $d$-dimensional figures, is not a $\sigma$-algebra. For example, the whole space $\mathbb{R}^d$ is
not in $\mathscr{F}^d$, every $d$-dimensional figure being a bounded subset of $\mathbb{R}^d$.
The elementary geometric considerations sketched at the beginning of this chapter
however suggest that the domain of the premeasure $\lambda^d$ be so enlarged that
§5. Extension of a premeasure to a measure 19
a "numerical measure" gets assigned also to more complicated subsets of Rd. The
most satisfactory such result would say that $\lambda^d$ can be extended in exactly one
way to a measure on an appropriate $\sigma$-algebra $\mathscr{A}$ in $\mathbb{R}^d$ with $\mathscr{F}^d \subset \mathscr{A}$.
Here we encounter the following general problem: A ring $\mathscr{R}$ in a set $\Omega$ and
a content $\mu$ on $\mathscr{R}$ are given. Under what conditions does there exist a $\sigma$-algebra $\mathscr{A}$
in $\Omega$ and a measure $\tilde{\mu}$ on $\mathscr{A}$ such that $\mu$ is the restriction of $\tilde{\mu}$ to $\mathscr{R}$? An obvious
necessary condition for this is that $\mu$ be a premeasure on $\mathscr{R}$. The designation
"premeasure" will turn out to be justified if we can show the converse: For every
premeasure $\mu$ on a ring $\mathscr{R}$ there exists a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ with $\mathscr{R} \subset \mathscr{A}$, and
a measure $\tilde{\mu}$ on $\mathscr{A}$ satisfying $\tilde{\mu} \mid \mathscr{R} = \mu$. It suffices to take for $\mathscr{A}$ the $\sigma$-algebra
$\sigma(\mathscr{R})$ generated in $\Omega$ by $\mathscr{R}$.
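Proof. For each subset $Q \subset \Omega$ designate by $\mathscr{U}(Q)$ the set of all sequences $(A_n)_{n \in \mathbb{N}}$
of sets from $\mathscr{R}$ which cover $Q$, that is, which satisfy
$Q \subset \bigcup_{n \in \mathbb{N}} A_n$.
(5.2) $\mu^*(\emptyset) = 0$;
(5.3) $Q_1 \subset Q_2 \implies \mu^*(Q_1) \le \mu^*(Q_2)$;
(5.4) $\mu^*(\bigcup_{n=1}^{\infty} Q_n) \le \sum_{n=1}^{\infty} \mu^*(Q_n)$.
Equality (5.2) follows from the observation that the constant sequence $\emptyset, \emptyset, \ldots$ is
in $\mathscr{U}(\emptyset)$. The observation that $\mathscr{U}(Q_2) \subset \mathscr{U}(Q_1)$, which follows from $Q_1 \subset Q_2$, serves to
confirm (5.3). For the proof of (5.4) it can evidently be assumed that $\mu^*(Q_n)$ is
finite and so in particular $\mathscr{U}(Q_n) \ne \emptyset$, for every $n \in \mathbb{N}$. For an arbitrary $\varepsilon > 0$
then, each $\mathscr{U}(Q_n)$ contains a sequence $(A_{nm})_{m \in \mathbb{N}}$ such that
$\sum_{m=1}^{\infty} \mu(A_{nm}) \le \mu^*(Q_n) + 2^{-n}\varepsilon$.
The double sequence $(A_{nm})_{n,m \in \mathbb{N}}$ lies in $\mathscr{U}(\bigcup_{n=1}^{\infty} Q_n)$ and as a consequence the
definition of $\mu^*$ gives
$\mu^*(\bigcup_{n=1}^{\infty} Q_n) \le \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \mu(A_{nm}) \le \sum_{n=1}^{\infty} \mu^*(Q_n) + \varepsilon$,
and (5.4) follows from this and the arbitrariness of $\varepsilon > 0$. It is immediate from the
definition that
(5.5) $\mu^* \ge 0$.
Decisive for what follows is the fact that every $A \in \mathscr{R}$ satisfies
(5.6) $\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \cap \complement A)$ for every $Q \in \mathscr{P}(\Omega)$,
as well as
(5.7) $\mu^*(A) = \mu(A)$.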
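In proving (5.6) we can again assume $\mu^*(Q) < +\infty$, so that $\mathscr{U}(Q) \ne \emptyset$. First of
all we have
$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{\infty} \mu(A_n \cap A) + \sum_{n=1}^{\infty} \mu(A_n \setminus A)$
for every sequence $(A_n) \in \mathscr{U}(Q)$, by the finite additivity of $\mu$.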
5.2 Definition. A numerical function $\mu^*$ on the power set $\mathscr{P}(\Omega)$ having properties
(5.2)-(5.4) is called an outer measure on the set $\Omega$. A subset $A$ of $\Omega$ is called
$\mu^*$-measurable if it satisfies (5.6).
Notice that $\mu^* \ge 0$ always prevails, an immediate consequence of (5.2) and (5.3)
together.
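On a finite set the notions of Definition 5.2 can be explored by enumeration. The sketch below rests on several assumptions that are not taken from the text: the four-point set, the ring $\{\emptyset, \{1,2\}, \{3,4\}, \Omega\}$ and the values of the premeasure `MU` are hypothetical, and $\mu^*(Q)$ is taken to be the usual infimum of $\sum \mu(A_n)$ over the covering sequences in $\mathscr{U}(Q)$. For a finite ring this infimum is attained by a single covering set, because the union of a cover is again a ring set and (3.8) applies.

```python
from itertools import chain, combinations

OMEGA = frozenset({1, 2, 3, 4})

# Hypothetical premeasure on the ring {∅, {1,2}, {3,4}, Ω}.
MU = {
    frozenset(): 0,
    frozenset({1, 2}): 1,
    frozenset({3, 4}): 2,
    OMEGA: 3,
}


def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def outer(q):
    """Outer measure of q: cheapest single ring set that covers q."""
    return min(value for ring_set, value in MU.items() if q <= ring_set)


def is_measurable(a):
    """Check condition (5.6) for a against every test set Q."""
    return all(outer(q) >= outer(q & a) + outer(q - a)
               for q in subsets(OMEGA))


measurable = {a for a in subsets(OMEGA) if is_measurable(a)}
```

In this small example the measurable sets turn out to be exactly the four ring sets, which here already form a $\sigma$-algebra. A set such as $\{1\}$ fails (5.6) for the test set $Q = \{1, 2\}$, since $\mu^*(Q) = 1$ while $\mu^*(\{1\}) + \mu^*(\{2\}) = 2$.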
The idea in the proof of the measure-extension theorem, which goes back to
C. CARATHÉODORY (1873-1950), consists in associating via definition (5.1) an
outer measure to the premeasure $\mu$ on $\mathscr{R}$ and then invoking the following theorem.
5.3 Theorem (Caratheodory). Let µ' be an outer measure on a set f). Then the
system 0' of all µ'-measurable sets A C fl is a o-algebra in fl. Moreover, the
restriction of µ' to dA' is a measure.
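Proof. First let us note that the requirement (5.6) for a subset $A$ of $\Omega$ to lie in $\mathscr{A}^*$
is equivalent to
(5.6') $\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \setminus A)$ for all $Q \in \mathscr{P}(\Omega)$,
because from (5.4) applied to the sequence $Q \cap A, Q \setminus A, \emptyset, \emptyset, \ldots$ follows the reverse
of inequality (5.6), for every $Q \in \mathscr{P}(\Omega)$. From either (5.6) or (5.6') it is immediate
that $\Omega \in \mathscr{A}^*$, and because of their symmetry in $A$ and $\complement A$, whenever $A$ lies in $\mathscr{A}^*$,
so does $\complement A$. The following considerations will show that with each two of its sets $A$
and $B$, $\mathscr{A}^*$ also contains their union $A \cup B$, and so $\mathscr{A}^*$ is an algebra. $B \in \mathscr{A}^*$
entails that
$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \setminus B)$
for every $Q \in \mathscr{P}(\Omega)$. Replacing $Q$ here first by $Q \cap A$, then by $Q \setminus A = Q \cap \complement A$, we
get two new equalities (valid for all $Q \in \mathscr{P}(\Omega)$) which, when inserted into (5.6'),
lead to
$\mu^*(Q) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B) + \mu^*(Q \cap \complement A \cap \complement B)$.
Replacing $Q$ here by $Q \cap (A \cup B)$ gives
(5.8) $\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B)$,
which in conjunction with the preceding equality yields
$\mu^*(Q) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \cap \complement A \cap \complement B) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \setminus (A \cup B))$.
This being valid for all $Q \in \mathscr{P}(\Omega)$ affirms that $A \cup B \in \mathscr{A}^*$.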
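Now let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{A}^*$ and $A$ be their
union. The choice of $A := A_1$, $B := A_2$ in (5.8) produces
$\mu^*(Q \cap (A_1 \cup A_2)) = \mu^*(Q \cap A_1) + \mu^*(Q \cap A_2)$, because $A_1 \cap A_2 = \emptyset$, and
induction on $n$ then gives
$\mu^*\bigl(Q \cap \bigcup_{i=1}^{n} A_i\bigr) = \sum_{i=1}^{n} \mu^*(Q \cap A_i)$
for all $Q \in \mathscr{P}(\Omega)$, all $n \in \mathbb{N}$. Recalling that $B_n := \bigcup_{i=1}^{n} A_i$ has already been proven
to be in $\mathscr{A}^*$, and that $Q \setminus B_n \supset Q \setminus A$, so that $\mu^*(Q \setminus B_n) \ge \mu^*(Q \setminus A)$, we obtain
$\mu^*(Q) = \mu^*(Q \cap B_n) + \mu^*(Q \setminus B_n) \ge \sum_{i=1}^{n} \mu^*(Q \cap A_i) + \mu^*(Q \setminus A)$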
$\mu^*(A) = \sum_{n=1}^{\infty} \mu^*(A_n)$,
It can be further shown that in many important cases the measure µ from
Theorem 5.1 is uniquely determined. As a preliminary we give a proof that is
a typical application of the technique of Dynkin systems. (Cf. also Exercise 9.)
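Proof. Denote by $\mathscr{E}_f$ the system of all sets $E \in \mathscr{E}$ satisfying $\mu_1(E) = \mu_2(E) < +\infty$.
For a given $E \in \mathscr{E}_f$ consider the system
$\mathscr{D}_E := \{D \in \mathscr{A} : \mu_1(E \cap D) = \mu_2(E \cap D)\}$.
We will show that it is a Dynkin system. Obviously $\Omega \in \mathscr{D}_E$. If $D \in \mathscr{D}_E$, then
$\mu_1(E \cap D) = \mu_2(E \cap D) < +\infty$ (since $E \in \mathscr{E}_f$), and so (3.7) shows that
$\mu_1(E \cap \complement D) = \mu_1(E \setminus E \cap D) = \mu_1(E) - \mu_1(E \cap D) = \mu_2(E) - \mu_2(E \cap D) = \mu_2(E \cap \complement D)$,
which says that $\complement D \in \mathscr{D}_E$. The remaining property of Dynkin systems (2.3) follows
at once from the $\sigma$-additivity of the measures $\mu_1$, $\mu_2$. Because $\mathscr{E}$ is $\cap$-stable,
$\mathscr{E} \subset \mathscr{D}_E$ follows from (i) and the definition of $\mathscr{D}_E$. But then $\delta(\mathscr{E}) \subset \mathscr{D}_E$ because
$\delta(\mathscr{E})$ is the smallest Dynkin system which contains $\mathscr{E}$. From Theorem 2.4 however,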
$\mu_1(A) = \sum_{n=1}^{\infty} \mu_1(F_n \cap A) = \sum_{n=1}^{\infty} \mu_2(F_n \cap A) = \mu_2(A)$
for every $A \in \mathscr{A}$, which says that the measures $\mu_1$, $\mu_2$ are identical. □
For finite measures some other natural stability properties of the generator $\mathscr{E}$
(e.g., its closure under set-differences) also insure uniqueness. See, for example,
ROBERTSON [1967].
Examples. 1. Suppose that the content $\mu$ on the ring $\mathscr{R}$ in $\Omega$ is finite, that is,
$\mu(A) < +\infty$ for every $A \in \mathscr{R}$. The $\sigma$-finiteness of $\mu$ is then equivalent to the existence
of a sequential covering $(A_n)$ of $\Omega$ by sets $A_n \in \mathscr{R}$. But the latter condition does
not automatically hold, as the trivial example $\Omega \ne \emptyset$, $\mathscr{R} := \{\emptyset\}$ illustrates.
In general, the $\sigma$-finiteness of a content $\mu$ on a ring $\mathscr{R}$ is equivalent to the
existence of a sequence $(A'_n)$ of sets in $\mathscr{R}$ with $\mu(A'_n) < +\infty$ for all $n$ and $A'_n \uparrow \Omega$.
In fact, if $(A_n)$ is merely a covering of $\Omega$ by sets in $\mathscr{R}$ having finite $\mu$-measure,
then the sets $A'_n := A_1 \cup \ldots \cup A_n$, $n \in \mathbb{N}$, furnish a sequence of the desired kind.
2. Lebesgue premeasure in $\mathbb{R}^d$ is $\sigma$-finite (as well as finite). For if we denote by $\mathbf{n}$
the point in $\mathbb{R}^d$ whose coordinates are all equal to $n$, then $I_n := [-\mathbf{n}, \mathbf{n}[$ is an
interval from $\mathscr{I}^d$, $\lambda^d(I_n) < +\infty$ $(n \in \mathbb{N})$, and $I_n \uparrow \mathbb{R}^d$.
5.6 Theorem. Every $\sigma$-finite premeasure $\mu$ on a ring $\mathscr{R}$ in a set $\Omega$ can be extended
in exactly one way to a measure $\tilde{\mu}$ on $\sigma(\mathscr{R})$.
Proof. Only the uniqueness of $\tilde{\mu}$ has to be proved. But this follows immediately
from 5.4: thanks to the $\sigma$-finiteness of $\mu$, the ring $\mathscr{R}$ has all the properties required
of the generator $\mathscr{E}$ in the hypothesis of 5.4.
Exercises.
1. Let $\mu = \varepsilon_\omega$ be the premeasure on a ring $\mathscr{R}$ in $\Omega$ defined by putting unit mass at
the point $\omega \in \Omega$. Under the hypothesis that $\{\omega\}$ can be realized as the intersection
of a sequence from $\mathscr{R}$ and $\Omega$ as the union of such a sequence, prove that: (a) The
outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set $A \in \mathscr{P}(\Omega)$ the
value 1 or 0, according as $\omega \in A$ or $\omega \in \complement A$. (b) Every subset of $\Omega$ is $\mu^*$-measurable.
(c) $\mu^*$ is the measure $\varepsilon_\omega$ on $\mathscr{P}(\Omega)$.
2. Consider the measure $\mu$ in Examples 2 and 7 of §3, say for $\Omega := \mathbb{R}$, and prove
that: (a) The outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set
$A \in \mathscr{P}(\Omega)$ the value 0 or 1, according as $A$ is countable or not. (b) $\mu^*$ is not
a measure on $\mathscr{P}(\Omega)$, not even a content. (c) The only $\mu^*$-measurable sets are
those in the $\sigma$-algebra $\mathscr{A}$ on which $\mu$ is defined.
3. Let $\mathscr{A}$ be the $\sigma$-algebra generated by an algebra $\mathscr{A}_0$ on the set $\Omega$, $\mu$ and $\nu$
measures on $\mathscr{A}$. Show that the validity of $\mu(A) \le \nu(A)$ for all $A \in \mathscr{A}_0$ need not
imply its validity for all $A \in \mathscr{A}$. [Hint: $\mathscr{A} := \mathscr{B}^1$, $\nu$ counting measure, $\mu := 2\nu$.]
Find supplemental hypotheses that will render such an implication true.
4. Show that the sequence required in Definition 5.5 of the $\sigma$-finiteness of the
content $\mu$ on the ring $\mathscr{R}$ in $\Omega$ can always be chosen to be a sequence of pairwise
disjoint sets from $\mathscr{R}$ which cover $\Omega$ and each have finite measure.
5. Let $\mu$ be a $\sigma$-finite measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and $\mu^*$ the outer measure
defined by (5.1). Then to every set $Q \in \mathscr{P}(\Omega)$ corresponds an $A \in \mathscr{A}$, called
a measurable hull of $Q$, with the properties that $Q \subset A$, $\mu^*(Q) = \mu(A)$, and
$\mu(B) = 0$ for all $B \in \mathscr{A}$ such that $B \subset A \setminus Q$. [Hint: In case $\mu^*(Q) < +\infty$, show
that there exists a sequence $(A_n)$ in $\mathscr{A}$ with $Q \subset A_n$ and $\mu(A_n) \le \mu^*(Q) + n^{-1}$
for every $n \in \mathbb{N}$. Then $A := \bigcap_{n \in \mathbb{N}} A_n$ has the desired properties.]
ingly $\sigma(\mathscr{I}^d)$ is called the $\sigma$-algebra of Borel subsets of $\mathbb{R}^d$; it will henceforth be
denoted $\mathscr{B}^d$.
The results reviewed in the introduction can, following 4.3, be expressed thus:
6.2 Theorem. There is exactly one measure $\lambda^d$ on $\mathscr{B}^d$ which assigns to every
right half-open interval in $\mathbb{R}^d$ its $d$-dimensional elementary content.
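6.4 Theorem. Let $\mathscr{O}^d$, $\mathscr{C}^d$, $\mathscr{K}^d$ denote the system of all open, closed, compact
subsets of $\mathbb{R}^d$, respectively. Then
(6.3) $\mathscr{B}^d = \sigma(\mathscr{O}^d) = \sigma(\mathscr{C}^d) = \sigma(\mathscr{K}^d)$.
Proof. $\mathscr{K}^d \subset \mathscr{C}^d \subset \sigma(\mathscr{C}^d)$, so $\sigma(\mathscr{K}^d) \subset \sigma(\mathscr{C}^d)$. Every set $C \in \mathscr{C}^d$ is the union of
a sequence of sets $C_n \in \mathscr{K}^d$; for example, if $K_n$ are the compact balls with a fixed
center and radii $n \in \mathbb{N}$, then the sets $C_n := C \cap K_n$ furnish such a sequence. Thus
by (1.3), $\mathscr{C}^d \subset \sigma(\mathscr{K}^d)$, whence $\sigma(\mathscr{C}^d) \subset \sigma(\mathscr{K}^d)$ and so finally the equality of
these two $\sigma$-algebras. Since the open sets are the complements of the closed ones,
the equality $\sigma(\mathscr{O}^d) = \sigma(\mathscr{C}^d)$ is obvious; therewith the last two equalities in (6.3)
are confirmed.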
We finish up by showing that $\sigma(\mathscr{O}^d) = \mathscr{B}^d$. We will, as usual, use the term
bounded open interval in $\mathbb{R}^d$ for every set of the form
(6.4) $]a, b[ := \{x \in \mathbb{R}^d : a \lhd x \lhd b\}$,
where $a, b \in \mathbb{R}^d$ satisfy $a \lhd b$. Every right half-open interval $[a, b[ \in \mathscr{I}^d$ is the
intersection of a sequence of bounded open intervals, namely, for
$a := (\alpha_1, \ldots, \alpha_d)$ and $a_n := (\alpha_1 - n^{-1}, \ldots, \alpha_d - n^{-1})$ $(n \in \mathbb{N})$
we have
$]a_n, b[ \downarrow [a, b[$.
if we set
(6.5') $a_n := (\min\{\alpha_1 + n^{-1}, \beta_1\}, \ldots, \min\{\alpha_d + n^{-1}, \beta_d\})$ $(n \in \mathbb{N})$,
$\alpha_1, \ldots, \alpha_d$ and $\beta_1, \ldots, \beta_d$ being the coordinates of $a$ and $b$, respectively. Every
open set is therefore the union of a sequence of intervals from $\mathscr{I}^d$, and so $\mathscr{O}^d \subset
\sigma(\mathscr{I}^d) = \mathscr{B}^d$. It thus follows that $\sigma(\mathscr{O}^d) \subset \mathscr{B}^d$ and, as the reverse inclusion has
already been established, equality $\mathscr{B}^d = \sigma(\mathscr{O}^d)$ is confirmed. □
We will become acquainted with some deeper properties of L-B measure in §8.
In particular, there the existence of non-Borel sets, that is, the assertion
$\mathscr{B}^d \ne \mathscr{P}(\mathbb{R}^d)$
will be proved. For the moment we content ourselves with computing the Lebesgue
measure $\lambda^d(B)$ of some geometrically simple Borel sets $B$.
$H \subset \bigcup_{n \in \mathbb{N}} [x_n, y_n[$
and
$\lambda^d([x_n, y_n[) = 2^{-n}\varepsilon$, $n \in \mathbb{N}$.
From (3.10) we therefore get
Proof. The techniques employed in the proof of Theorem 4.3 can be repeated
to show that corresponding to $F$ there is a unique content $\mu$ on the ring $\mathscr{F}^1$
of 1-dimensional figures which has property (6.9). That part of the proof used
only the isotoneity of $F$. From the left-continuity of $F$ it follows that for every
$I = [a, b[ \in \mathscr{I}^1$ and every $\varepsilon > 0$ there is a $J = [a, c[ \in \mathscr{I}^1$ with $\overline{J} \subset I$ and
$\mu(I) - \mu(J) = \mu([c, b[) = F(b) - F(c) < \varepsilon$.
But then the technique employed in the proof of Theorem 4.4 shows that $\mu$ is
a $\sigma$-finite (as well as finite) premeasure on $\mathscr{F}^1$.
According to 5.6 it can be extended in exactly one way to a measure on $\mathscr{B}^1$.
This measure does what is wanted, is a $\mu_F$. Its uniqueness with respect to its
prescription on $\mathscr{I}^1$ via $F$ was settled in the deliberations preceding the present
theorem. From $\mu_F = \mu_G$ we get $G(b) - G(a) = F(b) - F(a)$ whenever $a \le b$. Upon
applying this with $a = 0 \le b$ as well as with $a \le 0 = b$, we learn that $G = F + c$,
with $c := G(0) - F(0)$. Every $\mu_F$ is a Borel measure, because every bounded
$B \in \mathscr{B}^1$ is contained in $[-n, n[$ for some $n \in \mathbb{N}$ and so $\mu_F(B) \le \mu_F([-n, n[) =
F(n) - F(-n) < +\infty$.
If conversely, $\mu$ is an arbitrary Borel measure on $\mathbb{R}$, we can define
$F(x) := \begin{cases} \mu([0, x[) & \text{if } x \ge 0 \\ -\mu([x, 0[) & \text{if } x < 0 \end{cases}$
and get a function on $\mathbb{R}$ having property (6.9) and therewith, in light of the
discussion preceding this proof, measure-generating. In fact, for real numbers $0 \le
a \le b$ the subtractivity (3.7) of measures entails that
$\mu([a, b[) = \mu([0, b[ \setminus [0, a[) = F(b) - F(a)$,
and (6.9) is confirmed analogously when $a \le b < 0$. In the remaining case $a < 0 \le b$
we get (6.9) from $[a, b[ = [a, 0[ \cup [0, b[$ and the additivity of $\mu$. The uniqueness
already proved leads finally to the equality of $\mu$ with the measure $\mu_F$ derived
from $F$.
Notice that L-B measure $\lambda^1$ has the form $\mu_F$, with $F$ the identity map $x \mapsto x$
on $\mathbb{R}$.
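The passage between a generating function and the interval masses it prescribes can be sketched numerically. This is only an illustration: the helper names `mu_F` and `F_from_mu` and the example function `cube` are assumptions made for the sketch, and only half-open intervals are handled.

```python
def mu_F(F, a, b):
    """Mass of the half-open interval [a, b[ prescribed by F via (6.9)."""
    if a > b:
        raise ValueError("need a <= b")
    return F(b) - F(a)

def F_from_mu(mu):
    """Recover a generating function from interval masses as in the proof:
    F(x) = mu([0, x[) for x >= 0 and F(x) = -mu([x, 0[) for x < 0."""
    def F(x):
        return mu(0, x) if x >= 0 else -mu(x, 0)
    return F

# Hypothetical example: F(x) = x**3 is increasing and continuous.
cube = lambda x: x ** 3
mu = lambda a, b: mu_F(cube, a, b)

# The recovered function G differs from cube by the constant G(0) - cube(0) = 0.
G = F_from_mu(mu)

# The identity map generates the elementary length of an interval.
length = mu_F(lambda x: x, 1, 4)
```

The recovered function agrees with the original one up to an additive constant, in line with the statement $G = F + c$ in the proof.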
Of special importance are the finite measures on $\mathscr{B}^1$. Every one is a Borel
measure on $\mathbb{R}$. Because $0 \le \mu(B) \le \mu(\mathbb{R}) < +\infty$ for all $B \in \mathscr{B}^1$, a finite Borel
measure $\mu$ on $\mathbb{R}$ is either the zero measure $\mu = 0$, or $0 < \mu(\mathbb{R}) < +\infty$ and
$\nu := \mu / \mu(\mathbb{R})$ is a measure on $\mathscr{B}^1$ with $\nu(\mathbb{R}) = 1$. Measures normalized this way play
a fundamental role in probability theory. This explains the following vocabulary:
A measure $\mu$ on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$ is called a probability measure
(abbreviated to p-measure) if $\mu(\Omega) = 1$. Because of the isotoneity property every
p-measure satisfies
(6.10) $0 \le \mu(A) \le 1 = \mu(\Omega)$ for all $A \in \mathscr{A}$.
Consider now a p-measure $\mu$ on $\mathscr{B}^1$. The open interval $]-\infty, x[$ lies in $\mathscr{B}^1$ for each
$x \in \mathbb{R}$, so a real function $F_\mu$ with values in $[0, 1]$ is defined by
(6.11) $F_\mu(x) := \mu(]-\infty, x[)$ $(x \in \mathbb{R})$.
It is called the distribution function of $\mu$. For example, the distribution function of the
Dirac measure $\varepsilon_0$ equals 0 throughout $]-\infty, 0]$ and 1 throughout $]0, +\infty[$.
Since $]-\infty, b[ \setminus ]-\infty, a[ = [a, b[$ whenever $a \le b$,
$\mu([a, b[) = F_\mu(b) - F_\mu(a)$ for all $[a, b[ \in \mathscr{I}^1$.
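For a p-measure concentrated on finitely many points the distribution function can be written down directly. The sketch below is an illustration only: the representation of a measure as a dictionary `{point: mass}` and the names `distribution_function`, `dirac0` and `mix` are assumptions made for this example.

```python
from fractions import Fraction

def distribution_function(atoms):
    """F(x) = mu(]-inf, x[) for a discrete p-measure given as {point: mass}."""
    if sum(atoms.values()) != 1:
        raise ValueError("masses must sum to 1")
    def F(x):
        # Only atoms strictly to the left of x contribute.
        return sum(m for p, m in atoms.items() if p < x)
    return F

# Dirac measure at 0: jumps from 0 to 1 immediately to the right of 0.
dirac0 = distribution_function({0: Fraction(1)})

# A mixture with mass 1/4 at the point 0 and mass 3/4 at the point 1.
mix = distribution_function({0: Fraction(1, 4), 1: Fraction(3, 4)})

# Mass of [0, 1[ computed as a difference of distribution function values.
mass_0_1 = mix(1) - mix(0)
```

Because the interval in (6.11) is open on the right, the atom at a point $x$ is not counted in the value at $x$, which is why the Dirac example is still 0 at the point 0.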
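Here $\alpha_k$, $\beta_k$ $(k = 1, \ldots, d)$ are the coordinates of $a$, $b$, resp., and $\Delta_{\alpha_1}^{\beta_1} F$ is the
function defined on $\mathbb{R}^{d-1}$ via $(\xi_2, \ldots, \xi_d) \mapsto F_2(\xi_2, \ldots, \xi_d) := F(\beta_1, \xi_2, \ldots, \xi_d) -
F(\alpha_1, \xi_2, \ldots, \xi_d)$. Then $\Delta_{\alpha_2}^{\beta_2} F_2 = \Delta_{\alpha_2}^{\beta_2} \Delta_{\alpha_1}^{\beta_1} F$ is defined and the further "difference
operators" $\Delta_{\alpha_k}^{\beta_k}$ are inductively brought into play. There is a theorem analogous
to 6.5: To every measure-generating function $F$ on $\mathbb{R}^d$ corresponds a unique Borel
measure $\mu_F$ on $\mathbb{R}^d$ which satisfies the iterated difference condition
(6.14) $\mu_F([a, b[) = \Delta_{\alpha_d}^{\beta_d} \cdots \Delta_{\alpha_1}^{\beta_1} F$ for all $[a, b[ \in \mathscr{I}^d$.
For $d = 1$ this reduces simply to (6.9'). As an example, for the function
$\Delta_{\alpha_d}^{\beta_d} \cdots \Delta_{\alpha_1}^{\beta_1} F_0 = (\beta_1 - \alpha_1) \cdots (\beta_d - \alpha_d)$ for $a, b \in \mathbb{R}^d$ with $a \unlhd b$.
This function is consequently measure-generating, and generates the L-B
measure $\lambda^d$ in the sense that $\mu_{F_0} = \lambda^d$. Details can be found in RICHTER [1966],
TUCKER [1967] and GNEDENKO [1988].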
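The iterated difference in (6.14) unfolds into an alternating sum over the corners of the interval. The following sketch rests on two assumptions that should be read as such: the function name `iterated_difference` is hypothetical, and `F0` is taken to be the product of the coordinates, which is consistent with the value $(\beta_1 - \alpha_1) \cdots (\beta_d - \alpha_d)$ stated above but is not spelled out in the surviving text.

```python
from fractions import Fraction
from itertools import product
from math import prod

def iterated_difference(F, a, b):
    """Apply the difference operator in every coordinate. Unfolding the
    recursion gives the sum over all corners of [a, b[ of
    (-1)**(number of coordinates taken from a) * F(corner)."""
    d = len(a)
    total = 0
    for choice in product((0, 1), repeat=d):
        corner = [b[k] if choice[k] else a[k] for k in range(d)]
        sign = (-1) ** (d - sum(choice))
        total += sign * F(corner)
    return total

def F0(x):
    # Assumed definition: the product of the coordinates of x.
    return prod(x)

# Two-dimensional check: sides of length 2 and 3.
area = iterated_difference(F0, (1, 2), (3, 5))

# Three-dimensional check with exact rational arithmetic.
vol = iterated_difference(F0, (0, 0, 0), (Fraction(1, 2), Fraction(1, 3), 2))
```

In dimension one the sum has only two terms and reduces to $F(b) - F(a)$.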
Exercises.
1. Prove that a Borel set $B \in \mathscr{B}^d$ is an L-B-null set if and only if one of the
two following conditions (which are hence equivalent) is satisfied: (a) For every
$\varepsilon > 0$ there is a covering of $B$ by countably many open intervals $I_n \subset \mathbb{R}^d$ such
that $\sum_{n=1}^{\infty} \lambda^d(I_n) < \varepsilon$. (b) There is a covering of $B$ by countably many open
intervals $I_n$ such that $\sum_{n=1}^{\infty} \lambda^d(I_n) < +\infty$ and every point of $B$ lies in $I_n$ for infinitely
many $n$. Both characterizations remain valid if the $I_n$ are allowed to be half-open
or compact, instead of open. [Hint for (a): Utilize (5.1).]
2. Write $\mathbb{R}^d$ in the form $\mathbb{R}^d = \mathbb{R}^p \times \mathbb{R}^q$ with $p, q \in \mathbb{N}$, $p + q = d$, by grouping the
first $p$ coordinates of a point $x \in \mathbb{R}^d$ into a point in $\mathbb{R}^p$ and the last $q$ coordinates
into a point in $\mathbb{R}^q$. Denoting by 0 the zero of the vector space $\mathbb{R}^q$, show that for
a set $A \subset \mathbb{R}^p$, $A \times \{0\} \in \mathscr{B}^d$ precisely when $A \in \mathscr{B}^p$.
3. Let $\mu$ be a p-measure on $\mathscr{B}^1$ and $F_\mu$ its distribution function. Show that $F_\mu$ is
continuous at the point $x \in \mathbb{R}$ just if $\mu(\{x\}) = 0$.
4. Determine the p-measure on $\mathscr{B}^1$ which has $x \mapsto 0 \vee (x \wedge 1)$ as distribution
function, and answer anew the question in Exercise 1 of §4.
5. Show that every $\sigma$-finite measure $\mu$ on $\mathscr{B}^d$ can be represented in the form
$\mu = \sum_{n=1}^{\infty} \alpha_n \mu_n$, where for each $n \in \mathbb{N}$, $\alpha_n \in \mathbb{R}_+$ and $\mu_n$ is a p-measure on $\mathscr{B}^d$. The
supplemental condition that for every bounded set $B \in \mathscr{B}^d$, $\mu_n(B) \ne 0$ for only
finitely many $n \in \mathbb{N}$ can be imposed if and only if $\mu$ is a Borel measure.
and speak of a measurable mapping of the first measurable space into the second.
Using the notation introduced in (1.5), (7.1) can be written as
7.2 Theorem. Let $(\Omega, \mathscr{A})$ and $(\Omega', \mathscr{A}')$ be measurable spaces; further, let $\mathscr{E}'$ be
a generator of $\mathscr{A}'$. A mapping $T : \Omega \to \Omega'$ is measurable just if
(7.2) $T^{-1}(E') \in \mathscr{A}$ for every $E' \in \mathscr{E}'$.
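Proof. The system $\mathscr{Q}'$ of all sets $Q' \in \mathscr{P}(\Omega')$ for which $T^{-1}(Q') \in \mathscr{A}$ is a $\sigma$-algebra
in $\Omega'$. Consequently, $\mathscr{A}' \subset \mathscr{Q}'$ holds just if $\mathscr{E}' \subset \mathscr{Q}'$ does. $\mathscr{A}' \subset \mathscr{Q}'$ is equivalent
to the measurability of $T$, while $\mathscr{E}' \subset \mathscr{Q}'$ is equivalent to (7.2).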
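On finite sets the criterion of Theorem 7.2 can be tested by enumeration. The sketch below is an illustration under stated assumptions: the helper names `sigma_algebra`, `preimage` and `is_measurable`, the sets `OMEGA`, `GEN` and the two mappings `good` and `bad` are all hypothetical examples. It compares measurability tested on a generator with measurability tested on the full generated $\sigma$-algebra.

```python
def sigma_algebra(omega, generator):
    """Smallest system containing the generator that is closed under
    complement and union; on a finite set this is the generated sigma-algebra."""
    whole = frozenset(omega)
    system = {frozenset(), whole} | {frozenset(g) for g in generator}
    changed = True
    while changed:
        changed = False
        current = list(system)
        for A in current:
            for B in current:
                for C in (whole - A, A | B):
                    if C not in system:
                        system.add(C)
                        changed = True
    return system

def preimage(T, omega, target):
    """T^{-1}(target) as a frozenset of points of omega."""
    return frozenset(w for w in omega if T(w) in target)

def is_measurable(T, omega, sigma_dom, sets):
    """True if the preimage of every set in `sets` lies in sigma_dom."""
    return all(preimage(T, omega, S) in sigma_dom for S in sets)

OMEGA = range(6)
A_DOM = sigma_algebra(OMEGA, [{0, 1}, {2, 3}])   # atoms {0,1}, {2,3}, {4,5}
GEN = [{"a"}, {"b"}]                             # generator on the target side
A_IMG = sigma_algebra("abc", GEN)                # all subsets of {"a","b","c"}

good = lambda w: "abc"[w // 2]   # constant on each atom of A_DOM
bad = lambda w: "abc"[w % 3]     # splits the atoms of A_DOM
```

In both cases the verdict obtained from the two generating sets agrees with the verdict obtained from all eight sets of the target $\sigma$-algebra.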
7.3 Theorem. If $T_1 : (\Omega_1, \mathscr{A}_1) \to (\Omega_2, \mathscr{A}_2)$ and $T_2 : (\Omega_2, \mathscr{A}_2) \to (\Omega_3, \mathscr{A}_3)$ are
Proof. The claim follows from the validity of the equation $(T_2 \circ T_1)^{-1}(A) =
T_1^{-1}(T_2^{-1}(A))$ for all $A \in \mathscr{P}(\Omega_3)$, in particular, from its validity for all $A \in \mathscr{A}_3$.
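Next consider a family of measurable spaces $((\Omega_i, \mathscr{A}_i))_{i \in I}$ and a family $(T_i)_{i \in I}$ of
mappings $T_i : \Omega \to \Omega_i$ of some fixed set $\Omega$ into the individual sets $\Omega_i$. Obviously the
$\sigma$-algebra in $\Omega$ generated by $\bigcup_{i \in I} T_i^{-1}(\mathscr{A}_i)$ is the smallest $\sigma$-algebra $\mathscr{A}$ with respect
to which every $T_i$ is $\mathscr{A}$-$\mathscr{A}_i$-measurable. We designate this $\sigma$-algebra $\sigma(T_i : i \in I)$,
that is, we define
(7.3) $\sigma(T_i : i \in I) := \sigma\bigl(\bigcup_{i \in I} T_i^{-1}(\mathscr{A}_i)\bigr)$
and call it the $\sigma$-algebra generated by the mappings $T_i$ (and the measurable spaces
$(\Omega_i, \mathscr{A}_i)$). In the case of the finite index set $I = \{1, \ldots, n\}$, we also use the notation
$\sigma(T_1, \ldots, T_n)$.
For $n = 1$ we clearly have $\sigma(T_1) = T_1^{-1}(\mathscr{A}_1)$. If therefore a $\sigma$-algebra $\mathscr{A}$ in
a set $\Omega$ is given, then a mapping $T_1 : \Omega \to \Omega_1$ being $\mathscr{A}$-$\mathscr{A}_1$-measurable is equivalent
to
(7.4) $\sigma(T_1) \subset \mathscr{A}$.
Cf. (7.1').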
As a further application of 7.2 we will demonstrate:
Proof. According to Theorem 7.3 the condition is necessary. The following
considerations show that it is also sufficient. By (7.3) the system
$\mathscr{E} := \bigcup_{i \in I} T_i^{-1}(\mathscr{A}_i)$
is a generator of $\sigma(T_i : i \in I)$. Each set $E \in \mathscr{E}$ has the form $E = T_i^{-1}(A_i)$ for some
$i \in I$, $A_i \in \mathscr{A}_i$. Thus $S^{-1}(E) = (T_i \circ S)^{-1}(A_i) \in \mathscr{A}_0$ because of the hypothesized
measurability of $T_i \circ S$. From 7.2 therefore, $S$ is $\mathscr{A}_0$-$\sigma(T_i : i \in I)$-measurable.
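7.5 Theorem. Let $T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$ be a measurable mapping. Then for
every measure $\mu$ on $\mathscr{A}$,
(7.5) $A' \mapsto \mu(T^{-1}(A'))$
defines a measure $\mu'$ on $\mathscr{A}'$.
Proof. We only have to observe that for every sequence $(A'_n)_{n \in \mathbb{N}}$ of pairwise disjoint
sets from $\mathscr{A}'$, $(T^{-1}(A'_n))_{n \in \mathbb{N}}$ is a sequence of pairwise disjoint sets from $\mathscr{A}$, and
that
$T^{-1}\bigl(\bigcup_{n \in \mathbb{N}} A'_n\bigr) = \bigcup_{n \in \mathbb{N}} T^{-1}(A'_n)$. □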
7.6 Definition. In the situation described in 7.5, the measure $\mu'$ is called the
image of $\mu$ under the mapping $T$ and is denoted by $T(\mu)$.
Thus according to this definition
(7.5') $T(\mu)(A') := \mu(T^{-1}(A'))$ for all $A' \in \mathscr{A}'$.
The formation of image measures is transitive, that is,
(7.6) $(T_2 \circ T_1)(\mu) = T_2(T_1(\mu))$,
whenever we are in the situation of 7.3 and $\mu$ is a measure on $\mathscr{A}_1$: For every $A \in \mathscr{A}_3$,
$T := T_2 \circ T_1$ satisfies $T^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$, and $T_2^{-1}(A) \in \mathscr{A}_2$. Therefore, setting
$\mu' := T_1(\mu)$, $\mu'' := T_2(\mu')$ for short, it follows that
$T(\mu)(A) = \mu(T_1^{-1}(T_2^{-1}(A))) = \mu'(T_2^{-1}(A)) = \mu''(A)$,
for all $A \in \mathscr{A}_3$, showing that $T(\mu) = \mu''$ and confirming (7.6).
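For measures carried by finitely many points, the image measure and its transitivity can be checked directly. This is a minimal sketch, assuming a measure is stored as a dictionary `{point: mass}`; the function name `image_measure` and the mappings `T1`, `T2` are hypothetical choices for the example.

```python
from collections import defaultdict
from fractions import Fraction

def image_measure(T, mu):
    """Image T(mu) of a discrete measure mu = {point: mass}: the mass of a
    point y under T(mu) is the mu-mass of the pre-image T^{-1}({y})."""
    out = defaultdict(Fraction)
    for point, mass in mu.items():
        out[T(point)] += mass
    return dict(out)

# Uniform p-measure on {0, ..., 5}.
mu = {w: Fraction(1, 6) for w in range(6)}

T1 = lambda w: w % 4
T2 = lambda y: y % 2

direct = image_measure(lambda w: T2(T1(w)), mu)     # (T2 o T1)(mu)
stepwise = image_measure(T2, image_measure(T1, mu)) # T2(T1(mu))
```

Both routes give mass 1/2 to each of the points 0 and 1, as (7.6) predicts.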
for sets $A \in \mathscr{P}(\mathbb{R}^d)$ and points $a \in \mathbb{R}^d$, then $T_a(\lambda^d)(A) = \lambda^d(-a + A)$ for arbitrary
$A \in \mathscr{B}^d$. Property (7.7) can therefore also be expressed as
(7.7') $\lambda^d(a + A) = \lambda^d(A)$ for all $A \in \mathscr{B}^d$, $a \in \mathbb{R}^d$.
For, every open interval $]a, b[ \subset \mathbb{R}^d$ has $D_\alpha^{(i)}$-pre-image equal to $]a', b'[$, where the
coordinates of $a'$, $b'$ except the $i$th are those of $a$, $b$, the $i$th being $\alpha^{-1}$ times those
of $a$, $b$ if $\alpha > 0$, and $\alpha^{-1}$ times those of $b$, $a$ if $\alpha < 0$. Hence
$\lambda^d\bigl((D_\alpha^{(i)})^{-1}(]a, b[)\bigr) = |\alpha|^{-1} \lambda^d(]a, b[)$.
$D_\alpha^{(i)}(\lambda^d)$ and $|\alpha|^{-1} \lambda^d$ are therefore measures on $\mathscr{B}^d$ which coincide on all bounded
open intervals. Thanks to 6.4 such intervals constitute a generator of $\mathscr{B}^d$, which
obviously has with respect to each of these measures all the properties of the
generator $\mathscr{E}$ in the uniqueness Theorem 5.4. From that theorem (7.9) therefore
follows.
5. If we set $H_r := D_r^{(1)} \circ \ldots \circ D_r^{(d)}$ for real $r \ne 0$, we obtain the linear mapping
$H_r(x) = rx$ $(x \in \mathbb{R}^d)$, called a homothety. Because of the transitivity of image
measures, it follows from (7.9) that
(7.10) $H_r(\lambda^d) = |r|^{-d} \lambda^d$.
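On intervals the scaling rule (7.10) amounts to elementary arithmetic, which the following sketch checks with exact fractions. The helper names `box_volume` and `homothety_preimage` and the numerical values are assumptions chosen for illustration; boundary points are ignored, since they form a null set.

```python
from fractions import Fraction
from math import prod

def box_volume(a, b):
    """Elementary content of the interval with corners a, b in R^d
    (zero if some side is empty)."""
    return prod(max(bk - ak, 0) for ak, bk in zip(a, b))

def homothety_preimage(r, a, b):
    """Corners of the pre-image of the interval with corners a, b under
    H_r(x) = r x, for r != 0. For r < 0 the corners swap roles."""
    lo = [min(ak / r, bk / r) for ak, bk in zip(a, b)]
    hi = [max(ak / r, bk / r) for ak, bk in zip(a, b)]
    return lo, hi

r = Fraction(-3)
a = (Fraction(0), Fraction(1))
b = (Fraction(2), Fraction(4))

lo, hi = homothety_preimage(r, a, b)
lhs = box_volume(lo, hi)                    # lambda^d of the pre-image
rhs = box_volume(a, b) / abs(r) ** len(a)   # |r|^{-d} * lambda^d([a, b[)
```

Here $d = 2$ and $r = -3$, so the pre-image has one ninth of the original content.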
Exercises.
1. For $\Omega := \mathbb{R}$, let $(\Omega, \mathscr{A}, \mu)$ be the measure space of Example 2, §3. For $\Omega' := \{0, 1\}$
and $\mathscr{A}' := \mathscr{P}(\Omega')$ define the mapping $T : \Omega \to \Omega'$ by $T(\omega) := 0$ if $\omega$ is rational,
$T(\omega) := 1$ if $\omega$ is irrational. Show that $T$ is $\mathscr{A}$-$\mathscr{A}'$-measurable and determine the
image measure $T(\mu)$.
2. Show that for any sets $\Omega$, $\Omega'$, any mapping $T : \Omega \to \Omega'$, and any system of sets
$\mathscr{E}' \subset \mathscr{P}(\Omega')$, $T^{-1}(\sigma(\mathscr{E}')) = \sigma(T^{-1}(\mathscr{E}'))$.
3. Let $K$ be a compact subset of $\mathbb{R}^d$ with the property that the intersection $H_r(K) \cap
H_{r'}(K)$ of every two homothetic images of $K$ with $0 < r < r' \le 1$ is an L-B-null
set. (This property is enjoyed by every sphere $S_\alpha(0)$ of radius $\alpha > 0$ and center
$0 := (0, \ldots, 0)$, that is, the set of $x \in \mathbb{R}^d$ having euclidean distance $\alpha$ from 0.) Show
that $\lambda^d(K) = 0$. [Hint: For all $r \in\, ]0, 1]$, $H_r(K) \subset \tilde{K} := \{tx : 0 \le t \le 1, x \in K\}$,
which is a compact set. Hence $\lambda^d(\tilde{K}) < +\infty$.]
4. Let $\mathbb{T} := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ denote the unit circle, that is, the sphere
$S_1(0)$ in $\mathbb{R}^2$. Prove the existence of a finite non-zero measure $\nu$ on the $\sigma$-algebra
$\mathscr{B}(\mathbb{T}) := \mathbb{T} \cap \mathscr{B}^2$ which is invariant under all rotations of $\mathbb{T}$. [Hint: Take for $\nu$ an
image of $\lambda^1_C$ for an appropriate interval $C \subset \mathbb{R}$.]
L-B measure $\lambda^d$ on $\mathscr{B}^d$ is, as was shown in Example 3 of the preceding section,
translation-invariant. Of the greatest significance is the fact that $\lambda^d$ is uniquely
determined by this invariance property, together with a simple normalization. For
the $d$-dimensional unit cube, defined by
(8.1) $W := [\mathbf{0}, \mathbf{1}[$,
where $\mathbf{0} = (0, \ldots, 0) \in \mathbb{R}^d$ and $\mathbf{1} := (1, \ldots, 1) \in \mathbb{R}^d$, 6.2 insures that
(8.2) $\lambda^d(W) = 1$.
Along with $\lambda^d$ each non-negative multiple $\alpha\lambda^d$ $(\alpha \in \mathbb{R}_+)$ of it is a
translation-invariant measure $\mu$ on $\mathscr{B}^d$, which satisfies $\mu(W) = \alpha < +\infty$. The following
converse of this also holds, and contains the aforementioned characterization of $\lambda^d$
as a special case.
Proof. Let $a_n := n^{-1}\mathbf{1}$, the point in $\mathbb{R}^d$ all of whose coordinates are $1/n$. Then
$W_n := [\mathbf{0}, a_n[$ is a cube with
$\mu(W_n) = \alpha / n^d$.
In fact: The interval $[0, 1[ \in \mathscr{I}^1$ is the union of the pairwise disjoint intervals
$[\frac{\nu}{n}, \frac{\nu + 1}{n}[$ with $\nu = 0, 1, \ldots, n - 1$. If therefore $G_n$ denotes the set of points
$(\rho_1, \ldots, \rho_d) \in \mathbb{R}^d$ whose coordinates all come from the set $\{\nu/n : \nu = 0, \ldots, n - 1\}$,
then
$W = \bigcup_{r \in G_n} [r, r + a_n[$
a union of $n^d$ pairwise disjoint intervals. Because $[r, r + a_n[ = T_r([\mathbf{0}, a_n[) = T_r(W_n)$
and because of the translation-invariance of $\mu$, it follows from this representation
of $W$ that $\alpha = n^d \mu(W_n)$.
A repetition of these considerations will show that
$\mu([a, b[) = \alpha\lambda^d([a, b[)$
holds for every interval $[a, b[ \in \mathscr{I}^d$ in which the points $a$, $b$ have only rational
coordinates. Obviously in proving this we can assume that $a \lhd b$, and due to
the translation-invariance of both measures we can further assume that $a = \mathbf{0}$.
Then $b = (m_1/n, \ldots, m_d/n)$ for appropriate $m_1, \ldots, m_d, n \in \mathbb{N}$, and therefore
$[\mathbf{0}, b[$ is the union of the $m_1 \cdots m_d$ pairwise disjoint intervals $[r, r + a_n[$ with
$r = (\rho_1/n, \ldots, \rho_d/n)$ and $\rho_i \in \{0, \ldots, m_i - 1\}$ for each $i$. As before, this yields
$m_1 \cdots m_d\, \mu(W_n) = \mu([\mathbf{0}, b[)$, hence
$\mu([\mathbf{0}, b[) = \alpha \frac{m_1 \cdots m_d}{n^d} = \alpha\lambda^d([\mathbf{0}, b[)$.
Now the set $\mathscr{I}^d_{\mathrm{rat}}$ of all intervals $[a, b[ \in \mathscr{I}^d$ for which $a$, $b$ have only rational
coordinates is an $\cap$-stable system. The technique used in the proof of Theorem 6.4
shows that $\mathscr{I}^d_{\mathrm{rat}}$ is, just like $\mathscr{I}^d$, a generator of $\mathscr{B}^d$. Because the measures $\mu$ and $\alpha\lambda^d$
coincide on $\mathscr{I}^d_{\mathrm{rat}}$ and for $\mathbf{n} := (n, \ldots, n)$ with $n \in \mathbb{N}$ the intervals $[-\mathbf{n}, \mathbf{n}[$ lie in $\mathscr{I}^d_{\mathrm{rat}}$
and increase to $\mathbb{R}^d$, our claim (8.4) follows from the uniqueness theorem 5.4.
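The counting argument of the proof can be replayed with exact fractions. The sketch below is illustrative only: the function names `grid_cells` and `measure_of_box` and the sample values of `alpha`, `m` and `n` are assumptions. It lists the lower corners of the small cubes that tile an interval with rational upper corner and adds up their masses.

```python
from fractions import Fraction
from itertools import product
from math import prod

def grid_cells(m, n):
    """Lower corners r of the cubes [r, r + a_n[ that tile [0, b[ for
    b = (m_1/n, ..., m_d/n), where a_n has all coordinates equal to 1/n."""
    return [tuple(Fraction(p, n) for p in rho)
            for rho in product(*(range(mk) for mk in m))]

def measure_of_box(alpha, m, n):
    """mu([0, b[) obtained by adding mu(W_n) = alpha / n**d over all cells,
    using only translation-invariance and additivity."""
    d = len(m)
    return len(grid_cells(m, n)) * Fraction(alpha) / n ** d

alpha, m, n = 5, (3, 2), 4
cells = grid_cells(m, n)
by_counting = measure_of_box(alpha, m, n)
by_formula = alpha * prod(Fraction(mk, n) for mk in m)  # alpha * lambda^d([0, b[)
```

The two values coincide, which is the content of the displayed equality for intervals with rational corners.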
This corollary says that $\lambda^d$ is, in the theory of locally-compact groups, a Haar
measure on the additive group $\mathbb{R}^d$. That theory provides an analogous non-zero
invariant measure on every locally compact abelian group $G$; it is unique to within
a positive scalar factor and is called Haar measure on $G$. The reader interested in
its theory should consult NACHBIN [1965]. (Cf. also Exercise 4 of §7 and Exercise 8
of §17.)
The conclusion of the theorem and its corollary remain valid if in the
normalization (8.2) and (8.2') the unit cube $W$ is replaced by its open interior $]\mathbf{0}, \mathbf{1}[$ or
its compact closure $[\mathbf{0}, \mathbf{1}]$. This is immediate from (6.7). However, if $\mu(W) = +\infty$
is allowed, $\mu$ need not be a multiple of $\lambda^d$. See HENLE and WAGON [1983].
Example. 1. Besides the $D_\alpha^{(i)}$ of Example 4, §7 there is another basic class of linear
mappings in $\mathbb{R}^d$, those that skew one coordinate by means of another. Specifically,
for each $i, k \in \{1, \ldots, d\}$ with $i \ne k$ we define
$S^{(i,k)}(x_1, \ldots, x_d) := (x_1, \ldots, x_{i-1}, x_i + x_k, x_{i+1}, \ldots, x_d)$.
Fix such a pair $(i, k)$ and write simply $S$ for $S^{(i,k)}$. Since this is a linear mapping,
$S(\lambda^d)$ is a translation-invariant measure on $\mathscr{B}^d$, so (8.5) will follow from 8.2 if we
succeed in showing that $S(\lambda^d)(W) = 1$, that is, $\lambda^d(S^{-1}(W)) = 1$. In view of (7.9)
and the equality $S^{-1} = D_{-1}^{(k)} \circ S \circ D_{-1}^{(k)}$, it suffices to show instead that
(8.5') $\lambda^d(S(W)) = 1$.
Let $a$ denote the vector in $\mathbb{R}^d$ whose only non-zero coordinate is the $i$th one, it
being $-1$. Introduce
$W' := \{(x_1, \ldots, x_d) : 0 \le x_j < 1 \text{ for } j \ne i,\ 0 \le x_i < x_k\}$ and
$W'' := \{(x_1, \ldots, x_d) : 0 \le x_j < 1 \text{ for } j \ne i,\ 1 + x_k \le x_i < 2\}$.
Notice that
$T_a(W'') = \{(x_1, \ldots, x_d) : 0 \le x_j < 1 \text{ for } j \ne i,\ x_k \le x_i < 1\}$.
Clearly
For the conditions on the $i$th coordinate that define each set in (8.7) are obviously
incompatible with those that define the other two sets. Moreover,
Here the inclusion "$\subset$" is obvious from the coordinate inequalities defining the
sets. A typical point $x$ of $D_2^{(i)}(W)$ has $j$th coordinate $x_j \in [0, 1[$ if $j \ne i$ and $i$th
coordinate $t \in [0, 2[ = [0, x_k[ \cup [x_k, 1 + x_k[ \cup [1 + x_k, 2[$. If $t$ lies in the first (third)
interval, then $x \in W'$ ($x \in W''$). Otherwise, $x_i := t - x_k \in [0, 1[$, and
$x = (x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d) = (x_1, \ldots, x_{i-1}, x_i + x_k, x_{i+1}, \ldots, x_d) \in S(W)$.
This confirms (8.8). Combining all that we have learned gives the desired (8.5') as
follows:
$2 = 2\lambda^d(W) = \lambda^d(D_2^{(i)}(W))$ by (7.9)
$= \lambda^d(W') + \lambda^d(W'') + \lambda^d(S(W))$ by (8.7) and (8.8)
$= \lambda^d(W') + \lambda^d(T_a(W'')) + \lambda^d(S(W))$ by (7.7)
$= \lambda^d(W) + \lambda^d(S(W))$ by (8.6)
$= 1 + \lambda^d(S(W))$ by (8.2).
One usually thinks of the space $\mathbb{R}^d$ as equipped with the euclidean scalar-product
$\langle x, y \rangle := \sum_{i=1}^{d} x_i y_i$,
expression. Upon doing so and re-assembling the terms, we get back a single
expression like (*) but with the identity mapping in place of $T$. That is, we get 0. In
other words,
$\lambda T(x) + T(y) - T(\lambda x + y) = 0$,
$T(\lambda x + y) = \lambda T(x) + T(y)$,
holding for all $\lambda \in \mathbb{R}$, $x, y \in \mathbb{R}^d$. This says that $T$ is a linear mapping. It is
immediate from (8.9) that T is then injective. The dimension of T(Rd) C Rd is
therefore d, so T(Rd) = Rd, and T is surjective. A motion T that is also a linear
mapping, and the preceding deliberations show that this is equivalent to T(0) = 0,
is called an orthogonal transformation.
If $T$ is any motion and we set $a := T(0)$, then the mapping $U := T - a =
T_{-a} \circ T$ is a motion that fixes 0. Therefore by the above, every motion $T$ is
a composite $T_a \circ U$ of a translation and an orthogonal transformation, and is
consequently a bijection of $\mathbb{R}^d$. From this and (8.9) it is clear that the mapping
inverse to a motion is itself a motion, and that the set of motions is a group under
composition, the motion group $\mathrm{Mot}(\mathbb{R}^d)$ of $\mathbb{R}^d$.
The translation-invariance of $\lambda^d$ derived in 8.1 not only characterizes L-B measure
but renders excellent service in the derivation of further invariance properties.
We begin with the motion-invariance of $\lambda^d$, that is, with the proof that
(8.10) $T(\lambda^d) = \lambda^d$ for all $T \in \mathrm{Mot}(\mathbb{R}^d)$.
The reflection-invariance treated in Example 5 of §7 is contained in this as a special
case.
Proof. Let a motion $T$ of $\mathbb{R}^d$, about which we initially assume that $T(0) = 0$, be
given. Thus $T$ is an orthogonal, linear transformation. Via the following
considerations, we will quickly convince ourselves that $T(\lambda^d)$ is a translation-invariant
measure on $\mathscr{B}^d$: Denoting as before by $T_c$ the translation $x \mapsto x + c$, for each
$c \in \mathbb{R}^d$, we consider any $a \in \mathbb{R}^d$, set $b := T^{-1}(a)$, and observe that
(8.11) $T_a \circ T = T \circ T_b$.
For every $x \in \mathbb{R}^d$, $T_a \circ T(x) = T(x) + a = T(x) + T(b) = T(x + b) = T \circ T_b(x)$,
confirming (8.11). From this and the translation-invariance of $\lambda^d$ we get $T_a(T(\lambda^d)) =
T(T_b(\lambda^d)) = T(\lambda^d)$. As $a \in \mathbb{R}^d$ is arbitrary, this says that $\mu := T(\lambda^d)$ is a
translation-invariant measure on $\mathscr{B}^d$. For the unit cube $W = [\mathbf{0}, \mathbf{1}[$ we have $\alpha := \mu(W) =
\lambda^d(T^{-1}(W)) < +\infty$ by (6.2), since $T^{-1}$, like $T$, is an isometry and therefore along with $W$
the set $T^{-1}(W)$ is also bounded. Now Theorem 8.1 comes into action and
guarantees that $T(\lambda^d) = \mu = \alpha\lambda^d$ holds. So what remains is to see that $\alpha = 1$. To
this end we look at the compact ball $K := \{x \in \mathbb{R}^d : \rho(0, x) \le 1\}$ of radius 1 and
center 0. Since $T$ and $T^{-1}$ are orthogonal transformations, they fix 0 and leave
distances invariant (8.9). Hence $T^{-1}(K) = K$, and from $T(\lambda^d) = \alpha\lambda^d$ follows
$\lambda^d(K) = \lambda^d(T^{-1}(K)) = T(\lambda^d)(K) = \alpha\lambda^d(K)$.
From this follows the desired $\alpha = 1$, because on the one hand $\lambda^d(K) < +\infty$
by (6.2) and on the other hand $\lambda^d(K) > 0$ because $K$ contains a non-empty
interval $I \in \mathscr{I}^d$, namely $I := [-t, t[$ with $t := (d^{-1/2}, \ldots, d^{-1/2})$. [In Exercise 6
of §23 we will compute $\lambda^d(K)$ explicitly.]
To handle the case of an arbitrary motion $T$, set $c := T(0)$ and $S := T_{-c} \circ T$,
getting a motion that fixes 0, for which $\lambda^d = S(\lambda^d)$ by what was first proved. It
follows finally from transitivity and $T = T_c \circ S$ that $T(\lambda^d) = T_c(S(\lambda^d)) = T_c(\lambda^d) =
\lambda^d$. Thus the theorem is proved.
Since with every motion $T$ of $\mathbb{R}^d$ its inverse $T^{-1}$ is also one, the
motion-invariance can also be recorded in the following form: For every motion $T$ of $\mathbb{R}^d$
and every Borel set $A \in \mathscr{B}^d$
(8.12) $\lambda^d(T(A)) = \lambda^d(A)$.
In this form Theorem 8.3 just says that any two congruent Borel sets in $\mathbb{R}^d$ have
the same $d$-dimensional Lebesgue measure. This however is the measure-theoretic
formulation and refinement of the elementary geometric principle (A) enunciated
in the introduction to the chapter. Via it L-B measure is seen in the final analysis
to be a concept from euclidean geometry.
alternative proof of our next theorem. (The behavior of $\lambda^d$ with respect to linear
mappings $T$ with $\det T = 0$ is elucidated in Exercise 2 below.)
Theorems 8.3 and 8.4 taken together confirm an elementary fact from linear
algebra, namely that det T = ±1 for every orthogonal transformation T. And this
means that 8.3 is contained in the following immediate consequence of 8.4:
or equivalently
(8.16') IdetDcpIAd .
We will not go into this any further, but refer the reader to the textbook literature,
e.g., STROMBERG [1981], or to VARBERG [1971].
We will conclude the chapter by proving the existence of non-Borel subsets
of $\mathbb{R}^d$. A different approach is indicated in the prologue to Theorem 26.6.
Proof. Let $\mathbb{Q}^d$ denote the set of points in $\mathbb{R}^d$ each of whose $d$ coordinates is rational.
This is a subgroup of the additive group $\mathbb{R}^d$, so congruence $x \sim y$ of points $x, y \in \mathbb{R}^d$
modulo $\mathbb{Q}^d$ is an equivalence relation; it is defined by $x \sim y$ if and only if $x - y \in \mathbb{Q}^d$.
The space $\mathbb{R}^d$ decomposes into disjoint equivalence classes, each a set $x + \mathbb{Q}^d$ with
$x \in \mathbb{R}^d$, the statement $x \sim y$ being equivalent to the equality $x + \mathbb{Q}^d = y + \mathbb{Q}^d$.
Since to every real number $\eta$ corresponds an integer $n$ such that $n \le \eta < n + 1$,
that is, such that $\eta - n \in [0, 1[$, every equivalence class contains a point $x \in [\mathbf{0}, \mathbf{1}[$.
Consequently, there is a set $K \subset [\mathbf{0}, \mathbf{1}[$ which contains exactly one element from
each equivalence class. (On the role of the Axiom of Choice from set theory in this
existence claim see SOLOVAY [1970] and HALMOS [1974].) We have then
(8.17) $\mathbb{R}^d = \bigcup_{k \in K} (k + \mathbb{Q}^d) = \bigcup_{y \in \mathbb{Q}^d} (y + K)$
and
(8.18) $y_1, y_2 \in \mathbb{Q}^d$, $y_1 \ne y_2 \Rightarrow (y_1 + K) \cap (y_2 + K) = \emptyset$.
(Otherwise there are $k, k' \in K$ with $y_1 + k = y_2 + k'$, that is, with $k \sim k'$, which
by definition of $K$ means that $k = k'$ and consequently also $y_1 = y_2$.) Let us now
suppose that $K \in \mathscr{B}^d$. Since $\mathbb{Q}$ and therewith $\mathbb{Q}^d$ is countable, it follows from
(8.17), (8.18) and the $\sigma$-additivity of $\lambda^d$ that
(8.19) $\sum_{y \in \mathbb{Q}^d} \lambda^d(y + K) = \lambda^d(\mathbb{R}^d) = +\infty$.
But then (8.20) means that we must have $\lambda^d(K) = 0$, contradicting (8.19). The
assumption $K \in \mathscr{B}^d$ is what led to this contradiction, so we conclude that $K$ is,
after all, not a Borel set.
The following remarks serve to round out the foregoing and to provide a glimpse
of some closely related issues.
In passing from Borel sets to Lebesgue measurable sets the important property
of the former that they are determined only by the topology of $\mathbb{R}^d$ is lost. Because
$\mathscr{B}^d$ is the defining $\sigma$-algebra for so many other important measures (for $d = 1$
Theorem 6.5 already attests to this), we will not dwell in detail on the transition
from Lebesgue-Borel to Lebesgue measure; only the former will be employed in
the sequel.
4. There exists a Borel set $B \in \mathscr{B}^2$ whose image $\pi_1(B)$ under the first projection
map $\pi_1 : \mathbb{R}^2 \to \mathbb{R}$ (which sends every point $(x_1, x_2) \in \mathbb{R}^2$ to its first coordinate $x_1$)
is not a Borel subset of $\mathbb{R}$. A proof of this will be found in SRIVASTAVA [1998],
p. 130. Such a $B$ can even be found which is a $G_\delta$-set, that is, the intersection of
countably many open subsets of $\mathbb{R}^2$; see p. 36 of CHRISTENSEN [1974]. In particular,
the continuous image of a Borel set need not be a Borel set. The system of all sets
$\pi_1(B)$ with $B \in \mathscr{B}^2$ comprises rather the so-called Souslin or analytic subsets of $\mathbb{R}$.
See SRIVASTAVA [1998] and CHOQUET [1969].
For any non-Borel set $A \subset \mathbb{R}$, Exercise 2 of §6 shows that $A \times \{0\}$ is a non-Borel
subset of the $\lambda^2$-null set $\mathbb{R} \times \{0\}$.
5. Examples 4 and 5 of §7 as well as Theorem 8.4 illustrate that the L-B
measure $\lambda^d$ is not invariant with respect to all homeomorphisms $T : \mathbb{R}^d \to \mathbb{R}^d$ of $\mathbb{R}^d$
with itself. For such a homeomorphism $T$ however, $\mu := T(\lambda^d)$ is always a measure
on $\mathscr{B}^d$ with the following properties: (i) $\mu(K) < +\infty$ for every compact
$K \subset \mathbb{R}^d$; (ii) $\mu(\{x\}) = 0$ for every $x \in \mathbb{R}^d$; (iii) $\mu(U) > 0$ for every non-empty open
$U \subset \mathbb{R}^d$; (iv) $\mu(\mathbb{R}^d) = +\infty$. OXTOBY and ULAM [1941] showed that, conversely,
every measure $\mu$ on $\mathscr{B}^d$ enjoying properties (i)-(iv) has the form $\mu = T(\lambda^d)$ for
some homeomorphism $T : \mathbb{R}^d \to \mathbb{R}^d$. A simpler treatment of their result was later
provided by GOFFMAN and PEDRICK [1975].
Exercises.
1. Let $T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$ be a measurable mapping, $\mu$ a measure on the
$\sigma$-algebra $\mathscr{A}$, and $\mu' := T(\mu)$ its image under this mapping. $(\Omega, \mathscr{A}_0, \mu_0)$ and $(\Omega', \mathscr{A}'_0, \mu'_0)$
will denote the completions of these measure spaces (Exercise 7, §5). Show that the
mapping $T$ is also $\mathscr{A}_0$-$\mathscr{A}'_0$-measurable and that $T(\mu_0) = \mu'_0$. From this it follows
that Lebesgue measure in $\mathbb{R}^d$ is also motion-invariant.
2. Let $T$ be a linear mapping of $\mathbb{R}^d$ into itself with $\det T = 0$. Show that for every
$A \in \mathscr{B}^d$, $T(A)$, although it may fail to be a Borel set (as noted in Remark 4), is
at least a Lebesgue-null set, thus a subset of an L-B-null set, namely the linear
subspace $T(\mathbb{R}^d)$ of $\mathbb{R}^d$. In this sense equality (8.14) retains its validity for linear
transformations $T : \mathbb{R}^d \to \mathbb{R}^d$ with $\det T = 0$, i.e., (8.14) is valid for every linear
transformation $T$ of $\mathbb{R}^d$ into itself.
3. Show that the set $K$ constructed in the proof of Theorem 8.6 is not even
Lebesgue measurable.
4. In the section entitled "Fallacies, Flaws and Flimflam", p. 39, vol. 22, no. 1
(1991) of the College Mathematics Journal the following short "proof" of
Theorem 8.6 is offered: Suppose that $\lambda^1(X)$ is defined for every subset $X$ of $[0, 1]$. By
isotoneity it is a number in $[0, 1]$. Consider the set $B$ defined as
$\{\lambda^1(X) : X \in \mathscr{P}([0, 1]),\ \lambda^1(X) \notin X\}$. It is a subset of $[0, 1]$ and upon testing the number $\lambda^1(B)$
for membership in $B$ we find that the statements $\lambda^1(B) \in B$ and $\lambda^1(B) \notin B$ are
equivalent, a contradiction. What is the error in this reasoning, or is it perhaps
a legitimate proof of Theorem 8.6?
Chapter II
Integration Theory
2. For an arbitrary subset $Q$ of $\mathbb{R}^d$ consider the measurable space $(Q, Q \cap \mathscr{B}^d)$. The
corresponding measurable numerical functions on $Q$ will be called Borel measurable
functions or Borel functions on $Q$. Every continuous numerical function $f$ on $Q$
is such a Borel measurable function. Indeed, for every $\alpha \in \mathbb{R}$ the set $Q_\alpha$ of all
$x \in Q$ with $f(x) \ge \alpha$ is a relatively closed subset of $Q$, that is, of the form $Q \cap F$
for a set $F$ which is closed in $\mathbb{R}^d$. (Such an $F$ would be, e.g., the closure of $Q_\alpha$
in $\mathbb{R}^d$.) Since $F \in \mathscr{B}^d$, this intersection lies in the trace $\sigma$-algebra $Q \cap \mathscr{B}^d$. The
claim therefore follows from the next theorem. (Cf. also Example 2 of §7.)
both lie in :9. Consequently, along with each Q E :N, the set R fl Q is also in :9.
In other words, R n C :9 and therewith 91 C .2. This fact together with
(-oo), {+oo} E and the remarks preceding (9.1) make it clear that 91 C :N,
so that finally we have _'l = .1.
It may be noted that the four related assertions in which quantification is over
all a E R are also equivalent.
A plethora of assertions about calculating with measurable numerical functions
now presents itself.
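9.3 Theorem. For any $\mathscr{A}$-measurable functions $f, g : \Omega \to \bar{\mathbb{R}}$ the sets $\{f < g\}$,
$\{f \le g\}$, $\{f = g\}$ and $\{f \ne g\}$ lie in $\mathscr{A}$.
Proof. Because the set $\mathbb{Q}$ of rational numbers is countable, the claims follow (with
the help of 9.2) from the equalities
$$\{f < g\} = \bigcup_{\rho \in \mathbb{Q}} \{f < \rho\} \cap \{\rho < g\};$$
$$\{f \le g\} = \complement\{f > g\}; \qquad \{f = g\} = \{f \le g\} \cap \{g \le f\};$$
$$\{f \ne g\} = \complement\{f = g\}.$$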
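The first of these equalities rests on the density of the rationals in the real line. As an illustration, the pointwise argument can be written out as follows (the letter $\omega$ for a point of $\Omega$ follows the usage elsewhere in this chapter):

```latex
\[
  f(\omega) < g(\omega)
  \iff \exists\, \rho \in \mathbb{Q}:\ f(\omega) < \rho < g(\omega)
  \iff \omega \in \bigcup_{\rho \in \mathbb{Q}} \{f < \rho\} \cap \{\rho < g\}.
\]
```

The union is countable, so it stays inside the $\sigma$-algebra once each set $\{f < \rho\}$ and $\{\rho < g\}$ is known to lie there.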
9.4 Theorem. Along with $f, g : \Omega \to \bar{\mathbb{R}}$, the function $f \cdot g$ and, if everywhere
defined, the functions $f + g$ and $f - g$ are also $\mathscr{A}$-measurable.
Proof. First of all, along with $g$, $\sigma + \tau g$ is measurable for all $\sigma, \tau \in \mathbb{R}$. This follows
from 9.2 because $\{\sigma + \tau g \ge \alpha\}$ is $\{g \ge (\alpha - \sigma)/\tau\}$ if $\tau > 0$ and is $\{g \le (\alpha - \sigma)/\tau\}$
if $\tau < 0$, the case $\tau = 0$ being trivial. This preliminary remark takes care of the
passage from $g$ to $-g$ and reduces the case $f - g$ to the case $f + g$. Furthermore,
together with the remark following 9.2 and the equalities
$$\{f + g \ge \alpha\} = \{f \ge \alpha - g\} \qquad (\alpha \in \mathbb{R})$$
it yields the measurability of $f + g$.
In investigating $f \cdot g$ we will first suppose both functions are real-valued. Then
the identity
$$fg = \tfrac{1}{4}(f + g)^2 - \tfrac{1}{4}(f - g)^2$$
reduces the product question to the case $g = f$. But $\{f^2 \ge \alpha\}$ is $\Omega$ if
$\alpha \le 0$ and
is $\{f \ge \sqrt{\alpha}\} \cup \{f \le -\sqrt{\alpha}\}$ if $\alpha > 0$, which shows that the measurability of $f^2$
follows from that of $f$.
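For completeness, the identity used in this reduction can be checked by expanding both squares. The short computation below is only a worked verification for real-valued $f$ and $g$:

```latex
\begin{align*}
  (f+g)^2 - (f-g)^2 &= (f^2 + 2fg + g^2) - (f^2 - 2fg + g^2) = 4fg,\\
  fg &= \tfrac{1}{4}(f+g)^2 - \tfrac{1}{4}(f-g)^2 .
\end{align*}
```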
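Proof. Apply 9.5 to the ultimately constant sequence $f_1, \ldots, f_n, f_n, \ldots$. $\Box$
9.7 Corollary 2. If a sequence $(f_n)_{n \in \mathbb{N}}$ of $\mathscr{A}$-measurable functions converges
pointwise throughout $\Omega$, that is, if $\lim_{n \to \infty} f_n(\omega)$ exists in $\bar{\mathbb{R}}$ for every $\omega \in \Omega$, then
the limit function $\lim_{n \to \infty} f_n$ is $\mathscr{A}$-measurable.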
Exercises.
1. Let $(\Omega, \mathscr{A})$ be a measurable space, $D$ a dense subset of $\mathbb{R}$ (e.g., $\mathbb{Q}$). Show that
a numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if the analog for all $\alpha \in D$ of one of
(a)-(d) in Theorem 9.2 holds.
2. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of $\mathscr{A}$-measurable numerical functions on a measurable
space $(\Omega, \mathscr{A})$. Why is the set of all $\omega \in \Omega$ for which the sequence $(f_n(\omega))_{n \in \mathbb{N}}$
converges in $\mathbb{R}$, and that for which it converges in $\bar{\mathbb{R}}$, $\mathscr{A}$-measurable?
3. The real function $f : \Omega \to \mathbb{R}$ is measurable on the measurable space $(\Omega, \mathscr{A})$. Are
$\exp f$ and $\sin f$, that is, the functions $\omega \mapsto e^{f(\omega)}$ and $\omega \mapsto \sin f(\omega)$, $\mathscr{A}$-measurable?
4. With the aid of Theorem 9.1 show that the real function defined on $\mathbb{R}^2$ by
$(x, y) \mapsto \max\{x, y\}$ is $\mathscr{B}^2$-measurable. Deduce from this another proof of Corollary 9.6.
5. Show via an example that the measurability of a numerical function $f$ is not
always a consequence of the measurability of $|f|$.
If $\{\alpha_1, \ldots, \alpha_n\}$ is the set of distinct values of a function $u \in E$, then the sets
$A_i := u^{-1}(\alpha_i)$, $i = 1, \ldots, n$, are pairwise disjoint, and as pre-images of the Borel
sets $\{\alpha_i\}$ they each lie in $\mathscr{A}$. Using the notation for indicator functions introduced
in (9.2), we have then
$$(10.1) \qquad u = \sum_{i=1}^{n} \alpha_i 1_{A_i}.$$
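10.2 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. For any normal representations
$$u = \sum_{i=1}^{m} \alpha_i 1_{A_i} = \sum_{j=1}^{n} \beta_j 1_{B_j}$$
of a function $u \in E$ we have
$$\sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{j=1}^{n} \beta_j \mu(B_j).$$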
Proof. From
$$\Omega = A_1 \cup \ldots \cup A_m = B_1 \cup \ldots \cup B_n$$
follows
$$A_i = \bigcup_{j=1}^{n} (A_i \cap B_j) \quad \text{and} \quad B_j = \bigcup_{i=1}^{m} (A_i \cap B_j)$$
§10. Elementary functions and their integral 55
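in which the sets $A_i \cap B_j$ are pairwise disjoint. The finite additivity of $\mu$ therefore
supplies the equalities
$$\mu(A_i) = \sum_{j=1}^{n} \mu(A_i \cap B_j) \quad \text{and} \quad \mu(B_j) = \sum_{i=1}^{m} \mu(A_i \cap B_j).$$
$$(10.3) \qquad \int u \, d\mu := \sum_{i=1}^{n} \alpha_i \mu(A_i)$$
which is independent of the special choice of normal representation
$$u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$$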
Properties (10.4) and (10.5) are immediate from 10.3. The next property in the
list is confirmed thus: Start with normal representations
$$u = \sum_{i=1}^{m} \alpha_i 1_{A_i} \quad \text{and} \quad v = \sum_{j=1}^{n} \beta_j 1_{B_j}$$
the first for all $i \in \{1, \ldots, m\}$, the second for all $j \in \{1, \ldots, n\}$, from which in turn
new normal representations
$$u = \sum_{i,j} \alpha_i 1_{A_i \cap B_j}, \quad v = \sum_{i,j} \beta_j 1_{A_i \cap B_j} \quad \text{and} \quad u + v = \sum_{i,j} (\alpha_i + \beta_j) 1_{A_i \cap B_j}$$
emerge. Using them to compute all the integrals,
involving the same sets $C_1, \ldots, C_k \in \mathscr{A}$. In case $u \le v$, it then follows that $\gamma_i \le \delta_i$
for each $i \in \{1, \ldots, k\}$ such that $C_i \ne \emptyset$, and from this we have (10.7).
Now let $u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ be an arbitrary representation of an elementary function
$u \in E$ with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A_i \in \mathscr{A}$, but not necessarily a normal
representation. From (10.4)-(10.6) it follows that
$$\int u \, d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i).$$
For normal representations this equation served as the definition of $\int u \, d\mu$. Its
validity without this restriction, which we now perceive, indicates that the
introduction of normal representations was simply a technique of proof.
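The independence of the integral from the chosen representation can be observed on a small example. The following Python sketch is an illustration only: the six-point space, its point weights and the particular overlapping representation are assumptions made for the demonstration, and all helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical finite measure space: Omega = {0,...,5}, mu given by point weights.
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1),
           3: Fraction(2), 4: Fraction(0), 5: Fraction(1, 6)}

def mu(subset):
    """Measure of a subset of Omega: the sum of its point weights."""
    return sum((WEIGHTS[w] for w in subset), Fraction(0))

def integral_from_representation(rep):
    """Sum of alpha_i * mu(A_i) over a representation [(alpha_i, A_i), ...]."""
    return sum((alpha * mu(A) for alpha, A in rep), Fraction(0))

def evaluate(rep, w):
    """Value u(w): the sum of alpha_i over those i with w in A_i."""
    return sum((alpha for alpha, A in rep if w in A), Fraction(0))

def normal_representation(rep):
    """Group Omega by the distinct values of u; the level sets are
    pairwise disjoint and cover Omega."""
    levels = {}
    for w in WEIGHTS:
        levels.setdefault(evaluate(rep, w), set()).add(w)
    return sorted(levels.items())

# An arbitrary representation: the sets overlap and do not cover Omega.
arbitrary = [(Fraction(2), {0, 1, 2}),
             (Fraction(3), {2, 3}),
             (Fraction(1), {1, 2, 3, 4})]
normal = normal_representation(arbitrary)

lhs = integral_from_representation(arbitrary)
rhs = integral_from_representation(normal)
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sums can be compared for equality rather than up to rounding.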
Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space and $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that
for every $\mathscr{A}_0$-elementary function $u$ there are $\mathscr{A}$-elementary functions $u_1, u_2$ such
that $u_1 \le u \le u_2$ and $\mu(\{u_1 \ne u_2\}) = 0$. For every such pair, $\int u_1 \, d\mu = \int u_2 \, d\mu =
\int u \, d\mu_0$. (Cf. Exercise 7(d) in §5.)
§11. The integral of non-negative measurable functions 57
2. The function $1_{\mathbb{Q}}$ on $\mathbb{R}$ has long been known as Dirichlet's jump function. Is it
a $\mathscr{B}^1$-elementary function?
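11.1 Theorem. For every isotone sequence $(u_n)_{n \in \mathbb{N}}$ of functions from $E$ and
every $u \in E$
$$u \le \sup_{n \in \mathbb{N}} u_n \implies \int u \, d\mu \le \sup_{n \in \mathbb{N}} \int u_n \, d\mu.$$
of $u$ with sets $A_j \in \mathscr{A}$ and coefficients $\alpha_j \in \mathbb{R}_+$, and let $\alpha$ be any number in $]0,1[$.
Then because of measurability the set
$$B_n := \{u_n \ge \alpha u\}$$
lies in $\mathscr{A}$ for each $n \in \mathbb{N}$. From this definition follows on the one hand that $u_n \ge
\alpha u 1_{B_n}$ and consequently by (10.5) and (10.7)
$$\int u_n \, d\mu \ge \alpha \int u 1_{B_n} \, d\mu$$
for every $n \in \mathbb{N}$. Since the sequence $(u_n)$ is isotone and $u \le \sup u_n$, it follows on
the other hand that $B_n \uparrow \Omega$, and so $A_j \cap B_n \uparrow A_j$ for each $j \in \{1, \ldots, m\}$ and
consequently, because $\mu$ is continuous from below
$$\int u \, d\mu = \sum_{j=1}^{m} \alpha_j \mu(A_j) = \lim_{n \to \infty} \sum_{j=1}^{m} \alpha_j \mu(A_j \cap B_n) = \lim_{n \to \infty} \int u 1_{B_n} \, d\mu.$$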
Proof. For every $m \in \mathbb{N}$, $v_m \le \sup_n u_n$ and $u_m \le \sup_n v_n$, from which inequalities
and 11.1 follow
Now let
$$(11.3) \qquad E^* = E^*(\Omega, \mathscr{A})$$
designate the set of all non-negative numerical functions $f$ on $\Omega$ for which an
isotone sequence $(u_n)$ of functions from $E$ can be found satisfying
$$\sup_{n \in \mathbb{N}} u_n = f.$$
Then according to (11.2) the number
$$\sup_{n \in \mathbb{N}} \int u_n \, d\mu \in \bar{\mathbb{R}}_+$$
depends only on $f$ and not on the special representing sequence $(u_n)$ of $f$ used
to compute it. We're in a position similar to that of 10.3. Therefore we make the
Proof. From the definition of $E^*$ and from (10.2) follows (11.5). One only has
to note that $\sup_n u_n = \lim_n u_n$ for isotone sequences $(u_n)$. The earlier proofs carry
over almost verbatim to (11.6) and (11.7). We'll do (11.7) and leave (11.6) for the
reader: Let $f = \sup_n u_n$, $g = \sup_n v_n$ be representations of $f, g \in E^*$ by means of
isotone sequences of elementary functions. Then by definition
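11.4 Theorem (on monotone convergence). For every isotone sequence $(f_n)_{n \in \mathbb{N}}$
of functions from $E^*$
$$\sup_{n \in \mathbb{N}} f_n \in E^* \quad \text{and} \quad \int \sup_{n \in \mathbb{N}} f_n \, d\mu = \sup_{n \in \mathbb{N}} \int f_n \, d\mu.$$
Proof. Set $f := \sup_n f_n$. It suffices to find an isotone sequence $(v_n)$ of functions
from $E$ which satisfy
$$\sup_{n \in \mathbb{N}} v_n = f \quad \text{and} \quad v_n \le f_n \text{ for every } n \in \mathbb{N}.$$
For then $f \in E^*$ and $\int f \, d\mu = \sup_n \int v_n \, d\mu$ by definition of the integral in $E^*$, while
$\int v_n \, d\mu \le \int f_n \, d\mu$ by (11.8). Consequently, $\int f \, d\mu \le \sup_n \int f_n \, d\mu$ and therewith the
equality claimed by the theorem follows, since the other inequality $\sup_n \int f_n \, d\mu \le
\int f \, d\mu$ is immediate from (11.8) and the fact that $f_n \le f$ for all $n$.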
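The sequence $(v_n)$ is gotten thus: For each $f_n$ there is by definition an isotone
sequence $(u_{mn})_{m \in \mathbb{N}}$ of functions from $E$ with $\sup_{m \in \mathbb{N}} u_{mn} = f_n$. According to (10.2)
the functions
$$v_m := u_{m1} \vee \ldots \vee u_{mm}$$
lie in $E$ (for each $m \in \mathbb{N}$). The isotoneity of each sequence $(u_{mn})_{m \in \mathbb{N}}$ clearly entails
that of the sequence $(v_m)_{m \in \mathbb{N}}$. From the isotoneity of $(f_m)_{m \in \mathbb{N}}$ follows $v_m \le f_m$
for all $m$, and thus $\sup_m v_m \le f$. For all $m \ge n$ we have $u_{mn} \le v_m$ and so
$$f_n = \sup_{m \in \mathbb{N}} u_{mn} \le \sup_{m \in \mathbb{N}} v_m \quad \text{for every } n \in \mathbb{N}.$$
Together with the preceding this gives finally $\sup_m v_m = f$. Therefore $(v_n)$ is a
sequence with the needed properties. $\Box$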
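$$\sum_{n=1}^{\infty} f_n \in E^* \quad \text{and} \quad \int \Big( \sum_{n=1}^{\infty} f_n \Big) d\mu = \sum_{n=1}^{\infty} \int f_n \, d\mu.$$
Proof. Apply 11.4 to the sequence $(f_1 + \ldots + f_n)_{n \in \mathbb{N}}$ and recall (11.7). $\Box$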
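Examples. 1. Let $(\Omega, \mathscr{A})$ be an arbitrary measurable space and $\varepsilon_\omega$ the measure
defined on $\mathscr{A}$ by unit mass at the point $\omega \in \Omega$ (cf. Example 5 in §3). Then
$$\int f \, d\varepsilon_\omega = f(\omega)$$
for every $f \in E^*$. Due to 11.3 we can at once assume that $f \in E$.
If, however, $f = \sum \alpha_i 1_{A_i}$ is a normal representation of $f$, then $\omega$ lies in exactly
one of the sets $A_i$, say in $A_{i_0}$. Then $\int f \, d\varepsilon_\omega = \sum \alpha_i \varepsilon_\omega(A_i) = \alpha_{i_0} = f(\omega)$.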
2. Consider $\Omega := \mathbb{N}$ and $\mathscr{A} := \mathscr{P}(\mathbb{N})$. The $\sigma$-additivity requirement means that
a measure $\mu$ on $\mathscr{A}$ is uniquely defined whenever numbers $\alpha_n = \mu(\{n\}) \in \bar{\mathbb{R}}_+$ are
specified for each $n \in \mathbb{N}$. $E^*$ consists of all numerical functions $f \ge 0$ on $\Omega$. Indeed,
one sets $f_n := f(n) 1_{\{n\}}$ for each $n \in \mathbb{N}$ and then $f_n \in E^*$, and in case $f(n) < +\infty$,
$f_n \in E$. Since
$$f = \sum_{n=1}^{\infty} f_n,$$
Corollary 11.5 yields $f \in E^*$ and
$$\int f \, d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n.$$
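The series in Example 2 is the supremum of the integrals of the partial sums $f_1 + \ldots + f_N$. As a minimal sketch, the Python lines below compute these partial integrals exactly; the point masses $\alpha_n = 2^{-n}$ and the function $f(n) = n$ are illustrative assumptions, not taken from the text, and the function names are hypothetical.

```python
from fractions import Fraction

def alpha(n):
    """Point mass mu({n}); the choice 2**-n is an illustrative assumption."""
    return Fraction(1, 2**n)

def f(n):
    """A non-negative function on N; f(n) = n is an illustrative assumption."""
    return Fraction(n)

def partial_integral(N):
    """Integral of f_1 + ... + f_N, where f_n = f(n) * 1_{{n}}."""
    return sum((f(n) * alpha(n) for n in range(1, N + 1)), Fraction(0))

# The partial integrals form an isotone sequence; their supremum is the integral of f.
partial_sums = [partial_integral(N) for N in range(1, 31)]
print(float(partial_sums[-1]))
```

For this particular choice the partial integrals increase towards $\sum_{n \ge 1} n 2^{-n} = 2$, which is then the value of $\int f \, d\mu$.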
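$$\int f \, d\mu = \sum_{n=1}^{\infty} \int f \, d\mu_n.$$
This is evidently true of indicator functions $f$, so the claimed equality holds for
all elementary functions. Transition to an arbitrary $f \in E^*$ is accomplished thus:
Let $(u_n)$ be a sequence in $E$ with $u_n \uparrow f$. Then the double sequence
$$\alpha_{mn} := \sum_{i=1}^{n} \int u_m \, d\mu_i \qquad (m, n \in \mathbb{N})$$
satisfies
$$\sup_{m \in \mathbb{N}} \Big( \sup_{n \in \mathbb{N}} \alpha_{mn} \Big) = \sup_{n \in \mathbb{N}} \Big( \sup_{m \in \mathbb{N}} \alpha_{mn} \Big) \quad \Big( = \sup_{m,n \in \mathbb{N}} \alpha_{mn} \Big),$$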
is a normal representation of a function in $E$. On the set $A_{in}$ the function $u_{n+1}$ can
take only the values $(2i) 2^{-n-1}$ and $(2i + 1) 2^{-n-1}$ if $i \in \{0, \ldots, n 2^n - 1\}$, and only
values $\ge n$ when $i = n 2^n$. Therefore the sequence $(u_n)$ is isotone. It satisfies
$\sup_n u_n = f$, because for any $\omega \in \Omega$ either $f(\omega) = +\infty$, in which case $u_n(\omega) = n$
for every $n$, or $f(\omega) < +\infty$, in which case $u_n(\omega) \le f(\omega) < u_n(\omega) + 2^{-n}$ for all
$n > f(\omega)$. Thus $f$ lies in $E^*$. $\Box$
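The two properties used at the end of this proof, isotoneity and the bound $u_n \le f < u_n + 2^{-n}$ for $n > f$, can be observed numerically. The Python sketch below assumes the usual dyadic construction suggested by the values quoted above, namely that $u_n$ equals $i 2^{-n}$ where $i 2^{-n} \le f < (i+1) 2^{-n}$ and equals $n$ where $f \ge n$; the function name and the sample values are hypothetical.

```python
import math

def dyadic_approximation(value, n):
    """u_n applied to a single value f(w) in [0, +inf]:
    n where f(w) >= n, otherwise the largest multiple of 2**-n not exceeding f(w)."""
    if value >= n:
        return float(n)
    return math.floor(value * 2**n) / 2**n

# A few sample values of f(w), including +infinity.
samples = [0.0, 0.3, 2.7, math.pi, math.inf]
for x in samples:
    approximations = [dyadic_approximation(x, n) for n in range(1, 8)]
    print(x, approximations)
```

Each printed row is non-decreasing in $n$; for finite $f(\omega)$ the approximations stay within $2^{-n}$ of the value once $n$ exceeds it, and for $f(\omega) = +\infty$ they equal $n$.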
If now $\mu$ is the measure defined in Examples 2 and 7 of §3 which takes only the
values 0 and 1, then it follows from the preceding deliberations that
Exercises.
1. Show that every bounded, $\mathscr{A}$-measurable, non-negative real-valued function
on a measurable space $(\Omega, \mathscr{A})$ is the uniform limit of an isotone sequence of
$\mathscr{A}$-measurable elementary functions.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with a finite measure $\mu$. Further, let
$f, f_1, f_2, \ldots$ be measurable numerical functions on $\Omega$. Prove the equivalence of
§12. Integrability
By now the integral $\int f \, d\mu$ is defined for all non-negative $\mathscr{A}$-measurable numerical
functions on $\Omega$, as a result of 11.4 and 11.6 together. In a third and final step $\int f \, d\mu$
will now be defined for certain numerical functions $f$ which are not of constant
sign.
According to Theorem 9.8, $f$ is measurable just if both its positive part $f^+$ and
its negative part $f^-$ are measurable. This remark prompts the following definition:
the integral of $f$ exists and one uses (12.1) to define $\int f \, d\mu \in \bar{\mathbb{R}}$. Only occasionally
will we be concerned with this obvious generalization.
2. In the special case $\mu = \lambda^d$ we speak of Lebesgue integrable functions (on $\mathbb{R}^d$)
and of their Lebesgue integrals. If a Borel measure $\mu_F$ on $\mathbb{R}^d$ is described with the
help of a measure-generating function $F$ on $\mathbb{R}^d$ (cf. §6), the $\mu_F$-integrable
functions $f$ on $\mathbb{R}^d$ are called Lebesgue-Stieltjes integrable (or Stieltjes integrable) with
respect to $F$. One speaks of its (Lebesgue-)Stieltjes integral and writes $\int f \, dF$
instead of $\int f \, d\mu_F$. The general theory of measure and integration has however
displaced this terminology and the notation $\int f \, dF$, despite their historical
significance.
Let us now summarize the most important properties of the conceptual edifice
just built:
12.2 Theorem. Each of the following four statements is equivalent to the
integrability of the measurable numerical function $f$ on $\Omega$:
(a) $f^+$ and $f^-$ are integrable.
(b) There are integrable functions $u \ge 0$, $v \ge 0$ such that $f = u - v$. (Note that the
last equality entails that $u(\omega) - v(\omega)$ is defined (in $\bar{\mathbb{R}}$) for every $\omega \in \Omega$.)
(c) There is an integrable function $g$ with $|f| \le g$.
(d) $|f|$ is integrable.
Proof. What has to be shown is the equivalence of (a) through (d), since (a)
constitutes the definition of $f$ being integrable.
(a)$\Rightarrow$(b): According to (9.8), $u := f^+$ and $v := f^-$ do the job required in (b).
(b)$\Rightarrow$(c): Because the integral is additive on $E^*$, along with $u$ and $v$, $u + v$ is
also integrable. Since $f = u - v \le u \le u + v$ and $-f = v - u \le v \le u + v$, the
function $g := u + v$ is as required.
(c)$\Rightarrow$(d): This follows from the isotoneity of the integral on $E^*$ and the fact
that $|f| \in E^*$ (Theorems 11.6 and 9.8): $\int |f| \, d\mu \le \int g \, d\mu < +\infty$.
(d)$\Rightarrow$(a): Upon recalling that $f^+ \le |f|$ and $f^- \le |f|$, this too follows from the
isotoneity of the integral on $E^*$.
In (b), $f = u - v = f^+ - f^-$ and so $u + f^- = v + f^+$, which via (11.7)
yields $\int u \, d\mu + \int f^- \, d\mu = \int v \, d\mu + \int f^+ \, d\mu$ and therewith the last assertion of the
theorem, since all the integrals here are finite. $\Box$
are integrable.
$$(12.3) \qquad f \le g \implies \int f \, d\mu \le \int g \, d\mu$$
$$(12.4) \qquad \Big| \int f \, d\mu \Big| \le \int |f| \, d\mu.$$
Proof. From $f \le g$ follows $f^+ \le g^+$ and $f^- \ge g^-$, and from these inequalities and
the isotoneity of the integral on $E^*$ follows (12.3). Because $f \le |f|$ and $-f \le |f|$,
(12.4) follows from the first equality in (12.2) and from (12.3), with $|f|$ in the role
of $g$ there.
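Examples. 1. Let $(\Omega, \mathscr{A})$ be any measurable space, $\varepsilon_\omega$ the measure on $\mathscr{A}$ defined
by unit mass at $\omega \in \Omega$. According to Example 1 of §11, the $\varepsilon_\omega$-integrable functions
are just the $\mathscr{A}$-measurable numerical functions $f$ on $\Omega$ with $|f(\omega)| < +\infty$. For them
$$\int f \, d\varepsilon_\omega = f(\omega).$$
2. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Example 2 of §11, $\mu(\{n\}) = \alpha_n$
for $n \in \mathbb{N}$. From what was shown there it follows that the $\mu$-integrable functions
are just the numerical functions $f$ on $\mathbb{N}$ with $\sum_{n=1}^{\infty} |f(n)| \alpha_n < +\infty$, and for them
$$\int f \, d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n.$$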
3. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Examples 2 and 1 of §3. A
function $f : \Omega \to \mathbb{R}$ is then $\mu$-integrable if and only if it is equal to a real constant $\alpha$
throughout the complement of some countable subset of $\Omega$. From Example 4 of §11
we have $\int f \, d\mu = \alpha$ for such an $f$.
4. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$. Then every constant real
function, and consequently after 12.2, every bounded, measurable real function
on $\Omega$ is $\mu$-integrable.
5. Let $\mu$ and $\nu$ be measures on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. A numerical function $f$ on $\Omega$
is $(\mu + \nu)$-integrable if and only if it is both $\mu$- and $\nu$-integrable, and in this case
$$\int f \, d(\mu + \nu) = \int f \, d\mu + \int f \, d\nu.$$
In fact: For every non-negative $\mathscr{A}$-measurable function $g$ on $\Omega$, $\int g \, d(\mu + \nu) =
\int g \, d\mu + \int g \, d\nu$ holds by Example 3 in §11. Applied to $g := |f|$ this and 12.2 prove
the integrability claim, and applied to $g := f^+$ and $g := f^-$ it implies the claimed
equality. In particular
$$\mathscr{L}^1(\mu + \nu) = \mathscr{L}^1(\mu) \cap \mathscr{L}^1(\nu)$$
is valid.
We can now free ourselves of the restriction that functions always be integrated
over the whole $\Omega$. (11.5) insures that along with any pair of functions from $E^* =
E^*(\Omega, \mathscr{A})$ their product is also in $E^*$. So from $f \in E^*$ and $A \in \mathscr{A}$ follows $1_A f \in E^*$.
If $f$ is an integrable numerical function on $\Omega$, then so is $1_A f$, for every $A \in \mathscr{A}$:
Because of the trivial inequality $|1_A f| \le |f|$, this is immediate from 12.2 (and 9.4).
In the light of this the following seems natural:
$$(12.7) \qquad \int_\Omega f \, d\mu = \int f \, d\mu.$$
The following rules of calculation are evident, for all $f, g$ which either lie in $E^*$
or are integrable:
f Afdj<_f gdµ
A
One merely has to reflect on the definitions involved. Moreover, pursuant to the
discussion after (12.5),
$$(12.10) \qquad f \mapsto \int_A f \, d\mu$$
is a linear form on $\mathscr{L}^1(\mu)$, for each $A \in \mathscr{A}$.
But we can get at integrals over sets in $\mathscr{A}$ in a different way, namely by
considering the restriction $\mu_A$ of the given measure $\mu$ to the trace $\sigma$-algebra $A \cap \mathscr{A}$.
That one is thereby led to the same result is the content of
12.5 Lemma. Let $A \in \mathscr{A}$ and for every function $f$ on $\Omega$ which either lies in $E^*$
or is $\mu$-integrable let $f'$ denote the restriction of $f$ to $A$, and $\mu_A$ the restriction
of $\mu$ to $A \cap \mathscr{A}$. Then
$$(12.11) \qquad \int f' \, d\mu_A = \int_A f \, d\mu.$$
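Proof. First consider $f \in E^*(\Omega, \mathscr{A})$. Then $f' \in E^*(A, A \cap \mathscr{A})$ since
$$(f')^{-1}(B) = A \cap f^{-1}(B)$$
holds for all Borel sets $B$ in $\bar{\mathbb{R}}$ (cf. 11.6). For the function $1_A f \in E^*$ there is
a sequence $(u_n)$ of $\mathscr{A}$-elementary functions satisfying $u_n \uparrow 1_A f$. The sequence $(u'_n)$
of restrictions to $A$ obviously consists of $A \cap \mathscr{A}$-elementary functions that satisfy
$u'_n \uparrow f'$, from all of which follows that
$$(12.12) \qquad \int_A f \, d\mu = \sup_{n \in \mathbb{N}} \int u_n \, d\mu \quad \text{and} \quad \int f' \, d\mu_A = \sup_{n \in \mathbb{N}} \int u'_n \, d\mu_A.$$
It follows from a representation
$$u_n = \sum_{i=1}^{k_n} \alpha_i 1_{A_i}$$
that
$$u'_n = \sum_{i=1}^{k_n} \alpha_i 1'_{A_i}.$$
(Notice that for $Q \subset A$, the restriction $1'_Q$ coincides with the indicator function
with respect to $A$ of $Q$.) From the last two equalities we see that
$$\int u_n \, d\mu = \int u'_n \, d\mu_A$$
because each integral equals $\sum_{i=1}^{k_n} \alpha_i \mu(A_i)$, and from these equalities and (12.12)
follows (12.11) for $f \in E^*(\Omega, \mathscr{A})$.
If $f$ is $\mu$-integrable the preceding can be applied to both $f^+$ and $f^-$. All integrals
are finite and it is obvious that $(f')^+ = (f^+)'$, $(f')^- = (f^-)'$, so (12.11) follows
from linearity of the integral.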
(12.13)
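and in the second case to say that $f$ is also $\mu$-integrable over $A$. With the aid of
Lemma 12.5 we thereby get:
$$f_A(\omega) := \begin{cases} f(\omega) & \text{if } \omega \in A \\ 0 & \text{if } \omega \in \Omega \setminus A \end{cases}$$
is $\mu$-integrable. In this case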
Exercises.
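1. Characterize the functions $u \in E(\Omega, \mathscr{A})$ which are $\mu$-integrable.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The indicator function $1_A$ of a set $A \in \mathscr{A}$
is $\mu$-integrable just when $\mu(A) < +\infty$. Such sets are called $\mu$-integrable, and $\mathscr{R}$
will denote their totality. Show that $\mathscr{R}$ is an ideal in the ring $\mathscr{A}$ (cf. Exercise 4
in §1); in particular, $R \in \mathscr{R}$ and $A \in \mathscr{A}$ $\Rightarrow$ $A \cap R \in \mathscr{R}$. For a $\sigma$-finite measure $\mu$
a converse also holds: $A \subset \Omega$ and $A \cap R \in \mathscr{R}$ for all $R \in \mathscr{R}$ implies $A \in \mathscr{A}$.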
For the further construction of the theory the concept of a negligible set, already
frequently mentioned in Chapter I, will now play an important role. We recall:
$N \subset \Omega$ is called a ($\mu$-)nullset if $N \in \mathscr{A}$ and $\mu(N) = 0$. The union of every
sequence of $\mu$-nullsets is again one (3.10), as is every set in $\mathscr{A}$ which is contained
in a $\mu$-nullset, thanks to isotoneity (cf. Exercise 5 in §3).
Be careful: It is not required that the set $N_\eta$ of all $\omega \in \Omega$ which do not enjoy property $\eta$
be a $\mu$-nullset. Indeed, generally $N_\eta$ may not belong to $\mathscr{A}$. For example, if $A$ is
a subset of $\Omega$ which does not belong to $\mathscr{A}$ and $\eta$ is the property "$\omega$ is a point
of $A$", then $N_\eta = \complement A$ is not in $\mathscr{A}$.
Examples of properties $\eta$ which will come up in the sequel are: Equality of the
values at a point $\omega \in \Omega$ of two functions $f$ and $g$ which are defined on $\Omega$, finiteness
of the value at $\omega \in \Omega$ of a function $f$, etc. Corresponding to these we have the
following modes of speaking: $f$ and $g$ are ($\mu$-)almost everywhere equal on $\Omega$, in
symbols
$$f = g \quad (\mu\text{-})\text{almost everywhere};$$
$f$ is ($\mu$-)almost everywhere finite, in symbols
$$|f| < +\infty \quad (\mu\text{-})\text{almost everywhere};$$
§13. Almost everywhere prevailing properties 71
13.2 Theorem. For every $f \in E^*(\Omega, \mathscr{A})$, that is, (cf. 11.6) for every
$\mathscr{A}$-measurable, non-negative numerical function $f$
$$\int f \, d\mu = 0 \iff f = 0 \ \mu\text{-almost everywhere}.$$
Proof. If $f \ge 0$, this claim follows from the theorem, because each function $1_N f$
lies in $E^*(\Omega, \mathscr{A})$ and is almost everywhere 0. In turn, application of this to $f^+$
and $f^-$ delivers the full claim. $\Box$
$$\int_N f \, d\mu = \int_N g \, d\mu = 0.$$
On the other hand, for $M = \complement N$ we have $1_M f = 1_M g$ due to the definition of $N$,
and so by (12.6)
$$\int_M f \, d\mu = \int_M g \, d\mu.$$
Adding integrals and using (12.8') leads to the conclusion in (a).
(b): The almost everywhere equality hypothesis entails that
$$f^+ = g^+ \text{ almost everywhere and } f^- = g^- \text{ almost everywhere}.$$
From (a) then
Proof. The set $N := \{|f| = +\infty\}$ lies in $\mathscr{A}$ and for every real $\alpha > 0$ satisfies
$\alpha 1_N \le |f|$. Consequently, $\alpha \mu(N) \le \int |f| \, d\mu < +\infty$, from which follows the first
claim, $\mu(N) = 0$. To prove the second claim we pass over to $|f|$ and thereby assume
that $f \ge 0$. Then
$$\{f \ne 0\} = \{f > 0\} = \bigcup_{n \in \mathbb{N}} \{f \ge n^{-1}\}.$$
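The first claim of the proof above rests on letting $\alpha$ grow without bound. As a worked step, using only the inequality $\alpha 1_N \le |f|$ and the finiteness of $\int |f| \, d\mu$ stated there:

```latex
\[
  \mu(N) \;\le\; \frac{1}{\alpha} \int |f| \, d\mu
  \quad \text{for every real } \alpha > 0,
  \qquad \text{hence} \qquad
  \mu(N) \;\le\; \inf_{\alpha > 0} \frac{1}{\alpha} \int |f| \, d\mu \;=\; 0 .
\]
```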
Theorem 13.6 has yet another consequence: Let $N$ be a $\mu$-nullset and $f$ a
numerical function which is defined on $M := \complement N$ and is $M \cap \mathscr{A}$-measurable. Such
a function is described as being a ($\mu$-)almost everywhere defined ($\mathscr{A}$-)measurable
function. The function $f_M$ introduced in 12.6 extends it to an $\mathscr{A}$-measurable function
on $\Omega$. Any other extension of $f$ to $\Omega$ must agree with $f_M$ almost everywhere.
According to 13.4 therefore either every such extension is integrable or none is. In
the first case moreover all extensions have the same $\mu$-integral. These observations
justify the following definition:
$$\int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu$$
prevails unrestrictedly.
Exercises.
1. The numerical functions $f$ and $g$ on the measure space $(\Omega, \mathscr{A}, \mu)$ satisfy $f = g$
$\mu$-almost everywhere. Show via an example that in general the $\mathscr{A}$-measurability
of $g$ does not follow from that of $f$. Show however that in case $(\Omega, \mathscr{A}, \mu)$ is complete,
the $\mathscr{A}$-measurability of $g$ is equivalent to that of $f$.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that $f :
\Omega \to \bar{\mathbb{R}}$ is $\mathscr{A}_0$-measurable just if $\mathscr{A}$-measurable numerical functions $f_1, f_2$ on $\Omega$
exist with the properties $f_1 \le f \le f_2$ everywhere in $\Omega$ and $f_1 = f_2$ $\mu$-almost
everywhere. If $f$ is $\mu_0$-integrable, then any functions $f_1, f_2$ with these properties
are $\mu$-integrable, and $\int f_1 \, d\mu = \int f_2 \, d\mu = \int f \, d\mu_0$. (This supplements Exercise 7
in §5 and generalizes Exercise 1 in §10.)
3. Even if the $f$ in the preceding exercise is real-valued, the functions $f_1, f_2$ which
were proved to exist there cannot always be chosen to be real-valued. Prove this
for the case where $\Omega$ is any infinite set, $\mathscr{A} := \{\emptyset, \Omega\}$ and $\mu := 0$.
Example. $(\Omega, \mathscr{A}, \mu)$ is the measure space described in Example 2 of §12 and
Example 2 of §11, with $\alpha_n := n^{-p-1}$ for each $n \in \mathbb{N}$, where $1 < p < +\infty$. The
identity function, $f(n) := n$ for all $n \in \mathbb{N}$, is integrable, but its $p$-th power is not.
Thus for $p = 2$, $f^2 = f \cdot f$ is not integrable.
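Both claims of this example reduce to standard series, via the formula $\int f \, d\mu = \sum_n f(n) \alpha_n$ from Example 2 of §11. The following worked computation is an illustration, assuming the weights are $\alpha_n = n^{-p-1}$:

```latex
\begin{align*}
  \int f \, d\mu   &= \sum_{n=1}^{\infty} n \cdot n^{-p-1} = \sum_{n=1}^{\infty} n^{-p} < +\infty
      \qquad (\text{since } p > 1),\\
  \int f^{p} \, d\mu &= \sum_{n=1}^{\infty} n^{p} \cdot n^{-p-1} = \sum_{n=1}^{\infty} \frac{1}{n} = +\infty .
\end{align*}
```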
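14.1 Theorem. Suppose $p > 1$ is a real number and $q > 1$ is defined by the equation
$$\frac{1}{p} + \frac{1}{q} = 1.$$
Then any two measurable numerical functions $f$ and $g$ on $\Omega$ satisfy
$$(14.3) \qquad N_1(fg) \le N_p(f) \, N_q(g) \qquad \text{(HÖLDER's inequality)}.$$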
Proof. It is clear from definition (14.1) that we may assume $f \ge 0$ and $g \ge 0$.
Setting
$$\sigma := N_p(f) \quad \text{and} \quad \tau := N_q(g),$$
we can also assume that both these numbers are positive. For if, say $\sigma = 0$, then
by 13.2 $f^p$, whence also $f$, is almost everywhere equal to 0. The same is then true
of $fg$ (remember that $0 \cdot (+\infty) = 0$), so that again by 13.2 we have $N_1(fg) = 0$,
and (14.3) holds. Once $\sigma, \tau$ are each positive, no loss of generality is incurred by
assuming that each is also finite, which we now do.
Applying the mean-value theorem of the differential calculus to the function
$\eta \mapsto (1 + \eta)^{1/p}$, there follows at once the well-known Bernoulli inequality
$$(1 + \eta)^{1/p} \le \frac{\eta}{p} + 1 \quad \text{for all } \eta \in \mathbb{R}_+$$
or
$$\xi^{1/p} \le \frac{\xi}{p} + \frac{1}{q} \quad \text{for all } \xi \ge 1.$$
If now $x$ and $y$ are positive real numbers, then one of $x y^{-1}$ and $x^{-1} y$ is such
a $\xi$. Inserting this $\xi$ into the last inequality (and reversing the roles of $p$ and $q$ if
necessary), gives
$$x^{1/p} y^{1/q} \le \frac{1}{p} x + \frac{1}{q} y.$$
This inequality - really equivalent to the concavity of the (natural) logarithm
function - holds as well for all $x, y \in \mathbb{R}_+$. If finally we take $x := (\sigma^{-1} f(\omega))^p$ and
$y := (\tau^{-1} g(\omega))^q$ for an $\omega \in \{f < +\infty\} \cap \{g < +\infty\}$, we get
$$\frac{1}{\sigma \tau} f g \le \frac{1}{p \sigma^p} f^p + \frac{1}{q \tau^q} g^q$$
valid throughout $\Omega$, since it trivially prevails as well in the complementary set
$\{f = +\infty\} \cup \{g = +\infty\}$. Integration of this inequality leads at once to (14.3). $\Box$
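Hölder's inequality can be tried out on a finite measure space, where every integral is a finite weighted sum. The Python sketch below is illustrative only: the 50-point space, the random weights and values, and the exponent $p = 3$ are assumptions for the demonstration, and `n_p` is a hypothetical helper standing in for $N_p$.

```python
import random

def n_p(values, weights, p):
    """N_p(f) = (sum of |f(w)|**p * mu({w}))**(1/p) on a finite measure space."""
    return sum(abs(v) ** p * m for v, m in zip(values, weights)) ** (1.0 / p)

random.seed(0)
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1

# Assumed data: point masses and two real-valued functions on 50 points.
weights = [random.uniform(0.0, 2.0) for _ in range(50)]
f = [random.uniform(-5.0, 5.0) for _ in range(50)]
g = [random.uniform(-5.0, 5.0) for _ in range(50)]

fg = [a * b for a, b in zip(f, g)]
left = n_p(fg, weights, 1.0)                       # N_1(f g)
right = n_p(f, weights, p) * n_p(g, weights, q)    # N_p(f) * N_q(g)
print(left, right)
```

Because the computation uses floating-point arithmetic, a comparison of `left` and `right` should allow for a small rounding tolerance.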
14.2 Theorem. For all measurable numerical functions $f$ and $g$ on $\Omega$ whose sum
$f + g$ is defined throughout $\Omega$, and for every $p \in [1, +\infty[$
$$(14.4) \qquad N_p(f + g) \le N_p(f) + N_p(g) \qquad \text{(MINKOWSKI's inequality)}.$$
which shows that we may assume $f \ge 0$ and $g \ge 0$. In case $p = 1$ there is then
even equality in (14.4), by (11.7). Therefore, for the rest we can assume that
$1 < p < +\infty$, and then again define $q$ by $p^{-1} + q^{-1} = 1$. We may further assume
that both $N_p(f)$ and $N_p(g)$ are finite, that is, that $f^p$ and $g^p$ are integrable.
12.2(c) and the estimates
$$(f + g)^p \le [2(f \vee g)]^p = 2^p [f^p \vee g^p] \le 2^p (f^p + g^p)$$
then insure the integrability of $(f + g)^p$, that is, $N_p(f + g) < +\infty$. Now write
14.4 Theorem. Consider $p \in [1, +\infty[$ and $p$-fold integrable functions $f$ and $g$.
Then for every $\alpha \in \mathbb{R}$
$$\alpha f, \quad f \vee g \quad \text{and} \quad f \wedge g$$
are $p$-fold integrable, and in case it is defined throughout $\Omega$, the function $f + g$ is
$p$-fold integrable.
14.6 Theorem. The product of a $p$-fold and a $q$-fold integrable numerical function
is integrable (where $1 < p < +\infty$ and $\frac{1}{p} + \frac{1}{q} = 1$).
14.7 Corollary. If $1 < p < +\infty$ and the measure $\mu$ is finite, then every $p$-fold
integrable function is integrable.
Proof. Because $\mu(\Omega) < +\infty$, the constant function 1 is $q$-fold integrable on $\Omega$, for
each $q \in [1, +\infty[$. So the present claim follows from 14.6 upon writing any $p$-fold
integrable $f$ as $f \cdot 1$.
Remark. 1. Without the hypothesis $\mu(\Omega) < +\infty$ the conclusion of 14.7 may fail.
For example, in Example 2 of §12 choose the measure $\mu$ by requiring $\alpha_n = n^{-1/2}$
for all $n$. Then the function $f$ defined on $\Omega = \mathbb{N}$ by $f(n) := \alpha_n$ for all $n \in \mathbb{N}$ lies
in $\mathscr{L}^2(\mu)$ but not in $\mathscr{L}^1(\mu)$.
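The two memberships claimed in this remark can be verified by a direct computation with the series formula of Example 2 in §12. The worked lines below assume $\alpha_n = n^{-1/2}$ and $f(n) = \alpha_n$:

```latex
\begin{align*}
  \int f^{2} \, d\mu &= \sum_{n=1}^{\infty} \bigl(n^{-1/2}\bigr)^{2} \, n^{-1/2}
                      = \sum_{n=1}^{\infty} n^{-3/2} < +\infty ,\\
  \int |f| \, d\mu   &= \sum_{n=1}^{\infty} n^{-1/2} \, n^{-1/2}
                      = \sum_{n=1}^{\infty} \frac{1}{n} = +\infty .
\end{align*}
```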
More generally when $\mu$ is finite, from $p$-fold integrability follows $p'$-fold
integrability for every $p' \in [1, p]$ - cf. Exercise 3 below.
Related to 14.6 we have:
Remark. 2. Definition (14.1) of $N_p$ obviously makes sense for every real $p >
0$, thus also for those $0 < p < 1$ heretofore excluded from consideration. For
these $p$, however, the fundamental properties (14.3) and (14.4) are lost and the $q$
determined by $p^{-1} + q^{-1} = 1$ is negative. (On this point, compare Exercise 5 below.)
Remark 3 at the end of §15 will show that pathologies occur when $0 < p < 1$. All
subsequent work will therefore be restricted to the case $p \ge 1$.
Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p < +\infty$. Show that every function $f$
on $\Omega$ which is the uniform limit of a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$ itself lies in $\mathscr{L}^p(\mu)$.
2. For an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$ and $1 \le p < +\infty$, show that a real
function $f$ on $\Omega$ is $p$-fold integrable if and only if $f |f|^{p-1}$ is integrable. (In the "if"
direction, measurability of $f$ itself is not part of the hypothesis.)
3. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p' \le p < +\infty$, and $f$ a measurable
numerical function on $\Omega$. Then
$$N_{p'}(f) \le N_p(f) \cdot \mu(\Omega)^{1/p' - 1/p} \quad \text{and} \quad \mathscr{L}^p(\mu) \subset \mathscr{L}^{p'}(\mu).$$
4. For any finite number of measurable numerical functions $f_1, \ldots, f_n$ on a measure
space and real numbers $p_1, \ldots, p_n \in \,]1, +\infty[$ satisfying $\sum_{j=1}^{n} p_j^{-1} = 1$, prove the
generalized Hölder inequality
$$N_1(f_1 \cdot \ldots \cdot f_n) \le N_{p_1}(f_1) \cdot \ldots \cdot N_{p_n}(f_n).$$
5. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $p \in \,]0,1[$ and $q < 0$ be defined by $p^{-1} + q^{-1} =
1$. Consider non-negative $f \in \mathscr{L}^p(\mu)$ and a measurable $g : \Omega \to \,]0, +\infty[$ satisfying
$0 < N_q(g) := (\int g^q \, d\mu)^{1/q} < +\infty$. By an appropriate application of Hölder's
§15. Convergence theorems 79
Again consider $1 \le p < +\infty$ and a measure space $(\Omega, \mathscr{A}, \mu)$. The function $N_p$ is
real-valued on the vector space $\mathscr{L}^p(\mu)$, and in fact a semi-norm, that is, a mapping
$$N_p : \mathscr{L}^p(\mu) \to \mathbb{R}_+$$
having properties (14.2) and (14.4). From the second of those properties, the
Minkowski inequality, it follows that the function
$$d_p(f, g) := N_p(f - g), \qquad f, g \in \mathscr{L}^p(\mu),$$
satisfies the triangle inequality, that is,
$$d_p(f, g) \le d_p(f, h) + d_p(h, g) \quad \text{for all } f, g, h \in \mathscr{L}^p(\mu).$$
Evidently $d_p$ thus has all the properties of a metric on $\mathscr{L}^p(\mu)$, with one exception:
According to 13.2 and 13.3
$$d_p(f, g) = 0$$
is not equivalent to $f = g$, but only to
$$f = g \quad \mu\text{-almost everywhere}.$$
Distance-like functions without the property that "distance between two elements
equal zero entails equality of the elements", are usually called pseudometrics.
$N_p$ and $d_p$ are called the $\mathscr{L}^p$-semi-norm and the $\mathscr{L}^p$-pseudometric, also the
semi-norm or the pseudometric of convergence in the $p$th mean or of $\mathscr{L}^p$-convergence.
To elaborate: If $(f_n)$ is a sequence in $\mathscr{L}^p(\mu)$, then it is said to converge in $p$th
mean to $f \in \mathscr{L}^p(\mu)$, or to be $\mathscr{L}^p$-convergent if
$$(15.1) \qquad \lim_{n \to \infty} N_p(f_n - f) = \lim_{n \to \infty} d_p(f_n, f) = 0.$$
By virtue of what was noted above, the limit function $f$ is only almost everywhere
uniquely determined. (14.2) and (14.4) insure that linear computations with
convergent sequences are like those we are accustomed to involving real numbers. In
immediately apprehensible symbolic form these say:
$$f_n \to f, \ g_n \to g \implies \alpha f_n + \beta g_n \to \alpha f + \beta g$$
for any $\alpha, \beta \in \mathbb{R}$.
15.1 Theorem. Every sequence $(f_n)$ in $\mathscr{L}^1(\mu)$ (resp., in $\mathscr{L}^p(\mu)$) which converges
in mean (resp., in $p$th mean) to a function $f$ from $\mathscr{L}^1(\mu)$ (resp., from $\mathscr{L}^p(\mu)$) also
satisfies
$$(15.5) \qquad \lim_{n \to \infty} \int_A f_n \, d\mu = \int_A f \, d\mu \quad \text{for every } A \in \mathscr{A}$$
(resp.,
$$(15.6) \qquad \lim_{n \to \infty} \int_A |f_n|^p \, d\mu = \int_A |f|^p \, d\mu$$
for every $A \in \mathscr{A}$.)
Proof. (15.5) follows from (15.3). Correspondingly (15.6) follows from (15.4), which
gives $\lim N_p(1_A f_n) = N_p(1_A f)$, upon taking $p$th powers in this last limit and
using the continuity of the mapping $x \mapsto x^p$ on $\mathbb{R}_+$. $\Box$
(15.5) and (15.6) say nothing other than that for each $A \in \mathscr{A}$ the mappings
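15.2 Lemma (of Fatou). Every sequence $(f_n)_{n \in \mathbb{N}}$ in $E^*(\Omega, \mathscr{A})$, that is consisting
of $\mathscr{A}$-measurable numerical functions $f_n \ge 0$, satisfies
$$\int \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int f_n \, d\mu.$$
$$\int f \, d\mu = \sup_{n \in \mathbb{N}} \int g_n \, d\mu = \lim_{n \to \infty} \int g_n \, d\mu.$$
$$\int g_n \, d\mu \le \inf_{m \ge n} \int f_m \, d\mu$$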
This is the set of $\omega \in \Omega$ which lie in ultimately all of the sets $A_n$. Dual to it one
defines
$$(15.8) \qquad \limsup_{n \to \infty} A_n := \bigcap_{n \in \mathbb{N}} \bigcup_{m \ge n} A_m,$$
the set of $\omega \in \Omega$ which lie in infinitely many of the sets $A_n$, more correctly, the $\omega$
which lie in $A_n$ for infinitely many $n$. Evidently
$$\complement \big( \limsup_{n \to \infty} A_n \big) = \liminf_{n \to \infty} \complement A_n.$$
Hence we get the following corollary:
15.4 Theorem (of F. Riesz). Suppose $1 \le p < +\infty$ and the sequence $(f_n)_{n \in \mathbb{N}}$
in $\mathscr{L}^p(\mu)$ converges almost everywhere in $\Omega$ to a function $f \in \mathscr{L}^p(\mu)$. Then the
condition
$$(15.12) \qquad \lim_{n \to \infty} \int |f_n|^p \, d\mu = \int |f|^p \, d\mu$$
is (necessary and) sufficient for the convergence of $(f_n)$ to $f$ in $p$th mean.
Proof. The necessity of (15.12) follows (even without the hypothesis of
almost-everywhere convergence) immediately from 15.1. The proof of sufficiency proceeds
from the inequality
$$(\alpha + \beta)^p \le 2^p (\alpha^p + \beta^p) \qquad (\alpha, \beta \in \mathbb{R}_+)$$
which has already been used in the proof of (14.4). Since $|\alpha - \beta| \le |\alpha| + |\beta|$ this
inequality yields
$$|\alpha - \beta|^p \le 2^p (|\alpha|^p + |\beta|^p) \qquad (\alpha, \beta \in \mathbb{R}).$$
This inequality insures that
$$g_n := 2^p (|f_n|^p + |f|^p) - |f_n - f|^p, \qquad n \in \mathbb{N},$$
are non-negative functions. They lie in $\mathscr{L}^1(\mu)$ and by hypothesis they converge
almost everywhere to $2^{p+1} |f|^p$. In particular, $2^{p+1} |f|^p = \liminf g_n$ almost
everywhere. Therefore Fatou's lemma in conjunction with (15.12) delivers the relations
In preparation for the proof of the next convergence theorem we extend Min-
kowski's inequality to series of non-negative functions.
15.6 Theorem (on dominated convergence). Let $1 \le p < +\infty$ and $(f_n)_{n \in \mathbb{N}}$ be
a sequence from $\mathscr{L}^p(\mu)$ which converges almost everywhere on $\Omega$. Suppose there
exists a $p$-fold integrable numerical function $g \ge 0$ on $\Omega$ such that
$$(15.14) \qquad |f_n| \le g \quad \text{for all } n \in \mathbb{N}.$$
Then there is a real-valued measurable function $f$ on $\Omega$ to which $(f_n)$ converges
almost everywhere. Every such $f$ lies in $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges
to $f$ in $p$th mean.
Proof. By assumption there is a nullset $M_1$ such that $\lim f_n(\omega)$ exists (in $\bar{\mathbb{R}}$) for
every $\omega \in \complement M_1$. Because of the integrability of $g^p$ there is, according to 13.6,
another nullset $M_2$ with $g(\omega) < +\infty$ for every $\omega \in \complement M_2$. If we set
$$f(\omega) := \begin{cases} \lim_{n \to \infty} f_n(\omega), & \omega \in \complement(M_1 \cup M_2) \\ 0, & \omega \in M_1 \cup M_2, \end{cases}$$
then $f$ is real-valued and $\mathscr{A}$-measurable, and the sequence $(f_n)$ converges almost
everywhere to $f$. Consider now any function $f$ with these properties. Then $|f| \le g$
almost everywhere, so along with $g^p$ the function $|f|^p$ is also integrable, that is,
$f \in \mathscr{L}^p(\mu)$, by 13.5. We set, for each $n \in \mathbb{N}$
$$g_n := |f_n - f|^p$$
and then what has to be shown is that $\lim \int g_n \, d\mu = 0$. From the definition of $g_n$,
$$0 \le g_n \le (|f_n| + |f|)^p \le (g + |f|)^p.$$
Since the function $h := (g + |f|)^p$ is integrable, so is each $g_n$ (by 14.4 and 12.2).
Fatou's lemma applies to the sequence $(h - g_n)$ and says that
15.7 Theorem. For each $1 \le p < +\infty$, every Cauchy sequence $(f_n)_{n \in \mathbb{N}}$ in $\mathscr{L}^p(\mu)$
converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$. Some subsequence of $(f_n)$ converges
almost everywhere to $f$.
Proof. Straight from the definition of Cauchy sequence we can construct $1 \le n_1 <
n_2 < \ldots$ such that $N_p(f_{n_{k+1}} - f_{n_k}) \le 2^{-k}$ for all $k \in \mathbb{N}$. We define
$$g_k := f_{n_{k+1}} - f_{n_k} \text{ for each } k \in \mathbb{N}, \quad \text{and} \quad g := \sum_{k=1}^{\infty} |g_k|$$
$$N_p(g) \le \sum_{k=1}^{\infty} N_p(g_k) \le \sum_{k=1}^{\infty} 2^{-k} = 1.$$
this series is $f_{n_{k+1}} - f_{n_1}$ so we see that the sequence $(f_{n_k})_{k \in \mathbb{N}}$ converges almost
everywhere in $\Omega$. Moreover,
$$|f_{n_{k+1}}| = |g_1 + \ldots + g_k + f_{n_1}| \le g + |f_{n_1}|$$
and by 14.4 the sum $g + |f_{n_1}|$ is $p$th-power integrable. Thus the sequence $(f_{n_k})_{k \in \mathbb{N}}$
satisfies all the hypotheses of the dominated convergence theorem, according to
which it therefore converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$ and
$$\lim_{k \to \infty} f_{n_k} = f \quad \text{almost everywhere}.$$
Since $(f_n)$ is a Cauchy sequence, this subsequence behavior entails the convergence
in $p$th mean of the whole sequence: Given $\varepsilon > 0$ there is an $m_\varepsilon \in \mathbb{N}$ such that
$$N_p(f_m - f_n) \le \varepsilon \quad \text{for all } m, n \ge m_\varepsilon.$$
Then there is a $k \in \mathbb{N}$ with $n_k \ge m_\varepsilon$ such that
$$N_p(f_{n_k} - f) \le \varepsilon.$$
The triangle inequality then insures that
$$N_p(f_n - f) \le N_p(f_n - f_{n_k}) + N_p(f_{n_k} - f) \le 2\varepsilon$$
Example. Consider $\Omega := [0, 1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$. Every natural number $n$
is representable as $n = 2^h + k$ for a unique pair of integers $h \ge 0$, $k \ge 0$ with
$k < 2^h$. Set $A_n := [k2^{-h}, (k+1)2^{-h}[$ and let $f_n$ denote its indicator function.
Then $\int f_n^p\,d\mu = \int f_n\,d\mu = \mu(A_n) = 2^{-h} < 2/n$, so $(f_n)$ converges to 0 in pth mean
(for any $1 \le p < +\infty$) and is therefore certainly a Cauchy sequence in $\mathscr{L}^p(\mu)$.
But the sequence $(f_n(\omega))_{n\in\mathbb{N}}$ in $\{0, 1\}$ is not convergent for any $\omega \in \Omega$. Indeed,
given $\omega \in \Omega$ and $h = 0, 1, \ldots$, there is exactly one $k = 0, \ldots, 2^h - 1$ such that
$\omega \in [k2^{-h}, (k+1)2^{-h}[$, that is, $\omega \in A_{2^h+k}$. In case $k < 2^h - 1$, $\omega \notin A_{2^h+k+1}$. In
case $k = 2^h - 1$ and $h \ge 1$, $\omega \notin A_{2^{h+1}}$.
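The example can be explored numerically. The following is only an illustrative sketch, not part of the text: the helper names (`dyadic_index`, `interval`, `f`, `measure`) and the sample point $\omega = 1/3$ are assumptions chosen for the illustration. It uses exact rational arithmetic to confirm that $\mu(A_n) < 2/n$ and that, in every dyadic block $2^h \le n < 2^{h+1}$, exactly one $f_n(\omega)$ equals 1, so the values keep oscillating between 0 and 1.

```python
from fractions import Fraction

def dyadic_index(n):
    """Return the unique pair (h, k) with n = 2**h + k and 0 <= k < 2**h."""
    h = n.bit_length() - 1
    return h, n - (1 << h)

def interval(n):
    """Endpoints of A_n = [k 2^-h, (k+1) 2^-h[ as exact fractions."""
    h, k = dyadic_index(n)
    return Fraction(k, 2 ** h), Fraction(k + 1, 2 ** h)

def f(n, w):
    """Indicator function of A_n evaluated at the point w."""
    a, b = interval(n)
    return 1 if a <= w < b else 0

def measure(n):
    """Length of A_n; this equals the integral of f_n**p for every p."""
    a, b = interval(n)
    return b - a

w = Fraction(1, 3)  # hypothetical sample point in [0, 1[
values = [f(n, w) for n in range(1, 2 ** 10)]  # values[i] is f_{i+1}(w)
# Number of indices n in the block 2^h <= n < 2^(h+1) with f_n(w) = 1.
hits_per_block = [sum(values[2 ** h - 1 : 2 ** (h + 1) - 1]) for h in range(10)]
print(hits_per_block)
print(float(measure(1000)), 2 / 1000)
```

Each block contributes exactly one hit, so $f_n(\omega) = 1$ for infinitely many $n$ and $f_n(\omega) = 0$ for infinitely many $n$, while the integrals shrink like $2^{-h}$.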
We record the following simple corollary to 15.7:
15.8 Corollary. If the Cauchy sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere
to an $\mathscr{A}$-measurable real function $f$ on $\Omega$, then $f$ lies in $\mathscr{L}^p(\mu)$ and the
sequence converges to it in pth mean.
Corresponding to Theorem 14.6 and its corollary we have finally the following
two convergence assertions:
15.9 Theorem. Suppose the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in pth mean to a function
$f \in \mathscr{L}^p(\mu)$ and the sequence $(g_n)$ in $\mathscr{L}^q(\mu)$ converges in qth mean to $g \in \mathscr{L}^q(\mu)$.
If $1 < p < +\infty$ and $p^{-1} + q^{-1} = 1$, then the sequence $(f_n g_n)$ of products converges
in mean to $fg$.
15.10 Corollary. If the measure $\mu$ is finite, then every sequence $(f_n)_{n\in\mathbb{N}}$ in $\mathscr{L}^p(\mu)$
which converges in pth mean to an $f \in \mathscr{L}^p(\mu)$ for some $1 \le p < +\infty$, also
converges to $f$ in mean.
Proof. For $p = 1$ there is nothing to prove. For $1 < p < +\infty$ the claim follows from
the theorem upon taking every function $g_n$ there to be the constant function 1;
because of the finiteness of $\mu$ the constant functions lie in $\mathscr{L}^q(\mu)$ for every
$q \in [1, +\infty[$.
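The mechanism behind this corollary is Hölder's inequality with the constant function 1, which gives $N_1(f) \le \mu(\Omega)^{1/q} N_p(f)$ for a finite measure. A minimal sketch on a finite set follows; the point masses and the function values are hypothetical, chosen only for illustration.

```python
def norm_p(f, mu, p):
    """N_p(f) = (sum over w of |f(w)|^p * mu({w}))^(1/p) on a finite set."""
    return sum(abs(f[w]) ** p * mu[w] for w in mu) ** (1.0 / p)

# Hypothetical finite measure with total mass mu(Omega) = 4.
mu = {"a": 0.5, "b": 1.5, "c": 2.0}
f = {"a": 3.0, "b": -1.0, "c": 0.25}

p = 3.0
q = p / (p - 1.0)               # conjugate exponent, 1/p + 1/q = 1
total = sum(mu.values())

lhs = norm_p(f, mu, 1.0)                          # N_1(f)
rhs = total ** (1.0 / q) * norm_p(f, mu, p)       # mu(Omega)^(1/q) * N_p(f)
print(lhs, rhs)
```

Applied to $f_n - f$ in place of $f$, the same estimate shows that convergence in pth mean forces convergence in mean when $\mu(\Omega) < +\infty$.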
The reader should convince himself via an example like that in the remark after 14.7
that the converse of the assertion in this corollary is not true. However, the
conclusion of the corollary can be refined somewhat; namely, under its hypotheses
there is $\mathscr{L}^{p'}$-convergence of $(f_n)$ to $f$ for every $p' \in [1, p]$. Cf. Exercise 2 below.
Remarks. 1. Because
$$N_p : \mathscr{L}^p(\mu) \to \mathbb{R}_+$$
is a semi-norm, the set
$$\mathscr{N} := N_p^{-1}(0)$$
is a linear subspace of $\mathscr{L}^p(\mu)$. It is independent of $p$ because it consists of all
measurable real functions on $\Omega$ which are almost everywhere equal to 0. The
quotient vector space
$$L^p(\mu) := \mathscr{L}^p(\mu)/\mathscr{N}$$
consists of the equivalence classes $\tilde f$ of functions $f \in \mathscr{L}^p(\mu)$, and we set $\|\tilde f\|_p := N_p(f)$.
One checks effortlessly that $\tilde f \mapsto \|\tilde f\|_p$ is thereby well defined and provides a norm
on $L^p(\mu)$. Theorem 15.7 says that $L^p(\mu)$ is complete with respect to this norm,
that is, it is a Banach space (for $1 \le p < +\infty$).
$L^2(\mu)$ is even a Hilbert space. For the product $fg$ of two functions $f, g \in \mathscr{L}^2(\mu)$
is integrable, by 14.6, and it is clear that the integral $\int fg\,d\mu$ depends only on the
canonical images $\tilde f$, $\tilde g$ of these functions, which means that
$$(\tilde f, \tilde g) \mapsto \int fg\,d\mu$$
is a well-defined mapping. A short calculation suffices to confirm that it provides
a scalar product in $L^2(\mu)$.
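2. $f \in \mathscr{L}^\infty(\mu)$ means that the set $W_f$ of all $\alpha \in \mathbb{R}_+$ such that $|f| \le \alpha$ almost
everywhere is not empty. We can set
$$N_\infty(f) := \inf W_f$$
and show easily that $N_\infty : \mathscr{L}^\infty(\mu) \to \mathbb{R}_+$ is a semi-norm on $\mathscr{L}^\infty(\mu)$. Also in this
case $N_\infty^{-1}(0)$ coincides with the space $\mathscr{N}$ described in 1. In the quotient space
$$L^\infty(\mu) := \mathscr{L}^\infty(\mu)/\mathscr{N}$$
a norm $\tilde f \mapsto \|\tilde f\|_\infty$ can be defined via $N_\infty$ just as before. One checks that $L^\infty(\mu)$
thus also becomes a Banach space.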
3. For every measure space $(\Omega, \mathscr{A}, \mu)$ and every $p \in\, ]0,1[$ the set $\mathscr{L}^p(\mu)$ (cf.
Remark 2 in §14) turns out to be a vector space. $N_p$ is generally not a semi-norm
(cf. Exercise 5, §14), but $d_p(f, g) := N_p^p(f - g)$ is a complete pseudometric - with,
however, strange properties: The unit "ball" centered at 0 is generally not convex.
For L-B measure on $[0, 1]$, every $f \in \mathscr{L}^p$ is actually a convex combination of
functions in this ball. See BOURBAKI [1965], chap. 4, §6, exer. 13.
Exercises.
n=1
then the sequence (f) converges almost everywhere to f.
5. Show that the conclusion of Theorem 15.9 remains valid for p = 1 and q = +oo.
Proof. The continuity of $\varphi$ at $x_0$ is proved if we show that for every sequence $(x_n)$
in $E$ with $\lim x_n = x_0$,
$$\lim_{n\to\infty} \varphi(x_n) = \varphi(x_0)$$
holds. To accomplish this, we introduce the sequence $(f_n)$ by
$$f_n(\omega) := f(x_n, \omega) \quad (n \in \mathbb{Z}_+,\ \omega \in \Omega).$$
By hypothesis these are integrable functions, each satisfies $|f_n| \le h$, and for every
fixed $\omega \in \Omega$, $\lim_{n\to\infty} f_n(\omega) = f_0(\omega)$. From the theorem on dominated convergence
§16. Applications of the convergence theorems 89
In short, under the stated conditions (16.2) can be differentiated under the
integral sign.
Proof. Fix $x_0 \in I$ and consider any sequence $(x_n)_{n\in\mathbb{N}} \subset I \setminus \{x_0\}$ which converges
to $x_0$. Then the function defined on $\Omega$ by
$$g_n(\omega) := \frac{f(x_n, \omega) - f(x_0, \omega)}{x_n - x_0}$$
is $\mu$-integrable, for each $n \in \mathbb{N}$, and
$$\lim_{n\to\infty} g_n(\omega) = f'(x_0, \omega) \text{ for all } \omega \in \Omega.$$
It is a consequence of hypothesis (c) that $|g_n| \le h$ for $n \in \mathbb{N}$, as we now confirm. It
suffices to apply the mean-value theorem of differential calculus. According to it,
for each $x \in I \setminus \{x_0\}$ and each fixed $\omega \in \Omega$ there is a point $t$, in the open interval
whose endpoints are $x$ and $x_0$, such that
$$\frac{f(x, \omega) - f(x_0, \omega)}{x - x_0} = f'(t, \omega).$$
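$$\lim_{n\to\infty} \int g_n\,d\mu = \int f'(x_0, \omega)\,\mu(d\omega).$$
Claim (16.3) follows from this because
$$\int g_n\,d\mu = \frac{\varphi(x_n) - \varphi(x_0)}{x_n - x_0} \text{ for all } n \in \mathbb{N}.$$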
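$$\varphi(x) := \int f(x, \omega)\,\mu(d\omega)$$
has an ith partial derivative at every $x \in U$, the function $\omega \mapsto \frac{\partial f}{\partial x_i}(x, \omega)$ is $\mu$-integrable, and
$$\frac{\partial \varphi}{\partial x_i}(x) = \int \frac{\partial f}{\partial x_i}(x, \omega)\,\mu(d\omega) \text{ for every } x \in U.$$
This follows at once from the differentiation lemma: Given $x = (x_1, \ldots, x_d) \in U$,
there is an open interval $I \subset \mathbb{R}$ containing $x_i$ such that for each $t \in I$ the
point $(x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d)$ lies in $U$, and we can apply 16.2 to the function
$(t, \omega) \mapsto f(x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d, \omega)$.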
II. Comparison of the Riemann and Lebesgue Integrals. For every d-dimensional
Borel set $B \in \mathscr{B}^d$ and suitable Borel measurable numerical functions $f$ on $B$
the integral $\int_B f\,d\lambda^d$ was defined in §12 and identified with $\int f\,d\lambda^d_B$.
This integral is called for short the Lebesgue integral of $f$ over $B$. A frequently
encountered alternative way of writing it is
In case $d = 1$ and $B = [\alpha, \beta]$, or $]-\infty, \alpha]$, or $\mathbb{R}$, etc. the notations $\int_\alpha^\beta f(x)\,dx$, or
$\int_{-\infty}^{\alpha} f(x)\,dx$, or $\int_{-\infty}^{+\infty} f(x)\,dx$, etc., are also common.
Since in basic analysis courses it is frequently only the Riemann integral that
is dealt with, the following remarks relating it to what has been done here may be
useful.
and so by 13.2, $q = 0$ $\mu$-almost everywhere. Since in addition for every $n$, $l_n \le f \le u_n$
holds $\mu$-almost everywhere (everywhere except possibly at the points of the nth partition),
As has been noted, $f$ is bounded, say $|f| \le M \in \mathbb{R}$. The sequence $(|l_n|)$ is therefore
majorized by the constant $M$, a $\mu$-integrable function, and so Theorem 15.6 on
dominated convergence delivers the $\mu$-integrability of $f$ as well as the convergence
of $(l_n)$ to $f$ in mean. From 15.1 finally follows
$$\int f\,d\mu = \lim_{n\to\infty} \int l_n\,d\mu,$$
which finishes the proof. □
Remarks. 1. Consider once again Dirichlet's jump function $f$ on the unit interval
(cf. Exercise 2 of §10). Being the indicator function of $\mathbb{Q} \cap [0, 1]$, it is Borel measurable
and almost everywhere 0 with respect to L-B measure $\lambda^1_{[0,1]}$. Consequently
it is Lebesgue integrable and $\int_0^1 f(x)\,dx = 0$. But $f$ is not Riemann integrable. So
the roles of Riemann and Lebesgue integration cannot be reversed in 16.4.
2. Borel measurability of $f$ need not be hypothesized: the above proof shows,
even without it, that $\lim l_n = f$ $\mu$-almost everywhere and so $f$ is $\mu$-almost everywhere
equal to the Borel function $\lim l_n$. However, in this case it can well happen
that $f$ itself is not Borel measurable.
3. The ideas in the proof of Theorem 16.4 can be amplified into a non-trivial
criterion for Riemann integrability. Namely, $f : [\alpha, \beta] \to \mathbb{R}$ is Riemann integrable
if and only if it is bounded and is continuous at $\lambda^1$-almost every point of $[\alpha, \beta]$.
See Theorem 2.5.1 of COHN [1980] or the multi-part Exercise 12.51 of HEWITT
and STROMBERG [1965].
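Proof. Denote by $\rho_n$ the Riemann integral of $f$ over $A_n := [-n, +n]$ for each $n \in \mathbb{N}$.
According to the theorem just proved
$$\rho_n = \int 1_{A_n} f\,d\lambda^1.$$
From 11.4 and the fact that $1_{A_n} f \uparrow f$ we get
$$\sup_{n\in\mathbb{N}} \rho_n = \int f\,d\lambda^1.$$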
The improper Riemann integral exists, by definition, just if this supremum is finite
and in that case its value $\rho$ is that supremum. From these observations and the
monotone convergence theorem our present result follows. □
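$$\lim_{R\to+\infty} \int_0^R \frac{\sin x}{x}\,dx$$
$$\int_{\mathbb{R}_+} |f|\,d\lambda^1 \ge \int_{[\pi, (n+1)\pi]} |f|\,d\lambda^1 = \sum_{k=1}^{n} \int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx \ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k+1}.$$
Since the harmonic series diverges, these inequalities show that $\int_{\mathbb{R}_+} |f|\,d\lambda^1 = +\infty$,
and so by 12.2 $f$ is not Lebesgue integrable over $\mathbb{R}_+$.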
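The divergence can be watched numerically. The sketch below is illustrative only: it takes $f(x) = \sin(x)/x$, approximates $\int_{\pi}^{(n+1)\pi} |f|\,d\lambda^1$ with a midpoint rule (the step count is an arbitrary choice), and compares it with the harmonic lower bound obtained from $\int_{k\pi}^{(k+1)\pi} |\sin x|/x\,dx \ge 2/((k+1)\pi)$.

```python
import math

def abs_sinc_integral(n, steps_per_period=2000):
    """Midpoint-rule approximation of the integral of |sin x|/x over [pi, (n+1)pi]."""
    total = 0.0
    h = math.pi / steps_per_period
    for k in range(1, n + 1):
        for j in range(steps_per_period):
            x = k * math.pi + (j + 0.5) * h
            total += abs(math.sin(x)) / x * h
    return total

def harmonic_lower_bound(n):
    """Lower bound (2/pi) * sum_{k=1}^{n} 1/(k+1) for the same integral."""
    return (2.0 / math.pi) * sum(1.0 / (k + 1) for k in range(1, n + 1))

for n in (10, 100):
    print(n, abs_sinc_integral(n), harmonic_lower_bound(n))
```

Both columns grow without bound (roughly like $\frac{2}{\pi}\log n$), in line with the failure of Lebesgue integrability, even though the signed integrals $\int_0^R \sin(x)/x\,dx$ converge.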
(16.5) $$f(x, \omega) := \frac{e^{-x(1+\omega^2)}}{1 + \omega^2}, \quad (x, \omega) \in \mathbb{R}_+ \times \mathbb{R}.$$
Both $f$ and the function $(x, \omega) \mapsto f'(x, \omega) := -e^{-x(1+\omega^2)}$ are continuous. For fixed
$x_0 > 0$ form the auxiliary functions
Their $\lambda^1$-integrability (over $\mathbb{R}$) follows from Corollary 16.5 and the fundamental
theorem of calculus. For example,
$$\int (1 + \omega^2)^{-1}\,d\omega = \lim_{n\to\infty} [\arctan(\omega)]_{-n}^{+n} = \pi.$$
Obviously $f(x, \omega) \le h(\omega)$ for all $(x, \omega) \in \mathbb{R}_+ \times \mathbb{R}$. It follows from 12.2 that for each
$x \in \mathbb{R}_+$ the function $\omega \mapsto f(x, \omega)$ is $\lambda^1$-integrable. And the real function defined
by
$$\varphi(x) := \int f(x, \omega)\,d\omega$$
is continuous by the continuity lemma 16.1. Note that $\varphi(0) = \pi$. Since $2|\omega| \le 1 + \omega^2$
for all $\omega \in \mathbb{R}$, we have $|f'(x, \omega)| \le h_0(\omega)$ for all $(x, \omega) \in [x_0, +\infty[\, \times \mathbb{R}$. Consequently
the differentiation lemma 16.2 insures that $\varphi$ is differentiable in $]x_0, +\infty[$, for every
$x_0 > 0$, that is, differentiable in $]0, +\infty[$, and
where $G$ designates the integral (16.1) that we are trying to explicitly compute. Its
existence is already part of the preceding analysis, but can also be inferred from
the majorization $e^{-t^2} \le e^{-t}$, which holds for $t \ge 1$. From (16.6), (16.8) and the
for $x > 0$ and $a > 0$. Upon letting $a$ run to $+\infty$, we will get
$$\pi = \varphi(0) = 2G \int_0^{+\infty} e^{-\omega^2}\,d\omega = G^2,$$
using the obvious (on grounds of symmetry) fact that $\int_{-\infty}^{+\infty} e^{-\omega^2}\,d\omega = 2\int_0^{+\infty} e^{-\omega^2}\,d\omega$.
Since $G > 0$, it follows finally that $G = \sqrt{\pi}$. That is,
(16.10) $$\int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi}$$
or equivalently, in the form seen in probability theory,
(16.10') $$\int_{-\infty}^{+\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}.$$
This derivation goes back to ANONYME [1889] and VAN YZEREN [1979]. A particularly
short alternative one is made possible by Tonelli's theorem (cf. Exercise 4
in §23).
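As a sanity check of the values $G = \sqrt{\pi}$ and $\varphi(0) = \pi = G^2$, here is a rough numerical sketch. It assumes $G = \int_{-\infty}^{+\infty} e^{-t^2}\,dt$ and $\varphi(x) = \int e^{-x(1+\omega^2)}/(1+\omega^2)\,d\omega$; the truncation bounds and step counts are arbitrary choices for the illustration, not part of the derivation.

```python
import math

def midpoint(g, a, b, steps):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

# G: integral of exp(-t^2) over R, truncated to [-8, 8]; the neglected tails are negligible.
G = midpoint(lambda t: math.exp(-t * t), -8.0, 8.0, 20000)

def phi(x, cutoff=2000.0, steps=400000):
    """phi(x), truncated to [-cutoff, cutoff]; for x = 0 the truncation error is about 2/cutoff."""
    return midpoint(
        lambda w: math.exp(-x * (1.0 + w * w)) / (1.0 + w * w),
        -cutoff, cutoff, steps,
    )

phi0 = phi(0.0)
print(G, math.sqrt(math.pi))
print(G * G, phi0, math.pi)
```

The slow decay of $1/(1+\omega^2)$ is why `phi(0.0)` needs a large cutoff, whereas the Gaussian integrand is already negligible outside $[-8, 8]$.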
Exercises.
1. Which of the two functions below are integrable, which are square-integrable
with respect to Lebesgue-Borel measure on the indicated intervals?
(a) $f(x) := x^{-1}$, $x \in I := [1, +\infty[$;
(b) $f(x) := x^{-1/2}$, $x \in I :=\, ]0, 1]$.
2. Show that for every real number a > 0 the function x H e" is A1-integrable
over R+.
3. Show that for every real number a > 0 the function
rsinx13 A1(dx)
Jo x J
is continuous on $]0, +\infty[$.
Again let $(\Omega, \mathscr{A}, \mu)$ be an arbitrary measure space and $E^* = E^*(\Omega, \mathscr{A})$ the set of
all $\mathscr{A}$-measurable, non-negative numerical functions on $\Omega$. In 12.4 we defined the
integral of every function $f \in E^*$ over every set $A \in \mathscr{A}$. We are interested here in
how this integral behaves with respect to $A$.
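(17.1) $$\nu(A) := \int_A f\,d\mu$$
Proof. $\nu(\emptyset) = 0$ and $\nu \ge 0$. For every sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets
from $\mathscr{A}$ with $A := \bigcup_{n\in\mathbb{N}} A_n$,
$$1_A f = \sum_{n=1}^{\infty} 1_{A_n} f$$
and so by 11.5
$$\nu(A) = \sum_{n=1}^{\infty} \nu(A_n),$$
the final property needing to be checked in confirming that $\nu$ is a measure on $\mathscr{A}$. □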
(17.3') $$\int \varphi\,d(f\mu) = \int \varphi f\,d\mu.$$
An $\mathscr{A}$-measurable function $\varphi : \Omega \to \overline{\mathbb{R}}$ is $\nu$-integrable if and only if $\varphi f$ is
$\mu$-integrable. In this case (17.3) is again valid.
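On a finite set, where every measure is determined by its point masses, the formula $\int \varphi\,d(f\mu) = \int \varphi f\,d\mu$ reduces to a rearrangement of a finite sum. The sketch below illustrates this under that simplifying assumption; the point masses, the density and the integrand are hypothetical values.

```python
def density_measure(f, mu):
    """Point masses of nu = f*mu on a finite set: nu({w}) = f(w) * mu({w})."""
    return {w: f[w] * mu[w] for w in mu}

def integral(phi, m):
    """Integral of phi with respect to the measure with point masses m."""
    return sum(phi[w] * m[w] for w in m)

mu = {1: 0.5, 2: 2.0, 3: 0.0, 4: 1.0}      # hypothetical point masses of mu
f = {1: 2.0, 2: 0.5, 3: 7.0, 4: 0.0}       # non-negative density
phi = {1: 1.0, 2: -3.0, 3: 4.0, 4: 10.0}   # integrand

nu = density_measure(f, mu)
lhs = integral(phi, nu)                                  # integral of phi d(f mu)
rhs = integral({w: phi[w] * f[w] for w in mu}, mu)       # integral of phi*f d mu
print(lhs, rhs)
```

Note that the value of $f$ at the $\mu$-null point 3 has no influence on $\nu$, which is the finite-set shadow of the fact that densities are determined only almost everywhere.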
17.4 Corollary. Let $f, g \in E^*$, $\nu := f\mu$ and $\varrho := g\nu$. Then $\varrho = (gf)\mu$, that is,
(17.4) $$g(f\mu) = (gf)\mu.$$
Proof. For every $A \in \mathscr{A}$
$$\varrho(A) = \int_A g\,d\nu = \int 1_A g\,d\nu$$
Proof. If $f$ and $g$ coincide $\mu$-almost everywhere, then so do $1_A f$ and $1_A g$ for each
$A \in \mathscr{A}$, whence
$$\int_A f\,d\mu = \int_A g\,d\mu \text{ for all } A \in \mathscr{A},$$
which just says that $f\mu = g\mu$.
Now suppose that $f$ is $\mu$-integrable and that $f\mu = g\mu$. Since $g \ge 0$ and $\int g\,d\mu =
\int f\,d\mu < +\infty$, $g$ is also $\mu$-integrable. Let us show that the set
$$N := \{f > g\},$$
which lies in $\mathscr{A}$ by 9.3, is a $\mu$-nullset. For every $\omega \in N$, $f(\omega) - g(\omega)$ is defined and
is positive, which means that the definition
$$h := 1_N f - 1_N g$$
makes sense. The functions $1_N f$, $1_N g$, being majorized by the $\mu$-integrable functions
$f$, $g$, are themselves integrable. Because $f\mu = g\mu$, they have the same $\mu$-integral.
From this we get that
$$\int h\,d\mu = \int_N f\,d\mu - \int_N g\,d\mu = 0.$$
Since $N = \{h > 0\}$, this equality and 13.2 tell us that $\mu(N) = 0$. With the roles
of $f$ and $g$ reversed, this conclusion reads $\mu(N') = 0$, where $N' := \{g > f\}$. Since
$\{f \ne g\} = N \cup N'$, the desired conclusion, namely that $\{f \ne g\}$ is a $\mu$-nullset, is
obtained. □
The converse of implication (17.5) is not valid without some additional hypothesis
on the densities $f$ and $g$. The next example illustrates this.
17.6 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The measure $\mu$ is $\sigma$-finite if and
only if there exists a $\mu$-integrable function $h$ on $\Omega$ which satisfies
(17.6) $0 < h(\omega) < +\infty$ for every $\omega \in \Omega$.
Proof. If $\mu$ is $\sigma$-finite, there is a sequence $(A_n)$ in $\mathscr{A}$ such that $\mu(A_n) < +\infty$ for
each $n \in \mathbb{N}$ and $A_n \uparrow \Omega$. Choose positive real numbers $\eta_n$ satisfying both $\eta_n \le 2^{-n}$
§17. Measures with densities: the Radon-Nikodym theorem 99
In the light of 13.2 this lemma has another formulation: For each $\sigma$-finite measure
$\mu$ there exists a real, measurable function $h > 0$ such that the measure $h\mu$ is
finite and has the same nullsets as $\mu$.
We come now to the main problem, already alluded to: On the $\sigma$-algebra $\mathscr{A}$ of
the measurable space $(\Omega, \mathscr{A})$ two measures $\nu$ and $\mu$ are given. We pose the question
of how to decide whether $\nu$ has a density with respect to $\mu$, that is, whether there
is an $\mathscr{A}$-measurable, non-negative, numerical function $f$ on $\Omega$ satisfying $\nu = f\mu$,
satisfying in other words
$$\nu(A) = \int_A f\,d\mu \text{ for all } A \in \mathscr{A}.$$
For an affirmative answer it is necessary, as 13.3 shows, that every $\mu$-nullset in $\mathscr{A}$
be a $\nu$-nullset as well.
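17.8 Theorem. A finite measure $\nu$ on $\mathscr{A}$ is $\mu$-continuous if and only if for every
$\varepsilon > 0$ there exists $\delta > 0$ such that
(17.7) $A \in \mathscr{A}$ and $\mu(A) \le \delta \;\Rightarrow\; \nu(A) \le \varepsilon$.
Proof. From (17.7) it follows that $\nu(A) \le \varepsilon$ holds for every $\varepsilon > 0$ if $A$ is a $\mu$-nullset.
Hence $\nu(A) = 0$ and $\nu$ is thus a $\mu$-continuous measure, even without the finiteness
hypothesis. For the converse we will show that if (17.7) fails, then $\nu$ is not
$\mu$-continuous. Thus, for some $\varepsilon > 0$ there is no $\delta$, which means there is a sequence
$(A_n)_{n\in\mathbb{N}}$ in $\mathscr{A}$ with the properties
$$\mu(A_n) \le 2^{-n} \text{ and } \nu(A_n) > \varepsilon \text{ for each } n \in \mathbb{N}.$$
We set
$$A := \limsup_{n\to\infty} A_n := \bigcap_{n\in\mathbb{N}} \bigcup_{m\ge n} A_m$$
and have a set in $\mathscr{A}$ which on the one hand satisfies
$$\mu(A) \le \mu\Big(\bigcup_{m\ge n} A_m\Big) \le \sum_{m=n}^{\infty} \mu(A_m) \le \sum_{m=n}^{\infty} 2^{-m} = 2^{-n+1} \text{ for every } n \in \mathbb{N},$$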
whence $\mu(A) = 0$, and on the other hand, due to the finiteness of $\nu$ and 15.3,
satisfies
$$\nu(A) \ge \limsup_{n\to\infty} \nu(A_n) \ge \varepsilon > 0,$$
which proves that $\nu$ is not $\mu$-continuous. □
for every $\omega \in \Omega$, making $f = 0$ and therefore $\nu = f\mu = 0$, which is not the case
because $\Omega$ is uncountable.
3. Let $(\mathbb{R}, \mathscr{B}^1, \mu)$ be the 1-dimensional Lebesgue-Borel measure space (so $\mu = \lambda^1$)
and denote by $\mathscr{N}$ the system of all $\mu$-nullsets. Then $\mathscr{N}$ is an example of a $\sigma$-ideal
in $\mathscr{B}^1$: The union of any sequence of its sets is another, as are the intersections of
its sets with those of $\mathscr{B}^1$ (cf. Exercise 5, §3). These properties insure that
$$\nu(A) := \begin{cases} 0 & \text{if } A \in \mathscr{N} \\ +\infty & \text{if } A \in \mathscr{B}^1 \setminus \mathscr{N} \end{cases}$$
defines a measure on $\mathscr{B}^1$ (cf. Exercise 6, §3). From its definition it is clear that $\nu$
is $\mu$-continuous. Here however (17.7) fails, since for every $\delta > 0$
$$\mu([0, \delta]) = \delta \text{ and } \nu([0, \delta]) = +\infty.$$
Thus the finiteness hypothesis on $\nu$ in 17.8 is not superfluous. Example 2 shows
that for the existence of a density $f \in E^*$ with $\nu = f\mu$, the $\mu$-continuity of $\nu$, while
necessary, is not sufficient. All the more noteworthy is the theorem of Radon and
Nikodym which we will prove, after a preparatory lemma.
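We may obviously suppose that $\varrho(\Omega) \ge 0$, since otherwise $\Omega_\varepsilon := \emptyset$ does what is
wanted. If then $\varrho(A) > -\varepsilon$ for all $A \in \mathscr{A}$, it suffices to choose $\Omega_\varepsilon := \Omega$. So we
consider the case that some $A_1 \in \mathscr{A}$ satisfies $\varrho(A_1) \le -\varepsilon$. From the definition of $\varrho$
and the subtractivity of the finite measures $\sigma$ and $\tau$,
$$\varrho(\complement A_1) = \varrho(\Omega) - \varrho(A_1) \ge \varrho(\Omega) + \varepsilon > \varrho(\Omega).$$
Therefore, if $\varrho(A) > -\varepsilon$ for all $A \in (\complement A_1) \cap \mathscr{A}$, we can set $\Omega_\varepsilon := \complement A_1$ and be done.
In the contrary case there is a set $A_2 \in (\complement A_1) \cap \mathscr{A}$ with $\varrho(A_2) \le -\varepsilon$. Then because
$A_1$, $A_2$ are disjoint
$$\varrho(\complement(A_1 \cup A_2)) = \varrho(\Omega) - \varrho(A_1) - \varrho(A_2) \ge \varrho(\Omega) + 2\varepsilon > \varrho(\Omega)$$
and the preceding dichotomy presents itself anew. If after finitely many repetitions
of this procedure we have not reached our goal, then we will have generated
a sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets in $\mathscr{A}$ with
$$\varrho(\Omega \setminus (A_1 \cup \ldots \cup A_n)) > \varrho(\Omega) \text{ and } \varrho(A_n) \le -\varepsilon \text{ for every } n \in \mathbb{N}.$$
Because of the finite additivity of $\sigma$ and $\tau$, this would have the consequence that
$$\varrho(A_1 \cup \ldots \cup A_n) = \sum_{i=1}^{n} \varrho(A_i) \le -n\varepsilon \text{ for every } n \in \mathbb{N}$$
and entail the divergence of the series $\sum_{n=1}^{\infty} \varrho(A_n)$. But the latter is untenable,
because when the $\sigma$-additivity of $\sigma$ and $\tau$ is applied to the disjoint union
$A := \bigcup_{n\in\mathbb{N}} A_n$ it shows this series to be convergent: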
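Proof. Only the implication (ii)$\Rightarrow$(i) is still in need of proof. To that end we
distinguish three cases.
First Case: The measures $\mu$ and $\nu$ are each finite. Form the set $\mathscr{G}$ of all $\mathscr{A}$-measurable
numerical functions $g \ge 0$ on $\Omega$ which satisfy $g\mu \le \nu$, that is, which
satisfy
$$\int_A g\,d\mu \le \nu(A) \text{ for all } A \in \mathscr{A}.$$
The constant function $g = 0$ lies in $\mathscr{G}$, so $\mathscr{G}$ is not empty. $\mathscr{G}$ is moreover sup-stable,
that is, $g \vee h \in \mathscr{G}$ whenever $g, h \in \mathscr{G}$. Indeed, setting $A_1 := \{g \ge h\}$, $A_2 := \complement A_1$,
every $A \in \mathscr{A}$ satisfies
$$\int_A g \vee h\,d\mu = \int_{A\cap A_1} g\,d\mu + \int_{A\cap A_2} h\,d\mu \le \nu(A \cap A_1) + \nu(A \cap A_2) = \nu(A).$$
Since $\int g\,d\mu \le \nu(\Omega) < +\infty$ for every $g \in \mathscr{G}$, the number
$$\gamma := \sup\Big\{ \int g\,d\mu : g \in \mathscr{G} \Big\}$$
is finite and there is a sequence $(g'_n)$ in $\mathscr{G}$ such that $\lim \int g'_n\,d\mu = \gamma$. Due to sup-stability
the functions $g_n := g'_1 \vee \ldots \vee g'_n$ lie in $\mathscr{G}$, and consequently $\gamma \ge \int g_n\,d\mu \ge
\int g'_n\,d\mu$ (since $g_n \ge g'_n$) for all $n \in \mathbb{N}$. Which shows that $\lim \int g_n\,d\mu = \gamma$. As
the sequence $(g_n)$ is isotone, the monotone convergence theorem can be applied,
assuring that $f := \sup g_n$ is a function in $\mathscr{G}$ and that $\int f\,d\mu = \gamma$. All this proves
that the function $g \mapsto \int g\,d\mu$ on $\mathscr{G}$ assumes its maximum value at $f$.
Now we prove that $\nu = f\mu$. In any case we have $f\mu \le \nu$, since $f \in \mathscr{G}$, and so
$$\tau := \nu - f\mu$$
is a finite measure on $\mathscr{A}$, evidently $\mu$-continuous since $\nu$ is by hypothesis. We have
to show that $\tau = 0$. So let us assume contrariwise that $\tau(\Omega) > 0$. Due to the
$\mu$-continuity of $\tau$, this entails that $\mu(\Omega) > 0$ as well, and we may form the real
number
$$\beta := \frac{\tau(\Omega)}{2\mu(\Omega)} > 0,$$
which satisfies $\tau(\Omega) = 2\beta\mu(\Omega) > \beta\mu(\Omega)$. The preceding lemma applied to $\tau$ and
$\sigma := \beta\mu$ supplies a set $\Omega_0 \in \mathscr{A}$ which satisfies
$$\tau(\Omega_0) - \beta\mu(\Omega_0) \ge \tau(\Omega) - \beta\mu(\Omega) > 0 \text{ and } \tau(A) \ge \beta\mu(A) \text{ for all } A \in \Omega_0 \cap \mathscr{A}.$$
The $\mathscr{A}$-measurable, non-negative function $f_0 := f + \beta 1_{\Omega_0}$ therefore has the property
$$\int_A f_0\,d\mu = \int_A f\,d\mu + \beta\mu(\Omega_0 \cap A) \le \int_A f\,d\mu + \tau(A) = \nu(A)$$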
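for every $A \in \mathscr{A}$, so that $f_0 \in \mathscr{G}$. Moreover $\mu(\Omega_0) > 0$, because $\tau(\Omega_0) > \beta\mu(\Omega_0) \ge 0$ and $\tau$ is $\mu$-continuous. Hence
$$\int f_0\,d\mu = \int f\,d\mu + \beta\mu(\Omega_0) = \gamma + \beta\mu(\Omega_0) > \gamma,$$
an inequality which is incompatible with the definition of $\gamma$ and the fact that
$f_0 \in \mathscr{G}$. The assumption $\tau(\Omega) > 0$ is therefore untenable, and $\tau = 0$, as desired.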
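On a finite set the density produced by this argument can be written down directly: it is the ratio of point masses wherever $\mu$ charges a point. The following sketch illustrates only that special case, with hypothetical point masses; it is not the construction of the proof, which works with the supremum over $\mathscr{G}$.

```python
from itertools import combinations

def radon_nikodym_density(nu, mu):
    """Density f with nu = f*mu on a finite set, assuming nu is mu-continuous.

    On points with mu({w}) = 0 the value of f is irrelevant; 0.0 is used.
    """
    for w in mu:
        if mu[w] == 0 and nu[w] != 0:
            raise ValueError("nu is not mu-continuous")
    return {w: (nu[w] / mu[w] if mu[w] > 0 else 0.0) for w in mu}

def measure_of(A, m):
    """Measure of the subset A under the point masses m."""
    return sum(m[w] for w in A)

mu = {"a": 0.5, "b": 0.0, "c": 2.0}   # hypothetical finite measure
nu = {"a": 1.5, "b": 0.0, "c": 1.0}   # hypothetical mu-continuous measure

f = radon_nikodym_density(nu, mu)
f_mu = {w: f[w] * mu[w] for w in mu}  # point masses of the measure f*mu

subsets = [set(c) for r in range(len(mu) + 1) for c in combinations(mu, r)]
agree = all(abs(measure_of(A, nu) - measure_of(A, f_mu)) < 1e-12 for A in subsets)
print(f, agree)
```

If `nu` charged the $\mu$-null point `"b"`, no density could exist, and the function raises an error instead of returning one.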
Second Case: The measure $\mu$ is finite and the measure $\nu$ is infinite. We will produce
a decomposition $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$ of $\Omega$ into pairwise disjoint sets from $\mathscr{A}$ with the
following properties
(a) $A \in \Omega_0 \cap \mathscr{A} \;\Rightarrow\;$ either $\mu(A) = \nu(A) = 0$ or $0 < \mu(A) < \nu(A) = +\infty$.
(b) $\nu(\Omega_n) < +\infty$ for all $n \in \mathbb{N}$.
To this end let $\mathscr{Q}$ denote the system of all $Q \in \mathscr{A}$ with $\nu(Q) < +\infty$ and define
$$\alpha := \sup\{\mu(Q) : Q \in \mathscr{Q}\}.$$
This is a real number because the measure $\mu$ is finite. There is a sequence $(Q_m)_{m\in\mathbb{N}}$
in $\mathscr{Q}$ with $\lim \mu(Q_m) = \alpha$. Since $\mathscr{Q}$ is evidently closed under finite unions, $(Q_m)$
may be assumed to be isotone. $Q_0 := \bigcup_{m\in\mathbb{N}} Q_m$ is then a set from $\mathscr{A}$ satisfying
$\mu(Q_0) = \alpha$. We will show that $\Omega_0 := \complement Q_0$ satisfies (a). So consider $A \in \Omega_0 \cap \mathscr{A}$ with
$\nu(A) < +\infty$. We need to see that $\mu(A) = \nu(A) = 0$, and since $\nu$ is $\mu$-continuous
we really only need to confirm that $\mu(A) = 0$. Since $\nu(A) < +\infty$ and, as noted
already, $\mathscr{Q}$ is closed under union, each $Q_m \cup A$ lies in $\mathscr{Q}$, so that $\mu(Q_m \cup A) \le \alpha$,
and consequently
$$\mu(Q_0 \cup A) = \lim_{m\to\infty} \mu(Q_m \cup A) \le \alpha.$$
Since $A$ is disjoint from $Q_0$, $\mu(Q_0 \cup A) = \alpha + \mu(A)$. Conjoined with the preceding
inequality and the finiteness of $\alpha$ this says that indeed $\mu(A)$ must be 0. Finally, to
take care of (b) we merely define $\Omega_1 := Q_1$, and $\Omega_m := Q_m \setminus Q_{m-1}$ for all integers
$m \ge 2$ in order to get a decomposition of $\Omega$ with the desired properties.
Now let $\mu_n$, $\nu_n$ denote the restrictions of $\mu$, $\nu$ to the trace $\sigma$-algebra $\Omega_n \cap \mathscr{A}$,
for $n = 0, 1, \ldots$ and note that each $\nu_n$ is a $\mu_n$-continuous measure. Moreover, for
all $n \ge 1$ both $\mu_n$ and $\nu_n$ are finite. Case 1 therefore supplies $\Omega_n \cap \mathscr{A}$-measurable
functions $f_n \ge 0$ on $\Omega_n$ with $\nu_n = f_n\mu_n$. Taking $f_0$ to be the constant function $+\infty$
on $\Omega_0$, $\nu_0 = f_0\mu_0$ also holds, thanks to (a). Finally, "putting all the pieces together"
gives our result in this second case. Namely, the function $f$ on $\Omega$ defined to coincide
on each $\Omega_n$ with $f_n$ ($n = 0, 1, \ldots$) is non-negative, $\mathscr{A}$-measurable and satisfies
$\nu = f\mu$.
Third Case: This is the general case: only the $\sigma$-finiteness of $\mu$ is demanded. There
is according to 17.6 a strictly positive function $h \in \mathscr{L}^1(\mu)$. The measure $h\mu$ is
therefore finite and possesses exactly the same nullsets as does $\mu$. Consequently
$\nu$ is also $(h\mu)$-continuous. By what has already been proved there is then an
The question arises whether, in the situation of Theorem 17.10 the density $f$
of $\nu$ is $\mu$-almost everywhere uniquely determined. From 17.5 we at least get a positive
answer when $f$ is $\mu$-integrable, that is, when $\nu$ is a finite measure. But more
is true:
Proof. First we show that $f$ is $\mu$-almost everywhere uniquely determined if the measure
$\mu$ is finite. In proving this we may assume that $\nu(\Omega) = +\infty$, since its truth is
otherwise a consequence of the second part of 17.5. Furthermore, as we now find
ourselves in case 2 of the preceding proof, the decomposition of $\Omega$ into $\Omega_0, \Omega_1, \ldots$
employed there lets us confine our attention to $\Omega_0$, as 17.5 takes care of the remaining
$\Omega_n$ ($n \in \mathbb{N}$). So it suffices to treat the case $\Omega = \Omega_0$, that is, to assume
that $\mu$ and $\nu$ are linked by the alternative:
$$A \in \mathscr{A} \;\Rightarrow\; \text{either } \mu(A) = \nu(A) = 0 \text{ or } 0 < \mu(A) < \nu(A) = +\infty.$$
The constant function $+\infty$ is then a density for $\nu$ with respect to $\mu$ and what has
to be shown for uniqueness is that $f = +\infty$ holds $\mu$-almost everywhere. And for
that it suffices to show that
$$\mu(\{f \le n\}) = 0 \text{ for each } n \in \mathbb{N},$$
which in turn is a consequence of the above alternative and the inequalities
$$\nu(\{f \le n\}) = \int_{\{f \le n\}} f\,d\mu \le n\,\mu(\{f \le n\}) < +\infty.$$
hypothesis being just that $\mu(A_0) = 0$. $\Omega = \bigcup_{i,j=0}^{\infty} (\Omega_i \cap A_j)$ is a decomposition of $\Omega$
into a (doubly-indexed) sequence of pairwise disjoint sets from $\mathscr{A}$. If each has finite
$\nu$-measure, this proves that $\nu$ is $\sigma$-finite. Consider any $i \in \mathbb{Z}_+$. Because $\mu(A_0) = 0$
and $\nu = f\mu$, we have $\nu(\Omega_i \cap A_0) \le \nu(A_0) = 0$. Because $\nu = f\mu$ and $f \le j$ in $A_j$,
we have $\nu(\Omega_i \cap A_j) \le j\mu(\Omega_i) < +\infty$ for all $j \in \mathbb{N}$ as well. Thus all is proven. □
In the generality presented here Theorem 17.10 was proved in 1930 by O.M. NIKODYM
(1888-1974). H. Lebesgue proved the theorem in 1910 for the case where
$\mu$ is the L-B measure $\lambda^1$. J. RADON (1887-1956) pushed things further in a fundamental
work which appeared in 1913. So 17.10 is often also called the theorem of
Lebesgue-Radon-Nikodym. The uniquely determined density $f$ in 17.11 is called
the Radon-Nikodym density or the Radon-Nikodym integrand (of $\nu$ with respect
to $\mu$). A beautiful proof of 17.10 by elementary Hilbert-space methods was discovered
in 1940 by J. VON NEUMANN (1903-1957) and appears in many textbooks,
e.g., in RUDIN [1987], p. 130-131.
The history of the result to be presented next, the Lebesgue decomposition
theorem, runs somewhat parallel, Radon and Nikodym having also made significant
contributions. We need a concept complementary to $\mu$-continuity, namely
$\mu$-singularity:
17.12 Definition. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mu$ and $\nu$ measures defined
on $\mathscr{A}$. Let us write $\nu \ll \mu$ if $\nu$ is $\mu$-continuous. $\nu$ is said to be singular with respect
to $\mu$ (or $\mu$-singular), written $\nu \perp \mu$, if a set $N \in \mathscr{A}$ exists with $\mu(N) = 0 = \nu(\complement N)$.
It is obvious that the relation $\nu \perp \mu$ is symmetric in $\mu$ and $\nu$, so it is also expressed
as $\mu$ and $\nu$ are singular to each other (or mutually singular). The definition
of $\nu \perp \mu$ expresses the fact that for a suitable $\mu$-nullset $N \in \mathscr{A}$
(17.10) $\nu(A) = \nu(A \cap N)$ for all $A \in \mathscr{A}$,
as follows from $\nu(A) = \nu(A \cap N) + \nu(A \cap \complement N)$ and $\nu(\complement N) = 0$. The condition
that $\nu \perp \mu$ thus says that the measure $\nu$ is "carried by a $\mu$-nullset". From $\nu \ll \mu$
and $\nu \perp \mu$ together follows that $\nu(N) = 0$, and so $\nu = 0$. In this sense the
concepts $\mu$-continuity and $\mu$-singularity are diametral or antipodal. Relative to
L-B measure $\lambda^d$ every Dirac measure $\varepsilon_x$ on $\mathscr{B}^d$ obviously satisfies $\lambda^d \perp \varepsilon_x$.
$\nu_c$ is called the continuous part of $\nu$ with respect to $\mu$, $\nu_s$ the singular part. The
Radon-Nikodym theorem is applicable to the part $\nu_c$.
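For measures given by point masses on a finite set, a splitting $\nu = \nu_c + \nu_s$ with $\nu_c \ll \mu$ and $\nu_s \perp \mu$ can be read off directly: $\nu_s$ is the part of $\nu$ sitting on the $\mu$-nullset $N = \{\mu = 0\}$. The sketch below illustrates only this finite case; the data are hypothetical.

```python
def lebesgue_decomposition(nu, mu):
    """Split nu into (nu_c, nu_s, N) on a finite set.

    N is the set of points not charged by mu; nu_s is nu restricted to N
    and nu_c is nu restricted to the complement of N.
    """
    N = {w for w in mu if mu[w] == 0}
    nu_s = {w: (nu[w] if w in N else 0.0) for w in mu}
    nu_c = {w: (0.0 if w in N else nu[w]) for w in mu}
    return nu_c, nu_s, N

mu = {"a": 1.0, "b": 0.0, "c": 0.5, "d": 0.0}   # hypothetical point masses
nu = {"a": 2.0, "b": 3.0, "c": 0.0, "d": 0.25}

nu_c, nu_s, N = lebesgue_decomposition(nu, mu)
print(nu_c, nu_s, sorted(N))
```

Here $\mu(N) = 0$ and $\nu_s(\complement N) = 0$, so $\nu_s \perp \mu$, while $\nu_c$ vanishes on every point that $\mu$ does not charge, so $\nu_c \ll \mu$.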
Proof. We will carry out the proof in detail only for finite $\mu$ and $\nu$ and indicate in
Exercise 4 how the reader can then handle the general case himself.
Exercises.
1. Show that the Dirac measure $\varepsilon_x$ on $\mathscr{B}^d$ has no density with respect to $\lambda^d$,
for any $x \in \mathbb{R}^d$. (Physicists occasionally work with such a "symbolic" density $\delta_x$,
calling it the Dirac function at the point $x$. The correct mathematical object is
nevertheless the Dirac measure $\varepsilon_x$.)
§18*. Signed measures 107
2. Show that the relation $\ll$ on the set of measures on a $\sigma$-algebra $\mathscr{A}$ is reflexive
and transitive. The relation $\mu \sim \nu$ defined as $\mu \ll \nu$ and $\nu \ll \mu$ is then an
equivalence relation. Two measures $\mu$ and $\nu$ stand in this relation just when they
have the same nullsets. For $\sigma$-finite measures $\mu$ and $\nu$ on $\mathscr{A}$ show that $\mu \sim \nu$ is
equivalent to $\nu = f\mu$ for a density $f$ which satisfies $0 < f(\omega) < +\infty$ for $\mu$-almost
all (or even for all) $\omega \in \Omega$.
3. On a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$ two measures $\mu$ and $\nu$ are related by $\nu \le \mu$. Show
that if further $\mu$ is $\sigma$-finite, then there is an $\mathscr{A}$-measurable function $f$ satisfying
$0 \le f \le 1$ such that $\nu = f\mu$.
4. Lebesgue's decomposition theorem was proved for finite measures $\mu$ and $\nu$. Show
how to infer its validity for $\sigma$-finite measures from this. [Hint: For the existence
proof use 17.6. For the uniqueness proof choose a sequence $(A_n)$ in $\mathscr{A}$ with $A_n \uparrow \Omega$
and $\mu(A_n)$, $\nu(A_n)$ finite for each $n$, and consider the measures $\nu_n(A) := \nu(A \cap A_n)$,
$A \in \mathscr{A}$, $n \in \mathbb{N}$.]
5. Let $\nu = \nu_c + \nu_s$ be the Lebesgue decomposition of a $\sigma$-finite measure $\nu$ on $\mathscr{A}$ with
respect to a $\sigma$-finite measure $\mu$. The singular part $\nu_s$ has the form $\nu_s(A) = \nu(A \cap N)$
for all $A \in \mathscr{A}$ and a suitable $\mu$-nullset $N \in \mathscr{A}$. Show that if $N'$ is any other
$\mu$-nullset with this property, then $\mu(N \,\triangle\, N') = \nu(N \,\triangle\, N') = 0$.
6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $\nu = f\mu$ a $\sigma$-finite measure on $\mathscr{A}$ having
density $f$ with respect to $\mu$. Show that this density function is $\mu$-almost everywhere
uniquely determined and is $\mu$-almost everywhere real-valued. Show that if $f$
is strictly positive, then $\mu$ itself is $\sigma$-finite.
7. Let $(\Omega, \mathscr{A})$ be a measurable space. For every measure $\mu$ on $\mathscr{A}$ let $\mathscr{N}_\mu$ denote
the $\sigma$-ideal of its nullsets. Show that for any sequence $(\mu_n)_{n\in\mathbb{N}}$ of $\sigma$-finite measures
on $\mathscr{A}$ there is a finite measure $\mu$ on $\mathscr{A}$ for which $\mathscr{N}_\mu = \bigcap_{n\in\mathbb{N}} \mathscr{N}_{\mu_n}$.
8. The set $\Omega :=\, ]0, +\infty[$ is a group with respect to multiplication. Show that the
measure on $\Omega \cap \mathscr{B}^1$ defined by $\mu := h\lambda^1_\Omega$ with density function $h(x) := 1/x$ is
invariant under each self-mapping $x \mapsto \alpha x$ of $\Omega$ ($\alpha \in \Omega$). $\mu$ is thus the Haar
measure of the group $\Omega$ in the sense of the remark immediately following 8.2.
It is worthwhile turning our attention back to Lemma 17.9. The measure concept
in this book is that formulated in Definition 3.3: Measures are premeasures $\mu$ on
a $\sigma$-algebra $\mathscr{A}$, and so are non-negative $\sigma$-additive functions on $\mathscr{A}$ satisfying the
additional condition $\mu(\emptyset) = 0$. In Lemma 17.9 we encountered a real-valued,
$\sigma$-additive function $\varrho$ which is the difference of two finite measures. Similarly for any
$f \in \mathscr{L}^1(\mu)$ the function $A \mapsto \int_A f\,d\mu$ on $\mathscr{A}$ is the difference of two finite measures,
for example $f^+\mu$ and $f^-\mu$.
We will call a real function $\varrho : \mathscr{A} \to \mathbb{R}$ on a $\sigma$-algebra a finite signed measure
if it is $\sigma$-additive in the sense of (3.2), non-negativity not being required. From
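18.1 Theorem. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$.
Then there are sets $\Omega^+, \Omega^- \in \mathscr{A}$ with $\Omega = \Omega^+ \cup \Omega^-$, $\Omega^+ \cap \Omega^- = \emptyset$, and $\varrho(A) \ge 0$
for all $A$ in the trace $\sigma$-algebra $\Omega^+ \cap \mathscr{A}$, and $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$.
Proof. Set
$$\gamma := \sup\{\varrho(A) : A \in \mathscr{A}\}$$
and choose a sequence $(A_n)$ in $\mathscr{A}$ with $\lim \varrho(A_n) = \gamma$. By applying 17.9 to the
restriction of $\varrho$ to $A_n \cap \mathscr{A}$, we may replace $A_n$ by a set $P_n \in \mathscr{A}$ satisfying $\varrho(P_n) \ge
\varrho(A_n)$ and $\varrho(A) \ge 0$ for all $A \in P_n \cap \mathscr{A}$. We will then have
(18.1) $\gamma = \sup\{\varrho(P_n) : n \in \mathbb{N}\}$.
The decomposition of $\Omega$ that is sought can be realized by
$$\Omega^+ := \bigcup_{n\in\mathbb{N}} P_n, \quad \Omega^- := \Omega \setminus \Omega^+.$$
Indeed, all $A \in \Omega^+ \cap \mathscr{A}$ satisfy $\varrho(A) \ge 0$ because such an $A$ has the form
$$A = \bigcup_{n\in\mathbb{N}} B_n$$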
From this theorem another important feature of signed measures becomes evident:
The difference $\varrho$ in Lemma 17.9 is more than an illustrative example of
a signed measure - it is the typical signed measure:
Proof. Let $\Omega = \Omega^+ \cup \Omega^-$ be a Hahn decomposition in the sense of 18.1. Then
evidently
$$\varrho^+(A) := \varrho(A \cap \Omega^+) \text{ and } \varrho^-(A) := -\varrho(A \cap \Omega^-), \quad A \in \mathscr{A}$$
define measures on $\mathscr{A}$, which satisfy $\varrho = \varrho^+ - \varrho^-$, since each $A \in \mathscr{A}$ is the disjoint
union $(A \cap \Omega^+) \cup (A \cap \Omega^-)$. □
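For a signed measure given by finitely many point charges, a Hahn decomposition simply sorts the points by the sign of their charge, and the two measures $\varrho^+$ and $\varrho^-$ follow from the formulas $\varrho^+(A) = \varrho(A \cap \Omega^+)$ and $\varrho^-(A) = -\varrho(A \cap \Omega^-)$. The sketch below illustrates this finite case only; the charges are hypothetical.

```python
def hahn_decomposition(rho):
    """Return (pos, neg): points with charge >= 0 and points with charge < 0."""
    pos = {w for w in rho if rho[w] >= 0}
    neg = set(rho) - pos
    return pos, neg

def jordan_parts(rho):
    """Point masses of rho_plus and rho_minus built from a Hahn decomposition."""
    pos, neg = hahn_decomposition(rho)
    rho_plus = {w: (rho[w] if w in pos else 0.0) for w in rho}
    rho_minus = {w: (-rho[w] if w in neg else 0.0) for w in rho}
    return rho_plus, rho_minus

rho = {"a": 1.5, "b": -2.0, "c": 0.0, "d": -0.5}   # hypothetical signed point charges
pos, neg = hahn_decomposition(rho)
rho_plus, rho_minus = jordan_parts(rho)
print(sorted(pos), sorted(neg), rho_plus, rho_minus)
```

Placing the point `"c"` of charge 0 in the positive set is an arbitrary choice: moving it to the negative set gives another Hahn decomposition, differing from the first only by a set on which $\varrho$ vanishes identically.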
With this result the circle closes: finite signed measures are nothing more than
the differences of finite measures. It is however possible to dispense with the finiteness
hypothesis if $\sigma$-additivity is handled with sufficient care, but we will not go
into this further.
In the final analysis it is because of the preceding corollary that we only consider
measures with non-negative values in this book. Often to emphasize the distinction
with signed measures, what we call simply measures are called positive measures.
Exercises.
1. Show that every finite signed measure on a $\sigma$-algebra is bounded and assumes
a largest and a smallest value.
2. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and $\Omega = \Omega_1^+ \cup \Omega_1^-$,
$\Omega = \Omega_2^+ \cup \Omega_2^-$ be two Hahn decompositions for it. Show that $\Omega_1^+ \,\triangle\, \Omega_2^+$ and $\Omega_1^- \,\triangle\, \Omega_2^-$
are totally $\varrho$-nullsets, meaning that $\varrho(N) = 0$ for every $N \in \mathscr{A}$ which is a subset of
either of them. Conclude that to within such totally $\varrho$-nullsets there is only one
Hahn decomposition for $\varrho$.
3. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. Show that the specific
representation $\varrho = \varrho^+ - \varrho^-$ of $\varrho$ as the difference of the two measures on $\mathscr{A}$ which
was produced in the proof of 18.2 is characterized by the following minimality
property: In every representation $\varrho = \varrho_1 - \varrho_2$ as the difference of measures $\varrho_1$, $\varrho_2$
on $\mathscr{A}$, $\varrho_1 = \varrho^+ + \delta$ and $\varrho_2 = \varrho^- + \delta$ for an appropriate finite measure $\delta$ on $\mathscr{A}$,
and indeed if $\Omega = \Omega^+ \cup \Omega^-$ is any Hahn decomposition of $\Omega$ corresponding to $\varrho$,
$\delta = (1_{\Omega^+})\varrho_2 + (1_{\Omega^-})\varrho_1$. (Conversely, of course, every finite non-zero measure $\delta$
on $\mathscr{A}$ generates in this way a different representation of $\varrho$.) Infer that the only
measure $\nu$ on $\mathscr{A}$ which satisfies $\nu(A) \le \min\{\varrho^+(A), \varrho^-(A)\}$ for every $A \in \mathscr{A}$ is
the identically 0 measure. [Remark: The representation $\varrho = \varrho^+ - \varrho^-$ uniquely
determined by this minimality condition is called the Jordan decomposition of the
finite signed measure $\varrho$. As with functions, $\varrho^+$ and $\varrho^-$ are called the positive part
and the negative part of $\varrho$.]
$$\mu' := T(\mu)$$
is defined in (7.5). The connection between $\mu$-integrals and $\mu'$-integrals is elucidated by:
19.1 Theorem. For every $\mathscr{A}'$-measurable numerical function $f' \ge 0$ on $\Omega'$
(19.1) $$\int f'\,dT(\mu) = \int f' \circ T\,d\mu.$$
$$f' \circ T = \sum_{i=1}^{n} \alpha_i 1_{A_i}$$
(19.2)
1
§19. Integration with respect to an image measure 111
$$\int (f')^+\,dT(\mu) = \int (f')^+ \circ T\,d\mu \quad\text{and}\quad \int (f')^-\,dT(\mu) = \int (f')^- \circ T\,d\mu,$$
and of course
One has only to note that the integrability of $f' \circ T$ entails the measurability
of $f' \circ T$ and therewith that of $f' = f' \circ T \circ T^{-1}$.
The content of 19.1-19.3 constitutes what is called the "general transformation
theorem for integrals".
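The identity $\int f'\,dT(\mu) = \int f' \circ T\,d\mu$ is easy to verify when $\mu$ is given by finitely many point masses, since the image measure then just collects the mass of all points with the same image. The sketch below illustrates this under that assumption; the mapping $T(\omega) = \omega^2$, the masses and the integrand are hypothetical.

```python
from collections import defaultdict

def image_measure(T, mu):
    """Point masses of T(mu): T(mu)({y}) is the mu-mass of the preimage of y."""
    out = defaultdict(float)
    for w, mass in mu.items():
        out[T(w)] += mass
    return dict(out)

def T(w):
    """Hypothetical measurable mapping on the sample set."""
    return w * w

def f_prime(y):
    """Hypothetical non-negative integrand on the image space."""
    return y + 1.0

mu = {-2: 0.25, -1: 0.5, 0: 1.0, 1: 0.75, 2: 1.5}   # hypothetical point masses

T_mu = image_measure(T, mu)
lhs = sum(f_prime(y) * m for y, m in T_mu.items())     # integral of f' dT(mu)
rhs = sum(f_prime(T(w)) * m for w, m in mu.items())    # integral of f' o T d mu
print(T_mu, lhs, rhs)
```

Both sums contain the same terms, grouped once by image point and once by source point, which is the whole content of the theorem in this finite setting.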
As the behavior of the L-B measure with respect to $C^1$-diffeomorphisms is
known from (8.16'), the transformation theorem for Lebesgue integrals follows at
once:
19.4 Theorem. Let $G$, $G'$ be open subsets of $\mathbb{R}^d$, $\varphi : G \to G'$ a $C^1$-diffeomorphism
of $G$ onto $G'$. A numerical function $f'$ on $G'$ is $\lambda^d$-integrable if and only if the
function $f' \circ \varphi \, |\det D\varphi|$ is $\lambda^d$-integrable over $G$, and in this case
(19.3) $\int_{G'} f' \, d\lambda^d = \int_G f' \circ \varphi \, |\det D\varphi| \, d\lambda^d.$
Because of Theorem 19.1, equality (19.3) holds as well for all non-negative,
Borel measurable, numerical functions on $G'$.
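As a quick check of the transformation formula in the simplest case, the following worked example takes $d = 1$ and an affine map. The map $\varphi$, the constants $a$, $b$ and the choice $f' = 1_{[0,1]}$ are illustrative assumptions, not taken from the text.

```latex
% Illustrative assumption: d = 1, G = G' = \mathbb{R},
% \varphi(x) = a x + b with a \neq 0, so that D\varphi = a.
\[
  \int_{\mathbb{R}} f'(y)\,\lambda^1(dy)
  = \int_{\mathbb{R}} f'(ax+b)\,|a|\,\lambda^1(dx).
\]
% With f' = 1_{[0,1]} the left side equals 1. On the right side,
% f'(ax+b) = 1 exactly on an interval of length 1/|a|,
% which gives |a| \cdot (1/|a|) = 1 as well.
```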
Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $T : \Omega \to \Omega$ a mapping which together with
its inverse is an $\mathscr{A}$-$\mathscr{A}$-measurable bijection. Show that for every $f \in E^*(\Omega, \mathscr{A})$ the
image measure $T(f\mu)$ has a density with respect to $T(\mu)$, namely $f \circ T^{-1}$.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space, $T : \Omega \to \Omega$ an $\mathscr{A}$-$\mathscr{A}$-measurable
mapping such that $T^{-1}(A)$ is a $\mu$-nullset whenever $A$ is. Prove the existence of
a measurable function $q \ge 0$ such that
$\int_{T^{-1}(A)} f \circ T \, d\mu = \int_A f q \, d\mu$
§20. Stochastic convergence
Let us return to the study of $p$-fold integrable functions begun in §14. Our goal will
be to replace the almost-everywhere convergence concept that underlies the theorems
proved there with a weaker convergence concept. It is suggested by a simple
but very useful inequality.
The setting is once again an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$.
20.1 Lemma. For every measurable numerical function $f$ on $\Omega$ and every pair of
real numbers $p > 0$ and $\alpha > 0$ the Chebyshev-Markov inequality
holds.
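The display that belongs to Lemma 20.1 is missing from the text above. For orientation, the Chebyshev-Markov inequality in its usual form, which is presumably what (20.1) states, reads as follows, together with the one-line argument behind it. The notation is that of the lemma.

```latex
% Usual form of the Chebyshev-Markov inequality (presumed content of (20.1)):
\[
  \mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha^{p}} \int |f|^{p}\,d\mu .
\]
% Reason: |f|^p \ge \alpha^p on the set \{|f| \ge \alpha\}, hence
\[
  \int |f|^{p}\,d\mu \ge \int_{\{|f| \ge \alpha\}} |f|^{p}\,d\mu
  \ge \alpha^{p}\,\mu(\{|f| \ge \alpha\}).
\]
```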
Remarks. 1. For a finite measure $\mu$ we may take $A = \Omega$ in (20.3) and in this case
stochastic convergence of $(f_n)$ to $f$ is equivalent to the requirement
(20.5) $\lim_{n \to \infty} \mu(\{|f_n - f| \ge \alpha\}) = 0$ for every $\alpha > 0$.
The more complicated condition (20.3) is dictated by the desire to treat infinite,
and especially $\sigma$-finite, measures as well as finite ones.
2. For $\sigma$-finite measures $\mu$ the stochastic convergence of a sequence $(f_n)$ to $f$ is
generally not equivalent to (20.5), as the next example illustrates.
and the requirement of $\sigma$-additivity. With $A_n := \{n, n+1, \ldots\}$ and $f_n := 1_{A_n}$, for
each $n \in \mathbb{N}$, the sequence $(f_n)$ converges stochastically to 0: For every $\alpha \in \,]0, 1[$,
$\{f_n \ge \alpha\} = A_n$, and since $A_n \downarrow \emptyset$, it follows from 3.2 that $\lim_{n \to \infty} \mu(A_n \cap A) = 0$
for every $A \in \mathscr{A}$ having finite measure. On the other hand, $\mu(A_n) = +\infty$ for
every $n \in \mathbb{N}$.
20.3 Theorem. For every $\sigma$-finite measure $\mu$, any two stochastic limits of a sequence
of measurable real functions are $\mu$-almost everywhere equal to each other.
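Proof. If $f$ and $f^*$ are stochastic limits of the sequence $(f_n)$, then from the triangle
inequality in $\mathbb{R}$
$\{|f - f^*| \ge \alpha\} \subset \{|f_n - f| \ge \alpha/2\} \cup \{|f_n - f^*| \ge \alpha/2\}$,
whence
$\mu(\{|f - f^*| \ge \alpha\} \cap A) \le \mu(\{|f_n - f| \ge \alpha/2\} \cap A) + \mu(\{|f_n - f^*| \ge \alpha/2\} \cap A)$
for every $n \in \mathbb{N}$ and every $A \in \mathscr{A}$. Letting $n \to \infty$ shows that
$\mu(\{|f - f^*| \ge \alpha\} \cap A) = 0$
for every $\alpha > 0$ and every $A \in \mathscr{A}$ of finite measure. Then however, $f = f^*$ $\mu$-almost
everywhere in every such set $A$, since
$\{f \ne f^*\} \cap A = \bigcup_{k \in \mathbb{N}} \{|f - f^*| \ge 1/k\} \cap A$
is a $\mu$-nullset. Upon taking for $A$ the sets in a sequence $(A_n)$ in $\mathscr{A}$ which satisfies
$\mu(A_n) < +\infty$ for all $n$ and $A_n \uparrow \Omega$, the $\mu$-almost everywhere equality of $f$ and $f^*$
follows. □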
Remark. 4. Stochastic limits $f$ and $f^*$ of the same sequence $(f_n)$ are almost
everywhere equal without any hypotheses on the measure itself if both functions
are $p$-fold integrable for some $p \in [1, +\infty[$. This is because for every real $\alpha > 0$ the
set $\{|f - f^*| \ge \alpha\}$ has finite measure, by (20.1), and so $f = f^*$ $\mu$-almost everywhere
in this set, whence $\{|f - f^*| > 0\} = \bigcup_{n \in \mathbb{N}} \{|f - f^*| \ge 1/n\}$ is a countable
union of $\mu$-nullsets. This just says that $f = f^*$ $\mu$-almost everywhere in $\Omega$. But the
next example shows that it may fail if one of the functions is not in any $\mathscr{L}^p$-space.
Example. 2. Consider the measure space $(\Omega, \mathscr{P}(\Omega), \mu)$, where $\Omega$ consists of exactly
two elements $\omega_0, \omega_1$ and $\mu(\{\omega_0\}) = 0$, $\mu(\{\omega_1\}) = +\infty$, $f_n = f = 0$ for every $n \in \mathbb{N}$.
These functions lie in every $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges stochastically
to $f$, as well as to every real-valued function $f^*$ on $\Omega$. Every such $f^*$ which is
non-zero at $\omega_1$, however, lies in no $\mathscr{L}^p(\mu)$ with $1 \le p < +\infty$ and fails to coincide
$\mu$-almost everywhere in $\Omega$ with $f$.
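The claim in Example 2 that every real function is a stochastic limit can be checked directly. The short computation below is a sketch; it assumes the definition of stochastic convergence indicated in Remark 1, namely convergence to 0 of the measures of the exceptional sets intersected with each set $A$ of finite measure.

```latex
% Sets of finite measure in Example 2: only \emptyset and \{\omega_0\},
% because every set containing \omega_1 has measure +\infty.
% Both are \mu-nullsets, so for every real function f^*, every \alpha > 0
% and every n \in \mathbb{N}:
\[
  \mu\bigl(\{|f_n - f^*| \ge \alpha\} \cap A\bigr) \le \mu(A) = 0
  \qquad \text{whenever } \mu(A) < +\infty .
\]
% On the other hand, if f^*(\omega_1) \neq 0, then for every p \in [1,+\infty[
\[
  \int |f^*|^{p}\,d\mu \ge |f^*(\omega_1)|^{p}\,\mu(\{\omega_1\}) = +\infty .
\]
```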
The considerations with which we began this section lead to an important class
of stochastically convergent sequences:
20.4 Theorem. If the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function
$f \in \mathscr{L}^p(\mu)$ for some $1 \le p < +\infty$, then it also converges to $f$ $\mu$-stochastically.
holds for every $n \in \mathbb{N}$, every $\alpha > 0$ and every $A \in \mathscr{A}$. The claimed stochastic
convergence, that is, the convergence to 0 of the left end of this chain as $n \to \infty$,
follows because $\int |f_n - f|^p \, d\mu \to 0$ as $n \to \infty$ is the definition of convergence
in $p$th mean. □
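The chain of inequalities that this proof refers to is missing from the text above. A plausible reconstruction, assuming the Chebyshev-Markov inequality in its usual form, is the following sketch.

```latex
% Presumed chain: monotonicity of \mu, then Chebyshev-Markov applied to f_n - f.
\[
  \mu\bigl(\{|f_n - f| \ge \alpha\} \cap A\bigr)
  \le \mu\bigl(\{|f_n - f| \ge \alpha\}\bigr)
  \le \frac{1}{\alpha^{p}} \int |f_n - f|^{p}\,d\mu .
\]
```

The middle term does not involve $A$, which is why the argument yields convergence of the measures of the full exceptional sets and not only of their traces on sets of finite measure.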
The proof shows that convergence in $p$th mean actually entails the stronger
form of stochastic convergence in (20.5). The situation is different when the given
sequence is almost everywhere convergent. (On this point cf. also Remark 5.)
and so
$\mu(\{|f_n - f| \ge \alpha\} \cap A) \le \mu(\{\sup_{m \ge n} |f_m - f| \ge \alpha\} \cap A)$
for every $A \in \mathscr{A}$. The present claim therefore follows from our next lemma, applied
to the restriction of $\mu$ to $A \cap \mathscr{A}$ for each $A$ of finite measure. □
20.6 Lemma. If the measure $\mu$ is finite, then each of the following three conditions
on a sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions is equivalent to $(f_n)$ converging
$\mu$-almost everywhere to 0:
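Proof. To prove the equivalence of (20.6) with the almost everywhere convergence
of $(f_n)$ to 0, we set, for each $\alpha > 0$ and each $n \in \mathbb{N}$
$A_n^\alpha := \{\sup_{m \ge n} |f_m| \ge \alpha\}$;
then these lie in $\mathscr{A}$, either by appeal to 9.5 or by noticing that each $A_n^\alpha \in \mathscr{A}$ and
$A = \bigcap_{k \in \mathbb{N}} \bigcup_{n \in \mathbb{N}} \complement A_n^{1/k}$.
Passing to complements,
$\complement A = \bigcup_{k \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} A_n^{1/k}$
and so
$\bigcap_{n \in \mathbb{N}} A_n^{1/k} \uparrow \complement A$ as $k \to \infty$, and $A_n^{1/k} \downarrow \bigcap_{m \in \mathbb{N}} A_m^{1/k}$ as $n \to \infty$.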
Consequently,
because the finite measure $\mu$ is both continuous from above and continuous from
below, by 3.2. Thus $(f_n)$ converges almost everywhere to 0 just when the number
defined by (20.8) is 0. In turn, the latter occurs exactly in case
$\inf_{n \in \mathbb{N}} \mu(A_n^{1/k}) = \lim_{n \to \infty} \mu(A_n^{1/k}) = 0$
for every $k \in \mathbb{N}$. The first equivalence follows from this. The equivalence of (20.6)
with (20.6') follows from the observation that for any numerical function $g$ on $\Omega$
$\{g > \alpha\} \subset \{g \ge \alpha\} \subset \{g > \alpha'\}$
whenever $0 < \alpha' < \alpha$.
Finally, the equivalence of (20.6') with (20.7) follows from the validity, for every
$\alpha > 0$, of the equality
(20.9) $\lim_{n \to \infty} \mu(\{\sup_{m \ge n} |f_m| > \alpha\}) = \mu(\limsup_{n \to \infty} \{|f_n| > \alpha\})$.
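Equality (20.9) rests on continuity from above. The following sketch spells out the step; it assumes, as in the lemma, that the measure is finite, and it reads the left side of (20.9) as a limit over $n$.

```latex
% For strict inequality the supremum set is a union of the single sets:
\[
  \Bigl\{\sup_{m \ge n} |f_m| > \alpha\Bigr\} = \bigcup_{m \ge n} \{|f_m| > \alpha\}
  \;\downarrow\; \bigcap_{n \in \mathbb{N}} \bigcup_{m \ge n} \{|f_m| > \alpha\}
  = \limsup_{n \to \infty} \{|f_n| > \alpha\}
  \qquad (n \to \infty).
\]
% Continuity from above of the finite measure \mu then gives
\[
  \lim_{n \to \infty} \mu\Bigl(\Bigl\{\sup_{m \ge n} |f_m| > \alpha\Bigr\}\Bigr)
  = \mu\Bigl(\limsup_{n \to \infty} \{|f_n| > \alpha\}\Bigr).
\]
```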
The conditions involved in Theorems 20.4 and 20.5 are indeed sufficient to
insure stochastic convergence, but they are not necessary for it, as the following
examples show.
$= n^p \mu(A_n) = n^{p-1}$
shows that the sequence does not converge to 0 in $p$th mean for any $p \ge 1$.
4. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space of the preceding example. Write each $n \in \mathbb{N}$
as $n = 2^h + k$ with non-negative integers $h$ and $k$ satisfying $0 \le k < 2^h$ (which
uniquely determines them) and set
$A_n := [k 2^{-h}, (k+1) 2^{-h}[$, $f_n := 1_{A_n}$, $n \in \mathbb{N}$.
It was shown in the example in §15 that the sequence $(f_n(\omega))_{n \in \mathbb{N}}$ converges for
no $\omega \in \Omega$. Nevertheless the sequence $(f_n)$ does converge stochastically to 0, since
for every $\alpha > 0$ and $n \in \mathbb{N}$
In this example stochastic convergence can also be inferred from 20.4, since the
example in §15 showed that $(f_n)$ converges to 0 in $p$th mean for every $p \in [1, +\infty[$.
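The estimate that justifies the stochastic convergence in Example 4 is missing from the text above. Assuming, as the intervals $A_n$ suggest, that the underlying measure space is $[0,1[$ with the L-B measure, it would presumably run as follows.

```latex
% Assumption: \mu is the L-B measure on [0,1[, so \mu(A_n) = 2^{-h} for n = 2^h + k.
% For every \alpha > 0 the set \{|f_n| \ge \alpha\} is contained in A_n, hence
\[
  \mu(\{|f_n| \ge \alpha\}) \le \mu(A_n) = 2^{-h} < \frac{2}{n},
\]
% because n = 2^h + k < 2^{h+1}. The right side tends to 0 as n \to \infty.
```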
Proof. For $A \in \mathscr{A}$ with $\mu(A) < +\infty$, the measure $\mu_A$, which is the restriction of $\mu$
to $A \cap \mathscr{A}$, is finite. It therefore suffices to deal with the case of a finite measure $\mu$;
moreover, in that case we can simply take $A$ to be $\Omega$ itself.
For $\alpha > 0$ and $m, n \in \mathbb{N}$ the triangle inequality shows that
$\{|f_m - f_n| \ge \alpha\} \subset \{|f_m - f| \ge \alpha/2\} \cup \{|f_n - f| \ge \alpha/2\}$;
thus by hypothesis $\mu(\{|f_m - f_n| \ge \alpha\})$ can be made arbitrarily small by taking $m$
and $n$ sufficiently large. If therefore $(\eta_k)_{k \in \mathbb{N}}$ is a sequence of positive real numbers
with
$\sum_{k=1}^{\infty} \eta_k < +\infty$,
$\sum_{k=1}^{\infty} (f_{n_{k+1}}(\omega) - f_{n_k}(\omega))$
converges (absolutely); that is, the sequence $(f_{n_k}(\omega))_{k \in \mathbb{N}}$ converges in $\mathbb{R}$. In summary,
the sequence $(f_{n_k})$ converges almost everywhere to a measurable real function
$f^*$ on $\Omega$. By 20.5 $f^*$ is also a stochastic limit of $(f_{n_k})_{k \in \mathbb{N}}$. But, as a subsequence
of $(f_n)$, that sequence converges stochastically to $f$ as well. Hence
by 20.3, $f = f^*$ almost everywhere. We have shown therefore that $(f_{n_k})_{k \in \mathbb{N}}$ converges
almost everywhere to $f$. □
Proof. The preceding theorem establishes that the subsequence condition is necessary
for the stochastic convergence of $(f_n)$ to $f$, since every subsequence of $(f_n)$
likewise converges stochastically to $f$. Let us now assume that the subsequence condition
is fulfilled, and fix an $A \in \mathscr{A}$ of finite measure. Since every subsequence $(f_{n_k})$
contains another which converges almost everywhere in $A$ to $f$ and by 20.5 this
latter subsequence must also converge (in $A$) stochastically to $f$, we see that in
the sequence of numbers
$\mu(\{|f_{n_k} - f| \ge \alpha\} \cap A)$ $(k \in \mathbb{N})$,
in which $\alpha > 0$ is fixed, a subsequence exists which converges to 0. But, as an
easy argument confirms, a sequence of real numbers all of whose subsequences have this
property must itself converge to 0. That is, the sequence of real numbers
$\mu(\{|f_n - f| \ge \alpha\} \cap A)$ $(n \in \mathbb{N})$
converges to 0. As this is true of every $A \in \mathscr{A}$ having finite measure and every
$\alpha > 0$, the stochastic convergence of $(f_n)$ to $f$ is thereby confirmed. □
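The easy argument about real sequences invoked in this proof can be sketched as follows. The symbols $a_n$, $\varepsilon$ and $n_k$ are introduced here only for illustration.

```latex
% Claim: if every subsequence of (a_n) has a further subsequence converging
% to 0, then a_n \to 0.
% Indirect proof: suppose a_n \not\to 0. Then there are \varepsilon > 0 and
% indices n_1 < n_2 < \dots with
\[
  |a_{n_k}| \ge \varepsilon \qquad \text{for all } k \in \mathbb{N}.
\]
% No subsequence of (a_{n_k}) can converge to 0, contradicting the hypothesis.
```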
Remarks. 5. It is not to be expected that in 20.7 and 20.8 the reference to the
finite-measure set $A \in \mathscr{A}$ can be stricken. This is already illustrated by Example 2
if one replaces the sequence $(f_n)$ there with the sequence $(f_n')$ defined by $f_n' :=
n 1_{\{\omega_1\}}$, $n \in \mathbb{N}$. This new sequence also converges stochastically to $f := 0$. See
however Exercise 5.
6. The second part of the proof of 20.7 shows that for finite measures $\mu$ there is
a Cauchy criterion for the stochastic convergence of a sequence $(f_n)$: Necessary
and sufficient for the stochastic convergence of a sequence $(f_n)$ to a measurable
real function on $\Omega$ is the condition
$\lim_{m,n \to \infty} \mu(\{|f_m - f_n| \ge \alpha\}) = 0$ for every $\alpha > 0$.
7. The sequence formed by alternately taking terms from each of two stochastically
convergent sequences whose limit functions do not coincide almost everywhere
shows that in Corollary 20.8 it does not suffice to demand that in each $A$ some
subsequence of the full sequence $(f_n)$ converge almost everywhere.
A particularly useful consequence of 20.8 is:
20.9 Theorem. If the sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions on $\Omega$ converges
stochastically to a measurable real function $f$ on $\Omega$, and $\varphi : \mathbb{R} \to \mathbb{R}$ is
continuous, then the sequence $(\varphi \circ f_n)_{n \in \mathbb{N}}$ converges stochastically to $\varphi \circ f$.
Proof. One exploits both directions of 20.8, noting that from the almost everywhere
convergence of a subsequence $(f_{n_k})$ to $f$ on an $A \in \mathscr{A}$ follows the almost
everywhere convergence of $(\varphi \circ f_{n_k})$ to $\varphi \circ f$ on $A$. □
The general question of functions $\varphi : \mathbb{R} \to \mathbb{R}$ which preserve convergence, in the
sense that $(\varphi \circ f_n)_{n \in \mathbb{N}}$ inherits the kind of convergence $(f_n)_{n \in \mathbb{N}}$ has, is investigated
by BARTLE and JOICHI [1961]. They show how Theorem 20.9 can fail if the more
restrictive definition (20.5) is adopted for stochastic convergence.
Exercises.
1. $(f_n)$ and $(g_n)$ are stochastically convergent sequences of measurable real functions,
having limit functions $f$ and $g$, respectively. Show that for all $\alpha, \beta \in \mathbb{R}$
the sequence $(\alpha f_n + \beta g_n)$ converges stochastically to $\alpha f + \beta g$, and the sequences
$(f_n \wedge g_n)$, $(f_n \vee g_n)$ converge stochastically to $f \wedge g$, $f \vee g$, respectively.
2. For a measure space $(\Omega, \mathscr{A}, \mu)$ with finite measure $\mu$ let $d_\mu$ be the pseudometric
on $\mathscr{A}$ constructed in Exercise 7 of §3. Show that a sequence $(A_n)$ in $\mathscr{A}$ is
$d_\mu$-convergent to $A \in \mathscr{A}$ if and only if the sequence $(1_{A_n})$ of indicator functions
converges stochastically to the indicator function $1_A$.
3. For every pair of measurable real functions $f$ and $g$ on a measure space $(\Omega, \mathscr{A}, \mu)$
with finite measure $\mu$ define
$D_\mu(f, g) := \inf\{\varepsilon > 0 : \mu(\{|f - g| \ge \varepsilon\}) \le \varepsilon\}$
and then prove that
(a) $D_\mu$ is a pseudometric on the set $M(\mathscr{A})$ of all measurable real functions.
(b) A sequence $(f_n)$ in $M(\mathscr{A})$ converges stochastically to $f \in M(\mathscr{A})$ if and only if
$\lim_{n \to \infty} D_\mu(f_n, f) = 0$.
(c) $M(\mathscr{A})$ is $D_\mu$-complete, that is, every $D_\mu$-Cauchy sequence in $M(\mathscr{A})$ converges
with respect to $D_\mu$ to some function in $M(\mathscr{A})$.
What is the relation of $D_\mu$ to the $d_\mu$ of Exercise 2?
4. In the context of Exercise 3 define
If - gi dp,
for every pair of functions f, g E M(ss). Show that Dµ also enjoys the properties
(a)-(c) proved for D$, in the preceding exercise.
5. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space. Show that a sequence $(f_n)$ of measurable
real functions on $\Omega$ converges stochastically to a measurable real function $f$
on $\Omega$ if and only if from every subsequence $(f_{n_k})$ of $(f_n)$ a further subsequence can
be extracted which converges almost everywhere in $\Omega$ to $f$. [Hints: Suppose $(f_n)$ is
stochastically convergent. Choose a sequence $(A_k)$ from $\mathscr{A}$ with $\mu(A_k) < +\infty$ for
each $k$ and $A_k \uparrow \Omega$, and consider the finite measures $\mu_k(A) := \mu(A \cap A_k)$ on $\mathscr{A}$.
The claim is true of each measure $\mu_k$. Given a subsequence $\Phi$ of $(f_n)$, there is for
each $k \in \mathbb{N}$ a subsequence $(g_n^{(k)})_{n \in \mathbb{N}}$ of $\Phi$ which converges $\mu_k$-almost everywhere
to $f$. It can be arranged that $(g_n^{(k+1)})$ is a subsequence of $(g_n^{(k)})$ for each $k$. Then
the diagonal subsequence $(g_n^{(n)})_{n \in \mathbb{N}}$ does what is wanted.]
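6. Give an "elementary" proof of 20.9 based directly on the relevant definition 20.2.
To this end, show that for each $\varepsilon \in \,]0, 1[$ there exists $\delta > 0$ such that $\{|f| \le
1/\varepsilon\} \cap \{|f_n - f| \le \delta\} \subset \{|\varphi \circ f_n - \varphi \circ f| \le \varepsilon\}$ for all $n \in \mathbb{N}$.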
7. (Theorem of D. F. EGOROV (1869-1931)) Let $(\Omega, \mathscr{A}, \mu)$ be a measure space
with finite measure $\mu$. Show that: For every sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real
functions on $\Omega$ its convergence almost everywhere to a measurable real function $f$
is equivalent to its so-called almost-uniform convergence to $f$. The latter means
that for every $\delta > 0$ there exists an $A_\delta \in \mathscr{A}$ such that $\mu(A_\delta) < \delta$ and $(f_n)$ converges
to $f$ uniformly on $\complement A_\delta$. [Hint: Exercise 2 of §11.]
§21. Equi-integrability
The sufficient condition for convergence in $p$th mean which is set out in Lebesgue's
dominated convergence theorem can be transformed into a necessary as well as sufficient
condition with the help of stochastic convergence. But we need the concept
of equi-integrability, which is of fundamental significance.
In the following $(\Omega, \mathscr{A}, \mu)$ will again be an arbitrary measure space, and $p$ is
always a real number satisfying $1 \le p < +\infty$.
The point of departure is a simple observation. A measurable numerical function
$f$ on $\Omega$ is integrable if and only if for every $\varepsilon > 0$ there is a non-negative
integrable function $g = g_\varepsilon$ such that
(21.1) $\int_{\{|f| \ge g\}} |f| \, d\mu \le \varepsilon$.
For if $f$ is integrable and we take, as we then may, $g$ to be $2|f|$, then $\{|f| \ge g\} =
\{f = 0\} \cup \{|f| = +\infty\}$ and thanks to 13.6 the integral in (21.1) is actually equal
to 0. Conversely, if we have (21.1) even for just one real $\varepsilon > 0$, then
$\int_{\{|f|^p \ge h\}} |f|^p \, d\mu \le \int_{\{g^p \ge h\}} g^p \, d\mu = \int_{\{g = +\infty\}} g^p \, d\mu = 0$
$\liminf_{n \to \infty} \int_{\{|f_n| \ge g\}} |f_n| \, d\mu \ge 1$,
showing that $g$ cannot be an $\varepsilon$-bound for any $\varepsilon \in \,]0, 1[$.
Here is a useful characterization of equi-integrability, which, for $\sigma$-finite measures,
will be improved upon in 21.8.
f IfI du <_ f
{IfI>_g}
IfI dµ+ f gdu.
Assuming that the set M is equi-integrable, let us choose for g an E-bound for it
and then set h := g, d 2. Then conditions (21.3) and (21.4) follow from the
preceding inequalities.
Conversely, assume the two conditions are fulfilled and let $\varepsilon > 0$ be given. Let
$h$ and $\delta > 0$ be as furnished by (21.4). For each $f \in M$ and real $\alpha > 0$, consider
the obviously valid inequality
or its equivalent
1
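Proof. For every $f \in \mathscr{L}^p(\mu)$ and every $A \in \mathscr{A}$, $|1_A f| \le |f|$ shows that $1_A f \in
\mathscr{L}^p(\mu)$ too, and so for all $f_1, f_2 \in \mathscr{L}^p(\mu)$ Minkowski's inequality (14.4) gives
$N_p(1_A f_1 + 1_A f_2) \le N_p(1_A f_1) + N_p(1_A f_2)$,
whence
$\left(\int_A |f_1 + f_2|^p \, d\mu\right)^{1/p} \le \left(\int_A |f_1|^p \, d\mu\right)^{1/p} + \left(\int_A |f_2|^p \, d\mu\right)^{1/p}$.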
Applying this inequality to $f_1 = \alpha f$, $f_2 = \beta g$ with $f, g \in M$, $\alpha, \beta \in \mathbb{R}$ and $|\alpha| \le 1$,
$|\beta| \le 1$, and bearing in mind that 21.2 is (by hypothesis) valid for the set $M^p$,
one realizes that conditions (21.3) and (21.4) are fulfilled by the set of all such $|\alpha f + \beta g|^p$
as well as by $M^p$, with the same function $h$ in both cases. □
We are now in a position to deliver the sharpened version of the dominated
convergence theorem mentioned in the introduction to this section. That we really
have to do with a sharpening here is attested to on the one hand by Example 3
and Theorem 20.4, according to which stochastic convergence follows from almost-
everywhere convergence, and on the other by Example 4 of §20, which shows that
there are situations in which the dominated convergence theorem is not applicable
but the following theorem is.
21.4 Theorem. For every sequence $(f_n)_{n \in \mathbb{N}}$ of $p$-fold $\mu$-integrable real functions
on a measure space $(\Omega, \mathscr{A}, \mu)$ the following two assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean.
(ii) The sequence $(f_n)$ converges $\mu$-stochastically, and the sequence $(|f_n|^p)$ is
$\mu$-equi-integrable.
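To every $\varepsilon > 0$ corresponds an $n_\varepsilon \in \mathbb{N}$ such that $N_p(f_n - f) \le 2^{-1}\varepsilon^{1/p}$ for all
$n \ge n_\varepsilon$. Therefore, if we set $\delta := 2^{-p}\varepsilon$ and
$h := |f_1|^p \vee \ldots \vee |f_{n_\varepsilon}|^p \vee |f|^p$,
condition (21.4) is also satisfied by $M$.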
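To see why a choice of the form $\delta = 2^{-p}\varepsilon$ together with the function $h$ works, one can argue as in the proof of 21.3. The sketch below assumes that (21.4) has the form: for every $A \in \mathscr{A}$, $\int_A h\,d\mu \le \delta$ implies $\int_A |f_n|^p\,d\mu \le \varepsilon$ for all $n$. That condition is not displayed in the text above, so this form is an assumption.

```latex
% Assumed form of (21.4):
%   \int_A h\,d\mu \le \delta \implies \int_A |f_n|^p\,d\mu \le \varepsilon for all n.
% Let \int_A h\,d\mu \le \delta = 2^{-p}\varepsilon.
% Case n \le n_\varepsilon: |f_n|^p \le h, so \int_A |f_n|^p\,d\mu \le \delta \le \varepsilon.
% Case n > n_\varepsilon: Minkowski's inequality on A and |f|^p \le h give
\[
  \Bigl(\int_A |f_n|^{p}\,d\mu\Bigr)^{1/p}
  \le N_p(f_n - f) + \Bigl(\int_A |f|^{p}\,d\mu\Bigr)^{1/p}
  \le \tfrac{1}{2}\varepsilon^{1/p} + \delta^{1/p}
  = \varepsilon^{1/p}.
\]
```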
(ii)$\Rightarrow$(i): From the stochastic convergence of the sequence $(f_n)$ and Remark 6
in §20 it follows that
(21.5) $\lim_{m,n \to \infty} \mu(\{|f_m - f_n| \ge \alpha\} \cap A) = 0$
for every $A \in \mathscr{A}$ of finite measure and every real $\alpha > 0$. We have to show that
$(f_n)$ is a Cauchy sequence in $\mathscr{L}^p(\mu)$, that is, that the doubly-indexed sequence of
functions $f_{mn} := f_m - f_n$ satisfies
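According to 21.3, along with the set $\{|f_n|^p : n \in \mathbb{N}\}$ the set $M_0 := \{|f_{mn}|^p :
m, n \in \mathbb{N}\}$ is also equi-integrable. Hence to every $\varepsilon > 0$ corresponds an integrable
function $g_\varepsilon \ge 0$ such that $\int_{\{f \ge g_\varepsilon\}} f \, d\mu \le \varepsilon$ holds for all $f \in M_0$. If we set $g := g_\varepsilon^{1/p}$
then $g$ is $p$-fold integrable and the preceding inequality can be written
$\int_{\{|f_{mn}| \ge g\}} |f_{mn}|^p \, d\mu \le \varepsilon$ for all $m, n \in \mathbb{N}$.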
Because
g" (11,<E.
fwnl
Consequently we also have
(21.7)
J fIdµ J g «}
gP dµ < for all m, n E N.
The Chebyshev-Markov inequality insures that the set $\{g > \eta\}$ has finite $\mu$-measure.
According to (21.5) therefore the doubly-indexed sequence of sets
$A_{mn} := \{|f_{mn}| \ge \alpha\} \cap \{g > \eta\}$, $m, n \in \mathbb{N}$
satisfies, whatever $\alpha > 0$ is involved,
$\lim_{m,n \to \infty} \mu(A_{mn}) = 0$.
The $\mu$-continuity of the finite measure $g^p \mu$ and 17.8 provide for an $n_0 \in \mathbb{N}$ such
that
$\int_{A_{mn}} g^p \, d\mu \le \varepsilon$ for all $m, n \ge n_0$.
Hence
Remark. 1. Theorem 21.4 does not claim that from the stochastic convergence of
a sequence $(f_n)$ to a measurable real function $f$, the $p$-fold integrability of $f$ and the
convergence of $(f_n)$ to $f$ in $p$th mean follow as soon as the sequence $(|f_n|^p)$ is equi-integrable.
Rather the theorem guarantees the existence of a $p$-fold integrable function
among the possible stochastic limits of the sequence $(f_n)$. The sequence $(f_n)$
does converge in $p$th mean to every such stochastic limit, as follows from the proof
of the theorem in the light of Remark 4 of §20, according to which any two $p$-fold
integrable stochastic limits must in fact coincide almost everywhere.
But stochastic limits that are not $p$-fold integrable do exist, a fact that can be
demonstrated with the aid of the Example in §20: For the sequence $(f_n)$ there,
$(|f_n|^p)$ is equi-integrable. But among the stochastic limits $f^*$ that occur there,
$f^* \in \mathscr{L}^p(\mu)$ for some $p \in [1, +\infty[$ if and only if $f^*(\omega_1) = 0$.
However, the phenomenon discussed above does not occur for $\sigma$-finite measures.
By 20.3 in that case any two stochastic limits are almost everywhere equal.
Therefore we have
21.5 Corollary. Suppose the measure $\mu$ is $\sigma$-finite. If a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$
converges stochastically to a (measurable, real) function $f$, and if the sequence
$(|f_n|^p)$ is equi-integrable, then $f \in \mathscr{L}^p(\mu)$ and $(f_n)$ converges in $p$th mean to $f$.
21.6 Lemma. Suppose the sequence $(f_n)$ of functions $f_n \ge 0$ from $\mathscr{L}^1(\mu)$ converges
stochastically to a function $f \ge 0$ from $\mathscr{L}^1(\mu)$. If in addition
(21.10) $\lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu$,
From this, the decomposition $f + f_n = f \vee f_n + f \wedge f_n$, and the convergence
hypothesis follows the companion result
Now we can get the sharpening of Theorems 21.4 and 15.4 mentioned earlier:
21.7 Theorem. For every sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ which converges $\mu$-stochastically
to a function $f \in \mathscr{L}^p(\mu)$ the following three assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean to $f$.
(ii) The sequence $(|f_n|^p)$ is equi-integrable.
(iii) $\lim_{n \to \infty} \int |f_n|^p \, d\mu = \int |f|^p \, d\mu$.
Proof. The equivalence of (i) and (ii) is contained in Theorem 21.4. We need
therefore establish only two implications:
(i)$\Rightarrow$(iii): Assertion (15.6) in Theorem 15.1 affirms this.
(iii)$\Rightarrow$(ii): From the hypothesized stochastic convergence of the sequence $(f_n)$
to $f$ follows that of $(|f_n|^p)$ to $|f|^p$, via 20.9. And then from the preceding lemma
it further follows that the sequence $(|f_n|^p)$ converges to $|f|^p$ in mean. Finally,
Theorem 21.4 - with the $p$ there chosen to be 1 - shows that the convergence in
mean of this sequence entails its equi-integrability.
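That stochastic convergence alone does not give any of the three equivalent assertions is visible already for $p = 1$. The following illustration assumes the L-B measure $\lambda^1$ on $[0,1[$ and a sequence chosen here for the purpose; it is in the spirit of the computation $n^p \mu(A_n) = n^{p-1}$ in §20.

```latex
% Assumed example: \mu = \lambda^1 on [0,1[, \quad f_n := n\,1_{[0,1/n[}, \quad f := 0.
% Stochastic convergence: for every \alpha > 0,
\[
  \mu(\{|f_n - f| \ge \alpha\}) \le \mu([0,1/n[) = \frac{1}{n} \to 0 .
\]
% But assertion (iii) fails for p = 1:
\[
  \int |f_n|\,d\mu = n \cdot \frac{1}{n} = 1 \ne 0 = \int |f|\,d\mu ,
\]
% so by the theorem (f_n) does not converge to f in mean,
% and (|f_n|) is not equi-integrable.
```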
21.8 Theorem. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space and $h$ a strictly positive
function from $\mathscr{L}^1(\mu)$. Then for any set $M$ of $\mathscr{A}$-measurable numerical functions
on $\Omega$ the following three assertions are equivalent:
(i) $M$ is equi-integrable.
(ii) For every $\varepsilon > 0$ some scalar multiple of $h$ is an $\varepsilon$-bound for $M$.
(iii) $M$ satisfies
(21.13) $\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha h\}} |f| \, d\mu = 0$
holds uniformly for $f \in M$. Condition (21.12) is for obvious reasons (cf. 17.8)
called the equi-$(h\mu)$-continuity of the measures $|f|\mu$, $f \in M$.
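Proof. (i)$\Rightarrow$(ii): Let $g$ be an $\varepsilon$-bound for $M$. Then for all $f \in M$ and all $\alpha > 0$
$\int_{\{|f| \ge \alpha h\}} |f| \, d\mu = \int_{\{|f| \ge \alpha h\} \cap \{|f| \ge g\}} |f| \, d\mu + \int_{\{|f| \ge \alpha h\} \cap \{|f| < g\}} |f| \, d\mu$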
g dµ < 2
k>ah)
for all sufficiently large $\alpha$. Coupled with the preceding inequality this shows that
indeed $\alpha h$ is an $\varepsilon$-bound for all sufficiently large $\alpha$, that is, (ii) holds.
(21.13') $\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha\}} |f| \, d\mu = 0$ uniformly for $f \in M$.
This condition is thus - just as (21.13) for $\sigma$-finite measures - necessary and
sufficient for equi-integrability of $M$.
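Condition (21.13') is often easy to verify directly. As an illustration, consider a set of functions that is bounded in second mean; the family $M$ and the constant $C$ below are assumptions made for this example, not part of the text.

```latex
% Assumption: \sup_{f \in M} \int |f|^2\,d\mu \le C < +\infty.
% On \{|f| \ge \alpha\} one has |f| \le |f|^2/\alpha, hence for every f \in M
\[
  \int_{\{|f| \ge \alpha\}} |f|\,d\mu
  \le \frac{1}{\alpha} \int |f|^{2}\,d\mu
  \le \frac{C}{\alpha} \;\longrightarrow\; 0 \qquad (\alpha \to +\infty),
\]
% and the bound C/\alpha does not depend on f, so the convergence is uniform on M.
```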
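21.9 Lemma. Let $\mu$ be a finite measure and $M \subset \mathscr{L}^1(\mu)$. Suppose that there is
a $\mu$-integrable function $g \ge 0$ such that
(21.14) $\int_{\{|f| \ge \alpha\}} |f| \, d\mu \le \int_{\{|f| \ge \alpha\}} g \, d\mu$
Proof. The case $\alpha := 0$ of (21.14) says that $\int |f| \, d\mu \le \int g \, d\mu < +\infty$ for all $f \in M$.
Then Chebyshev's inequality tells us that
$\mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha} \int |f| \, d\mu \le \frac{1}{\alpha} \int g \, d\mu$ for all $\alpha > 0$, $f \in M$.
It follows from this that
(21.15) $\lim_{\alpha \to +\infty} \mu(\{|f| \ge \alpha\}) = 0$ uniformly in $f \in M$.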
Exercises.
1. Show that for any measure space $(\Omega, \mathscr{A}, \mu)$ a set $M$ of measurable numerical
functions is equi-integrable if and only if for every $\varepsilon > 0$ there is an integrable
function $h = h_\varepsilon \ge 0$ such that $\int (|f| - h)^+ \, d\mu \le \varepsilon$ for all $f \in M$. [Hint: For sufficiently
large $\eta > 0$, $g := \eta h$ will be a $2\varepsilon$-bound for $M$.]
2. Let $(\Omega, \mathscr{A}, \mu)$ be an arbitrary measure space, $1 \le p < +\infty$. Suppose the sequence
$(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere on $\Omega$ to a measurable real
function $f$. Show that $f$ lies in $\mathscr{L}^p(\mu)$ and $(f_n)$ converges to $f$ in $p$th mean if the
sequence $(|f_n|^p)$ is equi-integrable.
3. Show that from the $\mathscr{L}^p$-convergence of a sequence $(f_n)$ to a function $f \in \mathscr{L}^p(\mu)$
follows the $\mathscr{L}^1$-convergence of the sequence $(|f_n|^p)$ to $|f|^p$, for any $1 \le p < +\infty$.
4. Consider a finite measure $\mu$ and an $M \subset \mathscr{L}^1(\mu)$. For each $n \in \mathbb{N}$, $f \in M$ set
$a_n(f) := n \mu(\{n \le |f| < n+1\})$.
Show that $M$ is equi-integrable if and only if the series $\sum_{n=1}^{\infty} a_n(f)$ converges uniformly
in $f \in M$. [Cf. Theorem 3.4 and its proof in BAUER [1996].]
5. Consider a finite measure $\mu$ and an $M \subset \mathscr{L}^1(\mu)$. Show that $M$ is equi-integrable
if there is a function $q : \mathbb{R}_+ \to \mathbb{R}_+$ with the properties
$\lim_{t \to +\infty} \frac{q(t)}{t} = +\infty$ and $\sup_{f \in M} \int q \circ |f| \, d\mu < +\infty$.
(In fact we have to do here with a necessary as well as a sufficient condition, which
goes back to CH. DE LA VALLÉE POUSSIN (1866-1962). Moreover, $q$ can always
be chosen to be convex and isotone. Cf. MEYER [1976], p. 19 or DELLACHERIE
and MEYER [1975], p. 38.)
6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$, $(f_n)_{n \in \mathbb{N}}$ a sequence of
measurable numerical functions $f_n \ge 0$, and set $f^* := \limsup_{n \to \infty} f_n$. Show that:
(a) If the sequence $(f_n)$ is equi-integrable (or at least satisfies condition (21.12)),
then the following "dual version" of Fatou's lemma is valid:
How does the corresponding result in Exercise 1 of §15 fit in? [Hint: Exercise 2
of §11.]
(b) Under the hypothesis $\int f^* \, d\mu < +\infty$, the sequence $(f_n)$ is equi-integrable if
and only if (*) holds. [In proving the "if" direction, argue indirectly.]
(c) Result (b) can fail in case $\int f^* \, d\mu = +\infty$. Try to corroborate this with a sequence
$(\alpha_n f_n)$ derived by appropriate choice of (sufficiently large) numbers
$\alpha_n > 0$ from the sequence $(f_n)$ in the Example from §15.
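7. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$, and let $(\nu_i)_{i \in I}$ be a family
of finite and $\mu$-continuous measures on $\mathscr{A}$. Suppose this family is equi-continuous
at $\emptyset$, meaning that to every sequence $(A_n)_{n \in \mathbb{N}}$ in $\mathscr{A}$ with $A_n \downarrow \emptyset$ and to every
$\varepsilon > 0$ there is an $n_\varepsilon \in \mathbb{N}$ such that $\nu_i(A_n) \le \varepsilon$ for all $n \ge n_\varepsilon$ and all $i \in I$. Show
that then this family is equi-$\mu$-continuous in the following sense (cf. (21.12)): To
every $\varepsilon > 0$ there corresponds a $\delta = \delta_\varepsilon > 0$ such that
$A \in \mathscr{A}$ and $\mu(A) \le \delta \;\Rightarrow\; \nu_i(A) \le \varepsilon$ for all $i \in I$.
What does this result say in view of Theorem 21.8? [Hint: Review the proof of
Theorem 17.8.]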
Chapter III
Product Measures
In this short chapter we will investigate whether and how one can associate a product
with finitely many measure spaces. And for the product measures thus gotten
we will want to see about how to integrate with respect to them in terms of their
factors. We will recognize the L-B measure $\lambda^d$ as being a special product measure
when $d \ge 2$. One important application of product measures is the introduction
of the concept of convolution for measures and functions.
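§22. Products of $\sigma$-algebras and measures
which assigns to each point $(\omega_1, \ldots, \omega_n) \in \Omega$ its $j$th coordinate $\omega_j$. The $\sigma$-algebra
in $\Omega$ generated by the mappings $p_1, \ldots, p_n$ is designated
$\bigotimes_{j=1}^{n} \mathscr{A}_j$
and called the product of the $\sigma$-algebras $\mathscr{A}_1, \ldots, \mathscr{A}_n$. According to (7.3) we have to
do here with the smallest $\sigma$-algebra $\mathscr{A}$ in $\Omega$ such that each $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable.
The reader may recall that the product of finitely many topological spaces is
defined in a very similar way.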
An important principle of generation for such products is immediately at hand:
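22.1 Theorem. For each $j = 1, \ldots, n$ let $\mathscr{E}_j$ be a generator of the $\sigma$-algebra $\mathscr{A}_j$
in $\Omega_j$ which contains a sequence $(E_{jk})_{k \in \mathbb{N}}$ of sets with $E_{jk} \uparrow \Omega_j$. Then the $\sigma$-algebra
$\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ is generated by the system of all sets
$E_1 \times \ldots \times E_n$
with $E_j \in \mathscr{E}_j$ for each $j = 1, \ldots, n$.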
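Proof. Let $\mathscr{A}$ be any $\sigma$-algebra in $\Omega$. What we have to show is that the mappings $p_j$
are all $\mathscr{A}$-$\mathscr{A}_j$-measurable ($j = 1, \ldots, n$) if and only if $\mathscr{A}$ contains each of the
sets $E_1 \times \ldots \times E_n$ described above. According to 7.2 $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable just
exactly if $p_j^{-1}(E_j) \in \mathscr{A}$ for every $E_j \in \mathscr{E}_j$. If this condition is fulfilled for each
$j \in \{1, \ldots, n\}$, then the sets
$E_1 \times \ldots \times E_n = p_1^{-1}(E_1) \cap \ldots \cap p_n^{-1}(E_n)$
all lie in $\mathscr{A}$. If conversely, $E_1 \times \ldots \times E_n \in \mathscr{A}$ for every possible choice of $E_j \in \mathscr{E}_j$
and $j \in \{1, \ldots, n\}$, then upon fixing $E_j \in \mathscr{E}_j$, the sets
$F_k := E_{1k} \times \ldots \times E_{j-1,k} \times E_j \times E_{j+1,k} \times \ldots \times E_{nk}$, $k \in \mathbb{N}$,
all lie in $\mathscr{A}$. Since the sequence $(F_k)_{k \in \mathbb{N}}$ increases to
$\Omega_1 \times \ldots \times \Omega_{j-1} \times E_j \times \Omega_{j+1} \times \ldots \times \Omega_n = p_j^{-1}(E_j)$,
this set too lies in $\mathscr{A}$, for each $j$. The claim is therewith proven. □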
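A particular case of this theorem is the fact that the product $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ is
generated by all the sets $A_1 \times \ldots \times A_n$ with each $A_j \in \mathscr{A}_j$. Our further course will
be guided by the following example:
Measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$ are given, $1 \le j \le n$ with $n \ge 2$, and for each $\mathscr{A}_j$
a generator $\mathscr{E}_j$. Under what hypotheses can the existence of a measure $\pi$ on
$\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ satisfying
(22.3) $\pi(E_1 \times \ldots \times E_n) = \mu_1(E_1) \cdot \ldots \cdot \mu_n(E_n)$ for all $E_j \in \mathscr{E}_j$, $1 \le j \le n$
be proven?
The accompanying uniqueness question can be settled at once: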
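22.2 Theorem. Suppose that for each $j = 1, \ldots, n$ $\mathscr{E}_j$ is an $\cap$-stable generator
of $\mathscr{A}_j$ which contains a sequence $(E_{jk})_{k \in \mathbb{N}}$ of sets of finite $\mu_j$-measure satisfying
$E_{jk} \uparrow \Omega_j$. Then there is at most one measure $\pi$ on $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ enjoying
property (22.3).
Proof. Let $\mathscr{E}$ denote the system of all sets $E_1 \times \ldots \times E_n$, where $E_j \in \mathscr{E}_j$ for each $j$.
According to 22.1, $\mathscr{E}$ generates the $\sigma$-algebra $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$. Since each $\mathscr{E}_j$ is
$\cap$-stable, so is $\mathscr{E}$, as the identity
$(E_1 \times \ldots \times E_n) \cap (F_1 \times \ldots \times F_n) = (E_1 \cap F_1) \times \ldots \times (E_n \cap F_n)$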
Under the hypotheses of 22.2, which obviously entail the $\sigma$-finiteness of each
measure $\mu_j$, the existence of the desired measure $\pi$ can also be proven. This proof
will be carried out in the next section, first for $n = 2$, then for arbitrary $n \ge 2$.
Exercise.
Finitely many measurable spaces $(\Omega_j, \mathscr{A}_j)$ are given, $j = 1, \ldots, n$. Show that the
algebra in $\Omega_1 \times \ldots \times \Omega_n$ generated by all sets $A_1 \times \ldots \times A_n$ with each $A_j \in \mathscr{A}_j$
consists of all finite unions of such product sets.
§23. Product measures and Fubini's theorem
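Initially measure spaces $(\Omega_1, \mathscr{A}_1, \mu_1)$, $(\Omega_2, \mathscr{A}_2, \mu_2)$ are given. For every $Q \subset \Omega_1 \times \Omega_2$
the sets
(23.1) $Q_{\omega_1} := \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in Q\}$, $Q_{\omega_2} := \{\omega_1 \in \Omega_1 : (\omega_1, \omega_2) \in Q\}$
are called, respectively, the $\omega_1$-section of $Q$ ($\omega_1 \in \Omega_1$) and the $\omega_2$-section of $Q$
($\omega_2 \in \Omega_2$).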
This notation is chosen for typographic simplicity and will see us through §23,
after which it is not needed. In case $\Omega_1 = \Omega_2$, however, it presents obvious problems,
to circumvent which, alternative notations like ${}_{\omega_1}Q$ or $Q^{\omega_1}$ for $Q_{\omega_1}$ are also
popular in the literature.
About these sets we claim:
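23.1 Lemma. If $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$, then its $\omega_1$-section lies in $\mathscr{A}_2$ for every $\omega_1 \in \Omega_1$,
and its $\omega_2$-section lies in $\mathscr{A}_1$ for every $\omega_2 \in \Omega_2$.
Proof. For arbitrary subsets $Q, Q_1, Q_2, \ldots$ of $\Omega := \Omega_1 \times \Omega_2$ and points $\omega_1 \in \Omega_1$
$(\Omega \setminus Q)_{\omega_1} = \Omega_2 \setminus Q_{\omega_1}$
and
$\left(\bigcup_{n \in \mathbb{N}} Q_n\right)_{\omega_1} = \bigcup_{n \in \mathbb{N}} (Q_n)_{\omega_1}$.
Furthermore $\Omega_{\omega_1} = \Omega_2$, and more generally for $A_1 \subset \Omega_1$, $A_2 \subset \Omega_2$ we have
$(A_1 \times A_2)_{\omega_1} = A_2$ if $\omega_1 \in A_1$, and $(A_1 \times A_2)_{\omega_1} = \emptyset$ if $\omega_1 \in \Omega_1 \setminus A_1$.
For each $\omega_1 \in \Omega_1$, therefore, the system of all sets $Q \subset \Omega$ having section $Q_{\omega_1} \in \mathscr{A}_2$
is a $\sigma$-algebra in $\Omega$ which contains every product set $A_1 \times A_2$ with $A_1 \in \mathscr{A}_1$,
$A_2 \in \mathscr{A}_2$. But according to 22.1 $\mathscr{A}_1 \otimes \mathscr{A}_2$ is the smallest $\sigma$-algebra which contains
all such product sets. This proves the part of the lemma dealing with $\omega_1$-sections.
Of course, $\omega_2$-sections are treated the same way. □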
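For a concrete picture of sections, consider the following example. The choice of both factors as $\mathbb{R}$ and the set $Q$ are assumptions made here for illustration; since the two factors coincide, the section is indexed by the variable name $x$ of the first factor to avoid the ambiguity mentioned above.

```latex
% Assumed example: \Omega_1 = \Omega_2 = \mathbb{R}, \quad
% Q := \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 \le 1\} \quad (\text{closed unit disc}).
% Section of Q at a point x of the first factor:
\[
  Q_x = \{y \in \mathbb{R} : (x,y) \in Q\} =
  \begin{cases}
    \bigl[-\sqrt{1-x^2},\, \sqrt{1-x^2}\,\bigr] & \text{if } |x| \le 1,\\[2pt]
    \emptyset & \text{if } |x| > 1.
  \end{cases}
\]
% Each section is a closed interval or empty, hence a Borel set,
% in line with Lemma 23.1 applied to the Borel set Q.
```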
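Since now $\mu_2(Q_{\omega_1})$ and $\mu_1(Q_{\omega_2})$ make sense for all $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$, $\omega_1 \in \Omega_1$ and
$\omega_2 \in \Omega_2$, we are in a position to take the next step:
23.2 Lemma. Suppose the measures $\mu_1$ and $\mu_2$ are $\sigma$-finite. Then for every $Q \in
\mathscr{A}_1 \otimes \mathscr{A}_2$ the functions
$\omega_1 \mapsto \mu_2(Q_{\omega_1})$ and $\omega_2 \mapsto \mu_1(Q_{\omega_2})$
on $\Omega_1$ and $\Omega_2$, respectively, are $\mathscr{A}_1$-measurable and $\mathscr{A}_2$-measurable, respectively.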
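Proof. The function $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ will be denoted by $s_Q$. We will establish the
$\mathscr{A}_1$-measurability of $s_Q$, for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$. The other function can be treated
analogously.
First suppose that $\mu_2(\Omega_2) < +\infty$. In this case the set $\mathscr{D}$ of all $D \in \mathscr{A}_1 \otimes \mathscr{A}_2$
whose $s_D$ function is $\mathscr{A}_1$-measurable constitutes a Dynkin system in $\Omega := \Omega_1 \times \Omega_2$.
This involves the following easily checked assertions:
$s_\Omega = \mu_2(\Omega_2)$;
$s_{\Omega \setminus D} = s_\Omega - s_D$ for every $D \in \mathscr{D}$;
$s_{\bigcup D_n} = \sum s_{D_n}$ for every sequence $(D_n)$ of disjoint sets in $\mathscr{D}$.
Furthermore $\mathscr{D}$ contains $A_1 \times A_2$ for every $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$, since
$s_{A_1 \times A_2} = \mu_2(A_2) \cdot 1_{A_1}$.
The system $\mathscr{E}$ of all such $A_1 \times A_2$ is $\cap$-stable and generates $\mathscr{A}_1 \otimes \mathscr{A}_2$, by 22.1.
Therefore 2.4 insures that $\mathscr{A}_1 \otimes \mathscr{A}_2$ is the Dynkin system generated by it. From
$\mathscr{E} \subset \mathscr{D} \subset \mathscr{A}_1 \otimes \mathscr{A}_2$ therefore follows that $\mathscr{D} = \mathscr{A}_1 \otimes \mathscr{A}_2$, which is what is being
claimed.
If $\mu_2$ is only $\sigma$-finite, then there is a sequence $(B_n)$ of sets from $\mathscr{A}_2$, each of
finite $\mu_2$-measure, with $B_n \uparrow \Omega_2$. For each $n$, $A_2 \mapsto \mu_2(A_2 \cap B_n)$ is therefore a finite
measure $\mu_{2,n}$ on $\mathscr{A}_2$, to which the already proven result can be applied, showing
that $\omega_1 \mapsto \mu_{2,n}(Q_{\omega_1})$ is $\mathscr{A}_1$-measurable for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$. Now
$\mu_2(Q_{\omega_1}) = \sup_{n \in \mathbb{N}} \mu_{2,n}(Q_{\omega_1})$
because of the continuity from below of the measure $\mu_2$. From Theorem 9.5 then
the mapping $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ is indeed $\mathscr{A}_1$-measurable.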
23.3 Theorem. Let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be $\sigma$-finite measure spaces, $j = 1, 2$. Then there
is exactly one measure $\pi$ on $\mathscr{A}_1 \otimes \mathscr{A}_2$ which satisfies
(23.2) $\pi(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$ for all $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$.
In addition this measure satisfies
and is $\sigma$-finite.
Proof. As before, for each Q E sate e s12 let sq denote the Wi-measurable function
w1 on 121; it is of course non-negative. Consequently via
ir(Q) := JSQdILI
a non-negative function it is well defined on 010 sate. For every sequence (Q,)nEH
of pairwise disjoint sets from sat 0 szt2 the equality sUq = E sq, and 11.5 insure
§23. Product measures and Fubini's theorem 137
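that
$\pi\big(\bigcup_{n \in \mathbb{N}} Q_n\big) = \sum_{n=1}^{\infty} \pi(Q_n)$.
Since $s_\emptyset = 0$ we have $\pi(\emptyset) = 0$. This proves that $\pi$ is indeed a measure on $\mathscr{A}_1 \otimes \mathscr{A}_2$.
It has property (23.2) because
$s_{A_1 \times A_2} = \mu_2(A_2) 1_{A_1}$, whence integration yields
$\pi(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$.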
Proceeding analogously, we confirm that
$\pi'(Q) := \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$
also defines a measure on $\mathscr{A}_1 \otimes \mathscr{A}_2$ having this property. But when Theorem 22.2
is applied to $\mathscr{E}_1 := \mathscr{A}_1$ and $\mathscr{E}_2 := \mathscr{A}_2$ it affirms that there is at most one such
measure. Thus $\pi = \pi'$ and (23.3) is confirmed. There is a sequence $(A_{jn})_{n \in \mathbb{N}}$ of
sets from $\mathscr{A}_j$, each of finite $\mu_j$-measure, with $A_{jn} \uparrow \Omega_j$, for $j = 1$ and $j = 2$. Using
these as the $A_1$, $A_2$, respectively, in (23.2) proves the σ-finiteness of $\pi$ because
$\pi(A_{1n} \times A_{2n}) < +\infty$ and $A_{1n} \times A_{2n} \uparrow \Omega_1 \times \Omega_2$.
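The two iterated expressions for the product measure used in this proof can be checked by hand on finite sets. The following is only a sketch under stated assumptions: the point masses `mu1`, `mu2`, the sets `Q`, `A1`, `A2` and all function names are hypothetical illustrations, not part of the text. Each measure is given by its point masses, so integrating the section function reduces to a finite sum.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite measure spaces, each given by point masses.
mu1 = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
mu2 = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}

def measure(mu, subset):
    """Measure of a subset under a measure given by point masses."""
    return sum((mu[w] for w in subset), Fraction(0))

def pi_via_first(Q):
    """Integrate w1 -> mu2(Q_{w1}) with respect to mu1."""
    return sum(
        (mu1[w1] * measure(mu2, [w2 for w2 in mu2 if (w1, w2) in Q]) for w1 in mu1),
        Fraction(0),
    )

def pi_via_second(Q):
    """Integrate w2 -> mu1(Q_{w2}) with respect to mu2."""
    return sum(
        (mu2[w2] * measure(mu1, [w1 for w1 in mu1 if (w1, w2) in Q]) for w2 in mu2),
        Fraction(0),
    )

# A set that is not a rectangle, and a rectangle A1 x A2.
Q = {("a", 0), ("a", 2), ("b", 1)}
A1, A2 = {"b"}, {0, 2}
rect = set(product(A1, A2))

print(pi_via_first(Q), pi_via_second(Q))
print(pi_via_first(rect), measure(mu1, A1) * measure(mu2, A2))
```

Both orders of integration give the same value for `Q`, and on the rectangle the value is the product of the two side measures, which is the defining property (23.2).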
23.4 Definition. The measure $\pi$ on $\mathscr{A}_1 \otimes \mathscr{A}_2$ which is uniquely specified by (23.2)
whenever $(\Omega_1, \mathscr{A}_1, \mu_1)$ and $(\Omega_2, \mathscr{A}_2, \mu_2)$ are σ-finite measure spaces is called the
product of the measures $\mu_1$ and $\mu_2$ and is denoted by
$\mu_1 \otimes \mu_2$.
Thus also the question posed in §22 is answered for σ-finite measures $\mu_1$, $\mu_2$.
If namely $\mathscr{E}_j$ is a generator of $\mathscr{A}_j$ ($j = 1, 2$) with the properties formulated in
Theorem 22.2, then according to 22.2 and 23.3, $\mu_1 \otimes \mu_2$ is the only measure $\pi$ on
$\mathscr{A}_1 \otimes \mathscr{A}_2$ which satisfies (22.3).
The Example in §22 therefore entails that $\lambda^2 = \lambda^1 \otimes \lambda^1$. Similar considerations
lead to the validity of
$\lambda^{m+n} = \lambda^m \otimes \lambda^n$.
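Note, of course, that these indicator functions have different domains, and, just as
with (23.1), further caution is called for with (23.4) in case $\Omega_1 = \Omega_2$. Equations
(23.4) and (23.5) lead us to call the mapping $f_{\omega_j}$ the $\omega_j$-section of $f$. It enjoys
the expected properties:
23.5 Lemma. For every measurable space $(\Omega', \mathscr{A}')$ and every measurable mapping
$f : (\Omega_1 \times \Omega_2, \mathscr{A}_1 \otimes \mathscr{A}_2) \to (\Omega', \mathscr{A}')$,
$f_{\omega_1}$ is $\mathscr{A}_2$-$\mathscr{A}'$-measurable and $f_{\omega_2}$ is $\mathscr{A}_1$-$\mathscr{A}'$-measurable for every $\omega_1 \in \Omega_1$, $\omega_2 \in \Omega_2$.
Proof. For every $A' \in \mathscr{A}'$ and $j = 1, 2$ we have $f_{\omega_j}^{-1}(A') = (f^{-1}(A'))_{\omega_j}$,
so the measurability claims follow from Lemma 23.1.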
Decisive is the following theorem which extends formula (23.3) from indicator
functions to non-negative measurable functions. It goes back to L. TONELLI (1885-
1946), its corollary to G. FUBINI (1879-1943). Both statements are often combined
under the single designation the theorem of Fubini.
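23.6 Theorem (of Tonelli). Let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be σ-finite measure spaces ($j = 1, 2$),
and let
$f : \Omega_1 \times \Omega_2 \to \overline{\mathbb{R}}_+$
be $\mathscr{A}_1 \otimes \mathscr{A}_2$-measurable. Then the functions
$\omega_2 \mapsto \int f_{\omega_2}\,d\mu_1$ and $\omega_1 \mapsto \int f_{\omega_1}\,d\mu_2$
are $\mathscr{A}_2$-measurable, respectively $\mathscr{A}_1$-measurable, and
(23.6) $\int f\,d(\mu_1 \otimes \mu_2) = \int \Big( \int f_{\omega_2}\,d\mu_1 \Big) \mu_2(d\omega_2) = \int \Big( \int f_{\omega_1}\,d\mu_2 \Big) \mu_1(d\omega_1)$.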
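Proof. Set $\Omega := \Omega_1 \times \Omega_2$, $\mathscr{A} := \mathscr{A}_1 \otimes \mathscr{A}_2$ and $\pi := \mu_1 \otimes \mu_2$. Consider first an
$\mathscr{A}$-elementary function $f$:
$f := \sum_{j=1}^{n} \alpha_j 1_{Q_j}$ ($\alpha_j \ge 0$, $Q_j \in \mathscr{A}$, $n \in \mathbb{N}$).
Then a glance at (23.5) reveals that for each $\omega_2 \in \Omega_2$, $f_{\omega_2} = \sum_{j=1}^{n} \alpha_j 1_{(Q_j)_{\omega_2}}$ and so
$\int f_{\omega_2}\,d\mu_1 = \sum_{j=1}^{n} \alpha_j \mu_1((Q_j)_{\omega_2})$,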
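Integrating with respect to $\mu_2$ and applying (23.3) to each $Q_j$ gives
$\int \Big( \int f_{\omega_2}\,d\mu_1 \Big) \mu_2(d\omega_2) = \sum_{j=1}^{n} \alpha_j \pi(Q_j) = \int f\,d\pi$,
which confirms the first equation in (23.6), for elementary $f$.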
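For an arbitrary $\mathscr{A}$-measurable numerical function $f \ge 0$ let $(u^{(n)})$ be a sequence
of $\mathscr{A}$-elementary functions such that $u^{(n)} \uparrow f$. Then, as was noted in the
first part of the proof, $(u^{(n)}_{\omega_2})$ is a sequence of $\mathscr{A}_1$-elementary functions, which
obviously satisfy $u^{(n)}_{\omega_2} \uparrow f_{\omega_2}$ (for each $\omega_2 \in \Omega_2$). Consequently, the functions
$\varphi^{(n)}(\omega_2) := \int u^{(n)}_{\omega_2}\,d\mu_1$, $\omega_2 \in \Omega_2$,
which are $\mathscr{A}_2$-measurable by what has already been proven, increase to the function
$\omega_2 \mapsto \int f_{\omega_2}\,d\mu_1$,
by 11.3. This function is therefore also $\mathscr{A}_2$-measurable and the monotone convergence
theorem 11.4 says that
$\int f\,d\pi = \sup_{n \in \mathbb{N}} \int u^{(n)}\,d\pi$.
Applying 11.4 likewise to the sequence $(\varphi^{(n)})$ and using the elementary case, namely
$\int \varphi^{(n)}\,d\mu_2 = \int u^{(n)}\,d\pi$, we obtain
$\int \Big( \int f_{\omega_2}\,d\mu_1 \Big) \mu_2(d\omega_2) = \int f\,d\pi$,
and wholly analogous arguments establish the claims about the functions $f_{\omega_1}$. □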
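23.7 Corollary (Theorem of Fubini). For $j = 1, 2$ let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be σ-finite
measure spaces, $f$ a $\mu_1 \otimes \mu_2$-integrable numerical function on $\Omega_1 \times \Omega_2$. Then for
$\mu_1$-almost every $\omega_1$ the function $f_{\omega_1}$ is $\mu_2$-integrable and for $\mu_2$-almost every $\omega_2$
the function $f_{\omega_2}$ is $\mu_1$-integrable. The functions
$\omega_1 \mapsto \int f_{\omega_1}\,d\mu_2$ and $\omega_2 \mapsto \int f_{\omega_2}\,d\mu_1$
thus defined $\mu_1$-almost everywhere on $\Omega_1$ and $\mu_2$-almost everywhere on $\Omega_2$, respectively,
are $\mu_1$-integrable and $\mu_2$-integrable, respectively, and equations (23.6) are
valid.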
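The theorems of Tonelli and Fubini insure, in particular, that under the stated
hypotheses the order of repeated integrations is immaterial. We can emphasize this
by writing the equation (23.6) in the form
$\iint f(\omega_1, \omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2) = \iint f(\omega_1, \omega_2)\,\mu_2(d\omega_2)\,\mu_1(d\omega_1)$.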
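Example. 1. Consider L-B measure $\lambda^2 = \lambda^1 \otimes \lambda^1$ on $\mathbb{R}^2$, the set $A := \mathbb{Q} \times \mathbb{R} \in \mathscr{B}^2$,
and its indicator function $f := 1_A$. According to 23.3 or 23.6 we have $\lambda^2(A) =
\int f\,d\lambda^2 = 0$, so $f$ is $\lambda^2$-integrable. Nevertheless, for every $\omega_1 \in \mathbb{Q}$, the section
$f_{\omega_1} = 1_{\mathbb{R}}$ is not $\lambda^1$-integrable.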
Remark. 1. For certain measures $\mu_1$, $\mu_2$ which are not σ-finite the existence but
usually not the uniqueness of a product measure can be proved by other methods.
See, e.g., BERBERIAN [1962]. Even if just one of $\mu_1$ or $\mu_2$ fails to be σ-finite, the
second equality in (23.3) can fail. Cf. Exercise 1, p. 145 of HALMOS [1974], as
well as chapter IV, §16 of HAHN and ROSENTHAL [1948]. Moreover, there exist
$f : \Omega_1 \times \Omega_2 \to \mathbb{R}_+$ which are not $\mathscr{A}_1 \otimes \mathscr{A}_2$-measurable yet the "iterated integrals"
on the right side of (23.6) make sense (and are finite). For an abundance of
illuminating but elementary counterexamples related to this famous theorem, see
CHATTERJI [1985-86] and MATTNER [1999].
(23.7) $\int \varphi \circ f\,d\mu = \int_{\mathbb{R}_+^*} \varphi'(t)\,\mu(\{f \ge t\})\,\lambda^1(dt) = \int_0^{+\infty} \varphi'(t)\,\mu(\{f \ge t\})\,dt$.
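Proof. Consider the L-B measure $\lambda^* := \lambda^1_{\mathbb{R}_+^*}$ on the σ-algebra $\mathscr{B}^* := \mathbb{R}_+^* \cap \mathscr{B}^1$. The
function $F : \Omega \times \mathbb{R}_+^* \to \mathbb{R}^2$ defined by
$F(\omega, t) := (f(\omega), t)$
(23.8) $\iint \varphi'(t) 1_E(\omega, t)\,\lambda^*(dt)\,\mu(d\omega) = \iint \varphi'(t) 1_E(\omega, t)\,\mu(d\omega)\,\lambda^*(dt)$
$\int_{]0,a]} \varphi'(t)\,\lambda^*(dt) = \lim_{n \to \infty} \int_{1/n}^{a} \varphi'(t)\,dt = \varphi(a) - \lim_{n \to \infty} \varphi(1/n) = \varphi(a)$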
142 !IL Product Measures
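($\varphi(0) = 0$ and $\varphi$ is continuous on $\mathbb{R}_+$), we see that $\varphi'$ is also integrable over $]0, a]$
for every $a > 0$. It follows from $f \ge 0$ and the preceding calculation that
$\int \varphi \circ f\,d\mu = \int \Big( \int_{]0, f(\omega)]} \varphi'(t)\,\lambda^*(dt) \Big) \mu(d\omega)$
$= \iint \varphi'(t) 1_{]0, f(\omega)]}(t)\,\lambda^*(dt)\,\mu(d\omega)$
$= \iint \varphi'(t) 1_E(\omega, t)\,\lambda^*(dt)\,\mu(d\omega)$,
which combined with (23.8) concludes the proof. □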
(23.9) $\int f^p\,d\mu = p \int_0^{+\infty} t^{p-1}\,\mu(\{f \ge t\})\,dt$.
The reader should not overlook the geometric significance of this, which is that
the integral f f dµ is formed "vertically", while the integral on the right-hand side
of (23.10) is formed "horizontally".
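The "vertical" and "horizontal" ways of forming the integral can be compared exactly on a finite example. The sketch below is illustrative only: the point masses `mu`, the values `f`, the exponent `p` and the function names are hypothetical. It relies on the fact that, for a function with finitely many values, $t \mapsto \mu(\{f \ge t\})$ is constant on each interval $]a, b]$ between consecutive values, and that $p \int_a^b t^{p-1}\,dt = b^p - a^p$.

```python
from fractions import Fraction

# Hypothetical discrete measure space: point masses mu[w] and values f[w] >= 0.
mu = {"u": Fraction(1, 2), "v": Fraction(2), "w": Fraction(1, 3)}
f = {"u": Fraction(3), "v": Fraction(1), "w": Fraction(0)}

def vertical(p):
    """Integral of f^p with respect to mu, summed point by point."""
    return sum((f[w] ** p * mu[w] for w in mu), Fraction(0))

def horizontal(p):
    """p * integral over ]0, +oo[ of t^(p-1) * mu({f >= t}) dt, evaluated exactly."""
    levels = sorted(set(f.values()) | {Fraction(0)})
    total = Fraction(0)
    for a, b in zip(levels, levels[1:]):
        # On ]a, b] the set {f >= t} equals {f >= b}, since f takes no value in ]a, b[.
        mass = sum((mu[w] for w in mu if f[w] >= b), Fraction(0))
        total += (b ** p - a ** p) * mass
    return total

for p in (1, 2, 3):
    print(p, vertical(p), horizontal(p))
```

For every exponent tried the two computations agree, which is the content of the formula in this special case.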
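Now at last we turn back to the general case of §22 and consider finitely many
σ-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $j = 1, \ldots, n$ and $n \ge 2$.
The two product sets $(\Omega_1 \times \ldots \times \Omega_{n-1}) \times \Omega_n$ and $\Omega_1 \times \ldots \times \Omega_{n-1} \times \Omega_n$ will
be identified via the bijection
$((\omega_1, \ldots, \omega_{n-1}), \omega_n) \mapsto (\omega_1, \ldots, \omega_{n-1}, \omega_n)$.
The agreed-upon equality of these sets leads at once to the equality of the
corresponding products of σ-algebras:
(23.11) $(\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n = \mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_{n-1} \otimes \mathscr{A}_n$.
In fact, by 22.1 the sets $A_1 \times \ldots \times A_{n-1}$ with each $A_j \in \mathscr{A}_j$ generate $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_{n-1}$,
and by the same theorem the sets
$(A_1 \times \ldots \times A_{n-1}) \times A_n = A_1 \times \ldots \times A_n$
then generate $(\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n$ as well as $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_{n-1} \otimes \mathscr{A}_n$.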
23.9 Theorem. σ-finite measures $\mu_1, \ldots, \mu_n$ on σ-algebras $\mathscr{A}_1, \ldots, \mathscr{A}_n$ uniquely
determine a measure $\pi$ on $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ such that
(23.13) $\pi(A_1 \times \ldots \times A_n) = \mu_1(A_1) \cdot \ldots \cdot \mu_n(A_n)$ for all $A_j \in \mathscr{A}_j$, $1 \le j \le n$.
This measure $\pi$ is σ-finite.
Proof. In 22.2 take for the various generators $\mathscr{E}_j$ the σ-algebra $\mathscr{A}_j$ itself, and learn
that there is at most one measure $\pi$ which satisfies (23.13). The existence question
has already been settled for $n = 2$, in 23.3. We make the inductive assumption
that $\pi' := \mu_1 \otimes \ldots \otimes \mu_{n-1}$ exists for some $n > 2$ and show how that leads to the
existence of $\mu_1 \otimes \ldots \otimes \mu_n$. Evidently the σ-finiteness of $\mu_1, \ldots, \mu_{n-1}$ entails that
of $\pi'$, as in the proof of Theorem 23.3. That theorem therefore supplies us with
a measure $\pi := \pi' \otimes \mu_n$ on $(\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n$ which satisfies
$\pi(Q' \times A_n) = \pi'(Q')\mu_n(A_n)$
for all $Q' \in \mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_{n-1}$ and all $A_n \in \mathscr{A}_n$. Because of (23.11) this measure
does what is wanted at level $n$, completing the induction. Again, σ-finiteness of $\pi$
is confirmed exactly as in the proof of 23.3. □
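This inductive construction of the n-fold product measure builds in the equality
(23.14) $(\mu_1 \otimes \ldots \otimes \mu_{n-1}) \otimes \mu_n = \mu_1 \otimes \ldots \otimes \mu_{n-1} \otimes \mu_n$.
By now familiar considerations show that in fact a general associativity prevails
in the formation of product measures:
(23.15) $\Big( \bigotimes_{j=1}^{m} \mu_j \Big) \otimes \Big( \bigotimes_{j=m+1}^{n} \mu_j \Big) = \bigotimes_{j=1}^{n} \mu_j$ ($1 \le m < n \in \mathbb{N}$).
In particular
$\lambda^d = \lambda^1 \otimes \ldots \otimes \lambda^1$, with $d$ factors.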
In view of (23.15) induction can also be used to extend the theorems of Tonelli
and Fubini to multiple factors. We will formulate only the analog of 23.6:
Let $f \ge 0$ be an $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$-measurable numerical function on $\Omega_1 \times \ldots \times \Omega_n$.
Then for every permutation $j_1, \ldots, j_n$ of $1, \ldots, n$
(23.16) $\int f\,d(\mu_1 \otimes \ldots \otimes \mu_n)$
$= \int \Big( \ldots \Big( \int \Big( \int f(\omega_1, \ldots, \omega_n)\,\mu_{j_1}(d\omega_{j_1}) \Big) \mu_{j_2}(d\omega_{j_2}) \Big) \ldots \Big) \mu_{j_n}(d\omega_{j_n})$.
Every integral that occurs on the right-hand side is measurable with respect to
the product of the appropriate $\mathscr{A}_j$, namely those corresponding to the coordinates
in which integration has not yet occurred. This right-hand side is often written in
the shorter fashion
$\int \ldots \int f(\omega_1, \ldots, \omega_n)\,\mu_{j_1}(d\omega_{j_1}) \ldots \mu_{j_n}(d\omega_{j_n})$.
The simple proof of this theorem (involving induction), as well as the formulation
and proof of the analog of 23.7, will be left to the reader.
One more piece of notation is convenient:
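23.10 Definition. For finitely many σ-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $1 \le j \le n$,
the triple $\big( \mathop{\times}_{j=1}^{n} \Omega_j, \bigotimes_{j=1}^{n} \mathscr{A}_j, \bigotimes_{j=1}^{n} \mu_j \big)$ is called the product of these measure spaces
and is denoted by
$\bigotimes_{j=1}^{n} (\Omega_j, \mathscr{A}_j, \mu_j)$.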
Remark. 2. Throughout the preceding the index set was finite. But there is
also a theory of products of (finite) measures indexed by arbitrary sets, which
is particularly important in probability theory; it is treated in detail by BAUER
[1996], and somewhat more extensively in HEWITT and STROMBERG [1965]. For
p-measures SAEKI [1996] gives a short, elementary proof that uses only 5.1.
In closing we will consider the case where each measure $\mu_j$ comes with a real
density $f_j \ge 0$. According to Theorem 17.11, $\nu_j := f_j \mu_j$ is then a σ-finite measure
too.
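Proof. As already noted, 17.11 insures that each measure $\nu_j$ is σ-finite, guaranteeing
that their product is defined. It suffices to treat the case $n = 2$ and refer the
general case to induction. For sets $A_1 \in \mathscr{A}_1$ and $A_2 \in \mathscr{A}_2$
$\nu_1(A_1)\nu_2(A_2) = \Big( \int_{A_1} f_1\,d\mu_1 \Big) \Big( \int_{A_2} f_2\,d\mu_2 \Big)$
$= \iint 1_{A_1}(\omega_1) f_1(\omega_1) 1_{A_2}(\omega_2) f_2(\omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2)$
$= \iint 1_{A_1 \times A_2}(\omega_1, \omega_2) F(\omega_1, \omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2)$.
From 23.6 therefore
But then according to 23.3, $\nu_1 \otimes \nu_2$ coincides with the measure $F(\mu_1 \otimes \mu_2)$. □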
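The computation in this proof can be replayed on finite sets, where a measure is determined by its point masses. The sketch below is an illustration under assumptions: the data `mu1`, `mu2`, `f1`, `f2` and the helper names are hypothetical, and `F` is taken to be the product $F(\omega_1, \omega_2) = f_1(\omega_1) f_2(\omega_2)$, which is what the displayed chain of equalities suggests.

```python
from fractions import Fraction
from itertools import product

# Hypothetical point-mass measures and non-negative densities on two finite sets.
mu1 = {"a": Fraction(1, 2), "b": Fraction(3)}
mu2 = {0: Fraction(2), 1: Fraction(1, 4)}
f1 = {"a": Fraction(4), "b": Fraction(1, 3)}
f2 = {0: Fraction(0), 1: Fraction(5)}

def with_density(f, mu):
    """The measure f * mu, given by the point masses f(w) * mu({w})."""
    return {w: f[w] * mu[w] for w in mu}

def product_measure(m1, m2):
    """Product of two point-mass measures, determined by its values on points."""
    return {(w1, w2): m1[w1] * m2[w2] for w1, w2 in product(m1, m2)}

nu1, nu2 = with_density(f1, mu1), with_density(f2, mu2)

# Assumed form of the density on the product space.
F = {(w1, w2): f1[w1] * f2[w2] for w1, w2 in product(mu1, mu2)}

lhs = product_measure(nu1, nu2)                   # nu1 (x) nu2
rhs = with_density(F, product_measure(mu1, mu2))  # F * (mu1 (x) mu2)
print(lhs == rhs)
```

In this finite setting the two measures have identical point masses, so they agree on every subset of the product set.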
Exercises.
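1. Consider $\Omega_1 = \Omega_2 := \mathbb{R}$, $\mathscr{A}_1 = \mathscr{A}_2 := \mathscr{B}^1$, $\mu_1 := \lambda^1$ and $\mu_2$ the non-σ-finite
counting measure on $\mathscr{B}^1$ (cf. Example 3, §5). Show that equality (23.3) fails to
hold for $Q := D$, the diagonal $\{(\omega, \omega) : \omega \in \mathbb{R}\}$ in $\Omega_1 \times \Omega_2$. Why does $D$ lie in
$\mathscr{A}_1 \otimes \mathscr{A}_2 = \mathscr{B}^2$?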
2. Show that the function
$(x, y) \mapsto 2e^{-2xy} - e^{-xy}$
is not $\lambda^2$-integrable over the set $[1, +\infty[ \times [0, 1]$.
3. With the aid of Tonelli's theorem find a new proof of Theorem 8.1 along the
following lines: If $\mu$ is a translation-invariant measure on $\mathscr{B}^d$, $\mu([0, 1[) = 1$, and $f \ge 0$,
$g \ge 0$ are Borel measurable numerical functions on $\mathbb{R}^d$, compare the integrals
and thereby evaluate anew the important integral $G = 2I$ in (16.1), in the following
simple way: $\int_0^{+\infty} e^{-t^2}\,dt = \int_0^{+\infty} y e^{-y^2 x^2}\,dx$ for every $y > 0$ and therefore
[Hint: Use (7.10) and note that every $x_d$-section of $K_r(0)$ is either empty or is
a $(d-1)$-dimensional closed ball. Tonelli's theorem then leads to a recursion formula
for the $\alpha_d$. Here, of course, $\pi$ has its customary geometric meaning.]
How do these relations change if we replace $K_r(x_0)$ by the corresponding open ball
in $\mathbb{R}^d$ of radius $r$ and center $x_0$? [Cf. Exercise 3 in §7.]
7. For every compact interval $[\alpha, \beta] \subset \mathbb{R}_+$ designate by $R(\alpha, \beta)$ the spherical shell
$K_\beta(0) \setminus K_\alpha(0) = \{x \in \mathbb{R}^d : \alpha < |x| \le \beta\}$.
Show that for every continuous real function $h$ on such an interval $[\alpha, \beta] \subset \mathbb{R}_+$
$\int_{R(\alpha, \beta)} h(|x|)\,\lambda^d(dx) = d\,\alpha_d \int_\alpha^\beta h(t) t^{d-1}\,dt$,
$\alpha_d$ being the number $\lambda^d(K_1(0))$ from the preceding exercise. [Hint: The function $H$
defined on $[\alpha, \beta]$ by
$H(t) := \int h(|x|)\,\lambda^d(dx)$, $t \in$
Consider the d-dimensional Borel measurable space $(\mathbb{R}^d, \mathscr{B}^d)$. Every finite measure
$\mu$ on $\mathscr{B}^d$ will be called a finite or also a bounded Borel measure, and the set
of all of them will be designated by $\mathscr{M}_+^b(\mathbb{R}^d)$. For every such $\mu$ the number
(24.1) $\|\mu\| := \mu(\mathbb{R}^d)$
is called the total mass of $\mu$.
Making critical use of the group structure of $(\mathbb{R}^d, +)$ a so-called convolution
product can be assigned to any finitely many measures $\mu_1, \ldots, \mu_n \in \mathscr{M}_+^b(\mathbb{R}^d)$;
in contrast to the previously studied product measure, it is again a measure on
the original σ-algebra $\mathscr{B}^d$, even an element of $\mathscr{M}_+^b(\mathbb{R}^d)$. What we do below can
be carried out in every (abelian) locally compact group. We cannot, however, go
into this generalization, but must instead refer interested readers to the excellent
monographs of HEWITT and ROSS [1979] and RUDIN [1962]. Initially we consider
the product measure $\mu_1 \otimes \ldots \otimes \mu_n$ defined in §23. Since $\mathscr{B}^{nd} = \mathscr{B}^d \otimes \ldots \otimes \mathscr{B}^d$,
this measure is an element of $\mathscr{M}_+^b(\mathbb{R}^{nd})$. The mapping $A_n : \mathbb{R}^{nd} \to \mathbb{R}^d$ defined by
$A_n(x_1, \ldots, x_n) := x_1 + \ldots + x_n$
is continuous, and so $\mathscr{B}^{nd}$-$\mathscr{B}^d$-measurable. The following definition accordingly
makes sense:
24.1 Definition. The image under the mapping $A_n$ of the product measure
$\mu_1 \otimes \ldots \otimes \mu_n$ is called the convolution product of the measures $\mu_1, \ldots, \mu_n \in \mathscr{M}_+^b(\mathbb{R}^d)$,
in symbols
(24.2) $\mu_1 * \ldots * \mu_n := A_n(\mu_1 \otimes \ldots \otimes \mu_n)$.
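The theorems on product and image measures combine to yield the most important
properties of the convolution operation $*$. First of all, $\mu_1 * \ldots * \mu_n$ is again
an element of $\mathscr{M}_+^b(\mathbb{R}^d)$ and
$\mu_1 * \ldots * \mu_n(\mathbb{R}^d) = \mu_1 \otimes \ldots \otimes \mu_n(\mathbb{R}^{nd}) = \mu_1(\mathbb{R}^d) \cdot \ldots \cdot \mu_n(\mathbb{R}^d)$,
so that in fact
(24.3) $\|\mu_1 * \ldots * \mu_n\| = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|$.
In studying the convolution product it suffices to deal with $n = 2$, because
(24.4) $\mu_1 * \ldots * \mu_n * \mu_{n+1} = (\mu_1 * \ldots * \mu_n) * \mu_{n+1}$
for every $n + 1$ measures from $\mathscr{M}_+^b(\mathbb{R}^d)$. To see this, introduce the continuous
mapping $B_{n+1} : \mathbb{R}^{(n+1)d} \to \mathbb{R}^{2d}$ by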
$\int f\,d(\mu * \nu) = \int f \circ A_2\,d(\mu \otimes \nu)$
(24.5) $= \iint f(x + y)\,\mu(dx)\,\nu(dy)$
$= \iint f(x + y)\,\nu(dy)\,\mu(dx)$.
As this holds for $f := 1_B$, the indicator function of any set $B \in \mathscr{B}^d$, we have
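Now $T_0$ is the identity mapping, so $\varepsilon_0$ is a - and obviously the only - unit with
respect to convolution. If, namely, $\varepsilon$ were also a unit, meaning that $\mu = \varepsilon * \mu$ for
every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$, then it would follow that $\varepsilon_0 = \varepsilon * \varepsilon_0 = \varepsilon$.
For the special choice $\mu := \varepsilon_b$, (24.10) says that
(24.10') $\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}$ for all $a, b \in \mathbb{R}^d$.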
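For measures carried by finitely many points, the convolution is simply the image of the product measure under addition, so these rules can be verified by direct computation. What follows is a sketch under assumptions, not the text's construction in full generality: the measures live on the integers (a special case of $d = 1$), they are stored as dictionaries of point masses, and the names `convolve`, `total_mass` and `dirac` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def convolve(mu, nu):
    """Image of the product measure mu (x) nu under the mapping (x, y) -> x + y."""
    out = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, m in mu.items():
        for y, n in nu.items():
            out[x + y] += m * n
    return dict(out)

def total_mass(mu):
    """Total mass of a point-mass measure."""
    return sum(mu.values(), Fraction(0))

def dirac(a):
    """Dirac measure (unit mass) at the point a."""
    return {a: Fraction(1)}

# Hypothetical finite measures on the integers.
mu = {0: Fraction(1, 2), 1: Fraction(3, 2)}
nu = {-1: Fraction(2), 1: Fraction(1, 3), 4: Fraction(1)}

print(total_mass(convolve(mu, nu)), total_mass(mu) * total_mass(nu))
print(convolve(dirac(2), dirac(5)))
print(convolve(dirac(0), mu) == mu)
```

The first line illustrates the multiplicativity of the total mass, the second the rule for Dirac measures, and the third that the Dirac measure at the origin acts as a unit.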
$= \iint 1_B(x + y) f(x)\,T_{-y}(\lambda^d)(dx)\,\nu(dy)$
$= \iint 1_B(x) f(x - y)\,\lambda^d(dx)\,\nu(dy)$
for every $B \in \mathscr{B}^d$. With the help of Tonelli's theorem it further follows that
is a density for $\mu * \nu$ with respect to $\lambda^d$. We denote this function by $f * g$, that is,
we set
and get
(24.14) $(f\lambda^d) * (g\lambda^d) = (f * g)\lambda^d$.
Here too $f * g$ is called the convolution of $f$ and $g$. It is defined for every pair of
non-negative $\lambda^d$-integrable functions and is itself such a function. Nevertheless, it might
not be real-valued, even if $f$ and $g$ each are (cf. Remark 1 below). From (24.13)
and the translation- and reflection-invariance of $\lambda^d$ it follows that for every $x \in \mathbb{R}^d$
4. For arbitrary functions $f, g \in \mathscr{L}^1(\lambda^d)$ decomposition into their positive and
negative parts and appeal to the results secured in 3. show that
$x \mapsto \int f(x - y) g(y)\,\lambda^d(dy)$,
while possibly defined only $\lambda^d$-almost everywhere (see Remark 1 below), is always
$\lambda^d$-integrable. One can therefore define $f * g$ by
but generally only for $\lambda^d$-almost all $x \in \mathbb{R}^d$. Once again the expression convolution
is used for this $f * g$.
§24. Convolution of finite Borel measures 151
Exercises.
1. Show that for any $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any linear mapping $T : \mathbb{R}^d \to \mathbb{R}^d$,
$T(\mu * \nu) = T(\mu) * T(\nu)$. To this end, first observe that $T \circ A_2 = A_2 \circ (T \otimes T)$,
where $T \otimes T$ denotes the mapping $(x, y) \mapsto (T(x), T(y))$ of $\mathbb{R}^d \times \mathbb{R}^d$ into itself.
2. Compute the nth convolution power of the function f defined on R by f (x)
ethat is, the convolution f * ... * f with n(E N) factors. Is it true that for
every $n \in \mathbb{N}$, $f$ has an "nth convolution root"? That is, is $f$ the nth convolution
power of some $\lambda^1$-integrable function $g \ge 0$?
3. If we set $N_1(f) := \int |f|\,d\lambda^d$ (this is (14.1) for $\mu := \lambda^d$), then
$N_1(f * g) \le N_1(f) N_1(g)$
holds for all $f, g \in \mathscr{L}^1(\lambda^d)$, and for non-negative functions equality prevails.
4. Write out the details of Remark 2 and show that
$\|f * g\|_1 \le \|f\|_1 \|g\|_1$
holds for all elements $f$ and $g$ of the Banach space $L^1(\lambda^d)$. The latter is therefore
a Banach algebra.
Chapter IV
Measures on Topological Spaces
Initially $E$ will be an arbitrary topological space. The system of its open subsets
which defines the topology will be denoted $\mathscr{O}$. In the case of $\mathbb{R}^d$ we had determined
(cf. 6.4) that the σ-algebra of Borel sets is generated by the open sets.
Consonant with this we now make the general
The closed sets being the complements of the open ones, $\mathscr{B}(E)$ is also generated
by the system of all closed subsets of $E$. In this respect the analogy with 6.4 extends
a bit farther. The intersection of a sequence of open sets is called a $G_\delta$-set, and
the dual, the union of a sequence of closed sets is called an $F_\sigma$-set. All such sets
are clearly Borel.
§25. Borel sets, Borel and Radon measures 153
In the sequel we will be studying measures on $\mathscr{B}(E)$ for two important classes
of spaces $E$. In preparation for which we make
The converse of (25.7) is, however, not generally valid. Exercise 2 below furnishes
an example.
Because of the implication (25.7), instead of locally finite measures defined
on $\mathscr{B}(E)$, we will henceforth say simply locally finite Borel measures.
For the moment we will be content to illustrate the regularity concept with
some examples.
is a locally finite Borel measure which is obviously outer regular. It is, however,
inner regular if and only if the set E is countable.
7. On $\mathscr{B}^1 = \mathscr{B}(\mathbb{R})$ consider the counting measure. It is not a Borel measure, is
however inner regular, but not outer regular. In fact, equality (25.5) fails even for
one-point sets $B$.
8. L-B measure $\lambda^d$ on $\mathscr{B}^d = \mathscr{B}(\mathbb{R}^d)$ is a (locally finite) Borel measure. In §26 we
will see that it - and indeed every Borel measure on $\mathbb{R}^d$ - is regular.
More precisely the term used is "positive" Radon measure, but in this book
we dispense with that adjective because non-negativity is built into our definition
of measure, that is, we consider only measures with values in [0, +oo]. Example 5
says that the Dirac measure at any point a E E is always a Radon measure on E.
We have already noted that Borel measures are not automatically locally finite.
Nevertheless for many spaces Radon measures can be defined simply as the inner
regular Borel measures. That is the import of
Proof. We argue by contradiction: Suppose that $\mu$ is not locally finite, which means
there is a point $x \in E$ such that $\mu(V) = +\infty$ for every open neighborhood $V$ of $x$.
By hypothesis $x$ has a neighborhood basis consisting of a sequence $(V_n)$ of open
sets, and by replacing each $V_n$ with $V_1 \cap \ldots \cap V_n$, we may suppose that $V_n \downarrow \{x\}$.
Since $\mu(V_n) = +\infty$ and $\mu$ is inner regular, there exists a compact subset $K_n \subset V_n$
such that $\mu(K_n) > n$, and this is true of each $n \in \mathbb{N}$. Now the set
$K := \{x\} \cup \bigcup_{n \in \mathbb{N}} K_n$
Exercises.
1. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mathscr{E}$ a generator of $\mathscr{A}$ and $\Omega'$ a subset of $\Omega$.
Consider the traces $\mathscr{A}'$ and $\mathscr{E}'$ of $\mathscr{A}$ and $\mathscr{E}$, resp., on $\Omega'$ and show that $\mathscr{E}'$ is
a generator of the σ-algebra $\mathscr{A}'$ in $\Omega'$. Example 3 above is a special case.
2. Equip the set $\mathbb{R}$ with the so-called right-sided topology (which is also sometimes
named after SORGENFREY [1947]) whose system $\mathscr{O}_r$ of open sets is defined as
follows: A subset $U \subset \mathbb{R}$ lies in $\mathscr{O}_r$ if and only if for each $x \in U$ there is an $\varepsilon > 0$
such that $[x, x + \varepsilon[ \subset U$. The topological space thus created will be denoted $\mathbb{R}_r$.
Establish, one after another, the following claims:
(a) Every right half-open interval $[a, b[$ is both open and closed in $\mathbb{R}_r$. The right-sided
topology on $\mathbb{R}$ is strictly finer than the usual topology. In particular,
$\mathbb{R}_r$ is a Hausdorff space.
(b) $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$.
(c) Suppose $(x_n)$ is a strictly isotone sequence of real numbers possessing the
supremum $b \in \mathbb{R}$. Then the set $\{x_n : n \in \mathbb{N}\} \cup \{b\}$ is closed but not compact
in $\mathbb{R}_r$. By contrast, if $(y_n)$ is a strictly antitone sequence of real numbers
possessing the infimum $a \in \mathbb{R}$, then $\{a\} \cup \{y_n : n \in \mathbb{N}\}$ is compact in $\mathbb{R}_r$.
(d) Let $K$ be compact in $\mathbb{R}_r$. Then there exists (from the first part of (c)) for every
$x \in K$ a $y \in \mathbb{Q}$ with $y < x$ and $[y, x[ \cap K = \emptyset$. If for each $x \in K$, $\rho(x)$ designates
such a rational number $y$, then a mapping $\rho : K \to \mathbb{Q}$ materializes which is
strictly isotone, and hence injective.
(e) Every compact subset of $\mathbb{R}_r$ is countable. (But (c) shows that the converse is
not true.)
(f) Consider on $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$ the measure $\mu$ which assigns to every countable set
the value 0 and to every uncountable set the value $+\infty$ (cf. Example 6). Then
$\mu$ is a Borel measure on $\mathbb{R}_r$ for which no point of $\mathbb{R}_r$ has a neighborhood of
finite measure. In particular, the measure $\mu$ is not locally finite and is neither
inner regular nor outer regular.
(g) Consider the measure $\nu := f\lambda^1$ with density
$f(x) := x^{-1} 1_{]0, +\infty[}(x)$ ($x \in \mathbb{R}$)
and show that it too is a non-locally-finite Borel measure on $\mathbb{R}_r$.
(h) Investigate the L-B measure $\lambda^1$, thought of as a Borel measure on $\mathbb{R}_r$, in
respect to its inner and outer regularity.
§26. Radon measures on Polish spaces 157
For two extensive classes of Hausdorff spaces Borel measures come up very natu-
rally. The first of these classes will be discussed in this section, beginning of course
with its
26.1 Definition. A topological space E is called Polish when its topology has
a countable base and can be defined by a complete metric.
The terminology is due to N. BOURBAKI and commemorates the achievements
of Polish topologists in the development of general topology.
A metric is called complete when the associated metric space is complete: every
Cauchy sequence in it converges. A countable base or basis for the topology is
a countable system of open sets such that every open set is the union of those from
the system which are subsets of it. For a metrizable space $E$ the existence of such
a basis is equivalent to the existence of a countable dense subset.
Examples. 1. The euclidean spaces $\mathbb{R}^d$ of every dimension $d \ge 1$ are Polish, the
ordinary euclidean metric being complete.
2. The product $E' \times E''$ of two Polish spaces is another, when given the product
topology. For if $d'$, $d''$ are complete metrics generating the topologies of $E'$ and $E''$,
resp., then the product topology of $E' \times E''$ is generated by the metric
$d(x, y) := d'(x', y') + d''(x'', y'')$, $x := (x', x'')$, $y := (y', y'')$,
which moreover is complete. If $\mathscr{G}'$, $\mathscr{G}''$ are countable bases for $E'$, $E''$, resp., then
$\{G' \times G'' : G' \in \mathscr{G}', G'' \in \mathscr{G}''\}$ is a countable basis for $E' \times E''$.
3. Every closed subspace F of a Polish space E is Polish. Just restrict to F any
complete metric that generates the topology of E.
4. Every open subspace G of a Polish space E is Polish.
example, the set $J$ of all irrational numbers with its topology as a subspace of $\mathbb{R}$
is Polish, since
$J = \bigcap_{x \in \mathbb{Q}} (\mathbb{R} \setminus \{x\})$.
6. Every compact space E with a countable basis is Polish. For a famous theorem
of P.S. URYSOHN (1889-1924) (cf. KELLEY [1955], p. 125 or WILLARD [1970],
Theorem 23.1) guarantees that E is metrizable, and in Remark 3 of §31 we shall
even give a proof of this. The compactness of E easily entails that every metric
defining its topology is complete.
The key to the further discussion is the following lemma, which is here just
a preliminary to the big theorem that follows it, but nevertheless is significant in
its own right. In it we encounter our first extensive class of Radon measures.
and each set in this union has diameter no greater than 2/n. This shows that K is
pre-compact (=totally bounded) and in a complete metric space that is equivalent
to compactness, by very easy arguments (cf. WILLARD [1970], Theorem 39.9 or
KELLEY [1955], p. 198).
2. Every closed set $C$ lies in $\mathscr{D}$: Let $\varepsilon > 0$ be given. We already know that there is
a compact set $K$ with
$\mu(E) - \mu(K) < \varepsilon$.
According to 3.5 however
We come now to the principal result of this section. It generalizes the foregoing
lemma.
The question now suggests itself whether - in analogy with 26.2 - the outer
regularity of $\mu$ can be proved. This is in fact the case.
Proof. We have to show that every $B \in \mathscr{B}(E)$ satisfies (25.5). So let $B \in \mathscr{B}(E)$
and $\varepsilon > 0$ be given. Consider the open sets $G_n$ and the finite measures $\mu_n$ created
in the preceding proof. Lemma 26.2 furnishes open sets $U_n \supset B$ such that
(26.5) $\mu((U_n \setminus B) \cap G_n) = \mu_n(U_n \setminus B) < \varepsilon/2^n$ for each $n \in \mathbb{N}$.
Let $U := \bigcup_{n \in \mathbb{N}} U_n \cap G_n$, an open set. Since
$B = B \cap E = B \cap \bigcup_{n \in \mathbb{N}} G_n = \bigcup_{n \in \mathbb{N}} B \cap G_n$,
The regularity conditions (25.4), (25.5) make sense for outer measures $\mu^*$ and
together with one other minimal demand on $\mu^*$ they assure that all Borel sets are
$\mu^*$-measurable. In fact, these conditions on an outer measure come up naturally in
the course of proving the famous Riesz representation theorem in §29; cf. also 28.3.
26.5 Lemma. Let $E$ be a Hausdorff space and $\mu^*$ an outer measure on $E$ with
the following three properties:
(i) for every set $A \subset E$
$\mu^*(A) = \inf\{\mu^*(U) : A \subset U \text{ open}\}$;
Proof. We consider the σ-algebra $\mathscr{A}^*$ of all $\mu^*$-measurable sets, that is, according
to (5.6) the set of all $A \in \mathscr{P}(E)$ which satisfy
(26.6) $\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$ for all $Q \in \mathscr{P}(E)$.
First note that it suffices that this hold for all open sets $Q$ in order that it hold
for all $Q$ whatsoever. In other words, what we need to check for an $A$ to be in $\mathscr{A}^*$
is that
(26.6') $\mu^*(U) \ge \mu^*(U \cap A) + \mu^*(U \setminus A)$ for all $U \in \mathscr{O}$.
Indeed from (26.6') it follows for any $Q \subset E$ that
$\mu^*(U) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$
whenever $U$ is an open set containing $Q$; then (26.6) itself follows by taking the
infimum over such $U$ and invoking (i). So now let $A = G$ be an open set; we will
use criterion (26.6') to show that $G$ lies in $\mathscr{A}^*$. To this end consider any open
$U \subset E$; further, consider any compact $K_1 \subset U \cap G$ and any compact $K_2 \subset U \setminus K_1$.
Since then $K_1 \cap K_2 = \emptyset$ and $K_1 \cup K_2 \subset U$, it follows from (iii) that
$\mu^*(U) \ge \mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.
The set $U \setminus K_1$ is open, so if we take the supremum over all such $K_2$ in the preceding
inequality and appeal to (ii), we get
$\mu^*(U) \ge \mu^*(K_1) + \mu^*(U \setminus K_1) \ge \mu^*(K_1) + \mu^*(U \setminus G)$,
the last inequality because $U \setminus K_1 \supset U \setminus G$. This holds for all compact $K_1 \subset U \cap G$,
and so after a second appeal to (ii) it yields
$\mu^*(U) \ge \mu^*(U \cap G) + \mu^*(U \setminus G)$,
holding for all $U \in \mathscr{O}$. That is, (26.6') holds for $A = G$, and consequently $G \in \mathscr{A}^*$.
This all proves that $\mathscr{O} \subset \mathscr{A}^*$. But then $\mathscr{B}(E) = \sigma(\mathscr{O}) \subset \mathscr{A}^*$, since the latter
is a σ-algebra, by Theorem 5.3. That theorem further affirms that the restriction
of $\mu^*$ to $\mathscr{A}^*$ is a measure.
The foregoing Theorem 26.3 and its corollary show in particular that the
L-B measure $\lambda^d$ is a regular Borel measure on $\mathbb{R}^d$ in each dimension $d = 1, 2, \ldots$.
In fact every Borel measure on $\mathbb{R}^d$ is regular (cf. also Theorem 29.12). Following
STROMBERG [1972] we derive from the regularity of $\lambda^d$ a purely topological result
of H. STEINHAUS (1887-1972). It shows, incidentally, that every set of positive
L-B measure has the cardinality of $\mathbb{R}$.
26.7 Theorem (of Lusin). Let $\mu$ be a locally finite Borel measure, thus a Radon
measure, on a Polish space $E$, and $E'$ be a topological space with a countable basis.
Then for every mapping $f : E \to E'$ the following are equivalent:
(a) $f$ coincides $\mu$-almost everywhere with a Borel measurable mapping of $E$ into $E'$.
(b) There is a decomposition of $E$ into a $\mu$-nullset $N \in \mathscr{B}(E)$ and a sequence
$(K_n)_{n \in \mathbb{N}}$ of compact sets, such that the restriction of $f$ to each $K_n$ is continuous.
If the measure $\mu$ is finite, (a) and (b) are further equivalent to:
(c) For every $\varepsilon > 0$ there is a compact subset $K_\varepsilon \subset E$ such that $\mu(\complement K_\varepsilon) < \varepsilon$ and
the restriction of $f$ to $K_\varepsilon$ is continuous.
Proof. Let us first suppose that $\mu$ is finite. Let $\mathscr{G}'$ be a countable base for the topology
of $E'$ and $(G_n)_{n \in \mathbb{N}}$ a sequential arrangement of its elements. Notice that $\mathscr{G}'$ is
a generator of the Borel σ-algebra because every open subset of $E'$ is a (countable)
union of sets from $\mathscr{G}'$.
If $K_1, \ldots, K_n$ have been defined having the desired properties, we will get $K_{n+1}$
from (c) and the inner regularity of $\mu$. By (c) there is a compact $K' \subset E$ such that
$\mu(\complement K') < (2n + 2)^{-1}$
and $f \mid K'$ is continuous. With $L := K_1 \cup \ldots \cup K_n$ the inner regularity of $\mu$ supplies
a compact $K_{n+1} \subset K' \setminus L$ such that
$\mu(K' \setminus L) - \mu(K_{n+1}) = \mu(K' \cap \complement L \cap \complement K_{n+1}) < (2n + 2)^{-1}$.
Because
$\mu(\complement(L \cup K_{n+1})) = \mu(\complement K' \cap \complement L \cap \complement K_{n+1}) + \mu(K' \cap \complement L \cap \complement K_{n+1})$
$\le \mu(\complement K') + \mu(K' \cap \complement L \cap \complement K_{n+1}) < (n + 1)^{-1}$,
with this set $K_{n+1}$ the inductive construction is complete.
(b)⇒(a): If $E = N \cup K_1 \cup K_2 \cup \ldots$ is the given decomposition, one defines
a mapping $g : E \to E'$ as follows. In case $N = \emptyset$, let $g := f$. In case $N \ne \emptyset$, choose
$y_0 \in f(N)$ arbitrarily and set
$g(x) := f(x)$ for $x \in E \setminus N$, $g(x) := y_0$ for $x \in N$.
What has to be shown is that $g$ is Borel measurable, which is done as follows: For
every open $G' \subset E'$
$g^{-1}(G') = (g^{-1}(G') \cap N) \cup \bigcup_{n \in \mathbb{N}} (g^{-1}(G') \cap K_n) = N_0 \cup \bigcup_{n \in \mathbb{N}} g_n^{-1}(G')$
Remarks. 1. The equivalence of (a) and (b) in Lusin's theorem may be lost if (a) is
strengthened to the $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurability of $f$. It suffices to take for $E$ the
compact set $[0, 1] \times [0, 1]$ and for $\mu$ the L-B measure $\lambda^2_E$. As was noted in the
second part of Remark 4, §8, $E$ contains a $\mu$-nullset $N$ which contains a non-Borel
subset. If $M$ is such a set, its indicator function $f = 1_M$ is not Borel measurable,
although $f$ is $\mu$-almost everywhere equal to the Borel measurable function $1_N$. On
the other hand, if $f$ is $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurable, there is a Polish topology $\tau$ on $E$,
stronger than the original but generating exactly the same Borel sets, such that $f$
is $\tau$-continuous. See 3.2.6 of SRIVASTAVA [1998] for the proof, which is not difficult.
27.2 Theorem (on partitions of unity). Suppose that the compact subset $K$ of
the locally compact space $E$ is covered by the $n$ open sets $U_1, \ldots, U_n$, $n \in \mathbb{N}$. Then
there are functions $f_1, \ldots, f_n \in C_c(E)$ with the following properties:
(27.3) $f_j \ge 0$ for $j = 1, \ldots, n$;
(27.4) $\mathrm{supp}(f_j) \subset U_j$ for $j = 1, \ldots, n$;
r4
Two consequences of the foregoing will turn out to be especially useful. The
first - known as Urysohn's lemma - often serves as the starting point for inductive
constructions of partitions of unity (see, e.g., RUDIN [1987], p. 39). The second can
also be proven directly, as indicated in Exercise 1 below.
Proof. We have only to apply 27.2 for $n = 1$. Since $K \subset \{f_1 > 0\} \subset \mathrm{supp}(f_1)$, the
fact that $\{f_1 > 0\}$ is open means that $\mathrm{supp}(f_1)$ is indeed a neighborhood of $K$. □
27.4 Corollary 2. In the locally compact space $E$ the compact subset $K$ is covered
by the $n$ open sets $U_1, \ldots, U_n$, $n \in \mathbb{N}$. Then $K$ can be decomposed as $K = K_1 \cup
\ldots \cup K_n$ with $K_j$ a compact subset of $U_j$ for each $j = 1, \ldots, n$.
§27. Properties of locally compact spaces 169
For a locally compact space $E$ there is another function space besides $C_c(E)$
that is of importance. To define it we assign to every bounded real function $f$ on
an arbitrary space $E$ its supremum norm, also called its uniform norm, via
$\|f\| := \sup_{x \in E} |f(x)|$.
The mapping $(f, g) \mapsto \|f - g\|$ makes $C_b(E)$ - more generally even the vector
space of all bounded real functions on $E$ - into a metric space. One speaks of the
metric of uniform convergence (on $E$). A sequence $(f_n)$ of bounded real functions
on $E$ converges uniformly on $E$ to a bounded function $f$ just means that
$\lim_{n \to \infty} \|f_n - f\| = 0$.
27.5 Definition. A continuous real function $f$ on a locally compact space $E$ is
said to vanish at infinity if it lies in the closure $C_0(E)$ of $C_c(E)$ in $C_b(E)$ with
respect to the metric of uniform convergence. Denoting closure in this metric by
bar, we thus have
$C_0(E) := \overline{C_c(E)} \subset C_b(E)$.
27.6 Theorem. For a real function $f$ on a locally compact space $E$ the following
statements are equivalent:
(a) $f \in C_0(E)$;
(b) $f \in C(E)$ and $\{|f| \ge \varepsilon\}$ is compact for each $\varepsilon > 0$;
(c) the function $f'$ defined by
$f'(x) := f(x)$ for all $x \in E$, $f'(x) := 0$ for $x = \omega_0$
is continuous on the one-point compactification $E'$ of $E$.
Proof. (a)⇒(b): Given $\varepsilon > 0$, there is by definition of $f \in C_0(E)$ a $g \in C_c(E)$ with
$\|f - g\| \le \varepsilon/2$. Every $x \in E$ satisfies $|f(x)| - |g(x)| \le |f(x) - g(x)| \le \|f - g\|$, so
we see that
$\{|f| \ge \varepsilon\} \subset \{|g| \ge \varepsilon/2\} \subset \mathrm{supp}(g)$.
This shows that $\{|f| \ge \varepsilon\}$ is a relatively compact set. But, due to the continuity
of $f$, it is also closed. Hence it is compact.
(b)⇒(c): Since the subspace topology of $E$ in $E'$ is its original topology and $E$ is
an open subset of $E'$, continuity of $f'$ at each point of $E$ is assured by $f \in C(E)$. As
to continuity at the ideal point $\omega_0$, given $\varepsilon > 0$, we have $|f'(x) - f'(\omega_0)| = |f'(x)| < \varepsilon$
for all $x$ in the set $E' \setminus \{|f| \ge \varepsilon\}$, which by definition of $E'$ is a neighborhood
of $\omega_0$, since $\{|f| \ge \varepsilon\}$ is a compact subset of $E$.
(c)$\Rightarrow$(a): Continuity of $f'$ at $\omega_0$ and the definition of the topology in $E'$ mean
that for each $\varepsilon > 0$ there is a compact $K \subset E$ such that $|f(x)| = |f'(x) - f'(\omega_0)| < \varepsilon$
for all $x \in E \setminus K$. 27.3 supplies a $g \in C_c(E)$ with $0 \le g \le 1$ and $g(K) = \{1\}$.
Then $fg \in C_c(E)$ and satisfies
$$|f(x)g(x) - f(x)| = |f(x)|\,(1 - g(x)) < \varepsilon$$
for all $x \in E$, so $\|fg - f\| \le \varepsilon$. As $\varepsilon > 0$ is arbitrary, this proves that $f \in C_0(E)$.
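The approximation used in the step from (c) to (a) can be watched numerically. The following is only a sketch under assumptions that are not in the text: E is the real line, f(x) = 1/(1 + x^2) plays the role of a function vanishing at infinity, and the hypothetical helper `cutoff` is a piecewise linear function g with 0 <= g <= 1 that equals 1 on K = [-R, R] and has compact support. The supremum is approximated on a finite grid.

```python
def f(x):
    # continuous on R and tending to 0 at infinity (assumed example)
    return 1.0 / (1.0 + x * x)

def cutoff(x, R):
    # continuous, 0 <= g <= 1, g = 1 on [-R, R], support contained in [-R-1, R+1]
    return max(0.0, min(1.0, R + 1.0 - abs(x)))

def sup_error(R, lo=-50.0, hi=50.0, steps=20001):
    # grid approximation of the uniform distance sup |f*g - f|
    h = (hi - lo) / (steps - 1)
    worst = 0.0
    for i in range(steps):
        x = lo + i * h
        worst = max(worst, abs(f(x) * cutoff(x, R) - f(x)))
    return worst

# uniform distance between f and the compactly supported function f*g, for growing K
errors = {R: sup_error(R) for R in (1, 2, 4, 8)}
```

Since |f| < 1/(1 + R^2) outside [-R, R], each entry of `errors` is bounded by 1/(1 + R^2), so the compactly supported functions fg approach f uniformly as R grows.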
Exercises.
1. Without resort to partitions of unity, prove Corollary 27.4 directly. [Hint for
the case n = 2: Separate the disjoint compacta $K \setminus U_1$, $K \setminus U_2$ with disjoint open
neighborhoods $V_1$, $V_2$ and set $K_1 := K \setminus V_1$, $K_2 := K \setminus V_2$.]
2. Let $E' = E \cup \{\omega_0\}$ be the one-point compactification of a locally compact
space E. Describe the Borel sets in $E'$ by means of the Borel sets in E. In particular,
see how your description fits into the following general picture: For a measurable
space $(E, \mathscr{A})$, a point $\omega_0 \notin E$ and the set $E^{\omega_0} := E \cup \{\omega_0\}$, the $\sigma$-algebra $\mathscr{A}^{\omega_0}$
in $E^{\omega_0}$ generated by $\mathscr{A}$ and $\{\omega_0\}$ consists of all $A' \subset E^{\omega_0}$ such that $A' \cap E \in \mathscr{A}$.
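If therefore L is a compact subset of $U_\alpha$, then $1_L \le \alpha u$ and so from (28.2) $\mu_*(L) \le \alpha I(u)$.
From definition (28.4) therefore
$$0 \le \mu_*(U_\alpha) - \mu_*(K) \le (\alpha - 1)\mu_*(K) + \alpha\varepsilon.$$
As $\alpha \downarrow 1$ this majorant converges to $\varepsilon$, which shows that
$$\inf\{\mu_*(U) : K \subset U \text{ open}\} \le \mu_*(K) + \varepsilon$$
holds for every $\varepsilon > 0$; that is,
$$\mu^*(K) = \inf\{\mu_*(U) : K \subset U \text{ open}\} \le \mu_*(K).$$
This confirms (28.8), the reverse inequality being part of (28.6).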
Of critical importance is the following result:
It suffices to settle the case n = 2, as induction then takes care of the rest. If K is
a compact subset of $U_1 \cup U_2$, then 27.4 provides compact $K_j \subset U_j$, j = 1, 2, such
that $K = K_1 \cup K_2$. Then by the result of our first step
$$\mu^*(K) \le \mu^*(K_1) + \mu^*(K_2) \le \mu^*(U_1) + \mu^*(U_2).$$
The claimed inequality (with n = 2) then follows from (28.8), (28.4) and (28.7).
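Third step: Now we will prove (28.9). In doing so we may obviously assume that
$\mu^*(Q_n) < +\infty$ for every $n \in \mathbb{N}$. Given $\varepsilon > 0$, there then exist open $U_n \supset Q_n$ such
that
$$\mu^*(U_n) \le \mu^*(Q_n) + 2^{-n}\varepsilon \quad \text{for every } n \in \mathbb{N}.$$
The open set $U := \bigcup_{n \in \mathbb{N}} U_n$ contains $Q := \bigcup_{n \in \mathbb{N}} Q_n$. If now K is a compact subset
of U, then $K \subset U_1 \cup \dots \cup U_n$ for sufficiently large n. From this it follows that
$$\mu_*(K) = \mu^*(K) \le \mu^*(U_1 \cup \dots \cup U_n) \le \sum_{j=1}^{n} \mu^*(U_j) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon,$$
where we used the second step. As this last inequality is satisfied by every compact
subset K of U, definition (28.4) and equation (28.7) give
$$\mu_*(U) = \mu^*(U) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon,$$
and since $Q \subset U$ we will then have as well
$$\mu^*(Q) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon.$$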
The next corollary sharpens the inequality proved in the first step above.
Therefore
$$\mu_*(K_1) + \mu_*(K_2) \le I(vu) + I((1 - v)u) = I(u),$$
The proof is immediate from Lemma 26.5 and the facts accumulated to this
point. Notice that (28.7) and (28.5) say that hypothesis (i) of 26.5 is fulfilled, while
(28.7), (28.8) and (28.4) insure that hypothesis (ii) of 26.5 is fulfilled.
The Borel measure $\mu^* | \mathscr{B}(E)$ has a series of further remarkable properties:
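28.4 Theorem. Every Borel subset $A \subset E$ with $\mu^*(A) < +\infty$ satisfies
$$\mu_*(A) = \mu^*(A).$$
Proof. Given $\varepsilon > 0$, there is an open $U \supset A$ such that
$$\mu^*(U) - \mu^*(A) < \varepsilon/2,$$
which, due to $\mu^*(A) < +\infty$ and $\mu^*$ being a measure on $\mathscr{B}(E)$, can be written as
$$\mu^*(U \setminus A) = \mu^*(U) - \mu^*(A) < \varepsilon/2.$$
From (28.4) we get compact $L \subset U$ such that
$$\mu^*(U \setminus L) = \mu^*(U) - \mu^*(L) < \varepsilon/2.$$
The set
$$Q := (U \setminus A) \cup (U \setminus L)$$
then satisfies $\mu^*(Q) < \varepsilon$. Hence there is an open $G \supset Q$ such that
$$\mu^*(G) < \varepsilon.$$
Now $K := L \setminus G$ is a (closed, hence) compact subset of L with the properties
(28.10) $K \subset A$ and $A \setminus K \subset G$.
In fact, on the one hand
$$K = L \setminus G \subset L \setminus Q \subset L \setminus (U \setminus A) = L \cap A,$$
since $L \subset U$, and on the other hand
$$A \setminus K = A \setminus (L \setminus G) = (A \cap G) \cup (A \setminus L) \subset G \cup (U \setminus L) = G,$$
since $U \setminus L \subset Q \subset G$. From (28.10) we get
$$\mu^*(A) - \mu^*(K) = \mu^*(A \setminus K) \le \mu^*(G) < \varepsilon,$$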
§28. Construction of Radon measures on locally compact spaces 175
and so $\mu^*(A) < \mu^*(K) + \varepsilon \le \mu_*(A) + \varepsilon$. As $\varepsilon > 0$ was arbitrary, this says that
$\mu^*(A) \le \mu_*(A)$, which with (28.6) finishes the proof.
28.5 Corollary. The equality $\mu_*(A) = \mu^*(A)$ also holds for every $A \in \mathscr{B}(E)$
which has $\sigma$-finite $\mu^*$-measure.
Proof. The terminology means that there exist $A_n \in \mathscr{B}(E)$ ($n \in \mathbb{N}$), each of finite
$\mu^*$-measure, such that $A_n \uparrow A$. The preceding theorem and the isotoneity yield
$$\mu^*(A_n) = \mu_*(A_n) \le \mu_*(A),$$
from which and the continuity of $\mu^*$ from below on $\mathscr{B}(E)$ follows
$$\mu^*(A) = \sup_n \mu^*(A_n) \le \mu_*(A).$$
Together with (28.6) this proves the claimed equality.
Proof. Since all compact K satisfy $\mu_*(K) = \mu^*(K) < +\infty$, all that has to be
proved is that $\mu_* | \mathscr{B}(E)$ is a measure, i.e., that $\mu_*$ is countably additive on $\mathscr{B}(E)$.
To that end, let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{B}(E)$, whose
union is A. For every compact $K \subset A$, $K = \bigcup_{n \in \mathbb{N}} (K \cap A_n)$, so from 28.3 and 28.4
we get
We now set
(28.11) $\mu_0 := \mu_* | \mathscr{B}(E)$ and $\mu^0 := \mu^* | \mathscr{B}(E)$
and, inspired by COURRÈGE [1962], call these the essential measure determined
by I and the principal measure determined by I, respectively. Each is a Borel
measure (28.3 and 28.6).
Proof. The first equality is just (28.7). Denote the right side of (28.12) by $\gamma$, and
consider any compact $K \subset U$. Corollary 27.3 provides a function $u \in C_c(E)$ with
$0 \le u \le 1$, $u(K) = \{1\}$ and $\operatorname{supp}(u) \subset U$. In particular, $1_K \le u$ and so by (28.2)
$\mu_*(K) \le I(u) \le \gamma$, that is, $\mu_*(K) \le \gamma$ for every such K. It follows that $\mu^0(U) =
\mu^*(U) = \mu_*(U) \le \gamma$, by (28.4). The reverse inequality $\gamma \le \mu^0(U)$ is derived as
follows: Let $u \in C_c(E)$ be a typical function involved in the definition of $\gamma$. Set
$L := \operatorname{supp}(u)$ and consider a typical $v \in C_c(E)$ involved in the definition (28.2)
of $\mu_*(L)$. Evidently then $u \le v$, so $I(u) \le I(v)$; that is, $I(u) \le \mu_*(L) = \mu_0(L) =
\mu^0(L) \le \mu^0(U)$. Taking the supremum over eligible u gives finally the desired
complementary inequality $\gamma \le \mu^0(U)$.
Exercises.
1. For a locally compact space E and a measure $\mu$ defined on $\mathscr{B}(E)$, show that it
is a Borel measure if and only if $C_c(E) \subset \mathscr{L}^1(\mu)$.
2. Let $\mu$ be a Radon measure on a locally compact space E and $(G_i)_{i \in I}$ a family
of open sets which is upward filtering, that is, for any $i, j \in I$ there is a $k \in I$ such
that $G_i \cup G_j \subset G_k$. Show that $G := \bigcup_{i \in I} G_i$ satisfies
$$\mu(G) = \sup\{\mu(G_i) : i \in I\}.$$
3. Using the preceding exercise, show that for any Radon measure $\mu$ on a locally
compact space E:
(a) There exists a largest open set G with $\mu(G) = 0$. The complement $\complement G$ is called the
support of the measure $\mu$ and is denoted $\operatorname{supp}(\mu)$.
(b) A point $x \in E$ lies in $\operatorname{supp}(\mu)$ if and only if every open neighborhood of x has
positive $\mu$-measure.
(c) For a non-negative $f \in C(E)$, $\int f\,d\mu = 0$ if and only if f = 0 throughout
$\operatorname{supp}(\mu)$.
Determine $\operatorname{supp}(\lambda^d)$ for L-B measure $\lambda^d$ on $\mathbb{R}^d$, and $\operatorname{supp}(\varepsilon_a)$ for every Dirac
measure $\varepsilon_a$ on E.
4. Let $\mu$ be a Borel measure on a locally compact space E. Show that every set A
from the $\sigma$-ring $\rho_\sigma(\mathscr{K})$ generated by the system $\mathscr{K}$ of compact subsets of E is
a Borel set which satisfies $\mu_0(A) = \mu^0(A)$. Here a ring $\mathscr{R}$ in a set $\Omega$ is called a
$\sigma$-ring if the union of every sequence of sets in $\mathscr{R}$ is itself a set in $\mathscr{R}$. In complete
analogy with $\sigma$-algebras, every subset of $\mathscr{P}(\Omega)$ is contained in a smallest $\sigma$-ring.
Sometimes it is only the sets in $\rho_\sigma(\mathscr{K})$ which get called "Borel sets"; this is the
case, e.g., in the classic exposition of HALMOS [1974]. Why is it generally the case
that $\rho_\sigma(\mathscr{K}) \ne \mathscr{B}(E)$?
on $C_c(E)$. The question posed in §28 was: Is it true that for every positive linear
form I on $C_c(E)$ there is a Borel measure $\mu$ on E such that $I_\mu = I$, that is, such
that
$$I(u) = \int u\,d\mu \quad \text{for all } u \in C_c(E)?$$
Any such Borel measure $\mu$ will be called a representing measure for I. The answer,
leaked earlier, to this question reads:
and because of linearity and the fact that the positive and negative parts of each
$u \in C_c(E)$ also lie in $C_c(E)$, it suffices to show this for non-negative u. So let such
a u be given and let the real number b > 0 be an upper bound for u. For
a given $\varepsilon > 0$ choose real numbers $y_0, \dots, y_n$ with
$$0 = y_0 < y_1 < \dots < y_n = b$$
and
$$y_j - y_{j-1} < \varepsilon \quad \text{for each } j = 1, \dots, n.$$
We set
$$u_j := (u - y_{j-1})^+ \wedge (y_j - y_{j-1}) \quad (j = 1, \dots, n)$$
and get non-negative continuous functions, each having its support in supp(u),
which satisfy
(29.2) $u = \sum_{j=1}^{n} u_j$,
$$\Big| \int u\,d\mu^0 - I(u) \Big| \le \varepsilon \sum_{j=1}^{n} \mu^0(K_{j-1} \setminus K_j) = \varepsilon\,\mu^0(K_0 \setminus K_n) \le \varepsilon\,\mu^0(K_0).$$
The extreme inequality being valid for every $\varepsilon > 0$ and $\mu^0(K_0)$ being finite, the
desired equality
(29.8) $I(u) = \int u\,d\mu^0$
emerges.
The measures of the compact sets $K_j$, j = 0, ..., n do not change, thanks
to (28.8), when $\mu^0$ is replaced by $\mu_0$. Another pass through the preceding derivation
therefore leads to the conclusion that $\mu_0$ is also a representing measure for I. □
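The layer decomposition (29.2) used in this proof is easy to check pointwise. The sketch below does so for single function values; the helper name `layers`, the grid `ys` and the sample values are illustrative assumptions and not part of the text.

```python
def layers(u_value, ys):
    # u_j = (u - y_{j-1})^+ ∧ (y_j - y_{j-1}) evaluated at one point, for j = 1, ..., n
    return [min(max(u_value - ys[j - 1], 0.0), ys[j] - ys[j - 1])
            for j in range(1, len(ys))]

ys = [0.0, 0.25, 0.5, 0.75, 1.0]           # 0 = y_0 < y_1 < ... < y_n = b
samples = [0.0, 0.1, 0.25, 0.6, 0.99, 1.0]  # possible values u(x) in [0, b]

# each value u(x) is recovered as the sum of its layers
reconstructed = [sum(layers(value, ys)) for value in samples]
```

Each layer lies between 0 and the width of its band, and the layers below the level u(x) are full while those above it vanish, which is why the sum returns u(x).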
Proof. Given K and U, consider functions $u, v \in C_c(E)$ with $1_K \le v$, $0 \le u \le 1$,
and $\operatorname{supp}(u) \subset U$. Integrating these inequalities,
$$\mu(K) \le \int v\,d\mu = I(v) \quad \text{and} \quad I(u) = \int u\,d\mu \le \mu(U).$$
From (28.2) and Lemma 28.7 therefore the claimed inequalities follow. □
After this preparation we can enhance the statement of the Riesz representation
theorem by characterizing the measures $\mu_0$ and $\mu^0$, thereby putting into relief the
role of Radon measures.
29.3 Theorem. For every positive linear form I on $C_c(E)$ the associated essential
measure $\mu_0$ is the unique Radon measure among the representing measures of I.
Proof. Let $\mu$ be a representing measure for I which is inner regular, thus a Radon
measure. Since $\mu_0$ is also inner regular, it follows from the first part of the preceding
lemma that
$$\mu(A) \le \mu_0(A) \quad \text{for every } A \in \mathscr{B}(E).$$
In particular then all open $U \subset E$ satisfy $\mu(U) \le \mu_0(U) \le \mu^0(U)$ and when this
is combined with the second part of 29.2 we have
(29.9) $\mu(U) = \mu_0(U)$ for every open $U \subset E$.
If compact $K \subset E$ is given and U is an open, relatively compact neighborhood
of K, then $U \setminus K$ is open, so that (29.9) is applicable and
$$\mu(U) - \mu(K) = \mu(U \setminus K) = \mu_0(U \setminus K) = \mu_0(U) - \mu_0(K).$$
Another appeal to (29.9), remembering that $\mu_0(U) < +\infty$, gives the equality
$$\mu(K) = \mu_0(K),$$
valid for every compact $K \subset E$. This fact and the inner regularity of both measures
results in their equality. □
29.4 Theorem. Among all representing measures for a positive linear form I
on $C_c(E)$ the principal representing measure $\mu^0$ is characterized by each of the
following two properties:
(i) $\mu^0$ is the smallest among all outer regular representing measures.
(ii) $\mu^0$ is the unique outer regular representing measure $\mu$ which is inner regular
on open sets, that is, satisfies
(29.10) $\mu(U) = \sup\{\mu(K) : K \text{ compact} \subset U\}$ for every open U.
Proof. Let $\mu$ be an outer regular representing measure. By Lemma 29.2, $\mu^0(U) \le
\mu(U)$ holds for all open sets U. Since, however, $\mu^0$ is also outer regular, that
inequality passes over to Borel sets generally:
$$\mu^0(B) \le \mu(B) \quad \text{for all } B \in \mathscr{B}(E),$$
§29. Riesz representation theorem 181
The following example shows that in general $\mu^0$ is not the only outer regular
representing measure.
Example. 1. Let E be an uncountable set and equip it with the discrete topology.
For I take the identically 0 form. Then from the last two theorems it follows that
$\mu_0 = \mu^0 = 0$. However the measure $\mu$ from Example 6 of §25 is an outer regular
representing measure which is not identically 0.
29.6 Theorem. If the locally compact space E is countable at infinity, then the
representing measures $\mu_0$ and $\mu^0$ determined by any positive linear form I on $C_c(E)$
coincide.
Proof. First of all there is a sequence $(K_n)$ of compact sets such that $K_n \uparrow E$.
Using Corollary 27.3 we find $0 \le u_n \in C_c(E)$ with $u_n \uparrow 1_E$. But then the sets
$$L_n := \{u_n \ge 1/n\}, \quad n \in \mathbb{N},$$
do what is wanted: Each is closed and, since $L_n \subset \operatorname{supp}(u_n)$, it is compact. Because
$(u_n)$ is isotone
$$L_n \subset \{u_{n+1} \ge 1/n\} \subset \{u_{n+1} > 1/(n+1)\} \subset L_{n+1},$$
the middle set $\{u_{n+1} > 1/(n+1)\}$ being open, whence $L_n \subset \mathring{L}_{n+1}$, where $\mathring{A}$ denotes
the interior of a set A. As a result, $(\mathring{L}_n)_{n \in \mathbb{N}}$
is an open covering of E, so finitely many of its sets suffice to cover any given
compact subset of E. □
A simple interpretation of countability at infinity now emerges: A locally compact
space E is countable at infinity if and only if the infinitely remote point $\omega_0$
Here $\|f\|$ denotes the supremum norm of any bounded real function f on E.
The requirement (29.11) means that I is continuous with respect to the metric (of
uniform convergence) in $C_c(E)$ derived from this norm.
Remark. 2. If the space E is compact, then every positive linear form I on $C_c(E)$
is bounded, because $C_c(E) = C(E)$ so the constant function 1 lies in $C_c(E)$.
Therefore from $-\|u\| \cdot 1 \le u \le \|u\| \cdot 1$ and the positivity of I we infer that
The next theorem - like its predecessor - covers compact spaces as a special
case.
$$\|\mu_0\| = \sup\{I(u) : 0 \le u \le 1,\ u \in C_c(E)\}.$$
Since $0 \le u \le 1$ entails $\|u\| \le 1$, (29.11) says that $0 \le I(u) \le M\|u\| \le M$, and so
$$\|\mu_0\| \le M < +\infty.$$
Remarks. 3. From the proof of Theorem 29.10 it also follows that the total
mass $\|\mu_0\|$ of $\mu_0$ is the smallest real number $M \ge 0$ that can serve in Definition 29.9.
4. It is not to be expected that in every locally compact space E which is
countable at infinity every positive linear form on $C_c(E)$ will have exactly one
representing measure with no further qualification. Still less is unqualified uniqueness
of representing measures for bounded positive linear forms on $C_c(E)$, when E
is only a locally compact space, to be expected. There is a counterexample to
both in HALMOS [1974], p. 231 - DIEUDONNÉ [1939] is also cited there - in which
the space E is even compact: It is the interval $[1, \Omega]$ of all ordinal numbers not
greater than the first uncountable ordinal $\Omega$, equipped with the order topology.
The positive linear form $I_{\varepsilon_\Omega}$ on $C([1, \Omega])$ defined by the Dirac measure $\varepsilon_\Omega$ has
a representing measure $\mu$ which is neither inner regular nor outer regular. Thus
$\int f\,d\varepsilon_\Omega = \int f\,d\mu$ for all $f \in C([1, \Omega])$ although $\mu \ne \varepsilon_\Omega$. Details can be found in
PFEFFER [1977], p. 116.
29.12 Theorem. If the locally compact space E has a countable base for its topology,
then every Borel measure on E is regular, hence in particular a Radon measure.
Proof. Let $\mu$ be a Borel measure, $I_\mu$ the associated positive linear form on $C_c(E)$
and $\mu^0$ the principal representing measure for $I_\mu$. Along with E each of its open
subspaces U also has a countable base. From Example 2 therefore U is countable
at infinity; there exists a sequence $(K_n)$ of compact sets such that $K_n \uparrow U$. Since
the measures $\mu$, $\mu^0$ are continuous from below, it follows that
$$\mu(U) = \lim_{n\to\infty} \mu(K_n) \quad \text{and} \quad \mu^0(U) = \lim_{n\to\infty} \mu^0(K_n).$$
But $\mu(K_n) \le \mu^0(K_n)$ for every $n \in \mathbb{N}$, by Lemma 29.2. So we get
$\mu(U) \le \mu^0(U)$, from which and a second appeal to 29.2
(29.12) $\mu(U) = \mu^0(U)$, for every open $U \subset E$.
For an arbitrary Borel set A and open $U \supset A$ we then have $\mu(A) \le \mu(U) = \mu^0(U)$
and so, on account of the outer regularity of $\mu^0$,
(29.13) $\mu(A) \le \mu^0(A)$, for every $A \in \mathscr{B}(E)$.
29.13 Corollary. For a locally compact space E whose topology has a countable
base, every positive linear form I on $C_c(E)$ can be represented as
$$I(u) = \int u\,d\mu, \quad u \in C_c(E),$$
by exactly one Borel measure $\mu$ on E.
Example. 4. For each $u \in C_c(\mathbb{R})$ choose real numbers $\alpha < \beta$ such that $\operatorname{supp}(u) \subset
[\alpha, \beta]$ and define
$$L(u) := \int_\alpha^\beta u(x)\,dx,$$
the integral being the usual Riemann integral; it is independent of the specific
numbers $\alpha$ and $\beta$ used. Evidently L is a positive linear form on $C_c(\mathbb{R})$. According
to 16.4 L-B measure $\lambda^1$ represents L, and by 29.13 it is the only representing
measure.
Remark. 5. It is also possible to deduce Theorem 29.12 from Theorem 26.3 and its
Corollary 26.4 because every locally compact space E whose topology has a countable
basis is Polish. In fact along with E, its one-point compactification $E'$ also has
a countable base, as follows from Lemma 29.8 and the commentary after it. It will
be shown in Remark 3 of §31 that $E'$ is consequently metrizable, and completeness
of the metric follows easily from compactness (cf. Example 6, §26). Thus $E'$ is
Polish and E is an open subset of it. Therefore according to Example 4, §26 E
itself is Polish.
Summarizing, we can say that for every locally compact space E, the mapping
that associates to each Radon measure $\mu$ on E the positive linear form $I_\mu$ on $C_c(E)$
is a bijection between the set of Radon measures on E and the set of positive linear
forms on $C_c(E)$. That is the reason why in BOURBAKI [1965] the positive linear
forms on $C_c(E)$ are themselves designated as (positive) Radon measures.
If the space E is countable at infinity as well, the Radon measures on E are
all outer regular. If moreover the topology of E has a countable base, the Radon
measures and the Borel measures on E coincide.
We give now an application to integration that is of fundamental importance.
29.14 Theorem. For any regular Borel measure $\mu$ on a locally compact space E
and any $p \in [1, +\infty[$, the vector space $C_c(E)$ is dense in $\mathscr{L}^p(\mu)$ with respect to
convergence in p-th mean.
Proof. First of all, $C_c(E) \subset \mathscr{L}^p(\mu)$, because $C_c(E) \subset \mathscr{L}^1(\mu)$ by (28.1) and $|u|^p \in
C_c(E)$ whenever $u \in C_c(E)$. The denseness claim requires that for each $f \in \mathscr{L}^p(\mu)$
and each number $\varepsilon > 0$, a function $u \in C_c(E)$ be produced with
that is,
$$N_p(1_U - 1_K) < \varepsilon/2.$$
Finally, we use 27.3 to select $u \in C_c(E)$ satisfying $1_K \le u \le 1_U$, whence
$$0 \le 1_U - u \le 1_U - 1_K$$
and so
$$N_p(1_U - u) < \varepsilon/2.$$
For the function $f = 1_A$ to be approximated we now have
$$N_p(f - u) \le N_p(1_A - 1_U) + N_p(1_U - u) < \varepsilon,$$
completing the proof. □
The proof actually uses the inner regularity of $\mu$ only on open sets. So what is
involved here are conditions which according to 29.4(ii) characterize the principal
representing measure. We will not pursue this any further but interested readers
can consult BOURBAKI [1965] and BAUER [1984], where this remark is placed in a more
general framework.
Exercises.
1. Let E be an uncountable discrete space. Using the Borel measure from Example 6
in §25, show that every positive linear form on $C_c(E)$ has at least two
different representing measures. This sharpens Example 1 of this section.
2. Let E be a locally compact space and I a positive linear form on $C_c(E)$. With
the help of the Riesz representation theorem prove the following refinement of
equality (28.12): For every open $U \subset E$
$$\mu^0(U) = \sup\{I(u) : 0 \le u \le 1_U,\ u \in C_c(E)\}.$$
I (u) II u(x, y) dy
O<x<I 0
is a well defined finite sum, evidently a positive linear form on $C_c(E)$. Show that
(e) The essential and the principal representing measures for I do not coincide.
[Hint: Show that the set $A := E_1 \times \{0\}$ is closed and that $\mu_0(A) = 0$, while
$\mu^0(A) = +\infty$.]
(f) In passing from $\mu^0$ to the Borel measure $1_B\mu^0$ for $B \in \mathscr{B}(E)$ outer regularity
may be lost. [It suffices to consider $B := E \setminus A$, for the set A in the preceding
hint.]
For locally compact spaces E we will henceforth use the notation $\mathscr{M}_+(E)$ for the set
of all (positive) Radon measures on E. The Riesz representation theorem furnishes
a canonical bijection of $\mathscr{M}_+(E)$ onto the set of all positive linear forms on $C_c(E)$.
With $\mu, \nu \in \mathscr{M}_+(E)$ and real numbers $\alpha \ge 0$, $\beta \ge 0$ the measure $\alpha\mu + \beta\nu$ also lies
in $\mathscr{M}_+(E)$, as is easily checked. That is, $\mathscr{M}_+(E)$ is what is called a convex cone.
Besides $\mathscr{M}_+(E)$ we often consider the following subsets
(30.2) $\limsup_{n\to\infty} \mu_n(K) \le \mu(K)$ and $\liminf_{n\to\infty} \mu_n(G) \ge \mu(G)$
Proof. Suppose $(\mu_n)$ converges vaguely to $\mu$ and that K and G are any compact and
open sets, respectively. Consider functions $u, v \in C_c(E)$ with $u \ge 1_K$, $0 \le v \le 1$
and $\operatorname{supp}(v) \subset G$. Then for all $n \in \mathbb{N}$
whence
$$\limsup_{n\to\infty} \mu_n(K) \le \int u\,d\mu \quad \text{and} \quad \int v\,d\mu \le \liminf_{n\to\infty} \mu_n(G).$$
From these inequalities (30.2) follows via (28.2) and (28.12). One only has to recall
that the Radon measure $\mu$ coincides, thanks to Theorem 29.3, with the essential
measure $\mu_0$ determined by the linear form $I_\mu$.
Now suppose conversely that condition (30.2) is fulfilled and that an $f \in C_c(E)$
has been given. Since our goal is to confirm (30.1), we lose no generality by assuming
that $f \ge 0$. For a pre-assigned $\varepsilon > 0$ we choose finitely many numbers
$$0 = y_0 < y_1 < \dots < y_k$$
with $y_k > \|f\|$ and $y_j - y_{j-1} = \varepsilon$ for each j = 1, ..., k. Set
$$K := \operatorname{supp}(f) \quad \text{and} \quad A_j := \{y_{j-1} \le f < y_j\} \cap K, \quad j = 1, \dots, k.$$
Denoting the compact set $\{f \ge y_j\} \cap K$ by $K_j$ for j = 0, ..., k (so $K_k = \emptyset$ and
$K_0 = K$), we have $K_{j-1} \supset K_j$ and
$$A_j = K_{j-1} \setminus K_j \quad (j = 1, \dots, k).$$
Because of the obvious inequalities
$$\sum_{j=1}^{k} y_{j-1} 1_{A_j} \le f \le \sum_{j=1}^{k} y_j 1_{A_j},$$
every Radon measure $\nu$ on E satisfies
$$\sum_{j=1}^{k} y_{j-1}\,\nu(A_j) \le \int f\,d\nu \le \sum_{j=1}^{k} y_j\,\nu(A_j),$$
§30. Convergence of Radon measures 191
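from which and a simple calculation using the facts $\nu(A_j) = \nu(K_{j-1}) - \nu(K_j)$ and
$y_j - y_{j-1} = \varepsilon$, we get
$$\varepsilon \sum_{j=0}^{k} \nu(K_j) - \varepsilon\,\nu(K) = \varepsilon \sum_{j=1}^{k} \nu(K_j) \le \int f\,d\nu \le \varepsilon \sum_{j=0}^{k} \nu(K_j).$$
For $\nu := \mu_n$ the right-hand inequality gives us
$$\int f\,d\mu_n \le \varepsilon \sum_{j=0}^{k} \mu_n(K_j) \quad \text{for all } n \in \mathbb{N},$$
and therefore from the first half of hypothesis (30.2)
$$\limsup_{n\to\infty} \int f\,d\mu_n \le \varepsilon \sum_{j=0}^{k} \mu(K_j).$$
But this right-hand side can be estimated by using the left end of the earlier chain
of inequalities, with $\nu := \mu$. We thereby get
$$\limsup_{n\to\infty} \int f\,d\mu_n \le \int f\,d\mu + \varepsilon\,\mu(K),$$
and since $\varepsilon > 0$ is arbitrary and $\mu(K)$ is finite, $\limsup_{n\to\infty} \int f\,d\mu_n \le \int f\,d\mu$.
There remains the complementary inequality
$$\int f\,d\mu \le \liminf_{n\to\infty} \int f\,d\mu_n$$
and we get it by an analogous procedure, using the second half of hypothesis (30.2).
One sets $G_j := \{f > y_j\}$, j = 0, ..., k, which are open, relatively compact subsets
of K with
$$G_{j-1} \setminus G_j = \{y_{j-1} < f \le y_j\} = \{y_{j-1} < f \le y_j\} \cap K.$$
These sets take over the role of the $K_j$. □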
The second example above (for the case in which, say, all the $\alpha_n$ equal 1) shows
that a vaguely convergent sequence of measures from $\mathscr{M}_+^1(E)$ need not converge
to a measure in $\mathscr{M}_+^1(E)$: mass can be lost. This illustrates the following general
phenomenon:
30.3 Lemma. If the sequence $(\mu_n)_{n \in \mathbb{N}}$ of Radon measures on the locally compact
space E converges vaguely to the measure $\mu \in \mathscr{M}_+(E)$, then the associated total
masses satisfy
(30.3) $\|\mu\| \le \liminf_{n\to\infty} \|\mu_n\|$
Proof. For every $u \in C_c(E)$ with $0 \le u \le 1$
$$\int u\,d\mu_n \le \|\mu_n\|$$
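The loss of mass that Lemma 30.3 allows for can be seen in a very small example. The sketch below assumes, for illustration only, the Dirac measures at the points n = 1, 2, ... on the real line; integrating a function against such a measure is just evaluation. For a continuous function with compact support the integrals are eventually 0, so the vague limit is the zero measure although every term has total mass 1. The names `tent` and `integrate_against_dirac` are hypothetical.

```python
def tent(x, radius=3.0):
    # continuous test function with compact support [-radius, radius]
    return max(0.0, 1.0 - abs(x) / radius)

def integrate_against_dirac(f, point):
    # the integral of f with respect to the Dirac measure at `point` is f(point)
    return f(point)

# integrals of the test function against the Dirac measures at 1, 2, ..., 10
integrals = [integrate_against_dirac(tent, float(n)) for n in range(1, 11)]

total_masses = [1.0] * 10   # every Dirac measure has total mass 1
limit_mass = 0.0            # total mass of the vague limit (the zero measure)
```

Here the inequality of the lemma is strict: the limit has mass 0 while the lower limit of the masses is 1.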
(30.4) $\mu \mapsto \int f\,d\mu$ $(f \in C_c(E))$
(30.5) $V_{f_1,\dots,f_n;\varepsilon}(\mu_0) := \{\mu \in \mathscr{M}_+(E) : |\int f_j\,d\mu - \int f_j\,d\mu_0| < \varepsilon,\ j = 1, \dots, n\}$
in which $n \in \mathbb{N}$, $0 < \varepsilon \in \mathbb{R}$ and $f_1, \dots, f_n \in C_c(E)$ are all arbitrary. The vague
topology is Hausdorff because the uniqueness aspect of Riesz's theorem says that
if $\mu$, $\nu$ are different Radon measures, then $I_\mu \ne I_\nu$, which just means that $\int f\,d\mu \ne
\int f\,d\nu$ for some $f \in C_c(E)$.
In this context it is now clear too what should be understood by the vague
convergence of a mapping $t \mapsto \mu_t$ of a subset A of a topological space T into $\mathscr{M}_+(E)$
when t converges to a point $t_0 \in \overline{A}$. With respect to the vague topology the
convergence
$$\lim_{t \to t_0,\ t \in A} \mu_t = \mu$$
that
.F
r-a+oC
this and the Lebesgue dominated convergence theorem the claim (30.7) follows
upon checking that, on the one hand
$$\lim_{r \to +\infty} f(r^{-1}x)K(x) = f(0)K(x) \quad \text{for every } x \in \mathbb{R}^d,$$
and on the other hand for all real r > 0 and all $x \in \mathbb{R}^d$
$$|f(r^{-1}x)K(x)| \le \|f\|\,K(x),$$
so that $\|f\| K$ is an integrable majorant for all functions. The "approximation of
the identity" $\varepsilon_0$ expressed by (30.7) plays an important role in Fourier analysis (cf.
the exercises in §23 of BAUER [1996]). For the algebra $L^1(\lambda^d)$ (cf. Remark 2, §24)
has no identity element with respect to convolution, but it is not hard to show
that $\|K_r * f - f\|_1 \to 0$ as $r \to +\infty$ for each $f \in L^1(\lambda^d)$, and in many situations
this is almost as useful as having an identity.
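The two facts checked above (pointwise convergence and an integrable majorant) can be observed numerically in dimension 1. The sketch assumes a concrete kernel and test function that are not taken from the text: K is the standard normal density, and f is a continuous function with compact support and f(0) = 1. The quantity computed is the integral of f(y/r) K(y) dy, which is what the substitution x = y/r makes of the integral of f against a density of the form r K(rx).

```python
import math

def kernel(y):
    # standard normal density: non-negative with integral 1 (assumed choice of K)
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def f(x):
    # continuous with compact support [-1, 1] and f(0) = 1 (assumed test function)
    return max(0.0, 1.0 - x * x)

def integral_against_K_r(r, lo=-10.0, hi=10.0, cells=20000):
    # midpoint rule for the integral of f(y / r) * K(y) over [lo, hi]
    h = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        y = lo + (i + 0.5) * h
        total += f(y / r) * kernel(y)
    return total * h

values = {r: integral_against_K_r(r) for r in (1, 10, 100)}
```

As r grows the computed values approach f(0) = 1, in line with the dominated convergence argument; the truncation to [-10, 10] is harmless here because the normal density is negligible outside that interval.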
To $\mathscr{M}_+^b(E)$ belong in particular all discrete Radon measures on E. These are
the measures $\delta$ which can be represented in the form
$$\delta = \sum_{i=1}^{k} \alpha_i \varepsilon_{x_i},$$
for some finite number of points $x_1, \dots, x_k \in E$ and non-negative real numbers
$\alpha_1, \dots, \alpha_k$. Every $\delta$ admits many such representations. Every Radon measure can
be approximated, in the sense of the vague topology, by such $\delta$, as we next show.
30.4 Theorem. For every locally compact space E the set of discrete Radon measures
on E is dense in $\mathscr{M}_+(E)$ in the vague topology.
pact set
$$K := \bigcup_{j=1}^{n} \operatorname{supp}(f_j)$$
and $\eta > 0$ such that $\eta\mu_0(K) < 1$. Every $y \in K$ has an open neighborhood $U_y$ in E
such that $|f_j(y') - f_j(y'')| \le \eta$ for all $y', y'' \in U_y$ and all $j \in \{1, \dots, n\}$. Finitely
many $U_y$, say $U_{y_1}, \dots, U_{y_k}$, suffice to cover K. Set
$$A_1 := K \cap U_{y_1}, \quad A_2 := (K \cap U_{y_2}) \setminus A_1, \quad \dots, \quad A_k := (K \cap U_{y_k}) \setminus (A_1 \cup \dots \cup A_{k-1}).$$
These are pairwise disjoint, relatively compact Borel sets whose union is K, and for
all $j \in \{1, \dots, n\}$, $i \in \{1, \dots, k\}$ and $y', y'' \in A_i$ the inequality $|f_j(y') - f_j(y'')| \le \eta$
holds. Since only these properties of the $A_i$ are used in the sequel, we can discard
those that are empty (not all are because $\emptyset \ne K = A_1 \cup \dots \cup A_k$), and re-index the
others. That is, we can suppose all the $A_i$ are non-empty and then select a point
$x_i \in A_i$ for each i. The discrete measure
$$\delta := \sum_{i=1}^{k} \mu_0(A_i)\,\varepsilon_{x_i}$$
(notice that $\mu_0(A_i)$ is finite because $A_i$ is relatively compact) will be shown to lie
in V and that will complete the proof.
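$$\Big| \int f_j\,d\mu_0 - \int f_j\,d\delta \Big| = \Big| \sum_{i=1}^{k} \int_{A_i} f_j\,d\mu_0 - \sum_{i=1}^{k} \mu_0(A_i) f_j(x_i) \Big| \le \sum_{i=1}^{k} \int_{A_i} |f_j - f_j(x_i)|\,d\mu_0 \le \eta\,\mu_0(K),$$
using the fact that $|f_j(x) - f_j(x_i)| \le \eta$ for all $x \in A_i$, all $i \in \{1, \dots, k\}$. This
holds for each $j \in \{1, \dots, n\}$, and $\eta\mu_0(K) < 1$ by choice of $\eta$. Therefore $\delta \in
V_{f_1,\dots,f_n;1}(\mu_0) = V$, as was to be shown. □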
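The construction in this proof (cut K into small pieces, put the mass of each piece at one of its points) can be tried out in the simplest setting. The sketch assumes, purely for illustration, that the measure is Lebesgue measure on K = [0, 1], that the pieces are k intervals of equal length with their left endpoints as chosen points, and that the test function is sin(3x); none of these choices come from the text.

```python
import math

def f(x):
    # continuous test function on K = [0, 1] (assumed example); |f'| <= 3
    return math.sin(3.0 * x)

def discrete_integral(k):
    # pieces A_i = [i/k, (i+1)/k), mass 1/k each, chosen point x_i = i/k in A_i;
    # returns the integral of f with respect to the resulting discrete measure
    return sum((1.0 / k) * f(i / k) for i in range(k))

exact = (1.0 - math.cos(3.0)) / 3.0   # integral of sin(3x) over [0, 1]
errors = {k: abs(exact - discrete_integral(k)) for k in (10, 100, 1000)}
```

On each piece f oscillates by at most 3/k, so the estimate from the proof bounds the error by (3/k) times the mass of K, that is by 3/k; the computed errors respect this bound.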
30.5 Corollary. The discrete p-measures on E are dense in $\mathscr{M}_+^1(E)$ in the vague
topology.
Proof. We take over the notation of the preceding proof. Now $\mu_0$ is a measure
in $\mathscr{M}_+^1(E)$, but the discrete measure $\delta = \sum \mu_0(A_i)\varepsilon_{x_i}$ may not be a p-measure,
so more work is required. Set $\alpha_i := \mu_0(A_i)$, i = 1, ..., k. If K = E (in which case
E had to be a compact space), then $\alpha_1 + \dots + \alpha_k = \mu_0(K) = 1$ and $\delta$ actually is
a p-measure. In general what we have is
$$\alpha_1 + \dots + \alpha_k = \mu_0(K) \le \mu_0(E) = 1$$
is a discrete p-measure with $\int f_j\,d\delta = \int f_j\,d\delta'$ for each j = 1, ..., n, since $x_{k+1}$ lies
outside the supports of all these functions. Consequently, $\delta \in V = V_{f_1,\dots,f_n;1}(\mu_0)$
yields that also $\delta' \in V$. □
Next we will investigate whether the equality (30.1) and the continuity assertion
(30.4) remain valid for classes of continuous functions more general than $C_c(E)$.
Recall in this connection that for a measure $\mu \in \mathscr{M}_+^b(E)$, every $f \in C_b(E)$ is
$\mu$-integrable: it is $\mathscr{B}(E)$-measurable and its modulus is majorized by a real constant,
hence $\mu$-integrable, function. We will formulate the relevant results for sequences
only; their extensions to mappings $t \mapsto \mu_t$ are routine.
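$$\Big| \int f\,d\mu_n - \int g\,d\mu_n \Big| \le \alpha\varepsilon \quad \text{for each } n \in \mathbb{N}$$
and
$$\Big| \int f\,d\mu - \int g\,d\mu \Big| \le \alpha\varepsilon,$$
so that via the triangle inequality
$$\Big| \int f\,d\mu_n - \int f\,d\mu \Big| \le 2\alpha\varepsilon + \Big| \int g\,d\mu_n - \int g\,d\mu \Big| \quad \text{for all } n \in \mathbb{N}.$$
Since the hypothesis of vague convergence means that $\int g\,d\mu_n \to \int g\,d\mu$, we get
$$\limsup_{n\to\infty} \Big| \int f\,d\mu_n - \int f\,d\mu \Big| \le 2\alpha\varepsilon,$$
valid for every $\varepsilon > 0$. That is, the limit exists and is 0. □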
Remarks. 1. If one considers measures $\mu_n$ and $\mu \in \mathscr{M}_+^b(E)$ without the hypothesis
$\sup \|\mu_n\| < +\infty$, the above conclusion can fail. The special case of Example 2 in
The passage from $C_0(E)$ to $C_b(E)$ therefore calls for a special investigation,
which we stress by introducing a new definition:
(30.8) $\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu$ for all $f \in C_b(E)$.
At this point it is worth returning once more to Theorem 30.2. If the measures
$\mu, \mu_n$ there are all finite and of the same total mass, e.g., if they are all p-measures,
then the two components of the compound condition (30.2) become equivalent. The
result is the following portmanteau-theorem:
30.10 Theorem. Let $\mu, \mu_1, \mu_2, \dots$ be measures in $\mathscr{M}_+^1(E)$. Then the following
three assertions are equivalent:
(i) The sequence $(\mu_n)_{n \in \mathbb{N}}$ converges vaguely (and therefore also weakly) to $\mu$.
(ii) For every closed $F \subset E$
(30.9) $\limsup_{n\to\infty} \mu_n(F) \le \mu(F)$.
(iii) For every open $G \subset E$
(30.9') $\liminf_{n\to\infty} \mu_n(G) \ge \mu(G)$.
Proof. The first paragraph of the proof of 30.2 actually established that (i)$\Rightarrow$(iii),
under the less restrictive hypotheses prevailing there. Since that theorem further
shows that the conjunction of (ii) and (iii) implies (i), it only remains to establish
the equivalence of (ii) and (iii). That follows from the trivial observation that
$$\nu(\complement A) = \nu(E) - \nu(A) = 1 - \nu(A)$$
holds for all $A \in \mathscr{B}(E)$ and all $\nu \in \mathscr{M}_+^1(E)$.
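That the inequalities (30.9) and (30.9') can both be strict is visible already for point masses. The sketch below assumes, as an illustration not taken from the text, the Dirac measures at the points 1/n on the real line, which converge weakly to the Dirac measure at 0. It evaluates them on the closed set F = {0} and on the open set G = ]0, 2[, and uses the maximum and minimum over a late stretch of the sequence as stand-ins for the upper and lower limits.

```python
def dirac(point):
    # the Dirac measure at `point`, applied to a set given by its indicator function
    return lambda indicator: 1.0 if indicator(point) else 0.0

def closed_F(x):
    return x == 0.0          # F = {0}, a closed set

def open_G(x):
    return 0.0 < x < 2.0     # G = ]0, 2[, an open set

mu = dirac(0.0)                                    # the limit measure
mus = [dirac(1.0 / n) for n in range(1, 201)]      # Dirac measures at 1, 1/2, 1/3, ...

tail = mus[100:]                                   # late part of the sequence
limsup_F = max(m(closed_F) for m in tail)          # stand-in for limsup of mu_n(F)
liminf_G = min(m(open_G) for m in tail)            # stand-in for liminf of mu_n(G)
```

The computed values are 0 for F and 1 for G, while the limit measure gives F the mass 1 and G the mass 0, so both inequalities hold and both are strict.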
Example 1 in this section shows that the weak convergence of a sequence $(\mu_n)$
in $\mathscr{M}_+^b(E)$ to a $\mu \in \mathscr{M}_+^b(E)$ does not imply the convergence of $(\int f\,d\mu_n)$ to $\int f\,d\mu$
for every bounded Borel measurable function f. Nevertheless the continuity of the
functions f which define weak convergence can be relaxed somewhat. To this end,
we consider bounded, real-valued, Borel measurable functions f on E which are
$\mu$-almost everywhere continuous for a $\mu \in \mathscr{M}_+^b(E)$: After excision of a $\mu$-nullset
$N \in \mathscr{B}(E)$, f is continuous at each point of $E \setminus N$. Important examples of such
are the indicator functions of boundaryless Borel sets. The latter are defined as
follows:
Proof. By hypothesis there is a Borel set $E_0 \subset E$ with $\mu(E \setminus E_0) = 0$ such that
f is continuous at each point of $E_0$. Let $\varepsilon > 0$ be given. Since $\mu$ is a Radon
measure, there is a compact $K \subset E_0$ with
$$\mu(E_0 \setminus K) < \varepsilon.$$
Every $x \in K$ has an open neighborhood $U_x$ on which the oscillation of f is at
most $\varepsilon$, meaning that
$$|f(y_1) - f(y_2)| \le \varepsilon \quad \text{for all } y_1, y_2 \in U_x.$$
as well as
$$\alpha \le g_j \le \alpha_j \le \beta_j \le h_j \le \beta.$$
This follows at once from 27.3 and the application of an appropriate affine
transformation in the range space $\mathbb{R}$. From these properties and definitions it
follows in particular that $g_j \le f \le h_j$ for all j. Therefore if we set
$$g := g_1 \vee \dots \vee g_n \quad \text{and} \quad h := h_1 \wedge \dots \wedge h_n,$$
then both these functions lie in $C_b(E)$ and they satisfy $\alpha \le g \le f \le h \le \beta$.
Moreover,
$$0 \le h(x) - g(x) \le \varepsilon \quad \text{for all } x \in K.$$
For each $x \in K$ lies in some $V_{x_j} \subset U_{x_j}$ and because of the way $U_{x_j}$ was chosen
with respect to the oscillation of f, it follows that $h(x) - g(x) \le h_j(x) - g_j(x) =
\beta_j - \alpha_j \le \varepsilon$. We are now in a position to finish the proof, as follows:
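$$\int (h - g)\,d\mu = \int_K (h - g)\,d\mu + \int_{E \setminus K} (h - g)\,d\mu \le \varepsilon\mu(K) + (\beta - \alpha)\mu(E \setminus K) \le \varepsilon(\mu(E) + \beta - \alpha);$$
and, because $g \le f \le h$ and $g, h \in C_b(E)$, the weak convergence hypothesis gives
$$\int g\,d\mu = \lim_{n\to\infty} \int g\,d\mu_n \le \liminf_{n\to\infty} \int f\,d\mu_n \le \limsup_{n\to\infty} \int f\,d\mu_n$$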
Let us now look at an application of this theorem which relates the vague
convergence of p-measures on the number line to their Theorem 6.6 description
in terms of distribution functions. This is the way that weak (and hence vague)
convergence made its original historical appearance.
30.13 Theorem. Let $\mu, \mu_1, \mu_2, \dots$ be measures in $\mathscr{M}_+^1(\mathbb{R})$, that is, probability measures
on $\mathscr{B}^1$, and $F, F_1, F_2, \dots$ their distribution functions. If the sequence $(\mu_n)_{n \in \mathbb{N}}$
Proof. According to Theorem 30.12, $\lim \mu_n(Q) = \mu(Q)$ for every $\mu$-boundaryless
set $Q \in \mathscr{B}^1$ and thus, after (6.11), $\lim F_n(x) = F(x)$ for every $x \in \mathbb{R}$ such that the
interval $Q_x := \,]-\infty, x[$ is $\mu$-boundaryless. We have
$$]-\infty, x] = \overline{Q_x} = \bigcap_{k \in \mathbb{N}} Q_{x+1/k}$$
and therefore
$$\mu(\overline{Q_x}) = \lim_{k\to\infty} \mu(Q_{x+1/k}) = \lim_{k\to\infty} F(x + 1/k).$$
Consequently, $Q_x$ is $\mu$-boundaryless just if the (isotone) function F is right-continuous
at x, that is (since distribution functions are everywhere left-continuous),
just if x is a point of continuity of F. This proves the first assertion.
Let us now hypothesize that F is continuous on the whole line, and let $\varepsilon > 0$
be given. First of all, (6.13) supplies numbers a < b such that $F(a) < \varepsilon$ and
$1 - F(b) < \varepsilon$. The uniform continuity of F on the compact interval [a, b] insures
that points $a = x_0 < x_1 < \dots < x_k = b$ exist such that
$$F(x_j) - F(x_{j-1}) < \varepsilon \quad \text{for } j = 1, \dots, k.$$
From what has already been proven we know that there exists $n_\varepsilon \in \mathbb{N}$ such that
$$|F_n(x_j) - F(x_j)| < \varepsilon \quad \text{for each } j \in \{0, \dots, k\} \text{ and all } n \ge n_\varepsilon.$$
But then, as we will show, the inequality $|F_n(x) - F(x)| < 2\varepsilon$ prevails for every
$x \in \mathbb{R}$ and all $n \ge n_\varepsilon$, which proves the uniform convergence of $(F_n)$ to F. For if
$x < x_0$, then
$$0 \le F(x) \le F(x_0) < \varepsilon \quad \text{and} \quad 0 \le F_n(x) \le F_n(x_0) < F(x_0) + \varepsilon < 2\varepsilon,$$
that is, $|F_n(x) - F(x)| < 2\varepsilon$. And a similar argument works if $x \ge x_k$. The remaining
x fall into $[x_{j-1}, x_j[$ for an appropriate $j \in \{1, \dots, k\}$, so
$$F(x_{j-1}) \le F(x) \le F(x_j) < F(x_{j-1}) + \varepsilon$$
and
$$F(x_{j-1}) - \varepsilon < F_n(x_{j-1}) \le F_n(x) \le F_n(x_j) < F(x_j) + \varepsilon < F(x_{j-1}) + 2\varepsilon,$$
confirming that in this case too $|F_n(x) - F(x)| < 2\varepsilon$.
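The uniform convergence of distribution functions toward a continuous limit can be illustrated with a concrete family. The sketch assumes, as an example that is not in the text, the normal distributions with mean 1/n and variance 1, which converge weakly to the standard normal distribution; since all these distribution functions are continuous, the left-continuous convention used here plays no role. The supremum over the line is approximated on a finite grid.

```python
import math

def normal_cdf(x, mean=0.0):
    # distribution function of the normal law with the given mean and variance 1
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0)))

def sup_distance(n, lo=-8.0, hi=8.0, steps=16001):
    # grid approximation of sup_x |F_n(x) - F(x)| with F_n belonging to mean 1/n
    h = (hi - lo) / (steps - 1)
    worst = 0.0
    for i in range(steps):
        x = lo + i * h
        worst = max(worst, abs(normal_cdf(x, 1.0 / n) - normal_cdf(x)))
    return worst

distances = [sup_distance(n) for n in (1, 10, 100, 1000)]
```

The distances shrink roughly like 0.4/n, which is the mean shift times the maximum of the normal density.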
The concept of weak convergence (with the same definition) is also meaningful
if E is a Polish space (or even just a metric space) if the measures involved in
Definition 30.7 are all finite Borel measure on E. Only the uniqueness of limits
calls for discussion:
30.14 Lemma. Finite Borel measures p and v on a metric space E are equal if
f f dp = f f dv for all f E Cb(E).
Proof. Let $d$ be a metric giving the topology of $E$ and consider closed subsets
$F \subset E$. Suppose we can always find a sequence $(f_n)$ in $C_b(E)$ with $f_n \downarrow 1_F$. Then
it would follow from the hypothesis and from Lebesgue's dominated convergence
theorem that $\mu(F) = \nu(F)$. The system of closed subsets $F$ of $E$ is an $\cap$-stable
generator of the Borel $\sigma$-algebra $\mathscr{B}(E)$ and it contains the whole space $E$. The
equality $\mu = \nu$ would thus follow from the uniqueness theorem 5.4.
It remains therefore to prove the existence of such sequences $(f_n)$ and we
can suppose $F \ne \emptyset$. For this purpose we use the (uniformly) continuous
antitone function $h : \mathbb{R} \to \mathbb{R}$ which is constantly 1 on $]-\infty, 0]$, constantly 0
on $[1, +\infty[$ and defined by $h(t) := 1 - t$ on $[0,1]$, together with the function
$x \mapsto d(x, F) := \inf\{d(x,y) : y \in F\}$. The latter is a (uniformly) continuous
function on $E$, as we showed in the proof of Example 4, §26. Moreover, its zero-set
is exactly $F$, because $F$ is closed. Evidently then the sequence of (uniformly)
continuous functions
$$f_n(x) := h(n\,d(x,F)), \quad x \in E,\ n \in \mathbb{N},$$
does what is wanted: on $F$ every $f_n$ equals 1, while for $x \notin F$ the numbers $n\,d(x,F)$ increase to $+\infty$, so that $f_n(x)$ decreases to 0. $\square$
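The construction $f_n(x) = h(n\,d(x,F))$ is easy to try out. The sketch below assumes, purely for illustration, that $E = \mathbb{R}$ with the usual metric and that $F = [a,b]$ is a closed interval; the helper names are hypothetical.

```python
def h(t):
    """Antitone cut-off: 1 on ]-inf, 0], 0 on [1, +inf[, and 1 - t in between."""
    if t <= 0.0:
        return 1.0
    if t >= 1.0:
        return 0.0
    return 1.0 - t


def dist_to_interval(x, a, b):
    """d(x, F) for the closed interval F = [a, b] in the real line."""
    return max(a - x, 0.0, x - b)


def f(n, x, a=0.0, b=1.0):
    """f_n(x) = h(n * d(x, F)) for F = [a, b] (illustrative choice of F)."""
    return h(n * dist_to_interval(x, a, b))


# At a point outside F the values decrease to 0; at a point of F they stay 1.
outside = [f(n, 1.25) for n in (1, 2, 4, 8)]
inside = [f(n, 0.5) for n in (1, 2, 4, 8)]
```

For the point $x = 1.25$, at distance $0.25$ from $F = [0,1]$, the values are $0.75, 0.5, 0, 0$ for $n = 1, 2, 4, 8$, while every point of $F$ keeps the value 1, which is the monotone convergence $f_n \downarrow 1_F$ used in the proof.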
Remarks. 6. The concept of $\mu$-boundaryless sets is also meaningful for finite Borel
measures $\mu$ on Polish spaces. One easily convinces oneself that Theorem 30.12
remains valid in this new situation. In the proof one merely has to secure the
existence of the needed functions g3 and h2 somewhat differently: To this end one
engages Urysohn's lemma (WILLARD [1970], p. 102 or KELLEY [1955], p. 115).
7. Weak convergence in the set of finite Radon measures on a Polish or a locally
compact space E derives from a topology in the same way that vague convergence
does. It is called, naturally, the weak topology and it is defined by letting $C_b(E)$
take over the role of $C_c(E)$ in (30.4).
Exercises.
1. Let $E$ be a locally compact space, $(\mu_n)_{n\in\mathbb{N}}$ a sequence in $\mathscr{M}_+^b(E)$ which is
vaguely convergent to $\mu \in \mathscr{M}_+(E)$. If $\|\mu_n\| \le 1$ for every $n \in \mathbb{N}$, then R
o.D
exists and equals 1.
2. Let $(\alpha_n)$ be a convergent sequence of real numbers, with $\lim_{n\to\infty} \alpha_n = \alpha \in \mathbb{R}$.
Further, let $(a_n)$ be a sequence of non-negative real numbers such that $a_1 > 0$
and the series $\sum a_n$ is divergent. Then
$$\lim_{n\to\infty} \frac{a_1\alpha_1 + \dots + a_n\alpha_n}{a_1 + \dots + a_n} = \alpha,$$
the case in which all $a_n = 1$ being the best known instance. Here is an outline for
a measure-theoretic proof: The equations
$$\mu_n := \frac{a_1\varepsilon_1 + \dots + a_n\varepsilon_n}{a_1 + \dots + a_n}, \quad n \in \mathbb{N},$$
define a sequence of measures in $\mathscr{M}_+^1(\mathbb{N})$ which vaguely converges to 0. Therefore
according to 30.6, $\lim_{n\to\infty} \int f\,d\mu_n = 0$ holds for every $f \in C_0(\mathbb{N})$. The relevant $f$ is
the one defined by $f(n) := \alpha_n - \alpha$.
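The limit relation of Exercise 2 can be checked numerically before it is proved. The sketch below is an illustration only: the sequence $\alpha_n = 2 + 1/n$ and the two weight sequences ($a_n = 1$ and $a_n = 1/n$, both with divergent sum) are our own choices.

```python
def weighted_means(alphas, weights):
    """Running quotients (a_1*alpha_1 + ... + a_n*alpha_n) / (a_1 + ... + a_n)."""
    numerator = 0.0
    denominator = 0.0
    means = []
    for alpha, a in zip(alphas, weights):
        numerator += a * alpha
        denominator += a  # weights[0] must be positive so that this is never 0
        means.append(numerator / denominator)
    return means


N = 100000
alphas = [2.0 + 1.0 / n for n in range(1, N + 1)]  # converges to alpha = 2

# Classical Cesaro means: all weights equal to 1.
cesaro = weighted_means(alphas, [1.0] * N)

# Harmonic weights a_n = 1/n: the series still diverges, but only logarithmically.
harmonic = weighted_means(alphas, [1.0 / n for n in range(1, N + 1)])
```

With unit weights the means are already very close to 2 after $10^5$ terms; with harmonic weights they approach 2 much more slowly, because the total weight grows only like $\log n$. Divergence of $\sum a_n$ is what forces the mass of $\mu_n$ to leave every finite set.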
3. Let $E$ be a locally compact space and $T$ a subset of $C_c(E)$ with the following
properties: Each compact $K \subset E$ has a relatively compact neighborhood $U$ such
that every $f \in C_c(E)$ with $\operatorname{supp}(f) \subset K$ is uniformly approximable on $E$ by
functions $t \in T$ whose supports lie in $U$; and further, there exists a $t \in T$ with
$0 \le t \le 1$ and $t(K) = \{1\}$. Show that:
(a) A sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ is vaguely convergent if and only if the sequence
$(\int t\,d\mu_n)$ is convergent in $\mathbb{R}$ for every $t \in T$.
(b) For $E := \mathbb{R}$, the set of all continuously differentiable real-valued functions with
compact support is a $T$ with the above properties.
4. With the help of Exercise 3 show that for the functions $f_n(x) := 1 - \sin(nx)$
on $\mathbb{R}$, the sequence $(f_n\lambda^1)_{n\in\mathbb{N}}$ converges vaguely to $\lambda^1$, and deduce from this the
Riemann-Lebesgue lemma:
$$\lim_{n\to\infty} \int f(x)\sin(nx)\,dx = 0$$
for every $f \in$
of Theorem 11.6 and show with the help of Exercise 5 that every 0 < f E Cb(E)
is the uniform limit on E of an isotone sequence (un) in the vector space spanned
by the indicator functions of the sets in -90-1
7. As an application of Exercise 6 show that in the context of Theorem 30.13
condition (30.13) there is also sufficient for the weak convergence of $(\mu_n)$ to $\mu$.
8. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers in $]0,1[$. From $[0,1]$ delete the open
interval $I_{11}$ centered at $1/2$ having length $a_1$. There remain two disjoint closed
intervals $J_{11}, J_{12}$. From $J_{1j}$ delete the open interval $I_{2j}$ of length $a_2\lambda^1(J_{1j})$ whose
midpoint is that of $J_{1j}$ ($j = 1,2$). Then there remain four pairwise disjoint closed
intervals $J_{21}, J_{22}, J_{23}, J_{24}$. From $J_{2j}$ delete the open interval $I_{3j}$ of length $a_3\lambda^1(J_{2j})$
whose midpoint is that of $J_{2j}$ ($j = 1,2,3,4$). Then there remain $8 = 2^3$ pairwise
disjoint closed intervals $J_{3j}$, $j = 1,\dots,8$. Continuing in this way one gets for each
$n \in \mathbb{N}$ pairwise disjoint closed intervals $J_{nj}$, $j = 1,\dots,2^n$. The set
$$C := \bigcap_{n\in\mathbb{N}} (J_{n1} \cup \dots \cup J_{n2^n})$$
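The deletion procedure of Exercise 8 can be simulated directly. The following is a sketch with hypothetical helper names; it only uses the fact that step $n$ removes the fraction $a_n$ of every remaining interval, so that the union $J_{n1} \cup \dots \cup J_{n2^n}$ has total length $(1-a_1)\cdots(1-a_n)$.

```python
def remaining_intervals(a, steps):
    """Closed intervals J_n1, ..., J_n2^n left after `steps` deletion steps.

    a[n-1] plays the role of a_n, the fraction of each interval that is
    removed (as a centered open interval) in step n.
    """
    intervals = [(0.0, 1.0)]
    for n in range(steps):
        next_intervals = []
        for left, right in intervals:
            gap = a[n] * (right - left)  # length of the deleted open interval
            mid = 0.5 * (left + right)   # deleted interval is centered here
            next_intervals.append((left, mid - gap / 2.0))
            next_intervals.append((mid + gap / 2.0, right))
        intervals = next_intervals
    return intervals


def total_length(intervals):
    """Sum of the lengths of pairwise disjoint intervals."""
    return sum(right - left for left, right in intervals)


# Classical Cantor set: every a_n equals 1/3.
cantor_step5 = remaining_intervals([1.0 / 3.0] * 5, 5)

# A sequence with varying ratios.
varied = remaining_intervals([0.5, 0.25, 0.1], 3)
```

With all $a_n = 1/3$ this is the classical Cantor construction, and after five steps the $2^5 = 32$ intervals have total length $(2/3)^5$. For the ratios $0.5, 0.25, 0.1$ the remaining length after three steps is $0.5 \cdot 0.75 \cdot 0.9 = 0.3375$.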
10. Let $E$ be a metric space, with metric $d$, and let $\mu, \mu_1, \mu_2, \dots$ be p-measures
on $\mathscr{B}(E)$. Show that each of the following is necessary and sufficient for the weak
convergence of the sequence $(\mu_n)$ to $\mu$:
(a) $\lim \int f\,d\mu_n = \int f\,d\mu$ for all bounded functions $f$ which are uniformly
continuous on $E$.
(b) $\limsup \mu_n(F) \le \mu(F)$ for all closed $F \subset E$.
(c) $\liminf \mu_n(G) \ge \mu(G)$ for all open $G \subset E$.
[Hints for (a)$\Rightarrow$(b): Re-examine the proof of 30.14. There it was shown how,
for a closed non-empty $F \subset E$, to construct uniformly continuous functions $f_n$
satisfying $f_n \downarrow 1_F$.]
204 IV. Measures on Topological Spaces
We again consider a locally compact space $E$ along with its space $\mathscr{M}_+ = \mathscr{M}_+(E)$
of Radon measures, equipped with the vague topology. Our interest here is in the
subsets of $\mathscr{M}_+$ which are compact or relatively compact in this topology. They are
naturally called vaguely compact, resp., vaguely relatively compact.
A necessary condition for the vague relative compactness of a set $H \subset \mathscr{M}_+$
can be inferred at once from the very definition of the vague topology. According
to it, for each $f \in C_c(E)$ the real function $\mu \mapsto \int f\,d\mu$ is continuous on $\mathscr{M}_+$.
Therefore the image of any relatively compact $H$ under each such mapping must
be a relatively compact subset of $\mathbb{R}$, that is, a bounded set. This observation leads
to the following definition:
Thus vague boundedness of a set $H \subset \mathscr{M}_+$ is a necessary condition for its vague
relative compactness. We want to show that it is also sufficient:
Proof. In view of the preceding, all that has to be shown is that vague relative
compactness follows from the vague boundedness of $H$. To this end, let $\alpha_f$
denote the real number in (31.1), for each $f \in C_c(E)$, and $J_f$ the compact interval
$[-\alpha_f, \alpha_f]$ in $\mathbb{R}$. Also denote the (vague) closure of $H$ in $\mathscr{M}_+$ by $\overline{H}$. First observe
that
$$\int f\,d\mu \in J_f$$
for all $f \in C_c(E)$ and all $\mu \in \overline{H}$. In fact, if $f \in C_c(E)$ and $\varepsilon > 0$ are given
$$P := \mathbb{R}^{C_c} = \prod_{f \in C_c} \mathbb{R}_f$$
in which for each $f \in C_c = C_c(E)$ a copy $\mathbb{R}_f := \mathbb{R}$ of the number line appears as
a factor. The product
$$J := \prod_{f \in C_c} J_f$$
$$\Phi(\mu) \mapsto \int f\,d\bigl(\Phi^{-1}(\Phi(\mu))\bigr) = \int f\,d\mu$$
of $\Phi(\mathscr{M}_+)$ into $\mathbb{R}$ ($f \in C_c(E)$). But this mapping is just the restriction to $\Phi(\mathscr{M}_+)$
of the projection of $P = \mathbb{R}^{C_c}$ onto its coordinate specified by $f$.
As to (b): Let $I \in P$ be a point in the closure of $\Phi(\mathscr{M}_+)$ in $P$. Then $I$ is
a positive linear form on $C_c(E)$. To see its additivity, for example, let $f, g \in C_c(E)$
and $\varepsilon > 0$ be given. The set of all $I' \in P$ which satisfy
$$|I'(u) - I(u)| < \varepsilon \quad \text{for } u \in \{f, g, f+g\}$$
is a neighborhood of $I$ in $P$, and therefore contains a point $I' = \Phi(\mu)$ from $\Phi(\mathscr{M}_+)$.
$I'$ is thus the positive linear form
Proof. For every $f \in C_c(E)$ and $\mu \in \mathscr{M}_\alpha$, $|\int f\,d\mu| \le \int |f|\,d\mu \le \alpha\|f\|$.
Consequently, $\mathscr{M}_\alpha$ is vaguely bounded, hence vaguely relatively compact. What therefore
remains to be confirmed is the closedness of $\mathscr{M}_\alpha$ in $\mathscr{M}_+$. According to (28.13)
$\mathscr{M}_\alpha$ is just the set of all $\mu \in \mathscr{M}_+$ such that $\int u\,d\mu \le \alpha$ holds for all $[0,1]$-valued
$u \in C_c(E)$. Because the mapping $\mu \mapsto \int u\,d\mu$ of $\mathscr{M}_+$ into $\mathbb{R}$ is continuous, the
set $\{\mu \in \mathscr{M}_+ : \int u\,d\mu \le \alpha\}$ is closed, for each $u \in C_c(E)$, and by the preceding
observation $\mathscr{M}_\alpha$ is an intersection of such sets, those for which $u(E) \subset [0,1]$. Thus
$\mathscr{M}_\alpha$ is indeed (vaguely) closed. $\square$
Remark. 1. The set of all measures $\mu \in \mathscr{M}_+(E)$ with $\|\mu\|$ equal to a fixed positive
number $\alpha$ is vaguely closed if $E$ is compact (because in that case $1_E \in C_c(E)$).
Example 2 of §30, with all the a there equal to a, illustrates this.
Remark. 2. For every locally compact space $E$ the, obviously injective, mapping
(31.2) $\varphi : E \to \mathscr{M}_+(E)$
defined by $\varphi(x) := \varepsilon_x$ is a homeomorphism of $E$ with $\varphi(E) = \{\varepsilon_x : x \in E\}$. For
every point $x \in E$ the (open) sets
$$M_{f_1,\dots,f_n;\eta}(x) = \{y \in E : |f_j(x) - f_j(y)| < \eta,\ j = 1,\dots,n\}$$
form a neighborhood basis at $x$ as $\{f_1,\dots,f_n\}$ runs through all finite subsets of $C_c(E)$
and $\eta$ through all positive real numbers. In fact, if $U$ is a neighborhood of some
§31. Vague compactness and metrizability questions 207
31.4 Lemma. For any locally compact space $E$ the following assertions are
equivalent:
(a) $E$ has a countable basis.
(b) There is a countable subset of $C_c(E)$ which is dense with respect to uniform
convergence.
Proof. (a)$\Rightarrow$(b): Let $\mathscr{G}$ be a countable base for (the topology of) $E$, $\mathscr{I}$ the set of
all open intervals in $\mathbb{R}$ with rational endpoints. For every natural number $n$ let
us say that an $n$-tuple $(G_1,\dots,G_n) \in \mathscr{G}^n$ and an $n$-tuple $(I_1,\dots,I_n) \in \mathscr{I}^n$ are
compatible with each other if a function $f \in C_c(E)$ exists such that $f(G_j) \subset I_j$
for each $j = 1,\dots,n$ and $\operatorname{supp}(f) \subset G_1 \cup \dots \cup G_n$. Any such $f$ will be called
a compatibility function for the pair of $n$-tuples. Obviously, the set
$$\bigcup_{n\in\mathbb{N}} (\mathscr{G}^n \times \mathscr{I}^n)$$
is countable; there are therefore only countably many such pairs of $n$-tuples ($n \in \mathbb{N}$)
that are compatible with each other. We choose a compatibility function for each
such pair and designate by $F$ the set of functions chosen. It suffices to prove that
$F$ is a countable dense subset of $C_c(E)$. To prove its denseness, let $u \in C_c(E)$
and $\varepsilon > 0$ be given. Denote the support of $u$ by $K$. Every $x \in K$ lies in an open
neighborhood from $\mathscr{G}$ each point $y$ of which satisfies $|u(x) - u(y)| < \varepsilon$. The
compact set $K$ is covered by finitely many such neighborhoods, say by $G_1,\dots,G_n$.
The diameter of each image set $u(G_j)$ is at most $2\varepsilon$. Consequently there are
intervals $I_j \in \mathscr{I}$ of length less than $3\varepsilon$ such that $u(G_j) \subset I_j$, for $j = 1,\dots,n$. Thus
$u$ is a compatibility function for the pair of $n$-tuples $(G_1,\dots,G_n)$, $(I_1,\dots,I_n)$.
Hence there must also be such a compatibility function $f$ in the representative
set $F$. Every $x \in G_j$ therefore satisfies $|u(x) - f(x)| < \lambda^1(I_j) < 3\varepsilon$; that is,
$|u(x) - f(x)| < 3\varepsilon$ for all $x \in G_1 \cup \dots \cup G_n$. But this latter inequality prevails as
well for all $x \in E \setminus (G_1 \cup \dots \cup G_n)$ for the simple reason that both $f$ and $u$ vanish
identically in this complement. In summary, $\|u - f\| \le 3\varepsilon$. This proves that $F$ is
dense in $C_c(E)$.
(b)$\Rightarrow$(a): Let $D$ be a dense subset of $C_c(E)$. We will show that the system $\mathscr{G}$
of all sets $\{u > 1/2\}$ with $u \in D$ is a base for the topology of $E$. For every open
$U \subset E$ and every point $x \in U$ Corollary 27.3 furnishes an $f \in C_c(E)$ with $f(x) = 1$
and $\operatorname{supp}(f) \subset U$. Since $D$ is dense, there is a $u \in D$ with $\|u - f\| < 1/2$. Then
$$x \in \{u > 1/2\} \subset \{f > 0\} \subset \operatorname{supp}(f) \subset U.$$
If $D$ is countable, so is $\mathscr{G}$. $\square$
(31.3) $$d(x,y) := \sum_{n=1}^{\infty} \frac{|u_n(x) - u_n(y)|}{2^n\,\|u_n\|}, \quad x, y \in E.$$
Point-separation by $D$ means that $d(x,y) > 0$ whenever $x \ne y$. All the other
properties of a metric on $E$ are obvious for $d$. This function $d$ on $E \times E$ is a uniform
limit of continuous functions and is consequently continuous. Therefore the
topology generated by $d$, which we will call the $d$-topology, is coarser than the original
topology of $E$. For any given point $x \in E$ and neighborhood $U$ of $x$ in the original
topology of $E$ there is, as was shown in the "(b)$\Rightarrow$(a)" part of the preceding proof,
a $u \in D$ with
$$x \in V := \{u > 1/2\} \subset U.$$
This function $u$ is however a $u_n$, so that by (31.3) $u$ is $d$-continuous and $V$ is $d$-open.
Therefore the $d$-topology is finer than the original topology of $E$. Consequently
the two topologies in fact coincide.
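The formula (31.3) can be tried out on a toy example. The sketch below is an illustration under our own assumptions: $E = \mathbb{R}$, the series is truncated to three terms, and the $u_n$ are "hat" functions with sup-norm 1. A finite family is of course not dense in $C_c(\mathbb{R})$, so the resulting $d$ is only a metric on the region where the chosen hats separate points.

```python
def hat(center, radius):
    """Continuous function with compact support [center - radius, center + radius]
    and sup-norm 1 (a hypothetical stand-in for a member u_n of D)."""
    def u(x):
        return max(0.0, 1.0 - abs(x - center) / radius)
    return u


def make_metric(functions, sup_norms):
    """Truncated version of (31.3): sum over n of |u_n(x) - u_n(y)| / (2^n * ||u_n||)."""
    def d(x, y):
        return sum(
            abs(u(x) - u(y)) / (2.0 ** n * norm)
            for n, (u, norm) in enumerate(zip(functions, sup_norms), start=1)
        )
    return d


# Three hats centered at -1, 0, 1; each has sup-norm exactly 1.
family = [hat(center, 2.0) for center in (-1.0, 0.0, 1.0)]
d = make_metric(family, [1.0, 1.0, 1.0])
```

On the interval $[-1, 1]$ the hat centered at 1 is strictly increasing, so distinct points there have positive distance; symmetry and the triangle inequality are inherited termwise from the absolute value.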
Now we can provide the final answer to the question posed after Remark 1.
31.5 Theorem. The following assertions about a locally compact space $E$ are
equivalent:
(a) $\mathscr{M}_+(E)$ is a Polish space in its vague topology.
(b) The vague topology of $\mathscr{M}_+(E)$ is metrizable and has a countable base.
(c) The topology of $E$ has a countable base.
(d) $E$ is a Polish space.
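(31.6) $$\left|\int f\,d\mu - \int u e_k\,d\mu\right| \le \varepsilon \int e_k\,d\mu,$$
(31.6') $$\left|\int f\,d\nu - \int u e_k\,d\nu\right| \le \varepsilon \int e_k\,d\nu.$$
As the functions $e_k$ and $u e_k$ are in $D$, the assumption that $\rho(\mu,\nu) = 0$ entails that
their $\mu$- and $\nu$-integrals coincide, and it follows that
$$\left|\int f\,d\mu - \int f\,d\nu\right| \le 2\varepsilon \int e_k\,d\mu,$$
holding for every $\varepsilon > 0$. That is, the desired equality $\int f\,d\mu = \int f\,d\nu$ must hold.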
The next step is to show that the topology determined by $\rho$ is none other
than the vague topology. We will, to that end, make use of the fact that the sets
$V_{f_1,\dots,f_n;\varepsilon}(\nu)$ defined in (30.5) are a neighborhood base at $\nu \in \mathscr{M}_+$ in the vague
topology, when all possible finite subsets $\{f_1,\dots,f_n\}$ of $C_c(E)$ and all numbers
$\varepsilon > 0$ are considered. We will denote by $U_\varepsilon(\nu)$ the open ball of center $\nu$ and radius $\varepsilon$
with respect to the metric $\rho$.
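1. Given $\varepsilon > 0$ there exists $m \in \mathbb{N}$ such that
(31.7) $V_{d_1,\dots,d_m;\varepsilon/2}(\nu) \subset U_\varepsilon(\nu)$ for every $\nu \in \mathscr{M}_+$.
Indeed, one may take any $m \in \mathbb{N}$ such that
$$\sum_{n=m+1}^{\infty} 2^{-n} < \varepsilon/2$$
and every $\mu \in V_{d_1,\dots,d_m;\varepsilon/2}(\nu)$ will then satisfy
$$\rho(\mu,\nu) < \frac{\varepsilon}{2}\sum_{n=1}^{m} 2^{-n} + \frac{\varepsilon}{2} < \varepsilon.$$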
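The choice of $m$ in step 1 rests only on the geometric tail $\sum_{n=m+1}^{\infty} 2^{-n} = 2^{-m}$. A minimal sketch (the function name is hypothetical) that finds the smallest admissible $m$ for a given $\varepsilon$:

```python
def smallest_tail_index(eps):
    """Smallest m >= 0 with sum_{n > m} 2^(-n) = 2^(-m) < eps / 2."""
    m = 0
    while 2.0 ** (-m) >= eps / 2.0:
        m += 1
    return m


def partial_tail(m, terms=60):
    """Partial sum of the tail 2^-(m+1) + 2^-(m+2) + ... (first `terms` summands)."""
    return sum(2.0 ** (-n) for n in range(m + 1, m + 1 + terms))


m_for_tenth = smallest_tail_index(0.1)
```

For $\varepsilon = 0.1$ the condition $2^{-m} < 0.05$ first holds at $m = 5$, since $2^{-4} = 0.0625$ and $2^{-5} = 0.03125$. Any larger $m$ works equally well, which is why the text says "one may take any $m$".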
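(31.9) $$\left|\int f_j\,d\mu - \int u_j e_k\,d\mu\right| \le \delta \int e_k\,d\mu,$$
(31.9') $$\left|\int f_j\,d\nu - \int u_j e_k\,d\nu\right| \le \delta \int e_k\,d\nu$$
for $j = 1,\dots,n$. Choose $m$ so large that all the functions $e_k, u_1 e_k, \dots, u_n e_k$ show
up among the first $m$ functions $d_1,\dots,d_m$ in the enumeration of $D$, to which they
all belong. Finally, set
$$\eta := \delta\,2^{-m}.$$
(31.10') $$\left|\int e_k\,d\mu - \int e_k\,d\nu\right| < \delta.$$
From (31.9) and (31.9'), as well as from (31.10) it follows, via the triangle inequality,
that
$$\left|\int f_j\,d\mu - \int f_j\,d\nu\right| < \delta^2 + \Bigl(1 + 2\int e_k\,d\nu\Bigr)\delta < \varepsilon.$$
As this holds for every $j \in \{1,\dots,n\}$, it asserts that $\mu \in V_{f_1,\dots,f_n;\varepsilon}(\nu)$ and
confirms (31.8). Together (31.7) and (31.8) assert the equality of the vague and the
$\rho$-topologies.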
The next step will be to prove the completeness of the metric $\rho$, and we can
do that via slight modifications in the foregoing arguments. Let $(\mu_n)_{n\in\mathbb{N}}$ be a
$\rho$-Cauchy sequence in $\mathscr{M}_+$. Instead of the functions $f_1,\dots,f_n$ and the number $\varepsilon > 0$
in 2. above, let an $f \in C_c(E)$ and a number $\delta \in\, ]0,1[$ be given. We aim first
to prove that the numerical sequence $(\int f\,d\mu_n)_{n\in\mathbb{N}}$ converges in $\mathbb{R}$. Choose $k \in \mathbb{N}$
with $\operatorname{supp}(f) \subset \{e_k = 1\}$ and $u \in D_0$ with $\|f - u\| < \delta$. Then choose $m \in \mathbb{N}$ large
enough that the two functions $e_k$ and $u e_k$ are among $d_1,\dots,d_m$ and set $\eta := \delta\,2^{-m}$.
Since $(\mu_n)$ is a $\rho$-Cauchy sequence, there is a natural number $N$, dependent on $\eta$,
thus on $f$ and $\delta$, such that
$$\rho(\mu_r, \mu_s) < \eta \quad \text{for all } r, s \ge N.$$
Just as in the earlier deduction scheme, we get that for such $r, s$
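$$\left|\int u e_k\,d\mu_r - \int u e_k\,d\mu_s\right| < \delta \quad \text{and} \quad \left|\int e_k\,d\mu_r - \int e_k\,d\mu_s\right| < \delta.$$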
Of course we also have the $f$-analogs of (31.9) and (31.9'), so that reasoning similar
to that used earlier delivers the inequality
for all $n \in \mathbb{N}$.
The earlier inequality therefore yields
with positive rational $\alpha_i$ and points $x_i$ drawn from a countable set $E_0$ which
is dense in $E$. We get such a set $E_0$ simply by taking a point from each set
in a countable base for the topology of $E$. Evidently, this $\mathscr{D}_0$ is countable. We
have to show that for every $\mu \in \mathscr{M}_+$, every real $\varepsilon > 0$, and every finite set
$F := \{f_1,\dots,f_n\} \subset C_c(E)$, the basic vague neighborhood $V_{f_1,\dots,f_n;\varepsilon}(\mu)$ contains
a measure from $\mathscr{D}_0$. At least, according to 30.4, this neighborhood contains a
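$$\left|\int f\,d\mu - \int f\,d\delta\right| \le \left|\int f\,d\mu - \int f\,d\bar\delta\right| + \left|\int f\,d\bar\delta - \int f\,d\delta\right|$$
(31.13) $$\le \left|\int f\,d\mu - \int f\,d\bar\delta\right| + \sum_{i=1}^{k} \bar\alpha_i\,|f(\bar x_i) - f(x_i)| + \sum_{i=1}^{k} |\bar\alpha_i - \alpha_i|\,\|f\|.$$
The number
$$\varepsilon - \left|\int f\,d\mu - \int f\,d\bar\delta\right|$$
is positive. If we choose $\alpha_i$ from $\mathbb{Q}_+$ sufficiently close to $\bar\alpha_i$ and $x_i$ from the dense
set $E_0$ sufficiently close to $\bar x_i$ ($i = 1,\dots,k$), then because of the continuity of the
(finitely many) functions $f$, we can obviously see to it that the two sums in (31.13)
together are less than this, so that the right side of inequality (31.13) is less than $\varepsilon$,
for each $f \in F$. But that means that $\delta \in \mathscr{D}_0 \cap V_{f_1,\dots,f_n;\varepsilon}(\mu)$.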
Remarks. 4. The reader should recall the rather elementary fact that for a
metric space compactness and sequential compactness are equivalent (see (6.37) in
HEWITT and STROMBERG [1965]). In view of this, a very useful consequence of
Theorems 31.2 and 31.5 for a locally compact space $E$ with a countable base is
that every vaguely bounded sequence in $\mathscr{M}_+(E)$ contains a vaguely convergent
subsequence.
In particular, for such $E$ every sequence $(\mu_n)$ in $\mathscr{M}_+^1(E)$, that is, every sequence
of p-measures, contains a vaguely convergent subsequence. Moreover, in case all
convergent subsequences have the same limit $\mu$, the original sequence $(\mu_n)$ itself
converges vaguely to $\mu$: Otherwise there would be an $f \in C_c(E)$ for which $(\int f\,d\mu_n)$
does not converge to $\int f\,d\mu$, and so an $\varepsilon > 0$ and integers $1 \le n_1 < n_2 < \dots$ such
that $|\int f\,d\mu_{n_j} - \int f\,d\mu| \ge \varepsilon$ for all $j \in \mathbb{N}$. The sequence $(\mu_{n_j})_{j\in\mathbb{N}}$ would have
a vaguely convergent subsequence and its vague limit could not be $\mu$. If we further
hypothesize of $(\mu_n)$ that it is tight, then with the aid of Remark 3 in §30 we can
conclude that $\mu \in \mathscr{M}_+^1(E)$ as well, and that $(\mu_n)$ even converges weakly to $\mu$.
5. The foregoing deliberations show (for locally compact $E$ with a countable
base) that tight sequences in $\mathscr{M}_+^1(E)$ always contain weakly convergent
subsequences. Explicitly formulated this says: A set $H \subset \mathscr{M}_+^1(E)$ is relatively compact
(= relatively sequentially compact) in the weak topology if it is tight, meaning
that for every $\varepsilon > 0$ a compact $K_\varepsilon \subset E$ exists such that $\mu(E \setminus K_\varepsilon) < \varepsilon$ for
every $\mu \in H$. A theorem of Yu.V. PROHOROV asserts that the tightness of $H$ is
even equivalent to its weak relative compactness. More is true: This equivalence
prevails as well whenever $E$ is any Polish space. For details the reader can consult
BILLINGSLEY [1968].
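Why tightness is needed in Remarks 4 and 5 can be seen from the simplest non-tight sequence. The sketch below is our own illustration, not an example from the text: on $E = \mathbb{R}$ take the Dirac measures $\mu_n := \varepsilon_n$, so that $\int f\,d\mu_n = f(n)$.

```python
def integrate_dirac(f, point):
    """Integral of f with respect to the Dirac measure at `point`."""
    return f(point)


def hat(x):
    """A function in C_c(R): supported in [-5, 5], with values in [0, 1]."""
    return max(0.0, 1.0 - abs(x) / 5.0)


def one(x):
    """The constant function 1: bounded and continuous, but not of compact support."""
    return 1.0


def mass_outside(point, radius):
    """Mass that the Dirac measure at `point` gives to the complement of [-radius, radius]."""
    return 0.0 if abs(point) <= radius else 1.0


compact_support_integrals = [integrate_dirac(hat, n) for n in range(10, 20)]
bounded_integrals = [integrate_dirac(one, n) for n in range(10, 20)]
```

The integrals of the compactly supported function are eventually 0, so the sequence converges vaguely to the zero measure, which is not a p-measure. The integrals of the constant function 1 stay equal to 1, so there is no weak convergence to that limit. Consistently, the sequence is not tight: for every radius the measure $\mu_n$ with $n$ beyond that radius puts all its mass outside $[-\text{radius}, \text{radius}]$.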
The ideas employed in the proofs of Theorems 31.4 and 31.5, slightly modified,
lead to a further interesting result. It concerns the space
C := C(R+, E)
of all continuous mappings $f$ of $\mathbb{R}_+ := [0,+\infty[$ into a Polish space $E$, for
example, $\mathbb{R}^d$. We endow $C$ with the topology of uniform convergence on compact subsets
of $\mathbb{R}_+$.
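Proof. Consider any complete metric $\rho$ which generates the topology of $E$. Another
such metric is given by $(x,y) \mapsto \min\{1, \rho(x,y)\}$, and using it if need be, we can
simply assume that $\rho \le 1$. This lets us define $d_n$ in $C$ for each $n \in \mathbb{N}$ by
$$d_n(f,g) := \sup\{\rho(f(x), g(x)) : x \in [0,n]\}, \quad f, g \in C;$$
and
$$d(f,g) := \sum_{n=1}^{\infty} 2^{-n} d_n(f,g), \quad f, g \in C.$$
Just as earlier (cf. (31.3) and (31.4)), one easily confirms that $d$ is a metric
on $C$ (with all its values in $[0,1]$) which satisfies
(31.15) $2^{-n} d_n(f,g) \le d(f,g) \le d_n(f,g) + 2^{-n}$ for all $n \in \mathbb{N}$,
the right-most inequality following from the fact that $d_i \le d_{i+1}$ for all $i \in \mathbb{N}$,
resulting in
$$d(f,g) \le \sum_{i=1}^{n} 2^{-i} d_n(f,g) + \sum_{i=n+1}^{\infty} 2^{-i} \le d_n(f,g) + 2^{-n}.$$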
U(®n
nEN
is countable, there is a countable set F C C which contains a compatibility function
for each pair of compatible n-tuples, for each n E N. The open d-balls having
centers in F and rational radii are a countable set, and it is easy to see that they
constitute a base for the d-topology of C once we confirm that F is dense in C.
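The two-sided estimate (31.15) can be checked on an example. The sketch below makes several assumptions that are ours: $E = \mathbb{R}$ with the bounded metric $\min\{1, |x-y|\}$, the metric on $C$ is taken in the form $d = \sum_n 2^{-n} d_n$ suggested by (31.15), the series is truncated after finitely many terms, and each supremum over $[0,n]$ is only sampled on a grid.

```python
import math


def rho(x, y):
    """Bounded metric min{1, |x - y|} on the real line."""
    return min(1.0, abs(x - y))


def d_n(f, g, n, step=0.01):
    """Grid approximation of sup{rho(f(x), g(x)) : x in [0, n]}."""
    points = int(round(n / step))
    return max(rho(f(i * step), g(i * step)) for i in range(points + 1))


def d(f, g, terms=20):
    """Truncated series sum_{n=1}^{terms} 2^(-n) * d_n(f, g)."""
    return sum(2.0 ** (-n) * d_n(f, g, n) for n in range(1, terms + 1))


def f(t):
    return math.sin(t)


def g(t):
    # Drifts away from f linearly, so d_n(f, g) is about n / 100.
    return math.sin(t) + t / 100.0


total = d(f, g)
```

Because the grids for $[0,n]$ are nested, the sampled $d_n$ are still increasing in $n$, and both inequalities of (31.15) survive truncation for every $n$ up to the number of terms kept. Here $d_n(f,g)$ is close to $n/100$ and the total distance is close to $0.02$: the paths are near each other in $C$ although they separate without bound as $t \to \infty$.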
The significance of Theorem 31.5 lies partly in the fact that for a locally
compact space $E$ whose topology has a countable base the space $\mathscr{M}_+(E)$ of all
(positive) Radon measures - which according to 29.12 is the set of all Borel measures
on $E$ - being also a Polish space, is itself an environment in which measure theory
can be pursued. And this happens in convex analysis, in integral geometry, and
in stochastic geometry, a meeting point between geometry and probability theory.
The path-space $C(\mathbb{R}_+, E)$ of all continuous paths or curves $t \mapsto f(t)$, $t \in \mathbb{R}_+$,
in a Polish space $E$ (Theorem 31.6) plays a fundamental role in the theory of
stochastic processes. For example, the Polish space $C(\mathbb{R}_+, \mathbb{R}^d)$ carries the famous
Wiener measure; it is the steering mechanism of the Brownian motion in $\mathbb{R}^d$ (cf.
BAUER [1996]).
Exercises.
1. Let $E$ be a locally compact space, $\nu \in \mathscr{M}_+(E)$. Show that the set of all $\mu \in
\mathscr{M}_+(E)$ which satisfy $0 \le \int u\,d\mu \le \int u\,d\nu$ for every non-negative $u \in C_c(E)$ is
vaguely compact.
2. Let $E$ be a locally compact space with a countable base. Prove that there is
a countable subset of $C_c(E)$ that has the properties of the set $T$ in Exercise 3, §30.
[Hint: Try the set $D$ that featured in the proof of Theorem 31.5.]
3. (Selection theorem of E. HELLY (1884-1943)). Prove the original form of
Corollary 31.3: To every sequence $(F_n)_{n\in\mathbb{N}}$ of distribution functions on $\mathbb{R}$ corresponds
a measure-generating function $F : \mathbb{R} \to \mathbb{R}$ and a subsequence $(F_{n_k})_{k\in\mathbb{N}}$ of the
original sequence such that $\lim_{k\to\infty} F_{n_k}(x) = F(x)$ for every continuity point $x$ of $F$.
Why is $F$ generally not a distribution function? How does one recover 31.3 (for
the case $E := \mathbb{R}$) from Helly's theorem?
4. For a Polish space E consider the topology (introduced in Remark 7 of §30) of
weak convergence on the set of finite Borel measures (the finite Radon measures
- cf. 26.2) on E. By adapting the ideas in the proof of Theorem 31.5, show that
this topology is metrizable.
5. For what more general spaces taking over the role of R+ in the definition
of C(R+, E) does Theorem 31.6 remain valid?
Bibliography
U. ANONYME [1889]: "Sur l'intégrale $\int_0^\infty e^{-x^2}\,dx$", Bull. Sci. Math. (2) 13, 84.
G. AUMANN [1969]: Reelle Funktionen. Grundlehren Math. Wiss. 68 (2nd edition),
Springer-Verlag, Berlin-Heidelberg-New York.
S. BANACH [1923]: "Sur le problème de la mesure", Fund. Math. 4, 7-33.
R.G. BARTLE and J.T. JOICHI [1961]: "The preservation of convergence of
measurable functions", Proc. Amer. Math. Soc. 12, 122-126.
H. BAUER [1984]: Maße auf topologischen Räumen, Kurs der Fernuniversität-
Gesamthochschule-Hagen.
- [1996]: Probability Theory, de Gruyter Stud. Math. 23. Walter de Gruyter,
Berlin-New York.
S.K. BERBERIAN [1962]: "The product of two measures", Amer. Math. Monthly
69, 961-968.
P. BILLINGSLEY [1968]: Convergence of Probability Measures. John Wiley & Sons,
Inc., New York-London-Sydney-Toronto.
G. BIRKHOFF and S. MACLANE [1965]: A Survey of Modern Algebra (3rd edition).
The Macmillan Co., New York.
N. BOURBAKI [1965]: Integration, Chap. 1-4. Hermann, Paris.
A. BROUGHTON and B.W. HUFF [1977]: "A comment on unions of a-fields",
Amer. Math. Monthly 84, 553-554.
S.D. CHATTERJI [1985-86]: "Elementary counter-examples in the theory of double
integrals", Atti Sem. Mat. Fis. Univ. Modena 34, 363-384.
G. CHOQUET [1969]: Lectures on Analysis. Vol. 1. W.A. Benjamin, New York-
Amsterdam.
J.P.R. CHRISTENSEN [1974]: Topology and Borel Structure. Mathematical Studies
10. North-Holland Publ. Co., Amsterdam-London.
D.L. COHN [1980]: Measure Theory. Birkhäuser Verlag, Basel-Boston-Stuttgart.
P. COURRÈGE [1962]: Théorie de la mesure. Les cours de Sorbonne. Centre de
Documentation Universitaire, Paris 5e.
C. DELLACHERIE et P.-A. MEYER [1975]: Probabilités et potentiel, Chap. I à IV.
Hermann, Paris.
P. DIEROLF and V. SCHMIDT [1998]: "A proof of the change of variable formula
for d-dimensional integrals", Amer. Math. Monthly 105, 654-656.
J. DIEUDONNÉ [1939]: "Un exemple d'espace normal non susceptible d'une
structure uniforme d'espace complet", C. R. Acad. Sci. Paris Sér. I Math. 209,
145-147.
D.A. OVERDIJK, F.H. SIMONS and J.G.F. THIEMANN [1979]: "A comment on
unions of rings", Indag. Math. 41, 439-441.
J.C. OXTOBY and S. ULAM [1941]: "Measure-preserving homeomorphisms and
metrical transitivity", Ann. of Math. (2) 42, 874-920.
K.R. PARTHASARATHY [1967]: Probability Measures on Metric Spaces, Academic
Press, New York-London.
W.F. PFEFFER [1977]: Integrals and Measures. Marcel Dekker. New York-Basel.
J. RADON [1913]: "Theorie und Anwendungen der absolut additiven
Mengenfunktionen", Sitzungsber. Kaiserl. Akad. Wiss. Wien, Math.-Naturwiss. Kl. 122,
1295-1438.
H. RICHTER [1966]: Wahrscheinlichkeitstheorie. Grundlehren Math. Wiss. 86
(2nd edition). Springer-Verlag, Berlin-Heidelberg-New York.
F. RIESZ [1911]: "Sur certains systèmes singuliers d'équations intégrales", Ann.
Sci. École Norm. Sup. (3) 28, 33-62.
J.B. ROBERTSON [1967]: "Uniqueness of measures", Amer. Math. Monthly 74,
50-53.
W. RUDIN [1962]: Fourier Analysis on Groups. Interscience Tracts in Pure Appl.
Math. 12. John Wiley & Sons, New York-London.
- [1987]: Real and Complex Analysis (3rd edition). McGraw-Hill Book Comp.,
New York-Hamburg-Tokyo-Toronto.
S. SAEKI [1996]: "A proof of the existence of infinite product probability mea-
sures", Amer. Math. Monthly 103, 682-683.
W. SIERPINSKI [1928]: "Un théorème général sur les familles d'ensembles", Fund.
Math. 12, 206-210.
R.M. SOLOVAY [1970]: "A model of set-theory in which every set of reals is
Lebesgue measurable", Ann. of Math. (2) 92, 1-56.
R.H. SORGENFREY [1947]: "On the topological product of paracompact spaces",
Bull. Amer. Math. Soc. 53, 631-632.
S.M. SRIVASTAVA [1998]: A Course on Borel Sets. Grad. Texts in Math. 180.
Springer-Verlag, New York-Berlin.
K. STROMBERG [1972]: "An elementary proof of Steinhaus's theorem", Proc.
Amer. Math. Soc. 36, 308.
- [1979]: "The Banach-Tarski paradox", Amer. Math. Monthly 86, 151-161.
- [1981]: An Introduction to Classical Real Analysis. Wadsworth International,
Belmont, California.
H.G. TUCKER [1967]: A Graduate Course in Probability. Academic Press, New
York-San Francisco-London.
J. VAN YZEREN [1979]: "Moivre's and Fresnel's integrals by simple integration",
Amer. Math. Monthly 86, 691-693.
D.E. VARBERG [1971]: "Change of variables in multiple integrals", Amer. Math.
Monthly 78, 42-45.