Measure and Integral

Jaroslav Lukeš
Jan Malý

matfyzpress
PRAGUE 2005

All rights reserved, no part of this publication may be reproduced or transmitted in any
form or by any means, electronic, mechanical, photocopying or otherwise, without the
prior written permission of the publisher.

Jaroslav Lukeš, Jan Malý, 2005


MATFYZPRESS by publishing house of the Faculty of Mathematics and Physics
Charles University in Prague, 2005

ISBN 80-86732-68-1
ISBN 80-85863-06-5 (First edition)

Motto: Everybody writes and nobody reads

L. Fejér

Preface

This text is based on lectures in measure and integration theory given by the
authors during the past decade at Charles University, and on preliminary lecture
notes published in Czech.
It is impossible to thank individually all colleagues and students who assisted
in the preparation of this manuscript, but we would like to mention Michal Kubeček,
who helped with the translation and TeX processing.
The authors wish to express their deep gratitude to Professor Stylianos Negrepontis, who was the chief coordinator of TEMPUS project JEP1980. Without
support from him and the Tempus programme the manuscript would never have
appeared.
The preparation of this manuscript was partially supported by the grant
No. 201/93/2174 of the Czech Grant Agency and by the grant No. 354 of the
Charles University.

Prague, 1994

Jaroslav Lukeš and Jan Malý

Preface to the second edition

We have carried out only minor corrections. We wish to thank all who contributed by suggestions and comments.

Prague, 2005

Jaroslav Lukeš and Jan Malý

Contents

List of Basic Notations and Frequently Used Symbols

A. Measures and Measurable Functions
1. The Lebesgue Measure
2. Abstract Measures
3. Measurable Functions
4. Construction of Measures from Outer Measures
5. Classes of Sets and Set Functions
6. Signed and Complex Measures

B. The Abstract Lebesgue Integral
7. Integration on R
8. The Abstract Lebesgue Integral
9. Integrals Depending on a Parameter
10. The L^p Spaces
11. Product Measures and the Fubini Theorem
12. Sequences of Measurable Functions
13. The Radon-Nikodým Theorem and the Lebesgue Decomposition

C. Radon Integral and Measure
14. Radon Integral
15. Radon Measures
16. Riesz Representation Theorem
17. Sequences of Measures
18. Luzin's Theorem
19. Measures on Topological Groups

D. Integration on R
20. Integral and Differentiation
21. Functions of Finite Variation and Absolutely Continuous Functions
22. Theorems on Almost Everywhere Differentiation
23. Indefinite Lebesgue Integral and Absolute Continuity
24. Radon Measures on R and Distribution Functions
25. Henstock-Kurzweil Integral

E. Integration on R^n
26. Lebesgue Measure and Integral on R^n
27. Covering Theorems
28. Differentiation of Measures
29. Lebesgue Density Theorem and Approximately Continuous Functions
30. Lipschitz Functions
31. Approximation Theorems
32. Distributions
33. Fourier Transform

F. Change of Variable and k-dimensional Measures
34. Change of Variable Theorem
35. The Degree of a Mapping
36. Hausdorff Measures

G. Surface and Curve Integrals
37. Integral Calculus in Vector Analysis
38. Integration of Differential Forms
39. Integration on Manifolds

H. Vector Integration
40. Measurable Functions
41. Vector Measures
42. The Bochner Integral
43. The Dunford and Pettis Integrals

Appendix on Topology
References
A Short Guide to the Notation
Subject Index

List of Basic Notations and Frequently Used Symbols


In this manuscript we use the standard notation.
In what follows, $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ will denote the sets of all natural, integer,
rational, real and complex numbers, respectively.
The extended real number set $\overline{\mathbb{R}}$ consists of $\mathbb{R}$ together with two symbols $-\infty$ and
$+\infty$, equipped with the usual algebraic structure and topology. Recall only that
$0 \cdot (\pm\infty)$ and $(\pm\infty) \cdot 0$ are taken as $0$.
If $X$ is a set, $\mathcal{P}(X)$ denotes the collection of all its subsets.
$\mathbb{R}^n$ stands for the Euclidean $n$-dimensional space under the usual Euclidean
norm $|\cdot|$ and the metric $|x - y|$, where $x = [x_1, \dots, x_n]$.
Remember that when multiplied by a matrix (from the left), the vector $x =
[x_1, \dots, x_n]$ behaves like a vertical vector, i.e. like a matrix with one column
$$\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
The horizontal notation is preferred for aesthetic and typographical reasons.
The standard (or canonical) basis of the space $\mathbb{R}^n$ is denoted by $\{e_1, \dots, e_n\}$,
the vector $e_i = [0, \dots, 0, 1, 0, \dots, 0]$ with $1$ at the $i$th place. The inner product in
$\mathbb{R}^n$ is denoted by $x \cdot y$.
By $U(x, r)$ we denote the open ball in a metric space $(P, \rho)$ of radius $r$ around the
point $x$. The closed ball is denoted by $B(x, r)$. Thus $U(x, r) = \{y \in P : \rho(x, y) < r\}$,
$B(x, r) = \{y \in P : \rho(x, y) \le r\}$. For the diameter of a set we use the symbol
diam and for the distance of a pair of sets the symbol dist.
If nothing else is specified, a function on a set $X$ is a mapping of $X$ into $\overline{\mathbb{R}}$. If
we want to emphasise that a function does not attain the values $-\infty$ and $+\infty$,
we call it a real function.
Instead of the notation $\{x \in X : f(x) > a\}$ we often use the abbreviated version
$\{f > a\}$.
The symbol $c_A$ denotes the indicator function of a set $A \subset X$, i.e. the function
$$c_A(x) = \begin{cases} 1 & \text{for } x \in A, \\ 0 & \text{for } x \in X \setminus A. \end{cases}$$
We use $f_j \rightrightarrows f$ to denote the uniform convergence of a sequence of functions.

A. Measures and Measurable Functions


1. The Lebesgue Measure
Throughout history, people have been concerned with the problem of measuring lengths, areas
and volumes. In mathematical formulation the task was, for a given set $A$, to
determine its size (measure) $\mu A$. It was required that the volume of a cube or
the area of a rectangle or a circle should agree with the well-known formulae. It
was also clear by intuition that this measure should be nonnegative and additive, i.e.
it should satisfy the equality
$$\mu\Bigl(\bigcup_j A_j\Bigr) = \sum_j \mu A_j$$
provided $\{A_j\}$ is a finite disjoint collection of sets. For a successful development
of the theory a further condition was imposed: the above equality was required
to hold even for countable disjoint collections of sets. Moreover, an effort was
made to assign a measure to as many sets as possible.
Now, we are going to show how to proceed on the real line. The same approach
will be used later in the Euclidean space $\mathbb{R}^n$ where the proofs will be given.
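1.1. Outer Lebesgue Measure. For an arbitrary set $A \subset \mathbb{R}$, define
$$\lambda^* A := \inf\Bigl\{ \sum_{i=1}^{\infty} (b_i - a_i) : \bigcup_{i=1}^{\infty} (a_i, b_i) \supset A \Bigr\}.$$
The value $\lambda^* A$ (which can also be $+\infty$) is called the outer Lebesgue measure of the
set $A$.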
1.2. Properties of the Outer Lebesgue Measure. One can see immediately
that $\lambda^* A \le \lambda^* B$ if $A \subset B$ and that the measure of a singleton is $0$, and without
much effort it becomes clear that $\lambda^* I$ is the length of $I$ when $I$ is an interval of any
type (see Exercise 1.6). Then it is relatively easy to prove that the outer Lebesgue
measure is translation invariant: If $A \subset \mathbb{R}$ and $x \in \mathbb{R}$, then $\lambda^* A = \lambda^*(x + A)$.
Another important property is the $\sigma$-subadditivity:
$$\lambda^*\Bigl(\bigcup_{j=1}^{\infty} A_j\Bigr) \le \sum_{j=1}^{\infty} \lambda^* A_j.$$

In mathematical terminology, the prefix $\sigma$ usually relates to countable unions and $\delta$ to countable
intersections.

The question of whether $\lambda^*$ is an additive set function has a negative answer:
There are disjoint sets $A$, $B$ with
$$\lambda^*(A \cup B) < \lambda^* A + \lambda^* B$$
(cf. 1.8), and we need to find a family of sets (as large as possible) on which the
measure is additive. This task will be solved later in Chapter 4 in a much more
general case. Now we just briefly indicate one of its possible solutions in the case of
the Lebesgue measure.

1.3. Lebesgue Measurable Sets. Let $A$ be a subset of a bounded interval $I$.
Defining the inner measure $\lambda_* A = \lambda^* I - \lambda^*(I \setminus A)$, it is natural to investigate
the collection of sets for which $\lambda_* A = \lambda^* A$ (cf. Exercise 1.7). This leads to the
following definition.
We say that a set $A \subset \mathbb{R}$ is (Lebesgue) measurable if $\lambda^* I = \lambda^*(A \cap I) + \lambda^*(I \setminus A)$
for every bounded interval $I \subset \mathbb{R}$. The collection of all measurable sets on
$\mathbb{R}$ will be denoted by $\mathcal{M}$. Not every set is measurable as will be seen in 1.8.
The set function $M \mapsto \lambda^* M$, $M \in \mathcal{M}$, is denoted by $\lambda$ and called the Lebesgue
measure. Thus, on measurable sets, the set functions $\lambda$ and $\lambda^*$ coincide but for
nonmeasurable ones only $\lambda^*$ is defined.
Another important property of the measure $\lambda$ is contained in the following
theorem which is now presented without proof.
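1.4. Theorem. (a) If $M_1, M_2, \dots$ are elements of $\mathcal{M}$, then also $M_1 \setminus M_2$,
$\bigcup_n M_n$ and $\bigcap_n M_n$ are elements of $\mathcal{M}$. If, in addition, the sets $M_n$ are pairwise
disjoint, then
$$\lambda\Bigl(\bigcup_n M_n\Bigr) = \sum_n \lambda M_n.$$
(b) Intervals of any type are in $\mathcal{M}$.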


1.5. Remark. The ingenuity of Lebesgue's approach to the measure consists in considering
the countable covers of a set $A$ with intervals. If in the definition of $\lambda^* A$ we consider only finite
covers, we get the notion of the so-called Jordan-Peano content. In modern analysis this notion is
far from being as important as the Lebesgue measure.
1.6. Exercise. If $I \subset \mathbb{R}$ is an interval (of any type), show that $\lambda^* I$ is its length.
Hint. It is sufficient to consider the case $I = [a, b]$. Clearly $\lambda^*[a, b] \le b - a$ (since $[a, b] \subset
(a - \varepsilon, b + \varepsilon)$). Suppose $\bigcup_{i=1}^{\infty} (a_i, b_i) \supset [a, b]$. A compactness argument yields the existence of an
index $n$ satisfying $\bigcup_{i=1}^{n} (a_i, b_i) \supset [a, b]$. Using induction (with respect to $n$) it can be shown that
$$b - a \le \sum_{i=1}^{n} (b_i - a_i).$$

1.7. Exercise. For every bounded set $A \subset \mathbb{R}$, define
$$\lambda_* A := \lambda^* I - \lambda^*(I \setminus A)$$
where $I$ is a bounded interval containing $A$. Show that:
(a) the value of $\lambda_* A$ does not depend on the choice of $I$;
(b) a bounded set $A \subset \mathbb{R}$ is measurable if and only if $\lambda_* A = \lambda^* A$;
(c) a set $M \subset \mathbb{R}$ is measurable if and only if its intersection with each bounded interval is
measurable.

In the next part of this chapter we introduce some significant sets on the real
line.
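1.8. A Nonmeasurable Set. Now we prove the existence of a nonmeasurable subset of $\mathbb{R}$
and consequently prove that the outer Lebesgue measure cannot be additive.
Set $x \sim y$ if $x - y$ is a rational number. It is easy to see that $\sim$ is an equivalence relation
on $\mathbb{R}$. Therefore $\mathbb{R}$ splits into an uncountable collection of pairwise disjoint classes. A set $V$
belongs to this collection if and only if $V = x + \mathbb{Q}$ for some $x \in \mathbb{R}$. By the axiom of choice,
there exists a set $E \subset (0, 1)$ that shares exactly one point with each set $V$ of this collection. We show that
$E$ is not in $\mathcal{M}$.
Let $\{q_n\}$ be a sequence containing all rational numbers from the interval $(-1, +1)$. It is not
very difficult to show that the sets $E_n := q_n + E$ are pairwise disjoint and that
$$(0, 1) \subset \bigcup_n E_n \subset (-1, 2).$$
Assuming that $E \in \mathcal{M}$, then also $E_n \in \mathcal{M}$ and Theorem 1.4 gives
$$\lambda\Bigl(\bigcup_n E_n\Bigr) = \sum_n \lambda E_n.$$
By translation invariance, $\lambda E_n = \lambda E$ for every $n$. Distinguishing the two cases $\lambda E = 0$ and $\lambda E > 0$ we easily obtain a contradiction:
in the first case the union would have measure $0$ although it contains $(0, 1)$, in the second it would have infinite measure although it is contained in $(-1, 2)$.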


1.9. Remarks. 1. The proof of the existence of a nonmeasurable set is not a constructive
one (it uses the axiom of choice for an uncountable collection of sets). We return to the topic
of nonmeasurable sets in Notes 1.22.
2. By a simple argument, an even stronger proposition can be proved: Any measurable set
$M \subset \mathbb{R}$ of positive measure contains a nonmeasurable subset. It is sufficient to realize that
$M = \bigcup_{q \in \mathbb{Q}} M \cap (E + q)$ where $E$ is the nonmeasurable set from 1.8 and that any measurable
subset of $E$ is of zero (Lebesgue) measure.
3. Van Vleck [1908] constructed a set $E \subset [0, 1]$ for which $\lambda^* E = 1$ and $\lambda_* E = 0$.
1.10. Exercise. Show that every countable set $S$ is of measure zero.
Hint. Consider covers $\bigcup_{j=1}^{\infty} (r_j - \varepsilon 2^{-j}, r_j + \varepsilon 2^{-j})$ where $\{r_j\}$ is a sequence of all elements of the
set $S$. The assertion also follows from Theorem 1.4 if you realize that singletons have measure
zero.
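The estimate behind the hint can be checked numerically. The following is only an illustrative sketch in Python (not part of the original text): the helper names `cover` and `total_length`, the particular finite list of rationals and the value of `eps` are assumptions made for the example. It builds the intervals from the hint for finitely many points and confirms that their total length stays below $2\varepsilon$.

```python
from fractions import Fraction

def cover(points, eps):
    """Intervals (r_j - eps*2^-j, r_j + eps*2^-j) for j = 1, 2, ... (hypothetical helper)."""
    return [(r - eps / 2**j, r + eps / 2**j) for j, r in enumerate(points, start=1)]

def total_length(intervals):
    """Sum of the lengths b - a of the given intervals."""
    return sum(b - a for a, b in intervals)

# A finite initial segment of an enumeration (with repetitions) of rationals in [0, 1].
points = [Fraction(p, q) for q in range(1, 8) for p in range(q + 1)]
eps = Fraction(1, 1000)
intervals = cover(points, eps)

# Every point lies in its own interval, and the total length is
# 2*eps*(1 - 2^-N) < 2*eps, where N is the number of points.
print(total_length(intervals) < 2 * eps)
```

Since the bound $2\varepsilon$ does not depend on the number of points, letting $\varepsilon \to 0$ gives outer measure zero for the whole countable set.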
1.11. Examples of Sets of Measure Zero.
(a) The set $\mathbb{Q}$ of all rational numbers is
countable, thus by Exercise 1.10 it has Lebesgue measure zero.
(b) It can be seen from the hint to the exercise that for every $k \in \mathbb{N}$ there is an open set
$G_k$ such that $\mathbb{Q} \subset G_k$ and $\lambda G_k \le 1/k$. The set $\bigcap_{k=1}^{\infty} G_k$ also has Lebesgue measure zero, it is
dense and uncountable (even residual).


1.12. Cantor Ternary Set. Consider the sequence $\{\mathcal{K}_n\}$ of finite collections of intervals
defined in the following way: $\mathcal{K}_0 = \{[0, 1]\}$, $\mathcal{K}_1 = \{[0, \tfrac13], [\tfrac23, 1]\}$. In each step we construct $\mathcal{K}_n$
from $\mathcal{K}_{n-1}$ as the collection of all closed intervals which are the left or right third of an interval
from the collection $\mathcal{K}_{n-1}$ (the middle thirds are omitted). Then $\mathcal{K}_n$ is a collection of $2^n$ disjoint
closed intervals, each of them of length $3^{-n}$. Let $K_n$ denote the union of the collection $\mathcal{K}_n$. The
Cantor ternary set¹ $C$ is defined as $\bigcap_n K_n$. It is not difficult to verify that $C$ consists precisely
of points of the form
$$\sum_{i=1}^{\infty} a_i 3^{-i}$$
where each $a_i$ is $0$ or $2$. Roughly speaking, in the Cantor set
there are exactly those points of the interval $[0, 1]$ which have a ternary expansion not containing the
digit 1. The Cantor set has the following properties:
(a) $C$ is a compact set without isolated points;
(b) $C$ is a nowhere dense (and totally disconnected) set;
(c) $C$ is an uncountable set;
(d) the Lebesgue measure of $C$ is zero.
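Property (d) can be made concrete by tracking the collections of intervals step by step. The sketch below is an illustration only; the function names `next_collection`, `cantor_collection` and `total_length` are hypothetical and not taken from the text. It uses exact rational arithmetic to build the $n$th collection and to compute the total length of its union, which is $(2/3)^n$ and hence tends to $0$.

```python
from fractions import Fraction

def next_collection(intervals):
    """Replace every closed interval [a, b] by its left and right thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

def cantor_collection(n):
    """Return the n-th collection of closed intervals, starting from {[0, 1]}."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = next_collection(intervals)
    return intervals

def total_length(intervals):
    """Total length of a collection of pairwise disjoint intervals."""
    return sum(b - a for a, b in intervals)

for n in range(6):
    collection = cantor_collection(n)
    print(n, len(collection), total_length(collection))
```

Because $C$ is contained in every union $K_n$, monotonicity of the measure gives $\lambda C \le (2/3)^n$ for all $n$.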
1.13. Discontinua of a Positive Measure. If we construct a set $D \subset [0, 1]$ like the Cantor
set except that in the $n$th step we always omit intervals of length $\varepsilon 3^{-n}$ where $\varepsilon \in (0, 1)$ (note that their centres
are not the same as those in the construction of the Cantor set), we get a closed nowhere dense
set, for which $\lambda D = 1 - \varepsilon$. Sets having this property are called the discontinua of a positive
measure. Another construction: If $G$ is an open subset of the interval $(0, 1)$, containing all
rational points of this interval and $\lambda G = \varepsilon < 1$, then $[0, 1] \setminus G$ is a discontinuum of measure $1 - \varepsilon$.
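The value of the measure of such a discontinuum can be verified by bookkeeping the omitted lengths. The following Python sketch is an assumption-laden illustration: it removes, in the $n$th step, the open interval of length $\varepsilon 3^{-n}$ centred in each remaining closed interval (one natural reading of the construction), and the names `remove_middles` and `fat_cantor_collection` are hypothetical. After $n$ steps the remaining length is $1 - \varepsilon(1 - (2/3)^n)$, which decreases to $1 - \varepsilon$.

```python
from fractions import Fraction

def remove_middles(intervals, gap):
    """From each closed interval [a, b] remove the centred open interval of length gap."""
    result = []
    for a, b in intervals:
        mid = (a + b) / 2
        result.append((a, mid - gap / 2))
        result.append((mid + gap / 2, b))
    return result

def fat_cantor_collection(eps, n):
    """Intervals remaining after n steps; step k removes gaps of length eps / 3**k."""
    intervals = [(Fraction(0), Fraction(1))]
    for step in range(1, n + 1):
        intervals = remove_middles(intervals, eps / 3**step)
    return intervals

def total_length(intervals):
    """Total length of a collection of pairwise disjoint intervals."""
    return sum(b - a for a, b in intervals)

eps = Fraction(1, 2)
for n in range(6):
    print(n, total_length(fat_cantor_collection(eps, n)))
```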
¹ Sometimes also called the Cantor discontinuum.

1.14. Exercise. Prove that there exists a non-Borel subset of the Cantor set and realize that
this set is Lebesgue measurable.
Hint. The cardinality argument shows that the set of all Borel subsets of the Cantor set has
cardinality of the continuum while the set of all its subsets has greater cardinality.
Instead of this, the following idea can be used. Define
$$\varphi(t) := \inf\{x \in [0, 1] : f(x) = t\}$$
where $f$ is the Cantor singular function from 23.1. Show that $\varphi$ is increasing on the interval $[0, 1]$,
and therefore it is a Borel function. Suppose $E$ is a nonmeasurable subset of $[0, 1]$, $B := \varphi(E)$.
Then $B$ (as a subset of the Cantor set) is a measurable set. But since $\varphi^{-1}(B) = E$ (and $\varphi$ is a
Borel function), $B$ cannot be a Borel set.

1.15. Lebesgue Measure on $\mathbb{R}^n$. In the same way as for $\mathbb{R}$, we introduce
the Lebesgue measure on $\mathbb{R}^n$. Recall that by an interval in $\mathbb{R}^n$ we understand
an arbitrary Cartesian product of $n$ one-dimensional intervals. If $I := (a_1, b_1) \times
\dots \times (a_n, b_n)$ is an open interval, we define its volume as
$$\operatorname{vol} I = (b_1 - a_1) \cdots (b_n - a_n).$$
In the same way we define $\operatorname{vol} I$ for intervals of other types. Given an arbitrary
set $A \subset \mathbb{R}^n$, define the outer Lebesgue measure of $A$ as the quantity
$$\lambda^* A = \inf\Bigl\{ \sum_{k=1}^{\infty} \operatorname{vol} I_k : \bigcup_{k=1}^{\infty} I_k \supset A,\ I_k \text{ is an open interval} \Bigr\}.$$
We say that a set $A \subset \mathbb{R}^n$ is measurable if $\lambda^* T = \lambda^*(T \cap A) + \lambda^*(T \setminus A)$ for
every set $T \subset \mathbb{R}^n$. (By analogy with the one-dimensional case we should require
this equality to hold just for bounded intervals $T$. We have chosen the present
definition in order to apply the general approach of Chapter 4. Soon we show
that there is no difference between these two definitions.) The symbol $\mathcal{M}$ again
denotes the collection of all measurable subsets of $\mathbb{R}^n$. For $M \in \mathcal{M}$ we denote by
$\lambda M := \lambda^* M$ the $n$-dimensional Lebesgue measure of the set $M$.
1.16. Theorem. If $\{A_j\}$ is a sequence of (arbitrary) sets of $\mathbb{R}^n$, then
$$\lambda^*\Bigl(\bigcup_{j=1}^{\infty} A_j\Bigr) \le \sum_{j=1}^{\infty} \lambda^* A_j.$$
Proof. The assertion follows from Theorem 4.3.


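1.17. Theorem. If $M_1, M_2, \dots$ are elements of $\mathcal{M}$, then also $M_1 \setminus M_2$, $\bigcup_n M_n$
and $\bigcap_n M_n$ are elements of $\mathcal{M}$. If, in addition, the sets $M_n$ are pairwise disjoint,
then
$$\lambda\Bigl(\bigcup_n M_n\Bigr) = \sum_n \lambda M_n.$$
Proof. The assertion follows from the general Theorem 4.5.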


Compare the following theorem with Exercise 1.6.

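1.18. Theorem. If $I \subset \mathbb{R}^n$ is a bounded interval, $I \subset \bigcup_j Q_j$ where $\{Q_j\}$ is a
sequence of open intervals, then
$$\operatorname{vol} I \le \sum_j \operatorname{vol} Q_j.$$
Thus the $n$-dimensional Lebesgue measure $\lambda I$ is equal to the volume $\operatorname{vol} I$.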


Proof. Suppose $J$ is a compact interval contained in $I$. There exists a $p$ such that
the intervals $\{Q_1, \dots, Q_p\}$ cover $J$. The interval $J$ can now be divided into a
finite number of non-overlapping $n$-dimensional intervals $\{J_i\}$ (distinct elements
of $\{J_i\}$ have disjoint interiors) in such a way that the interior of each interval $J_i$
is contained in some of the intervals $Q_j$. Then
$$\operatorname{vol} J = \sum_i \operatorname{vol} J_i \le \sum_{j=1}^{p} \operatorname{vol} Q_j \le \sum_{j=1}^{\infty} \operatorname{vol} Q_j.$$
Since the difference $\operatorname{vol} I - \operatorname{vol} J$ can be arbitrarily small, the assertion follows.
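1.19. Theorem. (a) Any open subset of $\mathbb{R}^n$ is measurable.
(b) If $\lambda^* A = 0$, then $A$ is measurable.
Proof. The proof of part (b) is obvious; we will prove (a). First we prove that each
interval $H$ which is a halfspace (e.g. of the form $(-\infty, c) \times \mathbb{R}^{n-1}$) is measurable.
Choose a test set $T$ with $\lambda^* T < \infty$, and $\varepsilon > 0$. There exist open intervals $\{Q_j\}$
with
$$\bigcup_j Q_j \supset T \quad\text{and}\quad \sum_j \operatorname{vol} Q_j < \lambda^* T + \varepsilon.$$
Now find open intervals $I_j$ and $J_j$ such that
$$I_j \cup J_j = Q_j, \quad Q_j \cap H \subset I_j, \quad Q_j \setminus H \subset J_j \quad\text{and}\quad \operatorname{vol} I_j + \operatorname{vol} J_j < \operatorname{vol} Q_j + 2^{-j}\varepsilon.$$
Then
$$\lambda^*(T \cap H) + \lambda^*(T \setminus H) \le \sum_j \operatorname{vol} I_j + \sum_j \operatorname{vol} J_j \le \lambda^* T + 2\varepsilon.$$
We proved the measurability of all intervals $H$ of the form of a halfspace. Now,
each open set can be expressed as a countable union of intervals and each interval
is a finite intersection of intervals which are halfspaces.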
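1.20. Theorem. If $A \subset \mathbb{R}^n$, then
$$\lambda^* A = \inf\{\lambda G : G \text{ open},\ G \supset A\}.$$
Proof. One inequality follows from the monotonicity of $\lambda^*$. Now if $\lambda^* A < \infty$ and
$\varepsilon > 0$, then there exist open intervals $I_j \subset \mathbb{R}^n$ such that
$$A \subset \bigcup_j I_j \quad\text{and}\quad \lambda\Bigl(\bigcup_j I_j\Bigr) \le \sum_j \operatorname{vol} I_j < \lambda^* A + \varepsilon.$$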

The reader should compare the following theorem and Exercise 15.19.

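1.21. Theorem. Given a set $M \subset \mathbb{R}^n$, the following are equivalent:
(i) $M$ is measurable;
(ii) for every bounded interval $I$, $\lambda I = \lambda^*(I \cap M) + \lambda^*(I \setminus M)$;
(iii) for every $\varepsilon > 0$ there exists an open set $G \supset M$ with $\lambda^*(G \setminus M) < \varepsilon$;
(iv) there exists a $G_\delta$-set $D \supset M$ such that $\lambda^*(D \setminus M) = 0$;
(v) there exist an $F_\sigma$-set $B_i$ and a $G_\delta$-set $B_e$ such that $B_i \subset M \subset B_e$ and
$\lambda(B_e \setminus B_i) = 0$.
Proof. The implication (i) $\Rightarrow$ (ii) is trivial. Assuming (ii), fix $\varepsilon > 0$ and denote
$I_k = (-k, k)^n$. By Theorem 1.20 we can find open sets $G_k$ and $H_k$ such that
$I_k \cap M \subset G_k$, $I_k \setminus M \subset H_k$, $\lambda G_k \le \lambda^*(I_k \cap M) + 2^{-k}\varepsilon$ and $\lambda H_k \le \lambda^*(I_k \setminus M) + 2^{-k}\varepsilon$.
We can assume that $G_k$ and $H_k$ are subsets of $I_k$. Then we have $G_k \setminus M \subset G_k \cap H_k$.
Using (ii) and the measurability of open sets we obtain
$$\lambda I_k + \lambda(G_k \cap H_k) = \lambda G_k + \lambda H_k \le \lambda^*(I_k \cap M) + \lambda^*(I_k \setminus M) + 2^{-k+1}\varepsilon \le \lambda I_k + 2^{-k+1}\varepsilon.$$
Set $G = \bigcup_k G_k$. Then
$$\lambda^*(G \setminus M) \le \sum_{k=1}^{\infty} \lambda(G_k \cap H_k) \le 2\varepsilon$$
so that (iii) holds. That (iii) implies (iv) is evident. It is not very difficult to
prove the implication (iv) $\Rightarrow$ (v). If $M$ satisfies (v), then $M = B_i \cup (M \setminus B_i)$
where the sets $B_i$ and $M \setminus B_i$ are measurable by Theorem 1.19 (each one for a
different reason), so that (v) $\Rightarrow$ (i).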
1.22. Notes. Originally, H. Lebesgue defined the outer measure on the real line using
countable covers formed by intervals, exactly as explained in the text. He defined measurability
as in Exercise 1.7.
At the end of the nineteenth century, various attempts to define the length or area of geometrical
figures appeared; in the works of G. Peano [1887] and C. Jordan [1892] even the measures of
more complicated sets are considered.
The existence of a Lebesgue nonmeasurable set is very closely connected to the axiom of
choice (for uncountable collections of sets) and the assertion that such sets exist was first proved
by G. Vitali [*1905]. Solovay's result [1970] says that there exist models of set theory (of
course not satisfying the axiom of choice) in which every subset of real numbers is Lebesgue measurable. The existence of a nonmeasurable set can be proved (assuming various set-theoretical conditions)
in other ways as well. Constructions of Bernstein's sets (still assuming the axiom of choice) as
examples of nonmeasurable sets are also interesting. Another construction of a nonmeasurable
set (the axiom of choice again) based on results of graph theory comes from R. Thomas
[1985]. Using nonstandard methods, it is possible to prove the existence of a nonmeasurable set
assuming the existence of ultrafilters (a weaker form of the axiom of choice; cf. M. Davis [*1977]).
Recently, M. Foreman and F. Wehrung [1991] proved that the existence of a nonmeasurable set
follows from the Hahn-Banach Theorem (which is again a weaker assumption than the axiom
of choice).
Let us note that the Lebesgue measure can be extended to a translation invariant measure defined on a wider $\sigma$-algebra than the collection of all Lebesgue measurable sets. The
construction can be found e.g. in S. Kakutani and J.C. Oxtoby [1950]. However, the Lebesgue
measure cannot be extended in a reasonable way to the collection of all subsets of $\mathbb{R}^n$.
It is interesting that in $\mathbb{R}$ or $\mathbb{R}^2$ there exist finitely additive extensions of the Lebesgue
measure to the collection of all subsets which can also be invariant with respect to translations
and rotations. This was first proved by S. Banach [1923]. However, this result cannot be
transferred to spaces of higher dimensions as follows from the famous result of S. Banach and
A. Tarski [1924]:
If $U$ and $V$ are arbitrary (!) bounded and open sets in the space $\mathbb{R}^n$, $n \ge 3$, then there exist
sets $E_1, \dots, E_k$ and $F_1, \dots, F_k$ such that $E_i \cap E_j = \emptyset = F_i \cap F_j$ for $i \ne j$, $U = \bigcup_i E_i$, $V = \bigcup_i F_i$
and $E_j$ are isometric copies of $F_j$.
In this theorem, which is known as the Banach-Tarski paradox, the sets $E_i$ and
$F_i$ cannot in general all be measurable; realize that $U$, $V$ can be of different measures. More information is
contained in S. Wagon [*1985].

2. Abstract Measures
In this chapter we study an abstract notion of measure which serves as a basis
for modern integration theory. Also in probability theory, the notion of measure
(termed a probability there) plays a crucial role. Among many fields of analysis
which employ measures in an essential way, let us mention e.g. functional analysis,
theory of function spaces and theory of distributions, or mathematical modelling
of physical quantities.
So far our only nontrivial example is the Lebesgue measure. Further
important examples of measures will be introduced later.
Remember that the Lebesgue measure is not defined on the collection of all
subsets of $\mathbb{R}$ but on its subcollection which is closed under countable operations.
We start with the following definition.
2.1. $\sigma$-algebras. A collection $\mathcal{S}$ of subsets of a given set $X$ is called a $\sigma$-algebra
if
(a) $X \in \mathcal{S}$;
(b) if $A \in \mathcal{S}$, then $X \setminus A \in \mathcal{S}$;
(c) if $A_n \in \mathcal{S}$, then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{S}$.
The pair $(X, \mathcal{S})$ is called a measurable space.

Clearly every $\sigma$-algebra is also closed under countable intersections, under differences and contains the empty set.
Not every collection of sets is a $\sigma$-algebra. However, if $\mathcal{T}$ is an arbitrary family
of subsets of $X$, then there exists the smallest $\sigma$-algebra $\sigma(\mathcal{T})$ which contains $\mathcal{T}$.
Such a $\sigma$-algebra is simply the intersection of all $\sigma$-algebras (in $X$) which contain
$\mathcal{T}$. It surely exists, since there is at least one such $\sigma$-algebra (the $\sigma$-algebra
$\mathcal{P}(X)$ of all subsets of $X$) and the intersection of any collection of $\sigma$-algebras is
again a $\sigma$-algebra. The collection $\sigma(\mathcal{T})$ is called the $\sigma$-algebra generated by $\mathcal{T}$.
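On a finite set the generated $\sigma$-algebra can be computed directly, which may help to make the abstract definition tangible. The sketch below is only an illustration under the assumption that $X$ is finite (so that countable unions reduce to finite ones); the function name `generated_sigma_algebra` is hypothetical. Instead of intersecting all $\sigma$-algebras containing the family, it closes the family under complements and pairwise unions until nothing new appears, which yields the same collection in the finite case.

```python
from itertools import combinations

def generated_sigma_algebra(X, family):
    """Smallest sigma-algebra on the finite set X containing every set in family."""
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(A) for A in family}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        # Close under complements.
        for A in current:
            if X - A not in sets:
                sets.add(X - A)
                changed = True
        # Close under pairwise (hence all finite) unions.
        for A, B in combinations(current, 2):
            if A | B not in sets:
                sets.add(A | B)
                changed = True
    return sets

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(A) for A in sigma))
```

Closure under intersections need not be enforced separately, since it follows from complements and unions by De Morgan's laws.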
2.2. Examples. One of the most important examples is the $\sigma$-algebra $\mathcal{M}$ of all Lebesgue
measurable sets on the real line. Another important class of examples is provided by the Borel $\sigma$-algebras
from 2.3. For illustration, we add a few simple examples. Suppose $X$ is an arbitrary set. Then
(a) $\{\emptyset, X\}$ is a $\sigma$-algebra;
(b) the collection $\mathcal{P}(X)$ of all subsets of $X$ is a $\sigma$-algebra;
(c) $\{A \subset X : A \text{ is countable or } X \setminus A \text{ is countable}\}$ is a $\sigma$-algebra.

2.3. Borel Sets. Let $P$ be a topological space. The $\sigma$-algebra $\mathcal{B}(P)$ generated
by the family of all open subsets of $P$ is called the Borel $\sigma$-algebra of $P$; its
elements are called Borel sets. The Borel $\sigma$-algebra $\mathcal{B}(P)$ contains all closed
sets, all countable intersections of open sets (these sets are called $G_\delta$ sets), all
countable unions of closed sets ($F_\sigma$ sets), all countable unions of $G_\delta$ sets ($G_{\delta\sigma}$
sets), all countable intersections of $F_\sigma$ sets ($F_{\sigma\delta}$) and so on. Let us note that for
the complete description of all possible types we would have to use (in nontrivial
cases) all countable ordinal numbers.
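2.4. Measures. Let $\mathcal{S}$ be a collection of subsets of a set $X$. A nonnegative set
function $\mu : \mathcal{S} \to [0, \infty]$ is called a measure if
(a) $\mathcal{S}$ is a $\sigma$-algebra;
(b) $\mu\emptyset = 0$;
(c) for each sequence $\{A_n\}$ of pairwise disjoint sets from $\mathcal{S}$, $\mu\bigl(\bigcup_n A_n\bigr) = \sum_n \mu A_n$.
The triplet $(X, \mathcal{S}, \mu)$ is termed a measure space.
From (b) and (c) it immediately follows that each measure is monotone (if
$A, B \in \mathcal{S}$, $A \subset B$, then $\mu A \le \mu B$); the property (c) is also called the $\sigma$-additivity
of a measure.
We say that a measure $\mu$ is finite if $\mu X < +\infty$, $\sigma$-finite if there exist sets $M_n \in \mathcal{S}$
such that $\mu M_n < +\infty$ and $X = \bigcup_{n=1}^{\infty} M_n$. If $\mu X = 1$, then we say that $\mu$ is a
probability measure. A measure $\mu$ is said to be complete if whenever $B \in \mathcal{S}$ is a
$\mu$-null set and $A \subset B$, then also $A \in \mathcal{S}$ (and $\mu A = 0$).
If $\mu$ is a measure on $X$ and $E \in \mathcal{S}$, we define $(\mu \lfloor E)(A) = \mu(A \cap E)$ for $A \in \mathcal{S}$.
A related notion is the restriction $\mu|_A$ of a measure $\mu$ to the set $A \in \mathcal{S}$: If
we denote by $\mathcal{S}_A$ the $\sigma$-algebra $\{M \in \mathcal{S} : M \subset A\}$ of subsets of $A$, we define
$\mu|_A(M) = \mu(M)$ for $M \in \mathcal{S}_A$. Finally, if $\mathcal{T} \subset \mathcal{S}$ is a $\sigma$-algebra of subsets of $X$,
then the symbol $\mu|_{\mathcal{T}}$ denotes the measure $E \mapsto \mu E$, $E \in \mathcal{T}$.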
2.5. Examples. (a) The Lebesgue measure on the collection of all Lebesgue measurable sets
in $\mathbb{R}^n$. This measure is complete, $\sigma$-finite, but not finite.
(b) Counting measure. Let $X$ be an arbitrary set, $\mathcal{S} = \mathcal{P}(X)$ the collection of all subsets
of $X$ and
$$\mu A = \begin{cases} \text{the number of elements of } A & \text{if } A \text{ is finite,} \\ +\infty & \text{if } A \text{ is infinite.} \end{cases}$$
The counting measure is complete; it is $\sigma$-finite if and only if $X$ is countable, and finite just
when $X$ is finite.
(c) The Dirac measure. Again, let $X$ be an arbitrary set, $x \in X$, $\mathcal{S} = \mathcal{P}(X)$. We define
$$\mu A = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \in X \setminus A. \end{cases}$$
The measure $\mu$ is called the Dirac measure at $x$ and it is denoted by $\varepsilon_x$. The Dirac measure is
a complete probability measure.
(d) Trivial measures. If $\mathcal{S}$ is an arbitrary $\sigma$-algebra of subsets of $X$ and $\mu A = 0$ for all
$A \in \mathcal{S}$, then $\mu$ is an example of a finite measure which is not complete provided $\mathcal{S} \ne \mathcal{P}(X)$.
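For finite sets, the counting measure and the Dirac measure are easy to experiment with. The following Python sketch is purely illustrative; the names `counting_measure`, `dirac_measure` and `is_additive_on` are hypothetical, and only finite sets are handled (so the value $+\infty$ never occurs). It checks the additivity condition from 2.4(c) on a finite disjoint family.

```python
def counting_measure(A):
    """Counting measure of a finite set A (infinite sets are outside this sketch)."""
    return len(A)

def dirac_measure(x):
    """Return the set function A -> 1 if x belongs to A, else 0."""
    return lambda A: 1 if x in A else 0

def is_additive_on(mu, pieces):
    """Check mu(union of pieces) == sum of mu(piece); pieces are assumed pairwise disjoint."""
    union = set().union(*pieces)
    return mu(union) == sum(mu(P) for P in pieces)

pieces = [{1, 2}, {3}, {4, 5, 6}]
print(is_additive_on(counting_measure, pieces))
print(is_additive_on(dirac_measure(3), pieces))
```

If the pieces overlap, the equality generally fails for the counting measure, which is why disjointness is required in the definition.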

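2.6. Properties of Measures. Let $(X, \mathcal{S}, \mu)$ be a measure space. Then
the following propositions hold:
(a) if $A_1, A_2, \dots \in \mathcal{S}$, $A_1 \subset A_2 \subset \dots$, then $\mu\bigl(\bigcup_n A_n\bigr) = \lim \mu A_n$;
(b) if $A_1, A_2, \dots \in \mathcal{S}$, $A_1 \supset A_2 \supset \dots$, and $\mu A_1 < \infty$, then $\mu\bigl(\bigcap_n A_n\bigr) = \lim \mu A_n$;
(c) if $A_1, A_2, \dots \in \mathcal{S}$, then $\mu\bigl(\bigcup_n A_n\bigr) \le \sum_n \mu A_n$.
Proof. (a) Since $A_1, A_2 \setminus A_1, A_3 \setminus A_2, \dots$ are pairwise disjoint, we get
$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \mu\Bigl(A_1 \cup \bigcup_{n=1}^{\infty} (A_{n+1} \setminus A_n)\Bigr) = \mu A_1 + \sum_{n=1}^{\infty} \mu(A_{n+1} \setminus A_n)$$
$$= \lim_{k \to \infty} \Bigl(\mu A_1 + \sum_{n=1}^{k-1} \mu(A_{n+1} \setminus A_n)\Bigr) = \lim_{k \to \infty} \mu A_k.$$
(b) By (a), we obtain
$$\mu\Bigl(\bigcap_{n=1}^{\infty} A_n\Bigr) = \mu A_1 - \mu\Bigl(A_1 \setminus \bigcap_{n=1}^{\infty} A_n\Bigr) = \mu A_1 - \mu\Bigl(\bigcup_{n=1}^{\infty} (A_1 \setminus A_n)\Bigr)$$
$$= \mu A_1 - \lim_{n \to \infty} \mu(A_1 \setminus A_n) = \lim_{n \to \infty} \mu A_n.$$
(c) It is sufficient to consider the sequence
$$B_1 = A_1, \quad B_2 = A_2 \setminus A_1, \quad B_3 = A_3 \setminus (A_1 \cup A_2), \ \dots,$$
notice that $\bigcup_n B_n = \bigcup_n A_n$ and that the sequence $\{B_n\}$ is pairwise disjoint.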
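The disjointification used in part (c) is simple enough to try out on finite sets. The sketch below is an illustration only (the function name `disjointify` is hypothetical, and the counting measure on finite sets plays the role of $\mu$): it forms $B_n$ as $A_n$ with all earlier sets removed and checks that the $B_n$ have the same union as the $A_n$.

```python
def disjointify(sets):
    """Return B_1 = A_1 and B_n = A_n minus (A_1 union ... union A_{n-1})."""
    seen = set()
    result = []
    for A in sets:
        result.append(set(A) - seen)
        seen |= set(A)
    return result

A = [{1, 2}, {2, 3}, {1, 3, 4}]
B = disjointify(A)
union = set().union(*A)

print(B)
# Counting measure: mu(union) equals the sum over the disjoint B_n
# and is at most the sum over the original A_n.
print(len(union), sum(len(b) for b in B), sum(len(a) for a in A))
```

Since $B_n \subset A_n$, monotonicity gives $\mu B_n \le \mu A_n$, and $\sigma$-additivity applied to $\{B_n\}$ yields the inequality in (c).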

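2.7. Completion of Measures. Now we return to the notion of completeness
of a measure. We show that every measure space can be extended to a complete
measure space.
Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\mathcal{N}$ denote the collection of all sets
$A \subset X$ for which there is a $B \in \mathcal{S}$ such that $\mu B = 0$ and $A \subset B$. Further, let $\overline{\mathcal{S}}$
denote the collection of all sets of the form $M \cup N$ where $M \in \mathcal{S}$ and $N \in \mathcal{N}$.
We define a set function $\overline{\mu}$ on $\overline{\mathcal{S}}$ by
$$\overline{\mu}(M \cup N) = \mu M, \qquad M \in \mathcal{S},\ N \in \mathcal{N}.$$
Clearly the value of $\overline{\mu} E$ does not depend on the choice of $M$ and $N$. The set
function $\overline{\mu}$ on $\overline{\mathcal{S}}$ is called the completion of $\mu$.
2.8. Theorem. $\overline{\mathcal{S}}$ is a $\sigma$-algebra containing $\mathcal{S}$ and $\overline{\mu}$ is a complete measure
on $\overline{\mathcal{S}}$ which coincides with $\mu$ on $\mathcal{S}$.
Proof. Obviously, $\mathcal{S} \subset \overline{\mathcal{S}}$. Since both $\mathcal{S}$ and $\mathcal{N}$ are closed under countable
unions, the same is true for $\overline{\mathcal{S}}$. If $E \in \overline{\mathcal{S}}$, there are $M \in \mathcal{S}$, $N \in \mathcal{N}$ and $B \in \mathcal{S}$
so that $N \subset B$, $\mu B = 0$ and $E = M \cup N$. Then $X \setminus E = (X \setminus (M \cup B)) \cup (B \setminus E) \in \overline{\mathcal{S}}$,
because $X \setminus (M \cup B) \in \mathcal{S}$ and $B \setminus E \in \mathcal{N}$. It is easy to verify that $\overline{\mu}$ is a complete measure on $\overline{\mathcal{S}}$ and $\overline{\mu} = \mu$ on $\mathcal{S}$.
2.9. Remark. Note that there is a variety of extensions of a measure to a complete measure.
The completion described above is uniquely determined and in some sense minimal (cf. Exercise
2.13).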
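The construction of the completion can be carried out by brute force on a small finite measure space. The following Python sketch is an illustration under stated assumptions: the space $X = \{0, 1, 2\}$, the $\sigma$-algebra and the values of the measure are made up for the example, and the helper names `subsets` and `completion` are hypothetical. It forms all sets $M \cup N$ with $M$ in the $\sigma$-algebra and $N$ a subset of a null set, and assigns them the measure of $M$.

```python
from itertools import combinations

def subsets(A):
    """All subsets of the finite set A, as frozensets."""
    items = list(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def completion(sigma_algebra, mu):
    """Return a dict E -> completed measure of E.

    mu is a dict mapping every set of sigma_algebra (a frozenset) to its measure.
    """
    # Collection N: all subsets of null sets of the sigma-algebra.
    null_subsets = {N for B in sigma_algebra if mu[B] == 0 for N in subsets(B)}
    mu_bar = {}
    for M in sigma_algebra:
        for N in null_subsets:
            mu_bar[M | N] = mu[M]
    return mu_bar

# Hypothetical example: {0, 1} is a null set, the point 2 carries mass 5.
S = [frozenset(), frozenset({0, 1}), frozenset({2}), frozenset({0, 1, 2})]
mu = {S[0]: 0, S[1]: 0, S[2]: 5, S[3]: 5}
mu_bar = completion(S, mu)
for E in sorted(mu_bar, key=lambda E: (len(E), sorted(E))):
    print(sorted(E), mu_bar[E])
```

In this example the completed $\sigma$-algebra consists of all eight subsets of $X$, and the completed measure of a set depends only on whether it contains the point $2$.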

2.10. Exercise. Let $f$ be a nonnegative function on a set $X$ and $\mathcal{S}$ a $\sigma$-algebra on $X$. For
$B \in \mathcal{S}$ we define
$$\mu B := \sum_{x \in B} f(x).$$
(Recall that by definition
$$\sum_{x \in B} f(x) = \sup\Bigl\{ \sum_{x \in K} f(x) : K \subset B,\ K \text{ finite} \Bigr\}.)$$
Show that $\mu$ is a measure on $\mathcal{S}$ (so-called weighted counting measure).
(a) In the particular case $\mathcal{S} = \mathcal{P}(X)$ and $f = 1$ on $X$ we get the counting measure.
(b) If $\mathcal{S} = \mathcal{P}(X)$, $z \in X$ and $f = c_{\{z\}}$, then $\mu$ is the Dirac measure at $z$.

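2.11. Exercise. (a) The trivial measure on $\mathcal{S} = \{\emptyset, \mathbb{R}\}$ is not complete. What is its
completion?
(b) The Lebesgue measure on $\mathbb{R}$ considered on the $\sigma$-algebra of Borel sets (see Exercise
1.14.a) is not complete.
(c) Suppose $X = \{1, 2, 3, 4\}$, $\mathcal{S} = \{\emptyset, \{1, 2\}, \{3, 4\}, X\}$. Let $\mu$ be a measure on $\mathcal{P}(X)$ such
that
$$\mu\{1\} = \mu\{2\} = 0, \qquad \mu\{3\} = \mu\{4\} = 1.$$
If $\nu = \mu|_{\mathcal{S}}$, then $\nu$ is not complete. If
$$\mathcal{T} = \mathcal{S} \cup \{\{1\}, \{2\}, \{1, 3, 4\}, \{2, 3, 4\}\}, \qquad \tau = \mu|_{\mathcal{T}},$$
then $\tau$ is complete. What is the completion of $\nu$?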


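2.12. Exercise. Prove that the completion of the Lebesgue measure considered on the
$\sigma$-algebra of all Borel sets in $\mathbb{R}$ is the Lebesgue measure (on the collection of all Lebesgue
measurable sets; cf. Theorem 26.1).
2.13. Exercise. Suppose $(X, \mathcal{T}, \nu)$ is the completion of $(X, \mathcal{S}, \mu)$. If $\mu_1$ is a complete measure
on a $\sigma$-algebra $\mathcal{S}_1$ such that $\mathcal{S} \subset \mathcal{S}_1$ and $\mu = \mu_1|_{\mathcal{S}}$, show that $\mathcal{T} \subset \mathcal{S}_1$ and $\nu = \mu_1|_{\mathcal{T}}$.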
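2.14. Exercise (Borel-Cantelli lemma). Let $(X, \mathcal{S}, \mu)$ be a measure space and $A_n \in \mathcal{S}$. If
$\sum_{n=1}^{\infty} \mu A_n < +\infty$, then $\mu(\limsup A_n) = 0$ (we define $\limsup A_n := \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$).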

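2.15. Exercise (Darboux property). Let $(X, \mathcal{S}, \mu)$ be a measure space. If $\mu$ does not have
an atom, show that $\{\mu A : A \in \mathcal{S}\} = [0, \mu X]$. (A set $A \in \mathcal{S}$ is called an atom if $\mu A > 0$ and for
each set $B \in \mathcal{S}$, $B \subset A$, either $\mu B = 0$ or $\mu(A \setminus B) = 0$.)
Hint. Fix $0 < \alpha < \mu X$ and set $\mathcal{A} = \{A \in \mathcal{S} : \mu A \ge \alpha\}$. There exists a set $C \in \mathcal{A}$ such that
$\mu A = \mu C$ for all $A \in \mathcal{A}$, $A \subset C$. In a similar way find $D \in \mathcal{S}$, $D \subset C$, $\mu D \le \alpha$ with the
following property: $\mu B = \mu D$ for each set $B \in \mathcal{S}$ for which $D \subset B \subset C$ and $\mu B \le \alpha$. Because
$C \setminus D$ cannot be an atom, it follows that $\mu C = \mu D = \alpha$.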
2.16. Notes. In [1895] and [*1898], E. Borel extended the length of intervals to a set function
(measure) defined on the collection of all Borel sets. However, H. Lebesgue was the first who
created the theory of the integral with the help of this measure. J. Radon in [1913] defined a
general notion of (Borel) measures in Euclidean spaces. A.N. Kolmogorov introduced in [*1933]
the axiomatic theory of probability measures.
The Darboux property of real-valued measures is a special version of the general Lyapunov
theorem (A. Lyapunov [1940]) which says that the range of a finite-dimensional nonatomic vector
measure is a convex compact set.

3. Measurable Functions
As mentioned in the first chapter, there are serious reasons why the Lebesgue measure is not defined on the collection of all subsets of $\mathbb{R}^n$. Likewise, one cannot expect a reasonable theory of integration on the class of all functions; it will be necessary to confine ourselves to some reasonable family of functions. This class of course should contain the indicator functions of all measurable sets and should be closed under all common (algebraic as well as limit) operations. As we will see, these requirements are satisfied by the following natural definition.
In what follows, we assume that $\mathcal{S}$ is a $\sigma$-algebra of subsets of a given set $X$.
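3.1. Measurable Functions. Suppose $D \in \mathcal{S}$. A function $f : D \to \overline{\mathbb{R}}$ is said to be $\mathcal{S}$-measurable on $D$ if $\{x \in D : f(x) > \alpha\} \in \mathcal{S}$ for each $\alpha \in \mathbb{R}$.
A complex function on $D$ is $\mathcal{S}$-measurable if its real and imaginary parts are $\mathcal{S}$-measurable.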
3.2. Examples.
(a) If $\mathcal{S}$ is the $\sigma$-algebra of all subsets of $X$, then every function on $X$ is $\mathcal{S}$-measurable.
(b) Only the constant functions are $\mathcal{S}$-measurable if $\mathcal{S} = \{\emptyset, X\}$.
(c) If $\mathcal{S}$ is the $\sigma$-algebra of all Borel sets of a topological space $X$, then $\mathcal{S}$-measurable functions are called shortly Borel functions on $X$.

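3.3. Remark. Since any $\sigma$-algebra is closed under complementation, a function $f$ is $\mathcal{S}$-measurable if and only if $\{f \le \alpha\} \in \mathcal{S}$ for each $\alpha \in \mathbb{R}$. Because
$$\{f \ge \alpha\} = \bigcap_{n=1}^{\infty} \Big\{ f > \alpha - \frac{1}{n} \Big\},$$
it is possible to replace the condition $\{f > \alpha\} \in \mathcal{S}$ by $\{f \ge \alpha\} \in \mathcal{S}$ (or by $\{f < \alpha\} \in \mathcal{S}$) in the definition of $\mathcal{S}$-measurability. Other equivalent conditions are stated in Exercise 3.6.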

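3.4. Theorem. Let $f$, $g$ be $\mathcal{S}$-measurable functions on $X$. Then
(a) $\{x \in X : f(x) < g(x)\} \in \mathcal{S}$;
(b) $f^{-1}(+\infty),\ f^{-1}(-\infty) \in \mathcal{S}$.
Proof. Since
$$\{f < g\} = \bigcup_{r \in \mathbb{Q}} \big( \{f < r\} \cap \{g > r\} \big), \qquad \{x \in X : f(x) = +\infty\} = \bigcap_{n=1}^{\infty} \{f > n\},$$
the assertion easily follows.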


3.5. Properties of $\mathcal{S}$-measurable Functions. Let $f$, $g$, $f_n$ be $\mathcal{S}$-measurable functions (with possibly different domains in $\mathcal{S}$), $\lambda \in \mathbb{R}$ and let $\varphi$ be a continuous function on an open set $G \subset \mathbb{R}$. Then the following functions are $\mathcal{S}$-measurable where defined (and their domains of definition are in $\mathcal{S}$):
(a) $\lambda f$, $f + g$, $\max(f, g)$, $\min(f, g)$, $|f|$, $fg$, $f/g$;
(b) $\sup f_n$, $\inf f_n$, $\limsup f_n$, $\liminf f_n$ and $\lim f_n$;
(c) $\varphi \circ f$.


Proof. We will just indicate the ideas of some of the proofs, leaving the others as an exercise for the reader. The assertion (c) is obvious.
(a) Suppose $\alpha \in \mathbb{R}$. Then $\{f + g > \alpha\} = \{f > \alpha - g\}$ (and the function $\alpha - g$ is $\mathcal{S}$-measurable), so Theorem 3.4(a) applies. The functions $|f|$, $f^2$ and $1/g$ are $\mathcal{S}$-measurable by (c). Then we can use the formulae
$$\max(f, g) = \frac{1}{2}\big(f + g + |f - g|\big), \qquad fg = \frac{1}{2}\big((f + g)^2 - f^2 - g^2\big).$$
(b) As a hint let us just note that
$$\{\sup f_n > \alpha\} = \bigcup_{n} \{f_n > \alpha\}.$$

3.6. Exercise.
(a) Show that a real-valued function $f$ is $\mathcal{S}$-measurable if and only if $f^{-1}(G) \in \mathcal{S}$ for each open set $G \subset \mathbb{R}$ or, if and only if $f^{-1}(B) \in \mathcal{S}$ for each Borel set $B \subset \mathbb{R}$.
Hint. Since each open set $G \subset \mathbb{R}$ can be expressed in the form $G = \bigcup (a_n, b_n)$ (where $(a, b) = (-\infty, b) \cap (a, +\infty)$), we can see that a function $f$ is $\mathcal{S}$-measurable if and only if $f^{-1}(G) \in \mathcal{S}$ for each open set $G \subset \mathbb{R}$ (note that $\{f > \alpha\} = f^{-1}((\alpha, +\infty))$!). Then consider the collection
$$\{B \subset \mathbb{R} : B \text{ Borel},\ f^{-1}(B) \in \mathcal{S}\},$$
and show that it is a $\sigma$-algebra containing all open sets.
(b) The characterization given in (a) cannot be used for functions having infinite values unless we introduce the notions of open or Borel subsets of $\overline{\mathbb{R}}$.
3.7. Exercise. Let $\{f_n\}$ be a sequence of $\mathcal{S}$-measurable functions. Show that the sets
$$\{x \in X : \lim f_n(x) \text{ exists}\} \quad\text{and}\quad \{x \in X : \lim f_n(x) \text{ exists and is finite}\}$$
are in $\mathcal{S}$.

3.8. Simple Functions. By a simple function on $X$ we understand a (finite) linear combination of indicator functions of sets from $\mathcal{S}$. In other words, a real- (or complex-)valued function $f$ is simple if $f$ is $\mathcal{S}$-measurable and $f(X)$ is a finite set. Thus every simple function is of the form $\sum_{i=1}^{n} \alpha_i c_{A_i}$, where $c_{A_i}$ is the indicator function of $A_i$, $\alpha_i$ are numbers and $A_i \in \mathcal{S}$. Note that this expression is not uniquely determined!


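3.9. Theorem. Let $f \ge 0$ be an $\mathcal{S}$-measurable function on $X$. Then there exists a sequence $\{f_n\}$ of nonnegative simple functions on $X$ such that $f_n \nearrow f$.
Proof. For $n \in \mathbb{N}$ and $k = 1, 2, \dots, n2^n$ set
$$F_{n,k} = \Big\{ x \in X : \frac{k-1}{2^n} \le f(x) < \frac{k}{2^n} \Big\}$$
and define
$$f_n(x) = \begin{cases} \dfrac{k-1}{2^n}, & \text{if } x \in F_{n,k}, \\[4pt] n, & \text{if } x \in X \setminus \bigcup_k F_{n,k}. \end{cases}$$
It is easily seen that $f_n$ are simple and $f_n \nearrow f$.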
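
The construction in the proof of Theorem 3.9 can be tried out numerically. The following is a minimal sketch, not part of the original text: the name `dyadic_approx`, the sample function `f` and the sample points `xs` are our own illustrative choices. It uses the observation that on $F_{n,k}$ the value $(k-1)/2^n$ equals $\lfloor 2^n f(x) \rfloor / 2^n$.

```python
import math

def dyadic_approx(f, n):
    """Return f_n from the proof of Theorem 3.9 for a nonnegative function f."""
    def f_n(x):
        y = f(x)
        if y >= n:                      # x lies outside every F_{n,k}
            return float(n)
        # here (k-1)/2^n <= y < k/2^n with k - 1 = floor(2^n * y)
        return math.floor(y * 2**n) / 2**n
    return f_n

f = lambda x: x * x                     # sample nonnegative function (our choice)
xs = [i / 10 for i in range(31)]        # sample points in [0, 3]
approx = [dyadic_approx(f, n) for n in range(1, 8)]

for x in xs:
    values = [g(x) for g in approx]
    # f_1(x) <= f_2(x) <= ... and every f_n(x) stays below f(x)
    assert all(a <= b for a, b in zip(values, values[1:]))
    assert values[-1] <= f(x)
```

Each `f_n` takes only finitely many values (multiples of $2^{-n}$ not exceeding $n$), which is why it is simple whenever the sets $F_{n,k}$ are measurable.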


3.10. Exercise. Suppose $f$, $f_n$ have the same meaning as in the previous theorem. Show that $f_n \rightrightarrows f$ (i.e. uniformly) on every set on which $f$ is bounded.


3.11. Exercise. If $f$ is an $\mathcal{S}$-measurable function, then there exists a sequence of simple functions $\{f_n\}$ such that $|f_n| \nearrow |f|$ and $f_n \to f$ on $X$.

Most frequently we meet the concept of measurability when additionally a measure is in consideration on a given $\sigma$-algebra. So suppose in what follows that $(X, \mathcal{S}, \mu)$ is a measure space. The following notion is of great importance in Lebesgue's integration theory, since usually the null sets are negligible.
3.12. Almost Everywhere. We say that a function $h$ is defined $\mu$-almost everywhere (briefly almost everywhere) on $X$ if its domain $D \in \mathcal{S}$ satisfies $\mu(X \setminus D) = 0$. Suppose $f$ and $g$ are functions defined almost everywhere on $X$. We say that $f(x) \le g(x)$ for $\mu$-almost all $x \in X$, or that $f \le g$ $\mu$-almost everywhere, if there exists a set $N \in \mathcal{S}$ such that $\mu N = 0$ and for all $x \in X \setminus N$ we have $f(x) \le g(x)$. Similarly we understand the expressions "almost everywhere" and "almost all" in other contexts, e.g. when speaking about the equality of functions or about the convergence almost everywhere.
3.13. $\mu$-measurable functions. We say that a function $f$ defined on $D \in \mathcal{S}$ is $\mu$-measurable on $X$ if $\mu(X \setminus D) = 0$ and $f$ is $\mathcal{S}$-measurable on $D$.
Let us emphasize that we distinguish strictly between $\mu$-measurable functions (defined in general only almost everywhere) and $\mathcal{S}$-measurable functions on $X$ (defined everywhere on $X$).
3.14. Equality almost everywhere. The relation $f = g$ almost everywhere is clearly an equivalence relation on the set of all functions (or all $\mu$-measurable functions) on $X$. The following observations are quite useful.
(a) Let $\mu$ be a measure on $(X, \mathcal{S})$ and $f$ be a $\mu$-measurable function defined on $D \in \mathcal{S}$. Then there exists an $\mathcal{S}$-measurable function $g$ on $X$ such that $f = g$ on $D$; in particular, we can take
$$g = \begin{cases} f & \text{on } D, \\ 0 & \text{on } X \setminus D. \end{cases}$$
(b) If $\mu$ is a complete measure on $(X, \mathcal{S})$ and $f$ is a $\mu$-measurable function, then every function $g$ which is equal to $f$ almost everywhere is $\mu$-measurable.
3.15. Exercise. Find an example of a measure space on which there exist a $\mu$-measurable function $f$ and a $\mu$-nonmeasurable function $g$ such that $f = g$ almost everywhere.
3.16. Exercises. Suppose $(X, \overline{\mathcal{S}}, \overline{\mu})$ is the completion of a measure space $(X, \mathcal{S}, \mu)$.
(a) Show that an everywhere defined function $f$ is $\overline{\mathcal{S}}$-measurable if and only if there exist $\mathcal{S}$-measurable functions $g$, $h$ such that $g \le f \le h$ on $X$ and $g = h$ $\mu$-almost everywhere in $X$.
Hint. The proof of one implication is easy. Let $f$ be $\overline{\mathcal{S}}$-measurable. By Exercise 3.11 find a sequence of simple ($\overline{\mathcal{S}}$-measurable) functions $\{f_n\}$ such that $f_n \to f$ on $X$. Then find $\mathcal{S}$-measurable functions $g_n$, $h_n$ such that $g_n \le f_n \le h_n$ and $g_n = h_n$ $\mu$-almost everywhere and set
$$g := \limsup g_n, \qquad h := \liminf h_n.$$
(b) If $f$ is an $\overline{\mathcal{S}}$-measurable function on $X$, then there exists an $\mathcal{S}$-measurable function $g$ such that $f = g$ $\mu$-almost everywhere.
Hint. The assertion is obvious for indicator functions of sets from $\overline{\mathcal{S}}$, and therefore also for $\overline{\mathcal{S}}$-measurable simple functions. Then use Exercise 3.11.


3.17. Exercise. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions on $X$ which converges to a function $f$ $\mu$-almost everywhere. Prove that
(a) if the measure $\mu$ is complete, then also $f$ is $\mu$-measurable;
(b) if $f$ is defined on $X$ and is $\mathcal{S}$-measurable, then there exists a sequence $\{\tilde{f}_n\}$ of $\mathcal{S}$-measurable functions such that $\tilde{f}_n = f_n$ $\mu$-almost everywhere and $\tilde{f}_n \to f$ everywhere on $X$;
(c) if the measure $\mu$ is complete, then a function $f$ is $\mu$-measurable if and only if there exists a sequence of simple functions which converges to $f$ $\mu$-almost everywhere.
3.18. Images and Preimages of Measurable Sets. We know that real-valued $\mathcal{S}$-measurable functions are exactly those for which the preimages of Borel sets are in $\mathcal{S}$. To illustrate the situation, let us present the following example for the Lebesgue measure on $\mathbb{R}$.
(a) Let $g(x) = \frac{1}{2}(x + f(x))$ where $f$ is the Cantor singular function from 23.1. Then $g$ is a continuous and increasing function mapping the interval $[0, 1]$ onto $[0, 1]$. If $C$ is the Cantor set and $\varphi$ is the inverse function to $g$ on $[0, 1]$, then $\varphi^{-1}(C)$ is a (Lebesgue) measurable set of positive measure. As remarked in 1.9.2 there exists a nonmeasurable set $E \subset \varphi^{-1}(C)$. Finally, for $M := \varphi(E)$ we have $M \subset C$, so that $M$ is measurable while $\varphi^{-1}(M) = g(M) = E$ is a nonmeasurable set. Let us note that this $M$ is not a Borel set (compare also with Exercise 1.14.a).
(b) Without presenting a proof, we mention that the continuous image of a Borel set on $\mathbb{R}$ is always measurable but not necessarily Borel.

4. Construction of Measures from Outer Measures


In this chapter, we will construct measures from the so-called outer measures. This is a very useful method which we already used when we introduced the Lebesgue measure in $\mathbb{R}^n$. We also fulfil our promise to prove propositions from the first chapter.
4.1. Outer Measure. By an outer measure on a set $X$ we understand a set function $\gamma$ which assigns to every set $A \subset X$ a nonnegative number $\gamma A$ (real or $+\infty$) such that the following conditions are satisfied:
(a) $\gamma \emptyset = 0$;
(b) if $A \subset B$, then $\gamma A \le \gamma B$;
(c) $\gamma(\bigcup A_n) \le \sum \gamma A_n$.
The property (c) is called the $\sigma$-subadditivity of an outer measure.
4.2. Examples of Outer Measures. (a) The outer Lebesgue measure in $\mathbb{R}^n$.
(b) The Hausdorff outer measure from Chapter 36.
(c) The counting measure.
(d) If $(X, \mathcal{S}, \mu)$ is a measure space, then the set function $\mu^* : A \mapsto \inf\{\mu M : M \in \mathcal{S},\ M \supset A\}$ is an outer measure. See also Exercise 4.8.

An important example of creating an outer measure is contained in the following theorem.


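4.3. Theorem. Let $\mathcal{G}$ be a collection of subsets of a set $X$ containing $\emptyset$, and $\tau : \mathcal{G} \to [0, +\infty]$ be a set function on $\mathcal{G}$ with $\tau \emptyset = 0$. For $A \subset X$ set
$$\gamma A = \inf\Big\{ \sum_{n=1}^{\infty} \tau G_n : G_n \in \mathcal{G},\ \bigcup_{n=1}^{\infty} G_n \supset A \Big\}$$
(note that $\inf \emptyset = +\infty$). Then $\gamma$ is an outer measure.

Proof. We only have to verify that $\gamma$ is $\sigma$-subadditive. So suppose $A \subset \bigcup_n A_n$ and $\sum_n \gamma A_n < +\infty$. Fix $\varepsilon > 0$ and find $G_n^j \in \mathcal{G}$ so that
$$\bigcup_j G_n^j \supset A_n \quad\text{and}\quad \sum_j \tau G_n^j < \gamma A_n + \frac{\varepsilon}{2^n}.$$
Then
$$\bigcup_{n,j} G_n^j \supset A \quad\text{and}\quad \sum_n \gamma A_n + \varepsilon \ge \sum_{n,j} \tau G_n^j \ge \gamma A.$$
Since $\varepsilon > 0$ was arbitrary, $\gamma A \le \sum_n \gamma A_n$.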

4.4. $\gamma$-measurable Sets. A set $M \subset X$ is said to be $\gamma$-measurable (in the sense of Carathéodory) if
$$\gamma T = \gamma(T \cap M) + \gamma(T \setminus M)$$
for each "test" set $T \subset X$ (in other words, if $M$ splits additively each set in $X$). The collection of all $\gamma$-measurable sets will be denoted by $\mathfrak{M}(\gamma)$. To prove that a set $M$ is $\gamma$-measurable, it is sufficient to verify the inequality
$$\gamma T \ge \gamma(T \cap M) + \gamma(T \setminus M)$$
for any set $T$ with $\gamma T < +\infty$.
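4.5. Theorem. $\mathfrak{M}(\gamma)$ is a $\sigma$-algebra and $\gamma$ is a complete measure on $\mathfrak{M}(\gamma)$.

Proof will be divided into a few steps.
(a) It is straightforward to check that $X \in \mathfrak{M}(\gamma)$, that $X \setminus M \in \mathfrak{M}(\gamma)$ provided $M \in \mathfrak{M}(\gamma)$, and that $A \in \mathfrak{M}(\gamma)$ whenever $\gamma A = 0$. Suppose $M, N \in \mathfrak{M}(\gamma)$. We would like to show that also $M \cap N \in \mathfrak{M}(\gamma)$. So choose a test set $T \subset X$. Then
$$\gamma T = \gamma(T \cap M) + \gamma(T \setminus M)$$
and
$$\gamma(T \cap M) = \gamma(T \cap M \cap N) + \gamma((T \cap M) \setminus N).$$
Now we use the test set $T \setminus (M \cap N)$ and $\gamma$-measurability of $M$ to get
$$\gamma(T \setminus (M \cap N)) = \gamma((T \cap M) \setminus N) + \gamma(T \setminus M).$$
Thanks to the last three equalities, it follows that
$$\gamma T = \gamma(T \cap (M \cap N)) + \gamma(T \setminus (M \cap N)).$$
Since $\mathfrak{M}(\gamma)$ is closed under complements and finite intersections, it is closed also under finite unions.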


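(b) In order to show that $\gamma$ is $\sigma$-additive on $\mathfrak{M}(\gamma)$, choose $M_n \in \mathfrak{M}(\gamma)$ pairwise disjoint. Setting $T = M_1 \cup M_2$ and using $\gamma$-measurability of $M_1$ we obtain
$$\gamma(M_1 \cup M_2) = \gamma M_1 + \gamma M_2.$$
Thus $\gamma$ is finitely additive. Further
$$\sum_{n=1}^{\infty} \gamma M_n = \lim_{k} \sum_{n=1}^{k} \gamma M_n = \lim_{k} \gamma\Big( \bigcup_{n=1}^{k} M_n \Big) \le \gamma\Big( \bigcup_{n=1}^{\infty} M_n \Big)$$
and since the reverse inequality always holds we reach the conclusion.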

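(c) Let now $M_n \in \mathfrak{M}(\gamma)$ be pairwise disjoint. Our aim is to show that $\bigcup_{n=1}^{\infty} M_n \in \mathfrak{M}(\gamma)$. Choosing a test set $T \subset X$, we have
$$\gamma T = \gamma\Big( T \setminus \bigcup_{n=1}^{k} M_n \Big) + \gamma\Big( T \cap \bigcup_{n=1}^{k} M_n \Big) \ge \gamma\Big( T \setminus \bigcup_{n=1}^{\infty} M_n \Big) + \sum_{n=1}^{k} \gamma(T \cap M_n)$$
for each $k \in \mathbb{N}$. Since $\gamma$ is $\sigma$-subadditive, it readily follows that
$$\gamma T \ge \gamma\Big( T \setminus \bigcup_{n=1}^{\infty} M_n \Big) + \gamma\Big( T \cap \bigcup_{n=1}^{\infty} M_n \Big),$$
which is what we wanted.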
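
On a finite set, both the outer measure of Theorem 4.3 and the Carathéodory condition of 4.4 can be checked by brute force. The sketch below is only an illustration under our own assumptions: the three-point set `X`, the family `tau` with its weights, and all function names are hypothetical. Because the family is finite and the weights are nonnegative, the infimum over countable covers reduces to a minimum over finite subfamilies.

```python
from itertools import combinations

def powerset(X):
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer_measure(tau):
    """Outer measure of Theorem 4.3 for a finite family tau: {set: weight}."""
    generators = list(tau.items())
    def gamma(A):
        best = float("inf")             # inf of the empty set is +infinity
        for r in range(len(generators) + 1):
            for cover in combinations(generators, r):
                union = frozenset().union(*(G for G, _ in cover))
                if A <= union:
                    best = min(best, sum(w for _, w in cover))
        return best
    return gamma

def measurable_sets(X, gamma):
    """All M that split every test set T additively (the condition in 4.4)."""
    subsets = powerset(X)
    return [M for M in subsets
            if all(gamma(T) == gamma(T & M) + gamma(T - M) for T in subsets)]

X = frozenset({1, 2, 3})
# hypothetical weights; tau(X) = 3 is deliberately larger than tau({1}) + tau({2, 3})
tau = {frozenset(): 0, frozenset({1}): 1, frozenset({2, 3}): 1, X: 3}
gamma = outer_measure(tau)
M_gamma = measurable_sets(X, gamma)     # the empty set, {1}, {2, 3} and X
```

In this example the singletons $\{2\}$ and $\{3\}$ fail the test with $T = \{2, 3\}$, so the measurable sets form the $\sigma$-algebra generated by $\{1\}$, and $\gamma$ is additive on it. Note also that $\gamma X = 2 < \tau X$: without further assumptions on $\tau$, the outer measure need not agree with $\tau$ on $\mathcal{G}$.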


4.6. Exercise. Let $\gamma$ be a nonnegative function on $\mathcal{P}(X)$, $\gamma \emptyset = 0$. Show that the collection of all $\gamma$-measurable sets forms an algebra and that $\gamma$ is additive on it.
Hint. One only needs to examine where monotonicity and $\sigma$-subadditivity of the outer measure are used in the proof of Theorem 4.5.
4.7. Exercise. We say that an outer measure $\gamma$ is regular if for each set $A \subset X$ there exists $M \in \mathfrak{M}(\gamma)$ such that $A \subset M$ and $\gamma A = \gamma M$.
(a) Let $\gamma$ be a regular outer measure and $A_1 \subset A_2 \subset A_3 \subset \dots$ an increasing sequence of sets. Show that $\gamma(\bigcup A_n) = \lim \gamma A_n$.
(b) Show that the outer Lebesgue measure is regular.
(c) There exists a decreasing sequence $\{M_n\}$ of subsets of $[0, 1]$ such that $\lambda^* M_n = 1$ and $\bigcap_n M_n = \emptyset$ (compare with (a)).

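4.8. Exercise. Let $(X, \mathcal{S}, \mu)$ be a measure space and $A \subset X$. Set
$$\mu^* A := \inf\{\mu M : M \in \mathcal{S},\ A \subset M\}, \qquad \mu_* A := \sup\{\mu M : M \in \mathcal{S},\ M \subset A\}.$$
(a) Suppose $\mu^* A < \infty$. Show that $A \in \mathfrak{M}(\mu^*)$ if and only if $\mu^* A = \mu_* A$.
(b) Show that $\mathcal{S} \subset \mathfrak{M}(\mu^*)$ and $\mu^* = \mu$ on $\mathcal{S}$.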

4.9. Notes. Carathéodory's characterization of measurable sets appears first in [*1918].

5. Classes of Sets and Set Functions


This chapter will be devoted to a study of various classes of sets and set functions from the point of view of measure theory. We prove theorems on extensions
of set functions to larger families of sets as important steps in constructing measures.


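5.1. Systems of sets. A family $\mathcal{A}$ of subsets of a set $X$ containing $\emptyset$ is called:
(a) a semiring if
(a1) $A \cap B \in \mathcal{A}$ for each $A, B \in \mathcal{A}$,
(a2) for $A, B \in \mathcal{A}$, $A \subset B$, there exist pairwise disjoint sets $C_1, \dots, C_n \in \mathcal{A}$ such that $B \setminus A = \bigcup_{j=1}^{n} C_j$;
(b) a ring if given $A, B \in \mathcal{A}$, then $A \cup B,\ A \setminus B \in \mathcal{A}$ (so that also $A \cap B \in \mathcal{A}$);
(c) an algebra if it is a ring and $X \in \mathcal{A}$;
(d) a Dynkin class if
(d1) $X \in \mathcal{A}$,
(d2) $A \setminus B \in \mathcal{A}$ for each $A, B \in \mathcal{A}$, $B \subset A$,
(d3) if $A_n \in \mathcal{A}$ are pairwise disjoint, then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$;
(e) a $\pi$-system if $A \cap B \in \mathcal{A}$ for each $A, B \in \mathcal{A}$;
(f) a $\delta$-ring if $\mathcal{A}$ is a ring closed under countable intersections;
(g) a $\sigma$-ring if $\mathcal{A}$ is a ring closed under countable unions.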
5.2. Premeasure. A nonnegative set function $\mu$ is called a premeasure on $X$ if $\mu$ is defined on a ring $\mathcal{A}$ of subsets of $X$ and satisfies the following conditions:
(a) $\mu \emptyset = 0$,
(b) if $\{A_j\} \subset \mathcal{A}$ is a sequence of pairwise disjoint sets and $\bigcup A_j \in \mathcal{A}$, then $\mu(\bigcup A_j) = \sum \mu A_j$.
A premeasure is in fact a $\sigma$-additive set function defined on a ring of sets. We say that a premeasure $\mu$ on $X$ is $\sigma$-finite if there exists a sequence $X_j$ of sets from $\mathcal{A}$ such that $\mu X_j < \infty$ and $X = \bigcup X_j$.


5.3. Examples. (a) Denote by $\mathcal{I}$ the collection of all intervals on $\mathbb{R}$ including the degenerate ones (i.e. the empty set and the singletons) and by $\mathcal{I}_b$ the collection of all bounded intervals from $\mathcal{I}$. Further, let $\mathcal{I}_l$ be the collection of all intervals of the form $[a, b)$ together with $\emptyset$, $\mathbb{R}$ and the intervals of the form $(-\infty, b)$. Finally, let $\mathcal{I}_l^b = \mathcal{I}_l \cap \mathcal{I}_b$.
(a1) The collections $\mathcal{I}$, $\mathcal{I}_l$, $\mathcal{I}_b$, $\mathcal{I}_l^b$ form semirings, which is not true for the collection of all open (or closed) intervals.
(a2) Finite unions of sets from $\mathcal{I}_b$ or $\mathcal{I}_l^b$ form a ring. Finite unions of sets from $\mathcal{I}$ or $\mathcal{I}_l$ form even an algebra $\mathcal{A}$ or $\mathcal{A}_l$. The set function $I \mapsto \operatorname{vol} I$ can be (uniquely) extended to a premeasure on $\mathcal{A}$ or $\mathcal{A}_l$.
(a3) Closing $\mathcal{A}$ or $\mathcal{A}_l$ under countable unions, we do not get even a semiring: Subtracting a countable union of open intervals from $[0, 1]$ we get the Cantor set, which is not a countable union of intervals.
(b) If $\mathcal{A}$ and $\mathcal{B}$ are semirings on $X$, then the collection of all sets of the form $A \times B$ where $A \in \mathcal{A}$ and $B \in \mathcal{B}$ forms a semiring on $X \times X$.
(c) Let $\mathcal{S}$ be a semiring. Then the set of all finite disjoint unions of elements from $\mathcal{S}$ is a ring (in fact, the smallest ring containing $\mathcal{S}$).
(d) Consider the family $\mathcal{A}$ of all subsets of the set $\{1, 2, \dots, 20\}$ with an even number of elements and show that $\mathcal{A}$ is a Dynkin class but not a semiring.
(e) The collection of all countable subsets of a set $X$ is a $\sigma$-ring which is not a $\sigma$-algebra unless $X$ is countable.
(f) The collection of all Lebesgue measurable sets of finite measure forms a $\delta$-ring.
(g) Suppose that $\mathcal{A}$ is the collection of all unions of a finite number of intervals (including the degenerate ones) and $\mathcal{A}' = \{A \cap \mathbb{Q} : A \in \mathcal{A}\}$. Then $\mathcal{A}'$ is an algebra of subsets of $\mathbb{Q}$ and $\mu : A \cap \mathbb{Q} \mapsto \lambda A$ is a finitely additive set function which is not a premeasure.
(h) Let $\mathcal{A}$ be the algebra of all finite unions of intervals on $(0, 1)$. Define a set function $\mu$ on $\mathcal{A}$ by the formula
$$\mu(A) = \begin{cases} 1 & \text{if } A \text{ contains some interval of the form } (0, \varepsilon),\ \varepsilon > 0, \\ 0 & \text{in other cases.} \end{cases}$$
Then $\mu$ is a finitely additive set function on $\mathcal{A}$ which is not a premeasure.
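
The family from Example 5.3(d) can be explored by direct enumeration. The sketch below is only an illustration: to keep the enumeration tiny it uses the set $\{1, 2, 3, 4\}$ instead of $\{1, \dots, 20\}$ (our simplification; the parity argument is the same), and all variable names are our own. On a finite family, closure under countable disjoint unions reduces to closure under unions of two disjoint members.

```python
from itertools import combinations

X = frozenset(range(1, 5))
# all subsets of X with an even number of elements (sizes 0, 2, 4)
family = {frozenset(c) for r in range(0, 5, 2) for c in combinations(sorted(X), r)}

# (d1): the whole set belongs to the family
contains_X = X in family
# (d2): closed under proper differences A \ B with B a subset of A
closed_proper_diff = all(A - B in family
                         for A in family for B in family if B <= A)
# (d3): closed under unions of disjoint members
closed_disjoint_union = all(A | B in family
                            for A in family for B in family if not (A & B))
# semiring condition (a1): closed under intersections
closed_intersection = all(A & B in family for A in family for B in family)

print(contains_X, closed_proper_diff, closed_disjoint_union, closed_intersection)
```

The last flag is false because, for instance, $\{1, 2\} \cap \{2, 3\} = \{2\}$ has an odd number of elements, so condition (a1) of a semiring fails.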

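5.4. Theorem. Let $\mathcal{G}$ be a ring of subsets of a set $X$ and $\mu$ a premeasure on $\mathcal{G}$. If $\tilde{\mu}$ is the outer measure constructed from $\mathcal{G}$ and $\mu$ as in Theorem 4.3, then
(a) $\tilde{\mu} = \mu$ on $\mathcal{G}$;
(b) $\mathcal{G} \subset \mathfrak{M}(\tilde{\mu})$.

Proof. Obviously $\tilde{\mu} \le \mu$ on $\mathcal{G}$. Suppose $G \in \mathcal{G}$, $\tilde{\mu} G < +\infty$ and $\{G_n\} \subset \mathcal{G}$, $\bigcup G_n \supset G$. Then
$$\mu G = \mu\Big( \bigcup_n (G_n \cap G) \Big) \le \sum_n \mu(G_n \cap G) \le \sum_n \mu G_n,$$
so that $\mu G \le \tilde{\mu} G$.
Now suppose $G \in \mathcal{G}$. Choose a test set $T \subset X$, $\tilde{\mu} T < +\infty$ and $\varepsilon > 0$. There exist $G_n \in \mathcal{G}$ such that
$$T \subset \bigcup G_n \quad\text{and}\quad \sum \mu G_n \le \tilde{\mu} T + \varepsilon.$$
Then
$$\tilde{\mu} T + \varepsilon \ge \sum \mu G_n = \sum \big( \mu(G_n \cap G) + \mu(G_n \setminus G) \big) \ge \tilde{\mu}(T \cap G) + \tilde{\mu}(T \setminus G),$$
thus $G \in \mathfrak{M}(\tilde{\mu})$.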
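5.5. Hopf's Extension Theorem. Let $\mu$ be a premeasure on a ring $\mathcal{A}$ of subsets of a set $X$. Then there exists a measure $\tilde{\mu}$ on $\sigma(\mathcal{A})$ which equals $\mu$ on $\mathcal{A}$. This extension of $\mu$ is unique provided $\mu$ is $\sigma$-finite.
Proof. The existence of a measure $\tilde{\mu}$ is an immediate consequence of Theorems 4.3 and 5.4 (notice that $\sigma(\mathcal{A}) \subset \mathfrak{M}(\tilde{\mu})$). To prove uniqueness, under the $\sigma$-finiteness assumption, let $\nu$ be another measure on $\sigma(\mathcal{A})$, $\nu = \mu$ on $\mathcal{A}$. One can easily find out (from the construction of the outer measure $\tilde{\mu}$) that $\nu \le \tilde{\mu}$ on $\sigma(\mathcal{A})$. Then for $A_j \in \mathcal{A}$, $A = \bigcup_j A_j$ we have
$$\nu A = \lim_{n} \nu\Big( \bigcup_{j=1}^{n} A_j \Big) = \lim_{n} \tilde{\mu}\Big( \bigcup_{j=1}^{n} A_j \Big) = \tilde{\mu} A.$$
So if $E \in \sigma(\mathcal{A})$, $\tilde{\mu} E < \infty$, then for a given $\varepsilon > 0$ there exist sets $A_j \in \mathcal{A}$, $A = \bigcup_j A_j$ such that $E \subset A$ and $\tilde{\mu} A < \tilde{\mu} E + \varepsilon$. Hence
$$\tilde{\mu} E \le \tilde{\mu} A = \nu A = \nu E + \nu(A \setminus E) \le \nu E + \tilde{\mu}(A \setminus E) \le \nu E + \varepsilon,$$
thus $\tilde{\mu} E = \nu E$. Finally suppose $X = \bigcup X_j$, $X_j \in \mathcal{A}$, $\mu X_j < +\infty$; one can assume that $X_j$ are pairwise disjoint. If $E \in \sigma(\mathcal{A})$, then
$$\tilde{\mu} E = \sum \tilde{\mu}(E \cap X_j) = \sum \nu(E \cap X_j) = \nu E.$$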

5.6. Exercise. Let $\mathcal{A}$ be a family of subsets of a set $X$. Show that there exists the smallest Dynkin class $\mathcal{D}(\mathcal{A})$ containing $\mathcal{A}$.
5.7. Exercise. Show that every $\sigma$-algebra is a Dynkin class. A Dynkin class is a $\sigma$-algebra if and only if it is a $\pi$-system.
5.8. Exercise. Prove that if $\mathcal{A}$ is a $\pi$-system, then $\mathcal{D}(\mathcal{A}) = \sigma(\mathcal{A})$.

5.9. Exercise. The assumptions of Hopf's extension theorem can be weakened. Indeed, show that the assertion of Theorem 5.4 is still true if we only assume that $\mu$ is finitely additive and $\sigma$-subadditive on the semiring $\mathcal{G}$ and that $\mu \emptyset = 0$.
Hint. First note that $\mu$ is monotone. Then proceed as in Theorem 5.4 and show that $\sigma(\mathcal{G}) \subset \mathfrak{M}(\tilde{\mu})$. In the essential step use the existence of sets $C_n^j \in \mathcal{G}$ with the property $G_n \setminus (G \cap G_n) = \bigcup_j C_n^j$.
j

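5.10. Exercise. Let $\mathcal{A}$ be a $\pi$-system on $X$. Show that if $\mu_1$, $\mu_2$ are probability measures on $\sigma(\mathcal{A})$ which agree on $\mathcal{A}$, then $\mu_1 = \mu_2$ on $\sigma(\mathcal{A})$.
Hint. Let $\mathcal{D} := \{M \in \sigma(\mathcal{A}) : \mu_1(M) = \mu_2(M)\}$. Show that $\mathcal{D}$ is a Dynkin class and use Exercise 5.7.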

5.11. Exercise. Consider
$$\mathcal{S} := \{[a, b) \setminus C : a, b \in [0, 1]\}, \qquad \mu([a, b) \setminus C) := f(b) - f(a),$$
where $C$ is the Cantor set and $f$ the Cantor function from 23.1. Show that $\mathcal{S}$ is a semiring and that $\mu$ is finitely additive but not $\sigma$-additive on $\mathcal{S}$.

5.12. Exercise. (a) Let $X$ be an arbitrary set and $n \in \mathbb{N}$. Consider the set function which is the restriction of the counting measure to the collection
$$\mathcal{G}_n := \{A \subset X : A \text{ has exactly } n \text{ elements}\}$$
and construct the outer measure $\gamma$ as in Theorem 4.3. Investigate the relationship between the collections $\mathcal{G}_n$ and $\mathfrak{M}(\gamma)$ and compare with Theorem 5.4.
(b) Investigate the same problem in case of $X = (0, 1)$, $\mathcal{G} = \mathcal{P}(X)$ and
$$\tau A := \inf\Big\{ \sum_{i=1}^{k} (b_i - a_i) : \bigcup_{i=1}^{k} (a_i, b_i) \supset A \Big\}$$
($\tau A$ is the Jordan-Peano content of a set $A$).


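5.13. Exercise. Let $\tilde{\mu}$ be an outer measure on $X$ constructed from the premeasure $\mu$ on an algebra by Theorem 4.3 such that $\tilde{\mu} X < +\infty$. For $A \subset X$ set
$$\mu_* A := \tilde{\mu} X - \tilde{\mu}(X \setminus A).$$
Show that $A \in \mathfrak{M}(\tilde{\mu})$ if and only if $\mu_* A = \tilde{\mu} A$ (compare also with Theorem 5.4).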
5.14. Capacity on Compact Sets. Let $X$ be a locally compact topological space. A real-valued nonnegative function $\kappa$ defined on the collection $\mathcal{K}(X)$ of all compact subsets of $X$ is called a Choquet capacity if it satisfies the following conditions:
(a) if $K_1 \subset K_2$, then $\kappa(K_1) \le \kappa(K_2)$;
(b) if $K_1 \supset K_2 \supset K_3 \supset \dots$, then $\kappa(\bigcap K_n) = \lim \kappa(K_n)$;
(c) $\kappa(K_1 \cup K_2) + \kappa(K_1 \cap K_2) \le \kappa(K_1) + \kappa(K_2)$ whenever $K_1, K_2 \in \mathcal{K}(X)$ (so-called strong subadditivity).

An important example is the Newtonian capacity in $\mathbb{R}^n$, $n \ge 3$. If we define the Newtonian potential of a Radon measure $\mu$ on $\mathbb{R}^n$ by
$$U^{\mu}(x) := \int_{\mathbb{R}^n} \frac{d\mu(t)}{|x - t|^{n-2}},$$
we can introduce the Newtonian capacity $\operatorname{cap} K$ of a compact set $K \subset \mathbb{R}^n$ as
$$\operatorname{cap} K := \sup\{\mu K : \operatorname{supt} \mu \subset K,\ U^{\mu} \le 1 \text{ on } \mathbb{R}^n\}.$$
The proof that this capacity satisfies the conditions (a), (b), (c) can be found for instance in [KNV].
5.15. Outer Capacity. Let $X$ be a topological space. A mapping $c : \mathcal{P}(X) \to [0, +\infty]$ is called an outer capacity provided it satisfies:
(a) if $A \subset B$, then $cA \le cB$;
(b) if $A_1 \subset A_2 \subset A_3 \subset \dots$, then $c(\bigcup A_n) = \sup cA_n$;
(c) if $K_1 \supset K_2 \supset K_3 \supset \dots$ and $K_n$ are compact, then $c(\bigcap K_n) = \inf cK_n$.
A set $A \subset X$ is said to be $c$-capacitable if
$$cA = \sup\{cK : K \subset A,\ K \text{ compact}\}.$$
1. If $\kappa$ is a Choquet capacity on a locally compact space (Exercise 5.14) and
$$cA := \inf\{\sup\{\kappa(K) : K \subset G,\ K \text{ compact}\} : G \supset A,\ G \text{ open}\},$$
prove that $c$ is an outer capacity.
2. Suppose $c$ is simultaneously an outer capacity and an outer measure. Investigate the relationship between the notions of $c$-capacitability and $c$-measurability in the sense of Carathéodory. Consider the cases:
(a) $X$ is a two-point set equipped with the discrete topology and $cA = 1$ if $A \ne \emptyset$, $c\emptyset = 0$;
(b) $X$ is $\mathbb{R}$ with the Euclidean topology and $c$ is the outer Lebesgue measure;
(c) $X$ is $\mathbb{R}$ with the discrete topology and $c$ is again the outer Lebesgue measure;
(d) $c$ is the outer capacity derived from the Newtonian capacity. In this case a set $A \subset \mathbb{R}^n$ is $c$-measurable in the sense of Carathéodory if and only if $cA = 0$ or $c(\mathbb{R}^n \setminus A) = 0$ (this is not quite easy, see M.M. Rao [*1987]). On the other hand, Choquet's deep result says that each Borel set in $\mathbb{R}^n$ is $c$-capacitable.
5.16. Notes. Hopf's extension theorem is usually attributed to H. Hahn or C. Carathéodory but it probably originated with M. Fréchet [1915]. The proof using Carathéodory's theorem was given independently by H. Hahn [*1924] and A. N. Kolmogorov [*1933].
The notions of a $\pi$-system or of a Dynkin class were investigated by E. B. Dynkin in 1959 as tools for probability theory. However, families with similar properties were studied already by W. Sierpiński [1928].
An investigation of the theory of capacities in classical potential theory during the period 1920-1950 is connected with the names of Ch. de la Vallée Poussin, N. Wiener, O. Frostman or M. Brelot among others. These authors studied mainly the Newtonian capacity and examined the role of sets of small capacity. One could say that capacity theory is a "younger sister" of Lebesgue measure theory. In the 1950s the general capacity theory was developed and G. Choquet proved his famous capacitability theorem. An interested reader is referred to a nice article written by Choquet himself [1986] or [1989].

6. Signed and Complex Measures


In this chapter, we will investigate measures assuming also negative (or even
complex) values.
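6.1. Signed Measures. Let $\mathcal{S}$ be a $\sigma$-algebra of subsets of $X$. A set function $\nu : \mathcal{S} \to \overline{\mathbb{R}}$ is said to be a signed measure on $\mathcal{S}$ if
(a) $\nu \emptyset = 0$;
(b) $\nu\big( \bigcup_{n=1}^{\infty} A_n \big) = \sum_{n=1}^{\infty} \nu A_n$ whenever $A_n \in \mathcal{S}$ are pairwise disjoint.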
6.2. Remarks.
1. A signed measure can assume at most one of the values $+\infty$ and $-\infty$. Indeed, if $\nu E = +\infty$ and $\nu F = -\infty$ for $E, F \in \mathcal{S}$, then $\nu(E \cap F)$ is finite. Therefore $\nu(E \setminus F) = +\infty$, $\nu(F \setminus E) = -\infty$ and the equality (b) does not hold for the (disjoint) sets $E \setminus F$ and $F \setminus E$.
2. If $\nu(\bigcup A_n)$ is finite, then the series $\sum \nu A_n$ in the condition (b) converges absolutely (indeed, this series converges to the same value when its terms are rearranged arbitrarily, which is equivalent to the absolute convergence).
3. Some of the basic properties of positive measures hold even for signed measures. Show that the following assertions are true:
(a) if $A_n \in \mathcal{S}$, $A_n \nearrow A$, then $\nu A_n \to \nu A$,
(b) if $A_n \in \mathcal{S}$, $A_n \searrow A$, $|\nu A_1| < +\infty$, then $\nu A_n \to \nu A$.
4. A signed measure is not necessarily monotone: If $A \subset B$, we can no longer assert that $\nu A \le \nu B$.

6.3. Hahn Decomposition. We say that a set $P \in \mathcal{S}$ is positive for a signed measure $\nu$ if $\nu E \ge 0$ for each set $E \in \mathcal{S}$, $E \subset P$. Analogously we define negative sets for $\nu$.
An ordered pair of sets $(P, N)$ is called a Hahn decomposition of $X$ for $\nu$ if
(a) $P \cup N = X$, $P \cap N = \emptyset$;
(b) $P$ is positive and $N$ is negative (for $\nu$).
6.4. Examples. (a) The empty set is both positive and negative for any signed measure. The pair $(X, \emptyset)$ is a Hahn decomposition for $\nu$ if and only if $\nu$ is positive.
(b) The pairs $(\mathbb{R}, \emptyset)$, $(\mathbb{R} \setminus \mathbb{Q}, \mathbb{Q})$, $(\mathbb{R} \setminus \{5\}, \{5\})$ are Hahn decompositions of $\mathbb{R}$ for the Lebesgue measure.
(c) If a measure $\nu_f$ has a density $f$ with respect to $\mu$ (i.e. if $\nu_f A = \int_A f \, d\mu$, where $f$ is a function for which these integrals are defined, cf. 8.19), then the set $P := \{f \ge 0\}$ is positive for $\nu_f$ and $(\{f \ge 0\}, \{f < 0\})$ is a Hahn decomposition for $\nu_f$.

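6.5. Hahn Decomposition Theorem. For every signed measure $\nu$ there exists a Hahn decomposition. This decomposition is unique in the following sense: If $(P_1, N_1)$ and $(P_2, N_2)$ are two such decompositions, then
$$\nu(P_1 \cap E) = \nu(P_2 \cap E), \qquad \nu(N_1 \cap E) = \nu(N_2 \cap E)$$
for each set $E \in \mathcal{S}$.
Proof. We start with the proof of the uniqueness assertion. Suppose that $(P_1, N_1)$ and $(P_2, N_2)$ are Hahn decompositions of $X$ for $\nu$. Then evidently
$$\nu(P_1 \cap E) = \nu(E \cap (P_1 \cap P_2)) = \nu(E \cap P_2)$$
and, in the same way, $\nu(N_1 \cap E) = \nu(N_2 \cap E)$.
To prove the existence of the decomposition, assume that $\nu S < +\infty$ for all $S \in \mathcal{S}$ (by Remark 6.2.1 this can be arranged by passing to $-\nu$ if necessary). We proceed via the following steps.
Step 1: If $A \in \mathcal{S}$ and $\nu A > -\infty$, then there exists a set $B \in \mathcal{S}$, $B \subset A$, such that $\nu B \ge \nu A$ and $B$ is positive. The set $B$ will be constructed as the intersection of a finite or infinite nested sequence $\{A_n\}$ of subsets of $A$. We start with $A_1 = A$. In the $n$-th step, if $A_n$ is not positive, we can set
$$k_n = \min\Big\{ k \in \mathbb{N} : \text{there exists } E \subset A_n \text{ with } \nu(E) \le -\frac{1}{k} \Big\}$$
and find $A_{n+1} \subset A_n$ such that $\nu(A_n \setminus A_{n+1}) \le -\frac{1}{k_n}$. If some $A_n$ is positive, we stop and take $B = A_n$. Otherwise we set $B = \bigcap_{n=1}^{\infty} A_n$. The sets $A_n \setminus A_{n+1}$ are pairwise disjoint and their union is $A \setminus B$. Thus
$$+\infty > \nu(B) = \nu(A) - \sum_{n=1}^{\infty} \nu(A_n \setminus A_{n+1}) \ge \nu(A) + \sum_{n=1}^{\infty} \frac{1}{k_n}.$$
It follows that $\nu(B) \ge \nu(A)$ and $k_n \to \infty$. If $E \subset B$ is a measurable subset, then for each $n \in \mathbb{N}$ we have $E \subset A_n$, and thus, by the minimality of $k_n$, $\nu(E) > -\frac{1}{k_n - 1}$ whenever $k_n > 1$. Hence $\nu(E) \ge 0$ and $B$ is positive.
Step 2: To complete the proof, set $s = \sup\{\nu A : A \in \mathcal{S}\}$. Since $\emptyset \in \mathcal{S}$, we have $s \ge 0$. There exists a sequence $\{P_n\} \subset \mathcal{S}$ such that $\nu P_n \to s$. In light of the first step we can assume that the sets $P_n$ are positive and $P_1 \subset P_2 \subset \dots$ (the union of two positive sets is a positive set!). If $P := \bigcup P_n$, then $P \in \mathcal{S}$ and $\nu P = \lim \nu P_n = s < +\infty$. Moreover, $P$ is positive. Indeed, if $E \subset P$, then $E \cap P_n \nearrow E$ and $\nu(E \cap P_n) \ge 0$. Lastly, to show that the set $N := X \setminus P$ is negative we notice that $\nu(E \cup P) = \nu E + \nu P > s$ for any set $E \subset N$ with $\nu E > 0$, which contradicts the definition of $s$.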
6.6. Variation of a Measure. Let $(P, N)$ be a Hahn decomposition of $X$ for a signed measure $\nu$. Define
$$\nu^+ E = \nu(E \cap P), \qquad \nu^- E = -\nu(E \cap N), \qquad |\nu|(E) = \nu^+ E + \nu^- E$$
for every set $E \in \mathcal{S}$. By the previous theorem, $\nu^+ E$ and $\nu^- E$ do not depend on the choice of the Hahn decomposition (which is not unique!). It is easily checked that the set functions $\nu^+$, $\nu^-$ and $|\nu|$ are positive measures on $\mathcal{S}$. They are called the positive, negative and total variations of a signed measure $\nu$. Let us summarize our results in the following theorem.


6.7. Theorem (The Jordan decomposition of a signed measure). Let $\nu$ be a signed measure on $(X, \mathcal{S})$. The functions $\nu^+$, $\nu^-$ and $|\nu|$ are positive measures on $\mathcal{S}$ and $\nu = \nu^+ - \nu^-$. If also $\nu = \nu_1 - \nu_2$ where $\nu_1$, $\nu_2$ are positive measures, then $\nu_1 \ge \nu^+$, $\nu_2 \ge \nu^-$.
Proof. It is enough to prove the last part of the theorem. But for each $E \in \mathcal{S}$, one has
$$\nu^+ E = \nu(E \cap P) = \nu_1(E \cap P) - \nu_2(E \cap P) \le \nu_1(E \cap P) \le \nu_1 E.$$
An important characterization of the total variation of a signed measure, often used as its definition, is contained in the following theorem.

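6.9. Theorem. Let $\nu$ be a signed measure on $(X, \mathcal{S})$ and $E \in \mathcal{S}$. Then
$$|\nu|(E) = \sup\Big\{ \sum_{k=1}^{n} |\nu A_k| : A_k \in \mathcal{S} \text{ pairwise disjoint},\ \bigcup_{k=1}^{n} A_k = E \Big\}.$$
Proof. It is straightforward to check that
$$\sum_k |\nu A_k| = \sum_k \big| \nu^+ A_k - \nu^- A_k \big| \le \sum_k (\nu^+ A_k + \nu^- A_k) = \sum_k |\nu|(A_k) = |\nu|(E).$$
On the other hand, if $(P, N)$ is a Hahn decomposition of $X$ for $\nu$, set $A_1 = E \cap P$, $A_2 = E \cap N$ and find out that the supremum is even attained.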
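
For a signed measure on a finite set, given by point masses, the Hahn and Jordan decompositions and the formula of Theorem 6.9 can be verified by enumeration. The following is a small sketch under our own assumptions: the four-point set, the integer weights and all function names are hypothetical and serve only as an illustration.

```python
from itertools import combinations

weights = {"a": 2, "b": -3, "c": 1, "d": 0}     # hypothetical point masses

def nu(E):
    return sum(weights[x] for x in E)

X = frozenset(weights)
P = frozenset(x for x in X if weights[x] >= 0)   # a positive set
N = X - P                                        # a negative set; (P, N) is a Hahn decomposition

def nu_plus(E):
    return nu(E & P)

def nu_minus(E):
    return -nu(E & N)

def total_variation(E):
    return nu_plus(E) + nu_minus(E)

def partitions(items):
    """Yield all partitions of a list into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield part + [[first]]

def sup_over_partitions(E):
    """Right-hand side of Theorem 6.9, computed by brute force."""
    return max(sum(abs(nu(block)) for block in part)
               for part in partitions(sorted(E)))

all_subsets = [frozenset(c) for r in range(len(X) + 1)
               for c in combinations(sorted(X), r)]
agree = all(sup_over_partitions(E) == total_variation(E) for E in all_subsets)
```

Here `agree` records that the supremum over partitions coincides with $\nu^+ E + \nu^- E$ for every subset $E$. The example also illustrates Remark 6.2.4: $\nu\{a\} = 2$ while $\nu\{a, b\} = -1$, so $\nu$ is not monotone.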
6.10. Complex Measures. A complex measure on $(X, \mathcal{S})$ is a $\sigma$-additive set function $\nu : \mathcal{S} \to \mathbb{C}$ for which $\nu \emptyset = 0$. By $\sigma$-additivity we understand that
$$\nu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \sum_{k=1}^{\infty} \nu A_k$$
whenever $\{A_k\}$ is a sequence of pairwise disjoint sets from $\mathcal{S}$. Let us note that, as in Remark 6.2.2, the appearing series must converge absolutely or definitely diverge.
Each complex measure $\nu$ can be expressed uniquely in the form $\nu = \nu_r + i\nu_i$ where $\nu_r$ and $\nu_i$ are (finite!) signed measures on $(X, \mathcal{S})$. In particular, each finite signed measure can be understood as a complex measure.
If $\nu$ is a complex measure, we define its total variation $|\nu|$ by
$$|\nu|(E) = \sup\Big\{ \sum_{k=1}^{n} |\nu A_k| : A_k \in \mathcal{S} \text{ pairwise disjoint},\ \bigcup_{k=1}^{n} A_k = E \Big\}.$$
An appeal to Theorem 6.9 reveals that this definition agrees with the previous one for signed measures.
6.11. Theorem. Let $\nu$ be a complex measure on $(X, \mathcal{S})$. Then the total variation $|\nu|$ is a finite positive measure on $(X, \mathcal{S})$.

Proof. Apparently, $|\nu|(\emptyset) = 0$ and it is simple to verify that $|\nu|$ is finitely additive. If $\nu = \nu_1 - \nu_2 + i(\nu_3 - \nu_4)$ where $\nu_1 - \nu_2$ is the Jordan decomposition of $\nu_r$ and $\nu_3 - \nu_4$ is the Jordan decomposition of $\nu_i$ (i.e. $\nu_j$ are positive finite measures), then
$$|\nu|(A) \le \nu_1 A + \nu_2 A + \nu_3 A + \nu_4 A,$$
so that $|\nu|(A) < +\infty$. If $A_n \in \mathcal{S}$, $A_n \searrow \emptyset$, then $\lim \nu_j A_n = 0$ for each $j = 1, 2, 3, 4$, and therefore $|\nu|(A_n) \to 0$. A routine argument now shows that $|\nu|$ is $\sigma$-additive.
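6.12. Remark. Note that the range of any complex measure is always a bounded subset of the complex plane $\mathbb{C}$.
6.13. Exercise. Let $\nu$ be a signed measure on $(X, \mathcal{S})$ and $E \in \mathcal{S}$. Show that $\nu^+ E = \sup\{\nu B : B \in \mathcal{S},\ B \subset E\}$.
6.14. Exercise. Let $\nu$ be a complex measure on $(X, \mathcal{S})$ and $E \in \mathcal{S}$. Prove that
$$|\nu|(E) = \sup\Big\{ \sum_{k=1}^{\infty} |\nu A_k| : A_k \in \mathcal{S} \text{ pairwise disjoint},\ \bigcup_{k=1}^{\infty} A_k = E \Big\}.$$
6.15. Exercise. Let $\nu$ be a complex measure on $(X, \mathcal{S})$. Prove that $|\nu|$ is the smallest positive measure $\mu$ satisfying $|\nu A| \le \mu A$ for all $A \in \mathcal{S}$.
6.16. Exercise. Investigate whether $|\nu| = |\nu_r| + |\nu_i|$, or $|\nu| = \sqrt{|\nu_r|^2 + |\nu_i|^2}$.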

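6.17. Exercise. We denote by $\mathcal{M}(\mathcal{S})$ the set of all finite signed or complex measures on $(X, \mathcal{S})$. If $\nu \in \mathcal{M}(\mathcal{S})$, let $\|\nu\| = |\nu|(X)$. Show that $(\mathcal{M}(\mathcal{S}), \|\cdot\|)$ is a Banach space (i.e. $\mathcal{M}(\mathcal{S})$ is a linear space, $\|\cdot\|$ becomes a norm and $\mathcal{M}(\mathcal{S})$ is complete with respect to it).
Hint. Only the completeness requires a proof. But if $\{\nu_n\}$ is a Cauchy sequence in $\mathcal{M}(\mathcal{S})$, then there exists $\lim \nu_n A$ for each $A \in \mathcal{S}$. Defining $\nu A = \lim \nu_n A$, it is not hard to show that $\nu$ is $\sigma$-additive and $\|\nu_n - \nu\| \to 0$.
6.18. Exercise. Suppose that $\nu, \nu_n \in \mathcal{M}(\mathcal{S})$, $\|\nu_n - \nu\| \to 0$. Prove that $\nu_n A \to \nu A$ for every set $A \in \mathcal{S}$.
6.19. Exercise. Show that the set $\mathcal{M}(\mathcal{S})$ of all signed measures on $(X, \mathcal{S})$ equipped with the order
$$\nu \le \mu \quad\text{if}\quad \nu A \le \mu A \text{ for each } A \in \mathcal{S}$$
is a lattice (i.e. for any $\nu, \mu \in \mathcal{M}(\mathcal{S})$ there exist $\sup(\nu, \mu)$ and $\inf(\nu, \mu)$ in $\mathcal{M}(\mathcal{S})$). Notice that $\sup(\nu, \mu)$ does not necessarily mean the set function $A \mapsto \sup(\nu(A), \mu(A))$; such a function, in general, is not a measure!
Hint. Show that $\frac{1}{2}(\nu + \mu + |\nu - \mu|)$ is $\sup(\nu, \mu)$ and $\frac{1}{2}(\nu + \mu - |\nu - \mu|)$ is $\inf(\nu, \mu)$.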

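6.20. Finitely Additive Measures (Charges). A real-valued set function $\nu$ on an algebra $\mathcal{A}$ of sets is called a charge if
(a) $\nu \emptyset = 0$;
(b) $\nu(A \cup B) = \nu A + \nu B$ whenever $A, B \in \mathcal{A}$, $A \cap B = \emptyset$.
If $\nu$ is a charge, define $\nu^+ A = \sup\{\nu F : F \in \mathcal{A},\ F \subset A\}$, $\nu^- A = -\inf\{\nu F : F \in \mathcal{A},\ F \subset A\}$, $|\nu|(A) = \nu^+ A + \nu^- A$.
6.21. Exercise. Show that the set functions $\nu^+$, $\nu^-$ and $|\nu|$ are positive finitely additive measures on $\mathcal{A}$.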

6.22. Exercise. Suppose $X = [0, 1)$. Let $\mathcal{A}$ be the collection of all finite unions of intervals $[a, b) \subset [0, 1)$. Verify that $\mathcal{A}$ is an algebra. Set
$$f(x) = \begin{cases} \dfrac{1}{x} & \text{for } x \in (0, 1], \\[4pt] 0 & \text{for } x = 0. \end{cases}$$
If $A = \bigcup_{i=1}^{n} [a_i, b_i)$ with pairwise disjoint intervals, define $\nu A = \sum_{i=1}^{n} (f(b_i) - f(a_i))$. Show that neither the Jordan decomposition theorem nor the Hahn decomposition theorem holds for $\nu$. Proceed via the following steps:
(a) the definition of $\nu A$ does not depend on the particular representation of $A$ as $\bigcup [a_i, b_i)$;
(b) $\nu$ is a charge on $\mathcal{A}$;
(c) $\nu A \le 0$ for $A \subset (0, 1)$;
(d) $\nu([0, 1)) = 1$;
(e) $\nu^+([0, 1)) = \nu^-([0, 1)) = +\infty$.
6.23. Exercise. Let $\nu$ be a charge on $\mathcal{A}$. Show that $\nu = \nu^+ - \nu^-$ if and only if $\nu$ is bounded (i.e. if $\sup\{|\nu A| : A \in \mathcal{A}\} < +\infty$).

6.24. Notes. Signed measures were investigated by H. Lebesgue already in [1910]; he was concerned mostly with measures given by a density (see 8.19).
The existence of a Hahn decomposition of the space and the Jordan decomposition of signed measures were first established in full generality by Hahn [*1921].
The monograph by K.P.S. Bhaskar Rao and M. Bhaskar Rao [*1983] is devoted to the theory of charges.


B. The Abstract Lebesgue Integral


7. Integration on R
7.1. Newton Integral. Let $f$ be a real-valued function on an interval $(a, b)$. A function $F$ on $(a, b)$ is called an antiderivative to $f$ if $F'(x) = f(x)$ for all $x \in (a, b)$. A real number $A$ is called the Newton integral of $f$ over $(a, b)$ if there is an antiderivative $F$ of $f$ on $(a, b)$ such that $A = \lim_{x \to b-} F(x) - \lim_{x \to a+} F(x)$. The value of the Newton integral of $f$ on $(a, b)$ does not depend on the choice of $F$ because the difference between any two antiderivatives is a constant. The Newton integral is denoted by $N\!\int_a^b f$.
7.2. Riemann Integral. Let $f$ be a bounded function on a bounded interval $(a, b)$. If $a = \alpha_0 < \alpha_1 < \dots < \alpha_m = b$ is a (finite) sequence of points in $[a, b]$ (a so-called partition of the interval $[a, b]$), and $\gamma_1, \dots, \gamma_m$ is a sequence of real numbers, then the value
\[ \sum_{j=1}^{m} \gamma_j (\alpha_j - \alpha_{j-1}) \]
is called an upper Riemann sum of $f$ if $\gamma_j \ge f$ on every interval $(\alpha_{j-1}, \alpha_j)$, and a lower Riemann sum of $f$ if $\gamma_j \le f$ on every $(\alpha_{j-1}, \alpha_j)$. Set
\[ R\,\overline{\int} f = \inf\{A : A \text{ is an upper Riemann sum of } f\}, \qquad R\,\underline{\int} f = \sup\{A : A \text{ is a lower Riemann sum of } f\}. \]
A function $f$ is Riemann integrable on $(a, b)$ provided $R\,\overline{\int} f = R\,\underline{\int} f$; the common value is then called the Riemann integral of $f$ over $(a, b)$ and it is denoted by $R\!\int_a^b f$.
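The definition in 7.2 can be tried out numerically. The following is a minimal sketch, assuming a nondecreasing function and a uniform partition (both are choices made here for illustration, not part of the definition); the helper name `riemann_sums` is hypothetical. For a nondecreasing $f$ the values at the left and right endpoints are admissible lower and upper constants on each open subinterval.

```python
from fractions import Fraction

def riemann_sums(f, a, b, m):
    """Return (lower, upper) Riemann sums of a nondecreasing f on (a, b)
    for the uniform partition with m subintervals."""
    pts = [a + (b - a) * Fraction(j, m) for j in range(m + 1)]
    # For nondecreasing f: f(left) <= f <= f(right) on each (pts[j-1], pts[j]).
    lower = sum(f(pts[j - 1]) * (pts[j] - pts[j - 1]) for j in range(1, m + 1))
    upper = sum(f(pts[j]) * (pts[j] - pts[j - 1]) for j in range(1, m + 1))
    return lower, upper

lo, hi = riemann_sums(lambda x: x * x, Fraction(0), Fraction(1), 100)
print(lo, hi, hi - lo)  # the gap shrinks like 1/m, squeezing the value 1/3
```

Since every lower sum is at most every upper sum, a gap tending to zero shows that the upper and lower integrals coincide for this particular $f$.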
7.3. Theorem. Let $f$ be a continuous function on an interval $[a, b]$. Then both the Newton integral and the Riemann integral of $f$ on $(a, b)$ exist and are equal.
Proof. Since $f$ is uniformly continuous, we easily obtain that $R\!\int_a^x f$ exists for all $x \in (a, b)$. A short reflection shows that $x \mapsto R\!\int_a^x f$ is an antiderivative of $f$ and this implies the existence of the Newton integral and the required equality.
7.4. Remark. Each Riemann integrable function $f$ is absolutely Riemann integrable, i.e. $|f|$ is Riemann integrable as well. This assertion is no longer true for the Newton integral (consider the integration of the function $x \mapsto \sin x / x$ over $(1, +\infty)$; the change of variables $t = 1/x$ yields an example on a bounded interval).
7.5. Dirichlet Function. The Dirichlet function $D$ is defined as the indicator function of the set of all rational numbers. Observe that $D = 0$ almost everywhere. The Dirichlet function is nowhere continuous, and it has neither an antiderivative (it does not have the Darboux property) nor a Riemann integral over $(0, 1)$ ($\overline{\int} D = 1$, $\underline{\int} D = 0$).
7.6. Riemann Function. The Riemann function $R$ is zero on the set of all irrational numbers. If $r = p/q$ is a rational number, $p$ and $q$ are relatively prime and $q > 0$, then $R(r)$ is defined as $1/q$ (zero is supposed to be $0/1$, so $R(0) = 1$). The Riemann function is continuous at $x$ if and only if $x$ is irrational. The Riemann function serves as an example of a pathological function which is Riemann but not Newton integrable (it has no antiderivative) on bounded intervals.
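As an illustration of 7.6, here is a small sketch evaluating $R$ on rationals. It covers rational arguments only, since irrationals are not representable in this way; the function names are hypothetical. It also lists the finitely many points of $[0, 1]$ where $R \ge 1/n$, the fact behind continuity at irrational points.

```python
from fractions import Fraction

def riemann_function(r):
    """R(p/q) = 1/q for a rational r = p/q in lowest terms with q > 0."""
    r = Fraction(r)  # Fraction normalizes to lowest terms with positive denominator
    return Fraction(1, r.denominator)

def large_values(n):
    """All rationals r in [0, 1] with R(r) >= 1/n, i.e. reduced denominator <= n."""
    return sorted({Fraction(p, q) for q in range(1, n + 1) for p in range(q + 1)})

print(riemann_function(Fraction(3, 6)))  # 3/6 reduces to 1/2
print(len(large_values(5)))
```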
7.7. Various Examples.
The indicator function of the Cantor ternary set is Riemann
integrable. The indicator function of any discontinuum of a positive measure fails to be Riemann
integrable. An unbounded Newton integrable function cannot be Riemann integrable. There
are examples of bounded Newton integrable functions which are not Riemann integrable (cf.
7.9.f).


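7.8. Chebyshev's Inequality for the Riemann Integral. If $f$ is a nonnegative bounded function on $(a, b)$, $\varepsilon > \overline{\int} f$ and $\delta > 0$, then the set $\{f \ge \delta\}$ can be covered by a union of a finite number of intervals, the sum of whose lengths is less than $\varepsilon/\delta$ (so $\lambda^*\{f \ge \delta\} < \varepsilon/\delta$).
Hint. Find a partition $a = \alpha_0 < \alpha_1 < \dots < \alpha_m = b$ and numbers $c_j$ such that $c_j \ge f$ on $(\alpha_{j-1}, \alpha_j)$ and $\sum_{j=1}^{m} c_j (\alpha_j - \alpha_{j-1}) < \varepsilon$. Then the sum of lengths of those intervals $(\alpha_{j-1}, \alpha_j)$ for which $c_j \ge \delta$ is less than $\varepsilon/\delta$.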


7.9. Riemann Integrable Functions. Lebesgue measure theory allows a deeper study of
the class of Riemann integrable functions.
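(a) For any bounded function $f$ on an interval $(a, b)$ we have
\[ \overline{\int} f = \inf\Big\{ \int g : g \text{ piecewise constant},\ g \ge f \Big\}, \qquad \underline{\int} f = \sup\Big\{ \int g : g \text{ piecewise constant},\ g \le f \Big\}. \]
(A function $f$ is said to be piecewise constant on $[a, b]$ if there exist $\alpha_j$ such that $a = \alpha_0 < \alpha_1 < \dots < \alpha_m = b$ and $f$ is constant on each interval $(\alpha_{j-1}, \alpha_j)$.)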
(b) If $f$ is a bounded function on an interval $(a, b)$, then the function
\[ \overline{f} := \inf\{g : g \ge f,\ g \text{ continuous}\} \]
is called the upper Baire function of $f$. The definition of the lower Baire function $\underline{f}$ of $f$ is analogous.
Show that the function $\overline{f}$ is upper semi-continuous and that $f$ is continuous at a point $x$ if and only if $\overline{f}(x) = \underline{f}(x)$.
(c) Show that
\[ \overline{\int} f = \inf\Big\{ R\!\int g : g \ge f,\ g \text{ continuous} \Big\}. \]

(d) Let $f$ be a bounded function on an interval $(a, b)$. Show that the following conditions are equivalent:
(i) $f$ is Riemann integrable;
(ii) for each $\varepsilon > 0$ there exist continuous functions $t$, $s$ such that $t \le f \le s$ and $R\!\int_a^b (s - t) < \varepsilon$;
(iii) $\overline{f} = \underline{f}$ almost everywhere;
(iv) there exist functions $u$, $v$; $u$ upper semi-continuous, $v$ lower semi-continuous such that $v \le f \le u$ and $u = v$ almost everywhere;
(v) the set of all points where $f$ is discontinuous is of Lebesgue measure zero.
Hint. Use (b) and (c). For (ii) $\Rightarrow$ (iii), according to 7.8, $\lambda\{\overline{f} - \underline{f} \ge 1/k\} \le \lambda\{s - t \ge 1/k\} \le k\varepsilon$ for each $k \in \mathbb{N}$ and $\varepsilon > 0$.
(e) Show that any Riemann integrable function is measurable.
Hint. Any semi-continuous function is measurable. Now use (d) and Theorem 3.14.b.
(f) The above condition (v) makes it possible to construct bounded Newton integrable functions which are not Riemann integrable. An example of Volterra type functions constructed with the aid of closed nowhere dense sets of positive measure (Example 1.13) can be found in A.M. Bruckner [*1978].
7.10. Notes. The origins of the integral calculus are connected with the names of I. Newton
and G.W. Leibniz. The modern integration theory has been developed from the beginning of
the 19th century by A.L. Cauchy, L. Dirichlet, B. Riemann, C. Jordan, E. Borel, H. Lebesgue,
G. Vitali and others. Some historical treatments on integration are mentioned in 8.25.


8. The Abstract Lebesgue Integral


Elementary expositions of integration theory in the usual calculus textbooks employ Riemann's or Newton's constructions. However, the examples given in Chapter 7 show that these integrals provide rather small classes of integrable functions. In deeper applications, we need completeness of normed linear spaces of integrable functions, which is achieved with the aid of Lebesgue's integration. A further advantage of Lebesgue's approach consists in the possibility to integrate over more general domains than intervals.
In the sequel, $(X, \mathcal{S}, \mu)$ will be a measure space.
The Riemann integral of the indicator function of any interval is its length. Accordingly, in developing the notion of an integral on $X$, the requirement that $\int_X c_A \, d\mu = \mu A$ for any $A \in \mathcal{S}$ is quite natural. Further, it is reasonable to impose conditions on additivity and monotonicity of the integral and to ask the family of all integrable functions to be as large as possible.
We introduce the concept of the abstract Lebesgue integral in several steps. First we define $\int_X f \, d\mu$ in a natural way for nonnegative simple functions and then we extend it to nonnegative $\mu$-measurable functions using their approximation by simple functions. The general case will be completed using the decomposition of a function into its positive and negative parts. Building up this theory, we must be careful to justify these steps. For instance, we have to show that the definition of $\int_X s \, d\mu$ in the case of simple functions does not depend on their representation, which is not unique. Likewise, we have to deal with problems when infinite values or indefinite expressions appear.
The reader may find it instructive to keep in mind the most important example, the $n$-dimensional Lebesgue measure $\lambda$ in $\mathbb{R}^n$. For integration with respect to the Lebesgue measure we use the traditional notation
\[ \int_E f \, dx := \int_E f \, d\lambda, \qquad \int_a^b f \, dx := \int_{(a,b)} f \, d\lambda. \]
8.1. Simple Functions. Recall that a simple function is a real $\mathcal{S}$-measurable function $s$ on $X$ having a finite range. Any simple function $s$ can be expressed as
\[ s = \sum_{j=1}^{n} \beta_j c_{B_j}, \]
where $B_1, \dots, B_n \in \mathcal{S}$ are pairwise disjoint sets and $\beta_1, \dots, \beta_n \in \mathbb{R}$. Of course, this representation of $s$ is not unique.
Remember again that as a matter of convenience we set $0 \cdot \infty = \infty \cdot 0 = 0$.
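8.2. Lemma. Let $A_1, \dots, A_m \in \mathcal{S}$ and $B_1, \dots, B_n \in \mathcal{S}$ be pairwise disjoint sets and let $\alpha_i$ and $\beta_j$ be nonnegative real numbers. If
\[ \sum_{i=1}^{m} \alpha_i c_{A_i} \le \sum_{j=1}^{n} \beta_j c_{B_j}, \]
then
\[ \sum_{i=1}^{m} \alpha_i \, \mu A_i \le \sum_{j=1}^{n} \beta_j \, \mu B_j. \]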


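Proof. To simplify the proof define $\alpha_0 = \beta_0 = 0$, $A_0 = X \setminus \bigcup_{i=1}^{m} A_i$ and $B_0 = X \setminus \bigcup_{j=1}^{n} B_j$ (then the collections $\{A_i\}$, $\{B_j\}$ form partitions of $X$). For $i \in \{0, \dots, m\}$ and $j \in \{0, \dots, n\}$ either $A_i \cap B_j = \emptyset$ or $\alpha_i \le \beta_j$. Thus
\[ \sum_{i=0}^{m} \alpha_i \, \mu A_i = \sum_{i=0}^{m} \sum_{j=0}^{n} \alpha_i \, \mu(A_i \cap B_j) \le \sum_{i=0}^{m} \sum_{j=0}^{n} \beta_j \, \mu(A_i \cap B_j) = \sum_{j=0}^{n} \beta_j \, \mu B_j. \]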
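Applying the lemma in both directions shows that two representations of the same simple function give the same sum. A minimal sketch on a hypothetical finite measure space follows; the point masses and the helper `integral_simple` are assumptions made for illustration.

```python
from fractions import Fraction

def integral_simple(rep, mu):
    """Sum of beta_j * mu(B_j) for a representation rep = [(beta_j, B_j), ...]
    with pairwise disjoint sets B_j; mu maps each point to its mass."""
    return sum(beta * sum(mu[x] for x in B) for beta, B in rep)

# Hypothetical measure space: X = {0, 1, 2, 3} with the point masses below.
mu = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6), 3: Fraction(2)}

# Two representations of the same simple function s (s = 2 on {0, 1}, s = 5 on {2, 3}).
rep1 = [(2, {0, 1}), (5, {2, 3})]
rep2 = [(2, {0}), (2, {1}), (5, {2}), (5, {3})]

print(integral_simple(rep1, mu), integral_simple(rep2, mu))
```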

8.3. Abstract Lebesgue Integral. If $D \in \mathcal{S}$ and $s$ is a nonnegative simple function expressed as $s = \sum_{j=1}^{n} \beta_j c_{B_j}$, where $B_j$ are pairwise disjoint sets and $\beta_j$ are nonnegative coefficients, define
\[ \int_D s \, d\mu := \sum_{j=1}^{n} \beta_j \, \mu(D \cap B_j). \]
The previous lemma shows that this value does not depend on the particular representation of $s$. Next, define
\[ \int_D f \, d\mu := \sup\Big\{ \int_D s \, d\mu : 0 \le s \le f \text{ on } D,\ s \text{ simple} \Big\} \]
if $f \ge 0$ is a $\mu$-measurable function on $D \in \mathcal{S}$. Lemma 8.2 again ensures that the new definition and the old one agree in the case when $f$ is a simple function.


For an $\mathcal{S}_D$-measurable function $f$ on $D$ we define
\[ \int_D f \, d\mu := \int_D f^+ \, d\mu - \int_D f^- \, d\mu \]
provided at least one of these integrals is finite. Remember that $f^+ := \max(f, 0)$, $f^- := \max(-f, 0)$.
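If $f$ is a function on $X$ and $M \in \mathcal{S}$, then apparently
\[ \int_M f \, d\mu = \int_X f c_M \, d\mu \]
and $\int_M f \, d\mu = \int_M f \, d\mu_M$, where $\mu_M = \mu|_M$ is the restriction of $\mu$ to the $\sigma$-algebra $\mathcal{S}_M := \{A \in \mathcal{S} : A \subset M\}$ of subsets of $M$.
Therefore, it is no loss of generality to restrict our attention to integration over the whole space $X$.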
It is useful to define the abstract Lebesgue integral even for functions defined only $\mu$-almost everywhere. In this case, if $f$ is defined on $D \in \mathcal{S}$ and $\mu(X \setminus D) = 0$, set
\[ \int_X f \, d\mu := \int_D f \, d\mu \]
if the integral on the right side is defined. It is immediate that this value does not depend on the choice of $D$.
The symbol $\mathcal{L}^*$ or $\mathcal{L}^*(\mu)$ denotes the family of all $\mu$-measurable functions defined $\mu$-almost everywhere on $X$ for which the abstract Lebesgue integral is defined.

Further denote
\[ \mathcal{L}^1 = \mathcal{L}^1(\mu) = \Big\{ f \in \mathcal{L}^*(\mu) : \int_X f \, d\mu \in \mathbb{R} \Big\}. \]
If $f \in \mathcal{L}^1(\mu)$, we say that $f$ is $\mu$-integrable.

8.4. Lebesgue Integrable Functions on $\mathbb{R}$. (a) Every bounded $\lambda$-measurable function on a bounded interval (and so every Riemann integrable function) is Lebesgue integrable. Thus, the functions from Examples 7.5 and 7.6 are Lebesgue integrable as well.
(b) If $f$ is a Riemann integrable function on $[a, b]$, then $\int_a^b f \, d\lambda \le \overline{\int} f = \underline{\int} f \le \int_a^b f \, d\lambda$, so that the Lebesgue and the Riemann integrals of $f$ coincide.
(c) If a function has both the Newton and the Lebesgue integrals, they are equal. The proof of this is not easy. We will prove it using the Henstock-Kurzweil integral in Chapter 25.
(d) If $f$ is Newton integrable, then $f$ is $\lambda$-measurable since it can be expressed as the limit of a sequence $\{f_n\}$ of continuous functions, in essence, $f_n(x) = n\big(F(x + \tfrac{1}{n}) - F(x)\big)$, where $F$ is an antiderivative of $f$. It can happen that $f$ is not Lebesgue integrable, but this is only the case if $f$ is not absolutely Newton integrable.
(e) Since nonmeasurable functions are rather rare, the question whether or not $f$ is Lebesgue integrable can be reduced to the question whether the integrals $\int_a^b f^+ \, dx$ and $\int_a^b f^- \, dx$ are finite or infinite. For instance, the function $f : x \mapsto x^p$ is integrable on $(0, 1)$ if and only if $p > -1$.
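The threshold $p > -1$ in (e) can be seen from the elementary antiderivative of $x^p$. The sketch below evaluates the closed form of $\int_\varepsilon^1 x^p \, dx$ for small $\varepsilon$; the function name `truncated_integral` is hypothetical, and the computation is an illustration rather than a proof.

```python
import math

def truncated_integral(p, eps):
    """Exact value of the integral of x**p over (eps, 1), for 0 < eps < 1."""
    if p == -1:
        return -math.log(eps)
    return (1.0 - eps ** (p + 1)) / (p + 1)

for p in (-0.5, -1.0, -2.0):
    # For p > -1 the values stay bounded as eps -> 0; for p <= -1 they blow up.
    print(p, [truncated_integral(p, 10.0 ** (-k)) for k in (2, 4, 6)])
```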

The key result is the following monotone convergence theorem (sometimes called Levi's theorem or the Lebesgue monotone convergence theorem) for nonnegative functions.
8.5. Theorem. If $f_n \ge 0$ are $\mu$-measurable functions on $X$, $f_n \nearrow f$, then
\[ \int_X f_n \, d\mu \nearrow \int_X f \, d\mu. \]

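Proof. It is clear that the sequence $\{\int_X f_n \, d\mu\}$ is nondecreasing, and therefore there exists $\alpha := \lim \int_X f_n \, d\mu$. Since $f_n \le f$, we have $\alpha \le \int_X f \, d\mu$ and the assertion is obvious for $\alpha = +\infty$. Assume $\alpha < +\infty$, fix a simple function $s$, $0 \le s \le f$, and prove that $\int_X s \, d\mu \le \alpha$. The proof will be given in several steps:
(a) Let $\tau \in (0, 1)$. Define $E_n = \{x \in X : f_n(x) \ge \tau s(x)\}$. Then $E_n \in \mathcal{S}$, $E_n \subset E_{n+1}$ and $\bigcup_{n=1}^{\infty} E_n = X$ (if $f(x) = 0$, then $x \in E_1$; if $f(x) > 0$, then $\tau s(x) < \lim f_n(x)$). Thus $\mu(E_n \cap A) \nearrow \mu A$ for every set $A \in \mathcal{S}$.
(b) Let $s = \sum_{j=1}^{k} \beta_j c_{A_j}$, where $A_j$ are pairwise disjoint. Then
\[ \alpha \ge \int_X f_n \, d\mu \ge \int_{E_n} f_n \, d\mu \ge \tau \int_{E_n} s \, d\mu = \tau \int_{E_n} \Big( \sum_{j=1}^{k} \beta_j c_{A_j} \Big) d\mu = \tau \sum_{j=1}^{k} \beta_j \, \mu(A_j \cap E_n). \]
(c) Passing to limits in (b) (using (a)), we get
\[ \alpha \ge \tau \sum_{j=1}^{k} \beta_j \, \mu A_j = \tau \int_X s \, d\mu. \]
Since $\tau \in (0, 1)$ is arbitrary, we get $\alpha \ge \int_X s \, d\mu$, as needed.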


8.6. Theorem. Let $g \in \mathcal{L}^*$ and $f$ be a $\mu$-measurable function, $f = g$ almost everywhere. Then $f \in \mathcal{L}^*$ and $\int_X f \, d\mu = \int_X g \, d\mu$.
Proof. Obvious.
8.7. Remark. For any nonnegative $\mu$-measurable function $f$ on $X$ there exists a sequence of simple functions $s_n \ge 0$ such that $s_n \nearrow f$. From Theorem 8.5 we know that $\int_X f \, d\mu = \lim \int_X s_n \, d\mu$. As is often the case, this observation becomes the basis for a definition of the integral of nonnegative $\mu$-measurable functions. Of course, starting with such a definition, it is necessary to prove the independence of $\int_X f \, d\mu$ of the choice of the sequence $\{s_n\}$.
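One common choice of the approximating sequence, assumed here for illustration and not taken from the text, is the dyadic truncation $s_n = 2^{-n} \lfloor 2^n f \rfloor$. For $f(x) = x$ on $[0, 1)$ with Lebesgue measure, $s_n$ equals $k/2^n$ on an interval of length $2^{-n}$, so its integral can be computed exactly; the function name is hypothetical.

```python
from fractions import Fraction

def integral_dyadic_approx(n):
    """Integral over [0, 1) of s_n = floor(2**n * x) / 2**n w.r.t. Lebesgue measure.
    s_n takes the value k / 2**n on [k / 2**n, (k + 1) / 2**n), a set of measure 2**-n."""
    N = 2 ** n
    return sum(Fraction(k, N) * Fraction(1, N) for k in range(N))

values = [integral_dyadic_approx(n) for n in range(1, 11)]
print(values[:4])  # nondecreasing, approaching 1/2 from below
```

The integrals equal $\tfrac12 - 2^{-(n+1)}$ and increase to $\int_0^1 x \, dx = \tfrac12$, in agreement with Theorem 8.5.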

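8.8. Theorem. If $f_1$, $f_2$ are nonnegative $\mu$-measurable functions, then
\[ \int_X (f_1 + f_2) \, d\mu = \int_X f_1 \, d\mu + \int_X f_2 \, d\mu. \]
Proof. It may be assumed that both $f_1$ and $f_2$ are defined everywhere on $X$. First suppose that $f_1$ and $f_2$ are even simple. Following the same line of proof as in Lemma 8.2, we can find pairwise disjoint $\mathcal{S}$-measurable sets $A_0, \dots, A_m$, pairwise disjoint $\mathcal{S}$-measurable sets $B_0, \dots, B_n$ and nonnegative real numbers $\alpha_0, \dots, \alpha_m$ and $\beta_0, \dots, \beta_n$ such that $X = \bigcup_{i=0}^{m} A_i = \bigcup_{j=0}^{n} B_j$, $f_1 = \sum_{i=0}^{m} \alpha_i c_{A_i}$ and $f_2 = \sum_{j=0}^{n} \beta_j c_{B_j}$. Then
\[ \int_X (f_1 + f_2) \, d\mu = \sum_{i=0}^{m} \sum_{j=0}^{n} (\alpha_i + \beta_j) \, \mu(A_i \cap B_j) = \sum_{i=0}^{m} \alpha_i \, \mu A_i + \sum_{j=0}^{n} \beta_j \, \mu B_j = \int_X f_1 \, d\mu + \int_X f_2 \, d\mu. \]
For the general case, find sequences $\{s_n^1\}$ and $\{s_n^2\}$ of simple functions with $s_n^i \nearrow f_i$ and use the first part and Theorem 8.5.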
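8.9. Theorem (properties of $\mathcal{L}^1(\mu)$). The following propositions hold:
(a) If $f \in \mathcal{L}^1$, then $f$ is finite almost everywhere.
(b) If $f, g \in \mathcal{L}^1$, $\alpha, \beta \in \mathbb{R}$, then $\alpha f + \beta g \in \mathcal{L}^1$ and $\int_X (\alpha f + \beta g) \, d\mu = \alpha \int_X f \, d\mu + \beta \int_X g \, d\mu$ (an appeal to (a) reveals that $\alpha f + \beta g$ is defined almost everywhere).
(c) If $f \in \mathcal{L}^1$, then $|f| \in \mathcal{L}^1$ and $\big| \int_X f \, d\mu \big| \le \int_X |f| \, d\mu$ (i.e. the integral is absolutely convergent).
(d) $\mathcal{L}^1$ is a lattice (if $f, g \in \mathcal{L}^1$, then $\max(f, g), \min(f, g) \in \mathcal{L}^1$).
(e) If $f$ is $\mu$-measurable, $g \in \mathcal{L}^1$, $|f| \le g$, then $f \in \mathcal{L}^1$.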
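Proof. (a) Suppose that $f \in \mathcal{L}^1$, $A := \{x \in X : f(x) = +\infty\}$. Then $A$ is measurable and $0 \le n c_A \le f^+$ for every $n \in \mathbb{N}$. Thus $0 \le \int_X n c_A \, d\mu \le \int_X f^+ \, d\mu$ and we get $\mu A \le \frac{1}{n} \int_X f^+ \, d\mu < \infty$ for every $n \in \mathbb{N}$. Hence $\mu A = 0$. The set $\{f = -\infty\}$ is handled in the same way using $f^-$.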


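(b) It is plain to see that $\int_X \alpha f \, d\mu = \alpha \int_X f \, d\mu$ for $f \in \mathcal{L}^1$ and $\alpha \in \mathbb{R}$. Let $f, g \in \mathcal{L}^1$ and write $f = f^+ - f^-$, $g = g^+ - g^-$ and $h = f + g$ (by (a), $h$ is defined almost everywhere). Then
\[ h^+ - h^- = f^+ - f^- + g^+ - g^-, \]
and in light of Theorem 8.8,
\[ \int_X h^+ \, d\mu + \int_X f^- \, d\mu + \int_X g^- \, d\mu = \int_X f^+ \, d\mu + \int_X g^+ \, d\mu + \int_X h^- \, d\mu. \]
The assertion now follows if we observe that the integrals $\int_X h^+ \, d\mu$ and $\int_X h^- \, d\mu$ are finite. But this is true as
\[ 0 \le h^+ = (f + g)^+ \le f^+ + g^+ \]
and, similarly, $0 \le h^- \le f^- + g^-$.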
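(c) If $f \in \mathcal{L}^1$, then $|f| = f^+ + f^- \in \mathcal{L}^1$ by (b), and we have
\[ \Big| \int_X f \, d\mu \Big| = \Big| \int_X f^+ \, d\mu - \int_X f^- \, d\mu \Big| \le \Big| \int_X f^+ \, d\mu \Big| + \Big| \int_X f^- \, d\mu \Big| = \int_X f^+ \, d\mu + \int_X f^- \, d\mu = \int_X |f| \, d\mu. \]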

(d) The assertion follows immediately from (b) and (c) using
\[ \max(f, g) = \tfrac{1}{2} (f + g + |f - g|) \]
(and the analogous formula $\min(f, g) = \tfrac{1}{2} (f + g - |f - g|)$).

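(e) If $f$ is a $\mu$-measurable function, $f^+$ is $\mu$-measurable as well and
\[ 0 \le \int_X f^+ \, d\mu \le \int_X g \, d\mu < \infty \]
($0 \le f^+ \le |f| \le g$), thus $f^+ \in \mathcal{L}^1$. Analogously $f^- \in \mathcal{L}^1$, and finally $f = f^+ - f^- \in \mathcal{L}^1$.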

8.10. Remarks. 1. Proposition (b) of the previous theorem shows that $\mathcal{L}^1$ satisfies almost all axioms for a linear space but it is not a true linear space. To get a linear space, the set of all finite everywhere defined functions from $\mathcal{L}^1$ or the space $L^1$ (which will be introduced in Chapter 10) is to be considered.
2. The sum of an absolutely convergent series $\sum_j a_j$ is the integral of the function $j \mapsto a_j$ over the set $\mathbb{N}$ with respect to the counting measure. On the other hand, the abstract integration theory cannot describe sums of nonabsolutely convergent series. This observation should not be understood as an insufficiency of Lebesgue's theory. The abstract Lebesgue integral described in 8.3 provides the best approach in the general framework of measure spaces. Let us consider a simple experiment: A rearrangement of $\mathbb{N}$ can change the sum of a nonabsolutely convergent series while measures of sets remain invariant. The sum depends on an additional structure on $\mathbb{N}$, namely, on its ordering. Analogously, taking into account the ordering of the real line (in addition to its measure properties), various kinds of nonabsolutely convergent integrals may be introduced. Let us mention Newton integration on an elementary level and Perron or Henstock-Kurzweil integration on an advanced level (see Section 25). It is instructive to compare the expressions
\[ \int_1^{\infty} \frac{\sin x}{x} \, dx = \lim_{b \to \infty} \int_1^{b} \frac{\sin x}{x} \, dx \]
(where the integral is understood as Newtonian) and
\[ \sum_{k=1}^{\infty} \frac{\sin k}{k} = \lim_{n \to \infty} \sum_{k=1}^{n} \frac{\sin k}{k}. \]
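The rearrangement experiment of Remark 2 can be carried out numerically. The sketch below uses the alternating harmonic series, a standard example chosen here and not taken from the text: in the natural order the partial sums approach $\ln 2$, while taking one positive term followed by two negative ones gives $\tfrac12 \ln 2$.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in the natural order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Same terms reordered: 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...
    (one positive term followed by two negative ones per block)."""
    total = 0.0
    for m in range(1, n_blocks + 1):
        total += 1.0 / (2 * m - 1) - 1.0 / (4 * m - 2) - 1.0 / (4 * m)
    return total

print(alternating_harmonic(200000), math.log(2))
print(rearranged(100000), math.log(2) / 2)
```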

In the theory of Lebesgue integration, limit theorems play a crucial role. They
concern either monotone convergence or dominated convergence.


8.11. Levi's theorem. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions, $f_n \nearrow f$ almost everywhere and let $\int_X f_1 \, d\mu > -\infty$. Then $\int_X f \, d\mu = \lim \int_X f_n \, d\mu$.
Proof. We have already proved the theorem when $f_n$ are nonnegative and $f_n \nearrow f$ everywhere. The general case can easily be reduced to Theorem 8.5. First we redefine $f$ and $f_n$ on sets of measure zero in such a way that $f_n \nearrow f$ everywhere. It is clearly sufficient to assume that $f_n \in \mathcal{L}^1$ for all $n$. If $g_n = f_n + f_1^-$ (notice that $f_1^- \in \mathcal{L}^1$), then $g_n$ are nonnegative $\mu$-measurable functions and $g_n \nearrow f + f_1^-$. Now, Theorem 8.5 ensures that $\int_X g_n \, d\mu \nearrow \int_X (f + f_1^-) \, d\mu$. Since $\int_X f_n \, d\mu = \int_X g_n \, d\mu - \int_X f_1^- \, d\mu$, the assertion easily follows.
8.12. Levi's theorem for series. If $f_n$ are nonnegative $\mu$-measurable functions on $X$, then
\[ \int_X \sum_n f_n \, d\mu = \sum_n \int_X f_n \, d\mu. \]
Proof. Use Theorems 8.5 and 8.8.
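8.13. Lebesgue dominated convergence theorem. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions, $f_n \to f$ almost everywhere. If there exists a function $h \in \mathcal{L}^1$ such that $|f_n| \le h$ almost everywhere for all $n$, then $f \in \mathcal{L}^1$ and
\[ \int_X f \, d\mu = \lim \int_X f_n \, d\mu. \]
Proof. The proof may be easily reduced to Levi's theorem by considering $f = \limsup f_n = \liminf f_n$. Set $s_n = \sup\{f_n, f_{n+1}, \dots\}$, $t_n = \inf\{f_n, f_{n+1}, \dots\}$. Then $-h \le t_n \le f_n \le s_n \le h$, $s_n \searrow f$, $t_n \nearrow f$ almost everywhere and
\[ -\infty < -\int_X h \, d\mu \le \int_X t_1 \, d\mu \le \int_X s_1 \, d\mu \le \int_X h \, d\mu < +\infty. \]
By Levi's theorem, $\int_X f \, d\mu = \lim \int_X t_n \, d\mu = \lim \int_X s_n \, d\mu$. Since $\int_X t_n \, d\mu \le \int_X f_n \, d\mu \le \int_X s_n \, d\mu$, the assertion follows.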
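8.14. Dominated convergence theorem for series. Let $\{h_n\}$ be a sequence of $\mu$-measurable functions on $X$ and $g \in \mathcal{L}^1$. Suppose that
\[ \Big| \sum_{j=1}^{n} h_j \Big| \le g \quad \text{almost everywhere} \]
for all $n \in \mathbb{N}$ and that the series $\sum_{j=1}^{\infty} h_j$ converges almost everywhere. Then
\[ \int_X \sum_{j=1}^{\infty} h_j \, d\mu = \sum_{j=1}^{\infty} \int_X h_j \, d\mu. \]
Proof. The theorem follows immediately from the previous one.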


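8.15. Fatou's lemma. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions and $g \in \mathcal{L}^1$. If $f_n \ge g$ almost everywhere for all $n \in \mathbb{N}$, then
\[ \int_X \liminf f_n \, d\mu \le \liminf \int_X f_n \, d\mu. \]
Proof. Set $g_k = \inf\{f_n : k \le n\}$. Then $g_k \nearrow \liminf f_n$ and $\int_X g_k \, d\mu \le \int_X f_n \, d\mu$ for $n \ge k$; now apply Levi's theorem again.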
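A standard example, chosen here for illustration and not taken from the text, shows that the inequality in Fatou's lemma can be strict and that the domination assumption in Theorem 8.13 cannot simply be dropped: on $(0, 1)$ with Lebesgue measure take $f_n = n \, c_{(0, 1/n)}$. The sketch computes the integrals exactly and checks the pointwise limit at a sample point.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n * (indicator function of the interval (0, 1/n))."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Integral of f_n over (0, 1): the value n on a set of Lebesgue measure 1/n."""
    return n * Fraction(1, n)

x = Fraction(1, 10)
print([f(n, x) for n in range(1, 15)])       # eventually 0 at every fixed x
print([integral_f(n) for n in range(1, 6)])  # but every integral equals 1
```

Here $\liminf f_n = 0$ everywhere, so the left side of Fatou's inequality is $0$ while the right side is $1$.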



8.16. Theorem. Let $f \ge 0$ be a $\mu$-measurable function. If $\int_X f \, d\mu = 0$, then $f = 0$ almost everywhere.
Proof. Denote $A_n = \{x \in X : f(x) \ge \frac{1}{n}\}$. Then
\[ \bigcup_{n=1}^{\infty} A_n = \{x \in X : f(x) > 0\} \]
and the inequality $0 \le c_{A_n} \le n f$ yields $\mu(A_n) = 0$.



8.17. Theorem. Let $f \in \mathcal{L}^1$. If $\int_E f \, d\mu = 0$ for any measurable set $E$, then $f = 0$ almost everywhere.
Proof. If $E := \{f \ge 0\}$, then $\int_X f^+ \, d\mu = \int_E f \, d\mu = 0$ and using the previous theorem we get $f^+ = 0$ almost everywhere. Analogously $f^- = 0$ almost everywhere.
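8.18. Corollary. Let $f, g \in \mathcal{L}^1$. If
\[ \int_E f \, d\mu \le \int_E g \, d\mu \]
for every measurable set $E$, then $f \le g$ almost everywhere.
Proof. Set $h = (f - g)^+$. Then $h \ge 0$ and $0 \le \int_E h \, d\mu = \int_{E \cap \{h > 0\}} (f - g) \, d\mu \le 0$. Thanks to Theorem 8.17, $(f - g)^+ = 0$ almost everywhere.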
8.19. Indefinite Lebesgue integral. Let $f \in \mathcal{L}^*$. For $E \in \mathcal{S}$ set
\[ \mu_f(E) := \int_E f \, d\mu. \]
The set function $\mu_f$ is called the indefinite Lebesgue integral of $f$.


8.20. Theorem. Let $f \in \mathcal{L}^*$. Then $\mu_f$ is a signed measure on $X$. Moreover, $\mu_f(E) = 0$ for every set $E \in \mathcal{S}$ with $\mu E = 0$.
Proof. It is enough to prove that $\mu_f$ is a measure provided $f$ is nonnegative. In this case, notice that $f c_A = \sum_n f c_{A_n}$ if $A = \bigcup_n A_n$ with $A_i \cap A_j = \emptyset$ for $i \ne j$, and cite Theorem 8.12.


8.21. Remark. A question arises whether or not for any pair of measures on $\mathcal{S}$, say $\mu$ and $\nu$, there is always a nonnegative $\mu$-measurable function $f$ on $X$ such that $\nu = \mu_f$, i.e.
\[ \nu E = \int_E f \, d\mu \]
for all $E \in \mathcal{S}$. Of course, the answer is negative; if such a function exists, then necessarily $\nu E = 0$ whenever $\mu E = 0$. But if $\mu$ and $\nu$ obey this condition (we say that $\nu$ is absolutely continuous with respect to $\mu$), then there is a function $f$ with the ascribed property (at least for $\sigma$-finite measures). This will be the subject of the chapter concerning the Radon-Nikodým theorem.
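8.22. Exercises. Let $f \in \mathcal{L}^1$. Prove that
(a) $\mu_f$ is a finite signed measure on $\mathcal{S}$;
(b) for every $\varepsilon > 0$ there is a $\delta > 0$ such that $\mu_f(E) < \varepsilon$ whenever $\mu E < \delta$.
Hint. (a) To verify the equality $\mu_f(A) = \sum_n \mu_f(A_n)$ ($A = \bigcup_n A_n$, $A_n$ pairwise disjoint) use Theorem 8.14 for $h_n = f c_{A_n}$.
(b) Suppose there exist an $\varepsilon > 0$ and a sequence $\{E_n\}$ such that $\mu E_n < 2^{-n}$ and $\int_{E_n} |f| \, d\mu \ge \varepsilon$. Set $E = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} E_k$. Then $\mu E = 0$ and $\int_E |f| \, d\mu \ge \varepsilon$, which is a contradiction.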


8.23. Image of a Measure. Let $(X, \mathcal{S}, \mu)$ be a measure space, $(Y, \mathcal{T})$ be a measurable space and $f : X \to Y$ a measurable mapping (i.e. $f^{-1}(E) \in \mathcal{S}$ whenever $E \in \mathcal{T}$). The set function
\[ E \mapsto \mu(f^{-1}(E)), \qquad E \in \mathcal{T}, \]
is called the image of the measure $\mu$ under the mapping $f$ and it is denoted by $f(\mu)$.
(a) Show that $f(\mu)$ is a measure on $(Y, \mathcal{T})$.
(b) Let $g : Y \to \mathbb{R}$ be a $\mathcal{T}$-measurable function. Show that $g$ is $f(\mu)$-integrable if and only if $g \circ f \in \mathcal{L}^1(\mu)$. In this case
\[ \int_Y g \, df(\mu) = \int_X g \circ f \, d\mu. \]
Hint. First consider simple functions and then pass to limits.
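On a finite set with point masses the identity in (b) reduces to regrouping a finite sum, which the following sketch checks exactly. The measure, the mapping and the helper `image_measure` are hypothetical choices made for illustration.

```python
from fractions import Fraction
from collections import defaultdict

def image_measure(mu, f):
    """Point masses of f(mu) on Y, given point masses mu on a finite set X."""
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[f(x)] += mass  # f(mu){y} = mu(f^{-1}{y})
    return dict(nu)

# Hypothetical data: a measure on X = {-2, -1, 1, 2}, f(x) = x^2, g(y) = 3y + 1.
mu = {-2: Fraction(1, 4), -1: Fraction(1, 4), 1: Fraction(1, 3), 2: Fraction(1, 6)}
f = lambda x: x * x
g = lambda y: 3 * y + 1

nu = image_measure(mu, f)
lhs = sum(g(y) * m for y, m in nu.items())     # integral of g with respect to f(mu)
rhs = sum(g(f(x)) * m for x, m in mu.items())  # integral of g o f with respect to mu
print(nu, lhs, rhs)
```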


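8.24. Exercise. Let $f$ be an increasing differentiable function on $\mathbb{R}$, $\lim_{x \to \pm\infty} f(x) = \pm\infty$. Let $\mu$ be a measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ defined by
\[ \mu E = \int_E f' \, dx. \]
Show that $f(\mu) = \lambda$.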
8.25. Notes. The modern integration theory starts with Lebesgue's doctoral thesis [1902] following a short paper [1901]. His definition uses the previously constructed Lebesgue measure on $\mathbb{R}$. The idea of using simple functions (not equal to those introduced here) when defining the integral (still on the real line) comes from F. Riesz ([1912], [1920]).
Further details can be found in historical notes by T. Hawkins [*1970] or by F.A. Medvedev [1975], R. Henstock [1988], T.H. Hildebrandt [1953]. It is clear that H. Lebesgue followed investigations of his predecessors (B. Riemann, C. Jordan, G. Peano, E. Borel and others). Less known is the fact that (roughly) at the same time G. Vitali and W.H. Young published similar results independently.
Theorem 8.11 (concerning the monotone convergence) was proved by Beppo Levi [1906], Lemma 8.15 by P.J.L. Fatou [1906] and Theorem 8.13 by H. Lebesgue [1910].
One more remark: H. Lebesgue developed his theory of integration mainly for the case of Lebesgue measures on $\mathbb{R}^n$. Later J. Radon [1913] considered more general measures in $\mathbb{R}^n$ (nowadays called the Radon measures). A general theory of measures on arbitrary $\sigma$-algebras was given by M. Fréchet [1915]. Since that time, many results on this subject have been published; the monograph [*1950] by P.R. Halmos is one of the most quoted.

9. Integrals Depending on a Parameter


The Lebesgue dominated convergence theorem has simple but very important consequences for continuity and differentiation of integrals depending on a parameter.
9.1. Theorem. Let $(X, \mathcal{S}, \mu)$ be a measure space, $P$ a metric space and $U$ a neighbourhood of a point $a \in P$. Suppose that a function $F : U \times X \to \mathbb{R}$ has the following properties:
(a) there exists a set $N \subset X$ of measure zero such that for each $x \in X \setminus N$ the function $F(\cdot, x)$ is continuous at $a$;
(b) for each $t \in U$, the function $F(t, \cdot)$ is $\mu$-measurable;
(c) there exists a function $g \in \mathcal{L}^1(X)$ such that $|F(t, \cdot)| \le g$ almost everywhere for all $t \in U$.

Then for each $t \in U$, $F(t, \cdot) \in \mathcal{L}^1(X)$ and the function
\[ f : t \mapsto \int_X F(t, \cdot) \, d\mu \]
is continuous at $a$.
Proof. The proof that
\[ \lim_{t \to a} \int_X F(t, \cdot) \, d\mu = \int_X F(a, \cdot) \, d\mu \]
is achieved by showing that
\[ \lim_{j \to \infty} \int_X F(t_j, \cdot) \, d\mu = \int_X F(a, \cdot) \, d\mu \]
for each sequence $t_j \to a$, $t_j \in U$. To prove the last assertion it suffices to use the Lebesgue dominated convergence theorem.
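9.2. Theorem. Let $(X, \mathcal{S}, \mu)$ be a measure space, $N \subset X$ a set of measure zero and $I \subset \mathbb{R}$ an open interval. Suppose that a function $F : I \times X \to \mathbb{R}$ has the following properties:
(a) for each $x \in X \setminus N$, $F(\cdot, x)$ is differentiable on $I$;
(b) for each $t \in I$, $F(t, \cdot)$ is $\mu$-measurable;
(c) there exists a function $g \in \mathcal{L}^1(X)$ such that $\big| \frac{d}{dt} F(t, x) \big| \le g(x)$ for each $x \in X \setminus N$ and $t \in I$;
(d) there is a $t_0 \in I$ such that $F(t_0, \cdot) \in \mathcal{L}^1(X)$.
Then $F(t, \cdot) \in \mathcal{L}^1(X)$ for all $t \in I$, the function
\[ f : t \mapsto \int_X F(t, \cdot) \, d\mu \]
is differentiable on $I$ and
\[ f'(t) = \int_X \frac{d}{dt} F(t, \cdot) \, d\mu. \]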

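Proof. Suppose $a, b \in I$, $b \ne a$, $x \in X \setminus N$. By the mean value theorem there exists a $\xi$ between $a$ and $b$ such that
\[ \Big| \frac{F(b, x) - F(a, x)}{b - a} \Big| = \Big| \frac{d}{dt} F(\xi, x) \Big| \le g(x). \]
For $a = t_0$ it follows that the function
\[ x \mapsto \frac{F(b, x) - F(a, x)}{b - a} \]
is integrable, thus $F(b, \cdot)$ is integrable. Choose $a \in I$ again. By the Lebesgue dominated convergence theorem,
\[ \lim_{j \to \infty} \int_X \frac{F(t_j, \cdot) - F(a, \cdot)}{t_j - a} \, d\mu = \int_X \frac{d}{dt} F(a, \cdot) \, d\mu \]
for each sequence $t_j \to a$ of points of $I \setminus \{a\}$, which yields
\[ \frac{d}{dt} \int_X F(a, \cdot) \, d\mu = \lim_{t \to a} \int_X \frac{F(t, \cdot) - F(a, \cdot)}{t - a} \, d\mu = \int_X \frac{d}{dt} F(a, \cdot) \, d\mu. \]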
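A numerical sanity check of the formula $f'(t) = \int_X \frac{d}{dt} F(t, \cdot) \, d\mu$ may be helpful. The sketch below assumes $X = [0, 1]$ with Lebesgue measure and $F(t, x) = \sin(tx)$, so that $|\frac{d}{dt} F| \le 1$ serves as the integrable majorant; the integrals are approximated by a midpoint rule, which is a choice made here for illustration only.

```python
import math

def midpoint(h, n=20000):
    """Midpoint-rule approximation of the integral of h over [0, 1]."""
    return sum(h((k + 0.5) / n) for k in range(n)) / n

def f(t):
    """f(t) = integral over [0, 1] of F(t, x) = sin(t * x)."""
    return midpoint(lambda x: math.sin(t * x))

def integral_of_derivative(t):
    """Integral over [0, 1] of dF/dt (t, x) = x * cos(t * x)."""
    return midpoint(lambda x: x * math.cos(t * x))

t, dt = 1.3, 1e-5
difference_quotient = (f(t + dt) - f(t - dt)) / (2 * dt)
print(difference_quotient, integral_of_derivative(t))
```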

9.3. Remarks. 1. Notice that in the proofs of the last theorems we made heavy use of the Lebesgue dominated convergence theorem. Of course we could state analogous results based on Levi's theorem.
2. Since the notions of continuity and derivative are local, it suffices to verify the assumption (c) only locally.
3. A number of exercises and clarifying examples can be found in [L-Pr].

10. The Lp Spaces


The Lebesgue spaces $L^p$ are an important means of linking measure theory and functional analysis, and they are essential tools in the theory of differential equations, probability theory and other branches of modern analysis.
In this chapter, $(X, \mathcal{S}, \mu)$ denotes a fixed measure space and $p$ a number from the interval $[1, \infty]$.
10.1. The Set $\mathcal{L}^p$. (a) Suppose $p < \infty$. We denote by $\mathcal{L}^p = \mathcal{L}^p(X, \mathcal{S}, \mu)$ the set of all $\mu$-measurable functions on $X$ such that
\[ \int_X |f|^p \, d\mu < \infty. \]
The value
\[ \|f\|_p := \Big( \int_X |f|^p \, d\mu \Big)^{1/p} \]
is called the $L^p$-norm of a function $f \in \mathcal{L}^p$. We will show soon that $\|\cdot\|_p$ has all the important properties of a norm.
(b) We denote by $\mathcal{L}^\infty = \mathcal{L}^\infty(X, \mathcal{S}, \mu)$ the set of all $\mu$-measurable functions $f$ on $X$ such that $|f| \le M$ $\mu$-almost everywhere for some constant $M$. The least constant $M$ having this property is called the $L^\infty$-norm of $f$ and is denoted by $\|f\|_\infty$. Roughly speaking, the difference between $\|f\|_\infty$ and $\sup_X |f|$ is that $\|f\|_\infty$ omits values of $f$ on null sets.
Sometimes it is useful to emphasize the measure $\mu$ or the space $X$. In this case, abbreviated symbols $\mathcal{L}^p(X)$ or $\mathcal{L}^p(\mu)$ will be used as well.

10.2. Young's inequality. Let $p, q \in (1, \infty)$, $\frac{1}{p} + \frac{1}{q} = 1$. If $a$, $b$ are nonnegative numbers, then
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \]
Proof. We can assume $ab > 0$. Making use of concavity of the function $\ln$,
\[ \ln\Big( \frac{a^p}{p} + \frac{b^q}{q} \Big) \ge \frac{1}{p} \ln(a^p) + \frac{1}{q} \ln(b^q) = \ln a + \ln b = \ln(ab) \]
holds, and we easily establish the required inequality.


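10.3. Hölder's inequality. Suppose that $f \in \mathcal{L}^p$ and $g \in \mathcal{L}^q$, where $p, q \in (1, \infty)$, $\frac{1}{p} + \frac{1}{q} = 1$. Then $fg \in \mathcal{L}^1$ and
\[ \Big| \int_X fg \, d\mu \Big| \le \Big( \int_X |f|^p \, d\mu \Big)^{1/p} \Big( \int_X |g|^q \, d\mu \Big)^{1/q}. \]
Proof. Denote
\[ s = \Big( \int_X |f|^p \, d\mu \Big)^{1/p}, \qquad t = \Big( \int_X |g|^q \, d\mu \Big)^{1/q}. \]
We may assume that $st > 0$. Thanks to Young's inequality ($a = |f(x)|/s$, $b = |g(x)|/t$) we have
\[ \frac{f(x) g(x)}{st} \le \frac{|f(x)| \, |g(x)|}{st} \le \frac{|f(x)|^p}{p s^p} + \frac{|g(x)|^q}{q t^q} \]
for each $x \in X$. Thus
\[ \frac{1}{st} \int_X fg \, d\mu \le \frac{\int_X |f|^p \, d\mu}{p s^p} + \frac{\int_X |g|^q \, d\mu}{q t^q} = \frac{1}{p} + \frac{1}{q} = 1. \]
Applying the same estimate to $-f$ in place of $f$ gives the bound for the absolute value, which is what we wanted to prove.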
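For the counting measure on a finite set, Hölder's inequality becomes an inequality between finite sums and can be checked numerically. The sketch below uses randomly generated vectors (an arbitrary choice) and also tests the case $g = \operatorname{sign}(f) |f|^{p-1}$, in which both sides coincide; the helper `p_norm` is hypothetical.

```python
import math
import random

def p_norm(v, p):
    """(sum |v_k|**p)**(1/p): the L^p norm for the counting measure on a finite set."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

random.seed(0)
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
f = [random.uniform(-1, 1) for _ in range(50)]
g = [random.uniform(-1, 1) for _ in range(50)]

lhs = abs(sum(a * b for a, b in zip(f, g)))
rhs = p_norm(f, p) * p_norm(g, q)

# Equality case: g_eq = sign(f) * |f|**(p - 1).
g_eq = [math.copysign(abs(a) ** (p - 1), a) for a in f]
lhs_eq = abs(sum(a * b for a, b in zip(f, g_eq)))
rhs_eq = p_norm(f, p) * p_norm(g_eq, q)
print(lhs <= rhs, lhs_eq, rhs_eq)
```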


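10.4. Minkowski's inequality. Let $p \in [1, \infty]$ and $f, g \in \mathcal{L}^p$. Then $f + g \in \mathcal{L}^p$ and
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
Proof. It is not hard to verify that $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$.
If $p = \infty$ and $|f| \le s$ $\mu$-almost everywhere and $|g| \le t$ $\mu$-almost everywhere, then $|f + g| \le s + t$ $\mu$-almost everywhere. Therefore $\|f + g\|_\infty \le s + t$, and consequently
\[ \|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty. \]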

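With these trivial cases out of the way, there remains the case $1 < p < \infty$. Let $q$ be given by $\frac{1}{p} + \frac{1}{q} = 1$, so that $(p - 1) q = p$. Hölder's inequality yields
\[ \int_X |f| \, |f + g|^{p-1} \, d\mu \le \Big( \int_X |f|^p \, d\mu \Big)^{1/p} \Big( \int_X |f + g|^{(p-1)q} \, d\mu \Big)^{1/q} = \Big( \int_X |f|^p \, d\mu \Big)^{1/p} \Big( \int_X |f + g|^p \, d\mu \Big)^{1 - 1/p}. \]
Analogously
\[ \int_X |g| \, |f + g|^{p-1} \, d\mu \le \Big( \int_X |g|^p \, d\mu \Big)^{1/p} \Big( \int_X |f + g|^p \, d\mu \Big)^{1 - 1/p}. \]
This entails
\[ \int_X |f + g|^p \, d\mu \le \int_X |f| \, |f + g|^{p-1} \, d\mu + \int_X |g| \, |f + g|^{p-1} \, d\mu \le \Big[ \Big( \int_X |f|^p \, d\mu \Big)^{1/p} + \Big( \int_X |g|^p \, d\mu \Big)^{1/p} \Big] \Big( \int_X |f + g|^p \, d\mu \Big)^{1 - 1/p}. \]
The integral $\int_X |f + g|^p \, d\mu$ is finite, since $|f + g|^p \le 2^{p-1} (|f|^p + |g|^p)$. If it is nonzero, dividing by $\big( \int_X |f + g|^p \, d\mu \big)^{1 - 1/p}$ yields the assertion.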

10.5. The $L^p$ Spaces. The behavior of the function $f \mapsto \|f\|_p$ on $\mathcal{L}^p$, where $p \in [1, \infty]$, resembles the axioms of a norm. However, in general, $\mathcal{L}^p$ is not a linear space and a nonzero function may have zero norm. To apply the theory of normed linear spaces, we identify functions which are equal almost everywhere. Formally, we assign to every function $f \in \mathcal{L}^p$ the class of functions
\[ [f] = \{g \in \mathcal{L}^p : g = f \ \mu\text{-almost everywhere on } X\} \]
and define
\[ L^p = L^p(X, \mathcal{S}, \mu) = \{[f] : f \in \mathcal{L}^p\}. \]
Then $L^p$ is a linear space equipped with operations
\[ [f] + [g] := [f + g], \qquad \alpha [f] := [\alpha f] \quad (\alpha \in \mathbb{R}), \]
and with the (true) norm
\[ \big\| [f] \big\|_p := \|f\|_p. \]
It can easily be seen that these definitions do not depend on the choice of representatives.
It is customary not to distinguish between functions and classes, often even between the spaces $\mathcal{L}^p$ and $L^p$. For instance, we say that $\{f_j\}$ is a Cauchy sequence in $L^p$ while the meaning is that $f_j$ are functions and $\{[f_j]\}$ is a Cauchy sequence in $L^p$.

10.6. Completeness of $L^p$. Let $\{f_j\}$ be a Cauchy sequence in $L^p$. Then $\{f_j\}$ is convergent in $L^p$, i.e. there exists an $f \in \mathcal{L}^p$ such that
\[ \int_X |f - f_j|^p \, d\mu \to 0. \]
Proof. The easy case $p = \infty$ is left to the reader. Suppose $p < \infty$. First we choose a subsequence $\{g_j\}$ of $\{f_j\}$ such that
\[ S := \sum_{j=1}^{\infty} \|g_j - g_{j+1}\|_p < \infty. \]
Denote $h = \sum_{j=1}^{\infty} |g_j - g_{j+1}|$. Then by Levi's Theorem 8.12 and Minkowski's inequality
\[ \int_X h^p \, d\mu = \int_X \Big( \sum_{j=1}^{\infty} |g_j - g_{j+1}| \Big)^p d\mu = \lim_{k \to \infty} \int_X \Big( \sum_{j=1}^{k} |g_j - g_{j+1}| \Big)^p d\mu \le \lim_{k \to \infty} \Big( \sum_{j=1}^{k} \|g_j - g_{j+1}\|_p \Big)^p \le S^p < \infty. \]
By Theorem 8.9.a there exists a set $M \subset X$ such that $\mu(X \setminus M) = 0$ and $h < \infty$ on $M$. We can also assume that $|g_1| < \infty$ on $M$. For each $x \in M$, $\{g_j(x)\}$ is a Cauchy sequence in $\mathbb{R}$, and therefore it is convergent. Thus we can define
\[ f(x) = \lim_{j \to \infty} g_j(x) \]
$\mu$-almost everywhere. We prove that $f \in \mathcal{L}^p$ and $\|f - f_j\|_p \to 0$. The sequence $\{|g_j|^p\}$ tends to $|f|^p$ $\mu$-almost everywhere and
\[ |g_j|^p \le (|g_j - g_1| + |g_1|)^p \le (h + |g_1|)^p \]
for all $j \in \mathbb{N}$. An appeal to the Lebesgue dominated convergence theorem 8.13 with dominating function $(h + |g_1|)^p$ yields
\[ \int_X |f|^p \, d\mu = \lim_{j \to \infty} \int_X |g_j|^p \, d\mu \le \int_X (h + |g_1|)^p \, d\mu < \infty. \]
We see that $f$ is an element of $\mathcal{L}^p$. In a similar way, by Lebesgue's theorem (dominating function $h^p$),
\[ \|f - g_j\|_p^p = \int_X |f - g_j|^p \, d\mu \to 0. \]
Since the sequence $\{f_j\}$ is Cauchy in $L^p$, we get $\lim_{j \to \infty} \|f_j - g_j\|_p = 0$. Thus $\lim_{j \to \infty} \|f_j - f\|_p = 0$ and $f$ is a limit of the sequence $\{f_j\}$ in $L^p$.

10.7. Remarks. 1. In the language of functional analysis we have just proved that $L^p$ are Banach spaces. The characterization of their duals, which is significant for the theory, is postponed to Exercise 13.17.
2. For a counting measure $\mu$ on a set $X$ we write $\ell^p(X) := L^p(X, \mathcal{P}(X), \mu)$. In particular, for $X = \mathbb{N}$ we get well known spaces of sequences:
the space $\ell^p$, $1 \le p < \infty$, of all sequences $x = \{x_n\}$ such that $\|x\|_p := \big( \sum_n |x_n|^p \big)^{1/p} < \infty$,
the space $\ell^\infty$ of all bounded sequences $x = \{x_n\}$ with the norm $\|x\|_\infty := \sup_n |x_n|$.

10.8. Exercise. Suppose $f_n, f \in \mathcal{L}^\infty$. Show that $\|f_n - f\|_\infty \to 0$ if and only if there exists a set $E \in \mathcal{S}$, $\mu E = 0$, such that $f_n \to f$ uniformly on $X \setminus E$.
10.9. Exercise. Let $p \in [1, +\infty)$ and $\{f_n\}$ be a Cauchy sequence in $\mathcal{L}^p$. A close inspection of the proof of 10.6 shows that there is a subsequence of $\{f_n\}$ that converges $\mu$-almost everywhere in $X$. Compare with Theorem 12.4.
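10.10. Exercise. (a) Suppose that $p \in [1, +\infty)$. Show that the set
\[ \mathcal{E} := \Big\{ \sum_j \beta_j c_{B_j} : B_j \in \mathcal{S},\ \mu B_j < \infty \Big\} \]
of simple functions is dense in $\mathcal{L}^p$.
Hint. Apparently, $\mathcal{E} \subset \mathcal{L}^p$. Choose $f \in \mathcal{L}^p$. Since $f$ is finite $\mu$-almost everywhere, Exercise 3.11 provides a sequence $\{f_n\}$ of simple functions such that $|f_n| \le |f|$ and $f_n \to f$ almost everywhere. Clearly $f_n \in \mathcal{E}$ and by Lebesgue's theorem with dominating function $2^p |f|^p$, $\|f_n - f\|_p \to 0$.
(b) Show that the set of all simple functions is dense in $\mathcal{L}^\infty$.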
10.11. Exercise. Consider $p \in (0, 1)$. Define $\|\cdot\|_p$ and the spaces $\mathcal{L}^p$, $L^p$ in the same way as in the beginning of this chapter. Show that:
(a) $L^p$ is a linear space.
(b) The triangle inequality for $\|\cdot\|_p$ does not hold.
Hint. Employ the indicator functions of two disjoint sets of a positive measure.
(c) The distance function $d_p(f, g) := \int_X |f - g|^p \, d\mu$ is a metric on $L^p$ and $(L^p, d_p)$ is a complete metric linear space.
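The hint in (b) can be made concrete. The sketch below takes $p = 1/2$ and the indicator functions of two disjoint sets of measure 1, a specific choice made for illustration; the helper `quasi_norm` is hypothetical and encodes a function by its values on finitely many disjoint sets.

```python
def quasi_norm(values, masses, p):
    """(integral |f|^p d mu)^(1/p) for a function taking values[i] on a set of
    measure masses[i], the sets being pairwise disjoint."""
    return sum(abs(v) ** p * m for v, m in zip(values, masses)) ** (1 / p)

p = 0.5
masses = [1.0, 1.0]         # two disjoint sets A and B of measure 1
f = [1.0, 0.0]              # indicator function of A
g = [0.0, 1.0]              # indicator function of B
f_plus_g = [1.0, 1.0]

print(quasi_norm(f_plus_g, masses, p), quasi_norm(f, masses, p) + quasi_norm(g, masses, p))
```

Here $\|f + g\|_p = 2^{1/p} = 4$, while $\|f\|_p + \|g\|_p = 2$, so the triangle inequality fails.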
10.12. Notes. The Hölder inequality was first proved by A.L. Cauchy [*1821] in the case of $p = 2$ for finite sums (so, in fact, for the counting measure on a finite set). V.Y. Bunyakowski [1859] proved it (still for $p = 2$) for Riemann integrable functions and the same result was obtained by H.A. Schwarz [1885]. For general $p$ but still for finite sums, the inequality was proved by L.J. Rogers [1888] and by O. Hölder [1889]. The definitive form given in 10.3 comes from F. Riesz [1910], where Minkowski's inequality is proved as well. The original result for finite sums is due to H. Minkowski [1907].
The theory of $L^p$-spaces was originated by F. Riesz [1906] who defined the $L^2$-metric and E. Fischer [1907] who proved the completeness of $L^2$. F. Riesz then defined $L^p$ for other values of $p$ as well and proved their completeness. The notion of a Banach space has its roots in papers by S. Banach [1922] and N. Wiener [1922]. This period culminated when Banach's fundamental monograph [*1932] appeared.

11. Product Measures and the Fubini Theorem


In this chapter, we will consider the following problem: Given two measure spaces $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \nu)$, we wish to define a product measure $\mu \otimes \nu$ on an appropriate $\sigma$-algebra $\mathcal{U}$ of subsets of the Cartesian product $X \times Y$. If $M = S \times T$ where $S \in \mathcal{S}$, $T \in \mathcal{T}$ (such sets are called measurable rectangles), we require
$$(\mu \otimes \nu) M = \mu S \cdot \nu T.$$
An important example is that of Lebesgue measure in $\mathbb{R}^2$, which should arise as the product of one-dimensional Lebesgue measures.


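11.1. Product $\sigma$-algebra. First we introduce the notion of the product $\sigma$-algebra. If $\mathcal{S}$ and $\mathcal{T}$ are $\sigma$-algebras, the product $\sigma$-algebra $\mathcal{S} \otimes \mathcal{T}$ is the $\sigma$-algebra generated by the collection of all measurable rectangles. Thus, $\mathcal{S} \otimes \mathcal{T}$ is the smallest $\sigma$-algebra which contains all sets of the form $S \times T$ where $S \in \mathcal{S}$ and $T \in \mathcal{T}$.
For $M \subset X \times Y$, $x \in X$, let
$$M^x := \{y \in Y : [x, y] \in M\}.$$
The set $M^x$ is called the x-section of $M$. Analogously, define the y-section
$$M_y := \{x \in X : [x, y] \in M\}$$
for $y \in Y$.
11.2. Lemma. If $M \in \mathcal{S} \otimes \mathcal{T}$ and $x \in X$, then $M^x \in \mathcal{T}$ (and, of course, similarly $M_y \in \mathcal{S}$ for $y \in Y$).
Proof. Denote $\mathcal{A} := \{E \in \mathcal{S} \otimes \mathcal{T} : E^x \in \mathcal{T}\}$. A moment's reflection shows that $\mathcal{A}$ contains all measurable rectangles. A routine argument yields that $((X \times Y) \setminus E)^x = Y \setminus E^x$ and $(\bigcup_n E_n)^x = \bigcup_n E_n^x$. We see that $\mathcal{A}$ is a $\sigma$-algebra, and therefore $\mathcal{A} = \mathcal{S} \otimes \mathcal{T}$.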

11.3. Monotone Classes. A family $\mathcal{M}$ of subsets of a set $Z$ is a monotone class if it obeys the following conditions:
(a) if $M_1 \subset M_2 \subset M_3 \subset \dots$, $M_n \in \mathcal{M}$, then $\bigcup_n M_n \in \mathcal{M}$;
(b) if $M_1 \supset M_2 \supset M_3 \supset \dots$, $M_n \in \mathcal{M}$, then $\bigcap_n M_n \in \mathcal{M}$.
Every monotone class which is simultaneously an algebra forms a $\sigma$-algebra.
The following idea, used already in the proof of the previous lemma, will recur in the proofs of subsequent theorems:
If $\mathcal{A}$ is the family of sets from $\mathcal{S} \otimes \mathcal{T}$ for which a certain proposition holds and if $\mathcal{A}$ contains all measurable rectangles, then $\mathcal{A} = \mathcal{S} \otimes \mathcal{T}$ provided $\mathcal{A}$ is a $\sigma$-algebra. It is not always easy to verify that $\mathcal{A}$ is a $\sigma$-algebra. Often it is easier to prove that $\mathcal{A}$ is only a monotone class and an algebra (mostly using Levi's theorem). But even in this case $\mathcal{A} = \mathcal{S} \otimes \mathcal{T}$, as follows from the next theorem.
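11.4. Theorem. If $\mathcal{R}$ is an algebra of sets, then the smallest monotone class containing $\mathcal{R}$ is $\sigma(\mathcal{R})$.
Proof. Let $\mathcal{M}$ denote the smallest monotone class containing $\mathcal{R}$ (it does exist!). It is sufficient to show that $\mathcal{M}$ is an algebra. To this end, it is enough to prove that $\mathcal{M} \subset \mathcal{K}_E := \{B : E \setminus B,\ B \setminus E,\ B \cup E \in \mathcal{M}\}$ for each $E \in \mathcal{M}$. We fix a set $F$ and proceed in the following steps:
(a) $\mathcal{K}_F$ is a monotone class;
(b) $\mathcal{R} \subset \mathcal{K}_F$ for $F \in \mathcal{R}$;
(c) $\mathcal{M} \subset \mathcal{K}_F$ for $F \in \mathcal{R}$;
(d) if $E \in \mathcal{M}$, then $\mathcal{R} \subset \mathcal{K}_E$. (Let $F \in \mathcal{R}$; by (c), $E \in \mathcal{K}_F$, and so $F \in \mathcal{K}_E$.)
Therefore $\mathcal{R} \subset \mathcal{M} \subset \mathcal{K}_E$ by the definition of $\mathcal{M}$.
The basis for a definition of the product measure is in the following proposition.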


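11.5. Lemma. Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \nu)$ be $\sigma$-finite measure spaces, $E \in \mathcal{S} \otimes \mathcal{T}$. Then the function $x \mapsto \nu(E^x)$ is $\mathcal{S}$-measurable ($E^x \in \mathcal{T}$ by Lemma 11.2), the function $y \mapsto \mu(E_y)$ is $\mathcal{T}$-measurable and
$$\int_X \nu(E^x) \, d\mu(x) = \int_Y \mu(E_y) \, d\nu(y).$$
Proof. Let $\mathcal{A}$ be the collection of all sets $E \in \mathcal{S} \otimes \mathcal{T}$ such that all assertions hold for $E$. If $\mathcal{R}$ denotes the family of all finite disjoint unions of measurable rectangles, it is straightforward to check that $\mathcal{R}$ is an algebra. The assertion will follow from the preceding theorem, provided we can prove that $\mathcal{A}$ is a monotone class containing $\mathcal{R}$. We merely outline the main steps and invite the reader to fill in the details.
(a) $\mathcal{A}$ contains all measurable rectangles.
(b) $\mathcal{R} \subset \mathcal{A}$ (it suffices to show that $\bigcup_{j=1}^n E_j \in \mathcal{A}$ whenever $E_j \in \mathcal{A}$ are pairwise disjoint).
(c) If $E_n \in \mathcal{A}$, $E_1 \subset E_2 \subset \dots$, then $\bigcup_n E_n \in \mathcal{A}$ (use Levi's theorem and properties of sections).
(d) If $E_n \in \mathcal{A}$, $E_1 \supset E_2 \supset \dots$, then $\bigcap_n E_n \in \mathcal{A}$ (consider first the case when $\mu$ and $\nu$ are finite measures and proceed as in (c)).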
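11.6. Product Measure. Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \nu)$ be $\sigma$-finite measure spaces. A measure $\tau$ on $\mathcal{S} \otimes \mathcal{T}$ is called a product measure of $\mu$ and $\nu$ (denoted by $\mu \otimes \nu$) if
$$\tau(A \times B) = \mu A \cdot \nu B$$
whenever $A \in \mathcal{S}$, $B \in \mathcal{T}$. By the following main theorem, the product operation is well defined.
11.7. Theorem. Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \nu)$ be again $\sigma$-finite measure spaces. Then there exists a unique measure $\tau$ on $\mathcal{S} \otimes \mathcal{T}$ such that
$$\tau(A \times B) = \mu A \cdot \nu B$$
whenever $A \in \mathcal{S}$ and $B \in \mathcal{T}$.
Proof. Uniqueness is almost obvious. Indeed, if $\tau_1$ and $\tau_2$ are measures on $\mathcal{S} \otimes \mathcal{T}$ satisfying the requirements of the theorem, then the family $\mathcal{D} = \{E \in \mathcal{S} \otimes \mathcal{T} : \tau_1 E = \tau_2 E\}$ contains all measurable rectangles, is closed under finite disjoint unions and is a monotone class. Therefore $\mathcal{D} = \mathcal{S} \otimes \mathcal{T}$ and we see that $\tau_1 = \tau_2$ on $\mathcal{S} \otimes \mathcal{T}$.
To prove the existence of a product measure, define
$$\tau E = \int_X \nu(E^x) \, d\mu$$
for $E \in \mathcal{S} \otimes \mathcal{T}$. We know that $\tau E = \int_Y \mu(E_y) \, d\nu$ according to Lemma 11.5. A routine argument shows that $\tau$ is a measure on $\mathcal{S} \otimes \mathcal{T}$ and $\tau(A \times B) = \mu A \cdot \nu B$ for all measurable rectangles $A \times B$.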


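11.8. Lemma. Let $(X, \mathcal{S}, \mu)$, $(Y, \mathcal{T}, \nu)$ be $\sigma$-finite measure spaces and $f$ an $\mathcal{S} \otimes \mathcal{T}$-measurable function on $X \times Y$. Then the function $f(x, \cdot)$ is $\mathcal{T}$-measurable on $Y$ for each $x \in X$.
Proof. Select $\alpha \in \mathbb{R}$, $x \in X$ and observe that
$$\{y \in Y : f(x, y) > \alpha\} = \{(t, y) \in X \times Y : f(t, y) > \alpha\}^x.$$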

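11.9. Fubini's Theorem. Let $(X, \mathcal{S}, \mu)$, $(Y, \mathcal{T}, \nu)$ be $\sigma$-finite measure spaces, and $h \in \mathcal{L}^{*}(\mu \otimes \nu)$. Then
$$\int_{X \times Y} h \, d(\mu \otimes \nu) = \int_X \Bigl( \int_Y h(x, y) \, d\nu(y) \Bigr) d\mu(x) = \int_Y \Bigl( \int_X h(x, y) \, d\mu(x) \Bigr) d\nu(y).$$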

Proof. It is no restriction to assume that $h \ge 0$, for we can use the decomposition $h = h^+ - h^-$. The assertion holds whenever $h = c_A$ is the indicator function of a set $A \in \mathcal{S} \otimes \mathcal{T}$ (see the proof of Theorem 11.7), and therefore for nonnegative simple functions as well. If $h \ge 0$ is an arbitrary $\mathcal{S} \otimes \mathcal{T}$-measurable function, there is a sequence $\{h_n\}$ of nonnegative simple functions, $h_n \nearrow h$ (cf. Theorem 3.9). Then $\int_{X \times Y} h_n \, d(\mu \otimes \nu) \to \int_{X \times Y} h \, d(\mu \otimes \nu)$. On the other hand, $h_n(x, y) \nearrow h(x, y)$, and since the assertion holds for the simple functions $h_n$, by using Levi's theorem (twice) we easily finish the proof.
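On finite sets every measure is a weighted counting measure, and Fubini's theorem reduces to interchanging two finite sums. The following minimal sketch illustrates this under assumed toy data (the point masses in `mu` and `nu` and the function `h` are hypothetical choices); exact rational arithmetic is used so that the three integrals can be compared for equality.

```python
from fractions import Fraction as Fr

# Assumed toy measures given by point masses on X = {a, b} and Y = {0, 1, 2}.
mu = {"a": Fr(1, 2), "b": Fr(3, 2)}
nu = {0: Fr(1, 3), 1: Fr(2, 3), 2: Fr(1)}

def h(x, y):
    """An arbitrary (sign-changing) function on X x Y."""
    return (1 if x == "a" else -2) * (y + 1)

# Product measure of each singleton rectangle {x} x {y}: mu{x} * nu{y}.
tau = {(x, y): mu[x] * nu[y] for x in mu for y in nu}

double_integral = sum(h(x, y) * t for (x, y), t in tau.items())
x_outer = sum(sum(h(x, y) * nu[y] for y in nu) * mu[x] for x in mu)
y_outer = sum(sum(h(x, y) * mu[x] for x in mu) * nu[y] for y in nu)

print(double_integral, x_outer, y_outer)
```

All three quantities agree, as the theorem predicts for this integrable function.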
11.10. Remarks. 1. The Lebesgue measure in $\mathbb{R}^{n+k}$ is the completion of the product of Lebesgue measures in $\mathbb{R}^n$ and $\mathbb{R}^k$. We refer to Chapter 26 for a detailed study of this important example.
2. Suppose $\tau$ is the product of $\sigma$-finite measures $\mu$ and $\nu$. Then the measure $\tau$ is not necessarily complete, even if $\mu$ and $\nu$ are complete. (Indeed, the set $\{(x, x) : x \in M\}$ is measurable with respect to the product of Lebesgue measures on $\mathbb{R}$ if and only if $M \subset \mathbb{R}$ is Lebesgue measurable. On the other hand, all subsets of the diagonal $\{(x, x) : x \in \mathbb{R}\}$ are measurable with respect to the completion of $\tau$.) However, for the completed product measure the following variant of Fubini's theorem holds.

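11.11. Theorem. Let $(X, \mathcal{S}, \mu)$, $(Y, \mathcal{T}, \nu)$ be complete $\sigma$-finite measure spaces, $\tau$ the product of $\mu$ and $\nu$ and $\bar{\tau}$ the completion of $\tau$. Let $h \in \mathcal{L}^{*}(\bar{\tau})$. Then the function $h(x, \cdot)$ is $\nu$-measurable for $\mu$-almost all $x \in X$, $h(\cdot, y)$ is $\mu$-measurable for $\nu$-almost all $y \in Y$, and
$$\int_{X \times Y} h \, d\bar{\tau} = \int_X \Bigl( \int_Y h(x, y) \, d\nu \Bigr) d\mu = \int_Y \Bigl( \int_X h(x, y) \, d\mu \Bigr) d\nu.$$
Proof. As usual, it is sufficient to prove the theorem for indicator functions. Let $M$ be a set of the form $A \cup N$, where $A \in \mathcal{S} \otimes \mathcal{T}$ and $N \subset B$ for some $B \in \mathcal{S} \otimes \mathcal{T}$, $\tau B = 0$. By Fubini's theorem
$$\int_Y c_B(x, \cdot) \, d\nu = 0$$
for $\mu$-almost all $x \in X$. Therefore also
$$\int_Y c_N(x, \cdot) \, d\nu = 0$$
for $\mu$-almost all $x \in X$. Hence we obtain
$$\int_{X \times Y} c_M \, d\bar{\tau} = \int_X \Bigl( \int_Y c_M(x, y) \, d\nu \Bigr) d\mu.$$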

11.12. Exercise. Show that every Dynkin class is a monotone class and find a counterexample showing that the converse is not true.
11.13. Notes. Product measures and the reduction of multi-dimensional integration to one-dimensional integration were originally studied in the case of the Lebesgue measure in $\mathbb{R}^2$ and appear already in H. Lebesgue [*1904]. G. Fubini [1907] claims that the order of integration may be reversed. A complete proof was given by L. Tonelli [1909]. Around the 1930s a number of works concerning the abstract product measure theory appeared. The method explained in this chapter is due to H. Hahn [1933].
Let us point out that there are theories concerning products of measures which are not necessarily $\sigma$-finite. The theory of infinite products of measures (that is, products with an infinite number of factors) is also very important.

12. Sequences of Measurable Functions


In this section we will study the relationship among various kinds of convergence of sequences of measurable functions on a measure space $(X, \mathcal{S}, \mu)$. We will consider
(a) the convergence almost everywhere;
(b) the convergence in the norm of $L^p$ spaces;
(c) the convergence in measure;
(d) the $\mu$-uniform convergence.
The last two notions will now be defined.
12.1. Convergence in Measure. Let $f$, $f_n$ be $\mu$-measurable, almost everywhere finite functions on $X$. We say that a sequence $\{f_n\}$ converges in measure to $f$ if
$$\lim_{n \to \infty} \mu\{x \in X : |f(x) - f_n(x)| \ge \varepsilon\} = 0$$
for each $\varepsilon > 0$.
12.2. $\mu$-uniform Convergence. Let $f$, $f_n$ be almost everywhere finite functions on $X$. We say that $\{f_n\}$ converges to $f$ $\mu$-uniformly if for every $\varepsilon > 0$ there is $M \in \mathcal{S}$ such that $\mu(X \setminus M) < \varepsilon$ and the convergence $f_n \to f$ is uniform on $M$.
The limit function of a sequence converging in measure (or $\mu$-uniformly) is unique except for a null set and is finite $\mu$-almost everywhere.


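12.3. Theorem. Let $1 \le p < \infty$ and let $\{f_n\}$ be a sequence of $\mu$-measurable, almost everywhere finite functions on $X$. If $f_n \in \mathcal{L}^p$ and $\|f_n - f\|_p \to 0$, then $f_n$ converge to $f$ in measure.
Proof. Choose $\varepsilon > 0$ and denote $E_n = \{x \in X : |f(x) - f_n(x)| \ge \varepsilon\}$. Then
$$\varepsilon^p \mu E_n \le \int_{E_n} |f - f_n|^p \, d\mu \le \|f - f_n\|_p^p,$$
which implies the proposition.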
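The inequality used in this proof can be verified directly on a small example. Below is a minimal sketch, assuming a uniform probability measure on four points and hypothetical values of $|f - f_n|$; it computes both sides of $\varepsilon^p \mu E_n \le \|f - f_n\|_p^p$ in exact arithmetic.

```python
from fractions import Fraction as Fr

weights = [Fr(1, 4)] * 4                   # assumed: uniform measure on four points
diff = [Fr(0), Fr(1, 2), Fr(1), Fr(3)]     # hypothetical values of |f - f_n|
p, eps = 2, Fr(1)

# mu(E_n), where E_n = {|f - f_n| >= eps}
measure_E = sum(w for d, w in zip(diff, weights) if d >= eps)
# ||f - f_n||_p^p = integral of |f - f_n|^p d(mu)
norm_p_pow = sum(d ** p * w for d, w in zip(diff, weights))

print(eps ** p * measure_E, norm_p_pow)
```

The first printed value never exceeds the second, so a small $L^p$ distance forces the exceptional set $E_n$ to have small measure.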
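12.4. Riesz Theorem. Let $\{f_n\}$ be a sequence of $\mu$-measurable, almost everywhere finite functions on $X$. If $f_n$ converge to $f$ in measure, then there exists a subsequence $\{f_{n_k}\}$ such that $f_{n_k} \to f$ almost everywhere.
Proof. Suppose $f_n$ converge to $f$ in measure. There is a sequence $n_1 < n_2 < n_3 < \dots$ such that
$$\mu\Bigl\{x \in X : |f(x) - f_{n_j}(x)| \ge \frac{1}{j}\Bigr\} < \frac{1}{2^j}.$$
If $A_j := \bigcup_{k=j}^{\infty} \{x \in X : |f(x) - f_{n_k}(x)| \ge \frac{1}{k}\}$ and $B := \bigcap_{j=1}^{\infty} A_j$, then $A_1 \supset A_2 \supset A_3 \supset \dots$, $\mu A_1 \le \sum_{k=1}^{\infty} \frac{1}{2^k} < +\infty$, and consequently
$$\mu B = \lim_{j} \mu A_j \le \lim_{j} \sum_{k=j}^{\infty} \frac{1}{2^k} = \lim_{j} \frac{1}{2^{j-1}} = 0$$
(notice the connection with the Borel-Cantelli lemma 2.14). Now it is sufficient to prove that $f_{n_k}(x) \to f(x)$ for $x \in X \setminus B$. To this end let $x \in X \setminus B$ and $\varepsilon > 0$ be given. Then there is a $j = j(x)$ such that
$$x \in X \setminus A_j = \bigcap_{k=j}^{\infty} \Bigl\{y \in X : |f(y) - f_{n_k}(y)| < \frac{1}{k}\Bigr\}.$$
If $k_0 > \max(j(x), \frac{1}{\varepsilon})$, then $|f(x) - f_{n_k}(x)| < \frac{1}{k} < \varepsilon$ for all $k \ge k_0$, as needed.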

12.5. Egorov's Theorem. Let $\mu X < +\infty$ and let $f$, $f_n$ be $\mu$-measurable, almost everywhere finite functions on $X$. Then $f_n \to f$ almost everywhere if and only if $f_n \to f$ $\mu$-uniformly.
Proof. Assume that $f_n \to f$ almost everywhere. Denote
$$E := \{x \in X : f(x), f_n(x) \text{ are finite and } f_n(x) \to f(x)\}.$$
Obviously $\mu(X \setminus E) = 0$ and it does no harm to assume that $E = X$. Choose $\varepsilon > 0$. Writing
$$E_{k,m} := \bigcup_{n \ge m} \Bigl\{x \in X : |f_n(x) - f(x)| \ge \frac{1}{k}\Bigr\},$$
we have $E_{k,1} \supset E_{k,2} \supset \dots$ and $\mu E_{k,1} \le \mu X < +\infty$. For fixed $k \in \mathbb{N}$, $\lim_{m} \mu E_{k,m} = 0$, and we easily find a sequence $\{m_k\}$ such that $\sum_k \mu E_{k,m_k} < \varepsilon$. Set
$$M = X \setminus \bigcup_k E_{k,m_k}.$$
Then we have $|f_n(x) - f(x)| < \frac{1}{k}$ for each $x \in M$ and $n > m_k$, so that $f_n \to f$ uniformly on $M$.
The reverse implication is obvious.


12.6. Remark. If $f$ and $f_n$ are $\mu$-measurable, almost everywhere finite functions on $X$ and $f_n \to f$ $\mu$-uniformly, then $f_n \to f$ in measure. Indeed, a routine argument shows that $\lim \mu\{|f_n - f| \ge \varepsilon\} = 0$ for any $\varepsilon > 0$. Combining the aforementioned results we can state the following Lebesgue theorem:
If $\mu X < +\infty$, $f$, $f_n$ are $\mu$-measurable, almost everywhere finite functions on $X$, and if $f_n \to f$ almost everywhere, then $f_n \to f$ in measure.
12.7. Exercise. Suppose $\mu X < \infty$. If $f$, $f_n$ are $\mu$-measurable, almost everywhere finite functions on $X$, then $f_n \to f$ in measure if and only if for any subsequence $\{f_{n_k}\}$ there exists a further subsequence $\{f_{n_{k_j}}\}$ which converges to $f$ almost everywhere.
Hint. For one of the implications use Theorem 12.4. If $\{f_n\}$ satisfies the subsequence condition and does not converge in measure, then there exist an $\varepsilon > 0$ and a subsequence $\{g_n\}$ of $\{f_n\}$ such that $\mu\{|g_n - f| > \varepsilon\} > \varepsilon$ for all $n$ and $g_n \to f$ almost everywhere. By Lebesgue's theorem of 12.6 we obtain a contradiction.
12.8. Exercise. Let $(X, \mathcal{S}, \mu)$ be a measure space, $\varphi$ an increasing, bounded and continuous function on $[0, \infty]$, $\varphi(0) = 0$, and $\varphi(s + t) \le \varphi(s) + \varphi(t)$ for all $s, t \ge 0$ (for instance the function $t \mapsto t/(1 + t)$). Define
$$\rho(f, g) = \sum_{j=1}^{\infty} 2^{-j} \varphi\bigl(\mu\{|f - g| \ge 1/j\}\bigr).$$
Show that
(a) $\rho$ is a pseudometric on the space of all $\mu$-measurable functions on $X$;
(b) $f_n$ converge to $f$ in measure if and only if $\rho(f, f_n) \to 0$.
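12.9. Exercise (Vitali's theorem). Let $1 \le p < +\infty$. If $f_n \in \mathcal{L}^p$, $f_n \to f$ almost everywhere and $\|f_n\|_p \to \|f\|_p$, then $\|f_n - f\|_p \to 0$.
Hint. Let $\varphi_n = 2^{p-1}(|f_n|^p + |f|^p) - |f_n - f|^p$. Then $\varphi_n \to 2^p |f|^p$ almost everywhere. Since $|f_n - f|^p \le (|f_n| + |f|)^p \le 2^{p-1}(|f_n|^p + |f|^p)$, we have $\varphi_n \in \mathcal{L}^1$, $\varphi_n \ge 0$. By Fatou's lemma
$$2^p \int_X |f|^p \le \liminf \int_X \varphi_n = 2^p \int_X |f|^p - \limsup \int_X |f_n - f|^p.$$
Hence $\limsup \int_X |f_n - f|^p = 0$, finishing the proof.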

12.10. Exercise. (a) (Egorov) Let $\mu$ be a $\sigma$-finite measure on $(X, \mathcal{S})$, $f_n$ $\mu$-measurable functions on $X$ and $f_n \to f$ almost everywhere ($f_n$, $f$ almost everywhere finite). Show that there exist $E_k \in \mathcal{S}$ of finite measure such that $\mu(X \setminus \bigcup_{k=1}^{\infty} E_k) = 0$ and $f_n \to f$ uniformly on every $E_k$.
(b) Show that the assumption of $\sigma$-finiteness of the measure in (a) cannot be dropped.
Hint. Let $X$ be the set of all convergent sequences $x = \{x_k\}$ of real numbers and $\mu$ the counting measure on $X$. Define a sequence $\{f_n\}$ of functions on $X$ by $f_n(x) = x_n$. Choose a sequence $\{E_j\}$ of subsets of $X$ on which $\{f_n\}$ converges uniformly and find a convergent sequence $\{y_k\}$ which does not belong to any of the sets $E_j$ (because it converges more slowly than all sequences from $E_j$).
(c) Show that $\mu$-uniform convergence implies convergence almost everywhere, and also convergence in measure, without the assumption that $\mu X < \infty$.
12.11. Exercise. Find a sequence $\{f_n\}$ of functions on $[0, 1]$ which converges to zero in (Lebesgue) measure whereas the sequence $\{f_n(x)\}$ fails to converge for any $x \in [0, 1]$.
Hint. Consider the indicator functions of the intervals $[\frac{k-1}{m}, \frac{k}{m}]$ ordered into a suitable sequence.
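One possible ordering for the hint is to list the intervals block by block: first $m = 1$, then the two intervals for $m = 2$, and so on. The sketch below assumes exactly this ordering (the helper names `interval` and `f` are illustrative) and evaluates the resulting sequence at a sample point.

```python
from fractions import Fraction as Fr

def interval(n):
    """Return the n-th interval [(k-1)/m, k/m], n = 1, 2, ..., listed block by block."""
    m = 1
    while n > m:      # block m consists of the m intervals of length 1/m
        n -= m
        m += 1
    return Fr(n - 1, m), Fr(n, m)

def f(n, x):
    """Indicator function of the n-th interval, evaluated at x."""
    a, b = interval(n)
    return 1 if a <= x <= b else 0

x = Fr(1, 3)
values = [f(n, x) for n in range(1, 56)]                            # blocks m = 1, ..., 10
lengths = [interval(n)[1] - interval(n)[0] for n in range(1, 56)]   # measure of {f_n != 0}

print(values)
print(lengths[-1])
```

The measure of the set where $f_n \ne 0$ equals $1/m$ and tends to zero, yet within every block the value at $x$ is 1 at least once, and for $m \ge 3$ also 0 at least once, so $\{f_n(x)\}$ does not converge.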


12.12. Weak Convergence in $L^p$. Let $1 \le p < +\infty$ and $q = \frac{p}{p-1}$ ($q = +\infty$ if $p = 1$). Suppose $f, f_j \in \mathcal{L}^p$. We say that a sequence $\{f_j\}$ converges weakly to $f$ in $L^p$ (denoted $f = \text{w-lim } f_j$) if $\int_X f_j g \, d\mu \to \int_X f g \, d\mu$ for every function $g \in \mathcal{L}^q$.
(a) If $f = \text{w-lim } f_j$, $g = \text{w-lim } f_j$, then $g = f$ almost everywhere.
(b) If $\|f - f_j\|_p \to 0$, then $f = \text{w-lim } f_j$.
(c) Suppose $f, f_j \in \mathcal{L}^p$. Let a set of $\mathcal{L}^q$-functions be given, the linear span of which is dense in $L^q$. Then the following assertions are equivalent:
(i) $f = \text{w-lim } f_j$;
(ii) the sequence $\{\|f_j\|_p\}$ is bounded and $\int_X g f_j \, d\mu \to \int_X g f \, d\mu$ for every $g$ from the given set.
Hint. A typical application of the Banach-Steinhaus theorem from functional analysis.
(d) In (c) we can take the set of all indicator functions of sets from $\mathcal{S}$ in case $p = 1$, or the set of all indicator functions of sets from $\mathcal{S}$ of finite measure if $p > 1$; in the special case of the Lebesgue measure, other possibilities are shown in Theorem 31.4.
(e) If $p > 1$, then any norm-bounded sequence of functions from $\mathcal{L}^p$ has a weakly convergent subsequence.
Hint. If the space $L^p$ is separable, proceed in a similar way as in the proof of Theorem 17.10. The nonseparable case is more difficult. The idea is that for $1 < p < \infty$ the space $L^p$ is reflexive and reflexive spaces are characterized by this property, cf. for instance L. Mišík [*1989], Theorem 33.4 or R.B. Holmes [*1975], p. 149.
(f) Let $p > 1$. If $f_j \to f$ almost everywhere and if $\{\|f_j\|_p\}$ is a bounded sequence, then $f = \text{w-lim } f_j$. The proposition holds also if the assumption $f_j \to f$ almost everywhere is replaced by $f_j \to f$ in measure.
(g) The Radon-Riesz theorem. Let $p > 1$, $f = \text{w-lim } f_j$ and $\|f\|_p = \lim \|f_j\|_p$. Then $\|f - f_j\|_p \to 0$. (What proposition do we obtain by combining (f) and (g)?)
Hint. The proof makes substantial use of the uniform convexity of $L^p$ spaces for $1 < p < \infty$, see M.M. Rao [*1987], Proposition 5.5.3.
(h) The sequence $f_j : x \mapsto \sin jx$ converges weakly to zero for any $p$ on each bounded interval. None of its subsequences is convergent almost everywhere.
(i) Suppose $f_j = j^{1/p}$ on $(0, 1/j)$ and $f_j(x) = 0$ on $(1/j, 1)$. The sequence $\{f_j\}$ is bounded in the norm of $L^p(0, 1)$ and tends to zero almost everywhere. If $p = 1$, then it is not weakly convergent (nor is any of its subsequences). If $p > 1$, then it converges weakly but not in norm.
12.13. Weak* Convergence. Assume $\mu$ is a $\sigma$-finite measure on $X$ and $f, f_j \in \mathcal{L}^\infty$. We say that a sequence $\{f_j\}$ converges weakly* to $f$ (denoted $f = \text{w*-lim } f_j$) if $\int_X f_j g \, d\mu \to \int_X f g \, d\mu$ for each function $g \in \mathcal{L}^1$.
(a) If $f = \text{w*-lim } f_j$, $g = \text{w*-lim } f_j$, then $g = f$ almost everywhere.
(b) If $\|f - f_j\|_\infty \to 0$, then $f = \text{w*-lim } f_j$.
(c) Suppose $f, f_j \in \mathcal{L}^\infty$. Let a set of $\mathcal{L}^1$-functions be given, the linear span of which is dense in $L^1$. Then the following are equivalent:
(i) $f = \text{w*-lim } f_j$;
(ii) the sequence $\{\|f_j\|_\infty\}$ is bounded and $\int_X g f_j \, d\mu \to \int_X g f \, d\mu$ for each function $g$ from the given set.
(d) If $f_j \to f$ almost everywhere and if the sequence $\{\|f_j\|_\infty\}$ is bounded, then $f = \text{w*-lim } f_j$.
(e) Any norm-bounded sequence $\{f_n\}$ of $\mathcal{L}^\infty$-functions contains a weakly* convergent subsequence.
Hint. Write $X = \bigcup_k X_k$ where $X_1 \subset X_2 \subset \dots$ and $\mu X_k < \infty$. Suppose $\{f_n\}$ is a norm-bounded sequence of functions from $L^\infty$ and
$$f_{n,k} = \begin{cases} f_n & \text{on } X_k, \\ 0 & \text{on } X \setminus X_k. \end{cases}$$
Then $\{f_{n,k}\}_n$ is bounded even in the $L^2$-norm and there exists a subsequence $\{f_{n_i,k}\}_i$ which converges weakly in $L^2$ to a function $f_k$. But this means that $\lim_i \int_X f_{n_i} u \, d\mu = \int_X f_k u \, d\mu$ for each function $u \in L^2$ vanishing on the complement of $X_k$. The diagonal method (cf. 17.10) provides a subsequence $\{h_n\}$ of $\{f_n\}$ and a limit function $f$ such that $\int_X h_n u \, d\mu \to \int_X f u \, d\mu$ for each function $u \in L^2$ vanishing on the complement of some of the sets $X_k$. Since the set of all such functions $u$ is dense in $L^1$, in light of (c) we conclude that $f = \text{w*-lim } h_n$.
(f) The sequence $f_j : x \mapsto \sin jx$ converges weakly* to zero (cf. the Riemann-Lebesgue lemma 31.10).
(g) Define $f_j = 0$ on $(0, 1/j)$ and $f_j(x) = 1$ on $(1/j, 1)$. The sequence $\{f_j\}$ is bounded in the $L^\infty$-norm and converges both almost everywhere and weakly* to the constant function $f = 1$. We have $\lim \|f_j\|_\infty = \|\lim f_j\|_\infty = 1$ but $\lim \|f - f_j\|_\infty \ne 0$.
12.14. Remarks. For the interested reader it may be worthwhile to take a slightly deeper look from the point of view of functional analysis (see also [Zap]).
1. Let $X$ be a Banach space with (topological) dual $X^*$, $x_n, x \in X$, $F_n, F \in X^*$. We say that
(a) $\{x_n\}$ converges weakly to $x$ if $F(x_n) \to F(x)$ for any functional $F \in X^*$;
(b) $\{F_n\}$ converges weakly* to $F$ if $F_n(z) \to F(z)$ for any $z \in X$.
Taking into account 13.17, the previous definitions of the weak and weak* convergence in the $L^p$ spaces are now easier to understand (at least if $1 < p < \infty$ or if $\mu$ is $\sigma$-finite).
2. We say that a Banach space $X$ is uniformly convex if for any $\varepsilon > 0$ there is a $\delta > 0$ such that, if $x, y \in X$, $\|x\| = \|y\| = 1$ and $\|\frac{1}{2}(x + y)\| > 1 - \delta$, then $\|y - x\| < \varepsilon$. Each uniformly convex Banach space is locally uniformly rotund, which means that if $\|x_n\| = \|x\| = 1$ and $\|\frac{1}{2}(x_n + x)\| \to 1$, then $\|x_n - x\| \to 0$. The $L^p$ spaces are uniformly convex for $1 < p < \infty$ (this follows from Clarkson's inequalities, see, for instance, E. Hewitt and K. Stromberg [*1965]).
3. A Banach space $X$ is said to have the Kadec-Klee property if $\|x_n - x\| \to 0$ whenever a sequence $\{x_n\} \subset X$ converges weakly to $x$ and $\|x_n\| \to \|x\|$. All locally uniformly rotund Banach spaces (in particular $L^p$ spaces, $1 < p < \infty$) have the Kadec-Klee property. Indeed, if $\{x_n\}$ is a sequence which does not converge to a point $x$ and for which $\|x_n\| = \|x\| = 1$, local uniform rotundity ensures the existence of a subsequence, call it $\{x_n\}$ again, and an $\varepsilon \in (0, 1)$ such that $\|\frac{1}{2}(x_n + x)\| < 1 - \varepsilon$. Using the Hahn-Banach theorem find an $F \in X^*$ with $\|F\| = 1 = F(x)$. Now, it is clear that the sequence $\{F(x_n)\}$ cannot converge to $F(x)$.
4. In interesting cases, the $L^1$ spaces are not (locally) uniformly convex, nor need they satisfy a proposition analogous to 12.12.f. Nevertheless, if $f_n \in L^1$, $f_n \to f$ almost everywhere or in measure and $\|f_n\| \to \|f\|$, then $\|f_n - f\| \to 0$ (cf. 12.9).
12.15. Exercise. Let $\mu X < \infty$ and $1 \le p < r \le +\infty$. If $\{\|f_j\|_r\}_j$ is a bounded sequence and $f_j \to f$ almost everywhere (or in measure), then $\|f - f_j\|_p \to 0$. Show that the assertion does not continue to hold for $r = p$.
12.16. Notes. The notion of convergence in measure was introduced by F. Riesz [1909a], where Theorem 12.4 was also proved. Egorov's theorem was proved for the case of the Lebesgue measure by H. Lebesgue [1903] and by D. Egorov [1911].

13. The Radon-Nikodým Theorem and the Lebesgue Decomposition
13.1. Radon-Nikodým Derivative and Absolute Continuity of Measures. Let $(X, \mathcal{S}, \mu)$ be a measure space and $f \in \mathcal{L}^1(\mu)$ a nonnegative function on $X$. In Chapter 8 we defined the measure
$$E \mapsto \int_E f \, d\mu, \qquad E \in \mathcal{S},$$
on $(X, \mathcal{S})$.
Now, if $\mu$ and $\nu$ are measures on $\mathcal{S}$, one might ask whether there is a nonnegative function $f \in \mathcal{L}^1(\mu)$ such that $\nu E = \int_E f \, d\mu$ for all $E \in \mathcal{S}$. Immediately we see that such a function in general does not exist. Indeed, every measure of this form vanishes on $E$ provided $\mu(E) = 0$. However, we show that if $\nu E = 0$ whenever $\mu E = 0$, and $\mu$ and $\nu$ are $\sigma$-finite, then a desired function $f$, called the density or the Radon-Nikodým derivative of $\nu$ with respect to $\mu$, can be found. The density of $\nu$ with respect to $\mu$ is then unique except for a null set and it is sometimes denoted by $\frac{d\nu}{d\mu}$.
We say that a measure $\nu$ on $\mathcal{S}$ is absolutely continuous with respect to $\mu$, and write $\nu \ll \mu$, if $\nu E = 0$ for every $E \in \mathcal{S}$ with $\mu E = 0$.
There are several methods of proving the existence of Radon-Nikodým derivatives. Our proof is based on the following variational approach. Assuming that such a function $f$ exists and is in $\mathcal{L}^2$, consider the functional
$$J_0 : g \mapsto \int_X |g - f|^2 \, d\mu$$
on the space $L^2(\mu)$. The function $f$ evidently represents the only element of the space $L^2(\mu)$ at which $J_0$ attains its minimum. For all $g \in L^2(\mu)$
$$J_0(g) = J(g) + \int_X f^2 \, d\mu$$
where
$$J(g) = \int_X (g^2 - 2fg) \, d\mu = \int_X g^2 \, d\mu - 2 \int_X g \, d\nu.$$
Since the functionals $J$ and $J_0$ differ only by a constant, $J$ attains its minimum at $f$. This is also the idea of our proof of the following lemma.

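13.2. Lemma. Let $\mu$, $\nu$ be finite measures on $(X, \mathcal{S})$ such that $\nu A \le \mu A$ for each $A \in \mathcal{S}$. Then there exists a nonnegative function $f \in \mathcal{L}^2(\mu)$ such that
$$\nu A = \int_A f \, d\mu$$
for any $A \in \mathcal{S}$.
Proof. For $g \in \mathcal{L}^2(\mu)$ denote
$$J(g) = \int_X g^2 \, d\mu - 2 \int_X g \, d\nu.$$
Let
$$c := \inf\{J(g) : g \in L^2(\mu)\}.$$
Since $\nu A \le \mu A$ for each $A \in \mathcal{S}$, we have
$$J(g) \ge \int_X (g^2 - 2|g|) \, d\mu = \int_X (|g| - 1)^2 \, d\mu - \mu(X) \ge -\mu(X)$$
for $g \in \mathcal{L}^2(\mu)$, so that $c \in \mathbb{R}$. Hence, there is a sequence $\{f_j\}$ of functions from $\mathcal{L}^2(\mu)$ such that $J(f_j) \to c$. Since for $g, h \in \mathcal{L}^2(\mu)$ the parallelogram law
$$J(g) + J(h) - 2J\bigl(\tfrac{1}{2}(g + h)\bigr) = \tfrac{1}{2} \|g - h\|_2^2$$
holds, we have
$$\tfrac{1}{2} \int_X |f_j - f_k|^2 \, d\mu \le J(f_j) + J(f_k) - 2c \to 0$$
as $j, k \to \infty$. Consequently, $\{f_j\}$ is a Cauchy sequence, and therefore there is a limit $f$ of this sequence in $L^2(\mu)$. Obviously, $J(f) = c$. Choose $A \in \mathcal{S}$. Since
$$J(f) \le J(f + t c_A)$$
for all $t \in \mathbb{R}$, we get
$$0 \le 2t \int_X f c_A \, d\mu + t^2 \int_X c_A^2 \, d\mu - 2t \int_X c_A \, d\nu = 2t \Bigl( \int_A f \, d\mu - \nu A \Bigr) + t^2 \mu A.$$
The last shows that
$$\Bigl| \int_A f \, d\mu - \nu A \Bigr| \le \tfrac{1}{2} |t| \, \mu A$$
(note that $t$ can be positive as well as negative). By letting $t \to 0$ we establish the assertion.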
13.3. Remark. The proof of the preceding lemma becomes more comprehensible if it is based on the theory of Hilbert spaces. Consider the functional
$$F(g) = \int_X g \, d\nu, \qquad g \in L^2(\mu).$$
By the Riesz representation theorem on continuous linear functionals on Hilbert spaces there is an element $f \in L^2(\mu)$ such that
$$F(g) = \int_X f g \, d\mu$$
for each $g \in L^2(\mu)$. Now it is enough to apply the last equality to $g = c_A$, where $A$ runs over $\mathcal{S}$.
13.4. Radon-Nikodým Theorem. Let $\mu$, $\nu$ be finite measures on $(X, \mathcal{S})$, $\nu \ll \mu$. Then there exists a nonnegative function $h \in \mathcal{L}^1(\mu)$ such that
$$\nu A = \int_A h \, d\mu$$
for all $A \in \mathcal{S}$. This function $h$ is unique up to $\mu$-almost everywhere equality.
Proof. The uniqueness follows from Theorem 8.17, and we proceed to establish the existence of $h$. Set $\lambda = \mu + \nu$. By the previous lemma there exists a function $f \in \mathcal{L}^2(\lambda)$ such that
$$\nu A = \int_A f \, d\lambda$$
for all $A \in \mathcal{S}$. Since
$$0 \le \nu A = \int_A f \, d\lambda \le \lambda A = \int_A 1 \, d\lambda,$$
an appeal to Corollary 8.18 reveals that $0 \le f \le 1$ $\lambda$-almost everywhere. We claim that $f < 1$ $\lambda$-almost everywhere. Let $E := \{f = 1\}$. Then
$$\nu E = \int_E f \, d\lambda = \lambda E,$$
thus $\mu E = 0$. By assumptions we get $\nu E = 0$ and therefore $\lambda E = 0$. Hence, it is no restriction to assume that $0 \le f < 1$ everywhere. For each set $A \in \mathcal{S}$ we have
$$\int_X c_A f \, d\mu = \int_X c_A (1 - f) \, d\nu.$$
We see (for simple functions using linearity, then passing to limits) that
$$\int_X g f \, d\mu = \int_X g (1 - f) \, d\nu$$
for every nonnegative $\mathcal{S}$-measurable function $g$ on $X$. For $A \in \mathcal{S}$, put $g = \frac{c_A}{1 - f}$. Then
$$\int_X \frac{f}{1 - f} c_A \, d\mu = \int_X c_A \, d\nu$$
and the function
$$h := \frac{f}{1 - f}$$
has all required properties. Notice that $h \in \mathcal{L}^1(\mu)$ as $\int_X h \, d\mu = \nu(X) < +\infty$.
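The construction in this proof can be traced on a finite set, where measures are given by point masses. The sketch below uses assumed toy masses with $\nu \ll \mu$ (the names `lam`, `f`, `h` mirror the auxiliary measure $\mu + \nu$, its density, and the resulting derivative; they are illustrative only) and checks that $h = f/(1 - f)$ reproduces $\nu$.

```python
from fractions import Fraction as Fr

points = ["a", "b", "c"]
mu = {"a": Fr(1), "b": Fr(2), "c": Fr(0)}
nu = {"a": Fr(3), "b": Fr(1, 2), "c": Fr(0)}   # nu << mu: nu vanishes wherever mu does

# Auxiliary measure mu + nu, as in the proof.
lam = {x: mu[x] + nu[x] for x in points}
# f = d(nu)/d(lam); on the lam-null point "c" any value in [0, 1) may be chosen.
f = {x: (nu[x] / lam[x] if lam[x] != 0 else Fr(0)) for x in points}
# h = f / (1 - f) is the candidate density of nu with respect to mu.
h = {x: f[x] / (1 - f[x]) for x in points}

def nu_via_density(A):
    """Integral of h over A with respect to mu."""
    return sum(h[x] * mu[x] for x in A)

print(h)
print(nu_via_density(points), sum(nu.values()))
```

On atoms of positive $\mu$-mass, $h$ coincides with the ratio of the point masses of $\nu$ and $\mu$, which is what a density should be in the discrete setting.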

13.5. Theorem. Let $\mu$ and $\nu$ be finite measures on $(X, \mathcal{S})$. Then the following conditions are equivalent:
(i) $\nu \ll \mu$;
(ii) there is a nonnegative function $f \in \mathcal{L}^1(\mu)$ such that
$$\nu E = \int_E f \, d\mu$$
for all $E \in \mathcal{S}$;
(iii) for any $\varepsilon > 0$ there is a $\delta > 0$ such that $\nu E < \varepsilon$ whenever $E \in \mathcal{S}$ and $\mu E < \delta$.
Proof. The implication (i) $\Rightarrow$ (ii) was established in Theorem 13.4, (ii) $\Rightarrow$ (iii) follows by Exercise 8.22.b and (iii) $\Rightarrow$ (i) is obvious.
13.6. Remarks. 1. The Radon-Nikodým theorem can be generalized; its variants hold even for signed or complex measures. Let us state the version for $\sigma$-finite measures:
Let $\mu$, $\nu$ be $\sigma$-finite measures on $(X, \mathcal{S})$, $\nu \ll \mu$. Then there is a nonnegative $\mathcal{S}$-measurable function $h$ (not necessarily from $\mathcal{L}^1(\mu)$) such that
$$\nu A = \int_A h \, d\mu$$
for all $A \in \mathcal{S}$. This function is unique except on a null set.
Indeed, we find sequences $\{A_n\}$, $\{B_n\}$ of pairwise disjoint sets such that $\mu A_n < +\infty$, $\nu B_n < +\infty$, $\bigcup A_n = \bigcup B_n = X$. Applying Theorem 13.4 to the restrictions of $\mu$ and $\nu$ to $A_n \cap B_k$ we obtain the assertion.
2. We proved the Radon-Nikodým theorem and the Hahn decomposition theorem independently. The reader should find it easy to derive the Radon-Nikodým theorem from the Hahn theorem or vice versa. Both possibilities will be indicated.
Let $\mu$, $\nu$ be finite measures on $(X, \mathcal{S})$, $\nu \ll \mu$. We would like to prove the existence of a Radon-Nikodým derivative. For $\alpha \in \mathbb{Q}$ find a Hahn decomposition $X = P_\alpha \cup N_\alpha$ such that $\nu - \alpha\mu$ is a nonnegative measure on $P_\alpha$ and $-(\nu - \alpha\mu)$ is a nonnegative measure on $N_\alpha$. For $\alpha = 0$ set $P_0 = X$, $N_0 = \emptyset$. Then we find $d\nu/d\mu$ in the form
$$f(x) = \sup\{\alpha \in \mathbb{Q} \cap [0, +\infty) : x \in P_\alpha\}.$$
On the other hand, if $\nu$ is a signed measure, $f = \frac{d\nu^+}{d|\nu|}$ and $P = \{f = 1\}$, $N = X \setminus P$, then $(P, N)$ is a Hahn decomposition of $X$ with respect to $\nu$.

13.7. Integration with Respect to Signed or Complex Measures. Suppose $\nu$ is a signed or complex measure on $(X, \mathcal{S})$. Then we define
$$\int_X f \, d\nu = \int_X a f \, d|\nu|$$
where $a = d\nu/d|\nu|$. (Show that $|a| = 1$ almost everywhere.)

13.8. Singular and Absolutely Continuous Measures. Suppose that $\mu$ and $\nu$ are positive, signed, or complex measures on the measurable space $(X, \mathcal{S})$. We say that $\mu$ and $\nu$ are (mutually) singular, which is denoted by $\mu \perp \nu$, if there is an $M \in \mathcal{S}$ with $|\mu|(M) = |\nu|(X \setminus M) = 0$. Obviously, this relation is symmetric.
A measure $\nu$ is absolutely continuous with respect to $\mu$ if $\nu E = 0$ whenever $E \in \mathcal{S}$ and $|\mu|(E) = 0$.
13.9. Examples. (a) Let $\nu$ be a signed measure. Using a Hahn decomposition of $X$ for $\nu$, we see immediately that $\nu^+$ and $\nu^-$ are mutually singular.
(b) The Dirac measure at 0 and the Lebesgue measure are mutually singular on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$.
(c) If $\nu$ is the Lebesgue-Stieltjes measure determined by the Cantor singular function (see Example 23.1 and Exercise 24.8) and $\lambda$ the Lebesgue measure, then $\nu \perp \lambda$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$.

13.10. Lebesgue Decomposition Theorem. Let $\mu$ be a (positive) measure on $(X, \mathcal{S})$ and $\nu$ a $\sigma$-finite or complex measure on $(X, \mathcal{S})$. Then there exists a unique decomposition $\nu = \nu_a + \nu_s$ where $\nu_a \ll \mu$ and $\nu_s \perp \mu$.
Proof. (See also Exercise 13.12.) We first consider the case when $\nu$ is a finite positive measure. There are $B_j \in \mathcal{S}$ such that $\mu B_j = 0$ and $\lim \nu B_j = \sup\{\nu B : B \in \mathcal{S},\ \mu B = 0\}$. If $M = \bigcup_j B_j$, then $\mu M = 0$ and for the measures
$$\nu_a A := \nu(A \setminus M), \qquad \nu_s A := \nu(A \cap M)$$
the equality $\nu = \nu_a + \nu_s$ holds. Clearly $\nu_s \perp \mu$. If now $\mu B = 0$, then $\nu_a(B) = \nu(B \setminus M) = 0$. This is so because $\nu B' = 0$ for each set $B' \in \mathcal{S}$, $B' \subset X \setminus M$ with $\mu B' = 0$. (Otherwise $\mu(M \cup B') = 0$ and $\nu(B' \cup M) > \nu M$.) If $X = \bigcup_k X_k$ where $\nu X_k < +\infty$, find sets $M_k \subset X_k$ in the same way as in the first part of the proof and set
$$M = \bigcup_k M_k, \qquad \nu_a A = \nu(A \setminus M), \qquad \nu_s A = \nu(A \cap M).$$
Having the proof for positive measures, the general case easily follows.
Uniqueness remains. Suppose that $\nu = \nu'_a + \nu'_s$ is another decomposition with the required properties. Then there is a set $M' \in \mathcal{S}$ for which $\mu M' = 0$ and $\nu'_s(X \setminus M') = 0$. Fix $A \in \mathcal{S}$ and denote $C = A \cap (M \cup M')$, $D = A \setminus (M \cup M')$. We have $\mu C \le \mu M + \mu M' = 0$, so $\nu_a C = \nu'_a C = 0$ and $\nu_s C = \nu'_s C$. Further, $\nu_s D = \nu'_s D = 0$ and therefore $\nu_a D = \nu'_a D$. Since $A = C \cup D$ and $C \cap D = \emptyset$, we get $\nu_s A = \nu'_s A$ and $\nu_a A = \nu'_a A$.
13.11. Lebesgue Decomposition. The decomposition $\nu = \nu_a + \nu_s$ is called the Lebesgue decomposition of $\nu$ relative to $\mu$, and the measures $\nu_a$ and $\nu_s$ are called the absolutely continuous part and the singular part of $\nu$. Notice that there exists a set $M \in \mathcal{S}$ such that $\nu_s A = \nu(A \cap M)$ and $\nu_a A = \nu(A \setminus M)$ for all $A \in \mathcal{S}$.
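For measures given by point masses on a finite set, the set $M$ from the proof can simply be taken as the set of all $\mu$-null points. The following minimal sketch, with assumed toy masses, builds $\nu_a$ and $\nu_s$ this way and prints the quantities that characterize the decomposition.

```python
from fractions import Fraction as Fr

# Assumed toy point masses on X = {a, b, c, d}.
mu = {"a": Fr(1), "b": Fr(0), "c": Fr(2), "d": Fr(0)}
nu = {"a": Fr(5), "b": Fr(7), "c": Fr(0), "d": Fr(1)}

# M: all mu-null points; on a finite set this mu-null set maximises nu(B)
# among mu-null sets B, as required in the proof.
M = {x for x in mu if mu[x] == 0}

def nu_a(A):
    """Absolutely continuous part: nu(A minus M)."""
    return sum(nu[x] for x in A if x not in M)

def nu_s(A):
    """Singular part: nu(A intersected with M)."""
    return sum(nu[x] for x in A if x in M)

X = set(mu)
print(nu_a(X), nu_s(X))                        # the two parts add up to nu(X)
print(sum(mu[x] for x in M), nu_s(X - M))      # mu(M) = 0 and nu_s(X minus M) = 0
```

Here $\nu_s$ lives on the $\mu$-null set $M$, while $\nu_a$ vanishes on every $\mu$-null set.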
13.12. Exercise. Give the details of an alternative proof of the Lebesgue decomposition theorem: existence (assuming $\sigma$-finiteness of both measures) using the Radon-Nikodým theorem and uniqueness from the following exercise (part (c)).
Hint. (a) Existence: Suppose $\mu$, $\nu$ are positive and set $\lambda = \mu + \nu$, $f = d\mu/d\lambda$. Then the Lebesgue decomposition has the form $\nu_s A = \nu(A \cap M)$, $\nu_a A = \nu(A \setminus M)$, where $M = \{f = 0\}$.
(b) Uniqueness: If $\nu = \nu_a + \nu_s = \nu'_a + \nu'_s$, then $\nu_a - \nu'_a = \nu'_s - \nu_s$ is at the same time absolutely continuous and singular.
13.13. Exercise. Let $\mu$ and $\nu$ be positive, signed or complex measures on a measurable space $(X, \mathcal{S})$.
(a) Prove that the following are equivalent:
(i) $\mu \perp \nu$;
(ii) $|\mu| \perp |\nu|$;
(iii) (r )+ (r )+ , (r ) (r ) ; (i )+ (i )+ , (i ) (i ) .
(b) If $\nu_1 \perp \mu$, $\nu_2 \perp \mu$, then $\nu_1 + \nu_2 \perp \mu$.
(c) If $\nu \perp \mu$, $\nu \ll \mu$, then $\nu = 0$.
13.14. Exercise. Suppose $S \subset \mathbb{R}$ is countable, $f \ge 0$ and $\nu = \sum_{s \in S} f(s) \delta_s$ ($\delta_s$ are the Dirac measures, cf. Exercise 2.10). Describe the Lebesgue decomposition of $\nu$ with respect to the Lebesgue measure.
13.15. Exercise. Suppose $\mu \in \mathcal{M}(X)$ (see Exercise 6.17), $V = \{\nu \in \mathcal{M}(X) : \nu \perp \mu\}$. Show that $V$ is a closed linear subspace of the Banach space $\mathcal{M}(X)$.
13.16. Exercise. Let $\mu$, $\nu$ be finite (positive) measures on $(X, \mathcal{S})$. Let sup and inf be defined as in Exercise 6.19 and let $\lambda = \mu + \nu$.
(a) Show that the following statements are equivalent:
(i) $\mu \perp \nu$;
(ii) $\inf(\mu, \nu) = 0$;
(iii) $\sup(\mu, \nu) = \mu + \nu$.
(b) Prove that
$$\frac{d \sup(\mu, \nu)}{d\lambda} = \max\Bigl(\frac{d\mu}{d\lambda}, \frac{d\nu}{d\lambda}\Bigr).$$
13.17. Duality of $L^p$ Spaces. Suppose $p, q \in [1, +\infty]$, $\frac{1}{p} + \frac{1}{q} = 1$. Consider the spaces $L^p$ and $L^q$ determined by a $\sigma$-finite measure $\mu$ on $(X, \mathcal{S})$.
(a) If $v \in L^q$, then the mapping $u \mapsto \int_X uv \, d\mu$ is a continuous linear functional on $L^p$.
(b) Suppose $p < \infty$ and $f$ is a continuous linear functional on $L^p$. Then there exists a $v \in L^q$ such that $f(u) = \int_X uv \, d\mu$ for all $u \in L^p$. Moreover $\|v\|_q = \|f\|$ (here, $\|f\| = \sup\{f(u) : u \in L^p,\ \|u\|_p \le 1\}$).

Hint. By the Radon-Nikodým theorem there exists an $\mathcal{S}$-measurable finite function $v$ such that $f(c_E) = \int_E v \, d\mu$ for each set $E \in \mathcal{S}$. Thus, $f(u) = \int_X uv \, d\mu$ for each $u \in L^p$. It remains to show that $v \in L^q$ and $\|v\|_q = \|f\|$. There is an increasing sequence $\{E_j\}$ of sets from $\mathcal{S}$ of finite measure such that $v$ is bounded on each $E_j$ and $\bigcup_j E_j$ is $X$. Set $u_j = |v|^{q-2} v c_{E_j}$. Then $u_j \in L^p$ and $\|u_j\|_p^p = \int_X u_j v \, d\mu = f(u_j) \le \|f\| \, \|u_j\|_p$, and therefore
$$\int_{E_j} |v|^q \, d\mu = \int_X |u_j|^p \, d\mu = \bigl(\|u_j\|_p^{p-1}\bigr)^q \le \|f\|^q.$$
An appeal to Levi's theorem reveals that $v \in L^q$ and $\|v\|_q \le \|f\|$. The reverse inequality is an easy consequence of Hölder's inequality.
13.18. Remark. The previous theorem, which describes the duals of $L^p$ for $1 < p < \infty$, was proved using the Radon-Nikodým theorem, so that we had to confine ourselves to the case of $\sigma$-finite measures. However, the theorem holds for arbitrary measures.
In a similar way, it can be proved that any element of $L^\infty$ represents a continuous linear functional on $L^1$ (by the same formula as in 13.17), but not every element of the dual space to $L^1$ is of this form. In order to be able to characterize elements of the dual space to $L^1$ by elements of $L^\infty$, it is necessary to confine ourselves, e.g., to the case of $\sigma$-finite measures. One possible description of the dual space of $L^1(\mu)$ in the case of an arbitrary measure $\mu$ can be found in J. Schwartz [1951].
The assumptions imposed on measures in the Radon-Nikodým theorem can be weakened to the case of so-called localizable measures. This notion, containing the $\sigma$-finite case as well as the case of Radon measures, was introduced by I.E. Segal in [1954]. For more information we refer to M.M. Rao [*1987].
Let us remark that the dual space of $L^\infty$ can be described using (bounded) finitely additive measures, cf. E. Hewitt and K. Stromberg [*1965].
13.19. Notes.
The Radon-Nikodým theorem was first proved by H. Lebesgue [1910] for
measures which are absolutely continuous with respect to the Lebesgue measure. Later, it was
generalized by M.J. Radon [1913] to Radon measures and by O. Nikodým [1930] to measures
on abstract spaces. The Lebesgue decomposition in the case of arbitrary measures is in Saks'
monograph [*1937]. There are different proofs of the Radon-Nikodým theorem. One of them is
based on the Hahn decomposition and a newer one, using the Riesz theorem of representation
of functionals on Hilbert spaces, comes from J. von Neumann. Our proof uses the variational
principle and is close to von Neumann's.
In the classical case of the Lebesgue measure, the characterization of dual spaces of $L^p$ spaces
is due to F. Riesz [1910] for $p > 1$ and to H. Steinhaus [1919] for $p = 1$.


C. Radon Integral and Measure


14. Radon Integral
14.1. Radon Integral.
Throughout this chapter, $P$ will denote a locally
compact topological space. The most important examples of locally compact
spaces are open and closed subsets of $\mathbb R^n$.
The support of a function $f$ is the closure of the set $\{x \in P : f(x) \ne 0\}$. It is
denoted by $\operatorname{supt} f$.
If $K \subset P$ is a compact set, the symbol $\mathcal C_K(P)$ stands for the linear space of
all continuous functions on $P$ whose support is contained in $K$. By $\mathcal C_c(P)$ we
denote the set of all continuous functions on $P$ whose support is compact, i.e.
$\mathcal C_c(P) = \bigcup\{\mathcal C_K(P) : K \subset P \text{ is compact}\}$. If the space $P$ is compact, then $\mathcal C_c(P)$
and the space $\mathcal C(P)$ of all continuous functions on $P$ coincide.
A functional $A$ on $\mathcal C_c(P)$ is positive if $Af \ge 0$ whenever $f \ge 0$, and monotone
if $Af \le Ag$ whenever $f \le g$. A linear functional is positive if and only if it is
monotone (for nonlinear functionals, these two notions are different).
A Radon integral on $P$ is any positive linear functional on the space $\mathcal C_c(P)$.
14.2. Examples. (a) Fix a point $a \in P$. It is simple to verify that the mapping
$$\varepsilon_a : f \mapsto f(a), \qquad f \in \mathcal C_c(P),$$
is a Radon integral on $P$. This functional is called the Dirac integral at $a$.

(b) The functional
$$Af := \int_a^b f$$
is a Radon integral on $[a, b]$. Since $f$ is continuous, the integral $\int_a^b f$ always exists in Newton's
or Riemann's sense.

(c) The previous example can be generalized. If $\alpha$ is a nondecreasing function on $\mathbb R$ and
$[a, b] \subset \mathbb R$, we define
$$A_\alpha f = (RS)\int_a^b f\,d\alpha := \inf\Big\{\sum_{i=1}^n c_i\big(\alpha(x_i) - \alpha(x_{i-1})\big) : a = x_0 < x_1 < \dots < x_n = b,\ c_i \ge f \text{ on } [x_{i-1}, x_i]\Big\}$$
for each $f \in \mathcal C([a, b])$. The functional $A_\alpha$ is again a Radon integral and it is called the Riemann-Stieltjes integral. If $\alpha(x) = x$, it is the Riemann integral.
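
The infimum over partitions in (c) can be approximated from above by sums over uniform partitions. The sketch below assumes a continuous integrand and replaces each $c_i$ by the maximum of $f$ over a small sample of $[x_{i-1}, x_i]$; the name `upper_rs_sum` and the choice $\alpha(x) = x^2$ are illustrative only.

```python
def upper_rs_sum(f, alpha, a, b, n, samples=16):
    """Upper Riemann-Stieltjes sum of f with respect to a nondecreasing alpha
    on the uniform partition of [a, b] into n pieces.

    c_i is taken as the maximum of f over samples + 1 equally spaced points
    of [x_{i-1}, x_i]. This equals sup f on the piece when f is monotone
    there and is only an approximation otherwise.
    """
    total = 0.0
    h = (b - a) / n
    for i in range(1, n + 1):
        left, right = a + (i - 1) * h, a + i * h
        c_i = max(f(left + (right - left) * k / samples) for k in range(samples + 1))
        total += c_i * (alpha(right) - alpha(left))
    return total

f = lambda x: x            # continuous integrand
alpha = lambda x: x * x    # nondecreasing on [0, 1]

coarse = upper_rs_sum(f, alpha, 0.0, 1.0, 10)
fine = upper_rs_sum(f, alpha, 0.0, 1.0, 1000)
```

For this choice the integral is $\int_0^1 x\,d(x^2) = \int_0^1 2x^2\,dx = 2/3$, and the upper sums decrease towards that value as the partition is refined.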
(d) The Riemann integral or the Riemann-Stieltjes integral can be defined for functions from
$\mathcal C_c(\mathbb R)$. Indeed, observe that $\int_a^b f\,d\alpha = \int_c^d f\,d\alpha$ whenever $f \in \mathcal C_c(\mathbb R)$ and $\operatorname{supt} f \subset [a, b] \cap [c, d]$.
Thus, if $f \in \mathcal C_c(\mathbb R)$, set $A_\alpha f = \int_a^b f\,d\alpha$ where $[a, b]$ is an arbitrary interval containing the
support of $f$. The functional $A_\alpha$ is then a Radon integral on $\mathbb R$.

(e) Another example of a Radon integral is the functional
$$f \mapsto \int_G (f \circ \varphi)\, g ,$$
where $g$ is a continuous nonnegative function on an open set $G \subset \mathbb R^k$ and $\varphi : G \to P$ is a
continuous mapping. In this way, the following important examples (f) and (g) can be expressed
as well.


(f) Suppose $G \subset \mathbb R^k$ is an open set and $\varphi : G \to \mathbb R^n$ is a diffeomorphism. Then the Radon
integral
$$f \mapsto \int_G (f \circ \varphi)\,\sqrt{\det\big((\varphi')^{T} \varphi'\big)}$$
is in fact the ($k$-dimensional) surface integral of $f$ over $\varphi(G)$.

(g) Let $P = \{z \in \mathbb R^2 : |z| = 1\}$. If $f$ is a continuous function on $P$ and $x = (r\cos\theta, r\sin\theta)$,
$|x| < 1$, set
$$Af(x) = \frac{1}{2\pi} \int_0^{2\pi} \frac{(1 - r^2)\, f(\cos t, \sin t)}{1 - 2r\cos(t - \theta) + r^2}\,dt;$$
this integral is called the Poisson integral. It is not too hard to see that the mapping
$$f \mapsto Af(x)$$
is a Radon integral on $P$.
The Poisson integral is used when solving the Laplace (partial differential) equation $\Delta h = 0$.
Indeed, given a continuous function $f$ on $P$, the function $h : x \mapsto Af(x)$ is harmonic in the unit
disc $U := \{x \in \mathbb R^2 : |x| < 1\}$ (i.e. $h$ is a solution to the Laplace equation) and
$$\lim_{x \in U,\ x \to z} h(x) = f(z)$$
for each $z \in P$.
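
The Poisson integral of example (g) is easy to evaluate numerically. The following sketch (the function name `poisson_integral` and the sample point are illustrative assumptions) uses equally spaced nodes on $[0, 2\pi)$ and tests the formula on the boundary data $f(z) = z_1$, whose harmonic extension to the disc is $h(x) = x_1 = r\cos\theta$.

```python
import math

def poisson_integral(f, r, theta, n=2000):
    """Approximate Af(x) for x = (r cos(theta), r sin(theta)), 0 <= r < 1,
    by the rectangle rule on n equally spaced points of [0, 2*pi).
    For a periodic integrand this coincides with the trapezoidal rule."""
    total = 0.0
    for k in range(n):
        t = 2.0 * math.pi * k / n
        kernel = (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(t - theta) + r * r)
        total += kernel * f(math.cos(t), math.sin(t))
    return total / n   # (1 / (2*pi)) * (2*pi / n) * sum

# Boundary data f(z) = z_1; the harmonic function with these boundary values is x_1.
f = lambda c, s: c
r, theta = 0.6, 1.1
approx = poisson_integral(f, r, theta)
exact = r * math.cos(theta)

# Positivity and normalisation: the constant function 1 is mapped to 1.
mass = poisson_integral(lambda c, s: 1.0, r, theta)
```

The value `mass` being one reflects that the harmonic measure at $x$ has total mass one.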

An important property of Radon integrals is described in the following theorem.


14.3. Theorem (Daniell's property). Let $A$ be a Radon integral on $P$, $f_n \in \mathcal C_c(P)$, $f_n \searrow 0$. Then $Af_n \to 0$.

Proof. There is a limit $b := \lim Af_n \ge 0$. By Dini's theorem $f_n \to 0$ uniformly on $P$
(consider the sequence $\{f_n\}$ on the support of $f_1$), and therefore there exists a
sequence $\{n_k\}$ so that $|f_{n_k}| \le 2^{-k}$ for each $k \in \mathbb N$. Thus the series $\sum_k f_{n_k}$ converges
uniformly on $P$ and if $f := \sum_k f_{n_k}$, then $f \in \mathcal C_c(P)$. Then for each $k$,
$$bk \le \sum_{i=1}^k Af_{n_i} = A\Big(\sum_{i=1}^k f_{n_i}\Big) \le Af,$$
and hence $b = 0$.
14.4. Semicontinuous Functions. Remember that a function $f : P \to \overline{\mathbb R}$ is
said to be lower semicontinuous if $\{x \in P : f(x) > c\}$ is open for each $c \in \mathbb R$.
Upper semicontinuous functions are defined in a similar way.
Denote by $\mathcal C_c^{\uparrow}(P)$ the set of all lower semicontinuous functions on $P$ which are
nonnegative outside a compact set and do not attain the value $-\infty$. Analogously
we define $\mathcal C_c^{\downarrow}(P)$.
14.5. Extension of a Radon Integral. Let $A$ be a Radon integral on $P$. We
extend $A$ to larger classes of functions in the following steps.

1. For $f \in \mathcal C_c(P)$ let $\int f = Af$.

2. If $f \in \mathcal C_c^{\uparrow}(P)$, then we define $\int f = \sup\{\int g : g \in \mathcal C_c(P),\ g \le f\}$.

3. Lastly, for an arbitrary function $f$ on $P$ we define upper and lower integrals as
$$\int^{*} f = \inf\Big\{\int u : u \in \mathcal C_c^{\uparrow}(P),\ u \ge f\Big\}, \qquad \int_{*} f = -\int^{*} (-f).$$

4. We say that a function $f$ on $P$ is $A$-integrable if $\int^{*} f = \int_{*} f$ and this common
value, which is denoted by $\int f$, is finite.
14.6. Properties of Extended Radon Integrals. The extension procedure
requires proving in each step that the new integral agrees
with the old one for functions which are already integrable with respect to
previous definitions. It is also necessary to show that

(a) $\int_{*} f \le \int^{*} f$ for every function $f$ on $P$;

(b) the set of all finite $A$-integrable functions on $P$ is a linear space and $\int$ is a
nonnegative linear functional on it.

14.7. Measures induced by Radon Integrals. The mapping $A^* : E \mapsto \int^{*} c_E$
is an outer measure on $P$. Define $\mathcal M(A)$ as the $\sigma$-algebra of all $A^*$-measurable
subsets of $P$ in the sense of Carathéodory (see 4.4) and $\mu_A$ as the restriction of
$A^*$ to $\mathcal M(A)$. Then $\mu_A$ is a measure on $\mathcal M(A)$. Furthermore, the set of all $A$-integrable
functions on $P$ coincides with the set of all $\mu_A$-integrable functions on
$P$. For every such function $f$ the equality $\int f = \int_P f\,d\mu_A$ holds. The measure
$\mu_A$ is complete and possesses the following properties:
(a) the domain $\mathcal M(A)$ of $\mu_A$ contains all Borel subsets of $P$;
(b) $\mu_A K < \infty$ for every compact $K \subset P$;
(c) $\mu_A G = \sup\{\mu_A K : K \subset G,\ K \text{ compact}\}$ for every open set $G \subset P$;
(d) $\mu_A M = \inf\{\mu_A G : G \supset M,\ G \text{ open}\}$ for every $M \in \mathcal M(A)$.
14.8. Examples. (a) If $z \in P$ and $A : f \mapsto f(z)$ is the Dirac integral at $z$, then $\mu_A$ is the
Dirac measure at $z$ (note that the Dirac measure is defined on the $\sigma$-algebra of all subsets of $P$).
(b) If $A_\alpha$ is the Riemann-Stieltjes integral determined on $\mathbb R$ by a nondecreasing function $\alpha$,
then the associated measure $\mu_\alpha := \mu_{A_\alpha}$ is called the Lebesgue-Stieltjes measure. Show that the
Lebesgue measure corresponds to $\alpha(x) = x$.
(c) The Radon measure corresponding to the Poisson integral (Example 14.2.g) is called the
harmonic measure at $x$.
14.9. Remark.
The explanation in 14.5-14.7 would be much longer if all proclaimed
propositions were proved. In this manuscript, we choose another approach. We will prove
the existence of $\mu_A$ (the so-called Riesz representation theorem) as directly as possible, and
the extension of a Radon integral to the collection of all $A$-integrable functions we obtain as
$f \mapsto \int_P f\,d\mu_A$.
Although there are several approaches with different proofs, their most difficult parts are
based on similar ideas.

14.10. Signed and Complex Radon Integrals. A linear functional $A$ on
the space $\mathcal C_c(P)$ is called a signed Radon integral on $P$ if for any compact set $K$
there exists a constant $a_K$ such that $|A(f)| \le a_K \sup_K |f|$ whenever $f \in \mathcal C_K(P)$.
Analogously we define complex Radon integrals on $P$. (They are functionals on the
space $\mathcal C_c(P, \mathbb C)$ of all complex valued continuous functions on $P$ with a compact
support.)
Any difference of positive Radon integrals serves as an example of a signed
Radon integral. Soon we show that all signed Radon integrals are of this form.


14.11. Variation of Signed and Complex Radon Integrals. Let $A$ be
a signed or complex Radon integral on $P$. The variation of $A$ is the (positive)
Radon integral $|A|$ defined in the following steps:
1. If $f \in \mathcal C_c(P)$ is a nonnegative function, define
$$|A|(f) = \sup\{Ag : g \in \mathcal C_c(P),\ |g| \le f\}.$$
Apparently, $0 \le |A|(f) < +\infty$. If $f_1, f_2 \in \mathcal C_c(P)$ are nonnegative, then $|A|(f_1 + f_2) \ge |A|(f_1) + |A|(f_2)$.
To prove the reverse inequality we use the Riesz decomposition lemma: Given $g \in \mathcal C_c(P)$, $|g| \le f_1 + f_2$, there exist $g_1, g_2 \in \mathcal C_c(P)$ such
that $g = g_1 + g_2$, $|g_j| \le f_j$ for $j = 1, 2$. (It is not hard to see that the functions
$g_j := f_j\, g\,(f_1 + f_2)^{-1}$ on $G := \{f_1 + f_2 > 0\}$ and zero on $P \setminus G$ have all the desired
properties.) Thus $|A|(f_1) + |A|(f_2) \ge Ag_1 + Ag_2 = Ag$, and taking the supremum over all $g$, we
obtain $|A|(f_1 + f_2) \le |A|(f_1) + |A|(f_2)$.
2. If $f \in \mathcal C_c(P)$ is arbitrary, define
$$|A|(f) = |A|(f^+) - |A|(f^-).$$
To prove the additivity of $|A|$, note that $f_1^+ + f_2^+ + (f_1 + f_2)^- = f_1^- + f_2^- + (f_1 + f_2)^+$
and use the previous step. Since $|A|(\alpha f) = \alpha\,|A|(f)$ for every $f \in \mathcal C_c(P)$ and
every $\alpha \in \mathbb R$, the functional $|A|$ is linear.
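
The check of the Riesz decomposition lemma that step 1 leaves to the reader can be written out as follows. This is a sketch of the routine verification, with $G = \{f_1 + f_2 > 0\}$ and $g_j$ defined as in step 1.

```latex
\begin{align*}
  g_1 + g_2 &= \frac{(f_1 + f_2)\,g}{f_1 + f_2} = g
      && \text{on } G, \\
  |g_j| &= f_j\,\frac{|g|}{f_1 + f_2} \le f_j
      && \text{on } G, \text{ since } |g| \le f_1 + f_2, \\
  g = g_1 = g_2 &= 0
      && \text{on } P \setminus G, \text{ since } |g| \le f_1 + f_2 = 0 \text{ there.}
\end{align*}
```

Continuity of $g_j$ is clear on the open set $G$. At a point $z \in P \setminus G$ it follows from $|g_j| \le |g|$ on $G$ together with $g(z) = 0$ and the continuity of $g$. Finally, the support of $g_j$ is contained in the support of $g$, so $g_j \in \mathcal C_c(P)$.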
14.12. Decomposition of Signed and Complex Radon Integrals. Every
complex Radon integral can be decomposed into its real and imaginary parts, which
are signed Radon integrals. If $A$ is a signed Radon integral, then $A$ can be
expressed in the form $A = A^+ - A^-$ where $A^+ := \frac12(|A| + A)$ (the positive
variation) and $A^- := \frac12(|A| - A)$ (the negative variation) are positive Radon
integrals. The representation of a signed Radon integral as a difference of positive
Radon integrals is not unique; however, the decomposition into the positive and
negative parts is minimal in a similar sense as the Jordan decomposition of a
measure.
14.13. Exercise (product of Radon integrals). Let $A_1$, $A_2$ be Radon integrals on locally
compact spaces $P_1$, $P_2$. Show that there exists exactly one Radon integral $A$ on $P_1 \times P_2$ such
that
$$Af = A_1 f_1 \cdot A_2 f_2$$
whenever $f_1 \in \mathcal C_c(P_1)$, $f_2 \in \mathcal C_c(P_2)$ and $f(x_1, x_2) = f_1(x_1) f_2(x_2)$, $x_1 \in P_1$, $x_2 \in P_2$. Prove the
similar assertion also for signed and complex Radon integrals.
Hint. According to the Stone-Weierstrass theorem, the set of all linear combinations of functions
of the form $f_1(x) f_2(y)$, where $f_i \in \mathcal C_c(P_i)$, is dense in $\mathcal C_c(P_1 \times P_2)$.
14.14. Notes.
When building up the integration theory, some authors do not start from
the original notion of a measure; they consider linear functionals on spaces of functions defined
on an arbitrary set and then they extend these functionals to larger collections of functions.
More precisely, it is possible to start with a Riesz lattice of functions on a set $X$ (which is
a linear space of functions closed under formations of finite maxima and minima) and with a
positive linear functional on this lattice which satisfies Daniell's condition. As in 14.5, this functional
is extended, a measure is defined in a natural way and its properties are derived. This method
was worked out by P.J. Daniell [1918] (for the extension he used sequences of functions) and
M.H. Stone in the series of articles [1948] and [1949] (using generalized sequences). Let us note
that many authors use different forms of a similar approach (W.H. Young [1904], H.H. Goldstine
[1941], J. Mařík [1952] and others).
The idea of extending the Radon integral is also due to many authors. However, it seems
that Bourbaki's group was the first to work it out for general locally compact spaces.
Also many authors confine themselves to the integration theory on locally compact spaces.
Kakutani's representation theorem (S. Kakutani [1941]) says that any abstract functional $A$
on a Riesz lattice can be represented as a Radon integral $A_K$ on a locally compact space $P_K$
so that the corresponding spaces of integrable functions are isomorphic. However, Kakutani's
representation also has some disadvantages: when changing the original functional, the space $P_K$
changes as well, and if the original space is locally compact, Kakutani's representation can lead to
a different space. Supposing that the Riesz lattice satisfies Stone's condition ($\min(1, f)$ belongs
to the lattice whenever $f$ does), H. Bauer in [1957] constructed a different representation $(P_B, A_B)$ which removes
the mentioned disadvantages; the space $P_B$ does not depend on $A$, and if the lattice is the space of all
continuous functions with a compact support on a locally compact space $P$, then $P_B$ can be identified
with $P$, $\mathcal C_c(P_B)$ with the lattice and $A_B$ with $A$.
Of course, no such representation can save the topology on the original space if it is not
locally compact.
Many works are also concerned with integration theories in topological spaces without the
assumption of local compactness (for instance in separable metric spaces), but in this case the
correspondence between measure and the topological structure is not so fruitful.

15. Radon Measures


15.1. Radon Measures.
Properties of the measure $\mu_A$ from the previous
chapter (14.7) lead to the following definition.
Let $P$ be a locally compact space and $\mathcal B(P)$ the $\sigma$-algebra of all Borel subsets
of $P$. We say that $\mu$ is a Radon measure on $(P, \mathcal S)$ if
(a) $\mathcal S$ is a $\sigma$-algebra containing $\mathcal B(P)$;
(b) $\mu K < \infty$ for every compact set $K \subset P$;
(c) $\mu G = \sup\{\mu K : K \subset G,\ K \text{ compact}\}$ for every open set $G \subset P$;
(d) $\mu A = \inf\{\mu G : G \supset A,\ G \text{ open}\}$ for every $A \in \mathcal S$.
15.2. Remarks.
1. There is a one-to-one correspondence between Radon integrals on $P$
and Radon measures on $\mathcal B(P)$. To each Radon integral, we can assign a Radon measure $\mu_A$.
(See 14.7; an alternative, more precisely developed approach will be used in 16.4.) Then we
can restrict $\mu_A$ to $\mathcal B(P)$. On the other hand, if $\mu$ is a Radon measure on $\mathcal B(P)$, then $\mu = \mu_A$
on $\mathcal B(P)$, where $A$ is the Radon integral $f \mapsto \int_P f\,d\mu$. Therefore, to give examples of Radon
measures it is enough to list examples of Radon integrals.
2. In order to describe the structure of the system of all Radon measures on $P$, we define an
equivalence relation: Radon measures $\mu_1$ on $(P, \mathcal S_1)$ and $\mu_2$ on $(P, \mathcal S_2)$ are said to be equivalent
(at least for the purpose of this remark) if $\mu_1 = \mu_2$ on $\mathcal B(P)$. Then, of course, $\mu_1 = \mu_2$ on
$\mathcal S_1 \cap \mathcal S_2$. In each equivalence class we can find two special representatives: the minimal
one, which is defined on $\mathcal B(P)$, and the maximal one, which is complete. However, not every
complete Radon measure is maximal. See also Exercise 15.20. The Borel $\sigma$-algebra plays
the unifying role: all minimal Radon measures are defined on it, while domains of maximal
Radon measures can be different (for instance, there are Lebesgue nonmeasurable sets while
every set is measurable with respect to the complete Dirac measure mentioned in Example
14.8.a).
3. It is worthwhile to mention that definitions of Radon measures may vary from one author to
another. We have seen that to each Radon integral we can assign an outer Radon measure and
the whole scale of equivalent Radon measures. There is no general agreement which of these
objects should be called a Radon measure. Some authors even use the term Radon measure for a
measure which is inner regular (i.e. $\mu E = \sup\{\mu K : K \subset E,\ K \text{ compact}\}$ for every $E \in \mathcal S$). In
more general spaces than the $\sigma$-compact ones this leads to a different concept than our definition,
which requires outer regularity. Also the terms regular Borel measure or Borel regular measure
are frequently and ambiguously used for Radon measures or similar objects.
4. Suppose that $P$ is separable, metrizable and that $\mathcal S = \mathcal B(P)$. Then the property (b) of 15.1
implies both (c) and (d). Indeed, as every open set is a countable union of compact sets, we get
(c). Moreover, the collection of all Borel sets $A$ satisfying
$$\mu(A \cap U) = \inf\{\mu G : G \supset A \cap U,\ G \text{ open}\}$$
for every open set $U$ is a $\sigma$-algebra containing all open sets. Therefore (d) holds as
well.

Throughout, $\mu$ is a Radon measure on $(P, \mathcal S)$.


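15.3. Lemma. If $E \in \mathcal S$, $\mu E < \infty$, then
$$\mu E = \sup\{\mu K : K \subset E,\ K \text{ compact}\}.$$

Proof. Given $\varepsilon > 0$, there exist an open set $G \supset E$ and a compact set $K \subset G$
such that $\mu(G \setminus E) < \varepsilon$ and $\mu K > \mu G - \varepsilon$. Let $V$ be an open set containing $G \setminus E$
with $\mu V < \varepsilon$. If $H := K \setminus V$, then $H$ is a compact subset of $E$,
$$E \setminus H \subset (E \cap K \setminus H) \cup (E \setminus K) \subset V \cup (G \setminus K)$$
and $\mu H > \mu E - 2\varepsilon$.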
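15.4. Lemma. Let $f \ge 0$ be a $\mu$-integrable function on $P$. Then

(a) $\int_P f\,d\mu = \inf\{\int_P g\,d\mu : g \in \mathcal C_c^{\uparrow},\ g \ge f\}$;
(b) $\int_P f\,d\mu = \sup\{\int_P h\,d\mu : h \in \mathcal C_c^{\downarrow},\ 0 \le h \le f\}$.

Proof. (a) Choose $\varepsilon > 0$ and find a sequence $\{s_k\}$ of simple functions, $0 \le s_k \nearrow f$.
Since $f = s_1 + \sum_{k=2}^{\infty} (s_k - s_{k-1})$, there exist $A_j \in \mathcal S$ and $\alpha_j > 0$ so that
$f = \sum_{j=1}^{\infty} \alpha_j c_{A_j}$. As $f$ is $\mu$-integrable, we have $\mu A_j < \infty$ for all $j$ and we can find
open sets $G_j \supset A_j$ so that $\alpha_j\,\mu(G_j \setminus A_j) < 2^{-j}\varepsilon$. The function $g := \sum_{j=1}^{\infty} \alpha_j c_{G_j}$ is
nonnegative, lower semicontinuous and satisfies
$$\int_P g\,d\mu \le \int_P f\,d\mu + \varepsilon.$$

(b) To prove the assertion it suffices to show it for simple functions. Let
$f = \sum_{j=1}^{n} \alpha_j c_{M_j}$ be a $\mu$-integrable function ($\alpha_1, \dots, \alpha_n > 0$, $M_1, \dots, M_n \in \mathcal S$ and
$\mu M_j < \infty$). Choose again an $\varepsilon > 0$. By virtue of Lemma 15.3 there are compact
sets $K_j$ ($j = 1, \dots, n$) such that $K_j \subset M_j$ and
$$\mu K_j \ge \mu M_j - \frac{\varepsilon}{n\alpha_j}.$$
Then $h := \sum_{j=1}^{n} \alpha_j c_{K_j}$ is an upper semicontinuous function with a compact support,
$0 \le h \le f$ and
$$\int_P h\,d\mu \ge \int_P f\,d\mu - \varepsilon.$$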

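15.5. Theorem. Let $\mu$ be a complete Radon measure on $P$. A function $f$ on
$P$ is $\mu$-integrable if and only if for every $\varepsilon > 0$ there exist integrable functions
$s \in \mathcal C_c^{\uparrow}$ and $t \in \mathcal C_c^{\downarrow}$ such that $t \le f \le s$ and
$$\int_P (s - t)\,d\mu < \varepsilon.$$

Proof. First assume that $f \in \mathcal L^1(\mu)$. Let $\varepsilon > 0$ be given. Cite Lemma 15.4 to
get nonnegative functions $g \in \mathcal C_c^{\uparrow}$, $h \in \mathcal C_c^{\downarrow}$ such that $g \ge f^+$, $h \le f^-$ and
$$\int_P g\,d\mu \le \int_P f^+\,d\mu + \varepsilon, \qquad \int_P h\,d\mu \ge \int_P f^-\,d\mu - \varepsilon.$$
Then $s := g - h \in \mathcal C_c^{\uparrow}$, $s \ge f$ and
$$\int_P s\,d\mu \le \int_P f\,d\mu + 2\varepsilon.$$
In a similar way, we manufacture a $t \in \mathcal C_c^{\downarrow}$ such that $t \le f$ and
$$\int_P t\,d\mu \ge \int_P f\,d\mu - 2\varepsilon.$$
This establishes the necessity.

For the converse, suppose that for every $k \in \mathbb N$ there exist integrable functions
$s_k \in \mathcal C_c^{\uparrow}$ and $t_k \in \mathcal C_c^{\downarrow}$ such that $t_k \le f \le s_k$ and
$$\int_P (s_k - t_k)\,d\mu < \frac1k.$$
No generality is lost with the assumption that the sequence $\{s_k\}$ is nonincreasing,
for we may replace it by the sequence $\{s_1, \min(s_1, s_2), \min(s_1, s_2, s_3), \dots\}$, and
that the sequence $\{t_k\}$ is nondecreasing. By the Lebesgue dominated convergence
theorem
$$\int_P \lim (s_k - t_k)\,d\mu = 0.$$
Thanks to Theorem 8.16, $\lim (s_k - t_k) = 0$ $\mu$-almost everywhere. Thus $\lim s_k = f$
$\mu$-almost everywhere and by the Lebesgue theorem with the dominating function
$|s_1| + |t_1|$ we get that $f$ is $\mu$-integrable.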


15.6. Corollary. Let $f$ be a $\mu$-integrable function on $P$. Then
$$\int_P f\,d\mu = \inf\Big\{\int_P s\,d\mu : s \in \mathcal C_c^{\uparrow},\ s \ge f\Big\}.$$

15.7. Exercise. Show that a finite positive measure $\mu$ on $(P, \mathcal S)$ is Radon if and only if for
every $E \in \mathcal S$ and for every $\varepsilon > 0$ there exist a compact set $K$ and an open set $U$ such that
$K \subset E \subset U$ and $\mu(U \setminus K) < \varepsilon$.
15.8. Signed and Complex Radon Measures. A signed measure $\mu$ on $(P, \mathcal S)$ is said to be
Radon if its positive and negative variations are Radon measures. A complex measure on
$(P, \mathcal S)$ is Radon if its real and imaginary parts are signed Radon measures.

Prove the following proposition: For a locally compact space $P$ and a complex measure $\mu$ on
$(P, \mathcal S)$, the following assertions are equivalent:
(i) $\mu$ is Radon;
(ii) $|\mu|$ is Radon;
(iii) if $E \in \mathcal S$ and $\varepsilon > 0$, then there exist a compact set $K$ and an open set $U$ so that
$K \subset E \subset U$ and $|\mu A| < \varepsilon$ for each $\mu$-measurable set $A \subset U \setminus K$.

Hint. Use the inequality
$$|\mu|(U \setminus K) \le 4 \sup\{|\mu A| : A \in \mathcal S,\ A \subset U \setminus K\}.$$
To prove this inequality, use the decomposition as in the proof of Theorem 6.11.
15.9. Exercise. Let $\mu$ be a Radon measure on $P$. Prove that the union $G$ of an arbitrary
collection of $\mu$-null open sets is again a $\mu$-null open set.
Hint. Suppose $K \subset G$ is compact. Then $K$ can be covered by a finite union of sets from the collection,
whence $\mu K = 0$. Now, the definition of a Radon measure (property (c)) shows that $\mu G = 0$.

15.10. Support of a Radon Measure. Let $\mu$ be a Radon measure on $P$. Define the support
of $\mu$ as
$$\operatorname{supt} \mu = P \setminus \bigcup \{G : G \text{ open},\ \mu G = 0\}.$$
In other words, $\operatorname{supt} \mu$ is the smallest closed set whose complement is of measure zero. According
to the previous exercise, such a set always exists. If $\mu$ is a signed or complex measure, its support
is defined as the support of its total variation $|\mu|$.
(a) Let $\mu$ be a positive measure. Show that $z \in \operatorname{supt} \mu$ if and only if $\int_P f\,d\mu > 0$ for every
nonnegative function $f \in \mathcal C_c(P)$ with $f(z) > 0$, which happens if and only if $\mu U > 0$ for every
open neighbourhood $U$ of $z$.
(b) If $\mu_1$, $\mu_2$ are Radon measures on $P$, then $\operatorname{supt}(\mu_1 + \mu_2) \subset \operatorname{supt} \mu_1 \cup \operatorname{supt} \mu_2$, with
equality if $\mu_1$ and $\mu_2$ are positive.
15.11. Exercise. Let $\mu$ be a Radon measure and $E$ a Borel set. If $E$ is open or of finite
measure, then the restriction $\mu\lfloor E$ (see 2.4) is a Radon measure.
15.12. Exercise. Let $\mu$ be a Radon measure and $f$ a nonnegative $\mu$-measurable function on
$P$. Show that the measure $f\mu$ with density $f$ (see 8.19) is a Radon measure in the following cases:
(a) $f$ is a Borel function and $f\mu$ is finite on compact sets;
(b) $f$ is continuous on $P$.
Hint. Use the part (a).
15.13. Exercise. Suppose that $\mu$ is a Radon measure and $f \in \mathcal L^1(\mu)$. Show that $f\mu$ is a
signed Radon measure.
15.14. Exercise.
Let $\mu_0$ be a $\sigma$-finite Radon measure and $\mu = \mu_a + \mu_s$ the Lebesgue
decomposition of a finite (signed or complex) Radon measure $\mu$ with respect to $\mu_0$. Show that
$\mu_a$ and $\mu_s$ are Radon measures.
Hint. Use Exercise 15.11.


15.15. Exercise. (a) Show that the assertion of Lemma 15.3 continues to hold, provided $E$
is a countable union of sets of finite measure.
(b) If a Radon measure $\mu$ is $\sigma$-finite, then
$$\mu E = \sup\{\mu K : K \subset E,\ K \text{ compact}\}$$
for every $E \in \mathcal S$, in particular for every Borel set $E \subset P$. (Recall that on $\sigma$-compact spaces
Radon measures are $\sigma$-finite. Every locally compact space with a countable base of open sets is
metrizable and $\sigma$-compact.)
(c) Let $P$ be the Cartesian product $\mathbb R \times \mathbb R_d$ where $\mathbb R_d$ is $\mathbb R$ equipped with the discrete
topology. Show that $P$ is a locally compact space which is not $\sigma$-compact.
For every open set $G \subset P$ set $\mu G = \sum_{y \in \mathbb R_d} \lambda G^{y}$ (see the notation in 11.1) and extend $\mu$ to a
Radon measure on $P$.
If $B := \{0\} \times \mathbb R_d$, then $B$ is a Borel set, $\mu B = \infty$ and $\mu K = 0$ for every compact set $K \subset B$.
15.16. Exercise.
(a) Let $\mu$ be a Radon measure on $P$ and $K \subset P$ compact. Show that
$\{x \in K : \mu\{x\} > 0\}$ is countable. In particular, if $\mu$ is a Radon measure on a $\sigma$-compact space
$P$, then $\{x \in P : \mu\{x\} > 0\}$ is countable.
(b) Let $\mu$ be a Radon measure on $\mathbb R^n$. Then the set $\{r > 0 : \mu\{x \in \mathbb R^n : |x| = r\} > 0\}$ is
countable.
15.17. Exercise. Let $\mu$ be a Radon measure on $(P, \mathcal S)$, $p \in [1, \infty)$. Show that the set $\mathcal C_c(P)$
is dense in $L^p(\mu)$.

Hint. With the help of Exercise 10.10.a, it is sufficient to approximate the characteristic function of every set $E \in \mathcal S$ of finite
measure by functions from $\mathcal C_c(P)$ in the $L^p$-norm. If $\varepsilon > 0$, then there exist an open set $G$ and
a compact set $K$ (cf. Exercise 15.7) so that $K \subset E \subset G$ and $\mu(G \setminus K) < \varepsilon$. Now, use Urysohn's
lemma in order to find a function $\varphi \in \mathcal C_c(P)$ with $c_K \le \varphi \le c_G$.
15.18. Exercise. A Radon measure $\mu$ on $P$ is called discrete (often also atomic) if there
exists a set $S \subset P$ such that $\mu(P \setminus S) = 0$ and $\mu\{x\} \ne 0$ for all $x \in S$. A measure $\mu$ is called
continuous if $\mu\{x\} = 0$ for each point $x \in P$.
(a) Show that a complex Radon measure $\mu$ on $P$ is discrete if and only if there exists a
sequence of numbers $\{c_j\}$ and points $x_j \in P$ so that $\sum_j |c_j| < \infty$ and $\mu = \sum_j c_j \varepsilon_{x_j}$. Cf. Exercise
2.10.
(b) Show that every Radon measure $\mu$ can be uniquely expressed in the form $\mu = \mu_d + \mu_c$
where $\mu_d$ is discrete and $\mu_c$ continuous.
Hint. Set $S := \{x \in P : \mu\{x\} > 0\}$, $\mu_d A = \mu(A \cap S)$ and $\mu_c A = \mu(A \setminus S)$ (use Exercise 15.16.a).
(c) Show that each Radon measure $\mu$ on $\mathbb R^n$ can be written uniquely in the form
$$\mu = \mu_d + \mu_a + \mu_s$$
where $\mu_d$ is discrete, $\mu_s$ is continuous, $\mu_a$ is absolutely continuous with respect to the Lebesgue
measure $\lambda$, and $\mu_s$ and $\lambda$ are mutually singular.
15.19. Exercise. Let $\mu$ be a $\sigma$-finite Radon measure on $P$ and $E \in \mathcal S$.
(a) Show that for every $\varepsilon > 0$ there exist an open set $G$ and a closed set $F$ such that
$F \subset E \subset G$ and $\mu(G \setminus F) < \varepsilon$.
Hint. Suppose $E = \bigcup_j E_j$, where $E_j$ are of finite measure. Find open sets $G_j \supset E_j$ such that
$\mu G_j < \mu E_j + 2^{-j-1}\varepsilon$ and set $G = \bigcup_j G_j$. In a similar way find $F$ (considering the complements).
(b) Show that there exist an $F_\sigma$ set $S$ and a $G_\delta$ set $D$ such that $S \subset E \subset D$ and $\mu(D \setminus S) = 0$
(compare also with Theorem 1.21).
15.20. Exercise. Let $\mu$ be a Radon measure on $(P, \mathcal S)$ and consider the outer measure
$$\mu^* A := \inf\{\mu(G) : G \text{ open},\ G \supset A\}.$$
(a) If $A \in \mathcal S$, then $A$ is $\mu^*$-measurable in the sense of 4.4 and $\mu^* = \mu$ on $\mathcal S$. Thus the
extension of $\mu$ to $\mathcal M(\mu^*)$ is the maximal representative of $\mu$ in the sense of Remark 15.2.2.
(b) Let $\mu$ be as in Exercise 15.15.c. Show that the completion of $\mu$ is not the maximal
representative.
15.21. Notes. The important step from the Lebesgue measure to the study of more general
measures in Euclidean spaces was done by J. Radon [1913]. The notion of a Radon measure is
connected with his name, even if this term is not entirely common and some authors use other
synonyms instead.

16. Riesz Representation Theorem


In this chapter, let us direct our attention to the relation between Radon
measures and Radon integrals on a locally compact space $P$.
Suppose $\mu$ is a Radon measure on $P$. Then $\mathcal C_c(P) \subset \mathcal L^1(\mu)$ and the mapping
$$f \mapsto \int_P f\,d\mu, \qquad f \in \mathcal C_c(P),$$
is a Radon integral on $P$.
On the other hand, every Radon integral $A$ on $P$ can be understood as an
integral with respect to a Radon measure. In Chapter 14 we indicated a method
of proving this proposition, and now we propose to show another method and
provide complete proofs.
Symbols $A^*$ and $\mu_A$ of this chapter will be used to denote the same objects as
previously in Chapter 14, but the definitions are revised.
In the sequel, $A$ denotes a Radon integral on a locally compact space $P$.
16.1. Outer Radon Measure. If $G$ is an open subset of $P$, we set
$$A^*(G) = \sup\{Af : f \in \mathcal C_c(P),\ 0 \le f \le 1,\ f = 0 \text{ on } P \setminus G\}.$$
Clearly $A^*$ is monotone on open sets. We define
$$A^*(E) = \inf\{A^*(G) : G \text{ open},\ G \supset E\}$$
for an arbitrary set $E \subset P$.
The set function $A^*$ will be called the outer Radon measure (assigned to the
Radon integral $A$). In the next theorem we will prove that $A^*$ (which will be
simply denoted by $\mu^*$) is really an outer measure.
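
The supremum defining the outer Radon measure of an open set can be watched in the simplest case. The sketch below assumes that $A$ is the Riemann integral on $\mathbb R$ and $G = (a, b)$, and evaluates $Af_n$ for trapezoidal functions $f_n$ admissible in the definition; the names `trapezoid_bump` and `riemann_integral` are illustrative and not taken from the text.

```python
def trapezoid_bump(a, b, n):
    """Continuous f with 0 <= f <= 1, f = 0 outside (a, b) and f = 1 on
    [a + 1/n, b - 1/n]; assumes 2/n <= b - a."""
    d = 1.0 / n
    def f(x):
        if x <= a or x >= b:
            return 0.0
        return min(1.0, (x - a) / d, (b - x) / d)
    return f

def riemann_integral(f, lo, hi, steps=20_000):
    """Midpoint rule, standing in for the Radon integral A f = integral of f."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a, b = 0.0, 2.0
# A f_n for n = 1, 2, 4, ..., 32; each f_n is admissible for G = (a, b).
values = [riemann_integral(trapezoid_bump(a, b, n), a, b) for n in (1, 2, 4, 8, 16, 32)]
```

Here $Af_n = (b - a) - 1/n$, so the values increase towards $b - a$, the length of the interval, without reaching it. This is consistent with the outer Radon measure of $(a, b)$ being its Lebesgue measure.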


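16.2. Properties of Outer Radon Measures. Let $\mu^*$ be an outer Radon
measure. Then
(a) $\mu^* K = \inf\{Ag : g \in \mathcal C_c(P),\ 0 \le g \le 1,\ g = 1 \text{ on } K\}$ for every compact
set $K \subset P$ (in particular, $\mu^*$ is finite on compact sets);
(b) $\mu^* G = \sup\{\mu^* K : K \text{ compact},\ K \subset G\}$ for every open set $G \subset P$;
(c) $\mu^*$ is an outer measure.
Proof. (a) Let $K \subset P$ be a compact set and $g \in \mathcal C_c(P)$, $0 \le g \le 1$, $g = 1$ on
$K$. Fix $\varepsilon \in (0, 1)$ and denote $G = \{x \in P : g(x) > 1 - \varepsilon\}$. Obviously $K \subset G$. If
$f \in \mathcal C_c(P)$, $0 \le f \le 1$, $f = 0$ on $P \setminus G$, then $f \le \frac{1}{1-\varepsilon}\, g$. Hence $\mu^* G \le \frac{1}{1-\varepsilon} Ag$ and
$$Ag \ge (1 - \varepsilon)\,\mu^* G \ge (1 - \varepsilon)\,\mu^* K.$$
We see that $Ag \ge \mu^* K$, and thus
$$\mu^* K \le \inf\{Ag : g \in \mathcal C_c(P),\ 0 \le g \le 1,\ g = 1 \text{ on } K\}.$$
To prove the reverse inequality, select an open set $G \supset K$. Urysohn's lemma
provides a function $g \in \mathcal C_c(P)$, $0 \le g \le 1$, $g = 0$ on $P \setminus G$ and $g = 1$ on $K$. Since
$Ag \le \mu^* G$, we are done.
(b) Suppose we are given an open set $G \subset P$ and $\varepsilon > 0$. Let $f \in \mathcal C_c(P)$
be a function such that $0 \le f \le 1$, $f = 0$ on $P \setminus G$. Since
$f_k := \min(f, \frac1k) \searrow 0$, an appeal to Daniell's property reveals the existence of an
$n \in \mathbb N$ such that $Af_n < \varepsilon$. If $K := \{x \in P : f(x) \ge \frac1n\}$, then $K$ is a compact
subset of $G$. According to (a), there exists $g \in \mathcal C_c(P)$ such that $0 \le g \le 1$, $g = 1$
on $K$ and $Ag \le \mu^* K + \varepsilon$. Since $f \le \max(f_n, g) \le f_n + g$, we get
$$Af \le Af_n + Ag \le \mu^* K + 2\varepsilon$$
and we have finished.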
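(c) Clearly $\mu^* \emptyset = 0$, and $\mu^* S \le \mu^* T$ whenever $S \subset T$. It remains to show
that $\mu^*$ is $\sigma$-subadditive. To this end, let $\varepsilon > 0$ and compact sets $K_1, K_2 \subset P$ be given. By
Theorem 16.2.a we can find $f_j \in \mathcal C_c(P)$, $j = 1, 2$, such that $f_j = 1$ on $K_j$ and
$Af_j \le \mu^* K_j + \varepsilon$. Then $\mu^*(K_1 \cup K_2) \le A(f_1 + f_2) = Af_1 + Af_2 \le \mu^* K_1 + \mu^* K_2 + 2\varepsilon$
and we see that $\mu^*$ is subadditive on compact sets.
Next consider open sets $G_1, G_2 \subset P$ and pick a compact set $K \subset G_1 \cup G_2$. For
any point $x \in K$ there is its neighbourhood $V_x$ whose closure is either in $G_1$ or
in $G_2$. Thanks to the compactness of $K$, we obtain finite collections of open sets
$\{V_i^1\}_i$, $\{V_i^2\}_i$ such that $\overline{V_i^j} \subset G_j$ and $\bigcup_i V_i^1 \cup \bigcup_i V_i^2 \supset K$. Set $K_j = K \cap \bigcup_i \overline{V_i^j}$,
$j = 1, 2$. Then $K_j$ are compact, $K_j \subset G_j$ and $K = K_1 \cup K_2$. Thus
$$\mu^* K \le \mu^* K_1 + \mu^* K_2 \le \mu^* G_1 + \mu^* G_2$$
and it readily follows that $\mu^*$ is subadditive on open sets.

Now, let $\{G_n\}$ be a sequence of open sets. Choose a compact set $K \subset \bigcup_{i=1}^{\infty} G_i$.
Then $K \subset \bigcup_{i=1}^{n} G_i$ for some $n \in \mathbb N$, whence
$$\mu^* K \le \mu^*\Big(\bigcup_{i=1}^{n} G_i\Big) \le \sum_{i=1}^{n} \mu^* G_i \le \sum_{i=1}^{\infty} \mu^* G_i,$$
and (b) yields that
$$\mu^*\Big(\bigcup_{i=1}^{\infty} G_i\Big) \le \sum_{i=1}^{\infty} \mu^* G_i.$$
Finally, let $E_n \subset P$ be arbitrary. We wish to show that $\mu^*(\bigcup_n E_n) \le \sum_n \mu^* E_n$.
It is clearly sufficient to assume that $\mu^* E_n < \infty$ for all $n$. Given $\varepsilon > 0$,
we can find open sets $G_n$ such that $E_n \subset G_n$ and $\mu^* G_n < \mu^* E_n + 2^{-n}\varepsilon$. Then
$$\mu^*\Big(\bigcup_n E_n\Big) \le \mu^*\Big(\bigcup_n G_n\Big) \le \sum_n \mu^* G_n \le \sum_n \mu^* E_n + \varepsilon.$$
As $\varepsilon > 0$ was arbitrary, $\mu^*$ is $\sigma$-subadditive as needed.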


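16.3. Theorem. Every Borel subset of $P$ is $\mu^*$-measurable (in the sense of
Carathéodory).

Proof. It is sufficient to prove measurability of open sets. Notice that by virtue of
Theorem 16.2.c we have $\mu^*(G_1 \cup G_2) = \mu^* G_1 + \mu^* G_2$ whenever $G_1$, $G_2$ are disjoint
open sets. Now, given an open set $G \subset P$, a test set $T \subset P$ such that $\mu^* T < \infty$
and $\varepsilon > 0$, there exist open sets $V$ and $H$ such that $V \supset T$, $\mu^* V < \mu^* T + \varepsilon$ and
$H \supset V \setminus G$, $\mu^* H < \mu^*(V \setminus G) + \varepsilon$. We can also find a compact set $K \subset V \cap G$
such that $\mu^* K + \varepsilon > \mu^*(V \cap G)$ and an open set $W$ with a compact closure such
that $K \subset W \subset \overline W \subset V \cap G$. Set $W_0 = V \cap H \setminus \overline W$. Then $W_0$ is an open set,
$W \cap W_0 = \emptyset$, $W \cup W_0 \subset V$, $V \setminus G \subset W_0$ and $\mu^* W + \varepsilon > \mu^*(V \cap G)$. Thus
$$\mu^*(T \cap G) + \mu^*(T \setminus G) \le \mu^*(V \cap G) + \mu^*(V \setminus G) < \mu^* W + \varepsilon + \mu^* W_0 = \mu^*(W \cup W_0) + \varepsilon \le \mu^* V + \varepsilon < \mu^* T + 2\varepsilon.$$
(In fact, the idea of the proof is quite simple: $T \cap G$ and $T \setminus G$ are approximated
by disjoint open sets.)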
16.4. Measure $\mu_A$. Any Radon integral $A$ on a locally compact space $P$ defines
a corresponding outer Radon measure $A^*$. Carathéodory's construction yields the
$\sigma$-algebra $\mathcal M_A$ on which $A^*$ is a complete Radon measure. The restriction of $A^*$
to $\mathcal M_A$ will be denoted by $\mu_A$.
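16.5. Riesz Representation Theorem. Let $A$ be a Radon integral on $P$.
Then there exists a complete Radon measure $\mu$ on $P$ such that $\mathcal C_c(P) \subset \mathcal L^1(\mu)$
and $Af = \int_P f\,d\mu$ for each $f \in \mathcal C_c(P)$. The measure $\mu$ is unique on $\mathcal B(P)$.
Proof of the existence. If $\mu = \mu_A$, then $\mu$ is a complete Radon measure. To
complete the proof we only have to show that
$$Af = \int_P f\,d\mu, \qquad f \in \mathcal C_c(P).$$
The reader should find it easy to prove that $\mathcal C_K(P) \subset \mathcal L^1(\mu)$.

Let $f \in \mathcal C_c(P)$ be given. It is no restriction to assume that $0 \le f \le 1$. For
$n \in \mathbb N$ and $k = 0, 1, \dots, n$ denote
$$f_k := \min\Big(f, \frac kn\Big), \qquad G_k := \Big\{x \in P : f(x) > \frac kn\Big\}.$$
Then using the definition of $\mu G$ and the properties of the integral it is clear that
$$\frac1n \mu G_k \le A(f_k - f_{k-1}) \le \frac1n \mu G_{k-1} \quad\text{and}\quad \frac1n \mu G_k \le \int_P (f_k - f_{k-1})\,d\mu \le \frac1n \mu G_{k-1}$$
for each $k = 1, \dots, n$. Thus
$$\Big|Af - \int_P f\,d\mu\Big| = \Big|\sum_{k=1}^{n} \Big(A(f_k - f_{k-1}) - \int_P (f_k - f_{k-1})\,d\mu\Big)\Big| \le \sum_{k=1}^{n} \frac1n (\mu G_{k-1} - \mu G_k) = \frac1n \mu G_0 = \frac1n \mu\{x \in P : f(x) > 0\}.$$
Since $\mu\{x \in P : f(x) > 0\} < +\infty$, we get $Af = \int_P f\,d\mu$ and we are done.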
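
The layer estimate in the existence proof can be followed on a concrete function. The sketch below assumes that $\mu$ is the Lebesgue measure on $\mathbb R$ and $f(x) = \max(0, 1 - |x|)$, so that $G_k = \{f > k/n\}$ is the interval $(-(1 - k/n),\, 1 - k/n)$ of length $2(1 - k/n)$; the function name `layer_bounds` is illustrative only. Exact rational arithmetic is used so that the inequalities can be compared without rounding.

```python
from fractions import Fraction

def layer_bounds(n):
    """Lower and upper sums (1/n) * sum mu(G_k) and (1/n) * sum mu(G_{k-1}),
    k = 1..n, for f(x) = max(0, 1 - |x|) and Lebesgue measure on R.
    Here G_k = {f > k/n} is the interval (-(1 - k/n), 1 - k/n)."""
    mu_G = [2 * (1 - Fraction(k, n)) for k in range(n + 1)]   # mu(G_0), ..., mu(G_n)
    lower = sum(mu_G[k] for k in range(1, n + 1)) / n
    upper = sum(mu_G[k - 1] for k in range(1, n + 1)) / n
    return lower, upper

integral_of_f = Fraction(1)      # area of the triangle under f
lower, upper = layer_bounds(50)
gap = upper - lower              # telescopes to mu(G_0) / n
```

Both $Af$ and $\int f\,d\mu$ lie between `lower` and `upper`, and the gap between these bounds is $\mu G_0 / n$, which tends to zero as $n$ grows. This is exactly the mechanism of the proof.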

Proof of the uniqueness. If $\nu$ is a complete Radon measure on $P$ obeying the
assumptions of the theorem, we have
$$Af = \int_P f\,d\nu \le \int_P c_G\,d\nu = \nu G$$
for any open set $G \subset P$ and $f \in \mathcal C_c(P)$ with $0 \le f \le 1$, $f = 0$ on $P \setminus G$. Whence
$\mu_A G = A^* G \le \nu G$. Taking into account Theorem 16.2.a, we have $\mu_A K \ge \nu K$ for
each compact set $K$, and from regularity of $\nu$ and $\mu_A$ it immediately follows that
$\nu$ and $\mu_A$ coincide on Borel sets.
16.6. Remark. The previous theorem guarantees a one-to-one correspondence between Radon
integrals and classes of equivalence of Radon measures (cf. Remark 15.2).

16.7. Other Spaces of Continuous Functions. We denote by $\mathcal C_b(P)$ the
Banach space of all bounded continuous functions on $P$ equipped with the norm
$$\|f\| = \sup_{x \in P} |f(x)|.$$
The space
$$\mathcal C_0(P) := \{f \in \mathcal C(P) : \text{for every } \varepsilon > 0 \text{ there exists a compact set } K \subset P \text{ such that } |f(x)| < \varepsilon \text{ for all } x \text{ outside } K\}$$
of all continuous functions on $P$ vanishing at infinity is the closure of $\mathcal C_c(P)$ in
$\mathcal C_b(P)$. If $\mu$ is a finite Radon measure on $P$, then
$$f \mapsto \int_P f\,d\mu$$
is a positive linear functional on $\mathcal C_b(P)$ and on $\mathcal C_0(P)$ as well. The converse
proposition is also true for the space $\mathcal C_0(P)$ as an easy consequence of the Riesz
Representation Theorem 16.5.

Let $A$ be a positive linear functional on $\mathcal C_0(P)$. Then there exists a unique finite
complete Radon measure $\mu$ on $P$ such that $\mathcal C_0(P) \subset \mathcal L^1(\mu)$ and $Af = \int_P f\,d\mu$
for every function $f \in \mathcal C_0(P)$.
16.8. Exercises. Let $P$, $Q$ be locally compact spaces, $h$ a continuous mapping from $P$ onto
$Q$ and $\mu$ a Radon measure on $(P, \mathcal B(P))$.
(a) Show that $f \circ h \in \mathcal L^1(\mu)$ for every function $f \in \mathcal C_c(Q)$ provided $\mu P < +\infty$ or $h^{-1}(F)$ is
compact for every compact $F \subset Q$.
(b) Show that the mapping $A : f \mapsto \int_P f \circ h\,d\mu$, $f \in \mathcal C_c(Q)$, is a Radon integral on $Q$.
(c) By the Riesz Representation Theorem there exists a unique Radon measure $\nu$ on
$(Q, \mathcal B(Q))$ which represents $A$. Prove that $\nu = h(\mu)$ (notation as in Exercise 8.23).
Hint. Show that $\nu(K) = h(\mu)(K)$ for every compact set $K$.
16.9. Product of Radon Measures. Consider locally compact spaces $P_1$ and $P_2$. Their Cartesian product $P_1 \times P_2$ is again a locally compact space. If $P_1$ and $P_2$ have countable bases, then the product $P_1 \times P_2$ has a countable base as well.
We are now interested in the product of two Radon measures. Two main problems arise:
1. In general, a Radon measure is not necessarily $\sigma$-finite and we cannot speak about a product measure in the sense of Chapter 11.
2. Although $\mathcal{B}(P_1) \otimes \mathcal{B}(P_2) \subset \mathcal{B}(P_1 \times P_2)$, the equality in general does not hold. Thus even if the original Radon measures on $P_1$ and $P_2$ are $\sigma$-finite, their product is not necessarily defined on $\mathcal{B}(P_1 \times P_2)$ and so it cannot be called a Radon measure.
In the sequel, we show a way to define a product of Radon measures.
(a) Let $P_1$, $P_2$ be locally compact spaces with countable bases (and so metrizable). Then $\mathcal{B}(P_1 \times P_2) = \mathcal{B}(P_1) \otimes \mathcal{B}(P_2)$ and if $\mu_1$, $\mu_2$ are Radon measures on $(P_1, \mathcal{B}(P_1))$, $(P_2, \mathcal{B}(P_2))$, then $\mu_1 \otimes \mu_2$ is a Radon measure on $(P_1 \times P_2, \mathcal{B}(P_1 \times P_2))$.
Hint. It is not hard to verify that $\mathcal{B}(P_1) \otimes \mathcal{B}(P_2) = \mathcal{B}(P_1 \times P_2)$. Hence $\mu_1 \otimes \mu_2$ is defined on $\mathcal{B}(P_1 \times P_2)$. Since $P_1 \times P_2$ has a countable basis, it is sufficient to show that $\mu_1 \otimes \mu_2$ is finite on compact sets. Select a compact set $K \subset P_1 \times P_2$ and denote by $K_i$ the projection of $K$ onto $P_i$. Then $K_1$ and $K_2$ are compact sets (as continuous images of compact sets) and $K \subset K_1 \times K_2$. Then
$$(\mu_1 \otimes \mu_2)(K) \le (\mu_1 \otimes \mu_2)(K_1 \times K_2) = \mu_1 K_1 \cdot \mu_2 K_2 < +\infty.$$
(b) Consider now the general case when $P_1$ and $P_2$ do not necessarily have countable bases. By Exercise 14.13 there is a Radon integral $A$ on $P_1 \times P_2$ such that
$$Af = A_1 f_1 \cdot A_2 f_2$$
whenever $f_1 \in \mathcal{C}_c(P_1)$, $f_2 \in \mathcal{C}_c(P_2)$ and $f(x_1, x_2) = f_1(x_1) f_2(x_2)$, $x_1 \in P_1$, $x_2 \in P_2$. Show that $\mu_A$ is a unique Radon measure on $\mathcal{B}(P_1 \times P_2)$ satisfying
$$\mu_A(E_1 \times E_2) = \mu_1 E_1 \cdot \mu_2 E_2$$
whenever $E_1 \in \mathcal{B}(P_1)$ and $E_2 \in \mathcal{B}(P_2)$. If, in addition, $\mu_1$, $\mu_2$ are $\sigma$-finite, then
$$\mu_A(E) = (\mu_1 \otimes \mu_2) E$$
for every set $E \in \mathcal{B}(P_1) \otimes \mathcal{B}(P_2)$.
(c) Using results of 16.10, it is possible to define a product of complex measures as well.
16.10. Representations of Signed and Complex Radon Integrals. (a) Let $\mu$ be a signed or complex Radon measure on $P$ and $a$ a bounded Borel function on $P$. Then the mapping
$$A : f \mapsto \int a f\,d\mu$$
is a complex Radon integral. If, in addition, $|\mu| P < +\infty$, then $A$ is a continuous linear functional on $\mathcal{C}_0(P)$.


(b) Let $\mu$ be a finite signed or complex Radon measure on $P$. Then the mapping
$$A : f \mapsto \int_P f\,d\mu$$
is a continuous linear functional on the space $\mathcal{C}_0(P)$.
(c) Let $A$ be a complex Radon integral on $P$ (signed Radon integrals are particular cases) and let $|A|$ be as in 14.11. By the Riesz representation theorem there is a Radon measure $\mu$ such that $\int f\,d\mu = |A| f$ for each $f \in \mathcal{C}_c(P)$. Show that there exists a bounded Borel function $a$ (uniquely determined as an element of $L^\infty(P, \mu)$) such that the equality $Af = \int_P a f\,d\mu$ holds for every function $f \in \mathcal{C}_c(P)$. Moreover, $|a| = 1$ $\mu$-almost everywhere.
Hint. The function $a$ can be defined locally, so the problem can be reduced to the case when $P$ is compact. Our approach will be analogous to that in proving the Radon-Nikodým theorem. The functional $A$ is uniformly continuous on $\mathcal{C}(P)$ with respect to the norm of the Hilbert space $L^2(P, \mu)$, thus it can be continuously extended to the entire space $L^2$, and the Riesz Representation Theorem on representation of continuous linear functionals on a Hilbert space yields the existence of an element $a \in L^2$ such that $Au = \int a u\,d\mu$ for all $u \in L^2$. The proof of the fact that $|a| = 1$ $\mu$-almost everywhere is more difficult, see e.g. G.K. Pedersen [*1989].
(d) Let $A$ be a continuous linear functional on $\mathcal{C}_0(P)$. Then there exists a unique complete signed (or complex, depending on the field considered) measure $\mu$ on $P$ such that $\mathcal{C}_0(P) \subset \mathcal{L}^1(|\mu|)$ and $Af = \int_P f\,d\mu$ for each $f \in \mathcal{C}_0(P)$.
16.11. Notes. The famous Riesz Representation Theorem 16.5 was proved for $P = [0, 1]$ by F. Riesz [1909b], but continuous linear functionals on $\mathcal{C}([0, 1])$ were represented by functions of bounded variation using Riemann-Stieltjes integrals. Other theorems of this type were proved by J. Radon [1913] (for compact subsets of $\mathbb{R}^n$), S. Banach in the Appendix to Saks' monograph [*1937], S. Saks [1938] (for compact metric spaces) and S. Kakutani [1941]. The Riesz Representation Theorem for compact spaces and the method of construction of Radon measures directly from Radon integrals is due to J. von Neumann [1934]. Concerning locally compact spaces, A. Weil [1940] was aware of the result. However, the final version for locally compact spaces was established by the Bourbaki group (A. Weil was also its member) in the 40s and appeared in [*1952].

17. Sequences of Measures

Throughout this chapter, $P$ will be a locally compact space. We denote by $\mathcal{M}(P)$ the linear space of all signed Radon integrals on $P$ and by $\mathcal{M}^+(P)$ the set of all (positive) Radon integrals on $P$. If no confusion can result, we will not distinguish between the Radon integral $A$ and the Radon measure $\mu_A$. Thus, for $\mu \in \mathcal{M}^+(P)$, we will write both $\mu(f)$ ($f \in \mathcal{C}_c(P)$) and $\mu E$ ($E \subset P$).
This chapter is for information; proofs will be given only for some propositions.
17.1. Strong and Weak Convergence. Let $\mathcal{F}$ be a linear subspace of $\mathcal{C}(P)$ containing $\mathcal{C}_c(P)$. We say that a sequence $\{\mu_n\}$ of Radon integrals on $P$ converges $\mathcal{F}$-weakly to a Radon integral $\mu$ if
$$\lim_{n \to \infty} \int f\,d\mu_n = \int f\,d\mu$$
for each $f \in \mathcal{F}$. Notice that the $\mathcal{F}$-weak limit, if it exists, is uniquely determined.
The most important case is the $\mathcal{C}_c(P)$-weak convergence, which is called the vague convergence and denoted by $\mu_n \xrightarrow{v} \mu$.


From the point of view of functional analysis, the vague convergence is the weak* convergence. In case of spaces of measures, it is common to omit the asterisk. Different spaces of measures are duals to different spaces of continuous functions. If such a space $\mathcal{F}$ is equipped with a norm (or, more generally, with a locally convex topology), then it is possible to introduce an $\mathcal{F}$-weak topology on its dual so that the $\mathcal{F}$-weak convergence of measures is in fact the convergence in this ($\mathcal{F}$-weak) topology. Notice that the dual space with this topology is not (except in a few not very interesting cases) metrizable. Hence the $\mathcal{F}$-weak topology cannot be described by convergence of sequences and, in general, nets have to be used instead.
If $P$ is a compact metric space, then the set $\mathcal{M}^+(P)$ is metrizable in the $\mathcal{C}(P)$-weak topology.

In functional analysis, besides weak convergences we meet also strong convergences. The most important example of a strong type convergence of measures is the convergence $\|\mu_n - \mu\| \to 0$ on the space $\mathcal{M}_b(P)$ of all finite signed (or complex) Radon measures. The norm $\|\mu\|$ defined as $|\mu|(P)$ (like in Exercise 6.17) is the dual norm to the norm of $\mathcal{C}_0(P)$, and $\mathcal{M}_b(P)$ with this norm is a Banach space.
17.2. Comparison of Weak Convergences. Now we will touch on the question whether the vague convergence $\mu_n \xrightarrow{v} \mu$ implies the $\mathcal{F}$-weak convergence for a wider space $\mathcal{F}$ of test functions. Proofs of the next propositions require the Banach-Steinhaus theorem of functional analysis.
(a) A sequence $\mu_n$ converges $\mathcal{C}_0(P)$-weakly to $\mu$ if and only if $\mu_n \xrightarrow{v} \mu$ and the sequence $\{\|\mu_n\|\}$ is bounded.
(b) As usual, by weak convergence we understand the $\mathcal{C}_b(P)$-weak convergence, where $\mathcal{C}_b(P)$ denotes the set of all bounded continuous functions on $P$. A sequence $\mu_n$ of complex measures from $\mathcal{M}_b(P)$ converges weakly to a measure $\mu \in \mathcal{M}_b(P)$ if and only if $\mu_n \xrightarrow{v} \mu$ and for every $\varepsilon > 0$ there exists a compact set $K \subset P$ so that $|\mu_n|(P \setminus K) < \varepsilon$ for all $n$.
A sequence $\mu_n$ of positive measures from $\mathcal{M}_b^+(P)$ converges weakly to a measure $\mu \in \mathcal{M}_b^+(P)$ if and only if $\mu_n \xrightarrow{v} \mu$ and $\|\mu_n\| \to \|\mu\|$.
(c) Note that the weak convergence implies the $\mathcal{C}_0(P)$-weak convergence and this one implies the vague convergence. If the space $P$ is compact, then $\mathcal{C}_b(P) = \mathcal{C}_c(P) = \mathcal{C}(P)$ and there is no difference between these convergences.
17.3. Examples. (a) If $x_n \to x$, then the Dirac measures satisfy $\varepsilon_{x_n} \xrightarrow{v} \varepsilon_x$.
(b) Suppose that $\{x_n\}$ is a sequence having no convergent subsequence (for example, take $x_n = n$ for $P = \mathbb{R}$) and $\{\alpha_n\}$ a sequence of real numbers. Then the sequence $\{\alpha_n \varepsilon_{x_n}\}$ converges vaguely to the null measure. This sequence converges (to the null measure) $\mathcal{C}_0(P)$-weakly if and only if $\alpha_n$ is a bounded sequence, and weakly if and only if $\alpha_n \to 0$.
(c) Suppose $x_n \to z$, $y_n \to z$, $x_n \ne y_n$. Then $\varepsilon_{x_n} - \varepsilon_{y_n} \xrightarrow{v} 0$ but $\|\varepsilon_{x_n} - \varepsilon_{y_n}\| \to 2 \ne 0$.
(d) If $f$ is a continuous function on $[0, 1]$, then
$$\int_0^1 f(x)\,dx = \lim_{k \to +\infty} \frac{1}{k} \sum_{i=1}^{k} f\Big(\frac{i}{k}\Big).$$
As is often the case, this equality can become the basis for a definition of the Riemann integral, but it cannot be used to describe the set of Riemann integrable functions. In the language of the weak convergence of measures it can be understood as
$$\frac{1}{k} \sum_{i=1}^{k} \varepsilon_{i/k} \xrightarrow{v} \lambda|_{[0,1]}.$$


The above examples show that the weak or vague convergence (unlike the strong convergence, cf. Exercise 6.18) of $\mu_n$ to $\mu$ does not imply $\mu_n(A) \to \mu(A)$ for all (Borel) sets. Nevertheless, we can state the following theorem for compact spaces. Its modifications hold in locally compact spaces as well.
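As a numerical illustration of Example 17.3.d and of the preceding remark, the following Python sketch integrates a continuous function against the measures $\mu_k = \frac{1}{k}\sum_{i=1}^{k} \varepsilon_{i/k}$ and evaluates them on the Borel set $A = \mathbb{Q} \cap [0,1]$. It is not part of the original text; the test function $\cos$ and all helper names are assumptions made only for this illustration. The integrals approach $\int_0^1 \cos x\,dx = \sin 1$, while $\mu_k(A) = 1$ for every $k$ although $\lambda(A) = 0$.

```python
import math
from fractions import Fraction

def integrate_against_mu_k(f, k):
    """Integral of f with respect to mu_k = (1/k) * (sum of Dirac measures at i/k)."""
    return sum(f(i / k) for i in range(1, k + 1)) / k

def mu_k_of_rationals(k):
    """mu_k(Q ∩ [0, 1]): every atom i/k is rational and carries mass 1/k."""
    atoms = [Fraction(i, k) for i in range(1, k + 1)]
    return sum(Fraction(1, k) for _ in atoms)

# Integral of cos over [0, 1] with respect to the Lebesgue measure.
exact = math.sin(1.0)

# Errors of the integrals against mu_k, and the masses mu_k(A), for growing k.
errors = [abs(integrate_against_mu_k(math.cos, k) - exact) for k in (10, 100, 1000)]
masses = [mu_k_of_rationals(k) for k in (10, 100, 1000)]
```

The errors shrink as $k$ grows, in line with the convergence of integrals of continuous functions, whereas the masses of $A$ stay equal to 1 and do not approach $\lambda(A) = 0$.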
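17.4. Theorem. Let $P$ be a compact space and $\mu_n, \mu \in \mathcal{M}^+(P)$, $\mu_n \xrightarrow{v} \mu$.
(a) For any lower semicontinuous lower bounded function $u$ on $P$ we have
$$\int_P u\,d\mu \le \liminf_{n} \int_P u\,d\mu_n.$$
(b) If $f$ is a bounded Borel function on $P$ which is continuous $\mu$-almost everywhere, then
$$\int_P f\,d\mu = \lim_{n} \int_P f\,d\mu_n.$$
(c) If $G \subset P$ is an open set, then $\liminf \mu_n(G) \ge \mu(G)$.
(d) If $K \subset P$ is a compact set, then $\limsup \mu_n(K) \le \mu(K)$.
(e) If $A$ is a Borel set such that $\mu(\partial A) = 0$, then $\mu_n(A) \to \mu(A)$.
Proof. (a) is an easy consequence of the definition.
(b) Define $\underline{f}$, $\overline{f}$ as in 7.9.b. If $\underline{f} = \overline{f}$ $\mu$-almost everywhere, then $f$ is measurable and
$$\begin{aligned}
\int_P f\,d\mu = \int_P \underline{f}\,d\mu &\le \liminf_{n} \int_P \underline{f}\,d\mu_n \le \liminf_{n} \int_P f\,d\mu_n \\
&\le \limsup_{n} \int_P f\,d\mu_n \le \limsup_{n} \int_P \overline{f}\,d\mu_n \le \int_P \overline{f}\,d\mu = \int_P f\,d\mu.
\end{aligned}$$
To prove (c), (d) and (e) we can apply (a) and (b) to indicator functions.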
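The inequalities in (c) and (d) can be strict. A minimal Python sketch, assuming $P = [0,1]$, $\mu_n = \varepsilon_{1/n}$ and $\mu = \varepsilon_0$ (so that $\mu_n \xrightarrow{v} \mu$ by Example 17.3.a), checks this on the open set $G = (0,1)$ and the compact set $K = \{0\}$. The representation of a set by its membership test and all names below are illustrative choices, not notation from the text.

```python
def dirac(point):
    """Dirac measure at `point`, acting on a set given by its membership test."""
    return lambda contains: 1.0 if contains(point) else 0.0

def in_G(x):
    """Membership in the open set G = (0, 1) of P = [0, 1]."""
    return 0.0 < x < 1.0

def in_K(x):
    """Membership in the compact set K = {0}."""
    return x == 0.0

mu = dirac(0.0)                                  # the vague limit
mu_n = [dirac(1.0 / n) for n in range(2, 200)]   # Dirac measures at 1/n

G_masses = [m(in_G) for m in mu_n]   # every point 1/n, n >= 2, lies in G
K_masses = [m(in_K) for m in mu_n]   # no point 1/n lies in K
```

Here $\mu_n(G) = 1 > 0 = \mu(G)$ and $\mu_n(K) = 0 < 1 = \mu(K)$ for all $n$. Part (e) does not apply to $G$ or $K$, because $\mu(\partial G) = \mu(\partial K) = 1$.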
17.5. Remark. Compare Theorem 17.4 with the well-known result that a bounded function $f$ is Riemann integrable if and only if the set of all points of discontinuity of $f$ is of measure zero (see 7.9.d). It is possible to prove that a bounded Borel function $f$ is Riemann integrable on $[0, 1]$ if and only if $\int_{[0,1]} f\,d\mu_n \to \int_0^1 f\,d\lambda$ for every sequence $\mu_n$ of positive measures converging weakly to $\lambda$ on $[0, 1]$.
17.6. Exercise. Suppose $\mu_n, \mu \in \mathcal{M}^+(P)$. Show that $\mu_n \xrightarrow{v} \mu$ if and only if $\limsup_n \mu_n(K) \le \mu(K)$ for each compact set $K \subset P$ and $\liminf_n \mu_n(G) \ge \mu(G)$ for each open set $G \subset P$.

17.7. Molecular Measures. A Radon measure $\mu$ is called molecular if
$$\mu = \sum_{i=1}^{k} \alpha_i \varepsilon_{x_i}, \quad \text{where } x_1, \dots, x_k \in P \text{ and } \alpha_1, \dots, \alpha_k \text{ are positive and } \sum_{i=1}^{k} \alpha_i = 1.$$
As we have seen in Example 17.3.d, the definition of the Riemann integral is closely related to an approximation of the (continuous) Lebesgue measure by discrete molecular measures. In a similar way, more general measures can be approximated, as the next theorem shows.
17.8. Theorem. Let $\mu$ be a (positive) Radon measure on a compact metric space $P$. Then there exists a sequence $\{\mu_n\}$ of molecular measures such that $\mu_n \xrightarrow{v} \mu$.


Proof. We sketch the main idea of the proof. Thanks to the compactness of $P$, for every $k \in \mathbb{N}$ there is a finite collection $\{M_k^i\}$ of pairwise disjoint nonempty sets so that $\bigcup_i M_k^i = P$ and $\operatorname{diam} M_k^i \le 2^{-k}$. Choose points $x_k^i \in M_k^i$ and set
$$\mu_k = \sum_i \mu(M_k^i)\,\varepsilon_{x_k^i}.$$
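The construction in this proof can be tried out numerically. The sketch below is an illustration under stated assumptions, not the authors' construction in general: it takes $P = [0,1]$, the probability measure $\mu$ with density $2x$ (so $\mu[0,t] = t^2$), the dyadic cells $[i2^{-k}, (i+1)2^{-k})$ as the sets $M_k^i$, and their left endpoints as the points $x_k^i$. The function names and the test function $f(x) = x^2$ are hypothetical.

```python
def molecular_approximation(cdf, k):
    """Atoms (point, weight) of mu_k built from the 2**k dyadic cells of [0, 1].

    Cell i is [i / 2**k, (i + 1) / 2**k), of diameter 2**-k; its weight is
    mu(cell) = cdf(right end) - cdf(left end), valid for a measure without atoms.
    """
    n = 2 ** k
    atoms = []
    for i in range(n):
        a, b = i / n, (i + 1) / n
        weight = cdf(b) - cdf(a)
        if weight > 0:
            atoms.append((a, weight))   # the chosen point is the left endpoint
    return atoms

def integrate(f, atoms):
    """Integral of f against the discrete measure sum of weight * Dirac(point)."""
    return sum(w * f(x) for x, w in atoms)

cdf = lambda t: t * t      # mu[0, t] = t**2, i.e. density 2x, total mass 1
f = lambda x: x * x        # continuous test function
exact = 0.5                # integral of x**2 * 2x over [0, 1]

errors = [abs(integrate(f, molecular_approximation(cdf, k)) - exact) for k in (2, 5, 10)]
total_mass = sum(w for _, w in molecular_approximation(cdf, 10))
```

Since $f$ varies by at most $2 \cdot 2^{-k}$ on each cell, the error for level $k$ is at most $2^{1-k}$, and the total mass of each $\mu_k$ equals $\mu(P) = 1$.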

17.9. Remark. The previous theorem can be proved as a nice application of the Krein-Milman theorem or of the bipolar theorem. However, both of them belong to the deeper theorems of functional analysis.
The following theorem also has its interpretation in the language of functional analysis: On dual spaces bounded sets are relatively sequentially compact in the weak* topology.

17.10. Theorem. Let $P$ be a compact metric space and $\{\mu_n\}$ a sequence of complex Radon measures on $P$. If $\sup_n \|\mu_n\| < +\infty$, then there exists a weakly convergent subsequence of $\{\mu_n\}$.
Proof. The method of the proof uses the Cantor diagonal selection process. Since $\mathcal{C}(P)$ is separable, there is a countable dense set $\{f_k\} \subset \mathcal{C}(P)$. Now step by step construct a sequence of sequences of measures $\{\{\mu_n^k\}_n\}_k$ so that $\mu_n^0 = \mu_n$ and every $\{\mu_n^k\}_n$ ($k \ge 1$) is a subsequence of $\{\mu_n^{k-1}\}_n$ for which the sequence of numbers $\{\mu_n^k(f_k)\}_n$ is convergent. This can be done by the Bolzano-Weierstrass theorem since the sequence $\{\mu_n^{k-1}(f_k)\}_n$ of (complex) numbers is bounded. If $\nu_n := \mu_n^n$, then $\{\nu_n\}$ is a subsequence of $\{\mu_n\}$ and since for every $k$ the sequence $\{\nu_n\}_{n \ge k}$ is a subsequence of $\{\mu_n^k\}_n$, the sequences $\{\nu_n(f_k)\}_n$ are convergent. Choose $f \in \mathcal{C}(P)$ and $\varepsilon > 0$. Find $g \in \{f_k\}$ such that $\|f - g\| < \varepsilon$. There is an $n_0$ such that $|\nu_i(g) - \nu_j(g)| < \varepsilon$ for all $i, j \ge n_0$. Then by the triangle inequality
$$|\nu_i(f) - \nu_j(f)| \le \varepsilon\,(1 + 2 \sup_n \|\mu_n\|)$$
for all $i, j \ge n_0$. Hence $\{\nu_n(f)\}$ is a Cauchy sequence and therefore it is convergent. Define a functional $\nu$ on $\mathcal{C}(P)$ as
$$\nu(f) = \lim_{n} \nu_n(f).$$
It remains only to show that $\nu$ is a Radon integral on $P$ and is a weak limit of $\{\nu_n\}$, which is easy.
17.11. Exercise. Let $P$ be a compact metric space and $\mu_n \xrightarrow{v} \mu$. Show that
$$\|\mu\| \le \liminf \|\mu_n\|$$
(i.e. the norm is a weakly lower semicontinuous function on $\mathcal{M}(P)$).

17.12. Notes. The theory of weak convergence of probability measures has its roots in probability theory and mathematical statistics (see e.g. P. Billingsley [*1968]) and led, of course, to a study of weak topologies on various subspaces of Radon measures. A similar notion of the vague convergence was probably first studied by the Bourbakists (see the second edition of their monograph). Vague convergence is again nothing else than the weak convergence on the space of measures determined by the inductive topology on $\mathcal{C}_c(P)$. Theorem 17.10 is a special case of the Alaoglu-Bourbaki theorem.

18. Luzin's Theorem

If a measure space is endowed with a metric or a topological structure, a natural question arises whether there is a closer relation between measurability and continuity. The following theorems show that for complete Radon measures on locally compact topological spaces this is the case.
18.1. Luzin's Theorem. For a complete Radon measure $\mu$ on a locally compact space $P$ and a $\mu$-almost everywhere finite function $f$ on $P$, the following conditions are equivalent:
(i) $f$ is $\mu$-measurable;
(ii) for any $\varepsilon > 0$ and any compact set $K \subset P$ there is an open set $G$ so that $\mu G < \varepsilon$ and $f|_{K \setminus G}$ is continuous;
(iii) for any $\varepsilon > 0$ and any compact set $K \subset P$ there exists a continuous function $\varphi$ on $P$ such that
$$\mu\{x \in K : f(x) \ne \varphi(x)\} < \varepsilon;$$
(iv) for every compact set $K \subset P$ there exists a sequence $\{\varphi_n\}$ of continuous functions on $P$ such that
$$\varphi_n \to f \quad \mu\text{-almost everywhere on } K.$$
Proof. (i) $\Rightarrow$ (ii): Let $\{U_j\}$ be a countable base for the topology on $\mathbb{R}$ (for instance, a sequence of all intervals with rational endpoints). Fix an $\varepsilon > 0$ and a compact set $K \subset P$. There exist open sets $G_j \subset P$ and compact sets $F_j \subset P$ such that
$$F_j \subset K \cap f^{-1}(U_j) \subset G_j \quad \text{and} \quad \mu(G_j \setminus F_j) < \varepsilon 2^{-j}.$$
Set $G = \bigcup_j (G_j \setminus F_j)$. Clearly $G$ is open and $\mu G < \varepsilon$. Denote $Y = K \setminus G$ and fix an index $j$. Then $Y \cap G_j = Y \cap F_j$. Thus $Y \cap f^{-1}(U_j) = Y \cap G_j$ is an open subset of $Y$ and we see that $f|_Y$ is continuous.
(ii) $\Rightarrow$ (iii): Select again $\varepsilon > 0$. Find an open set $G \subset P$ such that $\mu G < \varepsilon$ and $f|_{K \setminus G}$ is continuous. By Tietze's extension theorem, there exists a continuous function $\varphi$ on $P$ such that $f = \varphi$ on $K \setminus G$.
(iii) $\Rightarrow$ (iv): Find continuous functions $\varphi_k$ on $P$ so that $\mu E_k < 2^{-k}$ where $E_k = \{x \in K : f(x) \ne \varphi_k(x)\}$. If
$$E := \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} E_k \quad (= \limsup E_k),$$
then $\mu E = 0$ (cf. the proof of the Borel-Cantelli lemma 2.14). If $x \in K \setminus E$, then there exists $n_0$ such that $x \notin E_n$ for $n \ge n_0$ and $\varphi_n(x) \to f(x)$.
The implication (iv) $\Rightarrow$ (i) is obvious.
For locally compact spaces which can be expressed as countable unions of
compact sets we get the following corollary.


18.2. Luzin's Theorem. Let $\mu$ be a complete Radon measure on a $\sigma$-compact locally compact space $P$ and $f$ a $\mu$-almost everywhere finite function on $P$. Then the following conditions are equivalent:
(i) $f$ is $\mu$-measurable;
(ii) for any $\varepsilon > 0$ there exists an open set $G$ so that $\mu G < \varepsilon$ and $f|_{P \setminus G}$ is continuous;
(iii) for any $\varepsilon > 0$ there exist a continuous function $\varphi$ on $P$ and an open set $G$ such that $\mu G < \varepsilon$ and $\varphi = f$ on $P \setminus G$;
(iv) there exists a sequence $\{\varphi_n\}$ of continuous functions on $P$ such that
$$\varphi_n \to f \quad \mu\text{-almost everywhere on } P.$$
Proof. This theorem is an easy consequence of the previous one.


18.3. Remarks. 1. Only the equivalence (i) $\Leftrightarrow$ (ii) is usually called Luzin's theorem.
2. Notice that a measurable function can be discontinuous at all points(!), consider e.g. the Dirichlet function on $\mathbb{R}$. Luzin's theorem says that the restricted function is continuous when omitting a small set. A characterization of functions which are continuous at almost all points of the interval $[0, 1]$ was described in 7.9 and 17.5.
3. In general, it is not true that a measurable function is continuous omitting a null set. Consider, for example, the indicator function of a discontinuum of a positive measure (see 1.13).
4. Another interesting characterization of measurable functions for the case of the Lebesgue measure in $\mathbb{R}^n$ is given by Denjoy's theorem 29.9.
5. By Theorem 18.2, every Lebesgue measurable function on $\mathbb{R}$ is a limit of a sequence of continuous functions in the sense of the convergence almost everywhere. The collection of functions which are pointwise limits of continuous functions (so called functions of the Baire class one) is not very wide. For example, the Dirichlet function (see 7.5) does not belong to it.
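For the Dirichlet function of Remark 2 the small open set in Luzin's theorem can be written down explicitly: cover the $j$-th rational by an open interval of length $\varepsilon 2^{-(j+1)}$; off the union $G$ of these intervals the function vanishes identically, hence is continuous there. The Python sketch below is only an illustration under assumptions: it works on $[0,1]$, enumerates finitely many rationals, and uses hypothetical helper names. The bound $\sum_j \varepsilon 2^{-(j+1)} \le \varepsilon$ is what carries over to the full countable enumeration.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """Rationals p/q in [0, 1] with q <= max_denominator, listed without repeats."""
    seen, points = set(), []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                points.append(r)
    return points

def covering_intervals(points, eps):
    """Open interval of length eps / 2**(j + 1) centred at the j-th point (j = 0, 1, ...)."""
    return [(r - eps / 2 ** (j + 2), r + eps / 2 ** (j + 2))
            for j, r in enumerate(points)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
intervals = covering_intervals(points, eps)

# The measure of the union G is at most the sum of the lengths, which stays below eps.
total_length = sum(b - a for a, b in intervals)
```

Exact rational arithmetic is used so that the comparison of the total length with $\varepsilon$ involves no rounding.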
18.4. Exercise. Show that a $\mu$-almost everywhere finite function $f$ on $P$ is $\mu$-measurable if and only if for any compact set $K \subset P$ there exist compact sets $K_n \subset K$ and a $\mu$-null set $E$ such that $K = E \cup \bigcup_n K_n$ and the functions $f|_{K_n}$ are continuous.
18.5. Exercise. Give another proof of the assertion that $\mathcal{C}_c(P)$ is dense in $\mathcal{L}^p(\mu)$ (cf. Exercise 15.17) with the help of Luzin's theorem.
Hint. First approximate a function from $\mathcal{L}^p(\mu)$ by a bounded measurable function with compact support and then use Luzin's theorem.

18.6. Notes. Luzin's theorem for the case of the Lebesgue measure was proved by H. Lebesgue [1903] and by N.N. Luzin [1912].

19. Measures on Topological Groups

19.1. Special Case. One of the fundamental properties of the Lebesgue measure $\lambda$ on the real line is its translation invariance: if $x \in \mathbb{R}$ and $A \subset \mathbb{R}$ is measurable, then $\lambda(x + A) = \lambda A$. The Lebesgue measure is in fact uniquely determined by this property. Indeed, the following theorem is true (cf. Exercise 26.6):
Let $\mu$ be a Radon measure on $\mathbb{R}$. If $\mu([0, 1]) = 1$ and $\mu(x + A) = \mu A$ for every $x \in \mathbb{R}$ and $A \in \mathcal{B}(\mathbb{R})$, then $\mu = \lambda$ on $\mathcal{B}(\mathbb{R})$.


Most of this chapter is devoted to a more general problem. Since the subject belongs to the elements of harmonic analysis and the reader can consult many textbooks (e.g. G. Bachman [*1964] or K. Ross [*1963]), we merely outline the main ideas and invite the reader to fill in the details.
19.2. Topological Group. Let us start with basic notions. By a topological group we understand a group $G$ together with a (Hausdorff) topology such that the group operations $(x, y) \mapsto xy$ and $x \mapsto x^{-1}$ are continuous.
In the sequel, $G$ stands for a topological group whose topology is locally compact. The unit element of $G$ will be denoted by $e$ and the $\sigma$-algebra of all Borel subsets of $G$ by $\mathcal{B}(G)$. Finally, by $\mathcal{S}$ we will denote $\sigma$-algebras which appear as domains of the measures under consideration.
For $x \in G$ and $A \subset G$, set
$$xA = \{xy : y \in A\}, \qquad Ax = \{yx : y \in A\}, \qquad A^{-1} = \{x^{-1} : x \in A\}.$$

19.3. Haar Measure. A left Haar measure on $G$ is every nonzero Radon measure $\mu$ on $(G, \mathcal{S})$ which satisfies $\mu(xA) = \mu A$ for each $x \in G$ and $A \in \mathcal{S}$. In a similar way we define a right Haar measure. A measure which is simultaneously both a left and a right Haar measure is called briefly a Haar measure.
19.4. Examples. (a) The Lebesgue measure is a (typical) example of a Haar measure on $\mathbb{R}^n$ (the group operation is the addition).
(b) The counting measure is a Haar measure on every group equipped with the discrete topology.
(c) Let $G = (0, +\infty)$ be the multiplicative group of positive real numbers endowed with the Euclidean topology. If
$$\mu A := \int_A \frac{dx}{x}, \quad A \text{ Borel},$$
then $\mu$ is a Haar measure on $G$.
(d) Let $G = \mathbb{C} \setminus \{0\}$ be the multiplicative group of all nonzero complex numbers (with the usual topology). If
$$\mu A = \int_A \frac{1}{|z|^2}\,d\lambda(z)$$
for $A \in \mathcal{B}(G)$, then $\mu$ is a Haar measure on $G$.
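Example (c) can be checked directly on intervals, since $\mu[a, b] = \int_a^b dx/x = \log(b/a)$. The Python sketch below is a small illustration; the function names, the sample interval and the scaling factors are chosen here and are not taken from the text. It compares $\mu$ with the Lebesgue measure under the group translation $A \mapsto cA$.

```python
import math

def haar_measure(a, b):
    """mu([a, b]) = integral of dx / x over [a, b] = log(b / a), for 0 < a <= b."""
    return math.log(b / a)

def lebesgue_measure(a, b):
    """Length of the interval [a, b]."""
    return b - a

def dilate(c, interval):
    """Image of the interval under the group translation x -> c * x, c > 0."""
    a, b = interval
    return (c * a, c * b)

interval = (0.5, 3.0)
scales = (0.1, 2.0, 7.5)

haar_values = [haar_measure(*dilate(c, interval)) for c in scales]
lebesgue_values = [lebesgue_measure(*dilate(c, interval)) for c in scales]
```

All entries of `haar_values` agree with $\log 6$ up to rounding, whereas the Lebesgue measure of $cA$ scales with $c$ and so is not invariant on this group.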

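19.5. Example. Let $G$ be the multiplicative group of all $2 \times 2$-matrices of type
$$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$$
where $a \in (0, +\infty)$ and $b \in \mathbb{R}$. There exists a one-to-one mapping of $G$ onto $(0, +\infty) \times \mathbb{R}$ given by
$$F : \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \mapsto (a, b).$$
Consider on $G$ the (locally compact) topology determined by $F$ from $\mathbb{R}^2$. For $A \in \mathcal{B}(G)$, set
$$\mu A = \int_A \frac{1}{x^2}\,dx\,dy, \qquad \nu A = \int_A \frac{1}{x}\,dx\,dy.$$
(a) Show that $\mu$ is a left Haar measure and $\nu$ a right Haar measure on $G$.
(b) Show that $\nu A = \mu(A^{-1})$.
(c) Find a set $A \in \mathcal{B}(G)$ such that $\mu A < \infty$ and $\nu A = \infty$.
(The group $G$ can be viewed as the group of all affine transformations of $\mathbb{R}$ onto $\mathbb{R}$ of the form $t \mapsto at + b$, $a > 0$, $b \in \mathbb{R}$.)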
19.6. Remark. If $\mu$ is a left Haar measure on $G$ and $\tilde{\mu}$ is defined as $\tilde{\mu} E := \mu(E^{-1})$ (again $E^{-1} \in \mathcal{B}(G)$ if $E \in \mathcal{B}(G)$), then $\tilde{\mu}$ is a right Haar measure on $G$. In the same way, any right Haar measure determines a left one.
In what follows, we restrict ourselves to a study of left Haar measures. Fundamental properties are contained in the following theorem.

19.7. Theorem (existence and uniqueness of left Haar measure). (a) On any locally compact group there exists a left Haar measure.
(b) If $\mu$ and $\nu$ are complete left Haar measures on $G$, then there exists $c > 0$ such that $\mu = c\nu$.
Proof of this theorem is quite complicated and takes a lot of effort. There are various existence proofs, some of them as applications of deep theorems of functional analysis. Here we outline a rough idea of an elementary existence proof.
Let $V$ be an open neighbourhood of the unit element $e$ of $G$. If $E \subset G$ is a compact set, denote by $H_V(E)$ the smallest $n \in \mathbb{N}$ such that there exist $x_1, \dots, x_n \in G$ so that $E \subset \bigcup_{j=1}^{n} x_j V$ (existence of a finite cover follows from the compactness of $E$). Let $K$ be a fixed compact set with a nonempty interior. The required Haar measure is then obtained by some limit procedure for $V \to \{e\}$ and by extending the set functions
$$E \mapsto \frac{H_V(E)}{H_V(K)}, \quad E \text{ compact}.$$
A proper and complete description of the entire construction is not easy.
The proof of uniqueness will be given under the additional assumption that $\mu$ is even a Haar measure. The general case is similar, but the technical details are more difficult. First, there exists a nonnegative function $h \in \mathcal{C}_c(G)$ so that $h(e) \ne 0$ and $h(x) = h(x^{-1})$ for $x \in G$ (if $g \in \mathcal{C}_c(G)$, $g(e) \ne 0$, $g \ge 0$, set $h(x) = g(x) + g(x^{-1})$). Then $\int_G h\,d\nu > 0$ (see Exercise 19.14.b) and for an arbitrary function $f \in \mathcal{C}_c(G)$ we get
$$\begin{aligned}
\int_G h\,d\nu \int_G f\,d\mu &= \int_G h(y) \Big( \int_G f(xy)\,d\mu(x) \Big)\,d\nu(y) \\
&= \int_{G \times G} h(y) f(xy)\,d(\mu \otimes \nu)(x, y) \\
&= \int_{G \times G} h(x^{-1}z) f(z)\,d(\mu \otimes \nu)(x, z) \\
&= \int_G \Big( \int_G h(x^{-1}z) f(z)\,d\nu(z) \Big)\,d\mu(x) \\
&= \int_G f(z) \Big( \int_G h(z^{-1}x)\,d\mu(x) \Big)\,d\nu(z) = \int_G h\,d\mu \int_G f\,d\nu.
\end{aligned}$$


To justify the particular steps when using Fubini's theorem, notice that $f$ and $h$ have compact supports and that $\mu$ and $\nu$ are Radon measures. We can finish the proof by setting
$$c := \frac{\int_G h\,d\mu}{\int_G h\,d\nu}.$$

Haar measures have many interesting properties. Let us state some of them.
19.8. Theorem. Let $\mu$ be a left Haar measure on a locally compact group $G$. Then:
(a) $\mu U > 0$ for any nonempty open set $U \subset G$;
(b) $\mu G < +\infty$ if and only if $G$ is compact.
Proof. (a) Let $U \subset G$ be a nonempty open set. We can assume $e \in U$. There exists a compact set $K \subset G$ with $\mu K > 0$. Then $K \subset \bigcup_{x \in K} xU$ and the compactness of $K$ yields the existence of $x_1, \dots, x_n \in K$ such that $K \subset \bigcup_{i=1}^{n} x_i U$. Then
$$0 < \mu K \le \sum_{i=1}^{n} \mu(x_i U) = n\,\mu(U).$$
(b) Suppose $\mu G < +\infty$. Let $K \subset G$ be a compact set of positive measure. There exist $x_1, \dots, x_n \in G$ so that the sets $x_i K$, $i = 1, \dots, n$, are pairwise disjoint but all (remaining) $x \in G$ satisfy $xK \cap \big(\bigcup_{i=1}^{n} x_i K\big) \ne \emptyset$. (Indeed, consider finite sequences $x_1, \dots, x_n$ for which $x_1 K, \dots, x_n K$ are pairwise disjoint. Their number $n$ is bounded, for instance by $\mu G / \mu K$.) If $x \in G$, then $xK \cap \big(\bigcup_{i=1}^{n} x_i K\big) \ne \emptyset$ and it follows that $G = \big(\bigcup_{i=1}^{n} x_i K\big) K^{-1}$ is compact. Conversely, if $G$ is compact, then $\mu G < +\infty$ because a Radon measure is finite on compact sets.

19.9. Modular Function. Let $\mu$ be a left Haar measure on a locally compact group $G$. If $x \in G$ and $\mu_x A := \mu(Ax)$ for $A \in \mathcal{B}(G)$, then $\mu_x$ is obviously a left Haar measure. By the uniqueness part of Theorem 19.7 there exists $\Delta(x) > 0$ such that $\mu_x = \Delta(x)\mu$. Again, by uniqueness, $\Delta(x)$ does not depend on the choice of a left Haar measure on $G$. The function $\Delta : x \mapsto \Delta(x) : G \to (0, +\infty)$ is called the modular function of $G$.
19.10. Theorem. Let $\Delta$ be the modular function of $G$. The following conditions are equivalent:
(i) every left Haar measure on $G$ is also a right Haar measure;
(ii) $\Delta = 1$ on $G$.
Proof is easy and is omitted.
19.11. Unimodular Group. A locally compact group $G$ is said to be unimodular if the modular function $\Delta = 1$ on $G$. In other words, $G$ is unimodular if the classes of left and right Haar measures on $G$ coincide. Every commutative group is unimodular. However, there exist examples of noncommutative unimodular groups as indicated in the next theorem.


19.12. Theorem. Any commutative, discrete or compact group is unimodular.
Proof. If $G$ is discrete, then every left Haar measure is a multiple of the counting measure. Hence, it is also right invariant. If $\mu$ is a left Haar measure on a compact group $G$ and $x \in G$, then $G = Gx$ and $\mu G = \mu(Gx) = \Delta(x)\mu G$. Since $0 < \mu G < \infty$, we obtain $\Delta(x) = 1$.
If $G$ is a compact group, then each left Haar measure on $G$ is a Haar measure and we get immediately the following theorem.
19.13. Theorem. On every compact topological group $G$ there exists a unique complete Haar measure $\mu$ satisfying $\mu G = 1$. Moreover, $\mu E = \mu E^{-1}$ for every $E \in \mathcal{B}(G)$.
19.14. Exercise. Let $\mu$ be a left Haar measure. Prove the following assertions.
(a) If $f \in \mathcal{C}_c(G)$ and $y \in G$, then $\int_G f\,d\mu = \int_G f(yx)\,d\mu(x)$. The last equality also holds if $f$ is a nonnegative $\mu$-measurable function on $G$ or if $f \in \mathcal{L}^1(\mu)$.
(b) If $f \in \mathcal{C}_c(G)$ is nonnegative and $f(e) > 0$, then $\int_G f\,d\mu > 0$.
(c) The measure $\mu$ is $\sigma$-finite if and only if $G$ is $\sigma$-compact.
(d) The topology on $G$ is discrete if and only if $\mu\{x\} \ne 0$ for some (and thus for all) $x \in G$.
19.15. Exercise. Calculate the modular function of the group from Example 19.5.
19.16. Exercise. Let $\Delta$ be the modular function on a locally compact group $G$. Show that:
(a) $\Delta$ is continuous;
(b) $\Delta(xy) = \Delta(x)\Delta(y)$ for all $x, y \in G$;
(c) $\Delta(e) = 1$.
19.17. Exercise. Let $\mu$ be a left Haar measure on $G$. Show that the set function $\tilde{\mu}$, where $\tilde{\mu} E := \mu E^{-1}$, is a right Haar measure and
$$\int_G f(x)\Delta(x^{-1})\,d\mu = \int_G f\,d\tilde{\mu}$$
for every $f \in \mathcal{C}_c(G)$. In other words, the Radon-Nikodým derivative $\dfrac{d\tilde{\mu}}{d\mu}$ equals $\Delta(x^{-1})$.
Hint. Show that the function $\nu$ on $\mathcal{B}(G)$ defined by $\nu A := \int_A \Delta(x^{-1})\,d\mu$ is a right Haar measure. Thus $\nu = K\tilde{\mu}$ for some $K$. Now use the fact that $\Delta$ is a continuous function attaining the value 1 at the unit element $e$.
19.18. Exercise. Show that the left and right Haar measures on $G$ are mutually absolutely continuous.
Hint. Use Exercise 19.17.

19.19. Convolution of Functions. In the following exercises, let $\mu$ be a fixed right Haar measure on $(G, \mathcal{B}(G))$. If $f, g \in \mathcal{L}^1(\mu)$ and $x \in G$, define the convolution of $f$ and $g$ at $x$ as
$$f * g\,(x) = \int_G f(xy^{-1}) g(y)\,d\mu(y)$$
provided the integral on the right-hand side exists.
19.20. Exercise. If $f, g \in \mathcal{L}^1(\mu)$, then the convolution $f * g$ is defined $\mu$-almost everywhere, $f * g \in \mathcal{L}^1(\mu)$ and $\|f * g\|_1 \le \|f\|_1 \|g\|_1$. Thus, $L^1(\mu)$ with convolution as multiplication is a Banach algebra.
19.21. Exercise. Show that convolution is commutative if and only if $G$ is commutative.


19.22. Exercise. Show that the Banach algebra $(L^1(\mu), *)$ has a multiplicative unit if and only if $G$ is endowed with the discrete topology. As an example consider the group $\mathbb{Z}$ of all integers with the discrete topology and counting measure.
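On a finite group with the discrete topology and the counting measure, the convolution of 19.19 becomes the finite sum $f * g\,(x) = \sum_{y} f(xy^{-1}) g(y)$, so the statements of Exercises 19.20 to 19.22 can at least be tested on examples. The Python sketch below does this for the cyclic group $\mathbb{Z}_5$ and the symmetric group $S_3$. The choice of groups, the sample functions and all helper names are assumptions made for illustration; the computation illustrates the exercises and does not prove them.

```python
from itertools import permutations

def convolve(f, g, elements, mul, inv):
    """(f * g)(x) = sum over y of f(x y^-1) g(y), counting measure on a finite group."""
    return {x: sum(f[mul(x, inv(y))] * g[y] for y in elements) for x in elements}

def l1_norm(f):
    """L^1 norm with respect to the counting measure."""
    return sum(abs(v) for v in f.values())

def indicator(elements, a):
    """Indicator function of the one-point set {a}."""
    return {x: 1 if x == a else 0 for x in elements}

# Cyclic group Z_5 (commutative), written additively.
Z5 = list(range(5))
add = lambda a, b: (a + b) % 5
neg = lambda a: (-a) % 5

f = {0: 1, 1: 2, 2: 0, 3: -1, 4: 3}
g = {0: 2, 1: 0, 2: 1, 3: 1, 4: -2}
fg = convolve(f, g, Z5, add, neg)
gf = convolve(g, f, Z5, add, neg)
unit = indicator(Z5, 0)              # indicator of the unit element

# Symmetric group S_3 (noncommutative); a permutation p maps i to p[i].
S3 = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
inverse = lambda p: tuple(sorted(range(3), key=lambda i: p[i]))

a, b = (1, 0, 2), (0, 2, 1)          # two transpositions that do not commute
ab = convolve(indicator(S3, a), indicator(S3, b), S3, compose, inverse)
ba = convolve(indicator(S3, b), indicator(S3, a), S3, compose, inverse)
```

On $\mathbb{Z}_5$ the two products `fg` and `gf` coincide and the indicator of the unit element acts as a unit, while on $S_3$ the convolutions of the two indicators differ: they are the indicators of the products of $a$ and $b$ taken in the two possible orders.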
19.23. Involution. If $f^{*}(x) := f(x^{-1})(\Delta(x))^{-1}$, then the mapping $f \mapsto f^{*}$ of the space $L^1(\mu)$ onto itself (which is called the involution) is an isometry. (Show that $\int_G |f(x^{-1})| (\Delta(x))^{-1}\,d\mu = \int_G |f(x)|\,d\mu$ for $f \in \mathcal{C}_c(G)$; hence it follows that $\|f^{*}\| = \|f\|$ if $f \in \mathcal{L}^1(\mu)$.)

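19.24. Convolution of Measures. In the following, $G$ is again a locally compact topological group. By $\mathcal{M}_b(G)$ denote the set of all complex Radon measures on $(G, \mathcal{B}(G))$. According to 17.1, $\mathcal{M}_b(G)$ with the norm defined by $\|\nu\| := |\nu|(G)$ is a Banach space.
We define the convolution of measures $\nu_1, \nu_2 \in \mathcal{M}_b(G)$ as
$$\nu_1 * \nu_2\,(E) := \nu_1 \otimes \nu_2\,\{(x, y) \in G \times G : xy \in E\}$$
for any Borel set $E \subset G$, where $\nu_1 \otimes \nu_2$ is the product of the complex Radon measures $\nu_1$ and $\nu_2$, cf. 16.9.c. (Verify that $\{(x, y) \in G \times G : xy \in E\}$ is a Borel subset of $G \times G$!)
19.25. Exercise. Prove that $\nu_1 * \nu_2$ is a complex measure on $\mathcal{B}(G)$, $\nu_1 * \nu_2 \in \mathcal{M}_b(G)$ and that the inequality $\|\nu_1 * \nu_2\| \le \|\nu_1\| \|\nu_2\|$ holds.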

19.26. Exercise. Show that
(a) the operation of convolution of measures is associative;
(b) the convolution of measures on $G$ is commutative if and only if $G$ is commutative.
19.27. Exercise. Prove that if $e$ is the unit of $G$, then the Dirac measure $\varepsilon_e$ is the unit of $(\mathcal{M}_b(G), *)$.

19.28. Remark. The reader may find it interesting to investigate the relation between convolution of measures and convolution of functions. If $\nu_1, \nu_2 \in \mathcal{M}_b(G)$, then
$$\nu_1 * \nu_2\,(E) = \int_{G \times G} c_E(xy)\,d(\nu_1 \otimes \nu_2),$$
hence
$$\int_G h\,d(\nu_1 * \nu_2) = \int_{G \times G} h(xy)\,d(\nu_1 \otimes \nu_2)$$
for any bounded Borel function $h$ on $G$. Therefore, for $f, g \in \mathcal{L}^1(\mu)$ we have
$$f * g = \frac{d(\mu_f * \mu_g)}{d\mu},$$
where $\mu_f$ is the complex Radon measure given by $\mu_f(E) = \int_E f\,d\mu$. We see that the new definition of convolution of functions agrees with the former one.
19.29. Exercise. Give a definition of a convolution of a (complex) function $f \in \mathcal{L}^1(\mu)$ and a complex Radon measure $\nu \in \mathcal{M}_b(G)$. Prove that $f * \nu \in \mathcal{L}^1(\mu)$ and that $\|f * \nu\|_1 \le \|f\|_1 \|\nu\|$.
In fact, $L^1(\mu)$ is not only a subalgebra of $\mathcal{M}_b(G)$ but even an ideal.
19.30. Notes. The translation invariant measures (or integrals) on compact Lie groups were studied by F. Peter and H. Weyl [1927]. Remarkable progress was made by proving the existence of a left Haar measure for separable locally compact groups by A. Haar [1933] and by J. von Neumann in [1934] (the existence and uniqueness) for arbitrary compact groups, and in [1936] (uniqueness) for separable locally compact groups, and also by A. Weil, especially in [*1940]. Their proofs used the axiom of choice. H. Cartan [1940] and G.E. Bredon [1963] then gave proofs without using the axiom of choice. A relatively short proof of the uniqueness result can be found in S. Kakutani [1948]. More detailed historical notes are given by E. Hewitt and K.A. Ross in [*1963]. It is said that Example 19.5 is due to J. von Neumann [1936]. Note that Haar measures are special cases of invariant measures which can be studied in a more general setting, cf. Banach's appendix to Saks' monograph [*1937] or H. Federer [*1969].


D. Integration on R
20. Integral and Differentiation
In the following chapters we will investigate properties of real functions involving the Lebesgue measure and Lebesgue integration on the real line.
In this chapter, our aim is to show an inequality between the measure of $f(E)$ and the integral of $|f'|$ over $E$. As a particular case, we obtain Sard's lemma on the real line.
Let $K$ be a positive real number. We say that a real function $f$ is a $K$-Lipschitz function on a set $E \subset \mathbb{R}$ if the inequality
$$|f(x) - f(y)| \le K |x - y|$$
holds for all $x, y \in E$. If $f$ is a $K$-Lipschitz function on $E$ for some $K$, we say simply that $f$ is a Lipschitz function on $E$.
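20.1. Lemma. Let $f$ be a $K$-Lipschitz function on a set $E \subset \mathbb{R}$. Then $\lambda^* f(E) \le K \lambda^* E$.
Proof. The assertion is obvious in the case $\lambda^* E = +\infty$. If $\lambda^* E < +\infty$, select $\varepsilon > 0$ and find a sequence of open intervals $(a_j, b_j)$ with $E \subset \bigcup_j (a_j, b_j)$ and $\sum_j (b_j - a_j) \le \lambda^* E + \varepsilon$. By hypothesis there are intervals $[\alpha_j, \beta_j]$ such that $f(E \cap (a_j, b_j)) \subset [\alpha_j, \beta_j]$ and $\beta_j - \alpha_j \le K(b_j - a_j)$. Thus $f(E) \subset \bigcup_j [\alpha_j, \beta_j]$ and
$$\lambda^* f(E) \le \sum_j (\beta_j - \alpha_j) \le K \sum_j (b_j - a_j) \le K(\lambda^* E + \varepsilon).$$
Since the last inequality holds for all $\varepsilon > 0$, we are done.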
20.2. Lemma. Let $f$ be a real function on an interval $I$ and $E \subset I$. If $|f'| \le K$ on $E$ for some $K > 0$, then
$$\lambda^* f(E) \le K \lambda^* E.$$
Proof. Choose $K' > K$ and denote
$$E_k = \{x \in E : |f(x) - f(y)| \le K' |x - y| \text{ for all } y \in E \cap (x - \tfrac{1}{k}, x + \tfrac{1}{k})\}.$$
Then $E_1 \subset E_2 \subset \dots$ and $E = \bigcup_k E_k$. Let $J$ be an interval with length less than $\tfrac{1}{k}$. According to the previous lemma,
$$\lambda^* f(J \cap E_k) \le K' \lambda^*(J \cap E_k).$$
Dividing $I$ into such intervals we obtain $\lambda^* f(E_k) \le K' \lambda^*(E_k)$. An appeal to Exercise 4.7 reveals that $\lambda^* f(E) \le K' \lambda^*(E)$. Since $K' > K$ was arbitrary, we have the required inequality.
As a consequence of the preceding lemma we get a one-dimensional version of the well-known Sard's lemma. Compare it with the next Theorem 20.4.


20.3. Corollary. Let $f$ be a real function on an interval $I \subset \mathbb{R}$ and $E \subset I$. If $f' = 0$ on $E$, then $\lambda f(E) = 0$.
20.4. Theorem. Let $f$ be a real-valued function on an interval $I$ and $E \subset I$ a
measurable set. Suppose that a finite derivative $f'(x)$ exists at each point $x \in E$.
Then $f'$ is measurable on $E$ and
$$\lambda^* f(E) \le \int_E |f'|.$$
Proof. We first prove the measurability of $f'$ on $E$. Choose $c \in \mathbb{R}$. For each
$k, m \in \mathbb{N}$, the set
$$G_{m,k} := \{x \in I : \text{there exist } y, z \in I \text{ such that } x - \tfrac1k < y < x < z < x + \tfrac1k
\text{ and } f(z) - f(y) > (c + \tfrac1m)(z - y)\}$$
is open. Hence the set
$$\{x \in E : f'(x) > c\} = E \cap \bigcup_m \bigcap_k G_{m,k}$$
is measurable.
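To prove the required inequality, it is no restriction to assume that $E$ is
bounded. For an $\varepsilon > 0$ denote
$$E_k = \{x \in E : (k - 1)\varepsilon \le |f'(x)| < k\varepsilon\}, \qquad k \in \mathbb{N}.$$
Then $E_k$ are pairwise disjoint measurable sets, $E = \bigcup_k E_k$ and an appeal to Lemma
20.2 yields
$$\lambda^* f(E) \le \sum_k \lambda^* f(E_k) \le \sum_k k\varepsilon\,\lambda E_k
= \sum_k \bigl((k - 1)\varepsilon\,\lambda E_k + \varepsilon\,\lambda E_k\bigr)
\le \sum_k \Bigl(\int_{E_k} |f'| + \varepsilon\,\lambda E_k\Bigr) = \int_E |f'| + \varepsilon\,\lambda E.$$
Since $\lambda E < +\infty$ and $\varepsilon > 0$ was arbitrary, the inequality follows.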
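The inequality of Theorem 20.4 can be strict when $f$ is not injective. The following is only a rough numerical sketch; the choice $f = \sin$, $E = [0, 2\pi]$, the grid size and the helper names `image_length` and `integral_abs` are assumptions made for this illustration and do not come from the text.

```python
import math

def image_length(f, a, b, n=200_000):
    """Approximate the length of f([a, b]) for a continuous f as max f - min f on a grid."""
    values = [f(a + (b - a) * i / n) for i in range(n + 1)]
    return max(values) - min(values)

def integral_abs(g, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of |g| over [a, b]."""
    h = (b - a) / n
    return h * sum(abs(g(a + (i + 0.5) * h)) for i in range(n))

# f = sin on E = [0, 2*pi]: the image is [-1, 1], while |f'| = |cos|.
lhs = image_length(math.sin, 0.0, 2 * math.pi)   # approximates the measure of f(E)
rhs = integral_abs(math.cos, 0.0, 2 * math.pi)   # approximates the integral of |f'| over E
print(lhs, rhs)
```

Here the image is covered twice, which is why the left-hand side is about half of the right-hand side.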

20.5. Exercise. Under the assumptions of Theorem 20.4 prove that $f(E)$ is a measurable set.
Hint. Use the following Exercise 20.6 and the fact that $E = N \cup \bigcup_n K_n$ where $\lambda N = 0$ and $K_n$
are compact (Theorem 1.21).

20.6. Exercise. Let $f$ be a real-valued function on an interval $I \subset \mathbb{R}$ having a finite derivative
at each point of a (Lebesgue) null set $E \subset I$. Show that $\lambda f(E) = 0$.
Hint. Use either Theorem 20.4 or Lemma 20.2 realizing that $E = \bigcup_n \{x \in E : |f'(x)| < n\}$.


20.7. Luzin's (N)-property. We have seen in 3.18 that a continuous image of a (Lebesgue)
measurable set does not need to be measurable. On the other hand, if $f$ is differentiable and $E$
measurable, Exercise 20.5 yields that $f(E)$ is measurable. We now consider briefly the question
under what conditions images of measurable sets are again measurable.
We say that a real function $f$ defined on an interval $I \subset \mathbb{R}$ has Luzin's (N)-property if the
image $f(N)$ of any (Lebesgue) null set $N \subset I$ is again a null set.
(a) (Rademacher) Let $f$ be a continuous function on an interval $I$. Then the following
conditions are equivalent:
(i) $f$ has Luzin's (N)-property;
(ii) $f(M)$ is measurable whenever $M \subset I$ is measurable.
Hint. In light of Theorem 1.21 a measurable set is a countable union of compact sets and a null
set. For the converse, use Remark 1.9.2 which says that every set of positive measure contains
a nonmeasurable subset.
(b) Show that the assertion of (a) remains true if $f$ is only assumed to be measurable (use Luzin's
Theorem 18.2). Likewise, the definition of Luzin's (N)-property and other ideas can be generalized to the
case when $I$ is a measurable subset of $\mathbb{R}^n$.
(c) Let $f$ be a function having a finite derivative everywhere on an interval $I$. Then $f$ has
Luzin's (N)-property.
Hint. Exercise 20.6.
(d) Every Lipschitz (even locally Lipschitz) function on a measurable set $M$ has Luzin's
(N)-property.
20.8. Exercise. Let $f$ be an arbitrary function on an interval $I \subset \mathbb{R}$. If $D$ is the set of points
at which $f$ has a finite derivative, then $D$ is a Borel set and the function $x \mapsto f'(x)$ is a Borel
function on $D$.
20.9. Notes. Luzin's (N)-property was introduced by Luzin in [*1915]. Corollary 20.3, which
we called the one-dimensional version of Sard's lemma, is also often called Luzin's theorem.

21. Functions of Finite Variation and Absolutely Continuous Functions


In this chapter we discuss two classes of functions without using measure theory.
Important results concerning these functions will be introduced in the following
chapters.
21.1. Functions of Finite Variation. Let $f$ be a real-valued function on an
interval $I$. For any interval $[a, b] \subset I$ and any partition $D : a = x_0 < x_1 < \dots < x_m = b$
of $[a, b]$ denote
$$V_a^b(f, D) = \sum_{j=1}^{m} |f(x_j) - f(x_{j-1})|.$$
The extended real number
$$V_a^b f := \sup\{V_a^b(f, D) : D \text{ is a partition of } [a, b]\}$$
is called the variation of $f$ over $[a, b]$. If $V_a^b f < \infty$ for every interval $[a, b] \subset I$,
then $f$ is termed a function of finite variation. In this case there exists a function
$v$ on $I$ such that
$$v(b) - v(a) = V_a^b f$$


for any interval $[a, b] \subset I$. Such a function $v$ is determined up to an additive
constant and it is called an (indefinite) variation of $f$.
One can easily see that the set of all functions of finite variation on an interval
$I$ is a vector space.
If, in addition,
$$\sup\{V_a^b f : [a, b] \subset I\} < +\infty,$$
we say that $f$ is of bounded variation on $I$. For a compact interval $[a, b]$, the notions
of finite and bounded variation agree and they can be characterized simply by
$V_a^b f < \infty$.
In case of an arbitrary interval $I$ we could say that $f$ is of locally bounded
variation instead of finite variation. Indeed, a function $f$ is of finite variation
on $I$ if and only if it has bounded variation on each compact subinterval of $I$.
In a similar way, we are going to localize notions of absolute continuity and
integrability.
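The sums over a partition can be explored numerically. The sketch below is only an illustration under stated assumptions: the function $f(x) = x^3 - x$ on $[-1, 1]$, the uniform partitions and the helper names are chosen for this example and do not appear in the text. It shows that refining a partition can only increase the sum, and that fine partitions approach the variation, which for this $f$ equals $8/(3\sqrt{3})$ because $f$ is monotone between its critical points $\pm 1/\sqrt{3}$.

```python
import math

def variation_over_partition(f, points):
    """Sum of |f(x_j) - f(x_{j-1})| over consecutive points of the partition."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(points, points[1:]))

def uniform_partition(a, b, m):
    """Partition a = x_0 < x_1 < ... < x_m = b with equal steps."""
    return [a + (b - a) * j / m for j in range(m + 1)]

f = lambda x: x ** 3 - x

# Dyadic partitions are nested, so the sums are nondecreasing in m.
v2 = variation_over_partition(f, uniform_partition(-1.0, 1.0, 2))
v4 = variation_over_partition(f, uniform_partition(-1.0, 1.0, 4))
v8 = variation_over_partition(f, uniform_partition(-1.0, 1.0, 8))
v_fine = variation_over_partition(f, uniform_partition(-1.0, 1.0, 4096))
exact = 8 / (3 * math.sqrt(3))
print(v2, v4, v8, v_fine, exact)
```

The coarsest partition $\{-1, 0, 1\}$ sees no variation at all, since $f$ vanishes at all three points.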
21.2. Jordan Decomposition Theorem. A function $f$ is of finite variation
on $I$ if and only if $f$ is a difference of two nondecreasing functions.
Proof. Monotone functions are of finite variation, and thus differences of monotone
functions are also of finite variation. For the converse, if $v$ is an indefinite
variation of a function $f$ of finite variation, then $v$ and $v - f$ are nondecreasing
and $f$ is their difference.
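A discrete analogue of this decomposition can be checked on sampled values. This is only a sketch on a finite grid (the sample values are made up for the illustration), with the running sum of absolute increments playing the role of the indefinite variation $v$.

```python
from itertools import accumulate

# Hypothetical sample values f(x_0), ..., f(x_m) at increasing points x_0 < ... < x_m.
samples = [0, 3, 1, 4, 4, 2, 5]

increments = [abs(b - a) for a, b in zip(samples, samples[1:])]
v = [0] + list(accumulate(increments))        # variation accumulated along the grid
g = [vi - fi for vi, fi in zip(v, samples)]   # the second function, v - f

def nondecreasing(seq):
    return all(s <= t for s, t in zip(seq, seq[1:]))

print(v, g, nondecreasing(v), nondecreasing(g))
```

Each step of $v - f$ equals $|\Delta f| - \Delta f \ge 0$, which is exactly why the second function is nondecreasing.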
21.3. Absolutely Continuous Functions. A real-valued function $f$ is said
to be absolutely continuous on an interval $I$ if given $\varepsilon > 0$, there exists $\delta > 0$ such that
$$\sum_{j=1}^{m} |f(b_j) - f(a_j)| < \varepsilon$$
whenever $a_1 < b_1 \le a_2 < b_2 \le \dots \le a_m < b_m$ are points of $I$ with
$$\sum_{j=1}^{m} (b_j - a_j) < \delta.$$
The family of all absolutely continuous functions on an interval $I$ is a vector space
which contains all Lipschitz functions.
We say that a function $f$ is locally absolutely continuous on an interval $I$ if $f$
is absolutely continuous on every compact subinterval of $I$.
Any locally absolutely continuous function is continuous and of finite variation.
21.4. Theorem. Any absolutely continuous function $f$ on $I$ is the difference
of two nondecreasing absolutely continuous functions.
Proof. It is enough to show that the indefinite variation $v$ of $f$ is absolutely
continuous. To this end let an $\varepsilon > 0$ be given. Find $\delta > 0$ such that
$$\sum_{j=1}^{m} |f(b_j) - f(a_j)| < \varepsilon$$
whenever $a_1 < b_1 \le a_2 < b_2 \le \dots \le a_m < b_m$ are points of $I$ with
$\sum_{j=1}^{m} (b_j - a_j) < \delta$.
Let $A_1 < B_1 \le A_2 < B_2 \le \dots \le A_p < B_p$ be points of $I$ with
$\sum_{j=1}^{p} (B_j - A_j) < \delta$. Find partitions
$$A_j = a_j^0 < b_j^0 = a_j^1 < \dots < b_j^{m_j} = B_j$$
of intervals $[A_j, B_j]$ for which
$$v(B_j) - v(A_j) < \sum_{i=0}^{m_j} |f(b_j^i) - f(a_j^i)| + \frac{\varepsilon}{p}.$$
Since $\sum_{j,i} (b_j^i - a_j^i) < \delta$, we have
$$\sum_j |v(B_j) - v(A_j)| < \sum_{j,i} |f(b_j^i) - f(a_j^i)| + \varepsilon < 2\varepsilon,$$
as needed.
21.5. Exercise. Show that the product of two absolutely continuous functions (or functions of
finite variation) on a bounded interval is again an absolutely continuous function (or a function
of finite variation).
21.6. Exercise. Prove that every absolutely continuous function has Luzin's (N)-property.
Hint. If $N$ is a null set, find an open set $M$ of small measure containing $N$ and then use the
definition of absolute continuity.
21.7. Notes. C. Jordan discovered functions of finite variation in [1881] and proved the
decomposition theorem 21.2.

22. Theorems on Almost Everywhere Differentiation


In this chapter we show that every function of finite variation (in particular,
every nondecreasing or Lipschitz function) has a finite derivative almost everywhere.
The usual proof of this deep theorem uses Vitali's covering theorem. Here
we present a rather elementary proof. Let us start with a simple lemma which is due
to F. Riesz.
22.1. F. Riesz's Rising Sun Lemma. Let $h$ be a continuous function on an
interval $[a, b]$. Denote
$$E = \{x \in (a, b) : \text{there exists } \xi \in (x, b) \text{ such that } h(\xi) > h(x)\}.$$
Then $E$ is a union of a sequence of pairwise disjoint open intervals $(a_j, b_j)$ with
$h(a_j) \le h(b_j)$.
Proof. It is clear that $E$ is an open set. Hence, $E$ is a union of a sequence of
pairwise disjoint maximal open intervals contained in $E$. Let $(\alpha, \beta)$ be any such
interval and $x \in (\alpha, \beta)$. Denote
$$M = \{\xi \in (x, \beta) : h(\xi) \ge h(x)\}.$$


Since $\beta \notin E$, it follows that $h \le h(\beta)$ in $[\beta, b)$. By hypothesis $M \ne \emptyset$ and
$\sup M = \beta$. Thus $h(x) \le h(\beta)$ and letting $x \to \alpha+$, we complete the proof.
22.2. Remarks. 1. If $(\alpha, \beta)$ is a maximal open interval contained in $E$ and $\alpha > a$, then even
$h(\alpha) = h(\beta)$.
2. The mirror version of the lemma: Let $h$ be a continuous function on an interval $[a, b]$ and
denote
$$E = \{x \in (a, b) : \text{there exists a } \xi \in (a, x) \text{ with } h(\xi) > h(x)\}.$$
Then $E$ is the union of a sequence $(a_j, b_j)$ of pairwise disjoint open intervals with $h(a_j) \ge h(b_j)$.
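A finite, discrete analogue of the rising sun lemma may help to visualize the set $E$. The sketch below is an illustration only (the function name `shadow_runs` and the sample data are hypothetical): an index $i$ is "in the shadow" if some later value exceeds $h[i]$, and each maximal run of shadowed indices ends at an unshadowed index whose value is at least the value at the start of the run.

```python
def shadow_runs(h):
    """Return pairs (l, r): indices l..r-1 form a maximal run of shadowed indices,
    and r is the first unshadowed index to the right of the run."""
    n = len(h)
    shadowed = [False] * n
    best = h[-1]                       # maximum of h over indices to the right of i
    for i in range(n - 2, -1, -1):
        shadowed[i] = best > h[i]
        best = max(best, h[i])
    runs, start = [], None
    for i, s in enumerate(shadowed):
        if s and start is None:
            start = i
        elif not s and start is not None:
            runs.append((start, i))
            start = None
    return runs                        # the last index is never shadowed, so every run is closed

h = [3, 1, 2, 5, 4, 0, 2, 1]
runs = shadow_runs(h)
print(runs, [(h[l], h[r]) for l, r in runs])
```

The inequality between the endpoint values of each run mirrors $h(a_j) \le h(b_j)$ in Lemma 22.1.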

22.3. Extreme Derivatives. Let $f$ be a function defined on a neighborhood
of a point $x \in \mathbb{R}$. Denote
$$D^+ f(x) = \limsup_{t \to 0+} \frac{1}{t}\,(f(x + t) - f(x)),$$
$$D_+ f(x) = \liminf_{t \to 0+} \frac{1}{t}\,(f(x + t) - f(x))$$
and analogously $D^- f(x)$ and $D_- f(x)$ for $t \to 0-$. These extended real numbers
are called the Dini derivatives of $f$ at $x$. A function $f$ is differentiable at $x$ if all
Dini derivatives of $f$ agree at $x$.
Finally set
$$\overline{D} f(x) = \limsup_{t \to 0} \frac{1}{t}\,(f(x + t) - f(x))$$
and call the function $\overline{D} f$ the upper derivative of $f$. The lower derivative $\underline{D} f$ is
defined in a similar way.
22.4. Lemma. Every nondecreasing Lipschitz function $f$ on an interval $[a, b]$
has a finite derivative almost everywhere on $[a, b]$.
Proof. It is no restriction to assume that $f$ is 1-Lipschitz. Lipschitz functions
cannot have infinite derivatives. To prove the assertion it suffices to show that
$D^+ f \le D_- f$ almost everywhere. Then analogously $D^- f \le D_+ f$ almost everywhere, and
$$0 \le D^+ f \le D_- f \le D^- f \le D_+ f \le D^+ f \le 1$$
almost everywhere as needed. To this end let $0 < p < q < 1$ be given and set
$$M_{p,q} := \{x \in (a, b) : D_- f(x) < p < q < D^+ f(x)\}.$$
By Remark 22.2.2 there is a sequence of pairwise disjoint intervals $(a_j, b_j)$ such
that
$$\{x \in (a, b) : D_- f(x) < p\} \subset \{x \in (a, b) : \text{there exists } \xi \in (a, x)
\text{ with } f(\xi) - p\xi > f(x) - px\} = \bigcup_j (a_j, b_j)$$
and
$$f(b_j) - p b_j \le f(a_j) - p a_j.$$


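Now apply Lemma 22.1 to $f(x) - qx$ on each interval $[a_k, b_k]$. Again there are
pairwise disjoint intervals $(a_{k,j}, b_{k,j}) \subset (a_k, b_k)$ such that
$$\Bigl\{x \in \bigcup_k (a_k, b_k) : D^+ f(x) > q\Bigr\} \subset \bigcup_k \{x \in (a_k, b_k) : \text{there exists } \xi \in (x, b_k)
\text{ with } f(\xi) - q\xi > f(x) - qx\} = \bigcup_{k,j} (a_{k,j}, b_{k,j})$$
and
$$f(b_{k,j}) - q b_{k,j} \ge f(a_{k,j}) - q a_{k,j}.$$
Hence
$$\sum_{k,j} (b_{k,j} - a_{k,j}) \le \frac{1}{q} \sum_{k,j} (f(b_{k,j}) - f(a_{k,j})) \le \frac{1}{q} \sum_k (f(b_k) - f(a_k))
\le \frac{p}{q} \sum_k (b_k - a_k) \le \frac{p}{q}\,(b - a).$$
Thus, we can define by induction sequences of decreasing collections of intervals;
in the $2n$th step we obtain a collection of intervals $\{(A_s, B_s)\}$ with
$$\bigcup_s (A_s, B_s) \supset M_{p,q} \quad\text{and}\quad \sum_s (B_s - A_s) \le \Bigl(\frac{p}{q}\Bigr)^{n} (b - a).$$
It follows that $M_{p,q}$ is a null set. The assertion now follows from the fact that
$$\{x \in (a, b) : D_- f(x) < D^+ f(x)\} \subset \bigcup_{p, q \in (0,1) \cap \mathbb{Q}} M_{p,q}.$$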

Now we show that also monotone functions are differentiable almost everywhere.
The situation is more difficult since monotone functions can be discontinuous.
The heart of the proof lies in the observation that, for a nondecreasing
function $f$, the inverse function to $x \mapsto x + f(x)$ (whose domain is not necessarily
connected) can be extended to a nondecreasing Lipschitz function on an interval.
22.5. Theorem (Lebesgue). Every monotone function $f$ on an interval $I$ has
a finite derivative at almost all points of $I$.
Proof. Clearly, it suffices to assume that $I = [a, b]$ and that $f$ is nondecreasing. There exists an
interval $[A, B]$ and a function $g$ on $[A, B]$ so that $g$ is Lipschitz and nondecreasing,
and $x + f(x) \in [A, B]$, $g(x + f(x)) = x$ for all $x \in [a, b]$. (If $f$ is continuous, then
$g$ is simply the inverse function to $x + f(x)$.) A moment's thought will convince
the reader that
$$\{x \in (a, b) : f'(x) \text{ does not exist}\} \subset g(E) \cup g(N)$$
where
$$E = \{y \in (A, B) : g'(y) \text{ does not exist}\} \quad\text{and}\quad N = \{y \in (A, B) : g'(y) = 0\}.$$
By the previous lemma $\lambda E = 0$. Hence $\lambda g(E) = 0$ by Lemma 20.1. According to
Sard's lemma (Corollary 20.3) also $\lambda g(N) = 0$ and the proof is complete.


22.6. Corollary. Every function of bounded variation has a finite derivative
almost everywhere.
We conclude this section by proving an important inequality.
22.7. Theorem. Let $f$ be a nondecreasing function on an interval $[a, b]$. Then
$f' \in \mathcal{L}^1([a, b])$ and
$$\int_a^b f' \le f(b) - f(a).$$
Proof. For $x > b$ set $f(x) = f(b)$ and define a sequence of functions
$$f_k(x) = k\,\bigl(f(x + \tfrac1k) - f(x)\bigr)$$
for $x \in [a, b]$. Then $f_k$ are nonnegative measurable functions on $[a, b]$ ($f$ is measurable!)
and $\lim f_k = f'$ almost everywhere. Thus $f'$ is measurable and nonnegative
almost everywhere. Fatou's lemma 8.15 and a quick computation establish that
$$\int_a^b f' \le \liminf_k \int_a^b f_k = \liminf_k\, k \Bigl(\int_{a + \frac1k}^{b + \frac1k} f - \int_a^b f\Bigr)
= \liminf_k\, k \Bigl(\int_b^{b + \frac1k} f - \int_a^{a + \frac1k} f\Bigr)
\le \liminf_k\, k \Bigl(\frac1k f(b) - \frac1k f(a)\Bigr) = f(b) - f(a).$$

22.8. Notes. Theorem 22.5 on differentiability of monotone functions was proved by
H. Lebesgue in [*1904] under an additional assumption of continuity of the differentiated function.
In full generality, the theorem was proved independently by G. Faber [1918] and by G.C. Young
and W.H. Young [1911]. The rising sun lemma 22.1 and the proof of 22.4 are due to F. Riesz [1930-32].
To prove Lebesgue's theorem, he uses this lemma and the notion of null sets only. The idea of
the proof of 22.5 was used by J. Mignot [1976] and L. Zajíček [1983].

23. Indefinite Lebesgue Integral and Absolute Continuity


In this chapter we examine the formula
$$f(b) - f(a) = \int_a^b f'$$
where the integral is understood as Lebesgue's one and the derivative in the
almost everywhere sense. First, let us show that the formula does not always hold
even if $f$ is monotone by considering the following example.
23.1. Cantor Singular Function. We define the Cantor function $f : [0, 1] \to [0, 1]$ in the
following way: Let $f(0) = 0$, $f(1) = 1$. For $x \in [\frac13, \frac23]$ set $f(x) = \frac12$. Further set $f(x) = \frac14$ for
$x \in [\frac19, \frac29]$ and $f(x) = \frac34$ for $x \in [\frac79, \frac89]$. Each successive step is essentially the same. If $(a, b)$ is a
maximal interval in which $f$ is not yet defined, we subdivide it into thirds and define the value
of $f$ in the closed middle third as the arithmetical mean of $f(a)$ and $f(b)$. Then $f$ is defined


and uniformly continuous on a dense subset of $[0, 1]$. The last step of the definition consists in
the continuous extension of $f$ to the whole interval $[0, 1]$. There is also an arithmetic definition
of the Cantor function. Suppose $x \in [0, 1]$ is written in the form
$$x = \sum_{j=1}^{\infty} 3^{-j} x_j$$
where $x_j \in \{0, 1, 2\}$. Then
$$f(x) = \frac12 \sum_{j=1}^{\infty} 2^{-j} y_j$$
where
$$y_j = \begin{cases} 1 & \text{if there is } i < j \text{ with } x_i = 1, \\ x_j & \text{otherwise.} \end{cases}$$
We see that $f' = 0$ almost everywhere and
$$f(1) - f(0) = 1 \ne 0 = \int_0^1 f'.$$
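The arithmetic definition can be checked against the values fixed in the construction. The following is a sketch under a stated assumption: it takes $y_j = 1$ as soon as an earlier ternary digit equals 1 and $y_j = x_j$ otherwise, and it truncates the series after $n$ digits, so the result is accurate only up to $2^{-n}$. The function names are chosen for this illustration.

```python
from fractions import Fraction

def ternary_digits(x, n):
    """First n digits x_1, ..., x_n of a ternary expansion of x in [0, 1];
    x = 1 is expanded as 0.222..."""
    digits = []
    for _ in range(n):
        x *= 3
        d = min(int(x), 2)
        digits.append(d)
        x -= d
    return digits

def cantor(x, n=40):
    """Partial sum (1/2) * sum_{j <= n} 2^{-j} y_j of the arithmetic definition."""
    total, seen_one = Fraction(0), False
    for j, d in enumerate(ternary_digits(Fraction(x), n), start=1):
        y = 1 if seen_one else d          # assumed rule for y_j, see the lead-in
        total += Fraction(y, 2 ** j)
        seen_one = seen_one or d == 1
    return total / 2

checks = {
    Fraction(1, 3): Fraction(1, 2),
    Fraction(2, 3): Fraction(1, 2),
    Fraction(1, 9): Fraction(1, 4),
    Fraction(7, 9): Fraction(3, 4),
    Fraction(0): Fraction(0),
    Fraction(1): Fraction(1),
}
errors = {x: abs(cantor(x) - value) for x, value in checks.items()}
print(max(errors.values()))
```

Exact rational arithmetic is used so that the ternary digits are not corrupted by floating-point rounding.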

This situation cannot occur when $f$ is absolutely continuous as shown by the
following theorem.
23.2. Theorem. Let $f$ be an absolutely continuous function on an interval $[a, b]$.
Then $f' \in \mathcal{L}^1([a, b])$ and
$$f(b) - f(a) = \int_a^b f'.$$
Proof. As every absolutely continuous function is the difference of two monotone
absolutely continuous functions, no generality is lost with the assumption that $f$
is nondecreasing. Obviously $f' \ge 0$ almost everywhere and $\int_a^b f' \le f(b) - f(a)$
thanks to Theorem 22.7. Select $\varepsilon > 0$ and let $\delta > 0$ be as furnished by the
definition of absolute continuity of $f$. There exists an open set $G$ with $\lambda G < \delta$
containing all points of nondifferentiability of $f$. Let $G$ be expressed as a union
of pairwise disjoint intervals $(a_j, b_j)$. Then the definition of absolute continuity of
$f$ yields $\sum_{j=1}^{k} (f(b_j) - f(a_j)) < \varepsilon$ for all $k \in \mathbb{N}$. Thus $\lambda f(G) \le \varepsilon$. On the other
hand, by Theorem 20.4 we have
$$\lambda f([a, b] \setminus G) \le \int_{[a,b] \setminus G} f' \le \int_a^b f'.$$
Whence
$$f(b) - f(a) = \lambda f([a, b]) \le \lambda f([a, b] \setminus G) + \lambda f(G) \le \int_a^b f' + \varepsilon.$$

23.3. Indefinite Lebesgue Integral. A function $\varphi$ on an interval $I \subset \mathbb{R}$ is said
to be locally integrable if it is integrable on every compact subinterval of $I$, and


$f$ is called an indefinite Lebesgue integral of a locally integrable function $\varphi$ on $I$
provided
$$f(b) - f(a) = \int_a^b \varphi \qquad \text{whenever } [a, b] \subset I.$$
If $c \in I$ and
$$\Phi(x) = \begin{cases} \int_c^x \varphi & \text{for } x \ge c, \\ -\int_x^c \varphi & \text{for } x < c, \end{cases}$$
then $\Phi$ is an indefinite Lebesgue integral of $\varphi$ and any other indefinite integral
of $\varphi$ differs from $\Phi$ by an additive constant. If, in addition, $\varphi$ is nonnegative,
then $\Phi$ is nondecreasing. It follows that for a locally integrable function $\varphi$, the
indefinite integral of $\varphi$ is a function of finite variation (it is the difference of
indefinite integrals of $\varphi^+$ and $\varphi^-$) and by Corollary 22.6 it has a finite derivative
almost everywhere. Notice that according to the Lebesgue dominated convergence
theorem any indefinite Lebesgue integral is a continuous function. This assertion
can be sharpened considerably as the next theorem shows.
23.4. Theorem. Let $\varphi \in \mathcal{L}^1(I)$ and $f$ be an indefinite Lebesgue integral of
$\varphi$ on $I$. Then $f$ is absolutely continuous and $f' = \varphi$ almost everywhere.
Proof. Since $|f(y) - f(x)| \le \int_x^y |\varphi|$ for $[x, y] \subset I$, absolute continuity of $f$ follows
from Exercise 8.22.b. It remains to show that $f' = \varphi$ almost everywhere. Since
$f$ is absolutely continuous, $f'$ exists almost everywhere, $f' \in \mathcal{L}^1(I)$ and for any
interval $[a, b] \subset I$
$$\int_a^b f' = f(b) - f(a) = \int_a^b \varphi.$$
It readily follows that
$$\int_E f' = \int_E \varphi$$
for any open set $E$, and consequently for any measurable set $E$ as well. Then
Theorem 8.17 gives the desired conclusion.
23.5. Corollary. For a real-valued function $f$ on an interval $[a, b]$, the following
properties are equivalent:
(i) $f$ is absolutely continuous on $[a, b]$;
(ii) there exists $\varphi \in \mathcal{L}^1([a, b])$ such that $f(x) = f(a) + \int_a^x \varphi$ for all $x \in [a, b]$;
(iii) $f$ is differentiable almost everywhere, $f' \in \mathcal{L}^1([a, b])$ and $f(x) = f(a) + \int_a^x f'$
for all $x \in [a, b]$.

23.6. Corollary. Suppose $f$ is an absolutely continuous function on $[a, b]$. If
$f' = 0$ almost everywhere, then $f$ is constant.
23.7. Remark. Let $\varphi$ be a real-valued function on an interval $[a, b]$. Recall that the
Newton integral of $\varphi$ is the difference $f(b) - f(a)$, where $f$ is an antiderivative of $\varphi$ on the
interval $[a, b]$. This definition can be generalized in many ways. We may modify the notion
of an antiderivative to require the equality $f' = \varphi$ up to a countable set of points. To


make this definition reasonable, we should add the assumption that $f$ is continuous in order to
guarantee that all antiderivatives of $\varphi$ differ by a constant. An analogous problem appears
when assuming $f' = \varphi$ only almost everywhere. Now, the example of the Cantor function shows
that continuity of $f$ does not guarantee that the increment $f(b) - f(a)$ does not depend on the choice
of the generalized antiderivative. However, it is possible to give an alternative definition of
the Lebesgue integral of a function $\varphi$ as the increment of an absolutely continuous function $f$
on $[a, b]$ with $f' = \varphi$ almost everywhere. (Notice that this definition does not lead to a true
generalization: some antiderivatives are not absolutely continuous, see Example 25.1.) The
definitions of various integrals based on the idea of (in some way) generalized antiderivatives
are called descriptive ones. Let us note that these generalizations may consist in omitting
small sets or in a generalized differentiation. Furthermore, in Chapter 25 we mention
Perron's method which is also included among descriptive approaches.

23.8. Lebesgue Points. Let $I \subset \mathbb{R}$ be an interval and $x \in I$. We say that $x$
is a Lebesgue point for a locally integrable function $f$ if
$$\lim_{h \to 0+} \frac{1}{2h} \int_{-h}^{h} |f(x + t) - f(x)|\,dt = 0.$$
If $F$ is an indefinite Lebesgue integral of $f$ on an interval $I$ and $x \in I$ is a point
where $F'(x) = f(x)$, then
$$\lim_{h \to 0+} \frac{1}{2h} \int_{-h}^{h} (f(x + t) - f(x))\,dt = 0$$
but $x$ does not need to be a Lebesgue point for $f$. However, it is clear that $F' = f$
at each Lebesgue point for $f$. The following theorem is thus a strengthening of
Theorem 23.4.
23.9. Lebesgue Differentiation Theorem. Let $f$ be a locally integrable
function on an interval $I$. Then almost every point of $I$ is a Lebesgue point for
$f$.
Proof. For a fixed $r \in \mathbb{R}$, the function $x \mapsto |f(x) - r|$ is locally integrable on $I$.
By virtue of Theorem 23.4 there exists a set $E_r \subset I$ of Lebesgue measure zero
such that
$$\lim_{h \to 0+} \frac{1}{2h} \int_{-h}^{h} |f(x + t) - r|\,dt = |f(x) - r|$$
for every $x \in I \setminus E_r$. If $E := \bigcup_{r \in \mathbb{Q}} E_r$, then $\lambda E = 0$. Now, if $x \in I \setminus E$ and $\varepsilon > 0$,
then there exists $r \in \mathbb{Q}$ with $|f(x) - r| < \varepsilon$. Consequently,
$$|f(x + t) - f(x)| \le |f(x + t) - r| + \varepsilon$$
and
$$\limsup_{h \to 0+} \frac{1}{2h} \int_{-h}^{h} |f(x + t) - f(x)|\,dt \le |f(x) - r| + \varepsilon < 2\varepsilon.$$

23.10. Remark.
Every continuity point of f is a Lebesgue point for f . An interesting
relationship between Lebesgue points and points of approximate continuity will be given in
Exercise 29.11.


23.11. Banach-Zarecki Theorem. Let $f$ be a real-valued function on an interval $[a, b]$. The
following assertions are equivalent:
(i) $f$ is absolutely continuous on $[a, b]$;
(iv) $f$ is continuous on $[a, b]$, $f$ is of bounded variation and has Luzin's (N)-property.
Hint. We already know that any absolutely continuous function $f$ is continuous and of finite
variation. According to Theorem 23.2, $\lambda f(G) \le \int_G |f'|$ for any open set $G \subset [a, b]$, and a
moment's reflection shows that $f$ has Luzin's (N)-property. Conversely, suppose that (iv) holds
and let $D$ be the set of all points where $f$ has a finite derivative. By Exercise 20.8, $D$ is
measurable. Since $\lambda f([a, b] \setminus D) = 0$, Theorem 20.4 yields
$$|f(s) - f(t)| \le \int_s^t |f'(\tau)|\,d\tau$$
for every interval $[s, t] \subset [a, b]$. Now it is sufficient to notice that the function $x \mapsto \int_a^x |f'|$ is
absolutely continuous.

23.12. A Characterization of Lipschitz Functions. Show that for a real-valued function
$f$ on an interval $[a, b]$ the following conditions are equivalent:
(i) $f$ is Lipschitz on $[a, b]$;
(ii) for any $\varepsilon > 0$ there exists $\delta > 0$ so that for every finite collection of intervals $[a_j, b_j] \subset [a, b]$
with $\sum_j (b_j - a_j) < \delta$ the inequality $\sum_j |f(b_j) - f(a_j)| < \varepsilon$ holds;
(iii) $f$ is absolutely continuous on $[a, b]$ and $f'$ is bounded on $[a, b] \setminus N$, where $\lambda N = 0$.
(Compare with a similar characterization of absolutely continuous functions.)
23.13. Integration by Parts for Lebesgue Integral. Let $f$, $g$ be absolutely continuous
functions on an interval $[a, b]$. Then
$$\int_a^b f g' = f(b)g(b) - f(a)g(a) - \int_a^b f' g.$$

23.14. Notes. The fundamental theorem of calculus for Lebesgue integrals (23.5) was proved
by H. Lebesgue [*1904]. The implication (iv) $\Longrightarrow$ (i) of Theorem 23.11 is due to S. Banach [1925].

24. Radon Measures on R and Distribution Functions


24.1. Distribution Functions. A distribution function of a Radon measure $\mu$ on $\mathbb{R}$
is a nondecreasing and right continuous function $F$ on $\mathbb{R}$ such that
$F(b) - F(a) = \mu(a, b]$ for every interval $(a, b] \subset \mathbb{R}$. It is obvious that any other
distribution function of $\mu$ can differ from $F$ by only a constant.
If $\mu$ is a probability measure, then a distribution function of $\mu$, given as
$$G_\mu(x) := \mu(-\infty, x],$$
is normalized so that $\lim_{x \to -\infty} G_\mu(x) = 0$ and $\lim_{x \to +\infty} G_\mu(x) = 1$.

24.2. Theorem. (a) Let $F$ be a nondecreasing and right continuous function
on $\mathbb{R}$. Then there exists a unique Radon measure $\mu_F$ on $\mathcal{B}(\mathbb{R})$ such that $F$ is the
distribution function of $\mu_F$.
(b) Let $\mu$ be a Radon measure on $\mathbb{R}$. Then there is a distribution function $F$
of $\mu$.


Proof. (a) As for the uniqueness, any two such measures agree on all open (or
compact) sets in $\mathbb{R}$. The existence can be proved in various ways, depending on
what kind of general theorem we wish to use. For instance, we can use Hopf's
extension theorem 5.5 or start from the covering collection $\{(a, b] : a < b,\ a, b \in \mathbb{R}\}$
and the set function $\mu_F(a, b] := F(b) - F(a)$, create the outer measure and
restrict it to measurable sets. In any case, we have to prove the following property:
If $(a, b] \subset \bigcup_{n=1}^{\infty} (a_n, b_n]$, then $F(b) - F(a) \le \sum_{n=1}^{\infty} (F(b_n) - F(a_n))$. To see this, given
$\varepsilon > 0$, let $\delta_n > 0$, $\delta > 0$ be so that $F(b_n + \delta_n) < F(b_n) + 2^{-n}\varepsilon$, $F(a + \delta) < F(a) + \varepsilon$.
Then consider the cover $\{(a_n, b_n + \delta_n)\}$ of the interval $[a + \delta, b]$.
(b) Setting
$$F(x) := \begin{cases} \mu(0, x] & \text{for } x > 0, \\ 0 & \text{for } x = 0, \\ -\mu(x, 0] & \text{for } x < 0, \end{cases}$$
$F$ is nondecreasing, right continuous, $F(0) = 0$ and $\mu(a, b] = F(b) - F(a)$ for
every interval $(a, b] \subset \mathbb{R}$.
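The construction in part (b) can be tried out on a purely atomic measure. The sketch below is only an illustration: the atoms and their masses are hypothetical, and exact fractions are used so that the identity $\mu(a, b] = F(b) - F(a)$ can be tested without rounding.

```python
from fractions import Fraction

# Hypothetical atomic measure: mass 1/2 at -1, mass 1/3 at 0, mass 1/6 at 2.
atoms = {Fraction(-1): Fraction(1, 2), Fraction(0): Fraction(1, 3), Fraction(2): Fraction(1, 6)}

def mu(a, b):
    """Measure of the half-open interval (a, b]."""
    return sum(m for p, m in atoms.items() if a < p <= b)

def F(x):
    """Distribution function normalized by F(0) = 0, following part (b)."""
    if x > 0:
        return mu(0, x)
    if x < 0:
        return -mu(x, 0)
    return Fraction(0)

pairs = [(Fraction(-2), Fraction(3)), (Fraction(-1), Fraction(0)),
         (Fraction(-3, 2), Fraction(-1)), (Fraction(1), Fraction(2))]
print([(F(b) - F(a), mu(a, b)) for a, b in pairs])
```

At the atom $-1$ the function jumps by the mass of the atom from the left but is continuous from the right, in line with the right continuity required of a distribution function.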
24.3. Theorem. The following conditions are equivalent for a measure $\mu$
defined on $\mathcal{B}(\mathbb{R})$:
(i) $\mu$ is a Radon measure;
(ii) there exists a distribution function $F$ such that $\mu(a, b] = F(b) - F(a)$ for
every bounded interval $(a, b]$;
(iii) there exists a nondecreasing function $\varphi$ on $\mathbb{R}$ such that $\mu$ is the Lebesgue-Stieltjes
measure $\lambda_\varphi$ of Example 14.8.b.
Proof. Use previous results and the uniqueness part of the Riesz representation
theorem 16.5.
24.4. Remarks. 1. According to Remark 15.2 any finite measure on $\mathcal{B}(\mathbb{R})$ is a Radon
measure. However, there exist $\sigma$-finite measures on $\mathcal{B}(\mathbb{R})$ which are not Radon measures (for
example, the sum $\sum_k \varepsilon_{1/k}$ of Dirac measures). Hence, these measures are not Lebesgue-Stieltjes
measures and do not have distribution functions.
2. The function $\varphi$ from the last theorem need not be right continuous. However, as a monotone
function it has a right limit at each point. If $\tilde\varphi(x) := \lim_{t \to x+} \varphi(t)$, then $\tilde\varphi$ is a distribution
function of $\lambda_\varphi$ and determines the same Lebesgue-Stieltjes measure as $\varphi$. The functions $\tilde\varphi$
and $F$ of Theorem 24.3.ii differ by a constant (cf. Exercise 24.6).
24.5. Exercise. Let $F$ be a distribution function of a Radon measure $\mu$ on $\mathbb{R}$ and let $\mu_F$
correspond to $F$ as in Theorem 24.2. Show that $\mu = \mu_F$ on $\mathcal{B}(\mathbb{R})$.
24.6. Exercise. Show that any two distribution functions of the same measure differ by a
constant.
24.7. Exercise. Let $F$ be a distribution function of a Radon measure $\mu$. Show that:
(a) $F$ is locally absolutely continuous on $\mathbb{R}$ if and only if $\mu$ is absolutely continuous with
respect to the Lebesgue measure $\lambda$. In this case, $F'$ is the Radon-Nikodým derivative $d\mu/d\lambda$.
(b) Measures $\mu$ and $\lambda$ are mutually singular if and only if $F' = 0$ almost everywhere.
(c) $\mu(\{x\}) = F(x) - \lim_{t \to x-} F(t)$ for every $x \in \mathbb{R}$. In particular, $F$ is continuous at $x$ if and
only if $\mu(\{x\}) = 0$.


(d) $z \in \operatorname{supt} \mu$ if and only if $F(z - \varepsilon) < F(z + \varepsilon)$ for every $\varepsilon > 0$.
(e) Let $\mu = \mu_d + \mu_a + \mu_s$ be the decomposition of $\mu$ into discrete, absolutely continuous
and singular part (Exercise 15.18.c). If $F_d$, $F_a$, $F_s$ are distribution functions of $\mu_d$, $\mu_a$, $\mu_s$,
respectively, then $F_a$ is locally absolutely continuous, $F_s$ is continuous, $F_s' = 0$ almost everywhere
and $F_d$ is called the saltus function of $F$. Moreover, $F = F_d + F_a + F_s$.
24.8. Exercise. (a) Let $\varphi$ be equal to the Cantor function (Example 23.1) on $[0, 1]$ and $\varphi = c_{(0,\infty)}$
elsewhere on $\mathbb{R}$. Determine $\lambda_\varphi C$, where $C$ is the Cantor set. Show that $\lambda_\varphi$ is a continuous
measure (see Exercise 15.18) and $\operatorname{supt} \lambda_\varphi = C$ (see Exercise 15.10).
(b) Let $\alpha \in [0, 1]$. Does there exist a nondecreasing function $\psi$ so that $\psi(0) = 0$, $\psi(1) = 1$
and $\lambda_\psi C = \alpha$?
Hint. Set $\psi(x) = \alpha\varphi(x) + (1 - \alpha)x$, where $\varphi$ is defined as in (a).
24.9. Exercise. Let $F$ be a distribution function of a Radon measure $\mu$. Assume that $F$ is
locally absolutely continuous on $\mathbb{R}$. Show that
$$\int_{\mathbb{R}} g\,d\mu = \int_{\mathbb{R}} g F'\,d\lambda$$
for every $g \in \mathcal{L}^1(\mu)$.
24.10. Fubini's Lemma. Let $\{f_n\}$ be a sequence of nondecreasing functions on an interval
$I \subset \mathbb{R}$. Assume that the series $\sum f_n$ converges to a finite limit $f$ on $I$. Then $f' = \sum f_n'$ almost
everywhere on $I$.
Hint. First prove that $\lambda_f = \sum \lambda_{f_n}$ (notation as in 14.8). To finish the proof, use the Lebesgue
decompositions of these measures to absolutely continuous and singular parts.
24.11. Notes. Distribution functions play an important role in probability theory and it is
very difficult to trace who introduced them. They were investigated in various forms by Jacob
Bernoulli, P.S. de Laplace and others. Lemma 24.10 is due to G. Fubini [1915].

25. Henstock-Kurzweil Integral


In some sense, the Lebesgue integral has the best properties among all integrals
which can be defined on any measure space. However, its universality can also be
a disadvantage. If we are engaged in integration of functions of one real variable
(or several variables, but in what follows we restrict to the real line in order to
simplify the explanation), then we can find wider collections of functions on which
a reasonable notion of an integral can be introduced. Let us present an illustrative
example.
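25.1. Example. The function
$$f(x) = \begin{cases} \Bigl(x^2 \cos^2 \dfrac{1}{x^2}\Bigr)' & \text{for } x \ne 0, \\ 0 & \text{for } x = 0 \end{cases}$$
is not Lebesgue integrable on $[-1, 1]$. For $a_j = \dfrac{1}{\sqrt{(j + \frac12)\pi}}$ and $b_j = \dfrac{1}{\sqrt{j\pi}}$ we have
$$\int_{a_j}^{b_j} |f(x)|\,dx = \int_{a_j}^{b_j} f(x)\,dx = \frac{1}{j\pi},$$
and therefore $\int_0^1 |f(x)|\,dx = +\infty$. On the other hand, $f$ has an antiderivative on $\mathbb{R}$ and the
Newton integral $\int_0^1 f(x)\,dx$ exists.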
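The increments in this example can be checked numerically. The sketch assumes that the antiderivative is $F(x) = x^2 \cos^2(1/x^2)$ with $F(0) = 0$, and that $a_j = 1/\sqrt{(j + \frac12)\pi}$, $b_j = 1/\sqrt{j\pi}$; under these assumptions $F(b_j) - F(a_j) = 1/(j\pi)$, and the partial sums grow like the harmonic series.

```python
import math

def F(x):
    """Assumed antiderivative: F(x) = x^2 cos^2(1/x^2) for x != 0 and F(0) = 0."""
    return 0.0 if x == 0 else x * x * math.cos(1.0 / (x * x)) ** 2

def a(j):
    return 1.0 / math.sqrt((j + 0.5) * math.pi)

def b(j):
    return 1.0 / math.sqrt(j * math.pi)

# Increment of F over [a_j, b_j]; on this interval the derivative is nonnegative.
increments = [F(b(j)) - F(a(j)) for j in range(1, 2001)]
partial_sum = sum(increments)
print(increments[:3], partial_sum)
```

Since the intervals $[a_j, b_j]$ are pairwise disjoint subsets of $(0, 1]$, the unbounded growth of these partial sums is what rules out Lebesgue integrability.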


There are also simple examples of functions which are Lebesgue integrable but
not Newton integrable (for example, the function $\operatorname{sign} x$ on $[-1, 1]$). A generalization
of both Newton and Lebesgue integrals leads to a nonabsolutely convergent
integral which can be introduced in several ways. Here we use an approach due
to Henstock and Kurzweil; Perron's one is outlined in 25.10.
25.2. Henstock-Kurzweil integral. For the sake of simplicity we will define
the integral for bounded intervals only.
If $[a, b]$ is an interval, denote the set of all (strictly) positive functions on $[a, b]$ by
$\Delta$ and label functions from $\Delta$ as gauges. A partition is a pair $D = ([a_j, b_j], \xi_j)_{j=1}^{m}$,
where
$$a = a_1 < b_1 = a_2 < b_2 = a_3 < \dots < b_{m-1} = a_m < b_m = b \quad\text{and}\quad \xi_j \in [a_j, b_j].$$
If $\delta$ is a gauge, then a partition $D$ is called $\delta$-fine (or subordinated to $\delta$)
whenever $b_j - a_j < \delta(\xi_j)$ for all $j = 1, \dots, m$. The definition of the Henstock-Kurzweil
integral is based on the following proposition whose proof is an easy
consequence of the compactness of $[a, b]$.
Cousin's Lemma. Let $\delta$ be a gauge from $\Delta$. Then there exists a $\delta$-fine partition.
Moreover, if a collection $\mathcal{B}$ of non-overlapping intervals $[a_i, b_i]$ containing
points $\xi_i$ with $b_i - a_i < \delta(\xi_i)$ is given, then adding some intervals to $\mathcal{B}$ we get a
$\delta$-fine partition.
Given a real-valued function $f$ on $[a, b]$ and a partition $D = ([a_j, b_j], \xi_j)_{j=1}^{m}$ of
$[a, b]$, set
$$s(f, D) = \sum_{j=1}^{m} f(\xi_j)(b_j - a_j).$$
We say that $f$ is Henstock-Kurzweil integrable on $[a, b]$ if there is a real number
$K$ so that for any $\varepsilon > 0$ there is $\delta \in \Delta$ such that
$$|K - s(f, D)| < \varepsilon$$
for each $\delta$-fine partition $D$ of $[a, b]$. The number $K$ is then uniquely determined,
it is called the Henstock-Kurzweil integral of $f$ and denoted by $\mathcal{K}\!\int_a^b f$. Where
no confusion can result, we will drop the prefix $\mathcal{K}$. This definition is similar
to Riemann's original one, the only difference being that instead of constants $\delta$
in Riemann's definition, a gauge function $\delta$ appears in the Henstock-Kurzweil
definition.
The set of all Henstock-Kurzweil integrable functions is a linear space on which
the Henstock-Kurzweil integral is a monotone linear functional.
We are not going to study details of the theory of nonabsolutely convergent
integrals; we will concentrate on relations to the Newton and the Lebesgue integral
only.
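To see how a non-constant gauge tames an unbounded integrand, consider the following sketch. Everything in it is an assumption made for illustration: the integrand $f(x) = 1/(2\sqrt{x})$ with $f(0) = 0$ on $[0, 1]$ (whose antiderivative $\sqrt{x}$ gives the value 1), the gauge $\delta(0) = c^2$, $\delta(x) = cx$ for $x > 0$, and the explicit partition built below. Such a gauge forces the interval containing 0 to be tagged at 0, where $f$ vanishes.

```python
import math

def f(x):
    """Unbounded integrand: f(x) = 1 / (2 sqrt(x)) for x > 0 and f(0) = 0."""
    return 0.0 if x == 0 else 0.5 / math.sqrt(x)

def gauge(x, c=0.01):
    """Hypothetical gauge: delta(0) = c^2 and delta(x) = c * x for x > 0."""
    return c * c if x == 0 else c * x

def tagged_partition(c=0.01):
    """A gauge-fine partition of [0, 1] as triples (a_j, b_j, xi_j): one interval
    tagged at 0, then geometrically growing intervals tagged at their left endpoints."""
    u = c * c / 2
    parts = [(0.0, u, 0.0)]
    while u < 1.0:
        v = min(u * (1 + c / 2), 1.0)
        parts.append((u, v, u))
        u = v
    return parts

def is_fine(parts, delta):
    """Check b_j - a_j < delta(xi_j) and a_j <= xi_j <= b_j for every interval."""
    return all(bj - aj < delta(xi) and aj <= xi <= bj for aj, bj, xi in parts)

def riemann_sum(func, parts):
    """The sum s(f, D) of f(xi_j) * (b_j - a_j)."""
    return sum(func(xi) * (bj - aj) for aj, bj, xi in parts)

D = tagged_partition()
s = riemann_sum(f, D)
print(len(D), is_fine(D, gauge), s)
```

With a constant gauge, as in Riemann's definition, the tag of the first interval could be placed arbitrarily close to 0 and the sums would be unbounded.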
25.3. Theorem. Let $F$ be a continuous function on $[a, b]$, $F' = f$ on $(a, b)$. Then
the Henstock-Kurzweil integral of $f$ on $[a, b]$ exists and equals $F(b) - F(a)$.


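Proof. Given an $\varepsilon > 0$ and $x \in (a, b)$, there is $\delta(x) > 0$ such that
$$\Bigl|\frac{F(y) - F(x)}{y - x} - f(x)\Bigr| < \varepsilon$$
whenever $y \in [a, b]$, $0 < |y - x| < \delta(x)$. Next, we can find $\delta(a) > 0$ and $\delta(b) > 0$
such that $|f(a)\delta(a)| < \varepsilon$, $|f(b)\delta(b)| < \varepsilon$, $|F(y) - F(a)| < \varepsilon$ for all $y \in [a, a + \delta(a))$
and $|F(y) - F(b)| < \varepsilon$ for all $y \in (b - \delta(b), b]$. Let $D = ([a_j, b_j], \xi_j)_{j=1}^{m}$ be a $\delta$-fine
partition. Then
$$|F(b_j) - F(a_j) - f(\xi_j)(b_j - a_j)|
\le |F(b_j) - F(\xi_j) - f(\xi_j)(b_j - \xi_j)| + |F(\xi_j) - F(a_j) - f(\xi_j)(\xi_j - a_j)|
< \varepsilon(|b_j - \xi_j| + |\xi_j - a_j|) = \varepsilon(b_j - a_j)$$
for every $j \in \{1, \dots, m\}$ with $\xi_j \in (a, b)$, and
$$|F(b_j) - F(a_j) - f(\xi_j)(b_j - a_j)| \le |F(b_j) - F(a_j)| + |f(\xi_j)(b_j - a_j)| < 2\varepsilon$$
if $\xi_j \in \{a, b\}$. Summing over $j$, we obtain the estimate
$$|F(b) - F(a) - s(f, D)| < 4\varepsilon + \varepsilon(b - a).$$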

25.4. Theorem. If a function $f$ has a finite Lebesgue integral $L$, then $f$ also
has the Henstock-Kurzweil integral and $\mathcal{K}\!\int_a^b f = L$.
Proof. If $\varepsilon > 0$, then Theorem 15.5 ensures the existence of a lower semicontinuous
function $s > f$ and an upper semicontinuous function $t < f$ such that $s, t \in \mathcal{L}^1([a, b])$ and
$$\int_a^b (s - t) < \varepsilon$$
(we get strict inequalities by adding small constants). For every $x \in [a, b]$ find
$\delta(x) > 0$ with $t < f(x) < s$ on $(x - \delta(x), x + \delta(x)) \cap [a, b]$. Let $D = ([a_j, b_j], \xi_j)_{j=1}^{m}$
be a $\delta$-fine partition of $[a, b]$. Then
$$\int_{a_j}^{b_j} t \le f(\xi_j)(b_j - a_j) \le \int_{a_j}^{b_j} s$$
for each $j \in \{1, \dots, m\}$. Summing we get
$$\int_a^b t \le s(f, D) \le \int_a^b s.$$
Since $\int_a^b t \le L \le \int_a^b s$, we have $|s(f, D) - L| < 2\varepsilon$, and we are done.


25.5. Indefinite Henstock-Kurzweil Integral. In the sequel, $[a, b]$ will be
a fixed interval and $\Delta$ will denote the set of all positive functions on $[a, b]$. If a
function $f$ has the Henstock-Kurzweil integral on $[a, b]$ and $[a', b'] \subset [a, b]$, then $f$
has the Henstock-Kurzweil integral on the interval $[a', b']$ as well. Moreover,
$$\int_{c_1}^{c_3} f = \int_{c_1}^{c_2} f + \int_{c_2}^{c_3} f$$
whenever $a \le c_1 < c_2 < c_3 \le b$.
A function $F$ on $[a, b]$ is called an indefinite Henstock-Kurzweil integral of $f$
if, for each interval $[a', b'] \subset [a, b]$,
$$F(b') - F(a') = \mathcal{K}\!\int_{a'}^{b'} f.$$
If $F$ is an indefinite Henstock-Kurzweil integral of $f$, then any other indefinite
Henstock-Kurzweil integral of $f$ can differ from $F$ by only a constant.
25.6. Saks-Henstock's Lemma. Let $F$ be an indefinite Henstock-Kurzweil
integral of $f$ on $[a, b]$. Then for any $\varepsilon > 0$ there exists a $\delta \in \Delta$ such that
$$\Bigl|\sum_{j \in M} \bigl(F(b_j) - F(a_j) - f(\xi_j)(b_j - a_j)\bigr)\Bigr| < \varepsilon$$
whenever $D = ([a_j, b_j], \xi_j)_{j=1}^{m}$ is a $\delta$-fine partition of $[a, b]$ and $M \subset \{1, \dots, m\}$.
Proof. Choose an $\varepsilon > 0$ and find a gauge $\delta$ such that
$$|F(b) - F(a) - s(f, D)| < \frac{\varepsilon}{2}$$
whenever $D = ([a_j, b_j], \xi_j)_{j=1}^{m}$ is a $\delta$-fine partition of $[a, b]$. Fix now any such
partition $D$ and $M \subset \{1, \dots, m\}$. For each $j = 1, \dots, m$ there is a partition
$D_j = ([a_j^i, b_j^i], \xi_j^i)_{i=1}^{m_j}$ of $[a_j, b_j]$ subordinated to the restriction of $\delta$ to $[a_j, b_j]$ such
that
$$|F(b_j) - F(a_j) - s(f, D_j)| < \frac{\varepsilon}{2m}.$$
Create a new partition $D' = ([a_k', b_k'], \xi_k')_{k=1}^{p}$ of $[a, b]$ in such a way that every triple
$(a_k', b_k', \xi_k')$ is either one of the triples $(a_j^i, b_j^i, \xi_j^i)$, where $j \notin M$, $i \in \{1, \dots, m_j\}$,
or one of the triples $(a_j, b_j, \xi_j)$, where $j \in M$. This partition is $\delta$-fine and thus
$$|F(b) - F(a) - s(f, D')| < \frac{\varepsilon}{2}.$$
Whence combining all inequalities, we get
$$\Bigl|\sum_{j \in M} \bigl(F(b_j) - F(a_j) - f(\xi_j)(b_j - a_j)\bigr)\Bigr| < \varepsilon.$$

D. Integration on $\mathbb{R}$

25.7. Theorem. An indefinite Henstock–Kurzweil integral $F$ of $f$ on $[a, b]$ is continuous on $[a, b]$.
Proof. Let $z \in [a, b]$ and $\varepsilon > 0$ be fixed. By the Saks–Henstock lemma 25.6, find a gauge $\delta$ such that
$$\Bigl|\sum_{j \in M} \bigl(F(b_j) - F(a_j) - f(\xi_j)(b_j - a_j)\bigr)\Bigr| < \varepsilon \qquad (*)$$
whenever $D = ([a_j, b_j], \xi_j)_{j=1}^m$ is a $\delta$-fine partition of $[a, b]$ and $M \subset \{1, \dots, m\}$. For $x \in [a, b]$, $|x - z| < \delta(z)$, let $D = ([a_j, b_j], \xi_j)_{j=1}^m$ be a $\delta$-fine partition such that $\xi_k = z$ for some $k \in \{1, \dots, m\}$ and such that the endpoints of the interval $[a_k, b_k]$ are $x$ and $z$. Applying $(*)$ with $M = \{k\}$, we obtain
$$|F(x) - F(z)| < \varepsilon + |f(z)|\,|x - z|.$$

Based on Henstock–Kurzweil's definition, the following theorem (even in higher dimensions) was proved by J. Král [1985].
25.8. Theorem. Let $F$ be an indefinite integral of a Henstock–Kurzweil integrable function $f$ on $[a, b]$. Then $f$ is a measurable function, the derivative $F'$ exists at almost all points of $(a, b)$ and $F' = f$ almost everywhere.
Proof. The continuity of $F$ (Theorem 25.7) implies that $\overline{D}F$ is measurable (even Borel) since
$$\overline{D}F(x) = \lim_{n \to \infty} \lim_{k \to \infty} f_{n,k}(x),$$
where the functions $f_{n,k}$ defined as
$$f_{n,k}(x) = \sup\Bigl\{\frac{F(y) - F(x)}{y - x} : y \in [a, b],\ \frac{1}{n+k} \le |y - x| \le \frac{1}{n}\Bigr\}$$
are continuous. Now we prove that $\underline{D}F = \overline{D}F$ almost everywhere, which establishes the existence of the derivative almost everywhere and its measurability. We will be done once we show that, for every $n \in \mathbb{N}$, the level sets
$$L_n := \Bigl\{x \in [a, b] : \underline{D}F(x) < f(x) - \frac{1}{n}\Bigr\} \quad\text{and}\quad U_n := \Bigl\{x \in [a, b] : \overline{D}F(x) > f(x) + \frac{1}{n}\Bigr\}$$
have measure zero. For this purpose we will fix $n \in \mathbb{N}$ and estimate the (outer) measure of the set $L := L_n$. We are now in a position to invoke the Saks–Henstock lemma 25.6: Given $\varepsilon > 0$, there is a gauge function $\delta$ such that
$$\Bigl|\sum_{j \in M} \bigl(F(b_j) - F(a_j) - f(\xi_j)(b_j - a_j)\bigr)\Bigr| < \varepsilon$$
whenever $D = ([a_j, b_j], \xi_j)_{j=1}^m$ is a $\delta$-fine partition of $[a, b]$ and $M \subset \{1, \dots, m\}$. Let $\mathcal{V}$ denote the family of all intervals $[a', b']$ for which there exists an $x \in [a', b'] \cap L$ such that $b' - a' < \delta(x)$ and
$$\frac{F(b') - F(a')}{b' - a'} < f(x) - \frac{1}{n}.$$
Then $\mathcal{V}$ is a Vitali cover of $L$, and so, by Corollary 27.3, there exists a finite, pairwise disjoint subfamily $\{[a'_i, b'_i]\}_i$ of $\mathcal{V}$ with points $\xi'_i \in [a'_i, b'_i] \cap L$ such that
$$\lambda^*\Bigl(L \setminus \bigcup_i [a'_i, b'_i]\Bigr) < \varepsilon.$$
Now, use Cousin's lemma in 25.2 to extend the collection $([a'_i, b'_i], \xi'_i)_i$ to a family $([a_j, b_j], \xi_j)$ subordinated to $\delta$. If $M := \{j : \text{there exists an } i \text{ with } [a_j, b_j] = [a'_i, b'_i]\}$, the property of the gauge $\delta$ yields that
$$\sum_i \bigl(f(\xi'_i)(b'_i - a'_i) - (F(b'_i) - F(a'_i))\bigr) < \varepsilon,$$
and the definition of $\mathcal{V}$ that
$$\sum_i \bigl(f(\xi'_i)(b'_i - a'_i) - (F(b'_i) - F(a'_i))\bigr) > \frac{1}{n} \sum_i (b'_i - a'_i).$$
Thus $\sum_i (b'_i - a'_i) < n\varepsilon$, which in turn implies that
$$\lambda^*(L) \le n\varepsilon + \lambda^*\Bigl(L \setminus \bigcup_i [a'_i, b'_i]\Bigr) < (n + 1)\varepsilon.$$

The following proposition illustrates the relation between the Henstock–Kurzweil and Lebesgue integrals: their equivalence for nonnegative functions.
25.9. Theorem. If a nonnegative function $f$ has the Henstock–Kurzweil integral on $[a, b]$, then $f$ also has the Lebesgue integral on $[a, b]$ and $(L)\int_a^b f = (K)\int_a^b f$.
Proof. According to Theorem 25.8, $f$ is measurable. It remains to show that $(L)\int_a^b f < \infty$ and use Theorem 25.4. The proof may now be finished in a stroke: Since $f \ge 0$, the indefinite Henstock–Kurzweil integral $F$ of $f$ is nondecreasing on $[a, b]$ and thus by Theorems 22.7 and 25.8
$$(L)\int_a^b f = (L)\int_a^b F' \le F(b) - F(a) < +\infty.$$
25.10. Perron Integral. There are also other ways to define an integral which is equivalent to the Henstock–Kurzweil integral (i.e. it provides the same class of integrable functions, on which it equals the Henstock–Kurzweil integral). In the following, we outline Perron's approach to generalizing both the Newton and the Lebesgue integrals on intervals. Let us start with two notes:


(a) A descriptive definition of the Newton integral of a function $f$ requires the existence of an antiderivative (a function $F$ with $F' = f$). This is a very strong condition as it reduces the class of Newton integrable functions. Among others, this condition implies that $f$ has the Darboux property and is of Baire class one. A descriptive definition of the Lebesgue integral requires the existence of an absolutely continuous function $F$ with $F' = f$ only almost everywhere.
(b) The following particular case of Theorem 15.5 is known as the Vitali–Carathéodory theorem: A function $f$ is Lebesgue integrable on an interval $I$ if for any $\varepsilon > 0$ there exists an upper semicontinuous function $g$ and a lower semicontinuous function $h$ with $g \le f \le h$ on $I$ and $\int_I (h - g) < \varepsilon$.
Now we are in the position to introduce Perron's integral. Let $f$ be a function on an interval $[a, b]$. A function $M$ is termed a majorant of $f$ if $f(x) \le \underline{D}M(x) \ne -\infty$ for every $x \in [a, b]$. If $f \ge \overline{D}m \ne +\infty$, we say that $m$ is a minorant of $f$.
A function $f$ is said to be Perron integrable on an interval $[a, b]$ if for any $\varepsilon > 0$ there exists a majorant $M$ and a minorant $m$ so that
$$M(b) - M(a) - (m(b) - m(a)) < \varepsilon.$$
Since $\underline{D}(M - m) \ge \underline{D}M - \overline{D}m \ge 0$ (and since every function with a nonnegative lower derivative is nondecreasing) we may define the Perron integral of $f$ as
$$(P)\int_a^b f = \inf\{M(b) - M(a) : M \text{ is a majorant of } f\} = \sup\{m(b) - m(a) : m \text{ is a minorant of } f\}.$$
We may also define upper and lower Perron integrals of $f$, and $f$ will be Perron integrable when they are equal and finite.
25.11 Remark. Lebesgue integrable functions may be characterized in the following way:
A function $f$ on an interval $[a, b]$ is Lebesgue integrable if and only if for any $\varepsilon > 0$ there is a majorant $M$ and a minorant $m$, both of them absolutely continuous, so that
$$M(b) - M(a) - (m(b) - m(a)) < \varepsilon.$$
To prove this proposition, one can use the Vitali–Carathéodory characterization (Theorem 15.5) again (the indefinite integrals of semicontinuous functions serve as majorants and minorants).
It is quite interesting to notice that in the definition of the Perron integral we can restrict ourselves to continuous majorants and minorants:
A function $f$ on an interval $[a, b]$ is Perron integrable if and only if for any $\varepsilon > 0$ there is a majorant $M$ and a minorant $m$, both of them continuous, so that
$$M(b) - M(a) - (m(b) - m(a)) < \varepsilon.$$
Finally, the Riemann integrable functions may be characterized in a similar way:
A function $f$ on an interval $[a, b]$ is Riemann integrable if and only if for any $\varepsilon > 0$ there is a majorant $M$ and a minorant $m$, both of them with continuous derivative, so that
$$M(b) - M(a) - (m(b) - m(a)) < \varepsilon.$$
25.12. Exercise. Find the gauge function $\delta$ from the definition of the Henstock–Kurzweil integral if $f$ is:
(a) Newton integrable;
(b) Lebesgue integrable;
(c) an indicator function of an open set;
(d) an indicator function of a null set.


25.13 Exercise. Give an example to show that the indefinite Henstock–Kurzweil integral need not be an absolutely continuous function.
25.14. Historical Notes. H. Lebesgue already knew that his integral on the real line does not integrate all derivatives. The task was to find a wider class of integrable functions containing both the Lebesgue and Newton integrable functions. The problem was solved by A. Denjoy [1912], who used a constructive method based on a transfinite approach, and by N. N. Luzin [1912], using a descriptive definition based on a further generalization of the notion of absolute continuity. The method of envelopes was developed by O. Perron [1914]; it was proved that his and Denjoy's integrals are equivalent. Nowadays, this integral is usually called the Denjoy–Perron integral or the restricted Denjoy integral. Another approach was developed independently by R. Henstock [1961] and J. Kurzweil [1957], who returned to a Riemann type definition. The integral is often called the complete Riemann or the Henstock–Kurzweil integral. The simple definitions in Henstock's and Kurzweil's approach and the simple proofs allowed a deeper study of the Henstock–Kurzweil integral and surprising applications. Among recent publications let us mention R. Henstock [*1988], Peng Yee Lee [*1989], Š. Schwabik [*1992] and Š. Schwabik and Guoju Ye [*2005].


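E. Integration on $\mathbb{R}^n$
26. Lebesgue Measure and Integral on $\mathbb{R}^n$
Recall that the outer Lebesgue measure $\lambda_n^*$ is defined as
$$\lambda_n^* A := \inf\Bigl\{\sum_{k=1}^{\infty} \operatorname{vol} I_k : \bigcup_{k=1}^{\infty} I_k \supset A,\ I_k \text{ are open intervals}\Bigr\}$$
and the Lebesgue measure $\lambda_n$ is defined as the restriction of $\lambda_n^*$ to the $\sigma$-algebra $\mathcal{M}_n = \mathcal{M}_n(\lambda_n^*)$ of all Lebesgue measurable sets. Where no confusion can result, the subscript $n$ will be omitted.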
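26.1. Theorem. The Lebesgue measure $\lambda_n$ is a complete Radon measure.
Proof. Use Theorem 1.21 and Remark 15.2.4 to prove that $\lambda_n$ is a Radon measure. Completeness is a consequence of Carathéodory's construction, see Theorem 4.5.
26.2. Theorem. The Lebesgue measure $\lambda_{n+k}$ on $\mathcal{M}_{n+k}$ is the completion of the product measure $\lambda_n \otimes \lambda_k$ on $\mathcal{M}_n \otimes \mathcal{M}_k$.
Proof. Recall that $A \in \mathcal{M}_n$ if and only if $A$ is a union of a Borel set and a set of measure zero. Indeed, such a property is possessed by any complete Radon measure and then we may refer to Theorem 26.1. The proof is now divided into three steps.
1. First we show that $\mathcal{M}_n \otimes \mathcal{M}_k \subset \mathcal{M}_{n+k}$. For this purpose it is enough to show that any measurable rectangle $A \times B$ for $A \in \mathcal{M}_n$, $B \in \mathcal{M}_k$ is in $\mathcal{M}_{n+k}$. We write $A = D_A \cup N_A$ and $B = D_B \cup N_B$, where $D_A$, $D_B$ are Borel sets and $N_A$, $N_B$ have measure zero. Then $D_A \times D_B$ is a Borel set and $(A \times B) \setminus (D_A \times D_B) = (A \times N_B) \cup (N_A \times B)$ has measure zero.
2. Conversely, $\mathcal{B}_{n+k} = \mathcal{B}_n \otimes \mathcal{B}_k \subset \mathcal{M}_n \otimes \mathcal{M}_k$. Since $(\mathcal{M}_{n+k}, \lambda_{n+k})$ is the completion of $(\mathcal{B}_{n+k}, \lambda_{n+k})$, $\mathcal{M}_{n+k} \subset \overline{\mathcal{M}_n \otimes \mathcal{M}_k}$.
3. Since $\lambda_n \otimes \lambda_k = \lambda_{n+k}$ on open intervals of $\mathbb{R}^{n+k}$, using Hopf's extension theorem 5.5 and uniqueness of the completion we conclude that $\overline{\lambda_n \otimes \lambda_k} = \lambda_{n+k}$ on $\mathcal{M}_{n+k}$.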
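26.3. Remark. It is not very hard to prove that $\mathcal{B}_n \otimes \mathcal{B}_k = \mathcal{B}_{n+k}$ in Euclidean spaces. For measurable sets, the situation is much more complicated. Indeed,
$$\mathcal{B}_{n+k} \subset \mathcal{M}_n \otimes \mathcal{M}_k \subset \mathcal{M}_{n+k} \quad\text{and}\quad \mathcal{B}_{n+k} \ne \mathcal{M}_n \otimes \mathcal{M}_k \ne \mathcal{M}_{n+k}.$$
For the first counterexample consider the Cartesian product $H \times H$, where $H$ is a measurable non-Borel subset of $\mathbb{R}$ (Exercise 1.14.b) and use Lemma 11.2. The second one can be found in Remark 11.10.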
26.4. Exercise. Show that the Lebesgue measure is the only complete Radon measure on $\mathbb{R}^n$ that assigns to each interval its volume.
26.5. Exercise. The Lebesgue measure is translation-invariant: If $A \in \mathcal{M}_n$ and $x \in \mathbb{R}^n$, then $A + x \in \mathcal{M}_n$ and $\lambda_n(A + x) = \lambda_n A$.
26.6. Exercise. Let $\mu$ be a nontrivial translation-invariant complete Radon measure on $\mathbb{R}^n$. Show that there is $c > 0$ such that $\mu = c\lambda_n$.
Hint. Let $c = \mu\bigl((0, 1)^n\bigr)$. Evaluate the $\mu$-measure of arbitrary intervals and use Exercise 26.4 to compare $\lambda_n$ and $\frac{1}{c}\mu$.


26.7. Lebesgue Measurable Functions. Since $\mathcal{M}_n$ contains the family of all Borel sets, each Borel (in particular, each continuous) function on $\mathbb{R}^n$ is measurable. A deeper and complete characterization of measurable functions (e.g. on open or closed subsets of $\mathbb{R}^n$) is contained in Luzin's Theorem 18.2.
The basic tools of the integral calculus of functions of several variables are the substitution theorem and Fubini's theorem. The latter allows us to reduce multidimensional integration to a succession of one-dimensional integrals.
26.8. Introduction to Fubini's Theorem. If $M \subset \mathbb{R}^{n+k}$ is a measurable set, let $\mathcal{L}^*(M)$ denote the set of all functions $f$ which have a (finite or infinite) integral $\int_M f \, d\lambda_{n+k}$ (which will be traditionally denoted by $\int_M f(x, y) \, dx \, dy$). Given $x \in \mathbb{R}^n$, recall that $M^x$ stands for the set $\{y \in \mathbb{R}^k : [x, y] \in M\}$.
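26.9. Fubini's Theorem. Let $M \subset \mathbb{R}^{n+k}$ be a measurable set and $f \in \mathcal{L}^*(M)$. Then the integral
$$g(x) := \int_{M^x} f(x, \cdot) \, d\lambda_k$$
exists for almost all $x \in \mathbb{R}^n$, and
$$\int_M f \, d\lambda_{n+k} = \int_{\mathbb{R}^n} g \, d\lambda_n.$$
In other words,
$$\int_M f(x, y) \, dx \, dy = \int_{\mathbb{R}^n} \Bigl(\int_{M^x} f(x, y) \, dy\Bigr) dx.$$
Proof. To prove the assertion it suffices to use Theorems 11.11 and 26.2.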
26.10. Remarks. 1. The assumption $f \in \mathcal{L}^*(M)$ is satisfied if, for instance, $f \in \mathcal{L}^1(M)$ (Tonelli's theorem), or if $f$ is a non-negative measurable function (Fubini's theorem in a narrow sense). When applying Fubini's theorem, we usually start with Fubini's theorem in a narrow sense in order to show that $|f|$ is integrable, and then we use Tonelli's theorem.
2. Let $\pi M$ denote the projection of $M$ into $\mathbb{R}^n$ (i.e. $\pi M$ is the set of all points $x$ for which $M^x$ is nonempty). Then
$$\int_M f(x, y) \, dx \, dy = \int_{\pi M} \Bigl(\int_{M^x} f(x, y) \, dy\Bigr) dx$$
provided $\pi M$ is measurable. In general, the measurability of $M$ does not imply the measurability of $\pi M$. Observe that $\pi M$ is measurable whenever $M$ is open (then $\pi M$ is also open), or compact (then $\pi M$ is compact), or a countable union of compact sets (e.g. a closed set). Projections of Borel sets are also measurable but the proof is difficult. The study of projections of Borel sets led to the notion of analytic sets (see D.L. Cohn [*1980]).
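26.11. Example. We evaluate the integral $\int_B f(z) \, dz$, where $f$ is an integrable function on the ball $B := \{z \in \mathbb{R}^3 : |z| < 1\}$. Let us write $z$ in the form $[x, y]$, where $x = [z_1, z_2] \in \mathbb{R}^2$ and $y = z_3 \in \mathbb{R}$. Then $B^x = \{y \in \mathbb{R} : |y| < \sqrt{1 - |x|^2}\}$ and $\pi B = \{x \in \mathbb{R}^2 : |x| < 1\}$. Hence
$$\int_B f(z) \, dz = \int_{\pi B} \Bigl(\int_{-\sqrt{1 - |x|^2}}^{\sqrt{1 - |x|^2}} f(x, y) \, dy\Bigr) dx = \int_{-1}^{1} \Bigl(\int_{-\sqrt{1 - z_1^2}}^{\sqrt{1 - z_1^2}} \Bigl(\int_{-\sqrt{1 - z_1^2 - z_2^2}}^{\sqrt{1 - z_1^2 - z_2^2}} f(z_1, z_2, z_3) \, dz_3\Bigr) dz_2\Bigr) dz_1.$$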
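The iterated integral over the ball can be tried out numerically. The sketch below is an illustration only: it uses a plain midpoint rule in each variable (the resolution `n = 40` and all function names are arbitrary choices, not from the text) and nests the three one-dimensional integrals in the order $z_3$, $z_2$, $z_1$.

```python
import math

def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def integral_over_ball(f, n=40):
    """Approximate the integral of f over the unit ball of R^3 by three
    nested one-dimensional integrals."""
    def over_z3(z1, z2):
        c = math.sqrt(max(0.0, 1.0 - z1 * z1 - z2 * z2))
        return midpoint(lambda z3: f(z1, z2, z3), -c, c, n)

    def over_z2(z1):
        c = math.sqrt(max(0.0, 1.0 - z1 * z1))
        return midpoint(lambda z2: over_z3(z1, z2), -c, c, n)

    return midpoint(over_z2, -1.0, 1.0, n)

volume = integral_over_ball(lambda z1, z2, z3: 1.0)          # exact value: 4*pi/3
moment = integral_over_ball(lambda z1, z2, z3: z3 * z3)      # exact value: 4*pi/15
```

The square-root behaviour of the limits near the boundary slows the convergence of the midpoint rule, so only a couple of correct digits should be expected at this resolution.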


26.12. Jacobian and Derivative. If $G \subset \mathbb{R}^k$ is an open set and $\varphi = [\varphi_1, \dots, \varphi_n] : G \to \mathbb{R}^n$ has all partial derivatives at a point $z \in G$, then the Jacobi matrix
$$\Bigl(\frac{\partial \varphi_i}{\partial x_j}(z)\Bigr)_{\substack{i = 1, \dots, n \\ j = 1, \dots, k}}$$
of the mapping $\varphi$ at $z$ will be denoted by $\nabla\varphi(z)$.
We say that $\varphi$ is (Fréchet) differentiable at $z$ if there exists a linear mapping $L : \mathbb{R}^k \to \mathbb{R}^n$ such that
$$\lim_{h \to 0} \frac{\varphi(z + h) - \varphi(z) - L(h)}{|h|} = 0.$$
In this case, $L$ is represented by the matrix $\nabla\varphi(z)$, it is called the derivative of $\varphi$ at $z$ and it is denoted by $\varphi'(z)$. If all partial derivatives of $\varphi$ exist and are continuous at $z$, then $\varphi$ is differentiable at $z$.
A mapping whose partial derivatives of order $k$ are all continuous is called a $C^k$-mapping, or a mapping of class $C^k$. $C^\infty$-mappings, or mappings of class $C^\infty$, are those possessing continuous partial derivatives of all orders.
In case $k = n$, the Jacobi matrix $\nabla\varphi(z)$ is a square matrix and its determinant $J\varphi(z)$ is called the Jacobian of $\varphi$ at $z$.
We state the following theorem without proof; later we provide a proof of the more general Theorem 34.18.
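26.13. Change of Variable Formula. Let $G \subset \mathbb{R}^n$ be an open set and $\varphi : G \to \mathbb{R}^n$ a one-to-one $C^1$-mapping such that $J\varphi(z) \ne 0$ for all $z \in G$. Let $f$ be a function on $\varphi(G)$ and $E \subset \varphi(G)$ a measurable set. Then
$$\int_E f(x) \, dx = \int_{\varphi^{-1}(E)} f(\varphi(t)) \, |J\varphi(t)| \, dt$$
provided either of these integrals exists.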


26.14. Polar Coordinates. Consider the mapping $\varphi : [r, t] \mapsto [r \cos t, r \sin t]$. Then $\varphi$ is of class $C^\infty$ in $\mathbb{R}^2$,
$$\varphi'(r, t) = \begin{pmatrix} \cos t & -r \sin t \\ \sin t & r \cos t \end{pmatrix}$$
and $J\varphi(r, t) = r$. The mapping $\varphi$ is not one-to-one. For calculation of integrals, the substitution $x = \varphi(r, t)$ is often useful (usually $G = (0, \infty) \times (0, 2\pi)$). In this situation, the hypotheses of the change of variable formula in 26.13 are satisfied (notice that $\varphi(G) \ne \mathbb{R}^2$ but $\lambda(\mathbb{R}^2 \setminus \varphi(G)) = 0$).
26.15. Lemniscate. Calculate the area of the set $M$ surrounded by the lemniscate
$$M := \{[x, y] \in \mathbb{R}^2 : (x^2 + y^2)^2 < 2a^2(x^2 - y^2)\}.$$
Hint. If $\varphi$ denotes the mapping of 26.14 (polar coordinates) and if
$$L := \Bigl\{[r, t] \in \mathbb{R}^2 : r \in \bigl(0, \sqrt{2a^2 \cos 2t}\bigr),\ t \in \bigl(0, \tfrac{\pi}{4}\bigr)\Bigr\},$$
then $\varphi(L) = M \cap \{[x, y] : x > 0, y > 0\}$. By the change of variable formula 26.13 we get
$$\lambda_2(M) = 4 \int_L r \, dr \, dt = 4a^2 \int_0^{\pi/4} \cos 2t \, dt = 2a^2.$$
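As a sanity check of the value $2a^2$, the area can be estimated in two independent ways. The sketch below is illustrative only (grid size, bounding box and function names are assumptions made for this example): the first function counts grid midpoints that satisfy the defining inequality in Cartesian coordinates, the second applies a midpoint rule to the polar-coordinate integral from the hint.

```python
import math

def lemniscate_area_grid(a, n=400):
    """Count midpoints of an n x n grid that satisfy the defining inequality."""
    x_half, y_half = 1.5 * a, 0.6 * a      # box containing |x| <= sqrt(2) a, |y| <= a / 2
    hx, hy = 2 * x_half / n, 2 * y_half / n
    inside = 0
    for i in range(n):
        x = -x_half + (i + 0.5) * hx
        for j in range(n):
            y = -y_half + (j + 0.5) * hy
            if (x * x + y * y) ** 2 < 2 * a * a * (x * x - y * y):
                inside += 1
    return inside * hx * hy

def lemniscate_area_polar(a, n=1000):
    """Midpoint rule for 4 a^2 * integral of cos(2t) over (0, pi/4)."""
    h = (math.pi / 4) / n
    return 4 * a * a * h * sum(math.cos(2 * (k + 0.5) * h) for k in range(n))

grid_estimate = lemniscate_area_grid(1.0)
polar_estimate = lemniscate_area_polar(1.0)
```

The grid count is crude near the boundary curve, so it only confirms the first digits; the polar quadrature is accurate to many digits because the integrand is smooth.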


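26.16. Spherical Coordinates. Consider the mapping $\varphi = [x, y, z]$, where
$$x(r, t, \theta) = r \cos\theta \cos t, \qquad y(r, t, \theta) = r \cos\theta \sin t, \qquad z(r, t, \theta) = r \sin\theta$$
and $[r, t, \theta] \in G := (0, \infty) \times (-\pi, \pi) \times (-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$. Then
$$\varphi'(r, t, \theta) = \begin{pmatrix} \cos\theta \cos t & -r \sin t \cos\theta & -r \cos t \sin\theta \\ \cos\theta \sin t & r \cos t \cos\theta & -r \sin t \sin\theta \\ \sin\theta & 0 & r \cos\theta \end{pmatrix}.$$
Thus $J\varphi(r, t, \theta) = r^2 \cos\theta$. Again $\varphi(G) \ne \mathbb{R}^3$. However, the set $\mathbb{R}^3 \setminus \varphi(G) = \{[x, y, z] \in \mathbb{R}^3 : y = 0 \text{ and } x \le 0\}$ is of measure zero.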
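The Jacobian of the spherical-coordinate mapping can be checked numerically. The following sketch is illustrative only: it assumes the parametrization with longitude `t` and latitude `theta` (the name `theta` is chosen here for the third variable), approximates the Jacobi matrix by central differences, and compares its determinant with $r^2 \cos\theta$ at a sample point.

```python
import math

def spherical(r, t, theta):
    """The mapping [r, t, theta] -> [x, y, z] with latitude theta."""
    return (r * math.cos(theta) * math.cos(t),
            r * math.cos(theta) * math.sin(t),
            r * math.sin(theta))

def jacobian_determinant(phi, point, h=1e-6):
    """Determinant of the 3 x 3 Jacobi matrix of phi at `point`,
    with partial derivatives approximated by central differences."""
    columns = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        fp, fm = phi(*plus), phi(*minus)
        columns.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    # m[i][k] approximates the partial derivative of phi_i with respect to variable k
    m = [[columns[k][i] for k in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, t, theta = 2.0, 0.7, 0.3
numeric = jacobian_determinant(spherical, (r, t, theta))
exact = r * r * math.cos(theta)
```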
26.17. Exercise. Let $\alpha_n$ denote the volume of the $n$-dimensional unit ball.
(a) Let $\psi$ be a nonnegative measurable function on $(0, R)$. Prove that
$$\int_{B(0, R)} \psi(|x|) \, dx = n\alpha_n \int_0^R r^{n-1} \psi(r) \, dr.$$
Hint. Consider first the case when $\psi$ is piecewise constant and then approximate by piecewise constant functions.
(b) Show that $\alpha_n = \dfrac{\pi^{n/2}}{\Gamma(\frac{n}{2} + 1)}$.
Hint. Write $I_n = \int_{\mathbb{R}^n} e^{-|x|^2} \, dx$. Using Fubini's theorem we obtain $I_n = I_1^n$. By the previous part (a)
$$I_n = n\alpha_n \int_0^\infty r^{n-1} e^{-r^2} \, dr = \frac{n}{2} \alpha_n \Gamma\Bigl(\frac{n}{2}\Bigr) = \alpha_n \Gamma\Bigl(\frac{n}{2} + 1\Bigr).$$
It is easy to compute $I_2 = \alpha_2 \Gamma(2) = \pi$. Hence $I_n = \pi^{n/2}$ and we are done.
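The closed formula of part (b) is easy to evaluate. The sketch below (function name chosen for illustration) computes $\pi^{n/2} / \Gamma(\frac{n}{2} + 1)$ with the standard library and compares it with the familiar values for $n = 1, 2, 3$ and with the standard two-step recurrence, volume in dimension $n$ equals $\frac{2\pi}{n}$ times the volume in dimension $n - 2$.

```python
import math

def unit_ball_volume(n):
    """Volume of the n-dimensional unit ball: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

volumes = [unit_ball_volume(n) for n in range(1, 11)]
```

For instance, `volumes[0]` is the length of the interval $(-1, 1)$, `volumes[1]` the area of the unit disk and `volumes[2]` the volume of the unit ball in $\mathbb{R}^3$.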

26.18. Convolution of Functions and Measures. Let $f$, $g$ be measurable functions on $\mathbb{R}^n$. The convolution of $f$ and $g$ is the function $f * g$ defined as
$$f * g\,(x) = \int_{\mathbb{R}^n} f(x - y) g(y) \, dy$$
at those points $x$ for which the integral exists.
Let $f$ be a function on $\mathbb{R}^n$ and $\mu$ a (signed) Radon measure on $\mathbb{R}^n$. If the expression $f * \mu(x) := \int_{\mathbb{R}^n} f(x - y) \, d\mu(y)$ makes sense for almost all $x \in \mathbb{R}^n$, then the function $f * \mu$ is called the convolution of $f$ and $\mu$.
26.19. Theorem. If $f, g \in \mathcal{L}^1(\mathbb{R}^n)$, then the function $y \mapsto f(x - y) g(y)$ is in $\mathcal{L}^1(\mathbb{R}^n)$ for almost all $x \in \mathbb{R}^n$, $f * g \in \mathcal{L}^1$ and $\|f * g\|_1 \le \|f\|_1 \|g\|_1$.
Proof. First we have to show that the function $[x, y] \mapsto f(x - y) g(y)$ is measurable on $\mathbb{R}^{2n}$. Obviously, the function $[x, y] \mapsto g(y)$ is measurable on $\mathbb{R}^{2n}$ and we need to prove the measurability of $[x, y] \mapsto f(x - y)$. To this end it is enough to show that the set $M := \{[x, y] \in \mathbb{R}^n \times \mathbb{R}^n : x - y \in E\}$ is measurable for every measurable set $E \subset \mathbb{R}^n$. But this is apparent since $M$ is of the form $E \times \mathbb{R}^n$ in the coordinate system $x' = x - y$, $y' = x + y$. Next, we use Fubini's theorem in a narrow sense (see 26.10) to obtain
$$\int_{\mathbb{R}^{2n}} |f(x - y) g(y)| \, dx \, dy = \int_{\mathbb{R}^n} |g(y)| \Bigl(\int_{\mathbb{R}^n} |f(x - y)| \, dx\Bigr) dy = \|f\|_1 \|g\|_1 < \infty.$$
Using Fubini's theorem again (now Tonelli's one) the desired conclusion follows.
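A discrete analogue makes the norm inequality tangible. The sketch below is an illustration under a changed setting, not the theorem itself: it replaces Lebesgue measure on $\mathbb{R}^n$ by counting measure on the integers, so functions become finite sequences and the convolution becomes a finite sum. The helper names are chosen for this example.

```python
def convolve(f, g):
    """Discrete convolution: (f * g)[m] = sum of f[i] * g[m - i]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def norm1(f):
    """Sum of absolute values (the L^1 norm for counting measure)."""
    return sum(abs(v) for v in f)

f = [1.0, -2.0, 3.0]
g = [0.5, 4.0, -1.0]
h = convolve(f, g)                       # [0.5, 3.0, -7.5, 14.0, -3.0]
lhs, rhs = norm1(h), norm1(f) * norm1(g)
```

Here `lhs` is 28 and `rhs` is 33, so the inequality is strict because of cancellation between terms of opposite sign. For nonnegative sequences the interchange of the two sums gives equality, which mirrors the Fubini computation in the proof.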


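26.20. Young's Convolution Theorem. Let $1 \le p, q, r \le \infty$, $\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}$. If $f \in \mathcal{L}^p$, $g \in \mathcal{L}^q$, then $f * g = g * f$ is defined almost everywhere, it is an element of $\mathcal{L}^r$ and $\|f * g\|_r \le \|f\|_p \|g\|_q$.
Proof. With trivial cases out of the way (cf. also the proof of Theorem 26.21) we assume $p, q, r < \infty$. Denoting $\tilde f = |f|^p$, $\tilde g = |g|^q$, the functions $\tilde f$ and $\tilde g$ are in $\mathcal{L}^1$. By the previous theorem the convolution $\tilde f * \tilde g$ is defined almost everywhere and it is in $\mathcal{L}^1$. For a fixed $x$, the Hölder inequality yields
$$\Bigl|\int_{\mathbb{R}^n} f(x - y) g(y) \, dy\Bigr| \le \int_{\mathbb{R}^n} |f(x - y)|^{\frac{r-p}{r}} |g(y)|^{\frac{r-q}{r}} |f(x - y)|^{\frac{p}{r}} |g(y)|^{\frac{q}{r}} \, dy$$
$$\le \Bigl(\int_{\mathbb{R}^n} |f(x - y)|^p \, dy\Bigr)^{\frac{r-p}{pr}} \Bigl(\int_{\mathbb{R}^n} |g(y)|^q \, dy\Bigr)^{\frac{r-q}{qr}} \Bigl(\int_{\mathbb{R}^n} |f(x - y)|^p |g(y)|^q \, dy\Bigr)^{\frac{1}{r}}$$
$$= \|f\|_p^{\frac{r-p}{r}} \|g\|_q^{\frac{r-q}{r}} \bigl((\tilde f * \tilde g)(x)\bigr)^{\frac{1}{r}}.$$
By Fubini's theorem we have the equality
$$\int_{\mathbb{R}^n} \tilde f * \tilde g = \int_{\mathbb{R}^n} \Bigl(\int_{\mathbb{R}^n} \tilde f(x - y) \tilde g(y) \, dy\Bigr) dx = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \tilde f(x - y) \tilde g(y) \, dx \, dy = \|f\|_p^p \|g\|_q^q,$$
and consequently
$$\|f * g\|_r^r = \int_{\mathbb{R}^n} \Bigl|\int_{\mathbb{R}^n} f(x - y) g(y) \, dy\Bigr|^r dx \le \|f\|_p^{r-p} \|g\|_q^{r-q} \int_{\mathbb{R}^n} (\tilde f * \tilde g)(x) \, dx \le \|f\|_p^{r-p} \|g\|_q^{r-q} \|f\|_p^p \|g\|_q^q = \|f\|_p^r \|g\|_q^r.$$
In fact, a precise proof should also contain a verification of measurability of the function
$$x \mapsto \int_{\mathbb{R}^n} f(x - y) g(y) \, dy.$$
This is clear if $f$ and $g$ are bounded and of bounded support, as $f(x - y) g(y)$ is then integrable on $\mathbb{R}^n \times \mathbb{R}^n$, cf. 26.18. The general case can be obtained by approximation $f_k \to f$, $g_k \to g$, where
$$f_k(x) = \begin{cases} f(x) & \text{if } |x| \le k \text{ and } |f(x)| \le k, \\ 0 & \text{otherwise} \end{cases}$$
and the $g_k$'s are defined analogously.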


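26.21. Theorem. If $f \in \mathcal{L}^p$ ($1 \le p \le \infty$) and if $\mu$ is a finite signed Radon measure on $\mathbb{R}^n$, then $f * \mu \in \mathcal{L}^p$ and
$$\|f * \mu\|_p \le \|f\|_p \, |\mu|(\mathbb{R}^n).$$
Proof. (a) Let $1 < p < +\infty$. The Hölder inequality gives (for a fixed $x$)
$$\Bigl|\int_{\mathbb{R}^n} f(x - y) \, d\mu(y)\Bigr| \le \Bigl(\int_{\mathbb{R}^n} |f(x - y)|^p \, d|\mu|(y)\Bigr)^{\frac{1}{p}} \bigl(|\mu|(\mathbb{R}^n)\bigr)^{\frac{p-1}{p}},$$
whence (by Fubini's theorem)
$$\int_{\mathbb{R}^n} |f * \mu|^p \, dx \le \bigl(|\mu|(\mathbb{R}^n)\bigr)^{p-1} \int_{\mathbb{R}^n} \Bigl(\int_{\mathbb{R}^n} |f(x - y)|^p \, dx\Bigr) d|\mu|(y) = \bigl(|\mu|(\mathbb{R}^n)\bigr)^p \|f\|_p^p.$$
(b) If $p = 1$, then Fubini's theorem yields immediately
$$\int_{\mathbb{R}^n} |f * \mu| \, dx \le \int_{\mathbb{R}^n} \Bigl(\int_{\mathbb{R}^n} |f(x - y)| \, dx\Bigr) d|\mu|(y) = |\mu|(\mathbb{R}^n) \, \|f\|_1.$$
(c) If $p = +\infty$, then
$$|f * \mu(x)| \le \int_{\mathbb{R}^n} |f(x - y)| \, d|\mu|(y) \le \|f\|_\infty \, |\mu|(\mathbb{R}^n)$$
for almost all $x \in \mathbb{R}^n$.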


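26.22. Remark. Suppose $1 \le p \le \infty$, $\frac{1}{p} + \frac{1}{q} = 1$, $f \in \mathcal{L}^p$, $g \in \mathcal{L}^q$. Then $f * g \in \mathcal{L}^\infty$ by Young's convolution theorem 26.20 and $\|f * g\|_\infty \le \|f\|_p \|g\|_q$. An even stronger assertion holds: The convolution $f * g$ is uniformly continuous on $\mathbb{R}^n$, and if $1 < p < \infty$, then $f * g \in \mathcal{C}_0(\mathbb{R}^n)$. These properties can be proved using methods of Chapter 31 (see Exercise 31.9).
26.23. Exercise. Suppose $f, g, h \in \mathcal{L}^1$. Show that $(f * g) * h = f * (g * h)$ almost everywhere.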

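26.24. Exercise. Now we indicate another way to define the Jordan–Peano volume on Euclidean spaces (compare with Exercise 5.12). Let $m$ be a positive integer and $i = [i_1, \dots, i_n] \in \mathbb{Z}^n$ be a multiindex. Denote by $Q_{m,i}$ the closed cube $[(i_1 - 1)2^{-m}, i_1 2^{-m}] \times \dots \times [(i_n - 1)2^{-m}, i_n 2^{-m}]$. If $m$ is fixed and $i$ runs over $\mathbb{Z}^n$, the cubes $Q_{m,i}$ are non-overlapping (i.e. their interiors are pairwise disjoint) and cover $\mathbb{R}^n$. If $E \subset \mathbb{R}^n$, let $\overline{\kappa}(E, m)$ denote $2^{-mn}$ times the number of elements of the set $\{i : Q_{m,i} \cap E \ne \emptyset\}$ and $\underline{\kappa}(E, m)$ denote $2^{-mn}$ times the number of elements of the set $\{i : Q_{m,i} \subset E\}$,
$$\overline{\kappa} E = \inf_m \overline{\kappa}(E, m) \quad\text{and}\quad \underline{\kappa} E = \sup_m \underline{\kappa}(E, m).$$
Prove that:
(a) $\underline{\kappa} E \le \overline{\kappa} E$ for any set $E \subset \mathbb{R}^n$;
(b) $\underline{\kappa} G = \lambda G$ for any open set $G$;
(c) $\overline{\kappa} K = \lambda K$ for every compact set $K$;
(d) for a bounded set $E$, $\underline{\kappa} E = \overline{\kappa} E$ if and only if $\lambda(\partial E) = 0$.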
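The two dyadic counts from this exercise can be computed exactly for a simple set. The sketch below is an illustration for the closed unit disk in $\mathbb{R}^2$ (so $n = 2$); the function name and the integer bookkeeping are choices made for this example. Coordinates are measured in units of the cube side $2^{-m}$, so all comparisons are exact integer comparisons.

```python
import math

def dyadic_contents_of_disk(m):
    """Return (lower, upper): 2^(-2m) times the number of level-m cubes
    contained in, respectively meeting, the closed unit disk."""
    scale = 2 ** m                  # the cube side is 1 / scale
    limit = scale * scale           # squared radius 1, in units of the cube side

    def near(i):                    # distance from 0 to the interval [i - 1, i]
        return 0 if i in (0, 1) else min(abs(i - 1), abs(i))

    def far(i):                     # largest |coordinate| in the interval [i - 1, i]
        return max(abs(i - 1), abs(i))

    inner = outer = 0
    for i1 in range(-scale - 1, scale + 3):
        for i2 in range(-scale - 1, scale + 3):
            if far(i1) ** 2 + far(i2) ** 2 <= limit:
                inner += 1          # the farthest corner lies in the disk
            if near(i1) ** 2 + near(i2) ** 2 <= limit:
                outer += 1          # the nearest point of the cube lies in the disk
    return inner / limit, outer / limit

low3, up3 = dyadic_contents_of_disk(3)
low6, up6 = dyadic_contents_of_disk(6)
```

As `m` grows, the lower value increases and the upper value decreases, and both squeeze the area $\pi$ of the disk, in line with the fact that the boundary circle has measure zero.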

26.25. Exercise. Let $G \subset \mathbb{R}^n$ be an open set. Show that there exists a disjoint collection of balls in $G$ whose union covers $G$ except on a Lebesgue null set.
Hint. The assertion is an easy consequence of Vitali's covering theorem 27.2. Here we indicate an elementary proof. We can assume that $G$ is of finite measure. Set $G_0 = G$. By virtue of Exercise 26.24 there are closed cubes $Q_1, \dots, Q_p \subset G$ whose interiors are pairwise disjoint and whose union has a measure greater than $\frac{1}{2}\lambda G$. Set $c = 2^{-n} \lambda B(0, 1/2)$. Then in the interior of every cube having a measure $\delta$ there is a closed ball of measure $c\delta$. Hence we get closed pairwise disjoint balls $B_j \subset Q_j$ whose union $F_1$ has a measure greater than $\frac{c}{2}\lambda G$. Set $G_1 = G_0 \setminus F_1$. Inductively, there are $F_{i+1} \subset G_i := G_{i-1} \setminus F_i$ which are unions of pairwise disjoint closed balls with $\lambda G_i \le (1 - \frac{c}{2}) \lambda G_{i-1} \le (1 - \frac{c}{2})^i \lambda G$.


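26.26. Exercise. Let $E \subset \mathbb{R}^n$ be an arbitrary set. Prove that
$$\lambda^* E = \inf\Bigl\{\sum_j \lambda U(x_j, r_j) : \bigcup_j U(x_j, r_j) \supset E\Bigr\}.$$
Hint. It may be supposed that $\lambda^* E < +\infty$. Choose an $\varepsilon > 0$ and find an open set $G \supset E$ with $\lambda G < \lambda^* E + \varepsilon$. Thanks to Exercise 26.25 there exists a countable disjoint collection $\{V_j\}$ of open balls so that $\{V_j\}$ covers $G$ except on a null set $N$ and
$$\sum_j \lambda V_j = \lambda\Bigl(\bigcup_j V_j\Bigr) \le \lambda G < \lambda^* E + \varepsilon.$$
Find a countable family $\{I_j\}$ of open intervals such that $\bigcup_j I_j \supset N$ and $\sum_j \lambda I_j \le \varepsilon$. According to Exercise 26.24 there exist closed cubes $Q_j^m$ with $\bigcup_m Q_j^m \supset I_j$ and $\sum_m \lambda Q_j^m < 2\lambda I_j$. For each cube $Q_j^m$ find an open ball $W_j^m$ with the same centre and with radius equal to the diameter of this cube (so that $W_j^m$ contains $Q_j^m$). It is not hard to check that there exists a constant $c$ (depending only on the dimension $n$) such that $\lambda W_j^m \le c\lambda Q_j^m$. Whence
$$\sum_{j,m} \lambda W_j^m \le c \sum_{j,m} \lambda Q_j^m \le 2c \sum_j \lambda I_j \le 2c\varepsilon.$$
To finish the proof, it is enough to consider the union of all balls from the collections $\{V_j\}$ and $\{W_j^m\}$.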

27. Covering Theorems


Covering theorems provide an important tool for deriving deeper results in measure and integration theory. We will present two Vitali-type theorems. The task is to find, for a given cover, a countable or finite disjoint subfamily in such a way that the part of the set left uncovered has small measure.
27.1. Vitali Cover. We say that a collection $\mathcal{V}$ of closed balls is a Vitali cover of a set $A \subset \mathbb{R}^n$ (or that $\mathcal{V}$ has the Vitali property) if for each $x \in A$ and each $r > 0$ there is $B \in \mathcal{V}$ so that $x \in B \subset B(x, r)$.
First we state the classical theorem of Vitali.
27.2. Vitali's Covering Theorem. Let $\mathcal{V}$ be a Vitali cover of a set $A \subset \mathbb{R}^n$. Then there exists a pairwise disjoint countable subcollection $\mathcal{A} \subset \mathcal{V}$ such that
$$\lambda^*\Bigl(A \setminus \bigcup \mathcal{A}\Bigr) = 0.$$
Proof. It is no restriction to assume that $A \subset H \subset \mathbb{R}^n$, where $H$ is an open set having finite measure. Indeed, if we prove the theorem in this case, we can apply it successively to $H = U(0, 1)$, $H = U(0, 2) \setminus B(0, 1)$, $H = U(0, 3) \setminus B(0, 2)$, $\dots$ and find a countable pairwise disjoint subcollection of the given family which covers $A \setminus \{x \in \mathbb{R}^n : |x| \in \mathbb{N}\}$ (notice that $\lambda\{x \in \mathbb{R}^n : |x| \in \mathbb{N}\} = 0$). It may also be supposed that no finite union of pairwise disjoint balls of $\mathcal{V}$ covers $A$. Denote $W = \{(x, r) : B(x, r) \in \mathcal{V}\}$. Now we are going to construct inductively a sequence $\{B_j\}$ of pairwise disjoint balls of $\mathcal{V}$. In the $0$th step we have the empty collection of balls. Assume that in the $k$th step ($k = 1, 2, \dots$) the closed balls $B_1, \dots, B_{k-1} \in \mathcal{V}$ are selected and denote
$$s_k = \sup\Bigl\{r > 0 : B(x, r) \in \mathcal{V},\ B(x, r) \cap \bigcup_{j=1}^{k-1} B_j = \emptyset,\ B(x, r) \subset H\Bigr\}.$$
The Vitali property and the additional assumptions ensure that $s_k > 0$. Further $+\infty > s_1 \ge s_2 \ge \dots$ since $H$ is of finite measure. Now choose a ball $B_k = B(x_k, r_k) \in \mathcal{V}$ with $r_k > s_k/2$ and $B_k \cap B_j = \emptyset$ for $j < k$.
We get a sequence $\mathcal{A} := \{B_k, k \in \mathbb{N}\}$ of pairwise disjoint sets from $\mathcal{V}$. We show that $\mathcal{A}$ covers $A$ except on a set of Lebesgue measure zero. To this end let $p \in \mathbb{N}$ and $x \in A \setminus \bigcup \mathcal{A}$ be given. We now invoke the Vitali property and find a ball $B = B(y, s) \in \mathcal{V}$ with $x \in B$ and $B \cap B_j = \emptyset$ for $j = 1, \dots, p - 1$. Clearly $s \le s_p$. Since $\lambda H < \infty$, we have $\sum_k r_k^n < \infty$ and $\lim s_k = \lim r_k = 0$. Therefore we can find $q \ge p$ with $s_{q+1} < s \le s_q$. By the definition of $s_{q+1}$ there exists $i \le q$ with $B_i \cap B \ne \emptyset$. If $z$ denotes a point of $B_i \cap B$, then
$$|x - x_i| \le |x - y| + |y - z| + |z - x_i| \le s + s + r_i \le 2s_q + r_i \le 2s_i + r_i \le 5r_i,$$
whence $x \in B(x_i, 5r_i)$. Since $i \ge p$, we get
$$A \setminus \bigcup \mathcal{A} \subset \bigcup_{i=p}^{\infty} B(x_i, 5r_i),$$
and thus
$$\lambda^*\Bigl(A \setminus \bigcup \mathcal{A}\Bigr) \le \sum_{i=p}^{\infty} 5^n \lambda B(x_i, r_i).$$
Taking the limit as $p \to \infty$ it follows that $\lambda^*(A \setminus \bigcup \mathcal{A}) = 0$, and we have finished.
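The selection step of the proof has a simple finite counterpart. The sketch below is an illustration under simplifying assumptions: the family is a finite list of closed disks in the plane, and instead of the rule "radius larger than half of the supremum" it always takes the largest remaining disk that is disjoint from those already chosen. With this largest-first rule every disk of the family lies in the enlargement of some chosen disk by the factor 3, hence also by the factor 5 used in the proof. The sample data and names are arbitrary.

```python
import math
import random

def greedy_disjoint(balls):
    """Largest-first selection of pairwise disjoint closed balls.

    `balls` is a list of ((x, y), radius); two closed balls are disjoint
    exactly when the distance of their centres exceeds the sum of the radii.
    """
    chosen = []
    for centre, r in sorted(balls, key=lambda ball: -ball[1]):
        if all(math.dist(centre, c) > r + s for c, s in chosen):
            chosen.append((centre, r))
    return chosen

rng = random.Random(0)
family = [((rng.random(), rng.random()), 0.01 + 0.1 * rng.random()) for _ in range(300)]
selected = greedy_disjoint(family)

# Every ball of the family is contained in B(c, 5 s) for some selected ball B(c, s).
covered = all(
    any(math.dist(centre, c) + r <= 5 * s for c, s in selected)
    for centre, r in family
)
```

The reason is the same chain of triangle inequalities as in the proof: a ball that was rejected meets an earlier chosen ball whose radius is at least as large.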

27.3. Corollary. Let $A \subset H \subset \mathbb{R}^n$, where $H$ is an open set of finite measure. If a collection $\mathcal{V}$ of closed balls in $H$ is a Vitali cover of $A$ and $\varepsilon > 0$, then there exists a finite collection of pairwise disjoint balls $\mathcal{A} \subset \mathcal{V}$ such that
$$\lambda^*\Bigl(A \setminus \bigcup \mathcal{A}\Bigr) < \varepsilon.$$
Proof. Set $\mathcal{A} = \{B_1, \dots, B_p\}$, where $B_j$ are as in the proof of the previous theorem and $p$ is sufficiently large.
Although Theorem 27.2 is sufficiently strong for most of the applications, we also state more powerful covering theorems which will be useful in Chapter 28.
27.4. Besicovitch Theorem. There exists a positive integer $N$ (depending only on the dimension $n$ of the space $\mathbb{R}^n$) with the following property:
Whenever $A \subset \mathbb{R}^n$ is a set and $\delta$ is a bounded positive function on $A$, then there exists a finite family of countable sets $D_1, \dots, D_N \subset A$ such that for each $k \in \{1, 2, \dots, N\}$, the collection $\{B(x, \delta(x)) : x \in D_k\}$ is pairwise disjoint, and
$$A \subset \bigcup\Bigl\{U(x, \delta(x)) : x \in \bigcup_{k=1}^{N} D_k\Bigr\}.$$
Proof. Thanks to the compactness of $S := \{x \in \mathbb{R}^n : |x| = 1\}$, there is a finite set $T \subset S$ such that
$$S \subset \bigcup_{t \in T} U(t, \tfrac{1}{40}).$$
Let $\tau$ denote the number of elements of $T$ and $N := 2\tau + 1$. First let us assume that $A$ is bounded. A sequence $\{x_j\}$ of distinct points of $A$ will be chosen inductively as follows: Denote
$$s_1 = \sup\{\delta(x) : x \in A\}$$
and find $x_1 \in A$ with $\delta(x_1) > \frac{7}{8} s_1$. In the $j$th step ($j > 1$) denote
$$M_j = A \setminus \bigcup_{i < j} U(x_i, \delta(x_i)).$$
If $M_j = \emptyset$, then we stop and set $D = \{x_i : i = 1, \dots, j - 1\}$. Otherwise, let $s_j = \sup\{\delta(x) : x \in M_j\}$, find $x_j \in M_j$ with $\delta(x_j) > \frac{7}{8} s_j$ and set $D = \{x_i : i \in \mathbb{N}\}$.
We claim that $A \subset \bigcup_j U_j$, where $U_j := U(x_j, \delta(x_j))$. This is clear if $D$ is finite. In the general case, since $A$ is bounded, $\{x_j\}$ has a Cauchy subsequence $\{x_{j_k}\}$. Observe that $\delta(x_i) \le |x_i - x_j|$ (indeed, for $i < j$, $x_j$ lies outside the ball $U(x_i, \delta(x_i))$), and therefore
$$\inf\{\delta(x_i) : i \in \mathbb{N}\} \le \inf\{|x_i - x_j| : i, j \in \mathbb{N},\ i \ne j\} = 0.$$
Given now $x \in A$, there is $j$ with $\delta(x_j) < \frac{7}{8}\delta(x)$. It is then apparent that $\delta(x) > s_j$, so that $x \notin M_j$. Whence $x \in \bigcup_{i < j} U_i$.
Next we define a function $p$. For every $x_j \in D$, set
$$P_j = \{i \in \{1, \dots, j - 1\} : B_i \cap B_j \ne \emptyset\},$$
where $B_j := B(x_j, \delta(x_j))$, and define inductively $p(x_1) = 1$,
$$p(x_j) = \min\{k \in \mathbb{N} : k \ne p(x_i) \text{ for all } i \in P_j\}.$$
Set $D_k = \{x_j : p(x_j) = k\}$. Plainly the collection of balls $\{B_j : x_j \in D_k\}$ is pairwise disjoint for all $k \in \mathbb{N}$, and to complete the proof we only have to show that $p(x_j) \le N$ for each $x_j \in D$. We will be done once we prove that, for each $x_j \in D$, the number of balls $B_i$, $i < j$, intersecting $B_j$ is less than $N$.
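The inductive definition of the function $p$ is a greedy colouring, and it can be sketched in a few lines. The example below is illustrative only: it works on the real line, where a closed ball is an interval given by (centre, radius), and the sample list of balls is made up. Each ball receives the least positive integer not already used by an earlier ball that intersects it.

```python
def colour_balls(balls):
    """Greedy colouring of closed intervals given as (centre, radius),
    listed in their order of selection.

    p[j] is the least positive integer different from p[i] for every
    earlier ball i that intersects ball j.
    """
    p = []
    for j, (xj, rj) in enumerate(balls):
        used = {p[i] for i, (xi, ri) in enumerate(balls[:j])
                if abs(xi - xj) <= ri + rj}       # closed balls intersect
        k = 1
        while k in used:
            k += 1
        p.append(k)
    return p

balls = [(0.0, 1.0), (1.5, 1.0), (3.0, 1.0), (0.7, 0.2), (4.5, 0.4)]
colours = colour_balls(balls)                      # [1, 2, 1, 3, 1]
```

Balls with the same colour are pairwise disjoint by construction, and the colour of a ball never exceeds one plus the number of earlier balls it intersects. This is why the proof reduces to bounding that number.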


We shall suppose that the last assertion does not hold, and derive a contradiction. Select $z := x_j \in D$. If the number of balls $B_i$, $i < j$, intersecting $B_j$ is at least $N$, then there exist $z_1, z_2, z_3 \in D$ and $t \in S$ so that the balls $B(z_k, \delta(z_k))$ intersect $B_j$,
$$z_k = x_{j_k} \quad\text{for } j_1 < j_2 < j_3 < j$$
and
$$\Bigl|\frac{z - z_k}{|z - z_k|} - t\Bigr| < \frac{1}{40} \quad\text{for } k = 1, 2, 3.$$
For $k = 1, 2, 3$ denote
$$r = \delta(z), \quad r_k = \delta(z_k), \quad R_k = |z - z_k|.$$
We will show that $r$, $r_k$ and $R_k$ satisfy the following system of 16 inequalities
(I) $r > 0$,
(II$_k$) $R_k \le r_k + r$,
(III$_k$) $r < \frac{8}{7} r_k$,
(IV$_1$) $r_3 < \frac{8}{7} r_2$,
(IV$_2$) $r_3 < \frac{8}{7} r_1$,
(IV$_3$) $r_2 < \frac{8}{7} r_1$,
(V$_k$) $r_k \le R_k$,
(VI$_1$) $r_2 \le |R_2 - R_3| + \frac{1}{8} r_3$,
(VI$_2$) $r_1 \le |R_1 - R_3| + \frac{1}{8} r_3$,
(VI$_3$) $r_1 \le |R_1 - R_2| + \frac{1}{8} r_2$,
which does not have any solution. Therewith we get the required contradiction.
Now, we will derive the system of inequalities (I)–(VI$_3$). The inequality (II$_k$) says that the ball $B(z_k, r_k)$ intersects $B(z, r)$. The inequalities (III), (IV) follow from the inequalities
$$\delta(x_j) \le s_i < \frac{8}{7} \delta(x_i) \quad\text{for } i < j.$$
Since $\delta(x_i) \le |x_j - x_i|$ for $i < j$, we get (V) and (VI). For instance, to obtain (VI$_1$) we use the definition of $N$, (II$_3$) and (III$_3$) in order to show that
$$r_2 \le |z_2 - z_3| \le \Bigl|(z_2 - z) - \frac{|z_3 - z|}{|z_2 - z|}(z_2 - z)\Bigr| + \Bigl|\frac{|z_3 - z|}{|z_2 - z|}(z_2 - z) - (z_3 - z)\Bigr|$$
$$= |R_2 - R_3| + R_3 \Bigl|\frac{z_3 - z}{|z_3 - z|} - \frac{z_2 - z}{|z_2 - z|}\Bigr| \le |R_2 - R_3| + 2R_3 \cdot \frac{1}{40}$$
$$\le |R_2 - R_3| + \frac{1}{20}(r_3 + r) \le |R_2 - R_3| + \frac{1}{20} \cdot \frac{15}{7} r_3 \le |R_2 - R_3| + \frac{1}{8} r_3.$$


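Now we show that the system of inequalities (I)–(VI) does not possess a solution. First, note that $r_k$ and $R_k$ are positive by (I), (III) and (V). By (II$_1$), (II$_2$), (V$_1$), (V$_2$), (IV$_3$) and (III$_2$) we have
$$|R_1 - R_2| \le r + |r_1 - r_2| = r + r_1 + r_2 - 2\min(r_1, r_2) \le r + r_1 + r_2 - \frac{7}{4} r_2 < r_1 + \frac{3}{7} r_2.$$
On the other hand, (VI$_1$), (VI$_2$) and (IV$_1$) yield
$$|R_1 - R_3| + |R_2 - R_3| \ge r_1 - \frac{1}{8} r_3 + r_2 - \frac{1}{8} r_3 \ge r_1 + \frac{5}{7} r_2,$$
whence
$$|R_1 - R_2| < |R_1 - R_3| + |R_3 - R_2|.$$
Similarly we obtain
$$|R_1 - R_3| < |R_1 - R_2| + |R_2 - R_3|$$
and
$$|R_2 - R_3| < |R_1 - R_3| + |R_2 - R_1|.$$
But $R_1$, $R_2$, $R_3$ are real numbers, so one of these three relations must hold with equality (the one whose left-hand side joins the smallest and the largest of them). We see that the system (I)–(VI$_3$) has no solution and this contradiction yields the required conclusion.
For the general case when $A$ is unbounded, let $K > 2 \sup_{x \in A} \delta(x)$, and denote
$$A_i = \{x \in A : (i - 1)K \le |x| < iK\}.$$
By the above argument we find for each $i \in \mathbb{N}$ corresponding sets $D_k^i \subset A_i$, $k = 1, \dots, N$. Setting
$$D_k = \begin{cases} \bigcup\{D_k^i : i \text{ even}\} & \text{for } k \le N, \\ \bigcup\{D_{k-N}^i : i \text{ odd}\} & \text{for } N < k \le 2N, \end{cases}$$
we see that the number $2N$ has the required properties.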
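27.5. Lemma. Let $\mu$ be a Radon measure on $\mathbb{R}^n$ and $N$ as in Theorem 27.4. Suppose further that $A \subset H \subset \mathbb{R}^n$, where $H$ is an open set with $\mu H < +\infty$. Given a positive bounded function $\delta$ on $A$ such that $B(x, \delta(x)) \subset H$ for all $x \in A$, there exist an open set $U \subset H$ and a finite set $F \subset A$ such that $A \subset U$, the collection of balls $\{B(x, \delta(x))\}_{x \in F}$ is pairwise disjoint and
$$\mu\Bigl(U \setminus \bigcup_{x \in F} B(x, \delta(x))\Bigr) < \Bigl(1 - \frac{1}{2N}\Bigr) \mu U.$$
Proof. By the previous theorem we can find countable sets $D_1, \dots, D_N \subset A$ such that for $D := \bigcup_{k=1}^{N} D_k$ and $k \in \{1, 2, \dots, N\}$, the collection of balls
$$\{B(x, \delta(x)) : x \in D_k\}$$
is pairwise disjoint and
$$A \subset \bigcup_{x \in D} B(x, \delta(x)).$$
Denote
$$E_k = \bigcup_{x \in D_k} B(x, \delta(x)), \qquad U = \bigcup_{x \in D} U(x, \delta(x)).$$
We may find an index $q \in \{1, \dots, N\}$ with $\mu E_q \ge \frac{1}{N} \mu U$ and a finite set $F \subset D_q$ such that
$$\mu\Bigl(\bigcup_{x \in F} B(x, \delta(x))\Bigr) > \frac{1}{2N} \mu U.$$
Now, it is not hard to see that
$$\mu\Bigl(U \setminus \bigcup_{x \in F} B(x, \delta(x))\Bigr) < \Bigl(1 - \frac{1}{2N}\Bigr) \mu U.$$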

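27.6. Vitali's Covering Theorem for Radon Measures. Let $\mu$ be a Radon measure on $\mathbb{R}^n$ and $A \subset H \subset \mathbb{R}^n$, where $H$ is an open set of finite $\mu$-measure. Let further $\mathcal{V}$ be a family of closed balls of $\mathbb{R}^n$ having the following Vitali-type property: Given $x \in A$, there exist $r_j \searrow 0$ such that $B(x, r_j) \in \mathcal{V}$. Then for any $\varepsilon > 0$ there exists an open set $G \subset \mathbb{R}^n$ with $\mu G < \varepsilon$ and a finite family $\mathcal{A}$ of pairwise disjoint balls of $\mathcal{V}$ such that
$$A \subset G \cup \bigcup \mathcal{A}.$$
Proof. We will construct recursively sequences of sets $\{G_k\}$ and $\{A_k\}$ according to the following rule: Set $G_0 = H$, $A_0 = A$. Suppose that in the $k$th step the open sets $G_0 \supset \dots \supset G_{k-1}$ and sets $A_0 \supset \dots \supset A_{k-1}$ have been chosen. By Lemma 27.5 there is an open set $U_k$, $A_{k-1} \subset U_k \subset G_{k-1}$, and a closed set $M_k \subset G_{k-1}$ so that $M_k$ is the union of a finite pairwise disjoint collection of balls of $\mathcal{V}$, and
$$\mu(U_k \setminus M_k) < \Bigl(1 - \frac{1}{2N}\Bigr) \mu U_k.$$
Set
$$A_k = A_{k-1} \setminus M_k, \qquad G_k = U_k \setminus M_k.$$
Our construction guarantees that $A \setminus G_k$ is covered by the selected balls (which constitute a finite pairwise disjoint subcollection of $\mathcal{V}$ whose union is $M_1 \cup \dots \cup M_k$) and
$$\mu G_k < \Bigl(1 - \frac{1}{2N}\Bigr) \mu G_{k-1} \le \Bigl(1 - \frac{1}{2N}\Bigr)^k \mu G_0.$$
Taking sufficiently large $k$, we get the desired conclusion.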


27.7. Corollary. Let $\mu$ be a complete Radon measure on $\mathbb{R}^n$, $A \subset \mathbb{R}^n$ and $\mathcal{V}$ be a collection of closed balls in $\mathbb{R}^n$. If for any $x \in A$ there exists a sequence $r_j \searrow 0$ with $B(x, r_j) \in \mathcal{V}$, then there exists a countable pairwise disjoint collection $\mathcal{A} \subset \mathcal{V}$ such that
$$\mu\Bigl(A \setminus \bigcup \mathcal{A}\Bigr) = 0.$$
Proof. The proposition is a straightforward consequence of Theorem 27.6 if $A$ is bounded. For the general case, we use a similar idea as in the proof of Theorem 27.2. Since it can happen that $\mu\{x : |x| = r\} \ne 0$, we use Exercise 15.16.b in order to find a sequence $\{r_1, r_2, \dots\}$, $r_j \nearrow +\infty$, for which $\mu\{x : |x| = r_j\} = 0$ for every $j \in \mathbb{N}$.
27.8. Remark. Stating Theorem 27.6 and Corollary 27.7 we assumed that centers of given
balls lie in the covered set. Hence, Theorem 27.2 and its Corollary 27.3 are not direct consequences of Theorem 27.6 and Corollary 27.7.
27.9. Exercise. Let $(P, \rho)$ be a separable metric space, $A \subset P$ and $\alpha > 1$. Suppose there is $W \subset P \times (0, 1]$ such that the collection of balls $\{B(x, r) : (x, r) \in W\}$ covers $A$. Show that there exists a countable collection $S \subset W$ such that the balls $B(x, r)$, $(x, r) \in S$, are pairwise disjoint and
\[
A \subset \bigcup_{(x,r) \in S} U(x, (1 + 2\alpha) r).
\]

Hint. Step by step, find for any $k \in \mathbb{N}$ a countable collection $S_k \subset W_k$, where
\[
W_k := \Bigl\{ (x, r) \in W : \alpha^{-k} < r \le \alpha^{-k+1},\ B(x, r) \cap \bigcup \{B(y, s) : (y, s) \in S_1 \cup \dots \cup S_{k-1}\} = \emptyset \Bigr\},
\]
in such a way that the balls $B(x, r)$, $(x, r) \in S_k$, are pairwise disjoint and
\[
B(x, r) \cap \bigcup \{B(y, s) : (y, s) \in S_k\} \ne \emptyset
\]
for all $(x, r) \in W_k$.

(This can be done, for instance, in the following way: Let $\{q_j\}$ be a dense sequence of points of $P$. We start from $S_k^0 = \emptyset$ and go on recursively. Let $j \ge 1$ be a positive integer. If there exists $(x, r) \in W_k$ so that $q_j \in B(x, r)$ and $B(x, r) \cap B(y, s) = \emptyset$ for all $(y, s) \in S_k^{j-1}$, then set $S_k^j = S_k^{j-1} \cup \{(x, r)\}$. In the opposite case leave $S_k^j = S_k^{j-1}$.)

Now set $S = \bigcup_{k=1}^{\infty} S_k$. The balls $B(x, r)$, $(x, r) \in S$, are plainly pairwise disjoint. Take $x \in A$. We are going to prove that
\[
x \in \bigcup_{(u,r) \in S} U(u, (1 + 2\alpha) r).
\]
Since
\[
A \subset \bigcup_{(u,r) \in W} B(u, r),
\]
there is $(y, s) \in W$ with $x \in B(y, s)$. Further find $k \in \mathbb{N}$ satisfying $\alpha^{-k} < s \le \alpha^{-k+1}$. Then either $(y, s) \in S_k$ (and there is nothing to prove), or $B(y, s)$ intersects some of the previously selected balls. More precisely, there exists $z \in B(u, r) \cap B(y, s)$, where $(u, r) \in S_1 \cup \dots \cup S_k$. Obviously $s < \alpha r$, and therefore
\[
\rho(x, u) \le \rho(x, y) + \rho(y, z) + \rho(z, u) \le 2s + r < (2\alpha + 1) r.
\]
Hence $x \in U(u, (2\alpha + 1) r)$.


27.10. Exercise. Let $\mathcal{V}$ be a collection of open balls in $\mathbb{R}^n$, $M = \bigcup \mathcal{V}$, $c < \lambda M$. Prove that there exist disjoint balls $U_1, \dots, U_k \in \mathcal{V}$ such that
\[
\sum_{j=1}^{k} \lambda U_j > \frac{c}{3^n}.
\]

Hint. Find a compact set $K \subset M$ and a finite subcollection $\mathcal{V}' \subset \mathcal{V}$ so that $\lambda K > c$ and $\bigcup \mathcal{V}' \supset K$. Select the ball $U_1 \in \mathcal{V}'$ with the greatest radius. Again, select among the balls of $\mathcal{V}'$ which do not intersect $U_1$ the ball $U_2$ with the greatest radius. Suppose the balls $U_1, \dots, U_{j-1}$ have been chosen and find among the balls of $\mathcal{V}'$ which do not intersect $U_1, \dots, U_{j-1}$ the ball $U_j$ with the greatest radius. After a finite number of steps, we get a sequence $\{U_1, \dots, U_k\}$ to which we cannot add another ball. If $x \in K$, then $x \in U$ for some $U \in \mathcal{V}'$ and the construction implies that $U \cap U_i \ne \emptyset$ for some $i$. Let $i$ be the smallest of such indices. Then the radius of the ball $U$ is not greater than the radius $R$ of $U_i$, so that $x$ lies in the ball with the same center as $U_i$ and radius $3R$. Hence
\[
c < \lambda K \le 3^n \sum_{j=1}^{k} \lambda U_j.
\]
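The greedy selection described in the hint can be tried out numerically. The following is a minimal sketch in dimension $n = 1$, assuming a finite family of open intervals given as (center, radius) pairs; the names `greedy_disjoint` and `union_length` and the sample data are illustrative assumptions, not part of the text.

```python
def greedy_disjoint(balls):
    """Greedy selection from the hint: go through the balls by decreasing
    radius and keep a ball if it meets none of the balls kept so far.
    `balls` is a finite list of (center, radius) pairs (open intervals)."""
    chosen = []
    for c, r in sorted(balls, key=lambda b: -b[1]):
        # Open intervals (c-r, c+r) and (d-s, d+s) are disjoint iff |c-d| >= r+s.
        if all(abs(c - d) >= r + s for d, s in chosen):
            chosen.append((c, r))
    return chosen


def union_length(balls):
    """Lebesgue measure of the union of the open intervals (c-r, c+r)."""
    total, right_end = 0.0, float("-inf")
    for lo, hi in sorted((c - r, c + r) for c, r in balls):
        if hi > right_end:
            total += hi - max(lo, right_end)
            right_end = hi
    return total


# Hypothetical sample family of intervals.
balls = [(0.0, 1.0), (0.5, 0.8), (1.6, 0.5), (3.0, 0.4), (3.3, 0.6), (-1.2, 0.3)]
chosen = greedy_disjoint(balls)
selected_measure = sum(2 * r for _, r in chosen)
# The hint yields: measure of the union <= 3^n * (sum of selected measures), n = 1.
print(chosen, union_length(balls), 3 * selected_measure)
```

Processing the balls in order of decreasing radius is equivalent to the step-by-step choice of "the ball with the greatest radius among those not meeting the previous ones", which is why a single sorted pass suffices for a finite family.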

27.11. Exercise. Prove Vitali's covering theorem 27.2 using Exercise 27.10 in a similar way as Theorem 27.6 was derived from Lemma 27.5.
27.12. Notes. The famous covering theorem is due to G. Vitali [1908] who proved it for closed intervals and the Lebesgue measure. Later on, Vitali's covering theorem was generalized in various directions by many authors (H. Lebesgue [1910], S. Banach [1924], C. Carathéodory (second edition of the book [*1918] in 1927)). Another step forward was made by A.S. Besicovitch [1945], [1946] and by A.P. Morse [1947]. The origin of the simple covering theorem which appeared in Exercise 27.10 is in Wiener's article [1939].

28. Differentiation of Measures


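28.1. Derivative of a Measure. In the following, $\mu$ will stand for a Radon measure on $\mathbb{R}^n$ and $\lambda$ will denote again the Lebesgue measure. For any $x \in \mathbb{R}^n$, define
\[
\overline{D}\mu(x) := \limsup_{r \to 0+} \frac{\mu B(x, r)}{\lambda B(x, r)}
\quad\text{and}\quad
\underline{D}\mu(x) := \liminf_{r \to 0+} \frac{\mu B(x, r)}{\lambda B(x, r)}.
\]
If $\overline{D}\mu(x) = \underline{D}\mu(x)$, we call this common value the (symmetric) derivative of $\mu$ (with respect to $\lambda$) at $x$ and denote it by $D\mu(x)$.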
28.2. Remarks. 1. Recall that $\lambda B(x, r) = \dfrac{\pi^{n/2}}{\Gamma(\frac{n}{2} + 1)}\, r^n$, see Exercise 26.17.b.
2. The functions $x \mapsto \overline{D}\mu(x)$ and $x \mapsto \underline{D}\mu(x)$ are Borel-measurable. This assertion follows from the fact that the function $x \mapsto \mu B(x, r)$ is upper semicontinuous, $x \mapsto \lambda B(x, r)$ is continuous and that when defining $\overline{D}\mu$ and $\underline{D}\mu$, we can restrict to $r \in \mathbb{Q}$.
Compare with the following rather surprising theorem (O. Hájek [1957]): If $F$ is an arbitrary function on an interval $I \subset \mathbb{R}$, then the function $\overline{D}F$ (see 22.3) is Borel on $I$. The proof for continuous $F$ is indicated in the proof of Theorem 25.8. An analogous theorem fails for Dini derivatives: there are examples of functions for which all four Dini derivatives are nonmeasurable.
3. Let $\mu$ be a Radon measure on $\mathbb{R}$ and $F$ its distribution function (see 24.1). Then $F'(x) = D\mu(x)$ provided $F'(x)$ exists. Moreover, the following is true: If $F'(x) \le a$ for $x \in A$, then $\mu A \le a\lambda A$. Similarly, $\mu B \ge a\lambda B$ provided $F' \ge a$ on $B$. Compare this result with the next lemma.


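28.3. Lemma. Let $A \subset \mathbb{R}^n$ be a Borel set and $a > 0$.
(a) If $\underline{D}\mu(x) \le a$ for all $x \in A$, then $\mu A \le a\lambda A$.
(b) If $\overline{D}\mu(x) \ge a$ for all $x \in A$, then $\mu A \ge a\lambda A$.
Proof. With the trivial case $\lambda A = \infty$ out of the way, assume $\lambda A < \infty$ and fix an $\varepsilon > 0$. There is an open set $G \supset A$ with $\lambda G \le \lambda A + \varepsilon$. By Vitali's covering theorem 27.6 for Radon measures we can find disjoint closed balls $B(x_i, r_i) \subset G$ such that
\[
\mu B(x_i, r_i) \le (a + \varepsilon)\,\lambda B(x_i, r_i)
\quad\text{and}\quad
\mu\Bigl(A \setminus \bigcup_i B(x_i, r_i)\Bigr) = 0.
\]
(If $\mu$ is absolutely continuous with respect to $\lambda$, we can also use Vitali's covering theorem 27.2.) Then
\[
\mu A \le \sum_i \mu B(x_i, r_i) \le (a + \varepsilon) \sum_i \lambda B(x_i, r_i) \le (a + \varepsilon)\,\lambda G \le (a + \varepsilon)(\lambda A + \varepsilon),
\]
whence (a) follows.
The inequality in part (b) can be proved similarly using Vitali's covering theorem 27.2 for the Lebesgue measure.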
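28.4. Theorem. Any Radon measure $\mu$ on $\mathbb{R}^n$ has a finite derivative $D\mu$ $\lambda$-almost everywhere on $\mathbb{R}^n$.
Proof. Without loss of generality we can restrict to a compact interval $I \subset \mathbb{R}^n$. For $k \in \mathbb{N}$ and $r, s \in \mathbb{Q}_+$, $s < r$, set
\[
A_k = \{x \in I : \overline{D}\mu(x) \ge k\}, \qquad
A(r, s) = \{x \in I : \underline{D}\mu(x) \le s < r \le \overline{D}\mu(x)\}.
\]
By the previous lemma,
\[
k\lambda A_k \le \mu A_k \le \mu I < +\infty
\quad\text{and}\quad
r\lambda A(r, s) \le \mu A(r, s) \le s\lambda A(r, s) \le s\lambda I < +\infty.
\]
Hence $\lambda\bigl(\bigcap_k A_k\bigr) = \lim_k \lambda A_k = 0$ and $\lambda A(r, s) = 0$ (realize that $0 \le s < r$). The desired conclusion readily follows observing that
\[
0 \le \underline{D}\mu \le \overline{D}\mu \le +\infty, \qquad
\{x \in I : \overline{D}\mu(x) = +\infty\} = \bigcap_k A_k
\]
and
\[
\{x \in I : \underline{D}\mu(x) < \overline{D}\mu(x)\} = \bigcup_{r, s \in \mathbb{Q}_+} A(r, s).
\]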

28.5. Remark. Consider now the simple case when $F$ is the distribution function of a Radon measure $\mu$ on $\mathbb{R}$. We know that $F$ has a (finite) derivative $F'$ almost everywhere (Lebesgue's theorem 22.5), that $\int_a^b F'\,d\lambda \le F(b) - F(a)$ (Theorem 22.7), and that $\int_a^b F'\,d\lambda = F(b) - F(a)$ for any interval $[a, b]$ if and only if $\mu \ll \lambda$ (Corollary 23.5 and Exercise 24.7). In the last case, $F'$ agrees almost everywhere with the Radon-Nikodým derivative $\frac{d\mu}{d\lambda}$. An analogous assertion holds for differentiation of measures and it is contained in the next Theorem 28.6.

The proof of the following theorem is based on Lemma 28.3. A proof based on the covering theorem of Exercise 27.10 can be found in W. Rudin [*1974].


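28.6. Theorem. Let $\mu$ be a Radon measure on $\mathbb{R}^n$ and $B \subset \mathbb{R}^n$ a Borel set. Then:
(a) $\int_B D\mu\,d\lambda \le \mu B$ (in particular, $D\mu$ is locally $\lambda$-integrable);
(b) $\int_B D\mu\,d\lambda = \mu B$, provided $\mu \ll \lambda$.
Proof. Suppose $\alpha > 1$, $k \in \mathbb{Z}$ and set
\[
B_k(\alpha) = \{x \in B : \alpha^k \le D\mu(x) < \alpha^{k+1}\}, \qquad
M = \bigcup_{k=-\infty}^{+\infty} B_k(\alpha).
\]
According to Theorem 28.4, $D\mu$ exists and is finite $\lambda$-almost everywhere, so $D\mu = 0$ $\lambda$-almost everywhere on $B \setminus M$. If $\mu \ll \lambda$, then also $\mu(B \setminus M) = 0$: the set where $D\mu$ fails to be finite is $\lambda$-null, and the set where $D\mu = 0$ is $\mu$-null by Lemma 28.3.a (applied on bounded pieces with arbitrarily small $a$). Using Lemma 28.3.b, we get
\[
\int_B D\mu\,d\lambda = \int_M D\mu\,d\lambda = \sum_{k=-\infty}^{+\infty} \int_{B_k(\alpha)} D\mu\,d\lambda
\le \sum_{k=-\infty}^{+\infty} \alpha^{k+1}\,\lambda B_k(\alpha)
\le \alpha \sum_{k=-\infty}^{+\infty} \mu B_k(\alpha) \le \alpha\,\mu B,
\]
and, if $\mu \ll \lambda$, by part (a) of Lemma 28.3,
\[
\int_B D\mu\,d\lambda \ge \sum_{k=-\infty}^{+\infty} \int_{B_k(\alpha)} D\mu\,d\lambda
\ge \sum_{k=-\infty}^{+\infty} \alpha^{k}\,\lambda B_k(\alpha)
\ge \alpha^{-1} \sum_{k=-\infty}^{+\infty} \mu B_k(\alpha) = \alpha^{-1}\mu M = \alpha^{-1}\mu B.
\]
When taking the limit as $\alpha \to 1+$, we get both (a) and (b).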
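28.7. Theorem. The following statements about a Radon measure $\mu$ on $\mathbb{R}^n$ are equivalent:
(i) $\mu \ll \lambda$;
(ii) $\mu B = \int_B D\mu\,d\lambda$ for every Borel set $B \subset \mathbb{R}^n$;
(iii) $D\mu$ is the Radon-Nikodým derivative of $\mu$ with respect to $\lambda$;
(iv) $\overline{D}\mu < +\infty$ $\mu$-almost everywhere on $\mathbb{R}^n$.
Proof. The previous theorem says that (i) $\Rightarrow$ (ii), thus the equivalence (i) $\Leftrightarrow$ (ii) $\Leftrightarrow$ (iii) is obvious. To show that (i) implies (iv), use Theorem 28.6. Assume now (iv) and let $A \subset \mathbb{R}^n$, $\lambda A = 0$. Using Lemma 28.3.a, we get
\[
\mu\{x \in A : \overline{D}\mu(x) \le k\} \le k\,\lambda A = 0
\]
for every $k \in \mathbb{N}$, whence plainly $\mu A = 0$.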
28.8. Remarks. 1. It is not difficult to state similar results for signed Radon measures.
2. If $\mu$ is a Radon measure on $\mathbb{R}^n$ and $\mu = \mu_s + \mu_a$ is its Lebesgue decomposition into the singular and absolutely continuous part, then $D\mu$ is the Radon-Nikodým derivative of $\mu_a$ with respect to $\lambda$.
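A small worked example may help to see Theorem 28.6, Theorem 28.7 and the second remark at work. The following computation is an illustration added here under the assumption $n = 1$ and $\mu = \lambda + \delta_0$, where $\delta_0$ denotes the Dirac measure at the origin; it is not taken from the text.

```latex
% Assumed example: n = 1, \mu = \lambda + \delta_0 (Dirac measure at 0).
\[
\frac{\mu B(x,r)}{\lambda B(x,r)} = 1 + \frac{\delta_0 B(x,r)}{2r}
=
\begin{cases}
1 & \text{if } x \neq 0 \text{ and } 0 < r < |x|,\\[2pt]
1 + \dfrac{1}{2r} & \text{if } x = 0 .
\end{cases}
\]
% Hence D\mu(x) = 1 for x \neq 0 and D\mu(0) = +\infty.
\[
\int_B D\mu \, d\lambda = \lambda B \le \lambda B + \delta_0 B = \mu B
\qquad \text{for every Borel set } B \subset \mathbb{R},
\]
% with strict inequality whenever 0 \in B and \lambda B < \infty.
% Here \mu_a = \lambda, \mu_s = \delta_0, and D\mu = 1 = d\mu_a / d\lambda
% \lambda-almost everywhere, while D\mu = +\infty on the set \{0\} of
% positive \mu-measure, so (iv) of Theorem 28.7 fails, as it must.
```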
28.9. Exercise. Let $G \subset \mathbb{R}^n$ be an open set, $f : G \to \mathbb{R}^n$ be a diffeomorphism. If $B$ is a Borel subset of $G$, set $\mu B := \lambda f(B)$. Show that $\mu$ is a Radon measure on $G$ and
(a) $D\mu(x) = J_f(x)$ for every $x \in G$;
(b) $J_f$ is the Radon-Nikodým derivative $\frac{d\mu}{d\lambda}$.


28.10. Notes. We could define the (symmetric) derivative of a measure at $x$ using cubes with the center at $x$ whose length of edges converges to zero.
Furthermore, we could consider more general sequences of sets shrinking in a suitable way to $x$. We could consider, for instance, all sequences of intervals containing $x$ whose diameters converge to zero. Let us note that for these general derivatives of measures analogous theorems to those of this paragraph fail to hold, and even analogies of Theorem 29.2 (which is their direct consequence) do not hold. An example can be found in M. de Guzmán [*1975], Chapter V., 2.
On the topic of differentiation of measures the reader is invited to consult, for instance, W. Rudin [*1974], M. de Guzmán [*1975] or J. Lukeš, J. Malý and L. Zajíček [*1986].

29. Lebesgue Density Theorem and Approximately Continuous Functions

29.1. Density points. Let $\lambda$ be again the Lebesgue measure on $\mathbb{R}^n$, $M \subset \mathbb{R}^n$ a measurable set and $x \in \mathbb{R}^n$. We say that $x$ is a point of density, or a density point of $M$ if
\[
\lim_{r \to 0+} \frac{\lambda(M \cap U(x, r))}{\lambda U(x, r)} = 1.
\]
29.2. Lebesgue Density Theorem. Almost every point of a measurable set $M$ is its density point.

Proof. If $\mu A := \lambda(A \cap M)$ for every (Lebesgue) measurable set $A \subset \mathbb{R}^n$, then $\mu$ is a Radon measure on $\mathfrak{M}_n$ absolutely continuous with respect to $\lambda$. Since $\mu A = \int_A c_M\,d\lambda$ for $A \in \mathfrak{M}_n$, $c_M$ is the Radon-Nikodým derivative of $\mu$ with respect to $\lambda$. On the other hand, Theorem 28.7 tells us that $D\mu$ is also the Radon-Nikodým derivative of $\mu$ with respect to $\lambda$. Hence $D\mu = c_M$ almost everywhere according to the Radon-Nikodým Theorem (Remark 13.6.1). If we realize that
\[
D\mu(x) = \lim_{r \to 0+} \frac{\mu U(x, r)}{\lambda U(x, r)}
\]
and $c_M(x) = 1$ for $x \in M$, we obtain the assertion.
29.3. Remark. When $n = 1$, the Lebesgue density theorem is an easy consequence of Theorem 23.4 according to which the derivative of an indefinite Lebesgue integral of a (locally) integrable function $c_M$ equals $c_M$ almost everywhere.
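The density ratio from 29.1 is easy to evaluate when $n = 1$ and the set is a finite union of intervals. The sketch below is only an illustration; the function name `density_ratio` and the choice $M = [0, 1]$ are assumptions made for the example.

```python
def density_ratio(intervals, x, r):
    """lambda(M ∩ U(x, r)) / lambda(U(x, r)) for M a finite union of
    pairwise disjoint intervals in R, given as (left, right) pairs."""
    overlap = sum(max(0.0, min(b, x + r) - max(a, x - r)) for a, b in intervals)
    return overlap / (2 * r)


M = [(0.0, 1.0)]  # hypothetical example set M = [0, 1]
for x in (0.5, 0.0, -0.1):
    # Ratios for the radii r = 10^-1, ..., 10^-5.
    print(x, [density_ratio(M, x, 10.0 ** (-k)) for k in range(1, 6)])
```

Up to rounding, the ratios are 1 at the interior point 0.5, stay at 1/2 at the endpoint 0, and vanish at the exterior point -0.1. So the endpoint belongs to $M$ but is not a density point, in agreement with the theorem, which only claims density at almost every point of $M$.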

29.4. Density Topology. A measurable set $M \subset \mathbb{R}^n$ is called d-open if each point of $M$ is its point of density. For instance, the set of all irrational numbers, or any open subset of $\mathbb{R}^n$ are d-open. We are going to show that the collection $d$ of all d-open sets on $\mathbb{R}^n$ forms a topology which will be labelled as the density topology. Since $d$ contains all open subsets of $\mathbb{R}^n$, it is finer than the Euclidean topology of $\mathbb{R}^n$.
29.5. Theorem. The collection of all d-open sets forms a topology.

Proof. Plainly $\emptyset$ and $\mathbb{R}^n$ are d-open, and $d$ is closed under the formation of finite intersections. If $\mathcal{A}$ is a collection of d-open sets, we have to prove that $T := \bigcup \mathcal{A} \in d$. If we show that $T$ is measurable, then every point of $T$ is apparently a density point of $T$. We can assume that $T \subset I$, where $I \subset \mathbb{R}^n$ is a compact interval. Denoting $\mathcal{S} = \{S : \text{there exists a countable collection } \mathcal{A}_0 \subset \mathcal{A} \text{ such that } S = \bigcup \mathcal{A}_0\}$, there exists $S \in \mathcal{S}$ with $\lambda S = \sup\{\lambda M : M \in \mathcal{S}\}$. Given $x \in T$, there exists $A \in \mathcal{A}$ so that $x$ is a point of density of $A$. Since $\lambda(A \cup S) = \lambda S$, we get $\lambda(A \setminus S) = 0$. Hence $x$ is a point of density of $A \cap S$ and also a point of density of $S$. Of course, $x$ cannot be a point of density of $I \setminus S$. Since by the Lebesgue density theorem almost every point of $I \setminus S$ is a point of density of $I \setminus S$, it follows that $\lambda(T \setminus S) = 0$. Hence $T = S \cup (T \setminus S)$ is measurable.
29.6. Remark. The density topology in $\mathbb{R}^n$ has a lot of interesting properties. Let us note that the density topology is not metrizable, it is completely regular but not normal, the only d-compact sets are the finite ones, and the Baire category theorem holds for $d$.

29.7. Approximately Continuous Functions. A function $f$ defined on a neighbourhood of a point $z \in \mathbb{R}^n$ is said to be approximately continuous at $z$ if there exists a measurable set $M \subset \mathbb{R}^n$ such that $z$ is a density point of $M$ and
\[
\lim_{x \to z,\ x \in M} f(x) = f(z).
\]
If $f$ is approximately continuous at each point of a given set, we say that $f$ is approximately continuous. It is clear that each continuous function is approximately continuous.
29.8. Theorem. A function $f$ is approximately continuous at $z$ if and only if $f$ is d-continuous at $z$.
Proof. Suppose $f$ is d-continuous at $z$ and defined on a neighbourhood $U = U(z, r_0)$ of $z$. Then $z$ is a point of density of each of the sets $M_j := \{x \in U : |f(x) - f(z)| < \frac{1}{j}\}$. Find a decreasing sequence of radii $r_k > 0$ satisfying
\[
\frac{\lambda(U(z, r) \setminus M_j)}{\lambda U(z, r)} < 2^{-j-k}
\]
for all $j = 1, \dots, k$ and $r \in (0, r_k)$. Set $A_j = U(z, r_j) \setminus M_j$ and $M = U \setminus \bigcup_{j=1}^{\infty} A_j$. We have
\[
\frac{\lambda(U(z, r) \cap A_j)}{\lambda U(z, r)} \le 2^{-j}
\]
for all $r > 0$. Choose $k \in \mathbb{N}$ and $r \in (0, r_k)$. Then
\[
\frac{\lambda(U(z, r) \cap A_j)}{\lambda U(z, r)} \le
\begin{cases}
2^{-j-k} & \text{for } j \le k,\\
2^{-j} & \text{for } j > k.
\end{cases}
\]
Whence
\[
\frac{\lambda(U(z, r) \setminus M)}{\lambda U(z, r)} \le 2^{-k+1},
\]
and $z$ is a density point of $M$. If $x \in M \cap U(z, r_k)$, then $x \in M_k$ and consequently $|f(x) - f(z)| < \frac{1}{k}$. Therefore $\lim_{x \to z,\ x \in M} f(x) = f(z)$. The reverse implication is obvious.
29.9. Denjoy's Theorem. A function $f : \mathbb{R}^n \to \mathbb{R}$ is Lebesgue measurable if and only if $f$ is approximately continuous at almost all points of $\mathbb{R}^n$.
Proof. Let $f$ be approximately continuous at almost all points. Denote by $N$ the set of all points of approximate discontinuity of $f$. Take any $c \in \mathbb{R}$. We will show that the set $M := \{x \in \mathbb{R}^n : f(x) > c\}$ is measurable. Since $M$ is a d-neighbourhood of any point $x \in M \setminus N$, it follows that $M \setminus N$ is also a d-neighbourhood of $x$. Hence $M \setminus N$ is d-open, and thus measurable. Since $N$ is a null set, $M$ is measurable as well.
Now assume that $f$ is measurable and select an $\varepsilon > 0$. Luzin's theorem 18.2 provides us with a continuous function $g$ on $\mathbb{R}^n$ and an open set $G$ so that $\lambda G < \varepsilon$ and $f = g$ in $\mathbb{R}^n \setminus G$. By the Lebesgue density theorem almost every point of the set $\mathbb{R}^n \setminus G$ is its point of density. Thus $f$ is approximately continuous at almost all points of $\mathbb{R}^n \setminus G$ and, since $\varepsilon$ was arbitrary, we can easily conclude that $f$ is approximately continuous at almost all points of $\mathbb{R}^n$.
29.10. Remark. Compare the last equivalence with the Lebesgue theorem 7.9 according to which a bounded function $f$ is Riemann integrable if and only if $f$ is continuous almost everywhere.
29.11. Exercise. Let $f$ be a bounded function defined on a neighbourhood of a point $z \in \mathbb{R}^n$. Then $f$ is approximately continuous at $z$ if and only if $z$ is a Lebesgue point for $f$ (the definition of Lebesgue points in $\mathbb{R}^n$ is analogous to the one-dimensional case in 23.8). Show that the assumption of boundedness is essential for one of the implications.
29.12. Notes. The notion of approximate continuity was introduced by A. Denjoy [1915]; he also proved that every Lebesgue measurable function is approximately continuous at almost all points. The converse assertion was proved by V. Stepanov in [1924]. The density theorem 29.2 is due to H. Lebesgue [*1904]. This theorem also holds for nonmeasurable sets if the outer Lebesgue measure is used in the definition of density points. The notion of the density topology was studied much later in the 1950s. Many of its properties, generalizations and recent applications can be found in the monograph by J. Lukeš, J. Malý and L. Zajíček [*1986].

30. Lipschitz Functions


30.1. Lipschitz Mappings. Let $(P_1, \rho_1)$, $(P_2, \rho_2)$ be metric spaces and $\beta > 0$. Recall that a mapping $f : P_1 \to P_2$ is said to be $\beta$-Lipschitz if
\[
\rho_2(f(x), f(y)) \le \beta\,\rho_1(x, y)
\]
for every $x, y \in P_1$. We say that $f$ is Lipschitz on $P_1$ if there exists $\beta > 0$ for which $f$ is $\beta$-Lipschitz. A mapping $f : P_1 \to P_2$ is called locally Lipschitz if for any $z \in P_1$ there exists its neighbourhood $U$ such that $f$ is Lipschitz on $U$ (in this case, the constant $\beta$ can vary from point to point).
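30.2. Lemma. Let $f$ be a continuous function on an open set $G \subset \mathbb{R}^n$ and $i \in \{1, \dots, n\}$. Then the set of those points at which $\frac{\partial f}{\partial x_i}$ fails to exist is a Borel set.
Proof. To simplify the proof, we can assume that $G = \mathbb{R}^n$. Denote
\[
u_{m,k}(x) = \sup\Bigl\{ \frac{f(x + te_i) - f(x)}{t} : t \in \bigl(-\tfrac{1}{m}, \tfrac{1}{m}\bigr),\ |t| \ge \tfrac{1}{k} \Bigr\},
\]
\[
v_{m,k}(x) = \inf\Bigl\{ \frac{f(x + te_i) - f(x)}{t} : t \in \bigl(-\tfrac{1}{m}, \tfrac{1}{m}\bigr),\ |t| \ge \tfrac{1}{k} \Bigr\},
\]
\[
u(x) = \inf_{m} \sup_{k > m} u_{m,k}(x), \qquad v(x) = \sup_{m} \inf_{k > m} v_{m,k}(x).
\]
Then, for $k > m$, $u_{m,k}$ and $v_{m,k}$ are continuous. Since the set of points where $\frac{\partial f}{\partial x_i}(x)$ exists equals $\{x \in \mathbb{R}^n : -\infty < u(x) = v(x) < +\infty\}$, the assertion follows.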


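30.3. Rademacher's Theorem. Let $f$ be a Lipschitz function on an open set $G \subset \mathbb{R}^n$. Then $f$ is differentiable almost everywhere in $G$.
Proof. Assume that $f$ is a $\beta$-Lipschitz function. Let $E$ be the set of all points where $f$ fails to have some of the partial derivatives. Using Fubini's theorem, the one-dimensional theorem on differentiability of Lipschitz functions (Lemma 22.4) easily implies that $E$ is of measure zero (notice that $E$ is measurable by Lemma 30.2).
Now, for $p, q \in \mathbb{Q}^n$ and $m \in \mathbb{N}$ denote
\[
S_{p,q,m} = \Bigl\{ x \in G \setminus E : p_i < \frac{f(x + te_i) - f(x)}{t} < q_i \text{ for all } i = 1, \dots, n \text{ and for } t \in \bigl(-\tfrac{1}{m}, \tfrac{1}{m}\bigr) \setminus \{0\} \Bigr\}.
\]
Let $S^*_{p,q,m}$ be the set of all density points of $S_{p,q,m}$. By the Lebesgue density theorem 29.2, $\lambda(S_{p,q,m} \setminus S^*_{p,q,m}) = 0$. If
\[
N := \bigcup_{p,q,m} \bigl(S_{p,q,m} \setminus S^*_{p,q,m}\bigr),
\]
then $N$ is of measure zero. We claim that $f$ is differentiable at each point of $G \setminus (N \cup E)$.
To this end, take $x \in G \setminus (N \cup E)$ and $\varepsilon \in (0, 1)$. Pick $p, q \in \mathbb{Q}^n$ so that
\[
q_i - \varepsilon < p_i < \frac{\partial f}{\partial x_i}(x) < q_i, \qquad i = 1, \dots, n.
\]
Then there exists $m \in \mathbb{N}$ with $x \in S := S_{p,q,m}$. Since $x$ is outside of $N$, $x$ is even a density point of $S$. Find $\delta \in (0, \frac{1}{m})$ such that $\lambda(U(x, r) \setminus S) \le \bigl(\frac{\varepsilon}{2}\bigr)^n \lambda(U(x, r))$ for all $r \in (0, 2\delta)$. In particular, notice that $U(x, (1 + \varepsilon)\eta) \setminus S$ does not contain any ball of radius $\varepsilon\eta$ if $\eta \in (0, \delta)$. Choose $y \in U(x, \delta)$ and denote $y^i = (y_1, \dots, y_i, x_{i+1}, \dots, x_n)$. For any $i \in \{1, \dots, n\}$, let $U_i$ be the ball with center $y^i$ and radius $\varepsilon|y - x|$. The choice $\eta = |y - x|$ yields that $S \cap U_i \ne \emptyset$ for all $i$. If $z^i \in S \cap U_i$ and $w^i := z^{i-1} + (y_i - x_i)e_i$, then
\[
|w^i - y^i| = |z^{i-1} - y^{i-1}| \le \varepsilon|y - x|,
\]
\[
p_i < \frac{f(w^i) - f(z^{i-1})}{y_i - x_i} < q_i
\quad\text{and}\quad
p_i < \frac{\partial f}{\partial x_i}(x) < q_i.
\]
Whence
\[
\Bigl| f(w^i) - f(z^{i-1}) - \frac{\partial f}{\partial x_i}(x)(y_i - x_i) \Bigr| \le (q_i - p_i)|y_i - x_i| \le \varepsilon|y - x|.
\]
Summarizing, we get
\[
\begin{aligned}
\Bigl| f(y) - f(x) - \sum_i \frac{\partial f}{\partial x_i}(x)(y_i - x_i) \Bigr|
&\le \sum_i \Bigl| f(w^i) - f(z^{i-1}) - \frac{\partial f}{\partial x_i}(x)(y_i - x_i) \Bigr| \\
&\quad + \sum_i \Bigl( |f(w^i) - f(y^i)| + |f(z^{i-1}) - f(y^{i-1})| \Bigr) \\
&\le (n + 2n\beta)\,\varepsilon\,|y - x|.
\end{aligned}
\]
Thus $f'(x)$ does exist.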
30.4. Lemma. Let $(f_\alpha)_\alpha$ be a family of $\beta$-Lipschitz functions on $\mathbb{R}^n$. Then the function $\sup_\alpha f_\alpha$ is $\beta$-Lipschitz provided it is finite at least at one point.
Proof. It is quite obvious.
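30.5. McShane's Extension Theorem. Let $f$ be a $\beta$-Lipschitz function on a set $E \subset \mathbb{R}^n$. Then there exists a $\beta$-Lipschitz extension $\tilde{f}$ of $f$ to all of $\mathbb{R}^n$. If, in addition, $E$ is bounded, then we can find a $\beta$-Lipschitz extension $f^*$ of $f$ with compact support.
Proof. Set
\[
\tilde{f}(x) = \sup_{y \in E} \{ f(y) - \beta|y - x| \}.
\]
By the previous lemma, $\tilde{f}$ is a $\beta$-Lipschitz function and $\tilde{f} = f$ on $E$. If $E$ is bounded, then there exists an $\alpha > 0$ such that $|f(x)| < \alpha - \beta|x|$ for all $x \in E$ and we set
\[
f^*(x) =
\begin{cases}
\min(\tilde{f}(x),\ \alpha - \beta|x|) & \text{if } |x| < \alpha/\beta \text{ and } \tilde{f}(x) \ge 0;\\
\max(\tilde{f}(x),\ -\alpha + \beta|x|) & \text{if } |x| < \alpha/\beta \text{ and } \tilde{f}(x) < 0;\\
0 & \text{if } |x| \ge \alpha/\beta.
\end{cases}
\]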
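The supremum formula from the proof can be evaluated directly when $E$ is finite. The following is a minimal sketch for $n = 1$; the function name `mcshane_extension`, the sample set $E$ and the sample values are assumptions chosen for illustration, and the Lipschitz check is performed only on a finite grid.

```python
def mcshane_extension(points, values, beta):
    """Return the function x -> max over y in E of (f(y) - beta * |y - x|),
    where E = `points` is a finite subset of R and `values` lists f on E.
    `beta` is assumed to be a Lipschitz constant of f on E."""
    def extended(x):
        return max(v - beta * abs(y - x) for y, v in zip(points, values))
    return extended


# Hypothetical data: f is 1-Lipschitz on E (largest difference quotient is 1).
E = [-1.0, 0.0, 2.0]
f_on_E = [0.5, -0.5, 1.0]
beta = 1.0
f_ext = mcshane_extension(E, f_on_E, beta)

# The extension agrees with f on E ...
print([f_ext(y) for y in E])

# ... and its difference quotients on a grid do not exceed beta.
grid = [-3.0 + 0.05 * i for i in range(121)]
worst_quotient = max(
    abs(f_ext(s) - f_ext(t)) / abs(s - t) for s in grid for t in grid if s != t
)
print(worst_quotient)
```

For a finite set $E$ the supremum is a maximum of finitely many "cones" $x \mapsto f(y) - \beta|y - x|$, each of which is $\beta$-Lipschitz, so the sketch is a direct instance of Lemma 30.4.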

30.6. Remark. Let $f$ be a $\beta$-Lipschitz mapping from $E \subset \mathbb{R}^n$ into $\mathbb{R}^k$. Then the coordinates of $f$ can be extended separately by the previous theorem and we get a $\sqrt{k}\,\beta$-Lipschitz extension $\tilde{f} : \mathbb{R}^n \to \mathbb{R}^k$ of $f$. However, this extension may fail to be $\beta$-Lipschitz. There exists a stronger extension theorem for mappings due to Kirszbraun which guarantees the existence of a $\beta$-Lipschitz extension. The proof of this assertion is more difficult.
30.7. Exercise. Suppose $E \subset \mathbb{R}^n$ and $f : E \to \mathbb{R}^n$ is a $\beta$-Lipschitz mapping. Prove that $\lambda^*(f(E)) \le \beta^n \lambda^* E$.
Hint. We can assume that $\lambda^* E < +\infty$. Fix an $\varepsilon > 0$. By Lemma 26.26 find a sequence $\{U(x_j, r_j)\}$ of open balls so that $E \subset \bigcup_j U(x_j, r_j)$ and $\sum_j \lambda U(x_j, r_j) < \lambda^* E + \varepsilon$. Since $f(U(x_j, r_j)) \subset U(f(x_j), \beta r_j)$ for every $j$,
\[
\lambda^*(f(E)) \le \sum_j \lambda U(f(x_j), \beta r_j) = \beta^n \sum_j \lambda U(x_j, r_j) \le \beta^n (\lambda^* E + \varepsilon).
\]


30.8. Notes. Theorem 30.3 is due to H. Rademacher [1919]. An elementary proof of it (different from ours) was given by A. Nekvinda and L. Zajíček [1984].
Extensions of Lipschitz functions in metric spaces were closely related to the original proofs of the Hahn-Banach theorem. Theorem 30.5 on the extension of (nonlinear) Lipschitz functions was proved independently by M.D. Kirszbraun [1934] and by E.J. McShane [1934]. Further material can be found in G.J. Minty [1970].

31. Approximation Theorems


31.1. Space $\mathcal{D}(\Omega)$. Let $\Omega \subset \mathbb{R}^n$ be an open set. We denote by $\mathcal{D}(\Omega)$ the linear space of all infinitely differentiable functions on $\Omega$ with compact support in $\Omega$. Our aim is to prove that $\mathcal{D}(\Omega)$ is often dense in other function spaces. The first task is to construct a nontrivial infinitely smooth function on $\mathbb{R}^n$ with compact support.
Set
\[
\omega_1(x) =
\begin{cases}
\alpha\, e^{1/(|x|^2 - 1)} & \text{if } |x| < 1,\\
0 & \text{if } |x| \ge 1,
\end{cases}
\]
where the constant $\alpha$ is chosen in such a way that $\int_{\mathbb{R}^n} \omega_1(x)\,dx = 1$. Denote
\[
\omega_k(x) = k^n\,\omega_1(kx).
\]
Then $\int_{\mathbb{R}^n} \omega_k(x - y)\,dy = 1$ for any $k \in \mathbb{N}$ and $x \in \mathbb{R}^n$. A simple calculation shows that the functions $\omega_k$ are infinitely differentiable (first reduce the proof to a problem of differentiability of functions of one variable).
Now, if $f$ is a locally integrable function, then Theorem 9.2 gives immediately that $\omega_k * f$ are infinitely differentiable functions as well. Likewise, the convolutions $\omega_k * \mu$ are infinitely differentiable if $\mu$ is a Radon measure.
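The normalisation of the smoothing kernels can be checked numerically. The sketch below treats only the case $n = 1$ and uses a plain midpoint rule; the names `bump`, `integrate`, `alpha` and `omega` are illustrative assumptions, and the printed values are approximations rather than exact constants.

```python
import math


def bump(x):
    """Unnormalised bump exp(1 / (x^2 - 1)) on (-1, 1), extended by 0."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0


def integrate(g, a, b, steps=20000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


# Normalising constant for n = 1, determined numerically.
alpha = 1.0 / integrate(bump, -1.0, 1.0)


def omega(k, x):
    """Kernel k^n * omega_1(k x) with n = 1; its support is [-1/k, 1/k]."""
    return k * alpha * bump(k * x)


print(alpha)
for k in (1, 2, 5):
    # Each integral should be close to 1.
    print(k, integrate(lambda x: omega(k, x), -1.0, 1.0))
```

The rescaling by $k^n$ compensates exactly for the shrinking support, which is why the integral stays equal to 1 for every $k$.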
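31.2. Lemma. Suppose $p \in [1, \infty)$, $f \in \mathcal{L}^p(\mathbb{R}^n)$ and $k \in \mathbb{N}$. Then $\omega_k * f \in \mathcal{L}^p(\mathbb{R}^n)$ and $\|\omega_k * f\|_p \le \|f\|_p$.
Proof. According to the generalized Young's inequality 26.20,
\[
\|\omega_k * f\|_p \le \|f\|_p\,\|\omega_k\|_1 = \|f\|_p
\]
(notice that $\|\omega_k\|_1 = 1$).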
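31.3. Theorem. Suppose $p \in [1, \infty)$. For any $f \in \mathcal{L}^p(\mathbb{R}^n)$, we have $\|f - \omega_k * f\|_p \to 0$.

Proof. Denote $E_j = \{x \in \mathbb{R}^n : |x| \le j \text{ and } |f(x)| \le j\}$ and $f_j = f c_{E_j}$. Since $\bigcap_j (\mathbb{R}^n \setminus E_j)$ has measure zero, $f_j - f \to 0$ almost everywhere and by the Lebesgue theorem (dominating function $|f|^p$), $\|f - f_j\|_p \to 0$. Fix an $\varepsilon > 0$ and find $j \in \mathbb{N}$ with $\|f - f_j\|_p < \varepsilon$. Whence by Lemma 31.2
\[
\|\omega_k * f_j - \omega_k * f\|_p = \|\omega_k * (f_j - f)\|_p < \varepsilon
\]
for all $k \in \mathbb{N}$. Now $\omega_k * f_j \to f_j$ (as $k \to \infty$) almost everywhere. Indeed, it is routine to verify that $\omega_k * f_j \to f_j$ at all points of approximate continuity of $f_j$ and this set, according to Denjoy's theorem 29.9, is of full measure. By the Lebesgue convergence theorem with dominating function $(2j)^p c_{B(0, j+1)}$ we have
\[
\lim_{k \to \infty} \|f_j - \omega_k * f_j\|_p = 0.
\]
Hence,
\[
\|f - \omega_k * f\|_p \le \|f - f_j\|_p + \|f_j - \omega_k * f_j\|_p + \|\omega_k * f_j - \omega_k * f\|_p < 3\varepsilon,
\]
provided $k$ is large enough.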
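31.4. Theorem. Let $\Omega \subset \mathbb{R}^n$ be an open set and $p \in [1, \infty)$. Then $\mathcal{D}(\Omega)$ is a dense subset of $\mathcal{L}^p(\Omega)$.
Proof. Set $\Omega_j = \{x \in \Omega : |x| < j \text{ and } \operatorname{dist}(x, \partial\Omega) > \frac{1}{j}\}$. Take a function $f \in \mathcal{L}^p(\Omega)$ and denote $g_j = f c_{\Omega_j}$. As in the previous proof, it is clear that $\|f - g_j\|_p \to 0$. For all $k > j$ we get $\omega_k * g_j \in \mathcal{D}(\Omega)$. Now the estimate
\[
\|f - \omega_k * g_j\|_p \le \|f - g_j\|_p + \|g_j - \omega_k * g_j\|_p
\]
yields the assertion.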
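31.5. Theorem. Let $f$ be a continuous function with compact support contained in an open set $\Omega \subset \mathbb{R}^n$. Then $\omega_k * f \rightrightarrows f$ on $\Omega$.
Proof. We can assume that $\Omega = \mathbb{R}^n$. Since $f$ is uniformly continuous, given $\varepsilon > 0$ there is $k_0 \in \mathbb{N}$ such that $|f(x) - f(y)| < \varepsilon$ whenever $x, y \in \mathbb{R}^n$, $|x - y| < \frac{1}{k_0}$. If $k > k_0$ and $x \in \mathbb{R}^n$, then
\[
\begin{aligned}
|f(x) - \omega_k * f(x)|
&= \Bigl| \int_{\mathbb{R}^n} (f(x) - f(y))\,\omega_k(x - y)\,dy \Bigr| \\
&= \Bigl| \int_{B(x, 1/k)} (f(x) - f(y))\,\omega_k(x - y)\,dy \Bigr|
\le \varepsilon \int_{\mathbb{R}^n} \omega_k(x - y)\,dy = \varepsilon.
\end{aligned}
\]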

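31.6. Theorem. Let $\mu$ be a Radon measure on $\Omega$. Then
(a) there exist sequences $\{\mu_j\}$ of measures and $\{f_j\}$ of functions from $\mathcal{D}(\mathbb{R}^n)$ such that $f_j = d\mu_j / d\lambda$ and $\mu_j \xrightarrow{w} \mu$;
(b) if, in addition, $\mu$ has compact support, then we can take measures $\mu_j$ with densities $\omega_j * \mu$.
Proof. Let us start with the case (b). Assume again that $\Omega = \mathbb{R}^n$ and take $f \in \mathcal{C}_c(\mathbb{R}^n)$. Then by the previous theorem $\omega_j * f \rightrightarrows f$, and it readily follows that
\[
\int_{\mathbb{R}^n} f(x)\,(\omega_j * \mu)(x)\,dx = \int_{\mathbb{R}^n} \omega_j * f\,d\mu \to \int_{\mathbb{R}^n} f\,d\mu.
\]
For the proof of (a), we first approximate $\mu$ by $\mu\lfloor_K$ (see 2.4) where $K$ is a suitable compact set, and then use the convolution as above.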


31.7. Theorem. Let $f$ be a $\beta$-Lipschitz function on $\mathbb{R}^n$. Then $\omega_k * f$ are (infinitely differentiable) $\beta$-Lipschitz functions, $|\omega_k * f(x) - f(x)| \le \beta/k$ for all $x \in \mathbb{R}^n$ and $(\omega_k * f)' \to f'$ almost everywhere in $\mathbb{R}^n$.
Proof. Since
\[
(\omega_k * f)' = \omega_k * f',
\]
$\omega_k * f$ are $\beta$-Lipschitz functions. For any $x \in \mathbb{R}^n$ we have
\[
|\omega_k * f(x) - f(x)| \le \int_{B(x, \frac{1}{k})} |f(x) - f(y)|\,\omega_k(x - y)\,dy
\le \frac{\beta}{k} \int_{B(x, \frac{1}{k})} \omega_k(x - y)\,dy = \frac{\beta}{k}.
\]
We can see that $(\omega_k * f')(x) \to f'(x)$ at all points $x$ where the derivative $f'(x)$ exists and is approximately continuous. But Rademacher's theorem 30.3 and Denjoy's theorem 29.9 tell us that this happens almost everywhere.
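The bound $\beta/k$ can be observed numerically. The following sketch assumes $n = 1$, the 1-Lipschitz function $f(x) = |x|$ and a midpoint-rule approximation of the convolution; the helper names (`bump`, `midpoint`, `mollify`) and the sample points are assumptions made for the illustration.

```python
import math


def bump(x):
    """Unnormalised bump exp(1 / (x^2 - 1)) on (-1, 1), extended by 0."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0


def midpoint(g, a, b, steps=4000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


ALPHA = 1.0 / midpoint(bump, -1.0, 1.0)  # normalising constant for n = 1


def mollify(f, k, x):
    """Approximate the convolution of f with the k-th kernel at x, i.e. the
    integral of f(y) * k * ALPHA * bump(k (x - y)) over B(x, 1/k)."""
    return midpoint(lambda y: f(y) * k * ALPHA * bump(k * (x - y)),
                    x - 1.0 / k, x + 1.0 / k)


f = abs      # a 1-Lipschitz function on R
beta = 1.0
sample_points = (-0.5, 0.0, 0.01, 0.3, 2.0)
errors = {k: max(abs(mollify(f, k, x) - f(x)) for x in sample_points)
          for k in (1, 4, 16)}
for k, err in errors.items():
    print(k, err, beta / k)   # observed error next to the bound beta / k
```

The largest error occurs near the kink at the origin, where the smoothed function lies strictly above $|x|$; away from the kink $f$ is affine on $B(x, \frac{1}{k})$ and the convolution reproduces it up to rounding.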
31.8. Exercise. Give an alternative proof of the fact that $\mathcal{D}(\Omega)$ is dense in $\mathcal{L}^p(\Omega)$ for $1 \le p < \infty$.

Hint. If $f \in \mathcal{C}_c(\Omega)$, then by Theorem 31.5 the functions $\omega_k * f$ converge uniformly to $f$ and their supports are contained in a compact set (not depending on $k$). This yields that $\omega_k * f$ converge to $f$ in the $\mathcal{L}^p$-norm as well. Now it is sufficient to use the density of $\mathcal{C}_c(\Omega)$ in $\mathcal{L}^p(\Omega)$ (Exercise 15.17).
31.9. Exercise. Use Theorem 31.4 to prove that the convolution $f * g$ is uniformly continuous provided $f \in \mathcal{L}^p$ and $g \in \mathcal{L}^q$, $\frac{1}{p} + \frac{1}{q} = 1$ and $p, q > 1$. (This will prove the proposition stated in Remark 26.22.)
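31.10. Riemann-Lebesgue Lemma. Suppose $f \in \mathcal{L}^1(\mathbb{R}^n)$. Then
\[
\lim_{|x| \to \infty} \int_{\mathbb{R}^n} e^{i x \cdot t} f(t)\,dt = 0.
\]

Hint. If $g \in \mathcal{D}(\mathbb{R}^n)$, then integration by parts gives
\[
-|x|^2 \int_{\mathbb{R}^n} e^{i x \cdot t} g(t)\,dt = \int_{\mathbb{R}^n} e^{i x \cdot t} \sum_{j=1}^{n} \frac{\partial^2 g}{\partial t_j^2}(t)\,dt,
\]
whence
\[
\Bigl| \int_{\mathbb{R}^n} e^{i x \cdot t} g(t)\,dt \Bigr| \le |x|^{-2}\,\|\Delta g\|_1.
\]
Now we invoke the fact that $\mathcal{D}(\mathbb{R}^n)$ is dense in $\mathcal{L}^1(\mathbb{R}^n)$: Given $f \in \mathcal{L}^1(\mathbb{R}^n)$ and $\varepsilon > 0$, by Theorem 31.4 there is $g \in \mathcal{D}(\mathbb{R}^n)$ ($g$ depends on $f$ and $\varepsilon$) such that $\|f - g\|_1 < \varepsilon$. Then
\[
\Bigl| \int_{\mathbb{R}^n} e^{i x \cdot t} (f(t) - g(t))\,dt \Bigr| \le \varepsilon,
\]
whence
\[
\Bigl| \int_{\mathbb{R}^n} e^{i x \cdot t} f(t)\,dt \Bigr| \le |x|^{-2}\,\|\Delta g\|_1 + \|f - g\|_1.
\]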


32. Distributions
Since the 1920s, physicists have worked with "generalized functions". These "functions" were determined by their "average densities" in a neighbourhood of each point. P.A.M. Dirac introduced the "$\delta$-function" having the following properties:
\[
\delta(x) = 0 \text{ for } x \ne 0, \qquad \delta(0) = \infty \qquad\text{and}\qquad \int_U \delta(x) = 1
\]
for every open ball $U$ containing the origin. Thus the "average density" of the $\delta$-function in the ball $U(x, r)$ is
\[
f_r(x) :=
\begin{cases}
\dfrac{1}{\lambda U(x, r)} & \text{if } x \in U(0, r),\\[4pt]
0 & \text{otherwise.}
\end{cases}
\]
(Notice that $x \in U(0, r)$ if and only if $0 \in U(x, r)$.) We can see that
\[
\delta(x) = \lim_{r \to 0+} f_r(x).
\]
If now $\varphi$ is a continuous function on $\mathbb{R}^n$, denote
\[
Z_r(\varphi) = \int_{\mathbb{R}^n} f_r\,\varphi, \qquad \delta(\varphi) = \varphi(0).
\]
Using the definition of continuity, one can show that
\[
\lim_{r \to 0+} Z_r(\varphi) = \delta(\varphi)
\]
for any continuous function $\varphi$ on $\mathbb{R}^n$. In other words, the $\delta$-function is a "weak limit" of the sequence $\{Z_r\}$ of functionals while "values" of the $\delta$-function are "pointwise limits" of the sequence $\{f_r\}$. The great number of quotation marks in the last sentences indicates that we have to give exact definitions and thus also interpretations of the notions used. Thus while the physicists successfully used the $\delta$-function and many other "generalized functions", the mathematical theory of distributions (or, generalized functions) arose much later, at the end of the 1930s, and it is connected with the names of S.L. Sobolev and, mainly, L. Schwartz.
And one more note. We know that there exist continuous functions which have a derivative at no point. One of the great advantages of the theory of distributions is the fact that each of them does have a derivative. However, we should be aware of the difference between classical derivatives and derivatives in the sense of distributions. Among others, the derivative of a function in the sense of distributions is not necessarily a function (cf. Examples 32.5). In order for the distributional derivative of a continuous function $f$ to be again a function (more exactly, a regular distribution, cf. 32.3.1), $f$ has to be locally absolutely continuous. In particular, $f$ must have the classical derivative almost everywhere.


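In what follows, we consider an open set $\Omega \subset \mathbb{R}^n$. If $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_n)$ is an $n$-tuple of nonnegative integers (so-called multiindex), set $|\alpha| = \alpha_1 + \dots + \alpha_n$ and denote the differential operator
\[
\frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \dots \partial x_n^{\alpha_n}}
\]
by the symbol $D^\alpha$. Finally, recall that $\mathcal{D}(\Omega)$ denotes the linear space of all infinitely differentiable functions with compact support contained in $\Omega$ (see 31.1).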
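32.1. Notion of a Distribution. If $\{f_k\}$ is a sequence of functions from $\mathcal{D}(\Omega)$, we say that $f_k \xrightarrow{\mathcal{D}} 0$ if there exists a compact set $K \subset \Omega$ such that $\operatorname{supt} f_k \subset K$ for every $k$ and $D^\alpha f_k \rightrightarrows 0$ on $K$ for every multiindex $\alpha$. Every linear functional $T$ on $\mathcal{D}(\Omega)$ satisfying
\[
T(f_k) \to 0 \quad\text{whenever } f_k \xrightarrow{\mathcal{D}} 0
\]
is called a distribution (on $\Omega$). Thus every distribution is determined by its values on $\mathcal{D}(\Omega)$.
Let $f$ be a locally integrable function on $\Omega$ (i.e. integrable on each compact subset of $\Omega$; in particular, $f$ can be a continuous function). Set
\[
T_f(\varphi) = \int_\Omega f\varphi\,d\lambda
\]
for $\varphi \in \mathcal{D}(\Omega)$ (notice that the integral exists). It is not very difficult to prove that $T_f$ is a distribution: Suppose $\varphi_k \in \mathcal{D}(\Omega)$ and $\varphi_k \xrightarrow{\mathcal{D}} 0$. If $K$ is a compact set on whose complement all $\varphi_k$'s vanish, then
\[
|T_f(\varphi_k)| \le \max_{t \in K} |\varphi_k(t)| \int_K |f|\,d\lambda
\]
and $\varphi_k \rightrightarrows 0$ on $K$. It follows that $T_f(\varphi_k) \to 0$.
Thus every locally integrable function defines a distribution. We say that $T$ is a regular distribution if there is a function $f \in L^1_{\mathrm{loc}}(\Omega)$ with $T = T_f$. In this sense, we can identify regular distributions and locally integrable functions.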
32.2. Remarks. 1. The space $\mathcal{D}(\Omega)$ is usually endowed with a locally convex topology in such a way that $f_k \to 0$ in this topology if and only if $f_k \xrightarrow{\mathcal{D}} 0$. So distributions are continuous linear functionals on $\mathcal{D}(\Omega)$.
2. If a linear functional $Z$ on $\mathcal{D}(\Omega)$ is a pointwise limit of a sequence of distributions $\{Z_j\}$ (i.e. if $Z(\varphi) = \lim Z_j(\varphi)$ for each $\varphi \in \mathcal{D}(\Omega)$), then $Z$ is also a distribution. This is a consequence of the Banach-Steinhaus theorem of functional analysis, which is valid for $\mathcal{D}(\Omega)$.
3. If $f$, $g$ are locally integrable functions and $f = g$ almost everywhere, then apparently $T_f = T_g$. The converse is also true: If the regular distributions $T_f$, $T_g$ are equal, then $f = g$ almost everywhere.
For this purpose, it is sufficient to prove that $f\varphi = g\varphi$ almost everywhere for every $\varphi \in \mathcal{D}(\Omega)$. Suppose that $h = (f - g)\varphi$ on $\Omega$ and $h = 0$ outside $\Omega$. Then $h \in L^1(\mathbb{R}^n)$ and $T_h = 0$ in $\mathbb{R}^n$. Thus $\omega_k * h = 0$ for every $k$ and Theorem 31.3 (the notation is the same) yields $h = 0$ almost everywhere.
This assertion shows that we can really identify regular distributions and locally integrable functions.


32.3. Examples. 1. In 32.1 we embedded locally integrable functions into the space of distributions. Another class of objects contained in the space of distributions are Radon measures. Let $\mu$ be a (positive) Radon measure on $\Omega$. The functional $T_\mu$ defined as
\[
T_\mu(\varphi) = \mu(\varphi) \quad \Bigl(= \int_\Omega \varphi\,d\mu\Bigr)
\]
is again a distribution. In this sense, we can understand every Radon measure as a distribution.
For example, the Dirac measure (the $\delta$-function) defined by $T_\delta(\varphi) = \varphi(0)$ is a distribution. It is not regular. Indeed, if there were a locally integrable function $f$ with $T_f = T_\delta$, we could easily deduce that $f = 0$ almost everywhere. But $T_\delta$ is not the zero distribution.
2. It is not true that every distribution is regular or is defined by some Radon measure. As an example consider the functional $\varphi \mapsto \varphi'(0)$. However, there is a simple criterion which enables us to decide whether a distribution $T$ is equal to some $T_\mu$. Namely, if $\mu$ is a Radon measure, then the distribution $T_\mu$ is a positive functional. Conversely, if $T$ is a distribution with $T(\varphi) \ge 0$ whenever $\varphi \in \mathcal{D}(\Omega)$ is nonnegative, then there exists a Radon measure $\mu$ such that $T = T_\mu$. The last proposition is an easy consequence of the Riesz representation theorem 16.5.
3. For $\varepsilon > 0$, denote $O(\varepsilon) = (-\infty, -\varepsilon) \cup (\varepsilon, +\infty)$ and
\[
R_\varepsilon(\varphi) = \int_{O(\varepsilon)} \frac{\varphi(x)}{x}\,dx \quad\text{for } \varphi \in \mathcal{D}(\mathbb{R})
\]
(notice that the integral exists). It is not difficult to see that $R_\varepsilon$ is a distribution on $\mathbb{R}$. Further, for every $\varphi \in \mathcal{D}(\mathbb{R})$ there is a finite limit $\lim_{\varepsilon \to 0+} R_\varepsilon(\varphi)$. If we denote it by $T_{1/x}(\varphi)$, then $T_{1/x}$ is a distribution. The reader may find it instructive to prove this assertion directly, or using Remark 32.2.2. Notice also that the function $\frac{1}{x}$ is not locally integrable on $\mathbb{R}$. Thus, the distribution $T_{1/x}$, which is often identified with the function $\frac{1}{x}$, is not regular and also cannot be determined by a Radon measure.

The set of all distributions is a linear space which is the topological dual to $\mathcal{D}(\Omega)$ (cf. Remark 32.2.1).
As mentioned above, every distribution is differentiable. To motivate the exact definition, consider a function $f$ with continuous derivative $f'$ on $\mathbb{R}$ (notice that both $f$ and $f'$ are locally integrable). In this case it is natural to require that $T_{f'}$ should be the derivative of $T_f$. Since
\[
T_{f'}(\varphi) = \int_{\mathbb{R}} f'\varphi = -\int_{\mathbb{R}} f\varphi' = -T_f(\varphi')
\]
for any $\varphi \in \mathcal{D}(\mathbb{R})$, we are led to the following general definition.


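32.4. Differentiation of Distributions. Let $T$ be a distribution on $\Omega$. If $\alpha$ is a multiindex, we define the $\alpha$th (partial) derivative $D^\alpha T$ of the distribution $T$ as
$$D^\alpha T(\varphi) = (-1)^{|\alpha|}\, T(D^\alpha \varphi), \qquad \varphi \in \mathcal{D}(\Omega).$$
In particular,
$$\frac{\partial T}{\partial x_j}(\varphi) = -T\Bigl(\frac{\partial \varphi}{\partial x_j}\Bigr).$$
In the sense of identification of regular distributions and locally integrable functions, we often speak about distributional derivatives of functions provided the result of differentiation is a function.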


If $f$ is a $C^1$-function, then its distributional derivative is the classical one: $(T_f)' = T_{f'}$. For the one-dimensional case see the next Remarks. The multidimensional case involving partial derivatives is then an easy consequence of Fubini's theorem.

32.5. Remarks and Examples. 1. It remains to show that $D^\alpha T$ is again a distribution. The linearity of $D^\alpha T$ is obvious, and supposing $\varphi_k \to 0$, then also $D^\alpha \varphi_k \to 0$ and we get $D^\alpha T(\varphi_k) \to 0$ (we used the fact that the mapping $\varphi \mapsto D^\alpha \varphi$ is continuous on $\mathcal{D}(\Omega)$).
2. Notice that any distribution has derivatives of all orders. However, the distributional derivative of a function need not be a function. As the next examples show, it can be, for example, a measure. For instance, any continuous function which does not possess a derivative at any point is infinitely differentiable, but only in the sense of distributions!
3. Assume now that $u$ and $f$ are locally integrable functions on an interval $(a, b)$ and that the distributional derivative of $u$ is $f$. Then there exist a locally absolutely continuous function $v$ and a constant $c$ such that $u = v + c$ almost everywhere and $v' = f$ almost everywhere.
Indeed, consider a function $\psi \in \mathcal{D}((a, b))$ with $\int_a^b \psi = 1$ and denote by $v$ the indefinite Lebesgue integral of $f$. We know that $v$ is a locally absolutely continuous function. The integration by parts 23.13 shows that the distributional derivative of $w := u - v$ is the zero function. If $c = \int_a^b w\psi$, $\varphi \in \mathcal{D}((a, b))$ is an arbitrary function and
$$\eta(x) = \int_a^x \Bigl(\varphi(t) - \psi(t) \int_a^b \varphi(s)\,ds\Bigr)\,dt,$$
then $\eta \in \mathcal{D}((a, b))$ and
$$\int_a^b (w - c)\varphi = \int_a^b w \Bigl(\varphi - \psi \int_a^b \varphi\Bigr) = \int_a^b w\, \eta' = 0.$$
Thus $T_{w-c} = 0$, whence $w = c$ almost everywhere by 32.2.3.


4. Let $f$ be a continuous function having the derivative $f'$ almost everywhere on an interval $I$. Then $f'$ is a distributional derivative of $f$ if and only if $f$ is locally absolutely continuous on $I$.
5. Consider the function $f(x) = \max(0, x)$ and the so-called Heaviside function $Y$ on $\mathbb{R}$ which is defined as the indicator function of the interval $(0, +\infty)$. Then $f$ is not classically differentiable at zero. However, its distributional derivative is the Heaviside function $Y$. Integration by parts shows easily that
$$(T_f)''(\varphi) = (T_Y)'(\varphi) = -\int_0^{+\infty} \varphi' = \varphi(0) = T_\delta(\varphi)$$
whenever $\varphi \in \mathcal{D}(\mathbb{R})$. We see that the derivative of the distribution $T_Y$ is the Dirac $\delta$-function. In a similar way we find that
$$(T_Y)''(\varphi) = -\varphi'(0) \quad\text{for } \varphi \in \mathcal{D}(\mathbb{R}).$$
Show that $(T_Y)''$ is neither a function nor a measure.
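The identity $(T_Y)'(\varphi) = \varphi(0)$ can be checked numerically. The following is only an illustrative sketch: the bump function `phi`, the quadrature rule and the step count are choices made here, not part of the text.

```python
import math

def phi(x):
    """A standard test function: smooth, supported in [-1, 1] (chosen for illustration)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def dphi(x):
    """Classical derivative of phi."""
    if abs(x) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def heaviside_derivative(test_derivative, upper=1.0, steps=20000):
    """Approximate (T_Y)'(phi) = -int_0^upper phi'(x) dx by the midpoint rule.

    `upper` must bound the support of the test function from above.
    """
    h = upper / steps
    total = sum(test_derivative((i + 0.5) * h) for i in range(steps))
    return -total * h

value = heaviside_derivative(dphi)
print(value, phi(0.0))  # the two numbers should agree up to quadrature error
```

The pairing is evaluated against the derivative of the test function, which is exactly how the definition in 32.4 moves the derivative from $T_Y$ onto $\varphi$.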


6. Let $f$ be a real function whose classical derivative is not locally Lebesgue integrable (cf. Example 25.1). Then $(T_f)'$ is not a regular distribution.
7. The distributional derivative of the Cantor function of Example 23.1 is not the zero function but a measure concentrated on the Cantor set (see Exercise 24.8.a).
8. The function $\ln(|x|)$ is locally Lebesgue integrable. Its derivative in the sense of distributions is the distribution $T_{1/x}$. (This gives one way to prove that $T_{1/x}$ is a distribution.)
9. For $\varphi \in \mathcal{D}(\mathbb{R})$, set
$$U(\varphi) = \sum_{k=1}^{\infty} \varphi^{(k)}(k).$$
Show that $U$ is a distribution on $\mathbb{R}$ which is neither regular nor determined by a Radon measure.


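10. Let $\Delta := \sum_{i=1}^n \frac{\partial^2}{\partial x_i^2}$ be the Laplace operator and $u(x) := |x|^{2-n}$. Our aim is to find $\Delta u$ in the distributional sense. First, we have that $\frac{\partial u}{\partial x_i} = (2 - n)\frac{x_i}{|x|^n}$ is a locally integrable function. Now the distributional derivative $\frac{\partial^2 u}{\partial x_i^2}$ is a complicated distribution containing an integral average similar to the distribution of Example 32.3.3. But we have an easier task, namely to find out the sum $\sum_{i=1}^n \frac{\partial^2 u}{\partial x_i^2}$ of these distributions. If $\alpha_n$ denotes the volume of the unit ball in $\mathbb{R}^n$ (cf. Exercise 26.17), then we prove that
$$\int_{\mathbb{R}^n} \sum_{i=1}^n \frac{x_i}{|x|^n} \frac{\partial \varphi}{\partial x_i}(x)\,dx = -n\alpha_n \varphi(0)$$
for every test function $\varphi \in \mathcal{D}(\mathbb{R}^n)$. Hence
$$\Delta u = n(2 - n)\alpha_n \delta_0 .$$
To this end let $\omega$ be a nonincreasing infinitely smooth function on $(0, \infty)$, $\omega(t) = 1$ for $t < 1$, $\omega(t) = 0$ for $t > 2$, $|\omega'(t)| \le 2$ and $\omega_r(x) = \omega(|x|/r)$. We use the decomposition
$$\varphi = \varphi(0)\omega_r + (\varphi - \varphi(0))\omega_r + \varphi(1 - \omega_r).$$
Then
$$\int_{\mathbb{R}^n} \sum_{i=1}^n \frac{x_i}{|x|^n} \frac{\partial \varphi}{\partial x_i}(x)\,dx = A_r + B_r + C_r + D_r,$$
where
$$A_r = \varphi(0) \int_{\mathbb{R}^n} \sum_{i=1}^n \frac{x_i}{|x|^n} \frac{\partial \omega_r}{\partial x_i}(x)\,dx = \varphi(0) \int_{\mathbb{R}^n} |x|^{1-n} r^{-1} \omega'(|x|/r)\,dx = \varphi(0)\, n\alpha_n \int_r^{2r} r^{-1}\omega'(t/r)\,dt = -n\alpha_n \varphi(0)$$
(we used Exercise 26.17),
$$B_r = \int_{\mathbb{R}^n} \sum_{i=1}^n \frac{x_i}{|x|^n} (\varphi(x) - \varphi(0)) \frac{\partial \omega_r}{\partial x_i}(x)\,dx, \qquad C_r = \int_{\mathbb{R}^n} \sum_{i=1}^n \frac{x_i}{|x|^n}\, \omega_r(x) \frac{\partial \varphi}{\partial x_i}(x)\,dx$$
and
$$D_r = \int_{\mathbb{R}^n} \sum_{i=1}^n \frac{x_i}{|x|^n} \frac{\partial (\varphi(1 - \omega_r))}{\partial x_i}(x)\,dx.$$
Set
$$\psi_i(x) := \begin{cases} \dfrac{x_i}{|x|^n}\,\varphi(x)(1 - \omega_r(x)) & \text{if } x \ne 0; \\ 0 & \text{if } x = 0. \end{cases}$$
An easy calculation shows that $\psi_i \in \mathcal{D}(\mathbb{R}^n)$ and
$$\sum_{i=1}^n \frac{\partial \psi_i}{\partial x_i}(x) = \sum_{i=1}^n \frac{x_i}{|x|^n} \frac{\partial (\varphi(1 - \omega_r))}{\partial x_i}(x).$$
Thus
$$D_r = \int_{\mathbb{R}^n} \sum_{i=1}^n \frac{\partial \psi_i}{\partial x_i}\,dx = 0.$$
The proof will be finished by estimating the integrals $B_r$ and $C_r$ and taking the limit as $r \to 0$. Since $\varphi \in \mathcal{D}(\mathbb{R}^n)$, there exists a constant $K$ such that $|\nabla\varphi(x)| \le K$ and $|\varphi(x) - \varphi(0)| \le K|x|$ for all $x \in U(0, 1)$. Obviously, $|\nabla \omega_r| \le \frac{2}{r}$. For $r < \frac12$ we obtain
$$|B_r| \le \int_{B(0,2r)} \sum_{i=1}^n \frac{|x_i|}{|x|^n}\, |\varphi(x) - \varphi(0)|\, \Bigl|\frac{\partial \omega_r}{\partial x_i}(x)\Bigr|\,dx \le 4nK \int_{B(0,2r)} |x|^{1-n}\,dx \to 0$$
and
$$|C_r| \le \int_{B(0,2r)} \sum_{i=1}^n \frac{|x_i|}{|x|^n}\, \omega_r(x)\, \Bigl|\frac{\partial \varphi}{\partial x_i}(x)\Bigr|\,dx \le nK \int_{B(0,2r)} |x|^{1-n}\,dx \to 0.$$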

Another operation with distributions is multiplication. Since we prove the Schwartz impossibility theorem, which says that on the space of all distributions multiplication cannot be defined in a reasonable way, we restrict ourselves to multiplication by smooth functions. If $h$ is a smooth function on $\mathbb{R}$ and $f$ is continuous, then
$$T_{hf}(\varphi) = T_f(h\varphi) \quad\text{for } \varphi \in \mathcal{D}(\mathbb{R}),$$
and this equality leads to the following definition.


32.6. Multiplication of Distributions by Smooth Functions. Let $T$ be a distribution on $\Omega$ and $h \in C^\infty(\Omega)$. We define the distribution $hT = Th$ by
$$hT(\varphi) = T(h\varphi), \qquad \varphi \in \mathcal{D}(\Omega).$$

32.7. Remarks and Examples. 1. It is easy to show that $hT$ and $Th$ are distributions.
2. We have
$$xT_\delta(\varphi) = T_\delta(x\varphi(x)) = 0, \qquad xT_{1/x}(\varphi) = T_{1/x}(x\varphi(x)) = \int_{\mathbb{R}} \varphi = T_1(\varphi),$$
and we see that the multiplication in the space of distributions is almost the same as the pointwise multiplication.

32.8. Schwartz Impossibility Theorem. On the space of distributions on $\mathbb{R}$, a multiplication cannot be defined in such a way that it would be commutative and associative and that $xT_\delta = 0$ and $xT_{1/x} = T_1$.
Proof. The following chain of equalities shows that it is impossible to define such a multiplication:
$$0 = 0 \cdot T_{1/x} = (xT_\delta)T_{1/x} = T_\delta(xT_{1/x}) = T_\delta T_1 = T_\delta .$$
32.9. Convergence of Sequences of Distributions. Let $\{T_k\}$ be a sequence of distributions on $\Omega$. We say that $T_k$ converge to a distribution $T$ on $\Omega$, denoted by $T_k \to T$, if $T_k(\varphi) \to T(\varphi)$ for every test function $\varphi \in \mathcal{D}(\Omega)$. Thus the convergence in the sense of distributions is nothing else than the pointwise convergence, or the weak*-convergence on $\mathcal{D}'(\Omega)$.

32.10. Examples. (a) If $T_k := T_{\sin kx}$ on $\mathbb{R}$, then $\{T_k\}$ converges to zero (i.e. to the distribution $T_0$).
(b) Let $T_k$ be the regular distribution on $\mathbb{R}$ determined by the function $\frac{\sin kx}{x}$. Show that $T_k \to \pi T_\delta$ (when keeping the convention of 32.1, the sequence of functions $\{\frac{\sin kx}{x}\}$ converges to $\pi$ times the Dirac measure in the sense of distributions).

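Hint. Choose $\varphi \in \mathcal{D}(\mathbb{R})$. If $\varphi = 0$ outside the interval $[-A, A]$, then
$$T_k(\varphi) = \int_{-A}^{A} \frac{\varphi(x) - \varphi(0)}{x} \sin kx\,dx + \varphi(0) \int_{-A}^{A} \frac{\sin kx}{x}\,dx.$$
By the Riemann-Lebesgue Lemma 31.10, the limit of the first term is zero. Further it is known that
$$\lim_{k \to \infty} \int_{-kA}^{kA} \frac{\sin t}{t}\,dt = \pi,$$
whence the assertion follows.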


(c) Show that the sequence of functions {k } (cf. 31.1) converges on Rn to the Dirac measure
in the sense of distributions.
Hint. The proposition is an easy consequence of Theorem 31.5.
(d) Suppose $f_k, f \in \mathcal{L}^1_{loc}(\Omega)$. If $\int_K |f_k - f| \to 0$ for every compact set $K \subset \Omega$, then $T_{f_k} \to T_f$.

32.11. Theorem. Let $\alpha$ be a multiindex, $T, T_k$ distributions. If $T_k \to T$ on $\Omega$, then $D^\alpha T_k \to D^\alpha T$.
Proof. The proposition is an immediate consequence of the definitions.

32.12. Example. We define that $\sum T_n = T$ provided $\sum T_n(\varphi) = T(\varphi)$ for any test function $\varphi \in \mathcal{D}(\Omega)$. The previous theorem tells us that every convergent series of distributions can be differentiated term by term. Here is an illustrative example:
Let $f$ be a $2\pi$-periodic function on $\mathbb{R}$,
$$f(x) = \begin{cases} \frac12(\pi - x) & \text{for } x \in (0, \pi], \\ -\frac12(\pi + x) & \text{for } x \in [-\pi, 0), \\ 0 & \text{for } x = 0. \end{cases}$$
It follows from the theory of Fourier series that
$$f(x) = \sum_{k=1}^{\infty} \frac{\sin kx}{k}$$
for every $x \in \mathbb{R}$. Show that:
(a) This equality holds in the sense of distributions as well.
(b) The derivative of the distribution $T_f$ is a $2\pi$-periodic measure whose restriction to the interval $[-\pi, \pi]$ equals $-\frac12\lambda + \pi\delta$.
On the other hand, Theorem 32.11 implies that $(T_f)' = \sum_{k=1}^{\infty} T_{\cos kx}$. Therefore, in the sense of distributions, on the interval $(-\pi, \pi)$ we have the equality
$$\sum_{k=1}^{\infty} \cos kx = -\frac12 + \pi\delta .$$
We see that we can assign a sum to the divergent series $\sum_{k=1}^{\infty} \cos kx$, but this sum is a distribution and not a function.


32.13. Notes. The theory of distributions in the framework of duality between topological vector spaces was created by L. Schwartz in the late 1930s and was fully published in [1947] (because of the scientific isolation during the Second World War he published his theory afterwards). He also introduced the fundamental space $\mathcal{S}(\mathbb{R}^n)$ suitable for the Fourier transform of distributions.
Some related considerations were implicitly contained in earlier works of other authors. In fact, S. L. Sobolev [1936] also introduced basic notions of the theory of distributions. Since he was led by a special question concerning the solution of the Cauchy problem, he did not realize the power of his own ideas.
Concerning the history of the theory of distributions, we suggest also Lützen's monograph [*1982]. Other material can be found in J. Horváth [1970] or J. Horváth [*1966].

33. Fourier Transform


Throughout this chapter, $u \cdot v$ denotes the inner product of $n$-dimensional vectors $u$ and $v$. By a function we understand a complex-valued function, and we modify the definitions of function spaces like $L^p$ accordingly.
The Fourier transform is an operator whose calculus provides a number of formulae. Roughly speaking, the Fourier transform turns differentiation with respect to the $j$th variable into multiplication by $x_j$ (up to the factor $i$) and convolution into product. Due to these properties it has wide applications in the theory of function spaces, the theory of partial differential equations, operator calculus and other fields.
When reading this chapter, we suggest noticing the parallel between the theory of Fourier series and the Fourier transform (cf. Remark 33.14).
33.1. Fourier Transform of $\mathcal{L}^1$-Functions. Formally, we can write the Fourier transform in the form
$$\hat f(x) := (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-ix \cdot y} f(y)\,dy.$$
This formula can be understood as a pointwise definition provided $f \in \mathcal{L}^1(\mathbb{R}^n)$. Notice that the Lebesgue integral always exists.
We start with simple formulae.
33.2. Theorem. If $f, g \in \mathcal{L}^1(\mathbb{R}^n)$, then
(a) $\widehat{f * g} = (2\pi)^{n/2} \hat f\, \hat g$;
(b) $\int_{\mathbb{R}^n} \hat f g = \int_{\mathbb{R}^n} f \hat g$ (the so-called multiplication formula).
Proof. The proof follows easily from Fubini's theorem. It only needs to be observed that $\hat f g \in \mathcal{L}^1$ whenever $f, g \in \mathcal{L}^1$.
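The following theorem plays a key role in applications of the Fourier transform to the theory of partial differential equations.
33.3. Theorem. Suppose $f \in \mathcal{L}^1(\mathbb{R}^n)$ and $1 \le j \le n$.
(a) If $x_j f \in \mathcal{L}^1(\mathbb{R}^n)$, then
$$\frac{\partial \hat f}{\partial x_j} = (-i x_j f)^{\wedge}.$$
(b) Suppose that $f$, $\frac{\partial f}{\partial x_j}$ are continuous on $\mathbb{R}^n$, $\frac{\partial f}{\partial x_j} \in \mathcal{L}^1(\mathbb{R}^n)$ and $\lim_{|x| \to \infty} f(x) = 0$. Then
$$\Bigl(\frac{\partial f}{\partial x_j}\Bigr)^{\wedge}(x) = i x_j \hat f(x).$$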
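Proof. (a) The differentiation under the integral sign of
$$\hat f(x) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-ix \cdot t} f(t)\,dt$$
leads to
$$\frac{\partial \hat f}{\partial x_j}(x) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-ix \cdot t} (-i t_j) f(t)\,dt.$$
(b) Using Fubini's theorem and integration by parts we obtain
$$(2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-ix \cdot t} \frac{\partial f}{\partial t_j}(t)\,dt = -(2\pi)^{-n/2} \int_{\mathbb{R}^n} f(t) \frac{\partial}{\partial t_j} e^{-ix \cdot t}\,dt = i x_j (2\pi)^{-n/2} \int_{\mathbb{R}^n} f(t) e^{-ix \cdot t}\,dt .$$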

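33.4. Lemma. If $h(x) = e^{-|x|^2/2}$, then $\hat h = h$.
Proof. Assume first that $n = 1$. By the previous theorem both $h$ and $\hat h$ are solutions to the differential equation
$$u' + xu = 0 .$$
Since $h(0) = \hat h(0)$ (see Exercise 26.17.b), we get $\hat h = h$. For general $n$, using the case $n = 1$ already proved, we get
$$\int_{\mathbb{R}^n} e^{-ix \cdot t} e^{-|t|^2/2}\,dt = \int_{\mathbb{R}^n} e^{-ix_1 t_1 - t_1^2/2} \cdots e^{-ix_n t_n - t_n^2/2}\,dt_1 \ldots dt_n = \Bigl(\int_{\mathbb{R}} e^{-ix_1 t_1 - t_1^2/2}\,dt_1\Bigr) \cdots \Bigl(\int_{\mathbb{R}} e^{-ix_n t_n - t_n^2/2}\,dt_n\Bigr) = (2\pi)^{n/2} e^{-|x|^2/2} .$$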
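A numerical sanity check of Lemma 33.4 in dimension one may be helpful. The sketch below assumes the normalization $\hat f(x) = (2\pi)^{-1/2} \int e^{-ixt} f(t)\,dt$ of 33.1; the function names, the truncation interval and the step count are illustrative choices.

```python
import cmath
import math

def gaussian(t):
    """h(t) = exp(-t^2 / 2)."""
    return math.exp(-t * t / 2.0)

def fourier_transform_1d(f, x, half_width=12.0, steps=4800):
    """Approximate (2*pi)^(-1/2) * int e^{-i x t} f(t) dt by the trapezoidal rule.

    The integral is truncated to [-half_width, half_width]; this is harmless
    only for rapidly decaying f such as the Gaussian.
    """
    h = 2.0 * half_width / steps
    total = 0j
    for i in range(steps + 1):
        t = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * x * t) * f(t)
    return total * h / math.sqrt(2.0 * math.pi)

for x in (0.0, 0.5, 1.0, 2.5):
    print(x, fourier_transform_1d(gaussian, x), gaussian(x))
```

The imaginary parts vanish up to rounding because the Gaussian is even, and the real parts reproduce $e^{-x^2/2}$.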

33.5. Fourier Transform of Distributions. We start with a motivation. It would be quite natural to define the Fourier transform of a distribution in such a way that the Fourier transform of the regular distribution $T_f$ would be the distribution $T_{\hat f}$. But if $f \in \mathcal{L}^1(\mathbb{R}^n)$ and $\varphi \in \mathcal{D}(\mathbb{R}^n)$, the multiplication formula 33.2.b gives
$$\int_{\mathbb{R}^n} \hat f \varphi = \int_{\mathbb{R}^n} f \hat\varphi ,$$
and since the Fourier transform of a function from $\mathcal{D}(\mathbb{R}^n)$ fails to be in $\mathcal{D}(\mathbb{R}^n)$ we cannot proceed in this way. (Notice that the zero function $\varphi = 0$ is the only function of $\mathcal{D}(\mathbb{R}^n)$ with $\hat\varphi \in \mathcal{D}(\mathbb{R}^n)$.)


Observe also that the Fourier transform of a function from $\mathcal{L}^1$ need not be an element of $\mathcal{L}^1$. If $f$ is the indicator function of the interval $[-1, 1]$, then
$$\hat f(x) = \sqrt{\frac{2}{\pi}}\, \frac{\sin x}{x} .$$
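For the reader's convenience, the computation behind this example can be written out as follows, assuming the normalization $\hat f(x) = (2\pi)^{-1/2} \int e^{-ixt} f(t)\,dt$ for $n = 1$:

```latex
\hat f(x) = (2\pi)^{-1/2} \int_{-1}^{1} e^{-ixt}\,dt
          = (2\pi)^{-1/2}\, \frac{e^{ix} - e^{-ix}}{ix}
          = \sqrt{\frac{2}{\pi}}\, \frac{\sin x}{x}, \qquad x \neq 0,
\qquad\text{while}\qquad
\int_{\mathbb{R}} \Bigl| \frac{\sin x}{x} \Bigr|\,dx = \infty .
```

The divergence of the last integral is what prevents $\hat f$ from being Lebesgue integrable.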

So we search for a subset of $\mathcal{L}^1(\mathbb{R}^n)$ which would be closed with respect to the Fourier transform and to differentiation (and thus also with respect to multiplication by polynomials, cf. Theorem 33.3), and in this way we are naturally led to the notion of the Schwartz space $\mathcal{S}(\mathbb{R}^n)$.
We say that a $C^\infty$ function $f$ on $\mathbb{R}^n$ belongs to the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ if $p\, D^\alpha f$ is a bounded function for any multiindex $\alpha$ and for any polynomial $p$ on $\mathbb{R}^n$. Equivalently, it is easy to see that $f \in \mathcal{S}(\mathbb{R}^n)$ if and only if $p\, D^\alpha f \in \mathcal{L}^1(\mathbb{R}^n)$ whenever $p$ is a polynomial and $\alpha$ a multiindex.
The space of test functions $\mathcal{D}(\mathbb{R}^n)$ is a subset of $\mathcal{S}(\mathbb{R}^n)$. The function $e^{-|x|^2}$ provides an example of a function from $\mathcal{S}(\mathbb{R}^n)$ which does not have compact support.
With the help of Theorem 33.3 and Lemma 33.6 it can be easily shown that $\mathcal{S}(\mathbb{R}^n)$ is closed with respect to the Fourier transform, and that the mapping $\varphi \mapsto \hat\varphi$ is a linear isomorphism of $\mathcal{S}(\mathbb{R}^n)$ onto itself.
A linear functional $T$ on $\mathcal{S}(\mathbb{R}^n)$ is called a tempered distribution if it is continuous in the following sense: If $\{\varphi_j\}$ is a sequence of functions from $\mathcal{S}(\mathbb{R}^n)$ and $p\, D^\alpha \varphi_j \to 0$ uniformly for any multiindex $\alpha$ and any polynomial $p$, then $T(\varphi_j) \to 0$. We can easily verify that every tempered distribution (restricted to $\mathcal{D}(\mathbb{R}^n)$) is a distribution.
The Fourier transform $\hat T$ of a tempered distribution $T$ is defined as
$$\hat T(\varphi) = T(\hat\varphi), \qquad \varphi \in \mathcal{S}(\mathbb{R}^n).$$
Apparently $\hat T$ is again a tempered distribution.


So far, the Fourier transform is defined for functions from $\mathcal{L}^1(\mathbb{R}^n)$ and for tempered distributions. Any locally integrable function $f$ determines a (regular) distribution, and many of them even tempered distributions. If $f \in \mathcal{L}^p(\mathbb{R}^n)$ ($1 \le p \le \infty$), then
$$S_f : \varphi \mapsto \int_{\mathbb{R}^n} f \varphi, \qquad \varphi \in \mathcal{S}(\mathbb{R}^n),$$
is a tempered distribution. In the same way as in the previous chapter, we identify the function $f \in \mathcal{L}^p(\mathbb{R}^n)$ and the tempered distribution $S_f$. The multiplication formula yields that $\hat S_f = S_{\hat f}$ for $f \in \mathcal{L}^1(\mathbb{R}^n)$, and we see that the new definition of the Fourier transform of tempered distributions is an extension of the original one for functions from $\mathcal{L}^1(\mathbb{R}^n)$.
Since the Fourier transform is now defined also for functions outside of $\mathcal{L}^1(\mathbb{R}^n)$, the question arises whether, given a function $f$ (say $f \in \mathcal{L}^p$), a function $g$ can be found so that $\hat S_f = S_g$ (then we could say that the Fourier transform of $f$ is $g$).


We know that in the case of $\mathcal{L}^1$ the answer is positive as we can take $g = \hat f$. On the other hand, Exercise 33.9 shows that the Fourier transform of a constant function (which is a function of $\mathcal{L}^\infty(\mathbb{R}^n)$) is not a function.
Let us note that even in the case when the Fourier transform of a function $f$ is a function $g$ (in the sense mentioned above), the function $g$ is not determined pointwise (as in the original definition of the Fourier transform for $f \in \mathcal{L}^1(\mathbb{R}^n)$) but only up to a null set.
In what follows, we restrict our attention to the case $p = 2$ and we state the main theorem which guarantees a good theory for the Hilbert space $L^2(\mathbb{R}^n)$. First we define the inverse Fourier transform by the formulae
$$\check f(y) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{ix \cdot y} f(x)\,dx, \qquad f \in \mathcal{L}^1(\mathbb{R}^n);$$
$$\check T(\varphi) = T(\check\varphi), \qquad \varphi \in \mathcal{S}(\mathbb{R}^n),\ T \text{ a tempered distribution.}$$
If $f \in L^2(\mathbb{R}^n)$, then we define the Fourier transform $\hat f$ as such $g \in L^2(\mathbb{R}^n)$ for which $\hat S_f = S_g$. The inverse Fourier transform $\check f$ of a function $f \in L^2(\mathbb{R}^n)$ is defined in a similar way. Before we verify that these definitions make sense, let us state a useful lemma.
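33.6. Lemma. The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is a dense subset of $L^2(\mathbb{R}^n)$ and
$$(\hat f)^{\vee} = f, \qquad (\check f)^{\wedge} = f, \qquad \|\hat f\|_2 = \|f\|_2$$
for $f \in \mathcal{S}(\mathbb{R}^n)$.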
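Proof. In Theorem 31.4 we proved even that $\mathcal{D}(\mathbb{R}^n)$ is a dense subset of $L^2(\mathbb{R}^n)$.
Now we are going to prove the formula $(\hat f)^{\vee} = f$. Denote $h(x) = e^{-|x|^2/2}$. Remember, according to Lemma 33.4, that $\hat h = h$. Using the multiplication formula 33.2.b we obtain
$$\int_{\mathbb{R}^n} \hat f(t)\, h\Bigl(\frac{t}{k}\Bigr)\,dt = \int_{\mathbb{R}^n} f\Bigl(\frac{t}{k}\Bigr) \hat h(t)\,dt.$$
When taking the limit as $k \to \infty$ it follows that
$$\int_{\mathbb{R}^n} \hat f(t)\,dt = \int_{\mathbb{R}^n} f(0) \hat h(t)\,dt = (2\pi)^{n/2} f(0).$$
Applying this formula to the function $x \mapsto f(x + y)$ and making the substitution $z = x + y$, we obtain
$$f(y) = \int_{\mathbb{R}^n} (2\pi)^{-n} \Bigl(\int_{\mathbb{R}^n} e^{-ix \cdot t} f(x + y)\,dx\Bigr)\,dt = \int_{\mathbb{R}^n} (2\pi)^{-n} e^{iy \cdot t} \Bigl(\int_{\mathbb{R}^n} e^{-iz \cdot t} f(z)\,dz\Bigr)\,dt = \int_{\mathbb{R}^n} (2\pi)^{-n/2} e^{iy \cdot t} \hat f(t)\,dt = (\hat f)^{\vee}(y).$$
The formula $(\check f)^{\wedge} = f$ can be proved in a similar way. Notice that for the Fourier transform of the complex conjugated function we have $(\bar f)^{\wedge} = \overline{\check f}$, and thus the multiplication formula yields
$$\int_{\mathbb{R}^n} |f|^2\,dx = \int_{\mathbb{R}^n} f \bar f\,dx = \int_{\mathbb{R}^n} f\, \overline{(\hat f)^{\vee}}\,dx = \int_{\mathbb{R}^n} f\, \bigl(\overline{\hat f}\bigr)^{\wedge}\,dx = \int_{\mathbb{R}^n} \hat f\, \overline{\hat f}\,dx = \int_{\mathbb{R}^n} |\hat f|^2\,dx.$$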

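33.7. Plancherel Theorem. For any $f \in L^2(\mathbb{R}^n)$ there exists $g \in L^2(\mathbb{R}^n)$ such that $\hat S_f = S_g$, $\check S_g = S_f$ and $\|f\|_2 = \|g\|_2$.
Proof. By virtue of Lemma 33.6 there are $f_j \in \mathcal{S}(\mathbb{R}^n)$ such that $f_j \to f$ in $L^2(\mathbb{R}^n)$. By the same lemma, $\{\hat f_j\}$ is a Cauchy sequence in $L^2(\mathbb{R}^n)$ and thus there exists $g \in L^2(\mathbb{R}^n)$ with
$$\|g - \hat f_j\|_2 \to 0.$$
By passing to the limit we get
$$\hat S_f = S_g, \qquad \check S_g = S_f \qquad\text{and}\qquad \|f\|_2 = \|g\|_2 .$$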

33.8. Remark. The Plancherel theorem says that the (original) Fourier transform can be uniquely extended from $\mathcal{S}(\mathbb{R}^n)$ to $L^2(\mathbb{R}^n)$. The result is a continuous linear isometry of the space $L^2(\mathbb{R}^n)$ onto itself. Thus it is a unitary mapping and one can easily find out that also the inner product in $L^2(\mathbb{R}^n)$ is invariant with respect to it.
33.9. Exercise. Let $\delta$ be the Dirac $\delta$-function. Prove that $\hat\delta$ is the constant $(2\pi)^{-n/2}$ and that the Fourier transform of this constant is $\delta$.
33.10. Exercise. Find a function $u$ whose Fourier transform $\hat u$ is a solution to the equation
$$-\Delta \hat u + \hat u = \delta$$
in the sense of distributions (here $\Delta$ is the Laplace operator, cf. 32.5.10).
33.11. Fourier Transform of a Function in $L^1(\mathbb{R}^n)$. (a) Suppose $f \in \mathcal{L}^1(\mathbb{R}^n)$. Then $\hat f$ is a uniformly continuous function and $\lim_{|x| \to \infty} \hat f(x) = 0$ (in other words, $\hat f \in \mathcal{C}_0(\mathbb{R}^n)$).
Hint. The assertion concerning the limit at infinity is a consequence of the Riemann-Lebesgue lemma 31.10. As for the proof of uniform continuity, it is no restriction to assume that $f$ is from the space $\mathcal{D}(\mathbb{R}^n)$ since we know that $\mathcal{D}(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$ (Theorem 31.4). Suppose that the support of $f$ is contained in $U(0, R)$. We use the inequality
$$\bigl|e^{-iy \cdot t} - e^{-ix \cdot t}\bigr| \le |t|\, |y - x|$$
to get
$$\bigl|\hat f(y) - \hat f(x)\bigr| \le \Bigl|\int_{\mathbb{R}^n} (e^{-iy \cdot t} - e^{-ix \cdot t}) f(t)\,dt\Bigr| \le R\, |y - x|\, \|f\|_1 ,$$
from which the uniform continuity of $\hat f$ follows.
(b) Remark that a function of $\mathcal{C}_0(\mathbb{R}^n)$ does not need to be the Fourier transform of a function from $L^1(\mathbb{R}^n)$. Consider the example of the $\mathcal{C}_0(\mathbb{R})$ function
$$g : x \mapsto \frac{x}{\sqrt{1 + x^2}\, \log(2 + x^2)} .$$
If $f \in \mathcal{L}^1(\mathbb{R})$, then Fubini's theorem together with boundedness of the function $a \mapsto \int_1^a \frac{1}{z} e^{-iz}\,dz$ on $(1, +\infty)$ imply that $x \mapsto \int_1^x \frac{\hat f(t)}{t}\,dt$ is a bounded function on $(1, +\infty)$. Hence apparently $\hat f \ne g$.
Note that, in contrast to the equality $\{\hat f : f \in L^2(\mathbb{R}^n)\} = L^2(\mathbb{R}^n)$, a satisfactory characterization of the range $\{\hat f : f \in \mathcal{L}^1(\mathbb{R})\}$ of the Fourier transform on $L^1(\mathbb{R})$ is not known.


33.12. Exercise. For f L2 (Rn ) dene


Z
f (x) =

f (t)
R

Show that

eitx 1
dt.
it

f is an indenite Lebesgue integral of the (locally integrable) function fb.

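33.13. Remark. Define the Fourier transform $F$ on $\mathcal{L}^1(\mathbb{R}^n)$ by the formula
$$F f(x) = \frac{1}{a} \int_{\mathbb{R}^n} e^{ibx \cdot t} f(t)\,dt$$
and denote $c := (2\pi)^n |b|^{-n} a^{-2}$. Then (under certain assumptions)
$$F(f * g) = a\, F f\, F g; \qquad F(F f)(x) = c\, f(-x); \qquad F\Bigl(\frac{\partial f}{\partial x_j}\Bigr)(x) = -ibx_j\, F f(x); \qquad \|F f\|_2^2 = c\, \|f\|_2^2 .$$
Browsing through different manuscripts, various choices of $a$ and $b$ appear (e.g. $a = 1$, $a = (2\pi)^{n/2}$, $b = \pm 1$, $b = \pm 2\pi$).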
33.14. Fourier Transform and Fourier Series. Let $T$ be the interval $[-\pi, \pi]$. We assign to any function $u \in L^1(T)$ the sequence $\{\hat u(k)\}_{k \in \mathbb{Z}}$ of its Fourier coefficients
$$\hat u(k) := \frac{1}{2\pi} \int_T e^{-ikt} u(t)\,dt.$$
Conversely, to a sequence $\{c_k\} \in l^1(\mathbb{Z})$ we assign the function
$$\check c(x) := \sum_{k \in \mathbb{Z}} c_k e^{ikx}$$
(the sum of the convergent trigonometrical series with coefficients $c_k$). The (formal) series
$$\sum_{k \in \mathbb{Z}} \hat u(k) e^{ikx}$$
is called the Fourier series of a function $u$.


We leave the theory of Fourier series out of this manuscript (the books by V. Jarník [JII] or W. Rudin [*1974] are suggested). However, we state here without proofs some results indicating an analogy between the theories of Fourier series and Fourier transform.
The sequence of Fourier coefficients $\{\hat u(k)\}$ is an analogy of the Fourier transform $\hat f(x)$ and the sum $\check c(x)$ is an analogy of the inverse Fourier transform $\check f(x)$. In some sense, the theory of Fourier series is less complicated because $L^2(T) \subset L^1(T)$ and $l^1(\mathbb{Z}) \subset l^2(\mathbb{Z})$. On the other hand, we lose the symmetry between the Fourier transform and the inverse Fourier transform.
(a) If $u \in L^1(T)$, then $\lim_{|k| \to \infty} \hat u(k) = 0$. The proposition is a consequence of the Riemann-Lebesgue lemma 31.10; the proof is similar to that for the case of Fourier transform (cf. 33.11.a). Also in this case, not every sequence in $c_0(\mathbb{Z})$ is a sequence of Fourier coefficients of a function in $L^1(T)$.
(b) If $\{c_k\} \in l^1(\mathbb{Z})$, then $\check c$ is a continuous function on $T$ with $\check c(-\pi) = \check c(\pi)$ (compare with 33.11.a). Let us note again that not every continuous function can be expressed in this way.


(c) If $u \in L^2(T)$, then $\{\hat u(k)\} \in l^2(\mathbb{Z})$ (this follows from the Bessel inequality). Conversely, for any sequence $\{c_k\} \in l^2(\mathbb{Z})$ there exists a unique function $u \in L^2(T)$ with $c_k = \hat u(k)$ (the Riesz-Fischer theorem). This function can be obtained as the limit of the sequence $\{s_k\} \subset L^2(T)$, where
$$s_k(x) = \sum_{j=-k}^{k} c_j e^{ijx}, \qquad x \in T.$$
Moreover, $\|u\|_2 = \sqrt{2\pi}\, \|\{\hat u(k)\}\|_2$ (the Parseval identity). Compare with the Plancherel theorem 33.7.
(d) The mappings $u \mapsto \{\hat u(k)\}$ and $\{c_k\} \mapsto \check c$ can also be generalized to wider classes of functions or sequences. We are not going to mention the details. Just notice that the mapping $\{c_k\} \mapsto u$, $\{c_k\} \in l^2(\mathbb{Z})$, given by the Riesz-Fischer theorem is in fact an extension of the mapping $\{c_k\} \mapsto \check c$, $\{c_k\} \in l^1(\mathbb{Z})$. However, in the case of $l^2$, the function $u$ is determined only as an element of $L^2(T)$, i.e. up to a null set.
(e) If $u \in \mathcal{C}^1(\mathbb{R})$ is a $2\pi$-periodic function and $v = u'$, then
$$\hat v(k) = ik\, \hat u(k), \qquad k \in \mathbb{Z}.$$
Compare again with Theorem 33.3.


33.15. Notes. The theory of Fourier series and the Fourier transform has a very long history. D. Bernoulli and L. Euler were the first to use what is now called Fourier series. J. B. J. Fourier used trigonometric series in connection with problems on heat and wrote his famous book [*1822]. He also studied the Fourier transform in the form
$$F(t) = \int_0^\infty f(x) \cos tx\,dx$$
and formally derived the inversion formula.
Common features of the Fourier transform and the Fourier series (see the list of parallels in 33.14) indicated that there should exist a unifying theory covering both. This was established by A. Weil [1940] who presented the Fourier transform on locally compact Abelian groups. This approach led further to harmonic analysis on groups, see E. Hewitt and K.A. Ross [*1970] or G.B. Folland [*1975].
Note that M. Plancherel extended the definition of the Fourier transform to functions in $L^2(\mathbb{R})$ so that he obtained a unitary operator on this Hilbert space (cf. Remark 33.8).
Besides the Fourier transform, various integral transforms like the Laplace or Mellin transform are also examined.
Roughly speaking, an integral transform is a one-to-one mapping between two function spaces defined by a formula of the form
$$\hat u(x) = \int_Y K(x, y)\, u(y)\,d\nu(y).$$
Often an inverse formula can be found as
$$u(y) = \int_X K^*(x, y)\, \hat u(x)\,d\mu(x).$$
The theory of Fourier series can be included as a particular case since series can be considered as integrals with respect to the counting measure.


F. Change of Variable and k-dimensional Measures


34. Change of Variable Theorem
Throughout this chapter $\mathbb{R}^n$ will be the Euclidean space and $k \le n$ a nonnegative integer. Our aim is to prove the general change of variable theorem for $k$-dimensional measures.
We want to measure various subsets of $\mathbb{R}^n$ whose dimension is $k \le n$. For instance, we are interested in the problem of how to calculate the length of curves or the area of surfaces in $\mathbb{R}^3$. In miscellaneous fields of analysis we encounter various methods leading to this goal in different degrees of generality. If a set $A \subset \mathbb{R}^n$ is an isometric copy of a set $E \subset \mathbb{R}^k$, it is reasonable to introduce its $k$-dimensional measure as the Lebesgue measure of its preimage $E$. Consequently, it is not difficult to introduce the $k$-dimensional measure of flat sets. Now, if we consider a set $A$ on a curved surface, we divide it into almost flat parts $A_j$ and sum the measures of flat sets $E_j$ which are close to $A_j$. In the limiting case, we obtain the intuitive meaning of the measure of $A$. Our task is therefore to introduce a concept of $k$-dimensional measures which would be general enough, easily described and transparent. We will propose two independent solutions of this problem in the following chapters.
The integral with respect to the $k$-dimensional measure is the so-called curve ($k = 1$), or surface ($k > 1$) integral. We prove a change of variable formula for this integral which will include Theorem 26.13 as a special case.
We start by recalling useful notions from linear algebra.
The inner product of vectors $x$ and $y$ will be denoted again by $x \cdot y$ and the norm of an element $x$ of $\mathbb{R}^n$ simply by $|x|$. By $e_j$ we denote the basic vector $[0, \ldots, 1, \ldots, 0]$ of $\mathbb{R}^n$ having $j$th term 1 and all other terms zero.
Further, $M_{n,k}$ will be the set of all matrices with $n$ rows and $k$ columns. Each such matrix represents a linear mapping of $\mathbb{R}^k$ into $\mathbb{R}^n$.
If $L : \mathbb{R}^k \to \mathbb{R}^n$ is a linear mapping, then there is a unique mapping $L^* : \mathbb{R}^n \to \mathbb{R}^k$ such that $Lx \cdot y = x \cdot L^* y$ for all $x \in \mathbb{R}^k$ and $y \in \mathbb{R}^n$. The mapping $L^*$ is called the adjoint to $L$. If $L$ is represented by a matrix $A$, then $L^*$ is represented by the transposed matrix to $A$, which will be denoted by $A^T$. The norm of the mapping $L$ is defined by the formula
$$\|L\| = \sup\{|Lx| : x \in \mathbb{R}^k,\ |x| \le 1\}.$$
Further we denote $\|L^{-1}\| = \sup\{|x| : x \in \mathbb{R}^k,\ |Lx| \le 1\}$.
We suppose that the reader is familiar with the notion of a determinant. We will relate it not only to square matrices but also to objects which are representable by means of square matrices: to $k$-tuples of vectors from $\mathbb{R}^k$, $k$-tuples of linear forms over $\mathbb{R}^k$, or to linear mappings of $\mathbb{R}^k$ into itself.
34.1. Properties of Determinants. Let $A, B \in M_{k,k}$. Then
(a) $\det A = \det A^T$,
(b) $\det AB = \det A \det B$.


34.2. Isometric Mappings. Let $V, W$ be $k$-dimensional linear spaces with inner products. Remember that a linear mapping $L : W \to V$ is said to be an isometry provided it preserves distances of points. Then, of course, it preserves also the norm and the inner product (indeed, the inner product can be expressed in terms of the norm according to the formula $4\, x \cdot y = |x + y|^2 - |x - y|^2$). The matrices of isometric mappings of $\mathbb{R}^k$ into $\mathbb{R}^k$ are termed unitary or orthogonal. A matrix $A$ is orthogonal if and only if its rows (columns) form an orthonormal basis of $\mathbb{R}^k$. The product of unitary matrices is a unitary matrix.
34.3. Lemma. Let $W$ be a $k$-dimensional linear space with an inner product. Then there exists a linear isometry $Q : \mathbb{R}^k \to W$.
Proof. Let $u_1, \ldots, u_k$ be an orthonormal basis of $W$. The linear transformation which sends $e_i \mapsto u_i$, $i = 1, \ldots, k$, is an isometry.
34.4. Example. Let $W \subset \mathbb{R}^3$ be a vector space generated by the vectors $[1, 1, 1]$ and $[0, 1, 2]$. We are looking for a linear isometry of $\mathbb{R}^2$ onto $W$. We arrive at this by finding an orthonormal basis $u_1, u_2$ of the space $W$. Solving the equation $[1, 1, 1] \cdot \bigl([0, 1, 2] + \alpha[1, 1, 1]\bigr) = 0$ we obtain $\alpha = -1$, so that the vector $[-1, 0, 1] = [0, 1, 2] - [1, 1, 1]$ is orthogonal to $[1, 1, 1]$. If we set $u_1 = 3^{-1/2}[1, 1, 1]$, $u_2 = 2^{-1/2}[-1, 0, 1]$, then the vectors $u_1, u_2$ are unit vectors. The matrix of the required linear mapping has as columns $u_1, u_2$.
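The orthonormalization used in Example 34.4 is the Gram-Schmidt procedure. A minimal sketch follows; the helper names `dot` and `gram_schmidt` are illustrative and not taken from the text.

```python
import math

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal basis of the span of linearly independent `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(w, u)                      # component of w along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

u1, u2 = gram_schmidt([[1, 1, 1], [0, 1, 2]])

def Q(x):
    """The isometry of R^2 onto W whose matrix has columns u1, u2."""
    return [x[0] * a + x[1] * b for a, b in zip(u1, u2)]

x = [3.0, 4.0]
print(u1, u2, math.sqrt(dot(Q(x), Q(x))))  # the last number is |Qx| = |x| = 5
```

The final line illustrates that $Q$ preserves the norm, as required of an isometry in 34.2.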

34.5. Lemma. Let $Q \in M_{k,k}$ be a unitary matrix. Then $|\det Q| = 1$.
34.6. Lemma. Let $A \in M_{k,k}$. Then there exist unitary matrices $Q_1, Q_2 \in M_{k,k}$ and a diagonal matrix $D \in M_{k,k}$ such that $A = Q_1 D Q_2$.
Proof. An elementary theorem of linear algebra tells us that there is a unitary matrix $U$ and a symmetric matrix $R$ such that $A = UR$. Further, there exist a unitary matrix $Q$ and a diagonal matrix $D$ such that $R = Q^T D Q$. Now it remains to put $Q_1 = U Q^T$, $Q_2 = Q$.
Now we go back to measure theory. We denote, as usual, by $\lambda$ the Lebesgue measure.
34.7. Lemma. Let $L : \mathbb{R}^k \to \mathbb{R}^k$ be a linear mapping. Then, for any measurable set $E \subset \mathbb{R}^k$, the set $L(E)$ is measurable and $\lambda(L(E)) = |\det L|\, \lambda E$.
Proof. The measurability of $L(E)$ is obvious. By Lemma 34.6 there are isometric linear mappings $Q_1$, $Q_2$ and a diagonal mapping $D = (d_{i,j})_{i,j=1}^k$ of the space $\mathbb{R}^k$ into itself such that $L = Q_1 D Q_2$. Let $K = [a_1, b_1] \times \cdots \times [a_k, b_k]$. Then (to simplify the notation we will assume that $d_{i,i} \ge 0$, but the same result is obtained also in the remaining cases) $D(K) = [d_{1,1} a_1, d_{1,1} b_1] \times \cdots \times [d_{k,k} a_k, d_{k,k} b_k]$, so that
$$\lambda(D(K)) = |d_{1,1}(b_1 - a_1) \cdots d_{k,k}(b_k - a_k)| = |\det D|\, \lambda K.$$
Since isometric linear mappings are measure preserving and their composition with another linear mapping $M$ does not change the absolute value of the determinant of $M$, it follows that for any measurable set $E \subset \mathbb{R}^k$ we have
$$\lambda(L(E)) = \lambda(D Q_2(E)) = |\det D|\, \lambda(Q_2(E)) = |\det D|\, \lambda E = |\det L|\, \lambda E.$$
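For $k = 2$ the statement of Lemma 34.7 can be checked by hand on the unit square, whose image under $L$ is a parallelogram. The sketch below uses the shoelace formula for the area; the matrix `L` and the helper names are arbitrary illustrative choices.

```python
def det2(m):
    """Determinant of a 2-by-2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def apply(m, p):
    """Image of the point p under the linear mapping with matrix m."""
    return (m[0][0] * p[0] + m[0][1] * p[1], m[1][0] * p[0] + m[1][1] * p[1])

def polygon_area(points):
    """Area of a simple polygon by the shoelace formula."""
    total = 0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

L = [[2, 1], [1, 3]]
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [apply(L, p) for p in unit_square]

area = polygon_area(image)
print(area, abs(det2(L)))  # both equal 5 for this matrix
```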

34.8. k-dimensional Measures. Let $\mu$ be a measure on a $\sigma$-algebra $\mathcal{S}$ containing $\mathcal{B}(\mathbb{R}^n)$. Denote
$$\mu^* E = \inf\{\mu S : S \supset E,\ S \in \mathcal{S}\}.$$
We say that $\mu$ is a $k$-dimensional measure on $\mathbb{R}^n$ if:
(a) $\mu^* I(E) = \lambda^* E$ whenever $I : \mathbb{R}^k \to \mathbb{R}^n$ is an isometric mapping and $E \subset \mathbb{R}^k$;
(b) $\mu^* \varphi(E) \le \beta^k \mu^* E$ for each $\beta$-Lipschitz mapping $\varphi : E \to \mathbb{R}^n$ and $E \subset \mathbb{R}^n$.
A $k$-dimensional measure is not uniquely determined by (a) and (b) (cf. Remark 34.31). Nevertheless, on reasonable sets all $k$-dimensional measures coincide. Notice also that for any $k$-dimensional measure $\mu$, the property (a) implies that $\mu([0, 1]^k \times \{0\}^{n-k}) = 1$.
By Exercise 30.7, the Lebesgue measure satisfies the property (b) (and thus also (a)) in the case $k = n$. Using (a), we see immediately that the Lebesgue measure is the unique $n$-dimensional measure on $\mathbb{R}^n$.
34.9. Existence of k-dimensional Measures. There exists a $k$-dimensional measure on $\mathbb{R}^n$.
Proof. For $E \subset \mathbb{R}^n$, set
$$\mu^* E = \inf\Bigl\{\sum_j \lambda_k G_j\Bigr\},$$
where the infimum is taken over all sequences of open sets $\{G_j\}$, $G_j \subset \mathbb{R}^k$, which admit 1-Lipschitz functions $\varphi_j$ on $G_j$ such that $E \subset \bigcup_j \varphi_j(G_j)$. If such a sequence $\{G_j\}$ does not exist, we set $\mu^* E = \infty$.
Obviously $\mu^*$ is an outer measure on $\mathbb{R}^n$. Let $\mathcal{S}$ be the $\sigma$-algebra of all $\mu^*$-measurable sets (see 4.4). Similarly as in Theorem 1.19 we prove that every Borel set belongs to $\mathcal{S}$: It is sufficient to observe that for any open halfspace $H$ and any continuous mapping $\varphi$ of an open set $G \subset \mathbb{R}^k$ into $\mathbb{R}^n$, the set $G$ is a disjoint union of $G \cap \varphi^{-1}(H)$ (and this is clearly open) and $G \setminus \varphi^{-1}(H)$ (this is a union of a countable family of closed sets; therefore a measurable set which can be approximated by an open superset with arbitrarily close measure). Obviously
$$\mu^* E = \inf\{\mu F : F \in \mathcal{S},\ F \supset E\}$$
as the sets of the form $\bigcup_j \varphi_j(G_j)$ are measurable ($G_j$ is a union of a sequence of compact sets and each continuous image of a compact set is compact). According to Exercise 30.7 we obtain that $\mu$ is a $k$-dimensional measure.
34.10. Volume. Let W be a k-dimensional linear space with an inner product and $L : \mathbb{R}^k \to W$ a linear mapping. Let $Q : \mathbb{R}^k \to W$ be an isometry and $a := |\det(Q^{-1}L)|$. Then $\lambda^k(Q^{-1}L(E)) = a\,\lambda^k E$ for any Borel set $E \subset \mathbb{R}^k$. The constant a is independent of the choice of the isometric mapping Q and has a similar meaning to that of the multiplier $|\det L|$ of Lemma 34.7 (the case n = k).
The expression $|\det(Q^{-1}L)|$ may be simplified: Using 34.1 we have
$$(\det(Q^{-1}L))^2 = \det\bigl((Q^{-1}L)^T Q^{-1}L\bigr) = \det\bigl(Q^{-1}Le_i \cdot Q^{-1}Le_j\bigr)_{i,j=1}^k = \det\bigl(Le_i \cdot Le_j\bigr)_{i,j=1}^k.$$
The expression
$$\sqrt{\det(u_i \cdot u_j)_{i,j=1}^k}$$
is said to be the volume of a k-tuple of vectors $(u_1, \dots, u_k)$ and will be denoted by $\operatorname{vol}(u_1, \dots, u_k)$. It has a geometric interpretation as the k-dimensional measure of the set
$$\{c_1 u_1 + \dots + c_k u_k : (c_1, \dots, c_k) \in [0,1]^k\}.$$
We associate the volume also to the linear mapping L by
$$\operatorname{vol} L := \operatorname{vol}(Le_1, \dots, Le_k).$$
Let us notice that the estimate $\operatorname{vol} L \le \|L\|^k$ is valid. We may summarize the consequences for the k-dimensional measure in the following theorem.
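34.11. Theorem. Let W be a k-dimensional linear subspace of $\mathbb{R}^n$ and $\mu$ be a k-dimensional measure on $\mathbb{R}^n$. Let $L : \mathbb{R}^k \to W$ be a linear mapping. Then
$$\mu(L(E)) = \operatorname{vol} L \; \lambda^k E$$
for any measurable set $E \subset \mathbb{R}^k$.
34.12. Theorem. Let W be a k-dimensional linear space and $L : \mathbb{R}^k \to W$, $M : \mathbb{R}^k \to \mathbb{R}^k$ be linear mappings. Then $\operatorname{vol}(LM) = |\det M| \operatorname{vol} L$.
Proof. Let $Q : \mathbb{R}^k \to W$ be a linear isometry. By 34.1
$$(\operatorname{vol} LM)^2 = \det\bigl((Q^{-1}LM)^T Q^{-1}LM\bigr) = \det M^T \det\bigl((Q^{-1}L)^T Q^{-1}L\bigr) \det M = (\det M)^2 (\operatorname{vol} L)^2.$$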
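The identity of Theorem 34.12 can be checked numerically on a small instance. The following is only a sketch: the matrices L and M are arbitrary illustrative choices, and the helper names (`det`, `matmul`, `transpose`, `vol`) are not taken from the text.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def matmul(a, b):
    """Product of matrices given as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def vol(a):
    """vol of the linear map with n x k matrix a, i.e. sqrt(det(a^T a))."""
    return math.sqrt(det(matmul(transpose(a), a)))

# Illustrative data (assumed): L maps R^2 into R^3, M maps R^2 into R^2.
L = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
M = [[2.0, 1.0], [-1.0, 3.0]]

lhs = vol(matmul(L, M))        # vol(LM)
rhs = abs(det(M)) * vol(L)     # |det M| vol L
print(lhs, rhs)
```

Here vol L is computed directly from the Gram determinant, so the check does not depend on the choice of the isometry Q.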

34.13. Cauchy-Binet formula. Now we derive a formula which makes the calculation of the volume of a mapping $L : \mathbb{R}^k \to \mathbb{R}^n$ easier, at least for some combinations of n and k. The elements of the set $\{1, \dots, n\}^k$ (i.e. the ordered k-tuples of indices) will be called multiindices. We will denote by $I = I(k, n)$ the set of all multiindices $\alpha \in \{1, \dots, n\}^k$ which are increasing, i.e. $\alpha_1 < \dots < \alpha_k$. Consider matrices $A = (a_{m,j})_{m=1,\dots,n;\ j=1,\dots,k}$ and $B = (b_{m,j})_{m=1,\dots,n;\ j=1,\dots,k}$. For each multiindex $\alpha \in I$ write $A_\alpha = (a_{\alpha_i,j})_{i,j=1,\dots,k}$, $B_\alpha = (b_{\alpha_i,j})_{i,j=1,\dots,k}$. Now we show that
$$\det A^T B = \sum_{\alpha \in I} \det A_\alpha^T B_\alpha.$$
A routine calculation shows that
$$\det A^T B = \det\Bigl(\sum_{m=1}^n a_{m,i} b_{m,j}\Bigr)_{i,j=1}^k = \sum_{\alpha \in \{1,\dots,n\}^k} \det\bigl(a_{\alpha_i,i}\, b_{\alpha_i,j}\bigr)_{i,j=1}^k.$$
Since
$$\det\bigl(a_{\alpha_i,i}\, b_{\alpha_i,j}\bigr)_{i,j=1}^k = 0$$
whenever any index in $\alpha$ is repeated, we obtain
$$\det A^T B = \sum_{\alpha \in \{1,\dots,n\}^k} \det\bigl(a_{\alpha_i,i}\, b_{\alpha_i,j}\bigr)_{i,j=1}^k = \sum_{\alpha \in I}\ \sum_{\beta \in \{\alpha_1,\dots,\alpha_k\}^k} \det\bigl(a_{\beta_i,i}\, b_{\beta_i,j}\bigr)_{i,j=1}^k = \sum_{\alpha \in I} \det(A_\alpha^T B_\alpha).$$
Now the promised application to the volume follows: We set both A and B to be the matrix of the mapping L. From the above computation we obtain
$$(\operatorname{vol} L)^2 = \det A^T A = \sum_{\alpha \in I} \det A_\alpha^T A_\alpha = \sum_{\alpha \in I} (\det A_\alpha)^2.$$

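34.14. Example. Let
$$u_1 = [0, 2, 1], \qquad u_2 = [1, 0, 1],$$
so that
$$A = \begin{pmatrix} 0 & 1 \\ 2 & 0 \\ 1 & 1 \end{pmatrix}.$$
Then we evaluate
$$(\operatorname{vol}(u_1, u_2))^2 = \det(A^T A) = \det\begin{pmatrix} 5 & 1 \\ 1 & 2 \end{pmatrix} = 9,$$
or
$$(\operatorname{vol}(u_1, u_2))^2 = \Bigl(\det\begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}\Bigr)^2 + \Bigl(\det\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}\Bigr)^2 + \Bigl(\det\begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}\Bigr)^2 = (-2)^2 + (-1)^2 + 2^2 = 9.$$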
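The two computations of Example 34.14 can be reproduced mechanically. The sketch below assumes nothing beyond the matrix A of the example; the function names are illustrative only. It compares the Gram determinant $\det A^T A$ with the Cauchy-Binet sum of squared $k \times k$ minors.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def gram_det(a):
    """det(a^T a) for an n x k matrix a given as a list of rows."""
    k = len(a[0])
    gram = [[sum(row[i] * row[j] for row in a) for j in range(k)] for i in range(k)]
    return det(gram)

def minors_sq_sum(a):
    """Sum of (det A_alpha)^2 over all increasing multiindices alpha (row selections)."""
    k = len(a[0])
    return sum(det([a[r] for r in rows]) ** 2
               for rows in combinations(range(len(a)), k))

# Columns are u1 = [0, 2, 1] and u2 = [1, 0, 1] from Example 34.14.
A = [[0, 1], [2, 0], [1, 1]]
print(gram_det(A), minors_sq_sum(A))
```

Both values equal $(\operatorname{vol}(u_1, u_2))^2$, so the volume itself is their square root.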

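34.15. Lemma. Let $\mu$ be a k-dimensional measure in $\mathbb{R}^n$. Let $L : \mathbb{R}^k \to \mathbb{R}^n$ be an injective linear mapping, $G \subset \mathbb{R}^k$ an open set, $F \subset G$ a measurable set and $\varepsilon > 0$. Further, let $\varphi : G \to \mathbb{R}^n$ be an almost everywhere differentiable mapping for which
$$|\varphi(s) - \varphi(t) - L(s - t)| \le \varepsilon |s - t|, \qquad t, s \in F.$$
Denote $\beta = 1 + \varepsilon \|L^{-1}\|$, $\gamma = (1 - \varepsilon \|L^{-1}\|)^{-1}$.
Then the following assertions hold:
(a) the mapping $\varphi \circ L^{-1}$ is $\beta$-Lipschitz on $L(F)$;
(b) if $\varepsilon \|L^{-1}\| < 1$, then the mapping $\varphi$ is one-to-one and the mapping $L \circ \varphi^{-1}$ is $\gamma$-Lipschitz on $\varphi(F)$;
(c) for almost every $t \in F$ we have $\|\varphi'(t) - L\| \le \varepsilon$ and $\operatorname{vol} \varphi'(t) \le \beta^k \operatorname{vol} L$;
(d) if $\varepsilon \|L^{-1}\| < 1$, then $\operatorname{vol} \varphi'(t) \ge \gamma^{-k} \operatorname{vol} L$ for almost all $t \in F$;
(e) $\mu(\varphi(E)) \le \beta^k \operatorname{vol} L \; \lambda^k E$ for any set $E \subset F$;
(f) if $\varepsilon \|L^{-1}\| < 1$, then
$$\beta^{-k} \gamma^{-k} \int_E \operatorname{vol} \varphi'(t)\, dt \le \mu(\varphi(E)) \le \beta^k \gamma^k \int_E \operatorname{vol} \varphi'(t)\, dt$$
for any set $E \subset F$.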


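Proof. (a) and (b): Let $t, s \in F$. Then
$$|\varphi(s) - \varphi(t)| \le |\varphi(s) - \varphi(t) - L(s - t)| + |Ls - Lt| \le \varepsilon |s - t| + |Ls - Lt| \le (\varepsilon \|L^{-1}\| + 1)\, |Ls - Lt|$$
(hence (a) follows) and
$$|Ls - Lt| \le |\varphi(s) - \varphi(t) - L(s - t)| + |\varphi(s) - \varphi(t)| \le \varepsilon \|L^{-1}\|\, |Ls - Lt| + |\varphi(s) - \varphi(t)|.$$
The last inequality may be also rewritten as
$$(1 - \varepsilon \|L^{-1}\|)\, |Ls - Lt| \le |\varphi(s) - \varphi(t)|,$$
which yields (b).
(c) and (d): By the Lebesgue Density Theorem 29.2, almost every point of F is its point of density. If, in addition, the derivative $A := \varphi'(t)$ exists at such a point t, then obviously $\|A - L\| \le \varepsilon$. Since the mapping $AL^{-1}$ is (as in (a)) $\beta$-Lipschitz, we have $\|AL^{-1}\| \le \beta$ and $\operatorname{vol} AL^{-1} \le \beta^k$, or $\operatorname{vol} A \le \beta^k \operatorname{vol} L$. Similarly we obtain $\operatorname{vol} A \ge \gamma^{-k} \operatorname{vol} L$ if $\varepsilon \|L^{-1}\| < 1$.
(e) and (f): Using the definition of the k-dimensional measure and the preceding considerations, it follows that
$$\mu(\varphi(E)) \le \beta^k \mu(L(E)) = \beta^k \operatorname{vol} L \; \lambda^k E.$$
Assuming $\varepsilon \|L^{-1}\| < 1$ we obtain
$$\mu(\varphi(E)) \le \beta^k \operatorname{vol} L \; \lambda^k E \le \beta^k \gamma^k \int_E \operatorname{vol} \varphi'(t)\, dt$$
and
$$\mu(\varphi(E)) \ge \gamma^{-k} \operatorname{vol} L \; \lambda^k E \ge \gamma^{-k} \beta^{-k} \int_E \operatorname{vol} \varphi'(t)\, dt.$$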

Now we will deal with locally Lipschitz mappings. We will frequently and tacitly use the fact that (in view of the Rademacher Theorem 30.3) any Lipschitz mapping $\varphi$ has its derivative $\varphi'(t)$ almost everywhere, and that this is measurable (as a function of t).

34.16. Lemma. Let $G \subset \mathbb{R}^k$ be an open set, $E \subset G$ a measurable set and $\varphi : G \to \mathbb{R}^n$ a locally Lipschitz mapping. Let F be a closed or open subset of $M_{n,k}$ and $\varepsilon$ a positive function on F. Then there exists a countable disjoint partition $E \cap \{\varphi' \in F\} = \bigcup_j D_j$, where $D_j$ are measurable and for each j there is $L \in F$ such that
$$|\varphi(s) - \varphi(t) - L(s - t)| < \varepsilon(L)\, |s - t|$$
for each $s, t \in D_j$.
Proof. First suppose that F is compact. The system of sets
$$\{M \in F : \|M - L\| < \varepsilon(L)\},$$
where L runs over F, is an open cover of F. We can thus find its countable subcover and divide $E \cap \{\varphi' \in F\}$ into a disjoint union of measurable sets $E^i$, where for each i there is $L \in F$ such that for all $t \in E^i$ we have $\|\varphi'(t) - L\| < \varepsilon(L)$. Let i be fixed. We have $E^i = \bigcup_p E^i_p$, where
$$E^i_p := \Bigl\{t \in E^i : \text{for each } s \in U(t, \tfrac{2}{p}) \text{ we have } |\varphi(s) - \varphi(t) - L(s - t)| < \varepsilon(L)\, |s - t|\Bigr\}.$$
If we divide $E^i_p$ into measurable pairwise disjoint parts $E^i_{p,q}$ with diameter less than 1/p, then $E^i_{p,q}$ have all the required properties. If F is closed or open in $M_{n,k}$, it is a countable union of compact sets and we use the preceding part.

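34.17. Sard Theorem. Let $\mu$ be a complete k-dimensional measure on $\mathbb{R}^n$. Let $G \subset \mathbb{R}^k$ be an open set and $\varphi : G \to \mathbb{R}^n$ a locally Lipschitz mapping. Let $Z := \{t : \operatorname{vol} \varphi'(t) = 0\}$. Then $\mu(\varphi(Z)) = 0$.
Proof. It is enough to show that $\mu(\varphi(E)) = 0$ for any set $E := Z \cap \{\|\varphi'\| \le m\}$. If $K := \{M \in M_{n,k} : \operatorname{vol} M = 0 \text{ and } \|M\| \le m\}$, then K is a closed subset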
of Mn,k . Choose > 0. With each L K we associate (L) > 0 such that
k1
(L) < and |Lt| (L) |t| for every t perpendicular to Ker L := {s
2k L
k
R : Ls = 0}. According to Lemma 34.16 there is a disjoint partition E into
a countable union of measurable sets Dj such that for each j there is L K
satisfying
|(t) (s) L(s t)| < (L) |s t|
for each s, t Dj . Let j be xed. The dimension d of the space Ker L is less
than k. Let P be the orthogonal projection of Rk onto Ker L and W be a (k-d)dimensional subspace of Rn orthogonal to L(Rk ). We nd an isometric mapping
Q of the space Ker L to W and set M t%= Lt %+ (L)QP t. It is easily veried that
for = (L) we have M L , %M 1 % 1/ and vol M = kd vol L
kd md mk1 . By Lemma 34.15 we get
%
%
(Dj ) (2 %M 1 %)k vol M D 2k mk1 Dj Dj .
Taking the union and letting 0 we obtain (E) = 0.


34.18. Change of Variable Theorem. Let $\mu$ be a complete k-dimensional measure on $\mathbb{R}^n$. Let $G \subset \mathbb{R}^k$ be an open set and $\varphi : G \to \mathbb{R}^n$ a locally Lipschitz mapping. If u is a measurable function on G and $w(x) := \sum \{u(t) : t \in \varphi^{-1}(x)\}$, then
$$\int_{\varphi(G)} w(x)\, d\mu(x) = \int_G u(t) \operatorname{vol} \varphi'(t)\, dt,$$
provided the integral on the right-hand side converges.

Proof. Apparently it is enough to prove the theorem when u is the indicator function of a measurable set $E \subset G$. If E is of measure zero, it is mapped by a locally Lipschitz mapping again to a set of measure zero. If $E \subset \{\operatorname{vol} \varphi' = 0\}$, then according to the Sard Theorem 34.17 we have $\mu(\varphi(E)) = 0$ and the integral on the right-hand side is also zero. Hence we may assume that $E \subset \{\operatorname{vol} \varphi' > 0\}$. If $F := \{M \in M_{n,k} : \operatorname{vol} M > 0\}$, then F is an open subset of $M_{n,k}$. Choose $\tau > 1$. To each matrix $L \in F$ find $\varepsilon(L) > 0$ such that $\varepsilon(L) \|L^{-1}\| < 1 - \tau^{-1/(2k)}$ (then the constants $\beta$, $\gamma$ of Lemma 34.15 satisfy the estimate $\beta^k \gamma^k < \tau$). By Lemma 34.16 there is a disjoint partition $E = \bigcup_j D_j$, where $D_j$ are measurable sets and for every j there is $L \in F$ such that
$$|\varphi(s) - \varphi(t) - L(s - t)| < \varepsilon(L)\, |s - t|$$
for all $s, t \in D_j$. Then by virtue of Lemma 34.15, the mapping $\varphi$ is one-to-one on $D_j$. We first treat the case when $\varphi$ is one-to-one on E. Thanks to Lemma 34.15 we have
$$\tau^{-1} \mu(\varphi(D_j)) \le \int_{D_j} \operatorname{vol} \varphi'(t)\, dt \le \tau\, \mu(\varphi(D_j)).$$
For each compact set $K \subset D_j$ the set $\varphi(K)$ is compact and thus measurable. There is a sequence $\{K_i\}$ of compact subsets of $D_j$ such that $\lambda^k K_i \nearrow \lambda^k D_j$. Then the set $F_j := \bigcup_i \varphi(K_i)$ is a $\mu$-measurable subset of $\varphi(D_j)$ and
$$\int_{D_j} \operatorname{vol} \varphi'(t)\, dt \le \tau\, \mu F_j.$$
Denote $F = \bigcup_j F_j$. If we sum over j, we obtain
$$\tau^{-1} \mu(\varphi(E)) \le \int_E \operatorname{vol} \varphi'(t)\, dt \le \tau\, \mu(F).$$
Whence, passing to the limit as $\tau \to 1$ we get
$$\mu(\varphi(E)) \le \int_E \operatorname{vol} \varphi'(t)\, dt \le \mu(F).$$
Since $F \subset \varphi(E)$, it follows
$$\mu(\varphi(E)) = \int_E \operatorname{vol} \varphi'(t)\, dt,$$
which is the desired formula in the case of the indicator function of the set E.
Now, if $\varphi$ is not one-to-one on E, the same procedure as above yields a disjoint partition $E = \bigcup_j E_j$, where $E_j$ are measurable sets and $\varphi$ is one-to-one on $E_j$. According to the preceding part of the proof, it follows that
$$\int_{\varphi(G)} c_{\varphi(E_j)}\, d\mu = \mu(\varphi(E_j)) = \int_{E_j} \operatorname{vol} \varphi'(t)\, dt.$$
Summing over j we complete the assertion.


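34.19. Corollary. Let $\mu$ be a complete k-dimensional measure on $\mathbb{R}^n$. Let $G \subset \mathbb{R}^k$ be an open set and $\varphi : G \to \mathbb{R}^n$ a locally Lipschitz mapping. Let f be a $\mu$-measurable function on $\varphi(G)$ and $E \subset G$ a measurable set. If we define the Banach indicatrix $N(x, \varphi, E)$ as the cardinality of the set $E \cap \varphi^{-1}(x)$, then
$$\int_{\varphi(G)} N(x, \varphi, E)\, f(x)\, d\mu(x) = \int_E f(\varphi(t)) \operatorname{vol} \varphi'(t)\, dt,$$
provided either integral exists.
In particular, if the mapping $\varphi$ is, in addition, one-to-one, then
$$\int_{\varphi(G)} f(x)\, d\mu(x) = \int_G f(\varphi(t)) \operatorname{vol} \varphi'(t)\, dt.$$
Proof. The assertion is a direct consequence of Theorem 34.18. It remains only to verify that $f \circ \varphi$ is measurable on $\{\operatorname{vol} \varphi' > 0\}$ and $N(x, \varphi, E)$ is $\mu$-measurable. If $E \subset \{\operatorname{vol} \varphi' = 0\}$, then an appeal to the Sard Theorem 34.17 reveals that $\mu(\varphi(E)) = 0$, so that $N(x, \varphi, E) = 0$ almost everywhere. Therefore, we may suppose that $E \subset \{\operatorname{vol} \varphi' > 0\}$ and similarly as in the proof of the preceding theorem we may restrict to the case when $\varphi$ is one-to-one on E. Further, it is no restriction to assume that f is the indicator function of a $\mu$-measurable set $H \subset \varphi(G)$. In this case there are Borel sets $B, N \subset \varphi(G)$ such that $B \subset H$, $H \setminus B \subset N$ and $\mu N = 0$. By Theorem 34.18 we have $\lambda^k(E \cap \varphi^{-1}(N)) = 0$, and therefore $\lambda^k(E \cap \varphi^{-1}(H \setminus B)) = 0$. We see that the set $E \cap \varphi^{-1}(H) = E \cap \bigl(\varphi^{-1}(B) \cup \varphi^{-1}(H \setminus B)\bigr)$ is measurable.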
34.20. Bilipschitz Mappings. Let $(P_1, \rho_1)$, $(P_2, \rho_2)$ be metric spaces, $G \subset P_1$ and $\beta \in [1, \infty)$ a constant. We say that a mapping $\varphi : G \to P_2$ is $\beta$-bilipschitz on G if
$$\beta^{-1} \rho_1(x, y) \le \rho_2(\varphi(x), \varphi(y)) \le \beta\, \rho_1(x, y)$$
for all $x, y \in G$. In other words, whenever both $\varphi$ and $\varphi^{-1}$ are $\beta$-Lipschitz mappings. A mapping is termed bilipschitz if it is $\beta$-bilipschitz for some $\beta$.
A mapping $\varphi : G \to P_2$ is said to be locally bilipschitz if each point $x \in G$ has a neighborhood $U_x$ and $\beta_x$ such that the restriction of $\varphi$ to $U_x$ is $\beta_x$-bilipschitz.


34.21. Remarks.
1. Each bilipschitz mapping is one-to-one; it is even a homeomorphism onto its image. However, a locally bilipschitz mapping may fail to be one-to-one; and even if it is one-to-one, it is not necessarily a homeomorphism.
2. Notice that 1-bilipschitz mappings are nothing else than isometries.
3. Let $\varphi$ be a bilipschitz mapping of an open subset of $\mathbb{R}^k$ to $\mathbb{R}^n$. Then $\varphi$ is differentiable almost everywhere (by the Rademacher Theorem 31.2). Moreover, if $\varphi'(t)$ exists, then the rank of $\varphi'(t)$ is k.

34.22. Regular Mappings and Diffeomorphisms. Every regular mapping (i.e. a $C^1$-mapping of an open set $G \subset \mathbb{R}^k$ to $\mathbb{R}^k$ whose Jacobian is nowhere vanishing on G) serves as an example of a locally bilipschitz mapping. Also any regular mapping of an open set $G \subset \mathbb{R}^k$ into $\mathbb{R}^n$ ($k \le n$), which is defined as a $C^1$-mapping whose Jacobi matrix has rank k at each point of G, is a locally bilipschitz mapping.
Any one-to-one regular mapping $\varphi$ of an open set $G \subset \mathbb{R}^k$ into $\mathbb{R}^k$ is a diffeomorphism, i.e. $\varphi$ is a homeomorphism for which both $\varphi$ and $\varphi^{-1}$ are $C^1$-mappings.
34.23. Examples. The mappings given by polar or spherical coordinates of Chapter 26 are typical examples of regular mappings. If $a \in U(0, 1)$ is a constant vector, then
$$f : x \mapsto |x| \Bigl(\frac{x}{|x|} - a\Bigr), \qquad x \in U(0, 1),$$
may serve as an example of a mapping which is $(1 - |a|)^{-1}$-bilipschitz but not of class $C^1$ (as it is not differentiable at the origin).
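The bilipschitz bounds of the last example can be probed numerically. The sketch below is only a sampling experiment, not a proof: the vector a = (0.5, 0) and the grid of sample points are assumptions made for illustration, and the planar case k = 2 is used. It records the smallest and largest ratio $|f(x) - f(y)| / |x - y|$ over the sampled pairs.

```python
import math

A = (0.5, 0.0)  # assumed constant vector a with |a| = 0.5

def f(x):
    """f(x) = |x| (x/|x| - a) = x - |x| a, extended by f(0) = 0."""
    r = math.hypot(x[0], x[1])
    return (x[0] - r * A[0], x[1] - r * A[1])

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Grid of sample points inside the open unit disc U(0, 1).
pts = [(i / 10.0, j / 10.0)
       for i in range(-9, 10) for j in range(-9, 10)
       if i * i + j * j < 100]

ratios = [dist(f(p), f(q)) / dist(p, q) for p in pts for q in pts if p != q]
lo, hi = min(ratios), max(ratios)
print(lo, hi)
```

With $|a| = 0.5$ the example predicts that all ratios lie between $1 - |a| = 0.5$ and $(1 - |a|)^{-1} = 2$.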

34.24. Integration on k-dimensional Surfaces. In applications, k-dimensional measures are mostly used on the so-called k-dimensional surfaces. These objects admit local parametrizations by means of Lipschitz (or even bilipschitz) mappings, and consequently the k-dimensional measure is uniquely determined on subsets of k-dimensional surfaces by the Change of Variable Theorem.
Given a set $\Gamma \subset \mathbb{R}^n$, we always consider it as a metric space whose topology is inherited from that of $\mathbb{R}^n$. A set $\Gamma \subset \mathbb{R}^n$ is called a k-dimensional surface whenever for each point $x \in \Gamma$ there exists a locally bilipschitz homeomorphic mapping $\varphi$ of an open set $G \subset \mathbb{R}^k$ into $\Gamma$ such that $x \in \varphi(G)$, and $\varphi(G)$ is a relatively open subset of $\Gamma$. The mapping $\varphi$ is then called a parametrization of $\varphi(G)$.
If each point of $\Gamma$ has a neighborhood parametrizable by mappings of class $C^\alpha$ ($\alpha \ge 1$), we say that the surface $\Gamma$ itself is of class $C^\alpha$.
Roughly speaking, k-dimensional surfaces are sets which locally look like a bilipschitz deformation of an open part of $\mathbb{R}^k$. Equivalently, we may describe this property in terms of charts, which are defined as inverse mappings to parametrizations. A k-dimensional chart is thus defined as a locally bilipschitz homeomorphic mapping of an open subset U of $\Gamma$ onto an open subset of $\mathbb{R}^k$. A set $\Gamma$ is a k-dimensional surface if and only if for each point $x \in \Gamma$ there is a k-dimensional chart defined on a neighborhood of x (relative to $\Gamma$).
Because of the Lindelöf property of $\mathbb{R}^n$ each k-dimensional surface is a countable union of surfaces of the form $\varphi(G)$, where $\varphi$ is a parametrization.
Another method of characterizing k-dimensional surfaces of class $C^\alpha$ is to use an implicit description: A set $\Gamma \subset \mathbb{R}^n$ is a k-dimensional surface of class $C^\alpha$ if and only if each $x \in \Gamma$ admits a neighborhood $W_x$ in $\mathbb{R}^n$ and a mapping $g : W_x \to \mathbb{R}^{n-k}$ of class $C^\alpha$ such that the matrix $g'(x)$ has rank $n - k$ and $W_x \cap \Gamma = W_x \cap \{g = 0\}$.
(It is an easy exercise to use the Implicit Function Theorem in order to get a parametrization. Conversely, if $\varphi : G \to \Gamma$ is a parametrization of class $C^\alpha$ and $x = \varphi(t)$, then there exists the projection $\pi$ of the space $\mathbb{R}^n$ onto $\mathbb{R}^k$ such that $(\pi \circ \varphi)'(t) \ne 0$. Then by the Inverse Mapping Theorem there exists (after an eventual restriction of the domain of $\varphi$) the inverse mapping $(\pi \circ \varphi)^{-1}$. The equation $g(x) = 0$, where $g(x) = x - \varphi\bigl((\pi \circ \varphi)^{-1}(\pi(x))\bigr)$, is the desired implicit description of $\Gamma$ on the neighborhood of x.)
Let $\Gamma$ be a k-dimensional surface and $\mu$ a k-dimensional measure on $\mathbb{R}^n$. The integral $\int_\Gamma f\, d\mu$ is sometimes labelled as the curve or surface integral of f.
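34.25. Example (helix). Let $X = \{[x, y, z] \in \mathbb{R}^3 : x = \cos z,\ y = \sin z\}$. We may parametrize X by the mapping
$$\varphi(t) = [\cos t, \sin t, t], \qquad t \in \mathbb{R}.$$
Then
$$\varphi'(t) = [-\sin t, \cos t, 1]$$
and
$$\operatorname{vol} \varphi'(t) = \sqrt{(-\sin t)^2 + \cos^2 t + 1} = \sqrt{2}.$$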

34.26. Example (sphere). Let $S = \{[x, y, z] \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$. We introduce three possibilities of a parametrization.
(a) Remember that the spherical coordinates are given by formulas $s = [x, y, z]$, where
$$x(t, a) = \cos a \cos t, \qquad y(t, a) = \cos a \sin t, \qquad z(t, a) = \sin a,$$
$[t, a] \in G := (0, 2\pi) \times (-\tfrac12 \pi, \tfrac12 \pi)$. Then $s(G) = X := S \setminus N$, where N is the meridian $\{[x, y, z] \in S : y = 0,\ x \in [0, 1]\}$. Obviously, two-dimensional measure of N is zero, so that the difference between S and X may be neglected. We have
$$s'(t, a) = \begin{pmatrix} -\sin t \cos a, & -\cos t \sin a \\ \cos t \cos a, & -\sin t \sin a \\ 0, & \cos a \end{pmatrix}$$
and
$$\operatorname{vol} s'(t, a) = \sqrt{\cos^2 a \sin^2 a + \sin^2 t \cos^4 a + \cos^2 t \cos^4 a} = \cos a.$$
(b) Let S, X, N be as in (a). For the parametrization of X we may also use the cylindrical coordinates $c = [x, y, z]$, where
$$x(t, h) = r \cos t, \qquad y(t, h) = r \sin t, \qquad z(t, h) = h \qquad (r \text{ denotes } \sqrt{1 - h^2}),$$
on $G := \{[t, h] \in \mathbb{R}^2 : t \in (0, 2\pi),\ h \in (-1, 1)\}$. Then
$$c'(t, h) = \begin{pmatrix} -r \sin t, & -\frac{h}{r} \cos t \\ r \cos t, & -\frac{h}{r} \sin t \\ 0, & 1 \end{pmatrix}$$
and
$$\operatorname{vol} c'(t, h) = \sqrt{h^2 + (1 - h^2)(\sin^2 t + \cos^2 t)} = 1.$$
(c) Let $S_+$ be the hemisphere $\{[x, y, z] \in S : z > 0\}$. Then $S_+$ can be parametrized by its projection into the plane determined by the axes x and y. Consider the parametrization $p = [x, y, z]$, where
$$x(s, t) = s, \qquad y(s, t) = t, \qquad z(s, t) = \sqrt{1 - s^2 - t^2},$$
on $\{(s, t) \in \mathbb{R}^2 : s^2 + t^2 < 1\}$. Then
$$p'(s, t) = \begin{pmatrix} 1, & 0 \\ 0, & 1 \\ \frac{-s}{\sqrt{1 - s^2 - t^2}}, & \frac{-t}{\sqrt{1 - s^2 - t^2}} \end{pmatrix}$$
and
$$\operatorname{vol} p'(s, t) = \frac{1}{\sqrt{1 - s^2 - t^2}}.$$
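The volume factors in Examples 34.25 and 34.26 can be cross-checked by evaluating $\sqrt{\det(J^T J)}$ for the stated Jacobi matrices at sample points. This is a sketch only: the sample parameter values are arbitrary, and the function names are illustrative rather than notation from the text.

```python
import math

def vol(jac):
    """sqrt(det(J^T J)) for a Jacobi matrix with 3 rows and k = 1 or 2 columns."""
    k = len(jac[0])
    g = [[sum(row[i] * row[j] for row in jac) for j in range(k)] for i in range(k)]
    if k == 1:
        return math.sqrt(g[0][0])
    return math.sqrt(g[0][0] * g[1][1] - g[0][1] * g[1][0])

def helix(t):
    return [[-math.sin(t)], [math.cos(t)], [1.0]]

def spherical(t, a):
    return [[-math.sin(t) * math.cos(a), -math.cos(t) * math.sin(a)],
            [math.cos(t) * math.cos(a), -math.sin(t) * math.sin(a)],
            [0.0, math.cos(a)]]

def cylindrical(t, h):
    r = math.sqrt(1.0 - h * h)
    return [[-r * math.sin(t), -(h / r) * math.cos(t)],
            [r * math.cos(t), -(h / r) * math.sin(t)],
            [0.0, 1.0]]

def graph(s, t):
    z = math.sqrt(1.0 - s * s - t * t)
    return [[1.0, 0.0], [0.0, 1.0], [-s / z, -t / z]]

v_helix = vol(helix(0.4))             # expected sqrt(2)
v_sph = vol(spherical(0.7, 0.3))      # expected cos(0.3)
v_cyl = vol(cylindrical(1.1, 0.4))    # expected 1
v_graph = vol(graph(0.3, 0.2))        # expected 1/sqrt(1 - 0.09 - 0.04)
print(v_helix, v_sph, v_cyl, v_graph)
```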
34.27. Example. By means of the parametrization of Example 34.26.b we will compute two-dimensional surface measure of the unit sphere. Let S, X, G have the same meaning as in the quoted example. Then
$$\mu(S) = \mu(X) = \int_G \operatorname{vol} c'\, dt\, dh = \int_{-1}^{1} \Bigl(\int_0^{2\pi} dt\Bigr)\, dh = 4\pi.$$

34.28. Example. Let $\mu$ be a two-dimensional measure on $S := \{(x, y, z) \in \mathbb{R}^3 : z = -\sqrt{1 - x^2 - y^2}\}$. The integral
$$\int_S z^2\, d\mu$$
possesses (up to constants) the physical meaning of the force by which (under the unit gravitation acceleration) a ball with the unit density half immersed in a liquid is lifted (computed by the integration of the gravitation forces).
Use spherical coordinates on $G = \{(t, a) \in \mathbb{R}^2 : t \in (0, 2\pi),\ a \in (-\pi/2, 0)\}$. Then
$$\int_S z^2\, d\mu = \int_G \sin^2 a \cos a\, dt\, da = 2\pi \int_{-\pi/2}^{0} \sin^2 a \cos a\, da = \frac{2}{3}\pi,$$
which is the half of the volume of the unit ball. Therefore, we have verified Archimedes' principle in our particular case.
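The values $4\pi$ and $\tfrac{2}{3}\pi$ of the two preceding examples can be confirmed by a simple quadrature. The sketch below uses the midpoint rule with an arbitrarily chosen number of subintervals; `midpoint` is an illustrative helper, not part of the text.

```python
import math

def midpoint(func, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))

# Area of the unit sphere in spherical coordinates: vol s'(t, a) = cos a,
# and the integrand does not depend on t, so the t-integral contributes 2*pi.
area = 2.0 * math.pi * midpoint(math.cos, -math.pi / 2, math.pi / 2)

# Integral of z^2 over the lower hemisphere: z = sin a, vol s' = cos a.
lift = 2.0 * math.pi * midpoint(lambda a: math.sin(a) ** 2 * math.cos(a),
                                -math.pi / 2, 0.0)

print(area, 4.0 * math.pi)
print(lift, 2.0 * math.pi / 3.0)
```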
34.29. Example (the length of the graph of a function). Let $\Gamma$ be the graph of a Lipschitz function $f : [0, 1] \to \mathbb{R}$ and $\mu$ a 1-dimensional measure on $\mathbb{R}^2$. Let $G = (0, 1)$ and $\varphi(t) = (t, f(t))$, $t \in G$. Then the assumptions of Theorem 34.18 are satisfied and
$$\mu(\Gamma) = \int_0^1 \operatorname{vol} \varphi'(t)\, dt = \int_0^1 \sqrt{1 + (f'(t))^2}\, dt.$$
34.30. Exercise. Derive the formula
$$\mu(\Gamma) = 2\pi \int_a^b f(t) \sqrt{1 + (f'(t))^2}\, dt$$
in the case of two-dimensional measure of the surface drawn in $\mathbb{R}^3$ by the rotation of the graph of a function f around the z-axis. Suppose that the function $f : (a, b) \to (0, +\infty)$ is Lipschitz and set
$$\Gamma = \{[x, y, z] \in \mathbb{R}^3 : z \in (a, b),\ \sqrt{x^2 + y^2} = f(z)\}.$$
34.31. Rectifiable Sets. A set $H \subset \mathbb{R}^n$ is said to be k-rectifiable if there is a Lipschitz mapping of a bounded measurable set $E \subset \mathbb{R}^k$ onto H. It is a consequence of the Change of Variable Formula 34.19 that on the $\sigma$-algebra generated by k-rectifiable sets all k-dimensional measures coincide. On the other hand, there exist closed sets on which k-dimensional measures may differ (for example, the k-dimensional measure of 34.9 may be different from the Hausdorff measure of Chapter 36). The constructions of such sets are rather complicated.
Notice that n-rectifiable sets in $\mathbb{R}^n$ are just bounded Lebesgue measurable sets.
34.32. Notes. The concept of k-dimensional measures comes from A. Kolmogorov [1932]. Our exposition follows the monograph by H. Federer [*1969] (see also L. C. Evans and R. E. Gariepy [*1992]) and ideas due to D. Preiss. The classical Sard theorem (or, Morse-Sard theorem) for $C^1$ mappings was proved by A.P. Morse [1939] and A. Sard [1942].

35. The Degree of a Mapping


Let $G \subset \mathbb{R}^k$ be an open set and $\varphi : G \to \mathbb{R}^k$ a locally Lipschitz mapping. Then
$$\int_G |J_\varphi(t)|\, dt = \int_{\varphi(G)} N(x, \varphi, G)\, dx,$$
where $N(x, \varphi, G)$ is the number of points of the set $\varphi^{-1}(x)$ (the Banach indicatrix). It is natural to ask what is the meaning of the integral
$$\int_G J_\varphi(t)\, dt.$$

This problem leads to the notion of the degree of a mapping which turns out
to have many applications. Let us mention, for instance, its importance for the
topology of Euclidean spaces and solvability problems for nonlinear algebraic
equations.
First we need to prepare several auxiliary computational results.
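35.1. Lemma. Let $G \subset \mathbb{R}^k$ be an open set and $\varphi_1, \dots, \varphi_k$ twice continuously differentiable functions on G. Then we have
(a)
$$\sum_{q=1}^k (-1)^{q+1} \frac{\partial}{\partial t_q} \det\Bigl(\frac{\partial \varphi_i}{\partial t_j}\Bigr)_{\substack{i=2,\dots,k \\ j=1,\dots,q-1,q+1,\dots,k}} = 0.$$
(b)
$$\sum_{q=1}^k (-1)^{q+1} \frac{\partial}{\partial t_q} \Bigl[\varphi_1 \det\Bigl(\frac{\partial \varphi_i}{\partial t_j}\Bigr)_{\substack{i=2,\dots,k \\ j=1,\dots,q-1,q+1,\dots,k}}\Bigr] = \det(\nabla \varphi_1, \dots, \nabla \varphi_k).$$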

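Proof. (a) The left-hand side is equal to the sum
$$\sum_{q=1}^k \sum_{p=1}^k (-1)^{q+1} \det(a_{i,j,p,q})_{\substack{i=2,\dots,k \\ j=1,\dots,q-1,q+1,\dots,k}},$$
where
$$a_{i,j,p,q} = \begin{cases} \dfrac{\partial \varphi_i}{\partial t_j} & \text{if } j \ne p \ne q, \\[2mm] \dfrac{\partial^2 \varphi_i}{\partial t_q\, \partial t_j} & \text{if } j = p \ne q, \\[2mm] 0 & \text{if } p = q. \end{cases}$$
Now it remains to realize that using the interchange of the order of differentiation we get
$$(-1)^{q+1} \det(a_{i,j,p,q})_{\substack{i=2,\dots,k \\ j=1,\dots,q-1,q+1,\dots,k}} = (-1)^{p} \det(a_{i,j,q,p})_{\substack{i=2,\dots,k \\ j=1,\dots,p-1,p+1,\dots,k}}$$
for all p, q, so that the terms with indices (p, q) and (q, p) cancel.
(b) If we apply the product rule to the left-hand side of the equality, we obtain
$$\sum_{q=1}^k (-1)^{q+1} \frac{\partial}{\partial t_q} \Bigl[\varphi_1 \det\Bigl(\frac{\partial \varphi_i}{\partial t_j}\Bigr)_{\substack{i=2,\dots,k \\ j \ne q}}\Bigr] = \sum_{q=1}^k (-1)^{q+1} \frac{\partial \varphi_1}{\partial t_q} \det\Bigl(\frac{\partial \varphi_i}{\partial t_j}\Bigr)_{\substack{i=2,\dots,k \\ j \ne q}} + \varphi_1 \sum_{q=1}^k (-1)^{q+1} \frac{\partial}{\partial t_q} \det\Bigl(\frac{\partial \varphi_i}{\partial t_j}\Bigr)_{\substack{i=2,\dots,k \\ j \ne q}},$$
where $j \ne q$ abbreviates $j = 1, \dots, q-1, q+1, \dots, k$. The first term on the right-hand side is just the expansion of the determinant $\det(\nabla \varphi_1, \dots, \nabla \varphi_k)$ by the first row, the second one vanishes by the part (a).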
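For orientation, the case k = 2 of Lemma 35.1 can be written out in full. This is only a worked special case under the hypotheses of the lemma ($\varphi_1$, $\varphi_2$ twice continuously differentiable); the minors reduce to single partial derivatives. Part (a) is the symmetry of second derivatives,
$$\frac{\partial}{\partial t_1} \frac{\partial \varphi_2}{\partial t_2} - \frac{\partial}{\partial t_2} \frac{\partial \varphi_2}{\partial t_1} = 0,$$
and part (b) reads
$$\frac{\partial}{\partial t_1} \Bigl(\varphi_1 \frac{\partial \varphi_2}{\partial t_2}\Bigr) - \frac{\partial}{\partial t_2} \Bigl(\varphi_1 \frac{\partial \varphi_2}{\partial t_1}\Bigr) = \frac{\partial \varphi_1}{\partial t_1} \frac{\partial \varphi_2}{\partial t_2} - \frac{\partial \varphi_1}{\partial t_2} \frac{\partial \varphi_2}{\partial t_1} + \varphi_1 \Bigl(\frac{\partial^2 \varphi_2}{\partial t_1 \partial t_2} - \frac{\partial^2 \varphi_2}{\partial t_2 \partial t_1}\Bigr) = \det(\nabla \varphi_1, \nabla \varphi_2).$$
In other words, the Jacobian determinant is the divergence of the vector field $\bigl(\varphi_1\, \partial \varphi_2 / \partial t_2,\ -\varphi_1\, \partial \varphi_2 / \partial t_1\bigr)$.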
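35.2. Lemma. Let $\varphi_1, \dots, \varphi_k$ be Lipschitz functions on $\mathbb{R}^k$ vanishing outside a compact subset of $\mathbb{R}^k$. Then
$$\int_{\mathbb{R}^k} \det(\nabla \varphi_1, \dots, \nabla \varphi_k)\, dt = 0.$$
Proof. We proceed in two steps. First, we assume that the functions $\varphi_1, \dots, \varphi_k$ are of class $C^2$. By Lemma 35.1.b there are functions $\psi_1, \dots, \psi_k$ vanishing outside compact subsets of $\mathbb{R}^k$ such that
$$\det(\nabla \varphi_1, \dots, \nabla \varphi_k) = \sum_{j=1}^k \frac{\partial \psi_j}{\partial t_j}.$$
For each $j = 1, \dots, k$ and $[t_1, \dots, t_{j-1}, t_{j+1}, \dots, t_k] \in \mathbb{R}^{k-1}$ we have
$$\int_{\mathbb{R}} \frac{\partial \psi_j}{\partial t_j}(t_1, \dots, t_{j-1}, \tau, t_{j+1}, \dots, t_k)\, d\tau = 0.$$
Fubini's theorem yields
$$\int_{\mathbb{R}^k} \frac{\partial \psi_j}{\partial t_j}(t)\, dt = 0.$$
Summing up these equalities over $j = 1, \dots, k$ we get the required formula.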


In the second step we show that the formula is valid without smoothness assumptions. Suppose that all functions 1 , . . . , k are -Lipschitz. Then

det(q 1 , . . . , q k ) dt = 0
Rk

(notation as in 31.1). Letting q , by the Lebesgue dominated convergence


theorem with constant dominating function we get the desired equality.


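35.3. Corollary. Let $G \subset \mathbb{R}^k$ be a bounded open set and $\varphi_1, \dots, \varphi_k$, $\psi_1, \dots, \psi_k$ Lipschitz functions on $\overline{G}$. If $\varphi_i = \psi_i$ on $\partial G$, $i = 1, \dots, k$, then
$$\int_G \det(\nabla \varphi_1, \dots, \nabla \varphi_k)\, dt = \int_G \det(\nabla \psi_1, \dots, \nabla \psi_k)\, dt.$$
Proof. By McShane's theorem 30.5 the functions $\varphi_i$, $\psi_i$ can be extended to the whole of $\mathbb{R}^k$ to be Lipschitz functions of compact support. Further, we redefine $\psi_i$ on $\mathbb{R}^k \setminus G$ to coincide there with $\varphi_i$. Thanks to the preceding lemma it follows that
$$\int_G \det(\nabla \varphi_1, \dots, \nabla \varphi_k)\, dt = -\int_{\mathbb{R}^k \setminus G} \det(\nabla \varphi_1, \dots, \nabla \varphi_k)\, dt = -\int_{\mathbb{R}^k \setminus G} \det(\nabla \psi_1, \dots, \nabla \psi_k)\, dt = \int_G \det(\nabla \psi_1, \dots, \nabla \psi_k)\, dt.$$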

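35.4. Lemma. Let $G \subset \mathbb{R}^k$ be a bounded open set, f an integrable function on $\mathbb{R}^k$ and $\varphi$, $\psi$ Lipschitz mappings of $\overline{G}$ into $\mathbb{R}^k$. If $\varphi = \psi$ on $\partial G$, then
$$\int_G f(\varphi(t))\, J_\varphi(t)\, dt = \int_G f(\psi(t))\, J_\psi(t)\, dt.$$
Proof. Let $\varphi = [\varphi_1, \dots, \varphi_k]$, $\psi = [\psi_1, \dots, \psi_k]$. No generality is lost with the assumption that $f \in \mathcal{D}(\mathbb{R}^k)$, for $\mathcal{D}(\mathbb{R}^k)$ is dense in $L^1(\mathbb{R}^k)$. Find a $C^1$-function g on $\mathbb{R}^k$ so that $\frac{\partial g}{\partial x_1} = f$. Then by Corollary 35.3
$$\int_G \det(\nabla (g \circ \varphi), \nabla \varphi_2, \dots, \nabla \varphi_k)\, dt = \int_G \det(\nabla (g \circ \psi), \nabla \psi_2, \dots, \nabla \psi_k)\, dt.$$
Further,
$$\int_G \det(\nabla (g \circ \varphi), \nabla \varphi_2, \dots, \nabla \varphi_k)\, dt = \sum_{i=1}^k \int_G \Bigl(\frac{\partial g}{\partial x_i} \circ \varphi\Bigr) \det(\nabla \varphi_i, \nabla \varphi_2, \dots, \nabla \varphi_k)\, dt.$$
Among the terms of the sum on the right, only the first one may differ from zero and it equals $\int_G (f \circ \varphi)\, J_\varphi\, dt$. If we combine this result with an analogous computation for $\psi_i$, we obtain the desired equality.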
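35.5. Lemma. Let $G \subset \mathbb{R}^k$ be a bounded open set and $\varphi, \psi : \overline{G} \to \mathbb{R}^k$ Lipschitz mappings. Let $B(y, r)$ be a closed ball of $\mathbb{R}^k$ which does not intersect any of the segments joining $\varphi(t)$ and $\psi(t)$, $t \in \partial G$. Let f be an integrable function on $\mathbb{R}^k$ with support in $U(y, r)$. Then
$$\int_G f(\varphi(t))\, J_\varphi(t)\, dt = \int_G f(\psi(t))\, J_\psi(t)\, dt.$$
Proof. Denote
$$K = \{t \in \overline{G} : \text{the segment joining } \varphi(t) \text{ and } \psi(t) \text{ intersects } B(y, r)\}.$$
Then K is a compact set contained in G. The function $\eta$ which is one on K and zero outside G is Lipschitz on $K \cup (\mathbb{R}^k \setminus G)$. According to McShane's extension theorem 30.5, $\eta$ can be extended (under the same notation) as a Lipschitz function on $\mathbb{R}^k$. Set $\vartheta = \psi + \eta(\varphi - \psi)$. Then $\vartheta = \psi$ on $\partial G$ and $f(\vartheta(t)) = f(\varphi(t))$ for all $t \in G$, and by Lemma 35.4 we obtain
$$\int_G f(\varphi(t))\, J_\varphi(t)\, dt = \int_G f(\vartheta(t))\, J_\vartheta(t)\, dt = \int_G f(\psi(t))\, J_\psi(t)\, dt.$$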
G

35.6. Lemma. Let $G \subset \mathbb{R}^k$ be a bounded open set and $\varphi : \overline{G} \to \mathbb{R}^k$ a Lipschitz mapping. If $U \subset \mathbb{R}^k$ is a connected open set disjoint with $\varphi(\partial G)$, then there exists $a \in \mathbb{R}$ such that
$$\int_G f(\varphi(t))\, J_\varphi(t)\, dt = a \int_{\mathbb{R}^k} f(x)\, dx$$
for every function $f \in L^1(\mathbb{R}^k)$ with support in U.
Proof. To prove the assertion it suffices to show it when U is an open cube, and it is enough to verify the equality for indicator functions of cubes $K(z, \rho) := [-\rho, \rho]^k + z$ contained in U, where $z \in U$ and $\rho$ is a rational number. Select $\rho$ and set $U_\rho := \{z \in U : K(z, \rho) \subset U\}$. Then for $z, y \in U_\rho$, we have
$$\int_G c_{K(y,\rho)}(\psi(t))\, J_\psi(t)\, dt = \int_G c_{K(z,\rho)}(\varphi(t))\, J_\varphi(t)\, dt,$$
where $\psi(t) = \varphi(t) + y - z$. For y close enough to z the functions $\varphi$, $\psi$ satisfy the hypotheses of Lemma 35.5 and, in view of connectedness of $U_\rho$, it follows that the function
$$y \mapsto \int_G c_{K(y,\rho)}(\varphi(t))\, J_\varphi(t)\, dt$$
is constant on $U_\rho$. Hence
$$\int_G c_{K(y,\rho)}(\varphi(t))\, J_\varphi(t)\, dt = a(\rho)\, \lambda K(y, \rho).$$
Since the Jacobian of a Lipschitz mapping is a bounded function and G is a bounded set, the constant $a(\rho)$ is finite. If $\rho_1$ is a rational number and $\rho_2$ is an integer multiple of $\rho_1$, then the cube with edges $2\rho_2$ may be filled by cubes of edge $2\rho_1$, so that $a(\rho_2) = a(\rho_1)$. Therefore we can conclude that $a(\rho)$ does not depend on $\rho$.
35.7. Degree of a Mapping. Let $G \subset \mathbb{R}^k$ be a bounded open set and $\varphi : \overline{G} \to \mathbb{R}^k$ a continuous mapping. By Tietze's extension theorem, $\varphi$ is extendable as a continuous function of compact support to the whole of $\mathbb{R}^k$ (with the same notation). By Theorem 31.5 there exist (even $C^\infty$) functions $\varphi_j$ on $\mathbb{R}^k$ such that $\varphi_j \rightrightarrows \varphi$. Let $y \in \mathbb{R}^k \setminus \varphi(\partial G)$. We find a neighborhood U of the point y whose closure does not intersect $\varphi(\partial G)$. We may assume that $\overline{U} \cap \varphi_j(\partial G) = \emptyset$ for all j. Using the preceding lemmas, there are finite constants $a_j$ such that
$$\int_G f(\varphi_j(t))\, J_{\varphi_j}(t)\, dt = a_j \int_{\mathbb{R}^k} f(x)\, dx$$
for each integrable function f on $\mathbb{R}^k$ with support in U. Further, an appeal to Lemma 35.5 makes it clear that the sequence $\{a_j\}$ is constant for $j \ge j_0$, and that its limit a is independent of the choice of the sequence $\{\varphi_j\}$. We will soon be able to show that a is an integer, and we call a the degree of the mapping $\varphi$ at y on G. We denote it by $\deg(y, \varphi, G)$.
35.8. Change of Variable Formula. Let $G \subset \mathbb{R}^k$ be a bounded open set and $\varphi : \overline{G} \to \mathbb{R}^k$ a Lipschitz mapping. Let f be a measurable function on $\mathbb{R}^k$, $f = 0$ almost everywhere on $\varphi(\partial G)$. Then
$$\int_G f(\varphi(t))\, J_\varphi(t)\, dt = \int_{\mathbb{R}^k} \deg(x, \varphi, G)\, f(x)\, dx,$$
provided the integral on the left-hand side converges.
Proof. Making similar measurability considerations to those of the proof of Corollary 34.19, we conclude the theorem using the preceding lemmas and the definition of a degree.
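35.9. Properties of a Degree. Let $G \subset \mathbb{R}^k$ be a bounded open set and $\varphi : \overline{G} \to \mathbb{R}^k$ a continuous mapping.
(a) The function $y \mapsto \deg(y, \varphi, G)$ is constant on each component of $\mathbb{R}^k \setminus \varphi(\partial G)$.
(b) Let $\psi : \overline{G} \to \mathbb{R}^k$ be a continuous mapping such that no segment joining $\varphi(t)$ and $\psi(t)$, $t \in \partial G$, contains y. Then $\deg(y, \varphi, G) = \deg(y, \psi, G)$.
(c) If $\varphi$ is one-to-one and $J_\varphi > 0$ almost everywhere in G, then $\deg(y, \varphi, G) = 1$ for all $y \in \varphi(G)$.
(d) If $\deg(y, \varphi, G) \ne 0$, then the equation $\varphi(t) = y$ has a solution in G.
(e) If $G_1, \dots, G_m$ are disjoint open subsets of G and $\varphi(t) \ne y$ for $t \in \overline{G} \setminus (G_1 \cup \dots \cup G_m)$, then
$$\deg(y, \varphi, G) = \deg(y, \varphi, G_1) + \dots + \deg(y, \varphi, G_m).$$
(f) If $\varphi$ is $C^1$ and $J_\varphi(t) \ne 0$ for all $t \in \varphi^{-1}(y)$, then
$$\deg(y, \varphi, G) = \sum \{\operatorname{sign} J_\varphi(t) : t \in \varphi^{-1}(y)\}.$$
(g) The degree $\deg(y, \varphi, G)$ is an integer.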
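Property (f) turns the degree into a finite computation once the preimages of y are known. The sketch below evaluates the sum of Jacobian signs for two illustrative planar mappings on the unit disc; the mappings, the point y and the hand-computed preimage lists are assumptions chosen for the example, not taken from the text.

```python
def sign(x):
    """Sign of a real number as -1, 0 or 1."""
    return (x > 0) - (x < 0)

def degree_from_preimages(preimages, jacobian):
    """Sum of sign J(t) over the given preimages t, as in property (f)."""
    return sum(sign(jacobian(*t)) for t in preimages)

def jacobian_square(x1, x2):
    """Jacobian of phi(x1, x2) = (x1^2 - x2^2, 2 x1 x2), the complex map z -> z^2.

    The Jacobi matrix is [[2 x1, -2 x2], [2 x2, 2 x1]], so J = 4 (x1^2 + x2^2).
    """
    return 4.0 * (x1 * x1 + x2 * x2)

# y = (0.25, 0) has the two preimages (0.5, 0) and (-0.5, 0) in the unit disc,
# and y does not lie on the image of the boundary circle.
deg_square = degree_from_preimages([(0.5, 0.0), (-0.5, 0.0)], jacobian_square)

# The reflection psi(x1, x2) = (x1, -x2) has constant Jacobian -1 and a single
# preimage of y = (0.25, 0), namely (0.25, 0) itself.
deg_reflection = degree_from_preimages([(0.25, 0.0)], lambda x1, x2: -1.0)

print(deg_square, deg_reflection)
```

The first value illustrates that the degree counts preimages with orientation, the second that an orientation-reversing one-to-one mapping has degree $-1$.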


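Proof. The assertions (a) and (b) are obvious. Combining Change of Variable Formula in 35.8 and 34.19, we get (c).
(d) Assume that the equation $\varphi(t) = y$ has no solution. Since $\varphi(\overline{G})$ is a compact set, there is an open neighborhood U of y such that $U \cap \varphi(\overline{G}) = \emptyset$. Find a Lipschitz function $\psi$ close enough to $\varphi$ such that $U \cap \psi(\overline{G}) = \emptyset$ and $\deg(y, \psi, G) \ne 0$. Then $G \cap \psi^{-1}(U) = \emptyset$ and by 35.8
$$0 = \int_{G \cap \psi^{-1}(U)} J_\psi = \deg(y, \psi, G)\, \lambda U \ne 0,$$
which is a contradiction.
(e) We may assume that $\varphi$ is a Lipschitz mapping. The set $K := \varphi(\overline{G} \setminus (G_1 \cup \dots \cup G_m))$ is compact and does not contain y. We find an open neighborhood U of y whose closure does not intersect K. Then by 35.8
$$\lambda U \deg(y, \varphi, G) = \int_{G \cap \varphi^{-1}(U)} J_\varphi = \int_{G_1 \cap \varphi^{-1}(U)} J_\varphi + \dots + \int_{G_m \cap \varphi^{-1}(U)} J_\varphi = \lambda U \bigl(\deg(y, \varphi, G_1) + \dots + \deg(y, \varphi, G_m)\bigr).$$
(f) Let $s \in \varphi^{-1}(y)$. Since $J_\varphi(s) \ne 0$, there is a neighborhood V of s such that $\varphi$ is one-to-one on V, $\varphi(V)$ is an open set (the Inverse Mapping Theorem) and $J_\varphi$ has a constant sign on V. Then
$$\lambda(\varphi(V)) \deg(y, \varphi, V) = \int_V J_\varphi\, dt = \operatorname{sign} J_\varphi(s) \int_V |J_\varphi|\, dt = \operatorname{sign} J_\varphi(s)\, \lambda(\varphi(V)).$$
It is also clear that the set $\varphi^{-1}(y)$ is isolated, and therefore $\varphi^{-1}(y)$ is a finite subset of G. We complete the proof using (e).
(g) We can assume that $\varphi$ is of class $C^1$. Let $E = \{t \in G : J_\varphi(t) = 0\}$. By the Sard Theorem 34.17, $\lambda(\varphi(E)) = 0$. Choose $y \in \mathbb{R}^k \setminus \varphi(\partial G)$. Let U be a neighborhood of y which does not intersect $\varphi(\partial G)$. Then there is $x \in U \setminus \varphi(E)$. Since $\deg(y, \varphi, G) = \deg(x, \varphi, G)$, according to (f) $\deg(y, \varphi, G)$ is an integer.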
35.10. Open Mapping Theorem. Let $\varphi : G_0 \to \mathbb{R}^k$ be a bilipschitz mapping on an open set $G_0 \subset \mathbb{R}^k$. Then $\varphi(G_0)$ is an open set.
Proof. Let $t \in G_0$ and $y = \varphi(t)$. Let G be a bounded open set, $t \in G \subset \overline{G} \subset G_0$, and U be an open neighborhood of y which does not intersect $\varphi(\partial G)$. If we prove that $\deg(y, \varphi, G) \ne 0$, then by 35.9.d the equation $\varphi(s) = x$ has a solution for any $x \in U$. Now, there is an open neighborhood V of t such that $\varphi(V) \subset U$. Denote $V^+ = \{s \in V : J_\varphi(s) > 0\}$ and $V^- = \{s \in V : J_\varphi(s) < 0\}$. Notice that by the Rademacher Theorem 30.3, $\varphi'$ exists almost everywhere. From the definition of a bilipschitz mapping it is clear that $J_\varphi(t) = 0$ is impossible, and at least one of the sets $V^+$, $V^-$ contains a compact set K of positive measure. We obtain
$$0 \ne \int_K J_\varphi = \deg(y, \varphi, G)\, \lambda(\varphi(K)).$$
Hence $\deg(y, \varphi, G) \ne 0$.
35.11. Theorem on Orientation. Let $\varphi$ be a locally bilipschitz mapping of a connected open set $G_0 \subset \mathbb{R}^k$ into $\mathbb{R}^k$. Then $J_\varphi > 0$ almost everywhere in $G_0$, or $J_\varphi < 0$ almost everywhere in $G_0$.
Proof. Let $G \subset \overline{G} \subset G_0$ be a bounded open set and $t \in G$. There is a neighborhood U of $\varphi(t)$ which does not intersect $\varphi(\partial G)$, and a neighborhood V of t such that $\varphi(V) \subset U$. As in the proof of the Open Mapping Theorem we deduce that $\deg(\varphi(t), \varphi, G) \ne 0$. By 35.9.g, the degree is an integer, so that $|\deg(\varphi(t), \varphi, G)| \ge 1$. We have
$$\Bigl|\int_V J_\varphi(t)\, dt\Bigr| = |\deg(\varphi(t), \varphi, G)|\, \lambda(\varphi(V)) \ge \lambda(\varphi(V)) = \int_V |J_\varphi(t)|\, dt.$$
Thus, $J_\varphi$ has a constant sign almost everywhere in V. From the connectedness of $G_0$ we obtain the assertion.
35.12. Notes. Although the origins of the idea of degree go back to K.F. Gauss and A.L. Cauchy, for smooth mappings and smooth sets they were considered at the turn of the century by H. Kronecker, H. Poincaré, E. Picard, P. Bohl or J. Hadamard. The essential step in developing the theory of degree of mappings in finite dimensional spaces and its applications is due to L.E.J. Brouwer [1912]. Later on, J. Leray and J. Schauder introduced the topological degree also in infinite-dimensional spaces. Today, the significance of the degree, mainly in nonlinear functional analysis, is undeniable. There is a rich bibliography and many sources for studying the degree theory. We refer the reader e.g. to S. Fučík and J. Milota [FM], J. Stará and O. John [SJ], J.T. Schwartz [*1969], S. Fučík, J. Nečas, J. Souček and V. Souček [*1973], K. Deimling [*1985], I. Fonseca and W. Gangbo [*1995] and P. Drábek and J. Milota [*2004].

36. Hausdorff Measures


In Chapter 34 we proved the existence of a k-dimensional measure on $\mathbb{R}^n$ using a relatively simple method. This approach to k-dimensional measures is appropriate for purposes of applications to the curve and surface integrals.
In theoretical parts of modern analysis we encounter Hausdorff measures occurring more frequently in various connections (even for noninteger values of k). According to 34.31, on rectifiable sets, and in particular on k-dimensional surfaces, the normalized Hausdorff measure and the measure constructed in 34.9 coincide.
Without loss of clarity we take a slightly deeper look in a more general setting supposing that p (the dimension) is a nonnegative real number and $(P, \rho)$ is a metric space on which we are going to construct the p-dimensional Hausdorff measure.
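36.1. Outer Hausdorff Measure. Let $A \subset P$. Denote
$$\mathcal{H}^p(A, \delta) = \inf\Bigl\{\sum_{j=1}^\infty (\operatorname{diam} A_j)^p : \bigcup_{j=1}^\infty A_j \supset A,\ \operatorname{diam} A_j \le \delta\Bigr\} \quad \text{for } \delta > 0,$$
$$\mathcal{H}^p(A) = \sup_{\delta > 0} \mathcal{H}^p(A, \delta) \quad \bigl(= \lim_{\delta \to 0+} \mathcal{H}^p(A, \delta)\bigr).$$
The set function $A \mapsto \mathcal{H}^p(A)$ is called the p-dimensional (outer) Hausdorff measure.
As we show later, if $P = \mathbb{R}^n$ and $k \le n$ is a nonnegative integer, there is a constant $\alpha_k$ such that $\mathcal{H}^k / \alpha_k$ is a k-dimensional measure on $\mathbb{R}^n$ in the sense of definition 34.8. The measure $\mathcal{H}^k / \alpha_k$ is called the normalized Hausdorff measure.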
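The role of the exponent p in this definition can be seen on the simplest covering of the unit segment. The sketch below only evaluates the sum $\sum (\operatorname{diam} A_j)^p$ for one particular cover (n intervals of length 1/n, an illustrative choice), which gives an upper bound for $\mathcal{H}^p([0,1], \delta)$ with $\delta = 1/n$; it does not compute the infimum itself.

```python
def covering_sum(n, p):
    """Sum of (diam A_j)^p for the cover of [0, 1] by n intervals of length 1/n."""
    return n * (1.0 / n) ** p

for n in (10, 100, 1000):
    # p = 1: the sum stays equal to 1 (the length of the segment);
    # p = 2: the sum 1/n tends to 0;
    # p = 0.5: the sum sqrt(n) grows without bound.
    print(n, covering_sum(n, 1.0), covering_sum(n, 2.0), covering_sum(n, 0.5))
```

The behaviour of these sums as n grows is consistent with the segment having finite positive one-dimensional Hausdorff measure and zero p-dimensional measure for p > 1.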


36.2. Metric Outer Measure. An outer measure $\gamma$ on P is called a metric outer measure if $\gamma(A \cup B) = \gamma A + \gamma B$ whenever $A, B \subset P$ and $\inf\{\rho(x, y) : x \in A,\ y \in B\} > 0$.
36.3. Remarks. 1. In the definition of the Hausdorff measure $\mathcal{H}^p(A)$ we considered arbitrary
coverings of $A$, but we may confine ourselves to coverings consisting of closed or open sets.
2. Notice that the $n$-dimensional Hausdorff measure of a set $K$ which is a ball or a cube in $\mathbb{R}^n$
equals $c(\operatorname{diam} K)^n$ for a suitable constant $c$ (cf. Lemma 36.12).
3. The set functions $\mathcal{H}^p(\cdot,\delta)$ are not metric outer measures and cannot be used to describe
length or area. Notice that the one-dimensional measure $\mathcal{H}^1(K,\delta)$ of the unit square $K$ in
$\mathbb{R}^2$ is a finite number although $K$ contains infinitely many disjoint segments of length one. This is why
we cannot replace the definition of the Hausdorff measure by the simpler formula
$$A \mapsto \inf\Big\{ \sum_{j=1}^{\infty} (\operatorname{diam} A_j)^p : \bigcup_{j=1}^{\infty} A_j \supset A \Big\}$$
(which is in fact $\mathcal{H}^p(A,\infty)$).
4. Further examples of $k$-dimensional measures in $\mathbb{R}^n$ (which, in general, do not coincide with
the normalized Hausdorff measure) can be obtained using other covering families. For example,
the spherical measure is defined using coverings formed by open balls, see H. Federer [*1969].

The next series of theorems shows that $\mathcal{H}^p$ is a metric outer measure. Therefore, we can apply Carathéodory's method and derive that each Borel set is
$\mathcal{H}^p$-measurable, and that the restriction of $\mathcal{H}^p$ to the Borel $\sigma$-algebra is a measure.
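36.4. Theorem. Let $\gamma$ be a metric outer measure on $P$. Then each Borel
subset of $P$ is $\gamma$-measurable.
Proof. It is clearly sufficient to prove that closed sets are $\gamma$-measurable.
To this end let a closed set $F \subset P$ be given. Choose a test set $T \subset P$ with $\gamma T < +\infty$
and denote
$$P_j = \Big\{ x \in T : \frac{1}{j+1} \le \operatorname{dist}(x,F) < \frac{1}{j} \Big\}, \quad j = 1, 2, \dots,$$
$$P_0 = \{ x \in T : \operatorname{dist}(x,F) \ge 1 \}.$$
Then the sets $P_0, P_2, P_4, \dots$ have positive mutual distances, so that
$$\sum_{j=0}^{m} \gamma P_{2j} = \gamma\Big( \bigcup_{j=0}^{m} P_{2j} \Big) \le \gamma T$$
for all $m \in \mathbb{N}$. Similarly $\sum_{j=0}^{m} \gamma P_{2j+1} \le \gamma T$, and we see that the series $\sum_{j=0}^{\infty} \gamma P_j$
is convergent. Since $F$ is closed, $T \setminus F = \bigcup_{j=0}^{\infty} P_j$. Since for each $m \in \mathbb{N}$ the distance between $\bigcup_{j=0}^{m} P_j$ and $T \cap F$ is
positive, we have
$$\gamma(T \setminus F) \le \gamma\Big( \bigcup_{j=0}^{m} P_j \Big) + \sum_{j=m+1}^{\infty} \gamma P_j \le \gamma T - \gamma(T \cap F) + \sum_{j=m+1}^{\infty} \gamma P_j .$$
Letting $m \to \infty$ we obtain
$$\gamma(T \setminus F) \le \gamma(T) - \gamma(T \cap F).$$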
36.5. Remark. If $\gamma$ is an outer measure on $P$ for which every Borel set is $\gamma$-measurable, then
$\gamma$ is already a metric outer measure.

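36.6. Theorem. $\mathcal{H}^p$ is a metric outer measure on $P$.

Proof. Theorem 4.3 tells us that $A \mapsto \mathcal{H}^p(A,\delta)$ is an outer measure for every $\delta > 0$.
Letting $\delta \to 0$ we see that $\mathcal{H}^p$ is an outer measure.
Let $A$, $B$ be sets of positive distance and $0 < \delta_0 < \inf\{\rho(x,y) : x \in A,\ y \in B\}$. Let $M \subset A \cup B$, $\operatorname{diam} M \le \delta_0$.
Then either $M \subset A$ or $M \subset B$. Hence
$$\mathcal{H}^p(A \cup B, \delta) = \mathcal{H}^p(A,\delta) + \mathcal{H}^p(B,\delta)$$
for all $\delta \in (0,\delta_0)$. Thus
$$\mathcal{H}^p(A \cup B) = \mathcal{H}^p(A) + \mathcal{H}^p(B).$$

36.7. Corollary. Any Borel subset of $P$ is $\mathcal{H}^p$-measurable.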

36.8. Exercise. (a) Let $0 \le p < q$. If $\mathcal{H}^p(A) < +\infty$, then $\mathcal{H}^q(A) = 0$.
(b) The number $\inf\{p \ge 0 : \mathcal{H}^p(A) = 0\}$ is called the Hausdorff dimension of a set $A$.
Compute the Hausdorff dimension of the Cantor set in $[0,1]$.
(c) For any $p \in (0,1)$, a generalized Cantor set $B \subset [0,1]$ may be constructed with $\mathcal{H}^p(B) = 1$.
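
A numerical hint for part (b), given here only as an illustrative sketch: at the $n$-th stage of its construction the Cantor set is covered by $2^n$ intervals of length $3^{-n}$, so the covering sum from 36.1 with $\delta = 3^{-n}$ is at most $2^n 3^{-np}$. The helper name `cover_sum` is hypothetical. The output suggests the critical exponent $\log 2/\log 3$; it only bounds the dimension from above, and the lower bound needs a separate argument.

```python
import math

def cover_sum(p, n):
    """Sum of (diam)^p over the 2**n intervals of length 3**-n
    that cover the Cantor set at stage n of its construction."""
    return (2 ** n) * (3.0 ** (-n * p))

# exponent at which the covering sum stays equal to 1 for every n
critical = math.log(2) / math.log(3)

for p in (critical - 0.1, critical, critical + 0.1):
    print(f"p = {p:.4f}:", [cover_sum(p, n) for n in (5, 20, 40)])
```

For exponents above the critical value the sums tend to zero, for exponents below they grow without bound.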

36.9. Remark. The definition of the $p$-dimensional Hausdorff measure admits also noninteger
values of $p$. The Hausdorff measures with noninteger dimensions are not directly linked with
the topics of the following chapters; nevertheless they are of great importance e.g. in the theory
of singular sets (to describe the size of negligible sets, for example the set of discontinuities
of a solution of a system of partial differential equations, or the set of points of divergence of a
Fourier series; the applications to removable singularities are also frequent). The concept of
Hausdorff measures with noninteger dimension is also a starting point for the famous fractal
theory (see e.g. G.A. Edgar [*1990] and K.J. Falconer [*1985]).
36.10. Exercise. Show that each 0-dimensional Hausdorff measure is the counting measure.

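36.11. Theorem. Let $P$ and $P'$ be metric spaces, $\mathcal{H}^p$ the $p$-dimensional Hausdorff measure on $P$ and $\mathcal{H}'^{\,p}$ the $p$-dimensional Hausdorff measure on $P'$. Let $E$
be a subset of $P$ and $f : E \to P'$ a $\beta$-Lipschitz mapping. Then
$$\mathcal{H}'^{\,p}(f(E)) \le \beta^p\, \mathcal{H}^p(E).$$
Proof. For each $\delta > 0$ and each sequence $\{A_j\}$ of subsets of $P$ with $\bigcup_{j=1}^{\infty} A_j \supset E$, $\operatorname{diam} A_j < \delta$ we have
$$\mathcal{H}'^{\,p}(f(E), \beta\delta) \le \sum_{j=1}^{\infty} (\operatorname{diam} f(A_j))^p \le \beta^p \sum_{j=1}^{\infty} (\operatorname{diam} A_j)^p .$$
Hence the desired inequality easily follows.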

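36.12. Lemma. Let $\mathcal{H}^k$ be the $k$-dimensional Hausdorff measure on $\mathbb{R}^n$,
$K = [0,1]^k \times \{0\}^{n-k}$. Then $0 < \mathcal{H}^k(K) < +\infty$.

Proof. Given $\delta > 0$ there is $m \in \mathbb{N}$ such that $\sqrt{k}/m < \delta$. We divide $K$ into $m^k$ cubes
$K_j$ with edges $1/m$ and diameters $\sqrt{k}/m < \delta$. Then
$$\mathcal{H}^k(K,\delta) \le \sum_{j=1}^{m^k} (\operatorname{diam} K_j)^k = m^k \Big( \frac{\sqrt{k}}{m} \Big)^k = k^{k/2}.$$
Hence $\mathcal{H}^k(K) \le k^{k/2}$.
Conversely, let $\lambda$ be the Lebesgue measure on $K$. Let $A \subset K$ and $x \in A$.
Then
$$A \subset B(x, \operatorname{diam} A) \subset [x_1 - \operatorname{diam} A,\ x_1 + \operatorname{diam} A] \times \dots \times [x_k - \operatorname{diam} A,\ x_k + \operatorname{diam} A],$$
thus
$$\lambda A \le 2^k (\operatorname{diam} A)^k .$$
If $\{A_j\}$ is a sequence of subsets of $K$, $\bigcup_{j=1}^{\infty} A_j = K$, then
$$\sum_{j=1}^{\infty} (\operatorname{diam} A_j)^k \ge 2^{-k} \sum_{j=1}^{\infty} \lambda A_j \ge 2^{-k} \lambda K = 2^{-k}.$$
Taking the infimum we obtain the desired lower estimate for $\mathcal{H}^k(K)$.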
36.13. Normalized Hausdorff Measures on $\mathbb{R}^n$. Let $k$ be a nonnegative
integer, $k \le n$ and $\kappa_k := \mathcal{H}^k([0,1]^k \times \{0\}^{n-k})$. Then $\mathcal{H}^k/\kappa_k$ is a $k$-dimensional
measure on $\mathbb{R}^n$ which will be called the normalized Hausdorff measure.
The constant $\kappa_k$ is equal to the number $(4/\pi)^{k/2}\,\Gamma(1 + \tfrac{k}{2})$. The computation
is not easy, see C.A. Rogers [*1970].
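
As a consistency check, offered only as an illustrative sketch, one can compare the quoted value $(4/\pi)^{k/2}\Gamma(1+k/2)$ with the elementary bounds $2^{-k} \le \mathcal{H}^k(K) \le k^{k/2}$ obtained in Lemma 36.12. The function name `kappa` is an assumption made for this example.

```python
import math

def kappa(k):
    """Value (4/pi)**(k/2) * Gamma(1 + k/2) quoted in 36.13."""
    return (4.0 / math.pi) ** (k / 2) * math.gamma(1 + k / 2)

for k in range(1, 6):
    lower = 2.0 ** (-k)          # lower bound from Lemma 36.12
    upper = float(k) ** (k / 2)  # upper bound from Lemma 36.12
    print(k, lower, kappa(k), upper)
```

For $k = 1$ the value is 1 up to rounding, in agreement with the fact that the one-dimensional Hausdorff measure of a unit segment is its length.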
36.14. Remarks. 1. Hints for the computation of $k$-dimensional measures of concrete rectifiable
sets are given in Chapter 34. We will not present the (difficult) examples of nonrectifiable sets.
2. Without essential changes of proofs, a theory similar to that of this chapter can be built also
for the case of spherical measures (cf. Remark 36.3.4). The computation of the corresponding
constant analogous to $\kappa_k$ is much easier; in fact it follows immediately from Exercise 26.26.
3. In terms of the Hausdorff measure the following version of the Sard theorem can be proved:
Let $G \subset \mathbb{R}^k$ be an open set and $f : G \to \mathbb{R}^n$ an arbitrary mapping. Let $E$ be the set of all
points $t \in G$ at which the derivative $f'(t)$ exists and $\operatorname{vol} f'(t) = 0$ (which means that the rank
of the matrix $f'(t)$ is less than $k$). Then $\mathcal{H}^k(f(E)) = 0$.
36.15. Notes. C. Carathéodory developed in [1914] the theory of the one-dimensional (linear) measure in $n$-dimensional Euclidean spaces. In the same paper he mentions the possibility of introducing $k$-dimensional measures in an $n$-dimensional space (for an integer $k$).
The $k$-dimensional measure for arbitrary $k > 0$ on $\mathbb{R}^n$ was introduced by F. Hausdorff
[1919]. The interesting Theorem 36.4 is due to C. Carathéodory [*1918]. The theory of Hausdorff measures was developed very intensively; in particular, A.S. Besicovitch published a great
number of papers devoted to this topic. Among monographs on Hausdorff measures we recommend
C.A. Rogers [*1970], K. J. Falconer [*1986] and P. Mattila [*1995].


G. Surface and Curve Integrals


37. Integral Calculus in Vector Analysis
In this chapter we continue the study of curve and surface integrals and state
a change of variable formula. Moreover, we derive formulae relating
integration over the interior (of a set or a surface) to integration over the boundary. Later on we will see that these formulae are particular
cases of the general Stokes Theorem of the next chapter. They are, in fact, multidimensional generalizations of the famous Leibniz formula
$$\int_a^b f'(x)\, dx = f(b) - f(a).$$
The reader will perhaps appreciate that we include an explanation of the three-dimensional situations (which are the most important from the point of view of classical
physics) without a deeper excursion into multilinear algebra.
Recall that the integral calculus on $k$-dimensional surfaces in $\mathbb{R}^n$ is based on
the notion of a $k$-dimensional measure (34.8), which exists (Theorem 34.9) and on
$k$-dimensional surfaces is uniquely determined (Remark 34.31). We emphasize that for
a basic understanding of the topic it is not essential to know how $k$-dimensional
measures were constructed.
Since all theorems of this chapter are only special cases of more general results
of Chapter 38, we will not interrupt the presentation with their proofs.
For the one-dimensional measure on $\mathbb{R}^n$ we will use the notation $s$, while $S$
will be reserved for the $(n-1)$-dimensional measure on $\mathbb{R}^n$.
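37.1. Vector Field, Gradient, Divergence, Curl. By a vector field on a set
$X$ we understand a mapping $f$ of the set $X$ into $\mathbb{R}^n$.
Let $g$ be a function of class $C^1$ on an open set $U \subset \mathbb{R}^n$. Then its gradient
$\operatorname{grad} g$ on $U$ is defined as the vector field
$$x \mapsto \Big[ \frac{\partial g}{\partial x_1}(x), \dots, \frac{\partial g}{\partial x_n}(x) \Big].$$
(The difference
between the gradient and the derivative for functions of class $C^1$ consists only in
the convention that the gradient is a vector while the derivative is a linear form.)
Let $f = [f_1, \dots, f_n]$ be a continuously differentiable (i.e. of class $C^1$) vector
field on an open set $U \subset \mathbb{R}^n$. Then the divergence of the field $f$ is the function
$\operatorname{div} f$ on $U$ defined as
$$\operatorname{div} f = \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i}.$$
If $f$ is a continuously differentiable vector field on an open set $U \subset \mathbb{R}^2$, its curl
is the function $\operatorname{curl} f$ defined by the formula
$$\operatorname{curl} f = \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}.$$
Finally, if $f$ is a continuously differentiable vector field on an open set $U \subset \mathbb{R}^3$,
we introduce its curl $\operatorname{curl} f$ as the vector field on $U$ defined by
$$\operatorname{curl} f = \Big[ \frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3},\ \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1},\ \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \Big].$$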


In higher-dimensional spaces the curl corresponds to the bilinear form
$$(u, v) \mapsto f'(x)u \cdot v - f'(x)v \cdot u.$$
Let $\mathbf{V}$ be an $n$-dimensional vector space with an inner product. A choice of
an orthonormal basis $(u_1, \dots, u_n)$ in $\mathbf{V}$ transfers the calculus in $\mathbf{V}$ into a calculus
in $\mathbb{R}^n$. The coordinate mapping $L : \mathbf{V} \to \mathbb{R}^n$ assigns to each vector $x \in \mathbf{V}$ the
vector $Lx \in \mathbb{R}^n$ of its coordinates with respect to $(u_1, \dots, u_n)$, so that
if $x = t_1 u_1 + \dots + t_n u_n$, then $Lx = [t_1, \dots, t_n]$. Since the basis $(u_1, \dots, u_n)$ is
orthonormal, we have $t_i = u_i \cdot x$. Let $U \subset \mathbf{V}$ be an open set, $g$ a real function on $U$
and $f : U \to \mathbf{V}$ a vector field. Denote $G = L(U)$, $\tilde g = g \circ L^{-1}$ and $\tilde f = L \circ f \circ L^{-1}$.
Then $\tilde g$ is a real function on $G$ and $\tilde f$ is a mapping of $G$ into $\mathbb{R}^n$. Of course, the
coordinates and the matrices of derivatives of $\tilde f$ and $\tilde g$ depend on the choice of the
basis $(u_1, \dots, u_n)$. We can introduce $\operatorname{grad} g(x) := L^{-1}(\operatorname{grad} \tilde g(Lx))$, $\operatorname{div} f(x) := \operatorname{div} \tilde f(Lx)$, and $\operatorname{curl} f(x) := \operatorname{curl} \tilde f(Lx)$ (if $n = 2$) or $\operatorname{curl} f(x) := L^{-1} \operatorname{curl} \tilde f(Lx)$
(if $n = 3$). Then the operators grad, div and curl do not depend on the choice of
an orthonormal basis in $\mathbf{V}$.
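37.2. Example (in $\mathbb{R}^2$). Let $f(x) = [x_1^2 + x_2^2,\ x_2 - x_1]$. Then
$$\operatorname{div} f = \frac{\partial (x_1^2 + x_2^2)}{\partial x_1} + \frac{\partial (x_2 - x_1)}{\partial x_2} = 2x_1 + 1$$
and
$$\operatorname{curl} f = \frac{\partial (x_2 - x_1)}{\partial x_1} - \frac{\partial (x_1^2 + x_2^2)}{\partial x_2} = -1 - 2x_2 .$$

37.3. Example (in $\mathbb{R}^3$). Let $f(x) = [x_1,\ x_2^2,\ x_3 x_1]$. Then
$$\operatorname{div} f = \frac{\partial x_1}{\partial x_1} + \frac{\partial x_2^2}{\partial x_2} + \frac{\partial (x_3 x_1)}{\partial x_3} = 1 + 2x_2 + x_1,$$
$$\operatorname{curl} f = \Big[ \frac{\partial (x_3 x_1)}{\partial x_2} - \frac{\partial x_2^2}{\partial x_3},\ \frac{\partial x_1}{\partial x_3} - \frac{\partial (x_3 x_1)}{\partial x_1},\ \frac{\partial x_2^2}{\partial x_1} - \frac{\partial x_1}{\partial x_2} \Big] = [0, -x_3, 0].$$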
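
The formulas of 37.1 can be checked numerically on the two examples above. The following is a minimal sketch using central differences; the helper names (`partial`, `div`, `curl2`, `curl3`) and the test points are assumptions made for this illustration, and the fields are read as $[x_1^2 + x_2^2,\ x_2 - x_1]$ and $[x_1,\ x_2^2,\ x_3 x_1]$.

```python
def partial(f, x, i, h=1e-5):
    """Central-difference approximation of the partial derivatives of all
    components of f with respect to the i-th variable at the point x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(xp), f(xm))]

def div(f, x):
    return sum(partial(f, x, i)[i] for i in range(len(x)))

def curl2(f, x):
    return partial(f, x, 0)[1] - partial(f, x, 1)[0]

def curl3(f, x):
    d = [partial(f, x, i) for i in range(3)]  # d[i][j] = d f_j / d x_i
    return [d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0]]

f2 = lambda x: [x[0] ** 2 + x[1] ** 2, x[1] - x[0]]  # field of Example 37.2
f3 = lambda x: [x[0], x[1] ** 2, x[2] * x[0]]         # field of Example 37.3

p2, p3 = [0.3, -0.7], [0.5, 1.5, -2.0]
print(div(f2, p2), curl2(f2, p2))  # compare with 2*x1 + 1 and -1 - 2*x2
print(div(f3, p3), curl3(f3, p3))  # compare with 1 + 2*x2 + x1 and [0, -x3, 0]
```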

37.4. Exercise. Let $g$ be a function of class $C^1$ defined on a neighborhood of $x$ and
$u = \operatorname{grad} g(x) \ne 0$. Show that the function which associates with a unit vector $v$ the derivative
of $g$ at $x$ in the direction $v$ attains its maximum at $u/|u|$.

37.5. Vector Product. A vector $w \in \mathbb{R}^n$ is said to be the vector product (also
called the cross product) of vectors $u_1, \dots, u_{n-1}$ and denoted by $u_1 \times \dots \times u_{n-1}$,
if for each vector $v \in \mathbb{R}^n$ we have $w \cdot v = \det(v, u_1, \dots, u_{n-1})$. Thus in particular
the $i$-th coordinate of the vector $w$ is
$$e_i \cdot w = \det(e_i, u_1, \dots, u_{n-1}) = (-1)^{i+1} \det\big( e_j \cdot u_m \big)^{m=1,\dots,n-1}_{j=1,\dots,i-1,i+1,\dots,n}.$$
Notice that the vector product is a binary operation if and only if $n = 3$. The
vector product is always perpendicular to its factors and its norm is
$$|u_1 \times \dots \times u_{n-1}| = \operatorname{vol}(u_1, \dots, u_{n-1}).$$
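37.6. Example. In $\mathbb{R}^3$ we have
$$[1,1,1] \times [1,2,3] = \Big[ \det\begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix},\ -\det\begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix},\ \det\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \Big] = [1, -2, 1].$$

37.7. Example. In $\mathbb{R}^2$, the vector product of the vector $[1, 2]$ is the vector $[2, -1]$.
37.8. Exercise. Compute in $\mathbb{R}^4$ the vector product $[1,0,1,0] \times [0,0,0,1] \times [0,2,0,0]$.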
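
The defining relation $e_i \cdot w = \det(e_i, u_1, \dots, u_{n-1})$ of 37.5 translates directly into a small program. The sketch below is illustrative only; the names `det` and `vector_product` are chosen for this example, and the determinant is computed by Laplace expansion, which is adequate for small $n$.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def vector_product(*u):
    """Vector product of n-1 vectors in R^n: the i-th coordinate is
    det(e_i, u_1, ..., u_{n-1})."""
    n = len(u) + 1
    assert all(len(v) == n for v in u)
    result = []
    for i in range(n):
        e_i = [1 if j == i else 0 for j in range(n)]
        result.append(det([e_i] + [list(v) for v in u]))
    return result

print(vector_product([1, 1, 1], [1, 2, 3]))  # Example 37.6: [1, -2, 1]
print(vector_product([1, 2]))                # Example 37.7: [2, -1]
```

The same function accepts three vectors of $\mathbb{R}^4$, so it can be used to check a hand computation of Exercise 37.8.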

37.9. Orientation. Let $\Gamma$ be a $k$-dimensional surface and $x \in \Gamma$. By $P_x =
P_x(\Gamma)$ denote the set of all parametrizations of neighborhoods of $x$ in $\Gamma$.
By a (local) orientation of $\Gamma$ at $x$ we understand a mapping which associates
with every parametrization $\varphi \in P_x$ a positive or a negative sign so that
$\det(\varphi_2^{-1} \circ \varphi_1)' > 0$ almost everywhere on a neighborhood of $\varphi_1^{-1}(x)$ whenever
$\varphi_1 \in P_x$ and $\varphi_2 \in P_x$ are both positive or both negative parametrizations,
and $\det(\varphi_2^{-1} \circ \varphi_1)' < 0$ almost everywhere on a neighborhood of $\varphi_1^{-1}(x)$ whenever
$\varphi_1 \in P_x$ is a positive parametrization and $\varphi_2 \in P_x$ is a negative parametrization.
It can be proved that a local orientation is always available.
By an orientation of a surface $\Gamma$ we understand its local orientation at each
point satisfying the following additional property: If $\varphi$ is a positive parametrization at $x$, then it is positive also at all points of some neighborhood of $x$.
Although each point has exactly two local orientations, some problems can
occur when orienting a whole surface. There exist surfaces which admit no
(global) orientation (the well-known Möbius strip, see Example 39.7). When
orientations exist, their number is even. An orientable surface with $n$ connected
components possesses $2^n$ orientations.
An $n$-dimensional surface in $\mathbb{R}^n$ is nothing else than an open subset of $\mathbb{R}^n$.
It possesses its natural orientation in which the identical parametrizations are
positive.
0-dimensional surfaces are countable sets of isolated points; compact 0-dimensional surfaces have only a finite number of points. Any oriented 0-dimensional
surface $F$ is decomposed into sets $F^+$ and $F^-$. At points of $F^+$ all parametrizations are positive and at points of $F^-$ all parametrizations are negative.
37.10. Curves. One-dimensional surfaces are called curves. Let $\Gamma \subset \mathbb{R}^n$ be an
oriented curve. If $\varphi : G \to \Gamma$ is a positive parametrization, then for almost every
$t \in G$ we define the unit tangent vector $\mathbf{t}(x)$ at the point $x = \varphi(t)$ by the formula
$\mathbf{t}(x) = u/|u|$, where $u = \frac{d\varphi}{dt}(t)$. Then the vector $\mathbf{t}(x)$ is defined up to an $s$-null set
and unit tangent vectors defined by means of different positive parametrizations
coincide except on an $s$-null set (for more details see 38.14).
The field $\mathbf{t}(x)$ obtained in this way determines the orientation. Namely, a
parametrization $\psi : G \to \Gamma$ is positive whenever
$$\frac{d\psi}{dt}(t) = \mathbf{t}(\psi(t)) \Big| \frac{d\psi}{dt}(t) \Big|$$
for almost all $t \in G$ and negative if
$$\frac{d\psi}{dt}(t) = -\mathbf{t}(\psi(t)) \Big| \frac{d\psi}{dt}(t) \Big|$$
for almost all $t \in G$.
The field of unit tangent vectors will sometimes be called shortly the tangent
field.
The integral $\int_\Gamma f \cdot \mathbf{t}\, ds$ is called a curve integral of the vector field $f$. It satisfies
the following Change of Variable Formula.

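37.11. Theorem. Let $\Gamma$ be an oriented curve. Let $G \subset \mathbb{R}$ be an open set
and $\varphi : G \to \Gamma$ be a positive parametrization. Then for any vector field $f =
[f_1, \dots, f_n]$, $f_j \in \mathcal{L}^1(\Gamma, s)$, we have
$$\int_{\varphi(G)} f \cdot \mathbf{t}\, ds = \int_G \big( f_1(\varphi(t))\varphi_1'(t) + \dots + f_n(\varphi(t))\varphi_n'(t) \big)\, dt.$$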

37.12. Example. Let $\Gamma = \{[x,y,z] \in \mathbb{R}^3 : x = \cos z,\ y = \sin z,\ 0 < z < 2\pi\}$ be a helix (see
Example 34.25) and $f(x,y,z) = [0, x, 3z^2]$. We orient $\Gamma$ in such a way that the unit tangent
vector has a positive $z$-coordinate. Parametrizing $\varphi(t) = [x(t), y(t), z(t)] = [\cos t, \sin t, t]$
we compute
$$\mathbf{t} = \frac{1}{\sqrt{2}} [-\sin z, \cos z, 1] = \frac{1}{\sqrt{2}} [-y, x, 1],$$
and therefore
$$\int_\Gamma f \cdot \mathbf{t}\, ds = \int_0^{2\pi} (x y' + 3z^2 z')\, dt = \int_0^{2\pi} \big( \cos t\,(\sin t)' + 3t^2 \big)\, dt = \pi + 8\pi^3 .$$
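
The value $\pi + 8\pi^3$ can be confirmed numerically from the Change of Variable Formula 37.11. The following sketch uses a plain midpoint rule; the function name `helix_curve_integral` and the number of subintervals are choices made for this illustration.

```python
import math

def helix_curve_integral(n=20000):
    """Midpoint-rule approximation of the integral over (0, 2*pi) of
    f(phi(t)) . phi'(t) for phi(t) = [cos t, sin t, t] and
    f(x, y, z) = [0, x, 3 z^2]."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), t
        dx, dy, dz = -math.sin(t), math.cos(t), 1.0  # phi'(t)
        f = (0.0, x, 3 * z * z)
        total += (f[0] * dx + f[1] * dy + f[2] * dz) * h
    return total

print(helix_curve_integral(), math.pi + 8 * math.pi ** 3)
```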

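37.13. Normal Field on $(n-1)$-dimensional Surfaces. Let $\Gamma \subset \mathbb{R}^n$ be an
oriented $(n-1)$-dimensional surface. If $\varphi : G \to \Gamma$ is a positive parametrization,
then for almost all $t \in G$ we define the unit normal vector $\mathbf{n}(x)$ at the point
$x = \varphi(t)$ by the formula
$$\mathbf{n}(x) = \frac{w_1 \times \dots \times w_{n-1}}{|w_1 \times \dots \times w_{n-1}|},$$
where $w_j = \frac{\partial \varphi}{\partial t_j}(t)$. Again, the vector $\mathbf{n}(x)$ is defined except on an $S$-null set and
using different parametrizations yields the same result outside an $S$-null set (for
more details see 38.14).
The field $\mathbf{n}(x)$ obtained in this way determines the orientation. A parametrization $\psi : G \to \Gamma$ is positive provided for almost all $t \in G$ we have
$$\frac{\partial \psi}{\partial t_1}(t) \times \dots \times \frac{\partial \psi}{\partial t_{n-1}}(t) = \mathbf{n}(\psi(t)) \Big| \frac{\partial \psi}{\partial t_1}(t) \times \dots \times \frac{\partial \psi}{\partial t_{n-1}}(t) \Big|,$$
and negative if the above equality holds with the opposite sign on the right-hand
side.
The field of unit normal vectors will be briefly called the normal field.
The integral $\int_\Gamma f \cdot \mathbf{n}\, dS$ is called the surface integral of the vector field $f$. It
satisfies the following Change of Variable Formula.
37.14. Theorem. Let $\Gamma$ be an $(n-1)$-dimensional oriented surface. Let
$G \subset \mathbb{R}^{n-1}$ be an open set and $\varphi : G \to \Gamma$ a positive parametrization. Then for
any vector field $f = [f_1, \dots, f_n]$, where $f_j \in \mathcal{L}^1(\Gamma, S)$, we have
$$\int_{\varphi(G)} f \cdot \mathbf{n}\, dS = \int_G f(\varphi(t)) \cdot \Big( \frac{\partial \varphi}{\partial t_1} \times \dots \times \frac{\partial \varphi}{\partial t_{n-1}} \Big)\, dt.$$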

37.15. Example. We evaluate the integral
$$\int_\Gamma [x, y, z^2] \cdot \mathbf{n}\, dS,$$
where
$$\Gamma = \{[x,y,z] \in \mathbb{R}^3 : z^2 = x^2 + y^2,\ 0 < z < 1\}$$
and $\mathbf{n}$ is supposed to be oriented outwards from the cone
$$\Omega := \{[x,y,z] \in \mathbb{R}^3 : x^2 + y^2 < z^2,\ 0 < z < 1\}$$
(see 37.21). We use the parametrization
$$\varphi(r,t) = [r\cos t, r\sin t, r]$$
on $G := \{[r,t] : r \in (0,1) \text{ and } t \in (0,2\pi)\}$. Obviously $\varphi(G)$ differs from $\Gamma$ only in a set of
measure zero, hence there is no difference between integration over $\Gamma$ and over $\varphi(G)$. We have
$$\varphi'(r,t) = \begin{pmatrix} \cos t & -r\sin t \\ \sin t & r\cos t \\ 1 & 0 \end{pmatrix},$$
so that
$$\frac{\partial \varphi}{\partial r} \times \frac{\partial \varphi}{\partial t} = [-r\cos t, -r\sin t, r].$$
In the case of a positive parametrization this vector would be a positive multiple of the unit normal vector and thus it would be directed outwards from $\Omega$. Since it is directed inwards, the
parametrization is negative (as in 37.21, the notions "outwards" and "inwards" can be made precise).
The formula will differ only in the sign. We have
$$\int_\Gamma [x,y,z^2] \cdot \mathbf{n}\, dS = -\int_G [r\cos t, r\sin t, r^2] \cdot [-r\cos t, -r\sin t, r]\, dr\, dt = \int_G (r^2 - r^3)\, dr\, dt = \frac{\pi}{6}.$$
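
The value $\pi/6$ can be reproduced numerically from the Change of Variable Formula 37.14. The sketch below is illustrative; the function name `cone_flux` and the grid sizes are assumptions, and the leading minus sign reflects that the parametrization $[r\cos t, r\sin t, r]$ is negative for the outward orientation.

```python
import math

def cone_flux(n_r=400, n_t=400):
    """Midpoint-rule value of minus the integral over G = (0,1) x (0, 2*pi)
    of f(phi) . (d phi/dr x d phi/dt) for phi(r, t) = [r cos t, r sin t, r]
    and f = [x, y, z^2]."""
    hr, ht = 1.0 / n_r, 2 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * hr
        for j in range(n_t):
            t = (j + 0.5) * ht
            f = (r * math.cos(t), r * math.sin(t), r * r)   # [x, y, z^2] on the cone
            a = (math.cos(t), math.sin(t), 1.0)             # d phi / dr
            b = (-r * math.sin(t), r * math.cos(t), 0.0)    # d phi / dt
            cross = (a[1] * b[2] - a[2] * b[1],
                     a[2] * b[0] - a[0] * b[2],
                     a[0] * b[1] - a[1] * b[0])
            total -= sum(p * q for p, q in zip(f, cross)) * hr * ht
    return total

print(cone_flux(), math.pi / 6)
```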
37.16. Example. Let $\Gamma := \{[x,y,z] \in \mathbb{R}^3 : x^2 + y^2 = z < 1\}$. Let the unit normal vector
$$\mathbf{n}(x,y,z) = \pm\frac{1}{\sqrt{4z+1}}\,[2x, 2y, -1]$$
be oriented by the choice of the sign $+$. Then $[t,r] \mapsto [x(t,r), y(t,r), z(t,r)] := [r\cos t, r\sin t, r^2]$,
$t \in (0,2\pi)$, $r \in (0,1)$, is a positive parametrization and its range differs from $\Gamma$ only in a set of
measure zero. Let $f(x,y,z) = [2x, 0, 0]$. Then
$$\int_\Gamma f \cdot \mathbf{n}\, dS = \int_{(0,2\pi)\times(0,1)} 2x \det(\nabla y, \nabla z)\, dt\, dr = \int_{(0,2\pi)\times(0,1)} 2r\cos t\, \det\big(\nabla(r\sin t), \nabla(r^2)\big)\, dt\, dr = \int_{(0,2\pi)\times(0,1)} 4r^3 \cos^2 t\, dt\, dr = \pi .$$

37.17. Surfaces with Lipschitz Boundaries. One of the most important
formulas of the integral calculus on surfaces is the general Stokes Theorem together with
its special cases. To formulate these results we need to
introduce the notion of a surface with a Lipschitz boundary.
Denote by $H_k^+$, $H_k^-$ the halfspaces $(0,\infty) \times \mathbb{R}^{k-1}$ and $(-\infty,0) \times \mathbb{R}^{k-1}$, respectively. Further denote by $i$ the mapping of $\mathbb{R}^{k-1}$ onto $\partial H_k^-$ defined as
$$i([s_1, \dots, s_{k-1}]) = [0, s_1, \dots, s_{k-1}].$$
Let $\Gamma \subset \mathbb{R}^n$ be a bounded oriented $k$-dimensional surface. The $k$-boundary $\partial\Gamma$
of $\Gamma$ is defined as $\overline{\Gamma} \setminus \Gamma$. It is the topological boundary of $\Gamma$ if $k = n$. Suppose
that the $k$-boundary $\partial\Gamma$ of $\Gamma$ is an oriented $(k-1)$-dimensional surface. We say
that $\partial\Gamma$ is a Lipschitz $k$-boundary of $\Gamma$ if for every point $z \in \partial\Gamma$ there exist an
open set $G \subset \mathbb{R}^k$, a neighborhood $U$ of $z$ and a homeomorphic locally bilipschitz
mapping $\psi$ of $G \cap \overline{H_k^-}$ such that $z \in \psi(G \cap \partial H_k^-)$, $\psi(G \cap H_k^-) = U \cap \Gamma$,
$\psi(G \cap \partial H_k^-) = U \cap \partial\Gamma$ and one of the following situations occurs:
(a) $\psi|_{G \cap H_k^-}$ is a positive parametrization of $U \cap \Gamma$ and $\psi|_{G \cap \partial H_k^-} \circ i$ is a positive
parametrization of $U \cap \partial\Gamma$;
(b) $\psi|_{G \cap H_k^-}$ is a negative parametrization of $U \cap \Gamma$ and $\psi|_{G \cap \partial H_k^-} \circ i$ is a negative
parametrization of $U \cap \partial\Gamma$.
If $k > 1$ and a parametrization satisfies (b), then by a mirror-like modification we get a parametrization satisfying (a).
37.18. Introduction to Curve Integral Theorem. Let $\Gamma \subset \mathbb{R}^n$ be an oriented curve with a Lipschitz 1-boundary $F$. As we already know, the orientation
on $\Gamma$ is given by a field of unit tangent vectors $\mathbf{t}$ and $F$ is a finite set consisting
of a positive part $F^+$ and a negative part $F^-$.
The relation between the orientations of $\Gamma$ and $F$ is given in Definition 37.17. Less
precisely but transparently: The tangent field is directed from points of the set
$F^-$ towards points of the set $F^+$. If $\mathbf{t}(a)$ is a continuous extension of $\mathbf{t}$ to the
point $a \in F^-$ (warning: its existence is not guaranteed by our assumptions), then
$$\mathbf{t}(a) = \lim_{x \to a,\ x \in \Gamma} \frac{x - a}{|x - a|}.$$
The result at $b \in F^+$ is similar, but with the opposite sign.
Under these assumptions the following result is valid.
37.19. Curve Integral Theorem. Let $g$ be a continuously differentiable function on a neighborhood of $\overline{\Gamma}$. Then
$$\sum_{b \in F^+} g(b) - \sum_{a \in F^-} g(a) = \int_\Gamma \operatorname{grad} g \cdot \mathbf{t}\, ds.$$

37.20. Example. Let $h$ be a Lipschitz function on $[-1,1]$, $h(-1) = h(1) = 0$. Let $\Gamma =
\{[x,y] : y = h(x),\ |x| < 1\}$. If we choose the orientation of the unit tangent vector $\mathbf{t}$ to $\Gamma$ so that
its $x$-coordinate is positive, then
$$\mathbf{t}(x,y) = \frac{1}{\sqrt{1 + (h'(x))^2}}\,[1, h'(x)].$$
We have to evaluate the integral $\int_\Gamma f \cdot \mathbf{t}\, ds$, where $f(x,y) = [3x^2\cos y, -x^3 \sin y]$. A direct computation does not look promising, particularly if the function $h$ is rather complicated.
Nevertheless, since $f = \operatorname{grad} g$ for $g = x^3 \cos y$, by the Curve Integral Theorem we get
$$\int_\Gamma f \cdot \mathbf{t}\, ds = g([1,0]) - g([-1,0]) = 2.$$
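
That the result does not depend on $h$ can be observed numerically. The following sketch evaluates the curve integral through the parametrization $x \mapsto [x, h(x)]$ for two sample functions $h$; both the function name `curve_integral` and the particular choices of $h$ are assumptions made for this illustration.

```python
import math

def curve_integral(h, dh, n=20000):
    """Midpoint-rule value of the curve integral of
    f = [3 x^2 cos y, -x^3 sin y] along the graph y = h(x), |x| < 1,
    parametrized by x -> [x, h(x)]; dh is the derivative of h."""
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * step
        y = h(x)
        total += (3 * x * x * math.cos(y) - x ** 3 * math.sin(y) * dh(x)) * step
    return total

# two sample functions with h(-1) = h(1) = 0
print(curve_integral(lambda x: 1 - x * x, lambda x: -2 * x))
print(curve_integral(lambda x: math.sin(math.pi * x),
                     lambda x: math.pi * math.cos(math.pi * x)))
```

Both values agree with 2 up to the quadrature error, as the Curve Integral Theorem predicts.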

37.21. Introduction to Gauss Theorem. Next we introduce the Gauss
Theorem, which is also called the Gauss-Ostrogradski Theorem or the Divergence
Theorem.
Let $\Omega \subset \mathbb{R}^n$ be a bounded open set with a Lipschitz boundary $\partial\Omega$ (i.e. suppose
that the conditions of 37.17 are satisfied) and consider the natural orientation on
$\Omega$. Thus the orientation of $\partial\Omega$ is uniquely determined and it is represented by the
field $\mathbf{n}$ of unit normal vectors. Roughly speaking, we can say that $\mathbf{n}(x)$ is the
unit vector which is perpendicular to the boundary of $\Omega$ at $x$ and is oriented out
of $\Omega$. Therefore $\mathbf{n}(x)$ is also called the vector of the outer normal. By
"out" we understand that for a nonvanishing vector $u \in \mathbb{R}^n$ and a
small positive $t$ we have $x + tu \in \Omega$ provided $u \cdot \mathbf{n}(x) < 0$, and $x + tu \notin \Omega$ provided
$u \cdot \mathbf{n}(x) > 0$. Of course, such a situation occurs only at the points of smoothness
of $\partial\Omega$.
Under the assumptions described above the following result holds.
37.22. Gauss Theorem. Let $f$ be a vector field of class $C^1$ on a neighborhood
of $\overline{\Omega}$. Then
$$\int_{\partial\Omega} f \cdot \mathbf{n}\, dS = \int_\Omega \operatorname{div} f\, d\lambda.$$

37.23. Example (the ball, spherical coordinates). Let $\Omega = \{[x,y,z] \in \mathbb{R}^3 : x^2 + y^2 + z^2 < 1\}$,
$\partial\Omega = \{[x,y,z] \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$. Consider the mapping $\varphi$ given by the spherical coordinates:
$\varphi = [x, y, z]$, where
$$x(r,t,a) = r\cos a\cos t, \quad y(r,t,a) = r\cos a\sin t, \quad z(r,t,a) = r\sin a,$$
and $[r,t,a] \in H := (0,\infty) \times (-\pi,\pi) \times (-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$. We have
$$\varphi'(r,t,a) = \begin{pmatrix} \cos a\cos t & -r\sin t\cos a & -r\cos t\sin a \\ \cos a\sin t & r\cos t\cos a & -r\sin t\sin a \\ \sin a & 0 & r\cos a \end{pmatrix}.$$
Then the mapping $\psi(s,t,a) = \varphi(s+1,t,a)$, $[s,t,a] \in (-1,0] \times (-\pi,\pi) \times (-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$, satisfies the
requirements of 37.17 for $[x,y,z] \in \varphi(H)$. (If $[x_0,y_0,z_0]$ is not in $\varphi(H)$, we can use mappings
$[s,t,a] \mapsto \varphi(s+1, t-t_0, a-a_0)$ for suitable $t_0$ and $a_0$.) After verifying that $\det \varphi' = r^2\cos a > 0$
and computing the outer normals we get
$$w_2 := \frac{\partial [\cos a\cos t, \cos a\sin t, \sin a]}{\partial t} = [-\cos a\sin t, \cos a\cos t, 0],$$
$$w_3 := \frac{\partial [\cos a\cos t, \cos a\sin t, \sin a]}{\partial a} = [-\sin a\cos t, -\sin a\sin t, \cos a],$$
$$w_2 \times w_3 = [\cos^2 a\cos t, \cos^2 a\sin t, \cos a\sin a],$$
$$|w_2 \times w_3| = \cos a .$$
Finally
$$\mathbf{n}(x,y,z) = \frac{w_2 \times w_3}{|w_2 \times w_3|} = [\cos t\cos a, \sin t\cos a, \sin a] = [x,y,z].$$

37.24. Example. (a) By means of the Gauss Theorem we will evaluate (not for the first
time) the surface measure of the sphere $\partial\Omega := \{(x,y,z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$. We utilize
the fact that $\partial\Omega$ is a Lipschitz boundary of the ball $\Omega = \{(x,y,z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 < 1\}$.
We compute the two-dimensional measure of the sphere by integrating the constant 1, which we
express in the form $1 = f \cdot \mathbf{n}$, where we choose e.g. $f(x,y,z) = [x,y,z]$. Indeed, $[x,y,z] \cdot \mathbf{n} =
[x,y,z] \cdot [x,y,z] = x^2 + y^2 + z^2 = 1$ (for a different set it would be necessary to find a different
$f$). We obtain
$$\int_{\partial\Omega} 1\, dS = \int_{\partial\Omega} [x,y,z] \cdot \mathbf{n}\, dS = \int_\Omega \operatorname{div}[x,y,z]\, dx\, dy\, dz = \int_\Omega 3\, dx\, dy\, dz = 4\pi.$$
(b) As an exercise evaluate
$$\int_{\partial\Omega} x^2\, dS \qquad \Big(\text{result: } \frac{4}{3}\pi\Big).$$

37.25. Example (the cube). Let $\Omega$ be the cube $(0,1)^3$ and $\partial\Omega$ its boundary oriented by the
outer normal (e.g. on $\{0\} \times (0,1)^2$ the unit normal vector is $[-1,0,0]$). The condition 37.17
is clearly satisfied for $z$ lying on this side, and it is less obvious (but still satisfied) if $z$ lies on an
edge or even at a vertex. Let, for example, $z$ be the vertex $[0,0,0]$. Then a mapping with the
desired properties can be found in the form
$$\psi(t) = [-t_1 + \max(0, -t_2, -t_3),\ -t_1 + \max(t_2, 0, t_2 - t_3),\ -t_1 + \max(t_3, t_3 - t_2, 0)],$$
$t \in \big( (-\tfrac{1}{2}, 0] \times (-\tfrac{1}{2}, \tfrac{1}{2})^2 \big) \cap \{|t_2 - t_3| < \tfrac{1}{2}\}$.

37.26. Introduction to Green Theorem. Consider the situation described in
the Gauss theorem for the two-dimensional case. Then $\partial\Omega$ is a curve, the orientation
of which can be expressed not only by means of the normal field, but also by means
of the tangent field. The vector $\mathbf{n}(x)$ is $s$-almost everywhere the vector product of
the vector $\mathbf{t}(x)$, i.e. $(\mathbf{n}(x), \mathbf{t}(x))$ is a positive orthonormal basis of $\mathbb{R}^2$. (Roughly
speaking, $\mathbf{t}(x)$ circulates around $\Omega$ anti-clockwise.)
37.27. Green Theorem. Let $g = [g_1, g_2]$ be a continuously differentiable vector
field on a neighborhood of $\overline{\Omega}$. Then
$$\int_{\partial\Omega} g \cdot \mathbf{t}\, ds = \int_\Omega \operatorname{curl} g\, dx.$$

37.28. Exercise. Compute the unit tangent vector and the unit normal vector to the unit
circle oriented as the (Lipschitz) boundary of the unit disc.
37.29. Example. Using the Green Theorem we compute the content (area) of the figure
$\Omega := \{(x,y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} < 1\}$. The boundary (an ellipse) can be (up to a set of one-dimensional
measure zero) parametrized by the mapping $\varphi(t) = [x(t), y(t)] := (a\cos t, b\sin t)$, $t \in (0,2\pi)$.
Let $f$ be a vector field on $\mathbb{R}^2$ whose curl is 1, e.g. $f = [0, x]$, $f = [-y, 0]$, or $f = [-\tfrac{1}{2}y, \tfrac{1}{2}x]$. Then
the Green theorem yields
$$\int_\Omega 1\, dx\, dy = \int_{\partial\Omega} [0, x] \cdot \mathbf{t}(x,y)\, ds = \int_0^{2\pi} x y'\, dt = \int_0^{2\pi} ab\cos^2 t\, dt = \pi ab.$$
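
The boundary integral in Example 37.29 is easy to evaluate numerically. The sketch below, with the hypothetical function name `ellipse_area_green` and arbitrarily chosen semi-axes, approximates $\int_0^{2\pi} x(t)\,y'(t)\,dt$ by the midpoint rule and compares it with $\pi ab$.

```python
import math

def ellipse_area_green(a, b, n=10000):
    """Midpoint-rule value of the integral over (0, 2*pi) of x(t) * y'(t)
    for the parametrization (a cos t, b sin t) of the ellipse."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x = a * math.cos(t)
        dy = b * math.cos(t)  # derivative of b sin t
        total += x * dy * h
    return total

print(ellipse_area_green(3.0, 2.0), math.pi * 3.0 * 2.0)
```

Replacing the parametrization and its derivative adapts the same loop to other closed curves.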

37.30. Exercise. Using the Green theorem evaluate the content of the figure bounded by
the astroid $\{(a\cos^3 t, b\sin^3 t) : t \in [0, 2\pi)\}$.


37.31. Introduction to Stokes Theorem. Let $\Gamma \subset \mathbb{R}^3$ be an oriented two-dimensional surface with a Lipschitz 2-boundary $\partial\Gamma$. Suppose that a normal field
$\mathbf{n}$ on $\Gamma$ and a tangent field $\mathbf{t}$ on $\partial\Gamma$ are given. Less precisely, but more transparently,
we can say that the tangent field $\mathbf{t}(x)$ again circulates anti-clockwise provided we
observe the situation against the direction of the normal vector $\mathbf{n}(x)$. (Warning:
in contrast to the Green Theorem, here $\mathbf{n}$ means the normal field to the surface.)
Under the assumptions stated above the following theorem holds.
37.32. (Special) Stokes Theorem. Let $f$ be a vector field of class $C^1$ on a
neighborhood of $\overline{\Gamma}$. Then
$$\int_{\partial\Gamma} f \cdot \mathbf{t}\, ds = \int_\Gamma \mathbf{n} \cdot \operatorname{curl} f\, dS.$$

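37.33. Example. Let $h$ be a positive even function of class $C^1$ on $[-1,1]$, $h(1) = 0$. Let
$\Gamma = \{[x,y,z] : z = h(r)\}$, $r = \sqrt{x^2 + y^2}$. Let the unit normal vector have the orientation for
which its $z$-coordinate is positive, i.e.
$$\mathbf{n}(x,y,z) = \frac{1}{\sqrt{1 + (h'(r))^2}} \Big[ -h'(r)\frac{x}{r},\ -h'(r)\frac{y}{r},\ 1 \Big].$$
Let us evaluate $\int_\Gamma g \cdot \mathbf{n}\, dS$ for $g(x,y,z) = [-xe^z, -ye^z, 2e^z]$. Since $g = \operatorname{curl} f$ for $f(x,y,z) =
[-ye^z, xe^z, 0]$, $f = [-y, x, 0]$ on $\partial\Gamma$, and $\partial\Gamma := \{[x,y,z] \in \mathbb{R}^3 : r = 1,\ z = 0\}$ is a Lipschitz 2-boundary
of $\Gamma$, the Stokes formula gives
$$\int_\Gamma g \cdot \mathbf{n}\, dS = \int_{\partial\Gamma} f \cdot \mathbf{t}\, ds = \int_{\partial\Gamma} [-y, x] \cdot \mathbf{t}(x,y)\, ds = \int_0^{2\pi} (\sin^2 t + \cos^2 t)\, dt = 2\pi.$$
For the evaluation of the integral over $\partial\Gamma$ we used polar coordinates: $x = \cos t$, $y = \sin t$, $t \in
(0, 2\pi)$.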
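
The surface side of the Stokes formula in this example can also be evaluated directly for a concrete profile. The sketch below assumes the signs $g = [-xe^z, -ye^z, 2e^z]$ and the sample choice $h(r) = 1 - r^2$; the function name `stokes_surface_side` is hypothetical. With the parametrization $[r\cos t, r\sin t, h(r)]$ the integrand $g \cdot (\partial_r \times \partial_t)$ reduces to $e^{h(r)}(r h'(r) + 2)\,r$, independent of the angle.

```python
import math

def stokes_surface_side(h, dh, n=20000):
    """Midpoint-rule value of the surface integral of
    g = [-x e^z, -y e^z, 2 e^z] over the graph z = h(r), r < 1, with the
    upward normal. The angular integration contributes the factor 2*pi."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        total += math.exp(h(r)) * (r * dh(r) + 2.0) * r * step
    return 2 * math.pi * total

print(stokes_surface_side(lambda r: 1 - r * r, lambda r: -2 * r), 2 * math.pi)
```

The result agrees with the boundary value $2\pi$, since the radial integrand is the derivative of $r^2 e^{h(r)}$ and $h(1) = 0$.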
37.44. Notes. The main formulas of the calculus of curve and surface integrals were established
in the 19th century. The divergence theorem was discovered by K. F. Gauss, G. Green and M. V.
Ostrogradski. The Stokes theorem is due to G. G. Stokes. See also 39.24.

38. Integration of Differential Forms


In this chapter we present the theorem which contains as special cases the
Curve integral theorem, the Gauss theorem and Stokes theorem of Chapter 37.
For this purpose we need more advanced tools of exterior algebra.
38.1. $k$-covectors and $k$-vectors. Let $\mathbf{V}$ be an $n$-dimensional vector space and
$\mathbf{V}^*$ its dual space. The value of a linear form $V \in \mathbf{V}^*$ at a vector $u \in \mathbf{V}$ will be
denoted by $\langle V, u \rangle$.
A mapping $W : \mathbf{V}^k \to \mathbb{R}$ is called a $k$-linear form (on $\mathbf{V}$) if it is linear in each
variable separately. As a special case we get linear forms for $k = 1$, or bilinear
forms for $k = 2$.
A $k$-linear form $W$ on $\mathbf{V}$ is said to be a $k$-covector if $W$ is antisymmetric in the
following sense: If $\pi$ is a transposition (a permutation which transposes exactly
one pair of elements) of the set $\{1, \dots, k\}$, then
$$\langle W, (u_{\pi(1)}, \dots, u_{\pi(k)}) \rangle = -\langle W, (u_1, \dots, u_k) \rangle .$$
Given a $k$-covector $W$, we have
$$\Big\langle W, \Big( \sum_{j=1}^{k} a_{1,j} u_j, \dots, \sum_{j=1}^{k} a_{k,j} u_j \Big) \Big\rangle = \det A\, \langle W, (u_1, \dots, u_k) \rangle$$
for any matrix $A = (a_{i,j})_{i,j=1}^{k}$ and any ordered $k$-tuple $(u_1, \dots, u_k)$ of vectors of
$\mathbf{V}$.
Dually, a $k$-vector on $\mathbf{V}$ will be defined as a $k$-covector on $\mathbf{V}^*$. Thus a $k$-vector
assigns a real number to a $k$-tuple of linear forms on $\mathbf{V}$. In particular, a 1-covector
is the same as a linear form; a 1-vector $u$ on $\mathbf{V}$ will be identified with a vector
$v \in \mathbf{V}$ if $u$ assigns to each linear form $V \in \mathbf{V}^*$ the value $\langle V, v \rangle$. A real number
$c$ will be identified with a 0-form, or a 0-vector, which assigns the number $c$ to
each 0-tuple of vectors or forms, respectively. The vector space of all $k$-covectors
on $\mathbf{V}$ will be denoted by $\Lambda^k(\mathbf{V})$ and the vector space of all $k$-vectors on $\mathbf{V}$ by
$\Lambda_k(\mathbf{V})$. So $\Lambda_k(\mathbf{V}) = \Lambda^k(\mathbf{V}^*)$, $\Lambda^k(\mathbf{V}) = \Lambda_k(\mathbf{V}^*)$, $\Lambda_1(\mathbf{V}) = \mathbf{V}$, $\Lambda^1(\mathbf{V}) = \mathbf{V}^*$, and
$\Lambda_0(\mathbf{V}) = \Lambda^0(\mathbf{V}) = \mathbb{R}$.
38.2. Example. A bilinear form is representable by a matrix. For the sake of simplicity
assume $\mathbf{V} = \mathbb{R}^n$; then $(a_{i,j})_{i,j}$ is a matrix of a bilinear form $V$ if
$$\langle V, (u, v) \rangle = \sum_{i,j} a_{i,j} u_i v_j$$
for each $u, v \in \mathbb{R}^n$. Among bilinear forms we select 2-covectors as bilinear forms with an
antisymmetric matrix, i.e. $a_{i,j} = -a_{j,i}$ (in particular $a_{i,i} = 0$). An example of a matrix of a
2-covector in $\mathbb{R}^3$ is
$$\begin{pmatrix} 0 & 2 & -3 \\ -2 & 0 & 5 \\ 3 & -5 & 0 \end{pmatrix}.$$
The (canonical) inner product represented by the unit matrix is also a bilinear form but it fails
to be a 2-covector.

38.3. Exterior Product.


By an exterior product of an ordered k-tuple
(V1 , . . . Vk ) of linear forms on V we mean the k-covector V1 Vk dened
by the formula
V1 Vk , (u1 , . . . , uk ) = det(Vi , uj )ki,j=1 ,

(u1 , . . . , uk ) Vk .

Dually, the exterior product of an ordered k-tuple (u1 , . . . uk ) of vectors from V


will be the k-vector u1 uk dened by the formula
(V1 , . . . , Vk ), u1 uk  = det(Vi , uj )ki,j=1 ,

(V1 , . . . , Vk ) (V )k .

Notice that the exterior product is invariant with respect to even permutations of
factors. An odd permutation changes (only) the sign. The exterior product vanishes if the factors are linearly dependent (in particular, if two factors coincide).
Now we will investigate coordinates of k-vectors and k-covectors in Rn . Let Xi
denote the i-th coordinate form [u1 , . . . , un ] ui . Recall that in the context of
these chapters by a multiindex we understand an ordered k-tuple = [1 , . . . , k ]

G. Surface and Curve Integrals

173

of indices from {1, . . . , n}, and that the set of all such multiindices is denoted by
{1, . . . , n}k . A multiindex is called increasing if 1 < < k . The set of all
increasing multiindices from {1, . . . , n}k is denoted by I(k, n). Each k-covector
W posseses (uniquely determined) real coordinates W , {1, . . . , n}k , namely
W = W, (e1 , . . . , ek ) .
Dually, each k-vector w has (uniquely determined) real coordinates w ,
{1, . . . , n}k , namely
w = (X1 , . . . , Xk ), w .
Let W be a k-covector (take into account analogous considerations for k-vectors).
From the antisymmetry it follows
W() = W
for each transposition of indices.
In particular W = 0 whenever an index occurs in the multiindex at least twice.
For a complete description of a k-covector we need only coordinates corresponding
to increasing multiindices.
We have

W =
W X1 Xk ,
I(k,n)

so that the basis of k (Rn ) is {X1 Xk : I(k, n)}. Dually, we have



w=
w e1 ek ,
I(k,n)

so that the basis of k (Rn ) is {e1 . . . ek : I(k, n)}.


A similar consideration leads to a description of the basis of k (V) and k (V)
for a general n-dimensional vector space V. (It remains only to replace (e1 , . . . , en )
by a xed basis V and (X1 , . . . , Xn ) by its dual basis.) It follows that the dimenk
sion of the spaces
!n"(V) and k (V) is equal to the number of the multiindices
of I(k, n) which is k .
By means of coordinates a duality pairing between k-vectors and k-covectors
can be dened: If v k (V) and W k (V), then we introduce

W v .
W, v : =
I(k,n)

In particular, given v1 , . . . , vk V and W1 , . . . , Wk V , we have


W, v1 vk  = W, (v1 , . . . , vk ) ,
W1 Wk , v = (W1 , . . . , Wk ), v .
Notice that not every k-covector is an exterior product of linear forms and not
every k-vector is an exterior product of vectors, see Example 38.7.


38.4. Example (in $\mathbb{R}^3$).
\[ (e_1+e_2-e_3)\wedge(2e_2+e_3)=2e_1\wedge e_2+2e_2\wedge e_2-2e_3\wedge e_2+e_1\wedge e_3+e_2\wedge e_3-e_3\wedge e_3 \]
\[ =2e_1\wedge e_2+e_1\wedge e_3+(2+1)\,e_2\wedge e_3. \]
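38.5. Example (in $\mathbb{R}^3$, notice how the sign is changed under the transposition).
\[ X_2\wedge(X_1+X_3)\wedge(X_1-X_3)=X_2\wedge X_1\wedge X_1-X_2\wedge X_1\wedge X_3+X_2\wedge X_3\wedge X_1-X_2\wedge X_3\wedge X_3 \]
\[ =X_1\wedge X_2\wedge X_3-X_3\wedge X_2\wedge X_1=2X_1\wedge X_2\wedge X_3. \]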
38.6. Example. In $\mathbb{R}^4$ we have
\[ \langle X_1\wedge X_2+X_2\wedge X_3+X_3\wedge X_4,\ 2e_1\wedge e_2-e_1\wedge e_4+e_2\wedge e_4\rangle=2. \]
38.7. Example. The 2-covector $W=X_1\wedge X_2+X_2\wedge X_3+X_3\wedge X_4$ in $\mathbb{R}^4$ cannot be written
as an exterior product $V_1\wedge V_2$ of linear forms. We prove this by contradiction: Assume that
$W=V_1\wedge V_2$. Let $A:\mathbb{R}^4\to\mathbb{R}^2$ be the linear mapping $x\mapsto[V_1x,V_2x]$, so that
$W_{(i,j)}=\det(Ae_i,Ae_j)$. Since $W_{(1,2)}\neq0$, we have $Ae_1\neq0$. Further, $W_{(1,3)}=W_{(1,4)}=0$,
so that $Ae_3$ and $Ae_4$ are multiples of $Ae_1$. This is a contradiction, as $W_{(3,4)}\neq0$.
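As a cross-check of Example 38.7, one can use a criterion that the text itself does not use: the classical Plücker relation, by which a 2-covector $W$ on $\mathbb{R}^4$ is an exterior product of two linear forms if and only if $W_{(1,2)}W_{(3,4)}-W_{(1,3)}W_{(2,4)}+W_{(1,4)}W_{(2,3)}=0$. The sketch below is in Python, and the function name is hypothetical.

```python
def is_decomposable_2covector_r4(W):
    """Pluecker test for a 2-covector on R^4.

    W maps increasing pairs (i, j), 1 <= i < j <= 4, to integer or exact
    coordinates; missing pairs are treated as zero.
    """
    def c(i, j):
        return W.get((i, j), 0)
    return c(1, 2) * c(3, 4) - c(1, 3) * c(2, 4) + c(1, 4) * c(2, 3) == 0

# The 2-covector of Example 38.7: X1^X2 + X2^X3 + X3^X4
W_38_7 = {(1, 2): 1, (2, 3): 1, (3, 4): 1}

# A genuine product for comparison: (X1 + X2) ^ X3 = X1^X3 + X2^X3
W_product = {(1, 3): 1, (2, 3): 1}
```

The test uses exact equality, so it is meant for integer or rational coordinates; with floating-point data a tolerance would be needed.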
38.8. Example. There is no chance to construct a counterexample similar to 38.7 in $\mathbb{R}^3$.
Indeed, let $W$ be a 2-covector on $\mathbb{R}^3$. If $W_{(1,3)}=0$, then
\[ W=(W_{(1,2)}X_1-W_{(2,3)}X_3)\wedge X_2. \]
If $W_{(1,3)}\neq0$, then
\[ W=\Bigl(\frac{W_{(2,3)}}{W_{(1,3)}}X_2+X_1\Bigr)\wedge(W_{(1,3)}X_3+W_{(1,2)}X_2). \]

38.9. Differential Forms. A mapping $\omega:E\to\Lambda^k(\mathbb{R}^n)$, where $E\subset\mathbb{R}^n$,
is called a differential $k$-form (or, simply, a differential form) on $E$. We identify
differential 0-forms with functions. Any differential $k$-form $\omega$ on $E\subset\mathbb{R}^n$ is
representable in coordinates
\[ \omega=\sum_{\alpha\in I(k,n)}\omega_\alpha\,dx_{\alpha_1}\wedge\dots\wedge dx_{\alpha_k}, \]
where $dx_i$ denotes the constant differential 1-form $x\mapsto X_i$ and $\omega_\alpha$ are functions
on $E$. We say that a differential form $\omega$ is of class $C^p$ if all its coordinates $\omega_\alpha$
are of class $C^p$. Similarly we define other properties of differential forms (e.g.
measurability) using the coordinates $\omega_\alpha$.
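38.10. Example. We illustrate the calculation with differential forms by the following
example in $\mathbb{R}^4$:
\[ (dx_2+dx_4)\wedge(x_4\,dx_1-x_1\,dx_4)\wedge(dx_2+dx_3) \]
\[ =(x_4\,dx_2\wedge dx_1+x_4\,dx_4\wedge dx_1-x_1\,dx_2\wedge dx_4-x_1\,dx_4\wedge dx_4)\wedge(dx_2+dx_3) \]
\[ =(-x_4\,dx_1\wedge dx_2-x_4\,dx_1\wedge dx_4-x_1\,dx_2\wedge dx_4)\wedge(dx_2+dx_3) \]
\[ =-x_4\,dx_1\wedge dx_2\wedge dx_2-x_4\,dx_1\wedge dx_4\wedge dx_2-x_1\,dx_2\wedge dx_4\wedge dx_2 \]
\[ \quad-x_4\,dx_1\wedge dx_2\wedge dx_3-x_4\,dx_1\wedge dx_4\wedge dx_3-x_1\,dx_2\wedge dx_4\wedge dx_3 \]
\[ =x_4\,dx_1\wedge dx_2\wedge dx_4-x_4\,dx_1\wedge dx_2\wedge dx_3-x_4\,dx_1\wedge dx_4\wedge dx_3+x_1\,dx_2\wedge dx_3\wedge dx_4. \]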


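38.11. Differential. Let
\[ \omega=\sum_{\alpha\in I(k-1,n)}\omega_\alpha\,dx_{\alpha_1}\wedge\dots\wedge dx_{\alpha_{k-1}} \]
be a differential $(k-1)$-form of class $C^1$ on an open set $U\subset\mathbb{R}^n$. Then its
differential $d\omega$ is defined as the differential $k$-form on $U$ given by the formula
\[ d\omega(x)=\sum_{\alpha\in I(k-1,n)}\sum_{i=1}^n\frac{\partial\omega_\alpha}{\partial x_i}(x)\,dx_i\wedge dx_{\alpha_1}\wedge\dots\wedge dx_{\alpha_{k-1}}. \]
The differential of a $C^1$-function $f$ on $U$ is the differential 1-form
\[ df(x)=\sum_{i=1}^n\frac{\partial f}{\partial x_i}(x)\,dx_i. \]
In particular, the differential of the coordinate function $x\mapsto x_i$ is $dx_i$, which
corresponds to the notation introduced above.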
38.12. Example (in $\mathbb{R}^2$).
\[ d(x_2\,dx_1-x_1\sin x_2\,dx_2)=dx_2\wedge dx_1-\sin x_2\,dx_1\wedge dx_2-x_1\cos x_2\,dx_2\wedge dx_2 \]
\[ =-dx_1\wedge dx_2-\sin x_2\,dx_1\wedge dx_2=(-1-\sin x_2)\,dx_1\wedge dx_2. \]
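38.13. Example (in $\mathbb{R}^3$).
(a)
\[ d(x_1\,dx_1+x_1x_3\,dx_2)=dx_1\wedge dx_1+x_3\,dx_1\wedge dx_2+x_1\,dx_3\wedge dx_2=x_3\,dx_1\wedge dx_2-x_1\,dx_2\wedge dx_3. \]
(b)
\[ d(x_1x_2\,dx_1\wedge dx_3)=x_2\,dx_1\wedge dx_1\wedge dx_3+x_1\,dx_2\wedge dx_1\wedge dx_3=-x_1\,dx_1\wedge dx_2\wedge dx_3. \]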
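The formula of 38.11 can be checked numerically at a single point. The Python sketch below is only an illustration under stated assumptions: a form is represented as a dictionary from increasing multiindices to coefficient functions, partial derivatives are approximated by central differences with a hypothetical step `h`, and all helper names are invented for this sketch.

```python
def sort_with_sign(indices):
    """Sort a tuple of distinct indices; return (sign of the permutation, sorted tuple)."""
    idx = list(indices)
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign  # each swap is one transposition
    return sign, tuple(idx)

def exterior_derivative_at(form, x, h=1e-6):
    """Coordinates of d(form) at the point x.

    form: dict {increasing multiindex (1-based): function of a point (list)}.
    Returns a dict on increasing multiindices of length k.
    """
    result = {}
    for alpha, coeff in form.items():
        for i in range(1, len(x) + 1):
            if i in alpha:
                continue  # dx_i ^ dx_i = 0
            xp, xm = list(x), list(x)
            xp[i - 1] += h
            xm[i - 1] -= h
            partial = (coeff(xp) - coeff(xm)) / (2 * h)
            sign, beta = sort_with_sign((i,) + alpha)
            result[beta] = result.get(beta, 0.0) + sign * partial
    return result

point = [2.0, 5.0, 7.0]
# Example 38.13 (a): x1 dx1 + x1 x3 dx2
d_a = exterior_derivative_at(
    {(1,): lambda x: x[0], (2,): lambda x: x[0] * x[2]}, point)
# Example 38.13 (b): x1 x2 dx1 ^ dx3
d_b = exterior_derivative_at({(1, 3): lambda x: x[0] * x[1]}, point)
```

At the point $[2,5,7]$ the results should agree, up to rounding, with the values of $x_3\,dx_1\wedge dx_2-x_1\,dx_2\wedge dx_3$ and $-x_1\,dx_1\wedge dx_2\wedge dx_3$ obtained in Example 38.13.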

38.14. Orientation of k-dimensional Surfaces. In the preceding chapters
we have seen that the orientation of a curve forms a vector field (of the unit
tangent vectors) on it while the orientation of an $(n-1)$-dimensional surface also
forms a vector field (now that of unit normal vectors) but in an entirely different
way. We will see that both tangent and normal fields are particular cases of a
general object. An orientation of a $k$-dimensional surface determines a $k$-vector
tangent field and the $(n-k)$-covector normal field on it. (In fact the normal
should also be a covector, but it is customary to understand it as a vector; the
inner product structure makes this identification possible.)
Let $\sigma$ be a $k$-dimensional measure on $\mathbb{R}^n$ and $\Omega$ be a $k$-dimensional oriented
surface in $\mathbb{R}^n$. If $\varphi:G\to\Omega$ is a positive parametrization, then for almost all
$t\in G$ we define the unit tangent $k$-vector $\tau(x)\in\Lambda_k(\mathbb{R}^n)$ at the point $x=\varphi(t)$
by the formula
\[ \tau(x)=\frac{w_1\wedge\dots\wedge w_k}{\operatorname{vol}(w_1,\dots,w_k)}, \]
where $w_j=\frac{\partial\varphi}{\partial t_j}(t)$. Then the $k$-vector $\tau(x)$ is defined $\sigma$-almost everywhere and
does not depend on the particular choice of the parametrization $\varphi$.


We will not use the normal $(n-k)$-covector in full generality; nevertheless, for the
sake of completeness we outline the definition: It associates with an $(n-k)$-tuple
$u_1,\dots,u_{n-k}$ of vectors in $\mathbb{R}^n$ the number
\[ \operatorname{vol}^{-1}(w_1,\dots,w_k)\,\det(u_1,\dots,u_{n-k},w_1,\dots,w_k). \]
The vector space $T_x$ generated by the vectors $w_1,\dots,w_k$ (it will be termed
the tangent space in the next chapter) is $\sigma$-almost everywhere independent of the
choice of a parametrization.
Notice that the dimension of $\Lambda_k(T_x)$ is 1. Consider two orientations of a surface
$\Omega$. Suppose that $\varphi_1:G_1\to\Omega$, $\varphi_2:G_2\to\Omega$ are parametrizations such that $\varphi_1$ is
positive under the first orientation and $\varphi_2$ is positive under the second orientation.
Let $U\subset\varphi_1(G_1)\cap\varphi_2(G_2)$ be a connected open (relatively in $\Omega$) set. Let $\tau_j$ be
a field of unit tangent $k$-vectors on $U$ determined by the $j$-th orientation (by
the parametrization $\varphi_j$). The above dimension considerations show
that $\tau_2(x)\in\{\tau_1(x),-\tau_1(x)\}$ $\sigma$-almost everywhere on $U$. From Theorem 35.11 it
follows that either $\tau_2=\tau_1$ $\sigma$-almost everywhere, or $\tau_2=-\tau_1$ $\sigma$-almost everywhere.
We can deduce that the orientation of $\Omega$ can be reconstructed from the knowledge
of the unit tangent $k$-vector field. A parametrization $\varphi:G\to\Omega$ is positive
if for almost all $t\in G$
\[ \frac{\partial\varphi}{\partial t_1}(t)\wedge\dots\wedge\frac{\partial\varphi}{\partial t_k}(t)=\tau(\varphi(t))\,\operatorname{vol}\Bigl(\frac{\partial\varphi}{\partial t_1}(t),\dots,\frac{\partial\varphi}{\partial t_k}(t)\Bigr), \]
and negative if the above stated equality holds with the opposite sign.
38.15. Example. Let $\Omega:=\{x\in\mathbb{R}^4:x_1>0,\ x_3=x_1\cos x_2,\ x_4=x_1\sin x_2\}$. We have to
find the unit tangent 2-vector field $\tau$ on $\Omega$ knowing that the parametrization
$\varphi:(0,\infty)\times(-\infty,\infty)\to\Omega$,
\[ \varphi(t_1,t_2)=[t_1,t_2,t_1\cos t_2,t_1\sin t_2] \]
is positive. We have
\[ w_1:=\frac{\partial\varphi}{\partial t_1}(t)=[1,0,\cos t_2,\sin t_2],\qquad w_2:=\frac{\partial\varphi}{\partial t_2}(t)=[0,1,-t_1\sin t_2,t_1\cos t_2], \]
and thus
\[ \operatorname{vol}^2(w_1,w_2)=1+t_1^2\cos^2t_2+t_1^2\sin^2t_2+\cos^2t_2+\sin^2t_2+t_1^2(\cos^2t_2+\sin^2t_2)^2=2+2t_1^2 \]
and
\[ \tau(x)=\frac{[1,0,x_3/x_1,x_4/x_1]\wedge[0,1,-x_4,x_3]}{\sqrt{2+2x_1^2}}. \]
The 2-vector $\tau(x)$ has six coordinates, namely $\tau_{(1,2)}(x)$, $\tau_{(1,3)}(x)$, $\tau_{(1,4)}(x)$, $\tau_{(2,3)}(x)$, $\tau_{(2,4)}(x)$
and $\tau_{(3,4)}(x)$. For example,
\[ \tau_{(1,3)}(x)=\frac{\det\begin{pmatrix}1&0\\ x_3/x_1&-x_4\end{pmatrix}}{\sqrt{2+2x_1^2}}=\frac{-x_4}{\sqrt{2+2x_1^2}}. \]
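The computation of Example 38.15 can be reproduced numerically. The Python sketch below uses hypothetical helper names and assumes that $\operatorname{vol}(w_1,w_2)$ equals the square root of the sum of squares of the $2\times2$ minors, which is exactly the sum evaluated in the example.

```python
import math
from itertools import combinations

def wedge2(w1, w2):
    """Coordinates of w1 ^ w2 on increasing pairs (i, j), 1-based."""
    return {(i + 1, j + 1): w1[i] * w2[j] - w1[j] * w2[i]
            for i, j in combinations(range(len(w1)), 2)}

def unit_tangent_2vector(t1, t2):
    """Return (tau, vol) for phi(t1, t2) = [t1, t2, t1 cos t2, t1 sin t2]."""
    w1 = [1.0, 0.0, math.cos(t2), math.sin(t2)]             # d(phi)/d(t1)
    w2 = [0.0, 1.0, -t1 * math.sin(t2), t1 * math.cos(t2)]  # d(phi)/d(t2)
    w = wedge2(w1, w2)
    vol = math.sqrt(sum(c * c for c in w.values()))
    return {alpha: c / vol for alpha, c in w.items()}, vol

t1, t2 = 1.5, 0.7
tau, vol = unit_tangent_2vector(t1, t2)
x1, x4 = t1, t1 * math.sin(t2)
expected_tau_13 = -x4 / math.sqrt(2 + 2 * x1 ** 2)
```

Here `vol ** 2` should match $2+2t_1^2$ and `tau[(1, 3)]` should match the closed form for $\tau_{(1,3)}$ up to rounding.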


38.16. Integration of Differential Forms. Let $\Omega$ be an oriented $k$-dimensional
surface, $\tau$ a unit tangent $k$-vector field and $\sigma$ a $k$-dimensional measure on
$\Omega$. Then the integral of a differential $k$-form $\omega$ is defined as
\[ \int_{\boldsymbol{\Omega}}\omega:=\int_{\Omega}\langle\omega,\tau\rangle\,d\sigma \]
provided the right-hand side makes sense. Here, the symbol $\boldsymbol{\Omega}$ represents all the
structure ($\Omega$, orientation): Without the knowledge of $\tau$ it would be impossible to
determine the sign of the integral.
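38.17. Particular Cases. Notions of the preceding chapters can now be
revisited from the point of view of the general approach using differential forms
and $k$-vector fields. Let us start with trivial cases.
If $k=0$, then $\Omega$ is a finite set, $\tau(x)=\pm1$ on $\Omega$ and the integral is only a sum
of values. A differential 0-form $\omega$ is a function and
\[ \int_{\boldsymbol{\Omega}}\omega=\sum_{x\in\Omega}\tau(x)\,\omega(x). \]
If $k=n$, then $\Omega$ is an open subset of $\mathbb{R}^n$, $\tau=e_1\wedge\dots\wedge e_n$ (but theoretically,
the unnatural case $\tau=-e_1\wedge\dots\wedge e_n$ should not be excluded), a differential
$n$-form is described by one coordinate, and we have
\[ \int_{\boldsymbol{\Omega}}f(x)\,dx_1\wedge\dots\wedge dx_n=\int_{\Omega}f\,d\lambda. \]
A bit more interesting is the one-dimensional case of a curve. Then $\tau(x)$ is a
1-vector $t(x)$, a differential 1-form is expressed by a vector field $g$ and
\[ \int_{\boldsymbol{\Omega}}g_1\,dx_1+\dots+g_n\,dx_n=\int_{\Omega}g\cdot t\,ds. \]
Finally the case of a normal field is also included: Let $k=n-1$. Then the
vector field $n(x)$ and the $(n-1)$-vector field $\tau(x)$ are linked by the relation
\[ \tau=n_1\,e_2\wedge\dots\wedge e_n-n_2\,e_1\wedge e_3\wedge\dots\wedge e_n+\dots+(-1)^{n-1}n_n\,e_1\wedge\dots\wedge e_{n-1}. \]
A differential $(n-1)$-form is (again, but in a different way) represented by a vector
field $g$ and
\[ \int_{\boldsymbol{\Omega}}g_1\,dx_2\wedge\dots\wedge dx_n-g_2\,dx_1\wedge dx_3\wedge\dots\wedge dx_n+\dots+(-1)^{n-1}g_n\,dx_1\wedge\dots\wedge dx_{n-1}=\int_{\Omega}g\cdot n\,d\sigma. \]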


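38.18. Change of Variable Formula. Let $\Omega$ be an oriented $k$-dimensional
surface, $\tau$ a unit tangent $k$-vector field on $\Omega$ and $\sigma$ a $k$-dimensional measure on
$\Omega$. Let $\varphi:G\to\Omega$ be a positive parametrization. Then for any differential form
\[ \omega=\sum_{\alpha\in I(k,n)}\omega_\alpha\,dx_{\alpha_1}\wedge\dots\wedge dx_{\alpha_k}, \]
whose coordinates are in $L^1(\Omega,\sigma)$ we have
\[ \int_{\varphi(G)}\omega=\int_G\sum_{\alpha\in I(k,n)}(\omega_\alpha\circ\varphi)\,d\varphi_{\alpha_1}\wedge\dots\wedge d\varphi_{\alpha_k}
=\int_G\sum_{\alpha\in I(k,n)}(\omega_\alpha\circ\varphi)\,\det(\nabla\varphi_{\alpha_1},\dots,\nabla\varphi_{\alpha_k})\,dt. \]
Proof. The assertion is an obvious consequence of 34.19 and the definitions.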


38.19. Example. We evaluate the integral of the differential form $z\,dx\wedge dy$ over the
sphere $\Omega=\{[x,y,z]\in\mathbb{R}^3:x^2+y^2+z^2=1\}$. We will use the parametrization $\varphi(t,a)=
[x(t,a),y(t,a),z(t,a)]$, where
\[ x(t,a)=\cos a\cos t,\qquad y(t,a)=\cos a\sin t,\qquad z(t,a)=\sin a, \]
$[t,a]\in G:=(0,2\pi)\times(-\tfrac12\pi,\tfrac12\pi)$, and the orientation is supposed to make $\varphi$ positive. Recall
that the uncovered part of the sphere has $(n-1)$-dimensional measure zero. We have
\[ dx=-\sin t\cos a\,dt-\cos t\sin a\,da,\qquad dy=\cos t\cos a\,dt-\sin t\sin a\,da,\qquad dz=\cos a\,da. \]
Thus
\[ \int_{\boldsymbol{\Omega}}z\,dx\wedge dy=\int_G\sin a\,(-\sin t\cos a\,dt-\cos t\sin a\,da)\wedge(\cos t\cos a\,dt-\sin t\sin a\,da) \]
\[ =\int_G\sin^2a\cos a\sin^2t\,dt\wedge da-\sin^2a\cos a\cos^2t\,da\wedge dt \]
\[ =\int_G\sin^2a\cos a\,(\sin^2t+\cos^2t)\,dt\,da=2\pi\int_{-\pi/2}^{\pi/2}\sin^2a\cos a\,da=2\pi\int_{-1}^{1}u^2\,du=\frac{4\pi}{3}. \]
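The value $4\pi/3$ can be confirmed by integrating the pulled-back form numerically over $G$. The Python sketch below applies a plain midpoint rule; the grid size `n` and the function name are assumptions of this sketch, not part of the text.

```python
import math

def integral_z_dx_dy(n=200):
    """Midpoint rule for z * det d(x, y)/d(t, a) on (0, 2*pi) x (-pi/2, pi/2)."""
    ht = 2 * math.pi / n
    ha = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * ht
        for j in range(n):
            a = -math.pi / 2 + (j + 0.5) * ha
            z = math.sin(a)
            # Partial derivatives of x = cos a cos t and y = cos a sin t
            x_t, x_a = -math.cos(a) * math.sin(t), -math.sin(a) * math.cos(t)
            y_t, y_a = math.cos(a) * math.cos(t), -math.sin(a) * math.sin(t)
            total += z * (x_t * y_a - x_a * y_t)
    return total * ht * ha

approx = integral_z_dx_dy()
exact = 4 * math.pi / 3
```

The determinant inside the loop is the coefficient of $dt\wedge da$ in $dx\wedge dy$, so the sum approximates the integral over $G$ computed in the example.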

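38.20. Example. Let $\Omega:=\{x\in\mathbb{R}^4:x_1^2+x_2^2=x_3^2+x_4^2=1,\ x_1>0,\ x_3>0\}$. Consider
the parametrization $\varphi(t)=[\cos t_1,\sin t_1,\cos t_2,\sin t_2]$, $t\in G:=(-\tfrac\pi2,\tfrac\pi2)^2$. Suppose that the
orientation makes $\varphi$ positive. Then
\[ \varphi'(t)=\begin{pmatrix}-\sin t_1&0\\ \cos t_1&0\\ 0&-\sin t_2\\ 0&\cos t_2\end{pmatrix} \]
so that
\[ \frac{\partial\varphi}{\partial t_1}(t)\wedge\frac{\partial\varphi}{\partial t_2}(t)=\sin t_1\sin t_2\,e_1\wedge e_3-\sin t_1\cos t_2\,e_1\wedge e_4-\cos t_1\sin t_2\,e_2\wedge e_3+\cos t_1\cos t_2\,e_2\wedge e_4 \]
and
\[ \operatorname{vol}\Bigl(\frac{\partial\varphi}{\partial t_1},\frac{\partial\varphi}{\partial t_2}\Bigr)(t)=\sqrt{\sin^2t_1\sin^2t_2+\sin^2t_1\cos^2t_2+\cos^2t_1\sin^2t_2+\cos^2t_1\cos^2t_2}=1. \]
(The expression under the root is the sum of squares of coordinates of the 2-vector
$\frac{\partial\varphi}{\partial t_1}\wedge\frac{\partial\varphi}{\partial t_2}$.) It follows that
\[ \tau(x)=x_2x_4\,e_1\wedge e_3-x_2x_3\,e_1\wedge e_4-x_1x_4\,e_2\wedge e_3+x_1x_3\,e_2\wedge e_4 \]
is a tangent 2-vector field. We can compute (taking into account the sign of $\tau$ only), for example,
\[ \int_{\boldsymbol{\Omega}}x_1\,dx_2\wedge dx_4=\int_G\cos t_1\det\begin{pmatrix}\cos t_1&0\\ 0&\cos t_2\end{pmatrix}dt=\int_G\cos^2t_1\cos t_2\,dt_1\,dt_2=\pi. \]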
=
G

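38.21. General Stokes Theorem. Let $\Omega\subset\mathbb{R}^n$ be a bounded oriented $k$-dimensional
surface with a Lipschitz $k$-boundary $\Gamma$. Let $\omega$ be a $C^1$ differential
$(k-1)$-form on a neighborhood of $\overline{\Omega}$. Then
\[ \int_{\boldsymbol{\Gamma}}\omega=\int_{\boldsymbol{\Omega}}d\omega. \]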

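Proof. We present a proof for a differential form of the form
\[ \omega=\eta\,dx_{\alpha_1}\wedge\dots\wedge dx_{\alpha_{k-1}}, \]
where $\alpha\in I(k-1,n)$. Using compactness of $\overline{\Omega}$ and the definition of a surface
with a Lipschitz $k$-boundary we get open balls $U(x_1,r_1),\dots,U(x_m,r_m)$ covering
$\overline{\Omega}$, open sets $G_q\subset\mathbb{R}^k$ and bilipschitz mappings $\varphi_q$ on $G_q\cap\overline{H^k}$ such that, for
each $q=1,\dots,m$, we have $U(x_q,r_q)\cap\Omega=\varphi_q(G_q\cap H^k)$, $U(x_q,r_q)\cap\Gamma=\varphi_q(G_q\cap\partial H^k)$,
$\varphi_q|_{G_q\cap H^k}$ is a positive parametrization of $U(x_q,r_q)\cap\Omega$ and $\varphi_q|_{G_q\cap\partial H^k}\circ i$
is a positive parametrization of $U(x_q,r_q)\cap\Gamma$. (Without loss of generality we
assume the case (a) of 37.17.) Let $\chi_q$ be Lipschitz functions positive on $U(x_q,r_q)$,
vanishing outside $U(x_q,r_q)$ and satisfying $\sum_{q=1}^m\chi_q=1$ on $\overline{\Omega}$. (This is in fact a
partition of unity, cf. 39.11. We can choose e.g.
\[ \tilde\chi_q(x)=\max(0,r_q-|x-x_q|) \]
and $\chi_q(x)=\tilde\chi_q(x)/\sum_{i=1}^m\tilde\chi_i(x)$.) Let $q$ be fixed, $U=U_q$, $G=G_q$ and $\varphi=\varphi_q$. We define a
function $\theta$ on $\mathbb{R}^k$ by the formula
\[ \theta(t)=\begin{cases}1&\text{if }t_1\le0,\\ 1-t_1&\text{if }0<t_1<1,\\ 0&\text{if }t_1\ge1.\end{cases} \]
Let $\psi_1,\dots,\psi_k$ be Lipschitz functions with a compact support on $\mathbb{R}^k$ satisfying
\[ \psi_1=(\chi_q\eta)\circ\varphi,\quad\psi_2=\varphi_{\alpha_1},\quad\dots,\quad\psi_k=\varphi_{\alpha_{k-1}}\qquad\text{on }G\cap H^k, \]
\[ \psi_1=0\qquad\text{on }H^k\setminus G, \]
\[ \psi_i(t)=\psi_i(0,t_2,\dots,t_k)\qquad\text{if }0<t_1<1. \]
The existence of these functions follows from McShane's theorem 30.5. Applying
Corollary 35.3 to $\theta\psi_1,\psi_2,\dots,\psi_k$ we obtain
\[ 0=\int_{\mathbb{R}^k}\det(\nabla(\theta\psi_1),\nabla\psi_2,\dots,\nabla\psi_k)\,dt
=\int_{\mathbb{R}^k}\psi_1\det(\nabla\theta,\nabla\psi_2,\dots,\nabla\psi_k)\,dt+\int_{\mathbb{R}^k}\theta\det(\nabla\psi_1,\nabla\psi_2,\dots,\nabla\psi_k)\,dt. \]
The first integrand can differ from zero only on the strip $\{0\le t_1\le1\}$; otherwise
$\nabla\theta=0$. Fubini's theorem yields
\[ \int_{\mathbb{R}^k}\psi_1\det(\nabla\theta,\nabla\psi_2,\dots,\nabla\psi_k)\,dt
=-\int_{\mathbb{R}^{k-1}}\Bigl(\int_0^1\psi_1\,dt_1\Bigr)\det\Bigl(\frac{\partial\psi_i}{\partial t_j}\Bigr)_{i,j=2}^k\,dt_2\dots dt_k \]
\[ =-\int_{G\cap\partial H^k}\psi_1\det\Bigl(\frac{\partial\psi_i}{\partial t_j}\Bigr)_{i,j=2}^k\,dt_2\dots dt_k=-\int_{\boldsymbol{\Gamma}}\chi_q\omega. \]
The second integrand vanishes on $H^k\setminus G$ (where $\psi_1=0$), on the strip $\{0<t_1<1\}$
(where the partial derivatives $\partial\psi_i/\partial t_1$ are zero) and for $t_1\ge1$ (where $\theta=0$).
Thus
\[ \int_{\mathbb{R}^k}\theta\det(\nabla\psi_1,\nabla\psi_2,\dots,\nabla\psi_k)\,dt=\int_{G\cap H^k}\det(\nabla\psi_1,\dots,\nabla\psi_k)\,dt=\int_{\boldsymbol{\Omega}}d(\chi_q\omega). \]
Summing over $q=1,\dots,m$ we get the desired equality.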


38.22. Example. Let $\Omega:=\{x\in\mathbb{R}^4:1<x_1^2+x_2^2=x_3^2+x_4^2<4\}$, $\Gamma:=\{x\in\mathbb{R}^4:x_1^2+x_2^2=
x_3^2+x_4^2=1\}\cup\{x\in\mathbb{R}^4:x_1^2+x_2^2=x_3^2+x_4^2=4\}$. We want to verify that for a suitable
orientation, $\Gamma$ is a Lipschitz 3-boundary of $\Omega$. Using the definition, we show this for the point
$[1,0,1,0]$. Set
\[ \varphi(t)=[(1-t_1)\cos t_2,\ (1-t_1)\sin t_2,\ (1-t_1)\cos t_3,\ (1-t_1)\sin t_3], \]
$t\in G:=(-1,1)\times(-\pi,\pi)^2$. We have
\[ \varphi'(t)=\begin{pmatrix}-\cos t_2&-(1-t_1)\sin t_2&0\\ -\sin t_2&(1-t_1)\cos t_2&0\\ -\cos t_3&0&-(1-t_1)\sin t_3\\ -\sin t_3&0&(1-t_1)\cos t_3\end{pmatrix}. \]
Hence we easily compute that $\operatorname{vol}\bigl(\frac{\partial\varphi}{\partial t_2}(t),\frac{\partial\varphi}{\partial t_3}(t)\bigr)=1$ for $t_1=0$ (similarly as in Example 38.20) and
\[ \operatorname{vol}\Bigl(\frac{\partial\varphi}{\partial t_1},\frac{\partial\varphi}{\partial t_2},\frac{\partial\varphi}{\partial t_3}\Bigr)(t)=\sqrt2\,(1-t_1)^2. \]
Let the orientations of $\Omega$ and $\Gamma$ be chosen in such a way that the parametrization $\varphi$ is positive.
We easily verify that the unit tangent 3-vector field on $\Omega$ has the form
\[ \tau(x)=\frac{1}{\sqrt{x_1^2+x_2^2+x_3^2+x_4^2}}\bigl(x_1\,e_2\wedge e_3\wedge e_4-x_2\,e_1\wedge e_3\wedge e_4-x_3\,e_1\wedge e_2\wedge e_4+x_4\,e_1\wedge e_2\wedge e_3\bigr) \]
(which corresponds to the normal field
\[ n(x)=\frac{1}{\sqrt{x_1^2+x_2^2+x_3^2+x_4^2}}\,[x_1,x_2,-x_3,-x_4], \]
see 38.17), and the unit tangent 2-vector field on $\Gamma$ is
\[ \tau_\Gamma(x)=a\bigl(x_2x_4\,e_1\wedge e_3-x_2x_3\,e_1\wedge e_4-x_1x_4\,e_2\wedge e_3+x_1x_3\,e_2\wedge e_4\bigr), \]
where $a$ equals $-1/4$ for $x_1^2+x_2^2=4$ and $1$ for $x_1^2+x_2^2=1$.
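38.23. Example. Let $\Omega:=\{[x,y,z]\in\mathbb{R}^3:z=\sqrt{x^2+y^2}<\tfrac35x+1\}$, $\Gamma:=\{[x,y,z]\in
\mathbb{R}^3:z=\sqrt{x^2+y^2}=\tfrac35x+1\}$. (A part of the conical surface surrounded by an ellipse $\Gamma$.) Choose
the orientation of $\Omega$ in such a way that the $z$-coordinate of the normal is positive. The evaluation
of the integral $\int_{\boldsymbol{\Omega}}dy\wedge dz$ is simplified when applying Stokes' formula. We use the mapping
\[ \varphi(r,t)=\Bigl[\tfrac{15}{16}+\tfrac{25}{16}r\cos t,\ \tfrac54r\sin t,\ \sqrt{\bigl(\tfrac{15}{16}+\tfrac{25}{16}r\cos t\bigr)^2+\bigl(\tfrac54r\sin t\bigr)^2}\,\Bigr], \]
which is a positive parametrization; $\varphi((0,1)\times(0,2\pi))$ covers $\Omega$ up to a set of two-dimensional
measure zero. The mapping $\gamma:t\mapsto\varphi(1,t)$, $t\in(0,2\pi)$, is then a positive parametrization of
$\Gamma\setminus\{[\tfrac52,0,\tfrac52]\}$. Since $z=\tfrac35x+1$ on $\Gamma$, we have $\gamma_3=\tfrac35\gamma_1+1$ and
$d\gamma_3=\tfrac35\,d\gamma_1=-\tfrac35\cdot\tfrac{25}{16}\sin t\,dt$.
Thus
\[ \int_{\boldsymbol{\Omega}}dy\wedge dz=\int_{\boldsymbol{\Gamma}}y\,dz=-\int_0^{2\pi}\frac54\cdot\frac35\cdot\frac{25}{16}\sin^2t\,dt=-\frac{75}{64}\pi. \]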
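Both sides of Stokes' formula in Example 38.23 can be approximated independently. The Python sketch below is a numerical check under stated assumptions: midpoint rules with hypothetical grid sizes, and the surface integrand written as the coefficient of $dr\wedge dt$ in the pullback of $dy\wedge dz$ under $\varphi$.

```python
import math

A, B, C = 15 / 16, 25 / 16, 5 / 4   # centre and semi-axes of the ellipse

def boundary_integral(n=2000):
    """Integral of y dz along gamma(t) = phi(1, t), t in (0, 2*pi)."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        y = C * math.sin(t)
        dz = -(3 / 5) * B * math.sin(t)   # z = 3x/5 + 1 on the boundary
        total += y * dz
    return total * h

def surface_integral(n=300):
    """Integral over (0,1) x (0, 2*pi) of det d(y, z)/d(r, t)."""
    hr, ht = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            x = A + B * r * math.cos(t)
            y = C * r * math.sin(t)
            z = math.hypot(x, y)
            x_r, x_t = B * math.cos(t), -B * r * math.sin(t)
            y_r, y_t = C * math.sin(t), C * r * math.cos(t)
            z_r = (x * x_r + y * y_r) / z   # chain rule for z = sqrt(x^2 + y^2)
            z_t = (x * x_t + y * y_t) / z
            total += y_r * z_t - y_t * z_r
    return total * hr * ht

exact = -75 * math.pi / 64
lhs = surface_integral()
rhs = boundary_integral()
```

The surface integrand is bounded but not smooth at the apex of the cone (which corresponds to $r=0.6$, $t=\pi$), so the midpoint rule converges a little more slowly there; the chosen grids never hit that point exactly.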
38.24. Notes. The exterior multiplication was invented by H. Grassmann in the 19th century.
See also 39.24.

39. Integration on Manifolds


A natural generalization of $k$-dimensional surfaces in $\mathbb{R}^n$ are manifolds. The
main difference consists in the fact that a surface is considered as a subset of $\mathbb{R}^n$
while in the case of manifolds we abstract from the embedding into $\mathbb{R}^n$. In this
chapter we also introduce notions which we omitted in the previous chapters, like
the notion of the tangent space.
However, this chapter should not be understood as an introduction to the
analysis on manifolds. It contains only a direct presentation of integration
theory. We do not give proofs which, in principle, are mostly the same as proofs
of analogous results of the preceding chapters.
Now, the whole theory could be built in the spirit of Lipschitz mappings like in
the last chapters. We hope that the reader will welcome, at least once, the simplicity
of a $C^1$-presentation.


39.1. Manifolds. Let $\Omega$ be a metrizable topological space. A homeomorphic
mapping $\psi$ of an open set $U\subset\Omega$ into $\mathbb{R}^k$ is called a ($k$-dimensional) chart (on
$\Omega$) provided $\psi(U)$ is an open subset of $\mathbb{R}^k$. The domain $U$ of the chart $\psi$ will
be denoted by $U_\psi$. A system $\mathcal{A}$ of charts on $\Omega$ is called a $C^p$-atlas (on $\Omega$) if
$\{U_\psi:\psi\in\mathcal{A}\}$ is a covering of $\Omega$ and all superpositions $\psi\circ\chi^{-1}$, where $\psi,\chi\in\mathcal{A}$,
are of class $C^p$. In this case $\boldsymbol{\Omega}=(\Omega,\mathcal{A})$ is called a ($k$-dimensional) manifold of
class $C^p$. If the order of differentiability of a manifold is not specified, then $C^1$ is
tacitly understood.
Let us emphasize that our manifolds are supposed to be metrizable, which is not
always the case with other authors. Omitting this assumption we get some peculiar
examples which are not important from the point of view of applications.
If the set $\Omega$ is equipped with two different atlases $\mathcal{A}_1$, $\mathcal{A}_2$, we understand the
manifolds $(\Omega,\mathcal{A}_1)$, $(\Omega,\mathcal{A}_2)$ to be different. (Sometimes it is preferred to identify
manifolds whose atlases are in a sense equivalent.)
A set $G\subset\mathbb{R}^k$ is identified with the manifold $(G,\{\mathrm{id}\})$, where $\mathrm{id}$ is the identity
mapping on $G$.
Let $(\Omega,\mathcal{A})$, $(\Omega',\mathcal{A}')$ be two manifolds (not necessarily of the same dimension)
and $U\subset\Omega$ be an open set. We say that a mapping $f:U\to\Omega'$ is of class $C^p$
(measurable, differentiable) at a point $x$ if all superpositions $\psi'\circ f\circ\psi^{-1}$, $\psi\in
\mathcal{A}$, $\psi'\in\mathcal{A}'$, are of class $C^p$ (measurable, differentiable) at $\psi(x)$. A set $E\subset\Omega$
is called measurable if $\psi(E\cap U_\psi)$ is $\lambda$-measurable for each $\psi\in\mathcal{A}$, and a null set
if $\lambda(\psi(E\cap U_\psi))=0$ for each $\psi\in\mathcal{A}$. A homeomorphic mapping $f$ is called a
diffeomorphism provided both $f$ and $f^{-1}$ are $C^1$.
Notice the essential difference from the previous concept of a surface in $\mathbb{R}^n$.
If $\Omega$ is a subset of $\mathbb{R}^n$, then the metric and the linear structure induced on $\Omega$ give
an easy way to differentiate. Having a manifold, the only information about
the structure (except the topological one) is given by the atlas. Without the
knowledge of the atlas we are not able to decide whether a mapping of an open
subset of $\mathbb{R}^k$ to the manifold is differentiable.
39.2. Embedding. Let $(\Omega,\mathcal{A})$ be a $k$-dimensional manifold of class $C^p$. Let
$f:\Omega\to\mathbb{R}^n$ be a diffeomorphism and $f\mathcal{A}=\{\psi\circ f^{-1}:\psi\in\mathcal{A}\}$. The mapping $f$
is called an embedding of class $C^p$ into $\mathbb{R}^n$ if $(f(\Omega),f\mathcal{A})$ is again a manifold of
class $C^p$.
Most frequently we meet the identical embedding of a manifold which is itself
a topological subspace of $\mathbb{R}^n$. A possibility to introduce the structure of an
embedded manifold (an atlas) on a set $\Omega\subset\mathbb{R}^n$ is to use coordinate charts $x\mapsto
[x_{\alpha_1},\dots,x_{\alpha_k}]$, where $\alpha$ is a multiindex of $\{1,\dots,n\}^k$.
39.3. Orientation. Let $k\ge1$. A diffeomorphism $\varphi$ of an open set $G\subset\mathbb{R}^k$
into $\mathbb{R}^k$ is called positive if $J\varphi>0$ on $G$, and negative if $J\varphi<0$ on $G$. We say
that $(\Omega,\mathcal{A})$ is an oriented manifold, or that $\mathcal{A}$ is an oriented atlas on $\Omega$, if all
superpositions $\psi\circ\chi^{-1}$, where $\psi,\chi\in\mathcal{A}$, are positive.
Notice that to any oriented manifold $(\Omega,\mathcal{A})$ of dimension $k\ge1$ there exists a
manifold $(\Omega,-\mathcal{A})$ with the opposite orientation, where
\[ -\mathcal{A}=\{[-\psi_1,\psi_2,\dots,\psi_k]:[\psi_1,\dots,\psi_k]\in\mathcal{A}\}. \]
We say that a connected manifold $(\Omega,\mathcal{A})$ of dimension $k\ge1$ is orientable
if there is an oriented atlas $\mathcal{A}'\subset\mathcal{A}\cup(-\mathcal{A})$. A disconnected manifold is orientable
if all its connected components are orientable.
0-dimensional manifolds consist of isolated points. The orientation of such a
manifold is nothing else than the assignment of a sign plus or minus to each
of its points. Let $(\Omega,\mathcal{A})$ be an oriented 0-dimensional manifold. If $\psi\in\mathcal{A}$ and
$z\in U_\psi$, then $U_\psi=\{z\}$ and $\psi(z)=0$. If $z$ is a positive point, then the chart $\psi$ is
positive and conversely.
39.4. Tangent Spaces and Derivative of a Mapping. Let $(\Omega,\mathcal{A})$ be a
$k$-dimensional manifold and $x\in\Omega$. Let $\psi\in\mathcal{A}$ be a chart whose domain contains
$x$, and $\varphi=\psi^{-1}$. Then the tangent space $T_x$ to $\Omega$ at the point $x$ is generated by the
vectors $\frac{\partial\varphi}{\partial t_1}(\psi(x)),\dots,\frac{\partial\varphi}{\partial t_k}(\psi(x))$. If the manifold is embedded into $\mathbb{R}^n$, it is not
necessary to define these vectors since partial derivatives of $\varphi$ are elements of $\mathbb{R}^n$.
There is a lack of such an interpretation for abstract manifolds. To associate a
meaning with the symbols $\frac{\partial\varphi}{\partial t_j}(\psi(x))$, we can imagine the following construction: The
vector $\frac{\partial\varphi}{\partial t_j}(\psi(x))$ is represented as the linear form $f\mapsto\frac{\partial(f\circ\varphi)}{\partial t_j}(\psi(x))$ on the vector space
of all functions on $\Omega$ which are differentiable at $x$. A tedious computation shows
that such a definition of a tangent space is independent of the choice of $\psi$.
Let $(X,\mathcal{M})$ and $(Y,\mathcal{N})$ be two manifolds (not necessarily of the same dimension)
and $f:X\to Y$. If $x\in X$, $y\in Y$, $f(x)=y$ and $f$ is differentiable at
$x$, then the derivative of $f$ at $x$ is defined as the mapping which assigns to each
vector $u\in T_x(X)$ the vector $f'(x)u\in T_y(Y)$. Namely, if $u=\frac{\partial\varphi}{\partial t_j}(\psi(x))$, where
$\psi\in\mathcal{M}$ and $\varphi=\psi^{-1}$, we define
\[ f'(x)u=\frac{\partial(f\circ\varphi)}{\partial t_j}(\psi(x)). \]
Let the $k$-dimensional manifold $(\Omega,\mathcal{A})$ be oriented, $\psi\in\mathcal{A}$ and $x\in U_\psi$. The basis
$(u_1,\dots,u_k)$ of the tangent space $T_x(\Omega)$ is called positive provided $\det(\psi'(x)u_1,
\dots,\psi'(x)u_k)>0$, and negative if $\det(\psi'(x)u_1,\dots,\psi'(x)u_k)<0$. Notice that
neither of these notions depends on the particular choice of a chart.
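39.5. Example. Let $S_n$ be the sphere $\{x\in\mathbb{R}^n:|x|^2=1\}$.
(a) The structure of an oriented manifold of class $C^\infty$ can be formed on $S_n$ by the atlas of
the coordinate charts $\mathcal{A}=\{\psi_q:q\in\{-n,\dots,-1,1,\dots,n\}\}$, where
\[ \psi_1(x)=[x_2,x_3,\dots,x_n],\quad x_1>0,\qquad \psi_{-1}(x)=[-x_2,x_3,\dots,x_n],\quad x_1<0, \]
\[ \psi_2(x)=[-x_1,x_3,\dots,x_n],\quad x_2>0,\qquad \psi_{-2}(x)=[x_1,x_3,\dots,x_n],\quad x_2<0, \]
\[ \dots \]
\[ \psi_n(x)=[(-1)^{n-1}x_1,x_2,\dots,x_{n-1}],\quad x_n>0,\qquad \psi_{-n}(x)=[(-1)^{n}x_1,x_2,\dots,x_{n-1}],\quad x_n<0. \]
(The sign of the first coordinate is chosen so that all transition maps have a positive Jacobian.)
(b) There is no atlas on $S_n$ composed of a sole chart $\psi$. Indeed, $S_n$ is compact, $\psi$ is continuous
and thus $\psi(S_n)$ should be compact as well. However, there are no nonempty compact open sets in $\mathbb{R}^{n-1}$.
(c) We find the tangent space at a point $x\in S_n$ satisfying $x_n>0$ by means of the parametrization
\[ \varphi(t)=\bigl[t_1,\dots,t_{n-1},\sqrt{1-|t|^2}\,\bigr],\qquad t\in G=\{t\in\mathbb{R}^{n-1}:|t|<1\}, \]
which inverts the coordinate projection underlying $\psi_n$ (a change of sign of $t_1$ does not affect the
tangent space). The tangent space at the point $x=\varphi(t)$ is generated by the vectors
\[ \Bigl[1,0,\dots,0,\frac{-t_1}{\sqrt{1-|t|^2}}\Bigr]=\Bigl[1,0,\dots,0,-\frac{x_1}{x_n}\Bigr],\ \dots,\ \Bigl[0,\dots,0,1,\frac{-t_{n-1}}{\sqrt{1-|t|^2}}\Bigr]=\Bigl[0,\dots,0,1,-\frac{x_{n-1}}{x_n}\Bigr]. \]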


(d) Consider the mapping $g:\mathbb{R}^2\to\mathbb{R}^3$ defined as $g(x)=[x_1^2,x_2^2,\sqrt2\,x_1x_2]$ and denote its
restriction to the unit circle $S_2$ in $\mathbb{R}^2$ by $f$. Then $f$ maps $S_2$ to the unit sphere $S_3$ in $\mathbb{R}^3$.
The derivative $f'(x)$ maps a vector $u\in T_x(S_2)\subset\mathbb{R}^2$ to the vector $f'(x)(u)\in T_{f(x)}(S_3)$, and
\[ f'(x)(u)=g'(x)u=u_1\frac{\partial g}{\partial x_1}(x)+u_2\frac{\partial g}{\partial x_2}(x)=[2u_1x_1,\ 2u_2x_2,\ \sqrt2\,u_1x_2+\sqrt2\,u_2x_1]. \]
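The claims of part (d) lend themselves to a quick numerical check: $g$ should map the unit circle into the unit sphere, and $g'(x)u$ should be orthogonal to $g(x)$ whenever $u$ is tangent to the circle at $x$. The Python sketch below uses hypothetical function names.

```python
import math

SQRT2 = math.sqrt(2)

def g(x1, x2):
    """The mapping g(x) = [x1^2, x2^2, sqrt(2) x1 x2]."""
    return [x1 * x1, x2 * x2, SQRT2 * x1 * x2]

def g_prime_times(x1, x2, u1, u2):
    """The vector g'(x)u from the formula in (d)."""
    return [2 * u1 * x1, 2 * u2 * x2, SQRT2 * u1 * x2 + SQRT2 * u2 * x1]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

theta = 0.9
x = (math.cos(theta), math.sin(theta))    # a point of the unit circle
u = (-math.sin(theta), math.cos(theta))   # tangent vector to the circle at x
image_point = g(*x)
image_vector = g_prime_times(*x, *u)
```

Orthogonality of `image_vector` to `image_point` is exactly the condition for a vector to be tangent to the unit sphere at that point.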

39.6. Example. Let $0<r<R$, and
\[ \Omega=\{[x,y,z]\in\mathbb{R}^3:(\sqrt{x^2+y^2}-R)^2+z^2=r^2\} \]
be an anuloid (torus). We will use the parametrizations $\varphi_q(s,t)=[x,y,z]$, $[s,t]\in G_q$, where
\[ x=(R+r\cos s)\cos t,\qquad y=(R+r\cos s)\sin t,\qquad z=r\sin s, \]
\[ G_1=(0,2\pi)\times(0,2\pi),\quad G_2=(0,2\pi)\times(-\pi,\pi),\quad G_3=(-\pi,\pi)\times(0,2\pi),\quad G_4=(-\pi,\pi)\times(-\pi,\pi). \]
Then the atlas $\{\varphi_1^{-1},\dots,\varphi_4^{-1}\}$ forms the structure of an oriented manifold of class $C^\infty$
on $\Omega$.

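39.7. Example. Let $\Omega=\varphi(G)$, where $G=(-1/2,1/2)\times(-2\pi,2\pi)$ and
\[ \varphi(s,t)=\Bigl[\bigl(1+s\cos\tfrac t2\bigr)\cos t,\ \bigl(1+s\cos\tfrac t2\bigr)\sin t,\ s\sin\tfrac t2\Bigr]. \]
Then $\Omega$ has the shape of the Möbius strip. Let $\mathcal{A}$ be the atlas of all $C^1$ charts on $\Omega$. We will
prove that the manifold $(\Omega,\mathcal{A})$ is not orientable. We have
\[ \varphi'(s,t)=\begin{pmatrix}\cos t\cos\frac t2&-\sin t-s\sin t\cos\frac t2-\frac s2\cos t\sin\frac t2\\[2pt] \sin t\cos\frac t2&\cos t+s\cos t\cos\frac t2-\frac s2\sin t\sin\frac t2\\[2pt] \sin\frac t2&\frac s2\cos\frac t2\end{pmatrix}. \]
Assume that there exists an oriented atlas $\mathcal{A}'\subset\mathcal{A}$. Let $H=\{[s,t]\in G:\det(\psi\circ\varphi)'(s,t)>0$
whenever $\psi\in\mathcal{A}'$ and $\varphi(s,t)\in U_\psi\}$. Then using the connectedness of $G$ it follows that $H=G$
or $H=\emptyset$. Let $\psi\in\mathcal{A}'$, $[-1,0,0]\in U_\psi$. Then $\det((\psi\circ\varphi)')$ should have the same sign at the point
$[0,\pi]$ as at the point $[0,-\pi]$. However, $\varphi(s,t-2\pi)=\varphi(-s,t)$, so the two signs are opposite, which leads to a contradiction.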
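The sign flip behind the contradiction in Example 39.7 can be seen numerically. The Python sketch below (with hypothetical function names) evaluates the parametrization and the cross product of the two columns of its derivative; it illustrates the mechanism and is not a substitute for the proof.

```python
import math

def phi(s, t):
    """Parametrization of the Moebius strip from Example 39.7."""
    c = 1 + s * math.cos(t / 2)
    return [c * math.cos(t), c * math.sin(t), s * math.sin(t / 2)]

def normal(s, t):
    """Cross product of the two columns of phi'(s, t)."""
    ds = [math.cos(t) * math.cos(t / 2),
          math.sin(t) * math.cos(t / 2),
          math.sin(t / 2)]
    dt = [-math.sin(t) - s * math.sin(t) * math.cos(t / 2)
          - s / 2 * math.cos(t) * math.sin(t / 2),
          math.cos(t) + s * math.cos(t) * math.cos(t / 2)
          - s / 2 * math.sin(t) * math.sin(t / 2),
          s / 2 * math.cos(t / 2)]
    return [ds[1] * dt[2] - ds[2] * dt[1],
            ds[2] * dt[0] - ds[0] * dt[2],
            ds[0] * dt[1] - ds[1] * dt[0]]

same_point_a = phi(0.3, -math.pi)
same_point_b = phi(-0.3, math.pi)
n_plus = normal(0.0, math.pi)
n_minus = normal(0.0, -math.pi)
```

The parameters $[0,\pi]$ and $[0,-\pi]$ describe the same point $[-1,0,0]$ of the strip, yet the two normals computed there point in opposite directions.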

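39.8. Differential Forms. A differential $k$-form $\omega$ on an $n$-dimensional manifold
$\Omega$ is defined as a mapping $\omega:x\mapsto\omega(x)\in\Lambda^k(T_x(\Omega))$. The calculation with
differential forms on manifolds is transferred into a calculation with differential
forms in $\mathbb{R}^m$ by means of a pullback. Let $G\subset\mathbb{R}^m$ be an open set and $\varphi:G\to\Omega$
a $C^1$ mapping. Let $\omega$ be a differential $k$-form on $\Omega$. Then the differential $k$-form
$\varphi^*\omega$ on $G$, which is called the pullback of $\omega$, is defined as
\[ \bigl\langle(\varphi^*\omega)(t),(u_1,\dots,u_k)\bigr\rangle=\bigl\langle\omega(\varphi(t)),(\varphi'(t)u_1,\dots,\varphi'(t)u_k)\bigr\rangle,\qquad u_1,\dots,u_k\in\mathbb{R}^m. \]
The pullback of a differential form on $\Omega$
\[ \omega(x)=\sum_{\alpha\in I(k,n)}\omega_\alpha(x)\,d\psi_{\alpha_1}\wedge\dots\wedge d\psi_{\alpha_k}(x) \]
is a differential form on $G$
\[ \varphi^*\omega(t)=\sum_{\alpha\in I(k,n)}\omega_\alpha(\varphi(t))\,d(\psi_{\alpha_1}\circ\varphi)\wedge\dots\wedge d(\psi_{\alpha_k}\circ\varphi)(t), \]
which can be, of course, expressed in coordinates in $\mathbb{R}^m$ (see Example 39.10).
We say that a differential form $\omega$ on $\Omega$ is of class $C^p$ or measurable if the same
holds for its pullbacks $(\psi^{-1})^*\omega$, $\psi\in\mathcal{A}$.
The differential of a differential $(k-1)$-form $\omega$ is defined as a differential $k$-form
$d\omega$ such that $(\psi^{-1})^*d\omega=d((\psi^{-1})^*\omega)$ for each chart $\psi\in\mathcal{A}$. Every $C^1$ differential
form has a differential (this is not entirely easy), and if $\Omega$ is an open subset of
$\mathbb{R}^n$, it coincides with the differential introduced in the preceding chapter.
If a manifold $\Omega$ is embedded into $\mathbb{R}^n$, then any differential form
\[ \omega=\sum_{\alpha\in I(k,n)}\omega_\alpha\,dx_{\alpha_1}\wedge\dots\wedge dx_{\alpha_k} \]
on a neighborhood of $\Omega$ induces a differential form $\tilde\omega$ on $\Omega$, namely
\[ \tilde\omega=\sum_{\alpha\in I(k,n)}\omega_\alpha\,d\tilde x_{\alpha_1}\wedge\dots\wedge d\tilde x_{\alpha_k}, \]
where $\tilde x_i$ are the coordinate functions $x\mapsto x_i$ on $\Omega$. Notice that $\omega(x)\in\Lambda^k(\mathbb{R}^n)$
while $\tilde\omega(x)\in\Lambda^k(T_x(\Omega))$. The space $T_x(\Omega)$ can be understood as a subspace of $\mathbb{R}^n$.
The difference is immaterial from the point of view of integration. Nevertheless,
some care is recommended. Indeed, two different elements of $\Lambda^k(\mathbb{R}^n)$
can coincide on $T_x(\Omega)$, so that a differential form on $\Omega$ can have several distinct
descriptions in coordinates (related to $\mathbb{R}^n$). This phenomenon is demonstrated
by the next example.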
39.9. Example. Let $\Omega$ be the unit circle $\{[x,y]\in\mathbb{R}^2:x^2+y^2=1\}$. Then $x\,dx+y\,dy$ induces
the zero differential form on $\Omega$.
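39.10. Example. Let $\Omega$ be the conical surface $\{x\in\mathbb{R}^3:x_3=\sqrt{x_1^2+x_2^2}\}$. Let $\psi$ be the chart
which is the inverse to the mapping $\varphi:G\to\Omega$, $G=(0,1)\times(0,2\pi)$, $\varphi(t)=[\varphi_1(t),\varphi_2(t),\varphi_3(t)]=
[t_1\cos t_2,t_1\sin t_2,t_1]$, and let $\omega(x)=x_1\,dx_2\wedge dx_3$ be the differential form on $\Omega$. Then
\[ \varphi^*\omega=\varphi_1\,d\varphi_2\wedge d\varphi_3=t_1\cos t_2\,d(t_1\sin t_2)\wedge dt_1=-(t_1^2\cos^2t_2)\,dt_1\wedge dt_2. \]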
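The pullback in Example 39.10 can be verified numerically: the coefficient of $dt_1\wedge dt_2$ in the pullback of $x_1\,dx_2\wedge dx_3$ is $\varphi_1$ times the $2\times2$ minor of $\varphi'$ on rows 2 and 3. The Python sketch below approximates the Jacobian by central differences; the step `h` and the function names are assumptions of this sketch.

```python
import math

def phi(t1, t2):
    """Parametrization of the cone from Example 39.10."""
    return [t1 * math.cos(t2), t1 * math.sin(t2), t1]

def jacobian(f, t1, t2, h=1e-6):
    """3 x 2 matrix of partial derivatives of f by central differences."""
    p1, m1 = f(t1 + h, t2), f(t1 - h, t2)
    p2, m2 = f(t1, t2 + h), f(t1, t2 - h)
    return [[(p1[i] - m1[i]) / (2 * h), (p2[i] - m2[i]) / (2 * h)]
            for i in range(3)]

def pullback_coefficient(t1, t2):
    """Coefficient of dt1 ^ dt2 in the pullback of x1 dx2 ^ dx3 under phi."""
    J = jacobian(phi, t1, t2)
    minor_23 = J[1][0] * J[2][1] - J[1][1] * J[2][0]
    return phi(t1, t2)[0] * minor_23

t1, t2 = 0.5, 1.0
numeric = pullback_coefficient(t1, t2)
closed_form = -(t1 ** 2) * math.cos(t2) ** 2
```

The same pattern works for any $k$-form $dx_{\alpha_1}\wedge\dots\wedge dx_{\alpha_k}$: take the minor of the Jacobian on the rows $\alpha_1,\dots,\alpha_k$.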

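39.11. Partition of Unity. Let $(\Omega,\mathcal{A})$ be a manifold. A system $\{\chi_\beta\}_{\beta\in T}$
of nonnegative functions of class $C^1$ on $\Omega$ is called a partition of unity on $\Omega$
(subordinated to a covering $\{U_\psi\}_{\psi\in\mathcal{A}}$) if for every $\beta\in T$ there is $\psi\in\mathcal{A}$ such
that $\{\chi_\beta>0\}\subset U_\psi$ and, in addition, each point $x\in\Omega$ has a neighborhood
$V$ with a finite set $T_V\subset T$ such that $\chi_\beta=0$ on $V$ provided $\beta\notin T_V$, and
$\sum_{\beta\in T_V}\chi_\beta=1$ on $V$. So, $\sum_{\beta\in T}\chi_\beta=1$ on $\Omega$ and this sum is locally finite.
The partition of unity exists, cf. 39.22.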


39.12. Riemannian Metric. If we want to introduce a $k$-dimensional measure
on a manifold, the idea of copying the definition from $\mathbb{R}^n$ is not the best one. An
analogy with the Change of Variable Formula of 34.19 is more straightforward. In
this case we need to define a volume of a $k$-tuple of tangent vectors. The definition
of a volume (if we omit the possibility of its axiomatic introduction) is based on
an inner product. If the given manifold is embedded into $\mathbb{R}^n$, on each tangent
space we have at our disposal the inner product from $\mathbb{R}^n$. In the general case we
need to consider an inner product on tangent spaces as an additional structure.
Let $(X,\mathcal{M})$ be an $n$-dimensional manifold of class $C^1$ and $g$ a mapping associating
with every $x\in X$ a positive definite bilinear form $g_x$ on $T_x(X)$. Then
we can express $g_x$ in coordinates with respect to a chart $\psi\in\mathcal{M}$ in such a way
that for any vectors $u,v\in T_x(X)$ we have
\[ g_x(u,v)=\sum_{i,j=1}^n g^x_{i,j}\,\tilde u_i\tilde v_j, \]
where $\tilde u_i=\psi_i'(x)u$ and $\tilde v_j=\psi_j'(x)v$. If all coordinate functions $x\mapsto g^x_{i,j}$, $i,j=
1,\dots,n$, are continuous, we call $g$ a Riemannian metric on $X$. The structure
$(X,\mathcal{M},g)$ is called a Riemannian manifold. Any manifold of class $C^1$ admits a
Riemannian structure; it follows easily using the partition of unity.

39.13. Example. If $\Omega:=\{[x,y,z]\in\mathbb{R}^3:x=r\cos z,\ y=r\sin z,\text{ where }r=\sqrt{x^2+y^2}\}$
is a helix, then $\Omega$ is a two-dimensional manifold. If we introduce a metric on $T_{[x,y,z]}$ by the
formula
\[ g_{[x,y,z]}(u,v)=Pu\cdot Pv, \]
where $P$ is the projection $[x,y,z]\mapsto[x,y]$, then we get a Riemannian manifold which does not
have an isometric embedding to $\mathbb{R}^n$. (The mapping $P$ is of course locally an isometric embedding
into $\mathbb{R}^2$ but globally it is not one-to-one.) Such manifolds are useful in complex analysis; the
example demonstrates that non-embedded manifolds are not merely useless abstractions.
39.14. Example. Let $\Omega$ be the open unit disc in $\mathbb{R}^2$. Set
\[ g_x(u,v)=u\cdot v+\frac{(u\cdot x)(v\cdot x)}{1-|x|^2},\qquad x\in\Omega,\ u,v\in\mathbb{R}^2. \]
Then $g$ is a Riemannian metric which gives the shape of a hemisphere to the manifold $\Omega$.
Indeed, the mapping $f:x\mapsto[x_1,x_2,\sqrt{1-x_1^2-x_2^2}\,]$, which maps $\Omega$ to the actual hemisphere
$f(\Omega)$ (endowed with the Euclidean inner product), preserves the inner product. Namely, for all
$u,v\in\mathbb{R}^2$ and $x\in\Omega$ we have
\[ (f'(x)u)\cdot(f'(x)v)=g_x(u,v), \]
hence $f$ is an isometric mapping. On the other hand, $\Omega$ is not isometric to any open subset of
$\mathbb{R}^2$ (it is not possible to make the hemisphere flat). From the geometrical point of view, the
shape of the manifold expressed by the Riemannian metric is more important than the original
underlying space.
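The identity $(f'(x)u)\cdot(f'(x)v)=g_x(u,v)$ can be tested at sample points. The Python sketch below assumes the hemisphere reading of the example, that is, $f(x)=[x_1,x_2,\sqrt{1-x_1^2-x_2^2}]$ together with the denominator $1-|x|^2$ in the metric; the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def metric(x, u, v):
    """g_x(u, v) = u.v + (u.x)(v.x) / (1 - |x|^2) on the open unit disc."""
    return dot(u, v) + dot(u, x) * dot(v, x) / (1 - dot(x, x))

def pushforward(x, u):
    """f'(x)u for f(x) = [x1, x2, sqrt(1 - x1^2 - x2^2)]."""
    z = math.sqrt(1 - dot(x, x))
    return [u[0], u[1], -dot(x, u) / z]

x = (0.3, -0.4)
u = (1.0, 2.0)
v = (-0.5, 0.7)
lhs = dot(pushforward(x, u), pushforward(x, v))
rhs = metric(x, u, v)
```

The third component of `pushforward` is the derivative of $\sqrt{1-|x|^2}$ in the direction $u$, and the product of two such components yields exactly the second term of the metric.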

39.15. k-dimensional Measures on Riemannian Manifolds. Suppose that
$(X,\mathcal{A},g)$ is a Riemannian manifold, $x\in X$ and $(u_1,\dots,u_k)\in(T_x(X))^k$. We
define the volume of this $k$-tuple of vectors similarly as in 34.10:
\[ \operatorname{vol}(u_1,\dots,u_k)=\sqrt{\det\bigl(g_x(u_i,u_j)\bigr)_{i,j=1}^k}. \]
This immediately introduces also the volume of a linear mapping $L:\mathbb{R}^k\to
T_x(X)$. Let $\sigma$ be a measure on the $\sigma$-algebra of all measurable subsets of $X$.
We say that $\sigma$ is a $k$-dimensional measure on $X$ if for each $\psi\in\mathcal{A}$ and each
measurable set $E\subset U_\psi$ we have
\[ \sigma E=\int_{\varphi^{-1}(E)}\operatorname{vol}\varphi'(t)\,dt,\qquad\text{where }\varphi=\psi^{-1}. \]
Since the integral does not depend on the particular choice of $\psi$, the partition of
unity leads to the existence of a $k$-dimensional measure on a $k$-dimensional
Riemannian manifold.


39.16. Integration of Differential Forms on Riemannian Manifolds. Let
$(\Omega,\mathcal{A},g)$ be a $k$-dimensional oriented Riemannian manifold. We define the unit tangent
$k$-vector $\tau(x)$ to $\Omega$ at a point $x$ by the formula
\[ \tau(x)=\frac{u_1\wedge\dots\wedge u_k}{\operatorname{vol}(u_1,\dots,u_k)}, \]
where $(u_1,\dots,u_k)$ is a positive basis of $T_x(\Omega)$ (the definition does not depend
on the choice of the basis). Now, we can introduce the integration of differential
forms by means of integration with respect to a $k$-dimensional measure $\sigma$ on $\Omega$, similarly as we
have proceeded in 38.16: Namely, if $\omega$ is an integrable differential form on $\Omega$, then
\[ \int_{\boldsymbol{\Omega}}\omega=\int_{\Omega}\langle\omega,\tau\rangle\,d\sigma. \]

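39.17. Example. We evaluate the integral
\[ \int_{\Omega}\frac{r\,x_2}{\sqrt{4r^4+5r^2+1}}\,d\sigma, \]
where $r=\sqrt{x_1^2+x_2^2}$, $\Omega:=\{x\in\mathbb{R}^4:x_1=r\cos x_4,\ x_2=r\sin x_4,\ x_3=r^2,\ x_4\in(0,\pi),\ r\in
(0,1)\}$ and $\sigma$ is a two-dimensional measure on $\Omega$.
Let $(e_s,e_t)$ be the canonical basis of $\mathbb{R}^2$ and $(e_1,\dots,e_4)$ the canonical basis of $\mathbb{R}^4$. The
manifold $\Omega$ will be parametrized by the mapping $\varphi(s,t)=[s\cos t,s\sin t,s^2,t]$, $[s,t]\in G:=
(0,1)\times(0,\pi)$. We introduce the structure of an oriented manifold on $\Omega$ by the atlas $\{\varphi^{-1}\}$.
Denote $L=\varphi'(s,t)$, $w=Le_s\wedge Le_t$. Then
\[ w=\frac{\partial\varphi}{\partial s}(s,t)\wedge\frac{\partial\varphi}{\partial t}(s,t)=[\cos t,\sin t,2s,0]\wedge[-s\sin t,s\cos t,0,1]
=\Bigl[\frac{x_1}{r},\frac{x_2}{r},2r,0\Bigr]\wedge[-x_2,x_1,0,1] \]
\[ =r\,e_1\wedge e_2+2rx_2\,e_1\wedge e_3+\frac{x_1}{r}\,e_1\wedge e_4-2rx_1\,e_2\wedge e_3+\frac{x_2}{r}\,e_2\wedge e_4+2r\,e_3\wedge e_4, \]
and
\[ |w|^2=r^2+4r^2x_2^2+\frac{x_1^2}{r^2}+4r^2x_1^2+\frac{x_2^2}{r^2}+4r^2=4r^4+5r^2+1. \]
For the unit tangent 2-vector we have
\[ \tau=\frac{w}{|w|}, \]
the integrand is expressed as
\[ \frac{r\,x_2}{\sqrt{4r^4+5r^2+1}}=\Bigl\langle\frac12\,dx_1\wedge dx_3,\tau\Bigr\rangle, \]
and thus
\[ \int_{\Omega}\frac{r\,x_2}{\sqrt{4r^4+5r^2+1}}\,d\sigma=\int_{\boldsymbol{\Omega}}\frac12\,dx_1\wedge dx_3
=\frac12\int_G(\cos t\,ds-s\sin t\,dt)\wedge2s\,ds=\int_Gs^2\sin t\,ds\,dt=\frac23. \]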

39.18. Integration on General Manifolds. The definition 39.16 is a logical
conclusion of the approach of the preceding chapter, where the presence of the
inner product was quite obvious. Nevertheless, for the purpose of integration of
differential forms on manifolds neither a Riemannian structure nor a $k$-dimensional
measure is needed. Indeed, we can realize that the integral expressions given
by the Change of Variable Formula do not depend on these structures. In the
general case we can proceed as follows: Let $(\Omega,\mathcal{A})$ be a $k$-dimensional oriented
manifold. Let $\omega$ be a measurable differential $k$-form on $\Omega$ and $E$ a measurable
subset of $\Omega$. The integral $\int_E\omega$ is defined in two steps: First, assume $E\subset U_\psi$ for
some $\psi\in\mathcal{A}$. Then we define
\[ \int_E\omega=\int_{\psi(E)}(\psi^{-1})^*\omega. \]
From the Change of Variable Formula it follows that this integral (if it makes
sense) does not depend on the choice of $\psi$.
The second step is based on the partition of unity. Let $\{\chi_\beta\}_{\beta\in T}$ be a partition
of unity on $(\Omega,\mathcal{A})$.
For any measurable set $E\subset\Omega$ denote
\[ I(E)=\sum_{\beta\in T}\int_{E\cap\{\chi_\beta>0\}}\chi_\beta\,\omega \]
(the expression $I(E)$ does not necessarily make sense). We say that the integral
$\int_E\omega$ converges, or that the differential form $\omega$ is integrable on $E$, if for any measurable
set $E'\subset E$ the expression $I(E')$ makes sense and is a finite number,
and for any sequence $\{E_q\}$ of pairwise disjoint measurable sets $E_q\subset E$, the series
\[ \sum_{q=1}^\infty I(E_q) \]
converges (absolutely). If the integral $\int_E\omega$ converges, we set
\[ \int_E\omega=I(E). \]
Since we were careful enough, the value of the integral defined in this way depends neither
on the partition of $E$, nor on the partition of unity. On the other hand, it
depends on the orientation of $\Omega$: The reverse orientation reverses
the sign of the integral.
Let $G\subset\mathbb{R}^k$ be an open set and $\varphi:G\to\Omega$ a diffeomorphism. We say that $\varphi$
is a positive parametrization if all superpositions $\psi\circ\varphi$, $\psi\in\mathcal{A}$, have a positive
Jacobian. For positive parametrizations the following change of variable formula
is valid:
\[ \int_{\varphi(G)}\omega=\int_G\varphi^*\omega, \]
provided either of these integrals exists.

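39.19. Example. Let $\Omega$ be the set of all decreasing solutions of the differential equation
$$(x'(s))^2 - x''(s)\,x(s) = 0$$
satisfying the condition $0 < x(0) < 1$. Let $\mathcal{A} = \{\alpha : \Omega \to \mathbb{R}^2$ : there are $a, b \in \mathbb{R}$ such that $a < b$ and $\alpha(x) = [x(a), x(b)]$ for all $x \in \Omega\}$. Then $(\Omega, \mathcal{A})$ is a two-dimensional oriented manifold and $\varphi : t \mapsto e^{t_2 s + t_1}$, $t \in (-\infty, 0)^2$, is a positive parametrization of $\Omega$. The tangent space $T_x(\Omega)$, $x = \varphi(t)$, is representable as a two-dimensional vector space of functions generated by the functions $\frac{\partial \varphi}{\partial t_1}(t) : s \mapsto e^{t_2 s + t_1}$ and $\frac{\partial \varphi}{\partial t_2}(t) : s \mapsto s\,e^{t_2 s + t_1}$. For each $\tau \in \mathbb{R}$, let $\pi_\tau$ be the function on $\Omega$ defined as $\pi_\tau(x) = x(\tau)$. Then $\omega := d\pi_0 \wedge d\pi_1$ is a differential form which associates with $u_1, u_2 \in T_x(\Omega)$ the number
$$\det \begin{pmatrix} u_1(0) & u_2(0) \\ u_1(1) & u_2(1) \end{pmatrix}.$$
We have
$$\varphi^{*}\omega = d(e^{t_1}) \wedge d(e^{t_1 + t_2}) = e^{t_1}\,dt_1 \wedge (e^{t_1 + t_2}\,dt_1 + e^{t_1 + t_2}\,dt_2) = e^{2t_1 + t_2}\,dt_1 \wedge dt_2 ,$$
so that
$$\int_\Omega \omega = \int_{(-\infty, 0)^2} e^{2t_1 + t_2}\,dt_1\,dt_2 = \frac{1}{2}.$$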
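The value 1/2 obtained in Example 39.19 can be checked numerically. The following is only an illustrative sketch: the function name, the cutoff replacing the unbounded domain and the midpoint rule are choices made here, not part of the text.

```python
import math

def truncated_integral(cutoff=40.0, n=4000):
    """Midpoint-rule approximation of the integral of exp(2*t1 + t2)
    over the square (-cutoff, 0)^2, used as a stand-in for (-inf, 0)^2."""
    h = cutoff / n
    # The integrand factorizes, so the double integral is a product of two 1-D sums.
    s1 = sum(math.exp(-2.0 * (i + 0.5) * h) for i in range(n)) * h
    s2 = sum(math.exp(-(j + 0.5) * h) for j in range(n)) * h
    return s1 * s2

if __name__ == "__main__":
    print(truncated_integral())  # close to 0.5
```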

39.20. Introduction to General Stokes Theorem on Manifolds. Let


(X , A ) be a k-dimensional oriented manifold and be a compact subset
of X . Let (, B) be a (k 1)-dimensional oriented manifold. We suppose that
for each point z there exist an open set G Rk , a neighborhood U
of the point z and a homeomorphic mapping : G Hk , such that
z (Hk ), (G Hk ) = U , (G Hk ) = U and one of the following
cases occurs:
(a) |GHk is a positive parametrization of U and |GHk i is a positive
parametrization of U .
(b) |GHk is a negative parametrization of U and |GHk i is a negative
parametrization of U .
(Recall that, by the conventions of this chapter, any parametrization is a diffeomorphism.)
39.21. General Stokes Theorem on Manifolds. Let $\omega$ be a $C^1$ differential $(k-1)$-form on $\mathcal{X}$. Then
$$\int_{\partial\Omega} \omega = \int_{\Omega} d\omega .$$

39.22. Existence of the Partition of Unity. Let (, ) be a k-dimensional (topological)


manifold. We introduce a temporary term of an admissible family of functions for a system
{f } T of nonnegative functions on which satises the following conditions: For every
from the index set
there exists
such that {f > 0} U , and f 1 is an innitely
dierentiable function. Further, each point x has a neighborhood V with a nite set
such that f = 0 on V provided
/ V . The only requirement on partition of unity
V
being not satised by an admissible family of function is that the sum is 1. However, if {f } T
is an admissible family of function whose sum S is positive, then {f /S} T is a partition of
is
, or locally Lipschitz,
unity. Its quality depends on the quality of the atlas. If the atlas
then also the partition of unity will have the same property.


In the next step let K W be subsets of , K


an admissible family of function (even a nite one)
outside W . For each point a K we nd a
B(a (a), 2ra ) a (Ua W ). Now set
(
2
2
e1/(|a (x)a (a)| ra )
fa (x) =
0

compact, W open. We show that there is


whose sum is positive on K and vanishing
and a radius ra > 0 so that a Ua and
if x 1
a (U (a (x), ra )),
in remaining cases.

Then {{fa > 0} : a K} is a covering of K and taking into account the compactness of K we
can select a nite set {fa1 , . . . , fam } forming an admissible family of functions, whose sum is
positive on K. We have yet solved the existence of the partition of unity provided is compact.
Recall that, by our denitions, is a metrizable space. If is connected, then the Topological
Lemma 39.23 yields the existence of compact sets Kq and open sets Wq (q N) such that

S
Kq Wq , =
Kq , and each point has a neighborhood which intersects only a nite
q=1

number of sets Wq . For each couple (Kq , Wq ) we nd an admissible family of functions by the
preceding procedure. The union with respect to q will be an admissible family of functions
whose sum is positive on . This solves the existence problem if is connected. If is not
connected, then its topological components are connected submanifolds . By a simple union
of partitions of unity on components we obtain the partition of unity on .
39.23. Topological Lemma. Let $(P, \rho)$ be a connected locally compact metric space. Then there exist a sequence $\{K_q\}$ of compact subsets of P and a sequence $\{W_q\}$ of open subsets of P such that $K_q \subset W_q$, $P = \bigcup_{q=1}^{\infty} K_q$ and each point of P has a neighborhood intersecting only finitely many sets $W_q$.
Proof. We may assume that P is not compact, for otherwise there is nothing to prove. Further,
we consider an equivalent metric in which P is bounded. Given x P , there is a radius r(x)
such that B(x, 2r(x)) is compact and B(x, 4r(x)) is not compact. We construct recursively a
sequence {Vq } of open relatively compact subsets of P . Choose x0 P and set V1 = U (x0 , 2r0 ).
Assume that V1 , . . . , Vq were already constructed. Thanks to compactness of Vq it follows that
there is a nite system {U (xj , rj )} of balls selected from {U (x, r(x)) : x V q } such that it
S
covers V q . Set Vq+1 = U (xj , 2rj ). The resulting sequence satises V q Vq+1 . We prove
j

by a contradiction that V :=

Vq = P . Suppose V
= P . Since P is connected, there exists

q=1

z V . Set R = r(z)/3, nd x U (z, R) V and q such that x Vq . Further nd y Vq and


/ V , it follows that (z, y) 2r.
r = r(y) such that x U (y, r) and U (y, 2r) Vq+1 . Since z
We have
2r (y, z) (y, x) + (x, z) r + R,
thus r R. If t B(y, 4r), then
(t, z) (t, y) + (y, z) 4r + r + R 6R,
so that $B(y, 4r(y)) \subset B(z, 2r(z))$. This is a contradiction, because the ball $B(y, 4r(y))$ is not compact and the ball $B(z, 2r(z))$ is compact. We have proved that $V = P$. To finish the proof it is enough to set $K_1 = \overline{V}_1$, $W_1 = V_2$, $W_2 = V_3$, $K_q = \overline{V}_q \setminus V_{q-1}$ for $q \ge 2$ and $W_q = V_{q+1} \setminus \overline{V}_{q-2}$ for $q \ge 3$.
39.24. Notes. The modern theory of manifolds is based on ideas of G. F. B. Riemann.
The roots of the topics of Chapter G go back to the 19th century and are connected with
famous names of outstanding mathematicians. From an extensive bibliography we recommend

II], H. Federer
M. Berger and B. Gostiaux [*1988], L. Bocek [Bo
c], I. Cern
y and J. Mark [CM
[*1969], W. Fleming [*1965], O. Kowalski [Kow], L. Krump, V. Soucek and J. A. T
esnsk
y [KST],
F. Moran [*1988], R. Sikorski [Sik], L. Simon [*1983].


H. Vector Integration
40. Measurable Functions
In many branches of analysis we need to integrate functions having values in vector spaces. Throughout this chapter we consider the case when $(\Omega, \mathcal{S}, \mu)$ is a measure space and X is a Banach space. Having a mapping f from $\Omega$ to X (a vector function) we would like to define an integral $\int_\Omega f\,d\mu$ in a reasonable way. In principle, there are two possibilities:
(1) to utilize the real (or complex) case arising from the compositions $\varphi \circ f$, where $\varphi$ varies over the set of all functionals of the dual space $X^*$,
(2) to try to choose an appropriate definition of the Lebesgue integral suited for the vector case.
Note that both methods are also common in other branches of analysis.
In this rather short and informative chapter we will follow both ways of defining vector integrals. Note that basic knowledge of the main topics of functional analysis (like the Hahn-Banach theorem, the Riesz-Fréchet representation theorem of bounded linear functionals on Hilbert spaces, or the notion of a reflexive space) is necessary for a good understanding of vector integration.
It seems that the first attempt to define a vector integral of the Riemann type is due to Graves in 1927. His definition is only a suitably modified original Riemann definition and its main idea is explained in Exercise 43.7.
In what follows, X will denote a Banach space and $(\Omega, \mathcal{S}, \mu)$ will be a measure space, where $\mu$ is supposed to be a complete probability measure ($\mu(\Omega) = 1$).
First of all we concentrate on the notion of measurable functions.
40.1. Measurable Functions. A function $f : \Omega \to X$ is termed
• simple provided there exist $x_1, \dots, x_n \in X$ and $E_1, \dots, E_n \in \mathcal{S}$ such that $f = \sum_i x_i c_{E_i}$,
• measurable if there exists a sequence $\{f_n\}$ of simple functions such that $\lim_n f_n(\omega) = f(\omega)$ for $\mu$-almost all $\omega \in \Omega$ (i.e. if $f_n(\omega)$ converge to $f(\omega)$ in the norm of the space X for $\mu$-almost all $\omega \in \Omega$),
• weakly or scalarly measurable if the functions $\varphi \circ f$ are measurable for each continuous linear functional $\varphi \in X^*$.
40.2. Remarks.
1. Note that we dened measurable functions according to the characterization given in Exercise 5.7. It is clear that the common denition of measurability
for each X) cannot be used in the case of vector functions.
({ : f () < }
There are other equivalent denitions of measurability of real functions like those in Exercise
3.6.a ({ : f () B}
for each open, or Borel set B R, respectively) suited to
the case of vector functions. Having the last denition in mind the class of all measurable
functions coincides with the class of all measurable functions (and also with the class of all
weakly measurable functions according to Pettis theorem) provided X is separable. In a nonseparable case the sum of two measurable functions according to the last denition need not
be evenmeasurable.
2. Assume $f_n$ are measurable, $f_n \to f$ $\mu$-almost everywhere and $\alpha \in \mathbb{R}$. Prove that the functions $f_1 + f_2$, $\alpha f_1$, $f$ are also measurable. The same is true for weakly measurable functions.

The relationship between measurable and weakly measurable functions is described in the next theorem.


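40.3. Pettis Theorem. A function $f : \Omega \to X$ is measurable if and only if f is weakly measurable and there is a $\mu$-null set $E \in \mathcal{S}$ such that $f(\Omega \setminus E)$ is a separable subset of X. In particular, if X is separable the notions of measurability and weak measurability coincide.
Proof. Suppose $f_k$ are simple functions, $\mu(E) = 0$ and $f_k \to f$ on $\Omega \setminus E$. Since $f(\Omega \setminus E)$ is a subset of the closure of $\bigcup_k f_k(\Omega)$ and $f_k(\Omega)$ are finite sets, it follows that $f(\Omega \setminus E)$ is separable.
The proof that $\varphi \circ f$ is measurable provided $\varphi \in X^*$ is easy. Indeed, the assertion is true if f is a simple function and for the general case we pass to the limit.
Now assume that f is weakly measurable and that the set $f(\Omega \setminus E)$ is separable for some $E \in \mathcal{S}$, $\mu(E) = 0$. In the first step of the proof we show that the function $\omega \mapsto \|f(\omega)\|$ is measurable. To this end let $\{x_n\} \subset f(\Omega \setminus E)$ be a dense countable set. By the Hahn-Banach theorem there are $\varphi_n \in X^*$, $\|\varphi_n\| = 1$, such that $\varphi_n(x_n) = \|x_n\|$. It remains to show that
$$\|f(\omega)\| = \sup_n |\varphi_n(f(\omega))|$$
for each $\omega \in \Omega \setminus E$. Obviously, $|\varphi_n(f(\omega))| \le \|\varphi_n\|\,\|f(\omega)\| = \|f(\omega)\|$. To prove the reverse inequality, let $\omega \in \Omega \setminus E$ and $\varepsilon > 0$ be given. There is $n \in \mathbb{N}$ such that $\|f(\omega) - x_n\| < \varepsilon$. Then
$$\bigl|\,\|f(\omega)\| - \varphi_n(f(\omega))\bigr| \le \bigl|\,\|f(\omega)\| - \|x_n\|\,\bigr| + \bigl|\,\|x_n\| - \varphi_n(x_n)\bigr| + |\varphi_n(x_n) - \varphi_n(f(\omega))|$$
$$\le \|f(\omega) - x_n\| + |\varphi_n(x_n - f(\omega))| \le \varepsilon + \|\varphi_n\|\,\|x_n - f(\omega)\| < 2\varepsilon .$$
Similarly, we can prove that the functions $g_n : \omega \mapsto \|f(\omega) - x_n\|$ are measurable. Now fix $k \in \mathbb{N}$, put $E_n^k = \{\omega \in \Omega : g_n(\omega) < \frac{1}{k}\}$ (obviously $E_n^k \in \mathcal{S}$) and define
$$h_k(\omega) = \begin{cases} x_n & \text{if } \omega \in E_n^k \setminus \bigcup_{j<n} E_j^k, \\ 0 & \text{otherwise.} \end{cases}$$
If $\omega \in \Omega \setminus E$, then $\|f(\omega) - h_k(\omega)\| < \frac{1}{k}$. Thus, we have shown that there is a sequence $\{h_k\}$ such that $h_k \to f$ $\mu$-almost everywhere and each function $h_k$ has a countable range. Since such functions are measurable, the assertion easily follows.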
To get a good understanding of vector integration we introduce examples. Before proceeding let us agree on the following notation:
• $c_0$ is the Banach space of all real sequences $x = \{x_n\}$ satisfying $x_n \to 0$, equipped with the sup-norm $\|x\| = \max_n |x_n|$,
• $l^p$, $1 \le p < \infty$, is the space of all sequences $x = \{x_n\}$ for which $\|x\|_p := \bigl(\sum_n |x_n|^p\bigr)^{1/p} < \infty$,
• $l^\infty$ will denote the space of all bounded sequences $x = \{x_n\}$ equipped with the norm $\|x\|_\infty := \sup_n |x_n|$,
• $l^2([0, 1])$ is the Hilbert space of all real functions f on [0, 1] vanishing off a countable set such that $\sum_{t \in [0,1]} |f(t)|^2 < \infty$, equipped with the inner product $(f, g) = \sum_t f(t)g(t)$.
40.4. Example. Consider $X = l^2([0, 1])$ and the measure space $([0, 1], \mathcal{M}, \lambda)$. Let $\{e_t : t \in [0, 1]\}$ be the usual orthonormal basis of X ($e_t(x) = 1$ for $x = t$, $e_t(x) = 0$ otherwise). Define the mapping $f : [0, 1] \to X$ as $f(t) = e_t$. If $\varphi \in X^*$, the Riesz-Fréchet representation theorem on Hilbert spaces implies the existence of a uniquely determined $a \in X$ such that $\varphi(x) = (x, a)$ for every $x \in X$. Hence $\varphi(f(t)) = (e_t, a) = a(t)$ and the set $\{t \in [0, 1] : (e_t, a) \ne 0\}$ is countable. We can see that $\varphi \circ f = 0$ almost everywhere and therefore f is a weakly measurable function. On the other hand, $\|e_t - e_s\| = \sqrt{2}$ for $t \ne s$. It follows that $f([0, 1] \setminus E) = \{e_t : t \in [0, 1] \setminus E\}$ is separable if and only if the set $[0, 1] \setminus E$ is countable. Thus, there is no set $E \subset [0, 1]$ of measure zero for which $f([0, 1] \setminus E)$ would be separable and Pettis theorem implies that f is not measurable. (Further examples can be found in Exercises 43.7.e and f.)
40.5. Notes. The theory of vector measures and integration was developed in an essential way during the 1930s. The famous Theorem 40.3 appears in B.J. Pettis [1938].

41. Vector Measures


In this chapter we touch briefly on measures whose values are in a given Banach space X. Before doing this we take notice of the convergence of series in Banach spaces.
41.1. Absolute and Unconditional Convergence. Let $\{x_n\}$ be a sequence of elements of a Banach space X. We say that a (formal) series $\sum x_i = \sum_{i=1}^{\infty} x_i$ is
• convergent if there is a limit $\lim_n \sum_{i=1}^{n} x_i$ (we write $\sum_{i=1}^{\infty} x_i = \lim_n \sum_{i=1}^{n} x_i$),
• absolutely convergent if $\sum \|x_i\| < +\infty$,
• unconditionally convergent to $x \in X$ if $\sum_n x_{P(n)} = x$ whenever P is a one-to-one mapping of $\mathbb{N}$ onto $\mathbb{N}$.
Each absolutely convergent series converges (X is a complete space!), even unconditionally. On the other hand, every unconditionally convergent series converges absolutely provided $\dim X < +\infty$ (Riemann's theorem), while in infinite-dimensional spaces this assertion is no longer true (consider the example $x_n = (0, \dots, 0, \frac{1}{n}, 0, 0, \dots) \in c_0$).
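The example $x_n = (0, \dots, 0, \frac{1}{n}, 0, \dots) \in c_0$ can be explored numerically. The sketch below (function names are chosen here purely for illustration) compares the sup-norm of a block of the series, which does not depend on the order of summation, with the partial sums of the norms, which form the divergent harmonic series.

```python
def tail_sup_norm(m, n):
    """Sup-norm in c_0 of the block x_{m+1} + ... + x_n, where x_k = e_k / k."""
    # The block has entry 1/k at position k for k = m+1, ..., n.
    return max(1.0 / k for k in range(m + 1, n + 1))

def sum_of_norms(n):
    """Partial sum ||x_1|| + ... + ||x_n|| = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

if __name__ == "__main__":
    print(tail_sup_norm(99, 1000))   # small: blocks of the series tend to 0 in norm
    print(sum_of_norms(10**5))       # grows without bound as n increases
```

Since the norm of any finite block equals the reciprocal of its smallest index, every rearrangement satisfies the Cauchy condition, whereas the series of norms diverges.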
In what follows, $(\Omega, \mathcal{S}, \mu)$ will stand for a fixed measure space and X a given Banach space.
41.2. Vector Measures. A vector-valued set function $F : \mathcal{S} \to X$ is called an additive ($\sigma$-additive) vector measure if $F(\emptyset) = 0$ and
$$F\Bigl(\bigcup_n E_n\Bigr) = \sum_n F(E_n)$$
for every finite (countable) sequence of pairwise disjoint sets $E_n \in \mathcal{S}$. In the case of $\sigma$-additive measures the convergence of the series in the definition is understood in the sense above. Realize that this convergence is even unconditional. Any $\sigma$-additive vector measure will be called simply a vector measure.
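41.3. Examples. In all examples, $(\Omega, \mathcal{S}, \mu)$ stands for the measure space $([0, 1], \mathcal{M}, \lambda)$.
1. If $X = L^p([0, 1])$, $1 \le p < \infty$, and $F : E \mapsto c_E$ for $E \in \mathcal{M}$, then F is a $\sigma$-additive vector measure.
2. Let L be a continuous linear operator from $L^1[0, 1]$ into a Banach space X. If $F(E) := L(c_E)$ for $E \in \mathcal{M}$, then F is again a $\sigma$-additive vector measure. Indeed, a moment's reflection shows that
$$\Bigl\| F\Bigl(\bigcup_{j=1}^{\infty} E_j\Bigr) - \sum_{j=1}^{n} F(E_j) \Bigr\| = \Bigl\| F\Bigl(\bigcup_{j=n+1}^{\infty} E_j\Bigr) \Bigr\| \le \lambda\Bigl(\bigcup_{j=n+1}^{\infty} E_j\Bigr)\,\|L\| .$$
3. Let $T : L^\infty[0, 1] \to X$ be a continuous linear operator, $F(E) := T(c_E)$ for $E \in \mathcal{M}$. Then F is an additive vector measure which may not be $\sigma$-additive. To see an example, let T be the Hahn-Banach extension of the functional $\varphi : x \mapsto x(\frac{1}{2})$ from $\mathcal{C}[0, 1]$ to $L^\infty[0, 1]$ (notice that $X = \mathbb{R}$!).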

41.4. Absolute Continuity. An additive vector measure $F : \mathcal{S} \to X$ is said to be absolutely continuous with respect to a measure $\mu$ on $\mathcal{S}$ if for any $\varepsilon > 0$ there exists $\delta > 0$ such that $\|F(E)\| < \varepsilon$ whenever $\mu(E) < \delta$.
41.5. Theorem (Pettis). A $\sigma$-additive vector measure F on $\mathcal{S}$ is absolutely continuous with respect to a finite measure $\mu$ on $\mathcal{S}$ if and only if $F(E) = 0$ whenever $\mu(E) = 0$.
Proof. The necessity is obvious. For the proof of the converse we can use reasoning analogous to that in Exercise 8.22.b: We assume the existence of an $\varepsilon > 0$ and a sequence $\{E_n\} \subset \mathcal{S}$ for which
$$\|F(E_n)\| \ge \varepsilon \quad \text{and} \quad \mu(E_n) < 2^{-n} .$$
To reach the contradiction consider the compositions $\varphi \circ F$ where $\varphi \in X^*$. Now to complete the proof we need uniform estimates with respect to $\varphi$ and this is more difficult than the conclusion in Exercise 8.22.b.
41.6. Exercise. Let $\mu$ be the counting measure on $\mathbb{N}$ and $X = l^2$. Show that $F : E \mapsto \{\frac{1}{n} c_E(n)\}$, $E \subset \mathbb{N}$, is a $\sigma$-additive vector measure.
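As a numerical illustration of Exercise 41.6 (a sketch only; the function name and the finite cutoff standing in for an infinite tail are assumptions made here), one can watch the $l^2$-norm of $F(\{n+1, n+2, \dots\})$ decrease to zero, which is what $\sigma$-additivity requires for the disjoint sets $E_j = \{j\}$.

```python
import math

def tail_norm(n, cutoff=10**5):
    """l^2-norm of F({n+1, ..., cutoff}) for F(E) = {c_E(k) / k}."""
    return math.sqrt(sum(1.0 / (k * k) for k in range(n + 1, cutoff + 1)))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, tail_norm(n))  # the values decrease roughly like 1/sqrt(n)
```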
41.7. Variation of Vector Measures. Let $F : \mathcal{S} \to X$ be a vector measure. According to Theorem 6.9 we define the variation of F as
$$|F|(E) := \sup \Bigl\{ \sum_{k=1}^{n} \|F(A_k)\| : A_k \in \mathcal{S},\ A_i \cap A_j = \emptyset \text{ for } i \ne j,\ \bigcup_{k=1}^{n} A_k = E \Bigr\} .$$
If $|F|(\Omega) < +\infty$, we say that F is a vector measure of bounded variation.
(a) Show that the variation of a vector measure is a (nonnegative) measure.
(b) Examine the vector measures of the previous examples and decide whether or not they are of bounded variation.

42. The Bochner Integral


42.1. Bochner Integral. We say that a vector function $f : \Omega \to X$ is Bochner integrable if it is measurable and there exists a sequence of simple functions $\{f_n\}$ such that $\int_\Omega \|f - f_n\|\,d\mu \to 0$.
A few remarks should now be added. Recall (cf. the proof of Pettis theorem, or Exercise 42.5) that the (real) functions $\|f - f_n\|$ are measurable. Further, the limit $\lim_n \int_B f_n\,d\mu$ exists for any $B \in \mathcal{S}$ since X is complete and the estimates
$$\Bigl\| \int_B f_n - \int_B f_k \Bigr\| \le \int_\Omega \|f_n - f_k\| \le \int_\Omega \|f_n - f\| + \int_\Omega \|f_k - f\|$$
hold. Note also that $\int_B \psi\,d\mu := \sum x_i\,\mu(E_i \cap B)$ when $\psi = \sum x_i c_{E_i}$ is a simple function and that this definition does not depend on the expression of $\psi$ as $\sum x_i c_{E_i}$, which is not unique.
Moreover, if $\int_B \|f - f_n\| \to 0$ and $\int_B \|f - g_n\| \to 0$, $B \in \mathcal{S}$ (where $f_n$, $g_n$ are simple functions), then $\lim \int_B f_n = \lim \int_B g_n$ (consider the sequence $f_1, g_1, f_2, g_2, \dots$).
Now if f is Bochner integrable, $B \in \mathcal{S}$ and $\{h_n\}$ is a sequence of simple functions satisfying $\int_B \|f - h_n\| \to 0$, the limit $\lim \int_B h_n\,d\mu$ exists and is independent of the sequence $\{h_n\}$. This limit (which is an element of our Banach space X) is called the Bochner integral of f and it is denoted by $\int_B f\,d\mu$. The fundamental characterization of Bochner integrable functions is given in the next theorem.
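The formula $\int_B \psi\,d\mu = \sum x_i\,\mu(E_i \cap B)$ for simple functions, and its independence of the chosen representation, can be illustrated by a small sketch. The setting is an assumption made here for concreteness: $X = \mathbb{R}^2$, $\mu$ is Lebesgue measure on [0, 1], all sets are intervals, and the function names are hypothetical.

```python
def interval_overlap(a, b):
    """Lebesgue measure of the intersection of the intervals a = (a0, a1) and b = (b0, b1)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def integrate_simple(terms, B):
    """Integral over the interval B of the simple function sum_i x_i * c_{E_i}.
    terms: list of pairs (x_i, E_i), x_i a tuple (vector), E_i an interval."""
    dim = len(terms[0][0])
    total = [0.0] * dim
    for x, E in terms:
        w = interval_overlap(E, B)      # mu(E_i ∩ B)
        for d in range(dim):
            total[d] += x[d] * w
    return tuple(total)

# Two different representations of the same simple function.
rep1 = [((1.0, 0.0), (0.0, 0.5)), ((0.0, 2.0), (0.5, 1.0))]
rep2 = [((1.0, 0.0), (0.0, 0.25)), ((1.0, 0.0), (0.25, 0.5)),
        ((0.0, 1.0), (0.5, 1.0)), ((0.0, 1.0), (0.5, 1.0))]

if __name__ == "__main__":
    print(integrate_simple(rep1, (0.25, 0.75)))
    print(integrate_simple(rep2, (0.25, 0.75)))
```

Both calls return the same vector, in line with the remark that the definition does not depend on the expression of the simple function.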
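42.2. Theorem (Bochner). A measurable function $f : \Omega \to X$ is Bochner integrable if and only if $\int_\Omega \|f\|\,d\mu < \infty$ (i.e. exactly when the (real) function $\|f\|$ is Lebesgue integrable).
Proof. If f is Bochner integrable, then $\|f\|$ is measurable (cf. Exercise 42.5). Now the assertion follows from the estimate
$$\int_\Omega \|f\| \le \int_\Omega \|f - f_n\| + \int_\Omega \|f_n\| ,$$
where $\{f_n\}$ is a sequence of simple functions satisfying $\int_\Omega \|f - f_n\| \to 0$.
For the converse, suppose $\|f\|$ is integrable. There are simple functions $f_n$ tending to f $\mu$-almost everywhere. Set
$$g_n(\omega) = \begin{cases} f_n(\omega) & \text{if } \|f_n(\omega)\| \le 2\|f(\omega)\|, \\ 0 & \text{otherwise in } \Omega . \end{cases}$$
Obviously, $g_n$ are again simple functions. Moreover,
$$\|g_n(\omega)\| \le 2\|f(\omega)\| , \qquad \|f(\omega) - g_n(\omega)\| \to 0$$
for $\mu$-almost all $\omega \in \Omega$. Since
$$\|f(\omega) - g_n(\omega)\| \le \|f(\omega)\| + \|g_n(\omega)\| \le 3\|f(\omega)\| ,$$
an appeal to the Lebesgue dominated convergence theorem 8.13 shows
$$\int_\Omega \|f - g_n\| \to 0 .$$
Denoting by $\mathcal{L}^1_X$ (or, more precisely, $\mathcal{L}^1_X(\Omega, \mathcal{S}, \mu)$) the space of all Bochner integrable functions we see that $f \in \mathcal{L}^1_X$ if and only if $\|f\| \in \mathcal{L}^1$. Basic properties of the Bochner integral are summarized in the next theorem.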

42.3. Theorem.
(a) If $f \in \mathcal{L}^1_X$ and $E \in \mathcal{S}$, then $\bigl\| \int_E f \bigr\| \le \int_E \|f\|$.
(b) The space $\mathcal{L}^1_X$ equipped with the norm $\|g\|_1 = \int_\Omega \|g\|\,d\mu$ is complete. In other words, identifying functions which are equal $\mu$-almost everywhere, $\mathcal{L}^1_X$ is a Banach space.
(c) If $F(E) := \int_E f\,d\mu$ denotes the indefinite Bochner integral, then F is a $\sigma$-additive vector measure absolutely continuous with respect to $\mu$. Moreover, if $\{E_n\}$ is a sequence of pairwise disjoint sets from $\mathcal{S}$, then the series $\sum_n F(E_n)$ converges absolutely.

Sketch of the proof. (a) By the triangle inequality, the assertion is true for simple functions. For the general case, pass to the appropriate limit appealing to the Lebesgue dominated convergence theorem (which is valid even in the vector case).
(b) The proof is the same as in the case of real functions.
(c) Without any difficulty you can show that the indefinite Bochner integral is (finitely) additive. If $E_n \in \mathcal{S}$ are pairwise disjoint, then the series $\sum F(E_n)$ is absolutely convergent
$$\Bigl( \sum_n \|F(E_n)\| \le \sum_n \int_{E_n} \|f\| = \int_{\bigcup E_n} \|f\| \le \int_\Omega \|f\| < +\infty \Bigr)$$
and
$$\Bigl\| F\Bigl(\bigcup_{n=1}^{\infty} E_n\Bigr) - \sum_{n=1}^{k} F(E_n) \Bigr\| = \Bigl\| F\Bigl(\bigcup_{n=k+1}^{\infty} E_n\Bigr) \Bigr\| .$$
The assertion now follows since $\lim_k \mu\bigl(\bigcup_{n=k+1}^{\infty} E_n\bigr) = 0$ and the measure $E \mapsto \int_E \|f\|\,d\mu$ is absolutely continuous with respect to $\mu$ (Exercise 8.22.b). Indeed, given $\varepsilon > 0$ there is $\delta > 0$ such that $\bigl\| \int_E f \bigr\| \le \int_E \|f\| < \varepsilon$ whenever $\mu(E) < \delta$.

42.4. Remark. We have just seen that the indefinite Bochner integral $E \mapsto \int_E f\,d\mu$ is a $\sigma$-additive vector measure which is absolutely continuous with respect to $\mu$. Moreover, it is simply checked that this measure is of bounded variation. As in the real case, a question arises whether each $\sigma$-additive X-valued vector measure of bounded variation which is absolutely continuous with respect to $\mu$ can be expressed as an indefinite Bochner integral of a Bochner integrable function. Thus there is a question whether or not the Radon-Nikodým theorem holds for vector measures. The answer is negative. The vector measure of Example 41.3.1 is absolutely continuous with respect to Lebesgue measure, in the case $p = 1$ it is of bounded variation and still it is not the indefinite Bochner integral of any Bochner integrable function.

We say that a Banach space X has the Radon-Nikodým property (shortly, RNP) if the Radon-Nikodým theorem holds for any X-valued vector measure. More precisely, whenever $(\Omega, \mathcal{S}, \mu)$ is a probability measure space and $\nu : \mathcal{S} \to X$ is a vector measure of bounded variation which is absolutely continuous with respect to $\mu$, then $\nu$ is an indefinite Bochner integral of a Bochner integrable function.
Consequently, the space $L^1[0, 1]$ does not have the RNP. Neither do the spaces $c_0$, $l^\infty$, $\mathcal{C}(K)$ (K infinite compact) have the RNP. On the other hand, any reflexive Banach space has the RNP.
42.5. Exercise. If $f : \Omega \to X$ is measurable, then the real function $\|f\|$ is measurable. (In fact, we proved this assertion in the course of the proof of Pettis theorem 40.3.) Prove this assertion directly.
Hint. The assertion is obvious for simple functions. Now, if $f_n$ are simple and $f_n \to f$ $\mu$-almost everywhere, it follows that $\|f_n\| \to \|f\|$ $\mu$-almost everywhere.

42.6. Exercise. Let f, g be measurable functions, $\|f\| \le \|g\|$ $\mu$-almost everywhere and let g be Bochner integrable. Show that f is Bochner integrable.
42.7. Exercise. Show that the indefinite Bochner integral is a vector measure of bounded variation.

42.8. Notes. The Bochner integral was studied by S. Bochner [1933] and N. Dunford [1935]. Today, it is used as a tool in the theory of function spaces, in the study of evolution PDEs and differential equations in Banach spaces, and in some problems of the geometry of Banach spaces.

43. The Dunford and Pettis Integrals


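Most of this chapter is devoted to weak integrals in Banach spaces. The following Dunford's lemma is of great importance.
43.1. Dunford's Lemma. Given a vector function $f : \Omega \to X$ such that $\varphi \circ f \in \mathcal{L}^1(\mu)$ for every $\varphi \in X^*$, then for any set $E \in \mathcal{S}$ there exists an element $L_E \in X^{**}$ such that
$$L_E(\varphi) = \int_E \varphi \circ f\,d\mu$$
for each $\varphi \in X^*$.
Proof. Obviously, f is weakly measurable. Fix $E \in \mathcal{S}$ and define $L_E : \varphi \mapsto \int_E \varphi \circ f$ for $\varphi \in X^*$. Without doubt $L_E$ is a linear functional and we have to show that $L_E$ is bounded. To this end, define the mapping $T : \varphi \mapsto \varphi \circ (f c_E)$ ($X^* \to \mathcal{L}^1(\mu)$). We will finish the proof by showing that T has a closed graph. Indeed, then the closed graph theorem implies that T is bounded (the spaces $X^*$ and $L^1(\mu)$ are complete!) and
$$|L_E(\varphi)| = \Bigl| \int_E \varphi \circ f \Bigr| = \Bigl| \int_\Omega \varphi \circ (f c_E) \Bigr| \le \|\varphi \circ (f c_E)\|_1 = \|T\varphi\|_1 \le \|T\|\,\|\varphi\| ,$$
which shows that $L_E \in X^{**}$.
Now let us see why T is closed. Assume that $\varphi_n \to \varphi$, $T\varphi_n \to g$. By Theorem 12.4 or Exercise 10.9 we can find a subsequence $\{\varphi_{n_k}\}$ such that $T\varphi_{n_k} \to g$ almost everywhere (i.e. $\varphi_{n_k} \circ (f c_E) \to g$ $\mu$-almost everywhere). Since $\varphi_n \circ (f c_E) \to \varphi \circ (f c_E)$ everywhere, it follows that $g = \varphi \circ (f c_E)$ $\mu$-almost everywhere and we see that $T\varphi = g$.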
43.2. Weak Integrals. A vector function $f : \Omega \to X$ is said to be Dunford integrable (or, weakly integrable) if $\varphi \circ f \in \mathcal{L}^1(\mu)$ for each $\varphi \in X^*$. The element $L_E \in X^{**}$ whose existence is guaranteed by Dunford's lemma is called the Dunford integral (according to some authors also the Gelfand integral) of f on E. Thus $L_E$ is the Dunford integral of f if
$$L_E(\varphi) = \int_E \varphi \circ f\,d\mu \quad \text{for each } \varphi \in X^* .$$
If even $L_E \in X$ for each $E \in \mathcal{S}$ (or, more precisely, $L_E \in \kappa(X) \subset X^{**}$ where $\kappa$ denotes the canonical embedding of X into $X^{**}$), then f is called Pettis integrable. Thus $P_E \in X$ is the Pettis integral of f over E if
$$\varphi(P_E) = \int_E \varphi \circ f\,d\mu \quad \text{for each } \varphi \in X^* .$$
Instead of $L_E$ we will use the notation $(D)\int_E f$ (remember that $L_E$ is an element of $X^{**}$) while $P_E$ (an element of X) will be denoted by $(P)\int_E f$. Both integrals can be identified provided X is reflexive.
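43.3. Example. Let $X = c_0$, $(\Omega, \mathcal{S}, \mu) = ([0, 1], \mathcal{M}, \lambda)$ and $f : [0, 1] \to c_0$ be defined as
$$f(t) = \{c_{(0,1]}(t),\ 2c_{(0,1/2]}(t),\ 3c_{(0,1/3]}(t),\ \dots\} .$$
For each $\varphi \in (c_0)^*$ there is a sequence $\{\alpha_n\} \in l^1$ such that $\varphi(x) = \sum_n \alpha_n x_n$ whenever $x = \{x_n\} \in c_0$. Then one has (the Lebesgue integral in consideration!)
$$\int_0^1 |\varphi \circ f| = \int_0^1 \Bigl| \sum_n \alpha_n\,n\,c_{(0,1/n]}(t) \Bigr|\,dt \le \sum_n \int_0^1 |\alpha_n|\,n\,c_{(0,1/n]} = \sum_n |\alpha_n| < +\infty .$$
It follows that f is Dunford integrable. Since
$$\int_0^1 \varphi \circ f = \sum_n \alpha_n \int_0^1 n\,c_{(0,1/n]} = \sum_n \alpha_n ,$$
we see that $(D)\int_0^1 f = \{1, 1, 1, \dots\}$. Therefore the Dunford integral of f is an element of $l^\infty = (l^1)^* = (c_0)^{**}$ determined as the functional $\{\alpha_n\} \mapsto \sum_n \alpha_n$. Consequently, f is not Pettis integrable.
Show that
$$\int_0^{1/n} \varphi \circ f = \sum_i \alpha_i \int_0^{1/n} i\,c_{[0,1/i]} = \frac{1}{n}\alpha_1 + \frac{2}{n}\alpha_2 + \dots + \frac{n}{n}\alpha_n + \alpha_{n+1} + \dots .$$
It follows that
$$(D)\int_0^{1/n} f = \Bigl\{\frac{1}{n}, \frac{2}{n}, \dots, \frac{n}{n}, 1, 1, 1, \dots\Bigr\} \in l^\infty$$
and
$$\Bigl\| (D)\int_0^{1/n} f \Bigr\| = 1 .$$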
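The coordinates of $(D)\int_0^{1/n} f$ in Example 43.3 can be reproduced exactly with rational arithmetic. This is a minimal sketch; the function name is chosen here, and it uses only the observation that the i-th coordinate equals $i \cdot \lambda\bigl((0, 1/i] \cap [0, 1/n]\bigr) = i \min(1/i, 1/n)$.

```python
from fractions import Fraction

def dunford_coordinates(n, count):
    """First `count` coordinates of (D) int_0^{1/n} f for f(t) = {i * c_(0,1/i](t)}."""
    return [i * min(Fraction(1, i), Fraction(1, n)) for i in range(1, count + 1)]

if __name__ == "__main__":
    print(dunford_coordinates(3, 6))
```

The sup-norm of the resulting sequence stays equal to 1 for every n, although the measure of $[0, 1/n]$ tends to zero.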

43.4. Remark. Having in mind the last example, notice that the indefinite Dunford integral
$$F : E \mapsto (D)\int_E f , \quad E \in \mathcal{M} ,$$
is not absolutely continuous with respect to the Lebesgue measure ($\lambda[0, 1/n] = 1/n \to 0$ in spite of $\bigl\| (D)\int_0^{1/n} f \bigr\| = 1$). Moreover, the vector measure F is not $\sigma$-additive.

The next theorem characterizes Pettis integrable functions among those which are Dunford integrable.
43.5. Pettis Theorem. For a measurable Dunford integrable function $f : \Omega \to X$ the following assertions are equivalent:
(i) f is Pettis integrable,
(ii) the indefinite Dunford integral of f is a $\sigma$-additive vector measure,
(iii) the indefinite Dunford integral of f is absolutely continuous with respect to $\mu$.

The proof of this theorem makes use of deeper theorems of functional analysis and is beyond the scope of this manuscript.
Many problems dealing with vector integration are subtle and need a deeper knowledge of Banach space theory. This is also the case of the next remarks. The reader is referred, for example, to J. Diestel and J.J. Uhl [*1977] or L. Mišík [*1989].
43.6. Remarks.
1. Any Bochner integrable function is also Pettis integrable and both integrals coincide on sets of $\mathcal{S}$.
2. Moreover, the following theorem holds: Let $f : \Omega \to X$ be a measurable Pettis integrable function. Then f is Bochner integrable if and only if the indefinite Pettis integral of f is a vector measure of bounded variation.
3. The space $c_0$ is exceptional: If X does not contain a copy of $c_0$ (i.e. there is no subspace of X topologically and algebraically isomorphic with $c_0$), then any measurable Dunford integrable function is Pettis integrable.
4. Let $f : \Omega \to X$ be a measurable function, and let $x_n \in X$ and $E_n \in \mathcal{S}$ be such that $f = \sum_{n=1}^{\infty} x_n c_{E_n}$ almost everywhere. Then
f is Pettis integrable if and only if the series $\sum x_n\,\mu(E_n \cap E)$ is unconditionally convergent for any $E \in \mathcal{S}$,
f is Bochner integrable if and only if the series $\sum x_n\,\mu(E_n \cap E)$ is absolutely convergent for any $E \in \mathcal{S}$.
5. Let C be a compact subset of a Banach space X. Let $\mu \in \mathcal{M}^1(C)$ be a probability Radon measure on C. (For any $A \subset X$ we denote by $\overline{\mathrm{co}}\,A$ the smallest closed convex subset of X containing A. It is simply checked that $\overline{\mathrm{co}}\,A$ equals the closure of the convex hull of A.) There exists a unique $z \in \overline{\mathrm{co}}\,C$ so that $\varphi(z) = \int_C \varphi\,d\mu$ for any $\varphi \in X^*$. This z is called the barycenter of $\mu$. How is this related to the Pettis integral? The answer is simple. When defining $f(x) = x$ for $x \in C$, then $z = (P)\int_C f\,d\mu$, or $z = (P)\int_C x\,d\mu$. Illustrate for the case $X = \mathbb{R}$, $C = [0, 1]$ and $\mu = \lambda$!
6. There is a generalization of the previous example giving a criterion for the existence of the Pettis integral. Let K be a compact metric space, X a Banach space and $\mu$ a probability Radon measure on K. If the mapping $f : K \to X$ is continuous, then there exists the Pettis integral of f and $(P)\int_K f\,d\mu \in \overline{\mathrm{co}}\,f(K)$.
43.7. The Graves Integral. There is a straightforward analogy of Riemann integration theory for functions having their values in a Banach space. Remember that the Riemann integral can be defined using Darboux upper and lower sums (and upper and lower integrals), or Riemann's original approach can be used for its definition. It is clear that any definition (like Darboux's one) making use of the ordering of the real line (and the notion of the least upper bound) cannot be immediately carried over to the vector case. Of course, in general Banach spaces there is no ordering. Nevertheless, in the sequel we will touch briefly on analogues of both Riemann's and Darboux's approaches.
Let X again be a Banach space, f a mapping from [0, 1] into X. Given a partition $D := \{0 = x_0 < x_1 < \dots < x_n = 1\}$ of [0, 1] and $\xi \in I(D) = \{\xi = \{\xi_i\} : \xi_i \in [x_{i-1}, x_i]\}$, set
$$\sigma(f, D, \xi) := \sum_{i=1}^{n} f(\xi_i)(x_i - x_{i-1}) .$$
The element $\sigma(f, D, \xi)$ of X is called the Riemann sum of f. Further, define the norm of a partition D as $\|D\| := \max\{x_i - x_{i-1} : i = 1, \dots, n\}$.
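A Riemann sum of a vector function can be computed coordinatewise when $X = \mathbb{R}^d$. The following sketch assumes $X = \mathbb{R}^2$ and $f(t) = (t, t^2)$, chosen here only as an illustration, and uses uniform partitions with midpoint tags; the function names are hypothetical.

```python
def riemann_sum(f, partition, tags):
    """Riemann sum sum_i f(tags[i]) * (x_i - x_{i-1}) for a vector-valued f."""
    dim = len(f(tags[0]))
    total = [0.0] * dim
    for i, tag in enumerate(tags):
        width = partition[i + 1] - partition[i]
        value = f(tag)
        for d in range(dim):
            total[d] += value[d] * width
    return tuple(total)

def uniform_midpoint_sum(f, n):
    """Riemann sum for the uniform partition of [0, 1] into n parts, tagged at midpoints."""
    partition = [i / n for i in range(n + 1)]
    tags = [(i + 0.5) / n for i in range(n)]
    return riemann_sum(f, partition, tags)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, uniform_midpoint_sum(lambda t: (t, t * t), n))
```

As the norm of the partition decreases, the sums approach the vector (1/2, 1/3), the coordinatewise Riemann integral of f.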

(a) We will say that f is Riemann integrable provided there exists $z \in X$ with the following property: For any $\varepsilon > 0$ there is $\delta > 0$ such that
$$\|\sigma(f, D, \xi) - z\| < \varepsilon$$
whenever $\|D\| < \delta$ and $\xi \in I(D)$.
If f is Riemann integrable, then the element z of the definition is uniquely determined. It will be termed the Graves (sometimes also the Riemann-Graves) integral of f and denoted by $(RG)\int_0^1 f$. As in the real case, a mapping f is Riemann integrable if and only if there exists $w \in X$ with the property: For any $\varepsilon > 0$ there is a partition $D_0$ such that
$$\|\sigma(f, D, \xi) - w\| < \varepsilon$$
whenever a partition D is finer than $D_0$ (i.e. $D \supset D_0$) and $\xi \in I(D)$. Of course, w is then the Graves integral of f.
(b) Realize that the Graves integral is a (Moore-Smith) limit of a generalized sequence $\{\sigma(f, D, \xi)\}$ when ordered either by norm or by inclusion.
(c) Any Riemann integrable X-valued function f is bounded (there exists $K > 0$ so that $\|f(t)\| < K$ whenever $t \in [0, 1]$).
In what follows, we make use of the vector integration theory for the case of the special measure space $([0, 1], \mathcal{M}, \lambda)$.
(d) If $f : [0, 1] \to X$ is Riemann integrable and $\varphi \in X^*$, then the (real) function $\varphi \circ f$ is Riemann integrable. In particular, f is weakly measurable and Pettis integrable.
Hint. Almost all assertions are obvious, even that f is Dunford integrable and $(RG)\int_0^1 f = (D)\int_0^1 f$. Since f is bounded it follows that $(D)\int_I f \in X$ for any interval $I \subset [0, 1]$ and one can see that f is Pettis integrable.
(e) Define a (vector) function $f : [0, 1] \to l^\infty[0, 1]$ (= the space of all bounded functions on [0, 1] equipped with the sup-norm) as $f(t) = c_{[0,t]}$. With the aid of the Bolzano-Cauchy condition show that f is Riemann integrable. Now (d) implies that f is weakly measurable. (Check the last assertion also directly: If $\varphi$ is a continuous linear form on $l^\infty[0, 1]$, then it is easy to see that the function $t \mapsto \varphi(f(t))$ is of bounded variation, and consequently it is measurable.) On the other hand, f is not measurable (use Pettis theorem 40.3 and realize that $\|f(s) - f(t)\| = 1$ for $s \ne t$).
(f) Another example. Let $E \subset [0, 1]$ and define $g_E : [0, 1] \to l^\infty[0, 1]$ as follows: Put $g_E(t) = c_{\{t\}}$ if $t \in E$ and let $g_E(t)$ be the zero function for $t \notin E$. Show that $g_E$ is Riemann integrable. If E is a nonmeasurable set, then the function $t \mapsto \|g_E(t)\|$ cannot be measurable. Accordingly, in this case $g_E$ is not measurable (cf. Exercise 42.5 or Pettis theorem 40.3). State conditions on a set E under which $g_E$ will be measurable.
(g) Any measurable Riemann integrable X-valued function f is Bochner integrable (and both integrals are equal).
Hint. Recall that the (real) function $\|f\|$ is bounded and measurable. Now use Bochner's characterization 42.2.
(h) Given again a partition D := {0 = x0 < x1 < ... < xn = 1} of [0, 1], set
(f, D) =

n
X

sup{f (s) f (t) : s, t [xi1 , xi ]}(xi xi1 ).

i=1

We say that f : [0, 1] X is Darboux integrable if for any > 0 there exists > 0 such that
(f, D) < whenever D is a partition and D < . Show again that instead of ordering
given by a norm an equivalent denition can be formulated using an ordering determined by
an inclusion. Prove also that any Darboux integrable function is Riemann integrable.

H. Vector Integration

201

(i) A function $f : [0, 1] \to X$ is Darboux integrable if and only if $f$ is bounded and continuous
at almost all points of $[0, 1]$ (cf. 7.9.d).
Hint. Define the oscillation of an $X$-valued function $g$ as
$$t \mapsto \omega_g(t) := \lim_{\delta \to 0+} \sup\{\|g(u) - g(v)\| : u, v \in (t - \delta, t + \delta)\}.$$
First show that $f$ is continuous at $t_0$ if and only if $\omega_f(t_0) = 0$. Further prove that the set
$\{t \in [0, 1] : \omega_f(t) \ge \varepsilon\}$ is always closed.
Any Darboux integrable function is obviously bounded. To complete the proof check that
$$\lambda\{t \in [0, 1] : \omega_f(t) \ge \tfrac{1}{n}\} = 0$$
for any $n$.
For the converse, fix $\varepsilon > 0$ and choose an open set $G \subset [0, 1]$ such that $\lambda G \cdot \sup \|f\| < \varepsilon$ and $f$ is
continuous at all points of the compact set $K := [0, 1] \setminus G$. For each $t \in K$ find the greatest $\delta_t > 0$
for which $\|f(s) - f(t)\| \le \tfrac12 \varepsilon$ for any $s \in [0, 1] \cap (t - \delta_t, t + \delta_t)$. A routine compactness argument
establishes the existence of $\delta > 0$ with the property: If $D := \{0 = x_0 < x_1 < \dots < x_n = 1\}$
is a partition of $[0, 1]$ and $\|D\| < \delta$, then each interval $[x_{i-1}, x_i]$ either is contained in $G$ or
satisfies $\|f(s) - f(t)\| \le \varepsilon$ for any couple $s, t \in [x_{i-1}, x_i]$. Having such a partition $D$, we get
$\omega(f, D) \le 3\varepsilon$.
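The final estimate of the hint can be checked as follows. This is only a sketch: $\omega(f, D)$ denotes the Darboux sum from (h), $\varepsilon$ the fixed tolerance and $\lambda$ Lebesgue measure, while the index sets $I_G$ and $I_K$ are names introduced here purely for illustration.

```latex
% I_G: indices i with [x_{i-1}, x_i] contained in G; I_K: all remaining indices
% (both names are hypothetical, introduced only for this sketch).
\begin{align*}
\omega(f, D)
 &= \sum_{i \in I_G} \sup_{s,t \in [x_{i-1}, x_i]} \|f(s) - f(t)\| \,(x_i - x_{i-1})
  + \sum_{i \in I_K} \sup_{s,t \in [x_{i-1}, x_i]} \|f(s) - f(t)\| \,(x_i - x_{i-1}) \\
 &\le 2 \sup \|f\| \cdot \lambda G + \varepsilon \sum_{i \in I_K} (x_i - x_{i-1})
  \;\le\; 2\varepsilon + \varepsilon = 3\varepsilon .
\end{align*}
```

The first sum uses that the intervals with $i \in I_G$ do not overlap and lie in $G$, so their total length is at most $\lambda G$; the second uses that the total length of all intervals is $1$.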
(j) It follows from the above considerations that any Darboux integrable function is measurable (use Pettis' theorem 40.3 and the fact that a continuous image of a separable space is
separable), and therefore, with the aid of Bochner's theorem 42.2, it is also Bochner integrable.
(k) Each bounded function which is continuous at almost all points is Riemann integrable.
When $X$ is a general Banach space, the converse may fail.
Choose $E = \mathbb{Q} \cap [0, 1]$ in (f). Then $g_{\mathbb Q}$ is Riemann integrable while $\|g_{\mathbb Q}\|$ is the Dirichlet function of
Example 7.5, which is nowhere continuous. Nor is $g_{\mathbb Q}$ continuous at any point of $[0, 1]$ (remember
that the norm is a continuous function!).
(l) We can see that the classes of Darboux and Riemann integrable functions can differ. It
seems that there is no known reasonable characterization of Banach spaces where these classes
coincide.
43.8. Exercise. Let $X = c_0$, $(\Omega, \Sigma, \mu) = ([0, 1], \mathcal M, \lambda)$ and define
$$f(t) = \{c_{(0,1]}(t),\ 2c_{(0,1/2]}(t),\ 3c_{(0,1/3]}(t),\ \dots\},$$
$$g(t) = \{c_{(1/2,1]}(t),\ 2c_{(1/3,1/2]}(t),\ 3c_{(1/4,1/3]}(t),\ \dots\},$$
$$h(t) = \Big\{\sum_{n=1}^{\infty} n\, c_{(\frac{1}{n+1}, \frac{1}{n}]}(t),\ 0,\ 0,\ \dots\Big\}.$$
Show that:
(a) $g$ is Pettis integrable,
(b) $f$ is Dunford but not Pettis integrable,
(c) $h$ is not even Dunford integrable but it is measurable,
(d) $\|f(t)\| = \|g(t)\| = \|h(t)\|$ for any $t \in [0, 1]$.
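A sketch of the computation behind (b), under the standard identification $c_0^* = l^1$. The functional is written as $x^* = (a_n) \in l^1$ (the names $x^*$ and $a_n$ are chosen here for illustration), and $\lambda$ is taken to be Lebesgue measure on $[0, 1]$.

```latex
\begin{align*}
x^*(f(t)) &= \sum_{n=1}^{\infty} a_n\, n\, c_{(0,1/n]}(t), \\
\int_0^1 |x^*(f(t))| \, d\lambda(t)
  &\le \sum_{n=1}^{\infty} |a_n|\, n\, \lambda\big((0, 1/n]\big)
   = \sum_{n=1}^{\infty} |a_n| = \|x^*\|_{l^1} < \infty, \\
\int_0^1 x^*(f(t)) \, d\lambda(t)
  &= \sum_{n=1}^{\infty} a_n = \big\langle (1, 1, 1, \dots),\, x^* \big\rangle .
\end{align*}
% Hence the Dunford integral of f over [0,1] is the element (1, 1, 1, ...)
% of l^\infty = c_0^{**}, which does not belong to c_0.
```

Since the integral over $[0, 1]$ lands outside $c_0$, the function $f$ cannot be Pettis integrable, although every $x^* \circ f$ is integrable.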
43.9. Exercise. Define the mapping $f$ from the interval $(0, 1)$ (with Lebesgue measure) into the Hilbert space $l^2$ as
$$f : x \mapsto \Big(\frac{1}{x+1}, \frac{1}{x+2}, \frac{1}{x+3}, \dots\Big), \qquad x \in (0, 1).$$
Show that
$$(P)\int_0^1 f = \Big\{\log\Big(1 + \frac{1}{n}\Big)\Big\}_n.$$
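The coordinates of the claimed integral can be verified directly. The following is a sketch only; the test vector $y = (y_n) \in l^2$ is a name introduced here, and $l^2$ is identified with its own dual.

```latex
\begin{align*}
\int_0^1 \frac{dx}{x+n} &= \log(n+1) - \log n = \log\Big(1 + \frac{1}{n}\Big), \\
0 < \log\Big(1 + \frac{1}{n}\Big) \le \frac{1}{n}
  \quad&\Longrightarrow\quad
  \sum_{n=1}^{\infty} \log^2\Big(1 + \frac{1}{n}\Big)
  \le \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty, \\
\int_0^1 \langle f(x), y \rangle \, dx
  &= \sum_{n=1}^{\infty} y_n \int_0^1 \frac{dx}{x+n}
   = \sum_{n=1}^{\infty} y_n \log\Big(1 + \frac{1}{n}\Big).
\end{align*}
```

The second line shows that the candidate integral is an element of $l^2$. The interchange of sum and integral in the third line is justified by dominated convergence, because $\sum_n |y_n|/(x+n) \le \sum_n |y_n|/n < \infty$ by the Cauchy-Schwarz inequality. The same argument applies to the integral over any measurable subset of $(0, 1)$.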

43.10. Exercise. Consider again the measure space $([0, 1], \mathcal M, \lambda)$. Let $e_n := \{0, \dots, 0, 1, 0, \dots\}$ (with $1$ in the $n$-th place) and
$e_0 = \{0, 0, \dots\}$.

(a) Put $f(r_n) = e_n$ and $f = e_0$ otherwise (here $\{r_n\}$ is a sequence of all rational numbers
of $[0, 1]$). Show that $f : [0, 1] \to c_0$ is measurable and Bochner integrable ($f$ is even Riemann
integrable).
(b) Examine the measurability and integrability of the mapping $f$ of (a) when the space
$c_0$ is replaced by $l^p$, $1 \le p < \infty$.
(c) Let $\alpha \in \mathbb R$ and define
$$h(t) = \{n^{\alpha} c_{(0, 1/n]}(t)\}_n \quad \text{whenever } t \in [0, 1].$$
If $X$ is one of the Banach spaces $c_0$, $l^p$, $1 \le p \le \infty$, show that $h$ is measurable. It is not
difficult to prove that $h$ is Dunford integrable if $X = c_0$ and that it is Dunford integrable for $X = l^\infty$
if and only if $\alpha \le 1$. Further, $h$ is Pettis integrable if and only if it is Bochner integrable, and
this is the case exactly when $\alpha < 1$. For other cases consult I. Chitescu [1990].

43.11. Notes. Fundamental properties of the Pettis integral appeared in B.J. Pettis [1938],
also N. Dunford [1936] studied this integral. Another weak integral (for functions having values
in duals of Banach spaces) was introduced by I.M. Gelfand in [1936]. Nowadays, the Pettis
integral plays an important role in many branches of functional analysis.
L.M. Graves gave the Riemann-type definition for X-valued mappings on [0, 1] in [1927].
Historical comments on vector integration can be found in T.H. Hildebrandt [1953], or J. Diestel and J. J. Uhl [*1977].

Appendix on Topology
In this appendix, we briefly mention some topological notions used in the manuscript which may not be familiar to the reader.
Let us recall that by a topology $\tau$ we always mean a family of subsets of a set
$X$, designated as open or $\tau$-open sets, which has the following properties:
(a) $\tau$ contains $\emptyset$ and $X$;
(b) $A \cap B \in \tau$ for any $A, B \in \tau$;
(c) $\bigcup \mathcal A \in \tau$ if $\mathcal A \subset \tau$.

Complements of open sets are called closed sets.
A family $\mathcal B$ of open sets is a base for the topology $\tau$ if every open set can
be expressed as a union of members of $\mathcal B$.
Since a topology $\tau$ is often defined with the
help of a base $\mathcal B$ by the formula $\tau = \{\bigcup \{B : B \in \mathcal Z\} : \mathcal Z \subset \mathcal B\}$, it is useful to
know when a family of sets determines a topology (in this way). The answer is
as follows:
Proposition. A collection of sets $\mathcal B$ is a base for a topology on $X$ if and only
if $X = \bigcup \mathcal B$, and if for any $z \in B_1 \cap B_2$ ($B_1, B_2 \in \mathcal B$) there is $B \in \mathcal B$ with
$z \in B \subset B_1 \cap B_2$.
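As an illustration of the two conditions (an example added here, not taken from the text), consider the family of all bounded open intervals of the real line; the symbols $a$, $b$, $a_j$, $b_j$ are introduced only for this sketch.

```latex
\[
\mathcal B = \{(a, b) : a, b \in \mathbb R,\ a < b\}, \qquad
\bigcup \mathcal B = \mathbb R ,
\]
\[
z \in (a_1, b_1) \cap (a_2, b_2)
\ \Longrightarrow\
z \in \big(\max(a_1, a_2),\ \min(b_1, b_2)\big) \subset (a_1, b_1) \cap (a_2, b_2).
\]
```

The interval $(\max(a_1, a_2), \min(b_1, b_2))$ is nonempty because it contains $z$, so it belongs to $\mathcal B$; both conditions hold, and the topology obtained is the usual topology of $\mathbb R$.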
An important example is the topology of a metric space formed by the family
of all open sets. A base for this topology is, for instance, the set of all open balls.
The discrete topology on $X$ consists of all subsets of $X$, and the singletons form
a base for this topology.
General topological spaces may have a very complicated structure. Throughout
this manuscript we assume that all spaces are Hausdorff: $(X, \tau)$ is a Hausdorff space if, whenever $x, y$ are distinct points of $X$, there exist disjoint open sets
$G_x$, $G_y$ with $x \in G_x$, $y \in G_y$.
A neighborhood of a point $z \in X$ is any set whose interior contains $z$. The
interior of a set $M$ is defined as the largest open set contained in $M$. It is the
union of all open sets contained in $M$.
The closure $\overline{A}$ of a set $A$ is the smallest closed set containing $A$ (it exists!). A
set $E$ is dense in $X$ if $\overline{E} = X$ and nowhere dense if the interior of its closure is
empty.
A topological space is said to be separable if it contains a countable dense
subset. A metric space $X$ is separable if and only if it has a countable base for
its topology, and this is the case exactly when $X$ has the Lindelöf property: Any
open cover of $X$ contains a countable subcover.
If $X$, $Y$ are topological spaces, a mapping $f : X \to Y$ is continuous if the
pre-images of open sets in $Y$ are open in $X$. Continuity at a point is defined
in the usual way. A mapping is continuous if and only if it is continuous at
each point.
A function $f$ is continuous on $X$ if and only if the level sets
$\{x \in X : f(x) > \alpha\}$ and $\{x \in X : f(x) < \alpha\}$
are open for every $\alpha \in \mathbb R$. A function $f$ is said to be lower semicontinuous if the
set $\{x \in X : f(x) > \alpha\}$ is open for every $\alpha \in \mathbb R$.
A one-to-one mapping $f : X \to Y$ is called a homeomorphism if it is continuous
and the inverse mapping $f^{-1} : f(X) \to X$ is also continuous.
A base for the topology of the Cartesian product of topological spaces $X_1$, $X_2$
is formed by the collection of all sets of the form $G_1 \times G_2$, where $G_j$ are open in
$X_j$.
A topological space is normal if every pair of disjoint closed sets can be separated by (disjoint) open sets. The normal topological spaces are exactly the
spaces where Urysohn's lemma and Tietze's extension theorem hold.
Urysohn's Lemma for Normal Spaces. If $F_1$ and $F_2$ are disjoint closed subsets of a normal topological space $X$, then there exists a continuous function $f$ on
$X$ such that
$0 \le f \le 1$, $f = 0$ on $F_1$, $f = 1$ on $F_2$.
Tietze's Extension Theorem. If $f$ is a continuous function on a closed subset
$Z$ of a normal topological space $X$, then there exists a continuous function $F$ on
$X$ such that
$f = F$ on $Z$ and $\sup_X |F| = \sup_Z |f|$.
Any metric space is normal.


An interesting class of topological spaces, important from the point of view of measure theory,
is formed by locally compact spaces, i.e. spaces in which every
point has a compact neighborhood. A set is compact if each of its open covers (covers
by open sets) contains a finite subcover. Locally compact spaces may fail to be
normal, but a version of Urysohn's lemma holds for them.
Urysohn's Lemma for Locally Compact Spaces. If $K$ is a compact set and
$U$ an open subset of a locally compact space $X$, $K \subset U \subset X$, then there exists a
continuous function $f$ and a compact set $L$ with
$K \subset L \subset U$, $0 \le f \le 1$, $f = 1$ on $K$, $f = 0$ on $X \setminus L$.

Every locally compact space with a countable base is metrizable. A finite
Cartesian product of locally compact spaces is a locally compact space.
Let $C(X)$ be the space of all continuous (real- or complex-valued) functions on
a compact set $X$. If
$\|f\| := \max\{|f(t)| : t \in X\}$ for $f \in C(X)$,
then $C(X)$ equipped with this norm is a Banach space.


Convergence in the space $C(X)$ is uniform convergence. The pointwise convergence of special sequences of continuous functions can guarantee
convergence in $C(X)$, as the following theorem of Dini shows.
Dini's Theorem. If $\{f_n\}$ is a monotone sequence of continuous functions on
a compact space $X$ which converges pointwise to a continuous function, then the
convergence of $\{f_n\}$ is uniform on $X$.
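A standard example (added here as an illustration, not taken from the text) shows that the continuity of the limit function cannot be dropped. Take $X = [0, 1]$; the limit function is called $f$ only for this sketch.

```latex
\[
f_n(x) = x^n \ (x \in [0, 1]), \qquad f_1 \ge f_2 \ge \dots, \qquad
\lim_{n \to \infty} f_n(x) = f(x) :=
\begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1, \end{cases}
\]
\[
\sup_{x \in [0, 1]} |f_n(x) - f(x)| = \sup_{0 \le x < 1} x^n = 1 \quad \text{for every } n .
\]
```

The sequence is monotone and consists of continuous functions on a compact space, but the pointwise limit is discontinuous at $1$ and the convergence is not uniform.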
Let $A \subset C(X)$. We say that
$A$ is an algebra if $fg \in A$ for $f, g \in A$;
$A$ is a lattice if $\max(f, g), \min(f, g) \in A$ whenever $f, g \in A$;
$A$ separates points of $X$ if for any $x, y \in X$, $x \neq y$, there is $\varphi \in A$ such
that $\varphi(x) \neq \varphi(y)$.
The following theorem is useful in many branches of modern analysis.
Stone-Weierstrass Theorem. Let X be a compact space and A a linear subspace of C (X). If A is an algebra or a lattice, if A separates points of X and
contains the constant functions, then A is dense in C (X).
Let P be a metric compact space. Then there exists a countable base {Vn } for
the topology on P . Put fn (x) = dist(x, P \Vn ) and consider the algebra generated
by {fn }. The Stone-Weierstrass theorem yields the following proposition.
Theorem. If P is a metric compact space, then the space C (P ) is separable.

References
Czech Books and Lecture Notes

[Boč] Tenzorový počet, SNTL Praha, 1976.
[ČM I] Integrální počet I, lecture notes, SPN Praha 1960.
[ČM II] Integrální počet II, lecture notes, SPN Praha 1961.
[FM] Matematická analýza II. Diferenciální počet funkcí více proměnných, lecture notes, SPN Praha 1975.
[Kow] Základy matematické analýzy na varietách, lecture notes, UK Praha 1975.
[KNV] Teorie potenciálu III, lecture notes, SPN Praha 1976.
[KST] Matematická analýza na varietách, Karolinum, Praha 1998.
[KS] Integrální transformace, lecture notes, SPN Praha 1969.
[L-Pr] Příklady z matematické analýzy I. Příklady k teorii Lebesgueova integrálu, lecture notes, SPN Praha 1968 (1972, 1984).
[L-T] Teorie míry a integrálu I, lecture notes, SPN Praha 1972 (1974, 1980).
[Zap] Zápisky z funkcionální analýzy, Karolinum, Praha 1998 (2002, 2003).
[Uvod] Úvod do funkcionální analýzy, Karolinum, Praha 2005.
[Pr] Problémy z matematické analýzy, lecture notes, SPN Praha 1972 (1974, 1977, 1982).
[LM] Míra a integrál, lecture notes, Univerzita Karlova, Praha 1993.
[Mar] Matematická analýza čtená podruhé, Academia Praha 1976.
[NV] Příklady z matematické analýzy. Míra a integrál, lecture notes, UK Praha 1982.
[Ru] Analýza v reálném a komplexním oboru, Academia Praha 1977.
[Sik] Diferenciální a integrální počet. Funkce více proměnných, Academia Praha 1973.
[SJ] Funkcionální analýza. Nelineární úlohy, lecture notes, SPN Praha 1986.
Other books

[*1981] Principles of real analysis, North-Holland.
[*1966] A first course in integration, Holt, Rinehart and Winston.
[*1964] Elements of abstract harmonic analysis, Academic Press.
[*1932] Théorie des opérations linéaires, Warszawa.
[*1966] The elements of integration, Wiley.
[*1990] Maß- und Integrationstheorie, Walter de Gruyter.
[*1987] Maß- und Integrationstheorie, Springer-Verlag.
[*1965] Measure and integration, Macmillan.
[*1988] Differential geometry: Manifolds, Curves, and Surfaces, Springer-Verlag.
[*1983] Theory of charges, Academic Press.
[*1968] Convergence of probability measures, John Wiley 1968, Russian translation 1977.
[*1979] Probability and measure, Wiley.
[*1898] Leçons sur la théorie des fonctions, Gauthier-Villars, Paris.
[*1959] Real Analysis, Van Nostrand.
[*1952] Intégration, Hermann et Cie, Paris 1952, 2nd edition 1965.
[*1978] Differentiation of real functions, Springer-Verlag.
[*1918] Vorlesungen über reelle Funktionen, Teubner Leipzig 1918, 2nd edition 1927, 3rd edition Chelsea 1948.
[*1821] Cours d'analyse de l'École Royale Polytechnique, Paris.
[*1969] Lectures on analysis, Vol. I, W.A. Benjamin.
[*1980] Measure theory, Birkhäuser.
[*1985] Integration theory, Vol. 1: Measure and integral, John Wiley & Sons.
[*1977] Applied nonstandard analysis, Wiley-Interscience.
[*1985] Nonlinear functional analysis, Springer-Verlag.
[*1977] Vector measures, AMS.

[*1994] Measure theory, Springer-Verlag.
[*2004] Lectures on Nonlinear Analysis, Vydavatelský servis, Plzeň.
[*1989] Real analysis and probability, Wadsworth & Brooks/Cole.
[*1990] Measure, topology, and fractal geometry, Springer-Verlag.
[*1992] Measure theory and fine properties of functions, CRC Press, Boca Raton.
[*1985] The geometry of fractal sets, Cambridge University Press, Cambridge.
[*1969] Geometric measure theory, Springer-Verlag. (Second edition Springer 1996).
[*1965] Functions of several variables, Addison-Wesley.
[*1981] Maß- und Integrationstheorie, B.G. Teubner-Verlag, Stuttgart.
[*1984] Real analysis, John Wiley.
[*1995] A course in abstract harmonic analysis, CRC Press.
[*1995] Degree theory in analysis and applications, The Clarendon Press, Oxford University Press, New York.
[*1991] Fundamentals of real analysis, Marcel Dekker.
[*1822] Théorie analytique de la chaleur, F. Didot, Paris.
[*1974] Topological Riesz spaces and measure, Cambridge University Press.
[*1973] Spectral analysis of nonlinear operators, Lecture Notes in Math. 346, Springer-Verlag 1973.
[*1995] Modern Real Analysis, PWS Publishing Company, Boston.
[*1994] The integrals of Lebesgue, Denjoy, Perron, and Henstock, American Mathematical Society, Graduate studies in mathematics, vol. 4.
[*1975] Differentiation of integrals in $R^n$, Lecture Notes in Math. 481, Springer-Verlag.
[*1921] Theorie der reellen Funktionen, I. Band, Julius Springer, Berlin.
[*1948] Set functions, Univ. New Mexico Press.

[*1950] Measure theory, Van Nostrand 1950, 1966, Springer 1974.
[*1970] Lebesgue's theory of integration (Its origins and development), Wisconsin Press.
[*1988] Lectures on the theory of integration, World Scientific Publishing, Singapore.
[*1991] The general theory of integration, Clarendon Press, Oxford.
[*1963] Abstract harmonic analysis I, Academic Press 1963 (Russian translation 1975).
[*1970] Abstract harmonic analysis II, Academic Press 1970 (Russian translation 1975).
[*1965] Real and abstract analysis, Springer-Verlag.
[*1975] Analysis in Euclidean spaces, Prentice-Hall Inc.
[*1966] Topological vector spaces and distributions I, Addison-Wesley.
[*1978] Measure and integral, Academic Press.
[*1933] Grundbegriffe der Wahrscheinlichkeitsrechnung [Foundations of the theory of probability], Springer-Verlag 1933, Chelsea 1950.
[*1980] Nichtabsolut konvergente Integrale, B.S.B.G. Teubner, Leipzig.
[*1993] Real and functional analysis, Springer-Verlag (3rd edition).
[*1904] Leçons sur l'intégration et la recherche des fonctions primitives, Paris, 2nd edition 1928.
[*1989] Lanzhou lectures on Henstock integration, World Scientific Publishing Co.
[*1988] An introduction to the theory of real functions, John Wiley & Sons.
[*1986] Fine topology methods in real analysis and potential theory, Lecture Notes in Math. 1189, Springer-Verlag.
[*1982] The prehistory of the theory of distributions, Springer-Verlag.
[*1915] Integral i trigonometričeskie rjady, Moskva 1915, 2nd edition 1950.
[*1986] Lecture notes on geometric measure theory, Universidad de Extremadura.
[*1995] Geometry of sets and measures in Euclidean spaces. Fractals and rectifiability, Cambridge University Press, Cambridge.
[*1992] Analyse (Fondements, techniques, évolution), De Boeck Université, Bruxelles.

[*1978] The Bochner integral, Birkhäuser.
[*1907] Diophantische Approximationen, Teubner, Leipzig.
[*1989] Funkcionálna analýza, Alfa.
[*1988] Geometric measure theory, Academic Press.
[*1986] Real and functional analysis, Part A and B, Plenum Press.
[*1971] Measure and integration, Addison-Wesley.
[*1965] The Haar integral, Van Nostrand.
[*1981] Miera a integrál, Veda.
[*1971] Measure and category, Springer-Verlag 1971, 1980, Moskva 1974.
[*1978] Introduction to probability and measure, Springer-Verlag.
[*1989] Analysis now, Springer-Verlag.
[*1977] Integrals and measures, Marcel Dekker, New York.
[*1987] Measure theory and integration, John Wiley & Sons. (Second revised edition Marcel Dekker, Inc., New York, 2004).
[*1992] Teória miery, Veda.
[*1970] Hausdorff measures, Cambridge University Press.
[*1982] A second course on real functions, Cambridge University Press.
[*1974] Real and complex analysis, McGraw-Hill (2nd ed.).
[*1968] Real analysis, Macmillan.
[*1937] Theory of the integral, Stechert 1937.
[*1992] Generalized ordinary differential equations, World Scientific, Singapore.
[*2005] Topics in Banach space integration, World Scientific Publishing Co. Pte. Ltd., Hackensack.

[*1969] Nonlinear functional analysis, Gordon and Breach, New York.
[*1968] Integrals and operators, McGraw-Hill.
[*1983] Lectures on geometric measure theory, Proc. of the Centre for mathematical analysis, Australian National University, vol. 3.
[*1983] Primer on modern analysis, Springer-Verlag.
[*1984] An introduction to classical real analysis, Wadsworth International.
[*1987] Matematická analýza funkcií reálnej premennej, Alfa.
[*1965] General theory of functions and integration, Blaisdell.
[*1970] Topology and measure, Lecture Notes in Math. 133, Springer-Verlag.
[*1988] Real variables, Addison-Wesley.
[*1905] Sul problema della misura dei gruppi di punti di una retta, Bologna.
[*1985] The Banach-Tarski paradox, Cambridge University Press.
[*1940] L'Intégration dans les Groupes Topologiques et ses Applications, Hermann et Cie, Paris 1940, 2nd edition 1965.
[*1973] Lebesgue integration and measure, Cambridge University Press.
[*1977] Measure and integral, Marcel Dekker.
[*1969] Lectures on measure and integration, Van Nostrand.
[*1962] Lebesgue integration, Holt, Rinehart and Winston.
[*1967] Integration, North Holland.


Papers

[1922] Sur les opérations dans les ensembles abstraits et leurs applications aux équations intégrales, Fund. Math. 3, 133-181.
[1923] Sur le problème de la mesure, Fund. Math. 4, 7-33.
[1924] Sur un théorème de M. Vitali, Fund. Math. 5, 130-136.
[1925] Sur les lignes rectifiables et les surfaces dont l'aire est finie, Fund. Math. 7, 225-237.

[1924] Sur la décomposition des ensembles de points en parties respectivement congruentes, Fund. Math. 6, 244-277.
[1957] Sur l'équivalence des théories de l'intégration selon N. Bourbaki et selon M.H. Stone, Bull. Soc. math. France 85, 51-75.
[1945] A general form of the covering principle and relative differentiation of additive functions, Proc. Cambridge Philos. Soc. 41, 103-110.
[1946] A general form of the covering principle and relative differentiation of additive functions, Proc. Cambridge Philos. Soc. 42, 1-10.
[1933] Integration von Funktionen deren Werte die Elemente eines Vectorraumes sind, Fund. Math. 20, 262-276.
[1895] Sur quelques points de la théorie des fonctions, Ann. École Normale Sup. 12, 9-55.
[1963] A new treatment of the Haar integral, Michigan Math. J. 10, 365-373.
[1914] Über das lineare Mass von Punktmengen - eine Verallgemeinerung des Längenbegriffs, Nach. Ges. Wiss. Göttingen, 404-426.
[1940] Sur la mesure de Haar, C. R. Acad. Sci. Paris 211, 759-762.
[1990] A parametrical example of Dunford, Pettis and Bochner integration, Stud. Cerc. Mat. 42, 405-418.
[1986] La naissance de la théorie des capacités: réflexion sur une expérience personnelle, La Vie des Sciences, Comptes rendus, sér. générale 3,4, 385-397.
[1989] Vznik teorie kapacit: zamyšlení nad vlastní zkušeností, Pokroky matematiky, fyziky a astronomie 34, 71-83.
[1918] A general form of integral, Ann. of Math. 19, 279-284.
[1980] The Hahn decomposition theorem, Proc. Amer. Math. Soc. 80, 377.
[1935] Integration in general analysis, Trans. Amer. Math. Soc. 37, 441-453.
[1911] Sur les suites de fonctions mesurables, C. R. Acad. Sci. Paris 152, 244-246.
[1936] Integration and linear operation, Trans. Amer. Math. Soc. 40, 474-494.
[1910] Über stetige Funktionen. II, Math. Annalen 69, 372-433.
[1906] Séries trigonométriques et séries de Taylor, Acta Math. 30, 335-400.
[1981] A proof of Lusin's theorem, Amer. Math. Monthly 88, 191-192.

[1907] Sur la convergence en moyenne, Comptes Rendus Acad. Sci. Paris 144, 1022-1024.
[1991] The Hahn-Banach theorem implies the existence of a non Lebesgue-measurable set, Fund. Math. 138, 13-19.
[1915] Sur l'intégrale d'une fonctionnelle étendue à un ensemble abstrait, Bull. Soc. Math. France 43, 248-265.
[1924] Des familles et fonctions additives d'ensembles abstraits, Fund. Math. 5, 206-251.
[1907] Sugli integrali multipli, Rendiconti Accad. Nazionale dei Lincei (Roma) 16, 608-614.
[1915] Sulla derivazione per serie, Rendiconti Accad. Nazionale dei Lincei (Roma) 24, 204-206.
[1936] Sur un lemme de la théorie des espaces linéaires, Comm. Ins. Sci. Math. Mec. Univ. de Kharkov et Soc. Mat. Kharkov 13, 35-40.
[1941] Linear functionals and integrals in abstract spaces, Bull. Amer. Math. Soc. 47, 615-620.
[1927] Riemann integration and Taylor's theorem in general analysis, Trans. Amer. Math. Soc. 29, 163-177.
[1933] Der Maßbegriff in der Theorie der kontinuierlichen Gruppen, Ann. of Math. 34, 147-169.
[1933] Über die Multiplikation total-additiver Mengenfunktionen, Annali Scuola Norm. Sup. Pisa 2, 429-452.
[1957] Note sur la mesurabilité B de la dérivée supérieure, Fund. Math. 44, 238-240.
[1983] The Riesz representation theorem revisited, Amer. Math. Monthly 90, 277-280.
[1919] Dimension und äusseres Mass, Math. Ann. 79, 157-179.
[1961] Definitions of Riemann type of the variational integrals, Proc. London Math. Soc. 13,3, 305-321.
[1988] A short history of integration theory, SEA Bull. Math. 12, 75-95.
[1953] Integration in abstract spaces, Bull. Amer. Math. Soc. 59, 111-139.
[1889] Über einen Mittelwerthssatz, Nachr. Akad. Wiss. Göttingen Math.-Phys., 38-47.
[1970] An introduction to distributions, Amer. Math. Monthly 77, 227-240.
[1881] Sur la série de Fourier, Comptes Rendus Acad. Sci. Paris 92, 228-230.

[1941] Concrete representation of abstract (M)-spaces, Annals of Math. 42, 994-1024.
[1948] A proof of the uniqueness of Haar's measure, Ann. Math. 49, 225-226.
[1950] Construction of a non-separable invariant extension of the Lebesgue measure space, Annals of Math. 52, 580-590.
[1934] Über die zusammenziehende und Lipschitzsche Transformationen, Fund. Math. 22, 77-108.
[1932] Beiträge zur Masstheorie, Math. Ann. 107, 351-366.
[1985] Note on generalized multiple Perron integral, Čas. Pěst. Mat. 110, 371-374.
[1957] Generalized ordinary differential equations and continuous dependence on a parameter, Czechoslovak Math. J. 82, 418-446.
[1901] Sur une généralisation de l'intégrale définie, Comptes Rendus Acad. Sci. Paris 132, 1025-1028.
[1902] Intégrale, longueur, aire, Annali Mat. Pura Appl. 7, 231-359.
[1903] Sur une propriété des fonctions, Comptes Rendus Acad. Sci. Paris 137, 1228-1230.
[1910] Sur l'intégration des fonctions discontinues, Ann. Sci. École Norm. Sup. 27, 361-450.
[1906] Sopra l'integrazione delle serie, Rend. Istituto Lombardo di Sci. e Lett. 39, 775-780.
[1940] Sur les fonctions-vecteurs complètement additives, Bull. Acad. Sci. URSS 6, 465-478.
[1912] Sur les propriétés des fonctions mesurables, C. R. Acad. Sci. Paris 154, 1688-1690.
[1952] Základy theorie integrálu v Euklidových prostorech, Časopis Pěst. Mat. 77, 1-51, 125-145, 267-301.
[1934] Extension of range of functions, Bull. Amer. Math. Soc. 40, 837-842.
[1975] The work of Henri Lebesgue on the theory of functions (on the occasion of his centenary), Transl. from Uspechi Mat. Nauk 30 (1975), 227-238, Russian Math. Surveys 30, 179-191.
[1976] Contrôle dans les inéquations variationelles elliptiques, J. Functional Analysis 22, 130-185.
[1970] On the extension of Lipschitz, Lipschitz-Hölder continuous and monotone functions, Bull. Amer. Math. Soc. 76, 334-339.
[1939] The behaviour of a function on its critical set, Ann. of Math. 40, 62-70.
[1947] Perfect blankets, Trans. Amer. Math. Soc. 6, 418-442.

[1988] A simple proof of the Rademacher theorem, Časopis Pěst. Mat. 113, 337-341.
[1934] Zum Haarschen Maß in topologischen Gruppen, Compositio Math. 1, 106-114.
[1936] The uniqueness of Haar's measure, Matem. sbornik 43, 721-734.
[1930] Sur une généralisation des intégrales de M.J. Radon, Fund. Math. 15, 131-179.
[1914] Über den Integralbegriff, Sitzungsber. Heidelberg Akad. Wiss. A16, 1-16.
[1927] Die Vollständigkeit der primitiven Darstellungen einer geschlossenen kontinuierlichen Gruppe, Math. Ann. 97, 737-755.
[1938] On integration in vector spaces, Trans. Amer. Math. Soc. 44, 277-304.
[1910] Contributions à l'étude de la représentation d'une fonction arbitraire par des intégrales définies, Rend. Circ. mat. Palermo 30, 289-335.
[1919] Über partielle und totale Differenzierbarkeit, Math. Ann. 89, 340-359.
[1913] Theorie und Anwendungen der absolut additiv Mengenfunktionen, S.-B. Math. Natur. Kl. Kais. Akad. Wiss. Wien 122.IIa, 1295-1438.
[1906] Sur les ensembles de fonctions, Comptes Rendus Acad. Sci. Paris 143, 738-741.
[1909a] Sur les suites de fonctions mesurables, Comptes Rendus Acad. Sci. Paris 148, 1303-1305.
[1909b] Sur les opérations fonctionnelles linéaires, Comptes Rendus Acad. Sci. Paris 149, 974-977.
[1910] Untersuchungen über Systeme integrirbarer Funktionen, Math. Annalen 69, 449-497.
[1912] Sur quelques points de la théorie des fonctions sommables, Comptes Rendus Acad. Sci. Paris 154, 641-643.
[1920] Sur l'intégrale de Lebesgue, Acta Math. 42, 191-205.
[1930-32] Sur l'existence de la dérivée des fonctions monotones et sur quelques problèmes qui s'y rattachent, Acta Sci. Math. Szeged 5, 208-221.
[1888] An extension of a certain theorem in inequalities, Messenger of Math. 17, 145-150.
[1938] Integration in abstract metric spaces, Duke Math. J. 4, 408-411.
[1942] The measure of the critical set values of differentiable mappings, Bull. Amer. Math. Soc. 48, 883-890.
[1951] A note on the space Lp, Proc. Amer. Math. Soc. 2, 270-275.
[1948] Théorie des distributions et transformation de Fourier (French), Ann. Univ. Grenoble. Sect. Sci. Math. Phys. (N.S.) 23, 7-24.

[1954] Equivalence of measure spaces, Am. J. Math. 73, 275-313.
[1928] Un théorème général sur les familles d'ensembles, Fund. Math. 12, 206-210.
[1936] L. S. Sobolev, Méthode nouvelle à résoudre le problème de Cauchy pour les équations hyperboliques normales, Mat. Sb. 1 (43), 39-71.
[1970] A model of set theory in which every set of reals is Lebesgue measurable, Annals of Math. 92, 1-56.
[1919] Additive und stetige Funktionaloperationen, Math. Z. 5, 186-221.
[1948] Notes on integration I - III, Proc. Nat. Acad. Sci. 34, 336-342, 447-455, 483-490.
[1949] Notes on integration IV, Proc. Nat. Acad. Sci. 35, 50-58.
[1985] A combinatorial construction of a nonmeasurable set, Amer. Math. Monthly 92, 421-422.
[1909] Sull'integrazione per parti, Rendiconti Accad. Nazionale dei Lincei 18, 246-253.
[1908] On non-measurable sets of points with an example, Trans. Amer. Math. Soc. 9, 237-244.
[1922] Limit in terms of continuous transformation, Bull. Soc. Math. France 50, 119-134.
[1939] The ergodic theorem, Duke Math. J. 1-18.
[1911] On the existence of a differential coefficient, Proc. London Math. Soc. 9, 325-335.
[1904] On upper and lower integration, Proc. London Math. Soc. 2, 52-66.
[1983] On differentiation of metric projections in finite dimensional Banach spaces, Czech. Math. J. 33,3, 325-336.

A Short Guide to the Notation


M, M(), Mn ... measurable sets 1.3, 4.4, 26
, n ... Lebesgue measure 1.3, 1.15, 26
$\sigma(\mathcal A)$ ... $\sigma$-algebra generated by $\mathcal A$ 2.3
B, B(P ) ... Borel sets 2.3
|A , A , |T ... restrictions of measures 2.4
$S_A$ ... restriction of a $\sigma$-algebra 2.4
x ... Dirac measure 2.5
... completion of a measure 2.7
, ... outer and inner measure 1.3, 4.8
M(S ) ... space of all (signed, complex) measures on (X, S ) 6.17
vol I ... volume of an interval 1.15
+ , , || ... variations of a measure 6.6, 6.10
f () ... image of a measure 8.23
L ... the set of all functions for which the integral exists 8.3
$\mathcal L^p$, $L^p$ ... $L^p$-spaces 8.3, 10.1
$\|f\|_p$ ... $L^p$-norm of a function $f$ 10.1, 10.5
$l^p$ ... $l^p$-spaces 10.7, 40.3-4
$\mathcal S \otimes \mathcal T$ ... product $\sigma$-algebra 11.1
$M^x$, $M_x$ ... sections 11.1
... product measure 11.6
w-lim fj ... weak limit 12.12
w*-lim fj ... weak* limit 12.13
X ... (topological) dual 12.4
f ... measure having a density f , 8.19, 13.1
$\frac{d\nu}{d\mu}$ ... Radon-Nikodým derivative 13.1
 ... absolute continuity of measures 13.1
... mutually singular measures 13.8
supt f ... support 14.1
CK (P ) ... continuous functions having compact support in K 14.1
Cc (P ) ... continuous functions having compact support 14.1
Cc (P ), Cc (P ) ... semicontinuous functions 14.4
A+ , A , |A| ... variation of a (signed) Radon integral 14.11, 14.12
A , A ... measure (outer measure) corresponding to a Radon integral 16.1,
16.4
$C_b(P)$ ... bounded continuous functions 16.7
$C_0(P)$ ... continuous functions vanishing at infinity 16.7
v
j ... vague convergence 17.1
M (P ) ... signed (complex) Radon measures on P 17.1
$M_b(P)$ ... finite signed (complex) Radon measures 17.1
e ... unit of a group 19.2
... modular function 19.9; also a set of all positive functions 25.2
f g ... convolution 19.19, 26.21
$V_a^b f$ ... variation 21.1
$D^+ f$, $D^- f$, $D_+ f$, $D_- f$, $\overline{D} f$, $\underline{D} f$ ... extreme derivatives 22.3
 ... Jacobi matrix 26.12
 ... (Fréchet) derivative 26.12
$C^l$ ... continuously differentiable functions of order $l$ 26.12
J ... Jacobian 26.12 2
D ... symmetric derivative of a measure 28.2
D() ... (infinitely) smooth functions having compact support 31.1
k ... smoothing convolution kernel 31.1
Tf ... regular distribution 32.1
T ... distribution determined by a measure 32.3
... Dirac distribution 32
$f_k \xrightarrow{\mathcal D} 0$ ... convergence in $\mathcal D$ 32.1
D T ... multiindex derivative 32.4
Zk Z ... convergence of distributions 32.9
f, F f ... Fourier transform 33.1, 33.5, 33.8
f ... inverse Fourier transform 33.5
S ... Schwartz space 33.5
Mn,k ... space of all matrices 34
L ... adjoint mapping 34
AT ... transpose matrix 34
L ... norm of a linear mapping 34
det A ... determinant 34
... k-dimensional measure in Rn 34.8
vol L, vol(u1 , . . . , uk ) ... volume 34.10
N (x, , E) ... Banach indicatrix 34.19
deg(y., G) ... degree of a mapping 35.7
Hp (A), Hp (A, ) ... Hausdorff measure 36.1
s, S ... one-dimensional ((n-1)-dimensional) measure 37
grad g, div f , curl f ... gradient, divergence, curl 37.1
u1 uk1 ... vector product 37.5
t, n ... unit tangent (normal) vector 37.10, 37.13
Hk ... half-space 37.17
i ... embedding of $R^{n-1}$ into $H_k$ 37.17
< , > ... duality 38.1
k (V), k (V) ... k-covectors and vectors 38.1
V1 Vk ... exterior product 38.3
I(k, n) ... set of all increasing multiindices 38.3
Xi ... coordinate form 38.3
d ... differential 38.11
(x) ... unit tangent k-vector 38.14
U ... domain of a chart 39.1
Tx ... tangent space 39.4
c0 ... space of all sequences converging to zero 40.3-4

Subject Index
absolute convergence 41.1
absolutely continuous function 21.3
absolutely continuous measure 13.1,
13.8
vector 41.4
adjoint mapping 34
algebra 5.1, Appendix
almost everywhere 3.12, 3.14
analytic set 26.10
antiderivative 7.1
antisymmetric k-linear form 38.1
approximately continuous function 29.7
atlas 39.1
atom 2.15
atomic measure 15.18
Baire function 7.9
ball 26.11, 28.2.1
Banach-Zareckis theorem 23.11
barycenter 43.6
base for a topology Appendix
Besicovitchs theorem 27.4
bilinear form 38.1
bilipschitz mapping 34.20
Bochner integral 42.1
Bochners theorem 42.2
Borel-Cantellis lemma 2.14
Borel function 3.2
Borel set 2.3
bounded variation 21.1
Cantor discontinuum 1.12
Cantor function 23.1
Cantor (ternary) set 1.12
capacity 5.14
capacitable set 5.15
charge 6.20
chart 34.24, 39.1
Chebyshev's inequality 7.8
Choquet capacity 5.14
compact space Appendix
complete measure 2.4
complete Riemann integral 25.14
completion of a measure 2.7
complex measure 6.10
complex Radon integral 14.10
complex Radon measure 15.8
cone 37.15
continuous measure 15.18
convergence 41
  absolute 41.1
  almost everywhere 3.12
  in measure 12.1
  μ-uniform 12.2
  strong 17.1
  unconditional 41.1
  vague 17.2
  weak 12.12, 12.14, 17.1, 32.9
  weak* 12.13, 12.14
convergent integral 8.3
convolution of functions 19.19, 26.18
convolution of measures 19.24, 26.18
counting measure 2.5, 2.10, 19.4
Cousin's lemma 25.2
curl 37.1
curve 37.10
curve integral 34.24, 37.10
Daniell's property 14.3
Darboux property (of a measure) 2.15
Darboux integrable function 43.7
degree of a mapping 35.7
δ-fine partition 25.2
δ-ring 5.1
Denjoy-Perron integral 25.14
Denjoy's theorem 29.9
density of a measure 13.1
density point 29.1
density topology 29.4
derivative 26.12
derivative of a mapping 39.4
derivative of a measure 28.1
descriptive definition of an integral 23.7
diffeomorphism 34.22, 39.1
differential 38.11, 39.8
differential form 38.9, 39.8
Dini derivative 22.3
Dini's theorem Appendix
Dirac integral 14.2
Dirac measure 2.5
Dirichlet function 7.5
discontinuum
  Cantor's 1.12
  of a positive measure 1.13
discrete measure 15.18
discrete topology Appendix
distribution 32.1
  tempered 33.5
distribution function 24.1
divergence 37.1
d-open set 29.4
dual space to Lp 13.17
Dunford integral 43.2
Dunford's lemma 43.1
Dynkin class 5.1
Egorov theorem 12.6
embedding 39.2
extremal derivative 22.3
Fatou lemma 8.15
finer partition 43.7
finite measure 2.4
finite variation 21.1
finitely additive measure 6.20
Fourier coefficients 33.20
Fourier series 33.20
Fourier transform 33.1, 33.5, 33.11
Fréchet derivative 26.12
Fσ set 2.3
Fσδ set 2.3
Fubini's lemma 24.10
Fubini's theorem 11.9, 26.9
function
  absolutely continuous 21.3
  approximately continuous 29.7
  Baire 7.9
  Borel 3.2
  Cantor 23.1
  Darboux integrable 43.7
  Dirichlet 7.5
  distribution 24.1
  Heaviside 32.5
  integrable 8.3, 14.5
  Lebesgue measurable 26.7
  Lipschitz 20
  locally absolutely continuous 21.3
  locally integrable 23.3
  lower Baire 7.9
  lower semicontinuous 14.4
  measurable 3.1, 3.13, 40.1
  modular 19.9
  of Baire class one 18.3
  of class C 26.12
  of finite variation 21.1
  Riemann 7.6
  Riemann integrable 7.2
  saltus 24.7
  simple 3.8, 8.1, 40.1
  upper semicontinuous 14.14
  weakly integrable 43.2
  weakly measurable 40.1
  with a compact support 14.1, 31.1
Gauss theorem 37.22
Gδ set 2.3
Gδσ set 2.3
Gelfand integral 43.2
gradient 37.1
Graves integral 43.7
Green's theorem 37.27
Haar measure 19.3
Hahn decomposition 6.3
harmonic measure 14.8
Hausdorff dimension 36.8
Hausdorff measure 36.1
Hausdorff outer measure 4.2
Hausdorff space Appendix
Heaviside function 32.5
helix 34.25, 37.12
Henstock-Kurzweil integral 25.2
homeomorphism Appendix
Hopf's theorem 5.5
Hölder's inequality 10.3
image of a measure 8.23
increasing multiindex 38.3
indefinite Henstock-Kurzweil integral 25.5
indefinite Lebesgue integral 23.3
indefinite variation 21.1
inequality
  Chebyshev's 7.8
  Hölder's 10.3
  Minkowski's 10.4
  Young's 10.2
integrable function 8.3, 14.5
integral 8.3
  Bochner 42.1
  complex Radon 14.10
  convergent 8.3
  curve 34.24, 37.10
  Denjoy-Perron 25.14
  Denjoy restricted 25.14
  Dirac 14.2
  Dunford 43.2
  Gelfand 43.2
  Graves 43.7
  Henstock-Kurzweil 25.2
    indefinite 25.5
  Lebesgue 8.3, 26
    indefinite 23.3
  Newton 7.1
  of a differential form 38.16, 39.18
  Perron 25.10
  Pettis 43.2
  Poisson 14.2
  Radon 14.1
    signed 14.10
  Riemann 7.2
    complete 25.14
  Riemann-Graves 43.7
  Riemann-Stieltjes 14.2
  surface 34.24, 37.13
integration by parts 23.13
inverse Fourier transform 33.5
involution 19.23
isometric mapping 34.2
Jacobi matrix 26.12
Jacobian 26.12
Jordan-Peano content 1.5, 5.12, 26.24
Jordan decomposition 6.7
Kadec-Klee property 12.14
k-boundary 37.17
k-covector 38.1
k-dimensional measure 34.8
  on a manifold 39.15
k-dimensional surface 34.24
Kirszbraun's theorem 30.6
k-linear form 38.1
Kurzweil (Henstock-Kurzweil) integral 25.2
k-vector 38.1
lattice Appendix
Lebesgue measurable function 26.7
Lebesgue measure 1.3, 1.15, 2.5, 19.4
Lebesgue outer measure 1.1, 1.15
Lebesgue point 23.8
Lebesgue-Stieltjes measure 14.8
Lebesgue's theorem 8.13, 8.14, 12.6
  differentiation 22.5, 23.9
  density 29.2
  decomposition 13.10
left Haar measure 19.3
lemma
  Borel-Cantelli's 2.14
  Cousin's 25.2
  Dunford's 43.1
  Fatou 8.15
  Fubini's 24.10
  Riemann-Lebesgue's 12.13, 31.10
  Saks-Henstock's 25.6
  Urysohn's Appendix
lemniscate 26.15
Levi's theorem 8.5, 8.11, 8.12
linear form 38.1
L∞-norm 10.1
Lipschitz boundary 37.21
Lipschitz k-boundary 37.17
Lipschitz function 20
Lipschitz mapping 30.1
Ljapunov's theorem 2.16
Lp -norm 10.1
Lp -space 10.5
localizable measure 13.18
locally absolutely continuous function 21.3
locally bilipschitz mapping 34.20
locally compact space Appendix
locally integrable function 23.3
locally Lipschitz mapping 30.1
locally uniformly convex space 12.14
lower derivative 22.3
lower Riemann sum 7.2
lower semicontinuous function 14.4
Luzin's (N)-property 20.7
Luzin's theorem 18.2
majorant 25.10
manifold 39.1
mapping
  adjoint 34
  bilipschitz 34.20
  diffeomorphic 39.1
  isometric 34.2
  Lipschitz 30.1
  locally bilipschitz 34.20
  locally Lipschitz 30.1
  (of class) C 26.12
  regular 34.22
McShane extension theorem 30.5
measurable function 3.1, 3.13, 40.1
measurable set 1.3, 1.15, 2.4, 4.4
measurable rectangle 11.1
measurable space 2.1
measure 2.4
  absolutely continuous 13.1, 13.8
  atomic 15.18
  complete 2.4
  complex 6.10
  continuous 15.18
  counting 2.5, 19.4
  Dirac 2.5
  discrete 15.18
  finite 2.4
  finitely additive 6.20
  Haar 19.3
  harmonic 14.8
  Hausdorff 36.1
    outer 4.2
  k-dimensional 34.8
    on a manifold 39.15
  Lebesgue 1.3, 1.15, 2.5, 19.4
    outer 1.1, 1.15
  Lebesgue-Stieltjes 14.8
  left Haar 19.3
  localizable 13.18
  molecular 17.7
  outer 4.1
    metric 36.2
    regular 4.7
  probability 2.4
  singular 13.8
  Radon 15.1
    complex 15.8
    outer 16.1
  regular Borel 15.2
  right Haar 19.3
  σ-finite 2.4
  signed 6.1
  translation invariant 1.2
  trivial 2.5
  vector 41.2
    absolutely continuous 41.4
measure space 2.4
metric outer measure 36.2
Minkowskis inequality 10.4
minorant 25.10
modular function 19.9
molecular measure 17.7
monotone class 11.3
monotone functional 14.1
Möbius strip 39.7
multiindex 32, 34.13, 38.3
μ-uniform convergence 12.2
natural orientation 37.9
negative base 39.4
negative diffeomorphism 39.3
negative parametrization 37.9
negative variation of a measure 6.6
negative variation of a Radon integral 14.12
neighborhood Appendix
Newton integral 7.1
Newton potential 5.14
Newtonian capacity 5.14
norm of a partition 43.7
normal (vector) 37.13
normal space Appendix
orientation 37.9, 38.14, 39.3
orientable manifold 39.3
oriented atlas 39.3
oriented manifold 39.3
Orlicz-Pettis theorem 43.5
orthogonal matrix 34.2
oscillation 43.7
outer capacity 5.15
outer measure 4.1
outer normal 37.21
outer product 38.3
parametrization 34.24
partition of an interval 7.2, 25.2
  δ-fine 25.2
  subordinated 25.2
partition of unity 39.11, 39.22
Perron integral 25.10
Pettis integral 43.2
Pettis theorem 40.3, 41.5
π-system 5.1
Plancherel theorem 33.7
point of density 29.1
Poisson integral 14.2
polar coordinates 26.14
positive base 39.4
positive diffeomorphism 39.3
positive functional 14.1
positive parametrization 37.9, 39.18
positive variation of a Radon integral 14.12
positive variation of a measure 6.6
premeasure 5.2
probability measure 2.4
product of measures 11.6
product of Radon integrals 14.13
product of Radon measures 16.9
product σ-algebra 11.1
property
  Daniell 14.3
  Kadec-Klee 12.14
  Luzin's (N) 20.7
  Radon-Nikodým 42.4
  Vitali's 27.1
pullback 39.8
Rademacher theorem 30.3
Radon integral 14.1
Radon measure 15.1
Radon-Nikodým derivative 13.1
Radon-Nikodým property 42.4
Radon-Nikodým theorem 13.4
Radon outer measure 16.1
rectifiable set 34.31
regular Borel measure 15.2
regular distribution 32.3
regular mapping 34.22
regular outer measure 4.7
restricted Denjoy integral 25.14
restriction of a measure 2.4
Riemann function 7.6
Riemann integrable function 7.2
Riemann-Graves integral 43.7
Riemann integral 7.2
Riemann-Lebesgue lemma 12.13, 31.10
Riemann-Stieltjes integral 14.2
Riemannian manifold 39.12
Riemannian metric 39.12
Riesz lattice 14.14
Riesz theorem 12.3
  representation 16.5
right Haar measure 19.3
ring 5.1
Saks-Henstock's lemma 25.6
Sard theorem 34.17
Schwartz space 33.5
Schwartz theorem 32.8
section 11.1
semicontinuous function 14.4, Appendix
semiring 5.1
separable space Appendix
set
  analytic 26.10
  Borel 2.3
  capacitable 5.15
  d-open 29.4
  Fσ 2.3
  Fσδ 2.3
  Gδ 2.3
  Gδσ 2.3
  measurable 1.3, 1.15, 2.4, 4.4
  rectifiable 34.31
σ-additivity 2.4
σ-algebra 2.1
σ-finite measure 2.4
σ-ring 5.1
σ-subadditivity 1.2, 4.1
signed measure 6.1
signed Radon integral 14.10
signed Radon measure 15.8
simple function 3.8, 8.1, 40.1
singular measure 13.8
sphere 34.26, 34.27, 37.23
spherical coordinates 26.16, 37.23
Stokes theorem 37.32, 38.21, 39.21
Stone-Weierstrass theorem Appendix
Stone's condition 14.14
strong convergence 17.1
strong subadditivity 5.14
support of a function 14.1
support of a Radon measure 15.10
surface integral 34.24, 37.13
surface k-dimensional 34.24
surface with a Lipschitz k-boundary 37.17
tangent k-vector 38.14
tangent space 39.4
tangent vector 37.10
tempered distribution 33.5
theorem
  Banach-Zarecki's 23.11
  Besicovitch's 27.4
  Bochner's 42.2
  change of variable 26.13, 34.18, 34.19, 35.8, 38.18
  Denjoy's 29.9
  density 29.2
  Dini's Appendix
  Egorov's 12.6
  Fubini's 11.9, 26.9
  Gauss 37.22
  Green's 37.27
  Hopf's 5.5
  Kirszbraun's 30.6
  Lebesgue's 8.13, 8.14, 12.6
    decomposition 13.10
    density 29.2
    differentiation 22.5, 23.9
  Levi's 8.5, 8.11, 8.12
  Ljapunov's 2.16
  Luzin's 18.3
  McShane 30.5
  Orlicz-Pettis 43.5
  Plancherel's 33.7
  Rademacher's 30.3
  Radon-Nikodým's 13.4
  Riesz 12.3
    representation 16.5
  Sard's 34.17
  Schwartz 32.8
  Stokes 37.32, 38.21, 39.21
  Stone-Weierstrass Appendix
  Tietze's Appendix
  Tonelli's 26.10
  Vitali's 12.9, 27.2, 27.6
  Young's convolution 26.20
Tietze's theorem Appendix
Tonelli's theorem 26.10
topological group 19.2
total variation of a measure 6.6, 6.10
translation invariant measure 1.2
trigonometric series 33.14
trivial measure 2.5
unconditional convergence 41.1
uniformly convex space 12.14
unimodular group 19.11
unit tangent k-vector 38.14, 39.16
unit tangent vector 37.10
unitary matrix 34.2
upper Baire function 7.9
upper derivative 22.3
upper Riemann sum 7.2
vague convergence 17.2
variation of a function 21.1
variation of a measure 6.6
variation of a Radon integral 14.11
variation of a vector measure 41.7
vector field 37.1
vector measure 41.2
vector product 37.5
Vitali cover 27.1
Vitali property 27.1
Vitali's theorem 12.9, 27.2, 27.6
volume 34.10
  of a k-tuple of vectors 34.10
weak convergence 12.12, 12.14, 17.1, 32.9
weak* convergence 12.13, 12.14
weakly integrable function 43.2
weakly measurable function 40.1
weighted counting measure 2.10
Young's convolution theorem 26.20
Young's inequality 10.2

Jaroslav Lukeš, Jan Malý

MEASURE AND INTEGRAL


Published by
MATFYZPRESS
publishing house
of the Faculty of Mathematics and Physics
Charles University in Prague
Sokolovská 83, CZ - 186 75 Praha 8
as the 162. publication
Reviewer: Prof. RNDr. Ivan Netuka, DrSc.
This volume was typeset by the authors using AMS-TeX,
the macro system of the American Mathematical Society
Printed by Reproduction center UK MFF
Sokolovská 83, CZ - 186 75 Praha 8
Second edition
Prague 2005

ISBN 80-86732-68-1
ISBN 80-85863-06-5 (First edition)
