Probability with Martingales

David Williams
Statistical Laboratory, DPMMS
Cambridge University

CAMBRIDGE UNIVERSITY PRESS

Published in the United States of America by Cambridge University Press, New York

© Cambridge University Press 1991

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 1991
Twelfth printing 2010

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

ISBN 978-0-521-40605-5 paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents

Preface - please read!
A Question of Terminology
A Guide to Notation

Chapter 0: A Branching-Process Example
0.0. Introductory remarks. 0.1. Typical number of children, $X$. 0.2. Size of $n$th generation, $Z_n$. 0.3. Use of conditional expectations. 0.4. Extinction probability, $\pi$. 0.5. Pause for thought: measure. 0.6. Our first martingale. 0.7. Convergence (or not) of expectations. 0.8. Finding the distribution of $M_\infty$. 0.9. Concrete example.
PART A: FOUNDATIONS

Chapter 1: Measure Spaces
1.0. Introductory remarks. 1.1. Definitions of algebra, $\sigma$-algebra. 1.2. Examples. Borel $\sigma$-algebras, $\mathcal{B}(S)$, $\mathcal{B} = \mathcal{B}(\mathbb{R})$. 1.3. Definitions concerning set functions. 1.4. Definition of measure space. 1.5. Definitions concerning measures. 1.6. Lemma. Uniqueness of extension, $\pi$-systems. 1.7. Theorem. Carathéodory's extension theorem. 1.8. Lebesgue measure Leb on $((0,1], \mathcal{B}(0,1])$. 1.9. Lemma. Elementary inequalities. 1.10. Lemma. Monotone-convergence properties of measures. 1.11. Example/Warning.

Chapter 2: Events
2.1. Model for experiment: $(\Omega, \mathcal{F}, \mathsf{P})$. 2.2. The intuitive meaning. 2.3. Examples of $(\Omega, \mathcal{F})$ pairs. 2.4. Almost surely (a.s.) 2.5. Reminder: lim sup, lim inf, $\downarrow\lim$, etc. 2.6. Definitions. $\limsup E_n$, ($E_n$, i.o.). 2.7. First Borel-Cantelli Lemma (BC1). 2.8. Definitions. $\liminf E_n$, ($E_n$, ev). 2.9. Exercise.
Chapter 3: Random Variables
3.1. Definitions. $\Sigma$-measurable function, $m\Sigma$, $(m\Sigma)^+$, $b\Sigma$. 3.2. Elementary Propositions on measurability. 3.3. Lemma. Sums and products of measurable functions are measurable. 3.4. Composition Lemma. 3.5. Lemma on measurability of infs, lim infs of functions. 3.6. Definition. Random variable. 3.7. Example. Coin tossing. 3.8. Definition. $\sigma$-algebra generated by a collection of functions on $\Omega$. 3.9. Definitions. Law, Distribution Function. 3.10. Properties of distribution functions. 3.11. Existence of random variable with given distribution function. 3.12. Skorokhod representation of a random variable with prescribed distribution function. 3.13. Generated $\sigma$-algebras - a discussion. 3.14. The Monotone-Class Theorem.

Chapter 4: Independence
4.1. Definitions of independence. 4.2. The $\pi$-system Lemma; and the more familiar definitions. 4.3. Second Borel-Cantelli Lemma (BC2). 4.4. Example. 4.5. A fundamental question for modelling. 4.6. A coin-tossing model with applications. 4.7. Notation: IID RVs. 4.8. Stochastic processes; Markov chains. 4.9. Monkey typing Shakespeare. 4.10. Definition. Tail $\sigma$-algebras. 4.11. Theorem. Kolmogorov's 0-1 law. 4.12. Exercise/Warning.

Chapter 5: Integration
5.0. Notation, etc. $\mu(f) :=: \int f\,d\mu$, $\mu(f; A)$. 5.1. Integrals of non-negative simple functions, $SF^+$. 5.2. Definition of $\mu(f)$, $f \in (m\Sigma)^+$. 5.3. Monotone-Convergence Theorem (MON). 5.4. The Fatou Lemmas for functions (FATOU). 5.5. 'Linearity'. 5.6. Positive and negative parts of $f$. 5.7. Integrable function, $\mathcal{L}^1(S, \Sigma, \mu)$. 5.8. Linearity. 5.9. Dominated Convergence Theorem (DOM). 5.10. Scheffé's Lemma (SCHEFFÉ). 5.11. Remark on uniform integrability. 5.12. The standard machine. 5.13. Integrals over subsets. 5.14. The measure $f\mu$, $f \in (m\Sigma)^+$.
Chapter 6: Expectation
Introductory remarks. 6.1. Definition of expectation. 6.2. Convergence theorems. 6.3. The notation $\mathsf{E}(X; F)$. 6.4. Markov's inequality. 6.5. Sums of non-negative RVs. 6.6. Jensen's inequality for convex functions. 6.7. Monotonicity of $\mathcal{L}^p$ norms. 6.8. The Schwarz inequality. 6.9. $\mathcal{L}^2$: Pythagoras, covariance, etc. 6.10. Completeness of $\mathcal{L}^p$ ($1 \le p < \infty$). 6.11. Orthogonal projection. 6.12. The 'elementary formula' for expectation. 6.13. Hölder from Jensen.

Chapter 7: An Easy Strong Law
7.1. 'Independence means multiply' - again! 7.2. Strong Law - first version. 7.3. Chebyshev's inequality. 7.4. Weierstrass approximation theorem.

Chapter 8: Product Measure
8.0. Introduction and advice. 8.1. Product measurable structure, $\Sigma_1 \times \Sigma_2$. 8.2. Product measure, Fubini's Theorem. 8.3. Joint laws, joint pdfs. 8.4. Independence and product measure. 8.5. $\mathcal{B}(\mathbb{R})^n = \mathcal{B}(\mathbb{R}^n)$. 8.6. The $n$-fold extension. 8.7. Infinite products of probability triples. 8.8. Technical note on the existence of joint laws.
PART B: MARTINGALE THEORY

Chapter 9: Conditional Expectation
9.1. A motivating example. 9.2. Fundamental Theorem and Definition (Kolmogorov, 1933). 9.3. The intuitive meaning. 9.4. Conditional expectation as least-squares-best predictor. 9.5. Proof of Theorem 9.2. 9.6. Agreement with traditional expression. 9.7. Properties of conditional expectation: a list. 9.8. Proofs of the properties in Section 9.7. 9.9. Regular conditional probabilities and pdfs. 9.10. Conditioning under independence assumptions. 9.11. Use of symmetry: an example.

Chapter 10: Martingales
10.1. Filtered spaces. 10.2. Adapted processes. 10.3. Martingale, supermartingale, submartingale. 10.4. Some examples of martingales. 10.5. Fair and unfair games. 10.6. Previsible process, gambling strategy. 10.7. A fundamental principle: you can't beat the system! 10.8. Stopping time. 10.9. Stopped supermartingales are supermartingales. 10.10. Doob's Optional-Stopping Theorem. 10.11. Awaiting the almost inevitable. 10.12. Hitting times for simple random walk. 10.13. Non-negative superharmonic functions for Markov chains.

Chapter 11: The Convergence Theorem
11.1. The picture that says it all. 11.2. Upcrossings. 11.3. Doob's Upcrossing Lemma. 11.4. Corollary. 11.5. Doob's 'Forward' Convergence Theorem. 11.6. Warning. 11.7. Corollary.
Chapter 12: Martingales bounded in $\mathcal{L}^2$
12.0. Introduction. 12.1. Martingales in $\mathcal{L}^2$: orthogonality of increments. 12.2. Sums of zero-mean independent random variables in $\mathcal{L}^2$. 12.3. Random signs. 12.4. A symmetrization technique: expanding the sample space. 12.5. Kolmogorov's Three-Series Theorem. 12.6. Cesàro's Lemma. 12.7. Kronecker's Lemma. 12.8. A Strong Law under variance constraints. 12.9. Kolmogorov's Truncation Lemma. 12.10. Kolmogorov's Strong Law of Large Numbers (SLLN). 12.11. Doob decomposition. 12.12. The angle-brackets process $\langle M \rangle$. 12.13. Relating convergence of $M$ to finiteness of $\langle M \rangle_\infty$. 12.14. A trivial 'Strong Law' for martingales in $\mathcal{L}^2$. 12.15. Lévy's extension of the Borel-Cantelli Lemmas. 12.16. Comments.

Chapter 13: Uniform Integrability
13.1. An 'absolute continuity' property. 13.2. Definition. UI family. 13.3. Two simple sufficient conditions for the UI property. 13.4. UI property of conditional expectations. 13.5. Convergence in probability. 13.6. Elementary proof of (BDD). 13.7. A necessary and sufficient condition for $\mathcal{L}^1$ convergence.

Chapter 14: UI Martingales
14.0. Introduction. 14.1. UI martingales. 14.2. Lévy's 'Upward' Theorem. 14.3. Martingale proof of Kolmogorov's 0-1 law. 14.4. Lévy's 'Downward' Theorem. 14.5. Martingale proof of the Strong Law. 14.6. Doob's Submartingale Inequality. 14.7. Law of the Iterated Logarithm: special case. 14.8. A standard estimate on the normal distribution. 14.9. Remarks on exponential bounds; large deviation theory. 14.10. A consequence of Hölder's inequality. 14.11. Doob's $\mathcal{L}^p$ inequality. 14.12. Kakutani's Theorem on 'product' martingales. 14.13. The Radon-Nikodým theorem. 14.14. The Radon-Nikodým theorem and conditional expectation. 14.15. Likelihood ratio; equivalent measures. 14.16. Likelihood ratio and conditional expectation. 14.17. Kakutani's Theorem revisited; consistency of LR test. 14.18. Note on Hardy spaces, etc.

Chapter 15: Applications
15.0. Introduction - please read! 15.1. A trivial martingale-representation result. 15.2. Option pricing; discrete Black-Scholes formula. 15.3. The Mabinogion sheep problem. 15.4. Proof of Lemma 15.3(c). 15.5. Proof of result 15.3(d). 15.6. Recursive nature of conditional probabilities. 15.7. Bayes' formula for bivariate normal distributions. 15.8. Noisy observation of a single random variable. 15.9. The Kalman-Bucy filter. 15.10. Harnesses entangled. 15.11. Harnesses unravelled, 1. 15.12. Harnesses unravelled, 2.
PART C: CHARACTERISTIC FUNCTIONS

Chapter 16: Basic Properties of CFs
16.1. Definition. 16.2. Elementary properties. 16.3. Some uses of characteristic functions. 16.4. Three key results. 16.5. Atoms. 16.6. Lévy's Inversion Formula. 16.7. A table.

Chapter 17: Weak Convergence
17.1. The 'elegant' definition. 17.2. A 'practical' formulation. 17.3. Skorokhod representation. 17.4. Sequential compactness for Prob($\mathbb{R}$). 17.5. Tightness.

Chapter 18: The Central Limit Theorem
18.1. Lévy's Convergence Theorem. 18.2. $o$ and $O$ notation. 18.3. Some important estimates. 18.4. The Central Limit Theorem. 18.5. Example. 18.6. CF proof of Lemma 12.4.
APPENDICES

Chapter A1: Appendix to Chapter 1
A1.1. A non-measurable subset $A$ of $S^1$. A1.2. $d$-systems. A1.3. Dynkin's Lemma. A1.4. Proof of Uniqueness Lemma 1.6. A1.5. $\lambda$-sets: 'algebra' case. A1.6. Outer measures. A1.7. Carathéodory's Lemma. A1.8. Proof of Carathéodory's Theorem. A1.9. Proof of the existence of Lebesgue measure on $((0,1], \mathcal{B}(0,1])$. A1.10. Example of non-uniqueness of extension. A1.11. Completion of a measure space. A1.12. The Baire category theorem.

Chapter A3: Appendix to Chapter 3
A3.1. Proof of the Monotone-Class Theorem 3.14. A3.2. Discussion of generated $\sigma$-algebras.

Chapter A4: Appendix to Chapter 4
A4.1. Kolmogorov's Law of the Iterated Logarithm. A4.2. Strassen's Law of the Iterated Logarithm. A4.3. A model for a Markov chain.

Chapter A5: Appendix to Chapter 5
A5.1. Doubly monotone arrays. A5.2. The key use of Lemma 1.10(a). A5.3. 'Uniqueness of integral'. A5.4. Proof of the Monotone-Convergence Theorem.

Chapter A9: Appendix to Chapter 9
A9.1. Infinite products: setting things up. A9.2. Proof of A9.1(e).

Chapter A13: Appendix to Chapter 13
A13.1. Modes of convergence: definitions. A13.2. Modes of convergence: relationships.

Chapter A14: Appendix to Chapter 14
A14.1. The $\sigma$-algebra $\mathcal{F}_T$, $T$ a stopping time. A14.2. A special case of OST. A14.3. Doob's Optional-Sampling Theorem for UI martingales. A14.4. The result for UI submartingales.

Chapter A16: Appendix to Chapter 16
A16.1. Differentiation under the integral sign.

Chapter E: Exercises
References
Index
Preface - please read!

The most important chapter in this book is Chapter E: Exercises. I have left the interesting things for you to do. You can start now on the 'EG' exercises, but see 'More about exercises' later in this Preface.

The book, which is essentially the set of lecture notes for a third-year undergraduate course at Cambridge, is as lively an introduction as I can manage to the rigorous theory of probability. Since much of the book is devoted to martingales, it is bound to become very lively: look at those Exercises on Chapter 10! But, of course, there is that initial plod through the measure-theoretic foundations. It must be said however that measure theory, that most arid of subjects when done for its own sake, becomes amazingly more alive when used in probability, not only because it is then applied, but also because it is immensely enriched.

You cannot avoid measure theory: an event in probability is a measurable set, a random variable is a measurable function on the sample space, the expectation of a random variable is its integral with respect to the probability measure; and so on. To be sure, one can take some central results from measure theory as axiomatic in the main text, giving careful proofs in appendices; and indeed that is exactly what I have done.

Measure theory for its own sake is based on the fundamental addition rule for measures. Probability theory supplements that with the multiplication rule which describes independence; and things are already looking up. But what really enriches and enlivens things is that we deal with lots of $\sigma$-algebras, not just the one $\sigma$-algebra which is the concern of measure theory.

In planning this book, I decided for every topic what things I considered just a bit too advanced, and, often with sadness, I have ruthlessly omitted them.

For a more thorough training in many of the topics covered here, see Billingsley (1979), Chow and Teicher (1978), Chung (1968), Kingman and Taylor (1966), Laha and Rohatgi (1979), and Neveu (1965). As regards measure theory, I learnt it from Dunford and Schwartz (1958) and Halmos (1959). After reading this book, you must read the still-magnificent Breiman (1968), and, for an excellent indication of what can be done with discrete martingales, Hall and Heyde (1980).

Of course, intuition is much more important than knowledge of measure theory, and you should take every opportunity to sharpen your intuition. There is no better whetstone for this than Aldous (1989), though it is a very demanding book. For appreciating the scope of probability and for learning how to think about it, Karlin and Taylor (1981), Grimmett and Stirzaker (1982), Hall (1988), and Grimmett's recent superb book, Grimmett (1989), on percolation are strongly recommended.

More about exercises. In compiling Chapter E, which consists exactly of the homework sheets I give to the Cambridge students, I have taken into account the fact that this book, like any other mathematics book, implicitly contains a vast number of other exercises, many of which are easier than those in Chapter E. I refer of course to the exercises you create by reading the statement of a result, and then trying to prove it for yourself, before you read the given proof. One other point about exercises: you will, for example, surely forgive my using expectation $\mathsf{E}$ in Exercises on Chapter 4 before $\mathsf{E}$ is treated with full rigour in Chapter 6.

Acknowledgements. My first thanks must go to the students who have endured the course on which the book is based and whose quality has made me try hard to make it worthy of them; and to those, especially David Kendall, who had developed the course before it became my privilege to teach it. My thanks to David Tranah and other staff of CUP for their help in converting the course into this book. Next, I must thank Ben Garling, James Norris and Chris Rogers without whom the book would have contained more errors and obscurities. (The many faults which surely remain in it are my responsibility.) Helen Rutherford and I typed part of the book, but the vast majority of it was typed by Sarah Shea-Simonds in a virtuoso performance worthy of Horowitz. My thanks to Helen and, most especially, to Sarah. Special thanks to my wife, Sheila, too, for all her help.

But my best thanks - and yours if you derive any benefit from the book - must go to three people whose names appear in capitals in the Index: J.L. Doob, A.N. Kolmogorov and P. Lévy: without them, there wouldn't have been much to write about, as Doob (1953) splendidly confirms.

Statistical Laboratory, Cambridge
David Williams
October 1990
A Question of Terminology

Random variables: functions or equivalence classes?

At the level of this book, the theory would be more 'elegant' if we regarded a random variable as an equivalence class of measurable functions on the sample space, two functions belonging to the same equivalence class if and only if they are equal almost everywhere. Then the conditional-expectation map

$$X \mapsto \mathsf{E}(X \mid \mathcal{G})$$

would be a truly well-defined contraction map from $L^p(\Omega, \mathcal{F}, \mathsf{P})$ to $L^p(\Omega, \mathcal{G}, \mathsf{P})$ for $p \ge 1$; and we would not have to keep mentioning versions (representatives of equivalence classes) and would be able to avoid the endless 'almost surely' qualifications.

I have however chosen the 'inelegant' route: firstly, I prefer to work with functions, and confess to preferring

$$4 + 5 = 2 \bmod 7 \quad\text{to}\quad [4]_7 + [5]_7 = [2]_7.$$

But there is a substantive reason. I hope that this book will tempt you to progress to the much more interesting, and more important, theory where the parameter set of our process is uncountable (e.g. it may be the time-parameter set $[0,\infty)$). There, the equivalence-class formulation just will not work: the 'cleverness' of introducing quotient spaces loses the subtlety which is essential even for formulating the fundamental results on existence of continuous modifications, etc., unless one performs contortions which are hardly elegant. Even if these contortions allow one to formulate results, one would still have to use genuine functions to prove them; so where does the reality lie?!
A Guide to Notation

▶ signifies something important, ▶▶ something very important, and ▶▶▶ the Martingale Convergence Theorem.

I use ':=' to signify 'is defined to equal'. This Pascal notation is particularly convenient because it can also be used in the reversed sense.

I use analysts' (as opposed to category theorists') conventions:

▶ $\mathbb{N} := \{1,2,3,\ldots\} \subseteq \{0,1,2,\ldots\} =: \mathbb{Z}^+$.

Everyone is agreed that $\mathbb{R}^+ := [0,\infty)$.

For a set $B$ contained in some universal set $S$, $I_B$ denotes the indicator function of $B$: that is $I_B : S \to \{0,1\}$ and

$$I_B(s) := \begin{cases} 1 & \text{if } s \in B, \\ 0 & \text{otherwise.} \end{cases}$$

For $a, b \in \mathbb{R}$,

$$a \wedge b := \min(a, b), \qquad a \vee b := \max(a, b).$$

CF: characteristic function; DF: distribution function; pdf: probability density function.

$\sigma$-algebra, $\sigma(\mathcal{C})$ (1.1); $\sigma(Y_\gamma : \gamma \in C)$ (3.8, 3.13).
$\pi$-system (1.6); $d$-system (A1.2).

a.e.: almost everywhere (1.5)
a.s.: almost surely (2.4)
$b\Sigma$: the space of bounded $\Sigma$-measurable functions (3.1)
$\mathcal{B}(S)$: Borel $\sigma$-algebra on $S$, $\mathcal{B} := \mathcal{B}(\mathbb{R})$ (1.2)
$C \bullet X$: discrete stochastic integral (10.6)
$d\lambda/d\mu$: Radon-Nikodým derivative (5.14)
$d\mathsf{Q}/d\mathsf{P}$: Likelihood Ratio (14.13)
$\mathsf{E}(X)$: expectation $\mathsf{E}(X) := \int_\Omega X(\omega)\,\mathsf{P}(d\omega)$ of $X$ (6.3)
$\mathsf{E}(X; F)$: $\int_F X\,d\mathsf{P}$ (6.3)
$\mathsf{E}(X \mid \mathcal{G})$: conditional expectation (9.3)
($E_n$, ev): $\liminf E_n$ (2.8)
($E_n$, i.o.): $\limsup E_n$ (2.6)
$f_X$: probability density function (pdf) of $X$ (6.12)
$f_{X,Y}$: joint pdf (8.3)
$f_{X|Y}$: conditional pdf (9.6)
$F_X$: distribution function of $X$ (3.9)
lim inf: for sets, (2.8)
lim sup: for sets, (2.6)
$x = \uparrow\lim x_n$: $x_n \uparrow x$ in that $x_n \le x_{n+1}$ ($\forall n$) and $x_n \to x$.
log: natural (base e) logarithm
$\mathcal{L}_X$, $\Lambda_X$: law of $X$ (3.9)
$L^p$, $\mathcal{L}^p$: Lebesgue spaces (6.7, 6.13)
Leb: Lebesgue measure (1.8)
$m\Sigma$: space of $\Sigma$-measurable functions (3.1)
$M^T$: process $M$ stopped at time $T$ (10.9)
$\langle M \rangle$: angle-brackets process (12.12)
$\mu(f)$: integral of $f$ with respect to $\mu$ (5.0, 5.2)
$\mu(f; A)$: $\int_A f\,d\mu$ (5.0, 5.2)
$\varphi_X$: CF of $X$ (Chapter 16)
$\varphi$: pdf of standard normal $N(0,1)$ distribution
$\Phi$: DF of $N(0,1)$ distribution
$X^T$: $X$ stopped at time $T$ (10.9)
Chapter 0

A Branching-Process Example

(This Chapter is not essential for the remainder of the book. You can start with Chapter 1 if you wish.)

0.0. Introductory remarks

The purpose of this chapter is threefold: to take something which is probably well known to you from books such as the immortal Feller (1957) or Ross (1976), so that you start on familiar ground; to make you start to think about some of the problems involved in making the elementary treatment into rigorous mathematics; and to indicate what new results appear if one applies the somewhat more advanced theory developed in this book. We stick to one example: a branching process. This is rich enough to show that the theory has some substance.
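0.1. Typical number of children, X

In our model, the number of children of a typical animal (see Notes below for some interpretations of 'child' and 'animal') is a random variable $X$ with values in $\mathbb{Z}^+$. We assume that

$$\mathsf{P}(X = 0) > 0.$$

We define the generating function $f$ of $X$ as the map $f : [0,1] \to [0,1]$, where

$$f(\theta) := \mathsf{E}(\theta^X) = \sum_{k \in \mathbb{Z}^+} \theta^k\,\mathsf{P}(X = k).$$

Standard theorems on power series imply that, for $\theta \in [0,1]$,

$$f'(\theta) = \mathsf{E}(X\theta^{X-1}) = \sum_{k} k\theta^{k-1}\,\mathsf{P}(X = k)$$

and

$$\mu := \mathsf{E}(X) = f'(1) = \sum_{k} k\,\mathsf{P}(X = k) \le \infty.$$

Of course, $f'(1)$ is here interpreted as

$$\lim_{\theta \uparrow 1} \frac{f(\theta) - f(1)}{\theta - 1} = \lim_{\theta \uparrow 1} \frac{1 - f(\theta)}{1 - \theta},$$

since $f(1) = 1$. We assume that $\mu < \infty$.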
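As a concrete check of these definitions, here is a minimal sketch that evaluates the generating function and its slope at 1 exactly. The offspring law `pmf` (0, 1 or 2 children) and the helper names are assumptions chosen for illustration; they do not come from the text.

```python
from fractions import Fraction


def generating_function(pmf, theta):
    """f(theta) = sum_k theta^k P(X = k), for a finitely supported law."""
    return sum(p * theta**k for k, p in pmf.items())


def mean_offspring(pmf):
    """mu = f'(1) = sum_k k P(X = k)."""
    return sum(k * p for k, p in pmf.items())


# Hypothetical offspring law: P(X=0)=1/4, P(X=1)=1/4, P(X=2)=1/2.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}

f_at_1 = generating_function(pmf, Fraction(1))
mu = mean_offspring(pmf)

# The difference quotient (1 - f(theta)) / (1 - theta) for theta just below 1
# approximates the slope f'(1) from the left.
theta = Fraction(999, 1000)
quotient = (1 - generating_function(pmf, theta)) / (1 - theta)

print(f_at_1, mu, quotient)
```

For this particular law the quotient equals $1/4 + (1+\theta)/2$, which increases to $\mu = 5/4$ as $\theta \uparrow 1$.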
Notes. The first application of branching-process theory was to the question of survival of family names; and in that context, animal = man, and child = son.

In another context, 'animal' can be 'neutron', and 'child' of that neutron will signify a neutron released if and when the parent neutron crashes into a nucleus. Whether or not the associated branching process is supercritical can be a matter of real importance.

We can often find branching processes embedded in richer structures and can then use the results of this chapter to start the study of more interesting things.

For superb accounts of branching processes, see Athreya and Ney (1972), Harris (1963), Kendall (1966, 1975).

0.2. Size of nth generation, Z_n

To be a bit formal: suppose that we are given a doubly infinite sequence

$$\{X_r^{(m)} : m, r \in \mathbb{N}\} \tag{a}$$

of independent identically distributed random variables (IID RVs), each with the same distribution as $X$:

$$\mathsf{P}(X_r^{(m)} = k) = \mathsf{P}(X = k).$$

The idea is that for $n \in \mathbb{Z}^+$ and $r \in \mathbb{N}$, the variable $X_r^{(n+1)}$ represents the number of children (who will be in the $(n+1)$th generation) of the $r$th animal (if there is one) in the $n$th generation. The fundamental rule therefore is that if $Z_n$ signifies the size of the $n$th generation, then

$$Z_{n+1} = X_1^{(n+1)} + \cdots + X_{Z_n}^{(n+1)}. \tag{b}$$

We assume that $Z_0 = 1$, so that (b) gives a full recursive definition of the sequence $(Z_m : m \in \mathbb{Z}^+)$ from the sequence (a). Our first task is to calculate the distribution function of $Z_n$, or equivalently to find the generating function

$$f_n(\theta) := \mathsf{E}(\theta^{Z_n}) = \sum_{k} \theta^k\,\mathsf{P}(Z_n = k). \tag{c}$$
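The recursive rule (b) translates directly into a simulation. The sketch below is only an illustration: the offspring law, the function names and the seed are assumptions, and fresh draws from a seeded generator stand in for the doubly infinite IID family (a).

```python
import random


def next_generation(z_n, offspring_values, offspring_probs, rng):
    """Rule (b): Z_{n+1} = X_1 + ... + X_{Z_n}, one fresh draw per animal."""
    if z_n == 0:
        return 0  # an empty generation has no children
    draws = rng.choices(offspring_values, weights=offspring_probs, k=z_n)
    return sum(draws)


def simulate_sizes(generations, offspring_values, offspring_probs, seed=0):
    """Return [Z_0, Z_1, ..., Z_generations] with Z_0 = 1."""
    rng = random.Random(seed)
    sizes = [1]
    for _ in range(generations):
        sizes.append(
            next_generation(sizes[-1], offspring_values, offspring_probs, rng)
        )
    return sizes


# Hypothetical offspring law: 0, 1 or 2 children with probabilities 1/4, 1/4, 1/2.
sizes = simulate_sizes(10, [0, 1, 2], [0.25, 0.25, 0.5], seed=1)
print(sizes)
```

Each run with a different seed gives one sample path of $(Z_n)$; once a generation is empty, every later generation is empty too.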
0.3. Use of conditional expectations

The first main result is that for $n \in \mathbb{Z}^+$ (and $\theta \in [0,1]$)

$$f_{n+1}(\theta) = f_n(f(\theta)), \tag{a}$$

so that for each $n \in \mathbb{Z}^+$, $f_n$ is the $n$-fold composition

$$f_n = f \circ f \circ \cdots \circ f. \tag{b}$$

Note that the 0-fold composition is by convention the identity map $f_0(\theta) = \theta$, in agreement with - indeed, forced by - the fact that $Z_0 = 1$.

To prove (a), we use - at the moment in intuitive fashion - the following very special case of the very useful Tower Property of Conditional Expectation:

$$\mathsf{E}(U) = \mathsf{E}\,\mathsf{E}(U \mid V); \tag{c}$$

to find the expectation of a random variable $U$, first find the conditional expectation $\mathsf{E}(U \mid V)$ of $U$ given $V$, and then find the expectation of that. We prove the ultimate form of (c) at a later stage.

We apply (c) with $U = \theta^{Z_{n+1}}$ and $V = Z_n$:

$$\mathsf{E}(\theta^{Z_{n+1}}) = \mathsf{E}\,\mathsf{E}(\theta^{Z_{n+1}} \mid Z_n).$$

Now, for $k \in \mathbb{Z}^+$, the conditional expectation of $\theta^{Z_{n+1}}$ given that $Z_n = k$ satisfies

$$\mathsf{E}(\theta^{Z_{n+1}} \mid Z_n = k) = \mathsf{E}\big(\theta^{X_1^{(n+1)} + \cdots + X_k^{(n+1)}} \mid Z_n = k\big). \tag{d}$$

But $Z_n$ is constructed from variables $X_s^{(r)}$ with $r \le n$, and so $Z_n$ is independent of $X_1^{(n+1)}, \ldots, X_k^{(n+1)}$. The conditional expectation given $Z_n = k$ in the right-hand term in (d) must therefore agree with the absolute expectation

$$\mathsf{E}\big(\theta^{X_1^{(n+1)}} \cdots \theta^{X_k^{(n+1)}}\big). \tag{e}$$

But the expression at (e) is an expectation of the product of independent random variables and, as part of the family of 'Independence means multiply' results, we know that this expectation of a product may be rewritten as the product of expectations. Since (for every $n$ and $r$)

$$\mathsf{E}\big(\theta^{X_r^{(n+1)}}\big) = f(\theta),$$

we have proved that

$$\mathsf{E}(\theta^{Z_{n+1}} \mid Z_n = k) = f(\theta)^k,$$

and this is what it means to say that

$$\mathsf{E}(\theta^{Z_{n+1}} \mid Z_n) = f(\theta)^{Z_n}.$$

[If $V$ takes only integer values, then when $V = k$, the conditional expectation $\mathsf{E}(U \mid V)$ of $U$ given $V$ is equal to the conditional expectation $\mathsf{E}(U \mid V = k)$ of $U$ given that $V = k$. (Sounds reasonable!)] Property (c) now yields

$$\mathsf{E}\,\theta^{Z_{n+1}} = \mathsf{E}\,f(\theta)^{Z_n},$$

and, since $\mathsf{E}(\alpha^{Z_n}) = f_n(\alpha)$, result (a) is proved.

Independence and conditional expectations are two of the main topics in this course.
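Result (a) can be checked exactly on a small example by computing the law of $Z_n$ through conditioning on the previous generation, just as in the argument above. This is a sketch under assumptions: the offspring law `law_x` and all function names are illustrative, and laws are stored as lists indexed by the value taken.

```python
from fractions import Fraction


def convolve(a, b):
    """Law of the sum of two independent non-negative integer variables."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def next_law(law_zn, law_x):
    """Law of Z_{n+1} = X_1 + ... + X_{Z_n}, conditioning on Z_n = k."""
    max_size = (len(law_zn) - 1) * (len(law_x) - 1)
    out = [Fraction(0)] * (max_size + 1)
    k_fold = [Fraction(1)]  # law of the empty sum: point mass at 0
    for prob_k in law_zn:   # prob_k = P(Z_n = k) for k = 0, 1, 2, ...
        for j, mass in enumerate(k_fold):
            out[j] += prob_k * mass
        k_fold = convolve(k_fold, law_x)  # law of X_1 + ... + X_{k+1}
    return out


def pgf(law, theta):
    """Generating function E(theta^Z) of a law given as a list."""
    return sum(p * theta**k for k, p in enumerate(law))


# Hypothetical offspring law: P(X=0)=1/4, P(X=1)=1/4, P(X=2)=1/2.
law_x = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
laws = [[Fraction(0), Fraction(1)]]  # Z_0 = 1
for _ in range(3):
    laws.append(next_law(laws[-1], law_x))

theta = Fraction(1, 3)
lhs = pgf(laws[3], theta)                              # f_3(theta)
rhs = pgf(laws[2], pgf(law_x, theta))                  # f_2(f(theta))
threefold = pgf(law_x, pgf(law_x, pgf(law_x, theta)))  # f(f(f(theta)))
print(lhs == rhs == threefold)
```

Because the arithmetic uses `Fraction`, the three quantities agree exactly rather than up to rounding.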
0.4. Extinction probability, π

Let $\pi_n := \mathsf{P}(Z_n = 0)$. Then $\pi_n = f_n(0)$, so that, by (0.3,b),

$$\pi_{n+1} = f(\pi_n). \tag{a}$$

Measure theory confirms our intuition about the extinction probability:

$$\pi := \mathsf{P}(Z_m = 0 \text{ for some } m) = \uparrow\lim \pi_n. \tag{b}$$

Because $f$ is continuous, it follows from (a) that

$$\pi = f(\pi). \tag{c}$$

The function $f$ is analytic on $(0,1)$, and is non-decreasing and convex (of non-decreasing slope). Also, $f(1) = 1$ and $f(0) = \mathsf{P}(X = 0) > 0$. The slope $f'(1)$ of $f$ at 1 is $\mu = \mathsf{E}(X)$. The celebrated pictures opposite now make the following Theorem obvious.

THEOREM

If $\mathsf{E}(X) > 1$, then the extinction probability $\pi$ is the unique root of the equation $\pi = f(\pi)$ which lies strictly between 0 and 1. If $\mathsf{E}(X) \le 1$, then $\pi = 1$.

[Figure: graph of $y = f(x)$.] Case 1: subcritical, $\mu = f'(1) < 1$. Clearly, $\pi = 1$. The critical case $\mu = 1$ has a similar picture.

[Figure: graph of $y = f(x)$.] Case 2: supercritical, $\mu = f'(1) > 1$. Now, $\pi < 1$.
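The recursion $\pi_{n+1} = f(\pi_n)$ with $\pi_0 = 0$ also gives a practical way to approximate $\pi$. The following sketch is an illustration only; the two offspring laws, the tolerance and the function names are assumptions, not part of the text.

```python
def extinction_probability(f, tol=1e-12, max_iter=100_000):
    """Iterate pi_{n+1} = f(pi_n) from pi_0 = 0 until successive values agree."""
    pi = 0.0  # pi_0 = P(Z_0 = 0) = 0 because Z_0 = 1
    for _ in range(max_iter):
        nxt = f(pi)
        if abs(nxt - pi) < tol:
            return nxt
        pi = nxt
    return pi


def f_super(theta):
    # Hypothetical supercritical law: P(X=0)=1/4, P(X=1)=1/4, P(X=2)=1/2, mu = 5/4.
    return 0.25 + 0.25 * theta + 0.5 * theta**2


def f_sub(theta):
    # Hypothetical subcritical law: P(X=0)=1/2, P(X=1)=1/4, P(X=2)=1/4, mu = 3/4.
    return 0.5 + 0.25 * theta + 0.25 * theta**2


pi_super = extinction_probability(f_super)
pi_sub = extinction_probability(f_sub)
print(pi_super, pi_sub)
```

For the first law the fixed-point equation is $2\pi^2 - 3\pi + 1 = 0$, with roots $1/2$ and $1$, and the increasing sequence $(\pi_n)$ settles on the smaller root. For the second law the only root in $[0,1]$ is $1$.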
0.5. Pause for thought: measure

Now that we have finished revising what introductory courses on probability theory say about branching-process theory, let us think about why we must find a more precise language. To be sure, the claim at (0.4,b) that

$$\pi = \uparrow\lim \pi_n \tag{a}$$

is intuitively plausible, but how could one prove it? We certainly cannot prove it at present because we have no means of stating with pure-mathematical precision what it is supposed to mean. Let us discuss this further.

Back in Section 0.2, we said 'Suppose that we are given a doubly infinite sequence $\{X_r^{(m)} : m, r \in \mathbb{N}\}$ of independent identically distributed random variables each with the same distribution as $X$'. What does this mean? A random variable is a (certain kind of) function on a sample space $\Omega$. We could follow elementary theory in taking $\Omega$ to be the set of all outcomes, in other words, taking $\Omega$ to be the Cartesian product

$$\Omega = \prod_{r,s \in \mathbb{N}} \mathbb{Z}^+,$$

the typical element $\omega$ of $\Omega$ being

$$\omega = (\omega_s^{(r)} : r \in \mathbb{N},\ s \in \mathbb{N}),$$

and then setting $X_s^{(r)}(\omega) = \omega_s^{(r)}$. Now $\Omega$ is an uncountable set, so that we are outside the 'combinatorial' context which makes sense of $\pi_n$ in the elementary theory. Moreover, if one assumes the Axiom of Choice, one can prove that it is impossible to assign to all subsets of $\Omega$ a probability satisfying the 'intuitively obvious' axioms and making the $X$'s IID RVs with the correct common distribution. So, we have to know that the set of $\omega$ corresponding to the event 'extinction occurs' is one to which one can uniquely assign a probability (which will then provide a definition of $\pi$). Even then, we have to prove (a).

Example. Consider for a moment what is in some ways a bad attempt to construct a 'probability theory'. Let $\mathcal{C}$ be the class of subsets $C$ of $\mathbb{N}$ for which the 'density'

$$\rho(C) := \lim_{n \uparrow \infty} n^{-1}\,\#\{k : 1 \le k \le n;\ k \in C\}$$

exists. Let $C_n := \{1, 2, \ldots, n\}$. Then $C_n \in \mathcal{C}$ and $C_n \uparrow \mathbb{N}$ in the sense that $C_n \subseteq C_{n+1}$, $\forall n$, and also $\bigcup C_n = \mathbb{N}$. However, $\rho(C_n) = 0$, $\forall n$, but $\rho(\mathbb{N}) = 1$.

Hence the logic which will allow us correctly to deduce (a) from the fact that

$$\{Z_n = 0\} \uparrow \{\text{extinction occurs}\}$$

fails for the $(\mathbb{N}, \mathcal{C}, \rho)$ set-up: $(\mathbb{N}, \mathcal{C}, \rho)$ is not 'a probability triple'.

There are problems. Measure theory resolves them, but provides a huge bonus in the form of much deeper results such as the Martingale Convergence Theorem which we now take a first look at - at an intuitive level, I hasten to add.

Recall
martingale
that
from
(0.2,b)
is clear
where the X^^'^^^variablesare independent from this that
of
the
values
Zi, Z2,...,
It Z\342\200\236.
P(Zn+i ^
a result
(Zn : n
j\\Zo
io.Zi
=ii,...,Zn=
in) =
P(2n+1 = j\\Zn
that
in),
> 0)
which you will is a Markov
probably chain.
as recognize We therefore =
stating have
the process
Z \342\200\224
E(Zn+l|Zo =
= .. ,Z\342\200\236 io.Zi = Z'l,. in)
2_^ jP(Z\342\200\236+i
j\\Zn
= in)
=
=
or, in
(a)
E(Zn+l|Z\342\200\236
in),
a condensed
and better notation,

=
E(Z\342\200\236+i|Z\342\200\236).
E(Z\342\200\236+,|Zo,Zi,...,Z\342\200\236)
Of
course,
it is intuitively
obvious that
== E{Zn^,\\Zn)
(b)
because
children.
flZn,
each
We
of the
Zn
animals
in
the
n^^ generation
differentiating
can confirm
result (b) by
has on average
result
(j.
the
with
respect
to 6
and setting 6=1.
Chapter
0:
A Branching-Process
Example
(0.6)..
Now define
(C)
Then
Mn := ^n//i\",
>
0.
E(Mn+i|Zo,Zi,...,Z\342\200\236)-Mn,
which exactly says that M is a martingale relative (d)

Given the
to
the
Z process.
value
it is now: what M is 'constant on average' in this very sophisticated of conditional expectation given 'past' and 'present'.The true statement
history of
up
to stage
n, the next
Mn+i
of M
is on
average
sense
(e)
is
E(Afn)
of course
A
= l,
Vn
infinitely
S is
cruder.
said to
statement 1 if
be true almost
P(5 is
surely (a.s.)
or
with
probability
(surprise,
surprise!)
true) =1.
{Mn >
is
Because
our Martingale
martingale
is non-negative implies
0,Vn), the
surely
Convergence
Theorem
that
it
almost
true that
(f)
Note
Moo:=limMn
that
exists.
can
> 0 for some outcome (which when probability only /i > 1), then the statement
if Moo
happen
with
positive
Zn//i\"
^ Moo
1; what
(a.s.)
is a precise formulation question is: suppose

the value
0.7.
We o/Mo\302\251?
of that
'exponential
/.i >
A particularly fascinating growth'. is the behaviour of Z conditional on
Convergence
know
(or Moo :=
/i
not)
of expectations
probabiUty
that
lim Mn exists with
1, and
that
Vn. We
know
might be
if
that
eventually
0. Hence
^/m
tempted to believethat E(Moo) = 1. However, we already < 1, then, almost surely, the process dies out and Mn is
1, then
0 =
E(M\342\200\236)
1,
(a)
<
Moo =
0 (a.s.)
and
E(Moo)7^1imE(Mn)
= l.
..(0.8)
This is
Chapter 0: A
for
Branching-Process
Example
Fatou's Lemma,
variables:
to keepin an excellentexample
valid
mind
when
we come
any
sequence
Yn)
{Yn) of
non-negative random
to study
E(liminf
What
< liminf
E(Fn).
/i <
is
that
are
'going
Mn
will
large value
times its small probability

Section
it
at wrong' be large
0.9.
is
(when (a) is that if Mn is not 0 will
1) for
large n, the chances

speaking,
and, very
keep
roughly
this
E(Mn)
at 1.
See the concrete
examples
in
Of course,
very
important
to know
E(lim-),
when
(b)
and
general
limE(-)=
we
do spend
are
quite a
rarely
considerabletime studying
fact
this.
The
best
concrete
theorems
good enough
and
to get the best resultsfor

fi
problems, (c)
where
as is
evidenced by the
=
\"^
that
E(Moo)
X
=
if
only
if hoth
children. though
>
1 and
E(XlogX) may not
<
cx),
is the
and
Moo
E(XlogX)
0, a.s.
typical number of = cx), then, even
Of the
course process
0 log 0 = 0. li /j
>
die out.
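The 'large value times small probability' mechanism can be tabulated for a subcritical example. This is a sketch under stated assumptions: the offspring law is hypothetical, and the code takes $\mathsf{E}(M_n) = 1$ from (0.6,e) as given, so that $\mathsf{E}(M_n \mid M_n \ne 0) = 1/\mathsf{P}(M_n \ne 0)$.

```python
def f(theta):
    # Hypothetical subcritical law: P(X=0)=1/2, P(X=1)=1/4, P(X=2)=1/4.
    return 0.5 + 0.25 * theta + 0.25 * theta**2


mu = 0.25 * 1 + 0.25 * 2  # mean number of children, 0.75 <= 1

rows = []
pi_n = 0.0  # pi_0 = P(Z_0 = 0) = 0 because Z_0 = 1
for n in range(1, 41):
    pi_n = f(pi_n)                        # pi_n = P(Z_n = 0) = P(M_n = 0)
    survive = 1.0 - pi_n                  # P(M_n != 0)
    mean_given_survival = 1.0 / survive   # E(M_n | M_n != 0), using E(M_n) = 1
    rows.append((n, survive, mean_given_survival))

for n, survive, mean_given_survival in rows[::10]:
    print(n, survive, mean_given_survival)
```

As $n$ grows, the survival probability shrinks towards 0 while the conditional mean grows without bound, and their product stays at 1. That is how $\mathsf{E}(M_n) = 1$ for every $n$ can coexist with $M_\infty = 0$ almost surely.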
0.8.
Since
Finding
Mn
the distribution of Moo

(a.s.), it is obvious that
exp(-AAfn)
for A >
-^ Moo
0,
-^ exp{-XMoo)
(a.s.)
Now
since
each
Mn
> 0,
in absolute
value by the
experiment.
assert
The
the whole sequence (exp( bounded \342\200\224AMn)) is constant 1, independently of the outcome of our
Convergence
Bounded
Theorem
says
that
we can
now
what
we would
wish:
(a)
Since
Eexp(-AA/oo)= limEexp{-XMn).
Mn
Zn/^\"\" and
E(6\302\273^\")
fn{0),
we have
(b)
so
Eexp(-AM,)
that,
fn{exp{-X/fi^)),
in principle
However,
side of (a).
function
(if very rarely in practice), we

for
can
calculate
the
the left-hand
distribution
a non-negative
random
variable F,
by
\302\273\342\200\224\342\226\272 <
P{Y
y) is completely
A
determined
oji
the
map
\302\273\342\200\224\342\226\272 Eexp(\342\200\224Ay)
(0,cx)).
10
Chapter
0:
Branching-Process
Example
(0.8)..
Hence, in principle, we can find the distribution of $M_\infty$. We have seen that the real problem is to calculate the function

$$L(\lambda) := E \exp(-\lambda M_\infty).$$

Using (b), the fact that $f_{n+1} = f \circ f_n$, and the continuity of $L$ (another consequence of the Bounded Convergence Theorem), you can immediately establish the functional equation:

(c) $L(\lambda\mu) = f(L(\lambda))$.
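The functional equation can be checked numerically. The sketch below assumes, purely for illustration, the geometric offspring law of Section 0.9 with $p = 0.3$, so that $f(\theta) = p/(1-q\theta)$ and $\mu = q/p > 1$; `L_approx` is a hypothetical helper that approximates $L(\lambda)$ by $f_n(\exp(-\lambda/\mu^n))$ for a moderate $n$, as suggested by (a) and (b).

```python
import math

p, q = 0.3, 0.7        # assumed geometric offspring law, mu = q/p > 1
mu = q / p

def f(theta):
    """Generating function of the assumed offspring law P(X = k) = p q^k."""
    return p / (1.0 - q * theta)

def L_approx(lam, n=20):
    """Approximate L(lam) = E exp(-lam * M_inf) by f_n(exp(-lam / mu^n))."""
    theta = math.exp(-lam / mu ** n)
    for _ in range(n):
        theta = f(theta)
    return theta

for lam in (0.5, 1.0, 2.0):
    # The two columns should agree up to the approximation error.
    print(lam, L_approx(lam * mu), f(L_approx(lam)))
```

The choice $n = 20$ is a compromise: larger $n$ reduces the truncation error but pushes $\exp(-\lambda/\mu^n)$ so close to 1 that floating-point rounding starts to matter.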
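0.9. Concrete example

This concrete example is just about the only one in which one can calculate everything explicitly, but, in the way of mathematics, it is useful in many contexts.

We take the 'typical number of children' $X$ to have a geometric distribution:

(a) $P(X = k) = pq^k$ $(k \in \mathbb{Z}^+)$, where $0 < p < 1$, $q := 1 - p$.

Then, as you can easily check,

(b) $f(\theta) = \dfrac{p}{1 - q\theta}$, $\quad \mu = \dfrac{q}{p}$, $\quad$ and $\quad \pi = \begin{cases} p/q & \text{if } q > p, \\ 1 & \text{if } q \le p. \end{cases}$

To calculate $f \circ f \circ \dots \circ f$, we use a device familiar from the geometry of the upper half-plane. If

$$G = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}$$

is a non-singular $2 \times 2$ matrix, define the fractional linear transformation:

(c) $G(\theta) = \dfrac{g_{11}\theta + g_{12}}{g_{21}\theta + g_{22}}$.

Then you can check that if $H$ is another such matrix, then

$$G(H(\theta)) = (GH)(\theta),$$

so that composition of fractional linear transformations corresponds to matrix multiplication.

Suppose that $p \neq q$. Then, by the $S^{-1}\Lambda S$ method, for example, we find that the $n$th power of the matrix corresponding to $f$ is

$$\begin{pmatrix} 0 & p \\ -q & 1 \end{pmatrix}^n = (q-p)^{-1} \begin{pmatrix} 1 & p \\ 1 & q \end{pmatrix} \begin{pmatrix} p^n & 0 \\ 0 & q^n \end{pmatrix} \begin{pmatrix} q & -p \\ -1 & 1 \end{pmatrix},$$

so that

(d) $f_n(\theta) = \dfrac{p\mu^n(1-\theta) + q\theta - p}{q\mu^n(1-\theta) + q\theta - p}$.

If $\mu = q/p < 1$, then $\lim_n f_n(\theta) = 1$, corresponding to the fact that the process dies out.

Suppose now that $\mu > 1$. Then you can easily check that, for $\lambda > 0$,

$$L(\lambda) := E \exp(-\lambda M_\infty) = \lim_n f_n(\exp(-\lambda/\mu^n)) = \frac{p\lambda + q - p}{q\lambda + q - p} = \pi e^{-\lambda \cdot 0} + \int_0^\infty (1-\pi)^2 e^{-\lambda x} e^{-(1-\pi)x}\,dx,$$

from which we deduce that

$$P(M_\infty = 0) = \pi, \qquad P(x < M_\infty < x + dx) = (1-\pi)^2 e^{-(1-\pi)x}\,dx \quad (x > 0),$$

or, better,

$$P(M_\infty > x) = (1-\pi) e^{-(1-\pi)x} \quad (x > 0).$$

Suppose that $\mu < 1$. In this case, it is interesting to ask: what is the distribution of $Z_n$ conditioned by $Z_n \neq 0$? We find that

$$E(\theta^{Z_n} \mid Z_n \neq 0) = \frac{f_n(\theta) - f_n(0)}{1 - f_n(0)} = \frac{\alpha_n\theta}{1 - \beta_n\theta},$$

where

$$\alpha_n = \frac{p - q}{p - q\mu^n}, \qquad \beta_n = \frac{q - q\mu^n}{p - q\mu^n},$$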
so $0 < \alpha_n < 1$ and $\alpha_n + \beta_n = 1$. As $n \to \infty$, we see that

$$\alpha_n \to 1 - \mu, \qquad \beta_n \to \mu,$$

so (this is justified)

(e) $\lim_{n \to \infty} P(Z_n = k \mid Z_n \neq 0) = (1 - \mu)\mu^{k-1}$ $(k \in \mathbb{N})$.

Suppose that $\mu = 1$. You can show by induction that

$$f_n(\theta) = \frac{n - (n-1)\theta}{(n+1) - n\theta},$$

and that

$$E(e^{-\lambda Z_n/n} \mid Z_n \neq 0) \to 1/(1 + \lambda),$$

corresponding to

(f) $P(Z_n/n > x \mid Z_n \neq 0) \to e^{-x}$, $\quad x > 0$.
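The closed forms for $f_n$ can be checked against direct iteration. The sketch below uses exact rational arithmetic; the function names are illustrative, and the matrix $\begin{pmatrix} 0 & p \\ -q & 1 \end{pmatrix}$ is the one corresponding to $f(\theta) = p/(1 - q\theta)$ as a fractional linear transformation.

```python
from fractions import Fraction

def f(theta, p):
    """Generating function p / (1 - q*theta) of the geometric offspring law."""
    return p / (1 - (1 - p) * theta)

def f_n_iterated(theta, p, n):
    """f_n = f o f o ... o f (n times), by direct iteration."""
    for _ in range(n):
        theta = f(theta, p)
    return theta

def f_n_closed(theta, p, n):
    """Formula (d) when p != q; the induction formula when p = q = 1/2."""
    q = 1 - p
    if p == q:
        return (n - (n - 1) * theta) / ((n + 1) - n * theta)
    m = (q / p) ** n
    return (p * m * (1 - theta) + q * theta - p) / (q * m * (1 - theta) + q * theta - p)

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def f_n_matrix(theta, p, n):
    """Apply the n-th power of [[0, p], [-q, 1]] as a fractional linear map."""
    g = [[Fraction(0), p], [-(1 - p), Fraction(1)]]
    m = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for _ in range(n):
        m = mat_mul(m, g)
    return (m[0][0] * theta + m[0][1]) / (m[1][0] * theta + m[1][1])

p, theta = Fraction(1, 3), Fraction(1, 2)
print(f_n_iterated(theta, p, 5), f_n_closed(theta, p, 5), f_n_matrix(theta, p, 5))
```

All three routes should give the same rational number, which is a useful guard against sign slips in (d).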
The 'Fatou factor'

We know that when $\mu \le 1$, we have $E(M_n) = 1$, $\forall n$, but $E(M_\infty) = 0$. Can we get some insight into this?

First consider the case when $\mu < 1$. Result (e) makes it plausible that for large $n$, $E(Z_n \mid Z_n \neq 0)$ is roughly

$$(1 - \mu) \sum_k k\mu^{k-1} = 1/(1 - \mu).$$

We know that $P(Z_n \neq 0) = 1 - f_n(0)$ is roughly $(1 - \mu)\mu^n$, so we should have (roughly)

$$E(M_n) = E\left(\frac{Z_n}{\mu^n} \,\Big|\, Z_n \neq 0\right) P(Z_n \neq 0) = \frac{1}{(1-\mu)\mu^n}\,(1-\mu)\mu^n = 1,$$

which might help explain how the 'balance' $E(M_n) = 1$ is achieved by big values times small probabilities.
Now consider the case when $\mu = 1$. Then

$$P(Z_n \neq 0) = 1/(n + 1),$$

and, from (f), $Z_n/n$ conditioned by $Z_n \neq 0$ is roughly exponential with mean 1, so that $M_n = Z_n$ conditioned by $Z_n \neq 0$ is on average of size about $n$, the correct order of magnitude for balance.

Warning. We have just been using for 'correct intuitive explanations' exactly the type of argument which might have misled us into thinking that $E(M_\infty) = 1$ in the first place. But, of course, the result

$$E(M_n) = E(M_n \mid Z_n \neq 0)\,P(Z_n \neq 0) = 1$$

is a matter of obvious fact.
PART A: FOUNDATIONS

Chapter 1: Measure Spaces
1.0. Introductory remarks

Topology is about open sets. The characterizing property of a continuous function $f$ is that the inverse image $f^{-1}(G)$ of an open set $G$ is open.

Measure theory is about measurable sets. The characterizing property of a measurable function $f$ is that the inverse image $f^{-1}(A)$ of any measurable set is measurable.

In topology, one axiomatizes the notion of 'open set', insisting in particular that the union of any collection of open sets is open, and that the intersection of a finite collection of open sets is open.

In measure theory, one axiomatizes the notion of 'measurable set', insisting that the union of a countable collection of measurable sets is measurable, and that the intersection of a countable collection of measurable sets is also measurable. Also, the complement of a measurable set must be measurable, and the whole space must be measurable. Thus the measurable sets form a $\sigma$-algebra, a structure stable (or 'closed') under countably many set operations. Without the insistence that 'only countably many operations are allowed', measure theory would be self-contradictory - a point lost on certain philosophers of probability.
The probability that a point chosen at random on the surface of the unit sphere $S^2$ in $\mathbb{R}^3$ falls into the subset $F$ of $S^2$ is just the area of $F$ divided by the total area $4\pi$. What could be easier?

However, Banach and Tarski showed (see Wagon (1985)) that if the Axiom of Choice is assumed, as it is throughout conventional mathematics, then there exists a subset $F$ of the unit sphere $S^2$ in $\mathbb{R}^3$ such that for $3 \le k < \infty$ (and even for $k = \infty$), $S^2$ is the disjoint union of $k$ exact copies of $F$:

$$S^2 = \bigcup_{i=1}^{k} \tau_i^{(k)} F,$$

where each $\tau_i^{(k)}$ is a rotation. If $F$ has an 'area', then that area must simultaneously be $4\pi/3, 4\pi/4, \dots, 0$. The only conclusion is that the set $F$ is non-measurable (not Lebesgue measurable): it is so complicated that one cannot assign an area to it. Banach and Tarski have not broken the Law of Conservation of Area: they have simply operated outside its jurisdiction.

Remarks. (i) Because every rotation $\tau$ has a fixed point $x$ on $S^2$ such that $\tau(x) = x$, it is not possible to find a subset $A$ of $S^2$ and a rotation $\tau$ such that $A \cup \tau(A) = S^2$ and $A \cap \tau(A) = \emptyset$. So, we could not have taken $k = 2$.

(ii) Banach and Tarski even proved that given any two bounded subsets $A$ and $B$ of $\mathbb{R}^3$ each with non-empty interior, it is possible to decompose $A$ into a certain finite number $n$ of disjoint pieces $A = \bigcup_{i=1}^n A_i$ and $B$ into the same number $n$ of disjoint pieces $B = \bigcup_{i=1}^n B_i$, in such a way that, for each $i$, $A_i$ is Euclid-congruent to $B_i$!!! So, we can disassemble $A$ and rebuild it as $B$.

(iii) Section A1.1 (optional!) in the appendix to this chapter gives an Axiom-of-Choice construction of a non-measurable subset of $S^1$.

This chapter introduces $\sigma$-algebras, $\pi$-systems, and measures and emphasizes monotone-convergence properties of measures. We shall see in later chapters that, although not all sets are measurable, it is always the case for probability theory that enough sets are measurable.
1.1. Definitions of algebra, $\sigma$-algebra

Let $S$ be a set.

Algebra on S

A collection $\Sigma_0$ of subsets of $S$ is called an algebra on $S$ (or algebra of subsets of $S$) if

(i) $S \in \Sigma_0$,
(ii) $F \in \Sigma_0 \Rightarrow F^c := S \setminus F \in \Sigma_0$,
(iii) $F, G \in \Sigma_0 \Rightarrow F \cup G \in \Sigma_0$.

[Note that $\emptyset = S^c \in \Sigma_0$ and

$$F, G \in \Sigma_0 \Rightarrow F \cap G = (F^c \cup G^c)^c \in \Sigma_0.]$$

Thus, an algebra on $S$ is a family of subsets of $S$ stable under finitely many set operations.
Exercise (optional). Let $\mathcal{C}$ be the class of subsets $C$ of $\mathbb{N}$ for which the 'density'

$$\lim_{m \uparrow \infty} m^{-1}\, \#\{k : 1 \le k \le m;\ k \in C\}$$

exists. We might like to think of this density (if it exists) as 'the probability that a number chosen at random belongs to $C$'. But there are many reasons why this does not conform to a proper probability theory. (We saw one in Section 0.5.) For example, you should find elements $F$ and $G$ in $\mathcal{C}$ for which $F \cap G \notin \mathcal{C}$.
Note on terminology ('algebra versus field'). An algebra in our sense is a true algebra in the algebraists' sense with $\cap$ as product, and symmetric difference

$$A \,\triangle\, B := (A \cup B) \setminus (A \cap B)$$

as 'sum', the underlying field of the algebra being the field with 2 elements. (This is why we prefer 'algebra of subsets' to 'field of subsets': there is no way that an algebra of subsets is a field in the algebraists' sense - unless $\Sigma_0$ is trivial, that is, $\Sigma_0 = \{S, \emptyset\}$.)

$\sigma$-algebra on S

A collection $\Sigma$ of subsets of $S$ is called a $\sigma$-algebra on $S$ (or $\sigma$-algebra of subsets of $S$) if $\Sigma$ is an algebra on $S$ such that whenever $F_n \in \Sigma$ $(n \in \mathbb{N})$, then

$$\bigcup_n F_n \in \Sigma.$$

[Note that if $\Sigma$ is a $\sigma$-algebra on $S$ and $F_n \in \Sigma$ for $n \in \mathbb{N}$, then

$$\bigcap_n F_n = \Big(\bigcup_n F_n^c\Big)^c \in \Sigma.]$$

Thus, a $\sigma$-algebra on $S$ is a family of subsets of $S$ 'stable under any countable collection of set operations'.

Note. Whereas it is usually possible to write in 'closed form' the typical element of many of the algebras of sets which we shall meet (see Section 1.8 below for a first example), it is usually impossible to write down the typical element of a $\sigma$-algebra. This is the reason for our concentrating where possible on the much simpler '$\pi$-systems'.

Measurable space

A pair $(S, \Sigma)$, where $S$ is a set and $\Sigma$ is a $\sigma$-algebra on $S$, is called a measurable space. An element of $\Sigma$ is called a $\Sigma$-measurable subset of $S$.

$\sigma(\mathcal{C})$, $\sigma$-algebra generated by a class $\mathcal{C}$ of subsets

Let $\mathcal{C}$ be a class of subsets of $S$. Then $\sigma(\mathcal{C})$, the $\sigma$-algebra generated by $\mathcal{C}$, is the smallest $\sigma$-algebra $\Sigma$ on $S$ such that $\mathcal{C} \subseteq \Sigma$. It is the intersection of all $\sigma$-algebras on $S$ which have $\mathcal{C}$ as a subclass. (Obviously, the class of all subsets of $S$ is a $\sigma$-algebra which extends $\mathcal{C}$.)
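On a finite set, every algebra is automatically a $\sigma$-algebra, so $\sigma(\mathcal{C})$ can be computed by brute force. The sketch below is only an illustration of the definition under that finiteness assumption; the function name `generated_sigma_algebra` is hypothetical. It closes the class under complements and pairwise unions until nothing new appears.

```python
from itertools import combinations

def generated_sigma_algebra(space, generators):
    """Smallest family containing the generators and closed under complement
    and union. Only meaningful as sigma(C) because `space` is finite."""
    space = frozenset(space)
    sigma = {frozenset(), space} | {frozenset(c) for c in generators}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        for f in current:                      # close under complements
            if space - f not in sigma:
                sigma.add(space - f)
                changed = True
        for f, g in combinations(current, 2):  # close under pairwise unions
            if f | g not in sigma:
                sigma.add(f | g)
                changed = True
    return sigma

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(f) for f in sigma))
```

Here the generators $\{1\}$ and $\{2\}$ cannot separate 3 from 4, so the generated family consists of the unions of the three 'atoms' $\{1\}$, $\{2\}$, $\{3,4\}$.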
1.2. Examples. Borel $\sigma$-algebras, $\mathcal{B}(S)$, $\mathcal{B} = \mathcal{B}(\mathbb{R})$

Let $S$ be a topological space.

$\mathcal{B}(S)$

$\mathcal{B}(S)$, the Borel $\sigma$-algebra on $S$, is the $\sigma$-algebra generated by the family of open subsets of $S$. With slight abuse of notation,

$$\mathcal{B}(S) := \sigma(\text{open sets}).$$

$\mathcal{B} := \mathcal{B}(\mathbb{R})$

It is standard shorthand that $\mathcal{B} := \mathcal{B}(\mathbb{R})$.

The $\sigma$-algebra $\mathcal{B}$ is the most important of all $\sigma$-algebras. Every subset of $\mathbb{R}$ which you meet in everyday use is an element of $\mathcal{B}$; and indeed it is difficult (but possible!) to find a subset of $\mathbb{R}$ constructed explicitly (without the Axiom of Choice) which is not in $\mathcal{B}$.

Elements of $\mathcal{B}$ can be quite complicated. However, the collection

$$\pi(\mathbb{R}) := \{(-\infty, x] : x \in \mathbb{R}\}$$

(not a standard notation) is very easy to understand, and it is often the case that all we need to know about $\mathcal{B}$ is that

(a) $\mathcal{B} = \sigma(\pi(\mathbb{R}))$.

Proof of (a). For each $x$ in $\mathbb{R}$, $(-\infty, x] = \bigcap_{n \in \mathbb{N}} (-\infty, x + n^{-1})$, so that as a countable intersection of open sets, the set $(-\infty, x]$ is in $\mathcal{B}$.

All that remains to be proved is that every open subset $G$ of $\mathbb{R}$ is in $\sigma(\pi(\mathbb{R}))$. But every such $G$ is a countable union of open intervals, so we need only show that, for $a, b \in \mathbb{R}$ with $a < b$,

$$(a, b) \in \sigma(\pi(\mathbb{R})).$$

But, for any $u$ with $u > a$,

$$(a, u] = (-\infty, u] \cap (-\infty, a]^c \in \sigma(\pi(\mathbb{R})),$$

and since, for $\varepsilon = \tfrac{1}{2}(b - a)$,

$$(a, b) = \bigcup_n (a, b - \varepsilon n^{-1}],$$

we see that $(a, b) \in \sigma(\pi(\mathbb{R}))$, and the proof is complete.
1.3. Definitions concerning set functions

Let $S$ be a set, let $\Sigma_0$ be an algebra on $S$, and let $\mu_0$ be a non-negative set function

$$\mu_0 : \Sigma_0 \to [0, \infty].$$

Additive

Then $\mu_0$ is called additive if $\mu_0(\emptyset) = 0$ and, for $F, G \in \Sigma_0$,

$$F \cap G = \emptyset \quad\Rightarrow\quad \mu_0(F \cup G) = \mu_0(F) + \mu_0(G).$$

Countably additive

The map $\mu_0$ is called countably additive (or $\sigma$-additive) if $\mu_0(\emptyset) = 0$ and whenever $(F_n : n \in \mathbb{N})$ is a sequence of disjoint sets in $\Sigma_0$ with union $F = \bigcup F_n$ in $\Sigma_0$ (note that this is an assumption since $\Sigma_0$ need not be a $\sigma$-algebra), then

$$\mu_0(F) = \sum_n \mu_0(F_n).$$

Of course (why?), a countably additive set function is additive.
1.4. Definition of measure space

Let $(S, \Sigma)$ be a measurable space, so that $\Sigma$ is a $\sigma$-algebra on $S$. A map

$$\mu : \Sigma \to [0, \infty]$$

is called a measure on $(S, \Sigma)$ if $\mu$ is countably additive. The triple $(S, \Sigma, \mu)$ is then called a measure space.
1.5. Definitions concerning measures

Let $(S, \Sigma, \mu)$ be a measure space. Then $\mu$ (or indeed the measure space $(S, \Sigma, \mu)$) is called

finite if $\mu(S) < \infty$,

$\sigma$-finite if there is a sequence $(S_n : n \in \mathbb{N})$ of elements of $\Sigma$ such that

$$\mu(S_n) < \infty \ (\forall n \in \mathbb{N}) \quad\text{and}\quad \bigcup S_n = S.$$

Warning. Intuition is usually OK for finite measures, and adapts well for $\sigma$-finite measures. However, measures which are not $\sigma$-finite can be crazy; fortunately, there are no such measures in this book.

Probability measure, probability triple

Our measure $\mu$ is called a probability measure if

$$\mu(S) = 1,$$

and $(S, \Sigma, \mu)$ is then called a probability triple.

$\mu$-null element of $\Sigma$, almost everywhere (a.e.)

An element $F$ of $\Sigma$ is called $\mu$-null if $\mu(F) = 0$. A statement $\mathcal{S}$ about points $s$ of $S$ is said to hold almost everywhere (a.e.) if

$$F := \{s : \mathcal{S}(s) \text{ is false}\} \in \Sigma \quad\text{and}\quad \mu(F) = 0.$$
1.6. LEMMA. Uniqueness of extension, $\pi$-systems

Moral: $\sigma$-algebras are 'difficult', but $\pi$-systems are 'easy'; so we aim to work with the latter.

►(a) Let $S$ be a set. Let $\mathcal{I}$ be a $\pi$-system on $S$, that is, a family of subsets of $S$ stable under finite intersection:

$$I_1, I_2 \in \mathcal{I} \quad\Rightarrow\quad I_1 \cap I_2 \in \mathcal{I}.$$

Let $\Sigma := \sigma(\mathcal{I})$. Suppose that $\mu_1$ and $\mu_2$ are measures on $(S, \Sigma)$ such that $\mu_1(S) = \mu_2(S) < \infty$ and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then

$$\mu_1 = \mu_2 \text{ on } \Sigma.$$

►(b) Corollary. If two probability measures agree on a $\pi$-system, then they agree on the $\sigma$-algebra generated by that $\pi$-system.

The example $\mathcal{B} = \sigma(\pi(\mathbb{R}))$ is of course the most important example of the $\Sigma = \sigma(\mathcal{I})$ in the theorem.

This result will play an important role. Indeed, it will be applied more frequently than will the celebrated existence result in Section 1.7. Because of this, the proof of Lemma 1.6 given in Sections A1.2-1.4 of the appendix to this chapter should perhaps be consulted - but read the remainder of this chapter first.
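The hypothesis that the generating class is stable under intersection cannot simply be dropped. The sketch below gives a small finite illustration (the measures `mu1`, `mu2` and the class `C` are hypothetical choices, not taken from the text): two probability measures on $\{1,2,3,4\}$ agree on a class that generates all subsets but is not a $\pi$-system, and yet differ on $\{2\}$.

```python
from fractions import Fraction

# Two hypothetical probability measures on S = {1, 2, 3, 4}, given by point masses.
mu1 = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 4), 4: Fraction(1, 4)}
mu2 = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1, 2), 4: Fraction(0)}

def measure(weights, subset):
    """Measure of a subset as the sum of its point masses."""
    return sum((weights[s] for s in subset), Fraction(0))

def is_pi_system(family):
    """True if the family is stable under pairwise (hence finite) intersection."""
    fam = {frozenset(c) for c in family}
    return all((a & b) in fam for a in fam for b in fam)

C = [{1, 2}, {2, 3}]   # generates every subset of S, but {1,2} & {2,3} = {2} is missing

print(is_pi_system(C))                                       # not a pi-system
print([measure(mu1, c) == measure(mu2, c) for c in C])       # agreement on C
print(measure(mu1, {2}), measure(mu2, {2}))                  # disagreement on {2}
```

Adding $\{2\}$ to the class makes it a $\pi$-system, and then the two measures no longer agree on it, which is consistent with the Lemma.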
1.7. THEOREM. Carathéodory's Extension Theorem

►► Let $S$ be a set, let $\Sigma_0$ be an algebra on $S$, and let

$$\Sigma := \sigma(\Sigma_0).$$

If $\mu_0$ is a countably additive map $\mu_0 : \Sigma_0 \to [0, \infty]$, then there exists a measure $\mu$ on $(S, \Sigma)$ such that

$$\mu = \mu_0 \text{ on } \Sigma_0.$$

If $\mu_0(S) < \infty$, then, by Lemma 1.6, this extension is unique - an algebra is a $\pi$-system!

In a sense, this result should have more ► signs than any other, for without it we could not construct any interesting models. However, once we have our model, we make no further use of the theorem.

The proof of this result given in Sections A1.5-1.8 of the appendix is there for completeness. It will do no harm to assume the result for this course. Let us now see how the theorem is used.
1.8. Lebesgue measure Leb on $((0,1], \mathcal{B}(0,1])$

Let $S = (0,1]$. For $F \subseteq S$, say that $F \in \Sigma_0$ if $F$ may be written as a finite union

(*) $F = (a_1, b_1] \cup \dots \cup (a_r, b_r]$

where $r \in \mathbb{N}$, $0 \le a_1 \le b_1 \le \dots \le a_r \le b_r \le 1$. Then $\Sigma_0$ is an algebra on $(0,1]$ and

$$\Sigma := \sigma(\Sigma_0) = \mathcal{B}(0,1].$$

(We write $\mathcal{B}(0,1]$ instead of $\mathcal{B}((0,1])$.) For $F$ as at (*), let

$$\mu_0(F) = \sum_{k \le r} (b_k - a_k).$$

Then $\mu_0$ is well-defined and additive on $\Sigma_0$ (this is easy). Moreover, $\mu_0$ is countably additive on $\Sigma_0$. (This is not trivial. See Section A1.9.) Hence, by Theorem 1.7, there exists a unique measure $\mu$ on $((0,1], \mathcal{B}(0,1])$ extending $\mu_0$ on $\Sigma_0$. This measure $\mu$ is called Lebesgue measure on $((0,1], \mathcal{B}(0,1])$ or (loosely) Lebesgue measure on $(0,1]$. We shall often denote $\mu$ by Leb. Lebesgue measure (still denoted by Leb) on $([0,1], \mathcal{B}[0,1])$ is of course obtained by a trivial modification, the set $\{0\}$ having Lebesgue measure 0. Of course, Leb makes precise the concept of length.

In a similar way, we can construct ($\sigma$-finite) Lebesgue measure (which we also denote by Leb) on $\mathbb{R}$ (more strictly, on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$).
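The claim that $\mu_0$ is well-defined (independent of how $F$ is written as a finite union) and additive can be explored with a small sketch. The helpers `canonical` and `mu0` below are hypothetical; as an extension beyond the text's ordered representation (*), they accept any finite list of half-open intervals $(a, b]$, merging overlapping or abutting pieces before summing lengths.

```python
from fractions import Fraction

def canonical(intervals):
    """Merge half-open intervals (a, b] into disjoint, non-abutting pieces."""
    pieces = sorted((a, b) for a, b in intervals if a < b)
    merged = []
    for a, b in pieces:
        if merged and a <= merged[-1][1]:
            # (a, b] overlaps or abuts the previous piece, so extend that piece
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def mu0(intervals):
    """Total length of the set represented by the given intervals."""
    return sum((b - a for a, b in canonical(intervals)), Fraction(0))

half = Fraction(1, 2)
two_pieces = [(Fraction(0), half), (half, Fraction(3, 4))]
one_piece = [(Fraction(0), Fraction(3, 4))]
print(mu0(two_pieces), mu0(one_piece))   # same set, same value
```

Countable additivity, the non-trivial part, is of course not something a finite computation like this can demonstrate.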
1.9. LEMMA. Elementary inequalities

Let $(S, \Sigma, \mu)$ be a measure space. Then

(a) $\mu(A \cup B) \le \mu(A) + \mu(B)$ $\quad (A, B \in \Sigma)$,

►(b) $\mu\big(\bigcup_{i \le n} F_i\big) \le \sum_{i \le n} \mu(F_i)$ $\quad (F_1, F_2, \dots, F_n \in \Sigma)$.

Furthermore, if $\mu(S) < \infty$, then

(c) $\mu(A \cup B) = \mu(A) + \mu(B) - \mu(A \cap B)$ $\quad (A, B \in \Sigma)$,

(d) (inclusion-exclusion formula): for $F_1, F_2, \dots, F_n \in \Sigma$,

$$\mu\Big(\bigcup_{i \le n} F_i\Big) = \sum_{i \le n} \mu(F_i) - \mathop{\sum\sum}_{i < j \le n} \mu(F_i \cap F_j) + \mathop{\sum\sum\sum}_{i < j < k \le n} \mu(F_i \cap F_j \cap F_k) - \dots + (-1)^{n-1} \mu(F_1 \cap F_2 \cap \dots \cap F_n),$$

successive partial sums alternating between over- and under-estimates.

You will surely have seen some version of these results previously. Result (c) is obvious because $A \cup B$ is the disjoint union $A \cup (B \setminus (A \cap B))$. But (c) $\Rightarrow$ (a) $\Rightarrow$ (b) - check that 'infinities do not matter'. You can deduce (d) from (c) by induction, but, as we shall see later, the neat way to prove (d) is by integration.
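The alternation of the partial sums in the inclusion-exclusion formula is easy to observe with counting measure on a finite set. The following is a minimal sketch; the function name and the four example sets are illustrative choices.

```python
from itertools import combinations

def inclusion_exclusion_partial_sums(sets):
    """Successive partial sums of the inclusion-exclusion series for the
    number of elements of F_1 u ... u F_n (counting measure)."""
    partial, total = [], 0
    for r in range(1, len(sets) + 1):
        term = sum(len(set.intersection(*combo)) for combo in combinations(sets, r))
        total += (-1) ** (r - 1) * term
        partial.append(total)
    return partial

sets = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}, {1, 7}]
print(len(set().union(*sets)), inclusion_exclusion_partial_sums(sets))
```

For these sets the union has 7 elements, and the partial sums are 13, 6, 7, 7: an over-estimate, then an under-estimate, then the exact value.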
1.10. LEMMA. Monotone-convergence properties of measures

These results are often needed for making things rigorous. (Peep ahead to the 'Monkey typing Shakespeare' Section 4.9.) Again, let $(S, \Sigma, \mu)$ be a measure space.

►(a) If $F_n \in \Sigma$ $(n \in \mathbb{N})$ and $F_n \uparrow F$, then $\mu(F_n) \uparrow \mu(F)$.

Notes. $F_n \uparrow F$ means: $F_n \subseteq F_{n+1}$ $(\forall n \in \mathbb{N})$, $\bigcup F_n = F$. Result (a) is the fundamental property of measure.

Proof of (a). Write $G_1 := F_1$, $G_n := F_n \setminus F_{n-1}$ $(n \ge 2)$. Then the sets $G_n$ $(n \in \mathbb{N})$ are disjoint, and

$$\mu(F_n) = \mu(G_1 \cup G_2 \cup \dots \cup G_n) = \sum_{k \le n} \mu(G_k) \uparrow \sum_{k < \infty} \mu(G_k) = \mu(F). \qquad \square$$

Application. In a proper formulation of the branching-process example of Chapter 0, $\{Z_n = 0\} \uparrow \{\text{extinction occurs}\}$, so that $\pi_n \uparrow \pi$. (A proper formulation of the branching-process example will be given later.)

►(b) If $G_n \in \Sigma$, $G_n \downarrow G$ and $\mu(G_k) < \infty$ for some $k$, then $\mu(G_n) \downarrow \mu(G)$.

Proof of (b). For $n \in \mathbb{N}$, let $F_n := G_k \setminus G_{k+n}$, and now apply part (a).

Example - to indicate what can 'go wrong'. For $n \in \mathbb{N}$, let $H_n := (n, \infty)$. Then

$$\text{Leb}(H_n) = \infty,\ \forall n, \quad\text{but}\quad H_n \downarrow \emptyset.$$

►(c) The union of a countable number of $\mu$-null sets is $\mu$-null.

This is a trivial corollary of results (1.9,b) and (1.10,a).
1.11. Example/Warning

Let $(S, \Sigma, \mu)$ be $([0,1], \mathcal{B}[0,1], \text{Leb})$. Let $\varepsilon(k)$ be a sequence of strictly positive numbers such that $\varepsilon(k) \downarrow 0$. For a single point $x$ of $S$, we have

(a) $\{x\} \subseteq (x - \varepsilon(k), x + \varepsilon(k)) \cap S$,

so that for every $k$, $\mu(\{x\}) \le 2\varepsilon(k)$, and so $\mu(\{x\}) = 0$. That $\{x\}$ is $\mathcal{B}(S)$-measurable follows because $\{x\}$ is the intersection of the countable number of open subsets of $S$ on the right-hand side of (a).

Let $V = \mathbb{Q} \cap [0,1]$, the set of rationals in $[0,1]$. Since $V$ is a countable union of singletons: $V = \{v_n : n \in \mathbb{N}\}$, it is clear that $V$ is $\mathcal{B}[0,1]$-measurable and that $\text{Leb}(V) = 0$. We can include $V$ in an open subset of $S$ of measure at most $4\varepsilon(k)$ as follows:

$$V \subseteq G_k = \bigcup_{n \in \mathbb{N}} \big[(v_n - \varepsilon(k)2^{-n},\ v_n + \varepsilon(k)2^{-n}) \cap S\big] =: \bigcup_n I_{n,k}.$$

Clearly, $H := \bigcap_k G_k$ satisfies $\text{Leb}(H) = 0$ and $V \subseteq H$. Now, it is a consequence of the Baire category theorem (see the appendix to this chapter) that $H$ is uncountable, so

(b) the set $H$ is an uncountable set of measure 0; moreover,

$$H = \bigcap_k \bigcup_n I_{n,k} \neq \bigcup_n \bigcap_k I_{n,k} = V.$$

Throughout the subject, we have to be careful about interchanging orders of operations.
Chapter 2: Events

2.1. Model for experiment: $(\Omega, \mathcal{F}, P)$

A model for an experiment involving randomness takes the form of a probability triple $(\Omega, \mathcal{F}, P)$ in the sense of Section 1.5.

Sample space

$\Omega$ is a set called the sample space.

Sample point

A point $\omega$ of $\Omega$ is called a sample point.

Event

The $\sigma$-algebra $\mathcal{F}$ on $\Omega$ is called the family of events, so that an event is an element of $\mathcal{F}$, that is, an $\mathcal{F}$-measurable subset of $\Omega$.

By definition of probability triple, $P$ is a probability measure on $(\Omega, \mathcal{F})$.
2.2. The intuitive meaning

Tyche, Goddess of Chance, chooses a point $\omega$ of $\Omega$ 'at random' according to the law $P$ in that, for $F$ in $\mathcal{F}$, $P(F)$ represents the 'probability' (in the sense understood by our intuition) that the point $\omega$ chosen by Tyche belongs to $F$.

The chosen point $\omega$ determines the outcome of the experiment. Thus there is a map

$$\Omega \to \text{set of outcomes}, \qquad \omega \mapsto \text{outcome}.$$

There is no reason why this 'map' (the co-domain lies in our intuition!) should be one-one. Often it is the case that although there is some obvious 'minimal' or 'canonical' model for an experiment, it is better to use some richer model. (For example, we can read off many properties of coin tossing by imbedding the associated random walk in a Brownian motion.)
2.3. Examples of $(\Omega, \mathcal{F})$ pairs

We leave the question of assigning probabilities until later.

(a) Experiment: Toss coin twice. We can take

$$\Omega = \{HH, HT, TH, TT\}, \qquad \mathcal{F} = \mathcal{P}(\Omega) := \text{set of all subsets of } \Omega.$$

In this model, the intuitive event 'At least one head is obtained' is described by the mathematical event (element of $\mathcal{F}$) $\{HH, HT, TH\}$.

(b) Experiment: Toss coin infinitely often. We can take

$$\Omega = \{H, T\}^{\mathbb{N}},$$

so that a typical point $\omega$ of $\Omega$ is a sequence

$$\omega = (\omega_1, \omega_2, \dots), \qquad \omega_n \in \{H, T\}.$$

We certainly wish to speak of the intuitive event '$\omega_n = W$' where $W \in \{H, T\}$, and it is natural to choose

$$\mathcal{F} = \sigma(\{\omega \in \Omega : \omega_n = W\} : n \in \mathbb{N},\ W \in \{H, T\}).$$

Although $\mathcal{F} \neq \mathcal{P}(\Omega)$ (accept this!), it turns out that $\mathcal{F}$ is big enough; for example, we shall see in Section 3.7 that the truth set

$$F = \Big\{\omega : \frac{\#(k \le n : \omega_k = H)}{n} \to \frac{1}{2}\Big\}$$

of the statement

$$\frac{\text{number of heads in } n \text{ tosses}}{n} \to \frac{1}{2}$$

is an element of $\mathcal{F}$.

Note that we can use the current model as a more informative model for the experiment in (a), using the map $\omega \mapsto (\omega_1, \omega_2)$ of sample points to outcomes.

(c) Experiment: Choose a point between 0 and 1 uniformly at random. Take

$$\Omega = [0,1], \qquad \mathcal{F} = \mathcal{B}[0,1],$$

$\omega$ signifying the point chosen. In this case, we obviously take $P = \text{Leb}$. The sense in which this model contains model (b) for the case of a fair coin will be explained later.
2.4. Almost surely (a.s.)

► A statement $\mathcal{S}$ about outcomes is said to be true almost surely (a.s.), or with probability 1 (w.p.1), if

$$F := \{\omega : \mathcal{S}(\omega) \text{ is true}\} \in \mathcal{F} \quad\text{and}\quad P(F) = 1.$$

(a) Proposition. If $F_n \in \mathcal{F}$ $(n \in \mathbb{N})$ and $P(F_n) = 1$, $\forall n$, then

$$P\Big(\bigcap_n F_n\Big) = 1.$$

Proof. $P(F_n^c) = 0$, $\forall n$, so, by Lemma 1.10(c), $P(\bigcup_n F_n^c) = 0$. But $\bigcap F_n = (\bigcup F_n^c)^c$. $\square$

(b) Something to think about. Some distinguished philosophers have tried to develop probability without measure theory. One of the reasons for difficulty is the following.

When the discussion (2.3,b) is extended to define the appropriate probability measure for fair coin tossing, the Strong Law of Large Numbers (SLLN) states that $F \in \mathcal{F}$ and $P(F) = 1$, where $F$, the truth set of the statement 'proportion of heads in $n$ tosses $\to \frac{1}{2}$', is defined formally in (2.3,b).

Let $A$ be the set of all maps $\alpha : \mathbb{N} \to \mathbb{N}$ such that $\alpha(1) < \alpha(2) < \dots$. For $\alpha \in A$, let

$$F_\alpha := \Big\{\omega : \frac{\#(k \le n : \omega_{\alpha(k)} = H)}{n} \to \frac{1}{2}\Big\},$$

the 'truth set of the Strong Law for the subsequence $\alpha$'. Then, of course, we have $P(F_\alpha) = 1$, $\forall \alpha \in A$.

Exercise. Prove that

$$\bigcap_{\alpha \in A} F_\alpha = \emptyset.$$

(Hint. For any given $\omega$, find an $\alpha$ ... .)

The moral is that the concept of 'almost surely' gives us (i) absolute precision, but also (ii) enough flexibility to avoid the self-contradictions into which those innocent of measure theory too easily fall. (Of course, since philosophers are pompous where we are precise, they are thought to think deeply ... .)
2.5. Reminder: lim sup, lim inf, $\downarrow$lim, etc.

(a) Let $(x_n : n \in \mathbb{N})$ be a sequence of real numbers. We define

$$\limsup x_n := \inf_m \Big\{\sup_{n \ge m} x_n\Big\} = \downarrow\lim_m \Big\{\sup_{n \ge m} x_n\Big\} \in [-\infty, \infty].$$

Obviously, $y_m := \sup_{n \ge m} x_n$ is monotone non-increasing in $m$, so that the limit of the sequence $y_m$ exists in $[-\infty, \infty]$. The use of $\uparrow\lim$ or $\downarrow\lim$ to signify monotone limits will be handy, as will $y_n \downarrow y_\infty$ to signify $y_\infty = \downarrow\lim y_n$.

(b) Analogously,

$$\liminf x_n := \sup_m \Big\{\inf_{n \ge m} x_n\Big\} = \uparrow\lim_m \Big\{\inf_{n \ge m} x_n\Big\} \in [-\infty, \infty].$$

(c) We have

$$x_n \text{ converges in } [-\infty, \infty] \iff \limsup x_n = \liminf x_n,$$

and then $\lim x_n = \limsup x_n = \liminf x_n$.

►(d) Note that

(i) if $z > \limsup x_n$, then $x_n < z$ eventually (that is, for all sufficiently large $n$)
(ii) if $z < \limsup x_n$, then $x_n > z$ infinitely often (that is, for infinitely many $n$).
2.6. Definitions. $\limsup E_n$, $(E_n, \text{i.o.})$

The event (in the rigorous formulation: the truth set of the statement) 'number of heads/number of tosses $\to \frac{1}{2}$' is built out of simple events such as 'the $n$th toss results in heads' in a rather complicated way. We need a systematic method of being able to handle complicated combinations of events. The idea of taking lim infs and lim sups of sets provides what is required.

It might be helpful to note the tautology that, if $E$ is an event, then

$$E = \{\omega : \omega \in E\}.$$

Suppose now that $(E_n : n \in \mathbb{N})$ is a sequence of events.

►(a) We define

$$(E_n, \text{i.o.}) := (E_n \text{ infinitely often}) := \limsup E_n := \bigcap_m \bigcup_{n \ge m} E_n$$
$$= \{\omega : \text{for every } m,\ \exists n(\omega) \ge m \text{ such that } \omega \in E_{n(\omega)}\}$$
$$= \{\omega : \omega \in E_n \text{ for infinitely many } n\}.$$

►(b) (Reverse Fatou Lemma - needs FINITENESS of $P$)

$$P(\limsup E_n) \ge \limsup P(E_n).$$

Proof. Let $G_m := \bigcup_{n \ge m} E_n$. Then (look at the definition in (a)) $G_m \downarrow G$, where $G := \limsup E_n$. By result (1.10,b), $P(G_m) \downarrow P(G)$. But, clearly,

$$P(G_m) \ge \sup_{n \ge m} P(E_n).$$

Hence,

$$P(G) \ge \downarrow\lim_m \Big\{\sup_{n \ge m} P(E_n)\Big\} =: \limsup P(E_n). \qquad \square$$
2.7. First Borel-Cantelli Lemma (BC1)

►► Let $(E_n : n \in \mathbb{N})$ be a sequence of events such that $\sum_n P(E_n) < \infty$. Then

$$P(\limsup E_n) = P(E_n, \text{i.o.}) = 0.$$

Proof. With the notation of (2.6,b), we have, for each $m$,

$$P(G) \le P(G_m) \le \sum_{n \ge m} P(E_n),$$

using (1.9,b) and (1.10,a). Now let $m \uparrow \infty$. $\square$

Notes. (i) An instructive proof by integration will be given later.

(ii) Many applications of the First Borel-Cantelli Lemma will be given within this course. Interesting applications require concepts of independence, random variables, etc..

2.8. Definitions. $\liminf E_n$, $(E_n, \text{ev})$

Again suppose that $(E_n : n \in \mathbb{N})$ is a sequence of events.

►(a) We define

$$(E_n, \text{ev}) := (E_n \text{ eventually}) := \liminf E_n := \bigcup_m \bigcap_{n \ge m} E_n$$
$$= \{\omega : \text{for some } m(\omega),\ \omega \in E_n,\ \forall n \ge m(\omega)\}$$
$$= \{\omega : \omega \in E_n \text{ for all large } n\}.$$

(b) Note that $(E_n, \text{ev})^c = (E_n^c, \text{i.o.})$.

►►(c) (Fatou's Lemma for sets - true for ALL measure spaces)

$$P(\liminf E_n) \le \liminf P(E_n).$$

Exercise. Prove this in analogy with the proof of result (2.6,b), using (1.10,a) rather than (1.10,b).
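The bound $P(G_m) \le \sum_{n \ge m} P(E_n)$ used in the proof of (BC1) can be made concrete in one special case. The sketch below assumes, beyond anything required by the Lemma, that the events $E_n$ are independent with $P(E_n) = 1/n^2$; independence is used only so that the probability of a finite union can be computed exactly. The function names are hypothetical.

```python
from fractions import Fraction

def prob_some_event(m, big_n):
    """P(E_n occurs for some m <= n <= big_n), assuming independent events
    with P(E_n) = 1/n^2 (an illustrative assumption)."""
    none = Fraction(1)
    for n in range(m, big_n + 1):
        none *= 1 - Fraction(1, n * n)
    return 1 - none

def tail_sum(m, big_n):
    """Partial tail sum of P(E_n) for m <= n <= big_n."""
    return sum(Fraction(1, n * n) for n in range(m, big_n + 1))

for m in (2, 5, 10, 50):
    print(m, float(prob_some_event(m, 2000)), float(tail_sum(m, 2000)))
```

In this example the product telescopes, so the probability of the finite union is $1 - (m-1)(N+1)/(mN)$, which tends to $1/m$ as $N \to \infty$ and hence to 0 as $m \uparrow \infty$, in line with the Lemma.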
2.9. Exercise

For an event $E$, define the indicator function $I_E$ on $\Omega$ via

$$I_E(\omega) := \begin{cases} 1 & \text{if } \omega \in E, \\ 0 & \text{if } \omega \notin E. \end{cases}$$

Let $(E_n : n \in \mathbb{N})$ be a sequence of events. Prove that, for each $\omega$,

$$I_{\limsup E_n}(\omega) = \limsup I_{E_n}(\omega),$$

and establish the corresponding result for lim infs.
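The definitions in (2.6,a) and (2.8,a) can be tried out on a toy case. The sketch below assumes a finite sample space and a purely periodic sequence of events (both assumptions made only so that every tail union or intersection is determined by one period); the names `lim_sup_sets`, `lim_inf_sets` and `lim_sup_indicator` are hypothetical. It is an illustration of the identity in the exercise, not a proof.

```python
def lim_sup_sets(events, period):
    """Intersection over m of the union over n >= m of E_n, for a sequence
    n -> events(n) that repeats with the given period."""
    result = None
    for m in range(period):
        tail_union = set().union(*(events(n) for n in range(m, m + period)))
        result = tail_union if result is None else result & tail_union
    return result

def lim_inf_sets(events, period):
    """Union over m of the intersection over n >= m of E_n (periodic case)."""
    result = set()
    for m in range(period):
        result |= set.intersection(*(set(events(n)) for n in range(m, m + period)))
    return result

def lim_sup_indicator(events, period, omega):
    """lim sup over n of the indicator of E_n at omega (periodic case)."""
    return min(max(int(omega in events(n)) for n in range(m, m + period))
               for m in range(period))

OMEGA = {0, 1, 2, 3}
PATTERN = [{0, 1}, {1, 2}, {1}]

def events(n):
    return PATTERN[n % len(PATTERN)]

print(lim_sup_sets(events, 3), lim_inf_sets(events, 3))
```

With this pattern, the points 0, 1 and 2 each lie in infinitely many $E_n$, while only the point 1 lies in all of them.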
Chapter 3: Random Variables

Let $(S, \Sigma)$ be a measurable space, so that $\Sigma$ is a $\sigma$-algebra on $S$.

3.1. Definitions. $\Sigma$-measurable function, $m\Sigma$, $(m\Sigma)^+$, $b\Sigma$

Suppose that $h : S \to \mathbb{R}$. For $A \subseteq \mathbb{R}$, define

$$h^{-1}(A) := \{s \in S : h(s) \in A\}.$$

Then $h$ is called $\Sigma$-measurable if $h^{-1} : \mathcal{B} \to \Sigma$, that is, $h^{-1}(A) \in \Sigma$, $\forall A \in \mathcal{B}$.

So, here is a picture of a $\Sigma$-measurable function $h$:

[Diagram: $h : S \to \mathbb{R}$, with $h^{-1} : \mathcal{B} \to \Sigma$.]

We write $m\Sigma$ for the class of $\Sigma$-measurable functions on $S$, and $(m\Sigma)^+$ for the class of non-negative elements in $m\Sigma$. We denote by $b\Sigma$ the class of bounded $\Sigma$-measurable functions on $S$.

Note. Because lim sups of sequences even of finite-valued functions may be infinite, and for other reasons, it is convenient to extend these definitions to functions $h$ taking values in $[-\infty, \infty]$ in the obvious way: $h$ is called $\Sigma$-measurable if $h^{-1} : \mathcal{B}[-\infty, \infty] \to \Sigma$. Which of the various results stated for real-valued functions extend to functions with values in $[-\infty, \infty]$, and what these extensions are, should be obvious.

Borel function

A function $h$ from a topological space $S$ to $\mathbb{R}$ is called Borel if $h$ is $\mathcal{B}(S)$-measurable. The most important case is when $S$ itself is $\mathbb{R}$.
3.2. Elementary Propositions on measurability

(a) The map $h^{-1}$ preserves all set operations:

$$h^{-1}\Big(\bigcup_\alpha A_\alpha\Big) = \bigcup_\alpha h^{-1}(A_\alpha), \qquad h^{-1}(A^c) = (h^{-1}(A))^c, \quad\text{etc.}$$

Proof. This is just definition chasing. $\square$

►(b) If $\mathcal{C} \subseteq \mathcal{B}$ and $\sigma(\mathcal{C}) = \mathcal{B}$, then $h^{-1} : \mathcal{C} \to \Sigma \ \Rightarrow\ h \in m\Sigma$.

Proof. Let $\mathcal{E}$ be the class of elements $B$ in $\mathcal{B}$ such that $h^{-1}(B) \in \Sigma$. By result (a), $\mathcal{E}$ is a $\sigma$-algebra, and, by hypothesis, $\mathcal{E} \supseteq \mathcal{C}$. $\square$

(c) If $S$ is topological and $h : S \to \mathbb{R}$ is continuous, then $h$ is Borel.

Proof. Take $\mathcal{C}$ to be the class of open subsets of $\mathbb{R}$, and apply result (b). $\square$

►(d) For any measurable space $(S, \Sigma)$, a function $h : S \to \mathbb{R}$ is $\Sigma$-measurable if

$$\{h \le c\} := \{s \in S : h(s) \le c\} \in \Sigma \quad (\forall c \in \mathbb{R}).$$

Proof. Take $\mathcal{C}$ to be the class $\pi(\mathbb{R})$ of intervals of the form $(-\infty, c]$, $c \in \mathbb{R}$, and apply result (b). $\square$

Note. Obviously, similar results apply in which $\{h \le c\}$ is replaced by $\{h > c\}$, $\{h \ge c\}$, etc.
3.3. LEMMA. Sums and products of measurable functions are measurable

► $m\Sigma$ is an algebra over $\mathbb{R}$, that is, if $\lambda \in \mathbb{R}$ and $h, h_1, h_2 \in m\Sigma$, then

$$h_1 + h_2 \in m\Sigma, \qquad h_1 h_2 \in m\Sigma, \qquad \lambda h \in m\Sigma.$$

Example of proof. Let $c \in \mathbb{R}$. Then for $s \in S$, it is clear that $h_1(s) + h_2(s) > c$ if and only if for some rational $q$, we have

$$h_1(s) > q > c - h_2(s).$$

In other words,

$$\{h_1 + h_2 > c\} = \bigcup_{q \in \mathbb{Q}} \big(\{h_1 > q\} \cap \{h_2 > c - q\}\big),$$

a countable union of elements of $\Sigma$. $\square$
3.4. Composition Lemma.

If $h \in m\Sigma$ and $f \in m\mathcal{B}$, then $f \circ h \in m\Sigma$.

Proof. Draw the picture:

[Diagram: $S \xrightarrow{h} \mathbb{R} \xrightarrow{f} \mathbb{R}$, with $\Sigma \xleftarrow{h^{-1}} \mathcal{B} \xleftarrow{f^{-1}} \mathcal{B}$.] $\square$

Note. There are obvious generalizations based on the definition (important in more advanced theory): if $(S_1, \Sigma_1)$ and $(S_2, \Sigma_2)$ are measurable spaces and $h : S_1 \to S_2$, then $h$ is called $\Sigma_1/\Sigma_2$-measurable if $h^{-1} : \Sigma_2 \to \Sigma_1$. From this point of view, what we have called $\Sigma$-measurable should read $\Sigma/\mathcal{B}$-measurable (or perhaps $\Sigma/\mathcal{B}[-\infty, \infty]$-measurable).

3.5. LEMMA on measurability of infs, lim infs of functions

►► Let $(h_n : n \in \mathbb{N})$ be a sequence of elements of $m\Sigma$. Then

(i) $\inf h_n$, (ii) $\liminf h_n$, (iii) $\limsup h_n$

are $\Sigma$-measurable (into $([-\infty, \infty], \mathcal{B}[-\infty, \infty])$, but we shall still write $\inf h_n \in m\Sigma$ (for example)). Further,

(iv) $\{s : \lim h_n(s) \text{ exists in } \mathbb{R}\} \in \Sigma$.

Proof. (i) $\{\inf h_n \ge c\} = \bigcap_n \{h_n \ge c\}$.

(ii) Let $L_n(s) := \inf\{h_r(s) : r \ge n\}$. Then $L_n \in m\Sigma$, by part (i). But

$$L(s) := \liminf h_n(s) = \uparrow\lim L_n(s) = \sup L_n(s),$$

and $\{L \le c\} = \bigcap_n \{L_n \le c\} \in \Sigma$.

(iii) This part is now obvious.

(iv) This is also clear because the set on which $\lim h_n$ exists in $\mathbb{R}$ is

$$\{\limsup h_n < \infty\} \cap \{\liminf h_n > -\infty\} \cap g^{-1}(\{0\}),$$

where $g := \limsup h_n - \liminf h_n$. $\square$
3.6. Definition. Random variable

► Let $(\Omega, \mathcal{F})$ be our (sample space, family of events). A random variable is an element of $m\mathcal{F}$. Thus,

$$X : \Omega \to \mathbb{R}, \qquad X^{-1} : \mathcal{B} \to \mathcal{F}.$$
3.7. Example. Coin tossing

Let $\Omega = \{H, T\}^{\mathbb{N}}$, $\omega = (\omega_1, \omega_2, \dots)$, $\omega_n \in \{H, T\}$. As in (2.3,b), we define

$$\mathcal{F} = \sigma(\{\omega : \omega_n = W\} : n \in \mathbb{N},\ W \in \{H, T\}).$$

Let

$$X_n(\omega) := \begin{cases} 1 & \text{if } \omega_n = H, \\ 0 & \text{if } \omega_n = T. \end{cases}$$

The definition of $\mathcal{F}$ guarantees that each $X_n$ is a random variable. By Lemma 3.3,

$$S_n := X_1 + X_2 + \dots + X_n = \text{number of heads in } n \text{ tosses}$$

is a random variable. Next, for $p \in [0,1]$, we have

$$\Lambda := \Big\{\omega : \frac{\text{number of heads}}{\text{number of tosses}} \to p\Big\} = \{\omega : L^+(\omega) = p\} \cap \{\omega : L^-(\omega) = p\},$$

where $L^+ := \limsup n^{-1} S_n$ and $L^-$ is the corresponding lim inf. By Lemma 3.5, $\Lambda \in \mathcal{F}$.

►► Thus, we have taken an important step towards the Strong Law: the result is meaningful! It only remains to prove that it is true!
3.8. Definition. $\sigma$-algebra generated by a collection of functions on $\Omega$

This is an important idea, discussed further in Section 3.14. (Compare the weakest topology which makes every function in a given family continuous, etc.)

In Example 3.7, we have a given set $\Omega$, a family $(X_n : n \in \mathbb{N})$ of maps $X_n : \Omega \to \mathbb{R}$. The best way to think of the $\sigma$-algebra $\mathcal{F}$ in that example is as

$$\mathcal{F} = \sigma(X_n : n \in \mathbb{N})$$

in the sense now to be described.

►► Generally, if we have a collection $(Y_\gamma : \gamma \in C)$ of maps $Y_\gamma : \Omega \to \mathbb{R}$, then

$$\mathcal{Y} := \sigma(Y_\gamma : \gamma \in C)$$

is defined to be the smallest $\sigma$-algebra $\mathcal{Y}$ on $\Omega$ such that each map $Y_\gamma$ $(\gamma \in C)$ is $\mathcal{Y}$-measurable. Clearly,

$$\sigma(Y_\gamma : \gamma \in C) = \sigma(\{\omega \in \Omega : Y_\gamma(\omega) \in B\} : \gamma \in C,\ B \in \mathcal{B}).$$

If $X$ is a random variable for some $(\Omega, \mathcal{F})$, then, of course, $\sigma(X) \subseteq \mathcal{F}$.

Remarks. (i) The idea introduced in this section is something which you will pick up gradually as you work through the course. Don't worry about it now; think about it, yes!

(ii) Normally, $\pi$-systems come to our aid. For example, if $(X_n : n \in \mathbb{N})$ is a collection of functions on $\Omega$, and $\mathcal{X}_n$ denotes $\sigma(X_k : k \le n)$, then the union $\bigcup \mathcal{X}_n$ is a $\pi$-system (indeed, an algebra) which generates $\sigma(X_n : n \in \mathbb{N})$.
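3.9. Definitions. Law, distribution function

Suppose that $X$ is a random variable carried by some probability triple $(\Omega, \mathcal{F}, P)$. We have

$$\Omega \xrightarrow{X} \mathbb{R}, \qquad [0,1] \xleftarrow{P} \mathcal{F} \xleftarrow{X^{-1}} \mathcal{B}, \qquad\text{or indeed}\qquad [0,1] \xleftarrow{P} \sigma(X) \xleftarrow{X^{-1}} \mathcal{B}.$$

Define the law $\mathcal{L}_X$ of $X$ by

$$\mathcal{L}_X := P \circ X^{-1}, \qquad \mathcal{L}_X : \mathcal{B} \to [0,1].$$

Then (Exercise!) $\mathcal{L}_X$ is a probability measure on $(\mathbb{R}, \mathcal{B})$. Since $\pi(\mathbb{R}) = \{(-\infty, c] : c \in \mathbb{R}\}$ is a $\pi$-system which generates $\mathcal{B}$, Uniqueness Lemma 1.6 shows that $\mathcal{L}_X$ is determined by the function $F_X : \mathbb{R} \to [0,1]$ defined as follows:

$$F_X(c) := \mathcal{L}_X(-\infty, c] = P(X \le c) = P\{\omega : X(\omega) \le c\}.$$

The function $F_X$ is called the distribution function of $X$.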
3.10. Properties of distribution functions

Suppose that $F$ is the distribution function $F = F_X$ of some random variable $X$. Then

(a) $F : \mathbf{R} \to [0,1]$, $F \uparrow$ (that is, $x \le y \Rightarrow F(x) \le F(y)$),
(b) $\lim_{x \to \infty} F(x) = 1$, $\lim_{x \to -\infty} F(x) = 0$,
(c) $F$ is right-continuous.

Proof of (c). By using Lemma (1.10,b), we see that
$$\mathbf{P}(X \le x + n^{-1}) \downarrow \mathbf{P}(X \le x),$$
and this fact together with the monotonicity of $F_X$ shows that $F_X$ is right-continuous.

Exercise! Clear up any loose ends.

3.11. Existence of random variable with given distribution function

► If $F$ has the properties (a,b,c) in Section 3.10, then, by analogy with Section 1.8 on the existence of Lebesgue measure, we can construct a unique probability measure $\mathcal{L}$ on $(\mathbf{R}, \mathcal{B})$ such that
$$\mathcal{L}(-\infty, x] = F(x), \quad \forall x.$$
Take $(\Omega, \mathcal{F}, \mathbf{P}) = (\mathbf{R}, \mathcal{B}, \mathcal{L})$, $X(\omega) = \omega$. Then it is tautological that
$$F_X(x) = F(x), \quad \forall x.$$

Note. The measure $\mathcal{L}$ just described is called the Lebesgue-Stieltjes measure associated with $F$. Its existence is proved in the next section.

3.12. Skorokhod representation of a random variable with prescribed distribution function

Again let $F : \mathbf{R} \to [0,1]$ have properties (3.10,a,b,c). We can construct a random variable with distribution function $F$ carried by
$$(\Omega, \mathcal{F}, \mathbf{P}) = ([0,1], \mathcal{B}[0,1], \text{Leb})$$
as follows. Define (the right-hand equalities, which you can prove, are there for clarification only)

(a1) $X^+(\omega) := \inf\{z : F(z) > \omega\} = \sup\{y : F(y) \le \omega\}$,
(a2) $X^-(\omega) := \inf\{z : F(z) \ge \omega\} = \sup\{y : F(y) < \omega\}$.

The following picture shows cases to watch out for.

[Figure: graph of $F(x)$ against $x$, marking a level $\omega$ with $X^\pm(\omega)$, and the points $X^-(F(x))$ and $X^+(F(x))$.]

By definition of $X^-$,
$$(\omega \le F(c)) \Rightarrow (X^-(\omega) \le c).$$
Now,
$$(z > X^-(\omega)) \Rightarrow (F(z) \ge \omega),$$
so, by the right-continuity of $F$, $F(X^-(\omega)) \ge \omega$, and
$$(X^-(\omega) \le c) \Rightarrow (\omega \le F(X^-(\omega)) \le F(c)).$$
Thus, $(\omega \le F(c)) \Leftrightarrow (X^-(\omega) \le c)$, so that
$$\mathbf{P}(X^- \le c) = F(c).$$

(b) The variable $X^-$ therefore has distribution function $F$, and the measure $\mathcal{L}$ in Section 3.11 is just the law of $X^-$.

It will be important later to know that

(c) $X^+$ also has distribution function $F$, and that, indeed, $\mathbf{P}(X^+ = X^-) = 1$.

Proof of (c). By definition of $X^+$,
$$(\omega < F(c)) \Rightarrow (X^+(\omega) \le c),$$
so that $F(c) \le \mathbf{P}(X^+ \le c)$. Since $X^- \le X^+$, it is clear that
$$\{X^- \ne X^+\} = \bigcup_{c \in \mathbf{Q}} \{X^- \le c < X^+\}.$$
But, for every $c \in \mathbf{R}$,
$$\mathbf{P}(X^- \le c < X^+) = \mathbf{P}(\{X^- \le c\} \setminus \{X^+ \le c\}) \le F(c) - F(c) = 0.$$
Since $\mathbf{Q}$ is countable, the result follows.

Remark. It is in fact true that every experiment you will meet in this (or any other) course can be modelled via the triple $([0,1], \mathcal{B}[0,1], \text{Leb})$. (You will start to be convinced of this by the end of the next chapter.) However, this observation normally has only curiosity value.

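
The map $X^-$ of Section 3.12 can be tried out numerically. What follows is only a sketch, under the assumption that $F$ is the distribution function of a law on finitely many points; the names `skorokhod_minus`, `values` and `probs` are illustrative and do not come from the text. It evaluates $X^-(\omega) = \inf\{z : F(z) \ge \omega\}$ and compares the share of a grid of $\omega$-values with $X^-(\omega) \le c$ against $F(c)$.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate


def skorokhod_minus(values, probs):
    """Build X^-(omega) = inf{z : F(z) >= omega} for a finite discrete law.

    Assumption (not from the text): `values` is strictly increasing and
    `probs` holds positive weights summing to 1.
    """
    cdf = list(accumulate(probs))  # cdf[k] = F(values[k])

    def x_minus(omega):
        if not 0 < omega <= 1:
            # omega = 0 is a Leb-null point at which X^- would be -infinity.
            raise ValueError("omega must lie in (0, 1]")
        # Smallest index k with F(values[k]) >= omega.
        k = bisect_left(cdf, omega)
        return values[min(k, len(values) - 1)]

    return x_minus


# Law with P(X=0)=1/4, P(X=1)=1/2, P(X=2)=1/4, so F(0)=1/4, F(1)=3/4, F(2)=1.
x_minus = skorokhod_minus([0, 1, 2], [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)])

# On the grid omega = k/100, the share of points with X^-(omega) <= c is a
# crude stand-in for Leb{omega : X^-(omega) <= c}.
grid = [Fraction(k, 100) for k in range(1, 101)]
share_le_0 = Fraction(sum(x_minus(w) <= 0 for w in grid), len(grid))
share_le_1 = Fraction(sum(x_minus(w) <= 1 for w in grid), len(grid))
```

Exact rational arithmetic is used so that the boundary case $\omega = F(c)$ falls on the correct side, which is precisely the equivalence $(\omega \le F(c)) \Leftrightarrow (X^-(\omega) \le c)$.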
3.13. Generated $\sigma$-algebras - a discussion

Suppose that $(\Omega, \mathcal{F}, \mathbf{P})$ is a model for some experiment, and that the experiment has been performed, so that (see Section 2.2) Tyche has made her choice of $\omega$.

Let $(Y_\gamma : \gamma \in C)$ be a collection of random variables associated with our experiment, and suppose that someone reports to you the following information about the chosen point $\omega$:

(*) the values $Y_\gamma(\omega)$, that is, the observed values of the random variables $Y_\gamma$ ($\gamma \in C$).

Then the intuitive significance of the $\sigma$-algebra $\mathcal{Y} := \sigma(Y_\gamma : \gamma \in C)$ is that it consists precisely of those events $F$ for which, for each and every $\omega$, you can decide whether or not $F$ has occurred (that is, whether or not $\omega \in F$) on the basis of the information (*); the information (*) is precisely equivalent to the following information:

(**) the values $I_F(\omega)$ ($F \in \mathcal{Y}$).

(a) Exercise. Prove that the $\sigma$-algebra $\sigma(Y)$ generated by a single random variable $Y$ is given by
$$\sigma(Y) = Y^{-1}(\mathcal{B}) := (\{\omega : Y(\omega) \in B\} : B \in \mathcal{B}),$$
and that $\sigma(Y)$ is generated by the $\pi$-system
$$\pi(Y) := (\{\omega : Y(\omega) \le x\} : x \in \mathbf{R}) = Y^{-1}(\pi(\mathbf{R})).$$

The following results might help clarify things. Good advice: stop reading this section after (c)! Results (b) and (c) are proved in the appendix to this chapter.

(b) If $Y : \Omega \to \mathbf{R}$, then $Z : \Omega \to \mathbf{R}$ is a $\sigma(Y)$-measurable function if and only if there exists a Borel function $f : \mathbf{R} \to \mathbf{R}$ such that $Z = f(Y)$.

(c) If $Y_1, Y_2, \ldots, Y_n$ are functions from $\Omega$ to $\mathbf{R}$, then a function $Z : \Omega \to \mathbf{R}$ is $\sigma(Y_1, Y_2, \ldots, Y_n)$-measurable if and only if there exists a Borel function $f$ on $\mathbf{R}^n$ such that $Z = f(Y_1, Y_2, \ldots, Y_n)$. We shall see in the appendix that the more correct measurability condition on $f$ is that $f$ be '$\mathcal{B}^n$-measurable'.

(d) If $(Y_\gamma : \gamma \in C)$ is a collection (parametrized by the infinite set $C$) of functions from $\Omega$ to $\mathbf{R}$, then $Z : \Omega \to \mathbf{R}$ is $\sigma(Y_\gamma : \gamma \in C)$-measurable if and only if there exists a countable sequence $(\gamma_i : i \in \mathbf{N})$ of elements of $C$ and a Borel function $f$ on $\mathbf{R}^{\mathbf{N}}$ such that $Z = f(Y_{\gamma_1}, Y_{\gamma_2}, \ldots)$.

Warning - for the over-enthusiastic only. For uncountable $C$, $\mathcal{B}(\mathbf{R}^C)$ is much larger than the $C$-fold product measure space $\prod_{\gamma \in C} \mathcal{B}(\mathbf{R})$. It is the latter rather than the former which gives the appropriate type of $f$ in (d).

3.14. The Monotone-Class Theorem

In the same way that Uniqueness Lemma 1.6 allows us to deduce results about $\sigma$-algebras from results about $\pi$-systems, the following 'elementary' version of the Monotone-Class Theorem allows us to deduce results about general measurable functions from results about indicators of elements of $\pi$-systems. Generally, we shall not use the theorem in the main text, preferring 'just to use bare hands'. However, for product measure in Chapter 8, it becomes indispensable.

►► THEOREM. Let $\mathcal{H}$ be a class of bounded functions from a set $S$ into $\mathbf{R}$ satisfying the following conditions:

(i) $\mathcal{H}$ is a vector space over $\mathbf{R}$;
(ii) the constant function 1 is an element of $\mathcal{H}$;
(iii) if $(f_n)$ is a sequence of non-negative functions in $\mathcal{H}$ such that $f_n \uparrow f$ where $f$ is a bounded function on $S$, then $f \in \mathcal{H}$.

Then if $\mathcal{H}$ contains the indicator function of every set in some $\pi$-system $\mathcal{I}$, then $\mathcal{H}$ contains every bounded $\sigma(\mathcal{I})$-measurable function on $S$.

For proof, see the appendix to this chapter.

Chapter 4

Independence

Let $(\Omega, \mathcal{F}, \mathbf{P})$ be a probability triple.

4.1. Definitions of independence

Note. We focus attention on the $\sigma$-algebra formulation (and describe the more familiar forms of independence in terms of it) to acclimatize ourselves to thinking of $\sigma$-algebras as the natural means of summarizing information. Section 4.2 shows that the fancy $\sigma$-algebra definitions agree with the ones from elementary courses.

Independent $\sigma$-algebras

► Sub-$\sigma$-algebras $\mathcal{G}_1, \mathcal{G}_2, \ldots$ of $\mathcal{F}$ are called independent if, whenever $G_i \in \mathcal{G}_i$ ($i \in \mathbf{N}$) and $i_1, \ldots, i_n$ are distinct, then
$$\mathbf{P}(G_{i_1} \cap \cdots \cap G_{i_n}) = \prod_{k=1}^{n} \mathbf{P}(G_{i_k}).$$

Independent random variables

► Random variables $X_1, X_2, \ldots$ are called independent if the $\sigma$-algebras $\sigma(X_1), \sigma(X_2), \ldots$ are independent.

Independent events

► Events $E_1, E_2, \ldots$ are called independent if the $\sigma$-algebras $\mathcal{E}_1, \mathcal{E}_2, \ldots$ are independent, where $\mathcal{E}_n$ is the $\sigma$-algebra $\{\emptyset, E_n, \Omega \setminus E_n, \Omega\}$.

Since $\mathcal{E}_n = \sigma(I_{E_n})$, it follows that events $E_1, E_2, \ldots$ are independent if and only if the random variables $I_{E_1}, I_{E_2}, \ldots$ are independent.

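4.2. The $\pi$-system Lemma; and the more familiar definitions

We know from elementary theory that events $E_1, E_2, \ldots$ are independent if and only if whenever $n \in \mathbf{N}$ and $i_1, \ldots, i_n$ are distinct, then
$$\mathbf{P}(E_{i_1} \cap \cdots \cap E_{i_n}) = \prod_{k=1}^{n} \mathbf{P}(E_{i_k}),$$
corresponding results involving complements of the $E_j$, etc., being consequences of this.

We now use the Uniqueness Lemma 1.6 to obtain a significant generalization of this idea, allowing us to study independence via (manageable) $\pi$-systems rather than (awkward) $\sigma$-algebras.

Let us concentrate on the case of two $\sigma$-algebras.

►► (a) LEMMA. Suppose that $\mathcal{G}$ and $\mathcal{H}$ are sub-$\sigma$-algebras of $\mathcal{F}$, and that $\mathcal{I}$ and $\mathcal{J}$ are $\pi$-systems with
$$\sigma(\mathcal{I}) = \mathcal{G}, \qquad \sigma(\mathcal{J}) = \mathcal{H}.$$
Then $\mathcal{G}$ and $\mathcal{H}$ are independent if and only if $\mathcal{I}$ and $\mathcal{J}$ are independent in that
$$\mathbf{P}(I \cap J) = \mathbf{P}(I)\mathbf{P}(J), \qquad I \in \mathcal{I},\ J \in \mathcal{J}.$$

Proof. Suppose that $\mathcal{I}$ and $\mathcal{J}$ are independent. For fixed $I$ in $\mathcal{I}$, the measures (check that they are measures!)
$$H \mapsto \mathbf{P}(I \cap H) \quad \text{and} \quad H \mapsto \mathbf{P}(I)\mathbf{P}(H)$$
on $(\Omega, \mathcal{H})$ have the same total mass $\mathbf{P}(I)$, and agree on $\mathcal{J}$. By Lemma 1.6, they therefore agree on $\sigma(\mathcal{J}) = \mathcal{H}$. Hence,
$$\mathbf{P}(I \cap H) = \mathbf{P}(I)\mathbf{P}(H), \qquad I \in \mathcal{I},\ H \in \mathcal{H}.$$
Thus, for fixed $H$ in $\mathcal{H}$, the measures
$$G \mapsto \mathbf{P}(G \cap H) \quad \text{and} \quad G \mapsto \mathbf{P}(G)\mathbf{P}(H)$$
on $(\Omega, \mathcal{G})$ have the same total mass $\mathbf{P}(H)$, and agree on $\mathcal{I}$. They therefore agree on $\sigma(\mathcal{I}) = \mathcal{G}$; and this is what we set out to prove. □
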
Suppose now that $X$ and $Y$ are two random variables on $(\Omega, \mathcal{F}, \mathbf{P})$ such that, whenever $x, y \in \mathbf{R}$,

(b) $\mathbf{P}(X \le x; Y \le y) = \mathbf{P}(X \le x)\mathbf{P}(Y \le y)$.

Now, (b) says that the $\pi$-systems $\pi(X)$ and $\pi(Y)$ (see Section 3.13) are independent. Hence $\sigma(X)$ and $\sigma(Y)$ are independent: that is, $X$ and $Y$ are independent in the sense of Definition 4.1.

In the same way, we can prove that random variables $X_1, X_2, \ldots, X_n$ are independent if and only if
$$\mathbf{P}(X_k \le x_k : 1 \le k \le n) = \prod_{k=1}^{n} \mathbf{P}(X_k \le x_k),$$
and all the familiar things from elementary theory.

Command: Do Exercise E4.1 now.

4.3. Second Borel-Cantelli Lemma (BC2)

►► If $(E_n : n \in \mathbf{N})$ is a sequence of independent events, then
$$\sum \mathbf{P}(E_n) = \infty \ \Rightarrow\ \mathbf{P}(E_n, \text{i.o.}) = \mathbf{P}(\limsup E_n) = 1.$$

Proof. First, we have
$$(\limsup E_n)^c = \liminf E_n^c = \bigcup_m \bigcap_{n \ge m} E_n^c.$$
With $p_n$ denoting $\mathbf{P}(E_n)$, we have
$$\mathbf{P}\Big(\bigcap_{n \ge m} E_n^c\Big) = \prod_{n \ge m} (1 - p_n),$$
this equation being true if the condition $\{n \ge m\}$ is replaced by the condition $\{r \ge n \ge m\}$, because of independence, and the limit as $r \uparrow \infty$ being justified by the monotonicity of the two sides.

For $x \ge 0$, $1 - x \le \exp(-x)$, so that, since $\sum p_n = \infty$,
$$\prod_{n \ge m} (1 - p_n) \le \exp\Big(-\sum_{n \ge m} p_n\Big) = 0.$$
So, $\mathbf{P}[(\limsup E_n)^c] = 0$. □

Exercise. Prove that if $0 \le p_n < 1$ and $S := \sum p_n < \infty$, then $\prod (1 - p_n) > 0$. Hint. First show that if $S < 1$, then $\prod (1 - p_n) \ge 1 - S$.

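
The key estimate in the proof of (BC2), $\prod (1 - p_n) \le \exp(-\sum p_n)$, can be watched at work on one concrete sequence. The sketch below assumes the illustrative choice $p_n = 1/n$ (not taken from the text), for which $\sum p_n = \infty$ and the finite product telescopes to $(m-1)/r$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import exp


def tail_product(p, m, r):
    """Return prod_{n=m}^{r} (1 - p(n)) exactly, for a rational-valued p."""
    result = Fraction(1)
    for n in range(m, r + 1):
        result *= 1 - p(n)
    return result


def harmonic(n):
    # Illustrative choice p_n = 1/n: the sum of the p_n diverges.
    return Fraction(1, n)


# With p_n = 1/n the product telescopes: prod_{n=m}^{r} (n-1)/n = (m-1)/r,
# which tends to 0 as r increases, for every fixed m.
product_2_to_1000 = tail_product(harmonic, 2, 1000)

# The exponential bound exp(-sum_{n=m}^{r} p_n), computed in floating point.
bound = exp(-sum(1 / n for n in range(2, 1001)))
```

Here the product is computed exactly and only the bound is a floating-point number, so a comparison of the two is meaningful only up to rounding error in `bound`.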
4.4. Example

Let $(X_n : n \in \mathbf{N})$ be a sequence of independent random variables, each exponentially distributed with rate 1:
$$\mathbf{P}(X_n > x) = e^{-x}, \qquad x \ge 0.$$
Then, for $\alpha > 0$,
$$\mathbf{P}(X_n > \alpha \log n) = n^{-\alpha},$$
so that, using (BC1) and (BC2),

(a0) $\mathbf{P}(X_n > \alpha \log n \text{ for infinitely many } n) = \begin{cases} 0 & \text{if } \alpha > 1, \\ 1 & \text{if } \alpha \le 1. \end{cases}$

Now let $L := \limsup (X_n / \log n)$. Then
$$\mathbf{P}(L \ge 1) \ge \mathbf{P}(X_n > \log n, \text{i.o.}) = 1,$$
and, for $k \in \mathbf{N}$,
$$\mathbf{P}(L > 1 + 2k^{-1}) \le \mathbf{P}(X_n > (1 + k^{-1}) \log n, \text{i.o.}) = 0.$$
Thus, $\{L > 1\} = \bigcup_k \{L > 1 + 2k^{-1}\}$ is $\mathbf{P}$-null, and hence $L = 1$ almost surely.

Something to think about

In the same way, we can prove the finer result

(a1) $\mathbf{P}(X_n > \log n + \alpha \log\log n, \text{i.o.}) = \begin{cases} 0 & \text{if } \alpha > 1, \\ 1 & \text{if } \alpha \le 1, \end{cases}$

or, even finer,

(a2) $\mathbf{P}(X_n > \log n + \log\log n + \alpha \log\log\log n, \text{i.o.}) = \begin{cases} 0 & \text{if } \alpha > 1, \\ 1 & \text{if } \alpha \le 1; \end{cases}$

or etc. By combining in an appropriate way (think about this!) the sequence of statements (a0), (a1), (a2), ... with the statement that the union of a countable number of null sets is null while the intersection of a sequence of probability-1 sets has probability 1, we can obviously make remarkably precise statements about the size of the big elements in the sequence $(X_n)$.

I have included in the appendix to this chapter the statement of a truly fantastic theorem about precise description of long-term behaviour: Strassen's Law.

A number of exercises in Chapter E are now accessible to you.

4.5. A fundamental question for modelling

Can we construct a sequence $(X_n : n \in \mathbf{N})$ of independent random variables, $X_n$ having prescribed distribution function $F_n$? We have to be able to answer Yes to this question - for example, to be able to construct a rigorous model for the branching-process model of Chapter 0, or indeed for Example 4.4 to make sense. Equation (0.2,b) makes it clear that a Yes answer to our question is all that is needed for a rigorous branching-process model.

The trick answer based on the existence of Lebesgue measure given in the next section does settle the question. A more satisfying answer is provided by the theory of product measure, a topic deferred to Chapter 8.

4.6. A coin-tossing model with applications

Let $(\Omega, \mathcal{F}, \mathbf{P})$ be $([0,1], \mathcal{B}[0,1], \text{Leb})$. For $\omega \in \Omega$, expand $\omega$ in binary:
$$\omega = 0.\omega_1 \omega_2 \ldots$$
(The existence of two different expansions of a dyadic rational is not going to cause any problems because the set $D$ (say) of dyadic rationals in $[0,1]$ has Lebesgue measure 0 - it is a countable set!) As an Exercise, you can prove that the sequence $(\xi_n : n \in \mathbf{N})$, where
$$\xi_n(\omega) := \omega_n,$$
is a sequence of independent variables each taking the values 0 or 1 with probability $\tfrac{1}{2}$ for either possibility. Clearly, $(\xi_n : n \in \mathbf{N})$ provides a model for coin tossing.

Now define
$$Y_1(\omega) := 0.\omega_1 \omega_3 \omega_6 \ldots,$$
$$Y_2(\omega) := 0.\omega_2 \omega_5 \omega_9 \ldots,$$
$$Y_3(\omega) := 0.\omega_4 \omega_8 \omega_{13} \ldots,$$
and so on. We now need a bit of common sense. Since the sequence
$$\omega_1, \omega_3, \omega_6, \ldots$$
has the same 'coin-tossing' properties as the full sequence $(\omega_n : n \in \mathbf{N})$, it is clear that $Y_1$ has the uniform distribution on $[0,1]$; and similarly for the other $Y$'s.

Since the sequences $(1, 3, 6, \ldots)$, $(2, 5, 9, \ldots)$, ... which give rise to $Y_1, Y_2, \ldots$ are disjoint, and therefore correspond to different sets of tosses of our 'coin', it is intuitively obvious that

► $Y_1, Y_2, \ldots$ are independent random variables, each uniformly distributed on $[0,1]$.

Now suppose that a sequence $(F_n : n \in \mathbf{N})$ of distribution functions is given. By the Skorokhod representation of Section 3.12, we can find functions $g_n$ on $[0,1]$ such that
$$X_n := g_n(Y_n) \text{ has distribution function } F_n.$$
But because the $Y$-variables are independent, the same is obviously true of the $X$-variables.

► We have therefore succeeded in constructing a family $(X_n : n \in \mathbf{N})$ of independent random variables with prescribed distribution functions.

Exercise. Satisfy yourself that you could if forced carry through these intuitive arguments rigorously. Obviously, this is again largely a case of utilizing the Uniqueness Lemma 1.6 in much the same way as we did in Section 4.2.

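
The way the binary digits of $\omega$ are shared out among $Y_1, Y_2, \ldots$ in Section 4.6 can be made concrete. The sketch below assumes that the displayed rows $(1, 3, 6, \ldots)$, $(2, 5, 9, \ldots)$, $(4, 8, 13, \ldots)$ continue by the usual diagonal enumeration of $\mathbf{N} \times \mathbf{N}$; that reading, and the names `bit_index` and `y_partial`, are assumptions made for illustration.

```python
from fractions import Fraction


def bit_index(i, j):
    """Index of the j-th binary digit of omega used by Y_i (both 1-based).

    Assumes the diagonal enumeration suggested by the rows
    (1, 3, 6, ...), (2, 5, 9, ...), (4, 8, 13, ...).
    """
    d = i + j - 1                          # anti-diagonal containing the entry
    return (d - 1) * d // 2 + (d - i + 1)


def y_partial(bits, i):
    """Partial binary expansion of Y_i(omega) from the digits supplied.

    bits[k - 1] holds omega_k; digits beyond len(bits) are simply not used.
    """
    total = Fraction(0)
    j = 1
    while bit_index(i, j) <= len(bits):
        total += Fraction(bits[bit_index(i, j) - 1], 2 ** j)
        j += 1
    return total


# First three digit positions used by Y_1, Y_2 and Y_3.
rows = [[bit_index(i, j) for j in range(1, 4)] for i in range(1, 4)]

# Distinct (i, j) must use distinct digits, so that the Y's share no tosses.
used = {bit_index(i, j) for i in range(1, 11) for j in range(1, 11)}

bits = [1, 1, 0, 0, 1, 1]                  # omega = 0.110011... (six digits)
```

With only six digits of $\omega$ available, `y_partial(bits, 1)` uses $\omega_1, \omega_3, \omega_6$ and `y_partial(bits, 2)` uses $\omega_2, \omega_5$, which is exactly the disjointness on which the independence argument rests.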
4.7. Notation: IID RVs

Many of the most important problems in probability concern sequences of random variables (RVs) which are independent and identically distributed (IID). Thus, if $(X_n)$ is a sequence of IID variables, then the $X_n$ are independent and all have the same distribution function $F$ (say):
$$\mathbf{P}(X_n \le x) = F(x), \qquad \forall n, \forall x.$$
Of course, we now know that for any given distribution function $F$, we can construct a triple $(\Omega, \mathcal{F}, \mathbf{P})$ carrying a sequence of IID RVs with common distribution function $F$. In particular, we can construct a rigorous model for our branching process.

4.8. Stochastic processes; Markov chains

► A stochastic process $Y$ parametrized by a set $C$ is a collection
$$Y = (Y_\gamma : \gamma \in C)$$
of random variables on some triple $(\Omega, \mathcal{F}, \mathbf{P})$. The fundamental question about existence of a stochastic process with prescribed joint distributions is (to all intents and purposes) settled by the famous Daniell-Kolmogorov theorem, which is just beyond the scope of this course.

Our concern will be mainly with processes $X = (X_n : n \in \mathbf{Z}^+)$ indexed (or parametrized) by $\mathbf{Z}^+$. We think of $X_n$ as the value of the process $X$ at time $n$. For $\omega \in \Omega$, the map $n \mapsto X_n(\omega)$ is called the sample path of $X$ corresponding to the sample point $\omega$.

A very important example of a stochastic process is provided by a Markov chain.

►► Let $E$ be a finite or countable set. Let $P = (p_{ij} : i, j \in E)$ be a stochastic $E \times E$ matrix, so that for $i, j \in E$, we have
$$p_{ij} \ge 0, \qquad \sum_k p_{ik} = 1.$$
Let $\mu$ be a probability measure on $E$, so that $\mu$ is specified by the values $\mu_i := \mu(\{i\})$, ($i \in E$). By a time-homogeneous Markov chain $Z = (Z_n : n \in \mathbf{Z}^+)$ on $E$ with initial distribution $\mu$ and 1-step transition matrix $P$ is meant a stochastic process $Z$ such that, whenever $n \in \mathbf{Z}^+$ and $i_0, i_1, \ldots, i_n \in E$,

(a) $\mathbf{P}(Z_0 = i_0; Z_1 = i_1; \ldots; Z_n = i_n) = \mu_{i_0} p_{i_0 i_1} \cdots p_{i_{n-1} i_n}$.

Exercise. Give a construction of such a chain $Z$ expressing $Z_n(\omega)$ explicitly in terms of the values at $\omega$ of a suitable family of independent random variables. See the appendix to this chapter.

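
Formula (a) of Section 4.8 is easy to evaluate mechanically. The following sketch computes the right-hand side $\mu_{i_0} p_{i_0 i_1} \cdots p_{i_{n-1} i_n}$ for a hypothetical two-state chain (the state names and numbers are invented for illustration), and sums it over all paths of a fixed length as a consistency check. It does not construct the chain itself, which is the business of the Exercise.

```python
from fractions import Fraction
from itertools import product


def path_probability(mu, P, path):
    """Return mu[i0] * P[i0][i1] * ... * P[i_{n-1}][i_n] for path (i0, ..., in)."""
    prob = mu[path[0]]
    for i, j in zip(path, path[1:]):
        prob *= P[i][j]
    return prob


# Hypothetical two-state chain on E = {"a", "b"}; every row of P sums to 1.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
P = {
    "a": {"a": Fraction(1, 3), "b": Fraction(2, 3)},
    "b": {"a": Fraction(1, 4), "b": Fraction(3, 4)},
}

# P(Z_0 = a; Z_1 = b; Z_2 = b) according to formula (a).
p_abb = path_probability(mu, P, ("a", "b", "b"))

# Summing (a) over all paths (i0, i1, i2) must give total mass 1.
total_mass = sum(path_probability(mu, P, path) for path in product("ab", repeat=3))
```

That the path probabilities of each fixed length sum to 1 is what one needs, at the very least, for (a) to be a consistent prescription of joint distributions.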
4.9. Monkey typing Shakespeare

Many interesting events must have probability 0 or 1, and we often show that an event $F$ has probability 0 or 1 by using some argument based on independence to show that $\mathbf{P}(F)^2 = \mathbf{P}(F)$.

Here is a silly example, to which we apply a silly method, but one which both illustrates very clearly the use of the monotonicity properties of measures in Lemma 1.10 and has a lot of the flavour of the Kolmogorov 0-1 law. See the 'Easy exercise' towards the end of this section for an instantaneous solution to the problem.

Let us agree that correctly typing WS, the Collected Works of Shakespeare, amounts to typing a particular sequence of $N$ symbols on a typewriter. A monkey types symbols at random, one per unit time, producing an infinite sequence $(X_n)$ of IID RVs with values in the set of all possible symbols. We agree that
$$\varepsilon := \inf\{\mathbf{P}(X_1 = x) : x \text{ is a symbol}\} > 0.$$
Let $H$ be the event that the monkey produces infinitely many copies of WS. Let $H_k$ be the event that the monkey will produce at least $k$ copies of WS in all, and let $H_{m,k}$ be the event that it will produce at least $k$ copies by time $m$. Finally, let $H^{(m)}$ be the event that the monkey produces infinitely many copies of WS over the time period $[m + 1, \infty)$.

Because the monkey's behaviour over $[1, m]$ is independent of its behaviour over $[m + 1, \infty)$, we have
$$\mathbf{P}(H_{m,k} \cap H^{(m)}) = \mathbf{P}(H_{m,k})\mathbf{P}(H^{(m)}).$$
But logic tells us that, for every $m$, $H^{(m)} = H$! Hence,
$$\mathbf{P}(H_{m,k} \cap H) = \mathbf{P}(H_{m,k})\mathbf{P}(H).$$
But, as $m \uparrow \infty$, $H_{m,k} \uparrow H_k$, and $(H_{m,k} \cap H) \uparrow (H_k \cap H) = H$, it being obvious that $H_k \supseteq H$. Hence, by Lemma 1.10(a),
$$\mathbf{P}(H) = \mathbf{P}(H_k)\mathbf{P}(H).$$
However, as $k \uparrow \infty$, $H_k \downarrow H$, and so, by Lemma 1.10(b),
$$\mathbf{P}(H) = \mathbf{P}(H)\mathbf{P}(H),$$
whence $\mathbf{P}(H) = 0$ or 1.

The Kolmogorov 0-1 law produces a huge class of important events $E$ for which we must have $\mathbf{P}(E) = 0$ or $\mathbf{P}(E) = 1$. Fortunately, it does not tell us which - and it therefore generates a lot of interesting problems!

Easy exercise. Use the Second Borel-Cantelli Lemma to prove that $\mathbf{P}(H) = 1$. Hint. Let $E_1$ be the event that the monkey produces WS right away, that is, during time period $[1, N]$. Then $\mathbf{P}(E_1) \ge \varepsilon^N$.

Tricky exercise (to which we shall return). If the monkey types only capital letters, and is on every occasion equally likely to type any of the 26, how long on average will it take him to produce the sequence 'ABRACADABRA'?

The next three sections involve quite subtle topics which take time to assimilate. They are not strictly necessary for subsequent chapters. The Kolmogorov 0-1 law is used in one of our two proofs of the Strong Law for IID RVs, but by that stage a quick martingale proof (of the 0-1 law) will have been provided.

Note. Perhaps the otherwise-wonderful TeX makes its $\mathcal{T}$ too like $\mathcal{I}$. Below, I use $\mathcal{K}$ instead of $\mathcal{I}$ to avoid the confusion. Script X, $\mathcal{X}$, is too like Greek chi, $\chi$, but we have to live with that.

4.10. Definition. Tail $\sigma$-algebras

►► Let $X_1, X_2, \ldots$ be random variables. Define

(a) $\mathcal{T}_n := \sigma(X_{n+1}, X_{n+2}, \ldots), \qquad \mathcal{T} := \bigcap_n \mathcal{T}_n.$

The $\sigma$-algebra $\mathcal{T}$ is called the tail $\sigma$-algebra of the sequence $(X_n : n \in \mathbf{N})$.

Now, $\mathcal{T}$ contains many important events: for example,

(b1) $F_1 := (\lim X_k \text{ exists}) := \{\omega : \lim_k X_k(\omega) \text{ exists}\}$,
(b2) $F_2 := (\sum X_k \text{ converges})$,
(b3) $F_3 := \Big(\lim \dfrac{X_1 + X_2 + \cdots + X_k}{k} \text{ exists}\Big)$.

Also, there are many important variables which are in $m\mathcal{T}$: for example,

(c) $\xi_1 := \limsup \dfrac{X_1 + X_2 + \cdots + X_k}{k}$,

which may be $\pm\infty$, of course.

Exercise. Prove that $F_1$, $F_2$ and $F_3$ are $\mathcal{T}$-measurable, that the event $H$ in the monkey problem is a tail event, and that the various events of probability 0 and 1 in Section 4.4 are tail events.

Hint - to be read only after you have already tried hard. Look at $F_3$ for example. For each $n$, logic tells us that $F_3$ is equal to the set
$$F_3^{(n)} := \Big\{\omega : \lim_k \frac{X_{n+1}(\omega) + \cdots + X_{n+k}(\omega)}{k} \text{ exists}\Big\}.$$
Now, $X_{n+1}, X_{n+2}, \ldots$ are all random variables on the triple $(\Omega, \mathcal{T}_n, \mathbf{P})$. That $F_3^{(n)} \in \mathcal{T}_n$ now follows from Lemmas 3.3 and 3.5.

4.11. THEOREM. Kolmogorov's 0-1 Law

►► Let $(X_n : n \in \mathbf{N})$ be a sequence of independent random variables, and let $\mathcal{T}$ be the tail $\sigma$-algebra of $(X_n : n \in \mathbf{N})$. Then $\mathcal{T}$ is $\mathbf{P}$-trivial: that is,

(i) $F \in \mathcal{T} \Rightarrow \mathbf{P}(F) = 0$ or $\mathbf{P}(F) = 1$,
(ii) if $\xi$ is a $\mathcal{T}$-measurable random variable, then $\xi$ is almost deterministic in that for some constant $c$ in $[-\infty, \infty]$, $\mathbf{P}(\xi = c) = 1$.

We allow $\xi = \pm\infty$ at (ii) for obvious reasons.

Proof of (i). Let
$$\mathcal{X}_n := \sigma(X_1, \ldots, X_n), \qquad \mathcal{T}_n := \sigma(X_{n+1}, X_{n+2}, \ldots).$$

Step 1: We claim that $\mathcal{X}_n$ and $\mathcal{T}_n$ are independent.

Proof of claim. The class $\mathcal{K}$ of events of the form
$$\{\omega : X_i(\omega) \le x_i : 1 \le i \le n\}, \qquad x_i \in \mathbf{R} \cup \{\infty\},$$
is a $\pi$-system which generates $\mathcal{X}_n$. The class $\mathcal{J}$ of sets of the form
$$\{\omega : X_j(\omega) \le x_j : n + 1 \le j \le n + r\}, \qquad r \in \mathbf{N},\ x_j \in \mathbf{R} \cup \{\infty\},$$
is a $\pi$-system which generates $\mathcal{T}_n$. But the assumption that the sequence $(X_k)$ is independent implies that $\mathcal{K}$ and $\mathcal{J}$ are independent. Lemma 4.2(a) now clinches our claim.

Step 2: $\mathcal{X}_n$ and $\mathcal{T}$ are independent. This is obvious because $\mathcal{T} \subseteq \mathcal{T}_n$.

Step 3: We claim that $\mathcal{X}_\infty := \sigma(X_n : n \in \mathbf{N})$ and $\mathcal{T}$ are independent.

Proof of claim. Because $\mathcal{X}_n \subseteq \mathcal{X}_{n+1}$, $\forall n$, the class $\mathcal{K}_\infty := \bigcup \mathcal{X}_n$ is a $\pi$-system (it is generally NOT a $\sigma$-algebra!) which generates $\mathcal{X}_\infty$. Moreover, $\mathcal{K}_\infty$ and $\mathcal{T}$ are independent, by Step 2. Lemma 4.2(a) again clinches things.

Step 4. Since $\mathcal{T} \subseteq \mathcal{X}_\infty$, $\mathcal{T}$ is independent of $\mathcal{T}$! Thus,
$$F \in \mathcal{T} \Rightarrow \mathbf{P}(F) = \mathbf{P}(F \cap F) = \mathbf{P}(F)\mathbf{P}(F),$$
and $\mathbf{P}(F) = 0$ or 1. □

Proof of (ii). By part (i), for every $x$ in $\mathbf{R}$, $\mathbf{P}(\xi \le x) = 0$ or 1. Let $c := \sup\{x : \mathbf{P}(\xi \le x) = 0\}$. Then, if $c = -\infty$, it is clear that $\mathbf{P}(\xi = -\infty) = 1$; and if $c = \infty$, it is clear that $\mathbf{P}(\xi = \infty) = 1$.

So, suppose that $c$ is finite. Then $\mathbf{P}(\xi \le c - 1/n) = 0$, $\forall n$, so that
$$\mathbf{P}\Big(\bigcup \{\xi \le c - 1/n\}\Big) = \mathbf{P}(\xi < c) = 0,$$
while, since $\mathbf{P}(\xi \le c + 1/n) = 1$, $\forall n$, we have
$$\mathbf{P}\Big(\bigcap \{\xi \le c + 1/n\}\Big) = \mathbf{P}(\xi \le c) = 1.$$
Hence, $\mathbf{P}(\xi = c) = 1$. □

Remarks. The examples in Section 4.10 show how striking this result is. For example, if $X_1, X_2, \ldots$ is a sequence of independent random variables, then
$$\text{either } \mathbf{P}(\textstyle\sum X_n \text{ converges}) = 0 \text{ or } \mathbf{P}(\textstyle\sum X_n \text{ converges}) = 1.$$
The Three Series Theorem (Theorem 12.5) completely settles the question of which possibility occurs.

So, you can see that the 0-1 law poses numerous interesting questions.

Example. In the branching-process example of Chapter 0, the variable
$$M_\infty := \lim Z_n / \mu^n$$
is measurable on the tail $\sigma$-algebra of the sequence $(Z_n : n \in \mathbf{N})$ but need not be almost deterministic. But then the variables $(Z_n : n \in \mathbf{N})$ are not independent.

4.12. Exercise/Warning

Let $Y_0, Y_1, Y_2, \ldots$ be independent random variables with
$$\mathbf{P}(Y_n = +1) = \mathbf{P}(Y_n = -1) = \tfrac{1}{2}, \qquad \forall n.$$
For $n \in \mathbf{N}$, define
$$X_n := Y_0 Y_1 \cdots Y_n.$$
Prove that the variables $X_1, X_2, \ldots$ are independent. Define
$$\mathcal{Y} := \sigma(Y_1, Y_2, \ldots), \qquad \mathcal{T}_n := \sigma(X_r : r > n).$$
Prove that
$$\mathcal{L} := \bigcap_n \sigma(\mathcal{Y}, \mathcal{T}_n) \ne \sigma\Big(\mathcal{Y}, \bigcap_n \mathcal{T}_n\Big) =: \mathcal{R}.$$
Hint. Prove that $Y_0 \in m\mathcal{L}$ and that $Y_0$ is independent of $\mathcal{R}$.

Notes. The phenomenon illustrated by this example tripped up even Kolmogorov and Wiener. The very simple illustration given here was shown to me by Martin Barlow and Ed Perkins. Deciding when we can assert that (for $\mathcal{Y}$ a $\sigma$-algebra and $(\mathcal{T}_n)$ a decreasing sequence of $\sigma$-algebras)
$$\bigcap_n \sigma(\mathcal{Y}, \mathcal{T}_n) = \sigma\Big(\mathcal{Y}, \bigcap_n \mathcal{T}_n\Big)$$
is a tantalizing problem in many probabilistic contexts.

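
Before proving that $X_1, X_2, \ldots$ in Exercise 4.12 are independent, it may help to see the claim checked by brute force for a few variables. The sketch below enumerates all sign patterns of $(Y_0, \ldots, Y_n)$ for the illustrative choice $n = 3$ and tabulates the joint law of $(X_1, X_2, X_3)$; it is a sanity check for one small $n$, certainly not a proof.

```python
from fractions import Fraction
from itertools import product

n = 3  # check X_1, ..., X_n; a small n keeps the enumeration tiny

counts = {}
for ys in product((-1, 1), repeat=n + 1):       # all values of (Y_0, ..., Y_n)
    xs = []
    running = ys[0]
    for y in ys[1:]:
        running *= y                             # X_k = Y_0 Y_1 ... Y_k
        xs.append(running)
    key = tuple(xs)
    counts[key] = counts.get(key, 0) + 1

# Each sign pattern of (Y_0, ..., Y_n) has probability 2^-(n+1).
total = 2 ** (n + 1)
joint = {xs: Fraction(c, total) for xs, c in counts.items()}
```

Every one of the $2^3$ sign patterns of $(X_1, X_2, X_3)$ receives the same probability, namely the product of three marginal probabilities $\tfrac{1}{2}$, which is what independence of symmetric $\pm 1$ variables requires.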
Chapter 5

Integration

5.0. Notation, etc. $\mu(f) :=: \int f\,d\mu$, $\mu(f; A)$

►► Let $(S, \Sigma, \mu)$ be a measure space. We are interested in defining for suitable elements $f$ of $m\Sigma$ the (Lebesgue) integral of $f$ with respect to $\mu$, for which we shall use the alternative notations:
$$\mu(f) :=: \int_S f(s)\,\mu(ds) :=: \int_S f\,d\mu.$$
It is worth mentioning now that we shall also use the equivalent notations for $A \in \Sigma$:
$$\int_A f(s)\,\mu(ds) :=: \int_A f\,d\mu :=: \mu(f; A) := \mu(f I_A)$$
(with a true definition on the extreme right!) It should be clear that, for example,
$$\mu(f; f \ge x) := \mu(f; A), \quad \text{where } A = \{s \in S : f(s) \ge x\}.$$
Something else worth emphasizing now is that, of course, summation is a special type of integration. If $(a_n : n \in \mathbf{N})$ is a sequence of real numbers, then with $S = \mathbf{N}$, $\Sigma = \mathcal{P}(\mathbf{N})$, and $\mu$ the measure on $(S, \Sigma)$ with $\mu(\{k\}) = 1$ for every $k$ in $\mathbf{N}$, then $s \mapsto a_s$ is $\mu$-integrable if and only if $\sum |a_n| < \infty$, and then
$$\sum a_n = \int_S a_s\,\mu(ds) = \int_S a\,d\mu.$$
We begin by considering the integral of a function $f$ in $(m\Sigma)^+$, allowing such an $f$ to take values in the extended half-line $[0, \infty]$.

5.1. Integrals of non-negative simple functions, $SF^+$

If $A$ is an element of $\Sigma$, we define
$$\mu_0(I_A) := \mu(A) \le \infty.$$
The use of $\mu_0$ rather than $\mu$ signifies that we currently have only a naive integral defined for simple functions.

An element $f$ of $(m\Sigma)^+$ is called simple, and we shall then write $f \in SF^+$, if $f$ may be written as a finite sum

(a) $f = \sum_{k=1}^{m} a_k I_{A_k}$

where $a_k \in [0, \infty]$ and $A_k \in \Sigma$. We then define

(b) $\mu_0(f) = \sum a_k \mu(A_k) \le \infty$ (with $0.\infty := 0 =: \infty.0$).

The first point to be checked is that $\mu_0(f)$ is well-defined; for $f$ will have many different representations of the form (a) and we must ensure that they yield the same value of $\mu_0(f)$ in (b). Various desirable properties also need to be checked, namely (c), (d) and (e) now to be stated:

(c) if $f, g \in SF^+$ and $\mu(f \ne g) = 0$ then $\mu_0(f) = \mu_0(g)$;
(d) ('Linearity') if $f, g \in SF^+$ and $c \ge 0$ then $f + g$ and $cf$ are in $SF^+$, and $\mu_0(f + g) = \mu_0(f) + \mu_0(g)$, $\mu_0(cf) = c\mu_0(f)$;
(e) (Monotonicity) if $f, g \in SF^+$ and $f \le g$, then $\mu_0(f) \le \mu_0(g)$;
(f) if $f, g \in SF^+$ then $f \wedge g$ and $f \vee g$ are in $SF^+$.

Checking all the properties just mentioned is a little messy, but it involves no point of substance, and in particular no analysis. We skip this, and turn our attention to what matters: the Monotone-Convergence Theorem.

5.2. Definition of $\mu(f)$, $f \in (m\Sigma)^+$

► For $f \in (m\Sigma)^+$ we define

(a) $\mu(f) := \sup\{\mu_0(h) : h \in SF^+,\ h \le f\} \le \infty$.

Clearly, for $f \in SF^+$, we have $\mu(f) = \mu_0(f)$.

The following result is important.

LEMMA

► (b) If $f \in (m\Sigma)^+$ and $\mu(f) = 0$, then $\mu(\{f > 0\}) = 0$.

Proof. Obviously, $\{f > 0\} = \uparrow\lim \{f > n^{-1}\}$. Hence, using (1.10,a), we see that if $\mu(\{f > 0\}) > 0$, then, for some $n$, $\mu(\{f > n^{-1}\}) > 0$, and then
$$\mu(f) \ge \mu_0(n^{-1} I_{\{f > n^{-1}\}}) > 0. \qquad \square$$

5.3. Monotone-Convergence Theorem (MON)

►►► (a) If $(f_n)$ is a sequence of elements of $(m\Sigma)^+$ such that $f_n \uparrow f$, then
$$\mu(f_n) \uparrow \mu(f) \le \infty,$$
or, in other notation,
$$\int_S f_n(s)\,\mu(ds) \uparrow \int_S f(s)\,\mu(ds).$$

This theorem is really all there is to integration theory. We shall see that other key results such as the Fatou Lemma and the Dominated-Convergence Theorem follow trivially from it.

The (MON) theorem is proved in the Appendix. Obviously, the theorem relates very closely to Lemma 1.10(a), the monotonicity result for measures. The proof of (MON) is not at all difficult, and may be read once you have looked at the following definition of $\alpha^{(r)}$.

It is convenient to have an explicit way given $f \in (m\Sigma)^+$ of obtaining a sequence $f^{(r)}$ of simple functions such that $f^{(r)} \uparrow f$. For $r \in \mathbf{N}$, define the $r$th staircase function $\alpha^{(r)} : [0, \infty] \to [0, \infty]$ as follows:

(b) $\alpha^{(r)}(x) := \begin{cases} 0 & \text{if } x = 0, \\ (i - 1)2^{-r} & \text{if } (i - 1)2^{-r} < x \le i2^{-r} \le r \quad (i \in \mathbf{N}), \\ r & \text{if } x > r. \end{cases}$

Then $f^{(r)} = \alpha^{(r)} \circ f$ satisfies $f^{(r)} \in SF^+$, and $f^{(r)} \uparrow f$ so that, by (MON),
$$\mu(f) = \uparrow\lim \mu(f^{(r)}) = \uparrow\lim \mu_0(f^{(r)}).$$
We have made $\alpha^{(r)}$ left-continuous so that if $f_n \uparrow f$ then $\alpha^{(r)}(f_n) \uparrow \alpha^{(r)}(f)$.

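
The staircase function $\alpha^{(r)}$ of (5.3,b) is easily coded, and doing so makes the claim $\alpha^{(r)}(x) \uparrow x$ visible. This is a minimal sketch; the function name `staircase` and the use of exact rationals are choices made here for illustration.

```python
from fractions import Fraction
from math import ceil


def staircase(r, x):
    """The r-th staircase function alpha^(r) evaluated at x in [0, inf]."""
    if x == 0:
        return Fraction(0)
    if x > r:
        return Fraction(r)
    # Here 0 < x <= r: pick the i with (i - 1) 2^-r < x <= i 2^-r.
    i = ceil(x * 2 ** r)
    return Fraction(i - 1, 2 ** r)


x = Fraction(3, 4)
# alpha^(r)(3/4) for r = 1, ..., 5: non-decreasing in r and below 3/4.
approximations = [staircase(r, x) for r in range(1, 6)]
```

Because the value on $((i-1)2^{-r}, i2^{-r}]$ is the left endpoint, $\alpha^{(r)}(x) < x$ for every $x > 0$; this strict inequality is what makes $\alpha^{(r)}$ left-continuous.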
Often, we need to apply convergence theorems such as (MON) where the hypothesis ($f_n \uparrow f$ in the case of (MON)) holds almost everywhere rather than everywhere. Let us see how such adjustments may be made.

(c) If $f, g \in (m\Sigma)^+$ and $f = g$ (a.e.), then $\mu(f) = \mu(g)$.

Proof. Let $f^{(r)} = \alpha^{(r)} \circ f$, $g^{(r)} = \alpha^{(r)} \circ g$. Then $f^{(r)} = g^{(r)}$ (a.e.) and so, by (5.1,c), $\mu(f^{(r)}) = \mu(g^{(r)})$. Now let $r \uparrow \infty$, and use (MON). □

► (d) If $f \in (m\Sigma)^+$ and $(f_n)$ is a sequence in $(m\Sigma)^+$ such that, except on a $\mu$-null set $N$, $f_n \uparrow f$, then
$$\mu(f_n) \uparrow \mu(f).$$

Proof. We have $\mu(f) = \mu(f I_{S \setminus N})$ and $\mu(f_n) = \mu(f_n I_{S \setminus N})$. But $f_n I_{S \setminus N} \uparrow f I_{S \setminus N}$ everywhere. The result now follows from (MON). □

From now on, (MON) is understood to include this extension. We do not bother to spell out such extensions for the other convergence theorems, often stating results with 'almost everywhere' but proving them under the assumption that the exceptional null set is empty.

Note on the Riemann integral

If, for example, $f$ is a non-negative Riemann integrable function on $([0,1], \mathcal{B}[0,1], \text{Leb})$ with Riemann integral $I$, then there exists an increasing sequence $(L_n)$ of elements of $SF^+$ and a decreasing sequence $(U_n)$ of elements of $SF^+$ such that
$$L_n \uparrow L \le f, \qquad U_n \downarrow U \ge f$$
and $\mu(L_n) \uparrow I$, $\mu(U_n) \downarrow I$. If we define
$$\tilde{f} = \begin{cases} L & \text{if } L = U, \\ 0 & \text{otherwise,} \end{cases}$$
then it is clear that $\tilde{f}$ is Borel measurable, while $\{f \ne \tilde{f}\}$ is a subset of the Borel set $\{L \ne U\}$ which Lemma 5.2(b) shows to be of measure 0 (since $\mu(L) = \mu(U) = I$). So $f$ is Lebesgue measurable (see Section A1.11) and the Riemann integral of $f$ equals the integral of $f$ associated with $([0,1], Leb[0,1], \text{Leb})$, $Leb[0,1]$ denoting the $\sigma$-algebra of Lebesgue measurable subsets of $[0,1]$.

5.4. The Fatou Lemmas for functions

►► (a) (FATOU) For a sequence $(f_n)$ in $(m\Sigma)^+$,
$$\mu(\liminf f_n) \le \liminf \mu(f_n).$$

Proof. We have

(*) $\liminf_n f_n = \uparrow\lim_k g_k$, where $g_k := \inf_{n \ge k} f_n$.

For $n \ge k$, we have $f_n \ge g_k$, so that $\mu(f_n) \ge \mu(g_k)$, whence
$$\mu(g_k) \le \inf_{n \ge k} \mu(f_n);$$
and on combining this with an application of (MON) to (*), we obtain
$$\mu(\liminf_n f_n) = \uparrow\lim_k \mu(g_k) \le \uparrow\lim_k \inf_{n \ge k} \mu(f_n) =: \liminf_n \mu(f_n). \qquad \square$$

Reverse Fatou Lemma

► (b) If $(f_n)$ is a sequence in $(m\Sigma)^+$ such that for some $g$ in $(m\Sigma)^+$, we have $f_n \le g$, $\forall n$, and $\mu(g) < \infty$, then
$$\mu(\limsup f_n) \ge \limsup \mu(f_n).$$

Proof. Apply (FATOU) to the sequence $(g - f_n)$. □

5.5. 'Linearity'

For $\alpha, \beta \in \mathbf{R}^+$ and $f, g \in (m\Sigma)^+$,
$$\mu(\alpha f + \beta g) = \alpha\mu(f) + \beta\mu(g) \quad (\le \infty).$$

Proof. Approximate $f$ and $g$ from below by simple functions, apply (5.1,d) to the simple functions, and then use (MON). □

5.6. Positive and negative parts of $f$

For $f \in m\Sigma$, we write $f = f^+ - f^-$, where
$$f^+(s) := \max(f(s), 0), \qquad f^-(s) := \max(-f(s), 0).$$
Then $f^+, f^- \in (m\Sigma)^+$, and $|f| = f^+ + f^-$.

5.7. Integrable function, $\mathcal{L}^1(S, \Sigma, \mu)$

► For $f \in m\Sigma$, we say that $f$ is $\mu$-integrable, and write
$$f \in \mathcal{L}^1(S, \Sigma, \mu),$$
if
$$\mu(|f|) = \mu(f^+) + \mu(f^-) < \infty,$$
and then we define
$$\int f\,d\mu := \mu(f) := \mu(f^+) - \mu(f^-).$$
Note that, for $f \in \mathcal{L}^1(S, \Sigma, \mu)$,

► $|\mu(f)| \le \mu(|f|)$,

the familiar rule that the modulus of the integral is less than or equal to the integral of the modulus.

We write $\mathcal{L}^1(S, \Sigma, \mu)^+$ for the class of non-negative elements in $\mathcal{L}^1(S, \Sigma, \mu)$.

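5.8. Linearity

For $\alpha, \beta \in \mathbf{R}$ and $f, g \in \mathcal{L}^1(S, \Sigma, \mu)$,
$$\alpha f + \beta g \in \mathcal{L}^1(S, \Sigma, \mu)$$
and
$$\mu(\alpha f + \beta g) = \alpha\mu(f) + \beta\mu(g).$$

Proof. This is a totally routine consequence of the result in Section 5.5.
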
5.9. Dominated-Convergence Theorem (DOM)

► Suppose that $f_n, f \in m\Sigma$, that $f_n(s) \to f(s)$ for every $s$ in $S$ and that the sequence $(f_n)$ is dominated by an element $g$ of $\mathcal{L}^1(S, \Sigma, \mu)^+$:
$$|f_n(s)| \le g(s), \qquad \forall s \in S,\ \forall n \in \mathbf{N},$$
where $\mu(g) < \infty$. Then
$$f_n \to f \text{ in } \mathcal{L}^1(S, \Sigma, \mu): \text{ that is, } \mu(|f_n - f|) \to 0,$$
whence
$$\mu(f_n) \to \mu(f).$$

Command: Do Exercise E5.1 now.

Proof. We have $|f_n - f| \le 2g$, where $\mu(2g) < \infty$, so by the reverse Fatou Lemma 5.4(b),
$$\limsup \mu(|f_n - f|) \le \mu(\limsup |f_n - f|) = \mu(0) = 0.$$
Since
$$|\mu(f_n) - \mu(f)| = |\mu(f_n - f)| \le \mu(|f_n - f|),$$
the theorem is proved. □

5.10. Scheffé's Lemma (SCHEFFÉ)

►(i) Suppose that $f_n, f \in \mathcal{L}^1(S,\Sigma,\mu)^+$; in particular, $f_n$ and $f$ are non-negative. Suppose that $f_n \to f$ (a.e.). Then
$$\mu(|f_n - f|) \to 0 \quad\text{if and only if}\quad \mu(f_n) \to \mu(f).$$
Proof. The 'only if' part is trivial. Suppose now that
(a) $\mu(f_n) \to \mu(f)$.
Since $(f_n - f)^- \le f$, (DOM) shows that
(b) $\mu((f_n - f)^-) \to 0$.
Next,
$$\mu((f_n - f)^+) = \mu(f_n - f;\ f_n \ge f) = \mu(f_n) - \mu(f) - \mu(f_n - f;\ f_n < f).$$
But
$$|\mu(f_n - f;\ f_n < f)| \le |\mu((f_n - f)^-)| \to 0,$$
so that (a) and (b) together imply that
(c) $\mu((f_n - f)^+) \to 0$.
Of course, (b) and (c) now yield the desired result. □

Here is the second part of Scheffé's Lemma.

►(ii) Suppose that $f_n, f \in \mathcal{L}^1(S,\Sigma,\mu)$ and that $f_n \to f$ (a.e.). Then
$$\mu(|f_n - f|) \to 0 \quad\text{if and only if}\quad \mu(|f_n|) \to \mu(|f|).$$
Exercise. Prove the 'if' part of (ii) by using Fatou's Lemma to show that $\mu(f_n^{\pm}) \to \mu(f^{\pm})$, and then applying (i). Of course, the 'only if' part is trivial.
5.11. Remark on uniform integrability

The theory of uniform integrability, which we shall establish later for probability triples, gives better insight into the matter of convergence of integrals.
5.12. The standard machine

What I call the standard machine is a much cruder alternative to the Monotone-Class Theorem.

The idea is that to prove that a 'linear' result is true for all functions $h$ in a space such as $\mathcal{L}^1(S,\Sigma,\mu)$,
• first, we show the result is true for the case when $h$ is an indicator function - which it normally is by definition;
• then, we use linearity to obtain the result for $h$ in $SF^+$;
• next, we use (MON) to obtain the result for $h \in (m\Sigma)^+$, integrability conditions on $h$ usually being superfluous at this stage;
• finally, we show, by writing $h = h^+ - h^-$ and using linearity, that the claimed result is true.

It seems to me that, when it works, it is easier to 'watch the standard machine work' than to appeal to the monotone-class result, though there are times when the greater subtlety of the Monotone-Class Theorem is essential.

5.13. Integrals over subsets

Recall that for $f \in (m\Sigma)^+$, we set, for $A \in \Sigma$,
$$\int_A f\,d\mu := \mu(f; A) := \mu(f I_A).$$
If we really want to integrate $f$ over $A$, we should integrate the restriction $f|_A$ with respect to the measure $\mu_A$ (say) which is $\mu$ restricted to the measure space $(A, \Sigma_A)$, $\Sigma_A$ denoting the $\sigma$-algebra of subsets of $A$ which belong to $\Sigma$. So we ought to prove that
(a) $\mu_A(f|_A) = \mu(f; A)$.
The standard machine does this. If $f$ is the indicator of a set $B$ in $\Sigma$, then both sides of (a) are just $\mu(A \cap B)$; etc. We discover that for $f \in m\Sigma$, we have $f|_A \in m\Sigma_A$; and then $f|_A \in \mathcal{L}^1(A, \Sigma_A, \mu_A)$ if and only if $f I_A \in \mathcal{L}^1(S,\Sigma,\mu)$, in which case (a) holds.
5.14. The measure $f\mu$, $f \in (m\Sigma)^+$

Let $f \in (m\Sigma)^+$. For $A \in \Sigma$, define
(a) $(f\mu)(A) := \mu(f; A) := \mu(f I_A)$.
A trivial Exercise on the results of Section 5.5 and (MON) shows that
(b) $(f\mu)$ is a measure on $(S, \Sigma)$.
For $h \in (m\Sigma)^+$, and $A \in \Sigma$, we can conjecture that
(c) $(h(f\mu))(A) := (f\mu)(h I_A) = \mu(f h I_A)$.
If $h$ is the indicator of a set in $\Sigma$, then (c) is immediate by definition. Our standard machine produces (c), so that we have
(d) $h(f\mu) = (hf)\mu$.
Result (d) is often used in the following form:
►(e) if $f \in (m\Sigma)^+$ and $h \in (m\Sigma)$, then $h \in \mathcal{L}^1(S,\Sigma,f\mu)$ if and only if $fh \in \mathcal{L}^1(S,\Sigma,\mu)$ and then
$$(f\mu)(h) = \mu(fh).$$
Proof. We need only prove this for $h \ge 0$ in which case it merely says that the measures at (d) agree on $S$. □

Terminology, and the Radon-Nikodým theorem

If $\lambda$ denotes the measure $f\mu$ on $(S, \Sigma)$, we say that $\lambda$ has density $f$ relative to $\mu$, and express this in symbols via
$$d\lambda/d\mu = f.$$
We note that in this case, we have for $F \in \Sigma$:
(f) $\mu(F) = 0$ implies that $\lambda(F) = 0$;
so that only certain measures have density relative to $\mu$. The Radon-Nikodým theorem (proved in Chapter 14) tells us that
(g) if $\mu$ and $\lambda$ are $\sigma$-finite measures on $(S, \Sigma)$ such that (f) holds, then $\lambda = f\mu$ for some $f \in (m\Sigma)^+$.
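On a finite space the identity $(f\mu)(h) = \mu(fh)$ of Section 5.14 can be checked by hand. The following is a minimal sketch, assuming a three-point space with all subsets measurable; the point masses and the functions `f` and `h` are illustrative values, not taken from the text.

```python
from fractions import Fraction as Fr

# Hypothetical finite measure space: S = {0, 1, 2}, Sigma = all subsets,
# mu given by point masses (illustrative values only).
mu_mass = {0: Fr(1, 2), 1: Fr(2), 2: Fr(3)}
f = {0: Fr(4), 1: Fr(0), 2: Fr(1, 3)}   # a density f in (m Sigma)^+
h = {0: Fr(-1), 1: Fr(5), 2: Fr(6)}     # a signed function h

def integral(g, mass):
    """Integral of g against the measure with the given point masses."""
    return sum(g[s] * mass[s] for s in mass)

# The measure (f mu) has point masses f(s) * mu({s}).
fmu_mass = {s: f[s] * mu_mass[s] for s in mu_mass}

lhs = integral(h, fmu_mass)                                   # (f mu)(h)
rhs = integral({s: f[s] * h[s] for s in mu_mass}, mu_mass)    # mu(f h)
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.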
Chapter 6

Expectation

6.0. Introductory remarks

We work with a probability triple $(\Omega, \mathcal{F}, \mathbf{P})$, and write $\mathcal{L}^r$ for $\mathcal{L}^r(\Omega, \mathcal{F}, \mathbf{P})$. Recall that a random variable (RV) is an element of $m\mathcal{F}$, that is, an $\mathcal{F}$-measurable function from $\Omega$ to $\mathbb{R}$.

Expectation is just the integral relative to $\mathbf{P}$.

Jensen's inequality, which makes critical use of the fact that $\mathbf{P}(\Omega) = 1$, is very useful and powerful: it implies the Schwarz, Hölder, ... inequalities for general $(S, \Sigma, \mu)$. (See Section 6.13.)

We study the geometry of the space $\mathcal{L}^2(\Omega, \mathcal{F}, \mathbf{P})$ in some detail, with a view to several later applications.
6.1. Definition of expectation

For a random variable $X \in \mathcal{L}^1 = \mathcal{L}^1(\Omega, \mathcal{F}, \mathbf{P})$, we define the expectation $\mathbf{E}(X)$ of $X$ by
$$\mathbf{E}(X) := \int_\Omega X\,d\mathbf{P} = \int_\Omega X(\omega)\,\mathbf{P}(d\omega).$$
We also define $\mathbf{E}(X)$ $(\le \infty)$ for $X \in (m\mathcal{F})^+$. In short, $\mathbf{E}(X) = \mathbf{P}(X)$.

That our present definitions agree with those in terms of probability density function (if it exists) etc. will be confirmed in Section 6.12.
6.2. Convergence theorems

Suppose that $(X_n)$ is a sequence of RVs, that $X$ is a RV, and that $X_n \to X$ almost surely:
$$\mathbf{P}(X_n \to X) = 1.$$
We rephrase the convergence theorems of Chapter 5 in our new notation:

►►(MON) if $0 \le X_n \uparrow X$, then $\mathbf{E}(X_n) \uparrow \mathbf{E}(X) \le \infty$;
►►(FATOU) if $X_n \ge 0$, then $\mathbf{E}(X) \le \liminf \mathbf{E}(X_n)$;
►(DOM) if $|X_n(\omega)| \le Y(\omega)$ $\forall (n, \omega)$, where $\mathbf{E}(Y) < \infty$, then $\mathbf{E}(|X_n - X|) \to 0$, so that $\mathbf{E}(X_n) \to \mathbf{E}(X)$;
►(SCHEFFÉ) if $\mathbf{E}(|X_n|) \to \mathbf{E}(|X|)$, then $\mathbf{E}(|X_n - X|) \to 0$;
►►(BDD) if for some finite constant $K$, $|X_n(\omega)| \le K$ $\forall (n, \omega)$, then $\mathbf{E}(|X_n - X|) \to 0$.

The newly-added Bounded Convergence Theorem (BDD) is an immediate consequence of (DOM), obtained by taking $Y(\omega) = K$, $\forall \omega$; because of the fact that $\mathbf{P}(\Omega) = 1$, we have $\mathbf{E}(Y) < \infty$. It has a direct elementary proof which we shall examine in Section 13.7; but you might well be able to provide it now.

As has been mentioned previously, uniform integrability is the key concept which gives a proper understanding of convergence theorems. We shall study this, via the elementary (BDD) result, in Chapter 13.
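6.3. The notation $\mathbf{E}(X; F)$

For $X \in \mathcal{L}^1$ (or $(m\mathcal{F})^+$) and $F \in \mathcal{F}$, we define
► $$\mathbf{E}(X; F) := \int_F X(\omega)\,\mathbf{P}(d\omega) := \mathbf{E}(X I_F),$$
where, as ever, $I_F$ denotes the indicator function of $F$.

Of course, this tallies with the $\mu(f; A)$ notation of Chapter 5.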
6.4. Markov's inequality

Suppose that $Z \in m\mathcal{F}$ and that $g : \mathbb{R} \to [0, \infty]$ is $\mathcal{B}$-measurable and non-decreasing. (We know that $g(Z) = g \circ Z \in (m\mathcal{F})^+$.) Then
► $$\mathbf{E}g(Z) \ge \mathbf{E}(g(Z); Z \ge c) \ge g(c)\mathbf{P}(Z \ge c).$$
Examples:
for $Z \in (m\mathcal{F})^+$, $c\mathbf{P}(Z \ge c) \le \mathbf{E}(Z)$, $(c > 0)$,
for $X \in \mathcal{L}^1$, $c\mathbf{P}(|X| \ge c) \le \mathbf{E}(|X|)$, $(c > 0)$.

►► Considerable strength can often be obtained by choosing the optimum $\theta$ for $c$ in
► $$\mathbf{P}(Y > c) \le e^{-\theta c}\mathbf{E}(e^{\theta Y}), \quad (\theta > 0,\ c \in \mathbb{R}).$$
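To see how much is gained by optimising over $\theta$ in the exponential form of Markov's inequality, one can compare the bounds with an exact tail probability. This is a sketch under stated assumptions: the Binomial(50, 1/2) law, the level `c = 35` and the grid of `thetas` are all illustrative choices, and the minimum is taken over the grid only, not over all $\theta > 0$.

```python
import math

def binom_pmf(n, p, k):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def tail(n, p, c):
    """Exact P(Y >= c) for Y ~ Binomial(n, p)."""
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k >= c)

def mgf(n, p, theta):
    """E(exp(theta * Y)) = (1 - p + p * e^theta)^n for Y ~ Binomial(n, p)."""
    return (1 - p + p * math.exp(theta)) ** n

n, p, c = 50, 0.5, 35
exact = tail(n, p, c)
markov = n * p / c                       # from c P(Y >= c) <= E(Y)
thetas = [i / 100 for i in range(1, 301)]
# Every theta > 0 gives a valid bound; keep the smallest one on the grid.
chernoff = min(math.exp(-t * c) * mgf(n, p, t) for t in thetas)
print(exact, chernoff, markov)
```

The plain Markov bound is of order one here, while the optimised exponential bound is much closer to the exact tail.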
6.5. Sums of non-negative RVs

We collect together some useful results.

(a) If $X \in (m\mathcal{F})^+$ and $\mathbf{E}(X) < \infty$, then $\mathbf{P}(X < \infty) = 1$. This is obvious.

►(b) If $(Z_k)$ is a sequence in $(m\mathcal{F})^+$, then
$$\mathbf{E}\Big(\sum Z_k\Big) = \sum \mathbf{E}(Z_k) \le \infty.$$
This is an obvious consequence of linearity and (MON).

►(c) If $(Z_k)$ is a sequence in $(m\mathcal{F})^+$ such that $\sum \mathbf{E}(Z_k) < \infty$, then
$$\sum Z_k < \infty \ \text{(a.s.)} \quad\text{and so}\quad Z_k \to 0 \ \text{(a.s.)}$$
This is an immediate consequence of (a) and (b).

(d) The First Borel-Cantelli Lemma is a consequence of (c). For suppose that $(F_k)$ is a sequence of events such that $\sum \mathbf{P}(F_k) < \infty$. Take $Z_k = I_{F_k}$. Then $\mathbf{E}(Z_k) = \mathbf{P}(F_k)$ and, by (c),
$$\sum I_{F_k} = \text{number of events } F_k \text{ which occur}$$
is a.s. finite.
6.6. Jensen's inequality for convex functions

►► A function $c : G \to \mathbb{R}$, where $G$ is an open subinterval of $\mathbb{R}$, is called convex on $G$ if its graph lies below any of its chords: for $x, y \in G$ and $0 \le p = 1 - q \le 1$,
$$c(px + qy) \le pc(x) + qc(y).$$
It will be explained below that $c$ is automatically continuous on $G$. If $c$ is twice-differentiable on $G$, then $c$ is convex if and only if $c'' \ge 0$.

► Important examples of convex functions: $|x|$, $x^2$, $e^{\theta x}$ $(\theta \in \mathbb{R})$.

THEOREM. Jensen's inequality

►► Suppose that $c : G \to \mathbb{R}$ is a convex function on an open subinterval $G$ of $\mathbb{R}$ and that $X$ is a random variable such that
$$\mathbf{E}(|X|) < \infty, \quad \mathbf{P}(X \in G) = 1, \quad \mathbf{E}|c(X)| < \infty.$$
Then
$$\mathbf{E}c(X) \ge c(\mathbf{E}(X)).$$
Proof. The fact that $c$ is convex may be rewritten as follows: for $u, v, w \in G$ with $u < v < w$, we have
$$\Delta_{u,v} \le \Delta_{v,w}, \quad\text{where}\quad \Delta_{u,v} := \frac{c(v) - c(u)}{v - u}.$$
It is now clear (why?!) that $c$ is continuous on $G$, and that for each $v$ in $G$ the monotone limits
$$(D_-c)(v) := \uparrow\lim_{u \uparrow v} \Delta_{u,v}, \qquad (D_+c)(v) := \downarrow\lim_{w \downarrow v} \Delta_{v,w}$$
exist and satisfy $(D_-c)(v) \le (D_+c)(v)$. The functions $D_-c$ and $D_+c$ are non-decreasing, and for every $v$ in $G$, for any $m$ in $[(D_-c)(v), (D_+c)(v)]$ we have
$$c(x) \ge m(x - v) + c(v), \quad x \in G.$$
In particular, we have, almost surely, for $\mu := \mathbf{E}(X)$,
$$c(X) \ge m(X - \mu) + c(\mu), \quad m \in [(D_-c)(\mu), (D_+c)(\mu)],$$
and Jensen's inequality follows on taking expectations. □

Remark. For later use, we shall need the obvious fact that
(a) $$c(x) = \sup_{q \in G}\,[(D_-c)(q)(x - q) + c(q)] = \sup_n (a_n x + b_n) \quad (x \in G)$$
for some sequences $(a_n)$ and $(b_n)$ in $\mathbb{R}$. (Recall that $c$ is continuous.)
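Jensen's inequality is easy to check numerically for a finitely-valued random variable. The sketch below assumes an illustrative four-point law (the `values` and `probs` are not from the text) and evaluates the gap $\mathbf{E}c(X) - c(\mathbf{E}X)$ for the three convex examples listed above, with $\theta = 0.7$ chosen arbitrarily.

```python
import math

# Hypothetical finite law for X (illustrative values and probabilities).
values = [-1.0, 0.5, 2.0, 4.0]
probs = [0.1, 0.4, 0.3, 0.2]

def expect(g):
    """E g(X) for the finite law above."""
    return sum(p * g(x) for x, p in zip(values, probs))

mean = expect(lambda x: x)
convex_examples = {
    "abs": abs,
    "square": lambda x: x * x,
    "exp(0.7x)": lambda x: math.exp(0.7 * x),
}
# Jensen gap E c(X) - c(E X); it should be non-negative for convex c.
gaps = {name: expect(c) - c(mean) for name, c in convex_examples.items()}
print(gaps)
```

For $c(x) = x^2$ the gap is exactly the variance of $X$, which gives a quick sanity check on the computation.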
6.7. Monotonicity of $\mathcal{L}^p$ norms

►► For $1 \le p < \infty$, we say that $X \in \mathcal{L}^p = \mathcal{L}^p(\Omega, \mathcal{F}, \mathbf{P})$ if
$$\mathbf{E}(|X|^p) < \infty,$$
and then we define
►► $$\|X\|_p := \{\mathbf{E}(|X|^p)\}^{1/p}.$$
The monotonicity property referred to in the section title is the following:

►(a) if $1 \le p \le r < \infty$ and $Y \in \mathcal{L}^r$, then $Y \in \mathcal{L}^p$ and
$$\|Y\|_p \le \|Y\|_r.$$
► Proof. For $n \in \mathbb{N}$, define
$$X_n(\omega) := \{|Y(\omega)| \wedge n\}^p.$$
Then $X_n$ is bounded so that $X_n$ and $X_n^{r/p}$ are both in $\mathcal{L}^1$. Taking $c(x) = x^{r/p}$ on $(0, \infty)$, we conclude from Jensen's inequality that
$$(\mathbf{E}X_n)^{r/p} \le \mathbf{E}(X_n^{r/p}) = \mathbf{E}[(|Y| \wedge n)^r] \le \mathbf{E}(|Y|^r).$$
Now let $n \uparrow \infty$ and use (MON) to obtain the desired result. □

Note. The proof is marked with ► because it illustrates a simple but effective use of truncation.

Vector-space property of $\mathcal{L}^p$

(b) Since, for $a, b \in \mathbb{R}^+$, we have
$$(a + b)^p \le [2\max(a, b)]^p \le 2^p(a^p + b^p),$$
$\mathcal{L}^p$ is obviously a vector space.
6.8. The Schwarz inequality

►►(a) If $X$ and $Y$ are in $\mathcal{L}^2$, then $XY \in \mathcal{L}^1$ and
$$|\mathbf{E}(XY)| \le \mathbf{E}(|XY|) \le \|X\|_2\|Y\|_2.$$
Remark. You will have seen many versions of this result and of its proof before. We use truncation to make the argument rigorous.

Proof. By considering $|X|$ and $|Y|$ instead of $X$ and $Y$, we can and do restrict attention to the case when $X \ge 0$, $Y \ge 0$.

Write $X_n := X \wedge n$, $Y_n := Y \wedge n$, so that $X_n$ and $Y_n$ are bounded. For any $a, b \in \mathbb{R}$,
$$0 \le \mathbf{E}[(aX_n + bY_n)^2] = a^2\mathbf{E}(X_n^2) + 2ab\,\mathbf{E}(X_nY_n) + b^2\mathbf{E}(Y_n^2),$$
and since the quadratic in $a/b$ (or $b/a$, or ...) does not have two distinct real roots,
$$\{2\mathbf{E}(X_nY_n)\}^2 \le 4\mathbf{E}(X_n^2)\mathbf{E}(Y_n^2) \le 4\mathbf{E}(X^2)\mathbf{E}(Y^2).$$
Now let $n \uparrow \infty$ using (MON). □

The following is an immediate consequence of (a):

(b) if $X$ and $Y$ are in $\mathcal{L}^2$, then so is $X + Y$, and we have the triangle law:
$$\|X + Y\|_2 \le \|X\|_2 + \|Y\|_2.$$
Remark. The Schwarz inequality is true for any measure space - see Section 6.13, which gives the extensions of (a) and (b) to $\mathcal{L}^p$.
6.9. $\mathcal{L}^2$: Pythagoras, covariance, etc.

In this section, we take a brief look at the geometry of $\mathcal{L}^2$ and at its connections with probabilistic concepts such as variance, covariance, correlation, etc.

Covariance and variance

If $X, Y \in \mathcal{L}^2$, then by the monotonicity of norms, $X, Y \in \mathcal{L}^1$, so that we may define
$$\mu_X := \mathbf{E}(X), \quad \mu_Y := \mathbf{E}(Y).$$
Since the constant functions with values $\mu_X, \mu_Y$ are in $\mathcal{L}^2$, we see that
(a) $\tilde{X} := X - \mu_X, \quad \tilde{Y} := Y - \mu_Y$
are in $\mathcal{L}^2$. By the Schwarz inequality, $\tilde{X}\tilde{Y} \in \mathcal{L}^1$, and so we may define
(b) $\mathrm{Cov}(X, Y) := \mathbf{E}(\tilde{X}\tilde{Y}) = \mathbf{E}[(X - \mu_X)(Y - \mu_Y)]$.
The Schwarz inequality further justifies expanding out the product in the final [ ] bracket to yield the alternative formula:
(c) $\mathrm{Cov}(X, Y) = \mathbf{E}(XY) - \mu_X\mu_Y$.
As you know, the variance of $X$ is defined by
(d) $\mathrm{Var}(X) := \mathbf{E}[(X - \mu_X)^2] = \mathbf{E}(X^2) - \mu_X^2 = \mathrm{Cov}(X, X)$.

Inner product, angle

For $U, V \in \mathcal{L}^2$, we define the inner (or scalar) product
(e) $\langle U, V \rangle := \mathbf{E}(UV)$,
and if $\|U\|_2$ and $\|V\|_2 \ne 0$, we define the cosine of the angle $\theta$ between $U$ and $V$ by
(f) $\cos\theta = \dfrac{\langle U, V \rangle}{\|U\|_2\|V\|_2}$.
This has modulus at most 1 by the Schwarz inequality. This ties in with the probabilistic idea of correlation:
(g) the correlation $\rho$ of $X$ and $Y$ is $\cos\alpha$ where $\alpha$ is the angle between $\tilde{X}$ and $\tilde{Y}$.

Orthogonality, Pythagoras theorem

$\mathcal{L}^2$ has the same geometry as any inner-product space (but see 'Quotienting' below). Thus the 'cosine rule' of elementary geometry holds, and the Pythagoras theorem takes the form
(h) $\|U + V\|_2^2 = \|U\|_2^2 + \|V\|_2^2$ if $\langle U, V \rangle = 0$.
If $\langle U, V \rangle = 0$, we say that $U$ and $V$ are orthogonal or perpendicular, and write $U \perp V$. In probabilistic language, (h) takes the form (with $U, V$ replaced by $\tilde{X}, \tilde{Y}$)
(i) $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ if $\mathrm{Cov}(X, Y) = 0$.
Generally, for $X_1, X_2, \ldots, X_n \in \mathcal{L}^2$,
(j) $\mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \sum_k \mathrm{Var}(X_k) + 2\sum\sum_{i<j} \mathrm{Cov}(X_i, X_j)$.
I have not marked results such as (i) and (j) with ► because I am sure that they are well known to you.

Parallelogram law

Note that by the bilinearity of $\langle \cdot, \cdot \rangle$,
(k) $\|U + V\|_2^2 + \|U - V\|_2^2 = \langle U + V, U + V \rangle + \langle U - V, U - V \rangle = 2\|U\|_2^2 + 2\|V\|_2^2$.
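The covariance formula (c), the variance of a sum, the parallelogram law (k) and the Schwarz inequality of Section 6.8 can all be verified exactly on a small probability space. The following sketch assumes a uniform four-point space with illustrative values for `X` and `Y`; none of these numbers come from the text.

```python
from fractions import Fraction as Fr

# Hypothetical four-point probability space with uniform P.
P = [Fr(1, 4)] * 4
X = [Fr(1), Fr(2), Fr(3), Fr(6)]
Y = [Fr(0), Fr(5), Fr(-1), Fr(2)]

def E(U):
    """Expectation of a random variable given as a list of values."""
    return sum(p * u for p, u in zip(P, U))

def inner(U, V):
    """Inner product <U, V> := E(UV)."""
    return E([u * v for u, v in zip(U, V)])

def cov(U, V):
    """Cov(U, V) := E[(U - mu_U)(V - mu_V)]."""
    mu_u, mu_v = E(U), E(V)
    return E([(u - mu_u) * (v - mu_v) for u, v in zip(U, V)])

def var(U):
    return cov(U, U)

S = [x + y for x, y in zip(X, Y)]   # X + Y
D = [x - y for x, y in zip(X, Y)]   # X - Y
print(cov(X, Y), var(S), inner(S, S) + inner(D, D))
```

Because `Fraction` arithmetic is exact, the identities can be tested with equality rather than a tolerance.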
Quotienting (or lack of it!): $L^2$

Our space $\mathcal{L}^2$ does not quite satisfy the requirements for an inner product space because the best we can say is that (see (5.2,b))
$$\|U\|_2 = 0 \quad\text{if and only if}\quad U = 0 \text{ almost surely.}$$
In functional analysis, we find an elegant solution by defining an equivalence relation
$$U \sim V \quad\text{if and only if}\quad U = V \text{ almost surely}$$
and define $L^2$ as '$\mathcal{L}^2$ quotiented out by this equivalence relation'. Of course, one needs to check that if for $i = 1, 2$, we have $c_i \in \mathbb{R}$ and $U_i, V_i \in \mathcal{L}^2$ with $U_i \sim V_i$, then
$$c_1U_1 + c_2U_2 \sim c_1V_1 + c_2V_2; \quad \langle U_1, U_2 \rangle = \langle V_1, V_2 \rangle;$$
that if $U_n \to U$ in $\mathcal{L}^2$ and $V_n \sim U_n$ and $V \sim U$, then $V_n \to V$ in $\mathcal{L}^2$; etc.

As mentioned in 'A Question of Terminology', we normally do not do this quotienting in probability theory. Although one might safely do so at the moderately elementary level of this book, one could not do so at a more advanced level. For a Brownian motion $\{B_t : t \in \mathbb{R}^+\}$, the crucial property that $t \mapsto B_t(\omega)$ is continuous would be meaningless if one replaced the true function $B_t$ on $\Omega$ by an equivalence class.

6.10. Completeness of $\mathcal{L}^p$ $(1 \le p < \infty)$

Let $p \in [1, \infty)$.

The following result (a) is important in functional analysis, and will be crucial for us in the case when $p = 2$. It is instructive to prove it as an exercise in our probabilistic way of thinking, and we now do so.

(a) If $(X_n)$ is a Cauchy sequence in $\mathcal{L}^p$ in that
$$\sup_{r,s \ge k} \|X_r - X_s\|_p \to 0 \quad (k \to \infty)$$
then there exists $X$ in $\mathcal{L}^p$ such that $X_r \to X$ in $\mathcal{L}^p$:
$$\|X_r - X\|_p \to 0 \quad (r \to \infty).$$
Note. We already know that $\mathcal{L}^p$ is a vector space. Property (a) is important in showing that $\mathcal{L}^p$ can be made into a Banach space $L^p$ by a quotienting technique of the type mentioned at the end of the preceding section.

Proof of (a). We show that $X$ may be chosen to be an almost sure limit of a subsequence $(X_{k_n})$.

Choose a sequence $(k_n : n \in \mathbb{N})$ with $k_n \uparrow \infty$ such that
$$(r, s \ge k_n) \quad \|X_r - X_s\|_p \le 2^{-n}.$$
Then
$$\mathbf{E}(|X_{k_{n+1}} - X_{k_n}|) = \|X_{k_{n+1}} - X_{k_n}\|_1 \le \|X_{k_{n+1}} - X_{k_n}\|_p \le 2^{-n},$$
so that
$$\mathbf{E}\sum |X_{k_{n+1}} - X_{k_n}| < \infty.$$
Hence it is almost surely true that the series $\sum (X_{k_{n+1}} - X_{k_n})$ converges (even absolutely!), so that
$$\lim X_{k_n}(\omega) \text{ exists for almost all } \omega.$$
Define
$$X(\omega) := \limsup X_{k_n}(\omega), \quad \forall \omega.$$
Then $X$ is $\mathcal{F}$-measurable, and $X_{k_n} \to X$, a.s.

Suppose that $n \in \mathbb{N}$ and $r \ge k_n$. Then, for $\mathbb{N} \ni t \ge n$,
$$\mathbf{E}(|X_r - X_{k_t}|^p) = \|X_r - X_{k_t}\|_p^p \le 2^{-np},$$
so that on letting $t \uparrow \infty$ and using Fatou's Lemma, we obtain
$$\mathbf{E}(|X_r - X|^p) \le 2^{-np}.$$
Firstly, $X_r - X \in \mathcal{L}^p$, so that $X \in \mathcal{L}^p$. Secondly, we see that, indeed, $X_r \to X$ in $\mathcal{L}^p$. □
Note. For an easy exercise on $\mathcal{L}^p$ convergence, see EA13.2.

6.11. Orthogonal projection

The result on completeness of $\mathcal{L}^p$ obtained in the previous section has a number of important consequences for probability theory, and it is perhaps as well to develop one of these while Section 6.10 is fresh in your mind.

I hope that you will allow me to present the following result on orthogonal projection as a piece of geometry for now, deferring discussion of its central role in the theory of conditional expectation until Chapter 9.

We write $\|\cdot\|$ for $\|\cdot\|_2$ throughout this section.

THEOREM

► Let $\mathcal{K}$ be a vector subspace of $\mathcal{L}^2$ which is complete in that whenever $(V_n)$ is a sequence in $\mathcal{K}$ which has the Cauchy property that
$$\sup_{r,s \ge k} \|V_r - V_s\| \to 0 \quad (k \to \infty),$$
then there exists a $V$ in $\mathcal{K}$ such that
$$\|V_n - V\| \to 0 \quad (n \to \infty).$$
Then given $X$ in $\mathcal{L}^2$, there exists $Y$ in $\mathcal{K}$ such that
(i) $\|X - Y\| = \Delta := \inf\{\|X - W\| : W \in \mathcal{K}\}$,
(ii) $X - Y \perp Z$, $\forall Z \in \mathcal{K}$.
Properties (i) and (ii) of $Y$ in $\mathcal{K}$ are equivalent and if $\tilde{Y}$ shares either property (i) or (ii) with $Y$, then
$$\|\tilde{Y} - Y\| = 0 \quad \text{(equivalently, } \tilde{Y} = Y \text{, a.s.)}.$$
Definition. The random variable $Y$ in the theorem is called a version of the orthogonal projection of $X$ onto $\mathcal{K}$. If $\tilde{Y}$ is another version, then $\tilde{Y} = Y$, a.s.

Proof. Choose a sequence $(Y_n)$ in $\mathcal{K}$ such that
$$\|X - Y_n\| \to \Delta.$$
By the parallelogram law (6.9,k),
$$\|X - Y_r\|^2 + \|X - Y_s\|^2 = 2\|X - \tfrac{1}{2}(Y_r + Y_s)\|^2 + 2\|\tfrac{1}{2}(Y_r - Y_s)\|^2.$$
But $\tfrac{1}{2}(Y_r + Y_s) \in \mathcal{K}$, so that $\|X - \tfrac{1}{2}(Y_r + Y_s)\|^2 \ge \Delta^2$. It is now obvious that the sequence $(Y_n)$ has the Cauchy property so that there exists a $Y$ in $\mathcal{K}$ such that
$$\|Y_n - Y\| \to 0.$$
Since (6.8,b) implies that $\|X - Y\| \le \|X - Y_n\| + \|Y_n - Y\|$, it is clear that
$$\|X - Y\| = \Delta.$$
For any $Z$ in $\mathcal{K}$, we have $Y + tZ \in \mathcal{K}$ for $t \in \mathbb{R}$, and so
$$\|X - Y - tZ\|^2 \ge \|X - Y\|^2,$$
whence
$$-2t\langle Z, X - Y \rangle + t^2\|Z\|^2 \ge 0.$$
This can only be the case for all $t$ of small modulus if
$$\langle Z, X - Y \rangle = 0. \quad □$$
Remark. The case to which we shall apply this theorem is when $\mathcal{K}$ has the form $\mathcal{L}^2(\Omega, \mathcal{G}, \mathbf{P})$ for some sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$.
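In the finite case mentioned in the Remark, the projection can be computed explicitly. The sketch below assumes a six-point space with uniform $\mathbf{P}$ and a sub-$\sigma$-algebra generated by a partition into three blocks (all illustrative choices); functions measurable with respect to it are those constant on blocks, and the candidate projection is the block-wise average.

```python
from fractions import Fraction as Fr

# Hypothetical setup: Omega = {0,...,5}, uniform P, and G generated by the
# partition {0,1}, {2,3,4}, {5}. K = L^2(Omega, G, P) then consists of the
# functions that are constant on each block.
P = [Fr(1, 6)] * 6
blocks = [[0, 1], [2, 3, 4], [5]]
X = [Fr(3), Fr(-1), Fr(2), Fr(2), Fr(8), Fr(7)]

def inner(U, V):
    """Inner product <U, V> := E(UV)."""
    return sum(p * u * v for p, u, v in zip(P, U, V))

def project(U):
    """Block-wise P-weighted average: the candidate projection onto K."""
    Y = [Fr(0)] * len(U)
    for block in blocks:
        mass = sum(P[w] for w in block)
        avg = sum(P[w] * U[w] for w in block) / mass
        for w in block:
            Y[w] = avg
    return Y

def sq_dist(U, V):
    diff = [u - v for u, v in zip(U, V)]
    return inner(diff, diff)

Y = project(X)
residual = [x - y for x, y in zip(X, Y)]
# Block indicators span K, so orthogonality to each of them gives (ii).
indicators = [[Fr(1) if w in block else Fr(0) for w in range(6)] for block in blocks]
W = [Fr(0), Fr(0), Fr(5), Fr(5), Fr(5), Fr(1)]   # some other element of K
print(Y, sq_dist(X, Y), sq_dist(X, W))
```

The residual is orthogonal to every block indicator, and the arbitrary competitor `W` is no closer to `X` than `Y` is, in line with properties (ii) and (i).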
6.12. The 'elementary formula' for expectation

Back to earth! Let $X$ be a random variable. To avoid confusion between different $\mathcal{L}$'s, let us here write $\Lambda_X$ on $(\mathbb{R}, \mathcal{B})$ for the law of $X$:
$$\Lambda_X(B) := \mathbf{P}(X \in B).$$
LEMMA

► Suppose that $h$ is a Borel measurable function from $\mathbb{R}$ to $\mathbb{R}$. Then
$$h(X) \in \mathcal{L}^1(\Omega, \mathcal{F}, \mathbf{P}) \quad\text{if and only if}\quad h \in \mathcal{L}^1(\mathbb{R}, \mathcal{B}, \Lambda_X)$$
and then
(a) $$\mathbf{E}h(X) = \Lambda_X(h) = \int_{\mathbb{R}} h(x)\,\Lambda_X(dx).$$
Proof. We simply feed everything into the standard machine. Result (a) is the definition of $\Lambda_X$ if $h = I_B$ $(B \in \mathcal{B})$. Linearity then shows that (a) is true if $h$ is a simple function on $(\mathbb{R}, \mathcal{B})$. (MON) then implies (a) for $h$ a non-negative function, and linearity allows us to complete the argument. □

Probability density function (pdf)

We say that $X$ has a probability density function (pdf) $f_X$ if there exists a Borel function $f_X : \mathbb{R} \to [0, \infty]$ such that
(b) $$\mathbf{P}(X \in B) = \int_B f_X(x)\,dx, \quad B \in \mathcal{B}.$$
Here we have written $dx$ for what should be $\mathrm{Leb}(dx)$. In the language of Section 5.14, result (b) says that $\Lambda_X$ has density $f_X$ relative to Leb:
$$\frac{d\Lambda_X}{d\,\mathrm{Leb}} = f_X.$$
The function $f_X$ is only defined almost everywhere: any function a.e. equal to $f_X$ will also satisfy (b) 'and conversely'.

The above lemma extends to
$$\mathbf{E}(|h(X)|) < \infty \quad\text{if and only if}\quad \int_{\mathbb{R}} |h(x)| f_X(x)\,dx < \infty$$
and then
$$\mathbf{E}h(X) = \int_{\mathbb{R}} h(x) f_X(x)\,dx.$$
6.13. Hölder from Jensen

The truncation technique used to prove the Schwarz inequality in Section 6.8 relied on the fact that $\mathbf{P}(\Omega) < \infty$. However, the Schwarz inequality is true for any measure space, as is the more general Hölder inequality.

We conclude this chapter with a device (often useful) which yields the Hölder inequality for any $(S, \Sigma, \mu)$ from Jensen's inequality for probability triples.

► Let $(S, \Sigma, \mu)$ be a measure space. Suppose that
$$p > 1 \quad\text{and}\quad p^{-1} + q^{-1} = 1.$$
Write $f \in \mathcal{L}^p(S, \Sigma, \mu)$ if $f \in m\Sigma$ and $\mu(|f|^p) < \infty$, and in that case define
$$\|f\|_p := \{\mu(|f|^p)\}^{1/p}.$$
THEOREM

Suppose that $f, g \in \mathcal{L}^p(S, \Sigma, \mu)$, $h \in \mathcal{L}^q(S, \Sigma, \mu)$. Then
►(a) (Hölder's inequality) $fh \in \mathcal{L}^1(S, \Sigma, \mu)$ and
$$|\mu(fh)| \le \mu(|fh|) \le \|f\|_p\|h\|_q;$$
►(b) (Minkowski's inequality)
$$\|f + g\|_p \le \|f\|_p + \|g\|_p.$$
Proof of (a). We can obviously restrict attention to the case when
$$f, h \ge 0 \quad\text{and}\quad \mu(f^p) > 0.$$
With the notation of Section 5.14, define
$$\mathbf{P} := \frac{(f^p\mu)}{\mu(f^p)},$$
so that $\mathbf{P}$ is a probability measure on $(S, \Sigma)$. Define
$$u(s) := \begin{cases} h(s)/f(s)^{p-1} & \text{if } f(s) > 0, \\ 0 & \text{if } f(s) = 0. \end{cases}$$
The fact that $\mathbf{P}(u)^q \le \mathbf{P}(u^q)$ now yields
$$\mu(fh) \le \|f\|_p \|h I_{\{f > 0\}}\|_q \le \|f\|_p\|h\|_q. \quad □$$
Proof of (b). Using Hölder's inequality, we have
$$\mu(|f + g|^p) \le \mu(|f||f + g|^{p-1}) + \mu(|g||f + g|^{p-1}) \le \|f\|_p A + \|g\|_p A,$$
where
$$A = \big\| |f + g|^{p-1} \big\|_q = \{\mu(|f + g|^p)\}^{1/q},$$
and (b) follows on rearranging. (The result is non-trivial only if $f, g \in \mathcal{L}^p$, and in that case, the finiteness of $A$ follows from the vector-space property of $\mathcal{L}^p$.) □
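Since the point of Section 6.13 is that Hölder and Minkowski hold for measures of arbitrary total mass, a quick numerical check on a finite measure space that is not a probability space may be reassuring. This is a sketch only: the point masses, the functions `f`, `g`, `h` and the exponent `p = 3` are illustrative assumptions.

```python
# Hypothetical finite measure space with point masses (total mass is 7, not 1).
mass = [0.5, 2.0, 1.5, 3.0]
f = [1.0, -2.0, 0.5, 3.0]
g = [0.0, 1.0, -4.0, 2.0]
h = [2.0, 0.5, -1.0, 1.0]
p = 3.0
q = p / (p - 1.0)   # conjugate exponent, so that 1/p + 1/q = 1

def mu(values):
    """Integral of a function (list of values) against the point masses."""
    return sum(m * v for m, v in zip(mass, values))

def norm(u, r):
    """The L^r norm {mu(|u|^r)}^(1/r)."""
    return mu([abs(x) ** r for x in u]) ** (1.0 / r)

holder_lhs = mu([abs(a * b) for a, b in zip(f, h)])   # mu(|fh|)
holder_rhs = norm(f, p) * norm(h, q)                  # ||f||_p ||h||_q
mink_lhs = norm([a + b for a, b in zip(f, g)], p)     # ||f + g||_p
mink_rhs = norm(f, p) + norm(g, p)                    # ||f||_p + ||g||_p
print(holder_lhs, holder_rhs, mink_lhs, mink_rhs)
```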
Chapter 7
An Easy Strong Law
7.1. 'Independence means multiply' - again!
THEOREM
Suppose that $X$ and $Y$ are independent RVs, and that $X$ and $Y$ are both in $\mathcal{L}^1$. Then $XY \in \mathcal{L}^1$ and
$$\mathbf{E}(XY) = \mathbf{E}(X)\mathbf{E}(Y).$$
In particular, if $X$ and $Y$ are independent elements of $\mathcal{L}^2$, then
$$\mathrm{Cov}(X, Y) = 0 \quad\text{and}\quad \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).$$
Proof. Writing $X = X^+ - X^-$, etc., allows us to reduce the problem to the case when $X \ge 0$ and $Y \ge 0$. This we do.

But then, if $\alpha^{(r)}$ is our familiar staircase function, then
$$\alpha^{(r)}(X) = \sum a_i I_{A_i}, \quad \alpha^{(r)}(Y) = \sum b_j I_{B_j},$$
where the sums are over finite parameter sets, and where for each $i$ and $j$, $A_i$ (in $\sigma(X)$) is independent of $B_j$ (in $\sigma(Y)$). Hence
$$\mathbf{E}[\alpha^{(r)}(X)\alpha^{(r)}(Y)] = \sum\sum a_i b_j \mathbf{P}(A_i \cap B_j) = \sum\sum a_i b_j \mathbf{P}(A_i)\mathbf{P}(B_j) = \mathbf{E}[\alpha^{(r)}(X)]\,\mathbf{E}[\alpha^{(r)}(Y)].$$
Now let $r \uparrow \infty$ and use (MON). □

Remark. Note especially that if $X$ and $Y$ are independent then $X \in \mathcal{L}^1$ and $Y \in \mathcal{L}^1$ imply that $XY \in \mathcal{L}^1$. This is not necessarily true when $X$ and $Y$ are not independent, and we need the inequalities of Schwarz, Hölder, etc. It is important that independence obviates the need for such inequalities.
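For finitely-valued random variables the 'independence means multiply' rule is just the double-sum computation in the proof. The sketch below builds a joint law as the product of two illustrative marginal laws (the values and probabilities are assumptions, not from the text) and checks the conclusions exactly.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical marginal laws (value -> probability) for independent X and Y.
law_x = {Fr(-1): Fr(1, 3), Fr(2): Fr(2, 3)}
law_y = {Fr(0): Fr(1, 2), Fr(1): Fr(1, 4), Fr(4): Fr(1, 4)}

# Independence: the joint law is the product of the marginals.
joint = {
    (x, y): px * py
    for (x, px), (y, py) in product(law_x.items(), law_y.items())
}

def E(g):
    """Expectation of g(X, Y) under the joint law."""
    return sum(prob * g(x, y) for (x, y), prob in joint.items())

def var(g):
    """Variance of g(X, Y)."""
    return E(lambda x, y: g(x, y) ** 2) - E(g) ** 2

EX = E(lambda x, y: x)
EY = E(lambda x, y: y)
EXY = E(lambda x, y: x * y)
print(EX, EY, EXY)
```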
7.2. Strong Law - first version

The following result covers many cases of importance. You should note that though it imposes a 'finite 4th moment' condition, it makes no assumption about identical distributions for the $(X_n)$ sequence. It is remarkable that so fine a result has so simple a proof.

THEOREM

► Suppose that $X_1, X_2, \ldots$ are independent random variables, and that for some constant $K$ in $[0, \infty)$,
$$\mathbf{E}(X_k) = 0, \quad \mathbf{E}(X_k^4) \le K, \quad \forall k.$$
Let $S_n = X_1 + X_2 + \cdots + X_n$. Then
$$\mathbf{P}(n^{-1}S_n \to 0) = 1,$$
or again, $S_n/n \to 0$ (a.s.).

Proof. We have
$$\mathbf{E}(S_n^4) = \mathbf{E}[(X_1 + X_2 + \cdots + X_n)^4] = \mathbf{E}\Big(\sum_k X_k^4 + 6\sum\sum_{i<j} X_i^2X_j^2\Big),$$
because, for distinct $i, j, k$ and $l$,
$$\mathbf{E}(X_iX_j^3) = \mathbf{E}(X_iX_j^2X_k) = \mathbf{E}(X_iX_jX_kX_l) = 0,$$
using independence plus the fact that $\mathbf{E}(X_i) = 0$. [Note that, for example, the fact that $\mathbf{E}(X_j^4) < \infty$ implies that $\mathbf{E}(|X_j|^3) < \infty$, by the 'monotonicity of $\mathcal{L}^p$ norms' result in Section 6.7. Thus $X_i$ and $X_j^3$ are in $\mathcal{L}^1$.]

We know from Section 6.7 that
$$[\mathbf{E}(X_i^2)]^2 \le \mathbf{E}(X_i^4) \le K, \quad \forall i.$$
Hence, using independence again, for $i \ne j$,
$$\mathbf{E}(X_i^2X_j^2) = \mathbf{E}(X_i^2)\mathbf{E}(X_j^2) \le K.$$
Thus
$$\mathbf{E}(S_n^4) \le nK + 3n(n - 1)K \le 3Kn^2,$$
and (see Section 6.5)
$$\mathbf{E}\sum (S_n/n)^4 \le 3K\sum n^{-2} < \infty,$$
so that $\sum (S_n/n)^4 < \infty$, a.s., and
$$S_n/n \to 0, \quad \text{a.s.} \quad □$$
Corollary. If the condition $\mathbf{E}(X_k) = 0$ in the theorem is replaced by $\mathbf{E}(X_k) = \mu$ for some constant $\mu$, then the theorem holds with $n^{-1}S_n \to \mu$ (a.s.) as its conclusion.

Proof. It is obviously a case of applying the theorem to the sequence $(Y_k)$, where $Y_k := X_k - \mu$. But we need to know that
(a) $\sup_k \mathbf{E}(Y_k^4) < \infty$.
This is obvious from Minkowski's inequality
$$\|X_k - \mu\|_4 \le \|X_k\|_4 + |\mu|$$
(the constant function $\mu$ on $\Omega$ having $\mathcal{L}^4$ norm $|\mu|$). But we can also prove (a) immediately by the elementary inequality (6.7,b). □

The next topics indicate a different use of variance.
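The fourth-moment estimate at the heart of the proof can be checked exactly in a simple special case. The sketch below assumes independent fair $\pm 1$ steps (so $\mathbf{E}(X_k) = 0$ and $K = 1$); this choice is an illustration, not part of the text. For this case every $\mathbf{E}(X_i^2X_j^2)$ equals 1, so $\mathbf{E}(S_n^4)$ should equal $n + 3n(n-1)$ exactly.

```python
from fractions import Fraction as Fr
from itertools import product

def fourth_moment(n):
    """Exact E(S_n^4) for S_n a sum of n independent fair +/-1 steps,
    computed by enumerating all 2^n equally likely sign patterns."""
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=n))
    return Fr(total, 2 ** n)

K = 1   # E(X_k^4) = 1 for +/-1 steps
moments = {n: fourth_moment(n) for n in range(1, 11)}
for n, m in moments.items():
    print(n, m, n * K + 3 * n * (n - 1) * K, 3 * K * n * n)
```

Brute-force enumeration is only feasible for small `n`, which is why the range stops at 10.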
7.3. Chebyshev's inequality

As you know this says that for $c \ge 0$, and $X \in \mathcal{L}^2$,
$$c^2\mathbf{P}(|X - \mu| > c) \le \mathrm{Var}(X), \quad \mu := \mathbf{E}(X);$$
and it is obvious.

Example. Consider a sequence $(X_n)$ of IID RVs with values in $\{0, 1\}$ with
$$p = \mathbf{P}(X_n = 1) = 1 - \mathbf{P}(X_n = 0).$$
Then $\mathbf{E}(X_n) = p$ and $\mathrm{Var}(X_n) = p(1 - p) \le \tfrac{1}{4}$. Thus (using Theorem 7.1) $S_n := X_1 + X_2 + \cdots + X_n$ has expectation $np$ and variance $np(1 - p) \le n/4$, and we have
$$\mathbf{E}(n^{-1}S_n) = p, \quad \mathrm{Var}(n^{-1}S_n) = n^{-2}\mathrm{Var}(S_n) \le 1/(4n).$$
Chebyshev's inequality yields
$$\mathbf{P}(|n^{-1}S_n - p| > \delta) \le 1/(4n\delta^2).$$
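The Chebyshev bound in the Example can be compared with the exact binomial probability. The following sketch assumes $p = 3/10$ and $\delta = 1/10$ (illustrative values); the comparison $|k/n - p| > \delta$ is done with exact fractions so that boundary cases are not misclassified by floating-point rounding, while the probabilities themselves are ordinary floats.

```python
import math
from fractions import Fraction as Fr

def deviation_prob(n, p, delta):
    """P(|S_n/n - p| > delta) for S_n ~ Binomial(n, p).
    p and delta are Fractions so that the event is decided exactly."""
    pf = float(p)
    return sum(
        math.comb(n, k) * pf**k * (1 - pf)**(n - k)
        for k in range(n + 1)
        if abs(Fr(k, n) - p) > delta
    )

p, delta = Fr(3, 10), Fr(1, 10)
rows = [
    (n, deviation_prob(n, p, delta), 1 / (4 * n * float(delta) ** 2))
    for n in (10, 50, 200, 1000)
]
for n, prob, bound in rows:
    print(n, prob, bound)
```

The bound is far from tight (for small `n` it even exceeds 1), but it decays like $1/n$, which is all that the next section needs.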
7.4. Weierstrass approximation theorem

If $f$ is a continuous function on $[0, 1]$ and $\varepsilon > 0$, then there exists a polynomial $B$ such that
$$\sup_{x \in [0,1]} |B(x) - f(x)| \le \varepsilon.$$
Proof. Let $(X_k)$, $S_n$ etc. be as in the Example in Section 7.3. You are well aware that
$$\mathbf{P}[S_n = k] = \binom{n}{k}p^k(1 - p)^{n-k}, \quad 0 \le k \le n.$$
Hence
$$B_n(p) := \mathbf{E}f(n^{-1}S_n) = \sum_{k=0}^{n} f(n^{-1}k)\binom{n}{k}p^k(1 - p)^{n-k},$$
the '$B$' being in deference to Bernstein.

Now $f$ is bounded on $[0, 1]$: $|f(y)| \le K$, $\forall y \in [0, 1]$. Also, $f$ is uniformly continuous on $[0, 1]$: for our given $\varepsilon > 0$, there exists $\delta > 0$ such that
(a) $|x - y| \le \delta$ implies that $|f(x) - f(y)| < \tfrac{1}{2}\varepsilon$.
Now, for $p \in [0, 1]$,
$$|B_n(p) - f(p)| = |\mathbf{E}\{f(n^{-1}S_n) - f(p)\}|.$$
Let us write $Y_n := |f(n^{-1}S_n) - f(p)|$ and $Z_n := |n^{-1}S_n - p|$. Then $Z_n \le \delta$ implies that $Y_n < \tfrac{1}{2}\varepsilon$, and we have
$$|B_n(p) - f(p)| \le \mathbf{E}(Y_n) = \mathbf{E}(Y_n; Z_n \le \delta) + \mathbf{E}(Y_n; Z_n > \delta) \le \tfrac{1}{2}\varepsilon\,\mathbf{P}(Z_n \le \delta) + 2K\mathbf{P}(Z_n > \delta) \le \tfrac{1}{2}\varepsilon + 2K/(4n\delta^2).$$
Earlier, we chose a fixed $\delta$ at (a). We now choose $n$ so that
$$2K/(4n\delta^2) < \tfrac{1}{2}\varepsilon.$$
Then $|B_n(p) - f(p)| < \varepsilon$, for all $p$ in $[0, 1]$. □

Now do Exercise E7.1 on inverting Laplace transforms.
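The Bernstein polynomials of the proof are straightforward to evaluate, and watching the error shrink makes the argument concrete. This sketch assumes the test function $f(x) = |x - 1/2|$ and measures the error on a grid of 201 points only, so `sup_error` is a grid maximum rather than a true supremum; both choices are illustrative.

```python
import math

def bernstein(f, n, p):
    """B_n(p) = E f(S_n / n) with S_n ~ Binomial(n, p)."""
    return sum(
        f(k / n) * math.comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(n + 1)
    )

def sup_error(f, n, grid_size=201):
    """Maximum of |B_n(p) - f(p)| over an evenly spaced grid in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(bernstein(f, n, p) - f(p)) for p in grid)

def f(x):
    # Continuous on [0, 1] but not differentiable at 1/2.
    return abs(x - 0.5)

errors = {n: sup_error(f, n) for n in (10, 50, 200)}
print(errors)
```

For this particular $f$, which is 1-Lipschitz, the error at each $p$ is at most $\mathbf{E}|n^{-1}S_n - p| \le 1/(2\sqrt{n})$ by the Schwarz inequality and the variance bound of Section 7.3, so the slow decay seen in the output is expected.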
Chapter 8

Product Measure

8.0. Introduction and advice

One of this chapter's main lessons of practical importance is that an 'interchange of order of integration' result
$$\int_{S_1}\Big(\int_{S_2} f(s_1, s_2)\,\mu_2(ds_2)\Big)\mu_1(ds_1) = \int_{S_2}\Big(\int_{S_1} f(s_1, s_2)\,\mu_1(ds_1)\Big)\mu_2(ds_2)$$
is always valid (both sides possibly being infinite) if $f \ge 0$; and is valid for 'signed' $f$ (with both repeated integrals finite) provided that one (then the other) of the integrals of absolute values:
$$\int_{S_1}\int_{S_2} |f(s_1, s_2)|\,\mu_1(ds_1)\,\mu_2(ds_2)$$
is finite.

It is a good idea to read through the chapter to get the ideas, but you are strongly recommended to postpone serious study of the contents until a later stage. Except for the matter of infinite products, it is all a case of relentless use of either the standard machine or the Monotone-Class Theorem to prove intuitively obvious things made to look complicated by the notation. When you do begin a serious study, it is important to appreciate when the more subtle Monotone-Class Theorem has to be used instead of the standard machine.
8.1. Product measurable structure, $\Sigma_1 \times \Sigma_2$

Let $(S_1, \Sigma_1)$ and $(S_2, \Sigma_2)$ be measurable spaces. Let $S$ denote the Cartesian product $S := S_1 \times S_2$. For $i = 1, 2$, let $\rho_i$ denote the $i$th coordinate map, so that
$$\rho_1(s_1, s_2) := s_1, \quad \rho_2(s_1, s_2) := s_2.$$
The fundamental definition of $\Sigma = \Sigma_1 \times \Sigma_2$ is as the $\sigma$-algebra
►(a) $\Sigma = \sigma(\rho_1, \rho_2)$.
Thus $\Sigma$ is generated by the sets of the form
$$\rho_1^{-1}(B_1) = B_1 \times S_2 \quad (B_1 \in \Sigma_1)$$
together with sets of the form
$$\rho_2^{-1}(B_2) = S_1 \times B_2 \quad (B_2 \in \Sigma_2).$$
Generally, a product $\sigma$-algebra is generated by Cartesian products in which one factor is allowed to vary over the $\sigma$-algebra corresponding to that factor, and all other factors are whole spaces. In the case of our product of two factors, we have
(b) $(B_1 \times S_2) \cap (S_1 \times B_2) = B_1 \times B_2$
and you can easily check that
(c) $\mathcal{I} = \{B_1 \times B_2 : B_i \in \Sigma_i\}$
is a $\pi$-system generating $\Sigma = \Sigma_1 \times \Sigma_2$. A similar remark would apply for a countable product $\prod \Sigma_n$, but you can see that, since we may only take countable intersections in analogues of (b), products of uncountable families of $\sigma$-algebras cause problems. The fundamental definition analogous to (a) still works.

LEMMA

(d) Let $\mathcal{H}$ denote the class of functions $f : S \to \mathbb{R}$ which are in $b\Sigma$ and which are such that
for each $s_1$ in $S_1$, the map $s_2 \mapsto f(s_1, s_2)$ is $\Sigma_2$-measurable on $S_2$,
for each $s_2$ in $S_2$, the map $s_1 \mapsto f(s_1, s_2)$ is $\Sigma_1$-measurable on $S_1$.
Then $\mathcal{H} = b\Sigma$.

Proof. It is clear that if $A \in \mathcal{I}$, then $I_A \in \mathcal{H}$. Verification that $\mathcal{H}$ satisfies the conditions (i)-(iii) of Monotone-Class Theorem 3.14 is straightforward. Since $\Sigma = \sigma(\mathcal{I})$, the result follows. □
8.2. Product measure, Fubini's Theorem

We continue with the notation of the preceding Section. We suppose that for $i = 1, 2$, $\mu_i$ is a finite measure on $(S_i, \Sigma_i)$. We know from the preceding Section that for $f \in b\Sigma$, we may define the integrals
$$I_1^f(s_1) := \int_{S_2} f(s_1, s_2)\,\mu_2(ds_2), \quad I_2^f(s_2) := \int_{S_1} f(s_1, s_2)\,\mu_1(ds_1).$$
LEMMA

Let $\mathcal{H}$ be the class of elements in $b\Sigma$ such that the following property holds:
$$I_1^f(\cdot) \in b\Sigma_1 \quad\text{and}\quad I_2^f(\cdot) \in b\Sigma_2 \quad\text{and}\quad \int_{S_1} I_1^f(s_1)\,\mu_1(ds_1) = \int_{S_2} I_2^f(s_2)\,\mu_2(ds_2).$$
Then $\mathcal{H} = b\Sigma$.

Proof. If $A \in \mathcal{I}$, then, trivially, $I_A \in \mathcal{H}$. Verification of the conditions of Monotone-Class Theorem 3.14 is straightforward. □

For $F \in \Sigma$ with indicator function $f := I_F$, we now define
$$\mu(F) := \int_{S_1} I_1^f(s_1)\,\mu_1(ds_1) = \int_{S_2} I_2^f(s_2)\,\mu_2(ds_2).$$
Fubini's Theorem

►► The set function $\mu$ is a measure on $(S, \Sigma)$ called the product measure of $\mu_1$ and $\mu_2$ and we write $\mu = \mu_1 \times \mu_2$ and
$$(S, \Sigma, \mu) = (S_1, \Sigma_1, \mu_1) \times (S_2, \Sigma_2, \mu_2).$$
Moreover, $\mu$ is the unique measure on $(S, \Sigma)$ for which
(a) $\mu(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2), \quad A_i \in \Sigma_i$.
If $f \in (m\Sigma)^+$, then with the obvious definitions of $I_1^f, I_2^f$, we have
(b) $$\mu(f) = \int_{S_1} I_1^f(s_1)\,\mu_1(ds_1) = \int_{S_2} I_2^f(s_2)\,\mu_2(ds_2),$$
in $[0, \infty]$. If $f \in m\Sigma$ and $\mu(|f|) < \infty$, then equation (b) is valid (with all terms in $\mathbb{R}$).

Proof. The fact that $\mu$ is a measure is a consequence of linearity and (MON). The fact that $\mu$ is then uniquely specified by (a) is obvious from Uniqueness Lemma 1.6 and the fact that $\sigma(\mathcal{I}) = \Sigma$.

Result (b) is automatic for $f = I_A$, where $A \in \mathcal{I}$. The Monotone-Class Theorem shows that it is therefore valid for $f \in b\Sigma$, and in particular for $f$ in the $SF^+$ space for $(S, \Sigma, \mu)$. (MON) then shows that it is valid for $f \in (m\Sigma)^+$; and linearity shows that (b) is valid if $\mu(|f|) < \infty$.

Extension

► All of Fubini's Theorem will work if the $(S_i, \Sigma_i, \mu_i)$ are $\sigma$-finite measure spaces:
We have a unique measure $\mu$ on $(S, \Sigma)$ satisfying (a), etc., etc. We can prove this by breaking up $\sigma$-finite spaces into countable unions of disjoint finite blocks.
Warning

The $\sigma$-finiteness condition cannot be dropped. The standard example is the following. For $i = 1, 2$, take $S_i = [0, 1]$ and $\Sigma_i = \mathcal{B}[0, 1]$. Let $\mu_1$ be Lebesgue measure and let $\mu_2$ just count the number of elements in a set. Let $F$ be the diagonal $\{(x, y) \in S_1 \times S_2 : x = y\}$. Then (check!) $F \in \Sigma$, but, with $f := I_F$,
$$I_1^f(s_1) \equiv 1, \quad I_2^f(s_2) \equiv 0$$
and result (b) fails, stating that $1 = 0$.

Something to think about

So, our insistence on beginning with bounded functions on products of finite measures was necessary. Perhaps it is worth emphasizing that in our standard machine, things work because we can use indicator functions of any set in our $\sigma$-algebra, whereas when we can only use indicator functions of sets in a $\pi$-system, we have to use the Monotone-Class Theorem. We cannot approximate the set $F$ in the Warning example as
$$F = \uparrow\lim F_n,$$
where each $F_n$ is a finite union of 'rectangles' $A_1 \times A_2$, each $A_i$ being in $\mathcal{B}[0, 1]$.
A simple application

Suppose that $X$ is a non-negative random variable on $(\Omega, \mathcal{F}, \mathbf{P})$. Consider the measure $\mu := \mathbf{P} \times \mathrm{Leb}$ on $(\Omega, \mathcal{F}) \times ([0, \infty), \mathcal{B}[0, \infty))$. Let
$$A := \{(\omega, x) : 0 \le x \le X(\omega)\}, \quad h := I_A.$$
Note that $A$ is the 'region under the graph of $X$'. Then
$$I_1^h(\omega) = X(\omega), \quad I_2^h(x) = \mathbf{P}(X \ge x).$$
Thus
(c) $$\mu(A) = \mathbf{E}(X) = \int_{[0,\infty)} \mathbf{P}(X \ge x)\,dx,$$
$dx$ denoting $\mathrm{Leb}(dx)$ as usual. Thus we have obtained one of the well-known formulae for $\mathbf{E}(X)$ and also interpreted the integral $\mathbf{E}(X)$ as 'area under the graph of $X$'.

Note. It is perhaps worth remarking that the Monotone-Class Theorem, the Fatou Lemma and the reverse Fatou Lemma for functions amount to the corresponding results for sets applied to regions under graphs.
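Formula (c) is easy to verify by hand when $X$ takes finitely many values, because $x \mapsto \mathbf{P}(X \ge x)$ is then a step function. The sketch below assumes an illustrative four-point law for $X$ (not from the text) and computes both sides exactly.

```python
from fractions import Fraction as Fr

# Hypothetical law of a non-negative, finitely-valued X: value -> probability.
law = {Fr(0): Fr(1, 8), Fr(1, 2): Fr(1, 4), Fr(2): Fr(1, 2), Fr(5): Fr(1, 8)}

expectation = sum(x * p for x, p in law.items())

def tail(x):
    """P(X >= x)."""
    return sum(p for v, p in law.items() if v >= x)

# x -> P(X >= x) is constant on each interval (a, b] between consecutive
# atoms a < b and vanishes beyond the largest atom, so its Lebesgue
# integral over [0, oo) reduces to a finite sum of rectangle areas.
atoms = sorted(set(law) | {Fr(0)})
area = sum((b - a) * tail(b) for a, b in zip(atoms, atoms[1:]))
print(expectation, area)
```

The rectangles in the sum are horizontal slices of the 'region under the graph of $X$', which is the picture behind the Fubini argument.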
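8.3. Joint laws, joint pdfs

Let $X$ and $Y$ be two random variables. The (joint) law $\mathcal{L}_{X,Y}$ of the pair $(X, Y)$ is the map
$$\mathcal{L}_{X,Y} : \mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R}) \to [0, 1]$$
defined by
$$\mathcal{L}_{X,Y}(\Gamma) := \mathbf{P}[(X, Y) \in \Gamma].$$
The system $\{(-\infty, x] \times (-\infty, y] : x, y \in \mathbb{R}\}$ is a $\pi$-system which generates $\mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R})$. Hence $\mathcal{L}_{X,Y}$ is completely determined by the joint distribution function $F_{X,Y}$ of $X$ and $Y$ which is defined via
$$F_{X,Y}(x, y) := \mathbf{P}(X \le x; Y \le y).$$
We now know how to construct Lebesgue measure
$$\mu = \mathrm{Leb} \times \mathrm{Leb} \text{ on } (\mathbb{R}, \mathcal{B}(\mathbb{R}))^2.$$
We say that $X$ and $Y$ have joint probability density function (joint pdf) $f_{X,Y}$ on $\mathbb{R}^2$ if for $\Gamma \in \mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R})$,
$$\mathbf{P}[(X, Y) \in \Gamma] = \int_\Gamma f_{X,Y}(z)\,\mu(dz) = \int_{\mathbb{R}}\int_{\mathbb{R}} I_\Gamma(x, y) f_{X,Y}(x, y)\,dx\,dy$$
etc., etc. (Fubini's Theorem being used in the last step(s)). Fubini's Theorem further shows that
$$f_X(x) := \int_{\mathbb{R}} f_{X,Y}(x, y)\,dy$$
acts as a pdf for $X$ (Section 6.12), etc., etc. You don't need me to tell you any more of this sort of thing.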
8.4. Independence and product measure

Let $X$ and $Y$ be two random variables with laws $\mathcal{L}_X$, $\mathcal{L}_Y$ respectively and distribution functions $F_X$, $F_Y$ respectively. Then the following three statements are equivalent:
(i) $X$ and $Y$ are independent;
(ii) $\mathcal{L}_{X,Y} = \mathcal{L}_X \times \mathcal{L}_Y$;
(iii) $F_{X,Y}(x, y) = F_X(x)F_Y(y)$;
moreover, if $(X, Y)$ has 'joint' pdf $f_{X,Y}$ then each of (i)-(iii) is equivalent to
(iv) $f_{X,Y}(x, y) = f_X(x)f_Y(y)$ for $\mathrm{Leb} \times \mathrm{Leb}$ almost every $(x, y)$.
You do not wish to know more about this either.

8.5. $\mathcal{B}(\mathbf{R})^n = \mathcal{B}(\mathbf{R}^n)$

Here again, things are nice and tidy provided we work with finite or countable products, but require different concepts (such as Baire $\sigma$-algebras) if we work with uncountable products.

$\mathcal{B}(\mathbf{R}^n)$ is constructed from the topological space $\mathbf{R}^n$. Now, if $\rho_i : \mathbf{R}^n \to \mathbf{R}$ is the $i$th coordinate map:
$$\rho_i(x_1, x_2, \ldots, x_n) := x_i,$$
then $\rho_i$ is continuous, and hence $\mathcal{B}(\mathbf{R}^n)$-measurable. Hence
$$\mathcal{B}^n := \mathcal{B}(\mathbf{R})^n = \sigma(\rho_i : 1 \le i \le n) \subseteq \mathcal{B}(\mathbf{R}^n).$$
On the other hand, $\mathcal{B}(\mathbf{R}^n)$ is generated by the open subsets of $\mathbf{R}^n$, and every such open subset is a countable union of open 'hypercubes' of the form
$$\prod_{1 \le k \le n} (a_k, b_k)$$
and such products are in $\mathcal{B}(\mathbf{R})^n$. Hence, $\mathcal{B}(\mathbf{R}^n) = \mathcal{B}(\mathbf{R})^n$.

In probability theory it is almost always product structures $\mathcal{B}^n$ which feature rather than $\mathcal{B}(\mathbf{R}^n)$. See Section 8.8.
8.6. The n-fold extension

So far in this chapter, we have studied the product measure space of two measure spaces and how this relates to the study of two random variables. You are more than able to generalize 'from two to n' from your experience of similar things in other branches of mathematics. You should give some thought to the associativity of the 'product' in product measure space, something again familiar in analogous contexts.

8.7. Infinite products of probability triples

This topic is not a trivial extension of previous results. We concentrate on a restricted context (though an important one) because it allows us to get the main idea in a clear fashion; and extension to infinite products of arbitrary probability triples is then a purely routine exercise.

Canonical model for a sequence of independent RVs

Let $(\Lambda_n : n \in \mathbf{N})$ be a sequence of probability measures on $(\mathbf{R}, \mathcal{B})$. We already know from the coin-tossing trickery of Section 4.6 that we can construct a sequence $(X_n)$ of independent RVs, $X_n$ having law $\Lambda_n$. Here is a more elegant and systematic way of doing this.

THEOREM
Let $(\Lambda_n : n \in \mathbf{N})$ be a sequence of probability measures on $(\mathbf{R}, \mathcal{B})$. Define
$$\Omega = \prod_{n \in \mathbf{N}} \mathbf{R}$$
so that a typical element $\omega$ of $\Omega$ is a sequence $(\omega_n)$ in $\mathbf{R}$. Define
$$X_n : \Omega \to \mathbf{R}, \qquad X_n(\omega) := \omega_n,$$
and let $\mathcal{F} := \sigma(X_n : n \in \mathbf{N})$. Then there exists a unique probability measure $\mathbf{P}$ on $(\Omega, \mathcal{F})$ such that for $r \in \mathbf{N}$ and $B_1, B_2, \ldots, B_r \in \mathcal{B}$,
(a) $$\mathbf{P}\Big(\Big(\prod_{1 \le k \le r} B_k\Big) \times \prod_{k > r} \mathbf{R}\Big) = \prod_{1 \le k \le r} \Lambda_k(B_k).$$
We write
$$(\Omega, \mathcal{F}, \mathbf{P}) = \prod_{n \in \mathbf{N}} (\mathbf{R}, \mathcal{B}, \Lambda_n).$$
Then the sequence $(X_n : n \in \mathbf{N})$ is a sequence of independent RVs on $(\Omega, \mathcal{F}, \mathbf{P})$, $X_n$ having law $\Lambda_n$.

Remarks. (i) The uniqueness of $\mathbf{P}$ follows in the usual way from Lemma 1.6, because product sets of the form which appear on the left-hand side of (a) form a $\pi$-system generating $\mathcal{F}$.
(ii) We could rewrite (a) more neatly as
$$\mathbf{P}\Big(\prod_{k \in \mathbf{N}} B_k\Big) = \prod_{k \in \mathbf{N}} \Lambda_k(B_k).$$
To see this, use the monotone-convergence property (1.10,b) of measures.

Proof of the theorem is deferred to the Appendix to Chapter 9.
8.8. Technical note on the existence of joint laws

Let $(\Omega, \mathcal{F})$, $(S_1, \Sigma_1)$ and $(S_2, \Sigma_2)$ be measurable spaces. For $i = 1, 2$, let $X_i : \Omega \to S_i$ be such that $X_i^{-1} : \Sigma_i \to \mathcal{F}$. Define
$$S := S_1 \times S_2, \qquad \Sigma := \Sigma_1 \times \Sigma_2, \qquad X(\omega) := (X_1(\omega), X_2(\omega)) \in S.$$
Then (Exercise) $X^{-1} : \Sigma \to \mathcal{F}$, so that $X$ is an $(S, \Sigma)$-valued random variable, and if $\mathbf{P}$ is a probability measure on $\Omega$, we can talk about the law $\mu$ of $X$ (equals the joint law of $X_1$ and $X_2$) on $(S, \Sigma)$: $\mu = \mathbf{P} \circ X^{-1}$ on $\Sigma$.

Suppose now that $S_1$ and $S_2$ are metrizable spaces and that $\Sigma_i = \mathcal{B}(S_i)$ $(i = 1, 2)$. Then $S$ is a metrizable space under the product topology. If $S_1$ and $S_2$ are separable, then $\Sigma = \mathcal{B}(S)$, and there is no 'conflict'. However, if $S_1$ and $S_2$ are not separable, then $\mathcal{B}(S)$ may be strictly larger than $\Sigma$, $X$ need not be an $(S, \mathcal{B}(S))$-valued random variable, and the joint law of $X_1$ and $X_2$ need not exist on $(S, \mathcal{B}(S))$.

It is perhaps as well to be warned of such things. Note that the separability of $\mathbf{R}$ was used in proving that $\mathcal{B}(\mathbf{R}^n) \subseteq \mathcal{B}^n$ in Section 8.5.
PART B: MARTINGALE THEORY

Chapter 9

Conditional Expectation

9.1. A motivating example

Suppose that $(\Omega, \mathcal{F}, \mathbf{P})$ is a probability triple and that $X$ and $Z$ are random variables, $X$ taking the distinct values $x_1, x_2, \ldots, x_m$, $Z$ taking the distinct values $z_1, z_2, \ldots, z_n$.

Elementary conditional probability:
$$\mathbf{P}(X = x_i \mid Z = z_j) := \mathbf{P}(X = x_i;\ Z = z_j)/\mathbf{P}(Z = z_j)$$
and elementary conditional expectation:
$$\mathbf{E}(X \mid Z = z_j) = \sum_i x_i \mathbf{P}(X = x_i \mid Z = z_j)$$
are familiar to you. The random variable $Y = \mathbf{E}(X|Z)$, the conditional expectation of $X$ given $Z$, is defined as follows:
(a) if $Z(\omega) = z_j$, then $Y(\omega) := \mathbf{E}(X \mid Z = z_j) =: y_j$ (say).

It proves to be very advantageous to look at this idea in a new way. 'Reporting to us the value of $Z(\omega)$' amounts to partitioning $\Omega$ into 'Z-atoms' on which $Z$ is constant:

[Figure: $\Omega$ partitioned into the Z-atoms $\{Z = z_1\}$, $\{Z = z_2\}$, ..., $\{Z = z_n\}$.]

The $\sigma$-algebra $\mathcal{G} = \sigma(Z)$ generated by $Z$ consists of sets $\{Z \in B\}$, $B \in \mathcal{B}$, and therefore consists precisely of the $2^n$ possible unions of the $n$ Z-atoms. It is clear from (a) that $Y$ is constant on Z-atoms, or, to put it better,
(b) $Y$ is $\mathcal{G}$-measurable.
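Next, since $Y$ takes the constant value $y_j$ on the Z-atom $\{Z = z_j\}$, we have:
$$\int_{\{Z = z_j\}} Y\,d\mathbf{P} = y_j \mathbf{P}(Z = z_j) = \sum_i x_i \mathbf{P}(X = x_i \mid Z = z_j)\mathbf{P}(Z = z_j) = \sum_i x_i \mathbf{P}(X = x_i;\ Z = z_j) = \int_{\{Z = z_j\}} X\,d\mathbf{P}.$$
If we write $G_j = \{Z = z_j\}$, this says $\mathbf{E}(Y I_{G_j}) = \mathbf{E}(X I_{G_j})$. Since for every $G$ in $\mathcal{G}$, $I_G$ is a sum of $I_{G_j}$'s, we have $\mathbf{E}(Y I_G) = \mathbf{E}(X I_G)$, or
(c) $$\int_G Y\,d\mathbf{P} = \int_G X\,d\mathbf{P}, \qquad \forall G \in \mathcal{G}.$$
Results (b) and (c) suggest the central definition of modern probability.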
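Properties (b) and (c) can be verified mechanically on a small finite example. The following sketch assumes a made-up six-point sample space with equally likely points; the values of `X` and `Z`, and the helper names, are illustrative only. It builds $Y$ from rule (a) and lists all $2^n$ unions of Z-atoms.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite sample space: six equally likely points.
omega = range(6)
P = {w: Fraction(1, 6) for w in omega}
X = {0: 1, 1: 4, 2: 2, 3: 2, 4: 6, 5: 0}
Z = {0: 'a', 1: 'a', 2: 'b', 3: 'b', 4: 'b', 5: 'c'}

def integral(f, G):
    """Integral of f over the event G, that is, E(f; G)."""
    return sum(f[w] * P[w] for w in G)

def cond_exp(X, Z):
    """Y = E(X|Z) via rule (a): on the atom {Z = z_j}, Y equals y_j."""
    Y = {}
    for z in set(Z.values()):
        atom = [w for w in omega if Z[w] == z]
        y = integral(X, atom) / sum(P[w] for w in atom)
        for w in atom:
            Y[w] = y
    return Y

Y = cond_exp(X, Z)

# sigma(Z) consists of all 2^n unions of the n Z-atoms.
atoms = [[w for w in omega if Z[w] == z] for z in sorted(set(Z.values()))]
sigma_Z = [sum(c, []) for r in range(len(atoms) + 1)
           for c in combinations(atoms, r)]

# Property (c): the integrals of Y and X agree on every G in sigma(Z).
agree = all(integral(Y, G) == integral(X, G) for G in sigma_Z)
```

By construction `Y` is constant on each atom, which is property (b); `agree` records property (c).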
►►►9.2. Fundamental Theorem and Definition (Kolmogorov, 1933)

Let $(\Omega, \mathcal{F}, \mathbf{P})$ be a triple, and $X$ a random variable with $\mathbf{E}(|X|) < \infty$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. Then there exists a random variable $Y$ such that
(a) $Y$ is $\mathcal{G}$ measurable,
(b) $\mathbf{E}(|Y|) < \infty$,
(c) for every set $G$ in $\mathcal{G}$ (equivalently, for every set $G$ in some $\pi$-system which contains $\Omega$ and generates $\mathcal{G}$), we have
$$\int_G Y\,d\mathbf{P} = \int_G X\,d\mathbf{P}, \qquad \forall G \in \mathcal{G}.$$
Moreover, if $\tilde{Y}$ is another RV with these properties then $\tilde{Y} = Y$, a.s., that is, $\mathbf{P}[\tilde{Y} = Y] = 1$. A random variable $Y$ with properties (a)-(c) is called a version of the conditional expectation $\mathbf{E}(X|\mathcal{G})$ of $X$ given $\mathcal{G}$, and we write $Y = \mathbf{E}(X|\mathcal{G})$, a.s.

Two versions agree a.s., and when one has become familiar with the concept, one identifies different versions and speaks of the conditional expectation $\mathbf{E}(X|\mathcal{G})$. But you should think about the 'a.s.' throughout this course.

The theorem is proved in Section 9.5, except for the $\pi$-system assertion which you will find at Exercise E9.1.

►Notation. We often write $\mathbf{E}(X|Z)$ for $\mathbf{E}(X|\sigma(Z))$, $\mathbf{E}(X|Z_1, Z_2, \ldots)$ for $\mathbf{E}(X|\sigma(Z_1, Z_2, \ldots))$, etc. That this is consistent with the elementary usage is apparent from Section 9.6 below.
9.3. The intuitive meaning

An experiment has been performed. The only information available to you regarding which sample point $\omega$ has been chosen is the set of values $Z(\omega)$ for every $\mathcal{G}$-measurable random variable $Z$. Then $Y(\omega) = \mathbf{E}(X|\mathcal{G})(\omega)$ is the expected value of $X(\omega)$ given this information. The 'a.s.' ambiguity in the definition is something one has to live with in general, but it is sometimes possible to choose a canonical version of $\mathbf{E}(X|\mathcal{G})$.

Note that if $\mathcal{G}$ is the trivial $\sigma$-algebra $\{\emptyset, \Omega\}$ (which contains no information), then $\mathbf{E}(X|\mathcal{G})(\omega) = \mathbf{E}(X)$ for all $\omega$.

9.4. Conditional expectation as least-squares-best predictor

►►If $\mathbf{E}(X^2) < \infty$, then the conditional expectation $Y = \mathbf{E}(X|\mathcal{G})$ is a version of the orthogonal projection (see Section 6.11) of $X$ onto $\mathcal{L}^2(\Omega, \mathcal{G}, \mathbf{P})$. Hence, $Y$ is the least-squares-best $\mathcal{G}$-measurable predictor of $X$: amongst all $\mathcal{G}$-measurable functions (i.e. amongst all predictors which can be computed from the available information), $Y$ minimizes
$$\mathbf{E}[(Y - X)^2].$$
No surprise then that conditional expectation (and the martingale theory which develops it) is crucial in filtering and control, of space-ships, of industrial processes, or whatever.
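The least-squares property is easy to test on a toy example. The sketch below is a hedged illustration, not part of the text: it assumes a four-point space with equal probabilities (so that the conditional expectation on each Z-atom is a plain average), and it compares the mean-square error of $\mathbf{E}(X|Z)$ with that of every rival predictor $g(Z)$ taking values on a small, arbitrarily chosen grid.

```python
from fractions import Fraction
from itertools import product

# Hypothetical four-point space with equally likely points.
P = [Fraction(1, 4)] * 4
X = [0, 2, 1, 5]
Z = [0, 0, 1, 1]          # Z-atoms are {0, 1} and {2, 3}

def mse(g):
    """E[(X - g(Z))^2] for a predictor g given as a dict z -> value."""
    return sum(p * (x - g[z]) ** 2 for p, x, z in zip(P, X, Z))

# Conditional expectation: average of X over each Z-atom
# (a plain average is correct only because the points are equally likely).
best = {z: Fraction(sum(x for x, zz in zip(X, Z) if zz == z),
                    sum(1 for zz in Z if zz == z))
        for z in set(Z)}

# Rival sigma(Z)-measurable predictors on an illustrative grid of values.
grid = [Fraction(k, 2) for k in range(0, 13)]
rivals = [dict(zip((0, 1), vals)) for vals in product(grid, repeat=2)]
never_beaten = all(mse(best) <= mse(g) for g in rivals)
```

A finite grid cannot prove minimality over all predictors, of course; the projection argument of Section 6.11 does that. The check only shows the claim in action.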
9.5. Proof of Theorem 9.2

The standard way to prove Theorem 9.2 (see Section 14.14) is via the Radon-Nikodym theorem, described in Section 5.14. However, Section 9.4 suggests a much simpler approach, and this is what we now develop. We can then prove the general Radon-Nikodym theorem by martingale theory. See Section 14.13.

First we prove the almost sure uniqueness of a version of $\mathbf{E}(X|\mathcal{G})$. Then we prove the existence of $\mathbf{E}(X|\mathcal{G})$ when $X \in \mathcal{L}^2$; and finally, we prove the existence in general.

Suppose
Y,Y
that eC\\Q,g,P),
E C^
and
and
that
and
Y are
versions of
E{X\\Q). Then
E(r-f;G)
= o,
WGeg.
86
Chapter
9:
Conditional
Expectation
We
(9.5)..
may
Suppose that Y and Y the labeUing is such that
are not almost surely

P{Y
equal.
assume
that
> Y)
> 0.
Since
{Y>Y + n-'}uy>y}.
we
see
that
Y is in ^, because
P{Y
\342\200\224 V\" >
and
n~^) Y are
y
for some n. But and ^-measurable;

> 0
>
the
set
{Y -Y
> n'^}
E(y a contradiction.
y; r -
n~^)
> n-\"^P(y
- y > n-^)
>
o,
Hence Y =
E{X\\g) C^
Y, a.s.
e C^ Let
Section
Existenceof
let fC :=
for :=
Suppose that X e CîQ) :=

we
\302\243^(fi,^, P).
^ be
6.10
^,
(a)
know
orthogonal
that /C is complete for the projection we know that there E[(X
CîQ^Q^P). By
C^ norm.
exists
Wf]
a sub-cr-algebra of J^, and to g rather than applied

By
Theorem
6.11
on
Y m
: W
/C =
such that C'^{Q)
Yf] =
mi{E[{X= 0,
-C^C^)
\342\202\254 C^Q)},
(b)
Now,
{X -Y,Z)
a G
VZ
in
\302\2432(^).
eQ,
then Z :=
Iq
and
(b) states
that
E(Y;G) =
EiX;G).
Hence
is
a version
of E(X|^),
for
as required.
Existence of $\mathbf{E}(X|\mathcal{G})$ for $X \in \mathcal{L}^1$

By splitting $X$ as $X = X^+ - X^-$, we see that it is enough to deal with the case when $X \in (\mathcal{L}^1)^+$. So assume that $X \in (\mathcal{L}^1)^+$. We can now choose bounded variables $X_n$ with $0 \le X_n \uparrow X$. Since each $X_n$ is in $\mathcal{L}^2$, we can choose a version $Y_n$ of $\mathbf{E}(X_n|\mathcal{G})$. We now need to establish that
(c) it is almost surely true that $0 \le Y_n \uparrow$.
We prove this in a moment. Given that (c) is true, we set
$$Y(\omega) := \limsup Y_n(\omega).$$
Then $Y \in m\mathcal{G}$, and $Y_n \uparrow Y$, a.s. But now (MON) allows us to deduce that
$$\mathbf{E}(Y; G) = \mathbf{E}(X; G) \qquad (G \in \mathcal{G})$$
from the corresponding result for $Y_n$ and $X_n$. □
A positivity result

Property (c) follows once we prove that
(d) if $U$ is a non-negative bounded RV, then $\mathbf{E}(U|\mathcal{G}) \ge 0$, a.s.

Proof of (d). Let $W$ be a version of $\mathbf{E}(U|\mathcal{G})$. If $\mathbf{P}(W < 0) > 0$, then for some $n$, the set $G := \{W < -n^{-1}\}$ in $\mathcal{G}$ has positive probability, so that
$$0 \le \mathbf{E}(U; G) = \mathbf{E}(W; G) < -n^{-1}\mathbf{P}(G) < 0.$$
This contradiction finishes the proof. □
9.6. Agreement with traditional usage

The case of two RVs will suffice to illustrate things. So suppose that $X$ and $Z$ are RVs which have a joint probability density function (pdf) $f_{X,Z}(x,z)$. Then $f_Z(z) = \int_{\mathbf{R}} f_{X,Z}(x,z)\,dx$ acts as a probability density function for $Z$. Define the elementary conditional pdf $f_{X|Z}$ of $X$ given $Z$ via
$$f_{X|Z}(x|z) := \begin{cases} f_{X,Z}(x,z)/f_Z(z) & \text{if } f_Z(z) \ne 0; \\ 0 & \text{otherwise.} \end{cases}$$
Let $h$ be a Borel function on $\mathbf{R}$ such that
$$\mathbf{E}|h(X)| = \int_{\mathbf{R}} |h(x)| f_X(x)\,dx < \infty,$$
where of course $f_X(x) = \int_{\mathbf{R}} f_{X,Z}(x,z)\,dz$ gives a pdf for $X$. Set
$$g(z) := \int_{\mathbf{R}} h(x) f_{X|Z}(x|z)\,dx.$$
Then $Y := g(Z)$ is a version of the conditional expectation of $h(X)$ given $\sigma(Z)$.

Proof. The typical element of $\sigma(Z)$ has the form $\{\omega : Z(\omega) \in B\}$, where $B \in \mathcal{B}$. Hence, we must show that
(a) $$L := \mathbf{E}[h(X) I_B(Z)] = \mathbf{E}[g(Z) I_B(Z)] =: R.$$
But
$$L = \int\!\!\int h(x) I_B(z) f_{X,Z}(x,z)\,dx\,dz, \qquad R = \int g(z) I_B(z) f_Z(z)\,dz,$$
and result (a) follows from Fubini's Theorem. □

Some of the practice is given in Sections 15.6-15.9, which you can look at now.
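►►►9.7. Properties of conditional expectation: a list

These properties are proved in Section 9.8. All $X$'s satisfy $\mathbf{E}(|X|) < \infty$ in this list of properties. Of course, $\mathcal{G}$ and $\mathcal{H}$ denote sub-$\sigma$-algebras of $\mathcal{F}$. (The use of 'c' to denote 'conditional' in (cMON), etc., is obvious.)

(a) If $Y$ is any version of $\mathbf{E}(X|\mathcal{G})$ then $\mathbf{E}(Y) = \mathbf{E}(X)$. (Very useful, this.)
(b) If $X$ is $\mathcal{G}$ measurable, then $\mathbf{E}(X|\mathcal{G}) = X$, a.s.
(c) (Linearity) $\mathbf{E}(a_1 X_1 + a_2 X_2|\mathcal{G}) = a_1\mathbf{E}(X_1|\mathcal{G}) + a_2\mathbf{E}(X_2|\mathcal{G})$, a.s.
Clarification: if $Y_1$ is a version of $\mathbf{E}(X_1|\mathcal{G})$ and $Y_2$ is a version of $\mathbf{E}(X_2|\mathcal{G})$, then $a_1 Y_1 + a_2 Y_2$ is a version of $\mathbf{E}(a_1 X_1 + a_2 X_2|\mathcal{G})$.
(d) (Positivity) If $X \ge 0$, then $\mathbf{E}(X|\mathcal{G}) \ge 0$, a.s.
(e) (cMON) If $0 \le X_n \uparrow X$, then $\mathbf{E}(X_n|\mathcal{G}) \uparrow \mathbf{E}(X|\mathcal{G})$, a.s.
(f) (cFATOU) If $X_n \ge 0$, then $\mathbf{E}[\liminf X_n|\mathcal{G}] \le \liminf \mathbf{E}[X_n|\mathcal{G}]$, a.s.
(g) (cDOM) If $|X_n(\omega)| \le V(\omega)$, $\forall n$, $\mathbf{E}V < \infty$, and $X_n \to X$, a.s., then $\mathbf{E}(X_n|\mathcal{G}) \to \mathbf{E}(X|\mathcal{G})$, a.s.
(h) (cJENSEN) If $c : \mathbf{R} \to \mathbf{R}$ is convex, and $\mathbf{E}|c(X)| < \infty$, then $\mathbf{E}[c(X)|\mathcal{G}] \ge c(\mathbf{E}[X|\mathcal{G}])$, a.s.
Important corollary: $\|\mathbf{E}(X|\mathcal{G})\|_p \le \|X\|_p$ for $p \ge 1$.
(i) (Tower Property) If $\mathcal{H}$ is a sub-$\sigma$-algebra of $\mathcal{G}$, then $\mathbf{E}[\mathbf{E}(X|\mathcal{G})|\mathcal{H}] = \mathbf{E}[X|\mathcal{H}]$, a.s.
Note. We shorthand LHS to $\mathbf{E}[X|\mathcal{G}|\mathcal{H}]$ for tidiness.
(j) ('Taking out what is known') If $Z$ is $\mathcal{G}$-measurable and bounded, then
(*) $\mathbf{E}[ZX|\mathcal{G}] = Z\mathbf{E}[X|\mathcal{G}]$, a.s.
If $p > 1$, $p^{-1} + q^{-1} = 1$, $X \in \mathcal{L}^p(\Omega, \mathcal{F}, \mathbf{P})$ and $Z \in \mathcal{L}^q(\Omega, \mathcal{G}, \mathbf{P})$, then (*) again holds. If $X \in (m\mathcal{F})^+$, $Z \in (m\mathcal{G})^+$, $\mathbf{E}(X) < \infty$ and $\mathbf{E}(ZX) < \infty$, then (*) holds.
(k) (Role of independence) If $\mathcal{H}$ is independent of $\sigma(\sigma(X), \mathcal{G})$, then $\mathbf{E}[X|\sigma(\mathcal{G}, \mathcal{H})] = \mathbf{E}(X|\mathcal{G})$, a.s. In particular, if $X$ is independent of $\mathcal{H}$, then $\mathbf{E}(X|\mathcal{H}) = \mathbf{E}(X)$, a.s.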
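On a finite space, where a $\sigma$-algebra is just a partition into atoms, several of these properties can be checked by direct computation. The sketch below illustrates property (a) and the Tower Property (i); the eight-point space, the values of `X` and the two nested partitions are all invented for the illustration.

```python
from fractions import Fraction

# Hypothetical eight-point space with equally likely points.
P = [Fraction(1, 8)] * 8
X = [3, 1, 4, 1, 5, 9, 2, 6]
G_label = [0, 0, 1, 1, 2, 2, 3, 3]   # atoms of G
H_label = [0, 0, 0, 0, 1, 1, 1, 1]   # atoms of H; each is a union of G-atoms

def cond_exp(f, label):
    """Version of E(f | sigma-algebra generated by the partition `label`)."""
    out = []
    for w in range(len(f)):
        atom = [v for v in range(len(f)) if label[v] == label[w]]
        mass = sum(P[v] for v in atom)
        out.append(sum(f[v] * P[v] for v in atom) / mass)
    return out

def mean(f):
    """E(f)."""
    return sum(p * x for p, x in zip(P, f))

Y = cond_exp(X, G_label)
lhs = cond_exp(Y, H_label)      # E[E(X|G)|H]
rhs = cond_exp(X, H_label)      # E[X|H]
```

Since H is coarser than G, `lhs` and `rhs` coincide pointwise, and `mean(Y)` equals `mean(X)`.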
9.8. Proofs of the properties in Section 9.7

Property (a) follows since $\mathbf{E}(Y; \Omega) = \mathbf{E}(X; \Omega)$, $\Omega$ being an element of $\mathcal{G}$.

Property (b) is immediate from the definition, as is Property (c) now that its Clarification has been given.

Property (d) is not obvious, but the proof of (9.5,d) transfers immediately to our current situation.

Proof of (e). If $0 \le X_n \uparrow X$, then, by (d), if, for each $n$, $Y_n$ is a version of $\mathbf{E}(X_n|\mathcal{G})$, then (a.s.) $0 \le Y_n \uparrow$. Define $Y := \limsup Y_n$. Then $Y \in m\mathcal{G}$, and $Y_n \uparrow Y$, a.s. Now use (MON) to deduce from
$$\mathbf{E}(Y_n; G) = \mathbf{E}(X_n; G), \qquad \forall G \in \mathcal{G},$$
that $\mathbf{E}(Y; G) = \mathbf{E}(X; G)$, $\forall G \in \mathcal{G}$. (Of course we used a very similar argument in Section 9.5.) □
Proof of (f) and (g). You should check that the argument used to obtain (FATOU) from (MON) in Section 5.4 and the argument used to obtain (DOM) from (FATOU) in Section 5.9 both transfer without difficulty to yield the conditional versions. Doing the careful derivation of (cFATOU) from (cMON) and of (cDOM) from (cFATOU) is an essential exercise for you. □
Proof of (h). From (6.6,a), there exists a countable sequence $((a_n, b_n))$ of points in $\mathbf{R}^2$ such that
$$c(x) = \sup_n (a_n x + b_n), \qquad x \in \mathbf{R}.$$
For each fixed $n$ we deduce via (d) from $c(X) \ge a_n X + b_n$ that, almost surely,
(**) $$\mathbf{E}[c(X)|\mathcal{G}] \ge a_n \mathbf{E}[X|\mathcal{G}] + b_n.$$
By the usual appeal to countability, we can say that almost surely (**) holds simultaneously for all $n$, whence, almost surely,
$$\mathbf{E}[c(X)|\mathcal{G}] \ge \sup_n (a_n \mathbf{E}[X|\mathcal{G}] + b_n) = c(\mathbf{E}[X|\mathcal{G}]).$$ □

Proof of corollary to (h). Let $p \ge 1$. Taking $c(x) = |x|^p$, we see that
$$\mathbf{E}(|X|^p \mid \mathcal{G}) \ge |\mathbf{E}(X|\mathcal{G})|^p, \quad \text{a.s.}$$
Now take expectations, using property (a). □

Property (i) is virtually immediate from the definition of conditional expectation.
Proof of (j). Linearity shows that we can assume that $X \ge 0$. Fix a version $Y$ of $\mathbf{E}(X|\mathcal{G})$, and fix $G$ in $\mathcal{G}$. We must prove that if $Z$ is $\mathcal{G}$-measurable and appropriate integrability conditions hold, then
(***) $$\mathbf{E}(ZX; G) = \mathbf{E}(ZY; G).$$
We use the standard machine. If $Z$ is the indicator of a set in $\mathcal{G}$, then (***) is true by definition of the conditional expectation $Y$. Linearity then shows that (***) holds for $Z \in SF^+(\Omega, \mathcal{G}, \mathbf{P})$. Next, (MON) shows that (***) is true for $Z \in (m\mathcal{G})^+$ with the understanding that both sides might be infinite.

All that is necessary to establish that property (j) in the table is correct is to show that under each of the conditions given, $\mathbf{E}(|ZX|) < \infty$. This is obvious if $Z$ is bounded and $X$ is in $\mathcal{L}^1$, and follows from the Hölder inequality if $X \in \mathcal{L}^p$ and $Z \in \mathcal{L}^q$ where $p > 1$ and $p^{-1} + q^{-1} = 1$. □
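Proof of (k). We can assume that $X \ge 0$ (and $\mathbf{E}(X) < \infty$). For $G \in \mathcal{G}$ and $H \in \mathcal{H}$, $X I_G$ and $H$ are independent, so that by Theorem 7.1,
$$\mathbf{E}(X; G \cap H) = \mathbf{E}[(X I_G) I_H] = \mathbf{E}(X I_G)\mathbf{P}(H).$$
Now if $Y = \mathbf{E}(X|\mathcal{G})$ (a version of), then since $Y$ is $\mathcal{G}$-measurable, $Y I_G$ is independent of $\mathcal{H}$ so that
$$\mathbf{E}[(Y I_G) I_H] = \mathbf{E}(Y I_G)\mathbf{P}(H)$$
and we have
$$\mathbf{E}[X; G \cap H] = \mathbf{E}[Y; G \cap H].$$
Thus the measures
$$F \mapsto \mathbf{E}(X; F), \qquad F \mapsto \mathbf{E}(Y; F)$$
on $\sigma(\mathcal{G}, \mathcal{H})$ of the same finite total mass agree on the $\pi$-system of sets of the form $G \cap H$ $(G \in \mathcal{G}, H \in \mathcal{H})$, and hence agree everywhere on $\sigma(\mathcal{G}, \mathcal{H})$. This is exactly what we had to prove. □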
9.9. Regular conditional probabilities and pdfs

For $F \in \mathcal{F}$, we have $\mathbf{P}(F) = \mathbf{E}(I_F)$. For $F \in \mathcal{F}$ and $\mathcal{G}$ a sub-$\sigma$-algebra of $\mathcal{F}$, we define $\mathbf{P}(F|\mathcal{G})$ to be a version of $\mathbf{E}(I_F|\mathcal{G})$.

By linearity and (cMON), we can show that for a fixed sequence $(F_n)$ of disjoint elements of $\mathcal{F}$, we have
(a) $$\mathbf{P}\Big(\bigcup F_n \Big| \mathcal{G}\Big) = \sum \mathbf{P}(F_n|\mathcal{G}), \quad \text{(a.s.)}$$
Except in trivial cases, there are uncountably many sequences of disjoint sets, so we cannot conclude from (a) that there exists a map
$$\mathbf{P}(\cdot, \cdot) : \Omega \times \mathcal{F} \to [0,1]$$
such that
(b1) for $F \in \mathcal{F}$, the function $\omega \mapsto \mathbf{P}(\omega, F)$ is a version of $\mathbf{P}(F|\mathcal{G})$;
(b2) for almost every $\omega$, the map $F \mapsto \mathbf{P}(\omega, F)$ is a probability measure on $\mathcal{F}$.

If such a map exists, it is called a regular conditional probability given $\mathcal{G}$. It is known that regular conditional probabilities exist under most conditions encountered in practice, but they do not always exist. The matter is too technical for a book at this level. See, for example, Parthasarathy (1967).

Important note. The elementary conditional pdf $f_{X|Z}(x|z)$ of Section 9.6 is a proper (technically, regular) conditional pdf for $X$ given $Z$ in that for every $A$ in $\mathcal{B}$,
$$\omega \mapsto \int_A f_{X|Z}(x|Z(\omega))\,dx \quad \text{is a version of } \mathbf{P}(X \in A|Z).$$
Proof. Take $h = I_A$ in Section 9.6. □
9.10. Conditioning under independence assumptions

Suppose that $r \in \mathbf{N}$ and that $X_1, X_2, \ldots, X_r$ are independent RVs, $X_k$ having law $\Lambda_k$. If $h \in b\mathcal{B}^r$ and we define (for $x_1 \in \mathbf{R}$)
(a) $$\gamma^h(x_1) = \mathbf{E}[h(x_1, X_2, X_3, \ldots, X_r)],$$
then
(b) $\gamma^h(X_1)$ is a version of the conditional expectation $\mathbf{E}[h(X_1, X_2, \ldots, X_r)|X_1]$.

Two proofs of (b). We need only show that for $B \in \mathcal{B}$,
(c) $$\mathbf{E}[h(X_1, X_2, \ldots, X_r) I_B(X_1)] = \mathbf{E}[\gamma^h(X_1) I_B(X_1)].$$
We can do this via the Monotone-Class Theorem, the class $\mathcal{H}$ of $h$ satisfying (c) contains the indicator functions of elements in the $\pi$-system of sets of the form
$$B_1 \times B_2 \times \ldots \times B_r \qquad (B_k \in \mathcal{B}),$$
etc., etc. Alternatively, we can appeal to the $r$-fold Fubini Theorem; for (c) says that
$$\int_{x \in \mathbf{R}^r} h(x) I_B(x_1)(\Lambda_1 \times \Lambda_2 \times \ldots \times \Lambda_r)(dx) = \int_{x_1 \in \mathbf{R}} \gamma^h(x_1) I_B(x_1)\Lambda_1(dx_1),$$
where
$$\gamma^h(x_1) = \int_{y \in \mathbf{R}^{r-1}} h(x_1, y)(\Lambda_2 \times \ldots \times \Lambda_r)(dy).$$
9.11. Use of symmetry: an example

Suppose that $X_1, X_2, \ldots$ are IID RVs with the same distribution as $X$, where $\mathbf{E}(|X|) < \infty$. Let $S_n := X_1 + X_2 + \cdots + X_n$, and define
$$\mathcal{G}_n := \sigma(S_n, S_{n+1}, \ldots) = \sigma(S_n, X_{n+1}, X_{n+2}, \ldots).$$
We wish to calculate
$$\mathbf{E}(X_1|\mathcal{G}_n),$$
for very good reasons, as we shall see in Chapter 14. Now $\sigma(X_{n+1}, X_{n+2}, \ldots)$ is independent of $\sigma(X_1, S_n)$ (which is a sub-$\sigma$-algebra of $\sigma(X_1, \ldots, X_n)$). Hence, by (9.7,k),
$$\mathbf{E}(X_1|\mathcal{G}_n) = \mathbf{E}(X_1|S_n).$$
But if $\Lambda$ denotes the law of $X$, then, with $s_n$ denoting $x_1 + x_2 + \cdots + x_n$, we have
$$\mathbf{E}(X_1; S_n \in B) = \int \cdots \int_{s_n \in B} x_1 \Lambda(dx_1)\Lambda(dx_2)\ldots\Lambda(dx_n) = \mathbf{E}(X_2; S_n \in B) = \cdots = \mathbf{E}(X_n; S_n \in B).$$
Hence, almost surely,
$$\mathbf{E}(X_1|S_n) = \cdots = \mathbf{E}(X_n|S_n) = n^{-1}\mathbf{E}(X_1 + \cdots + X_n|S_n) = n^{-1}S_n.$$
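The conclusion $\mathbf{E}(X_1|S_n) = n^{-1}S_n$ holds for any common law, not just symmetric ones, and this can be confirmed by brute force for a small discrete law. In the sketch below the three-point law (`values`, `probs`) and the choice $n = 3$ are arbitrary illustrative assumptions; the function enumerates all outcomes and computes the elementary conditional expectation on each atom $\{S_n = s\}$.

```python
from fractions import Fraction
from itertools import product

def cond_exp_first_given_sum(values, probs, n):
    """Return {s: E(X_1 | S_n = s)} for IID X_k with the given finite law."""
    num, den = {}, {}
    for idx in product(range(len(values)), repeat=n):
        p = Fraction(1)
        for i in idx:
            p *= probs[i]
        s = sum(values[i] for i in idx)
        num[s] = num.get(s, 0) + values[idx[0]] * p   # E(X_1; S_n = s)
        den[s] = den.get(s, 0) + p                    # P(S_n = s)
    return {s: num[s] / den[s] for s in den}

# Hypothetical, deliberately lopsided law on {0, 1, 3}.
values = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 3
table = cond_exp_first_given_sum(values, probs, n)
symmetric = all(table[s] == Fraction(s, n) for s in table)
```

Every entry of `table` equals `s / n`, as the symmetry argument predicts.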
Chapter 10

Martingales

10.1. Filtered spaces

►►As basic datum, we now take a filtered space $(\Omega, \mathcal{F}, \{\mathcal{F}_n\}, \mathbf{P})$. Here, $(\Omega, \mathcal{F}, \mathbf{P})$ is a probability triple as usual, $\{\mathcal{F}_n : n \ge 0\}$ is a filtration, that is, an increasing family of sub-$\sigma$-algebras of $\mathcal{F}$:
$$\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \ldots \subseteq \mathcal{F}.$$
We define
$$\mathcal{F}_\infty := \sigma\Big(\bigcup_n \mathcal{F}_n\Big) \subseteq \mathcal{F}.$$
Intuitive idea. The information about $\omega$ in $\Omega$ available to us at (or, if you prefer, 'just after') time $n$ consists precisely of the values of $Z(\omega)$ for all $\mathcal{F}_n$ measurable functions $Z$. Usually, $\{\mathcal{F}_n\}$ is the natural filtration
$$\mathcal{F}_n = \sigma(W_0, W_1, \ldots, W_n)$$
of some (stochastic) process $W = (W_n : n \in \mathbf{Z}^+)$, and then the information about $\omega$ which we have at time $n$ consists of the values
$$W_0(\omega), W_1(\omega), \ldots, W_n(\omega).$$
10.2. Adapted process

►A process $X = (X_n : n \ge 0)$ is called adapted (to the filtration $\{\mathcal{F}_n\}$) if for each $n$, $X_n$ is $\mathcal{F}_n$-measurable.

Intuitive idea. If $X$ is adapted, the value $X_n(\omega)$ is known to us at time $n$. Usually, $\mathcal{F}_n = \sigma(W_0, W_1, \ldots, W_n)$ and $X_n = f_n(W_0, W_1, \ldots, W_n)$ for some $\mathcal{B}^{n+1}$-measurable function $f_n$ on $\mathbf{R}^{n+1}$.
10.3. Martingale, supermartingale, submartingale

►►►A process $X$ is called a martingale (relative to $(\{\mathcal{F}_n\}, \mathbf{P})$) if
(i) $X$ is adapted,
(ii) $\mathbf{E}(|X_n|) < \infty$, $\forall n$,
(iii) $\mathbf{E}[X_n|\mathcal{F}_{n-1}] = X_{n-1}$, a.s. $(n \ge 1)$.

A supermartingale (relative to $(\{\mathcal{F}_n\}, \mathbf{P})$) is defined similarly, except that (iii) is replaced by
$$\mathbf{E}[X_n|\mathcal{F}_{n-1}] \le X_{n-1}, \quad \text{a.s.} \quad (n \ge 1),$$
and a submartingale is defined with (iii) replaced by
$$\mathbf{E}[X_n|\mathcal{F}_{n-1}] \ge X_{n-1}, \quad \text{a.s.} \quad (n \ge 1).$$
A supermartingale 'decreases on average'; a submartingale 'increases on average'! [Supermartingale corresponds to superharmonic: a function $f$ on $\mathbf{R}^n$ is superharmonic if and only if for a Brownian motion $B$ on $\mathbf{R}^n$, $f(B)$ is a local supermartingale relative to the natural filtration of $B$. Compare Section 10.13.]

Note that $X$ is a supermartingale if and only if $-X$ is a submartingale, and that $X$ is a martingale if and only if it is both a supermartingale and a submartingale. It is important to note that a process $X$ for which $X_0 \in \mathcal{L}^1(\Omega, \mathcal{F}_0, \mathbf{P})$ is a martingale [respectively, supermartingale, submartingale] if and only if the process $X - X_0 = (X_n - X_0 : n \in \mathbf{Z}^+)$ has the same property. So we can focus attention on processes which are null at 0.

►If $X$ is for example a supermartingale, then the Tower Property of CEs, (9.7,i), shows that for $m < n$,
$$\mathbf{E}[X_n|\mathcal{F}_m] = \mathbf{E}[X_n|\mathcal{F}_{n-1}|\mathcal{F}_m] \le \mathbf{E}[X_{n-1}|\mathcal{F}_m] \le \cdots \le X_m, \quad \text{a.s.}$$
10.4. Some examples of martingales

As we shall see, it is very helpful to view all martingales, supermartingales and submartingales in terms of gambling. But, of course, the enormous importance of martingale theory derives from the fact that martingales crop up in very many contexts. For example, diffusion theory, which used to be studied via methods from Markov-process theory, from the theory of partial differential equations, etc., has been revolutionized by the martingale approach.

Let us now look at some simple first examples, and mention an interesting question (solved later) pertaining to each.
(a) Sums of independent zero-mean RVs. Let $X_1, X_2, \ldots$ be a sequence of independent RVs with $\mathbf{E}(|X_k|) < \infty$, $\forall k$, and
$$\mathbf{E}(X_k) = 0, \quad \forall k.$$
Define ($S_0 := 0$ and)
$$S_n := X_1 + X_2 + \cdots + X_n, \qquad \mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n), \qquad \mathcal{F}_0 := \{\emptyset, \Omega\}.$$
Then for $n \ge 1$, we have (a.s.)
$$\mathbf{E}(S_n|\mathcal{F}_{n-1}) = \mathbf{E}(S_{n-1}|\mathcal{F}_{n-1}) + \mathbf{E}(X_n|\mathcal{F}_{n-1}) = S_{n-1} + \mathbf{E}(X_n) = S_{n-1}.$$
The first (a.s.) equality is obvious from the linearity property (9.7,c). Since $S_{n-1}$ is $\mathcal{F}_{n-1}$-measurable, we have $\mathbf{E}(S_{n-1}|\mathcal{F}_{n-1}) = S_{n-1}$ (a.s.) by (9.7,b); and since $X_n$ is independent of $\mathcal{F}_{n-1}$, we have $\mathbf{E}(X_n|\mathcal{F}_{n-1}) = \mathbf{E}(X_n)$ (a.s.) by (9.7,k). That must explain our notation!

Interesting question: when does $\lim S_n$ exist (a.s.)? See Section 12.5.

(b) Products of non-negative independent RVs of mean 1. Let $X_1, X_2, \ldots$ be a sequence of independent non-negative random variables with
$$\mathbf{E}(X_k) = 1, \quad \forall k.$$
Define ($M_0 := 1$, $\mathcal{F}_0 := \{\emptyset, \Omega\}$ and)
$$M_n := X_1 X_2 \ldots X_n, \qquad \mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n).$$
Then, for $n \ge 1$, we have (a.s.)
$$\mathbf{E}(M_n|\mathcal{F}_{n-1}) = \mathbf{E}(M_{n-1}X_n|\mathcal{F}_{n-1}) = M_{n-1}\mathbf{E}(X_n|\mathcal{F}_{n-1}) = M_{n-1}\mathbf{E}(X_n) = M_{n-1},$$
so that $M$ is a martingale.
It should be remarked that such martingales are not at all artificial.

Interesting question. Because $M$ is a non-negative martingale, $M_\infty = \lim M_n$ exists (a.s.); this is part of the Martingale Convergence Theorem of the next chapter. When can we say that $\mathbf{E}(M_\infty) = 1$? See Sections 14.12 and 14.17.

(c) Accumulating data about a random variable. Let $\{\mathcal{F}_n\}$ be our filtration, and let $\xi \in \mathcal{L}^1(\Omega, \mathcal{F}, \mathbf{P})$. Define $M_n := \mathbf{E}(\xi|\mathcal{F}_n)$ ('some version of'). By the Tower Property (9.7,i), we have (a.s.)
$$\mathbf{E}(M_n|\mathcal{F}_{n-1}) = \mathbf{E}(\xi|\mathcal{F}_n|\mathcal{F}_{n-1}) = \mathbf{E}(\xi|\mathcal{F}_{n-1}) = M_{n-1}.$$
Hence $M$ is a martingale.

Interesting question. In this case, we shall be able to say that
$$M_n \to M_\infty := \mathbf{E}(\xi|\mathcal{F}_\infty), \quad \text{a.s.},$$
because of Lévy's Upward Theorem (Chapter 14). Now $M_n$ is the best predictor of $\xi$ given the information available to us at time $n$, and $M_\infty$ is the best prediction of $\xi$ we can ever make. When can we say that $\xi = \mathbf{E}(\xi|\mathcal{F}_\infty)$, a.s.? The answer is not always obvious. See Section 15.8.
10.5. Fair and unfair games

Think now of $X_n - X_{n-1}$ as your net winnings per unit stake in game $n$ $(n \ge 1)$ in a series of games, played at times $n = 1, 2, \ldots$. There is no game at time 0.

In the martingale case,
(a) $\mathbf{E}[X_n - X_{n-1}|\mathcal{F}_{n-1}] = 0$, (game series is fair),
and in the supermartingale case,
(b) $\mathbf{E}[X_n - X_{n-1}|\mathcal{F}_{n-1}] \le 0$, (game series is unfavourable to you).

Note that (a) [respectively (b)] gives a useful way of formulating the martingale [supermartingale] property of $X$.
10.6. Previsible process, gambling strategy

►►We call a process $C = (C_n : n \in \mathbf{N})$ previsible if
$$C_n \text{ is } \mathcal{F}_{n-1} \text{ measurable } (n \ge 1).$$
Note that $C$ has parameter set $\mathbf{N}$ rather than $\mathbf{Z}^+$: $C_0$ does not exist.

Think of $C_n$ as your stake on game $n$. You have to decide on the value of $C_n$ based on the history up to (and including) time $n - 1$. This is the intuitive significance of the 'previsible' character of $C$. Your winnings on game $n$ are $C_n(X_n - X_{n-1})$ and your total winnings up to time $n$ are
$$Y_n = \sum_{1 \le k \le n} C_k(X_k - X_{k-1}) =: (C \bullet X)_n.$$
Note that $(C \bullet X)_0 = 0$, and that
$$Y_n - Y_{n-1} = C_n(X_n - X_{n-1}).$$
The expression $C \bullet X$, the martingale transform of $X$ by $C$, is the discrete analogue of the stochastic integral $\int C\,dX$. Stochastic-integral theory is one of the greatest achievements of the modern theory of probability.
10.7. A fundamental principle: you can't beat the system!

►►(i) Let $C$ be a bounded non-negative previsible process so that, for some $K$ in $[0, \infty)$, $|C_n(\omega)| \le K$ for every $n$ and every $\omega$. Let $X$ be a supermartingale [respectively martingale]. Then $C \bullet X$ is a supermartingale [martingale] null at 0.
(ii) If $C$ is a bounded previsible process and $X$ is a martingale, then $(C \bullet X)$ is a martingale null at 0.
(iii) In (i) and (ii), the boundedness condition on $C$ may be replaced by the condition $C_n \in \mathcal{L}^2$, $\forall n$, provided we also insist that $X_n \in \mathcal{L}^2$, $\forall n$.

Proof of (i). Write $Y$ for $C \bullet X$. Since $C_n$ is bounded non-negative and $\mathcal{F}_{n-1}$ measurable,
$$\mathbf{E}[Y_n - Y_{n-1}|\mathcal{F}_{n-1}] = C_n\mathbf{E}[X_n - X_{n-1}|\mathcal{F}_{n-1}] \le 0, \quad [\text{resp. } = 0].$$
Proofs of (ii) and (iii) are now obvious. (Look again at (9.7,j).) □
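To see the principle at work, one can take a fair $\pm 1$ game and a bounded previsible staking rule, and compute the expected total winnings $\mathbf{E}(C \bullet X)_n$ exactly by enumerating all $2^n$ paths. The staking rule in the sketch below (double after a loss, capped at 8) is a hypothetical choice made only for illustration; any rule that looks only at past outcomes would do.

```python
from fractions import Fraction
from itertools import product

def stake(history):
    """Hypothetical bounded previsible strategy: double the stake after
    each loss, reset to 1 after a win, never stake more than 8."""
    c = 1
    for step in history:
        c = 1 if step == +1 else min(2 * c, 8)
    return c

def expected_winnings(n):
    """E[(C . X)_n] for a fair +/-1 game, by enumerating all 2^n paths."""
    total = Fraction(0)
    for path in product((+1, -1), repeat=n):
        # The stake on game k+1 uses only the outcomes of games 1..k.
        winnings = sum(stake(path[:k]) * path[k] for k in range(n))
        total += Fraction(winnings, 2 ** n)
    return total

results = [expected_winnings(n) for n in range(1, 9)]
```

Each entry of `results` is exactly zero: the doubling rule changes the distribution of the winnings but not their mean, in line with (ii).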
10.8. Stopping time

A map $T : \Omega \to \{0, 1, 2, \ldots; \infty\}$ is called a stopping time if,
►►(a) $\{T \le n\} = \{\omega : T(\omega) \le n\} \in \mathcal{F}_n$, $\forall n \le \infty$,
equivalently,
(b) $\{T = n\} = \{\omega : T(\omega) = n\} \in \mathcal{F}_n$, $\forall n \le \infty$.
Note that $T$ can be $\infty$.

Proof of the equivalence of (a) and (b). If $T$ has property (a), then
$$\{T = n\} = \{T \le n\} \setminus \{T \le n - 1\} \in \mathcal{F}_n.$$
If $T$ has property (b), then for $k \le n$, $\{T = k\} \in \mathcal{F}_k \subseteq \mathcal{F}_n$ and
$$\{T \le n\} = \bigcup_{0 \le k \le n} \{T = k\} \in \mathcal{F}_n.$$ □

Intuitive idea. $T$ is a time when you can decide to stop playing our game. Whether or not you stop immediately after the $n$th game depends only on the history up to (and including) time $n$: $\{T = n\} \in \mathcal{F}_n$.

Example. Suppose that $(A_n)$ is an adapted process, and that $B \in \mathcal{B}$. Let
$$T = \inf\{n \ge 0 : A_n \in B\} = \text{time of first entry of } A \text{ into set } B.$$
By convention, $\inf(\emptyset) = \infty$, so that $T = \infty$ if $A$ never enters set $B$. Obviously,
$$\{T \le n\} = \bigcup_{k \le n} \{A_k \in B\} \in \mathcal{F}_n,$$
so that $T$ is a stopping time.

Example. Let $L = \sup\{n : n \le 10;\ A_n \in B\}$, $\sup(\emptyset) = 0$. Convince yourself that $L$ is NOT a stopping time (unless $A$ is freaky).
10.9. Stopped supermartingales are supermartingales

Let $X$ be a supermartingale, and let $T$ be a stopping time. Suppose that you always bet 1 unit and quit playing at (immediately after) time $T$. Then your 'stake process' is $C^{(T)}$, where, for $n \in \mathbf{N}$,
$$C_n^{(T)} = I_{\{n \le T\}}, \quad \text{so that } C_n^{(T)}(\omega) = 1 \text{ if } n \le T(\omega), \text{ and } 0 \text{ otherwise.}$$
Your 'winnings process' is the process with value at time $n$ equal to
$$(C^{(T)} \bullet X)_n = X_{T \wedge n} - X_0.$$
If $X^T$ denotes the process $X$ stopped at $T$:
$$X_n^T(\omega) := X_{T(\omega) \wedge n}(\omega),$$
then
$$C^{(T)} \bullet X = X^T - X_0.$$
Now $C^{(T)}$ is clearly bounded (by 1) and non-negative. Moreover, $C^{(T)}$ is previsible because $C_n^{(T)}$ can only be 0 or 1 and, for $n \in \mathbf{N}$,
$$\{C_n^{(T)} = 0\} = \{T \le n - 1\} \in \mathcal{F}_{n-1}.$$
Result 10.7 now yields the following result.

THEOREM.
►►(i) If $X$ is a supermartingale and $T$ is a stopping time, then the stopped process $X^T = (X_{T \wedge n} : n \in \mathbf{Z}^+)$ is a supermartingale, so that in particular,
$$\mathbf{E}(X_{T \wedge n}) \le \mathbf{E}(X_0), \quad \forall n.$$
►►(ii) If $X$ is a martingale and $T$ is a stopping time, then $X^T$ is a martingale, so that in particular,
$$\mathbf{E}(X_{T \wedge n}) = \mathbf{E}(X_0), \quad \forall n.$$
It is important to notice that this theorem imposes no extra integrability conditions whatsoever (except of course for those implicit in the definition of supermartingale and martingale).

But be very careful! Let $X$ be a simple random walk on $\mathbf{Z}^+$, starting at 0. Then $X$ is a martingale. Let $T$ be the stopping time:
$$T := \inf\{n : X_n = 1\}.$$
It is well known that $\mathbf{P}(T < \infty) = 1$. (See Section 10.12 for a martingale proof of this fact, and for a martingale calculation of the distribution of $T$.) However, even though
$$\mathbf{E}(X_{T \wedge n}) = \mathbf{E}(X_0) \text{ for every } n,$$
we have
$$1 = \mathbf{E}(X_T) \ne \mathbf{E}(X_0) = 0.$$
We very much want to know when we can say that
$$\mathbf{E}(X_T) = \mathbf{E}(X_0)$$
for a martingale $X$. The following theorem gives some sufficient conditions.
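The contrast between $\mathbf{E}(X_{T \wedge n}) = 0$ for every $n$ and $\mathbf{E}(X_T) = 1$ can be seen numerically. The sketch below enumerates all $2^n$ paths of the simple random walk from 0 and records the mean of the stopped walk together with $\mathbf{P}(T \le n)$; the function name and the choice of horizons are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def stopped_walk_summary(n):
    """Return (E(X_{T ^ n}), P(T <= n)) for simple random walk from 0,
    where T = inf{k : X_k = 1}, by enumerating all 2^n paths."""
    mean, hit = Fraction(0), Fraction(0)
    weight = Fraction(1, 2 ** n)
    for steps in product((+1, -1), repeat=n):
        x, stopped = 0, False
        for s in steps:
            x += s
            if x == 1:
                stopped = True
                break          # the stopped process stays at 1 from now on
        mean += weight * x
        if stopped:
            hit += weight
    return mean, hit

summaries = {n: stopped_walk_summary(n) for n in (1, 3, 6, 10)}
```

For each horizon the mean is exactly 0 while $\mathbf{P}(T \le n)$ creeps up towards 1: the paths that have not yet stopped carry a small probability but sit far below 0, and that is exactly the mass that is lost when one passes from $X_{T \wedge n}$ to $X_T$.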
10.10. Doob's Optional-Stopping Theorem

►(a) Let $T$ be a stopping time. Let $X$ be a supermartingale. Then $X_T$ is integrable and
$$\mathbf{E}(X_T) \le \mathbf{E}(X_0)$$
in each of the following situations:
(i) $T$ is bounded (for some $N$ in $\mathbf{N}$, $T(\omega) \le N$, $\forall \omega$);
(ii) $X$ is bounded (for some $K$ in $\mathbf{R}^+$, $|X_n(\omega)| \le K$ for every $n$ and every $\omega$) and $T$ is a.s. finite;
(iii) $\mathbf{E}(T) < \infty$, and, for some $K$ in $\mathbf{R}^+$,
$$|X_n(\omega) - X_{n-1}(\omega)| \le K \quad \forall (n, \omega).$$
(b) If any of the conditions (i)-(iii) holds and $X$ is a martingale, then
$$\mathbf{E}(X_T) = \mathbf{E}(X_0).$$
Proof of (a). We know that $X_{T \wedge n}$ is integrable, and
(*) $$\mathbf{E}(X_{T \wedge n} - X_0) \le 0.$$
For (i), we can take $n = N$. For (ii), we can let $n \to \infty$ in (*) using (BDD). For (iii), we have
$$|X_{T \wedge n} - X_0| = \Big|\sum_{k=1}^{T \wedge n} (X_k - X_{k-1})\Big| \le KT$$
and $\mathbf{E}(KT) < \infty$, so that (DOM) justifies letting $n \to \infty$ in (*) to obtain the answer we want. □

Proof of (b). Apply (a) to $X$ and to $(-X)$. □

Corollary
►(c) Suppose that $M$ is a martingale, the increments $M_n - M_{n-1}$ of which are bounded by some constant $K_1$. Suppose that $C$ is a previsible process bounded by some constant $K_2$, and that $T$ is a stopping time such that $\mathbf{E}(T) < \infty$. Then
$$\mathbf{E}(C \bullet M)_T = 0.$$
Proof of the following final part of the Optional-Stopping Theorem is left as an Exercise. (It's clear whose lemma is needed!)

(d) If $X$ is a non-negative supermartingale, and $T$ is a stopping time which is a.s. finite, then
$$\mathbf{E}(X_T) \le \mathbf{E}(X_0).$$
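A standard use of part (b) is the two-sided exit problem, sketched here as an illustration rather than as anything stated in the text. Let $X$ be simple random walk from 0 and let $T$ be the first time it hits $-a$ or $b$; the stopped walk is bounded and $T$ is a.s. finite, so $\mathbf{E}(X_T) = 0$, which forces $\mathbf{P}(X_T = b) = a/(a+b)$. The code below assumes the hypothetical barriers $a = 2$, $b = 3$ and simply pushes probability mass forward step by step in floating point.

```python
def exit_distribution(a, b, steps=2000):
    """Simple random walk from 0, stopped at T = first hit of -a or b.
    Returns (P(X_T = -a), P(X_T = b)), up to the mass still unabsorbed
    after `steps` steps (negligible for small a, b)."""
    mass = {0: 1.0}            # law of the walk on the event {T > k}
    low = high = 0.0
    for _ in range(steps):
        new = {}
        for x, p in mass.items():
            for y in (x - 1, x + 1):
                if y == -a:
                    low += p / 2
                elif y == b:
                    high += p / 2
                else:
                    new[y] = new.get(y, 0.0) + p / 2
        mass = new
    return low, high

low, high = exit_distribution(2, 3)
mean_at_exit = -2 * low + 3 * high      # numerical value of E(X_T)
```

Up to rounding, `high` is $2/5$, `low` is $3/5$, and `mean_at_exit` vanishes, in agreement with $\mathbf{E}(X_T) = \mathbf{E}(X_0) = 0$.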
10.11. Awaiting the almost inevitable
In order to be able to apply some of the results of the preceding Section, we need ways of proving that (when true!) $\mathbf{E}(T) < \infty$. The following announcement of the principle that 'whatever always stands a reasonable chance of happening will almost surely happen, sooner rather than later' is often useful.
LEMMA
► Suppose that $T$ is a stopping time such that for some $N$ in $\mathbb{N}$ and some $\varepsilon > 0$, we have, for every $n$ in $\mathbb{N}$:
$$\mathbf{P}(T \le n + N \mid \mathcal{F}_n) > \varepsilon, \quad \text{a.s.}$$
Then $\mathbf{E}(T) < \infty$.
You will find the proof of this set as an exercise in Chapter E.
Note that if $T$ is the first occasion by which the monkey in the 'Tricky exercise' at the end of Section 4.9 first completes ABRACADABRA, then $\mathbf{E}(T) < \infty$. You will find another exercise in Chapter E inviting you to apply result (c) of the preceding Section to show that
$$\mathbf{E}(T) = 26^{11} + 26^4 + 26.$$
A large number of other Exercises are now accessible to you.
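The value $26^{11} + 26^4 + 26$ has a recognisable shape: one power of 26 for each length $k$ at which the first $k$ letters of ABRACADABRA coincide with its last $k$ letters ($k = 1, 4, 11$). The sketch below simply evaluates that overlap rule. It assumes IID uniform letters, the helper name `expected_wait` is hypothetical, and the rule itself is the well-known outcome of the fair-game argument that the exercise asks for; it is not proved in the text above.

```python
def expected_wait(pattern, alphabet_size):
    """Expected first completion time of `pattern` in IID uniform letters:
    add alphabet_size**k for every k such that the first k letters of the
    pattern coincide with its last k letters."""
    n = len(pattern)
    return sum(alphabet_size ** k
               for k in range(1, n + 1)
               if pattern[:k] == pattern[n - k:])

print(expected_wait("ABRACADABRA", 26))
```

The same function applied to coin tossing, for example `expected_wait("HH", 2)` and `expected_wait("HT", 2)`, gives a quick sanity check against the familiar small cases.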
10.12. Hitting times for simple random walk
Suppose that $(X_n : n \in \mathbb{N})$ is a sequence of IID RVs, each $X_n$ having the same distribution as $X$ where
$$\mathbf{P}(X = 1) = \mathbf{P}(X = -1) = \tfrac12.$$
Define $S_0 := 0$, $S_n := X_1 + \cdots + X_n$, and set
$$T := \inf\{n : S_n = 1\}.$$
Let
$$\mathcal{F}_n = \sigma(X_1, \ldots, X_n) = \sigma(S_0, S_1, \ldots, S_n).$$
Then the process $S$ is adapted (to $\{\mathcal{F}_n\}$), so that $T$ is a stopping time. We wish to calculate the distribution of $T$.
For $\theta \in \mathbb{R}$, $\mathbf{E}e^{\theta X} = \tfrac12(e^{\theta} + e^{-\theta}) = \cosh\theta$, so that
$$\mathbf{E}[(\operatorname{sech}\theta)e^{\theta X_n}] = 1, \quad \forall n.$$
Example (10.4,b) shows that $M^\theta$ is a martingale, where
$$M^\theta_n = (\operatorname{sech}\theta)^n e^{\theta S_n}.$$
Since $T$ is a stopping time, and $M^\theta$ is a martingale, we have
(a) $\mathbf{E}M^\theta_{T\wedge n} = \mathbf{E}[(\operatorname{sech}\theta)^{T\wedge n}\exp(\theta S_{T\wedge n})] = 1, \quad \forall n.$
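Now insist that $\theta > 0$. Then, firstly, $\exp(\theta S_{T\wedge n})$ is bounded by $e^\theta$, so $M^\theta_{T\wedge n}$ is bounded by $e^\theta$. Secondly, as $n \uparrow \infty$, $M^\theta_{T\wedge n} \to M^\theta_T$, where the latter is defined to be 0 if $T = \infty$. The Bounded Convergence Theorem allows us to let $n \to \infty$ in (a) to obtain
$$\mathbf{E}M^\theta_T = 1 = \mathbf{E}[(\operatorname{sech}\theta)^T e^\theta],$$
the term inside $[\cdot]$ on the right-hand side correctly being 0 if $T = \infty$. Hence
(b) $\mathbf{E}[(\operatorname{sech}\theta)^T] = e^{-\theta}$ for $\theta > 0$.
We now let $\theta \downarrow 0$. Then $(\operatorname{sech}\theta)^T \uparrow 1$ if $T < \infty$, and $(\operatorname{sech}\theta)^T \uparrow 0$ if $T = \infty$. Either (MON) or (BDD) yields
$$\mathbf{E}I_{\{T<\infty\}} = 1 = \mathbf{P}(T < \infty).$$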
► The above argument has been given carefully to show how to deal with possibly infinite stopping times.
Put $\alpha = \operatorname{sech}\theta$ in (b) to obtain
(c) $\mathbf{E}(\alpha^T) = \sum_n \alpha^n \mathbf{P}(T = n) = e^{-\theta} = \alpha^{-1}\bigl[1 - \sqrt{1 - \alpha^2}\bigr],$
so that
$$\mathbf{P}(T = 2m - 1) = (-1)^{m+1}\binom{1/2}{m}.$$
Intuitive proof of (c)
We have
(d) $f(\alpha) := \mathbf{E}(\alpha^T) = \tfrac12\mathbf{E}(\alpha^T \mid X_1 = 1) + \tfrac12\mathbf{E}(\alpha^T \mid X_1 = -1) = \tfrac12\alpha + \tfrac12\alpha f(\alpha)^2.$
The intuitive reason for the very last term is that time 1 has already elapsed, giving the $\alpha$, and the time taken to go from $-1$ to 1 has the form $T_1 + T_2$, where $T_1$ (the time to go from $-1$ to 0) and $T_2$ (the time to go from 0 to 1) are independent, each with the same distribution as $T$. It is not obvious that '$T_1$ and $T_2$ are independent', but it is not difficult to devise a proof: the so-called Strong Markov Theorem would allow us to justify (d).
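As a check on the closed form for $\mathbf{P}(T = 2m - 1)$, one can compare it with a direct computation that pushes the walk forward one step at a time and records the mass arriving at level 1. This is a minimal sketch in exact rational arithmetic; the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def half_choose(m):
    """Generalized binomial coefficient (1/2 choose m)."""
    out = Fraction(1)
    for j in range(m):
        out *= (Fraction(1, 2) - j) / (j + 1)
    return out

def hitting_law_formula(m):
    """P(T = 2m - 1) from the power-series expansion of (c)."""
    return (-1) ** (m + 1) * half_choose(m)

def hitting_law_direct(n_max):
    """P(T = n) for 1 <= n <= n_max, by propagating the walk killed at level 1."""
    alive = {0: Fraction(1)}      # mass of paths that have not yet reached 1
    out = {}
    for n in range(1, n_max + 1):
        nxt = {}
        hit = Fraction(0)
        for pos, p in alive.items():
            for step in (-1, 1):
                if pos + step == 1:
                    hit += p / 2  # first arrival at 1 happens at time n
                else:
                    nxt[pos + step] = nxt.get(pos + step, 0) + p / 2
        out[n] = hit
        alive = nxt
    return out

direct = hitting_law_direct(9)
for m in range(1, 6):
    print(2 * m - 1, direct[2 * m - 1], hitting_law_formula(m))
```

Even times carry no mass, as parity demands, and the two columns printed for odd times agree.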
10.13. Non-negative superharmonic functions for Markov chains
Let $E$ be a finite or countable set. Let $P = (p_{ij})$ be a stochastic $E \times E$ matrix, so that, for $i, j \in E$, we have
$$p_{ij} \ge 0, \qquad \sum_k p_{ik} = 1.$$
Let $\mu$ be a probability measure on $E$. We know from Section 4.8 that there exists a triple $(\Omega, \mathcal{F}, \mathbf{P}^\mu)$ (we now signify the dependence of $\mathbf{P}$ on $\mu$) carrying a Markov chain $Z = (Z_n : n \in \mathbb{Z}^+)$ such that (4.8,a) holds. We write 'a.s., $\mathbf{P}^\mu$' to signify 'almost surely relative to the $\mathbf{P}^\mu$-measure'.
Let $\mathcal{F}_n := \sigma(Z_0, Z_1, \ldots, Z_n)$. It is easy to deduce from (4.8,a) that if we write $p(i,j)$ instead of $p_{ij}$ when typographically convenient, then (a.s., $\mathbf{P}^\mu$)
$$\mathbf{P}^\mu(Z_{n+1} = j \mid \mathcal{F}_n) = p(Z_n, j).$$
Let $h$ be a non-negative function on $E$ and define the function $Ph$ on $E$ via
$$(Ph)(i) = \sum_j p(i,j)h(j).$$
Assume that our non-negative function $h$ is finite and $P$-superharmonic in that $Ph \le h$ on $E$. Then, (cMON) shows that, a.s., $\mathbf{P}^\mu$,
$$\mathbf{E}^\mu[h(Z_{n+1}) \mid \mathcal{F}_n] = \sum_j p(Z_n, j)h(j) = (Ph)(Z_n) \le h(Z_n),$$
so that $h(Z_n)$ is a non-negative supermartingale (whatever be the initial distribution $\mu$).
Suppose that the chain $Z$ is irreducible recurrent in that
$$f_{ij} := \mathbf{P}^i(T_j < \infty) = 1, \quad \forall i, j \in E,$$
where $\mathbf{P}^i$ denotes $\mathbf{P}^\mu$ when $\mu$ is the unit mass ($\mu_j = \delta_{ij}$) at $i$ (see 'Note' below) and
$$T_j := \inf\{n : n \ge 1;\ Z_n = j\}.$$
Note that the infimum is over $\{n \ge 1\}$, so that $f_{ii}$ is the probability of a return to $i$ if $Z$ starts at $i$. Then, by Theorem 10.10(d), we see that if $h$ is non-negative and $P$-superharmonic, then, for any $i$ and $j$ in $E$,
$$h(j) = \mathbf{E}^i h(Z_{T_j}) \le \mathbf{E}^i h(Z_0) = h(i),$$
so that $h$ is constant on $E$.
Exercise. Explain (at first intuitively, and later with consideration of rigour) why
$$f_{ij} = \sum_{k \ne j} p_{ik}f_{kj} + p_{ij} \ge \sum_k p_{ik}f_{kj}$$
and deduce that if every non-negative $P$-superharmonic function is constant, then $Z$ is irreducible recurrent.
So we have proved that our chain $Z$ is irreducible and recurrent if and only if every non-negative $P$-superharmonic function is constant. This is a trivial first step in the links between probability and potential theory.
Note. The perspicacious reader will have been upset by a lack of precision in this section. I wished to convey what is interesting first. Only the very enthusiastic should read the remainder of this section.
The natural thing to do, given the one-step transition matrix $P$, is to take the canonical model for the Markov chain $Z$ obtained as follows. Let $\mathcal{E}$ denote the $\sigma$-algebra of all subsets of $E$ and define
$$(\Omega, \mathcal{F}) := \prod_{n \in \mathbb{Z}^+} (E, \mathcal{E}).$$
In particular, a point $\omega$ of $\Omega$ is a sequence
$$\omega = (\omega_0, \omega_1, \ldots)$$
of elements of $E$. For $\omega$ in $\Omega$ and $n$ in $\mathbb{Z}^+$, define
$$Z_n(\omega) := \omega_n \in E.$$
Then, for each probability measure $\mu$ on $(E, \mathcal{E})$, there is a unique probability measure $\mathbf{P}^\mu$ on $(\Omega, \mathcal{F})$ such that for $n \in \mathbb{N}$ and $i_0, i_1, \ldots, i_n \in E$, we have
(*) $\mathbf{P}^\mu[\omega : Z_0(\omega) = i_0, Z_1(\omega) = i_1, \ldots, Z_n(\omega) = i_n] = \mu_{i_0}p_{i_0 i_1} \cdots p_{i_{n-1} i_n}.$
The uniqueness is trivial because $\omega$-sets of the form contained in $[\cdot]$ on the left-hand side of (*), together with $\emptyset$, form a $\pi$-system generating $\mathcal{F}$. Existence follows because we can take $\mathbf{P}^\mu$ to be the $\tilde{\mathbf{P}}^\mu$-law of the non-canonical process $\tilde Z$ constructed in Section A4.3:
$$\mathbf{P}^\mu = \tilde{\mathbf{P}}^\mu \circ \tilde Z^{-1}.$$
Here, we regard $\tilde Z$ as the map
$$\tilde Z : \tilde\Omega \to \Omega, \qquad \tilde\omega \mapsto (\tilde Z_0(\tilde\omega), \tilde Z_1(\tilde\omega), \ldots),$$
this map $\tilde Z$ being $\tilde{\mathcal{F}}/\mathcal{F}$ measurable in that
$$\tilde Z^{-1} : \mathcal{F} \to \tilde{\mathcal{F}}.$$
The canonical model thus obtained is very satisfying because the measurable space $(\Omega, \mathcal{F})$ carries all measures $\mathbf{P}^\mu$ simultaneously.
Chapter 11
The Convergence Theorem
11.1. The picture that says it all
The top part of Figure 11.1 shows a sample path $n \mapsto X_n(\omega)$ for a process $X$ where $X_n - X_{n-1}$ represents your winnings per unit stake on game $n$. The lower part of the picture illustrates your total-winnings process $Y := C \bullet X$ under the previsible strategy $C$ described as follows:
Pick two numbers $a$ and $b$ with $a < b$.
REPEAT
  Wait until $X$ gets below $a$
  Play unit stakes until $X$ gets above $b$ and stop playing
UNTIL FALSE (that is, forever!).
Black blobs signify where $C = 1$; and open circles signify where $C = 0$. Recall that $C$ is not defined at time 0.
To be more formal (and to prove inductively that $C$ is previsible), define
$$C_1 := I_{\{X_0 < a\}},$$
and, for $n \ge 2$,
$$C_n := I_{\{C_{n-1} = 1\}}I_{\{X_{n-1} \le b\}} + I_{\{C_{n-1} = 0\}}I_{\{X_{n-1} < a\}}.$$
11.2. Upcrossings
The number $U_N[a,b](\omega)$ of upcrossings of $[a,b]$ made by $n \mapsto X_n(\omega)$ by time $N$ is defined to be the largest $k$ in $\mathbb{Z}^+$ such that we can find
$$0 \le s_1 < t_1 < s_2 < t_2 < \cdots < s_k < t_k \le N$$
with
$$X_{s_i}(\omega) < a, \quad X_{t_i}(\omega) > b \quad (1 \le i \le k).$$
[Figure 11.1]
The fundamental inequality (recall that $Y_0(\omega) := 0$)
►(D) $Y_N(\omega) \ge (b - a)U_N[a,b](\omega) - [X_N(\omega) - a]^-$
is obvious from the picture: every upcrossing of $[a,b]$ increases the $Y$-value by at least $(b - a)$, while the $[X_N(\omega) - a]^-$ overemphasizes the loss during the last 'interval of play'.
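Inequality (D) is a statement about individual paths, so it can be checked mechanically. The sketch below uses hypothetical helper names, with $\pm 1$ integer paths chosen so that all arithmetic is exact. It counts upcrossings, accumulates the winnings of the 'wait below $a$, play until above $b$' strategy, and compares the two sides of (D).

```python
import random

def upcrossings(path, a, b):
    """U_N[a, b]: completed passages from strictly below a to strictly above b."""
    count, below = 0, False
    for x in path:
        if not below and x < a:
            below = True            # a new s_i has been found
        elif below and x > b:
            below = False           # the matching t_i completes an upcrossing
            count += 1
    return count

def winnings(path, a, b):
    """Y_N = (C . X)_N for the strategy C described in Section 11.1."""
    total, stake = 0, 0
    for n in range(1, len(path)):
        prev = path[n - 1]
        # C_n depends only on C_{n-1} and X_{n-1}, so C is previsible.
        stake = 1 if ((stake == 1 and prev <= b) or (stake == 0 and prev < a)) else 0
        total += stake * (path[n] - prev)
    return total

def inequality_holds(path, a, b):
    """Check Y_N >= (b - a) U_N[a, b] - (X_N - a)^- on one path."""
    shortfall = max(a - path[-1], 0)            # this is (X_N - a)^-
    return winnings(path, a, b) >= (b - a) * upcrossings(path, a, b) - shortfall

def random_walk(n, rng):
    """A +/-1 path of length n started at 0."""
    path = [0]
    for _ in range(n):
        path.append(path[-1] + rng.choice((-1, 1)))
    return path

rng = random.Random(0)
print(all(inequality_holds(random_walk(200, rng), -2, 2) for _ in range(500)))
```

Nothing here depends on the path being a random walk; the randomness is only a convenient way of generating many test paths.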
11.3. Doob's Upcrossing Lemma
► Let $X$ be a supermartingale. Let $U_N[a,b]$ be the number of upcrossings of $[a,b]$ by time $N$. Then
$$(b - a)\mathbf{E}U_N[a,b] \le \mathbf{E}[(X_N - a)^-].$$
Proof. The process $C$ is previsible, bounded and $\ge 0$, and $Y = C \bullet X$. Hence $Y$ is a supermartingale, and $\mathbf{E}(Y_N) \le 0$. The result now follows from (11.2,D). □
11.4. COROLLARY
► Let $X$ be a supermartingale bounded in $\mathcal{L}^1$ in that
$$\sup_n \mathbf{E}(|X_n|) < \infty.$$
Let $a, b \in \mathbb{R}$ with $a < b$. Then, with $U_\infty[a,b] := \uparrow\lim_N U_N[a,b]$,
$$(b - a)\mathbf{E}U_\infty[a,b] \le |a| + \sup_n \mathbf{E}(|X_n|) < \infty,$$
so that
$$\mathbf{P}(U_\infty[a,b] = \infty) = 0.$$
Proof. By (11.3), we have, for $N \in \mathbb{N}$,
$$(b - a)\mathbf{E}U_N[a,b] \le |a| + \mathbf{E}(|X_N|) \le |a| + \sup_n \mathbf{E}(|X_n|).$$
Now let $N \uparrow \infty$, using (MON). □
11.5. Doob's 'Forward' Convergence Theorem
►►► Let $X$ be a supermartingale bounded in $\mathcal{L}^1$: $\sup_n \mathbf{E}(|X_n|) < \infty$. Then, almost surely, $X_\infty := \lim X_n$ exists and is finite. For definiteness, we define $X_\infty(\omega) := \limsup X_n(\omega)$, $\forall\omega$, so that $X_\infty$ is $\mathcal{F}_\infty$ measurable and $X_\infty = \lim X_n$, a.s.
Proof (Doob). Write (noting the use of $[-\infty, \infty]$):
$$\Lambda := \{\omega : X_n(\omega) \text{ does not converge to a limit in } [-\infty, \infty]\}$$
$$= \{\omega : \liminf X_n(\omega) < \limsup X_n(\omega)\}$$
$$= \bigcup_{\{a, b \in \mathbb{Q} : a < b\}} \{\omega : \liminf X_n(\omega) < a < b < \limsup X_n(\omega)\} =: \bigcup \Lambda_{a,b} \quad \text{(say)}.$$
But
$$\Lambda_{a,b} \subseteq \{\omega : U_\infty[a,b](\omega) = \infty\},$$
so that, by (11.4), $\mathbf{P}(\Lambda_{a,b}) = 0$. Since $\Lambda$ is a countable union of sets $\Lambda_{a,b}$, we see that $\mathbf{P}(\Lambda) = 0$, whence
$$X_\infty := \lim X_n \text{ exists a.s. in } [-\infty, \infty].$$
But Fatou's Lemma shows that
$$\mathbf{E}(|X_\infty|) = \mathbf{E}(\liminf |X_n|) \le \liminf \mathbf{E}(|X_n|) \le \sup \mathbf{E}(|X_n|) < \infty,$$
so that
$$\mathbf{P}(X_\infty \text{ is finite}) = 1. \qquad □$$
Note. There are other proofs of the discrete-parameter case. None of these is as probabilistic, and none shares the central importance of this one for the continuous-parameter case.
11.6. Warning
As we saw for the branching-process example, it need not be true that $X_n \to X_\infty$ in $\mathcal{L}^1$.
11.7. Corollary
►► If $X$ is a non-negative supermartingale, then $X_\infty := \lim X_n$ exists almost surely.
Proof. $X$ is obviously bounded in $\mathcal{L}^1$, since $\mathbf{E}(|X_n|) = \mathbf{E}(X_n) \le \mathbf{E}(X_0)$. □
Chapter 12
Martingales bounded in $\mathcal{L}^2$
12.0. Introduction
When it works, one of the easiest ways of proving that a martingale $M$ is bounded in $\mathcal{L}^1$ is to prove that it is bounded in $\mathcal{L}^2$ in the sense that
(a) $\sup_n \|M_n\|_2 < \infty$, equivalently, $\sup_n \mathbf{E}(M_n^2) < \infty.$
Boundedness in $\mathcal{L}^2$ is often easy to check because of a Pythagorean formula (proved in Section 12.1)
$$\mathbf{E}(M_n^2) = \mathbf{E}(M_0^2) + \sum_{k=1}^n \mathbf{E}[(M_k - M_{k-1})^2].$$
The study of sums of independent random variables, a central topic in the classical theory, will be seen to hinge on Theorem 12.2 below, both parts of which have neat martingale proofs. We shall prove the Three-Series Theorem, which says exactly when a sum of independent random variables converges. We shall also prove the general Strong Law of Large Numbers for IID RVs and Lévy's extension of the Borel-Cantelli Lemmas.
12.1. Martingales in $\mathcal{L}^2$; orthogonality of increments
Let $M = (M_n : n \ge 0)$ be a martingale in $\mathcal{L}^2$ in that each $M_n$ is in $\mathcal{L}^2$ so that $\mathbf{E}(M_n^2) < \infty$, $\forall n$. Then for $s, t, u, v \in \mathbb{Z}^+$, with $s \le t \le u \le v$, we know that
$$\mathbf{E}(M_v \mid \mathcal{F}_u) = M_u \quad \text{(a.s.)},$$
so that $M_v - M_u$ is orthogonal to $\mathcal{L}^2(\mathcal{F}_u)$ (see Section 9.5) and in particular,
(a) $\langle M_t - M_s, M_v - M_u \rangle = 0.$
Hence the formula
$$M_n = M_0 + \sum_{k=1}^n (M_k - M_{k-1})$$
expresses $M_n$ as the sum of orthogonal terms, and Pythagoras's theorem yields
(b) $\mathbf{E}(M_n^2) = \mathbf{E}(M_0^2) + \sum_{k=1}^n \mathbf{E}[(M_k - M_{k-1})^2].$
THEOREM
► Let $M$ be a martingale for which $M_n \in \mathcal{L}^2$, $\forall n$. Then $M$ is bounded in $\mathcal{L}^2$ if and only if
(c) $\sum \mathbf{E}[(M_k - M_{k-1})^2] < \infty$;
and when this obtains,
$$M_n \to M_\infty \text{ almost surely and in } \mathcal{L}^2.$$
Proof. It is obvious from (b) that condition (c) is equivalent to the statement that $M$ is bounded in $\mathcal{L}^2$.
Suppose now that (c) holds. Then $M$ is bounded in $\mathcal{L}^2$, and hence, by the property of monotonicity of norms (Section 6.7), $M$ is bounded in $\mathcal{L}^1$. Doob's Convergence Theorem 11.5 shows that $M_\infty := \lim M_n$ exists almost surely. The Pythagorean theorem implies that
(d) $\mathbf{E}[(M_{n+r} - M_n)^2] = \sum_{k=n+1}^{n+r} \mathbf{E}[(M_k - M_{k-1})^2].$
Letting $r \to \infty$ and applying Fatou's Lemma, we obtain
(e) $\mathbf{E}[(M_\infty - M_n)^2] \le \sum_{k \ge n+1} \mathbf{E}[(M_k - M_{k-1})^2].$
Hence
(f) $\lim_n \mathbf{E}[(M_\infty - M_n)^2] = 0,$
so that $M_n \to M_\infty$ in $\mathcal{L}^2$. Of course, (f) allows us to deduce from (d) that (e) holds with equality. □
12.2. Sums of zero-mean independent random variables in $\mathcal{L}^2$
THEOREM
►► Suppose that $(X_k : k \in \mathbb{N})$ is a sequence of independent random variables such that, for every $k$,
$$\mathbf{E}(X_k) = 0, \qquad \sigma_k^2 := \operatorname{Var}(X_k) < \infty.$$
(a) Then
$$\Bigl(\sum \sigma_k^2 < \infty\Bigr) \text{ implies that } \Bigl(\sum X_k \text{ converges, a.s.}\Bigr).$$
(b) If the variables $(X_k)$ are bounded by some constant $K$ in $[0, \infty)$ in that $|X_k(\omega)| \le K$, $\forall k$, $\forall\omega$, then
$$\Bigl(\sum X_k \text{ converges, a.s.}\Bigr) \text{ implies that } \Bigl(\sum \sigma_k^2 < \infty\Bigr).$$
Note. Of course, the Kolmogorov 0-1 law implies that
$$\mathbf{P}\Bigl(\sum X_k \text{ converges}\Bigr) = 0 \text{ or } 1.$$
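Notation. We define
$$\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n), \qquad M_n := X_1 + X_2 + \cdots + X_n$$
(with $\mathcal{F}_0 := \{\emptyset, \Omega\}$, $M_0 := 0$, by the usual conventions). We also define
$$A_n := \sum_{k=1}^n \sigma_k^2, \qquad N_n := M_n^2 - A_n,$$
so that $A_0 := 0$ and $N_0 := 0$.
Proof of (a). We know from (10.4,a) that $M$ is a martingale. Moreover
(*) $\mathbf{E}[(M_k - M_{k-1})^2] = \mathbf{E}(X_k^2) = \sigma_k^2$
so that, from (12.1,b),
$$\mathbf{E}(M_n^2) = \sum_{k=1}^n \sigma_k^2 = A_n.$$
If $\sum \sigma_k^2 < \infty$, then $M$ is bounded in $\mathcal{L}^2$, so that $\lim M_n$ exists a.s. □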
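The identity $\mathbf{E}(M_n^2) = \sum_{k \le n} \sigma_k^2$ is easy to confirm by brute force in a small case. The sketch below is an illustration only: it takes $X_k = \varepsilon_k a_k$ with IID fair signs $\varepsilon_k = \pm 1$ (so that $\sigma_k^2 = a_k^2$), the choice $a_k = 1/k$ is arbitrary, and the function name is hypothetical. It averages $M_n^2$ over all $2^n$ sign patterns using exact fractions.

```python
from fractions import Fraction
from itertools import product

def second_moment(coeffs):
    """E(M_n^2) for M_n = sum_k eps_k * a_k with IID fair signs eps_k,
    computed by averaging over all 2^n equally likely sign patterns."""
    n = len(coeffs)
    total = Fraction(0)
    for signs in product((-1, 1), repeat=n):
        m = sum(s * a for s, a in zip(signs, coeffs))
        total += m * m
    return total / 2 ** n

a = [Fraction(1, k) for k in range(1, 11)]
# Both printed values should coincide: orthogonality kills the cross terms.
print(second_moment(a), sum(x * x for x in a))
```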
Proof of (b). We can strengthen (*) as follows: since $X_k$ is independent of $\mathcal{F}_{k-1}$, we have, almost surely,
$$\mathbf{E}[(M_k - M_{k-1})^2 \mid \mathcal{F}_{k-1}] = \mathbf{E}[X_k^2 \mid \mathcal{F}_{k-1}] = \mathbf{E}(X_k^2) = \sigma_k^2.$$
A familiar argument now applies: since $M_{k-1}$ is $\mathcal{F}_{k-1}$ measurable,
$$\sigma_k^2 = \mathbf{E}(M_k^2 \mid \mathcal{F}_{k-1}) - 2M_{k-1}\mathbf{E}(M_k \mid \mathcal{F}_{k-1}) + M_{k-1}^2 = \mathbf{E}(M_k^2 \mid \mathcal{F}_{k-1}) - M_{k-1}^2 \quad \text{(a.s.)}.$$
But this result states that $N$ is a martingale.
Now let $c \in (0, \infty)$ and define
$$T := \inf\{r : |M_r| > c\}.$$
We know that $N^T$ is a martingale so that, for every $n$,
$$\mathbf{E}N^T_n = \mathbf{E}[(M^T_n)^2] - \mathbf{E}A_{T\wedge n} = 0.$$
But since $|M_T - M_{T-1}| = |X_T| \le K$ if $T$ is finite, we see that $|M^T_n| \le K + c$ for every $n$, whence
(**) $\mathbf{E}A_{T\wedge n} \le (K + c)^2, \quad \forall n.$
However, since $\sum X_n$ converges a.s., the partial sums of $\sum X_k$ are a.s. bounded, and it must be the case that for some $c$, $\mathbf{P}(T = \infty) > 0$. It is now clear from (**) that $A_\infty := \sum \sigma_k^2 < \infty$. □
Remark. The proof of (b) showed that if $(X_n)$ is a sequence of independent zero-mean RVs uniformly bounded by some constant $K$, then
$$\bigl(\mathbf{P}\{\text{partial sums of } \textstyle\sum X_k \text{ are bounded}\} > 0\bigr) \Rightarrow \bigl(\textstyle\sum X_k \text{ converges a.s.}\bigr).$$
Generalization. Sections 12.11-12.16 present the natural martingale form of Theorem 12.2 with applications.
12.3. Random signs
Suppose that $(a_n)$ is a sequence of real numbers and that $(\varepsilon_n)$ is a sequence of IID RVs with
$$\mathbf{P}(\varepsilon_n = \pm 1) = \tfrac12.$$
The results of Section 12.2 show that
$$\sum \varepsilon_n a_n \text{ converges (a.s.) if and only if } \sum a_n^2 < \infty,$$
and that
$$\sum \varepsilon_n a_n \text{ (a.s.) oscillates infinitely if } \sum a_n^2 = \infty.$$
You should think about how to clinch the latter statement.
12.4. A symmetrization technique: expanding the sample space
We need a stronger result than that provided by (12.2,b).
LEMMA
Suppose that $(X_n)$ is a sequence of independent random variables bounded by a constant $K$ in $[0, \infty)$:
$$|X_n(\omega)| \le K, \quad \forall n, \forall\omega.$$
Then
$$\Bigl(\sum X_n \text{ converges, a.s.}\Bigr) \Rightarrow \Bigl(\sum \mathbf{E}(X_n) \text{ converges and } \sum \operatorname{Var}(X_n) < \infty\Bigr).$$
Proof. If each $X_n$ has mean zero, then of course, this would amount to (12.2,b). There is a nice trick which replaces each $X_n$ by a 'symmetrized version' $Z_n^*$ of mean 0 in such a way as to preserve enough of the structure.
Let $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbf{P}}, (\tilde X_n : n \in \mathbb{N}))$ be an exact copy of $(\Omega, \mathcal{F}, \mathbf{P}, (X_n : n \in \mathbb{N}))$. Define
$$(\Omega^*, \mathcal{F}^*, \mathbf{P}^*) := (\Omega, \mathcal{F}, \mathbf{P}) \times (\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbf{P}})$$
and, for $\omega^* = (\omega, \tilde\omega) \in \Omega^*$, define
$$X_n^*(\omega^*) := X_n(\omega), \quad \tilde X_n^*(\omega^*) := \tilde X_n(\tilde\omega), \quad Z_n^*(\omega^*) := X_n^*(\omega^*) - \tilde X_n^*(\omega^*).$$
We think of $X_n^*$ as $X_n$ lifted to the larger 'sample space' $(\Omega^*, \mathcal{F}^*, \mathbf{P}^*)$. It is clear (and may be proved by applying Uniqueness Lemma 1.6 in a familiar way) that the combined family
$$(X_n^* : n \in \mathbb{N}) \cup (\tilde X_n^* : n \in \mathbb{N})$$
is a family of independent random variables on $(\Omega^*, \mathcal{F}^*, \mathbf{P}^*)$, with both $X_n^*$ and $\tilde X_n^*$ having the same $\mathbf{P}^*$-distribution as the $\mathbf{P}$-distribution of $X_n$:
$$\mathbf{P}^* \circ (X_n^*)^{-1} = \mathbf{P} \circ X_n^{-1} \text{ on } (\mathbb{R}, \mathcal{B}), \text{ etc.}$$
Now we have
(a) $(Z_n^* : n \in \mathbb{N})$ is a zero-mean sequence of independent random variables on $(\Omega^*, \mathcal{F}^*, \mathbf{P}^*)$ such that $|Z_n^*(\omega^*)| \le 2K$ ($\forall n, \forall\omega^*$) and
$$\mathbf{E}[(Z_n^*)^2] = 2\sigma_n^2,$$
where $\sigma_n^2 := \operatorname{Var}(X_n)$.
Let $G := \{\omega \in \Omega : \sum X_n(\omega) \text{ converges}\}$, with $\tilde G$ defined similarly. Then we are given that $\mathbf{P}(G) = \tilde{\mathbf{P}}(\tilde G) = 1$, so that $\mathbf{P}^*(G \times \tilde G) = 1$. But $\sum Z_n^*(\omega^*)$ converges on $G \times \tilde G$, so that
(b) $\mathbf{P}^*\bigl(\sum Z_n^* \text{ converges}\bigr) = 1.$
From (a) and (b) and (12.2,b), we conclude that
(c) $\sum \sigma_n^2 < \infty,$
and now it follows from (12.2,a) that
$$\sum [X_n - \mathbf{E}(X_n)] \text{ converges, a.s.},$$
the variables in this sum being zero-mean independent, with
$$\mathbf{E}[\{X_n - \mathbf{E}(X_n)\}^2] = \sigma_n^2.$$
Since (c) holds and $\sum X_n$ converges (a.s.) by hypothesis, $\sum \mathbf{E}(X_n)$ converges. □
Note. Another proof of the lemma may be found in Section 18.6.
12.5. Kolmogorov's Three-Series Theorem
Let $(X_n)$ be a sequence of independent random variables. Then $\sum X_n$ converges almost surely if and only if for some (then for every) $K > 0$, the following three properties hold:
(i) $\sum_n \mathbf{P}(|X_n| > K) < \infty$,
(ii) $\sum_n \mathbf{E}(X_n^K)$ converges,
(iii) $\sum_n \operatorname{Var}(X_n^K) < \infty$,
where
$$X_n^K(\omega) := \begin{cases} X_n(\omega) & \text{if } |X_n(\omega)| \le K, \\ 0 & \text{if } |X_n(\omega)| > K. \end{cases}$$
Proof of 'if' part. Suppose that for some $K > 0$ properties (i)-(iii) hold. Then
$$\sum \mathbf{P}(X_n \ne X_n^K) = \sum \mathbf{P}(|X_n| > K) < \infty,$$
so that by (BC1)
$$\mathbf{P}(X_n = X_n^K \text{ for all but finitely many } n) = 1.$$
It is therefore clear that we need only show that $\sum X_n^K$ converges almost surely; and because of (ii), we need only prove that
$$\sum Y_n^K \text{ converges, a.s., where } Y_n^K := X_n^K - \mathbf{E}(X_n^K).$$
However, the sequence $(Y_n^K : n \in \mathbb{N})$ is a zero-mean sequence of independent random variables with
$$\mathbf{E}[(Y_n^K)^2] = \operatorname{Var}(X_n^K).$$
Because of (iii), the desired result now follows from (12.2,a).
Proof of 'only if' part. Suppose that $\sum X_n$ converges, a.s., and that $K$ is any constant in $(0, \infty)$. Since it is almost surely true that $X_n \to 0$ whence $|X_n| > K$ for only finitely many $n$, (BC2) shows that (i) holds. Since (a.s.) $X_n = X_n^K$ for all but finitely many $n$, we know that
$$\sum X_n^K \text{ converges, a.s.}$$
Lemma 12.4 completes the proof. □
Results such as the Three-Series Theorem become powerful when used in conjunction with Kronecker's Lemma (Section 12.7).
12.6. Cesàro's Lemma
Suppose that $(b_n)$ is a sequence of strictly positive real numbers with $b_n \uparrow \infty$, and that $(v_n)$ is a convergent sequence of real numbers: $v_n \to v_\infty \in \mathbb{R}$. Then
$$\frac{1}{b_n}\sum_{k=1}^n (b_k - b_{k-1})v_k \to v_\infty \quad (n \to \infty).$$
Here, $b_0 := 0$.
Proof. Let $\varepsilon > 0$. Choose $N$ such that
$$v_k > v_\infty - \varepsilon \text{ whenever } k \ge N.$$
Then,
$$\liminf_{n\to\infty} \frac{1}{b_n}\sum_{k=1}^n (b_k - b_{k-1})v_k \ge \liminf_{n\to\infty}\left\{\frac{1}{b_n}\sum_{k=1}^N (b_k - b_{k-1})v_k + \frac{b_n - b_N}{b_n}(v_\infty - \varepsilon)\right\} \ge 0 + v_\infty - \varepsilon.$$
Since this is true for every $\varepsilon > 0$, we have $\liminf \ge v_\infty$; and since, by a similar argument, $\limsup \le v_\infty$, the result follows. □
12.7. Kronecker's Lemma
► Again, let $(b_n)$ denote a sequence of strictly positive real numbers with $b_n \uparrow \infty$. Let $(x_n)$ be a sequence of real numbers, and define
$$s_n := x_1 + x_2 + \cdots + x_n.$$
Then
$$\Bigl(\sum \frac{x_n}{b_n} \text{ converges}\Bigr) \Rightarrow \Bigl(\frac{s_n}{b_n} \to 0\Bigr).$$
Proof. Let $u_n := \sum_{k \le n} (x_k/b_k)$, so that $u_\infty := \lim u_n$ exists. Then
$$u_n - u_{n-1} = x_n/b_n.$$
Thus
$$s_n = \sum_{k=1}^n b_k(u_k - u_{k-1}) = b_n u_n - \sum_{k=1}^n (b_k - b_{k-1})u_{k-1}.$$
Cesàro's Lemma now shows that
$$s_n/b_n \to u_\infty - u_\infty = 0. \qquad □$$
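A small numerical experiment may make Kronecker's Lemma more concrete. The example below is an assumption chosen purely for illustration and is not taken from the text: $b_n = n$ and $x_n = (-1)^{n+1}\sqrt{n}$. The series $\sum x_n/b_n = \sum (-1)^{n+1}/\sqrt{n}$ converges by the alternating-series test, so the lemma predicts $s_n/n \to 0$ even though $|s_n|$ itself grows without bound.

```python
import math

def kronecker_demo(n_terms):
    """With b_n = n and x_n = (-1)^(n+1) * sqrt(n), return the pair
    (partial sum of x_k / b_k, s_n / b_n) at n = n_terms."""
    series, s = 0.0, 0.0
    for n in range(1, n_terms + 1):
        x = math.sqrt(n) if n % 2 == 1 else -math.sqrt(n)
        series += x / n     # partial sum of the convergent series
        s += x              # s_n = x_1 + ... + x_n
    return series, s / n_terms

for n_terms in (10**2, 10**4, 10**5):
    print(n_terms, kronecker_demo(n_terms))
```

The first component settles down while the second shrinks, roughly like $1/\sqrt{n}$ in this particular example.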
12.8. Strong Law under variance constraints
LEMMA
Let $(W_n)$ be a sequence of independent random variables such that
$$\mathbf{E}(W_n) = 0, \qquad \sum \frac{\operatorname{Var}(W_n)}{n^2} < \infty.$$
Then $n^{-1}\sum_{k \le n} W_k \to 0$, a.s.
Proof. By Kronecker's Lemma, it is enough to prove that $\sum (W_n/n)$ converges, a.s. But this is immediate from Theorem 12.2(a). □
Note. We are now going to see that a truncation technique enables us to obtain the general Strong Law for IID RVs from the above lemma.
12.9. Kolmogorov's Truncation Lemma
Suppose that $X_1, X_2, \ldots$ are IID RVs each with the same distribution as $X$, where $\mathbf{E}(|X|) < \infty$. Set $\mu := \mathbf{E}(X)$. Define
$$Y_n := \begin{cases} X_n & \text{if } |X_n| \le n, \\ 0 & \text{if } |X_n| > n. \end{cases}$$
Then
(i) $\mathbf{E}(Y_n) \to \mu$;
(ii) $\mathbf{P}[Y_n = X_n \text{ eventually}] = 1$;
(iii) $\sum n^{-2}\operatorname{Var}(Y_n) < \infty$.
Proof of (i). Let
$$Z_n := \begin{cases} X & \text{if } |X| \le n, \\ 0 & \text{if } |X| > n. \end{cases}$$
Then $Z_n$ has the same distribution as $Y_n$, so that in particular, $\mathbf{E}(Z_n) = \mathbf{E}(Y_n)$. But, as $n \to \infty$, we have
$$Z_n \to X, \qquad |Z_n| \le |X|,$$
so, by (DOM), $\mathbf{E}(Z_n) \to \mathbf{E}(X) = \mu$.
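Proof of (ii). We have
$$\sum_{n=1}^\infty \mathbf{P}(Y_n \ne X_n) = \sum_{n=1}^\infty \mathbf{P}(|X_n| > n) = \sum_{n=1}^\infty \mathbf{P}(|X| > n) = \mathbf{E}\sum_{n=1}^\infty I_{\{|X| > n\}} = \mathbf{E}\sum_{1 \le n < |X|} 1 \le \mathbf{E}(|X|) < \infty,$$
so that by (BC1), result (ii) holds.
Proof of (iii). We have
$$\sum \frac{\operatorname{Var}(Y_n)}{n^2} \le \sum \frac{\mathbf{E}(Y_n^2)}{n^2} = \sum_n \frac{\mathbf{E}(|X|^2; |X| \le n)}{n^2} = \mathbf{E}[|X|^2 f(|X|)],$$
where, for $0 < z < \infty$,
$$f(z) = \sum_{n \ge \max(1, z)} n^{-2} \le 2/\max(1, z).$$
We have used the fact that, for $n \ge 1$,
$$\frac{1}{n^2} \le \frac{2}{n(n+1)} = 2\Bigl(\frac{1}{n} - \frac{1}{n+1}\Bigr).$$
Hence $\sum n^{-2}\operatorname{Var}(Y_n) \le 2\mathbf{E}(|X|) < \infty$. □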

12.10. Kolmogorov's Strong Law of Large Numbers (SLLN)
►► Let $X_1, X_2, \ldots$ be IID RVs with $\mathbf{E}(|X_k|) < \infty$, $\forall k$. Define
$$S_n := X_1 + X_2 + \cdots + X_n.$$
Then, with $\mu := \mathbf{E}(X_k)$, $\forall k$,
$$n^{-1}S_n \to \mu, \text{ almost surely.}$$
Proof. Define $Y_n$ as in Lemma 12.9. By property (ii) of that lemma, we need only show that
$$n^{-1}\sum_{k \le n} Y_k \to \mu, \text{ a.s.}$$
But
(a) $n^{-1}\sum_{k \le n} Y_k = n^{-1}\sum_{k \le n} \mathbf{E}(Y_k) + n^{-1}\sum_{k \le n} W_k,$
where $W_k := Y_k - \mathbf{E}(Y_k)$. But, the first term on the right-hand side of (a) tends to $\mu$ by (12.9,i) and Cesàro's Lemma; and the second converges almost surely to 0 by Lemma 12.8. □
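The two ingredients of the proof, truncation and averaging, can be watched in a quick simulation. This is only a sketch under stated assumptions: the summands are taken to be exponential with mean 1 (an arbitrary integrable choice, not one made in the text), the seed is fixed for reproducibility, and the function name is hypothetical.

```python
import random

def truncated_mean(n, seed=0):
    """Return (mean of X_k, mean of Y_k, number of k with Y_k != X_k) for
    IID exponential(1) draws X_1..X_n and Y_k = X_k * 1{|X_k| <= k}."""
    rng = random.Random(seed)
    sum_x = sum_y = 0.0
    mismatches = 0
    for k in range(1, n + 1):
        x = rng.expovariate(1.0)
        y = x if abs(x) <= k else 0.0   # Kolmogorov truncation at level k
        mismatches += (y != x)
        sum_x += x
        sum_y += y
    return sum_x / n, sum_y / n, mismatches

print(truncated_mean(200_000))
```

Both averages should sit close to $\mu = 1$, and the mismatch count should be tiny, in line with property (ii) of Lemma 12.9: only finitely many of the $X_k$ are ever altered by the truncation.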
Notes. The Strong Law is philosophically satisfying in that it gives a precise formulation of $\mathbf{E}(X)$ as 'the mean of a large number of independent realizations of $X$'. We know from Exercise E4.6 that if $\mathbf{E}(|X|) = \infty$, then
$$\limsup |S_n|/n = \infty, \text{ a.s.}$$
So, we have arrived at the best possible result for the IID case.
Discussion of methods. Even though we have achieved a good result, it has to be admitted that the truncation technique does seem 'ad hoc': it does not have the pure-mathematical elegance (the sense of rightness) which the martingale proof and the proof by ergodic theory (the latter is not in this book) both possess. However, each of the methods can be adapted to cover situations which the others cannot tackle; and, in particular, classical truncation arguments retain great importance.
Properly formulated, the argument which gave the result, Theorem 12.2, on which all of this chapter has so far relied, can yield much more.
12.11. Doob decomposition
In the following theorem, the statement that '$A$ is a previsible process null at 0' means of course that $A_0 = 0$ and $A_n \in \mathrm{m}\mathcal{F}_{n-1}$ ($n \in \mathbb{N}$).
THEOREM
►►(a) Let $(X_n : n \in \mathbb{Z}^+)$ be an adapted process with $X_n \in \mathcal{L}^1$, $\forall n$. Then $X$ has a Doob decomposition
(D) $X = X_0 + M + A$
where $M$ is a martingale null at 0, and $A$ is a previsible process null at 0. Moreover, this decomposition is unique modulo indistinguishability in the sense that if $X = X_0 + \tilde M + \tilde A$ is another such decomposition, then
$$\mathbf{P}(M_n = \tilde M_n, A_n = \tilde A_n, \forall n) = 1.$$
►(b) $X$ is a submartingale if and only if the process $A$ is an increasing process in the sense that
$$\mathbf{P}(A_n \le A_{n+1}, \forall n) = 1.$$
Proof. If $X$ has a Doob decomposition as at (D), then, since $M$ is a martingale and $A$ is previsible, we have, almost surely,
$$\mathbf{E}(X_n - X_{n-1} \mid \mathcal{F}_{n-1}) = \mathbf{E}(M_n - M_{n-1} \mid \mathcal{F}_{n-1}) + \mathbf{E}(A_n - A_{n-1} \mid \mathcal{F}_{n-1}) = 0 + (A_n - A_{n-1}).$$
Hence
(c) $A_n = \sum_{k=1}^n \mathbf{E}(X_k - X_{k-1} \mid \mathcal{F}_{k-1}), \quad \text{a.s.},$
and if we use (c) to define $A$, we obtain the required decomposition of $X$.
The 'submartingale' result (b) is now obvious. □
Remark. The Doob-Meyer decomposition, which expresses a submartingale in continuous time as the sum of a local martingale and a previsible increasing process, is a deep result which is the foundation stone for stochastic-integral theory.
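Formula (c) can be evaluated explicitly in a simple special case. The sketch below is illustrative only: it assumes $X_n = f(S_n)$ for a simple symmetric random walk $S$, so that the conditional expectation in (c) depends only on $S_{k-1}$ and reduces to an average over the two possible next steps. The helper name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def compensator(f, steps):
    """A_0, ..., A_n along one +/-1 path, where
    A_n = sum over k <= n of E[f(S_k) - f(S_{k-1}) | F_{k-1}]
    and each conditional expectation averages over a fair +/-1 step."""
    a, s = Fraction(0), 0
    out = [a]
    for step in steps:
        # Previsible increment: it uses S_{k-1} only, never the k-th step.
        a += Fraction(f(s + 1) + f(s - 1), 2) - f(s)
        out.append(a)
        s += step
    return out

# For f(s) = s^2 the compensator is A_n = n on every path,
# so S_n^2 - n is the martingale part of the submartingale S_n^2.
for steps in product((-1, 1), repeat=3):
    print(steps, compensator(lambda s: s * s, steps))
```

Taking $f(s) = s$ instead gives $A \equiv 0$, as it must for a martingale.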
12.12. The angle-brackets process $\langle M \rangle$
Let $M$ be a martingale in $\mathcal{L}^2$ and null at 0. Then the conditional form of Jensen's inequality shows that
(a) $M^2$ is a submartingale.
Thus $M^2$ has a Doob decomposition (essentially unique):
(b) $M^2 = N + A,$
where $N$ is a martingale and $A$ is a previsible increasing process, both $N$ and $A$ being null at 0. Define
$$A_\infty := \uparrow\lim A_n, \quad \text{a.s.}$$
Notation. The process $A$ is often written $\langle M \rangle$.
Since $\mathbf{E}(M_n^2) = \mathbf{E}(A_n)$, we see that
(c) $M$ is bounded in $\mathcal{L}^2$ if and only if $\mathbf{E}(A_\infty) < \infty$.
It is important to note that
►(d) $A_n - A_{n-1} = \mathbf{E}(M_n^2 - M_{n-1}^2 \mid \mathcal{F}_{n-1}) = \mathbf{E}[(M_n - M_{n-1})^2 \mid \mathcal{F}_{n-1}].$
12.13. Relating convergence of $M$ to finiteness of $\langle M \rangle_\infty$
Again let $M$ be a martingale in $\mathcal{L}^2$ and null at 0. Define $A := \langle M \rangle$. (More strictly, let $A$ be 'a version of' $\langle M \rangle$.)
THEOREM
►(a) $\lim_n M_n(\omega)$ exists for almost every $\omega$ for which $A_\infty(\omega) < \infty$.
►(b) Suppose that $M$ has uniformly bounded increments in that for some $K$ in $\mathbb{R}$,
$$|M_n(\omega) - M_{n-1}(\omega)| \le K, \quad \forall n, \forall\omega.$$
Then $A_\infty(\omega) < \infty$ for almost every $\omega$ for which $\lim M_n(\omega)$ exists.
Remark. This is obviously an extension of Theorem 12.2 (and a very substantial one).
Proof of (a). Because $A$ is previsible, it is immediate that for every $k \in \mathbb{N}$,
$$S(k) := \inf\{n \in \mathbb{Z}^+ : A_{n+1} > k\}$$
defines a stopping time $S(k)$. Moreover, the stopped process $A^{S(k)}$ is previsible because for $B \in \mathcal{B}$, and $n \in \mathbb{N}$,
$$\{A_{n \wedge S(k)} \in B\} = F_1 \cup F_2,$$
where
$$F_1 := \bigcup_{r=0}^{n-1} \{S(k) = r;\ A_r \in B\} \in \mathcal{F}_{n-1},$$
$$F_2 := \{A_n \in B\} \cap \{S(k) \le n - 1\}^c \in \mathcal{F}_{n-1}.$$
Since
$$(M^{S(k)})^2 - A^{S(k)} = (M^2 - A)^{S(k)}$$
is a martingale, we now see that $\langle M^{S(k)} \rangle = A^{S(k)}$. However, the process $A^{S(k)}$ is bounded by $k$, so that by (12.12,c), $M^{S(k)}$ is bounded in $\mathcal{L}^2$ and
(c) $\lim_n M_{n \wedge S(k)}$ exists almost surely.
However,
(d) $\{A_\infty < \infty\} = \bigcup_k \{S(k) = \infty\}.$
Result (a) now follows on combining (c) and (d). □
Proof of (b). Suppose that
$$\mathbf{P}\bigl(A_\infty = \infty,\ \sup_n |M_n| < \infty\bigr) > 0.$$
Then for some $c > 0$,
(e) $\mathbf{P}(T(c) = \infty, A_\infty = \infty) > 0,$
where $T(c)$ is the stopping time:
$$T(c) := \inf\{r : |M_r| > c\}.$$
Now,
$$\mathbf{E}\bigl(M^2_{T(c) \wedge n} - A_{T(c) \wedge n}\bigr) = 0,$$
and $M^{T(c)}$ is bounded by $c + K$. Thus
(f) $\mathbf{E}A_{T(c) \wedge n} \le (c + K)^2, \quad \forall n.$
But (MON) shows that (e) and (f) are incompatible. Result (b) follows. □
Remark. In the proof of (a), we were able to use previsibility to make the jump $A_{S(k)} - A_{S(k)-1}$ irrelevant. We could not do this for the jump $M_{T(c)} - M_{T(c)-1}$, which is why we needed the assumption about bounded increments.
12.14. A trivial 'Strong Law' for martingales in $\mathcal{L}^2$
Let $M$ be a martingale in $\mathcal{L}^2$ and null at 0, and let $A = \langle M \rangle$. Since $(1 + A)^{-1}$ is a bounded previsible process,
$$W_n := \sum_{1 \le k \le n} \frac{M_k - M_{k-1}}{1 + A_k} = \bigl((1 + A)^{-1} \bullet M\bigr)_n$$
defines a martingale $W$. Moreover, since $(1 + A_n)$ is $\mathcal{F}_{n-1}$ measurable,
$$\mathbf{E}[(W_n - W_{n-1})^2 \mid \mathcal{F}_{n-1}] = (1 + A_n)^{-2}(A_n - A_{n-1}) \le (1 + A_{n-1})^{-1} - (1 + A_n)^{-1}, \quad \text{a.s.}$$
We see that $\langle W \rangle_\infty \le 1$, a.s., so that $\lim W_n$ exists, a.s. Kronecker's Lemma now shows that
►►(a) $M_n/A_n \to 0$ a.s. on $\{A_\infty = \infty\}$.
12.15. Lévy's extension of the Borel-Cantelli Lemmas
THEOREM
Suppose that for $n \in \mathbb{N}$, $E_n \in \mathcal{F}_n$. Define
$$Z_n := \sum_{1 \le k \le n} I_{E_k} = \text{number of } E_k\ (k \le n) \text{ which occur}.$$
Define $\xi_k := \mathbf{P}(E_k \mid \mathcal{F}_{k-1})$, and
$$Y_n := \sum_{1 \le k \le n} \xi_k.$$
Then, almost surely,
(a) $(Y_\infty < \infty) \Rightarrow (Z_\infty < \infty)$,
(b) $(Y_\infty = \infty) \Rightarrow (Z_n/Y_n \to 1)$.
Remarks. (i) Since $\mathbf{E}\xi_k = \mathbf{P}(E_k)$, it follows that if $\sum \mathbf{P}(E_k) < \infty$, then $Y_\infty < \infty$, a.s., and (BC1) therefore follows.
(ii) Let $(E_n : n \in \mathbb{N})$ be a sequence of independent events associated with some triple $(\Omega, \mathcal{F}, \mathbf{P})$, and define $\mathcal{F}_n = \sigma(E_1, E_2, \ldots, E_n)$. Then $\xi_k = \mathbf{P}(E_k)$, a.s., and (BC2) follows from (b).
Proof. Let $M$ be the martingale $Z - Y$, so that $Z = M + Y$ is the Doob decomposition of the submartingale $Z$. Then (you check!)
$$A_n := \langle M \rangle_n = \sum_{k \le n} \xi_k(1 - \xi_k) \le Y_n, \quad \text{a.s.}$$
If $Y_\infty < \infty$, then $A_\infty < \infty$ and $\lim M_n$ exists, so that $Z_\infty$ is finite. (We are skipping 'except for a null $\omega$-set' statements now.)
If $Y_\infty = \infty$ and $A_\infty < \infty$ then $\lim M_n$ exists and it is trivial that $Z_n/Y_n \to 1$.
If $Y_\infty = \infty$ and $A_\infty = \infty$, then $M_n/A_n \to 0$, so that, a fortiori, $M_n/Y_n \to 0$ and $Z_n/Y_n \to 1$. □
12.16. Comments
The last few sections have indicated just how powerful the use of $\langle M \rangle$ to study $M$ is likely to be. In the same way as one can obtain the conditional version Theorem 12.15 of the Borel-Cantelli Lemmas, one can obtain conditional versions of the Three-Series Theorem etc. But a whole new world is opened up: see Neveu (1975), for example. In the continuous-time case, things are much more striking still. See, for example, Rogers and Williams (1987).
Chapter 13
Uniform Integrability
We have already seen a number of nice applications of martingale theory. To derive the full benefit, we need something better than the Dominated-Convergence Theorem. In particular, Theorem 13.7 gives a necessary and sufficient condition for a sequence of RVs to converge in $\mathcal{L}^1$. The new concept required is that of a uniformly integrable (UI) family of random variables. This concept links perfectly with conditional expectations and hence with martingales.
The appendix to this chapter contains a discussion of that topic loved by examiners and others: modes of convergence. Our use of the Upcrossing Lemma has meant that this topic does not feature large in the main text of this book.
13.1. An 'absolute continuity' property
LEMMA
►(a) Suppose that $X \in \mathcal{L}^1 = \mathcal{L}^1(\Omega, \mathcal{F}, \mathbf{P})$. Then, given $\varepsilon > 0$, there exists a $\delta > 0$ such that for $F \in \mathcal{F}$, $\mathbf{P}(F) < \delta$ implies that $\mathbf{E}(|X|; F) < \varepsilon$.
Proof. If the conclusion is false, then, for some $\varepsilon_0 > 0$, we can find a sequence $(F_n)$ of elements of $\mathcal{F}$ such that
$$\mathbf{P}(F_n) < 2^{-n} \quad \text{and} \quad \mathbf{E}(|X|; F_n) \ge \varepsilon_0.$$
Let $H := \limsup F_n$. Then (BC1) shows that $\mathbf{P}(H) = 0$, but the 'Reverse' Fatou Lemma (5.4,b) shows that
$$\mathbf{E}(|X|; H) \ge \varepsilon_0;$$
and we have arrived at the required contradiction. □
Corollary
(b) Suppose that $X \in \mathcal{L}^1$ and that $\varepsilon > 0$. Then there exists $K$ in $[0, \infty)$ such that
$$\mathbf{E}(|X|; |X| > K) < \varepsilon.$$
Proof. Let $\delta$ be as in Lemma (a). Since
$$K\mathbf{P}(|X| > K) \le \mathbf{E}(|X|),$$
we can choose $K$ such that $\mathbf{P}(|X| > K) < \delta$. □
13.2. Definition. UI family
►► A class $\mathcal{C}$ of random variables is called uniformly integrable (UI) if given $\varepsilon > 0$, there exists $K$ in $[0, \infty)$ such that
$$\mathbf{E}(|X|; |X| > K) < \varepsilon, \quad \forall X \in \mathcal{C}.$$
We note that for such a class $\mathcal{C}$, we have (with $K_1$ relating to $\varepsilon = 1$) for every $X \in \mathcal{C}$,
$$\mathbf{E}(|X|) = \mathbf{E}(|X|; |X| > K_1) + \mathbf{E}(|X|; |X| \le K_1) \le 1 + K_1.$$
Thus, a UI family is bounded in $\mathcal{L}^1$.
It is not true that a family bounded in $\mathcal{L}^1$ is UI.
Example. Take $(\Omega, \mathcal{F}, \mathbf{P}) = ([0,1], \mathcal{B}[0,1], \mathrm{Leb})$. Let
$$E_n = (0, n^{-1}), \qquad X_n = nI_{E_n}.$$
Then $\mathbf{E}(|X_n|) = 1$, $\forall n$, so that $(X_n)$ is bounded in $\mathcal{L}^1$. However, for any $K > 0$, we have for $n > K$,
$$\mathbf{E}(|X_n|; |X_n| > K) = n\mathbf{P}(E_n) = 1,$$
so that $(X_n)$ is not UI. Here, $X_n \to 0$, but $\mathbf{E}(X_n) \not\to 0$.
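The failure of uniform integrability in this Example amounts to the statement that $\sup_n \mathbf{E}(|X_n|; |X_n| > K)$ never drops below 1, however large $K$ is. The sketch below, with hypothetical function names, evaluates that supremum exactly over a finite range of $n$, using the fact that $X_n$ takes the value $n$ on a set of Lebesgue measure $1/n$ and is 0 elsewhere.

```python
from fractions import Fraction

def tail_expectation(n, K):
    """E(|X_n|; |X_n| > K) for X_n = n * 1_{(0, 1/n)} under Leb on [0, 1]."""
    # On E_n = (0, 1/n) we have |X_n| = n; off E_n we have X_n = 0.
    return n * Fraction(1, n) if n > K else Fraction(0)

def sup_tail(K, n_max):
    """Supremum of the tail expectations over n = 1, ..., n_max."""
    return max(tail_expectation(n, K) for n in range(1, n_max + 1))

for K in (1, 10, 100):
    print(K, sup_tail(K, 10 * K + 1))
```

For a UI family the corresponding supremum would have to fall below any prescribed $\varepsilon$ once $K$ is large enough.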
13.3. Two simple sufficient conditions for the UI property
►(a) Suppose that $\mathcal{C}$ is a class of random variables which is bounded in $\mathcal{L}^p$ for some $p > 1$; thus, for some $A \in [0, \infty)$,
$$\mathbf{E}(|X|^p) < A, \quad \forall X \in \mathcal{C}.$$
Then $\mathcal{C}$ is UI.
Proof. If $v \ge K > 0$, then $v \le K^{1-p}v^p$ (obviously!). Hence, for $K > 0$ and $X \in \mathcal{C}$, we have
$$\mathbf{E}(|X|; |X| > K) \le K^{1-p}\mathbf{E}(|X|^p; |X| > K) \le K^{1-p}A.$$
The result follows. □
(b) Suppose that $\mathcal{C}$ is a class of random variables which is dominated by an integrable non-negative variable $Y$:
$$|X(\omega)| \le Y(\omega), \quad \forall X \in \mathcal{C} \quad \text{and} \quad \mathbf{E}(Y) < \infty.$$
Then $\mathcal{C}$ is UI.
Note. It is precisely this which makes (DOM) work for our $(\Omega, \mathcal{F}, \mathbf{P})$.
Proof. It is obvious that, for $K > 0$ and $X \in \mathcal{C}$,
$$\mathbf{E}(|X|; |X| > K) \le \mathbf{E}(Y; Y > K),$$
and now it is only necessary to apply (13.1,b) to $Y$. □
13.4. UI property of conditional expectations
The main reason that the UI property fits in so well with martingale theory is the following. See Exercise E13.3 for an important extension.
THEOREM
►► Let $X \in \mathcal{L}^1$. Then the class
$$\{\mathbf{E}(X \mid \mathcal{G}) : \mathcal{G} \text{ a sub-}\sigma\text{-algebra of } \mathcal{F}\}$$
is uniformly integrable.
Note. Because of the business of versions, a formal description of the class $\mathcal{C}$ in question would be as follows: $Y \in \mathcal{C}$ if and only if for some sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$, $Y$ is a version of $\mathbf{E}(X \mid \mathcal{G})$.
Proof. Let $\varepsilon > 0$ be given. Choose $\delta > 0$ such that, for $F \in \mathcal{F}$,
$$\mathbf{P}(F) < \delta \text{ implies that } \mathbf{E}(|X|; F) < \varepsilon.$$
Choose $K$ so that $K^{-1}\mathbf{E}(|X|) < \delta$.
Now let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$ and let $Y$ be any version of $\mathbf{E}(X \mid \mathcal{G})$. By Jensen's inequality,
(a) $|Y| \le \mathbf{E}(|X| \mid \mathcal{G})$, a.s.
Hence $\mathbf{E}(|Y|) \le \mathbf{E}(|X|)$ and
$$K\mathbf{P}(|Y| > K) \le \mathbf{E}(|Y|) \le \mathbf{E}(|X|),$$
so that
$$\mathbf{P}(|Y| > K) < \delta.$$
But $\{|Y| > K\} \in \mathcal{G}$, so that, from (a) and the definition of conditional expectation,
$$\mathbf{E}(|Y|; |Y| > K) \le \mathbf{E}(|X|; |Y| > K) < \varepsilon. \qquad □$$
Note. Now you can see why we needed the more subtle result (13.1,a), not just the result (13.1,b) which has a simpler proof.
13.5. Convergence in probability

Let $(X_n)$ be a sequence of random variables, and let $X$ be a random variable. We say that

▶▶ $X_n \to X$ in probability

if for every $\varepsilon > 0$,

$$P(|X_n - X| > \varepsilon) \to 0 \text{ as } n \to \infty.$$

LEMMA

▶ If $X_n \to X$ almost surely, then $X_n \to X$ in probability.

Proof. Suppose that $X_n \to X$ almost surely and that $\varepsilon > 0$. Then by the Reverse Fatou Lemma 2.6(b) for sets,

$$0 = P(|X_n - X| > \varepsilon, \text{ i.o.}) = P(\limsup \{|X_n - X| > \varepsilon\}) \ge \limsup P(|X_n - X| > \varepsilon),$$

and the result is proved. □

Note. As already mentioned, a discussion of the relationships between various modes of convergence may be found in the appendix to this chapter.
13.6. Elementary proof of (BDD)

We restate the Bounded Convergence Theorem, but under the weaker hypothesis of 'convergence in probability' rather than 'almost sure convergence'.

THEOREM (BDD)

Let $(X_n)$ be a sequence of RVs, and let $X$ be a RV. Suppose that $X_n \to X$ in probability and that for some $K$ in $[0, \infty)$, we have for every $n$ and $\omega$,

$$|X_n(\omega)| \le K.$$

Then

$$E(|X_n - X|) \to 0.$$

Proof. Let us check that $P(|X| \le K) = 1$. Indeed, for $k \in \mathbb{N}$,

$$P(|X| > K + k^{-1}) \le P(|X - X_n| > k^{-1}), \qquad \forall n,$$

so that $P(|X| > K + k^{-1}) = 0$. Thus

$$P(|X| > K) = P\Big(\bigcup_k \{|X| > K + k^{-1}\}\Big) = 0.$$

Let $\varepsilon > 0$ be given. Choose $n_0$ such that

$$P(|X_n - X| > \tfrac{1}{3}\varepsilon) < \frac{\varepsilon}{3K} \quad \text{when } n \ge n_0.$$

Then, for $n \ge n_0$,

$$E(|X_n - X|) = E(|X_n - X|; |X_n - X| > \tfrac{1}{3}\varepsilon) + E(|X_n - X|; |X_n - X| \le \tfrac{1}{3}\varepsilon) \le 2K P(|X_n - X| > \tfrac{1}{3}\varepsilon) + \tfrac{1}{3}\varepsilon \le \varepsilon.$$

The proof is finished. □

This proof shows (much as does that of the Weierstrass approximation theorem) that convergence in probability is a natural concept.
13.7. A necessary and sufficient condition for $\mathcal{L}^1$ convergence

THEOREM

▶▶ Let $(X_n)$ be a sequence in $\mathcal{L}^1$, and let $X \in \mathcal{L}^1$. Then $X_n \to X$ in $\mathcal{L}^1$, equivalently $E(|X_n - X|) \to 0$, if and only if the following two conditions are satisfied:

(i) $X_n \to X$ in probability,
(ii) the sequence $(X_n)$ is UI.

Remarks. It is of course the 'if' part of the theorem which is useful. Since the result is 'best possible', it must improve on (DOM) for our $(\Omega, \mathcal{F}, P)$ triple; and, of course, result 13.3(b) makes this explicit.

Proof of 'if' part. Suppose that conditions (i) and (ii) are satisfied. For $K \in [0, \infty)$, define a function $\varphi_K : \mathbb{R} \to [-K, K]$ as follows:

$$\varphi_K(x) := \begin{cases} K & \text{if } x > K, \\ x & \text{if } |x| \le K, \\ -K & \text{if } x < -K. \end{cases}$$

Let $\varepsilon > 0$ be given. By the UI property of the $(X_n)$ sequence and (13.1,b), we can choose $K$ so that

$$E\{|\varphi_K(X_n) - X_n|\} < \frac{\varepsilon}{3}, \quad \forall n; \qquad E\{|\varphi_K(X) - X|\} < \frac{\varepsilon}{3}.$$

But, since $|\varphi_K(x) - \varphi_K(y)| \le |x - y|$, we see that $\varphi_K(X_n) \to \varphi_K(X)$ in probability; and by (BDD) in the form of the preceding section, we can choose $n_0$ such that, for $n \ge n_0$,

$$E\{|\varphi_K(X_n) - \varphi_K(X)|\} < \frac{\varepsilon}{3}.$$

The triangle inequality therefore implies that, for $n \ge n_0$,

$$E(|X_n - X|) < \varepsilon,$$

and the proof is complete. □

Proof of 'only if' part. Suppose that $X_n \to X$ in $\mathcal{L}^1$. Let $\varepsilon > 0$ be given. Choose $N$ such that

$$n \ge N \;\Rightarrow\; E(|X_n - X|) < \varepsilon/2.$$

By (13.1,a), we can choose $\delta > 0$ such that whenever $P(F) < \delta$, we have

$$E(|X_n|; F) < \varepsilon \quad (1 \le n \le N), \qquad E(|X|; F) < \varepsilon/2.$$

Since $(X_n)$ is bounded in $\mathcal{L}^1$, we can choose $K$ such that

$$K^{-1} \sup_r E(|X_r|) < \delta.$$

Then for $n \ge N$, we have $P(|X_n| > K) < \delta$ and

$$E(|X_n|; |X_n| > K) \le E(|X|; |X_n| > K) + E(|X - X_n|) < \varepsilon.$$

For $n \le N$, we have $P(|X_n| > K) < \delta$ and

$$E(|X_n|; |X_n| > K) < \varepsilon.$$

Hence $(X_n)$ is a UI family. Since

$$\varepsilon P(|X_n - X| > \varepsilon) \le E(|X_n - X|) = \|X_n - X\|_1,$$

it is clear that $X_n \to X$ in probability. □
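The truncation map $\varphi_K$ used in the 'if' part of the proof in Section 13.7 is simply a clamp to $[-K, K]$. The sketch below is illustrative only (the name `phi` and the test grid are assumptions, not from the text). It checks the two elementary facts the proof relies on: $\varphi_K$ is 1-Lipschitz, and $|\varphi_K(x) - x| = (|x| - K)^+$, which is what links the truncation error to the UI tail $E(|X|; |X| > K)$.

```python
def phi(K: float, x: float) -> float:
    """Truncation of x to the interval [-K, K]."""
    return max(-K, min(K, x))


def is_one_lipschitz_on(K: float, points) -> bool:
    """Check |phi_K(x) - phi_K(y)| <= |x - y| for all pairs of the given points."""
    return all(
        abs(phi(K, x) - phi(K, y)) <= abs(x - y)
        for x in points
        for y in points
    )


def truncation_error(K: float, x: float) -> float:
    """|phi_K(x) - x|, which equals max(|x| - K, 0)."""
    return abs(phi(K, x) - x)


if __name__ == "__main__":
    grid = [i / 4 for i in range(-40, 41)]  # dyadic grid, exact in floating point
    print(is_one_lipschitz_on(2.0, grid))
    print([truncation_error(2.0, x) for x in (-3.0, 0.5, 2.0, 5.0)])
```

Since $|\varphi_K(x) - x| \le |x|\, I_{\{|x| > K\}}$, the bound $E\{|\varphi_K(X_n) - X_n|\} \le E(|X_n|; |X_n| > K)$ follows at once, and the UI property makes the right-hand side small uniformly in $n$.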
Chapter 14

UI Martingales

14.0. Introduction

The first part of this chapter examines what happens when uniform integrability is combined with the martingale property. In addition to new results such as Lévy's 'Upward' and 'Downward' Theorems, we also obtain new proofs of the Kolmogorov 0-1 Law and of the Strong Law of Large Numbers.

The second part of the chapter (beginning at Section 14.6) is concerned with Doob's Submartingale Inequality. This result implies in particular that for $p > 1$ (but not for $p = 1$) a martingale bounded in $\mathcal{L}^p$ is dominated by an element of $\mathcal{L}^p$ and hence converges both almost surely and in $\mathcal{L}^p$. The Submartingale Inequality is also used to prove Kakutani's Theorem on product-form martingales and, in an illustration of exponential bounds, to prove a very special case of the Law of the Iterated Logarithm.

The Radon-Nikodým theorem is then proved, and its relevance to likelihood ratio explained.

The topic of optional sampling, important for continuous-parameter theory and in other contexts, is covered in the appendix to this chapter.

14.1. UI martingales
Let $M$ be a UI martingale, so that $M$ is a martingale relative to our set-up $(\Omega, \mathcal{F}, \{\mathcal{F}_n\}, P)$ and $(M_n : n \in \mathbb{Z}^+)$ is a UI family.

Since $M$ is UI, $M$ is bounded in $\mathcal{L}^1$ (by (13.2)), and so $M_\infty := \lim M_n$ exists almost surely. By Theorem 13.7, it is also true that $M_n \to M_\infty$ in $\mathcal{L}^1$:

$$E(|M_n - M_\infty|) \to 0.$$

We now prove that $M_n = E(M_\infty | \mathcal{F}_n)$, a.s. For $F \in \mathcal{F}_n$, and $r \ge n$, the martingale property yields

(*) $E(M_r; F) = E(M_n; F)$.

But

$$|E(M_r; F) - E(M_\infty; F)| \le E(|M_r - M_\infty|; F) \le E(|M_r - M_\infty|).$$

Hence, on letting $r \to \infty$ in (*), we obtain

$$E(M_\infty; F) = E(M_n; F).$$

We have proved the following result.

THEOREM

▶▶ Let $M$ be a UI martingale. Then

$$M_\infty := \lim M_n \text{ exists a.s. and in } \mathcal{L}^1.$$

Moreover, for every $n$,

$$M_n = E(M_\infty | \mathcal{F}_n), \quad \text{a.s.}$$

The obvious extension to UI supermartingales may be proved similarly.

14.2. Lévy's 'Upward' Theorem

▶▶ Let $\xi \in \mathcal{L}^1(\Omega, \mathcal{F}, P)$, and define $M_n := E(\xi | \mathcal{F}_n)$, a.s. Then $M$ is a UI martingale and

$$M_n \to \eta := E(\xi | \mathcal{F}_\infty)$$

almost surely and in $\mathcal{L}^1$.

Proof. We know that $M$ is a martingale because of the Tower Property. We know from Theorem 13.4 that $M$ is UI. Hence $M_\infty := \lim M_n$ exists a.s. and in $\mathcal{L}^1$, and it remains only to prove that $M_\infty = \eta$ a.s., where $\eta := E(\xi | \mathcal{F}_\infty)$.

Without loss of generality, we may (and do) assume that $\xi \ge 0$. Now consider the measures $Q_1$ and $Q_2$ on $(\Omega, \mathcal{F}_\infty)$, where

$$Q_1(F) := E(\eta; F), \qquad Q_2(F) := E(M_\infty; F), \qquad F \in \mathcal{F}_\infty.$$

If $F \in \mathcal{F}_n$, then since $E(\eta | \mathcal{F}_n) = E(\xi | \mathcal{F}_n)$ by the Tower Property,

$$E(\eta; F) = E(M_n; F) = E(M_\infty; F),$$

the second equality having been proved in Section 14.1. Thus $Q_1$ and $Q_2$ agree on the $\pi$-system (algebra!) $\bigcup \mathcal{F}_n$, and hence they agree on $\mathcal{F}_\infty$.

Both $\eta$ and $M_\infty$ are $\mathcal{F}_\infty$ measurable; more strictly, $M_\infty$ may be taken to be $\mathcal{F}_\infty$ measurable by defining $M_\infty := \limsup M_n$ for every $\omega$. Thus,

$$F := \{\omega : \eta > M_\infty\} \in \mathcal{F}_\infty,$$

and since $Q_1(F) = Q_2(F)$,

$$E(\eta - M_\infty; \eta > M_\infty) = 0.$$

Hence $P(\eta > M_\infty) = 0$, and similarly $P(M_\infty > \eta) = 0$. □
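A concrete instance of Lévy's 'Upward' Theorem can be worked out by hand. The sketch below is an illustration under stated assumptions, none of which come from the text: $\Omega = [0,1)$ with Lebesgue measure, $\mathcal{F}_n$ generated by the dyadic intervals of length $2^{-n}$, and $\xi(\omega) = \omega^2$. Then $M_n = E(\xi | \mathcal{F}_n)$ is the average of $\xi$ over the dyadic interval containing $\omega$, and the function name `cond_exp_dyadic` is hypothetical.

```python
from fractions import Fraction


def cond_exp_dyadic(n: int, omega: Fraction) -> Fraction:
    """M_n(omega) = E(xi | F_n)(omega) for xi(w) = w^2 on [0, 1) with Lebesgue measure,
    where F_n is generated by the dyadic intervals [k / 2^n, (k + 1) / 2^n)."""
    k = int(omega * 2 ** n)  # index of the dyadic interval containing omega
    a, b = Fraction(k, 2 ** n), Fraction(k + 1, 2 ** n)
    # Average of w^2 over [a, b) is (b^3 - a^3) / (3 (b - a)).
    return (b ** 3 - a ** 3) / (3 * (b - a))


if __name__ == "__main__":
    omega = Fraction(1, 3)
    for n in (1, 5, 10, 20):
        # M_n(omega) approaches xi(omega) = 1/9 as n grows.
        print(n, float(cond_exp_dyadic(n, omega)), float(omega ** 2))
```

Here $\mathcal{F}_\infty$ is the Borel $\sigma$-algebra, so $\eta = \xi$, and the printed values show $M_n(\omega)$ closing in on $\omega^2$. The martingale property is the statement that the average over a dyadic interval is the mean of the averages over its two halves.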
14.3. Martingale proof of Kolmogorov's 0-1 law

Recall the result.

THEOREM

Let $X_1, X_2, \ldots$ be a sequence of independent RVs. Define

$$\mathcal{T}_n := \sigma(X_{n+1}, X_{n+2}, \ldots), \qquad \mathcal{T} := \bigcap_n \mathcal{T}_n.$$

Then if $F \in \mathcal{T}$, $P(F) = 0$ or $1$.

Proof. Define $\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n)$. Let $F \in \mathcal{T}$, and let $\eta := I_F$. Since $\eta \in b\mathcal{F}_\infty$, Lévy's Upward Theorem shows that

$$\eta = E(\eta | \mathcal{F}_\infty) = \lim E(\eta | \mathcal{F}_n), \quad \text{a.s.}$$

However, for each $n$, $\eta$ is $\mathcal{T}_n$ measurable, and hence (see Remark below) is independent of $\mathcal{F}_n$. Hence by (9.7,k),

$$E(\eta | \mathcal{F}_n) = E(\eta) = P(F), \quad \text{a.s.}$$

Hence $\eta = P(F)$, a.s.; and since $\eta$ only takes the values 0 and 1, the result follows. □

Remark. Of course, we have cheated to some extent in building parts of the earlier proof into the martingale statements used in the proof just given.
14.4. Lévy's 'Downward' Theorem

▶▶ Suppose that $(\Omega, \mathcal{F}, P)$ is a probability triple, and that $\{\mathcal{G}_{-n} : n \in \mathbb{N}\}$ is a collection of sub-$\sigma$-algebras of $\mathcal{F}$ such that

$$\mathcal{G}_{-\infty} := \bigcap_k \mathcal{G}_{-k} \subseteq \cdots \subseteq \mathcal{G}_{-(n+1)} \subseteq \mathcal{G}_{-n} \subseteq \cdots \subseteq \mathcal{G}_{-1}.$$

Let $\gamma \in \mathcal{L}^1(\Omega, \mathcal{F}, P)$ and define

$$M_{-n} := E(\gamma | \mathcal{G}_{-n}).$$

Then

$$M_{-\infty} := \lim M_{-n} \text{ exists a.s. and in } \mathcal{L}^1$$

and

(*) $M_{-\infty} = E(\gamma | \mathcal{G}_{-\infty})$, a.s.

Proof. The Upcrossing Lemma applied to the martingale

$$(M_k, \mathcal{G}_k : -N \le k \le -1)$$

can be used exactly as in the proof of Doob's Forward Convergence Theorem to show that $\lim M_{-n}$ exists a.s. The uniform-integrability result, Theorem 13.4, shows that $\lim M_{-n}$ exists in $\mathcal{L}^1$.

That (*) holds (if you like with $M_{-\infty} := \limsup M_{-n} \in m\mathcal{G}_{-\infty}$) follows by now-familiar reasoning: for $G \in \mathcal{G}_{-\infty} \subseteq \mathcal{G}_{-r}$,

$$E(\gamma; G) = E(M_{-r}; G),$$

and now let $r \uparrow \infty$. □
14.5. Martingale proof of the Strong Law

Recall the result (but add $\mathcal{L}^1$ convergence as bonus).

THEOREM

Let $X_1, X_2, \ldots$ be IID RVs, with $E(|X_k|) < \infty$, $\forall k$. Let $\mu$ be the common value of $E(X_n)$. Write

$$S_n := X_1 + X_2 + \cdots + X_n.$$

Then $n^{-1} S_n \to \mu$, a.s. and in $\mathcal{L}^1$.

Proof. Define

$$\mathcal{G}_{-n} := \sigma(S_n, S_{n+1}, S_{n+2}, \ldots), \qquad \mathcal{G}_{-\infty} := \bigcap_n \mathcal{G}_{-n}.$$

We know from Section 9.11 that

$$E(X_1 | \mathcal{G}_{-n}) = n^{-1} S_n, \quad \text{a.s.}$$

Hence $L := \lim n^{-1} S_n$ exists a.s. and in $\mathcal{L}^1$. For definiteness, define $L := \limsup n^{-1} S_n$ for every $\omega$. Then for each $k$,

$$L = \limsup_n \frac{X_{k+1} + \cdots + X_{k+n}}{n},$$

so that $L \in m\mathcal{T}_k$ where $\mathcal{T}_k = \sigma(X_{k+1}, X_{k+2}, \ldots)$. By Kolmogorov's 0-1 law, $P(L = c) = 1$ for some $c$ in $\mathbb{R}$. But

$$c = E(L) = \lim E(n^{-1} S_n) = \mu.$$ □

Exercise. Explain how we could have deduced $\mathcal{L}^1$ convergence at (12.10). Hint. Recall Scheffé's Lemma 5.10, and think about how to use it.

Remarks. See Meyer (1966) for important extensions and applications of the results given so far in this chapter. These extensions include: the Hewitt-Savage 0-1 law, de Finetti's theorem on exchangeable random variables, the Choquet-Deny theorem on bounded harmonic functions for random walks on groups.
14.6. Doob's Submartingale Inequality

THEOREM

▶▶(a) Let $Z$ be a non-negative submartingale. Then, for $c > 0$,

$$c P\Big(\sup_{k \le n} Z_k \ge c\Big) \le E\Big(Z_n; \sup_{k \le n} Z_k \ge c\Big) \le E(Z_n).$$

Proof. Let $F := \{\sup_{k \le n} Z_k \ge c\}$. Then $F$ is a disjoint union

$$F = F_0 \cup F_1 \cup \ldots \cup F_n,$$

where

$$F_0 := \{Z_0 \ge c\},$$
$$F_k := \{Z_0 < c\} \cap \{Z_1 < c\} \cap \ldots \cap \{Z_{k-1} < c\} \cap \{Z_k \ge c\}.$$

Now, $F_k \in \mathcal{F}_k$, and $Z_k \ge c$ on $F_k$. Hence,

$$E(Z_n; F_k) \ge E(Z_k; F_k) \ge c P(F_k).$$

Summing over $k$ now yields the result. □

The main reason for the usefulness of the above theorem is the following.

LEMMA

▶(b) If $M$ is a martingale, $c$ is a convex function, and $E|c(M_n)| < \infty$, $\forall n$, then $c(M)$ is a submartingale.

Proof. Apply the conditional form of Jensen's inequality in Table 9.7. □

Kolmogorov's inequality

▶ Let $(X_n : n \in \mathbb{N})$ be a sequence of independent zero-mean RVs in $\mathcal{L}^2$. Define $\sigma_k^2 := \mathrm{Var}(X_k)$. Write

$$S_n := X_1 + \cdots + X_n, \qquad V_n := \mathrm{Var}(S_n) = \sum_{k=1}^n \sigma_k^2.$$

Then, for $c > 0$,

$$c^2 P\Big(\sup_{k \le n} |S_k| \ge c\Big) \le V_n.$$

Proof. We know that if we set $\mathcal{F}_n = \sigma(X_1, X_2, \ldots, X_n)$, then $S = (S_n)$ is a martingale. Now apply the Submartingale Inequality to $S^2$. □

Note. Kolmogorov's inequality was the key step in the original proofs of Kolmogorov's Three-Series Theorem and Strong Law.
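Kolmogorov's inequality can be verified exactly in a small case. The following sketch is an illustration, not part of the original text. It assumes fair $\pm 1$ steps (so $\sigma_k^2 = 1$ and $V_n = n$), enumerates all $2^n$ paths, and compares $c^2 P(\sup_{k \le n} |S_k| \ge c)$ with $V_n$. The function name `kolmogorov_sides` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def kolmogorov_sides(n: int, c: int):
    """Return (c^2 * P(max_{k<=n} |S_k| >= c), V_n) for a walk with fair +-1 steps."""
    hits = 0
    for steps in product((-1, 1), repeat=n):
        s, running_max = 0, 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        if running_max >= c:
            hits += 1
    prob = Fraction(hits, 2 ** n)
    # Each step has mean 0 and variance 1, so Var(S_n) = n.
    return c * c * prob, Fraction(n)


if __name__ == "__main__":
    for c in range(1, 9):
        lhs, rhs = kolmogorov_sides(8, c)
        print(c, lhs, rhs, lhs <= rhs)
```

Exhaustive enumeration is only feasible for small $n$; the point is to see the inequality hold with room to spare for every threshold $c$.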
14.7. Law of the Iterated Logarithm: special case

Let us see how the Submartingale Inequality may be used via so-called exponential bounds to prove a very special case of Kolmogorov's Law of the Iterated Logarithm which is described in Section A4.1. (You would do well to take a quick look at this proof even though it is not needed later.)

THEOREM

Let $(X_n : n \in \mathbb{N})$ be IID RVs each with the standard normal N(0,1) distribution of mean 0 and variance 1. Define

$$S_n := X_1 + X_2 + \cdots + X_n.$$

Then, almost surely,

$$\limsup_n \frac{S_n}{(2n \log\log n)^{1/2}} = 1.$$

Proof. Throughout the proof, we shall write

$$h(n) := (2n \log\log n)^{1/2}, \qquad n \ge 3.$$

(It will be understood that integers occurring in the proof are greater than $e$, when this is necessary.)

Step 1: An exponential bound. Define $\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n)$. Then $S$ is a martingale relative to $\{\mathcal{F}_n\}$. It is well known that for $\theta \in \mathbb{R}$, $n \in \mathbb{N}$,

$$E e^{\theta S_n} = e^{\frac{1}{2}\theta^2 n}.$$

The function $x \mapsto e^{\theta x}$ is convex on $\mathbb{R}$, so that

$$e^{\theta S_n} \text{ is a submartingale}$$

and, by the Submartingale Inequality, we have, for $\theta > 0$,

▶▶ $$P\Big(\sup_{k \le n} S_k \ge c\Big) = P\Big(\sup_{k \le n} e^{\theta S_k} \ge e^{\theta c}\Big) \le e^{-\theta c} E\big(e^{\theta S_n}\big).$$

This is a type of exponential bound much used in modern probability theory. In our special case, we have

$$P\Big(\sup_{k \le n} S_k \ge c\Big) \le e^{-\theta c} e^{\frac{1}{2}\theta^2 n},$$

and for $c > 0$, choosing the best $\theta$, namely $c/n$, we obtain

(a) $$P\Big(\sup_{k \le n} S_k \ge c\Big) \le e^{-\frac{1}{2} c^2 / n}.$$

Step 2: Obtaining an upper bound. Let $K$ be a real number with $K > 1$. (We are interested in cases when $K$ is close to 1.) Choose $c_n := K h(K^{n-1})$. Then

$$P\Big(\sup_{k \le K^n} S_k \ge c_n\Big) \le \exp(-c_n^2 / 2K^n) = (n-1)^{-K} (\log K)^{-K}.$$

The First Borel-Cantelli Lemma therefore shows that, almost surely, for all large $n$ (all $n \ge n_0(\omega)$) we have for $K^{n-1} \le k \le K^n$,

$$S_k \le \sup_{k \le K^n} S_k \le c_n = K h(K^{n-1}) \le K h(k).$$

Hence, for $K > 1$,

$$\limsup_k h(k)^{-1} S_k \le K, \quad \text{a.s.}$$

By taking a sequence of $K$-values converging down to 1, we obtain

$$\limsup_k h(k)^{-1} S_k \le 1, \quad \text{a.s.}$$

Step 3: Obtaining a lower bound. Let $N$ be an integer with $N > 1$. (We are interested in cases when $N$ is very large.) Let $\varepsilon$ be a number in $(0,1)$. (Of course, $\varepsilon$ will be small in the cases which interest us.) Write $S(r)$ for $S_r$, etc., when typographically more convenient. For $n \in \mathbb{N}$, define the event

$$F_n := \{S(N^{n+1}) - S(N^n) > (1 - \varepsilon) h(N^{n+1} - N^n)\}.$$

Then (see Proposition 14.8(b) below),

$$P(F_n) = 1 - \Phi(y) \ge (2\pi)^{-1/2} (y + y^{-1})^{-1} \exp(-y^2/2),$$

where

$$y = (1 - \varepsilon)\{2 \log\log(N^{n+1} - N^n)\}^{1/2}.$$

Thus, ignoring 'logarithmic terms', $P(F_n)$ is roughly $(n \log N)^{-(1-\varepsilon)^2}$, so that $\sum P(F_n) = \infty$. However, the events $F_n$ ($n \in \mathbb{N}$) are clearly independent, so that (BC2) shows that, almost surely, infinitely many $F_n$ occur. Thus, for infinitely many $n$,

$$S(N^{n+1}) > (1 - \varepsilon) h(N^{n+1} - N^n) + S(N^n).$$

But, by Step 2, $S(N^n) > -2h(N^n)$ for all large $n$, so that for infinitely many $n$, we have

$$S(N^{n+1}) > (1 - \varepsilon) h(N^{n+1} - N^n) - 2h(N^n).$$

It now follows that

$$\limsup_k h(k)^{-1} S_k \ge \limsup_n h(N^{n+1})^{-1} S(N^{n+1}) \ge (1 - \varepsilon)(1 - N^{-1})^{1/2} - 2N^{-1/2}.$$

(You should check that 'the logarithmic terms do disappear'.) The rest is obvious. □
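Two computational steps in the proof of Section 14.7 can be checked numerically: that $\theta = c/n$ minimises the bound $e^{-\theta c} e^{\theta^2 n/2}$ of Step 1, and that with $c_n = K h(K^{n-1})$ the bound in Step 2 equals $(n-1)^{-K}(\log K)^{-K}$. The sketch below is illustrative only. The function names are hypothetical and the particular values of $c$, $n$ and $K$ are arbitrary choices.

```python
import math


def exp_bound(theta: float, c: float, n: float) -> float:
    """The Step 1 bound exp(-theta * c) * exp(theta^2 * n / 2)."""
    return math.exp(-theta * c + 0.5 * theta * theta * n)


def best_bound(c: float, n: float) -> float:
    """Value of the bound at theta = c / n, namely exp(-c^2 / (2 n))."""
    return exp_bound(c / n, c, n)


def h(n: float) -> float:
    """h(n) = (2 n log log n)^(1/2), defined for n > e."""
    return math.sqrt(2.0 * n * math.log(math.log(n)))


def step2_bound(K: float, n: int) -> float:
    """exp(-c_n^2 / (2 K^n)) with c_n = K * h(K^(n-1))."""
    c_n = K * h(K ** (n - 1))
    return math.exp(-c_n * c_n / (2.0 * K ** n))


if __name__ == "__main__":
    c, n = 3.0, 10.0
    print(best_bound(c, n), math.exp(-c * c / (2.0 * n)))
    K, m = 1.5, 10
    print(step2_bound(K, m), ((m - 1) * math.log(K)) ** (-K))
```

The exponent $-\theta c + \theta^2 n / 2$ is a quadratic in $\theta$ with minimum at $\theta = c/n$, which is why no other choice of $\theta$ does better. The sum over $n$ of $(n-1)^{-K}$ converges precisely because $K > 1$, which is what the First Borel-Cantelli Lemma needs.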
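14.8. A standard estimate on the normal distribution

We used part of the following result in the previous section.

Proposition

Suppose that $X$ has the standard normal distribution, so that, for $x \in \mathbb{R}$,

$$P(X > x) = 1 - \Phi(x) = \int_x^\infty \varphi(y)\, dy,$$

where

$$\varphi(y) := (2\pi)^{-1/2} \exp(-\tfrac{1}{2} y^2).$$

Then, for $x > 0$,

(a) $P(X > x) \le x^{-1} \varphi(x)$,

(b) $P(X > x) \ge (x + x^{-1})^{-1} \varphi(x)$.

Proof. Let $x > 0$. Since $\varphi'(y) = -y \varphi(y)$,

$$\varphi(x) = \int_x^\infty y \varphi(y)\, dy \ge x \int_x^\infty \varphi(y)\, dy,$$

yielding (a).

Since $(y^{-1} \varphi(y))' = -(1 + y^{-2}) \varphi(y)$,

$$x^{-1} \varphi(x) = \int_x^\infty (1 + y^{-2}) \varphi(y)\, dy \le (1 + x^{-2}) \int_x^\infty \varphi(y)\, dy,$$

yielding (b). □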
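The two-sided estimate of Proposition 14.8 is easy to test numerically. The following minimal sketch is not part of the original text. It computes the normal tail through the standard identity $P(X > x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$ and compares it with the bounds (a) and (b). The function names are hypothetical.

```python
import math


def normal_density(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_tail(x: float) -> float:
    """P(X > x) for standard normal X, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_bounds(x: float):
    """Return (lower, upper) = ((x + 1/x)^(-1) phi(x), phi(x) / x) for x > 0."""
    density = normal_density(x)
    return density / (x + 1.0 / x), density / x


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 4.0, 6.0):
        lower, upper = tail_bounds(x)
        print(x, lower, normal_tail(x), upper)
```

For large $x$ the two bounds differ by a relative factor of $1 + x^{-2}$, so they pin the tail down very tightly. That is why estimate (b) suffices in Step 3 of the previous section.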
14.9. Remarks on exponential bounds; large-deviation theory

Obtaining exponential bounds is related to the very powerful theory of large deviations (see Varadhan (1984), Deuschel and Stroock (1989)), which has an ever-growing number of fields of application. See Ellis (1985).

You can study exponential bounds in the very specific context of martingales in Neveu (1975), Chow and Teicher (1978), Garsia (1973), etc.

Much of the literature is concerned with obtaining exponential bounds which are in a sense best possible. However, 'elementary' results such as the Azuma-Hoeffding inequality in Exercise E14.1 are very useful in numerous applications. See for example the applications to combinatorics in Bollobás (1987).
14.10. A consequence of Hölder's inequality

Look at the statement of Doob's $\mathcal{L}^p$ inequality in the next section in order to see where we are going.

LEMMA

Suppose that $X$ and $Y$ are non-negative RVs such that

$$c P(X \ge c) \le E(Y; X \ge c) \quad \text{for every } c > 0.$$

Then, for $p > 1$ and $p^{-1} + q^{-1} = 1$, we have

$$\|X\|_p \le q \|Y\|_p.$$

Proof. We obviously have

(*) $$L := \int_{c=0}^\infty p c^{p-1} P(X \ge c)\, dc \le \int_{c=0}^\infty p c^{p-2} E(Y; X \ge c)\, dc =: R.$$

Using Fubini's Theorem with non-negative integrands, we obtain

$$L = \int_{c=0}^\infty \Big( \int_\Omega I_{\{X \ge c\}}(\omega) P(d\omega) \Big) p c^{p-1}\, dc = \int_\Omega \Big( \int_{c=0}^{X(\omega)} p c^{p-1}\, dc \Big) P(d\omega) = E(X^p).$$

Exactly similarly, we find that

$$R = E(q X^{p-1} Y).$$

We apply Hölder's inequality to conclude that

(**) $$E(X^p) \le E(q X^{p-1} Y) \le q \|Y\|_p \|X^{p-1}\|_q.$$

Suppose that $\|Y\|_p < \infty$, and suppose for now that $\|X\|_p < \infty$ also. Then since $(p-1)q = p$, we have

$$\|X^{p-1}\|_q = E(X^p)^{1/q},$$

so (**) implies that $\|X\|_p \le q \|Y\|_p$. For general $X$, note that the hypothesis remains true for $X \wedge n$. Hence $\|X \wedge n\|_p \le q \|Y\|_p$ for all $n$, and the result follows using (MON). □
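The Fubini step in the proof of Lemma 14.10 is the identity $\int_0^\infty p c^{p-1} P(X \ge c)\, dc = E(X^p)$. For a random variable with finitely many values the integral can be evaluated exactly, because $c \mapsto P(X \ge c)$ is a step function. The sketch below is an illustration under that assumption. The function names are hypothetical and $p$ is taken to be a positive integer so that the arithmetic stays exact.

```python
from fractions import Fraction


def layer_cake(values, probs, p: int) -> Fraction:
    """Exact value of the integral of p c^(p-1) P(X >= c) dc over c in (0, infinity)
    for a non-negative RV X with P(X = values[i]) = probs[i]."""
    total, prev = Fraction(0), Fraction(0)
    for x in sorted(set(values)):
        # On the interval (prev, x], P(X >= c) is constant and equal to P(X >= x).
        tail = sum(pr for v, pr in zip(values, probs) if v >= x)
        # The integral of p c^(p-1) over (prev, x] equals x^p - prev^p.
        total += tail * (Fraction(x) ** p - prev ** p)
        prev = Fraction(x)
    return total


def moment(values, probs, p: int) -> Fraction:
    """E(X^p) computed directly from the distribution."""
    return sum(pr * Fraction(v) ** p for v, pr in zip(values, probs))


if __name__ == "__main__":
    values = [0, 1, 3]
    probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
    print(layer_cake(values, probs, 3), moment(values, probs, 3))
```

Both computations give the same rational number, as the identity requires. The general case follows from this one by monotone approximation.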
14.11. Doob's $\mathcal{L}^p$ inequality

THEOREM

▶▶(a) Let $p > 1$ and define $q$ so that $p^{-1} + q^{-1} = 1$. Let $Z$ be a non-negative submartingale bounded in $\mathcal{L}^p$, and define (this is standard notation)

$$Z^* := \sup_k Z_k.$$

Then $Z^* \in \mathcal{L}^p$, and indeed

(*) $$\|Z^*\|_p \le q \sup_r \|Z_r\|_p.$$

The submartingale $Z$ is therefore dominated by the element $Z^*$ of $\mathcal{L}^p$. As $n \to \infty$, $Z_\infty := \lim Z_n$ exists a.s. and in $\mathcal{L}^p$ and

$$\|Z_\infty\|_p = \sup_r \|Z_r\|_p = \uparrow \lim_r \|Z_r\|_p.$$

▶▶(b) If $Z$ is of the form $|M|$, where $M$ is a martingale bounded in $\mathcal{L}^p$, then $M_\infty := \lim M_n$ exists a.s. and in $\mathcal{L}^p$, and of course $Z_\infty = |M_\infty|$ a.s.

Proof. For $n \in \mathbb{Z}^+$, define $Z_n^* := \sup_{k \le n} Z_k$. From Doob's Submartingale Inequality 14.6(a) and Lemma 14.10 we see that

$$\|Z_n^*\|_p \le q \|Z_n\|_p \le q \sup_r \|Z_r\|_p.$$

Property (*) now follows from the Monotone-Convergence Theorem. Since $(-Z)$ is a supermartingale bounded in $\mathcal{L}^p$, and therefore in $\mathcal{L}^1$, we know that $Z_\infty := \lim Z_n$ exists a.s. However,

$$|Z_n - Z_\infty|^p \le (2Z^*)^p \in \mathcal{L}^1,$$

so that (DOM) shows that $Z_n \to Z_\infty$ in $\mathcal{L}^p$. Jensen's inequality shows that $\|Z_r\|_p$ is non-decreasing in $r$, and all the rest is straightforward. □
14.12. Kakutani's theorem on 'product' martingales

Let $X_1, X_2, \ldots$ be independent non-negative RVs, each of mean 1. Define $M_0 := 1$, and, for $n \in \mathbb{N}$, let

$$M_n := X_1 X_2 \ldots X_n.$$

Then $M$ is a non-negative martingale, so that

$$M_\infty := \lim M_n \text{ exists a.s.}$$

The following five statements are equivalent:

(i) $E(M_\infty) = 1$,
(ii) $M_n \to M_\infty$ in $\mathcal{L}^1$;
(iii) $M$ is UI;
(iv) $\prod a_n > 0$, where $0 < a_n := E(X_n^{1/2}) \le 1$,
(v) $\sum (1 - a_n) < \infty$.

If one (then every one) of the above five statements fails to hold, then

$$P(M_\infty = 0) = 1.$$

Remark. Something of the significance of this theorem is explained in Section 14.17.

Proof. That $a_n \le 1$ follows from Jensen's inequality. That $a_n > 0$ is obvious.

First, suppose that statement (iv) holds. Then define

(*) $$N_n = \frac{X_1^{1/2}}{a_1} \frac{X_2^{1/2}}{a_2} \cdots \frac{X_n^{1/2}}{a_n}.$$

Then $N$ is a martingale for the same reason that $M$ is. See (10.4,b). We have

$$E N_n^2 = 1/(a_1 a_2 \ldots a_n)^2 \le 1 \Big/ \Big(\prod a_k\Big)^2 < \infty,$$

so that $N$ is bounded in $\mathcal{L}^2$. By Doob's $\mathcal{L}^2$ inequality,

$$E\Big(\sup_n |M_n|\Big) \le E\Big(\sup_n |N_n|^2\Big) \le 4 \sup_n E|N_n^2| < \infty,$$

so that $M$ is dominated by $M^* := \sup_n |M_n| \in \mathcal{L}^1$. Hence $M$ is UI and properties (i)-(iii) hold.

Now consider the case when $\prod a_n = 0$. Define $N$ as at (*). Since $N$ is a non-negative martingale, $\lim N_n$ exists a.s. But since $\prod a_n = 0$, we are forced to conclude that $M_\infty = 0$, a.s.

The equivalence of (iv) and (v) is known to us from (4.3). The theorem is proved. □
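Condition (iv) of Kakutani's theorem is easy to evaluate in a concrete family. The sketch below rests on an assumption that is not in the text: take $X_n = \exp(\sigma_n Z_n - \tfrac{1}{2}\sigma_n^2)$ with $Z_n$ independent N(0,1), which has mean 1. The Gaussian moment generating function $E e^{tZ} = e^{t^2/2}$ then gives $a_n = E(X_n^{1/2}) = \exp(-\sigma_n^2/8)$, so $\prod a_n > 0$ exactly when $\sum \sigma_n^2 < \infty$. The function names are hypothetical.

```python
import math


def a_value(sigma: float) -> float:
    """a = E(X^(1/2)) for X = exp(sigma * Z - sigma^2 / 2) with Z standard normal.
    By the Gaussian moment generating function, this equals exp(-sigma^2 / 8)."""
    return math.exp(-sigma * sigma / 8.0)


def partial_product(sigmas) -> float:
    """Product of a_1 ... a_n for the given sequence of sigma values."""
    result = 1.0
    for sigma in sigmas:
        result *= a_value(sigma)
    return result


if __name__ == "__main__":
    # Summable case: sigma_n = 1 / n, so the product stays away from 0 (M is UI).
    print(partial_product(1.0 / n for n in range(1, 2001)))
    # Non-summable case: sigma_n = 1, so the product tends to 0 (M_infinity = 0 a.s.).
    print(partial_product(1.0 for _ in range(200)))
```

In the first case the partial products decrease to $\exp(-\pi^2/48) \approx 0.814$. In the second they are $\exp(-n/8)$ and collapse to zero, so the theorem forces $M_\infty = 0$ almost surely even though every $M_n$ has mean 1.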
14.13. The Radon-Nikodým theorem

Martingale theory yields an intuitive - and 'constructive' - proof of the Radon-Nikodým theorem. We are guided by Meyer (1966).

We begin with a special case.

THEOREM

▶▶(I) Suppose that $(\Omega, \mathcal{F}, P)$ is a probability triple in which $\mathcal{F}$ is separable in that

$$\mathcal{F} = \sigma(F_n : n \in \mathbb{N})$$

for some sequence $(F_n)$ of subsets of $\Omega$. Suppose that $Q$ is a finite measure on $(\Omega, \mathcal{F})$ which is absolutely continuous relative to $P$ in that

(a) for $F \in \mathcal{F}$, $P(F) = 0 \Rightarrow Q(F) = 0$.

Then there exists $X$ in $\mathcal{L}^1(\Omega, \mathcal{F}, P)$ such that $Q = XP$ (see Section 5.14) in that

$$Q(F) = \int_F X\, dP = E(X; F), \qquad \forall F \in \mathcal{F}.$$

The variable $X$ is called a version of the Radon-Nikodým derivative of $Q$ relative to $P$ on $(\Omega, \mathcal{F})$. Two such versions agree a.s. We write

$$\frac{dQ}{dP} = X \text{ on } \mathcal{F}, \quad \text{a.s.}$$

Remark. Most of the $\sigma$-algebras we have encountered are separable. (The $\sigma$-algebra of Lebesgue-measurable subsets of $[0,1]$ is not.)

Proof. With the method of Section 13.1(a) in mind, you can prove that property (a) implies that

(b) given $\varepsilon > 0$, there exists $\delta > 0$ such that, for $F \in \mathcal{F}$,

$$P(F) < \delta \;\Rightarrow\; Q(F) < \varepsilon.$$

Next, define

$$\mathcal{F}_n := \sigma(F_1, F_2, \ldots, F_n).$$

Then for each $n$, $\mathcal{F}_n$ consists of the $2^{r(n)}$ possible unions of 'atoms'

$$A_{n,1}, \ldots, A_{n,r(n)}$$

of $\mathcal{F}_n$, an atom $A$ of $\mathcal{F}_n$ being an element of $\mathcal{F}_n$ such that $\emptyset$ is the only proper subset of $A$ which is again an element of $\mathcal{F}_n$. (Each atom $A$ will have the form

$$H_1 \cap H_2 \cap \ldots \cap H_n,$$

where each $H_i$ is either $F_i$ or $F_i^c$.)

Define a function $X_n : \Omega \to [0, \infty)$ as follows: if $\omega \in A_{n,k}$, then

$$X_n(\omega) := \begin{cases} 0 & \text{if } P(A_{n,k}) = 0, \\ Q(A_{n,k})/P(A_{n,k}) & \text{if } P(A_{n,k}) > 0. \end{cases}$$

Then $X_n \in \mathcal{L}^1(\Omega, \mathcal{F}_n, P)$ and

(c) $E(X_n; F) = Q(F)$, $\forall F \in \mathcal{F}_n$.

The variable $X_n$ is the obvious version of $dQ/dP$ on $(\Omega, \mathcal{F}_n)$. It is obvious from (c) that $X = (X_n : n \in \mathbb{Z}^+)$ is a martingale relative to the filtration $(\mathcal{F}_n : n \in \mathbb{Z}^+)$, and since this martingale is non-negative,

$$X_\infty := \lim X_n \text{ exists, a.s.}$$

Let $\varepsilon > 0$, choose $\delta$ as at (b), and let $K \in (0, \infty)$ be such that

$$K^{-1} Q(\Omega) < \delta.$$

Then

$$P(X_n > K) \le K^{-1} E(X_n) = K^{-1} Q(\Omega) < \delta,$$

so that

$$E(X_n; X_n > K) = Q(X_n > K) < \varepsilon.$$

The martingale $X$ is therefore UI, so that

$$X_n \to X_\infty \text{ in } \mathcal{L}^1.$$

It now follows from (c) that the measures

$$F \mapsto E(X_\infty; F) \quad \text{and} \quad F \mapsto Q(F)$$

agree on the $\pi$-system $\bigcup \mathcal{F}_n$, so that they agree on $\mathcal{F}$. All that remains is the proof of uniqueness, which is now standard for us. □
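The construction of $X_n$ on the atoms of $\mathcal{F}_n$ in the proof of Part I can be carried out explicitly on a finite sample space. The sketch below is an illustration, not part of the original text. The six-point space, the measures `P` and `Q`, the generating sets and all function names are hypothetical choices. It builds the atoms as membership patterns $H_1 \cap \ldots \cap H_n$, sets $X_n = Q(A)/P(A)$ on each atom $A$ with $P(A) > 0$, and checks property (c).

```python
from fractions import Fraction


def atoms(omega, generators):
    """Atoms of sigma(F_1, ..., F_n): points sharing the same membership pattern."""
    groups = {}
    for w in omega:
        pattern = tuple(w in F for F in generators)  # H_i = F_i or its complement
        groups.setdefault(pattern, []).append(w)
    return [frozenset(group) for group in groups.values()]


def density_on_atoms(omega, generators, P, Q):
    """X_n(w) = Q(A) / P(A) on the atom A containing w (0 if P(A) = 0)."""
    X = {}
    for A in atoms(omega, generators):
        p_A = sum(P[w] for w in A)
        q_A = sum(Q[w] for w in A)
        value = q_A / p_A if p_A > 0 else Fraction(0)
        for w in A:
            X[w] = value
    return X


def expectation_on(X, P, F):
    """E(X; F) under P for a subset F of the sample space."""
    return sum(X[w] * P[w] for w in F)


if __name__ == "__main__":
    omega = range(6)
    # P puts no mass on the point 5, and Q is absolutely continuous relative to P.
    P = {w: Fraction(1, 5) for w in range(5)}
    P[5] = Fraction(0)
    Q = {0: Fraction(1, 10), 1: Fraction(3, 10), 2: Fraction(1, 5),
         3: Fraction(1, 5), 4: Fraction(1, 5), 5: Fraction(0)}
    F1, F2 = {0, 1, 2}, {1, 3, 5}
    X = density_on_atoms(omega, [F1, F2], P, Q)
    print(X)
    print(expectation_on(X, P, F1), sum(Q[w] for w in F1))
```

Adding further generating sets refines the atoms, and the resulting sequence $(X_n)$ is the martingale whose $\mathcal{L}^1$ limit is the Radon-Nikodým derivative.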
Remark. The familiarity of all of the arguments in the above proof emphasizes the close link between the Radon-Nikodým theorem and conditional expectation which is made explicit in Section 14.14.

Now for the next part of the theorem ...

(II) The assumption that $\mathcal{F}$ is separable can be dropped from Part I.

Once one has Part II, one can easily extend the result to the case when $P$ and $Q$ are $\sigma$-finite measures by partitioning $\Omega$ into sets on which both are finite.

Proving Part II of the theorem is a piece of 'abstract nonsense' based on the fact that $\mathcal{L}^1$ (or, more strictly, $L^1$) is a metric space, and in particular on the role of sequential convergence in metric spaces. You might well want to take Part II for granted and skip the remainder of this section.

Let Sep be the class of all separable sub-$\sigma$-algebras of $\mathcal{F}$. Part I shows that for $\mathcal{G} \in \mathrm{Sep}$, there exists $X_\mathcal{G}$ in $\mathcal{L}^1(\Omega, \mathcal{G}, P)$ such that

$$dQ/dP = X_\mathcal{G} \text{ on } \mathcal{G}; \quad \text{equivalently,} \quad E(X_\mathcal{G}; G) = Q(G), \quad G \in \mathcal{G}.$$

We are going to prove that there exists $X$ in $\mathcal{L}^1(\Omega, \mathcal{F}, P)$ such that

(d) $X_\mathcal{G} \to X$ in $\mathcal{L}^1$

in the sense that given $\varepsilon > 0$, there exists $\mathcal{K}$ in Sep such that if $\mathcal{K} \subseteq \mathcal{G} \in \mathrm{Sep}$, then $\|X_\mathcal{G} - X\|_1 < \varepsilon$.

First, we note that it is enough to prove that

(e) $(X_\mathcal{G} : \mathcal{G} \in \mathrm{Sep})$ is Cauchy in $\mathcal{L}^1$

in the sense that given $\varepsilon > 0$, there exists $\mathcal{K}$ in Sep such that if $\mathcal{K} \subseteq \mathcal{G}_i \in \mathrm{Sep}$ for $i = 1, 2$, then

$$\|X_{\mathcal{G}_1} - X_{\mathcal{G}_2}\|_1 < \varepsilon.$$

Proof that (e) implies (d). Suppose that (e) holds. Choose $\mathcal{K}_n \in \mathrm{Sep}$ such that if $\mathcal{K}_n \subseteq \mathcal{G}_i \in \mathrm{Sep}$ for $i = 1, 2$, then

$$\|X_{\mathcal{G}_1} - X_{\mathcal{G}_2}\|_1 < 2^{-(n+1)}.$$

Let $\mathcal{H}(n) = \sigma(\mathcal{K}_1, \mathcal{K}_2, \ldots, \mathcal{K}_n)$. Then (see the proof of (6.10,a)) the limit $X := \lim X_{\mathcal{H}(n)}$ exists a.s. and in $\mathcal{L}^1$, and, indeed,

$$\|X - X_{\mathcal{H}(n)}\|_1 \le 2^{-n}.$$

Set $X := \limsup X_{\mathcal{H}(n)}$ for definiteness. For any $\mathcal{G} \in \mathrm{Sep}$ with $\mathcal{G} \supseteq \mathcal{H}(n)$ we have

$$\|X_\mathcal{G} - X_{\mathcal{H}(n)}\|_1 < 2^{-(n+1)}.$$

Result (d) follows. □

Proof of (e). If (e) is false, then (why?!) we can find $\varepsilon_0 > 0$ and a sequence $\mathcal{K}(0) \subseteq \mathcal{K}(1) \subseteq \ldots$ of elements of Sep such that

$$\|X_{\mathcal{K}(n)} - X_{\mathcal{K}(n+1)}\|_1 > \varepsilon_0, \qquad \forall n.$$

However, it is easily seen that $(X_{\mathcal{K}(n)})$ is a UI martingale relative to the filtration $(\mathcal{K}(n))$, so that $X_{\mathcal{K}(n)}$ converges in $\mathcal{L}^1$. The contradiction establishes that (e) is true. □

Proof of Part II of the theorem. We need only show that for $X$ as at (d) and for $F \in \mathcal{F}$, we have

$$E(X; F) = Q(F).$$

Choose $\mathcal{K}$ such that for $\mathcal{K} \subseteq \mathcal{G} \in \mathrm{Sep}$, $\|X_\mathcal{G} - X\|_1 < \varepsilon$. Then $\sigma(\mathcal{K}, F) \in \mathrm{Sep}$, where $\sigma(\mathcal{K}, F)$ is the smallest $\sigma$-algebra extending $\mathcal{K}$ and including $F$; and, by a familiar argument,

$$|E(X; F) - Q(F)| = |E(X - X_{\sigma(\mathcal{K}, F)}; F)| \le \|X - X_{\sigma(\mathcal{K}, F)}\|_1 < \varepsilon.$$

The result follows. □
14.14. The Radon-Nikodým theorem and conditional expectation

Suppose that $(\Omega, \mathcal{F}, P)$ is a probability triple, and that $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}$. Let $X$ be a non-negative element of $\mathcal{L}^1(\Omega, \mathcal{F}, P)$. Then

$$Q(G) := E(X; G), \qquad G \in \mathcal{G},$$

defines a finite measure $Q$ on $(\Omega, \mathcal{G})$. Moreover, $Q$ is clearly absolutely continuous relative to $P$ on $\mathcal{G}$, so that, by the Radon-Nikodým theorem, (a version...)

$$Y := dQ/dP \text{ exists on } (\Omega, \mathcal{G}).$$

Now $Y$ is $\mathcal{G}$-measurable, and

$$E(Y; G) = Q(G) = E(X; G), \qquad G \in \mathcal{G}.$$

Hence $Y$ is a version of the conditional expectation of $X$ given $\mathcal{G}$:

$$Y = E(X | \mathcal{G}), \quad \text{a.s.}$$

Remark. The right context for appreciating the close inter-relations between martingale convergence, conditional expectation, the Radon-Nikodým theorem, etc., is the geometry of Banach spaces.

14.15. Likelihood ratio, equivalent measures

Let $P$ and $Q$ be probability measures on $(\Omega, \mathcal{F})$ such that $Q$ is absolutely continuous relative to $P$, so that a version $X$ of $dQ/dP$ on $\mathcal{F}$ exists. We say that $X$ is (a version of) the likelihood ratio of $Q$ given $P$. Then $P$ is absolutely continuous relative to $Q$ if and only if $P(X > 0) = 1$, and then $X^{-1}$ is a version of $dP/dQ$. When each of $P$ and $Q$ is absolutely continuous relative to the other, then $P$ and $Q$ are said to be equivalent. Note that it then makes sense to define

$$\int_F \sqrt{dP\, dQ} := \int_F X^{1/2}\, dP = \int_F (X^{-1})^{1/2}\, dQ, \qquad F \in \mathcal{F};$$

and we can hope for a fuller understanding of what Kakutani achieved.
14.16. Likelihood ratio and conditional expectation

Let $(\Omega, \mathcal{F}, P)$ be a probability triple, and let $Q$ be a probability measure on $(\Omega, \mathcal{F})$ which is absolutely continuous relative to $P$ with density $X$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. What $\mathcal{G}$-measurable function yields $dQ/dP$ on $\mathcal{G}$? Yes, of course, it is $Y = E(X | \mathcal{G})$, $E$ denoting $P$-expectation, for, yet again, (modulo versions)

$$E(Y; G) = E(X; G) = Q(G) \quad \text{for } G \in \mathcal{G}.$$

Hence, if $\{\mathcal{F}_n\}$ is a filtration of $(\Omega, \mathcal{F})$, then the likelihood ratios

(*) $(dQ/dP \text{ on } \mathcal{F}_n) = E(X | \mathcal{F}_n)$

form a UI martingale. (This is of course why the martingale proof of the Radon-Nikodým theorem was bound to succeed!) Here and in the next two sections, we are dropping the 'a.s.' qualifications on such statements as (*): we have outgrown them.
14.17. Kakutani's Theorem revisited; consistency of LR test

Let $\Omega = \mathbb{R}^{\mathbb{N}}$, $X_n(\omega) = \omega_n$, and define the $\sigma$-algebras

$$\mathcal{F} = \sigma(X_k : k \in \mathbb{N}), \qquad \mathcal{F}_n = \sigma(X_k : 1 \le k \le n).$$

Suppose that for each $n$, $f_n$ and $g_n$ are everywhere positive probability density functions on $\mathbb{R}$ and let $r_n(x) := g_n(x)/f_n(x)$. Let $P$ [respectively, $Q$] be the unique measure on $(\Omega, \mathcal{F})$ which makes the variables $X_n$ independent, $X_n$ having probability density function $f_n$ [respectively, $g_n$]. Clearly, but you should prove this,

$$M_n := dQ/dP = Y_1 Y_2 \ldots Y_n \text{ on } \mathcal{F}_n,$$

where $Y_n = r_n(X_n)$. Note that the variables $(Y_n : n \in \mathbb{N})$ are independent under $P$ and that each has $P$-mean 1. For any of a multitude of familiar reasons, $M$ is a martingale.

Now if $Q$ is absolutely continuous relative to $P$ on $\mathcal{F}$ with $dQ/dP = \xi$ on $\mathcal{F}$, then $M_n = E(\xi | \mathcal{F}_n)$, and $M$ is UI. Conversely, if $M$ is UI, then $M_\infty$ exists (a.s., $P$) and

$$E(M_\infty | \mathcal{F}_n) = M_n, \qquad \forall n.$$

But then the probability measures

$$F \mapsto Q(F) \quad \text{and} \quad F \mapsto E(M_\infty; F)$$

agree on the $\pi$-system $\bigcup \mathcal{F}_n$ and so agree on $\mathcal{F}$, so that $Q$ is absolutely continuous relative to $P$ on $\mathcal{F}$ with $dQ/dP = M_\infty$ on $\mathcal{F}$. Thus $Q$ is absolutely continuous relative to $P$ on $\mathcal{F}$ if and only if $M$ is UI.

Kakutani's Theorem therefore implies that $Q$ is equivalent to $P$ on $\mathcal{F}$ if and only if

$$\prod E(Y_n^{1/2}) = \prod \int_{\mathbb{R}} \sqrt{f_n(x) g_n(x)}\, dx > 0,$$

equivalently if

(*) $$\sum \int_{\mathbb{R}} \Big\{\sqrt{f_n(x)} - \sqrt{g_n(x)}\Big\}^2 dx < \infty;$$

and that then $P$ is also absolutely continuous relative to $Q$.

Suppose now that the $X_n$ are identically distributed independent variables under each of $P$ and $Q$. Thus, for some probability density functions $f$ and $g$ on $\mathbb{R}$, we have $f_n = f$ and $g_n = g$ for all $n$. It is clear from (*) that $Q$ is equivalent to $P$ if and only if $f = g$ almost everywhere with respect to Lebesgue measure, in which case $Q = P$. Moreover, Kakutani's Theorem also tells us that if $Q \ne P$, then $M_n \to 0$ (a.s., $P$) and this is exactly the consistency of the Likelihood-Ratio Test in Statistics.
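The quantity $\int \sqrt{f(x) g(x)}\, dx$ appearing in Section 14.17 can be computed in a simple case. The sketch below is illustrative and makes choices that are not in the text: $f$ is the N(0,1) density, $g$ is the N($\mu$,1) density, and the integral is approximated by a midpoint rule on $[-12, 12]$. Completing the square gives the closed form $\exp(-\mu^2/8)$, which the code compares against. The function names and integration parameters are hypothetical.

```python
import math


def normal_pdf(x: float, mean: float = 0.0) -> float:
    """Density of the normal distribution with the given mean and unit variance."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def affinity(mu: float, lo: float = -12.0, hi: float = 12.0, steps: int = 24000) -> float:
    """Midpoint-rule approximation of the integral of sqrt(f(x) g(x)) dx,
    with f the N(0, 1) density and g the N(mu, 1) density."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += math.sqrt(normal_pdf(x) * normal_pdf(x, mu))
    return total * dx


if __name__ == "__main__":
    for mu in (0.0, 0.5, 1.0, 2.0):
        print(mu, affinity(mu), math.exp(-mu * mu / 8.0))
```

In the IID setting the product in Kakutani's criterion is the $n$-th power of this number, so it tends to 0 as soon as $\mu \ne 0$. This is the computation behind the statement that $M_n \to 0$ when $Q \ne P$.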
14.18. Note on Hardy spaces, etc. (prestissimo!)

We have seen in this chapter that for many purposes, the class of UI martingales is a natural one. The appendix to this chapter, on the Optional-Sampling Theorem, provides further evidence of this.

However, what we might wish to be true is not always true for UI martingales. For example, if $M$ is a UI martingale and $C$ is a (uniformly) bounded previsible process, then the martingale $C \bullet M$ need not be bounded in $\mathcal{L}^1$. (Even so, $C \bullet M$ does converge a.s.!)

For many parts of the more advanced theory, one uses the 'Hardy' space $\mathcal{H}_0^1$ of martingales $M$ null at 0 for which one (then each) of the following equivalent conditions holds:

(a) $M^* := \sup |M_n| \in \mathcal{L}^1$,

(b) $[M]_\infty^{1/2} \in \mathcal{L}^1$, where $[M]_n := \sum_{k=1}^n (M_k - M_{k-1})^2$ and $[M]_\infty = \uparrow \lim [M]_n$.

By a special case of a celebrated Burkholder-Davis-Gundy theorem, there exist absolute constants $c_p$, $C_p$ ($1 \le p < \infty$) such that

(c) $$c_p \big\|[M]_\infty^{1/2}\big\|_p \le \|M^*\|_p \le C_p \big\|[M]_\infty^{1/2}\big\|_p \qquad (1 \le p < \infty).$$

The space $\mathcal{H}_0^1$ is obviously sandwiched between the union of the spaces of martingales bounded in $\mathcal{L}^p$ ($p > 1$) and the space of UI martingales. Its identification as the right intermediate space has proved very important. Its name derives from its important links with complex analysis.

Proof of the B-D-G inequality or of the equivalence of (a) and (b) is too difficult to give here. But we can take a very quick look at the relevance of (b) to the $C \bullet M$ problem. First, (b) makes it clear that

(d) if $M \in \mathcal{H}_0^1$ and $C$ is a bounded previsible process, then $C \bullet M \in \mathcal{H}_0^1$.

We shall now see that, in a sense, this is 'best possible'.

Suppose that we have a martingale $M$ null at 0 and a (bounded) previsible process $\varepsilon = (\varepsilon_k : k \in \mathbb{N})$ where the $\varepsilon_k$ are IID RVs with $P(\varepsilon_k = \pm 1) = \tfrac{1}{2}$, and where $\varepsilon$ and $M$ are independent. We want to show that

(e) $M \in \mathcal{H}_0^1$ if (as well as only if) $\varepsilon \bullet M$ is bounded in $\mathcal{L}^1$.

We run into no difficulties of 'regularity' if we condition on $M$:

$$E|(\varepsilon \bullet M)_n| = E\, E\{|(\varepsilon \bullet M)_n| \,|\, \sigma(M)\} \ge 3^{-1/2} E\big([M]_n^{1/2}\big).$$

And where did the last inequality appear from? Let $(a_k : k \in \mathbb{N})$ be a sequence of real numbers. Think of $a_k$ as $M_k - M_{k-1}$ when $M$ is known. Define

$$X_k := a_k \varepsilon_k, \qquad W_n := X_1 + \cdots + X_n, \qquad v_n := E(W_n^2) = \sum_{k=1}^n a_k^2.$$

Then (see Section 7.2)

$$E(W_n^4) = \sum_k a_k^4 + 6 \sum\sum_{i<j} a_i^2 a_j^2,$$

so that, certainly, $E(W_n^4) \le 3 v_n^2$. On combining this fact with Hölder's inequality in the form

$$v_n = E(W_n^2) \le \big\| |W_n|^{2/3} \big\|_{3/2} \big\| |W_n|^{4/3} \big\|_3 = (E|W_n|)^{2/3} (E W_n^4)^{1/3},$$

we obtain the special case of Khinchine's inequality we need:

$$E(|W_n|) \ge 3^{-1/2} v_n^{1/2}.$$

For more on the topics in this section, see Chow and Teicher (1978), Dellacherie and Meyer (1980), Doob (1981), Durrett (1984). The first of these is accessible to the reader of this book; the others are more advanced.
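The special case of Khinchine's inequality derived in Section 14.18, $E|W_n| \ge 3^{-1/2} v_n^{1/2}$ with $W_n = \sum a_k \varepsilon_k$ and $v_n = \sum a_k^2$, can be checked exactly for short coefficient sequences by averaging over all $2^n$ sign patterns. The sketch below is an illustration, not part of the original text. The function name and the sample coefficients are hypothetical. It also reports $E(W_n^4)$ so that the intermediate bound $E(W_n^4) \le 3 v_n^2$ can be seen.

```python
import math
from itertools import product


def khinchine_quantities(a):
    """For W = sum a_k * eps_k with IID fair signs eps_k, return
    (E|W|, E(W^4), v) where v = sum a_k^2, by averaging over all sign patterns."""
    n = len(a)
    abs_total, fourth_total = 0.0, 0.0
    for signs in product((-1, 1), repeat=n):
        w = sum(s * x for s, x in zip(signs, a))
        abs_total += abs(w)
        fourth_total += w ** 4
    v = sum(x * x for x in a)
    return abs_total / 2 ** n, fourth_total / 2 ** n, v


if __name__ == "__main__":
    for a in ([1, 1], [3, 1, 2, 5], [1] * 10):
        mean_abs, fourth, v = khinchine_quantities(a)
        print(a, mean_abs, math.sqrt(v / 3.0), fourth, 3.0 * v * v)
```

For $a = (1, 1)$ the values are $E|W_2| = 1$ against the lower bound $\sqrt{2/3} \approx 0.816$, so the constant $3^{-1/2}$ is not sharp but is of the right order.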
Chapter 15

Applications

15.0. Introduction - please read!

The purpose of this chapter is to give some indication of some of the ways in which the theory which we have developed can be applied to real-world problems. We consider only very simple examples, but at a lively pace!

In Sections 15.1-15.2, we discuss a trivial case of a celebrated result from mathematical economics, the Black-Scholes option-pricing formula. The formula was developed for a continuous-parameter (diffusion) model for stock prices; see, for example, Karatzas and Schreve (1988). We present an obvious discretization which also has many treatments in the literature. What needs to be emphasized is that in the discrete case, the result has nothing to do with probability, which is why the answer is completely independent of the underlying probability measure. The use of the 'martingale measure' P in Section 15.2 is nothing other than a device for expressing some simple algebra/combinatorics. But in the diffusion setting, where the algebra and combinatorics are no longer meaningful, the martingale-representation theorem and Cameron-Martin-Girsanov change-of-measure theorem provide the essential language. I think that this justifies my giving a 'martingale' treatment of something which needs only junior-school algebra.
Sections 15.3-15.5 indicate the further development of the martingale formulation of optimality in stochastic control, at which Exercise E10.2 gave a first look. We consider just one 'fun' example, the 'Mabinogion sheep problem'; but it is an example which illustrates rather well several techniques which may be utilized effectively in other contexts.

In Sections 15.6-15.9, we look at some simple problems of filtering: estimating in real-time processes of which only noisy observations can be made. This topic has important applications in engineering (look at the IEEE journals!), in medicine, and in economics. I hope that you will look further into this topic and into the important subject which develops when filtering is combined with stochastic-control theory. See, for example, Davis and Vintner (1985) and Whittle (1990).

Sections 15.10-15.12 consist of first reflections on the problems we encounter when we try to extend the martingale concept.

15.1. A trivial martingale-representation result

Let $S$ denote the two-point set $\{-1, 1\}$, let $\Sigma$ denote the set of all subsets of $S$, let $p \in (0,1)$, and let $\mu$ be the probability measure on $(S, \Sigma)$ with

$$\mu(\{1\}) = p = 1 - \mu(\{-1\}).$$

Let $N \in \mathbb{N}$. Define

$$(\Omega, \mathcal{F}, P) := (S, \Sigma, \mu)^N,$$

so that a typical element of $\Omega$ is

$$\omega = (\omega_1, \omega_2, \ldots, \omega_N), \qquad \omega_k \in \{-1, 1\}.$$

Define $\varepsilon_k : \Omega \to \mathbb{R}$ by $\varepsilon_k(\omega) := \omega_k$, so that $(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N)$ are IID RVs each with law $\mu$. For $0 \le n \le N$, define

$$Z_n := \sum_{k=1}^n (\varepsilon_k - 2p + 1),$$

$$\mathcal{F}_n := \sigma(Z_0, Z_1, \ldots, Z_n) = \sigma(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n).$$

Note that

$$E(\varepsilon_k) = 1 \cdot p + (-1)(1 - p) = 2p - 1.$$

We see that

(a) $Z = (Z_n : 0 \le n \le N)$ is a martingale (relative to $(\{\mathcal{F}_n : 0 \le n \le N\}, P)$).
LEMMA

If $M = (M_n : 0 \le n \le N)$ is a martingale (relative to $(\{\mathcal{F}_n : 0 \le n \le N\}, P)$), then there exists a unique previsible process $H$ such that

$$M = M_0 + H \bullet Z, \quad \text{that is,} \quad M_n = M_0 + \sum_{k=1}^n H_k (Z_k - Z_{k-1}).$$

Remark. Since $\mathcal{F}_0 = \{\emptyset, \Omega\}$, $M_0$ is constant on $\Omega$, and has to be the common value of the $E(M_n)$.

Proof. We simply construct $H$ explicitly. Because $M_n$ is $\mathcal{F}_n$ measurable,

$$M_n(\omega) = f_n(\varepsilon_1(\omega), \ldots, \varepsilon_n(\omega)) = f_n(\omega_1, \ldots, \omega_n)$$

for some function $f_n : \{-1, 1\}^n \to \mathbb{R}$. Since $M$ is a martingale, we have

$$0 = E(M_n - M_{n-1} \mid \mathcal{F}_{n-1})(\omega) = p f_n(\omega_1, \ldots, \omega_{n-1}, 1) + (1 - p) f_n(\omega_1, \ldots, \omega_{n-1}, -1) - f_{n-1}(\omega_1, \ldots, \omega_{n-1}).$$

Hence the expressions

(b1) $\dfrac{f_n(\omega_1, \ldots, \omega_{n-1}, 1) - f_{n-1}(\omega_1, \ldots, \omega_{n-1})}{2(1 - p)}$

and

(b2) $\dfrac{f_{n-1}(\omega_1, \ldots, \omega_{n-1}) - f_n(\omega_1, \ldots, \omega_{n-1}, -1)}{2p}$

are equal; and if we define $H_n(\omega)$ to be their common value, then $H$ is clearly previsible, and simple algebra verifies that $M = M_0 + H \bullet Z$, as required. You check that $H$ is unique. $\square$
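The construction in the proof can be checked by brute force on a small sample space. The Python sketch below takes an arbitrary function $g$ on $\{-1,1\}^N$, forms the martingale $M_n = E(g \mid \mathcal{F}_n)$, builds $H_n$ from the ratio whose numerator is $f_n(\ldots, 1) - f_{n-1}(\ldots)$ and whose denominator is $2(1-p)$, and confirms that $M_0 + (H \bullet Z)_N$ reproduces $g$ at every $\omega$. The helper names `cond_exp` and `representation_error`, and the particular $g$, are assumptions made for illustration.

```python
from itertools import product


def cond_exp(g, prefix, N, p):
    """f_n(prefix) = E[g | first n coordinates equal prefix] under the product law."""
    n = len(prefix)
    total = 0.0
    for tail in product((-1, 1), repeat=N - n):
        ups = sum(1 for t in tail if t == 1)
        weight = p ** ups * (1 - p) ** (len(tail) - ups)
        total += weight * g(prefix + tail)
    return total


def representation_error(g, N, p):
    """Largest |M_0 + (H . Z)_N - g| over all omega, with H built as in the proof."""
    worst = 0.0
    for omega in product((-1, 1), repeat=N):
        m = cond_exp(g, (), N, p)  # M_0, a constant
        for n in range(1, N + 1):
            prev = omega[:n - 1]
            f_prev = cond_exp(g, prev, N, p)
            # H_n depends only on omega_1..omega_{n-1}, so it is previsible.
            h_n = (cond_exp(g, prev + (1,), N, p) - f_prev) / (2 * (1 - p))
            m += h_n * (omega[n - 1] - 2 * p + 1)  # H_n * (Z_n - Z_{n-1})
        worst = max(worst, abs(m - g(omega)))
    return worst


def g(w):
    # An arbitrary terminal value M_N = g(omega); any function would do.
    return max(sum(w), 0) ** 2 + w[0] * w[-1]


err = representation_error(g, N=4, p=0.3)
print(err)  # should be zero up to floating-point rounding
```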
15.2. Option pricing; discrete Black-Scholes formula

Consider an economy in which there are two 'securities': bonds of fixed interest rate $r$, and stock, the value of which fluctuates randomly. Let $N$ be a fixed element of $\mathbb{N}$. We suppose that values of units of stock and of bond units change abruptly at times $1, 2, \ldots, N$. For $n = 0, 1, \ldots, N$, we write

$B_n = (1 + r)^n B_0$ for the value of 1 bond unit throughout the open time interval $(n, n+1)$,

$S_n$ for the value of 1 unit of stock throughout the open time interval $(n, n+1)$.

You start just after time 0 with a fortune of value $x$ made up of $A_0$ units of stock and $V_0$ of bond, so that

$$A_0 S_0 + V_0 B_0 = x.$$

Between times 0 and 1 you invest this in stocks and bonds, so that just before time 1, you have $A_1$ units of stock and $V_1$ of bond so that

$$A_1 S_0 + V_1 B_0 = x.$$

So, $(A_1, V_1)$ represents the portfolio you have as your 'stake on the first game'.
Just after time $n - 1$ (where $n \ge 1$) you have $A_{n-1}$ units of stock and $V_{n-1}$ units of bond with value

$$X_{n-1} = A_{n-1} S_{n-1} + V_{n-1} B_{n-1}.$$

By trading stock for bonds or conversely, you rearrange your portfolio between times $n - 1$ and $n$ so that just before time $n$, your fortune (still of value $X_{n-1}$ because we assume transaction costs to be zero) is described by

$$X_{n-1} = A_n S_{n-1} + V_n B_{n-1} \qquad (n \ge 1).$$

Your fortune just after time $n$ is given by

(a) $X_n = A_n S_n + V_n B_n \qquad (n \ge 0)$

and your change in fortune satisfies

(b) $X_n - X_{n-1} = A_n (S_n - S_{n-1}) + V_n (B_n - B_{n-1}).$

Now,

$$B_n - B_{n-1} = r B_{n-1}, \qquad S_n - S_{n-1} = R_n S_{n-1},$$

where $R_n$ is the random 'rate of interest of stock at time $n$'. We may now rewrite (b) as

$$X_n - X_{n-1} = r X_{n-1} + A_n S_{n-1} (R_n - r),$$

so that if we set

(c) $Y_n = (1 + r)^{-n} X_n,$

then, on multiplying $X_n = (1+r) X_{n-1} + A_n S_{n-1}(R_n - r)$ by $(1+r)^{-n}$,

(d) $Y_n - Y_{n-1} = (1 + r)^{-n} A_n S_{n-1} (R_n - r).$

Note that (c) shows $Y_n$ to be the discounted value of your fortune at time $n$, so that the evolution (d) is of primary interest.

Let $\Omega$, $\mathcal{F}$, $\varepsilon_n$ $(1 \le n \le N)$, $Z_n$ $(0 \le n \le N)$ and $\mathcal{F}_n$ $(0 \le n \le N)$ be as in Section 15.1. Note that no probability measure has been introduced.

We build a model in which each $R_n$ takes only values $a$, $b$ in $(-1, \infty)$, where

$$a < r < b,$$

by setting

(e) $R_n = \dfrac{a + b}{2} + \dfrac{b - a}{2}\, \varepsilon_n.$

But then

(f) $R_n - r = \frac12 (b - a)(\varepsilon_n - 2p + 1) = \frac12 (b - a)(Z_n - Z_{n-1}),$

where we now choose

(g) $p := \dfrac{r - a}{b - a}.$

Note that (d) and (f) together display $Y$ as a 'stochastic integral' relative to $Z$.
A European option is a contract made just after time 0 which will allow you to buy 1 unit of stock just after time $N$ at a price $K$; $K$ is the so-called striking price. If you have made such a contract, then just after time $N$, you will exercise the option if $S_N > K$ and will not if $S_N < K$. Thus, the value at time $N$ of such a contract is $(S_N - K)^+$. What should you pay for the option at time 0?

Black and Scholes provide an answer to this question which is based on the concept of a hedging strategy.

A hedging strategy with initial value $x$ for the described option is a portfolio management scheme $\{(A_n, V_n) : 1 \le n \le N\}$ where the processes $A$ and $V$ are previsible relative to $\{\mathcal{F}_n\}$, and where, with $X$ satisfying (a) and (b), we have for every $\omega$,

(h1) $X_0(\omega) = x$,

(h2) $X_n(\omega) \ge 0 \quad (0 \le n \le N)$,

(h3) $X_N(\omega) = (S_N(\omega) - K)^+$.

Anyone employing a hedging strategy will by appropriate portfolio management, and without going bankrupt, exactly duplicate the value of the option at time $N$.

Note. Though Black and Scholes insist that $X_n(\omega) \ge 0$, $\forall n$, $\forall \omega$, they (and we) do not insist that the processes $A$ and $V$ be positive. A negative value of $V$ for some $n$ amounts to borrowing at the fixed interest rate $r$. A negative value of $A$ corresponds to 'short-selling' stock, but after you have read the theorem, this may not worry you!
THEOREM

A hedging strategy with initial value $x$ exists if and only if

$$x = x_0 := E[(1 + r)^{-N} (S_N - K)^+],$$

where $E$ is the expectation for the measure $P$ of Section 15.1 with $p$ as at (g). There is a unique hedging strategy with initial value $x_0$, and it involves no short-selling: $A$ is never negative.

On the basis of this result, it is claimed that $x_0$ is the unique fair price at time 0 of the option.

Proof. In the definition of hedging strategy, there is nowhere any mention of an underlying probability measure. Because however of the 'for every $\omega$' requirement, we should consider only measures on $\Omega$ for which each point $\omega$ has positive mass. Of course, $P$ is such a measure.

Suppose now that a hedging strategy with initial value $x$ exists, and let $A, V, X, Y$ denote the associated processes. From (d) and (f),

$$Y = Y_0 + F \bullet Z,$$

where $F$ is the previsible process with

$$F_n = \tfrac12 (b - a)(1 + r)^{-n} A_n S_{n-1}.$$

Of course, $F$ is bounded because there are only finitely many combinations $(n, \omega)$. Thus $Y$ is a martingale under the $P$ measure, since $Z$ is; and since $Y_0 = x$ and $Y_N = (1 + r)^{-N} (S_N - K)^+$ by (c) and the definition of hedging strategy, we obtain

$$x = x_0.$$

(We did not use the property that $X \ge 0$.)

Now consider things afresh and define

$$Y_n := E((1 + r)^{-N} (S_N - K)^+ \mid \mathcal{F}_n).$$

Then $Y$ is a martingale, and by combining (f) with the martingale-representation result in Section 15.1, we see that for some unique previsible process $A$, (d) holds. Define

$$X_n := (1 + r)^n Y_n, \qquad V_n := (X_n - A_n S_n)/B_n.$$

Then (a) and (b) hold. Since $X_0 = x_0$ and $X_N = (S_N - K)^+$, the only thing which remains is to prove that $A$ is never negative. Because of the explicit formula (15.1,b1), this reduces to showing that

$$E[(S_N - K)^+ \mid S_{n-1}, S_n = (1 + b) S_{n-1}] \ge E[(S_N - K)^+ \mid S_{n-1}, S_n = (1 + a) S_{n-1}];$$

and this is intuitively obvious and may be proved by a simple computation on binomial coefficients.
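The theorem lends itself to a direct numerical check. The Python sketch below computes $x_0$ as a binomial sum under the measure with $p = (r-a)/(b-a)$, then runs the replicating portfolio along every path of up and down moves and records the worst replication error and the smallest stock holding. The function names, the choice of holding (difference of the two possible next option values divided by $S_{n-1}(b-a)$) and the numerical parameters are illustrative assumptions rather than anything prescribed by the text.

```python
from itertools import product
from math import comb


def option_price(S0, K, r, a, b, N):
    """x_0 = E[(1+r)^{-N} (S_N - K)^+] with up-probability p = (r - a)/(b - a)."""
    p = (r - a) / (b - a)
    total = 0.0
    for j in range(N + 1):  # j = number of moves with R_n = b
        SN = S0 * (1 + b) ** j * (1 + a) ** (N - j)
        total += comb(N, j) * p ** j * (1 - p) ** (N - j) * max(SN - K, 0.0)
    return total / (1 + r) ** N


def hedge_along_path(S0, K, r, a, b, N, path):
    """Follow the hedge along one path of +1/-1 moves; return (X_N, S_N, min A_n)."""
    X = option_price(S0, K, r, a, b, N)  # start with fortune x_0
    S = S0
    min_A = float("inf")
    for n in range(1, N + 1):
        # Value of the remaining option just after time n, for each possible move.
        up = option_price(S * (1 + b), K, r, a, b, N - n)
        down = option_price(S * (1 + a), K, r, a, b, N - n)
        A = (up - down) / (S * (b - a))  # stock units held between n-1 and n
        bond_value = X - A * S           # the rest of the fortune sits in bonds
        R = b if path[n - 1] == 1 else a
        X = A * S * (1 + R) + bond_value * (1 + r)
        S *= 1 + R
        min_A = min(min_A, A)
    return X, S, min_A


params = dict(S0=100.0, K=100.0, r=0.01, a=-0.05, b=0.07, N=4)
worst_error, smallest_A = 0.0, float("inf")
for path in product((-1, 1), repeat=params["N"]):
    X_N, S_N, min_A = hedge_along_path(path=path, **params)
    worst_error = max(worst_error, abs(X_N - max(S_N - params["K"], 0.0)))
    smallest_A = min(smallest_A, min_A)
print(option_price(**params), worst_error, smallest_A)
```

The replication error should vanish up to rounding on every path, and the smallest holding should be non-negative, in line with the 'no short-selling' assertion.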
15.3. The Mabinogion sheep problem

In the Tale of Peredur ap Efrawg in the very early Welsh folk tales, The Mabinogion (see Jones and Jones (1949)), there is a magical flock of sheep, some black, some white. We sacrifice poetry for precision in specifying its behaviour. At each of times 1, 2, 3, ... a sheep (chosen randomly from the entire flock, independently of previous events) bleats; if this bleating sheep is white, one black sheep (if any remain) instantly becomes white; if the bleating sheep is black, then one white sheep (if any remain) instantly becomes black. No births or deaths occur.

The controlled system

Suppose now that the system can be controlled in that just after time 0 and just after every magical transition time 1, 2, ..., any number of white sheep may be removed from the system. (White sheep may be removed on numerous occasions.) The object is to maximize the expected final number of black sheep.
Consider the following example of a policy:

Policy A: at each time of decision, do nothing if there are more black sheep than white sheep or if no black sheep remain; otherwise, immediately reduce the white population to one less than the black population.

The value function $V$ for Policy A is the function

$$V : \mathbb{Z}^+ \times \mathbb{Z}^+ \to [0, \infty),$$

where for $w, b \in \mathbb{Z}^+$, $V(w, b)$ denotes the expected final number of black sheep if one adopts Policy A and if there are $w$ white and $b$ black sheep at time 0. Then $V$ is uniquely specified by the fact that, for $w, b \in \mathbb{Z}^+$,

(a1) $V(0, b) = b$;

(a2) $V(w, b) = V(w - 1, b)$ whenever $w \ge b$ and $w > 0$;

(a3) $V(w, b) = \dfrac{w}{w + b} V(w + 1, b - 1) + \dfrac{b}{w + b} V(w - 1, b + 1)$ whenever $w < b$, $b > 0$ and $w > 0$.

It is almost tautological that if $W_n$ and $B_n$ denote the numbers of white and black sheep at time $n$, then, if we adopt Policy A, then, whatever the initial values of $W_0$ and $B_0$,

(b) $V(W_n, B_n)$ is a martingale relative to the natural filtration of $\{(W_n, B_n) : n \ge 0\}$.
(c) LEMMA

The following statements are true for $w, b \in \mathbb{Z}^+$:

(c1) $V(w, b) \ge V(w - 1, b)$ whenever $w > 0$,

(c2) $V(w, b) \ge \dfrac{w}{w + b} V(w + 1, b - 1) + \dfrac{b}{w + b} V(w - 1, b + 1)$ whenever $w > 0$ and $b > 0$.

Let us suppose that this Lemma is true. (It is proved in the next Section.) Then, for any policy whatsoever,

(d) $V(W_n, B_n)$ is a supermartingale.

The fact that $V(W_n, B_n)$ converges a.s. means that the system must a.s. end up in an absorbing state in which sheep are of one colour. But then $V(W_\infty, B_\infty)$ is just the final number of black sheep (by definition of $V$). Since $V(W_n, B_n)$ is a non-negative supermartingale, we have for deterministic $W_0, B_0$,

$$E\, V(W_\infty, B_\infty) \le V(W_0, B_0).$$

Hence, whatever the initial position, the expected final number of black sheep under any policy is no more than the expected final number of black sheep if Policy A is used. Thus

Policy A is optimal.

In Section 15.5, we prove the following result:

$$V(k, k) - \left(2k + \tfrac{\pi}{4} - \sqrt{\pi k}\right) \to 0 \quad \text{as } k \to \infty.$$

Thus if we start with 10 000 black sheep and 10 000 white sheep, we finish up with (about) 19 824 black sheep on average (over many 'runs').

Of course, the above argument worked because we had correctly guessed the optimal policy to begin with. In this subject area, one often has to make good guesses. Then one usually has to work rather hard to clinch results which correspond in more general situations to Lemma (c) and statement (d). You might find it quite an amusing exercise to prove these results for our special problem now, before reading on.

For a problem in economics which utilizes analogous ideas, see Davis and Norman (1990).
15.4. Proof of Lemma 15.3(c)

It will be convenient to define

(a) $v_k := V(k, k)$.

Everything hinges on the following results: for $1 \le c \le k$,

(b1) $V(k - c, k + c) = v_k + (2k - v_k)\, 2^{-(2k-2)} \displaystyle\sum_{j=k}^{k+c-1} \binom{2k-1}{j}$,

(b2) $V(k + 1 - c, k + c) = v_k + (2k + 1 - v_k)\, 2^{-(2k-1)} (1 + p_k)^{-1} \displaystyle\sum_{j=k}^{k+c-1} \binom{2k}{j}$,

which simply reflect (15.3,a3) together with the 'boundary conditions':

(c) $V(k, k) = v_k$, $\quad V(0, 2k) = 2k$, $\quad V(k + 1, k) = v_k$, $\quad V(0, 2k + 1) = 2k + 1$.

Now, from (15.3,a2),

$$v_{k+1} = V(k + 1, k + 1) = V(k, k + 1),$$

and hence, from (b2) with $c = 1$, we find that

(d) $v_{k+1} = \dfrac{1 - p_k}{1 + p_k}\, v_k + \dfrac{2 p_k}{1 + p_k} (2k + 1)$,

where $p_k$ is the chance of obtaining $k$ heads and $k$ tails in $2k$ tosses of a fair coin:

$$p_k = 2^{-2k} \binom{2k}{k}.$$

Result (d) is the key to proving things by induction.
Proof of result (15.3,c1). From (15.3,a2), result (15.3,c1) is automatically true when $w \ge b$. Hence, we need only establish the result when $w < b$. Now if $w < b$ and $w + b$ is odd, then

$$(w, b) = (k + 1 - c, k + c)$$

for some $k$ and some $c$ with $1 \le c \le k$. But formulae (b) show that it is enough to show that for $1 \le a \le k$,

$$(2k + 1 - v_k)\, 2^{-(2k-1)} (1 + p_k)^{-1} \binom{2k}{k + a - 1} \ge (2k - v_k)\, 2^{-(2k-2)} \binom{2k-1}{k + a - 1};$$

and since

$$\binom{2k}{k + a - 1} \Big/ \binom{2k}{k} \ge \binom{2k-1}{k + a - 1} \Big/ \binom{2k-1}{k},$$

we need only establish the case when $a = 1$:

$$(2k + 1 - v_k)\, 2^{-(2k-1)} (1 + p_k)^{-1} \binom{2k}{k} \ge (2k - v_k)\, 2^{-(2k-2)} \binom{2k-1}{k},$$

which reduces to

(f) $v_k \ge 2k - p_k^{-1}$.

But property (f) follows by induction from (d) using only the fact that $p_k$ is decreasing in $k$.

Proof for the case when $b + w$ is even may be achieved similarly.

Proof of result (15.3,c2). Because of (15.3,a3), the result (15.3,c2) is automatically true when $w < b$, so we need only establish it when $w \ge b$. In analogy with the reduction of the 'general $a$' case to the 'boundary case $a = 1$' in the proof of (15.3,c1), it is easily shown that it is sufficient to prove (15.3,c2) for the case when $(w, b) = (k + 1, k + 1)$ for some $k$. Formulae (b) reduce this problem to showing that

$$\{1 + (2k + 1) p_k\}\, v_k \le 2k (2k + 1) p_k,$$

and this follows by induction from (d) using only the fact that $(2k + 1) p_k$ is increasing in $k$. $\square$
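15.5. Proof of result (15.3,d)

Define

$$\alpha_k := v_k - 2k - \tfrac{\pi}{4} + (p_k)^{-1}.$$

Then, from (15.4,d),

$$\alpha_{k+1} = \frac{1 - p_k}{1 + p_k}\, \alpha_k + p_k c_k,$$

where Stirling's formula shows that $c_k \to 0$ as $k \to \infty$, so that given $\varepsilon > 0$, we can find $N$ so that

$$|c_k| < \varepsilon \quad \text{for } k \ge N.$$

Since $(1 - p_k)/(1 + p_k) \le 1 - p_k$, induction shows that for $k \ge N$,

$$|\alpha_{k+1}| \le (1 - p_k)(1 - p_{k-1}) \cdots (1 - p_N)\, |\alpha_N| + \varepsilon.$$

But, since $\sum p_k = \infty$, we have $\prod (1 - p_k) = 0$, and it is now clear that $\limsup |\alpha_{k+1}| \le \varepsilon$, so that $\alpha_k \to 0$.

Because of the accurate version of Stirling's formula:

$$n! = (2\pi n)^{1/2} \left(\frac{n}{e}\right)^{n} e^{\theta(n)/(12n)}, \qquad 0 < \theta(n) < 1,$$

we have

$$(p_k)^{-1} - \sqrt{\pi k} \to 0,$$

so that

$$v_k - \left(2k + \tfrac{\pi}{4} - \sqrt{\pi k}\right) \to 0,$$

as required.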
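The limit can be examined numerically. The Python sketch below assumes the one-step recursion $v_{k+1} = \{(1 - p_k) v_k + 2 p_k (2k+1)\}/(1 + p_k)$ with $v_0 = 0$ and $p_0 = 1$, which is what conditions (15.3,a1)-(15.3,a3) give for $v_k = V(k,k)$ (for instance $V(1,1) = 1$ and $V(2,2) = 7/3$ can be confirmed by hand). The function name `diagonal_values` is an illustrative choice.

```python
from math import pi, sqrt


def diagonal_values(k_max):
    """Return [v_0, ..., v_{k_max}] where v_k = V(k, k) under Policy A."""
    values = [0.0]
    p = 1.0  # p_0 = chance of 0 heads and 0 tails in 0 tosses
    for k in range(k_max):
        values.append(((1 - p) * values[k] + 2 * p * (2 * k + 1)) / (1 + p))
        p *= (2 * k + 1) / (2 * k + 2)  # p_{k+1} = p_k (2k+1)/(2k+2)
    return values


vals = diagonal_values(10000)
approx = 2 * 10000 + pi / 4 - sqrt(pi * 10000)
print(vals[10000], approx)  # both should be close to the quoted 19 824
```

The recursion runs in time linear in $k$, so the case of 10 000 sheep of each colour is immediate.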
We now take a quick look at filtering. The central idea combines Bayes' formula with a recursive property which is now illustrated by two examples.

15.6. Recursive nature of conditional probabilities

Example. Suppose that $A$, $B$, $C$ and $D$ are events (elements of $\mathcal{F}$) each with strictly positive probability. Let us write (for example) $ABC$ for $A \cap B \cap C$. Let us also introduce the notation

$$\mathcal{C}_A(B) := P(B \mid A) = P(AB)/P(A)$$

for conditional probabilities. The 'recursive property' in which we are interested is exemplified by

$$\mathcal{C}_{ABC}(D) = \mathcal{C}_{AB}(D \mid C) := \frac{\mathcal{C}_{AB}(CD)}{\mathcal{C}_{AB}(C)};$$

'if we want to find the conditional probability of $D$ given that $A$, $B$ and $C$ have occurred, we can assume that both $A$ and $B$ have occurred and find the $\mathcal{C}_{AB}$ probability of $D$ given $C$'.

Example. Suppose that $X$, $Y$, $Z$ and $T$ are RVs such that $(X, Y, Z, T)$ has a strictly positive joint pdf $f_{X,Y,Z,T}$ on $\mathbb{R}^4$: for $B \in \mathcal{B}^4$,

$$P\{(X, Y, Z, T) \in B\} = \int\!\!\int\!\!\int\!\!\int_B f_{X,Y,Z,T}(x, y, z, t)\, dx\, dy\, dz\, dt.$$

Then, of course, $(X, Y, Z)$ has joint pdf $f_{X,Y,Z}$ on $\mathbb{R}^3$, where

$$f_{X,Y,Z}(x, y, z) = \int_{\mathbb{R}} f_{X,Y,Z,T}(x, y, z, t)\, dt.$$

The formula

$$f_{T|X,Y,Z}(t \mid x, y, z) := \frac{f_{X,Y,Z,T}(x, y, z, t)}{f_{X,Y,Z}(x, y, z)}$$

defines a ('regular') conditional pdf of $T$ given $X, Y, Z$: for $B \in \mathcal{B}$, we have, with all dependence on $\omega$ indicated,

$$P(T \in B \mid X, Y, Z)(\omega) = E(I_B(T) \mid X, Y, Z)(\omega) = \int_B f_{T|X,Y,Z}(t \mid X(\omega), Y(\omega), Z(\omega))\, dt.$$

Similarly,

$$f_{T,Z|X,Y}(t, z \mid x, y) = \frac{f_{X,Y,Z,T}(x, y, z, t)}{f_{X,Y}(x, y)}.$$

The recurrence property is exemplified by

$$f_{T|X,Y,Z}(t \mid x, y, z) = \frac{f_{T,Z|X,Y}(t, z \mid x, y)}{f_{Z|X,Y}(z \mid x, y)}.$$
15.7. Bayes' formula for bivariate normal distributions

With a now-clear notation, we have for RVs $X$, $Y$ with strictly positive joint pdf $f_{X,Y}$ on $\mathbb{R}^2$,

(*) $f_{X|Y}(x \mid y) = \dfrac{f_{X,Y}(x, y)}{f_Y(y)} = \dfrac{f_X(x)\, f_{Y|X}(y \mid x)}{f_Y(y)}$.

Thus

(**) $f_{X|Y}(x \mid y) \propto f_X(x)\, f_{Y|X}(y \mid x)$,

the 'constant of proportionality' depending on $y$ but being determined by the fact that

$$\int_{\mathbb{R}} f_{X|Y}(x \mid y)\, dx = 1.$$

The meaning of the following Lemma is clarified within its proof.

LEMMA

(a) Suppose that $\mu, a, b \in \mathbb{R}$, that $U, W \in (0, \infty)$ and that $X$ and $Y$ are RVs such that

$$\mathcal{L}(X) = N(\mu, U), \qquad \mathcal{C}_X(Y) = N(a + bX, W).$$

Then

$$\mathcal{C}_Y(X) = N(\hat{X}, V),$$

where the number $V \in (0, \infty)$ and the RV $\hat{X}$ are determined as follows:

$$\frac{1}{V} = \frac{1}{U} + \frac{b^2}{W}, \qquad \frac{\hat{X}}{V} = \frac{\mu}{U} + \frac{b (Y - a)}{W}.$$

Proof. The absolute distribution of $X$ is $N(\mu, U)$, so that

$$f_X(x) = (2\pi U)^{-1/2} \exp\left\{-\frac{(x - \mu)^2}{2U}\right\}.$$

The conditional pdf of $Y$ given $X$ is the density of $N(a + bX, W)$, so that

$$f_{Y|X}(y \mid x) = (2\pi W)^{-1/2} \exp\left\{-\frac{(y - a - bx)^2}{2W}\right\}.$$

Hence, from (**),

$$\log f_{X|Y}(x \mid y) = c_1(y) - \frac{(x - \mu)^2}{2U} - \frac{(y - a - bx)^2}{2W} = c_2(y) - \frac{(x - \hat{x})^2}{2V},$$

where $1/V = 1/U + b^2/W$ and $\hat{x}/V = \mu/U + b(y - a)/W$. The result follows.

COROLLARY

(b) With the notation of the Lemma, we have

$$\|X - \hat{X}\|_2^2 = E\{(X - \hat{X})^2\} = V.$$

Proof. Since $\mathcal{C}_Y(X) = N(\hat{X}, V)$, we even have

$$E\{(X - \hat{X})^2 \mid Y\} = V, \quad \text{a.s.}$$
Let
A\",
Noisy observation
r/i,
of a singlerandom variable
RVs,
N(0,a2),
7/2,
\342\200\242 \342\200\242 \342\200\242 be independent
with
Civk)
\302\243(X)
= m^)-
Let
(c\342\200\236)
be
a sequence
of positive
real numbers,and let

Tn
Yf,=X+Ckr)k.
We regard
(T{YuY2,...,Yn).
each
Yi
as
a noisy
observation
Moo
of X.
We
know
that
Mn :=
One
E{X\\:Fn) ^
:=
E(X|ôo)
a.s.
and in
\302\2432.
interesting
when
question
is
mentioned that Moo
at (10.4,c) is:
it true
= X a.s.?^ or again,
is
a.s.
equal to an
Jôo-measurableRV?
..(15.8)
Chapter
15: Applications
167 law' given Fi, 1^2,

\342\200\242 \342\200\242 5^n\342\200\242,
Let us write
We
to signify C\342\200\236
'regular
conditional
have
Suppose
that it is
true
that
c\342\200\236-i(x)
N(i-\342\200\236_i,y\342\200\236-i)
where
is X\342\200\236-i
a linear
function
=
(0, oo). Then,since
\342\200\242 of Yi, y2, \342\200\242 Yn-i \342\200\242,
and
Vn-i
a constant in
X Y\342\200\236
we have c\342\200\236j]\342\200\236, =
C\342\200\236-i(Yn\\X)
N(X,cl).
From
the Lemma
/i
15.7(a) on bivariate normalswith

= Xn-i,
U = Vn-i
,a = 0,b = l,W =
=
cl,
we
have
Cn-iiX\\Y\342\200\236) -NiX\342\200\236,V\342\200\236),
where
\342\200\224 \342\200\224
4_ ^
_L ' c^
^^
V
in
^^-^ \342\200\224 ~ V
Xn
r2*
But the recursiveproperty
indicated
Section
15.6 shows that
C\342\200\236{X) Cn-l{X\\Y\342\200\236).
We
have
now proved
by induction that
Cn(X) =
Now,
Vn. N(X\342\200\236,V\342\200\236),
of course,
Mn =
However,
Xn and E{{X-Mny} = Vn.
ib=l
Our
martingale
M is \302\243^ bounded Moo =
and
so converges
only
in
\302\243^. We
now
see that
X a.3, if
and
if
^ c^^
= oo .
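The recursive update can be compared with its closed form in a few lines. In the Python sketch below the posterior mean and variance are updated one observation at a time, starting from mean 0 and variance $\sigma^2$ and using precision (inverse variance) addition, and the result is compared with the one-shot formulas. The function name `posterior` and the sample data are illustrative assumptions.

```python
def posterior(ys, cs, sigma2):
    """Update (posterior mean, posterior variance) of X one observation at a time.

    ys[k] is the observation X + cs[k] * eta_k; sigma2 is the prior variance of X.
    """
    x_hat, V = 0.0, sigma2
    for y, c in zip(ys, cs):
        V_new = 1.0 / (1.0 / V + 1.0 / c ** 2)        # precisions add
        x_hat = V_new * (x_hat / V + y / c ** 2)      # precision-weighted mean
        V = V_new
    return x_hat, V


ys = [0.8, 1.4, 0.3, 1.1]   # made-up observations
cs = [1.0, 2.0, 0.5, 3.0]   # noise scales c_k
sigma2 = 2.0

x_hat, V = posterior(ys, cs, sigma2)
V_closed = 1.0 / (1.0 / sigma2 + sum(1.0 / c ** 2 for c in cs))
x_closed = V_closed * sum(y / c ** 2 for y, c in zip(ys, cs))
print(x_hat, V, x_closed, V_closed)
```

Because the variance depends only on the $c_k$, it tends to 0 exactly when the sum of the $c_k^{-2}$ diverges, which is the criterion for recovering $X$.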
15.9. The Kalman-Bucy filter

The method used in the preceding three sections allows immediate derivation of the famous Kalman-Bucy filter.

Let $A$, $H$, $C$, $K$ and $g$ be real constants. Suppose that $X_0, \varepsilon_1, \varepsilon_2, \ldots, \eta_1, \eta_2, \ldots$ are independent RVs with

$$\mathcal{L}(\varepsilon_k) = \mathcal{L}(\eta_k) = N(0, 1), \qquad \mathcal{L}(X_0) = N(m, \sigma^2), \qquad Y_0 = 0.$$

The true state of a system at time $n$ is supposed given by $X_n$, where

(dynamics) $X_n - X_{n-1} = A X_{n-1} + H \varepsilon_n + g$.

However the process $X$ cannot be observed directly: we can only observe the process $Y$, where

(observation) $Y_n - Y_{n-1} = C X_n + K \eta_n$.

Just as in Section 15.8, we make the induction hypothesis that

$$\mathcal{C}_{n-1}(X_{n-1}) = N(\hat{X}_{n-1}, V_{n-1}),$$

where $\mathcal{C}_{n-1}$ signifies regular conditional law given $Y_1, Y_2, \ldots, Y_{n-1}$. Since

$$X_n = \alpha X_{n-1} + g + H \varepsilon_n, \quad \text{where } \alpha := 1 + A,$$

we have

$$\mathcal{C}_{n-1}(X_n) = N(\alpha \hat{X}_{n-1} + g,\ \alpha^2 V_{n-1} + H^2).$$

Also, since $Y_n = Y_{n-1} + C X_n + K \eta_n$, we have

$$\mathcal{C}_{n-1}(Y_n \mid X_n) = N(Y_{n-1} + C X_n,\ K^2).$$

Apply the bivariate-normal Lemma 15.7(a) to find that

$$\mathcal{C}_n(X_n) = \mathcal{C}_{n-1}(X_n \mid Y_n) = N(\hat{X}_n, V_n),$$

where

(KB1) $\dfrac{1}{V_n} = \dfrac{1}{\alpha^2 V_{n-1} + H^2} + \dfrac{C^2}{K^2}$,

(KB2) $\dfrac{\hat{X}_n}{V_n} = \dfrac{\alpha \hat{X}_{n-1} + g}{\alpha^2 V_{n-1} + H^2} + \dfrac{C (Y_n - Y_{n-1})}{K^2}$.

Equation (KB1) shows that $V_n = f(V_{n-1})$. Examination of the rectangular hyperbola which is the graph of $f$ shows that $V_n \to V_\infty$, the positive root of $V = f(V)$.
If one wishes to give a rigorous treatment of the K-B filter in continuous time, one is forced to use martingale and stochastic-integral techniques. See, for example, Rogers and Williams (1987) and references to filtering and control mentioned there. For more on the discrete-time situation, which is very important in practice, and for how filtering does link with stochastic control, see Davis and Vintner (1985) or Whittle (1990).

15.10. Harnesses entangled
The martingale concept is well adapted to processes evolving in time because (discrete) time naturally belongs to the ordered space $\mathbb{Z}^+$. The question arises: does the martingale property transfer in some natural way to processes parametrized by (say) $\mathbb{Z}^d$?

Let me first explain a difficulty described in Williams (1973) that arises with models in $\mathbb{Z}$ ($d = 1$) and in $\mathbb{Z}^2$, though we do not study the latter here.

Suppose that $(X_n : n \in \mathbb{Z})$ is a process such that each $X_n \in \mathcal{L}^1$ and that ('almost surely' qualifications will be dropped)

(a) $E(X_n \mid X_m : m \ne n) = \frac12 (X_{n-1} + X_{n+1}), \quad \forall n$.

For $m \in \mathbb{Z}$, define

$$\mathcal{G}_m = \sigma(X_k : k \le m), \qquad \mathcal{H}_m = \sigma(X_r : r \ge m).$$

The Tower Property shows that for $a, b \in \mathbb{Z}$ with $a < b$, we have for $a < r < b$,

$$E(X_r \mid \mathcal{G}_a, \mathcal{H}_b) = E\{E(X_r \mid X_s : s \ne r) \mid \mathcal{G}_a, \mathcal{H}_b\} = \tfrac12 E(X_{r-1} \mid \mathcal{G}_a, \mathcal{H}_b) + \tfrac12 E(X_{r+1} \mid \mathcal{G}_a, \mathcal{H}_b),$$

so that $r \mapsto E(X_r \mid \mathcal{G}_a, \mathcal{H}_b)$ is the linear interpolation

$$E(X_r \mid \mathcal{G}_a, \mathcal{H}_b) = \frac{b - r}{b - a} X_a + \frac{r - a}{b - a} X_b.$$

Hence, for $n \in \mathbb{Z}$ and $u \in \mathbb{N}$, we have, a.s.,

$$E(X_n \mid \mathcal{G}_{n-u}, \mathcal{H}_{n+1}) = \frac{X_{n-u}}{u + 1} + \frac{u X_{n+1}}{u + 1}.$$

Now, the $\sigma$-algebras $\sigma(\mathcal{G}_{n-u}, \mathcal{H}_{n+1})$ decrease as $u \uparrow \infty$. Because of Warning 4.12, we had better not claim that they decrease to $\sigma(\mathcal{G}_{-\infty}, \mathcal{H}_{n+1})$. Anyway, by the Downward Theorem, we see that the random variable

$$L := \lim_{u \downarrow -\infty} (X_u / u) \text{ exists (a.s.)}$$

and

$$E\Big(X_n \,\Big|\, \bigcap_u \sigma(\mathcal{G}_{n-u}, \mathcal{H}_{n+1})\Big) = X_{n+1} - L.$$

Hence, by the Tower Property,

(b) $E(X_n \mid L, \mathcal{H}_{n+1}) = X_{n+1} - L$,

whence we have a reversed-martingale property:

$$E(X_n - nL \mid L, \mathcal{H}_{n+1}) = X_{n+1} - (n + 1) L.$$

A further application of the Downward Theorem shows that

(c) $\Lambda := \lim_{n \uparrow \infty} (X_n - nL)$ exists (a.s.)

Hence

$$L = \lim_{u \uparrow \infty} (X_u / u).$$

By using the arguments which led to (b) in the reversed-time sense, we now obtain

(d) $E(X_{n+1} \mid L, \mathcal{G}_n) = X_n + L$.

From (b) and (d) and the Tower Property,

$$E(X_n + L \mid X_{n+1}) = X_{n+1}, \qquad E(X_{n+1} \mid X_n + L) = X_n + L.$$

Exercise E9.2 shows that $X_{n+1} = X_n + L$. Hence (almost surely)

$$X_n = nL + \Lambda, \quad \forall n \in \mathbb{Z},$$

so that (almost) all sample paths of $X$ are straight lines!

Hammersley (1966) suggested that any analogue of (a) should be called a harness property and that the type of result just obtained conveys the idea that every low-dimensional harness is a straitjacket!
15.11. Harnesses unravelled, 1

The reason that (15.10,a) rules out interesting models is that it is expressed in terms of the idea that each $X_n$ should be a random variable. What one should say is that each

$$X_n : \Omega \to \mathbb{R}$$

but then require only that differences

$$(X_r - X_s : r, s \in \mathbb{Z})$$

be RVs (that is, be $\mathcal{F}$-measurable), and that for $n, k \in \mathbb{Z}$ with $k \ne n$,

$$E(X_n - X_k \mid X_m - X_k : m \ne n) = \tfrac12 (X_{n-1} - X_k) + \tfrac12 (X_{n+1} - X_k).$$

I call this a difference harness in Williams (1973).

Easy exercise. Suppose that $(Y_n : n \in \mathbb{Z})$ are IID RVs in $\mathcal{L}^1$. Let $X_0$ be any function on $\Omega$. Define

$$X_n := \begin{cases} X_0 + \sum_{k=1}^{n} Y_k & \text{if } n \ge 0, \\ X_0 - \sum_{k=n+1}^{0} Y_k & \text{if } n < 0. \end{cases}$$

Thus, $X_n - X_{n-1} = Y_n$, $\forall n$. Prove that $X$ is a difference harness.
15.12. Harnesses unravelled, 2

In dimension $d \ge 3$, we do not need to use the 'difference-process' unravelling described in the preceding section. For $d \ge 3$, there is a non-trivial model, related both to Gibbs states in statistical mechanics and to quantum fields, such that each $X_n$ $(n \in \mathbb{Z}^d)$ is a RV and, for $n \in \mathbb{Z}^d$,

$$E(X_n \mid X_m : m \in \mathbb{Z}^d \setminus \{n\}) = (2d)^{-1} \sum_{u \in U} X_{n+u},$$

where $U$ is the set of the $2d$ unit vectors in $\mathbb{Z}^d$. See Williams (1973).

In addition to a fascinating etymological treatise on the terms 'martingale' and 'harness', Hammersley (1966) contains many important ideas on harnesses, anticipating later work on stochastic partial differential equations. Many interesting unsolved problems on various kinds of harness remain.
PART C: CHARACTERISTIC FUNCTIONS

Chapter 16

Basic Properties of CFs

Part C is merely the briefest account of the first stages of characteristic function theory. This theory is something very different in spirit from the work in Part B. Part B was about the sample paths of processes. Characteristic function theory is on the one hand part of Fourier-integral theory, and it is proper that it finds its way into that marvellous recent book, Körner (1988); see also the magical Dym and McKean (1972). On the other hand, characteristic functions do have an essential role in both probability and statistics, and I must include these few pages on them. For full treatment, see Chow and Teicher (1978) or Lukacs (1970).

Exercises indicate the analogous Laplace-transform method, and develop in full the method of moments for distributions on [0,1].
16.1. Definition

The characteristic function (CF) $\varphi = \varphi_X$ of a random variable $X$ is defined to be the map

$$\varphi : \mathbb{R} \to \mathbb{C}$$

(important: the domain is $\mathbb{R}$ not $\mathbb{C}$) defined by

$$\varphi(\theta) := E(e^{i\theta X}) = E\cos(\theta X) + iE\sin(\theta X).$$

Let $F := F_X$ be the distribution function of $X$, and let $\mu := \mu_X$ denote the law of $X$. Then

$$\varphi(\theta) := \int_{\mathbb{R}} e^{i\theta x}\, dF(x) := \int_{\mathbb{R}} e^{i\theta x}\, \mu(dx),$$

so that $\varphi$ is the Fourier transform of $\mu$, or the Fourier-Stieltjes transform of $F$. (We do not use the factor $(2\pi)^{-1/2}$ which is sometimes used in Fourier theory.) We often write $\varphi_F$ or $\varphi_\mu$ for $\varphi$.
16.2. Elementary properties

Let $\varphi = \varphi_X$ for a RV $X$. Then

(a) $\varphi(0) = 1$ (obviously);

(b) $|\varphi(\theta)| \le 1$, $\forall \theta$;

(c) $\theta \mapsto \varphi(\theta)$ is continuous on $\mathbb{R}$;

(d) $\varphi_{(-X)}(\theta) = \overline{\varphi_X(\theta)}$, $\forall \theta$;

(e) $\varphi_{aX+b}(\theta) = e^{ib\theta} \varphi_X(a\theta)$.

You can easily prove these properties. (Use (BDD) to establish (c).)

Note on differentiability (or lack of it). Standard analysis (see Theorem A16.1) implies that if $n \in \mathbb{N}$ and $E(|X|^n) < \infty$, then we may formally differentiate $\varphi(\theta) = E e^{i\theta X}$ $n$ times to obtain

$$\varphi^{(n)}(\theta) = E[(iX)^n e^{i\theta X}].$$

In particular, $\varphi^{(n)}(0) = i^n E(X^n)$. However, it is possible for $\varphi'(0)$ to exist when $E(|X|) = \infty$.

We shall see shortly that $\varphi$ can be the 'tent-function'

$$\varphi(\theta) = (1 - |\theta|) I_{[-1,1]}(\theta),$$

so that $\varphi$ need not be differentiable everywhere, and $\varphi$ can be 0 outside $[-1, 1]$.
16.3. Some uses of characteristic functions

Amongst uses of CFs are the following:

• to prove the Central Limit Theorem (CLT) and analogues,
• to calculate distributions of limit RVs,
• to prove the 'only if' part of the Three-Series Theorem,
• to obtain estimates on tail probabilities via saddle-point approximation,
• to prove such results as: if $X$ and $Y$ are independent, and $X + Y$ has a normal distribution, then both $X$ and $Y$ have normal distributions.

Only the first three of these uses are discussed in this book.
16.4. Three key results

(a) If $X$ and $Y$ are independent RVs, then

$$\varphi_{X+Y}(\theta) = \varphi_X(\theta)\, \varphi_Y(\theta), \quad \forall \theta.$$

Proof. This is just 'independence means multiply' again:

$$E e^{i\theta(X+Y)} = E\, e^{i\theta X} e^{i\theta Y} = E e^{i\theta X}\, E e^{i\theta Y}.$$

(b) $F$ may be reconstructed from $\varphi$. See Section 16.6 for a precise statement.

(c) 'Weak' convergence of distribution functions corresponds exactly to convergence of the corresponding CFs. See Section 18.1 for a precise statement.

The way in which these results are used in the proof of the Central Limit Theorem is as follows. Suppose that $X_1, X_2, \ldots$ are IID RVs each with mean 0 and variance 1. From (a) and (16.2,e), we see that if $S_n := X_1 + \cdots + X_n$, then

$$E \exp(i\theta S_n / \sqrt{n}) = \varphi_X(\theta / \sqrt{n})^n.$$

We shall obtain rigorous estimates which show that

$$\varphi_X(\theta / \sqrt{n})^n = \left\{1 - \tfrac12 \theta^2 / n + o(1/n)\right\}^n \to \exp(-\tfrac12 \theta^2), \quad \forall \theta.$$

Since $\theta \mapsto \exp(-\frac12 \theta^2)$ is the CF of the standard normal distribution (as we shall see shortly), it now follows from (b) and (c) that the distribution of $S_n / \sqrt{n}$ converges weakly to the distribution function $\Phi$ of the standard normal distribution $N(0, 1)$. In this case, this simply means that

$$P(S_n / \sqrt{n} \le x) \to \Phi(x), \quad x \in \mathbb{R}.$$
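The convergence of the CFs is easy to watch in a concrete case. The Python sketch below takes the $X_k$ to be symmetric signs, $P(X_k = \pm 1) = \frac12$ (an assumption made here purely for illustration), for which $\varphi_X(\theta) = \cos\theta$, and compares $\varphi_X(\theta/\sqrt{n})^n$ with $\exp(-\frac12\theta^2)$.

```python
from math import cos, exp, sqrt


def cf_scaled_sum(theta, n):
    """CF of S_n / sqrt(n) at theta when the X_k are IID symmetric +/-1 signs."""
    return cos(theta / sqrt(n)) ** n


theta = 1.5
for n in (10, 1000, 10 ** 6):
    print(n, cf_scaled_sum(theta, n), exp(-0.5 * theta ** 2))
```

The gap shrinks roughly like $1/n$, reflecting the next term in the expansion of $\log\cos$.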
16.5. Atoms

In regard to both (16.4,b) and (16.4,c), tidiness of results can be threatened by the presence of atoms.

If $P(X = c) > 0$, then the law $\mu$ of $X$ is said to have an atom at $c$, and the distribution function $F$ of $X$ has a discontinuity at $c$:

$$\mu(\{c\}) = F(c) - F(c-) = P(X = c).$$

Now $\mu$ can have at most $n$ atoms of mass at least $1/n$, so that the number of atoms of $\mu$ is at most countable.

It therefore follows that given $x \in \mathbb{R}$, there exists a sequence $(y_n)$ of reals with $y_n \downarrow x$ such that every $y_n$ is a non-atom of $\mu$ (equivalently a point of continuity of $F$); and then, by right-continuity of $F$, $F(y_n) \downarrow F(x)$.
16.6. Lévy's Inversion Formula

This theorem puts the fact that $F$ may be reconstructed from $\varphi$ in very explicit form. (Check that the theorem does imply that if $F$ and $G$ are distribution functions such that $\varphi_F = \varphi_G$ on $\mathbb{R}$, then $F = G$.)

THEOREM

Let $\varphi$ be the CF of a RV $X$ which has law $\mu$ and distribution function $F$. Then, for $a < b$,

(a) $\displaystyle \lim_{T \uparrow \infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta}\, \varphi(\theta)\, d\theta = \tfrac12 [F(b) + F(b-)] - \tfrac12 [F(a) + F(a-)]$.

Moreover, if $\int_{\mathbb{R}} |\varphi(\theta)|\, d\theta < \infty$, then $X$ has continuous probability density function $f$, and

(b) $\displaystyle f(x) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{-i\theta x} \varphi(\theta)\, d\theta$.

The 'duality' between (b) and the result

(c) $\displaystyle \varphi(\theta) = \int_{\mathbb{R}} e^{i\theta x} f(x)\, dx$

can be exploited as we shall see.

The proof of the theorem may be omitted on a first reading.
176
Chapter of the
16: Basic
Properties
u
of
CFs
(16.6)..
Proof
theorem. For u,v
eR
with
<v,
(d)
|e*'\"-e'\"|<|^-u|,
either from a
picture or since
I \\Ju
ie'*dt\\
a
<
I
Ju
f f \\ie''\\dt= Ju
Theorem
Idt.
allows
Let
0
a, 6 G R,
with
<
b.
Fubini's
us to
say that,
for
<T
< oo,
^ e-'\302\273\302\273
e-
(e)
-A J_
2w _rp
ie
<f{e)de
It
^Jr\\J-t~
id
de \\
/i{dx)
provided we show that

'' dO > fj.{dx)
Ct
^^Jr[j-
TI
that evenness
< oo.
id
However, inequality (d) shows the Next, we can exploit

of
Ct
^ (^
of the
\342\200\224 \302\253)^/7r,
so
that
(e) is
valid.
cosine function
and the oddness
the
sine
function
to obtain
îe(x-a)
(f)
I
27r
rT
_ îBix-b)
iO
iTTJ_ ./\342\200\224J\"
_
de
a\\T)
sgn(a:
- a)S{\\x
r 1 := < 0 I -1
sgn(x
- b)S{\\x -
b\\T)
where,
as usual.
if
\\i if
>
0,
sgn(a:)
x = X <
0, 0,
and S(U) :=
sinx
/ Jo
dx
(U > 0).
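Even though the Lebesgue integral $\int_{\mathbb{R}} x^{-1} \sin x\, dx$ does not exist, because the positive and negative parts of the integrand both have infinite integral, we have (see Exercise E16.1)

$$\lim_{U \uparrow \infty} S(U) = \frac{\pi}{2}.$$

The expression (f) is bounded simultaneously in $x$ and $T$ for our fixed $a$ and $b$; and, as $T \uparrow \infty$, the expression (f) converges to

$$\begin{cases} 0 & \text{if } x < a \text{ or } x > b, \\ \tfrac{1}{2} & \text{if } x = a \text{ or } x = b, \\ 1 & \text{if } a < x < b. \end{cases}$$

The Bounded Convergence Theorem (DOM) now yields result (a).

Suppose now that $\int_{\mathbb{R}} |\varphi(\theta)|\, d\theta < \infty$. We can then let $T \uparrow \infty$ in result (a) and use (DOM) to obtain

(g)  $$F(b) - F(a) = \frac{1}{2\pi} \int_{\mathbb{R}} \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta}\, \varphi(\theta)\, d\theta,$$

provided that $F$ is continuous at $a$ and $b$. However, (DOM) shows that the right-hand side of (g) is continuous in $a$ and $b$ and we can conclude (why?!) that $F$ has no atoms and that (g) holds for all $a$ and $b$ with $a < b$.

We now have

(h)  $$\frac{F(b) - F(a)}{b - a} = \frac{1}{2\pi} \int_{\mathbb{R}} \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta(b - a)}\, \varphi(\theta)\, d\theta.$$

But, by (d),

$$\left| \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta(b - a)} \right| \le 1.$$

Hence, the assumption that $\int_{\mathbb{R}} |\varphi(\theta)|\, d\theta < \infty$ allows us to use (DOM) to let $b \to a$ in (h) to obtain

$$F'(a) = f(a) := \frac{1}{2\pi} \int_{\mathbb{R}} e^{-i\theta a} \varphi(\theta)\, d\theta,$$

and, finally, $f$ is continuous by (DOM). □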
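As a numerical illustration of formula (16.6,b), the following sketch recovers a density from an integrable CF by truncating the integral to $[-T, T]$ and applying the trapezoid rule. The function names, the cut-off `T` and the step count are illustrative assumptions, not part of the text; the two CFs used are those of N(0,1) and the Cauchy distribution.

```python
import cmath
import math


def density_from_cf(cf, x, T=40.0, steps=20000):
    """Approximate (1/2pi) * integral_{-T}^{T} exp(-i*theta*x) * cf(theta) d(theta)
    with the composite trapezoid rule (T and steps are ad hoc choices)."""
    h = 2.0 * T / steps
    total = 0j
    for k in range(steps + 1):
        theta = -T + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-1j * theta * x) * cf(theta)
    return (total * h / (2.0 * math.pi)).real


def normal_cf(theta):
    """CF of N(0,1)."""
    return math.exp(-0.5 * theta * theta)


def cauchy_cf(theta):
    """CF of the standard Cauchy distribution."""
    return math.exp(-abs(theta))


if __name__ == "__main__":
    print(density_from_cf(normal_cf, 0.0), 1.0 / math.sqrt(2.0 * math.pi))
    print(density_from_cf(cauchy_cf, 0.5), 1.0 / (math.pi * 1.25))
```

Both CFs are integrable, so the hypothesis of the second part of the theorem holds; the printed pairs should agree to several decimal places.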
16.7. A table

     Distribution          pdf                                                        Support    CF
  1. N(μ, σ²)              $\frac{1}{\sigma\sqrt{2\pi}} \exp\{-\frac{(x-\mu)^2}{2\sigma^2}\}$   R          $\exp(i\mu\theta - \frac{1}{2}\sigma^2\theta^2)$
  2. U[0,1]                $1$                                                        [0,1]      $\frac{e^{i\theta} - 1}{i\theta}$
  3. U[-1,1]               $\frac{1}{2}$                                              [-1,1]     $\frac{\sin\theta}{\theta}$
  4. Double exponential    $\frac{1}{2} e^{-|x|}$                                     R          $\frac{1}{1 + \theta^2}$
  5. Cauchy                $\frac{1}{\pi(1 + x^2)}$                                   R          $e^{-|\theta|}$
  6. Triangular            $1 - |x|$                                                  [-1,1]     $\frac{2(1 - \cos\theta)}{\theta^2}$
  7. Anon                  $\frac{1 - \cos x}{\pi x^2}$                               R          $(1 - |\theta|) I_{[-1,1]}(\theta)$

The two lines 4 and 5 illustrate the duality between (16.6,b) and (16.6,c), as do the two lines 6 and 7.

Hints on verifying the table are given in the exercises on this chapter.
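Two rows of the table can be spot-checked numerically. The sketch below integrates $e^{i\theta x}$ against the pdf of U[-1,1] and of the triangular distribution and compares the result with the closed-form CFs. The helper name `cf_from_pdf`, the test point `theta = 1.3` and the step count are assumptions made for illustration only.

```python
import cmath
import math


def cf_from_pdf(pdf, a, b, theta, steps=20000):
    """Trapezoid approximation of integral_a^b exp(i*theta*x) * pdf(x) dx."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps + 1):
        x = a + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(1j * theta * x) * pdf(x)
    return total * h


def uniform_pdf(x):
    """pdf of U[-1,1] on its support."""
    return 0.5


def triangular_pdf(x):
    """pdf of the triangular distribution on [-1,1]."""
    return 1.0 - abs(x)


if __name__ == "__main__":
    theta = 1.3
    print(cf_from_pdf(uniform_pdf, -1.0, 1.0, theta), math.sin(theta) / theta)
    print(cf_from_pdf(triangular_pdf, -1.0, 1.0, theta),
          2.0 * (1.0 - math.cos(theta)) / theta ** 2)
```

Since both densities are even, the imaginary parts of the computed CFs should be negligible.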
Chapter 17

Weak Convergence

In this chapter, we consider the appropriate concept of 'convergence' for probability measures on $(\mathbb{R}, \mathcal{B})$. The terminology 'weak convergence' is unfortunate: the concept is closer to 'weak*' than to 'weak' convergence in the senses used by functional analysts. 'Narrow convergence' is the official pure-mathematical term. However, probabilists seem determined to use 'weak convergence' in their sense, and, reluctantly, I follow them here.

We are studying the special case of weak convergence on a Polish (complete, separable, metric) space $S$ when $S = \mathbb{R}$; and we unashamedly use special features of $\mathbb{R}$. For the general theory, see Billingsley (1968) or Parthasarathy (1967) or, for a superb account of its current scope, Ethier and Kurtz (1986).

Notation. We write $\mathrm{Prob}(\mathbb{R})$ for the space of probability measures on $\mathbb{R}$, and $C_b(\mathbb{R})$ for the space of bounded continuous functions on $\mathbb{R}$.
17.1. The 'elegant' definition

Let $(\mu_n : n \in \mathbb{N})$ be a sequence in $\mathrm{Prob}(\mathbb{R})$ and let $\mu \in \mathrm{Prob}(\mathbb{R})$. We say that $\mu_n$ converges weakly to $\mu$ if (and only if)

(a)  $$\mu_n(h) \to \mu(h), \quad \forall h \in C_b(\mathbb{R}),$$

and then write

(b)  $$\mu_n \xrightarrow{w} \mu.$$

We know that elements of $\mathrm{Prob}(\mathbb{R})$ correspond to distribution functions via the correspondence

$$\mu \leftrightarrow F, \quad \text{where } F(x) = \mu(-\infty, x].$$

Weak convergence of distribution functions is defined in the obvious way:

(c)  $$F_n \xrightarrow{w} F \text{ if and only if } \mu_n \xrightarrow{w} \mu.$$

We are generally interested in the case when $F_n = F_{X_n}$, that is when $F_n$ is the distribution function of some random variable $X_n$. Then, by (6.12), we have, for $h \in C_b(\mathbb{R})$,

$$\mu_n(h) = \int_{\mathbb{R}} h(x)\, dF_n(x) = \mathsf{E}\, h(X_n).$$

Note that the statement $F_n \xrightarrow{w} F$ is meaningful even if the $X_n$'s are defined on different probability spaces. However, if $X_n$ $(n \in \mathbb{N})$ and $X$ are RVs on the same triple $(\Omega, \mathcal{F}, \mathsf{P})$, then

(d)  $$(X_n \to X, \text{ a.s.}) \Rightarrow (F_{X_n} \xrightarrow{w} F_X),$$

and indeed,

(e)  $$(X_n \to X \text{ in prob}) \Rightarrow (F_{X_n} \xrightarrow{w} F_X).$$

Proof of (d). Suppose that $X_n \to X$, a.s., and that $\mu_n$ is the law of $X_n$ and $\mu$ is the law of $X$. Then, for $h \in C_b(\mathbb{R})$, we have $h(X_n) \to h(X)$, a.s., and, by (BDD),

$$\mu_n(h) = \mathsf{E}\, h(X_n) \to \mathsf{E}\, h(X) = \mu(h).$$

Exercise. Prove (e).
17.2. A 'practical' formulation

Example. Atoms are a nuisance. Suppose that $X_n = \frac{1}{n}$, $X = 0$. Let $\mu_n$ be the law of $X_n$, so that $\mu_n$ is the unit mass at $\frac{1}{n}$, and let $\mu$ be the law of $X$. Then, for $h \in C_b(\mathbb{R})$,

$$\mu_n(h) = h(n^{-1}) \to h(0) = \mu(h),$$

so that $\mu_n \xrightarrow{w} \mu$. However,

$$F_n(0) = 0 \not\to F(0) = 1.$$
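The example is small enough to be checked directly. In the sketch below, the test function `h` is an arbitrary illustrative choice of element of $C_b(\mathbb{R})$, and all function names are assumptions made for the illustration.

```python
def F_n(n, x):
    """Distribution function of the unit mass at 1/n."""
    return 1.0 if x >= 1.0 / n else 0.0


def F(x):
    """Distribution function of the unit mass at 0."""
    return 1.0 if x >= 0.0 else 0.0


def h(y):
    """One bounded continuous test function (an arbitrary choice)."""
    return 1.0 / (1.0 + y * y)


def mu_n(n, test_function):
    """Integral of test_function against the unit mass at 1/n."""
    return test_function(1.0 / n)


if __name__ == "__main__":
    for n in (1, 10, 1000, 10 ** 6):
        # F_n(0) stays at 0 although mu_n(h) approaches h(0) = mu(h)
        print(n, F_n(n, 0.0), mu_n(n, h), h(0.0))
```

At the atom $x = 0$ of $F$ the values $F_n(0)$ never move from 0, whereas at any non-atom such as $x = 0.3$ we do have $F_n(x) \to F(x)$.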
LEMMA
(a) Let $(F_n)$ be a sequence of DFs on $\mathbb{R}$, and let $F$ be a DF on $\mathbb{R}$. Then $F_n \xrightarrow{w} F$ if and only if

$$\lim_n F_n(x) = F(x)$$

for every non-atom (that is, every point of continuity) $x$ of $F$.

Proof of 'only if' part. Suppose that $F_n \xrightarrow{w} F$. Let $x \in \mathbb{R}$, and let $\delta > 0$. Define $h \in C_b(\mathbb{R})$ via

$$h(y) := \begin{cases} 1 & \text{if } y \le x, \\ 1 - \delta^{-1}(y - x) & \text{if } x < y < x + \delta, \\ 0 & \text{if } y \ge x + \delta. \end{cases}$$

Then $\mu_n(h) \to \mu(h)$. Now,

$$F_n(x) \le \mu_n(h) \quad \text{and} \quad \mu(h) \le F(x + \delta),$$

so that

$$\limsup_n F_n(x) \le F(x + \delta).$$

However, $F$ is right-continuous, so we may let $\delta \downarrow 0$ to obtain

(b)  $$\limsup_n F_n(x) \le F(x), \quad \forall x \in \mathbb{R}.$$

In similar fashion, working with $y \mapsto h(y + \delta)$, we find that for $x \in \mathbb{R}$ and $\delta > 0$,

$$\liminf_n F_n(x-) \ge F(x - \delta),$$

so that

(c)  $$\liminf_n F_n(x-) \ge F(x-), \quad \forall x \in \mathbb{R}.$$

Inequalities (b) and (c) refine the desired result. □

In the next section, we obtain the 'if' part as a consequence of a nice representation.
17.3. Skorokhod representation

THEOREM
Suppose that $(F_n : n \in \mathbb{N})$ is a sequence of DFs on $\mathbb{R}$, that $F$ is a DF on $\mathbb{R}$ and that $F_n(x) \to F(x)$ at every point $x$ of continuity of $F$. Then there exists a probability triple $(\Omega, \mathcal{F}, \mathsf{P})$ carrying a sequence $(X_n)$ of RVs and also a RV $X$ such that

$$F_n = F_{X_n}, \quad F = F_X,$$

and

$$X_n \to X \quad \text{a.s.}$$

This is a kind of 'converse' to (17.1,d).

Proof. We simply use the construction in Section 3.12. Thus, take

$$(\Omega, \mathcal{F}, \mathsf{P}) = ([0,1], \mathcal{B}[0,1], \mathrm{Leb}),$$

define

$$X^+(\omega) := \inf\{z : F(z) > \omega\}, \qquad X^-(\omega) := \inf\{z : F(z) \ge \omega\},$$

and define $X_n^+$, $X_n^-$ similarly. We know from Section 3.12 that $X^+$ and $X^-$ have DF $F$ and that $\mathsf{P}(X^+ = X^-) = 1$.

Fix $\omega$. Let $z$ be a non-atom of $F$ with $z > X^+(\omega)$. Then $F(z) > \omega$, and hence, for large $n$, $F_n(z) > \omega$, so that $X_n^+(\omega) \le z$. So $\limsup_n X_n^+(\omega) \le z$. But (since non-atoms are dense), we can choose $z \downarrow X^+(\omega)$. Hence

$$\limsup_n X_n^+(\omega) \le X^+(\omega),$$

and, by similar arguments,

$$\liminf_n X_n^-(\omega) \ge X^-(\omega).$$

Since $X_n^- \le X_n^+$ and $\mathsf{P}(X^+ = X^-) = 1$, the result follows. □
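The construction in the proof can be made concrete when the quantile function is available in closed form. The sketch below assumes, purely for illustration, that $F_n$ is the exponential DF with rate $1 + 1/n$ and $F$ the exponential DF with rate 1; for a continuous strictly increasing DF, $X^-(\omega)$ is simply the inverse of $F$ at $\omega$. The function names are hypothetical.

```python
import math


def quantile_exponential(rate, omega):
    """X^-(omega) = inf{z : F(z) >= omega} for F(z) = 1 - exp(-rate * z), 0 < omega < 1."""
    return -math.log(1.0 - omega) / rate


def X_n(n, omega):
    """Skorokhod variable with DF F_n (exponential with rate 1 + 1/n, an assumed example)."""
    return quantile_exponential(1.0 + 1.0 / n, omega)


def X(omega):
    """Skorokhod variable with DF F (exponential with rate 1)."""
    return quantile_exponential(1.0, omega)


if __name__ == "__main__":
    omega = 0.9  # one fixed sample point of ([0,1], B[0,1], Leb)
    for n in (1, 10, 100, 10 ** 4):
        print(n, X_n(n, omega), X(omega))
```

For each fixed $\omega$ the values $X_n(\omega)$ settle down to $X(\omega)$: here the weak convergence of the DFs has been upgraded to pointwise convergence on a single probability triple.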
17.4. Sequential compactness for Prob(R̄)

There is a problem in working with the non-compact space $\mathbb{R}$. Let $\mu_n$ be the unit mass at $n$. No subsequence of $(\mu_n)$ converges weakly in $\mathrm{Prob}(\mathbb{R})$, but $\mu_n \xrightarrow{w} \mu_\infty$ in $\mathrm{Prob}(\bar{\mathbb{R}})$, where $\mu_\infty$ is the unit mass at $\infty$. Here $\bar{\mathbb{R}}$ is the compact metrizable space $[-\infty, \infty]$, the definition of $\mathrm{Prob}(\bar{\mathbb{R}})$ is obvious, and $\mu_n \xrightarrow{w} \mu_\infty$ in $\mathrm{Prob}(\bar{\mathbb{R}})$ means that

$$\mu_n(h) \to \mu_\infty(h), \quad \forall h \in C(\bar{\mathbb{R}}).$$

(We do not need the subscript 'b' on $C(\bar{\mathbb{R}})$ because elements of $C(\bar{\mathbb{R}})$ are bounded.) It is important to keep remembering that while functions in $C(\bar{\mathbb{R}})$ tend to limits at $+\infty$ and $-\infty$, functions in $C_b(\mathbb{R})$ need not. The space $C(\bar{\mathbb{R}})$ is separable (it has a countable dense subset) while the space $C_b(\mathbb{R})$ is not.

Let me briefly describe how one should think of the next topic. Here I assume some functional analysis, but from the next paragraph (not the next section) on, I resort to bare-hands elementary treatment. By the Riesz representation theorem, the dual space $C(\bar{\mathbb{R}})^*$ of $C(\bar{\mathbb{R}})$ is the space of bounded signed measures on $(\bar{\mathbb{R}}, \mathcal{B}(\bar{\mathbb{R}}))$. The weak* topology of $C(\bar{\mathbb{R}})^*$ is metrizable (because $C(\bar{\mathbb{R}})$ is separable), and under this topology the unit ball of $C(\bar{\mathbb{R}})^*$ is compact and contains $\mathrm{Prob}(\bar{\mathbb{R}})$ as a closed subset. The weak* topology of $\mathrm{Prob}(\bar{\mathbb{R}})$ is exactly the probabilists' weak topology, so

(a) $\mathrm{Prob}(\bar{\mathbb{R}})$ is a compact metrizable space under our probabilists' weak topology.

The bare-hands substitute for result (a) is the following.

LEMMA (Helly-Bray)
(b) Let $(F_n)$ be any sequence of distribution functions on $\mathbb{R}$. Then there exist a right-continuous non-decreasing function $F$ on $\mathbb{R}$ such that $0 \le F \le 1$ and a subsequence $(n_i)$ such that

(*)  $$\lim_i F_{n_i}(x) = F(x) \text{ at every point of continuity of } F.$$

Proof. We make an obvious use of 'the diagonal principle'. Take a countable dense set $C$ of $\mathbb{R}$ and label it:

$$C = \{c_1, c_2, c_3, \ldots\}.$$

Since $(F_n(c_1) : n \in \mathbb{N})$ is a bounded sequence, it contains a convergent subsequence $(F_{n(1,j)}(c_1))$:

$$F_{n(1,j)}(c_1) \to H(c_1) \text{ (say) as } j \to \infty.$$

In some subsequence of this subsequence, we shall have

$$F_{n(2,j)}(c_2) \to H(c_2) \text{ as } j \to \infty;$$

and so on. If we put $n_i = n(i,i)$, then we shall have:

$$H(c) := \lim_i F_{n_i}(c) \text{ exists for every } c \text{ in } C.$$

Obviously, $0 \le H \le 1$, and $H$ is non-decreasing on $C$. For $x \in \mathbb{R}$, define

$$F(x) := \lim_{c \downarrow\downarrow x} H(c),$$

the '$\downarrow\downarrow$' signifying that $c$ decreases strictly to $x$ through $C$. (In particular, $F(c)$ need not equal $H(c)$ for $c$ in $C$.)

Our function $F$ is right-continuous, as you can check. By the 'limsupery' of Sections 17.2 and 17.3, you can also check for yourself that (*) holds: I wouldn't dream of depriving you of that pleasure. □
17.5. Tightness

Of course, the function $F$ in the Helly-Bray Lemma 17.4(b) need not be a distribution function. It will be a distribution function if and only if

$$\lim_{x \downarrow -\infty} F(x) = 0, \qquad \lim_{x \uparrow +\infty} F(x) = 1.$$

► Definition
A sequence $(F_n)$ of distribution functions is called tight if, given $\varepsilon > 0$, there exists $K > 0$ such that

$$\mu_n[-K, K] = F_n(K) - F_n(-K-) > 1 - \varepsilon, \quad \forall n.$$

You can see the idea: 'tightness stops mass being pushed out to $+\infty$ or $-\infty$'.

LEMMA
Suppose that $(F_n)$ is a sequence of DFs.
(a) If $F_n \xrightarrow{w} F$ for some DF $F$, then $(F_n)$ is tight.
(b) If $(F_n)$ is tight, then there exists a subsequence $(F_{n_i})$ and a DF $F$ such that $F_{n_i} \xrightarrow{w} F$.

This is a really easy limsupery exercise.
Chapter 18

The Central Limit Theorem

The Central Limit Theorem (CLT) is one of the great results of mathematics. Here we derive it as a corollary of Lévy's Convergence Theorem which says that weak convergence of DFs corresponds exactly to pointwise convergence of CFs.
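18.1. Lévy's Convergence Theorem

►►► Let $(F_n)$ be a sequence of DFs, and let $\varphi_n$ denote the CF of $F_n$. Suppose that

$$g(\theta) := \lim_n \varphi_n(\theta) \text{ exists for all } \theta \in \mathbb{R},$$

and that

$$g(\cdot) \text{ is continuous at } 0.$$

Then $g = \varphi_F$ for some distribution function $F$, and

$$F_n \xrightarrow{w} F.$$

Proof. Assume for the moment that

(a) the sequence $(F_n)$ is tight.

Then, by the Helly-Bray result 17.5(b), we can find a subsequence $(F_{n_k})$ and a DF $F$ such that

$$F_{n_k} \xrightarrow{w} F.$$

But then, for $\theta \in \mathbb{R}$, we have ($\varphi_{n_k}$ denoting the CF of $F_{n_k}$)

$$\varphi_{n_k}(\theta) \to \varphi_F(\theta)$$

(take $h(x) = e^{i\theta x}$). Thus $g = \varphi_F$.

Now we argue by contradiction. Suppose that $(F_n)$ does not converge weakly to $F$. Then, for some point $x$ of continuity of $F$, we can find a subsequence which we shall denote by $(\tilde{F}_n)$ and an $\eta > 0$ such that

(*)  $$|\tilde{F}_n(x) - F(x)| \ge \eta, \quad \forall n.$$

But $(\tilde{F}_n)$ is tight, so that we can find a subsequence $\tilde{F}_{n_j}$ and a DF $\tilde{F}$ such that

$$\tilde{F}_{n_j} \xrightarrow{w} \tilde{F}.$$

But then $\tilde{\varphi}_{n_j} \to \varphi_{\tilde{F}}$, so that $\varphi_{\tilde{F}} = g = \varphi_F$. Since a CF determines the corresponding DF uniquely, we see that $\tilde{F} = F$, so that, in particular, $x$ is a non-atom of $\tilde{F}$ and

(**)  $$\tilde{F}_{n_j}(x) \to \tilde{F}(x) = F(x).$$

The contradiction between (*) and (**) clinches the result.

We must now prove (a).

Proof of tightness of $(F_n)$. Let $\varepsilon > 0$ be given. Since the expression

$$\varphi_n(\theta) + \varphi_n(-\theta) = \int_{\mathbb{R}} 2\cos(\theta x)\, dF_n(x)$$

is real, it follows that $g(\theta) + g(-\theta)$ is real (and obviously bounded above by 2).

Since $g$ is continuous at 0 and equal to 1 at 0, we can choose $\delta > 0$ such that

$$|1 - g(\theta)| < \tfrac{1}{4}\varepsilon \quad \text{when } |\theta| < \delta.$$

We now have

$$0 \le \delta^{-1} \int_0^\delta \{2 - g(\theta) - g(-\theta)\}\, d\theta \le \tfrac{1}{2}\varepsilon.$$

Since $g = \lim \varphi_n$, the Bounded Convergence Theorem for the finite interval $[0, \delta]$ shows that there exists $n_0$ in $\mathbb{N}$ such that for $n \ge n_0$,

$$\delta^{-1} \int_0^\delta \{2 - \varphi_n(\theta) - \varphi_n(-\theta)\}\, d\theta \le \varepsilon.$$

However,

$$\delta^{-1} \int_0^\delta \{2 - \varphi_n(\theta) - \varphi_n(-\theta)\}\, d\theta = \delta^{-1} \int_{-\delta}^{\delta} \left\{ \int_{\mathbb{R}} (1 - e^{i\theta x})\, dF_n(x) \right\} d\theta = 2 \int_{\mathbb{R}} \left( 1 - \frac{\sin \delta x}{\delta x} \right) dF_n(x),$$

the interchange of order of integration being justified by the fact that since $|1 - e^{i\theta x}| \le 2$, 'the integral of the absolute value' is clearly finite. We now have, for $n \ge n_0$,

$$\varepsilon \ge 2 \int_{\mathbb{R}} \left( 1 - \frac{\sin \delta x}{\delta x} \right) dF_n(x) \ge 2 \int_{|x| > 2\delta^{-1}} \left( 1 - \frac{1}{|\delta x|} \right) dF_n(x) \ge \int_{|x| > 2\delta^{-1}} dF_n = \mu_n\{x : |x| > 2\delta^{-1}\},$$

and it is now evident that the sequence $(F_n)$ is tight. □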
18.2. o and O notation

If you now re-read Section 16.4, you will realize that the next task is to obtain 'Taylor' estimates on characteristic functions.

Recall that

$$f(t) = O(g(t)) \text{ as } t \to L$$

means that

$$\limsup_{t \to L} |f(t)/g(t)| < \infty,$$

and that

$$f(t) = o(g(t)) \text{ as } t \to L$$

means that

$$f(t)/g(t) \to 0 \text{ as } t \to L.$$
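18.3. Some important estimates

For $n = 0, 1, 2, \ldots$ and $x$ real, define the 'remainder'

$$R_n(x) := e^{ix} - \sum_{k=0}^{n} \frac{(ix)^k}{k!}.$$

Then

$$R_0(x) = e^{ix} - 1 = \int_0^x i e^{iy}\, dy,$$

and from these two expressions we see that $|R_0(x)| \le \min(2, |x|)$. Since

$$R_n(x) = \int_0^x i R_{n-1}(y)\, dy,$$

we obtain by induction:

► (a)  $$|R_n(x)| \le \min\left( \frac{2|x|^n}{n!}, \frac{|x|^{n+1}}{(n+1)!} \right).$$

Suppose now that $X$ is a zero-mean RV in $\mathcal{L}^2$:

$$\mathsf{E}(X) = 0, \qquad \sigma^2 := \mathrm{Var}(X) < \infty.$$

Then, with $\varphi$ denoting $\varphi_X$, we have

$$|\varphi(\theta) - (1 - \tfrac{1}{2}\sigma^2\theta^2)| = |\mathsf{E} R_2(\theta X)| \le \mathsf{E}|R_2(\theta X)| \le \theta^2\, \mathsf{E}\left( |X|^2 \wedge \frac{|\theta| |X|^3}{6} \right).$$

The final term within $\mathsf{E}(\cdot)$ is dominated by the integrable RV $|X|^2$ and tends to 0 as $\theta \to 0$. Hence, by (DOM), we have

►► (b)  $$\varphi(\theta) = 1 - \tfrac{1}{2}\sigma^2\theta^2 + o(\theta^2) \quad \text{as } \theta \to 0.$$

Next, for $|z| \le \tfrac{1}{2}$, and with principal values for logs,

$$\log(1 + z) - z = \int_0^z \frac{-w}{1 + w}\, dw = -z^2 \int_0^1 \frac{t\, dt}{1 + tz},$$

and since $|1 + tz| \ge \tfrac{1}{2}$ we have

► (c)  $$|\log(1 + z) - z| \le |z|^2, \quad |z| \le \tfrac{1}{2}.$$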
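Estimates (a) and (c) are easy to sanity-check numerically. The sketch below evaluates the remainder $R_n(x)$ directly from its definition and compares it with the bound in (a), and does the same for the logarithm estimate (c). The function names and the small floating-point tolerance in the usage loop are assumptions made for the illustration.

```python
import cmath
import math


def remainder(n, x):
    """R_n(x) = exp(ix) - sum_{k<=n} (ix)^k / k!, computed directly."""
    partial_sum = sum((1j * x) ** k / math.factorial(k) for k in range(n + 1))
    return cmath.exp(1j * x) - partial_sum


def remainder_bound(n, x):
    """Right-hand side of estimate (a): min(2|x|^n/n!, |x|^(n+1)/(n+1)!)."""
    ax = abs(x)
    return min(2.0 * ax ** n / math.factorial(n),
               ax ** (n + 1) / math.factorial(n + 1))


def log_error(z):
    """|log(1+z) - z| with the principal value of the logarithm."""
    return abs(cmath.log(1 + z) - z)


if __name__ == "__main__":
    for n in range(4):
        for x in (-7.5, -0.3, 0.1, 1.0, 10.0):
            assert abs(remainder(n, x)) <= remainder_bound(n, x) + 1e-12
    for z in (0.5, -0.5, 0.3j, 0.2 - 0.4j):
        assert log_error(z) <= abs(z) ** 2 + 1e-15
    print("estimates (a) and (c) hold at the sampled points")
```

Note how the first term of the minimum in (a) wins for large $|x|$ and the second for small $|x|$; it is the second that yields the $o(\theta^2)$ term in (b).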
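18.4. The Central Limit Theorem

►►► Let $(X_n)$ be an IID sequence, each $X_n$ distributed as $X$ where

$$\mathsf{E}(X) = 0, \qquad \sigma^2 := \mathrm{Var}(X) < \infty.$$

Define $S_n := X_1 + \cdots + X_n$, and set

$$G_n := \frac{S_n}{\sigma\sqrt{n}}.$$

Then, for $x \in \mathbb{R}$, we have, as $n \to \infty$,

$$\mathsf{P}(G_n \le x) \to \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp(-\tfrac{1}{2}y^2)\, dy.$$

Proof. Fix $\theta$ in $\mathbb{R}$. Then, using (18.3,b),

$$\varphi_{G_n}(\theta) = \varphi_X\left( \frac{\theta}{\sigma\sqrt{n}} \right)^{n} = \left\{ 1 - \frac{1}{2}\frac{\theta^2}{n} + o\left(\frac{1}{n}\right) \right\}^{n},$$

the 'o' now referring to the situation when $n \to \infty$. But now, using (18.3,c), we have, as $n \to \infty$,

$$\log \varphi_{G_n}(\theta) = n \log\left\{ 1 - \frac{1}{2}\frac{\theta^2}{n} + o\left(\frac{1}{n}\right) \right\} = n\left\{ -\frac{1}{2}\frac{\theta^2}{n} + o\left(\frac{1}{n}\right) \right\} \to -\tfrac{1}{2}\theta^2.$$

Hence $\varphi_{G_n}(\theta) \to \exp(-\tfrac{1}{2}\theta^2)$, and since $\theta \mapsto \exp(-\tfrac{1}{2}\theta^2)$ is the CF of the normal distribution, the result follows from Theorem 18.1. □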
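The convergence of CFs used in the proof can be watched numerically in a simple special case. The sketch below assumes, for illustration only, that $X = \pm 1$ with probability $\frac{1}{2}$ each, so that $\sigma = 1$ and $\varphi_X(t) = \cos t$; the function names are hypothetical.

```python
import math


def cf_G_n(theta, n):
    """CF of G_n = S_n / sqrt(n) when X = +1 or -1 with probability 1/2 each.

    Here phi_X(t) = cos(t) and sigma = 1, so phi_{G_n}(theta) = cos(theta/sqrt(n))**n.
    """
    return math.cos(theta / math.sqrt(n)) ** n


def cf_standard_normal(theta):
    """CF of N(0,1)."""
    return math.exp(-0.5 * theta * theta)


if __name__ == "__main__":
    theta = 1.5
    for n in (1, 10, 100, 10 ** 4, 10 ** 6):
        print(n, cf_G_n(theta, n), cf_standard_normal(theta))
```

The printed values approach $\exp(-\frac{1}{2}\theta^2)$, which by Theorem 18.1 is exactly what is needed for $\mathsf{P}(G_n \le x) \to \Phi(x)$.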
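18.5. Example

Let us look at a simple example which shows how the method may be adapted to deal with some sequence of independent but non-IID RVs.

With the Record Problem E4.3 in mind, suppose that on some $(\Omega, \mathcal{F}, \mathsf{P})$, $E_1, E_2, \ldots$ are independent events with $\mathsf{P}(E_n) = 1/n$. Define

$$N_n := I_{E_1} + \cdots + I_{E_n},$$

the 'number of records by time $n$' in the record context. Then

$$\mathsf{E}(N_n) = \sum_{k \le n} \frac{1}{k} = \log n + \gamma + o(1) \quad (\gamma \text{ is Euler's constant}),$$

$$\mathrm{Var}(N_n) = \sum_{k \le n} \frac{1}{k}\left( 1 - \frac{1}{k} \right) = \log n + \gamma - \frac{\pi^2}{6} + o(1).$$

Let

$$G_n := \frac{N_n - \log n}{\sqrt{\log n}},$$

so that $\mathsf{E}(G_n) \to 0$, $\mathrm{Var}(G_n) \to 1$. Then, for fixed $\theta$ in $\mathbb{R}$,

$$\varphi_{G_n}(\theta) = \exp(-i\theta\sqrt{\log n})\, \varphi_{N_n}\left( \frac{\theta}{\sqrt{\log n}} \right).$$

But, writing $X_k := I_{E_k}$,

$$\varphi_{N_n}(t) = \prod_{k=1}^{n} \varphi_{X_k}(t) = \prod_{k=1}^{n} \left\{ 1 - \frac{1}{k} + \frac{1}{k} e^{it} \right\}.$$

We see that as $n \to \infty$, and with $t := \theta/\sqrt{\log n}$,

$$\log \varphi_{G_n}(\theta) = -i\theta\sqrt{\log n} + \sum_{k=1}^{n} \log\left\{ 1 + \frac{1}{k}(e^{it} - 1) \right\}$$

$$= -i\theta\sqrt{\log n} + \sum_{k=1}^{n} \left\{ \frac{1}{k}\left( it - \tfrac{1}{2}t^2 + o(t^2) \right) + O\left( \frac{t^2}{k^2} \right) \right\}$$

$$= -i\theta\sqrt{\log n} + \left( it - \tfrac{1}{2}t^2 + o(t^2) \right)[\log n + O(1)] + t^2 O(1)$$

$$= -\tfrac{1}{2}\theta^2 + o(1) \to -\tfrac{1}{2}\theta^2.$$

Hence $\mathsf{P}(G_n \le x) \to \Phi(x)$, $x \in \mathbb{R}$. □

See Hall and Heyde (1980) for some very general limit theorems.

18.6. CF proof of Lemma 12.4

Lemma 12.4 gave us the 'only if' part of the Three-Series Theorem 12.5. Its statement was as follows.

LEMMA
Suppose that $(X_n)$ is a sequence of independent random variables bounded by a constant $K$ in $[0, \infty)$:

$$|X_n(\omega)| \le K, \quad \forall n, \forall \omega.$$

Then

$$\left( \sum X_n \text{ converges, a.s.} \right) \Rightarrow \left( \sum \mathsf{E}(X_n) \text{ converges and } \sum \mathrm{Var}(X_n) < \infty \right).$$

The proof given in Section 12.4 was rather sophisticated.

Proof using characteristic functions. First, note that, as a consequence of estimate (18.3,a), if $Z$ is a RV such that for some constant $K_1$,

$$|Z| \le K_1, \quad \mathsf{E}(Z) = 0, \quad \sigma^2 := \mathrm{Var}(Z) < \infty,$$

then for $|\theta| < K_1^{-1}$, we have

$$|\varphi_Z(\theta)| \le 1 - \tfrac{1}{2}\sigma^2\theta^2 + \tfrac{1}{6}|\theta|^3 K_1 \sigma^2 \le 1 - \tfrac{1}{2}\sigma^2\theta^2 + \tfrac{1}{6}\sigma^2\theta^2 = 1 - \tfrac{1}{3}\sigma^2\theta^2 \le \exp\left( -\tfrac{1}{3}\sigma^2\theta^2 \right).$$

Now take $Z_n := X_n - \mathsf{E}(X_n)$. Then

$$\mathsf{E}(Z_n) = 0, \quad \mathrm{Var}(Z_n) = \mathrm{Var}(X_n),$$

$$|\varphi_{Z_n}(\theta)| = |\exp\{-i\theta \mathsf{E}(X_n)\} \varphi_{X_n}(\theta)| = |\varphi_{X_n}(\theta)|,$$

and $|Z_n| \le 2K$.

If $\sum \mathrm{Var}(X_n) = \infty$, then, for $0 < |\theta| < (2K)^{-1}$, we shall have

$$\prod_{k \le n} |\varphi_{X_k}(\theta)| = \prod_{k \le n} |\varphi_{Z_k}(\theta)| \le \exp\left\{ -\tfrac{1}{3}\theta^2 \sum_{k \le n} \mathrm{Var}(X_k) \right\} \to 0.$$

However, if $\sum X_k$ converges a.s. to $S$, then, by (DOM),

$$\prod_{k \le n} \varphi_{X_k}(\theta) = \mathsf{E} \exp(i\theta S_n) \to \varphi_S(\theta),$$

and $\varphi_S(\theta)$ is continuous in $\theta$ with $\varphi_S(0) = 1$. We have a contradiction.

Hence $\sum \mathrm{Var}(X_n) = \sum \mathrm{Var}(Z_n) < \infty$, and, since $\mathsf{E}(Z_n) = 0$, Theorem 12.2(a) shows that $\sum Z_n$ converges a.s. Hence

$$\sum \mathsf{E}(X_n) = \sum \{X_n - Z_n\}$$

converges a.s., and since it is a deterministic sum, it converges! This last part of the argument was used in Section 12.4.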
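As a numerical footnote to the record example of Section 18.5, the asymptotic formulas for the mean and variance of $N_n$ can be compared with the exact finite sums. The sketch below is only an illustration; the function names and the hard-coded decimal value of Euler's constant are assumptions of the example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def record_mean(n):
    """E(N_n) = sum_{k<=n} 1/k for independent events with P(E_k) = 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))


def record_variance(n):
    """Var(N_n) = sum_{k<=n} (1/k)(1 - 1/k)."""
    return sum((1.0 / k) * (1.0 - 1.0 / k) for k in range(1, n + 1))


if __name__ == "__main__":
    n = 10 ** 5
    print(record_mean(n), math.log(n) + EULER_GAMMA)
    print(record_variance(n), math.log(n) + EULER_GAMMA - math.pi ** 2 / 6.0)
```

Both the mean and the variance grow like $\log n$, which is why $\sqrt{\log n}$ is the right normalisation for $G_n$.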
APPENDICES

Chapter A1

Appendix to Chapter 1

A1.1. A non-measurable subset A of S¹.

In the spirit of Banach and Tarski, although, of course, this relatively trivial example pre-dates theirs, we use the Axiom of Choice to show that

(a)  $$S^1 = \bigcup_{q \in \mathbb{Q}} A_q,$$

where the $A_q$ are disjoint sets, each of which may be obtained from any of the others by rotation. If the set $A = A_0$ has a 'length' then it is intuitively clear that result (a) would force

$$2\pi = \infty \times \mathrm{length}(A),$$

an impossibility.

To construct the family $(A_q : q \in \mathbb{Q})$, proceed as follows. Regard $S^1$ as $\{e^{i\theta} : \theta \in \mathbb{R}\}$ inside $\mathbb{C}$. Define an equivalence relation $\sim$ on $S^1$ by writing $z \sim w$ if there exist $\alpha$ and $\beta$ in $\mathbb{R}$ such that

$$z = e^{i\alpha}, \quad w = e^{i\beta}, \quad \alpha - \beta \in \mathbb{Q}.$$

Use the Axiom of Choice to produce a set $A$ which has precisely one representative of each equivalence class. Define

$$A_q = e^{iq}A = \{e^{iq}z : z \in A\}.$$

Then the family $(A_q : q \in \mathbb{Q})$ has the desired properties. (Obviously, $\mathbb{Q}$ could be replaced by $\mathbb{Z}$ throughout the above argument.)

We do not bring this example to its fully rigorous conclusion. The remainder of this appendix is fully rigorous.
We now set out to prove Uniqueness Lemma 1.6.

A1.2. d-systems.

Let $S$ be a set, and let $\mathcal{D}$ be a collection of subsets of $S$. Then $\mathcal{D}$ is called a d-system (on $S$) if

(a) $S \in \mathcal{D}$,
(b) if $A, B \in \mathcal{D}$ and $A \subseteq B$ then $B \setminus A \in \mathcal{D}$,
(c) if $A_n \in \mathcal{D}$ and $A_n \uparrow A$, then $A \in \mathcal{D}$.

Recall that $A_n \uparrow A$ means: $A_n \subseteq A_{n+1}$ $(\forall n)$ and $\bigcup A_n = A$.

(d) Proposition. A collection $\Sigma$ of subsets of $S$ is a σ-algebra if and only if $\Sigma$ is both a π-system and a d-system.

Proof. The 'only if' part is trivial, so we prove only the 'if' part.

Suppose that $\Sigma$ is both a π-system and a d-system, and that $E$, $F$ and $E_n$ $(n \in \mathbb{N}) \in \Sigma$. Then $E^c := S \setminus E \in \Sigma$, and

$$E \cup F = S \setminus (E^c \cap F^c) \in \Sigma.$$

Hence $G_n := E_1 \cup \ldots \cup E_n \in \Sigma$ and, since $G_n \uparrow \bigcup E_k$, we see that $\bigcup E_k \in \Sigma$. Finally,

$$\bigcap E_k = \left( \bigcup E_k^c \right)^c \in \Sigma.$$  □

Definition of $d(\mathcal{C})$. Suppose that $\mathcal{C}$ is a class of subsets of $S$. We define $d(\mathcal{C})$ to be the intersection of all d-systems which contain $\mathcal{C}$. Obviously, $d(\mathcal{C})$ is a d-system, the smallest d-system which contains $\mathcal{C}$. It is also obvious that

$$d(\mathcal{C}) \subseteq \sigma(\mathcal{C}).$$
\342\226\272 \342\226\272
Dynkin's
If
Lemma
d{I) = a{J).
is a TT-system, then
Thus
Proof.
TT-system.
any
c?-system
which
by
contains
that
a 7r-system
contains the
that
a-algebra generated
Because of
7r-system.
need
Proposition Al.2(d),we
only
prove
d{T) is a
194
Chapter
Al:
Appendix
to Chapter
G
1
el}.
(A1.3)..
that Vi D I. It is easilychecked system, from d{I), [For, clearly, 5 G X>i. Next,
Step 1: Let
Dj
:=
{B G d{X)
: BCiC
Because J is a ttthe cf-system structure inherits Vi if Bi,B2 e Vi and Bi C B2, then,

d{I),
yC
for C in J,
and,
{B2\\Bi)nC
=
{B2nC)\\{BinC);
since ^2 H C G c?( J), J3i (B2\\Bi) n C G d{I), so that Bn T i^, then for C G J,
fl
G d{I) ^ ^2X^1
C
and c?( J) is a c?-system, A. Finally, if Bn G
we I>i(n
see G N)
that
and
(^nnc)T(^nc)
so that
n C G
which containsJ, so that

Step
d{T) and
B
(since
Xî.]
Vi
:
We have shown that Vi C d{X) by its definition)
is a c?-system
X>i
d(X).
2: Let
X>2
D2 := {^
T.
d(J)
BHAe
d{I),
VJ3
G d(J)}.
that
the
fact
contains
c?-system
that
structure
d(T)
X>2 =
just as in Step 1, we can therefore from d{X) and that says that c?(X) is a 7r-system.
But,
prove D2
Step 1 showed that X>2 inherits = d{X). But the
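On a finite set, Dynkin's Lemma can be checked by brute force. The sketch below computes the smallest d-system and the smallest σ-algebra containing a given family on $S = \{0, 1, 2, 3\}$. It relies on the observation that on a finite set every increasing sequence of sets is eventually constant, so condition (c) of A1.2 is automatic. The function names and the two example families are assumptions chosen for the illustration.

```python
def d_system_closure(S, family):
    """Smallest d-system on the finite set S containing `family`.

    Closes under proper differences B \\ A (A a subset of B) and includes S;
    increasing limits need no extra work on a finite set.
    """
    sets = set(family) | {frozenset(S)}
    changed = True
    while changed:
        changed = False
        for A in list(sets):
            for B in list(sets):
                if A <= B and (B - A) not in sets:
                    sets.add(B - A)
                    changed = True
    return sets


def sigma_algebra_closure(S, family):
    """Smallest sigma-algebra on the finite set S containing `family`."""
    full = frozenset(S)
    sets = set(family) | {full}
    changed = True
    while changed:
        changed = False
        for A in list(sets):
            if (full - A) not in sets:
                sets.add(full - A)
                changed = True
            for B in list(sets):
                if (A | B) not in sets:
                    sets.add(A | B)
                    changed = True
    return sets


S = {0, 1, 2, 3}
# A pi-system: closed under intersection, since {0,1} & {1,2} = {1}.
pi_system = {frozenset({0, 1}), frozenset({1, 2}), frozenset({1})}
# Not a pi-system: the intersection {1} is missing.
not_pi_system = {frozenset({0, 1}), frozenset({1, 2})}

if __name__ == "__main__":
    print(d_system_closure(S, pi_system) == sigma_algebra_closure(S, pi_system))
    print(len(d_system_closure(S, not_pi_system)),
          len(sigma_algebra_closure(S, not_pi_system)))
```

For the π-system the two closures coincide, as the lemma asserts. For the second family the d-system is strictly smaller than the σ-algebra, which shows that the π-system hypothesis cannot simply be dropped.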
A1.4. Proof of Uniqueness Lemma 1.6

Recall what the crucial Lemma 1.6 stated:

Let $S$ be a set. Let $\mathcal{I}$ be a π-system on $S$, and let $\Sigma := \sigma(\mathcal{I})$. Suppose that $\mu_1$ and $\mu_2$ are measures on $(S, \Sigma)$ such that $\mu_1(S) = \mu_2(S) < \infty$ and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then

$$\mu_1 = \mu_2 \text{ on } \Sigma.$$

Proof. Let

$$\mathcal{D} = \{F \in \Sigma : \mu_1(F) = \mu_2(F)\}.$$

Then $\mathcal{D}$ is a d-system on $S$. [Indeed, the fact that $S \in \mathcal{D}$ is given. If $A, B \in \mathcal{D}$ and $A \subseteq B$, then

(*)  $$\mu_1(B \setminus A) = \mu_1(B) - \mu_1(A) = \mu_2(B) - \mu_2(A) = \mu_2(B \setminus A),$$

so that $B \setminus A \in \mathcal{D}$. Finally, if $F_n \in \mathcal{D}$ and $F_n \uparrow F$, then by Lemma 1.10(a),

$$\mu_1(F) = \uparrow\lim \mu_1(F_n) = \uparrow\lim \mu_2(F_n) = \mu_2(F),$$

so that $F \in \mathcal{D}$.]

Since $\mathcal{D}$ is a d-system and $\mathcal{D} \supseteq \mathcal{I}$ by hypothesis, Dynkin's Lemma shows that $\mathcal{D} \supseteq \sigma(\mathcal{I}) = \Sigma$, and the result follows. □

Notes. You should check that no circular argument is entailed by the use of Lemma 1.10 (this is obvious).

The reason for the insistence on finiteness in the condition $\mu_1(S) = \mu_2(S) < \infty$ is that we do not wish to try to claim at (*) that

$$\infty - \infty = \infty - \infty.$$

Indeed the Lemma 1.6 is false if '$< \infty$' is omitted - see Section A1.10 below.

We now aim to prove Carathéodory's Theorem 1.7.
A1.5. λ-sets: 'algebra' case

LEMMA
Let $\mathcal{G}_0$ be an algebra of subsets of $S$ and let

$$\lambda : \mathcal{G}_0 \to [0, \infty]$$

with $\lambda(\emptyset) = 0$. Call an element $L$ of $\mathcal{G}_0$ a λ-set if $L$ 'splits every element of $\mathcal{G}_0$ properly':

$$\lambda(L \cap G) + \lambda(L^c \cap G) = \lambda(G), \quad \forall G \in \mathcal{G}_0.$$

Then the class $\mathcal{L}_0$ of λ-sets is an algebra, and $\lambda$ is finitely additive on $\mathcal{L}_0$. Moreover, for disjoint $L_1, L_2, \ldots, L_n \in \mathcal{L}_0$ and $G$ in $\mathcal{G}_0$,

$$\lambda\left( \bigcup_{k=1}^{n} (L_k \cap G) \right) = \sum_{k=1}^{n} \lambda(L_k \cap G).$$

Proof. Step 1: Let $L_1$ and $L_2$ be λ-sets, and let $L = L_1 \cap L_2$. We wish to prove that $L$ is a λ-set.

Now $L^c \cap L_2 = L_2 \cap L_1^c$ and $L^c \cap L_2^c = L_2^c$. Hence, since $L_2$ is a λ-set, we have, for any $G$ in $\mathcal{G}_0$,

$$\lambda(L^c \cap G) = \lambda(L_2 \cap L_1^c \cap G) + \lambda(L_2^c \cap G)$$

and, of course,

$$\lambda(L_2^c \cap G) + \lambda(L_2 \cap G) = \lambda(G).$$

Since $L_1$ is a λ-set,

$$\lambda(L_2 \cap L_1^c \cap G) + \lambda(L \cap G) = \lambda(L_2 \cap G).$$

On adding the three equations just obtained, we see that

$$\lambda(L^c \cap G) + \lambda(L \cap G) = \lambda(G), \quad \forall G \in \mathcal{G}_0,$$

so that $L$ is indeed a λ-set.

Step 2: Since, trivially, $S$ is a λ-set, and the complement of a λ-set is a λ-set, it now follows that $\mathcal{L}_0$ is an algebra.

Step 3: If $L_1$, $L_2$ are disjoint λ-sets and $G \in \mathcal{G}_0$, then

$$(L_1 \cup L_2) \cap L_1 = L_1, \qquad (L_1 \cup L_2) \cap L_1^c = L_2,$$

so, since $L_1$ is a λ-set,

$$\lambda((L_1 \cup L_2) \cap G) = \lambda(L_1 \cap G) + \lambda(L_2 \cap G).$$

The proof is now easily completed. □

A1.6. Outer measures

Let $\mathcal{G}$ be a σ-algebra of subsets of $S$. A map

$$\lambda : \mathcal{G} \to [0, \infty]$$

is called an outer measure on $(S, \mathcal{G})$ if

(a) $\lambda(\emptyset) = 0$;
(b) $\lambda$ is increasing: for $G_1, G_2 \in \mathcal{G}$ with $G_1 \subseteq G_2$,
$$\lambda(G_1) \le \lambda(G_2);$$
(c) $\lambda$ is countably subadditive: if $(G_k)$ is any sequence of elements of $\mathcal{G}$, then
$$\lambda\left( \bigcup_k G_k \right) \le \sum_k \lambda(G_k).$$

A1.7. Carathéodory's Lemma.

►► Let $\lambda$ be an outer measure on the measurable space $(S, \mathcal{G})$. Then the λ-sets in $\mathcal{G}$ form a σ-algebra $\mathcal{L}$ on which $\lambda$ is countably additive, so that $(S, \mathcal{L}, \lambda)$ is a measure space.

Proof. Because of Lemma A1.5, we need only show that if $(L_k)$ is a disjoint sequence of sets in $\mathcal{L}$, then $L := \bigcup_k L_k \in \mathcal{L}$ and

(a)  $$\lambda(L) = \sum_k \lambda(L_k).$$

By the subadditive property of $\lambda$, for $G \in \mathcal{G}$, we have

(b)  $$\lambda(G) \le \lambda(L \cap G) + \lambda(L^c \cap G).$$

Now let $M_n := \bigcup_{k \le n} L_k$. Lemma A1.5 shows that $M_n \in \mathcal{L}$, so that

$$\lambda(G) = \lambda(M_n \cap G) + \lambda(M_n^c \cap G).$$

However, $M_n^c \supseteq L^c$, so that

(c)  $$\lambda(G) \ge \lambda(M_n \cap G) + \lambda(L^c \cap G).$$

Lemma A1.5 now allows us to rewrite (c) as

$$\lambda(G) \ge \sum_{k \le n} \lambda(L_k \cap G) + \lambda(L^c \cap G),$$

so that

(d)  $$\lambda(G) \ge \sum_k \lambda(L_k \cap G) + \lambda(L^c \cap G) \ge \lambda(L \cap G) + \lambda(L^c \cap G),$$

using the countably subadditive property of $\lambda$ in the last step. On comparing (d) with (b), we see that equality must hold throughout (d) and (b), so that $L \in \mathcal{L}$; and then on taking $G = L$ we obtain result (a). □
A1.8. Proof of Carathéodory's Theorem.

Recall that we need to prove the following.

Let $S$ be a set, let $\Sigma_0$ be an algebra on $S$, and let

$$\Sigma := \sigma(\Sigma_0).$$

If $\mu_0$ is a countably additive map $\mu_0 : \Sigma_0 \to [0, \infty]$, then there exists a measure $\mu$ on $(S, \Sigma)$ such that

$$\mu = \mu_0 \text{ on } \Sigma_0.$$

Proof. Step 1: Let $\mathcal{G}$ be the σ-algebra of all subsets of $S$. For $G \in \mathcal{G}$, define

$$\lambda(G) := \inf \sum_n \mu_0(F_n),$$

where the infimum is taken over all sequences $(F_n)$ in $\Sigma_0$ with $G \subseteq \bigcup_n F_n$. We now prove that

(a) $\lambda$ is an outer measure on $(S, \mathcal{G})$.

The facts that $\lambda(\emptyset) = 0$ and $\lambda$ is increasing are obvious. Suppose that $(G_n)$ is a sequence in $\mathcal{G}$, such that each $\lambda(G_n)$ is finite. Let $\varepsilon > 0$ be given. For each $n$, choose a sequence $(F_{n,k} : k \in \mathbb{N})$ of elements of $\Sigma_0$ such that

$$G_n \subseteq \bigcup_k F_{n,k}, \qquad \sum_k \mu_0(F_{n,k}) < \lambda(G_n) + \varepsilon 2^{-n}.$$

Then $G := \bigcup G_n \subseteq \bigcup_n \bigcup_k F_{n,k}$, so that

$$\lambda(G) \le \sum_n \sum_k \mu_0(F_{n,k}) < \sum_n \lambda(G_n) + \varepsilon.$$

Since $\varepsilon$ is arbitrary, we have proved result (a).

Step 2: By Carathéodory's Lemma A1.7, $\lambda$ is a measure on $(S, \mathcal{L})$, where $\mathcal{L}$ is the σ-algebra of λ-sets in $\mathcal{G}$. All we need show is that

(b) $\Sigma_0 \subseteq \mathcal{L}$, and $\lambda = \mu_0$ on $\Sigma_0$;

for then $\Sigma := \sigma(\Sigma_0) \subseteq \mathcal{L}$ and we can define $\mu$ to be the restriction of $\lambda$ to $(S, \Sigma)$.

Step 3: Proof that $\lambda = \mu_0$ on $\Sigma_0$.
Let $F \in \Sigma_0$. Then, clearly, $\lambda(F) \le \mu_0(F)$. Now suppose that $F \subseteq \bigcup_n F_n$, where $F_n \in \Sigma_0$. As usual, we can define a sequence $(E_n)$ of disjoint sets:

$$E_1 := F_1, \qquad E_n := F_n \cap \left( \bigcup_{k < n} F_k \right)^c,$$

such that $E_n \subseteq F_n$ and $\bigcup E_n = \bigcup F_n \supseteq F$. Then

$$\mu_0(F) = \mu_0\left( \bigcup (F \cap E_n) \right) = \sum \mu_0(F \cap E_n),$$

using the countable additivity of $\mu_0$ on $\Sigma_0$. Hence

$$\mu_0(F) \le \sum \mu_0(E_n) \le \sum \mu_0(F_n),$$

so that $\lambda(F) \ge \mu_0(F)$. Step 3 is complete.

Step 4: Proof that $\Sigma_0 \subseteq \mathcal{L}$. Let $E \in \Sigma_0$ and $G \in \mathcal{G}$. Let $\varepsilon > 0$. Then there exists a sequence $(F_n)$ in $\Sigma_0$ such that $G \subseteq \bigcup_n F_n$, and

$$\sum_n \mu_0(F_n) \le \lambda(G) + \varepsilon.$$

Now, by definition of $\lambda$,

$$\sum_n \mu_0(F_n) = \sum_n \mu_0(E \cap F_n) + \sum_n \mu_0(E^c \cap F_n) \ge \lambda(E \cap G) + \lambda(E^c \cap G),$$

since $E \cap G \subseteq \bigcup (E \cap F_n)$ and $E^c \cap G \subseteq \bigcup (E^c \cap F_n)$. Thus, since $\varepsilon$ is arbitrary,

$$\lambda(G) \ge \lambda(E \cap G) + \lambda(E^c \cap G).$$

However, since $\lambda$ is subadditive,

$$\lambda(G) \le \lambda(E \cap G) + \lambda(E^c \cap G).$$

We see that $E$ is indeed a λ-set. □
A1.9. Proof of the existence of Lebesgue measure on ((0,1], B(0,1]).

Recall the set-up in Section 1.8. Let $S = (0,1]$. For $F \subseteq S$, say that $F \in \Sigma_0$ if $F$ may be written as a finite union

(*)  $$F = (a_1, b_1] \cup \ldots \cup (a_r, b_r]$$

where $r \in \mathbb{N}$, $0 \le a_1 \le b_1 \le \ldots \le a_r \le b_r \le 1$. Then (as you should convince yourself) $\Sigma_0$ is an algebra on $(0,1]$ and

$$\Sigma := \sigma(\Sigma_0) = \mathcal{B}(0,1].$$

(We write $\mathcal{B}(0,1]$ rather than $\mathcal{B}((0,1])$.) For $F$ as at (*), let

$$\mu_0(F) = \sum_{k \le r} (b_k - a_k).$$

Of course, a set $F$ may have different expressions as a finite disjoint union of the form (*): for example,

$$(0,1] = (0, \tfrac{1}{2}] \cup (\tfrac{1}{2}, 1].$$

However, it is easily seen that $\mu_0$ is well defined on $\Sigma_0$ and that $\mu_0$ is finitely additive on $\Sigma_0$. While this is obvious from a picture, you might (or might not) wish to consider how to make the intuitive argument into a formal proof.

The key thing is to prove that $\mu_0$ is countably additive on $\Sigma_0$. So suppose that $(F_n)$ is a sequence of disjoint elements of $\Sigma_0$, with union $F$ in $\Sigma_0$. We know that if $G_n = \bigcup_{k=1}^{n} F_k$, then

$$\mu_0(G_n) = \sum_{k=1}^{n} \mu_0(F_k)$$

and $G_n \uparrow F$. To prove that $\mu_0$ is countably additive it is enough to show that $\mu_0(G_n) \uparrow \mu_0(F)$, for then

$$\mu_0(F) = \uparrow\lim \mu_0(G_n) = \uparrow\lim \sum_{k \le n} \mu_0(F_k) = \sum \mu_0(F_k).$$

Let $H_n = F \setminus G_n$. Then $H_n \in \Sigma_0$ and $H_n \downarrow \emptyset$. We need only prove that

$$\mu_0(H_n) \downarrow 0;$$

for then

$$\mu_0(G_n) = \mu_0(F) - \mu_0(H_n) \uparrow \mu_0(F).$$

It is clear that an alternative (and final!) rewording of what we need to show is the following:

(a) if $(H_n)$ is a decreasing sequence of elements of $\Sigma_0$ such that for some $\varepsilon > 0$,
$$\mu_0(H_n) \ge 2\varepsilon, \quad \forall n,$$
then $\bigcap_k H_k \ne \emptyset$.

Proof of (a). It is obvious from the definition of $\Sigma_0$ that, for each $k \in \mathbb{N}$, we can choose $J_k \in \Sigma_0$ such that, with $\bar{J}_k$ denoting the closure of $J_k$,

$$\bar{J}_k \subseteq H_k \quad \text{and} \quad \mu_0(H_k \setminus J_k) \le \varepsilon 2^{-k}.$$

But then (recall that $H_n \downarrow$)

$$\mu_0\left( H_n \setminus \bigcap_{k \le n} J_k \right) \le \mu_0\left( \bigcup_{k \le n} (H_k \setminus J_k) \right) \le \sum_{k \le n} \varepsilon 2^{-k} < \varepsilon.$$

Hence, since $\mu_0(H_n) \ge 2\varepsilon$, $\forall n$, we see that for every $n$,

$$\mu_0\left( \bigcap_{k \le n} J_k \right) > \varepsilon,$$

and hence $\bigcap_{k \le n} J_k$ is non-empty. A fortiori then, for every $n$,

$$K_n := \bigcap_{k \le n} \bar{J}_k \text{ is non-empty.}$$

That

(b)  $$\bigcap_k \bar{J}_k \ne \emptyset \quad \left( \text{whence } \bigcap H_k \ne \emptyset \right)$$

now follows from the Heine-Borel theorem: for if (b) is false, then $((\bar{J}_k)^c : k \in \mathbb{N})$ gives a covering of $[0,1]$ by open sets with no finite subcovering.

Alternatively, we can argue directly as follows. For each $n$, choose a point $x_n$ in the non-empty set $K_n$. Since each $x_n$ belongs to the compact set $\bar{J}_1$, we can find a subsequence $(n_q)$ and a point $x$ of $\bar{J}_1$ such that $x_{n_q} \to x$. However, for each $k$, $x_{n_q} \in \bar{J}_k$ for all but finitely many $q$, and since $\bar{J}_k$ is compact, it follows that $x \in \bar{J}_k$. Hence $x \in \bigcap_k \bar{J}_k$, and property (b) holds. □

Since $\mu_0$ is countably additive on $\Sigma_0$ and $\mu_0(0,1] < \infty$, it follows that $\mu_0$ has a unique extension to a measure $\mu$ on $((0,1], \mathcal{B}(0,1])$. This is Lebesgue measure Leb on $((0,1], \mathcal{B}(0,1])$. The λ-sets arising from $\mu_0$ in the construction of Section A1.8 form a σ-algebra strictly larger than $\mathcal{B}(0,1]$, namely the σ-algebra of Lebesgue measurable subsets of $(0,1]$. See Section A1.11.
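A1.10. Example of non-uniqueness of extension

With $(S, \Sigma_0)$ as in Section A1.9, suppose that for $F \in \Sigma_0$,

(a)  $$\nu_0(F) := \begin{cases} 0 & \text{if } F = \emptyset, \\ \infty & \text{if } F \ne \emptyset. \end{cases}$$

The Carathéodory extension $\nu$ of $\nu_0$ will be obtained as the obvious extension of (a) to $\Sigma$. However, another extension $\tilde{\nu}$ is given by

$$\tilde{\nu}(F) = \text{number of elements in } F.$$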
A1.11. Completion of a measure space

In fact (apart from an 'aside' on the Riemann integral), we do not need completions in this book.

Suppose that $(S, \Sigma, \mu)$ is a measure space. Define a class $\mathcal{N}$ of subsets of $S$ as follows:

$N \in \mathcal{N}$ if and only if $\exists Z \in \Sigma$ such that $N \subseteq Z$ and $\mu(Z) = 0$.

It is sometimes philosophically satisfying to be able to make precise the idea that '$N$ in $\mathcal{N}$ is $\mu$-measurable and $\mu(N) = 0$'. This is done as follows. For any subset $F$ of $S$, write $F \in \Sigma^*$ if

$\exists E, G \in \Sigma$ such that $E \subseteq F \subseteq G$ and $\mu(G \setminus E) = 0$.

It is very easy to show that $\Sigma^*$ is a σ-algebra on $S$ and indeed that $\Sigma^* = \sigma(\Sigma, \mathcal{N})$. With obvious notation we define for $F \in \Sigma^*$,

$$\mu^*(F) = \mu(E) = \mu(G),$$

it being easy to check that $\mu^*$ is well defined. Moreover, it is no problem to prove that $(S, \Sigma^*, \mu^*)$ is a measure space, the completion of $(S, \Sigma, \mu)$.

For parts of advanced probability, it is essential to complete the basic probability triple $(\Omega, \mathcal{F}, \mathsf{P})$. In other parts of probability, when (for example) $S$ is topological, $\Sigma = \mathcal{B}(S)$ and we wish to consider several different measures on $(S, \Sigma)$, it is meaningless to insist on completion.

If we begin with $([0,1], \mathcal{B}[0,1], \mathrm{Leb})$, then $\mathcal{B}[0,1]^*$ is the σ-algebra of what are called Lebesgue-measurable sets of $[0,1]$. Then, for example, a function $f : [0,1] \to [0,1]$ is Lebesgue-measurable if the inverse image of every Borel set is Lebesgue-measurable: it need not be true that the inverse image of a Lebesgue-measurable set is Lebesgue-measurable.
A1.12. The Baire category theorem

In Section 1.11, we studied a subset $H$ of $S := [0,1]$ such that

(i) $H = \bigcap_k G_k$ for a sequence $(G_k)$ of open subsets of $S$,
(ii) $H \supseteq V$, where $V = \mathbb{Q} \cap S$.

If $H$ were countable: $H = \{h_r : r \in \mathbb{N}\}$, then we would have

(a)  $$S = H \cup H^c = \left( \bigcup_r \{h_r\} \right) \cup \left( \bigcup_k G_k^c \right)$$

expressing $S$ as a countable union

(b)  $$S = \bigcup_n F_n$$

of closed sets where no $F_n$ contains an open interval. [Since $G_k^c \subseteq V^c$ for every $k$, $G_k^c$ contains only irrational points in $S$.]

However, the Baire category theorem states that if a complete metric space $S$ may be written as a union of a countable sequence of closed sets:

$$S = \bigcup F_n,$$

then some $F_n$ contains an open ball. Thus the set $H$ must be uncountable.

The Baire category theorem has fundamental applications in functional analysis, and some striking applications to probability too!

Proof of the Baire category theorem. Assume for the purposes of contradiction that no $F_n$ contains an open ball. Since $F_1^c$ is a non-empty open subset of $S$, we can find $x_1$ in $S$ and $\varepsilon_1 > 0$ such that

$$B(x_1, \varepsilon_1) \subseteq F_1^c,$$

$B(x_1, \varepsilon_1)$ denoting the open ball of radius $\varepsilon_1$ centred at $x_1$. Now $F_2$ contains no open ball, so that the open set

$$U_2 := B(x_1, 2^{-1}\varepsilon_1) \cap F_2^c$$

is non-empty, and we can find $x_2$ in $U_2$ and $\varepsilon_2 > 0$ such that

$$B(x_2, \varepsilon_2) \subseteq U_2, \qquad \varepsilon_2 < 2^{-1}\varepsilon_1.$$

Inductively, choose a sequence $(x_n)$ in $S$ and $(\varepsilon_n)$ in $(0, \infty)$ so that we have $\varepsilon_{n+1} < 2^{-1}\varepsilon_n$ and

$$B(x_{n+1}, \varepsilon_{n+1}) \subseteq U_{n+1} := B(x_n, 2^{-1}\varepsilon_n) \cap F_{n+1}^c.$$

Since $d(x_n, x_{n+1}) < 2^{-1}\varepsilon_n$, it is obvious from the triangle law that $(x_n)$ is Cauchy, so that $x := \lim x_n$ exists, and that

$$x \in \bigcap B(x_n, \varepsilon_n) \subseteq \bigcap F_n^c,$$

contradicting the fact that $\bigcup F_n = S$. □
Chapter A3

Appendix to Chapter 3

A3.1. Proof of the Monotone-Class Theorem 3.14

Recall the statement of the theorem.

► Let $\mathcal{H}$ be a class of bounded functions from a set $S$ into $\mathbb{R}$ satisfying the following conditions:
(i) $\mathcal{H}$ is a vector space over $\mathbb{R}$;
(ii) the constant function 1 is an element of $\mathcal{H}$;
(iii) if $(f_n)$ is a sequence of non-negative functions in $\mathcal{H}$ such that $f_n \uparrow f$ where $f$ is a bounded function on $S$, then $f \in \mathcal{H}$.
Then if $\mathcal{H}$ contains the indicator function of every set in some $\pi$-system $\mathcal{I}$, then $\mathcal{H}$ contains every bounded $\sigma(\mathcal{I})$-measurable function on $S$.

Proof. Let $\mathcal{D}$ be the class of sets $F$ in $S$ such that $I_F \in \mathcal{H}$. It is immediate from (i)-(iii) that $\mathcal{D}$ is a d-system. Since $\mathcal{D}$ contains the $\pi$-system $\mathcal{I}$, $\mathcal{D}$ contains $\sigma(\mathcal{I})$.

Suppose that $f$ is a $\sigma(\mathcal{I})$-measurable function such that for some $K$ in $\mathbb{N}$,
\[ 0 \le f(s) \le K, \qquad \forall s \in S. \]
For $n \in \mathbb{N}$, define
\[ f_n(s) := \sum_{i=0}^{K2^n} i\,2^{-n}\, I_{A(n,i)}(s), \]
where
\[ A(n,i) := \{s : i\,2^{-n} \le f(s) < (i+1)2^{-n}\}. \]
Since $f$ is $\sigma(\mathcal{I})$-measurable, every $A(n,i) \in \sigma(\mathcal{I})$, so that $I_{A(n,i)} \in \mathcal{H}$. Since $\mathcal{H}$ is a vector space, every $f_n \in \mathcal{H}$. But $0 \le f_n \uparrow f$, so that $f \in \mathcal{H}$.

If $f \in \mathrm{b}\sigma(\mathcal{I})$, we may write $f = f^+ - f^-$, where $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$. Then $f^+, f^- \in \mathrm{b}\sigma(\mathcal{I})$ and $f^+, f^- \ge 0$, so that $f^+, f^- \in \mathcal{H}$ by what we established above. □
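The dyadic staircase used in the proof of the Monotone-Class Theorem can be made concrete. The following is only a minimal sketch: the function `f` and the sample points are hypothetical choices, and `staircase` is an illustrative name rather than anything from the text. It evaluates $f_n$ by locating the unique $i$ with $s \in A(n,i)$, and shows the values increasing towards $f$ as $n$ grows.

```python
import math

def staircase(f, n, K):
    """Return f_n, where f_n(s) = i * 2^-n for the unique i with s in A(n, i)."""
    def f_n(s):
        value = f(s)
        assert 0 <= value <= K          # the proof assumes 0 <= f <= K
        i = math.floor(value * 2 ** n)  # i * 2^-n <= f(s) < (i + 1) * 2^-n
        return i / 2 ** n
    return f_n

f = lambda s: s * s                     # hypothetical bounded function on [0, 2], K = 4
points = [0.0, 0.3, 1.0, 1.7, 2.0]
for n in (1, 3, 6):
    f_n = staircase(f, n, K=4)
    print(n, [f_n(s) for s in points])
```

Each $f_n$ is a finite linear combination of indicators of sets in $\sigma(\mathcal{I})$, which is why the vector-space property puts it in $\mathcal{H}$.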
A3.2. Discussion of generated σ-algebras

This is one of those situations in which it is actually easier to understand things in a more abstract setting. So, suppose that

$\Omega$ and $S$ are sets, and that $Y : \Omega \to S$;
$\Sigma$ is a σ-algebra on $S$;
$X : \Omega \to \mathbb{R}$.

Because $Y^{-1}$ preserves all set operations,
\[ Y^{-1}\Sigma := \{Y^{-1}B : B \in \Sigma\} \]
is a σ-algebra on $\Omega$, and because it is tautologically the smallest σ-algebra $\mathcal{Y}$ on $\Omega$ such that $Y$ is $\mathcal{Y}/\Sigma$ measurable (in that $Y^{-1} : \Sigma \to \mathcal{Y}$), we call it $\sigma(Y)$:
\[ \sigma(Y) = Y^{-1}\Sigma. \]

LEMMA
(a) $X$ is $\sigma(Y)$-measurable if and only if $X = f(Y)$ where $f$ is a $\Sigma$-measurable function from $S$ to $\mathbb{R}$.

Note. The 'if' part is just the Composition Lemma.

Proof of 'only if' part. It is enough to prove that

(b) $X \in \mathrm{b}\sigma(Y)$ if and only if $\exists f \in \mathrm{b}\Sigma$ such that $X = f(Y)$.

(Otherwise, consider $\arctan X$, for example.) Though we certainly do not need the Monotone-Class Theorem to prove (b), we may as well use it. So define $\mathcal{H}$ to be the class of all bounded functions $X$ on $\Omega$ such that $X = f(Y)$ for some $f \in \mathrm{b}\Sigma$. Taking $\mathcal{I} = \sigma(Y)$, note that if $F \in \mathcal{I}$ then $F = Y^{-1}B$ for some $B$ in $\Sigma$, so that
\[ I_F(\omega) = I_B(Y(\omega)), \]
so that $I_F \in \mathcal{H}$. That $\mathcal{H}$ is a vector space containing constants is obvious. Finally, suppose that $(X_n)$ is a sequence of elements of $\mathcal{H}$ such that, for some positive real constant $K$,
\[ 0 \le X_n \uparrow X \le K. \]
For each $n$, $X_n = f_n(Y)$ for some $f_n$ in $\mathrm{b}\Sigma$. Define $f := \limsup f_n$, so that $f \in \mathrm{b}\Sigma$. Then $X = f(Y)$. □

One has to be very careful about what Lemma (a) means in practice. To be sure, result (3.13,b) is the special case when $(S, \Sigma) = (\mathbb{R}, \mathcal{B})$.

Discussion of (3.13,c). Suppose that $Y_k : \Omega \to \mathbb{R}$ for $1 \le k \le n$. We may define a map $Y : \Omega \to \mathbb{R}^n$ via
\[ Y(\omega) := (Y_1(\omega), \ldots, Y_n(\omega)) \in \mathbb{R}^n. \]
The problem mentioned at (3.13,d) and in the Warning following it shows up here because, before we can apply Lemma (a) to prove (3.13,c), we need to prove that
\[ \sigma(Y_1, \ldots, Y_n) := \sigma\bigl(Y_k^{-1}\mathcal{B}(\mathbb{R}) : 1 \le k \le n\bigr) = Y^{-1}\mathcal{B}(\mathbb{R}^n) =: \sigma(Y). \]
[This amounts to proving that the product σ-algebra $\prod_{1 \le k \le n} \mathcal{B}(\mathbb{R})$ is the same as $\mathcal{B}(\mathbb{R}^n)$. See Section 8.5.] Now $Y_k = \gamma_k \circ Y$, where $\gamma_k$ is the '$k$th coordinate' map on $\mathbb{R}^n$ (continuous, hence Borel), so that $Y_k$ is $\sigma(Y)$-measurable. On the other hand, every open subset of $\mathbb{R}^n$ is a countable union of open rectangles $G_1 \times \cdots \times G_n$ where each $G_k$ is a subinterval of $\mathbb{R}$, and since
\[ \{Y \in G_1 \times \cdots \times G_n\} = \bigcap_k \{Y_k \in G_k\} \in \sigma(Y_1, \ldots, Y_n), \]
things do work out.

You can already see why we are in an appendix, and why we skip discussion of (3.13,d).
Chapter A4

Appendix to Chapter 4

This appendix gives the statement of Strassen's Law of the Iterated Logarithm. Section A4.3 treats the completely different topic of constructing a rigorous model for a Markov chain.

A4.1. Kolmogorov's Law of the Iterated Logarithm

THEOREM
Let $X_1, X_2, \ldots$ be IID RVs each with mean 0 and variance 1. Let $S_n := X_1 + X_2 + \cdots + X_n$. Then, almost surely,
\[ \limsup_n \frac{S_n}{\sqrt{2n \log\log n}} = +1, \qquad \liminf_n \frac{S_n}{\sqrt{2n \log\log n}} = -1. \]

This result already gives very precise information on the big values of partial sums. See Section 14.7 for proof in the case when the $X$'s are normally distributed.
A4.2. Strassen's Law of the Iterated Logarithm

Strassen's Law is a staggering extension of Kolmogorov's result.

Let $(X_n)$ and $(S_n)$ be as in the previous section. For each $\omega$, let the map $t \mapsto S_t(\omega)$ on $[0, \infty)$ be the linear interpolation of the map $n \mapsto S_n(\omega)$ on $\mathbb{Z}^+$, so that
\[ S_t(\omega) := (t - n)S_{n+1}(\omega) + (n + 1 - t)S_n(\omega), \qquad t \in [n, n+1). \]
With Kolmogorov's result in mind, define
\[ Z_n(t, \omega) := \frac{S_{nt}(\omega)}{\sqrt{2n \log\log n}}, \qquad t \in [0,1], \]
so that $t \mapsto Z_n(t, \omega)$ on $[0,1]$ is a rescaled version of the random walk $S$ run up to time $n$. Say that a function $t \mapsto f(t, \omega)$ is in the set $K(\omega)$ of limiting shapes of the path associated with $\omega$ if there is a sequence $n_1(\omega), n_2(\omega), \ldots$ in $\mathbb{N}$ such that
\[ Z_{n_k}(t, \omega) \to f(t, \omega) \quad \text{uniformly in } t \in [0,1]. \]
Now let $K$ consist of those functions $f$ in $C[0,1]$ which can be written in the Lebesgue-integral form
\[ f(t) = \int_0^t h(s)\,ds \quad\text{where}\quad \int_0^1 h(s)^2\,ds \le 1. \]

Strassen's Theorem
\[ \mathbf{P}[K(\omega) = K] = 1. \]

Thus, (almost) all paths have the same limiting shapes. Khinchine's law follows from Strassen's precisely because (Exercise!)
\[ \sup\{f(1) : f \in K\} = 1, \qquad \inf\{f(1) : f \in K\} = -1. \]
However, the only element of $K$ for which $f(1) = 1$ is the function $f(t) = t$, so the big values of $S$ occur when the whole path (when rescaled) looks like a line of slope 1. Almost every path will, in its $Z$ rescaling, look infinitely often like the function $t$ and infinitely often like the function $-t$; etc., etc.

References. For a highly-motivated classical proof of Strassen's Law, see Freedman (1971). For a proof for Brownian motion based on the powerful theory of large deviations, see Stroock (1984).
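The rescaling in Section A4.2 is easy to compute for a finite stretch of walk. The sketch below is illustrative only: the function names are hypothetical, the steps are taken to be fair $\pm 1$ coin tosses (one admissible choice of mean-0, variance-1 variables), and it requires $n \ge 3$ so that $\log\log n > 0$. It evaluates the interpolated path $S_t$ and the rescaled path $Z_n(t)$ on a grid in $[0,1]$.

```python
import math
import random

def interpolate(partial_sums, t):
    """S_t = (t - n) S_{n+1} + (n + 1 - t) S_n for t in [n, n + 1); partial_sums[n] = S_n."""
    n = int(math.floor(t))
    if n >= len(partial_sums) - 1:      # t is the final time: no interpolation needed
        return partial_sums[-1]
    return (t - n) * partial_sums[n + 1] + (n + 1 - t) * partial_sums[n]

def rescaled_path(steps, grid):
    """Z_n(t) = S_{nt} / sqrt(2 n log log n) for each t in grid, where n = len(steps) >= 3."""
    n = len(steps)
    partial_sums = [0.0]                # S_0 = 0
    for x in steps:
        partial_sums.append(partial_sums[-1] + x)
    scale = math.sqrt(2 * n * math.log(math.log(n)))
    return [interpolate(partial_sums, n * t) / scale for t in grid]

rng = random.Random(0)                  # fixed seed, purely for reproducibility
steps = [rng.choice((-1, 1)) for _ in range(10_000)]
print(rescaled_path(steps, [k / 10 for k in range(11)]))
```

A single finite $n$ says nothing about limiting shapes, of course; the code only makes the definition of $Z_n$ concrete.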
A4.3. A model for a Markov chain

Let $E$ be a countable set; let $\mu$ be a probability measure on $(E, \mathcal{E})$, where $\mathcal{E}$ denotes the set of all subsets of $E$; and let $P$ denote a stochastic $E \times E$ matrix as in Section 4.8.

Complicating the notation somewhat for reasons we shall discover later, we wish to construct a probability triple $(\Omega, \mathcal{F}, \mathbf{P}^\mu)$ carrying an $E$-valued stochastic process $(Z_n : n \in \mathbb{Z}^+)$ such that for $n \in \mathbb{Z}^+$ and $i_0, i_1, \ldots, i_n \in E$, we have
\[ \mathbf{P}^\mu(Z_0 = i_0; \ldots; Z_n = i_n) = \mu_{i_0}\, p_{i_0 i_1} \cdots p_{i_{n-1} i_n}. \]

The trick is to make $(\Omega, \mathcal{F}, \mathbf{P}^\mu)$ carry independent $E$-valued variables
\[ (Z_0;\ Y(i,n) : i \in E;\ n \in \mathbb{N}), \]
$Z_0$ having law $\mu$ and such that
\[ \mathbf{P}^\mu(Y(i,n) = j) = p(i,j), \qquad (i, j \in E). \]
We can obviously do this via the construction in Section 4.6. For $\omega \in \Omega$ and $n \in \mathbb{N}$, define
\[ Z_n(\omega) := Y(Z_{n-1}(\omega), n)(\omega); \]
and that's it!
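The construction in Section A4.3 translates almost word for word into code. This is a minimal sketch under stated assumptions: `sample_from` and `build_chain` are hypothetical names, the state space is finite so that the whole family $Y(i,n)$ can be drawn up front, and the three-state matrix is invented for illustration.

```python
import random

def sample_from(law, rng):
    """Draw one state from a dict {state: probability}."""
    u, acc = rng.random(), 0.0
    for state, prob in law.items():
        acc += prob
        if u < acc:
            return state
    return state                        # guard against rounding in the last partial sum

def build_chain(mu, p, n_steps, rng):
    """Return [Z_0, ..., Z_n] built from independent Z_0 ~ mu and Y(i, n) ~ p[i]."""
    z0 = sample_from(mu, rng)
    # The independent family Y(i, n), i in E, 1 <= n <= n_steps, drawn before any Z_n is formed.
    y = {(i, n): sample_from(p[i], rng) for n in range(1, n_steps + 1) for i in p}
    path = [z0]
    for n in range(1, n_steps + 1):
        path.append(y[(path[-1], n)])   # Z_n := Y(Z_{n-1}, n)
    return path

mu = {"a": 0.5, "b": 0.5, "c": 0.0}     # hypothetical initial law
p = {                                   # hypothetical stochastic matrix, one row per state
    "a": {"a": 0.1, "b": 0.9, "c": 0.0},
    "b": {"a": 0.0, "b": 0.5, "c": 0.5},
    "c": {"a": 1.0, "b": 0.0, "c": 0.0},
}
print(build_chain(mu, p, 10, random.Random(1)))
```

Most of the variables $Y(i,n)$ are never looked at; drawing them all regardless is what makes the independence structure of the model transparent.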
Chapter A5

Appendix to Chapter 5

Our task is to prove the Monotone-Convergence Theorem 5.3. We need an elementary preliminary result.

A5.1. Doubly monotone arrays

Proposition. Let
\[ (y_n^{(r)} : r \in \mathbb{N},\ n \in \mathbb{N}) \]
be an array of numbers in $[0, \infty]$ which is doubly monotone:

for fixed $r$, $y_n^{(r)} \uparrow$ as $n \uparrow$, so that $y^{(r)} := \uparrow\lim_n y_n^{(r)}$ exists;
for fixed $n$, $y_n^{(r)} \uparrow$ as $r \uparrow$, so that $y_n := \uparrow\lim_r y_n^{(r)}$ exists.

Then
\[ y^{(\infty)} := \uparrow\lim_r y^{(r)} = \uparrow\lim_n y_n =: y_\infty. \]

Proof. The result is almost trivial. By replacing each $y_n^{(r)}$ by $\arctan y_n^{(r)}$, we can assume that the $y_n^{(r)}$ are uniformly bounded.

Let $\varepsilon > 0$ be given. Choose $n_0$ such that $y_{n_0} > y_\infty - \tfrac12\varepsilon$. Then choose $r_0$ such that $y_{n_0}^{(r_0)} > y_{n_0} - \tfrac12\varepsilon$. Then
\[ y^{(\infty)} \ge y^{(r_0)} \ge y_{n_0}^{(r_0)} > y_\infty - \varepsilon, \]
so that $y^{(\infty)} \ge y_\infty$. Similarly, $y_\infty \ge y^{(\infty)}$. □
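A small exact computation illustrates the Proposition of Section A5.1. The array below is a hypothetical example, chosen only because it increases in each index; the two iterated suprema are truncated at a finite size, so this is a sanity check and not a proof.

```python
from fractions import Fraction

def y(r, n):
    """Hypothetical doubly monotone array with values in [0, 1): increasing in r and in n."""
    return (1 - Fraction(1, r + 1)) * (1 - Fraction(1, n + 1))

R = N = 200                              # finite truncation of both indices
row_limits = [max(y(r, n) for n in range(1, N + 1)) for r in range(1, R + 1)]  # stands in for y^(r)
col_limits = [max(y(r, n) for r in range(1, R + 1)) for n in range(1, N + 1)]  # stands in for y_n
print(float(max(row_limits)), float(max(col_limits)))
```

Monotonicity in both indices is essential: for an array such as $y_n^{(r)} = I_{\{n \ge r\}}$, which decreases in $r$, the two iterated limits are 1 and 0.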
A5.2. The key use of Lemma 1.10(a)

This is where the fundamental monotonicity property of measures is used. Please re-read Section 5.1 at this stage.

LEMMA
(a) Suppose that $A \in \Sigma$ and that
\[ h_n \in SF^+ \quad\text{and}\quad h_n \uparrow I_A. \]
Then $\mu_0(h_n) \uparrow \mu(A)$.

Proof. From (5.1,e), $\mu_0(h_n) \le \mu(A)$, so we need only prove that
\[ \liminf \mu_0(h_n) \ge \mu(A). \]
Let $\varepsilon > 0$, and define $A_n := \{s \in A : h_n(s) > 1 - \varepsilon\}$. Then $A_n \uparrow A$ so that, by Lemma 1.10(a), $\mu(A_n) \uparrow \mu(A)$. But
\[ (1 - \varepsilon) I_{A_n} \le h_n \]
so that, by (5.1,e), $(1 - \varepsilon)\mu(A_n) \le \mu_0(h_n)$. Hence
\[ \liminf \mu_0(h_n) \ge (1 - \varepsilon)\mu(A). \]
Since this is true for every $\varepsilon > 0$, the result follows. □

LEMMA
(b) Suppose that $f \in SF^+$ and that
\[ g_n \in SF^+ \quad\text{and}\quad g_n \uparrow f. \]
Then $\mu_0(g_n) \uparrow \mu_0(f)$.

Proof. We can write $f$ as a finite sum $f = \sum a_k I_{A_k}$ where the sets $A_k$ are disjoint and each $a_k > 0$. Then
\[ a_k^{-1} I_{A_k}\, g_n \uparrow I_{A_k} \qquad (n \uparrow \infty), \]
and the result follows from Lemma (a). □
A5.3. 'Uniqueness of integral'

LEMMA
(a) Suppose that $f \in (\mathrm{m}\Sigma)^+$ and that we have two sequences $(f^{(r)})$ and $(f_n)$ of elements of $SF^+$ such that
\[ f^{(r)} \uparrow f, \qquad f_n \uparrow f. \]
Then
\[ \uparrow\lim \mu_0(f^{(r)}) = \uparrow\lim \mu_0(f_n). \]

Proof. Let $f_n^{(r)} := f^{(r)} \wedge f_n$. Then as $r \uparrow \infty$, $f_n^{(r)} \uparrow f_n$, and as $n \uparrow \infty$, $f_n^{(r)} \uparrow f^{(r)}$. Hence, by Lemma A5.2(b),
\[ \mu_0(f_n^{(r)}) \uparrow \mu_0(f_n) \text{ as } r \uparrow \infty, \qquad \mu_0(f_n^{(r)}) \uparrow \mu_0(f^{(r)}) \text{ as } n \uparrow \infty. \]
The result now follows from Proposition A5.1. □

Recall from Section 5.2 that for $f \in (\mathrm{m}\Sigma)^+$, we define
\[ \mu(f) := \sup\{\mu_0(h) : h \in SF^+;\ h \le f\} \le \infty. \]
By definition of $\mu(f)$, we may choose a sequence $(h_n)$ in $SF^+$ such that $h_n \le f$ and $\mu_0(h_n) \uparrow \mu(f)$. Let us also choose a sequence $(g_n)$ of elements of $SF^+$ such that $g_n \uparrow f$. (We can do this via the 'staircase function' in Section 5.3.) Now let
\[ f_n := \max(g_n, h_1, h_2, \ldots, h_n). \]
Then $f_n \in SF^+$, $f_n \le f$, and since $f_n \ge g_n$, $f_n \uparrow f$. Since $f_n \le f$, $\mu_0(f_n) \le \mu(f)$, and since $f_n \ge h_n$, we see that
\[ \mu_0(f_n) \uparrow \mu(f). \]
On combining this fact with Lemma (a), we obtain the next result. (Lemma (a) 'changes our particular sequence to any sequence'.)

LEMMA
(b) Let $f \in (\mathrm{m}\Sigma)^+$ and let $(f_n)$ be any sequence in $SF^+$ such that $f_n \uparrow f$. Then
\[ \mu_0(f_n) \uparrow \mu(f). \]
A5.4. Proof of the Monotone-Convergence Theorem

Recall the statement:

Let $(f_n)$ be a sequence of elements of $(\mathrm{m}\Sigma)^+$ such that $f_n \uparrow f$. Then $\mu(f_n) \uparrow \mu(f)$.

Proof. Let $\alpha^{(r)}$ denote the $r$th staircase function defined in Section 5.3. Now set $f_n^{(r)} := \alpha^{(r)}(f_n)$, $f^{(r)} := \alpha^{(r)}(f)$. Since $\alpha^{(r)}$ is left-continuous, $f_n^{(r)} \uparrow f^{(r)}$ as $n \uparrow \infty$. Since $\alpha^{(r)}(x) \uparrow x$, $\forall x$, $f_n^{(r)} \uparrow f_n$ as $r \uparrow \infty$. By Lemma A5.2(b), $\mu(f_n^{(r)}) \uparrow \mu(f^{(r)})$ as $n \uparrow \infty$; and by Lemma A5.3(b), $\mu(f_n^{(r)}) \uparrow \mu(f_n)$ as $r \uparrow \infty$. We also know from Lemma A5.3(b) that $\mu(f^{(r)}) \uparrow \mu(f)$. The result now follows from Proposition A5.1. □
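Chapter A9

Appendix to Chapter 9

This chapter is solely devoted to the proof of the 'infinite-product' Theorem 8.6. It may be read after Section 9.10. It is probably something which a keen student who has read all previous appendices should study with a tutor.

A9.1. Infinite products: setting things up

Let $(\Lambda_n : n \in \mathbb{N})$ be a sequence of probability measures on $(\mathbb{R}, \mathcal{B})$. Let
\[ \Omega := \prod_{n \in \mathbb{N}} \mathbb{R}, \]
so that a typical element $\omega$ of $\Omega$ is a sequence $\omega = (\omega_n : n \in \mathbb{N})$ of elements of $\mathbb{R}$. Define $X_n(\omega) := \omega_n$, and set
\[ \mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n). \]
The typical element $F_n$ of $\mathcal{F}_n$ has the form
\[ \text{(a)} \qquad F_n = G_n \times \prod_{k > n} \mathbb{R}, \qquad G_n \in \prod_{1 \le k \le n} \mathcal{B}. \]
Fubini's Theorem shows that on the algebra (NOT σ-algebra)
\[ \mathcal{F}^- := \bigcup_n \mathcal{F}_n \]
we may unambiguously use (a) to define a map $\mathbf{P}^- : \mathcal{F}^- \to [0,1]$ via
\[ \text{(b)} \qquad \mathbf{P}^-(F_n) = (\Lambda_1 \times \cdots \times \Lambda_n)(G_n), \]
and that $\mathbf{P}^-$ is finitely additive on the algebra $\mathcal{F}^-$. However, for each fixed $n$,

(c) $(\Omega, \mathcal{F}_n, \mathbf{P}^-)$ is a bona fide probability triple which may be identified with $\prod_{1 \le k \le n} (\mathbb{R}, \mathcal{B}, \Lambda_k)$ via (a) and (b). Moreover, $X_1, X_2, \ldots, X_n$ are independent RVs on $(\Omega, \mathcal{F}_n, \mathbf{P}^-)$.

We want to prove that

(d) $\mathbf{P}^-$ is countably additive on $\mathcal{F}^-$

(obviously with the intention of using Carathéodory's Theorem 1.7). Now we know from our proof of the existence of Lebesgue measure (see (A1.9,a)) that it is enough to show that

(e) if $(H_r)$ is a sequence of sets in $\mathcal{F}^-$ such that $H_r \supseteq H_{r+1}$, $\forall r$, and if for some $\varepsilon > 0$, $\mathbf{P}^-(H_r) \ge \varepsilon$ for every $r$, then $\bigcap H_r \ne \emptyset$.

A9.2. Proof of (A9.1,e)

Step 1: For every $r$, there is some $n(r)$ such that $H_r \in \mathcal{F}_{n(r)}$ and so
\[ I_{H_r}(\omega) = h_r(\omega_1, \omega_2, \ldots, \omega_{n(r)}) \quad \text{for some } h_r \in \mathrm{b}\mathcal{B}^{n(r)}. \]
Recall that $X_k(\omega) = \omega_k$, and look again at Section A3.2.

Step 2: We have
\[ \text{(a0)} \qquad \mathbf{E}^- h_r(X_1, X_2, \ldots, X_{n(r)}) \ge \varepsilon, \quad \forall r, \]
because the left-hand side of (a0) is exactly $\mathbf{P}^-(H_r)$. If we work within the probability triple $(\Omega, \mathcal{F}_{n(r)}, \mathbf{P}^-)$, then we know from Section 9.10 that
\[ \gamma_r(\omega) := g_r(\omega_1) := \mathbf{E}^- h_r(\omega_1, X_2, X_3, \ldots, X_{n(r)}) \]
is an explicit version of the conditional expectation of $I_{H_r}$ given $\mathcal{F}_1$, and
\[ \varepsilon \le \mathbf{P}^-(H_r) = \mathbf{E}^-(\gamma_r) = \Lambda_1(g_r). \]
Now, $0 \le g_r \le 1$, so that
\[ \varepsilon \le \Lambda_1(g_r) \le 1 \cdot \Lambda_1\{g_r \ge \varepsilon 2^{-1}\} + \varepsilon 2^{-1} \Lambda_1\{g_r < \varepsilon 2^{-1}\} \le \Lambda_1\{g_r \ge \varepsilon 2^{-1}\} + \varepsilon 2^{-1}. \]
Thus
\[ \Lambda_1\{g_r \ge \varepsilon 2^{-1}\} \ge 2^{-1}\varepsilon. \]

Step 3: However, since $H_r \supseteq H_{r+1}$, we have (working within $(\Omega, \mathcal{F}_m, \mathbf{P}^-)$ where both $H_r$ and $H_{r+1}$ are in $\mathcal{F}_m$)
\[ g_r(\omega_1) \ge g_{r+1}(\omega_1), \quad \text{for every } \omega_1 \text{ in } \mathbb{R}. \]
Working on $(\mathbb{R}, \mathcal{B}, \Lambda_1)$ we have
\[ \Lambda_1\{g_r \ge \varepsilon 2^{-1}\} \ge \varepsilon 2^{-1}, \quad \forall r, \]
and $g_r \downarrow$ so that $\{g_r \ge \varepsilon 2^{-1}\} \downarrow$; and by Lemma 1.10(b) on the continuity from above of measures, we have
\[ \Lambda_1\{\omega_1 : g_r(\omega_1) \ge \varepsilon 2^{-1},\ \forall r\} \ge \varepsilon 2^{-1}. \]
Hence, there exists $\omega_1^*$ (say) in $\mathbb{R}$ such that
\[ \text{(a1)} \qquad \mathbf{E}^- h_r(\omega_1^*, X_2, \ldots, X_{n(r)}) \ge \varepsilon 2^{-1}, \quad \forall r. \]

Step 4: We now repeat Steps 2 and 3 applied to the situation in which

$(X_1, X_2, \ldots)$ is replaced by $(X_2, X_3, \ldots)$,
$h_r$ is replaced by $h_r(\omega_1^*)$, where
\[ (h_r(\omega_1^*))(\omega_2, \omega_3, \ldots) := h_r(\omega_1^*, \omega_2, \omega_3, \ldots). \]
We find that there exists $\omega_2^*$ in $\mathbb{R}$ such that
\[ \text{(a2)} \qquad \mathbf{E}^- h_r(\omega_1^*, \omega_2^*, X_3, \ldots, X_{n(r)}) \ge \varepsilon 2^{-2}, \quad \forall r. \]
Proceeding inductively, we obtain a sequence
\[ \omega^* = (\omega_n^* : n \in \mathbb{N}) \]
with the property that
\[ \mathbf{E}^- h_r(\omega_1^*, \omega_2^*, \ldots, \omega_{n(r)}^*) \ge \varepsilon 2^{-n(r)}, \quad \forall r. \]
However,
\[ h_r(\omega_1^*, \omega_2^*, \ldots, \omega_{n(r)}^*) = I_{H_r}(\omega^*), \]
and can only be 0 or 1. The only conclusion is that $\omega^* \in H_r$, $\forall r$; and it was exactly the existence of such an $\omega^*$ which we had to prove.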
Chapter A13

Appendix to Chapter 13

This chapter is devoted to comments on modes of convergence, regarded by many as good for the souls of students, and certainly easy to set examination questions on.

A13.1. Modes of convergence: definitions

Let $(X_n : n \in \mathbb{N})$ be a sequence of RVs and let $X$ be a RV, all carried by our triple $(\Omega, \mathcal{F}, \mathbf{P})$. Let us collect together definitions known to us.

Almost sure convergence

We say that $X_n \to X$ almost surely if
\[ \mathbf{P}(X_n \to X) = 1. \]

Convergence in probability

We say that $X_n \to X$ in probability if, for every $\varepsilon > 0$,
\[ \mathbf{P}(|X_n - X| > \varepsilon) \to 0 \quad \text{as } n \to \infty. \]

$\mathcal{L}^p$ convergence ($p \ge 1$)

We say that $X_n \to X$ in $\mathcal{L}^p$ if each $X_n$ is in $\mathcal{L}^p$ and $X \in \mathcal{L}^p$ and
\[ \|X_n - X\|_p \to 0 \quad \text{as } n \to \infty, \]
equivalently,
\[ \mathbf{E}(|X_n - X|^p) \to 0 \quad \text{as } n \to \infty. \]
A13.2. Modes of convergence: relationships

Let me state the facts.

Convergence in probability is the weakest of the above forms of convergence. Thus

(a) $(X_n \to X$, a.s.$) \Rightarrow (X_n \to X$ in prob$)$,
(b) for $p \ge 1$, $(X_n \to X$ in $\mathcal{L}^p) \Rightarrow (X_n \to X$ in prob$)$.

No other implication between any two of our three forms of convergence is valid. But, of course, for $r \ge p \ge 1$,

(c) $(X_n \to X$ in $\mathcal{L}^r) \Rightarrow (X_n \to X$ in $\mathcal{L}^p)$.

If we know that 'convergence in probability is happening quickly' in that

(d) $\sum_n \mathbf{P}(|X_n - X| > \varepsilon) < \infty$, $\forall \varepsilon > 0$,

then (BC1) allows us to conclude that $X_n \to X$, a.s.

The fact that property (d) implies a.s. convergence is used in proving the following result:

(e) $X_n \to X$ in probability if and only if every subsequence of $(X_n)$ contains a further subsequence along which we have almost sure convergence to $X$.

The only other useful result is that

(f) for $p \ge 1$, $X_n \to X$ in $\mathcal{L}^p$ if and only if the following two statements hold:
(i) $X_n \to X$ in probability,
(ii) the family $(|X_n|^p : n \ge 1)$ is UI.

There is only one way to gain an understanding of the above facts, and that is to prove them yourself. The exercises under EA13 provide guidance if you need it.
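One standard example (not taken from the text above) shows why neither (a) nor (b) can be reversed into a statement about almost sure convergence. Take, as an assumption, $\Omega = [0,1)$ with Lebesgue measure and let $X_n$ be the indicator of a dyadic interval that sweeps across $[0,1)$ again and again while shrinking. The sketch below, with hypothetical function names, computes $\mathbf{P}(|X_n| > \varepsilon)$ exactly and counts how often a fixed $\omega$ is hit.

```python
from fractions import Fraction

def typewriter(n, omega):
    """X_n(omega) for n >= 1: indicator of [j / 2^k, (j + 1) / 2^k), where n = 2^k + j, 0 <= j < 2^k."""
    k = n.bit_length() - 1
    j = n - 2 ** k
    return 1 if Fraction(j, 2 ** k) <= omega < Fraction(j + 1, 2 ** k) else 0

def prob_nonzero(n):
    """P(|X_n| > eps) for any 0 < eps < 1: the length of the dyadic interval."""
    return Fraction(1, 2 ** (n.bit_length() - 1))

omega = Fraction(1, 3)                   # an arbitrary sample point
hits = [n for n in range(1, 2 ** 10) if typewriter(n, omega) == 1]
print(len(hits), float(prob_nonzero(2 ** 10 - 1)))
```

Here $\mathbf{P}(|X_n| > \varepsilon) \to 0$, and indeed $X_n \to 0$ in every $\mathcal{L}^p$, yet each $\omega$ is hit once in every dyadic generation, so $X_n(\omega)$ does not converge. Along the subsequence $n = 2^k$ one does get almost sure convergence, in line with (e).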
Chapter A14

Appendix to Chapter 14

We work with a filtered space $(\Omega, \mathcal{F}, \{\mathcal{F}_n : n \in \mathbb{Z}^+\}, \mathbf{P})$.

This chapter introduces the σ-algebra $\mathcal{F}_T$, where $T$ is a stopping time. The idea is that $\mathcal{F}_T$ represents the information available to our observer at (or, if you prefer, immediately after) time $T$. The Optional-Sampling Theorem says that if $X$ is a uniformly integrable supermartingale and $S$ and $T$ are stopping times with $S \le T$, then we have the natural extension of the supermartingale property:
\[ \mathbf{E}(X_T \mid \mathcal{F}_S) \le X_S, \quad \text{a.s.} \]
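A14.1. The σ-algebra $\mathcal{F}_T$, $T$ a stopping time

Recall that a map $T : \Omega \to \mathbb{Z}^+ \cup \{\infty\}$ is called a stopping time if
\[ \{T \le n\} \in \mathcal{F}_n, \qquad n \in \mathbb{Z}^+ \cup \{\infty\}, \]
equivalently if
\[ \{T = n\} \in \mathcal{F}_n, \qquad n \in \mathbb{Z}^+ \cup \{\infty\}. \]
In each of the above statements, the '$n = \infty$' case follows automatically from the validity of the result for every $n$ in $\mathbb{Z}^+$.

Let $T$ be a stopping time. Then, for $F \subseteq \Omega$, we say that $F \in \mathcal{F}_T$ if

► $F \cap \{T \le n\} \in \mathcal{F}_n$, $n \in \mathbb{Z}^+ \cup \{\infty\}$,

equivalently if

► $F \cap \{T = n\} \in \mathcal{F}_n$, $n \in \mathbb{Z}^+ \cup \{\infty\}$.

Then $\mathcal{F}_T = \mathcal{F}_n$ if $T \equiv n$; $\mathcal{F}_T = \mathcal{F}_\infty$ if $T \equiv \infty$; and $\mathcal{F}_T \subseteq \mathcal{F}_\infty$ for every $T$.

You can easily check that $\mathcal{F}_T$ is a σ-algebra. You can also check that if $S$ is another stopping time, then
\[ \mathcal{F}_{S \wedge T} \subseteq \mathcal{F}_T. \]
Hint. If $F \in \mathcal{F}_{S \wedge T}$, then
\[ F \cap \{T = n\} = \bigcup_{k \le n} \bigl(F \cap \{S \wedge T = k\}\bigr) \cap \{T = n\}. \]

Another detail that needs to be checked is that if $X$ is an adapted process and $T$ is a stopping time, then $X_T \in \mathrm{m}\mathcal{F}_T$. Here, $X_\infty$ is assumed defined in some way such that $X_\infty$ is $\mathcal{F}_\infty$ measurable.

Proof. For $B \in \mathcal{B}$,
\[ \{X_T \in B\} \cap \{T = n\} = \{X_n \in B\} \cap \{T = n\} \in \mathcal{F}_n. \]
□

A14.2. A special case of OST

LEMMA
Let $X$ be a supermartingale. Let $T$ be a stopping time such that, for some $N$ in $\mathbb{N}$, $T(\omega) \le N$, $\forall \omega$. Then $X_T \in \mathcal{L}^1(\Omega, \mathcal{F}_T, \mathbf{P})$ and
\[ \mathbf{E}(X_N \mid \mathcal{F}_T) \le X_T. \]

Proof. Let $F \in \mathcal{F}_T$. Then
\[ \mathbf{E}(X_N; F) = \sum_{n \le N} \mathbf{E}(X_N; F \cap \{T = n\}) \le \sum_{n \le N} \mathbf{E}(X_n; F \cap \{T = n\}) = \mathbf{E}(X_T; F). \]
(Of course, the fact that $|X_T| \le |X_1| + \cdots + |X_N|$ guarantees the result that $\mathbf{E}(|X_T|) < \infty$.) □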
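The inequality in the proof of Lemma A14.2 can be checked exactly on a small example by enumerating every path. Everything specific below is a hypothetical choice made for illustration: a fair $\pm 1$ walk with a constant downward drift (which makes $X$ a supermartingale), the stopping time 'first visit to level 1, capped at $N$', and the sets $F = \{T = n\}$, each of which lies in $\mathcal{F}_T$.

```python
from fractions import Fraction
from itertools import product

N = 6
DRIFT = Fraction(1, 4)                   # hypothetical drift; X_n = (sum of n fair steps) - n * DRIFT

def x_path(steps):
    """Return [X_0, ..., X_N] for one sequence of +1/-1 steps."""
    path, s = [Fraction(0)], 0
    for n, step in enumerate(steps, start=1):
        s += step
        path.append(s - n * DRIFT)
    return path

def stopping_time(steps):
    """T = first n at which the partial sum equals 1, capped at N, so that T <= N."""
    s = 0
    for n, step in enumerate(steps, start=1):
        s += step
        if s == 1:
            return n
    return N

weight = Fraction(1, 2 ** N)             # each of the 2^N coin-toss paths is equally likely
lhs = {n: Fraction(0) for n in range(1, N + 1)}   # E(X_N; T = n)
rhs = {n: Fraction(0) for n in range(1, N + 1)}   # E(X_T; T = n)
for steps in product((-1, 1), repeat=N):
    path, t = x_path(steps), stopping_time(steps)
    lhs[t] += weight * path[N]
    rhs[t] += weight * path[t]
print(all(lhs[n] <= rhs[n] for n in lhs))
```

Summing over $n$ gives $\mathbf{E}(X_N) \le \mathbf{E}(X_T)$, the case $F = \Omega$ of the lemma.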
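A14.3. Doob's Optional-Sampling Theorem for UI martingales

► Let $M$ be a UI martingale. Then, for any stopping time $T$,
\[ \mathbf{E}(M_\infty \mid \mathcal{F}_T) = M_T, \quad \text{a.s.} \]

Corollary 1 (a new Optional-Stopping Theorem!)
If $M$ is a UI martingale, and $T$ is a stopping time, then $\mathbf{E}(|M_T|) < \infty$ and $\mathbf{E}(M_T) = \mathbf{E}(M_0)$.

Corollary 2
If $M$ is a UI martingale and $S$ and $T$ are stopping times with $S \le T$, then
\[ \mathbf{E}(M_T \mid \mathcal{F}_S) = M_S, \quad \text{a.s.} \]

Proof of theorem. By Theorem 14.1 and Lemma A14.2, we have, for $k \in \mathbb{N}$,
\[ \mathbf{E}(M_\infty \mid \mathcal{F}_k) = M_k, \text{ a.s.}, \qquad \mathbf{E}(M_k \mid \mathcal{F}_{T \wedge k}) = M_{T \wedge k}, \text{ a.s.} \]
Hence, by the Tower Property,
\[ (*) \qquad \mathbf{E}(M_\infty \mid \mathcal{F}_{T \wedge k}) = M_{T \wedge k}, \quad \text{a.s.} \]
If $F \in \mathcal{F}_T$, then (check!) $F \cap \{T \le k\} \in \mathcal{F}_{T \wedge k}$, so that, by (*),
\[ (**) \qquad \mathbf{E}(M_\infty; F \cap \{T \le k\}) = \mathbf{E}(M_{T \wedge k}; F \cap \{T \le k\}) = \mathbf{E}(M_T; F \cap \{T \le k\}). \]
We can (and do) restrict attention to the case when $M_\infty \ge 0$, whence $M_n = \mathbf{E}(M_\infty \mid \mathcal{F}_n) \ge 0$ for all $n$. Then, on letting $k \uparrow \infty$ in (**) and using (MON), we obtain
\[ \mathbf{E}(M_\infty; F \cap \{T < \infty\}) = \mathbf{E}(M_T; F \cap \{T < \infty\}). \]
However, the fact that
\[ \mathbf{E}(M_\infty; F \cap \{T = \infty\}) = \mathbf{E}(M_T; F \cap \{T = \infty\}) \]
is tautological. Hence $\mathbf{E}(M_\infty; F) = \mathbf{E}(M_T; F)$. □

Corollary 2 now follows from the Tower Property, and Corollary 1 follows from Corollary 2!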
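A14.4. The result for UI submartingales

A UI submartingale $X$ has Doob decomposition
\[ X = X_0 + M + A, \]
where (Exercise: explain why!) $\mathbf{E}(A_\infty) < \infty$ and $M$ is UI. Hence, if $T$ is a stopping time, then, almost surely,
\[ \mathbf{E}(X_\infty \mid \mathcal{F}_T) = X_0 + \mathbf{E}(M_\infty \mid \mathcal{F}_T) + \mathbf{E}(A_\infty \mid \mathcal{F}_T) = X_0 + M_T + \mathbf{E}(A_\infty \mid \mathcal{F}_T) \ge X_0 + M_T + \mathbf{E}(A_T \mid \mathcal{F}_T) = X_T. \]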
Chapter A16

Appendix to Chapter 16

A16.1. Differentiation under the integral sign

Before stating our theorem on this topic, let us examine the type of application we need in Section 16.3. Suppose that $X$ is a RV such that $\mathbf{E}(|X|) < \infty$ and that $h(t,x) = ix\,e^{itx}$. (We can treat the real and imaginary parts of $h$ separately.) Note that if $[a,b]$ is a subinterval of $\mathbb{R}$, then the variables $\{h(t,X) : t \in [a,b]\}$ are dominated by $|X|$, and so are UI. In the theorem, we shall have
\[ \mathbf{E}H(t,X) = \varphi_X(t) - \varphi_X(a), \qquad t \in [a,b], \]
and we can conclude that $\varphi_X'(t)$ exists and equals $\mathbf{E}h(t,X)$.
THEOREM
Let $X$ be a RV carried by $(\Omega, \mathcal{F}, \mathbf{P})$. Suppose that $a, b \in \mathbb{R}$ with $a < b$, and that
\[ h : [a,b] \times \mathbb{R} \to \mathbb{R} \]
has the properties:
(i) $t \mapsto h(t,x)$ is continuous in $t$ for every $x$ in $\mathbb{R}$,
(ii) $x \mapsto h(t,x)$ is $\mathcal{B}$-measurable in $x$ for every $t$ in $[a,b]$,
(iii) the variables $\{h(t,X) : t \in [a,b]\}$ are UI.
Then
(a) $t \mapsto \mathbf{E}h(t,X)$ is continuous on $[a,b]$,
(b) $h$ is $\mathcal{B}[a,b] \times \mathcal{B}$-measurable,
(c) if $H(t,x) := \int_a^t h(s,x)\,ds$ for $a \le t \le b$, then for $t \in (a,b)$,
\[ \frac{d}{dt}\mathbf{E}H(t,X) \text{ exists and equals } \mathbf{E}h(t,X). \]

Proof of (a). Since we need only consider the 'sequential case $t_n \to t$', result (a) follows immediately from Theorem 13.7. □

Proof of (b). Define
\[ \delta_n := 2^{-n}(b - a), \qquad D_n := (a + \delta_n \mathbb{Z}^+) \cap [a,b], \]
\[ \tau_n(t) := \inf\{\tau \in D_n : \tau \ge t\}, \qquad t \in [a,b], \]
\[ h_n(t,x) := h(\tau_n(t), x), \qquad t \in [a,b],\ x \in \mathbb{R}. \]
Then, for $B \in \mathcal{B}$,
\[ h_n^{-1}(B) = \bigcup_{\tau \in D_n} \bigl( ((\tau - \delta_n, \tau] \cap [a,b]) \times \{x : h(\tau,x) \in B\} \bigr), \]
so that $h_n$ is $\mathcal{B}[a,b] \times \mathcal{B}$-measurable. Since $h_n \to h$ on $[a,b] \times \mathbb{R}$, result (b) follows. □

Proof of (c). For $\Gamma \subseteq [a,b] \times \mathbb{R}$, define
\[ \alpha(\Gamma) := \{(t,\omega) \in [a,b] \times \Omega : (t, X(\omega)) \in \Gamma\}. \]
If $\Gamma = A \times C$ where $A \in \mathcal{B}[a,b]$ and $C \in \mathcal{B}$, then
\[ \alpha(\Gamma) = A \times (X^{-1}C) \in \mathcal{B}[a,b] \times \mathcal{F}. \]
It is now clear that the class of $\Gamma$ for which $\alpha(\Gamma)$ is an element of $\mathcal{B}[a,b] \times \mathcal{F}$ is a σ-algebra containing $\mathcal{B}[a,b] \times \mathcal{B}$. The point of all this is that

(*) $(t,\omega) \mapsto h(t, X(\omega))$ is $\mathcal{B}[a,b] \times \mathcal{F}$ measurable,

since for $B \in \mathcal{B}$, $\{(t,\omega) : h(t,X(\omega)) \in B\}$ is $\alpha(h^{-1}B)$. (Yes, I know, we could have obtained (*) more directly using the $h_n$'s, but it is good to have other methods.) Since the family $\{h(t,X) : t \in [a,b]\}$ is UI, it is bounded in $\mathcal{L}^1$, whence
\[ \int_a^b \mathbf{E}|h(t,X)|\,dt < \infty. \]
Fubini's Theorem now implies that, for $a \le t \le b$,
\[ \int_a^t \mathbf{E}h(s,X)\,ds = \mathbf{E}\int_a^t h(s,X)\,ds = \mathbf{E}H(t,X), \]
and part (c) now follows.
Chapter E

Exercises

Starred exercises are more tricky. The first number in an exercise gives a rough indication of which chapter it depends on. 'G' stands for 'a bit of gumption is all that's necessary'. A number of exercises may also be found in the main text. Some are repeated here. We begin with an

Antidote to measure-theoretic material - just for fun, though the point that probability is more than mere measure theory needs hammering home.

EG.1. Two points are chosen at random on a line AB, each point being chosen according to the uniform distribution on AB, and the choices being made independently of each other. The line AB may now be regarded as divided into three parts. What is the probability that they may be made into a triangle?
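Before attacking EG.1 by hand, it can help to have a number to aim at. The following is a rough Monte Carlo sketch with hypothetical function names; it takes AB to be the unit interval and uses the strict triangle inequalities as the criterion.

```python
import random

def forms_triangle(u, v):
    """True if cutting [0, 1] at u and v gives three lengths obeying the strict triangle inequalities."""
    a, b = min(u, v), max(u, v)
    x, y, z = a, b - a, 1 - b
    return x < y + z and y < x + z and z < x + y

def estimate(trials, seed=0):
    """Fraction of simulated pairs of cut points whose three pieces form a triangle."""
    rng = random.Random(seed)
    hits = sum(forms_triangle(rng.random(), rng.random()) for _ in range(trials))
    return hits / trials

print(estimate(100_000))
```

The estimate only suggests what the answer should be; the exercise is to find it exactly, for instance by drawing the region of the unit square in which `forms_triangle` holds.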
EG.2. Planet X is a ball with centre O. Three spaceships A, B and C land at random on its surface, their positions being independent and each uniformly distributed on the surface. Spaceships A and B can communicate directly by radio if $\angle AOB < 90°$. Show that the probability that they can keep in touch (with, for example, A communicating with B via C if necessary) is $(\pi + 2)/(4\pi)$.

EG.3. Let $G$ be the free group with two generators $a$ and $b$. Start at time 0 with the unit element 1, the empty word. At each second multiply the current word on the right by one of the four elements $a, a^{-1}, b, b^{-1}$, choosing each with probability 1/4 (independently of previous choices). The choices
\[ a,\ a,\ b,\ a^{-1},\ a,\ b^{-1},\ a^{-1},\ a,\ b \]
at times 1 to 9 will produce the reduced word $aab$ of length 3 at time 9. Prove that the probability that the reduced word 1 ever occurs at a positive time is 1/3, and explain why it is intuitively clear that (almost surely)
\[ (\text{length of reduced word at time } n)/n \to \tfrac12. \]
EG.4.* (Continuation) Suppose now that the elements $a, a^{-1}, b, b^{-1}$ are chosen instead with respective probabilities $\alpha, \alpha, \beta, \beta$, where $\alpha > 0$, $\beta > 0$, $\alpha + \beta = \tfrac12$. Prove that the conditional probability that the reduced word 1 ever occurs at a positive time, given that the element $a$ is chosen at time 1, is the unique root $x = r(\alpha)$ (say) in $(0,1)$ of the equation
\[ 3x^3 + (3 - 4\alpha^{-1})x^2 + x + 1 = 0. \]
As time goes on, (it is almost surely true that) more and more of the reduced word becomes fixed, so that a final word is built up. If in the final word, the symbols $a$ and $a^{-1}$ are both replaced by $A$ and the symbols $b$ and $b^{-1}$ are both replaced by $B$, show that the sequence of $A$'s and $B$'s obtained is a Markov chain on $\{A, B\}$ with (for example)
\[ p_{AA} = \frac{\alpha(1 - x)}{\alpha(1 - x) + 2\beta(1 - y)}, \]
where $y = r(\beta)$. What is the (almost sure) limiting proportion of occurrence of the symbol $a$ in the final word? (Note. This result was used by Professor Lyons of Edinburgh to solve a long-standing problem in potential theory on Riemannian manifolds.)
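Algebras, etc.

E1.1. 'Probability' for subsets of $\mathbb{N}$

Let $V \subseteq \mathbb{N}$. Say that $V$ has (Cesàro) density $\gamma(V)$ and write $V \in \mathrm{CES}$ if
\[ \gamma(V) := \lim_{n \to \infty} \frac{\#(V \cap \{1, 2, \ldots, n\})}{n} \]
exists. Give an example of sets $V_1$ and $V_2$ in CES for which $V_1 \cap V_2 \notin \mathrm{CES}$. Thus, CES is not an algebra.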
Independence

E4.1. Let $(\Omega, \mathcal{F}, \mathbf{P})$ be a probability triple. Let $\mathcal{I}_1$, $\mathcal{I}_2$ and $\mathcal{I}_3$ be three π-systems on $\Omega$ such that, for $k = 1, 2, 3$,
\[ \mathcal{I}_k \subseteq \mathcal{F} \quad\text{and}\quad \Omega \in \mathcal{I}_k. \]
Prove that if
\[ \mathbf{P}(I_1 \cap I_2 \cap I_3) = \mathbf{P}(I_1)\mathbf{P}(I_2)\mathbf{P}(I_3) \]
whenever $I_k \in \mathcal{I}_k$ ($k = 1, 2, 3$), then $\sigma(\mathcal{I}_1), \sigma(\mathcal{I}_2), \sigma(\mathcal{I}_3)$ are independent. Why did we require that $\Omega \in \mathcal{I}_k$?

E4.2. Let $s > 1$, and define $\zeta(s) := \sum_{n \in \mathbb{N}} n^{-s}$, as usual. Let $X$ and $Y$ be independent $\mathbb{N}$-valued random variables with
\[ \mathbf{P}(X = n) = \mathbf{P}(Y = n) = n^{-s}/\zeta(s). \]
Prove that the events $(E_p : p \text{ prime})$, where $E_p = \{X \text{ is divisible by } p\}$, are independent. Explain Euler's formula
\[ 1/\zeta(s) = \prod_p (1 - 1/p^s) \]
probabilistically. Prove that
\[ \mathbf{P}(\text{no square other than 1 divides } X) = 1/\zeta(2s). \]
Let $H$ be the highest common factor of $X$ and $Y$. Prove that
\[ \mathbf{P}(H = n) = n^{-2s}/\zeta(2s). \]

E4.3. Let $X_1, X_2, \ldots$ be independent random variables with the same continuous distribution function. Let $E_1 := \Omega$, and, for $n \ge 2$, let
\[ E_n := \{X_n > X_m,\ \forall m < n\} = \{\text{a 'Record' occurs at time } n\}. \]
Convince yourself and your tutor that the events $E_1, E_2, \ldots$ are independent, with $\mathbf{P}(E_n) = 1/n$.
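One way to 'convince yourself' of the claim in E4.3 is an exact finite check. The sketch below rests on an assumption that is part of what the exercise asks you to justify: because the common distribution is continuous, ties have probability 0 and the ranks of $X_1, \ldots, X_n$ form a uniformly random permutation. It then counts records over all permutations of $n = 5$ values; the names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def record_times(values):
    """Set of times n (1-based) at which values[n - 1] exceeds every earlier value."""
    best, times = None, set()
    for n, v in enumerate(values, start=1):
        if best is None or v > best:
            times.add(n)
            best = v
    return times

n = 5
perms = list(permutations(range(n)))     # all rank orderings, each equally likely
prob = {}                                # prob[(a, b)] = P(E_a and E_b); prob[(a, a)] = P(E_a)
for a in range(1, n + 1):
    for b in range(a, n + 1):
        count = sum(1 for p in perms if {a, b} <= record_times(p))
        prob[(a, b)] = Fraction(count, len(perms))
print(prob[(3, 3)], prob[(2, 4)], prob[(2, 2)] * prob[(4, 4)])
```

This checks only the marginal probabilities and pairwise products for one small $n$; full independence of the whole sequence still needs an argument.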
Borel-Cantelli Lemmas

E4.4. Suppose that a coin with probability $p$ of heads is tossed repeatedly. Let $A_k$ be the event that a sequence of $k$ (or more) consecutive heads occurs amongst tosses numbered $2^k, 2^k + 1, 2^k + 2, \ldots, 2^{k+1} - 1$. Prove that
\[ \mathbf{P}(A_k, \text{ i.o.}) = \begin{cases} 1 & \text{if } p \ge \tfrac12, \\ 0 & \text{if } p < \tfrac12. \end{cases} \]
Hint. Let $E_i$ be the event that there are $k$ consecutive heads beginning at toss numbered $2^k + (i - 1)k$. Now make a simple use of the inclusion-exclusion formulae (Lemma 1.9).

E4.5. Prove that if $G$ is a random variable with the normal N(0,1) distribution, then, for $x > 0$,
\[ \mathbf{P}(G > x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-\frac12 y^2}\,dy \le \frac{1}{x\sqrt{2\pi}}\, e^{-\frac12 x^2}. \]
Let $X_1, X_2, \ldots$ be a sequence of independent N(0,1) variables. Prove that, with probability 1, $L \le 1$, where
\[ L := \limsup\bigl(X_n/\sqrt{2 \log n}\bigr). \]
(Harder. Prove that $\mathbf{P}(L = 1) = 1$.) [Hint. See Section 14.8.]
Let $S_n := X_1 + X_2 + \cdots + X_n$. Recall that $S_n/\sqrt{n}$ has the N(0,1) distribution. Prove that
\[ \mathbf{P}(|S_n| < 2\sqrt{n \log n}, \text{ ev}) = 1. \]
Note that this implies the Strong Law: $\mathbf{P}(S_n/n \to 0) = 1$.
Remark. The Law of the Iterated Logarithm states that
\[ \mathbf{P}\Bigl(\limsup \frac{S_n}{\sqrt{2n \log\log n}} = 1\Bigr) = 1. \]
Do not attempt to prove this now! See Section 14.7.

E4.6. Converse to SLLN

Let $Z$ be a non-negative RV. Let $Y$ be the integer part of $Z$. Show that
\[ Y = \sum_{n \in \mathbb{N}} I_{\{Z \ge n\}}, \]
and deduce that
\[ (*) \qquad \sum_{n \in \mathbb{N}} \mathbf{P}[Z \ge n] \le \mathbf{E}(Z) \le 1 + \sum_{n \in \mathbb{N}} \mathbf{P}[Z \ge n]. \]
Let $(X_n)$ be a sequence of IID RVs (independent, identically distributed random variables) with $\mathbf{E}(|X_n|) = \infty$, $\forall n$. Prove that
\[ \sum_n \mathbf{P}[|X_n| > kn] = \infty \ (k \in \mathbb{N}) \quad\text{and}\quad \limsup \frac{|X_n|}{n} = \infty, \text{ a.s.} \]
Deduce that if $S_n = X_1 + X_2 + \cdots + X_n$, then
\[ \limsup \frac{|S_n|}{n} = \infty, \quad \text{a.s.} \]
E4.7. What's fair about a fair game?

Let $X_1, X_2, \ldots$ be independent RVs such that
\[ X_n = \begin{cases} n^2 - 1 & \text{with probability } n^{-2}, \\ -1 & \text{with probability } 1 - n^{-2}. \end{cases} \]
Prove that $\mathbf{E}(X_n) = 0$, $\forall n$, but that if $S_n = X_1 + X_2 + \cdots + X_n$, then
\[ \frac{S_n}{n} \to -1, \quad \text{a.s.} \]
Blackwell's
test of imagination
you
that assumes This exercise chains with two states.
are
familiar
with continuous-parameter
: t
Markov
with
For each n G
state-space
N, let
X^^) =
{X^^'^t) with
> 0}
be a Markov chain
the
two-point
set {0,1}
Q-matrix
and
transition
function
P^^\\t) =
bn)
exp(tQ^^^).Show
,
that,
for
every
t,
Pil\\t)
> bn/{an +
p^'^t)
< 0^/(0^
+ 6n).
=
The processes : n G N) are independent and X^^\\0) (X^^^ Each X^^^has right-continuous paths.
Suppose Prove
0 for
every n.
that that
an =
if t
oo and ân/bn < is a fixed time then

for
oo.
many
(*)
on
P{X(^)(t)= 1
to
infinitely
n} =
0.
convergent
Use Weierstrass'sM-test
[0,1],
show
that
and deduce that
J^n ^^gPoo
(0 is uniformly as tj 0.
P{X(^)(^) = 0 for
Prove
ALL
n}
-. 1
that
P{X(^)(5)
= 0,
tutor
V5
<
^,Vn}
= 0
for every t
>
and
discuss
with
your
why it is
almost surely true
that
Chapter
E:
Exercises
of
229
many
(**)
Now
within
every
non-empty
time
interval,
infinitely
the
X^^^
chains jump.
imagine
the
whole
behaviour.
almost all its time Notes. Almost surely, the process X = (X^^^) spends of sequences with in the countablesubset of {0, l}*^ consisting finitely only and Fubini's Theorem 8.2. However, it I's. This follows from many (*) is a.s. true that X visits uncountable points of {0,1}'^ during every This follows from (**) and the Bairecategory theorem time interval. nonempty A 1.12. one can show that for certain By using much deeper techniques, choices of (on) and (6n),X will almost certainly visit every point of {0,1}*^ often within a finite time. uncountably
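Tail σ-algebras

E4.9. Let $Y_0, Y_1, Y_2, \ldots$ be independent random variables with
\[ \mathbf{P}(Y_n = +1) = \mathbf{P}(Y_n = -1) = \tfrac12, \quad \forall n. \]
For $n \in \mathbb{N}$, define
\[ X_n := Y_0 Y_1 \cdots Y_n. \]
Prove that the variables $X_1, X_2, \ldots$ are independent. Define
\[ \mathcal{Y} := \sigma(Y_1, Y_2, \ldots), \qquad \mathcal{T}_n := \sigma(X_r : r > n). \]
Prove that
\[ \mathcal{L} := \bigcap_n \sigma(\mathcal{Y}, \mathcal{T}_n) \ne \sigma\Bigl(\mathcal{Y}, \bigcap_n \mathcal{T}_n\Bigr) =: \mathcal{R}. \]
Hint. Prove that $Y_0 \in \mathrm{m}\mathcal{L}$ and that $Y_0$ is independent of $\mathcal{R}$.

E4.10. Star Trek, 2

See E10.11, which you can do now.

Dominated-Convergence Theorem

E5.1. Let $S := [0,1]$, $\Sigma := \mathcal{B}(S)$, $\mu := \mathrm{Leb}$. Define $f_n := n I_{(0,1/n)}$. Prove that $f_n(s) \to 0$ for every $s$ in $S$, but that $\mu(f_n) = 1$ for every $n$. Draw a picture of $g := \sup_n |f_n|$, and show that $g \notin \mathcal{L}^1(S, \Sigma, \mu)$.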
Inclusion-Exclusion Formulae

E5.2. Prove the inclusion-exclusion formulae and inequalities of Section 1.9 by considering integrals of indicator functions.

The Strong Law

E7.1. Inverting Laplace transforms

Let $f$ be a bounded continuous function on $[0, \infty)$. The Laplace transform of $f$ is the function $L$ on $(0, \infty)$ defined by
\[ L(\lambda) := \int_0^\infty e^{-\lambda x} f(x)\,dx. \]
Let $X_1, X_2, \ldots$ be independent RVs each with the exponential distribution of rate $\lambda$, so $\mathbf{P}[X > x] = e^{-\lambda x}$, $\mathbf{E}(X) = \frac{1}{\lambda}$, $\mathrm{Var}(X) = \frac{1}{\lambda^2}$. Show that
\[ \mathbf{E}f(S_n) = \frac{(-1)^{n-1}\lambda^n L^{(n-1)}(\lambda)}{(n-1)!}, \]
where $S_n = X_1 + X_2 + \cdots + X_n$, and $L^{(n-1)}$ denotes the $(n-1)$th derivative of $L$. Prove that $f$ may be recovered from $L$ as follows: for $y > 0$,
\[ f(y) = \lim_{n \uparrow \infty} (-1)^{n-1} \frac{(n/y)^n L^{(n-1)}(n/y)}{(n-1)!}. \]
E7.2. The uniform distribution on the sphere $S^{n-1} \subset \mathbb{R}^n$

As usual, write $S^{n-1} = \{x \in \mathbb{R}^n : |x| = 1\}$. You may assume that there is a unique probability measure $\nu^{n-1}$ on $(S^{n-1}, \mathcal{B}(S^{n-1}))$ such that $\nu^{n-1}(A) = \nu^{n-1}(HA)$ for every orthogonal $n \times n$ matrix $H$ and every $A$ in $\mathcal{B}(S^{n-1})$.

Prove that if $X$ is a vector in $\mathbb{R}^n$, the components of which are independent N(0,1) variables, then for every orthogonal $n \times n$ matrix $H$, the vector $HX$ has the same property. Deduce that $X/|X|$ has law $\nu^{n-1}$.

Let $Z_1, Z_2, \ldots$ be independent N(0,1) variables and define
\[ R_n := (Z_1^2 + Z_2^2 + \cdots + Z_n^2)^{1/2}. \]
Prove that $R_n/\sqrt{n} \to 1$, a.s.

Combine these ideas to prove a rather striking fact which relates the normal distribution to the 'infinite-dimensional' sphere and which is important both for Brownian motion and for Fock-space constructions in quantum mechanics:

If, for each $n$, $(Y_1^{(n)}, Y_2^{(n)}, \ldots, Y_n^{(n)})$ is a point chosen on $S^{n-1}$ according to the distribution $\nu^{n-1}$, then
\[ \lim_{n \to \infty} \mathbf{P}(\sqrt{n}\,Y_1^{(n)} \le x) = \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-y^2/2}\,dy, \]
\[ \lim_{n \to \infty} \mathbf{P}(\sqrt{n}\,Y_1^{(n)} \le x_1;\ \sqrt{n}\,Y_2^{(n)} \le x_2) = \Phi(x_1)\Phi(x_2). \]
Hint. $\mathbf{P}(Y_1^{(n)} \le u) = \mathbf{P}(X_1/R_n \le u)$.
Expectation
if
E9.1.
Prove that
is a
sub-<T-algebra
of J^
and if
\302\243 \302\243^(Q,J^,
P)
and
iiY\342\202\254C\\n,g,P)
and
(*)
E(X;G)
= E(F;G)
Q>
for every
for
G in a 7r-system
G in
which
contains
and
generates
Q, then
(*) holds
every
^.
that
E9.2. Suppose
X,Y
\302\243^(J2,J^,P)
and
that
E{X\\Y)
Prove
= F,
a.s.,
E(F|X) = X,
< c) + E{X-Y;X
a.s.
that
P(J\\: =
E{X
F) = 1.
-
Hint Consider
Martingales
F;X
> c,F
<c,Y <c).
E10.1. Pólya's urn

At time 0, an urn contains 1 black ball and 1 white ball. At each time 1, 2, 3, ..., a ball is chosen at random from the urn and is replaced together with a new ball of the same colour. Just after time $n$, there are therefore $n + 2$ balls in the urn, of which $B_n + 1$ are black, where $B_n$ is the number of black balls chosen by time $n$.

Let $M_n = (B_n + 1)/(n + 2)$, the proportion of black balls in the urn just after time $n$. Prove that (relative to a natural filtration which you should specify) $M$ is a martingale.

Prove that $P(B_n = k) = (n + 1)^{-1}$ for $0 \le k \le n$. What is the distribution of $\Theta$, where $\Theta := \lim M_n$?

Prove that for $0 < \theta < 1$,
$$N_n^{\theta} := \frac{(n+1)!}{B_n!\,(n - B_n)!}\, \theta^{B_n} (1 - \theta)^{n - B_n}$$
defines a martingale $N^{\theta}$.

(Continued at E10.8.)
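The two claims of E10.1 can be checked exactly for small $n$ before attempting a proof. The sketch below is only an illustration: the function names `polya_distribution` and `one_step_mean_of_M` are made up for this example, and the code simply propagates the law of $B_n$ one draw at a time with rational arithmetic.

```python
from fractions import Fraction


def polya_distribution(n):
    """Exact law of B_n, the number of black balls chosen by time n.

    The urn starts with 1 black and 1 white ball (illustrative helper,
    not part of the original text).
    """
    dist = {0: Fraction(1)}  # B_0 = 0 with probability 1
    for t in range(n):  # move from time t to time t + 1
        nxt = {}
        for b, pr in dist.items():
            # Just after time t the urn holds b + 1 black balls out of t + 2.
            p_black = Fraction(b + 1, t + 2)
            nxt[b + 1] = nxt.get(b + 1, Fraction(0)) + pr * p_black
            nxt[b] = nxt.get(b, Fraction(0)) + pr * (1 - p_black)
        dist = nxt
    return dist


def one_step_mean_of_M(b, n):
    """E(M_{n+1} | B_n = b), where M_n = (B_n + 1)/(n + 2)."""
    p_black = Fraction(b + 1, n + 2)
    return (p_black * Fraction(b + 2, n + 3)
            + (1 - p_black) * Fraction(b + 1, n + 3))


if __name__ == "__main__":
    for k, pr in sorted(polya_distribution(5).items()):
        print(k, pr)
```

Comparing `one_step_mean_of_M(b, n)` with `Fraction(b + 1, n + 2)` checks the martingale property one state at a time; it does not replace the proof.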
E10.2. Martingale formulation of Bellman's Optimality Principle

Your winnings per unit stake on game $n$ are $\varepsilon_n$, where the $\varepsilon_n$ are IID RVs with
$$P(\varepsilon_n = +1) = p, \qquad P(\varepsilon_n = -1) = q, \qquad \text{where } \tfrac{1}{2} < p = 1 - q < 1.$$
Your stake $C_n$ on game $n$ must lie between 0 and $Z_{n-1}$, where $Z_{n-1}$ is your fortune at time $n - 1$. Your object is to maximize the expected 'interest rate' $E \log(Z_N/Z_0)$, where $N$ is a given integer representing the length of the game, and $Z_0$, your fortune at time 0, is a given constant. Let $\mathcal{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n)$ be your 'history' up to time $n$. Show that if $C$ is any (previsible) strategy, then $\log Z_n - n\alpha$ is a supermartingale, where $\alpha$ denotes the 'entropy'
$$\alpha = p \log p + q \log q + \log 2,$$
so that $E \log(Z_N/Z_0) \le N\alpha$, but that, for a certain strategy, $\log Z_n - n\alpha$ is a martingale. What is the best strategy?
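E10.3. Stopping times

Suppose that $S$ and $T$ are stopping times (relative to $(\Omega, \mathcal{F}, \{\mathcal{F}_n\})$). Prove that $S \wedge T$ ($:= \min(S, T)$), $S \vee T$ ($:= \max(S, T)$) and $S + T$ are stopping times.

E10.4. Let $S$ and $T$ be stopping times with $S \le T$. Define the process $1_{(S,T]}$ with parameter set $\mathbb{N}$ via
$$1_{(S,T]}(n, \omega) := \begin{cases} 1 & \text{if } S(\omega) < n \le T(\omega), \\ 0 & \text{otherwise.} \end{cases}$$
Prove that $1_{(S,T]}$ is previsible, and deduce that if $X$ is a supermartingale, then
$$E(X_{T \wedge n}) \le E(X_{S \wedge n}), \quad \forall n.$$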
E10.5. 'What always stands a reasonable chance of happening will (almost surely) happen - sooner rather than later.'

Suppose that $T$ is a stopping time such that for some $N \in \mathbb{N}$ and some $\varepsilon > 0$, we have, for every $n$:
$$P(T \le n + N \mid \mathcal{F}_n) > \varepsilon, \text{ a.s.}$$
Prove by induction using $P(T > kN) = P(T > kN;\ T > (k-1)N)$ that for $k = 1, 2, 3, \ldots$
$$P(T > kN) \le (1 - \varepsilon)^k.$$
Show that $E(T) < \infty$.
E10.6. ABRACADABRA

At each of times 1, 2, 3, ..., a monkey types a capital letter at random, the sequence of letters typed forming an IID sequence of RVs each chosen uniformly from amongst the 26 possible capital letters.

Just before each time $n = 1, 2, \ldots$, a new gambler arrives on the scene. He bets \$1 that

the $n$th letter will be A.

If he loses, he leaves. If he wins, he receives \$26 all of which he bets on the event that

the $(n + 1)$th letter will be B.

If he loses, he leaves. If he wins, he bets his whole current fortune of \$$26^2$ that

the $(n + 2)$th letter will be R

and so on through the ABRACADABRA sequence. Let $T$ be the first time by which the monkey has produced the consecutive sequence ABRACADABRA. Explain why martingale theory makes it intuitively obvious that
$$E(T) = 26^{11} + 26^4 + 26$$
and use result 10.10(c) to prove this. (See Ross (1983) for other such applications.)
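The value of $E(T)$ quoted in E10.6 can be read off from the pattern: at time $T$ the only gamblers still holding money are those whose winning run is both a beginning and an end of the pattern, and a gambler with a run of length $k$ holds $26^k$. The sketch below computes that sum. The helper name `expected_waiting_time` is an assumption made for this illustration, and the code takes the formula on trust rather than proving it.

```python
def expected_waiting_time(pattern, alphabet_size):
    """Sum alphabet_size**k over each k such that the first k letters
    of `pattern` equal its last k letters (illustrative helper)."""
    total = 0
    for k in range(1, len(pattern) + 1):
        if pattern[:k] == pattern[-k:]:
            total += alphabet_size ** k
    return total


if __name__ == "__main__":
    print(expected_waiting_time("ABRACADABRA", 26))
```

For ABRACADABRA the matching lengths are 1 (A), 4 (ABRA) and 11 (the whole word), which gives the three terms in the exercise. For a fair coin the same function gives 6 for HH and 4 for HT, the familiar waiting times for those patterns.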
E10.7. Gambler's Ruin

Suppose that $X_1, X_2, \ldots$ are IID RVs with
$$P[X = +1] = p, \qquad P[X = -1] = q, \qquad \text{where } 0 < p = 1 - q < 1,$$
and $p \ne q$. Suppose that $a$ and $b$ are integers with $0 < a < b$. Define
$$S_n := a + X_1 + \cdots + X_n, \qquad T := \inf\{n : S_n = 0 \text{ or } S_n = b\}.$$
Let $\mathcal{F}_n = \sigma(X_1, \ldots, X_n)$ ($\mathcal{F}_0 = \{\emptyset, \Omega\}$). Explain why $T$ satisfies the conditions in Question E10.5. Prove that
$$M_n := \left(\frac{q}{p}\right)^{S_n} \quad \text{and} \quad N_n = S_n - n(p - q)$$
define martingales $M$ and $N$. Deduce the values of $P(S_T = 0)$ and $E(S_T)$.
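As a check on whatever answer you obtain for E10.7, the sketch below evaluates the expressions that optional stopping applied to $M$ and $N$ suggests, using exact rational arithmetic. The function names `ruin_probability` and `expected_duration` are hypothetical, and the code assumes that the stopping argument is justified (which is what the exercise asks you to show).

```python
from fractions import Fraction


def ruin_probability(a, b, p):
    """P(S_T = 0) from E(M_T) = M_0 with M_n = (q/p)**S_n (needs p != q)."""
    q = 1 - p
    rho = q / p
    return (rho ** a - rho ** b) / (1 - rho ** b)


def expected_duration(a, b, p):
    """E(T) from E(N_T) = N_0 with N_n = S_n - n(p - q)."""
    q = 1 - p
    mean_final = b * (1 - ruin_probability(a, b, p))  # E(S_T)
    return (mean_final - a) / (p - q)


if __name__ == "__main__":
    p = Fraction(2, 3)
    print(ruin_probability(1, 3, p), expected_duration(1, 3, p))
```

An independent check is that $h(x) =$ `ruin_probability(x, b, p)` satisfies $h(x) = p\,h(x+1) + q\,h(x-1)$ for $0 < x < b$, with $h(0) = 1$ and $h(b) = 0$.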
E10.8. Bayes' urn

A random number $\Theta$ is chosen uniformly between 0 and 1, and a coin with probability $\Theta$ of heads is minted. The coin is tossed repeatedly. Let $B_n$ be the number of heads in $n$ tosses. Prove that $(B_n)$ has exactly the same probabilistic structure as the $(B_n)$ sequence in (E10.1) on Pólya's urn. Prove that $N_n^{\theta}$ is a regular conditional pdf of $\Theta$ given $B_1, B_2, \ldots, B_n$.

(Continued at E18.5.)
E10.9. Show that if $X$ is a non-negative supermartingale and $T$ is a stopping time, then
$$E(X_T;\ T < \infty) \le E(X_0).$$
(Hint. Recall Fatou's Lemma.) Deduce that
$$c\,P(\sup_n X_n \ge c) \le E(X_0).$$

E10.10*. The 'Star-ship Enterprise' Problem

The control system on the star-ship Enterprise has gone wonky. All that one can do is to set a distance to be travelled. The spaceship will then move that distance in a randomly chosen direction, then stop. The object is to get into the Solar System, a ball of radius $r$. Initially, the Enterprise is at a distance $R_0$ ($> r$) from the Sun.

Let $R_n$ be the distance from Sun to Enterprise after $n$ 'space-hops'. Use Gauss's theorems on potentials due to spherically-symmetric charge distributions to show that whatever strategy is adopted, $1/R_n$ is a supermartingale, and that for any strategy which always sets a distance no greater than that from Sun to Enterprise, $1/R_n$ is a martingale. Use (E10.9) to show that
$$P[\text{Enterprise gets into Solar System}] \le r/R_0.$$
For each $\varepsilon > 0$, you can choose a strategy which makes this probability greater than $(r/R_0) - \varepsilon$. What kind of strategy will this be?
E10.11*. Star Trek, 2

'Captain's Log ...

Mr Spock and Chief Engineer Scott have modified the control system so that the Enterprise is confined to move for ever in a fixed plane passing through the Sun. However, the next 'hop-length' is now automatically set to be the current distance to the Sun ('next' and 'current' being updated in the obvious way). Spock is muttering something about logarithms and random walks, but I wonder whether it is (almost) certain that we will get into the Solar System sometime ... '

Hint. Let $X_n := \log R_n - \log R_{n-1}$. Prove that $X_1, X_2, \ldots$ is an IID sequence of variables each of mean 0 and finite variance $\sigma^2$ (say), where $\sigma > 0$. Let
$$S_n := X_1 + X_2 + \cdots + X_n.$$
Prove that if $\alpha$ is a fixed positive number, then
$$P[\inf_n S_n = -\infty] \ge P[S_n \le -\alpha\sigma\sqrt{n}, \text{ i.o.}] \ge \limsup P[S_n \le -\alpha\sigma\sqrt{n}] = \Phi(-\alpha) > 0.$$
(Use the Central Limit Theorem.) Prove that the event $\{\inf_n S_n = -\infty\}$ is in the tail $\sigma$-algebra of the $(X_n)$ sequence.

E12.1. Branching Process

A branching process $Z = \{Z_n : n \ge 0\}$ is constructed in the usual way. Thus, a family $\{X_k^{(n)} : n, k \ge 1\}$ of IID $\mathbb{Z}^+$-valued random variables is supposed given. We define $Z_0 := 1$ and then define recursively:
$$Z_{n+1} := X_1^{(n+1)} + \cdots + X_{Z_n}^{(n+1)} \quad (n \ge 0).$$
Assume that if $X$ denotes any one of the $X_k^{(n)}$, then
$$\mu := E(X) < \infty \quad \text{and} \quad 0 < \sigma^2 := \mathrm{Var}(X) < \infty.$$
Prove that $M_n := Z_n/\mu^n$ defines a martingale $M$ relative to the filtration $\mathcal{F}_n = \sigma(Z_0, Z_1, \ldots, Z_n)$. Show that
$$E(Z_{n+1}^2 \mid \mathcal{F}_n) = \mu^2 Z_n^2 + \sigma^2 Z_n$$
and deduce that $M$ is bounded in $\mathcal{L}^2$ if and only if $\mu > 1$. Show that when $\mu > 1$,
$$\mathrm{Var}(M_\infty) = \sigma^2 \{\mu(\mu - 1)\}^{-1}.$$
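The $\mathcal{L}^2$ behaviour of $M_n = Z_n/\mu^n$ in E12.1 can be explored numerically. The sketch below assumes the standard conditional second-moment identity $E(Z_{k+1}^2 \mid \mathcal{F}_k) = \mu^2 Z_k^2 + \sigma^2 Z_k$ and takes expectations to get a recursion for $E(Z_k)$ and $E(Z_k^2)$. The function name `second_moment_of_M` is made up for this illustration.

```python
from fractions import Fraction


def second_moment_of_M(n, mu, sigma2):
    """E(M_n^2) for M_n = Z_n / mu**n with Z_0 = 1.

    Uses E(Z_{k+1}) = mu E(Z_k) and
    E(Z_{k+1}^2) = mu^2 E(Z_k^2) + sigma^2 E(Z_k)   (assumed identity).
    """
    ez, ez2 = Fraction(1), Fraction(1)  # E(Z_0), E(Z_0^2)
    for _ in range(n):
        # Both right-hand sides use the values from the previous generation.
        ez, ez2 = mu * ez, mu * mu * ez2 + sigma2 * ez
    return ez2 / mu ** (2 * n)


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, float(second_moment_of_M(n, Fraction(2), Fraction(1))))
```

With $\mu = 2$ and $\sigma^2 = 1$ the values increase towards $3/2$, while with $\mu = 1$ they grow linearly in $n$, which illustrates why boundedness in $\mathcal{L}^2$ needs $\mu > 1$.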
E12.2. Use of Kronecker's Lemma

Let $E_1, E_2, \ldots$ be independent events with $P(E_n) = 1/n$. Let $Y_i = I_{E_i}$. Prove that $\sum (Y_k - \tfrac{1}{k})/\log k$ converges a.s., and use Kronecker's Lemma to deduce that
$$\frac{N_n}{\log n} \to 1, \text{ a.s.},$$
where $N_n := Y_1 + \cdots + Y_n$. An interesting application is to E4.3, when $N_n$ becomes the number of records by time $n$.

E12.3. Star Trek, 3

Prove that if the strategy in E10.11 is (in the obvious sense) employed - and for ever - in $\mathbb{R}^3$ rather than in $\mathbb{R}^2$, then
$$\sum R_n^{-2} < \infty, \text{ a.s.},$$
where $R_n$ is the distance from the Enterprise to the Sun at time $n$.

Note. It should be obvious which result plays the key role here, but you should try to make your argument fully rigorous.
Uniform Integrability

E13.1. Prove that a class $\mathcal{C}$ of RVs is UI if and only if both of the following conditions (i) and (ii) hold:

(i) $\mathcal{C}$ is bounded in $\mathcal{L}^1$, so that $A := \sup\{E(|X|) : X \in \mathcal{C}\} < \infty$,

(ii) for every $\varepsilon > 0$, $\exists \delta > 0$ such that if $F \in \mathcal{F}$, $P(F) < \delta$ and $X \in \mathcal{C}$, then $E(|X|; F) < \varepsilon$.

Hint for 'if'. For $X \in \mathcal{C}$, $P(|X| > K) \le K^{-1} A$.

Hint for 'only if'. $E(|X|; F) \le E(|X|;\ |X| > K) + K P(F)$.

E13.2. Prove that if $\mathcal{C}$ and $\mathcal{D}$ are UI classes of RVs, and if we define
$$\mathcal{C} + \mathcal{D} := \{X + Y : X \in \mathcal{C},\ Y \in \mathcal{D}\},$$
then $\mathcal{C} + \mathcal{D}$ is UI.

Hint. One way to prove this is to use E13.1.

E13.3. Let $\mathcal{C}$ be a UI family of RVs. Say that $Y \in \mathcal{D}$ if for some $X \in \mathcal{C}$ and some sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$, we have $Y = E(X|\mathcal{G})$, a.s. Prove that $\mathcal{D}$ is UI.
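E14.1. Hunt's Lemma

Suppose that $(X_n)$ is a sequence of RVs such that $X := \lim X_n$ exists a.s. and that $(X_n)$ is dominated by $Y$ in $(\mathcal{L}^1)^+$:
$$|X_n(\omega)| \le Y(\omega), \quad \forall (n, \omega), \qquad \text{and } E(Y) < \infty.$$
Let $\{\mathcal{F}_n\}$ be any filtration. Prove that
$$E(X_n \mid \mathcal{F}_n) \to E(X \mid \mathcal{F}_\infty) \quad \text{a.s.}$$

Hint. Let $Z_m := \sup_{r \ge m} |X_r - X|$. Prove that $Z_m \to 0$ a.s. and in $\mathcal{L}^1$. Prove that for $n \ge m$, we have, almost surely,
$$|E(X_n \mid \mathcal{F}_n) - E(X \mid \mathcal{F}_\infty)| \le |E(X \mid \mathcal{F}_n) - E(X \mid \mathcal{F}_\infty)| + E(Z_m \mid \mathcal{F}_n).$$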
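E14.2. Azuma-Hoeffding Inequality

(a) Show that if $Y$ is a RV with values in $[-c, c]$ and with $E(Y) = 0$, then, for $\theta \in \mathbb{R}$,
$$E e^{\theta Y} \le \cosh \theta c \le \exp\left(\tfrac{1}{2} \theta^2 c^2\right).$$

(b) Prove that if $M$ is a martingale null at 0 such that for some sequence $(c_n : n \in \mathbb{N})$ of positive constants,
$$|M_n - M_{n-1}| \le c_n, \quad \forall n,$$
then, for $x > 0$,
$$P\left(\sup_{k \le n} M_k \ge x\right) \le \exp\left(-\tfrac{1}{2} x^2 \Big/ \sum_{k=1}^{n} c_k^2\right).$$

Hint for (a). Let $f(z) := \exp(\theta z)$, $z \in [-c, c]$. Then, since $f$ is convex,
$$f(y) \le \frac{c - y}{2c} f(-c) + \frac{c + y}{2c} f(c).$$

Hint for (b). See the proof of (14.7,a).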
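To see how much room the bound in E14.2(b) leaves, one can compare it with an exact calculation for simple symmetric random walk, which is a martingale null at 0 with $c_k = 1$. The sketch below is an illustration under that assumption. The names `azuma_bound` and `prob_max_at_least` are hypothetical, and the exact probability is found by pushing mass forward one step at a time and absorbing paths when they first reach level $x$.

```python
import math


def azuma_bound(x, c):
    """exp(-x^2 / (2 * sum of c_k^2)), the right-hand side in E14.2(b)."""
    return math.exp(-0.5 * x * x / sum(ck * ck for ck in c))


def prob_max_at_least(n, x):
    """Exact P(max_{k<=n} S_k >= x) for simple symmetric random walk,
    where x is a positive integer."""
    alive = {0: 1.0}  # mass on paths that have not yet reached level x
    hit = 0.0         # mass on paths that have already reached level x
    for _ in range(n):
        nxt = {}
        for s, pr in alive.items():
            for step in (-1, 1):
                t = s + step
                if t >= x:
                    hit += pr / 2
                else:
                    nxt[t] = nxt.get(t, 0.0) + pr / 2
        alive = nxt
    return hit


if __name__ == "__main__":
    n = 20
    for x in (2, 6, 10):
        print(x, prob_max_at_least(n, x), azuma_bound(x, [1.0] * n))
```

The exact probabilities stay below the bound for every level, as the inequality requires, but the gap varies with $x$, so the bound is best thought of as a convenient exponential estimate rather than a sharp value.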
Characteristic Functions

E16.1. Prove that
$$\lim_{T \uparrow \infty} \int_0^T x^{-1} \sin x \, dx = \pi/2$$
by integrating $\int z^{-1} e^{iz}\, dz$ around the contour formed by the 'upper' semicircles of radii $\varepsilon$ and $T$ and the intervals $[-T, -\varepsilon]$ and $[\varepsilon, T]$.

E16.2. Prove that if $Z$ has the U[-1, 1] distribution, then
$$\varphi_Z(\theta) = (\sin \theta)/\theta,$$
and prove that there do not exist IID RVs $X$ and $Y$ such that
$$X - Y \sim \mathrm{U}[-1, 1].$$

E16.3. Suppose that $X$ has the Cauchy distribution, and let $\theta > 0$. By integrating $e^{i\theta z}/(1 + z^2)$ around the semicircle formed by $[-R, R]$ together with the 'upper' semicircle centre 0 of radius $R$, prove that $\varphi_X(\theta) = e^{-\theta}$. Show that $\varphi_X(\theta) = e^{-|\theta|}$ for all $\theta$. Prove that if $X_1, X_2, \ldots, X_n$ are IID RVs each with the standard Cauchy distribution, then $(X_1 + \cdots + X_n)/n$ also has the standard Cauchy distribution.
E16.4. Suppose that $X$ has the standard normal N(0,1) distribution. Let $\theta > 0$. Consider $\int (2\pi)^{-1/2} \exp(-\tfrac{1}{2} z^2)\, dz$ around the rectangular contour
$$(-R - i\theta) \to (R - i\theta) \to R \to (-R) \to (-R - i\theta),$$
and prove that $\varphi_X(\theta) = \exp(-\tfrac{1}{2} \theta^2)$.

E16.5. Prove that if $\varphi$ is the characteristic function of a RV $X$, then $\varphi$ is non-negative definite in that for complex $c_1, c_2, \ldots, c_n$ and real $\theta_1, \theta_2, \ldots, \theta_n$,
$$\sum_j \sum_k c_j \bar{c}_k\, \varphi(\theta_j - \theta_k) \ge 0.$$
(Hint. Express LHS as the expectation of ... .) Bochner's Theorem says that $\varphi$ is a characteristic function if and only if $\varphi(0) = 1$, $\varphi$ is continuous, and $\varphi$ is non-negative definite! (It is of course understood that here $\varphi : \mathbb{R} \to \mathbb{C}$.) E18.6 gives a simpler result in the same spirit.
E16.6. (a) Let $(\Omega, \mathcal{F}, P) = ([0,1], \mathcal{B}[0,1], \mathrm{Leb})$. What is the distribution of the RV $Z$, where $Z(\omega) := 2\omega - 1$? Let $\omega = \sum 2^{-n} R_n(\omega)$ be the binary expansion of $\omega$. Let
$$U(\omega) = \sum_{\text{odd } n} 2^{-n} Q_n(\omega), \quad \text{where } Q_n(\omega) = 2R_n(\omega) - 1.$$
Find a random variable $V$ independent of $U$ such that $U$ and $V$ are identically distributed and $U + \tfrac{1}{2} V$ is uniformly distributed on $[-1, 1]$.

(b) Now suppose that (on some probability triple) $X$ and $Y$ are IID RVs such that
$$X + \tfrac{1}{2} Y \text{ is uniformly distributed on } [-1, 1].$$
Let $\varphi$ be the CF of $X$. Calculate $\varphi(\theta)/\varphi(\tfrac{1}{4}\theta)$. Show that the distribution of $X$ must be the same as that of $U$ in part (a), and deduce that there exists a set $F \in \mathcal{B}[-1, 1]$ such that $\mathrm{Leb}(F) = 0$ and $P(X \in F) = 1$.
E18.1. (a) Suppose that $\lambda > 0$ and that (for $n > \lambda$) $F_n$ is the DF associated with the Binomial distribution $B(n, \lambda/n)$. Prove (using CFs) that $F_n$ converges weakly to $F$ where $F$ is the DF of the Poisson distribution with parameter $\lambda$.

(b) Suppose that $X_1, X_2, \ldots$ are IID RVs each with the density function $(1 - \cos x)/\pi x^2$ on $\mathbb{R}$. Prove that for $x \in \mathbb{R}$,
$$\lim_{n \to \infty} P\left(\frac{X_1 + X_2 + \cdots + X_n}{n} \le x\right) = \tfrac{1}{2} + \pi^{-1} \arctan x,$$
where $\arctan \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2})$.

E18.2. Prove the Weak Law of Large Numbers in the following form. Suppose that $X_1, X_2, \ldots$ are IID RVs, each with the same distribution as $X$. Suppose that $X \in \mathcal{L}^1$ and that $E(X) = \mu$. Prove by the use of CFs that the distribution of
$$A_n := n^{-1}(X_1 + \cdots + X_n)$$
converges weakly to the unit mass at $\mu$. Deduce that
$$A_n \to \mu \text{ in probability.}$$
Of course, SLLN implies this Weak Law.

Weak Convergence for Prob[0,1]
E18.3. Let $X$ and $Y$ be RVs taking values in $[0,1]$. Suppose that
$$E(X^k) = E(Y^k), \quad k = 0, 1, 2, \ldots.$$
Prove that

(i) $E p(X) = E p(Y)$ for every polynomial $p$,

(ii) $E f(X) = E f(Y)$ for every continuous function $f$ on $[0,1]$,

(iii) $P(X \le x) = P(Y \le x)$ for every $x$ in $[0,1]$.

Hint for (ii). Use the Weierstrass Theorem 7.4.

E18.4. Suppose that $(F_n)$ is a sequence of DFs with
$$F_n(x) = 0 \text{ for } x < 0, \qquad F_n(1) = 1, \qquad \text{for every } n.$$
Suppose that
$$(*) \qquad m_k := \lim_n \int_{[0,1]} x^k\, dF_n \text{ exists for } k = 0, 1, 2, \ldots.$$
Use the Helly-Bray Lemma and E18.3 to show that $F_n \xrightarrow{w} F$, where $F$ is characterized by $\int_{[0,1]} x^k\, dF = m_k$, $\forall k$.
E18.5. Improving on E18.3: A Moment Inversion Formula

Let $F$ be a distribution with $F(0-) = 0$ and $F(1) = 1$. Let $\mu$ be the associated law, and define
$$m_k := \int_{[0,1]} x^k\, dF(x).$$
Define
$$\Omega = [0,1] \times [0,1]^{\mathbb{N}}, \quad \mathcal{F} = \mathcal{B} \times \mathcal{B}^{\mathbb{N}}, \quad P = \mu \times \mathrm{Leb}^{\mathbb{N}},$$
$$\Theta(\omega) = \omega_0, \quad H_k(\omega) = I_{[0, \omega_0]}(\omega_k).$$
This models the situation in which $\Theta$ is chosen with law $\mu$, a coin with probability $\Theta$ of heads is then minted, and tossed at times 1, 2, ... . See E10.8. The RV $H_k$ is 1 if the $k$th toss produces heads, 0 otherwise. Define
$$S_n := H_1 + H_2 + \cdots + H_n.$$
By the Strong Law and Fubini's Theorem,
$$S_n/n \to \Theta, \text{ a.s.}$$
Define a map $D$ on the space of real sequences $(a_n : n \in \mathbb{Z}^+)$ by setting
$$Da = (a_n - a_{n+1} : n \in \mathbb{Z}^+).$$
Prove that
$$(*) \qquad F_n(x) := \sum_{i \le nx} \binom{n}{i} (D^{n-i} m)_i \to F(x)$$
at every point $x$ of continuity of $F$.

E18.6*. Moment Problem

Prove that if $(m_k : k \in \mathbb{Z}^+)$ is a sequence of numbers in $[0,1]$, then there exists a RV $X$ with values in $[0,1]$ such that $E(X^k) = m_k$ if and only if $m_0 = 1$ and
$$(D^r m)_s \ge 0 \quad (r, s \in \mathbb{Z}^+).$$
Hint. Define $F_n$ via E18.5(*), and then verify that E18.4(*) holds. You can show that the moments $m_{n,k}$ of $F_n$ satisfy
$$m_{n,0} = 1, \quad m_{n,1} = m_1, \quad m_{n,2} = m_2 + n^{-1}(m_1 - m_2), \quad \text{etc.}$$
You discover the algebra!
Weak Convergence for Prob[0, ∞)

E18.7. Using Laplace transforms instead of CFs

Suppose that $F$ and $G$ are DFs on $\mathbb{R}$ such that $F(0-) = G(0-) = 0$, and
$$\int_{[0,\infty)} e^{-\lambda x}\, dF(x) = \int_{[0,\infty)} e^{-\lambda x}\, dG(x), \quad \forall \lambda \ge 0.$$
Note that the integral on LHS has a contribution $F(0)$ from $\{0\}$. Prove that $F = G$. [Hint. One could derive this from the idea in E7.1. However, it is easier to use E18.3, because we know that if $X$ has DF $F$ and $Y$ has DF $G$, then
$$E[(e^{-X})^n] = E[(e^{-Y})^n], \quad n = 0, 1, 2, \ldots.]$$

Suppose that $(F_n)$ is a sequence of distribution functions on $\mathbb{R}$ each with $F_n(0-) = 0$ and such that
$$L(\lambda) := \lim_n \int e^{-\lambda x}\, dF_n(x)$$
exists for $\lambda \ge 0$ and that $L$ is continuous at 0. Prove that $F_n$ is tight and that
$$F_n \xrightarrow{w} F \quad \text{where } \int e^{-\lambda x}\, dF(x) = L(\lambda), \quad \forall \lambda \ge 0.$$
Modes of convergence

EA13.1. (a) Prove that $(X_n \to X, \text{ a.s.}) \Rightarrow (X_n \to X \text{ in prob})$.

Hint. See Section 13.5.

(b) Prove that $(X_n \to X \text{ in prob}) \not\Rightarrow (X_n \to X, \text{ a.s.})$.

Hint. Let $X_n = I_{E_n}$, where $E_1, E_2, \ldots$ are independent events.

(c) Prove that if $\sum P(|X_n - X| > \varepsilon) < \infty$, $\forall \varepsilon > 0$, then $X_n \to X$, a.s.

Hint. Show that the set $\{\omega : X_n(\omega) \not\to X(\omega)\}$ may be written
$$\bigcup_{k \in \mathbb{N}} \{\omega : |X_n(\omega) - X(\omega)| > k^{-1} \text{ for infinitely many } n\}.$$

(d) Suppose that $X_n \to X$ in probability. Prove that there is a subsequence $(X_{n_k})$ of $(X_n)$ such that $X_{n_k} \to X$, a.s.

Hint. Combine (c) with the 'diagonal principle'.

(e) Deduce from (a) and (d) that $X_n \to X$ in probability if and only if every subsequence of $(X_n)$ contains a further subsequence which converges a.s. to $X$.

EA13.2. Recall that if $\xi$ is a random variable with the standard normal N(0,1) distribution, then
$$E e^{\lambda \xi} = \exp(\tfrac{1}{2} \lambda^2).$$
Suppose that $\xi_1, \xi_2, \ldots$ are IID RVs each with the N(0,1) distribution. Let $S_n = \sum_{k=1}^{n} \xi_k$, let $a, b \in \mathbb{R}$, and define
$$X_n = \exp(a S_n - b n).$$
Prove that
$$(X_n \to 0, \text{ a.s.}) \iff (b > 0)$$
but that for $r \ge 1$,
$$(X_n \to 0 \text{ in } \mathcal{L}^r) \iff (r < 2b/a^2).$$
References

Aldous, D. (1989), Probability Approximations via the Poisson Clumping Heuristic, Springer, New York.
Athreya, K.B. and Ney, P. (1972), Branching Processes, Springer, New York, Berlin.
Billingsley, P. (1968), Convergence of Probability Measures, Wiley, New York.
Billingsley, P. (1979), Probability and Measure, Wiley, Chichester, New York (2nd edn. 1987).
Bollobás, B. (1987), Martingales, isoperimetric inequalities, and random graphs, Coll. Math. Soc. J. Bolyai 52, 113-39.
Breiman, L. (1968), Probability, Addison-Wesley, Reading, Mass.
Chow, Y.-S. and Teicher, H. (1978), Probability Theory: Independence, Interchangeability, Martingales, Springer, New York, Berlin.
Chung, K.L. (1968), A Course in Probability, Harcourt, Brace and World, New York.
Davis, M.H.A. and Norman, A.R. (1990), Portfolio selection with transaction costs, Maths. of Operations Research (to appear).
Davis, M.H.A. and Vinter, R.B. (1985), Stochastic Modelling and Control, Chapman and Hall, London.
Dellacherie, C. and Meyer, P.-A. (1980), Probabilités et Potentiel, Chaps. V-VIII, Hermann, Paris.
Deuschel, J.-D. and Stroock, D.W. (1989), Large Deviations, Academic Press, Boston.
Doob, J.L. (1953), Stochastic Processes, Wiley, New York.
Doob, J.L. (1981), Classical Potential Theory and its Probabilistic Counterpart, Springer, New York.
Dunford, N. and Schwartz, J.T. (1958), Linear Operators: Part I, General Theory, Interscience, New York.
Durrett, R. (1984), Brownian Motion and Martingales in Analysis, Wadsworth, Belmont, Ca.
Dym, H. and McKean, H.P. (1972), Fourier Series and Integrals, Academic Press, New York.
Ellis, R.S. (1985), Entropy, Large Deviations, and Statistical Mechanics, Springer, New York, Berlin.
Ethier, S.N. and Kurtz, T.G. (1986), Markov Processes: Characterization and Convergence, Wiley, New York.
Feller, W. (1957), Introduction to Probability Theory and its Applications, Vol.1, 2nd edn., Wiley, New York.
Freedman, D. (1971), Brownian Motion and Diffusion, Holden-Day, San Francisco.
Garsia, A. (1973), Martingale Inequalities: Seminar Notes on Recent Progress, Benjamin, Reading, Mass.
Grimmett, G.R. (1989), Percolation Theory, Springer, New York, Berlin.
Grimmett, G.R. and Stirzaker, D.R. (1982), Probability and Random Processes, Oxford University Press.
Hall, P. (1988), Introduction to the Theory of Coverage Processes, Wiley, New York.
Hall, P. and Heyde, C.C. (1980), Martingale Limit Theory and its Application, Academic Press, New York.
Halmos, P.J. (1959), Measure Theory, Van Nostrand, Princeton, NJ.
Hammersley, J.M. (1966), Harnesses, Proc. Fifth Berkeley Symp. Math. Statist. and Prob., Vol.III, 89-117, University of California Press.
Harris, T.E. (1963), The Theory of Branching Processes, Springer, New York, Berlin.
Jones, G. and Jones, T. (1949), (Translation of) The Mabinogion, Dent, London.
Karatzas, I. and Shreve, S.E. (1988), Brownian Motion and Stochastic Calculus, Springer, New York.
Karlin, S. and Taylor, H.M. (1981), A Second Course in Stochastic Processes, Academic Press, New York.
Kendall, D.G. (1966), Branching processes since 1873, J. London Math. Soc., 41, 385-406.
Kendall, D.G. (1975), The genealogy of genealogy: Branching processes before (and after) 1873, Bull. London Math. Soc. 7, 225-53.
Kingman, J.F.C. and Taylor, S.J. (1966), Introduction to Measure and Probability, Cambridge University Press.
Körner, T.W. (1988), Fourier Analysis, Cambridge University Press.
Laha, R. and Rohatgi, V. (1979), Probability Theory, Wiley, New York.
Lukacs, E. (1970), Characteristic Functions, 2nd edn., Griffin, London.
Meyer, P.-A. (1966), Probability and Potential (English translation), Blaisdell, Waltham, Mass.
Neveu, J. (1965), Mathematical Foundation of the Calculus of Probability (translated from the French), Holden-Day, San Francisco.
Neveu, J. (1975), Discrete-parameter Martingales, North-Holland, Amsterdam.
Parthasarathy, K.R. (1967), Probability Measures on Metric Spaces, Academic Press, New York.
Rogers, L.C.G. and Williams, D. (1987), Diffusions, Markov Processes, and Martingales, 2: Itô calculus, Wiley, Chichester, New York.
Ross, S. (1976), A First Course in Probability, Macmillan, New York.
Ross, S. (1983), Stochastic Processes, Wiley, New York.
Stroock, D.W. (1984), An Introduction to the Theory of Large Deviations, Springer, New York, Berlin.
Varadhan, S.R.S. (1984), Large Deviations and Applications, SIAM, Philadelphia.
Wagon, S. (1985), The Banach-Tarski Paradox, Encyclopaedia of Mathematics, Vol. 24, Cambridge University Press.
Whittle, P. (1990), Risk-sensitive Optimal Control, Wiley, Chichester, New York.
Williams, D. (1973), Some basic theorems on harnesses, in Stochastic Analysis, eds. D.G. Kendall and E.F. Harding, Wiley, New York, pp.349-66.
Index

(Recall that there is a Guide to Notation on pages xiv-xv.)

ABRACADABRA (4.9, E10.6).
adapted process (10.2): Doob decomposition (12.11).
σ-algebra (1.1).
algebra of sets (1.1).
almost everywhere = a.e. (1.5); almost surely = a.s. (2.4).
atoms: of σ-algebra (9.1, 14.13); of distribution function (16.5).
Azuma-Hoeffding inequality (E14.2).
Baire category theorem (A1.12).
Banach-Tarski paradox (1.0).
Bayes' formula (15.7-15.9).
Bellman Optimality Principle (E10.2, 15.3).
Black-Scholes option-pricing formula (15.2).
Blackwell's Markov chain (E4.8).
Bochner's Theorem (E16.5).
Borel-Cantelli Lemmas: First = BC1 (2.7); Second = BC2 (4.3); Lévy's extension of (12.15).
Bounded Convergence Theorem = BDD (6.2, 13.6).
branching process (Chapter 0, E12.1).
Burkholder-Davis-Gundy inequality (14.18).
Carathéodory's Lemma (A1.7).
Carathéodory's Theorem: statement (1.7); proof (A1.8).
Central Limit Theorem (18.4).
Cesàro's Lemma (12.6).
characteristic functions: definition (16.1); inversion formula (16.6); convergence theorem (18.1).
Chebyshev's inequality (7.3).
coin tossing (3.7).
conditional expectation (Chapter 9): properties (9.7).
conditional probability (9.9).
consistency of Likelihood-Ratio Test (14.17).
contraction property of conditional expectation (9.7,h).
convergence in probability (13.5, A13.2).
convergence theorems for integrals: MON (5.3); Fatou (5.4); DOM (5.9); BDD (6.2, 13.6); for UI RVs (13.7).
convergence theorems for martingales: Main (11.5); for UI case (14.1); Upward (14.2); Downward (14.4).
d-system (A1.2).
differentiation under integral sign (A16.1).
distribution function for RV (3.10-3.11).
Dominated Convergence Theorem = DOM (5.9); conditional (9.7,g).
J.L. DOOB's Convergence Theorem (11.5); Decomposition (12.11); $\mathcal{L}^p$ inequality (14.11); Optional Sampling Theorem (A14.3-14.4); Optional Stopping Theorem (10.10, A14.3); Submartingale Inequality (14.6); Upcrossing Lemma (11.2) - and much else!
Downward Theorem (14.4).
Dynkin's Lemma (A1.3).
events (Chapter 2): independent (4.1).
expectation (6.1): conditional (Chapter 9).
'extension of measures': uniqueness (1.6); existence (1.7).
extinction probability (0.4).
fair game (10.5): unfavourable (E4.7).
Fatou Lemmas: for sets (2.6,b, 2.7,c); for functions (5.4); conditional version (9.7,f).
filtered space, filtration (10.1).
filtering (15.6-15.9).
finite and σ-finite measures (1.5).
Forward Convergence Theorem for supermartingales (11.5).
Fubini's Theorem (8.2).
gambler's ruin (E10.7).
gambling strategy (10.6).
Hardy space $\mathcal{H}_0^1$ (14.18).
harnesses (15.10-15.12).
hedging strategy (15.2).
Helly-Bray Lemma (17.4).
hitting times (10.12).
Hoeffding's inequality (E14.2).
Hölder's inequality (6.13).
Hunt's Lemma (E14.1).
independence: definitions (4.1); π-system criterion (4.2); and conditioning (9.7,k, 9.10).
independence and product measure (8.4).
inequalities: Azuma-Hoeffding (E14.2); Burkholder-Davis-Gundy (14.18); Chebyshev (7.3); Doob's $\mathcal{L}^p$ (14.11); Hölder (6.13); Jensen (6.6), and in conditional form (9.7,h); Khinchine - see (14.8); Kolmogorov (14.6); Markov (6.4); Minkowski (6.14); Schwarz (6.8).
infinite products of probability measures (8.7, Chapter A9); Kakutani's Theorem on (14.12, 14.17).
integration (Chapter 5).
Jensen's inequality (6.6); conditional form (9.7,h).
Kakutani's Theorem on likelihood ratios (14.12, 14.17).
Kalman-Bucy filter (15.6-15.9).
A.N. KOLMOGOROV's Definition of Conditional Expectation (9.2); Inequality (14.6); Law of the Iterated Logarithm (A4.1, 14.7); Strong Law of Large Numbers (12.10, 14.5); Three-Series Theorem (12.5); Truncation Lemma (12.9); Zero-One (0-1) Law (4.11, 14.3).
Kronecker's Lemma (12.7).
Laplace transforms: inversion (E7.1); and weak convergence (E18.7).
law of random variable (3.9): joint laws (8.3).
least-squares-best predictor (9.4).
Lebesgue integral (Chapter 5).
Lebesgue measure = Leb (1.8, A1.9).
Lebesgue spaces $\mathcal{L}^p$, $L^p$ (6.10).
P. LÉVY's Convergence Theorem for CFs (18.1); Downward Theorem for martingales (14.4); Extension of Borel-Cantelli Lemmas (12.15); Inversion formula for CFs (16.6); Upward Theorem for martingales (14.2).
Likelihood-Ratio Test, consistency of (14.17).
Mabinogion sheep (15.3-15.5).
Markov chain (4.8, 10.13).
Markov's inequality (6.4).
martingale (Chapters 10-15!): definition (10.3); Convergence Theorem (11.5); Optional-Stopping Theorem (10.9-10.10, A14.3); Optional-Sampling Theorem (Chapter A14).
martingale transform (10.6).
measurable function (3.1).
measurable space (1.1).
measure space (1.4).
Minkowski's inequality (6.14).
Moment Problem (E18.6).
monkey typing Shakespeare (4.9).
Monotone-Class Theorem (3.14, A1.3).
Monotone-Convergence Theorem: for sets (1.10); for functions (5.3, Chapter A5); conditional version (9.7,e).
narrow convergence - see weak convergence.
option pricing (15.2).
Optional-Sampling Theorem (Chapter A14).
Optional-Stopping Theorems (10.9-10.10, A14.3).
optional time - see stopping time (10.8).
orthogonal projection (6.11): and conditional expectation (9.4-9.5).
outer measures (A1.6).
π-system (1.6): Uniqueness Lemmas (1.6, 4.2).
Pólya's urn (E10.1, E10.8).
previsible (= predictable) process (10.6).
probability density function = pdf (6.12); joint (8.3).
probability measure (1.5).
probability triple (2.1).
product measures (Chapter 8).
Pythagoras's Theorem (6.9).
Radon-Nikodým theorem (5.14, 14.13-14.14).
random signs (12.3).
random walk: hitting times (10.12, E10.7); on free group (EG.3-EG.4).
Record Problem (E4.3, E12.2, 18.5).
regular conditional probability (9.9).
Riemann integral (5.3).
sample path (4.8).
sample point (2.1).
sample space (2.1).
Schwarz inequality (6.8).
Star Trek problems (E10.10, E10.11, E12.3).
stopped process (10.9).
stopping times (10.8); associated σ-algebras (A14.1).
Strassen's Law of the Iterated Logarithm (A4.2).
Strong Laws (7.2, 12.10, 12.14, 14.5).
submartingales and supermartingales: definitions (10.3); convergence theorem (11.5); optional stopping (10.9-10.10); optional sampling (A14.4).
superharmonic functions for Markov chains (10.13).
symmetrization technique (12.4).
d-system (A1.2); π-system (1.6).
tail σ-algebra (4.10-4.12, 14.3).
Tchebycheff: = Chebyshev (7.3).
Three-Series Theorem (12.5).
tightness (17.5).
Tower Property (9.7,i).
Truncation Lemma (12.9).
uniform integrability (Chapter 13).
Upcrossing Lemma (11.1-11.2).
Uniqueness Lemma (1.6).
weak convergence (Chapter 17): and characteristic functions (18.1); and moments (E18.3-18.4); and Laplace transforms (E18.7).
Weierstrass approximation theorem (7.4).
Zero-one law = 0-1 law (4.11, 14.3).