Probability with Martingales

David Williams
Statistical Laboratory, DPMMS
Cambridge University

CAMBRIDGE UNIVERSITY PRESS

Published in the United States of America by Cambridge University Press

© Cambridge University Press 1991

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 1991
Twelfth printing 2010

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

ISBN 978-0-521-40605-5 paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents

Preface - please read!
A Question of Terminology
A Guide to Notation

Chapter 0: A Branching-Process Example
0.0. Introductory remarks. 0.1. Typical number of children, $X$. 0.2. Size of $n$th generation, $Z_n$. 0.3. Use of conditional expectations. 0.4. Extinction probability, $\pi$. 0.5. Pause for thought: measure. 0.6. Our first martingale. 0.7. Convergence (or not) of expectations. 0.8. Finding the distribution of $M_\infty$. 0.9. Concrete example.

PART A: FOUNDATIONS

Chapter 1: Measure Spaces
1.0. Introductory remarks. 1.1. Definitions of algebra, $\sigma$-algebra. 1.2. Examples. Borel $\sigma$-algebras, $\mathcal{B}(S)$, $\mathcal{B} = \mathcal{B}(\mathbf{R})$. 1.3. Definitions concerning set functions. 1.4. Definition of measure space. 1.5. Definitions concerning measures. 1.6. Lemma. Uniqueness of extension, $\pi$-systems. 1.7. Theorem. Carathéodory's extension theorem. 1.8. Lebesgue measure Leb on $((0,1], \mathcal{B}(0,1])$. 1.9. Lemma. Elementary inequalities. 1.10. Lemma. Monotone-convergence properties of measures. 1.11. Example/Warning.

Chapter 2: Events
2.1. Model for experiment: $(\Omega, \mathcal{F}, \mathbf{P})$. 2.2. The intuitive meaning. 2.3. Examples of $(\Omega, \mathcal{F})$ pairs. 2.4. Almost surely (a.s.) 2.5. Reminder: limsup, liminf, $\downarrow\lim$, etc. 2.6. Definitions. $\limsup E_n$, $(E_n, \text{i.o.})$. 2.7. First Borel-Cantelli Lemma (BC1). 2.8. Definitions, $\liminf E_n$, $(E_n, \text{ev})$. 2.9. Exercise.

Chapter 3: Random Variables
3.1. Definitions. $\Sigma$-measurable function, $\mathrm{m}\Sigma$, $(\mathrm{m}\Sigma)^+$, $\mathrm{b}\Sigma$. 3.2. Elementary Propositions on measurability. 3.3. Lemma. Sums and products of measurable functions are measurable. 3.4. Composition Lemma. 3.5. Lemma on measurability of infs, liminfs of functions. 3.6. Definition. Random variable. 3.7. Example. Coin tossing. 3.8. Definition. $\sigma$-algebra generated by a collection of functions on $\Omega$. 3.9. Definitions. Law, Distribution Function. 3.10. Properties of distribution functions. 3.11. Existence of random variable with given distribution function. 3.12. Skorokhod representation of a random variable with prescribed distribution function. 3.13. Generated $\sigma$-algebras - a discussion. 3.14. The Monotone-Class Theorem.

Chapter 4: Independence
4.1. Definitions of independence. 4.2. The $\pi$-system Lemma; and the more familiar definitions. 4.3. Second Borel-Cantelli Lemma (BC2). 4.4. Example. 4.5. A fundamental question for modelling. 4.6. A coin-tossing model with applications. 4.7. Notation: IID RVs. 4.8. Stochastic processes; Markov chains. 4.9. Monkey typing Shakespeare. 4.10. Definition. Tail $\sigma$-algebras. 4.11. Theorem. Kolmogorov's 0-1 law. 4.12. Exercise/Warning.

Chapter 5: Integration
5.0. Notation, etc. $\mu(f) :=: \int f\,d\mu$, $\mu(f; A)$. 5.1. Integrals of non-negative simple functions, $SF^+$. 5.2. Definition of $\mu(f)$, $f \in (\mathrm{m}\Sigma)^+$. 5.3. Monotone-Convergence Theorem (MON). 5.4. The Fatou Lemmas for functions (FATOU). 5.5. 'Linearity'. 5.6. Positive and negative parts of $f$. 5.7. Integrable function, $\mathcal{L}^1(S, \Sigma, \mu)$. 5.8. Linearity. 5.9. Dominated Convergence Theorem (DOM). 5.10. Scheffé's Lemma (SCHEFFE). 5.11. Remark on uniform integrability. 5.12. The standard machine. 5.13. Integrals over subsets. 5.14. The measure $f\mu$, $f \in (\mathrm{m}\Sigma)^+$.

Chapter 6: Expectation
Introductory remarks. 6.1. Definition of expectation. 6.2. Convergence theorems. 6.3. The notation $\mathbf{E}(X; F)$. 6.4. Markov's inequality. 6.5. Sums of non-negative RVs. 6.6. Jensen's inequality for convex functions. 6.7. Monotonicity of $\mathcal{L}^p$ norms. 6.8. The Schwarz inequality. 6.9. $\mathcal{L}^2$: Pythagoras, covariance, etc. 6.10. Completeness of $\mathcal{L}^p$ ($1 \le p < \infty$). 6.11. Orthogonal projection. 6.12. The 'elementary formula' for expectation. 6.13. Hölder from Jensen.

Chapter 7: An Easy Strong Law
7.1. 'Independence means multiply' - again! 7.2. Strong Law - first version. 7.3. Chebyshev's inequality. 7.4. Weierstrass approximation theorem.

Chapter 8: Product Measure
8.0. Introduction and advice. 8.1. Product measurable structure, $\Sigma_1 \times \Sigma_2$. 8.2. Product measure, Fubini's Theorem. 8.3. Joint laws, joint pdfs. 8.4. Independence and product measure. 8.5. $\mathcal{B}(\mathbf{R})^n = \mathcal{B}(\mathbf{R}^n)$. 8.6. The $n$-fold extension. 8.7. Infinite products of probability triples. 8.8. Technical note on the existence of joint laws.

PART B: MARTINGALE THEORY

Chapter 9: Conditional Expectation
9.1. A motivating example. 9.2. Fundamental Theorem and Definition (Kolmogorov, 1933). 9.3. The intuitive meaning. 9.4. Conditional expectation as least-squares-best predictor. 9.5. Proof of Theorem 9.2. 9.6. Agreement with traditional expression. 9.7. Properties of conditional expectation: a list. 9.8. Proofs of the properties in Section 9.7. 9.9. Regular conditional probabilities and pdfs. 9.10. Conditioning under independence assumptions. 9.11. Use of symmetry: an example.

Chapter 10: Martingales
10.1. Filtered spaces. 10.2. Adapted processes. 10.3. Martingale, supermartingale, submartingale. 10.4. Some examples of martingales. 10.5. Fair and unfair games. 10.6. Previsible process, gambling strategy. 10.7. A fundamental principle: you can't beat the system! 10.8. Stopping time. 10.9. Stopped supermartingales are supermartingales. 10.10. Doob's Optional-Stopping Theorem. 10.11. Awaiting the almost inevitable. 10.12. Hitting times for simple random walk. 10.13. Non-negative superharmonic functions for Markov chains.

Chapter 11: The Convergence Theorem
11.1. The picture that says it all. 11.2. Upcrossings. 11.3. Doob's Upcrossing Lemma. 11.4. Corollary. 11.5. Doob's 'Forward' Convergence Theorem. 11.6. Warning. 11.7. Corollary.

Chapter 12: Martingales bounded in $\mathcal{L}^2$
12.0. Introduction. 12.1. Martingales in $\mathcal{L}^2$: orthogonality of increments. 12.2. Sums of zero-mean independent random variables in $\mathcal{L}^2$. 12.3. Random signs. 12.4. A symmetrization technique: expanding the sample space. 12.5. Kolmogorov's Three-Series Theorem. 12.6. Cesàro's Lemma. 12.7. Kronecker's Lemma. 12.8. A Strong Law under variance constraints. 12.9. Kolmogorov's Truncation Lemma. 12.10. Kolmogorov's Strong Law of Large Numbers (SLLN). 12.11. Doob decomposition. 12.12. The angle-brackets process $\langle M \rangle$. 12.13. Relating convergence of $M$ to finiteness of $\langle M \rangle_\infty$. 12.14. A trivial 'Strong Law' for martingales in $\mathcal{L}^2$. 12.15. Lévy's extension of the Borel-Cantelli Lemmas. 12.16. Comments.

Chapter 13: Uniform Integrability
13.1. An 'absolute continuity' property. 13.2. Definition. UI family. 13.3. Two simple sufficient conditions for the UI property. 13.4. UI property of conditional expectations. 13.5. Convergence in probability. 13.6. Elementary proof of (BDD). 13.7. A necessary and sufficient condition for $\mathcal{L}^1$ convergence.

Chapter 14: UI Martingales
14.0. Introduction. 14.1. UI martingales. 14.2. Lévy's 'Upward' Theorem. 14.3. Martingale proof of Kolmogorov's 0-1 law. 14.4. Lévy's 'Downward' Theorem. 14.5. Martingale proof of the Strong Law. 14.6. Doob's Submartingale Inequality. 14.7. Law of the Iterated Logarithm: special case. 14.8. A standard estimate on the normal distribution. 14.9. Remarks on exponential bounds; large deviation theory. 14.10. A consequence of Hölder's inequality. 14.11. Doob's $\mathcal{L}^p$ inequality. 14.12. Kakutani's Theorem on 'product' martingales. 14.13. The Radon-Nikodým theorem. 14.14. The Radon-Nikodým theorem and conditional expectation. 14.15. Likelihood ratio; equivalent measures. 14.16. Likelihood ratio and conditional expectation. 14.17. Kakutani's Theorem revisited; consistency of LR test. 14.18. Note on Hardy spaces, etc.

Chapter 15: Applications
15.0. Introduction - please read! 15.1. A trivial martingale-representation result. 15.2. Option pricing; discrete Black-Scholes formula. 15.3. The Mabinogion sheep problem. 15.4. Proof of Lemma 15.3(c). 15.5. Proof of result 15.3(d). 15.6. Recursive nature of conditional probabilities. 15.7. Bayes' formula for bivariate normal distributions. 15.8. Noisy observation of a single random variable. 15.9. The Kalman-Bucy filter. 15.10. Harnesses entangled. 15.11. Harnesses unravelled, 1. 15.12. Harnesses unravelled, 2.

PART C: CHARACTERISTIC FUNCTIONS

Chapter 16: Basic Properties of CFs
16.1. Definition. 16.2. Elementary properties. 16.3. Some uses of characteristic functions. 16.4. Three key results. 16.5. Atoms. 16.6. Lévy's Inversion Formula. 16.7. A table.

Chapter 17: Weak Convergence
17.1. The 'elegant' definition. 17.2. A 'practical' formulation. 17.3. Skorokhod representation. 17.4. Sequential compactness for Prob($\mathbf{R}$). 17.5. Tightness.

Chapter 18: The Central Limit Theorem
18.1. Lévy's Convergence Theorem. 18.2. $o$ and $O$ notation. 18.3. Some important estimates. 18.4. The Central Limit Theorem. 18.5. Example. 18.6. CF proof of Lemma 12.4.

APPENDICES

Chapter A1: Appendix to Chapter 1
A1.1. A non-measurable subset $A$ of $S^1$. A1.2. $d$-systems. A1.3. Dynkin's Lemma. A1.4. Proof of Uniqueness Lemma 1.6. A1.5. $\lambda$-sets: 'algebra' case. A1.6. Outer measures. A1.7. Carathéodory's Lemma. A1.8. Proof of Carathéodory's Theorem. A1.9. Proof of the existence of Lebesgue measure on $((0,1], \mathcal{B}(0,1])$. A1.10. Example of non-uniqueness of extension. A1.11. Completion of a measure space. A1.12. The Baire category theorem.

Chapter A3: Appendix to Chapter 3
A3.1. Proof of the Monotone-Class Theorem 3.14. A3.2. Discussion of generated $\sigma$-algebras.

Chapter A4: Appendix to Chapter 4
A4.1. Kolmogorov's Law of the Iterated Logarithm. A4.2. Strassen's Law of the Iterated Logarithm. A4.3. A model for a Markov chain.

Chapter A5: Appendix to Chapter 5
A5.1. Doubly monotone arrays. A5.2. The key use of Lemma 1.10(a). A5.3. 'Uniqueness of integral'. A5.4. Proof of the Monotone-Convergence Theorem.

Chapter A9: Appendix to Chapter 9
A9.1. Infinite products: setting things up. A9.2. Proof of A9.1(e).

Chapter A13: Appendix to Chapter 13
A13.1. Modes of convergence: definitions. A13.2. Modes of convergence: relationships.

Chapter A14: Appendix to Chapter 14
A14.1. The $\sigma$-algebra $\mathcal{F}_T$, $T$ a stopping time. A14.2. A special case of OST. A14.3. Doob's Optional-Sampling Theorem for UI martingales. A14.4. The result for UI submartingales.

Chapter A16: Appendix to Chapter 16
A16.1. Differentiation under the integral sign.

Chapter E: Exercises
References
Index
Preface - please read!

The most important chapter in this book is Chapter E: Exercises. I have left the interesting things for you to do. You can start now on the 'EG' exercises, but see 'More about exercises' later in this Preface.

The book, which is essentially the set of lecture notes for a third-year undergraduate course at Cambridge, is as lively an introduction as I can manage to the rigorous theory of probability. Since much of the book is devoted to martingales, it is bound to become very lively: look at those Exercises on Chapter 10! But, of course, there is that initial plod through the measure-theoretic foundations. It must be said however that measure theory, that most arid of subjects when done for its own sake, becomes amazingly more alive when used in probability, not only because it is then applied, but also because it is immensely enriched.

You cannot avoid measure theory: an event in probability is a measurable set, a random variable is a measurable function on the sample space, the expectation of a random variable is its integral with respect to the probability measure; and so on. To be sure, one can take some central results from measure theory as axiomatic in the main text, giving careful proofs in appendices; and indeed that is exactly what I have done.

Measure theory for its own sake is based on the fundamental addition rule for measures. Probability theory supplements that with the multiplication rule which describes independence; and things are already looking up. But what really enriches and enlivens things is that we deal with lots of $\sigma$-algebras, not just the one $\sigma$-algebra which is the concern of measure theory.

In planning this book, I decided for every topic what things I considered just a bit too advanced, and, often with sadness, I have ruthlessly omitted them.

For a more thorough training in many of the topics covered here, see Billingsley (1979), Chow and Teicher (1978), Chung (1968), Kingman and Taylor (1966), Laha and Rohatgi (1979), and Neveu (1965). As regards measure theory, I learnt it from Dunford and Schwartz (1958) and Halmos (1959). After reading this book, you must read Breiman (1968), and, for an indication of what can be done with discrete martingales, Hall and Heyde (1980).

Of course, intuition is much more important than knowledge of measure theory, and you should take every opportunity to sharpen your intuition. There is no better whetstone for this than Aldous (1989), though it is a very demanding book. For appreciating the scope of probability and for learning how to think about it, Karlin and Taylor (1981), Grimmett and Stirzaker (1982), Hall (1988), and Grimmett's recent superb book, Grimmett (1989), on percolation are strongly recommended.

More about exercises. In compiling Chapter E, which consists exactly of the homework sheets I give to the Cambridge students, I have taken into account the fact that this book, like any other mathematics book, implicitly contains a vast number of other exercises, many of which are easier than those in Chapter E. I refer of course to the exercises you create by reading the statement of a result, and then trying to prove it for yourself, before you read the given proof. One other point about exercises: you will, for example, surely forgive my using expectation $\mathbf{E}$ in Exercises on Chapter 4 before $\mathbf{E}$ is treated with full rigour in Chapter 6.
Acknowledgements. My first thanks must go to the students who have endured the course on which the book is based and whose quality has made me try hard to make it worthy of them; and to those, especially David Kendall, who had developed the course before it became my privilege to teach it. My thanks to David Tranah and other staff of CUP for their help in converting the course into this book. Next, I must thank Ben Garling, James Norris and Chris Rogers without whom the book would have contained more errors and obscurities. (The many faults which surely remain in it are my responsibility.) Helen Rutherford and I typed part of the book, but the vast majority of it was typed by Sarah Shea-Simonds in a virtuoso performance worthy of Horowitz. My thanks to Helen and, most especially, to Sarah. Special thanks to my wife, Sheila, too, for all her help.

But my best thanks - and yours if you derive any benefit from the book - must go to three people whose names appear in capitals in the Index: J.L. Doob, A.N. Kolmogorov and P. Lévy: without them, there wouldn't have been much to write about, as Doob (1953) splendidly confirms.

Statistical Laboratory, Cambridge
David Williams
October 1990
A Question of Terminology

Random variables: functions or equivalence classes?

At the level of this book, the theory would be more 'elegant' if we regarded a random variable as an equivalence class of measurable functions on the sample space, two functions belonging to the same equivalence class if and only if they are equal almost everywhere. Then the conditional-expectation map

\[ X \mapsto \mathbf{E}(X \mid \mathcal{G}) \]

would be a truly well-defined map from $L^p(\Omega, \mathcal{F}, \mathbf{P})$ to $L^p(\Omega, \mathcal{G}, \mathbf{P})$ for $p \ge 1$; and we would not have to keep mentioning versions (representatives of equivalence classes) and would be able to avoid the endless 'almost surely' qualifications.

I have however chosen the 'inelegant' route: firstly, I prefer to work with functions, and confess to preferring

\[ 4 + 5 = 2 \bmod 7 \quad\text{to}\quad [4]_7 + [5]_7 = [2]_7. \]

But there is a substantive reason. I hope that this book will tempt you to progress to the much more interesting, and more important, theory where the parameter set of our process is uncountable (e.g. it may be the time-parameter set $[0,\infty)$). There, the equivalence-class formulation just will not work: the 'cleverness' of introducing quotient spaces loses the subtlety which is essential even for formulating the fundamental results on existence of continuous modifications, etc., unless one performs contortions which are hardly elegant. Even if these contortions allow one to formulate results, one would still have to use genuine functions to prove them; so where does the reality lie?!
A Guide to Notation

▶ signifies something important, ▶▶ something very important, and ▶▶▶ the Martingale Convergence Theorem.

I use ':=' to signify 'is defined to equal'. This Pascal notation is particularly convenient because it can also be used in the reversed sense.

I use analysts' (as opposed to category theorists') conventions:

▶ $\mathbf{N} := \{1,2,3,\ldots\} \subseteq \{0,1,2,\ldots\} =: \mathbf{Z}^+$.

Everyone is agreed that $\mathbf{R}^+ := [0,\infty)$.

For a set $B$ contained in some universal set $S$, $I_B$ denotes the indicator function of $B$: that is, $I_B(s) := 1$ if $s \in B$, and $I_B(s) := 0$ otherwise.

For $a, b \in \mathbf{R}$, $a \wedge b := \min(a, b)$, $a \vee b := \max(a, b)$.

CF: characteristic function; DF: distribution function; pdf: probability density function.

$\sigma$-algebra, $\sigma(\mathcal{C})$ (1.1); $\sigma(Y_\gamma : \gamma \in C)$ (3.8, 3.13).
$\pi$-system (1.6); $d$-system (A1.2).

a.e.: almost everywhere (1.5)
a.s.: almost surely (2.4)
$\mathrm{b}\Sigma$: the space of bounded $\Sigma$-measurable functions (3.1)
$\mathcal{B}(S)$: the Borel $\sigma$-algebra on $S$, $\mathcal{B} := \mathcal{B}(\mathbf{R})$ (1.2)
$C \bullet X$: discrete stochastic integral (10.6)
$d\lambda/d\mu$: Radon-Nikodým derivative (5.14)
$d\mathbf{Q}/d\mathbf{P}$: Likelihood Ratio (14.13)
$\mathbf{E}(X)$: expectation $\mathbf{E}(X) := \int_\Omega X(\omega)\,\mathbf{P}(d\omega)$ of $X$ (6.3)
$\mathbf{E}(X; F)$: $\int_F X\,d\mathbf{P}$ (6.3)
$\mathbf{E}(X \mid \mathcal{G})$: conditional expectation (9.3)
$(E_n, \text{ev})$: $\liminf E_n$ (2.8)
$(E_n, \text{i.o.})$: $\limsup E_n$ (2.6)
$f_X$: probability density function (pdf) of $X$ (6.12)
$f_{X,Y}$: joint pdf (8.3)
$f_{X|Y}$: conditional pdf (9.6)
$F_X$: distribution function of $X$ (3.9)
liminf: for sets, (2.8)
limsup: for sets, (2.6)
$x = \uparrow\lim x_n$: $x_n \uparrow x$ in that $x_n \le x_{n+1}$ ($\forall n$) and $x_n \to x$
log: natural (base e) logarithm
$\mathcal{L}_X, \Lambda_X$: law of $X$ (3.9)
$\mathcal{L}^p, L^p$: Lebesgue spaces (6.7, 6.13)
Leb: Lebesgue measure (1.8)
$\mathrm{m}\Sigma$: space of $\Sigma$-measurable functions (3.1)
$\langle M \rangle$: angle-brackets process (12.12)
$\mu(f)$: integral of $f$ with respect to $\mu$ (5.0, 5.2)
$\mu(f; A)$: $\int_A f\,d\mu$ (5.0, 5.2)
$\varphi_X$: CF of $X$ (Chapter 16)
$\varphi$: pdf of standard normal N(0,1) distribution
$\Phi$: DF of N(0,1) distribution
$X^T$: $X$ stopped at time $T$ (10.9)
Chapter 0

A Branching-Process Example

(This Chapter is not essential for the remainder of the book. You can start with Chapter 1 if you wish.)

0.0. Introductory remarks

The purpose of this chapter is threefold: to take something which is probably well known to you from books such as the immortal Feller (1957) or Ross (1976), so that you start on familiar ground; to make you start to think about some of the problems involved in making the elementary treatment into rigorous mathematics; and to indicate what new results appear if one applies the somewhat more advanced theory developed in this book. We stick to one example: a branching process. This is rich enough to show that the theory has some substance.
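0.1. Typical number of children, $X$

In our model, the number of children of a typical animal (see Notes below for some interpretations of 'child' and 'animal') is a random variable $X$ with values in $\mathbf{Z}^+$. We assume that

\[ \mathbf{P}(X = 0) > 0. \]

We define the generating function $f$ of $X$ as the map $f : [0,1] \to [0,1]$, where

\[ f(\theta) := \mathbf{E}(\theta^X) = \sum_{k \in \mathbf{Z}^+} \theta^k \mathbf{P}(X = k). \]

Standard theorems on power series imply that, for $\theta \in [0,1]$,

\[ f'(\theta) = \mathbf{E}(X\theta^{X-1}) = \sum k\theta^{k-1}\mathbf{P}(X = k) \]

and

\[ \mu := \mathbf{E}(X) = f'(1) = \sum k\,\mathbf{P}(X = k) \le \infty. \]

Of course, $f'(1)$ is here interpreted as

\[ \lim_{\theta \uparrow 1} \frac{f(\theta) - f(1)}{\theta - 1} = \lim_{\theta \uparrow 1} \frac{1 - f(\theta)}{1 - \theta}, \]

since $f(1) = 1$. We assume that $\mu < \infty$.

Notes. The first application of branching-process theory was to the question of survival of family names; and in that context, animal = man, and child = son.

In another context, 'animal' can be 'neutron', and 'child' of that neutron will signify a neutron released if and when the parent neutron crashes into a nucleus. Whether or not the associated branching process is supercritical can be a matter of real importance.

We can often find branching processes embedded in richer structures and can then use the results of this chapter to start the study of more interesting things.

For accounts of branching processes, see Athreya and Ney (1972).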
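0.2. Size of $n$th generation, $Z_n$

To be a bit formal: suppose that we are given a doubly infinite sequence

\[ \text{(a)} \qquad \{X_r^{(m)} : m, r \in \mathbf{N}\} \]

of independent identically distributed random variables (IID RVs), each with the same distribution as $X$:

\[ \mathbf{P}(X_r^{(m)} = k) = \mathbf{P}(X = k). \]

The idea is that for $n \in \mathbf{Z}^+$ and $r \in \mathbf{N}$, the variable $X_r^{(n+1)}$ represents the number of children (who will be in the $(n+1)$th generation) of the $r$th animal (if there is one) in the $n$th generation. The fundamental rule therefore is that if $Z_n$ signifies the size of the $n$th generation, then

\[ \text{(b)} \qquad Z_{n+1} = X_1^{(n+1)} + \cdots + X_{Z_n}^{(n+1)}. \]

We assume that $Z_0 = 1$, so that (b) gives a full recursive definition of the sequence $(Z_m : m \in \mathbf{Z}^+)$ from the sequence (a). Our first task is to calculate the distribution function of $Z_n$, or equivalently to find the generating function

\[ \text{(c)} \qquad f_n(\theta) := \mathbf{E}(\theta^{Z_n}) = \sum \theta^k \mathbf{P}(Z_n = k). \]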
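The recursive rule (b) translates directly into a simulation. The following is only a minimal sketch: the function names and the geometric offspring sampler are illustrative assumptions, not part of the text, and any sampler returning a value in $\mathbf{Z}^+$ could be substituted.

```python
import random

def next_generation(z_n, sample_children, rng):
    """Rule (0.2,b): Z_{n+1} is the sum of Z_n independent draws of X."""
    return sum(sample_children(rng) for _ in range(z_n))

def simulate_sizes(n_generations, sample_children, rng):
    """Return [Z_0, Z_1, ..., Z_n], starting from Z_0 = 1."""
    sizes = [1]
    for _ in range(n_generations):
        sizes.append(next_generation(sizes[-1], sample_children, rng))
    return sizes

def geometric_children(rng, p=0.5):
    """Hypothetical offspring law P(X = k) = p * (1 - p)**k, k = 0, 1, 2, ..."""
    k = 0
    while rng.random() >= p:   # each 'failure' (probability 1 - p) adds one child
        k += 1
    return k

rng = random.Random(0)
sizes = simulate_sizes(10, geometric_children, rng)
```

Once some $Z_n$ is 0, every later generation is empty, because an empty sum is 0; this is the 'extinction' event discussed later in the chapter.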
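0.3. Use of conditional expectations

The first main result is that for $n \in \mathbf{Z}^+$ (and $\theta \in [0,1]$)

\[ \text{(a)} \qquad f_{n+1}(\theta) = f_n(f(\theta)), \]

so that for each $n \in \mathbf{Z}^+$, $f_n$ is the $n$-fold composition

\[ \text{(b)} \qquad f_n = f \circ f \circ \cdots \circ f. \]

Note that the 0-fold composition is by convention the identity map $f_0(\theta) = \theta$, in agreement with the fact that $Z_0 = 1$.

To prove (a), we use - at the moment in intuitive fashion - the following very special case of the very useful Tower Property of Conditional Expectation:

\[ \text{(c)} \qquad \mathbf{E}(U) = \mathbf{E}\,\mathbf{E}(U \mid V); \]

to find the expectation of a random variable $U$, first find the conditional expectation $\mathbf{E}(U \mid V)$ of $U$ given $V$, and then find the expectation of that. We prove the ultimate form of (c) at a later stage.

We apply (c) with $U = \theta^{Z_{n+1}}$ and $V = Z_n$:

\[ \mathbf{E}(\theta^{Z_{n+1}}) = \mathbf{E}\,\mathbf{E}(\theta^{Z_{n+1}} \mid Z_n). \]

Now, for $k \in \mathbf{Z}^+$, the conditional expectation of $\theta^{Z_{n+1}}$ given that $Z_n = k$ satisfies

\[ \text{(d)} \qquad \mathbf{E}(\theta^{Z_{n+1}} \mid Z_n = k) = \mathbf{E}\bigl(\theta^{X_1^{(n+1)} + \cdots + X_k^{(n+1)}} \bigm| Z_n = k\bigr). \]

But $Z_n$ is constructed from variables $X_s^{(r)}$ with $r \le n$, and so $Z_n$ is independent of $X_1^{(n+1)}, \ldots, X_k^{(n+1)}$. The conditional expectation given $Z_n = k$ in the right-hand term in (d) must therefore agree with the absolute expectation

\[ \text{(e)} \qquad \mathbf{E}\bigl(\theta^{X_1^{(n+1)}} \cdots \theta^{X_k^{(n+1)}}\bigr). \]

But the expression at (e) is an expectation of the product of independent random variables and, as part of the family of 'Independence means multiply' results, we know that this expectation of a product may be rewritten as the product of expectations. Since (for every $n$ and $r$)

\[ \mathbf{E}\bigl(\theta^{X_r^{(n+1)}}\bigr) = f(\theta), \]

we have proved that

\[ \mathbf{E}(\theta^{Z_{n+1}} \mid Z_n = k) = f(\theta)^k, \]

and this is what it means to say that

\[ \mathbf{E}(\theta^{Z_{n+1}} \mid Z_n) = f(\theta)^{Z_n}. \]

[If $V$ takes only integer values, then when $V = k$, the conditional expectation $\mathbf{E}(U \mid V)$ of $U$ given $V$ is equal to the conditional expectation $\mathbf{E}(U \mid V = k)$ of $U$ given that $V = k$. (Sounds reasonable!)] Property (c) now yields

\[ \mathbf{E}\,\theta^{Z_{n+1}} = \mathbf{E}\,f(\theta)^{Z_n}, \]

and, since $\mathbf{E}(\alpha^{Z_n}) = f_n(\alpha)$, result (a) is proved.

Independence and conditional expectations are two of the main topics in this course.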
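Result (0.3,a) can be checked exactly for an offspring law with finite support. The sketch below computes the pmf of $Z_{n+1}$ by mixing the $k$-fold convolutions of the law of $X$ with weights $\mathbf{P}(Z_n = k)$, which is the conditioning argument above written as arithmetic. The particular pmf (1/4, 1/4, 1/2) and all function names are illustrative assumptions.

```python
from fractions import Fraction

def convolve(a, b):
    """pmf of the sum of two independent Z+-valued variables."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def next_pmf(z_pmf, x_pmf):
    """pmf of Z_{n+1} from that of Z_n: mix the k-fold convolutions of X."""
    out = [Fraction(0)] * ((len(z_pmf) - 1) * (len(x_pmf) - 1) + 1)
    k_fold = [Fraction(1)]                 # pmf of the empty sum (k = 0)
    for weight in z_pmf:                   # weight = P(Z_n = k), k = 0, 1, ...
        for j, prob in enumerate(k_fold):
            out[j] += weight * prob
        k_fold = convolve(k_fold, x_pmf)   # move from k to k + 1 summands
    return out

def gen_fn(pmf, theta):
    """Generating function E(theta^Z) of a finitely supported pmf."""
    return sum(p * theta**k for k, p in enumerate(pmf))

x_pmf = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]   # illustrative choice
z_pmfs = [[Fraction(0), Fraction(1)]]                       # Z_0 = 1
for _ in range(3):
    z_pmfs.append(next_pmf(z_pmfs[-1], x_pmf))

theta = Fraction(1, 3)
lhs = gen_fn(z_pmfs[3], theta)                   # f_3(theta)
rhs = gen_fn(z_pmfs[2], gen_fn(x_pmf, theta))    # f_2(f(theta))
```

Because exact rational arithmetic is used, `lhs` and `rhs` agree exactly rather than to rounding error.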
0.4. Extinction probability, $\pi$

Let $\pi_n := \mathbf{P}(Z_n = 0)$. Then $\pi_n = f_n(0)$, so that, by (0.3,b),

\[ \text{(a)} \qquad \pi_{n+1} = f(\pi_n). \]

Measure theory confirms our intuition about the extinction probability:

\[ \text{(b)} \qquad \pi := \mathbf{P}(Z_m = 0 \text{ for some } m) = \uparrow\lim \pi_n. \]

Because $f$ is continuous, it follows from (a) that

\[ \text{(c)} \qquad \pi = f(\pi). \]

The function $f$ is non-decreasing and convex (of non-decreasing slope). Also, $f(1) = 1$ and $f(0) = \mathbf{P}(X = 0) > 0$. The slope $f'(1)$ of $f$ at 1 is $\mu = \mathbf{E}(X)$. The pictures opposite now make the following Theorem obvious.

THEOREM

If $\mathbf{E}(X) > 1$, then the extinction probability $\pi$ is the unique root of the equation $\pi = f(\pi)$ which lies strictly between 0 and 1. If $\mathbf{E}(X) \le 1$, then $\pi = 1$.

[Figure: graph of $y = f(x)$ in two cases.]
Case 1: subcritical, $\mu = f'(1) < 1$. Clearly, $\pi = 1$. The critical case $\mu = 1$ has a similar picture.
Case 2: supercritical, $\mu = f'(1) > 1$. Now, $\pi < 1$.
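Relations (0.4,a) and (0.4,b) suggest a simple numerical recipe: start from $\pi_0 = 0$ (since $Z_0 = 1$) and iterate $f$. The sketch below assumes, purely for illustration, the geometric generating function $f(\theta) = p/(1 - q\theta)$ with $q = 1 - p$, for which the roots of $\pi = f(\pi)$ are $1$ and $p/q$; the function names and tolerances are assumptions of the sketch.

```python
def extinction_probability(f, tol=1e-12, max_iter=100_000):
    """Iterate pi_{n+1} = f(pi_n) from pi_0 = 0 until successive values agree to tol."""
    pi = 0.0
    for _ in range(max_iter):
        nxt = f(pi)
        if abs(nxt - pi) < tol:
            return nxt
        pi = nxt
    return pi

def geometric_gf(p):
    """Generating function of P(X = k) = p * q**k (illustrative offspring law)."""
    q = 1.0 - p
    return lambda theta: p / (1.0 - q * theta)

pi_super = extinction_probability(geometric_gf(0.4))   # mu = q/p = 1.5 > 1
pi_sub = extinction_probability(geometric_gf(0.6))     # mu = q/p = 2/3 < 1
```

In the supercritical case the iteration settles at the root $p/q = 2/3$ lying strictly inside $(0,1)$; in the subcritical case it climbs to 1, in line with the Theorem. Near the critical case $\mu = 1$ the convergence becomes very slow, so the stopping rule here should not be trusted there.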
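0.5. Pause for thought: measure

Now that we have finished revising what introductory courses on probability theory say about branching-process theory, let us think about why we must find a more precise language. To be sure, the claim at (0.4,b) that

\[ \text{(a)} \qquad \pi = \uparrow\lim \pi_n \]

is intuitively plausible, but how could one prove it? We certainly cannot prove it at present because we have no means of stating with pure-mathematical precision what it is supposed to mean. Let us discuss this further.

Back in Section 0.2, we said 'Suppose that we are given a doubly infinite sequence $\{X_r^{(m)} : m, r \in \mathbf{N}\}$ of independent identically distributed random variables each with the same distribution as $X$'. What does this mean? A random variable is a (certain kind of) function on a sample space $\Omega$. We could follow elementary theory in taking $\Omega$ to be the set of all outcomes, the typical element $\omega$ of $\Omega$ being

\[ \omega = (\omega_s^{(r)} : r \in \mathbf{N}, s \in \mathbf{N}), \]

and then setting $X_s^{(r)}(\omega) = \omega_s^{(r)}$. Now $\Omega$ is an uncountable set, so that we are outside the 'combinatorial' context which makes sense of $\pi_n$ in the elementary theory. Moreover, if one assumes the Axiom of Choice, one can prove that it is impossible to assign to all subsets of $\Omega$ a probability satisfying the 'intuitively obvious' axioms and making the $X$'s IID RVs with the correct common distribution. So, we have to know that the set of $\omega$ corresponding to the event 'extinction occurs' is one to which one can uniquely assign a probability (which will then provide a definition of $\pi$).

Example. Consider for a moment what is in some ways a bad attempt to construct a 'probability theory'. Let $\mathcal{C}$ be the class of subsets $C$ of $\mathbf{N}$ for which the 'density'

\[ \rho(C) := \lim_{n \uparrow \infty} n^{-1}\,\sharp\{k : 1 \le k \le n;\ k \in C\} \]

exists. Let $C_n := \{1, 2, \ldots, n\}$. Then $C_n \in \mathcal{C}$ and $C_n \uparrow \mathbf{N}$ in the sense that $C_n \subseteq C_{n+1}$, $\forall n$, and also $\bigcup C_n = \mathbf{N}$. However, $\rho(C_n) = 0$, $\forall n$, but $\rho(\mathbf{N}) = 1$.

Hence the logic which will allow us correctly to deduce (a) from the fact that

\[ \{Z_n = 0\} \uparrow \{\text{extinction occurs}\} \]

fails for the $(\mathbf{N}, \mathcal{C}, \rho)$ set-up: $(\mathbf{N}, \mathcal{C}, \rho)$ is not 'a probability triple'.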
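The failure of monotone convergence for the 'density' can be seen numerically. This is only an illustration under stated assumptions: the true density is a limit, which the sketch replaces by the proportion among the first $m$ integers for one large $m$; the helper names are hypothetical.

```python
def empirical_density(contains, m):
    """Proportion of k in {1, ..., m} lying in the set described by `contains`."""
    return sum(1 for k in range(1, m + 1) if contains(k)) / m

def initial_segment(n):
    """Membership test for C_n = {1, 2, ..., n}."""
    return lambda k: k <= n

m = 10**5
densities = [empirical_density(initial_segment(n), m) for n in (1, 10, 100)]
whole = empirical_density(lambda k: True, m)
```

For each fixed $n$ the proportion $n/m$ tends to 0 as $m$ grows, whereas the whole of $\mathbf{N}$ always has proportion 1, so the densities of the increasing sets $C_n$ do not increase to the density of their union.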
There are problems. Measure theory resolves them, but provides a huge bonus in the form of much deeper results such as the Martingale Convergence Theorem which we now take a first look at - at an intuitive level, I hasten to add.
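0.6. Our first martingale

Recall from (0.2,b) that

\[ Z_{n+1} = X_1^{(n+1)} + \cdots + X_{Z_n}^{(n+1)}, \]

where the $X^{(n+1)}$ variables are independent of the values $Z_1, Z_2, \ldots, Z_n$. It is clear from this that

\[ \mathbf{P}(Z_{n+1} = j \mid Z_0 = i_0, Z_1 = i_1, \ldots, Z_n = i_n) = \mathbf{P}(Z_{n+1} = j \mid Z_n = i_n), \]

a result which you will probably recognize as stating that the process $Z = (Z_n : n \ge 0)$ is a Markov chain. We therefore have

\[ \mathbf{E}(Z_{n+1} \mid Z_0 = i_0, \ldots, Z_n = i_n) = \sum_j j\,\mathbf{P}(Z_{n+1} = j \mid Z_n = i_n) = \mathbf{E}(Z_{n+1} \mid Z_n = i_n), \]

or, in a condensed notation,

\[ \text{(a)} \qquad \mathbf{E}(Z_{n+1} \mid Z_0, Z_1, \ldots, Z_n) = \mathbf{E}(Z_{n+1} \mid Z_n). \]

Of course, it is intuitively obvious that

\[ \text{(b)} \qquad \mathbf{E}(Z_{n+1} \mid Z_n) = \mu Z_n, \]

because each of the $Z_n$ animals in the $n$th generation has on average $\mu$ children. We can confirm result (b) by differentiating the result

\[ \mathbf{E}(\theta^{Z_{n+1}} \mid Z_n) = f(\theta)^{Z_n} \]

with respect to $\theta$ and setting $\theta = 1$.

Now define

\[ \text{(c)} \qquad M_n := Z_n/\mu^n, \quad n \ge 0. \]

Then

\[ \mathbf{E}(M_{n+1} \mid Z_0, Z_1, \ldots, Z_n) = M_n, \]

which exactly says that

(d) $M$ is a martingale relative to the $Z$ process.

Given the history of $Z$ up to stage $n$, the next value $M_{n+1}$ of $M$ is on average what it is now: $M$ is 'constant on average' in this very sophisticated sense of conditional expectation given 'past' and 'present'. The true statement

\[ \text{(e)} \qquad \mathbf{E}(M_n) = 1, \quad \forall n \]

is of course infinitely cruder.

A statement $S$ is said to be true almost surely (a.s.) or with probability 1 if (surprise, surprise!)

\[ \mathbf{P}(S \text{ is true}) = 1. \]

Because our martingale $M$ is non-negative ($M_n \ge 0$, $\forall n$), the Martingale Convergence Theorem implies that it is almost surely true that

\[ \text{(f)} \qquad M_\infty := \lim M_n \text{ exists.} \]

Note that if $M_\infty > 0$ for some outcome (which can happen with positive probability only when $\mu > 1$), then the statement

\[ Z_n/\mu^n \to M_\infty \quad \text{(a.s.)} \]

is a precise formulation of 'exponential growth'. A particularly fascinating question is: suppose that $\mu > 1$; what is the behaviour of $Z$ conditional on the value of $M_\infty$?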
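The one-step identity behind the martingale property, $\mathbf{E}(M_{n+1} \mid Z_n = k) = k/\mu^n$, can be confirmed exactly for a finitely supported offspring law. In the sketch below the pmf (1/4, 1/4, 1/2) and the function names are illustrative assumptions; given $Z_n = k$, the next generation is a sum of $k$ independent copies of $X$, whose pmf is a $k$-fold convolution.

```python
from fractions import Fraction

def convolve(a, b):
    """pmf of the sum of two independent Z+-valued variables."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def k_fold_pmf(x_pmf, k):
    """pmf of X_1 + ... + X_k for IID copies of X (the empty sum when k = 0)."""
    out = [Fraction(1)]
    for _ in range(k):
        out = convolve(out, x_pmf)
    return out

def mean(pmf):
    return sum(j * p for j, p in enumerate(pmf))

x_pmf = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]   # illustrative choice
mu = mean(x_pmf)                                            # equals 5/4

def conditional_mean_next_M(k, n):
    """E(M_{n+1} | Z_n = k) = E(X_1 + ... + X_k) / mu**(n + 1)."""
    return mean(k_fold_pmf(x_pmf, k)) / mu ** (n + 1)
```

For every $k$ and $n$ the returned value equals $k/\mu^n$, which is the value of $M_n$ on the event $\{Z_n = k\}$.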
0.7. Convergence (or not) of expectations

We know that $M_\infty := \lim M_n$ exists with probability 1, and that $\mathbf{E}(M_n) = 1$, $\forall n$. We might be tempted to believe that $\mathbf{E}(M_\infty) = 1$. However, we already know that if $\mu \le 1$, then, almost surely, the process dies out and $M_n$ is eventually 0. Hence

(a) if $\mu \le 1$, then $M_\infty = 0$ (a.s.) and

\[ 0 = \mathbf{E}(M_\infty) \ne \lim \mathbf{E}(M_n) = 1. \]

This is an excellent example to keep in mind when we come to study Fatou's Lemma, valid for any sequence $(Y_n)$ of non-negative random variables:

\[ \mathbf{E}(\liminf Y_n) \le \liminf \mathbf{E}(Y_n). \]

What is 'going wrong' at (a) is that (when $\mu \le 1$) for large $n$, the chances are that $M_n$ will be large if $M_n$ is not 0 and, very roughly speaking, this large value times its small probability will keep $\mathbf{E}(M_n)$ at 1. See the concrete examples in Section 0.9.

Of course, it is very important to know when

\[ \text{(b)} \qquad \lim \mathbf{E}(\cdot) = \mathbf{E}(\lim \cdot), \]

and we do spend quite a considerable time studying this. The best general theorems are rarely good enough to get the best results for concrete problems, as is evidenced by the fact that

(c) $\mathbf{E}(M_\infty) = 1$ if and only if both $\mu > 1$ and $\mathbf{E}(X \log X) < \infty$,

where $X$ is the typical number of children. Of course $0 \log 0 = 0$. If $\mu > 1$ and $\mathbf{E}(X \log X) = \infty$, then, even though the process may not die out, $M_\infty = 0$, a.s.
Since
Finding
Mn
-^ Moo
0,
-^ exp{-XMoo)
(a.s.)
Now
since
each
Mn
> 0,
in absolute
value by the
experiment.
assert
The
the whole sequence (exp( bounded \342\200\224AMn)) is constant 1, independently of the outcome of our
Convergence
Bounded
Theorem
says
that
we can
now
what
we would
wish:
(a)
Since
Eexp(-AA/oo)= limEexp{-XMn).
Mn
Zn/^\"\" and
E(6\302\273^\")
fn{0),
we have
(b)
so
Eexp(-AM,)
that,
fn{exp{-X/fi^)),
in principle
However,
side of (a).
function
can
calculate
the
the left-hand
distribution
a non-negative
random
variable F,
by
\302\273\342\200\224\342\226\272 <
P{Y
y) is completely
A
determined
oji
the
map
\302\273\342\200\224\342\226\272 Eexp(\342\200\224Ay)
(0,cx)).
10
Chapter
0:
Branching-Process
Example
(0.8)..
Hence, in principle,
we can
the
find
the
distribution
of Moo-
We have
seen that
real
problem
is to
calculate the
function
i:(A):=Eexp(-AMoo).
Using
(b),
/n+i = f
equation:
fn-,
and
establishthe functional
(c)
consequence
the
Bounded
Convergence
I(Am)
= /(X(A)).
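The functional equation (0.8,c) can be tested numerically by replacing the limit in (0.8,a) with a large but finite $n$ in (0.8,b). The sketch below is written under assumptions: it uses the geometric generating function $f(\theta) = p/(1 - q\theta)$ with $p = 0.4$ (so $\mu = 1.5$), the choice $n = 40$ is arbitrary, and the two sides then agree only up to an error of order $\mu^{-n}$.

```python
import math

def f(theta, p=0.4):
    """Generating function of the (assumed) geometric offspring law."""
    q = 1.0 - p
    return p / (1.0 - q * theta)

def laplace_approx(lam, n=40, p=0.4):
    """Approximate L(lam) by f_n(exp(-lam / mu**n)), as in (0.8,a) and (0.8,b)."""
    mu = (1.0 - p) / p
    value = math.exp(-lam / mu**n)
    for _ in range(n):                  # apply f n times: the n-fold composition
        value = f(value, p)
    return value

p, lam = 0.4, 0.7
mu = (1.0 - p) / p
lhs = laplace_approx(lam * mu)          # approximates L(lam * mu)
rhs = f(laplace_approx(lam))            # approximates f(L(lam))
```

Taking $n$ much larger is counter-productive in floating point, because $\exp(-\lambda/\mu^n)$ becomes indistinguishable from 1 and the rounding error is amplified by roughly $\mu^n$ under the iteration.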
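0.9. Concrete example

This concrete example is just about the only one in which one can calculate everything explicitly, but, in the way of mathematics, it is useful in many contexts.

We take the 'typical number of children' $X$ to have a geometric distribution:

\[ \text{(a)} \qquad \mathbf{P}(X = k) = pq^k \quad (k \in \mathbf{Z}^+), \]

where $0 < p < 1$, $q := 1 - p$.

Then, as you can easily check,

\[ \text{(b)} \qquad f(\theta) = \frac{p}{1 - q\theta}, \qquad \mu = \frac{q}{p}, \]

and

\[ \pi = \begin{cases} p/q & \text{if } q > p, \\ 1 & \text{if } q \le p. \end{cases} \]

To calculate $f \circ f \circ \cdots \circ f$, we use a device familiar from the geometry of the upper half-plane. If

\[ G = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix} \]

is a non-singular $2 \times 2$ matrix, define the fractional linear transformation:

\[ \text{(c)} \qquad G(\theta) = \frac{g_{11}\theta + g_{12}}{g_{21}\theta + g_{22}}. \]

Then you can check that if $H$ is another such matrix, then

\[ G(H(\theta)) = (GH)(\theta), \]

so that composition of fractional linear transformations corresponds to matrix multiplication.

Suppose that $p \ne q$. Then, by diagonalization, we find that the $n$th power of the matrix corresponding to $f$ is

\[ \begin{pmatrix} 0 & p \\ -q & 1 \end{pmatrix}^n = (q - p)^{-1} \begin{pmatrix} 1 & p \\ 1 & q \end{pmatrix} \begin{pmatrix} p^n & 0 \\ 0 & q^n \end{pmatrix} \begin{pmatrix} q & -p \\ -1 & 1 \end{pmatrix}, \]

so that

\[ \text{(d)} \qquad f_n(\theta) = \frac{p\mu^n(1 - \theta) + q\theta - p}{q\mu^n(1 - \theta) + q\theta - p}. \]

If $\mu = q/p < 1$, then $\lim_n f_n(\theta) = 1$, corresponding to the fact that the process dies out.

Suppose now that $\mu > 1$. Then you can easily check that, for $\lambda > 0$,

\[ L(\lambda) := \mathbf{E}\exp(-\lambda M_\infty) = \lim f_n(\exp(-\lambda/\mu^n)) = \frac{p\lambda + q - p}{q\lambda + q - p} = \pi e^{-\lambda \cdot 0} + \int_0^\infty (1 - \pi)^2 e^{-\lambda x} e^{-(1-\pi)x}\,dx, \]

from which we deduce that

\[ \mathbf{P}(M_\infty = 0) = \pi, \]

and

\[ \mathbf{P}(x < M_\infty < x + dx) = (1 - \pi)^2 e^{-(1-\pi)x}\,dx \quad (x > 0), \]

or, better,

\[ \mathbf{P}(M_\infty > x) = (1 - \pi)e^{-(1-\pi)x} \quad (x > 0). \]

Suppose that $\mu < 1$. In this case, it is interesting to ask: what is the distribution of $Z_n$ conditioned by $Z_n \ne 0$? We find that

\[ \mathbf{E}(\theta^{Z_n} \mid Z_n \ne 0) = \frac{f_n(\theta) - f_n(0)}{1 - f_n(0)} = \frac{\alpha_n\theta}{1 - \beta_n\theta}, \]

where

\[ \alpha_n = \frac{p - q}{p - q\mu^n}, \qquad \beta_n = \frac{q - q\mu^n}{p - q\mu^n}, \]

so $0 < \alpha_n < 1$ and $\alpha_n + \beta_n = 1$. As $n \to \infty$, we see that

\[ \alpha_n \to 1 - \mu, \qquad \beta_n \to \mu, \]

so (this is justified)

\[ \text{(e)} \qquad \lim_{n \to \infty} \mathbf{P}(Z_n = k \mid Z_n \ne 0) = (1 - \mu)\mu^{k-1} \quad (k \in \mathbf{N}). \]

Suppose that $\mu = 1$. You can show by induction that

\[ f_n(\theta) = \frac{n - (n - 1)\theta}{(n + 1) - n\theta}, \]

and that

\[ \mathbf{E}(e^{-\lambda Z_n/n} \mid Z_n \ne 0) \to 1/(1 + \lambda), \]

corresponding to

\[ \text{(f)} \qquad \mathbf{P}(Z_n/n > x \mid Z_n \ne 0) \to e^{-x}, \quad x > 0. \]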
that some
when insight
E(Mn)
1, Vn, but
E(Moo) =
0. Can
that
First considerthe
for
when
jjl
<
1.
Result
(e) makes it
plausible
large
n, E{Zn\\Zn
^ 0)
is roughly
(1
//) E
kfi'^-' -
1/(1-
fi).
We
know
that
P{Zn ^
so
0) = 1 -
/\342\200\236(0)
is
roughly
(1
fi)fi^,
we
should
have (roughly)
=
E(M\342\200\236)
('^
Z\342\200\236 ^
o)
^ P{Z\342\200\236
0)
which might
values
the
'balance'
E(Mn)
= 1
is achievedby
big
times
..(0.9)
Chapter 0: A
case when
Branching-Process
Example
13
fi
= 1.
=
Then
l/(n ^
Zn
P(Z\342\200\236^0)
+ l), 0 is
\"^
1, so
correct
that Mn
order
We
0 is
n,
the
of magnitude
just
for balance.
Warning.
have of
argument
been using for 'correct intuitive explanations' which might have misled us into thinking that
= 1 in
But, of
course,
the
result = 1
E{Mn\\Z\342\200\236 ^
0)P(Z\342\200\236 7^
0)
is a
matter of
obvious
fact.
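The two 'roughly' statements for the subcritical case can be quantified with the closed form (0.9,d). The sketch below assumes $p = 3/5$ (so $\mu = 2/3$) and computes, in exact arithmetic, the ratio of $\mathbf{P}(Z_n \ne 0)$ to $(1 - \mu)\mu^n$ and the ratio of $\mathbf{E}(Z_n \mid Z_n \ne 0)$ to $1/(1 - \mu)$; the function name and the chosen values of $n$ are illustrative.

```python
from fractions import Fraction

def survival_probability(p, n):
    """P(Z_n != 0) = 1 - f_n(0), with f_n(0) read off from (0.9,d); needs p != q."""
    q = 1 - p
    mu_n = (q / p) ** n
    return 1 - (p * mu_n - p) / (q * mu_n - p)

p = Fraction(3, 5)                 # subcritical: mu = q/p = 2/3
mu = (1 - p) / p
rows = []
for n in (5, 20, 80):
    alive = survival_probability(p, n)
    mean_given_alive = mu**n / alive          # E(Z_n | Z_n != 0) = E(Z_n) / P(Z_n != 0)
    rows.append((n,
                 float(alive / ((1 - mu) * mu**n)),
                 float(mean_given_alive * (1 - mu))))
```

Both ratios in `rows` approach 1 as $n$ increases, while their product is exactly 1 for every $n$, which is the identity $\mathbf{E}(M_n) = 1$ in another guise.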
PART A: FOUNDATIONS

Chapter 1

Measure Spaces

1.0. Introductory remarks

Topology is about open sets. The characterizing property of a continuous function $f$ is that the inverse image $f^{-1}(G)$ of an open set $G$ is open.

Measure theory is about measurable sets. The characterizing property of a measurable function $f$ is that the inverse image $f^{-1}(A)$ of any measurable set is measurable.

In topology, one axiomatizes the notion of 'open set', insisting in particular that the union of any collection of open sets is open, and that the intersection of a finite collection of open sets is open.

In measure theory, one axiomatizes the notion of 'measurable set', insisting that the union of a countable collection of measurable sets is measurable, and that the intersection of a countable collection of measurable sets is also measurable. Also, the complement of a measurable set must be measurable, and the whole space must be measurable. Thus the measurable sets form a $\sigma$-algebra, a structure stable (or 'closed') under countably many set operations. Without the insistence that 'only countably many operations are allowed', measure theory would be self-contradictory - a point lost on certain philosophers of probability.

The probability that a point chosen at random on the surface of the unit sphere $S^2$ in $\mathbf{R}^3$ falls into the subset $F$ of $S^2$ is just the area of $F$ divided by the total area $4\pi$. What could be easier?

However, Banach and Tarski showed (see Wagon (1985)) that if the Axiom of Choice is assumed, as it is throughout conventional mathematics, then there exists a subset $F$ of the unit sphere $S^2$ in $\mathbf{R}^3$ such that for $3 \le k < \infty$ (and even for $k = \infty$), $S^2$ is the disjoint union of $k$ exact copies of $F$:

\[ S^2 = \bigcup_{i=1}^{k} \tau_i^{(k)} F, \]

where each $\tau_i^{(k)}$ is a rotation. If $F$ has an 'area', then that area must simultaneously be $4\pi/3, 4\pi/4, \ldots, 0$. The only conclusion is that the set $F$ is non-measurable (not Lebesgue measurable): it is so complicated that one cannot assign an area to it. Banach and Tarski have not broken the Law of Conservation of Area: they have simply operated outside its jurisdiction.

Remarks. (i) Because every rotation $\tau$ has a fixed point $x$ on $S^2$ such that $\tau(x) = x$, it is not possible to find a subset $A$ of $S^2$ and a rotation $\tau$ such that $A \cup \tau(A) = S^2$ and $A \cap \tau(A) = \emptyset$. So, we could not have taken $k = 2$.

(ii) Banach and Tarski even proved that given any two bounded subsets $A$ and $B$ of $\mathbf{R}^3$ each with non-empty interior, it is possible to decompose $A$ into a certain finite number $n$ of disjoint pieces $A = \bigcup_{i=1}^{n} A_i$ and $B$ into the same number $n$ of disjoint pieces $B = \bigcup_{i=1}^{n} B_i$, in such a way that, for each $i$, $A_i$ is Euclid-congruent to $B_i$!!! So, we can disassemble $A$ and rebuild it as $B$.

(iii) Section A1.1 in the appendix to this chapter gives a non-measurable subset of $S^1$.

This chapter introduces $\sigma$-algebras, $\pi$-systems, and measures and emphasizes monotone-convergence properties of measures. We shall see in later chapters that, although not all sets are measurable, it is always the case for probability theory that enough sets are measurable.

1.1. Definitions of algebra, $\sigma$-algebra
1.1. Definitions of algebra, $\sigma$-algebra

Let $S$ be a set.

Algebra on S
A collection $\Sigma_0$ of subsets of $S$ is called an algebra on $S$ (or algebra of subsets of $S$) if
(i) $S \in \Sigma_0$,
(ii) $F \in \Sigma_0 \Rightarrow F^c := S \setminus F \in \Sigma_0$,
(iii) $F, G \in \Sigma_0 \Rightarrow F \cup G \in \Sigma_0$.
[Note that $\emptyset = S^c \in \Sigma_0$ and
$$F, G \in \Sigma_0 \Rightarrow F \cap G = (F^c \cup G^c)^c \in \Sigma_0.]$$
Thus, an algebra on $S$ is a family of subsets of $S$ stable under finitely many set operations.
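Exercise (optional). Let $\mathcal{C}$ be the class of subsets $C$ of $\mathbb{N}$ for which the 'density'
$$\lim_{m \to \infty} m^{-1}\, \#\{k : 1 \le k \le m;\ k \in C\}$$
exists. We might like to think of this density (if it exists) as 'the probability that a number chosen at random belongs to $C$'. But there are many reasons why this does not conform to a proper probability theory. (We saw one in Section 0.5.) For example, you should find elements $F$ and $G$ in $\mathcal{C}$ for which $F \cap G \notin \mathcal{C}$.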
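Note on terminology ('algebra versus field'). An algebra in our sense is a true algebra in the algebraists' sense with $\cap$ as product, and symmetric difference
$$A \triangle B := (A \cup B) \setminus (A \cap B)$$
as 'sum', the underlying field of the algebra being the field with 2 elements. (This is why we prefer 'algebra of subsets' to 'field of subsets': there is no way that an algebra of subsets is a field in the algebraists' sense - unless $\Sigma_0$ is trivial, that is, $\Sigma_0 = \{S, \emptyset\}$.)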
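$\sigma$-algebra on S
A collection $\Sigma$ of subsets of $S$ is called a $\sigma$-algebra on $S$ (or $\sigma$-algebra of subsets of $S$) if $\Sigma$ is an algebra on $S$ such that whenever $F_n \in \Sigma$ ($n \in \mathbb{N}$), then
$$\bigcup_n F_n \in \Sigma.$$
[Note that if $\Sigma$ is a $\sigma$-algebra on $S$ and $F_n \in \Sigma$ for $n \in \mathbb{N}$, then
$$\bigcap_n F_n = \Big(\bigcup_n F_n^c\Big)^c \in \Sigma.]$$
Thus, a $\sigma$-algebra on $S$ is a family of subsets of $S$ 'stable under any countable collection of set operations'.

Note. Whereas it is usually possible to write in 'closed form' the typical element of many of the algebras of sets which we shall meet (see Section 1.8 below for a first example), it is usually impossible to write down the typical element of a $\sigma$-algebra. This is the reason for our concentrating where possible on the much simpler '$\pi$-systems'.

Measurable space
A pair $(S, \Sigma)$, where $S$ is a set and $\Sigma$ is a $\sigma$-algebra on $S$, is called a measurable space. An element of $\Sigma$ is called a $\Sigma$-measurable subset of $S$.

$\sigma(\mathcal{C})$, $\sigma$-algebra generated by a class $\mathcal{C}$ of subsets
Let $\mathcal{C}$ be a class of subsets of $S$. Then $\sigma(\mathcal{C})$, the $\sigma$-algebra generated by $\mathcal{C}$, is the smallest $\sigma$-algebra $\Sigma$ on $S$ such that $\mathcal{C} \subseteq \Sigma$. It is the intersection of all $\sigma$-algebras on $S$ which have $\mathcal{C}$ as a subclass. (Obviously, the class of all subsets of $S$ is a $\sigma$-algebra which extends $\mathcal{C}$.)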
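On a finite set $S$ the definition of $\sigma(\mathcal{C})$ can be carried out by brute force, because countable unions reduce to finite ones. The Python sketch below is only an illustration under that finiteness assumption; the function name `generated_sigma_algebra` and the sample sets are ours, not the text's. It closes $\mathcal{C}$ under complements and pairwise unions until nothing new appears.

```python
from itertools import combinations

def generated_sigma_algebra(S, C):
    """Return sigma(C) on the FINITE set S as a set of frozensets.

    Illustrative only: on a finite set, closure under complements and
    finite unions already gives a sigma-algebra.
    """
    S = frozenset(S)
    sigma = {frozenset(), S} | {frozenset(c) for c in C}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        # close under complements
        for F in current:
            if S - F not in sigma:
                sigma.add(S - F)
                changed = True
        # close under pairwise unions
        for F, G in combinations(current, 2):
            if F | G not in sigma:
                sigma.add(F | G)
                changed = True
    return sigma

S = {1, 2, 3, 4}
C = [{1}, {1, 2}]
sigma = generated_sigma_algebra(S, C)
```

Here the 'atoms' of $\sigma(\mathcal{C})$ are $\{1\}$, $\{2\}$ and $\{3,4\}$, so the generated family has $2^3 = 8$ elements; the set $\{3\}$ is not among them.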
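1.2. Examples. Borel $\sigma$-algebras, $\mathcal{B}(S)$, $\mathcal{B} = \mathcal{B}(\mathbb{R})$

Let $S$ be a topological space.

$\mathcal{B}(S)$
$\mathcal{B}(S)$, the Borel $\sigma$-algebra on $S$, is the $\sigma$-algebra generated by the family of open subsets of $S$. With slight abuse of notation,
$$\mathcal{B}(S) := \sigma(\text{open sets}).$$

$\mathcal{B} := \mathcal{B}(\mathbb{R})$
It is standard shorthand that $\mathcal{B} := \mathcal{B}(\mathbb{R})$. The $\sigma$-algebra $\mathcal{B}$ is the most important of all $\sigma$-algebras. Every subset of $\mathbb{R}$ which you meet in everyday use is an element of $\mathcal{B}$; and indeed it is difficult (but possible!) to find a subset of $\mathbb{R}$ constructed explicitly (without the Axiom of Choice) which is not in $\mathcal{B}$.

Elements of $\mathcal{B}$ can be quite complicated. However, the collection
$$\pi(\mathbb{R}) := \{(-\infty, x] : x \in \mathbb{R}\}$$
(not a standard notation) is very easy to understand, and it is often the case that all we need to know about $\mathcal{B}$ is that

(a) $\mathcal{B} = \sigma(\pi(\mathbb{R}))$.

Proof of (a). For each $x$ in $\mathbb{R}$, $(-\infty, x] = \bigcap_{n \in \mathbb{N}} (-\infty, x + n^{-1})$, so that, as a countable intersection of open sets, the set $(-\infty, x]$ is in $\mathcal{B}$.
All that remains to be proved is that every open subset $G$ of $\mathbb{R}$ is in $\sigma(\pi(\mathbb{R}))$. But every such $G$ is a countable union of open intervals, so we need only show that, for $a, b \in \mathbb{R}$ with $a < b$,
$$(a, b) \in \sigma(\pi(\mathbb{R})).$$
But, for any $u$ with $u > a$,
$$(a, u] = (-\infty, u] \cap (-\infty, a]^c \in \sigma(\pi(\mathbb{R})),$$
and since, for $\varepsilon = \tfrac{1}{2}(b - a)$,
$$(a, b) = \bigcup_n (a, b - \varepsilon n^{-1}],$$
we see that $(a, b) \in \sigma(\pi(\mathbb{R}))$, and the proof is complete.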
1.3. Definitions concerning set functions

Let $S$ be a set, let $\Sigma_0$ be an algebra on $S$, and let $\mu_0$ be a non-negative set function
$$\mu_0 : \Sigma_0 \to [0, \infty].$$

Additive
Then $\mu_0$ is called additive if $\mu_0(\emptyset) = 0$ and, for $F, G \in \Sigma_0$,
$$F \cap G = \emptyset \ \Rightarrow\ \mu_0(F \cup G) = \mu_0(F) + \mu_0(G).$$

Countably additive
The map $\mu_0$ is called countably additive (or $\sigma$-additive) if $\mu_0(\emptyset) = 0$ and whenever $(F_n : n \in \mathbb{N})$ is a sequence of disjoint sets in $\Sigma_0$ with union $F = \bigcup F_n$ in $\Sigma_0$ (note that this is an assumption since $\Sigma_0$ need not be a $\sigma$-algebra), then
$$\mu_0(F) = \sum_n \mu_0(F_n).$$
Of course (why?), a countably additive set function is additive.
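1.4. Definition of measure space

Let $(S, \Sigma)$ be a measurable space, so that $\Sigma$ is a $\sigma$-algebra on $S$. A map
$$\mu : \Sigma \to [0, \infty]$$
is called a measure on $(S, \Sigma)$ if $\mu$ is countably additive. The triple $(S, \Sigma, \mu)$ is then called a measure space.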
1.5. Definitions concerning measures

Let $(S, \Sigma, \mu)$ be a measure space. Then $\mu$ (or indeed the measure space $(S, \Sigma, \mu)$) is called
finite if $\mu(S) < \infty$,
$\sigma$-finite if there is a sequence $(S_n : n \in \mathbb{N})$ of elements of $\Sigma$ such that
$$\mu(S_n) < \infty \ (\forall n \in \mathbb{N}) \quad \text{and} \quad \bigcup S_n = S.$$

Warning. Intuition is usually OK for finite measures, and adapts well for $\sigma$-finite measures. However, measures which are not $\sigma$-finite can be crazy; fortunately, there are no such measures in this book.

Probability measure, probability triple
Our measure $\mu$ is called a probability measure if
$$\mu(S) = 1,$$
and $(S, \Sigma, \mu)$ is then called a probability triple.

$\mu$-null element of $\Sigma$, almost everywhere (a.e.)
An element $F$ of $\Sigma$ is called $\mu$-null if $\mu(F) = 0$. A statement $\mathcal{S}$ about points $s$ of $S$ is said to hold almost everywhere (a.e.) if
$$F := \{s : \mathcal{S}(s) \text{ is false}\} \in \Sigma \quad \text{and} \quad \mu(F) = 0.$$
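1.6. LEMMA. Uniqueness of extension, $\pi$-systems

Moral: $\sigma$-algebras are 'difficult', but $\pi$-systems are 'easy'; so we aim to work with the latter.

▶(a) Let $S$ be a set. Let $\mathcal{I}$ be a $\pi$-system on $S$, that is, a family of subsets of $S$ stable under finite intersection:
$$I_1, I_2 \in \mathcal{I} \ \Rightarrow\ I_1 \cap I_2 \in \mathcal{I}.$$
Let $\Sigma := \sigma(\mathcal{I})$. Suppose that $\mu_1$ and $\mu_2$ are measures on $(S, \Sigma)$ such that $\mu_1(S) = \mu_2(S) < \infty$ and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then
$$\mu_1 = \mu_2 \text{ on } \Sigma.$$

▶(b) Corollary. If two probability measures agree on a $\pi$-system, then they agree on the $\sigma$-algebra generated by that $\pi$-system.

The example $\mathcal{B} = \sigma(\pi(\mathbb{R}))$ is of course the most important example of the $\Sigma = \sigma(\mathcal{I})$ in the theorem.

This result will play an important role. Indeed, it will be applied more frequently than the celebrated existence result in Section 1.7. Because of this, the proof of Lemma 1.6 given in the appendix to this chapter should perhaps be consulted - but read the remainder of this chapter first.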
1.7. THEOREM. Carathéodory's Extension Theorem

▶ Let $S$ be a set, let $\Sigma_0$ be an algebra on $S$, and let
$$\Sigma := \sigma(\Sigma_0).$$
If $\mu_0$ is a countably additive map $\mu_0 : \Sigma_0 \to [0, \infty]$, then there exists a measure $\mu$ on $(S, \Sigma)$ such that
$$\mu = \mu_0 \text{ on } \Sigma_0.$$
If $\mu_0(S) < \infty$, then, by Lemma 1.6, this extension is unique - an algebra is a $\pi$-system!

In a sense, this result should have more ▶ signs than any other, for without it we could not construct any interesting models. However, once we have our model, we make no further use of the theorem.
The proof of this result given in Sections A1.5-1.8 of the appendix is there for completeness. It will do no harm to assume the result for this course. Let us now see how the theorem is used.
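1.8. Lebesgue measure Leb on $((0,1], \mathcal{B}(0,1])$

Let $S = (0,1]$. For $F \subseteq S$, say that $F \in \Sigma_0$ if $F$ may be written as a finite union
$$(*)\qquad F = (a_1, b_1] \cup \ldots \cup (a_r, b_r]$$
where $r \in \mathbb{N}$, $0 \le a_1 \le b_1 \le \ldots \le a_r \le b_r \le 1$. Then $\Sigma_0$ is an algebra on $(0,1]$ and
$$\Sigma := \sigma(\Sigma_0) = \mathcal{B}(0,1].$$
(We write $\mathcal{B}(0,1]$ instead of $\mathcal{B}((0,1])$.) For $F$ as at $(*)$, let
$$\mu_0(F) = \sum_{k \le r} (b_k - a_k).$$
Then $\mu_0$ is well-defined and additive on $\Sigma_0$ (this is easy). Moreover, $\mu_0$ is countably additive on $\Sigma_0$. (This is not trivial. See Section A1.9.) Hence, by Theorem 1.7, there exists a unique measure $\mu$ on $((0,1], \mathcal{B}(0,1])$ extending $\mu_0$ on $\Sigma_0$. This measure $\mu$ is called Lebesgue measure on $((0,1], \mathcal{B}(0,1])$ or (loosely) Lebesgue measure on $(0,1]$. We shall often denote $\mu$ by Leb. Lebesgue measure (still denoted by Leb) on $([0,1], \mathcal{B}[0,1])$ is of course obtained by a trivial modification, the set $\{0\}$ having Lebesgue measure 0. Of course, Leb makes precise the concept of length.

In a similar way, we can construct ($\sigma$-finite) Lebesgue measure (which we also denote by Leb) on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$.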
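The claim that $\mu_0$ is 'well-defined' means that the sum of lengths does not depend on how a set is written as a finite union of half-open intervals. A small Python sketch can make this concrete. It is an illustration only: the representation of $(a, b]$ as a pair `(a, b)` and the helper names `normalise` and `mu0` are our own assumptions, and exact rationals are used to avoid rounding.

```python
from fractions import Fraction

def normalise(intervals):
    """Merge half-open intervals (a, b] into sorted, disjoint, non-adjacent pieces."""
    pieces = sorted((a, b) for a, b in intervals if a < b)
    merged = []
    for a, b in pieces:
        if merged and a <= merged[-1][1]:
            # (a, b] overlaps or touches the previous piece: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def mu0(intervals):
    """Total length of the finite union of the given half-open intervals."""
    return sum((b - a for a, b in normalise(intervals)), Fraction(0))

half = Fraction(1, 2)
two_pieces = [(Fraction(0), half), (half, Fraction(1))]   # (0,1/2] U (1/2,1]
one_piece = [(Fraction(0), Fraction(1))]                  # (0,1]
```

Both `two_pieces` and `one_piece` describe the same set $(0,1]$, and `mu0` assigns both the value 1; for disjoint sets the values add, as additivity requires.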
1.9. LEMMA. Elementary inequalities

Let $(S, \Sigma, \mu)$ be a measure space. Then
(a) $\mu(A \cup B) \le \mu(A) + \mu(B)$ $(A, B \in \Sigma)$,
▶(b) $\mu\big(\bigcup_{i \le n} F_i\big) \le \sum_{i \le n} \mu(F_i)$ $(F_1, F_2, \ldots, F_n \in \Sigma)$.
Furthermore, if $\mu(S) < \infty$, then
(c) $\mu(A \cup B) = \mu(A) + \mu(B) - \mu(A \cap B)$ $(A, B \in \Sigma)$,
(d) (inclusion-exclusion formula): for $F_1, F_2, \ldots, F_n \in \Sigma$,
$$\mu\Big(\bigcup_{i \le n} F_i\Big) = \sum_{i \le n} \mu(F_i) - \sum\sum_{i < j \le n} \mu(F_i \cap F_j) + \sum\sum\sum_{i < j < k \le n} \mu(F_i \cap F_j \cap F_k) - \ldots + (-1)^{n-1} \mu(F_1 \cap F_2 \cap \ldots \cap F_n),$$
successive partial sums alternating between over- and under-estimates.

You will surely have seen some version of these results previously. Result (c) is obvious because $A \cup B$ is the disjoint union $A \cup (B \setminus (A \cap B))$. But (c) $\Rightarrow$ (a) $\Rightarrow$ (b) - check that 'infinities do not matter'. You can deduce (d) from (c) by induction, but, as we shall see later, the neat way to prove (d) is by integration.
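The alternation between over- and under-estimates in (d) is easy to watch on a toy case. The following Python sketch is an illustration under our own assumptions: the measure is counting measure on a small finite set, and the name `inclusion_exclusion_partial_sums` and the three sample sets are hypothetical.

```python
from itertools import combinations

def inclusion_exclusion_partial_sums(sets):
    """Successive partial sums of the inclusion-exclusion formula
    for counting measure on a finite collection of finite sets."""
    partial, total = [], 0
    for r in range(1, len(sets) + 1):
        # sum of mu(F_{i1} n ... n F_{ir}) over all r-fold intersections
        term = sum(len(set.intersection(*combo))
                   for combo in combinations(sets, r))
        total += (-1) ** (r - 1) * term
        partial.append(total)
    return partial

F = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
sums = inclusion_exclusion_partial_sums(F)
union_size = len(set.union(*F))
```

For these sets the partial sums are 9, 4, 5 while the union has 5 points: the first sum over-estimates, the second under-estimates, and the last is exact.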
1.10. LEMMA. Monotone-convergence properties of measures

These results are often needed for making things rigorous. (Peep ahead to the 'Monkey typing Shakespeare' Section 4.9.) Again, let $(S, \Sigma, \mu)$ be a measure space.

▶(a) If $F_n \in \Sigma$ $(n \in \mathbb{N})$ and $F_n \uparrow F$, then $\mu(F_n) \uparrow \mu(F)$.

Notes. $F_n \uparrow F$ means: $F_n \subseteq F_{n+1}$ $(\forall n \in \mathbb{N})$, $\bigcup F_n = F$. Result (a) is the fundamental property of measure.

Proof of (a). Write $G_1 := F_1$, $G_n := F_n \setminus F_{n-1}$ $(n \ge 2)$. Then the sets $G_n$ $(n \in \mathbb{N})$ are disjoint, and
$$\mu(F_n) = \mu(G_1 \cup G_2 \cup \ldots \cup G_n) = \sum_{k \le n} \mu(G_k) \uparrow \sum_{k < \infty} \mu(G_k) = \mu(F). \qquad \square$$

Application. In a proper formulation of the branching-process example of Chapter 0, $\{Z_n = 0\} \uparrow \{\text{extinction occurs}\}$, so that $\pi_n \uparrow \pi$. (A proper formulation of the branching-process example will be given later.)

▶(b) If $G_n \in \Sigma$, $G_n \downarrow G$ and $\mu(G_k) < \infty$ for some $k$, then $\mu(G_n) \downarrow \mu(G)$.

Proof of (b). For $n \in \mathbb{N}$, let $F_n := G_k \setminus G_{k+n}$, and now apply part (a).

Example - to indicate what can 'go wrong'. For $n \in \mathbb{N}$, let $H_n := (n, \infty)$. Then $\mathrm{Leb}(H_n) = \infty$, $\forall n$, but $H_n \downarrow \emptyset$.

▶(c) The union of a countable number of $\mu$-null sets is $\mu$-null.
This is a trivial corollary of the results above.
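1.11. Example/Warning

Let $(S, \Sigma, \mu)$ be $([0,1], \mathcal{B}[0,1], \mathrm{Leb})$. Let $\varepsilon(k)$ be a sequence of strictly positive numbers such that $\varepsilon(k) \downarrow 0$. For a single point $x$ of $S$, we have
(a) $\{x\} \subseteq (x - \varepsilon(k), x + \varepsilon(k)) \cap S$,
so that for every $k$, $\mu(\{x\}) \le 2\varepsilon(k)$, and so $\mu(\{x\}) = 0$. That $\{x\}$ is $\mathcal{B}(S)$-measurable follows because $\{x\}$ is the intersection of the countable number of open subsets of $S$ on the right-hand side of (a).

Let $V = \mathbb{Q} \cap [0,1]$, the set of rationals in $[0,1]$. Since $V$ is a countable union of singletons: $V = \{v_n : n \in \mathbb{N}\}$, it is clear that $V$ is $\mathcal{B}[0,1]$-measurable and that $\mathrm{Leb}(V) = 0$. We can include $V$ in an open subset of $S$ of measure at most $4\varepsilon(k)$ as follows:
$$V \subseteq G_k = \bigcup_{n \in \mathbb{N}} \big[(v_n - \varepsilon(k)2^{-n},\ v_n + \varepsilon(k)2^{-n}) \cap S\big] =: \bigcup_n I_{n,k}.$$
Clearly, $H := \bigcap_k G_k$ satisfies $\mathrm{Leb}(H) = 0$ and $V \subseteq H$. Now, it is a consequence of the Baire category theorem (see the appendix) that $H$ is uncountable, so
(b) the set $H$ is an uncountable set of measure 0; moreover,
$$H = \bigcap_k \bigcup_n I_{n,k} \ne \bigcup_n \bigcap_k I_{n,k} = V.$$
Throughout the subject, we have to be careful about interchanging orders of operations.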
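The covering of the rationals by small intervals can be checked numerically for a finite initial segment of an enumeration. The Python sketch below is illustrative only: the particular enumeration of $V$ (by increasing denominator) and the function names are our own assumptions. It shows that the interval lengths $2\varepsilon 2^{-n}$ sum to less than $2\varepsilon$, which is comfortably within the bound $4\varepsilon$ quoted above.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """Enumerate the rationals in [0, 1] with denominator <= max_denominator,
    each listed once (one possible enumeration v_1, v_2, ...)."""
    seen, out = set(), []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            v = Fraction(p, q)
            if v not in seen:
                seen.add(v)
                out.append(v)
    return out

def cover_length_bound(eps, count):
    """Sum of the lengths 2*eps*2^(-n), n = 1..count, of the covering intervals."""
    return sum(2 * eps * Fraction(1, 2 ** n) for n in range(1, count + 1))

v = rationals_in_unit_interval(3)        # 0, 1, 1/2, 1/3, 2/3
eps = Fraction(1, 10)
bound = cover_length_bound(eps, len(v))  # strictly less than 2*eps
```

However many rationals are enumerated, the total length stays below $2\varepsilon$; this is the arithmetic behind $\mathrm{Leb}(G_k)$ being small.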
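Chapter 2
Events

2.1. Model for experiment: $(\Omega, \mathcal{F}, \mathbf{P})$

A model for an experiment involving randomness takes the form of a probability triple $(\Omega, \mathcal{F}, \mathbf{P})$ in the sense of Section 1.5.

Sample space
$\Omega$ is a set called the sample space.

Sample point
A point $\omega$ of $\Omega$ is called a sample point.

Event
The $\sigma$-algebra $\mathcal{F}$ on $\Omega$ is called the family of events, so that an event is an element of $\mathcal{F}$, that is, an $\mathcal{F}$-measurable subset of $\Omega$.

By definition of probability triple, $\mathbf{P}$ is a probability measure on $(\Omega, \mathcal{F})$.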
2.2. The intuitive meaning

Tyche, Goddess of Chance, chooses a point $\omega$ of $\Omega$ 'at random' according to the law $\mathbf{P}$, in that, for $F$ in $\mathcal{F}$, $\mathbf{P}(F)$ represents the 'probability' (in the sense understood by our intuition) that the point $\omega$ chosen by Tyche belongs to $F$.

The chosen point $\omega$ determines the outcome of the experiment. Thus there is a map
$$\Omega \to \text{set of outcomes}, \qquad \omega \mapsto \text{outcome}.$$
There is no reason why this 'map' (the co-domain lies in our intuition!) should be one-one. Often it is the case that although there is some obvious 'minimal' or 'canonical' model for an experiment, it is better to use some richer model. (For example, we can read off many properties of coin tossing by imbedding the associated random walk in a Brownian motion.)
2.3. Examples of $(\Omega, \mathcal{F})$ pairs

We leave the question of assigning probabilities until later.

(a) Experiment: Toss coin twice. We can take
$$\Omega = \{HH, HT, TH, TT\}, \qquad \mathcal{F} = \mathcal{P}(\Omega) := \text{set of all subsets of } \Omega.$$
In this model, the intuitive event 'At least one head is obtained' is described by the mathematical event (element of $\mathcal{F}$) $\{HH, HT, TH\}$.

(b) Experiment: Toss coin infinitely often. We can take
$$\Omega = \{H, T\}^{\mathbb{N}},$$
so that a typical point $\omega$ of $\Omega$ is a sequence
$$\omega = (\omega_1, \omega_2, \ldots), \qquad \omega_n \in \{H, T\}.$$
We certainly wish to speak of the intuitive event '$\omega_n = W$', where $W \in \{H, T\}$, and it is natural to choose
$$\mathcal{F} = \sigma(\{\omega \in \Omega : \omega_n = W\} : n \in \mathbb{N},\ W \in \{H, T\}).$$
Although $\mathcal{F} \ne \mathcal{P}(\Omega)$ (accept this!), it turns out that $\mathcal{F}$ is big enough; for example, we shall see in Section 3.7 that the truth set
$$F = \Big\{\omega : \frac{\#(k \le n : \omega_k = H)}{n} \to \tfrac{1}{2}\Big\}$$
of the statement
$$\frac{\text{number of heads in } n \text{ tosses}}{n} \to \tfrac{1}{2}$$
is an element of $\mathcal{F}$.
Note that we can use the current model as a more informative model for the experiment in (a), using the map $\omega \mapsto (\omega_1, \omega_2)$ of sample points to outcomes.

(c) Experiment: Choose a point between 0 and 1 uniformly at random. Take
$$\Omega = [0,1], \qquad \mathcal{F} = \mathcal{B}[0,1],$$
$\omega$ signifying the point chosen. In this case, we obviously take $\mathbf{P} = \mathrm{Leb}$. The sense in which this model contains model (b) for the case of a fair coin will be explained later.
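Example (a) is small enough to enumerate in full. The Python sketch below is only an illustration; the variable names are ours. It lists the sample space for two tosses, forms the family of all subsets, and picks out the event 'at least one head'.

```python
from itertools import combinations, product

# Sample space for two tosses: HH, HT, TH, TT
omega = ["".join(t) for t in product("HT", repeat=2)]

def power_set(points):
    """All subsets of a finite list of points, as frozensets."""
    return [frozenset(c)
            for r in range(len(points) + 1)
            for c in combinations(points, r)]

events = power_set(omega)   # the sigma-algebra of all subsets of omega
at_least_one_head = frozenset(w for w in omega if "H" in w)
```

The family `events` has $2^4 = 16$ members, and `at_least_one_head` is the subset $\{HH, HT, TH\}$ named in the text.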
2.4. Almost surely (a.s.)

▶ A statement $\mathcal{S}$ about outcomes is said to be true almost surely (a.s.), or with probability 1 (w.p.1), if
$$F := \{\omega : \mathcal{S}(\omega) \text{ is true}\} \in \mathcal{F} \quad \text{and} \quad \mathbf{P}(F) = 1.$$

(a) Proposition. If $F_n \in \mathcal{F}$ $(n \in \mathbb{N})$ and $\mathbf{P}(F_n) = 1$, $\forall n$, then
$$\mathbf{P}\Big(\bigcap F_n\Big) = 1.$$
Proof. $\mathbf{P}(F_n^c) = 0$, $\forall n$, so, by Lemma 1.10(c), $\mathbf{P}(\bigcup_n F_n^c) = 0$. But $\bigcap F_n = (\bigcup F_n^c)^c$. $\square$

(b) Something to think about. Some distinguished philosophers have tried to develop probability without measure theory. One of the reasons for difficulty is the following.
When the discussion (2.3,b) is extended to define the appropriate probability measure for fair coin tossing, the Strong Law of Large Numbers (SLLN) states that $F \in \mathcal{F}$ and $\mathbf{P}(F) = 1$, where $F$, the truth set of the statement 'proportion of heads in $n$ tosses $\to \tfrac{1}{2}$', is defined formally in (2.3,b).
Let $A$ be the set of all maps $\alpha : \mathbb{N} \to \mathbb{N}$ such that $\alpha(1) < \alpha(2) < \ldots$. For $\alpha \in A$, let
$$F_\alpha = \Big\{\omega : \frac{\#(k \le n : \omega_{\alpha(k)} = H)}{n} \to \tfrac{1}{2}\Big\},$$
the 'truth set of the Strong Law for the subsequence $\alpha$'. Then, of course, we have $\mathbf{P}(F_\alpha) = 1$, $\forall \alpha \in A$.

Exercise. Prove that
$$\bigcap_{\alpha \in A} F_\alpha = \emptyset.$$
(Hint. For any given $\omega$, find an $\alpha$ ... .)

The moral is that the concept of 'almost surely' gives us (i) absolute precision, but also (ii) enough flexibility to avoid the self-contradictions into which those innocent of measure theory too easily fall. (Of course, since philosophers are pompous where we are precise, they are thought to think deeply ... .)

2.5. Reminder: lim sup, lim inf, $\downarrow$ lim, etc.

(a) Let $(x_n : n \in \mathbb{N})$ be a sequence of real numbers. We define
$$\limsup x_n := \inf_m \Big\{\sup_{n \ge m} x_n\Big\} = \downarrow\lim_m \Big\{\sup_{n \ge m} x_n\Big\} \in [-\infty, \infty].$$
Obviously, $y_m := \sup_{n \ge m} x_n$ is monotone non-increasing in $m$, so that the limit of the sequence $y_m$ exists in $[-\infty, \infty]$. The use of $\uparrow\lim$ or $\downarrow\lim$ to signify monotone limits will be handy, as will $y_n \downarrow y_\infty$ to signify $y_\infty = \downarrow\lim y_n$.

(b) Analogously,
$$\liminf x_n := \sup_m \Big\{\inf_{n \ge m} x_n\Big\} = \uparrow\lim_m \Big\{\inf_{n \ge m} x_n\Big\} \in [-\infty, \infty].$$

(c) We have
$$x_n \text{ converges in } [-\infty, \infty] \iff \limsup x_n = \liminf x_n,$$
and then $\lim x_n = \limsup x_n = \liminf x_n$.

▶(d) Note that
(i) if $z > \limsup x_n$, then $x_n < z$ eventually (that is, for all sufficiently large $n$),
(ii) if $z < \limsup x_n$, then $x_n > z$ infinitely often (that is, for infinitely many $n$).
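2.6. Definitions. $\limsup E_n$, $(E_n, \text{i.o.})$

The event (in the rigorous formulation: the truth set of the statement)
$$\frac{\text{number of heads}}{\text{number of tosses}} \to \tfrac{1}{2}$$
is built out of simple events such as 'the $n$th toss results in heads' in a rather complicated way. We need a systematic method of handling complicated combinations of events. The idea of taking lim infs and lim sups of sets provides what is required.

It might be helpful to note the tautology that, if $E$ is an event, then
$$E = \{\omega : \omega \in E\}.$$
Suppose now that $(E_n : n \in \mathbb{N})$ is a sequence of events.

▶(a) We define
$$(E_n, \text{i.o.}) := (E_n \text{ infinitely often}) := \limsup E_n := \bigcap_m \bigcup_{n \ge m} E_n$$
$$= \{\omega : \text{for every } m,\ \exists n(\omega) \ge m \text{ such that } \omega \in E_{n(\omega)}\}$$
$$= \{\omega : \omega \in E_n \text{ for infinitely many } n\}.$$

▶(b) (Reverse Fatou Lemma - needs FINITENESS of $\mathbf{P}$)
$$\mathbf{P}(\limsup E_n) \ge \limsup \mathbf{P}(E_n).$$
Proof. Let $G_m := \bigcup_{n \ge m} E_n$. Then (look at the definition in (a)) $G_m \downarrow G$, where $G := \limsup E_n$. By result (1.10,b), $\mathbf{P}(G_m) \downarrow \mathbf{P}(G)$. But, clearly,
$$\mathbf{P}(G_m) \ge \sup_{n \ge m} \mathbf{P}(E_n).$$
Hence,
$$\mathbf{P}(G) \ge \downarrow\lim_m \Big\{\sup_{n \ge m} \mathbf{P}(E_n)\Big\} =: \limsup \mathbf{P}(E_n). \qquad \square$$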
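2.7. First Borel-Cantelli Lemma (BC1)

▶▶ Let $(E_n : n \in \mathbb{N})$ be a sequence of events such that $\sum_n \mathbf{P}(E_n) < \infty$. Then
$$\mathbf{P}(\limsup E_n) = \mathbf{P}(E_n, \text{i.o.}) = 0.$$
Proof. With the notation of (2.6,b), we have, for each $m$,
$$\mathbf{P}(G) \le \mathbf{P}(G_m) \le \sum_{n \ge m} \mathbf{P}(E_n),$$
using (1.9,b) and (1.10,a). Now let $m \uparrow \infty$. $\square$

Notes. (i) An alternative proof will be given later.
(ii) Many applications of the First Borel-Cantelli Lemma will be given within this course. Interesting applications require concepts of independence, random variables, etc.

2.8. Definitions. $\liminf E_n$, $(E_n, \text{ev})$

Again suppose that $(E_n : n \in \mathbb{N})$ is a sequence of events.

▶(a) We define
$$(E_n, \text{ev}) := (E_n \text{ eventually}) := \liminf E_n := \bigcup_m \bigcap_{n \ge m} E_n$$
$$= \{\omega : \text{for some } m(\omega),\ \omega \in E_n,\ \forall n \ge m(\omega)\}$$
$$= \{\omega : \omega \in E_n \text{ for all large } n\}.$$

(b) Note that $(E_n, \text{ev})^c = (E_n^c, \text{i.o.})$.

▶(c) (Fatou's Lemma for sets - true for ALL measure spaces)
$$\mathbf{P}(\liminf E_n) \le \liminf \mathbf{P}(E_n).$$
Exercise. Prove this in analogy with the proof of result (2.6,b), using (1.10,a) rather than (1.10,b).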
2.9. Exercise

For an event $E$, define the indicator function $I_E$ on $\Omega$ via
$$I_E(\omega) := \begin{cases} 1 & \text{if } \omega \in E, \\ 0 & \text{if } \omega \notin E. \end{cases}$$
Let $(E_n : n \in \mathbb{N})$ be a sequence of events. Prove that, for each $\omega$,
$$I_{\limsup E_n}(\omega) = \limsup I_{E_n}(\omega),$$
and establish the corresponding result for lim infs.
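Chapter 3
Random Variables

Let $(S, \Sigma)$ be a measurable space, so that $\Sigma$ is a $\sigma$-algebra on $S$.

3.1. Definitions. $\Sigma$-measurable function, $\mathrm{m}\Sigma$, $(\mathrm{m}\Sigma)^+$, $\mathrm{b}\Sigma$

Suppose that $h : S \to \mathbb{R}$. For $A \subseteq \mathbb{R}$, define
$$h^{-1}(A) := \{s \in S : h(s) \in A\}.$$
Then $h$ is called $\Sigma$-measurable if $h^{-1} : \mathcal{B} \to \Sigma$, that is, $h^{-1}(A) \in \Sigma$, $\forall A \in \mathcal{B}$.
So, here is a picture of a $\Sigma$-measurable function $h$:
$$S \xrightarrow{\ h\ } \mathbb{R}, \qquad \Sigma \xleftarrow{\ h^{-1}\ } \mathcal{B}.$$
We write $\mathrm{m}\Sigma$ for the class of $\Sigma$-measurable functions on $S$, and $(\mathrm{m}\Sigma)^+$ for the class of non-negative elements in $\mathrm{m}\Sigma$. We denote by $\mathrm{b}\Sigma$ the class of bounded $\Sigma$-measurable functions on $S$.

Note. Because lim sups of sequences even of finite-valued functions may be infinite, and for other reasons, it is convenient to extend these definitions to functions $h$ taking values in $[-\infty, \infty]$ in the obvious way: $h$ is called $\Sigma$-measurable if $h^{-1} : \mathcal{B}[-\infty, \infty] \to \Sigma$. Which of the various results stated for real-valued functions extend to functions with values in $[-\infty, \infty]$, and what these extensions are, should be obvious.

Borel function
A function $h$ from a topological space $S$ to $\mathbb{R}$ is called Borel if $h$ is $\mathcal{B}(S)$-measurable. The most important case is when $S$ itself is $\mathbb{R}$.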
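3.2. Elementary Propositions on measurability

(a) The map $h^{-1}$ preserves all set operations:
$$h^{-1}\Big(\bigcup_\alpha A_\alpha\Big) = \bigcup_\alpha h^{-1}(A_\alpha), \qquad h^{-1}(A^c) = (h^{-1}(A))^c, \quad \text{etc.}$$
Proof. This is just definition chasing. $\square$

▶(b) If $\mathcal{C} \subseteq \mathcal{B}$ and $\sigma(\mathcal{C}) = \mathcal{B}$, then $h^{-1} : \mathcal{C} \to \Sigma \ \Rightarrow\ h \in \mathrm{m}\Sigma$.
Proof. Let $\mathcal{E}$ be the class of elements $B$ in $\mathcal{B}$ such that $h^{-1}(B) \in \Sigma$. By result (a), $\mathcal{E}$ is a $\sigma$-algebra, and, by hypothesis, $\mathcal{E} \supseteq \mathcal{C}$. $\square$

(c) If $S$ is topological and $h : S \to \mathbb{R}$ is continuous, then $h$ is Borel.
Proof. Take $\mathcal{C}$ to be the class of open subsets of $\mathbb{R}$, and apply result (b). $\square$

▶(d) For any measurable space $(S, \Sigma)$, a function $h : S \to \mathbb{R}$ is $\Sigma$-measurable if
$$\{h \le c\} := \{s \in S : h(s) \le c\} \in \Sigma \quad (\forall c \in \mathbb{R}).$$
Proof. Take $\mathcal{C}$ to be the class $\pi(\mathbb{R})$ of intervals of the form $(-\infty, c]$, $c \in \mathbb{R}$, and apply result (b). $\square$

Note. Obviously, similar results apply in which $\{h \le c\}$ is replaced by $\{h > c\}$, $\{h \ge c\}$, etc.

3.3. LEMMA. Sums and products of measurable functions are measurable

▶ $\mathrm{m}\Sigma$ is an algebra over $\mathbb{R}$, that is, if $\lambda \in \mathbb{R}$ and $h, h_1, h_2 \in \mathrm{m}\Sigma$, then
$$h_1 + h_2 \in \mathrm{m}\Sigma, \qquad h_1 h_2 \in \mathrm{m}\Sigma, \qquad \lambda h \in \mathrm{m}\Sigma.$$
Example of proof. Let $c \in \mathbb{R}$. Then for $s \in S$, it is clear that $h_1(s) + h_2(s) > c$ if and only if for some rational $q$, we have
$$h_1(s) > q > c - h_2(s).$$
In other words,
$$\{h_1 + h_2 > c\} = \bigcup_{q \in \mathbb{Q}} \big(\{h_1 > q\} \cap \{h_2 > c - q\}\big),$$
a countable union of elements of $\Sigma$. $\square$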
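3.4. Composition Lemma.

If $h \in \mathrm{m}\Sigma$ and $f \in \mathrm{m}\mathcal{B}$, then $f \circ h \in \mathrm{m}\Sigma$.
Proof. Draw the picture:
$$S \xrightarrow{\ h\ } \mathbb{R} \xrightarrow{\ f\ } \mathbb{R}, \qquad \Sigma \xleftarrow{\ h^{-1}\ } \mathcal{B} \xleftarrow{\ f^{-1}\ } \mathcal{B}. \qquad \square$$

Note. There are obvious generalizations based on the definition (important in more advanced theory): if $(S_1, \Sigma_1)$ and $(S_2, \Sigma_2)$ are measurable spaces and $h : S_1 \to S_2$, then $h$ is called $\Sigma_1/\Sigma_2$-measurable if $h^{-1} : \Sigma_2 \to \Sigma_1$. From this point of view, what we have called $\Sigma$-measurable should read $\Sigma/\mathcal{B}$-measurable (or perhaps $\Sigma/\mathcal{B}[-\infty, \infty]$-measurable).

3.5. LEMMA on measurability of infs, lim infs of functions

▶▶ Let $(h_n : n \in \mathbb{N})$ be a sequence of elements of $\mathrm{m}\Sigma$. Then
(i) $\inf h_n$, (ii) $\liminf h_n$, (iii) $\limsup h_n$
are $\Sigma$-measurable (into $([-\infty, \infty], \mathcal{B}[-\infty, \infty])$, but we shall still write $\inf h_n \in \mathrm{m}\Sigma$ (for example)). Further,
(iv) $\{s : \lim h_n(s) \text{ exists in } \mathbb{R}\} \in \Sigma$.

Proof. (i) $\{\inf h_n \ge c\} = \bigcap_n \{h_n \ge c\}$.
(ii) Let $L_n(s) := \inf\{h_r(s) : r \ge n\}$. Then $L_n \in \mathrm{m}\Sigma$, by part (i). But $L(s) := \liminf h_n(s) = \uparrow\lim L_n(s) = \sup_n L_n(s)$, and $\{L \le c\} = \bigcap_n \{L_n \le c\} \in \Sigma$.
(iii) This part is now obvious.
(iv) This is also clear because the set on which $\lim h_n$ exists in $\mathbb{R}$ is
$$\{\limsup h_n < \infty\} \cap \{\liminf h_n > -\infty\} \cap g^{-1}(\{0\}),$$
where $g := \limsup h_n - \liminf h_n$. $\square$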
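3.6. Definition. Random variable

▶ Let $(\Omega, \mathcal{F})$ be our (sample space, family of events). A random variable is an element of $\mathrm{m}\mathcal{F}$. Thus,
$$X : \Omega \to \mathbb{R}, \qquad X^{-1} : \mathcal{B} \to \mathcal{F}.$$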
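3.7. Example. Coin tossing

Let $\Omega = \{H, T\}^{\mathbb{N}}$, $\omega = (\omega_1, \omega_2, \ldots)$, $\omega_n \in \{H, T\}$. As in (2.3,b), we define
$$\mathcal{F} = \sigma(\{\omega : \omega_n = W\} : n \in \mathbb{N},\ W \in \{H, T\}).$$
Let
$$X_n(\omega) := \begin{cases} 1 & \text{if } \omega_n = H, \\ 0 & \text{if } \omega_n = T. \end{cases}$$
The definition of $\mathcal{F}$ guarantees that each $X_n$ is a random variable. By Lemma 3.3,
$$S_n := X_1 + X_2 + \ldots + X_n = \text{number of heads in } n \text{ tosses}$$
is a random variable. Next, for $p \in [0,1]$, we have
$$\Lambda := \Big\{\omega : \frac{\text{number of heads}}{\text{number of tosses}} \to p\Big\} = \{\omega : L^+(\omega) = p\} \cap \{\omega : L^-(\omega) = p\},$$
where $L^+ := \limsup n^{-1} S_n$ and $L^-$ is the corresponding lim inf. By Lemma 3.5, $\Lambda \in \mathcal{F}$.

▶▶ Thus, we have taken an important step towards the Strong Law: the statement is meaningful. It remains only to prove that it is true.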
3.8. Definition. $\sigma$-algebra generated by a collection of functions on $\Omega$

This is an important idea, discussed further in Section 3.14. (Compare the weakest topology which makes every function in a given family continuous, etc.)

In Example 3.7, we have a given set $\Omega$, a family $(X_n : n \in \mathbb{N})$ of maps $X_n : \Omega \to \mathbb{R}$. The best way to think of the $\sigma$-algebra $\mathcal{F}$ in that example is as
$$\mathcal{F} = \sigma(X_n : n \in \mathbb{N})$$
in the sense now to be described.

▶▶ Generally, if we have a collection $(Y_\gamma : \gamma \in C)$ of maps $Y_\gamma : \Omega \to \mathbb{R}$, then
$$\mathcal{Y} := \sigma(Y_\gamma : \gamma \in C)$$
is defined to be the smallest $\sigma$-algebra $\mathcal{Y}$ on $\Omega$ such that each map $Y_\gamma$ $(\gamma \in C)$ is $\mathcal{Y}$-measurable. Clearly,
$$\sigma(Y_\gamma : \gamma \in C) = \sigma(\{\omega \in \Omega : Y_\gamma(\omega) \in B\} : \gamma \in C,\ B \in \mathcal{B}).$$
If $X$ is a random variable for some $(\Omega, \mathcal{F})$, then, of course, $\sigma(X) \subseteq \mathcal{F}$.

Remarks. (i) The idea introduced in this section is something which you will pick up gradually as you work through the course. Don't worry about it now; think about it, yes!
(ii) Normally, $\pi$-systems come to our aid. For example, if $(X_n : n \in \mathbb{N})$ is a collection of functions on $\Omega$, and $\mathcal{X}_n$ denotes $\sigma(X_k : k \le n)$, then the union $\bigcup \mathcal{X}_n$ is a $\pi$-system (indeed, an algebra) which generates $\sigma(X_n : n \in \mathbb{N})$.
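3.9. Definitions. Law, distribution function

Suppose that $X$ is a random variable carried by some probability triple $(\Omega, \mathcal{F}, \mathbf{P})$. We have
$$\Omega \xrightarrow{\ X\ } \mathbb{R}, \qquad [0,1] \xleftarrow{\ \mathbf{P}\ } \mathcal{F} \xleftarrow{\ X^{-1}\ } \mathcal{B},$$
or indeed $[0,1] \xleftarrow{\ \mathbf{P}\ } \sigma(X) \xleftarrow{\ X^{-1}\ } \mathcal{B}$. Define the law $\mathcal{L}_X$ of $X$ by
$$\mathcal{L}_X := \mathbf{P} \circ X^{-1}, \qquad \mathcal{L}_X : \mathcal{B} \to [0,1].$$
Then (Exercise!) $\mathcal{L}_X$ is a probability measure on $(\mathbb{R}, \mathcal{B})$. Since $\pi(\mathbb{R}) = \{(-\infty, c] : c \in \mathbb{R}\}$ is a $\pi$-system which generates $\mathcal{B}$, Uniqueness Lemma 1.6 shows that $\mathcal{L}_X$ is determined by the function $F_X : \mathbb{R} \to [0,1]$ defined as follows:
$$F_X(c) := \mathcal{L}_X(-\infty, c] = \mathbf{P}(X \le c) = \mathbf{P}\{\omega : X(\omega) \le c\}.$$
The function $F_X$ is called the distribution function of $X$.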
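For a finite model the law and the distribution function can be computed directly from the definitions. The Python sketch below is an illustration under assumptions of ours: the coin is taken to be fair, so each of the four outcomes of two tosses has probability $\tfrac14$, and $X$ is the number of heads. The names `law` and `F_X` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

omega = ["".join(t) for t in product("HT", repeat=2)]
P = {w: Fraction(1, 4) for w in omega}    # assumption: fair coin

def X(w):
    """Random variable: number of heads in the outcome w."""
    return w.count("H")

def law(B):
    """L_X(B) = P(X^{-1}(B)) for a finite set B of real numbers."""
    return sum((P[w] for w in omega if X(w) in B), Fraction(0))

def F_X(c):
    """Distribution function F_X(c) = P(X <= c)."""
    return sum((P[w] for w in omega if X(w) <= c), Fraction(0))
```

Here $F_X$ is the step function with values $0, \tfrac14, \tfrac34, 1$, jumping at $0$, $1$ and $2$; it is constant between jumps and takes the upper value at each jump, which is the right-continuity discussed next.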
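3.10. Properties of distribution functions

Suppose that $F$ is the distribution function $F = F_X$ of some random variable $X$. Then
(a) $F : \mathbb{R} \to [0,1]$, $F \uparrow$ (that is, $x \le y \Rightarrow F(x) \le F(y)$),
(b) $\lim_{x \to \infty} F(x) = 1$, $\lim_{x \to -\infty} F(x) = 0$,
(c) $F$ is right-continuous.

Proof of (c). By using Lemma (1.10,b), we see that
$$\mathbf{P}(X \le x + n^{-1}) \downarrow \mathbf{P}(X \le x),$$
and this fact together with the monotonicity of $F_X$ shows that $F_X$ is right-continuous.
Exercise! Clear up any loose ends.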
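3.11. Existence of random variable with given distribution function

▶ If $F$ has the properties (a,b,c) in Section 3.10, then, by analogy with Section 1.8 on the existence of Lebesgue measure, we can construct a unique probability measure $\mathcal{L}$ on $(\mathbb{R}, \mathcal{B})$ such that
$$\mathcal{L}(-\infty, x] = F(x), \quad \forall x.$$
Take $(\Omega, \mathcal{F}, \mathbf{P}) = (\mathbb{R}, \mathcal{B}, \mathcal{L})$, $X(\omega) = \omega$. Then it is tautological that
$$F_X(x) = F(x), \quad \forall x.$$
Note. The measure $\mathcal{L}$ just described is called the Lebesgue-Stieltjes measure associated with $F$. Its existence is proved in the next section.

3.12. Skorokhod representation of a random variable with prescribed distribution function

Again let $F : \mathbb{R} \to [0,1]$ have properties (3.10,a,b,c). We can construct a random variable with distribution function $F$ carried by
$$(\Omega, \mathcal{F}, \mathbf{P}) = ([0,1], \mathcal{B}[0,1], \mathrm{Leb})$$
as follows. Define (the right-hand equalities, which you can prove, are there for clarification only)
(a1) $X^+(\omega) := \inf\{z : F(z) > \omega\} = \sup\{y : F(y) \le \omega\}$,
(a2) $X^-(\omega) := \inf\{z : F(z) \ge \omega\} = \sup\{y : F(y) < \omega\}$.
The following picture shows cases to watch out for.

[Figure: graph of $F$, with labels $F(x)$, $\omega$, $X^{\pm}(\omega)$, $X^-(F(x))$, $X^+(F(x))$.]

By definition of $X^-$,
$$(\omega \le F(c)) \ \Rightarrow\ (X^-(\omega) \le c).$$
Now,
$$(z > X^-(\omega)) \ \Rightarrow\ (F(z) \ge \omega),$$
so, by the right-continuity of $F$, $F(X^-(\omega)) \ge \omega$, and
$$(X^-(\omega) \le c) \ \Rightarrow\ (\omega \le F(X^-(\omega)) \le F(c)).$$
Thus, $(\omega \le F(c)) \iff (X^-(\omega) \le c)$, so that
$$\mathbf{P}(X^- \le c) = F(c).$$

(b) The variable $X^-$ therefore has distribution function $F$, and the measure $\mathcal{L}$ in Section 3.11 is just the law of $X^-$.

It will be important later to know that
(c) $X^+$ also has distribution function $F$, and that, indeed, $\mathbf{P}(X^+ = X^-) = 1$.

Proof of (c). By definition of $X^+$,
$$(\omega < F(c)) \ \Rightarrow\ (X^+(\omega) \le c),$$
so that $F(c) \le \mathbf{P}(X^+ \le c)$. Since $X^- \le X^+$, it is clear that
$$\{X^- \ne X^+\} = \bigcup_{c \in \mathbb{Q}} \{X^- \le c < X^+\}.$$
But, for every $c \in \mathbb{R}$,
$$\mathbf{P}(X^- \le c < X^+) = \mathbf{P}(\{X^- \le c\} \setminus \{X^+ \le c\}) \le F(c) - F(c) = 0.$$
Since $\mathbb{Q}$ is countable, the result follows. $\square$

Remark. It is in fact true that every experiment you will meet in this (or any other) course can be modelled via the triple $([0,1], \mathcal{B}[0,1], \mathrm{Leb})$. (You will start to be convinced of this by the end of the next chapter.) However, this observation normally has only curiosity value.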
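The map $\omega \mapsto X^-(\omega)$ is what is often called inverse-transform sampling, and for a discrete $F$ it can be checked exactly. The Python sketch below is an illustration under our own assumptions: $F$ is the distribution function with atoms at 0, 1, 2 of masses $\tfrac14, \tfrac12, \tfrac14$, and the helper `leb_X_minus_le` computes $\mathrm{Leb}\{\omega \in (0,1] : X^-(\omega) \le c\}$ on a grid whose cells have the jump levels of $F$ as endpoints, so that $X^-$ is constant on each cell.

```python
from fractions import Fraction

# Assumed example: atoms (value, mass), sorted by value
atoms = [(0, Fraction(1, 4)), (1, Fraction(1, 2)), (2, Fraction(1, 4))]

def F(z):
    """Distribution function of the discrete law above."""
    return sum((p for x, p in atoms if x <= z), Fraction(0))

def X_minus(w):
    """X^-(w) = inf{z : F(z) >= w} for 0 < w <= 1 (attained at an atom here)."""
    for x, _ in atoms:
        if F(x) >= w:
            return x
    raise ValueError("w must lie in (0, 1]")

def leb_X_minus_le(c, grid=100):
    """Leb{w in (0,1] : X^-(w) <= c}, exact because X^- is constant on each
    cell ((k-1)/grid, k/grid] when grid is a multiple of 4."""
    hits = sum(1 for k in range(1, grid + 1) if X_minus(Fraction(k, grid)) <= c)
    return Fraction(hits, grid)
```

For each $c$ tried, `leb_X_minus_le(c)` equals `F(c)`, which is the identity $\mathbf{P}(X^- \le c) = F(c)$ in this small case.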
3.13. Generated $\sigma$-algebras - a discussion

Suppose that $(\Omega, \mathcal{F}, \mathbf{P})$ is a model for some experiment, and that the experiment has been performed, so that (see Section 2.2) Tyche has made her choice of $\omega$. Let $(Y_\gamma : \gamma \in C)$ be a collection of random variables associated with our experiment, and suppose that someone reports to you the following information about the chosen point $\omega$:
(*) the values $Y_\gamma(\omega)$, that is, the observed values of the random variables $Y_\gamma$ $(\gamma \in C)$.
Then the intuitive significance of the $\sigma$-algebra $\mathcal{Y} := \sigma(Y_\gamma : \gamma \in C)$ is that it consists precisely of those events $F$ for which, for each and every $\omega$, you can decide whether or not $F$ has occurred (that is, whether or not $\omega \in F$) on the basis of the information (*); the information (*) is precisely equivalent to the following information:
(**) the values $I_F(\omega)$ $(F \in \mathcal{Y})$.

(a) Exercise. Prove that the $\sigma$-algebra $\sigma(Y)$ generated by a single random variable $Y$ is given by
$$\sigma(Y) = Y^{-1}(\mathcal{B}) := (\{\omega : Y(\omega) \in B\} : B \in \mathcal{B}),$$
and that $\sigma(Y)$ is generated by the $\pi$-system
$$\pi(Y) := (\{\omega : Y(\omega) \le x\} : x \in \mathbb{R}) = Y^{-1}(\pi(\mathbb{R})). \qquad \square$$

The following results might help clarify things. Good advice: stop reading this section after (c)! Results (b) and (c) are proved in the appendix to this chapter.

(b) If $Y : \Omega \to \mathbb{R}$, then $Z : \Omega \to \mathbb{R}$ is $\sigma(Y)$-measurable if and only if there exists a Borel function $f : \mathbb{R} \to \mathbb{R}$ such that $Z = f(Y)$.

(c) If $Y_1, Y_2, \ldots, Y_n$ are functions from $\Omega$ to $\mathbb{R}$, then a function $Z : \Omega \to \mathbb{R}$ is $\sigma(Y_1, Y_2, \ldots, Y_n)$-measurable if and only if there exists a Borel function $f$ on $\mathbb{R}^n$ such that $Z = f(Y_1, Y_2, \ldots, Y_n)$. We shall see in the appendix that the more correct measurability condition on $f$ is that $f$ be '$\mathcal{B}^n$-measurable'.

(d) If $(Y_\gamma : \gamma \in C)$ is a collection (parametrized by the infinite set $C$) of functions from $\Omega$ to $\mathbb{R}$, then $Z : \Omega \to \mathbb{R}$ is $\sigma(Y_\gamma : \gamma \in C)$-measurable if and only if there exists a countable sequence $(\gamma_i : i \in \mathbb{N})$ of elements of $C$ and a Borel function $f$ on $\mathbb{R}^{\mathbb{N}}$ such that $Z = f(Y_{\gamma_1}, Y_{\gamma_2}, \ldots)$.

Warning - for the over-enthusiastic only. For uncountable $C$, $\mathcal{B}(\mathbb{R}^C)$ is much larger than the $C$-fold product measure space $\prod_{\gamma \in C} \mathcal{B}(\mathbb{R})$. It is the latter rather than the former which gives the appropriate type of $f$ in (d).
3.14. The Monotone-Class Theorem

In the same way that Uniqueness Lemma 1.6 allows us to deduce results about $\sigma$-algebras from results about $\pi$-systems, the following 'elementary' version of the Monotone-Class Theorem allows us to deduce results about general measurable functions from results about indicators of elements of $\pi$-systems. Generally, we shall not use the theorem in the main text, preferring 'just to use bare hands'. However, for product measure in Chapter 8, it becomes indispensable.

THEOREM.
▶▶ Let $\mathcal{H}$ be a class of bounded functions from a set $S$ into $\mathbb{R}$ satisfying the following conditions:
(i) $\mathcal{H}$ is a vector space over $\mathbb{R}$;
(ii) the constant function 1 is an element of $\mathcal{H}$;
(iii) if $(f_n)$ is a sequence of non-negative functions in $\mathcal{H}$ such that $f_n \uparrow f$ where $f$ is a bounded function on $S$, then $f \in \mathcal{H}$.
Then if $\mathcal{H}$ contains the indicator function of every set in some $\pi$-system $\mathcal{I}$, then $\mathcal{H}$ contains every bounded $\sigma(\mathcal{I})$-measurable function on $S$.

For proof, see the appendix to this chapter.
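Chapter 4
Independence

Let $(\Omega, \mathcal{F}, \mathbf{P})$ be a probability triple.

4.1. Definitions of independence

Note. We focus attention on the $\sigma$-algebra formulation (and describe the more familiar forms of independence in terms of it) to acclimatize ourselves to thinking of $\sigma$-algebras as the natural means of summarizing information. Section 4.2 shows that the fancy $\sigma$-algebra definitions agree with the ones from elementary courses.

Independent $\sigma$-algebras
▶ Sub-$\sigma$-algebras $\mathcal{G}_1, \mathcal{G}_2, \ldots$ of $\mathcal{F}$ are called independent if, whenever $G_i \in \mathcal{G}_i$ $(i \in \mathbb{N})$ and $i_1, \ldots, i_n$ are distinct, then
$$\mathbf{P}(G_{i_1} \cap \ldots \cap G_{i_n}) = \prod_{k=1}^{n} \mathbf{P}(G_{i_k}).$$

Independent random variables
▶ Random variables $X_1, X_2, \ldots$ are called independent if the $\sigma$-algebras
$$\sigma(X_1), \sigma(X_2), \ldots$$
are independent.

Independent events
▶ Events $E_1, E_2, \ldots$ are called independent if the $\sigma$-algebras $\mathcal{E}_1, \mathcal{E}_2, \ldots$ are independent, where $\mathcal{E}_n$ is the $\sigma$-algebra $\{\emptyset, E_n, \Omega \setminus E_n, \Omega\}$.
Since $\mathcal{E}_n = \sigma(I_{E_n})$, it follows that events $E_1, E_2, \ldots$ are independent if and only if the random variables $I_{E_1}, I_{E_2}, \ldots$ are independent.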
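The $\sigma$-algebra definition of independent events can be tested mechanically on a finite model. The Python sketch below is illustrative only and rests on assumptions of ours: two tosses of a fair coin with the uniform probability on four outcomes, and hypothetical helper names. It checks the product rule for every pair of sets drawn from the four-element $\sigma$-algebras $\{\emptyset, E, \Omega \setminus E, \Omega\}$.

```python
from fractions import Fraction
from itertools import product

omega = frozenset("".join(t) for t in product("HT", repeat=2))

def P(A):
    """Uniform probability on the four outcomes (assumption: fair coin)."""
    return Fraction(len(A), len(omega))

def sigma_of_event(E):
    """The sigma-algebra {empty set, E, complement of E, omega} generated by E."""
    return [frozenset(), E, omega - E, omega]

def independent(E1, E2):
    """Check P(G1 n G2) = P(G1) P(G2) for all G1, G2 in the two sigma-algebras."""
    return all(P(G1 & G2) == P(G1) * P(G2)
               for G1 in sigma_of_event(E1)
               for G2 in sigma_of_event(E2))

first_head = frozenset(w for w in omega if w[0] == "H")
second_head = frozenset(w for w in omega if w[1] == "H")
some_head = frozenset(w for w in omega if "H" in w)
```

The events 'first toss is a head' and 'second toss is a head' pass the test, while 'first toss is a head' and 'at least one head' fail it, since $\tfrac12 \ne \tfrac12 \cdot \tfrac34$.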
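4.2. The $\pi$-system Lemma; and the more familiar definitions

We know from elementary theory that events $E_1, E_2, \ldots$ are independent if and only if whenever $n \in \mathbb{N}$ and $i_1, \ldots, i_n$ are distinct, then
$$\mathbf{P}(E_{i_1} \cap \ldots \cap E_{i_n}) = \prod_{k=1}^{n} \mathbf{P}(E_{i_k}),$$
corresponding results involving complements of the $E_j$, etc., being consequences of this.

We now use the Uniqueness Lemma 1.6 to obtain a significant generalization of this idea, allowing us to study independence via (manageable) $\pi$-systems rather than (awkward) $\sigma$-algebras. Let us concentrate on the case of two $\sigma$-algebras.

▶▶(a) LEMMA. Suppose that $\mathcal{G}$ and $\mathcal{H}$ are sub-$\sigma$-algebras of $\mathcal{F}$, and that $\mathcal{I}$ and $\mathcal{J}$ are $\pi$-systems with
$$\sigma(\mathcal{I}) = \mathcal{G}, \qquad \sigma(\mathcal{J}) = \mathcal{H}.$$
Then $\mathcal{G}$ and $\mathcal{H}$ are independent if and only if $\mathcal{I}$ and $\mathcal{J}$ are independent in that
$$\mathbf{P}(I \cap J) = \mathbf{P}(I)\mathbf{P}(J), \qquad I \in \mathcal{I},\ J \in \mathcal{J}.$$
Proof. Suppose that $\mathcal{I}$ and $\mathcal{J}$ are independent. For fixed $I$ in $\mathcal{I}$, the measures (check that they are measures!)
$$H \mapsto \mathbf{P}(I \cap H) \quad \text{and} \quad H \mapsto \mathbf{P}(I)\mathbf{P}(H)$$
on $(\Omega, \mathcal{H})$ have the same total mass $\mathbf{P}(I)$, and agree on $\mathcal{J}$. By Lemma 1.6, they therefore agree on $\sigma(\mathcal{J}) = \mathcal{H}$. Hence,
$$\mathbf{P}(I \cap H) = \mathbf{P}(I)\mathbf{P}(H), \qquad I \in \mathcal{I},\ H \in \mathcal{H}.$$
Thus, for fixed $H$ in $\mathcal{H}$, the measures
$$G \mapsto \mathbf{P}(G \cap H) \quad \text{and} \quad G \mapsto \mathbf{P}(G)\mathbf{P}(H)$$
on $(\Omega, \mathcal{G})$ have the same total mass $\mathbf{P}(H)$ and agree on $\mathcal{I}$. They therefore agree on $\sigma(\mathcal{I}) = \mathcal{G}$; and this is what we set out to prove. $\square$

Suppose now that $X$ and $Y$ are two random variables on $(\Omega, \mathcal{F}, \mathbf{P})$ such that, whenever $x, y \in \mathbb{R}$,
(b) $\mathbf{P}(X \le x;\ Y \le y) = \mathbf{P}(X \le x)\mathbf{P}(Y \le y)$.
Now, (b) says that the $\pi$-systems $\pi(X)$ and $\pi(Y)$ (see Section 3.13) are independent. Hence $\sigma(X)$ and $\sigma(Y)$ are independent: that is, $X$ and $Y$ are independent in the sense of Definition 4.1.

In the same way, we can prove that random variables $X_1, X_2, \ldots, X_n$ are independent if and only if
$$\mathbf{P}(X_k \le x_k : 1 \le k \le n) = \prod_{k=1}^{n} \mathbf{P}(X_k \le x_k),$$
and all the familiar things from elementary theory.

Command: Do Exercise E4.1 now.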
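4.3. Second Borel-Cantelli Lemma (BC2)

▶▶ If $(E_n : n \in \mathbb{N})$ is a sequence of independent events, then
$$\sum \mathbf{P}(E_n) = \infty \ \Rightarrow\ \mathbf{P}(E_n, \text{i.o.}) = \mathbf{P}(\limsup E_n) = 1.$$
Proof. First, we have
$$(\limsup E_n)^c = \liminf E_n^c = \bigcup_m \bigcap_{n \ge m} E_n^c.$$
With $p_n$ denoting $\mathbf{P}(E_n)$, we have
$$\mathbf{P}\Big(\bigcap_{n \ge m} E_n^c\Big) = \prod_{n \ge m} (1 - p_n),$$
this equation being true if the condition $\{n \ge m\}$ is replaced by the condition $\{r \ge n \ge m\}$, because of independence, and the limit as $r \uparrow \infty$ being justified by the monotonicity of the two sides.
For $x \ge 0$, $1 - x \le \exp(-x)$, so that, since $\sum p_n = \infty$,
$$\prod_{n \ge m} (1 - p_n) \le \exp\Big(-\sum_{n \ge m} p_n\Big) = 0.$$
So, $\mathbf{P}[(\limsup E_n)^c] = 0$. $\square$

Exercise. Prove that if $0 \le p_n < 1$ and $S := \sum p_n < \infty$, then $\prod (1 - p_n) > 0$. Hint. First show that if $S < 1$, then $\prod (1 - p_n) \ge 1 - S$.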
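4.4. Example

Let $(X_n : n \in \mathbb{N})$ be a sequence of independent random variables, each exponentially distributed with rate 1:
$$\mathbf{P}(X_n > x) = e^{-x}, \qquad x \ge 0.$$
Then, for $\alpha > 0$,
$$\mathbf{P}(X_n > \alpha \log n) = n^{-\alpha},$$
so that, using (BC1) and (BC2),
(a0) $\mathbf{P}(X_n > \alpha \log n \text{ for infinitely many } n) = \begin{cases} 0 & \text{if } \alpha > 1, \\ 1 & \text{if } \alpha \le 1. \end{cases}$

Now let $L := \limsup (X_n / \log n)$. Then
$$\mathbf{P}(L \ge 1) \ge \mathbf{P}(X_n > \log n, \text{ i.o.}) = 1,$$
and, for $k \in \mathbb{N}$,
$$\mathbf{P}(L > 1 + 2k^{-1}) \le \mathbf{P}(X_n > (1 + k^{-1}) \log n, \text{ i.o.}) = 0.$$
Thus, $\{L > 1\} = \bigcup_k \{L > 1 + 2k^{-1}\}$ is $\mathbf{P}$-null, and hence
$$L = 1 \text{ almost surely.}$$

Something to think about
In the same way, we can prove the finer result
(a1) $\mathbf{P}(X_n > \log n + \alpha \log\log n, \text{ i.o.}) = \begin{cases} 0 & \text{if } \alpha > 1, \\ 1 & \text{if } \alpha \le 1, \end{cases}$
or, even finer,
(a2) $\mathbf{P}(X_n > \log n + \log\log n + \alpha \log\log\log n, \text{ i.o.}) = \begin{cases} 0 & \text{if } \alpha > 1, \\ 1 & \text{if } \alpha \le 1; \end{cases}$
or etc. By combining in an appropriate sequence (think about this!) of statements (a0), (a1), (a2), ... with the statement that the union of a countable number of null sets is null while the intersection of a sequence of probability-1 sets has probability 1, we can obviously make remarkably precise statements about the size of the big elements in the sequence $(X_n)$.

I have included in the appendix the statement of a truly fantastic theorem about precise description of long-term behaviour: Strassen's Law.

A number of exercises in Chapter E are now accessible to you.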
4.5. A fundamental question for modelling

Can we construct a sequence $(X_n : n \in \mathbb{N})$ of independent random variables, $X_n$ having prescribed distribution function $F_n$? We have to be able to answer Yes to this question - for example, to be able to construct a rigorous model for the branching-process model of Chapter 0, or indeed for Example 4.4 to make sense. A Yes answer to our question is all that is needed for a rigorous branching-process model.

The trick answer based on the existence of Lebesgue measure given in the next section does settle the question. A more satisfying answer is provided by the theory of product measure, a topic deferred to Chapter 8.
4.6. A coin-tossing model with applications

Let $(\Omega, \mathcal{F}, \mathbf{P})$ be $([0,1], \mathcal{B}[0,1], \mathrm{Leb})$. For $\omega \in \Omega$, expand $\omega$ in binary:
$$\omega = 0.\omega_1 \omega_2 \ldots$$
(The existence of two different expansions of a dyadic rational is not going to cause any problems because the set $D$ (say) of dyadic rationals in $[0,1]$ has Lebesgue measure 0 - it is a countable set!) As an Exercise, you can prove that the sequence $(\xi_n : n \in \mathbb{N})$, where
$$\xi_n(\omega) := \omega_n,$$
is a sequence of independent variables each taking the values 0 or 1 with probability $\tfrac12$ for either possibility. Clearly, $(\xi_n : n \in \mathbb{N})$ provides a model for coin tossing.

Now define
$$Y_1(\omega) := 0.\omega_1 \omega_3 \omega_6 \ldots,$$
$$Y_2(\omega) := 0.\omega_2 \omega_5 \omega_9 \ldots,$$
$$Y_3(\omega) := 0.\omega_4 \omega_8 \omega_{13} \ldots,$$
and so on. We now need a bit of common sense. Since the sequence
$$\omega_1, \omega_3, \omega_6, \ldots$$
has the same 'coin-tossing' properties as the full sequence $(\omega_n : n \in \mathbb{N})$, it is clear that $Y_1$ has the uniform distribution on $[0,1]$; and similarly for the other $Y$'s.
Since the sequences $(1,3,6,\ldots)$, $(2,5,9,\ldots)$, ... which give rise to $Y_1, Y_2, \ldots$ are disjoint, and therefore correspond to different sets of tosses of our 'coin', it is intuitively obvious that $Y_1, Y_2, \ldots$ are independent random variables, each uniformly distributed on $[0,1]$.

Now suppose that a sequence $(F_n : n \in \mathbb{N})$ of distribution functions is given. By the Skorokhod representation of Section 3.12, we can find functions $g_n$ on $[0,1]$ such that
$$X_n := g_n(Y_n) \text{ has distribution function } F_n.$$
But because the $Y$-variables are independent, the same is obviously true of the $X$-variables.

▶ We have therefore succeeded in constructing a family $(X_n : n \in \mathbb{N})$ of independent random variables with prescribed distribution functions.

Exercise. Satisfy yourself that you could if forced carry through these arguments rigorously. Obviously, this is again largely a case of utilizing the Uniqueness Lemma 1.6 in much the same way as we did in Section 4.2.

4.7. Notation: IID RVs

Many of the most important problems in probability concern sequences of random variables (RVs) which are independent and identically distributed (IID). Thus, if $(X_n)$ is a sequence of IID variables, then the $X_n$ are independent and all have the same distribution function $F$ (say):
$$\mathbf{P}(X_n \le x) = F(x), \quad \forall n, \forall x.$$
Of course, we now know that for any given distribution function $F$, we can construct a triple $(\Omega, \mathcal{F}, \mathbf{P})$ carrying a sequence of IID RVs with common distribution function $F$. In particular, we can construct a rigorous model for our branching process.
Stochastic
process
\342\226\272A stochastic
a set
C is
a collection
F = (K, : 7
of random variables about existence of a
G C)
is (to all intents and purposes) settledby theorem, which is just beyond the scopeof
stochastic processwith
P).
the this
The
fundamental
question
prescribed
joint
distributions
famous course.
Daniell-Kolmogorov
^^
will Our concern
Chapter
be
Z\"^.
4' I'^f^^P^f^dence
with
think
7i \302\273-*
(4-^)\"
: n Z\"*\") (X\342\200\236 \342\202\254 of
mainly
We
processes
of Xn
time n. For u;
corresponding
A
(or parametrized)
by
G fi,
X = as the value
indexed
the
process
X at
of
the
to the
important
map
Xn(u;)
is called
the sample
path
very
of a
stochastic process is
{pij
provided
by
Markov
chain.
a finite
or countable
for
X E
matrix, so that
i,j
ij
e E)
he
sl
stochastic
Pii > 0,
Let // be
on
l.
^- := //({f}),{i
Z\"*\")
a probability
G
measureon E, so that
a time-homogeneous
fi
fi
is specified
by the
values
: ri
E).
By
such
Markov chain
transition
Z\"*\"
Z \342\200\224 {Zn
, in
with initial
distribution
that,
and
1-step
m,atrix P
io, M,...
is meant
gE.
a stochastic process Z
(a)
whenever
=
n G
and
P(Zo =
iQ\\Zi
2i;...;Z\342\200\236
z'n)
= fJ'ioPioh
-\"Pin-iin-
such
a chain
in terms of
variables.
4.9.
the
values
at u;
of a
suitable family
chapter.
Z expressing
of
Zn{^) explicitly
random
independent
See the
appendix to this
Shakespeare
have
Monkey
typing
that
to independence
that
P(F)^
which
apply
of
the
use of
measures
in Lemma
'Easy
1.10 and
exercise'
has a lot
towards
a silly method, but one the monotonicity properties the of the Kolmogorov flavour the end of this sectionfor an
solution
to the
correctly
problem.
WS, the Collected Works of typing a on a Shakespeare, particular sequence of N symbols typing typewriter. A monkey at one unit types symbols random, per time, producing an infinite sequence {Xn) of IID RVs with values in the set of all possible We agree that symbols.
amounts
Let us agreethat
to
= x) e := inf{P(A'i
Let
: x
is a
symbol} >
0.
of WS.
WS
be
Let Hk be the
that
the monkey producesinfinitely many copies the monkey will produce at least k copiesof
in
.,(4-9)
Chapter
be let
4: Independence
that
it
4^
at least
the
will
produce
k copies by
copies
of WS
period [m
over
-f-
1, cx)).
Because
behaviour
over
[l,m]
is independent
of its
But
logic
tells
us that, for
every m, H^\"^^
HI
Hence,
P(Hm,knH)
= P{Hm^k)P{H).
{H,ri,k
H
T Hk, Hence,
and
JjT)
T (Hk
HH)
= H,
it
being
that
Hk 2
-S^-
by Lemma
1.10(a),
P{H)=P{Hk)P{H).
H,
and
so, by Lemma
1.10(b),
P(H) = PiH)PiH),
whence
P(-fir)
= 0
The
for which
Kolmogorov
0-1
have
law
P(\302\243')
we must
- and
us which
Easy
1.
a produces 0 or P(^)
events
doesnot tell
it therefore
generates a lot
interesting
problems!
to prove that P(H) \342\200\224 Lemma SecondBorel-Cantelli event that the monkey produces WS away, right > e^. Then that is, during time period [1, A^]. P(\302\243'i) exercise only Tricky types ( to which we shall return). If the monkey and is on every occasion likely to type any of the 26, capital letters, equally on average how will it take him to produce the sequence long
exercise.
Let
Use the
be
Hint,
E\\
the
'ABRACADABRA'?
The next three sections involve quite subtle topics which take time to assimilate. They are not strictly necessary for subsequent chapters. The Kolmogorov 0-1 law is used in one of our two proofs of the Strong Law for IID RVs, but by that stage a quick martingale proof of the 0-1 law will have been provided.
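Notation. Script $\mathcal{T}$ is too like script $\mathcal{I}$. Below, I use $\mathcal{K}$ instead of $\mathcal{I}$. Script X, $\mathcal{X}$, is too like Greek chi, $\chi$, too; but we have to live with that.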
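4.10. Definition. Tail σ-algebras

►► Let $X_1, X_2, \ldots$ be random variables. Define

(a) $\mathcal{T}_n := \sigma(X_{n+1}, X_{n+2}, \ldots), \quad \mathcal{T} := \bigcap_n \mathcal{T}_n.$

The σ-algebra $\mathcal{T}$ is called the tail σ-algebra of the sequence $(X_n : n \in \mathbb{N})$.

Now, $\mathcal{T}$ contains many important events: for example,

(b1) $F_1 := (\lim X_k \text{ exists}) := \{\omega : \lim_k X_k(\omega) \text{ exists}\},$
(b2) $F_2 := (\sum X_k \text{ converges}),$
(b3) $F_3 := \left(\lim_k \frac{X_1 + X_2 + \cdots + X_k}{k} \text{ exists}\right).$

Also, there are many important variables which are in $m\mathcal{T}$: for example,

(c) $\xi_1 := \limsup_k \frac{X_1 + X_2 + \cdots + X_k}{k},$

which may be $\pm\infty$, of course.

Exercise. Prove that $F_1$, $F_2$ and $F_3$ are $\mathcal{T}$-measurable, that the event $H$ in the monkey problem is a tail event, and that the various events of probability 0 and 1 in Section 4.4 are tail events.

Hint - to be read only after you have already tried hard. Look at $F_3$ for example. For each $n$, logic tells us that $F_3$ is equal to the set

$F_3^{(n)} := \left\{\omega : \lim_k \frac{X_{n+1}(\omega) + \cdots + X_{n+k}(\omega)}{k} \text{ exists}\right\}.$

Now, $X_{n+1}, X_{n+2}, \ldots$ are all random variables on the triple $(\Omega, \mathcal{T}_n, P)$. That $F_3^{(n)} \in \mathcal{T}_n$ now follows from the measurability results of Chapter 3.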
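4.11. THEOREM. Kolmogorov's 0-1 Law

►► Let $(X_n : n \in \mathbb{N})$ be a sequence of independent random variables, and let $\mathcal{T}$ be the tail σ-algebra of $(X_n : n \in \mathbb{N})$. Then $\mathcal{T}$ is P-trivial: that is,

(i) $F \in \mathcal{T} \Rightarrow P(F) = 0$ or $P(F) = 1$,
(ii) if $\xi$ is a $\mathcal{T}$-measurable random variable, then $\xi$ is almost deterministic in that for some constant $c$ in $[-\infty, \infty]$, $P(\xi = c) = 1$.

We allow $\xi = \pm\infty$ at (ii) for obvious reasons.

Proof of (i). Let

$\mathcal{X}_n := \sigma(X_1, \ldots, X_n), \quad \mathcal{T}_n := \sigma(X_{n+1}, X_{n+2}, \ldots).$

Step 1: We claim that $\mathcal{X}_n$ and $\mathcal{T}_n$ are independent.
Proof of claim. The class $\mathcal{K}$ of events of the form

$\{\omega : X_i(\omega) \le x_i : 1 \le i \le n\}, \quad x_i \in \mathbb{R} \cup \{\infty\}$

is a π-system which generates $\mathcal{X}_n$. The class $\mathcal{J}$ of sets of the form

$\{\omega : X_j(\omega) \le x_j : n+1 \le j \le n+r\}, \quad r \in \mathbb{N}, \ x_j \in \mathbb{R} \cup \{\infty\}$

is a π-system which generates $\mathcal{T}_n$. But the assumption that the sequence $(X_k)$ is independent implies that $\mathcal{K}$ and $\mathcal{J}$ are independent. Lemma 4.2(a) now clinches our claim.

Step 2: $\mathcal{X}_n$ and $\mathcal{T}$ are independent.
This is obvious because $\mathcal{T} \subseteq \mathcal{T}_n$.

Step 3: We claim that $\mathcal{X}_\infty := \sigma(X_n : n \in \mathbb{N})$ and $\mathcal{T}$ are independent.
Proof of claim. Because $\mathcal{X}_n \subseteq \mathcal{X}_{n+1}$, $\forall n$, the class $\mathcal{K}_\infty := \bigcup \mathcal{X}_n$ is a π-system (it is generally NOT a σ-algebra!) which generates $\mathcal{X}_\infty$. Moreover, $\mathcal{K}_\infty$ and $\mathcal{T}$ are independent, by Step 2. Lemma 4.2(a) again clinches things.

Step 4. Since $\mathcal{T} \subseteq \mathcal{X}_\infty$, $\mathcal{T}$ is independent of $\mathcal{T}$! Thus,

$F \in \mathcal{T} \Rightarrow P(F) = P(F \cap F) = P(F)P(F),$

and $P(F) = 0$ or 1. □

Proof of (ii). By part (i), for every $x$ in $\mathbb{R}$, $P(\xi \le x) = 0$ or 1. Let $c := \sup\{x : P(\xi \le x) = 0\}$. Then, if $c = -\infty$, it is clear that $P(\xi = -\infty) = 1$; and if $c = \infty$, it is clear that $P(\xi = \infty) = 1$.

So, suppose that $c$ is finite. Then $P(\xi \le c - 1/n) = 0$, $\forall n$, so that

$P\left(\bigcup_n \{\xi \le c - 1/n\}\right) = P(\xi < c) = 0,$

while, since $P(\xi \le c + 1/n) = 1$, $\forall n$, we have

$P\left(\bigcap_n \{\xi \le c + 1/n\}\right) = P(\xi \le c) = 1.$

Hence, $P(\xi = c) = 1$. □

Remarks. The examples in Section 4.10 show how striking this result is. For example, if $X_1, X_2, \ldots$ is a sequence of independent random variables, then

either $P(\sum X_n \text{ converges}) = 0$ or $P(\sum X_n \text{ converges}) = 1$.

The Three Series Theorem (Theorem 12.5) completely settles the question of which possibility occurs.

So, you can see that the 0-1 law poses numerous interesting questions.

Example. In the branching-process example of Chapter 0, the variable $M_\infty$ is measurable on the tail σ-algebra of the sequence $(Z_n : n \in \mathbb{N})$ but need not be almost deterministic. But then the variables $(Z_n : n \in \mathbb{N})$ are not independent.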
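4.12. Exercise/Warning

Let $Y_0, Y_1, Y_2, \ldots$ be independent random variables with

$P(Y_n = +1) = P(Y_n = -1) = \tfrac12, \quad \forall n.$

For $n \in \mathbb{N}$, define

$X_n := Y_0 Y_1 \cdots Y_n.$

Prove that the variables $X_1, X_2, \ldots$ are independent. Define

$\mathcal{Y} := \sigma(Y_1, Y_2, \ldots), \quad \mathcal{T}_n := \sigma(X_r : r > n).$

Prove that

$\mathcal{L} := \bigcap_n \sigma(\mathcal{Y}, \mathcal{T}_n) \ne \sigma\left(\mathcal{Y}, \bigcap_n \mathcal{T}_n\right) =: \mathcal{R}.$

Hint. Prove that $Y_0 \in m\mathcal{L}$ and that $Y_0$ is independent of $\mathcal{R}$.

Notes. The phenomenon illustrated by this example tripped up even Kolmogorov and Wiener. The very simple illustration given here was shown to me by [...]. Deciding when we can assert that (for $\mathcal{Y}$ a σ-algebra and $(\mathcal{T}_n)$ a decreasing sequence of σ-algebras)

$\bigcap_n \sigma(\mathcal{Y}, \mathcal{T}_n) = \sigma\left(\mathcal{Y}, \bigcap_n \mathcal{T}_n\right)$

is a tantalizing problem in many probabilistic contexts.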
Chapter 5

Integration

5.0. Notation, etc. $\mu(f) :=: \int f\,d\mu$, $\mu(f; A)$

Let $(S, \Sigma, \mu)$ be a measure space. We are interested in defining for suitable elements $f$ of $m\Sigma$ the (Lebesgue) integral of $f$ with respect to $\mu$, for which we shall use the alternative notations:

►► $\mu(f) :=: \int_S f(s)\,\mu(ds) :=: \int_S f\,d\mu.$

It is worth mentioning now that we shall also use the equivalent notations for $A \in \Sigma$:

$\int_A f(s)\,\mu(ds) :=: \int_A f\,d\mu :=: \mu(f; A) := \mu(f I_A),$

(with a true definition on the extreme right!) It should be clear that, for example,

$\mu(f; f \ge x) := \mu(f; A)$, where $A = \{s \in S : f(s) \ge x\}.$

Something else worth emphasizing now is that, of course, summation is a special type of integration. If $(a_n : n \in \mathbb{N})$ is a sequence of real numbers, then with $S = \mathbb{N}$, $\Sigma = \mathcal{P}(\mathbb{N})$, and $\mu$ the measure on $(S, \Sigma)$ with $\mu(\{k\}) = 1$ for every $k$ in $\mathbb{N}$, then $s \mapsto a_s$ is $\mu$-integrable if and only if $\sum |a_n| < \infty$, and then

$\sum a_n = \int_S a_s\,\mu(ds) = \int_S a\,d\mu.$

We begin by considering the integral of a function $f$ in $(m\Sigma)^+$, allowing such an $f$ to take values in the extended half-line $[0, \infty]$.
5.1. Integrals of non-negative simple functions, $SF^+$

If $A$ is an element of $\Sigma$, we define

$\mu_0(I_A) := \mu(A) \le \infty.$

The use of $\mu_0$ rather than $\mu$ signifies that we currently have only a naive integral defined for simple functions.

An element $f$ of $(m\Sigma)^+$ is called simple, and we shall then write $f \in SF^+$, if $f$ may be written as a finite sum

(a) $f = \sum_{k=1}^m a_k I_{A_k}$

where $a_k \in [0, \infty]$ and $A_k \in \Sigma$. We then define

(b) $\mu_0(f) = \sum a_k \mu(A_k) \le \infty$ (with $0.\infty := 0 =: \infty.0$).

The first point to be checked is that $\mu_0(f)$ is well-defined; for $f$ will have many different representations of the form (a) and we must ensure that they yield the same value of $\mu_0(f)$ in (b). Various desirable properties also need to be checked, namely (c), (d) and (e) now to be stated:

(c) if $f, g \in SF^+$ and $\mu(f \ne g) = 0$ then $\mu_0(f) = \mu_0(g)$;
(d) ('Linearity') if $f, g \in SF^+$ and $c \ge 0$ then $f + g$ and $cf$ are in $SF^+$, and $\mu_0(f + g) = \mu_0(f) + \mu_0(g)$, $\mu_0(cf) = c\mu_0(f)$;
(e) (Monotonicity) if $f, g \in SF^+$ and $f \le g$, then $\mu_0(f) \le \mu_0(g)$;
(f) if $f, g \in SF^+$ then $f \wedge g$ and $f \vee g$ are in $SF^+$.

Checking all the properties just mentioned is a little messy, but it involves no point of substance. We skip this, and turn our attention to what matters: the Monotone-Convergence Theorem.
5.2. Definition of $\mu(f)$, $f \in (m\Sigma)^+$

► For $f \in (m\Sigma)^+$ we define

(a) $\mu(f) := \sup\{\mu_0(h) : h \in SF^+, h \le f\} \le \infty.$

Clearly, for $f \in SF^+$, we have $\mu(f) = \mu_0(f)$.

The following result is important.

LEMMA
►(b) If $f \in (m\Sigma)^+$ and $\mu(f) = 0$, then $\mu(\{f > 0\}) = 0$.

Proof. Obviously, $\{f > 0\} = \uparrow\lim \{f > n^{-1}\}$. Hence, using (1.10,a), we see that if $\mu(\{f > 0\}) > 0$, then, for some $n$, $\mu(\{f > n^{-1}\}) > 0$, and then

$\mu(f) \ge \mu_0(n^{-1} I_{\{f > n^{-1}\}}) > 0.$ □
5.3. Monotone-Convergence Theorem (MON)

►►►(a) If $(f_n)$ is a sequence of elements of $(m\Sigma)^+$ such that $f_n \uparrow f$, then

$\mu(f_n) \uparrow \mu(f) \le \infty,$

or, in other notation,

$\int_S f_n(s)\,\mu(ds) \uparrow \int_S f(s)\,\mu(ds).$

This theorem is really all there is to integration theory. We shall see that other key results such as the Fatou Lemma and the Dominated-Convergence Theorem follow trivially from it.

The (MON) theorem is proved in the Appendix. Obviously, the theorem relates very closely to Lemma 1.10(a), the monotonicity result for measures. The proof of (MON) is not at all difficult, and may be read once you have looked at the following definition of $\alpha^{(r)}$.

It is convenient to have an explicit way given $f \in (m\Sigma)^+$ of obtaining a sequence $f^{(r)}$ of simple functions such that $f^{(r)} \uparrow f$. For $r \in \mathbb{N}$, define the $r$-th staircase function $\alpha^{(r)} : [0,\infty] \to [0,\infty]$ as follows:

(b) $\alpha^{(r)}(x) := 0$ if $x = 0$; $\quad (i-1)2^{-r}$ if $(i-1)2^{-r} < x \le i2^{-r} \le r$ $(i \in \mathbb{N})$; $\quad r$ if $x > r$.

Then $f^{(r)} = \alpha^{(r)} \circ f$ satisfies $f^{(r)} \in SF^+$, and $f^{(r)} \uparrow f$ so that, by (MON),

$\mu(f) = \uparrow\lim \mu(f^{(r)}) = \uparrow\lim \mu_0(f^{(r)}).$

We have made $\alpha^{(r)}$ left-continuous so that if $f_n \uparrow f$ then $\alpha^{(r)}(f_n) \uparrow \alpha^{(r)}(f)$.
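As a small check on the staircase construction, here is a minimal sketch of $\alpha^{(r)}$ in exact arithmetic, together with the naive integrals $\mu_0(f^{(r)})$ for the particular (assumed) example $f(x) = x$ on $[0,1]$ with Lebesgue measure. The function names are illustrative only.

```python
import math
from fractions import Fraction


def staircase(r, x):
    """The r-th staircase function alpha^(r) of 5.3(b), returned as an exact Fraction."""
    if x == 0:
        return Fraction(0)
    if x > r:
        return Fraction(r)
    # Here 0 < x <= r: find i with (i-1)2^-r < x <= i 2^-r and return (i-1)2^-r.
    i = math.ceil(Fraction(x) * 2 ** r)
    return Fraction(i - 1, 2 ** r)


def mu0_of_identity(r):
    """mu_0(alpha^(r) o f) for f(x) = x on ([0,1], Leb), with r >= 1.

    alpha^(r) o f equals (i-1)2^-r on the interval ((i-1)2^-r, i 2^-r],
    which has Lebesgue measure 2^-r.
    """
    step = Fraction(1, 2 ** r)
    return sum((i - 1) * step * step for i in range(1, 2 ** r + 1))
```

For this example the values $\mu_0(f^{(r)})$ are $1/4, 3/8, 7/16, \ldots$, increasing to $1/2$, which is what (MON) predicts.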
Often, we need to apply convergence theorems such as (MON) where the hypothesis ($f_n \uparrow f$ in the case of (MON)) holds almost everywhere rather than everywhere. Let us see how such adjustments may be made.

(c) If $f, g \in (m\Sigma)^+$ and $f = g$ (a.e.), then $\mu(f) = \mu(g)$.

Proof. Let $f^{(r)} = \alpha^{(r)} \circ f$, $g^{(r)} = \alpha^{(r)} \circ g$. Then $f^{(r)} = g^{(r)}$ (a.e.) and so, by (5.1,c), $\mu(f^{(r)}) = \mu(g^{(r)})$. Now let $r \uparrow \infty$, and use (MON). □

►(d) If $f \in (m\Sigma)^+$ and $(f_n)$ is a sequence in $(m\Sigma)^+$ such that, except on a $\mu$-null set $N$, $f_n \uparrow f$, then

$\mu(f_n) \uparrow \mu(f).$

Proof. We have $\mu(f) = \mu(f I_{S \setminus N})$ and $\mu(f_n) = \mu(f_n I_{S \setminus N})$. But $f_n I_{S \setminus N} \uparrow f I_{S \setminus N}$ everywhere. The result now follows from (MON). □

From now on, (MON) is understood to include this extension. We do not bother to spell out such extensions for the other convergence theorems, often stating results with 'almost everywhere' but proving them under the assumption that the exceptional null set is empty.
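Note on the Riemann integral

If, for example, $f$ is a non-negative Riemann integrable function on $([0,1], \mathcal{B}[0,1], \mathrm{Leb})$ with Riemann integral $I$, then there exists an increasing sequence $(L_n)$ of elements of $SF^+$ and a decreasing sequence $(U_n)$ of elements of $SF^+$ such that

$L_n \uparrow L \le f, \quad U_n \downarrow U \ge f$

and

$\mu(L_n) \uparrow I, \quad \mu(U_n) \downarrow I.$

If we define

$\tilde{f} := f$ if $L = U$, $\quad 0$ otherwise,

then it is clear that $\tilde{f}$ is Borel measurable, while $\{f \ne \tilde{f}\}$ is a subset of the Borel set $\{L \ne U\}$ which Lemma 5.2(b) shows to be of measure 0 (since $\mu(L) = \mu(U) = I$). So $f$ is Lebesgue measurable (see Section A1.11) and the Riemann integral of $f$ equals the integral of $f$ associated with $([0,1], \mathrm{Leb}[0,1], \mathrm{Leb})$, $\mathrm{Leb}[0,1]$ denoting the σ-algebra of Lebesgue measurable subsets of $[0,1]$.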
5.4. The Fatou Lemmas for functions

►►(a) (FATOU) For a sequence $(f_n)$ in $(m\Sigma)^+$,

$\mu(\liminf f_n) \le \liminf \mu(f_n).$

Proof. We have

(*) $\liminf_n f_n = \uparrow\lim_k g_k$, where $g_k := \inf_{n \ge k} f_n.$

For $n \ge k$, we have $f_n \ge g_k$, so that $\mu(f_n) \ge \mu(g_k)$, whence

$\mu(g_k) \le \inf_{n \ge k} \mu(f_n);$

and on combining this with an application of (MON) to (*), we obtain

$\mu(\liminf_n f_n) = \uparrow\lim_k \mu(g_k) \le \uparrow\lim_k \inf_{n \ge k} \mu(f_n) =: \liminf_n \mu(f_n).$ □

Reverse Fatou Lemma

►(b) If $(f_n)$ is a sequence in $(m\Sigma)^+$ such that for some $g$ in $(m\Sigma)^+$, we have $f_n \le g$, $\forall n$, and $\mu(g) < \infty$, then

$\mu(\limsup f_n) \ge \limsup \mu(f_n).$

Proof. Apply (FATOU) to the sequence $(g - f_n)$. □
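A standard example, not taken from the text, shows that the inequality in (FATOU) can be strict, and that the domination hypothesis in the Reverse Fatou Lemma cannot simply be dropped. The choice of space and of the functions $f_n$ below is an illustrative assumption.

```latex
Take $S = (0,1]$ with Lebesgue measure $\mu$ and $f_n := n\,I_{(0,1/n]}$.
For each fixed $s \in (0,1]$ we have $f_n(s) = 0$ as soon as $n > 1/s$, so
\[
  \liminf_n f_n = \limsup_n f_n = 0 \quad\text{everywhere},
  \qquad
  \mu(f_n) = n \cdot \tfrac{1}{n} = 1 \quad\text{for every } n .
\]
Hence
\[
  \mu(\liminf_n f_n) = 0 < 1 = \liminf_n \mu(f_n),
  \qquad
  \mu(\limsup_n f_n) = 0 < 1 = \limsup_n \mu(f_n).
\]
The second display does not contradict the Reverse Fatou Lemma:
any $g$ with $f_n \le g$ for all $n$ satisfies $g(s) \ge n$ on $(1/(n+1), 1/n]$,
so that $\mu(g) \ge \sum_n n\left(\tfrac1n - \tfrac1{n+1}\right) = \sum_n \tfrac{1}{n+1} = \infty$.
```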
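5.5. 'Linearity'

For $\alpha, \beta \in \mathbb{R}^+$ and $f, g \in (m\Sigma)^+$,

$\mu(\alpha f + \beta g) = \alpha\mu(f) + \beta\mu(g) \quad (\le \infty).$

Proof. Approximate $f$ and $g$ from below by simple functions, apply (5.1,d) to the simple functions, and then use (MON). □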
5.6. Positive and negative parts of $f$

For $f \in m\Sigma$, we write $f = f^+ - f^-$, where

$f^+(s) := \max(f(s), 0), \quad f^-(s) := \max(-f(s), 0).$

Then $f^+, f^- \in (m\Sigma)^+$, and $|f| = f^+ + f^-$.
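5.7. Integrable function, $\mathcal{L}^1(S, \Sigma, \mu)$

► For $f \in m\Sigma$, we say that $f$ is $\mu$-integrable, and write $f \in \mathcal{L}^1(S, \Sigma, \mu)$ if

$\mu(|f|) = \mu(f^+) + \mu(f^-) < \infty,$

and then we define

$\int f\,d\mu := \mu(f) := \mu(f^+) - \mu(f^-).$

Note that, for $f \in \mathcal{L}^1(S, \Sigma, \mu)$,

► $|\mu(f)| \le \mu(|f|),$

the familiar rule that the modulus of the integral is less than or equal to the integral of the modulus.

We write $\mathcal{L}^1(S, \Sigma, \mu)^+$ for the class of non-negative elements in $\mathcal{L}^1(S, \Sigma, \mu)$.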
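5.8. Linearity

For $\alpha, \beta \in \mathbb{R}$ and $f, g \in \mathcal{L}^1(S, \Sigma, \mu)$,

$\alpha f + \beta g \in \mathcal{L}^1(S, \Sigma, \mu)$

and

$\mu(\alpha f + \beta g) = \alpha\mu(f) + \beta\mu(g).$

Proof. This is a totally routine consequence of the result in Section 5.5.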
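5.9. Dominated-Convergence Theorem (DOM)

► Suppose that $f_n, f \in m\Sigma$, that $f_n(s) \to f(s)$ for every $s$ in $S$ and that the sequence $(f_n)$ is dominated by an element $g$ of $\mathcal{L}^1(S, \Sigma, \mu)^+$:

$|f_n(s)| \le g(s), \quad \forall s \in S, \forall n \in \mathbb{N},$

where $\mu(g) < \infty$. Then $f_n \to f$ in $\mathcal{L}^1(S, \Sigma, \mu)$: that is,

$\mu(|f_n - f|) \to 0,$

whence

$\mu(f_n) \to \mu(f).$

Proof. We have $|f_n - f| \le 2g$, where $\mu(2g) < \infty$, so by the reverse Fatou Lemma 5.4(b),

$\limsup \mu(|f_n - f|) \le \mu(\limsup |f_n - f|) = \mu(0) = 0.$

Since

$|\mu(f_n) - \mu(f)| = |\mu(f_n - f)| \le \mu(|f_n - f|),$

the theorem is proved. □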
5.10. Scheffé's Lemma (SCHEFFE)

►(i) Suppose that $f_n, f \in \mathcal{L}^1(S, \Sigma, \mu)^+$; in particular, $f_n$ and $f$ are non-negative. Suppose that $f_n \to f$ (a.e.). Then

$\mu(|f_n - f|) \to 0$ if and only if $\mu(f_n) \to \mu(f)$.

Proof. The 'only if' part is trivial. Suppose now that

(a) $\mu(f_n) \to \mu(f).$

Since $(f_n - f)^- \le f$, (DOM) shows that

(b) $\mu((f_n - f)^-) \to 0.$

Next,

$\mu((f_n - f)^+) = \mu(f_n - f; f_n \ge f) = \mu(f_n) - \mu(f) - \mu(f_n - f; f_n < f).$

But

$|\mu(f_n - f; f_n < f)| \le |\mu((f_n - f)^-)| \to 0$

so that (a) and (b) together imply that

(c) $\mu((f_n - f)^+) \to 0.$

Of course, (b) and (c) now yield the desired result. □

Here is the second part of Scheffé's Lemma.

►(ii) Suppose that $f_n, f \in \mathcal{L}^1(S, \Sigma, \mu)$ and that $f_n \to f$ (a.e.). Then

$\mu(|f_n - f|) \to 0$ if and only if $\mu(|f_n|) \to \mu(|f|)$.

Exercise. Prove the 'if' part of (ii) by using Fatou's Lemma to show that $\mu(f_n^\pm) \to \mu(f^\pm)$, and then applying (i). Of course, the 'only if' part is trivial.
5.11. Remark on uniform integrability

The theory of uniform integrability, which we shall establish later for probability triples, gives better insight into the matter of convergence of integrals.
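5.12. The standard machine

What I call the standard machine is a much cruder alternative to the Monotone-Class Theorem. The idea is that to prove that a 'linear' result is true for all functions $h$ in a space such as $\mathcal{L}^1(S, \Sigma, \mu)$,

• first, we show the result is true for the case when $h$ is an indicator function - which it normally is by definition;
• then, we use linearity to obtain the result for $h$ in $SF^+$;
• next, we use (MON) to obtain the result for $h \in (m\Sigma)^+$, integrability conditions on $h$ usually being superfluous at this stage;
• finally, we show, by writing $h = h^+ - h^-$ and using linearity, that the claimed result is true.

It seems to me that, when it works, it is easier to 'watch the standard machine work' than to appeal to the monotone-class result, though there are times when the greater subtlety of the Monotone-Class Theorem is essential.

5.13. Integrals over subsets

Recall that for $f \in (m\Sigma)^+$, we set, for $A \in \Sigma$,

$\int_A f\,d\mu :=: \mu(f; A) := \mu(f I_A).$

If we really want to integrate $f$ over $A$, we should integrate the restriction $f|_A$ with respect to the measure $\mu_A$ (say) which is $\mu$ restricted to the measure space $(A, \Sigma_A)$, $\Sigma_A$ denoting the σ-algebra of subsets of $A$ which belong to $\Sigma$. So we ought to prove that

(a) $\mu_A(f|_A) = \mu(f; A).$

The standard machine does this. If $f$ is the indicator of a set $B$ in $\Sigma$, then both sides of (a) are just $\mu(A \cap B)$; etc. We discover that for $f \in m\Sigma$, we have $f|_A \in m\Sigma_A$; and then

$f|_A \in \mathcal{L}^1(A, \Sigma_A, \mu_A)$ if and only if $f I_A \in \mathcal{L}^1(S, \Sigma, \mu)$,

in which case (a) holds.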
5.14. The measure $f\mu$, $f \in (m\Sigma)^+$

Let $f \in (m\Sigma)^+$. For $A \in \Sigma$, define

(a) $(f\mu)(A) := \mu(f; A) := \mu(f I_A).$

A trivial Exercise on the results of Section 5.5 and (MON) shows that

(b) $(f\mu)$ is a measure on $(S, \Sigma)$.

For $h \in (m\Sigma)^+$, and $A \in \Sigma$, we can conjecture that

(c) $(h(f\mu))(A) := (f\mu)(h I_A) = \mu(f h I_A).$

If $h$ is the indicator of a set in $\Sigma$, then (c) is immediate by definition. Our standard machine produces (c), so that we have

(d) $h(f\mu) = (hf)\mu.$

Result (d) is often used in the following form:

►(e) if $f \in (m\Sigma)^+$ and $h \in (m\Sigma)$, then $h \in \mathcal{L}^1(S, \Sigma, f\mu)$ if and only if $fh \in \mathcal{L}^1(S, \Sigma, \mu)$ and then $(f\mu)(h) = \mu(fh)$.

Proof. We need only prove this for $h \ge 0$ in which case it merely says that the measures at (d) agree on $S$. □

Terminology, and the Radon-Nikodým theorem

If $\lambda$ denotes the measure $f\mu$ on $(S, \Sigma)$, we say that $\lambda$ has density $f$ relative to $\mu$, and express this in symbols via

$d\lambda/d\mu = f.$

We note that in this case, we have for $F \in \Sigma$:

(f) $\mu(F) = 0$ implies that $\lambda(F) = 0$;

so that only certain measures have density relative to $\mu$. The Radon-Nikodým theorem (proved in Chapter 14) tells us that

(g) if $\mu$ and $\lambda$ are σ-finite measures on $(S, \Sigma)$ such that (f) holds, then $\lambda = f\mu$ for some $f \in (m\Sigma)^+$.
Chapter 6

Expectation

6.0. Introductory remarks

We work with a probability triple $(\Omega, \mathcal{F}, P)$, and write $\mathcal{L}^r$ for $\mathcal{L}^r(\Omega, \mathcal{F}, P)$. Recall that a random variable (RV) is an element of $m\mathcal{F}$, that is an $\mathcal{F}$-measurable function from $\Omega$ to $\mathbb{R}$.

Expectation is just the integral relative to $P$.

Jensen's inequality, which makes critical use of the fact that $P(\Omega) = 1$, is very useful and powerful: it implies the Schwarz, Hölder, ... inequalities for general $(S, \Sigma, \mu)$. (See Section 6.13.)

We study the geometry of the space $\mathcal{L}^2(\Omega, \mathcal{F}, P)$ in some detail, with a view to several later applications.
6.1. Definition of expectation

For a random variable $X \in \mathcal{L}^1 = \mathcal{L}^1(\Omega, \mathcal{F}, P)$, we define the expectation $E(X)$ of $X$ by

$E(X) := \int_\Omega X\,dP = \int_\Omega X(\omega)\,P(d\omega).$

We also define $E(X)$ $(\le \infty)$ for $X \in (m\mathcal{F})^+$. In short, $E(X) = P(X)$.

That our present definitions agree with those in terms of probability density function (if it exists) etc. will be confirmed in Section 6.12.

6.2. Convergence theorems

Suppose that $(X_n)$ is a sequence of RVs, that $X$ is a RV, and that $X_n \to X$ almost surely:

$P(X_n \to X) = 1.$

We rephrase the convergence theorems of Chapter 5 in our new notation:

►► (MON) if $0 \le X_n \uparrow X$, then $E(X_n) \uparrow E(X) \le \infty$;
►► (FATOU) if $X_n \ge 0$, then $E(X) \le \liminf E(X_n)$;
► (DOM) if $|X_n(\omega)| \le Y(\omega)$ $\forall (n, \omega)$, where $E(Y) < \infty$, then $E(|X_n - X|) \to 0$, so that $E(X_n) \to E(X)$;
► (SCHEFFE) if $E(|X_n|) \to E(|X|)$, then $E(|X_n - X|) \to 0$;
►► (BDD) if for some finite constant $K$, $|X_n(\omega)| \le K$, $\forall (n, \omega)$, then $E(|X_n - X|) \to 0$.

The newly-added Bounded Convergence Theorem (BDD) is an immediate consequence of (DOM), obtained on taking $Y(\omega) = K$, $\forall \omega$; because of the fact that $P(\Omega) = 1$, we have $E(Y) < \infty$. It has a direct elementary proof which we shall examine in Section 13.7; but you might well be able to provide it now.

As has been mentioned previously, uniform integrability is the key concept which gives a proper understanding of convergence theorems. We shall study this, via the elementary (BDD) result, in Chapter 13.
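6.3. The notation $E(X; F)$

For $X \in \mathcal{L}^1$ (or $(m\mathcal{F})^+$) and $F \in \mathcal{F}$, we define

► $E(X; F) := \int_F X(\omega)\,P(d\omega) := E(X I_F),$

where, as ever, $I_F(\omega) := 1$ if $\omega \in F$ and $I_F(\omega) := 0$ if $\omega \notin F$.

Of course, this tallies with the $\mu(f; A)$ notation of Chapter 5.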
6.4. Markov's inequality

Suppose that $Z \in m\mathcal{F}$ and that $g : \mathbb{R} \to [0, \infty]$ is $\mathcal{B}$-measurable and non-decreasing. (We know that $g(Z) = g \circ Z \in (m\mathcal{F})^+$.) Then

► $E g(Z) \ge E(g(Z); Z \ge c) \ge g(c) P(Z \ge c).$

Examples:

for $Z \in (m\mathcal{F})^+$, $\quad cP(Z \ge c) \le E(Z)$, $(c > 0)$,
for $X \in \mathcal{L}^1$, $\quad cP(|X| \ge c) \le E(|X|)$ $(c > 0)$.

►► Considerable strength can often be obtained by choosing the optimum $\theta$ for $c$ in

► $P(Y > c) \le e^{-\theta c} E(e^{\theta Y}), \quad (\theta > 0, c \in \mathbb{R}).$
We
Sums
collect
of non-negative
together
imJ=')-^
RVs
< oo, then
some
and
useful results.
(a) If X e
If \342\226\272(b)
E{X)
P{X <
oo) = 1. This
is
obvious.
(Zk)
is a
This
is an (Zk)
obvious
is a
consequence of linearity
and (MON).
X^E(Z)t)
If \342\226\272(c)
< oo,
then
< oo (a.s.)
and
so
Zfc
\342\200\224> 0
(a.s.)
of immediate consequence and (b). (a) is a consequence of (c). For suppose (d) The First Borel-CantelliLemma that is a sequence of events such that oo. Take Zk = Ipk< (Fk) ^ P{Fk) = Then and, by E(Zk) P(Fk) (c),
This
is an
Y^
If^
number
of events
Fk
which
occur
is a.s.
finite.
6.6. Jensen's inequality for convex functions

►► A function $c : G \to \mathbb{R}$, where $G$ is an open subinterval of $\mathbb{R}$, is called convex on $G$ if its graph lies below any of its chords: for $x, y \in G$ and $0 \le p = 1 - q \le 1$,

$c(px + qy) \le pc(x) + qc(y).$

It will be explained below that $c$ is automatically continuous on $G$. If $c$ is twice-differentiable on $G$, then $c$ is convex if and only if $c'' \ge 0$.

► Important examples of convex functions: $|x|$, $x^2$, $e^{\theta x}$ $(\theta \in \mathbb{R})$.

THEOREM. Jensen's inequality

►► Suppose that $c : G \to \mathbb{R}$ is a convex function on an open subinterval $G$ of $\mathbb{R}$ and that $X$ is a random variable such that

$E(|X|) < \infty, \quad P(X \in G) = 1, \quad E|c(X)| < \infty.$

Then

$E c(X) \ge c(E(X)).$

Proof. The fact that $c$ is convex may be rewritten as follows: for $u, v, w \in G$ with $u < v < w$, we have

$\Delta_{u,v} \le \Delta_{v,w}$, where $\Delta_{u,v} := \frac{c(v) - c(u)}{v - u}.$

It is now clear that $c$ is continuous on $G$, and that for each $v$ in $G$ the monotone limits

$(D_-c)(v) := \uparrow\lim_{u \uparrow v} \Delta_{u,v}, \quad (D_+c)(v) := \downarrow\lim_{w \downarrow v} \Delta_{v,w}$

exist and satisfy $(D_-c)(v) \le (D_+c)(v)$. The functions $D_-c$ and $D_+c$ are non-decreasing, and for every $v$ in $G$, for any $m$ in $[(D_-c)(v), (D_+c)(v)]$ we have

$c(x) \ge m(x - v) + c(v), \quad x \in G.$

In particular, we have, almost surely, for $\mu := E(X)$,

$c(X) \ge m(X - \mu) + c(\mu), \quad m \in [(D_-c)(\mu), (D_+c)(\mu)]$

and Jensen's inequality follows on taking expectations. □

Remark. For later use, we shall need the obvious fact that

(a) $c(x) = \sup_{q \in G} [(D_-c)(q)(x - q) + c(q)] = \sup_n (a_n x + b_n) \quad (x \in G)$

for some sequences $(a_n)$ and $(b_n)$ in $\mathbb{R}$. (Recall that $c$ is continuous.)
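Jensen's inequality can be checked numerically on a simple discrete law. The three-point law below is an arbitrary illustrative assumption, and the three convex functions tried are $|x|$, $x^2$ and $e^x$.

```python
import math

# An arbitrary three-point law for X (value -> probability); illustrative only.
LAW = {-1.0: 0.2, 0.0: 0.5, 2.0: 0.3}


def expect(h):
    """E h(X) as a finite sum over the law of X."""
    return sum(p * h(x) for x, p in LAW.items())


def jensen_gap(c):
    """E c(X) - c(E X), which Jensen's inequality says is >= 0 for convex c."""
    mean = expect(lambda x: x)
    return expect(c) - c(mean)
```

Here $E(X) = 0.4$, and the three gaps are all strictly positive because the law is not concentrated at a single point and none of the functions is affine on the range of $X$.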
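6.7. Monotonicity of $\mathcal{L}^p$ norms

►► For $1 \le p < \infty$, we say that $X \in \mathcal{L}^p = \mathcal{L}^p(\Omega, \mathcal{F}, P)$ if

$E(|X|^p) < \infty,$

and then we define

$\|X\|_p := \{E(|X|^p)\}^{1/p}.$

The monotonicity property referred to in the section title is the following:

►(a) if $1 \le p \le r < \infty$ and $Y \in \mathcal{L}^r$, then $Y \in \mathcal{L}^p$ and

$\|Y\|_p \le \|Y\|_r.$

► Proof. For $n \in \mathbb{N}$, define

$X_n(\omega) := \{|Y(\omega)| \wedge n\}^p.$

Then $X_n$ is bounded so that $X_n$ and $X_n^{r/p}$ are both in $\mathcal{L}^1$. Taking $c(x) = x^{r/p}$ on $(0, \infty)$, we conclude from Jensen's inequality that

$(E X_n)^{r/p} \le E(X_n^{r/p}) = E[(|Y| \wedge n)^r] \le E(|Y|^r).$

Now let $n \uparrow \infty$ and use (MON) to obtain the desired result. □

Note. The proof is marked with a ► because it illustrates a simple but effective use of truncation.

Vector-space property of $\mathcal{L}^p$

►(b) Since, for $a, b \in \mathbb{R}^+$, we have

$(a + b)^p \le [2\max(a, b)]^p \le 2^p(a^p + b^p),$

$\mathcal{L}^p$ is obviously a vector space.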
are in C?,
then
XY
\302\243^and
|E(xy)l<E(|XF|)<l|x||2||y||2.
will have seen many versions of this result and of truncation to make the argument rigorous.
Remarkbefore.
You We use
By
its
proof
Proof. restrict
considering to
|X| the
and
\\Y\\
instead
of
attention
case
when X
>0,Y >0.
and Y,
we can and do
..(6.9)
Write Xrt~
Cha'pier
6:
Expectation
63 Yn are
Xn
and
bounded. For
any
E[{aXn
-f hYnf]
-f
= a^E{Xl)
and since
2abE(XnYn)
+ b^E{Y^),
have
two
distinct
real
roots,
Now let n t
The
{2E{XnYn)y <
oo using
AE(Xl)E(Y^)
<
AE{X^)E{Y^).
\342\226\241
(MON).
immediate consequence of
so (a):
we
following
is an
(b)
if X
and Y are in
C^, then
\\\\X
is X
-^ Y,
+
and
\\\\Y\\\\2.
have
the
triangle
law:
Yh<\\\\Xh
Remark.
Section
The
6.13,
Schwarz
which
space
- see
6.9. $\mathcal{L}^2$: Pythagoras, covariance, etc.

In this section, we take a brief look at the geometry of $\mathcal{L}^2$ and at its connections with probabilistic concepts such as covariance, correlation, etc.

Covariance and variance

If $X, Y \in \mathcal{L}^2$, then by the monotonicity of norms, $X, Y \in \mathcal{L}^1$, so that we may define

$\mu_X := E(X), \quad \mu_Y := E(Y).$

Since the constant functions with values $\mu_X, \mu_Y$ are in $\mathcal{L}^2$, we see that

(a) $\tilde{X} := X - \mu_X, \quad \tilde{Y} := Y - \mu_Y$

are in $\mathcal{L}^2$. By the Schwarz inequality, $\tilde{X}\tilde{Y} \in \mathcal{L}^1$, and so we may define

(b) $\mathrm{Cov}(X, Y) := E(\tilde{X}\tilde{Y}) = E[(X - \mu_X)(Y - \mu_Y)].$

The Schwarz inequality further justifies expanding out the product in the final [ ] bracket to yield

(c) $\mathrm{Cov}(X, Y) = E(XY) - \mu_X \mu_Y.$

As you know, the variance of $X$ is defined by

(d) $\mathrm{Var}(X) := E[(X - \mu_X)^2] = E(X^2) - \mu_X^2 = \mathrm{Cov}(X, X).$
Chapter 6:
Expectation
(6.9)..
Inner product,angle
For
Z7, V
G >C^, we
(e)
and
and V
(f)
if ||J7||2
by
and ||F||2
^ 0, we
define
the
cosine
of the
angle 9 betweenU
cos.=
the
<^^^
WuhWh
Schwarz
inequality.
This ties in
with
probabilistic
correlation
idea
of correlation: X and Y
p of
is cosa
where
is the
angle between
has
the
same
Thus
below).
geometry as any inner-product space (but see 'Quotientthe 'cosine rule' of elementarygeometry and the holds,
form
Pythagoras
(h)
vh'
= wuh'
+ wvh'
V
if
{u,
V)
= 0. or perpendicular,
form
If {U,V) write U
replaced
0,
V.
and
are
orthogonal
and
U,V
language,
(with
hy X,Y)
(i)
Var(X + F)
Xi,
Var(X)
Var(y)
Cov(X,F)
= 0.
Generally,for
(j)
X2,...,
6 C^, A'\342\200\236
Var(Xi
+X2 +
---+Xn) = J2
k
V^(^t)
2^^
^^.^
.Cov(Xi,
I am
X,).
I have
not marked
they are
(j)
with
\342\226\272 because
sure that
Parallelogram
Note
that
by the
-f
bilinearity of
+
\\\\U
(\342\200\242, \342\200\242),
(k)
\\\\U
FII2'
FII2'
= {U
+ V,U
+ V)
{U ^V,U
--V)
2\\\\Uh'+2\\\\Vh\\
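Quotienting (or lack of it!): $L^2$

Our space $\mathcal{L}^2$ does not quite satisfy the requirements for an inner product space because the best we can say is that

$\|U\|_2 = 0$ if and only if $U = 0$ almost surely.

In functional analysis, we find an elegant solution by defining an equivalence relation

$U \sim V$ if and only if $U = V$ almost surely

and define $L^2$ as '$\mathcal{L}^2$ quotiented out by this equivalence relation'. Of course, one needs to check that if for $i = 1, 2$, we have $c_i \in \mathbb{R}$ and $U_i, V_i \in \mathcal{L}^2$ with $U_i \sim V_i$, then

$c_1 U_1 + c_2 U_2 \sim c_1 V_1 + c_2 V_2; \quad \langle U_1, U_2 \rangle = \langle V_1, V_2 \rangle;$

that if $U_n \to U$ in $\mathcal{L}^2$ and $V_n \sim U_n$ and $V \sim U$, then $V_n \to V$ in $\mathcal{L}^2$; etc.

As mentioned in 'A Question of Terminology', we normally do not do this quotienting in probability theory. Although one might safely do it at the level of this book, one could not do so at a more advanced level. For a Brownian motion $\{B_t : t \in \mathbb{R}^+\}$, the crucial property that $t \mapsto B_t(\omega)$ is continuous would be meaningless if one replaced the true function $B_t$ on $\Omega$ by an equivalence class.

6.10. Completeness of $\mathcal{L}^p$ $(1 \le p < \infty)$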
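Let $p \in [1, \infty)$.

The following result (a) is important in functional analysis, and will be crucial for us in the case when $p = 2$. It is instructive to prove it as an exercise in our probabilistic way of thinking, and we now do so.

(a) If $(X_n)$ is a Cauchy sequence in $\mathcal{L}^p$ in that

$\sup_{r,s \ge k} \|X_r - X_s\|_p \to 0 \quad (k \to \infty)$

then there exists $X$ in $\mathcal{L}^p$ such that $X_r \to X$ in $\mathcal{L}^p$:

$\|X_r - X\|_p \to 0 \quad (r \to \infty).$

Note. We already know that $\mathcal{L}^p$ is a vector space. Property (a) is important in showing that $\mathcal{L}^p$ can be made into a Banach space $L^p$ by a quotienting technique of the type mentioned at the end of the preceding section.

Proof of (a). We show that $X$ may be chosen to be an almost sure limit of a subsequence $(X_{k_n})$.

Choose a sequence $(k_n : n \in \mathbb{N})$ with $k_n \uparrow \infty$ such that

$(r, s \ge k_n) \Rightarrow \|X_r - X_s\|_p < 2^{-n}.$

Then

$E(|X_{k_{n+1}} - X_{k_n}|) = \|X_{k_{n+1}} - X_{k_n}\|_1 \le \|X_{k_{n+1}} - X_{k_n}\|_p < 2^{-n},$

so that

$E \sum_n |X_{k_{n+1}} - X_{k_n}| < \infty.$

Hence it is almost surely true that the series $\sum (X_{k_{n+1}} - X_{k_n})$ converges (even absolutely!), so that

$\lim X_{k_n}(\omega)$ exists for almost all $\omega$.

Define

$X(\omega) := \limsup X_{k_n}(\omega), \quad \forall \omega.$

Then $X$ is $\mathcal{F}$-measurable, and $X_{k_n} \to X$, a.s.

Suppose that $n \in \mathbb{N}$ and $r \ge k_n$. Then, for $\mathbb{N} \ni t \ge n$,

$E(|X_r - X_{k_t}|^p) = \|X_r - X_{k_t}\|_p^p \le 2^{-np},$

so that on letting $t \uparrow \infty$ and using Fatou's Lemma, we obtain

$E(|X_r - X|^p) \le 2^{-np}.$

Firstly, $X_r - X \in \mathcal{L}^p$, so that $X \in \mathcal{L}^p$. Secondly, we see that, indeed, $X_r \to X$ in $\mathcal{L}^p$. □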
C^
convergence,
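6.11. Orthogonal projection

The result on completeness of $\mathcal{L}^p$ obtained in the previous section has a number of important consequences for probability theory, and it is perhaps as well to develop one of these while Section 6.10 is fresh in your mind.

I hope that you will allow me to present the following result on orthogonal projection as a piece of geometry for now, deferring discussion of its role in the theory of conditional expectation until Chapter 9.

We write $\|\cdot\|$ for $\|\cdot\|_2$ throughout this section.

THEOREM

► Let $\mathcal{K}$ be a vector subspace of $\mathcal{L}^2$ which is complete in that whenever $(V_n)$ is a sequence in $\mathcal{K}$ which has the Cauchy property that

$\sup_{r,s \ge k} \|V_r - V_s\| \to 0 \quad (k \to \infty),$

then there exists a $V$ in $\mathcal{K}$ such that

$\|V_n - V\| \to 0 \quad (n \to \infty).$

Then given $X$ in $\mathcal{L}^2$, there exists $Y$ in $\mathcal{K}$ such that

(i) $\|X - Y\| = \Delta := \inf\{\|X - W\| : W \in \mathcal{K}\},$
(ii) $X - Y \perp Z, \quad \forall Z \in \mathcal{K}.$

Properties (i) and (ii) of $Y$ in $\mathcal{K}$ are equivalent and if $\tilde{Y}$ shares either property (i) or (ii) with $Y$, then

$\|\tilde{Y} - Y\| = 0$ (equivalently, $\tilde{Y} = Y$, a.s.).

Definition. The random variable $Y$ in the theorem is called a version of the orthogonal projection of $X$ onto $\mathcal{K}$. If $\tilde{Y}$ is another version, then $\tilde{Y} = Y$, a.s.

Proof. Choose a sequence $(Y_n)$ in $\mathcal{K}$ such that

$\|X - Y_n\| \to \Delta.$

By the parallelogram law (6.9,k),

$\|X - Y_r\|^2 + \|X - Y_s\|^2 = 2\|X - \tfrac12(Y_r + Y_s)\|^2 + 2\|\tfrac12(Y_r - Y_s)\|^2.$

But $\tfrac12(Y_r + Y_s) \in \mathcal{K}$, so that $\|X - \tfrac12(Y_r + Y_s)\|^2 \ge \Delta^2$. It is now obvious that the sequence $(Y_n)$ has the Cauchy property so that there exists a $Y$ in $\mathcal{K}$ such that

$\|Y_n - Y\| \to 0.$

Since (6.8,b) implies that $\|X - Y\| \le \|X - Y_n\| + \|Y_n - Y\|$, it is clear that

$\|X - Y\| = \Delta.$

For any $Z$ in $\mathcal{K}$, we have $Y + tZ \in \mathcal{K}$ for $t \in \mathbb{R}$, and so

$\|X - Y - tZ\|^2 \ge \|X - Y\|^2,$

whence

$-2t\langle Z, X - Y \rangle + t^2\|Z\|^2 \ge 0.$

This can only be the case for all $t$ of small modulus if

$\langle Z, X - Y \rangle = 0.$ □

Remark. The case to which we shall apply the above theorem is when $\mathcal{K}$ has the form $\mathcal{L}^2(\Omega, \mathcal{G}, P)$ for some sub-σ-algebra $\mathcal{G}$ of $\mathcal{F}$.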
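On a finite sample space the projection theorem can be made completely concrete. In the sketch below, the subspace $\mathcal{K}$ is taken to consist of the random variables that are constant on each block of a partition of the sample space (this corresponds to a sub-σ-algebra generated by the partition); the space, weights, partition and function names are all illustrative assumptions. In this setting the projection is the probability-weighted average of $X$ over each block.

```python
from fractions import Fraction

# A four-point sample space with unequal weights, and a partition into two blocks.
P = [Fraction(1, 8), Fraction(3, 8), Fraction(1, 4), Fraction(1, 4)]
BLOCKS = [[0, 1], [2, 3]]


def inner(U, V):
    """<U, V> = E(UV) on the finite sample space."""
    return sum(p * u * v for p, u, v in zip(P, U, V))


def project(X):
    """Projection of X onto the variables that are constant on each block."""
    Y = [Fraction(0)] * len(X)
    for block in BLOCKS:
        mass = sum(P[w] for w in block)
        average = sum(P[w] * X[w] for w in block) / mass
        for w in block:
            Y[w] = average
    return Y


def indicator(block):
    """Indicator variable of a block; the indicators span the subspace."""
    return [1 if w in block else 0 for w in range(len(P))]
```

For $X = (1, 3, 2, 6)$ the projection is $Y = (5/2, 5/2, 4, 4)$; the residual $X - Y$ is orthogonal to each block indicator, which is property (ii), and $Y$ is at least as close to $X$ as any other element of the subspace, which is property (i).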
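6.12. The 'elementary formula' for expectation

Back to earth! Let $X$ be a random variable. To avoid confusion between different $\mathcal{L}$'s, let us here write $\Lambda_X$ on $(\mathbb{R}, \mathcal{B})$ for the law of $X$:

$\Lambda_X(B) := P(X \in B).$

LEMMA

► Suppose that $h$ is a Borel measurable function from $\mathbb{R}$ to $\mathbb{R}$. Then

$h(X) \in \mathcal{L}^1(\Omega, \mathcal{F}, P)$ if and only if $h \in \mathcal{L}^1(\mathbb{R}, \mathcal{B}, \Lambda_X)$

and then

(a) $E h(X) = \Lambda_X(h) = \int_{\mathbb{R}} h(x)\,\Lambda_X(dx).$

Proof. We simply feed everything into the standard machine. Result (a) is just the definition of $\Lambda_X$ if $h = I_B$ $(B \in \mathcal{B})$. Linearity then shows that (a) is true if $h$ is a simple function on $(\mathbb{R}, \mathcal{B})$. (MON) then implies (a) for $h$ a non-negative function, and linearity allows us to complete the argument. □

Probability density function (pdf)

We say that $X$ has a probability density function (pdf) $f_X$ if there exists a Borel function $f_X : \mathbb{R} \to [0, \infty]$ such that

(b) $P(X \in B) = \int_B f_X(x)\,dx, \quad B \in \mathcal{B}.$

Here we have written $dx$ for what should be $\mathrm{Leb}(dx)$. In the language of Section 5.14, (b) says that $\Lambda_X$ has density $f_X$ relative to Leb:

$\frac{d\Lambda_X}{d\mathrm{Leb}} = f_X.$

The function $f_X$ is only defined almost everywhere: any function a.e. equal to $f_X$ will also satisfy (b) 'and conversely'.

The above Lemma extends to

(c) $E(|h(X)|) < \infty$ if and only if $\int |h(x)| f_X(x)\,dx < \infty$

and then

$E h(X) = \int_{\mathbb{R}} h(x) f_X(x)\,dx.$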
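In the discrete case the 'elementary formula' says that $E h(X)$ may be computed either as a sum over the sample space or as a sum over the values of $X$ weighted by its law. A minimal sketch, using a fair die as sample space and $X(\omega) = \omega \bmod 3$ (both illustrative assumptions):

```python
from collections import defaultdict
from fractions import Fraction

# Sample space: faces of a fair die; X(omega) = omega mod 3 (illustrative choices).
OMEGA = {w: Fraction(1, 6) for w in range(1, 7)}


def X(w):
    return w % 3


def law_of_X():
    """The law Lambda_X: value x -> P(X = x)."""
    law = defaultdict(Fraction)
    for w, p in OMEGA.items():
        law[X(w)] += p
    return dict(law)


def expectation_on_omega(h):
    """E h(X) as an integral over the sample space."""
    return sum(p * h(X(w)) for w, p in OMEGA.items())


def expectation_via_law(h):
    """Lambda_X(h): the same quantity as an integral over the real line."""
    return sum(p * h(x) for x, p in law_of_X().items())
```

Both routes give $E(X^2) = (0 + 1 + 4)/3 = 5/3$ here.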
6.13. Hölder from Jensen

The truncation technique used to prove the Schwarz inequality in Section 6.8 relied on the fact that $P(\Omega) < \infty$. However, the Schwarz inequality is true for any measure space, as is the more general Hölder inequality.

We conclude this chapter with a device (often useful) which yields the Hölder inequality for any $(S, \Sigma, \mu)$ from Jensen's inequality for probability triples.

Let $(S, \Sigma, \mu)$ be a measure space. Suppose that

$p > 1$ and $p^{-1} + q^{-1} = 1.$

Write $f \in \mathcal{L}^p(S, \Sigma, \mu)$ if $f \in m\Sigma$ and $\mu(|f|^p) < \infty$, and in that case define

$\|f\|_p := \{\mu(|f|^p)\}^{1/p}.$

THEOREM

Suppose that $f, g \in \mathcal{L}^p(S, \Sigma, \mu)$, $h \in \mathcal{L}^q(S, \Sigma, \mu)$. Then

►(a) (Hölder's inequality) $fh \in \mathcal{L}^1(S, \Sigma, \mu)$ and

$|\mu(fh)| \le \mu(|fh|) \le \|f\|_p \|h\|_q;$

►(b) (Minkowski's inequality)

$\|f + g\|_p \le \|f\|_p + \|g\|_p.$

Proof of (a). We can obviously restrict attention to the case when $f, h \ge 0$ and $\mu(f^p) > 0$. With the notation of Section 5.14, define

$P := \frac{f^p \mu}{\mu(f^p)},$

so that $P$ is a probability measure on $(S, \Sigma)$. Define

$u(s) := h(s)/f(s)^{p-1}$ if $f(s) > 0$, $\quad 0$ if $f(s) = 0$.

The fact that $P(u)^q \le P(u^q)$ now yields

$\mu(fh) \le \|f\|_p \|h I_{\{f > 0\}}\|_q \le \|f\|_p \|h\|_q.$ □

Proof of (b). Using Hölder's inequality, we have

$\mu(|f + g|^p) \le \mu(|f||f + g|^{p-1}) + \mu(|g||f + g|^{p-1}) \le \|f\|_p A + \|g\|_p A,$

where

$A = \| |f + g|^{p-1} \|_q = \mu(|f + g|^p)^{1/q},$

and (b) follows on rearranging. (The result is non-trivial only if $f, g \in \mathcal{L}^p$, and in that case, the finiteness of $A$ follows from the vector-space property of $\mathcal{L}^p$.) □
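Chapter 7

An Easy Strong Law

7.1. 'Independence means multiply' - again!

THEOREM
Suppose that $X$ and $Y$ are independent RVs, and that $X$ and $Y$ are both in $\mathcal{L}^1$. Then $XY \in \mathcal{L}^1$ and

$E(XY) = E(X)E(Y).$

In particular, if $X$ and $Y$ are independent elements of $\mathcal{L}^2$, then

$\mathrm{Cov}(X, Y) = 0$ and $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).$

Proof. Writing $X = X^+ - X^-$, etc., allows us to reduce the problem to the case when $X \ge 0$ and $Y \ge 0$. This we do.

But then, if $\alpha^{(r)}$ is our familiar staircase function, then

$\alpha^{(r)}(X) = \sum a_i I_{A_i}, \quad \alpha^{(r)}(Y) = \sum b_j I_{B_j}$

where the sums are over finite parameter sets, and where for each $i$ and $j$, $A_i$ (in $\sigma(X)$) is independent of $B_j$ (in $\sigma(Y)$). Hence

$E[\alpha^{(r)}(X)\alpha^{(r)}(Y)] = \sum\sum a_i b_j P(A_i \cap B_j) = \sum\sum a_i b_j P(A_i)P(B_j) = E[\alpha^{(r)}(X)]E[\alpha^{(r)}(Y)].$

Now let $r \uparrow \infty$ and use (MON). □

Remark. Note especially that if $X$ and $Y$ are independent then $X \in \mathcal{L}^1$ and $Y \in \mathcal{L}^1$ imply that $XY \in \mathcal{L}^1$. This is not necessarily true when $X$ and $Y$ are not independent, and we need the inequalities of Schwarz, Hölder, etc. It is important that independence obviates the need for such inequalities.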
note
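7.2. Strong Law - first version

The following result covers many cases of importance. You should note that though it imposes a 'finite 4th moment' condition, it makes no assumption about identical distributions for the $(X_n)$ sequence. It is remarkable that so fine a result has so simple a proof.

► THEOREM
Suppose that $X_1, X_2, \ldots$ are independent random variables, and that for some constant $K$ in $[0, \infty)$,
    $E(X_k) = 0$,   $E(X_k^4) \le K$,   $\forall k$.
Let $S_n = X_1 + X_2 + \cdots + X_n$. Then
    $P(n^{-1}S_n \to 0) = 1$,
or again, $S_n/n \to 0$ (a.s.).

Proof. We have
    $E(S_n^4) = E[(X_1 + X_2 + \cdots + X_n)^4] = E\left(\sum_k X_k^4 + 6 \sum_{i<j} X_i^2 X_j^2\right)$,
because, for distinct $i$, $j$, $k$ and $l$,
    $E(X_i X_j^3) = E(X_i X_j^2 X_k) = E(X_i X_j X_k X_l) = 0$,
using independence plus the fact that $E(X_i) = 0$. [Note that, for example, the fact that $E(X_j^4) < \infty$ implies that $E(|X_j|^3) < \infty$, by the 'monotonicity of $\mathcal{L}^p$ norms' result in Section 6.7. Thus $X_i$ and $X_j^3$ are in $\mathcal{L}^1$.]

We know from Section 6.7 that
    $[E(X_i^2)]^2 \le E(X_i^4) \le K$,   $\forall i$.
Hence, using independence again, for $i \ne j$,
    $E(X_i^2 X_j^2) = E(X_i^2)E(X_j^2) \le K$.
Thus
    $E(S_n^4) \le nK + 3n(n-1)K \le 3Kn^2$,
and
    $E \sum_n (S_n/n)^4 \le 3K \sum_n n^{-2} < \infty$,
so that $\sum_n (S_n/n)^4 < \infty$, a.s., and
    $S_n/n \to 0$, a.s. □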
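The fourth-moment bound in the proof can be checked exactly in a simple special case. The following is a minimal sketch, assuming IID fair $\pm 1$ steps (so that $K = 1$); the function name `fourth_moment_of_sum` is ours, not the text's. In this case $E(S_n^4) = nK + 3n(n-1)K$ holds with equality.

```python
from fractions import Fraction
from math import comb

def fourth_moment_of_sum(n):
    """Exact E(S_n^4) for S_n = X_1 + ... + X_n with IID fair +/-1 steps."""
    # S_n = 2*k - n when exactly k of the n steps equal +1.
    return sum(Fraction(comb(n, k), 2**n) * (2 * k - n) ** 4 for k in range(n + 1))

K = 1  # E(X_k^4) = 1 for +/-1 steps (an assumption of this example)
for n in (1, 2, 5, 10, 50):
    exact = fourth_moment_of_sum(n)
    middle_bound = n * K + 3 * n * (n - 1) * K
    print(n, exact, middle_bound, 3 * K * n**2)
```

The printed columns are $E(S_n^4)$, the intermediate bound $nK + 3n(n-1)K$, and the final bound $3Kn^2$.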
Corollary. If the condition $E(X_k) = 0$ in the theorem is replaced by $E(X_k) = \mu$ for some constant $\mu$, then the theorem holds with $n^{-1}S_n \to \mu$ (a.s.) as its conclusion.

Proof. It is obviously a case of applying the theorem to the sequence $(Y_k)$, where $Y_k := X_k - \mu$. But we need to know that
(a)    $\sup_k E(Y_k^4) < \infty$.
This is obvious from Minkowski's inequality
    $\|X_k - \mu\|_4 \le \|X_k\|_4 + |\mu|$
(the constant function $\mu$ on $\Omega$ having $\mathcal{L}^4$ norm $|\mu|$). But we can also prove (a) immediately by the elementary inequality (6.7,b). □

The next topics indicate a different use of variance.
7.3. Chebyshev's inequality

As you know this says that for $c \ge 0$, and $X \in \mathcal{L}^2$,
    $c^2 P(|X - \mu| > c) \le \mathrm{Var}(X)$,   $\mu := E(X)$;
and it is obvious.

Example. Consider a sequence $(X_n)$ of IID RVs with values in $\{0,1\}$ with
    $p = P(X_n = 1) = 1 - P(X_n = 0)$.
Then $E(X_n) = p$ and $\mathrm{Var}(X_n) = p(1-p) \le \frac14$. Thus (using Theorem 7.1) $S_n := X_1 + X_2 + \cdots + X_n$ has expectation $np$ and variance $np(1-p) \le n/4$, and we have
    $E(n^{-1}S_n) = p$,   $\mathrm{Var}(n^{-1}S_n) = n^{-2}\mathrm{Var}(S_n) \le 1/(4n)$.
Chebyshev's inequality yields
    $P(|n^{-1}S_n - p| > \delta) \le 1/(4n\delta^2)$.
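To see how conservative the Chebyshev bound is, the tail probability in the Example can be computed exactly from the binomial law. This is an illustrative sketch only; the values of `p` and `delta` are arbitrary choices of ours.

```python
from fractions import Fraction
from math import comb

def tail_probability(n, p, delta):
    """Exact P(|S_n/n - p| > delta) for S_n ~ Binomial(n, p); p, delta are Fractions."""
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) > delta:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

p, delta = Fraction(3, 10), Fraction(1, 10)  # hypothetical parameters
for n in (10, 50, 200):
    exact = tail_probability(n, p, delta)
    bound = Fraction(1) / (4 * n * delta**2)  # the bound 1/(4 n delta^2)
    print(n, float(exact), float(bound))
```

For small `n` the bound exceeds 1 and says nothing; it becomes informative only once $n > 1/(4\delta^2)$.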
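7.4. Weierstrass approximation theorem

If $f$ is a continuous function on $[0,1]$ and $\varepsilon > 0$, then there exists a polynomial $B$ such that
    $\sup_{x \in [0,1]} |B(x) - f(x)| \le \varepsilon$.

Proof. Let $(X_k)$, $S_n$, etc. be as in the Example in Section 7.3. You are well aware that
    $P[S_n = k] = \binom{n}{k} p^k (1-p)^{n-k}$,   $0 \le k \le n$.
Hence
    $B_n(p) := E f(n^{-1}S_n) = \sum_{k=0}^{n} f(n^{-1}k) \binom{n}{k} p^k (1-p)^{n-k}$,
the 'B' being in deference to Bernstein.

Now $f$ is bounded on $[0,1]$: $|f(y)| \le K$, $\forall y \in [0,1]$. Also, $f$ is uniformly continuous on $[0,1]$: for our given $\varepsilon > 0$, there exists $\delta > 0$ such that
(a)    $|x - y| \le \delta$ implies that $|f(x) - f(y)| < \frac12\varepsilon$.
Now, for $p \in [0,1]$,
    $|B_n(p) - f(p)| = |E\{f(n^{-1}S_n) - f(p)\}|$.
Let us write $Y_n := |f(n^{-1}S_n) - f(p)|$ and $Z_n := |n^{-1}S_n - p|$. Then $Z_n \le \delta$ implies that $Y_n < \frac12\varepsilon$, and we have
    $|B_n(p) - f(p)| \le E(Y_n) = E(Y_n; Z_n \le \delta) + E(Y_n; Z_n > \delta)$
        $\le \frac12\varepsilon P(Z_n \le \delta) + 2K P(Z_n > \delta) \le \frac12\varepsilon + 2K/(4n\delta^2)$.
Earlier, we chose a fixed $\delta$ at (a). We now choose $n$ so that
    $2K/(4n\delta^2) < \frac12\varepsilon$.
Then $|B_n(p) - f(p)| < \varepsilon$, for all $p$ in $[0,1]$. □

Now do Exercise E7.1 on inverting Laplace transforms.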
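The polynomial $B_n$ of the proof is easy to evaluate numerically. The sketch below, assuming the test function $f(x) = |x - \frac12|$ and an evenly spaced grid (both our choices), estimates the sup-norm error for a few values of $n$; the names `bernstein` and `sup_error` are illustrative.

```python
from math import comb

def bernstein(f, n, p):
    """B_n(p) = E f(S_n/n) = sum_k f(k/n) C(n,k) p^k (1-p)^(n-k)."""
    return sum(f(k / n) * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))

def sup_error(f, n, grid_size=101):
    """Largest |B_n(p) - f(p)| over an evenly spaced grid in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(bernstein(f, n, p) - f(p)) for p in grid)

def f(x):
    # Continuous on [0, 1] but not differentiable at 1/2.
    return abs(x - 0.5)

for n in (10, 50, 200):
    print(n, sup_error(f, n))
```

The error on the grid shrinks slowly (roughly like $n^{-1/2}$ for this $f$), which matches the Chebyshev-based rate in the proof rather than anything faster.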
Chapter 8
Product Measure

8.0. Introduction and advice

One lesson of practical importance from this chapter is that an 'interchange of order of integration' result
    $\int_{S_1} \left( \int_{S_2} f(s_1, s_2)\, \mu_2(ds_2) \right) \mu_1(ds_1) = \int_{S_2} \left( \int_{S_1} f(s_1, s_2)\, \mu_1(ds_1) \right) \mu_2(ds_2)$
is always valid (both sides possibly being infinite) if $f \ge 0$; and is valid for 'signed' $f$ (with both repeated integrals finite) provided that one (then the other) of the integrals of absolute values:
    $\int \int |f(s_1, s_2)|\, \mu_1(ds_1)\, \mu_2(ds_2)$
is finite.

It is a good idea to read through the chapter to get the ideas, but you are strongly recommended to postpone serious study of the contents until a later stage. Except for the matter of infinite products, it is all a case of relentless use of either the standard machine or the Monotone-Class Theorem to prove intuitively obvious things made to look complicated by the notation. When you do begin a serious study, it is important to appreciate when the more subtle Monotone-Class Theorem has to be used instead of the standard machine.
8.1. Product measurable structure, $\Sigma_1 \times \Sigma_2$

Let $(S_1, \Sigma_1)$ and $(S_2, \Sigma_2)$ be measurable spaces. Let $S$ denote the Cartesian product $S := S_1 \times S_2$. For $i = 1, 2$, let $\rho_i$ denote the $i$th coordinate map, so that
    $\rho_1(s_1, s_2) := s_1$,   $\rho_2(s_1, s_2) := s_2$.
► The fundamental definition of $\Sigma = \Sigma_1 \times \Sigma_2$ is as the $\sigma$-algebra
(a)    $\Sigma = \sigma(\rho_1, \rho_2)$.
Thus $\Sigma$ is generated by the sets of the form
    $\rho_1^{-1}(B_1) = B_1 \times S_2$   $(B_1 \in \Sigma_1)$
together with sets of the form
    $\rho_2^{-1}(B_2) = S_1 \times B_2$   $(B_2 \in \Sigma_2)$.
Generally, a product $\sigma$-algebra is generated by Cartesian products in which one factor is allowed to vary over the $\sigma$-algebra corresponding to that factor, and all other factors are whole spaces. In the case of our product of two factors, we have
(b)    $(B_1 \times S_2) \cap (S_1 \times B_2) = B_1 \times B_2$,
and you can easily check that
(c)    $\mathcal{I} = \{B_1 \times B_2 : B_i \in \Sigma_i\}$
is a $\pi$-system generating $\Sigma = \Sigma_1 \times \Sigma_2$. A similar remark would apply to a countable product $\prod \Sigma_n$, but you can see that, since we may only take countable intersections in analogues of (b), products of uncountable families of $\sigma$-algebras cause problems. The fundamental definition analogous to (a) still works.

LEMMA
(d) Let $\mathcal{H}$ denote the class of functions $f : S \to \mathbb{R}$ which are in $b\Sigma$ and which are such that
    for each $s_1$ in $S_1$, the map $s_2 \mapsto f(s_1, s_2)$ is $\Sigma_2$-measurable on $S_2$,
    for each $s_2$ in $S_2$, the map $s_1 \mapsto f(s_1, s_2)$ is $\Sigma_1$-measurable on $S_1$.
Then $\mathcal{H} = b\Sigma$.

Proof. It is clear that if $A \in \mathcal{I}$, then $I_A \in \mathcal{H}$. Verification that the conditions of the Monotone-Class Theorem 3.14 are satisfied is straightforward. Since $\Sigma = \sigma(\mathcal{I})$, the result follows. □
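8.2. Product measure, Fubini's Theorem

We continue with the notation of the preceding Section. We suppose that for $i = 1, 2$, $\mu_i$ is a finite measure on $(S_i, \Sigma_i)$. We know from the preceding Section that for $f \in b\Sigma$, we may define the integrals
    $I_1^f(s_1) := \int_{S_2} f(s_1, s_2)\, \mu_2(ds_2)$,   $I_2^f(s_2) := \int_{S_1} f(s_1, s_2)\, \mu_1(ds_1)$.

LEMMA
Let $\mathcal{H}$ be the class of elements $f$ in $b\Sigma$ such that the following property holds:
    $I_1^f(\cdot) \in b\Sigma_1$ and $I_2^f(\cdot) \in b\Sigma_2$ and
    $\int_{S_1} I_1^f(s_1)\, \mu_1(ds_1) = \int_{S_2} I_2^f(s_2)\, \mu_2(ds_2)$.
Then $\mathcal{H} = b\Sigma$.

Proof. If $A \in \mathcal{I}$, then, trivially, $I_A \in \mathcal{H}$. Verification of the conditions of Monotone-Class Theorem 3.14 is straightforward. □

For $F \in \Sigma$ with indicator function $f := I_F$, we now define
    $\mu(F) := \int_{S_1} I_1^f(s_1)\, \mu_1(ds_1) = \int_{S_2} I_2^f(s_2)\, \mu_2(ds_2)$.

►► Fubini's Theorem
The set function $\mu$ is a measure on $(S, \Sigma)$ called the product measure of $\mu_1$ and $\mu_2$, and we write $\mu = \mu_1 \times \mu_2$ and
    $(S, \Sigma, \mu) = (S_1, \Sigma_1, \mu_1) \times (S_2, \Sigma_2, \mu_2)$.
Moreover, $\mu$ is the unique measure on $(S, \Sigma)$ for which
(a)    $\mu(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$,   $A_i \in \Sigma_i$.
If $f \in (m\Sigma)^+$, then with the obvious definitions of $I_1^f$, $I_2^f$, we have
(b)    $\mu(f) = \int_{S_1} I_1^f(s_1)\, \mu_1(ds_1) = \int_{S_2} I_2^f(s_2)\, \mu_2(ds_2)$,   in $[0, \infty]$.
If $f \in m\Sigma$ and $\mu(|f|) < \infty$, then equation (b) is valid (with all terms in $\mathbb{R}$).

Proof. The fact that $\mu$ is a measure is a consequence of linearity and (MON). The fact that $\mu$ is then uniquely specified by (a) is obvious from Uniqueness Lemma 1.6 and the fact that $\sigma(\mathcal{I}) = \Sigma$. Result (b) is automatic for $f = I_A$, where $A \in \mathcal{I}$. The Monotone-Class Theorem shows that it is therefore valid for $f \in b\Sigma$, and in particular for $f$ in the $SF^+$ space for $(S, \Sigma, \mu)$. (MON) then shows that it is valid for $f \in (m\Sigma)^+$; and linearity shows that (b) is valid if $\mu(|f|) < \infty$. □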
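On finite sets, equation (b) reduces to interchanging two finite sums, which can be verified directly. The following sketch uses two small, made-up finite measure spaces (the point masses and the function `f` are hypothetical) and compares both repeated integrals with the integral against the product measure.

```python
from fractions import Fraction

# Two finite measure spaces: point -> mass (finite measures, not probabilities).
mu1 = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
mu2 = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}

def f(s1, s2):
    """An arbitrary bounded (signed) function on the product space."""
    return (1 if s1 == "a" else -2) * (s2 + 1) ** 2

def I1(s1):
    """Integrate out the second coordinate."""
    return sum(f(s1, s2) * m2 for s2, m2 in mu2.items())

def I2(s2):
    """Integrate out the first coordinate."""
    return sum(f(s1, s2) * m1 for s1, m1 in mu1.items())

first_then_second = sum(I1(s1) * m1 for s1, m1 in mu1.items())
second_then_first = sum(I2(s2) * m2 for s2, m2 in mu2.items())
product_integral = sum(
    f(s1, s2) * m1 * m2 for s1, m1 in mu1.items() for s2, m2 in mu2.items()
)
print(first_then_second, second_then_first, product_integral)
```

Exact rational arithmetic is used so that the three values can be compared with `==` rather than up to rounding.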
► Extension
All of Fubini's Theorem will work if the $(S_i, \Sigma_i, \mu_i)$ are $\sigma$-finite measure spaces:
    We have a unique measure $\mu$ on $(S, \Sigma)$ satisfying (a), etc., etc.
We can prove this by breaking up $\sigma$-finite spaces into countable unions of disjoint finite blocks.
Warning
The $\sigma$-finiteness condition cannot be dropped. The standard example is the following. For $i = 1, 2$, take $S_i = [0,1]$ and $\Sigma_i = \mathcal{B}[0,1]$. Let $\mu_1$ be Lebesgue measure and let $\mu_2$ just count the number of elements in a set. Let $F$ be the diagonal $\{(x, y) \in S_1 \times S_2 : x = y\}$, and let $f := I_F$. Then (check!) $F \in \Sigma$, but
    $I_1^f(s_1) \equiv 1$,   $I_2^f(s_2) \equiv 0$,
and result (b) fails, stating that $1 = 0$.

Something to think about
So, our insistence on beginning with bounded functions on products of finite measures was necessary. Perhaps it is worth emphasizing that in our standard machine, things work because we can use indicator functions of any set in our $\sigma$-algebra, whereas when we can only use indicator functions of sets in a $\pi$-system, we have to use the Monotone-Class Theorem. We cannot approximate the set $F$ in the Warning example as
    $F = \uparrow \lim F_n$,
where each $F_n$ is a finite union of 'rectangles' $A_1 \times A_2$, each $A_i$ being in $\mathcal{B}[0,1]$.
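A simple application
Suppose that $X$ is a non-negative random variable on $(\Omega, \mathcal{F}, P)$. Consider the measure $\mu := P \times \mathrm{Leb}$ on $(\Omega, \mathcal{F}) \times ([0, \infty), \mathcal{B}[0, \infty))$. Let
    $A := \{(\omega, x) : 0 \le x \le X(\omega)\}$,   $h := I_A$.
Note that $A$ is the 'region under the graph of $X$'. Then
    $I_1^h(\omega) = X(\omega)$,   $I_2^h(x) = P(X \ge x)$.
Thus
(c)    $\mu(A) = E(X) = \int_{[0,\infty)} P(X \ge x)\, dx$,
$dx$ denoting $\mathrm{Leb}(dx)$ as usual. Thus we have obtained one of the well-known formulae for $E(X)$ and also interpreted the integral $E(X)$ as 'area under the graph of $X$'.

Note. It is perhaps worth remarking that the Monotone-Class Theorem, the Fatou Lemma and the reverse Fatou Lemma for functions amount to the corresponding results for sets applied to regions under graphs.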
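Formula (c) can be checked exactly when $X$ takes finitely many values, because $x \mapsto P(X \ge x)$ is then a step function. This is a minimal sketch with a made-up law for $X$; the helper names are ours.

```python
from fractions import Fraction

# Hypothetical law of a non-negative random variable X: value -> probability.
law = {Fraction(0): Fraction(1, 4), Fraction(1, 2): Fraction(1, 4), Fraction(3): Fraction(1, 2)}

def expectation(law):
    """E(X) computed directly as a weighted sum of values."""
    return sum(x * p for x, p in law.items())

def integral_of_tail(law):
    """Exact integral over [0, oo) of the step function x -> P(X >= x)."""
    total, previous = Fraction(0), Fraction(0)
    for v in sorted(law):
        # On the interval (previous, v], P(X >= x) equals P(X >= v).
        tail = sum(p for x, p in law.items() if x >= v)
        total += (v - previous) * tail
        previous = v
    return total

print(expectation(law), integral_of_tail(law))
```

Both quantities are the 'area under the graph of $X$', computed by slicing vertically (by $\omega$) and horizontally (by $x$) respectively.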
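8.3. Joint laws, joint pdfs

Let $X$ and $Y$ be two random variables. The (joint) law $\mathcal{L}_{X,Y}$ of the pair $(X, Y)$ is the map
    $\mathcal{L}_{X,Y} : \mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R}) \to [0,1]$
defined by
    $\mathcal{L}_{X,Y}(\Gamma) := P[(X, Y) \in \Gamma]$.
The system $\{(-\infty, x] \times (-\infty, y] : x, y \in \mathbb{R}\}$ is a $\pi$-system which generates $\mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R})$. Hence $\mathcal{L}_{X,Y}$ is completely determined by the joint distribution function $F_{X,Y}$ of $X$ and $Y$ which is defined via
    $F_{X,Y}(x, y) := P(X \le x; Y \le y)$.
We say that $X$ and $Y$ have joint probability density function (joint pdf) $f_{X,Y}$ on $\mathbb{R}^2$ if for $\Gamma \in \mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R})$,
    $P[(X, Y) \in \Gamma] = \int_\Gamma f_{X,Y}(z)\, \mu(dz) = \int_{\mathbb{R}} \int_{\mathbb{R}} I_\Gamma(x, y) f_{X,Y}(x, y)\, dx\, dy$,
where $\mu = \mathrm{Leb} \times \mathrm{Leb}$, etc., etc. (Fubini's Theorem being used in the last step.) Fubini's Theorem further shows that
    $f_X(x) := \int_{\mathbb{R}} f_{X,Y}(x, y)\, dy$
acts as a pdf for $X$ (Section 6.12), etc., etc. You don't need me to tell you any more of this sort of thing.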
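8.4. Independence and product measure

Let $X$ and $Y$ be two random variables with laws $\mathcal{L}_X$, $\mathcal{L}_Y$ respectively and distribution functions $F_X$, $F_Y$ respectively. Then the following three statements are equivalent:
(i)    $X$ and $Y$ are independent;
(ii)   $\mathcal{L}_{X,Y} = \mathcal{L}_X \times \mathcal{L}_Y$;
(iii)  $F_{X,Y}(x, y) = F_X(x)F_Y(y)$;
moreover, if $(X, Y)$ has a joint pdf $f_{X,Y}$ then each of (i)-(iii) is equivalent to
(iv)   $f_{X,Y}(x, y) = f_X(x)f_Y(y)$ for $\mathrm{Leb} \times \mathrm{Leb}$ almost every $(x, y)$.
You do not wish to know more about this either.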
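8.5. $\mathcal{B}(\mathbb{R})^n = \mathcal{B}(\mathbb{R}^n)$

Here again, things are nice and tidy provided we work with finite or countable products, but different if we work with uncountable products.

$\mathcal{B}(\mathbb{R}^n)$ is constructed from the topological space $\mathbb{R}^n$. Now, if $\rho_i : \mathbb{R}^n \to \mathbb{R}$ is the $i$th coordinate map:
    $\rho_i(x_1, x_2, \ldots, x_n) := x_i$,
then $\rho_i$ is continuous, and hence $\mathcal{B}(\mathbb{R}^n)$-measurable. Hence
    $\mathcal{B}^n := \mathcal{B}(\mathbb{R})^n = \sigma(\rho_i : 1 \le i \le n) \subseteq \mathcal{B}(\mathbb{R}^n)$.
On the other hand, $\mathcal{B}(\mathbb{R}^n)$ is generated by the open subsets of $\mathbb{R}^n$, and every such open subset is a countable union of open 'hypercubes' of the form
    $\prod_{1 \le k \le n} (a_k, b_k)$,
and such products are in $\mathcal{B}(\mathbb{R})^n$. Hence, $\mathcal{B}(\mathbb{R}^n) = \mathcal{B}(\mathbb{R})^n$.

In probability theory it is almost always product structures $\mathcal{B}^n$ which feature, rather than $\mathcal{B}(\mathbb{R}^n)$. See Section 8.8.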
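8.6. The n-fold extension of Fubini's Theorem

So far in this chapter we have studied the product of two measure spaces. The extension to $n$-fold products is something again familiar in analogous contexts.

8.7. Infinite products of probability triples

This topic is not a trivial extension of previous results. We concentrate on a restricted context, though an important one; extension to infinite products of arbitrary probability triples is then a purely routine exercise.

Canonical model for a sequence of independent RVs
Let $(\Lambda_n : n \in \mathbb{N})$ be a sequence of probability measures on $(\mathbb{R}, \mathcal{B})$. We already know from the coin-tossing trickery of Section 4.6 that we can construct a sequence $(X_n)$ of independent RVs, $X_n$ having law $\Lambda_n$. Here is a more elegant and systematic way of doing this.

THEOREM
Let $(\Lambda_n : n \in \mathbb{N})$ be a sequence of probability measures on $(\mathbb{R}, \mathcal{B})$. Define
    $\Omega = \prod_{n \in \mathbb{N}} \mathbb{R}$
so that a typical element $\omega$ of $\Omega$ is a sequence $(\omega_n)$ in $\mathbb{R}$. Define
    $X_n : \Omega \to \mathbb{R}$,   $X_n(\omega) := \omega_n$,
and let $\mathcal{F} := \sigma(X_n : n \in \mathbb{N})$. Then there exists a unique probability measure $P$ on $(\Omega, \mathcal{F})$ such that for $r \in \mathbb{N}$ and $B_1, B_2, \ldots, B_r \in \mathcal{B}$,
(a)    $P\left( \left( \prod_{1 \le k \le r} B_k \right) \times \prod_{k > r} \mathbb{R} \right) = \prod_{1 \le k \le r} \Lambda_k(B_k)$.
We write
    $(\Omega, \mathcal{F}, P) = \prod_{n \in \mathbb{N}} (\mathbb{R}, \mathcal{B}, \Lambda_n)$.
Then the sequence $(X_n : n \in \mathbb{N})$ is a sequence of independent RVs on $(\Omega, \mathcal{F}, P)$, $X_n$ having law $\Lambda_n$.

Remarks. (i) The uniqueness of $P$ follows in the usual way from Lemma 1.6, because product sets of the form which appear on the left-hand side of (a) form a $\pi$-system generating $\mathcal{F}$.
(ii) We could rewrite (a) more neatly as
    $P\left( \prod_{k \in \mathbb{N}} B_k \right) = \prod_{k \in \mathbb{N}} \Lambda_k(B_k)$.
To see this, use the monotone-convergence property (1.10,b) of measures.

Proof of the theorem is deferred to the Appendix to Chapter 9.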
8.8. Technical note on the existence of joint laws

Let $(\Omega, \mathcal{F})$, $(S_1, \Sigma_1)$ and $(S_2, \Sigma_2)$ be measurable spaces. For $i = 1, 2$, let $X_i : \Omega \to S_i$ be such that $X_i^{-1} : \Sigma_i \to \mathcal{F}$. Define
    $S := S_1 \times S_2$,   $\Sigma := \Sigma_1 \times \Sigma_2$,   $X(\omega) := (X_1(\omega), X_2(\omega)) \in S$.
Then (Exercise) $X^{-1} : \Sigma \to \mathcal{F}$, so that $X$ is an $(S, \Sigma)$-valued random variable, and if $P$ is a probability measure on $\Omega$, we can talk about the law $\mu$ of $X$ (equals the joint law of $X_1$ and $X_2$) on $(S, \Sigma)$: $\mu = P \circ X^{-1}$ on $\Sigma$.

Suppose now that $S_1$ and $S_2$ are metrizable spaces and that $\Sigma_i = \mathcal{B}(S_i)$ $(i = 1, 2)$. Then $S$ is a metrizable space under the product topology. If $S_1$ and $S_2$ are separable, then $\Sigma = \mathcal{B}(S)$, and there is no 'conflict'. However, if $S_1$ and $S_2$ are not separable, then $\mathcal{B}(S)$ may be strictly larger than $\Sigma$, $X$ need not be an $(S, \mathcal{B}(S))$-valued random variable, and the joint law of $X_1$ and $X_2$ need not exist on $(S, \mathcal{B}(S))$.

It is perhaps as well to be warned of such things. Note that the separability of $\mathbb{R}$ was used in proving that $\mathcal{B}(\mathbb{R}^n) \subseteq \mathcal{B}^n$ in Section 8.5.
PART B: MARTINGALE THEORY

Chapter 9
Conditional Expectation

9.1. A motivating example

Suppose that $(\Omega, \mathcal{F}, P)$ is a probability triple and that $X$ and $Z$ are random variables,
    $X$ taking the distinct values $x_1, x_2, \ldots, x_m$,
    $Z$ taking the distinct values $z_1, z_2, \ldots, z_n$.
Elementary conditional probability:
    $P(X = x_i \mid Z = z_j) := P(X = x_i; Z = z_j)/P(Z = z_j)$
and elementary conditional expectation:
    $E(X \mid Z = z_j) = \sum_i x_i P(X = x_i \mid Z = z_j)$
are familiar to you. The random variable $Y = E(X|Z)$, the conditional expectation of $X$ given $Z$, is defined as follows:
(a)    if $Z(\omega) = z_j$, then $Y(\omega) := E(X \mid Z = z_j) =: y_j$ (say).
It proves to be very advantageous to look at this idea in a new way. 'Reporting to us the value of $Z(\omega)$' amounts to partitioning $\Omega$ into 'Z-atoms' on which $Z$ is constant:

[Figure: $\Omega$ partitioned into the Z-atoms $\{Z = z_1\}$, $\{Z = z_2\}$, ..., $\{Z = z_n\}$.]

The $\sigma$-algebra $\mathcal{G} = \sigma(Z)$ generated by $Z$ consists of sets $\{Z \in B\}$, $B \in \mathcal{B}$, and therefore consists precisely of the $2^n$ possible unions of the $n$ Z-atoms. It is clear from (a) that $Y$ is constant on Z-atoms, or, to put it better,
(b)    $Y$ is $\mathcal{G}$-measurable.
Next, since $Y$ takes the constant value $y_j$ on the Z-atom $\{Z = z_j\}$, we have:
    $\int_{\{Z = z_j\}} Y\, dP = y_j P(Z = z_j) = \sum_i x_i P(X = x_i \mid Z = z_j) P(Z = z_j)$
        $= \sum_i x_i P(X = x_i; Z = z_j) = \int_{\{Z = z_j\}} X\, dP$.
If we write $G_j = \{Z = z_j\}$, this says $E(Y I_{G_j}) = E(X I_{G_j})$. Since for every $G$ in $\mathcal{G}$, $I_G$ is a sum of $I_{G_j}$'s, we have $E(Y I_G) = E(X I_G)$, or
(c)    $\int_G Y\, dP = \int_G X\, dP$,   $\forall G \in \mathcal{G}$.
Results (b) and (c) suggest the central definition of modern probability.
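The construction in this Section is entirely finite, so properties (b) and (c) can be verified by enumeration. The sketch below uses a small, hypothetical four-point sample space; the names `space`, `conditional_expectation` and `integral` are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite sample space: omega -> (probability, X(omega), Z(omega)).
space = {
    "w1": (Fraction(1, 6), 1, "z1"),
    "w2": (Fraction(1, 3), 4, "z1"),
    "w3": (Fraction(1, 4), 2, "z2"),
    "w4": (Fraction(1, 4), 6, "z2"),
}

def conditional_expectation(space):
    """Return Y = E(X|Z) as a dict omega -> y_j; Y is constant on each Z-atom."""
    mass, weighted = {}, {}
    for p, x, z in space.values():
        mass[z] = mass.get(z, 0) + p
        weighted[z] = weighted.get(z, 0) + x * p
    return {w: weighted[z] / mass[z] for w, (p, x, z) in space.items()}

def integral(values, G):
    """Integral over the event G of the random variable omega -> values[omega]."""
    return sum(space[w][0] * values[w] for w in G)

X = {w: x for w, (p, x, z) in space.items()}
Y = conditional_expectation(space)

atoms = {}
for w, (p, x, z) in space.items():
    atoms.setdefault(z, []).append(w)

# Every G in sigma(Z) is a union of Z-atoms; check (c) on all 2^n of them.
for r in range(len(atoms) + 1):
    for chosen in combinations(atoms.values(), r):
        G = [w for atom in chosen for w in atom]
        assert integral(Y, G) == integral(X, G)
print(Y)
```

Note that $Y$ and $X$ differ pointwise; they agree only after integration over sets in $\sigma(Z)$, which is exactly what (c) asserts.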
►►► 9.2. Fundamental Theorem and Definition (Kolmogorov, 1933)

Let $(\Omega, \mathcal{F}, P)$ be a triple, and $X$ a random variable with $E(|X|) < \infty$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. Then there exists a random variable $Y$ such that
(a)    $Y$ is $\mathcal{G}$ measurable,
(b)    $E(|Y|) < \infty$,
(c)    for every set $G$ in $\mathcal{G}$ (equivalently, for every set $G$ in some $\pi$-system which contains $\Omega$ and generates $\mathcal{G}$), we have
           $\int_G Y\, dP = \int_G X\, dP$,   $\forall G \in \mathcal{G}$.
Moreover, if $\tilde{Y}$ is another RV with these properties then $\tilde{Y} = Y$, a.s., that is, $P[Y = \tilde{Y}] = 1$. A random variable $Y$ with properties (a)-(c) is called a version of the conditional expectation $E(X|\mathcal{G})$ of $X$ given $\mathcal{G}$, and we write $Y = E(X|\mathcal{G})$, a.s.

Two versions agree a.s., and when one has become familiar with the concept, one identifies different versions and speaks of the conditional expectation $E(X|\mathcal{G})$. But you should think about the 'a.s.' throughout this course.

The theorem is proved in Section 9.5, except for the $\pi$-system assertion which you will find at Exercise E9.1.

► Notation. We often write $E(X|Z)$ for $E(X|\sigma(Z))$, $E(X|Z_1, Z_2, \ldots)$ for $E(X|\sigma(Z_1, Z_2, \ldots))$, etc. That this is consistent with the elementary usage is apparent from Section 9.6 below.
9.3. The intuitive meaning

An experiment has been performed. The only information available to you regarding which sample point $\omega$ has been chosen is the set of values $Z(\omega)$ for every $\mathcal{G}$-measurable random variable $Z$. Then $Y(\omega) = E(X|\mathcal{G})(\omega)$ is the expected value of $X(\omega)$ given this information. The 'a.s.' ambiguity in the definition is something one has to live with in general, but it is sometimes possible to choose a canonical version of $E(X|\mathcal{G})$.

Note that if $\mathcal{G}$ is the trivial $\sigma$-algebra $\{\emptyset, \Omega\}$ (which contains no information), then $E(X|\mathcal{G})(\omega) = E(X)$ for all $\omega$.
9.4. Conditional expectation as least-squares-best predictor

►► If $E(X^2) < \infty$, then the conditional expectation $Y = E(X|\mathcal{G})$ is a version of the orthogonal projection (see Section 6.11) of $X$ onto $\mathcal{L}^2(\Omega, \mathcal{G}, P)$. Hence, $Y$ is the least-squares-best $\mathcal{G}$-measurable predictor of $X$: amongst all $\mathcal{G}$-measurable functions (i.e. amongst all predictors which can be computed from the available information), $Y$ minimizes
    $E[(Y - X)^2]$.
No surprise then that conditional expectation (and the martingale theory which develops it) is crucial in filtering and control of space-ships, of industrial processes, or whatever.
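The least-squares property can be seen concretely on a finite space: for any $\sigma(Z)$-measurable predictor $W$, the mean-square error splits as $E[(X-W)^2] = E[(X-Y)^2] + E[(Y-W)^2]$, where $Y = E(X|Z)$. The sketch below checks this identity exactly on a made-up four-point model; the competitor `W` is an arbitrary choice of ours.

```python
from fractions import Fraction

# Hypothetical finite model: (probability, X, label of the Z-atom) per sample point.
points = [
    (Fraction(1, 6), 1, 0), (Fraction(1, 3), 4, 0),
    (Fraction(1, 4), 2, 1), (Fraction(1, 4), 6, 1),
]

def mse(predictor):
    """E[(X - W)^2] for the sigma(Z)-measurable W given by predictor[atom]."""
    return sum(p * (x - predictor[z]) ** 2 for p, x, z in points)

def conditional_mean():
    """The value of E(X|Z) on each Z-atom."""
    out = {}
    for atom in {z for _, _, z in points}:
        mass = sum(p for p, _, z in points if z == atom)
        out[atom] = sum(p * x for p, x, z in points if z == atom) / mass
    return out

Y = conditional_mean()
W = {0: Fraction(5, 2), 1: Fraction(9, 2)}  # some other sigma(Z)-measurable predictor
gap = sum(p * (Y[z] - W[z]) ** 2 for p, x, z in points)  # E[(Y - W)^2]
print(mse(Y), mse(W), gap)
```

The identity is just Pythagoras' theorem for the orthogonal projection onto $\mathcal{L}^2(\Omega, \sigma(Z), P)$.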
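9.5. Proof of Theorem 9.2

The standard way to prove Theorem 9.2 (see Section 14.14) is via the Radon-Nikodým theorem, described in Section 5.14. However, Section 9.4 suggests a much simpler approach, and this is what we now develop. We can then prove the general Radon-Nikodým theorem by martingale theory. See Section 14.13.

First we prove the almost sure uniqueness of a version of $E(X|\mathcal{G})$. Then we prove the existence of $E(X|\mathcal{G})$ when $X \in \mathcal{L}^2$; and finally, we prove the existence in general.

Almost sure uniqueness of $E(X|\mathcal{G})$
Suppose that $X \in \mathcal{L}^1$ and that $Y$ and $\tilde{Y}$ are versions of $E(X|\mathcal{G})$. Then $Y, \tilde{Y} \in \mathcal{L}^1(\Omega, \mathcal{G}, P)$, and
    $E(Y - \tilde{Y}; G) = 0$,   $\forall G \in \mathcal{G}$.
Suppose that $Y$ and $\tilde{Y}$ are not almost surely equal. We may assume that the labelling is such that $P(Y > \tilde{Y}) > 0$. Since
    $\{Y > \tilde{Y} + n^{-1}\} \uparrow \{Y > \tilde{Y}\}$,
we see that $P(Y - \tilde{Y} > n^{-1}) > 0$ for some $n$. But the set $\{Y - \tilde{Y} > n^{-1}\}$ is in $\mathcal{G}$, because $Y$ and $\tilde{Y}$ are $\mathcal{G}$-measurable; and
    $E(Y - \tilde{Y}; Y - \tilde{Y} > n^{-1}) \ge n^{-1} P(Y - \tilde{Y} > n^{-1}) > 0$,
a contradiction. Hence $Y = \tilde{Y}$, a.s. □

Existence of $E(X|\mathcal{G})$ for $X \in \mathcal{L}^2$
Suppose that $X \in \mathcal{L}^2 := \mathcal{L}^2(\Omega, \mathcal{F}, P)$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$, and let $\mathcal{K} := \mathcal{L}^2(\mathcal{G}) := \mathcal{L}^2(\Omega, \mathcal{G}, P)$. By Section 6.10 applied to $\mathcal{G}$ rather than $\mathcal{F}$, we know that $\mathcal{K}$ is complete for the $\mathcal{L}^2$ norm. By Theorem 6.11 on orthogonal projection, we know that there exists $Y$ in $\mathcal{K} = \mathcal{L}^2(\mathcal{G})$ such that
(a)    $E[(X - Y)^2] = \inf\{E[(X - W)^2] : W \in \mathcal{L}^2(\mathcal{G})\}$,
(b)    $\langle X - Y, Z \rangle = 0$,   $\forall Z$ in $\mathcal{L}^2(\mathcal{G})$.
Now, if $G \in \mathcal{G}$, then $Z := I_G \in \mathcal{L}^2(\mathcal{G})$ and (b) states that
    $E(Y; G) = E(X; G)$.
Hence $Y$ is a version of $E(X|\mathcal{G})$, as required. □

Existence of $E(X|\mathcal{G})$ for $X \in \mathcal{L}^1$
By splitting $X$ as $X = X^+ - X^-$, we see that it is enough to deal with the case when $X \in (\mathcal{L}^1)^+$. So assume that $X \in (\mathcal{L}^1)^+$. We can now choose bounded variables $X_n$ with $0 \le X_n \uparrow X$. Since each $X_n$ is in $\mathcal{L}^2$, we can choose a version $Y_n$ of $E(X_n|\mathcal{G})$. We now need to establish that
(c)    it is almost surely true that $0 \le Y_n \uparrow$.
We prove this in a moment. Given that (c) is true, we set
    $Y(\omega) := \limsup Y_n(\omega)$.
Then $Y \in m\mathcal{G}$, and $Y_n \uparrow Y$, a.s. But now (MON) allows us to deduce that
    $E(Y; G) = E(X; G)$   $(G \in \mathcal{G})$
from the corresponding result for $Y_n$ and $X_n$. □

A positivity result
Property (c) follows once we prove that
(d)    if $U$ is a non-negative bounded RV, then $E(U|\mathcal{G}) \ge 0$, a.s.
Proof of (d). Let $W$ be a version of $E(U|\mathcal{G})$. If $P(W < 0) > 0$, then for some $n$, the set
    $G := \{W < -n^{-1}\}$ in $\mathcal{G}$
has positive probability, so that
    $0 \le E(U; G) = E(W; G) \le -n^{-1}P(G) < 0$.
This contradiction finishes the proof. □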
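9.6. Agreement with traditional usage

The case of two RVs will suffice to illustrate things. So suppose that $X$ and $Z$ are RVs which have a joint probability density function (pdf)
    $f_{X,Z}(x, z)$.
Then $f_Z(z) = \int_{\mathbb{R}} f_{X,Z}(x, z)\, dx$ acts as a probability density function for $Z$. Define the elementary conditional pdf $f_{X|Z}$ of $X$ given $Z$ via
    $f_{X|Z}(x|z) := f_{X,Z}(x, z)/f_Z(z)$ if $f_Z(z) \ne 0$;   $0$ otherwise.
Let $h$ be a Borel function on $\mathbb{R}$ such that
    $E|h(X)| = \int_{\mathbb{R}} |h(x)| f_X(x)\, dx < \infty$,
where of course $f_X(x) = \int_{\mathbb{R}} f_{X,Z}(x, z)\, dz$ gives a pdf for $X$. Set
    $g(z) := \int_{\mathbb{R}} h(x) f_{X|Z}(x|z)\, dx$.
Then $Y := g(Z)$ is a version of the conditional expectation of $h(X)$ given $\sigma(Z)$.

Proof. The typical element of $\sigma(Z)$ has the form $\{\omega : Z(\omega) \in B\}$, where $B \in \mathcal{B}$. Hence, we must show that
(a)    $L := E[h(X) I_B(Z)] = E[g(Z) I_B(Z)] =: R$.
But
    $L = \int \int h(x) I_B(z) f_{X,Z}(x, z)\, dx\, dz$,   $R = \int g(z) I_B(z) f_Z(z)\, dz$,
and result (a) follows from Fubini's Theorem. □

Some of the practice is given in Sections 15.6-15.9, which you can look at now.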
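►►► 9.7. Properties of conditional expectation: a list

These properties are proved in Section 9.8. All $X$'s satisfy $E(|X|) < \infty$ in this list of properties. Of course, $\mathcal{G}$ and $\mathcal{H}$ denote sub-$\sigma$-algebras of $\mathcal{F}$. (The use of 'c' to denote 'conditional' in (cMON), etc., is obvious.)

(a) If $Y$ is any version of $E(X|\mathcal{G})$ then $E(Y) = E(X)$. (Very useful, this.)
(b) If $X$ is $\mathcal{G}$ measurable, then $E(X|\mathcal{G}) = X$, a.s.
(c) (Linearity) $E(a_1X_1 + a_2X_2|\mathcal{G}) = a_1E(X_1|\mathcal{G}) + a_2E(X_2|\mathcal{G})$, a.s.
    Clarification: if $Y_1$ is a version of $E(X_1|\mathcal{G})$ and $Y_2$ is a version of $E(X_2|\mathcal{G})$, then $a_1Y_1 + a_2Y_2$ is a version of $E(a_1X_1 + a_2X_2|\mathcal{G})$.
(d) (Positivity) If $X \ge 0$, then $E(X|\mathcal{G}) \ge 0$, a.s.
(e) (cMON) If $0 \le X_n \uparrow X$, then $E(X_n|\mathcal{G}) \uparrow E(X|\mathcal{G})$, a.s.
(f) (cFATOU) If $X_n \ge 0$, then $E[\liminf X_n|\mathcal{G}] \le \liminf E[X_n|\mathcal{G}]$, a.s.
(g) (cDOM) If $|X_n(\omega)| \le V(\omega)$, $\forall n$, $EV < \infty$, and $X_n \to X$, a.s., then $E(X_n|\mathcal{G}) \to E(X|\mathcal{G})$, a.s.
(h) (cJENSEN) If $c : \mathbb{R} \to \mathbb{R}$ is convex, and $E|c(X)| < \infty$, then $E[c(X)|\mathcal{G}] \ge c(E[X|\mathcal{G}])$, a.s.
    Important corollary: $\|E(X|\mathcal{G})\|_p \le \|X\|_p$ for $p \ge 1$.
(i) (Tower Property) If $\mathcal{H}$ is a sub-$\sigma$-algebra of $\mathcal{G}$, then $E[E(X|\mathcal{G})|\mathcal{H}] = E[X|\mathcal{H}]$, a.s.
    Note. We shorthand LHS to $E[X|\mathcal{G}|\mathcal{H}]$ for tidiness.
(j) ('Taking out what is known') If $Z$ is $\mathcal{G}$-measurable and bounded, then
    (*)    $E[ZX|\mathcal{G}] = ZE[X|\mathcal{G}]$, a.s.
    If $p > 1$, $p^{-1} + q^{-1} = 1$, $X \in \mathcal{L}^p(\Omega, \mathcal{F}, P)$ and $Z \in \mathcal{L}^q(\Omega, \mathcal{G}, P)$, then (*) again holds. If $X \in (m\mathcal{F})^+$, $Z \in (m\mathcal{G})^+$, $E(X) < \infty$ and $E(ZX) < \infty$, then (*) holds.
(k) (Role of independence) If $\mathcal{H}$ is independent of $\sigma(\sigma(X), \mathcal{G})$, then $E[X|\sigma(\mathcal{G}, \mathcal{H})] = E(X|\mathcal{G})$, a.s.
    In particular, if $X$ is independent of $\mathcal{H}$, then $E(X|\mathcal{H}) = E(X)$, a.s.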
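On a finite sample space, a sub-$\sigma$-algebra corresponds to a partition and conditional expectation to block averaging, so properties (a) and (i) can be checked exactly. This is a small sketch under that assumption; the partitions `G` and `H` and the variable `X` are made up for illustration.

```python
from fractions import Fraction

# Uniform measure on 8 points; X is an arbitrary random variable.
omega = range(8)
prob = {w: Fraction(1, 8) for w in omega}
X = {w: w * w for w in omega}

def cond_exp(values, partition):
    """E(values | sigma(partition)) as a dict omega -> block average."""
    out = {}
    for block in partition:
        mass = sum(prob[w] for w in block)
        mean = sum(prob[w] * values[w] for w in block) / mass
        for w in block:
            out[w] = mean
    return out

def expectation(values):
    return sum(prob[w] * values[w] for w in omega)

G = [[0, 1], [2, 3], [4, 5], [6, 7]]  # finer partition
H = [[0, 1, 2, 3], [4, 5, 6, 7]]      # coarser partition: sigma(H) is inside sigma(G)

lhs = cond_exp(cond_exp(X, G), H)     # E[E(X|G)|H]
rhs = cond_exp(X, H)                  # E[X|H]
print(lhs == rhs, expectation(cond_exp(X, G)) == expectation(X))
```

Averaging first over the fine blocks and then over the coarse blocks gives the same result as averaging over the coarse blocks directly, which is the Tower Property in this setting.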
9.8. Proofs of the properties in Section 9.7

Property (a) follows since $E(Y; \Omega) = E(X; \Omega)$, $\Omega$ being an element of $\mathcal{G}$.
Property (b) is immediate from the definition, as is Property (c) now that its Clarification has been given.
Property (d) is not obvious, but the proof of (9.5,d) transfers immediately to our current situation.

Proof of (e). If $0 \le X_n \uparrow X$, then, by (d), if, for each $n$, $Y_n$ is a version of $E(X_n|\mathcal{G})$, then (a.s.) $0 \le Y_n \uparrow$. Define $Y := \limsup Y_n$. Then $Y \in m\mathcal{G}$, and $Y_n \uparrow Y$, a.s. Now use (MON) to deduce from
    $E(Y_n; G) = E(X_n; G)$,   $\forall G \in \mathcal{G}$,
that $E(Y; G) = E(X; G)$, $\forall G \in \mathcal{G}$. (Of course we used a very similar argument in Section 9.5.) □

Proof of (f) and (g). You should check that the argument used to obtain (FATOU) from (MON) in Section 5.4 and the argument used to obtain (DOM) from (FATOU) in Section 5.9 both transfer without difficulty to yield the conditional versions. Doing the careful derivation of (cFATOU) from (cMON) and of (cDOM) from (cFATOU) is an essential exercise for you. □

Proof of (h). From Section 6.6, there exists a countable sequence $((a_n, b_n))$ of points in $\mathbb{R}^2$ such that
    $c(x) = \sup_n (a_n x + b_n)$,   $x \in \mathbb{R}$.
For each fixed $n$ we deduce via (d) from $c(X) \ge a_nX + b_n$ that, almost surely,
(**)    $E[c(X)|\mathcal{G}] \ge a_nE[X|\mathcal{G}] + b_n$.
By the usual appeal to countability, we can say that almost surely (**) holds simultaneously for all $n$, whence, almost surely,
    $E[c(X)|\mathcal{G}] \ge \sup_n (a_nE[X|\mathcal{G}] + b_n) = c(E[X|\mathcal{G}])$. □

Proof of corollary to (h). Let $p \ge 1$. Taking $c(x) = |x|^p$, we see that
    $E(|X|^p \,|\, \mathcal{G}) \ge |E(X|\mathcal{G})|^p$, a.s.
Now take expectations, using property (a). □

Property (i) is virtually immediate from the definition of conditional expectation.

Proof of (j). Linearity shows that we can assume that $X \ge 0$. Fix a version $Y$ of $E(X|\mathcal{G})$, and fix $G$ in $\mathcal{G}$. We must prove that if $Z$ is $\mathcal{G}$-measurable and appropriate integrability conditions hold, then
(***)    $E(ZX; G) = E(ZY; G)$.
We use the standard machine. If $Z$ is the indicator of a set in $\mathcal{G}$, then (***) is true by definition of the conditional expectation $Y$. Linearity then shows that (***) holds for $Z \in SF^+(\Omega, \mathcal{G}, P)$. Next, (MON) shows that (***) is true for $Z \in (m\mathcal{G})^+$ with the understanding that both sides might be infinite.
All that is necessary to establish that property (j) in the table is correct is to show that under each of the conditions given, $E(|ZX|) < \infty$. This is obvious if $Z$ is bounded and $X$ is in $\mathcal{L}^1$, and follows from the Hölder inequality if $X \in \mathcal{L}^p$ and $Z \in \mathcal{L}^q$ where $p > 1$ and $p^{-1} + q^{-1} = 1$. □

Proof of (k). We can assume that $X \ge 0$ (and $E(X) < \infty$). For $G \in \mathcal{G}$ and $H \in \mathcal{H}$, $XI_G$ and $H$ are independent, so that by Theorem 7.1,
    $E(X; G \cap H) = E[(XI_G)I_H] = E(XI_G)P(H)$.
Now if $Y = E(X|\mathcal{G})$ (a version of), then since $Y$ is $\mathcal{G}$-measurable, $YI_G$ is independent of $\mathcal{H}$ so that
    $E[(YI_G)I_H] = E(YI_G)P(H)$
and we have
    $E[X; G \cap H] = E[Y; G \cap H]$.
Thus the measures
    $F \mapsto E(X; F)$,   $F \mapsto E(Y; F)$
on $\sigma(\mathcal{G}, \mathcal{H})$ of the same finite total mass agree on the $\pi$-system of sets of the form $G \cap H$ $(G \in \mathcal{G}, H \in \mathcal{H})$, and hence agree everywhere on $\sigma(\mathcal{G}, \mathcal{H})$. This is exactly what we had to prove. □
9.9. Regular conditional probabilities and pdfs

For $F \in \mathcal{F}$, we have $P(F) = E(I_F)$. For $F \in \mathcal{F}$ and $\mathcal{G}$ a sub-$\sigma$-algebra of $\mathcal{F}$, we define $P(F|\mathcal{G})$ to be a version of $E(I_F|\mathcal{G})$.

By linearity and (cMON), we can show that for a fixed sequence $(F_n)$ of disjoint elements of $\mathcal{F}$, we have
(a)    $P\left(\bigcup F_n \,\middle|\, \mathcal{G}\right) = \sum P(F_n|\mathcal{G})$   (a.s.).
Except in trivial cases, there are uncountably many sequences of disjoint sets, so we cannot conclude from (a) that there exists a map
    $P(\cdot, \cdot) : \Omega \times \mathcal{F} \to [0, 1]$
such that
(b1)   for $F \in \mathcal{F}$, the function $\omega \mapsto P(\omega, F)$ is a version of $P(F|\mathcal{G})$;
(b2)   for almost every $\omega$, the map $F \mapsto P(\omega, F)$ is a probability measure on $\mathcal{F}$.
If such a map exists, it is called a regular conditional probability given $\mathcal{G}$. It is known that regular conditional probabilities exist under most conditions encountered in practice, but they do not always exist. The matter is too technical for a book at this level. See, for example, Parthasarathy (1967).

Important note. The elementary conditional pdf $f_{X|Z}(x|z)$ of Section 9.6 is a proper - technically, regular - conditional pdf for $X$ given $Z$ in that for every $A$ in $\mathcal{B}$,
    $\omega \mapsto \int_A f_{X|Z}(x|Z(\omega))\, dx$ is a version of $P(X \in A|Z)$.
Proof. Take $h = I_A$ in Section 9.6. □
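9.10. Conditioning under independence assumptions

Suppose that $r \in \mathbb{N}$ and that $X_1, X_2, \ldots, X_r$ are independent RVs, $X_k$ having law $\Lambda_k$. If $h \in b\mathcal{B}^r$ and we define (for $x_1 \in \mathbb{R}$)
(a)    $\gamma^h(x_1) = E[h(x_1, X_2, X_3, \ldots, X_r)]$,
then
(b)    $\gamma^h(X_1)$ is a version of the conditional expectation $E[h(X_1, X_2, \ldots, X_r)|X_1]$.

Two proofs of (b). We need only show that for $B \in \mathcal{B}$,
(c)    $E[h(X_1, X_2, \ldots, X_r) I_B(X_1)] = E[\gamma^h(X_1) I_B(X_1)]$.
We can do this via the Monotone-Class Theorem: the class $\mathcal{H}$ of $h$ satisfying (c) contains the indicator functions of elements in the $\pi$-system of sets of the form
    $B_1 \times B_2 \times \cdots \times B_r$   $(B_k \in \mathcal{B})$,
etc., etc. Alternatively, we can appeal to the $r$-fold Fubini Theorem; for (c) says that
    $\int_{x \in \mathbb{R}^r} h(x) I_B(x_1)\, (\Lambda_1 \times \Lambda_2 \times \cdots \times \Lambda_r)(dx) = \int_{x_1 \in \mathbb{R}} \gamma^h(x_1) I_B(x_1)\, \Lambda_1(dx_1)$,
where
    $\gamma^h(x_1) = \int_{y \in \mathbb{R}^{r-1}} h(x_1, y)\, (\Lambda_2 \times \cdots \times \Lambda_r)(dy)$.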
9.11. Use of symmetry: an example

Suppose that $X_1, X_2, \ldots$ are IID RVs with the same distribution as $X$, where $E(|X|) < \infty$. Let $S_n := X_1 + X_2 + \cdots + X_n$, and define
    $\mathcal{G}_n := \sigma(S_n, S_{n+1}, \ldots) = \sigma(S_n, X_{n+1}, X_{n+2}, \ldots)$.
We wish to calculate
    $E(X_1|\mathcal{G}_n)$,
for very good reasons, as we shall see in Chapter 14. Now $\sigma(X_{n+1}, X_{n+2}, \ldots)$ is independent of $\sigma(X_1, S_n)$ (which is a sub-$\sigma$-algebra of $\sigma(X_1, \ldots, X_n)$). Hence, by (9.7,k),
    $E(X_1|\mathcal{G}_n) = E(X_1|S_n)$.
But if $\Lambda$ denotes the law of $X$, then, with $s_n$ denoting $x_1 + x_2 + \cdots + x_n$, we have
    $E(X_1; S_n \in B) = \int \cdots \int_{s_n \in B} x_1\, \Lambda(dx_1)\Lambda(dx_2) \cdots \Lambda(dx_n)$
        $= E(X_2; S_n \in B) = \cdots = E(X_n; S_n \in B)$.
Hence, almost surely,
    $E(X_1|S_n) = \cdots = E(X_n|S_n) = n^{-1}E(X_1 + \cdots + X_n|S_n) = n^{-1}S_n$.
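The symmetry argument can be confirmed by brute force in a discrete case. The following sketch assumes IID Bernoulli($p$) variables (our choice of law, purely for illustration) and computes $E(X_1 \mid S_n = s)$ exactly by enumerating all outcomes.

```python
from fractions import Fraction
from itertools import product

def cond_exp_first_given_sum(n, p):
    """Exact E(X_1 | S_n = s) for IID Bernoulli(p) X_1..X_n, for each attainable s."""
    mass, weighted = {}, {}
    for xs in product((0, 1), repeat=n):
        prob = Fraction(1)
        for x in xs:
            prob *= p if x == 1 else 1 - p
        s = sum(xs)
        mass[s] = mass.get(s, 0) + prob               # P(S_n = s)
        weighted[s] = weighted.get(s, 0) + xs[0] * prob  # E(X_1; S_n = s)
    return {s: weighted[s] / mass[s] for s in mass}

result = cond_exp_first_given_sum(4, Fraction(1, 3))
print(result)
```

Each conditional expectation equals $s/n$, whatever the value of $p$, in agreement with $E(X_1|S_n) = n^{-1}S_n$.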
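Chapter 10
Martingales

10.1. Filtered spaces

► As basic datum, we now take a filtered space $(\Omega, \mathcal{F}, \{\mathcal{F}_n\}, P)$. Here, $(\Omega, \mathcal{F}, P)$ is a probability triple as usual, and $\{\mathcal{F}_n : n \ge 0\}$ is a filtration, that is, an increasing family of sub-$\sigma$-algebras of $\mathcal{F}$:
    $\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \cdots \subseteq \mathcal{F}$.
We define
    $\mathcal{F}_\infty := \sigma\left(\bigcup_n \mathcal{F}_n\right) \subseteq \mathcal{F}$.

Intuitive idea. The information about $\omega$ in $\Omega$ available to us at (or, if you prefer, 'just after') time $n$ consists precisely of the values of $Z(\omega)$ for all $\mathcal{F}_n$ measurable functions $Z$. Usually, $\{\mathcal{F}_n\}$ is the natural filtration
    $\mathcal{F}_n = \sigma(W_0, W_1, \ldots, W_n)$
of some (stochastic) process $W = (W_n : n \in \mathbb{Z}^+)$, and then the information about $\omega$ which we have at time $n$ consists of the values
    $W_0(\omega), W_1(\omega), \ldots, W_n(\omega)$.

10.2. Adapted process

► A process $X = (X_n : n \ge 0)$ is called adapted (to the filtration $\{\mathcal{F}_n\}$) if for each $n$, $X_n$ is $\mathcal{F}_n$-measurable.

Intuitive idea. If $X$ is adapted, the value $X_n(\omega)$ is known to us at time $n$. Usually, $\mathcal{F}_n = \sigma(W_0, W_1, \ldots, W_n)$ and $X_n = f_n(W_0, W_1, \ldots, W_n)$ for some $\mathcal{B}^{n+1}$-measurable function $f_n$ on $\mathbb{R}^{n+1}$.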
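10.3. Martingale, supermartingale, submartingale

►►► A process $X$ is called a martingale (relative to $(\{\mathcal{F}_n\}, P)$) if
(i)    $X$ is adapted,
(ii)   $E(|X_n|) < \infty$, $\forall n$,
(iii)  $E[X_n|\mathcal{F}_{n-1}] = X_{n-1}$, a.s. $(n \ge 1)$.
A supermartingale (relative to $(\{\mathcal{F}_n\}, P)$) is defined similarly, except that (iii) is replaced by
    $E[X_n|\mathcal{F}_{n-1}] \le X_{n-1}$, a.s. $(n \ge 1)$,
and a submartingale is defined with (iii) replaced by
    $E[X_n|\mathcal{F}_{n-1}] \ge X_{n-1}$, a.s. $(n \ge 1)$.
A supermartingale 'decreases on average'; a submartingale 'increases on average'! [Supermartingale corresponds to superharmonic: a function $f$ on $\mathbb{R}^n$ is superharmonic if and only if for a Brownian motion $B$ on $\mathbb{R}^n$, $f(B)$ is a local supermartingale relative to the natural filtration of $B$. Compare Section 10.13.]

Note that $X$ is a supermartingale if and only if $-X$ is a submartingale, and that $X$ is a martingale if and only if it is both a supermartingale and a submartingale. It is important to note that a process $X$ for which $X_0 \in \mathcal{L}^1(\Omega, \mathcal{F}_0, P)$ is a martingale [respectively, supermartingale, submartingale] if and only if the process $X - X_0 = (X_n - X_0 : n \in \mathbb{Z}^+)$ has the same property. So we can focus attention on processes which are null at 0.

► If $X$ is for example a supermartingale, then the Tower Property of CEs, (9.7)(i), shows that for $m < n$,
    $E[X_n|\mathcal{F}_m] = E[X_n|\mathcal{F}_{n-1}|\mathcal{F}_m] \le E[X_{n-1}|\mathcal{F}_m] \le \cdots \le X_m$, a.s.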
10.4. Some examples of martingales

As we shall see, it is very helpful to view all martingales, supermartingales and submartingales in terms of gambling. But, of course, the enormous importance of martingale theory derives from the fact that martingales crop up in very many contexts. For example, diffusion theory, which used to be studied via methods from Markov-process theory, from the theory of partial differential equations, etc., has been revolutionized by the martingale approach.

Let us now look at some simple first examples, and mention an interesting question (solved later) pertaining to each.

(a) Sums of independent zero-mean RVs. Let $X_1, X_2, \ldots$ be a sequence of independent RVs with $E(|X_k|) < \infty$, $\forall k$, and
    $E(X_k) = 0$,   $\forall k$.
Define ($S_0 := 0$ and)
    $S_n := X_1 + X_2 + \cdots + X_n$,
    $\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n)$,   $\mathcal{F}_0 := \{\emptyset, \Omega\}$.
Then for $n \ge 1$, we have (a.s.)
    $E(S_n|\mathcal{F}_{n-1}) = E(S_{n-1}|\mathcal{F}_{n-1}) + E(X_n|\mathcal{F}_{n-1}) = S_{n-1} + E(X_n) = S_{n-1}$.
The first (a.s.) equality is obvious from the linearity property (9.7,c). Since $S_{n-1}$ is $\mathcal{F}_{n-1}$-measurable, we have $E(S_{n-1}|\mathcal{F}_{n-1}) = S_{n-1}$ (a.s.) by (9.7,b); and since $X_n$ is independent of $\mathcal{F}_{n-1}$, we have $E(X_n|\mathcal{F}_{n-1}) = E(X_n)$ (a.s.) by (9.7,k). That must explain our notation!
Interesting question: when does $\lim S_n$ exist (a.s.)? See Section 12.5.

(b) Products of non-negative independent RVs of mean 1. Let $X_1, X_2, \ldots$ be a sequence of independent non-negative random variables with
    $E(X_k) = 1$,   $\forall k$.
Define ($M_0 := 1$, $\mathcal{F}_0 := \{\emptyset, \Omega\}$ and)
    $M_n := X_1X_2 \cdots X_n$,   $\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n)$.
Then, for $n \ge 1$, we have (a.s.)
    $E(M_n|\mathcal{F}_{n-1}) = E(M_{n-1}X_n|\mathcal{F}_{n-1}) = M_{n-1}E(X_n|\mathcal{F}_{n-1}) = M_{n-1}E(X_n) = M_{n-1}$,
so that $M$ is a martingale. It should be remarked that such martingales are not at all artificial.
Interesting question. Because $M$ is a non-negative martingale, $M_\infty = \lim M_n$ exists (a.s.); this is part of the Martingale Convergence Theorem of the next chapter. When can we say that $E(M_\infty) = 1$? See Sections 14.12 and 14.17.

(c) Accumulating data about a random variable. Let $\{\mathcal{F}_n\}$ be our filtration, and let $\xi \in \mathcal{L}^1(\Omega, \mathcal{F}, P)$. Define $M_n := E(\xi|\mathcal{F}_n)$ ('some version of'). By the Tower Property, we have (a.s.)
    $E(M_n|\mathcal{F}_{n-1}) = E(\xi|\mathcal{F}_n|\mathcal{F}_{n-1}) = E(\xi|\mathcal{F}_{n-1}) = M_{n-1}$.
Hence $M$ is a martingale.
Interesting question. In this case, we shall be able to say that
    $M_n \to M_\infty := E(\xi|\mathcal{F}_\infty)$, a.s.,
because of Lévy's Upward Theorem (Chapter 14). Now $M_n$ is the best predictor of $\xi$ given the information available to us at time $n$, and $M_\infty$ is the best prediction of $\xi$ we can ever make. When can we say that $\xi = E(\xi|\mathcal{F}_\infty)$, a.s.? The answer is not always obvious. See Section 15.8.
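10.5. Fair and unfair games

Think now of $X_n - X_{n-1}$ as your net winnings per unit stake in game $n$ $(n \ge 1)$ in a series of games, played at times $n = 1, 2, \ldots$. There is no game at time 0. In the martingale case,
(a)    $E[X_n - X_{n-1}|\mathcal{F}_{n-1}] = 0$   (game series is fair),
and in the supermartingale case,
(b)    $E[X_n - X_{n-1}|\mathcal{F}_{n-1}] \le 0$   (game series is unfavourable to you).
Note that (a) [respectively (b)] gives a useful way of formulating the martingale [supermartingale] property of $X$.

10.6. Previsible process, gambling strategy

►► We call a process $C = (C_n : n \in \mathbb{N})$ previsible if
    $C_n$ is $\mathcal{F}_{n-1}$ measurable $(n \ge 1)$.
Note that $C$ has parameter set $\mathbb{N}$ rather than $\mathbb{Z}^+$: $C_0$ does not exist.

Think of $C_n$ as your stake on game $n$. You have to decide on the value of $C_n$ based on the history up to (and including) time $n - 1$. This is the intuitive significance of the 'previsible' character of $C$. Your winnings on game $n$ are $C_n(X_n - X_{n-1})$ and your total winnings up to time $n$ are
    $Y_n = \sum_{1 \le k \le n} C_k(X_k - X_{k-1}) =: (C \bullet X)_n$.
Note that $(C \bullet X)_0 = 0$, and that
    $Y_n - Y_{n-1} = C_n(X_n - X_{n-1})$.
The expression $C \bullet X$, the martingale transform of $X$ by $C$, is the discrete analogue of the stochastic integral $\int C\, dX$. Stochastic-integral theory is one of the greatest achievements of the modern theory of probability.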
10.7. A fundamental principle: you can't beat the system!

►► (i) Let $C$ be a bounded non-negative previsible process so that, for some $K$ in $[0, \infty)$, $|C_n(\omega)| \le K$ for every $n$ and every $\omega$. Let $X$ be a supermartingale [respectively martingale]. Then $C \bullet X$ is a supermartingale [martingale] null at 0.
(ii) If $C$ is a bounded previsible process and $X$ is a martingale, then $(C \bullet X)$ is a martingale null at 0.
(iii) In (i) and (ii), the boundedness condition on $C$ may be replaced by the condition $C_n \in \mathcal{L}^2$, $\forall n$, provided we also insist that $X_n \in \mathcal{L}^2$, $\forall n$.

Proof of (i). Write $Y$ for $C \bullet X$. Since $C_n$ is bounded non-negative and $\mathcal{F}_{n-1}$ measurable,
    $E[Y_n - Y_{n-1}|\mathcal{F}_{n-1}] = C_nE[X_n - X_{n-1}|\mathcal{F}_{n-1}] \le 0$,   [resp. $= 0$].
Proofs of (ii) and (iii) are now obvious. (Look again at (9.7,j).)
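Part (ii) can be illustrated by exhaustive enumeration for a fair $\pm 1$ game: however the stake is chosen from the past, the expected total winnings are zero. The sketch below assumes a fair coin-tossing walk and a capped 'double after a loss' strategy, both our own choices; `expected_winnings` and `double_after_loss` are hypothetical names.

```python
from fractions import Fraction
from itertools import product

def expected_winnings(strategy, n):
    """Exact E[(C . X)_n] when X is a fair +/-1 walk and C_k = strategy(earlier steps)."""
    total = Fraction(0)
    for steps in product((-1, 1), repeat=n):
        winnings = 0
        for k in range(n):
            stake = strategy(steps[:k])  # previsible: sees only games before this one
            winnings += stake * steps[k]
        total += Fraction(winnings, 2**n)
    return total

def double_after_loss(history):
    """A bounded strategy: double the stake after each loss (cap 8), reset after a win."""
    stake = 1
    for step in history:
        stake = min(2 * stake, 8) if step == -1 else 1
    return stake

print(expected_winnings(double_after_loss, 6))
```

The cap keeps $C$ bounded, as the theorem requires; a strategy allowed to peek at `steps[k]` itself would not be previsible and could of course win.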
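10.8. Stopping time

►► A map $T : \Omega \to \{0, 1, 2, \ldots; \infty\}$ is called a stopping time if,
(a)    $\{T \le n\} = \{\omega : T(\omega) \le n\} \in \mathcal{F}_n$,   $\forall n \le \infty$,
equivalently,
(b)    $\{T = n\} = \{\omega : T(\omega) = n\} \in \mathcal{F}_n$,   $\forall n \le \infty$.
Note that $T$ can be $\infty$.

Proof of the equivalence of (a) and (b). If $T$ has property (a), then
    $\{T = n\} = \{T \le n\} \setminus \{T \le n - 1\} \in \mathcal{F}_n$.
If $T$ has property (b), then for $k \le n$, $\{T = k\} \in \mathcal{F}_k \subseteq \mathcal{F}_n$ and
    $\{T \le n\} = \bigcup_{0 \le k \le n} \{T = k\} \in \mathcal{F}_n$. □

Intuitive idea. $T$ is a time when you can decide to stop playing our game. Whether or not you stop immediately after the $n$th game depends only on the history up to (and including) time $n$: $\{T = n\} \in \mathcal{F}_n$.

Example. Suppose that $(A_n)$ is an adapted process, and that $B \in \mathcal{B}$. Let
    $T = \inf\{n \ge 0 : A_n \in B\}$ = time of first entry of $A$ into set $B$.
By convention, $\inf(\emptyset) = \infty$, so that $T = \infty$ if $A$ never enters set $B$. Obviously,
    $\{T \le n\} = \bigcup_{k \le n} \{A_k \in B\} \in \mathcal{F}_n$,
so that $T$ is a stopping time.

Example. Let $L = \sup\{n : n \le 10; A_n \in B\}$, $\sup(\emptyset) = 0$. Convince yourself that $L$ is NOT a stopping time (unless $A$ is freaky).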
10.9. Stopped supermartingales are supermartingales

Let $X$ be a supermartingale, and let $T$ be a stopping time. Suppose that you always bet 1 unit and quit playing at (immediately after) time $T$. Then your 'stake process' is $C^{(T)}$, where, for $n \in \mathbb{N}$,
    $C^{(T)}_n = I_{\{n \le T\}}$, so that $C^{(T)}_n(\omega) = 1$ if $n \le T(\omega)$, and $0$ otherwise.
Your 'winnings process' is the process with value at time $n$ equal to
    $(C^{(T)} \bullet X)_n = X_{T \wedge n} - X_0$.
If $X^T$ denotes the process $X$ stopped at $T$:
    $X^T_n(\omega) := X_{T(\omega) \wedge n}(\omega)$,
then
    $C^{(T)} \bullet X = X^T - X_0$.
Now $C^{(T)}$ is clearly bounded (by 1) and non-negative. Moreover, $C^{(T)}$ is previsible because $C^{(T)}_n$ can only be 0 or 1 and, for $n \in \mathbb{N}$,
    $\{C^{(T)}_n = 0\} = \{T \le n - 1\} \in \mathcal{F}_{n-1}$.
Result 10.7 now yields the following result.

THEOREM.
►► (i) If $X$ is a supermartingale and $T$ is a stopping time, then the stopped process $X^T = (X_{T \wedge n} : n \in \mathbb{Z}^+)$ is a supermartingale, so that in particular,
    $E(X_{T \wedge n}) \le E(X_0)$,   $\forall n$.
►► (ii) If $X$ is a martingale and $T$ is a stopping time, then $X^T$ is a martingale, so that in particular,
    $E(X_{T \wedge n}) = E(X_0)$,   $\forall n$.

It is important to notice that this theorem imposes no extra integrability conditions whatsoever (except of course for those implicit in the definition of supermartingale and martingale).

But be very careful! Let $X$ be a simple random walk on $\mathbb{Z}$, starting at 0. Then $X$ is a martingale. Let $T$ be the stopping time:
    $T := \inf\{n : X_n = 1\}$.
It is well known that $P(T < \infty) = 1$. (See Section 10.12 for a martingale proof of this fact, and for a martingale calculation of the distribution of $T$.) However, even though
    $E(X_{T \wedge n}) = E(X_0)$ for every $n$,
we have
    $1 = E(X_T) \ne E(X_0) = 0$.
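The warning can be made concrete by enumerating all paths of the walk up to a fixed horizon. The sketch below (function name ours) computes $E(X_{T \wedge n})$ and $P(T \le n)$ exactly for the simple random walk with $T := \inf\{n : X_n = 1\}$.

```python
from fractions import Fraction
from itertools import product

def stopped_walk_summary(n):
    """Return (E X_{T^n}, P(T <= n)) for the fair +/-1 walk X and T = inf{k : X_k = 1}."""
    mean, hit = Fraction(0), Fraction(0)
    weight = Fraction(1, 2**n)
    for steps in product((-1, 1), repeat=n):
        position = 0
        for step in steps:
            if position == 1:   # already stopped: ignore the remaining steps
                break
            position += step
        # Here position equals X_{T ^ n} on this path, and it is 1 exactly when T <= n.
        mean += weight * position
        if position == 1:
            hit += weight
    return mean, hit

for n in (1, 3, 6, 12):
    mean, hit = stopped_walk_summary(n)
    print(n, mean, float(hit))
```

The mean stays at 0 for every $n$ while $P(T \le n)$ creeps towards 1: the rare paths that have not yet hit 1 sit far below 0 and exactly balance the rest, which is why the limit cannot be taken inside the expectation without a further condition.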
We
Chapter 10:
very
Martingales
(10.9)..
much
want
to know
E(Xt)
for a martingale
X. The following
gives
some
sufficient
conditions.
10.10. Doob's Optional-Stopping Theorem T he a stopping time. Let X be a supermartingale. Let \342\226\272(a) integrahle and
Then
Xt
is
in
each
of the
following situations:
(for
(i) T is hounded
(ii)
some
N, T{uj) <
N,
Vu;/,
is bounded
and
every uo)
T
oo^
is a.s.
and,
in R^,
K in
|X\342\200\236(u;)|
<
for
every n and
some
~
R\"'\",
\\Xn{uj)
Xn^i{uj)\\
< K
V(n,u;).
(b)
If any
of
the
conditions
(i)-(iii)
E(Xr)
holds and
= E(Xo).
X is a martingale, then
Proof of (a). We know that $X_{T \wedge n}$ is integrable, and
(*) $$\mathbf{E}(X_{T \wedge n} - X_0) \le 0.$$
For (i), we can take $n = N$. For (ii), we can let $n \to \infty$ in (*) using (BDD). For (iii), we have
$$|X_{T \wedge n} - X_0| = \Big| \sum_{k=1}^{T \wedge n} (X_k - X_{k-1}) \Big| \le KT$$
and $\mathbf{E}(KT) < \infty$, so that (DOM) justifies letting $n \to \infty$ in (*) to obtain the answer we want. □
Proof of (b). Apply (a) to $X$ and to $(-X)$. □
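As a hedged illustration of how the theorem is used (this classical gambler's-ruin example is not taken from the text): let $T$ be the first time the simple random walk from 0 hits $-a$ or $b$. The stopped walk is bounded and $T$ is a.s. finite, so the argument for case (ii) gives $\mathbf{E}(X_T) = 0$, that is, $b\,\mathbf{P}(X_T = b) - a\,\mathbf{P}(X_T = -a) = 0$, whence $\mathbf{P}(X_T = b) = a/(a+b)$. The sketch below (function name hypothetical) computes the exact law of $X_{T \wedge n}$ so that both facts can be checked.

```python
from fractions import Fraction

def two_barrier_law(a, b, n_steps):
    """Exact law of X_{T ^ n}, where T is the first time the simple
    random walk started at 0 hits -a or b (a, b positive integers)."""
    law = {0: Fraction(1)}
    for _ in range(n_steps):
        new_law = {}
        for x, p in law.items():
            # Absorbed states stay put; interior states take a fair +/-1 step.
            targets = (x,) if x in (-a, b) else (x - 1, x + 1)
            for y in targets:
                new_law[y] = new_law.get(y, Fraction(0)) + p / len(targets)
        law = new_law
    return law

if __name__ == "__main__":
    law = two_barrier_law(2, 3, 500)
    print(sum(x * p for x, p in law.items()))   # E(X_{T ^ n})
    print(float(law[3]))                        # close to P(X_T = b)
```

For large $n$ almost all of the mass has been absorbed, so the probability at $b$ is numerically indistinguishable from $a/(a+b)$.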
Corollary
►(c) Suppose that $M$ is a martingale, the increments $M_n - M_{n-1}$ of which are bounded by some constant $K_1$. Suppose that $C$ is a previsible process bounded by some constant $K_2$, and that $T$ is a stopping time such that $\mathbf{E}(T) < \infty$. Then
$$\mathbf{E}(C \bullet M)_T = 0.$$
Proof of the following final part of the Optional-Stopping Theorem is left as an Exercise.
(d) If $X$ is a non-negative supermartingale, and $T$ is a stopping time which is a.s. finite, then
$$\mathbf{E}(X_T) \le \mathbf{E}(X_0).$$
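10.11. Awaiting the almost inevitable
In order to be able to apply some of the results of the preceding Section, we need ways of proving that (when true!) $\mathbf{E}(T) < \infty$. The following announcement of the principle that 'whatever always stands a reasonable chance of happening will almost surely happen, sooner rather than later' is often useful.
LEMMA
►Suppose that $T$ is a stopping time such that for some $N$ in $\mathbb{N}$ and some $\varepsilon > 0$, we have, for every $n$ in $\mathbb{N}$:
$$\mathbf{P}(T \le n + N \mid \mathcal{F}_n) > \varepsilon, \quad \text{a.s.}$$
Then $\mathbf{E}(T) < \infty$.
You will find the proof of this set as an exercise in Chapter E.
Note that if $T$ is the first occasion by which the monkey in the 'Tricky exercise' at the end of Section 4.9 first completes ABRACADABRA, then $\mathbf{E}(T) < \infty$. You will find another exercise in Chapter E inviting you to apply result (c) of the preceding Section to show that
$$\mathbf{E}(T) = 26^{11} + 26^4 + 26.$$
A large number of other Exercises are now accessible to you.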
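The value $26^{11} + 26^4 + 26$ reflects the way ABRACADABRA overlaps itself: the prefixes of lengths 11, 4 and 1 are also suffixes. The sketch below encodes that overlap rule, which is assumed here as the standard closed form the fair-casino martingale argument leads to, not derived; the function name is illustrative.

```python
def expected_wait(pattern, alphabet_size):
    """Expected number of uniform random letters typed until `pattern` first
    appears, via the overlap rule: add alphabet_size**k for every k such that
    the length-k prefix of the pattern equals its length-k suffix."""
    return sum(
        alphabet_size ** k
        for k in range(1, len(pattern) + 1)
        if pattern[:k] == pattern[-k:]
    )

if __name__ == "__main__":
    print(expected_wait("ABRACADABRA", 26))
```

For a fair coin the same rule gives an expected wait of 6 tosses for HH but only 4 for HT, a small-scale instance of the same self-overlap effect.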
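10.12. Hitting times for simple random walk
Suppose that $(X_n : n \in \mathbb{N})$ is a sequence of IID RVs, each $X_n$ having the same distribution as $X$, where
$$\mathbf{P}(X = 1) = \mathbf{P}(X = -1) = \tfrac{1}{2}.$$
Define $S_0 := 0$, $S_n := X_1 + \cdots + X_n$, and set
$$T := \inf\{n : S_n = 1\}.$$
Let
$$\mathcal{F}_n := \sigma(X_1, \ldots, X_n) = \sigma(S_0, S_1, \ldots, S_n).$$
Then the process $S$ is adapted (to $\{\mathcal{F}_n\}$), so that $T$ is a stopping time. We wish to calculate the distribution of $T$.
For $\theta \in \mathbb{R}$,
$$\mathbf{E}\, e^{\theta X} = \tfrac{1}{2}(e^{\theta} + e^{-\theta}) = \cosh\theta,$$
so that
$$\mathbf{E}[(\operatorname{sech}\theta)\, e^{\theta X_n}] = 1, \quad \forall n.$$
Example (10.4,b) shows that $M^\theta$ is a martingale, where
$$M^\theta_n = (\operatorname{sech}\theta)^n e^{\theta S_n}.$$
Since $T$ is a stopping time, and $M^\theta$ is a martingale, we have
(a) $$\mathbf{E} M^\theta_{T \wedge n} = \mathbf{E}[(\operatorname{sech}\theta)^{T \wedge n} \exp(\theta S_{T \wedge n})] = 1, \quad \forall n.$$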
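►Now insist that $\theta > 0$. Then, firstly, $\exp(\theta S_{T \wedge n})$ is bounded by $e^\theta$, so $M^\theta_{T \wedge n}$ is bounded by $e^\theta$. Secondly, as $n \uparrow \infty$, $M^\theta_{T \wedge n} \to M^\theta_T$, where the latter is defined to be 0 if $T = \infty$. The Bounded Convergence Theorem allows us to let $n \to \infty$ in (a) to obtain
$$\mathbf{E} M^\theta_T = 1 = \mathbf{E}[(\operatorname{sech}\theta)^T e^\theta],$$
the term inside $[\cdot]$ on the right-hand side correctly being 0 if $T = \infty$. Hence
(b) $$\mathbf{E}[(\operatorname{sech}\theta)^T] = e^{-\theta} \quad \text{for } \theta > 0.$$
We now let $\theta \downarrow 0$. Then $(\operatorname{sech}\theta)^T \uparrow 1$ if $T < \infty$, and $(\operatorname{sech}\theta)^T = 0$ if $T = \infty$. Either (MON) or (BDD) yields
$$\mathbf{E}\, I_{\{T < \infty\}} = 1 = \mathbf{P}(T < \infty).$$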
►The above argument has been given carefully to show how to deal with possibly infinite stopping times.
Put $\alpha = \operatorname{sech}\theta$ in (b) to obtain
(c) $$\mathbf{E}(\alpha^T) = \sum_n \alpha^n \mathbf{P}(T = n) = \alpha^{-1}\big[1 - \sqrt{1 - \alpha^2}\,\big],$$
so that
$$\mathbf{P}(T = 2m - 1) = (-1)^{m+1} \binom{1/2}{m}.$$
We have
(d) $$f(\alpha) := \mathbf{E}(\alpha^T) = \tfrac{1}{2}\mathbf{E}(\alpha^T \mid X_1 = 1) + \tfrac{1}{2}\mathbf{E}(\alpha^T \mid X_1 = -1) = \tfrac{1}{2}\alpha + \tfrac{1}{2}\alpha f(\alpha)^2.$$
The intuitive reason for the very last term is that time 1 has already elapsed, giving the $\alpha$, and the time taken to go from $-1$ to 1 has the form $T_1 + T_2$, where $T_1$ (the time to go from $-1$ to 0) and $T_2$ (the time to go from 0 to 1) are independent, each with the same distribution as $T$. It is not obvious that '$T_1$ and $T_2$ are independent', but it is not difficult to devise a proof: the so-called Strong Markov Theorem would allow us to justify (d).
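The formula $\mathbf{P}(T = 2m-1) = (-1)^{m+1}\binom{1/2}{m}$ can be compared with first-passage probabilities computed directly from the walk. The following is a minimal sketch with illustrative function names; it uses exact rational arithmetic so that the comparison is an equality, not an approximation.

```python
from fractions import Fraction

def half_choose(m):
    """Generalised binomial coefficient C(1/2, m) as an exact fraction."""
    value = Fraction(1)
    for j in range(m):
        value *= (Fraction(1, 2) - j) / (j + 1)
    return value

def first_passage_probs(n_max):
    """Exact P(T = n) for n = 1..n_max, where T = inf{n : S_n = 1}."""
    alive = {0: Fraction(1)}          # law of S_n restricted to {T > n}
    probs = {}
    for n in range(1, n_max + 1):
        nxt, hit = {}, Fraction(0)
        for s, p in alive.items():
            for y in (s - 1, s + 1):
                if y == 1:
                    hit += p / 2      # first arrival at level 1 at time n
                else:
                    nxt[y] = nxt.get(y, Fraction(0)) + p / 2
        probs[n] = hit
        alive = nxt
    return probs

if __name__ == "__main__":
    probs = first_passage_probs(9)
    for m in range(1, 6):
        print(2 * m - 1, probs[2 * m - 1], (-1) ** (m + 1) * half_choose(m))
```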
10.13. Non-negative superharmonic functions for Markov chains
Let $E$ be a finite or countable set. Let $P = (p_{ij})$ be a stochastic $E \times E$ matrix, so that, for $i, j \in E$, we have
$$p_{ij} \ge 0, \qquad \sum_k p_{ik} = 1.$$
Let $\mu$ be a probability measure on $E$. We know from Section 4.8 that there exists a triple $(\Omega, \mathcal{F}, \mathbf{P}^\mu)$ (we now signify the dependence of $\mathbf{P}$ on $\mu$) carrying a Markov chain $Z = (Z_n : n \in \mathbb{Z}^+)$ such that (4.8,a) holds. We write 'a.s., $\mathbf{P}^\mu$' to signify 'almost surely relative to the $\mathbf{P}^\mu$-measure'.
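Let $\mathcal{F}_n := \sigma(Z_0, Z_1, \ldots, Z_n)$. It is easy to deduce from (4.8,a) that if we write $p(i,j)$ instead of $p_{ij}$ when typographically convenient, then (a.s., $\mathbf{P}^\mu$)
$$\mathbf{P}^\mu(Z_{n+1} = j \mid \mathcal{F}_n) = p(Z_n, j).$$
Let $h$ be a non-negative function on $E$ and define the function $Ph$ on $E$ via
$$(Ph)(i) := \sum_j p(i,j)\, h(j).$$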
Assume that our non-negative function $h$ is finite and $P$-superharmonic in that $Ph \le h$ on $E$. Then, (cMON) shows that, a.s., $\mathbf{P}^\mu$,
$$\mathbf{E}^\mu[h(Z_{n+1}) \mid \mathcal{F}_n] = \sum_j p(Z_n, j)\, h(j) = (Ph)(Z_n) \le h(Z_n),$$
so that $h(Z_n)$ is a non-negative supermartingale (whatever be the initial distribution $\mu$).
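Suppose that the chain $Z$ is irreducible recurrent in that
$$f_{ij} := \mathbf{P}^i(T_j < \infty) = 1, \quad \forall i, j \in E,$$
where $\mathbf{P}^i$ denotes $\mathbf{P}^\mu$ when $\mu$ is the unit mass ($\mu_j = \delta_{ij}$) at $i$ (see 'Note' below) and
$$T_j := \inf\{n : n \ge 1;\ Z_n = j\}.$$
Note that the infimum is over $\{n \ge 1\}$, so that $f_{ii}$ is the probability of a return to $i$ if $Z$ starts at $i$. Then, by Theorem 10.10(d), we see that if $h$ is non-negative and $P$-superharmonic, then, for $i$ and $j$ in $E$,
$$h(j) = \mathbf{E}^i h(Z_{T_j}) \le \mathbf{E}^i h(Z_0) = h(i),$$
so that $h$ is constant on $E$.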
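To make the condition $Ph \le h$ concrete, here is a small sketch on a hypothetical three-state irreducible chain (the matrix and the function names are chosen purely for illustration). A constant $h$ passes the superharmonicity test, while the non-constant examples fail it at a state where $h$ is smallest, in line with the conclusion that $h$ must be constant.

```python
from fractions import Fraction as F

def apply_P(P, h):
    """(Ph)(i) = sum_j p(i, j) h(j) for a finite stochastic matrix P."""
    return [sum(p_ij * h_j for p_ij, h_j in zip(row, h)) for row in P]

def is_superharmonic(P, h):
    """True when Ph <= h holds at every state."""
    return all(ph_i <= h_i for ph_i, h_i in zip(apply_P(P, h), h))

# Hypothetical chain on {0, 1, 2}: the end states jump to the middle,
# the middle state jumps to either end with probability 1/2.
P_EXAMPLE = [
    [F(0), F(1), F(0)],
    [F(1, 2), F(0), F(1, 2)],
    [F(0), F(1), F(0)],
]

if __name__ == "__main__":
    print(is_superharmonic(P_EXAMPLE, [F(2), F(2), F(2)]))
    print(is_superharmonic(P_EXAMPLE, [F(1), F(2), F(3)]))
```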
Exercise. Explain (at first intuitively, and later with consideration of rigour) why
$$f_{ij} = \sum_{k \ne j} p_{ik} f_{kj} + p_{ij} \ge \sum_k p_{ik} f_{kj}$$
and deduce that if every non-negative $P$-superharmonic function is constant, then $Z$ is irreducible recurrent.
So we have proved that our chain $Z$ is irreducible and recurrent if and only if every non-negative $P$-superharmonic function is constant. This is a trivial first step in the links between probability and potential theory.
Note. The perspicacious reader will have been upset by a lack of precision in this section. I wished to convey what is interesting first.
Only the very enthusiastic should read the remainder of this section.
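The natural thing to do, given the one-step transition matrix $P$, is to take the canonical model for the Markov chain $Z$ obtained as follows. Let $\mathcal{E}$ denote the $\sigma$-algebra of all subsets of $E$ and define
$$(\Omega, \mathcal{F}) := \prod_{n \in \mathbb{Z}^+} (E, \mathcal{E}).$$
In particular, a point $\omega$ of $\Omega$ is a sequence
$$\omega = (\omega_0, \omega_1, \ldots)$$
of elements of $E$. For $\omega$ in $\Omega$ and $n$ in $\mathbb{Z}^+$, define
$$Z_n(\omega) := \omega_n \in E.$$
Then, for each probability measure $\mu$ on $(E, \mathcal{E})$, there is a unique probability measure $\mathbf{P}^\mu$ on $(\Omega, \mathcal{F})$ such that for $n \in \mathbb{N}$ and $i_0, i_1, \ldots, i_n \in E$, we have
(*) $$\mathbf{P}^\mu[\omega : Z_0(\omega) = i_0, Z_1(\omega) = i_1, \ldots, Z_n(\omega) = i_n] = \mu_{i_0} p_{i_0 i_1} \cdots p_{i_{n-1} i_n}.$$
The uniqueness is trivial because $\omega$-sets of the form contained in $[\ ]$ on the left-hand side of (*), together with $\emptyset$, form a $\pi$-system generating $\mathcal{F}$. Existence follows because we can take $\mathbf{P}^\mu$ to be the $\tilde{\mathbf{P}}^\mu$-law of the non-canonical process $\tilde{Z}$ constructed in Section 4.8:
$$\mathbf{P}^\mu = \tilde{\mathbf{P}}^\mu \circ \tilde{Z}^{-1}.$$
Here, we regard $\tilde{Z}$ as the map
$$\tilde{Z} : \tilde{\Omega} \to \Omega, \qquad \tilde{\omega} \mapsto (\tilde{Z}_0(\tilde{\omega}), \tilde{Z}_1(\tilde{\omega}), \ldots),$$
this map $\tilde{Z}$ being $\tilde{\mathcal{F}}/\mathcal{F}$ measurable in that $\tilde{Z}^{-1} : \mathcal{F} \to \tilde{\mathcal{F}}$.
The canonical model thus obtained is very satisfying because the measurable space $(\Omega, \mathcal{F})$ carries all measures $\mathbf{P}^\mu$ simultaneously.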
Chapter 11
The Convergence Theorem
11.1. The picture that says it all
The top part of Figure 11.1 shows a sample path $n \mapsto X_n(\omega)$ for a process $X$ where $X_n - X_{n-1}$ represents your winnings per unit stake on game $n$. The lower part of the picture illustrates your total-winnings process $Y := C \bullet X$ under the previsible strategy $C$ described as follows:
Pick two numbers $a$ and $b$ with $a < b$.
REPEAT
  Wait until $X$ gets below $a$
  Play unit stakes until $X$ gets above $b$ and stop playing
UNTIL FALSE (that is, forever!).
Black blobs signify where $C = 1$; and open circles signify where $C = 0$. Recall that $C$ is not defined at time 0.
To be more formal (and to prove inductively that $C$ is previsible), define
$$C_1 := I_{\{X_0 < a\}},$$
and, for $n \ge 2$,
$$C_n := I_{\{C_{n-1} = 1\}} I_{\{X_{n-1} \le b\}} + I_{\{C_{n-1} = 0\}} I_{\{X_{n-1} < a\}}.$$
11.2. Upcrossings
The number $U_N[a,b](\omega)$ of upcrossings of $[a,b]$ made by $n \mapsto X_n(\omega)$ by time $N$ is defined to be the largest $k$ in $\mathbb{Z}^+$ such that we can find
$$0 \le s_1 < t_1 < s_2 < t_2 < \cdots < s_k < t_k \le N$$
with
$$X_{s_i}(\omega) < a, \quad X_{t_i}(\omega) > b \quad (1 \le i \le k).$$
[Figure 11.1: a sample path of $X$ (top) and the total-winnings process $Y = C \bullet X$ (bottom).]
The fundamental inequality
►(D) $$Y_N(\omega) \ge (b - a)\, U_N[a,b](\omega) - [X_N(\omega) - a]^-$$
is obvious from the picture: every upcrossing of $[a,b]$ increases the $Y$-value by at least $(b - a)$, while the $[X_N(\omega) - a]^-$ overemphasizes the loss during the last 'interval of play'.
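The inequality can also be checked mechanically on any finite path. The sketch below (function names are illustrative) implements the 'wait until below $a$, play until above $b$' strategy, counts upcrossings, and tests the inequality; it is a direct transcription of the strategy under the assumption that the path is given as a list $X_0, \ldots, X_N$.

```python
from itertools import product

def play_strategy(path, a, b):
    """Return (stakes C_1..C_N, winnings Y_0..Y_N) for Y = C . X, where C
    starts playing once X is below a and stops once X is above b."""
    stakes, winnings = [], [0]
    c_prev = 0
    for n in range(1, len(path)):
        if c_prev == 1:
            c = 1 if path[n - 1] <= b else 0   # keep playing until X > b
        else:
            c = 1 if path[n - 1] < a else 0    # start playing once X < a
        stakes.append(c)
        winnings.append(winnings[-1] + c * (path[n] - path[n - 1]))
        c_prev = c
    return stakes, winnings

def upcrossings(path, a, b):
    """Number of completed upcrossings of [a, b] along the path."""
    count, waiting_for_low = 0, True
    for x in path:
        if waiting_for_low and x < a:
            waiting_for_low = False
        elif not waiting_for_low and x > b:
            count += 1
            waiting_for_low = True
    return count

def inequality_holds(path, a, b):
    """Check Y_N >= (b - a) U_N[a, b] - (X_N - a)^-."""
    _, winnings = play_strategy(path, a, b)
    negative_part = max(a - path[-1], 0)
    return winnings[-1] >= (b - a) * upcrossings(path, a, b) - negative_part

def all_walk_paths(n):
    """All +/-1 paths of length n started at 0."""
    for steps in product((-1, 1), repeat=n):
        path = [0]
        for s in steps:
            path.append(path[-1] + s)
        yield path
```

Running `inequality_holds` over `all_walk_paths(n)` for a small `n` exercises every possible pattern of completed and unfinished intervals of play.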
11.3. Doob's Upcrossing Lemma
►Let $X$ be a supermartingale. Let $U_N[a,b]$ be the number of upcrossings of $[a,b]$ by time $N$. Then
$$(b - a)\, \mathbf{E}\, U_N[a,b] \le \mathbf{E}[(X_N - a)^-].$$
Proof. The process $C$ is previsible, bounded and $\ge 0$, and $Y = C \bullet X$. Hence $Y$ is a supermartingale, and $\mathbf{E}(Y_N) \le 0$. The result now follows from (11.2,D). □
11.4. COROLLARY
►Let $X$ be a supermartingale bounded in $\mathcal{L}^1$ in that
$$\sup_n \mathbf{E}(|X_n|) < \infty.$$
Let $a, b \in \mathbb{R}$ with $a < b$. Then, with $U_\infty[a,b] := \uparrow \lim_N U_N[a,b]$,
$$(b - a)\, \mathbf{E}\, U_\infty[a,b] \le |a| + \sup_n \mathbf{E}(|X_n|) < \infty,$$
so that
$$\mathbf{P}(U_\infty[a,b] = \infty) = 0.$$
Proof. By (11.3), we have, for $N \in \mathbb{N}$,
$$(b - a)\, \mathbf{E}\, U_N[a,b] \le |a| + \mathbf{E}(|X_N|) \le |a| + \sup_n \mathbf{E}(|X_n|).$$
Now let $N \uparrow \infty$, using (MON). □
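11.5. Doob's 'Forward' Convergence Theorem
►►Let $X$ be a supermartingale bounded in $\mathcal{L}^1$: $\sup_n \mathbf{E}(|X_n|) < \infty$. Then, almost surely, $X_\infty := \lim X_n$ exists and is finite. For definiteness, we define $X_\infty(\omega) := \limsup X_n(\omega)$, $\forall \omega$, so that $X_\infty$ is $\mathcal{F}_\infty$ measurable and $X_\infty = \lim X_n$, a.s.
Proof (Doob). Write (noting the use of $[-\infty, \infty]$):
$$\Lambda := \{\omega : X_n(\omega) \text{ does not converge to a limit in } [-\infty, \infty]\}$$
$$= \{\omega : \liminf X_n(\omega) < \limsup X_n(\omega)\}$$
$$= \bigcup_{\{a, b \in \mathbb{Q} : a < b\}} \{\omega : \liminf X_n(\omega) < a < b < \limsup X_n(\omega)\} =: \bigcup \Lambda_{a,b} \quad \text{(say)}.$$
But
$$\Lambda_{a,b} \subseteq \{\omega : U_\infty[a,b](\omega) = \infty\},$$
so that, by (11.4), $\mathbf{P}(\Lambda_{a,b}) = 0$. Since $\Lambda$ is a countable union of sets $\Lambda_{a,b}$, we see that $\mathbf{P}(\Lambda) = 0$, whence
$$X_\infty := \lim X_n \text{ exists a.s. in } [-\infty, \infty].$$
But Fatou's Lemma shows that
$$\mathbf{E}(|X_\infty|) = \mathbf{E}(\liminf |X_n|) \le \liminf \mathbf{E}(|X_n|) \le \sup_n \mathbf{E}(|X_n|) < \infty,$$
so that
$$\mathbf{P}(X_\infty \text{ is finite}) = 1. \quad \square$$
Note. There are other proofs for the discrete-parameter case. None of these proofs is as probabilistic as this one, and none shares the central importance of this one for the continuous-parameter case.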
11.6. Warning
As we saw for the branching-process example, it need not be true that $X_n \to X_\infty$ in $\mathcal{L}^1$.
11.7. Corollary
►►If $X$ is a non-negative supermartingale, then $X_\infty := \lim X_n$ exists almost surely.
Proof. $X$ is obviously bounded in $\mathcal{L}^1$, since $\mathbf{E}(|X_n|) = \mathbf{E}(X_n) \le \mathbf{E}(X_0)$. □
Chapter 12
Martingales bounded in $\mathcal{L}^2$
12.0. Introduction
When it works, one of the easiest ways of proving that a martingale $M$ is bounded in $\mathcal{L}^1$ is to prove that it is bounded in $\mathcal{L}^2$ in the sense that
(a) $$\sup_n \|M_n\|_2 < \infty, \quad \text{equivalently,} \quad \sup_n \mathbf{E}(M_n^2) < \infty.$$
Boundedness in $\mathcal{L}^2$ is often easy to check because of a Pythagorean formula (proved in Section 12.1)
$$\mathbf{E}(M_n^2) = \mathbf{E}(M_0^2) + \sum_{k=1}^{n} \mathbf{E}[(M_k - M_{k-1})^2].$$
The study of sums of independent random variables, a central topic in the classical theory, will be seen to hinge on Theorem 12.2 below, both parts of which have neat martingale proofs. We shall prove the Three-Series Theorem, which says exactly when a sum of independent random variables converges. We shall also prove the general Strong Law of Large Numbers for IID RVs and Lévy's extension of the Borel-Cantelli Lemmas.
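12.1. Martingales in $\mathcal{L}^2$; orthogonality of increments
Let $M = (M_n : n \ge 0)$ be a martingale in $\mathcal{L}^2$ in that each $M_n$ is in $\mathcal{L}^2$ so that $\mathbf{E}(M_n^2) < \infty$. Then for $s, t, u, v \in \mathbb{Z}^+$, with $s \le t \le u \le v$, we know that
$$\mathbf{E}(M_v \mid \mathcal{F}_u) = M_u \quad \text{(a.s.)},$$
so that $M_v - M_u$ is orthogonal to $\mathcal{L}^2(\mathcal{F}_u)$ and in particular,
(a) $$\langle M_t - M_s, M_v - M_u \rangle = 0.$$
Hence the formula
$$M_n = M_0 + \sum_{k=1}^{n} (M_k - M_{k-1})$$
expresses $M_n$ as the sum of orthogonal terms, and Pythagoras's theorem yields
(b) $$\mathbf{E}(M_n^2) = \mathbf{E}(M_0^2) + \sum_{k=1}^{n} \mathbf{E}[(M_k - M_{k-1})^2].$$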
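Formula (b) does not need independent increments, only the martingale property. The sketch below checks it by exhaustive enumeration for a hypothetical martingale whose step size depends on the past, $M_k - M_{k-1} = (1 + |M_{k-1}|)\varepsilon_k$ with fair signs $\varepsilon_k$; the construction and the names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def martingale_paths(n):
    """All 2**n equally likely paths of M with M_0 = 0 and
    M_k - M_{k-1} = (1 + |M_{k-1}|) * eps_k, eps_k = +/-1 fair signs."""
    paths = []
    for signs in product((-1, 1), repeat=n):
        m = [0]
        for eps in signs:
            m.append(m[-1] + (1 + abs(m[-1])) * eps)
        paths.append(m)
    return paths

def expect(paths, f):
    """Exact expectation of f(path) under the uniform law on paths."""
    return Fraction(sum(f(m) for m in paths), len(paths))

def pythagoras_sides(n):
    """Return (E(M_n^2), sum_k E[(M_k - M_{k-1})^2]) for the example."""
    paths = martingale_paths(n)
    lhs = expect(paths, lambda m: m[n] ** 2)
    rhs = sum(
        expect(paths, lambda m, k=k: (m[k] - m[k - 1]) ** 2)
        for k in range(1, n + 1)
    )
    return lhs, rhs
```

The stake $1 + |M_{k-1}|$ is previsible, so the increments are orthogonal even though they are far from independent.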
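THEOREM
►Let $M$ be a martingale for which $M_n \in \mathcal{L}^2$, $\forall n$. Then $M$ is bounded in $\mathcal{L}^2$ if and only if
(c) $$\sum \mathbf{E}[(M_k - M_{k-1})^2] < \infty;$$
and when this obtains,
$$M_n \to M_\infty \quad \text{almost surely and in } \mathcal{L}^2.$$
Proof. It is obvious from (b) that condition (c) is equivalent to the statement that $M$ is bounded in $\mathcal{L}^2$.
Suppose now that (c) holds. Then $M$ is bounded in $\mathcal{L}^2$, and, by the property of monotonicity of norms, $M$ is bounded in $\mathcal{L}^1$. Doob's Convergence Theorem 11.5 shows that $M_\infty := \lim M_n$ exists almost surely. The Pythagorean theorem implies that
(d) $$\mathbf{E}[(M_{n+r} - M_n)^2] = \sum_{k=n+1}^{n+r} \mathbf{E}[(M_k - M_{k-1})^2].$$
Letting $r \to \infty$ and applying Fatou's Lemma, we obtain
(e) $$\mathbf{E}[(M_\infty - M_n)^2] \le \sum_{k \ge n+1} \mathbf{E}[(M_k - M_{k-1})^2].$$
Hence
(f) $$\lim_n \mathbf{E}[(M_\infty - M_n)^2] = 0,$$
so that $M_n \to M_\infty$ in $\mathcal{L}^2$. □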
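12.2. Sums of zero-mean independent random variables in $\mathcal{L}^2$
THEOREM
►Suppose that $(X_k : k \in \mathbb{N})$ is a sequence of independent random variables such that, for every $k$,
$$\mathbf{E}(X_k) = 0, \qquad \sigma_k^2 := \operatorname{Var}(X_k) < \infty.$$
(a) Then
$$\Big(\sum \sigma_k^2 < \infty\Big) \text{ implies that } \Big(\sum X_k \text{ converges, a.s.}\Big).$$
(b) If the variables $(X_k)$ are bounded by some constant $K$ in $[0, \infty)$ in that $|X_k(\omega)| \le K$, $\forall k$, $\forall \omega$, then
$$\Big(\sum X_k \text{ converges, a.s.}\Big) \text{ implies that } \Big(\sum \sigma_k^2 < \infty\Big).$$
Note. Of course, the Kolmogorov 0-1 law implies that
$$\mathbf{P}\Big(\sum X_k \text{ converges}\Big) = 0 \text{ or } 1.$$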
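Notation. We define
$$\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n), \qquad M_n := X_1 + X_2 + \cdots + X_n$$
(with $\mathcal{F}_0 := \{\emptyset, \Omega\}$, $M_0 := 0$, by the usual conventions). We also define
$$A_n := \sum_{k=1}^{n} \sigma_k^2, \qquad N_n := M_n^2 - A_n,$$
so that $A_0 := 0$ and $N_0 := 0$.
Proof of (a). We know that $M$ is a martingale. Moreover
(*) $$\mathbf{E}[(M_k - M_{k-1})^2] = \mathbf{E}(X_k^2) = \sigma_k^2,$$
so that, from (12.1,b),
$$\mathbf{E}(M_n^2) = \sum_{k=1}^{n} \sigma_k^2 = A_n.$$
If $\sum \sigma_k^2 < \infty$, then $M$ is bounded in $\mathcal{L}^2$, so that $\lim M_n$ exists a.s. □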
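Proof of (b). We can strengthen (*) as follows: since $X_k$ is independent of $\mathcal{F}_{k-1}$, we have, almost surely,
$$\mathbf{E}[(M_k - M_{k-1})^2 \mid \mathcal{F}_{k-1}] = \mathbf{E}[X_k^2 \mid \mathcal{F}_{k-1}] = \mathbf{E}(X_k^2) = \sigma_k^2.$$
A familiar argument now applies: since $M_{k-1}$ is $\mathcal{F}_{k-1}$ measurable,
$$\sigma_k^2 = \mathbf{E}(M_k^2 \mid \mathcal{F}_{k-1}) - 2M_{k-1}\mathbf{E}(M_k \mid \mathcal{F}_{k-1}) + M_{k-1}^2 = \mathbf{E}(M_k^2 \mid \mathcal{F}_{k-1}) - M_{k-1}^2 \quad \text{(a.s.)}$$
But this result states that $N$ is a martingale.
Now let $c \in (0, \infty)$ and define
$$T := \inf\{r : |M_r| > c\}.$$
We know that $N^T$ is a martingale so that, for every $n$,
$$\mathbf{E} N^T_n = \mathbf{E}[(M^T_n)^2] - \mathbf{E} A_{T \wedge n} = 0.$$
But since $|M_T - M_{T-1}| = |X_T| \le K$ if $T$ is finite, we see that $|M^T_n| \le K + c$ for every $n$, whence
(**) $$\mathbf{E} A_{T \wedge n} \le (K + c)^2, \quad \forall n.$$
However, since $\sum X_n$ converges a.s., the partial sums of $\sum X_k$ are a.s. bounded, and it must be the case that for some $c$, $\mathbf{P}(T = \infty) > 0$. It is now clear from (**) that $A_\infty := \sum \sigma_k^2 < \infty$. □
Remark. The proof of (b) showed that if $(X_n)$ is a sequence of independent zero-mean RVs uniformly bounded by some constant $K$, then
$$\big(\mathbf{P}\{\text{partial sums of } \textstyle\sum X_k \text{ are bounded}\} > 0\big) \Rightarrow \big(\textstyle\sum X_k \text{ converges a.s.}\big)$$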
Generalization.
of
Theorem
12.2 with
Suppose
of IID
RVs
that
is a (a\342\200\236) with
sequence
of real
(\302\243\342\200\236)
is
a sequence
The results of Section 12.2 show that
$$\sum \varepsilon_n a_n \text{ converges (a.s.) if and only if } \sum a_n^2 < \infty,$$
and that
$$\sum \varepsilon_n a_n \text{ (a.s.) oscillates infinitely if } \sum a_n^2 = \infty.$$
You should think about how to clinch the latter statement.
12.4. A symmetrization technique: expanding the sample space
We need a stronger result than that provided by (12.2,b).
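LEMMA
Suppose that $(X_n)$ is a sequence of independent random variables bounded by a constant $K$ in $[0, \infty)$:
$$|X_n(\omega)| \le K, \quad \forall n, \forall \omega.$$
Then
$$\Big(\sum X_n \text{ converges, a.s.}\Big) \Rightarrow \Big(\sum \mathbf{E}(X_n) \text{ converges and } \sum \operatorname{Var}(X_n) < \infty\Big).$$
Proof. If each $X_n$ has mean zero, then of course, this would amount to (12.2,b). There is an elegant way of replacing each $X_n$ by a 'symmetrized version' $Z_n^*$ of mean 0 in such a way as to preserve enough of the structure.
Let $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{\mathbf{P}}, (\tilde{X}_n : n \in \mathbb{N}))$ be an exact copy of $(\Omega, \mathcal{F}, \mathbf{P}, (X_n : n \in \mathbb{N}))$. Define
$$(\Omega^*, \mathcal{F}^*, \mathbf{P}^*) := (\Omega, \mathcal{F}, \mathbf{P}) \times (\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{\mathbf{P}})$$
and, for $\omega^* = (\omega, \tilde{\omega}) \in \Omega^*$, define
$$X_n^*(\omega^*) := X_n(\omega), \quad \tilde{X}_n^*(\omega^*) := \tilde{X}_n(\tilde{\omega}), \quad Z_n^*(\omega^*) := X_n^*(\omega^*) - \tilde{X}_n^*(\omega^*).$$
We think of $X_n^*$ as $X_n$ lifted to the larger 'sample space' $(\Omega^*, \mathcal{F}^*, \mathbf{P}^*)$. It is clear (and may be proved by applying Uniqueness Lemma 1.6 in a familiar way) that the combined family
$$(X_n^* : n \in \mathbb{N}) \cup (\tilde{X}_n^* : n \in \mathbb{N})$$
is a family of independent random variables on $(\Omega^*, \mathcal{F}^*, \mathbf{P}^*)$, with both $X_n^*$ and $\tilde{X}_n^*$ having the $\mathbf{P}$-distribution of $X_n$:
$$\mathbf{P}^* \circ (X_n^*)^{-1} = \mathbf{P} \circ X_n^{-1} \text{ on } (\mathbb{R}, \mathcal{B}), \text{ etc.}$$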
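Now we have
(a) $(Z_n^* : n \in \mathbb{N})$ is a zero-mean sequence of independent random variables on $(\Omega^*, \mathcal{F}^*, \mathbf{P}^*)$ such that $|Z_n^*(\omega^*)| \le 2K$ ($\forall n, \forall \omega^*$) and
$$\operatorname{Var}(Z_n^*) = 2\sigma_n^2, \quad \text{where } \sigma_n^2 := \operatorname{Var}(X_n).$$
Let
$$G := \Big\{\omega \in \Omega : \sum X_n(\omega) \text{ converges}\Big\},$$
with $\tilde{G}$ defined similarly. Then we are given that $\mathbf{P}(G) = \tilde{\mathbf{P}}(\tilde{G}) = 1$, so that $\mathbf{P}^*(G \times \tilde{G}) = 1$. But $\sum Z_n^*(\omega^*)$ converges on $G \times \tilde{G}$, so that
(b) $$\mathbf{P}^*\Big(\sum Z_n^* \text{ converges}\Big) = 1.$$
From (a) and (b) and (12.2,b), we conclude that
$$\sum \sigma_n^2 < \infty,$$
and now it follows from (12.2,a) that
(c) $$\sum [X_n - \mathbf{E}(X_n)] \text{ converges, a.s.,}$$
the variables in this sum being zero-mean independent, with
$$\mathbf{E}[\{X_n - \mathbf{E}(X_n)\}^2] = \sigma_n^2.$$
Since (c) holds and $\sum X_n$ converges (a.s.) by hypothesis, $\sum \mathbf{E}(X_n)$ converges. □
Note. Another proof may be found in Section 18.6.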
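12.5. Kolmogorov's Three-Series Theorem
Let $(X_n)$ be a sequence of independent random variables. Then $\sum X_n$ converges almost surely if and only if for some (then for every) $K > 0$, the following three properties hold:
(i) $\sum_n \mathbf{P}(|X_n| > K) < \infty$,
(ii) $\sum_n \mathbf{E}(X_n^K)$ converges,
(iii) $\sum_n \operatorname{Var}(X_n^K) < \infty$,
where
$$X_n^K(\omega) := \begin{cases} X_n(\omega) & \text{if } |X_n(\omega)| \le K, \\ 0 & \text{if } |X_n(\omega)| > K. \end{cases}$$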
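Proof of 'if' part. Suppose that for some $K > 0$ properties (i)-(iii) hold. Then
$$\sum \mathbf{P}(X_n \ne X_n^K) = \sum \mathbf{P}(|X_n| > K) < \infty,$$
so that by (BC1)
$$\mathbf{P}(X_n = X_n^K \text{ for all but finitely many } n) = 1.$$
It is therefore clear that we need only show that $\sum X_n^K$ converges almost surely; and because of (ii), we need only prove that
$$\sum Y_n^K \text{ converges, a.s., where } Y_n^K := X_n^K - \mathbf{E}(X_n^K).$$
However, the sequence $(Y_n^K : n \in \mathbb{N})$ is a zero-mean sequence of independent random variables with
$$\mathbf{E}[(Y_n^K)^2] = \operatorname{Var}(X_n^K).$$
Because of (iii), the desired result now follows from (12.2,a).
Proof of 'only if' part. Suppose that $\sum X_n$ converges, a.s., and that $K$ is any constant in $(0, \infty)$. Since it is almost surely true that $X_n \to 0$, whence $|X_n| > K$ for only finitely many $n$, (BC2) shows that (i) holds. Since (a.s.) $X_n = X_n^K$ for all but finitely many $n$, we know also that
$$\sum X_n^K \text{ converges, a.s.}$$
Lemma 12.4 completes the proof. □
Results such as the Three-Series Theorem become powerful when used in conjunction with Kronecker's Lemma (Section 12.7).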
12.6. Cesàro's Lemma
Suppose that $(b_n)$ is a sequence of strictly positive real numbers with $b_n \uparrow \infty$, and that $(v_n)$ is a convergent sequence of real numbers: $v_n \to v_\infty \in \mathbb{R}$. Then
$$\frac{1}{b_n} \sum_{k=1}^{n} (b_k - b_{k-1})\, v_k \to v_\infty \quad (n \to \infty).$$
Here, $b_0 := 0$.
Proof. Let $\varepsilon > 0$. Choose $N$ such that
$$v_k > v_\infty - \varepsilon \quad \text{whenever } k \ge N.$$
Then,
$$\liminf_{n \to \infty} \frac{1}{b_n} \sum_{k=1}^{n} (b_k - b_{k-1})\, v_k \ge \liminf_{n \to \infty} \Big\{ \frac{1}{b_n} \sum_{k=1}^{N} (b_k - b_{k-1})\, v_k + \frac{b_n - b_N}{b_n}(v_\infty - \varepsilon) \Big\} \ge 0 + v_\infty - \varepsilon.$$
Since this is true for every $\varepsilon > 0$, we have $\liminf \ge v_\infty$; and since, by a similar argument, $\limsup \le v_\infty$, the result follows. □
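12.7. Kronecker's Lemma
►Again, let $(b_n)$ denote a sequence of strictly positive real numbers with $b_n \uparrow \infty$. Let $(x_n)$ be a sequence of real numbers, and define
$$s_n := x_1 + x_2 + \cdots + x_n.$$
Then
$$\Big(\sum \frac{x_n}{b_n} \text{ converges}\Big) \Rightarrow \Big(\frac{s_n}{b_n} \to 0\Big).$$
Proof. Let $u_n := \sum_{k \le n} (x_k / b_k)$, so that $u_\infty := \lim u_n$ exists. Then
$$u_n - u_{n-1} = x_n / b_n.$$
Thus
$$s_n = \sum_{k=1}^{n} b_k (u_k - u_{k-1}) = b_n u_n - \sum_{k=1}^{n} (b_k - b_{k-1})\, u_{k-1}.$$
Cesàro's Lemma now shows that
$$s_n / b_n \to u_\infty - u_\infty = 0. \quad \square$$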
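The summation-by-parts identity at the heart of the proof can be verified exactly on any finite stretch of a sequence. The following sketch (the function name and the test sequence are illustrative) computes the difference between the two sides with rational arithmetic, using the conventions $b_0 = 0$ and $u_0 = 0$.

```python
from fractions import Fraction

def kronecker_identity_gap(x, b):
    """Return s_n - (b_n u_n - sum_{k<=n} (b_k - b_{k-1}) u_{k-1}),
    where u_k = sum_{j<=k} x_j / b_j, s_n = x_1 + ... + x_n, b_0 = u_0 = 0."""
    u = Fraction(0)           # running u_k
    s = Fraction(0)           # running s_k
    b_prev = Fraction(0)      # b_{k-1}
    correction = Fraction(0)  # running sum of (b_k - b_{k-1}) u_{k-1}
    for x_k, b_k in zip(x, b):
        correction += (b_k - b_prev) * u   # u still holds u_{k-1} here
        u += Fraction(x_k) / b_k
        s += x_k
        b_prev = b_k
    return s - (b_prev * u - correction)

if __name__ == "__main__":
    n = 200
    x = [(-1) ** (k + 1) for k in range(1, n + 1)]   # alternating signs
    b = [Fraction(k) for k in range(1, n + 1)]       # b_k = k
    print(kronecker_identity_gap(x, b))
    print(Fraction(sum(x), n))                       # s_n / b_n
```

With $x_k = (-1)^{k+1}$ and $b_k = k$, the series $\sum x_k/b_k$ is the alternating harmonic series, and $s_n/b_n$ is either 0 or $1/n$, consistent with the lemma.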
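12.8. A Strong Law under variance constraints
LEMMA
Let $(W_n)$ be a sequence of independent random variables such that
$$\mathbf{E}(W_n) = 0, \qquad \sum \frac{\operatorname{Var}(W_n)}{n^2} < \infty.$$
Then $n^{-1} \sum_{k \le n} W_k \to 0$, a.s.
Proof. By Kronecker's Lemma, it is enough to prove that $\sum (W_n / n)$ converges, a.s. But this is immediate from Theorem 12.2(a). □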
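Note. We are now going to see that a truncation technique allows us to obtain the general Strong Law for IID RVs.
12.9. Kolmogorov's Truncation Lemma
Suppose that $X_1, X_2, \ldots$ are IID RVs each with the same distribution as $X$, where $\mathbf{E}(|X|) < \infty$. Set $\mu := \mathbf{E}(X)$. Define
$$Y_n := \begin{cases} X_n & \text{if } |X_n| \le n, \\ 0 & \text{if } |X_n| > n. \end{cases}$$
Then
(i) $\mathbf{E}(Y_n) \to \mu$;
(ii) $\mathbf{P}[Y_n = X_n \text{ eventually}] = 1$;
(iii) $\sum n^{-2} \operatorname{Var}(Y_n) < \infty$.
Proof of (i). Let
$$Z_n := \begin{cases} X & \text{if } |X| \le n, \\ 0 & \text{if } |X| > n. \end{cases}$$
Then $Z_n$ has the same distribution as $Y_n$, so in particular, $\mathbf{E}(Z_n) = \mathbf{E}(Y_n)$. But, as $n \to \infty$, we have
$$Z_n \to X, \qquad |Z_n| \le |X|,$$
so, by (DOM), $\mathbf{E}(Z_n) \to \mathbf{E}(X) = \mu$. □
Proof of (ii). We have
$$\sum_{n=1}^{\infty} \mathbf{P}(Y_n \ne X_n) = \sum_{n=1}^{\infty} \mathbf{P}(|X_n| > n) = \sum_{n=1}^{\infty} \mathbf{P}(|X| > n) = \mathbf{E} \sum_{n=1}^{\infty} I_{\{|X| > n\}} = \mathbf{E} \sum_{1 \le n < |X|} 1 \le \mathbf{E}(|X|) < \infty,$$
so that by (BC1), $\mathbf{P}(Y_n \ne X_n \text{ infinitely often}) = 0$. □
Proof of (iii). We have
$$\sum \frac{\operatorname{Var}(Y_n)}{n^2} \le \sum \frac{\mathbf{E}(Y_n^2)}{n^2} = \sum_n \frac{\mathbf{E}(|X|^2; |X| \le n)}{n^2} = \mathbf{E}[|X|^2 f(|X|)],$$
where, for $0 < z < \infty$,
$$f(z) = \sum_{n \ge \max(1, z)} n^{-2} \le 2 / \max(1, z).$$
We have used the fact that, for $n \ge 1$,
$$\frac{1}{n^2} \le \frac{2}{n(n+1)} = 2\Big(\frac{1}{n} - \frac{1}{n+1}\Big).$$
Hence
$$\sum \frac{\operatorname{Var}(Y_n)}{n^2} \le 2\,\mathbf{E}(|X|) < \infty. \quad \square$$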
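12.10. Kolmogorov's Strong Law of Large Numbers (SLLN)
►►Let $X_1, X_2, \ldots$ be IID RVs with $\mathbf{E}(|X_k|) < \infty$, $\forall k$. Define
$$S_n := X_1 + X_2 + \cdots + X_n.$$
Then, with $\mu := \mathbf{E}(X_k)$, $\forall k$,
$$n^{-1} S_n \to \mu, \quad \text{almost surely.}$$
Proof. Define $Y_n$ as in Lemma 12.9. By property (ii) of that lemma, we need only show that
$$n^{-1} \sum_{k \le n} Y_k \to \mu, \quad \text{a.s.}$$
But
(a) $$n^{-1} \sum_{k \le n} Y_k = n^{-1} \sum_{k \le n} \mathbf{E}(Y_k) + n^{-1} \sum_{k \le n} W_k,$$
where $W_k := Y_k - \mathbf{E}(Y_k)$. But, the first term on the right-hand side of (a) tends to $\mu$ by (12.9,i) and Cesàro's Lemma; and the second converges to 0, a.s., by Lemma 12.8. □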
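A simulation cannot prove an almost-sure statement, but it can show the two ingredients of the proof at work. The sketch below assumes, purely for illustration, that the $X_k$ are exponential with mean $\mu = 1$ (any integrable law would do); it returns the sample mean $S_n/n$ together with the last index $k$ at which the truncated variable $Y_k$ differs from $X_k$. The seed and the function name are arbitrary choices.

```python
import random

def truncated_running_mean(n, seed=0):
    """Simulate X_1..X_n ~ Exp(1), so mu = 1. Return (S_n / n, last k with
    |X_k| > k), i.e. the last index where the truncation Y_k differs from X_k."""
    rng = random.Random(seed)
    total, last_mismatch = 0.0, 0
    for k in range(1, n + 1):
        x = rng.expovariate(1.0)
        if abs(x) > k:
            # Here Y_k = 0 while X_k != 0, so Y_k != X_k.
            last_mismatch = k
        total += x
    return total / n, last_mismatch

if __name__ == "__main__":
    print(truncated_running_mean(200_000))
```

The second component stays small however long the run, which is the content of (12.9,ii): truncation changes only finitely many terms.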
Remarks. The Strong Law is philosophically satisfying in that it gives a precise formulation of $\mathbf{E}(X)$ as 'the mean of a large number of independent realizations of $X$'. We know from Exercise E4.6 that if $\mathbf{E}(|X|) = \infty$, then
$$\limsup |S_n| / n = \infty, \quad \text{a.s.}$$
So, we have arrived at the best possible result for the IID case.
Discussion of methods. Even though we have achieved a good result, it has to be admitted that the truncation technique seems 'ad hoc': it does not have the pure-mathematical elegance - the sense of rightness - which the martingale proof and the proof by ergodic theory (the latter is not in this book) both possess. However, each of the methods can be adapted to cover situations which the others cannot tackle; and, in particular, classical truncation arguments retain great importance.
Properly formulated, the argument which gave the result, Theorem 12.2, on which all of this chapter has so far relied, can yield much more.
12.11. Doob decomposition
In the following theorem, the statement that '$A$ is a previsible process null at 0' means of course that $A_0 = 0$ and $A_n \in \mathrm{m}\mathcal{F}_{n-1}$ ($n \in \mathbb{N}$).
THEOREM
►►(a) Let $(X_n : n \in \mathbb{Z}^+)$ be an adapted process with $X_n \in \mathcal{L}^1$, $\forall n$. Then $X$ has a Doob decomposition
(D) $$X = X_0 + M + A$$
where $M$ is a martingale null at 0, and $A$ is a previsible process null at 0. Moreover, this decomposition is unique modulo indistinguishability in the sense that if $X = X_0 + \tilde{M} + \tilde{A}$ is another such decomposition, then
$$\mathbf{P}(M_n = \tilde{M}_n, A_n = \tilde{A}_n, \forall n) = 1.$$
►(b) $X$ is a submartingale if and only if the process $A$ is an increasing process in the sense that
$$\mathbf{P}(A_n \le A_{n+1}, \forall n) = 1.$$
Proof. If $X$ has a Doob decomposition as at (D), then, since $M$ is a martingale and $A$ is previsible, we have, almost surely,
$$\mathbf{E}(X_n - X_{n-1} \mid \mathcal{F}_{n-1}) = \mathbf{E}(M_n - M_{n-1} \mid \mathcal{F}_{n-1}) + \mathbf{E}(A_n - A_{n-1} \mid \mathcal{F}_{n-1}) = 0 + (A_n - A_{n-1}).$$
Hence
(c) $$A_n = \sum_{k=1}^{n} \mathbf{E}(X_k - X_{k-1} \mid \mathcal{F}_{k-1}), \quad \text{a.s.},$$
and if we use (c) to define $A$, we obtain the required decomposition of $X$.
The 'submartingale' result (b) is now obvious. □
Remark. The Doob-Meyer decomposition, which expresses a submartingale in continuous time as the sum of a local martingale and a previsible increasing process, is a deep result which is the foundation stone for stochastic-integral theory.
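12.12. The angle-brackets process $\langle M \rangle$
Let $M$ be a martingale in $\mathcal{L}^2$ and null at 0. Then the conditional form of Jensen's inequality shows that
(a) $M^2$ is a submartingale.
Thus $M^2$ has a Doob decomposition:
(b) $$M^2 = N + A,$$
where $N$ is a martingale and $A$ is a previsible increasing process, both $N$ and $A$ being null at 0. Define $A_\infty := \uparrow \lim A_n$, a.s.
Notation. The process $A$ is often written $\langle M \rangle$.
Since $\mathbf{E}(M_n^2) = \mathbf{E}(A_n)$, we see that
(c) $M$ is bounded in $\mathcal{L}^2$ if and only if $\mathbf{E}(A_\infty) < \infty$.
It is important to note that
►(d) $$A_n - A_{n-1} = \mathbf{E}(M_n^2 - M_{n-1}^2 \mid \mathcal{F}_{n-1}) = \mathbf{E}[(M_n - M_{n-1})^2 \mid \mathcal{F}_{n-1}].$$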
12.13. Relating convergence of $M$ to finiteness of $\langle M \rangle_\infty$
Again let $M$ be a martingale in $\mathcal{L}^2$ and null at 0. Define $A := \langle M \rangle$. (More strictly, let $A$ be 'a version of' $\langle M \rangle$.)
THEOREM
►(a) $\lim_n M_n(\omega)$ exists for almost every $\omega$ for which $A_\infty(\omega) < \infty$.
►(b) Suppose that $M$ has uniformly bounded increments in that for some $K$ in $\mathbb{R}$,
$$|M_n(\omega) - M_{n-1}(\omega)| \le K, \quad \forall n, \forall \omega.$$
Then $A_\infty(\omega) < \infty$ for almost every $\omega$ for which $\lim M_n(\omega)$ exists.
Remark. This is obviously an extension of Theorem 12.2. It is a very substantial one.
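Proof of (a). Because $A$ is previsible, it is immediate that for every $k \in \mathbb{N}$,
$$S(k) := \inf\{n \in \mathbb{Z}^+ : A_{n+1} > k\}$$
defines a stopping time $S(k)$. Moreover, the stopped process $A^{S(k)}$ is previsible because for $B \in \mathcal{B}$, and $n \in \mathbb{N}$,
$$\{A_{n \wedge S(k)} \in B\} = F_1 \cup F_2,$$
where
$$F_1 := \bigcup_{r=0}^{n-1} \{S(k) = r;\ A_r \in B\} \in \mathcal{F}_{n-1},$$
$$F_2 := \{A_n \in B\} \cap \{S(k) \le n-1\}^c \in \mathcal{F}_{n-1}.$$
Since
$$(M^{S(k)})^2 - A^{S(k)} = (M^2 - A)^{S(k)}$$
is a martingale, we now see that $\langle M^{S(k)} \rangle = A^{S(k)}$. However, the process $A^{S(k)}$ is bounded by $k$, so that, by (12.12,c), $M^{S(k)}$ is bounded in $\mathcal{L}^2$ and
(c) $$\lim_n M_{n \wedge S(k)} \text{ exists almost surely.}$$
However,
(d) $$\{A_\infty < \infty\} = \bigcup_k \{S(k) = \infty\}.$$
Result (a) now follows on combining (c) and (d). □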
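Proof of (b). Suppose that
$$\mathbf{P}(A_\infty = \infty,\ \sup_n |M_n| < \infty) > 0.$$
Then for some $c > 0$,
(e) $$\mathbf{P}(T(c) = \infty,\ A_\infty = \infty) > 0,$$
where $T(c)$ is the stopping time:
$$T(c) := \inf\{r : |M_r| > c\}.$$
Now,
$$\mathbf{E}(M^2_{T(c) \wedge n} - A_{T(c) \wedge n}) = 0,$$
and $M^{T(c)}$ is bounded by $c + K$. Thus
(f) $$\mathbf{E} A_{T(c) \wedge n} \le (c + K)^2, \quad \forall n.$$
But (MON) shows that (e) and (f) are incompatible. Result (b) follows. □
Remark. In the proof of (a), we were able to use previsibility to make the jump $A_{S(k)} - A_{S(k)-1}$ irrelevant. We could not do this for the jump $M_{T(c)} - M_{T(c)-1}$, which is why we needed the assumption about bounded increments.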
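12.14. A trivial 'Strong Law' for martingales in $\mathcal{L}^2$
Let $M$ be a martingale in $\mathcal{L}^2$ and null at 0, and let $A = \langle M \rangle$. Since $(1 + A)^{-1}$ is a bounded previsible process,
$$W_n := \sum_{1 \le k \le n} \frac{M_k - M_{k-1}}{1 + A_k} = \big((1 + A)^{-1} \bullet M\big)_n$$
defines a martingale $W$. Moreover, since $(1 + A_n)$ is $\mathcal{F}_{n-1}$ measurable,
$$\mathbf{E}[(W_n - W_{n-1})^2 \mid \mathcal{F}_{n-1}] = (1 + A_n)^{-2}(A_n - A_{n-1}) \le (1 + A_{n-1})^{-1} - (1 + A_n)^{-1}, \quad \text{a.s.}$$
We see that $\langle W \rangle_\infty \le 1$, a.s., so that $\lim W_n$ exists, a.s. Kronecker's Lemma now shows that
►(a) $$M_n / A_n \to 0 \quad \text{a.s. on } \{A_\infty = \infty\}.$$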
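12.15. Lévy's extension of the Borel-Cantelli Lemmas
THEOREM
Suppose that for $n \in \mathbb{N}$, $E_n \in \mathcal{F}_n$. Define
$$Z_n := \sum_{1 \le k \le n} I_{E_k} = \text{number of } E_k\ (k \le n) \text{ which occur.}$$
Define $\xi_k := \mathbf{P}(E_k \mid \mathcal{F}_{k-1})$, and
$$Y_n := \sum_{1 \le k \le n} \xi_k.$$
Then, almost surely,
(a) $(Y_\infty < \infty) \Rightarrow (Z_\infty < \infty)$,
(b) $(Y_\infty = \infty) \Rightarrow (Z_n / Y_n \to 1)$.
Remarks. (i) Since $\mathbf{E}\xi_k = \mathbf{P}(E_k)$, it follows that if $\sum \mathbf{P}(E_k) < \infty$, then $Y_\infty < \infty$, a.s. (BC1) therefore follows.
(ii) Let $(E_n : n \in \mathbb{N})$ be a sequence of independent events associated with some triple $(\Omega, \mathcal{F}, \mathbf{P})$, and define $\mathcal{F}_n = \sigma(E_1, E_2, \ldots, E_n)$. Then $\xi_k = \mathbf{P}(E_k)$, a.s., and (BC2) follows from (b).
Proof. Let $M$ be the martingale $Z - Y$, so that $Z = M + Y$ is the Doob decomposition of the submartingale $Z$. Then (you check!)
$$A_n := \langle M \rangle_n = \sum_{k \le n} \xi_k (1 - \xi_k) \le Y_n, \quad \text{a.s.}$$
If $Y_\infty < \infty$, then $A_\infty < \infty$ and $\lim M_n$ exists, so that $Z_\infty$ is finite. (We are skipping 'except for a null $\omega$-set' statements now.)
If $Y_\infty = \infty$ and $A_\infty < \infty$, then $\lim M_n$ still exists, and it is trivial that $Z_n / Y_n \to 1$.
If $Y_\infty = \infty$ and $A_\infty = \infty$, then $M_n / A_n \to 0$, so that, a fortiori, $M_n / Y_n \to 0$ and $Z_n / Y_n \to 1$. □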
12.16. Comments
The last few sections have indicated just how powerful the use of $\langle M \rangle$ to study $M$ is likely to be. In the same way as one can obtain the conditional version Theorem 12.15 of the Borel-Cantelli Lemmas, one can obtain conditional versions of the Three-Series Theorem etc. But a whole new world is opened up: see Neveu (1975), for example. In the continuous-time case, things are more striking still. See, for example, Rogers and Williams.
13
Uniform
Integrability
We
have
already
seen
full
a number
we
In
To derive
sufficient concept
the
Theorem. Convergence
condition required
benefit,
of nice applicationsof martingale theory. something better than the DominatedTheorem 13.7 gives a necessary and particular,
need
for
a sequence
links
of
RVs
to
converge
on
The \302\243^. of
new randomi
is that
of a
family
variables.
This concept
martingales.
with
conditional
expectations
and
hence with
The
examiners and others: modes of convergence. use of the Upcrossing Our Lemma has meant that this topic does not feature large in the main text of
this
appendix
to this chapter
of contains a discussion
that
topic
loved by
book.
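13.1. An 'absolute continuity' property
LEMMA
►(a) Suppose that $X \in \mathcal{L}^1 = \mathcal{L}^1(\Omega, \mathcal{F}, \mathbf{P})$. Then, given $\varepsilon > 0$, there exists a $\delta > 0$ such that for $F \in \mathcal{F}$, $\mathbf{P}(F) < \delta$ implies that $\mathbf{E}(|X|; F) < \varepsilon$.
Proof. If the conclusion is false, then, for some $\varepsilon_0 > 0$, we can find a sequence $(F_n)$ of elements of $\mathcal{F}$ such that
$$\mathbf{P}(F_n) < 2^{-n} \quad \text{and} \quad \mathbf{E}(|X|; F_n) \ge \varepsilon_0.$$
Let $H := \limsup F_n$. Then (BC1) shows that $\mathbf{P}(H) = 0$, but the 'Reverse' Fatou Lemma shows that
$$\mathbf{E}(|X|; H) \ge \varepsilon_0;$$
and we have arrived at the required contradiction. □
Corollary
(b) Suppose that $X \in \mathcal{L}^1$ and that $\varepsilon > 0$. Then there exists $K$ in $[0, \infty)$ such that
$$\mathbf{E}(|X|; |X| > K) < \varepsilon.$$
Proof. Let $\delta$ be as in Lemma (a). Since
$$K\, \mathbf{P}(|X| > K) \le \mathbf{E}(|X|),$$
we can choose $K$ such that $\mathbf{P}(|X| > K) < \delta$. □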
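13.2. Definition. UI family
►►A class $\mathcal{C}$ of random variables is called uniformly integrable (UI) if given $\varepsilon > 0$, there exists $K$ in $[0, \infty)$ such that
$$\mathbf{E}(|X|; |X| > K) < \varepsilon, \quad \forall X \in \mathcal{C}.$$
We note that for such a class $\mathcal{C}$, we have (with $K_1$ relating to $\varepsilon = 1$) for every $X \in \mathcal{C}$,
$$\mathbf{E}(|X|) = \mathbf{E}(|X|; |X| > K_1) + \mathbf{E}(|X|; |X| \le K_1) \le 1 + K_1.$$
Thus, a UI family is bounded in $\mathcal{L}^1$.
It is not true that a family bounded in $\mathcal{L}^1$ is UI.
Example. Take $(\Omega, \mathcal{F}, \mathbf{P}) = ([0,1], \mathcal{B}[0,1], \mathrm{Leb})$. Let
$$E_n = (0, n^{-1}), \qquad X_n = n I_{E_n}.$$
Then $\mathbf{E}(|X_n|) = 1$, $\forall n$, so that $(X_n)$ is bounded in $\mathcal{L}^1$. However, for any $K > 0$, we have for $n > K$,
$$\mathbf{E}(|X_n|; |X_n| > K) = n \mathbf{P}(E_n) = 1,$$
so that $(X_n)$ is not UI. Here, $X_n \to 0$, but $\mathbf{E}(X_n) \not\to 0$.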
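The arithmetic of this example is simple enough to encode directly. The sketch below (function names are illustrative) returns the exact tail expectation $\mathbf{E}(|X_n|; |X_n| > K)$ for $X_n = n I_{(0, 1/n)}$ and the supremum of these tails over $n \le n_{\max}$, which stays at 1 for every cut-off $K < n_{\max}$.

```python
from fractions import Fraction

def tail_expectation(n, K):
    """E(|X_n|; |X_n| > K) for X_n = n * I_{(0, 1/n)} under Lebesgue
    measure on [0, 1]: X_n equals n on a set of measure 1/n, else 0."""
    if n > K:
        return Fraction(n) * Fraction(1, n)
    return Fraction(0)

def sup_tail(K, n_max):
    """Supremum over n = 1..n_max of the tail expectations at level K."""
    return max(tail_expectation(n, K) for n in range(1, n_max + 1))

if __name__ == "__main__":
    for K in (1, 10, 100):
        print(K, sup_tail(K, 1000))
```

Taking $K = 0$ in `tail_expectation` recovers $\mathbf{E}(|X_n|) = 1$, the $\mathcal{L}^1$ bound.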
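13.3. Two simple sufficient conditions for the UI property
(a) Suppose that $\mathcal{C}$ is a class of random variables which is bounded in $\mathcal{L}^p$ for some $p > 1$; thus, for some $A \in [0, \infty)$,
$$\mathbf{E}(|X|^p) < A, \quad \forall X \in \mathcal{C}.$$
Then $\mathcal{C}$ is UI.
Proof. If $v \ge K > 0$, then $v \le K^{1-p} v^p$ (obviously!). Hence, for $K > 0$ and $X \in \mathcal{C}$, we have
$$\mathbf{E}(|X|; |X| > K) \le K^{1-p}\, \mathbf{E}(|X|^p; |X| > K) \le K^{1-p} A.$$
The result follows. □
(b) Suppose that $\mathcal{C}$ is a class of random variables which is dominated by an integrable non-negative variable $Y$:
$$|X(\omega)| \le Y(\omega), \quad \forall X \in \mathcal{C}, \quad \text{and} \quad \mathbf{E}(Y) < \infty.$$
Then $\mathcal{C}$ is UI.
Note. It is precisely this which makes (DOM) work for our $(\Omega, \mathcal{F}, \mathbf{P})$.
Proof. It is obvious that, for $K > 0$ and $X \in \mathcal{C}$,
$$\mathbf{E}(|X|; |X| > K) \le \mathbf{E}(Y; Y > K),$$
and now it is only necessary to apply (13.1,b) to $Y$. □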
reason
that
See
the UI
Exercise
is the following.
property
E13.3
fits
in
so well
for an
important extension.
with martingale
theory
THEOREM
\342\226\272 \342\226\272
Let
e C^.
: g a sub-(7-algebra {E{X\\g)
is uniformity
of
J^}
integrable.
in question
Note. Becauseof the business of versions, a formal descriptionof the class C would be as follows: y G C if and only if for some sub-cr-algebra of y is a version of E{X\\Q). ^ ^,
Proof.
Let e
> 0
that,
for < e.
F e
J^,
E{\\X\\;F)
,.(13,5)
Choose
Chapter 13:
K so that
Uniform
Integrability
129
be
any
version
of E{X\\Q).
(a)
Hence
|F|<E(|X||a),
E(|r|)
a.s.
< E(|J5f
|) and
ii:p(|y|>JO<E(|r|)<E(|x|), so that
But
p(|y| >
> K}
G
A')
<
s.
definition
{|F|
G^
so
that,
from (a)
and the
of
conditional
expectation,
E(|r|;|F|>A')<E(|X|;|F|>/r)<\302\243.
Note,
just
Now
you
can
see why
the
result
(13.1,b)
13.5. Convergence in probability
Let $(X_n)$ be a sequence of random variables, and let $X$ be a random variable. We say that
►►$$X_n \to X \text{ in probability}$$
if for every $\varepsilon > 0$,
$$\mathbf{P}(|X_n - X| > \varepsilon) \to 0 \text{ as } n \to \infty.$$
LEMMA
►If $X_n \to X$ almost surely, then $X_n \to X$ in probability.
Proof. Suppose that $X_n \to X$ almost surely and that $\varepsilon > 0$. Then by the Reverse Fatou Lemma for sets,
$$0 = \mathbf{P}(|X_n - X| > \varepsilon, \text{ i.o.}) = \mathbf{P}(\limsup\{|X_n - X| > \varepsilon\}) \ge \limsup \mathbf{P}(|X_n - X| > \varepsilon),$$
and the result is proved. □
Note. As already mentioned, a discussion of the relationships between various modes of convergence may be found in the appendix to this chapter.
13.6. Elementary proof of (BDD)
We restate the Bounded Convergence Theorem, but under the weaker hypothesis of 'convergence in probability' rather than 'almost sure convergence'.
THEOREM (BDD)
Let $(X_n)$ be a sequence of RVs, and let $X$ be a RV. Suppose that $X_n \to X$ in probability and that for some $K$ in $[0, \infty)$, we have for every $n$ and $\omega$,
$$|X_n(\omega)| \le K.$$
Then
$$\mathbf{E}(|X_n - X|) \to 0.$$
Proof. Let us check that $\mathbf{P}(|X| \le K) = 1$. Indeed, for $k \in \mathbb{N}$,
$$\mathbf{P}(|X| > K + k^{-1}) \le \mathbf{P}(|X - X_n| > k^{-1}), \quad \forall n,$$
so that $\mathbf{P}(|X| > K + k^{-1}) = 0$. Thus
$$\mathbf{P}(|X| > K) = \mathbf{P}\Big(\bigcup_k \{|X| > K + k^{-1}\}\Big) = 0.$$
Let $\varepsilon > 0$ be given. Choose $n_0$ such that
$$\mathbf{P}(|X_n - X| > \tfrac{1}{3}\varepsilon) < \frac{\varepsilon}{3K} \quad \text{when } n \ge n_0.$$
Then, for $n \ge n_0$,
$$\mathbf{E}(|X_n - X|) = \mathbf{E}(|X_n - X|; |X_n - X| > \tfrac{1}{3}\varepsilon) + \mathbf{E}(|X_n - X|; |X_n - X| \le \tfrac{1}{3}\varepsilon) \le 2K\, \mathbf{P}(|X_n - X| > \tfrac{1}{3}\varepsilon) + \tfrac{1}{3}\varepsilon \le \varepsilon.$$
The proof is finished. □
This proof shows (much as does that of the Weierstrass approximation theorem) that convergence in probability is a natural concept.
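13.7. A necessary and sufficient condition for $\mathcal{L}^1$ convergence
THEOREM
►►Let $(X_n)$ be a sequence in $\mathcal{L}^1$, and let $X \in \mathcal{L}^1$. Then $X_n \to X$ in $\mathcal{L}^1$, equivalently $\mathbf{E}(|X_n - X|) \to 0$, if and only if the following two conditions are satisfied:
(i) $X_n \to X$ in probability,
(ii) the sequence $(X_n)$ is UI.
Remarks. It is of course the 'if' part of the theorem which is useful. Since the result is 'best possible', it must improve on (DOM) for our $(\Omega, \mathcal{F}, \mathbf{P})$ triple; and, of course, result 13.3(b) makes this explicit.
Proof of 'if' part. Suppose that conditions (i) and (ii) are satisfied. For $K \in [0, \infty)$, define a function $\varphi_K : \mathbb{R} \to [-K, K]$ as follows:
$$\varphi_K(x) := \begin{cases} K & \text{if } x > K, \\ x & \text{if } |x| \le K, \\ -K & \text{if } x < -K. \end{cases}$$
Let $\varepsilon > 0$ be given. By the UI property of the $(X_n)$ sequence and (13.1,b), we can choose $K$ so that
$$\mathbf{E}\{|\varphi_K(X_n) - X_n|\} < \frac{\varepsilon}{3}, \ \forall n; \qquad \mathbf{E}\{|\varphi_K(X) - X|\} < \frac{\varepsilon}{3}.$$
But, since $|\varphi_K(x) - \varphi_K(y)| \le |x - y|$, we see that $\varphi_K(X_n) \to \varphi_K(X)$ in probability; and by (BDD) in the form of the preceding section, we can choose $n_0$ such that, for $n \ge n_0$,
$$\mathbf{E}\{|\varphi_K(X_n) - \varphi_K(X)|\} < \frac{\varepsilon}{3}.$$
The triangle inequality therefore implies that, for $n \ge n_0$,
$$\mathbf{E}(|X_n - X|) < \varepsilon,$$
and the proof is complete. □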
Proof of 'only if' part. Suppose that $X_n \to X$ in $\mathcal{L}^1$. Let $\varepsilon > 0$ be given. Choose $N$ such that
$$n \ge N \Rightarrow \mathbf{E}(|X_n - X|) < \varepsilon/2.$$
By (13.1,a), we can choose $\delta > 0$ such that whenever $\mathbf{P}(F) < \delta$, we have
$$\mathbf{E}(|X_n|; F) < \varepsilon \quad (1 \le n \le N), \qquad \mathbf{E}(|X|; F) < \varepsilon/2.$$
Since $(X_n)$ is bounded in $\mathcal{L}^1$, we can choose $K$ such that
$$K^{-1} \sup_r \mathbf{E}(|X_r|) < \delta.$$
Then for $n \ge N$, we have $\mathbf{P}(|X_n| > K) < \delta$ and
$$\mathbf{E}(|X_n|; |X_n| > K) \le \mathbf{E}(|X|; |X_n| > K) + \mathbf{E}(|X - X_n|) < \varepsilon.$$
For $n \le N$, we have $\mathbf{P}(|X_n| > K) < \delta$ and
$$\mathbf{E}(|X_n|; |X_n| > K) < \varepsilon.$$
Hence $(X_n)$ is a UI family. Since
$$\varepsilon\, \mathbf{P}(|X_n - X| > \varepsilon) \le \mathbf{E}(|X_n - X|) = \|X_n - X\|_1,$$
it is clear that $X_n \to X$ in probability. □
Chapter 14
UI Martingales
14.0. Introduction
The first part of this chapter examines what happens when uniform integrability is combined with the martingale property. In addition to new results such as Lévy's 'Upward' and 'Downward' Theorems, we also obtain new proofs of the Kolmogorov 0-1 Law and of the Strong Law of Large Numbers.
The second part of the chapter (beginning at Section 14.6) is concerned with Doob's Submartingale Inequality. This result implies in particular that for $p > 1$ (but not for $p = 1$) a martingale bounded in $\mathcal{L}^p$ is dominated by an element of $\mathcal{L}^p$ and hence converges both almost surely and in $\mathcal{L}^p$. The Submartingale Inequality is also used to prove Kakutani's Theorem on product-form martingales and, in an illustration of exponential bounds, to prove a very special case of the Law of the Iterated Logarithm.
The Radon-Nikodým theorem is then proved, and its relevance to likelihood ratio is explained.
The topic of optional sampling, important for continuous-parameter theory and in other contexts, is covered in the appendix to this chapter.
14.1. UI martingales
Let $M$ be a UI martingale, so that $M$ is a martingale relative to our set-up $(\Omega, \mathcal{F}, \{\mathcal{F}_n\}, \mathbf{P})$ and $(M_n : n \in \mathbb{Z}^+)$ is a UI family.
Since $M$ is UI, $M$ is bounded in $\mathcal{L}^1$ (by (13.2)), and so $M_\infty := \lim M_n$ exists almost surely. By Theorem 13.7, it is also true that $M_n \to M_\infty$ in $\mathcal{L}^1$:
$$\mathbf{E}(|M_n - M_\infty|) \to 0.$$
We now prove that $M_n = \mathbf{E}(M_\infty \mid \mathcal{F}_n)$, a.s. For $F \in \mathcal{F}_n$, and $r \ge n$, the martingale property yields
(*) $$\mathbf{E}(M_r; F) = \mathbf{E}(M_n; F).$$
But
$$|\mathbf{E}(M_r; F) - \mathbf{E}(M_\infty; F)| \le \mathbf{E}(|M_r - M_\infty|; F) \le \mathbf{E}(|M_r - M_\infty|).$$
Hence, on letting $r \to \infty$ in (*), we obtain
$$\mathbf{E}(M_\infty; F) = \mathbf{E}(M_n; F).$$
We have proved the following result.
THEOREM
►►Let $M$ be a UI martingale. Then
$$M_\infty := \lim M_n \text{ exists a.s. and in } \mathcal{L}^1.$$
Moreover, for every $n$,
$$M_n = \mathbf{E}(M_\infty \mid \mathcal{F}_n), \quad \text{a.s.}$$
The obvious extension to UI supermartingales may be proved similarly.
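14.2. Lévy's 'Upward' Theorem

Let $\xi \in \mathcal{L}^1(\Omega, \mathcal{F}, P)$, and define $M_n := E(\xi \mid \mathcal{F}_n)$, a.s. Then $M$ is a UI martingale and
$$M_n \to \eta := E(\xi \mid \mathcal{F}_\infty)$$
almost surely and in $\mathcal{L}^1$.

Proof. We know that $M$ is a martingale because of the Tower Property. We know from Theorem 13.4 that $M$ is UI. Hence $M_\infty := \lim M_n$ exists a.s. and in $\mathcal{L}^1$, and it remains only to prove that $M_\infty = \eta$, a.s., where $\eta := E(\xi \mid \mathcal{F}_\infty)$.

Without loss of generality, we may (and do) assume that $\xi \ge 0$. Now consider the measures $Q_1$ and $Q_2$ on $(\Omega, \mathcal{F}_\infty)$, where
$$Q_1(F) := E(\eta; F), \qquad Q_2(F) := E(M_\infty; F), \qquad F \in \mathcal{F}_\infty.$$
If $F \in \mathcal{F}_n$, then since $E(\eta \mid \mathcal{F}_n) = E(\xi \mid \mathcal{F}_n)$ by the Tower Property,
$$E(\eta; F) = E(M_n; F) = E(M_\infty; F),$$
the second equality having been proved in Section 14.1. Thus $Q_1$ and $Q_2$ agree on the $\pi$-system (algebra) $\bigcup \mathcal{F}_n$, and hence they agree on $\mathcal{F}_\infty$.

Both $\eta$ and $M_\infty$ are $\mathcal{F}_\infty$ measurable; more strictly, $M_\infty$ may be taken to be $\mathcal{F}_\infty$ measurable by defining $M_\infty := \limsup M_n$ for every $\omega$. Thus,
$$F := \{\omega : \eta > M_\infty\} \in \mathcal{F}_\infty,$$
and since $Q_1(F) = Q_2(F)$,
$$E(\eta - M_\infty;\ \eta > M_\infty) = 0.$$
Hence $P(\eta > M_\infty) = 0$, and similarly $P(M_\infty > \eta) = 0$. □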
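As a purely illustrative sketch of the 'Upward' Theorem, one can work on a finite model that is an assumption of this example and not part of the text: $\Omega = \{0, \ldots, 2^m - 1\}$ with uniform probability, and $\mathcal{F}_n$ generated by blocks of $2^{m-n}$ consecutive points. Then $M_n = E(\xi \mid \mathcal{F}_n)$ is a block average, and $\mathcal{F}_\infty = \mathcal{F}_m$, so that $\eta = \xi$. The names `conditional_expectations` and `l1_distance` are hypothetical helpers.

```python
from fractions import Fraction

def conditional_expectations(xi):
    """Return [M_0, ..., M_m] where M_n = E(xi | F_n) on Omega = {0, ..., 2**m - 1}.

    P is uniform and F_n is generated by the blocks of 2**(m - n) consecutive
    points, so M_n is the block average of xi (an illustrative finite model).
    """
    size = len(xi)
    m = size.bit_length() - 1
    assert size == 2 ** m, "len(xi) must be a power of two"
    levels = []
    for n in range(m + 1):
        block = 2 ** (m - n)
        averages = [
            sum(xi[start:start + block], Fraction(0)) / block
            for start in range(0, size, block)
        ]
        # Spread each block average over its block so M_n is a function on Omega.
        levels.append([averages[i // block] for i in range(size)])
    return levels

def l1_distance(u, v):
    """E|u - v| under the uniform measure."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

xi = [Fraction(k * k % 7) for k in range(16)]   # an arbitrary integrable xi
M = conditional_expectations(xi)
errors = [l1_distance(M_n, xi) for M_n in M]
print([float(e) for e in errors])
```

In this finite setting the limit is reached after $m$ steps; the Tower Property shows up as the fact that block-averaging $M_{n+1}$ reproduces $M_n$.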
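14.3. Martingale proof of Kolmogorov's 0-1 law

Recall the result.

THEOREM
Let $X_1, X_2, \ldots$ be a sequence of independent RVs. Define
$$\mathcal{T}_n := \sigma(X_{n+1}, X_{n+2}, \ldots), \qquad \mathcal{T} := \bigcap_n \mathcal{T}_n.$$
Then if $F \in \mathcal{T}$, $P(F) = 0$ or $1$.

Proof. Define $\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n)$. Let $F \in \mathcal{T}$, and let $\eta := I_F$. Since $\eta \in b\mathcal{F}_\infty$, Lévy's Upward Theorem shows that
$$\eta = E(\eta \mid \mathcal{F}_\infty) = \lim E(\eta \mid \mathcal{F}_n), \quad \text{a.s.}$$
However, for each $n$, $\eta$ is $\mathcal{T}_n$ measurable, and hence (see Remark below) is independent of $\mathcal{F}_n$. Hence by (9.7,k),
$$E(\eta \mid \mathcal{F}_n) = E(\eta) = P(F), \quad \text{a.s.}$$
Hence $\eta = P(F)$, a.s.; and since $\eta$ only takes the values 0 and 1, the result follows. □

Remark. Of course, we have cheated to some extent in building parts of the earlier proof into the proof just given.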
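14.4. Lévy's 'Downward' Theorem

►► Suppose that $(\Omega, \mathcal{F}, P)$ is a probability triple, and that $\{\mathcal{G}_{-n} : n \in \mathbf{N}\}$ is a collection of sub-$\sigma$-algebras of $\mathcal{F}$ such that
$$\mathcal{G}_{-\infty} := \bigcap_k \mathcal{G}_{-k} \subseteq \cdots \subseteq \mathcal{G}_{-(n+1)} \subseteq \mathcal{G}_{-n} \subseteq \cdots \subseteq \mathcal{G}_{-1}.$$
Let $\gamma \in \mathcal{L}^1(\Omega, \mathcal{F}, P)$ and define
$$M_{-n} := E(\gamma \mid \mathcal{G}_{-n}).$$
Then $M_{-\infty} := \lim M_{-n}$ exists a.s. and in $\mathcal{L}^1$, and
$$(*)\qquad M_{-\infty} = E(\gamma \mid \mathcal{G}_{-\infty}), \quad \text{a.s.}$$

Proof. The Upcrossing Lemma applied to the martingale $(M_k, \mathcal{G}_k : -N \le k \le -1)$ can be used exactly as in the proof of Doob's Forward Convergence Theorem to show that $\lim M_{-n}$ exists a.s. The uniform-integrability result, Theorem 13.4, shows that $\lim M_{-n}$ exists in $\mathcal{L}^1$.

That $(*)$ holds follows by now-familiar reasoning: for $G \in \mathcal{G}_{-\infty} \subseteq \mathcal{G}_{-r}$,
$$E(\gamma; G) = E(M_{-r}; G),$$
and now let $r \uparrow \infty$. □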
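14.5. Martingale proof of the Strong Law

Recall the result.

THEOREM
Let $X_1, X_2, \ldots$ be IID RVs, with $E(|X_k|) < \infty$, $\forall k$. Let $\mu$ be the common value of $E(X_n)$. Write
$$S_n := X_1 + X_2 + \cdots + X_n.$$
Then $n^{-1}S_n \to \mu$, a.s. and in $\mathcal{L}^1$.

Proof. Define
$$\mathcal{G}_{-n} := \sigma(S_n, S_{n+1}, S_{n+2}, \ldots), \qquad \mathcal{G}_{-\infty} := \bigcap_n \mathcal{G}_{-n}.$$
We know from Section 9.11 that
$$E(X_1 \mid \mathcal{G}_{-n}) = n^{-1}S_n, \quad \text{a.s.}$$
Hence, by Lévy's 'Downward' Theorem, $L := \lim n^{-1}S_n$ exists a.s. and in $\mathcal{L}^1$. For definiteness, define $L := \limsup n^{-1}S_n$ for every $\omega$. Then for each $k$,
$$L = \limsup_n \frac{X_{k+1} + \cdots + X_{k+n}}{n},$$
so that $L \in m\mathcal{T}_k$, where $\mathcal{T}_k = \sigma(X_{k+1}, X_{k+2}, \ldots)$. By Kolmogorov's 0-1 law, $P(L = c) = 1$ for some $c$ in $\mathbf{R}$. But
$$c = E(L) = \lim E(n^{-1}S_n) = \mu. \qquad \square$$

Exercise. Explain how we could have deduced $\mathcal{L}^1$ convergence at (12.10). Hint. Recall Scheffé's Lemma 5.10, and think about how to use it.

Remarks. See Meyer (1966) for important extensions and applications of the results given so far in this chapter. These extensions include: the Hewitt-Savage 0-1 law, de Finetti's theorem on exchangeable random variables, the Choquet-Deny theorem on bounded harmonic functions for random walks on groups.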
14.6. Doob's Submartingale Inequality

THEOREM
►►(a) Let $Z$ be a non-negative submartingale. Then, for $c > 0$,
$$c\,P\Big(\sup_{k \le n} Z_k \ge c\Big) \le E\Big(Z_n;\ \sup_{k \le n} Z_k \ge c\Big) \le E(Z_n).$$

Proof. Let $F := \{\sup_{k \le n} Z_k \ge c\}$. Then $F$ is a disjoint union
$$F = F_0 \cup F_1 \cup \cdots \cup F_n,$$
where
$$F_0 := \{Z_0 \ge c\}, \qquad F_k := \{Z_0 < c\} \cap \{Z_1 < c\} \cap \cdots \cap \{Z_{k-1} < c\} \cap \{Z_k \ge c\}.$$
Now, $F_k \in \mathcal{F}_k$, and $Z_k \ge c$ on $F_k$. Hence,
$$E(Z_n; F_k) \ge E(Z_k; F_k) \ge c\,P(F_k).$$
Summing over $k$ now yields the result. □

The main reason for the usefulness of the above theorem is the following.

LEMMA
►(b) If $M$ is a martingale, $c$ is a convex function, and $E|c(M_n)| < \infty$, $\forall n$, then $c(M)$ is a submartingale.

Proof. Apply the conditional form of Jensen's inequality. □

Kolmogorov's inequality
► Let $(X_n : n \in \mathbf{N})$ be a sequence of independent zero-mean RVs in $\mathcal{L}^2$. Define $\sigma_k^2 := \mathrm{Var}(X_k)$. Write
$$S_n := X_1 + \cdots + X_n, \qquad V_n := \mathrm{Var}(S_n) = \sum_{k=1}^n \sigma_k^2.$$
Then, for $c > 0$,
$$c^2\,P\Big(\sup_{k \le n} |S_k| \ge c\Big) \le V_n.$$

Proof. We know that if we set $\mathcal{F}_n = \sigma(X_1, X_2, \ldots, X_n)$, then $S = (S_n)$ is a martingale. Now apply the Submartingale Inequality to $S^2$. □

Note. Kolmogorov's inequality was the key step in the original proofs of Kolmogorov's Three-Series Theorem and Strong Law.
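The two inequalities of this section can be checked exactly on a small example. The sketch below assumes a simple $\pm 1$ random walk with $S_0 = 0$ (a choice made here for illustration, not taken from the text), for which $Z = |S|$ is a non-negative submartingale and $V_n = n$. All $2^n$ paths are enumerated, so no simulation error is involved. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def walk_statistics(n):
    """Enumerate all 2**n equally likely paths of a simple +/-1 random walk.

    Returns a list of pairs (max_{k<=n} |S_k|, |S_n|), one per path.
    """
    stats = []
    for steps in product((-1, 1), repeat=n):
        s, running_max = 0, 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        stats.append((running_max, abs(s)))
    return stats

def doob_terms(n, c):
    """Return (c*P(max>=c), E(|S_n|; max>=c), E|S_n|) as exact fractions."""
    stats = walk_statistics(n)
    total = len(stats)
    prob = Fraction(sum(1 for m, _ in stats if m >= c), total)
    restricted = Fraction(sum(z for m, z in stats if m >= c), total)
    full = Fraction(sum(z for _, z in stats), total)
    return c * prob, restricted, full

n = 10
for c in range(1, n + 1):
    left, middle, right = doob_terms(n, c)
    kolmogorov_left = c * left          # c^2 * P(max |S_k| >= c), to compare with V_n = n
    print(c, float(left), float(middle), float(right), float(kolmogorov_left))
```

Each printed row should satisfy `left <= middle <= right`, and the last column should not exceed $n$.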
14.7. Law of the Iterated Logarithm: special case

Let us see how the Submartingale Inequality may be used via so-called exponential bounds to prove a very special case of Kolmogorov's Law of the Iterated Logarithm, which is described in Section A4.1. (You would do well to take a quick look at this proof.)

THEOREM
Let $(X_n : n \in \mathbf{N})$ be IID RVs each with the standard normal N(0,1) distribution of mean 0 and variance 1. Define
$$S_n := X_1 + X_2 + \cdots + X_n.$$
Then, almost surely,
$$\limsup_n \frac{S_n}{(2n \log\log n)^{1/2}} = 1.$$

Proof. Throughout the proof, we shall write
$$h(n) := (2n \log\log n)^{1/2}, \qquad n \ge 3.$$
(It will be understood that integers occurring in the proof are greater than $e$, when this is necessary.)
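Step 1: An exponential bound. Define $\mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n)$. Then $S$ is a martingale relative to $\{\mathcal{F}_n\}$. It is well known that for $\theta \in \mathbf{R}$, $n \in \mathbf{N}$,
$$E\,e^{\theta S_n} = e^{\theta^2 n/2}.$$
The function $x \mapsto e^{\theta x}$ is convex on $\mathbf{R}$, so that $e^{\theta S_n}$ is a submartingale and, by the Submartingale Inequality, we have, for $\theta > 0$,
$$P\Big(\sup_{k \le n} S_k \ge c\Big) = P\Big(\sup_{k \le n} e^{\theta S_k} \ge e^{\theta c}\Big) \le e^{-\theta c}\,E\big(e^{\theta S_n}\big).$$
►► This is a type of exponential bound much used in modern probability theory. In our special case, we have
$$P\Big(\sup_{k \le n} S_k \ge c\Big) \le e^{-\theta c}\,e^{\theta^2 n/2},$$
and for $c > 0$, choosing the best $\theta$, namely $c/n$, we obtain
$$\text{(a)}\qquad P\Big(\sup_{k \le n} S_k \ge c\Big) \le e^{-c^2/(2n)}.$$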
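The optimisation over $\theta$ in Step 1 is easy to check numerically. The following sketch evaluates the bound $e^{-\theta c + \theta^2 n/2}$ on a grid of $\theta$ values and compares it with the value at $\theta = c/n$; the grid and the particular values of $c$ and $n$ are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import math

def exponential_bound(theta, c, n):
    """Right-hand side exp(-theta*c + theta**2 * n / 2) for N(0,1) steps."""
    return math.exp(-theta * c + 0.5 * theta * theta * n)

def best_bound(c, n):
    """Bound (a): the value at the minimising theta = c / n."""
    return exponential_bound(c / n, c, n)

c, n = 6.0, 9
grid = [k / 100 for k in range(1, 301)]
print(best_bound(c, n), min(exponential_bound(t, c, n) for t in grid))
```

The exponent $-\theta c + \theta^2 n/2$ is a quadratic in $\theta$ with minimum $-c^2/(2n)$ at $\theta = c/n$, which is why no grid value can beat `best_bound`.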
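Step 2: Obtaining an upper bound. Let $K$ be a real number with $K > 1$. (We are interested in cases when $K$ is close to 1.) Choose $c_n := K h(K^{n-1})$. Then
$$P\Big(\sup_{k \le K^n} S_k \ge c_n\Big) \le \exp(-c_n^2/2K^n) = (n-1)^{-K}(\log K)^{-K}.$$
The First Borel-Cantelli Lemma therefore shows that, almost surely, for all large $n$ (all $n \ge n_0(\omega)$) we have, for $K^{n-1} \le k \le K^n$,
$$S_k \le \sup_{k \le K^n} S_k \le c_n = K h(K^{n-1}) \le K h(k).$$
Hence, for $K > 1$,
$$\limsup_k h(k)^{-1}S_k \le K, \quad \text{a.s.}$$
By taking a sequence of $K$-values converging down to 1, we obtain
$$\limsup_k h(k)^{-1}S_k \le 1, \quad \text{a.s.}$$

Step 3: Obtaining a lower bound. Let $N$ be an integer with $N > 1$. (We are interested in cases when $N$ is very large.) Let $\varepsilon$ be a number in $(0,1)$. (Of course, $\varepsilon$ will be small in cases of interest.) Write $S(r)$ for $S_r$, etc., when this is typographically more convenient. For $n \in \mathbf{N}$, define the event
$$F_n := \{S(N^{n+1}) - S(N^n) > (1-\varepsilon)\,h(N^{n+1} - N^n)\}.$$
Then (see Proposition 14.8(b) below),
$$P(F_n) = 1 - \Phi(y) \ge (2\pi)^{-1/2}(y + y^{-1})^{-1}\exp(-y^2/2),$$
where
$$y = (1-\varepsilon)\{2\log\log(N^{n+1} - N^n)\}^{1/2}.$$
Thus, ignoring 'logarithmic terms', $P(F_n)$ is roughly $(n \log N)^{-(1-\varepsilon)^2}$, so that $\sum P(F_n) = \infty$. However, the events $F_n$ $(n \in \mathbf{N})$ are clearly independent, so that the Second Borel-Cantelli Lemma shows that, almost surely, infinitely many $F_n$ occur. Thus, for infinitely many $n$,
$$S(N^{n+1}) > (1-\varepsilon)\,h(N^{n+1} - N^n) + S(N^n).$$
But, by Step 2, $S(N^n) > -2h(N^n)$ for all large $n$, so that for infinitely many $n$, we have
$$S(N^{n+1}) > (1-\varepsilon)\,h(N^{n+1} - N^n) - 2h(N^n).$$
It now follows that
$$\limsup_k h(k)^{-1}S_k \ge \limsup_n h(N^{n+1})^{-1}S(N^{n+1}) \ge (1-\varepsilon)(1 - N^{-1})^{1/2} - 2N^{-1/2}.$$
(You should check that 'the logarithmic terms do disappear'.) The rest is obvious. □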
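14.8. A standard estimate on the normal distribution

We used part of the following result in the previous section.

Proposition
Suppose that $X$ has the standard normal distribution, so that, for $x \in \mathbf{R}$,
$$P(X > x) = 1 - \Phi(x) = \int_x^\infty \varphi(y)\,dy, \qquad \text{where } \varphi(y) := (2\pi)^{-1/2}\exp(-y^2/2).$$
Then, for $x > 0$,
(a) $P(X > x) \le x^{-1}\varphi(x)$,
(b) $P(X > x) \ge (x + x^{-1})^{-1}\varphi(x)$.

Proof. Let $x > 0$. Since $\varphi'(y) = -y\varphi(y)$,
$$\varphi(x) = \int_x^\infty y\varphi(y)\,dy \ge x\int_x^\infty \varphi(y)\,dy,$$
yielding (a). Since $(y^{-1}\varphi(y))' = -(1 + y^{-2})\varphi(y)$,
$$x^{-1}\varphi(x) = \int_x^\infty (1 + y^{-2})\varphi(y)\,dy \le (1 + x^{-2})\int_x^\infty \varphi(y)\,dy,$$
yielding (b). □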
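The two estimates (a) and (b) can be compared with the exact normal tail. The sketch below uses `math.erfc` from the Python standard library for $1 - \Phi(x)$; the helper names are hypothetical and chosen only for this illustration.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def upper_tail(x):
    """P(X > x) for standard normal X, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def tail_bounds(x):
    """Return (lower, exact, upper) for x > 0, following (b) and (a)."""
    return phi(x) / (x + 1.0 / x), upper_tail(x), phi(x) / x

for x in (0.5, 1.0, 2.0, 4.0):
    print(x, tail_bounds(x))
```

The ratio of the upper to the lower bound is $1 + x^{-2}$, so the two estimates pinch the tail ever more tightly as $x$ grows.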
14.9. Remarks on exponential bounds; large deviation theory

Obtaining exponential bounds is related to the very powerful theory of large deviations - see Varadhan (1984), Deuschel and Stroock (1989) - which has an ever-growing number of fields of application. See Ellis (1985).

You can study exponential bounds in the very specific context of martingales in Neveu (1975), Chow and Teicher (1978), Garsia (1973), etc.

Much of the literature is concerned with obtaining exponential bounds which are in a sense best possible. However, 'elementary' results such as the Azuma-Hoeffding inequality in Exercise E14.1 are very useful in numerous applications. See for example the applications to combinatorics in Bollobás (1987).
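14.10. A consequence of Hölder's inequality

Look at the statement of Doob's $\mathcal{L}^p$ inequality in the next section to see where we are going.

LEMMA
Suppose that $X$ and $Y$ are non-negative RVs such that
$$c\,P(X \ge c) \le E(Y; X \ge c) \quad \text{for every } c > 0.$$
Then, for $p > 1$ and $p^{-1} + q^{-1} = 1$, we have
$$\|X\|_p \le q\,\|Y\|_p.$$

Proof. We obviously have
$$(*)\qquad L := \int_{c=0}^\infty pc^{p-1}P(X \ge c)\,dc \le \int_{c=0}^\infty pc^{p-2}E(Y; X \ge c)\,dc =: R.$$
Using Fubini's Theorem with non-negative integrands, we obtain
$$L = \int_{c=0}^\infty \Big(\int_\Omega I_{\{X \ge c\}}(\omega)\,P(d\omega)\Big) pc^{p-1}\,dc = \int_\Omega \Big(\int_{c=0}^{X(\omega)} pc^{p-1}\,dc\Big) P(d\omega) = E(X^p).$$
Exactly similarly, we find that
$$R = E(qX^{p-1}Y).$$
We apply Hölder's inequality to conclude that
$$(**)\qquad E(X^p) \le E(qX^{p-1}Y) \le q\,\|Y\|_p\,\|X^{p-1}\|_q.$$
Suppose that $\|Y\|_p < \infty$, and suppose for now that $\|X\|_p < \infty$ also. Then since $(p-1)q = p$, we have
$$\|X^{p-1}\|_q = E(X^p)^{1/q},$$
so $(**)$ implies that $\|X\|_p \le q\|Y\|_p$. For general $X$, note that the hypothesis remains true for $X \wedge n$. Hence $\|X \wedge n\|_p \le q\|Y\|_p$ for all $n$, and the result follows using (MON). □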
14.11. Doob's $\mathcal{L}^p$ inequality

THEOREM
►►(a) Let $p > 1$ and define $q$ so that $p^{-1} + q^{-1} = 1$. Let $Z$ be a non-negative submartingale bounded in $\mathcal{L}^p$, and define (this is standard notation)
$$Z^* := \sup_k Z_k.$$
Then $Z^* \in \mathcal{L}^p$, and indeed
$$(*)\qquad \|Z^*\|_p \le q \sup_r \|Z_r\|_p.$$
The submartingale $Z$ is therefore dominated by the element $Z^*$ of $\mathcal{L}^p$. As $n \to \infty$, $Z_\infty := \lim Z_n$ exists a.s. and in $\mathcal{L}^p$, and
$$\|Z_\infty\|_p = \sup_r \|Z_r\|_p = \uparrow\lim_r \|Z_r\|_p.$$
►►(b) If $Z$ is of the form $|M|$, where $M$ is a martingale bounded in $\mathcal{L}^p$, then $M_\infty := \lim M_n$ exists a.s. and in $\mathcal{L}^p$, and of course $Z_\infty = |M_\infty|$, a.s.

Proof. For $n \in \mathbf{Z}^+$, define $Z_n^* := \sup_{k \le n} Z_k$. From Doob's Submartingale Inequality 14.6(a) and Lemma 14.10 we see that
$$\|Z_n^*\|_p \le q\,\|Z_n\|_p \le q \sup_r \|Z_r\|_p.$$
Property $(*)$ now follows from the Monotone-Convergence Theorem. Since $(-Z)$ is a supermartingale bounded in $\mathcal{L}^p$, and therefore in $\mathcal{L}^1$, we know that $Z_\infty := \lim Z_n$ exists a.s. However,
$$|Z_n - Z_\infty|^p \le (2Z^*)^p \in \mathcal{L}^1,$$
so that (DOM) shows that $Z_n \to Z_\infty$ in $\mathcal{L}^p$. Jensen's inequality shows that $\|Z_r\|_p$ is non-decreasing in $r$, and all the rest is straightforward. □
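For $p = q = 2$ the inequality reads $E[(Z_n^*)^2] \le 4\,E[Z_n^2]$, and this can be verified exactly on a toy example. The sketch below assumes $Z = |S|$ for a simple $\pm 1$ random walk $S$ started at 0 (an illustrative choice, not from the text) and enumerates every path; `second_moments` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

def second_moments(n):
    """Exact E[(max_{k<=n}|S_k|)^2] and E[S_n^2] for a simple +/-1 random walk."""
    max_sq, end_sq, count = 0, 0, 0
    for steps in product((-1, 1), repeat=n):
        s, running_max = 0, 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        max_sq += running_max ** 2
        end_sq += s ** 2
        count += 1
    return Fraction(max_sq, count), Fraction(end_sq, count)

for n in (2, 6, 12):
    e_max_sq, e_end_sq = second_moments(n)
    # With p = q = 2 the inequality reads ||Z*_n||_2^2 <= 4 ||Z_n||_2^2.
    print(n, float(e_max_sq), float(e_end_sq), e_max_sq <= 4 * e_end_sq)
```

Since $E(S_n^2) = n$ here, the check amounts to $E[(\max_{k \le n}|S_k|)^2] \le 4n$.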
14.12. Kakutani's theorem on 'product' martingales

Let $X_1, X_2, \ldots$ be independent non-negative RVs, each of mean 1. Define $M_0 := 1$, and, for $n \in \mathbf{N}$, let
$$M_n := X_1X_2 \cdots X_n.$$
Then $M$ is a non-negative martingale, so that $M_\infty := \lim M_n$ exists a.s. The following five statements are equivalent:
(i) $E(M_\infty) = 1$;
(ii) $M_n \to M_\infty$ in $\mathcal{L}^1$;
(iii) $M$ is UI;
(iv) $\prod a_n > 0$, where $0 < a_n := E(X_n^{1/2}) \le 1$;
(v) $\sum (1 - a_n) < \infty$.
If one (then every one) of the above five statements fails to hold, then
$$P(M_\infty = 0) = 1.$$

Remark. Something of the significance of this theorem is explained in Section 14.17.

Proof. That $a_n \le 1$ follows from Jensen's inequality. That $a_n > 0$ is obvious.

First, suppose that statement (iv) holds. Then define
$$(*)\qquad N_n := \frac{X_1^{1/2}}{a_1}\,\frac{X_2^{1/2}}{a_2} \cdots \frac{X_n^{1/2}}{a_n}.$$
Then $N$ is a martingale for the same reason that $M$ is. We have
$$E N_n^2 = 1/(a_1a_2 \cdots a_n)^2 \le 1\Big/\Big(\prod a_n\Big)^2 < \infty,$$
so that $N$ is bounded in $\mathcal{L}^2$. By Doob's $\mathcal{L}^2$ inequality,
$$E\Big(\sup_n |M_n|\Big) \le E\Big(\sup_n |N_n|^2\Big) \le 4\sup_n E|N_n^2| < \infty,$$
so that $M$ is dominated by $M^* := \sup_n |M_n| \in \mathcal{L}^1$. Hence $M$ is UI and properties (i)-(iii) hold.

Now consider the case when $\prod a_n = 0$. Define $N$ as at $(*)$. Since $N$ is a non-negative martingale, $\lim N_n$ exists a.s. But since $\prod a_n = 0$, we are forced to conclude that $M_\infty = 0$, a.s.

The equivalence of (iv) and (v) is known to us from (4.3). □
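Condition (iv) can be explored numerically on a simple family. The sketch below assumes, purely for illustration, that $X_n$ takes the values $1 - \delta_n$ and $1 + \delta_n$ with probability $\tfrac12$ each (so that $X_n \ge 0$ and $E X_n = 1$ when $0 < \delta_n < 1$). Then $a_n = \tfrac12\big(\sqrt{1-\delta_n} + \sqrt{1+\delta_n}\big)$ and $1 - a_n$ behaves like $\delta_n^2/8$, so the partial products stay away from 0 when $\sum \delta_n^2 < \infty$ and drift to 0 otherwise. The function names are hypothetical.

```python
import math

def sqrt_mean(delta):
    """a = E(X^{1/2}) when X takes the values 1 - delta and 1 + delta with probability 1/2."""
    return 0.5 * (math.sqrt(1.0 - delta) + math.sqrt(1.0 + delta))

def partial_product(deltas):
    """Product of a_n over the supplied deltas (a partial product for condition (iv))."""
    result = 1.0
    for delta in deltas:
        result *= sqrt_mean(delta)
    return result

N = 100_000
# delta_n = 1/(n+1): the sum of delta_n^2 is finite.
summable = partial_product(1.0 / (n + 1) for n in range(1, N + 1))
# delta_n = 1/sqrt(n+1): the sum of delta_n^2 is infinite.
non_summable = partial_product(1.0 / math.sqrt(n + 1) for n in range(1, N + 1))
print(summable, non_summable)
```

A finite partial product can only suggest the limiting behaviour; the second product decays slowly (roughly like a negative power of $N$), which is why a large $N$ is used.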
14.13. The Radon-Nikodym theorem

Martingale theory yields an intuitive - and 'constructive' - proof of the Radon-Nikodym theorem. We are guided by Meyer (1966).

We begin with a special case.

THEOREM
►►(I) Suppose that $(\Omega, \mathcal{F}, P)$ is a probability triple in which $\mathcal{F}$ is separable in that
$$\mathcal{F} = \sigma(F_n : n \in \mathbf{N})$$
for some sequence $(F_n)$ of subsets of $\Omega$. Suppose that $Q$ is a finite measure on $(\Omega, \mathcal{F})$ which is absolutely continuous relative to $P$ in that
$$\text{(a)}\qquad \text{for } F \in \mathcal{F}, \quad P(F) = 0 \Rightarrow Q(F) = 0.$$
Then there exists $X$ in $\mathcal{L}^1(\Omega, \mathcal{F}, P)$ such that $Q = XP$ (see Section 5.14) in that
$$Q(F) = \int_F X\,dP = E(X; F), \qquad \forall F \in \mathcal{F}.$$
The variable $X$ is called a version of the Radon-Nikodym derivative of $Q$ relative to $P$ on $(\Omega, \mathcal{F})$. We write
$$\frac{dQ}{dP} = X \text{ on } \mathcal{F}, \quad \text{a.s.}$$
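Remark. Most of the $\sigma$-algebras we have encountered are separable. (The $\sigma$-algebra of Lebesgue-measurable subsets of $[0,1]$ is not.)

Proof. With the method of Section 13.1 in mind, you can prove that property (a) implies that
(b) given $\varepsilon > 0$, there exists $\delta > 0$ such that, for $F \in \mathcal{F}$,
$$P(F) < \delta \Rightarrow Q(F) < \varepsilon.$$
Next, define
$$\mathcal{F}_n := \sigma(F_1, F_2, \ldots, F_n).$$
Then for each $n$, $\mathcal{F}_n$ consists of the $2^{r(n)}$ possible unions of 'atoms' $A_{n,1}, \ldots, A_{n,r(n)}$ of $\mathcal{F}_n$, an atom $A$ of $\mathcal{F}_n$ being an element of $\mathcal{F}_n$ such that $\emptyset$ is the only proper subset of $A$ which is again an element of $\mathcal{F}_n$. (Each atom $A$ will have the form
$$H_1 \cap H_2 \cap \cdots \cap H_n,$$
where each $H_i$ is either $F_i$ or $F_i^c$.) Define a function $X_n : \Omega \to [0, \infty)$ as follows: if $\omega \in A_{n,k}$, then
$$X_n(\omega) := \begin{cases} 0 & \text{if } P(A_{n,k}) = 0, \\ Q(A_{n,k})/P(A_{n,k}) & \text{if } P(A_{n,k}) > 0. \end{cases}$$
Then $X_n \in \mathcal{L}^1(\Omega, \mathcal{F}_n, P)$ and
$$\text{(c)}\qquad E(X_n; F) = Q(F), \qquad \forall F \in \mathcal{F}_n.$$
The variable $X_n$ is the obvious version of $dQ/dP$ on $(\Omega, \mathcal{F}_n)$. It is obvious from (c) that $X = (X_n : n \in \mathbf{Z}^+)$ is a martingale relative to the filtration $(\mathcal{F}_n : n \in \mathbf{Z}^+)$, and since this martingale is non-negative,
$$X_\infty := \lim X_n \text{ exists, a.s.}$$
Let $\varepsilon > 0$, and choose $\delta$ as at (b). Let $K \in (0, \infty)$ be such that
$$K^{-1}Q(\Omega) < \delta.$$
Then
$$P(X_n > K) \le K^{-1}E(X_n) = K^{-1}Q(\Omega) < \delta,$$
so that
$$E(X_n; X_n > K) = Q(X_n > K) < \varepsilon.$$
The martingale $X$ is therefore UI, so that
$$X_n \to X_\infty \text{ in } \mathcal{L}^1.$$
It now follows from (c) that the measures
$$F \mapsto E(X_\infty; F) \quad \text{and} \quad F \mapsto Q(F)$$
agree on the $\pi$-system $\bigcup \mathcal{F}_n$, so that they agree on $\mathcal{F}$. All that remains is the proof of uniqueness, which is now standard for us. □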
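The construction of $X_n$ on atoms can be made concrete in a case where everything is computable by hand. The sketch below assumes $\Omega = [0,1)$, $P$ = Lebesgue measure, $Q(dx) = 2x\,dx$, and $\mathcal{F}_n$ generated by the dyadic intervals of length $2^{-n}$; these choices and the helper names are assumptions made for illustration only. On the atom $[j2^{-n}, (j+1)2^{-n})$ one gets $X_n = (2j+1)/2^n$, which approximates the true density $2x$.

```python
from fractions import Fraction

def q_measure(left, right):
    """Q([left, right)) for the illustrative measure Q(dx) = 2x dx on [0, 1)."""
    return right * right - left * left

def density_on_atoms(n):
    """X_n on the 2**n dyadic atoms [j/2**n, (j+1)/2**n): Q(atom) / P(atom), P = Lebesgue."""
    width = Fraction(1, 2 ** n)
    return [q_measure(j * width, (j + 1) * width) / width for j in range(2 ** n)]

def integrate_over_atoms(n, atom_indices):
    """E(X_n; F) for F the union of the given level-n atoms."""
    width = Fraction(1, 2 ** n)
    values = density_on_atoms(n)
    return sum(values[j] * width for j in atom_indices)

n = 4
X_n = density_on_atoms(n)
print([float(v) for v in X_n])
```

Property (c) corresponds to `integrate_over_atoms(n, indices)` agreeing with $Q$ of the union, and the martingale property to the fact that averaging $X_{n+1}$ over the two halves of an atom returns $X_n$.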
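Remark. The familiarity of all of the arguments in the above proof emphasizes the close link between the Radon-Nikodym theorem and conditional expectations which is made explicit in Section 14.14.

Now for the next part of the theorem.

(II) The assumption that $\mathcal{F}$ is separable can be dropped from Part I.

Once one has Part II, one can easily extend the result to the case when $P$ and $Q$ are $\sigma$-finite measures by partitioning $\Omega$ into sets on which both are finite.

Proving Part II of the theorem is a piece of 'abstract nonsense' based on the fact that $\mathcal{L}^1$ is a metric space (or, more precisely, that $L^1$ is), and in particular on the role of sequential convergence in metric spaces. You might well want to take Part II for granted and skip the remainder of this section.

Let Sep be the class of all separable sub-$\sigma$-algebras of $\mathcal{F}$. Part I shows that for $\mathcal{G} \in$ Sep, there exists $X_{\mathcal{G}}$ in $\mathcal{L}^1(\Omega, \mathcal{G}, P)$ such that
$$dQ/dP = X_{\mathcal{G}} \text{ on } \mathcal{G}; \quad \text{equivalently,} \quad E(X_{\mathcal{G}}; G) = Q(G), \quad G \in \mathcal{G}.$$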
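We are going to prove that there exists $X$ in $\mathcal{L}^1(\Omega, \mathcal{F}, P)$ such that
(d) $X_{\mathcal{G}} \to X$ in $\mathcal{L}^1$ in the sense that given $\varepsilon > 0$, there exists $\mathcal{K}$ in Sep such that if $\mathcal{K} \subseteq \mathcal{G} \in$ Sep, then $\|X_{\mathcal{G}} - X\|_1 < \varepsilon$.

First, we note that it is enough to prove that
(e) $(X_{\mathcal{G}} : \mathcal{G} \in \text{Sep})$ is Cauchy in $\mathcal{L}^1$ in the sense that given $\varepsilon > 0$, there exists $\mathcal{K}$ in Sep such that if $\mathcal{K} \subseteq \mathcal{G}_i \in$ Sep for $i = 1, 2$, then $\|X_{\mathcal{G}_1} - X_{\mathcal{G}_2}\|_1 < \varepsilon$.

Proof that (e) implies (d). Suppose that (e) holds. Choose $\mathcal{K}_n \in$ Sep such that if $\mathcal{K}_n \subseteq \mathcal{G}_i \in$ Sep for $i = 1, 2$, then
$$\|X_{\mathcal{G}_1} - X_{\mathcal{G}_2}\|_1 < 2^{-(n+1)}.$$
Let $\mathcal{H}(n) = \sigma(\mathcal{K}_1, \mathcal{K}_2, \ldots, \mathcal{K}_n)$. Then (see the proof of (6.10,a)) the limit $X := \lim X_{\mathcal{H}(n)}$ exists a.s. and in $\mathcal{L}^1$, and indeed,
$$\|X - X_{\mathcal{H}(n)}\|_1 \le 2^{-n}.$$
Set $X := \limsup X_{\mathcal{H}(n)}$ for definiteness. For any $\mathcal{G} \in$ Sep with $\mathcal{G} \supseteq \mathcal{H}(n)$ we have
$$\|X_{\mathcal{G}} - X_{\mathcal{H}(n)}\|_1 < 2^{-(n+1)}.$$
Result (d) follows. □

Proof of (e). If (e) is false, then we can find $\varepsilon_0 > 0$ and a sequence $\mathcal{K}(0) \subseteq \mathcal{K}(1) \subseteq \cdots$ of elements of Sep such that
$$\|X_{\mathcal{K}(n)} - X_{\mathcal{K}(n+1)}\|_1 > \varepsilon_0, \qquad \forall n.$$
However, it is easily seen that $(X_{\mathcal{K}(n)})$ is a UI martingale relative to the filtration $(\mathcal{K}(n))$, so that $(X_{\mathcal{K}(n)})$ converges in $\mathcal{L}^1$. The contradiction establishes that (e) is true. □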
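Proof of Part II of the theorem. We need only show that for $X$ as at (d) and for $F \in \mathcal{F}$, we have
$$E(X; F) = Q(F).$$
Let $\varepsilon > 0$. Choose $\mathcal{K}$ such that for $\mathcal{K} \subseteq \mathcal{G} \in$ Sep, $\|X_{\mathcal{G}} - X\|_1 < \varepsilon$. Then $\sigma(\mathcal{K}, F) \in$ Sep, where $\sigma(\mathcal{K}, F)$ is the smallest $\sigma$-algebra extending $\mathcal{K}$ and including $F$; and, by a familiar argument,
$$|E(X; F) - Q(F)| = |E(X - X_{\sigma(\mathcal{K}, F)}; F)| \le \|X - X_{\sigma(\mathcal{K}, F)}\|_1 < \varepsilon.$$
The result follows. □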
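14.14. The Radon-Nikodym theorem and conditional expectation

Suppose that $(\Omega, \mathcal{F}, P)$ is a probability triple, and that $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}$. Let $X$ be a non-negative element of $\mathcal{L}^1(\Omega, \mathcal{F}, P)$. Then
$$Q(G) := E(X; G), \qquad G \in \mathcal{G},$$
defines a finite measure $Q$ on $(\Omega, \mathcal{G})$. Moreover, $Q$ is clearly absolutely continuous relative to $P$ on $\mathcal{G}$, so that, by the Radon-Nikodym theorem, (a version...) $Y := dQ/dP$ exists on $(\Omega, \mathcal{G})$. Now $Y$ is $\mathcal{G}$-measurable, and
$$E(Y; G) = Q(G) = E(X; G), \qquad G \in \mathcal{G}.$$
Hence $Y$ is a version of the conditional expectation of $X$ given $\mathcal{G}$:
$$Y = E(X \mid \mathcal{G}), \quad \text{a.s.}$$

Remark. The right context for appreciating the close inter-relations between martingale convergence, conditional expectation, the Radon-Nikodym theorem, etc., is the geometry of Banach spaces.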
14.15. Likelihood ratio; equivalent measures

Let $P$ and $Q$ be probability measures on $(\Omega, \mathcal{F})$ such that $Q$ is absolutely continuous relative to $P$, so that a version $X$ of $dQ/dP$ on $\mathcal{F}$ exists. We say that $X$ is (a version of) the likelihood ratio of $Q$ given $P$. Then $P$ is absolutely continuous relative to $Q$ if and only if $P(X > 0) = 1$, and then $X^{-1}$ is a version of $dP/dQ$. When each of $P$ and $Q$ is absolutely continuous relative to the other, then $P$ and $Q$ are said to be equivalent. Note that it then makes sense to define
$$\int_F \sqrt{dP\,dQ} := \int_F X^{1/2}\,dP = \int_F (X^{-1})^{1/2}\,dQ, \qquad F \in \mathcal{F};$$
and we can hope for a fuller understanding of what Kakutani achieved.
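14.16. Likelihood ratio and conditional expectation

Let $(\Omega, \mathcal{F}, P)$ be a probability triple, and let $Q$ be a probability measure on $(\Omega, \mathcal{F})$ which is absolutely continuous relative to $P$ with density $X$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. What $\mathcal{G}$-measurable function yields $dQ/dP$ on $\mathcal{G}$? Yes, of course, it is $Y = E(X \mid \mathcal{G})$, $E$ denoting $P$-expectation, for, yet again, (modulo versions)
$$E(Y; G) = E(X; G) = Q(G) \quad \text{for } G \in \mathcal{G}.$$
Hence, if $\{\mathcal{F}_n\}$ is a filtration of $(\Omega, \mathcal{F})$, then the likelihood ratios
$$(dQ/dP \text{ on } \mathcal{F}_n) = E(X \mid \mathcal{F}_n)$$
form a UI martingale.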
(This is of course bound to be the case. The 'a.s.' qualifications have been omitted here, on the grounds that we have outgrown them.)
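14.17. Kakutani's Theorem revisited; consistency of LR test

Let $\Omega = \mathbf{R}^{\mathbf{N}}$, $X_n(\omega) = \omega_n$, and define the $\sigma$-algebras
$$\mathcal{F} = \sigma(X_k : k \in \mathbf{N}), \qquad \mathcal{F}_n = \sigma(X_k : 1 \le k \le n).$$
Suppose that for each $n$, $f_n$ and $g_n$ are everywhere positive probability density functions on $\mathbf{R}$, and let $r_n(x) := g_n(x)/f_n(x)$. Let $P$ [respectively, $Q$] be the unique measure on $(\Omega, \mathcal{F})$ which makes the variables $X_n$ independent, $X_n$ having probability density function $f_n$ [respectively, $g_n$]. Clearly,
$$M_n := dQ/dP = Y_1Y_2 \cdots Y_n \text{ on } \mathcal{F}_n, \qquad \text{where } Y_n := r_n(X_n).$$
Note that the variables $(Y_n : n \in \mathbf{N})$ are independent under $P$ and that each has $P$-mean 1. For any of several reasons, $M$ is a martingale.

Now if $Q$ is absolutely continuous relative to $P$ on $\mathcal{F}$, with $dQ/dP = \xi$ on $\mathcal{F}$, then $M_n = E(\xi \mid \mathcal{F}_n)$, and $M$ is UI. Conversely, if $M$ is UI, then $M_\infty$ exists (a.s., $P$) and
$$E(M_\infty \mid \mathcal{F}_n) = M_n, \qquad \forall n.$$
But then the probability measures
$$F \mapsto Q(F) \quad \text{and} \quad F \mapsto E(M_\infty; F)$$
agree on the $\pi$-system $\bigcup \mathcal{F}_n$, and so agree on $\mathcal{F}$. Thus $Q$ is absolutely continuous relative to $P$ on $\mathcal{F}$, with $dQ/dP = M_\infty$ on $\mathcal{F}$, if and only if $M$ is UI.

Kakutani's Theorem therefore implies that $Q$ is absolutely continuous relative to $P$ on $\mathcal{F}$ if and only if
$$\prod_n E(Y_n^{1/2}) = \prod_n \int_{\mathbf{R}} \sqrt{f_n(x)g_n(x)}\,dx > 0,$$
or equivalently if
$$(*)\qquad \sum_n \int_{\mathbf{R}} \Big\{\sqrt{f_n(x)} - \sqrt{g_n(x)}\Big\}^2 dx < \infty;$$
and that then $P$ is also absolutely continuous relative to $Q$, so that $Q$ is equivalent to $P$ on $\mathcal{F}$.

Suppose now that the $X_n$ are identically distributed independent variables under each of $P$ and $Q$. Thus, for some probability density functions $f$ and $g$ on $\mathbf{R}$, we have $f_n = f$ and $g_n = g$ for all $n$. It is clear from $(*)$ that $Q$ is equivalent to $P$ if and only if $f = g$ almost everywhere with respect to Lebesgue measure, in which case $Q = P$. Moreover, Kakutani's Theorem also tells us that if $Q \ne P$, then $M_n \to 0$ (a.s., $P$), and this is exactly the consistency of the Likelihood-Ratio Test in Statistics.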
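14.18. Note on Hardy spaces, etc.

We have seen in this chapter that, for many purposes, the class of UI martingales is a natural one. The appendix to this chapter, on the Optional-Sampling Theorem, provides further evidence of this.

However, what we might wish to be true is not always true for UI martingales. For example, if $M$ is a UI martingale and $C$ is a (uniformly) bounded previsible process, then the martingale $C \bullet M$ need not be bounded in $\mathcal{L}^1$, and so need not converge in $\mathcal{L}^1$.

For many parts of the more advanced theory, one uses the 'Hardy' space $\mathcal{H}_0^1$ of martingales $M$ null at 0 for which one (then each) of the following equivalent conditions holds:
(a) $M^* := \sup_n |M_n| \in \mathcal{L}^1$;
(b) $[M]_\infty^{1/2} \in \mathcal{L}^1$, where $[M]_n := \sum_{k=1}^n (M_k - M_{k-1})^2$.
By a special case of a Burkholder-Davis-Gundy theorem, there exist absolute constants $c_p$, $C_p$ $(1 \le p < \infty)$ such that
$$\text{(c)}\qquad c_p\,\big\|[M]_\infty^{1/2}\big\|_p \le \|M^*\|_p \le C_p\,\big\|[M]_\infty^{1/2}\big\|_p.$$
The space $\mathcal{H}_0^1$ is obviously sandwiched between the union of the spaces of martingales bounded in $\mathcal{L}^p$ $(p > 1)$ and the space of UI martingales. Its identification as the right intermediate space has proved very important. Its name derives from its important links with complex analysis.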
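Proof of the B-D-G inequality or of the equivalence of (a) and (b) is too difficult to give here. But we can take a very quick look at the relevance of (b) to the $C \bullet M$ problem. First, (b) makes it clear that
(d) if $M \in \mathcal{H}_0^1$ and $C$ is a bounded previsible process, then $C \bullet M \in \mathcal{H}_0^1$.

We shall now see that, in a sense, this is 'best possible'. Suppose that we have a martingale $M$ null at 0 and a (bounded) previsible process $\varepsilon = (\varepsilon_k : k \in \mathbf{N})$, where the $\varepsilon_k$ are IID RVs with $P(\varepsilon_k = \pm 1) = \tfrac12$, and where $\varepsilon$ and $M$ are independent. We want to show that
(e) $M \in \mathcal{H}_0^1$ if (as well as only if) $\varepsilon \bullet M$ is bounded in $\mathcal{L}^1$.

We run into no difficulties of 'regularity' if we condition on $M$:
$$E|(\varepsilon \bullet M)_n| \ge 3^{-1/2}E\big([M]_n^{1/2}\big).$$
This uses the following special case of Khinchine's inequality. Let $(a_k : k \in \mathbf{N})$ be a sequence of real numbers (think of $a_k = M_k - M_{k-1}$ when $M$ is known). Define
$$X_k := a_k\varepsilon_k, \qquad W_n := X_1 + \cdots + X_n, \qquad v_n := E(W_n^2) = \sum_{k=1}^n a_k^2.$$
Then (see Section 7.2)
$$E(W_n^4) = \sum_k a_k^4 + 6\sum_{i<j} a_i^2a_j^2,$$
so that, certainly, $E(W_n^4) \le 3v_n^2$. On combining this fact with Hölder's inequality in the form
$$v_n = E(W_n^2) \le \big\||W_n|^{2/3}\big\|_{3/2}\,\big\||W_n|^{4/3}\big\|_3 = (E|W_n|)^{2/3}\,(E(W_n^4))^{1/3},$$
we obtain the special case of Khinchine's inequality we need:
$$E(|W_n|) \ge 3^{-1/2}v_n^{1/2}.$$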
For more on the topics in this section, see Chow and Teicher (1978), Dellacherie and Meyer (1980), Doob (1981), Durrett (1984). The first of these is accessible to the reader of this book; the others are more advanced.
Chapter 15

Applications

15.0. Introduction - please read!

The purpose of this chapter is to give some indication of some of the ways in which the theory which we have developed can be applied to real-world problems. We consider only very simple examples, but at a lively pace!

In Sections 15.1-15.2, we discuss a trivial case of a celebrated result from mathematical economics, the Black-Scholes option-pricing formula. The formula was developed for a continuous-parameter (diffusion) model for stock prices; see, for example, Karatzas and Schreve (1988). We present an obvious discretization which also has many treatments in the literature. What needs to be emphasized is that in the discrete case, the result has nothing to do with probability, which is why the answer is completely independent of the underlying probability measure. The use of the 'martingale measure' $P$ in Section 15.2 is nothing other than a device for expressing some simple algebra/combinatorics. But in the diffusion setting, where the algebra and combinatorics are no longer meaningful, the martingale-representation theorem and Cameron-Martin-Girsanov change-of-measure theorem provide the essential language. I think that this justifies my giving a 'martingale' treatment of something which needs only junior-school algebra.

Sections 15.3-15.5 indicate the further development of the martingale formulation of optimality in stochastic control, at which Exercise E10.2 gave a first look. We consider just one 'fun' example, the 'Mabinogion sheep problem'; but it is an example which illustrates rather well several techniques which may be effectively utilized in other contexts.

In Sections 15.6-15.9, we look at some simple problems of filtering: estimating in real time processes of which only noisy observations can be made. This topic has important applications in engineering (look at the IEEE journals!), in medicine, and in economics. I hope that you will look further into this topic and into the important subject which develops when filtering is combined with stochastic-control theory. See, for example, Davis and Vintner (1985) and Whittle (1990).

Sections 15.10-15.12 consist of first reflections on the problems we encounter when we try to extend the martingale concept.
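15.1. A trivial martingale-representation result

Let $S$ denote the two-point set $\{-1, 1\}$, let $\Sigma$ denote the set of all subsets of $S$, let $p \in (0,1)$, and let $\mu$ be the probability measure on $(S, \Sigma)$ with
$$\mu(\{1\}) = p = 1 - \mu(\{-1\}).$$
Let $N \in \mathbf{N}$. Define
$$(\Omega, \mathcal{F}, P) = (S, \Sigma, \mu)^N,$$
so that a typical element of $\Omega$ is
$$\omega = (\omega_1, \omega_2, \ldots, \omega_N), \qquad \omega_k \in \{-1, 1\}.$$
Define $\varepsilon_k : \Omega \to \mathbf{R}$ by $\varepsilon_k(\omega) := \omega_k$, so that $(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N)$ are IID RVs each with law $\mu$. For $0 \le n \le N$, define
$$Z_n := \sum_{k=1}^n (\varepsilon_k - 2p + 1), \qquad \mathcal{F}_n := \sigma(Z_0, Z_1, \ldots, Z_n) = \sigma(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n).$$
Note that $E(\varepsilon_k) = 1 \cdot p + (-1)(1-p) = 2p - 1$. We see that
(a) $Z = (Z_n : 0 \le n \le N)$ is a martingale (relative to $(\{\mathcal{F}_n : 0 \le n \le N\}, P)$).

LEMMA
If $M = (M_n : 0 \le n \le N)$ is a martingale (relative to $(\{\mathcal{F}_n : 0 \le n \le N\}, P)$), then there exists a unique previsible process $H$ such that
$$M = M_0 + H \bullet Z, \quad \text{that is,} \quad M_n = M_0 + \sum_{k=1}^n H_k(Z_k - Z_{k-1}).$$

Remark. Since $\mathcal{F}_0 = \{\emptyset, \Omega\}$, $M_0$ is constant on $\Omega$, and has to be the common value of the $E(M_n)$.

Proof. We simply construct $H$ explicitly. Because $M_n$ is $\mathcal{F}_n$ measurable,
$$M_n(\omega) = f_n(\varepsilon_1(\omega), \ldots, \varepsilon_n(\omega)) = f_n(\omega_1, \ldots, \omega_n)$$
for some function $f_n : \{-1, 1\}^n \to \mathbf{R}$. Since $M$ is a martingale, we have
$$0 = E(M_n - M_{n-1} \mid \mathcal{F}_{n-1})(\omega) = p f_n(\omega_1, \ldots, \omega_{n-1}, 1) + (1-p) f_n(\omega_1, \ldots, \omega_{n-1}, -1) - f_{n-1}(\omega_1, \ldots, \omega_{n-1}).$$
Hence the expressions
$$\text{(b1)}\qquad \frac{f_n(\omega_1, \ldots, \omega_{n-1}, 1) - f_{n-1}(\omega_1, \ldots, \omega_{n-1})}{2(1-p)}$$
and
$$\text{(b2)}\qquad \frac{f_{n-1}(\omega_1, \ldots, \omega_{n-1}) - f_n(\omega_1, \ldots, \omega_{n-1}, -1)}{2p}$$
are equal; and if we define $H_n(\omega)$ to be their common value, then $H$ is previsible, and simple algebra verifies that $M = M_0 + H \bullet Z$, as required. You check that $H$ is unique. □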
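The explicit construction of $H$ in the proof can be carried out mechanically. The sketch below takes an arbitrary terminal payoff on $\{-1,1\}^N$ (the payoff, the values $N = 3$ and $p = \tfrac13$, and the names `representation` and `reconstruct` are all assumptions made for illustration), builds $f_n$ by backward averaging so that $M_n = E(M_N \mid \mathcal{F}_n)$, defines $H_n$ by (b1), and then checks that $M_N = M_0 + (H \bullet Z)_N$ on every path.

```python
from fractions import Fraction
from itertools import product

def representation(terminal, N, p):
    """Build f_n and the previsible H for M_n = E(terminal | F_n) on {-1, 1}^N.

    `terminal` maps a full path (tuple of +/-1) to a number; `p` = P(eps_k = 1).
    Returns (f, H): f[n][prefix] = f_n(prefix), and H[n][prefix] = H_n evaluated
    on a prefix of length n - 1 (so H_n depends only on omega_1..omega_{n-1}).
    """
    f = {N: {path: Fraction(terminal(path)) for path in product((-1, 1), repeat=N)}}
    for n in range(N, 0, -1):
        f[n - 1] = {
            prefix: p * f[n][prefix + (1,)] + (1 - p) * f[n][prefix + (-1,)]
            for prefix in product((-1, 1), repeat=n - 1)
        }
    H = {
        n: {
            prefix: (f[n][prefix + (1,)] - f[n - 1][prefix]) / (2 * (1 - p))   # (b1)
            for prefix in f[n - 1]
        }
        for n in range(1, N + 1)
    }
    return f, H

def reconstruct(f, H, path, p):
    """Return M_0 + sum_k H_k (Z_k - Z_{k-1}) along `path`, with Z_k - Z_{k-1} = eps_k - 2p + 1."""
    value = f[0][()]
    for k, eps in enumerate(path, start=1):
        value += H[k][path[:k - 1]] * (eps - 2 * p + 1)
    return value

N, p = 3, Fraction(1, 3)
terminal = lambda path: max(sum(path), 0) ** 2     # an arbitrary F_N-measurable payoff
f, H = representation(terminal, N, p)
print(all(reconstruct(f, H, path, p) == f[N][path] for path in f[N]))
```

Exact rational arithmetic is used so that the identity can be tested with `==` rather than a tolerance.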
15.2. Option pricing; discrete Black-Scholes formula

Consider an economy in which there are two 'securities': bonds of fixed interest rate $r$, and stock, the value of which fluctuates randomly. Let $N$ be a fixed element of $\mathbf{N}$. We suppose that values of units of stock and of bond units change abruptly at times $1, 2, \ldots, N$. For $n = 0, 1, \ldots, N$, we write
$B_n = (1+r)^nB_0$ for the value of 1 bond unit throughout the open time interval $(n, n+1)$,
$S_n$ for the value of 1 unit of stock throughout the open time interval $(n, n+1)$.

You start just after time 0 with a fortune of value $x$ made up of $A_0$ units of stock and $V_0$ units of bond, so that
$$A_0S_0 + V_0B_0 = x.$$
Between times 0 and 1 you invest this in stocks and bonds, so that just before time 1, you have $A_1$ units of stock and $V_1$ of bond, so that
$$A_1S_0 + V_1B_0 = x.$$
So, $(A_1, V_1)$ represents the portfolio you have as your 'stake on the first game'.
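Just after time $n-1$ (where $n \ge 1$) you have $A_{n-1}$ units of stock and $V_{n-1}$ units of bond, with value
$$X_{n-1} = A_{n-1}S_{n-1} + V_{n-1}B_{n-1}.$$
By trading stock for bonds or conversely, you rearrange your portfolio between times $n-1$ and $n$, so that just before time $n$ your fortune (still of value $X_{n-1}$) is described by
$$X_{n-1} = A_nS_{n-1} + V_nB_{n-1} \qquad (n \ge 1).$$
Your fortune just after time $n$ is given by
$$\text{(a)}\qquad X_n = A_nS_n + V_nB_n \qquad (n \ge 0),$$
and your change in fortune satisfies
$$\text{(b)}\qquad X_n - X_{n-1} = A_n(S_n - S_{n-1}) + V_n(B_n - B_{n-1}).$$
Now,
$$B_n - B_{n-1} = rB_{n-1} \quad \text{and} \quad S_n - S_{n-1} = R_nS_{n-1},$$
where $R_n$ is the random 'rate of interest of stock at time $n$'. We may now rewrite (b) as
$$X_n - X_{n-1} = rX_{n-1} + A_nS_{n-1}(R_n - r),$$
so that if we set
$$\text{(c)}\qquad Y_n := (1+r)^{-n}X_n,$$
then
$$\text{(d)}\qquad Y_n - Y_{n-1} = (1+r)^{-n}A_nS_{n-1}(R_n - r).$$
Thus $Y_n$ is the discounted value of your fortune at time $n$.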
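We now build a model in which each $R_n$ takes only values $a$, $b$ in $(-1, \infty)$, where
$$a < r < b,$$
by taking $\Omega$, $\mathcal{F}$, $\varepsilon_n$ $(1 \le n \le N)$, $Z_n$ $(0 \le n \le N)$ and $\mathcal{F}_n$ $(0 \le n \le N)$ to be as in Section 15.1, and setting
$$\text{(e)}\qquad R_n = \frac{a+b}{2} + \frac{b-a}{2}\,\varepsilon_n.$$
Note that
$$\text{(f)}\qquad R_n - r = \tfrac12(b-a)(\varepsilon_n - 2p + 1) = \tfrac12(b-a)(Z_n - Z_{n-1}),$$
where we now choose
$$\text{(g)}\qquad p := \frac{r-a}{b-a}.$$
Note that (d) and (f) together display $Y$ as a 'stochastic integral' relative to $Z$.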
A European option is a contract made just after time 0 which will allow you to buy 1 unit of stock just after time $N$ at a price $K$; $K$ is the so-called striking price. If you have made such a contract, then just after time $N$, you will exercise the option if $S_N > K$ and will not if $S_N < K$. Thus, the value at time $N$ of such a contract is $(S_N - K)^+$. What should you pay for the option at time 0?
Black and Scholes provide an answer to this question which is based on the concept of a hedging strategy.

A hedging strategy with initial value $x$ for the described option is a portfolio management scheme $\{(A_n, V_n) : 1 \le n \le N\}$, where the processes $A$ and $V$ are previsible relative to $\{\mathcal{F}_n\}$, and where, with $X$ satisfying (a) and (b), we have, for every $\omega$,
(h1) $X_0(\omega) = x$,
(h2) $X_n(\omega) \ge 0$ $(0 \le n \le N)$,
(h3) $X_N(\omega) = (S_N(\omega) - K)^+$.
Anyone employing a hedging strategy will, by appropriate portfolio management, and without going bankrupt, exactly duplicate the value of the option at time $N$.

Note. Though Black and Scholes insist that $X_n(\omega) \ge 0$, $\forall n$, $\forall \omega$, they (and we) do not insist that the processes $A$ and $V$ be positive. A negative value of $V$ for some $n$ amounts to borrowing at the fixed interest rate $r$. A negative value of $A$ corresponds to 'short-selling' stock, but after you have read the theorem, this may not worry you!
THEOREM
A hedging strategy with initial value $x$ exists if and only if
$$x = x_0 := E\big[(1+r)^{-N}(S_N - K)^+\big],$$
where $E$ is the expectation for the measure $P$ of Section 15.1 with $p$ as at (g). There is a unique hedging strategy with initial value $x_0$, and it involves no short-selling: $A$ is never negative.

On the basis of this result, it is claimed that $x_0$ is the unique fair price at time 0 of the option.

Note. In the definition of hedging strategy, there is nowhere any mention of an underlying probability measure. Because however of the 'for every $\omega$' requirement, we should consider only measures on $\Omega$ for which each point $\omega$ has positive probability.

Proof. Suppose that a hedging strategy with initial value $x$ exists, and let $A$, $V$, $X$, $Y$ denote the associated processes. From (d) and (f),
$$Y = Y_0 + F \bullet Z,$$
where $F$ is the previsible process with
$$F_n = \tfrac12(b-a)(1+r)^{-n}A_nS_{n-1}.$$
Of course, $F$ is bounded because there are only finitely many $(n, \omega)$ combinations. Thus $Y$ is a martingale under the $P$-measure, since $Z$ is; and since $Y_0 = x$ and $Y_N = (1+r)^{-N}(S_N - K)^+$ by the definition of hedging strategy, we obtain $x = x_0$.

Now consider things afresh and define
$$Y_n := E\big((1+r)^{-N}(S_N - K)^+ \mid \mathcal{F}_n\big).$$
Then $Y$ is a martingale, and by combining (f) with the martingale-representation result in Section 15.1, we see that for some unique previsible process $A$, (d) holds. Define
$$X_n := (1+r)^nY_n, \qquad V_n := (X_n - A_nS_n)/B_n.$$
Then (a) and (b) hold. Since $X_0 = x_0$, $X_n \ge 0$ and $X_N = (S_N - K)^+$, the only thing which remains is to prove that $A$ is never negative. Because of the explicit formula (15.1,b1), this reduces to showing that
$$E\big[(S_N - K)^+ \mid S_{n-1},\ S_n = (1+b)S_{n-1}\big] \ge E\big[(S_N - K)^+ \mid S_{n-1},\ S_n = (1+a)S_{n-1}\big];$$
and this is intuitively obvious and may be proved by a simple computation on binomial coefficients. □
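The price $x_0$ and the stock holding of the hedge are simple to compute in this two-valued model. The sketch below assumes the model (e)-(g), so that $S_N = S_0(1+b)^j(1+a)^{N-j}$ when $j$ of the $\varepsilon_k$ equal $+1$; the function names and the numerical inputs are hypothetical. The stock holding uses the fact that the fortune after an up or down move must equal the option price with one fewer step to go, so that $A_n = (X_n^{\text{up}} - X_n^{\text{down}})/\big(S_{n-1}(b-a)\big)$.

```python
from fractions import Fraction
from math import comb

def martingale_p(a, b, r):
    """The choice (g): p = (r - a) / (b - a), assuming a < r < b."""
    return (r - a) / (b - a)

def option_price(S0, K, a, b, r, N):
    """x_0 = E[(1 + r)^{-N} (S_N - K)^+] under the measure with P(eps = 1) = p."""
    p = martingale_p(a, b, r)
    expected = sum(
        comb(N, j) * p ** j * (1 - p) ** (N - j)
        * max(S0 * (1 + b) ** j * (1 + a) ** (N - j) - K, 0)
        for j in range(N + 1)
    )
    return expected / (1 + r) ** N

def stock_holding(S_prev, K, a, b, r, steps_left):
    """A_n: stock units held over a step starting from price S_prev.

    steps_left counts the remaining steps including the coming one, so the
    fortune after an up (down) move is the price of the option with
    steps_left - 1 steps to go.
    """
    up = option_price(S_prev * (1 + b), K, a, b, r, steps_left - 1)
    down = option_price(S_prev * (1 + a), K, a, b, r, steps_left - 1)
    return (up - down) / (S_prev * (b - a))

a, b, r = Fraction(-1, 10), Fraction(1, 5), Fraction(1, 20)
print(option_price(Fraction(100), Fraction(100), a, b, r, 1))   # one-step price
print(stock_holding(Fraction(100), Fraction(100), a, b, r, 1))  # one-step hedge
```

With exact fractions one can also confirm the self-financing step $X_n = (1+r)X_{n-1} + A_nS_{n-1}(R_n - r)$ for both values of $R_n$, which is what the accompanying checks do for a three-step example.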
15.3. The Mabinogion sheep problem

In the Tale of Peredur ap Efrawg in the very early Welsh folk tales, The Mabinogion (see Jones and Jones (1949)), there is a magical flock of sheep, some black, some white. We sacrifice poetry for precision in specifying its behaviour. At each of times $1, 2, 3, \ldots$ a sheep (chosen randomly from the entire flock, independently of previous events) bleats; if this bleating sheep is white, one black sheep (if any remain) instantly becomes white; if the bleating sheep is black, then one white sheep (if any remain) instantly becomes black. No births or deaths occur.

The controlled system
Suppose now that the system can be controlled in that just after time 0 and just after every magical transition time $1, 2, \ldots$, any number of white sheep may be removed from the system. (White sheep may be removed on numerous occasions.) The object is to maximize the expected final number of black sheep.

Consider the following example of a policy:

Policy A: at each time of decision, do nothing if there are more black sheep than white sheep or if no black sheep remain; otherwise, immediately reduce the white population to one less than the black population.

The value function $V$ for Policy A is the function
$$V : \mathbf{Z}^+ \times \mathbf{Z}^+ \to [0, \infty),$$
where for $w, b \in \mathbf{Z}^+$, $V(w, b)$ denotes the expected final number of black sheep if one adopts Policy A and if there are $w$ white and $b$ black sheep at time 0. Then $V$ is uniquely specified by the fact that, for $w, b \in \mathbf{Z}^+$,
(a1) $V(0, b) = b$;
(a2) $V(w, b) = V(w-1, b)$ whenever $w \ge b$ and $w > 0$;
(a3) $V(w, b) = \dfrac{w}{w+b}V(w+1, b-1) + \dfrac{b}{w+b}V(w-1, b+1)$ whenever $w < b$, $b > 0$ and $w > 0$.

It is almost tautological that if $W_n$ and $B_n$ denote the numbers of white and black sheep at time $n$, then, if we adopt Policy A, then, whatever the initial values $W_0$ and $B_0$,
(b) $V(W_n, B_n)$ is a martingale relative to the natural filtration of $\{(W_n, B_n) : n \ge 0\}$.

(c) LEMMA
The following statements are true for $w, b \in \mathbf{Z}^+$:
(c1) $V(w, b) \ge V(w-1, b)$ whenever $w > 0$;
(c2) $V(w, b) \ge \dfrac{w}{w+b}V(w+1, b-1) + \dfrac{b}{w+b}V(w-1, b+1)$ whenever $w > 0$ and $b > 0$.

Let us suppose that this Lemma is true. (It is proved in the next Section.) Then, for any policy whatsoever,
(d) $V(W_n, B_n)$ is a supermartingale.
The fact that $V(W_n, B_n)$ converges means that the system must a.s. end up in an absorbing state in which all sheep are of one colour. But then $V(W_\infty, B_\infty)$ is just the final number of black sheep (by definition of $V$). Since $V(W_n, B_n)$ is a non-negative supermartingale, we have, for deterministic $W_0$, $B_0$,
$$E\,V(W_\infty, B_\infty) \le V(W_0, B_0).$$
Hence, whatever the initial position, the expected final number of black sheep under any policy is no more than the expected final number of black sheep if Policy A is used. Thus Policy A is optimal.

In Section 15.5, we prove the following result:
$$V(k, k) - \big(2k + \tfrac14\pi - \sqrt{\pi k}\big) \to 0 \quad \text{as } k \to \infty.$$
Thus if we start with 10000 black sheep and 10000 white sheep, we finish up with (about) 19824 black sheep on average (over many 'runs').

Of course, the above argument worked because we had correctly guessed the optimal policy to begin with. In this subject area, one often has to make good guesses. Then one usually has to work rather hard to clinch results which correspond in more general situations to Lemma (c) and statement (d). You might find it quite an amusing exercise to prove these results for our special problem now, before reading on.

For a problem in economics which utilizes analogous ideas, see Davis and Norman (1990).
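The value function of Policy A can be computed exactly from the sheep dynamics. The sketch below assumes the reading of the rules used in this section: from $(w, b)$ with $0 < w < b$ the state moves to $(w+1, b-1)$ with probability $w/(w+b)$ and to $(w-1, b+1)$ with probability $b/(w+b)$; $V(0, b) = b$; and a state with $w \ge b$ is immediately reduced to $(b-1, b)$. Since $w + b$ is preserved by transitions, each antidiagonal gives a small tridiagonal linear system, solved here by forward elimination and back-substitution in exact arithmetic. The names `antidiagonal` and `value` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def antidiagonal(total):
    """Solve for all states (w, total - w) with w < total - w.

    Returns a tuple x with x[w] = V(w, total - w) for 0 <= w <= m, where m is
    the largest w with w < total - w.
    """
    m = (total - 1) // 2
    b_top = total - (m + 1)               # state (m + 1, b_top) has w >= b
    top = value(m + 1, b_top)             # reduced by culling to a smaller total
    # Forward sweep for x[w] = alpha[w] + beta[w] * x[w + 1], starting from V(0, total) = total.
    alpha, beta = [Fraction(total)], [Fraction(0)]
    for w in range(1, m + 1):
        up, down = Fraction(w, total), Fraction(total - w, total)
        denom = 1 - down * beta[w - 1]
        alpha.append(down * alpha[w - 1] / denom)
        beta.append(up / denom)
    x = [Fraction(0)] * (m + 1)
    nxt = top
    for w in range(m, -1, -1):            # back-substitution
        x[w] = alpha[w] + beta[w] * nxt
        nxt = x[w]
    return tuple(x)

def value(w, b):
    """V(w, b): expected final number of black sheep under Policy A."""
    if b == 0:
        return Fraction(0)                # no black sheep can ever appear
    if w >= b:
        w = b - 1                         # cull whites down to b - 1
    return antidiagonal(w + b)[w]

for k in (1, 2, 3, 10):
    print(k, value(k, k), float(value(k, k)))
```

Small cases are easy to confirm by hand: from $(1, 2)$ the flock moves to $(2, 1)$ (then culled to $(0, 1)$) with probability $\tfrac13$ and to $(0, 3)$ with probability $\tfrac23$, so $V(2,2) = V(1,2) = \tfrac13 + 2 = \tfrac73$.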
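15.4. Proof of Lemma 15.3(c)

It will be convenient to define
$$\text{(a)}\qquad v_k := V(k, k).$$
We have the following results: for $1 \le c \le k$,
$$\text{(b1)}\qquad V(k-c, k+c) = v_k + (2k - v_k)\,2^{-(2k-2)}\sum_{j=k}^{k+c-1}\binom{2k-1}{j},$$
$$\text{(b2)}\qquad V(k+1-c, k+c) = v_k + (2k+1 - v_k)\,2^{-(2k-1)}(1+p_k)^{-1}\sum_{j=k}^{k+c-1}\binom{2k}{j},$$
which simply reflect (15.3,a3) together with the 'boundary conditions':
$$\text{(c)}\qquad V(k, k) = v_k, \quad V(0, 2k) = 2k, \qquad V(k+1, k) = v_k, \quad V(0, 2k+1) = 2k+1.$$
Now, from (15.3,a2),
$$V(k+1, k+1) = V(k, k+1),$$
and hence, from (b2) with $c = 1$, we find that
$$\text{(d)}\qquad v_{k+1} = \frac{1-p_k}{1+p_k}\,v_k + \frac{2p_k}{1+p_k}\,(2k+1),$$
where $p_k$ is the chance of obtaining $k$ heads and $k$ tails in $2k$ tosses of a fair coin:
$$p_k = 2^{-2k}\binom{2k}{k}.$$
Result (d) is the key to proving things by induction.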
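The recursion for $v_k = V(k,k)$ lends itself to direct iteration. The sketch below takes recursion (d) in the form $v_{k+1} = \big((1-p_k)v_k + 2p_k(2k+1)\big)/(1+p_k)$ with $p_k = 2^{-2k}\binom{2k}{k}$ and $v_1 = V(1,1) = V(0,1) = 1$; this reading of (d) is an assumption of the example (it can be derived from the one-step dynamics on the odd antidiagonal), and the function names are hypothetical. The iterate is compared with the approximation $2k + \tfrac14\pi - \sqrt{\pi k}$.

```python
import math

def v_by_recursion(k_max):
    """Iterate v_{k+1} = ((1 - p_k) v_k + 2 p_k (2k + 1)) / (1 + p_k), starting from v_1 = 1."""
    v, p = 1.0, 0.5                      # v_1 = V(1, 1) = 1 and p_1 = 1/2
    for k in range(1, k_max):
        v = ((1.0 - p) * v + 2.0 * p * (2 * k + 1)) / (1.0 + p)
        p *= (2 * k + 1) / (2 * k + 2)   # p_{k+1} = p_k (2k + 1) / (2k + 2)
    return v

def approximation(k):
    """The asymptotic form 2k + pi/4 - sqrt(pi k)."""
    return 2 * k + math.pi / 4 - math.sqrt(math.pi * k)

for k in (1, 10, 100, 10_000):
    print(k, v_by_recursion(k), approximation(k))
```

The update of $p_k$ uses the ratio $p_{k+1}/p_k = (2k+1)/(2k+2)$, which avoids computing large binomial coefficients.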
162
(15.4),. is automatically
when
w
Proof
true
result
(15.3,cl).
b
From
is
when
w > b.
and
Hence, we
w
-{- b
need only
then
(fc
(15.3,a2),
result (15.3,cl)
establish
the
result
<
b.
Now
i{ w <
odd,
l -c,fc4-c)
that
c with
1 <
1 <
fc,
c <
it is
enough
to show
that
for
a <
>(2.-\302\253)2-\"-.(/;;_\\);
and
since
/
\\k +
2fc
\\
//2ilA
2fc-l
\\
//2A:-1\\
\\
a-l)
the
;'
we need
only
establish
a=
1:
(2fc+i-n)2-(\"-i>(i+Pfcr'(^jtO
>(2ifc-.02-(\"-^>(^\\~'),
which
reduces
to
(f)
Vk>2k--p-\\
follows
by
induction
from
the
fact
that
Pk
is decreasing
is
Proof
for
the
case
when
6 4- w^
even
may
similarly.
Proof of
automatically
result true
(15.3,c2). when
Because
w; <
reduction
(15.3,cl),
of (15.3,a3), the result (15.3,c2) is > b. we so need it w establish when 6, only of the 'general a' caseto the 'boundary case it is easily shown that it is sufficient to prove = (fc 4-1, fc 4-1) for some k. Formulae (b) {w, 6)
that
(2fc +
..(15,5)
and this
Chapter
15:
Applications
168
(d)
using
only
increasing
in k.
15.5. Proof
Define
of result (15.3,d)
OLk :=
Vk-2k-
{pkT^
JTT.
Then,
from
(15.4,d),
aA:+i =
where
(1 -
pk)o^k
4- pkCk,
Stirling's
find
Cfc
that
given e
> 0, we
can
N so that
<e iV,
for
k>N,
for
fc
>
\\oik+i\\ <(1
- pk)(l we have ak
pk-i)'
\"
(I
pN)\\otN\\+e.
0, and it
But, since
limsup
J^/Ojt
oo,
J][(l
\342\200\224
pk)
is now
clear that
|a;t+i| Because
\342\200\224> 0.
of the n\\ =
Stirling's
formula:
e^/^^^n)^ 0 <
6{n)
< 1,
we have
SO
that Vit
(2fc
4- ~
>/7rfc)
->
0,
as required.
We now take a quick look at filtering. The central idea combines Bayes' formula with a recursive property which is now illustrated by two examples.
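15.6. Recursive nature of conditional probability

Example. Suppose that $A$, $B$, $C$ and $D$ are events (elements of $\mathcal{F}$) each with strictly positive probability. Let us write (for example) $ABC$ for $A \cap B \cap C$. Let us also introduce the notation
$$C_A(B) := P(B \mid A) = P(AB)/P(A)$$
for conditional probabilities. The 'recursive property' in which we are interested is exemplified by
$$C_{ABC}(D) = C_{AB}(D \mid C) := \frac{C_{AB}(CD)}{C_{AB}(C)};$$
'if we want to find the conditional probability of $D$ given that $A$, $B$ and $C$ have occurred, we can assume that both $A$ and $B$ have occurred and find the $C_{AB}$ probability of $D$ given $C$'.

Example. Suppose that $X$, $Y$, $Z$, $T$ are RVs such that $(X, Y, Z, T)$ has a strictly positive joint pdf $f_{X,Y,Z,T}$ on $\mathbf{R}^4$:
$$P\{(X, Y, Z, T) \in B\} = \int\!\!\int\!\!\int\!\!\int_B f_{X,Y,Z,T}(x, y, z, t)\,dx\,dy\,dz\,dt.$$
Then, of course, $(X, Y, Z)$ has joint pdf $f_{X,Y,Z}$ on $\mathbf{R}^3$, where
$$f_{X,Y,Z}(x, y, z) = \int_{\mathbf{R}} f_{X,Y,Z,T}(x, y, z, t)\,dt.$$
The formula
$$f_{T|X,Y,Z}(t \mid x, y, z) := \frac{f_{X,Y,Z,T}(x, y, z, t)}{f_{X,Y,Z}(x, y, z)}$$
defines a ('regular') conditional pdf of $T$ given $X$, $Y$, $Z$: for $B \in \mathcal{B}$, we have, with all dependence on $\omega$ indicated,
$$P(T \in B \mid X, Y, Z)(\omega) = E(I_B(T) \mid X, Y, Z)(\omega) = \int_B f_{T|X,Y,Z}(t \mid X(\omega), Y(\omega), Z(\omega))\,dt.$$
Similarly,
$$f_{T,Z|X,Y}(t, z \mid x, y) = \frac{f_{X,Y,Z,T}(x, y, z, t)}{f_{X,Y}(x, y)}.$$
The recurrence is exemplified by
$$f_{T|X,Y,Z} = \frac{f_{T,Z|X,Y}}{f_{Z|X,Y}}.$$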
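15.7. Bayes' formula for bivariate normal distributions

With a now-clear notation, we have for RVs $X$, $Y$ with strictly positive joint pdf $f_{X,Y}$ on $\mathbb{R}^2$,
$$(*)\qquad f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \frac{f_X(x)\,f_{Y|X}(y \mid x)}{f_Y(y)}.$$
Thus
$$(**)\qquad f_{X|Y}(x \mid y) \propto f_X(x)\,f_{Y|X}(y \mid x),$$
the 'constant of proportionality' depending on $y$ but being determined by the fact that
$$\int_{\mathbb{R}} f_{X|Y}(x \mid y)\,dx = 1.$$
The meaning of the following Lemma is clarified within its proof.

LEMMA
(a) Suppose that $X$ and $Y$ are RVs and that $\mu, a, b \in \mathbb{R}$ and $U, W \in (0,\infty)$ are such that
$$\mathcal{L}(X) = N(\mu, U), \qquad C_X(Y) = N(a + bX, W).$$
Then
$$C_Y(X) = N(\hat{X}, V),$$
where the number $V \in (0,\infty)$ and the RV $\hat{X}$ are determined as follows:
$$\frac{1}{V} = \frac{1}{U} + \frac{b^2}{W}, \qquad \frac{\hat{X}}{V} = \frac{\mu}{U} + \frac{b(Y - a)}{W}.$$

Proof. The absolute distribution of $X$ is $N(\mu, U)$, so that
$$f_X(x) = (2\pi U)^{-1/2} \exp\left\{ -\frac{(x-\mu)^2}{2U} \right\}.$$
The conditional density of $Y$ given $X$ is the density of $N(a + bX, W)$, so that
$$f_{Y|X}(y \mid x) = (2\pi W)^{-1/2} \exp\left\{ -\frac{(y - a - bx)^2}{2W} \right\}.$$
Hence, from (**),
$$\log f_{X|Y}(x \mid y) = c_1(y) - \frac{(x-\mu)^2}{2U} - \frac{(y-a-bx)^2}{2W} = c_2(y) - \frac{(x - \hat{x})^2}{2V},$$
where $1/V = 1/U + b^2/W$ and $\hat{x}/V = \mu/U + b(y-a)/W$. The result follows.

COROLLARY
(b) With the notation of the Lemma, we have
$$\|X - \hat{X}\|_2^2 = E\{(X - \hat{X})^2\} = V.$$
Proof. Since $C_Y(X) = N(\hat{X}, V)$, we even have
$$E\{(X - \hat{X})^2 \mid Y\} = V, \quad \text{a.s.}$$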
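The update in Lemma (a) is easy to check numerically. The following is a minimal sketch, not part of the original text; the function name `posterior_normal` and its argument names are illustrative assumptions that simply mirror the symbols $\mu, U, a, b, W$ of the Lemma.

```python
def posterior_normal(mu, U, a, b, W, y):
    """Conditional law N(x_hat, V) of X given Y = y, assuming
    X ~ N(mu, U) and, given X, Y ~ N(a + b*X, W) (Lemma 15.7(a))."""
    V = 1.0 / (1.0 / U + b * b / W)          # 1/V = 1/U + b^2/W
    x_hat = V * (mu / U + b * (y - a) / W)   # x_hat/V = mu/U + b(y - a)/W
    return x_hat, V


# Illustrative values (assumed): prior N(0, 1), observation Y = X + noise of variance 1.
x_hat, V = posterior_normal(mu=0.0, U=1.0, a=0.0, b=1.0, W=1.0, y=2.0)
print(x_hat, V)
```

With equal prior and noise variances the posterior mean sits halfway between the prior mean and the observation, and the variance is halved, as the formulae predict.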
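15.8. Noisy observation of a single random variable

Let $X, \eta_1, \eta_2, \ldots$ be independent RVs, with
$$\mathcal{L}(X) = N(0, \sigma^2), \qquad \mathcal{L}(\eta_k) = N(0, 1).$$
Let $(c_n)$ be a sequence of positive real numbers, and let
$$Y_k = X + c_k \eta_k, \qquad \mathcal{F}_n = \sigma(Y_1, Y_2, \ldots, Y_n).$$
We regard each $Y_k$ as a noisy observation of $X$. We know that
$$M_n := E(X \mid \mathcal{F}_n) \to M_\infty := E(X \mid \mathcal{F}_\infty) \quad \text{a.s. and in } \mathcal{L}^2.$$
One interesting question mentioned at (10.4,c) is: when is it true that $M_\infty = X$ a.s.? Or again, when is $X$ a.s. equal to an $\mathcal{F}_\infty$-measurable RV?

Let us write $C_n$ to signify 'regular conditional law given $Y_1, Y_2, \ldots, Y_n$'. We have $C_0(X) = N(0, \sigma^2)$. Suppose that it is true that
$$C_{n-1}(X) = N(\hat{X}_{n-1}, V_{n-1}),$$
where $\hat{X}_{n-1}$ is a linear function of $Y_1, Y_2, \ldots, Y_{n-1}$ and $V_{n-1}$ a constant in $(0,\infty)$. Then, since $Y_n = X + c_n \eta_n$, we have
$$C_{n-1}(Y_n \mid X) = N(X, c_n^2).$$
From the Lemma 15.7(a), on taking
$$\mu = \hat{X}_{n-1}, \quad U = V_{n-1}, \quad a = 0, \quad b = 1, \quad W = c_n^2,$$
we have
$$C_{n-1}(X \mid Y_n) = N(\hat{X}_n, V_n),$$
where
$$\frac{1}{V_n} = \frac{1}{V_{n-1}} + \frac{1}{c_n^2}, \qquad \frac{\hat{X}_n}{V_n} = \frac{\hat{X}_{n-1}}{V_{n-1}} + \frac{Y_n}{c_n^2}.$$
But, by the recursive property of conditional laws indicated earlier, $C_n(X) = C_{n-1}(X \mid Y_n)$. We have now proved by induction that
$$C_n(X) = N(\hat{X}_n, V_n), \quad \forall n.$$
Now, of course,
$$M_n = \hat{X}_n, \qquad E\{(X - M_n)^2\} = V_n.$$
However,
$$V_n = \left\{ \sigma^{-2} + \sum_{k=1}^n c_k^{-2} \right\}^{-1}.$$
Our martingale $M$ is $\mathcal{L}^2$-bounded and so converges in $\mathcal{L}^2$. We now see that
$$M_\infty = X \text{ a.s. if and only if } \sum c_k^{-2} = \infty.$$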
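The induction above is a one-line recursion in code. The sketch below is only an illustration under the assumptions of this section (the names `filter_single_rv` and `closed_form_variance` are hypothetical); it runs the recursive update and compares the variance with the closed form $V_n = \{\sigma^{-2} + \sum_{k \le n} c_k^{-2}\}^{-1}$.

```python
def filter_single_rv(sigma2, c, ys):
    """Recursive update of (x_hat_n, V_n) for observations Y_k = X + c_k * eta_k,
    starting from the prior law N(0, sigma2) of X."""
    x_hat, V = 0.0, sigma2
    for c_n, y_n in zip(c, ys):
        V_new = 1.0 / (1.0 / V + 1.0 / c_n ** 2)       # 1/V_n = 1/V_{n-1} + 1/c_n^2
        x_hat = V_new * (x_hat / V + y_n / c_n ** 2)   # x_hat_n/V_n = x_hat_{n-1}/V_{n-1} + Y_n/c_n^2
        V = V_new
    return x_hat, V


def closed_form_variance(sigma2, c):
    """V_n = (sigma^{-2} + sum_k c_k^{-2})^{-1}."""
    return 1.0 / (1.0 / sigma2 + sum(1.0 / c_k ** 2 for c_k in c))


# Constant noise levels: sum c_k^{-2} diverges, so V_n decreases to 0.
print(closed_form_variance(1.0, [1.0] * 1000))
# Noise levels c_k = k: sum c_k^{-2} converges, so V_n stays bounded away from 0.
print(closed_form_variance(1.0, [float(k) for k in range(1, 1001)]))
```

The two printed variances illustrate the dichotomy: the observations determine $X$ in the limit exactly when $\sum c_k^{-2} = \infty$.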
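15.9. The Kalman-Bucy filter

The method of calculation used in the preceding sections allows immediate derivation of the famous Kalman-Bucy filter.

Let $A, H, C, K$ and $g$ be real constants. Suppose that $X_0, Y_0, \varepsilon_1, \varepsilon_2, \ldots, \eta_1, \eta_2, \ldots$ are independent RVs with
$$\mathcal{L}(\varepsilon_k) = \mathcal{L}(\eta_k) = N(0, 1), \qquad \mathcal{L}(X_0) = N(m, \sigma^2), \qquad Y_0 = 0.$$
The true state of a system at time $n$ is supposed given by $X_n$, where
$$X_n - X_{n-1} = A X_{n-1} + H \varepsilon_n + g.$$
However, the process $X$ cannot be observed directly: we can only observe the process $Y$, where
$$\text{(observation)} \qquad Y_n - Y_{n-1} = C X_n + K \eta_n.$$
We make the induction hypothesis that
$$C_{n-1}(X_{n-1}) = N(\hat{X}_{n-1}, V_{n-1}),$$
where $C_{n-1}$ signifies regular conditional law given $Y_1, Y_2, \ldots, Y_{n-1}$. Since
$$X_n = \alpha X_{n-1} + g + H \varepsilon_n, \quad \text{where } \alpha := 1 + A,$$
we have
$$C_{n-1}(X_n) = N(\alpha \hat{X}_{n-1} + g,\; \alpha^2 V_{n-1} + H^2).$$
Also, since $Y_n = Y_{n-1} + C X_n + K \eta_n$, we have
$$C_{n-1}(Y_n \mid X_n) = N(Y_{n-1} + C X_n,\; K^2).$$
Apply the bivariate-normal Lemma 15.7(a) to find that
$$C_n(X_n) = C_{n-1}(X_n \mid Y_n) = N(\hat{X}_n, V_n),$$
where
$$\text{(KB1)} \qquad \frac{1}{V_n} = \frac{1}{\alpha^2 V_{n-1} + H^2} + \frac{C^2}{K^2},$$
$$\text{(KB2)} \qquad \frac{\hat{X}_n}{V_n} = \frac{\alpha \hat{X}_{n-1} + g}{\alpha^2 V_{n-1} + H^2} + \frac{C (Y_n - Y_{n-1})}{K^2}.$$
Equation (KB1) shows that $V_n = f(V_{n-1})$. Examination of the rectangular hyperbola which is the graph of $f$ shows that $V_n \to V_\infty$, the positive root of $V = f(V)$.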
If one wishes to give a rigorous treatment of the K-B filter in continuous time, one is forced to use martingale and stochastic-integral techniques. See, for example, Rogers and Williams (1987) and references to filtering and control mentioned there. For more on the discrete-time situation, which is very important in practice, and on how filtering does link with stochastic control, see … (1990).

15.10.
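The martingale concept is well adapted to processes evolving in time because (discrete) time naturally belongs to the ordered space $\mathbb{Z}^+$. The question arises: does the martingale property transfer in some natural way to processes parametrized by (say) $\mathbb{Z}^d$?

Let me first explain a difficulty that arises with models in $\mathbb{Z}$ ($d = 1$) and in $\mathbb{Z}^2$.

Suppose that $(X_n : n \in \mathbb{Z})$ is a process such that each $X_n \in \mathcal{L}^1$ and that ('almost surely' qualifications will be dropped)
$$\text{(a)} \qquad E(X_n \mid X_m : m \ne n) = \tfrac{1}{2}(X_{n-1} + X_{n+1}), \quad \forall n.$$
For $m \in \mathbb{Z}$, define
$$\mathcal{G}_m = \sigma(X_k : k \le m), \qquad \mathcal{H}_m = \sigma(X_r : r \ge m).$$
The Tower Property shows that for $a, b$ in $\mathbb{Z}$ with $a < b$, we have for $a < r < b$,
$$E(X_r \mid \mathcal{G}_a, \mathcal{H}_b) = \tfrac{1}{2} E(X_{r-1} \mid \mathcal{G}_a, \mathcal{H}_b) + \tfrac{1}{2} E(X_{r+1} \mid \mathcal{G}_a, \mathcal{H}_b),$$
so that $r \mapsto E(X_r \mid \mathcal{G}_a, \mathcal{H}_b)$ is the linear interpolation
$$E(X_r \mid \mathcal{G}_a, \mathcal{H}_b) = \frac{b - r}{b - a} X_a + \frac{r - a}{b - a} X_b.$$
Hence, for $n \in \mathbb{Z}$ and $u \in \mathbb{N}$, we have, a.s.,
$$E(X_n \mid \mathcal{G}_{n-u}, \mathcal{H}_{n+1}) = \frac{u}{u+1} X_{n+1} + \frac{1}{u+1} X_{n-u}.$$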
Now,
Warning
(15.10)..
cr-algebras
as decrease
the
we had
Anyway,
by
L :=
and
E{Xn\\\\\\(^{Qn-u,Hn-\\-l))
u
- L.
Hence,
by
the
Tower Property,
E(Xn\\L,Hn+i)
\342\200\242
(b)
= Xn+1
\342\200\224 L
whence
we have
(n
1)L.
that
A further
(c)
Imin^ooiXn
\342\200\224
nL)
exists
(a.s.)
Hence
L=
By
uf
l\\ni(X u/u). oo
which led to (b) in
the
using
the arguments
reversed-time
sense,
we
now obtain
(d)
E{Xn+l\\L,gn)=Xn+L.
(d)
and
the Tower
Property,
L\\Xn-^l)
Xn-^l,
Exercise
Xn-^i
= Xn = nL-^
+ L. A,
Vn
Xn
so that
(almost) all
samplepaths
that
of
are straight
linesl
a harness property
that
and
every low-dimensional
harness is a straitjacket!
the type
of (a) should be called any analogue of result just obtained conveys the idea
..(15.12)
15.11.
in terms
should
Harnesses
that
unravelled, 1
rules
each
Thereason
(15.10,a)
say is that
Jtn :
but
Q -> R
then
require
only that
differences
(Xr-Xsir.sel)
be RVs
(that
is, -
be J^-measurable),
n,k
1 with
k ^ n.,
E(X\342\200\236
Xk\\Xm
Xfc
: m
7^
K^n-i
- Xk) + K^n+i
(1973).
^fc).
I call
this
a difference
harness in
that
Williams
Easy
exercise.
Suppose
: n (y\342\200\236
any function on
Q. Define
Y
e 1)
are IID
RVs
in
C^.
Let Xq
be
ifn<0. ^\"-\\Xo-EL\342\200\236+in
Thus,
_/^o
+ E*=in
ifn>0,
Xn
\342\200\224
Xn-i
= Yn, Vn.
Prove that X 2
is a difference
harness.
15.12.
Harnesses
unravelled,
In dimension d >
described
3,
we
do
not
in
related both to Gibbsstates in statisticalmechanics such that each Xn{n G Z^) is a, RV and, for n G Z^, EiXniXm : m
Z\"* \\ \342\202\254
the preceding
need to use the 'difference-process' unravelling section. For d > 3, there is a non-trivial model,
and
to
quantum
fields,
{n})
= (2d)-'
J] X See WiUiams
the (1973).
the
2d unit
vectors in 1^.
to a
'harness',
anticipating
harnesses,
Many
interesting
Hammersley (1966) contains many important ideas on later work on stochastic partial differential equations. unsolved on various kinds of harness remain. problems
terms
PART
C: CHARACTERISTIC
FUNCTIONS
Chapter 16
Basic
Properties
of CFs
Part
is merely
the briefest
is about
account
theory
that
is on
finds
it is
proper
it
its way
do
of characteristic stages in different spirit from the very something of the sample processes. paths
of
the
first
theory, book,
and
Korner
characteristic statistics, I
and
(1988); see
McKean
(1972).
On the
have
an essential
role in both
must
include
see
Chow
and Teicher
indicate the method
(1978) or Lukacs
on them. Forfull
probability and
treatment,
other hand,
Exercises
develop
the analogous
in
full
of moments
distributions
on
[0,1].
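16.1. Definition

The characteristic function (CF) $\varphi = \varphi_X$ of a random variable $X$ is defined to be the map $\varphi : \mathbb{R} \to \mathbb{C}$ (important: the domain is $\mathbb{R}$ not $\mathbb{C}$) defined by
$$\varphi(\theta) := E(e^{i\theta X}) = E\cos(\theta X) + iE\sin(\theta X).$$
Let $F := F_X$ be the distribution function of $X$, and let $\mu := \mu_X$ denote the law of $X$. Then
$$\varphi(\theta) := \int_{\mathbb{R}} e^{i\theta x}\,dF(x) := \int_{\mathbb{R}} e^{i\theta x}\,\mu(dx),$$
so that $\varphi$ is the Fourier transform of $\mu$, or the Fourier-Stieltjes transform of $F$. (We do not have the factor $(2\pi)^{-1/2}$ which is sometimes used in Fourier theory.) We often write $\varphi_F$ or $\varphi_\mu$ for $\varphi$.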
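16.2. Elementary properties

Let $\varphi = \varphi_X$ for a RV $X$. Then
(a) $\varphi(0) = 1$ (obviously);
(b) $|\varphi(\theta)| \le 1$, $\forall\theta$;
(c) $\theta \mapsto \varphi(\theta)$ is continuous on $\mathbb{R}$;
(d) $\varphi_{(-X)}(\theta) = \overline{\varphi_X(\theta)}$, $\forall\theta$;
(e) $\varphi_{aX+b}(\theta) = e^{ib\theta}\varphi_X(a\theta)$.

You can easily prove these properties; bounded convergence is enough to establish (c).

Note on differentiability (or lack of it). Standard analysis shows that if $n \in \mathbb{N}$ and $E(|X|^n) < \infty$, then we may formally differentiate $\varphi(\theta) = Ee^{i\theta X}$ $n$ times to obtain
$$\varphi^{(n)}(\theta) = E[(iX)^n e^{i\theta X}].$$
In particular, $\varphi^{(n)}(0) = i^n E(X^n)$. However, it is possible for $\varphi'(0)$ to exist when $E(|X|) = \infty$.

We shall see shortly that $\varphi$ can be the 'tent-function'
$$\varphi(\theta) = (1 - |\theta|)\, I_{[-1,1]}(\theta),$$
so that $\varphi$ need not be differentiable everywhere, and $\varphi$ can be 0 outside $[-1,1]$.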
uses
of CFs the
are the
following:
Central
Limit Theorem
distributions the
of limit RVs,
\342\200\242 to
prove
'only if
part
of
the
Three-Series
via
estimates
on tail
probabilities
such
results
as
and
normal
if X
X -{-Y
distributions.
these
uses
are discussed
in this book.
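16.4. Three key results

(a) If $X$ and $Y$ are independent RVs, then
$$\varphi_{X+Y}(\theta) = \varphi_X(\theta)\varphi_Y(\theta), \quad \forall\theta.$$
Proof. This is just 'independence means multiply' again:
$$Ee^{i\theta(X+Y)} = Ee^{i\theta X}e^{i\theta Y} = Ee^{i\theta X}\,Ee^{i\theta Y}.$$
(b) $F$ may be reconstructed from $\varphi$. See Theorem 16.6 for a precise statement.
(c) 'Weak' convergence of distribution functions corresponds exactly to convergence of the corresponding CFs. See Section 18.1 for a precise statement.

The way in which these results are used in the proof of the Central Limit Theorem is as follows. Suppose that $X_1, X_2, \ldots$ are IID RVs each with mean 0 and variance 1. From (a) and (16.2,e), we see that if $S_n := X_1 + \cdots + X_n$, then
$$E\exp(i\theta S_n/\sqrt{n}) = \varphi_X(\theta/\sqrt{n})^n.$$
We shall show that
$$\varphi_X(\theta/\sqrt{n})^n = \left\{1 - \tfrac{1}{2}\theta^2/n + o(1/n)\right\}^n \to \exp(-\tfrac{1}{2}\theta^2),$$
and since $\theta \mapsto \exp(-\tfrac{1}{2}\theta^2)$ is the CF of the standard normal distribution $N(0,1)$ (as we shall see shortly), it now follows from (c) that the distribution of $S_n/\sqrt{n}$ converges weakly to the distribution function $\Phi$ of the distribution $N(0,1)$. In this case, this means that
$$P(S_n/\sqrt{n} \le x) \to \Phi(x), \quad x \in \mathbb{R}.$$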
16.5. Atoms

In regard to (16.4,c), tidiness of results can be threatened by the presence of atoms.

If $P(X = c) > 0$, then the law $\mu$ of $X$ is said to have an atom at $c$, and the distribution function $F$ of $X$ has a discontinuity at $c$:
$$\mu(\{c\}) = F(c) - F(c-) = P(X = c).$$
Now $\mu$ can have at most $n$ atoms of mass at least $1/n$, so that the number of atoms of $\mu$ is at most countable.

It therefore follows that given $x \in \mathbb{R}$, there exists a sequence $(y_n)$ of reals with $y_n \downarrow x$ such that every $y_n$ is a non-atom of $\mu$ (equivalently a point of continuity of $F$); and then, by right-continuity of $F$, $F(y_n) \downarrow F(x)$.
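16.6. Lévy's Inversion Formula

This theorem puts the fact that distribution functions may be reconstructed from $\varphi$ in very explicit form. (Of course, the theorem does imply that if $F$ and $G$ are distribution functions such that $\varphi_F = \varphi_G$ on $\mathbb{R}$, then $F = G$.)

THEOREM
Let $\varphi$ be the CF of a RV $X$ which has law $\mu$ and distribution function $F$. Then, for $a < b$,
$$\text{(a)} \qquad \lim_{T \uparrow \infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta}\,\varphi(\theta)\,d\theta = \tfrac{1}{2}\mu(\{a\}) + \mu(a,b) + \tfrac{1}{2}\mu(\{b\}) = \tfrac{1}{2}[F(b) + F(b-)] - \tfrac{1}{2}[F(a) + F(a-)].$$
Moreover, if $\int_{\mathbb{R}} |\varphi(\theta)|\,d\theta < \infty$, then $X$ has continuous probability density function $f$, and
$$\text{(b)} \qquad f(x) = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i\theta x}\varphi(\theta)\,d\theta.$$
The 'duality' between (b) and the result
$$\text{(c)} \qquad \varphi(\theta) = \int_{\mathbb{R}} e^{i\theta x} f(x)\,dx$$
can be exploited as we shall see.

The proof of the theorem may be omitted on a first reading.

Proof of the theorem. For $u, v \in \mathbb{R}$ with $u \le v$,
$$\text{(d)} \qquad |e^{iv} - e^{iu}| \le |v - u|,$$
either from a picture or since
$$\left| \int_u^v ie^{it}\,dt \right| \le \int_u^v |ie^{it}|\,dt = \int_u^v 1\,dt.$$
Let $a, b \in \mathbb{R}$, with $a < b$. Fubini's Theorem allows us to say that, for $0 < T < \infty$,
$$\text{(e)} \qquad \frac{1}{2\pi}\int_{-T}^{T} \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta}\,\varphi(\theta)\,d\theta = \frac{1}{2\pi}\int_{\mathbb{R}} \left\{ \int_{-T}^{T} \frac{e^{i\theta(x-a)} - e^{i\theta(x-b)}}{i\theta}\,d\theta \right\} \mu(dx),$$
provided we show that
$$C_T := \frac{1}{2\pi}\int_{\mathbb{R}} \left\{ \int_{-T}^{T} \left| \frac{e^{i\theta(x-a)} - e^{i\theta(x-b)}}{i\theta} \right| d\theta \right\} \mu(dx) < \infty.$$
However, inequality (d) shows that $C_T \le (b - a)T/\pi$, so that (e) is valid.

Next, we can exploit the evenness of the cosine function and the oddness of the sine function to obtain
$$\text{(f)} \qquad \frac{1}{2\pi}\int_{-T}^{T} \frac{e^{i\theta(x-a)} - e^{i\theta(x-b)}}{i\theta}\,d\theta = \frac{\operatorname{sgn}(x-a)\,S(|x-a|T) - \operatorname{sgn}(x-b)\,S(|x-b|T)}{\pi},$$
where, as usual,
$$\operatorname{sgn}(x) := \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0, \end{cases}
\qquad S(U) := \int_0^U \frac{\sin x}{x}\,dx \quad (U > 0).$$
Even though the Lebesgue integral $\int_0^\infty x^{-1}\sin x\,dx$ does not exist, we have (see Exercise E16.1)
$$\lim_{U \uparrow \infty} S(U) = \frac{\pi}{2}.$$
The expression (f) is bounded simultaneously in $x$ and $T$ for our fixed $a$ and $b$; and, as $T \uparrow \infty$, it converges to
$$\begin{cases} 0 & \text{if } x < a \text{ or } x > b, \\ \tfrac{1}{2} & \text{if } x = a \text{ or } x = b, \\ 1 & \text{if } a < x < b. \end{cases}$$
The Bounded Convergence Theorem now yields result (a).

Suppose now that $\int_{\mathbb{R}} |\varphi(\theta)|\,d\theta < \infty$. We can then let $T \uparrow \infty$ in result (a) and use (DOM) to obtain
$$\text{(g)} \qquad F(b) - F(a) = \frac{1}{2\pi}\int_{\mathbb{R}} \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta}\,\varphi(\theta)\,d\theta,$$
provided that $F$ is continuous at $a$ and $b$. However, (DOM) shows that the right-hand side of (g) is continuous in $a$ and $b$ and (why?!) we can conclude that $F$ has no atoms and that (g) holds for all $a$ and $b$ with $a < b$.

We now have
$$\text{(h)} \qquad \frac{F(b) - F(a)}{b - a} = \frac{1}{2\pi}\int_{\mathbb{R}} \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta(b - a)}\,\varphi(\theta)\,d\theta.$$
But, by (d),
$$\left| \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta(b - a)} \right| \le 1.$$
Hence, the assumption that $\int_{\mathbb{R}} |\varphi(\theta)|\,d\theta < \infty$ allows us to use (DOM) to let $b \to a$ in (h) to obtain
$$F'(a) = f(a) := \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i\theta a}\varphi(\theta)\,d\theta,$$
and, finally, $f$ is continuous by (DOM).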
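Formula (b) lends itself to a quick numerical check. The following is a rough sketch only: the function `density_from_cf`, the truncation level `T` and the step count are assumptions chosen for illustration, and the integral over $\mathbb{R}$ is approximated by the trapezoid rule on $[-T, T]$, which is reasonable only when $\varphi$ is integrable and small outside that interval.

```python
import cmath
import math


def density_from_cf(cf, x, T=40.0, steps=20000):
    """Approximate f(x) = (1/2pi) * integral of exp(-i*theta*x) * cf(theta)
    over [-T, T] by the trapezoid rule (a truncation of formula (16.6,b))."""
    h = 2.0 * T / steps
    total = 0.0
    for j in range(steps + 1):
        theta = -T + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        # The density is real, so only the real part of the integrand matters.
        total += weight * (cmath.exp(-1j * theta * x) * cf(theta)).real
    return total * h / (2.0 * math.pi)


def normal_cf(theta):
    return math.exp(-0.5 * theta * theta)   # CF of N(0, 1)


def cauchy_cf(theta):
    return math.exp(-abs(theta))            # CF of the standard Cauchy law


print(density_from_cf(normal_cf, 0.0), 1.0 / math.sqrt(2.0 * math.pi))
print(density_from_cf(cauchy_cf, 1.0), 1.0 / (2.0 * math.pi))
```

Each printed pair compares the numerical inversion with the known density value, $1/\sqrt{2\pi}$ for $N(0,1)$ at 0 and $1/(\pi(1 + x^2))$ at $x = 1$ for the Cauchy law.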
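16.7. A table

Each entry lists: distribution; pdf; support; CF.

1. $N(\mu, \sigma^2)$; $\dfrac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\dfrac{(x-\mu)^2}{2\sigma^2}\right\}$; $\mathbb{R}$; $\exp(i\mu\theta - \tfrac{1}{2}\sigma^2\theta^2)$
2. $U[0,1]$; $1$; $[0,1]$; $\dfrac{e^{i\theta} - 1}{i\theta}$
3. $U[-1,1]$; $\tfrac{1}{2}$; $[-1,1]$; $\dfrac{\sin\theta}{\theta}$
4. Double exponential; $\tfrac{1}{2}e^{-|x|}$; $\mathbb{R}$; $\dfrac{1}{1 + \theta^2}$
5. Cauchy; $\dfrac{1}{\pi(1 + x^2)}$; $\mathbb{R}$; $e^{-|\theta|}$
6. Triangular; $1 - |x|$; $[-1,1]$; $\dfrac{2(1 - \cos\theta)}{\theta^2}$
7. Anon; $\dfrac{1 - \cos x}{\pi x^2}$; $\mathbb{R}$; $(1 - |\theta|)\,I_{[-1,1]}(\theta)$

The two lines 4 and 5 illustrate the duality between (16.6,b) and (16.6,c), as do the two lines 6 and 7. Hints on verifying the table are given in the exercises.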
Chapter 17
Weak
Convergence
In this chapter, we consider the appropriate concept of 'convergence' for probability measures on $(\mathbb{R}, \mathcal{B})$. The terminology 'weak convergence' is unfortunate: the concept is closer to 'weak*' than to 'weak' convergence in the senses used by functional analysts. 'Narrow convergence' is the official pure-mathematical term. However, probabilists seem determined to use 'weak convergence' in their sense, and, reluctantly, I follow them here.

We are studying the special case of weak convergence on a Polish (complete, separable, metric) space $S$ when $S = \mathbb{R}$; and we unashamedly use special features of $\mathbb{R}$. For the general theory, see Parthasarathy (1967) or, for a superb account of its current scope, Ethier and Kurtz (1986).
Notation. We write $\mathrm{Prob}(\mathbb{R})$ for the space of probability measures on $\mathbb{R}$, and $C_b(\mathbb{R})$ for the space of bounded continuous functions on $\mathbb{R}$.

17.1. The 'elegant' definition

Let $(\mu_n : n \in \mathbb{N})$ be a sequence in $\mathrm{Prob}(\mathbb{R})$ and let $\mu \in \mathrm{Prob}(\mathbb{R})$. We say that $\mu_n$ converges weakly to $\mu$ if (and only if)
$$\text{(a)} \qquad \mu_n(h) \to \mu(h), \quad \forall h \in C_b(\mathbb{R}),$$
and then write
$$\text{(b)} \qquad \mu_n \xrightarrow{w} \mu.$$
We know that elements of $\mathrm{Prob}(\mathbb{R})$ correspond to distribution functions via the correspondence
$$\mu \leftrightarrow F, \quad \text{where } F(x) = \mu(-\infty, x].$$
Weak convergence of distribution functions is defined in the obvious way:
$$\text{(c)} \qquad F_n \xrightarrow{w} F \quad \text{if } \mu_n \xrightarrow{w} \mu.$$
We are generally interested in the case when $F_n = F_{X_n}$, that is, when $F_n$ is the distribution function of some random variable $X_n$. Then, by (6.12), we have, for $h \in C_b(\mathbb{R})$,
$$\mu_n(h) = \int_{\mathbb{R}} h(x)\,dF_n(x) = Eh(X_n).$$
Note that the statement $F_n \xrightarrow{w} F$ is meaningful even if the $X_n$'s are defined on different probability spaces.

However, if $X_n$ ($n \in \mathbb{N}$) and $X$ are RVs on the same triple $(\Omega, \mathcal{F}, P)$, then
$$\text{(d)} \qquad (X_n \to X, \text{ a.s.}) \Rightarrow (F_{X_n} \xrightarrow{w} F_X),$$
and indeed,
$$\text{(e)} \qquad (X_n \to X \text{ in prob}) \Rightarrow (F_{X_n} \xrightarrow{w} F_X).$$
Proof of (d). Suppose that $X_n \to X$, a.s., and that $\mu_n$ is the law of $X_n$ and $\mu$ is the law of $X$. Then, for $h \in C_b(\mathbb{R})$, we have $h(X_n) \to h(X)$, a.s., and, by (BDD),
$$\mu_n(h) = Eh(X_n) \to Eh(X) = \mu(h).$$
17.2. A 'practical' formulation

Example. Atoms are a nuisance. Suppose that $X_n = 1/n$, $X = 0$. Let $\mu_n$ be the law of $X_n$, so that $\mu_n$ is the unit mass at $1/n$, and let $\mu$ be the law of $X$. Then, for $h \in C_b(\mathbb{R})$,
$$\mu_n(h) = h(n^{-1}) \to h(0) = \mu(h),$$
so that $\mu_n \xrightarrow{w} \mu$. However,
$$F_n(0) = 0 \not\to F(0) = 1.$$

LEMMA
(a) Let $(F_n)$ be a sequence of DFs on $\mathbb{R}$, and let $F$ be a DF on $\mathbb{R}$. Then $F_n \xrightarrow{w} F$ if and only if
$$\lim_n F_n(x) = F(x)$$
for every non-atom (that is, every point of continuity) $x$ of $F$.

Proof of 'only if' part. Suppose that $F_n \xrightarrow{w} F$. Let $x \in \mathbb{R}$, and let $\delta > 0$. Define $h \in C_b(\mathbb{R})$ via
$$h(y) := \begin{cases} 1 & \text{if } y \le x, \\ 1 - \delta^{-1}(y - x) & \text{if } x < y < x + \delta, \\ 0 & \text{if } y \ge x + \delta. \end{cases}$$
Then $\mu_n(h) \to \mu(h)$. Now,
$$F_n(x) \le \mu_n(h) \quad \text{and} \quad \mu(h) \le F(x + \delta),$$
so that
$$\limsup_n F_n(x) \le F(x + \delta).$$
However, $F$ is right-continuous, so we may let $\delta \downarrow 0$ to obtain
$$\text{(b)} \qquad \limsup_n F_n(x) \le F(x), \quad \forall x \in \mathbb{R}.$$
In similar fashion, working with $y \mapsto h(y + \delta)$, we find that for $x \in \mathbb{R}$ and $\delta > 0$,
$$\liminf_n F_n(x-) \ge F(x - \delta),$$
so that
$$\text{(c)} \qquad \liminf_n F_n(x-) \ge F(x-), \quad \forall x \in \mathbb{R}.$$
Inequalities (b) and (c) refine the desired result. □

In the next section, we obtain a nice representation, which also yields the 'if' part.
17.3. Skorokhod representation

THEOREM
Suppose that $(F_n : n \in \mathbb{N})$ is a sequence of DFs on $\mathbb{R}$, that $F$ is a DF on $\mathbb{R}$ and that $F_n(x) \to F(x)$ at every point $x$ of continuity of $F$. Then there exists a probability triple $(\Omega, \mathcal{F}, P)$ carrying a sequence $(X_n)$ of RVs and also a RV $X$ such that
$$F_n = F_{X_n}, \qquad F = F_X,$$
and
$$X_n \to X \quad \text{a.s.}$$
This is a kind of 'converse' to (17.1,d).

Proof. We simply use the construction given earlier for a RV with prescribed distribution function. Thus, take
$$(\Omega, \mathcal{F}, P) = ([0,1], \mathcal{B}[0,1], \mathrm{Leb}),$$
define
$$X^+(\omega) := \inf\{z : F(z) > \omega\}, \qquad X^-(\omega) := \inf\{z : F(z) \ge \omega\},$$
and define $X_n^+$, $X_n^-$ similarly. Then $X^+$ and $X^-$ have DF $F$ and $P(X^+ = X^-) = 1$.

Fix $\omega$. Let $z$ be a non-atom of $F$ with $z > X^+(\omega)$. Then $F(z) > \omega$, and hence, for large $n$, $F_n(z) > \omega$, so that $X_n^+(\omega) \le z$. So $\limsup_n X_n^+(\omega) \le z$. But (since non-atoms are dense), we can choose $z \downarrow X^+(\omega)$. Hence
$$\limsup_n X_n^+(\omega) \le X^+(\omega),$$
and, by similar arguments,
$$\liminf_n X_n^-(\omega) \ge X^-(\omega).$$
Since $X_n^- \le X_n^+$ and $P(X^+ = X^-) = 1$, the result follows.
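For a law with finitely many atoms, the maps $X^+$ and $X^-$ used in the proof can be written down explicitly. The sketch below is illustrative only: `quantile_maps` is a hypothetical helper, the atoms are assumed to be listed in increasing order, and $\omega$ is assumed to lie in the open interval $(0,1)$.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def quantile_maps(values, probs):
    """For a discrete law with atoms `values` (ascending) and masses `probs`,
    return the maps
        x_plus(w)  = inf{z : F(z) >  w},
        x_minus(w) = inf{z : F(z) >= w},
    defined for w in (0, 1)."""
    cum = list(accumulate(probs))   # values of F at the atoms
    last = len(values) - 1

    def x_plus(w):
        return values[min(bisect_right(cum, w), last)]

    def x_minus(w):
        return values[min(bisect_left(cum, w), last)]

    return x_plus, x_minus


# Fair coin on {0, 1}: the two maps differ only at w = 1/2, a set of Lebesgue measure 0.
x_plus, x_minus = quantile_maps([0, 1], [0.5, 0.5])
print(x_plus(0.5), x_minus(0.5), x_plus(0.3), x_minus(0.7))

# Unit mass at 1/n converges weakly to unit mass at 0, and X_n^+(w) = 1/n -> 0 for every w.
for n in (1, 10, 100):
    xn_plus, _ = quantile_maps([1.0 / n], [1.0])
    print(n, xn_plus(0.3))
```

The second loop revisits the atom example of Section 17.2: although $F_n(0) \not\to F(0)$, the representing variables converge pointwise.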
17.4. Sequential compactness for $\mathrm{Prob}(\bar{\mathbb{R}})$

There is a problem in working with the non-compact space $\mathbb{R}$. Let $\mu_n$ be the unit mass at $n$. No subsequence of $(\mu_n)$ converges weakly in $\mathrm{Prob}(\mathbb{R})$, but $\mu_n \xrightarrow{w} \mu_\infty$ in $\mathrm{Prob}(\bar{\mathbb{R}})$, where $\mu_\infty$ is the unit mass at $\infty$. Here $\bar{\mathbb{R}}$ is the compact metrizable space $[-\infty, \infty]$, the definition of $\mathrm{Prob}(\bar{\mathbb{R}})$ is obvious, and $\mu_n \xrightarrow{w} \mu_\infty$ in $\mathrm{Prob}(\bar{\mathbb{R}})$ means that
$$\mu_n(h) \to \mu_\infty(h), \quad \forall h \in C(\bar{\mathbb{R}}).$$
(We do not need the subscript 'b' on $C(\bar{\mathbb{R}})$ because elements of $C(\bar{\mathbb{R}})$ are bounded.) It is important to keep remembering that while functions in $C(\bar{\mathbb{R}})$ tend to limits at $+\infty$ and $-\infty$, functions in $C_b(\mathbb{R})$ need not. The space $C(\bar{\mathbb{R}})$ is separable (it has a countable dense subset) while the space $C_b(\mathbb{R})$ is not.

Let me briefly describe how one should think of the next topic. Here I assume some functional analysis, but from the next paragraph (not the next section) on, I resort to bare-hands elementary treatment. By the Riesz representation theorem, the dual space $C(\bar{\mathbb{R}})^*$ of $C(\bar{\mathbb{R}})$ is the space of bounded signed measures on $(\bar{\mathbb{R}}, \mathcal{B}(\bar{\mathbb{R}}))$. The weak* topology of $C(\bar{\mathbb{R}})^*$ is metrizable (because $C(\bar{\mathbb{R}})$ is separable), and under this topology the unit ball of $C(\bar{\mathbb{R}})^*$ is compact and contains $\mathrm{Prob}(\bar{\mathbb{R}})$ as a closed subset. The weak* topology of $\mathrm{Prob}(\bar{\mathbb{R}})$ is exactly the probabilists' weak topology, so

(a) $\mathrm{Prob}(\bar{\mathbb{R}})$ is a compact metrizable space under our probabilists' weak topology.

The bare-hands substitute for (a) is the following.

LEMMA (Helly-Bray)
(b) Let $(F_n)$ be any sequence of distribution functions on $\mathbb{R}$. Then there exist a right-continuous non-decreasing function $F$ on $\mathbb{R}$ such that $0 \le F \le 1$ and a subsequence $(n_i)$ such that
$$(*)\qquad \lim_i F_{n_i}(x) = F(x) \text{ at every point } x \text{ of continuity of } F.$$

Proof. We make an obvious use of 'the diagonal principle'. Take a countable dense set $C$ of $\mathbb{R}$ and label it:
$$C = \{c_1, c_2, c_3, \ldots\}.$$
Since $(F_n(c_1) : n \in \mathbb{N})$ is a bounded sequence, it contains a convergent subsequence $(F_{n(1,j)}(c_1))$:
$$F_{n(1,j)}(c_1) \to H(c_1) \text{ (say) as } j \to \infty.$$
In some subsequence of this subsequence, we shall have
$$F_{n(2,j)}(c_2) \to H(c_2) \text{ as } j \to \infty;$$
and so on. If we put $n_i = n(i,i)$, then we shall have:
$$H(c) := \lim_i F_{n_i}(c) \text{ exists for every } c \text{ in } C.$$
Obviously, $0 \le H \le 1$, and $H$ is non-decreasing on $C$. For $x \in \mathbb{R}$, define
$$F(x) := \lim_{c \downarrow\downarrow x} H(c),$$
the '$\downarrow\downarrow$' signifying that $c$ decreases strictly to $x$ through $C$. (In particular, $F(c)$ need not equal $H(c)$ for $c$ in $C$.)

Our function $F$ is right-continuous, as you can check. By the kind of 'limsupery' used in the preceding sections, you can also check that (*) holds: I wouldn't dream of depriving you of that pleasure.
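17.5. Tightness

Of course, the function $F$ in the Helly-Bray Lemma need not be a distribution function. It will be a distribution function if and only if
$$\lim_{x \downarrow -\infty} F(x) = 0, \qquad \lim_{x \uparrow +\infty} F(x) = 1.$$
Definition. A sequence $(F_n)$ of distribution functions is called tight if, given $\varepsilon > 0$, there exists $K > 0$ such that
$$\mu_n[-K, K] = F_n(K) - F_n(-K-) > 1 - \varepsilon, \quad \forall n.$$
You can see the idea: 'tightness stops mass being pushed out to $+\infty$ or $-\infty$'.

LEMMA
Suppose that $(F_n)$ is a sequence of DFs.
(a) If $F_n \xrightarrow{w} F$ for some DF $F$, then $(F_n)$ is tight.
(b) If $(F_n)$ is tight, then there exists a subsequence $(F_{n_i})$ and a DF $F$ such that $F_{n_i} \xrightarrow{w} F$.

This is a really easy limsupery exercise.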
Chapter
18
The
Central
Limit
Theorem
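The Central Limit Theorem (CLT) is one of the great results of mathematics. Here we derive it as a corollary of Lévy's Convergence Theorem, which says that weak convergence of DFs corresponds exactly to pointwise convergence of CFs.

18.1. Lévy's Convergence Theorem

Let $(F_n)$ be a sequence of DFs, and let $\varphi_n$ denote the CF of $F_n$. Suppose that
$$g(\theta) := \lim_n \varphi_n(\theta) \text{ exists for all } \theta \in \mathbb{R},$$
and that $g(\cdot)$ is continuous at 0. Then $g = \varphi_F$ for some distribution function $F$, and
$$F_n \xrightarrow{w} F.$$

Proof. Assume for the moment that
(a) the sequence $(F_n)$ is tight.
Then, by the Helly-Bray result 17.5(b), we can find a subsequence $(F_{n_k})$ and a DF $F$ such that $F_{n_k} \xrightarrow{w} F$. But then, for $\theta \in \mathbb{R}$, we have
$$\varphi_{n_k}(\theta) \to \varphi_F(\theta) \qquad (\varphi_{n_k}: \text{CF of } F_{n_k})$$
(take $h(x) = e^{i\theta x}$). Thus $g = \varphi_F$.

Now we argue by contradiction. Suppose that $(F_n)$ does not converge weakly to $F$. Then, for some point $x$ of continuity of $F$, we can find a subsequence which we shall denote by $(\tilde{F}_n)$ and an $\eta > 0$ such that
$$(*)\qquad |\tilde{F}_n(x) - F(x)| \ge \eta, \quad \forall n.$$
But $(\tilde{F}_n)$ is tight, so that we can find a subsequence $\tilde{F}_{n_j}$ and a DF $\tilde{F}$ such that $\tilde{F}_{n_j} \xrightarrow{w} \tilde{F}$. But then $\tilde{\varphi}_{n_j} \to \varphi_{\tilde{F}}$, so that $\varphi_{\tilde{F}} = g = \varphi_F$. Since a CF determines the corresponding DF uniquely, we see that $\tilde{F} = F$, so that, in particular, $x$ is a non-atom of $\tilde{F}$ and
$$(**)\qquad \tilde{F}_{n_j}(x) \to \tilde{F}(x) = F(x).$$
The contradiction between (*) and (**) clinches the result. □

Proof of tightness of $(F_n)$. Let $\varepsilon > 0$ be given. Since the expression
$$\varphi_n(\theta) + \varphi_n(-\theta) = \int_{\mathbb{R}} 2\cos(\theta x)\,dF_n(x)$$
is real, it follows that $g(\theta) + g(-\theta)$ is real (and obviously bounded above by 2). Since $g$ is continuous at 0 and equal to 1 at 0, we can choose $\delta > 0$ such that
$$|1 - g(\theta)| < \tfrac{1}{4}\varepsilon \quad \text{when } |\theta| < \delta.$$
We now have
$$0 \le \delta^{-1}\int_0^\delta \{2 - g(\theta) - g(-\theta)\}\,d\theta \le \tfrac{1}{2}\varepsilon.$$
Since $g = \lim \varphi_n$, the Bounded Convergence Theorem for the finite interval $[0, \delta]$ shows that there exists $n_0$ in $\mathbb{N}$ such that for $n \ge n_0$,
$$\delta^{-1}\int_0^\delta \{2 - \varphi_n(\theta) - \varphi_n(-\theta)\}\,d\theta \le \varepsilon.$$
However,
$$\delta^{-1}\int_0^\delta \{2 - \varphi_n(\theta) - \varphi_n(-\theta)\}\,d\theta = \delta^{-1}\int_{-\delta}^{\delta}\left\{\int_{\mathbb{R}} (1 - e^{i\theta x})\,dF_n(x)\right\}d\theta = \int_{\mathbb{R}}\left\{\delta^{-1}\int_{-\delta}^{\delta}(1 - e^{i\theta x})\,d\theta\right\}dF_n(x),$$
the interchange of order of integration being justified by the fact that, since $|1 - e^{i\theta x}| \le 2$, the 'integral of the absolute value' is clearly finite. We now have, for $n \ge n_0$,
$$\varepsilon \ge 2\int_{\mathbb{R}}\left(1 - \frac{\sin \delta x}{\delta x}\right)dF_n(x) \ge 2\int_{|x| > 2\delta^{-1}}\left(1 - \frac{1}{|\delta x|}\right)dF_n(x) \ge \int_{|x| > 2\delta^{-1}} dF_n = \mu_n\{x : |x| > 2\delta^{-1}\},$$
and it is now clear that the sequence $(F_n)$ is tight. □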
If you now re-read Section 16.4, you will realize that the next task is to obtain 'Taylor' estimates on characteristic functions.

18.2. o and O notation

Recall that
$$f(t) = O(g(t)) \text{ as } t \to L$$
means that
$$\limsup_{t \to L} |f(t)/g(t)| < \infty,$$
and that
$$f(t) = o(g(t)) \text{ as } t \to L$$
means that
$$f(t)/g(t) \to 0 \text{ as } t \to L.$$
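18.3. Some important estimates

For $n = 0, 1, 2, \ldots$ and $x$ real, define the 'remainder'
$$R_n(x) = e^{ix} - \sum_{k=0}^n \frac{(ix)^k}{k!}.$$
Then
$$R_0(x) = e^{ix} - 1 = \int_0^x ie^{iy}\,dy,$$
and from these two expressions we see that $|R_0(x)| \le \min(2, |x|)$. Since
$$R_n(x) = \int_0^x iR_{n-1}(y)\,dy,$$
we obtain by induction:
$$\text{(a)} \qquad |R_n(x)| \le \min\left(\frac{2|x|^n}{n!}, \frac{|x|^{n+1}}{(n+1)!}\right).$$
Suppose now that $X$ is a zero-mean RV in $\mathcal{L}^2$:
$$E(X) = 0, \qquad \sigma^2 := \mathrm{Var}(X) < \infty.$$
Then, with $\varphi$ denoting $\varphi_X$, we have
$$|\varphi(\theta) - (1 - \tfrac{1}{2}\sigma^2\theta^2)| = |ER_2(\theta X)| \le E|R_2(\theta X)| \le \theta^2 E\left(|X|^2 \wedge \frac{|\theta||X|^3}{6}\right).$$
The final term within $E(\cdot)$ is dominated by the integrable RV $|X|^2$ and tends to 0 as $\theta \to 0$. Hence, by (DOM), we have
$$\text{(b)} \qquad \varphi(\theta) = 1 - \tfrac{1}{2}\sigma^2\theta^2 + o(\theta^2) \text{ as } \theta \to 0.$$
Next, for $|z| < 1$, and with principal values for logs,
$$\log(1 + z) - z = \int_0^z \frac{-w}{1 + w}\,dw = -z^2\int_0^1 \frac{t\,dt}{1 + tz},$$
and since $|1 + tz| \ge \tfrac{1}{2}$ if $|z| \le \tfrac{1}{2}$, we have
$$\text{(c)} \qquad |\log(1 + z) - z| \le |z|^2, \quad |z| \le \tfrac{1}{2}.$$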
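The remainder bound $|R_n(x)| \le \min(2|x|^n/n!, |x|^{n+1}/(n+1)!)$ obtained by induction in Section 18.3 is easy to sanity-check numerically. The following is a minimal sketch; the function names `remainder` and `remainder_bound` are assumptions made for illustration.

```python
import cmath
import math


def remainder(n, x):
    """R_n(x) = exp(ix) - sum_{k=0}^{n} (ix)^k / k!."""
    partial = sum((1j * x) ** k / math.factorial(k) for k in range(n + 1))
    return cmath.exp(1j * x) - partial


def remainder_bound(n, x):
    """min(2|x|^n / n!, |x|^(n+1) / (n+1)!)."""
    ax = abs(x)
    return min(2.0 * ax ** n / math.factorial(n),
               ax ** (n + 1) / math.factorial(n + 1))


for n in range(4):
    for x in (0.1, 1.0, 3.0, 10.0):
        print(n, x, abs(remainder(n, x)), remainder_bound(n, x))
```

For small $|x|$ the second term of the minimum is the sharper one; for large $|x|$ the first term takes over, which is exactly why both are kept.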
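18.4. The Central Limit Theorem

Let $(X_n)$ be an IID sequence, each $X_n$ distributed as $X$ where
$$E(X) = 0, \qquad \sigma^2 := \mathrm{Var}(X) \in (0, \infty).$$
Define $S_n := X_1 + \cdots + X_n$, and set
$$G_n := \frac{S_n}{\sigma\sqrt{n}}.$$
Then, for $x \in \mathbb{R}$, we have, as $n \to \infty$,
$$P(G_n \le x) \to \Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^x \exp(-\tfrac{1}{2}y^2)\,dy.$$

Proof. Fix $\theta$ in $\mathbb{R}$. Then, using (18.3,b),
$$\varphi_{G_n}(\theta) = \varphi_{S_n}\left(\frac{\theta}{\sigma\sqrt{n}}\right) = \varphi_X\left(\frac{\theta}{\sigma\sqrt{n}}\right)^n = \left\{1 - \frac{1}{2}\frac{\theta^2}{n} + o\left(\frac{1}{n}\right)\right\}^n,$$
the 'o' now referring to the situation when $n \to \infty$. But now, using (18.3,c), as $n \to \infty$,
$$\log \varphi_{G_n}(\theta) = n\log\left\{1 - \frac{1}{2}\frac{\theta^2}{n} + o\left(\frac{1}{n}\right)\right\} = n\left\{-\frac{1}{2}\frac{\theta^2}{n} + o\left(\frac{1}{n}\right)\right\} \to -\tfrac{1}{2}\theta^2.$$
Hence $\varphi_{G_n}(\theta) \to \exp(-\tfrac{1}{2}\theta^2)$, and since $\theta \mapsto \exp(-\tfrac{1}{2}\theta^2)$ is the CF of the normal distribution, the result follows from Theorem 18.1. □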
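The convergence of characteristic functions used in this proof can be watched directly in a simple case. As an illustrative assumption (not taken from the text), let $X$ take the values $\pm 1$ with probability $\tfrac{1}{2}$ each, so that $\sigma = 1$ and $\varphi_X(\theta) = \cos\theta$; then $\varphi_{G_n}(\theta) = \cos(\theta/\sqrt{n})^n$.

```python
import math


def cf_normalized_sum(theta, n):
    """CF of G_n = S_n / sqrt(n) when the X_k are fair +/-1 coin tosses:
    phi_X(theta) = cos(theta), so phi_{G_n}(theta) = cos(theta / sqrt(n)) ** n."""
    return math.cos(theta / math.sqrt(n)) ** n


theta = 1.0
target = math.exp(-0.5 * theta * theta)   # CF of N(0, 1) at theta
for n in (10, 1000, 10 ** 6):
    value = cf_normalized_sum(theta, n)
    print(n, value, abs(value - target))
```

The printed errors shrink roughly like $1/n$, in line with the $o(1/n)$ term in the expansion of $\varphi_X$.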
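18.5. Example

Let us look at a simple example which shows how the method may be adapted to deal with a sequence of independent but non-IID RVs. With the notation and assumptions of the Record Problem, let $E_1, E_2, \ldots$ be independent events on $(\Omega, \mathcal{F}, P)$ with $P(E_n) = 1/n$, and let $N_n := I_{E_1} + \cdots + I_{E_n}$ be the 'number of records by time $n$'. Then
$$E(N_n) = \sum_{k \le n} \frac{1}{k} = \log n + \gamma + o(1) \qquad (\gamma \text{ is Euler's constant}),$$
$$\mathrm{Var}(N_n) = \sum_{k \le n} \frac{1}{k}\left(1 - \frac{1}{k}\right) = \log n + \gamma - \frac{\pi^2}{6} + o(1).$$
Let
$$G_n := \frac{N_n - \log n}{\sqrt{\log n}},$$
so that $E(G_n) \to 0$, $\mathrm{Var}(G_n) \to 1$. Then, for fixed $\theta$ in $\mathbb{R}$,
$$\varphi_{G_n}(\theta) = \exp(-i\theta\sqrt{\log n})\,\varphi_{N_n}(\theta/\sqrt{\log n}).$$
But, with $X_k := I_{E_k}$,
$$\varphi_{N_n}(t) = \prod_{k=1}^n \varphi_{X_k}(t) = \prod_{k=1}^n \left\{1 - \frac{1}{k} + \frac{1}{k}e^{it}\right\}.$$
We see that as $n \to \infty$, and with $t := \theta/\sqrt{\log n}$,
$$\log \varphi_{G_n}(\theta) = -i\theta\sqrt{\log n} + \sum_{k=1}^n \log\left\{1 + \frac{1}{k}(e^{it} - 1)\right\}$$
$$= -i\theta\sqrt{\log n} + \sum_{k=1}^n \left\{\frac{1}{k}\left(it - \tfrac{1}{2}t^2 + o(t^2)\right) + O\left(\frac{t^2}{k^2}\right)\right\}$$
$$= -i\theta\sqrt{\log n} + \left(it - \tfrac{1}{2}t^2 + o(t^2)\right)[\log n + O(1)] + t^2 O(1) \to -\tfrac{1}{2}\theta^2.$$
Hence $P(G_n \le x) \to \Phi(x)$, $x \in \mathbb{R}$. □

See … (1980) for some very general limit theorems.

18.6. Proof of Lemma 12.4

Lemma 12.4 gave us the 'only if' part of the Three-Series Theorem. Its statement was as follows.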
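LEMMA
Suppose that $(X_n)$ is a sequence of independent random variables bounded by a constant $K$ in $[0, \infty)$:
$$|X_n(\omega)| \le K, \quad \forall n, \forall\omega.$$
Then
$$\left(\sum X_n \text{ converges, a.s.}\right) \Rightarrow \left(\sum E(X_n) \text{ converges and } \sum \mathrm{Var}(X_n) < \infty\right).$$
The proof given in Section 12.4 was rather sophisticated.

Proof using characteristic functions. First, note that, as a consequence of estimate (18.3,a), if $Z$ is a RV such that for some constant $K_1$,
$$|Z| \le K_1, \qquad E(Z) = 0, \qquad \sigma^2 := \mathrm{Var}(Z) < \infty,$$
then for $|\theta| < K_1^{-1}$, we have
$$|\varphi_Z(\theta)| \le 1 - \tfrac{1}{2}\sigma^2\theta^2 + \tfrac{1}{6}|\theta|^3 K_1\sigma^2 \le 1 - \tfrac{1}{2}\sigma^2\theta^2 + \tfrac{1}{6}\sigma^2\theta^2 = 1 - \tfrac{1}{3}\sigma^2\theta^2 \le \exp(-\tfrac{1}{3}\sigma^2\theta^2).$$
Now take $Z_n := X_n - E(X_n)$. Then
$$E(Z_n) = 0, \qquad \mathrm{Var}(Z_n) = \mathrm{Var}(X_n), \qquad |\varphi_{Z_n}(\theta)| = |\exp\{-i\theta E(X_n)\}\varphi_{X_n}(\theta)| = |\varphi_{X_n}(\theta)|,$$
and $|Z_n| \le 2K$.

If $\sum \mathrm{Var}(X_n) = \infty$, then, for $0 < |\theta| < (2K)^{-1}$, we shall have
$$\prod_{k \le n} |\varphi_{X_k}(\theta)| = \prod_{k \le n} |\varphi_{Z_k}(\theta)| \le \exp\left\{-\tfrac{1}{3}\theta^2\sum_{k \le n}\mathrm{Var}(X_k)\right\} \to 0.$$
However, if $\sum X_k$ converges a.s. to $S$, then, by (DOM),
$$\prod_{k \le n}\varphi_{X_k}(\theta) = E\exp(i\theta S_n) \to \varphi_S(\theta),$$
and $\varphi_S(\theta)$ is continuous in $\theta$ with $\varphi_S(0) = 1$. We have a contradiction.

Hence $\sum \mathrm{Var}(X_n) = \sum \mathrm{Var}(Z_n) < \infty$, and, since $E(Z_n) = 0$, Theorem 12.2(a) shows that $\sum Z_n$ converges a.s. Hence
$$\sum E(X_n) = \sum\{X_n - Z_n\}$$
converges a.s., and since it is a deterministic sum, it converges! This last part of the argument was used in Section 12.4.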
APPENDICES
Chapter Al
Appendix
to Chapter
Al.l.
In the
example
non-measurable Banach
subset and
A of
5^.
of
spirit of
Tarski,
although,
Axiom
pre-dates
trivial
(a)
S'-\\jA,
are
disjoint
sets,
If the
rotation.
from
it
any of
is
intuitively
that result
oo X
length
(A),
an impossibility.
To constructthe
{e*^ : 0 G
w
R} inside
exist
z ^
ii
there
family (Aq : q G Q), proceed as follows. Regard S^ as C. Define an equivalence relation ~ on S^ by writing a and ^ in R such that
z = e*'\",
Use
e'^,
- ^
which
Q.
the
Axiom of
of
has
precisely
one
representative
each
equivalence
Aq
e'^A^
{e'^z:zeA}.
Then
could
the family
be replaced
{Aq
: g
Z
G Q)
has the
the to
by
throughout
argument.)
rigorous
remainder
conclusion.
The
rigorous.
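We now set out to prove the Uniqueness Lemma 1.6.

A1.2. d-systems

Let $S$ be a set, and let $\mathcal{D}$ be a collection of subsets of $S$. Then $\mathcal{D}$ is called a d-system (on $S$) if
(a) $S \in \mathcal{D}$,
(b) if $A, B \in \mathcal{D}$ and $A \subseteq B$ then $B \setminus A \in \mathcal{D}$,
(c) if $A_n \in \mathcal{D}$ and $A_n \uparrow A$, then $A \in \mathcal{D}$.
Recall that $A_n \uparrow A$ means: $A_n \subseteq A_{n+1}$ ($\forall n$) and $\bigcup A_n = A$.

(d) Proposition. A collection $\Sigma$ of subsets of $S$ is a $\sigma$-algebra if and only if $\Sigma$ is both a $\pi$-system and a d-system.

Proof. The 'only if' part is trivial, so we prove only the 'if' part.
Suppose that $\Sigma$ is both a $\pi$-system and a d-system, and that $E$, $F$ and $E_n$ ($n \in \mathbb{N}$) $\in \Sigma$. Then $E^c := S \setminus E \in \Sigma$, and
$$E \cup F = S \setminus (E^c \cap F^c) \in \Sigma.$$
Hence $G_n := E_1 \cup \ldots \cup E_n \in \Sigma$ and, since $G_n \uparrow \bigcup E_k$, we see that $\bigcup E_k \in \Sigma$. Finally,
$$\bigcap E_k = \left(\bigcup E_k^c\right)^c \in \Sigma.$$

Definition of $d(\mathcal{C})$. Suppose that $\mathcal{C}$ is a class of subsets of $S$. We define $d(\mathcal{C})$ to be the intersection of all d-systems which contain $\mathcal{C}$. Obviously, $d(\mathcal{C})$ is a d-system, the smallest d-system which contains $\mathcal{C}$. It is also obvious that
$$d(\mathcal{C}) \subseteq \sigma(\mathcal{C}).$$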
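On a finite set the definitions above can be checked mechanically, which gives a concrete feel for Proposition (d). This is a sketch under the assumption that $S$ is finite (so condition (c) on increasing limits is automatic); the helper names and the example families are illustrative, not from the text.

```python
def is_pi_system(family):
    """Closed under finite intersections."""
    return all((a & b) in family for a in family for b in family)


def is_d_system(S, family):
    """Conditions (a) and (b) of A1.2; on a finite set, an increasing
    sequence is eventually constant, so condition (c) holds automatically."""
    if S not in family:
        return False
    return all((b - a) in family for a in family for b in family if a <= b)


def is_sigma_algebra(S, family):
    """On a finite set: contains S, closed under complements and finite unions."""
    return (S in family
            and all((S - a) in family for a in family)
            and all((a | b) in family for a in family for b in family))


S = frozenset({1, 2, 3, 4})
empty = frozenset()
# A d-system that is not a pi-system: {1,2} and {1,3} meet in {1}, which is missing.
D = {empty, S, frozenset({1, 2}), frozenset({3, 4}), frozenset({1, 3}), frozenset({2, 4})}
# A family that is both a pi-system and a d-system, hence a sigma-algebra.
Sigma = {empty, S, frozenset({1, 2}), frozenset({3, 4})}

print(is_d_system(S, D), is_pi_system(D), is_sigma_algebra(S, D))
print(is_d_system(S, Sigma), is_pi_system(Sigma), is_sigma_algebra(S, Sigma))
```

The family `D` shows that the $\pi$-system hypothesis in the Proposition cannot be dropped.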
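A1.3. Dynkin's Lemma

If $\mathcal{I}$ is a $\pi$-system, then
$$d(\mathcal{I}) = \sigma(\mathcal{I}).$$
Thus any d-system which contains a $\pi$-system contains the $\sigma$-algebra generated by that $\pi$-system.

Proof. Because of Proposition A1.2(d), we need only prove that $d(\mathcal{I})$ is a $\pi$-system.

Step 1: Let $\mathcal{D}_1 := \{B \in d(\mathcal{I}) : B \cap C \in d(\mathcal{I}), \forall C \in \mathcal{I}\}$. Because $\mathcal{I}$ is a $\pi$-system, $\mathcal{D}_1 \supseteq \mathcal{I}$. It is easily checked that $\mathcal{D}_1$ inherits the d-system structure from $d(\mathcal{I})$. [For, clearly, $S \in \mathcal{D}_1$. Next, if $B_1, B_2 \in \mathcal{D}_1$ and $B_1 \subseteq B_2$, then, for $C$ in $\mathcal{I}$,
$$(B_2 \setminus B_1) \cap C = (B_2 \cap C) \setminus (B_1 \cap C);$$
and, since $B_2 \cap C \in d(\mathcal{I})$, $B_1 \cap C \in d(\mathcal{I})$ and $d(\mathcal{I})$ is a d-system, we see that $(B_2 \setminus B_1) \cap C \in d(\mathcal{I})$, so that $B_2 \setminus B_1 \in \mathcal{D}_1$. Finally, if $B_n \in \mathcal{D}_1$ ($n \in \mathbb{N}$) and $B_n \uparrow B$, then for $C \in \mathcal{I}$,
$$(B_n \cap C) \uparrow (B \cap C),$$
so that $B \cap C \in d(\mathcal{I})$ and $B \in \mathcal{D}_1$.] We have shown that $\mathcal{D}_1$ is a d-system which contains $\mathcal{I}$, so that (since $\mathcal{D}_1 \subseteq d(\mathcal{I})$ by its definition) $\mathcal{D}_1 = d(\mathcal{I})$.

Step 2: Let $\mathcal{D}_2 := \{A \in d(\mathcal{I}) : B \cap A \in d(\mathcal{I}), \forall B \in d(\mathcal{I})\}$. Step 1 showed that $\mathcal{D}_2$ contains $\mathcal{I}$. But, just as in Step 1, we can prove that $\mathcal{D}_2$ inherits the d-system structure from $d(\mathcal{I})$ and that therefore $\mathcal{D}_2 = d(\mathcal{I})$. But the fact that $\mathcal{D}_2 = d(\mathcal{I})$ says that $d(\mathcal{I})$ is a $\pi$-system. □

A1.4. Proof of Uniqueness Lemma 1.6

Recall what the crucial Lemma 1.6 states:
Let $S$ be a set. Let $\mathcal{I}$ be a $\pi$-system on $S$, and let $\Sigma := \sigma(\mathcal{I})$. Suppose that $\mu_1$ and $\mu_2$ are measures on $(S, \Sigma)$ such that $\mu_1(S) = \mu_2(S) < \infty$ and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then $\mu_1 = \mu_2$ on $\Sigma$.

Proof. Let
$$\mathcal{D} = \{F \in \Sigma : \mu_1(F) = \mu_2(F)\}.$$
Then $\mathcal{D}$ is a d-system on $S$. [Indeed, the fact that $S \in \mathcal{D}$ is given. If $A, B \in \mathcal{D}$ and $A \subseteq B$, then
$$(*)\qquad \mu_1(B \setminus A) = \mu_1(B) - \mu_1(A) = \mu_2(B) - \mu_2(A) = \mu_2(B \setminus A),$$
so that $B \setminus A \in \mathcal{D}$. Finally, if $F_n \in \mathcal{D}$ and $F_n \uparrow F$, then by Lemma 1.10(a),
$$\mu_1(F) = \uparrow\lim \mu_1(F_n) = \uparrow\lim \mu_2(F_n) = \mu_2(F),$$
so that $F \in \mathcal{D}$.]
Since $\mathcal{D}$ is a d-system and $\mathcal{D} \supseteq \mathcal{I}$ by hypothesis, Dynkin's Lemma shows that $\mathcal{D} \supseteq \sigma(\mathcal{I}) = \Sigma$, and the result follows. □

Notes. You should check that no circular argument is entailed by the use of Lemma 1.10 (this is obvious). The reason for the insistence that $\mu_1(S) = \mu_2(S) < \infty$ is that we do not wish to try to claim at (*) that
$$\infty - \infty = \infty - \infty.$$
Indeed the Lemma 1.6 is false if '$< \infty$' is omitted; see Section A1.10 below.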
We
now
aim
to prove
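A1.5. $\lambda$-sets: 'algebra' case

LEMMA
Let $\mathcal{G}_0$ be an algebra of subsets of $S$ and let
$$\lambda : \mathcal{G}_0 \to [0, \infty]$$
with $\lambda(\emptyset) = 0$. Call an element $L$ of $\mathcal{G}_0$ a $\lambda$-set if $L$ 'splits every element of $\mathcal{G}_0$ properly':
$$\lambda(L \cap G) + \lambda(L^c \cap G) = \lambda(G), \quad \forall G \in \mathcal{G}_0.$$
Then the class $\mathcal{L}_0$ of $\lambda$-sets is an algebra, and $\lambda$ is finitely additive on $\mathcal{L}_0$. Moreover, for disjoint $L_1, L_2, \ldots, L_n \in \mathcal{L}_0$ and $G \in \mathcal{G}_0$,
$$\lambda\left(\bigcup_{k=1}^n (L_k \cap G)\right) = \sum_{k=1}^n \lambda(L_k \cap G).$$

Proof. Step 1: Let $L_1$ and $L_2$ be $\lambda$-sets, and let $L = L_1 \cap L_2$. We wish to prove that $L$ is a $\lambda$-set.
Now $L^c \cap L_2 = L_2 \cap L_1^c$ and $L^c \cap L_2^c = L_2^c$. Hence, since $L_2$ is a $\lambda$-set, we have, for any $G \in \mathcal{G}_0$,
$$\lambda(L^c \cap G) = \lambda(L_2 \cap L_1^c \cap G) + \lambda(L_2^c \cap G)$$
and, of course,
$$\lambda(L_2^c \cap G) + \lambda(L_2 \cap G) = \lambda(G).$$
Since $L_1$ is a $\lambda$-set,
$$\lambda(L_2 \cap L_1^c \cap G) + \lambda(L \cap G) = \lambda(L_2 \cap G).$$
On adding the three equations just obtained, we see that
$$\lambda(L^c \cap G) + \lambda(L \cap G) = \lambda(G), \quad \forall G \in \mathcal{G}_0,$$
so that $L$ is indeed a $\lambda$-set.

Step 2: Since, trivially, $S$ is a $\lambda$-set, and the complement of a $\lambda$-set is a $\lambda$-set, it now follows that $\mathcal{L}_0$ is an algebra.

Step 3: If $L_1$ and $L_2$ are disjoint $\lambda$-sets and $G \in \mathcal{G}_0$, then
$$(L_1 \cup L_2) \cap L_1 = L_1, \qquad (L_1 \cup L_2) \cap L_1^c = L_2,$$
so, since $L_1$ is a $\lambda$-set,
$$\lambda((L_1 \cup L_2) \cap G) = \lambda(L_1 \cap G) + \lambda(L_2 \cap G).$$
The proof is now easily completed. □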
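A1.6. Outer measures

Let $\mathcal{G}$ be a $\sigma$-algebra of subsets of $S$. A map
$$\lambda : \mathcal{G} \to [0, \infty]$$
is called an outer measure on $(S, \mathcal{G})$ if
(a) $\lambda(\emptyset) = 0$;
(b) $\lambda$ is increasing: for $G_1, G_2 \in \mathcal{G}$ with $G_1 \subseteq G_2$, $\lambda(G_1) \le \lambda(G_2)$;
(c) $\lambda$ is countably subadditive: if $(G_k)$ is any sequence of elements of $\mathcal{G}$, then
$$\lambda\left(\bigcup_k G_k\right) \le \sum_k \lambda(G_k).$$

A1.7. Carathéodory's Lemma

Let $\lambda$ be an outer measure on the measurable space $(S, \mathcal{G})$. Then the $\lambda$-sets in $\mathcal{G}$ form a $\sigma$-algebra $\mathcal{L}$ on which $\lambda$ is countably additive, so that $(S, \mathcal{L}, \lambda)$ is a measure space.

Proof. Because of Lemma A1.5, we need only show that if $(L_k)$ is a disjoint sequence of sets in $\mathcal{L}$, then $L := \bigcup_k L_k \in \mathcal{L}$ and
$$\text{(a)} \qquad \lambda(L) = \sum_k \lambda(L_k).$$
By the subadditive property of $\lambda$, for $G \in \mathcal{G}$, we have
$$\text{(b)} \qquad \lambda(G) \le \lambda(L \cap G) + \lambda(L^c \cap G).$$
Now let $M_n := \bigcup_{k \le n} L_k$. Lemma A1.5 shows that $M_n \in \mathcal{L}$, so that
$$\lambda(G) = \lambda(M_n \cap G) + \lambda(M_n^c \cap G).$$
However, $M_n^c \supseteq L^c$, so that
$$\text{(c)} \qquad \lambda(G) \ge \lambda(M_n \cap G) + \lambda(L^c \cap G).$$
Lemma A1.5 now allows us to rewrite (c) as
$$\lambda(G) \ge \sum_{k \le n} \lambda(L_k \cap G) + \lambda(L^c \cap G),$$
so that
$$\text{(d)} \qquad \lambda(G) \ge \sum_k \lambda(L_k \cap G) + \lambda(L^c \cap G) \ge \lambda(L \cap G) + \lambda(L^c \cap G),$$
using the countably subadditive property of $\lambda$ in the last step. On comparing (d) with (b), we see that equality must hold throughout (d) and (b), so that $L \in \mathcal{L}$; and then on taking $G = L$ we obtain result (a). □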
Chapter
Al:
Appendix
to Chapter
(A1.8)..
that Let
we need S
to prove
the
an
following.
be a
set, let
So be
algebra
on Sj
and let
S:=<t(So).
If
fiQ
is
a countahly
additive
map
that
fiQ
: So
~> [0,
exists
fi =
fiQ
on
So.
Proof.
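Step 1: Let $\mathcal{G}$ be the $\sigma$-algebra of all subsets of $S$. For $G \in \mathcal{G}$, define
$$\lambda(G) := \inf \sum_n \mu_0(F_n),$$
where the infimum is taken over all sequences $(F_n)$ in $\Sigma_0$ with $G \subseteq \bigcup_n F_n$. We now prove that
(a) $\lambda$ is an outer measure on $(S, \mathcal{G})$.
The facts that $\lambda(\emptyset) = 0$ and $\lambda$ is increasing are obvious. Suppose that $(G_n)$ is a sequence in $\mathcal{G}$, such that each $\lambda(G_n)$ is finite. Let $\varepsilon > 0$ be given. For each $n$, choose a sequence $(F_{n,k} : k \in \mathbb{N})$ of elements of $\Sigma_0$ such that
$$G_n \subseteq \bigcup_k F_{n,k}, \qquad \sum_k \mu_0(F_{n,k}) < \lambda(G_n) + \varepsilon 2^{-n}.$$
Then $G := \bigcup_n G_n \subseteq \bigcup_n \bigcup_k F_{n,k}$, so that
$$\lambda(G) \le \sum_n \sum_k \mu_0(F_{n,k}) < \sum_n \lambda(G_n) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, we have proved result (a).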
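Step 2: By Carathéodory's Lemma A1.7, $\lambda$ is a measure on $(S, \mathcal{L})$, where $\mathcal{L}$ is the $\sigma$-algebra of $\lambda$-sets in $\mathcal{G}$. All we need show is that
(b) $\Sigma_0 \subseteq \mathcal{L}$, and $\lambda = \mu_0$ on $\Sigma_0$;
for then $\Sigma := \sigma(\Sigma_0) \subseteq \mathcal{L}$ and we can restrict $\lambda$ to $(S, \Sigma)$.

Step 3: Proof that $\lambda = \mu_0$ on $\Sigma_0$.
Let $F \in \Sigma_0$. Then, clearly, $\lambda(F) \le \mu_0(F)$. Now suppose that $F \subseteq \bigcup_n F_n$, where $F_n \in \Sigma_0$. As usual, we can define a sequence $(E_n)$ of disjoint sets:
$$E_1 := F_1, \qquad E_n := F_n \cap \Big(\bigcup_{k<n} F_k\Big)^c,$$
such that $E_n \subseteq F_n$ and $\bigcup E_n = \bigcup F_n \supseteq F$. Then
$$\mu_0(F) = \mu_0\Big(\bigcup_n (F \cap E_n)\Big) = \sum_n \mu_0(F \cap E_n),$$
using the countable additivity of $\mu_0$ on $\Sigma_0$. Hence
$$\mu_0(F) \le \sum_n \mu_0(E_n) \le \sum_n \mu_0(F_n),$$
so that $\lambda(F) \ge \mu_0(F)$. Step 3 is complete.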
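Step 4: Proof that $\Sigma_0 \subseteq \mathcal{L}$.
Let $E \in \Sigma_0$ and $G \in \mathcal{G}$, and let $\varepsilon > 0$. Then there exists a sequence $(F_n)$ in $\Sigma_0$ such that $G \subseteq \bigcup_n F_n$ and
$$\sum_n \mu_0(F_n) \le \lambda(G) + \varepsilon.$$
Now, by definition of $\lambda$,
$$\sum_n \mu_0(F_n) = \sum_n \mu_0(E \cap F_n) + \sum_n \mu_0(E^c \cap F_n) \ge \lambda(E \cap G) + \lambda(E^c \cap G),$$
since $E \cap G \subseteq \bigcup (E \cap F_n)$ and $E^c \cap G \subseteq \bigcup (E^c \cap F_n)$. Thus, since $\varepsilon$ is arbitrary,
$$\lambda(G) \ge \lambda(E \cap G) + \lambda(E^c \cap G).$$
However, since $\lambda$ is subadditive,
$$\lambda(G) \le \lambda(E \cap G) + \lambda(E^c \cap G).$$
We see that $E$ is indeed a $\lambda$-set. □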
A1.9. Proof of the existence of Lebesgue measure on $((0,1], \mathcal{B}(0,1])$

Recall the set-up in Section 1.8. Let $S = (0,1]$. For $F \subseteq S$, say that $F \in \Sigma_0$ if $F$ may be written as a finite union
(*) $F = (a_1, b_1] \cup \ldots \cup (a_r, b_r]$
where $r \in \mathbb{N}$, $0 \le a_1 \le b_1 \le \ldots \le a_r \le b_r \le 1$. Then (as you should convince yourself) $\Sigma_0$ is an algebra on $(0,1]$ and
$$\Sigma := \sigma(\Sigma_0) = \mathcal{B}(0,1].$$
(We write $\mathcal{B}(0,1]$ instead of $\mathcal{B}((0,1])$.) For $F$ as at (*), let
$$\mu_0(F) = \sum_{k \le r} (b_k - a_k).$$
Of course, a set $F$ may have different expressions as a finite disjoint union of the form (*): for example,
$$(0,1] = (0, \tfrac12] \cup (\tfrac12, 1].$$
However, it is easily seen that $\mu_0$ is well defined on $\Sigma_0$ and that $\mu_0$ is finitely additive on $\Sigma_0$. While this is obvious from a picture, you might (or might not) wish to consider how to make the intuitive argument into a formal proof.
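To make the claim that $\mu_0$ is well defined a little more concrete, here is a small sketch with exact rational arithmetic. The function names `normalize` and `mu0` are hypothetical helpers, not notation from the text, and each half-open interval $(a, b]$ is represented by the pair `(a, b)`.

```python
from fractions import Fraction


def normalize(intervals):
    """Rewrite a finite union of half-open intervals (a, b] as a sorted
    list of disjoint, non-adjacent pieces."""
    pieces = sorted((a, b) for a, b in intervals if a < b)
    merged = []
    for a, b in pieces:
        if merged and a <= merged[-1][1]:
            # (c, d] and (a, b] with a <= d overlap or abut: merge them
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def mu0(intervals):
    """Total length of the set, computed from its normal form, so the
    answer does not depend on how the set was written down."""
    return sum((b - a for a, b in normalize(intervals)), Fraction(0))


half = Fraction(1, 2)
whole = [(Fraction(0), Fraction(1))]
split = [(Fraction(0), half), (half, Fraction(1))]
```

Both `whole` and `split` describe the set $(0,1]$, and `mu0` assigns each of them the value 1.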
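The key thing is to prove that $\mu_0$ is countably additive on $\Sigma_0$. So suppose that $(F_n)$ is a sequence of disjoint elements of $\Sigma_0$, with union $F$ in $\Sigma_0$. We know that if $G_n = \bigcup_{k=1}^n F_k$, then
$$\mu_0(G_n) = \sum_{k=1}^n \mu_0(F_k)$$
and $G_n \uparrow F$. To prove that $\mu_0$ is countably additive it is enough to show that $\mu_0(G_n) \uparrow \mu_0(F)$, for then
$$\mu_0(F) = \uparrow\lim \mu_0(G_n) = \uparrow\lim \sum_{k \le n} \mu_0(F_k) = \sum_k \mu_0(F_k).$$
Let $H_n = F \setminus G_n$. Then $H_n \in \Sigma_0$ and $H_n \downarrow \emptyset$. We need only prove that
$$\mu_0(H_n) \downarrow 0;$$
for then $\mu_0(G_n) = \mu_0(F) - \mu_0(H_n) \uparrow \mu_0(F)$.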
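It is clear that an alternative (and final!) rewording of what we need to show is the following:
(a) if $(H_n)$ is a decreasing sequence of elements of $\Sigma_0$ such that for some $\varepsilon > 0$,
$$\mu_0(H_n) \ge 2\varepsilon, \quad \forall n,$$
then $\bigcap_k H_k \ne \emptyset$.

Proof of (a). It is obvious from the definition of $\Sigma_0$ that, for each $k \in \mathbb{N}$, we can choose $J_k \in \Sigma_0$ such that, with $\bar J_k$ denoting the closure of $J_k$,
$$\bar J_k \subseteq H_k \quad\text{and}\quad \mu_0(H_k \setminus J_k) \le \varepsilon 2^{-k}.$$
But, since $H_n \subseteq H_k$ for $k \le n$,
$$\mu_0\Big(H_n \setminus \bigcap_{k \le n} J_k\Big) \le \mu_0\Big(\bigcup_{k \le n} (H_k \setminus J_k)\Big) \le \sum_{k \le n} \varepsilon 2^{-k} < \varepsilon.$$
Hence, since $\mu_0(H_n) \ge 2\varepsilon$, $\forall n$, we see that for every $n$,
$$\mu_0\Big(\bigcap_{k \le n} J_k\Big) > \varepsilon,$$
and hence $\bigcap_{k \le n} J_k$ is non-empty. A fortiori then, for every $n$, $K_n := \bigcap_{k \le n} \bar J_k$ is non-empty.

That
(b) $\bigcap_k \bar J_k \ne \emptyset$ (whence $\bigcap H_k \ne \emptyset$)
now follows from the compactness of $[0,1]$: for if (b) is false, then $((\bar J_k)^c : k \in \mathbb{N})$ gives a covering of $[0,1]$ by open sets with no finite subcovering.

Alternatively, we can argue directly as follows. For each $n$, choose a point $x_n$ in the non-empty set $K_n$. Since each $x_n$ belongs to the compact set $\bar J_1$, we can find a subsequence $(n_q)$ and a point $x$ of $\bar J_1$ such that $x_{n_q} \to x$. However, for each $k$, $x_{n_q} \in \bar J_k$ for all but finitely many $q$, and since $\bar J_k$ is compact, it follows that $x \in \bar J_k$. Hence $x \in \bigcap_k \bar J_k$, and property (b) holds. □

Since $\mu_0$ is countably additive on $\Sigma_0$, it follows that $\mu_0$ has a unique extension to a measure on $((0,1], \mathcal{B}(0,1])$. This is Lebesgue measure. The corresponding $\lambda$-sets of A1.8 form a $\sigma$-algebra strictly larger than $\mathcal{B}(0,1]$, namely the $\sigma$-algebra of Lebesgue measurable subsets of $(0,1]$. See Section A1.11.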
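A1.10. Example of non-uniqueness of extension

With $(S, \Sigma_0, \Sigma)$ as in Section A1.9, suppose that for $F \in \Sigma_0$,
(a) $\nu_0(F) = 0$ if $F = \emptyset$, and $\nu_0(F) = \infty$ if $F \ne \emptyset$.
The Carathéodory extension of $\nu_0$ will be obtained as the obvious extension of (a) to $\Sigma$. However, another extension $\tilde\nu$ is given by
$$\tilde\nu(F) = \text{number of elements in } F.$$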
A1.11. Completion of a measure space

In fact (apart from an 'aside' on the Riemann integral), we do not need completions in this book.

Suppose that $(S, \Sigma, \mu)$ is a measure space. Define a class $\mathcal{N}$ of subsets of $S$ as follows:
$$N \in \mathcal{N} \text{ if and only if } \exists Z \in \Sigma \text{ such that } N \subseteq Z \text{ and } \mu(Z) = 0.$$
It is sometimes philosophically satisfying to be able to show that any $N$ in $\mathcal{N}$ is $\mu$-measurable and $\mu(N) = 0$. This is done as follows. For any subset $F$ of $S$, write $F \in \Sigma^*$ if $\exists E, G \in \Sigma$ such that $E \subseteq F \subseteq G$ and $\mu(G \setminus E) = 0$. It is very easy to show that $\Sigma^*$ is a $\sigma$-algebra on $S$ and indeed that $\Sigma^* = \sigma(\Sigma, \mathcal{N})$. With obvious notation we define, for $F \in \Sigma^*$,
$$\mu^*(F) = \mu(E) = \mu(G),$$
it being easy to prove that $\mu^*$ is well defined. Moreover, it is no problem to prove that $(S, \Sigma^*, \mu^*)$ is a measure space, the completion of $(S, \Sigma, \mu)$.
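For parts of advanced probability, it is essential to complete the basic probability triple $(\Omega, \mathcal{F}, \mathrm{P})$. In other parts of probability, when (for example) $S$ is topological and we wish to consider several different measures on $(S, \Sigma)$, it is meaningless to insist on completion.

If we begin with $([0,1], \mathcal{B}[0,1], \mathrm{Leb})$, then $\mathcal{B}[0,1]^*$ is the $\sigma$-algebra of what are called Lebesgue-measurable sets of $[0,1]$. Then, for example, a function $f : [0,1] \to [0,1]$ is Lebesgue-measurable if the inverse image of every Borel set is Lebesgue-measurable: it need not be true that the inverse image of a Lebesgue-measurable set is Lebesgue-measurable.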
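A1.12. The Baire category theorem

In Section 1.11, we studied a subset $H$ of $S := [0,1]$ such that
(i) $H = \bigcap_k G_k$ for a sequence $(G_k)$ of open subsets of $S$,
(ii) $H \supseteq V$, where $V = \mathbb{Q} \cap S$.
If $H$ were countable: $H = \{h_r : r \in \mathbb{N}\}$, then we would have
(a) $S = H \cup H^c = \big(\bigcup_r \{h_r\}\big) \cup \big(\bigcup_k G_k^c\big)$,
expressing $S$ as a countable union
(b) $S = \bigcup_n F_n$
of closed sets where no $F_n$ contains an open interval. [Since $G_k^c \subseteq V^c$ for every $k$, so that $G_k^c$ contains only irrational points in $S$.] However, the Baire category theorem states that if a complete metric space $S$ may be written as a union of a countable sequence of closed sets:
$$S = \bigcup_n F_n,$$
then some $F_n$ contains an open ball. Thus the set $H$ must be uncountable. The Baire category theorem has fundamental applications in functional analysis, and some striking applications to probability too!

Proof of the Baire category theorem. Assume for the purposes of contradiction that no $F_n$ contains an open ball. Since $F_1^c$ is a non-empty open subset of $S$, we can find $x_1$ in $S$ and $\varepsilon_1 > 0$ such that
$$B(x_1, \varepsilon_1) \subseteq F_1^c,$$
$B(x_1, \varepsilon_1)$ denoting the open ball of radius $\varepsilon_1$ centred at $x_1$. Now $F_2$ contains no open ball, so that the open set
$$U_2 := B(x_1, 2^{-1}\varepsilon_1) \cap F_2^c$$
is non-empty, and we can find $x_2$ in $U_2$ and $\varepsilon_2 > 0$ such that
$$B(x_2, \varepsilon_2) \subseteq U_2, \qquad \varepsilon_2 < 2^{-1}\varepsilon_1.$$
Inductively, choose a sequence $(x_n)$ in $S$ and a sequence $(\varepsilon_n)$ in $(0,\infty)$ so that we have $\varepsilon_{n+1} < 2^{-1}\varepsilon_n$ and
$$B(x_{n+1}, \varepsilon_{n+1}) \subseteq U_{n+1} := B(x_n, 2^{-1}\varepsilon_n) \cap F_{n+1}^c.$$
Since $d(x_n, x_{n+1}) < 2^{-1}\varepsilon_n$, it is obvious that $(x_n)$ is Cauchy, so that $x := \lim x_n$ exists, and that
$$x \in \bigcap_n B(x_n, \varepsilon_n) \subseteq \bigcap_n F_n^c,$$
contradicting $\bigcup F_n = S$. □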
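Chapter A3
Appendix to Chapter 3

A3.1. Proof of the Monotone-Class Theorem 3.14

Recall the statement of the theorem.

► Let $\mathcal{H}$ be a class of bounded functions from a set $S$ into $\mathbb{R}$ satisfying the following conditions:
(i) $\mathcal{H}$ is a vector space over $\mathbb{R}$;
(ii) the constant function 1 is an element of $\mathcal{H}$;
(iii) if $(f_n)$ is a sequence of non-negative functions in $\mathcal{H}$ such that $f_n \uparrow f$ where $f$ is a bounded function on $S$, then $f \in \mathcal{H}$.
Then if $\mathcal{H}$ contains the indicator function of every set in some $\pi$-system $\mathcal{I}$, then $\mathcal{H}$ contains every bounded $\sigma(\mathcal{I})$-measurable function on $S$.

Proof. Let $\mathcal{D}$ be the class of sets $F$ in $S$ such that $I_F \in \mathcal{H}$. It is immediate from (i)-(iii) that $\mathcal{D}$ is a d-system. Since $\mathcal{D}$ contains the $\pi$-system $\mathcal{I}$, $\mathcal{D}$ contains $\sigma(\mathcal{I})$.

Suppose that $f$ is a $\sigma(\mathcal{I})$-measurable function such that for some $K$ in $\mathbb{N}$,
$$0 \le f(s) \le K, \quad \forall s \in S.$$
For $n \in \mathbb{N}$, define
$$f_n(s) = \sum_{i=0}^{K 2^n} i 2^{-n} I_{A(n,i)}(s),$$
where
$$A(n,i) := \{s : i 2^{-n} \le f(s) < (i+1) 2^{-n}\}.$$
Since $f$ is $\sigma(\mathcal{I})$-measurable, every $A(n,i) \in \sigma(\mathcal{I})$, so that $I_{A(n,i)} \in \mathcal{H}$. Since $\mathcal{H}$ is a vector space, every $f_n \in \mathcal{H}$. But $0 \le f_n \uparrow f$, so that $f \in \mathcal{H}$.

If $f \in \mathrm{b}\sigma(\mathcal{I})$, we may write $f = f^+ - f^-$, where $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$. Then $f^+, f^- \in \mathrm{b}\sigma(\mathcal{I})$ and $f^+, f^- \ge 0$, so that $f^+, f^- \in \mathcal{H}$ by what we established above. □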
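The dyadic approximation used in this proof is easy to experiment with. On the set $A(n,i)$ the function $f_n$ takes the value $i 2^{-n}$, which is the same as rounding $f$ down to a multiple of $2^{-n}$. The sketch below evaluates that rounding at a single point; the name `dyadic_approx` is a hypothetical helper, and exact fractions are used so that the monotonicity can be checked without rounding error.

```python
import math
from fractions import Fraction


def dyadic_approx(value, n):
    """Value of f_n at a point s where f(s) = value (assumed >= 0):
    the largest multiple of 2**-n that does not exceed value."""
    scale = 2 ** n
    return Fraction(math.floor(value * scale), scale)


# Example: the approximations to f(s) = 5/7 for n = 0, 1, ..., 10.
value = Fraction(5, 7)
approximations = [dyadic_approx(value, n) for n in range(11)]
```

The list `approximations` is non-decreasing, never exceeds `value`, and its $n$-th entry is within $2^{-n}$ of `value`, which is the pointwise content of $0 \le f_n \uparrow f$.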
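A3.2. Discussion of generated $\sigma$-algebras

This is one of those situations in which it is actually easier to understand things in a more abstract setting. So, suppose that
$\Omega$ and $S$ are sets, and that $Y : \Omega \to S$;
$\Sigma$ is a $\sigma$-algebra on $S$;
$X : \Omega \to \mathbb{R}$.
Because $Y^{-1}$ preserves all set operations, the class $\{Y^{-1}B : B \in \Sigma\}$ is a $\sigma$-algebra on $\Omega$, and because it is tautologically the smallest $\sigma$-algebra on $\Omega$ with respect to which $Y$ is measurable, we call it $\sigma(Y)$:
$$\sigma(Y) = Y^{-1}\Sigma.$$

LEMMA
(a) $X$ is $\sigma(Y)$-measurable if and only if $X = f(Y)$ where $f$ is a $\Sigma$-measurable function from $S$ to $\mathbb{R}$.

Note. The 'if' part is just the Composition Lemma.

Proof of 'only if' part. It is enough to prove that
(b) $X \in \mathrm{b}\sigma(Y)$ if and only if $\exists f \in \mathrm{b}\Sigma$ such that $X = f(Y)$.
Though we certainly do not need the Monotone-Class Theorem to prove (b), we may as well use it. So define $\mathcal{H}$ to be the class of all bounded functions $X$ on $\Omega$ such that $X = f(Y)$ for some $f \in \mathrm{b}\Sigma$. Taking $\mathcal{I} = \sigma(Y)$, note that if $F \in \mathcal{I}$ then $F = Y^{-1}B$ for some $B$ in $\Sigma$, so that
$$I_F(\omega) = I_B(Y(\omega)),$$
so that $I_F \in \mathcal{H}$. That $\mathcal{H}$ is a vector space containing constants is obvious.

Finally, suppose that $(X_n)$ is a sequence of elements of $\mathcal{H}$ such that, for some positive real constant $K$,
$$0 \le X_n \uparrow X \le K.$$
For each $n$, $X_n = f_n(Y)$ for some $f_n$ in $\mathrm{b}\Sigma$. Define $f := \limsup f_n$, so that $f \in \mathrm{b}\Sigma$. Then $X = f(Y)$. □

One has to be very careful about what Lemma (a) means in practice. To be sure, result (3.13,b) is the special case when $(S, \Sigma) = (\mathbb{R}, \mathcal{B})$.

Discussion of (3.13,c). Suppose that $Y_k : \Omega \to \mathbb{R}$ for $1 \le k \le n$. We may define a map $Y : \Omega \to \mathbb{R}^n$ via
$$Y(\omega) := (Y_1(\omega), \ldots, Y_n(\omega)) \in \mathbb{R}^n.$$
The problem mentioned at (3.13,d) and in the Warning following it shows up here because, before we can apply Lemma (a) to prove (3.13,c), we need to prove that
$$\sigma(Y_1, \ldots, Y_n) := \sigma(Y_k^{-1}\mathcal{B}(\mathbb{R}) : 1 \le k \le n) = Y^{-1}\mathcal{B}(\mathbb{R}^n) =: \sigma(Y).$$
[This amounts to proving that the product $\sigma$-algebra $\prod_{1 \le k \le n} \mathcal{B}(\mathbb{R})$ is the same as $\mathcal{B}(\mathbb{R}^n)$. See Section 8.5.] Now $Y_k = \gamma_k \circ Y$, where $\gamma_k$ is the '$k$th coordinate' map on $\mathbb{R}^n$ (continuous, hence Borel), so that $Y_k$ is $\sigma(Y)$-measurable. On the other hand, every open subset of $\mathbb{R}^n$ is a countable union of open rectangles $G_1 \times \cdots \times G_n$ where each $G_k$ is a subinterval of $\mathbb{R}$, and since
$$\{Y \in G_1 \times \cdots \times G_n\} = \bigcap_k \{Y_k \in G_k\} \in \sigma(Y_1, \ldots, Y_n),$$
things do work out.

You can see why we are in an appendix, and why we skip discussion of (3.13,d).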
Chapter A4
Appendix to Chapter 4

This appendix gives the statement of Strassen's Law of the Iterated Logarithm. Section A4.3 treats the completely different topic of constructing a rigorous model for a Markov chain.

A4.1. Kolmogorov's Law of the Iterated Logarithm

THEOREM
Let $X_1, X_2, \ldots$ be IID RVs each with mean 0 and variance 1. Let $S_n := X_1 + X_2 + \cdots + X_n$. Then, almost surely,
$$\limsup \frac{S_n}{\sqrt{2n \log\log n}} = +1, \qquad \liminf \frac{S_n}{\sqrt{2n \log\log n}} = -1.$$

This result already gives very precise information on the big values of partial sums. See Section 14.7 for proof in the case when the $X$'s are normally distributed.
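A4.2. Strassen's Law of the Iterated Logarithm

Strassen's Law is a staggering extension of Kolmogorov's result.

Let $(X_n)$ and $(S_n)$ be as in the previous section. For each $\omega$, let the map $t \mapsto S_t(\omega)$ on $[0, \infty)$ be the linear interpolation of the map $n \mapsto S_n(\omega)$ on $\mathbb{Z}^+$, so that
$$S_t(\omega) := (t - n) S_{n+1}(\omega) + (n + 1 - t) S_n(\omega), \quad t \in [n, n+1).$$
With Kolmogorov's result in mind, define
$$Z_n(t, \omega) := \frac{S_{nt}(\omega)}{\sqrt{2n \log\log n}}, \quad t \in [0,1],$$
so that $t \mapsto Z_n(t, \omega)$ on $[0,1]$ is a rescaled version of the random walk $S$ run up to time $n$. Say that a function $t \mapsto f(t, \omega)$ is in the set $K(\omega)$ of limiting shapes of the path associated with $\omega$ if there is a sequence $n_1(\omega), n_2(\omega), \ldots$ in $\mathbb{N}$ such that
$$Z_{n_k}(t, \omega) \to f(t, \omega) \text{ uniformly in } t \in [0,1].$$
Now let $K$ consist of those functions $f$ in $C[0,1]$ which can be written in the Lebesgue-integral form
$$f(t) = \int_0^t h(s)\,ds \quad\text{where}\quad \int_0^1 h(s)^2\,ds \le 1.$$

Strassen's Theorem
$$\mathrm{P}[K(\omega) = K] = 1.$$

Thus, (almost) all paths have the same limiting shapes. Khinchine's law follows from Strassen's precisely because (Exercise!)
$$\sup\{f(1) : f \in K\} = 1, \qquad \inf\{f(1) : f \in K\} = -1.$$
However, the only element of $K$ for which $f(1) = 1$ is the function $f(t) = t$, so the big values of $S$ occur when the whole path (when rescaled) looks like the function $t$. Almost every path will, in its rescaling, look infinitely often like the function $t$ and infinitely often like the function $-t$.

References. For a highly-motivated classical proof of Strassen's Law, see Freedman (1971). For a proof for Brownian motion based on the powerful theory of large deviations, see Stroock (1984).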
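A4.3. A model for a Markov chain

Let $E$ be a finite or countable set; let $\mu$ be a probability measure on $E$, and let $P = (p_{ij})$ denote a stochastic $E \times E$ matrix.

Complicating the notation somewhat for reasons we shall discover later, we wish to construct a probability triple $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathrm{P}}^\mu)$ carrying an $E$-valued stochastic process $(\tilde Z_n : n \in \mathbb{Z}^+)$ such that for $n \in \mathbb{Z}^+$ and $i_0, i_1, \ldots, i_n \in E$, we have
$$\tilde{\mathrm{P}}^\mu(\tilde Z_0 = i_0; \ldots; \tilde Z_n = i_n) = \mu_{i_0} p_{i_0 i_1} \cdots p_{i_{n-1} i_n}.$$
The trick is to make $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathrm{P}}^\mu)$ carry independent $E$-valued variables
$$(\tilde Z_0; \tilde Y(i, n) : i \in E; n \in \mathbb{N}),$$
$\tilde Z_0$ having law $\mu$ and such that
$$\tilde{\mathrm{P}}(\tilde Y(i, n) = j) = p(i, j), \quad (i, j \in E).$$
We can obviously do this via the construction in Section 4.6. For $\omega \in \tilde\Omega$ and $n \in \mathbb{N}$, define
$$\tilde Z_n(\omega) := \tilde Y(\tilde Z_{n-1}(\omega), n).$$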
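The recursion above translates almost word for word into code. The following is a minimal sketch, assuming a finite state space and using Python's pseudo-random generator in place of the independent variables of Section 4.6; the helper names `sample_from` and `build_chain`, and the dictionary representation of $\mu$ and $P$, are hypothetical choices made for illustration.

```python
import random


def sample_from(row, rng):
    """Draw a state j with probability row[j] (row maps state -> prob)."""
    u = rng.random()
    acc = 0.0
    for state, p in row.items():
        acc += p
        if u < acc:
            return state
    return state  # guard against floating-point rounding at the end


def build_chain(mu, P, n_steps, rng):
    """Return [Z_0, ..., Z_n] with Z_n = Y(Z_{n-1}, n), where for each n
    the variables Y(i, n), i in E, are drawn independently from row i."""
    states = list(P)
    z = [sample_from(mu, rng)]
    for _ in range(n_steps):
        # one independent Y(i, n) for EVERY state i, as in the text
        y = {i: sample_from(P[i], rng) for i in states}
        z.append(y[z[-1]])
    return z


# A deterministic cyclic matrix makes the output predictable.
cycle = {0: {1: 1.0}, 1: {2: 1.0}, 2: {0: 1.0}}
path = build_chain({0: 1.0}, cycle, 6, random.Random(0))
```

Drawing a fresh $Y(i, n)$ for every state, and not only for the current one, is wasteful in practice but mirrors the construction: the whole family is independent, and the chain simply reads off the entry indexed by its current position.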
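Chapter A5
Appendix to Chapter 5

Our task is to prove the Monotone-Convergence Theorem. We need an elementary preliminary result.

A5.1. Doubly monotone arrays

Proposition. Let $(y_n^{(r)} : r \in \mathbb{N}, n \in \mathbb{N})$ be an array of numbers in $[0, \infty]$ which is doubly monotone:
for fixed $r$, $y_n^{(r)} \uparrow$ as $n \uparrow$, so that $y^{(r)} := \uparrow\lim_n y_n^{(r)}$ exists;
for fixed $n$, $y_n^{(r)} \uparrow$ as $r \uparrow$, so that $y_n := \uparrow\lim_r y_n^{(r)}$ exists.
Then
$$y^{(\infty)} := \uparrow\lim_r y^{(r)} = \uparrow\lim_n y_n =: y_\infty.$$

Proof. The result is almost trivial. By replacing each $y_n^{(r)}$ by $\arctan y_n^{(r)}$, we can assume that the $y_n^{(r)}$ are uniformly bounded. Let $\varepsilon > 0$ be given. Choose $n_0$ such that $y_{n_0} > y_\infty - \tfrac12\varepsilon$. Then choose $r_0$ such that $y_{n_0}^{(r_0)} > y_{n_0} - \tfrac12\varepsilon$. Then
$$y^{(\infty)} \ge y^{(r_0)} \ge y_{n_0}^{(r_0)} > y_\infty - \varepsilon,$$
so that $y^{(\infty)} \ge y_\infty$. Similarly, $y_\infty \ge y^{(\infty)}$. □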
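It may help to see why monotonicity in both indices matters. The following worked example is an illustration only; the array $y_n^{(r)}$ chosen here is a hypothetical one, not taken from the text. It is increasing in $n$ but decreasing in $r$, and the two iterated limits disagree.

```latex
% Hypothetical array: increasing in n for fixed r, but DEcreasing in r for fixed n.
\[
  y_n^{(r)} :=
  \begin{cases}
    1 & \text{if } n > r,\\
    0 & \text{if } n \le r.
  \end{cases}
\]
% For fixed r, y_n^{(r)} = 1 for all large n, so the limit over n is 1.
\[
  \lim_{r}\,\lim_{n}\, y_n^{(r)} = \lim_{r} 1 = 1 .
\]
% For fixed n, y_n^{(r)} = 0 for all r >= n, so the limit over r is 0.
\[
  \lim_{n}\,\lim_{r}\, y_n^{(r)} = \lim_{n} 0 = 0 .
\]
```

So without the second monotonicity assumption the order of the limits cannot be interchanged in general.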
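A5.2. The key use of Lemma 1.10(a)

This is where the fundamental monotonicity property of measures is used. Please re-read Section 5.1 at this stage.

LEMMA
(a) Suppose that $A \in \Sigma$ and that $h_n \in SF^+$ and $h_n \uparrow I_A$. Then $\mu_0(h_n) \uparrow \mu(A)$.

Proof. From (5.1,e), $\mu_0(h_n) \le \mu(A)$, so we need only prove that
$$\liminf \mu_0(h_n) \ge \mu(A).$$
Let $\varepsilon > 0$, and define $A_n := \{s \in A : h_n(s) > 1 - \varepsilon\}$. Then $A_n \uparrow A$ so that, by Lemma 1.10(a), $\mu(A_n) \uparrow \mu(A)$. But
$$(1 - \varepsilon) I_{A_n} \le h_n,$$
so that, by (5.1,e), $(1 - \varepsilon)\mu(A_n) \le \mu_0(h_n)$. Hence
$$\liminf \mu_0(h_n) \ge (1 - \varepsilon)\mu(A).$$
Since this is true for every $\varepsilon > 0$, the result follows. □

LEMMA
(b) Suppose that $f \in SF^+$ and that $g_n \in SF^+$ and $g_n \uparrow f$. Then $\mu_0(g_n) \uparrow \mu_0(f)$.

Proof. We can write $f$ as a finite sum $f = \sum a_k I_{A_k}$ where the sets $A_k$ are disjoint and each $a_k > 0$. Then
$$a_k^{-1} I_{A_k} g_n \uparrow I_{A_k} \quad (n \uparrow \infty),$$
and the result follows from Lemma (a). □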
A5.3. 'Uniqueness of integral'

LEMMA
(a) Suppose that $f \in (\mathrm{m}\Sigma)^+$ and that we have two sequences $(f^{(r)})$ and $(f_n)$ of elements of $SF^+$ such that
$$f^{(r)} \uparrow f, \qquad f_n \uparrow f.$$
Then
$$\uparrow\lim \mu_0(f^{(r)}) = \uparrow\lim \mu_0(f_n).$$

Proof. Let $f_n^{(r)} := f^{(r)} \wedge f_n$. Then as $r \uparrow \infty$, $f_n^{(r)} \uparrow f_n$, and as $n \uparrow \infty$, $f_n^{(r)} \uparrow f^{(r)}$. Hence, by Lemma A5.2(b),
$$\mu_0(f_n^{(r)}) \uparrow \mu_0(f_n) \text{ as } r \uparrow \infty, \qquad \mu_0(f_n^{(r)}) \uparrow \mu_0(f^{(r)}) \text{ as } n \uparrow \infty.$$
The result now follows from Proposition A5.1. □

Recall from Section 5.2 that for $f \in (\mathrm{m}\Sigma)^+$, we define
$$\mu(f) := \sup\{\mu_0(h) : h \in SF^+; h \le f\} \le \infty.$$
By definition of $\mu(f)$, we may choose a sequence $(h_n)$ in $SF^+$ such that $h_n \le f$ and $\mu_0(h_n) \uparrow \mu(f)$. Let us also choose a sequence $(g_n)$ of elements of $SF^+$ such that $g_n \uparrow f$. (We can do this via the 'staircase function' in Section 5.3.) Now let
$$f_n := \max(g_n, h_1, h_2, \ldots, h_n).$$
Then $f_n \in SF^+$, $f_n \le f$, and since $f_n \ge g_n$, $f_n \uparrow f$. Since $f_n \le f$, $\mu_0(f_n) \le \mu(f)$, and since $f_n \ge h_n$, we see that
$$\mu_0(f_n) \uparrow \mu(f).$$
On combining this fact with Lemma (a), we obtain the next result. (Lemma (a) 'changes our particular sequence to any sequence'.)

LEMMA
(b) Let $f \in (\mathrm{m}\Sigma)^+$ and let $(f_n)$ be any sequence in $SF^+$ such that $f_n \uparrow f$. Then
$$\mu_0(f_n) \uparrow \mu(f).$$
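A5.4. Proof of the Monotone-Convergence Theorem

Recall the statement:

Let $(f_n)$ be a sequence of elements of $(\mathrm{m}\Sigma)^+$ such that $f_n \uparrow f$. Then $\mu(f_n) \uparrow \mu(f)$.

Proof. Let $\alpha^{(r)}$ denote the $r$th staircase function defined in Section 5.3. Now set
$$f_n^{(r)} := \alpha^{(r)}(f_n), \qquad f^{(r)} := \alpha^{(r)}(f).$$
Since $\alpha^{(r)}$ is left-continuous, $f_n^{(r)} \uparrow f^{(r)}$ as $n \uparrow \infty$. Since $\alpha^{(r)}(x) \uparrow x$, $\forall x$, $f_n^{(r)} \uparrow f_n$ as $r \uparrow \infty$. By Lemma A5.2(b), $\mu_0(f_n^{(r)}) \uparrow \mu_0(f^{(r)})$ as $n \uparrow \infty$; and by Lemma A5.3(b), $\mu_0(f_n^{(r)}) \uparrow \mu(f_n)$ as $r \uparrow \infty$. We also know from Lemma A5.3(b) that $\mu_0(f^{(r)}) \uparrow \mu(f)$. The result now follows from Proposition A5.1. □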
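Chapter A9
Appendix to Chapter 9

This chapter is solely devoted to the proof of the 'infinite-product' Theorem 8.6. It may be read after Section 9.10. It is probably something which a keen student who has read all previous appendices should study with a tutor.

A9.1. Infinite products: setting things up

Let $(\Lambda_n : n \in \mathbb{N})$ be a sequence of probability measures on $(\mathbb{R}, \mathcal{B})$. Let
$$\Omega := \prod_{n \in \mathbb{N}} \mathbb{R},$$
so that a typical element $\omega$ of $\Omega$ is a sequence $\omega = (\omega_n : n \in \mathbb{N})$ of elements of $\mathbb{R}$. Define
$$X_n(\omega) := \omega_n, \qquad \mathcal{F}_n := \sigma(X_1, X_2, \ldots, X_n).$$
The typical element $F_n$ of $\mathcal{F}_n$ has the form
(a) $F_n = G_n \times \prod_{k > n} \mathbb{R}$, $\quad G_n \in \prod_{1 \le k \le n} \mathcal{B}$.
Fubini's Theorem shows that we may unambiguously use (a) to define a map $\mathrm{P}^- : \mathcal{F}^- \to [0,1]$ on $\mathcal{F}^- := \bigcup \mathcal{F}_n$ via
(b) $\mathrm{P}^-(F_n) = (\Lambda_1 \times \cdots \times \Lambda_n)(G_n)$,
and that $\mathrm{P}^-$ is finitely additive on the algebra $\mathcal{F}^-$. However, for each fixed $n$,
(c) $(\Omega, \mathcal{F}_n, \mathrm{P}^-)$ is a bona fide probability triple
which may be identified with $\prod_{1 \le k \le n} (\mathbb{R}, \mathcal{B}, \Lambda_k)$ via (a) and (b). Moreover, $X_1, X_2, \ldots, X_n$ are independent RVs on $(\Omega, \mathcal{F}_n, \mathrm{P}^-)$.

We want to prove that
(d) $\mathrm{P}^-$ is countably additive on $\mathcal{F}^-$
(obviously with the intention of using Carathéodory's Theorem 1.7). Now we know from our proof of the existence of Lebesgue measure (see (A1.9,a)) that it is enough to show that
(e) if $(H_r)$ is a sequence of sets in $\mathcal{F}^-$ such that $H_r \supseteq H_{r+1}$, $\forall r$, and if for some $\varepsilon > 0$, $\mathrm{P}^-(H_r) \ge \varepsilon$ for every $r$, then $\bigcap H_r \ne \emptyset$.

A9.2. Proof of (A9.1,e)

Step 1: For every $r$, there is some $n(r)$ such that $H_r \in \mathcal{F}_{n(r)}$ and so
$$I_{H_r}(\omega) = h_r(\omega_1, \omega_2, \ldots, \omega_{n(r)})$$
for some $h_r \in \mathrm{b}\mathcal{B}^{n(r)}$. Recall that $X_k(\omega) = \omega_k$, and look again at Section A3.2.

Step 2: We have
(a0) $\mathrm{E}^- h_r(X_1, X_2, \ldots, X_{n(r)}) \ge \varepsilon$, $\forall r$,
because the left-hand side of (a0) is exactly $\mathrm{P}^-(H_r)$. If we work within the probability triple $(\Omega, \mathcal{F}_{n(r)}, \mathrm{P}^-)$, then we know from Section 9.10 that
$$\gamma_r(\omega) := g_r(\omega_1) := \mathrm{E}^- h_r(\omega_1, X_2, X_3, \ldots, X_{n(r)})$$
is an explicit version of the conditional expectation of $I_{H_r}$ given $\mathcal{F}_1$, and
$$\varepsilon \le \mathrm{P}^-(H_r) = \mathrm{E}^-(\gamma_r) = \Lambda_1(g_r).$$
Now, $0 \le g_r \le 1$, so that
$$\varepsilon \le \Lambda_1(g_r) \le 1 \cdot \Lambda_1\{g_r \ge \varepsilon 2^{-1}\} + \varepsilon 2^{-1} \Lambda_1\{g_r < \varepsilon 2^{-1}\} \le \Lambda_1\{g_r \ge \varepsilon 2^{-1}\} + \varepsilon 2^{-1}.$$
Thus
$$\Lambda_1\{g_r \ge \varepsilon 2^{-1}\} \ge 2^{-1}\varepsilon.$$

Step 3: However, since $H_r \supseteq H_{r+1}$, we have (working within $(\Omega, \mathcal{F}_m, \mathrm{P}^-)$ where both $H_r$ and $H_{r+1}$ are in $\mathcal{F}_m$)
$$g_r(\omega_1) \ge g_{r+1}(\omega_1), \text{ for every } \omega_1 \text{ in } \mathbb{R}.$$
Working on $(\mathbb{R}, \mathcal{B}, \Lambda_1)$, we have
$$\Lambda_1\{g_r \ge \varepsilon 2^{-1}\} \ge \varepsilon 2^{-1}, \quad \forall r,$$
and $g_r \downarrow$ so that $\{g_r \ge \varepsilon 2^{-1}\} \downarrow$; and, by Lemma 1.10(b) on the continuity from above of measures, we have
$$\Lambda_1\{\omega_1 : g_r(\omega_1) \ge \varepsilon 2^{-1}, \forall r\} \ge \varepsilon 2^{-1}.$$
Hence, there exists $\omega_1^*$ (say) in $\mathbb{R}$ such that $g_r(\omega_1^*) \ge \varepsilon 2^{-1}$, $\forall r$:
(a1) $\mathrm{E}^- h_r(\omega_1^*, X_2, \ldots, X_{n(r)}) \ge \varepsilon 2^{-1}$, $\forall r$.

Step 4: We now repeat Steps 2 and 3 applied to the situation in which
$(X_1, X_2, \ldots)$ is replaced by $(X_2, X_3, \ldots)$,
$h_r$ is replaced by $h_r(\omega_1^*)$, where
$$(h_r(\omega_1^*))(\omega_2, \omega_3, \ldots) := h_r(\omega_1^*, \omega_2, \omega_3, \ldots).$$
We find that there exists $\omega_2^*$ in $\mathbb{R}$ such that
(a2) $\mathrm{E}^- h_r(\omega_1^*, \omega_2^*, X_3, \ldots, X_{n(r)}) \ge \varepsilon 2^{-2}$, $\forall r$.
Proceeding inductively, we obtain a sequence $\omega^* = (\omega_n^* : n \in \mathbb{N})$ with the property that
$$h_r(\omega_1^*, \omega_2^*, \ldots, \omega_{n(r)}^*) \ge \varepsilon 2^{-n(r)}, \quad \forall r.$$
However, $h_r(\omega_1^*, \omega_2^*, \ldots, \omega_{n(r)}^*) = I_{H_r}(\omega^*)$, and can only be 0 or 1. The only conclusion is that $\omega^* \in H_r$, $\forall r$; and it was exactly the existence of such an $\omega^*$ which we had to prove. □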
Chapter
A13
Appendix
to Chapter
13
This
chapter
by
many
certainly
easy to
set
examination
questions
A13.1. Modes of convergence: definitions

Let $(X_n : n \in \mathbb{N})$ be a sequence of RVs and let $X$ be a RV, all carried by our triple $(\Omega, \mathcal{F}, \mathrm{P})$. Let us collect together definitions known to us.

Almost sure convergence
We say that $X_n \to X$ almost surely if
$$\mathrm{P}(X_n \to X) = 1.$$

Convergence in probability
We say that $X_n \to X$ in probability if, for every $\varepsilon > 0$,
$$\mathrm{P}(|X_n - X| > \varepsilon) \to 0 \text{ as } n \to \infty.$$

$\mathcal{L}^p$ convergence ($p \ge 1$)
We say that $X_n \to X$ in $\mathcal{L}^p$ if each $X_n$ is in $\mathcal{L}^p$ and $X \in \mathcal{L}^p$ and
$$\|X_n - X\|_p \to 0 \text{ as } n \to \infty,$$
equivalently,
$$\mathrm{E}(|X_n - X|^p) \to 0 \text{ as } n \to \infty.$$

A13.2.

Let me state the facts. Convergence in probability is the weakest of the above forms of convergence. Thus
(a) $(X_n \to X$, a.s.$) \Rightarrow (X_n \to X$ in prob$)$,
(b) for $p \ge 1$, $(X_n \to X$ in $\mathcal{L}^p) \Rightarrow (X_n \to X$ in prob$)$.
No other implication between any two of our three forms of convergence is valid. But, of course, for $r \ge p \ge 1$,
(c) $(X_n \to X$ in $\mathcal{L}^r) \Rightarrow (X_n \to X$ in $\mathcal{L}^p)$.
If we know that 'convergence in probability is happening quickly' in that
(d) $\sum_n \mathrm{P}(|X_n - X| > \varepsilon) < \infty$, $\forall \varepsilon > 0$,
then (BC1) allows us to conclude that $X_n \to X$, a.s. The fact that property (d) implies a.s. convergence is used in proving
(e) $X_n \to X$ in probability if and only if every subsequence of $(X_n)$ contains a further subsequence which converges a.s. to $X$.
The only other useful result is that
(f) for $p \ge 1$, $X_n \to X$ in $\mathcal{L}^p$ if and only if the following two statements hold:
(i) $X_n \to X$ in probability,
(ii) the family $(|X_n|^p : n \ge 1)$ is UI.

There is only one way to gain an understanding of the above facts, and that is to prove them yourself. The exercises under EA13 provide guidance if you need it.
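A standard example separating convergence in probability from almost sure convergence is the 'typewriter' sequence of indicator functions on $[0,1)$ with Lebesgue measure. It is not taken from the text; the sketch below, with hypothetical helper names `typewriter` and `X`, is offered only as a concrete illustration of why the implication in (a) cannot be reversed.

```python
from fractions import Fraction


def typewriter(n):
    """Endpoints (left, right) of the dyadic interval [left, right)
    whose indicator is X_n, for n >= 1.  Writing n = 2**m + k with
    0 <= k < 2**m, the interval is [k / 2**m, (k + 1) / 2**m)."""
    m = n.bit_length() - 1
    k = n - (1 << m)
    return Fraction(k, 1 << m), Fraction(k + 1, 1 << m)


def X(n, omega):
    """X_n(omega) for omega in [0, 1)."""
    left, right = typewriter(n)
    return 1 if left <= omega < right else 0


omega = Fraction(1, 3)
# Number of indices n in the block 2**m <= n < 2**(m+1) with X_n(omega) = 1.
hits_per_block = [
    sum(X(n, omega) for n in range(1 << m, 1 << (m + 1))) for m in range(10)
]
```

Here $\mathrm{P}(X_n \ne 0)$ is the length $2^{-m}$ of the interval, which tends to 0, so $X_n \to 0$ in probability (and in every $\mathcal{L}^p$). Yet every $\omega$ is covered exactly once in each block of indices, so $X_n(\omega) = 1$ infinitely often and $X_n(\omega)$ does not converge to 0 for any $\omega$.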
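Chapter A14
Appendix to Chapter 14

We work with a filtered space $(\Omega, \mathcal{F}, \{\mathcal{F}_n : n \in \mathbb{Z}^+\}, \mathrm{P})$. This chapter introduces the $\sigma$-algebra $\mathcal{F}_T$, where $T$ is a stopping time. The idea is that $\mathcal{F}_T$ represents the information available to our observer at (or, if you prefer, immediately after) time $T$. The Optional-Sampling Theorem says that if $X$ is a uniformly integrable supermartingale and $S$ and $T$ are stopping times with $S \le T$, then we have the natural extension of the supermartingale property:
$$\mathrm{E}(X_T \mid \mathcal{F}_S) \le X_S, \text{ a.s.}$$

A14.1. The $\sigma$-algebra $\mathcal{F}_T$, $T$ a stopping time

Recall that a map $T : \Omega \to \mathbb{Z}^+ \cup \{\infty\}$ is called a stopping time if
$$\{T \le n\} \in \mathcal{F}_n, \quad n \in \mathbb{Z}^+ \cup \{\infty\},$$
equivalently if
$$\{T = n\} \in \mathcal{F}_n, \quad n \in \mathbb{Z}^+ \cup \{\infty\}.$$
In each of the above statements, the '$n = \infty$' case follows automatically from the validity of the result for every $n$ in $\mathbb{Z}^+$.

Let $T$ be a stopping time. Then, for $F \subseteq \Omega$, we say that $F \in \mathcal{F}_T$ if
► $F \cap \{T \le n\} \in \mathcal{F}_n$, $\quad n \in \mathbb{Z}^+ \cup \{\infty\}$,
equivalently if
► $F \cap \{T = n\} \in \mathcal{F}_n$, $\quad n \in \mathbb{Z}^+ \cup \{\infty\}$.
Then $\mathcal{F}_T = \mathcal{F}_n$ if $T \equiv n$; $\mathcal{F}_T = \mathcal{F}_\infty$ if $T \equiv \infty$; and $\mathcal{F}_T \subseteq \mathcal{F}_\infty$ for every $T$.

You can easily check that $\mathcal{F}_T$ is a $\sigma$-algebra. You can also check that if $S$ is another stopping time, then $\mathcal{F}_{S \wedge T} \subseteq \mathcal{F}_T$.
Hint. If $F \in \mathcal{F}_{S \wedge T}$, then
$$F \cap \{T = n\} = \bigcup_{k \le n} \big(F \cap \{S \wedge T = k\}\big) \cap \{T = n\}.$$

Another detail that needs to be checked is that if $X$ is an adapted process and $T$ is a stopping time, then $X_T \in \mathrm{m}\mathcal{F}_T$. Here, $X_\infty$ is assumed defined in some way such that $X_\infty$ is $\mathcal{F}_\infty$ measurable.

Proof. For $B \in \mathcal{B}$,
$$\{X_T \in B\} \cap \{T = n\} = \{X_n \in B\} \cap \{T = n\} \in \mathcal{F}_n.$$ □

A14.2. A special case of OST

LEMMA
Let $X$ be a supermartingale. Let $T$ be a stopping time such that, for some $N$ in $\mathbb{N}$, $T(\omega) \le N$, $\forall \omega$. Then $X_T \in \mathcal{L}^1(\Omega, \mathcal{F}_T, \mathrm{P})$ and
$$\mathrm{E}(X_N \mid \mathcal{F}_T) \le X_T.$$

Proof. Let $F \in \mathcal{F}_T$. Then
$$\mathrm{E}(X_N; F) = \sum_{n \le N} \mathrm{E}(X_N; F \cap \{T = n\}) \le \sum_{n \le N} \mathrm{E}(X_n; F \cap \{T = n\}) = \mathrm{E}(X_T; F).$$
(Of course, the fact that $|X_T| \le |X_0| + |X_1| + \cdots + |X_N|$ guarantees the result that $\mathrm{E}(|X_T|) < \infty$.) □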
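The defining property of a stopping time, that $\{T \le n\}$ is decided by what has been seen up to time $n$, can be tested by brute force on a finite coin-tossing space. The sketch below is an illustration under assumed choices: the horizon `N = 6`, the walk built from $\pm 1$ steps, and the function names are all hypothetical, and $\mathcal{F}_n$ is taken to be generated by the first $n$ steps.

```python
from itertools import product

N = 6  # hypothetical time horizon
PATHS = list(product((-1, 1), repeat=N))


def partial_sums(path):
    """[S_0, S_1, ..., S_N] for the walk with the given steps."""
    sums, total = [0], 0
    for step in path:
        total += step
        sums.append(total)
    return sums


def first_hit(path, level=1):
    """First n >= 1 with S_n = level, or N if the level is not reached."""
    for n, s in enumerate(partial_sums(path)):
        if n >= 1 and s == level:
            return n
    return N


def last_zero(path):
    """Last n <= N with S_n = 0; this needs knowledge of the future."""
    return max(n for n, s in enumerate(partial_sums(path)) if s == 0)


def is_stopping_time(T):
    """True if, for each n, the event {T <= n} is a function of the
    first n steps only."""
    for n in range(N + 1):
        decided = {}
        for path in PATHS:
            event = T(path) <= n
            if decided.setdefault(path[:n], event) != event:
                return False
    return True
```

The first hitting time passes the check, while the last visit to 0 fails it, because deciding whether a visit is the last one requires looking ahead.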
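A14.3. Doob's Optional-Sampling Theorem for UI martingales

Let $M$ be a UI martingale. Then, for any stopping time $T$,
$$\mathrm{E}(M_\infty \mid \mathcal{F}_T) = M_T, \text{ a.s.}$$

Corollary 1 (a new Optional-Stopping Theorem!)
If $M$ is a UI martingale and $T$ is a stopping time, then $\mathrm{E}(|M_T|) < \infty$ and $\mathrm{E}(M_T) = \mathrm{E}(M_0)$.

Corollary 2
If $M$ is a UI martingale and $S$ and $T$ are stopping times with $S \le T$, then
$$\mathrm{E}(M_T \mid \mathcal{F}_S) = M_S, \text{ a.s.}$$

Proof of theorem. By Theorem 14.1 and Lemma A14.2, we have, for $k \in \mathbb{N}$,
$$\mathrm{E}(M_\infty \mid \mathcal{F}_k) = M_k, \text{ a.s.,} \qquad \mathrm{E}(M_k \mid \mathcal{F}_{T \wedge k}) = M_{T \wedge k}, \text{ a.s.}$$
Hence, by the Tower Property,
(*) $\mathrm{E}(M_\infty \mid \mathcal{F}_{T \wedge k}) = M_{T \wedge k}$, a.s.
If $F \in \mathcal{F}_T$, then (check!) $F \cap \{T \le k\} \in \mathcal{F}_{T \wedge k}$, so that, by (*),
(**) $\mathrm{E}(M_\infty; F \cap \{T \le k\}) = \mathrm{E}(M_{T \wedge k}; F \cap \{T \le k\}) = \mathrm{E}(M_T; F \cap \{T \le k\})$.
We can (and do) restrict attention to the case when $M_\infty \ge 0$, whence $M_n = \mathrm{E}(M_\infty \mid \mathcal{F}_n) \ge 0$ for all $n$. Then, on letting $k \uparrow \infty$ in (**) and using (MON), we obtain
$$\mathrm{E}(M_\infty; F \cap \{T < \infty\}) = \mathrm{E}(M_T; F \cap \{T < \infty\}).$$
However, the fact that
$$\mathrm{E}(M_\infty; F \cap \{T = \infty\}) = \mathrm{E}(M_T; F \cap \{T = \infty\})$$
is tautological. Hence $\mathrm{E}(M_\infty; F) = \mathrm{E}(M_T; F)$. □

Corollary 2 now follows from the Tower Property, and Corollary 1 follows from Corollary 2!

A14.4. The result for UI submartingales

A UI submartingale $X$ has Doob decomposition
$$X = X_0 + M + A,$$
where $\mathrm{E}(A_\infty) < \infty$ and $M$ is UI. Hence, if $T$ is a stopping time, then, almost surely,
$$\mathrm{E}(X_\infty \mid \mathcal{F}_T) = X_0 + \mathrm{E}(M_\infty \mid \mathcal{F}_T) + \mathrm{E}(A_\infty \mid \mathcal{F}_T) = X_0 + M_T + \mathrm{E}(A_\infty \mid \mathcal{F}_T) \ge X_0 + M_T + \mathrm{E}(A_T \mid \mathcal{F}_T) = X_T.$$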
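Corollary 1 can be checked exactly in a toy case. The sketch below assumes a simple symmetric random walk $M$ started at 0 and the bounded stopping time $T$ = (first time the walk reaches $+1$, or the horizon $N$ if that does not happen); the walk stopped at time $N$ is bounded and therefore UI. The horizon `N = 8` and the function names are hypothetical choices, and the expectation is computed by enumerating all $2^N$ equally likely paths.

```python
from fractions import Fraction
from itertools import product

N = 8  # hypothetical time horizon
PATHS = list(product((-1, 1), repeat=N))


def stopped_value(path, level=1):
    """M_T, where T is the first time the walk hits `level`, capped at N."""
    position = 0
    for step in path:
        if position == level:
            break
        position += step
    return position


def running_max(path):
    """Value of the walk at the time of its maximum over [0, N]; this
    time is NOT a stopping time."""
    best = position = 0
    for step in path:
        position += step
        best = max(best, position)
    return best


def expectation(f):
    """Exact expectation of f over the 2**N equally likely paths."""
    return sum(Fraction(f(p)) for p in PATHS) / len(PATHS)
```

For the stopping time the expectation is exactly $\mathrm{E}(M_0) = 0$, even though most paths end at $+1$. Sampling the walk at the time of its maximum, which looks into the future, gives a strictly positive expectation, so the stopping-time hypothesis cannot be dropped.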
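Chapter A16
Appendix to Chapter 16

A16.1. Differentiation under the integral sign

Before stating our theorem on this topic, let us examine the type of application we need in Section 16.3. Suppose that $X$ is a RV such that $\mathrm{E}(|X|) < \infty$ and that $h(t, x) = ixe^{itx}$. (We can treat the real and imaginary parts of $h$ separately.) Note that if $[a, b]$ is a subinterval of $\mathbb{R}$, then the variables $\{h(t, X) : t \in [a, b]\}$ are dominated by $|X|$, and so are UI. In the theorem, we shall have
$$\mathrm{E}H(t, X) = \varphi_X(t) - \varphi_X(a), \quad t \in [a, b],$$
and we can conclude that $\varphi_X'(t)$ exists and equals $\mathrm{E}(iXe^{itX})$.

THEOREM
Let $X$ be a RV carried by $(\Omega, \mathcal{F}, \mathrm{P})$. Suppose that $a, b \in \mathbb{R}$ with $a < b$, and that
$$h : [a, b] \times \mathbb{R} \to \mathbb{R}$$
has the properties:
(i) $t \mapsto h(t, x)$ is continuous in $t$ for every $x$ in $\mathbb{R}$,
(ii) $x \mapsto h(t, x)$ is $\mathcal{B}$-measurable in $x$ for every $t$ in $[a, b]$,
(iii) the variables $\{h(t, X) : t \in [a, b]\}$ are UI.
Then
(a) $t \mapsto \mathrm{E}h(t, X)$ is continuous on $[a, b]$,
(b) $h$ is $\mathcal{B}[a, b] \times \mathcal{B}$-measurable,
(c) if $H(t, x) := \int_a^t h(s, x)\,ds$ for $a \le t \le b$, then for $t \in (a, b)$,
$$\frac{d}{dt} \mathrm{E}H(t, X) \text{ exists and equals } \mathrm{E}h(t, X).$$

Proof of (a). Since we need only consider the 'sequential case $t_n \to t$', result (a) follows immediately from Theorem 13.7. □

Proof of (b). Define
$$\delta_n := 2^{-n}(b - a), \qquad D_n := (a + \delta_n \mathbb{Z}^+) \cap [a, b],$$
$$\tau_n(t) := \inf\{\tau \in D_n : \tau \ge t\}, \quad t \in [a, b],$$
$$h_n(t, x) := h(\tau_n(t), x).$$
Then, for $B \in \mathcal{B}$,
$$h_n^{-1}(B) = \bigcup_{\tau \in D_n} \big(((\tau - \delta_n, \tau] \cap [a, b]) \times \{x : h(\tau, x) \in B\}\big),$$
so that $h_n$ is $\mathcal{B}[a, b] \times \mathcal{B}$-measurable. Since $h_n \to h$ on $[a, b] \times \mathbb{R}$ (because $\tau_n(t) \to t$ and $h$ is continuous in $t$), result (b) follows. □

Proof of (c). For $\Gamma \subseteq [a, b] \times \mathbb{R}$, define
$$\alpha(\Gamma) := \{(t, \omega) \in [a, b] \times \Omega : (t, X(\omega)) \in \Gamma\}.$$
If $\Gamma = A \times C$ where $A \in \mathcal{B}[a, b]$ and $C \in \mathcal{B}$, then
$$\alpha(\Gamma) = A \times (X^{-1}C) \in \mathcal{B}[a, b] \times \mathcal{F}.$$
It is now clear that the class of $\Gamma$ for which $\alpha(\Gamma)$ is an element of $\mathcal{B}[a, b] \times \mathcal{F}$ is a $\sigma$-algebra containing $\mathcal{B}[a, b] \times \mathcal{B}$. The point of all this is that
(*) $(t, \omega) \mapsto h(t, X(\omega))$ is $\mathcal{B}[a, b] \times \mathcal{F}$ measurable
since for $B \in \mathcal{B}$, $\{(t, \omega) : h(t, X(\omega)) \in B\}$ is $\alpha(h^{-1}B)$. (Yes, I know, we could have obtained (*) more directly using the $h_n$'s, but it is good to have other methods.) Since the family $\{h(t, X) : t \in [a, b]\}$ is UI, it is bounded in $\mathcal{L}^1$, whence
$$\int_a^b \mathrm{E}|h(t, X)|\,dt < \infty.$$
Fubini's Theorem now implies that, for $a \le t \le b$,
$$\int_a^t \mathrm{E}h(s, X)\,ds = \mathrm{E}\int_a^t h(s, X)\,ds = \mathrm{E}H(t, X),$$
and part (c) now follows. □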
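Chapter E
Exercises

Starred exercises are more tricky. The first number in an exercise gives a rough indication of which chapter it depends on. 'G' stands for 'a bit of gumption is all that's necessary'. A number of exercises may also be found in the main text. Some are repeated here. We begin with an Antidote to measure-theoretic material, just for fun, though the point that probability is more than mere measure theory needs hammering home.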
EG.1. Two points are chosen at random on a line AB, each point being chosen according to the uniform distribution on AB, and the choices being made independently of each other. The line AB may now be regarded as divided into three parts. What is the probability that they may be made into a triangle?

EG.2. Planet X is a ball with centre O. Three spaceships A, B and C land at random on its surface, their positions being independent and each uniformly distributed on the surface. Spaceships A and B can communicate directly by radio if $\angle AOB < 90°$. Show that the probability that they can keep in touch (with, for example, A communicating with B via C if necessary) is $(\pi + 2)/(4\pi)$.
free
EG.3.
0
with
Let G be the
the
group
with
the
two generators
a and h.
second
Start
at
time
unit
element
each
at
1, the empty
(independently
word.
At
each
multiply
the
four
elements
a, a~^,
probability
1/4
of previous
h\"^, a~^,
a,a,6,
times
a~^,
a,
a,
1 to
the
will
produce
the
reduced
is intuitively word
word aah
Prove that
time
probability
that
why
the reduced
it
is 1/3,
and explain
of length 3 at time 9. word 1 ever occursat a positive clear that (almost surely)
n)ln
\342\200\224> \\.
(length of
reduced
at time
224
Chapter
E: Exercises
now
225
elements
that
the
a,
the
a\"\"^,
&, &''^
are
instead
/3
with
0,a +
1 ever
respective
probabilities
that
a^a,l3^l3^
the
where a is
a >
reduced
0,/? >
word
^. Prove
root
that the
x =
conditional probabilitythat
element
the
1, is the unique
chosen at time
(0,1) of
+
equation
(3
- 4a-^)x^ + X
true
1 =
0. and more
As time
word
that)
more
of the reduced
becomes
so that
a final word
is built
up.
If in
the symbolsa and a\"^ are both replaced by A and the show that the sequence of A^s are both replaced B, by a Markov chain on {A,B} with (for example)
PAA
a(l
\342\200\224
a(l -
x)
'
x)
+2/3(1-t/)'
proportion
where
r(/3).
What
is the
final
of the
Lyons
symbol a in the
of
of occurrence
(Note.
This
result was
Edinburgh
to solve
a long-standing
problemin potentialtheory
used by Professor
on
Riemannian
manifolds.)
Algebras,
etc.
subsets
El.l.
Let
of N
has
(Cesaro)
density ^(V)
and write
G CES
if
Vi and
V2
in
CES
for which
Vi
fl V'2
^ CES.
Independence
E4.1.
Let
(fi,.F,
P)
be a
for
probability triple.
k =
I3
be
three
1,2,3,
Xik
and Q
\302\243Xk'
226
Chapter
E: Exercises
Prove that
if
p(/in/2n/3)
= P(/i)P(/2)P(/3)
then
1,2,3),
Q, \302\243 Xk\"^ (^(s)
cr(Ji),a(J2),a(J3)
are
independent.
Let 5
:=
^~'*' Yln\342\202\254N
^^ usual.
Let X
and Y be
independent N-valued
variables
with
P(X =
Prove
are
n) = P(F
n)
n-7C(5).
that
the events
independent.
(Ep : p prime)
, where
Ep =
{X is
divisibleby
p},
i/c(^)=n(i-i/p')
p
probabilistically.
Prove
that
X)
1/C{2s).
and
Y. Prove that
P(H = n) = n-27C(25).
E4.3. Let $X_1, X_2, \ldots$ be independent random variables with the same continuous distribution function. Let $E_1 := \Omega$, and, for $n \ge 2$, let
$$E_n := \{X_n > X_m, \forall m < n\} = \{\text{a 'Record' occurs at time } n\}.$$
Convince yourself and your tutor that the events $E_1, E_2, \ldots$ are independent, with $\mathrm{P}(E_n) = 1/n$.
Borel-Cantelli Lemmas
E4.4.
Let
Ak
Suppose
that
a coin
be
the
event
that a
amongst tossesnumbered
is tossed repeatedly. heads occurs consecutive sequence more) - 1. Prove that 2*, 2* + 1,2* + 2,..., 2*+^
of
with probability
of
heads
k (or
'\342\200\242(^-o)
{j
there
make
are k consecutive
a simple
heads beginningat
inclusion-exclusion
(Lemma
1.9).
Chapter
E: Exercises
227
the
if
G is
a random
variable with
normal
N(0,1)
then,
for
x >
0,
P(G >x) =
Let Xi,X2,
with \342\200\242 be a \342\200\242 \342\200\242
y/27:
Jx
He-^y'dy< ^yJ^
-J\342\200\224e-^^\\
probability
1,
L := limsup(X\342\200\236/\\/2
(Harder.
log See
n). Section
Prove that
P{L =
J^2
1) = 1.)[Hint
\342\200\242 \342\200\242 \342\200\242
14.8.]
distribution.
this
Let 5n
:= J^i +
Prove
^n-
Recall
that
Sn/y/n
that
P(|5n|
Note
< 2i/nlogn,
ev) =
1.
0)
that
implies
the Strong
the
Law: P(5n/n
Logarithm
= \342\200\224>
1.
Remark.
The Law of
Iterated
states
that
=
P (lim
V
sup .2n ^
V
now!
log log n
=1
1-
this
See Section
14.7.
Z be
a non-negative RV.
Z.
Show
that
(*)
5]P[Z>n]<E(Z)<l
ncN
+ j;P[Z>n].
ncN
IID
RVs = 00,
(independent,
identically
distributed
P[Xn\\
>
kn]
= 00
(ke N)
and
limsup J^
= 00, a.s.
Deduce
that
\\i Sn
= Xi
r lim
-\\-
X2
^ \\Sn\\
then \\-X\342\200\236,
sup
= 00,
a.s.
228
E4.7.
Let
Chapter
E:
Exercises
What's
Xi, X2,...
fair about a
fair
game?
RVs
be independent
y
such
that
__ ~\"
\\
\342\200\224 1
with that
probability probability if Sn
Prove
that
E{Xn) =
0, Vn,
but
= Xi
+ X2
\342\200\242\342\200\242\342\200\242
^n,
then
Sn n
\342\200\224 a.s.
1,
E4.8*.
Blackwell's
test of imagination
you
are
familiar
with continuous-parameter
: t
Markov
with
For each n G
state-space
N, let
X^^) =
{X^^'^t) with
> 0}
be a Markov chain
the
two-point
set {0,1}
Q-matrix
and
transition
function
P^^\\t) =
bn)
exp(tQ^^^).Show
,
that,
for
every
t,
Pil\\t)
> bn/{an +
p^'^t)
< 0^/(0^
+ 6n).
=
The processes : n G N) are independent and X^^\\0) (X^^^ Each X^^^has right-continuous paths.
Suppose Prove
0 for
every n.
that that
an =
if t
oo.
many
(*)
on
P{X(^)(t)= 1
to
infinitely
n} =
0.
convergent
Use Weierstrass'sM-test
[0,1],
show
that
J^n ^^gPoo
(0 is uniformly as tj 0.
P{X(^)(^) = 0 for
Prove
ALL
n}
-. 1
that
P{X(^)(5)
= 0,
tutor
V5
<
^,Vn}
= 0
for every t
>
and
discuss
with
your
why it is
that
Chapter
E:
Exercises
of
229
many
(**)
Now
within
every
non-empty
time
interval,
infinitely
the
X^^^
chains jump.
imagine
the
whole
behaviour.
almost all its time Notes. Almost surely, the process X = (X^^^) spends of sequences with in the countablesubset of {0, l}*^ consisting finitely only and Fubini's Theorem 8.2. However, it I's. This follows from many (*) is a.s. true that X visits uncountable points of {0,1}'^ during every This follows from (**) and the Bairecategory theorem time interval. nonempty A 1.12. one can show that for certain By using much deeper techniques, choices of (on) and (6n),X will almost certainly visit every point of {0,1}*^ often within a finite time. uncountably
Tail <T-algebras
E4.9.
Let
lo,
yi,
^2,...
p(y;
be independent
=
+i)
= P(y;
i,
Vn.
For
n G
N, define
Xn '=
Prove that
YoYi
YnDefine
the variables-X'i,X2,...areindependent.
y :=
^n
:=
CT{Xr
I V
>
Tl).
Prove
that
:=f]c7{y,Tn)^ah,f]Tnj
=:n.
independent of IZ.
Yq
and G m\302\243 2
that
Yq is
E4.10. Star
See
Trek,
ElO.ll,
which you
can do now.
Define
fn
:=
n/(o,i/n).
every
n.
Draw
Prove a
230
Inclusion-Exclusion
Chapter
E:
Exercises
Formulae
and
inequalities
of Section
1.9
The
Strong
Inverting
Law
Laplace
E7.1.
Let
of
transforms on
by
[0,oo).
The Laplace
transform
i(A)
:=
o-Ax
/'
f{x)dx
Let
JCi,-X'2,.
A,
of rate
so
P[X
> x]
= e~'^^, E{X) =
RVs each
with Var(X)
the
exponential
=
distribution
{,
^.
Show that
h -^n, be
and
recovered
L^^^^^ denotes the (n \342\200\224 1)*'* derivative from L as follows: for y > 0,
ntoo
(n
\342\200\224
1)!
E7.2.
The
uniform
distribution
R^
write S^^^ = {x E R^ : |a:| = 1}. You may assume that there is a = such that unique probability measure i/^~^on (5^~^,S(5^~-^)) u^^^{A) A in B{S^^^). u^^^{HA) for every orthogonal n x n matrix H and every Prove that if X is a vector in Rn, the components of which are for then x n matrix H, the n independent variables, every orthogonal N(0,1) vector HX. has the same property. Deducethat has law i/^~^. X/|X|
As usual,
be
independent
N(0,1)
variables
and
define
\342\200\224> a.s.
1,
these ideas
Brownian
for
fact
which
relates
is
quantum
the
mechanics:
Chapter
for
E: Exercises
is a
231
chosen on
If,
each
to the distribution
lim
point
5^^^ according
then
n-*ooP(v^F/\"^
< x)
- $(x)
^=
y/2'K
J-oo
e-y^l^dy,
lim
n\342\200\224^00
PiV^Y}\"^
< xi;
^/^Y^^\"'>
<
X2) =
$(xi)$(a;2).
iTmi.
P(F/\"^
< u)
P(Xi/ii\342\200\236
<
u).
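Conditional Expectation

E9.1. Prove that if $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}$ and if $X \in \mathcal{L}^1(\Omega, \mathcal{F}, \mathrm{P})$ and if $Y \in \mathcal{L}^1(\Omega, \mathcal{G}, \mathrm{P})$ and
(*) $\mathrm{E}(X; G) = \mathrm{E}(Y; G)$
for every $G$ in a $\pi$-system which contains $\Omega$ and generates $\mathcal{G}$, then (*) holds for every $G$ in $\mathcal{G}$.

E9.2. Suppose that $X, Y \in \mathcal{L}^1(\Omega, \mathcal{F}, \mathrm{P})$ and that
$$\mathrm{E}(X \mid Y) = Y, \text{ a.s.,} \qquad \mathrm{E}(Y \mid X) = X, \text{ a.s.}$$
Prove that $\mathrm{P}(X = Y) = 1$.
Hint. Consider
$$\mathrm{E}(X - Y; X > c, Y \le c) + \mathrm{E}(X - Y; X \le c, Y \le c).$$

Martingales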
E10.1. Pólya's urn
At time 0, an urn contains 1 black ball and 1 white ball. At each time 1, 2, 3, ..., a ball is chosen at random from the urn and is replaced together with a new ball of the same colour. Just after time $n$, there are therefore $n + 2$ balls in the urn, of which $B_n + 1$ are black, where $B_n$ is the number of black balls chosen by time $n$.
Let $M_n = (B_n + 1)/(n + 2)$, the proportion of black balls in the urn just after time $n$. Prove that (relative to a natural filtration which you should specify) $M$ is a martingale.
Prove that $P(B_n = k) = (n + 1)^{-1}$ for $0 \le k \le n$. What is the distribution of $\Theta$, where $\Theta := \lim M_n$?
Prove that for $0 < \theta < 1$,
$$N_n^{\theta} := \frac{(n+1)!}{B_n!\,(n - B_n)!}\, \theta^{B_n} (1 - \theta)^{n - B_n}$$
defines a martingale $N^{\theta}$.
(Continued at E10.8.)
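The claim $P(B_n = k) = (n+1)^{-1}$ in E10.1 can be checked by exact enumeration before a proof is attempted. The following is a minimal Python sketch, not part of the original exercise; the function names `polya_distribution` and `mean_proportion` are illustrative assumptions. It pushes the law of $B_n$ forward one draw at a time using exact rational arithmetic.

```python
from fractions import Fraction

def polya_distribution(n):
    """Return [P(B_n = 0), ..., P(B_n = n)] for Polya's urn started with
    1 black and 1 white ball, where B_n counts black balls chosen by time n."""
    dist = [Fraction(1)]  # B_0 = 0 with probability 1
    for t in range(n):
        # Just after time t the urn holds t + 2 balls, B_t + 1 of them black.
        new = [Fraction(0)] * (t + 2)
        for k, prob in enumerate(dist):
            p_black = Fraction(k + 1, t + 2)
            new[k + 1] += prob * p_black      # a black ball is chosen
            new[k] += prob * (1 - p_black)    # a white ball is chosen
        dist = new
    return dist

def mean_proportion(n):
    """E(M_n) with M_n = (B_n + 1)/(n + 2); constant in n for a martingale."""
    dist = polya_distribution(n)
    return sum(p * Fraction(k + 1, n + 2) for k, p in enumerate(dist))
```

A constant value of `mean_proportion(n)` is only a necessary condition for the martingale property, so the sketch supports the exercise without replacing the proof.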
E10.2. Bellman's Optimality Principle
Your winnings per unit stake on game $n$ are $\varepsilon_n$, where the $\varepsilon_n$ are IID RVs with
$$P(\varepsilon_n = +1) = p, \qquad P(\varepsilon_n = -1) = q, \qquad \text{where } \tfrac12 < p = 1 - q < 1.$$
Your stake $C_n$ on game $n$ must lie between 0 and $Z_{n-1}$, where $Z_{n-1}$ is your fortune at time $n - 1$. Your object is to maximize the expected 'interest rate' $E \log(Z_N/Z_0)$, where $N$ is a given integer representing the length of the game, and $Z_0$, your fortune at time 0, is a given constant. Let $\mathcal{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n)$ be your 'history' up to time $n$. Show that if $C$ is any (previsible) strategy, then $\log Z_n - n\alpha$ is a supermartingale, where $\alpha$ denotes the 'entropy'
$$\alpha = p \log p + q \log q + \log 2,$$
so that $E \log(Z_N/Z_0) \le N\alpha$, but that, for a certain strategy, $\log Z_n - n\alpha$ is a martingale. What is the best strategy?
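A numerical exploration can make the bound $E \log(Z_N/Z_0) \le N\alpha$ in E10.2 concrete. The sketch below is a hedged illustration rather than a solution: it assumes, purely for simplicity, that the gambler stakes a fixed fraction $f$ of the current fortune on every game, and the function names are hypothetical. It compares the one-game expected log-growth $p\log(1+f) + q\log(1-f)$ with $\alpha$ over a grid of fractions.

```python
import math

def entropy_alpha(p):
    """alpha = p log p + q log q + log 2, with q = 1 - p."""
    q = 1.0 - p
    return p * math.log(p) + q * math.log(q) + math.log(2.0)

def growth_rate(p, f):
    """One-game expected log-growth when a fixed fraction f of the current
    fortune is staked (an assumed, restricted class of strategies)."""
    q = 1.0 - p
    return p * math.log1p(f) + q * math.log1p(-f)

def best_fraction_on_grid(p, steps=10_000):
    """Grid search over f in [0, 1) for the largest expected log-growth."""
    grid = [i / steps for i in range(steps)]
    return max(grid, key=lambda f: growth_rate(p, f))
```

For instance, `best_fraction_on_grid(0.6)` can be compared with `p - q`, and `growth_rate` at that fraction with `entropy_alpha(0.6)`. The grid search covers only constant-fraction strategies, so the supermartingale argument is still needed for general previsible $C$.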
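E10.3. Stopping times
Suppose that $S$ and $T$ are stopping times (relative to $(\Omega, \mathcal{F}, \{\mathcal{F}_n\})$). Prove that $S \wedge T$ $(:= \min(S, T))$, $S \vee T$ $(:= \max(S, T))$ and $S + T$ are stopping times.

E10.4. Let $S$ and $T$ be stopping times with $S \le T$. Define the process $1_{(S,T]}$ with parameter set $\mathbf{N}$ via
$$1_{(S,T]}(n, \omega) := \begin{cases} 1 & \text{if } S(\omega) < n \le T(\omega), \\ 0 & \text{otherwise.} \end{cases}$$
Prove that $1_{(S,T]}$ is previsible, and deduce that if $X$ is a supermartingale, then
$$E(X_{T \wedge n}) \le E(X_{S \wedge n}), \quad \forall n.$$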
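E10.5. 'What always stands a reasonable chance of happening will (almost surely) happen - sooner rather than later.'
Suppose that $T$ is a stopping time such that for some $N \in \mathbf{N}$ and some $\varepsilon > 0$, we have, for every $n$:
$$P(T \le n + N \mid \mathcal{F}_n) > \varepsilon, \quad \text{a.s.}$$
Prove by induction using $P(T > kN) = P(T > kN;\ T > (k - 1)N)$ that for $k = 1, 2, 3, \ldots$
$$P(T > kN) \le (1 - \varepsilon)^k.$$
Show that $E(T) < \infty$.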
E10.6. ABRACADABRA
At each of times 1, 2, 3, ..., a monkey types a capital letter at random, the sequence of letters typed forming an IID sequence of RVs each chosen uniformly from amongst the 26 possible capital letters.
Just before each time $n = 1, 2, \ldots$, a new gambler arrives on the scene. He bets \$1 that
the $n$th letter will be A.
If he loses, he leaves. If he wins, he receives \$26 all of which he bets on the event that
the $(n + 1)$th letter will be B.
If he loses, he leaves. If he wins, he bets his whole current fortune of \$$26^2$ that
the $(n + 2)$th letter will be R
and so on through the ABRACADABRA sequence. Let $T$ be the first time by which the monkey has produced the consecutive sequence ABRACADABRA. Explain why martingale theory makes it intuitively obvious that
$$E(T) = 26^{11} + 26^4 + 26$$
and use result 10.10(c) to prove this. (See Ross (1983) for other such applications.)
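The three terms in $E(T) = 26^{11} + 26^4 + 26$ in E10.6 correspond to the gamblers still in play at time $T$: those whose winning run matches a prefix of ABRACADABRA that is also a suffix (A, ABRA and the whole word). A minimal Python sketch of that bookkeeping follows; the function name `expected_waiting_time` is an illustrative assumption, and the code only evaluates the heuristic total fortune, so it does not replace the optional-stopping argument.

```python
def expected_waiting_time(pattern, alphabet_size):
    """Sum alphabet_size**k over every k for which the first k letters of
    `pattern` coincide with its last k letters (k = len(pattern) included).
    Each such k is a gambler still betting when the pattern is completed."""
    n = len(pattern)
    return sum(alphabet_size ** k
               for k in range(1, n + 1)
               if pattern[:k] == pattern[n - k:])
```

The same function applied to a fair coin gives the familiar values for the patterns HH and HT, which is a quick way to see that self-overlap lengthens the expected wait.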
E10.7. Gambler's Ruin
Suppose that $X_1, X_2, \ldots$ are IID RVs with
$$P[X = +1] = p, \qquad P[X = -1] = q, \qquad \text{where } 0 < p = 1 - q < 1,$$
and $p \ne q$. Suppose that $a$ and $b$ are integers with $0 < a < b$. Define
$$S_n := a + X_1 + \cdots + X_n, \qquad T := \inf\{n : S_n = 0 \text{ or } S_n = b\}.$$
Let $\mathcal{F}_n = \sigma(X_1, \ldots, X_n)$ $(\mathcal{F}_0 = \{\emptyset, \Omega\})$. Explain why $T$ satisfies the conditions in Question E10.5. Prove that
$$M_n := \left(\frac{q}{p}\right)^{S_n} \quad \text{and} \quad N_n := S_n - n(p - q)$$
define martingales $M$ and $N$. Deduce the values of $P(S_T = 0)$ and $E(S_T)$.
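One way to gain confidence in a value of $P(S_T = 0)$ obtained from the martingale $M$ in E10.7 is to check that it satisfies the one-step equations of the walk. The Python sketch below is a hedged illustration: the closed form in `ruin_probability` is the candidate that results from assuming $E(M_T) = M_0$, and both function names are hypothetical. Exact rational arithmetic is used so that the check is an identity rather than an approximation.

```python
from fractions import Fraction

def ruin_probability(a, b, p):
    """Candidate value of P(S_T = 0) obtained from E(M_T) = M_0, where
    M_n = (q/p)**S_n, S_0 = a, and T is the first hit of 0 or b."""
    q = 1 - p
    rho = q / p
    return (rho ** a - rho ** b) / (1 - rho ** b)

def satisfies_one_step_equations(b, p):
    """Check h(x) = p h(x+1) + q h(x-1) for 0 < x < b, with h(0) = 1
    and h(b) = 0, where h(x) is the candidate ruin probability from x."""
    q = 1 - p
    h = [ruin_probability(x, b, p) for x in range(b + 1)]
    interior = all(h[x] == p * h[x + 1] + q * h[x - 1] for x in range(1, b))
    return interior and h[0] == 1 and h[b] == 0
```

Pass `p` as a `Fraction` (for example `Fraction(2, 3)`) to keep the comparison exact; the formula requires $p \ne q$, as in the exercise.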
E10.8. Bayes' urn
A random number $\Theta$ is chosen uniformly between 0 and 1, and a coin with probability $\Theta$ of heads is minted. The coin is tossed repeatedly. Let $B_n$ be the number of heads in $n$ tosses. Prove that $(B_n)$ has exactly the same probabilistic structure as the $(B_n)$ sequence in (E10.1) on Pólya's urn. Prove that $N_n^{\theta}$ is a regular conditional pdf of $\Theta$ given $B_1, B_2, \ldots, B_n$.
(Continued at E18.5.)
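E10.9. Show that if $X$ is a non-negative supermartingale and $T$ is a stopping time, then
$$E(X_T; T < \infty) \le E(X_0).$$
(Hint. Recall Fatou's Lemma.) Deduce that
$$c\,P(\sup_n X_n \ge c) \le E(X_0).$$

E10.10*. The 'Star-ship Enterprise' Problem
The control system on the star-ship Enterprise has gone wonky. All that one can do is to set a distance to be travelled. The spaceship will then move that distance in a randomly chosen direction, then stop. The object is to get into the Solar System, a ball of radius $r$. Initially, the Enterprise is at a distance $R_0$ $(> r)$ from the Sun.
Let $R_n$ be the distance from Sun to Enterprise after $n$ 'space-hops'. Use Gauss's theorems on potentials due to spherically-symmetric charge distributions to show that whatever strategy is adopted, $1/R_n$ is a supermartingale, and that for any strategy which always sets a distance no greater than that from Sun to Enterprise, $1/R_n$ is a martingale. Use (E10.9) to show that
$$P[\text{Enterprise gets into Solar System}] \le r/R_0.$$
For each $\varepsilon > 0$, you can choose a strategy which makes this probability greater than $(r/R_0) - \varepsilon$. What kind of strategy?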
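E10.11*. Star Trek, 2. 'Captain's Log ...
Mr Spock and Chief Engineer Scott have modified the control system so that the Enterprise is confined to move for ever in a fixed plane passing through the Sun. However, the next 'hop-length' is now automatically set to be the current distance to the Sun ('next' and 'current' being updated in the obvious way). Spock is muttering something about logarithms and random walks, but I wonder whether it is (almost) certain that we will get into the Solar System sometime ...'
Hint. Let $X_n := \log R_n - \log R_{n-1}$. Prove that $X_1, X_2, \ldots$ is an IID sequence of variables each of mean 0 and finite variance $\sigma^2$ (say), where $\sigma > 0$. Let
$$S_n := X_1 + X_2 + \cdots + X_n.$$
Prove that if $\alpha$ is a fixed positive number, then
$$P[\inf_n S_n = -\infty] \ge P[S_n \le -\alpha\sigma\sqrt{n}, \text{ i.o.}] \ge \Phi(-\alpha) > 0.$$
(Use the Central Limit Theorem.) Prove that the event $\{\inf_n S_n = -\infty\}$ is in the tail $\sigma$-algebra of the $(X_n)$ sequence.

E12.1. Branching Process
A branching process $Z = \{Z_n : n \ge 0\}$ is constructed in the usual way. Thus, a family $\{X_k^{(n)} : n, k \ge 1\}$ of IID $\mathbf{Z}^+$-valued random variables is supposed given. We define $Z_0 := 1$ and then define recursively:
$$Z_{n+1} := X_1^{(n+1)} + \cdots + X_{Z_n}^{(n+1)} \qquad (n \ge 0).$$
Assume that if $X$ denotes any one of the $X_k^{(n)}$, then
$$\mu := E(X) < \infty \quad \text{and} \quad 0 < \sigma^2 := \mathrm{Var}(X) < \infty.$$
Prove that $M_n := Z_n/\mu^n$ defines a martingale $M$ relative to the filtration $\mathcal{F}_n = \sigma(Z_0, Z_1, \ldots, Z_n)$. Show that
$$E(Z_{n+1}^2 \mid \mathcal{F}_n) = \mu^2 Z_n^2 + \sigma^2 Z_n$$
and deduce that $M$ is bounded in $\mathcal{L}^2$ if and only if $\mu > 1$. Show that when $\mu > 1$,
$$\mathrm{Var}(M_\infty) = \sigma^2 \{\mu(\mu - 1)\}^{-1}.$$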
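The second-moment recursion in E12.1 can be iterated exactly to see how $\mathrm{Var}(M_n)$ behaves as $n$ grows. The Python sketch below is a hedged illustration with a hypothetical function name; it assumes the identities $E(Z_n) = \mu^n$ and $E(Z_{k+1}^2) = \mu^2 E(Z_k^2) + \sigma^2 E(Z_k)$, which follow from taking expectations in the conditional statements of the exercise.

```python
from fractions import Fraction

def variance_of_M(n, mu, sigma2):
    """Var(M_n) for M_n = Z_n / mu**n with Z_0 = 1, computed by iterating
    E(Z_{k+1}^2) = mu^2 E(Z_k^2) + sigma^2 E(Z_k) and E(Z_{k+1}) = mu E(Z_k)."""
    second_moment = Fraction(1)   # E(Z_0^2)
    mean = Fraction(1)            # E(Z_0)
    for _ in range(n):
        second_moment = mu * mu * second_moment + sigma2 * mean
        mean = mu * mean
    # E(M_n) = 1, so Var(M_n) = E(Z_n^2) / mu^(2n) - 1.
    return second_moment / mean ** 2 - 1
```

With `mu > 1` the values increase towards a finite limit, while with `mu = 1` they grow linearly in `n`, which matches the dichotomy the exercise asks for.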
E12.2. Use of Kronecker's Lemma
Let $E_1, E_2, \ldots$ be independent events with $P(E_n) = 1/n$. Let $Y_i = I_{E_i}$. Prove that $\sum (Y_k - \frac{1}{k})/\log k$ converges a.s., and use Kronecker's Lemma to deduce that
$$\frac{N_n}{\log n} \to 1, \quad \text{a.s.},$$
where $N_n := Y_1 + \cdots + Y_n$. An interesting application is to E4.3, when $N_n$ becomes the number of records by time $n$.

E12.3. Star Trek, 3
Prove that if the strategy in E10.11 is (in the obvious sense) employed in $\mathbf{R}^3$ rather than in $\mathbf{R}^2$, then
$$\sum R_n^{-2} < \infty, \quad \text{a.s.},$$
where $R_n$ is the distance from the Enterprise to the Sun at time $n$.
Note. It should be obvious which result plays the key role here, but you should try to make your argument fully rigorous.
Uniform Integrability

E13.1. Prove that a class $\mathcal{C}$ of RVs is UI if and only if both of the following conditions (i) and (ii) hold:
(i) $\mathcal{C}$ is bounded in $\mathcal{L}^1$, so that $A := \sup\{E(|X|) : X \in \mathcal{C}\} < \infty$,
(ii) for every $\varepsilon > 0$, $\exists \delta > 0$ such that if $F \in \mathcal{F}$, $P(F) < \delta$ and $X \in \mathcal{C}$, then $E(|X|; F) < \varepsilon$.
Hint for 'if'. For $X \in \mathcal{C}$, $P(|X| > K) \le K^{-1} A$.
Hint for 'only if'. $E(|X|; F) \le E(|X|; |X| > K) + K P(F)$.

E13.2. Prove that if $\mathcal{C}$ and $\mathcal{D}$ are UI classes of RVs, and if we define
$$\mathcal{C} + \mathcal{D} := \{X + Y : X \in \mathcal{C},\ Y \in \mathcal{D}\},$$
then $\mathcal{C} + \mathcal{D}$ is UI. Hint. One way to prove this is to use E13.1.

E13.3. Let $\mathcal{C}$ be a UI family of RVs. Say that $Y \in \mathcal{D}$ if for some $X \in \mathcal{C}$ and some sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$, we have $Y = E(X|\mathcal{G})$, a.s. Prove that $\mathcal{D}$ is UI.
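E14.1. Hunt's Lemma
Suppose that $(X_n)$ is a sequence of RVs such that $X := \lim X_n$ exists a.s. and that $(X_n)$ is dominated by $Y$ in $(\mathcal{L}^1)^+$:
$$|X_n(\omega)| \le Y(\omega), \quad \forall (n, \omega), \quad \text{and} \quad E(Y) < \infty.$$
Let $\{\mathcal{F}_n\}$ be any filtration. Prove that
$$E(X_n \mid \mathcal{F}_n) \to E(X \mid \mathcal{F}_\infty) \quad \text{a.s.}$$
Hint. Let $Z_m := \sup_{r \ge m} |X_r - X|$. Prove that $Z_m \to 0$ a.s. and in $\mathcal{L}^1$. Prove that for $n \ge m$, we have, almost surely,
$$|E(X_n \mid \mathcal{F}_n) - E(X \mid \mathcal{F}_\infty)| \le |E(X \mid \mathcal{F}_n) - E(X \mid \mathcal{F}_\infty)| + E(Z_m \mid \mathcal{F}_n).$$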
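E14.2. Azuma-Hoeffding Inequality
(a) Show that if $Y$ is a RV with values in $[-c, c]$ and with $E(Y) = 0$, then, for $\theta \in \mathbf{R}$,
$$E e^{\theta Y} \le \cosh \theta c \le \exp(\tfrac12 \theta^2 c^2).$$
(b) Prove that if $M$ is a martingale null at 0 such that for some sequence $(c_n : n \in \mathbf{N})$ of positive constants,
$$|M_n - M_{n-1}| \le c_n, \quad \forall n,$$
then, for $x > 0$,
$$P\Bigl(\sup_{k \le n} M_k \ge x\Bigr) \le \exp\Bigl(-\tfrac12 x^2 \Big/ \sum_{k=1}^{n} c_k^2\Bigr).$$
Hint for (a). Let $f(z) := \exp(\theta z)$, $z \in [-c, c]$. Then, since $f$ is convex,
$$f(y) \le \frac{c - y}{2c} f(-c) + \frac{c + y}{2c} f(c).$$
Hint for (b). See the proof of (14.7,a).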
Characteristic Functions
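E16.1. Prove that
$$\lim_{T \uparrow \infty} \int_0^T x^{-1} \sin x\,dx = \pi/2$$
by integrating $\int z^{-1} e^{iz}\,dz$ around the contour formed by the 'upper' semicircles of radii $\varepsilon$ and $T$ and the intervals $[-T, -\varepsilon]$ and $[\varepsilon, T]$.

E16.2. Prove that if $Z$ has the U[-1, 1] distribution, then
$$\varphi_Z(\theta) = (\sin\theta)/\theta,$$
and prove that there do not exist IID RVs $X$ and $Y$ such that $X - Y$ has the U[-1, 1] distribution.

E16.3. Suppose that $X$ has the Cauchy distribution, and let $\theta > 0$. By integrating $e^{i\theta z}/\{\pi(1 + z^2)\}$ around the semicircle formed by $[-R, R]$ together with the 'upper' semicircle centre 0 of radius $R$, prove that $\varphi_X(\theta) = e^{-\theta}$. Show that $\varphi_X(\theta) = e^{-|\theta|}$ for all $\theta$. Prove that if $X_1, X_2, \ldots, X_n$ are IID RVs each with the standard Cauchy distribution, then $(X_1 + \cdots + X_n)/n$ also has the standard Cauchy distribution.

E16.4. Suppose that $X$ has the standard normal N(0,1) distribution. Let $\theta > 0$. Consider $\int (2\pi)^{-1/2} \exp(-\tfrac12 z^2)\,dz$ around the rectangular contour
$$(-R - i\theta) \to (R - i\theta) \to R \to (-R) \to (-R - i\theta),$$
and prove that $\varphi_X(\theta) = \exp(-\tfrac12\theta^2)$.

E16.5. Prove that if $\varphi$ is the CF of a RV $X$, then $\varphi$ is non-negative definite in that for complex $c_1, c_2, \ldots, c_n$ and real $\theta_1, \theta_2, \ldots, \theta_n$,
$$\sum_j \sum_k c_j \bar{c}_k\, \varphi(\theta_j - \theta_k) \ge 0.$$
(Hint. Express LHS as the expectation of a non-negative quantity.) Bochner's Theorem says that $\varphi$ is a characteristic function if and only if $\varphi(0) = 1$, $\varphi$ is continuous, and $\varphi$ is non-negative definite. (It is of course understood that here $\varphi : \mathbf{R} \to \mathbf{C}$.)

E16.6. (a) Let $(\Omega, \mathcal{F}, P) = ([0,1], \mathcal{B}[0,1], \mathrm{Leb})$. What is the distribution of the RV $Z$, where $Z(\omega) := 2\omega - 1$? Let $\omega = \sum 2^{-n} R_n(\omega)$ be the binary expansion of $\omega$. Let
$$U(\omega) = \sum_{\text{odd } n} 2^{-n} Q_n(\omega), \quad \text{where } Q_n(\omega) = 2R_n(\omega) - 1.$$
Find a random variable $V$ independent of $U$ such that $U$ and $V$ are identically distributed and $U + \tfrac12 V$ is uniformly distributed on $[-1, 1]$.
(b) Now suppose that (on some probability triple) $X$ and $Y$ are IID RVs such that $X + \tfrac12 Y$ is uniformly distributed on $[-1, 1]$.
Let $\varphi$ be the CF of $X$. Calculate $\varphi(\theta)/\varphi(\tfrac14\theta)$. Show that the distribution of $X$ must be the same as that of $U$ in part (a), and deduce that there exists a set $F \in \mathcal{B}[-1, 1]$ such that $\mathrm{Leb}(F) = 0$ and $P(X \in F) = 1$.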
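E18.1. (a) Suppose that $\lambda > 0$ and that (for $n > \lambda$) $F_n$ is the DF associated with the Binomial distribution $B(n, \lambda/n)$. Prove (using CFs) that $F_n$ converges weakly to $F$ where $F$ is the DF of the Poisson distribution with parameter $\lambda$.
(b) Suppose that $X_1, X_2, \ldots$ are IID RVs each with the density function $(1 - \cos x)/\pi x^2$ on $\mathbf{R}$. Prove that for $x \in \mathbf{R}$,
$$\lim_{n \to \infty} P\left(\frac{X_1 + X_2 + \cdots + X_n}{n} \le x\right) = \tfrac12 + \pi^{-1} \arctan x,$$
where $\arctan \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2})$.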
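E18.2. Prove the Weak Law of Large Numbers in the following form. Suppose that $X_1, X_2, \ldots$ are IID RVs, each with the same distribution as $X$. Suppose that $X \in \mathcal{L}^1$ and that $E(X) = \mu$. Prove by the use of CFs that the distribution of
$$A_n := n^{-1}(X_1 + \cdots + X_n)$$
converges weakly to the unit mass at $\mu$. Deduce that
$$A_n \to \mu \quad \text{in probability.}$$
Of course, SLLN implies this Weak Law.

Weak Convergence for Prob[0,1]

E18.3. Let $X$ and $Y$ be RVs taking values in $[0,1]$. Suppose that
$$E(X^k) = E(Y^k), \quad k = 0, 1, 2, \ldots.$$
Prove that
(i) $E\,p(X) = E\,p(Y)$ for every polynomial $p$,
(ii) $E f(X) = E f(Y)$ for every continuous function $f$ on $[0,1]$,
(iii) $P(X \le x) = P(Y \le x)$ for every $x$ in $[0,1]$.
Hint for (ii). Use the Weierstrass Theorem 7.4.

E18.4. Suppose that $(F_n)$ is a sequence of DFs with
$$F_n(x) = 0 \text{ for } x < 0, \qquad F_n(1) = 1, \qquad \text{for every } n.$$
Suppose that
(*) $m_k := \lim_n \int_{[0,1]} x^k\,dF_n$ exists for $k = 0, 1, 2, \ldots$
Use the Helly-Bray Lemma and E18.3 to show that $F_n$ converges weakly to $F$, where $F$ is characterized by $\int_{[0,1]} x^k\,dF = m_k$, $\forall k$.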
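E18.5. Moment Inversion Formula
Let $F$ be a distribution function with $F(0-) = 0$ and $F(1) = 1$. Let $\mu$ be the associated law, and define
$$m_k := \int_{[0,1]} x^k\,dF(x).$$
Define
$$\Omega = [0,1] \times [0,1]^{\mathbf{N}}, \quad \mathcal{F} = \mathcal{B} \times \mathcal{B}^{\mathbf{N}}, \quad P = \mu \times \mathrm{Leb}^{\mathbf{N}},$$
$$\Theta(\omega) = \omega_0, \qquad H_k(\omega) = I_{[0, \omega_0]}(\omega_k).$$
This models the situation in which $\Theta$ is chosen with law $\mu$, a coin with probability $\Theta$ of heads is then minted, and tossed at times 1, 2, .... See E10.8. The RV $H_k$ is 1 if the $k$th toss produces heads, 0 otherwise. Define
$$S_n := H_1 + H_2 + \cdots + H_n.$$
By the Strong Law and Fubini's Theorem,
$$S_n/n \to \Theta, \quad \text{a.s.}$$
Define a map $D$ on the space of real sequences $(a_n : n \in \mathbf{Z}^+)$ by setting
$$Da = (a_n - a_{n+1} : n \in \mathbf{Z}^+).$$
Prove that
(*) $F_n(x) := \sum_{i \le nx} \binom{n}{i} (D^{n-i} m)_i \to F(x)$
at every point $x$ of continuity of $F$.

E18.6*. Moment Problem
Prove that if $(m_k : k \in \mathbf{Z}^+)$ is a sequence of numbers in $[0,1]$, then there exists a RV $X$ with values in $[0,1]$ such that $E(X^k) = m_k$ if and only if $m_0 = 1$ and
$$(D^r m)_s \ge 0 \qquad (r, s \in \mathbf{Z}^+).$$
Hint. Define $F_n$ via E18.5(*), and then verify that E18.4(*) holds. You can show that the moments $m_{n,k}$ of $F_n$ satisfy
$$m_{n,0} = 1, \quad m_{n,1} = m_1, \quad m_{n,2} = m_2 + n^{-1}(m_1 - m_2), \quad \text{etc.}$$
You discover the algebra!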
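The inversion formula E18.5(*), $F_n(x) = \sum_{i \le nx} \binom{n}{i} (D^{n-i} m)_i$ with $(Da)_n = a_n - a_{n+1}$, can be evaluated directly from a finite list of moments. The Python sketch below is a hedged illustration: the function names are hypothetical, and the uniform law on $[0,1]$ with $m_k = 1/(k+1)$ is used only as an assumed test case.

```python
from fractions import Fraction
from math import comb, floor

def difference(seq):
    """(Da)_n = a_n - a_{n+1}; the result is one term shorter than seq."""
    return [seq[i] - seq[i + 1] for i in range(len(seq) - 1)]

def inversion_df(moments, n, x):
    """F_n(x) = sum over i <= n x of C(n, i) (D^{n-i} m)_i, computed from
    the moments m_0, ..., m_n supplied at the start of `moments`."""
    powers = [list(moments[: n + 1])]      # powers[r] holds D^r m, truncated
    for _ in range(n):
        powers.append(difference(powers[-1]))
    top = min(n, floor(n * x))
    return sum(comb(n, i) * powers[n - i][i] for i in range(top + 1))
```

For the uniform test case every term $\binom{n}{i}(D^{n-i}m)_i$ equals $1/(n+1)$, so `inversion_df(moments, n, x)` is a step function that approaches $x$ as `n` grows; this mirrors the fact in E10.8 that $B_n$ is then uniform on $\{0, \ldots, n\}$.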
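Weak Convergence for Prob[0, $\infty$)

E18.7. Using Laplace transforms instead of CFs
Suppose that $F$ and $G$ are DFs on $\mathbf{R}$ such that $F(0-) = G(0-) = 0$, and
$$\int_{[0,\infty)} e^{-\lambda x}\,dF(x) = \int_{[0,\infty)} e^{-\lambda x}\,dG(x), \quad \forall \lambda \ge 0.$$
Note that the integral on the LHS has a contribution $F(0)$ from $\{0\}$. Prove that $F = G$. [Hint. One could derive this from the idea in E7.1. However, it is easier to use E18.3, because we know that if $X$ has DF $F$ and $Y$ has DF $G$, then
$$E[(e^{-X})^n] = E[(e^{-Y})^n], \quad n = 0, 1, 2, \ldots.]$$
Suppose that $(F_n)$ is a sequence of distribution functions on $\mathbf{R}$ each with $F_n(0-) = 0$ and such that
$$L(\lambda) := \lim_n \int e^{-\lambda x}\,dF_n(x)$$
exists for $\lambda \ge 0$ and that $L$ is continuous at 0. Prove that $(F_n)$ is tight and that $F_n$ converges weakly to $F$, where
$$\int e^{-\lambda x}\,dF(x) = L(\lambda), \quad \forall \lambda \ge 0.$$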
EA13.1. Modes of convergence
(a) Prove that $(X_n \to X, \text{ a.s.}) \Rightarrow (X_n \to X \text{ in prob})$.
Hint. See Section 13.5.
(b) Prove that $(X_n \to X \text{ in prob}) \not\Rightarrow (X_n \to X, \text{ a.s.})$.
Hint. Let $X_n = I_{E_n}$, where $E_1, E_2, \ldots$ are independent events.
(c) Prove that if $\sum P(|X_n - X| > \varepsilon) < \infty$, $\forall \varepsilon > 0$, then $X_n \to X$, a.s.
Hint. Show that the set $\{\omega : X_n(\omega) \not\to X(\omega)\}$ may be written
$$\bigcup_{k \in \mathbf{N}} \{\omega : |X_n(\omega) - X(\omega)| > k^{-1} \text{ for infinitely many } n\}.$$
(d) Suppose that $X_n \to X$ in probability. Prove that there is a subsequence $(X_{n_k})$ of $(X_n)$ such that $X_{n_k} \to X$, a.s.
Hint. Combine (c) with the 'diagonal principle'.
(e) Deduce from (a) and (d) that $X_n \to X$ in probability if and only if every subsequence of $(X_n)$ contains a further subsequence which converges a.s. to $X$.

EA13.2. Recall that if $\xi$ is a random variable with the standard normal N(0,1) distribution, then
$$E e^{\lambda\xi} = \exp(\tfrac12\lambda^2).$$
Suppose that $\xi_1, \xi_2, \ldots$ are IID RVs each with the N(0,1) distribution. Let $S_n = \sum_{k=1}^{n} \xi_k$, let $a, b \in \mathbf{R}$, and define
$$X_n = \exp(aS_n - bn).$$
Prove that
$$(X_n \to 0, \text{ a.s.}) \iff (b > 0),$$
but that for $r \ge 1$,
$$(X_n \to 0 \text{ in } \mathcal{L}^r) \iff (r < 2b/a^2).$$
References
Aldous,
D. (1989),
K.B.
Probability New
Approximations York.
via
the
Poisson
Clumping
New
Heuristic^ Springer,
Athreya,
and Ney, P.
(1972), BranchingProcesses^
of
Springer,
York,
Berlin.
Billingsley, P.
York.
(1968), Convergence
and
Probability
Measures^
Wiley,
New
Measure^
Wiley,
Chichester,
New York
B.
graphs.Coll
(1987),
Math.
Martingales,
Soc.
J.
and random
Probability^ Addison-Wesley, Breiman, L. (1968), Chow, Y.-S. and Teicher,H. (1978), Probability Inter
Mass..
Independence,
changeability,
Martingales,
Course
Springer,
in
Berlin.
Brace and
with
Wold,
Probability,
New
York.
Davis,
M.H.A.
transactioncosts.
and Norman,
Maths, and
selection
of Operation
Davis,
M.H.A.
Vintner,
R.B.
Research (to
Control,
Chapman
and
Hall, London.
Meyer,
Dellacherie, C. and
and
P.-A.
Probabilites
et Potential,
Chaps.
D.W.
(1989),
Large Deviations,
New York.
Academic
J.L.(1953), Doob,
Stochastic
Processes,
Wiley,
Classical
Doob,J.L.(1981),
Counterpart,Springer,
Potential
Theory
and its
Probabilistic
Part
General
New
York.
Operators:
I^
Dunford,
N. and
Theory,
Interscience,
New York.
Motion
and
Martingales
in Analysis,
Integrals,
Wads-
H.
and
McKean,
H.P. (1972),
Academic
Press,
Deviations, Large Ellis, R.S. (1985),Entropy, Springer, New York, Berlin. Markov S.N. and Kurtz, T.G. (1986), Ethier, and Convergence, Wiley, New York.
Mechanics,
Processes:
Characterization
Feller, W.
(1957),
2nd
Introduction
edn.,
Vol.1,
Wiley,
Applications,
San
Freedman, D.
Francisco.
(1971), Brownian
Martingale Reading,
and
Holden-Day,
Garsia, A.
Progress,
(1973), Benjamin,
Inequalities: Mass.
Seminar
Notes on Recent
New
and
Grimmett,
G.R. (1989),
Oxford
Percolation Theory,
Press.
to
Springer,
York,
Berlin.
Grimmett,
Processes,
Random
Theory
of Coverage
Processes,
and
Wiley,
York.
Hall,
P. and
Limit
Theory
its
Academic Application,
New York.
Van Nostrand, Halmos, P.J. (1959),Measure Theory, Princeton, NJ. Proc. Fifth Berkeley Hammersley, J.M. (1966),Harnesses, Symp. Statist, and Prob., Vol.Ill,89-117, of California Press. University
Math.
Harris,
The
Theory
of
Branching
Processes,
Springer, New
T.
(1949),
S.E. New
(Translation
Karatzas,I.
Calculus,
and
Schreve,
Springer,
(1988), York.
Brownian
Stochastic
Karlin,
S. and Taylor,
Academic
H.M. (1981),A
New York.
Second
Course
in Stochastic
Processes,
Press,
Branching Kendall, D.G. (1966), Soc, 41, 385-406. Kendall,
since
processes
1873,
J. London
Math.
D.G.
before
Kingman, Probability,
(1975),
after)
The
1873,
(and
genealogy of genealogy: Branching processes Bull. London Math. Soc. 7, 225-53. S.J. (1966),
Press.
Cambridge
J.F.C.
and
Taylor,
Introduction
to
Measure
and
Cambridge
University
Korner, Laha, R.
T.W. (1988),
Fourier Analysis,
University
Press.
New York. and Rohatgi, V. (1979),Probability Theory, Wiley, 2nd Griffin, London. edn.. Functions, Lukacs, E. (1970), Characteristic Blaisand Potential (English Meyer, P.-A. (1966),Probability translation), Mass. Waltham, dell, J. (1965), Mathematical Foundation of the Calculus of Probability Neveu, San Francisco. (translated from the French),Holden-Day, Neveu,
J. (1975),
Discrete-parameter Martingales,North-Holland,
(1967), York.
Amsterdam.
Parthasarathy,
Academic
K.R.
Press,
Probability
Measures on
Diffusions^
Chichester,
Metric Spaces,
and
New
Markov Processes^
New
York.
Course
in Probability,
Wiley, to
Processes, Introduction
Stroock, D.W.
(1984),An
York,
the
Theory of
Large Deviations,
Springer,
Varadhan,
Philadelphia.
New
Berlin.
S.R.S. (1984),
S.
Wagon,
(1985),
Vol.
The Banach-Tarski
24,
Mathematics,
Cambridge
University
Optimal
Whittle, P.
York.
Williams, Analysis,
(1990),Risk-sensitive
(1973), D.G.
Wiley,
Chichester,
New
D. eds.
Some Kendall
basic theorems on harnesses,in Stochastic and E.F. Harding, Wiley, New York, pp.349-66.
Index
Notation
on
pages
xiv-xv.)
(4.9, E10.6).
decomposition
(12.11).
<T-algebra
(1.1).
algebra of
sets (1.1).
=
almost
everywhere of
a.e. (9.1,
(1.5); 14.13);
almost
surely =
a.s. (2.4).
function (16.5).
atoms:
Azuma-Hoeffding
<T-algebra
of distribution
inequality
(E14.2).
Baire
category
theorem (A1.12).
Banach-Tarskiparadox(1.0).
Bayes'
formula
(15.7-15.9).
Bellman
Optimality
option-pricing
Principle (E10.2,
formula
15.3).
Black-Scholes
Blackwell's
(15.2).
Markov
Lemmas:
First
= BCl
BC2(4.3);
Levy's
extension
of (12.15).
Bounded
branching
Convergence Theorem
process
= BDD (6.2,13.6).
(Chapter
0, E12.1).
Burkholder-Davis-Gundy
inequality (14.18).
Caratheodory's
Lemma
(A1.7).
Caratheodory's
Central
Theorem: statement
Theorem
(1.7); proof
(Al.8).
Limit
(18.4).
characteristic
convergence theorem
functions:
definition
(16.1);
inversion
formula
(16.6);
(18.1).
Chebyshev's
inequality
(7.3).
expectation
(Chapter
9):
properties
(9.7).
of
Likelihood-Ratio
of
Test
(14.17).
expectation
contraction property
convergence
conditional
(9.7,h).
in probability
for
(13.5, A13.2).
integrals: UI
MON
(5.3);
Fatou (5.4);
for
DOM (5.9);
(14.1);
for
RVs
(13.7).
(11.5);
UI case
c?-system
differentiation
distribution
under integral
function
sign (A16.1).
DOM
for
RV (3.10-3.11).
(5.9);
conditional
(9.7,g).
StoppingTheorem Submartingale (10.10, A14.3); Lemma (11.2) - and much else! Upcrossing
Downward
C^ Decomposition Convergence Theorem (11.5); (12.11); Theorem Optional Sampling (14.11); (A14.3-14.4); Optional
Inequality
(14.6);
Theorem
(14.4).
Dynkin's
Lemma
(A1.3).
(4.1).
(6.1):
conditional
(Chapter 9).
(1.6);
'extensionof
measures':
uniqueness
existence
(1.7).
probability
extinction
fair
(0.4).
game
(10.5):
unfavourable (E4.7).
sets
(2.6,b),
2.7,c);
for functions
(5.4); conditional
(9.7,f).
filtered
space,
filtration
(10.1).
filtering (15.6-15.9).
finite
and
<T-finite
measures
(1.5).
supermartingales
(11.5).
Theorem
(8.2).
gamblingstrategy
Hardy
(10.6).
space
WJ (14.18).
harnesses (15.10-15.12).
hedgingstrategy
Helly-Bray
(15.2).
Lemma
(17.4).
hitting
times (10.12).
(E14.2).
inequality Hoeffding's
Holder's
inequality
(6.13).
Hunt's Lemma
(E14.1).
definitions
(4.1);
7r>system
criterion
(4.2); and
conditioning
and
product
measure
(8.4).
(14.18);
inequalities:
Chebyshev
Azuma-HoefFding(E14.2);Burkholder-Davis-Gundy
(7.3);
Doob's
(6.4);
\302\243p (14.11);
Holder
and in conditionalform
(14.6);
infinite
(9.7,h);
Khinchine
- see
(6.13);
Jensen
(6.6),
Markov
Minkowski
(14.8); Kolmogorov
Theorem
integration
products
of probability
(14.12,
measures
Kakutani's
on
14.17).
(Chapter
5).
(9.7,h).
Jensen's
inequality
{6.6)]
conditional
form
Kakutani's
Kalman-Bucy
(15.6-15.9).
A.N.
KOLMOGOROV's
Inequality
Law
Truncation
of
Definition of ConditionalExpectation (9.2); of the Iterated Logarithm (A4.1, 14.7); Strong (14.6); Theorem (12.5); Large Numbers (12.10, 14.5); Three-Series Zero-One Lemma (0-1) Law (4.11, 14.3). (12.9);
Law
Kronecker's Lemma(12.7).
Laplace law
transforms: of
inversion
(E7.1);
and weak
convergence (E18.7).
random
variable
(3.9): joint
laws (8.3).
integral
(Chapter
5).
1.9).
Lebesgue
Lebesgue
measure =
spaces
Leb (1.8,A
(6.10).
L^ \302\243p,
P.LEVY's
Convergence Theorem for CFs (18.1);Downward Lemmas martingales (14.4); Extension of Borel-Cantelli Inversion for CFs (16.6); Upward Theorem for formula
(14.2).
for
Likelihood-Ratio
sheep
Test, consistency
(15.3-15.5).
of (14.17).
Mabinogion
Markov
chain (4.8,
10.13).
(6.4).
Markov'sinequality
martingale
(Chapters
Optional-Stopping
10-15!):
(11.5);
martingale
definition (10.3);
Theorem
Theorem Convergence
A14.3);
(10.9-10.10,
Optional-
Sampling Theorem
transform
(Chapter A14).
(10.6)
measurable
function
(3.1).
inequality
(6.14).
Moment
Problem
(E18.6).
monkey
typing
Shakespeare
(4.9).
Monotone-Class
Theorem: Monotone-Convergence
Chapter
sets
(1.10);
for functions
(5.3,
A5);
conditional
version
(9.7,e).
convergence.
see
weak
pricing
(15.2).
Optional-Sampling
Optional-Stopping
(10.9-10.10,
time
A14.3).
optional
orthogonal
outer
time - seestopping
projection
(A1.6).
(10.8).
(6.11): and
measures
TT-system
(1.6):
urn
Uniqueness Lemmas
E10.8).
Polya's
previsible
probability
(ElO.l,
(= predictable)
density measure
process (10.6).
= pdf
function (1.5).
(6.12); joint
(8.3).
probability
probability
triple
(2.1).
Theorem
(6.9).
Radon-Nikodym
theorem
(5.14, 14.13-14.14).
walk:
hitting
times
(10.12, E10.7);
conditional
probability
(9.9).
Riemann
integral (5.3).
samplepath (4.8)
sample
sample
point
(2.1)
space
(2.1)
Star
Trek
problems
(ElO.lO,
ElO.ll,
E12.3).
stopped process
(10.9).
(A14.1).
Strassen's
Law of
Laws
(A4.2).
Strong
(7.2,
12.14,
14.5).
submartingales
theorem
(11.5); optional
functions
(A14.4).
superharmonic
Markov
chains (10.13).
symmetrization technique(12.4).
cf-system
(A1.2);
7r-system
(1.6).
14.3).
(7.3).
Chebyshev
Three-Series
Theorem (12.5).
tightness (17.5).
Tower
Property
(9.7,i).
Truncation
Lemma
(12.9).
uniform integrability
Upcrossing
(Chapter 13).
Lemma
(11.1-11.2).
Uniqueness
Lemma
(1.6).
functions
(E18.7).
(18.1);
and
approximation
(4.11,
theorem
14.3).
(7.4).