Discrete Time
Semyon Malamud
October 24, 2016
Contents
1 Densities, Expectations, and Moment Generating Functions
1.1 Uniform distribution
1.2 Poisson Distribution
1.3 Binomial Distribution
1.4 Normal Distribution
1.5 Cauchy distribution
1.6 Γ-distribution
1.7 Exponential Distribution
2 Conditional Densities
3 Conditional Expectation
4 Markov Processes
4.1 Transition probabilities and Kolmogorov Equations
4.2 Hitting Times for a Markov Chain
2. If X_1 and X_2 are independent with the densities p_{X_1}(x) and p_{X_2}(x), then Y = X_1 + X_2 has the density

p_Y(y) = \int_R p_{X_1}(y - x) p_{X_2}(x) dx = \int_R p_{X_2}(y - x) p_{X_1}(x) dx,

and the moment-generating functions multiply: if M_{X_1}(\alpha) = E[e^{\alpha X_1}] and M_{X_2}(\alpha) = E[e^{\alpha X_2}] are the moment-generating functions of the independent variables X_1 and X_2, then

M_{X_1 + X_2}(\alpha) = E[e^{\alpha(X_1 + X_2)}] = E[e^{\alpha X_1}] E[e^{\alpha X_2}] = M_{X_1}(\alpha) M_{X_2}(\alpha).    (1.1)
The expectation is

E[f(X)] = \int f(x) p(x) dx.

The variance is

Var(X) = Cov(X, X).
A random variable X distributed uniformly on [a, b] clearly satisfies X \in [a, b] almost surely because the density is equal to zero outside of the interval. Expectations satisfy

E[f(X)] = \frac{1}{b-a} \int_a^b f(x) dx.
In particular,

E[e^{\alpha X}] = \frac{1}{b-a} \int_a^b e^{\alpha x} dx = \frac{1}{b-a} \alpha^{-1} (e^{\alpha b} - e^{\alpha a})

is the moment-generating function, and the moments can be directly calculated as

E[X^n] = \frac{1}{b-a} \int_a^b x^n dx = \frac{1}{b-a} (n+1)^{-1} (b^{n+1} - a^{n+1}).
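These closed-form expressions are easy to sanity-check numerically. The following small Python sketch (an addition to these notes; it assumes numpy is available and uses the arbitrary values a = 1, b = 3, \alpha = 0.7, n = 2) compares Monte Carlo estimates with the formulas above.

import numpy as np

rng = np.random.default_rng(0)
a, b, alpha, n = 1.0, 3.0, 0.7, 2
x = rng.uniform(a, b, size=1_000_000)

# Monte Carlo estimates of the MGF and the n-th moment
mgf_mc = np.exp(alpha * x).mean()
moment_mc = (x ** n).mean()

# Closed-form expressions from the text
mgf_exact = (np.exp(alpha * b) - np.exp(alpha * a)) / (alpha * (b - a))
moment_exact = (b ** (n + 1) - a ** (n + 1)) / ((n + 1) * (b - a))

print(mgf_mc, mgf_exact)        # the two values should agree to ~3 decimals
print(moment_mc, moment_exact)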
Now, we need to consider several cases:
(a) max{a, x - b} = a, that is x \le a + b, that is min{b, x - a} = x - a. Therefore,
min{b, x - a} - max{a, x - b} = x - 2a.
(b) max{a, x - b} = x - b, that is x \ge a + b, that is min{b, x - a} = b. Therefore,
min{b, x - a} - max{a, x - b} = 2b - x.
We conclude that the density is

p_{X_1 + X_2}(x) = \frac{1}{(b-a)^2} \begin{cases} 0, & x \le 2a \\ x - 2a, & x \in [2a, a+b] \\ 2b - x, & x \in [a+b, 2b] \\ 0, & x \ge 2b \end{cases}

See Figure 1.1.
[Figure 1.1: the triangular density of X_1 + X_2 on [2a, 2b].]
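This triangular shape is easy to reproduce by simulation; the sketch below (an illustration added to the notes, assuming numpy and the arbitrary choice a = 0, b = 1) compares a histogram of X_1 + X_2 with the piecewise-linear density above.

import numpy as np

rng = np.random.default_rng(1)
a, b, N = 0.0, 1.0, 1_000_000
s = rng.uniform(a, b, N) + rng.uniform(a, b, N)   # X_1 + X_2 for independent uniforms

# Empirical histogram vs. the piecewise-linear density derived above
hist, edges = np.histogram(s, bins=50, range=(2 * a, 2 * b), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
density = np.where(centers <= a + b, centers - 2 * a, 2 * b - centers) / (b - a) ** 2
print(np.max(np.abs(hist - density)))   # small discrepancy, shrinking with the sample size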
has the density

p(x_1, \cdots, x_n) = \frac{1}{(b_1 - a_1) \cdots (b_n - a_n)} 1_{D_n} = \frac{1}{(b_1 - a_1) \cdots (b_n - a_n)} \prod_{i=1}^n 1_{x_i \in [a_i, b_i]}.

Clearly, a random vector (X_1, \cdots, X_n) with this density has independent components. This follows from Lemma 2.1 below.
As a result,
Then

Prob(\{\xi_1 + \xi_2 = k\}) = Prob(\{\xi_1 + \xi_2 = k\} \cap \Omega)
 = Prob(\{\xi_1 + \xi_2 = k\} \cap (\sqcup_{l=0}^\infty \{\xi_1 = l\}))
 = Prob(\sqcup_{l=0}^\infty (\{\xi_1 + \xi_2 = k\} \cap \{\xi_1 = l\}))
 (disjoint sets) = \sum_{l=0}^\infty Prob(\{\xi_1 + \xi_2 = k\} \cap \{\xi_1 = l\})
 = \sum_{l=0}^\infty Prob(\{\xi_2 = k - l\} \cap \{\xi_1 = l\})
 (indep.) = \sum_{l=0}^\infty Prob(\{\xi_2 = k - l\}) Prob(\{\xi_1 = l\})
 (\xi_2 \ge 0) = \sum_{l=0}^k Prob(\{\xi_2 = k - l\}) Prob(\{\xi_1 = l\})
 = \sum_{l=0}^k e^{-\lambda_2} \frac{\lambda_2^{k-l}}{(k-l)!} e^{-\lambda_1} \frac{\lambda_1^l}{l!}
 = e^{-(\lambda_1 + \lambda_2)} \frac{1}{k!} \sum_{l=0}^k \frac{k!}{(k-l)! l!} \lambda_2^{k-l} \lambda_1^l
 = e^{-(\lambda_1 + \lambda_2)} \frac{1}{k!} (\lambda_1 + \lambda_2)^k,    (1.6)

which is the probability that a Poisson random variable with parameter \lambda_1 + \lambda_2 is equal to k. As k was arbitrary,

\xi_1 + \xi_2 \sim Poisson(\lambda_1 + \lambda_2).
Proof via the Moment Generating Function. From Problem 2 of the previous assignment, the moment generating functions of our random variables are M_{\xi_i}(s) = \exp\{\lambda_i (e^s - 1)\}, i = 1, 2, so that

M_{\xi_1 + \xi_2}(s) = \exp\{\lambda_1 (e^s - 1)\} \exp\{\lambda_2 (e^s - 1)\} = \exp\{(\lambda_1 + \lambda_2)(e^s - 1)\},

which is the moment generating function of a Poisson random variable with parameter \lambda_1 + \lambda_2. The claim then follows from the uniqueness of the moment generating function.
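Both proofs can be double-checked by simulation. The sketch below (an addition; it assumes numpy and uses the arbitrary parameters \lambda_1 = 1.5, \lambda_2 = 2.5) compares the empirical distribution of \xi_1 + \xi_2 with the Poisson(\lambda_1 + \lambda_2) probabilities.

import math
import numpy as np

rng = np.random.default_rng(2)
lam1, lam2 = 1.5, 2.5
s = rng.poisson(lam1, 1_000_000) + rng.poisson(lam2, 1_000_000)

# Empirical frequencies vs. the Poisson(lam1 + lam2) pmf
for k in range(8):
    empirical = np.mean(s == k)
    exact = math.exp(-(lam1 + lam2)) * (lam1 + lam2) ** k / math.factorial(k)
    print(k, round(empirical, 4), round(exact, 4))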
1.3 Binomial Distribution
The binomial distribution B(n, p) assigns probability \binom{n}{k} p^k (1-p)^{n-k} to a value k \in \{0, 1, \cdots, n\}.
The moment generating function of a binomial r.v. X \sim B(n, p) is given by

M(s) = E[e^{sX}]
 = \sum_{k=0}^n e^{sk} \binom{n}{k} p^k (1-p)^{n-k}
 = (1-p)^n \sum_{k=0}^n \binom{n}{k} \left( \frac{e^s p}{1-p} \right)^k    (1.7)
 = (1-p)^n \left( 1 + \frac{e^s p}{1-p} \right)^n
 = (1 - p + p e^s)^n.
It follows that

E(X) = \frac{d}{ds} M(s) \Big|_{s=0} = n (1 - p + p e^s)^{n-1} p e^s \Big|_{s=0} = np,    (1.8)

and that

E(X^2) = \frac{d^2}{ds^2} M(s) \Big|_{s=0}
 = np \left( e^s (1 - p + p e^s)^{n-1} + e^s (n-1) (1 - p + p e^s)^{n-2} p e^s \right) \Big|_{s=0}    (1.9)
 = np (1 + (n-1)p)
 = np(1-p) + (np)^2.
Finally, Var(X) = E(X^2) - (E(X))^2 = np(1-p).
It then follows that

\sum_{k=0}^n k \binom{n}{k} = \frac{d}{ds} \left( \sum_{k=0}^n e^{sk} \binom{n}{k} \right) \Big|_{s=0}
 = \frac{d}{ds} ((1 + e^s)^n) \Big|_{s=0}    (1.12)
 = n (1 + e^s)^{n-1} e^s \Big|_{s=0}
 = n 2^{n-1},

and that

\sum_{k=0}^n k^2 \binom{n}{k} = \frac{d^2}{ds^2} \left( \sum_{k=0}^n e^{sk} \binom{n}{k} \right) \Big|_{s=0}
 = \frac{d^2}{ds^2} ((1 + e^s)^n) \Big|_{s=0}
 = n \left( e^s (1 + e^s)^{n-1} + e^s (n-1) (1 + e^s)^{n-2} e^s \right) \Big|_{s=0}    (1.13)
 = n 2^{n-2} (2 + (n-1))
 = 2^{n-2} n (n+1).
Finally, we show how a binomial random variable can be constructed from "binary" ones. The moment generating function of any of the X_i's is 1 - p + p e^s, so that, the X_i being independent, the moment generating function of their sum is (1 - p + p e^s)^n, which is just the moment generating function of a binomial random variable. The uniqueness of the Laplace transform then ensures that, indeed,

\sum_{i=1}^n X_i \sim B(n, p).

Note also that E(X_i) = p and that Var(X_i) = p(1-p).
The mean and variance of the binomial r.v. are then given by

E(X) = E\left( \sum_{i=1}^n X_i \right) = \sum_{i=1}^n E(X_i) = \sum_{i=1}^n p = np, and
Var(X) = Var\left( \sum_{i=1}^n X_i \right) = \sum_{i=1}^n Var(X_i) = \sum_{i=1}^n p(1-p) = np(1-p).    (1.14)

Note that in the last line there is no covariance term because the terms of the sum are independent.
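As a quick numerical check (an addition, assuming numpy and the arbitrary values n = 20, p = 0.3), one can compare a simulated binomial sample, the closed-form mean and variance, and numerical derivatives of the moment generating function (1.7).

import numpy as np

rng = np.random.default_rng(3)
n, p = 20, 0.3
x = rng.binomial(n, p, 1_000_000)

print(x.mean(), n * p)              # sample mean vs. np
print(x.var(), n * p * (1 - p))     # sample variance vs. np(1-p)

# Numerical derivatives of M(s) = (1 - p + p e^s)^n at s = 0
M = lambda s: (1 - p + p * np.exp(s)) ** n
h = 1e-5
first = (M(h) - M(-h)) / (2 * h)              # approximates E[X]
second = (M(h) - 2 * M(0.0) + M(-h)) / h**2   # approximates E[X^2]
print(first, second - first ** 2)             # mean and variance again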
Now,

\int_R \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu - s\sigma^2)^2}{2\sigma^2}} dx = 1,
and we are indeed dealing with a density function.
1.5 Cauchy distribution
The Cauchy distribution with parameter \lambda > 0 has density f(x) = \frac{\lambda}{\pi (\lambda^2 + x^2)}, x \in R. It is relatively unusual in the sense that it does not have a well-defined mean. To see this, we first observe that
\lim_{M \to +\infty} \int_0^M \frac{\lambda}{\pi} \frac{x}{\lambda^2 + x^2} dx \overset{u := x^2}{=} \lim_{M \to +\infty} \frac{\lambda}{\pi} \int_0^{M^2} \frac{1}{2} \frac{1}{\lambda^2 + u} du
 = \lim_{M \to +\infty} \frac{\lambda}{2\pi} \ln(\lambda^2 + u) \Big|_0^{M^2}    (1.17)
 = +\infty.
One cannot then define the mean of the Cauchy distribution, because if it existed, the following would have to hold:

\int_{-\infty}^{+\infty} x f(x) dx = \int_{-\infty}^0 x f(x) dx + \int_0^{+\infty} x f(x) dx
 = \int_{-\infty}^0 \frac{\lambda}{\pi} \frac{x}{\lambda^2 + x^2} dx + \int_0^{+\infty} \frac{\lambda}{\pi} \frac{x}{\lambda^2 + x^2} dx    (1.18)
 \overset{v := -x}{=} - \int_0^{+\infty} \frac{\lambda}{\pi} \frac{v}{\lambda^2 + v^2} dv + \int_0^{+\infty} \frac{\lambda}{\pi} \frac{x}{\lambda^2 + x^2} dx.
But this last expression is not well-defined, being of the form “∞ − ∞”.
As the mean of the distribution is not well defined, the variance cannot be defined
either. One may however consider the second moment, but it will turn out to be
+∞, and so will the moment generating function.
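The absence of a mean is visible in simulation: running averages of Cauchy draws never settle down. A small sketch (added here, assuming numpy; numpy's standard_cauchy sampler corresponds to \lambda = 1):

import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_cauchy(10_000_000)   # Cauchy draws with lambda = 1

# The running sample mean keeps jumping instead of converging:
# there is no law of large numbers without a finite mean.
for n in (10**3, 10**4, 10**5, 10**6, 10**7):
    print(n, x[:n].mean())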
1.6 Γ-distribution
The Γ-distribution has density

f(x) = \frac{1}{\Gamma(k)} (\lambda x)^{k-1} \lambda e^{-\lambda x}, \quad x > 0.

That is, this distribution is supported on [0, +\infty) (i.e., the corresponding random variable is non-negative with probability 1).
We first check that the proposed density actually defines a distribution. Recalling the definition of the function \Gamma as, for z \in R_{>0},

\Gamma(z) := \int_0^{+\infty} t^{z-1} e^{-t} dt,
we see that

\int_0^{+\infty} f(x) dx = \frac{1}{\Gamma(k)} \int_0^{+\infty} (\lambda x)^{k-1} \lambda e^{-\lambda x} dx
 \overset{u := \lambda x}{=} \frac{1}{\Gamma(k)} \int_0^{+\infty} u^{k-1} e^{-u} du    (1.19)
 = \frac{1}{\Gamma(k)} \Gamma(k)
 = 1,

as expected.
The moment generating function can be calculated as follows (for s < \lambda):

M(s) := \int_0^{+\infty} e^{sx} f(x) dx = \frac{1}{\Gamma(k)} \int_0^{+\infty} e^{sx} (\lambda x)^{k-1} \lambda e^{-\lambda x} dx
 = \frac{1}{\Gamma(k)} \int_0^{+\infty} (\lambda x)^{k-1} \lambda e^{-x(\lambda - s)} dx
 \overset{u := (\lambda - s)x}{=} \frac{1}{\Gamma(k)} \int_0^{+\infty} \left( \frac{\lambda}{\lambda - s} u \right)^{k-1} \frac{\lambda}{\lambda - s} e^{-u} du    (1.20)
 = \left( \frac{\lambda}{\lambda - s} \right)^k \frac{1}{\Gamma(k)} \Gamma(k)
 = \left( 1 - \frac{s}{\lambda} \right)^{-k}.
Thus,

\frac{d}{ds} M(s) = -k \left( 1 - \frac{s}{\lambda} \right)^{-k-1} \left( \frac{-1}{\lambda} \right),
\frac{d^2}{ds^2} M(s) = \frac{k}{\lambda} (-k-1) \left( 1 - \frac{s}{\lambda} \right)^{-k-2} \left( \frac{-1}{\lambda} \right),    (1.21)

and

mean = \frac{d}{ds} M(s) \Big|_{s=0} = \frac{k}{\lambda},
variance = \frac{d^2}{ds^2} M(s) \Big|_{s=0} - (mean)^2 = \frac{k(k+1)}{\lambda^2} - \frac{k^2}{\lambda^2} = \frac{k}{\lambda^2}.    (1.22)
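A quick Monte Carlo check of the mean k/\lambda and the variance k/\lambda^2 (an addition; it assumes numpy, whose gamma sampler is parameterized by the shape k and the scale 1/\lambda, and uses the arbitrary values k = 3, \lambda = 2):

import numpy as np

rng = np.random.default_rng(5)
k, lam = 3.0, 2.0
x = rng.gamma(shape=k, scale=1.0 / lam, size=1_000_000)

print(x.mean(), k / lam)        # both close to 1.5
print(x.var(), k / lam ** 2)    # both close to 0.75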
1.7 Exponential Distribution
The exponential distribution with parameter \lambda > 0 has density \lambda e^{-\lambda x}, x > 0 (it is the \Gamma-distribution with k = 1). It has the remarkable memoryless property: if X is exponentially distributed then

Prob[X > s + \tau | X > s] = Prob[X > \tau]

for any s, \tau > 0. Indeed,

Prob[X > t] = \int_t^{+\infty} \lambda e^{-\lambda x} dx = e^{-\lambda t},

and hence Prob[X > s + \tau | X > s] = e^{-\lambda(s+\tau)} / e^{-\lambda s} = e^{-\lambda \tau} = Prob[X > \tau].
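The memoryless property can also be checked empirically; the sketch below (added, assuming numpy and the arbitrary values \lambda = 2, s = 0.5, \tau = 0.3) estimates the conditional and unconditional tail probabilities.

import numpy as np

rng = np.random.default_rng(6)
lam, s, tau = 2.0, 0.5, 0.3
x = rng.exponential(scale=1.0 / lam, size=1_000_000)

cond = np.mean(x[x > s] > s + tau)   # P(X > s + tau | X > s)
uncond = np.mean(x > tau)            # P(X > tau)
print(cond, uncond, np.exp(-lam * tau))   # all three should be close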
2 Conditional Densities
For two events A, B, the conditional probability of A given B is
P(A|B) := \frac{Prob(A \cap B)}{Prob(B)}.
Given a random vector X with the density p(x), we can formally write
Prob(X ∈ dx) = p(x) dx
Here, dx is a formal, infinitesimal element of the space.
Given two random vectors X1 and X2 with the joint density p(x1 , x2 ), the densities
of X1 and X2 are given by
p_{X_1}(x_1) = \int p(x_1, x_2) dx_2, \qquad p_{X_2}(x_2) = \int p(x_1, x_2) dx_1.
That is, in the denominator we integrate only over (x1 , · · · , xk ) , as the values
(xk+1 , · · · , xn ) are fixed. In fact, by the above,
\int p(x_1, \cdots, x_n) dx_1 \cdots dx_k = p_{(X_{k+1}, \cdots, X_n)}(x_{k+1}, \cdots, x_n).
Lemma 2.1 Random vectors X and Y are independent if and only if the joint density p(x, y) for (X, Y) is given by the product of the marginals, p(x, y) = p_X(x) p_Y(y).
Another way to see it is through the conditional density. Clearly, X and Y are independent if and only if the conditional density of X, conditioned on Y, is the same as the density of X itself:

p_X(x) = p(x|y),

and hence

p(x|y) := \frac{p(x, y)}{\int p(x, y) dx} = \frac{p_X(x) p_Y(y)}{\int p_X(x) p_Y(y) dx} = \frac{p_X(x) p_Y(y)}{p_Y(y) \int p_X(x) dx} = p_X(x).
In general, the joint density can be represented through conditional densities via

p(x_n, \cdots, x_1) = p(x_n | x_{n-1}, \cdots, x_1) p(x_{n-1} | x_{n-2}, \cdots, x_1) p(x_{n-2} | x_{n-3}, \cdots, x_1) \cdots p(x_2 | x_1) p(x_1).    (2.4)
Example 1. Suppose that (X_1, X_2) is jointly normal with the density N((\mu_1, \mu_2), \Sigma), where the covariance matrix \Sigma is given by

\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}.

Then the conditional density p(x_1 | x_2) is normal with mean \mu_1 + \sigma_{12} \sigma_{22}^{-1} (x_2 - \mu_2) and variance

\sigma^2 = \sigma_{11} - \sigma_{12}^2 \sigma_{22}^{-1}.
Example 2. Suppose that the random vector (X_1, X_2) has a joint density p(x_1, x_2) = 2 \cdot 1_{x_1 \in [0,1]} 1_{x_2 \in [x_1, 1]}. Let's check that this is a probability density. Indeed,

\int \int p(x_1, x_2) dx_1 dx_2 = 2 \int_0^1 \int_{x_1}^1 dx_2 dx_1 = 2 \int_0^1 (1 - x_1) dx_1 = 2(1 - 0.5) = 1.

The conditional density is

p(x_1 | x_2) = \frac{p(x_1, x_2)}{\int p(x_1, x_2) dx_1},

and, since the integration region is \{x_2 \in [x_1, 1], x_1 \in [0, 1]\} = \{x_2 \in [0, 1], x_1 \in [0, x_2]\}, we get

\int p(x_1, x_2) dx_1 = 2 \int_0^{x_2} 1_{x_1 \in [0,1]} dx_1 = 2 \int_0^{x_2} dx_1 = 2 x_2.
Example 3. Consider the process

X_{t+1} = a X_t + w_{t+1},

where w_t is a sequence of i.i.d. random variables with the density f(z), and X_0 = 0. What is the joint density of X_1, \cdots, X_t?
Well, since the process is clearly Markov (the distribution of X_{t+1} only depends on X_t because the w_t are i.i.d.), we have p(x_t | x_{t-1}, \cdots, x_1) = p(x_t | x_{t-1}).
What is p(x_t | x_{t-1})? Well, X_t = a X_{t-1} + w_t. Conditional on X_{t-1} = x_{t-1}, the density p(x_t | x_{t-1}) is the density of a x_{t-1} + w_t, which is f(x_t - a x_{t-1}). Thus,

p(x_t, x_{t-1}, \cdots, x_1)
 = p(x_t | x_{t-1}, \cdots, x_1) p(x_{t-1} | x_{t-2}, \cdots, x_1) p(x_{t-2} | x_{t-3}, \cdots, x_1) \cdots p(x_2 | x_1) p(x_1)
 = f(x_t - a x_{t-1}) f(x_{t-1} - a x_{t-2}) \cdots f(x_2 - a x_1) f(x_1).    (2.5)
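For a concrete illustration (an addition to the notes, assuming numpy and standard normal shocks, i.e. f is the N(0, 1) density, with the arbitrary choice a = 0.8), one can simulate the process and evaluate the joint density (2.5) along the simulated path.

import numpy as np

rng = np.random.default_rng(7)
a, T = 0.8, 5
w = rng.standard_normal(T)          # i.i.d. shocks with density f = N(0, 1)

# Simulate X_1, ..., X_T starting from X_0 = 0
x = np.zeros(T + 1)
for t in range(1, T + 1):
    x[t] = a * x[t - 1] + w[t - 1]

# Joint density (2.5): the product of f(x_t - a x_{t-1}) over t = 1, ..., T
f = lambda z: np.exp(-z ** 2 / 2) / np.sqrt(2 * np.pi)
joint_density = np.prod(f(x[1:] - a * x[:-1]))
print(x[1:], joint_density)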
3 Conditional Expectation
For any function \varphi(x_1, \cdots, x_n), we have

E[\varphi(X_1, \cdots, X_n) | (X_{k+1}, \cdots, X_n)] = \int \varphi(x_1, \cdots, x_k, X_{k+1}, \cdots, X_n) p(x_1, \cdots, x_k | X_{k+1}, \cdots, X_n) dx_1 \cdots dx_k.
As a major application, suppose that we are observing the process Xt . Then, for
any t > τ,
Example 1. Let (X_1, X_2) \sim N((\mu_1, \mu_2), \Sigma). Then,

E[X_1 | X_2] = \mu_1 + \sigma_{12} \sigma_{22}^{-1} (X_2 - \mu_2)

and

E[X_1^2 | X_2] = Var[X_1 | X_2] + (E[X_1 | X_2])^2 = \sigma_{11} - \sigma_{12}^2 \sigma_{22}^{-1} + (\mu_1 + \sigma_{12} \sigma_{22}^{-1} (X_2 - \mu_2))^2.
Furthermore, E[\varphi(X_{t+1}) | F_t] = E[\varphi(X_{t+1}) | X_t] because the process is Markov. Moreover, we know from the above that the conditional density of X_{t+1} given X_t is f(x - a X_t), and hence

E[\varphi(X_{t+1}) | F_t] = \int_R \varphi(x) f(x - a X_t) dx.
4 Markov Processes
• Stochastic process X_t (possibly vector-valued, i.e., X_t \in R^d)
Definition. X_t is Markov if the conditional distribution of X_{t+1} given (X_t, X_{t-1}, \cdots, X_0) depends only on X_t.
In particular,
4.1 Transition probabilities and Kolmogorov Equations
• X_t takes a finite number of values x_1, \cdots, x_n
• P(X_{t+1} = x_j | X_t = x_i) = p(x_i, x_j)
• X_0 = x_0 with transition probabilities p(x_0, x_j)
• Kolmogorov difference equation

G(t, x_i) = \sum_j p(x_i, x_j) G(t + 1, x_j),

so that, in terms of the transition matrix \Pi = (p(x_i, x_j)),

G(t, x) = \Pi^{T-t} g(x).
V(x) = \sum_{t=0}^\infty e^{-rt} E[X_t | X_0 = x]
Kolmogorov equation
Example: Hitting time. That is, T_{x_i} is the first time the process hits a certain level x_i. Write

E[s^{T_{x_i}} | X_0 = x_j] = F_j(s).

Show that the functions F_1(s), \cdots, F_n(s) satisfy the system of difference equations

F_j(s) = p_{ji} s + s \sum_{k \ne i} p_{jk} F_k(s).
Solve it for n = 2 and n = 3.
E[T_{x_i} | X_0 = x_j] = \frac{d}{ds} F_j(s) \Big|_{s=1}, \qquad E[T_{x_i}^2 | X_0 = x_j] = \frac{d^2}{ds^2} F_j(s) \Big|_{s=1} + \frac{d}{ds} F_j(s) \Big|_{s=1}.
Solution. The general idea is to condition on the next state that will be reached by the process. Namely, for n = 2 (states x_i and x_j),

F_j(s) = p_{ji} s + s p_{jj} F_j(s),

whose solution is

F_j(s) = \frac{s p_{ji}}{1 - s p_{jj}}.
Expressing it in matrix form, we obtain

\begin{pmatrix} 1 - s p_{jj} & -s p_{jk} \\ -s p_{kj} & 1 - s p_{kk} \end{pmatrix} \begin{pmatrix} F_j(s) \\ F_k(s) \end{pmatrix} = \begin{pmatrix} p_{ji} s \\ p_{ki} s \end{pmatrix},    (4.3)

and

F_j(s) = \frac{(1 - s p_{kk}) p_{ji} s + s p_{jk} p_{ki} s}{(1 - s p_{jj})(1 - s p_{kk}) - s^2 p_{kj} p_{jk}},
F_k(s) = \frac{s p_{kj} p_{ji} s + (1 - s p_{jj}) p_{ki} s}{(1 - s p_{jj})(1 - s p_{kk}) - s^2 p_{kj} p_{jk}}.    (4.4)
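Formula (4.4) can be sanity-checked by simulation. The sketch below (an added illustration with an arbitrary three-state chain; the states are labelled i, j, k = 0, 1, 2 and s = 0.9) estimates E[s^T | X_0 = x_j] for the hitting time T of state x_i and compares it with F_j(s).

import numpy as np

rng = np.random.default_rng(8)
# Arbitrary transition matrix over the states (i, j, k) = (0, 1, 2); rows sum to one
P = np.array([[0.2, 0.5, 0.3],
              [0.4, 0.4, 0.2],
              [0.3, 0.3, 0.4]])
s, i, j, k = 0.9, 0, 1, 2

# Closed form (4.4) for F_j(s) = E[s^T | X_0 = x_j], T = hitting time of x_i
num = (1 - s * P[k, k]) * P[j, i] * s + s * P[j, k] * P[k, i] * s
den = (1 - s * P[j, j]) * (1 - s * P[k, k]) - s ** 2 * P[k, j] * P[j, k]
print("formula:   ", num / den)

# Monte Carlo estimate of E[s^T | X_0 = x_j]
vals = []
for _ in range(100_000):
    state, t = j, 0
    while state != i:
        state = rng.choice(3, p=P[state])
        t += 1
    vals.append(s ** t)
print("simulation:", np.mean(vals))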
• if E_t[X_{t+1}] \le X_t for all t, then X_t is a super-martingale
• if E_t[X_{t+1}] \ge X_t for all t, then X_t is a sub-martingale
Example: random walk. Let

X_t = Z_1 + \cdots + Z_t.
Properties of Martingales
• Et [Mt+1 − Mt ] = 0
• Es [Mt − Ms ] = Es [Mt ] − Ms = 0
Theorem
• If Xt is a martingale (sub/super martingale) then so is the stopped process;
• limt→∞ Xtτ = Xτ a.s. if τ is almost surely finite.
Proof for the martingale case. By definition,

E[I_A (X_{t+1} - X_t) | F_\theta] = E[E[I_A (X_{t+1} - X_t) | F_t] | F_\theta] = E[I_A (E[X_{t+1} | F_t] - X_t) | F_\theta] = 0    (5.1)

for any F_t-measurable A and any \theta \le t.
We have

X_t^\tau = X_0 + \sum_{\theta=1}^{t \wedge \tau} (X_\theta - X_{\theta-1}) = X_0 + \sum_{\theta=1}^t I_{\tau \ge \theta} (X_\theta - X_{\theta-1}).
Examples.
1. Suppose Harriet has 7 dollars. Her plan is to make one dollar bets on fair
coin tosses until her wealth reaches either 0 or 50, and then to go home.
What is the expected amount of money that Harriet will have when she
goes home? What is the probability that she will have 50 when she goes
home?
2. Consider a contract that at time N will be worth either 100 or 0. Let S(n) be its price at time 0 \le n \le N. If S(n) is a martingale, and S(0) = 47, then what is the probability that the contract will be worth 100 at time N?
3. Pedro plans to buy the contract in the previous problem at time 0 and sell
it the first time T at which the price goes above 55 or below 15. What is
the expected value of S(T )?
4. Suppose S(N) is with probability one either 100 or 0 and that S(0) = 50. Suppose further there is at least a sixty percent probability that the price will at some point dip to below 40 and then subsequently rise to above 60 before time N. Prove that S(n) cannot be a martingale.
Exercise 1. We model the coin tosses by the random variables

X(t, \omega) = \begin{cases} 1, & \text{with probability } 1/2 \\ -1, & \text{with probability } 1/2 \end{cases}

and assume the random variables (X(t, \omega))_{t \ge 0} are i.i.d. In particular, for any time t \ge 0,

E(X(t, \omega)) = 0.
The gain process of someone who starts playing with Z(0) = 7 dollars and never stops playing is given by, for t \ge 0,

Z(t, \omega) = Z(0) + \sum_{s=1}^t X(s, \omega).

If we take the filtration (F_t)_{t \ge 0} to be the one generated by the coin tosses (X(t, \omega))_{t \ge 0}, then, for any t \ge 0,

E(Z(t + 1, \omega) | F_t) = E\left( Z(0) + \sum_{s=1}^{t+1} X(s, \omega) \Big| F_t \right)
 = Z(0) + \sum_{s=1}^t X(s, \omega) + E(X(t + 1, \omega) | F_t)
 = Z(0) + \sum_{s=1}^t X(s, \omega) + E(X(t + 1, \omega))    (5.2)
 = Z(t, \omega),

where we use the linearity of the conditional expectation and Exercise 1, bullet 2 from Problem Set 4 for the second equality, and the independence of the "X(t, \omega)"s for the third. Hence Z is a martingale.
Harriet will stop playing when her wealth reaches 0 or 50. We describe this time by the stopping time T (it is a stopping time). From the lecture, we know that the stopped process

(Z^T(t, \omega))_{t \ge 0}

is still a martingale. But it is also the process describing the evolution of Harriet's wealth. The mathematical formulation of our first question is thus: what is E(Z^T(T(\omega), \omega))?
To answer this, note that (Z^T(t, \omega))_{t \ge 0} is bounded by 0 and 50. But then it is uniformly integrable, and by the suitable optional stopping theorem (slide 19, day 5),

E(Z^T(T(\omega), \omega)) = E(Z^T(0, \omega)) = E(Z(0, \omega)) = Z(0) = 7.    (5.3)

The expected amount of money that Harriet will have when she goes home is thus her initial wealth.
We turn to the probability of going home with 50 dollars. Note that one of the
three following events must happen:
• Harriet plays forever because her wealth never reaches either of 0 or 50;
• Harriet goes home with 0;
• Harriet goes home with 50.
The first possibility will never happen. This is probably clear, but may be verified by the following application of one of the two Borel-Cantelli lemmas. If at some point the series of bets gives 50 positive outcomes in a row, we can be sure that, at worst, by the end of those 50 wins Harriet will have gone home. Let us then consider the events
A_1 = \{X_1, X_2, \ldots, X_{50} \text{ are winning bets}\}
A_2 = \{X_{51}, X_{52}, \ldots, X_{100} \text{ are winning bets}\}    (5.4)
\ldots
A_k = \{X_{(k-1) \cdot 50 + 1}, X_{(k-1) \cdot 50 + 2}, \ldots, X_{k \cdot 50} \text{ are winning bets}\}.
The events (A_k)_{k \ge 1} are independent, and the probability of each of them is 1/2^{50}, which is small but strictly positive. But then, as

\sum_{k=1}^\infty Prob(A_k) = \infty,
the “second” Borel-Cantelli lemma ensures that, with probability one, infinitely
many of the events Ak will occur. In particular, Harriet will go home after a
finite time with probability one.
With this observation in mind, from (5.3),

Z(0) = E(Z^T(T(\omega), \omega)) = E(Z^T(T(\omega), \omega) 1_{Z(T(\omega),\omega)=0}) + E(Z^T(T(\omega), \omega) 1_{Z(T(\omega),\omega)=50})
 = 0 + 50 \cdot Prob(Z^T(T(\omega), \omega) = 50),    (5.5)

from which

Prob(Harriet's wealth when she goes home is 50) = Prob(Z^T(T(\omega), \omega) = 50)    (5.6)
 = \frac{Z(0)}{50} = \frac{7}{50}.
Exercise 2. The process S(n) being a martingale,

47 = S(0) = E(S(N)) = 100 \cdot Prob(S(N) = 100) + 0 \cdot Prob(S(N) = 0).

From which,

Prob(S(N) = 100) = \frac{47}{100}.

Exercise 3. But then,

E(S_{T(\omega)}(\omega)) = E(S_{\min\{T(\omega), N\}}(\omega)) = E(S_N^T(\omega)) = S^T(0) = S(0) = 47,

where we used the fact that S^T is a martingale. The expected value is thus, again, the initial value.
Exercise 4. Write

• \tau_{40} for the first time the process is under 40, and
• \tau_{60} for the first time the process is over 60, coming up from under 40 (hence, \tau_{60} \ge \tau_{40}).

We have that

\Omega = \{\tau_{40} > N\} \sqcup \{\tau_{40} \le N; \tau_{60} > N\} \sqcup \{\tau_{40} \le N; \tau_{60} \le N\} =: A_1 \sqcup A_2 \sqcup A_3,    (5.8)
where, for example, τ60 > N means that the process does not go over 60 before
N (and thus ends at 0).
As S is assumed to be a martingale, so are S^{\tau_{40}} and S^{\tau_{60}}. In particular,

E(S_N^{\tau_{60}} - S_N^{\tau_{40}}) = S_0 - S_0 = 0.    (5.9)

We will work a bit more on the left hand side. We have that

E(S_N^{\tau_{60}} - S_N^{\tau_{40}}) = E((S_N^{\tau_{60}} - S_N^{\tau_{40}}) 1_{A_1}) + E((S_N^{\tau_{60}} - S_N^{\tau_{40}}) 1_{A_2}) + E((S_N^{\tau_{60}} - S_N^{\tau_{40}}) 1_{A_3})
 \ge E(0 \cdot 1_{A_1}) + E((-40) \cdot 1_{A_2}) + E(20 \cdot 1_{A_3})    (5.10)
 = -40 \cdot Prob(A_2) + 20 \cdot Prob(A_3).
By assumption, Prob(A_3) \ge \frac{6}{10}. Concerning the other probability, as S^{\tau_{40}} is assumed to be a martingale,

50 = S(0) = E(S_N^{\tau_{40}}) = E(S_{\tau_{40}} | \tau_{40} \le N) Prob(\tau_{40} \le N) + 100 \cdot Prob(\tau_{40} > N)
 \le 40 \cdot Prob(\tau_{40} \le N) + 100 \cdot Prob(\tau_{40} > N)    (5.11)
 = 40 \cdot Prob(\tau_{40} \le N) + 100 \cdot (1 - Prob(\tau_{40} \le N)).
As a result,

Prob(A_2 \sqcup A_3) = Prob(\tau_{40} \le N) \le \frac{5}{6},

and

Prob(A_1) = 1 - Prob(A_2 \sqcup A_3) \ge \frac{1}{6}.

But then,

Prob(A_2) = 1 - Prob(A_1) - Prob(A_3) \le 1 - \frac{1}{6} - \frac{6}{10} = \frac{7}{30}.
Going back to (5.10),

E(S_N^{\tau_{60}} - S_N^{\tau_{40}}) \ge -40 \cdot Prob(A_2) + 20 \cdot Prob(A_3) \ge -40 \cdot \frac{7}{30} + 20 \cdot \frac{6}{10} = \frac{8}{3} > 0,

which is clearly in contradiction with (5.9). As a result, S cannot be a martingale.
Exercise 5
a) We represent the coin tosses by the i.i.d. random variables (X(t, \omega))_{t \ge 0}, where

X(t, \omega) = \begin{cases} 1 \text{ (for ``head'')}, & \text{with probability } 1/2 \\ -1 \text{ (for ``tail'')}, & \text{with probability } 1/2 \end{cases}

In particular, E(X(t, \omega)) = 0, and the gain from betting a francs on the outcome at time t is thus a X(t, \omega).
The strategy of the player can be described by the following series of amounts:

a_1 = 1,
a_t(\omega) = 2^{t-1} 1_{\cap_{u=1}^{t-1} \{X(u, \omega) = -1\}}, \quad t \ge 2,    (5.12)

where one should note that a_t(\omega) is measurable with respect to the \sigma-field generated by the coin tosses up to t - 1. (In other words, the strategy is reasonable in the sense that it does not require any knowledge of the coin tosses still to occur.)
The gains resulting from this strategy are

G(t, \omega) = \sum_{s=1}^t a_s(\omega) X(s, \omega),

and one may check that this process is a martingale. Indeed, it is adapted to the filtration (F_t)_{t \ge 0} generated by the coin tosses, and for any t \ge 0,

E(G(t + 1, \omega) | F_t) = E\left( \sum_{s=1}^{t+1} a_s(\omega) X(s, \omega) \Big| F_t \right)
 = \sum_{s=1}^t a_s(\omega) X(s, \omega) + E(a_{t+1}(\omega) X(t + 1, \omega) | F_t)
 = G(t, \omega) + a_{t+1}(\omega) E(X(t + 1, \omega) | F_t)    (5.13)
 = G(t, \omega) + a_{t+1}(\omega) E(X(t + 1, \omega))
 = G(t, \omega).
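Simulating this doubling strategy (an added sketch, assuming numpy and a horizon of T = 20 bets) illustrates both facts at once: at any fixed time the average gain is approximately zero, yet almost every individual path has already locked in a gain of exactly one franc.

import numpy as np

rng = np.random.default_rng(10)
T, n_paths = 20, 200_000
x = rng.choice((-1, 1), size=(n_paths, T))        # coin tosses, +1 means "head"

gains = np.zeros(n_paths)
still_betting = np.ones(n_paths, dtype=bool)
for t in range(T):
    bet = 2.0 ** t * still_betting                # bet number t+1 stakes 2^t francs
    gains += bet * x[:, t]
    still_betting &= (x[:, t] == -1)              # stop doubling after the first head

print(gains.mean())             # close to 0: the gain process is a martingale
print(np.mean(gains == 1.0))    # close to 1 - 2^{-20}: nearly every path has won exactly 1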
c) There is an apparent contradiction: given that every time I play, on average I neither win nor lose, how can I find a strategy with which I will always eventually win?
This argument is an intuitive, and incorrect, interpretation of the optional sam-
pling theorem. We have two versions of it, but the form is similar in both cases
and looks like: if we have
• a martingale M ,
• an additional condition,
then
M0 = E(Mτ ).
What is puzzling in our case is that the stopped gain process (which corresponds to the term in the expectation on the right hand side) is identically equal to one, but the starting value (which corresponds to the left hand side) is zero. There is however no contradiction, as the setting of this problem does not satisfy the additional conditions required by the theorem. (These conditions can be either uniform integrability (cf. slide 19, day 5) or the combination of a bound on the increments and a finite expected value for the stopping time (cf. slide 12, day 5).)