
Lectures on Stochastic Analysis

Thomas G. Kurtz
Departments of Mathematics and Statistics
University of Wisconsin - Madison
Madison, WI 53706-1388
Revised September 7, 2001
Minor corrections August 23, 2007
Contents
1 Introduction. 4
2 Review of probability. 5
2.1 Properties of expectation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Convergence of random variables. . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Convergence in probability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.4 Norms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.5 Information and independence. . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.6 Conditional expectation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Continuous time stochastic processes. 13
3.1 Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 Filtrations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Stopping times. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4 Brownian motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.5 Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4 Martingales. 18
4.1 Optional sampling theorem and Doob's inequalities. . . . . . . . . . . . . . 18
4.2 Local martingales. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Quadratic variation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.4 Martingale convergence theorem. . . . . . . . . . . . . . . . . . . . . . . . . 21
5 Stochastic integrals. 22
5.1 Definition of the stochastic integral. . . . . . . . . . . . . . . . . . . . . . 22
5.2 Conditions for existence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.3 Semimartingales. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.4 Change of time variable. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.5 Change of integrator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.6 Localization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.7 Approximation of stochastic integrals. . . . . . . . . . . . . . . . . . . . . . . 33
5.8 Connection to Protter's text. . . . . . . . . . . . . . . . . . . . . . . . . . 34
6 Covariation and Itô's formula. 35
6.1 Quadratic covariation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.2 Continuity of the quadratic variation. . . . . . . . . . . . . . . . . . . . . . . 36
6.3 Itô's formula. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.4 The product rule and integration by parts. . . . . . . . . . . . . . . . . . . . 40
6.5 Itô's formula for vector-valued semimartingales. . . . . . . . . . . . . . . 41
7 Stochastic Differential Equations 42
7.1 Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
7.2 Gronwall's inequality and uniqueness for ODEs. . . . . . . . . . . . . . . 42
7.3 Uniqueness of solutions of SDEs. . . . . . . . . . . . . . . . . . . . . . . . . 44
7.4 A Gronwall inequality for SDEs . . . . . . . . . . . . . . . . . . . . . . . . . 45
7.5 Existence of solutions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7.6 Moment estimates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
8 Stochastic differential equations for diffusion processes. 53
8.1 Generator for a diffusion process. . . . . . . . . . . . . . . . . . . . . . . 53
8.2 Exit distributions in one dimension. . . . . . . . . . . . . . . . . . . . . . . . 54
8.3 Dirichlet problems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.4 Harmonic functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.5 Parabolic equations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
8.6 Properties of X(t, x). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
8.7 Markov property. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
8.8 Strong Markov property. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
8.9 Equations for probability distributions. . . . . . . . . . . . . . . . . . . . . . 59
8.10 Stationary distributions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
8.11 Diffusion with a boundary. . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9 Poisson random measures 63
9.1 Poisson random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
9.2 Poisson sums of Bernoulli random variables . . . . . . . . . . . . . . . . . . . 64
9.3 Poisson random measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.4 Integration w.r.t. a Poisson random measure . . . . . . . . . . . . . . . . . . 66
9.5 Extension of the integral w.r.t. a Poisson random measure . . . . . . . . . . 68
9.6 Centered Poisson random measure . . . . . . . . . . . . . . . . . . . . . . . . 71
9.7 Time dependent Poisson random measures . . . . . . . . . . . . . . . . . . . 74
9.8 Stochastic integrals for time-dependent Poisson random measures . . . . . . 75
10 Limit theorems. 79
10.1 Martingale CLT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
10.2 Sequences of stochastic differential equations. . . . . . . . . . . . . . . . 82
10.3 Approximation of empirical CDF. . . . . . . . . . . . . . . . . . . . . . . . . 83
10.4 Diffusion approximations for Markov chains. . . . . . . . . . . . . . . . . 83
10.5 Convergence of stochastic integrals. . . . . . . . . . . . . . . . . . . . . . . . 86
11 Reflecting diffusion processes. 87
11.1 The M/M/1 Queueing Model. . . . . . . . . . . . . . . . . . . . . . . . . . . 87
11.2 The G/G/1 queueing model. . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
11.3 Multidimensional Skorohod problem. . . . . . . . . . . . . . . . . . . . . . . 89
11.4 The Tandem Queue. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
12 Change of Measure 93
12.1 Applications of change-of-measure. . . . . . . . . . . . . . . . . . . . . . . . 93
12.2 Bayes' Formula. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
12.3 Local absolute continuity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
12.4 Martingales and change of measure. . . . . . . . . . . . . . . . . . . . . . . . 95
12.5 Change of measure for Brownian motion. . . . . . . . . . . . . . . . . . . . . 96
12.6 Change of measure for Poisson processes. . . . . . . . . . . . . . . . . . . . . 97
13 Finance. 99
13.1 Assets that can be traded at intermediate times. . . . . . . . . . . . . . . . . 100
13.2 First fundamental theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . 103
13.3 Second fundamental theorem. . . . . . . . . . . . . . . . . . . . . . . . . . 105
14 Filtering. 106
15 Problems. 109
1 Introduction.
The first draft of these notes was prepared by the students in Math 735 at the University
of Wisconsin - Madison during the fall semester of 1992. The students faithfully transcribed
many of the errors made by the lecturer. While the notes have been edited and many
errors removed, particularly due to a careful reading by Geoffrey Pritchard, many errors
undoubtedly remain. Read with care.
These notes do not eliminate the need for a good book. The intention has been to
state the theorems correctly with all hypotheses, but no attempt has been made to include
detailed proofs. Parts of proofs or outlines of proofs have been included when they seemed
to illuminate the material or at the whim of the lecturer.
2 Review of probability.

A probability space is a triple $(\Omega, \mathcal{F}, P)$ where $\Omega$ is the set of outcomes, $\mathcal{F}$ is a $\sigma$-algebra of events, that is, subsets of $\Omega$, and $P : \mathcal{F} \to [0, \infty)$ is a measure that assigns probabilities to events. A (real-valued) random variable $X$ is a real-valued function defined on $\Omega$ such that for every Borel set $B \in \mathcal{B}(\mathbb{R})$, we have $X^{-1}(B) = \{\omega : X(\omega) \in B\} \in \mathcal{F}$. (Note that the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R})$ is the smallest $\sigma$-algebra containing the open sets.) We will occasionally also consider $S$-valued random variables where $S$ is a separable metric space (e.g., $\mathbb{R}^d$). The definition is the same with $\mathcal{B}(S)$ replacing $\mathcal{B}(\mathbb{R})$.

The probability distribution on $S$ determined by

$\mu_X(B) = P(X^{-1}(B)) = P\{X \in B\}$

is called the distribution of $X$. A random variable $X$ has a discrete distribution if its range is countable, that is, there exists a sequence $\{x_i\}$ such that $\sum_i P\{X = x_i\} = 1$. The expectation of a random variable with a discrete distribution is given by

$E[X] = \sum_i x_i P\{X = x_i\}$

provided the sum is absolutely convergent. If $X$ does not have a discrete distribution, then it can be approximated by random variables with discrete distributions. Define $\overline{X}_n = \frac{k+1}{n}$ and $\underline{X}_n = \frac{k}{n}$ when $\frac{k}{n} < X \le \frac{k+1}{n}$, and note that $\underline{X}_n < X \le \overline{X}_n$ and $|\overline{X}_n - \underline{X}_n| \le \frac{1}{n}$. Then

$E[X] \equiv \lim_{n\to\infty} E[\overline{X}_n] = \lim_{n\to\infty} E[\underline{X}_n]$

provided $E[\overline{X}_n]$ exists for some (and hence all) $n$. If $E[X]$ exists, then we say that $X$ is integrable.
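As a numerical illustration of this discretization (an addition, not part of the original notes), the following Python sketch approximates $E[X]$ by the upper and lower discretized variables; the choice $X \sim \mathrm{Exp}(1)$ and the sample size are arbitrary assumptions.

```python
import math
import random

# Approximate E[X] via the discretizations Xbar_n = (k+1)/n and Xunder_n = k/n
# on the event {k/n < X <= (k+1)/n}; here X ~ Exp(1), so E[X] = 1.
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]

def bracket(x, n):
    """Return (k/n, (k+1)/n) with k/n < x <= (k+1)/n (up to float rounding)."""
    k = math.ceil(x * n) - 1
    return k / n, (k + 1) / n

n = 100
lower = sum(bracket(x, n)[0] for x in samples) / len(samples)  # E[Xunder_n]
upper = sum(bracket(x, n)[1] for x in samples) / len(samples)  # E[Xbar_n]
# E[Xunder_n] <= E[X] <= E[Xbar_n], and the gap is at most 1/n.
```

With $n = 100$ both averages land within about $1/n$ of the true mean $E[X] = 1$, consistent with $|\overline{X}_n - \underline{X}_n| \le 1/n$.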
2.1 Properties of expectation.

a) Linearity: $E[aX + bY] = aE[X] + bE[Y]$.
b) Monotonicity: if $X \le Y$ a.s. then $E[X] \le E[Y]$.
2.2 Convergence of random variables.

a) $X_n \to X$ a.s. iff $P\{\omega : \lim_{n\to\infty} X_n(\omega) = X(\omega)\} = 1$.
b) $X_n \to X$ in probability iff for all $\varepsilon > 0$, $\lim_{n\to\infty} P\{|X_n - X| > \varepsilon\} = 0$.
c) $X_n$ converges to $X$ in distribution (denoted $X_n \Rightarrow X$) iff $\lim_{n\to\infty} P\{X_n \le x\} = P\{X \le x\} \equiv F_X(x)$ for all $x$ at which $F_X$ is continuous.

Theorem 2.1 a) implies b) implies c).

Proof. (b $\Rightarrow$ c) Let $\varepsilon > 0$. Then

$P\{X_n \le x\} - P\{X \le x + \varepsilon\} = P\{X_n \le x, X > x + \varepsilon\} - P\{X \le x + \varepsilon, X_n > x\} \le P\{|X_n - X| > \varepsilon\}$

and hence $\limsup P\{X_n \le x\} \le P\{X \le x + \varepsilon\}$. Similarly, $\liminf P\{X_n \le x\} \ge P\{X \le x - \varepsilon\}$. Since $\varepsilon$ is arbitrary, the implication follows. $\square$
2.3 Convergence in probability.

a) If $X_n \to X$ in probability and $Y_n \to Y$ in probability then $aX_n + bY_n \to aX + bY$ in probability.
b) If $Q : \mathbb{R} \to \mathbb{R}$ is continuous and $X_n \to X$ in probability then $Q(X_n) \to Q(X)$ in probability.
c) If $X_n \to X$ in probability and $X_n - Y_n \to 0$ in probability, then $Y_n \to X$ in probability.

Remark 2.2 (b) and (c) hold with convergence in probability replaced by convergence in distribution; however (a) is not in general true for convergence in distribution.
Theorem 2.3 (Bounded Convergence Theorem) Suppose that $X_n \Rightarrow X$ and that there exists a constant $b$ such that $P(|X_n| \le b) = 1$. Then $E[X_n] \to E[X]$.

Proof. Let $\{x_i\}$ be a partition of $\mathbb{R}$ such that $F_X$ is continuous at each $x_i$. Then

$\sum_i x_i P\{x_i < X_n \le x_{i+1}\} \le E[X_n] \le \sum_i x_{i+1} P\{x_i < X_n \le x_{i+1}\}$

and taking limits we have

$\sum_i x_i P\{x_i < X \le x_{i+1}\} \le \liminf_{n\to\infty} E[X_n] \le \limsup_{n\to\infty} E[X_n] \le \sum_i x_{i+1} P\{x_i < X \le x_{i+1}\}.$

As $\max |x_{i+1} - x_i| \to 0$, the left and right sides converge to $E[X]$, giving the theorem. $\square$
Lemma 2.4 Let $X \ge 0$ a.s. Then $\lim_{M\to\infty} E[X \wedge M] = E[X]$.

Proof. Check the result first for $X$ having a discrete distribution, and then extend to general $X$ by approximation. $\square$
Theorem 2.5 (Monotone Convergence Theorem.) Suppose $0 \le X_n \le X$ and $X_n \to X$ in probability. Then $\lim_{n\to\infty} E[X_n] = E[X]$.

Proof. For $M > 0$,

$E[X] \ge E[X_n] \ge E[X_n \wedge M] \to E[X \wedge M]$

where the convergence on the right follows from the bounded convergence theorem. It follows that

$E[X \wedge M] \le \liminf_{n\to\infty} E[X_n] \le \limsup_{n\to\infty} E[X_n] \le E[X]$

and the result follows by Lemma 2.4. $\square$
Lemma 2.6 (Fatou's lemma.) If $X_n \ge 0$ and $X_n \Rightarrow X$, then $\liminf E[X_n] \ge E[X]$.

Proof. Since $E[X_n] \ge E[X_n \wedge M]$, we have

$\liminf E[X_n] \ge \liminf E[X_n \wedge M] = E[X \wedge M].$

By the Monotone Convergence Theorem $E[X \wedge M] \to E[X]$, and the lemma follows. $\square$
Theorem 2.7 (Dominated Convergence Theorem) Assume $X_n \Rightarrow X$, $Y_n \Rightarrow Y$, $|X_n| \le Y_n$, and $E[Y_n] \to E[Y] < \infty$. Then $E[X_n] \to E[X]$.

Proof. For simplicity, assume in addition that $X_n + Y_n \Rightarrow X + Y$ and $Y_n - X_n \Rightarrow Y - X$ (otherwise consider subsequences along which $(X_n, Y_n) \Rightarrow (X, Y)$). Then by Fatou's lemma $\liminf E[X_n + Y_n] \ge E[X + Y]$ and $\liminf E[Y_n - X_n] \ge E[Y - X]$. From these observations $\liminf E[X_n] + \lim E[Y_n] \ge E[X] + E[Y]$, and hence $\liminf E[X_n] \ge E[X]$. Similarly $\liminf E[-X_n] \ge E[-X]$, and hence $\limsup E[X_n] \le E[X]$. $\square$
Lemma 2.8 (Markov's inequality)

$P\{|X| > a\} \le E[|X|]/a, \qquad a \ge 0.$

Proof. Note that $|X| \ge aI_{\{|X|>a\}}$. Taking expectations proves the desired inequality. $\square$
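A quick empirical check of the inequality (an added illustration; $X \sim \mathrm{Exp}(1)$ and $a = 2$ are arbitrary choices):

```python
import random

# Markov's inequality: P{|X| > a} <= E[|X|]/a.  Here X ~ Exp(1), a = 2.
random.seed(1)
xs = [random.expovariate(1.0) for _ in range(50_000)]
a = 2.0
emp_prob = sum(1 for x in xs if x > a) / len(xs)   # empirical P{X > a}
markov_bound = (sum(xs) / len(xs)) / a             # empirical E[X]/a
```

For Exp(1) the true probability is $e^{-2} \approx 0.135$, comfortably under the bound $E[X]/a = 1/2$; Markov's inequality is crude but universal.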
2.4 Norms.

For $1 \le p < \infty$, $L^p$ is the collection of random variables $X$ with $E[|X|^p] < \infty$, and the $L^p$-norm is defined by $\|X\|_p = E[|X|^p]^{1/p}$. $L^\infty$ is the collection of random variables $X$ such that $P\{|X| \le c\} = 1$ for some $c < \infty$, and $\|X\|_\infty = \inf\{c : P\{|X| \le c\} = 1\}$.

Properties of norms:
1) $\|X - Y\|_p = 0$ implies $X = Y$ a.s.
2) $|E[XY]| \le \|X\|_p \|Y\|_q$, $\quad \frac{1}{p} + \frac{1}{q} = 1$.
3) $\|X + Y\|_p \le \|X\|_p + \|Y\|_p$.

Schwarz inequality. ($p = q = 2$.)

Note that

$0 \le E[(aX + bY)^2] = a^2 E[X^2] + 2abE[XY] + b^2 E[Y^2].$

Assume that $E[XY] \le 0$ (otherwise replace $X$ by $-X$) and take $a, b > 0$. Then

$-E[XY] \le \frac{a}{2b} E[X^2] + \frac{b}{2a} E[Y^2].$

Take $a = \|Y\|_2$ and $b = \|X\|_2$.
Triangle inequality. ($p = 2$.)

We have

$\|X + Y\|_2^2 = E[(X + Y)^2] = E[X^2] + 2E[XY] + E[Y^2] \le \|X\|_2^2 + 2\|X\|_2\|Y\|_2 + \|Y\|_2^2 = (\|X\|_2 + \|Y\|_2)^2. \qquad \square$
It follows that $r_p(X, Y) = \|X - Y\|_p$ defines a metric on $L^p$, the space of random variables satisfying $E[|X|^p] < \infty$. (Note that we identify two random variables that differ on a set of probability zero.) Recall that a sequence in a metric space is Cauchy if

$\lim_{n,m\to\infty} r_p(X_n, X_m) = 0$

and a metric space is complete if every Cauchy sequence has a limit. For example, in the case $p = 1$, suppose $\{X_n\}$ is Cauchy and let $n_k$ satisfy

$\sup_{m>n_k} \|X_m - X_{n_k}\|_1 = \sup_{m>n_k} E[|X_m - X_{n_k}|] \le \frac{1}{4^k}.$

Then, with probability one, the series

$X \equiv X_{n_1} + \sum_{k=1}^{\infty} (X_{n_{k+1}} - X_{n_k})$

is absolutely convergent, and it follows that

$\lim_{m\to\infty} \|X_m - X\|_1 = 0.$
2.5 Information and independence.

Information obtained by observations of the outcome of a random experiment is represented by a sub-$\sigma$-algebra $\mathcal{D}$ of the collection of events $\mathcal{F}$. If $D \in \mathcal{D}$, then the observer knows whether or not the outcome is in $D$.

An $S$-valued random variable $Y$ is independent of a $\sigma$-algebra $\mathcal{D}$ if

$P(\{Y \in B\} \cap D) = P\{Y \in B\}P(D), \qquad B \in \mathcal{B}(S),\ D \in \mathcal{D}.$

Two $\sigma$-algebras $\mathcal{D}_1, \mathcal{D}_2$ are independent if

$P(D_1 \cap D_2) = P(D_1)P(D_2), \qquad D_1 \in \mathcal{D}_1,\ D_2 \in \mathcal{D}_2.$

Random variables $X$ and $Y$ are independent if $\sigma(X)$ and $\sigma(Y)$ are independent, that is, if

$P(\{X \in B_1\} \cap \{Y \in B_2\}) = P\{X \in B_1\}P\{Y \in B_2\}.$
2.6 Conditional expectation.

Interpretation of conditional expectation in $L^2$.

Problem: Approximate $X \in L^2$ using information represented by $\mathcal{D}$ such that the mean square error is minimized, i.e., find the $\mathcal{D}$-measurable random variable $Y$ that minimizes $E[(X - Y)^2]$.

Solution: Suppose $Y$ is a minimizer. For any $\varepsilon \neq 0$ and any $\mathcal{D}$-measurable random variable $Z \in L^2$,

$E[|X - Y|^2] \le E[|X - Y - \varepsilon Z|^2] = E[|X - Y|^2] - 2\varepsilon E[Z(X - Y)] + \varepsilon^2 E[Z^2].$

Hence $2\varepsilon E[Z(X - Y)] \le \varepsilon^2 E[Z^2]$. Since $\varepsilon$ is arbitrary, $E[Z(X - Y)] = 0$ and hence

$E[ZX] = E[ZY] \qquad (2.1)$

for every $\mathcal{D}$-measurable $Z$ with $E[Z^2] < \infty$.

With (2.1) in mind, for an integrable random variable $X$, the conditional expectation of $X$, denoted $E[X|\mathcal{D}]$, is the unique (up to changes on events of probability zero) random variable $Y$ satisfying

A) $Y$ is $\mathcal{D}$-measurable.
B) $\int_D X\,dP = \int_D Y\,dP$ for all $D \in \mathcal{D}$.

Note that Condition B is a special case of (2.1) with $Z = I_D$ (where $I_D$ denotes the indicator function for the event $D$), and that Condition B implies that (2.1) holds for all bounded $\mathcal{D}$-measurable random variables. Existence of conditional expectations is a consequence of the Radon-Nikodym theorem.
The following lemma is useful in verifying Condition B.

Lemma 2.9 Let $\mathcal{C} \subset \mathcal{F}$ be a collection of events such that $\Omega \in \mathcal{C}$ and $\mathcal{C}$ is closed under intersections, that is, if $D_1, D_2 \in \mathcal{C}$, then $D_1 \cap D_2 \in \mathcal{C}$. If $X$ and $Y$ are integrable and

$\int_D X\,dP = \int_D Y\,dP \qquad (2.2)$

for all $D \in \mathcal{C}$, then (2.2) holds for all $D \in \sigma(\mathcal{C})$ (the smallest $\sigma$-algebra containing $\mathcal{C}$).
Example: Assume that $\mathcal{D} = \sigma(D_1, D_2, \ldots)$ where $\bigcup_{i=1}^{\infty} D_i = \Omega$ and $D_i \cap D_j = \emptyset$ whenever $i \neq j$. Let $X$ be any $\mathcal{F}$-measurable random variable. Then

$E[X|\mathcal{D}] = \sum_{i=1}^{\infty} \frac{E[XI_{D_i}]}{P(D_i)}\, I_{D_i}.$

To see that the above expression is correct, first note that the right hand side is $\mathcal{D}$-measurable. Furthermore, any $D \in \mathcal{D}$ can be written as $D = \bigcup_{i\in A} D_i$, where $A \subset \{1, 2, 3, \ldots\}$. Therefore,

$\int_D \sum_{i=1}^{\infty} \frac{E[XI_{D_i}]}{P(D_i)}\, I_{D_i}\,dP = \sum_{i=1}^{\infty} \frac{E[XI_{D_i}]}{P(D_i)} \int_{D\cap D_i} I_{D_i}\,dP \qquad \text{(monotone convergence thm)}$

$= \sum_{i\in A} \frac{E[XI_{D_i}]}{P(D_i)}\, P(D_i) = \int_D X\,dP.$
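The partition formula is easy to probe numerically. In this added sketch (not from the notes), $\Omega$ is sampled uniformly from $[0,3)$, $X(\omega) = \omega^2$, and the $D_i$ are the intervals $[i, i+1)$; all of these choices are illustrative assumptions.

```python
import random

# E[X|D] = sum_i (E[X I_{D_i}] / P(D_i)) I_{D_i} for the partition D_i = [i, i+1).
random.seed(2)
omega = [random.uniform(0.0, 3.0) for _ in range(60_000)]
X = [w * w for w in omega]                      # X(w) = w^2

cells = {}
for w, x in zip(omega, X):
    cells.setdefault(int(w), []).append(x)      # group sample points by cell
cond = {i: sum(v) / len(v) for i, v in cells.items()}  # E[X I_{D_i}] / P(D_i)

# Condition B with D = Omega: E[E[X|D]] = E[X].
lhs = sum(cond[int(w)] for w in omega) / len(omega)
rhs = sum(X) / len(X)
```

Here `cond[i]` approximates $\int_i^{i+1} w^2\,dw = ((i+1)^3 - i^3)/3$, i.e. $1/3$, $7/3$, $19/3$, and `lhs` agrees with `rhs`, the unconditional mean.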
Properties of conditional expectation. Assume that $X$ and $Y$ are integrable random variables and that $\mathcal{D}$ is a sub-$\sigma$-algebra of $\mathcal{F}$.

1) $E[E[X|\mathcal{D}]] = E[X]$. Just take $D = \Omega$ in Condition B.

2) If $X \ge 0$ then $E[X|\mathcal{D}] \ge 0$. The property holds because $Y = E[X|\mathcal{D}]$ is $\mathcal{D}$-measurable and $\int_D Y\,dP = \int_D X\,dP \ge 0$ for every $D \in \mathcal{D}$. Therefore, $Y$ must be nonnegative a.s.

3) $E[aX + bY|\mathcal{D}] = aE[X|\mathcal{D}] + bE[Y|\mathcal{D}]$. It is obvious that the RHS is $\mathcal{D}$-measurable, being the linear combination of two $\mathcal{D}$-measurable random variables. Also,

$\int_D (aX + bY)\,dP = a\int_D X\,dP + b\int_D Y\,dP = a\int_D E[X|\mathcal{D}]\,dP + b\int_D E[Y|\mathcal{D}]\,dP = \int_D (aE[X|\mathcal{D}] + bE[Y|\mathcal{D}])\,dP.$

4) If $X \le Y$ then $E[X|\mathcal{D}] \le E[Y|\mathcal{D}]$. Use properties (2) and (3) for $Z = Y - X$.
5) If $X$ is $\mathcal{D}$-measurable, then $E[X|\mathcal{D}] = X$.

6) If $Y$ is $\mathcal{D}$-measurable and $YX$ is integrable, then $E[YX|\mathcal{D}] = YE[X|\mathcal{D}]$. First assume that $Y$ is a simple random variable, i.e., let $\{D_i\}_{i=1}^{\infty}$ be a partition of $\Omega$, $D_i \in \mathcal{D}$, $c_i \in \mathbb{R}$, and define $Y = \sum_{i=1}^{\infty} c_i I_{D_i}$. Then,

$\int_D YX\,dP = \int_D \Big(\sum_{i=1}^{\infty} c_i I_{D_i}\Big) X\,dP = \sum_{i=1}^{\infty} c_i \int_{D\cap D_i} X\,dP = \sum_{i=1}^{\infty} c_i \int_{D\cap D_i} E[X|\mathcal{D}]\,dP = \int_D \Big(\sum_{i=1}^{\infty} c_i I_{D_i}\Big) E[X|\mathcal{D}]\,dP = \int_D Y E[X|\mathcal{D}]\,dP.$

For general $Y$, approximate by a sequence $\{Y_n\}_{n=1}^{\infty}$ of simple random variables, for example, defined by

$Y_n = \frac{k}{n} \quad \text{if } \frac{k}{n} \le Y < \frac{k+1}{n},\ k \in \mathbb{Z}.$

Then $Y_n$ converges to $Y$, and the result follows by the Dominated Convergence Theorem.
7) If $X$ is independent of $\mathcal{D}$, then $E[X|\mathcal{D}] = E[X]$. Independence implies that for $D \in \mathcal{D}$, $E[XI_D] = E[X]P(D)$, so

$\int_D X\,dP = \int_\Omega XI_D\,dP = E[XI_D] = E[X]\int_\Omega I_D\,dP = \int_D E[X]\,dP.$

Since $E[X]$ is $\mathcal{D}$-measurable, $E[X] = E[X|\mathcal{D}]$.
8) If $\mathcal{D}_1 \subset \mathcal{D}_2$ then $E[E[X|\mathcal{D}_2]|\mathcal{D}_1] = E[X|\mathcal{D}_1]$. Note that if $D \in \mathcal{D}_1$ then $D \in \mathcal{D}_2$. Therefore,

$\int_D X\,dP = \int_D E[X|\mathcal{D}_2]\,dP = \int_D E[E[X|\mathcal{D}_2]|\mathcal{D}_1]\,dP,$

and the result follows.
A function $\varphi : \mathbb{R} \to \mathbb{R}$ is convex if and only if for all $x$ and $y$ in $\mathbb{R}$, and $\lambda$ in $[0, 1]$, $\varphi(\lambda x + (1-\lambda)y) \le \lambda\varphi(x) + (1-\lambda)\varphi(y)$. We need the following fact about convex functions for the proof of the next property. Let $x_1 < x_2$ and $y \in \mathbb{R}$. Then

$\frac{\varphi(x_2) - \varphi(y)}{x_2 - y} \ge \frac{\varphi(x_1) - \varphi(y)}{x_1 - y}. \qquad (2.3)$

Now assume that $x_1 < y < x_2$ and let $x_2$ converge to $y$ from above. The left side of (2.3) is bounded below, and its value decreases as $x_2$ decreases to $y$. Therefore, the right derivative $\varphi^+$ exists at $y$ and

$-\infty < \varphi^+(y) = \lim_{x_2 \to y+} \frac{\varphi(x_2) - \varphi(y)}{x_2 - y} < +\infty.$

Moreover,

$\varphi(x) \ge \varphi(y) + \varphi^+(y)(x - y), \qquad x \in \mathbb{R}. \qquad (2.4)$
9) Jensen's inequality. If $\varphi$ is convex then

$E[\varphi(X)|\mathcal{D}] \ge \varphi(E[X|\mathcal{D}]).$

Define $M : \Omega \to \mathbb{R}$ as $M = \varphi^+(E[X|\mathcal{D}])$. As measurability is preserved under composition, we can see that $M$ is $\mathcal{D}$-measurable. From (2.4),

$\varphi(X) \ge \varphi(E[X|\mathcal{D}]) + M(X - E[X|\mathcal{D}]),$

and

$E[\varphi(X)|\mathcal{D}] \ge E[\varphi(E[X|\mathcal{D}])|\mathcal{D}] + E[M(X - E[X|\mathcal{D}])|\mathcal{D}]$ (Properties 3 and 4)
$= \varphi(E[X|\mathcal{D}]) + ME[X - E[X|\mathcal{D}]\,|\,\mathcal{D}]$ (Property 6)
$= \varphi(E[X|\mathcal{D}]) + M(E[X|\mathcal{D}] - E[E[X|\mathcal{D}]|\mathcal{D}])$ (Property 3)
$= \varphi(E[X|\mathcal{D}]) + M(E[X|\mathcal{D}] - E[X|\mathcal{D}])$ (Property 8)
$= \varphi(E[X|\mathcal{D}]).$
10) Let $X$ be an $S_1$-valued, $\mathcal{D}$-measurable random variable and $Y$ be an $S_2$-valued random variable independent of $\mathcal{D}$. Suppose that $\varphi : S_1 \times S_2 \to \mathbb{R}$ is a measurable function and that $\varphi(X, Y)$ is integrable. Define

$\psi(x) = E[\varphi(x, Y)].$

Then $E[\varphi(X, Y)|\mathcal{D}] = \psi(X)$.

11) Let $Y$ be an $S_2$-valued random variable (not necessarily independent of $\mathcal{D}$). Suppose that $\varphi : S_1 \times S_2 \to \mathbb{R}$ is a bounded measurable function. Then there exists a measurable $\psi : \Omega \times S_1 \to \mathbb{R}$ such that for each $x \in S_1$,

$\psi(\omega, x) = E[\varphi(x, Y)|\mathcal{D}](\omega) \quad \text{a.s.}$

and

$E[\varphi(X, Y)|\mathcal{D}](\omega) = \psi(\omega, X(\omega)) \quad \text{a.s.}$

for every $\mathcal{D}$-measurable random variable $X$.
12) Let $Y : \Omega \to \mathbb{N}$ be independent of the i.i.d. random variables $\{X_i\}_{i=1}^{\infty}$. Then

$E\Big[\sum_{i=1}^{Y} X_i \,\Big|\, \sigma(Y)\Big] = YE[X_1]. \qquad (2.5)$

Identity (2.5) follows from Property 10 by taking $\varphi(X, Y)(\omega) = \sum_{i=1}^{Y(\omega)} X_i(\omega)$ and noting that $\psi(y) = E[\sum_{i=1}^{y} X_i] = yE[X_1]$.
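Identity (2.5) — a form of Wald's identity — can be checked by simulation; the distributions below ($Y$ uniform on $\{0, \ldots, 10\}$, $X_i \sim N(2, 1)$) are arbitrary illustrative choices.

```python
import random

# Taking expectations in (2.5): E[sum_{i<=Y} X_i] = E[Y] E[X_1].
random.seed(3)

def trial():
    y = random.randint(0, 10)                                 # Y, independent of the X_i
    return sum(random.gauss(2.0, 1.0) for _ in range(y)), y   # X_i ~ N(2, 1)

pairs = [trial() for _ in range(40_000)]
mean_sum = sum(s for s, _ in pairs) / len(pairs)   # estimates E[sum_{i<=Y} X_i]
mean_y = sum(y for _, y in pairs) / len(pairs)     # estimates E[Y] = 5
```

With $E[Y] = 5$ and $E[X_1] = 2$, the random-length sum averages close to $E[Y]E[X_1] = 10$.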
13) $E[|E[X|\mathcal{D}] - E[Y|\mathcal{D}]|^p] \le E[|X - Y|^p]$, $p \ge 1$.

$E[|E[X|\mathcal{D}] - E[Y|\mathcal{D}]|^p] = E[|E[X - Y|\mathcal{D}]|^p]$ (using linearity)
$\le E[E[|X - Y|^p\,|\,\mathcal{D}]]$ (using Jensen's inequality)
$= E[|X - Y|^p]$
14) Let $\{X_n\}_{n=0}^{\infty}$ be a sequence of random variables and $p \ge 1$. If $\lim_{n\to\infty} E[|X - X_n|^p] = 0$, then $\lim_{n\to\infty} E[|E[X|\mathcal{D}] - E[X_n|\mathcal{D}]|^p] = 0$.
3 Continuous time stochastic processes.

A continuous time stochastic process is a random function defined on the time interval $[0, \infty)$, that is, for each $\omega \in \Omega$, $X(\cdot, \omega)$ is a real or vector-valued function (or more generally, $E$-valued for some complete, separable metric space $E$). Unless otherwise stated, we will assume that all stochastic processes are cadlag, that is, for each $\omega \in \Omega$, $X(\cdot, \omega)$ is a right continuous function with left limits at each $t > 0$. $D_E[0, \infty)$ will denote the collection of cadlag $E$-valued functions on $[0, \infty)$. For each $\varepsilon > 0$, a cadlag function has, at most, finitely many discontinuities of magnitude greater than $\varepsilon$ in any compact time interval. (Otherwise, these discontinuities would have a right or left limit point, destroying the cadlag property.) Consequently, a cadlag function can have, at most, a countable number of discontinuities.

If $X$ is a cadlag process, then it is completely determined by the countable family of random variables $\{X(t) : t \text{ rational}\}$.

It is possible to define a metric on $D_E[0, \infty)$ so that it becomes a complete, separable metric space. The distribution of an $E$-valued, cadlag process is then defined by $\mu_X(B) = P\{X(\cdot) \in B\}$ for $B \in \mathcal{B}(D_E[0, \infty))$. We will not discuss the metric at this time. For our present purposes it is enough to know the following.

Theorem 3.1 Let $X$ be an $E$-valued, cadlag process. Then $\mu_X$ on $D_E[0, \infty)$ is determined by its finite dimensional distributions $\{\mu_{t_1, t_2, \ldots, t_n} : 0 \le t_1 \le t_2 \le \cdots \le t_n;\ n \ge 0\}$ where

$\mu_{t_1, t_2, \ldots, t_n}(\Gamma) = P\{(X(t_1), X(t_2), \ldots, X(t_n)) \in \Gamma\}, \qquad \Gamma \in \mathcal{B}(E^n).$
3.1 Examples.

1) Standard Brownian Motion. Recall that the density function of a normal random variable with expectation $\mu$ and variance $\sigma^2$ is given by

$f_{\mu,\sigma^2}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big(-\frac{(x-\mu)^2}{2\sigma^2}\Big).$

For each integer $n > 0$, each selection of times $0 \le t_1 < t_2 < \cdots < t_n$ and vector $x \in \mathbb{R}^n$, define the joint density function

$f_{W(t_1), W(t_2), \ldots, W(t_n)}(x) = f_{0,t_1}(x_1)\, f_{0,t_2-t_1}(x_2 - x_1) \cdots f_{0,t_n-t_{n-1}}(x_n - x_{n-1}).$

Note that the joint densities defined above are consistent in the sense that

$f_{W(t'_1), \ldots, W(t'_{n-1})}(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) = \int_{-\infty}^{\infty} f_{W(t_1), \ldots, W(t_n)}(x_1, \ldots, x_n)\,dx_i$

where $(t'_1, \ldots, t'_{n-1})$ is obtained from $(t_1, \ldots, t_n)$ by deleting the $i$th entry. The Kolmogorov Consistency Theorem assures the existence of a stochastic process with these finite dimensional distributions. An additional argument is needed to show that the process has cadlag (in fact continuous) sample paths.

2) Poisson Process. Again, we specify the Poisson process with parameter $\lambda$ by specifying its finite dimensional distributions. Let

$h(\lambda, k) = e^{-\lambda}\frac{\lambda^k}{k!},$

that is, the Poisson($\lambda$) probability of $k$. For $t_1 < t_2 < \cdots < t_n$, define

$P\{X(t_1) = k_1, X(t_2) = k_2, \ldots, X(t_n) = k_n\} = \begin{cases} h(\lambda t_1, k_1)\, h(\lambda(t_2 - t_1), k_2 - k_1) \cdots h(\lambda(t_n - t_{n-1}), k_n - k_{n-1}) & \text{if } k_1 \le k_2 \le \cdots \le k_n \\ 0 & \text{otherwise.} \end{cases}$
3.2 Filtrations.

Let $\sigma(X(s) : s \le t)$ denote the smallest $\sigma$-algebra such that $X(s)$ is $\sigma(X(s) : s \le t)$-measurable for all $s \le t$.

A collection of $\sigma$-algebras $\{\mathcal{F}_t\}$ satisfying

$\mathcal{F}_s \subset \mathcal{F}_t \subset \mathcal{F}$

for all $s \le t$ is called a filtration. $\mathcal{F}_t$ is interpreted as corresponding to the information available at time $t$ (the amount of information increasing as time progresses). A stochastic process $X$ is adapted to a filtration $\{\mathcal{F}_t\}$ if $X(t)$ is $\mathcal{F}_t$-measurable for all $t \ge 0$.

An $E$-valued stochastic process $X$ adapted to $\{\mathcal{F}_t\}$ is a Markov process with respect to $\{\mathcal{F}_t\}$ if

$E[f(X(t+s))|\mathcal{F}_t] = E[f(X(t+s))|X(t)]$

for all $t, s \ge 0$ and $f \in B(E)$, the bounded, measurable functions on $E$.

A real-valued stochastic process $X$ adapted to $\{\mathcal{F}_t\}$ is a martingale with respect to $\{\mathcal{F}_t\}$ if

$E[X(t+s)|\mathcal{F}_t] = X(t) \qquad (3.1)$

for all $t, s \ge 0$.

Proposition 3.2 Standard Brownian motion, $W$, is both a martingale and a Markov process.

Proof. Let $\mathcal{F}_t = \sigma(W(s) : s \le t)$. Then

$E[W(t+s)|\mathcal{F}_t] = E[W(t+s) - W(t) + W(t)|\mathcal{F}_t]$
$= E[W(t+s) - W(t)|\mathcal{F}_t] + E[W(t)|\mathcal{F}_t]$
$= E[W(t+s) - W(t)] + E[W(t)|\mathcal{F}_t]$
$= E[W(t)|\mathcal{F}_t]$
$= W(t),$

so $W$ is an $\{\mathcal{F}_t\}$-martingale. Similarly, $W$ is a Markov process. Define $T(s)f(x) = E[f(x + W(s))]$, and note that

$E[f(W(t+s))|\mathcal{F}_t] = E[f(W(t+s) - W(t) + W(t))|\mathcal{F}_t] = T(s)f(W(t)) = E[f(W(t+s))|W(t)]. \qquad \square$
3.3 Stopping times.

A random variable $\tau$ with values in $[0, \infty]$ is an $\{\mathcal{F}_t\}$-stopping time if

$\{\tau \le t\} \in \mathcal{F}_t, \qquad \forall t \ge 0.$

Let $X$ be a cadlag stochastic process that is $\{\mathcal{F}_t\}$-adapted. Then

$\tau_\alpha = \inf\{t : X(t) \ge \alpha \text{ or } X(t-) \ge \alpha\}$

is a stopping time. In general, for $B \in \mathcal{B}(\mathbb{R})$, $\tau_B = \inf\{t : X(t) \in B\}$ is not a stopping time; however, if $(\Omega, \mathcal{F}, P)$ is complete and the filtration $\{\mathcal{F}_t\}$ is complete in the sense that $\mathcal{F}_0$ contains all events of probability zero and is right continuous in the sense that $\mathcal{F}_t = \bigcap_{s>t}\mathcal{F}_s$, then for any $B \in \mathcal{B}(\mathbb{R})$, $\tau_B$ is a stopping time.

If $\tau, \tau_1, \tau_2, \ldots$ are stopping times and $c \ge 0$ is a constant, then

1) $\tau_1 \wedge \tau_2$ and $\tau_1 \vee \tau_2$ are stopping times.
2) $\tau + c$, $\tau \wedge c$, and $\tau \vee c$ are stopping times.
3) $\sup_k \tau_k$ is a stopping time.
4) If $\{\mathcal{F}_t\}$ is right continuous, then $\inf_k \tau_k$, $\liminf_{k\to\infty} \tau_k$, and $\limsup_{k\to\infty} \tau_k$ are stopping times.

Lemma 3.3 Let $\tau$ be a stopping time and for $n = 1, 2, \ldots$, define

$\tau_n = \frac{k+1}{2^n}, \quad \text{if } \frac{k}{2^n} \le \tau < \frac{k+1}{2^n},\ k = 0, 1, \ldots.$

Then $\{\tau_n\}$ is a decreasing sequence of stopping times converging to $\tau$.

Proof. Observe that

$\{\tau_n \le t\} = \Big\{\tau_n \le \frac{[2^n t]}{2^n}\Big\} = \Big\{\tau < \frac{[2^n t]}{2^n}\Big\} \in \mathcal{F}_t. \qquad \square$
For a stopping time $\tau$, define

$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau \le t\} \in \mathcal{F}_t,\ \forall t \ge 0\}.$

Then $\mathcal{F}_\tau$ is a $\sigma$-algebra and is interpreted as representing the information available to an observer at the random time $\tau$. Occasionally, one also uses

$\mathcal{F}_{\tau-} = \sigma\{A \cap \{t < \tau\} : A \in \mathcal{F}_t,\ t \ge 0\} \vee \mathcal{F}_0.$

Lemma 3.4 If $\tau_1$ and $\tau_2$ are stopping times and $\tau_1 \le \tau_2$, then $\mathcal{F}_{\tau_1} \subset \mathcal{F}_{\tau_2}$.

Lemma 3.5 If $X$ is cadlag and $\{\mathcal{F}_t\}$-adapted and $\tau$ is a stopping time, then $X(\tau)$ is $\mathcal{F}_\tau$-measurable and $X(\tau \wedge t)$ is $\{\mathcal{F}_t\}$-adapted.
3.4 Brownian motion

A process $X$ has independent increments if for each choice of $0 \le t_1 < t_2 < \cdots < t_m$, the increments $X(t_{k+1}) - X(t_k)$, $k = 1, \ldots, m-1$, are independent. $X$ is a Brownian motion if it has continuous sample paths and independent, Gaussian-distributed increments. It follows that the distribution of a Brownian motion is completely determined by $m_X(t) = E[X(t)]$ and $a_X(t) = \mathrm{Var}(X(t))$. $X$ is standard if $m_X \equiv 0$ and $a_X(t) = t$. Note that the independence of the increments implies

$\mathrm{Var}(X(t+s) - X(t)) = a_X(t+s) - a_X(t),$

so $a_X$ must be nondecreasing. If $X$ is standard,

$\mathrm{Var}(X(t+s) - X(t)) = s.$

Consequently, standard Brownian motion has stationary increments. Ordinarily, we will denote standard Brownian motion by $W$.

If $a_X$ is continuous and nondecreasing and $m_X$ is continuous, then

$X(t) = W(a_X(t)) + m_X(t)$

is a Brownian motion with

$E[X(t)] = m_X(t), \qquad \mathrm{Var}(X(t)) = a_X(t).$
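A standard way to see these formulas numerically (an added sketch; the grid size and sample count are arbitrary choices) is to simulate $W$ by summing independent Gaussian increments:

```python
import random

# Simulate W on [0, 1] with n Gaussian increments of variance dt each, and
# check E[W(1)] = 0 and Var(W(1)) = 1 across many sample paths.
random.seed(4)
n, paths = 100, 10_000
dt = 1.0 / n
finals = []
for _ in range(paths):
    w = 0.0
    for _ in range(n):
        w += random.gauss(0.0, dt ** 0.5)   # increment ~ N(0, dt)
    finals.append(w)

mean_w1 = sum(finals) / paths
var_w1 = sum(x * x for x in finals) / paths - mean_w1 ** 2
```

Because the increments are independent with variances summing to $t$, the sample mean of $W(1)$ is near $0$ and the sample variance near $a_W(1) = 1$.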
3.5 Poisson process

A Poisson process is a model for a series of random observations occurring in time. For example, the process could model the arrivals of customers in a bank, the arrivals of telephone calls at a switch, or the counts registered by radiation detection equipment.

Let $N(t)$ denote the number of observations by time $t$. We assume that $N$ is a counting process, that is, the observations come one at a time, so $N$ is constant except for jumps of $+1$. For $t < s$, $N(s) - N(t)$ is the number of observations in the time interval $(t, s]$. We make the following assumptions about the model.

0) The observations occur one at a time.
1) Numbers of observations in disjoint time intervals are independent random variables, that is, $N$ has independent increments.
2) The distribution of $N(t+a) - N(t)$ does not depend on $t$.

Theorem 3.6 Under assumptions 0), 1), and 2), there is a constant $\lambda$ such that $N(s) - N(t)$ is Poisson distributed with parameter $\lambda(s-t)$, that is,

$P\{N(s) - N(t) = k\} = \frac{(\lambda(s-t))^k}{k!}\, e^{-\lambda(s-t)}.$

If Theorem 3.6 holds, then we refer to $N$ as a Poisson process with parameter $\lambda$. If $\lambda = 1$, we will call $N$ the unit Poisson process.

More generally, if 0) and 1) hold and $\Lambda(t) = E[N(t)]$ is continuous and $\Lambda(0) = 0$, then

$N(t) = Y(\Lambda(t)),$

where $Y$ is a unit Poisson process.

Let $N$ be a Poisson process with parameter $\lambda$, and let $S_k$ be the time of the $k$th observation. Then

$P\{S_k \le t\} = P\{N(t) \ge k\} = 1 - \sum_{i=0}^{k-1} \frac{(\lambda t)^i}{i!}\, e^{-\lambda t}, \qquad t \ge 0.$

Differentiating to obtain the probability density function gives

$f_{S_k}(t) = \begin{cases} \dfrac{\lambda(\lambda t)^{k-1}}{(k-1)!}\, e^{-\lambda t} & t \ge 0 \\ 0 & t < 0. \end{cases}$

The Poisson process can also be viewed as the renewal process based on a sequence of exponentially distributed random variables.

Theorem 3.7 Let $T_1 = S_1$ and for $k > 1$, $T_k = S_k - S_{k-1}$. Then $T_1, T_2, \ldots$ are independent and exponentially distributed with parameter $\lambda$.
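Theorem 3.7 read in reverse gives a simulation recipe: sum i.i.d. Exp($\lambda$) interarrival times until they exceed $t$. The sketch below (an addition to the notes; $\lambda = 3$ and $t = 2$ are arbitrary) checks that the resulting count has the Poisson mean and variance $\lambda t$.

```python
import random

# Build N(t) from exponential interarrival times T_k and check
# E[N(t)] = Var(N(t)) = lambda * t.
random.seed(5)
lam, t = 3.0, 2.0

def poisson_count(lam, t):
    s, n = 0.0, 0
    while True:
        s += random.expovariate(lam)   # next interarrival time T_k ~ Exp(lam)
        if s > t:
            return n
        n += 1

counts = [poisson_count(lam, t) for _ in range(20_000)]
mean_n = sum(counts) / len(counts)
var_n = sum(c * c for c in counts) / len(counts) - mean_n ** 2
```

Both statistics come out near $\lambda t = 6$, as the Poisson($\lambda t$) law requires.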
4 Martingales.

A stochastic process $X$ adapted to a filtration $\{\mathcal{F}_t\}$ is a martingale with respect to $\{\mathcal{F}_t\}$ if

$E[X(t+s)|\mathcal{F}_t] = X(t) \qquad (4.1)$

for all $t, s \ge 0$. It is a submartingale if

$E[X(t+s)|\mathcal{F}_t] \ge X(t) \qquad (4.2)$

and a supermartingale if

$E[X(t+s)|\mathcal{F}_t] \le X(t). \qquad (4.3)$
4.1 Optional sampling theorem and Doob's inequalities.

Theorem 4.1 (Optional sampling theorem.) Let $X$ be a martingale and $\tau_1, \tau_2$ be stopping times. Then for every $t \ge 0$,

$E[X(t \wedge \tau_2)|\mathcal{F}_{\tau_1}] = X(t \wedge \tau_1 \wedge \tau_2).$

If $\tau_2 < \infty$ a.s., $E[|X(\tau_2)|] < \infty$, and $\lim_{t\to\infty} E[|X(t)|I_{\{\tau_2 > t\}}] = 0$, then

$E[X(\tau_2)|\mathcal{F}_{\tau_1}] = X(\tau_1 \wedge \tau_2).$

The same results hold for sub- and supermartingales with $=$ replaced by $\ge$ (submartingales) and $\le$ (supermartingales).

Proof. See, for example, Ethier and Kurtz (1986), Theorem 2.2.13. $\square$

Theorem 4.2 (Doob's inequalities.) If $X$ is a non-negative submartingale, then

$P\{\sup_{s\le t} X(s) \ge x\} \le \frac{E[X(t)]}{x}$

and for $\alpha > 1$,

$E[\sup_{s\le t} X(s)^\alpha] \le \Big(\frac{\alpha}{\alpha - 1}\Big)^\alpha E[X(t)^\alpha].$

Proof. Let $\tau_x = \inf\{t : X(t) \ge x\}$ and set $\tau_2 = t$ and $\tau_1 = \tau_x$. Then from the optional sampling theorem we have that

$E[X(t)|\mathcal{F}_{\tau_x}] \ge X(t \wedge \tau_x) \ge I_{\{\tau_x \le t\}} X(\tau_x) \ge xI_{\{\tau_x \le t\}} \quad \text{a.s.}$

so we have that

$E[X(t)] \ge xP\{\tau_x \le t\} = xP\{\sup_{s\le t} X(s) \ge x\}.$

See Ethier and Kurtz, Proposition 2.2.16 for the second inequality. $\square$
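A discrete-time stand-in (an added illustration): for the non-negative submartingale $|S_k|$, with $S_k$ a simple random walk, the first inequality can be checked empirically. The walk length and threshold below are arbitrary choices.

```python
import random

# Doob's first inequality for X(k) = |S_k|:
#   P{ max_{k<=m} |S_k| >= x } <= E[|S_m|] / x.
random.seed(6)
steps, trials, x = 100, 20_000, 15.0
exceed, abs_final = 0, 0.0
for _ in range(trials):
    s, m = 0, 0
    for _ in range(steps):
        s += random.choice((-1, 1))
        m = max(m, abs(s))
    exceed += m >= x
    abs_final += abs(s)

emp_prob = exceed / trials             # empirical P{ max >= x }
doob_bound = (abs_final / trials) / x  # empirical E[|S_m|] / x
```

The empirical exceedance probability stays below the Doob bound, though the bound is far from tight here.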
Lemma 4.3 If $M$ is a martingale and $\varphi$ is convex with $E[|\varphi(M(t))|] < \infty$, then

$X(t) \equiv \varphi(M(t))$

is a submartingale.

Proof.

$E[\varphi(M(t+s))|\mathcal{F}_t] \ge \varphi(E[M(t+s)|\mathcal{F}_t]) = \varphi(M(t))$

by Jensen's inequality. $\square$

From the above lemma, it follows that if $M$ is a martingale, then

$P\{\sup_{s\le t} |M(s)| \ge x\} \le \frac{E[|M(t)|]}{x} \qquad (4.4)$

and

$E[\sup_{s\le t} |M(s)|^2] \le 4E[M(t)^2]. \qquad (4.5)$
4.2 Local martingales.

The concept of a local martingale plays a major role in the development of stochastic integration. $M$ is a local martingale if there exists a sequence of stopping times $\{\tau_n\}$ such that $\lim_{n\to\infty} \tau_n = \infty$ a.s. and for each $n$, $M^{\tau_n} \equiv M(\cdot \wedge \tau_n)$ is a martingale.

The total variation of $Y$ up to time $t$ is defined as

$T_t(Y) \equiv \sup \sum_i |Y(t_{i+1}) - Y(t_i)|$

where the supremum is over all partitions of the interval $[0, t]$. $Y$ is an FV-process if $T_t(Y) < \infty$ for each $t > 0$.

Theorem 4.4 (Fundamental Theorem of Local Martingales.) Let $M$ be a local martingale, and let $\delta > 0$. Then there exist local martingales $\widetilde{M}$ and $A$ satisfying $M = \widetilde{M} + A$ such that $A$ is FV and the discontinuities of $\widetilde{M}$ are bounded by $\delta$.

Remark 4.5 One consequence of this theorem is that any local martingale can be decomposed into an FV process and a local square integrable martingale. Specifically, if $\sigma_c = \inf\{t : |\widetilde{M}(t)| \ge c\}$, then $\widetilde{M}(\cdot \wedge \sigma_c)$ is a square integrable martingale. (Note that $|\widetilde{M}(\cdot \wedge \sigma_c)| \le c + \delta$.)

Proof. See Protter (1990), Theorem III.13. $\square$
4.3 Quadratic variation.

The quadratic variation of a process $Y$ is defined as

$[Y]_t = \lim_{\max|t_{i+1}-t_i|\to 0} \sum_i (Y(t_{i+1}) - Y(t_i))^2$

where convergence is in probability; that is, for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every partition $\{t_i\}$ of the interval $[0, t]$ satisfying $\max|t_{i+1} - t_i| \le \delta$ we have

$P\Big\{\Big|[Y]_t - \sum_i (Y(t_{i+1}) - Y(t_i))^2\Big| \ge \varepsilon\Big\} \le \varepsilon.$

If $Y$ is FV, then $[Y]_t = \sum_{s\le t} (Y(s) - Y(s-))^2 = \sum_{s\le t} \Delta Y(s)^2$, where the summation can be taken over the points of discontinuity only and $\Delta Y(s) \equiv Y(s) - Y(s-)$ is the jump in $Y$ at time $s$. Note that for any partition of $[0, t]$,

$\sum_i (Y(t_{i+1}) - Y(t_i))^2 \le \sum_{|Y(t_{i+1})-Y(t_i)|>\varepsilon} (Y(t_{i+1}) - Y(t_i))^2 + \varepsilon\, T_t(Y).$

Proposition 4.6 (i) If $M$ is a local martingale, then $[M]_t$ exists and is right continuous. (ii) If $M$ is a square integrable martingale, then the limit

$\lim_{\max|t_{i+1}-t_i|\to 0} \sum_i (M(t_{i+1}) - M(t_i))^2$

exists in $L^1$ and $E[M(t)^2] = E[[M]_t]$.

Proof. See, for example, Ethier and Kurtz (1986), Proposition 2.3.4. $\square$

Let $M$ be a square integrable martingale with $M(0) = 0$. Write $M(t) = \sum_{i=0}^{m-1} (M(t_{i+1}) - M(t_i))$ where $0 = t_0 < \cdots < t_m = t$. Then

$E[M(t)^2] = E\Big[\Big(\sum_{i=0}^{m-1} (M(t_{i+1}) - M(t_i))\Big)^2\Big] \qquad (4.6)$
$= E\Big[\sum_{i=0}^{m-1} (M(t_{i+1}) - M(t_i))^2 + \sum_{i\neq j} (M(t_{i+1}) - M(t_i))(M(t_{j+1}) - M(t_j))\Big].$

For $t_i < t_{i+1} \le t_j < t_{j+1}$,

$E[(M(t_{i+1}) - M(t_i))(M(t_{j+1}) - M(t_j))] \qquad (4.7)$
$= E[E[(M(t_{i+1}) - M(t_i))(M(t_{j+1}) - M(t_j))|\mathcal{F}_{t_j}]]$
$= E[(M(t_{i+1}) - M(t_i))(E[M(t_{j+1})|\mathcal{F}_{t_j}] - M(t_j))]$
$= 0,$

and thus the expectation of the second sum in (4.6) vanishes. By the $L^1$ convergence in Proposition 4.6,

$E[M(t)^2] = E\Big[\sum_{i=0}^{m-1} (M(t_{i+1}) - M(t_i))^2\Big] = E[[M]_t].$
Example 4.7 If $M(t) = N(t) - \lambda t$ where $N(t)$ is a Poisson process with parameter $\lambda$, then
$[M]_t = N(t)$, and since $M(t)$ is square integrable, the limit exists in $L^1$.

Example 4.8 For standard Brownian motion W, $[W]_t = t$. To check this identity, apply
the law of large numbers to
$$\sum_{k=1}^{[nt]} \Big(W\Big(\frac{k}{n}\Big) - W\Big(\frac{k-1}{n}\Big)\Big)^2.$$
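Both identities are easy to check numerically. The following sketch (plain NumPy; the mesh sizes, rate, and seed are illustrative choices, not from the notes) computes the sum of squared increments over a fine partition for a Brownian path and for a compensated Poisson path:

```python
import numpy as np

rng = np.random.default_rng(0)

# [W]_t = t: squared increments of Brownian motion over a mesh of size t/n
t, n = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(t / n), size=n)
qv_W = np.sum(dW**2)          # close to t = 1

# [M]_t = N(t) for M(t) = N(t) - lam*t, N a rate-lam Poisson process
lam, T, m = 5.0, 10.0, 1_000_000
jump_times = np.cumsum(rng.exponential(1.0 / lam, size=int(3 * lam * T)))
grid = np.linspace(0.0, T, m + 1)
N = np.searchsorted(jump_times, grid)    # N evaluated on the grid
M = N - lam * grid
qv_M = np.sum(np.diff(M)**2)             # close to N(T), the number of jumps
```

On a mesh fine enough that each interval contains at most one jump, each jump contributes approximately 1 to the sum, which is why the compensator $\lambda t$ drops out of the quadratic variation.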
Proposition 4.9 If M is a square integrable $\mathcal{F}_t$-martingale, then $M(t)^2 - [M]_t$ is an
$\mathcal{F}_t$-martingale.

Remark 4.10 For example, if W is standard Brownian motion, then $W(t)^2 - t$ is a martingale.
Proof. The conclusion follows from part (ii) of the previous proposition. For $t, s \geq 0$, let
$\{u_i\}$ be a partition of $(0, t+s]$ with $0 = u_0 < u_1 < \cdots < u_m = t < u_{m+1} < \cdots < u_n = t+s$.
Then
$$E[M(t+s)^2 \mid \mathcal{F}_t] = E[(M(t+s) - M(t))^2 \mid \mathcal{F}_t] + M(t)^2$$
$$= E\Big[\Big(\sum_{i=m}^{n-1} M(u_{i+1}) - M(u_i)\Big)^2 \,\Big|\, \mathcal{F}_t\Big] + M(t)^2$$
$$= E\Big[\sum_{i=m}^{n-1} (M(u_{i+1}) - M(u_i))^2 \,\Big|\, \mathcal{F}_t\Big] + M(t)^2$$
$$= E[[M]_{t+s} - [M]_t \mid \mathcal{F}_t] + M(t)^2,$$
where the first equality follows since, as in (4.7), the conditional expectation of the cross
product term is zero and the last equality follows from the $L^1$ convergence in Proposition
4.6.
4.4 Martingale convergence theorem.

Theorem 4.11 (Martingale convergence theorem.) Let X be a submartingale satisfying
$\sup_t E[|X(t)|] < \infty$. Then $\lim_{t\to\infty} X(t)$ exists a.s.
Proof. See, for example, Durrett (1991), Theorem 4.2.10.
5 Stochastic integrals.

Let X and Y be cadlag processes, and let $\{t_i\}$ denote a partition of the interval $[0,t]$. Then
we define the stochastic integral of X with respect to Y by
$$\int_0^t X(s-)\,dY(s) \equiv \lim \sum_i X(t_i)(Y(t_{i+1}) - Y(t_i)) \qquad (5.1)$$
where the limit is in probability and is taken as $\max|t_{i+1} - t_i| \to 0$. For example, let $W(t)$
be a standard Brownian motion. Then
$$\int_0^t W(s)\,dW(s) = \lim \sum_i W(t_i)(W(t_{i+1}) - W(t_i)) \qquad (5.2)$$
$$= \lim \sum_i \Big(W(t_i)W(t_{i+1}) - \frac{1}{2}W(t_{i+1})^2 - \frac{1}{2}W(t_i)^2\Big) + \sum_i \Big(\frac{1}{2}W(t_{i+1})^2 - \frac{1}{2}W(t_i)^2\Big)$$
$$= \frac{1}{2}W(t)^2 - \lim \frac{1}{2}\sum_i (W(t_{i+1}) - W(t_i))^2 = \frac{1}{2}W(t)^2 - \frac{1}{2}t.$$
This example illustrates the significance of the use of the left end point of $[t_i, t_{i+1}]$ in the
evaluation of the integrand. If we replace $t_i$ by $t_{i+1}$ in (5.2), we obtain
$$\lim \sum_i W(t_{i+1})(W(t_{i+1}) - W(t_i))$$
$$= \lim \sum_i \Big(-W(t_i)W(t_{i+1}) + \frac{1}{2}W(t_{i+1})^2 + \frac{1}{2}W(t_i)^2\Big) + \sum_i \Big(\frac{1}{2}W(t_{i+1})^2 - \frac{1}{2}W(t_i)^2\Big)$$
$$= \frac{1}{2}W(t)^2 + \lim \frac{1}{2}\sum_i (W(t_{i+1}) - W(t_i))^2 = \frac{1}{2}W(t)^2 + \frac{1}{2}t.$$
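The discrepancy between the two evaluation points is easy to see on a simulated path. In the sketch below (illustrative mesh and seed), the left-endpoint and right-endpoint sums differ by exactly the sum of squared increments, which is approximately $t$:

```python
import numpy as np

rng = np.random.default_rng(1)
t, n = 1.0, 200_000
dW = rng.normal(0.0, np.sqrt(t / n), size=n)
W = np.concatenate([[0.0], np.cumsum(dW)])   # W(t_i) on the partition

left = np.sum(W[:-1] * dW)    # Ito sum:  sum W(t_i)(W(t_{i+1}) - W(t_i))
right = np.sum(W[1:] * dW)    # right-endpoint sum

ito_value = 0.5 * W[-1]**2 - 0.5 * t   # (1/2)W(t)^2 - t/2
```

`left` is close to `ito_value`, while `right - left` equals the sum of squared increments on the partition, which is close to $t$.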
5.1 Definition of the stochastic integral.

Throughout, we will be interested in the stochastic integral as a stochastic process. With
this interest in mind, we will use a slightly different definition of the stochastic integral than
that given in (5.1). For any partition $\{t_i\}$ of $[0,\infty)$, $0 = t_0 < t_1 < t_2 < \cdots$, and any cadlag
x and y, define
$$S(t, \{t_i\}, x, y) = \sum_i x(t_i)(y(t \wedge t_{i+1}) - y(t \wedge t_i)).$$
For stochastic processes X and Y, define $Z = \int X^-\,dY$ if for each $T > 0$ and each $\varepsilon > 0$
there exists a $\delta > 0$ such that
$$P\{\sup_{t \leq T} |Z(t) - S(t, \{t_i\}, X, Y)| \geq \varepsilon\} \leq \varepsilon$$
for all partitions $\{t_i\}$ satisfying $\max|t_{i+1} - t_i| \leq \delta$.
If X is piecewise constant, that is, for some collection of random variables $\{\xi_i\}$ and
random variables $\{\tau_i\}$ satisfying $0 = \tau_0 < \tau_1 < \cdots$,
$$X = \sum_i \xi_i I_{[\tau_i, \tau_{i+1})},$$
then
$$\int_0^t X(s-)\,dY(s) = \sum_i \xi_i (Y(t \wedge \tau_{i+1}) - Y(t \wedge \tau_i)) = \sum_i X(\tau_i)(Y(t \wedge \tau_{i+1}) - Y(t \wedge \tau_i)).$$
Our first problem is to identify more general conditions on X and Y under which $\int X^-\,dY$
will exist.
5.2 Conditions for existence.

The first set of conditions we will consider requires that the integrator Y be of finite variation.
The total variation of Y up to time t is defined as
$$T_t(Y) \equiv \sup \sum_i |Y(t_{i+1}) - Y(t_i)|$$
where the supremum is over all partitions of the interval $[0,t]$.

Proposition 5.1 $T_t(f) < \infty$ for each $t > 0$ if and only if there exist monotone increasing
functions $f_1, f_2$ such that $f = f_1 - f_2$. If $T_t(f) < \infty$, then $f_1$ and $f_2$ can be selected so that
$T_t(f) = f_1(t) + f_2(t)$. If f is cadlag, then $T_t(f)$ is cadlag.

Proof. Note that
$$T_t(f) - f(t) = \sup \sum_i \big(|f(t_{i+1}) - f(t_i)| - (f(t_{i+1}) - f(t_i))\big)$$
is an increasing function of t, as is $T_t(f) + f(t)$.
Theorem 5.2 If Y is of finite variation, then $\int X^-\,dY$ exists for all X, $\int X^-\,dY$ is cadlag,
and if Y is continuous, $\int X^-\,dY$ is continuous. (Recall that we are assuming throughout that
X is cadlag.)

Proof. Let $\{t_i\}$, $\{s_i\}$ be partitions. Let $\{u_i\}$ be a refinement of both. Then there exist
$k_i, l_i, k_i', l_i'$ such that
$$Y(t_{i+1}) - Y(t_i) = \sum_{j=k_i}^{l_i} Y(u_{j+1}) - Y(u_j), \qquad Y(s_{i+1}) - Y(s_i) = \sum_{j=k_i'}^{l_i'} Y(u_{j+1}) - Y(u_j).$$
Define
$$t(u) = t_i, \quad t_i \leq u < t_{i+1}; \qquad s(u) = s_i, \quad s_i \leq u < s_{i+1}, \qquad (5.3)$$
so that
$$|S(t, \{t_i\}, X, Y) - S(t, \{s_i\}, X, Y)| \qquad (5.4)$$
$$= \Big|\sum_i X(t(u_i))(Y(u_{i+1} \wedge t) - Y(u_i \wedge t)) - \sum_i X(s(u_i))(Y(u_{i+1} \wedge t) - Y(u_i \wedge t))\Big|$$
$$\leq \sum_i |X(t(u_i)) - X(s(u_i))|\,|Y(u_{i+1} \wedge t) - Y(u_i \wedge t)|.$$
But there is a measure $\mu_Y$ such that $T_t(Y) = \mu_Y(0,t]$. Since $|Y(b) - Y(a)| \leq \mu_Y(a,b]$, the
right side of (5.4) is less than
$$\sum_i |X(t(u_i)) - X(s(u_i))|\,\mu_Y(u_i \wedge t, u_{i+1} \wedge t] = \sum_i \int_{(u_i \wedge t,\, u_{i+1} \wedge t]} |X(t(u)) - X(s(u))|\,\mu_Y(du)$$
$$= \int_{(0,t]} |X(t(u)) - X(s(u))|\,\mu_Y(du).$$
But
$$\lim |X(t(u)) - X(s(u))| = 0$$
as the meshes go to zero (both $X(t(u))$ and $X(s(u))$ converge to $X(u-)$), so
$$\int_{(0,t]} |X(t(u)) - X(s(u))|\,\mu_Y(du) \to 0 \qquad (5.5)$$
by the bounded convergence theorem. Since the integral in (5.5) is monotone in t, the
convergence is uniform on bounded time intervals.
Recall that the quadratic variation of a process is defined by
$$[Y]_t = \lim_{\max|t_{i+1}-t_i| \to 0} \sum_i (Y(t_{i+1}) - Y(t_i))^2,$$
where convergence is in probability. For example, if Y is a Poisson process, then $[Y]_t = Y(t)$,
and for standard Brownian motion, $[W]_t = t$.
Note that
$$\sum_i (Y(t_{i+1}) - Y(t_i))^2 = Y(t)^2 - Y(0)^2 - 2\sum_i Y(t_i)(Y(t_{i+1}) - Y(t_i))$$
so that
$$[Y]_t = Y(t)^2 - Y(0)^2 - 2\int_0^t Y(s-)\,dY(s),$$
and $[Y]_t$ exists if and only if $\int Y^-\,dY$ exists. By Proposition 4.6, $[M]_t$ exists for any local
martingale, and by Proposition 4.9, for a square integrable martingale M, $M(t)^2 - [M]_t$ is a
martingale.
If M is a square integrable martingale and X is bounded (by a constant) and adapted,
then for any partition $\{t_i\}$,
$$Y(t) = S(t, \{t_i\}, X, M) = \sum_i X(t_i)(M(t \wedge t_{i+1}) - M(t \wedge t_i))$$
is a square-integrable martingale. (In fact, each summand is a square-integrable martingale.)
This observation is the basis for the following theorem.
Theorem 5.3 Suppose M is a square integrable $\mathcal{F}_t$-martingale and X is cadlag and $\mathcal{F}_t$-adapted. Then $\int X^-\,dM$ exists.

Proof. Claim: if we can prove
$$\int_0^t X(s-)\,dM(s) = \lim \sum_i X(t_i)(M(t_{i+1} \wedge t) - M(t_i \wedge t))$$
for every bounded cadlag X, we are done. To verify this claim, let $X_k(t) = (X(t) \wedge k) \vee (-k)$
and suppose
$$\lim \sum_i X_k(t_i)(M(t_{i+1} \wedge t) - M(t_i \wedge t)) = \int_0^t X_k(s-)\,dM(s)$$
exists. Since $\int_0^t X(s-)\,dM(s) = \int_0^t X_k(s-)\,dM(s)$ on $\{\sup_{s \leq t}|X(s)| \leq k\}$, the assertion is
clear.
Now suppose $|X(t)| \leq C$. Since M is a square integrable martingale and |X| is bounded,
it follows that for any partition $\{t_i\}$, $S(t, \{t_i\}, X, M)$ is a square integrable martingale. (As
noted above, for each i, $X(t_i)(M(t_{i+1} \wedge t) - M(t_i \wedge t))$ is a square-integrable martingale.)
For two partitions $\{t_i\}$ and $\{s_i\}$, define $\{u_i\}$, $t(u)$, and $s(u)$ as in the proof of Theorem 5.2.
Recall that $t(u_i), s(u_i) \leq u_i$, so $X(t(u))$ and $X(s(u))$ are $\mathcal{F}_u$-adapted. Then by Doob's
inequality and the properties of martingales,
$$E[\sup_{t \leq T}(S(t, \{t_i\}, X, M) - S(t, \{s_i\}, X, M))^2] \qquad (5.6)$$
$$\leq 4E[(S(T, \{t_i\}, X, M) - S(T, \{s_i\}, X, M))^2]$$
$$= 4E\Big[\Big(\sum_i (X(t(u_i)) - X(s(u_i)))(M(u_{i+1} \wedge T) - M(u_i \wedge T))\Big)^2\Big]$$
$$= 4E\Big[\sum_i (X(t(u_i)) - X(s(u_i)))^2(M(u_{i+1} \wedge T) - M(u_i \wedge T))^2\Big]$$
$$= 4E\Big[\sum_i (X(t(u_i)) - X(s(u_i)))^2([M]_{u_{i+1} \wedge T} - [M]_{u_i \wedge T})\Big].$$
Note that [M] is nondecreasing and so determines a measure by $\mu_{[M]}(0,t] = [M]_t$, and it
follows that
$$E\Big[\sum_i (X(t(u_i)) - X(s(u_i)))^2([M]_{u_{i+1}} - [M]_{u_i})\Big] = E\Big[\int_{(0,t]} (X(t(u)) - X(s(u)))^2\,\mu_{[M]}(du)\Big], \qquad (5.7)$$
since $X(t(u))$ and $X(s(u))$ are constant between $u_i$ and $u_{i+1}$. Now
$$\Big|\int_{(0,t]} (X(t(u)) - X(s(u)))^2\,\mu_{[M]}(du)\Big| \leq 4C^2\,\mu_{[M]}(0,t],$$
so by the fact that X is cadlag and the dominated convergence theorem, the right side of (5.7)
goes to zero as $\max|t_{i+1} - t_i| \to 0$ and $\max|s_{i+1} - s_i| \to 0$. Consequently, $\int_0^t X(s-)\,dM(s)$
exists by the completeness of $L^2$, or more precisely, by the completeness of the space of
processes with norm
$$\|Z\|_T = \sqrt{E[\sup_{t \leq T}|Z(t)|^2]}.$$
Corollary 5.4 If M is a square integrable martingale and X is adapted, then $\int X^-\,dM$ is
cadlag. If, in addition, M is continuous, then $\int X^-\,dM$ is continuous. If $|X| \leq C$ for some
constant $C > 0$, then $\int X^-\,dM$ is a square integrable martingale.
Proposition 5.5 Suppose M is a square integrable martingale and
$$E\Big[\int_0^t X(s-)^2\,d[M]_s\Big] < \infty.$$
Then $\int X^-\,dM$ is a square integrable martingale with
$$E\Big[\Big(\int_0^t X(s-)\,dM(s)\Big)^2\Big] = E\Big[\int_0^t X(s-)^2\,d[M]_s\Big]. \qquad (5.8)$$

Remark 5.6 If W is standard Brownian motion, the identity becomes
$$E\Big[\Big(\int_0^t X(s)\,dW(s)\Big)^2\Big] = E\Big[\int_0^t X^2(s)\,ds\Big].$$
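The Brownian form of the isometry can be checked by Monte Carlo. The sketch below (illustrative path counts and seed) uses the adapted integrand $X(s) = W(s)$, for which both sides of the identity equal $t^2/2$:

```python
import numpy as np

rng = np.random.default_rng(2)
t, n, paths = 1.0, 200, 20_000
dW = rng.normal(0.0, np.sqrt(t / n), size=(paths, n))
# left-endpoint values W(t_i) for each path, starting from W(0) = 0
W_left = np.hstack([np.zeros((paths, 1)), np.cumsum(dW, axis=1)[:, :-1]])

ito = np.sum(W_left * dW, axis=1)                   # int_0^t W dW, one value per path
lhs = np.mean(ito**2)                               # E[(int W dW)^2]
rhs = np.mean(np.sum(W_left**2, axis=1) * (t / n))  # E[int_0^t W(s)^2 ds]
# both are close to t^2/2 = 0.5
```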
Proof. Suppose $X(t) = \sum_i \xi_i I_{[t_i, t_{i+1})}(t)$ is a simple function. Then
$$E\Big[\Big(\int_0^t X(s-)\,dM(s)\Big)^2\Big] = E\Big[\sum_i X(t_i)^2(M(t_{i+1}) - M(t_i))^2\Big]$$
$$= E\Big[\sum_i X(t_i)^2\big([M]_{t_{i+1}} - [M]_{t_i}\big)\Big] = E\Big[\int_0^t X^2(s-)\,d[M]_s\Big].$$
Now let X be bounded, with $|X(t)| \leq C$, and for a sequence of partitions $\{t_i^n\}$ with
$\lim_n \sup_i |t_{i+1}^n - t_i^n| = 0$, define
$$X_n(t) = X(t_i^n), \quad \text{for } t_i^n \leq t < t_{i+1}^n.$$
Then by the argument in the proof of Theorem 5.3, we have
$$\int_0^t X_n(s-)\,dM(s) = \sum_i X(t_i^n)\big(M(t \wedge t_{i+1}^n) - M(t \wedge t_i^n)\big) \to \int_0^t X(s-)\,dM(s),$$
where the convergence is in $L^2$. Since $\int X_n^-\,dM$ is a martingale, it follows that $\int X^-\,dM$ is
a martingale, and
$$E\Big[\Big(\int_0^t X(s-)\,dM(s)\Big)^2\Big] = \lim_n E\Big[\Big(\int_0^t X_n(s-)\,dM(s)\Big)^2\Big] = \lim_n E\Big[\int_0^t X_n^2(s-)\,d[M]_s\Big] = E\Big[\int_0^t X^2(s-)\,d[M]_s\Big].$$
The last equality holds by the dominated convergence theorem. This establishes (5.8) for
bounded X.
Finally, for arbitrary cadlag and adapted X, define $X_k(t) = (k \wedge X(t)) \vee (-k)$. Then
$$\int_0^t X_k(s-)\,dM(s) \to \int_0^t X(s-)\,dM(s)$$
in probability, and by Fatou's lemma,
$$\liminf_k E\Big[\Big(\int_0^t X_k(s-)\,dM(s)\Big)^2\Big] \geq E\Big[\Big(\int_0^t X(s-)\,dM(s)\Big)^2\Big].$$
But
$$\lim_k E\Big[\Big(\int_0^t X_k(s-)\,dM(s)\Big)^2\Big] = \lim_k E\Big[\int_0^t X_k^2(s-)\,d[M]_s\Big] \qquad (5.9)$$
$$= \lim_k E\Big[\int_0^t X^2(s-) \wedge k^2\,d[M]_s\Big] = E\Big[\int_0^t X^2(s-)\,d[M]_s\Big],$$
so
$$E\Big[\int_0^t X^2(s-)\,d[M]_s\Big] \geq E\Big[\Big(\int_0^t X(s-)\,dM(s)\Big)^2\Big].$$
Since (5.8) holds for bounded X,
$$E\Big[\Big(\int_0^t X_k(s-)\,dM(s) - \int_0^t X_j(s-)\,dM(s)\Big)^2\Big] \qquad (5.10)$$
$$= E\Big[\Big(\int_0^t (X_k(s-) - X_j(s-))\,dM(s)\Big)^2\Big] = E\Big[\int_0^t |X_k(s-) - X_j(s-)|^2\,d[M]_s\Big].$$
Since
$$|X_k(s) - X_j(s)|^2 \leq 4X(s)^2,$$
the dominated convergence theorem implies the right side of (5.10) converges to zero as
$j, k \to \infty$. Consequently,
$$\int_0^t X_k(s-)\,dM(s) \to \int_0^t X(s-)\,dM(s)$$
in $L^2$, and the left side of (5.9) converges to $E[(\int_0^t X(s-)\,dM(s))^2]$, giving (5.8).
If $\int_0^t X(s-)\,dY_1(s)$ and $\int_0^t X(s-)\,dY_2(s)$ exist, then $\int_0^t X(s-)\,d(Y_1(s) + Y_2(s))$ exists and
is given by the sum of the other integrals.
Corollary 5.7 If $Y = M + V$ where M is an $\mathcal{F}_t$-local martingale and V is an $\mathcal{F}_t$-adapted
finite variation process, then $\int X^-\,dY$ exists for all cadlag, adapted X, $\int X^-\,dY$ is cadlag,
and if Y is continuous, $\int X^-\,dY$ is continuous.

Proof. If M is a local square integrable martingale, then there exists a sequence of stopping
times $\{\tau_n\}$ such that $M_n$ defined by $M_n(t) = M(t \wedge \tau_n)$ is a square-integrable martingale.
But for $t < \tau_n$,
$$\int_0^t X(s-)\,dM(s) = \int_0^t X(s-)\,dM_n(s),$$
and hence $\int X^-\,dM$ exists. Linearity gives existence for any Y that is the sum of a local
square integrable martingale and an adapted FV process. But Theorem 4.4 states that
any local martingale is the sum of a local square integrable martingale and an adapted FV
process, so the corollary follows.
5.3 Semimartingales.

With Corollary 5.7 in mind, we define Y to be an $\mathcal{F}_t$-semimartingale if and only if $Y = M + V$, where M is a local martingale with respect to $\{\mathcal{F}_t\}$ and V is an $\mathcal{F}_t$-adapted finite
variation process. By Theorem 4.4 we can always select M and V so that M is local square
integrable. In particular, we can take M to have discontinuities uniformly bounded by a
constant. If the discontinuities of Y are already uniformly bounded by a constant, it will be
useful to know that the decomposition preserves this property for both M and V.

Lemma 5.8 Let Y be a semimartingale and let $\delta \geq \sup_s |\Delta Y(s)|$. Then there exist a local
square integrable martingale M and a finite variation process V such that $Y = M + V$,
$$\sup_s |M(s) - M(s-)| \leq \delta, \qquad \sup_s |V(s) - V(s-)| \leq 2\delta.$$

Proof. Let $Y = \tilde{M} + \tilde{V}$ be a decomposition of Y into a local martingale and an FV process.
By Theorem 4.4, there exists a local martingale M with discontinuities bounded by $\delta$ and
an FV process A such that $\tilde{M} = M + A$. Defining $V = A + \tilde{V} = Y - M$, we see that the
discontinuities of V are bounded by $2\delta$.
The class of semimartingales is closed under a variety of operations. Clearly, it is linear.
It is also easy to see that a stochastic integral with a semimartingale integrator is a
semimartingale, since if we write
$$\int_0^t X(s-)\,dY(s) = \int_0^t X(s-)\,dM(s) + \int_0^t X(s-)\,dV(s),$$
then the first term on the right is a local square integrable martingale whenever M is, and
the second term on the right is a finite variation process whenever V is.
Lemma 5.9 If V is of finite variation, then $Z_2(t) = \int_0^t X(s-)\,dV(s)$ is of finite variation.

Proof. For any partition $\{t_i\}$ of $[a,b]$,
$$|Z_2(b) - Z_2(a)| = \Big|\lim \sum_i X(t_i)(V(t_{i+1}) - V(t_i))\Big| \leq \sup_{a \leq s < b}|X(s)|\,\lim \sum_i |V(t_{i+1}) - V(t_i)| \leq \sup_{a \leq s < b}|X(s)|\,(T_b(V) - T_a(V)),$$
and hence
$$T_t(Z_2) \leq \sup_{0 \leq s < t}|X(s)|\,T_t(V). \qquad (5.11)$$
Lemma 5.10 Let M be a local square integrable martingale, and let X be adapted. Then
$Z_1(t) = \int_0^t X(s-)\,dM(s)$ is a local square integrable martingale.

Proof. There exist $\gamma_1 \leq \gamma_2 \leq \cdots$, $\gamma_n \to \infty$, such that $M_n = M(\cdot \wedge \gamma_n)$ is a square integrable
martingale. Define
$$\sigma_n = \inf\{t : |X(t)| \vee |X(t-)| \geq n\},$$
and note that $\lim_n \sigma_n = \infty$. Then setting $X_n(t) = (X(t) \wedge n) \vee (-n)$,
$$Z_1(t \wedge \gamma_n \wedge \sigma_n) = \int_0^{t \wedge \gamma_n \wedge \sigma_n} X_n(s-)\,dM_n(s)$$
is a square integrable martingale, and hence $Z_1$ is a local square integrable martingale.
We summarize these conclusions as

Theorem 5.11 If Y is a semimartingale with respect to a filtration $\{\mathcal{F}_t\}$ and X is cadlag
and $\mathcal{F}_t$-adapted, then $\int X^-\,dY$ exists and is a cadlag semimartingale.
The following lemma provides a useful estimate on $\int X^-\,dY$ in terms of properties of M
and V.

Lemma 5.12 Let $Y = M + V$ be a semimartingale where M is a local square-integrable
martingale and V is a finite variation process. Let $\tau$ be a stopping time for which $E[[M]_{t \wedge \tau}] = E[M(t \wedge \tau)^2] < \infty$, and let $\sigma_c = \inf\{t : |X(t)| \vee |X(t-)| \geq c\}$. Then
$$P\Big\{\sup_{s \leq t}\Big|\int_0^s X(u-)\,dY(u)\Big| > K\Big\}$$
$$\leq P\{\tau \leq t\} + P\{\sup_{s < t}|X(s)| \geq c\} + P\Big\{\sup_{s \leq t \wedge \tau \wedge \sigma_c}\Big|\int_0^s X(u-)\,dM(u)\Big| > K/2\Big\} + P\Big\{\sup_{s \leq t \wedge \tau \wedge \sigma_c}\Big|\int_0^s X(u-)\,dV(u)\Big| > K/2\Big\}$$
$$\leq P\{\tau \leq t\} + P\{\sup_{s < t}|X(s)| \geq c\} + \frac{16c^2 E[[M]_{t \wedge \tau}]}{K^2} + P\{T_t(V) \geq (2c)^{-1}K\}.$$

Proof. The first inequality is immediate. The second follows by applying Doob's inequality
to the square integrable martingale
$$\int_0^{s \wedge \tau \wedge \sigma_c} X(u-)\,dM(u)$$
and observing that
$$\sup_{u \leq s}\Big|\int_0^u X(v-)\,dV(v)\Big| \leq T_s(V)\,\sup_{u < s}|X(u)|.$$
5.4 Change of time variable.

We defined the stochastic integral as a limit of approximating sums
$$\int_0^t X(s-)\,dY(s) = \lim \sum_i X(t_i)(Y(t \wedge t_{i+1}) - Y(t \wedge t_i)),$$
where the $\{t_i\}$ are a partition of $[0,\infty)$. By Theorem 5.20, the same limit holds if we replace
the $t_i$ by stopping times. The following lemma is a consequence of this observation.

Lemma 5.13 Let Y be an $\mathcal{F}_t$-semimartingale, X be cadlag and $\mathcal{F}_t$-adapted, and $\gamma$ be
continuous and nondecreasing with $\gamma(0) = 0$. For each u, assume $\gamma(u)$ is an $\mathcal{F}_t$-stopping
time. Then $\mathcal{G}_t = \mathcal{F}_{\gamma(t)}$ is a filtration, $Y \circ \gamma$ is a $\mathcal{G}_t$-semimartingale, $X \circ \gamma$ is cadlag and
$\mathcal{G}_t$-adapted, and
$$\int_0^{\gamma(t)} X(s-)\,dY(s) = \int_0^t X \circ \gamma(s-)\,dY \circ \gamma(s). \qquad (5.12)$$
(Recall that if X is $\mathcal{F}_t$-adapted, then $X(\tau)$ is $\mathcal{F}_\tau$-measurable.)

Proof.
$$\int_0^t X \circ \gamma(s-)\,dY \circ \gamma(s) = \lim \sum_i X \circ \gamma(t_i)\big(Y(\gamma(t_{i+1} \wedge t)) - Y(\gamma(t_i \wedge t))\big)$$
$$= \lim \sum_i X \circ \gamma(t_i)\big(Y(\gamma(t_{i+1}) \wedge \gamma(t)) - Y(\gamma(t_i) \wedge \gamma(t))\big) = \int_0^{\gamma(t)} X(s-)\,dY(s),$$
where the last limit follows by Theorem 5.20. That $Y \circ \gamma$ is an $\mathcal{F}_{\gamma(t)}$-semimartingale follows
from the optional sampling theorem.

Lemma 5.14 Let A be strictly increasing and adapted with $A(0) = 0$, and let $\gamma(u) = \inf\{s : A(s) > u\}$. Then $\gamma$ is continuous and nondecreasing, and $\gamma(u)$ is an $\mathcal{F}_t$-stopping time.

Proof. Best done by picture.
For A and $\gamma$ as in Lemma 5.14, define $B(t) = A \circ \gamma(t)$ and note that $B(t) \geq t$.

Lemma 5.15 Let A, $\gamma$, and B be as above, and suppose that Z is nondecreasing with
$Z(0) = 0$. Then
$$\int_0^{\gamma(t)} Z(s-)\,dA(s) = \int_0^t Z \circ \gamma(s-)\,dA \circ \gamma(s)$$
$$= \int_0^t Z \circ \gamma(s-)\,d(B(s) - s) + \int_0^t Z \circ \gamma(s-)\,ds$$
$$= Z \circ \gamma(t)(B(t) - t) - \int_0^t (B(s-) - s)\,dZ \circ \gamma(s) - [B, Z \circ \gamma]_t + \int_0^t Z \circ \gamma(s-)\,ds,$$
and hence
$$\int_0^{\gamma(t)} Z(s-)\,dA(s) \leq Z \circ \gamma(t)(B(t) - t) + \int_0^t Z \circ \gamma(s-)\,ds.$$
5.5 Change of integrator.

Lemma 5.16 Let Y be a semimartingale, and let X and U be cadlag and adapted. Suppose
$Z(t) = \int_0^t X(s-)\,dY(s)$. Then
$$\int_0^t U(s-)\,dZ(s) = \int_0^t U(s-)X(s-)\,dY(s).$$

Proof. Let $\{t_i\}$ be a partition of $[0,t]$, and define
$$t(s) = t_i \quad \text{for } t_i \leq s < t_{i+1},$$
so that
$$\int_0^t U(s-)\,dZ(s) = \lim \sum_i U(t_i)(Z(t \wedge t_{i+1}) - Z(t \wedge t_i))$$
$$= \lim \sum_i U(t_i \wedge t)\int_{t \wedge t_i}^{t \wedge t_{i+1}} X(s-)\,dY(s)$$
$$= \lim \sum_i \int_{t \wedge t_i}^{t \wedge t_{i+1}} U(t(s))X(s-)\,dY(s)$$
$$= \lim \int_0^t U(t(s))X(s-)\,dY(s) = \int_0^t U(s-)X(s-)\,dY(s).$$
The last limit follows from the fact that $U(t(s)) \to U(s-)$ as $\max|t_{i+1} - t_i| \to 0$, by
splitting the integral into martingale and finite variation parts and arguing as in the proofs
of Theorems 5.2 and 5.3.
Example 5.17 Let $\tau$ be a stopping time (w.r.t. $\{\mathcal{F}_t\}$). Then $U(t) = I_{[0,\tau)}(t)$ is cadlag and
adapted, since $\{U(t) = 1\} = \{\tau > t\} \in \mathcal{F}_t$. Note that
$$Y^\tau(t) \equiv Y(t \wedge \tau) = \int_0^t I_{[0,\tau)}(s-)\,dY(s)$$
and
$$\int_0^{t \wedge \tau} X(s-)\,dY(s) = \int_0^t I_{[0,\tau)}(s-)X(s-)\,dY(s) = \int_0^t X(s-)\,dY^\tau(s).$$
5.6 Localization

It will frequently be useful to restrict attention to random time intervals $[0,\tau]$ on which
the processes of interest have desirable properties (for example, are bounded). Let $\tau$ be a
stopping time, and define $Y^\tau$ by $Y^\tau(t) = Y(\tau \wedge t)$ and $X^{\tau-}$ by setting $X^{\tau-}(t) = X(t)$ for
$t < \tau$ and $X^{\tau-}(t) = X(\tau-)$ for $t \geq \tau$. Note that if Y is a local martingale, then $Y^\tau$ is a local
martingale, and if X is cadlag and adapted, then $X^{\tau-}$ is cadlag and adapted. Note also that
if $\tau = \inf\{t : |X(t)| \vee |X(t-)| \geq c\}$, then $|X^{\tau-}| \leq c$.
The next lemma states that it is possible to approximate an arbitrary semimartingale by
semimartingales that are bounded by a constant and (consequently) have bounded discontinuities.
Lemma 5.18 Let $Y = M + V$ be a semimartingale, and assume (without loss of generality)
that $\sup_s |\Delta M(s)| \leq 1$. Let
$$A(t) = \sup_{s \leq t}\big(|M(s)| + |V(s)| + [M]_s + T_s(V)\big)$$
and
$$\gamma_c = \inf\{t : A(t) \geq c\},$$
and define $M^c \equiv M^{\gamma_c}$, $V^c \equiv V^{\gamma_c}$, and $Y^c \equiv M^c + V^c$. Then $Y^c(t) = Y(t)$ for $t < \gamma_c$,
$\lim_{c \to \infty} \gamma_c = \infty$, $|\Delta Y^c| \leq c + 1$, $\sup_s |Y^c(s)| \leq 2c + 1$, $[M^c]_\infty \leq c + 1$, and $T_\infty(V^c) \leq c$.
Finally, note that
$$S(t \wedge \tau, \{t_i\}, X, Y) = S(t, \{t_i\}, X^{\tau-}, Y^\tau). \qquad (5.13)$$
5.7 Approximation of stochastic integrals.

Proposition 5.19 Suppose Y is a semimartingale, $X_1, X_2, X_3, \ldots$ are cadlag and adapted,
and
$$\lim_n \sup_{t \leq T}|X_n(t) - X(t)| = 0$$
in probability for each $T > 0$. Then
$$\lim_n \sup_{t \leq T}\Big|\int_0^t X_n(s-)\,dY(s) - \int_0^t X(s-)\,dY(s)\Big| = 0$$
in probability.

Proof. By linearity and localization, it is enough to consider the cases Y a square integrable
martingale and Y a finite variation process, and we can assume that $|X_n| \leq C$ for some
constant C. The martingale case follows easily from Doob's inequality and the dominated
convergence theorem, and the FV case follows by the dominated convergence theorem.
Theorem 5.20 Let Y be a semimartingale and X be cadlag and adapted. For each n,
let $0 = \tau_0^n \leq \tau_1^n \leq \tau_2^n \leq \cdots$ be stopping times, and suppose that $\lim_{k \to \infty} \tau_k^n = \infty$ and
$\lim_n \sup_k |\tau_{k+1}^n - \tau_k^n| = 0$. Then for each $T > 0$,
$$\lim_n \sup_{t \leq T}\Big|S(t, \{\tau_k^n\}, X, Y) - \int_0^t X(s-)\,dY(s)\Big| = 0.$$

Proof. If Y is FV, then the proof is exactly the same as for Theorem 5.2 (which is an $\omega$ by $\omega$
argument). If Y is a square integrable martingale and X is bounded by a constant, then
defining $\tau^n(u) = \tau_k^n$ for $\tau_k^n \leq u < \tau_{k+1}^n$,
$$E\Big[\Big(S(t, \{\tau_k^n\}, X, Y) - \int_0^t X(s-)\,dY(s)\Big)^2\Big] = E\Big[\int_0^t (X(\tau^n(u)) - X(u-))^2\,d[Y]_u\Big]$$
and the result follows by the dominated convergence theorem. The theorem follows from
these two cases by linearity and localization.
5.8 Connection to Protter's text.

The approach to stochastic integration taken here differs somewhat from that taken in Protter
(1990) in that we assume that all integrands are cadlag and do not introduce the notion
of predictability. In fact, however, predictability is simply hidden from view and is revealed
in the requirement that the integrands are evaluated at the left end points in the definition
of the approximating partial sums. If X is a cadlag integrand in our definition, then the left
continuous process $X(\cdot-)$ is the predictable integrand in the usual theory. Consequently,
our notation
$$\int X^-\,dY \quad \text{and} \quad \int_0^t X(s-)\,dY(s)$$
emphasizes this connection.
Protter (1990) defines H to be simple and predictable if
$$H(t) = \sum_{i=0}^m \xi_i I_{(\tau_i, \tau_{i+1}]}(t),$$
where $\tau_0 < \tau_1 < \cdots$ are $\mathcal{F}_t$-stopping times and the $\xi_i$ are $\mathcal{F}_{\tau_i}$-measurable. Note that H is
left continuous. In Protter, $H \cdot Y$ is defined by
$$H \cdot Y(t) = \sum_i \xi_i\big(Y(\tau_{i+1} \wedge t) - Y(\tau_i \wedge t)\big).$$
Defining
$$X(t) = \sum_i \xi_i I_{[\tau_i, \tau_{i+1})}(t),$$
we see that $H(t) = X(t-)$ and note that
$$H \cdot Y(t) = \int_0^t X(s-)\,dY(s),$$
so the definitions of the stochastic integral are consistent for simple functions. Protter
extends the definition of $H \cdot Y$ by continuity, and Proposition 5.19 ensures that the definitions
are consistent for all H satisfying $H(t) = X(t-)$, where X is cadlag and adapted.
6 Covariation and Itô's formula.

6.1 Quadratic covariation.

The covariation of $Y_1, Y_2$ is defined by
$$[Y_1, Y_2]_t \equiv \lim \sum_i (Y_1(t_{i+1}) - Y_1(t_i))(Y_2(t_{i+1}) - Y_2(t_i)) \qquad (6.1)$$
where the $\{t_i\}$ are partitions of $[0,t]$ and the limit is in probability as $\max|t_{i+1} - t_i| \to 0$.
Note that
$$[Y_1 + Y_2, Y_1 + Y_2]_t = [Y_1]_t + 2[Y_1, Y_2]_t + [Y_2]_t.$$
If $Y_1, Y_2$ are semimartingales, then $[Y_1, Y_2]_t$ exists. This assertion follows from the fact that
$$[Y_1, Y_2]_t = \lim \sum_i (Y_1(t_{i+1}) - Y_1(t_i))(Y_2(t_{i+1}) - Y_2(t_i))$$
$$= \lim \Big(\sum_i (Y_1(t_{i+1})Y_2(t_{i+1}) - Y_1(t_i)Y_2(t_i)) - \sum_i Y_1(t_i)(Y_2(t_{i+1}) - Y_2(t_i)) - \sum_i Y_2(t_i)(Y_1(t_{i+1}) - Y_1(t_i))\Big)$$
$$= Y_1(t)Y_2(t) - Y_1(0)Y_2(0) - \int_0^t Y_1(s-)\,dY_2(s) - \int_0^t Y_2(s-)\,dY_1(s).$$
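For discrete partition sums, the identity above is exact algebra (the sums telescope term by term), and for independent Brownian motions the covariation vanishes. A quick check (illustrative mesh and seed):

```python
import numpy as np

rng = np.random.default_rng(3)
t, n = 1.0, 100_000
dW1 = rng.normal(0.0, np.sqrt(t / n), size=n)
dW2 = rng.normal(0.0, np.sqrt(t / n), size=n)
W1 = np.concatenate([[0.0], np.cumsum(dW1)])
W2 = np.concatenate([[0.0], np.cumsum(dW2)])

cov = np.sum(dW1 * dW2)  # partition sum for [W1, W2]_t; near 0 for independent paths

# product-formula version of the same quantity
identity = W1[-1] * W2[-1] - np.sum(W1[:-1] * dW2) - np.sum(W2[:-1] * dW1)
```

`cov` and `identity` agree up to floating-point rounding, because the telescoping argument holds for any fixed partition, not just in the limit.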
Recall that if Y is of finite variation, then $[Y]_t = \sum_{s \leq t}(\Delta Y(s))^2$, where $\Delta Y(s) \equiv Y(s) - Y(s-)$.
Lemma 6.1 Let Y be a finite variation process, and let X be cadlag. Then
$$[X, Y]_t = \sum_{s \leq t} \Delta X(s)\Delta Y(s).$$

Remark 6.2 Note that this sum will be zero if X and Y have no simultaneous jumps. In
particular, if either X or Y is a finite variation process and either X or Y is continuous,
then $[X, Y] = 0$.

Proof. We have that the covariation $[X, Y]_t$ is
$$\lim_{\max|t_{i+1}-t_i| \to 0} \sum_i (X(t_{i+1}) - X(t_i))(Y(t_{i+1}) - Y(t_i))$$
$$= \lim_{\max|t_{i+1}-t_i| \to 0} \sum_{|X(t_{i+1})-X(t_i)| > \varepsilon} (X(t_{i+1}) - X(t_i))(Y(t_{i+1}) - Y(t_i))$$
$$+ \lim_{\max|t_{i+1}-t_i| \to 0} \sum_{|X(t_{i+1})-X(t_i)| \leq \varepsilon} (X(t_{i+1}) - X(t_i))(Y(t_{i+1}) - Y(t_i)),$$
where the first term on the right is approximately
$$\sum_{s \leq t,\ |\Delta X(s)| > \varepsilon} \Delta X(s)\Delta Y(s)$$
plus or minus a few jumps where $|\Delta X(s)| = \varepsilon$. Since the number of jumps in X is countable,
$\varepsilon$ can always be chosen so that there are no such jumps. The second term on the right is
bounded by
$$\varepsilon \sum_i |Y(t_{i+1}) - Y(t_i)| \leq \varepsilon\, T_t(Y),$$
where $T_t(Y)$ is the total variation of Y (which is bounded).
6.2 Continuity of the quadratic variation.

Since $|\sum_i a_i b_i| \leq \sqrt{\sum_i a_i^2}\sqrt{\sum_i b_i^2}$, it follows that $|[X, Y]_t| \leq \sqrt{[X]_t[Y]_t}$. Observe that
$$[X - Y]_t = [X]_t - 2[X, Y]_t + [Y]_t$$
$$[X - Y]_t + 2([X, Y]_t - [Y]_t) = [X]_t - [Y]_t$$
$$[X - Y]_t + 2[X - Y, Y]_t = [X]_t - [Y]_t.$$
Therefore,
$$|[X]_t - [Y]_t| \leq [X - Y]_t + 2\sqrt{[X - Y]_t[Y]_t}. \qquad (6.2)$$
Assuming that $[Y]_t < \infty$, we have that $[X - Y]_t \to 0$ implies $[X]_t \to [Y]_t$.
Lemma 6.3 Let $M_n$, $n = 1, 2, 3, \ldots$, be square-integrable martingales with $\lim_n E[(M_n(t) - M(t))^2] = 0$ for all t. Then $E[|[M_n]_t - [M]_t|] \to 0$.

Proof. Since
$$E[|[M_n]_t - [M]_t|] \leq E[[M_n - M]_t] + 2E\Big[\sqrt{[M_n - M]_t[M]_t}\Big] \leq E[[M_n - M]_t] + 2\sqrt{E[[M_n - M]_t]\,E[[M]_t]},$$
we have the $L^1$ convergence of the quadratic variation.
Lemma 6.4 Suppose $\sup_{s \leq t}|X_n(s) - X(s)| \to 0$ and $\sup_{s \leq t}|Y_n(s) - Y(s)| \to 0$ for each
$t > 0$, and $\sup_n T_t(Y_n) < \infty$. Then
$$\lim_n [X_n, Y_n]_t = [X, Y]_t.$$

Proof. Note that $T_t(Y) \leq \sup_n T_t(Y_n)$, and recall that
$$[X_n, Y_n]_t = \sum_{s \leq t} \Delta X_n(s)\Delta Y_n(s).$$
We break the sum into two parts,
$$\Big|\sum_{s \leq t,\ |\Delta X_n(s)| \leq \varepsilon} \Delta X_n(s)\Delta Y_n(s)\Big| \leq \varepsilon \sum_{s \leq t} |\Delta Y_n(s)| \leq \varepsilon\, T_t(Y_n)$$
and
$$\sum_{s \leq t,\ |\Delta X_n(s)| > \varepsilon} \Delta X_n(s)\Delta Y_n(s).$$
Since $\Delta X_n(s) \to \Delta X(s)$ and $\Delta Y_n(s) \to \Delta Y(s)$, we have
$$\limsup_n |[X_n, Y_n]_t - [X, Y]_t| = \limsup_n \Big|\sum \Delta X_n(s)\Delta Y_n(s) - \sum \Delta X(s)\Delta Y(s)\Big| \leq \varepsilon\Big(\sup_n T_t(Y_n) + T_t(Y)\Big),$$
and the lemma follows.
Lemma 6.5 Let $Y_i = M_i + V_i$, $Y_i^n = M_i^n + V_i^n$, $i = 1, 2$, $n = 1, 2, \ldots$, be semimartingales
with $M_i^n$ a local square integrable martingale and $V_i^n$ finite variation. Suppose that there
exist stopping times $\gamma_k$ such that $\gamma_k \to \infty$ as $k \to \infty$ and, for each $t \geq 0$,
$$\lim_n E[(M_i^n(t \wedge \gamma_k) - M_i(t \wedge \gamma_k))^2] = 0,$$
and that for each $t \geq 0$, $\sup_{i,n} T_t(V_i^n) < \infty$ and
$$\lim_n \sup_{s \leq t}|V_i^n(s) - V_i(s)| = 0.$$
Then $[Y_1^n, Y_2^n]_t \to [Y_1, Y_2]_t$.

Proof. The result follows from Lemmas 6.3 and 6.4 by writing
$$[Y_1^n, Y_2^n]_t = [M_1^n, M_2^n]_t + [M_1^n, V_2^n]_t + [V_1^n, Y_2^n]_t.$$
Lemma 6.5 provides the proof for the following.

Lemma 6.6 Let $Y_i$ be a semimartingale, $X_i$ be cadlag and adapted, and
$$Z_i(t) = \int_0^t X_i(s-)\,dY_i(s), \quad i = 1, 2.$$
Then
$$[Z_1, Z_2]_t = \int_0^t X_1(s-)X_2(s-)\,d[Y_1, Y_2]_s.$$

Proof. First verify the identity for piecewise constant $X_i$. Then approximate general $X_i$ by
piecewise constant processes and use Lemma 6.5 to pass to the limit.
Lemma 6.7 Let X be cadlag and adapted and Y be a semimartingale. Then
$$\lim_{\max|t_{i+1}-t_i| \to 0} \sum_i X(t_i)(Y(t_{i+1} \wedge t) - Y(t_i \wedge t))^2 = \int_0^t X(s-)\,d[Y]_s. \qquad (6.3)$$

Proof. Let $Z(t) = \int_0^t 2Y(s-)\,dY(s)$. Observing that
$$(Y(t_{i+1} \wedge t) - Y(t_i \wedge t))^2 = Y^2(t_{i+1} \wedge t) - Y^2(t_i \wedge t) - 2Y(t_i \wedge t)(Y(t_{i+1} \wedge t) - Y(t_i \wedge t))$$
and applying Lemma 5.16, we see that the left side of (6.3) equals
$$\int_0^t X(s-)\,dY^2(s) - \int_0^t 2X(s-)Y(s-)\,dY(s) = \int_0^t X(s-)\,d(Y^2(s) - Z(s)),$$
and since $[Y]_t = Y^2(t) - Y^2(0) - \int_0^t 2Y(s-)\,dY(s)$, the lemma follows.
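Lemma 6.7 can also be seen on a simulated Brownian path, where $d[W]_s = ds$: the sum of squared increments weighted by $X(t_i) = W(t_i)$ is close to the Riemann sum for $\int_0^t W(s)\,ds$ (illustrative mesh and seed):

```python
import numpy as np

rng = np.random.default_rng(4)
t, n = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(t / n), size=n)
W = np.concatenate([[0.0], np.cumsum(dW)])

weighted_qv = np.sum(W[:-1] * dW**2)   # sum X(t_i)(W(t_{i+1}) - W(t_i))^2
riemann = np.sum(W[:-1]) * (t / n)     # int_0^t W(s) d[W]_s = int_0^t W(s) ds
```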
6.3 Itô's formula.

Theorem 6.8 Let $f \in C^2$, and let Y be a semimartingale. Then
$$f(Y(t)) = f(Y(0)) + \int_0^t f'(Y(s-))\,dY(s) + \int_0^t \frac{1}{2}f''(Y(s-))\,d[Y]_s \qquad (6.4)$$
$$+ \sum_{s \leq t}\Big(f(Y(s)) - f(Y(s-)) - f'(Y(s-))\Delta Y(s) - \frac{1}{2}f''(Y(s-))(\Delta Y(s))^2\Big).$$

Remark 6.9 Observing that the discontinuities of $[Y]$ satisfy $\Delta[Y]_s = (\Delta Y(s))^2$, if we define
the continuous part of the quadratic variation by
$$[Y]^c_t = [Y]_t - \sum_{s \leq t}(\Delta Y(s))^2,$$
then (6.4) becomes
$$f(Y(t)) = f(Y(0)) + \int_0^t f'(Y(s-))\,dY(s) + \int_0^t \frac{1}{2}f''(Y(s-))\,d[Y]^c_s \qquad (6.5)$$
$$+ \sum_{s \leq t}\Big(f(Y(s)) - f(Y(s-)) - f'(Y(s-))\Delta Y(s)\Big).$$

Proof. Define
$$\gamma_f(x, y) = \frac{f(y) - f(x) - f'(x)(y - x) - \frac{1}{2}f''(x)(y - x)^2}{(y - x)^2}.$$
Then
$$f(Y(t)) = f(Y(0)) + \sum_i \big(f(Y(t_{i+1})) - f(Y(t_i))\big) \qquad (6.6)$$
$$= f(Y(0)) + \sum_i f'(Y(t_i))(Y(t_{i+1}) - Y(t_i)) + \frac{1}{2}\sum_i f''(Y(t_i))(Y(t_{i+1}) - Y(t_i))^2$$
$$+ \sum_i \gamma_f(Y(t_i), Y(t_{i+1}))(Y(t_{i+1}) - Y(t_i))^2.$$
The first three terms on the right of (6.6) converge to the corresponding terms of (6.4) by
previous lemmas. Note that the last term in (6.4) can be written
$$\sum_{s \leq t} \gamma_f(Y(s-), Y(s))(\Delta Y(s))^2. \qquad (6.7)$$
To show convergence of the remaining term, we need the following lemma.
Lemma 6.10 Let X be cadlag. For $\varepsilon > 0$, let $D_\varepsilon(t) = \{s \leq t : |\Delta X(s)| \geq \varepsilon\}$. Then
$$\limsup_{\max|t_{i+1}-t_i| \to 0}\ \max_{(t_i, t_{i+1}] \cap D_\varepsilon(t) = \emptyset} |X(t_{i+1}) - X(t_i)| \leq \varepsilon.$$
(I.e., by picking out the sub-intervals with the larger jumps, the remaining intervals have the
above property.) (Recall that $\Delta X(s) = X(s) - X(s-)$.)

Proof. Suppose not. That is, suppose
$$\limsup_m \max_{(t_i^m, t_{i+1}^m] \cap D_\varepsilon(t) = \emptyset} |X(t_{i+1}^m) - X(t_i^m)| > \varepsilon' > \varepsilon$$
for some sequence of partitions $\{t_i^m\}$ with mesh going to zero. Then there exists a subsequence of
intervals in $\{(t_i^m, t_{i+1}^m]\}_{i,m}$ such that $|X(t_{i+1}^m) - X(t_i^m)| > \varepsilon'$. Selecting a
further subsequence if necessary, we can obtain a sequence of intervals $(a_n, b_n]$ and an $s \leq t$ such that
$|X(b_n) - X(a_n)| > \varepsilon'$, $(a_n, b_n] \cap D_\varepsilon(t) = \emptyset$, and $a_n, b_n \to s$. Each interval satisfies
$a_n < b_n < s$, $s \leq a_n < b_n$, or $a_n < s \leq b_n$. If an infinite subsequence satisfies the first condition, then
$X(a_n) \to X(s-)$ and $X(b_n) \to X(s-)$, so that $|X(b_n) - X(a_n)| \to 0$. Similarly, a subsequence satisfying
the second condition gives $|X(b_n) - X(a_n)| \to 0$, since $X(b_n) \to X(s)$ and $X(a_n) \to X(s)$. Finally, a
subsequence satisfying the third condition satisfies $|X(b_n) - X(a_n)| \to |X(s) - X(s-)| = |\Delta X(s)| > \varepsilon$, which is
impossible since $(a_n, b_n] \cap D_\varepsilon(t) = \emptyset$, and the contradiction proves the lemma.
Proof of Theorem 6.8 continued. Assume $f \in C^2$ and suppose $f''$ is uniformly continuous.
Let $\bar{\gamma}_f(\delta) \equiv \sup_{|x-y| \leq \delta} |\gamma_f(x, y)|$. Then $\bar{\gamma}_f(\delta)$ is a continuous function of $\delta$ and
$\lim_{\delta \to 0} \bar{\gamma}_f(\delta) = 0$. Let $D_\varepsilon(t) = \{s \leq t : |Y(s) - Y(s-)| \geq \varepsilon\}$. Then
$$\sum_i \gamma_f(Y(t_i), Y(t_{i+1}))(Y(t_{i+1}) - Y(t_i))^2$$
$$= \sum_{(t_i, t_{i+1}] \cap D_\varepsilon(t) \neq \emptyset} \gamma_f(Y(t_i), Y(t_{i+1}))(Y(t_{i+1}) - Y(t_i))^2 + \sum_{(t_i, t_{i+1}] \cap D_\varepsilon(t) = \emptyset} \gamma_f(Y(t_i), Y(t_{i+1}))(Y(t_{i+1}) - Y(t_i))^2,$$
where the second term on the right is bounded by
$$e(\{t_i\}, Y) \equiv \bar{\gamma}_f\Big(\max_{(t_i, t_{i+1}] \cap D_\varepsilon(t) = \emptyset}|Y(t_{i+1}) - Y(t_i)|\Big)\sum_i (Y(t_{i+1}) - Y(t_i))^2$$
and
$$\limsup_{\max|t_{i+1}-t_i| \to 0} e(\{t_i\}, Y) \leq \bar{\gamma}_f(\varepsilon)[Y]_t.$$
It follows that
$$\limsup \Big|\sum_i \gamma_f(Y(t_i), Y(t_{i+1}))(Y(t_{i+1}) - Y(t_i))^2 - \sum_{s \leq t} \gamma_f(Y(s-), Y(s))(\Delta Y(s))^2\Big| \leq 2\bar{\gamma}_f(\varepsilon)[Y]_t,$$
which, since $\varepsilon$ is arbitrary, completes the proof of the theorem.
6.4 The product rule and integration by parts.

Let X and Y be semimartingales. Then
$$X(t)Y(t) = X(0)Y(0) + \sum_i (X(t_{i+1})Y(t_{i+1}) - X(t_i)Y(t_i))$$
$$= X(0)Y(0) + \sum_i X(t_i)(Y(t_{i+1}) - Y(t_i)) + \sum_i Y(t_i)(X(t_{i+1}) - X(t_i))$$
$$+ \sum_i (Y(t_{i+1}) - Y(t_i))(X(t_{i+1}) - X(t_i))$$
$$= X(0)Y(0) + \int_0^t X(s-)\,dY(s) + \int_0^t Y(s-)\,dX(s) + [X, Y]_t.$$
Note that this identity generalizes the usual product rule and provides us with a formula for
integration by parts:
$$\int_0^t X(s-)\,dY(s) = X(t)Y(t) - X(0)Y(0) - \int_0^t Y(s-)\,dX(s) - [X, Y]_t. \qquad (6.8)$$
Example 6.11 (Linear SDE.) As an application of (6.8), consider the stochastic differential
equation
$$dX = -\alpha X\,dt + dY,$$
or in integrated form,
$$X(t) = X(0) - \alpha\int_0^t X(s)\,ds + Y(t).$$
We use the integrating factor $e^{\alpha t}$:
$$e^{\alpha t}X(t) = X(0) + \int_0^t e^{\alpha s}\,dX(s) + \int_0^t X(s)\,de^{\alpha s}$$
$$= X(0) - \int_0^t \alpha X(s)e^{\alpha s}\,ds + \int_0^t e^{\alpha s}\,dY(s) + \int_0^t \alpha X(s)e^{\alpha s}\,ds,$$
which gives
$$X(t) = X(0)e^{-\alpha t} + \int_0^t e^{-\alpha(t-s)}\,dY(s).$$
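With $Y = W$ this is the Ornstein-Uhlenbeck process, and the explicit formula can be compared against a direct Euler discretization of the equation driven by the same increments (step size, $\alpha$, and seed below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, t, n = 2.0, 1.0, 100_000
dt = t / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)

# Euler scheme for dX = -alpha X dt + dW
X = np.empty(n + 1)
X[0] = 1.0
for i in range(n):
    X[i + 1] = X[i] - alpha * X[i] * dt + dW[i]

# explicit solution X(t) = X(0) e^{-alpha t} + int_0^t e^{-alpha(t-s)} dW(s)
s = np.arange(n) * dt   # left endpoints of the partition
explicit = X[0] * np.exp(-alpha * t) + np.sum(np.exp(-alpha * (t - s)) * dW)
```

The two values of $X(1)$ agree up to the discretization error of the Euler scheme, which vanishes as the step size goes to zero.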
Example 6.12 (Kronecker's lemma.) Let A be positive and nondecreasing with $\lim_{t\to\infty} A(t) = \infty$. Define
$$Z(t) = \int_0^t \frac{1}{A(s-)}\,dY(s).$$
If $\lim_{t\to\infty} Z(t)$ exists a.s., then $\lim_{t\to\infty} \frac{Y(t)}{A(t)} = 0$ a.s.

Proof. By (6.8),
$$A(t)Z(t) = Y(t) + \int_0^t Z(s-)\,dA(s) + \int_0^t \frac{1}{A(s-)}\,d[Y, A]_s. \qquad (6.9)$$
Rearranging (6.9) gives
$$\frac{Y(t)}{A(t)} = Z(t) - \frac{1}{A(t)}\int_0^t Z(s-)\,dA(s) - \frac{1}{A(t)}\sum_{s \leq t} \frac{\Delta Y(s)}{A(s-)}\Delta A(s). \qquad (6.10)$$
Note that the difference between the first and second terms on the right of (6.10) converges
to zero. Convergence of Z implies $\lim_{t\to\infty} \Delta Z(t) = \lim_{t\to\infty} \Delta Y(t)/A(t-) = 0$, so the third term
on the right of (6.10) converges to zero, giving the result.
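In discrete time with $A(k) = k$ and Y a mean-zero random walk, this is the classical martingale proof of the strong law of large numbers: $Z(n) = \sum_{k \le n} \xi_k/k$ converges a.s. because it is a martingale with summable increment variances, and Kronecker's lemma then gives $Y(n)/n \to 0$. A numerical illustration (sample size, distribution, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1_000_000
xi = rng.uniform(-1.0, 1.0, size=n)   # iid, mean zero
k = np.arange(1, n + 1)

Z = np.cumsum(xi / k)         # converges a.s. (summable variances)
ratio = np.cumsum(xi) / k     # Y(k)/A(k), tends to 0
```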
6.5 Itô's formula for vector-valued semimartingales.

Let $Y(t) = (Y_1(t), Y_2(t), \ldots, Y_m(t))^T$ (a column vector). The product rule given above is a
special case of Itô's formula for a vector-valued semimartingale Y. Let $f \in C^2(\mathbb{R}^m)$. Then
$$f(Y(t)) = f(Y(0)) + \sum_{k=1}^m \int_0^t \partial_k f(Y(s-))\,dY_k(s) + \sum_{k,l=1}^m \frac{1}{2}\int_0^t \partial_k\partial_l f(Y(s-))\,d[Y_k, Y_l]_s$$
$$+ \sum_{s \leq t}\Big(f(Y(s)) - f(Y(s-)) - \sum_{k=1}^m \partial_k f(Y(s-))\Delta Y_k(s) - \sum_{k,l=1}^m \frac{1}{2}\partial_k\partial_l f(Y(s-))\Delta Y_k(s)\Delta Y_l(s)\Big),$$
or, defining
$$[Y_k, Y_l]^c_t = [Y_k, Y_l]_t - \sum_{s \leq t}\Delta Y_k(s)\Delta Y_l(s), \qquad (6.11)$$
we have
$$f(Y(t)) = f(Y(0)) + \sum_{k=1}^m \int_0^t \partial_k f(Y(s-))\,dY_k(s) + \sum_{k,l=1}^m \frac{1}{2}\int_0^t \partial_k\partial_l f(Y(s-))\,d[Y_k, Y_l]^c_s \qquad (6.12)$$
$$+ \sum_{s \leq t}\Big(f(Y(s)) - f(Y(s-)) - \sum_{k=1}^m \partial_k f(Y(s-))\Delta Y_k(s)\Big).$$
7 Stochastic Differential Equations

7.1 Examples.

The standard example of a stochastic differential equation is an Itô equation for a diffusion
process written in differential form as
$$dX(t) = \sigma(X(t))\,dW(t) + b(X(t))\,dt$$
or in integrated form as
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds. \qquad (7.1)$$
If we define $Y(t) = (W(t), t)^T$ and $F(X) = (\sigma(X), b(X))$, then (7.1) can be written as
$$X(t) = X(0) + \int_0^t F(X(s-))\,dY(s). \qquad (7.2)$$
Similarly, consider the stochastic difference equation
$$X_{n+1} = X_n + \sigma(X_n)\xi_{n+1} + b(X_n)h \qquad (7.3)$$
where the $\xi_i$ are iid and $h > 0$. If we define $Y_1(t) = \sum_{k=1}^{[t/h]} \xi_k$, $Y_2(t) = [t/h]h$, and $X(t) = X_{[t/h]}$, then
$$X(t) = X(0) + \int_0^t \big(\sigma(X(s-)), b(X(s-))\big)\,dY(s),$$
which is in the same form as (7.2). With these examples in mind, we will consider stochastic
differential equations of the form (7.2) where Y is an $\mathbb{R}^m$-valued semimartingale and F is a
$d \times m$ matrix-valued function.
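The difference equation (7.3) with Gaussian $\xi_k \sim N(0, h)$ is exactly the Euler-Maruyama scheme for (7.1). A minimal sketch (the function name, coefficients, and seed are illustrative choices, not from the notes):

```python
import numpy as np

def euler_maruyama(sigma, b, x0, t, n, rng):
    """Iterate X_{k+1} = X_k + sigma(X_k) xi_{k+1} + b(X_k) h with h = t/n
    and xi_{k+1} ~ N(0, h), i.e. the difference equation (7.3)."""
    h = t / n
    xi = rng.normal(0.0, np.sqrt(h), size=n)
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        x[k + 1] = x[k] + sigma(x[k]) * xi[k] + b(x[k]) * h
    return x

rng = np.random.default_rng(7)
# geometric Brownian motion as an illustration: sigma(x) = 0.2 x, b(x) = 0.05 x
path = euler_maruyama(lambda x: 0.2 * x, lambda x: 0.05 * x, 1.0, 1.0, 10_000, rng)
```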
7.2 Gronwall's inequality and uniqueness for ODEs.

Of course, systems of ordinary differential equations are of the form (7.2), and we begin our
study by recalling the standard existence and uniqueness theorem for these equations. The
following inequality will play a crucial role in our discussion.

Lemma 7.1 (Gronwall's inequality.) Suppose that A is cadlag and non-decreasing, X is
cadlag, and that
$$0 \leq X(t) \leq \varepsilon + \int_0^t X(s-)\,dA(s). \qquad (7.4)$$
Then
$$X(t) \leq \varepsilon e^{A(t)}.$$

Proof. Iterating (7.4), we have
$$X(t) \leq \varepsilon + \int_0^t X(s-)\,dA(s) \leq \varepsilon + \varepsilon A(t) + \int_0^t\int_0^{s-} X(u-)\,dA(u)\,dA(s)$$
$$\leq \varepsilon + \varepsilon A(t) + \varepsilon\int_0^t A(s-)\,dA(s) + \int_0^t\int_0^{s-}\int_0^{u-} X(v-)\,dA(v)\,dA(u)\,dA(s) \leq \cdots.$$
Since A is nondecreasing, it must be of finite variation, making $[A]^c_t \equiv 0$. Itô's formula thus
yields
$$e^{A(t)} = 1 + \int_0^t e^{A(s-)}\,dA(s) + \sum_{s \leq t}\big(e^{A(s)} - e^{A(s-)} - e^{A(s-)}\Delta A(s)\big)$$
$$\geq 1 + \int_0^t e^{A(s-)}\,dA(s) \geq 1 + A(t) + \int_0^t\int_0^{s-} e^{A(u-)}\,dA(u)\,dA(s)$$
$$\geq 1 + A(t) + \int_0^t A(s-)\,dA(s) + \int_0^t\int_0^{s-}\int_0^{u-} e^{A(v-)}\,dA(v)\,dA(u)\,dA(s) \geq \cdots.$$
Continuing the iteration and comparing term by term, we see that $X(t) \leq \varepsilon e^{A(t)}$.
Theorem 7.2 (Existence and uniqueness for ordinary differential equations.) Consider the
ordinary differential equation in $\mathbb{R}^d$
$$\dot{X} = \frac{dX}{dt} = F(X),$$
or in integrated form,
$$X(t) = X(0) + \int_0^t F(X(s))\,ds. \qquad (7.5)$$
Suppose F is Lipschitz, that is, $|F(x) - F(y)| \leq L|x - y|$ for some constant L. Then for
each $x_0 \in \mathbb{R}^d$, there exists a unique solution of (7.5) with $X(0) = x_0$.

Proof. (Uniqueness) Suppose $X_i(t) = X_i(0) + \int_0^t F(X_i(s))\,ds$, $i = 1, 2$. Then
$$|X_1(t) - X_2(t)| \leq |X_1(0) - X_2(0)| + \int_0^t |F(X_1(s)) - F(X_2(s))|\,ds \leq |X_1(0) - X_2(0)| + \int_0^t L|X_1(s) - X_2(s)|\,ds.$$
By Gronwall's inequality (take $A(t) = Lt$ and $\varepsilon = |X_1(0) - X_2(0)|$),
$$|X_1(t) - X_2(t)| \leq |X_1(0) - X_2(0)|e^{Lt}.$$
Hence, if $X_1(0) = X_2(0)$, then $X_1(t) \equiv X_2(t)$.
43
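The existence half of Theorem 7.2 is typically proved by Picard iteration, $X_{k+1}(t) = x_0 + \int_0^t F(X_k(s))ds$, which the Lipschitz condition makes a contraction. The sketch below (not from the notes) discretizes the iteration on a uniform grid; the choice $F(x) = \sin x$, which is globally Lipschitz with $L = 1$, is an illustrative assumption.

```python
import math

def picard_iterate(F, x0, t_max, n_grid, n_iter):
    """Picard iteration X_{k+1}(t) = x0 + int_0^t F(X_k(s)) ds,
    with the integral approximated by left Riemann sums on a uniform grid."""
    dt = t_max / n_grid
    X = [x0] * (n_grid + 1)           # X_0 is the constant function x0
    for _ in range(n_iter):
        new = [x0]
        integral = 0.0
        for j in range(n_grid):
            integral += F(X[j]) * dt  # left Riemann sum for int_0^t F(X_k(s)) ds
            new.append(x0 + integral)
        X = new
    return X

# F(x) = sin(x): the ODE X' = sin(X), X(0) = 1 has the closed-form solution
# X(t) = 2*atan(tan(1/2)*e^t), which we can compare against.
X = picard_iterate(math.sin, x0=1.0, t_max=1.0, n_grid=1000, n_iter=30)
```

After a handful of iterations the grid values stop changing; the remaining error is the $O(1/n_{\text{grid}})$ quadrature error.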
7.3 Uniqueness of solutions of SDEs.

We consider stochastic differential equations of the form

    X(t) = U(t) + \int_0^t F(X(s-))dY(s),    (7.6)

where $Y$ is an $\mathbb{R}^m$-valued semimartingale, $U$ is a cadlag, adapted $\mathbb{R}^d$-valued process, and $F : \mathbb{R}^d \to \mathbb{M}^{d\times m}$.

We will need the following generalization of Lemma 5.12.

Lemma 7.3 Let $Y = M + V$ be a semimartingale, where $M$ is a local square-integrable martingale and $V$ is a finite variation process, let $X$ be a cadlag, adapted process, and let $\tau$ be a finite stopping time. Then for any stopping time $\sigma$ for which $E[[M]_{(\tau+t)\wedge\sigma}] < \infty$,

    P\{\sup_{s\le t} |\int_\tau^{\tau+s} X(u-)dY(u)| > K\}
        \le P\{\sigma \le \tau + t\} + P\{\sup_{s<\tau+t} |X(s)| > c\}
          + \frac{16c^2}{K^2} E[[M]_{(\tau+t)\wedge\sigma} - [M]_{\tau\wedge\sigma}]
          + P\{T_{\tau+t}(V) - T_\tau(V) \ge (2c)^{-1}K\}.    (7.7)

Proof. The proof is the same as for Lemma 5.12. The strict inequality in the second term on the right is obtained by approximating $c$ from above by a decreasing sequence $c_n$. $\square$
Theorem 7.4 Suppose that there exists $L > 0$ such that

    |F(x) - F(y)| \le L|x - y|.

Then there is at most one solution of (7.6).

Remark 7.5 One can treat more general equations of the form

    X(t) = U(t) + \int_0^t F(X, s-)dY(s),    (7.8)

where $F : D_{\mathbb{R}^d}[0,\infty) \to D_{\mathbb{M}^{d\times m}}[0,\infty)$ satisfies

    \sup_{s\le t} |F(x, s) - F(y, s)| \le L \sup_{s\le t} |x(s) - y(s)|    (7.9)

for all $x, y \in D_{\mathbb{R}^d}[0,\infty)$ and $t \ge 0$. Note that, defining $x^t$ by $x^t(s) = x(s \wedge t)$, (7.9) implies that $F$ is nonanticipating in the sense that $F(x, t) = F(x^t, t)$ for all $x \in D_{\mathbb{R}^d}[0,\infty)$ and all $t \ge 0$.
Proof. It follows from Lemma 7.3 that for each stopping time $\tau$ satisfying $\tau \le T$ a.s. for some constant $T > 0$, and each $t, \delta > 0$, there exists a constant $K_\tau(t, \delta)$ such that

    P\{\sup_{s\le t} |\int_\tau^{\tau+s} X(u-)dY(u)| \ge K_\tau(t, \delta)\} \le \delta

for all cadlag, adapted $X$ satisfying $|X| \le 1$. (Take $c = 1$ in (7.7).) Furthermore, $K_\tau$ can be chosen so that for each $\delta > 0$, $\lim_{t\to 0} K_\tau(t, \delta) = 0$.

Suppose $X$ and $\tilde X$ satisfy (7.6). Let $\tau_0 = \inf\{t : |X(t) - \tilde X(t)| > 0\}$, and suppose $P\{\tau_0 < \infty\} > 0$. Select $r, \delta, t > 0$ such that $P\{\tau_0 < r\} > \delta$ and $LK_{\tau_0\wedge r}(t, \delta) < 1$. Note that if $\tau_0 < \infty$, then

    X(\tau_0) - \tilde X(\tau_0) = \int_0^{\tau_0} (F(X(s-)) - F(\tilde X(s-)))dY(s) = 0.    (7.10)

Define

    \tau_\epsilon = \inf\{s : |X(s) - \tilde X(s)| \ge \epsilon\}.

Noting that $|X(s) - \tilde X(s)| \le \epsilon$ for $s < \tau_\epsilon$, we have

    |F(X(s)) - F(\tilde X(s))| \le L\epsilon

for $s < \tau_\epsilon$, and

    |\int_0^{\tau_\epsilon} (F(X(s-)) - F(\tilde X(s-)))dY(s)| = |X(\tau_\epsilon) - \tilde X(\tau_\epsilon)| \ge \epsilon.

Consequently, for $r > 0$, letting $\tau_0^r = \tau_0 \wedge r$, we have

    P\{\tau_\epsilon - \tau_0^r \le t\}
      \le P\{\sup_{s\le t\wedge(\tau_\epsilon-\tau_0^r)} |\int_{\tau_0^r}^{\tau_0^r+s} F(X(u-))dY(u) - \int_{\tau_0^r}^{\tau_0^r+s} F(\tilde X(u-))dY(u)| \ge \epsilon LK_{\tau_0^r}(t, \delta)\}
      \le \delta.

Since the right side does not depend on $\epsilon$ and $\lim_{\epsilon\to 0} \tau_\epsilon = \tau_0$, it follows that $P\{\tau_0 - \tau_0\wedge r \le t\} \le \delta$ and hence that $P\{\tau_0 < r\} \le \delta$, contradicting the assumption on $\delta$ and proving that $\tau_0 = \infty$ a.s. $\square$
7.4 A Gronwall inequality for SDEs.

Let $Y$ be an $\mathbb{R}^m$-valued semimartingale, and let $F : \mathbb{R}^d \to \mathbb{M}^{d\times m}$ satisfy $|F(x) - F(y)| \le L|x - y|$. For $i = 1, 2$, let $U_i$ be cadlag and adapted and let $X_i$ satisfy

    X_i(t) = U_i(t) + \int_0^t F(X_i(s-))dY(s).    (7.11)

Lemma 7.6 Let $d = m = 1$. Suppose that $Y = M + V$ where $M$ is a square-integrable martingale and $V$ is a finite variation process. Suppose that there exist $\delta > 0$ and $R > 0$ such that $\sup_t |\Delta M(t)| \le \delta$, $\sup_t |\Delta V(t)| \le 2\delta$ and $T_t(V) \le R$, and that $c(\delta, R) \equiv (1 - 12L^2\delta^2 - 6L^2R\delta) > 0$. Let

    A(t) = 12L^2[M]_t + 3L^2RT_t(V) + t,    (7.12)

and define $\gamma(u) = \inf\{t : A(t) > u\}$. (Note that the $t$ on the right side of (7.12) only serves to ensure that $A$ is strictly increasing.) Then

    E[\sup_{s\le\gamma(u)} |X_1(s) - X_2(s)|^2] \le \frac{3}{c(\delta, R)} e^{u/c(\delta,R)} E[\sup_{s\le\gamma(u)} |U_1(s) - U_2(s)|^2].    (7.13)
Proof. Note that

    |X_1(t) - X_2(t)|^2 \le 3|U_1(t) - U_2(t)|^2
        + 3|\int_0^t (F(X_1(s-)) - F(X_2(s-)))dM(s)|^2
        + 3|\int_0^t (F(X_1(s-)) - F(X_2(s-)))dV(s)|^2.    (7.14)

Doob's inequality implies

    E[\sup_{t\le\gamma(u)} |\int_0^t (F(X_1(s-)) - F(X_2(s-)))dM(s)|^2]
        \le 4E[\int_0^{\gamma(u)} |F(X_1(s-)) - F(X_2(s-))|^2 d[M]_s],    (7.15)

and Jensen's inequality implies

    E[\sup_{t\le\gamma(u)} |\int_0^t (F(X_1(s-)) - F(X_2(s-)))dV(s)|^2]
        \le E[T_{\gamma(u)}(V) \int_0^{\gamma(u)} |F(X_1(s-)) - F(X_2(s-))|^2 dT_s(V)].    (7.16)

Letting $Z(t) = \sup_{s\le t} |X_1(s) - X_2(s)|^2$ and using the Lipschitz condition and the assumption that $T_t(V) \le R$, it follows that

    E[Z \circ \gamma(u)] \le 3E[\sup_{s\le\gamma(u)} |U_1(s) - U_2(s)|^2]    (7.17)
        + 12L^2 E[\int_0^{\gamma(u)} |X_1(s-) - X_2(s-)|^2 d[M]_s]
        + 3L^2R E[\int_0^{\gamma(u)} |X_1(s-) - X_2(s-)|^2 dT_s(V)]
    \le 3E[\sup_{s\le\gamma(u)} |U_1(s) - U_2(s)|^2] + E[\int_0^{\gamma(u)} Z(s-)dA(s)]
    \le 3E[\sup_{s\le\gamma(u)} |U_1(s) - U_2(s)|^2] + E[(A \circ \gamma(u) - u)Z \circ \gamma(u)] + E[\int_0^u Z \circ \gamma(s)ds].

Since $0 \le A \circ \gamma(u) - u \le \sup_t \Delta A(t) \le 12L^2\delta^2 + 6L^2R\delta$, (7.12) implies

    c(\delta, R)E[Z \circ \gamma(u)] \le 3E[\sup_{s\le\gamma(u)} |U_1(s) - U_2(s)|^2] + \int_0^u E[Z \circ \gamma(s)]ds,

and (7.13) follows by Gronwall's inequality.

Note that the above calculation is valid only if the expectations on the right of (7.15) and (7.16) are finite. This potential problem can be eliminated by defining $\tau_K = \inf\{t : |X_1(t) - X_2(t)| \ge K\}$ and replacing $\gamma(u)$ by $\gamma(u) \wedge \tau_K$. Observing that $|X_1(s-) - X_2(s-)| \le K$ for $s \le \tau_K$, the estimates in (7.17) imply (7.13) with $\gamma(u)$ replaced by $\gamma(u) \wedge \tau_K$. Letting $K \to \infty$ gives (7.13) as originally stated. $\square$
Lemma 7.6 gives an alternative approach to proving uniqueness.

Lemma 7.7 Let $d = m = 1$, and let $U = U_1 = U_2$ in (7.11). Then there is a stopping time $\eta$ depending only on $Y$ such that $P\{\eta > 0\} = 1$ and $X_1(t) = X_2(t)$ for $t \in [0, \eta]$.

Proof. Let $\gamma_1 = \inf\{t > 0 : |\Delta Y(t)| \ge \delta\}$ and define $\tilde Y$ by $\tilde Y(t) = Y(t)$ for $t < \gamma_1$ and $\tilde Y(t) = Y(\gamma_1-)$ for $t \ge \gamma_1$. Then $\tilde Y$ is a semimartingale satisfying $\sup_t |\Delta\tilde Y(t)| \le \delta$, and hence by Lemma 5.8, $\tilde Y$ can be written as $\tilde Y = \tilde M + \tilde V$ where $\tilde M$ is a local square-integrable martingale with $|\Delta\tilde M(t)| \le \delta$ and $\tilde V$ is finite variation with $|\Delta\tilde V(t)| \le 2\delta$. Let $\gamma_2 = \inf\{t : |\tilde M(t)| \ge K\}$ and, noting that $|\tilde M(t)| \le K + \delta$ for $t \le \gamma_2$, we have that $\tilde M^{\gamma_2} \equiv \tilde M(\cdot \wedge \gamma_2)$ is a square-integrable martingale. Finally, let $\gamma_3 = \inf\{t : T_t(\tilde V) > R\}$ and define $\hat V(t) = \tilde V(t)$ for $t < \gamma_3$ and $\hat V(t) = \tilde V(\gamma_3-)$ for $t \ge \gamma_3$, and set $\hat Y = \tilde M^{\gamma_2} + \hat V$. Note that $\hat Y$ satisfies the conditions of Lemma 7.6 (choosing $\delta$ and $R$ so that $c(\delta, R) > 0$) and that $Y(t) = \hat Y(t)$ for $t < \eta \equiv \gamma_1 \wedge \gamma_2 \wedge \gamma_3$. Setting $\hat X_i(t) = X_i(t)$ for $t < \eta$ and $\hat X_i(t) = X_i(\eta-)$ for $t \ge \eta$, and defining $\hat U$ similarly, we see that

    \hat X_i(t) = \hat U(t) + \int_0^t F(\hat X_i(s-))d\hat Y(s).

By Lemma 7.6, $X_1(t) = \hat X_1(t) = \hat X_2(t) = X_2(t)$ for $t < \eta$. Since $X_i(\eta) = X_i(\eta-) + F(X_i(\eta-))\Delta Y(\eta)$, we see that $X_1(\eta) = X_2(\eta)$ as well. $\square$
Proposition 7.8 Let $d = m = 1$, and let $U = U_1 = U_2$ in (7.11). Then $X_1 = X_2$ a.s.

Proof. Let $\tau = \inf\{t : X_1(t) \ne X_2(t)\}$. For any $T < \infty$, $X_1(\tau \wedge T) = X_2(\tau \wedge T)$. But starting over at $\tau \wedge T$, Lemma 7.7 implies that there is a stopping time $\eta > \tau \wedge T$ such that $X_1(t) = X_2(t)$ for $t \le \eta$, and hence $P\{\tau > T\} = 1$. But $T$ is arbitrary, so $\tau = \infty$. $\square$

Remark 7.9 The proof of these results for $d, m > 1$ is essentially the same with different constants in the analogue of Lemma 7.6.
7.5 Existence of solutions.

If $X$ is a solution of (7.6), we will say that $X$ is a solution of the equation $(U, Y, F)$. Let $Y^c$ be defined as in Lemma 5.18. If we can prove existence of a solution $X^c$ of the equation $(U, Y^c, F)$ (that is, of (7.6) with $Y$ replaced by $Y^c$), then since $Y^c(t) = Y(t)$ for $t < \tau_c$, we have existence of a solution of the original equation on the interval $[0, \tau_c)$. For $c' > c$, suppose $X^{c'}$ is a solution of the equation $(U, Y^{c'}, F)$. Define $\tilde X^c(t) = X^{c'}(t)$ for $t < \tau_c$ and $\tilde X^c(t) = X^{c'}(\tau_c-) + F(X^{c'}(\tau_c-))\Delta Y^c(\tau_c)$ for $t \ge \tau_c$. Then $\tilde X^c$ will be a solution of the equation $(U, Y^c, F)$. Consequently, if for each $c > 0$, existence and uniqueness holds for the equation $(U, Y^c, F)$, then $X^c(t) = X^{c'}(t)$ for $t < \tau_c$ and $c' > c$, and hence $X(t) = \lim_{c\to\infty} X^c(t)$ exists and is the unique solution of the equation $(U, Y, F)$.
With the above discussion in mind, we consider the existence problem under the hypotheses that $Y = M + V$ with $|M| + [M] + T(V) \le R$. Consider the following approximation:

    X_n(0) = X(0),
    X_n(\tfrac{k+1}{n}) = X_n(\tfrac{k}{n}) + U(\tfrac{k+1}{n}) - U(\tfrac{k}{n}) + F(X_n(\tfrac{k}{n}))(Y(\tfrac{k+1}{n}) - Y(\tfrac{k}{n})).

Let $\eta_n(t) = \frac{k}{n}$ for $\frac{k}{n} \le t < \frac{k+1}{n}$. Extend $X_n$ to all $t \ge 0$ by setting

    X_n(t) = U(t) + \int_0^t F(X_n \circ \eta_n(s-))dY(s).

Adding and subtracting the same term yields

    X_n(t) = U(t) + \int_0^t (F(X_n \circ \eta_n(s-)) - F(X_n(s-)))dY(s) + \int_0^t F(X_n(s-))dY(s)
           \equiv U(t) + D_n(t) + \int_0^t F(X_n(s-))dY(s).
Assume that $|F(x) - F(y)| \le L|x - y|$, and for $b > 0$, let $\tau_n^b = \inf\{t : |F(X_n(t))| \ge b\}$. Then for $T > 0$,

    E[\sup_{s\le\tau_n^b\wedge T} |D_n(s)|^2]
      \le 2E[\sup_{t\le\tau_n^b\wedge T} (\int_0^t (F(X_n \circ \eta_n(s-)) - F(X_n(s-)))dM(s))^2]
        + 2E[\sup_{t\le\tau_n^b\wedge T} (\int_0^t (F(X_n \circ \eta_n(s-)) - F(X_n(s-)))dV(s))^2]
      \le 8L^2 E[\int_0^{\tau_n^b\wedge T} |X_n \circ \eta_n(s-) - X_n(s-)|^2 d[M]_s]
        + 2RL^2 E[\int_0^{\tau_n^b\wedge T} |X_n \circ \eta_n(s-) - X_n(s-)|^2 dT_s(V)]
      = 8L^2 E[\int_0^{\tau_n^b\wedge T} F^2(X_n \circ \eta_n(s-))(Y(s-) - Y(\eta_n(s-)))^2 d[M]_s]
        + 2RL^2 E[\int_0^{\tau_n^b\wedge T} F^2(X_n \circ \eta_n(s-))(Y(s-) - Y(\eta_n(s-)))^2 dT_s(V)]
      \le 8L^2b^2 E[\int_0^{\tau_n^b\wedge T} (Y(s-) - Y(\eta_n(s-)))^2 d[M]_s]
        + 2RL^2b^2 E[\int_0^{\tau_n^b\wedge T} (Y(s-) - Y(\eta_n(s-)))^2 dT_s(V)],

so under the boundedness assumptions on $Y$,

    E[\sup_{s\le\tau_n^b\wedge T} |D_n(s)|^2] \to 0.
Now assume that $F$ is bounded, $\sup_t |\Delta M(t)| \le \delta$, $\sup_t |\Delta V(t)| \le 2\delta$, $T_t(V) \le R$, and that $L$, $\delta$, and $R$ satisfy the conditions of Lemma 7.6. Since

    X_n(t) = U(t) + D_n(t) + \int_0^t F(X_n(s-))dY(s),

Lemma 7.6 implies $X_n$ is a Cauchy sequence and converges uniformly in probability to a solution of

    X(t) = U(t) + \int_0^t F(X(s-))dY(s).
A localization argument gives the following theorem.

Theorem 7.10 Let $Y$ be an $\mathbb{R}^m$-valued semimartingale, and let $F : \mathbb{R}^d \to \mathbb{M}^{d\times m}$ be bounded and satisfy $|F(x) - F(y)| \le L|x - y|$. Then for each $X(0)$ there exists a unique solution of

    X(t) = U(t) + \int_0^t F(X(s-))dY(s).    (7.18)
The assumption of boundedness in the above theorem can easily be weakened. For general Lipschitz $F$, the theorem implies existence and uniqueness up to $\tau_k = \inf\{t : |F(X(t))| \ge k\}$ (replace $F$ by a bounded function that agrees with $F$ on the set $\{x : |F(x)| \le k\}$). The global existence question becomes whether or not $\lim_{k\to\infty} \tau_k = \infty$. $F$ is locally Lipschitz if for each $k > 0$, there exists an $L_k$ such that

    |F(x) - F(y)| \le L_k|x - y|,  |x| \le k, |y| \le k.

Note that if $F$ is locally Lipschitz, and $\psi_k$ is a smooth nonnegative function satisfying $\psi_k(x) = 1$ when $|x| \le k$ and $\psi_k(x) = 0$ when $|x| \ge k + 1$, then $F_k(x) = \psi_k(x)F(x)$ is globally Lipschitz and bounded.
Example 7.11 Suppose

    X(t) = 1 + \int_0^t X(s)^2 ds.

Then $F(x) = x^2$ is locally Lipschitz, and local existence and uniqueness holds; however, global existence does not hold, since the solution $X(t) = 1/(1-t)$ hits $\infty$ in finite time.
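The blow-up in Example 7.11 is visible numerically. The following explicit Euler sketch (illustrative, not from the notes) tracks the exact solution $X(t) = 1/(1-t)$ and grows rapidly as $t$ approaches 1:

```python
def euler_ode(F, x0, h, t_max):
    """Explicit Euler for X' = F(X); returns the computed path up to t_max."""
    x, t, path = x0, 0.0, [x0]
    while t < t_max:
        x = x + h * F(x)
        t += h
        path.append(x)
    return path

# X' = X^2, X(0) = 1 has the exact solution X(t) = 1/(1 - t), blowing up at t = 1.
path = euler_ode(lambda x: x * x, x0=1.0, h=1e-4, t_max=0.99)
exact_at_099 = 1.0 / (1.0 - 0.99)   # = 100
```

Euler lags slightly behind the exact solution, but the divergence as $t \uparrow 1$ is unmistakable; no step size rescues global existence here.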
7.6 Moment estimates.

Consider the scalar Itô equation

    X(t) = X(0) + \int_0^t \sigma(X(s))dW(s) + \int_0^t b(X(s))ds.

Then by Itô's formula and Lemma 5.16,

    X(t)^2 = X(0)^2 + \int_0^t 2X(s)\sigma(X(s))dW(s) + \int_0^t 2X(s)b(X(s))ds + \int_0^t \sigma^2(X(s))ds.

Define $\tau_k = \inf\{t : |X(t)| \ge k\}$. Then

    |X(t\wedge\tau_k)|^2 = |X(0)|^2 + \int_0^{t\wedge\tau_k} 2X(s)\sigma(X(s))dW(s) + \int_0^{t\wedge\tau_k} (2X(s)b(X(s)) + \sigma^2(X(s)))ds.

Since

    \int_0^{t\wedge\tau_k} 2X(s)\sigma(X(s))dW(s) = \int_0^t 1_{[0,\tau_k)}(s)2X(s)\sigma(X(s))dW(s)

has a bounded integrand, the integral is a martingale. Therefore,

    E[|X(t\wedge\tau_k)|^2] = E[|X(0)|^2] + \int_0^t E[1_{[0,\tau_k)}(s)(2X(s)b(X(s)) + \sigma^2(X(s)))]ds.

Assume $2xb(x) + \sigma^2(x) \le K_1 + K_2|x|^2$ for some $K_i > 0$. (Note that this assumption holds if both $b(x)$ and $\sigma(x)$ are globally Lipschitz.) Then, writing $m_0 = E[|X(0)|^2]$,

    m_k(t) \equiv E[|X(t\wedge\tau_k)|^2]
           = E[|X(0)|^2] + \int_0^t E[1_{[0,\tau_k)}(s)(2X(s)b(X(s)) + \sigma^2(X(s)))]ds
           \le m_0 + K_1t + \int_0^t m_k(s)K_2 ds,

and hence

    m_k(t) \le (m_0 + K_1t)e^{K_2t}.

Note that

    |X(t\wedge\tau_k)|^2 = (1_{\{\tau_k>t\}}|X(t)| + 1_{\{\tau_k\le t\}}|X(\tau_k)|)^2,

and we have

    k^2 P(\tau_k \le t) \le E[|X(t\wedge\tau_k)|^2] \le (m_0 + K_1t)e^{K_2t}.

Consequently, as $k \to \infty$, $P(\tau_k \le t) \to 0$ and $X(t\wedge\tau_k) \to X(t)$. By Fatou's Lemma,

    E[|X(t)|^2] \le (m_0 + K_1t)e^{K_2t}.
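The bound is easy to check by Monte Carlo in a case where the second moment is known exactly. The sketch below (illustrative, not from the notes) takes $\sigma \equiv 1$ and $b \equiv 0$, so that $2xb(x) + \sigma^2(x) = 1$ and we may take $K_1 = 1$, $K_2$ arbitrarily small; the bound then reads $E|X(t)|^2 \le |x_0|^2 + t$, which for Brownian motion holds with equality.

```python
import math, random

def second_moment(x0, t, h, n_paths, rng):
    """Monte Carlo estimate of E|X(t)|^2 for dX = dW (sigma = 1, b = 0),
    for which E|X(t)|^2 = x0^2 + t exactly."""
    n_steps = int(t / h)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x += rng.gauss(0.0, math.sqrt(h))  # Brownian increment
        total += x * x
    return total / n_paths

rng = random.Random(42)
est = second_moment(x0=1.0, t=1.0, h=0.05, n_paths=4000, rng=rng)
bound = 1.0 ** 2 + 1.0   # (m_0 + K_1 t) with K_2 -> 0
```

With 4000 paths the estimator's standard error is about 0.04, so the estimate should sit close to the bound of 2.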
Remark 7.12 The argument above works well for moment estimation under other conditions also. Suppose $2xb(x) + \sigma^2(x) \le K_1 - \epsilon|x|^2$. (For example, consider the equation $X(t) = X(0) - \int_0^t X(s)ds + W(t)$.) Then

    e^{\epsilon t}|X(t)|^2 = |X(0)|^2 + \int_0^t e^{\epsilon s}2X(s)\sigma(X(s))dW(s)
        + \int_0^t e^{\epsilon s}[2X(s)b(X(s)) + \sigma^2(X(s))]ds + \int_0^t \epsilon e^{\epsilon s}|X(s)|^2 ds
    \le |X(0)|^2 + \int_0^t e^{\epsilon s}2X(s)\sigma(X(s))dW(s) + \int_0^t e^{\epsilon s}K_1 ds
    = |X(0)|^2 + \int_0^t e^{\epsilon s}2X(s)\sigma(X(s))dW(s) + \frac{K_1}{\epsilon}(e^{\epsilon t} - 1),

and hence

    e^{\epsilon t}E[|X(t)|^2] \le E[|X(0)|^2] + \frac{K_1}{\epsilon}[e^{\epsilon t} - 1].

Therefore,

    E[|X(t)|^2] \le e^{-\epsilon t}E[|X(0)|^2] + \frac{K_1}{\epsilon}(1 - e^{-\epsilon t}),

which is uniformly bounded.
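For the example in Remark 7.12, $dX = -X\,dt + dW$, we have $2xb(x) + \sigma^2(x) = -2x^2 + 1$, so $K_1 = 1$ and $\epsilon = 2$, and the bound gives $E|X(t)|^2 \le e^{-2t}E|X(0)|^2 + \frac{1}{2}(1 - e^{-2t})$, which for this linear equation is an equality. A minimal Monte Carlo sketch (not from the notes):

```python
import math, random

def ou_second_moment(x0, t, h, n_paths, rng):
    """Monte Carlo E|X(t)|^2 for dX = -X dt + dW via Euler-Maruyama.
    Exact value: e^{-2t} x0^2 + (1 - e^{-2t})/2, which is <= max(x0^2, 1/2)."""
    n_steps = int(t / h)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x += -x * h + rng.gauss(0.0, math.sqrt(h))
        total += x * x
    return total / n_paths

rng = random.Random(7)
est = ou_second_moment(x0=1.0, t=3.0, h=0.01, n_paths=2000, rng=rng)
exact = math.exp(-6.0) * 1.0 + 0.5 * (1.0 - math.exp(-6.0))   # ~= 0.5012
```

By $t = 3$ the second moment has essentially relaxed to its stationary value $1/2$, well below the initial value 1, illustrating the uniform bound.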
Consider a vector case example. Assume

    X(t) = X(0) + \int_0^t \sigma(X(s))dW(s) + \int_0^t b(X(s))ds,

where $\sigma$ is a $d \times m$ matrix, $b$ is a $d$-dimensional vector, and $W$ is an $m$-dimensional standard Brownian motion. Then

    |X(t)|^2 = \sum_{i=1}^d |X_i(t)|^2 = |X(0)|^2 + \sum_i \int_0^t 2X_i(s)dX_i(s) + \sum_{i=1}^d [X_i]_t.

Define $Z_i(t) = \sum_{k=1}^m \int_0^t \sigma_{ik}(X(s))dW_k(s) \equiv \sum_{k=1}^m U_k$, where $U_k = \int_0^t \sigma_{ik}(X(s))dW_k(s)$. Then $[Z_i] = \sum_{k,l} [U_k, U_l]$, and

    [U_k, U_l]_t = \int_0^t \sigma_{ik}(X(s))\sigma_{il}(X(s))d[W_k, W_l]_s
                 = 0 for k \ne l, and = \int_0^t \sigma_{ik}^2(X(s))ds for k = l.

Consequently,

    |X(t)|^2 = |X(0)|^2 + \int_0^t 2X(s)^T\sigma(X(s))dW(s) + \int_0^t [2X(s)\cdot b(X(s)) + \sum_{i,k}\sigma_{ik}^2(X(s))]ds
             = |X(0)|^2 + \int_0^t 2X(s)^T\sigma(X(s))dW(s) + \int_0^t (2X(s)\cdot b(X(s)) + \mathrm{trace}(\sigma(X(s))\sigma(X(s))^T))ds.

As in the univariate case, if we assume

    2x\cdot b(x) + \mathrm{trace}(\sigma(x)\sigma(x)^T) \le K_1 - \epsilon|x|^2,

then $E[|X(t)|^2]$ is uniformly bounded.
8 Stochastic differential equations for diffusion processes.

8.1 Generator for a diffusion process.

Consider

    X(t) = X(0) + \int_0^t \sigma(X(s))dW(s) + \int_0^t b(X(s))ds,

where $X$ is $\mathbb{R}^d$-valued, $W$ is an $m$-dimensional standard Brownian motion, $\sigma$ is a $d \times m$ matrix-valued function, and $b$ is an $\mathbb{R}^d$-valued function. For a $C^2$ function $f$,

    f(X(t)) = f(X(0)) + \sum_{i=1}^d \int_0^t \partial_i f(X(s))dX_i(s) + \frac{1}{2}\sum_{1\le i,j\le d} \int_0^t \partial_i\partial_j f(X(s))d[X_i, X_j]_s.

The covariation satisfies

    [X_i, X_j]_t = \int_0^t \sum_k \sigma_{ik}(X(s))\sigma_{jk}(X(s))ds = \int_0^t a_{ij}(X(s))ds,

where $a = ((a_{ij})) = \sigma\sigma^T$, that is, $a_{ij}(x) = \sum_k \sigma_{ik}(x)\sigma_{jk}(x)$. If we denote

    Lf(x) = \sum_{i=1}^d b_i(x)\partial_i f(x) + \frac{1}{2}\sum_{i,j} a_{ij}(x)\partial_i\partial_j f(x),

then

    f(X(t)) = f(X(0)) + \int_0^t \nabla f^T(X(s))\sigma(X(s))dW(s) + \int_0^t Lf(X(s))ds.

Since $a = \sigma\sigma^T$, we have

    \sum_{i,j} \xi_i a_{ij}\xi_j = \xi^T\sigma\sigma^T\xi = |\sigma^T\xi|^2 \ge 0,

and hence $a$ is nonnegative definite. Consequently, $L$ is an elliptic differential operator. $L$ is called the differential generator or simply the generator for the corresponding diffusion process.

Example 8.1 If $X(t) = X(0) + W(t)$, then $((a_{ij}(x))) = I$, and hence $Lf(x) = \frac{1}{2}\Delta f(x)$.
8.2 Exit distributions in one dimension.

If $d = m = 1$, then

    Lf(x) = \frac{1}{2}a(x)f''(x) + b(x)f'(x),

where $a(x) = \sigma^2(x)$. Find $f$ such that $Lf(x) = 0$ (i.e., solve the linear first order differential equation for $f'$). Then

    f(X(t)) = f(X(0)) + \int_0^t f'(X(s))\sigma(X(s))dW(s)

is a local martingale. Fix $a < b$, and define $\tau = \inf\{t : X(t) \notin (a, b)\}$. If $\sup_{a<x<b} |f'(x)\sigma(x)| < \infty$, then

    f(X(t\wedge\tau)) = f(X(0)) + \int_0^t 1_{[0,\tau)}(s)f'(X(s))\sigma(X(s))dW(s)

is a martingale, and

    E[f(X(t\wedge\tau))|X(0) = x] = f(x).

Moreover, if we assume $\sup_{a<x<b} f(x) < \infty$ and $\tau < \infty$ a.s., then letting $t \to \infty$ we have

    E[f(X(\tau))|X(0) = x] = f(x).

Hence

    f(a)P(X(\tau) = a|X(0) = x) + f(b)P(X(\tau) = b|X(0) = x) = f(x),

and therefore the probability of exiting the interval at the right endpoint is given by

    P(X(\tau) = b|X(0) = x) = \frac{f(x) - f(a)}{f(b) - f(a)}.    (8.1)

To find conditions under which $P(\tau < \infty) = 1$, or more precisely, under which $E[\tau] < \infty$, solve $Lg(x) = 1$. Then

    g(X(t)) = g(X(0)) + \int_0^t g'(X(s))\sigma(X(s))dW(s) + t,

and assuming $\sup_{a<x<b} |g'(x)\sigma(x)| < \infty$, we conclude that the stochastic integral in

    g(X(t\wedge\tau)) = g(x) + \int_0^{t\wedge\tau} g'(X(s))\sigma(X(s))dW(s) + t\wedge\tau

is a martingale and hence

    E[g(X(t\wedge\tau))|X(0) = x] = g(x) + E[t\wedge\tau].

If

    C = \sup_{a\le x\le b} |g(x)| < \infty,

then $2C \ge E[t\wedge\tau]$, so $2C \ge E[\tau]$, which implies $\tau < \infty$ a.s. By (8.1), we also have

    E[\tau|X(0) = x] = E[g(X(\tau))|X(0) = x] - g(x)
        = g(b)\frac{f(x) - f(a)}{f(b) - f(a)} + g(a)\frac{f(b) - f(x)}{f(b) - f(a)} - g(x).
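For standard Brownian motion ($a \equiv 1$, $b \equiv 0$) we may take $f(x) = x$ in (8.1), giving $P(X(\tau) = b\,|\,X(0) = x) = (x - a)/(b - a)$. The Monte Carlo sketch below (illustrative, not from the notes) checks this; the step size and path count are arbitrary choices.

```python
import math, random

def exit_right_prob(x0, a, b, h, n_paths, rng):
    """Estimate P(X(tau) = b | X(0) = x0) for standard Brownian motion on (a, b)
    by simulating discrete paths until they leave the interval."""
    hits_right = 0
    for _ in range(n_paths):
        x = x0
        while a < x < b:
            x += rng.gauss(0.0, math.sqrt(h))  # Brownian increment over time h
        if x >= b:
            hits_right += 1
    return hits_right / n_paths

rng = random.Random(1)
p_hat = exit_right_prob(x0=0.3, a=0.0, b=1.0, h=1e-3, n_paths=2000, rng=rng)
exact = (0.3 - 0.0) / (1.0 - 0.0)   # formula (8.1) with f(x) = x
```

The discretization overshoots the boundary slightly, so the agreement is only up to an $O(\sqrt{h})$ bias, but it is well within Monte Carlo error here.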
8.3 Dirichlet problems.

In the one-dimensional case, we have demonstrated how solutions of a boundary value problem for $L$ were related to quantities of interest for the diffusion process. We now consider the more general Dirichlet problem

    Lf(x) = 0, x \in D;  f(x) = h(x), x \in \partial D,    (8.2)

for $D \subset \mathbb{R}^d$.

Definition 8.2 A function $f$ is Hölder continuous with Hölder exponent $\alpha > 0$ if

    |f(x) - f(y)| \le L|x - y|^\alpha

for some $L > 0$.

Theorem 8.3 Suppose $D$ is a bounded, smooth domain,

    \inf_{x\in D} \sum_{i,j} a_{ij}(x)\xi_i\xi_j \ge \epsilon|\xi|^2,

where $\epsilon > 0$, and $a_{ij}$, $b_i$, and $h$ are Hölder continuous. Then there exists a unique $C^2$ solution $f$ of the Dirichlet problem (8.2).

To emphasize dependence on the initial value, let

    X(t, x) = x + \int_0^t \sigma(X(s, x))dW(s) + \int_0^t b(X(s, x))ds.    (8.3)

Define $\tau = \tau(x) = \inf\{t : X(t, x) \notin D\}$. If $f$ is $C^2$ and bounded and satisfies (8.2), we have

    f(x) = E[f(X(t\wedge\tau, x))],

and assuming $\tau < \infty$ a.s., $f(x) = E[f(X(\tau, x))]$. By the boundary condition,

    f(x) = E[h(X(\tau, x))],    (8.4)

giving a useful representation of the solution of (8.2). Conversely, we can define $f$ by (8.4), and $f$ will be, at least in some weak sense, a solution of (8.2). Note that if there is a $C^2$, bounded solution $f$ and $\tau < \infty$, $f$ must be given by (8.4), proving uniqueness of $C^2$, bounded solutions.

8.4 Harmonic functions.

If $\Delta f = 0$ (i.e., $f$ is harmonic) on $\mathbb{R}^d$, and $W$ is standard Brownian motion, then $f(x + W(t))$ is a martingale (at least a local martingale).
8.5 Parabolic equations.

Suppose $u$ is bounded and satisfies

    u_t = Lu,  u(0, x) = f(x).

By Itô's formula, for a smooth function $v(t, x)$,

    v(t, X(t)) = v(0, X(0)) + (local) martingale + \int_0^t [v_s(s, X(s)) + Lv(s, X(s))]ds.

For fixed $r > 0$, define

    v(t, x) = u(r - t, x).

Then $\partial_t v(t, x) = -u_1(r - t, x)$, where $u_1(t, x) = \partial_t u(t, x)$. Since $u_1 = Lu$ and $Lv(t, x) = Lu(r - t, x)$, $v(t, X(t))$ is a martingale. Consequently,

    E[u(r - t, X(t, x))] = u(r, x),

and setting $t = r$, $E[u(0, X(r, x))] = u(r, x)$, that is, we have

    u(r, x) = E[f(X(r, x))].
8.6 Properties of X(t, x).

Assume now that

    |\sigma(x) - \sigma(y)| \le K|x - y|,  |b(x) - b(y)| \le K|x - y|

for some constant $K$. By arguments similar to those of Section 7.6, we can obtain the estimate

    E[|X(t, x) - X(t, y)|^n] \le C(t)|x - y|^n.    (8.5)

Consequently, we have the following

Theorem 8.4 There is a version of $X(t, x)$ such that the mapping $(t, x) \to X(t, x)$ is continuous a.s.

Proof. The proof is based on Kolmogorov's criterion for continuity of processes indexed by $\mathbb{R}^d$. $\square$
8.7 Markov property.

Given a filtration $\{\mathcal{F}_t\}$, $W$ is called an $\{\mathcal{F}_t\}$-standard Brownian motion if

1) $W$ is $\{\mathcal{F}_t\}$-adapted,
2) $W$ is a standard Brownian motion,
3) $W(r + \cdot) - W(r)$ is independent of $\mathcal{F}_r$.

For example, if $W$ is an $\{\mathcal{F}_t\}$-Brownian motion, then

    E[f(W(t + r) - W(r))|\mathcal{F}_r] = E[f(W(t))].

Let $W_r(t) \equiv W(r + t) - W(r)$. Note that $W_r$ is an $\{\mathcal{F}_{r+t}\}$-Brownian motion. We have

    X(r + t, x) = X(r, x) + \int_r^{r+t} \sigma(X(s, x))dW(s) + \int_r^{r+t} b(X(s, x))ds
                = X(r, x) + \int_0^t \sigma(X(r + s, x))dW_r(s) + \int_0^t b(X(r + s, x))ds.

Define $X_r(t, x)$ such that

    X_r(t, x) = x + \int_0^t \sigma(X_r(s, x))dW_r(s) + \int_0^t b(X_r(s, x))ds.

Then $X(r + t, x) = X_r(t, X(r, x))$. Intuitively, $X(r + t, x) = H_t(X(r, x), W_r)$ for some function $H$, and by the independence of $X(r, x)$ and $W_r$,

    E[f(X(r + t, x))|\mathcal{F}_r] = E[f(H_t(X(r, x), W_r))|\mathcal{F}_r] = u(t, X(r, x)),

where $u(t, z) = E[f(H_t(z, W_r))]$. Hence

    E[f(X(r + t, x))|\mathcal{F}_r] = E[f(X(r + t, x))|X(r, x)],

that is, the Markov property holds for $X$.

To make this calculation precise, define

    \eta_n(t) = \frac{k}{n}, for \frac{k}{n} \le t < \frac{k+1}{n},

and let

    X_n(t, x) = x + \int_0^t \sigma(X_n(\eta_n(s), x))dW(s) + \int_0^t b(X_n(\eta_n(s), x))ds.

Suppose that $z \in C_{\mathbb{R}^m}[0, \infty)$. Then

    H_n(t, x, z) = x + \int_0^t \sigma(H_n(\eta_n(s), x, z))dz(s) + \int_0^t b(H_n(\eta_n(s), x, z))ds

is well-defined. Note that $X_n(t, x) = H_n(t, x, W)$.

We also have

    X(r + t, x) = X_r(t, X(r, x)) = \lim_{n\to\infty} X_n^r(t, X(r, x)) = \lim_{n\to\infty} H_n(t, X(r, x), W_r),

where $X_n^r$ is defined like $X_n$ but with $W$ replaced by $W_r$, and it follows that

    E[f(X(r + t, x))|\mathcal{F}_r] = \lim_{n\to\infty} E[f(H_n(t, X(r, x), W_r))|\mathcal{F}_r]
        = \lim_{n\to\infty} E[f(H_n(t, X(r, x), W_r))|X(r, x)]
        = E[f(X(r + t, x))|X(r, x)].
8.8 Strong Markov property.

Theorem 8.5 Let $W$ be an $\{\mathcal{F}_t\}$-Brownian motion and let $\tau$ be an $\{\mathcal{F}_t\}$-stopping time. Define $\mathcal{F}_t^\tau = \mathcal{F}_{\tau+t}$. Then $W_\tau(t) \equiv W(\tau + t) - W(\tau)$ is an $\{\mathcal{F}_t^\tau\}$-Brownian motion.

Proof. Let

    \tau_n = \frac{k+1}{n}, when \frac{k}{n} \le \tau < \frac{k+1}{n}.

Then clearly $\tau_n > \tau$. We claim that

    E[f(W(\tau_n + t) - W(\tau_n))|\mathcal{F}_{\tau_n}] = E[f(W(t))].

Measurability is no problem, so we only need to check that for $A \in \mathcal{F}_{\tau_n}$,

    \int_A f(W(\tau_n + t) - W(\tau_n))dP = P(A)E[f(W(t))].

Observe that $A \cap \{\tau_n = k/n\} \in \mathcal{F}_{k/n}$. Thus

    LHS = \sum_k \int_{A\cap\{\tau_n=k/n\}} f(W(\tfrac{k}{n} + t) - W(\tfrac{k}{n}))dP
        = \sum_k P(A \cap \{\tau_n = k/n\})E[f(W(\tfrac{k}{n} + t) - W(\tfrac{k}{n}))]
        = \sum_k P(A \cap \{\tau_n = k/n\})E[f(W(t))]
        = E[f(W(t))]P(A).

Note also that $\mathcal{F}_{\tau_n} \supset \mathcal{F}_\tau$. Thus

    E[f(W(\tau_n + t) - W(\tau_n))|\mathcal{F}_\tau] = E[f(W(t))].

Let $n \to \infty$ to get

    E[f(W(\tau + t) - W(\tau))|\mathcal{F}_\tau] = E[f(W(t))].    (8.6)

Since $\tau + s$ is a stopping time, (8.6) holds with $\tau$ replaced by $\tau + s$, and it follows that $W_\tau$ has independent Gaussian increments and hence is a Brownian motion. $\square$

Finally, consider

    X(\tau + t, x) = X(\tau, x) + \int_0^t \sigma(X(\tau + s, x))dW_\tau(s) + \int_0^t b(X(\tau + s, x))ds.

By the same argument as for the Markov property, we have

    E[f(X(\tau + t, x))|\mathcal{F}_\tau] = u(t, X(\tau, x)),

where $u(t, x) = E[f(X(t, x))]$. This identity is the strong Markov property.
8.9 Equations for probability distributions.

We have now seen several formulas and assertions of the general form:

    f(X(t)) - \int_0^t Lf(X(s))ds    (8.7)

is a (local) martingale for all $f$ in a specified collection of functions which we will denote $\mathcal{D}(L)$, the domain of $L$. For example, if

    dX = \sigma(X)dW + b(X)dt

and

    Lf(x) = \frac{1}{2}\sum_{i,j} a_{ij}(x)\frac{\partial^2}{\partial x_i\partial x_j}f(x) + \sum_i b_i(x)\frac{\partial}{\partial x_i}f(x)    (8.8)

with

    ((a_{ij}(x))) = \sigma(x)\sigma^T(x),

then (8.7) is a martingale for all $f \in C^2_c$ ($= \mathcal{D}(L)$). ($C^2_c$ denotes the $C^2$ functions with compact support.)

Markov chains provide another example. Suppose

    X(t) = X(0) + \sum_l lY_l(\int_0^t \beta_l(X(s))ds),

where the $Y_l$ are independent unit Poisson processes. Define

    Qf(x) = \sum_l \beta_l(x)(f(x + l) - f(x)).

Then

    f(X(t)) - \int_0^t Qf(X(s))ds

is a (local) martingale.

Since $f(X(t)) - \int_0^t Lf(X(s))ds$ is a martingale,

    E[f(X(t))] = E[f(X(0))] + E[\int_0^t Lf(X(s))ds] = E[f(X(0))] + \int_0^t E[Lf(X(s))]ds.

Let $\nu_t(\Gamma) = P\{X(t) \in \Gamma\}$. Then for all $f$ in the domain of $L$, we have the identity

    \int f d\nu_t = \int f d\nu_0 + \int_0^t \int Lf d\nu_s ds,    (8.9)

which is a weak form of the equation

    \frac{d}{dt}\nu_t = L^*\nu_t.    (8.10)
Theorem 8.6 Let $Lf$ be given by (8.8) with $a$ and $b$ continuous, and let $\nu_t$ be probability measures on $\mathbb{R}^d$ satisfying (8.9) for all $f \in C^2_c(\mathbb{R}^d)$. If $dX = \sigma(X)dW + b(X)dt$ has a unique solution for each initial condition, then $P\{X(0) \in \cdot\} = \nu_0$ implies $P\{X(t) \in \cdot\} = \nu_t$.

In nice situations, $\nu_t(dx) = p_t(x)dx$. Then $L^*$ should be a differential operator satisfying

    \int_{\mathbb{R}^d} pLf\,dx = \int_{\mathbb{R}^d} fL^*p\,dx.

Example 8.7 Let $d = 1$. Integrating by parts, we have

    \int_{-\infty}^\infty p(x)(\frac{1}{2}a(x)f''(x) + b(x)f'(x))dx
        = \frac{1}{2}p(x)a(x)f'(x)\Big|_{-\infty}^\infty
          - \int_{-\infty}^\infty f'(x)(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x))dx.

The first term is zero, and integrating by parts again we have

    \int_{-\infty}^\infty f(x)\frac{d}{dx}(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x))dx,

so

    L^*p = \frac{d}{dx}(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x)).

Example 8.8 Let $Lf = \frac{1}{2}f''$ (Brownian motion). Then $L^*p = \frac{1}{2}p''$, that is, $L$ is self-adjoint.
8.10 Stationary distributions.

Suppose $\int Lf d\pi = 0$ for all $f$ in the domain of $L$. Then

    \int f d\pi = \int f d\pi + \int_0^t \int Lf d\pi ds,

and hence $\nu_t \equiv \pi$ gives a solution of (8.9). Under appropriate conditions, in particular, those of Theorem 8.6, if $P\{X(0) \in \cdot\} = \pi$ and $f(X(t)) - \int_0^t Lf(X(s))ds$ is a martingale for all $f \in \mathcal{D}(L)$, then we have $P\{X(t) \in \cdot\} = \pi$, i.e., $\pi$ is a stationary distribution for $X$.

Let $d = 1$. Assuming $\pi(dx) = \pi(x)dx$,

    \frac{d}{dx}(\frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) - b(x)\pi(x)) = 0,

so the expression in parentheses is a constant; taking the constant to be 0, we have

    \frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) = b(x)\pi(x).

Applying the integrating factor $\exp(-\int_0^x 2b(z)/a(z)dz)$ to get a perfect differential, we have

    \frac{1}{2}e^{-\int_0^x \frac{2b(z)}{a(z)}dz}\frac{d}{dx}(a(x)\pi(x)) - b(x)e^{-\int_0^x \frac{2b(z)}{a(z)}dz}\pi(x) = 0,

so

    a(x)e^{-\int_0^x \frac{2b(z)}{a(z)}dz}\pi(x) = C,
    \pi(x) = \frac{C}{a(x)}e^{\int_0^x \frac{2b(z)}{a(z)}dz}.

Assume $a(x) > 0$ for all $x$. The condition for the existence of a stationary distribution is

    \int_{-\infty}^\infty \frac{1}{a(x)}e^{\int_0^x \frac{2b(z)}{a(z)}dz}dx < \infty.
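The formula is easy to evaluate numerically. A minimal Python sketch (not from the notes) computes $\pi$ on a grid via the trapezoid rule; the Ornstein-Uhlenbeck choice $a(x) = 1$, $b(x) = -x$, for which $\pi(x) = Ce^{-x^2}$ is the $N(0, 1/2)$ density, is an illustrative assumption.

```python
import math

def stationary_density(a, b, xs):
    """Evaluate pi(x) = (C/a(x)) exp(int 2 b(z)/a(z) dz) on the grid xs.
    The lower limit of the integral only changes the constant C, which is
    fixed by normalizing pi to integrate to 1 (trapezoid rule)."""
    g = [2.0 * b(x) / a(x) for x in xs]
    exponent = [0.0]
    for j in range(1, len(xs)):
        exponent.append(exponent[-1] + 0.5 * (g[j - 1] + g[j]) * (xs[j] - xs[j - 1]))
    raw = [math.exp(e) / a(x) for e, x in zip(exponent, xs)]
    Z = sum(0.5 * (raw[j - 1] + raw[j]) * (xs[j] - xs[j - 1]) for j in range(1, len(xs)))
    return [r / Z for r in raw]

# Ornstein-Uhlenbeck: a(x) = 1, b(x) = -x gives pi(x) = C e^{-x^2}, i.e. N(0, 1/2).
xs = [-5 + 10 * k / 2000 for k in range(2001)]
pi = stationary_density(lambda x: 1.0, lambda x: -x, xs)
```

At $x = 0$ the exact density is $1/\sqrt{\pi} \approx 0.564$, and the numerical value should match to several digits.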
8.11 Diffusion with a boundary.

(See Section 11.) Suppose

    X(t) = X(0) + \int_0^t \sigma(X(s))dW(s) + \int_0^t b(X(s))ds + \lambda(t)

with $X(t) \ge 0$, and that $\lambda$ is nondecreasing and increasing only when $X(t) = 0$. Then

    f(X(t)) - \int_0^t Lf(X(s))ds

is a martingale, if $f \in C^2_c$ and $f'(0) = 0$. For such $f$,

    \int_0^\infty p(x)Lf(x)dx = \frac{1}{2}p(x)a(x)f'(x)\Big|_0^\infty
        - \int_0^\infty f'(x)(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x))dx

(the first term is zero since $f'(0) = 0$ and $f$ has compact support)

        = -f(x)(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x))\Big|_0^\infty + \int_0^\infty f(x)L^*p(x)dx,

and hence

    L^*p(x) = \frac{d}{dx}(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x))

for $p$ satisfying

    (\frac{1}{2}a'(0) - b(0))p(0) + \frac{1}{2}a(0)p'(0) = 0.    (8.11)

The density for the distribution of the process should satisfy

    \frac{d}{dt}p_t = L^*p_t,

and the stationary density satisfies

    \frac{d}{dx}(\frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) - b(x)\pi(x)) = 0

subject to the boundary condition (8.11). The boundary condition implies

    \frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) - b(x)\pi(x) = 0,

and hence

    \pi(x) = \frac{c}{a(x)}e^{\int_0^x \frac{2b(z)}{a(z)}dz},  x \ge 0.

Example 8.9 (Reflecting Brownian motion.) Let $X(t) = X(0) + \sigma W(t) - bt + \lambda(t)$, where $a = \sigma^2$ and $b > 0$ are constant. Then

    \pi(x) = \frac{2b}{\sigma^2}e^{-\frac{2b}{\sigma^2}x},

so the stationary distribution is exponential.
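Example 8.9 can be checked by simulation. The sketch below (illustrative, not from the notes) uses a projected Euler step $x \mapsto \max(x + \Delta x, 0)$ as a crude stand-in for the reflection term $\lambda$, and compares the long-run time average with the stationary mean $\sigma^2/(2b)$ of the exponential law.

```python
import math, random

def reflecting_bm_mean(sigma, b, h, n_steps, burn_in, rng):
    """Simulate X(t) = X(0) + sigma W(t) - b t + lambda(t) by projecting the
    Euler step back onto [0, infinity), and return the time average of X
    after a burn-in period.  Stationary law: Exp(2b/sigma^2)."""
    x, total, count = 0.0, 0.0, 0
    for k in range(n_steps):
        x = max(x + sigma * rng.gauss(0.0, math.sqrt(h)) - b * h, 0.0)
        if k >= burn_in:
            total += x
            count += 1
    return total / count

rng = random.Random(3)
avg = reflecting_bm_mean(sigma=1.0, b=0.5, h=0.01, n_steps=200_000, burn_in=20_000, rng=rng)
stationary_mean = 1.0 ** 2 / (2 * 0.5)   # = 1
```

The projection scheme carries an $O(\sqrt{h})$ bias near the boundary, so only rough agreement should be expected at this step size.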
9 Poisson random measures

9.1 Poisson random variables

A random variable $X$ has a Poisson distribution with parameter $\lambda > 0$ (we write $X \sim$ Poisson($\lambda$)) if for each $k \in \{0, 1, 2, \ldots\}$,

    P\{X = k\} = \frac{\lambda^k}{k!}e^{-\lambda}.

From the definition it follows that $E[X] = \lambda$, $Var(X) = \lambda$, and the characteristic function of $X$ is given by

    E[e^{i\theta X}] = e^{\lambda(e^{i\theta}-1)}.

Since the characteristic function of a random variable uniquely determines its distribution, a direct computation shows the following fact.

Proposition 9.1 If $X_1, X_2, \ldots$ are independent random variables with $X_i \sim$ Poisson($\lambda_i$) and $\sum_{i=1}^\infty \lambda_i < \infty$, then

    X = \sum_{i=1}^\infty X_i \sim Poisson(\sum_{i=1}^\infty \lambda_i).

Proof. Since for each $i \in \{1, 2, \ldots\}$, $P(X_i \ge 0) = 1$, it follows that $\sum_{i=1}^k X_i$ is an increasing sequence in $k$. Thus, $X \equiv \sum_{i=1}^\infty X_i$ exists. By the monotone convergence theorem,

    E[X] = \sum_{i=1}^\infty E[X_i] = \sum_{i=1}^\infty \lambda_i < \infty,

and $X$ is finite almost surely. Fix $\theta \in \mathbb{R}$. Then

    E[e^{i\theta X}] = \lim_{k\to\infty} E[e^{i\theta \sum_{i=1}^k X_i}]
                     = \lim_{k\to\infty} e^{(\sum_{i=1}^k \lambda_i)(e^{i\theta}-1)}
                     = e^{(\sum_{i=1}^\infty \lambda_i)(e^{i\theta}-1)},

and hence $X \sim$ Poisson($\sum_{i=1}^\infty \lambda_i$). $\square$

Suppose in the last proposition $\sum_{i=1}^\infty \lambda_i = \infty$. Then

    P\{X \le n\} = \lim_{k\to\infty} P\{\sum_{i=1}^k X_i \le n\}
                 = \lim_{k\to\infty} \sum_{i=0}^n \frac{1}{i!}(\sum_{j=1}^k \lambda_j)^i \exp\{-\sum_{j=1}^k \lambda_j\} = 0.

Thus $P\{X \le n\} = 0$ for every $n \ge 0$. In other words, $P\{X < \infty\} = 0$, and $\sum_{i=1}^\infty X_i \sim$ Poisson($\infty$), in the sense that it is almost surely infinite. From this we conclude the following result.

Corollary 9.2 If $X_1, X_2, \ldots$ are independent random variables with $X_i \sim$ Poisson($\lambda_i$), then

    \sum_{i=1}^\infty X_i \sim Poisson(\sum_{i=1}^\infty \lambda_i).
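The superposition property is simple to test empirically. The sketch below (not from the notes) samples Poisson variates with Knuth's multiplication method, an illustrative choice since the standard library has no Poisson sampler, and checks that the sum of three independent Poissons behaves like a single Poisson with the summed parameter.

```python
import math, random

def poisson_sample(lam, rng):
    """Knuth's method: multiply uniforms until the product drops below e^{-lam};
    the number of factors needed, minus one, is Poisson(lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(11)
lams = [0.5, 1.0, 1.5]          # the sum should be Poisson(3.0)
n = 20_000
samples = [sum(poisson_sample(l, rng) for l in lams) for _ in range(n)]
mean_hat = sum(samples) / n
p3_hat = samples.count(3) / n
p3 = 3.0 ** 3 / math.factorial(3) * math.exp(-3.0)   # Poisson(3) pmf at k = 3
```

Both the empirical mean and the empirical mass at $k = 3$ should match the Poisson(3) values up to Monte Carlo error.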
9.2 Poisson sums of Bernoulli random variables

A random variable $Y$ is said to be Bernoulli with parameter $p \in [0, 1]$ (we write $Y \sim$ Bernoulli($p$)) if

    P\{Y = 1\} = p,  P\{Y = 0\} = 1 - p.

Proposition 9.3 Let $N \sim$ Poisson($\lambda$), and suppose that $Y_1, Y_2, \ldots$ are i.i.d. Bernoulli random variables with parameter $p \in [0, 1]$. If $N$ is independent of the $Y_i$, then $\sum_{i=1}^N Y_i \sim$ Poisson($\lambda p$).

Next, we present a natural generalization of the previous fact. For $j = 1, \ldots, m$, let $e_j$ be the vector in $\mathbb{R}^m$ that has all its entries equal to zero, except for the $j$th, which is 1. For $\theta, y \in \mathbb{R}^m$, let

    \langle\theta, y\rangle = \sum_{j=1}^m \theta_j y_j.

Let $Y = (Y_1, \ldots, Y_m)$, where the $Y_j$ are independent and $Y_j \sim$ Poisson($\lambda_j$). Then the characteristic function of $Y$ has the form

    E[e^{i\langle\theta,Y\rangle}] = \exp\{\sum_{j=1}^m \lambda_j(e^{i\theta_j}-1)\}.

Noting, as before, that the characteristic function of an $\mathbb{R}^m$-valued random variable determines its distribution, we have the following:

Proposition 9.4 Let $N \sim$ Poisson($\lambda$). Suppose that $Y_1, Y_2, \ldots$ are i.i.d. $\mathbb{R}^m$-valued random variables such that for all $k \ge 1$ and $j \in \{1, \ldots, m\}$,

    P\{Y_k = e_j\} = p_j,

where $\sum_{j=1}^m p_j = 1$. Define $X = (X_1, \ldots, X_m) = \sum_{k=1}^N Y_k$. If $N$ is independent of the $Y_k$, then $X_1, \ldots, X_m$ are independent random variables and $X_j \sim$ Poisson($\lambda p_j$).

Proof. For arbitrary $\theta \in \mathbb{R}^m$, it follows that

    E[e^{i\langle\theta,X\rangle}] = \sum_{k\ge 0} E[\exp\{i\langle\sum_{j=1}^k Y_j, \theta\rangle\}]P\{N = k\}
        = e^{-\lambda}\sum_{k\ge 0} (E[e^{i\langle\theta,Y_1\rangle}])^k \frac{\lambda^k}{k!}
        = \exp\{\sum_{j=1}^m \lambda p_j(e^{i\theta_j}-1)\}.

From the last calculation we see that the coordinates of $X$ must be independent and $X_j \sim$ Poisson($\lambda p_j$), as desired. $\square$
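Proposition 9.3 is the "thinning" property of the Poisson distribution, and a direct simulation confirms it. The following sketch (not from the notes) uses Knuth's multiplication method as an illustrative Poisson sampler; the parameter choices $\lambda = 4$, $p = 1/4$ are arbitrary.

```python
import math, random

def poisson_sample(lam, rng):
    """Knuth's method for sampling Poisson(lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(5)
lam, p, n = 4.0, 0.25, 20_000
thinned = []
for _ in range(n):
    N = poisson_sample(lam, rng)
    # Sum of N iid Bernoulli(p), with N independent of the Bernoullis.
    thinned.append(sum(1 for _ in range(N) if rng.random() < p))
mean_hat = sum(thinned) / n
var_hat = sum((x - mean_hat) ** 2 for x in thinned) / n
# For Poisson(lam * p) = Poisson(1), mean and variance are both 1.
```

Matching mean and variance is, of course, only a necessary check; the full distributional statement is Proposition 9.3.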
9.3 Poisson random measures

Let $(E, \mathcal{E})$ be a measurable space, and let $\nu$ be a $\sigma$-finite measure on $\mathcal{E}$. Let $\mathcal{N}(E)$ be the collection of counting measures, that is, measures with nonnegative integer values, on $E$. $\xi$ is an $\mathcal{N}(E)$-valued random variable on a probability space $(\Omega, \mathcal{F}, P)$ if for each $\omega \in \Omega$, $\xi(\omega, \cdot) \in \mathcal{N}(E)$ and for each $A \in \mathcal{E}$, $\xi(A)$ is a random variable with values in $\mathbb{N} \cup \{\infty\}$. (For convenience, we will write $\xi(A)$ instead of $\xi(\cdot, A)$.)

An $\mathcal{N}(E)$-valued random variable $\xi$ is a Poisson random measure with mean measure $\nu$ if

(i) For each $A \in \mathcal{E}$, $\xi(A) \sim$ Poisson($\nu(A)$).
(ii) If $A_1, A_2, \ldots \in \mathcal{E}$ are disjoint, then $\xi(A_1), \xi(A_2), \ldots$ are independent random variables.

Clearly, $\nu$ determines the distribution of $\xi$, provided $\xi$ exists. We first show existence for $\nu$ finite, and then we consider $\nu$ $\sigma$-finite.

Proposition 9.5 Suppose that $\nu$ is a measure on $(E, \mathcal{E})$ such that $\nu(E) < \infty$. Then there exists a Poisson random measure with mean measure $\nu$.

Proof. The case $\nu(E) = 0$ is trivial, so assume that $\nu(E) \in (0, \infty)$. Let $N$ be a Poisson random variable defined on a probability space $(\Omega, \mathcal{F}, P)$ with $E[N] = \nu(E)$. Let $X_1, X_2, \ldots$ be i.i.d. $E$-valued random variables such that for every $A \in \mathcal{E}$,

    P\{X_j \in A\} = \frac{\nu(A)}{\nu(E)},

and assume that $N$ is independent of the $X_j$.

Define $\xi$ by $\xi(A) = \sum_{k=1}^N 1_{\{X_k(\omega)\in A\}}$. In other words,

    \xi = \sum_{k=1}^N \delta_{X_k},

where, for each $x \in E$, $\delta_x$ is the Dirac mass at $x$.

Clearly, for each $\omega$, $\xi$ is a counting measure on $\mathcal{E}$. To conclude that $\xi$ is a Poisson random measure, it is enough to check that given disjoint sets $A_1, \ldots, A_m \in \mathcal{E}$ such that $\bigcup_{i=1}^m A_i = E$, $\xi(A_1), \ldots, \xi(A_m)$ are independent and $\xi(A_i) \sim$ Poisson($\nu(A_i)$). For this, define the $\mathbb{R}^m$-valued random vectors

    Z_j = (1_{\{X_j\in A_1\}}, \ldots, 1_{\{X_j\in A_m\}}).

Note that, for every $j \ge 1$ and $i \in \{1, \ldots, m\}$, $P\{Z_j = e_i\} = \frac{\nu(A_i)}{\nu(E)}$, since $A_1, \ldots, A_m$ partition $E$. Since $N$ and the $X_j$ are mutually independent, it follows that $N$ and the $Z_j$ are also. Finally, since

    (\xi(A_1), \ldots, \xi(A_m)) = \sum_{j=1}^N Z_j,

by Proposition 9.4, we conclude that $\xi(A_1), \ldots, \xi(A_m)$ are independent random variables and $\xi(A_i) \sim$ Poisson($\nu(A_i)$). $\square$

The existence of a Poisson random measure in the $\sigma$-finite case is a simple consequence of the following kind of superposition result.
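The construction in the proof of Proposition 9.5 translates directly into code. The sketch below (not from the notes) takes $E = [0,1]^2$ with $\nu$ a constant multiple of Lebesgue measure, an illustrative choice, and checks the Poisson law of $\xi(A)$ for one set $A$.

```python
import math, random

def poisson_sample(lam, rng):
    """Knuth's method for sampling Poisson(lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def poisson_random_measure(intensity, rng):
    """Proposition 9.5 construction on E = [0,1]^2 with nu = intensity * Lebesgue:
    draw N ~ Poisson(nu(E)), then N iid points with law nu / nu(E); the random
    measure xi = sum_k delta_{X_k} is returned as its list of atoms."""
    N = poisson_sample(intensity, rng)          # nu(E) = intensity * area = intensity
    return [(rng.random(), rng.random()) for _ in range(N)]

rng = random.Random(9)
counts = []
for _ in range(5000):
    atoms = poisson_random_measure(2.0, rng)
    # xi(A) for A = the left half of the square, so nu(A) = 1.0
    counts.append(sum(1 for (x, y) in atoms if x < 0.5))
mean_hat = sum(counts) / len(counts)
```

By the proposition, $\xi(A) \sim$ Poisson($\nu(A)$) $=$ Poisson(1), so the empirical mean of the counts should be close to 1.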
Proposition 9.6 Suppose that $\nu_1, \nu_2, \ldots$ are finite measures defined on $\mathcal{E}$, and that $\nu = \sum_{i=1}^\infty \nu_i$ is $\sigma$-finite. For $k = 1, 2, \ldots$, let $\xi_k$ be a Poisson random measure with mean measure $\nu_k$, and assume that $\xi_1, \xi_2, \ldots$ are independent. Then $\xi = \sum_{k=1}^\infty \xi_k$ defines a Poisson random measure with mean measure $\nu$.

Proof. By Proposition 9.5, for each $i \ge 1$ there exists a probability space $(\Omega_i, \mathcal{F}_i, P_i)$ and a Poisson random measure $\xi_i$ on $(\Omega_i, \mathcal{F}_i, P_i)$ with mean measure $\nu_i$. Consider the product space $(\Omega, \mathcal{F}, P)$ where

    \Omega = \Omega_1 \times \Omega_2 \times \cdots,
    \mathcal{F} = \mathcal{F}_1 \times \mathcal{F}_2 \times \cdots,
    P = P_1 \times P_2 \times \cdots.

Note that any random variable $X_i$ defined on $(\Omega_i, \mathcal{F}_i, P_i)$ can be viewed as a random variable on $(\Omega, \mathcal{F}, P)$ by setting $X_i(\omega) = X_i(\omega_i)$. We claim the following:

(a) for $A \in \mathcal{E}$ and $i \ge 1$, $\xi_i(A) \sim$ Poisson($\nu_i(A)$);
(b) if $A_1, A_2, \ldots \in \mathcal{E}$, then $\xi_1(A_1), \xi_2(A_2), \ldots$ are independent random variables;
(c) $\xi(A) = \sum_{i=1}^\infty \xi_i(A)$ is a Poisson random measure with mean measure $\nu$.

(a) and (b) are direct consequences of the definitions. For (c), first note that $\xi$ is a counting measure on $\mathcal{E}$ for each fixed $\omega$. Moreover, from (a), (b) and Corollary 9.2, we have that $\xi(A) \sim$ Poisson($\nu(A)$). Now, suppose that $B_1, B_2, \ldots \in \mathcal{E}$ are disjoint sets. Then, by (b) it follows that the random variables

    \xi_1(B_1), \xi_2(B_1), \ldots, \xi_1(B_2), \xi_2(B_2), \ldots, \xi_1(B_n), \xi_2(B_n), \ldots

are independent. Consequently, $\xi(B_1), \xi(B_2), \ldots$ are independent, and therefore $\xi$ is a Poisson random measure with mean $\nu$. $\square$

Suppose now that $\nu$ is a $\sigma$-finite measure. By definition, there exist disjoint $E_i$ such that $E = \bigcup_{i=1}^\infty E_i$ and $\nu(E_i) < \infty$ for all $i \ge 1$. Now, for each $i \ge 1$, consider the measure $\nu_i$ defined on $\mathcal{E}$ by the formula $\nu_i(A) = \nu(A \cap E_i)$. Clearly each measure $\nu_i$ is finite and $\nu = \sum_{i=1}^\infty \nu_i$. Therefore, by Proposition 9.6 we have the following

Corollary 9.7 Suppose that $\nu$ is a $\sigma$-finite measure defined on $\mathcal{E}$. Then there exists a Poisson random measure with mean measure $\nu$.
9.4 Integration w.r.t. a Poisson random measure

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $(E, \mathcal{E})$ be a measurable space. Let $\nu$ be a $\sigma$-finite measure on $(E, \mathcal{E})$, and let $\xi$ be a Poisson random measure with mean measure $\nu$. Recall that for each $\omega \in \Omega$, $\xi(\omega, \cdot)$ is a counting measure on $\mathcal{E}$. If $f : E \to \mathbb{R}$ is a measurable function with $\int_E |f|d\nu < \infty$, then we claim that

    \omega \to \int_E f(x)\xi(\omega, dx)

is an $\mathbb{R}$-valued random variable. Consider first simple functions defined on $E$, that is, $f = \sum_{j=1}^n c_j 1_{A_j}$, where $n \in \mathbb{N}$, $c_1, \ldots, c_n \in \mathbb{R}$, and $A_1, \ldots, A_n \in \mathcal{E}$ are such that $\nu(A_j) < \infty$ for all $j \in \{1, \ldots, n\}$. Then

    X_f(\omega) = \int_E f(x)\xi(\omega, dx) = \sum_{j=1}^n c_j\xi(A_j)

is a random variable. Note that

    E[X_f] = \int_E fd\nu,  E[|X_f|] \le \int_E |f|d\nu,    (9.1)

with equality holding if $f \ge 0$. Recall that the spaces

    L^1(\nu) = \{h : E \to \mathbb{R} : h is measurable, and \int_E |h|d\nu < \infty\}

and

    L^1(P) = \{X : \Omega \to \mathbb{R} : X is a random variable, and E[|X|] < \infty\}

are Banach spaces under the norms $\|h\| = \int_E |h|d\nu$ and $\|X\| = E[|X|]$, respectively. Since the space of simple functions defined on $E$ is dense in $L^1(\nu)$, for $f \in L^1(\nu)$ we can construct a sequence of simple functions $f_n$ such that $f_n \to f$ pointwise and in $L^1$ with $|f_n| \le |f|$. It follows that $X_f(\omega) = \int_E f(x)\xi(\omega, dx)$ is a random variable satisfying (9.1).

As convenient, we will use any of the following to denote the integral:

    X_f = \int_E f(x)\xi(dx) = \langle f, \xi\rangle.
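Since $\xi$ is a sum of Dirac masses, $X_f$ is simply the sum of $f$ over the atoms of $\xi$, and the identity $E[X_f] = \int_E f\,d\nu$ in (9.1) (Campbell's formula) can be checked by simulation. The sketch below (not from the notes) uses $E = [0,1]^2$, $\nu = 2 \cdot$ Lebesgue, and $f(x, y) = x$, all illustrative choices.

```python
import math, random

def poisson_sample(lam, rng):
    """Knuth's method for sampling Poisson(lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(13)
intensity, n_rep = 2.0, 5000
f = lambda x, y: x                      # an element of L^1(nu)
vals = []
for _ in range(n_rep):
    N = poisson_sample(intensity, rng)                  # N ~ Poisson(nu(E))
    atoms = [(rng.random(), rng.random()) for _ in range(N)]
    vals.append(sum(f(x, y) for x, y in atoms))         # X_f = <f, xi>
mean_hat = sum(vals) / n_rep
exact = intensity * 0.5                 # int_E f dnu = 2 * int_0^1 x dx = 1
```

The empirical mean of $X_f$ should agree with $\int_E f\,d\nu = 1$ up to Monte Carlo error.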
From the interpretation of X
f
as an ordinary integral, we have the following.
Proposition 9.8 Let f, g L
1
().
(a) If f g -a.s., then X
f
X
g
P-a.s.
(b) If R -a.s., then X
f
= X
f
P-a.s.
(c) X
f+g
= X
f
+ X
g
P-a.s.
For $A \in \mathcal{E}$, let $\mathcal{F}_A = \sigma(\xi(B) : B \in \mathcal{E},\ B \subset A)$. Note that if $A_1$ and $A_2$ are disjoint, then $\mathcal{F}_{A_1}$ and $\mathcal{F}_{A_2}$ are independent, that is, if $H_1 \in \mathcal{F}_{A_1}$ and $H_2 \in \mathcal{F}_{A_2}$, then $P(H_1 \cap H_2) = P(H_1)P(H_2)$. In the proof of the previous result, we have used the following result.

Proposition 9.9 Suppose that $f, g \in L^1(\nu)$ have disjoint supports, i.e.,
$$\int_E |f| \wedge |g|\,d\nu = 0.$$
Then $X_f$ and $X_g$ are independent.

Proof. Define $A := \{|f| > 0\}$ and $B := \{|f| = 0\}$. Note that
$$X_f = \int_A f(x)\,\xi(dx)$$
is $\mathcal{F}_A$-measurable and
$$X_g = \int g(x)\,\xi(dx) = \int_B g(x)\,\xi(dx) \quad\text{a.s.},$$
where the right side is $\mathcal{F}_B$-measurable. Since $\mathcal{F}_A$ and $\mathcal{F}_B$ are independent, it follows that $X_f$ and $X_g$ are independent. $\square$
If $\nu$ has no atoms, then the support of $\xi(\omega, \cdot)$ has $\nu$-measure zero. Consequently, the following simple consequence of the above observations may at first appear surprising.

Proposition 9.10 If $f, g \in L^1(\nu)$, then $f = g$ $\nu$-a.s. if and only if $X_f = X_g$ $P$-a.s.

Proof. That the condition is sufficient follows directly from the linearity of $X_f$ and (9.1). For the converse, without loss of generality, we only need to prove that $X_f = 0$ $P$-a.s. implies that $f = 0$ $\nu$-a.s. Since $f = f^+ - f^-$, where $f^+ = f1_{\{f \ge 0\}}$ and $f^- = -f1_{\{f < 0\}}$, it follows that $0 = X_f = X_{f^+} - X_{f^-}$ a.s. Note that $X_{f^+}$ and $X_{f^-}$ are independent, since the support of $f^+$ is disjoint from the support of $f^-$. Consequently, $X_{f^+}$ and $X_{f^-}$ must be equal a.s. to the same constant. Similar analysis demonstrates that the constant must be zero, and we have
$$\int_E |f|\,d\nu = \int_E f^+\,d\nu + \int_E f^-\,d\nu = E[X_{f^+}] + E[X_{f^-}] = 0. \qquad\square$$
9.5 Extension of the integral w.r.t. a Poisson random measure

We are going to extend our definition of $X_f$ to a larger class of functions $f$. As motivation, if $\nu$ is a finite measure, then, as we saw in the proof of Proposition 9.5, $\xi = \sum_{k=1}^N \delta_{X_k}$ is a Poisson random measure with mean $\nu$, whenever $N \sim \mathrm{Poisson}(\nu(E))$ is independent of the sequence of i.i.d. $E$-valued random variables $X_1, X_2, \dots$ with $P(X_i \in A) = \frac{\nu(A)}{\nu(E)}$. Now, given any measurable function $f : E \to \mathbb{R}$, it is natural to define
$$\int_E f(x)\,\xi(dx) = \sum_{k=1}^N f(X_k),$$
and we want to ensure that this definition is consistent.
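This construction is easy to probe numerically. The following sketch is not from the notes: the choice of mean measure ($\nu = 3\cdot$Lebesgue on $[0,2)$, so $\nu/\nu(E)$ is uniform) and the helper name `sample_xi_integral` are our own, used only to check $E[X_f] = \int_E f\,d\nu$ by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite mean measure nu on E = [0, 2): nu = 3 * Lebesgue, so nu(E) = 6.
nu_total = 6.0

def sample_xi_integral(f, n_samples):
    """Sample X_f = sum_{k=1}^N f(X_k) for xi = sum_{k<=N} delta_{X_k},
    with N ~ Poisson(nu(E)) and X_k i.i.d. with law nu/nu(E)."""
    out = np.empty(n_samples)
    for i in range(n_samples):
        N = rng.poisson(nu_total)
        points = rng.uniform(0.0, 2.0, size=N)  # nu/nu(E) is uniform on [0, 2)
        out[i] = f(points).sum()
    return out

f = lambda x: x**2
samples = sample_xi_integral(f, 20000)
# E[X_f] = int_E f dnu = 3 * int_0^2 x^2 dx = 8
print(samples.mean())
```

With $f(x) = x^2$, the sample mean should fall close to $\int_E f\,d\nu = 8$, illustrating the first identity in (9.1).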
Proposition 9.11 If $f : E \to \mathbb{R}$ is a simple function, then for all $a, b > 0$,
$$P\{|X_f| \ge b\} \le \frac{1}{b}\int_E |f| \wedge a\,d\nu + \Big(1 - \exp\Big\{-a^{-1}\int_E |f| \wedge a\,d\nu\Big\}\Big).$$

Proof. First, notice that
$$P\{|X_f| \ge b\} \le P\{X_{|f|} \ge b\} \le P\{X_{|f|1_{\{|f|\le a\}}} \ge b\} + P\{X_{|f|1_{\{|f|>a\}}} > 0\}.$$
But, by the Markov inequality,
$$P\{X_{|f|1_{\{|f|\le a\}}} \ge b\} \le \frac{1}{b}\int_{\{|f|\le a\}} |f|\,d\nu \le \frac{1}{b}\int_E (|f| \wedge a)\,d\nu.$$
On the other hand, since $P\{X_{|f|1_{\{|f|>a\}}} > 0\} = P\{\xi\{|f| > a\} > 0\}$ and $\xi\{|f| > a\} \sim \mathrm{Poisson}(\nu\{|f| > a\})$, we have
$$P\{\xi\{|f| > a\} > 0\} = 1 - e^{-\nu\{|f|>a\}} \le 1 - \exp\Big\{-a^{-1}\int_E |f| \wedge a\,d\nu\Big\},$$
giving the desired result. $\square$
Consider the vector space
$$L^{1,0}(\nu) = \{f : E \to \mathbb{R} : |f| \wedge 1 \in L^1(\nu)\}.$$
Notice that $L^1(\nu) \subset L^{1,0}(\nu)$. In particular, $L^{1,0}(\nu)$ contains all simple functions defined on $E$ whose support has finite $\nu$-measure. Moreover, if $\nu$ is a finite measure, then this vector space is simply the space of all measurable functions. Given $f, g \in L^{1,0}(\nu)$, we define the distance between $f$ and $g$ as
$$d(f, g) = \int_E |f - g| \wedge 1\,d\nu.$$
The function $d$ defines a metric on $L^{1,0}(\nu)$. Note that $d$ does not come from any norm; however, it is easy to check that

(a) $d(f - g, p - q) = d(f - p, g - q)$

(b) $d(f - g, 0) = d(f, g)$

Before considering the next simple but useful result, note that
$$\int_E |f| \wedge 1\,d\nu = \int_{\{|f|\le 1\}} |f|\,d\nu + \nu\{|f| > 1\}.$$
Hence, a function $f$ belongs to $L^{1,0}(\nu)$ if and only if both terms on the right side of the last equality are finite.

Proposition 9.12 Under the metric $d$, the space of simple functions is dense in $L^{1,0}(\nu)$. Moreover, for every $f \in L^{1,0}(\nu)$, there exists a sequence of simple functions $f_n$ such that $|f_n| \le |f|$ for every $n$, and $f_n$ converges pointwise and under $d$ to $f$.
Proof. Let $f \in L^{1,0}(\nu)$. First, suppose that $f \ge 0$, and define
$$f_n(x) = \sum_{k=0}^{n2^n-1} k2^{-n}1_{\{k2^{-n} \le f < (k+1)2^{-n}\}} + n1_{\{f \ge n\}}. \tag{9.2}$$
Then $f_n$ is a nonnegative increasing sequence that converges pointwise to $f$, and the range of $f_n$ is finite for all $n$. Consequently,
$$\int_E f_n\,d\nu = \int_{\{f\le 1\}} f_n\,d\nu + \int_{\{f>1\}} f_n\,d\nu \le \int_{\{f\le 1\}} f\,d\nu + n\,\nu\{f > 1\} < \infty,$$
and for each $n$,
$$0 \le (f - f_n) \wedge 1 \le (f \wedge 1) \in L^1(\nu).$$
Therefore, since $\lim_{n\to\infty}(f - f_n) \wedge 1 = 0$, by the dominated convergence theorem we can conclude that $\lim_{n\to\infty} d(f, f_n) = 0$.

For arbitrary $f \in L^{1,0}(\nu)$, write $f = f^+ - f^-$. Define $f_n^+$ and $f_n^-$ as in (9.2), and set $f_n = f_n^+ - f_n^-$. Since $L^{1,0}(\nu)$ is linear and $d$ is a metric,
$$d(f, f_n) \le d(f^+, f_n^+) + d(f^-, f_n^-),$$
and the proposition follows. $\square$
Suppose that $f \in L^{1,0}(\nu)$. By Proposition 9.12 there exists a sequence of simple functions $f_n$ such that $f_n$ converges to $f$ under $d$. But, from Proposition 9.11 with $a = 1$, we see that for all $n, m$ and $b > 0$,
$$P\{|X_{f_n} - X_{f_m}| \ge b\} = P\{|X_{f_n - f_m}| \ge b\} \le \frac{1}{b}d(f_n, f_m) + 1 - e^{-d(f_n, f_m)},$$
and we conclude that the sequence $\{X_{f_n}\}$ is Cauchy in probability. Therefore, there exists a random variable $X_f$ such that
$$X_f = \lim_{n\to\infty} X_{f_n},$$
where the last limit is in probability. As we showed in Section 9.4, this limit does not depend on the sequence of simple functions chosen to converge to $f$. Therefore, for every $f \in L^{1,0}(\nu)$, $X_f$ is well-defined and the definition is consistent with the previous definition for $f \in L^1(\nu)$.

Before continuing, let us consider the generality of our selection of the space $L^{1,0}(\nu)$. From Proposition 9.11, we could have considered the space
$$L^{1,a}(\nu) := \{f : E \to \mathbb{R} : |f| \wedge a \in L^1(\nu)\}$$
for some value of $a$ other than 1; however, $L^{1,a}(\nu) = L^{1,0}(\nu)$, and the corresponding metric
$$d_a(f, g) := \int_E (|f - g| \wedge a)\,d\nu$$
is equivalent to $d$.
Proposition 9.13 If $f \in L^{1,0}(\nu)$, then for all $\theta \in \mathbb{R}$,
$$E[e^{i\theta X_f}] = \exp\Big\{\int_E (e^{i\theta f(x)} - 1)\,\nu(dx)\Big\}.$$

Proof. First consider simple $f$. Then, without loss of generality, we can write $f = \sum_{j=1}^n c_j1_{A_j}$ with $\nu(A_j) < \infty$ for all $j \in \{1, \dots, n\}$ and $A_1, \dots, A_n$ disjoint. Since $\xi(A_1), \dots, \xi(A_n)$ are independent and $\xi(A_j) \sim \mathrm{Poisson}(\nu(A_j))$, we have that
$$E[e^{i\theta X_f}] = \prod_{j=1}^n E\big[e^{i\theta c_j\xi(A_j)}\big] = \prod_{j=1}^n e^{\nu(A_j)(e^{i\theta c_j}-1)} = \exp\Big\{\sum_{j=1}^n (e^{i\theta c_j} - 1)\nu(A_j)\Big\} = \exp\Big\{\int_E (e^{i\theta f(x)} - 1)\,\nu(dx)\Big\}.$$
The general case follows by approximating $f$ by simple functions $f_n$ as in Proposition 9.12 and noting that both sides of the identity
$$E[e^{i\theta X_{f_n}}] = \exp\Big\{\int_E (e^{i\theta f_n(x)} - 1)\,\nu(dx)\Big\}$$
converge by the dominated convergence theorem. $\square$
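For a simple $f$, $X_f$ is an explicit linear combination of independent Poisson variables, so the characteristic-function formula can be checked directly. The sketch below is our own illustration (the particular values $c_j$, $\nu(A_j)$, and $\theta$ are arbitrary choices), comparing the empirical characteristic function with the closed form.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simple f = 1.0 on A1 and -2.0 on A2 (disjoint), nu(A1) = 2.0, nu(A2) = 0.5.
c = np.array([1.0, -2.0])
nuA = np.array([2.0, 0.5])
theta = 0.7

# X_f = c1*xi(A1) + c2*xi(A2), with xi(A_j) independent Poisson(nu(A_j)).
counts = rng.poisson(nuA, size=(200000, 2))
X_f = counts @ c

empirical = np.exp(1j * theta * X_f).mean()
exact = np.exp(((np.exp(1j * theta * c) - 1.0) * nuA).sum())
print(abs(empirical - exact))
```

The printed gap between the Monte Carlo estimate and $\exp\{\int_E(e^{i\theta f}-1)\,d\nu\}$ should be on the order of $1/\sqrt{200000}$.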
9.6 Centered Poisson random measure

Let $\xi$ be a Poisson random measure with mean $\nu$. We define the centered random measure $\tilde\xi$ for $\xi$ by
$$\tilde\xi(A) = \xi(A) - \nu(A), \qquad A \in \mathcal{E},\ \nu(A) < \infty.$$
Note that, for each $K \in \mathcal{E}$ with $\nu(K) < \infty$ and almost every $\omega$, the restriction of $\tilde\xi(\omega, \cdot)$ to $K$ is a finite signed measure.

In the previous section, we defined $\int_E f(x)\,\xi(dx)$ for every $f \in L^{1,0}(\nu)$. Now, let $f = \sum_{j=1}^n c_j1_{A_j}$ be a simple function with $\nu(A_j) < \infty$. Then, the integral of $f$ with respect to $\tilde\xi$ is the random variable $\tilde X_f$ (sometimes we write $\int_E f(x)\,\tilde\xi(dx)$) defined as
$$\tilde X_f = \int_E f(x)\,\xi(dx) - \int_E f(x)\,\nu(dx) = \sum_{j=1}^n c_j\big(\xi(A_j) - \nu(A_j)\big).$$
Note that $E[\tilde X_f] = 0$.

Clearly, from our definition, it follows for simple functions $f, g$ and $\alpha, \beta \in \mathbb{R}$ that
$$\tilde X_{\alpha f + \beta g} = \alpha\tilde X_f + \beta\tilde X_g.$$
Therefore, the integral with respect to a centered Poisson random measure is a linear function on the space of simple functions. The next result is the key to extending our definition to the space $L^2(\nu) = \{h : E \to \mathbb{R} : h \text{ is measurable and } \int_E h^2\,d\nu < \infty\}$, which is a Banach space under the norm $\|h\|_2 = (\int_E h^2\,d\nu)^{1/2}$. Similarly, $L^2(P) = \{X : \Omega \to \mathbb{R} : X \text{ is a random variable and } \int_\Omega X^2\,dP < \infty\}$ is a Banach space under the norm $\|X\|_2 = E[X^2]^{1/2}$.
Proposition 9.14 If $f$ is a simple function, then $E[\tilde X_f^2] = \int_E f^2\,d\nu$.

Proof. Write $f = \sum_{j=1}^n c_j1_{A_j}$, where $A_1, \dots, A_n$ are disjoint sets with $\nu(A_j) < \infty$. Since $\xi(A_1), \dots, \xi(A_n)$ are independent and $\xi(A_j) \sim \mathrm{Poisson}(\nu(A_j))$, we have that
$$E[\tilde X_f^2] = E\Big[\sum_{j=1}^n\sum_{i=1}^n c_jc_i\big(\xi(A_j) - \nu(A_j)\big)\big(\xi(A_i) - \nu(A_i)\big)\Big] = \sum_{j=1}^n c_j^2\,E\big[(\xi(A_j) - \nu(A_j))^2\big] = \sum_{j=1}^n c_j^2\,\nu(A_j) = \int_E f^2\,d\nu. \qquad\square$$

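The isometry of Proposition 9.14 is also easy to check by simulation for a simple $f$. The sketch below is ours (the values of $c_j$ and $\nu(A_j)$ are arbitrary): the empirical second moment of $\tilde X_f = \sum_j c_j(\xi(A_j) - \nu(A_j))$ should be close to $\int_E f^2\,d\nu$.

```python
import numpy as np

rng = np.random.default_rng(2)

# f = 2 on A1 and -1 on A2 (disjoint), with nu(A1) = 1.5, nu(A2) = 3.0.
c = np.array([2.0, -1.0])
nuA = np.array([1.5, 3.0])

counts = rng.poisson(nuA, size=(200000, 2))
X_tilde = (counts - nuA) @ c        # centered integral sum_j c_j (xi(A_j) - nu(A_j))

second_moment = (X_tilde**2).mean()
print(second_moment)                # should be near int f^2 dnu = 4*1.5 + 1*3.0 = 9.0
```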
The last proposition shows that $\tilde X_f$ determines a linear isometry from $L^2(\nu)$ into $L^2(P)$. Therefore, since the space of simple functions is dense in $L^2(\nu)$, we can extend the definition of $\tilde X_f$ to all $f \in L^2(\nu)$. As in Section 9.4, if $(f_n)$ is a sequence of simple functions that converges to $f$ in $L^2(\nu)$, then we define
$$\tilde X_f = \lim_{n\to\infty}\tilde X_{f_n},$$
where the limit is in $L^2(P)$. Clearly, the linearity of $\tilde X$ over the space of simple functions is inherited by the limit. Also, for every $f \in L^2(\nu)$, we have that
$$E[\tilde X_f^2] = \int_E f^2\,d\nu \qquad\text{and}\qquad E[\tilde X_f] = 0.$$
Before continuing, note that if $f$ is a simple function, then $E[|\tilde X_f|] \le 2\int_E |f|\,d\nu$. This inequality is enough to extend the definition of $\tilde X$ to the space $L^1(\nu)$, where the simple functions are also dense. Since $\nu$ is not necessarily finite, the spaces $L^1(\nu)$ and $L^2(\nu)$ are not necessarily comparable. This slightly different approach will in the end be irrelevant, because the space in which we are going to define $\tilde X$ contains both $L^2(\nu)$ and $L^1(\nu)$.
Now, we extend the definition of $\tilde X_f$ to a larger class of functions $f$. For this purpose, consider the vector space
$$L^{2,1}(\nu) = \{f : E \to \mathbb{R} : |f|^2 \wedge |f| \in L^1(\nu)\},$$
or equivalently, let
$$\varphi(z) = z^21_{[0,1]}(z) + (2z - 1)1_{(1,\infty)}(z).$$
Then
$$L^{2,1}(\nu) = L_\varphi(\nu) = \{f : E \to \mathbb{R} : \varphi(|f|) \in L^1(\nu)\}.$$
Note that $L^1(\nu) \subset L_\varphi(\nu)$ and $L^2(\nu) \subset L_\varphi(\nu)$. In particular, $L_\varphi(\nu)$ contains all the simple functions defined on $E$ whose support has finite $\nu$-measure. Since $\varphi$ is convex and nondecreasing and $\varphi(0) = 0$, $L_\varphi(\nu)$ is an Orlicz space with norm
$$\|f\|_\varphi = \inf\Big\{c > 0 : \int_E \varphi\Big(\frac{|f|}{c}\Big)\,d\nu \le 1\Big\}.$$
As in Proposition 9.12, the space of simple functions with support having finite $\nu$-measure is dense in $L_\varphi(\nu)$. The proof of this assertion follows by the same argument as in Proposition 9.12.

Proposition 9.15 The space of simple functions is dense in $L_\varphi(\nu)$. Moreover, for every $f \in L_\varphi(\nu)$ there exists a sequence of simple functions $f_n$ such that $|f_n| \le |f|$ for every $n$, and $f_n$ converges to $f$ pointwise and in $\|\cdot\|_\varphi$.

Proof. Take $f_n = f_n^+ - f_n^-$ as constructed in the proof of Proposition 9.12. $\square$
Again the key to our extension is an inequality that allows us to define $\tilde X$ as a limit in probability instead of one in $L^2(P)$.

Proposition 9.16 If $f : E \to \mathbb{R}$ is a simple function with support having finite $\nu$-measure, then
$$E[|\tilde X_f|] \le 3\|f\|_\varphi.$$

Proof. Fix $c > 0$. Then
$$E[|\tilde X_f|] \le cE\big[\big|\tilde X_{c^{-1}f1_{\{c^{-1}|f|\le 1\}}}\big|\big] + cE\big[\big|\tilde X_{c^{-1}f1_{\{c^{-1}|f|>1\}}}\big|\big]$$
$$\le c\Big(E\big[\big|\tilde X_{c^{-1}f1_{\{c^{-1}|f|\le 1\}}}\big|^2\big]\Big)^{1/2} + 2c\int_E |c^{-1}f|1_{\{c^{-1}|f|>1\}}\,d\nu$$
$$= c\Big(\int_E c^{-2}f^21_{\{c^{-1}|f|\le 1\}}\,d\nu\Big)^{1/2} + 2c\int_E |c^{-1}f|1_{\{c^{-1}|f|>1\}}\,d\nu$$
$$\le c\Big(\int_E \varphi(c^{-1}|f|)\,d\nu\Big)^{1/2} + 2c\int_E \varphi(c^{-1}|f|)\,d\nu.$$
Taking $c = \|f\|_\varphi$, each integral on the right is at most one, and the right side is bounded by $3\|f\|_\varphi$. $\square$
Now, suppose that $f \in L_\varphi(\nu)$. Then by Proposition 9.15 there exists a sequence $f_n$ of simple functions that converges to $f$ in $\|\cdot\|_\varphi$. But, from Proposition 9.16, it follows that for all $n, m \ge 1$,
$$P\{|\tilde X_{f_n} - \tilde X_{f_m}| \ge a\} \le \frac{3\|f_n - f_m\|_\varphi}{a}.$$
Therefore, the sequence $\tilde X_{f_n}$ is Cauchy in probability, and hence there exists a random variable $\tilde X_f$ such that
$$\tilde X_f = \lim_{n\to\infty}\tilde X_{f_n} \quad\text{in probability}.$$
As usual, the definition does not depend on the choice of $f_n$. Also, note that the definition of $\tilde X_f$ for functions $f \in L^2(\nu)$ is consistent with the definition for $f \in L_\varphi(\nu)$.

We close the present section with the calculation of the characteristic function of the random variable $\tilde X_f$.
Proposition 9.17 If $f \in L_\varphi(\nu)$, then for all $\theta \in \mathbb{R}$,
$$E[e^{i\theta\tilde X_f}] = \exp\Big\{\int_E \big(e^{i\theta f(x)} - 1 - i\theta f(x)\big)\,\nu(dx)\Big\}. \tag{9.3}$$

Proof. First, from Proposition 9.13, (9.3) holds for simple functions. Now, let $f \in L_\varphi(\nu)$, and let $f_n$ be as in Proposition 9.15. Since $\tilde X_f = \lim_{n\to\infty}\tilde X_{f_n}$ in probability, without loss of generality we can assume that $\tilde X_{f_n}$ converges almost surely to $\tilde X_f$. Hence, for every $\theta \in \mathbb{R}$, $\lim_{n\to\infty}E[e^{i\theta\tilde X_{f_n}}] = E[e^{i\theta\tilde X_f}]$. On the other hand, since $f_n$ converges pointwise to $f$, it follows that for all $\theta \in \mathbb{R}$,
$$\lim_{n\to\infty}\big(e^{i\theta f_n} - 1 - i\theta f_n\big) = e^{i\theta f} - 1 - i\theta f.$$
But there exists a constant $k > 0$ such that
$$\big|e^{i\theta f_n} - 1 - i\theta f_n\big| \le k\big(|f_n|^2 \wedge |f_n|\big) \le k\big(|f|^2 \wedge |f|\big) \in L^1(\nu),$$
and the result follows by the dominated convergence theorem. $\square$
9.7 Time dependent Poisson random measures

Let $(U, \mathcal{U}, \mu)$ be a measurable space where $\mu$ is a $\sigma$-finite measure, and define
$$\mathcal{A} = \{A \in \mathcal{U} : \mu(A) < \infty\}.$$
Let $\mathcal{B}[0,\infty)$ be the Borel $\sigma$-algebra of $[0,\infty)$, and denote Lebesgue measure on $\mathcal{B}[0,\infty)$ by $m$. Then the product measure $\mu\times m$ is $\sigma$-finite on $\mathcal{U}\times\mathcal{B}[0,\infty)$, and therefore, by Corollary 9.7, there exists a Poisson random measure $Y$ with mean measure $\mu\times m$. Denote the corresponding centered Poisson random measure by $\tilde Y$.

For $A \in \mathcal{U}$ and $t \ge 0$, we write $Y(A, t)$ instead of $Y(A\times[0,t])$. Similarly, we write $\tilde Y(A, t)$ instead of $\tilde Y(A\times[0,t])$.

Proposition 9.18 For each $A \in \mathcal{A}$, $Y(A, \cdot)$ is a Poisson process with intensity $\mu(A)$. In particular, $\tilde Y(A, \cdot)$ is a martingale.

Proof. Fix $A \in \mathcal{A}$. Clearly, the process $Y(A, \cdot)$ satisfies the following properties almost surely: (i) $Y(A, 0) = 0$; (ii) $Y(A, t) \sim \mathrm{Poisson}(\mu(A)t)$; (iii) $Y(A, \cdot)$ has cadlag nondecreasing sample paths with jumps of size one. Hence, to conclude that $Y(A, \cdot)$ is a Poisson process, it is enough to check that $Y(A, t_1) - Y(A, t_0), \dots, Y(A, t_n) - Y(A, t_{n-1})$ are independent random variables whenever $0 = t_0 < \dots < t_n$. But
$$Y(A, t_i) - Y(A, t_{i-1}) = Y(A\times(t_{i-1}, t_i])$$
for every $i \in \{1, \dots, n\}$, and the sets $A\times(t_0, t_1], \dots, A\times(t_{n-1}, t_n]$ are disjoint in $U\times[0,\infty)$. Consequently, the random variables are independent, and hence $Y(A, \cdot)$ is a Poisson process with intensity $\mu(A)$. $\square$

Proposition 9.19 If $A_1, A_2, \dots \in \mathcal{A}$ are disjoint sets, then the processes $Y(A_1, \cdot), Y(A_2, \cdot), \dots$ are independent.

Proof. Fix $n \ge 1$ and let $0 = t_0 < \dots < t_m$. Note that, for $i = 1, \dots, n$ and $j = 1, \dots, m$, the random variables $Y(A_i, t_j) - Y(A_i, t_{j-1})$ are independent because the sets $A_i\times(t_{j-1}, t_j]$ are disjoint, and the independence of $Y(A_1, \cdot), Y(A_2, \cdot), \dots$ follows. $\square$

For each $t \ge 0$, define the $\sigma$-algebra
$$\mathcal{F}^Y_t = \sigma(Y(A, s) : A \in \mathcal{A},\ s \in [0, t]).$$
By definition, $Y(A, \cdot)$ is an $\{\mathcal{F}^Y_t\}$-adapted process for all $A \in \mathcal{A}$. In addition, by Proposition 9.19, for all $A \in \mathcal{A}$ and $s, t \ge 0$, $Y(A, t+s) - Y(A, t)$ is independent of $\mathcal{F}^Y_t$. This independence will play a central role in the definition of the stochastic integral with respect to $Y$. More generally, we will say that $Y$ is compatible with a filtration $\{\mathcal{F}_t\}$ if $Y$ is adapted to $\{\mathcal{F}_t\}$ and $Y(A, t+s) - Y(A, t)$ is independent of $\mathcal{F}_t$ for all $A \in \mathcal{U}$ and $s, t \ge 0$.
9.8 Stochastic integrals for time-dependent Poisson random measures

Let $Y$ be as in Section 9.7, and assume that $Y$ is compatible with the filtration $\{\mathcal{F}_t\}$. For $\psi \in L^{1,0}(\mu)$, define
$$Y(\psi, t) \equiv \int_{U\times[0,t]}\psi(u)\,Y(du\times ds).$$
Then $Y(\psi, \cdot)$ is a process with independent increments and, in particular, is an $\{\mathcal{F}_t\}$-semimartingale. Suppose $\eta_1, \dots, \eta_m$ are cadlag, $\{\mathcal{F}_t\}$-adapted processes and that $\psi_1, \dots, \psi_m \in L^{1,0}(\mu)$. Then
$$Z(u, t) = \sum_{k=1}^m \eta_k(t)\psi_k(u) \tag{9.4}$$
is a cadlag, $L^{1,0}(\mu)$-valued process, and we define
$$I_Z(t) = \int_{U\times[0,t]} Z(u, s-)\,Y(du\times ds) \equiv \sum_{k=1}^m\int_0^t \eta_k(s-)\,dY(\psi_k, s). \tag{9.5}$$

Lemma 9.20 Let $Y = \sum_i \delta_{(U_i, S_i)}$. Then for $Z$ given by (9.4) and $I_Z$ by (9.5), with probability one,
$$I_Z(t) = \sum_{k=1}^m\sum_i 1_{[0,t]}(S_i)\eta_k(S_i-)\psi_k(U_i) = \sum_i 1_{[0,t]}(S_i)Z(U_i, S_i-),$$
and hence
$$|I_Z(t)| \le I_{|Z|}(t). \tag{9.6}$$

Proof. Approximate $\psi_k$ by $\psi_k^\epsilon = \psi_k1_{\{|\psi_k|\ge\epsilon\}}$, $\epsilon > 0$. Then $Y(\{u : |\psi_k(u)| \ge \epsilon\}\times[0,t]) < \infty$ a.s., and with $\psi_k$ replaced by $\psi_k^\epsilon$, the lemma follows easily. Letting $\epsilon \to 0$ gives the desired result. $\square$
Lemma 9.21 Let $Z$ be given by (9.4) and $I_Z$ by (9.5). If $E[\int_{U\times[0,t]}|Z(u,s)|\,\mu(du)\,ds] < \infty$, then
$$E[I_Z(t)] = \int_{U\times[0,t]} E[Z(u,s)]\,\mu(du)\,ds$$
and
$$E[|I_Z(t)|] \le \int_{U\times[0,t]} E[|Z(u,s)|]\,\mu(du)\,ds.$$

Proof. The identity follows from the martingale properties of the $Y(\psi_k, \cdot)$, and the inequality then follows by (9.6). $\square$
With reference to Proposition 9.11, we have the following lemma.

Lemma 9.22 Let $Z$ be given by (9.4) and $I_Z$ by (9.5). Then for each stopping time $\tau$,
$$P\{\sup_{s\le t}|I_Z(s)| \ge b\} \le P\{\tau \le t\} + \frac{1}{b}E\Big[\int_{U\times[0,t\wedge\tau]}|Z(u,s)| \wedge a\,\mu(du)\,ds\Big] + 1 - E\Big[\exp\Big\{-a^{-1}\int_{U\times[0,t\wedge\tau]}|Z(u,s)| \wedge a\,\mu(du)\,ds\Big\}\Big].$$

Proof. First, note that
$$P\{\sup_{s\le t}|I_Z(s)| \ge b\} \le P\{\tau \le t\} + P\{\sup_{s\le t\wedge\tau}|I_Z(s)| \ge b\}.$$
By (9.6),
$$P\{\sup_{s\le t\wedge\tau}|I_Z(s)| \ge b\} \le P\{I_{|Z|}(t\wedge\tau) \ge b\} \le P\{I_{|Z|1_{\{|Z|\le a\}}}(t\wedge\tau) \ge b\} + P\{I_{|Z|1_{\{|Z|>a\}}}(t\wedge\tau) > 0\}.$$
But, by the Markov inequality,
$$P\{I_{|Z|1_{\{|Z|\le a\}}}(t\wedge\tau) \ge b\} \le \frac{1}{b}\int_{U\times[0,t]} E\big[|Z(u,s)|1_{\{|Z(u,s)|\le a,\,s\le\tau\}}\big]\,\mu(du)\,ds \le \frac{1}{b}E\Big[\int_{U\times[0,t\wedge\tau]}|Z(u,s)| \wedge a\,\mu(du)\,ds\Big].$$
On the other hand,
$$P\{I_{|Z|1_{\{|Z|>a\}}}(t\wedge\tau) > 0\} = P\{Y(\{(u,s) : |Z(u,s)| > a,\ s \le t\wedge\tau\}) > 0\} = E\Big[1 - \exp\Big\{-\int_0^{t\wedge\tau}\mu\{u : |Z(u,s)| > a\}\,ds\Big\}\Big],$$
giving the desired result. $\square$

Lemma 9.22 gives the estimates necessary to extend the integral to cadlag and adapted, $L^{1,0}(\mu)$-valued processes.
Theorem 9.23 If $Z$ is a cadlag, adapted, $L^{1,0}(\mu)$-valued process, then there exist $Z_n$ of the form (9.4) such that $\sup_{t\le T}\int_U |Z(u,t) - Z_n(u,t)| \wedge 1\,\mu(du) \to 0$ in probability for each $T > 0$, and there exists an adapted, cadlag process $I_Z$ such that $\sup_{t\le T}|I_Z(t) - I_{Z_n}(t)| \to 0$ in probability.

Remark 9.24 We define
$$\int_{U\times[0,t]} Z(u, s-)\,Y(du\times ds) = I_Z(t).$$
The estimate in Lemma 9.22 ensures that the integral is well defined.
Now consider $\tilde Y$. For $\psi \in L_\varphi(\mu)$, $\tilde Y(\psi, t) = \int_{U\times[0,t]}\psi(u)\,\tilde Y(du\times ds)$ is a martingale. For $Z$ given by (9.4), but with $\psi_k \in L_\varphi(\mu)$, define
$$\tilde I_Z(t) = \int_{U\times[0,t]} Z(u, s-)\,\tilde Y(du\times ds) \equiv \sum_{k=1}^m\int_0^t \eta_k(s-)\,d\tilde Y(\psi_k, s). \tag{9.7}$$
Then $\tilde I_Z$ is a local martingale with
$$[\tilde I_Z]_t = \int_{U\times[0,t]} Z^2(u, s-)\,Y(du\times ds).$$
Note that if $Z$ has values in $L_\varphi(\mu)$, then $Z^2$ has values in $L^{1,0}(\mu)$.
Lemma 9.25 If $E[\int_0^t\int_U Z^2(u,s)\,\mu(du)\,ds] < \infty$, then $\tilde I_Z$ is a square integrable martingale with
$$E[\tilde I_Z^2(t)] = E\Big[\int_{U\times[0,t]} Z^2(u, s-)\,Y(du\times ds)\Big] = E\Big[\int_0^t\int_U Z^2(u,s)\,\mu(du)\,ds\Big].$$
Lemma 9.26 Let $Z$ be given by (9.4) and $\tilde I_Z$ by (9.7). Then for each stopping time $\tau$,
$$P\{\sup_{s\le t}|\tilde I_Z(s)| \ge a\} \le P\{\tau \le t\} + \Big(\frac{16}{a^2} \vee \frac{4}{a}\Big)E\Big[\int_0^{t\wedge\tau}\int_U \varphi(|Z(u,s)|)\,\mu(du)\,ds\Big].$$

Proof. As before,
$$P\{\sup_{s\le t}|\tilde I_Z(s)| \ge a\} \le P\{\tau \le t\} + P\{\sup_{s\le t\wedge\tau}|\tilde I_Z(s)| \ge a\}.$$
Then
$$P\{\sup_{s\le t\wedge\tau}|\tilde I_Z(s)| \ge a\} \le P\{\sup_{s\le t\wedge\tau}|\tilde I_{Z1_{\{|Z|\le 1\}}}(s)| \ge 2^{-1}a\} + P\{\sup_{s\le t\wedge\tau}|\tilde I_{Z1_{\{|Z|>1\}}}(s)| \ge 2^{-1}a\}$$
$$\le \frac{16}{a^2}E\Big[\int_0^{t\wedge\tau}\int_U |Z(u,s)|^21_{\{|Z(u,s)|\le 1\}}\,\mu(du)\,ds\Big] + \frac{4}{a}E\Big[\int_0^{t\wedge\tau}\int_U |Z(u,s)|1_{\{|Z(u,s)|>1\}}\,\mu(du)\,ds\Big]$$
$$\le \Big(\frac{16}{a^2} \vee \frac{4}{a}\Big)E\Big[\int_0^{t\wedge\tau}\int_U \varphi(|Z(u,s)|)\,\mu(du)\,ds\Big]. \qquad\square$$
Remark 9.27 Lemma 9.26 gives the estimates needed to extend the definition of
$$\int_{U\times[0,t]} Z(u, s-)\,\tilde Y(du\times ds)$$
to all cadlag and adapted, $L_\varphi(\mu)$-valued processes $Z$.

Lemma 9.28 If $Z$ is cadlag and adapted with values in $L^1(\mu)$, then
$$\int_{U\times[0,t]} Z(u, s-)\,\tilde Y(du\times ds) = \int_{U\times[0,t]} Z(u, s-)\,Y(du\times ds) - \int_0^t\int_U Z(u,s)\,\mu(du)\,ds.$$

Lemma 9.29 If $E[\int_0^t\int_U Z^2(u,s)\,\mu(du)\,ds] < \infty$, then $\tilde I_Z$ is a square integrable martingale with
$$E[\tilde I_Z^2(t)] = E\Big[\int_{U\times[0,t]} Z^2(u, s-)\,Y(du\times ds)\Big] = E\Big[\int_0^t\int_U Z^2(u,s)\,\mu(du)\,ds\Big].$$
10 Limit theorems.

10.1 Martingale CLT.

Definition 10.1 $f : D_{\mathbb{R}}[0,\infty) \to \mathbb{R}$ is continuous in the compact uniform topology if
$$\sup_{t\le T}|x_n(t) - x(t)| \to 0$$
for every $T > 0$ implies $f(x_n) \to f(x)$.

Definition 10.2 A sequence of cadlag stochastic processes $\{Z_n\}$ converges in distribution to a continuous stochastic process $Z$ (denoted $Z_n \Rightarrow Z$) if
$$E[f(Z_n)] \to E[f(Z)]$$
for every bounded $f$ that is continuous in the compact uniform topology.

Example 10.3 Consider $g \in C_b(\mathbb{R})$, $h \in C(\mathbb{R}^d)$ and $x : [0,\infty) \to \mathbb{R}^d$, where $C_b(\mathbb{R})$ is the space of all bounded continuous functions on $\mathbb{R}$. Define
$$F(x) = g\big(\sup_{s\le 27} h(x(s))\big).$$
Then $F$ is continuous in the compact uniform topology. Note that if $x_n \to x$ in the compact uniform topology, then $h\circ x_n \to h\circ x$ in the compact uniform topology.

Example 10.4 In the notation of the last example,
$$G(x) = g\Big(\int_0^{27} h(x(s))\,ds\Big)$$
is also continuous in the compact uniform topology.
Theorem 10.5 Let $\{M_n\}$ be a sequence of martingales. Suppose that
$$\lim_{n\to\infty} E[\sup_{s\le t}|M_n(s) - M_n(s-)|] = 0 \tag{10.1}$$
and
$$[M_n]_t \to c(t) \tag{10.2}$$
in probability for each $t > 0$, where $c(t)$ is continuous and deterministic. Then $M_n \Rightarrow M = W\circ c$.

Remark 10.6 If
$$\lim_{n\to\infty} E[|[M_n]_t - c(t)|] = 0, \qquad t \ge 0, \tag{10.3}$$
then, by the continuity of $c$, both (10.1) and (10.2) hold. If (10.2) holds and $\lim_{n\to\infty}E[[M_n]_t] = c(t)$ for each $t \ge 0$, then (10.3) holds by the dominated convergence theorem.

Proof. See Ethier and Kurtz (1986), Theorem 7.1.4. $\square$

Example 10.7 If $M_n \Rightarrow W\circ c$, then
$$P\{\sup_{s\le t}M_n(s) \le x\} \to P\{\sup_{s\le t}W(c(s)) \le x\} = P\{\sup_{u\le c(t)}W(u) \le x\}.$$
Corollary 10.8 (Donsker's invariance principle.) Let $\xi_k$ be iid with mean zero and variance $\sigma^2$. Let
$$M_n(t) = \frac{1}{\sqrt{n}}\sum_{k=1}^{[nt]}\xi_k.$$
Then $M_n$ is a martingale for every $n$, and $M_n \Rightarrow \sigma W$.

Proof. Since $M_n$ is a finite variation process, we have
$$[M_n]_t = \sum_{s\le t}(\Delta M_n(s))^2 = \frac{1}{n}\sum_{k=1}^{[nt]}\xi_k^2 = \frac{[nt]}{n}\cdot\frac{1}{[nt]}\sum_{k=1}^{[nt]}\xi_k^2 \to t\sigma^2,$$
where the limit holds by the law of large numbers. Consequently, (10.2) is satisfied. Note that the convergence is in $L^1$, so by Remark 10.6, (10.1) holds as well. Theorem 10.5 gives $M_n \Rightarrow W(\sigma^2\cdot)$. $\square$
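Donsker's invariance principle is easy to see in simulation. The following minimal sketch is our own illustration (not from the notes), using $\pm1$ steps so that $\sigma^2 = 1$; it checks that $M_n(1)$ has mean near $0$ and variance near $\sigma^2 t = 1$.

```python
import numpy as np

rng = np.random.default_rng(3)

# M_n(t) = n^{-1/2} sum_{k <= [nt]} xi_k with xi_k = +/-1 (mean 0, variance 1).
n, paths = 400, 5000
xi = rng.choice([-1.0, 1.0], size=(paths, n))
M = xi.cumsum(axis=1) / np.sqrt(n)      # M_n(k/n) for k = 1..n, one row per path

# At t = 1 the martingale CLT gives M_n(1) approximately N(0, sigma^2 t) = N(0, 1).
print(M[:, -1].mean(), M[:, -1].var())
```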
Corollary 10.9 (CLT for renewal processes.) Let $\xi_k$ be iid, positive, with mean $m$ and variance $\sigma^2$. Let
$$N(t) = \max\Big\{k : \sum_{i=1}^k \xi_i \le t\Big\}.$$
Then
$$Z_n(t) \equiv \frac{N(nt) - nt/m}{\sqrt{n}} \Rightarrow W\Big(\frac{t\sigma^2}{m^3}\Big).$$

Proof. The renewal theorem states that
$$E\Big[\Big|\frac{N(t)}{t} - \frac{1}{m}\Big|\Big] \to 0 \qquad\text{and}\qquad \frac{N(t)}{t} \to \frac{1}{m} \quad\text{a.s.}$$
Let $S_k = \sum_{i=1}^k \xi_i$, $M(k) = S_k - mk$, and $\mathcal{F}_k = \sigma(\xi_1, \dots, \xi_k)$. Then $M$ is an $\{\mathcal{F}_k\}$-martingale and $N(t)+1$ is an $\{\mathcal{F}_k\}$-stopping time. By the optional sampling theorem, $M(N(t)+1)$ is a martingale with respect to the filtration $\{\mathcal{F}_{N(t)+1}\}$.

Note that
$$M_n(t) \equiv -\frac{M(N(nt)+1)}{m\sqrt{n}} = \frac{N(nt)+1}{\sqrt{n}} - \frac{S_{N(nt)+1} - nt}{m\sqrt{n}} - \frac{nt}{m\sqrt{n}} = \frac{N(nt) - nt/m}{\sqrt{n}} + \frac{1}{\sqrt{n}} - \frac{1}{m\sqrt{n}}\big(S_{N(nt)+1} - nt\big).$$
So asymptotically $Z_n$ behaves like $M_n$, which is a martingale for each $n$. Now, for $M_n$, we have
$$\sup_{s\le t}|M_n(s) - M_n(s-)| = \max_{k\le N(nt)+1}|\xi_k - m|/(m\sqrt{n})$$
and
$$[M_n]_t = \frac{1}{m^2n}\sum_{k=1}^{N(nt)+1}|\xi_k - m|^2 \to \frac{t\sigma^2}{m^3}.$$
Since
$$E[[M_n]_t] = \frac{\sigma^2}{m^2n}E[N(nt)+1] \to \frac{t\sigma^2}{m^3},$$
Remark 10.6 applies and $Z_n \Rightarrow W(t\sigma^2/m^3)$. $\square$
Corollary 10.10 Let $N(t)$ be a Poisson process with parameter $\lambda$ and
$$X(t) = \int_0^t (-1)^{N(s)}\,ds.$$
Define
$$X_n(t) = \frac{X(nt)}{\sqrt{n}}.$$
Then $X_n \Rightarrow \lambda^{-1/2}W$.

Proof. Note that
$$(-1)^{N(t)} = 1 - 2\int_0^t (-1)^{N(s-)}\,dN(s) = 1 - 2M(t) - 2\lambda\int_0^t (-1)^{N(s)}\,ds,$$
where
$$M(t) = \int_0^t (-1)^{N(s-)}\,d(N(s) - \lambda s)$$
is a martingale. Thus
$$X_n(t) = \frac{X(nt)}{\sqrt{n}} = \frac{1 - (-1)^{N(nt)}}{2\lambda\sqrt{n}} - \frac{M(nt)}{\lambda\sqrt{n}}.$$
One may apply the martingale CLT by observing that $[M_n]_t = N(nt)/(n\lambda^2) \to t/\lambda$ for $M_n(t) = M(nt)/(\lambda\sqrt{n})$, and that the jumps of $M_n$ are of magnitude $1/(\lambda\sqrt{n})$. $\square$
The martingale central limit theorem has a vector analogue.

Theorem 10.11 (Multidimensional Martingale CLT). Let $\{M_n\}$ be a sequence of $\mathbb{R}^d$-valued martingales. Suppose
$$\lim_{n\to\infty}E[\sup_{s\le t}|M_n(s) - M_n(s-)|] = 0 \tag{10.4}$$
and
$$[M_n^i, M_n^j]_t \to c_{i,j}(t) \tag{10.5}$$
for all $t \ge 0$, where $C = ((c_{i,j}))$ is deterministic and continuous. Then $M_n \Rightarrow M$, where $M$ is Gaussian with independent increments and $E[M(t)M(t)^T] = C(t)$.

Remark 10.12 Note that $C(t) - C(s)$ is nonnegative definite for $t \ge s \ge 0$. If $C$ is differentiable, then the derivative will also be nonnegative definite and will, hence, have a nonnegative definite square root. Suppose $\dot C(t) = \sigma(t)^2$ where $\sigma$ is symmetric. Then $M$ can be written as
$$M(t) = \int_0^t \sigma(s)\,dW(s),$$
where $W$ is $d$-dimensional standard Brownian motion.
10.2 Sequences of stochastic differential equations.

Let $\xi_k$ be iid with mean zero and variance $\sigma^2$. Suppose $X_0$ is independent of the $\xi_k$ and
$$X_{k+1} = X_k + \sigma(X_k)\frac{\xi_{k+1}}{\sqrt{n}} + \frac{b(X_k)}{n}.$$
Define $X_n(t) = X_{[nt]}$, $W_n(t) = \frac{1}{\sqrt{n}}\sum_{k=1}^{[nt]}\xi_k$, and $V_n(t) = \frac{[nt]}{n}$. Then
$$X_n(t) = X_n(0) + \int_0^t \sigma(X_n(s-))\,dW_n(s) + \int_0^t b(X_n(s-))\,dV_n(s).$$
By Donsker's theorem, $(W_n, V_n) \Rightarrow (W, V)$, with $V(t) = t$.

More generally, we have the following equation:
$$X_n(t) = X(0) + \epsilon_n(t) + \int_0^t \sigma(X_n(s-))\,dW_n(s) + \int_0^t b(X_n(s-))\,dV_n(s). \tag{10.6}$$

Theorem 10.13 Suppose in (10.6) that $W_n$ is a martingale and $V_n$ is a finite variation process. Assume that for each $t \ge 0$, $\sup_n E[[W_n]_t] < \infty$ and $\sup_n E[T_t(V_n)] < \infty$, and that $(W_n, V_n, \epsilon_n) \Rightarrow (W, V, 0)$, where $W$ is standard Brownian motion and $V(t) = t$. Suppose that $X$ satisfies
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds \tag{10.7}$$
and that the solution of (10.7) is unique. Then $X_n \Rightarrow X$.

Proof. See Kurtz and Protter (1991). $\square$
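The recursion at the start of this section is the Euler scheme for (10.7). The following sketch is our own illustration (the Ornstein–Uhlenbeck choice $\sigma \equiv 1$, $b(x) = -x$ is an assumption made for the example): it runs the discrete recursion with variance-one $\xi_k$ and checks that $X_n(1)$ has approximately the variance $(1 - e^{-2})/2 \approx 0.432$ of the limiting solution of $dX = dW - X\,dt$, $X(0) = 0$.

```python
import numpy as np

rng = np.random.default_rng(4)

def euler_chain(sigma, b, x0, n, xi):
    """The recursion X_{k+1} = X_k + sigma(X_k) xi_{k+1}/sqrt(n) + b(X_k)/n;
    X_n(t) = X_{[nt]} approximates the SDE dX = sigma(X)dW + b(X)dt."""
    x = np.full(xi.shape[0], x0, dtype=float)
    for k in range(n):
        x = x + sigma(x) * xi[:, k] / np.sqrt(n) + b(x) / n
    return x

n, paths = 200, 20000
xi = rng.standard_normal((paths, n))            # iid, mean 0, variance 1
x1 = euler_chain(lambda x: 1.0, lambda x: -x, 0.0, n, xi)

# For dX = dW - X dt with X(0) = 0: Var X(1) = (1 - e^{-2})/2 ~ 0.432
print(x1.var())
```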
10.3 Approximation of empirical CDF.

Let $\xi_i$ be i.i.d. and uniform on $[0, 1]$, and let $N_n(t) = \sum_{k=1}^n 1_{[\xi_k,1]}(t)$, where $0 \le t \le 1$. Define $\mathcal{F}^n_t = \sigma(N_n(u) : u \le t)$. For $t \le s \le 1$, we have
$$E[N_n(s)|\mathcal{F}^n_t] = E[N_n(t) + N_n(s) - N_n(t)|\mathcal{F}^n_t] = N_n(t) + E[N_n(s) - N_n(t)|\mathcal{F}^n_t] = N_n(t) + (n - N_n(t))\frac{s-t}{1-t}.$$
It follows that
$$\tilde M_n(t) = N_n(t) - \int_0^t \frac{n - N_n(s)}{1-s}\,ds$$
is a martingale.

Define $F_n(t) \equiv \frac{N_n(t)}{n}$ and $B_n(t) = \sqrt{n}(F_n(t) - t) = \frac{N_n(t) - nt}{\sqrt{n}}$. Then
$$B_n(t) = \frac{1}{\sqrt{n}}(N_n(t) - nt) = \frac{1}{\sqrt{n}}\Big(\tilde M_n(t) + nt - \sqrt{n}\int_0^t \frac{B_n(s)}{1-s}\,ds - nt\Big) = M_n(t) - \int_0^t \frac{B_n(s)}{1-s}\,ds,$$
where $M_n(t) = \frac{\tilde M_n(t)}{\sqrt{n}}$. Note that $[M_n]_t = F_n(t)$ and, by the law of large numbers, $[M_n]_t \to t$. Since $F_n(t) \le 1$, the convergence is in $L^1$, and Theorem 10.5 implies $M_n \Rightarrow W$. Therefore, $B_n \Rightarrow B$, where
$$B(t) = W(t) - \int_0^t \frac{B(s)}{1-s}\,ds,$$
at least if we restrict our attention to $[0, 1-\epsilon]$ for some $\epsilon > 0$. To see that the convergence is on the full interval $[0, 1]$, observe that
$$E\Big[\int_{1-\epsilon}^1 \frac{|B_n(s)|}{1-s}\,ds\Big] = \int_{1-\epsilon}^1 \frac{E[|B_n(s)|]}{1-s}\,ds \le \int_{1-\epsilon}^1 \frac{\sqrt{E[B_n^2(s)]}}{1-s}\,ds \le \int_{1-\epsilon}^1 \frac{\sqrt{s - s^2}}{1-s}\,ds,$$
which is finite and tends to zero as $\epsilon \to 0$. It follows that for any $\delta > 0$, $\sup_n P\{\sup_{1-\epsilon\le s\le 1}|B_n(1) - B_n(s)| \ge \delta\} \to 0$ as $\epsilon \to 0$. This uniform estimate ensures that $B_n \Rightarrow B$ on the full interval $[0, 1]$. The process $B$ is known as Brownian bridge.
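The Brownian bridge limit can be illustrated numerically. The sketch below is ours: it samples $B_n(t) = \sqrt{n}(F_n(t) - t)$ for many independent empirical CDFs and checks that the mean is near $0$ and the variance near $t(1-t)$, the marginal variance of the Brownian bridge.

```python
import numpy as np

rng = np.random.default_rng(5)

# B_n(t) = sqrt(n)(F_n(t) - t), F_n the empirical CDF of n uniforms on [0, 1].
n, paths, t = 500, 20000, 0.3
U = rng.uniform(size=(paths, n))
Bn_t = np.sqrt(n) * ((U <= t).mean(axis=1) - t)

# The Brownian bridge B has E B(t) = 0 and Var B(t) = t(1 - t) = 0.21 here.
print(Bn_t.mean(), Bn_t.var())
```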
10.4 Diffusion approximations for Markov chains.

Let $X$ be an integer-valued process and write
$$X(t) = X(0) + \sum_{l\in\mathbb{Z}} lN_l(t),$$
where the $N_l$ are counting processes; that is, $N_l(t)$ counts the number of jumps of $X$ of size $l$ at or before time $t$. Assume that $X$ is Markov with respect to $\{\mathcal{F}_t\}$, and suppose that $P(X(t+h) = j\,|\,X(t) = i) = q_{ij}h + o(h)$ for $i \ne j$. If we define $\beta_l(x) = q_{x,x+l}$, then
$$E[N_l(t+h) - N_l(t)|\mathcal{F}_t] = q_{X(t),X(t)+l}h + o(h) = \beta_l(X(t))h + o(h).$$
Our first claim is that $M_l(t) \equiv N_l(t) - \int_0^t \beta_l(X(s))\,ds$ is a martingale (or at least a local martingale). If we define $\tau_l(n) = \inf\{t : N_l(t) = n\}$, then for each $n$, $M_l(\cdot\wedge\tau_l(n))$ is a martingale.

Assume everything is nice; in particular, for each $l$, assume that $M_l(t)$ is an $\{\mathcal{F}_t\}$-martingale. Then
$$X(t) = X(0) + \sum_l lN_l(t) = X(0) + \sum_l lM_l(t) + \sum_l l\int_0^t \beta_l(X(s))\,ds.$$
If $\sum_l |l|\beta_l(x) < \infty$, then we can interchange the sum and the integral. Let $b(x) \equiv \sum_l l\beta_l(x)$, so we have
$$X(t) = X(0) + \sum_l lM_l(t) + \int_0^t b(X(s))\,ds.$$
Note that $[M_l]_t = N_l(t)$ and $[M_l, M_k]_t = [N_l, N_k]_t = 0$ for $l \ne k$. Therefore $E[N_l(t)] = E[\int_0^t \beta_l(X(s))\,ds]$ and $E[[M_l]_t] = E[N_l(t)] = E[(M_l(t))^2]$ holds. Consequently,
$$E\Big[\Big(\sum_{l=k}^m lM_l(t)\Big)^2\Big] = \sum_{l=k}^m l^2E[M_l(t)^2] = E\Big[\int_0^t \sum_{l=k}^m l^2\beta_l(X(s))\,ds\Big],$$
so if
$$E\Big[\int_0^t \sum_l l^2\beta_l(X(s))\,ds\Big] < \infty,$$
then $\sum_l lM_l(t)$ is a square integrable martingale. If we only have $\sum_l l^2\beta_l(x) < \infty$ for each $x$ and $\sum_l N_l(t) < \infty$ a.s., then let $\tau_c = \inf\{t : \sum_l l^2\beta_l(X(t)) \ge c\}$ and assume that $\tau_c \to \infty$ as $c \to \infty$. Then $\sum_l lM_l(t)$ is a local martingale.
Now consider the following sequence: $X_n(t) = X_n(0) + \sum_l l\frac{N^n_l(t)}{n}$, where, for example, we can take $X_n(t) = \frac{X(n^2t)}{n}$ and $N^n_l(t) = N_l(n^2t)$ with $X$ and $N_l$ defined as before. Assume
$$M^n_l(t) \equiv \frac{1}{n}\Big(N^n_l(t) - \int_0^t n^2\beta^n_l(X_n(s))\,ds\Big)$$
is a martingale, so we have $[M^n_l]_t = \frac{N^n_l(t)}{n^2}$ and $E[[M^n_l]_t] = E[\int_0^t \beta^n_l(X_n(s))\,ds]$. For simplicity, we assume that $\sup_n\sup_x \beta^n_l(x) < \infty$ and that only finitely many of the $\beta^n_l$ are nonzero. Define $b_n(x) \equiv n\sum_l l\beta^n_l(x)$. Then we have
$$X_n(t) = X_n(0) + \sum_l lM^n_l(t) + \int_0^t b_n(X_n(s))\,ds.$$
Assume:
1) $X_n(0) \to X(0)$,

2) $\beta^n_l(x) \to \beta_l(x)$,

3) $b_n(x) \to b(x)$,

4) $n^2\inf_x \beta^n_l(x) \to \infty$,

where the convergence in 1)–3) is uniform on bounded intervals. By our assumptions, $E[(\Delta M^n_l(t))^2] \le n^{-2} \to 0$, so by Doob's inequality, $\sup_{s\le t}|M^n_l(s) - M^n_l(s-)| \to 0$. Consequently, $[M^n_l]_t - \int_0^t \beta^n_l(X_n(s))\,ds \to 0$.

Define $W^n_l(t) = \int_0^t \frac{1}{\sqrt{\beta^n_l(X_n(s-))}}\,dM^n_l(s)$. Then
$$[W^n_l]_t = \int_0^t \frac{d[M^n_l]_s}{\beta^n_l(X_n(s-))} = \int_0^t \frac{1}{n\beta^n_l(X_n(s-))}\,dM^n_l(s) + t \equiv u^n_l(t) + t.$$
Note that $u^n_l(t)$ is a martingale, and
$$E[u^n_l(t)^2] = E\Big[\int_0^t \frac{d[M^n_l]_s}{n^2\beta^n_l(X_n(s-))^2}\Big] = E\Big[\int_0^t \frac{\beta^n_l(X_n(s))\,ds}{n^2\beta^n_l(X_n(s))^2}\Big] = E\Big[\int_0^t \frac{ds}{n^2\beta^n_l(X_n(s))}\Big] \to 0.$$
Consequently, under the above assumptions, $[W^n_l]_t \to t$ and hence $W^n_l \Rightarrow W_l$, where $W_l$ is a standard Brownian motion.

By definition, $M^n_l(t) = \int_0^t \sqrt{\beta^n_l(X_n(s-))}\,dW^n_l(s)$, so
$$X_n(t) = X_n(0) + \sum_l l\int_0^t \sqrt{\beta^n_l(X_n(s-))}\,dW^n_l(s) + \int_0^t b_n(X_n(s))\,ds.$$
Let
$$\epsilon_n(t) = X_n(0) - X(0) + \sum_l\int_0^t l\Big(\sqrt{\beta^n_l(X_n(s-))} - \sqrt{\beta_l(X_n(s-))}\Big)\,dW^n_l(s) + \int_0^t \big(b_n(X_n(s)) - b(X_n(s))\big)\,ds,$$
which converges to zero at least until $X_n$ exits a fixed bounded interval. Theorem 10.13 gives the following.

Theorem 10.14 Assume 1)–4) above. Suppose the solution of
$$X(t) = X(0) + \sum_l l\int_0^t \sqrt{\beta_l(X(s))}\,dW_l(s) + \int_0^t b(X(s))\,ds$$
exists and is unique. Then $X_n \Rightarrow X$.
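As a concrete (and our own, not the notes') instance of Theorem 10.14, take $\beta^n_1 = \beta^n_{-1} = 1/2$, a rate-one continuous-time symmetric walk. Then $b_n \equiv 0$, $\sum_l l^2\beta_l = 1$, and the limit is standard Brownian motion, so $X_n(t) = X(n^2t)/n$ should have variance close to $t$.

```python
import numpy as np

rng = np.random.default_rng(6)

# Rate-1 symmetric walk: jumps +/-1, each at rate 1/2; X_n(t) = X(n^2 t)/n.
n, paths, t = 50, 20000, 1.0
n_jumps = rng.poisson(n**2 * t, size=paths)   # total jumps by time n^2 t
up = rng.binomial(n_jumps, 0.5)               # how many of them are +1
Xn_t = (2 * up - n_jumps) / n                 # X(n^2 t)/n

print(Xn_t.var())   # should be near t = 1 for the Brownian limit
```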
10.5 Convergence of stochastic integrals.

Theorem 10.15 Let $Y_n$ be a semimartingale with decomposition $Y_n = M_n + V_n$. Suppose for each $t \ge 0$ that $\sup_{s\le t}|X_n(s) - X(s)| \to 0$ and $\sup_{s\le t}|Y_n(s) - Y(s)| \to 0$ in probability as $n \to \infty$, and that $\sup_n E[M_n(t)^2] = \sup_n E[[M_n]_t] < \infty$ and $\sup_n E[T_t(V_n)] < \infty$. Then for each $T > 0$,
$$\sup_{t\le T}\Big|\int_0^t X_n(s-)\,dY_n(s) - \int_0^t X(s-)\,dY(s)\Big| \to 0$$
in probability.

Proof. See Kurtz and Protter (1991). $\square$
11 Reflecting diffusion processes.

11.1 The M/M/1 Queueing Model.

Arrivals form a Poisson process with parameter $\lambda$, and the service distribution is exponential with parameter $\mu$. Consequently, the length of the queue at time $t$ satisfies
$$Q(t) = Q(0) + Y_a(\lambda t) - Y_d\Big(\mu\int_0^t 1_{\{Q(s)>0\}}\,ds\Big),$$
where $Y_a$ and $Y_d$ are independent unit Poisson processes. Define the busy period $B(t)$ by
$$B(t) \equiv \int_0^t 1_{\{Q(s)>0\}}\,ds.$$
Rescale to get
$$X_n(t) \equiv \frac{Q(nt)}{\sqrt{n}}.$$
Then $X_n(t)$ satisfies
$$X_n(t) = X_n(0) + \frac{1}{\sqrt{n}}Y_a(\lambda_n nt) - \frac{1}{\sqrt{n}}Y_d\Big(\mu_n n\int_0^t 1_{\{X_n(s)>0\}}\,ds\Big).$$
For a unit Poisson process $Y$, define $\tilde Y(u) \equiv Y(u) - u$ and observe that
$$X_n(t) = X_n(0) + \frac{1}{\sqrt{n}}\tilde Y_a(\lambda_n nt) - \frac{1}{\sqrt{n}}\tilde Y_d\Big(\mu_n n\int_0^t 1_{\{X_n(s)>0\}}\,ds\Big) + \sqrt{n}(\lambda_n - \mu_n)t + \sqrt{n}\mu_n\int_0^t 1_{\{X_n(s)=0\}}\,ds.$$
We already know that if $\lambda_n \to \lambda$ and $\mu_n \to \mu$, then
$$W^n_a(t) \equiv \frac{1}{\sqrt{n}}\tilde Y_a(\lambda_n nt) \Rightarrow \sqrt{\lambda}\,W_1(t), \qquad W^n_d(t) \equiv \frac{1}{\sqrt{n}}\tilde Y_d(\mu_n nt) \Rightarrow \sqrt{\mu}\,W_2(t),$$
where $W_1$ and $W_2$ are standard Brownian motions. Defining
$$c_n \equiv \sqrt{n}(\lambda_n - \mu_n), \qquad \Lambda_n(t) \equiv \sqrt{n}\mu_n\big(t - B_n(t)\big),$$
we can rewrite $X_n(t)$ as
$$X_n(t) = X_n(0) + W^n_a(t) - W^n_d(B_n(t)) + c_nt + \Lambda_n(t).$$
Noting that $\Lambda_n$ is nondecreasing and increases only when $X_n$ is zero, we see that $(X_n, \Lambda_n)$ is the solution of the Skorohod problem corresponding to $X_n(0) + W^n_a(t) - W^n_d(B_n(t)) + c_nt$, that is, of the following:
Lemma 11.1 For $w \in D_{\mathbb{R}}[0,\infty)$ with $w(0) \ge 0$, there exists a unique pair $(x, \lambda)$ satisfying
$$x(t) = w(t) + \lambda(t) \tag{11.1}$$
such that $\lambda(0) = 0$, $x(t) \ge 0$ for all $t$, and $\lambda$ is nondecreasing and increases only when $x = 0$. The solution is given by setting $\lambda(t) = 0 \vee \sup_{s\le t}(-w(s))$ and defining $x$ by (11.1).

Proof. We leave it to the reader to check that $\lambda(t) = 0 \vee \sup_{s\le t}(-w(s))$ gives a solution. To see that it gives the only solution, note that for $t < \tau_0 = \inf\{s : w(s) \le 0\}$, the requirements on $\lambda$ imply that $\lambda(t) = 0$ and hence $x(t) = w(t)$. For $t \ge \tau_0$, the nonnegativity of $x$ implies
$$\lambda(t) \ge -w(t),$$
and $\lambda$ nondecreasing implies
$$\lambda(t) \ge \sup_{s\le t}(-w(s)).$$
If $t$ is a point of increase of $\lambda$, then $x(t) = 0$, so we must have
$$\lambda(t) = -w(t) \le \sup_{s\le t}(-w(s)). \tag{11.2}$$
Since the right side of (11.2) is nondecreasing, we must have $\lambda(t) \le \sup_{s\le t}(-w(s))$ for all $t > \tau_0$, and the result follows.
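Lemma 11.1 translates directly into code on a discretized path. The following one-dimensional Skorohod map is our own minimal sketch (the function name `skorohod_map` is ours): $\lambda(t) = 0 \vee \sup_{s\le t}(-w(s))$ becomes a running maximum, and $x = w + \lambda$.

```python
import numpy as np

def skorohod_map(w):
    """One-dimensional Skorohod problem of Lemma 11.1 on a discrete path:
    lambda(t) = 0 v sup_{s <= t}(-w(s)),  x = w + lambda."""
    w = np.asarray(w, dtype=float)
    lam = np.maximum(np.maximum.accumulate(-w), 0.0)   # running max of -w, floored at 0
    return w + lam, lam

w = np.array([0.5, -0.2, -1.0, 0.3, -1.5])
x, lam = skorohod_map(w)
print(x, lam)
# x stays nonnegative, and lam increases only at indices where x = 0,
# exactly the two defining properties in the lemma.
```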
Thus in the problem at hand, we see that $\Lambda_n$ is determined by
$$\Lambda_n(t) = 0 \vee \Big(-\inf_{s\le t}\big(X_n(0) + W^n_a(s) - W^n_d(B_n(s)) + c_ns\big)\Big).$$
Consequently, if
$$X_n(0) + W^n_a(t) - W^n_d(B_n(t)) + c_nt$$
converges, so does $\Lambda_n$, and $X_n$ along with it. Assuming that $c_n \to c$ (which forces $\lambda = \mu$), the limit will satisfy
$$X(t) = X(0) + \sqrt{\lambda}\,W_1(t) - \sqrt{\mu}\,W_2(t) + ct + \Lambda(t),$$
$$\Lambda(t) = 0 \vee \sup_{s\le t}\Big(-\big(X(0) + \sqrt{\lambda}\,W_1(s) - \sqrt{\mu}\,W_2(s) + cs\big)\Big).$$
Recalling that $\sqrt{\lambda}(W_1 - W_2)$ has the same distribution as $\sqrt{2\lambda}\,W$, where $W$ is standard Brownian motion, the limiting equation can be simplified to
$$X(t) = X(0) + \sqrt{2\lambda}\,W(t) + ct + \Lambda(t), \qquad X(t) \ge 0\ \ \forall t,$$
where $\Lambda$ is nondecreasing and increases only when $X(t) = 0$.
11.2 The G/G/1 queueing model.

Let $\xi_1, \xi_2, \dots$ be i.i.d. with $\xi_i > 0$ and
$$\lambda = \frac{1}{E[\xi_i]}.$$
The $\xi_i$ represent interarrival times. The service times are denoted $\eta_i$, which are also i.i.d. and positive with
$$\mu = \frac{1}{E[\eta_i]}.$$
The arrival process is
$$A(t) = \max\Big\{k : \sum_{i=1}^k \xi_i \le t\Big\},$$
and the service process is
$$S(t) = \max\Big\{k : \sum_{i=1}^k \eta_i \le t\Big\}.$$
The queue-length then satisfies
$$Q(t) = Q(0) + A(t) - S\Big(\int_0^t 1_{\{Q(s)>0\}}\,ds\Big).$$
Following the approach taken with the M/M/1 queue, we can express the G/G/1 queue as the solution of a Skorohod problem and use the functional central limit theorem for renewal processes to obtain the diffusion limit.
11.3 Multidimensional Skorohod problem.

We now consider a multidimensional analogue of the problem presented in Lemma 11.1. Let $D$ be convex and let $\eta(x)$ denote the unit inner normal at $x \in \partial D$. Suppose $w$ satisfies $w(0) \in D$. Consider the equation for $(x, \lambda)$:
$$x(t) = w(t) + \int_0^t \eta(x(s))\,d\lambda(s), \qquad x(t) \in \bar D \quad \forall t \ge 0,$$
where $\lambda$ is nondecreasing and increases only when $x(t) \in \partial D$.

Proof of uniqueness. Let
$$x_i(t) = w(t) + \int_0^t \eta(x_i(s))\,d\lambda_i(s).$$
Assume continuity for now. Since $\lambda_i$ is nondecreasing, it is of finite variation. Ito's formula yields
$$|x_1(t) - x_2(t)|^2 = \int_0^t 2(x_1(s) - x_2(s))\cdot d(x_1(s) - x_2(s)) = \int_0^t 2(x_1(s) - x_2(s))\cdot\eta(x_1(s))\,d\lambda_1(s) - \int_0^t 2(x_1(s) - x_2(s))\cdot\eta(x_2(s))\,d\lambda_2(s) \le 0,$$
where the inequality follows from the fact that $\lambda_i$ increases only when $x_i$ is on the boundary, and convexity implies that for any $x \in \partial D$ and $y \in \bar D$, $\eta(x)\cdot(y - x) \ge 0$. Consequently, uniqueness follows.

If there are discontinuities, then
$$|x_1(t) - x_2(t)|^2 = \int_0^t 2(x_1(s-) - x_2(s-))\cdot\eta(x_1(s))\,d\lambda_1(s) - \int_0^t 2(x_1(s-) - x_2(s-))\cdot\eta(x_2(s))\,d\lambda_2(s) + [x_1 - x_2]_t$$
$$= \int_0^t 2(x_1(s) - x_2(s))\cdot\eta(x_1(s))\,d\lambda_1(s) - \int_0^t 2(x_1(s) - x_2(s))\cdot\eta(x_2(s))\,d\lambda_2(s) - 2\sum_{s\le t}\Delta(x_1 - x_2)(s)\cdot\big(\eta(x_1(s))\Delta\lambda_1(s) - \eta(x_2(s))\Delta\lambda_2(s)\big) + [x_1 - x_2]_t$$
$$= \int_0^t 2(x_1(s) - x_2(s))\cdot\eta(x_1(s))\,d\lambda_1(s) - \int_0^t 2(x_1(s) - x_2(s))\cdot\eta(x_2(s))\,d\lambda_2(s) - [x_1 - x_2]_t \le 0,$$
so the solution is unique.
Let $W$ be standard Brownian motion, and let $(X, \lambda)$ satisfy
$$X(t) = W(t) + \int_0^t \eta(X(s))\,d\lambda(s), \qquad X(t) \in \bar{D}, \ t \ge 0,$$
with $\lambda$ nondecreasing and increasing only when $X \in \partial D$. Itô's formula yields
$$f(X(t)) = f(X(0)) + \int_0^t \nabla f(X(s))\cdot dW(s) + \int_0^t \frac{1}{2}\Delta f(X(s))\,ds + \int_0^t \eta(X(s))\cdot\nabla f(X(s))\,d\lambda(s).$$
Assume $\eta(x)\cdot\nabla f(x) = 0$ for $x \in \partial D$. If we solve
$$\frac{\partial u}{\partial t} = \frac{1}{2}\Delta u$$
subject to the Neumann boundary conditions
$$\eta(x)\cdot\nabla u(x, t) = 0, \quad x \in \partial D, \qquad u(x, 0) = f(x),$$
we see that $u(r - t, X(t))$ is a martingale for $0 \le t \le r$, and hence
$$E[f(X(t, x))] = u(t, x).$$
Similarly, we can consider more general diffusions with normal reflection, corresponding to the equation
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds + \int_0^t \eta(X(s))\,d\lambda(s)$$
subject to the conditions that $X(t) \in \bar{D}$ for all $t \ge 0$, and $\lambda$ is nondecreasing and increases only when $X(t) \in \partial D$. To examine uniqueness, apply Itô's formula to get
$$|X_1(t) - X_2(t)|^2 = \int_0^t 2(X_1(s) - X_2(s))^T(\sigma(X_1(s)) - \sigma(X_2(s)))\,dW(s) \tag{11.3}$$
$$+ \int_0^t 2(X_1(s) - X_2(s))\cdot(b(X_1(s)) - b(X_2(s)))\,ds$$
$$+ \int_0^t \mathrm{trace}\bigl((\sigma(X_1(s)) - \sigma(X_2(s)))(\sigma(X_1(s)) - \sigma(X_2(s)))^T\bigr)\,ds$$
$$+ \int_0^t 2(X_1(s) - X_2(s))\cdot\eta(X_1(s))\,d\lambda_1(s) - \int_0^t 2(X_1(s) - X_2(s))\cdot\eta(X_2(s))\,d\lambda_2(s).$$
The last two terms are nonpositive as before, and assuming that $\sigma$ and $b$ are Lipschitz, we can use Gronwall's inequality to obtain uniqueness just as in the case of unbounded domains. Existence can be proved by an iteration beginning with
$$X_0(t) \equiv X(0),$$
and then letting
$$X_{n+1}(t) = X(0) + \int_0^t \sigma(X_n(s))\,dW(s) + \int_0^t b(X_n(s))\,ds + \int_0^t \eta(X_n(s))\,d\lambda_{n+1}(s).$$
An analysis similar to (11.3) enables one to show that the sequence $X_n$ is Cauchy.
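A crude but common numerical scheme for such reflected diffusions (distinct from the iteration above, and offered here only as an illustrative sketch) is an Euler step followed by projection back onto the convex domain; the projection direction plays the role of the inner-normal term $\eta(X)\,d\lambda$. The domain (the closed unit disc), coefficients, and step counts below are all illustrative assumptions:

```python
import numpy as np

def reflected_euler(x0, sigma, b, T, n, rng):
    """Euler scheme for a normally reflected diffusion in the closed
    unit disc: after each Euler step, project back onto the disc.
    Projection to the nearest point of a convex set moves along the
    inner normal, mimicking the eta(X) d-lambda term."""
    dt = T / n
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n):
        dw = rng.normal(0.0, np.sqrt(dt), 2)
        x = x + sigma(x) @ dw + b(x) * dt
        r = np.linalg.norm(x)
        if r > 1.0:                 # left the domain: project to boundary
            x = x / r
        path.append(x.copy())
    return np.array(path)

rng = np.random.default_rng(2)
path = reflected_euler([0.0, 0.0],
                       lambda x: np.eye(2),    # sigma = identity
                       lambda x: np.zeros(2),  # b = 0: reflected Brownian motion
                       T=1.0, n=5000, rng=rng)
max_radius = np.linalg.norm(path, axis=1).max()
```

The projected path never leaves the closed disc, the discrete analogue of $X(t) \in \bar{D}$.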
11.4 The tandem queue.

Returning to queueing models, consider a simple example of a queueing network, the tandem queue:
$$Q_1(t) = Q_1(0) + Y_a(\lambda t) - Y_{d1}\Bigl(\mu_1\int_0^t I_{\{Q_1(s)>0\}}\,ds\Bigr)$$
$$Q_2(t) = Q_2(0) + Y_{d1}\Bigl(\mu_1\int_0^t I_{\{Q_1(s)>0\}}\,ds\Bigr) - Y_{d2}\Bigl(\mu_2\int_0^t I_{\{Q_2(s)>0\}}\,ds\Bigr).$$
If we assume that
$$\lambda_n, \mu^n_1, \mu^n_2 \to c, \qquad c^n_1 \equiv \sqrt{n}(\lambda_n - \mu^n_1) \to c_1, \qquad c^n_2 \equiv \sqrt{n}(\mu^n_1 - \mu^n_2) \to c_2,$$
and renormalize the queue lengths to define
$$X^n_1(t) = \frac{Q_1(nt)}{\sqrt{n}} = X^n_1(0) + W^n_{a1}(t) + c^n_1 t - W^n_{d1}\Bigl(\int_0^t I_{\{X^n_1(s)>0\}}\,ds\Bigr) + \sqrt{n}\,\mu^n_1\int_0^t I_{\{X^n_1(s)=0\}}\,ds$$
$$X^n_2(t) = X^n_2(0) + W^n_{d1}\Bigl(\int_0^t I_{\{X^n_1(s)>0\}}\,ds\Bigr) - W^n_{d2}\Bigl(\int_0^t I_{\{X^n_2(s)>0\}}\,ds\Bigr) + c^n_2 t - \sqrt{n}\,\mu^n_1\int_0^t I_{\{X^n_1(s)=0\}}\,ds + \sqrt{n}\,\mu^n_2\int_0^t I_{\{X^n_2(s)=0\}}\,ds,$$
we can obtain a diffusion limit for this model. We know already that $X^n_1$ converges in distribution to the solution $X_1$ of
$$X_1(t) = X_1(0) + W_1(t) - W_2(t) + c_1 t + \Lambda_1(t).$$
For $X^n_2$, we use similar techniques to show that $X^n_2$ converges in distribution to $X_2$ satisfying
$$X_2(t) = X_2(0) + W_2(t) - W_3(t) + c_2 t - \Lambda_1(t) + \Lambda_2(t),$$
or in vector form
$$X(t) = X(0) + \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}\begin{pmatrix} W_1 \\ W_2 \\ W_3 \end{pmatrix} + \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} t + \begin{pmatrix} 1 \\ -1 \end{pmatrix}\Lambda_1(t) + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\Lambda_2(t),$$
where $\Lambda_1$ increases only when $X_1 = 0$ and $\Lambda_2$ increases only when $X_2 = 0$.
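The pre-limit Markovian tandem queue is easy to simulate exactly with a Gillespie-style event loop; station 1, viewed in isolation, is an M/M/1 queue whose stationary mean queue length $\rho/(1-\rho)$ gives a check on the simulation. The following sketch (not from the notes; rates and horizon are illustrative) estimates the time-average of $Q_1$:

```python
import random

def simulate_tandem(lam, mu1, mu2, T, seed):
    """Gillespie simulation of the Markovian tandem queue: arrivals at
    rate lam; service at station 1 at rate mu1 when Q1 > 0, each such
    departure joining station 2; service at station 2 at rate mu2 when
    Q2 > 0.  Returns the time average of Q1 over [0, T]."""
    random.seed(seed)
    t, q1, q2 = 0.0, 0, 0
    area1 = 0.0                      # time integral of Q1
    while t < T:
        r1 = lam
        r2 = mu1 if q1 > 0 else 0.0
        r3 = mu2 if q2 > 0 else 0.0
        total = r1 + r2 + r3
        dt = min(random.expovariate(total), T - t)
        area1 += q1 * dt
        t += dt
        if t >= T:
            break
        u = random.uniform(0.0, total)
        if u < r1:
            q1 += 1
        elif u < r1 + r2:
            q1 -= 1
            q2 += 1
        else:
            q2 -= 1
    return area1 / T

avg_q1 = simulate_tandem(lam=0.5, mu1=1.0, mu2=1.0, T=200000.0, seed=3)
# Station 1 alone is M/M/1 with rho = 0.5, so E[Q1] = rho/(1-rho) = 1.
```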
12 Change of Measure

Let $(\Omega, \mathcal{F}, Q)$ be a probability space, and let $L$ be a nonnegative random variable such that
$$E^Q[L] = \int L\,dQ = 1.$$
Define $P(\Gamma) \equiv \int_\Gamma L\,dQ$ for $\Gamma \in \mathcal{F}$. Then $P$ is a probability measure on $\mathcal{F}$ that is absolutely continuous with respect to $Q$ ($P \ll Q$), and $L$ is denoted by
$$L = \frac{dP}{dQ}.$$
12.1 Applications of change of measure.

Maximum likelihood estimation: Suppose for each $\alpha \in \mathcal{A}$,
$$P_\alpha(\Gamma) = \int_\Gamma L_\alpha\,dQ$$
and
$$L_\alpha = H(\alpha, X_1, X_2, \ldots, X_n)$$
for random variables $X_1, \ldots, X_n$. The maximum likelihood estimate for the true parameter $\alpha_0 \in \mathcal{A}$, based on observations of the random variables $X_1, \ldots, X_n$, is the value of $\alpha$ that maximizes $H(\alpha, X_1, X_2, \ldots, X_n)$.

For example, let
$$X_\alpha(t) = X(0) + \int_0^t \sigma(X_\alpha(s))\,dW(s) + \int_0^t b(X_\alpha(s), \alpha)\,ds.$$
We will give conditions under which the distribution of $X_\alpha$ is absolutely continuous with respect to the distribution of $X$ satisfying
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s). \tag{12.1}$$

Sufficiency: If $dP_\alpha = L_\alpha\,dQ$ where
$$L_\alpha(X, Y) = H_\alpha(X)G(Y),$$
then $X$ is a sufficient statistic for $\alpha$.

Finance: Asset pricing models depend on finding a change of measure under which the price process becomes a martingale.

Stochastic control: For a controlled diffusion process
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s), u(s))\,ds,$$
where the control enters only through the drift coefficient, the controlled process can be obtained from an uncontrolled process satisfying (12.1) via a change of measure.
12.2 Bayes formula.

Assume $dP = L\,dQ$ on $(\Omega, \mathcal{F})$. Note that $E^P[X] = E^Q[XL]$. We want to derive the corresponding formula for conditional expectations. Recall that $Y = E[Z|\mathcal{D}]$ if

1) $Y$ is $\mathcal{D}$-measurable;

2) for each $D \in \mathcal{D}$, $\int_D Y\,dP = \int_D Z\,dP$.

Lemma 12.1 (Bayes formula)
$$E^P[Z|\mathcal{D}] = \frac{E^Q[ZL|\mathcal{D}]}{E^Q[L|\mathcal{D}]}. \tag{12.2}$$

Proof. Clearly the right side of (12.2) is $\mathcal{D}$-measurable. Let $D \in \mathcal{D}$. Then
$$\int_D \frac{E^Q[ZL|\mathcal{D}]}{E^Q[L|\mathcal{D}]}\,dP = \int_D \frac{E^Q[ZL|\mathcal{D}]}{E^Q[L|\mathcal{D}]}\,L\,dQ = \int_D \frac{E^Q[ZL|\mathcal{D}]}{E^Q[L|\mathcal{D}]}\,E^Q[L|\mathcal{D}]\,dQ = \int_D E^Q[ZL|\mathcal{D}]\,dQ = \int_D ZL\,dQ = \int_D Z\,dP,$$
which verifies the identity. $\square$

For real-valued random variables $X, Y$ with a joint density $f_{XY}(x, y)$, conditional expectations can be computed by
$$E[g(Y)|X = x] = \frac{\int g(y)f_{XY}(x, y)\,dy}{f_X(x)}.$$
For general random variables, suppose $X$ and $Y$ are independent on $(\Omega, \mathcal{F}, Q)$. Let $L = H(X, Y) \ge 0$ with $E[H(X, Y)] = 1$. Define
$$\nu_Y(\Gamma) = Q\{Y \in \Gamma\}, \qquad dP = H(X, Y)\,dQ.$$
Bayes formula becomes
$$E^P[g(Y)|X] = \frac{\int g(y)H(X, y)\,\nu_Y(dy)}{\int H(X, y)\,\nu_Y(dy)}.$$
The left side is equal to
$$\frac{E^Q[g(Y)H(X, Y)|X]}{E^Q[H(X, Y)|X]},$$
and the independence of $X$ and $Y$ gives the identity by Property 10 of Section 2.6.
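Lemma 12.1 is easy to check by Monte Carlo in a toy discrete setting. In the sketch below (an illustration, not part of the notes), $X$ and $Y$ are independent fair coin flips under $Q$, and $L = H(X, Y) = (1 + X + Y)/2$ defines $dP = L\,dQ$ (the factor 2 is the exact value of $E^Q[1 + X + Y]$); the Bayes-formula estimate of $E^P[Y \mid X = 1]$ is compared with the exact value $3/5$:

```python
import random

random.seed(4)
samples = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(200000)]
norm = sum(1 + x + y for x, y in samples) / len(samples)  # sanity: ~ E_Q[1+X+Y] = 2

def H(x, y):
    # Radon-Nikodym derivative dP/dQ, exactly normalized: E_Q[1+X+Y] = 2
    return (1 + x + y) / 2.0

# E_P[Y | X = 1] via Bayes formula: E_Q[Y L | X = 1] / E_Q[L | X = 1]
num = sum(y * H(x, y) for x, y in samples if x == 1)
den = sum(H(x, y) for x, y in samples if x == 1)
bayes_est = num / den
# Exact: (0*H(1,0) + 1*H(1,1)) / (H(1,0) + H(1,1)) = (3/2)/(1 + 3/2) = 0.6
```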
12.3 Local absolute continuity.

Let $(\Omega, \mathcal{F})$ be a measurable space, and let $P$ and $Q$ be probability measures on $\mathcal{F}$. Suppose $\mathcal{F}_n \subset \mathcal{F}_{n+1}$ and that for each $n$, $P|_{\mathcal{F}_n} \ll Q|_{\mathcal{F}_n}$. Define $L_n = \frac{dP|_{\mathcal{F}_n}}{dQ|_{\mathcal{F}_n}}$. Then $L_n$ is a nonnegative $\{\mathcal{F}_n\}$-martingale on $(\Omega, \mathcal{F}, Q)$, and $L = \lim_{n\to\infty} L_n$ satisfies $E^Q[L] \le 1$. If $E^Q[L] = 1$, then $P \ll Q$ on $\mathcal{F}_\infty = \bigvee_n \mathcal{F}_n$. The next proposition gives conditions for this absolute continuity in terms of $P$.

Proposition 12.2 $P \ll Q$ on $\mathcal{F}_\infty$ if and only if $P\{\lim_{n\to\infty} L_n < \infty\} = 1$.

Proof. We have
$$P\{\sup_{n\le N} L_n \le K\} = \int I_{\{\sup_{n\le N} L_n \le K\}}L_N\,dQ.$$
The dominated convergence theorem implies
$$P\{\sup_n L_n \le K\} = \int I_{\{\sup_n L_n \le K\}}L\,dQ.$$
Letting $K \to \infty$ and using $P\{\sup_n L_n < \infty\} = 1$, we see that $E^Q[L] = 1$, and hence $P \ll Q$ on $\mathcal{F}_\infty$. $\square$
12.4 Martingales and change of measure.

(See Protter (1990), Section III.6.) Let $\{\mathcal{F}_t\}$ be a filtration, assume that $P|_{\mathcal{F}_t} \ll Q|_{\mathcal{F}_t}$, and let $L(t)$ be the corresponding Radon–Nikodym derivative. Then, as before, $L$ is an $\{\mathcal{F}_t\}$-martingale on $(\Omega, \mathcal{F}, Q)$.

Lemma 12.3 $Z$ is a $P$-local martingale if and only if $LZ$ is a $Q$-local martingale.

Proof. Note that for a bounded stopping time $\tau$, $Z(\tau)$ is $P$-integrable if and only if $L(\tau)Z(\tau)$ is $Q$-integrable. By Bayes formula, $E^P[Z(t+h) - Z(t)|\mathcal{F}_t] = 0$ if and only if $E^Q[L(t+h)(Z(t+h) - Z(t))|\mathcal{F}_t] = 0$, which is equivalent to
$$E^Q[L(t+h)Z(t+h)|\mathcal{F}_t] = E^Q[L(t+h)Z(t)|\mathcal{F}_t] = L(t)Z(t). \qquad \square$$

Theorem 12.4 If $M$ is a $Q$-local martingale, then
$$Z(t) = M(t) - \int_0^t \frac{1}{L(s)}\,d[L, M]_s \tag{12.3}$$
is a $P$-local martingale. (Note that the integrand is $\frac{1}{L(s)}$, not $\frac{1}{L(s-)}$.)

Proof. Note that $LM - [L, M]$ is a $Q$-local martingale. We need to show that $LZ$ is a $Q$-local martingale. Letting $V$ denote the second term on the right of (12.3), integration by parts gives $L(t)V(t) = \int_0^t V(s-)\,dL(s) + [L, M]_t$, so
$$L(t)Z(t) = L(t)M(t) - [L, M]_t - \int_0^t V(s-)\,dL(s),$$
and both terms on the right are $Q$-local martingales. $\square$
12.5 Change of measure for Brownian motion.

Let $W$ be standard Brownian motion, and let $\xi$ be an adapted process. Define
$$L(t) = \exp\Bigl\{\int_0^t \xi(s)\,dW(s) - \frac{1}{2}\int_0^t \xi^2(s)\,ds\Bigr\},$$
and note that
$$L(t) = 1 + \int_0^t \xi(s)L(s)\,dW(s).$$
Then $L$ is a local martingale.

For independent standard Brownian motions $W_1, \ldots, W_m$ and adapted processes $\xi_i$,
$$L(t) = \exp\Bigl\{\sum_i \int_0^t \xi_i(s)\,dW_i(s) - \frac{1}{2}\sum_i \int_0^t \xi_i^2(s)\,ds\Bigr\}$$
is the solution of
$$L(t) = 1 + \sum_i \int_0^t \xi_i(s)L(s)\,dW_i(s).$$

Assume $E^Q[L(t)] = 1$ for all $t \ge 0$. Then $L$ is a martingale. Fix a time $T$, and restrict attention to the probability space $(\Omega, \mathcal{F}_T, Q)$. On $\mathcal{F}_T$, define $dP = L(T)\,dQ$. For $t < T$ and $A \in \mathcal{F}_t$,
$$P(A) = E^Q[I_A L(T)] = E^Q[I_A E^Q[L(T)|\mathcal{F}_t]] = E^Q[I_A L(t)],$$
which has no dependence on $T$ (it is crucial here that $L$ is a martingale).

Claim: $\widetilde{W}_i(t) = W_i(t) - \int_0^t \xi_i(s)\,ds$ is a standard Brownian motion on $(\Omega, \mathcal{F}_T, P)$. Since $\widetilde{W}_i$ is continuous and $[\widetilde{W}_i]_t = t$ a.s., it is enough to show that $\widetilde{W}_i$ is a local martingale (and hence a martingale). But since $W_i$ is a $Q$-martingale and $[L, W_i]_t = \int_0^t \xi_i(s)L(s)\,ds$, Theorem 12.4 gives the desired result. Since $[\widetilde{W}_i, \widetilde{W}_j]_t = [W_i, W_j]_t = 0$ for $i \ne j$, the $\widetilde{W}_i$ are independent.

Now suppose that
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s)$$
and set $\xi(s) = b(X(s))$. Note that $X$ is a diffusion with generator $\frac{1}{2}\sigma^2(x)f''(x)$. Define
$$L(t) = \exp\Bigl\{\int_0^t b(X(s))\,dW(s) - \frac{1}{2}\int_0^t b^2(X(s))\,ds\Bigr\},$$
and assume that $E^Q[L(T)] = 1$ (e.g., if $b$ is bounded). Set $dP = L(T)\,dQ$ on $(\Omega, \mathcal{F}_T)$, and define $\widetilde{W}(t) = W(t) - \int_0^t b(X(s))\,ds$. Then
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) \tag{12.4}$$
$$= X(0) + \int_0^t \sigma(X(s))\,d\widetilde{W}(s) + \int_0^t \sigma(X(s))b(X(s))\,ds,$$
so under $P$, $X$ is a diffusion with generator
$$\frac{1}{2}\sigma^2(x)f''(x) + \sigma(x)b(x)f'(x). \tag{12.5}$$

We can eliminate the a priori assumption that $E^Q[L(T)] = 1$ by defining $\tau_n = \inf\{t : \int_0^t b^2(X(s))\,ds > n\}$ and defining $dP = L(T\wedge\tau_n)\,dQ$ on $\mathcal{F}_{T\wedge\tau_n}$. Then on $(\Omega, \mathcal{F}_{T\wedge\tau_n}, P)$, $X$ is a diffusion with generator (12.5) stopped at time $T\wedge\tau_n$. But if there is a unique (in distribution) such diffusion, and $\int_0^t b^2(X(s))\,ds$ is almost surely finite for this diffusion, then we can apply Proposition 12.2 to conclude that $P \ll Q$ on $\mathcal{F}_T$, that is, $E[L(T)] = 1$.
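For a constant integrand $\xi(s) \equiv \theta$, the claim above says that under $dP = L(T)\,dQ$ the Brownian motion acquires drift $\theta$, i.e. $E^Q[L(T)f(W(T))] = E[f(W(T) + \theta T)]$. The following Monte Carlo sketch (illustrative only; $\theta$, $T$, and the test function are arbitrary choices) checks this against the closed-form value $E[\cos N(\theta T, T)] = \cos(\theta T)e^{-T/2}$:

```python
import numpy as np

rng = np.random.default_rng(5)
theta, T, n = 0.7, 1.0, 400000

WT = rng.normal(0.0, np.sqrt(T), n)            # W(T) under Q
L = np.exp(theta * WT - 0.5 * theta**2 * T)    # L(T) for constant xi = theta
mean_L = L.mean()                              # should be close to 1

f = lambda x: np.cos(x)
lhs = np.mean(L * f(WT))                       # E_Q[L(T) f(W(T))]
rhs = np.mean(f(rng.normal(theta * T, np.sqrt(T), n)))  # E[f(W(T) + theta*T)]
exact = np.cos(theta * T) * np.exp(-T / 2.0)   # E[cos] of a N(theta*T, T) variable
```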
12.6 Change of measure for Poisson processes.

Let $N$ be an $\{\mathcal{F}_t\}$-unit Poisson process on $(\Omega, \mathcal{F}, Q)$; that is, $N$ is a unit Poisson process adapted to $\{\mathcal{F}_t\}$, and for each $t$, $N(t + \cdot) - N(t)$ is independent of $\mathcal{F}_t$. If $Z$ is nonnegative and $\{\mathcal{F}_t\}$-adapted, then
$$L(t) = \exp\Bigl\{\int_0^t \ln Z(s)\,dN(s) - \int_0^t (Z(s) - 1)\,ds\Bigr\}$$
satisfies
$$L(t) = 1 + \int_0^t (Z(s) - 1)L(s-)\,d(N(s) - s)$$
and is a $Q$-local martingale. If $E[L(T)] = 1$ and we define $dP = L(T)\,dQ$ on $\mathcal{F}_T$, then
$$N(t) - \int_0^t Z(s)\,ds$$
is a $P$-local martingale.

If $N_1, \ldots, N_m$ are independent unit Poisson processes and the $Z_i$ are nonnegative and $\{\mathcal{F}_t\}$-adapted, then
$$L(t) = \prod_{i=1}^m \exp\Bigl\{\int_0^t \ln Z_i(s)\,dN_i(s) - \int_0^t (Z_i(s) - 1)\,ds\Bigr\}$$
satisfies
$$L(t) = 1 + \sum_{i=1}^m \int_0^t (Z_i(s) - 1)L(s-)\,d(N_i(s) - s).$$

Let $J[0, \infty)$ denote the collection of nonnegative integer-valued cadlag functions that are constant except for jumps of $+1$. Suppose that $\lambda_i : J[0, \infty)^m \times [0, \infty) \to [0, \infty)$, $i = 1, \ldots, m$, and that $\lambda_i(x, s) = \lambda_i(x(\cdot\wedge s), s)$ (that is, $\lambda_i$ is nonanticipating). For $N = (N_1, \ldots, N_m)$, if we take $Z_i(t) = \lambda_i(N, t)$ and let $\tau_n = \inf\{t : \sum_i N_i(t) = n\}$, then defining $dP = L(\tau_n)\,dQ$ on $\mathcal{F}_{\tau_n}$, $N$ on $(\Omega, \mathcal{F}_{\tau_n}, P)$ has the same distribution as the solution of
$$\widetilde{N}_i(t) = Y_i\Bigl(\int_0^{t\wedge\widetilde{\tau}_n} \lambda_i(\widetilde{N}, s)\,ds\Bigr),$$
where the $Y_i$ are independent unit Poisson processes and $\widetilde{\tau}_n = \inf\{t : \sum_i \widetilde{N}_i(t) = n\}$.
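For a constant $Z(s) \equiv z$, the density reduces to $L(T) = z^{N(T)}e^{-(z-1)T}$, and under $dP = L(T)\,dQ$ the process $N(T)$ is Poisson with mean $zT$. The sketch below (an illustration, not part of the notes; $z$ and $T$ are arbitrary) checks $E^Q[L(T)N(T)] = zT$ by Monte Carlo:

```python
import math
import random

random.seed(6)
z, T, n = 2.0, 1.0, 200000

def poisson(rate):
    """Sample N(T) for a Poisson process of the given rate by summing
    exponential interarrival times (T from the enclosing scope)."""
    t, k = 0.0, 0
    while True:
        t += random.expovariate(rate)
        if t > T:
            return k
        k += 1

total = 0.0
for _ in range(n):
    N = poisson(1.0)                                  # unit Poisson under Q
    total += z**N * math.exp(-(z - 1.0) * T) * N      # L(T) * g(N) with g(k) = k
lhs = total / n
# Under P, N(T) ~ Poisson(zT), so the exact value is z * T = 2.0.
```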
13 Finance.

Consider financial activity over a time interval $[0, T]$ modeled by a probability space $(\Omega, \mathcal{F}, P)$. Assume that there is a "fair casino" or market such that at time 0, for each event $A \in \mathcal{F}$, a price $Q(A) \ge 0$ is fixed for a bet, or contract, that pays one dollar at time $T$ if and only if $A$ occurs. Assume that the market is such that an investor can either buy or sell the contract, and that $Q(\Omega) < \infty$. An investor can construct a portfolio by buying or selling a variety (possibly countably many) of contracts in arbitrary multiples. If $a_i$ is the quantity of a contract for $A_i$ ($a_i < 0$ corresponds to selling the contract), then the payoff at time $T$ is
$$\sum_i a_i I_{A_i}.$$
We will require that $\sum_i |a_i|Q(A_i) < \infty$, so that the initial cost of the portfolio is (unambiguously)
$$\sum_i a_i Q(A_i).$$

The market has no arbitrage if no combination (buying and selling) of countably many contracts with a net cost of zero results in a positive profit at no risk. That is, if
$$\sum_i |a_i|Q(A_i) < \infty, \qquad \sum_i a_i Q(A_i) = 0, \qquad \text{and} \qquad \sum_i a_i I_{A_i} \ge 0 \ \text{a.s.},$$
then
$$\sum_i a_i I_{A_i} = 0 \ \text{a.s.}$$

The no-arbitrage requirement has the following implications.

Lemma 13.1 Assume that there is no arbitrage. If $P(A) = 0$, then $Q(A) = 0$; if $Q(A) = 0$, then $P(A) = 0$.

Proof. Suppose $P(A) = 0$ and $Q(A) > 0$. Then construct a portfolio by buying one unit of $\Omega$ and selling $Q(\Omega)/Q(A)$ units of $A$. The net cost is
$$Q(\Omega) - \frac{Q(\Omega)}{Q(A)}Q(A) = 0,$$
and the payoff is
$$1 - \frac{Q(\Omega)}{Q(A)}I_A = 1 \ \text{a.s.},$$
which contradicts the no-arbitrage assumption.

Now suppose $Q(A) = 0$. Construct a portfolio by buying one unit of $A$. The cost of the portfolio is $Q(A) = 0$, and the payoff is $I_A \ge 0$. So by the no-arbitrage assumption, $I_A = 0$ a.s., that is, $P(A) = 0$. $\square$
Lemma 13.2 If there is no arbitrage and $A \subset B$, then $Q(A) \le Q(B)$.

Proof. Suppose $Q(B) < Q(A)$. Construct a portfolio by buying one unit of $B$ and selling $Q(B)/Q(A)$ units of $A$. The net cost of the portfolio is
$$Q(B) - \frac{Q(B)}{Q(A)}Q(A) = 0,$$
and the payoff is
$$I_B - \frac{Q(B)}{Q(A)}I_A = I_{B\setminus A} + \Bigl(1 - \frac{Q(B)}{Q(A)}\Bigr)I_A \ge 0,$$
which is strictly positive on $B$. But $Q(A) > 0$ implies $P(A) > 0$, so there is a positive payoff with positive probability, contradicting the no-arbitrage assumption. $\square$
Theorem 13.3 If there is no arbitrage, $Q$ must be a measure on $\mathcal{F}$ that is equivalent to $P$.

Proof. Let $A_1, A_2, \ldots$ be disjoint, and for $A = \bigcup_{i=1}^\infty A_i$, suppose that $Q(A) < \beta \equiv \sum_i Q(A_i)$. Then buy one unit of $A$ and sell $Q(A)/\beta$ units of $A_i$ for each $i$. The net cost is zero, and the net payoff is
$$I_A - \frac{Q(A)}{\beta}\sum_i I_{A_i} = \Bigl(1 - \frac{Q(A)}{\beta}\Bigr)I_A.$$
Note that $Q(A_i) > 0$ implies $P(A_i) > 0$, and hence $P(A) > 0$, so the right side is $\ge 0$ a.s. and is $> 0$ with positive probability, contradicting the no-arbitrage assumption. It follows that $Q(A) \ge \beta$. If $Q(A) > \beta$, then sell one unit of $A$ and buy $Q(A)/\beta$ units of $A_i$ for each $i$. $\square$

Theorem 13.4 If there is no arbitrage, $Q \ll P$ and $P \ll Q$. ($P$ and $Q$ are equivalent.)

Proof. The result follows from Lemma 13.1. $\square$

If $X$ and $Y$ are random variables satisfying $X \le Y$ a.s., then no arbitrage should mean $Q(X) \le Q(Y)$, where $Q(X)$ denotes the price of a contract paying $X$ at time $T$. It follows that for any $Q$-integrable $X$, $Q(X) = \int X\,dQ$.
13.1 Assets that can be traded at intermediate times.

Let $\mathcal{F}_t$ represent the information available at time $t$. Let $B(t)$ be the price at time $t$ of a bond that is worth \$1 at time $T$ (e.g., $B(t) = e^{-r(T-t)}$); that is, at any time $0 \le t \le T$, $B(t)$ is the price of a contract that pays exactly \$1 at time $T$. Note that $B(0) = Q(\Omega)$, and define
$$\widetilde{Q}(A) = Q(A)/B(0).$$
Let $X(t)$ be the price at time $t$ of another tradeable asset. For any stopping time $\tau \le T$, we can buy one unit of the asset at time 0, sell the asset at time $\tau$, and use the money received ($X(\tau)$) to buy $X(\tau)/B(\tau)$ units of the bond. Since the payoff for this strategy is $X(\tau)/B(\tau)$, we must have
$$X(0) = \int \frac{X(\tau)}{B(\tau)}\,dQ = \int \frac{B(0)X(\tau)}{B(\tau)}\,d\widetilde{Q}.$$
Lemma 13.5 If $E[Z(\tau)] = E[Z(0)]$ for all bounded stopping times $\tau$, then $Z$ is a martingale.

Corollary 13.6 If $X$ is the price of a tradeable asset, then $X/B$ is a martingale on $(\Omega, \mathcal{F}, \widetilde{Q})$.

Consider $B(t) \equiv 1$. Let $W$ be a standard Brownian motion on $(\Omega, \mathcal{F}, P)$ and $\mathcal{F}_t = \mathcal{F}^W_t$, $0 \le t \le T$. Suppose $X$ is the price of a tradeable asset given as the unique solution of
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds.$$
For simplicity, assume $\sigma(X(t)) > 0$ for all $t \ge 0$. Then $\mathcal{F}^X_t = \mathcal{F}^W_t$, since, setting
$$M(t) = X(t) - X(0) - \int_0^t b(X(s))\,ds,$$
we have
$$W(t) = \int_0^t \frac{1}{\sigma(X(s))}\,dM(s).$$
Suppose $\widetilde{Q}$ ($= Q$, since $B(0) = 1$) is a pricing measure and
$$L = L(T) = \frac{d\widetilde{Q}|_{\mathcal{F}_T}}{dP|_{\mathcal{F}_T}}.$$
Then $L(t) = E[L(T)|\mathcal{F}_t]$, $0 \le t \le T$, is a martingale, and
$$\widetilde{W}(t) = W(t) - \int_0^t \frac{1}{L(s)}\,d[L, W]_s$$
is a Brownian motion on $(\Omega, \mathcal{F}, \widetilde{Q})$.

Theorem 13.7 (Martingale representation theorem) Suppose $M$ is a martingale on $(\Omega, \mathcal{F}, P)$ with respect to the filtration generated by a standard Brownian motion $W$. Then there exists an adapted, measurable process $U$ such that $\int_0^t U^2(s)\,ds < \infty$ a.s. for each $t > 0$ and
$$M(t) = M(0) + \int_0^t U(s)\,dW(s).$$

Note that the definition of the stochastic integral must be extended for the above theorem to be valid. Suppose $U$ is progressive and satisfies
$$\int_0^t |U(s)|^2\,ds < \infty \ \text{a.s.}$$
for every $t > 0$. Defining $U(s) = U(0)$ for $s < 0$, set
$$U_n(t) = n\int_{t-\frac{1}{n}}^t U(s)\,ds.$$
Note that $U_n$ is continuous and adapted and that
$$\int_0^t |U(s) - U_n(s)|^2\,ds \to 0.$$
It follows that the sequence $\int_0^t U_n(s)\,dW(s)$ is Cauchy in probability: for an appropriate localizing stopping time $\tau$,
$$P\Bigl\{\sup_{s\le t}\Bigl|\int_0^s U_n(r)\,dW(r) - \int_0^s U_m(r)\,dW(r)\Bigr| > \epsilon\Bigr\} \le P\{\tau \le t\} + \frac{4E\bigl[\int_0^{t\wedge\tau}|U_n(s) - U_m(s)|^2\,ds\bigr]}{\epsilon^2},$$
and we define
$$\int_0^t U(s)\,dW(s) \equiv \lim_{n\to\infty}\int_0^t U_n(s)\,dW(s).$$
Let
$$L(t) = 1 + \int_0^t U(s)\,dW(s).$$
Then $[L, W]_t = \int_0^t U(s)\,ds$ and
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,d\widetilde{W}(s) + \int_0^t \Bigl(\frac{\sigma(X(s))}{L(s)}U(s) + b(X(s))\Bigr)\,ds.$$

Lemma 13.8 If $M$ is a continuous local martingale of finite variation, then $M$ is constant in time.

Proof. We have
$$(M(t) - M(0))^2 = \int_0^t 2(M(s) - M(0))\,dM(s) + [M]_t.$$
Since $[M]_t = 0$, $(M(t) - M(0))^2$ must be a local martingale and hence must be identically zero. $\square$

Since $X$ must be a martingale on $(\Omega, \mathcal{F}, \widetilde{Q})$, the lemma implies
$$\frac{\sigma(X(s))}{L(s)}U(s) + b(X(s)) = 0.$$
It follows that
$$L(t) = 1 - \int_0^t \frac{b(X(s))}{\sigma(X(s))}L(s)\,dW(s),$$
so
$$L(t) = \exp\Bigl\{-\int_0^t \frac{b(X(s))}{\sigma(X(s))}\,dW(s) - \frac{1}{2}\int_0^t \Bigl(\frac{b(X(s))}{\sigma(X(s))}\Bigr)^2\,ds\Bigr\},$$
$$\widetilde{W}(t) = W(t) + \int_0^t \frac{b(X(s))}{\sigma(X(s))}\,ds,$$
and
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,d\widetilde{W}(s).$$
Note that $E[L(t)] = 1$ if
$$P\Bigl\{\int_0^t \Bigl(\frac{b(X(s))}{\sigma(X(s))}\Bigr)^2 ds < \infty\Bigr\} = 1.$$
For example, if
$$X(t) = x_0 + \int_0^t \sigma X(s)\,dW(s) + \int_0^t bX(s)\,ds,$$
that is,
$$X(t) = x_0\exp\Bigl\{\sigma W(t) - \frac{1}{2}\sigma^2 t + bt\Bigr\},$$
then
$$L(t) = \exp\Bigl\{-\frac{b}{\sigma}W(t) - \frac{1}{2}\frac{b^2}{\sigma^2}t\Bigr\}.$$
Under $d\widetilde{Q} = L(T)\,dP$, $\widetilde{W}(t) = W(t) + \frac{b}{\sigma}t$ is a standard Brownian motion, and
$$E^{\widetilde{Q}}[f(X(T))] = \int_{-\infty}^\infty f\Bigl(x_0\exp\Bigl\{\sigma y - \frac{1}{2}\sigma^2 T\Bigr\}\Bigr)\frac{1}{\sqrt{2\pi T}}e^{-\frac{y^2}{2T}}\,dy.$$
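In this lognormal example the change of measure can be checked numerically: simulating $X(T)$ under $P$ and weighting by $L(T)$ should reproduce the $\widetilde{Q}$-expectation, and since $X/B$ with $B \equiv 1$ is a $\widetilde{Q}$-martingale, $E^{\widetilde{Q}}[X(T)] = x_0$. The sketch below (illustrative only; the parameter values are arbitrary) verifies this:

```python
import numpy as np

rng = np.random.default_rng(7)
x0, sigma, b, T, n = 1.0, 0.4, 0.25, 1.0, 400000

WT = rng.normal(0.0, np.sqrt(T), n)                        # W(T) under P
XT = x0 * np.exp(sigma * WT - 0.5 * sigma**2 * T + b * T)  # asset price under P
L = np.exp(-(b / sigma) * WT - 0.5 * (b / sigma)**2 * T)   # Girsanov weight L(T)
mc_price = np.mean(L * XT)    # Monte Carlo estimate of E^{Q~}[X(T)]
# X is a Q~-martingale (B = 1), so the exact value is x0 = 1.0.
```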
How reasonable is the assumption that there exists a pricing measure $Q$? Start with a model for a collection of tradeable assets. For example, let
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds,$$
or, more generally, just assume that $X$ is a vector semimartingale. Allow certain trading strategies producing a payoff at time $T$:
$$Y(T) = Y(0) + \sum_i \int_0^T H_i(s)\,dX_i(s).$$
Arbitrage exists if there is a trading strategy satisfying
$$Y(T) = \sum_i \int_0^T H_i(s)\,dX_i(s) \ge 0 \ \text{a.s.}$$
with $P\{Y(T) > 0\} > 0$.

13.2 First fundamental theorem.

Theorem 13.9 (Meta theorem) There is no arbitrage if and only if there exists a probability measure $Q$ equivalent to $P$ under which the $X_i$ are martingales.

Problems:

• What trading strategies are allowable?

• The definition of no arbitrage above is, in general, too weak to give the theorem.

For example, assume that $B(t) \equiv 1$ and that there is a single asset satisfying
$$X(t) = X(0) + \int_0^t \sigma X(s)\,dW(s) + \int_0^t bX(s)\,ds = X(0) + \int_0^t \sigma X(s)\,d\widetilde{W}(s).$$
Let $T = 1$, and for some stopping time $\tau < T$, let
$$H(t) = \frac{1}{\sigma X(t)(1 - t)}, \qquad 0 \le t < \tau,$$
and $H(t) = 0$ for $t \ge \tau$. Then for $t < \tau$,
$$\int_0^t H(s)\,dX(s) = \int_0^t \frac{1}{1 - s}\,d\widetilde{W}(s) = \widehat{W}\Bigl(\int_0^t \frac{1}{(1 - s)^2}\,ds\Bigr),$$
where $\widehat{W}$ is a standard Brownian motion under $\widetilde{Q}$. Let $\gamma = \inf\{u : \widehat{W}(u) = 1\}$, and let $\tau$ be defined by
$$\int_0^\tau \frac{1}{(1 - s)^2}\,ds = \gamma.$$
Then, with probability one,
$$\int_0^1 H(s)\,dX(s) = 1.$$
Admissible trading strategies: The trading strategy $(x, H_1, \ldots, H_d)$ is admissible if, for
$$V(t) = x + \sum_i \int_0^t H_i(s)\,dX_i(s),$$
there exists a constant $a$ such that
$$\inf_{0\le t\le T} V(t) \ge -a \ \text{a.s.}$$

No arbitrage: If $(0, H_1, \ldots, H_d)$ is an admissible trading strategy and $\sum_i \int_0^T H_i(s)\,dX_i(s) \ge 0$ a.s., then $\sum_i \int_0^T H_i(s)\,dX_i(s) = 0$ a.s.

No free lunch with vanishing risk: If $(0, H^n_1, \ldots, H^n_d)$ are admissible trading strategies and
$$\lim_{n\to\infty}\Bigl\|0 \wedge \sum_i \int_0^T H^n_i(s)\,dX_i(s)\Bigr\|_\infty = 0,$$
then
$$\Bigl|\sum_i \int_0^T H^n_i(s)\,dX_i(s)\Bigr| \to 0$$
in probability.

Theorem 13.10 (Delbaen and Schachermayer) Let $X = (X_1, \ldots, X_d)$ be a bounded semimartingale defined on $(\Omega, \mathcal{F}, P)$, and let $\mathcal{F}_t = \sigma(X(s) : s \le t)$. Then there exists an equivalent martingale measure defined on $\mathcal{F}_T$ if and only if there is no free lunch with vanishing risk.
13.3 Second fundamental theorem.

Theorem 13.11 (Meta theorem) If there is no arbitrage, then the market is complete if and only if the equivalent martingale measure is unique.

Problems:

• What prices are determined by the allowable trading strategies?

• Specifically, how can one "close up" the collection of attainable payoffs?

Theorem 13.12 If there exists an equivalent martingale measure, then it is unique if and only if the set of replicable, bounded payoffs is complete, in the sense that
$$\Bigl\{x + \sum_i \int_0^T H_i(s)\,dX_i(s) : H_i \ \text{simple}\Bigr\} \cap L^\infty(P)$$
is weak* dense in $L^\infty(P, \mathcal{F}_T)$.

For general $B$, if we assume that after time 0 all wealth $V$ must be invested either in the assets $X_i$ or in the bond $B$, then the number of units of the bond held is
$$\frac{V(t) - \sum_i H_i(t)X_i(t)}{B(t)},$$
and
$$V(t) = V(0) + \sum_i \int_0^t H_i(s)\,dX_i(s) + \int_0^t \frac{V(s) - \sum_i H_i(s)X_i(s)}{B(s)}\,dB(s).$$
Applying Itô's formula, we have
$$\frac{V(t)}{B(t)} = \frac{V(0)}{B(0)} + \sum_i \int_0^t \frac{H_i(s)}{B(s)}\,dX_i(s),$$
which should be a martingale under $\widetilde{Q}$.
14 Filtering.

Signal:
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds.$$
Observation:
$$Y(t) = \int_0^t h(X(s))\,ds + \alpha V(t).$$
Change of measure:
$$\frac{dQ|_{\mathcal{F}_t}}{dP|_{\mathcal{F}_t}} = L_0(t) = 1 - \int_0^t \alpha^{-1}h(X(s))L_0(s)\,dV(s)$$
$$= 1 - \int_0^t \alpha^{-2}h(X(s))L_0(s)\,dY(s) + \int_0^t \alpha^{-2}h^2(X(s))L_0(s)\,ds$$
$$= \exp\Bigl\{-\int_0^t \alpha^{-1}h(X(s))\,dV(s) - \frac{1}{2}\int_0^t \alpha^{-2}h^2(X(s))\,ds\Bigr\}$$
$$= \exp\Bigl\{-\int_0^t \alpha^{-2}h(X(s))\,dY(s) + \frac{1}{2}\int_0^t \alpha^{-2}h^2(X(s))\,ds\Bigr\}.$$
Define
$$\widetilde{V}(t) = V(t) + \int_0^t \frac{h(X(s))}{\alpha}\,ds.$$
$W$ and $\widetilde{V}$ are independent Brownian motions under $Q$. Therefore $X$ and $Y = \alpha\widetilde{V}$ are independent under $Q$.

Therefore
$$\frac{dP|_{\mathcal{F}_t}}{dQ|_{\mathcal{F}_t}} = L(t) = L_0(t)^{-1} = \exp\Bigl\{\int_0^t \alpha^{-2}h(X(s))\,dY(s) - \frac{1}{2}\int_0^t \alpha^{-2}h^2(X(s))\,ds\Bigr\}$$
and
$$L(t) = 1 + \int_0^t \alpha^{-2}h(X(s))L(s)\,dY(s).$$
Set $L(t, X, Y) = L(t)$. Then
$$E^P[f(X(t))|\mathcal{F}^Y_t] = \frac{E^Q[f(X(t))L(t, X, Y)|\mathcal{F}^Y_t]}{E^Q[L(t, X, Y)|\mathcal{F}^Y_t]} = \frac{\int f(x(t))L(t, x, Y)\,\mu_X(dx)}{\int L(t, x, Y)\,\mu_X(dx)},$$
where $\mu_X$ denotes the distribution of $X$. Let $\phi$ be the measure-valued process determined by
$$\langle\phi(t), f\rangle = E^Q[f(X(t))L(t)|\mathcal{F}^Y_t].$$
We want to derive a differential equation for $\phi$.
$$f(X(t))L(t) = f(X(0)) + \int_0^t f(X(s))\,dL(s) + \int_0^t L(s)\sigma(X(s))f'(X(s))\,dW(s) + \int_0^t L(s)\mathcal{L}f(X(s))\,ds$$
$$= f(X(0)) + \int_0^t f(X(s))L(s)\alpha^{-2}h(X(s))\,dY(s) + \int_0^t L(s)\sigma(X(s))f'(X(s))\,dW(s) + \int_0^t L(s)\mathcal{L}f(X(s))\,ds.$$
Lemma 14.1 Suppose $X$ has finite expectation and is $\mathcal{H}$-measurable, and that $\mathcal{D}$ is independent of $\mathcal{G}\vee\mathcal{H}$. Then
$$E[X|\mathcal{G}\vee\mathcal{D}] = E[X|\mathcal{G}].$$

Proof. It is sufficient to show that for $G \in \mathcal{G}$ and $D \in \mathcal{D}$,
$$\int_{D\cap G} E[X|\mathcal{G}]\,dP = \int_{D\cap G} X\,dP.$$
But the independence assumption implies
$$\int_{D\cap G} E[X|\mathcal{G}]\,dP = E[I_D I_G E[X|\mathcal{G}]] = E[I_D]E[I_G E[X|\mathcal{G}]] = E[I_D]E[I_G X] = E[I_D I_G X] = \int_{D\cap G} X\,dP. \qquad \square$$
Lemma 14.2 Suppose that $Y$ has independent increments and is compatible with $\{\mathcal{F}_t\}$. Then for $\{\mathcal{F}_t\}$-progressive $U$ satisfying $\int_0^t E[|U(s)|]\,ds < \infty$,
$$E^Q\Bigl[\int_0^t U(s)\,ds\Big|\mathcal{F}^Y_t\Bigr] = \int_0^t E^Q[U(s)|\mathcal{F}^Y_s]\,ds.$$

Proof. By the Fubini theorem for conditional expectations,
$$E^Q\Bigl[\int_0^t U(s)\,ds\Big|\mathcal{F}^Y_t\Bigr] = \int_0^t E^Q[U(s)|\mathcal{F}^Y_t]\,ds.$$
The identity then follows by Lemma 14.1 and the fact that $Y(r) - Y(s)$ is independent of $U(s)$ for $r > s$. $\square$
Lemma 14.3 Suppose that $Y$ is an $\mathbb{R}^d$-valued process with independent increments that is compatible with $\{\mathcal{F}_t\}$, and that there exist $p \ge 1$ and $c$ such that
$$E\Bigl[\Bigl|\int_0^t U(s)\,dY(s)\Bigr|\Bigr] \le cE\Bigl[\int_0^t |U(s)|^p\,ds\Bigr]^{\frac{1}{p}} \tag{14.1}$$
for each $\mathbb{M}^{m\times d}$-valued, $\{\mathcal{F}_t\}$-predictable process $U$ such that the right side of (14.1) is finite. Then for each such $U$,
$$E\Bigl[\int_0^t U(s)\,dY(s)\Big|\mathcal{F}^Y_t\Bigr] = \int_0^t E[U(s)|\mathcal{F}^Y_s]\,dY(s). \tag{14.2}$$

Proof. If $U = \sum_{i=1}^m \xi_i I_{(t_{i-1}, t_i]}$ with $0 = t_0 < \cdots < t_m$ and $\xi_i$ $\mathcal{F}_{t_{i-1}}$-measurable, (14.2) is immediate. The lemma then follows by approximation. $\square$

Lemma 14.4 Let $Y$ be as above, and let $\{\mathcal{F}_t\}$ be a filtration such that $Y$ is compatible with $\{\mathcal{F}_t\}$. Suppose that $M$ is an $\{\mathcal{F}_t\}$-martingale that is independent of $Y$. If $U$ is $\{\mathcal{F}_t\}$-predictable and
$$E\Bigl[\Bigl|\int_0^t U(s)\,dM(s)\Bigr|\Bigr] < \infty,$$
then
$$E\Bigl[\int_0^t U(s)\,dM(s)\Big|\mathcal{F}^Y_t\Bigr] = 0.$$

Applying the lemmas, we have the Zakai equation:
$$\langle\phi(t), f\rangle = \langle\phi(0), f\rangle + \int_0^t \langle\phi(s), \mathcal{L}f\rangle\,ds + \int_0^t \langle\phi(s), \alpha^{-2}fh\rangle\,dY(s).$$
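The representation $E^P[f(X(t))|\mathcal{F}^Y_t] = \int f(x(t))L(t, x, Y)\,\mu_X(dx)\big/\int L(t, x, Y)\,\mu_X(dx)$ suggests a weighted Monte Carlo ("particle") approximation: replace $\mu_X$ by the empirical measure of independently simulated signal paths, each carrying its own likelihood weight $L$. The sketch below (an illustration, not part of the notes; the linear model, discretization, and particle count are assumptions) compares the weighted estimate with the exact discrete-time Kalman filter for the same discretized model:

```python
import numpy as np

rng = np.random.default_rng(8)
dt, steps, alpha, npart = 0.02, 50, 1.0, 20000
a = 1.0 - dt                    # Euler factor for the signal dX = -X dt + dW

# one "true" signal path and its observation increments dY = X dt + alpha dV
x_true = 0.0
dY = np.empty(steps)
for k in range(steps):
    x_true = a * x_true + np.sqrt(dt) * rng.normal()
    dY[k] = x_true * dt + alpha * np.sqrt(dt) * rng.normal()

# weighted-particle approximation: each particle solves the signal SDE
# independently of Y; its log-weight solves d(log L) = (X dY - X^2 dt/2)/alpha^2
X = np.zeros(npart)
logL = np.zeros(npart)
for k in range(steps):
    X = a * X + np.sqrt(dt) * rng.normal(size=npart)
    logL += (X * dY[k] - 0.5 * X**2 * dt) / alpha**2
w = np.exp(logL - logL.max())
particle_mean = np.sum(w * X) / np.sum(w)

# exact Kalman filter for the same discretized linear model
m, P = 0.0, 0.0
for k in range(steps):
    m, P = a * m, a * a * P + dt                # predict
    S = P * dt * dt + alpha**2 * dt             # innovation variance of dY
    K = P * dt / S
    m = m + K * (dY[k] - m * dt)                # update
    P = P - K * dt * P
```

The particle estimate converges to the Kalman mean as the number of particles grows, since both target the same discretized model.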
15 Problems.

1. Let $N$ be a nonnegative integer-valued random variable with $E[N] < \infty$. Let $\xi_1, \xi_2, \ldots$ be iid with mean $m$ and independent of $N$. Show, using the definition of conditional expectation, that
$$E\Bigl[\sum_{k=1}^N \xi_k\Big|N\Bigr] = mN.$$

2. Let $\{\mathcal{F}_k\}$ be a discrete-time filtration, let $\xi_k$ be iid such that $\xi_k$ is $\mathcal{F}_k$-measurable and $(\xi_{k+1}, \xi_{k+2}, \ldots)$ is independent of $\mathcal{F}_k$, and let $\tau$ be an $\{\mathcal{F}_k\}$-stopping time.

(a) Show that $(\xi_{\tau+1}, \xi_{\tau+2}, \ldots)$ is independent of $\mathcal{F}_\tau$.

(b) Show that if the assumption that the $\xi_k$ are identically distributed is dropped, then the assertion in part (a) is no longer valid.

3. Let $\xi_1, \xi_2, \ldots$ be positive, iid random variables with $E[\xi_i] = 1$, and let $\mathcal{F}_k = \sigma(\xi_i : i \le k)$. Define
$$M_k = \prod_{i=1}^k \xi_i.$$
Show that $M_k$ is an $\{\mathcal{F}_k\}$-martingale.

4. Let $N(t)$ be a counting process with $E[N(t)] < \infty$, and let $\xi_1, \xi_2, \ldots$ be positive, iid random variables with $E[\xi_i] = 1$ that are independent of $N$. Define $M(t) = \prod_{i=1}^{N(t)}\xi_i$. Show that $M$ is a martingale. (Justify your answer using the properties of conditional expectations.)

5. Let $Y$ be a Poisson process with intensity $\lambda$, and let $\xi_1, \xi_2, \ldots$ be iid with mean $m$ and variance $\sigma^2$ that are independent of $Y$. Define
$$X(t) = \sum_{k=1}^{Y(t)}\xi_k.$$
Show that $X$ has stationary, independent increments, and calculate $E[X(t) - X(s)]$ and $\mathrm{Var}(X(t) - X(s))$.

6. Let $\tau$ be a discrete stopping time with range $\{t_1, t_2, \ldots\}$. Show that
$$E[Z|\mathcal{F}_\tau] = \sum_{k=1}^\infty E[Z|\mathcal{F}_{t_k}]I_{\{\tau = t_k\}}.$$
7. (a) Let $W$ denote standard Brownian motion, and let $\{t_i\}$ be a partition of the interval $[0, t]$. What is $\lim\sum|W(t_{i+1}) - W(t_i)|^3$ as $\max|t_{i+1} - t_i| \to 0$?

(b) What is $\lim\sum W(t_i)\bigl((W(t_{i+1}) - W(t_i))^2 - (t_{i+1} - t_i)\bigr)$ as $\max|t_{i+1} - t_i| \to 0$?

(c) Use the limits in parts (a) and (b) to directly calculate $\int_0^t W^2(s)\,dW(s)$ from the definition of the stochastic integral.
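Both limits in Problem 7 can be observed numerically along a fine partition; the answer to part (c) matches the identity $\int_0^t W^2\,dW = W(t)^3/3 - \int_0^t W(s)\,ds$ that Itô's formula gives for $f(x) = x^3/3$. The sketch below (illustrative only, with a uniform partition) is not a proof, but it shows the relevant sums converging:

```python
import numpy as np

rng = np.random.default_rng(9)
n, t = 200000, 1.0
dt = t / n
W = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))])
dW = np.diff(W)

cubic_var = np.sum(np.abs(dW)**3)        # part (a): tends to 0
ito_sum = np.sum(W[:-1]**2 * dW)         # left-endpoint Riemann-Ito sum
identity = W[-1]**3 / 3.0 - np.sum(W[:-1]) * dt  # W(t)^3/3 - int_0^t W ds
```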
8. Let $0 \le \tau_1 \le \tau_2 \le \cdots$ be $\{\mathcal{F}_t\}$-stopping times, and for $k = 1, 2, \ldots$, let $\xi_k$ be $\mathcal{F}_{\tau_k}$-measurable. Define
$$X(t) = \sum_{k=1}^\infty \xi_k I_{[\tau_k, \tau_{k+1})}(t),$$
and show that $X$ is $\{\mathcal{F}_t\}$-adapted.

9. (a) For each $n > 0$, show that $M_n(t) = \int_0^t W(s)^n\,dW(s)$ is square integrable and that $M_n$ is a martingale. (You do not need to explicitly compute $M_n$.)

(b) Show that $Z(t) = \int_0^t e^{W(s)^4}\,dW(s)$ is a local martingale. (It is not a martingale, since it is not integrable.) In particular, find a sequence of stopping times $\tau_n$ such that $Z(\cdot\wedge\tau_n)$ is a martingale.

10. Let $Y$ be a Poisson process with intensity $\lambda$.

(a) Find a cadlag process $U$ such that
$$e^{Y(t)} = 1 + \int_0^t U(s)\,dY(s). \tag{15.1}$$

(b) Use (15.1) and the fact that $Y(t) - \lambda t$ is a martingale to compute $E[e^{Y(t)}]$.

(c) Define
$$Z(t) = \int_0^t e^{Y(s-)}\,dY(s).$$
Again use the fact that $Y(t) - \lambda t$ is a martingale to calculate $E[Z(t)]$ and $\mathrm{Var}(Z(t))$.
11. Let $N$ be a Poisson process with parameter $\lambda$, and let $X_1, X_2, \ldots$ be a sequence of Bernoulli trials with parameter $p$. Assume that the $X_k$ are independent of $N$. Let
$$M(t) = \sum_{k=1}^{N(t)} X_k.$$
What is the distribution of $M(t)$?

12. Let $N$ be a Poisson process with parameter $\lambda$. For $t < s$:

(a) What is the covariance of $N(t)$ and $N(s)$?

(b) Calculate the probability $P\{N(t) = 1, N(s) = 1\}$.

(c) Give an event in terms of $S_1$ and $S_2$ that is equivalent to the event $\{N(t) = 1, N(s) = 1\}$, and use the calculation in part (b) to calculate the joint density function for $S_1$ and $S_2$.
13. Let $Y$ be a continuous semimartingale. Solve the stochastic differential equation
$$dX = aX\,dt + bX\,dY, \qquad X(0) = x_0.$$
Hint: Look for a solution of the form $X(t) = A\exp\{Bt + CY(t) + D[Y]_t\}$ for some set of constants $A, B, C, D$.

14. Let $W$ be standard Brownian motion, and suppose $(X, Y)$ satisfies
$$X(t) = x + \int_0^t Y(s)\,ds$$
$$Y(t) = y - \int_0^t X(s)\,ds + \int_0^t cX(s)\,dW(s),$$
where $c \ne 0$ and $x^2 + y^2 > 0$. Assuming all moments are finite, define $m_1(t) = E[X(t)^2]$, $m_2(t) = E[X(t)Y(t)]$, and $m_3(t) = E[Y(t)^2]$. Find a system of linear differential equations satisfied by $(m_1, m_2, m_3)$, and show that the expected total energy $E[X(t)^2 + Y(t)^2]$ is asymptotic to $ke^{\lambda t}$ for some $k > 0$ and $\lambda > 0$.

15. Let $X$ and $Y$ be independent Poisson processes. Show that, with probability one, $X$ and $Y$ do not have simultaneous discontinuities and that $[X, Y]_t = 0$ for all $t \ge 0$.
16. Two local martingales, $M$ and $N$, are called orthogonal if $[M, N]_t = 0$ for all $t \ge 0$.

(a) Show that if $M$ and $N$ are orthogonal, then $[M + N]_t = [M]_t + [N]_t$.

(b) Show that if $M$ and $N$ are orthogonal, then $M$ and $N$ do not have simultaneous discontinuities.

(c) Suppose that the $M_n$ are pairwise orthogonal, square integrable martingales (that is, $[M_n, M_m]_t = 0$ for $n \ne m$). Suppose that $\sum_{k=1}^\infty E[[M_k]_t] < \infty$ for each $t$. Show that
$$M \equiv \sum_{k=1}^\infty M_k$$
converges in $L^2$ and that $M$ is a square integrable martingale with $[M] = \sum_{k=1}^\infty [M_k]$.
17. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ be independent, unit Poisson processes. For $\lambda_k > 0$ and $c_k \in \mathbb{R}$, define
$$M_n(t) = \sum_{k=1}^n c_k(Y_k(\lambda_k t) - X_k(\lambda_k t)).$$

(a) Suppose $\sum c_k^2\lambda_k < \infty$. Show that for each $T > 0$,
$$\lim_{n,m\to\infty} E[\sup_{t\le T}(M_n(t) - M_m(t))^2] = 0,$$
and hence we can define
$$M(t) = \sum_{k=1}^\infty c_k(Y_k(\lambda_k t) - X_k(\lambda_k t)).$$

(b) Under the assumptions of part (a), show that $M$ is a square integrable martingale, and calculate $[M]$.

18. Suppose in Problem 17 that $\sum c_k^2\lambda_k < \infty$ but $\sum |c_k|\lambda_k = \infty$. Show that for $t > 0$, $T_t(M) = \infty$ a.s. (Be careful with this. In general, the total variation of a sum is not the sum of the total variations.)
19. Let $W$ be standard Brownian motion. Use Itô's formula to show that
$$M(t) = e^{\theta W(t) - \frac{1}{2}\theta^2 t}$$
is a martingale. (Note that the martingale property can be checked easily by direct calculation; however, the problem asks you to use Itô's formula to check the martingale property.)

20. Let $N$ be a Poisson process with parameter $\lambda$. Use Itô's formula to show that
$$M(t) = e^{\theta N(t) - \lambda(e^\theta - 1)t}$$
is a martingale.

21. Let $X$ satisfy
$$X(t) = x + \int_0^t \sigma X(s)\,dW(s) + \int_0^t bX(s)\,ds,$$
and let $Y = X^2$.

(a) Derive a stochastic differential equation satisfied by $Y$.

(b) Find $E[X(t)^2]$ as a function of $t$.
22. Suppose that the solution of $dX = b(X)\,dt + \sigma(X)\,dW$, $X(0) = x$, is unique for each $x$. Let $\tau = \inf\{t > 0 : X(t) \notin (\alpha, \beta)\}$, and suppose that for some $\alpha < x < \beta$,
$$P\{\tau < \infty, X(\tau) = \alpha|X(0) = x\} > 0 \quad \text{and} \quad P\{\tau < \infty, X(\tau) = \beta|X(0) = x\} > 0.$$

(a) Show that $P\{\tau < T, X(\tau) = \alpha|X(0) = x\}$ is a nonincreasing function of $x$, $\alpha < x < \beta$.

(b) Show that there exists a $T > 0$ such that
$$\inf_{\alpha<x<\beta}\max\bigl(P\{\tau < T, X(\tau) = \alpha|X(0) = x\},\ P\{\tau < T, X(\tau) = \beta|X(0) = x\}\bigr) > 0.$$

(c) Let $\tau$ be a nonnegative random variable. Suppose that there exist $T > 0$ and $\rho < 1$ such that for each $n$, $P\{\tau > (n+1)T|\tau > nT\} < \rho$. Show that $E[\tau] < \infty$.

(d) Show that $E[\tau] < \infty$.

23. Let $dX = bX^2\,dt + cX\,dW$, $X(0) > 0$.

(a) Show that $X(t) > 0$ for all $t$ a.s.

(b) For what values of $b$ and $c$ does $\lim_{t\to\infty} X(t) = 0$ a.s.?

24. Let $dX = (a - bX)\,dt + \sqrt{X}\,dW$, $X(0) > 0$, where $a$ and $b$ are positive constants. Let $\tau = \inf\{t > 0 : X(t) = 0\}$.

(a) For what values of $a$ and $b$ is $P\{\tau < \infty\} = 1$?

(b) For what values of $a$ and $b$ is $P\{\tau = \infty\} = 1$?
25. Let $M$ be a $k$-dimensional, continuous Gaussian process with stationary, mean zero, independent increments and $M(0) = 0$. Let $B$ be a $k\times k$ matrix all of whose eigenvalues have negative real parts, and let $X$ satisfy
$$X(t) = x + \int_0^t BX(s)\,ds + M(t).$$
Show that $\sup_t E[|X(t)|^n] < \infty$ for all $n$. (Hint: Let $Z(t) = CX(t)$ for a judiciously selected, nonsingular $C$; show that $Z$ satisfies an equation of the same form as $X$; show that $\sup_t E[|Z(t)|^n] < \infty$; and conclude that the same must hold for $X$.)

26. Let $X(t, x)$ be as in (8.3) with $\sigma$ and $b$ continuous. Let $D \subset \mathbb{R}^d$ be open, and let $\tau(x)$ be the exit time from $D$ starting from $x$, that is,
$$\tau(x) = \inf\{t : X(t, x) \notin D\}.$$
Assume that $h$ is bounded and continuous. Suppose $x_0 \in \partial D$ and
$$\lim_{x\to x_0} P\{\tau(x) < \epsilon\} = 1, \qquad \forall\epsilon > 0.$$
Show that
$$\lim_{x\to x_0} P\{|X(\tau(x), x) - x_0| > \epsilon\} = 0 \qquad \forall\epsilon > 0,$$
and hence, if $f$ is defined by (8.4), $f(x) \to h(x_0)$ as $x \to x_0$.
27. (Central limit theorem for random matrices) Let $A_1, A_2, \ldots$ be independent, identically distributed, matrix-valued random variables with expectation zero and finite variance. Define
$$Y_n(t) = \frac{1}{\sqrt{n}}\sum_{k=1}^{[nt]} A_k$$
and
$$X_n(t) = \Bigl(I + \frac{1}{\sqrt{n}}A_1\Bigr)\Bigl(I + \frac{1}{\sqrt{n}}A_2\Bigr)\cdots\Bigl(I + \frac{1}{\sqrt{n}}A_{[nt]}\Bigr).$$
Show that $X_n$ satisfies a stochastic differential equation driven by $Y_n$, conclude that the sequence $X_n$ converges in distribution, and characterize the limit.

In Problems 28 and 29, assume that $d = 1$, that $\sigma$ and $b$ are continuous, and that
$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds.$$
Recall that $Lf(x) = \frac{1}{2}\sigma(x)^2 f''(x) + b(x)f'(x)$.

28. Let $\alpha < \beta$, and suppose that there is a function $f$ that is $C^2$ and satisfies $Lf(x) > 0$ on $[\alpha, \beta]$. Define $\tau = \inf\{t : X(t) \notin (\alpha, \beta)\}$. Show that $E[\tau] < \infty$.

29. Let $\alpha < \beta$, and suppose that $\inf_{x\in(\alpha,\beta)}\sigma(x)^2 > 0$. Define $\tau = \inf\{t : X(t) \notin (\alpha, \beta)\}$. Show that $E[\tau] < \infty$.
In Problems 30 through 33, let $X$ be real-valued and satisfy $dX = \sigma(X)\,dW + b(X)\,dt$, where $\sigma$ is bounded and strictly positive, and suppose that $v(x) = x^2$ satisfies $Lv \le K - \epsilon v$ for some $\epsilon > 0$. Let $\tau_0 = \inf\{t : X(t) \le 0\}$.

30. Show that if $E[X(0)^2] < \infty$, then $E[\tau_0] < \infty$.

31. Assume that $E[X(0)^2] < \infty$. Show that if $f$ is twice continuously differentiable with bounded first derivative, then
$$\lim_{t\to\infty}\frac{1}{t}\int_0^t Lf(X(s))\,ds = 0,$$
where convergence is in $L^2$. (Convergence is also almost sure. You can have 2 points extra credit if you show that.)

32. Show that for every bounded continuous $g$, there exists a constant $c_g$ such that
$$\lim_{t\to\infty}\frac{1}{t}\int_0^t g(X(s))\,ds = c_g.$$
Hint: Show that there exist $c_g$ and $f$ such that $Lf = g - c_g$. Remark: In fact, under these assumptions the diffusion process has a stationary distribution $\pi$ and $c_g = \int g\,d\pi$.

33. Show that if $f$ is twice continuously differentiable with bounded first derivative, then
$$W_n(t) \equiv \frac{1}{\sqrt{n}}\int_0^{nt} Lf(X(s))\,ds$$
converges in distribution to $\sigma W$ for some constant $\sigma$.
Ornstein-Uhlenbeck Process. (Problems 34-40.) The Ornstein-Uhlenbeck process was originally introduced as a model for the velocity of physical Brownian motion,
$$dV = -\lambda V\,dt + \sigma\,dW,$$
where $\lambda, \sigma > 0$. The location of a particle with this velocity is then given by
$$X(t) = X(0) + \int_0^t V(s)\,ds.$$
Explore the relationship between this model for physical Brownian motion and the usual mathematical model. In particular, if space and time are rescaled to give
$$X_n(t) = \frac{1}{\sqrt{n}} X(nt),$$
what happens to $X_n$ as $n \to \infty$?
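Before attacking the rescaling question analytically, one can watch it numerically. Below is a minimal Euler-Maruyama sketch; the values $\lambda = \sigma = 1$, the step size, the horizon, and the seed are illustrative assumptions, not part of the problem statement.

```python
# Numerical sketch for the Ornstein-Uhlenbeck questions: Euler-Maruyama for
# dV = -lam*V dt + sig*dW, with X(t) = X(0) + int_0^t V(s) ds, and the
# rescaled value X_n(1) = X(n)/sqrt(n).  lam, sig, dt, n are illustrative.
import math, random

random.seed(1)
lam, sig, dt = 1.0, 1.0, 0.01

def ou_location_path(T, v0=0.0, x0=0.0):
    """Return X on the grid 0, dt, 2*dt, ..., T (as a list)."""
    steps = round(T / dt)
    v, x, xs = v0, x0, [x0]
    for _ in range(steps):
        v += -lam * v * dt + sig * math.sqrt(dt) * random.gauss(0.0, 1.0)
        x += v * dt
        xs.append(x)
    return xs

n = 100
path = ou_location_path(T=n * 1.0)      # X on [0, n]
X_n_1 = path[-1] / math.sqrt(n)         # X_n(1) = X(n) / sqrt(n)
```

Repeating this for several values of n and histogramming `X_n_1` suggests what kind of limit to look for in Problem 34.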
34. Derive the limit of $X_n$ directly from the stochastic differential equation. (Note: No fancy limit theorem is needed.) What type of convergence do you obtain?
35. Calculate $E[V(t)^4]$. Show (without necessarily calculating explicitly) that if $E[|V(0)|^k] < \infty$, then $\sup_{0 \leq t < \infty} E[|V(t)|^k] < \infty$.
36. Compute the stationary distribution for V. (One approach is given in Section 9. Can you find another?)
37. Let g be continuous with compact support, and let $c_g = \int g\,d\pi$, where $\pi$ is the stationary distribution for V. Define
$$Z_n(t) = \frac{1}{\sqrt{n}} \Big( \int_0^{nt} g(V(s))\,ds - n c_g t \Big).$$
Show that $Z_n$ converges in distribution and identify the limit.
38. Note that Problem 34 is a result of the same form as Problem 37 with $g(v) = v$, so the condition that g be continuous with compact support in Problem 37 is not necessary. Find the most general class of g's you can for which a limit theorem holds.
39. Consider $X(t) = X(0) + \int_0^t V(s)\,ds$ with the following modification. Assume that $X(0) > 0$ and keep $X(t) \geq 0$ by switching the sign of the velocity each time X hits zero. Derive the stochastic differential equation satisfied by $(X, V)$ and prove the analogue of Problem 34. (See Lemma 11.1.)
40. Consider the Ornstein-Uhlenbeck process in $\mathbb{R}^d$,
$$dV = -\lambda V\,dt + \sigma\,dW,$$
where W is now d-dimensional standard Brownian motion. Redo as much of the above as you can. In particular, extend the model in Problem 39 to convex sets in $\mathbb{R}^d$.
41. Let X be a diffusion process with generator L. Suppose that h is bounded and $C^2$ with $h > 0$, and that Lh is bounded. Show that
$$L(t) = \frac{h(X(t))}{h(X(0))} \exp\Big\{ -\int_0^t \frac{Lh(X(s))}{h(X(s))}\,ds \Big\}$$
is a martingale with $E[L(t)] = 1$.
42. For $x > 0$, let
$$X(t) = x + \int_0^t \sigma(X(s))\,dW(s)$$
and $\tau = \inf\{t : X(t) = 0\}$. Give conditions on $\sigma$, as general as you can make them, that imply $E[\tau] < \infty$.
43. Let $X(t) = X(0) + W(t)$ where $W = (W_1, W_2)$ is two-dimensional standard Brownian motion. Let $Z = (R, \Theta)$ be the polar coordinates for the point $(X_1, X_2)$. Derive a stochastic differential equation satisfied by Z. Your answer should be of the form
$$Z(t) = Z(0) + \int_0^t \sigma(Z(s))\,dW(s) + \int_0^t b(Z(s))\,ds.$$
44. Assume that $d = 1$. Let
$$X(t, x) = x + \int_0^t \sigma(X(s, x))\,dW(s) + \int_0^t b(X(s, x))\,ds,$$
where $\sigma$ and b have bounded continuous derivatives. Derive the stochastic differential equation that should be satisfied by
$$Y(t, x) = \frac{\partial}{\partial x} X(t, x)$$
if the derivative exists, and show that
$$Y(t, x) \equiv \lim_{h \to 0} \frac{1}{h}\big(X(t, x + h) - X(t, x)\big)$$
exists in $L^2$, where Y is the solution of the derived equation.
45. Let N be a unit Poisson process and let
$$W_n(t) = \frac{1}{\sqrt{n}} \int_0^{nt} (-1)^{N(s)}\,ds.$$
(Recall that $W_n \Rightarrow W$ where W is standard Brownian motion.) Show that there exist martingales $M_n$ such that $W_n = M_n + V_n$ and $V_n \to 0$, but $T_t(V_n) \to \infty$.
46. Let $W_n$ be as in Problem 45. Let $\sigma$ have a bounded, continuous derivative, and let
$$X_n(t) = \int_0^t \sigma(X_n(s))\,dW_n(s).$$
Show that $X_n \Rightarrow X$ for some X and identify the stochastic differential equation satisfied by X. Note that by Problem 45, the conditions of Theorem 10.13 are not satisfied for
$$X_n(t) = \int_0^t \sigma(X_n(s))\,dM_n(s) + \int_0^t \sigma(X_n(s))\,dV_n(s). \tag{15.2}$$
Integrate the second term on the right of (15.2) by parts, and show that the sequence of equations that results does satisfy the conditions of Theorem 10.13.
Central limit theorem for Markov chains. (Problems 47-54.) Let $\xi_0, \xi_1, \ldots$ be an irreducible Markov chain on a finite state space $\{1, \ldots, d\}$, let $P = ((p_{ij}))$ denote its transition matrix, and let $\pi$ be its stationary distribution. For any function h on the state space, let $\bar{h}$ denote $\sum_i \pi_i h(i)$.
47. Show that
$$f(\xi_n) - \sum_{k=0}^{n-1} \big(Pf(\xi_k) - f(\xi_k)\big)$$
is a martingale.
48. Show that for any function h, there exists a solution to the equation $Pg - g = h - \bar{h}$, that is, to the system
$$\sum_j p_{ij} g(j) - g(i) = h(i) - \bar{h}.$$
49. The ergodic theorem for Markov chains states that
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^n h(\xi_k) = \bar{h}.$$
Use the martingale central limit theorem to prove convergence in distribution for
$$W_n(t) = \frac{1}{\sqrt{n}} \sum_{k=1}^{[nt]} \big(h(\xi_k) - \bar{h}\big).$$
50. Use the martingale central limit theorem to prove the analogue of Problem 49 for a continuous time finite Markov chain $\xi(t)$, $t \geq 0$. In particular, use the multidimensional theorem to prove convergence for the vector-valued process $U_n = (U_n^1, \ldots, U_n^d)$ defined by
$$U_n^k(t) = \frac{1}{\sqrt{n}} \int_0^{nt} \big(I_{\{\xi(s) = k\}} - \pi_k\big)\,ds.$$
51. Explore extensions of Problems 49 and 50 to infinite state spaces.
Limit theorems for stochastic differential equations driven by Markov chains
52. Show that $W_n$ defined in Problem 49 and $U_n$ defined in Problem 50 are not good sequences of semimartingales, in the sense that they fail to satisfy the hypotheses of Theorem 10.13. (The easiest approach is probably to show that the conclusion is not valid.)
53. Show that $W_n$ and $U_n$ can be written as $M_n + Z_n$ where $M_n$ is a good sequence and $Z_n \to 0$.
54. (Random evolutions) Let $\xi$ be as in Problem 50, and let $X_n$ satisfy
$$\dot{X}_n(s) = \sqrt{n}\, F(X_n(s), \xi(ns)).$$
Suppose $\sum_i F(x, i)\pi_i = 0$. Write $X_n$ as a stochastic differential equation driven by $U_n$, give conditions under which $X_n$ converges in distribution to a limit X, and identify the limit.
55. (a) Let W be a standard Brownian motion, let $\sigma_i$, $i = 1, 2$, be bounded, continuous functions, and suppose that
$$X_i(t) = X_i(0) + \int_0^t \sigma_i(s) X_i(s)\,dW(s), \qquad i = 1, 2.$$
Apply Itô's formula to find an SDE satisfied by $Z = X_1 X_2$.
(b) Let $W_1$ and $W_2$ be independent standard Brownian motions. Let
$$Y_i(t) = Y_i(0) + \int_0^t \sigma_i(s) Y_i(s)\,dW_i(s), \qquad i = 1, 2.$$
Find an SDE satisfied by $U = Y_1 Y_2$, and show that U is a martingale.
56. Suppose the price X of a tradeable asset is the unique solution of
$$X(t) = X(0) + Y\Big( \int_0^t \lambda(X(s))\,ds \Big) - \int_0^t \mu(X(s))\,ds, \tag{15.3}$$
where Y is a unit Poisson process, X(0) is independent of Y, Y and X(0) are defined on $(\Omega, \mathcal{F}, P)$, and $\lambda$ and $\mu$ are bounded and strictly positive. Let $\mathcal{F}_t = \sigma(X(s) : s \leq t)$. Find a measure Q equivalent to P such that $X(t)$, $0 \leq t \leq T$, is a martingale under Q and X(0) has the same distribution under Q as under P.
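An equation of the form (15.3) is easy to simulate by thinning: between jumps the state decays deterministically, and a unit jump of Y arrives at the current rate. The particular rate and drift functions, step size, and seed below are illustrative stand-ins for the bounded, strictly positive coefficients in the problem, not the problem's data.

```python
# Euler-type sketch of an equation like (15.3): between jumps X decays along
# dX = -mu(X) dt, and a +1 jump arrives with probability lam(X)*dt per step.
import math, random

random.seed(3)
dt = 0.001

def lam(x):  # illustrative: bounded, >= 1
    return 1.0 + 0.5 / (1.0 + x * x)

def mu(x):   # illustrative: bounded, strictly positive (range (0.75, 1.25))
    return 1.0 + 0.25 * math.tanh(x)

def simulate(x0, T):
    x, t = x0, 0.0
    while t < T:
        if random.random() < lam(x) * dt:   # thinning: one jump of Y
            x += 1.0
        x -= mu(x) * dt                     # drift term of (15.3)
        t += dt
    return x

x_T = simulate(2.0, 1.0)
```

Comparing the sample drift of many such paths with zero is one way to see why changing the jump intensity (the role of Q) can make X a martingale.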
57. Suppose that $\lambda, \mu : \mathbb{R} \to \mathbb{R}$ are Lipschitz.
(a) Show that the solution of (15.3) is unique.
(b) Let u be cadlag. Show that
$$x(t) = u(t) - \int_0^t \mu(x(s))\,ds \tag{15.4}$$
has a unique solution and that x is cadlag.
(c) Let $\Gamma(t, u) = x(t)$, where x is the unique solution of (15.4). Show that $\Gamma$ is nonanticipating in the sense that $\Gamma(t, u) = \Gamma(t, u_t)$, $t \geq 0$, where $u_t(s) = u(s \wedge t)$.
58. (Extra credit) Show that Q in Problem 56 is unique. (You might begin by showing that the distribution of X is the same under any Q satisfying the conditions of the problem.)
59. Let $\alpha, \beta \in (0, \infty)^2$, and let $X = (X_1, X_2)$ satisfy
$$X(t) = X(0) + \alpha Y_1\Big( \int_0^t \lambda_1(X(s))\,ds \Big) - \beta Y_2\Big( \int_0^t \lambda_2(X(s))\,ds \Big),$$
where $Y_1$ and $Y_2$ are independent, unit Poisson processes independent of X(0); $Y_1$, $Y_2$, X(0) are defined on $(\Omega, \mathcal{F}, P)$; $\alpha$ and $\beta$ are linearly independent; and $0 < \varepsilon \leq \lambda_1, \lambda_2 \leq \varepsilon^{-1}$ for some $\varepsilon > 0$. Show that there exists a probability measure Q equivalent to P under which X is a martingale and X(0) has the same distribution under Q as under P.