
AN INTRODUCTION TO

STOCHASTIC CALCULUS
Marta Sanz-Solé
Facultat de Matemàtiques
Universitat de Barcelona
September 18, 2012
Contents

1 A review of the basics on stochastic processes
1.1 The law of a stochastic process
1.2 Sample paths
2 The Brownian motion
2.1 Equivalent definitions of Brownian motion
2.2 A construction of Brownian motion
2.3 Path properties of Brownian motion
2.4 The martingale property of Brownian motion
2.5 Markov property
3 Itô's calculus
3.1 Itô's integral
3.2 The Itô integral as a stochastic process
3.3 An extension of the Itô integral
3.4 A change of variables formula: Itô's formula
3.4.1 One-dimensional Itô's formula
3.4.2 Multidimensional version of Itô's formula
4 Applications of the Itô formula
4.1 Burkholder-Davis-Gundy inequalities
4.2 Representation of L² Brownian functionals
4.3 Girsanov's theorem
5 Local time of Brownian motion and Tanaka's formula
6 Stochastic differential equations
6.1 Examples of stochastic differential equations
6.2 A result on existence and uniqueness of solution
6.3 Some properties of the solution
6.4 Markov property of the solution
7 Numerical approximations of stochastic differential equations
8 Continuous time martingales
8.1 Doob's inequalities for martingales
8.2 Local martingales
8.3 Quadratic variation of a local martingale
9 Stochastic integrals with respect to continuous martingales
10 Appendix 1: Conditional expectation
11 Appendix 2: Stopping times
1 A review of the basics on stochastic processes

This chapter is devoted to introducing the notion of a stochastic process and some general definitions related to it. For a more complete account of the topic, we refer the reader to [11]. Let us start with a definition.

Definition 1.1 A stochastic process with state space S is a family {X_i, i ∈ I} of random variables X_i : Ω → S indexed by a set I.
For successful progress in the analysis of such an object, some further structure on the index set I and on the state space S is required. In this course, we shall mainly deal with the particular cases I = ℕ, ℤ_+, ℝ_+, and S either a countable set or a subset of ℝ^d, d ≥ 1.
The basic problem statisticians are interested in is the analysis of the probability law (mostly described by some parameters) of characters exhibited by populations. For a fixed character described by a random variable X, they use a finite number of independent copies of X, a sample of X. For many purposes it is interesting to have samples of any size, and therefore to consider sequences {X_n, n ≥ 1}. It is important here to insist on the word "copies", meaning that the circumstances around the different outcomes of X do not change. It is a static world. Hence, they deal with stochastic processes {X_n, n ≥ 1} consisting of independent and identically distributed random variables.
This is not the setting we are interested in here. Instead, we would like to give stochastic models for phenomena of the real world which evolve as time goes by. Stochasticity is a modeling choice made in the face of incomplete knowledge and extreme complexity. Evolution, in contrast with statics, is what we observe in most phenomena in Physics, Chemistry, Biology, Economics, Life Sciences, etc. Stochastic processes are well suited for modeling such evolution phenomena. The interesting cases correspond to families of random variables X_i which are not independent. In fact, the famous classes of stochastic processes are described by means of the types of dependence between the variables of the process.
1.1 The law of a stochastic process

The probabilistic features of a stochastic process are gathered in the joint distributions of its variables, as given in the next definition.

Definition 1.2 The finite-dimensional joint distributions of the process {X_i, i ∈ I} consist of the multi-dimensional probability laws of any finite family of random vectors (X_{i_1}, ..., X_{i_m}), where i_1, ..., i_m ∈ I and m ≥ 1 is arbitrary.
Let us give an important example.
Example 1.1 A stochastic process {X_t, t ≥ 0} is said to be Gaussian if its finite-dimensional joint distributions are Gaussian laws.

Remember that in this case, the law of the random vector (X_{t_1}, ..., X_{t_m}) is characterized by two parameters:

μ(t_1, ..., t_m) = E(X_{t_1}, ..., X_{t_m}) = (E(X_{t_1}), ..., E(X_{t_m})),
Λ(t_1, ..., t_m) = (Cov(X_{t_i}, X_{t_j}))_{1≤i,j≤m}.
In the sequel we shall assume that I ⊂ ℝ_+ and S ⊂ ℝ, either countable or uncountable, and denote by ℝ^I the set of real-valued functions defined on I. A stochastic process {X_t, t ≥ 0} can be viewed as a random vector

X : Ω → ℝ^I.

Putting the appropriate σ-field of events on ℝ^I, say B(ℝ^I), one can define, as for random variables, the law of the process as the mapping

P_X(B) = P(X^{-1}(B)), B ∈ B(ℝ^I).

Mathematical results from measure theory tell us that P_X is defined by means of a procedure of extension of measures on cylinder sets, given by the family of all possible finite-dimensional joint distributions. This is a deep result.
In Example 1.1, we have defined a class of stochastic processes by means of the type of its finite-dimensional joint distributions. But does such an object exist? In other words, can one define a stochastic process by giving only its finite-dimensional joint distributions? Roughly speaking, the answer is yes, after adding some extra condition. The precise statement is a famous result by Kolmogorov that we now quote.
Theorem 1.1 Consider a family

{P_{t_1,...,t_n}, t_1 < ... < t_n, n ≥ 1, t_i ∈ I}   (1.1)

where:

1. P_{t_1,...,t_n} is a probability on ℝ^n,

2. if {t_{i_1} < ... < t_{i_m}} ⊂ {t_1 < ... < t_n}, the probability law P_{t_{i_1},...,t_{i_m}} is the marginal distribution of P_{t_1,...,t_n}.

Then there exists a stochastic process {X_t, t ∈ I}, defined on some probability space, such that its finite-dimensional joint distributions are given by (1.1). That is, the law of the random vector (X_{t_1}, ..., X_{t_n}) is P_{t_1,...,t_n}.
One can apply this theorem to Example 1.1 to show the existence of Gaussian processes, as follows.
Let K : I × I → ℝ be a symmetric, nonnegative definite function. That means:

for any s, t ∈ I, K(t, s) = K(s, t);

for any natural number n, arbitrary t_1, ..., t_n ∈ I and x_1, ..., x_n ∈ ℝ,

Σ_{i,j=1}^n K(t_i, t_j) x_i x_j ≥ 0.

Then there exists a Gaussian process {X_t, t ≥ 0} such that E(X_t) = 0 for any t ∈ I and Cov(X_{t_i}, X_{t_j}) = K(t_i, t_j), for any t_i, t_j ∈ I.
To prove this result, fix t_1, ..., t_n ∈ I and set μ = (0, ..., 0) ∈ ℝ^n, Λ = (K(t_i, t_j))_{1≤i,j≤n} and

P_{t_1,...,t_n} = N(0, Λ).

We denote by (X_{t_1}, ..., X_{t_n}) a random vector with law P_{t_1,...,t_n}. For any subset {t_{i_1}, ..., t_{i_m}} of {t_1, ..., t_n}, it holds that

A(X_{t_1}, ..., X_{t_n}) = (X_{t_{i_1}}, ..., X_{t_{i_m}}),

with A the m × n matrix whose (l, j) entry is δ_{t_j, t_{i_l}}, that is,

A = (δ_{t_j, t_{i_l}})_{1≤l≤m, 1≤j≤n},

where δ_{s,t} denotes the Kronecker delta function.
By the properties of Gaussian vectors, the random vector (X_{t_{i_1}}, ..., X_{t_{i_m}}) has an m-dimensional normal distribution with zero mean and covariance matrix AΛAᵗ. By the definition of A, it is trivial to check that

AΛAᵗ = (K(t_{i_l}, t_{i_k}))_{1≤l,k≤m}.

Hence the assumptions of Theorem 1.1 hold true and the result follows.
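As an illustration of this construction, one can take the kernel K(s, t) = s ∧ t (which will reappear as the Brownian covariance in Example 1.3) on a finite grid, check numerically that the matrix Λ = (K(t_i, t_j)) is nonnegative definite, and sample a Gaussian vector with law N(0, Λ). A minimal sketch in Python/NumPy; the grid, kernel and sample sizes are our choices, not from the text:

```python
import numpy as np

# Finite grid t_1 < ... < t_n and the kernel K(s, t) = min(s, t).
t = np.linspace(0.1, 1.0, 10)
K = np.minimum.outer(t, t)          # Λ = (K(t_i, t_j))_{1<=i,j<=n}

# Symmetry and nonnegative definiteness (all eigenvalues >= 0).
assert np.allclose(K, K.T)
eigvals = np.linalg.eigvalsh(K)
print("smallest eigenvalue:", eigvals.min())

# Sample a vector with law N(0, Λ) via the Cholesky factor: X = L ξ.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(K)
X = L @ rng.standard_normal(len(t))

# Empirical covariance over many samples approaches Λ.
samples = L @ rng.standard_normal((len(t), 50_000))
emp_cov = samples @ samples.T / samples.shape[1]
print("max covariance error:", np.abs(emp_cov - K).max())
```

The Cholesky factorization succeeding is itself a certificate of (strict) positive definiteness on this grid.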
1.2 Sample paths

In the previous discussion, stochastic processes were considered as random vectors. In the context of modeling, what matters are the observed values of the process. Observations correspond to fixed values of ω ∈ Ω. This new point of view leads to the next definition.

Definition 1.3 The sample paths of a stochastic process {X_t, t ∈ I} are the family of functions indexed by ω ∈ Ω, X(ω) : I → S, defined by X(ω)(t) = X_t(ω).

Sample paths are also called trajectories.
Example 1.2 Consider random arrivals of customers at a store. We set our clock at zero and measure the times between two consecutive arrivals. They are random variables X_1, X_2, .... We assume X_i > 0, a.s. Set S_0 = 0 and S_n = Σ_{j=1}^n X_j, n ≥ 1. S_n is the time of the n-th arrival. The process we would like to introduce is N_t, giving the number of customers who have visited the store during the time interval [0, t], t ≥ 0.
Clearly, N_0 = 0 and for t > 0, N_t = k if and only if

S_k ≤ t < S_{k+1}.

The stochastic process {N_t, t ≥ 0} takes values in ℤ_+. Its sample paths are increasing right-continuous functions, with jumps of size one at the random times S_n, n ≥ 1. It is a particular case of a counting process. Sample paths of counting processes are always increasing right-continuous functions whose jumps are natural numbers.
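A short simulation along the lines of this example; taking (as an assumption of ours, not of the text) exponential interarrival times X_i makes N_t a Poisson process of rate λ, so E(N_t) = λt. The rate and horizon below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, horizon, reps = 2.0, 10.0, 5000

# Interarrival times X_j > 0 a.s.; S_n = X_1 + ... + X_n is the n-th arrival.
X = rng.exponential(scale=1.0 / lam, size=(reps, 60))
S = np.cumsum(X, axis=1)

# N_t = k iff S_k <= t < S_{k+1}: count the arrivals in [0, t].
N_horizon = (S <= horizon).sum(axis=1)

# One sample path on a time grid: nondecreasing, with N_0 = 0.
grid = np.linspace(0.0, horizon, 101)
path = (S[0][None, :] <= grid[:, None]).sum(axis=1)
assert path[0] == 0 and np.all(np.diff(path) >= 0)

# For exponential interarrivals, E(N_t) = lam * t (= 20 here).
print("mean number of arrivals:", N_horizon.mean())
```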
Example 1.3 The evolution of prices of risky assets can be described by real-valued stochastic processes {X_t, t ≥ 0} with continuous, although very rough, sample paths. They are generalizations of the Brownian motion.

The Brownian motion, also called the Wiener process, is a Gaussian process {B_t, t ≥ 0} with the following parameters:

E(B_t) = 0,
E(B_s B_t) = s ∧ t.

This defines the finite-dimensional distributions and therefore the existence of the process via Kolmogorov's theorem (see Theorem 1.1).
Before giving a heuristic motivation for the preceding definition of Brownian motion, we introduce two further notions.

A stochastic process {X_t, t ∈ I} has independent increments if for any t_1 < t_2 < ... < t_k the random variables X_{t_2} − X_{t_1}, ..., X_{t_k} − X_{t_{k−1}} are independent.

A stochastic process {X_t, t ∈ I} has stationary increments if for any t_1 < t_2, the law of the random variable X_{t_2} − X_{t_1} is the same as that of X_{t_2−t_1}.
Brownian motion is named after Robert Brown, a British botanist who observed and reported in 1827 the irregular movements of pollen particles suspended in a liquid. Assume that, when starting the observation, the pollen particle is at position x = 0. Denote by B_t the position of (one coordinate of) the particle at time t > 0. For physical reasons, the trajectories must be continuous functions, and because of the erratic movement, it seems reasonable to say that {B_t, t ≥ 0} is a stochastic process. It also seems reasonable to assume that the change in position of the particle during the time interval [t, t+s] is independent of its previous positions at times < t, and therefore to assume that the process has independent increments. The fact that such an increment must be stationary is explained by kinetic theory, assuming that the temperature during the experiment remains constant.
The model for the law of B_t was given by Einstein in 1905. More precisely, Einstein's definition of Brownian motion is that of a stochastic process with independent and stationary increments such that the law of an increment B_t − B_s, s < t, is Gaussian with zero mean and E(B_t − B_s)² = t − s. This definition is equivalent to the one given before.
2 The Brownian motion

2.1 Equivalent definitions of Brownian motion

This chapter is devoted to the study of Brownian motion, the process introduced in Example 1.3, which we now recall.

Definition 2.1 The stochastic process {B_t, t ≥ 0} is a one-dimensional Brownian motion if it is Gaussian, zero mean and with covariance function given by Γ(s, t) = E(B_t B_s) = s ∧ t.
The existence of such a process is ensured by Kolmogorov's theorem. Indeed, it suffices to check that

(s, t) ↦ Γ(s, t) = s ∧ t

is nonnegative definite. That means, for any t_i, t_j ≥ 0 and any real numbers a_i, a_j, i, j = 1, ..., m,

Σ_{i,j=1}^m a_i a_j Γ(t_i, t_j) ≥ 0.

But

s ∧ t = ∫_0^∞ 1_{[0,s]}(r) 1_{[0,t]}(r) dr.

Hence,

Σ_{i,j=1}^m a_i a_j (t_i ∧ t_j) = Σ_{i,j=1}^m a_i a_j ∫_0^∞ 1_{[0,t_i]}(r) 1_{[0,t_j]}(r) dr
= ∫_0^∞ ( Σ_{i=1}^m a_i 1_{[0,t_i]}(r) )² dr ≥ 0.

Notice also that, since E(B_0²) = 0, the random variable B_0 is zero almost surely.
Each random variable B_t, t > 0, of the Brownian motion has a density, namely

p_t(x) = (1/√(2πt)) exp(−x²/(2t)),

while for t = 0, its density is a Dirac mass at zero, δ_{0}.
Differentiating p_t(x) once with respect to t, and then twice with respect to x, easily yields

∂_t p_t(x) = (1/2) ∂²_x p_t(x),
p_0(x) = δ_{0}.

This is the heat equation on ℝ with initial condition p_0(x) = δ_{0}. That means, as time evolves, the density of the random variables of the Brownian motion behaves like a diffusive physical phenomenon.
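The heat equation can be checked numerically by comparing centered finite differences of p_t(x) in t and in x; the evaluation point and step size below are arbitrary choices of ours:

```python
import math

def p(t, x):
    # Gaussian density p_t(x) = exp(-x^2/(2t)) / sqrt(2*pi*t)
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

t0, x0, h = 1.0, 0.5, 1e-4

# Centered difference approximations of d/dt p and (1/2) d^2/dx^2 p.
dp_dt = (p(t0 + h, x0) - p(t0 - h, x0)) / (2 * h)
d2p_dx2 = (p(t0, x0 + h) - 2 * p(t0, x0) + p(t0, x0 - h)) / h ** 2

print(dp_dt, 0.5 * d2p_dx2)   # both ≈ -0.132
assert abs(dp_dt - 0.5 * d2p_dx2) < 1e-5
```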
There are equivalent definitions of Brownian motion, such as the one given in the next result.

Proposition 2.1 A stochastic process {X_t, t ≥ 0} is a Brownian motion if and only if

(i) X_0 = 0, a.s.,

(ii) for any 0 ≤ s < t, the random variable X_t − X_s is independent of the σ-field generated by {X_r, 0 ≤ r ≤ s}, denoted σ(X_r, 0 ≤ r ≤ s), and X_t − X_s is a N(0, t − s) random variable.
Proof: Let us first assume that {X_t, t ≥ 0} is a Brownian motion. Then E(X_0²) = 0. Thus, X_0 = 0 a.s.
Let H_s and H̃_s be the vector subspaces of L²(Ω) spanned by (X_r, 0 ≤ r ≤ s) and (X_{s+u} − X_s, u ≥ 0), respectively. Since for any 0 ≤ r ≤ s

E(X_r (X_{s+u} − X_s)) = 0,

H_s and H̃_s are orthogonal in L²(Ω). Consequently, X_t − X_s is independent of the σ-field σ(X_r, 0 ≤ r ≤ s).
Since linear combinations of Gaussian random variables are also Gaussian, X_t − X_s is normal, with E(X_t − X_s) = 0 and

E(X_t − X_s)² = t + s − 2s = t − s.

This ends the proof of properties (i) and (ii).
Assume now that (i) and (ii) hold true. Then the finite-dimensional distributions of {X_t, t ≥ 0} are multidimensional normal, and for 0 ≤ s ≤ t,

E(X_t X_s) = E((X_t − X_s + X_s) X_s) = E((X_t − X_s) X_s) + E(X_s²)
= E(X_t − X_s) E(X_s) + E(X_s²) = E(X_s²) = s = s ∧ t.

□
Remark 2.1 We shall see later that Brownian motion has continuous sample paths. The description of the process given in the preceding proposition tells us that such a process is a model for a random evolution which starts from x = 0 at time t = 0, such that the qualitative change on time increments only depends on their length (stationary law), and such that the future evolution of the process is independent of its past (Markov property).
Remark 2.2 It is easy to prove that if B = {B_t, t ≥ 0} is a Brownian motion, so is −B = {−B_t, t ≥ 0}. Moreover, for any λ > 0, the process

B^λ = {(1/λ) B_{λ²t}, t ≥ 0}

is also a Brownian motion. This means that, zooming in or out, we will observe the same sort of behaviour. This is called the scaling property of Brownian motion.
2.2 A construction of Brownian motion

There are several ways to obtain a Brownian motion. Here we shall give P. Lévy's construction, which also provides the continuity of the sample paths. Before going through the details of this construction, we mention an alternative.

Brownian motion as limit of a random walk

Let {ξ_j, j ∈ ℕ} be a sequence of independent, identically distributed random variables with mean zero and variance σ² > 0. Consider the sequence of partial sums defined by S_0 = 0, S_n = Σ_{j=1}^n ξ_j. The sequence {S_n, n ≥ 0} is a Markov chain, and also a martingale.
Let us consider the continuous time stochastic process defined by linear interpolation of {S_n, n ≥ 0}, as follows. For any t ≥ 0, let [t] denote its integer part. Then set

Y_t = S_{[t]} + (t − [t]) ξ_{[t]+1},   (2.1)

for any t ≥ 0.
The next step is to scale the sample paths of {Y_t, t ≥ 0}. By analogy with the scaling in the statement of the central limit theorem, we set

B_t^{(n)} = (1/(σ√n)) Y_{nt},   (2.2)

t ≥ 0.
A famous result in probability theory, Donsker's theorem, tells us that the sequence of processes {B_t^{(n)}, t ≥ 0}, n ≥ 1, converges in law to the Brownian motion. The reference sample space is the set of continuous functions vanishing at zero. Hence, in proving the statement, we obtain continuity of the sample paths of the limit.
Donsker's theorem is the infinite-dimensional version of the above-mentioned central limit theorem. Considering s = k/n, t = (k+1)/n, the increment B_t^{(n)} − B_s^{(n)} = (1/(σ√n)) ξ_{k+1} is a random variable with mean zero and variance t − s. Hence B^{(n)} is not that far from the Brownian motion, and this is what Donsker's theorem proves.
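A quick Monte Carlo illustration of the scaling (2.2): taking, for instance, ±1 coin-flip steps (so σ = 1), the rescaled value B_1^{(n)} = S_n/√n should be approximately N(0, 1) for large n. The step distribution and sample sizes are our choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 400, 20000

# i.i.d. steps with mean 0 and variance sigma^2 = 1.
xi = rng.choice([-1.0, 1.0], size=(reps, n))
S_n = xi.sum(axis=1)

# B^{(n)}_1 = S_n / (sigma * sqrt(n)); by Donsker / CLT, close to N(0, 1).
B1 = S_n / np.sqrt(n)
print("mean:", B1.mean(), "variance:", B1.var())
assert abs(B1.mean()) < 0.05 and abs(B1.var() - 1.0) < 0.05
```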
P. Lévy's construction of Brownian motion

An important ingredient in the procedure is a sequence of functions defined on [0, 1], termed Haar functions, defined as follows:

h_0(t) = 1,
h_n^k(t) = 2^{n/2} 1_{[2k/2^{n+1}, (2k+1)/2^{n+1})}(t) − 2^{n/2} 1_{[(2k+1)/2^{n+1}, (2k+2)/2^{n+1})}(t),

where n ≥ 0 and k ∈ {0, 1, ..., 2^n − 1}.
The set of functions (h_0, h_n^k) is a complete orthonormal system (CONS) of L²([0, 1], B([0, 1]), λ), where λ stands for the Lebesgue measure. Consequently, for any f ∈ L²([0, 1], B([0, 1]), λ), we can write the expansion

f = ⟨f, h_0⟩ h_0 + Σ_{n=0}^∞ Σ_{k=0}^{2^n−1} ⟨f, h_n^k⟩ h_n^k,   (2.3)

where the notation ⟨·, ·⟩ means the inner product in L²([0, 1], B([0, 1]), λ).
Using (2.3), we define an isometry between L²([0, 1], B([0, 1]), λ) and L²(Ω, F, P) as follows. Consider a family of independent random variables with law N(0, 1), (N_0, N_n^k). Then, for f ∈ L²([0, 1], B([0, 1]), λ), set

I(f) = ⟨f, h_0⟩ N_0 + Σ_{n=0}^∞ Σ_{k=0}^{2^n−1} ⟨f, h_n^k⟩ N_n^k.

Clearly,

E(I(f)²) = ‖f‖²₂.

Hence I defines an isometry between the space of random variables L²(Ω, F, P) and L²([0, 1], B([0, 1]), λ). Moreover, since

I(f) = lim_{m→∞} ( ⟨f, h_0⟩ N_0 + Σ_{n=0}^m Σ_{k=0}^{2^n−1} ⟨f, h_n^k⟩ N_n^k ),

the random variable I(f) is N(0, ‖f‖²₂), and by Parseval's identity

E(I(f) I(g)) = ⟨f, g⟩,   (2.4)

for any f, g ∈ L²([0, 1], B([0, 1]), λ).
Theorem 2.1 The process B = {B_t = I(1_{[0,t]}), t ∈ [0, 1]} defines a Brownian motion indexed by [0, 1]. Moreover, the sample paths are continuous, almost surely.
Proof: By construction, B_0 = 0. Notice that for 0 ≤ s ≤ t ≤ 1, B_t − B_s = I(1_{(s,t]}). Hence, by virtue of (2.4), the random variable B_t − B_s is independent of any B_r, 0 < r < s, and B_t − B_s has a N(0, t − s) law. From this, by a change of variables, one obtains that the finite-dimensional distributions of {B_t, t ∈ [0, 1]} are Gaussian. By Proposition 2.1, we obtain the first statement.
Our next aim is to prove that the series appearing in

B_t = I(1_{[0,t]}) = ⟨1_{[0,t]}, h_0⟩ N_0 + Σ_{n=0}^∞ Σ_{k=0}^{2^n−1} ⟨1_{[0,t]}, h_n^k⟩ N_n^k
    = g_0(t) N_0 + Σ_{n=0}^∞ Σ_{k=0}^{2^n−1} g_n^k(t) N_n^k   (2.5)

converges uniformly, a.s. In the last term we have introduced the Schauder functions, defined as follows:

g_0(t) = ⟨1_{[0,t]}, h_0⟩ = t,
g_n^k(t) = ⟨1_{[0,t]}, h_n^k⟩ = ∫_0^t h_n^k(s) ds,

for any t ∈ [0, 1].
By construction, for any fixed n ≥ 1, the functions g_n^k(t), k = 0, ..., 2^n − 1, are positive, have disjoint supports and satisfy

g_n^k(t) ≤ 2^{−n/2}.

Thus,

sup_{t∈[0,1]} | Σ_{k=0}^{2^n−1} g_n^k(t) N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n−1} |N_n^k|.

The next step consists in proving that |N_n^k| is bounded by some constant depending on n such that, when multiplied by 2^{−n/2}, the series with these terms converges.
For this, we will use a result on large deviations for Gaussian measures along with the first Borel–Cantelli lemma.
Lemma 2.1 For any random variable X with law N(0, 1) and for any a ≥ 1,

P(|X| ≥ a) ≤ e^{−a²/2}.
Proof: We clearly have

P(|X| ≥ a) = (2/√(2π)) ∫_a^∞ e^{−x²/2} dx ≤ (2/√(2π)) ∫_a^∞ (x/a) e^{−x²/2} dx
= (2/(a√(2π))) e^{−a²/2} ≤ e^{−a²/2},

where we have used that 1 ≤ x/a and 2/(a√(2π)) ≤ 1.

□
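The bound of Lemma 2.1 can be checked numerically, since P(|X| ≥ a) = erfc(a/√2) for a standard normal X:

```python
import math

# P(|X| >= a) for X ~ N(0,1) equals erfc(a / sqrt(2)).
for a in [1.0, 1.5, 2.0, 3.0, 5.0]:
    tail = math.erfc(a / math.sqrt(2.0))
    bound = math.exp(-a * a / 2.0)
    print(f"a={a}: P(|X|>=a)={tail:.3e} <= {bound:.3e}")
    assert tail <= bound
```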
We now move to the Borel–Cantelli-based argument. By the preceding lemma,

P( sup_{0≤k≤2^n−1} |N_n^k| > 2^{n/4} ) ≤ Σ_{k=0}^{2^n−1} P( |N_n^k| > 2^{n/4} ) ≤ 2^n exp(−2^{n/2−1}).
It follows that

Σ_{n=0}^∞ P( sup_{0≤k≤2^n−1} |N_n^k| > 2^{n/4} ) < +∞,

and by the first Borel–Cantelli lemma

P( liminf_n { sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{n/4} } ) = 1.

That is, a.s., there exists n_0, which may depend on ω, such that

sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{n/4}

for any n ≥ n_0. Hence, we have proved

sup_{t∈[0,1]} | Σ_{k=0}^{2^n−1} g_n^k(t) N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{−n/4},

a.s., for n big enough, which proves the a.s. uniform convergence of the series (2.5).

□
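A sketch of this construction in code. The Schauder functions g_n^k(t) = ∫_0^t h_n^k(s) ds have the closed piecewise-linear form used below, and truncating the series (2.5) at a finite level gives an approximation of the Brownian path. By Parseval's identity, summing ⟨1_{[0,s]}, h⟩⟨1_{[0,t]}, h⟩ over the basis gives ⟨1_{[0,s]}, 1_{[0,t]}⟩ = s ∧ t, which the last lines verify for one pair of sample times (the times, truncation levels and seed are our choices):

```python
import numpy as np

def g(n, k, t):
    # Schauder function: integral of the Haar function h_n^k over [0, t].
    a = 2 * k / 2 ** (n + 1)          # start of the positive bump
    m_ = (2 * k + 1) / 2 ** (n + 1)   # midpoint (sign change)
    b = (2 * k + 2) / 2 ** (n + 1)    # end of the negative bump
    c = 2.0 ** (n / 2)
    return c * (min(t, m_) - min(t, a)) - c * (min(t, b) - min(t, m_))

def brownian_path(levels, ts, rng):
    # Truncation of series (2.5): B_t ≈ g_0(t) N_0 + sum over n <= levels.
    N0 = rng.standard_normal()
    vals = [t * N0 for t in ts]       # g_0(t) = t
    for n in range(levels + 1):
        N = rng.standard_normal(2 ** n)
        for i, t in enumerate(ts):
            vals[i] += sum(g(n, k, t) * N[k] for k in range(2 ** n))
    return vals

rng = np.random.default_rng(3)
path = brownian_path(6, [0.0, 0.25, 0.5, 1.0], rng)
print(path)                           # path[0] is exactly 0

# Parseval check: s*t + sum g_n^k(s) g_n^k(t) = s ∧ t.
s, t = 0.3, 0.7
total = s * t + sum(g(n, k, s) * g(n, k, t)
                    for n in range(12) for k in range(2 ** n))
print(total)                          # ≈ 0.3 = min(s, t)
assert abs(total - min(s, t)) < 1e-9
```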
Next we discuss how, from Theorem 2.1, we can get a Brownian motion indexed by ℝ_+. To this end, let us consider a sequence {B^k, k ≥ 1} consisting of independent Brownian motions indexed by [0, 1]. That means, for each k ≥ 1, B^k = {B_t^k, t ∈ [0, 1]} is a Brownian motion, and for different values of k they are independent. Then we define a Brownian motion recursively as follows. Let k ≥ 1; for t ∈ [k, k + 1] set

B_t = B_1^1 + B_1^2 + ... + B_1^k + B_{t−k}^{k+1}.

Such a process is Gaussian, zero mean, and E(B_t B_s) = s ∧ t. Hence it is a Brownian motion.
We end this section by giving the notion of d-dimensional Brownian motion, for a natural number d ≥ 1. For d = 1 it is the process we have seen so far. For d > 1, it is the process defined by

B_t = (B_t^1, B_t^2, ..., B_t^d), t ≥ 0,

where the components are independent one-dimensional Brownian motions.
2.3 Path properties of Brownian motion

We already know that the trajectories of Brownian motion are a.s. continuous functions. However, since the process is a model for particles wandering erratically, one expects rough behaviour. This section is devoted to proving some results that make these facts more precise.
Firstly, it is possible to prove that the sample paths of Brownian motion are γ-Hölder continuous. The main tool for this is Kolmogorov's continuity criterion:

Proposition 2.2 Let {X_t, t ≥ 0} be a stochastic process satisfying the following property: for some positive real numbers α, β and C,

E(|X_t − X_s|^α) ≤ C |t − s|^{1+β}.

Then almost surely, the sample paths of the process are γ-Hölder continuous with γ < β/α.
The law of the random variable B_t − B_s is N(0, t − s). Thus, it is possible to compute the moments, and we have

E( (B_t − B_s)^{2k} ) = ((2k)! / (2^k k!)) (t − s)^k,

for any k ∈ ℕ. Therefore, Proposition 2.2 yields that almost surely the sample paths of the Brownian motion are γ-Hölder continuous with γ ∈ (0, 1/2).
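The even-moment formula above is easy to check by Monte Carlo; for k = 2 it gives E((B_t − B_s)^4) = 3(t − s)². The lag and sample size below are our choices:

```python
import numpy as np

rng = np.random.default_rng(4)
t_minus_s = 0.5

# B_t - B_s ~ N(0, t - s)
incr = rng.normal(0.0, np.sqrt(t_minus_s), size=1_000_000)

# E((B_t - B_s)^{2k}) = (2k)!/(2^k k!) (t-s)^k; for k = 2 the constant is 3.
m4 = (incr ** 4).mean()
print(m4, 3 * t_minus_s ** 2)   # both ≈ 0.75
assert abs(m4 - 3 * t_minus_s ** 2) < 0.02
```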
Nowhere differentiability

We shall prove that the exponent γ = 1/2 above is sharp. As a consequence, we will obtain a celebrated result by Dvoretzky, Erdős and Kakutani telling us that a.s. the sample paths of Brownian motion are not differentiable. We gather these results in the next theorem.

Theorem 2.2 Fix any γ ∈ (1/2, 1]; then a.s. the sample paths of {B_t, t ≥ 0} are nowhere Hölder continuous with exponent γ. As a consequence, a.s. the sample paths are nowhere differentiable and they are of infinite variation on each finite interval.
Proof: Let γ ∈ (1/2, 1] and assume that a sample path t ↦ B_t(ω) is γ-Hölder continuous at s ∈ [0, 1). Then

|B_t(ω) − B_s(ω)| ≤ C |t − s|^γ,

for any t ∈ [0, 1] and some constant C > 0.
Let n be big enough and let i = [ns] + 1; by the triangle inequality,

|B_{j/n}(ω) − B_{(j+1)/n}(ω)| ≤ |B_s(ω) − B_{j/n}(ω)| + |B_s(ω) − B_{(j+1)/n}(ω)|
≤ C ( |s − j/n|^γ + |s − (j+1)/n|^γ ).

Hence, restricting to j = i, i + 1, ..., i + N − 1, we obtain

|B_{j/n}(ω) − B_{(j+1)/n}(ω)| ≤ C (N/n)^γ = M/n^γ.

Define

A^i_{M,n} = { |B_{j/n}(ω) − B_{(j+1)/n}(ω)| ≤ M/n^γ, j = i, i + 1, ..., i + N − 1 }.

We have seen that the set of trajectories where t ↦ B_t(ω) is γ-Hölder continuous at some s is included in

∪_{M=1}^∞ ∪_{k=1}^∞ ∩_{n=k}^∞ ∪_{i=1}^n A^i_{M,n}.
Next we prove that this set has null probability. Indeed,

P( ∩_{n=k}^∞ ∪_{i=1}^n A^i_{M,n} ) ≤ liminf_n P( ∪_{i=1}^n A^i_{M,n} ) ≤ liminf_n Σ_{i=1}^n P(A^i_{M,n})
≤ liminf_n n ( P( |B_{1/n}| ≤ M/n^γ ) )^N,
where we have used that the random variables B_{j/n} − B_{(j+1)/n} are N(0, 1/n) and independent. But

P( |B_{1/n}| ≤ M/n^γ ) = √(n/(2π)) ∫_{−M n^{−γ}}^{M n^{−γ}} e^{−nx²/2} dx
= (1/√(2π)) ∫_{−M n^{1/2−γ}}^{M n^{1/2−γ}} e^{−x²/2} dx ≤ C n^{1/2−γ}.

Hence, by taking N such that N(γ − 1/2) > 1,

P( ∩_{n=k}^∞ ∪_{i=1}^n A^i_{M,n} ) ≤ liminf_n n C ( n^{1/2−γ} )^N = 0.
Since this holds for any k, M, we get

P( ∪_{M=1}^∞ ∪_{k=1}^∞ ∩_{n=k}^∞ ∪_{i=1}^n A^i_{M,n} ) = 0.

This ends the proof of the first part of the theorem.
If a sample path t ↦ B_t(ω) were differentiable at some s, then it would be Lipschitz continuous at s (i.e. Hölder continuous with exponent γ = 1), and we have just proved that this can only happen with probability zero.
It is known that if a real function is of finite variation on some finite interval, then it is differentiable almost everywhere on that interval. This yields the last assertion of the theorem and ends its proof.

□
Quadratic variation

The notion of quadratic variation provides a measure of the roughness of a function. The existence of variations of different orders is also important in procedures of approximation via a Taylor expansion, and in the development of infinitesimal calculus. We will study here the existence of the quadratic variation, i.e. the variation of order two, of the Brownian motion. As shall be explained in more detail in the next chapter, this provides an explanation of the fact that the rules of Itô's stochastic calculus are different from those of classical deterministic differential calculus.
Fix a finite interval [0, T] and consider the sequence of partitions given by the points Π_n = (t_0^n = 0 ≤ t_1^n ≤ ... ≤ t_{r_n}^n = T). We assume that

lim_n |Π_n| = 0,

where |Π_n| denotes the norm of the partition Π_n:

|Π_n| = sup_{j=0,...,r_n−1} (t_{j+1} − t_j).

Set

Δ_k B = B_{t_k^n} − B_{t_{k−1}^n}.
Proposition 2.3 The sequence {Σ_{k=1}^{r_n} (Δ_k B)², n ≥ 1} converges in L²(Ω) to the deterministic random variable T. That is,

lim_n E[ ( Σ_{k=1}^{r_n} (Δ_k B)² − T )² ] = 0.
Proof: For the sake of simplicity, we shall omit the dependence on n. Set Δ_k t = t_k − t_{k−1}. Notice that the random variables (Δ_k B)² − Δ_k t, k = 1, ..., r_n, are independent and centered. Thus,

E[ ( Σ_{k=1}^{r_n} (Δ_k B)² − T )² ] = E[ ( Σ_{k=1}^{r_n} ( (Δ_k B)² − Δ_k t ) )² ]
= Σ_{k=1}^{r_n} E[ ( (Δ_k B)² − Δ_k t )² ]
= Σ_{k=1}^{r_n} ( 3(Δ_k t)² − 2(Δ_k t)² + (Δ_k t)² )
= 2 Σ_{k=1}^{r_n} (Δ_k t)² ≤ 2T |Π_n|,

which clearly tends to zero as n tends to infinity.

□
This proposition, together with the continuity of the sample paths of Brownian motion, yields

sup_n Σ_{k=1}^{r_n} |Δ_k B| = ∞, a.s.,

something that we already know from Theorem 2.2.
Indeed, assume that V := sup_n Σ_{k=1}^{r_n} |Δ_k B| < ∞. Then

Σ_{k=1}^{r_n} (Δ_k B)² ≤ sup_k |Δ_k B| ( Σ_{k=1}^{r_n} |Δ_k B| ) ≤ V sup_k |Δ_k B|.

By the continuity of the sample paths, sup_k |Δ_k B| → 0 as |Π_n| → 0, so we obtain lim_n Σ_{k=1}^{r_n} (Δ_k B)² = 0 a.s., which contradicts the result proved in Proposition 2.3.
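Proposition 2.3 is easy to observe numerically: on a uniform partition of [0, T], the sum of squared increments of a simulated Brownian path concentrates around T, while the sum of absolute increments grows like √r_n. Grid size and seed below are our choices:

```python
import numpy as np

rng = np.random.default_rng(5)
T, n = 1.0, 200_000

# Brownian increments over a uniform partition: Δ_k B ~ N(0, T/n), independent.
dB = rng.normal(0.0, np.sqrt(T / n), size=n)

qv = np.sum(dB ** 2)            # Σ (Δ_k B)^2, close to T
abs_var = np.sum(np.abs(dB))    # Σ |Δ_k B|, blows up as the mesh refines

print("quadratic variation:", qv)
print("sum of absolute increments:", abs_var)
assert abs(qv - T) < 0.03
```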
2.4 The martingale property of Brownian motion

We start this section by giving the definition of a martingale for continuous time stochastic processes. First, we introduce the appropriate notion of filtration, as follows.
A family {F_t, t ≥ 0} of sub-σ-fields of F is termed a filtration if

1. F_0 contains all the sets of F of null probability,

2. for any 0 ≤ s ≤ t, F_s ⊂ F_t.

If in addition

∩_{s>t} F_s = F_t,

for any t ≥ 0, the filtration is said to be right-continuous.
Definition 2.2 A stochastic process {X_t, t ≥ 0} is a martingale with respect to the filtration {F_t, t ≥ 0} if each variable belongs to L¹(Ω) and moreover

1. X_t is F_t-measurable for any t ≥ 0,

2. for any 0 ≤ s ≤ t, E(X_t / F_s) = X_s.

If the equality in (2) is replaced by ≤ (respectively, ≥), we have a supermartingale (respectively, a submartingale).
Given a stochastic process {X_t, t ≥ 0}, there is a natural way to define a filtration, by considering

F_t = σ(X_s, 0 ≤ s ≤ t), t ≥ 0.

To ensure that the above property (1) of a filtration holds, one needs to complete the σ-field. In general, there is no reason to expect right-continuity. However, for the Brownian motion, the natural filtration possesses this property.
A stochastic process with mean zero and independent increments possesses the martingale property with respect to its natural filtration. Indeed, for 0 ≤ s ≤ t,

E(X_t − X_s / F_s) = E(X_t − X_s) = 0.

Hence, a Brownian motion possesses the martingale property with respect to its natural filtration.
Other examples of martingales with respect to the same filtration, related to the Brownian motion, are

1. B_t² − t, t ≥ 0,

2. exp(aB_t − a²t/2), t ≥ 0.
Indeed, for the first example, let us consider 0 ≤ s ≤ t. Then,

E(B_t² / F_s) = E( (B_t − B_s + B_s)² / F_s )
= E( (B_t − B_s)² / F_s ) + 2E( (B_t − B_s) B_s / F_s ) + E(B_s² / F_s).
Since B_t − B_s is independent of F_s, owing to the properties of the conditional expectation, we have

E( (B_t − B_s)² / F_s ) = E( (B_t − B_s)² ) = t − s,
E( (B_t − B_s) B_s / F_s ) = B_s E(B_t − B_s / F_s) = 0,
E(B_s² / F_s) = B_s².

Consequently,

E(B_t² − B_s² / F_s) = t − s.
For the second example, we also use the property of independent increments, as follows:

E( exp(aB_t − a²t/2) / F_s ) = exp(aB_s) E( exp( a(B_t − B_s) − a²t/2 ) / F_s )
= exp(aB_s) E( exp( a(B_t − B_s) − a²t/2 ) ).

Using the density of the random variable B_t − B_s, one can easily check that

E( exp( a(B_t − B_s) − a²t/2 ) ) = exp( a²(t − s)/2 − a²t/2 ).

Therefore, we obtain

E( exp(aB_t − a²t/2) / F_s ) = exp( aB_s − a²s/2 ).
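Both martingales have constant expectation, equal to their value at t = 0: E(B_t² − t) = 0 and E(exp(aB_t − a²t/2)) = 1 for every t. This gives a quick Monte Carlo sanity check (parameters are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(6)
a, t, reps = 1.0, 2.0, 1_000_000

# Simulate B_t ~ N(0, t).
B_t = rng.normal(0.0, np.sqrt(t), size=reps)

m1 = (B_t ** 2 - t).mean()                    # E(B_t^2 - t) = 0
m2 = np.exp(a * B_t - a * a * t / 2).mean()   # E(exp(a B_t - a^2 t / 2)) = 1

print(m1, m2)
assert abs(m1) < 0.02 and abs(m2 - 1.0) < 0.02
```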
2.5 Markov property

For any 0 ≤ s ≤ t, x ∈ ℝ and A ∈ B(ℝ), we set

p(s, t, x, A) = (2π(t − s))^{−1/2} ∫_A exp( −|x − y|² / (2(t − s)) ) dy.   (2.6)

Actually, p(s, t, x, A) is the probability that a Normal random variable with mean x and variance t − s takes values in a fixed set A.
Let us prove the following identity:

P{B_t ∈ A / F_s} = p(s, t, B_s, A),   (2.7)

which means that, conditionally on the past of the Brownian motion until time s, the law of B_t at a future time t only depends on B_s.
Let f : ℝ → ℝ be a bounded measurable function. Then, since B_s is F_s-measurable and B_t − B_s is independent of F_s, we obtain

E( f(B_t) / F_s ) = E( f(B_s + (B_t − B_s)) / F_s ) = E( f(x + B_t − B_s) ) |_{x=B_s}.

The random variable x + B_t − B_s is N(x, t − s). Thus,

E( f(x + B_t − B_s) ) = ∫_ℝ f(y) p(s, t, x, dy),

and consequently,

E( f(B_t) / F_s ) = ∫_ℝ f(y) p(s, t, B_s, dy).

This yields (2.7) by taking f = 1_A.
Going back to (2.6), we notice that the function x ↦ p(s, t, x, A) is measurable, and the mapping A ↦ p(s, t, x, A) is a probability.
Let us prove the additional property, called the Chapman–Kolmogorov equation: for any 0 ≤ s ≤ u ≤ t,

p(s, t, x, A) = ∫_ℝ p(u, t, y, A) p(s, u, x, dy).   (2.8)

We recall that the sum of two independent Normal random variables is again Normal, with mean the sum of the respective means and variance the sum of the respective variances. This is expressed in mathematical terms by the fact that, denoting by f_{N(m,σ)} the density of a Normal law with mean m and variance σ,

( f_{N(x,σ₁)} ∗ f_{N(y,σ₂)} )(z) = ∫_ℝ f_{N(x,σ₁)}(η) f_{N(y,σ₂)}(z − η) dη = f_{N(x+y, σ₁+σ₂)}(z).

Using this fact, we obtain

∫_ℝ p(u, t, y, A) p(s, u, x, dy) = ∫_A dz ( f_{N(x,u−s)} ∗ f_{N(0,t−u)} )(z)
= ∫_A dz f_{N(x,t−s)}(z) = p(s, t, x, A),

proving (2.8).
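Equation (2.8) thus reduces to the semigroup property of Gaussian densities, which can be verified by discrete convolution on a grid; the grid and the times s = 0, u = 0.5, t = 1 are arbitrary choices:

```python
import numpy as np

def gauss(x, var):
    # density of N(0, var)
    return np.exp(-x ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

dx = 0.01
x = np.arange(-10.0, 10.0 + dx / 2, dx)   # symmetric grid around 0

# Transition densities for s=0, u=0.5, t=1: variances u-s = t-u = 0.5.
f1 = gauss(x, 0.5)
f2 = gauss(x, 0.5)

# Discrete Chapman-Kolmogorov: the convolution gives the N(0, t-s) density.
conv = np.convolve(f1, f2, mode="same") * dx
target = gauss(x, 1.0)

err = np.abs(conv - target).max()
print("max error:", err)
assert err < 1e-3
```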
This equation is the time-continuous analogue of the property enjoyed by the transition probability matrices of a Markov chain. That is,

Π^{(m+n)} = Π^{(m)} Π^{(n)},

meaning that evolutions in m + n steps are done by concatenating m-step and n-step evolutions. In (2.8), m + n is replaced by the real time lapse t − s, m by t − u, and n by u − s, respectively.
We are now prepared to give the definition of a Markov process.
Consider a mapping

p : ℝ_+ × ℝ_+ × ℝ × B(ℝ) → ℝ_+

satisfying the following properties:

(i) for any fixed s, t ∈ ℝ_+, A ∈ B(ℝ),

x ↦ p(s, t, x, A)

is B(ℝ)-measurable,

(ii) for any fixed s, t ∈ ℝ_+, x ∈ ℝ,

A ↦ p(s, t, x, A)

is a probability,

(iii) equation (2.8) holds.

Such a function p is termed a Markovian transition function. Let us also fix a probability μ on B(ℝ).
Definition 2.3 A real-valued stochastic process {X_t, t ∈ ℝ_+} is a Markov process with initial law μ and transition probability function p if

(a) the law of X_0 is μ,

(b) for any 0 ≤ s ≤ t,

P{X_t ∈ A / F_s} = p(s, t, X_s, A).

Therefore, we have proved that the Brownian motion is a Markov process with initial law a Dirac delta function at 0 and transition probability function p the one defined in (2.6).
Strong Markov property

Throughout this section, (F_t, t ≥ 0) will denote the natural filtration associated with a Brownian motion {B_t, t ≥ 0}, and stopping times will always refer to this filtration.
Theorem 2.3 Let T be a stopping time. Then, conditionally on {T < ∞}, the process defined by

B_t^T = B_{T+t} − B_T, t ≥ 0,

is a Brownian motion independent of F_T.
Proof: Assume that T < ∞ a.s. We shall prove that for any A ∈ F_T, any choice of parameters 0 ≤ t_1 < ... < t_p and any continuous and bounded function f on ℝ^p, we have

E( 1_A f(B_{t_1}^T, ..., B_{t_p}^T) ) = P(A) E( f(B_{t_1}, ..., B_{t_p}) ).   (2.9)

This suffices to prove all the assertions of the theorem. Indeed, by taking A = Ω, we see that the finite-dimensional distributions of B and B^T coincide. On the other hand, (2.9) states the independence of 1_A and the random vector (B_{t_1}^T, ..., B_{t_p}^T). By a monotone class argument, we get the independence of 1_A and B^T.
f
_
B
T
t
1
, , B
T
t
p
_
= f
_
B
T+t
1
B
T
, . . . , B
T+t
p
B
T
_
= lim
n

k=1
1 1
{(k1)2
n
<Tk2
n
}
f
_
B
k2
n
+t
1
B
k2
n, . . . B
k2
n
+t
p
B
k2
n
_
.
Since $f$ is bounded, we can apply bounded convergence and write
$$E\left[\mathbf{1}_A\, f\left(B^T_{t_1}, \ldots, B^T_{t_p}\right)\right] = \lim_{n\to\infty} \sum_{k=1}^{\infty} E\left[\mathbf{1}_{\{(k-1)2^{-n} < T \le k2^{-n}\}}\, \mathbf{1}_A\, f\left(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \ldots, B_{k2^{-n}+t_p} - B_{k2^{-n}}\right)\right].$$
Since $A \in \mathcal{F}_T$, the event $A \cap \{(k-1)2^{-n} < T \le k2^{-n}\}$ belongs to $\mathcal{F}_{k2^{-n}}$. Since Brownian motion has independent and stationary increments, we have
$$E\left[\mathbf{1}_{\{(k-1)2^{-n} < T \le k2^{-n}\}}\, \mathbf{1}_A\, f\left(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \ldots, B_{k2^{-n}+t_p} - B_{k2^{-n}}\right)\right] = P\left(A \cap \{(k-1)2^{-n} < T \le k2^{-n}\}\right)\, E\left[f(B_{t_1}, \ldots, B_{t_p})\right].$$
Summing up with respect to $k$ both terms in the preceding identity yields (2.9), and this finishes the proof if $T < \infty$ a.s.
If $P(T = \infty) > 0$, we can argue as before and obtain
$$E\left[\mathbf{1}_{A\cap\{T<\infty\}}\, f\left(B^T_{t_1}, \ldots, B^T_{t_p}\right)\right] = P(A \cap \{T < \infty\})\, E\left[f\left(B_{t_1}, \ldots, B_{t_p}\right)\right]. \qquad \square$$

An interesting consequence of the preceding property is given in the next proposition.
Proposition 2.4 For any $t > 0$, set $S_t = \sup_{s\le t} B_s$. Then, for any $a \ge 0$ and $b \le a$,
$$P\{S_t \ge a,\ B_t \le b\} = P\{B_t \ge 2a - b\}. \qquad (2.10)$$
As a consequence, the probability laws of $S_t$ and of $|B_t|$ are the same.
Proof: Consider the stopping time
$$T_a = \inf\{t \ge 0,\ B_t = a\},$$
which is finite a.s. We have
$$P\{S_t \ge a,\ B_t \le b\} = P\{T_a \le t,\ B_t \le b\} = P\{T_a \le t,\ B^{T_a}_{t-T_a} \le b - a\}.$$
Indeed, $B^{T_a}_{t-T_a} = B_t - B_{T_a} = B_t - a$, and $B$ and $B^{T_a}$ have the same law. Moreover, we know that these processes are independent of $\mathcal{F}_{T_a}$.
This last property, along with the fact that $B^{T_a}$ and $-B^{T_a}$ have the same law, yields that $(T_a, B^{T_a})$ has the same distribution as $(T_a, -B^{T_a})$.
Define $H = \{(s, w) \in \mathbb{R}_+ \times \mathcal{C}(\mathbb{R}_+; \mathbb{R});\ s \le t,\ w(t-s) \le b-a\}$. Then
$$P\{T_a \le t,\ B^{T_a}_{t-T_a} \le b-a\} = P\{(T_a, B^{T_a}) \in H\} = P\{(T_a, -B^{T_a}) \in H\} = P\{T_a \le t,\ -B^{T_a}_{t-T_a} \le b-a\} = P\{T_a \le t,\ 2a-b \le B_t\} = P\{2a-b \le B_t\}.$$
Indeed, by definition of the process $\{B^{T_a}_t,\ t \ge 0\}$, the condition $-B^{T_a}_{t-T_a} \le b-a$ is equivalent to $2a-b \le B_t$; moreover, the inclusion $\{2a-b \le B_t\} \subset \{T_a \le t\}$ holds true. In fact, if $T_a > t$, then $B_t \le a$; since $b \le a$, this yields $B_t \le 2a-b$. This ends the proof of (2.10).
For the second statement, we notice that $\{B_t \ge a\} \subset \{S_t \ge a\}$. This fact along with (2.10) yields the validity of the identities
$$P\{S_t \ge a\} = P\{S_t \ge a,\ B_t \le a\} + P\{S_t \ge a,\ B_t \ge a\} = 2P\{B_t \ge a\} = P\{|B_t| \ge a\}.$$
The proof is now complete. $\square$
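The identity in law between $S_t$ and $|B_t|$ can be illustrated by Monte Carlo simulation. The following sketch (assuming numpy; all parameters are illustrative) compares one tail probability of the discretised running maximum with the corresponding tail of $|B_t|$; the small residual gap comes from the discrete-time maximum undershooting the true supremum, plus sampling error:

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, t = 10_000, 1_000, 1.0
dt = t / n_steps

# Simulate Brownian paths on [0, t] via cumulative sums of Gaussian increments.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
paths = np.cumsum(increments, axis=1)
S_t = np.maximum(paths.max(axis=1), 0.0)  # running maximum (B_0 = 0 included)
B_t = paths[:, -1]

a = 0.5
p_max = (S_t >= a).mean()                 # estimate of P{S_t >= a}
p_abs = (np.abs(B_t) >= a).mean()         # estimate of P{|B_t| >= a}

# Proposition 2.4: the two probabilities agree up to discretisation + MC error.
assert abs(p_max - p_abs) < 0.03
```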
3 Itô's calculus

Itô's calculus was developed in the 1950s by Kiyoshi Itô in an attempt to give rigorous meaning to some differential equations driven by the Brownian motion, appearing in the study of some problems related with continuous time Markov processes. Roughly speaking, one could say that Itô's calculus is an analogue of the classical Newton and Leibniz calculus for stochastic processes. In fact, in classical mathematical analysis, there are several extensions of the Riemann integral $\int f(x)\,dx$. For example, if $g$ is an increasing bounded function (or the difference of two functions of this class), the Lebesgue-Stieltjes integral gives a precise meaning to the integral $\int f(x)\,g(dx)$, for some set of functions $f$. However, before Itô's development, no theory allowing nowhere differentiable integrators $g$ was known. Brownian motion, introduced in the preceding chapter, is an example of a stochastic process whose sample paths, although continuous, are nowhere differentiable. Therefore, the Lebesgue-Stieltjes integral does not apply to the sample paths of Brownian motion.
There are many motivations coming from a variety of disciplines to consider stochastic differential equations driven by a Brownian motion. Such an object is defined as
$$dX_t = \sigma(t, X_t)\,dB_t + b(t, X_t)\,dt, \quad X_0 = x_0,$$
or in integral form,
$$X_t = x_0 + \int_0^t \sigma(s, X_s)\,dB_s + \int_0^t b(s, X_s)\,ds. \qquad (3.1)$$
The first notion to be introduced is that of a stochastic integral. In fact, in (3.1) the integral $\int_0^t b(s, X_s)\,ds$ might be defined pathwise, but this is not the case for $\int_0^t \sigma(s, X_s)\,dB_s$, because of the roughness of the paths of the integrator. More explicitly, it is not possible to fix $\omega \in \Omega$, then to consider the path $\sigma(s, X_s(\omega))$, and finally to integrate with respect to $B_s(\omega)$.
3.1 Itô's integral

Throughout this section, we will consider a Brownian motion $B = \{B_t,\ t \ge 0\}$ defined on a probability space $(\Omega, \mathcal{F}, P)$. We also will consider a filtration $(\mathcal{F}_t,\ t \ge 0)$ satisfying the following properties:

1. $B$ is adapted to $(\mathcal{F}_t,\ t \ge 0)$;

2. the $\sigma$-field generated by $\{B_u - B_t,\ u \ge t\}$ is independent of $\mathcal{F}_t$.

Notice that these two properties are satisfied if $(\mathcal{F}_t,\ t \ge 0)$ is the natural filtration associated to $B$.
We fix a finite time horizon $T$ and define $L^2_{a,T}$ as the set of stochastic processes $u = \{u_t,\ t \in [0, T]\}$ satisfying the following conditions:

(i) $u$ is adapted and jointly measurable in $(t, \omega)$, with respect to the product $\sigma$-field $\mathcal{B}([0, T]) \times \mathcal{F}$;

(ii) $\int_0^T E(u_t^2)\,dt < \infty$.

The notation $L^2_{a,T}$ evokes the two properties (adaptedness and square integrability) described before.
Consider first the subset of $L^2_{a,T}$ consisting of step processes, that is, stochastic processes which can be written as
$$u_t = \sum_{j=1}^{n} u_j \mathbf{1}_{[t_{j-1}, t_j[}(t), \qquad (3.2)$$
with $0 = t_0 \le t_1 \le \cdots \le t_n = T$, and where $u_j$, $j = 1, \ldots, n$, are $\mathcal{F}_{t_{j-1}}$-measurable, square integrable random variables. We shall denote by $\mathcal{E}$ the set of these processes.
For step processes, the Itô stochastic integral is defined by the very natural formula
$$\int_0^T u_t\,dB_t = \sum_{j=1}^{n} u_j \left(B_{t_j} - B_{t_{j-1}}\right), \qquad (3.3)$$
which we may compare with the Lebesgue integral of simple functions. Notice that $\int_0^T u_t\,dB_t$ is a random variable. Of course, we would like to be able to consider more general integrands than step processes. Therefore, we must try to extend the definition (3.3). For this, we have to use tools from Functional Analysis based upon a very natural idea: if we are able to prove that (3.3) gives a continuous functional between two metric spaces, then the stochastic integral defined for the very particular class of step stochastic processes could be extended to a more general class, given by the closure of this set with respect to a suitable norm.
The idea of continuity is made precise by the

Isometry property:
$$E\left[\left(\int_0^T u_t\,dB_t\right)^2\right] = E\left[\int_0^T u_t^2\,dt\right]. \qquad (3.4)$$
Let us prove (3.4) for step processes. Writing $\Delta_j B = B_{t_j} - B_{t_{j-1}}$, clearly
$$E\left[\left(\int_0^T u_t\,dB_t\right)^2\right] = \sum_{j=1}^{n} E\left[u_j^2 (\Delta_j B)^2\right] + 2\sum_{j<k} E\left(u_j u_k (\Delta_j B)(\Delta_k B)\right).$$
The measurability property of the random variables $u_j$, $j = 1, \ldots, n$, implies that the random variables $u_j^2$ are independent of $(\Delta_j B)^2$. Hence, the contribution of the first term on the right-hand side of the preceding identity is equal to
$$\sum_{j=1}^{n} E(u_j^2)(t_j - t_{j-1}) = \int_0^T E(u_t^2)\,dt.$$
For the second term, we notice that for fixed $j$ and $k$, $j < k$, the random variables $u_j u_k \Delta_j B$ are independent of $\Delta_k B$. Therefore,
$$E\left(u_j u_k (\Delta_j B)(\Delta_k B)\right) = E\left(u_j u_k \Delta_j B\right) E\left(\Delta_k B\right) = 0.$$
Thus, we have (3.4).
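The isometry (3.4) can be checked numerically for a concrete step process, for example $u_t = B_{t_{j-1}}$ on $[t_{j-1}, t_j[$, which is adapted and square integrable. A minimal Monte Carlo sketch (assuming numpy; the grid and sample sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n, T = 200_000, 10, 1.0
t = np.linspace(0.0, T, n + 1)

dB = rng.normal(0.0, np.sqrt(np.diff(t)), size=(n_paths, n))
B = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)])

# Step process u_t = B_{t_{j-1}} on [t_{j-1}, t_j[: F_{t_{j-1}}-measurable
# coefficients, so u belongs to the class of step processes above.
u = B[:, :-1]
integral = (u * dB).sum(axis=1)                   # Ito integral (3.3)

lhs = (integral ** 2).mean()                      # E[(int u dB)^2]
rhs = ((u ** 2) * np.diff(t)).sum(axis=1).mean()  # E[int u_t^2 dt]
assert abs(lhs - rhs) < 0.02                      # isometry (3.4), up to MC error
```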
This property tells us that the stochastic integral is a continuous functional defined on $\mathcal{E}$, endowed with the norm of $L^2(\Omega \times [0, T])$, taking values in the set $L^2(\Omega)$ of square integrable random variables.
Other properties of the stochastic integral of step processes

1. The stochastic integral is a centered random variable. Indeed,
$$E\left[\int_0^T u_t\,dB_t\right] = E\left[\sum_{j=1}^{n} u_j \left(B_{t_j} - B_{t_{j-1}}\right)\right] = \sum_{j=1}^{n} E(u_j)\, E\left[B_{t_j} - B_{t_{j-1}}\right] = 0,$$
where we have used that the random variables $u_j$ and $B_{t_j} - B_{t_{j-1}}$ are independent and, moreover, $E\left[B_{t_j} - B_{t_{j-1}}\right] = 0$.
2. Linearity: if $u^1, u^2$ are two step processes and $a, b \in \mathbb{R}$, then clearly $au^1 + bu^2$ is also a step process and
$$\int_0^T (au^1 + bu^2)(t)\,dB_t = a\int_0^T u^1(t)\,dB_t + b\int_0^T u^2(t)\,dB_t.$$
The next step consists of identifying a bigger set than $\mathcal{E}$ of random processes in which $\mathcal{E}$ is dense in the norm of $L^2(\Omega \times [0, T])$. This is actually the set denoted before by $L^2_{a,T}$. Indeed, we have the following result, which is a crucial fact in Itô's theory.
Proposition 3.1 For any $u \in L^2_{a,T}$ there exists a sequence $(u^n,\ n \ge 1) \subset \mathcal{E}$ such that
$$\lim_{n\to\infty} \int_0^T E\left(u^n_t - u_t\right)^2 dt = 0.$$
Proof: Assume first that $u \in L^2_{a,T}$ is bounded and has continuous sample paths, a.s. An approximation sequence can be defined as follows:
$$u^n(t) = \sum_{k=0}^{[nT]} u\left(\tfrac{k}{n}\right) \mathbf{1}_{[\frac{k}{n}, \frac{k+1}{n}[}(t).$$
Clearly, $u^n \in L^2_{a,T}$ and by continuity,
$$\int_0^T |u^n(t) - u(t)|^2\,dt = \sum_{k=0}^{[nT]} \int_{\frac{k}{n}}^{\frac{k+1}{n}\wedge T} \left|u\left(\tfrac{k}{n}\right) - u(t)\right|^2 dt \le T \sup_k \sup_{t\in[\frac{k}{n}, \frac{k+1}{n}[} |u^n(t) - u(t)|^2 \to 0,$$
as $n \to \infty$, a.s. Then, the approximation result follows by bounded convergence.
In a second step, we assume that $u \in L^2_{a,T}$ is bounded. Let $\varphi$ be a $\mathcal{C}^\infty$ function with compact support contained in $[-1, 1]$. For any $n \ge 1$, set $\varphi_n(x) = n\varphi(nx)$ and
$$u^n(t) = \int_0^t \varphi_n(t-s)\,u(s)\,ds.$$
$u^n$ is a stochastic process with continuous sample paths, a.s. Classical results on approximations of the identity yield
$$\int_0^T |u^n(s) - u(s)|^2\,ds \to 0, \quad \text{a.s.}$$
By bounded convergence,
$$\lim_{n\to\infty} E\int_0^T |u^n(s) - u(s)|^2\,ds = 0.$$
Finally, consider $u \in L^2_{a,T}$ and define
$$u^n(t) = \begin{cases} 0, & \text{if } u(t) < -n,\\ u(t), & \text{if } -n \le u(t) \le n,\\ 0, & \text{if } u(t) > n. \end{cases}$$
Clearly, $\sup_{\omega,t} |u^n(t)| \le n$ and $u^n \in L^2_{a,T}$. Moreover,
$$E\int_0^T |u^n(s) - u(s)|^2\,ds = E\int_0^T |u(s)|^2\, \mathbf{1}_{\{|u(s)|>n\}}\,ds \to 0,$$
where we have used that for a function $f \in L^1(\Omega, \mathcal{F}, \mu)$,
$$\lim_{n\to\infty} \int_\Omega |f|\, \mathbf{1}_{\{|f|>n\}}\,d\mu = 0. \qquad \square$$
By using the approximation result provided by the preceding Proposition, we can give the following definition.

Definition 3.1 The Itô stochastic integral of a process $u \in L^2_{a,T}$ is
$$\int_0^T u_t\,dB_t := L^2(\Omega)\text{-}\lim_{n\to\infty} \int_0^T u^n_t\,dB_t. \qquad (3.5)$$
In order for this definition to make sense, one needs to make sure that if the process $u$ is approximated by two different sequences, say $u^{n,1}$ and $u^{n,2}$, the definitions of the stochastic integral using either $u^{n,1}$ or $u^{n,2}$ coincide. This is proved using the isometry property. Indeed,
$$E\left[\left(\int_0^T u^{n,1}_t\,dB_t - \int_0^T u^{n,2}_t\,dB_t\right)^2\right] = \int_0^T E\left(u^{n,1}_t - u^{n,2}_t\right)^2 dt \le 2\int_0^T E\left(u^{n,1}_t - u_t\right)^2 dt + 2\int_0^T E\left(u^{n,2}_t - u_t\right)^2 dt \to 0.$$
By its very definition, the stochastic integral defined in Definition 3.1 satisfies the isometry property as well. Moreover,

(a) stochastic integrals are centered random variables:
$$E\left[\int_0^T u_t\,dB_t\right] = 0;$$

(b) stochastic integration is a linear operator:
$$\int_0^T (au_t + bv_t)\,dB_t = a\int_0^T u_t\,dB_t + b\int_0^T v_t\,dB_t.$$
Remember that these facts are true for processes in $\mathcal{E}$, as has been mentioned before. The extension to processes in $L^2_{a,T}$ is done by applying Proposition 3.1. For the sake of illustration, we prove (a).

Consider an approximating sequence $u^n$ in the sense of Proposition 3.1. By the construction of the stochastic integral $\int_0^T u_t\,dB_t$, it holds that
$$\lim_{n\to\infty} E\left[\int_0^T u^n_t\,dB_t\right] = E\left[\int_0^T u_t\,dB_t\right].$$
Since $E\left[\int_0^T u^n_t\,dB_t\right] = 0$ for every $n \ge 1$, this concludes the proof.
We end this section with an interesting example.

Example 3.1 For the Brownian motion $B$, the following formula holds:
$$\int_0^T B_t\,dB_t = \frac{1}{2}\left(B_T^2 - T\right).$$
Let us remark that we would rather expect $\int_0^T B_t\,dB_t = \frac{1}{2}B_T^2$, by analogy with the rules of deterministic calculus.
To prove this identity, we consider a particular sequence of approximating step processes, as follows:
$$u^n_t = \sum_{j=1}^{n} B_{t_{j-1}}\, \mathbf{1}_{]t_{j-1}, t_j]}(t).$$
Clearly, $u^n \in L^2_{a,T}$ and we have
$$\int_0^T E\left(u^n_t - B_t\right)^2 dt = \sum_{j=1}^{n} \int_{t_{j-1}}^{t_j} E\left(B_{t_{j-1}} - B_t\right)^2 dt \le \frac{T}{n}\sum_{j=1}^{n} \int_{t_{j-1}}^{t_j} dt = \frac{T^2}{n}.$$
Therefore, $(u^n,\ n \ge 1)$ is an approximating sequence of $B$ in the norm of $L^2(\Omega \times [0, T])$.
According to Definition 3.1,
$$\int_0^T B_t\,dB_t = \lim_{n\to\infty} \sum_{j=1}^{n} B_{t_{j-1}}\left(B_{t_j} - B_{t_{j-1}}\right),$$
in the $L^2(\Omega)$ norm.
Clearly,
$$\sum_{j=1}^{n} B_{t_{j-1}}\left(B_{t_j} - B_{t_{j-1}}\right) = \frac{1}{2}\sum_{j=1}^{n}\left(B^2_{t_j} - B^2_{t_{j-1}}\right) - \frac{1}{2}\sum_{j=1}^{n}\left(B_{t_j} - B_{t_{j-1}}\right)^2 = \frac{1}{2}B_T^2 - \frac{1}{2}\sum_{j=1}^{n}\left(B_{t_j} - B_{t_{j-1}}\right)^2. \qquad (3.6)$$
We conclude by using Proposition 2.3.
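The example can also be observed by simulation: by (3.6), the left-endpoint Riemann sums differ from $\frac12(B_T^2 - T)$ by $\frac12(T - \sum_j(\Delta_j B)^2)$, whose second moment is $T^2/(2n)$ for the uniform partition. A sketch (assuming numpy; parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n, T = 10_000, 500, 1.0
dt = T / n

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])  # B_{t_{j-1}}

riemann = (B_left * dB).sum(axis=1)      # sum_j B_{t_{j-1}} (B_{t_j} - B_{t_{j-1}})
closed_form = 0.5 * (B[:, -1] ** 2 - T)  # (B_T^2 - T) / 2

# By (3.6) the mean square error is T^2 / (2n); here that is 0.001.
mse = ((riemann - closed_form) ** 2).mean()
assert 0.0005 < mse < 0.002
```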
3.2 The Itô integral as a stochastic process

The indefinite Itô stochastic integral of a process $u \in L^2_{a,T}$ is defined as follows:
$$\int_0^t u_s\,dB_s := \int_0^T u_s\, \mathbf{1}_{[0,t]}(s)\,dB_s, \qquad (3.7)$$
$t \in [0, T]$. For this definition to make sense, we need that for any $t \in [0, T]$, the process $\{u_s \mathbf{1}_{[0,t]}(s),\ s \in [0, T]\}$ belongs to $L^2_{a,T}$. This is clearly true.

Obviously, properties of the integral mentioned in the previous section, like zero mean, isometry, and linearity, also hold for the indefinite integral. The rest of the section is devoted to the study of important properties of the stochastic process given by an indefinite Itô integral.
Proposition 3.2 The process $I_t = \int_0^t u_s\,dB_s$, $t \in [0, T]$, is a martingale.

Proof: We first establish the martingale property for any approximating sequence
$$I^n_t = \int_0^t u^n_s\,dB_s, \quad t \in [0, T],$$
where $u^n$ converges to $u$ in $L^2(\Omega \times [0, T])$. This suffices to prove the Proposition, since $L^2(\Omega)$-limits of martingales are again martingales (this fact follows by applying Jensen's inequality).

Let $u^n_t$, $t \in [0, T]$, be defined by the right-hand side of (3.2). Fix $0 \le s \le t \le T$ and assume that $t_{k-1} \le s \le t_k < t_l < t \le t_{l+1}$. Then
$$I^n_t - I^n_s = u_k(B_{t_k} - B_s) + \sum_{j=k+1}^{l} u_j\left(B_{t_j} - B_{t_{j-1}}\right) + u_{l+1}(B_t - B_{t_l}).$$
Using properties (g) and (f), respectively, of the conditional expectation (see Appendix 1) yields
$$E(I^n_t - I^n_s / \mathcal{F}_s) = E\left(u_k(B_{t_k} - B_s)/\mathcal{F}_s\right) + \sum_{j=k+1}^{l} E\left[E\left(u_j \Delta_j B/\mathcal{F}_{t_{j-1}}\right)/\mathcal{F}_s\right] + E\left[u_{l+1}\, E\left(B_t - B_{t_l}/\mathcal{F}_{t_l}\right)/\mathcal{F}_s\right] = 0.$$
This finishes the proof of the proposition.
A proof not very different from that of Proposition 2.3 yields:

Proposition 3.3 For any process $u \in L^2_{a,T}$ and bounded,
$$L^1(\Omega)\text{-}\lim_{n\to\infty} \sum_{j=1}^{n}\left(\int_{t_{j-1}}^{t_j} u_s\,dB_s\right)^2 = \int_0^t u_s^2\,ds.$$
That means, the quadratic variation of the indefinite stochastic integral is given by the process $\int_0^t u_s^2\,ds$, $t \in [0, T]$.
The isometry property of the stochastic integral can be extended in the following sense. Let $p \in [2, \infty[$. Then,
$$E\left[\left|\int_0^t u_s\,dB_s\right|^p\right] \le C(p)\, E\left[\left(\int_0^t u_s^2\,ds\right)^{\frac{p}{2}}\right]. \qquad (3.8)$$
Here $C(p)$ is a positive constant depending on $p$. This is Burkholder's inequality.
A combination of Burkholder's inequality and Kolmogorov's continuity criterion allows us to deduce the continuity of the sample paths of the indefinite stochastic integral. Indeed, assume that $\int_0^T E|u_r|^p\,dr < \infty$ for any $p \in [2, \infty[$. Using first (3.8) and then Hölder's inequality (be smart!) implies
$$E\left[\left|\int_s^t u_r\,dB_r\right|^p\right] \le C(p)\, E\left[\left(\int_s^t u_r^2\,dr\right)^{\frac{p}{2}}\right] \le C(p)\,|t-s|^{\frac{p}{2}-1}\int_s^t E|u_r|^p\,dr \le C(p)\,|t-s|^{\frac{p}{2}-1}.$$
Since $p \ge 2$ is arbitrary, with Theorem 1.1 we have that the sample paths of $\int_0^t u_s\,dB_s$, $t \in [0, T]$, are Hölder continuous with exponent $\gamma \in \,]0, \frac{1}{2}[$.
3.3 An extension of the Itô integral

In Section 3.1 we have introduced the set $L^2_{a,T}$ and we have defined the stochastic integral of processes of this class with respect to the Brownian motion. In this section we shall consider a larger class of integrands. The notations and underlying filtration are the same as in Section 3.1.

Let $\Lambda^2_{a,T}$ be the set of real-valued processes $u$ adapted to the filtration $(\mathcal{F}_t,\ t \ge 0)$, jointly measurable in $(t, \omega)$ with respect to the product $\sigma$-field $\mathcal{B}([0, T]) \times \mathcal{F}$, and satisfying
$$P\left\{\int_0^T u_t^2\,dt < \infty\right\} = 1. \qquad (3.9)$$
Clearly $L^2_{a,T} \subset \Lambda^2_{a,T}$. Our aim is to define the stochastic integral for processes in $\Lambda^2_{a,T}$. For this we shall follow the same approach as in Section 3.1. Firstly, we start with step processes $(u^n,\ n \ge 1)$ of the form (3.2) belonging to $\Lambda^2_{a,T}$ and define the integral as in (3.3). The extension to processes in $\Lambda^2_{a,T}$ needs two ingredients. The first one is an approximation result that we now state without giving a proof. The reader may consult for instance [1].
Proposition 3.4 Let $u \in \Lambda^2_{a,T}$. There exists a sequence of step processes $(u^n,\ n \ge 1)$ of the form (3.2), belonging to $\Lambda^2_{a,T}$, such that
$$\lim_{n\to\infty} \int_0^T |u^n_t - u_t|^2\,dt = 0, \quad \text{a.s.}$$
The second ingredient gives a connection between stochastic integrals of step processes in $\Lambda^2_{a,T}$ and their quadratic variation, as follows.

Proposition 3.5 Let $u$ be a step process in $\Lambda^2_{a,T}$. Then for any $\epsilon > 0$, $N > 0$,
$$P\left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon\right\} \le P\left\{\int_0^T u_t^2\,dt > N\right\} + \frac{N}{\epsilon^2}. \qquad (3.10)$$
Proof: It is based on a truncation argument. Let $u$ be given by the right-hand side of (3.2) (here it is not necessary to assume that the random variables $u_j$ are in $L^2(\Omega)$). Fix $N > 0$ and define
$$v^N_t = \begin{cases} u_j, & \text{if } t \in [t_{j-1}, t_j[ \text{ and } \sum_{i=1}^{j} u_i^2 (t_i - t_{i-1}) \le N,\\ 0, & \text{if } t \in [t_{j-1}, t_j[ \text{ and } \sum_{i=1}^{j} u_i^2 (t_i - t_{i-1}) > N. \end{cases}$$
The process $\{v^N_t,\ t \in [0, T]\}$ belongs to $L^2_{a,T}$. Indeed, by definition,
$$\int_0^T |v^N_t|^2\,dt \le N.$$
Moreover, if $\int_0^T u_t^2\,dt \le N$, necessarily $u_t = v^N_t$ for any $t \in [0, T]$. Then, by considering the decomposition
$$\left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon\right\} = \left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon,\ \int_0^T u_t^2\,dt > N\right\} \cup \left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon,\ \int_0^T u_t^2\,dt \le N\right\},$$
we obtain
$$P\left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon\right\} \le P\left\{\left|\int_0^T v^N_t\,dB_t\right| > \epsilon\right\} + P\left\{\int_0^T u_t^2\,dt > N\right\}.$$
We finally apply Chebyshev's inequality along with the isometry property of the stochastic integral for processes in $L^2_{a,T}$ and get
$$P\left\{\left|\int_0^T v^N_t\,dB_t\right| > \epsilon\right\} \le \frac{1}{\epsilon^2}\, E\left[\left(\int_0^T v^N_t\,dB_t\right)^2\right] \le \frac{N}{\epsilon^2}.$$
This ends the proof of the result. $\square$
The extension

Fix $u \in \Lambda^2_{a,T}$ and consider a sequence of step processes $(u^n,\ n \ge 1)$ of the form (3.2), belonging to $\Lambda^2_{a,T}$, such that
$$\lim_{n\to\infty} \int_0^T |u^n_t - u_t|^2\,dt = 0, \qquad (3.11)$$
in the convergence in probability.

By Proposition 3.5, for any $\epsilon > 0$, $N > 0$ we have
$$P\left\{\left|\int_0^T (u^n_t - u^m_t)\,dB_t\right| > \epsilon\right\} \le P\left\{\int_0^T (u^n_t - u^m_t)^2\,dt > N\right\} + \frac{N}{\epsilon^2}.$$
Using (3.11), given $\delta > 0$ we can ensure that for any $N > 0$ and $n, m$ big enough,
$$P\left\{\int_0^T (u^n_t - u^m_t)^2\,dt > N\right\} \le \frac{\delta}{2}.$$
Then, we may take $N$ small enough so that $\frac{N}{\epsilon^2} \le \frac{\delta}{2}$. Consequently, we have proved that the sequence of stochastic integrals of step processes
$$\left\{\int_0^T u^n_t\,dB_t,\ n \ge 1\right\}$$
is Cauchy in probability. Since the convergence in probability is metrizable, this sequence does have a limit in probability. Hence, we then define
$$\int_0^T u_t\,dB_t = P\text{-}\lim_{n\to\infty} \int_0^T u^n_t\,dB_t. \qquad (3.12)$$
It is easy to check that this definition is indeed independent of the particular approximating sequence used in the construction.
3.4 A change of variables formula: Itô's formula

As in Example 3.1, we can prove the following formula, valid for any $t \ge 0$:
$$B_t^2 = 2\int_0^t B_s\,dB_s + t. \qquad (3.13)$$
If the sample paths of $\{B_t,\ t \ge 0\}$ were sufficiently smooth (for example, of bounded variation) we would rather have
$$B_t^2 = 2\int_0^t B_s\,dB_s. \qquad (3.14)$$
Why is it so? Consider a decomposition similar to the one given in (3.6), obtained by restricting the time interval to $[0, t]$. More concretely, consider the partition of $[0, t]$ defined by $0 = t_0 \le t_1 \le \cdots \le t_n = t$:
$$B_t^2 = \sum_{j=0}^{n-1}\left(B^2_{t_{j+1}} - B^2_{t_j}\right) = 2\sum_{j=0}^{n-1} B_{t_j}\left(B_{t_{j+1}} - B_{t_j}\right) + \sum_{j=0}^{n-1}\left(B_{t_{j+1}} - B_{t_j}\right)^2, \qquad (3.15)$$
where we have used that $B_0 = 0$.
Consider a sequence of partitions of $[0, t]$ whose mesh tends to zero. We already know that
$$\sum_{j=0}^{n-1}\left(B_{t_{j+1}} - B_{t_j}\right)^2 \to t,$$
in the convergence of $L^2(\Omega)$. This gives the extra contribution in the development of $B_t^2$ in comparison with the classical calculus approach.
Notice that, if $B$ were of bounded variation, then we could argue as follows:
$$\sum_{j=0}^{n-1}\left(B_{t_{j+1}} - B_{t_j}\right)^2 \le \sup_{0\le j\le n-1}\left|B_{t_{j+1}} - B_{t_j}\right| \sum_{j=0}^{n-1}\left|B_{t_{j+1}} - B_{t_j}\right|.$$
By the continuity of the sample paths of the Brownian motion, the first factor on the right-hand side of the preceding inequality tends to zero as the mesh of the partition tends to zero, while the second factor remains finite, by the property of bounded variation.

Summarising: differential calculus with respect to the Brownian motion should take into account second order differential terms. Roughly speaking,
$$(dB_t)^2 = dt.$$
A precise meaning to this formal formula is given in Proposition 2.3.
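The convergence of $\sum_j (B_{t_{j+1}} - B_{t_j})^2$ to $t$ in $L^2(\Omega)$, at rate $2t^2/n$ for the uniform partition, can be observed numerically. A sketch (assuming numpy; parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
t, n_paths = 1.0, 5_000

for n in (10, 100, 1_000):
    dB = rng.normal(0.0, np.sqrt(t / n), size=(n_paths, n))
    qv = (dB ** 2).sum(axis=1)            # sum_j (B_{t_{j+1}} - B_{t_j})^2
    l2_error = ((qv - t) ** 2).mean()     # exact value: 2 t^2 / n
    assert l2_error < 4.0 * t ** 2 / n    # within a factor 2 of the exact rate
```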
3.4.1 One-dimensional Itô formula

In this section, we shall extend the formula (3.13) and write an expression for $f(t, B_t)$, for a class of functions $f$ which includes $f(x) = x^2$.

Definition 3.2 Let $\{v_t,\ t \in [0, T]\}$ be an adapted stochastic process whose sample paths are almost surely Lebesgue integrable, that is,
$$\int_0^T |v_t|\,dt < \infty, \quad \text{a.s.}$$
Consider a stochastic process $\{u_t,\ t \in [0, T]\}$ belonging to $\Lambda^2_{a,T}$ and a random variable $X_0$. The stochastic process defined by
$$X_t = X_0 + \int_0^t u_s\,dB_s + \int_0^t v_s\,ds, \qquad (3.16)$$
$t \in [0, T]$, is termed an Itô process.
An alternative writing of (3.16) in differential form is
$$dX_t = u_t\,dB_t + v_t\,dt.$$
To warm up, we state a particular version of the Itô formula. By $\mathcal{C}^{1,2}$ we denote the set of functions on $[0, T] \times \mathbb{R}$ which are jointly continuous in $(t, x)$, continuously differentiable in $t$ and twice continuously differentiable in $x$, with derivatives jointly continuous.
Theorem 3.1 Let $f : [0, T] \times \mathbb{R} \to \mathbb{R}$ be a function in $\mathcal{C}^{1,2}$ and $X$ be an Itô process with decomposition given in (3.16). The following formula holds true:
$$f(t, X_t) = f(0, X_0) + \int_0^t \partial_s f(s, X_s)\,ds + \int_0^t \partial_x f(s, X_s)\,u_s\,dB_s + \int_0^t \partial_x f(s, X_s)\,v_s\,ds + \frac{1}{2}\int_0^t \partial^2_{xx} f(s, X_s)\,u_s^2\,ds. \qquad (3.17)$$
An idea of the proof. Consider a sequence of partitions of $[0, t]$, for example the one defined by $t^n_j = \frac{jt}{n}$. In the sequel, we avoid mentioning the superscript $n$, for the sake of simplicity. We can write
$$f(t, X_t) - f(0, X_0) = \sum_{j=0}^{n-1}\left[f(t_{j+1}, X_{t_{j+1}}) - f(t_j, X_{t_j})\right] = \sum_{j=0}^{n-1}\left\{\left[f(t_{j+1}, X_{t_j}) - f(t_j, X_{t_j})\right] + \left[f(t_{j+1}, X_{t_{j+1}}) - f(t_{j+1}, X_{t_j})\right]\right\} = \sum_{j=0}^{n-1}\left\{\partial_s f(\bar t_j, X_{t_j})(t_{j+1} - t_j) + \partial_x f(t_{j+1}, X_{t_j})(X_{t_{j+1}} - X_{t_j})\right\} + \frac{1}{2}\sum_{j=0}^{n-1} \partial^2_{xx} f(t_{j+1}, \bar X_j)\left(X_{t_{j+1}} - X_{t_j}\right)^2, \qquad (3.18)$$
with $\bar t_j \in\, ]t_j, t_{j+1}[$ and $\bar X_j$ an intermediate (random) point on the segment determined by $X_{t_j}$ and $X_{t_{j+1}}$.

In fact, this follows from a Taylor expansion of the function $f$ up to the first order in the variable $s$, and up to the second order in the variable $x$. The asymmetry in the orders is due to the existence of the quadratic variation of the processes involved. The expression (3.18) is the analogue of (3.15). The latter is much simpler, for two reasons: firstly, there is no $s$-variable; secondly, $f$ is a polynomial of second degree, and therefore it has an exact Taylor expansion. But both formulas have the same structure.
When passing to the limit as $n \to \infty$, we obtain
$$\sum_{j=0}^{n-1} \partial_s f(\bar t_j, X_{t_j})(t_{j+1} - t_j) \to \int_0^t \partial_s f(s, X_s)\,ds,$$
$$\sum_{j=0}^{n-1} \partial_x f(t_{j+1}, X_{t_j})(X_{t_{j+1}} - X_{t_j}) \to \int_0^t \partial_x f(s, X_s)\,u_s\,dB_s + \int_0^t \partial_x f(s, X_s)\,v_s\,ds,$$
$$\sum_{j=0}^{n-1} \partial^2_{xx} f(t_{j+1}, \bar X_j)\left(X_{t_{j+1}} - X_{t_j}\right)^2 \to \int_0^t \partial^2_{xx} f(s, X_s)\,u_s^2\,ds,$$
in the convergence in probability.
Itô's formula (3.17) can be written in the formal, simple differential form
$$df(t, X_t) = \partial_t f(t, X_t)\,dt + \partial_x f(t, X_t)\,dX_t + \frac{1}{2}\partial^2_{xx} f(t, X_t)(dX_t)^2, \qquad (3.19)$$
where $(dX_t)^2$ is computed using the formal rules of composition
$$dB_t\,dB_t = dt, \quad dB_t\,dt = dt\,dB_t = 0, \quad dt\,dt = 0.$$
Consider in Theorem 3.1 the particular case where $f : \mathbb{R} \to \mathbb{R}$ is a function in $\mathcal{C}^2$ (twice continuously differentiable). Then formula (3.17) becomes
$$f(X_t) = f(X_0) + \int_0^t f'(X_s)\,u_s\,dB_s + \int_0^t f'(X_s)\,v_s\,ds + \frac{1}{2}\int_0^t f''(X_s)\,u_s^2\,ds. \qquad (3.20)$$
Example 3.2 Consider the function
$$f(t, x) = e^{\alpha t - \frac{\sigma^2}{2}t + \sigma x},$$
with $\alpha, \sigma \in \mathbb{R}$. Applying formula (3.17) to $X_t := B_t$ (a Brownian motion) yields
$$f(t, B_t) = 1 + \alpha\int_0^t f(s, B_s)\,ds + \sigma\int_0^t f(s, B_s)\,dB_s.$$
Hence, the process $Y_t = f(t, B_t)$, $t \ge 0$, satisfies the equation
$$Y_t = 1 + \alpha\int_0^t Y_s\,ds + \sigma\int_0^t Y_s\,dB_s.$$
The equivalent differential form of this identity is the linear stochastic differential equation
$$dY_t = \alpha Y_t\,dt + \sigma Y_t\,dB_t, \quad Y_0 = 1. \qquad (3.21)$$
Black and Scholes proposed as a model of a market with a single risky asset with initial value $S_0 = 1$ the process $S_t = Y_t$. We have seen that such a process is in fact the solution to a linear stochastic differential equation (see (3.21)).
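Equation (3.21) also lends itself to a numerical illustration (numerical schemes for stochastic differential equations are treated in Chapter 7): an Euler scheme applied to (3.21) can be compared path by path against the exact solution $Y_t = f(t, B_t)$. A sketch (assuming numpy; $\alpha$, $\sigma$ and the discretisation parameters are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, sigma = 0.1, 0.3            # illustrative drift and volatility
T, n, n_paths = 1.0, 1_000, 10_000
dt = T / n

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B_T = dB.sum(axis=1)

# Euler scheme for dY = alpha*Y dt + sigma*Y dB, Y_0 = 1.
Y = np.ones(n_paths)
for j in range(n):
    Y = Y + alpha * Y * dt + sigma * Y * dB[:, j]

# Exact solution from Example 3.2: Y_T = exp((alpha - sigma^2/2) T + sigma B_T).
Y_exact = np.exp((alpha - 0.5 * sigma ** 2) * T + sigma * B_T)

# The strong error of the Euler scheme is of order sqrt(dt).
assert np.mean(np.abs(Y - Y_exact)) < 0.05
```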
3.4.2 Multidimensional version of Itô's formula

Consider an $m$-dimensional Brownian motion $\{(B^1_t, \ldots, B^m_t),\ t \ge 0\}$ and $p$ real-valued Itô processes, as follows:
$$dX^i_t = \sum_{l=1}^{m} u^{i,l}_t\,dB^l_t + v^i_t\,dt, \qquad (3.22)$$
$i = 1, \ldots, p$. We assume that each one of the processes $u^{i,l}$ belongs to $\Lambda^2_{a,T}$ and that $\int_0^T |v^i_t|\,dt < \infty$, a.s. Following a similar plan as for Theorem 3.1, we will prove the following:
Theorem 3.2 Let $f : [0, \infty) \times \mathbb{R}^p \to \mathbb{R}$ be a function of class $\mathcal{C}^{1,2}$ and $X = (X^1, \ldots, X^p)$ be given by (3.22). Then
$$f(t, X_t) = f(0, X_0) + \int_0^t \partial_s f(s, X_s)\,ds + \sum_{k=1}^{p} \int_0^t \partial_{x_k} f(s, X_s)\,dX^k_s + \frac{1}{2}\sum_{k,l=1}^{p} \int_0^t \partial^2_{x_k,x_l} f(s, X_s)\,dX^k_s\,dX^l_s, \qquad (3.23)$$
where, in order to compute $dX^k_s\,dX^l_s$, we have to apply the following rules:
$$dB^k_s\,dB^l_s = \delta_{k,l}\,ds, \qquad dB^k_s\,ds = 0, \qquad (ds)^2 = 0, \qquad (3.24)$$
where $\delta_{k,l}$ denotes the Kronecker symbol.

We remark that the identity (3.24) is a consequence of the independence of the components of the Brownian motion.
Example 3.3 Consider the particular case $m = 1$, $p = 2$ and $f(x, y) = xy$. That is, $f$ does not depend on $t$ and we have denoted a generic point of $\mathbb{R}^2$ by $(x, y)$. Then the above formula (3.23) yields
$$X^1_t X^2_t = X^1_0 X^2_0 + \int_0^t X^1_s\,dX^2_s + \int_0^t X^2_s\,dX^1_s + \int_0^t u^1_s u^2_s\,ds. \qquad (3.25)$$
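Formula (3.25) can be checked pathwise in a simple case where the correction term vanishes: take $X^1 = B$ (so $u^1 \equiv 1$, $v^1 \equiv 0$) and $X^2_t = t$ (so $u^2 \equiv 0$, $v^2 \equiv 1$), which gives $t B_t = \int_0^t s\,dB_s + \int_0^t B_s\,ds$. A sketch (assuming numpy; parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
T, n, n_paths = 1.0, 2_000, 1_000
dt = T / n
t_grid = np.linspace(0.0, T, n + 1)

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)])

int_s_dB = (t_grid[:-1] * dB).sum(axis=1)   # int_0^T s dB_s (left-point sums)
int_B_ds = (B[:, :-1] * dt).sum(axis=1)     # int_0^T B_s ds
lhs = T * B[:, -1]                          # X^1_T X^2_T = T B_T

# No cross-variation term appears because u^2 = 0 in (3.25).
assert np.max(np.abs(lhs - (int_s_dB + int_B_ds))) < 0.1
```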
Proof of Theorem 3.2: Let $\Pi_n = \{0 = t^n_0 < \cdots < t^n_{p_n} = t\}$ be a sequence of increasing partitions such that $\lim_{n\to\infty}|\Pi_n| = 0$. First, we consider the decomposition
$$f(t, X_t) - f(0, X_0) = \sum_{i=0}^{p_n-1}\left[f(t^n_{i+1}, X_{t^n_{i+1}}) - f(t^n_i, X_{t^n_i})\right],$$
and then Taylor's formula for each term in the last sum. Writing $X_{t^n_i} = (X^1_{t^n_i}, \ldots, X^p_{t^n_i})$, we obtain
$$f(t^n_{i+1}, X_{t^n_{i+1}}) - f(t^n_i, X_{t^n_i}) = \partial_s f(\bar t^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) + \sum_{k=1}^{p} \partial_{x_k} f\left(t^n_i, X_{t^n_i}\right)\left(X^k_{t^n_{i+1}} - X^k_{t^n_i}\right) + \sum_{k,l=1}^{p} \frac{\bar f^{k,l}_{n,i}}{2}\left(X^k_{t^n_{i+1}} - X^k_{t^n_i}\right)\left(X^l_{t^n_{i+1}} - X^l_{t^n_i}\right),$$
with $\bar t^n_i \in\, ]t^n_i, t^n_{i+1}[$ and
$$\inf_{\lambda\in[0,1]} \partial^2_{x_k,x_l} f\left(t^n_i, X_{t^n_i} + \lambda\left(X_{t^n_{i+1}} - X_{t^n_i}\right)\right) \le \bar f^{k,l}_{n,i} \le \sup_{\lambda\in[0,1]} \partial^2_{x_k,x_l} f\left(t^n_i, X_{t^n_i} + \lambda\left(X_{t^n_{i+1}} - X_{t^n_i}\right)\right).$$
We now proceed to prove the convergence of each contribution to the sum. In order to simplify the arguments, we shall impose more restrictive assumptions on the function $f$ and the processes $u^{i,l}$. Notice that the process $X$ has continuous sample paths, a.s.
First term
$$\sum_{i=0}^{p_n-1} \partial_s f(\bar t^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) \to \int_0^t \partial_s f(s, X_s)\,ds, \qquad (3.26)$$
a.s. Indeed,
$$\left|\sum_{i=0}^{p_n-1} \partial_s f(\bar t^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) - \int_0^t \partial_s f(s, X_s)\,ds\right| \le \sum_{i=0}^{p_n-1}\int_{t^n_i}^{t^n_{i+1}}\left|\partial_s f(\bar t^n_i, X_{t^n_i}) - \partial_s f(s, X_s)\right|ds \le t \sup_{0\le i\le p_n-1}\,\sup_{s\in[t^n_i, t^n_{i+1}]}\left|\partial_s f(\bar t^n_i, X_{t^n_i}) - \partial_s f(s, X_s)\right|.$$
The continuity of $\partial_s f$ along with that of the process $X$ implies
$$\lim_{n\to\infty} \sup_{0\le i\le p_n-1}\,\sup_{s\in[t^n_i, t^n_{i+1}]}\left|\partial_s f(\bar t^n_i, X_{t^n_i}) - \partial_s f(s, X_s)\right| = 0.$$
This gives (3.26).
Second term

Fix $k = 1, \ldots, p$. We next prove
$$\sum_{i=0}^{p_n-1} \partial_{x_k} f\left(t^n_i, X_{t^n_i}\right)\left(X^k_{t^n_{i+1}} - X^k_{t^n_i}\right) \to \int_0^t \partial_{x_k} f(s, X_s)\,dX^k_s \qquad (3.27)$$
in probability, which by (3.22) amounts to checking two convergences:
$$\sum_{i=0}^{p_n-1} \partial_{x_k} f\left(t^n_i, X_{t^n_i}\right)\int_{t^n_i}^{t^n_{i+1}} u^{k,l}_t\,dB^l_t \to \int_0^t \partial_{x_k} f(s, X_s)\,u^{k,l}_s\,dB^l_s, \qquad (3.28)$$
$$\sum_{i=0}^{p_n-1} \partial_{x_k} f\left(t^n_i, X_{t^n_i}\right)\int_{t^n_i}^{t^n_{i+1}} v^k_t\,dt \to \int_0^t \partial_{x_k} f(s, X_s)\,v^k_s\,ds, \qquad (3.29)$$
for any $l = 1, \ldots, m$.
We start with (3.28). Assume that $\partial_{x_k} f$ is bounded and that the processes $u^{k,l}$ are in $L^2_{a,T}$. Then
$$E\left|\sum_{i=0}^{p_n-1} \partial_{x_k} f\left(t^n_i, X_{t^n_i}\right)\int_{t^n_i}^{t^n_{i+1}} u^{k,l}_t\,dB^l_t - \int_0^t \partial_{x_k} f(s, X_s)\,u^{k,l}_s\,dB^l_s\right|^2 = E\left|\sum_{i=0}^{p_n-1}\int_{t^n_i}^{t^n_{i+1}}\left[\partial_{x_k} f\left(t^n_i, X_{t^n_i}\right) - \partial_{x_k} f(s, X_s)\right]u^{k,l}_s\,dB^l_s\right|^2$$
$$= \sum_{i=0}^{p_n-1} E\left|\int_{t^n_i}^{t^n_{i+1}}\left[\partial_{x_k} f\left(t^n_i, X_{t^n_i}\right) - \partial_{x_k} f(s, X_s)\right]u^{k,l}_s\,dB^l_s\right|^2 = \sum_{i=0}^{p_n-1}\int_{t^n_i}^{t^n_{i+1}} E\left[\left(\partial_{x_k} f\left(t^n_i, X_{t^n_i}\right) - \partial_{x_k} f(s, X_s)\right)u^{k,l}_s\right]^2 ds,$$
where we have applied successively that the stochastic integrals on disjoint intervals are independent of each other and that they are centered random variables, along with the isometry property.
By the continuity of $\partial_{x_k} f$ and of the process $X$,
$$\sup_{0\le i\le p_n-1}\left|\left(\partial_{x_k} f\left(t^n_i, X_{t^n_i}\right) - \partial_{x_k} f(s, X_s)\right)\mathbf{1}_{[t^n_i, t^n_{i+1}]}(s)\right| \to 0, \qquad (3.30)$$
a.s. Then, by bounded convergence,
$$\sup_{0\le i\le p_n-1} E\left|\left(\partial_{x_k} f\left(t^n_i, X_{t^n_i}\right) - \partial_{x_k} f(s, X_s)\right)u^{k,l}_s\, \mathbf{1}_{[t^n_i, t^n_{i+1}]}(s)\right|^2 \to 0,$$
and hence we get (3.28) in the convergence of $L^2(\Omega)$.
For the proof of (3.29) we also assume that $\partial_{x_k} f$ is bounded. Then,
$$\left|\sum_{i=0}^{p_n-1} \partial_{x_k} f\left(t^n_i, X_{t^n_i}\right)\int_{t^n_i}^{t^n_{i+1}} v^k_t\,dt - \int_0^t \partial_{x_k} f(s, X_s)\,v^k_s\,ds\right| = \left|\sum_{i=0}^{p_n-1}\int_{t^n_i}^{t^n_{i+1}}\left[\partial_{x_k} f\left(t^n_i, X_{t^n_i}\right) - \partial_{x_k} f(s, X_s)\right]v^k_s\,ds\right| \le \sum_{i=0}^{p_n-1}\int_{t^n_i}^{t^n_{i+1}}\left|\partial_{x_k} f\left(t^n_i, X_{t^n_i}\right) - \partial_{x_k} f(s, X_s)\right| |v^k_s|\,ds.$$
By virtue of (3.30) and bounded convergence, we obtain (3.29) in the a.s. convergence.
Third term

Given the notation (3.24), it suffices to prove that
$$\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left(X^k_{t^n_{i+1}} - X^k_{t^n_i}\right)\left(X^l_{t^n_{i+1}} - X^l_{t^n_i}\right) \to \sum_{j=1}^{m}\int_0^t \partial^2_{x_k,x_l} f(s, X_s)\,u^{k,j}_s u^{l,j}_s\,ds, \qquad (3.31)$$
which will be a consequence of the following convergences:
$$\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\,dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j'}_s\,dB^{j'}_s\right) \to \delta_{j,j'}\int_0^t \partial^2_{x_k,x_l} f(s, X_s)\,u^{k,j}_s u^{l,j'}_s\,ds, \qquad (3.32)$$
$$\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\,dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\,ds\right) \to 0, \qquad (3.33)$$
$$\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left(\int_{t^n_i}^{t^n_{i+1}} v^k_s\,ds\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\,ds\right) \to 0. \qquad (3.34)$$
Let us start by arguing on (3.32). We assume that $\partial^2_{x_k,x_l} f$ is bounded and that $u^{l,j} \in L^2_{a,T}$ for any $l = 1, \ldots, p$, $j = 1, \ldots, m$. Suppose that $j \ne j'$. Then the convergence to zero follows from the fact that the stochastic integrals
$$\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\,dB^j_s, \qquad \int_{t^n_i}^{t^n_{i+1}} u^{l,j'}_s\,dB^{j'}_s,$$
are independent.
Assume $j = j'$. Then,
$$E\left|\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\,dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j}_s\,dB^j_s\right) - \int_0^t \partial^2_{x_k,x_l} f(s, X_s)\,u^{k,j}_s u^{l,j}_s\,ds\right| \le T_1 + T_2,$$
with
$$T_1 = E\left|\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left[\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\,dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j}_s\,dB^j_s\right) - \int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s u^{l,j}_s\,ds\right]\right|,$$
$$T_2 = E\left|\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s u^{l,j}_s\,ds - \int_0^t \partial^2_{x_k,x_l} f(s, X_s)\,u^{k,j}_s u^{l,j}_s\,ds\right|.$$
Since $\partial^2_{x_k,x_l} f$ is bounded,
$$T_1 \le C\, E\sum_{i=0}^{p_n-1}\left|\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\,dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j}_s\,dB^j_s\right) - \int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s u^{l,j}_s\,ds\right|.$$
This tends to zero as $n \to \infty$ (see Proposition 3.3).
As for $T_2$, we have
$$T_2 \le E\left[\sup_i\left|\left(\bar f^{k,l}_{n,i} - \partial^2_{x_k,x_l} f(s, X_s)\right)\mathbf{1}_{]t^n_i, t^n_{i+1}]}(s)\right| \int_0^t \left|u^{k,j}_s u^{l,j}_s\right| ds\right].$$
By continuity,
$$\lim_{n\to\infty} \sup_i\left|\left(\bar f^{k,l}_{n,i} - \partial^2_{x_k,x_l} f(s, X_s)\right)\mathbf{1}_{]t^n_i, t^n_{i+1}]}(s)\right| = 0.$$
Thus, by bounded convergence, we obtain $\lim_{n\to\infty} T_2 = 0$. Hence we have proved (3.32) in $L^1(\Omega)$.
Next we prove (3.33) in $L^1(\Omega)$, assuming that $\partial^2_{x_k,x_l} f$ is bounded and, additionally, that the processes $u^{k,j} \in L^2_{a,T}$ and $v^l \in L^2(\Omega \times [0, T])$. In this case,
$$E\left|\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\,dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\,ds\right)\right| \le C\sum_{i=0}^{p_n-1} E\left[\left|\int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s\right| \int_{t^n_i}^{t^n_{i+1}} |v_s|\,ds\right]$$
$$\le C\sum_{i=0}^{p_n-1}\left(E\int_{t^n_i}^{t^n_{i+1}} u_s^2\,ds\right)^{\frac12}\left(E\left(\int_{t^n_i}^{t^n_{i+1}} |v_s|\,ds\right)^2\right)^{\frac12} \le C\sum_{i=0}^{p_n-1}\left|t^n_{i+1} - t^n_i\right|^{\frac12}\left(\int_{t^n_i}^{t^n_{i+1}} E|u_s|^2\,ds\right)^{\frac12}\left(E\int_{t^n_i}^{t^n_{i+1}} |v_s|^2\,ds\right)^{\frac12}$$
$$\le C\sup_i\left|t^n_{i+1} - t^n_i\right|^{\frac12}\left(\sum_{i=0}^{p_n-1}\int_{t^n_i}^{t^n_{i+1}} E|u_s|^2\,ds\right)^{\frac12}\left(\sum_{i=0}^{p_n-1}\int_{t^n_i}^{t^n_{i+1}} E|v_s|^2\,ds\right)^{\frac12},$$
which tends to zero as $n \to \infty$. Starting from the second term, we have omitted the superscripts, for the sake of simplicity.
The proof of (3.34) is easier. Indeed, assuming that $\partial^2_{x_k,x_l} f$ is bounded,
$$\left|\sum_{i=0}^{p_n-1} \bar f^{k,l}_{n,i}\left(\int_{t^n_i}^{t^n_{i+1}} v^k_s\,ds\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\,ds\right)\right| \le C\sup_i\left(\int_{t^n_i}^{t^n_{i+1}} |v^l_s|\,ds\right)\int_0^t |v^k_s|\,ds.$$
The first factor tends to zero as $n \to \infty$, while the second one is bounded, a.s.

The proof of the theorem is now complete. $\square$
4 Applications of the Itô formula

This chapter is devoted to giving some important results that use the Itô formula in some parts of their proofs.
4.1 Burkholder-Davis-Gundy inequalities

Theorem 4.1 Let $u \in L^2_{a,T}$ and set $M_t = \int_0^t u_s\,dB_s$. Define
$$M^*_t = \sup_{s\in[0,t]} |M_s|.$$
Then, for any $p > 0$, there exist two positive constants $c_p$, $C_p$ such that
$$c_p\, E\left[\left(\int_0^T u_s^2\,ds\right)^{\frac{p}{2}}\right] \le E\left[(M^*_T)^p\right] \le C_p\, E\left[\left(\int_0^T u_s^2\,ds\right)^{\frac{p}{2}}\right]. \qquad (4.1)$$
Proof: We will only prove here the right-hand side of (4.1) for $p \ge 2$. Consider the function $f(x) = |x|^p$, for which we have
$$f'(x) = p|x|^{p-1}\operatorname{sign}(x), \qquad f''(x) = p(p-1)|x|^{p-2}.$$
Then, according to (3.20), we obtain
$$|M_t|^p = \int_0^t p|M_s|^{p-1}\operatorname{sign}(M_s)\,u_s\,dB_s + \frac{1}{2}\int_0^t p(p-1)|M_s|^{p-2}\,u_s^2\,ds.$$
Applying the expectation operator to both terms of the above identity yields
$$E(|M_t|^p) = \frac{p(p-1)}{2}\, E\left[\int_0^t |M_s|^{p-2}\,u_s^2\,ds\right].$$
We next apply Hölder's inequality to the expectation, with exponents $\frac{p}{p-2}$ and $q = \frac{p}{2}$, and get
$$E\left[\int_0^t |M_s|^{p-2}\,u_s^2\,ds\right] \le E\left[(M^*_t)^{p-2}\int_0^t u_s^2\,ds\right] \le \left[E\,(M^*_t)^p\right]^{\frac{p-2}{p}}\left[E\left(\int_0^t u_s^2\,ds\right)^{\frac{p}{2}}\right]^{\frac{2}{p}}.$$
Doob's inequality (see Theorem 8.1) implies
$$E\left[(M^*_t)^p\right] \le \left(\frac{p}{p-1}\right)^p E(|M_t|^p).$$
Hence,
$$E\left[(M^*_t)^p\right] \le \left[\left(\frac{p}{p-1}\right)^p \frac{p(p-1)}{2}\right]^{\frac{p}{2}} E\left[\left(\int_0^t u_s^2\,ds\right)^{\frac{p}{2}}\right].$$
This ends the proof of the upper bound. $\square$
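For $p = 2$ and $u \equiv 1$ (so $M = B$ and $\int_0^T u_s^2\,ds = T$ deterministically), the upper bound can be tested against Doob's constant $\left(\frac{p}{p-1}\right)^p = 4$. A Monte Carlo sketch (assuming numpy; parameters illustrative; the simulated value of $E[(M^*_T)^2]$, around $1.8$, sits comfortably between $E(B_T^2) = T$ and $4T$):

```python
import numpy as np

rng = np.random.default_rng(6)
T, n, n_paths = 1.0, 1_000, 10_000
dt = T / n

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B = np.cumsum(dB, axis=1)

# M_t = int_0^t 1 dB_s = B_t, so the bracket int_0^T u_s^2 ds equals T.
M_star = np.abs(B).max(axis=1)   # M*_T evaluated on the discrete grid
lhs = (M_star ** 2).mean()       # estimate of E[(M*_T)^2]

# Doob's L^2 inequality gives an admissible constant C_2 = 4 in (4.1);
# trivially E[(M*_T)^2] >= E(B_T^2) as well.
assert (B[:, -1] ** 2).mean() <= lhs <= 4.0 * T
```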
4.2 Representation of $L^2$ Brownian functionals

We already know that for any process $u \in L^2_{a,T}$, the stochastic integral process $\{\int_0^t u_s\,dB_s,\ t \in [0, T]\}$ is a martingale. The next result is a kind of converse statement. In the proof we shall use a technical ingredient that we state without giving a proof.

In the sequel we denote by $\mathcal{F}_T$ the $\sigma$-field generated by $(B_t,\ 0 \le t \le T)$.

Lemma 4.1 The vector space generated by the random variables
$$\exp\left(\int_0^T f(t)\,dB_t - \frac{1}{2}\int_0^T f^2(t)\,dt\right),$$
$f \in L^2([0, T])$, is dense in $L^2(\Omega, \mathcal{F}_T, P)$.
Theorem 4.2 Let $Z\in L^2(\Omega,\mathcal F_T)$. There exists a unique process $h\in L^2_{a,T}$ such that
$$
Z=E(Z)+\int_0^T h_s\,dB_s. \qquad (4.2)
$$
Hence, for any martingale $M$ bounded in $L^2$, there exist a unique process $h\in L^2_{a,T}$ and a constant $C$ such that
$$
M_t=C+\int_0^t h_s\,dB_s. \qquad (4.3)
$$
Proof: We start with the proof of (4.2). Let $\mathcal H$ be the vector space consisting of random variables $Z\in L^2(\Omega,\mathcal F_T)$ such that (4.2) holds. Firstly, we argue the uniqueness of $h$. This is an easy consequence of the isometry of the stochastic integral. Indeed, if there were two processes $h$ and $h'$ satisfying (4.2), then
$$
E\Big(\int_0^T (h_s-h'_s)^2\,ds\Big)=E\Big(\int_0^T (h_s-h'_s)\,dB_s\Big)^2=0.
$$
This yields $h=h'$ in $L^2([0,T]\times\Omega)$.
We now turn to the existence of $h$. Any $Z\in\mathcal H$ satisfies
$$
E(Z^2)=(E(Z))^2+E\Big(\int_0^T h_s^2\,ds\Big).
$$
From this it follows that if $(Z_n,\ n\ge1)$ is a sequence of elements of $\mathcal H$ converging to $Z$ in $L^2(\Omega,\mathcal F_T)$, then the sequence $(h^n,\ n\ge1)$ corresponding to the representations is Cauchy in $L^2_{a,T}$. Denoting by $h$ the limit, we have
$$
Z=E(Z)+\int_0^T h_s\,dB_s.
$$
Hence $\mathcal H$ is closed in $L^2(\Omega,\mathcal F_T)$.
For any $f\in L^2([0,T])$, set
$$
\mathcal E^f_t=\exp\Big(\int_0^t f_s\,dB_s-\frac12\int_0^t f_s^2\,ds\Big).
$$
By the Itô formula,
$$
\mathcal E^f_t=1+\int_0^t \mathcal E^f_s\,f(s)\,dB_s.
$$
Consequently, the representation holds for $Z:=\mathcal E^f_T$, and also any linear combination of such random variables belongs to $\mathcal H$. The conclusion follows from Lemma 4.1.
The representation (4.3) is a consequence of the following fact: the martingale $M$ converges in $L^2(\Omega)$ as $t\to\infty$ to a random variable $M_\infty\in L^2(\Omega)$, and
$$
M_t=E(M_\infty|\mathcal F_t).
$$
Then, by taking conditional expectations in both terms of (4.2) applied to $Z:=M_\infty$, we obtain
$$
M_t=E(M_\infty|\mathcal F_t)=E(M_\infty)+\int_0^t h_s\,dB_s.
$$
The proof of the theorem is now complete. $\square$
Example 4.1 Consider $Z=B_T^3$. In order to find the corresponding process $h$ in the integral representation, we first apply Itô's formula, yielding
$$
B_T^3=\int_0^T 3B_t^2\,dB_t+3\int_0^T B_t\,dt.
$$
An integration by parts gives
$$
\int_0^T B_t\,dt=TB_T-\int_0^T t\,dB_t.
$$
Thus,
$$
B_T^3=\int_0^T 3\big(B_t^2+T-t\big)\,dB_t.
$$
Notice that $E(B_T^3)=0$. Then $h_t=3\big[B_t^2+T-t\big]$.
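The representation in Example 4.1 can be checked pathwise by simulation. The following sketch (not part of the original notes; plain Python) approximates the Itô integral of $h_t=3(B_t^2+T-t)$ by left-point Riemann sums on a single Brownian path and compares it with $B_T^3$.

```python
import random, math

def check_representation(T=1.0, n_steps=1000, rng=random.Random(1)):
    # One Brownian path; compare B_T^3 with the Ito sum of h_t = 3(B_t^2 + T - t).
    dt = T / n_steps
    b, integral = 0.0, 0.0
    for i in range(n_steps):
        t = i * dt
        db = rng.gauss(0.0, math.sqrt(dt))
        integral += 3.0 * (b * b + T - t) * db   # h evaluated at the left endpoint
        b += db
    return b ** 3, integral

cube, ito_sum = check_representation()
print(cube, ito_sum)   # close pathwise, up to discretization error
```

The two quantities agree up to an error of order $\sqrt{\Delta t}$; refining the grid shrinks the gap.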
4.3 Girsanov's theorem

It is well known that if $X$ is a multidimensional Gaussian random variable, any affine transformation maps $X$ into a multidimensional Gaussian random variable as well. The simplest version of Girsanov's theorem extends this result to a Brownian motion. Before giving the precise statement and the proof, let us introduce some preliminaries.

Lemma 4.2 Let $L$ be a positive random variable ($L>0$ a.s.) such that $E(L)=1$. Set
$$
Q(A)=E(\mathbf 1_A L),\quad A\in\mathcal F. \qquad (4.4)
$$
Then $Q$ defines a probability on $\mathcal F$, equivalent to $P$, with density given by $L$. Reciprocally, if $P$ and $Q$ are two probabilities on $\mathcal F$ and $P\sim Q$, then there exists a positive random variable $L$ such that $E(L)=1$ and (4.4) holds.

Proof: It is clear that $Q$ defines a $\sigma$-additive function on $\mathcal F$. Moreover, since
$$
Q(\Omega)=E(\mathbf 1_\Omega L)=E(L)=1,
$$
$Q$ is indeed a probability.
Let $A\in\mathcal F$ be such that $Q(A)=0$. Since $L>0$ a.s., we must have $P(A)=0$. Reciprocally, for any $A\in\mathcal F$ with $P(A)=0$, we have $Q(A)=0$ as well.
The second assertion of the lemma is the Radon–Nikodym theorem. $\square$
49
If we denote by E
Q
the expectation operator with respect to the probability
Q dened before, one has
E
Q
(X) = E(XL).
Indeed, this formula is easily checked for simple random variables and then
extended to any random variable X L
1
() by the usual approximation
argument.
Consider now a Brownian motion B
t
, t [0, T]. Fix R and let
L
t
= exp
_
B
t


2
2
t
_
. (4.5)
It os formula yield
L
t
= 1
_
t
0
L
s
dB
s
.
Hence, the process L
t
, t [0, T] is a positive martingale and E(L
t
) = 1,
for any t [0, T]. Set
Q(A) = E (1 1
A
L
T
) , A T
T
. (4.6)
By Lemma 4.2, the probability Q is equivalent to P on the -eld T
T
.
By the martingale property of L
t
, t [0, T], the same conclusion is true on
T
t
, for any t [0, T]. Indeed, let A T
t
, then
Q(A) = E (1 1
A
L
T
) = E (E (1 1
A
L
T
[T
t
))
= E (1 1
A
E (L
T
[T
t
))
= E (1 1
A
L
t
) .
Next, we give a technical result.

Lemma 4.3 Let $X$ be a random variable and let $\mathcal G$ be a sub-$\sigma$-field of $\mathcal F$ such that
$$
E\big(e^{iuX}\,\big|\,\mathcal G\big)=e^{-\frac{u^2\sigma^2}2}.
$$
Then the random variable $X$ is independent of the $\sigma$-field $\mathcal G$ and its probability law is Gaussian, with zero mean and variance $\sigma^2$.

Proof: By the definition of the conditional expectation, for any $A\in\mathcal G$,
$$
E\big(\mathbf 1_A e^{iuX}\big)=P(A)\,e^{-\frac{u^2\sigma^2}2}.
$$
In particular, for $A:=\Omega$, we see that the characteristic function of $X$ is that of a $N(0,\sigma^2)$. This proves the last assertion.
Moreover, for any $A\in\mathcal G$ with $P(A)>0$,
$$
E_A\big(e^{iuX}\big)=e^{-\frac{u^2\sigma^2}2},
$$
saying that the law of $X$ conditionally on $A$ is also $N(0,\sigma^2)$. Thus,
$$
P\big(\{X\le x\}\cap A\big)=P(A)\,P_A(X\le x)=P(A)\,P(X\le x),
$$
yielding the independence of $X$ and $\mathcal G$. $\square$
Theorem 4.3 (Girsanov's theorem) Let $\lambda\in\mathbb R$ and set
$$
W_t=B_t+\lambda t.
$$
In the probability space $(\Omega,\mathcal F_T,Q)$, with $Q$ given in (4.6), the process $\{W_t,\ t\in[0,T]\}$ is a standard Brownian motion.

Proof: We will check that in the probability space $(\Omega,\mathcal F_T,Q)$, any increment $W_t-W_s$, $0\le s<t\le T$, is independent of $\mathcal F_s$ and has the $N(0,t-s)$ distribution. That is, for any $A\in\mathcal F_s$,
$$
E_Q\big(e^{iu(W_t-W_s)}\mathbf 1_A\big)=Q(A)\,e^{-\frac{u^2}2(t-s)}.
$$
The conclusion will then follow from Lemma 4.3.
Indeed, writing
$$
L_t=\exp\Big(-\lambda(B_t-B_s)-\frac{\lambda^2}2(t-s)\Big)\exp\Big(-\lambda B_s-\frac{\lambda^2}2\,s\Big),
$$
we have
$$
E_Q\big(e^{iu(W_t-W_s)}\mathbf 1_A\big)=E\big(\mathbf 1_A e^{iu(W_t-W_s)}L_t\big)
=E\Big(\mathbf 1_A\,e^{iu(B_t-B_s)+iu\lambda(t-s)-\lambda(B_t-B_s)-\frac{\lambda^2}2(t-s)}\,L_s\Big).
$$
Since $B_t-B_s$ is independent of $\mathcal F_s$, the last expression is equal to
$$
E\left(\mathbf 1_A L_s\right)\,E\big(e^{(iu-\lambda)(B_t-B_s)}\big)\,e^{iu\lambda(t-s)-\frac{\lambda^2}2(t-s)}
=Q(A)\,e^{\frac{(iu-\lambda)^2}2(t-s)+iu\lambda(t-s)-\frac{\lambda^2}2(t-s)}
=Q(A)\,e^{-\frac{u^2}2(t-s)}.
$$
The proof is now complete.
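A quick Monte Carlo illustration of Theorem 4.3 (not part of the original notes; plain Python): weighting by the density $L_T=\exp(-\lambda B_T-\lambda^2T/2)$ of (4.5)–(4.6), the first two $Q$-moments of $W_T=B_T+\lambda T$ should be those of a $N(0,T)$ variable, by the formula $E_Q(X)=E(XL_T)$.

```python
import random, math

def girsanov_check(lmbda=0.7, t=1.0, n=20000, rng=random.Random(2)):
    # Under Q with density L_t = exp(-lmbda*B_t - lmbda^2 t / 2),
    # W_t = B_t + lmbda*t should be N(0, t): estimate its first two Q-moments.
    m1 = m2 = 0.0
    for _ in range(n):
        b = rng.gauss(0.0, math.sqrt(t))
        L = math.exp(-lmbda * b - 0.5 * lmbda ** 2 * t)
        w = b + lmbda * t
        m1 += w * L
        m2 += w * w * L
    return m1 / n, m2 / n

mean_Q, second_Q = girsanov_check()
print(mean_Q, second_Q)   # approximately 0 and t = 1
```

Only the endpoint $B_T$ is needed here, since both $W_T$ and $L_T$ are functions of it.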

5 Local time of Brownian motion and Tanaka's formula

This chapter deals with a very particular extension of Itô's formula. More precisely, we would like to have a decomposition of the positive submartingale $|B_t-x|$, for some fixed $x\in\mathbb R$, as in the Itô formula. Notice that the function $f(y)=|y-x|$ does not belong to $\mathcal C^2(\mathbb R)$. A natural way to proceed is to regularize the function $f$, for instance by convolution with an approximation of the identity, and then pass to the limit. Assuming that this is feasible, the question of identifying the limit involving the second order derivative remains open. This leads us to a process termed the local time of $B$ at $x$, introduced by Paul Lévy.

Definition 5.1 Let $B=\{B_t,\ t\ge0\}$ be a Brownian motion and let $x\in\mathbb R$. The local time of $B$ at $x$ is defined as the stochastic process
$$
L(t,x)=\lim_{\varepsilon\downarrow0}\frac1{2\varepsilon}\int_0^t \mathbf 1_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds
=\lim_{\varepsilon\downarrow0}\frac1{2\varepsilon}\,\lambda\{s\in[0,t]:B_s\in(x-\varepsilon,x+\varepsilon)\}, \qquad (5.1)
$$
where $\lambda$ denotes the Lebesgue measure on $\mathbb R$.

We see that $L(t,x)$ measures the time spent by the process $B$ at $x$ during a period of time of length $t$; actually, it is the density of this occupation time. We shall see later that the above limit exists in $L^2$ (it also exists a.s.), a fact that is not obvious at all.
Local time enters naturally in the extension of the Itô formula alluded to before. In fact, we have the following result.

Theorem 5.1 For any $t\ge0$ and $x\in\mathbb R$, a.s.,
$$
(B_t-x)^+=(B_0-x)^+ +\int_0^t \mathbf 1_{[x,\infty)}(B_s)\,dB_s+\frac12 L(t,x), \qquad (5.2)
$$
where $L(t,x)$ is given by (5.1), with the limit in the sense of $L^2$ convergence.

Proof: The heuristics of formula (5.2) is the following. In the sense of distributions, $f(y)=(y-x)^+$ has first and second order derivatives $f'(y)=\mathbf 1_{[x,\infty)}(y)$ and $f''(y)=\delta_x(y)$, respectively, where $\delta_x$ denotes the Dirac delta measure. Hence we expect a formula like
$$
(B_t-x)^+=(B_0-x)^+ +\int_0^t \mathbf 1_{[x,\infty)}(B_s)\,dB_s+\frac12\int_0^t \delta_x(B_s)\,ds.
$$
However, we have to give a meaning to the last integral.

Approximation procedure

We are going to approximate the function $f(y)=(y-x)^+$. For this, we fix $\varepsilon>0$ and define
$$
f_{x,\varepsilon}(y)=\begin{cases}0,&\text{if } y\le x-\varepsilon,\\[2pt] \dfrac{(y-x+\varepsilon)^2}{4\varepsilon},&\text{if } x-\varepsilon\le y\le x+\varepsilon,\\[2pt] y-x,&\text{if } y\ge x+\varepsilon,\end{cases}
$$
which clearly has derivatives
$$
f'_{x,\varepsilon}(y)=\begin{cases}0,&\text{if } y\le x-\varepsilon,\\[2pt] \dfrac{y-x+\varepsilon}{2\varepsilon},&\text{if } x-\varepsilon\le y\le x+\varepsilon,\\[2pt] 1,&\text{if } y\ge x+\varepsilon,\end{cases}
\qquad
f''_{x,\varepsilon}(y)=\begin{cases}0,&\text{if } y<x-\varepsilon,\\[2pt] \dfrac1{2\varepsilon},&\text{if } x-\varepsilon<y<x+\varepsilon,\\[2pt] 0,&\text{if } y>x+\varepsilon.\end{cases}
$$
Let $\{\psi_n,\ n\ge1\}$ be a sequence of $\mathcal C^\infty$ functions with compact supports decreasing to $\{0\}$. For instance, we may consider the function
$$
\psi(y)=c\,\exp\big(-(1-y^2)^{-1}\big)\,\mathbf 1_{\{|y|<1\}},
$$
with a constant $c$ such that $\int_{\mathbb R}\psi(z)\,dz=1$, and then take $\psi_n(y)=n\,\psi(ny)$.
Set
$$
g_n(y)=[\psi_n * f_{x,\varepsilon}](y)=\int_{\mathbb R} f_{x,\varepsilon}(y-z)\,\psi_n(z)\,dz.
$$
It is well known that $g_n\in\mathcal C^\infty$, that $g_n$ and $g'_n$ converge uniformly on $\mathbb R$ to $f_{x,\varepsilon}$ and $f'_{x,\varepsilon}$, respectively, and that $g''_n$ converges pointwise to $f''_{x,\varepsilon}$ except at the points $x+\varepsilon$ and $x-\varepsilon$.
We then have an Itô formula for $g_n$, as follows:
$$
g_n(B_t)=g_n(B_0)+\int_0^t g'_n(B_s)\,dB_s+\frac12\int_0^t g''_n(B_s)\,ds. \qquad (5.3)
$$
Convergence of the terms in (5.3) as $n\to\infty$

The function $f'_{x,\varepsilon}$ is bounded. The function $g'_n$ is also bounded. Indeed,
$$
|g'_n(y)|=\Big|\int_{\mathbb R} f'_{x,\varepsilon}(y-z)\,\psi_n(z)\,dz\Big|
=\Big|\int_{-\frac1n}^{\frac1n} f'_{x,\varepsilon}(y-z)\,\psi_n(z)\,dz\Big|
\le2\,\|f'_{x,\varepsilon}\|_\infty.
$$
Moreover,
$$
g'_n(B_s)\,\mathbf 1_{[0,t]}(s)\longrightarrow f'_{x,\varepsilon}(B_s)\,\mathbf 1_{[0,t]}(s),
$$
uniformly in $t$ and in $\omega$. Hence, by bounded convergence,
$$
E\int_0^t |g'_n(B_s)-f'_{x,\varepsilon}(B_s)|^2\,ds\longrightarrow0.
$$
Then, the isometry property of the stochastic integral implies
$$
E\Big|\int_0^t \big[g'_n(B_s)-f'_{x,\varepsilon}(B_s)\big]\,dB_s\Big|^2\longrightarrow0,
$$
as $n\to\infty$.
We next deal with the second order term. Since the law of each $B_s$, $s>0$, has a density,
$$
P\{B_s=x+\varepsilon\}=P\{B_s=x-\varepsilon\}=0.
$$
Thus, for any $s>0$,
$$
\lim_{n\to\infty} g''_n(B_s)=f''_{x,\varepsilon}(B_s),\quad\text{a.s.}
$$
Using Fubini's theorem, we see that this convergence also holds, for almost every $s$, a.s. In fact,
$$
\int_0^t ds\int_\Omega dP\;\mathbf 1_{\{f''_{x,\varepsilon}(B_s)\ne\lim_n g''_n(B_s)\}}
=\int_\Omega dP\int_0^t ds\;\mathbf 1_{\{f''_{x,\varepsilon}(B_s)\ne\lim_n g''_n(B_s)\}}=0.
$$
We also have
$$
\sup_{y\in\mathbb R}|g''_n(y)|\le\frac1{2\varepsilon}.
$$
Indeed,
$$
|g''_n(y)|=\frac1{2\varepsilon}\Big|\int_{\mathbb R}\psi_n(z)\,\mathbf 1_{(x-\varepsilon,x+\varepsilon)}(y-z)\,dz\Big|
\le\frac1{2\varepsilon}\int_{y-x-\varepsilon}^{y-x+\varepsilon}|\psi_n(z)|\,dz\le\frac1{2\varepsilon}.
$$
Then, by bounded convergence,
$$
\int_0^t g''_n(B_s)\,ds\longrightarrow\int_0^t f''_{x,\varepsilon}(B_s)\,ds,
$$
a.s. and in $L^2$.
Thus, passing to the limit in the expression (5.3) yields
$$
f_{x,\varepsilon}(B_t)=f_{x,\varepsilon}(B_0)+\int_0^t f'_{x,\varepsilon}(B_s)\,dB_s
+\frac12\int_0^t\frac1{2\varepsilon}\,\mathbf 1_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds. \qquad (5.4)
$$
Convergence of (5.4) as $\varepsilon\downarrow0$

Since $f_{x,\varepsilon}(y)\to(y-x)^+$ as $\varepsilon\downarrow0$, and
$$
|f_{x,\varepsilon}(B_t)-f_{x,\varepsilon}(B_0)|\le|B_t-B_0|,
$$
we have
$$
f_{x,\varepsilon}(B_t)-f_{x,\varepsilon}(B_0)\longrightarrow(B_t-x)^+-(B_0-x)^+,
$$
in $L^2$.
Moreover,
$$
E\Big(\int_0^t\big[f'_{x,\varepsilon}(B_s)-\mathbf 1_{[x,\infty)}(B_s)\big]^2\,ds\Big)
\le E\Big(\int_0^t\mathbf 1_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds\Big)
\le\int_0^t\frac{2\varepsilon}{\sqrt{2\pi s}}\,ds,
$$
which clearly tends to zero as $\varepsilon\downarrow0$. Hence, by the isometry property of the stochastic integral,
$$
\int_0^t f'_{x,\varepsilon}(B_s)\,dB_s\longrightarrow\int_0^t\mathbf 1_{[x,\infty)}(B_s)\,dB_s,
$$
in $L^2$.
Consequently, we have proved that
$$
\int_0^t\frac1{2\varepsilon}\,\mathbf 1_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds
$$
converges in $L^2$ as $\varepsilon\downarrow0$, and that formula (5.2) holds. $\square$
We give without proof two further properties of local time.

1. The property of local time as a density of occupation measure is made clear by the following identity, valid for any $t\ge0$ and every $a\le b$:
$$
\int_a^b L(t,x)\,dx=\int_0^t\mathbf 1_{(a,b)}(B_s)\,ds.
$$

2. The stochastic integral $\int_0^t\mathbf 1_{[x,\infty)}(B_s)\,dB_s$ has a jointly continuous version in $(t,x)\in(0,\infty)\times\mathbb R$. Hence, by (5.2), so does the local time $\{L(t,x),\ (t,x)\in(0,\infty)\times\mathbb R\}$.
The next result, which follows easily from Theorem 5.1, is known as Tanaka's formula.

Theorem 5.2 For any $(t,x)\in[0,\infty)\times\mathbb R$, we have
$$
|B_t-x|=|B_0-x|+\int_0^t\operatorname{sign}(B_s-x)\,dB_s+L(t,x). \qquad (5.5)
$$
Proof: We will use the following relations: $|x|=x^++x^-$, $x^-=\max(-x,0)=(-x)^+$. Hence, by virtue of (5.2), we only need a formula for $(-B_t+x)^+$. Notice that we already have it, since the process $-B$ is also a Brownian motion. More precisely,
$$
(-B_t+x)^+=(-B_0+x)^+ +\int_0^t\mathbf 1_{[-x,\infty)}(-B_s)\,d(-B_s)+\frac12 L^-(t,-x),
$$
where we have denoted by $L^-(t,-x)$ the local time of $-B$ at $-x$. We have the following facts:
$$
\int_0^t\mathbf 1_{[-x,\infty)}(-B_s)\,d(-B_s)=-\int_0^t\mathbf 1_{(-\infty,x]}(B_s)\,dB_s,
$$
$$
L^-(t,-x)=\lim_{\varepsilon\downarrow0}\frac1{2\varepsilon}\int_0^t\mathbf 1_{(-x-\varepsilon,-x+\varepsilon)}(-B_s)\,ds
=\lim_{\varepsilon\downarrow0}\frac1{2\varepsilon}\int_0^t\mathbf 1_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds=L(t,x),
$$
where the limit is in $L^2(\Omega)$.
Thus, we have proved
$$
(B_t-x)^-=(B_0-x)^- -\int_0^t\mathbf 1_{(-\infty,x]}(B_s)\,dB_s+\frac12 L(t,x). \qquad (5.6)
$$
Adding up (5.2) and (5.6) yields (5.5). Indeed,
$$
\mathbf 1_{[x,\infty)}(B_s)-\mathbf 1_{(-\infty,x]}(B_s)=\begin{cases}1,&\text{if }B_s>x,\\ -1,&\text{if }B_s<x,\\ 0,&\text{if }B_s=x,\end{cases}
$$
which is identical to $\operatorname{sign}(B_s-x)$. $\square$
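Taking expectations in (5.5) and using that the stochastic integral has zero mean gives $E|B_t-x|=|x|+E\,L(t,x)$ (recall $B_0=0$). The sketch below (not part of the original notes; plain Python) checks this numerically, estimating $E\,L(t,x)$ through the $\varepsilon$-approximation (5.1).

```python
import random, math

def tanaka_expectation_check(x=0.3, t=1.0, eps=0.05, n_paths=2000,
                             n_steps=400, rng=random.Random(3)):
    # E|B_t - x| should equal |x| + E[L(t, x)]; estimate L(t, x) by
    # (1 / 2 eps) * (time spent by the path in (x - eps, x + eps)).
    dt = t / n_steps
    sum_abs, sum_occ = 0.0, 0.0
    for _ in range(n_paths):
        b, occ = 0.0, 0.0
        for _ in range(n_steps):
            b += rng.gauss(0.0, math.sqrt(dt))
            if abs(b - x) < eps:
                occ += dt
        sum_abs += abs(b - x)
        sum_occ += occ / (2.0 * eps)
    return sum_abs / n_paths, abs(x) + sum_occ / n_paths

lhs, rhs = tanaka_expectation_check()
print(lhs, rhs)   # both approximately E|B_1 - 0.3|
```

The two estimates agree up to Monte Carlo noise and the smoothing bias from the fixed $\varepsilon$.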
6 Stochastic differential equations

In this section we shall introduce stochastic differential equations driven by a multi-dimensional Brownian motion. Under suitable properties of the coefficients, we shall prove a result on existence and uniqueness of the solution. Then we shall establish properties of the solution, like the existence of moments of any order and the Hölder property of the sample paths.

The setting

We consider a $d$-dimensional Brownian motion $B=\{B_t=(B^1_t,\dots,B^d_t),\ t\ge0\}$, $B_0=0$, defined on a probability space $(\Omega,\mathcal F,P)$, along with a filtration $(\mathcal F_t,\ t\ge0)$ satisfying the following properties:

1. $B$ is adapted to $(\mathcal F_t,\ t\ge0)$;
2. the $\sigma$-field generated by $\{B_u-B_t,\ u\ge t\}$ is independent of $\mathcal F_t$.

We also consider functions
$$
b:[0,\infty)\times\mathbb R^m\to\mathbb R^m,\qquad \sigma:[0,\infty)\times\mathbb R^m\to L(\mathbb R^d;\mathbb R^m).
$$
When necessary we will use the description
$$
b(t,x)=\big(b^i(t,x)\big)_{1\le i\le m},\qquad \sigma(t,x)=\big(\sigma^i_j(t,x)\big)_{1\le i\le m,\,1\le j\le d}.
$$
By a stochastic differential equation we mean an expression of the form
$$
dX_t=\sigma(t,X_t)\,dB_t+b(t,X_t)\,dt,\quad t\in(0,\infty),\qquad X_0=x, \qquad (6.1)
$$
where $x$ is an $m$-dimensional random vector independent of the Brownian motion.
We can also consider any time value $u\ge0$ as the initial one. In this case, we must write $t\in(u,\infty)$ and $X_u=x$ in (6.1). For the sake of simplicity, we will assume here that $x$ is deterministic.
The formal expression (6.1) has to be understood as follows:
$$
X_t=x+\int_0^t\sigma(s,X_s)\,dB_s+\int_0^t b(s,X_s)\,ds, \qquad (6.2)
$$
or coordinate-wise,
$$
X^i_t=x^i+\sum_{j=1}^d\int_0^t\sigma^i_j(s,X_s)\,dB^j_s+\int_0^t b^i(s,X_s)\,ds,\quad i=1,\dots,m.
$$
Strong existence and path-wise uniqueness

We now give the notions of existence and uniqueness of solution that will be considered throughout this chapter.

Definition 6.1 An $m$-dimensional, measurable and $\mathcal F_t$-adapted stochastic process $(X_t,\ t\ge0)$ is a strong solution to (6.2) if the following conditions are satisfied:

1. The processes $(\sigma^i_j(s,X_s),\ s\ge0)$ belong to $L^2_{a,\infty}$, for any $1\le i\le m$, $1\le j\le d$.
2. The processes $(b^i(s,X_s),\ s\ge0)$ belong to $L^1_{a,\infty}$, for any $1\le i\le m$.
3. Equation (6.2) holds true for the fixed Brownian motion defined before.

Definition 6.2 The equation (6.2) has a path-wise unique solution if any two strong solutions $X^1$ and $X^2$ in the sense of the previous definition are indistinguishable, that is,
$$
P\{X^1(t)=X^2(t),\ \text{for any } t\ge0\}=1.
$$

Hypotheses on the coefficients

We shall refer to (H) for the following set of hypotheses:

1. Linear growth:
$$
\sup_t\big[|b(t,x)|+|\sigma(t,x)|\big]\le L(1+|x|). \qquad (6.3)
$$
2. Lipschitz in the $x$ variable, uniformly in $t$:
$$
\sup_t\big[|b(t,x)-b(t,y)|+|\sigma(t,x)-\sigma(t,y)|\big]\le L|x-y|. \qquad (6.4)
$$
In (6.3) and (6.4), $L$ stands for a positive constant.
6.1 Examples of stochastic differential equations

When the functions $\sigma$ and $b$ have a linear structure, the solution to (6.2) admits an explicit form. This is not surprising, as it is indeed the case for ordinary differential equations. We deal with this question in this section. More precisely, suppose that
$$
\sigma(t,x)=\sigma(t)+F(t)x, \qquad (6.5)
$$
$$
b(t,x)=c(t)+D(t)x. \qquad (6.6)
$$
Example 1 Assume for simplicity $d=m=1$, $\sigma(t,x)=\sigma(t)$, $b(t,x)=c(t)+Dx$, $t\ge0$, with $D\in\mathbb R$. Now equation (6.2) reads
$$
X_t=X_0+\int_0^t\sigma(s)\,dB_s+\int_0^t\big[c(s)+DX_s\big]\,ds,
$$
and has a unique solution given by
$$
X_t=X_0\,e^{Dt}+\int_0^t e^{D(t-s)}\big(c(s)\,ds+\sigma(s)\,dB_s\big). \qquad (6.7)
$$
To check (6.7), we proceed as in the deterministic case. First we consider the equation
$$
dX_t=DX_t\,dt,
$$
with initial condition $X_0$, whose solution is $X_t=X_0\,e^{Dt}$, $t\ge0$.
Then we use the variation of constants procedure and write $X_t=X_0(t)\,e^{Dt}$. A priori, $X_0(t)$ may be random. However, since $e^{Dt}$ is differentiable, the Itô differential of $X_t$ is given by
$$
dX_t=dX_0(t)\,e^{Dt}+X_0(t)\,e^{Dt}D\,dt.
$$
Equating the right-hand side of the preceding identity with
$$
\sigma(t)\,dB_t+\big(c(t)+X_tD\big)\,dt
$$
yields
$$
dX_0(t)\,e^{Dt}+X_tD\,dt=\sigma(t)\,dB_t+\big(c(t)+X_tD\big)\,dt,
$$
that is,
$$
dX_0(t)=e^{-Dt}\big[\sigma(t)\,dB_t+c(t)\,dt\big].
$$
In integral form,
$$
X_0(t)=x+\int_0^t e^{-Ds}\big[\sigma(s)\,dB_s+c(s)\,ds\big].
$$
Plugging the right-hand side of this equation into $X_t=X_0(t)\,e^{Dt}$ yields (6.7).
A particular example of the class of equations considered before is the Langevin equation:
$$
dX_t=\sigma\,dB_t-\alpha X_t\,dt,\quad t>0,\qquad X_0=x_0\in\mathbb R,
$$
where $\sigma\in\mathbb R$ and $\alpha>0$. Here $X_t$ stands for the velocity at time $t$ of a free particle that performs a Brownian motion different from the $B_t$ in the equation. The solution to this equation is given by
$$
X_t=e^{-\alpha t}x_0+\sigma\int_0^t e^{-\alpha(t-s)}\,dB_s.
$$
Notice that $\{X_t,\ t\ge0\}$ defines a Gaussian process.
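Since the solution is Gaussian, it can be simulated exactly in distribution: over a step of length $\Delta$, $X_{t+\Delta}=e^{-\alpha\Delta}X_t+\xi$ with $\xi\sim N\big(0,\sigma^2(1-e^{-2\alpha\Delta})/(2\alpha)\big)$ independent of the past. A sketch (not part of the original notes; plain Python, with illustrative parameter values):

```python
import random, math

def simulate_ou(x0=1.0, alpha=2.0, sigma=0.5, t=3.0, n_steps=300,
                rng=random.Random(4)):
    # Distributionally exact simulation of the Langevin (Ornstein-Uhlenbeck)
    # solution X_t = e^{-alpha t} x0 + sigma * int_0^t e^{-alpha(t-s)} dB_s:
    # each step is Gaussian with explicitly known mean and variance.
    dt = t / n_steps
    decay = math.exp(-alpha * dt)
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    x = x0
    for _ in range(n_steps):
        x = decay * x + step_sd * rng.gauss(0.0, 1.0)
    return x

n = 4000
samples = [simulate_ou(rng=random.Random(100 + i)) for i in range(n)]
mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
print(mean, var)   # ~ e^{-6} and ~ sigma^2 (1 - e^{-12}) / (2 alpha) ~ 0.0625
```

No time-discretization error is incurred here, only Monte Carlo noise, because the transition law is known in closed form.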
6.2 A result on existence and uniqueness of solution

This section is devoted to proving the following result.

Theorem 6.1 Assume that the functions $\sigma$ and $b$ satisfy the assumptions (H). Then there exists a path-wise unique strong solution to (6.2).

Before giving a proof of this theorem, we recall a version of Gronwall's lemma that will be used repeatedly in the sequel.

Lemma 6.1 Let $u,v:[\alpha,\beta]\to\mathbb R_+$ be functions such that $u$ is Lebesgue integrable and $v$ is measurable and bounded. Assume that
$$
v(t)\le c+\int_\alpha^t u(s)v(s)\,ds, \qquad (6.8)
$$
for some constant $c\ge0$ and for any $t\in[\alpha,\beta]$. Then
$$
v(t)\le c\,\exp\Big(\int_\alpha^t u(s)\,ds\Big). \qquad (6.9)
$$
Proof of Theorem 6.1
Let us introduce the Picard iteration scheme
$$
X^0_t=x,\qquad
X^n_t=x+\int_0^t\sigma(s,X^{n-1}_s)\,dB_s+\int_0^t b(s,X^{n-1}_s)\,ds,\quad n\ge1, \qquad (6.10)
$$
$t\ge0$. Let us restrict the time interval to $[0,T]$, with $T>0$. We shall prove that the sequence of stochastic processes defined recursively by (6.10) converges uniformly to a process $X$ which is a strong solution of (6.2). Eventually, we shall prove path-wise uniqueness.

Step 1: We prove by induction on $n$ that for any $t\in[0,T]$,
$$
E\Big(\sup_{0\le s\le t}|X^n_s|^2\Big)<\infty. \qquad (6.11)
$$
Indeed, this property is clearly true if $n=0$, since in this case $X^0_t$ is constant and equal to $x$. Suppose that (6.11) holds true for $n=0,\dots,m-1$. By applying Burkholder's and Hölder's inequalities, we reach
$$
E\Big(\sup_{0\le s\le t}|X^m_s|^2\Big)
\le C\Big(|x|^2+E\Big(\sup_{0\le s\le t}\Big|\int_0^s\sigma(u,X^{m-1}_u)\,dB_u\Big|^2\Big)
+E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X^{m-1}_u)\,du\Big|^2\Big)\Big)
$$
$$
\le C\Big(|x|^2+E\Big(\int_0^t\big|\sigma(u,X^{m-1}_u)\big|^2\,du\Big)
+E\Big(\int_0^t\big|b(u,X^{m-1}_u)\big|^2\,du\Big)\Big)
$$
$$
\le C\Big(|x|^2+E\Big(\int_0^t\big(1+|X^{m-1}_u|^2\big)\,du\Big)\Big)
\le C\Big(|x|^2+T+T\,E\Big(\sup_{0\le s\le T}|X^{m-1}_s|^2\Big)\Big).
$$
Hence (6.11) is proved.
Step 2: As in Step 1, we prove by induction on $n$ that
$$
E\Big(\sup_{0\le s\le t}|X^{n+1}_s-X^n_s|^2\Big)\le\frac{(Ct)^{n+1}}{(n+1)!}. \qquad (6.12)
$$
Indeed, consider first the case $n=0$, for which we have
$$
X^1_s-x=\int_0^s\sigma(u,x)\,dB_u+\int_0^s b(u,x)\,du.
$$
Burkholder's inequality yields
$$
E\Big(\sup_{0\le s\le t}\Big|\int_0^s\sigma(u,x)\,dB_u\Big|^2\Big)\le C\int_0^t|\sigma(u,x)|^2\,du\le Ct(1+|x|^2).
$$
Similarly,
$$
E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,x)\,du\Big|^2\Big)\le Ct\int_0^t|b(u,x)|^2\,du\le Ct^2(1+|x|^2).
$$
With this, (6.12) is established for $n=0$.
Assume that (6.12) holds for the indices $0,\dots,n-1$. Then, as we did for $n=0$, we can consider the decomposition
$$
E\Big(\sup_{0\le s\le t}|X^{n+1}_s-X^n_s|^2\Big)\le2\big(A(t)+B(t)\big),
$$
with
$$
A(t)=E\Big(\sup_{0\le s\le t}\Big|\int_0^s\big[\sigma(u,X^n_u)-\sigma(u,X^{n-1}_u)\big]\,dB_u\Big|^2\Big),
$$
$$
B(t)=E\Big(\sup_{0\le s\le t}\Big|\int_0^s\big[b(u,X^n_u)-b(u,X^{n-1}_u)\big]\,du\Big|^2\Big).
$$
Using first Burkholder's inequality and then Hölder's inequality, along with the Lipschitz property of the coefficient $\sigma$, we obtain
$$
A(t)\le C(L)\int_0^t E\big(|X^n_s-X^{n-1}_s|^2\big)\,ds.
$$
By the induction assumption, we can bound the last expression from above by
$$
C(L)\int_0^t\frac{(Cs)^n}{n!}\,ds\le C(T,L)\,\frac{(Ct)^{n+1}}{(n+1)!}.
$$
Similarly, applying Hölder's inequality along with the Lipschitz property of the coefficient $b$ and the induction assumption yields
$$
B(t)\le C(T,L)\,\frac{(Ct)^{n+1}}{(n+1)!}.
$$
Step 3: The sequence of processes $\{X^n_t,\ t\in[0,T]\}$, $n\ge0$, converges uniformly in $t$ to a stochastic process $\{X_t,\ t\in[0,T]\}$ which satisfies (6.2).
Indeed, applying first Chebyshev's inequality and then (6.12), we have
$$
P\Big\{\sup_{0\le t\le T}\big|X^{n+1}_t-X^n_t\big|>\frac1{2^n}\Big\}\le2^{2n}\,\frac{(CT)^{n+1}}{(n+1)!},
$$
which clearly implies
$$
\sum_{n=0}^\infty P\Big\{\sup_{0\le t\le T}\big|X^{n+1}_t-X^n_t\big|>\frac1{2^n}\Big\}<\infty.
$$
Hence, by the first Borel–Cantelli lemma,
$$
P\Big\{\liminf_{n\to\infty}\Big\{\sup_{0\le t\le T}\big|X^{n+1}_t-X^n_t\big|\le\frac1{2^n}\Big\}\Big\}=1.
$$
In other words, for almost each $\omega$, there exists a natural number $m_0(\omega)$ such that
$$
\sup_{0\le t\le T}\big|X^{n+1}_t-X^n_t\big|\le\frac1{2^n},
$$
for any $n\ge m_0(\omega)$. The Weierstrass criterion for convergence of series of functions then implies that
$$
X^m_t=x+\sum_{k=0}^{m-1}\big[X^{k+1}_t-X^k_t\big]
$$
converges uniformly on $[0,T]$, a.s. Let us denote by $X=\{X_t,\ t\in[0,T]\}$ the limit. Obviously the process $X$ has a.s. continuous paths.
To conclude the proof, we must check that $X$ satisfies equation (6.2) on $[0,T]$. The continuity properties of $\sigma$ and $b$ imply the convergences
$$
\sigma(t,X^n_t)\to\sigma(t,X_t),\qquad b(t,X^n_t)\to b(t,X_t),
$$
as $n\to\infty$, uniformly in $t\in[0,T]$, a.s. Therefore,
$$
\Big|\int_0^t\big[b(s,X^n_s)-b(s,X_s)\big]\,ds\Big|\le L\int_0^t|X^n_s-X_s|\,ds
\le LT\,\sup_{0\le s\le T}|X^n_s-X_s|\to0,
$$
as $n\to\infty$, a.s. This proves the a.s. convergence of the sequence of path-wise integrals.
As for the stochastic integrals, we will prove
$$
\int_0^t\sigma(s,X^n_s)\,dB_s\to\int_0^t\sigma(s,X_s)\,dB_s \qquad (6.13)
$$
as $n\to\infty$, with the convergence in probability.
Indeed, applying the extension of Lemma 3.5 to processes of $\Lambda_{a,T}$, we have for each $\epsilon,N>0$,
$$
P\Big\{\Big|\int_0^t\big(\sigma(s,X^n_s)-\sigma(s,X_s)\big)\,dB_s\Big|>\epsilon\Big\}
\le P\Big\{\int_0^t\big|\sigma(s,X^n_s)-\sigma(s,X_s)\big|^2\,ds>N\Big\}+\frac N{\epsilon^2}.
$$
The first term on the right-hand side of this inequality converges to zero as $n\to\infty$. Since $\epsilon,N>0$ are arbitrary, this yields the convergence stated in (6.13).
Summarising, by considering if necessary a subsequence $\{X^{n_k}_t,\ t\in[0,T]\}$, we have proved the a.s. convergence, uniformly in $t\in[0,T]$, to a stochastic process $\{X_t,\ t\in[0,T]\}$ which satisfies (6.2), and moreover
$$
E\Big(\sup_{0\le t\le T}|X_t|^2\Big)<\infty.
$$
In order to conclude that $X$ is a strong solution to (6.1), we have to check that the required measurability and integrability conditions hold. This is left as an exercise to the reader.
Step 4: Path-wise uniqueness.
Let $X^1$ and $X^2$ be two strong solutions to (6.1). Proceeding in a similar way as in Step 2, we easily get
$$
E\Big(\sup_{0\le u\le t}|X^1(u)-X^2(u)|^2\Big)\le C\int_0^t E\Big(\sup_{0\le u\le s}|X^1(u)-X^2(u)|^2\Big)\,ds.
$$
Hence, from Lemma 6.1 we conclude
$$
E\Big(\sup_{0\le u\le T}|X^1(u)-X^2(u)|^2\Big)=0,
$$
proving that $X^1$ and $X^2$ are indistinguishable. $\square$
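The fast decay of the increments in (6.12) can be observed numerically. The sketch below (not part of the original notes; plain Python) fixes one discretized Brownian path and runs the Picard scheme (6.10), with left-point Itô sums, for the illustrative Lipschitz coefficients $b(x)=-0.5x$, $\sigma(x)=0.2x$, recording the sup-distance between successive iterates.

```python
import random, math

def picard_iterates(x0=1.0, t_max=1.0, n_steps=200, n_iter=14,
                    rng=random.Random(6)):
    # Discretized Picard scheme (6.10) for dX = -0.5 X dt + 0.2 X dB
    # on one fixed Brownian path; returns sup |X^{n+1} - X^n| per iteration.
    dt = t_max / n_steps
    db = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]
    X = [x0] * (n_steps + 1)            # X^0 is the constant path
    sup_dists = []
    for _ in range(n_iter):
        new, acc = [x0], x0
        for i in range(n_steps):
            acc += 0.2 * X[i] * db[i] - 0.5 * X[i] * dt
            new.append(acc)
        sup_dists.append(max(abs(a - b) for a, b in zip(new, X)))
        X = new
    return sup_dists

dists = picard_iterates()
print(dists[0], dists[-1])   # the gap shrinks rapidly with the iteration index
```

On a discrete grid the iteration even terminates exactly after finitely many steps, since each iterate at a grid point only depends on earlier grid points of the previous iterate.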
6.3 Some properties of the solution

We start this section by studying the $L^p$-moments of the solution to (6.2).

Theorem 6.2 Let the assumptions of Theorem 6.1 hold, and suppose in addition that the initial condition is a random variable $X_0$, independent of the Brownian motion. Fix $p\in[2,\infty)$ and $t\in[0,T]$. There exists a positive constant $C=C(p,t,L)$ such that
$$
E\Big(\sup_{0\le s\le t}|X_s|^p\Big)\le C\big(1+E|X_0|^p\big). \qquad (6.14)
$$
Proof: From (6.2) it follows that
$$
E\Big(\sup_{0\le s\le t}|X_s|^p\Big)\le C(p)\Big(E|X_0|^p
+E\Big(\sup_{0\le s\le t}\Big|\int_0^s\sigma(u,X_u)\,dB_u\Big|^p\Big)
+E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X_u)\,du\Big|^p\Big)\Big).
$$
Applying first Burkholder's inequality and then Hölder's inequality yields
$$
E\Big(\sup_{0\le s\le t}\Big|\int_0^s\sigma(u,X_u)\,dB_u\Big|^p\Big)
\le C(p)\,E\Big(\int_0^t|\sigma(s,X_s)|^2\,ds\Big)^{\frac p2}
\le C(p,t)\,E\Big(\int_0^t|\sigma(s,X_s)|^p\,ds\Big)
$$
$$
\le C(p,L,t)\int_0^t\big(1+E|X_s|^p\big)\,ds
\le C(p,L,t)\Big(1+\int_0^t E\Big(\sup_{0\le u\le s}|X_u|^p\Big)\,ds\Big).
$$
For the path-wise integral, we apply Hölder's inequality to obtain
$$
E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X_u)\,du\Big|^p\Big)
\le C(p,t)\,E\Big(\int_0^t|b(s,X_s)|^p\,ds\Big)
\le C(p,L,t)\Big(1+\int_0^t E\Big(\sup_{0\le u\le s}|X_u|^p\Big)\,ds\Big).
$$
Define
$$
\varphi(t)=E\Big(\sup_{0\le s\le t}|X_s|^p\Big).
$$
We have established that
$$
\varphi(t)\le C(p,L,t)\Big(E|X_0|^p+1+\int_0^t\varphi(s)\,ds\Big).
$$
Then, with Lemma 6.1, we end the proof of (6.14). $\square$
It is clear that the solution to (6.2) depends on the initial value $X_0$. Consider two initial conditions $X_0$, $Y_0$ (remember that these should be $m$-dimensional random vectors independent of the Brownian motion). Denote by $X(X_0)$, $X(Y_0)$ the corresponding solutions to (6.2). With a proof very similar to that of Theorem 6.2, we can obtain the following.

Theorem 6.3 The assumptions are the same as in Theorem 6.1. Then
$$
E\Big(\sup_{0\le s\le t}|X_s(X_0)-X_s(Y_0)|^p\Big)\le C(p,L,t)\,E|X_0-Y_0|^p, \qquad (6.15)
$$
for any $p\in[2,\infty)$, where $C(p,L,t)$ is some positive constant depending on $p$, $L$ and $t$.
The sample paths of the solution of a stochastic differential equation possess the same regularity as those of the Brownian motion. We next discuss this fact.

Theorem 6.4 The assumptions are the same as in Theorem 6.1. Let $p\in[2,\infty)$ and $0\le s\le t\le T$. There exists a positive constant $C=C(p,L,T)$ such that
$$
E\big(|X_t-X_s|^p\big)\le C(p,L,T)\big(1+E|X_0|^p\big)\,|t-s|^{\frac p2}. \qquad (6.16)
$$
Proof: By virtue of (6.2), we can write
$$
E\big(|X_t-X_s|^p\big)\le C(p)\Big(E\Big|\int_s^t\sigma(u,X_u)\,dB_u\Big|^p+E\Big|\int_s^t b(u,X_u)\,du\Big|^p\Big).
$$
Burkholder's inequality and then Hölder's inequality with respect to the Lebesgue measure on $[s,t]$ yield
$$
E\Big|\int_s^t\sigma(u,X_u)\,dB_u\Big|^p\le C(p)\,E\Big(\int_s^t|\sigma(u,X_u)|^2\,du\Big)^{\frac p2}
\le C(p)\,|t-s|^{\frac p2-1}\,E\Big(\int_s^t|\sigma(u,X_u)|^p\,du\Big)
$$
$$
\le C(p,L,T)\,|t-s|^{\frac p2-1}\int_s^t\big(1+E(|X_u|^p)\big)\,du.
$$
By using the estimate (6.14) of Theorem 6.2, we have
$$
\int_s^t\big(1+E(|X_u|^p)\big)\,du\le C(p,T,L)\,|t-s|\Big(1+E\Big(\sup_{0\le u\le t}|X_u|^p\Big)\Big)
\le C(p,T,L)\,|t-s|\big(1+E|X_0|^p\big). \qquad (6.17)
$$
This ends the estimate of the $L^p$ moment of the stochastic integral.
The estimate of the path-wise integral follows from Hölder's inequality and (6.17). Indeed, we have
$$
E\Big|\int_s^t b(u,X_u)\,du\Big|^p\le|t-s|^{p-1}\int_s^t E\big(|b(u,X_u)|^p\big)\,du
\le C(p,L)\,|t-s|^{p-1}\int_s^t\big(1+E(|X_u|^p)\big)\,du
$$
$$
\le C(p,T,L)\,|t-s|^p\big(1+E|X_0|^p\big).
$$
Hence we have proved (6.16). $\square$
We can now apply Kolmogorov's continuity criterion (see Proposition 2.2) to prove the following.

Corollary 6.1 With the same assumptions as in Theorem 6.1, the sample paths of the solution to (6.2) are Hölder continuous of any degree $\gamma\in(0,\frac12)$.

Remark 6.1 Assume that in Theorem 6.3 the initial conditions are deterministic and are denoted by $x$ and $y$, respectively. An extension of Kolmogorov's continuity criterion to stochastic processes indexed by a multi-dimensional parameter yields that the sample paths of the stochastic process $\{X_t(x),\ t\in[0,T],\ x\in\mathbb R^m\}$ are jointly Hölder continuous in $(t,x)$, of degree $\gamma<\frac12$ in $t$ and $\gamma<1$ in $x$, respectively.
6.4 Markov property of the solution

In Section 2.5 we discussed the Markov property of a real-valued Brownian motion. With the obvious changes of $\mathbb R$ into $\mathbb R^n$, with arbitrary $n\ge1$, we can see that the property extends to multi-dimensional Brownian motion. In this section we prove that the solution to the SDE (6.2) inherits the Markov property from Brownian motion. To establish this fact we need some preliminary results.

Lemma 6.2 Let $(\Omega,\mathcal F,P)$ be a probability space and $(E,\mathcal E)$ be a measurable space. Consider two independent sub-$\sigma$-fields of $\mathcal F$, denoted by $\mathcal G$, $\mathcal H$, respectively, along with mappings
$$
X:(\Omega,\mathcal H)\to(E,\mathcal E)
$$
and
$$
\Psi:(E\times\Omega,\ \mathcal E\otimes\mathcal G)\to\mathbb R^m,
$$
with $\Psi(X(\cdot),\cdot)\in L^1(\Omega)$. Then,
$$
E\big(\Psi(X,\cdot)\,|\,\mathcal H\big)=\psi(X), \qquad (6.18)
$$
with $\psi(x)=E\big(\Psi(x,\cdot)\big)$.

Proof: Assume first that $\Psi(x,\omega)=f(x)Z(\omega)$, with a $\mathcal G$-measurable $Z$ and an $\mathcal E$-measurable $f$. Then, by the properties of the mathematical expectation,
$$
E\big(\Psi(X,\cdot)\,|\,\mathcal H\big)=E\big(f(X(\cdot))Z(\cdot)\,|\,\mathcal H\big)=f(X(\cdot))\,E(Z).
$$
Indeed, we use that $X$ is $\mathcal H$-measurable and that $\mathcal G$, $\mathcal H$ are independent. Clearly,
$$
f(X(\cdot))\,E(Z)=f(x)E(Z)\big|_{x=X(\cdot)}=E\big(\Psi(x,\cdot)\big)\big|_{x=X(\cdot)}=\psi(X).
$$
This yields (6.18).
The result extends to any $\mathcal E\otimes\mathcal G$-measurable function by a monotone class argument. $\square$
Lemma 6.3 Fix $u\ge0$ and let $\eta$ be an $\mathcal F_u$-measurable random variable in $L^2(\Omega)$. Consider the SDE
$$
Y^\eta_t=\eta+\int_u^t\sigma(s,Y^\eta_s)\,dB_s+\int_u^t b(s,Y^\eta_s)\,ds,
$$
with coefficients $\sigma$ and $b$ satisfying the assumptions (H). Then, for any $t\ge u$,
$$
Y^{\eta(\omega)}_t(\omega)=X^{x,u}_t(\omega)\big|_{x=\eta(\omega)},
$$
where $\{X^{x,u}_t,\ t\ge u\}$ denotes the solution to
$$
X^{x,u}_t=x+\int_u^t\sigma(s,X^{x,u}_s)\,dB_s+\int_u^t b(s,X^{x,u}_s)\,ds. \qquad (6.19)
$$
Proof: Suppose first that $\eta$ is a step function,
$$
\eta=\sum_{i=1}^r c_i\,\mathbf 1_{A_i},\quad A_i\in\mathcal F_u.
$$
By virtue of the local property of the stochastic integral, on the set $A_i$,
$$
X^{c_i,u}_t(\omega)=X^{x,u}_t(\omega)\big|_{x=\eta(\omega)}=Y^\eta_t(\omega).
$$
Let now $(\eta_n,\ n\ge1)$ be a sequence of simple $\mathcal F_u$-measurable random variables converging in $L^2(\Omega)$ to $\eta$. By Theorem 6.3 we have
$$
L^2(\Omega)\text{-}\lim_{n\to\infty}Y^{\eta_n}_t=Y^\eta_t.
$$
By taking if necessary a subsequence, we may assume that the limit holds a.s. Then, a.s.,
$$
Y^{\eta(\omega)}_t(\omega)=\lim_{n\to\infty}Y^{\eta_n(\omega)}_t(\omega)
=\lim_{n\to\infty}X^{x,u}_t(\omega)\big|_{x=\eta_n(\omega)}
=X^{x,u}_t(\omega)\big|_{x=\eta(\omega)},
$$
where in the last equality we have applied the joint continuity in $(t,x)$ of $X^{x,u}_t$. $\square$
As a consequence of the preceding lemma, we have $X^{x,s}_t=X^{X^{x,s}_u,\,u}_t$ for any $0\le s\le u\le t$, a.s.
For any $\Gamma\in\mathcal B(\mathbb R^m)$, set
$$
p(s,t,x,\Gamma)=P\{X^{x,s}_t\in\Gamma\}, \qquad (6.20)
$$
so, for fixed $0\le s\le t$ and $x\in\mathbb R^m$, $p(s,t,x,\cdot)$ is the law of the random variable $X^{x,s}_t$.

Theorem 6.5 The stochastic process $\{X^{x,s}_t,\ t\ge s\}$ is a Markov process with initial distribution $\mu=\delta_{\{x\}}$ and transition probability function given by (6.20).
Proof: According to Definition 2.3, we have to check that (6.20) defines a Markovian transition function and that
$$
P\{X^{x,s}_t\in\Gamma\,|\,\mathcal F_u\}=p(u,t,X^{x,s}_u,\Gamma). \qquad (6.21)
$$
We start by proving this identity. For this, we shall apply Lemma 6.2 in the following setting:
$$
(E,\mathcal E)=(\mathbb R^m,\mathcal B(\mathbb R^m)),\qquad
\mathcal G=\sigma(B_{r+u}-B_u,\ r\ge0),\qquad \mathcal H=\mathcal F_u,
$$
$$
\Psi(x,\omega)=\mathbf 1_\Gamma\big(X^{x,u}_t(\omega)\big),\quad u\le t,\qquad
X:=X^{x,s}_u,\quad s\le u.
$$
The property of independent increments of the Brownian motion clearly yields that the $\sigma$-fields $\mathcal G$ and $\mathcal H$ defined above are independent, and $X^{x,s}_u$ is $\mathcal F_u$-measurable. Moreover,
$$
\psi(x)=E\big(\Psi(x,\cdot)\big)=E\big(\mathbf 1_\Gamma(X^{x,u}_t)\big)=P\{X^{x,u}_t\in\Gamma\}=p(u,t,x,\Gamma).
$$
Thus, Lemma 6.3 and then Lemma 6.2 yield
$$
P\{X^{x,s}_t\in\Gamma\,|\,\mathcal F_u\}
=P\Big\{X^{X^{x,s}_u,\,u}_t\in\Gamma\,\Big|\,\mathcal F_u\Big\}
=E\Big(\mathbf 1_\Gamma\big(X^{X^{x,s}_u,\,u}_t\big)\,\Big|\,\mathcal F_u\Big)
=E\big(\Psi(X^{x,s}_u,\cdot)\,|\,\mathcal F_u\big)
=\psi(X^{x,s}_u)=p(u,t,X^{x,s}_u,\Gamma).
$$
Since $x\mapsto X^{x,s}_t$ is continuous a.s., the mapping $x\mapsto p(s,t,x,\Gamma)$ is also continuous and thus measurable. Moreover, by its very definition, $p(s,t,x,\cdot)$ is a probability. We now prove that the Chapman–Kolmogorov equation is satisfied (see (2.8)).
Indeed, fix $0\le s\le u\le t$; by property (c) of the conditional expectation we have
$$
p(s,t,x,\Gamma)=E\big(\mathbf 1_\Gamma(X^{x,s}_t)\big)
=E\big(E\big(\mathbf 1_\Gamma(X^{x,s}_t)\,|\,\mathcal F_u\big)\big)
=E\big(P\{X^{x,s}_t\in\Gamma\,|\,\mathcal F_u\}\big).
$$
By (6.21), this last expression is $E\big(p(u,t,X^{x,s}_u,\Gamma)\big)$. But
$$
E\big(p(u,t,X^{x,s}_u,\Gamma)\big)=\int_{\mathbb R^m}p(u,t,y,\Gamma)\,\mathcal L_{X^{x,s}_u}(dy),
$$
where $\mathcal L_{X^{x,s}_u}$ denotes the probability law of $X^{x,s}_u$. By definition,
$$
\mathcal L_{X^{x,s}_u}(dy)=p(s,u,x,dy).
$$
Therefore,
$$
p(s,t,x,\Gamma)=\int_{\mathbb R^m}p(u,t,y,\Gamma)\,p(s,u,x,dy).
$$
The proof of the theorem is now complete. $\square$
7 Numerical approximations of stochastic differential equations

In this section we consider a fixed time interval $[0,T]$. Let $\Pi=\{0=\tau_0<\tau_1<\dots<\tau_N=T\}$ be a partition of $[0,T]$. The Euler–Maruyama scheme for the SDE (6.2) based on the partition $\Pi$ is the stochastic process $X^\Pi=\{X^\Pi_t,\ t\in[0,T]\}$ defined iteratively as follows:
$$
X^\Pi_{\tau_{n+1}}=X^\Pi_{\tau_n}+\sigma(\tau_n,X^\Pi_{\tau_n})(B_{\tau_{n+1}}-B_{\tau_n})+b(\tau_n,X^\Pi_{\tau_n})(\tau_{n+1}-\tau_n),
\quad n=0,\dots,N-1,
$$
$$
X^\Pi_0=x. \qquad (7.1)
$$
Notice that the values $X^\Pi_{\tau_n}$, $n=0,\dots,N-1$, are determined by the values of $B_{\tau_n}$, $n=1,\dots,N$.
We can extend the definition of $X^\Pi$ to any value of $t\in[0,T]$ by setting
$$
X^\Pi_t=X^\Pi_{\tau_j}+\sigma(\tau_j,X^\Pi_{\tau_j})(B_t-B_{\tau_j})+b(\tau_j,X^\Pi_{\tau_j})(t-\tau_j), \qquad (7.2)
$$
for $t\in[\tau_j,\tau_{j+1})$.
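The recursion (7.1) translates directly into code. The following sketch (not part of the original notes; plain Python) implements the Euler–Maruyama scheme on a uniform partition, run here on the Langevin equation $dX_t=\sigma\,dB_t-\alpha X_t\,dt$ of Section 6.1 with the illustrative values $\sigma=0.5$, $\alpha=2$.

```python
import random, math

def euler_maruyama(sigma, b, x0, t_max, n_steps, rng):
    # Euler-Maruyama scheme (7.1) on the uniform partition of [0, t_max]:
    # X_{n+1} = X_n + sigma(t_n, X_n) dB + b(t_n, X_n) dt.
    dt = t_max / n_steps
    path, x = [x0], x0
    for n in range(n_steps):
        t = n * dt
        dB = rng.gauss(0.0, math.sqrt(dt))
        x = x + sigma(t, x) * dB + b(t, x) * dt
        path.append(x)
    return path

path = euler_maruyama(lambda t, x: 0.5, lambda t, x: -2.0 * x,
                      x0=1.0, t_max=1.0, n_steps=1000, rng=random.Random(7))
print(path[-1])
```

The returned list gives $X^\Pi$ at the partition points; between them, (7.2) would interpolate using the Brownian path.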
The stochastic process $\{X^\Pi_t,\ t\in[0,T]\}$ defined by (7.2) can be written as a stochastic differential equation; this notation will be suitable for comparing with the solution of (6.2). Indeed, for any $t\in[0,T]$ set $\pi(t)=\sup\{\tau_l\in\Pi:\tau_l\le t\}$; then
$$
X^\Pi_t=x+\int_0^t\Big[\sigma\big(\pi(s),X^\Pi_{\pi(s)}\big)\,dB_s+b\big(\pi(s),X^\Pi_{\pi(s)}\big)\,ds\Big]. \qquad (7.3)
$$
The next theorem gives the rate of convergence of the Euler–Maruyama scheme to the solution of (6.2) in the $L^p$ norm.
Theorem 7.1 We assume that the hypotheses (H) are satisfied. Moreover, we suppose that there exists $\alpha\in(0,1)$ such that
$$
|\sigma(t,x)-\sigma(s,x)|+|b(t,x)-b(s,x)|\le C(1+|x|)\,|t-s|^\alpha, \qquad (7.4)
$$
where $C$ is some positive constant.
Then, for any $p\in[1,\infty)$,
$$
E\Big(\sup_{0\le t\le T}|X^\Pi_t-X_t|^p\Big)\le C(T,p,x)\,|\Pi|^{\delta p}, \qquad (7.5)
$$
where $|\Pi|$ denotes the norm of the partition and $\delta=\alpha\wedge\frac12$.
Proof: We shall apply the following result, which can be argued in a similar way as Theorem 6.2:
$$
\sup_\Pi E\Big(\sup_{0\le s\le t}|X^\Pi_s|^p\Big)\le C(p,T). \qquad (7.6)
$$
Set
$$
Z_t=\sup_{0\le s\le t}\big|X^\Pi_s-X_s\big|.
$$
Applying Burkholder's and Hölder's inequalities, we obtain
$$
E(Z^p_t)\le2^{p-1}\Big(t^{\frac p2-1}\int_0^t E\Big(\big|\sigma(\pi(s),X^\Pi_{\pi(s)})-\sigma(s,X_s)\big|^p\Big)\,ds
+t^{p-1}\int_0^t E\Big(\big|b(\pi(s),X^\Pi_{\pi(s)})-b(s,X_s)\big|^p\Big)\,ds\Big).
$$
The assumptions on the coefficients yield
$$
\big|\sigma(\pi(s),X^\Pi_{\pi(s)})-\sigma(s,X_s)\big|
\le\big|\sigma(\pi(s),X^\Pi_{\pi(s)})-\sigma(\pi(s),X_{\pi(s)})\big|
+\big|\sigma(\pi(s),X_{\pi(s)})-\sigma(s,X_{\pi(s)})\big|
+\big|\sigma(s,X_{\pi(s)})-\sigma(s,X_s)\big|
$$
$$
\le C_T\Big(\big|X^\Pi_{\pi(s)}-X_{\pi(s)}\big|+\big(1+\big|X_{\pi(s)}\big|\big)|s-\pi(s)|^\alpha+\big|X_{\pi(s)}-X_s\big|\Big)
\le C_T\Big(Z_s+\big(1+\big|X_{\pi(s)}\big|\big)(s-\pi(s))^\alpha+\big|X_{\pi(s)}-X_s\big|\Big),
$$
and similarly for the coefficient $b$.
Hence, we have
$$
E\Big(\big|\sigma(\pi(s),X^\Pi_{\pi(s)})-\sigma(s,X_s)\big|^p\Big)
\le C(p,T)\Big(E(|Z_s|^p)+(1+|x|^p)\big(|\Pi|^{\alpha p}+|\Pi|^{\frac p2}\big)\Big)
\le C(p,T)\Big(E(|Z_s|^p)+(1+|x|^p)\big(|\Pi|^\alpha+|\Pi|^{\frac12}\big)^p\Big),
$$
and a similar estimate for $b$. Consequently,
$$
E(Z^p_t)\le C(p,T,x)\Big(\int_0^t E(Z^p_s)\,ds+\big(|\Pi|^\alpha+|\Pi|^{\frac12}\big)^p\,T\Big).
$$
With Gronwall's lemma we conclude
$$
E(Z^p_t)\le C(p,T,x)\,|\Pi|^{\delta p},
$$
with $\delta=\alpha\wedge\frac12$. $\square$
72
Remark 7.1 If the coecients and b do not depend on t, with a similar
proof, we obtain =
1
2
in (7.4).
Assume that the sequence of partitions of $[0, T]$, $(\pi_n, n \ge 1)$, satisfies the following property: there exist $\gamma \in (0, \delta)$ and $p \ge 1$ such that
\[
\sum_{n \ge 1} |\pi_n|^{(\delta - \gamma)p} < \infty. \tag{7.7}
\]
Then, Chebyshev's inequality and (7.5) imply
\[
\sum_{n=1}^{\infty} P\Big( |\pi_n|^{-\gamma} \sup_{0 \le s \le T} |X^{\pi_n}_s - X_s| > \varepsilon \Big)
\le C(p, T, x)\, \varepsilon^{-p} \sum_{n=1}^{\infty} |\pi_n|^{(\delta - \gamma)p}
\le C(p, T, x)\, \varepsilon^{-p}.
\]
Borel–Cantelli's lemma then yields
\[
|\pi_n|^{-\gamma} \sup_{0 \le s \le T} |X^{\pi_n}_s - X_s| \longrightarrow 0, \quad \text{a.s.}, \tag{7.8}
\]
as $n \to \infty$.
For example, for the sequence of dyadic partitions, $|\pi_n| = 2^{-n}$, and (7.7) holds for any $\gamma \in (0, \delta)$ and $p \ge 1$.
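As a numerical illustration of this convergence, the sketch below runs the Euler–Maruyama recursion for geometric Brownian motion $dX_t = \mu X_t\,dt + \sigma X_t\,dB_t$, for which the explicit solution $X_T = x\,\exp\big((\mu - \sigma^2/2)T + \sigma B_T\big)$ lets the strong error be measured exactly on each simulated path. All names and parameter values are illustrative choices, not part of the theorem; since the coefficients do not depend on $t$, Remark 7.1 suggests an error of order $|\pi|^{1/2}$.

```python
import math
import random

def euler_maruyama(x0, b, sigma, T, dW):
    """One path of the Euler-Maruyama recursion on a uniform partition of [0, T]."""
    n = len(dW)
    dt = T / n
    x = x0
    for dw in dW:
        x = x + b(x) * dt + sigma(x) * dw
    return x

random.seed(0)
T, x0, mu, sig = 1.0, 1.0, 0.5, 0.4        # dX = mu X dt + sig X dB (geometric BM)
n_fine, paths = 1024, 300
errors = {16: 0.0, 64: 0.0, 256: 0.0}      # mean |X^pi_T - X_T| per partition size
for _ in range(paths):
    # fine Brownian increments; coarser partitions reuse the SAME driving path
    dW_fine = [random.gauss(0.0, math.sqrt(T / n_fine)) for _ in range(n_fine)]
    B_T = sum(dW_fine)
    exact = x0 * math.exp((mu - 0.5 * sig ** 2) * T + sig * B_T)  # explicit solution
    for n in errors:
        step = n_fine // n
        dW = [sum(dW_fine[k * step:(k + 1) * step]) for k in range(n)]
        approx = euler_maruyama(x0, lambda y: mu * y, lambda y: sig * y, T, dW)
        errors[n] += abs(approx - exact) / paths
for n in sorted(errors):
    print(n, errors[n])
```

Refining the partition from 16 to 256 points on the same driving paths shrinks the observed mean error, consistent with the rate $\delta = 1/2$.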
8 Continuous time martingales

In this chapter we shall study some properties of martingales (respectively, supermartingales and submartingales) whose sample paths are continuous. We consider a filtration $(\mathcal{F}_t, t \ge 0)$ as introduced in Section 2.4 and refer to Definition 2.2 for the notion of martingale (respectively, supermartingale, submartingale). We notice that in fact this definition can be extended to families of random variables $(X_t, t \in \mathbb{T})$, where $\mathbb{T}$ is an ordered set. In particular, we can consider discrete time parameter processes.
We start by listing some elementary but useful properties.
1. For a martingale (respectively, supermartingale, submartingale) the function $t \mapsto E(X_t)$ is a constant (respectively, decreasing, increasing) function.
2. Let $(X_t, t \ge 0)$ be a martingale and let $f: \mathbb{R} \to \mathbb{R}$ be a convex function. Assume further that $f(X_t) \in L^1(\Omega)$ for any $t \ge 0$. Then the stochastic process $(f(X_t), t \ge 0)$ is a submartingale. The same conclusion holds true for a submartingale if, additionally, the convex function $f$ is increasing.
The first assertion follows easily from property (c) of the conditional expectation. The second assertion can be proved using Jensen's inequality.
8.1 Doob's inequalities for martingales

In the first part of this section, we will deal with discrete parameter martingales.

Proposition 8.1 Let $(X_n, 0 \le n \le N)$ be a submartingale. For any $\lambda > 0$, the following inequalities hold:
\[
\lambda\, P\Big( \sup_n X_n \ge \lambda \Big)
\le E\Big( X_N \mathbf{1}_{\{\sup_n X_n \ge \lambda\}} \Big)
\le E\Big( |X_N|\, \mathbf{1}_{\{\sup_n X_n \ge \lambda\}} \Big). \tag{8.1}
\]
Proof: Consider the stopping time
\[
T = \inf\{ n : X_n \ge \lambda \} \wedge N.
\]
Then
\[
\begin{aligned}
E(X_N) \ge E(X_T)
&= E\Big( X_T \mathbf{1}_{\{\sup_n X_n \ge \lambda\}} \Big) + E\Big( X_T \mathbf{1}_{\{\sup_n X_n < \lambda\}} \Big) \\
&\ge \lambda\, P\Big( \sup_n X_n \ge \lambda \Big) + E\Big( X_N \mathbf{1}_{\{\sup_n X_n < \lambda\}} \Big).
\end{aligned}
\]
By subtracting $E\big( X_N \mathbf{1}_{\{\sup_n X_n < \lambda\}} \big)$ from the first and last terms above, we obtain the first inequality of (8.1). The second one is obvious. $\square$

As a consequence of this proposition we have the following.
Proposition 8.2 Let $(X_n, 0 \le n \le N)$ be either a martingale or a positive submartingale. Fix $p \in [1, \infty)$ and $\lambda \in (0, \infty)$. Then
\[
\lambda^p\, P\Big( \sup_n |X_n| \ge \lambda \Big) \le E\big( |X_N|^p \big). \tag{8.2}
\]
Moreover, for any $p \in\, ]1, \infty)$,
\[
E\big( |X_N|^p \big) \le E\Big( \sup_n |X_n|^p \Big) \le \Big( \frac{p}{p-1} \Big)^p E\big( |X_N|^p \big). \tag{8.3}
\]
Proof: Without loss of generality, we may assume that $E(|X_N|^p) < \infty$, since otherwise (8.2) and (8.3) hold trivially.
According to property 2 above, the process $(|X_n|^p, 0 \le n \le N)$ is a submartingale, and then, by Proposition 8.1 applied to this process (with level $\lambda^p$ in place of $\lambda$),
\[
\lambda^p\, P\Big( \sup_n |X_n| \ge \lambda \Big) = \lambda^p\, P\Big( \sup_n |X_n|^p \ge \lambda^p \Big) \le E\big( |X_N|^p \big),
\]
which proves (8.2).
We now prove the second inequality of (8.3); the first one is obvious. Set $X^* = \sup_n |X_n|$, for which we have
\[
\lambda\, P(X^* \ge \lambda) \le E\big( |X_N|\, \mathbf{1}_{\{X^* \ge \lambda\}} \big).
\]
Fubini's theorem yields
\[
\begin{aligned}
E\big( (X^* \wedge k)^p \big)
&= E\Big( \int_0^{X^* \wedge k} p\, \lambda^{p-1}\, d\lambda \Big)
= \int_{\Omega} dP \int_0^{\infty} \mathbf{1}_{\{X^* \wedge k \ge \lambda\}}\, p\, \lambda^{p-1}\, d\lambda \\
&= \int_0^k d\lambda\, p\, \lambda^{p-1} \int_{\{X^* \ge \lambda\}} dP
= \int_0^k d\lambda\, p\, \lambda^{p-1}\, P(X^* \ge \lambda) \\
&\le \int_0^k d\lambda\, p\, \lambda^{p-2}\, E\big( |X_N|\, \mathbf{1}_{\{X^* \ge \lambda\}} \big)
= p\, E\Big( |X_N| \int_0^{k \wedge X^*} \lambda^{p-2}\, d\lambda \Big) \\
&= \frac{p}{p-1}\, E\Big( |X_N|\, (X^* \wedge k)^{p-1} \Big).
\end{aligned}
\]
Applying Hölder's inequality with exponents $\frac{p}{p-1}$ and $p$ yields
\[
E\big( (X^* \wedge k)^p \big) \le \frac{p}{p-1} \Big[ E\big( (X^* \wedge k)^p \big) \Big]^{\frac{p-1}{p}} \Big[ E\big( |X_N|^p \big) \Big]^{\frac{1}{p}}.
\]
Consequently,
\[
\Big[ E\big( (X^* \wedge k)^p \big) \Big]^{\frac{1}{p}} \le \frac{p}{p-1} \Big[ E\big( |X_N|^p \big) \Big]^{\frac{1}{p}}.
\]
Letting $k \uparrow \infty$ and using monotone convergence, we end the proof. $\square$
It is not difficult to extend the above results to martingales (submartingales) with continuous sample paths. In fact, for a given $T > 0$ we define
\[
D = \mathbb{Q} \cap [0, T], \qquad D_n = D \cap \big\{ k 2^{-n},\ k \in \mathbb{Z}_+ \big\},
\]
where $\mathbb{Q}$ denotes the set of rational numbers.
We can now apply (8.2), (8.3) to the corresponding processes indexed by $D_n$. By letting $n$ tend to $\infty$ we obtain
\[
\lambda^p\, P\Big( \sup_{t \in D} |X_t| \ge \lambda \Big) \le \sup_{t \in D} E\big( |X_t|^p \big), \quad p \in [1, \infty),
\]
and
\[
E\Big( \sup_{t \in D} |X_t|^p \Big) \le \Big( \frac{p}{p-1} \Big)^p \sup_{t \in D} E\big( |X_t|^p \big), \quad p \in\, ]1, \infty).
\]
By the continuity of the sample paths we can finally state the following result.
Theorem 8.1 Let $(X_t, t \in [0, T])$ be either a continuous martingale or a continuous positive submartingale. Then
\[
\lambda^p\, P\Big( \sup_{t \in [0,T]} |X_t| \ge \lambda \Big) \le \sup_{t \in [0,T]} E\big( |X_t|^p \big), \quad p \in [1, \infty), \tag{8.4}
\]
\[
E\Big( \sup_{t \in [0,T]} |X_t|^p \Big) \le \Big( \frac{p}{p-1} \Big)^p \sup_{t \in [0,T]} E\big( |X_t|^p \big), \quad p \in\, ]1, \infty). \tag{8.5}
\]
Inequality (8.4) is termed Doob's maximal inequality, while (8.5) is called Doob's $L^p$ inequality.
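Doob's $L^p$ inequality can be checked by simulation in the discrete setting of Proposition 8.2. For the simple symmetric random walk $S_n$, which is a martingale, (8.3) with $p = 2$ says $E(S_N^2) \le E(\sup_{n \le N} S_n^2) \le 4\, E(S_N^2)$, since $(p/(p-1))^p = 4$. The sketch below estimates both sides empirically; sample sizes are illustrative.

```python
import random

random.seed(1)
N, paths = 200, 5000
sum_sup2 = 0.0   # accumulates sup_{n <= N} S_n^2 over simulated paths
sum_end2 = 0.0   # accumulates S_N^2 over simulated paths
for _ in range(paths):
    s, m = 0, 0
    for _ in range(N):
        s += random.choice((-1, 1))   # symmetric +-1 step: S_n is a martingale
        m = max(m, abs(s))
    sum_sup2 += m * m
    sum_end2 += s * s
lhs = sum_sup2 / paths          # estimate of E( sup_n |S_n|^2 )
rhs = 4 * sum_end2 / paths      # (p/(p-1))^p * E(|S_N|^p) with p = 2
print(lhs, rhs)
```

The estimated supremum moment sits between $E(S_N^2)$ and $4\,E(S_N^2)$, with visible slack in Doob's constant.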
8.2 Local martingales

Throughout this section we will consider a fixed filtration $(\mathcal{F}_t, t \ge 0)$ and use the notation $X^T$ for the stochastic process $X^T_t = X_{t \wedge T}$, $t \ge 0$, where $T$ is a stopping time.

Definition 8.1 An adapted continuous process $M = (M_t, t \ge 0)$, with $M_0 = 0$ a.s., is a (continuous) local martingale if there exists an increasing sequence of stopping times $(T_n, n \ge 1)$ such that $T_n \uparrow \infty$ a.s. and, for each $n \ge 1$, the process $M^{T_n}$ is a martingale.
We say that $(T_n, n \ge 1)$ reduces $M$.
Here are some interesting properties.
(a) Each continuous martingale is a continuous local martingale. In fact, the sequence $T_n = n$ reduces $M$.
(b) If $M$ is a continuous local martingale, then for any stopping time $T$, $M^T$ is also a local martingale.
The next result helps to understand the relationship between the notions of martingale and local martingale.

Proposition 8.3
1. A positive local martingale is a supermartingale.
2. Let $M$ be a local martingale for which there exists a random variable $Z \in L^1(\Omega)$ such that $|M_t| \le Z$ for any $t \ge 0$. Then $M$ is a martingale.
3. Let $M$ be a local martingale. The sequence of stopping times given by $T_n = \inf\{ t \ge 0 : |M_t| \ge n \}$ reduces $M$.
Proof: We start by proving 1. Consider a sequence $(T_n, n \ge 1)$ of stopping times that reduces $M$. Then for $0 \le s \le t$ and each $n$,
\[
M_{s \wedge T_n} = E\big( M_{t \wedge T_n} \,|\, \mathcal{F}_s \big). \tag{8.6}
\]
Since the random variables are positive, by letting $n$ tend to infinity and applying Fatou's lemma, we obtain
\[
M_s = \lim_{n \to \infty} E\big( M_{t \wedge T_n} \,|\, \mathcal{F}_s \big)
\ge E\Big( \liminf_{n \to \infty} M_{t \wedge T_n} \,\Big|\, \mathcal{F}_s \Big)
= E\big( M_t \,|\, \mathcal{F}_s \big).
\]
Moreover, for $s = 0$ we obtain $E(M_t) \le E(M_0) = 0$. This ends the proof of the first assertion.
To prove the second one, we notice that, by dominated convergence (the bound $|M_{t \wedge T_n}| \le Z$ gives the domination), from (8.6) we obtain
\[
M_s = E\big( M_t \,|\, \mathcal{F}_s \big).
\]
Let us finally prove 3. The process $M^{T_n}$ is a local martingale (see property (b) above) and it is bounded by $n$. Thus, by 2, it is also a martingale. Consequently, the sequence $T_n = \inf\{ t \ge 0 : |M_t| \ge n \}$, $n \ge 1$, reduces $M$. $\square$
The next statement gives a feeling of the roughness of the sample paths of a continuous local martingale.

Proposition 8.4 Let $M$ be a continuous local martingale with sample paths of bounded variation, a.s. Then $M$ is indistinguishable from the constant process $0$.

Proof: Define the sequence of stopping times
\[
\tau_n = \inf\Big\{ t > 0;\ \int_0^t |dM_s| \ge n \Big\}, \quad n \ge 1.
\]
Fix $n$ and set $N^n = M^{\tau_n}$. This is a local martingale satisfying
\[
\int_0^{\infty} |dN^n_s| \le n,
\]
and therefore $|N^n_t| \le n$ for any $t \ge 0$. Hence, by Proposition 8.3, $N^n$ is a bounded martingale. In the sequel we shall omit the superscript $n$ to simplify the notation.
Fix $t > 0$ and consider a partition $0 = t_0 < t_1 < \cdots < t_p = t$ of $[0, t]$. Then
\[
\begin{aligned}
E(N_t^2) &= \sum_{i=1}^{p} E\big( N_{t_i}^2 - N_{t_{i-1}}^2 \big)
= \sum_{i=1}^{p} E\Big( \big( N_{t_i} - N_{t_{i-1}} \big)^2 \Big) \\
&\le E\Big( \Big( \sup_i |N_{t_i} - N_{t_{i-1}}| \Big) \sum_{i=1}^{p} |N_{t_i} - N_{t_{i-1}}| \Big)
\le 2n\, E\Big( \sup_i |N_{t_i} - N_{t_{i-1}}| \Big),
\end{aligned}
\]
where the second identity above is a consequence of the martingale property of $N$.
Considering a sequence of partitions whose mesh tends to zero, the preceding estimate yields $E(N_t^2) = E(M_{t \wedge \tau_n}^2) = 0$, by the continuity of the sample paths of $N$. Eventually, letting $n \to \infty$ implies $E(M_t^2) = 0$. This finishes the proof of the proposition. $\square$
8.3 Quadratic variation of a local martingale

Throughout this section we will consider a fixed $t > 0$ and an increasing sequence of partitions of $[0, t]$ whose mesh tends to zero. Points of the $n$-th partition will be generically denoted by $t^n_k$, $k = 0, 1, \ldots, p_n$. We will also consider a continuous local martingale $M$ and define
\[
\langle M \rangle^n_t = \sum_{k=1}^{p_n} \big( M_{t^n_k} - M_{t^n_{k-1}} \big)^2,
\qquad
\Delta^n_k M = M_{t^n_k} - M_{t^n_{k-1}}.
\]

Theorem 8.2 The sequence $(\langle M \rangle^n_t, n \ge 1)$, $t \in [0, T]$, converges uniformly in $t$, in probability, to a continuous, increasing process $\langle M \rangle = (\langle M \rangle_t, t \in [0, T])$ such that $\langle M \rangle_0 = 0$. That is, for any $\varepsilon > 0$,
\[
P\Big( \sup_{t \in [0,T]} \big| \langle M \rangle^n_t - \langle M \rangle_t \big| > \varepsilon \Big) \longrightarrow 0, \tag{8.7}
\]
as $n \to \infty$. The process $\langle M \rangle$ is the unique process satisfying the above conditions and such that $M^2 - \langle M \rangle$ is a continuous local martingale.
If $M$ is bounded in $L^p(\Omega)$, then the convergence of $\langle M \rangle^n$ to $\langle M \rangle$ takes place in $L^{p/2}(\Omega)$.
Proof: Uniqueness follows from Proposition 8.4. Indeed, assume there were two increasing processes $\langle M \rangle^i$, $i = 1, 2$, such that $M^2 - \langle M \rangle^i$ is a continuous local martingale. By subtracting these two processes we get that $\langle M \rangle^1 - \langle M \rangle^2$ is of bounded variation and, at the same time, a continuous local martingale. Hence $\langle M \rangle^1$ and $\langle M \rangle^2$ are indistinguishable.
The existence of $\langle M \rangle$ is proved through different steps, as follows.
Step 1. Assume that $M$ is bounded. We shall prove that $(\langle M \rangle^n_t, n \ge 1)$, $t \in [0, T]$, is, uniformly in $t$, a Cauchy sequence in probability.
Let $m > n$. We have the following:
\[
\begin{aligned}
E\Big( \big| \langle M \rangle^n_t - \langle M \rangle^m_t \big|^2 \Big)
&= E\bigg[ \bigg( \sum_{k=1}^{p_n} \Big[ (\Delta^n_k M)^2 - \sum_{j:\, t^m_j \in\, ]t^n_{k-1}, t^n_k]} (\Delta^m_j M)^2 \Big] \bigg)^2 \bigg] \\
&= 4\, E\bigg[ \bigg( \sum_{k=1}^{p_n} \sum_{j:\, t^m_j \in\, ]t^n_{k-1}, t^n_k]} (\Delta^m_j M) \big( M_{t^m_{j-1}} - M_{t^n_{k-1}} \big) \bigg)^2 \bigg] \\
&= 4\, E\bigg[ \sum_{k=1}^{p_n} \sum_{j:\, t^m_j \in\, ]t^n_{k-1}, t^n_k]} (\Delta^m_j M)^2 \big( M_{t^m_{j-1}} - M_{t^n_{k-1}} \big)^2 \bigg] \\
&\le 4\, E\bigg[ \Big( \sup_{k,\, j:\, t^m_j \in\, ]t^n_{k-1}, t^n_k]} \big( M_{t^m_{j-1}} - M_{t^n_{k-1}} \big)^2 \Big) \sum_{j=1}^{p_m} (\Delta^m_j M)^2 \bigg] \\
&\le 4 \bigg\{ E\bigg[ \sup_{k,\, j:\, t^m_j \in\, ]t^n_{k-1}, t^n_k]} \big( M_{t^m_{j-1}} - M_{t^n_{k-1}} \big)^4 \bigg] \bigg\}^{\frac{1}{2}}
\bigg\{ E\bigg[ \bigg( \sum_{j=1}^{p_m} (\Delta^m_j M)^2 \bigg)^2 \bigg] \bigg\}^{\frac{1}{2}}.
\end{aligned}
\]
Let us now consider the last expression. The first factor tends to zero as $n$ and $m$ tend to infinity, because $M$ is continuous and bounded. The second factor is easily seen to be bounded uniformly in $m$. Thus we have proved
\[
\lim_{n, m \to \infty} E\Big( \big| \langle M \rangle^n_t - \langle M \rangle^m_t \big|^2 \Big) = 0, \tag{8.8}
\]
for any $t \in [0, T]$.
One can easily check that for any $n \ge 1$, $(M_t^2 - \langle M \rangle^n_t, t \in [0, T])$ is a continuous martingale. Therefore $(\langle M \rangle^n_t - \langle M \rangle^m_t, t \in [0, T])$ is a martingale for any $n, m \ge 1$. Hence, Doob's inequality yields
\[
E\Big( \sup_{t \in [0,T]} \big| \langle M \rangle^n_t - \langle M \rangle^m_t \big|^2 \Big) \le 4\, E\Big( \big| \langle M \rangle^n_T - \langle M \rangle^m_T \big|^2 \Big).
\]
This yields the existence of a process $\langle M \rangle$ satisfying the required conditions.
Step 2. For a general continuous local martingale $M$, let us consider an increasing sequence of stopping times $(T_n, n \ge 1)$ converging to infinity a.s. such that $M^{T_n}$ is a bounded martingale. By Step 1, there exists an increasing process $\langle M^n \rangle$ associated with $M^n := M^{T_n}$. Moreover, for any $m \le n$, $\langle M^n \rangle_{t \wedge T_m} = \langle M^m \rangle_t$. Fix $t \in [0, T]$. There exists $n \ge 1$ such that $t \le T_n$; then we set
\[
\langle M \rangle_t = \langle M \rangle_{t \wedge T_n} = \langle M^n \rangle_t.
\]
We have
\[
P\Big( \sup_{t \in [0,T]} \big| \langle M \rangle^n_t - \langle M \rangle_t \big| > \varepsilon \Big)
\le P\Big( \sup_{t \in [0,T]} \big| \langle M \rangle^n_{t \wedge T_k} - \langle M \rangle_{t \wedge T_k} \big| > \varepsilon \Big)
+ P\big( T_k < T \big).
\]
For any $\delta > 0$, we can choose $k \ge 1$ such that $P(T_k < T) < \frac{\delta}{2}$. Thus, we obtain
\[
\lim_{n \to \infty} P\Big( \sup_{t \in [0,T]} \big| \langle M \rangle^n_t - \langle M \rangle_t \big| > \varepsilon \Big) = 0.
\]
The assertion on $L^{p/2}(\Omega)$ convergence can be proved using martingale inequalities. $\square$
Given two continuous local martingales $M$ and $N$ we define the cross variation by
\[
\langle M, N \rangle_t = \frac{1}{2} \Big[ \langle M + N \rangle_t - \langle M \rangle_t - \langle N \rangle_t \Big], \tag{8.9}
\]
$t \in [0, T]$.
From the properties of the quadratic variation (see Theorem 8.2) we see that the process $(\langle M, N \rangle_t, t \in [0, T])$ is a process of bounded variation and it is the unique process (up to indistinguishability) such that $(M_t N_t - \langle M, N \rangle_t, t \in [0, T])$ is a continuous local martingale. It is also clear that
\[
\lim_{n \to \infty} \sum_{k=1}^{p_n} \big( M_{t^n_k} - M_{t^n_{k-1}} \big) \big( N_{t^n_k} - N_{t^n_{k-1}} \big) = \langle M, N \rangle_t, \tag{8.10}
\]
uniformly in $t \in [0, T]$, in probability.
This result together with Schwarz's inequality implies
\[
\big| \langle M, N \rangle_t \big| \le \sqrt{ \langle M \rangle_t\, \langle N \rangle_t },
\]
and more generally, setting $\langle M, N \rangle^t_s = \langle M, N \rangle_t - \langle M, N \rangle_s$ for $0 \le s \le t \le T$,
\[
\big| \langle M, N \rangle^t_s \big| \le \sqrt{ \langle M \rangle^t_s\, \langle N \rangle^t_s }.
\]
This inequality (a sort of Cauchy–Schwarz inequality) is a particular case of the result stated in the next proposition.

Proposition 8.5 Let $M$, $N$ be two continuous local martingales and $H$, $K$ be two measurable processes. Then
\[
\int_0^{\infty} |H_s|\, |K_s|\, \big| d\langle M, N \rangle_s \big|
\le \Big( \int_0^{\infty} H_s^2\, d\langle M \rangle_s \Big)^{\frac{1}{2}}
\Big( \int_0^{\infty} K_s^2\, d\langle N \rangle_s \Big)^{\frac{1}{2}}. \tag{8.11}
\]
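The approximations $\langle M \rangle^n_t$ of Theorem 8.2 are easy to compute for Brownian motion, where $\langle B \rangle_t = t$: the sums of squared increments along finer and finer partitions should concentrate around $t$. The sketch below evaluates them for one simulated path along nested dyadic partitions; resolutions and the seed are illustrative.

```python
import math
import random

random.seed(2)
T, n_fine = 1.0, 2 ** 14
# fine Brownian increments on [0, T]; coarser partitions group them in blocks
dB = [random.gauss(0.0, math.sqrt(T / n_fine)) for _ in range(n_fine)]

def qv(increments, block):
    """<B>^n_T: sum of squared increments along the partition of mesh block*T/n_fine."""
    total = 0.0
    for k in range(0, len(increments), block):
        total += sum(increments[k:k + block]) ** 2
    return total

for block in (256, 16, 1):      # mesh shrinks as the block size decreases
    print(block, qv(dB, block))
```

At the finest resolution the realized quadratic variation is close to $T = 1$, with fluctuations of order $\sqrt{2/p_n}$ around it.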
9 Stochastic integrals with respect to continuous martingales

This chapter aims to give an outline of the main ideas of the extension of the Itô stochastic integral to integrators which are continuous local martingales. We start by describing precisely the spaces involved in the construction of such a notion. Throughout the chapter, we consider a fixed probability space $(\Omega, \mathcal{F}, P)$ endowed with a filtration $(\mathcal{F}_t, t \ge 0)$.
We denote by $H^2$ the space of continuous martingales $M$, bounded in $L^2(\Omega)$, with $M_0 = 0$ a.s. This is a Hilbert space endowed with the inner product
\[
(M, N)_{H^2} = E\big[ \langle M, N \rangle_{\infty} \big].
\]
A stochastic process $(X_t, t \ge 0)$ is said to be progressively measurable if for any $t \ge 0$, the mapping $(s, \omega) \mapsto X_s(\omega)$ defined on $[0, t] \times \Omega$ is measurable with respect to the $\sigma$-field $\mathcal{B}([0, t]) \otimes \mathcal{F}_t$.
For any $M \in H^2$ we define $L^2(M)$ as the set of progressively measurable processes $H$ such that
\[
E\Big( \int_0^{\infty} H_s^2\, d\langle M \rangle_s \Big) < \infty.
\]
Notice that this is an $L^2$ space of measurable mappings defined on $\mathbb{R}_+ \times \Omega$ with respect to the measure $dP\, d\langle M \rangle$. Hence it is also a Hilbert space, the natural inner product being
\[
(H, K)_{L^2(M)} = E\Big( \int_0^{\infty} H_s K_s\, d\langle M \rangle_s \Big).
\]
Let $\mathcal{E}$ be the linear subspace of $L^2(M)$ consisting of processes of the form
\[
H_s(\omega) = \sum_{i=0}^{p} H_i(\omega)\, \mathbf{1}_{]t_i, t_{i+1}]}(s), \tag{9.1}
\]
where $0 = t_0 < t_1 < \ldots < t_{p+1}$, and for each $i$, $H_i$ is an $\mathcal{F}_{t_i}$-measurable, bounded random variable.
Stochastic processes belonging to $\mathcal{E}$ are termed elementary. They are related with $L^2(M)$ as follows.
Proposition 9.1 Fix $M \in H^2$. The set $\mathcal{E}$ is dense in $L^2(M)$.

Proof: We will prove that if $K \in L^2(M)$ is orthogonal to $\mathcal{E}$ then $K = 0$. For this, we fix $0 \le s < t$ and consider the process
\[
H = F\, \mathbf{1}_{]s, t]},
\]
with $F$ an $\mathcal{F}_s$-measurable and bounded random variable.
Saying that $K$ is orthogonal to $H$ can be written as
\[
E\Big( F \int_s^t K_u\, d\langle M \rangle_u \Big) = 0.
\]
Consider the stochastic process
\[
X_t = \int_0^t K_u\, d\langle M \rangle_u.
\]
Notice that $X_t \in L^1(\Omega)$. In fact,
\[
E|X_t| \le E\Big( \int_0^t |K_u|\, d\langle M \rangle_u \Big)
\le \Big[ E\Big( \int_0^t |K_u|^2\, d\langle M \rangle_u \Big) \Big]^{\frac{1}{2}} \big[ E\langle M \rangle_t \big]^{\frac{1}{2}}.
\]
We have thus proved that $E\big( F(X_t - X_s) \big) = 0$, for any $0 \le s < t$ and any $\mathcal{F}_s$-measurable and bounded random variable $F$. This shows that the process $(X_t, t \ge 0)$ is a martingale. At the same time, $(X_t, t \ge 0)$ is also a process of bounded variation. Hence
\[
\int_0^t K_u\, d\langle M \rangle_u = 0, \quad t \ge 0,
\]
which implies that $K = 0$ in $L^2(M)$. $\square$
Stochastic integral of processes in $\mathcal{E}$

Proposition 9.2 Let $M \in H^2$ and $H \in \mathcal{E}$ as in (9.1). Define
\[
(H \cdot M)_t = \sum_{i=0}^{p} H_i \big( M_{t_{i+1} \wedge t} - M_{t_i \wedge t} \big).
\]
Then
(i) $H \cdot M \in H^2$.
(ii) The mapping $H \mapsto H \cdot M$ extends to an isometry from $L^2(M)$ to $H^2$.
The stochastic process $((H \cdot M)_t, t \ge 0)$ is called the stochastic integral of the process $H$ with respect to $M$ and is also denoted by $\int_0^t H_s\, dM_s$.
Proof of (i): The martingale property follows from the measurability properties of $H$ and the martingale property of $M$. Moreover, since $H$ is bounded, $H \cdot M$ is bounded in $L^2(\Omega)$.
Proof of (ii): We prove first that the mapping
\[
H \in \mathcal{E} \mapsto H \cdot M
\]
is an isometry from $\mathcal{E}$ to $H^2$.
Clearly $H \mapsto H \cdot M$ is linear. Moreover, $H \cdot M$ is a finite sum of terms of the form
\[
M^i_t = H_i \big( M_{t_{i+1} \wedge t} - M_{t_i \wedge t} \big),
\]
each one being a martingale, and these martingales are orthogonal to each other. It is easy to check that
\[
\langle M^i \rangle_t = H_i^2 \big( \langle M \rangle_{t_{i+1} \wedge t} - \langle M \rangle_{t_i \wedge t} \big).
\]
Hence,
\[
\langle H \cdot M \rangle_t = \sum_{i=0}^{p} H_i^2 \big( \langle M \rangle_{t_{i+1} \wedge t} - \langle M \rangle_{t_i \wedge t} \big).
\]
Consequently,
\[
E\langle H \cdot M \rangle_{\infty} = \| H \cdot M \|_{H^2}^2
= E\Big( \sum_{i=0}^{p} H_i^2 \big( \langle M \rangle_{t_{i+1}} - \langle M \rangle_{t_i} \big) \Big)
= E\Big( \int_0^{\infty} H_s^2\, d\langle M \rangle_s \Big)
= \| H \|_{L^2(M)}^2.
\]
Since $\mathcal{E}$ is dense in $L^2(M)$, this isometry extends to a unique isometry from $L^2(M)$ into $H^2$. The extension is termed the stochastic integral of the process $H$ with respect to $M$. $\square$
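The isometry just proved can be watched at work numerically. For a Brownian integrator on a finite horizon, $\langle B \rangle_t = t$ and the isometry reads $E[(H \cdot B)_T^2] = E\big( \int_0^T H_s^2\, ds \big)$ (the Itô isometry). The sketch below checks it for the elementary process that takes the value $\operatorname{sign}(B_{t_i})$ on $]t_i, t_{i+1}]$, which is $\mathcal{F}_{t_i}$-measurable and bounded; since $H_s^2 = 1$, the right-hand side is simply $T$. Sample sizes are illustrative.

```python
import math
import random

random.seed(3)
T, p, paths = 1.0, 64, 20000
dt = T / p
acc = 0.0                        # accumulates (H.B)_T^2 over simulated paths
for _ in range(paths):
    b = 0.0                      # current value B_{t_i}
    integral = 0.0               # running elementary integral (H.B)_t
    for _ in range(p):
        h = 1.0 if b >= 0 else -1.0          # H_i = sign(B_{t_i}): decided before the step
        db = random.gauss(0.0, math.sqrt(dt))
        integral += h * db                   # H_i (B_{t_{i+1}} - B_{t_i})
        b += db
    acc += integral ** 2
lhs = acc / paths                # estimate of E[ (H.B)_T^2 ]
rhs = T                          # E[ int_0^T H_s^2 d<B>_s ] = T, since H_s^2 = 1
print(lhs, rhs)
```

The two sides agree up to Monte Carlo error; using a non-anticipating $H_i$ (fixed before drawing the increment) is exactly what the $\mathcal{F}_{t_i}$-measurability in (9.1) demands.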
10 Appendix 1: Conditional expectation

Roughly speaking, a conditional expectation of a random variable is the mean value with respect to a modified probability after having incorporated some a priori information. The simplest case corresponds to conditioning with respect to an event $B \in \mathcal{F}$. In this case, the conditional expectation is the mathematical expectation computed on the modified probability space $(\Omega, \mathcal{F}, P(\cdot/B))$.
However, in general, additional information cannot be described so easily. Assuming that we know about some events $B_1, \ldots, B_n$, we also know about those that can be derived from them, like unions, intersections and complements. This explains the choice of a $\sigma$-field to keep known information and to deal with it.
In the sequel, we denote by $\mathcal{G}$ an arbitrary $\sigma$-field included in $\mathcal{F}$ and by $X$ a random variable with finite expectation ($X \in L^1(\Omega)$). Our final aim is to give a definition of the conditional expectation of $X$ given $\mathcal{G}$. However, in order to motivate this notion, we shall start with simpler situations.
Conditional expectation given an event

Let $B \in \mathcal{F}$ be such that $P(B) \neq 0$. The conditional expectation of $X$ given $B$ is the real number defined by the formula
\[
E(X/B) = \frac{1}{P(B)}\, E(\mathbf{1}_B X). \tag{10.1}
\]
It immediately follows that
\[
E(X/\Omega) = E(X), \qquad E(\mathbf{1}_A/B) = P(A/B).
\]
With the definition (10.1), the conditional expectation coincides with the expectation with respect to the conditional probability $P(\cdot/B)$. We check this fact with a discrete random variable $X = \sum_{i=1}^{\infty} a_i \mathbf{1}_{A_i}$. Indeed,
\[
E(X/B) = \frac{1}{P(B)}\, E\Big( \sum_{i=1}^{\infty} a_i \mathbf{1}_{A_i \cap B} \Big)
= \sum_{i=1}^{\infty} a_i\, \frac{P(A_i \cap B)}{P(B)}
= \sum_{i=1}^{\infty} a_i\, P(A_i/B).
\]
Conditional expectation given a discrete random variable

Let $Y = \sum_{i=1}^{\infty} y_i \mathbf{1}_{A_i}$, with $A_i = \{Y = y_i\}$. The conditional expectation of $X$ given $Y$ is the random variable defined by
\[
E(X/Y) = \sum_{i=1}^{\infty} E(X/Y = y_i)\, \mathbf{1}_{A_i}. \tag{10.2}
\]
Notice that knowing $Y$ means knowing all the events that can be described in terms of $Y$. Since $Y$ is discrete, they can be described in terms of the basic events $\{Y = y_i\}$. This may explain the formula (10.2).
The following properties hold:
(a) $E\big( E(X/Y) \big) = E(X)$;
(b) if the random variables $X$ and $Y$ are independent, then $E(X/Y) = E(X)$.
For the proof of (a) we notice that, since $E(X/Y)$ is a discrete random variable,
\[
E\big( E(X/Y) \big) = \sum_{i=1}^{\infty} E(X/Y = y_i)\, P(Y = y_i)
= E\Big( X \sum_{i=1}^{\infty} \mathbf{1}_{\{Y = y_i\}} \Big) = E(X).
\]
Let us now prove (b). The independence of $X$ and $Y$ yields
\[
E(X/Y) = \sum_{i=1}^{\infty} \frac{E\big( X \mathbf{1}_{\{Y = y_i\}} \big)}{P(Y = y_i)}\, \mathbf{1}_{A_i}
= \sum_{i=1}^{\infty} E(X)\, \mathbf{1}_{A_i} = E(X).
\]
The next proposition states two properties of the conditional expectation that motivate Definition 10.1.

Proposition 10.1
1. The random variable $Z := E(X/Y)$ is $\sigma(Y)$-measurable; that is, for any Borel set $B \in \mathcal{B}$, $Z^{-1}(B) \in \sigma(Y)$;
2. for any $A \in \sigma(Y)$, $E\big( \mathbf{1}_A E(X/Y) \big) = E(\mathbf{1}_A X)$.

Proof: Set $c_i = E(X/Y = y_i)$ and let $B \in \mathcal{B}$. Then
\[
Z^{-1}(B) = \bigcup_{i:\, c_i \in B} \{ Y = y_i \} \in \sigma(Y),
\]
proving the first property.
To prove the second one, it suffices to take $A = \{Y = y_k\}$. In this case
\[
E\big( \mathbf{1}_{\{Y = y_k\}} E(X/Y) \big)
= E\big( \mathbf{1}_{\{Y = y_k\}} E(X/Y = y_k) \big)
= E\bigg( \mathbf{1}_{\{Y = y_k\}}\, \frac{E\big( X \mathbf{1}_{\{Y = y_k\}} \big)}{P(Y = y_k)} \bigg)
= E\big( X \mathbf{1}_{\{Y = y_k\}} \big). \qquad \square
\]
Conditional expectation given a $\sigma$-field

Definition 10.1 The conditional expectation of $X$ given $\mathcal{G}$ is a random variable $Z$ satisfying the properties:
1. $Z$ is $\mathcal{G}$-measurable; that is, for any Borel set $B \in \mathcal{B}$, $Z^{-1}(B) \in \mathcal{G}$;
2. for any $G \in \mathcal{G}$,
\[
E(Z \mathbf{1}_G) = E(X \mathbf{1}_G).
\]
We will denote the conditional expectation $Z$ by $E(X/\mathcal{G})$.
Notice that the conditional expectation is not a number but a random variable. There is nothing strange in this, since conditioning depends on the observations.
Condition (1) tells us that events that can be described by means of $E(X/\mathcal{G})$ are in $\mathcal{G}$, whereas condition (2) tells us that on events in $\mathcal{G}$ the random variables $X$ and $E(X/\mathcal{G})$ have the same mean value.
The existence of $E(X/\mathcal{G})$ is not a trivial issue. You should trust mathematicians and believe that there is a theorem in measure theory (the Radon–Nikodym theorem) which ensures its existence.
Before stating properties of the conditional expectation, we are going to explain how to compute it in two particular situations.
Example 10.1 Let $\mathcal{G}$ be the $\sigma$-field (actually, the field) generated by a finite partition $G_1, \ldots, G_m$. Then
\[
E(X/\mathcal{G}) = \sum_{j=1}^{m} \frac{E\big( X \mathbf{1}_{G_j} \big)}{P(G_j)}\, \mathbf{1}_{G_j}. \tag{10.3}
\]
Formula (10.3) can be checked using Definition 10.1. It tells us that, on each generator of $\mathcal{G}$, the conditional expectation is constant; this constant is weighted by the mass of the generator, $P(G_j)$.
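Formula (10.3) is straightforward to implement: on each atom $G_j$ of the partition, $E(X/\mathcal{G})$ equals the mean of $X$ over that atom. The sketch below builds such a conditional expectation from synthetic data (all sizes and the data model are illustrative) and checks the defining property (2) of Definition 10.1 on each generator.

```python
import random

random.seed(4)
n, m = 30000, 3
# each outcome falls in one atom G_j of the partition; X depends on the atom
label = [random.randrange(m) for _ in range(n)]
X = [j + random.gauss(0.0, 1.0) for j in label]

# E(X/G) per (10.3): constant on each atom, equal to E(X 1_{G_j}) / P(G_j)
atom_sum = [0.0] * m
atom_count = [0] * m
for j, x in zip(label, X):
    atom_sum[j] += x
    atom_count[j] += 1
atom_mean = [atom_sum[j] / atom_count[j] for j in range(m)]
Z = [atom_mean[j] for j in label]       # the G-measurable random variable E(X/G)

# defining property 2, taking G = G_j: E(Z 1_G) should equal E(X 1_G)
for j in range(m):
    ez = sum(z for z, l in zip(Z, label) if l == j) / n
    ex = sum(x for x, l in zip(X, label) if l == j) / n
    print(j, ez, ex)
```

Since $Z$ is constant on each atom and equal to the atom mean, $E(Z\mathbf{1}_{G_j}) = E(X\mathbf{1}_{G_j})$ holds exactly (up to floating-point error), and summing over $j$ recovers property (c), $E(E(X/\mathcal{G})) = E(X)$.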
Example 10.2 Let $\mathcal{G}$ be the $\sigma$-field generated by random variables $Y_1, \ldots, Y_m$, that is, the $\sigma$-field generated by events of the form $Y_1^{-1}(B_1), \ldots, Y_m^{-1}(B_m)$, with $B_1, \ldots, B_m$ arbitrary Borel sets. Assume in addition that the joint distribution of the random vector $(X, Y_1, \ldots, Y_m)$ has a density $f$. Then
\[
E(X/Y_1, \ldots, Y_m) = \int_{\mathbb{R}} x\, f(x/Y_1, \ldots, Y_m)\, dx, \tag{10.4}
\]
with
\[
f(x/y_1, \ldots, y_m) = \frac{f(x, y_1, \ldots, y_m)}{\int_{\mathbb{R}} f(x, y_1, \ldots, y_m)\, dx}. \tag{10.5}
\]
In (10.5), we recognize the conditional density of $X$ given $Y_1 = y_1, \ldots, Y_m = y_m$. Hence, in (10.4) we first compute the conditional expectation $E(X/Y_1 = y_1, \ldots, Y_m = y_m)$ and finally replace the real values $y_1, \ldots, y_m$ by the random variables $Y_1, \ldots, Y_m$.
We now list some important properties of the conditional expectation.
(a) Linearity: for any random variables $X$, $Y$ and real numbers $a$, $b$,
\[
E(aX + bY/\mathcal{G}) = a E(X/\mathcal{G}) + b E(Y/\mathcal{G}).
\]
(b) Monotonicity: if $X \le Y$ then $E(X/\mathcal{G}) \le E(Y/\mathcal{G})$.
(c) The mean value of a random variable is the same as that of its conditional expectation: $E\big( E(X/\mathcal{G}) \big) = E(X)$.
(d) If $X$ is a $\mathcal{G}$-measurable random variable, then $E(X/\mathcal{G}) = X$.
(e) Let $X$ be independent of $\mathcal{G}$, meaning that any set of the form $X^{-1}(B)$, $B \in \mathcal{B}$, is independent of $\mathcal{G}$. Then $E(X/\mathcal{G}) = E(X)$.
(f) Factorization: if $Y$ is a bounded, $\mathcal{G}$-measurable random variable,
\[
E(YX/\mathcal{G}) = Y E(X/\mathcal{G}).
\]
(g) If $\mathcal{G}_i$, $i = 1, 2$, are $\sigma$-fields with $\mathcal{G}_1 \subset \mathcal{G}_2$,
\[
E\big( E(X/\mathcal{G}_1)/\mathcal{G}_2 \big) = E\big( E(X/\mathcal{G}_2)/\mathcal{G}_1 \big) = E(X/\mathcal{G}_1).
\]
(h) Assume that $X$ is a random variable independent of $\mathcal{G}$ and $Z$ another $\mathcal{G}$-measurable random variable. For any measurable function $h(x, z)$ such that the random variable $h(X, Z)$ is in $L^1(\Omega)$,
\[
E\big( h(X, Z)/\mathcal{G} \big) = E\big( h(X, z) \big)\big|_{z = Z}.
\]
For those readers who are interested in the proofs of these properties, we give some indications.
Property (a) follows from the definition of the conditional expectation and the linearity of the operator $E$.
Property (b) is a consequence of the monotonicity of the operator $E$ and a result in measure theory telling us that, for $\mathcal{G}$-measurable random variables $Z_1$ and $Z_2$ satisfying
\[
E(Z_1 \mathbf{1}_G) \le E(Z_2 \mathbf{1}_G)
\]
for any $G \in \mathcal{G}$, we have $Z_1 \le Z_2$. Here, we apply this result to $Z_1 = E(X/\mathcal{G})$, $Z_2 = E(Y/\mathcal{G})$.
Taking $G = \Omega$ in condition (2) above, we prove (c). Property (d) is obvious.
Constant random variables are measurable with respect to any $\sigma$-field. Therefore $E(X)$ is $\mathcal{G}$-measurable. Assuming that $X$ is independent of $\mathcal{G}$ yields
\[
E(X \mathbf{1}_G) = E(X)\, E(\mathbf{1}_G) = E\big( E(X) \mathbf{1}_G \big).
\]
This proves (e).
Property (h) is very intuitive: since $X$ is independent of $\mathcal{G}$, it does not enter the game of conditioning. Moreover, the measurability of $Z$ means that, by conditioning, one can suppose it is a constant.
11 Appendix 2: Stopping times

Throughout this section we consider a fixed filtration (see Section 2.4) $(\mathcal{F}_t, t \ge 0)$.

Definition 11.1 A mapping $T: \Omega \to [0, \infty]$ is termed a stopping time with respect to the filtration $(\mathcal{F}_t, t \ge 0)$ if for any $t \ge 0$,
\[
\{ T \le t \} \in \mathcal{F}_t.
\]
It is easy to see that if $S$ and $T$ are stopping times with respect to the same filtration then $T \vee S$, $T \wedge S$ and $T + S$ are also stopping times.

Definition 11.2 For a given stopping time $T$, the $\sigma$-field of events prior to $T$ is the following:
\[
\mathcal{F}_T = \big\{ A \in \mathcal{F} : A \cap \{ T \le t \} \in \mathcal{F}_t, \text{ for all } t \ge 0 \big\}.
\]
Let us prove that $\mathcal{F}_T$ is actually a $\sigma$-field. By the definition of stopping time, $\Omega \in \mathcal{F}_T$. Assume that $A \in \mathcal{F}_T$. Then
\[
A^c \cap \{ T \le t \} = \{ T \le t \} \cap \big( A \cap \{ T \le t \} \big)^c \in \mathcal{F}_t.
\]
Hence, with any $A$, $\mathcal{F}_T$ also contains $A^c$.
Let now $(A_n, n \ge 1) \subset \mathcal{F}_T$. We clearly have
\[
\Big( \bigcup_{n=1}^{\infty} A_n \Big) \cap \{ T \le t \} = \bigcup_{n=1}^{\infty} \big( A_n \cap \{ T \le t \} \big) \in \mathcal{F}_t.
\]
This completes the proof.
Some properties related with stopping times:
1. Any stopping time $T$ is $\mathcal{F}_T$-measurable. Indeed, let $s \ge 0$; then
\[
\{ T \le s \} \cap \{ T \le t \} = \{ T \le s \wedge t \} \in \mathcal{F}_{s \wedge t} \subset \mathcal{F}_t.
\]
2. If $(X_t, t \ge 0)$ is a process with continuous sample paths, a.s., and $(\mathcal{F}_t)$-adapted, then $X_T$ is $\mathcal{F}_T$-measurable. Indeed, the continuity implies
\[
X_T = \lim_{n \to \infty} \sum_{i=0}^{\infty} X_{i 2^{-n}}\, \mathbf{1}_{\{ i 2^{-n} < T \le (i+1) 2^{-n} \}}.
\]
Let us now check that for any $s \ge 0$, the random variable $X_s \mathbf{1}_{\{ s < T \}}$ is $\mathcal{F}_T$-measurable. This fact, along with the property $\{ T \le (i+1) 2^{-n} \} \in \mathcal{F}_T$, shows the result.
Let $A \in \mathcal{B}(\mathbb{R})$ and $t \ge 0$. The set
\[
\{ X_s \in A \} \cap \{ s < T \} \cap \{ T \le t \}
\]
is empty if $s \ge t$. Otherwise it is equal to $\{ X_s \in A \} \cap \{ s < T \le t \}$, which belongs to $\mathcal{F}_t$.
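A standard example of a stopping time is the first exit time from an interval: for the simple symmetric random walk $S_n$, take $T = \inf\{ n : |S_n| \ge a \}$; whether $\{T \le n\}$ has occurred is decided by the path up to time $n$ only, which is exactly the condition of Definition 11.1. The sketch below simulates $T \wedge N$ and, as a by-product, illustrates the optional stopping identity $E(S_{T \wedge N}) = 0$; all parameter values are illustrative.

```python
import random

random.seed(5)
paths, N, a = 20000, 400, 10
total = 0.0                  # accumulates S_{T and N} over simulated paths
hits = 0                     # number of paths with T <= N
for _ in range(paths):
    s, stopped = 0, None
    for n in range(1, N + 1):
        s += random.choice((-1, 1))
        # deciding "T <= n" uses only the path up to n: T is a stopping time
        if abs(s) >= a:
            stopped = s
            hits += 1
            break
    if stopped is None:      # T > N, so S_{T and N} = S_N
        stopped = s
    total += stopped
print(total / paths, hits / paths)
```

The empirical mean of the stopped walk is close to $0$, even though the walk is stopped at the random, path-dependent time $T \wedge N$.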
References

[1] P. Baldi: Equazioni differenziali stocastiche e applicazioni. Quaderni dell'Unione Matematica Italiana 28. Pitagora Editrice, Bologna, 2000.

[2] K.L. Chung, R.J. Williams: Introduction to Stochastic Integration, 2nd Edition. Probability and Its Applications. Birkhäuser, 1990.

[3] R. Durrett: Stochastic Calculus: A Practical Introduction. CRC Press, 1996.

[4] L.C. Evans: An Introduction to Stochastic Differential Equations. http://math.berkeley.edu/~evans/SDE.course.pdf

[5] H.-H. Kuo: Introduction to Stochastic Integration. Universitext. Springer, 2006.

[6] I. Karatzas, S. Shreve: Brownian Motion and Stochastic Calculus. Springer Verlag, 1991.

[7] G. Lawler: Introduction to Stochastic Processes. Chapman and Hall/CRC, 2nd Edition, 2006.

[8] J.-F. Le Gall: Mouvement brownien et calcul stochastique. Notes de Cours de DEA 1996-1997. http://www.dma.ens.fr/~legall/

[9] B. Øksendal: Stochastic Differential Equations: An Introduction with Applications. Springer Verlag, 1998.

[10] D. Revuz, M. Yor: Continuous Martingales and Brownian Motion. Springer Verlag, 1999.

[11] P. Todorovic: An Introduction to Stochastic Processes and Their Applications. Springer Series in Statistics. Springer Verlag, 1992.