
Malliavin's calculus and applications in stochastic control and finance

Warsaw, March/April 2008
Peter Imkeller
version of 5 April 2008
Malliavin's calculus has been developed for the study of the smoothness of measures on infinite dimensional spaces. It provides a stochastic access to the analytic problem of smoothness of solutions of parabolic partial differential equations. The mathematical framework for this access is given by measures on spaces of trajectories.

In the one-dimensional framework it is clear what is meant by smoothness of measures. We look for a direct analogy to the smoothness problem in infinite-dimensional spaces. For this purpose we start by interpreting the Wiener space as a sequence space, to which the theory of differentiation and integration in Euclidean spaces is generalized by extension to infinite families of real numbers instead of finite ones.

The calculus possesses applications to many areas of stochastics, in particular financial stochastics, as is underpinned by the recently published book by Malliavin and Thalmaier. In this course I will report on recent applications to the theory of backward stochastic differential equations (BSDE), and their application to problems of the fine structure of option pricing and hedging in incomplete finance or insurance markets.

At first we want to present an access to the Wiener space as a sequence space.
1 The Wiener space as sequence space
Definition 1.1 A probability space $(\Omega, \mathcal{F}, P)$ is called Gaussian if there is a family $(X_k)_{1 \le k \le n}$ or a sequence $(X_k)_{k \in \mathbb{N}}$ of independent Gaussian unit random variables such that
\[ \mathcal{F} = \sigma(X_k : 1 \le k \le n) \quad \text{resp.} \quad \sigma(X_k : k \in \mathbb{N}) \]
(completed by sets of $P$-measure 0).
Example 1:
Let $\Omega = C(\mathbb{R}_+, \mathbb{R}^m)$, $\mathcal{F}$ the $\sigma$-algebra of Borel sets on $\Omega$ generated by the topology of uniform convergence on compact sets of $\mathbb{R}_+$, and $P$ the $m$-dimensional canonical Wiener measure on $\mathcal{F}$. Let further $W = (W^1, \dots, W^m)$ be the canonical $m$-dimensional Wiener process defined by the projections on the coordinates.
Claim: (, T, P) is Gaussian.
Proof:
Let $(g_i)_{i \in \mathbb{N}}$ be an orthonormal basis of $L^2(\mathbb{R}_+)$, and set
\[ W^j(g_i) = \int_0^\infty g_i(s)\, dW^j_s, \quad i \in \mathbb{N},\ 1 \le j \le m, \]
in the sense of $L^2$-limits of Ito integrals. Then (modulo completion) we have
\[ \mathcal{F} = \sigma(W_t : t \ge 0). \]
Let $t \ge 0$, and let $(a_i)_{i \in \mathbb{N}}$ be a sequence in $\ell^2$ such that
\[ 1_{[0,t]} = \sum_{i \in \mathbb{N}} a_i g_i. \]
Then we have for $1 \le j \le m$
\[ W^j_t = \lim_{n \to \infty} \sum_{i=1}^n a_i W^j(g_i) = \sum_{i=1}^\infty a_i W^j(g_i), \]
hence $W^j_t$ is (modulo completion) measurable with respect to $\sigma(W^j(g_i) : i \in \mathbb{N})$. Therefore (modulo completion)
\[ \mathcal{F} = \sigma(W^j(g_i) : i \in \mathbb{N},\ 1 \le j \le m). \]
Moreover, due to
\[ E\big(W^j(g_i)\, W^k(g_l)\big) = \delta_{jk}\, \langle g_i, g_l \rangle = \delta_{jk}\, \delta_{il}, \quad i, l \in \mathbb{N},\ 1 \le j, k \le m, \]
the $W^j(g_i)$ are independent Gaussian unit variables.
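A quick numerical cross-check of the expansion argument (an illustrative sketch; the choice of the cosine basis of $L^2([0,1])$ in place of a basis of $L^2(\mathbb{R}_+)$ is ours): by Parseval, the coefficients $a_i$ of $1_{[0,t]}$ satisfy $\sum_i a_i^2 = \|1_{[0,t]}\|^2 = t$, which is exactly $\mathrm{Var}(W_t) = \sum_i a_i^2\, \mathrm{Var}(W(g_i))$.

```python
import math

# Cosine ONB of L^2([0,1]): g_0 = 1, g_i(x) = sqrt(2) cos(i*pi*x), i >= 1.
# Expand 1_[0,t]; Parseval gives sum_i a_i^2 = ||1_[0,t]||^2 = t,
# matching Var(W_t) = t for W_t = sum_i a_i W(g_i).
t = 0.3
a0 = t  # <1_[0,t], g_0>
tail = sum((math.sqrt(2) * math.sin(i * math.pi * t) / (i * math.pi)) ** 2
           for i in range(1, 20000))
parseval = a0 ** 2 + tail
assert abs(parseval - t) < 1e-3  # slow 1/i^2 tail, but clearly converging to t
```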
In the following we shall construct an abstract isomorphism between the canonical Wiener space and a sequence space. Since we are finally interested in infinite dimensional spaces, we assume from now on:

Assumption: the Gaussian space considered is generated by infinitely many independent Gaussian unit variables.
Let $\mathbb{R}^{\mathbb{N}} = \{(x_i)_{i \in \mathbb{N}} : x_i \in \mathbb{R},\ i \in \mathbb{N}\}$ be the set of all real-valued sequences, and for $n \in \mathbb{N}$ denote by
\[ \pi_n : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}^n, \quad (x_i)_{i \in \mathbb{N}} \mapsto (x_i)_{1 \le i \le n}, \]
the projection on the first $n$ coordinates. Let $\mathcal{B}^n$ be the $\sigma$-algebra of Borel sets in $\mathbb{R}^n$, and
\[ \mathcal{B}^{\mathbb{N}} = \sigma\Big( \bigcup_{n \in \mathbb{N}} \pi_n^{-1}[\mathcal{B}^n] \Big). \]
Let for $n \in \mathbb{N}$
\[ \nu_1(dx) = \frac{1}{\sqrt{2\pi}} \exp\Big(-\frac{x^2}{2}\Big)\, dx, \quad \nu = P \circ \big((X_n)_{n \in \mathbb{N}}\big)^{-1} = \bigotimes_{i \in \mathbb{N}} \nu_1, \quad \nu_n = \nu \circ \pi_n^{-1}. \]
This notation is consistent for $n = 1$.
We want to construct an isomorphism between the spaces of integrable functions on $(\Omega, \mathcal{F}, P)$ and on $(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}}, \nu)$. For this purpose, it is necessary to know how functions on the two spaces are mapped to each other. It is clear that for $\mathcal{B}^{\mathbb{N}}$-measurable $f$ on $\mathbb{R}^{\mathbb{N}}$ the function
\[ F = f\big((X_n)_{n \in \mathbb{N}}\big) \]
is $\mathcal{F}$-measurable on $\Omega$.
Lemma 1.1 Let $F$ be $\mathcal{F}$-measurable on $\Omega$. Then there exists a $\mathcal{B}^{\mathbb{N}}$-measurable function $f$ on $\mathbb{R}^{\mathbb{N}}$ such that
\[ F = f\big((X_n)_{n \in \mathbb{N}}\big). \]
Proof
1. Let $F = 1_A$ with $A = \big((X_i)_{1 \le i \le n}\big)^{-1}[B]$, $B \in \mathcal{B}^n$. Then set $f = 1_{\pi_n^{-1}[B]}$. $f$ is by definition $\mathcal{B}^{\mathbb{N}}$-measurable, and we have
\[ f\big((X_n)_{n \in \mathbb{N}}\big) = 1_B\big((X_i)_{1 \le i \le n}\big) = 1_A = F. \]
Hence the asserted equation is verified for indicators of a generating set of $\mathcal{F}$ which is stable under intersections. Hence by Dynkin's theorem it is valid for all indicators of sets in $\mathcal{F}$.
2. By part 1. and by linearity the claim is then verified for linear combinations of indicator functions of $\mathcal{F}$-measurable sets. The assertion is stable under monotone limits in the set of functions for which it is verified. Hence it is valid for all $\mathcal{F}$-measurable functions by the monotone class theorem.
Theorem 1.1 Let $p \ge 1$. Then the mapping
\[ L^p(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}}, \nu) \ni f \mapsto F = f\big((X_n)_{n \in \mathbb{N}}\big) \in L^p(\Omega, \mathcal{F}, P) \]
defines a linear isomorphism.
Proof
The mapping is well defined due to
\[ \|F\|_p^p = E\big( |f((X_n)_{n \in \mathbb{N}})|^p \big) = \int |f(x)|^p\, \nu(dx) = \|f\|_p^p \]
(transformation theorem), and bijective due to Lemma 1.1. Linearity is trivial.
Theorem 1.1 allows us to develop a differential calculus on the sequence space $(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}}, \nu)$, and then to transfer it to the canonical space $(\Omega, \mathcal{F}, P)$. For this purpose we are guided by the treatment of the one-dimensional situation.

Questions of smoothness of probability measures are prevalent. We start considering them in the setting of $\mathbb{R}$.
2 Absolute continuity of measures on R
Our aim is to study laws of random variables defined on $(\Omega, \mathcal{F}, P)$, i.e. the probability measures $P_X$ for random variables $X$. By means of Theorem 1.1 these measures correspond to the measures $\nu \circ f^{-1}$ for $\mathcal{B}^{\mathbb{N}}$-measurable functions $f$ on $\mathbb{R}^{\mathbb{N}}$. The one-dimensional version of these measures is given by $\nu_1 \circ f^{-1}$ for $\mathcal{B}^1$-measurable functions $f$ defined on $\mathbb{R}$.

We first discuss a simple analytic criterion for absolute continuity of measures of this type.
Lemma 2.1 Let $\mu$ be a finite measure on $\mathcal{B}^1$. Suppose there exists $c \in \mathbb{R}$ such that for all $\phi \in C^1(\mathbb{R})$ with bounded derivative we have
\[ \Big| \int \phi'(x)\, \mu(dx) \Big| \le c\, \|\phi\|_\infty. \]
Then $\mu \ll \lambda$, i.e. $\mu$ is absolutely continuous with respect to $\lambda$, the Lebesgue measure on $\mathbb{R}$.
Proof
Let $0 \le f$ be continuous with compact support, and define
\[ \phi(x) = \int_{-\infty}^x f(y)\, dy. \]
Then
\[ \int f\, d\mu = \int \phi'(x)\, \mu(dx) \le c\, \|\phi\|_\infty = c \int f\, d\lambda. \]
By a measure-theoretic standard argument this inequality extends to bounded measurable $f \ge 0$. Therefore we conclude for $A \in \mathcal{B}^1$
\[ \mu(A) \le c\, \lambda(A), \]
which clearly implies $\mu \ll \lambda$.
We aim at applying Lemma 2.1 to the probability measure $\nu_1 \circ f^{-1}$ with $f$ $\mathcal{B}^1$-measurable. For this purpose we encounter for the first time the central technique of integration by parts on Gaussian spaces, which is at the heart of Malliavin's calculus.
For reasons of notational clarity we first recall the classical technique of integration by parts. Indeed, for $g, h \in C_0^\infty(\mathbb{R})$ (smooth functions with compact support) we have
\[ (*) \quad \langle g', h \rangle = \int g'(x)\, h(x)\, dx = -\int g(x)\, h'(x)\, dx = -\langle g, h' \rangle. \]
This relationship can be extended to functions $g, h \in L^2(\mathbb{R})$ which vanish at $\pm\infty$ and which possess derivatives in the distributional sense. Let us for a moment assume this setting, denote by $dg$ the derivative in the distributional sense of $g$, and by $\delta$ its adjoint operator in the sense of the duality $(*)$. Then for $h \in C_0^\infty(\mathbb{R})$ we have
\[ \delta h = -h', \]
and we can interpret the duality relationship as
\[ \langle dg, h \rangle = \langle g, \delta h \rangle. \]
Finally, the operator $\delta d = -\frac{d^2}{dx^2}$ plays an important role in the calculus. Here it is identical to the negative of the Laplace operator. In the just sketched classical calculus one does not have to distinguish between $d$ and $\delta$ (modulo sign).
For the analysis on Gaussian spaces things are different. We sketch the analogue of a differential calculus with respect to duality on Gaussian spaces. For $g, h \in L^2(\mathbb{R}, \nu_1)$ denote
\[ \langle g | h \rangle = \int g(x)\, h(x)\, \nu_1(dx). \]
To apply Lemma 2.1 formally to the measure $\mu = \nu_1 \circ f^{-1}$ for some $\mathcal{B}^1$-measurable $f$, we have to write, assuming all operations are justified,
\[ \int \phi'(x)\, \nu_1 \circ f^{-1}(dx) = \int (\phi' \circ f)(x)\, \nu_1(dx) = \langle \phi' \circ f \,|\, 1 \rangle = \Big\langle (\phi \circ f)' \,\Big|\, \frac{1}{f'} \Big\rangle. \]
Now, as in the classical setting, we want to transfer the derivative to the other argument. For this purpose we continue calculating for $g, h \in C_0^\infty(\mathbb{R})$
\begin{align*}
\langle g' | h \rangle &= \frac{1}{\sqrt{2\pi}} \int g'(x)\, h(x)\, \exp\Big(-\frac{x^2}{2}\Big)\, dx \\
&= -\frac{1}{\sqrt{2\pi}} \int g(x)\, \frac{d}{dx}\Big[ h(x) \exp\Big(-\frac{x^2}{2}\Big) \Big]\, dx \\
&= -\int g(x)\, \exp\Big(\frac{x^2}{2}\Big)\, \frac{d}{dx}\Big[ h(x) \exp\Big(-\frac{x^2}{2}\Big) \Big]\, \nu_1(dx) \\
&= \langle g \,|\, {-h'} + x h \rangle.
\end{align*}
So in the setting of Gaussian spaces, if we define as before $dg$ as the derivative in the generalized (distributional) sense, its dual operator $\delta$ on a suitable space of functions (to be described later) has to be defined by
\[ \delta h = -h' + x h. \]
In this sense we have the following duality relationship, completely analogous to the classical formula:
\[ \langle dg | h \rangle = \langle g | \delta h \rangle. \]
For the combination of the derivative operator and its dual we obtain this time the following operator
\[ L = \delta d = -\frac{d^2}{dx^2} + x \frac{d}{dx}, \]
in the suitable distributional sense.
$d$ will be called the Malliavin derivative, $\delta$ the Skorokhod integral, and $L$ the Ornstein-Uhlenbeck operator. The domains of these operators will be defined more precisely in the higher dimensional setting. The present exposition is given to motivate the notions to be studied.

Let us return to the problem of smoothness of the measure $\nu_1 \circ f^{-1}$.
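The duality $\langle g' | h \rangle = \langle g | \delta h \rangle$ with $\delta h = -h' + xh$ can be checked exactly for polynomials, since all Gaussian moments are known in closed form ($E[X^n] = (n-1)!!$ for even $n$, $0$ for odd $n$). The following sketch (helper names are ad hoc, not from the notes) represents polynomials as coefficient lists:

```python
import math

def gauss_moment(n):
    # E[X^n] for X ~ N(0,1): 0 for odd n, (n-1)!! = 1*3*...*(n-1) for even n
    return 0 if n % 2 else math.prod(range(1, n, 2))

def expect(p):
    # E[p(X)] for a polynomial given as coefficient list [a0, a1, ...]
    return sum(a * gauss_moment(i) for i, a in enumerate(p))

def mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def deriv(p):
    return [i * a for i, a in enumerate(p)][1:] or [0]

def delta(p):
    # delta p = -p' + x*p
    shifted = [0] + p                      # x*p
    d = deriv(p)
    return [s - (d[i] if i < len(d) else 0) for i, s in enumerate(shifted)]

# <g' | h> = <g | delta h> for several polynomial pairs
pairs = [([0, 0, 1], [0, 1]),        # g = x^2, h = x
         ([0, 0, 0, 1], [0, 0, 1]),  # g = x^3, h = x^2
         ([1, 2, 0, 4], [0, 1, 3])]
for g, h in pairs:
    assert expect(mul(deriv(g), h)) == expect(mul(g, delta(h)))
```

For instance $g = x^2$, $h = x$ gives $\langle g' | h \rangle = E[2X^2] = 2$ and $\langle g | \delta h \rangle = E[X^2(X^2 - 1)] = 3 - 1 = 2$.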
Lemma 2.2 Let $g, h \in L^2(\mathbb{R}, \nu_1)$ be such that $dg, \delta h \in L^2(\mathbb{R}, \nu_1)$. Then we have
\[ \langle dg | h \rangle = \langle g | \delta h \rangle. \]
Moreover, for $f \in L^2(\mathbb{R}, \nu_1)$ such that $\delta\big(\frac{1}{df}\big) \in L^2(\mathbb{R}, \nu_1)$ we have
\[ \nu_1 \circ f^{-1} \ll \lambda. \]
Proof
We continue the above calculation in the notation chosen. We have by duality and the Cauchy-Schwarz inequality
\begin{align*}
\Big| \Big\langle d(\phi \circ f) \,\Big|\, \frac{1}{df} \Big\rangle \Big| &= \Big| \Big\langle \phi \circ f \,\Big|\, \delta\Big(\frac{1}{df}\Big) \Big\rangle \Big| = \Big| \int (\phi \circ f)(x)\, \delta\Big(\frac{1}{df}\Big)(x)\, \nu_1(dx) \Big| \\
&\le \|\phi\|_\infty\, \Big\| \delta\Big(\frac{1}{df}\Big) \Big\|_2.
\end{align*}
Hence Lemma 2.1 can be applied with $c = \big\| \delta\big(\frac{1}{df}\big) \big\|_2$, which yields the desired absolute continuity.
With this lemma the program for the development of Gaussian differential calculus in finite and infinite dimensional spaces is sketched. We have to develop rigorously in this framework the calculus of the three operators. We shall hereby, for brevity, mostly concentrate on the operators $d$ and $\delta$. One natural orthonormal basis of $L^2(\mathbb{R}, \nu_1)$ proves to be very useful hereby.
3 Hermite polynomials; orthogonal expansions
We continue denoting by $d$, $\delta$ and $L$ the operators studied above. They are at least (and this is the sense in which we use them) well defined on $C_0^\infty(\mathbb{R})$, and even, due to the integrability properties of the Gaussian density, on the space of polynomials in one real variable.
Definition 3.1 For $n \ge 0$ let
\[ H_n = \delta^n 1. \]
$H_n$ is called the Hermite polynomial of degree $n$.

By definition we have for $x \in \mathbb{R}$
\begin{align*}
H_0(x) &= 1, \\
H_1(x) &= \delta 1 = x, \\
H_2(x) &= \delta x = -1 + x^2, \\
H_3(x) &= \delta(-1 + x^2) = -x - 2x + x^3 = x^3 - 3x.
\end{align*}
Theorem 3.1 $H_n$ is a polynomial of degree $n$ with leading coefficient 1. Moreover, for $n \in \mathbb{N}$:
(i) $\delta H_n = H_{n+1}$,
(ii) $d H_n = n H_{n-1}$,
(iii) $L H_n = n H_n$.
In particular, $H_n$ is an eigenvector of $L$ with eigenvalue $n$.
Proof
(i): follows by definition.
(ii): We first briefly investigate the commutator of $d$ and $\delta$. In fact, we have for $h \in C_0^\infty(\mathbb{R})$
\[ (d\delta - \delta d)\, h = d(-h' + xh) - \delta(h') = -h'' + h + xh' - (-h'' + xh') = h. \]
This means that
\[ d\delta - \delta d = \mathrm{id}. \]
With this in mind we proceed by induction on the degree $n$. The claim is clear for $n = 1$. Assume it holds for $n - 1$. Then
\[ d H_n = d \delta H_{n-1} = \delta d H_{n-1} + H_{n-1} = (n-1)\, \delta H_{n-2} + H_{n-1} = (n-1) H_{n-1} + H_{n-1} = n H_{n-1}. \]
(iii): $L H_n = \delta d H_n = n\, \delta H_{n-1} = n H_n$.
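The recursion $H_n = \delta H_{n-1}$ and the derivative rule $d H_n = n H_{n-1}$ can be verified mechanically on coefficient lists; the following sketch (ad hoc helpers, added for illustration) generates the first Hermite polynomials this way:

```python
def deriv(p):
    # derivative of a polynomial given as coefficient list [a0, a1, ...]
    return [i * a for i, a in enumerate(p)][1:] or [0]

def delta(p):
    # delta p = -p' + x*p, the adjoint of d under the Gaussian measure
    shifted = [0] + p
    d = deriv(p)
    return [s - (d[i] if i < len(d) else 0) for i, s in enumerate(shifted)]

# H_n = delta^n 1
H = [[1]]
for n in range(1, 7):
    H.append(delta(H[-1]))

assert H[1] == [0, 1]              # x
assert H[2] == [-1, 0, 1]          # x^2 - 1
assert H[3] == [0, -3, 0, 1]       # x^3 - 3x
assert H[4] == [3, 0, -6, 0, 1]    # x^4 - 6x^2 + 3
for n in range(1, 7):
    assert deriv(H[n]) == [n * a for a in H[n - 1]]   # d H_n = n H_{n-1}
```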
Corollary 3.1 For $g \in L^2(\mathbb{R})$ define the Fourier transform by
\[ \hat{g}(u) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{iux}\, g(x)\, dx, \quad u \in \mathbb{R}. \]
Then
\[ \big( H_n\, e^{-\frac{x^2}{2}} \big)^{\wedge}(u) = (iu)^n\, e^{-\frac{u^2}{2}}. \]
Proof
Choose $u \in \mathbb{R}$. Then
\begin{align*}
\big( H_n\, e^{-\frac{x^2}{2}} \big)^{\wedge}(u) &= \big( \delta^n 1 \cdot e^{-\frac{x^2}{2}} \big)^{\wedge}(u) \\
&= \langle e^{iu\cdot} \,|\, \delta^n 1 \rangle \\
&= \langle d^n e^{iu\cdot} \,|\, 1 \rangle \\
&= (iu)^n\, \langle e^{iu\cdot} \,|\, 1 \rangle \\
&= (iu)^n\, e^{-\frac{u^2}{2}}.
\end{align*}
With these preliminaries, we can show that the suitably normalized Hermite polynomials constitute an orthonormal basis of our Gaussian space in one dimension.
Theorem 3.2 $\big( \frac{1}{\sqrt{n!}} H_n \big)_{n \ge 0}$ is an orthonormal basis of $L^2(\mathbb{R}, \nu_1)$.
Proof
1. Let $n, k \in \mathbb{N}$, and suppose that $n < k$. Then by Theorem 3.1
\[ \langle H_n | H_k \rangle = \langle \delta^n 1 \,|\, \delta^k 1 \rangle = \langle d^k \delta^n 1 \,|\, 1 \rangle = 0, \]
since $\delta^n 1$ is a polynomial of degree $n < k$, while
\[ \langle H_n | H_n \rangle = \langle d^n \delta^n 1 \,|\, 1 \rangle = n!\, \langle 1 | 1 \rangle = n!. \]
2. It remains to show that $(H_n)_{n \ge 0}$ is complete in $L^2(\mathbb{R}, \nu_1)$, i.e. the set of linear combinations of Hermite polynomials is dense in $L^2(\mathbb{R}, \nu_1)$. For this purpose, it suffices to show: if $\psi \in L^2(\mathbb{R}, \nu_1)$ satisfies $\langle H_n | \psi \rangle = 0$ for all $n \ge 0$, then $\psi = 0$.
For $z \in \mathbb{C}$ let
\[ F(z) = \int_{\mathbb{R}} \psi(v)\, e^{ivz - \frac{1}{2}v^2}\, dv. \]
Then we have for $k \in \mathbb{N}$, $t \in \mathbb{R}$ by Cauchy-Schwarz
\[ \int_{\mathbb{R}} |v^k \psi(v)|\, e^{vt - \frac{1}{2}v^2}\, dv \le \Big[ \int \psi^2(v)\, e^{-\frac{1}{2}v^2}\, dv \cdot \int_{\mathbb{R}} v^{2k}\, e^{2vt - \frac{1}{2}v^2}\, dv \Big]^{\frac{1}{2}} < \infty. \]
Hence $F$ may be differentiated arbitrarily often under the integral sign, which implies that $F$ is an entire function. Moreover, writing $x^k = \sum_{l=0}^k a_l H_l(x)$, we have for $k \ge 0$
\[ F^{(k)}(0) = i^k \int_{\mathbb{R}} v^k\, \psi(v)\, e^{-\frac{v^2}{2}}\, dv = i^k \sqrt{2\pi}\, \langle x^k | \psi \rangle = i^k \sqrt{2\pi} \sum_{l=0}^k a_l\, \langle H_l | \psi \rangle = 0. \]
This, however, implies that $F = 0$, and so by the uniqueness of Fourier transforms also $\psi = 0$.
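The orthogonality relations just proved can be confirmed with exact integer arithmetic, using the Gaussian moments $E[X^n] = (n-1)!!$ for even $n$. A sketch with ad hoc polynomial helpers (added for illustration):

```python
import math

def gauss_moment(n):
    return 0 if n % 2 else math.prod(range(1, n, 2))  # E[X^n], X ~ N(0,1)

def expect(p):
    return sum(a * gauss_moment(i) for i, a in enumerate(p))

def mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def delta(p):  # delta p = -p' + x*p
    shifted = [0] + p
    d = [i * a for i, a in enumerate(p)][1:]
    return [s - (d[i] if i < len(d) else 0) for i, s in enumerate(shifted)]

H = [[1]]
for n in range(1, 7):
    H.append(delta(H[-1]))

# <H_n | H_k> = n! if n = k, else 0
for n in range(7):
    for k in range(7):
        assert expect(mul(H[n], H[k])) == (math.factorial(n) if n == k else 0)
```

For example $\langle H_3 | H_3 \rangle = E[(X^3 - 3X)^2] = 15 - 18 + 9 = 6 = 3!$.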
We now return to our target space, namely $\mathbb{R}^{\mathbb{N}}$, the sequence space version of our infinite dimensional Gaussian space. Our task will be to establish in this space suitable notions of the operators $d$ and $\delta$. For this purpose it will be convenient to have again an orthonormal basis of this Gaussian space. We have to define an infinite dimensional extension of Hermite polynomials.
Definition 3.2 For $n \in \mathbb{N}$ let $E_n = \mathbb{Z}_+^n$, and let $E$ be the set of sequences in $\mathbb{Z}_+$ that vanish except for finitely many components. For $p = (p_1, \dots, p_k, 0, \dots) \in E$ let $|p| = \sum_{i=1}^k p_i$ and $p! = \prod_{i=1}^k p_i!$. For $x \in \mathbb{R}^k$ resp. $x \in \mathbb{R}^{\mathbb{N}}$, and $p \in E_k$ resp. $p \in E$, let
\[ H_p(x) = \prod_{i=1}^k H_{p_i}(x_i) \quad \text{resp.} \quad \prod_{i \in \mathbb{N}} H_{p_i}(x_i). \]
$H_p$ is called a $k$-dimensional resp. generalized Hermite polynomial.
We can extend Theorem 3.2 to the multidimensional setting.
Theorem 3.3 $\big( \frac{1}{\sqrt{p!}} H_p \big)_{p \in E_k}$ is an orthonormal basis of $L^2(\mathbb{R}^k, \mathcal{B}^k, \nu_k)$, and $\big( \frac{1}{\sqrt{p!}} H_p \big)_{p \in E}$ is an orthonormal basis of $L^2(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}}, \nu)$.
Proof
1. For $k \in \mathbb{N}$ and $g, h \in L^2(\mathbb{R}^k, \nu_k)$ denote
\[ \langle g | h \rangle = \int_{\mathbb{R}^k} g(x)\, h(x)\, \nu_k(dx). \]
Then for $p, q \in E_k$ we have, due to Fubini's theorem,
\[ \langle H_p | H_q \rangle = \prod_{i=1}^k \langle H_{p_i} | H_{q_i} \rangle. \]
Hence $\big( \frac{1}{\sqrt{p!}} H_p \big)_{p \in E_k}$ is an orthonormal system. Moreover, linear combinations of tensor products of functions of one variable each are dense in $L^2(\mathbb{R}^k, \mathcal{B}^k, \nu_k)$. Hence the first claim follows from Theorem 3.2.
2. The set $\bigcup_{n \in \mathbb{N}} \{ f \circ \pi_n : f \in L^2(\mathbb{R}^n, \mathcal{B}^n, \nu_n) \}$ is dense in $L^2(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}}, \nu)$. Hence the second assertion follows from the first.
We next define and study Sobolev spaces in finite and infinite dimension, for which we use our knowledge of the orthonormal bases just acquired.
4 Finite dimensional Gaussian Sobolev spaces
Let $k \in \mathbb{N}$. We consider $k$-dimensional spaces first. Before treating the Gaussian spaces, let us recall the most important facts about classical Sobolev spaces, i.e. Sobolev spaces with respect to Lebesgue measure on $\mathbb{R}^k$.
Definition 4.1 Let $p \ge 1$. For $f \in L^p(\mathbb{R}^k)$, $a \in \mathbb{R}^k$ we say that $f$ possesses a directional (generalized) derivative in direction $a$ if there is a function $d_a f \in L^p(\mathbb{R}^k)$ such that
\[ \Big\| \frac{1}{\epsilon} \big[ f(\cdot + \epsilon a) - f \big] - d_a f \Big\|_p \to 0 \]
as $\epsilon \to 0$. Let
\[ W_1^p = \{ f \in L^p(\mathbb{R}^k) : f \text{ possesses a directional derivative in direction } a \text{ for any } a \in \mathbb{R}^k \} \]
(the Sobolev space of order $(1, p)$).
By linearity it is clear that if for $1 \le i \le k$ we denote by $e_i$ the $i$th canonical basis vector, then for $f \in W_1^p(\mathbb{R}^k)$ we have $d_a f = \sum_{i=1}^k a_i\, d_{e_i} f$ if $a = (a_1, \dots, a_k)$. Let $d_i = d_{e_i}$ for the canonical basis $e_1, \dots, e_k$ of $\mathbb{R}^k$.
Definition 4.2 Let $p \ge 1$, $s \in \mathbb{N}$. We define recursively
\[ W_s^p = \{ f \in W_1^p : d_a f \in W_{s-1}^p \text{ for any } a \in \mathbb{R}^k \} \]
(the Sobolev space of order $(s, p)$). For $f \in W_s^p$, $a_1, \dots, a_s \in \mathbb{R}^k$ we define recursively
\[ d_{a_1} d_{a_2} \cdots d_{a_s} f. \]
We define the $(1, p)$-Sobolev norm by
\[ \|f\|_{1,p} = \|f\|_p + \sum_{i=1}^k \|d_i f\|_p, \quad f \in W_1^p, \]
and analogous norms for higher derivative orders.
For any $p \ge 1$, $s \in \mathbb{N}$ we have
\[ C_0^\infty(\mathbb{R}^k) \subset W_s^p, \]
and for $g \in C_0^\infty(\mathbb{R}^k)$, $a = (a_1, \dots, a_k) \in \mathbb{R}^k$ we have
\[ d_a g = \sum_{i=1}^k a_i\, \frac{\partial g}{\partial x_i}. \]
What is the relationship between our Sobolev spaces and the weak derivatives or derivatives in the distributional sense encountered above?
Definition 4.3 Let $f \in L_{loc}^1(\mathbb{R}^k)$, $a \in \mathbb{R}^k$. Then $u_a \in L_{loc}^1(\mathbb{R}^k)$ is called the weak derivative of $f$ in direction $a$ if for any $\phi \in C_0^\infty(\mathbb{R}^k)$ the equation
\[ \langle f, d_a \phi \rangle = -\langle u_a, \phi \rangle \]
is satisfied.
Theorem 4.1 Let $p \ge 1$, $f \in L^p(\mathbb{R}^k)$. Then the following are equivalent:
(i) $f \in W_1^p$,
(ii) for any $a \in \mathbb{R}^k$, $f$ possesses a weak derivative $u_a$, and we have $u_a \in L^p(\mathbb{R}^k)$.
In this case, moreover, $d_a f = u_a$.
Proof
1. Let us show that (i) implies (ii). For this purpose, fix $a \in \mathbb{R}^k$ and $\phi \in C_0^\infty(\mathbb{R}^k)$. Then for $\epsilon > 0$, by translational invariance of Lebesgue measure,
\[ \int [f(x + \epsilon a) - f(x)]\, \phi(x)\, dx = \int f(x)\, [\phi(x - \epsilon a) - \phi(x)]\, dx. \]
Now use (i), divide by $\epsilon$ and let $\epsilon \to 0$, to identify the limit as $\int d_a f(x)\, \phi(x)\, dx$ on the left hand side, and as $-\int f(x)\, d_a \phi(x)\, dx$ on the right hand side. Hence for any $\phi \in C_0^\infty(\mathbb{R}^k)$
\[ \langle d_a f, \phi \rangle = -\langle f, d_a \phi \rangle. \]
This means by definition that $f$ possesses the weak derivative $d_a f$, which belongs to $L^p(\mathbb{R}^k)$.
2. Let us now prove that (ii) implies (i). Fix $\phi \in C_0^\infty(\mathbb{R}^k)$, $a = (a_1, \dots, a_k) \in \mathbb{R}^k$. Then by Taylor's formula with integral remainder term and Fubini's theorem we have for any $\epsilon > 0$
\begin{align*}
\int_{\mathbb{R}^k} \frac{1}{\epsilon} [f(x + \epsilon a) - f(x)]\, \phi(x)\, dx
&= \int_{\mathbb{R}^k} \frac{1}{\epsilon} f(x)\, [\phi(x - \epsilon a) - \phi(x)]\, dx \\
&= -\int_{\mathbb{R}^k} f(x) \Big[ \frac{1}{\epsilon} \int_0^\epsilon \sum_{i=1}^k a_i\, \frac{\partial \phi}{\partial x_i}(x - \vartheta a)\, d\vartheta \Big]\, dx \\
&= -\frac{1}{\epsilon} \int_0^\epsilon \Big[ \int_{\mathbb{R}^k} \sum_{i=1}^k a_i\, \frac{\partial \phi}{\partial x_i}(x - \vartheta a)\, f(x)\, dx \Big]\, d\vartheta \\
&= \frac{1}{\epsilon} \int_0^\epsilon \Big[ \int_{\mathbb{R}^k} \phi(x - \vartheta a)\, u_a(x)\, dx \Big]\, d\vartheta \\
&= \frac{1}{\epsilon} \int_0^\epsilon \Big[ \int_{\mathbb{R}^k} \phi(x)\, u_a(x + \vartheta a)\, dx \Big]\, d\vartheta \\
&= \int_{\mathbb{R}^k} \Big[ \frac{1}{\epsilon} \int_0^\epsilon u_a(x + \vartheta a)\, d\vartheta \Big]\, \phi(x)\, dx.
\end{align*}
It remains to prove that $\frac{1}{\epsilon} \int_0^\epsilon u_a(\cdot + \vartheta a)\, d\vartheta$ converges to $u_a$ in $L^p(\mathbb{R}^k)$. This is certainly true provided $u_a \in C_0^\infty(\mathbb{R}^k)$. But for any $f, g \in L^p(\mathbb{R}^k)$ we have, uniformly in $\epsilon > 0$,
\[ \Big\| \frac{1}{\epsilon} \int_0^\epsilon f(\cdot + \vartheta a)\, d\vartheta - \frac{1}{\epsilon} \int_0^\epsilon g(\cdot + \vartheta a)\, d\vartheta \Big\|_p \le \frac{1}{\epsilon} \int_0^\epsilon \| f(\cdot + \vartheta a) - g(\cdot + \vartheta a) \|_p\, d\vartheta = \|g - f\|_p. \]
By means of this observation we can transfer the desired result from $C_0^\infty(\mathbb{R}^k)$ to $L^p(\mathbb{R}^k)$, since $C_0^\infty(\mathbb{R}^k)$ is dense in $L^p(\mathbb{R}^k)$.
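As an illustration of Theorem 4.1 in dimension $k = 1$ (a numerical sketch; the choice of $f$ and the bounded window are ours): $f(x) = |x|$ has weak derivative $\mathrm{sgn}$, and its difference quotients converge to $\mathrm{sgn}$ in $L^1$ on $[-2, 2]$ — we restrict to a bounded window since $|x|$ itself is not in $L^p(\mathbb{R})$.

```python
def f(x):
    return abs(x)

def dq(x, eps):
    # difference quotient (f(x + eps) - f(x)) / eps
    return (f(x + eps) - f(x)) / eps

def sgn(x):
    return -1.0 if x < 0 else 1.0

def l1_err(eps, n=200000, lo=-2.0, hi=2.0):
    # midpoint Riemann sum of |dq - sgn| over [lo, hi]
    h = (hi - lo) / n
    return sum(abs(dq(lo + (i + 0.5) * h, eps) - sgn(lo + (i + 0.5) * h))
               for i in range(n)) * h

# the quotient differs from sgn only on an interval of length eps,
# so the L^1 error is of order eps and shrinks with it
assert l1_err(0.01) < l1_err(0.1) < 0.3
```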
Corollary 4.1 Let $e_1, \dots, e_k$ denote the canonical basis of $\mathbb{R}^k$, and let $(f_n)_{n \in \mathbb{N}}$ be a sequence in $W_1^p$ such that
(i) $\|f_n - f\|_p \to 0$ as $n \to \infty$,
(ii) for any $1 \le i \le k$ the sequence $(d_i f_n)_{n \in \mathbb{N}}$ converges in $L^p(\mathbb{R}^k)$.
Then $f \in W_1^p$ and $\|f_n - f\|_{1,p} \to 0$ as $n \to \infty$.
Proof
We have to show that $f$ is weakly differentiable in direction $e_i$ for $1 \le i \le k$, and that $d_i f = \lim_{n \to \infty} d_i f_n \in L^p(\mathbb{R}^k)$. For this purpose let
\[ u_i = \lim_{n \to \infty} d_i f_n, \]
which exists due to assumption (ii). Then by (i) for any $\phi \in C_0^\infty(\mathbb{R}^k)$, $1 \le i \le k$,
\[ -\int f(x)\, d_i \phi(x)\, dx = -\lim_{n \to \infty} \int f_n(x)\, d_i \phi(x)\, dx = \lim_{n \to \infty} \int d_i f_n(x)\, \phi(x)\, dx = \int u_i(x)\, \phi(x)\, dx. \]
This means that $f$ possesses weak directional derivatives in direction $e_i$ and $d_i f = u_i \in L^p(\mathbb{R}^k)$. Now Theorem 4.1 is applicable and finishes the proof.
Corollary 4.2 Let $p \ge 1$. Then $W_1^p$ is a Banach space with respect to the norm $\|\cdot\|_{1,p}$, and for any $a \in \mathbb{R}^k$ the mapping $d_a : W_1^p \to L^p(\mathbb{R}^k)$ is continuous.
Proof
We have to prove that $W_1^p$ is complete with respect to $\|\cdot\|_{1,p}$. Let therefore $(f_n)_{n \in \mathbb{N}}$ be a Cauchy sequence in $W_1^p$. Then setting $f = \lim_{n \to \infty} f_n$ in $L^p(\mathbb{R}^k)$, we see that the hypotheses (i) and (ii) of Corollary 4.1 are satisfied, and it suffices to apply this corollary.
We finally need a local version of Sobolev spaces.
Definition 4.4 For $p \ge 1$, $s \in \mathbb{N}$ let
\[ W_{s,loc}^p = \{ f : \mathbb{R}^k \to \mathbb{R} \text{ measurable} : \phi f \in W_s^p \text{ for all } \phi \in C_0^\infty(\mathbb{R}^k) \} \]
(the local Sobolev space of order $(s, p)$).
Theorem 4.2 Let $p \ge 1$, $s \in \mathbb{N}$. Then $f \in W_{s,loc}^p$ iff for any $x_0 \in \mathbb{R}^k$ there exists an open neighborhood $V_{x_0}$ of $x_0$ such that for any $\phi \in C_0^\infty(\mathbb{R}^k)$ with support in $V_{x_0}$ we have $\phi f \in W_s^p$.
Proof
We only need to prove the "if" part of the claim. For any $x_0 \in \mathbb{R}^k$ let therefore $V_{x_0}$ be given according to the statement of the assertion. Then $(V_{x_0})_{x_0 \in \mathbb{R}^k}$ is an open covering of $\mathbb{R}^k$. Hence there exists a locally finite partition of unity $(\phi_n)_{n \in \mathbb{N}} \subset C_0^\infty(\mathbb{R}^k)$ which is subordinate to the covering, i.e. such that
(i) $0 \le \phi_n \le 1$ for any $n \in \mathbb{N}$,
(ii) for any $n \in \mathbb{N}$ there exists $x_0(n)$ such that $\mathrm{supp}(\phi_n) \subset V_{x_0(n)}$,
(iii) $\sum_{n \in \mathbb{N}} \phi_n = 1$,
(iv) for any compact set $K \subset \mathbb{R}^k$ the intersection of $K$ and $\mathrm{supp}(\phi_n)$ is non-empty for at most finitely many $n$.
Now let $\psi \in C_0^\infty(\mathbb{R}^k)$. Then for any $n \in \mathbb{N}$, (ii) gives $\mathrm{supp}(\psi \phi_n) \subset V_{x_0(n)}$ and thus by assumption
\[ \psi \phi_n f \in W_s^p, \quad n \in \mathbb{N}. \]
Since by (iv) the function $\psi \phi_n$ is non-trivial for at most finitely many $n$, (iii) and linearity yield the desired
\[ \psi f \in W_s^p. \]
We now turn to Gaussian Sobolev spaces. Our analysis will again be based on the differential operators we know from the above sketched classical calculus. Only the measure with respect to which we consider duality changes from the Lebesgue to the Gaussian measure. Since we thereby pass from an infinite to a finite measure, integrability properties for functions and therefore the domains of the dual operators change. This is why the notion of local Sobolev spaces is important. On these spaces, we can define our operators locally, without reference to integrability first. In fact, using Theorem 4.2, for $s \in \mathbb{N}$, $p \ge 1$, $1 \le j_1, \dots, j_s \le k$, $f \in W_{s,loc}^p$ we can define
\[ d_{j_1} d_{j_2} \cdots d_{j_s} f \]
locally on an open neighborhood $V_{x_0}$ of an arbitrary point $x_0 \in \mathbb{R}^k$ by the corresponding generalized derivative of $\phi f$ with $\phi \in C_0^\infty(\mathbb{R}^k)$ such that $\phi = 1$ on an open neighborhood $U_{x_0} \subset V_{x_0}$ of $x_0$. This gives a globally unique notion, since $x_0$ is arbitrary.
Definition 4.5 Let $s \in \mathbb{N}$, $p \ge 1$, $1 \le j \le k$, $f \in W_{s,loc}^p$, and denote by $d_j$ the directional derivative in direction of the $j$th unit vector in $\mathbb{R}^k$ according to the preceding remark. Let then
\begin{align*}
\nabla f &= (d_1 f, \dots, d_k f), \\
\delta_j f &= -d_j f + x_j f, \\
L f &= \sum_{j=1}^k \delta_j d_j f = \sum_{j=1}^k \big[ -d_j d_j f + x_j\, d_j f \big].
\end{align*}
For any $1 \le r \le s$ we define more generally
\[ \nabla^r f = \big( d_{j_1} d_{j_2} \cdots d_{j_r} f : 1 \le j_1, j_2, \dots, j_r \le k \big). \]
This definition gives rise to the following notion of Gaussian Sobolev spaces.

Definition 4.6 Let $p \ge 1$, $s \in \mathbb{N}$. Then let
\[ D_s^p(\mathbb{R}^k) = \Big\{ f \in W_{s,loc}^p : \sum_{r=0}^s \big\|\, |\nabla^r f|\, \big\|_p < \infty \Big\}, \quad \|f\|_{s,p} = \sum_{r=0}^s \big\|\, |\nabla^r f|\, \big\|_p \]
(the $k$-dimensional Gaussian Sobolev space of order $(s, p)$; the $L^p$-norms are now taken with respect to $\nu_k$).
Remark
$D_s^p(\mathbb{R}^k)$ is a Banach space. This is seen by arguments as in the proof of Corollary 4.2.

Since our calculus will be based mostly on the Hilbert case $p = 2$, we shall restrict our attention to this case whenever convenient. In this case, our ONB composed of $k$-dimensional Hermite polynomials as investigated in the previous chapter will play a central role and adds structure to the setting. To get acquainted with Gaussian Sobolev spaces, let us compute the operators on the series expansions with respect to this ONB.
For $f \in L^2(\mathbb{R}^k, \nu_k)$ we can write
\[ f = \sum_{p \in E_k} \frac{c_p(f)}{p!}\, H_p \]
with coefficients $c_p(f) \in \mathbb{R}$, $p \in E_k$. Due to orthogonality, the Gaussian norm is given by
\[ \|f\|_2^2 = \sum_{p \in E_k} \frac{c_p(f)^2}{p!^2}\, \langle H_p | H_p \rangle = \sum_{p \in E_k} \frac{c_p(f)^2}{p!}. \]
We also write $f \sim (c_p(f))$ to denote this series expansion. Denote by $\mathcal{P}$ the linear hull of the $k$-dimensional Hermite polynomials. Plainly, $\mathcal{P} \subset W_{s,loc}^p$ for any $s \in \mathbb{N}$, $p \ge 1$. According to chapter 3, $\mathcal{P}$ is dense in $L^2(\mathbb{R}^k, \nu_k)$. And for functions in $\mathcal{P}$, the generalized derivatives $d_j$ are just identical to the usual partial derivatives in direction $j$, $1 \le j \le k$.
We first calculate the operators on Hermite polynomials. In fact, for $p \in E_k$, $1 \le j \le k$ we have in the non-trivial cases
\[ d_j H_p = p_j \prod_{i \ne j} H_{p_i} \cdot H_{p_j - 1}, \quad \delta_j H_p = \prod_{i \ne j} H_{p_i} \cdot H_{p_j + 1}, \quad L H_p = |p|\, H_p. \]
Hence for $f \sim (c_p(f)) \in \mathcal{P}$, $1 \le j \le k$ we may write
\begin{align*}
d_j f &= \sum_{p \in E_k} \frac{c_p(f)}{p!}\, p_j \prod_{i \ne j} H_{p_i} \cdot H_{p_j - 1}, \\
\delta_j f &= \sum_{p \in E_k} \frac{c_p(f)}{p!} \prod_{i \ne j} H_{p_i} \cdot H_{p_j + 1}, \\
L f &= \sum_{p \in E_k} \frac{c_p(f)}{p!}\, |p|\, H_p.
\end{align*}
According to Corollary 4.2 and the calculations just sketched, the natural domains of the operators extending $\nabla$, $\delta_j$ and $L$ beyond $\mathcal{P}$ must be those distributions on $\mathbb{R}^k$ for which the formulas just given generate convergent series in the $L^2$-norm with respect to $\nu_k$. The most important domain is the one of $\nabla$, the Sobolev space $D_1^2(\mathbb{R}^k)$. For $f \sim (c_p(f)) \in \mathcal{P}$ we have
\begin{align*}
\big\|\, |\nabla f|\, \big\|_2^2 &= \int_{\mathbb{R}^k} |\nabla f|^2(x)\, \nu_k(dx) = \sum_{j=1}^k \int_{\mathbb{R}^k} |d_j f|^2(x)\, \nu_k(dx) \\
&= \sum_{j=1}^k \sum_{p \in E_k} p_j^2\, \frac{c_p(f)^2}{p!^2} \prod_{i \ne j} p_i!\, (p_j - 1)! \\
&= \sum_{j=1}^k \sum_{p \in E_k} p_j\, \frac{c_p(f)^2}{p!} = \sum_{p \in E_k} |p|\, \frac{c_p(f)^2}{p!}.
\end{align*}
If in addition $f \in L^2(\mathbb{R}^k, \nu_k)$, we may write $f \sim (c_p(f))$ and approximate it by $f_n = \sum_{p \in E_k, |p| \le n} \frac{c_p(f)}{p!} H_p \in \mathcal{P}$, $n \in \mathbb{N}$. Hence, according to Corollary 4.2, $f$ belongs to $D_1^2(\mathbb{R}^k)$ if the following series converges:
\[ \big\|\, |\nabla f|\, \big\|_2^2 = \lim_{n \to \infty} \big\|\, |\nabla f_n|\, \big\|_2^2 = \lim_{n \to \infty} \sum_{p \in E_k, |p| \le n} |p|\, \frac{c_p(f)^2}{p!} = \sum_{p \in E_k} |p|\, \frac{c_p(f)^2}{p!} < \infty. \]
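The identity $\|\,|\nabla f|\,\|_2^2 = \sum_{p \in E_k} |p|\, c_p(f)^2 / p!$ can be tested for a concrete $f$ in dimension $k = 2$; the sketch below (helper names and the coefficient choice are ours) computes both sides exactly via Gaussian product moments:

```python
import math

def gm(n):  # E[X^n], X ~ N(0,1)
    return 0 if n % 2 else math.prod(range(1, n, 2))

def delta(p):  # 1-d: delta p = -p' + x*p
    shifted = [0] + p
    d = [i * a for i, a in enumerate(p)][1:]
    return [s - (d[i] if i < len(d) else 0) for i, s in enumerate(shifted)]

H = [[1]]
for n in range(1, 6):
    H.append(delta(H[-1]))

def herm2(p1, p2):
    # H_p(x1, x2) = H_p1(x1) * H_p2(x2) as a dict {(i, j): coeff}
    return {(i, j): a * b for i, a in enumerate(H[p1]) if a
            for j, b in enumerate(H[p2]) if b}

def dx(P, axis):  # partial derivative of a 2-variable polynomial
    out = {}
    for (i, j), c in P.items():
        n = (i, j)[axis]
        if n:
            k = (i - 1, j) if axis == 0 else (i, j - 1)
            out[k] = out.get(k, 0) + n * c
    return out

def e2(P):  # expectation under nu_2 (independent standard Gaussian coordinates)
    return sum(c * gm(i) * gm(j) for (i, j), c in P.items())

def mul2(P, Q):
    out = {}
    for (i, j), a in P.items():
        for (k, l), b in Q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + a * b
    return out

# f = sum_p c_p / p! * H_p for a few coefficients c_p, p in E_2
coeffs = {(1, 1): 2.0, (2, 0): 1.5, (0, 3): -1.0}
f = {}
for (p1, p2), c in coeffs.items():
    for key, v in herm2(p1, p2).items():
        f[key] = f.get(key, 0) + c / (math.factorial(p1) * math.factorial(p2)) * v

lhs = e2(mul2(dx(f, 0), dx(f, 0))) + e2(mul2(dx(f, 1), dx(f, 1)))
rhs = sum((p1 + p2) * c ** 2 / (math.factorial(p1) * math.factorial(p2))
          for (p1, p2), c in coeffs.items())
assert abs(lhs - rhs) < 1e-9   # both equal 10.75 here
```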
Along these lines, we now turn to describing Gaussian Sobolev spaces and the domains of our principal operators for $p = 2$ by means of Hermite expansions. We start with the case $k = 1$.
Theorem 4.3 Let $r \in \mathbb{N}$, $f \sim (c_p(f)) \in L^2(\mathbb{R}, \nu_1) \cap W_{r,loc}^2$. Denote $f_p = \frac{c_p(f)}{p!} H_p$, $p \ge 0$. Then the following are equivalent:
(i) $d^r f \in L^2(\mathbb{R}, \nu_1)$,
(ii) $\sum_{p \ge 0} p^r\, \|f_p\|_2^2 < \infty$,
(iii) $f \in D_r^2(\mathbb{R})$,
(iv) $\delta^r f \in L^2(\mathbb{R}, \nu_1)$.
In particular, $D_r^2(\mathbb{R})$ is the domain of $d^r$ and $\delta^r$ in $L^2(\mathbb{R}, \nu_1)$. For $f, g \in D_1^2(\mathbb{R})$ we have
\[ \langle df | g \rangle = \langle f | \delta g \rangle. \]
Proof
1. We prove equivalence of (i) and (ii). We have
\[ df = \sum_{p \ge 1} \frac{p\, c_p(f)}{p!}\, H_{p-1} = \sum_{p \ge 0} \frac{c_{p+1}(f)}{p!}\, H_p, \]
and therefore by iteration
\[ d^r f = \sum_{p \ge 0} \frac{c_{p+r}(f)}{p!}\, H_p. \]
Therefore
\[ \|d^r f\|_2^2 = \sum_{p \ge 0} \frac{c_{p+r}(f)^2}{p!}, \quad \|f_p\|_2^2 = \frac{c_p(f)^2}{p!}, \]
and hence
\[ \|d^r f\|_2^2 = \sum_{p \ge 0} \frac{(p+r)!}{p!}\, \|f_{p+r}\|_2^2 < \infty \]
if and only if
\[ \sum_{p \ge 0} (p+r)^r\, \|f_{p+r}\|_2^2 < \infty, \]
and this is the case if and only if
\[ \sum_{p \ge 0} p^r\, \|f_p\|_2^2 < \infty. \]
2. We next prove that (ii) and (iv) are equivalent. Note that
\[ \delta f = \sum_{p \ge 0} \frac{c_p(f)}{p!}\, H_{p+1}, \quad \text{and therefore} \quad \delta^r f = \sum_{p \ge 0} \frac{c_p(f)}{p!}\, H_{p+r}. \]
This implies that
\[ \|\delta^r f\|_2^2 = \sum_{p \ge 0} \frac{c_p(f)^2}{p!^2}\, (p+r)! = \sum_{p \ge 0} (p+r) \cdots (p+1)\, \frac{c_p(f)^2}{p!} < \infty \]
if and only if
\[ \sum_{p \ge 0} p^r\, \frac{c_p(f)^2}{p!} = \sum_{p \ge 0} p^r\, \|f_p\|_2^2 < \infty. \]
3. The equivalence of (i) and (iii) is contained in the definition.
4. Let $f \sim (c_p(f))$, $g \sim (c_p(g)) \in D_1^2(\mathbb{R})$. Then we have
\[ \langle df | g \rangle = \sum_{p \ge 0} \frac{c_{p+1}(f)}{p!}\, \frac{c_p(g)}{p!}\, p!, \]
whereas
\[ \langle f | \delta g \rangle = \sum_{p \ge 0} \frac{c_{p+1}(f)}{(p+1)!}\, \frac{c_p(g)}{p!}\, (p+1)!. \]
Both series coincide. This completes the proof.
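The series formula $\|df\|_2^2 = \sum_{p \ge 0} p\, \|f_p\|_2^2$ from part 1 (case $r = 1$) can be checked for a concrete coefficient sequence; a sketch with ad hoc helpers and arbitrarily chosen coefficients:

```python
import math

def gm(n):  # E[X^n], X ~ N(0,1)
    return 0 if n % 2 else math.prod(range(1, n, 2))

def expect(p):
    return sum(a * gm(i) for i, a in enumerate(p))

def mul(p, q):
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def deriv(p):
    return [i * a for i, a in enumerate(p)][1:] or [0]

def delta(p):  # delta p = -p' + x*p
    shifted = [0] + p
    d = deriv(p)
    return [s - (d[i] if i < len(d) else 0) for i, s in enumerate(shifted)]

H = [[1]]
for n in range(1, 6):
    H.append(delta(H[-1]))

def add_scaled(p, q, c):
    out = list(p) + [0] * max(0, len(q) - len(p))
    for i, b in enumerate(q):
        out[i] += c * b
    return out

c = [0.5, -1.0, 2.0, 0.0, 3.0]          # coefficients c_p, p = 0..4
f = [0.0]
for p, cp in enumerate(c):
    f = add_scaled(f, H[p], cp / math.factorial(p))

# ||df||^2 = sum_p p * ||f_p||^2 = sum_p p * c_p^2 / p!
df = deriv(f)
lhs = expect(mul(df, df))
rhs = sum(p * cp ** 2 / math.factorial(p) for p, cp in enumerate(c))
assert abs(lhs - rhs) < 1e-9   # both equal 6.5 for these coefficients
```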
The differential calculus on Gaussian spaces obeys similar rules as the classical differential calculus.
Theorem 4.4 Let $g \in D_1^4(\mathbb{R})$, $\mu = \nu_1 \circ g^{-1}$. If $\phi$ and its weak derivative $d\phi$ belong to $L^4(\mathbb{R}, \mu)$, we have
\[ d(\phi \circ g) = (d\phi) \circ g \cdot dg. \]
Proof
If $\phi$ and $g$ belong to $C_0^\infty(\mathbb{R})$, the assertion is clear. To generalize, approximate in $\mathcal{P}$ and use Hölder's inequality.
Theorem 4.5 Let $f, g \in D_1^4(\mathbb{R})$. Then $f \cdot g \in D_1^2(\mathbb{R})$ and we have
\[ d(f \cdot g) = df \cdot g + f \cdot dg. \]
Proof
The assertion is clear for $f, g \in C_0^\infty(\mathbb{R})$. To generalize, approximate in $\mathcal{P}$ and use Hölder's inequality.
We now turn to arbitrary finite dimension $k$, and interpret Gaussian Sobolev spaces by convergence properties of Hermite expansions as above.
Theorem 4.6 Let $f \sim (c_p(f)) \in L^2(\mathbb{R}^k, \nu_k) \cap W_{1,loc}^2$. Denote $f_p = \frac{c_p(f)}{p!} H_p$, $p \in E_k$. Then the following are equivalent:
(i) $|\nabla f| = \big[ \sum_{j=1}^k (d_j f)^2 \big]^{\frac{1}{2}} \in L^2(\mathbb{R}^k, \nu_k)$,
(ii) $\sum_{p \in E_k} |p|\, \|f_p\|_2^2 < \infty$,
(iii) $f \in D_1^2(\mathbb{R}^k)$.
In particular, $D_1^2(\mathbb{R}^k)$ is the domain of $\nabla$ in $L^2(\mathbb{R}^k, \nu_k)$. Analogous results hold for Sobolev spaces of order $(r, 2)$ with $r \in \mathbb{N}$.
Proof
Analogous to the proof of Theorem 4.3.
5 Infinite dimensional Gaussian Sobolev spaces
To refer the infinite dimensional setting to the finite dimensional one, we use the following observation. For $n \in \mathbb{N}$ recall
\[ \pi_n : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}^n, \quad (x_k)_{k \in \mathbb{N}} \mapsto (x_k)_{1 \le k \le n}. \]
Let for $n \in \mathbb{N}$
\[ \mathcal{C}_n = \sigma(\pi_n), \]
i.e. the $\sigma$-algebra generated by the first $n$ coordinate projections. Then $(\mathcal{C}_n)_{n \in \mathbb{N}}$ is a filtration on $(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}})$.
Lemma 5.1 Let $p \ge 1$, $f \in L^p(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}}, \nu)$. Then
\[ \tilde{f}_n = E(f \,|\, \mathcal{C}_n), \quad n \in \mathbb{N}, \]
defines a martingale which converges $\nu$-a.s. and in $L^p$ to $f$.
Proof
This follows from a standard theorem of discrete martingale theory.

Let in the following $f_n : \mathbb{R}^n \to \mathbb{R}$ denote the $n$-dimensional factorization of $\tilde{f}_n$, related to it by
\[ f_n \circ \pi_n = \tilde{f}_n, \quad n \in \mathbb{N}. \]
As a crucial observation for the definition of infinite dimensional Sobolev spaces, the martingale property is essentially not destroyed by the directional derivative operators.

Lemma 5.2 Let $p > 1$, $f \in L^p(\mathbb{R}^{\mathbb{N}})$, and let $(f_n)_{n \in \mathbb{N}}$ be the corresponding sequence according to the above remarks. Suppose that $\sup_{n \in \mathbb{N}} \|f_n\|_{1,p} < \infty$. Then for any $j \in \mathbb{N}$ the sequence $(d_j f_n \circ \pi_n)_{n \in \mathbb{N}}$ converges in $L^p(\mathbb{R}^{\mathbb{N}})$, to a limit that we denote by $d_j f$. Corresponding statements hold true for higher order derivatives.
Proof
Let $n, j \in \mathbb{N}$. Then for $n \ge j$ we have
\[ E\big( d_j f_{n+1} \circ \pi_{n+1} \,\big|\, \mathcal{C}_n \big) = d_j f_n \circ \pi_n. \]
This means that $(d_j f_n \circ \pi_n)_{n \ge j}$ is a martingale with respect to $(\mathcal{C}_n)_{n \ge j}$ which, due to
\[ \sup_{n \ge j} \| d_j f_n \circ \pi_n \|_p \le \sup_{n \in \mathbb{N}} \|f_n\|_{1,p} < \infty, \]
is bounded in $L^p(\mathbb{R}^{\mathbb{N}})$ and hence converges in $L^p(\mathbb{R}^{\mathbb{N}})$, due to $p > 1$.
The preceding lemmas give rise to the following definition of Sobolev spaces.

Definition 5.1 Let $p \ge 1$, $s \in \mathbb{N}$. Then
\[ D_s^p(\mathbb{R}^{\mathbb{N}}) = \Big\{ f \in L^p(\mathbb{R}^{\mathbb{N}}, \nu) : f_n \in D_s^p(\mathbb{R}^n),\ n \in \mathbb{N},\ \sup_{n \in \mathbb{N}} \|f_n\|_{s,p} < \infty \Big\} \]
(the infinite dimensional Sobolev space of order $(s, p)$), endowed with the norm
\[ \|f\|_{s,p} = \sup_{n \in \mathbb{N}} \|f_n\|_{s,p}, \quad f \in D_s^p(\mathbb{R}^{\mathbb{N}}). \]
This definition makes sense for the following reasons.

Theorem 5.1 Let $p > 1$, $s \in \mathbb{N}$. Then $D_s^p(\mathbb{R}^{\mathbb{N}})$ is a Banach space with the norm $\|\cdot\|_{s,p}$.
Proof
We prove the claim for $s = 1$. Let $(f^m)_{m \in \mathbb{N}}$ be a Cauchy sequence in $D_1^p(\mathbb{R}^{\mathbb{N}})$, and $(f_n^m)_{n,m \in \mathbb{N}}$ the corresponding finite dimensional functions according to the remarks above. Then for $m, l \in \mathbb{N}$, $n \in \mathbb{N}$, Jensen's inequality and the martingale statement in the preceding proof give the following estimate:
\[ \limsup_{m,l \to \infty} \| f_n^m - f_n^l \|_{1,p} \le \lim_{m,l \to \infty} \| f^m - f^l \|_{1,p} = 0. \]
$D_1^p(\mathbb{R}^n)$ being a Banach space for $n \in \mathbb{N}$, we know that
\[ f_n = \lim_{m \to \infty} f_n^m \in D_1^p(\mathbb{R}^n) \]
exists. Let $\tilde{f}_n = f_n \circ \pi_n$. Now let $f = \lim_{m \to \infty} f^m$ in $L^p(\mathbb{R}^{\mathbb{N}})$. Then by uniform integrability
\[ E(f \,|\, \mathcal{C}_n) = E\big( \lim_{m \to \infty} f^m \,\big|\, \mathcal{C}_n \big) = \lim_{m \to \infty} E(f^m \,|\, \mathcal{C}_n) = \lim_{m \to \infty} \tilde{f}_n^m = \tilde{f}_n. \]
Moreover
\[ \sup_{n \in \mathbb{N}} \|f_n\|_{1,p} \le \sup_{m,n \in \mathbb{N}} \|f_n^m\|_{1,p} \le \sup_{m \in \mathbb{N}} \|f^m\|_{1,p} < \infty. \]
Hence by definition $f \in D_1^p(\mathbb{R}^{\mathbb{N}})$, and by Fatou's lemma
\[ \| f - f^m \|_{1,p} \le \liminf_{l \to \infty} \| f^m - f^l \|_{1,p} \to 0 \]
as $m \to \infty$.
as m .
According to Lemma 5.2, the gradient on the innite dimensional
Gaussian Sobolev spaces is dened as follows.
Denition 5.2 Let p > 1, f D
p
1
(R
N
). Then let
f = (d
j
f)
jN
( Malliavin gradient or Malliavin derivative), where for any j N ac-
cording to Lemma 5.2
d
j
f = lim
n
d
j
f
n

n
.
Accordingly, for s N we dene
r
f, 1 r s, for f D
p
s
(R
N
).
Remark
The gradient being a continuous mapping from $D_1^p(\mathbb{R}^n)$ to $L^p(\mathbb{R}^n, \nu_n)$ for any finite dimension $n$, Lemma 5.2 and the definition of the Malliavin gradient imply that $\nabla$ is a continuous mapping from $D_1^p(\mathbb{R}^{\mathbb{N}})$ to $L^p(\mathbb{R}^{\mathbb{N}}, \nu)$.
Let us now again restrict our attention to $p = 2$ and describe Gaussian Sobolev spaces by means of the generalized Hermite polynomials. First of all, suppose $f = \sum_{p \in E} \frac{c_p(f)}{p!} H_p \in L^2(\mathbb{R}^{\mathbb{N}}, \nu)$. We shall continue to use the notation $f \sim (c_p(f))$. Then for $n \in \mathbb{N}$ we have
\[ f_n = \sum_{p \in E_n} \frac{c_{(p,0)}(f)}{p!}\, H_p, \]
where we put $(p, 0) = (p_1, \dots, p_n, 0, 0, \dots)$ for $p = (p_1, \dots, p_n) \in E_n$. Therefore, we also have
\[ \tilde{f}_n = \sum_{p \in E_n} \frac{c_{(p,0)}(f)}{p!}\, H_{(p,0)}. \]
Let again $\mathcal{P}$ be the linear hull generated by all generalized Hermite polynomials.
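The truncation behind $f_n$ can be made concrete: conditioning on $\mathcal{C}_n$ integrates out the coordinates beyond $n$, and every Hermite factor $H_{p_i}$ with $i > n$ and $p_i > 0$ has mean zero and disappears. A sketch for $n = 1$ in two coordinates (ad hoc helpers and an ad hoc choice of $f$):

```python
import math

def gm(n):  # E[X^n], X ~ N(0,1)
    return 0 if n % 2 else math.prod(range(1, n, 2))

def delta(p):  # delta p = -p' + x*p
    shifted = [0] + p
    d = [i * a for i, a in enumerate(p)][1:]
    return [s - (d[i] if i < len(d) else 0) for i, s in enumerate(shifted)]

H = [[1]]
for n in range(1, 4):
    H.append(delta(H[-1]))

def herm2(p1, p2):  # H_p1(x1) * H_p2(x2) as {(i, j): coeff}
    return {(i, j): a * b for i, a in enumerate(H[p1]) if a
            for j, b in enumerate(H[p2]) if b}

def cond_exp_first(P):
    # E(f | x1): integrate out the second coordinate with Gaussian moments;
    # returns a polynomial in x1 as a coefficient list
    deg = max(i for i, _ in P)
    q = [0] * (deg + 1)
    for (i, j), c in P.items():
        q[i] += c * gm(j)
    return q

# f(x1, x2) = H_1(x1) H_2(x2) + 3 H_2(x1) + 5
f = {}
for (p1, p2), c in [((1, 2), 1), ((2, 0), 3), ((0, 0), 5)]:
    for key, v in herm2(p1, p2).items():
        f[key] = f.get(key, 0) + c * v

# conditioning on C_1 kills every term with a non-trivial Hermite factor
# in the second coordinate: E(f | x1) = 3 H_2(x1) + 5 = 3 x1^2 + 2
assert cond_exp_first(f) == [2, 0, 3]
```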
As in the preceding chapter, we may calculate the gradient norms for $f \sim (c_p(f)) \in D_1^2(\mathbb{R}^{\mathbb{N}})$. In fact, we have for $j \in \mathbb{N}$
\[ d_j f = \lim_{n \to \infty} d_j f_n \circ \pi_n = \lim_{n \to \infty} \sum_{p \in E_n} \frac{c_{(p,0)}(f)}{p!}\, p_j \prod_{i \ne j} H_{p_i} \cdot H_{p_j - 1} \circ \pi_n = \sum_{p \in E} \frac{c_p(f)}{p!}\, p_j \prod_{i \ne j} H_{p_i} \cdot H_{p_j - 1}. \]
Furthermore, for $f \in D_1^2(\mathbb{R}^{\mathbb{N}})$ let us compute the norm of $|\nabla f| = \big[ \sum_{j \in \mathbb{N}} (d_j f)^2 \big]^{\frac{1}{2}}$ in $L^2(\mathbb{R}^{\mathbb{N}}, \nu)$. In fact, we have, using the calculation of gradient norms in the preceding chapter,
\[ \infty > \sup_{n \in \mathbb{N}} \big\|\, |\nabla f_n \circ \pi_n|\, \big\|_2^2 = \big\|\, |\nabla f|\, \big\|_2^2 = \sup_{n \in \mathbb{N}} \sum_{p \in E_n} |p|\, \frac{c_{(p,0)}(f)^2}{p!} = \sum_{p \in E} |p|\, \frac{c_p(f)^2}{p!}. \]
We therefore obtain the following main result about the description of the infinite dimensional Gaussian Sobolev spaces of order $(1, 2)$.
Theorem 5.2 For $f \in L^2(\mathbb{R}^{\mathbb{N}}, \nu)$ the following are equivalent:
(i) $f \in D_1^2(\mathbb{R}^{\mathbb{N}})$,
(ii) $\sum_{p \in E} |p|\, \frac{c_p(f)^2}{p!} < \infty$,
(iii) $|\nabla f_n| \circ \pi_n = \big[ \sum_{j=1}^n (d_j f_n)^2 \circ \pi_n \big]^{\frac{1}{2}}$ converges in $L^2(\mathbb{R}^{\mathbb{N}}, \nu)$ to $|\nabla f|$.
Moreover, $D_1^2(\mathbb{R}^{\mathbb{N}})$ is a Hilbert space with respect to the scalar product
\[ (f, g)_{1,2} = \langle f | g \rangle + \sum_{j \in \mathbb{N}} \langle d_j f | d_j g \rangle, \quad f, g \in D_1^2(\mathbb{R}^{\mathbb{N}}). \]
For $p \ge 2$, $\mathcal{P}$ is dense in $D_1^p(\mathbb{R}^{\mathbb{N}})$. Analogous results hold for Sobolev spaces of order $(s, 2)$ with $s \in \mathbb{N}$.
6 Absolute continuity in infinite dimensional Gaussian space

We are now in a position to discuss the main result of Malliavin's calculus in the framework of infinite dimensional Gaussian sequence spaces. The result is about the smoothness of laws of random variables defined on the Gaussian space. We start with a generalization of Lemma 2.1 to finite measures on $\mathcal{B}^d$ for $d \in \mathbb{N}$.
Lemma 6.1 Let $\mu$ be a finite measure on $\mathcal{B}^d$. Assume there exists $c \in \mathbb{R}$ such that for all $\phi \in C^1(\mathbb{R}^d)$ with bounded partial derivatives, and any $1 \le j \le d$, we have
\[ \Big| \int \frac{\partial \phi}{\partial x_j}(x)\, \mu(dx) \Big| \le c\, \|\phi\|_\infty. \]
Then $\mu \ll \lambda^d$ ($d$-dimensional Lebesgue measure).
Proof
For simplicity, we argue for $d = 2$, and omit the superscript denoting 2-dimensional Lebesgue measure.
1. Assume that $\phi \in C^1(\mathbb{R}^2)$ possesses compact support. We show:
\[ \Big[ \int |\phi|^2\, d\lambda \Big]^{\frac{1}{2}} \le \frac{1}{2} \Big[ \int \Big| \frac{\partial \phi}{\partial x_1} \Big|\, d\lambda + \int \Big| \frac{\partial \phi}{\partial x_2} \Big|\, d\lambda \Big]. \]
In fact, we have
\begin{align*}
\Big[ \int |\phi|^2\, d\lambda \Big]^{\frac{1}{2}} &\le \Big[ \int \sup_{x_1 \in \mathbb{R}} |\phi(x_1, x_2)|\, dx_2 \cdot \int \sup_{x_2 \in \mathbb{R}} |\phi(x_1, x_2)|\, dx_1 \Big]^{\frac{1}{2}} \\
&\le \Big[ \int \Big| \frac{\partial \phi}{\partial x_1}(x_1, x_2) \Big|\, dx_1\, dx_2 \cdot \int \Big| \frac{\partial \phi}{\partial x_2}(x_1, x_2) \Big|\, dx_2\, dx_1 \Big]^{\frac{1}{2}} \\
&\le \frac{1}{2} \Big[ \int \Big| \frac{\partial \phi}{\partial x_1} \Big|\, d\lambda + \int \Big| \frac{\partial \phi}{\partial x_2} \Big|\, d\lambda \Big].
\end{align*}
2. Let $0 \le u$ be continuous with compact support and such that $\int u\, d\lambda = 1$; define for $\epsilon > 0$: $u_\epsilon = \frac{1}{\epsilon^2} u\big(\frac{\cdot}{\epsilon}\big)$. Moreover, let
\[ \mu_\epsilon = \int u_\epsilon(\cdot - y)\, \mu(dy) \]
be a smoothed version of $\mu$. Then we obtain for $h$ continuous with compact support, using Fubini's theorem,
\[ \int \mu_\epsilon(x)\, h(x)\, dx = \int \Big[ \int u_\epsilon(x - y)\, h(x)\, dx \Big]\, \mu(dy) = \int \Big[ \int u(x)\, h(\epsilon x + y)\, dx \Big]\, \mu(dy) \to \int h(y)\, \mu(dy) \quad (\epsilon \to 0). \]
3. We show:
\[ L^2(\mathbb{R}^2) \ni g \mapsto \int g\, d\mu \in \mathbb{R} \]
is a continuous linear functional.
In fact, let $\phi \in C^1(\mathbb{R}^2)$ have compact support, and let $\epsilon > 0$. Then by hypothesis and smoothness of $\mu_\epsilon$, with a calculation as in 2.,
\begin{align*}
\Big| \int \frac{\partial \mu_\epsilon}{\partial x_i}(x)\, \phi(x)\, dx \Big| &= \Big| \int \mu_\epsilon(x)\, \frac{\partial \phi}{\partial x_i}(x)\, dx \Big| = \Big| \int \Big[ \int u_\epsilon(x - y)\, \frac{\partial \phi}{\partial x_i}(x)\, dx \Big]\, \mu(dy) \Big| \\
&= \Big| \int \Big[ \int \frac{\partial}{\partial x_i} u_\epsilon(x - y)\, \phi(x)\, dx \Big]\, \mu(dy) \Big| = \Big| \int \Big[ \int \frac{\partial}{\partial y_i} u_\epsilon(x - y)\, \phi(x)\, dx \Big]\, \mu(dy) \Big| \\
&\le c\, \Big\| \int u_\epsilon(x - \cdot)\, \phi(x)\, dx \Big\|_\infty \le c\, \|\phi\|_\infty.
\end{align*}
Generalizing this inequality to bounded measurable , and then taking
= sgn(

) yields the inequality

[

x
i

[d c
for any > 0. Now let > 0, g L
2
(R
2
) be given. Then, using 1. and
the estimate above
[

(x)g(x)dx[ [

(x)[
2
dx

[g(x)[
2
dx]
1
2

1
2
[

[

x
1

[d +

[

x
2

[d]
1
2
[[g[[
2
c[[g[[
2
.
Applying this inequality in the special case, in which g is continuous
with compact support, and using 2. we get
[

g(x)(dx)[ c[[g[[
2
.
Finally extend this inequality to g L
2
(R
2
) by approximating it with
continuous functions of compact support. This yields the desired conti-
nuity of the linear functional.
4. It remains to apply Riesz representation theorem to nd a square
integrable density for .
We now consider a vector f = (f¹, …, f^d) with components in L²(ℝ^ℕ, B^ℕ, µ). Our aim is to study the absolute continuity with respect to λ^d of the law of f under µ, i.e. of the probability measure µ ∘ f⁻¹. For this purpose we plan to apply the criterion of Lemma 6.1. Let φ ∈ C¹(ℝ^d) possess bounded partial derivatives. Then the integral transformation theorem gives
    ∫ (∂φ/∂x_i) d(µ ∘ f⁻¹) = ∫ (∂φ/∂x_i) ∘ f dµ.
In case d = 1, at this place we use integration by parts, hidden in the representations
    d(φ ∘ f) = φ′(f) df,  φ′(f) = d(φ ∘ f) (df)⁻¹.
Our infinite dimensional analogue of d is the Malliavin gradient ∇. Hence we need a chain rule for ∇.
Theorem 6.1 Let p ≥ 2, f ∈ D^p₁(ℝ^ℕ)^d, φ ∈ C¹(ℝ^d) with bounded partial derivatives. Then
    φ ∘ f ∈ D^p₁(ℝ^ℕ) and ∇[φ ∘ f] = Σ_{i=1}^d (∂φ/∂x_i)(f) ∇f^i.
Proof
Use Theorem 5.2 to choose a sequence (f_n)_{n∈ℕ} ⊂ 𝒫^d such that for any 1 ≤ i ≤ d
    ||f_n^i − f^i||_{1,p} → 0.
For each n ∈ ℕ we have
    ∇[φ ∘ f_n] = Σ_{i=1}^d (∂φ/∂x_i)(f_n) ∇f_n^i.
Since ∇ is continuous on D^p₁(ℝ^ℕ), and since the partial derivatives of φ are bounded, we furthermore obtain that
    ∇[φ ∘ f] = lim_{n→∞} ∇[φ ∘ f_n] = Σ_{i=1}^d (∂φ/∂x_i)(f) ∇f^i
in L²(ℝ^ℕ, µ). This completes the proof.
We next present a calculation leading to the verification of the absolute continuity criterion of Lemma 6.1. We concentrate on the algebraic steps, and remark that their analytic background can be easily provided with the theory of chapter 5. The first aim of the calculations must be to isolate, for a given test function φ ∈ C¹(ℝ^d) with bounded partial derivatives, the expression (∂φ/∂x_i)(f), 1 ≤ i ≤ d. Recall the notation
    (x, y) = Σ_{i=1}^∞ x_i y_i,  x, y ∈ l².
For 1 ≤ i, k ≤ d let
    σ^{ik} = (∇f^i, ∇f^k).
Then we have for 1 ≤ k ≤ d
    (∇(φ ∘ f), ∇f^k) = Σ_{j=1}^∞ d_j(φ ∘ f) d_j f^k = Σ_{j=1}^∞ Σ_{1≤i≤d} (∂φ/∂x_i)(f) d_j f^i d_j f^k = Σ_{1≤i≤d} (∂φ/∂x_i)(f) σ^{ik}.
We now assume that the matrix σ is (almost everywhere) invertible. Then, denoting its inverse by σ⁻¹, we may write
    (∂φ/∂x_i)(f) = Σ_{1≤k≤d} (∇(φ ∘ f), ∇f^k) σ⁻¹_{ki} = Σ_{1≤k≤d} Σ_{j=1}^∞ d_j(φ ∘ f) σ⁻¹_{ki} d_j f^k.
We next assume that the dual operator δ_j of d_j, which is defined in the usual way on 𝒫, is well defined, and that the series appearing is summable. Then we have
    ∫ (∂φ/∂x_i)(f) dµ = ∫ Σ_{1≤k≤d} Σ_{j=1}^∞ d_j(φ ∘ f) σ⁻¹_{ki} d_j f^k dµ = ∫ (φ ∘ f) [Σ_{1≤k≤d} Σ_{j=1}^∞ δ_j(d_j f^k σ⁻¹_{ki})] dµ.
The right hand side can be estimated by c ||φ||_∞ with
    c = || Σ_{1≤k≤d} Σ_{j=1}^∞ δ_j(d_j f^k σ⁻¹_{ki}) ||₂
in L²(ℝ^ℕ, µ). It can be seen (in analogy to Theorem 4.3) that this series makes sense under the hypotheses of the following main theorem.
Theorem 6.2 Suppose that f = (f¹, …, f^d) ∈ L²(ℝ^ℕ, µ)^d satisfies
(i) f^i ∈ D⁴₂(ℝ^ℕ) for 1 ≤ i ≤ d,
(ii) σ^{ik} = (∇f^i, ∇f^k), 1 ≤ i, k ≤ d, is µ-a.s. invertible, and σ⁻¹_{ki} ∈ D⁴₁(ℝ^ℕ) for 1 ≤ i, k ≤ d.
Then we have µ ∘ f⁻¹ ≪ λ^d.
Proof
Approximate f by polynomials, use continuity properties of the operators.
7 The canonical Wiener space: multiple integrals

We now return to the canonical Wiener space. The transfer between sequence and canonical space is provided by the isomorphism of chapter 1. We briefly recall it. Let (g_i)_{i∈ℕ} be an orthonormal sequence in L²(ℝ₊). Then it is given by
    T: L^p(ℝ^ℕ, B^ℕ, µ) → L^p(Ω, ℱ, P),  f ↦ f((W(g_i))_{i∈ℕ}),
where W(g_i) is the Gaussian stochastic integral of g_i for i ∈ ℕ. It will be constructed in the following chapter. For simplicity we confine our attention to the canonical Wiener space in one dimension, i.e. Ω = C(ℝ₊, ℝ), ℱ the (completed) Borel σ-algebra on Ω generated by the topology of uniform convergence on compact sets in ℝ₊, P Wiener measure on ℱ.
In the approach of differential calculus on Gaussian sequence spaces, in the Hilbert space setting the most important tool proved to be the Hermite expansions of functions in L²(ℝ^ℕ, µ). In the setting of the canonical space, they can be given a different interpretation, which we shall now develop.
According to our isomorphism, the objects corresponding to generalized Hermite polynomials on the canonical space are given by
    Π_{i=1}^∞ H_{p_i}(W(g_i)),  p ∈ E.
We shall interpret these objects as iterated Itô integrals. To do this, we use the abbreviation B¹₊ for the Borel sets of ℝ₊.
Definition 7.1 For m ∈ ℕ let
    ℰ_m = {f | f: ℝ₊^m → ℝ, f = Σ_{i₁,…,i_m=1}^n a_{i₁…i_m} 1_{A_{i₁}×⋯×A_{i_m}}, (A_i)_{1≤i≤n} ⊂ B¹₊ pairwise disjoint, a_{i₁…i_m} = 0 in case i_k = i_l for some k ≠ l}.
Remark
For f ∈ L²(ℝ₊) of the form
    f = Σ_{i=1}^n a_i 1_{J_i},  (J_i)_{1≤i≤n} pairwise disjoint intervals in ℝ₊,
let
    W(f) = Σ_{i=1}^n a_i W(J_i) = Σ_{i=1}^n a_i (W_{t_i} − W_{s_i}),
if J_i = ]s_i, t_i], 1 ≤ i ≤ n. Then by Itô's isometry we have
    ||W(f)||₂² = ||f||₂².
Since ℰ₁ is dense in L²(ℝ₊), we can extend the linear mapping f ↦ W(f) to L²(ℝ₊). Therefore, in particular for A ∈ B¹₊ with finite Lebesgue measure, W(A) = W(1_A) is defined. It will be used for the definition of the following multiple stochastic integrals. In this chapter, the scalar product ⟨·,·⟩ will be with respect to Lebesgue measure on ℝ₊^m with unspecified integer m.
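Itô's isometry ||W(f)||₂² = ||f||₂² is easy to test by simulation. The following Python sketch (an illustration added to these notes; the particular step function, seed and sample size are arbitrary choices) builds W(f) from independent Gaussian increments over pairwise disjoint intervals and compares the empirical second moment with ||f||₂²:

```python
import random

def W_of_step(a, intervals, rng):
    # W(f) = sum_i a_i (W_{t_i} - W_{s_i}) for f = sum_i a_i 1_{]s_i,t_i]}
    # with pairwise disjoint intervals; the increments are independent
    # N(0, t_i - s_i) random variables.
    return sum(ai * rng.gauss(0.0, (t - s) ** 0.5)
               for ai, (s, t) in zip(a, intervals))

rng = random.Random(0)
a = [2.0, -1.0, 3.0]
intervals = [(0.0, 1.0), (1.0, 2.5), (2.5, 3.0)]
# ||f||_2^2 = sum_i a_i^2 * lambda(J_i) = 4 + 1.5 + 4.5 = 10
l2_norm_sq = sum(ai**2 * (t - s) for ai, (s, t) in zip(a, intervals))

n = 200_000
second_moment = sum(W_of_step(a, intervals, rng) ** 2 for _ in range(n)) / n
print(l2_norm_sq, second_moment)
```

The Monte Carlo second moment agrees with ||f||₂² up to sampling error of order n^{−1/2}.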
Definition 7.2 For f = Σ_{i₁,…,i_m=1}^n a_{i₁…i_m} 1_{A_{i₁}×⋯×A_{i_m}} ∈ ℰ_m let
    I_m(f) = Σ_{i₁,…,i_m=1}^n a_{i₁…i_m} W(A_{i₁}) ⋯ W(A_{i_m}).
The additivity of the map B¹₊ ∋ A ↦ W(A) implies that I_m is well defined.
We state some elementary properties of I_m. Denote by 𝔖_m the set of all permutations of the numbers 1, …, m.
Lemma 7.1 Let m, q ∈ ℕ, f ∈ ℰ_m, g ∈ ℰ_q.
(i) I_m|ℰ_m is linear,
(ii) if
    f̃(t₁, …, t_m) = (1/m!) Σ_{π∈𝔖_m} f(t_{π(1)}, …, t_{π(m)})
(symmetrization of f), then I_m(f) = I_m(f̃),
(iii) E(I_m(f) I_q(g)) = m! ⟨f̃, g̃⟩ if m = q, and 0 otherwise.
Proof
1. (i) follows from the additivity of the map A ↦ W(A).
2. (ii) is a direct consequence of the fact that in the definition of I_m the product W(A_{i₁}) ⋯ W(A_{i_m}) is invariant under permutations of the factors.
3. By (ii), we may assume that f, g are symmetric. By choosing common subdivisions, we may further assume that
    f = Σ_{i₁,…,i_m=1}^n a_{i₁…i_m} 1_{A_{i₁}×⋯×A_{i_m}},  g = Σ_{i₁,…,i_q=1}^n b_{i₁…i_q} 1_{A_{i₁}×⋯×A_{i_q}}.
Now if m ≠ q, by the assumptions that (A_i)_{1≤i≤n} consists of pairwise disjoint Borel sets, and that coefficients vanish if two of the indices coincide, E(I_m(f) I_q(g)) = 0 is evident. Assume m = q. Then, again by these two assumptions and symmetry,
    E(I_m(f) I_m(g)) = Σ_{i₁,…,i_m=1}^n Σ_{j₁,…,j_m=1}^n a_{i₁…i_m} b_{j₁…j_m} E(Π_{p=1}^m W(A_{i_p}) W(A_{j_p}))
    = (m!)² Σ_{i₁<⋯<i_m} Σ_{j₁<⋯<j_m} a_{i₁…i_m} b_{j₁…j_m} E(Π_{p=1}^m W(A_{i_p}) W(A_{j_p}))
    = (m!)² Σ_{i₁<⋯<i_m} a_{i₁…i_m} b_{i₁…i_m} Π_{p=1}^m λ(A_{i_p})
    = m! ⟨f, g⟩.
To extend I_m beyond the space ℰ_m of elementary functions, we proceed as for m = 1.
Lemma 7.2 ℰ_m is dense in L²(ℝ₊^m) for any m ∈ ℕ.
Proof
We may assume m ≥ 2, the assertion being known for m = 1. Due to standard results of measure theory, it is enough to show: for A₁, …, A_m ∈ B¹₊ with finite Lebesgue measure, and ε > 0, there exists f ∈ ℰ_m such that
    ||1_{A₁×⋯×A_m} − f||₂ < ε.
Let δ > 0 be determined later. Choose B₁, …, B_n ∈ B¹₊ with finite Lebesgue measure, pairwise disjoint, such that for any 1 ≤ j ≤ n we have λ(B_j) < δ, and such that any A_i can be represented as a finite union of some of the B_j. Then we have
    1_{A₁×⋯×A_m} = Σ_{i₁,…,i_m=1}^n b_{i₁…i_m} 1_{B_{i₁}×⋯×B_{i_m}},
where b_{i₁…i_m} = 1 if B_{i₁}×⋯×B_{i_m} ⊂ A₁×⋯×A_m, and 0 if not. Let I = {(i₁, …, i_m): i_k ≠ i_l for k ≠ l}, and J = {1, …, n}^m \ I. Then by definition
    f = Σ_{(i₁,…,i_m)∈I} b_{i₁…i_m} 1_{B_{i₁}×⋯×B_{i_m}} ∈ ℰ_m,
and we have
    ||1_{A₁×⋯×A_m} − f||₂² = Σ_{(i₁,…,i_m)∈J} b²_{i₁…i_m} Π_{p=1}^m λ(B_{i_p})
    ≤ (m(m−1)/2) Σ_{i=1}^n λ(B_i)² (Σ_{i=1}^n λ(B_i))^{m−2}
    ≤ (m(m−1)/2) δ (Σ_{i=1}^n λ(B_i))^{m−1}.
Finally, we have to choose δ small enough.
Using Lemma 7.2, we may now extend I_m to L²(ℝ₊^m).
Definition 7.3 The linear and continuous extension of I_m|ℰ_m to L²(ℝ₊^m), which exists according to Lemma 7.2, is called multiple Wiener–Itô integral of degree m, and is also denoted by I_m.
Properties of the elementary integral are transferred in a straightforward way.
Theorem 7.1 Let m, q ∈ ℕ, f ∈ L²(ℝ₊^m), g ∈ L²(ℝ₊^q). Then
(i) I_m|L²(ℝ₊^m) is linear,
(ii) we have I_m(f) = I_m(f̃),
(iii) E(I_m(f) I_q(g)) = m! ⟨f̃, g̃⟩ if m = q, and 0 otherwise,
(iv) I₁(f) = W(f), f ∈ L²(ℝ₊).
Notation
We write
    I_m(f) = ∫_{ℝ₊^m} f(t₁, …, t_m) dW_{t₁} ⋯ dW_{t_m} = ∫_{ℝ₊^m} f(t₁, …, t_m) W(dt₁) ⋯ W(dt_m).
We next aim at explaining the relationship between generalized Hermite polynomials and multiple Wiener–Itô integrals. For this purpose we will need a recursive relationship between Hermite polynomials of different degrees.
Remark
Recall the definition of Hermite polynomials in one variable, given by H_n = δ^n 1. Moreover, we may compute for n ∈ ℕ
    x H_n = (d + δ) H_n = n H_{n−1} + H_{n+1},  or  H_{n+1} = x H_n − n H_{n−1}.
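The three-term recursion H_{n+1} = xH_n − nH_{n−1} (with H₀ = 1, H₁ = x) is immediate to implement and to check against the first few probabilists' Hermite polynomials H₂ = x² − 1, H₃ = x³ − 3x, H₄ = x⁴ − 6x² + 3. The following Python sketch is an illustration added here, not part of the notes:

```python
def hermite(n, x):
    # Probabilists' Hermite polynomials via the recursion
    # H_{n+1} = x H_n - n H_{n-1}, with H_0 = 1, H_1 = x.
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

x = 1.7
print(hermite(2, x), hermite(3, x), hermite(4, x))
```

Evaluating at any point reproduces the closed forms above to machine precision.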
For technical reasons, we need the following operation of contraction.
Definition 7.4 Let m ∈ ℕ. Suppose f ∈ L²(ℝ₊^m), g ∈ L²(ℝ₊). Then for t₁, …, t_m, t ∈ ℝ₊
    f ⊗ g(t₁, …, t_m, t) = f(t₁, …, t_m) g(t)  (tensor product),
    f ⊗₁ g(t₁, …, t_{m−1}) = ∫_{ℝ₊} f̃(t₁, …, t_m) g(t_m) dt_m  (contraction).
The recursion relation for Hermite polynomials will emerge from the recursion relation between Wiener–Itô integrals stated in the following lemma.
Lemma 7.3 Let m ∈ ℕ, f ∈ L²(ℝ₊^m), g ∈ L²(ℝ₊). Then we have
    I_m(f) I₁(g) = I_{m+1}(f ⊗ g) + m I_{m−1}(f ⊗₁ g).
Proof
1. By linearity and density of ℰ_m in L²(ℝ₊^m) we may assume that
    f = 1_{A₁×⋯×A_m},  g = 1_{A₀} or g = 1_{A₁},
where (A_i)_{0≤i≤m} ⊂ B¹₊ is a collection of pairwise disjoint Borel sets with finite Lebesgue measure.
2. The case g = 1_{A₀} is trivial. Then the second term on the right hand side of the claimed formula vanishes, and the other two terms are obviously identical by definition of the elementary integral.
3. Let now g = 1_{A₁}. For δ > 0 choose a collection of pairwise disjoint sets B₁, …, B_n ∈ B¹₊ such that A₁ = ⋃_{i=1}^n B_i, and for any 1 ≤ i ≤ n we have λ(B_i) < δ. Then
    I_m(f) I₁(g) = W(A₁)² W(A₂) ⋯ W(A_m)
    = Σ_{i≠j} W(B_i) W(B_j) W(A₂) ⋯ W(A_m)
    + Σ_{1≤i≤n} [W(B_i)² − λ(B_i)] W(A₂) ⋯ W(A_m)
    + λ(A₁) W(A₂) ⋯ W(A_m).
a) We now prove that the first term on the right hand side of our formula is close to I_{m+1}(f ⊗ g). In fact, let
    h_δ = Σ_{i≠j} 1_{B_i×B_j×A₂×⋯×A_m} ∈ ℰ_{m+1}.
Then
    ||h_δ − f ⊗ g||₂² ≤ Σ_{i=1}^n λ(B_i)² λ(A₂) ⋯ λ(A_m) ≤ δ λ(A₁) ⋯ λ(A_m).
b) Let us next prove that the second term on the right hand side is negligible in the limit δ → 0. In fact, denote it by R_δ. Then, since for 1 ≤ i ≤ n the variance of W(B_i)² − λ(B_i) is given by c λ(B_i)² with some constant c, we obtain
    E(R_δ²) ≤ c Σ_{i=1}^n λ(B_i)² λ(A₂) ⋯ λ(A_m) ≤ c δ λ(A₁) λ(A₂) ⋯ λ(A_m).
c) To evaluate the last term, note that
    (1_{A₁×⋯×A_m})^~ ⊗₁ 1_{A₁} = (1/m) (1_{A₂×⋯×A_m})^~ λ(A₁).
Therefore
    λ(A₁) W(A₂) ⋯ W(A_m) = m I_{m−1}(f ⊗₁ 1_{A₁}),
and, letting δ → 0, we obtain the desired recursion formula.
This finally puts us in a position to derive the relationship between Hermite polynomials and iterated stochastic integrals.
Theorem 7.2 Let m ∈ ℕ, h ∈ L²(ℝ₊) be such that ||h||₂ = 1. Denote by h^{⊗m} the m-fold tensor product of h with itself. Then we have
    H_m(W(h)) = I_m(h^{⊗m}).
Let ℋ₀ = ℝ, ℋ_m = I_m(L²(ℝ₊^m)), m ∈ ℕ. Then (ℋ_m)_{m≥0} is a sequence of pairwise orthogonal closed linear subspaces of L²(Ω, ℱ, P), and we have
    L²(Ω, ℱ, P) = ⊕_{m=0}^∞ ℋ_m.
In particular, for any F ∈ L²(Ω, ℱ, P) there exists a sequence (f_m)_{m≥0} of functions f_m ∈ L²(ℝ₊^m) such that
    F = Σ_{m=0}^∞ I_m(f_m).
The representation with symmetric f_m is λ^m-a.e. unique, m ∈ ℕ.
Proof
1. We first have to prove:
    H_m(W(h)) = I_m(h^{⊗m}).
This is done by induction on m. For m = 1, the formula is clear from H₁ = x, I₁(h) = W(h). Now assume it is known for m. Then Lemma 7.3 and the recursion formula for Hermite polynomials given above combine to yield, remembering that ||h||₂ = 1,
    I_{m+1}(h^{⊗(m+1)}) = I_m(h^{⊗m}) I₁(h) − m I_{m−1}(h^{⊗m} ⊗₁ h)
    = I_m(h^{⊗m}) I₁(h) − m I_{m−1}(h^{⊗(m−1)})
    = H_m(W(h)) H₁(W(h)) − m H_{m−1}(W(h))
    = H_{m+1}(W(h)).
2. Let L²_s(ℝ₊^m) be the linear space of symmetric functions in L²(ℝ₊^m). Then by Theorem 7.1
    ||I_m(f̃)||₂² = m! ||f̃||₂²,
hence ℋ_m = I_m(L²_s(ℝ₊^m)) is closed. Orthogonality is also a consequence of Theorem 7.1.
3. Let (g_i)_{i∈ℕ} be an orthonormal basis of L²(ℝ₊), F ∈ L²(Ω, ℱ, P). Let f ∈ L²(ℝ^ℕ, B^ℕ, µ) be such that T(f) = F. Assume f ∼ (c_p(f)), according to the notation of chapter 3. For m ≥ 0, define
    f_m = Σ_{p∈E, |p|=m} (c_p(f)/p!) ⊗_{i∈ℕ} g_i^{⊗p_i},
considered as a function of m variables. Then f_m ∈ L²(ℝ₊^m), and with the help of Lemma 7.3 we see
    I_m(f_m) = Σ_{p∈E, |p|=m} (c_p(f)/p!) I_m(⊗_{i∈ℕ} g_i^{⊗p_i})
    = Σ_{p∈E, |p|=m} (c_p(f)/p!) Π_{i∈ℕ} I_{p_i}(g_i^{⊗p_i})
    = Σ_{p∈E, |p|=m} (c_p(f)/p!) Π_{i∈ℕ} H_{p_i}(W(g_i)).
Summing this expression over m yields the desired
    F = Σ_{m=0}^∞ I_m(f_m).
The remaining claims are obvious.
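A quick Monte Carlo illustration (my addition, not from the notes): since W(h) is standard normal for ||h||₂ = 1, Theorem 7.2 together with Theorem 7.1 (iii) gives E[H_m(W(h)) H_q(W(h))] = δ_{mq} m!. The sketch below checks this for m, q ∈ {2, 3}; seed and sample size are arbitrary choices.

```python
import random

def hermite(n, x):
    # Probabilists' Hermite polynomials: H_{n+1} = x H_n - n H_{n-1}.
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

rng = random.Random(1)
n = 400_000
zs = [rng.gauss(0.0, 1.0) for _ in range(n)]

def moment(m, q):
    # Monte Carlo estimate of E[H_m(W(h)) H_q(W(h))] with ||h||_2 = 1,
    # i.e. E[H_m(Z) H_q(Z)] for a standard normal Z.
    return sum(hermite(m, z) * hermite(q, z) for z in zs) / n

print(moment(2, 2), moment(2, 3))  # close to 2! = 2 and to 0
```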
8 The canonical Wiener space: Malliavin's derivative

In this chapter we shall investigate the analogue of the gradient we encountered in the differential calculus on the sequence space. Fix again an orthonormal basis (g_i)_{i∈ℕ} of L²(ℝ₊), and recall the isomorphism
    T: L²(ℝ^ℕ, B^ℕ, µ) → L²(Ω, ℱ, P),  f ↦ f((W(g_i))_{i∈ℕ}).
Of course, every permutation of the orthonormal basis functions gives another orthonormal basis. So here we encounter a problem of coordinate dependence of our objects of study. How can we define Malliavin's derivative on the canonical space in a both consistent and basis independent way? According to Theorem 5.2,
    ∇f = (d_j f)_{j∈ℕ}
takes values in l². The corresponding object on the side of the canonical space is L²(ℝ₊). It is therefore plausible if we set
Definition 8.1 For n ∈ ℕ let C_p^∞(ℝ^n) denote the set of smooth functions the partial derivatives of which possess polynomial growth. Let
    𝒮 = {F | F = f(W(h₁), …, W(h_n)), h₁, …, h_n ∈ L²(ℝ₊), f ∈ C_p^∞(ℝ^n), n ∈ ℕ}.
For F = f(W(h₁), …, W(h_n)) ∈ 𝒮, t ≥ 0, let
    D_t F = Σ_{i=1}^n (∂f/∂x_i)(W(h₁), …, W(h_n)) h_i(t).
To see if this is a good candidate for the definition of the Malliavin gradient in the setting of the canonical space, let us verify in detail the independence of the specific representation of functionals. Let h₁, …, h_n ∈ L²(ℝ₊), and g₁, …, g_m ∈ L²(ℝ₊) orthonormal, such that the linear hulls of the two systems are identical, and such that with f ∈ C_p^∞(ℝ^n), g ∈ C_p^∞(ℝ^m) we have
    f(W(h₁), …, W(h_n)) = g(W(g₁), …, W(g_m)).
For 1 ≤ i ≤ n write
    h_i = Σ_{j=1}^m ⟨h_i, g_j⟩ g_j.
Then, denoting
    Λ = (⟨h_i, g_j⟩)_{1≤i≤n, 1≤j≤m},
we obviously have f ∘ Λ = g. Therefore
    Σ_{j=1}^m (∂(f ∘ Λ)/∂x_j)(W(g₁), …, W(g_m)) g_j = Σ_{j=1}^m Σ_{i=1}^n (∂f/∂x_i)(W(h₁), …, W(h_n)) ⟨h_i, g_j⟩ g_j = Σ_{i=1}^n (∂f/∂x_i)(W(h₁), …, W(h_n)) h_i.
This proves that the definition of D is independent of the representation of functionals in 𝒮.
If h ∈ L²(ℝ₊) is another function, we have by definition
    ⟨D·F, h⟩ = Σ_{i=1}^m (∂f/∂x_i)(W(g₁), …, W(g_m)) ⟨g_i, h⟩,
in particular for i ∈ ℕ
    ⟨D·F, g_i⟩ = (∂f/∂x_i)(W(g₁), …, W(g_m)) = d_i f(W(g₁), …, W(g_m)).
We therefore may interpret ⟨D·F, g_i⟩ as directional derivative in direction of g_i, and we have by Parseval's identity
    ⟨DF, DF⟩ = Σ_{i=1}^m ⟨DF, g_i⟩² = Σ_{i=1}^m (d_i f)²(W(g₁), …, W(g_m)) = |∇f|²(W(g₁), …, W(g_m)).
Analogously, higher derivatives are related to each other. So we see that the isomorphism T also maps ⟨DF, DF⟩ to |∇f|². Consequently, we can just transfer the definitions of Gaussian Sobolev spaces to the setting of the canonical Wiener space.
Definition 8.2 Let p ≥ 2, s ∈ ℕ. For F ∈ L²(Ω, ℱ, P) denote by f ∈ L²(ℝ^ℕ, B^ℕ, µ) the function for which we have F = f((W(g_i))_{i∈ℕ}). Then let
    D^p_s = {F | f ∈ D^p_s(ℝ^ℕ)}
(canonical Gaussian Sobolev space of order (s, p)), with the norm
    ||F||_{s,p} = ||f||_{s,p}.
For F = T(f) ∈ D^p_s, 1 ≤ r ≤ s, let
    D^r F = Σ_{j₁,…,j_r=1}^∞ d_{j₁} ⋯ d_{j_r} f((W(g_i))_{i∈ℕ}) g_{j₁} ⊗ ⋯ ⊗ g_{j_r}
(canonical Malliavin derivative of order r).
Remark
From our knowledge of sequence spaces we can easily derive that D^p_s is a Banach space with respect to the norm ||·||_{s,p} for p ≥ 2, s ∈ ℕ, and that for F ∈ D^p_s we have
    ||F||_{s,p} = Σ_{r=0}^s || ⟨D^r F, D^r F⟩^{1/2} ||_p.
We know that D^p_s is the closure of 𝒮 with respect to the norm ||·||_{s,p}. Turning to p = 2, we know that D²₁ is a Hilbert space with respect to the scalar product
    (F, G)_{1,2} = E(FG) + E(⟨DF, DG⟩),  F, G ∈ D²₁.
Moreover, we know that D is a closed operator, defined on D²₁, which is continuous as a mapping from D²₁ to L²(Ω × ℝ₊).
Let us now investigate how D operates on the decomposition into Wiener–Itô integrals.
Theorem 8.1 Let F = Σ_{m=0}^∞ I_m(f_m) ∈ L²(Ω, ℱ, P) be given, f_m symmetric for any m ≥ 0. Then we have
    F ∈ D²₁ if and only if Σ_{m=1}^∞ m · m! ||f_m||₂² < ∞.
In this case we have
    D_t F = Σ_{m=1}^∞ m I_{m−1}(f_m(·, t))
(for P ⊗ λ-a.e. (ω, t) ∈ Ω × ℝ₊).
Proof
1. Suppose that with respect to an orthonormal basis (g_i)_{i∈ℕ} of L²(ℝ₊) we have F = T(f) with f ∼ (c_p(f)). As before, for m ≥ 0 let
    f_m = Σ_{p∈E, |p|=m} (c_p(f)/p!) ⊗_{i∈ℕ} g_i^{⊗p_i}.
We interpret ⊗_{i∈ℕ} g_i^{⊗p_i} as ⊗_{j=1}^k g_{i_j}^{⊗p_{i_j}}, if i₁, …, i_k are precisely those indices for which p_{i₁}, …, p_{i_k} > 0.
2. Let now p ∈ E such that |p| = m, and let t ≥ 0. Then
    D_t I_m(⊗_{i∈ℕ} g_i^{⊗p_i}) = D_t Π_{i∈ℕ} I_{p_i}(g_i^{⊗p_i}) = D_t H_p((W(g_i))_{i∈ℕ})
    = Σ_{i∈ℕ} p_i Π_{j≠i} H_{p_j}(W(g_j)) H_{p_i−1}(W(g_i)) g_i(t)
    = I_{m−1}(Σ_{i∈ℕ} p_i (⊗_{j≠i} g_j^{⊗p_j}) ⊗ g_i^{⊗(p_i−1)} g_i(t)).
Hence, by closedness of D, symmetry of f_m and |p| = m, the desired formula
    D_t I_m(f_m) = m I_{m−1}(f_m(·, t))
follows.
3. For n ∈ ℕ let now
    F_n = Σ_{m=0}^n I_m(f_m).
By the closedness of the operator D and the remarks above, we know that
    F ∈ D²₁ if and only if (F_n)_{n∈ℕ} is Cauchy in D²₁.
Now we know from the first part of the proof that
    D F_n = Σ_{m=1}^n m I_{m−1}(f_m(·, ·)).
Let n, m ∈ ℕ, n ≥ m, be given. Then
    E(⟨D(F_n − F_m), D(F_n − F_m)⟩) = Σ_{k=m+1}^n k² ∫_{ℝ₊} (k−1)! ⟨f_k(·, t), f_k(·, t)⟩ dt = Σ_{k=m+1}^n k² (k−1)! ||f_k||₂².
Hence (DF_n)_{n∈ℕ} is a Cauchy sequence in L²(Ω × ℝ₊) if and only if Σ_{k=0}^∞ k · k! ||f_k||₂² < ∞. In this case, the series with the desired representation converges.
We need some rules to be able to calculate with the Malliavin gradient D.
Theorem 8.2 Let p ≥ 2, d ∈ ℕ, φ ∈ C¹(ℝ^d) with bounded partial derivatives, and let F = (F¹, …, F^d) ∈ (D^p₁)^d. Then φ ∘ F ∈ D^p₁ and we have
    D(φ ∘ F) = Σ_{i=1}^d (∂φ/∂x_i)(F) DF^i.
Proof
The proof of Theorem 6.1 translates.
With the following properties we prepare a study of the dual operator of D.
Theorem 8.3 Let F ∈ 𝒮, h ∈ L²(ℝ₊). Then we have
    E(⟨DF, h⟩) = E(F W(h)).
Proof
We may assume that F = f(W(g₁), …, W(g_n)), h = g₁ with respect to an orthonormal system g₁, …, g_n ∈ L²(ℝ₊). In this case we have, by duality of d₁ and δ₁,
    E(⟨DF, h⟩) = E((∂f/∂x₁)(W(g₁), …, W(g_n))) = ⟨d₁ f | 1⟩ = ⟨f | δ₁ 1⟩ = ⟨f | H_{(1,0,…)}⟩ = E(f(W(g₁), …, W(g_n)) W(g₁)) = E(F W(h)).
This completes the proof.
Theorem 8.4 Let F, G ∈ 𝒮, h ∈ L²(ℝ₊). Then we have
    E(G ⟨DF, h⟩) = E(FG W(h) − F ⟨DG, h⟩).
Proof
Apply Theorem 8.3 to the function FG.
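Theorem 8.3 can be tested numerically in its simplest instance (an added sketch, not part of the notes): for F = f(W(h)) with ||h||₂ = 1, the random variable W(h) is standard normal and ⟨DF, h⟩ = f′(W(h)), so the statement reduces to the Gaussian integration by parts identity E[f′(Z)] = E[Z f(Z)]. The choice f = tanh below is arbitrary.

```python
import math
import random

# F = f(W(h)) with ||h||_2 = 1, so W(h) is standard normal and
# <DF, h> = f'(W(h)).  Theorem 8.3 then reads E[f'(Z)] = E[Z f(Z)].
f = math.tanh
fprime = lambda z: 1.0 - math.tanh(z) ** 2

rng = random.Random(2)
n = 400_000
zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
lhs = sum(fprime(z) for z in zs) / n      # E <DF, h>
rhs = sum(z * f(z) for z in zs) / n       # E F W(h)
print(lhs, rhs)
```

Both estimates agree up to Monte Carlo error; using the same sample for both sides keeps the comparison tight.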
9 The canonical Wiener space: Skorokhod's integral

In this chapter we dedicate a more careful study to the dual operator (in the sense of Hilbert space theory) of the Malliavin gradient than in the Gaussian sequence spaces. In the setting of the canonical Wiener space, this operator turns out to be a stochastic integral.
So far we know that
    D: D²₁ → L²(Ω × ℝ₊)
is densely defined and linear.
Definition 9.1 Let
    dom(δ) = {u ∈ L²(Ω × ℝ₊): there is c ∈ ℝ such that for any F ∈ D²₁ we have E(⟨DF, u⟩) ≤ c ||F||₂}.
For u ∈ dom(δ) the mapping F ↦ E(⟨DF, u⟩) can be extended to a continuous linear functional. Hence by Riesz' representation we may find δ(u) ∈ L²(Ω) such that
    E(⟨DF, u⟩) = E(F δ(u)),  F ∈ D²₁.
Since D is densely defined, δ(u) is unique for any u ∈ dom(δ).
Definition 9.2 For u ∈ dom(δ) the uniquely determined random variable δ(u) ∈ L²(Ω) is called Skorokhod integral of u.
Notation
We write δ(u) = ∫_{ℝ₊} u_t δW_t.
Why is this operator called an integral? To answer this question, we first ask how elementary processes are integrated.
Definition 9.3 Let
    𝒮_{L²(ℝ₊)} = {u | u = Σ_{i=1}^n F_i h_i, F_i ∈ 𝒮, h_i ∈ L²(ℝ₊), n ∈ ℕ}.
Lemma 9.1 Let u = Σ_{i=1}^n F_i h_i ∈ 𝒮_{L²(ℝ₊)}. Then we have
    δ(u) = Σ_{i=1}^n [F_i W(h_i) − ⟨DF_i, h_i⟩].
Proof
By linearity of δ we may assume that u = Fh with F ∈ 𝒮 and h ∈ L²(ℝ₊). Then for G ∈ 𝒮, by means of Theorem 8.4,
    E(⟨u, DG⟩) = E(F ⟨h, DG⟩) = E(FG W(h) − G ⟨h, DF⟩) = E(G [F W(h) − ⟨DF, h⟩]).
Hence we have
    δ(u) = F W(h) − ⟨h, DF⟩.
This completes the proof.
Recall now the standard Wiener filtration (ℱ_t)_{t≥0}, which for t ≥ 0 is given by the P-completion ℱ_t of σ(W_s: s ≤ t). Lemma 9.1 yields the elementary Itô integral if F_i is ℱ_{t_i}-measurable and h_i = 1_{]t_i, t_{i+1}]}, where 0 = t₀ < t₁ < ⋯ < t_n, provided ⟨DF_i, h_i⟩ = 0, 1 ≤ i ≤ n − 1. This is indeed the case, as we will show now.
Lemma 9.2 Let F ∈ D²₁, A ∈ B¹₊, ℱ_A = σ(W(1_B): B ⊂ A, λ(B) < ∞). Then we have
    E(F | ℱ_A) ∈ D²₁ and D_t E(F | ℱ_A) = E(D_t F | ℱ_A) 1_A(t)
(in L²(Ω × ℝ₊)).
Proof
1. We first consider F = f(W(h₁), …, W(h_n)) ∈ 𝒮. By setting
    g(x₁, …, x_n, y₁, …, y_n) = f(x₁ + y₁, …, x_n + y_n),  x₁, …, y_n ∈ ℝ,
we can write
    F = g(W(h₁ 1_A), …, W(h_n 1_A), W(h₁ 1_{A^c}), …, W(h_n 1_{A^c})).
Let
    Q = P ∘ (W(h₁ 1_{A^c}), …, W(h_n 1_{A^c}))⁻¹.
Then by independence of ℱ_A and the vector (W(h₁ 1_{A^c}), …, W(h_n 1_{A^c})) we have
    E(F | ℱ_A) = ∫ g(W(h₁ 1_A), …, W(h_n 1_A), y₁, …, y_n) dQ(y₁, …, y_n).
Hence E(F | ℱ_A) ∈ 𝒮 and
    D_t E(F | ℱ_A) = Σ_{i=1}^n [∫ (∂g/∂x_i)(W(h₁ 1_A), …, W(h_n 1_A), y₁, …, y_n) dQ(y₁, …, y_n)] h_i(t) 1_A(t)
    = E(D_t F | ℱ_A) 1_A(t).
2. It remains to approximate F ∈ D²₁ by standard arguments.
Theorem 9.1 Let u ∈ L²(Ω × ℝ₊) be (ℱ_t)-adapted. Then
    u ∈ dom(δ) and δ(u) = ∫_{ℝ₊} u_t dW_t (Itô integral).
Proof
1. Let 0 ≤ s < t, F ∈ L²(Ω, ℱ_s, P). We prove:
    u = F 1_{]s,t]} ∈ dom(δ) and δ(u) = F (W_t − W_s).
a) Let first F ∈ 𝒮 in addition. Then by Lemmas 9.1 and 9.2 we may write
    δ(u) = F (W_t − W_s) − ⟨DF, 1_{]s,t]}⟩ = F (W_t − W_s) − ⟨DF 1_{[0,s]}, 1_{]s,t]}⟩ = F (W_t − W_s).
b) For F ∈ L²(Ω, ℱ_s, P) let (F_n)_{n∈ℕ} ⊂ 𝒮 be such that F_n → F in L²(Ω, ℱ, P). Then also 𝒮 ∋ G_n = E(F_n | ℱ_s) → F in L²(Ω, ℱ, P). Hence by a), for any n ∈ ℕ,
    δ(G_n 1_{]s,t]}) = G_n (W_t − W_s).
Moreover, this sequence is a Cauchy sequence in L²(Ω, ℱ, P). Since δ is a closed operator (as a dual operator), we obtain that
    F 1_{]s,t]} ∈ dom(δ) and δ(F 1_{]s,t]}) = F (W_t − W_s).
2. a) Let now u = Σ_{j=1}^n F_j 1_{]s_j,t_j]} ∈ L²(Ω × ℝ₊), where s_j ≤ t_j and F_j is ℱ_{s_j}-measurable, 1 ≤ j ≤ n. Then by linearity
    u ∈ dom(δ) and δ(u) = Σ_{j=1}^n F_j (W_{t_j} − W_{s_j}).
b) Now given u as in the claim, choose a sequence (u^n)_{n∈ℕ} of simple adapted processes as in a) such that ||u^n − u||₂ → 0 in L²(Ω × ℝ₊). Then use the closedness of δ and the definition of the Itô integral to obtain that
    δ(u) = lim_{n→∞} δ(u^n) = lim_{n→∞} ∫_{ℝ₊} u_t^n dW_t = ∫_{ℝ₊} u_t dW_t.
This completes the proof.
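For a concrete instance of Theorem 9.1 (an added numerical illustration, not part of the notes): for the adapted integrand u_t = W_t the Itô integral has the closed form ∫₀^T W_t dW_t = (W_T² − T)/2, which a left-point Riemann sum reproduces path by path up to discretization error.

```python
import random

def ito_sum_and_closed_form(n_steps, rng, T=1.0):
    # Left-point Riemann sum sum_i W_{t_i} (W_{t_{i+1}} - W_{t_i})
    # versus the closed form (W_T^2 - T)/2 of the Ito integral.
    dt = T / n_steps
    w, s = 0.0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)
        s += w * dw
        w += dw
    return s, (w * w - T) / 2.0

rng = random.Random(3)
for _ in range(5):
    riemann, closed = ito_sum_and_closed_form(4000, rng)
    print(abs(riemann - closed))  # small discretization error per path
```

The per-path gap is (1/2) Σ_i [(ΔW)² − Δt], which vanishes in L² as the mesh goes to 0.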
We next ask the question how the Skorokhod integral operates on the decomposition into multiple Wiener–Itô integrals.
Lemma 9.3 Let u ∈ L²(Ω × ℝ₊). Then for m ≥ 0 there exist functions f_m ∈ L²(ℝ₊^{m+1}) such that f_m is symmetric in its first m variables, and such that
    u_t = Σ_{m=0}^∞ I_m(f_m(·, t)) in L²(Ω × ℝ₊).
We have
    E(∫_{ℝ₊} u_s² ds) = Σ_{m=0}^∞ m! ||f_m||₂².
Proof
Choose a sequence of elementary processes (u^n)_{n∈ℕ} ⊂ L²(Ω × ℝ₊) such that
    ||u^n − u||₂ → 0 (n → ∞).
Suppose F_k^n ∈ L²(Ω), g_k^n ∈ L²(ℝ₊), 1 ≤ k ≤ m_n, n ∈ ℕ, are given such that
    u_t^n = Σ_{k=1}^{m_n} F_k^n g_k^n(t).
For n ∈ ℕ, 1 ≤ k ≤ m_n let
    F_k^n = Σ_{m=0}^∞ I_m(f_m^{k,n}),  f_m^{k,n} ∈ L²(ℝ₊^m) symmetric.
Then we have
    u_t^n = Σ_{m=0}^∞ I_m(Σ_{k=1}^{m_n} f_m^{k,n} g_k^n(t)),  t ∈ ℝ₊.
Define
    f_m^n = Σ_{k=1}^{m_n} f_m^{k,n} ⊗ g_k^n,  m ≥ 0, n ∈ ℕ.
Then f_m^n ∈ L²(ℝ₊^{m+1}), f_m^n is symmetric in its first m variables, and due to orthogonality and symmetry we have for l, n ∈ ℕ
    ||u^n − u^l||₂² = Σ_{m=0}^∞ m! ||f_m^n − f_m^l||₂².
Hence for any m ≥ 0, (f_m^n)_{n∈ℕ} is a Cauchy sequence in L²(ℝ₊^{m+1}) which converges to a function f_m which is also symmetric in the first m variables. We obtain for the truncations u^{n,M}_t = Σ_{m=0}^M I_m(f_m^n(·, t)), n, M ∈ ℕ,
    ∞ > ||u||₂² = lim_{n→∞} ||u^n||₂² = sup_{M∈ℕ} lim_{n→∞} ||u^{n,M}||₂² = sup_{M∈ℕ} Σ_{m=0}^M m! ||f_m||₂² = Σ_{m=0}^∞ m! ||f_m||₂².
By a similar argument and by definition we must have
    u_t = Σ_{m=0}^∞ I_m(f_m(·, t)),  t ≥ 0.
This completes the proof.
Theorem 9.2 Let u ∈ L²(Ω × ℝ₊), u = Σ_{m=0}^∞ I_m(f_m(·, ·)) according to Lemma 9.3. Then we have
    u ∈ dom(δ) if and only if Σ_{m=0}^∞ (m+1)! ||f̃_m||₂² < ∞.
In this case
    δ(u) = Σ_{m=0}^∞ I_{m+1}(f_m).
Proof
1. Let n ∈ ℕ, g ∈ L²(ℝ₊^n) symmetric, G = I_n(g). We show:
    E(⟨u, DG⟩) = E(I_n(f_{n−1}) G).
In fact, by Theorem 8.1,
    E(⟨u, DG⟩) = E(⟨u, n I_{n−1}(g(·, ·))⟩) = n ∫_{ℝ₊} E(I_{n−1}(f_{n−1}(·, t)) I_{n−1}(g(·, t))) dt = n! ⟨f̃_{n−1}, g⟩ = E(I_n(f_{n−1}) G).
2. Let us now prove the "only if" part of the claim. For this purpose, let u ∈ dom(δ), G = I_n(g) ∈ ℋ_n, n ∈ ℕ. Then by the first part and by duality
    E(δ(u) G) = E(I_n(f_{n−1}) G).
By extending G over the other components of L²(Ω) by linearity, we obtain
    L²(Ω) ∋ δ(u) = Σ_{n=0}^∞ I_{n+1}(f_n), and Σ_{n=0}^∞ (n+1)! ||f̃_n||₂² < ∞.
3. Let us now establish the "if" part of the claim. Assume that Σ_{n=0}^∞ (n+1)! ||f̃_n||₂² < ∞. Let V = Σ_{n=0}^∞ I_{n+1}(f_n), which is then well defined, and G = Σ_{n=0}^∞ I_n(g_n) ∈ L²(Ω) with g_n ∈ L²(ℝ₊^n), n ∈ ℕ, symmetric, and finally let G_n = Σ_{k=0}^n I_k(g_k), n ∈ ℕ. Then we may again appeal to the first part of the proof to get
    E(⟨u, DG_n⟩) = E(V G_n),  n ∈ ℕ.
By approximation, this equation extends to G ∈ D²₁. Since V ∈ L²(Ω), we obtain u ∈ dom(δ) and δ(u) = V. This completes the proof.
For practical purposes it is not easy to deal with dom(δ) when discussing the Skorokhod integral. The space is analytically hardly accessible. For a simpler treatment of questions related to the calculus of Skorokhod's integral it is preferable to work on the following subspace.
Definition 9.4 Let
    𝕃²₁ = {u | u ∈ L²(Ω × ℝ₊), u_t ∈ D²₁ for a.a. t ≥ 0, and for some measurable version of (s, t) ↦ D_s u_t we have E(∫_{ℝ₊} ∫_{ℝ₊} |D_s u_t|² ds dt) < ∞}.
Remark
𝕃²₁ is a Hilbert space with the norm ||u||²_{1,2} = ||u||₂² + ||Du||₂².
How can 𝕃²₁ be described in terms of Wiener–Itô decompositions?
Remark
Let u ∈ L²(Ω × ℝ₊), u = Σ_{m=0}^∞ I_m(f_m(·, ·)) according to Lemma 9.3. Let us formulate in these terms the conditions of the definition of 𝕃²₁. First of all, for t ≥ 0, according to Theorem 8.1, u_t ∈ D²₁ means that
    Σ_{m=0}^∞ m · m! ||f_m(·, t)||₂² < ∞.
In the same terms, E(∫_{ℝ₊} ∫_{ℝ₊} |D_s u_t|² ds dt) < ∞ then means that
    Σ_{m=0}^∞ m · m! ∫_{ℝ₊} ||f_m(·, t)||₂² dt = Σ_{m=0}^∞ m · m! ||f_m||₂² < ∞.
The latter is the case iff
    Σ_{m=0}^∞ (m+1)! ||f_m||₂² < ∞.
Compare this with the condition we obtained in Theorem 9.2. Since for m ≥ 0 we have ||f̃_m||₂ ≤ ||f_m||₂, we obviously have
    𝕃²₁ ⊂ dom(δ).
How is Itô's isometry transferred to the Skorokhod integral?
Theorem 9.3 Let u, v ∈ 𝕃²₁. Then
    E(δ(u) δ(v)) = E(∫_{ℝ₊} u_t v_t dt) + E(∫_{ℝ₊²} D_t u_s D_s v_t ds dt).
Proof
1. Let first u ∈ 𝒮_{L²(ℝ₊)}. We show:
    D_t δ(u) = u_t + δ(D_t u)  (in L²(Ω × ℝ₊)).
By linearity, we may further assume that u = Fh, where F ∈ 𝒮, h ∈ L²(ℝ₊). Then according to Lemma 9.1 we may write δ(u) = F W(h) − ⟨DF, h⟩, and therefore
    D_t δ(u) = D_t F · W(h) + F h(t) − ⟨D_t DF, h⟩ = u_t + δ(D_t F · h) = u_t + δ(D_t u).
2. Let still u ∈ 𝒮_{L²(ℝ₊)}. Then duality, the calculation just obtained and the fact that v ∈ 𝕃²₁ lead to
    E(δ(u) δ(v)) = E(⟨D δ(u), v⟩) = E(⟨u, v⟩ + ⟨δ(D· u), v⟩) = E(⟨u, v⟩) + E(∫_{ℝ₊²} D_t u_s D_s v_t ds dt).
3. It remains to do an approximation of u by functions in 𝒮_{L²(ℝ₊)}, and to use that convergence in 𝕃²₁ implies convergence of all three terms in the formula.
We finally give a rule for the Malliavin differentiation of Itô integrals, which will be of use in the applications of Malliavin's calculus to stochastic analysis to be discussed.
Theorem 9.4 Let u ∈ L²(Ω × [0, 1]) be adapted, and let X_t = ∫₀^t u_s dW_s, 0 ≤ t ≤ 1, be its Itô integral process. Then we have
    u ∈ 𝕃²₁ if and only if X_T ∈ D²₁ for all T ∈ [0, 1].
In this case X ∈ 𝕃²₁, and for 0 ≤ t ≤ T ≤ 1 we have
    D_t X_T = u_t 1_{[0,T]}(t) + ∫_t^T D_t u_r dW_r,
and
    ∫₀^T E(|D_t X_T|²) dt = ∫₀^T E(u_t²) dt + ∫₀^T ∫_t^T E(|D_t u_r|²) dr dt.
48
Proof
For simplicity let T = 1. This time we use Wiener-Ito decompositions
for approximations.
1. We will use an extension of the representation of Lemma 9.3 to
adapted u. Let u =

m=0
I
m
(f
m
(, )) be the representation according
to Lemma 9.3. Here for any m 0 the function f
m
L
2
(R
m+1
+
) is
symmetric in its rst m variables. We show:
f
m
(, t) = f
m
(, t)1
[0,t]
m() (in L
2
(R
m+1
+
)), and thus
[[

f
m
[[
2
2
=
1
m + 1
[[f
m
[[
2
2
.
In fact, resume the notation of the proof of Lemma 9.3, to specialize to
the adapted case. We approximate u in L
2
( [0, 1]) by functions
u
n
=
m
n

k=1
F
n
k
1
]t
n
k
,t
n
k+1
]
, n N,
where 0 = t
n
1
< < t
n
m
n
= 1, F
n
k
is T
t
n
k
-measurable. In the Wiener-Ito
decomposition
F
n
k
=

m=0
I
m
(f
k,n
m
)
of F
n
k
, due to its measurability properties and Lemma 9.2, we have for
t
1
, , t
m
R
+
f
k,n
m
(t
1
, , t
m
) = m!D
t
1
D
t
m
I
m
(f
k,n
m
)
= m!D
t
1
D
t
m
I
m
(f
k,n
m
)1
[0,t
n
k
]
m(t
1
, , t
m
)
= f
k,n
m
(t
1
, , t
m
) 1
[0,t
n
k
]
m(t
1
, , t
m
).
Hence the functions
f
n
m
=
m
n

k=1
F
k,n
m
1
]t
n
k
,t
n
k
+
1
]
, m 0, n N,
possess the property
f
n
m
(, t) = f
n
m
(, t)1
[0,t]
m() for any t 0.
Now use a diagonal sequence argument to select a subsequence (v
n
)
nN
of (u
n
)
nN
with corresponding Wiener-Ito functions g
n
m
L
2
(R
m+1
+
),
49
symmetric in their rst m variables, and such that for any m 0 g
n
m

f
m
P -a.e. Hence we have for m 0
f
m
(, t) = f
m
(, t)1
[0,t]
m() (in L
2
(R
m+1
+
)).
To prove the second assertion, note that due to the validity of the rst

f
m
=
1
m+ 1
m+1

i=1
h
i
m
,
where the h
i
m
have disjoint support, and their norms in L
2
(R
m+1
+
) are
identical to the one of f
m
. Hence
[[

f
m
[[
2
2
=
1
(m+ 1)
2
m+1

i=1
[[h
i
m
[[
2
2
=
1
m+ 1
[[f
m
[[
2
2
,
as claimed.
2. Now, as was shown in the remark above, u ∈ 𝕃²₁ translates into
    Σ_{m=0}^∞ (m+1)! ||f_m||₂² < ∞.
Moreover, we know that X₁ = δ(u) = Σ_{m=0}^∞ I_{m+1}(f_m), and by Theorem 8.1 that X₁ ∈ D²₁ if and only if
    Σ_{m=0}^∞ (m+1)(m+1)! ||f̃_m||₂² < ∞.
But according to the first part of the proof,
    Σ_{m=0}^∞ (m+1)(m+1)! ||f̃_m||₂² = Σ_{m=0}^∞ (m+1)! ||f_m||₂².
This proves the first claim of the theorem.
3. Let us next prove the differentiation formula. Note first that due to Lemma 9.2
    D_t u_s = D_t u_s 1_{[t,1]}(s)  (in L²(Ω × [0,1]²)).
Moreover, for t ≤ s ≤ 1 we have, according to Theorem 8.1,
    D_t u_s = D_t Σ_{m=0}^∞ I_m(f_m(·, s)) = Σ_{m=1}^∞ m I_{m−1}(f_m(·, t, s)).
Since Du ∈ L²(Ω × [0,1]²) according to the definition of 𝕃²₁, and since for fixed t the process (D_t u_s)_{t≤s≤1} is adapted, we know from Theorem 9.1 that (D_t u_s)_{t≤s≤1} is Itô integrable and that its Itô and Skorokhod integrals are identical. More formally,
    ∫_t^1 D_t u_s dW_s = ∫_0^1 D_t u_s δW_s = δ(D_t u) = Σ_{m=1}^∞ m I_m(f_m(·, t, ·)).
Finally, we know that X₁ = δ(u) ∈ D²₁. We can compute the Malliavin derivative for t ∈ [0,1] (in the usual sense of equality in L²(Ω × [0,1])):
    D_t δ(u) = D_t Σ_{m=0}^∞ I_{m+1}(f_m) = D_t Σ_{m=0}^∞ I_{m+1}(f̃_m)
    = Σ_{m=0}^∞ (m+1) I_m(f̃_m(·, t))
    = Σ_{m=0}^∞ I_m(f_m(·, t)) + Σ_{m=1}^∞ m I_m(f_m(·, t, ·)).
The last line is now identified with u_t + ∫_t^1 D_t u_s dW_s by using the expansions of these two expressions given above. The norm equation follows from Itô's isometry.
10 Backward stochastic differential equations

Backward stochastic differential equations (BSDE) constitute a very successful and active tool for stochastic finance and insurance, and more generally serve as a central method of stochastic control theory. In this chapter we shall establish the basic existence and uniqueness theory for these equations in case the coefficients are Lipschitz continuous.
We fix for the sequel a finite time horizon T > 0, and a dimension m ∈ ℕ. We start by explaining some notation. Let (Ω, ℱ, P) be the canonical n-dimensional Wiener space, with canonical Wiener process W = (W¹, …, W^n). Denote by (ℱ_t)_{t≥0} the filtration of the canonical space, i.e. the natural filtration completed by sets of P-measure 0.
Let L²(ℝ^m) be the linear space of ℝ^m-valued ℱ_T-measurable random variables X, endowed with the norm E(|X|²)^{1/2}. Let H²(ℝ^m) denote the linear space of (ℱ_t)_{0≤t≤T}-adapted measurable processes X: Ω × [0, T] → ℝ^m endowed with the norm
    ||X||₂ = E(∫₀^T |X_t|² dt)^{1/2}.
Further let H¹(ℝ^m) denote the space of (ℱ_t)_{0≤t≤T}-adapted measurable processes X: Ω × [0, T] → ℝ^m with the norm
    ||X||₁ = E([∫₀^T |X_t|² dt]^{1/2}).
Finally, for β > 0 and X ∈ H²(ℝ^m) let
    ||X||²_{2,β} = E(∫₀^T e^{βt} |X_t|² dt),
and H^{2,β}(ℝ^m) the space H²(ℝ^m) endowed with the norm ||·||_{2,β}.
We next describe the general hypotheses we want to require for the parameters of our BSDE. The terminal condition $\xi$ will be supposed to belong to $L^2(\mathbb{R}^m)$. The generator will be a function
$$f : \Omega \times \mathbb{R}_+ \times \mathbb{R}^m \times \mathbb{R}^{n \times m} \to \mathbb{R}^m,$$
which is product measurable, adapted in the time parameter, and which fulfills

(H1) $f(\cdot, 0, 0) \in H^2(\mathbb{R}^m)$;

(H2) $f$ is uniformly Lipschitz, i.e. there exists $C \in \mathbb{R}$ such that for any $(y_1, z_1), (y_2, z_2) \in \mathbb{R}^m \times \mathbb{R}^{n \times m}$, $P \otimes \lambda$-a.e. $(\omega, t) \in \Omega \times \mathbb{R}_+$,
$$|f(\omega, t, y_1, z_1) - f(\omega, t, y_2, z_2)| \le C\,\big[|y_1 - y_2| + |z_1 - z_2|\big].$$

Here for $z \in \mathbb{R}^{n \times m}$ we denote $|z| = (\mathrm{tr}(zz^*))^{1/2}$.
Definition 10.1 A pair $(f, \xi)$ fulfilling, besides the mentioned measurability requirements, hypotheses (H1), (H2), is said to be a standard parameter.
Given standard parameters, we shall solve the problem of finding a pair of $(\mathcal{F}_t)_{0\le t\le T}$-adapted processes $(Y_t, Z_t)_{0\le t\le T}$ such that the backward stochastic differential equation (BSDE)
$$(*)\qquad dY_t = Z^*_t\, dW_t - f(\cdot, t, Y_t, Z_t)\, dt, \qquad Y_T = \xi,$$
is satisfied.
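In the simplest degenerate case the backward structure of (*) can be made completely explicit. The following sketch is my own toy illustration, not from the text: take $m = n = 1$, a deterministic terminal value $\xi$, and the generator $f(t, y, z) = a\,y$. Then $Z \equiv 0$, the equation reduces to the backward ODE $dY_t = -a\,Y_t\,dt$, $Y_T = \xi$, and $Y_t = \xi e^{a(T-t)}$; we verify the integral form $Y_t = \xi + \int_t^T a\,Y_s\,ds$ numerically.

```python
import numpy as np

# Hypothetical toy case (my own illustration, not from the text): m = n = 1,
# generator f(t, y, z) = a*y and a deterministic terminal value xi.  Then
# Z = 0 and the BSDE (*) reduces to the backward ODE dY_t = -a*Y_t dt,
# Y_T = xi, with explicit solution Y_t = xi * exp(a*(T - t)).
a, xi, T = 0.5, 2.0, 1.0
n = 2000
t = np.linspace(0.0, T, n + 1)
dt = T / n
Y = xi * np.exp(a * (T - t))

# Check the integral form Y_t = xi + int_t^T a*Y_s ds via the trapezoid rule.
g = a * Y
increments = 0.5 * (g[1:] + g[:-1]) * dt                            # integral over [t_i, t_{i+1}]
tail = np.concatenate((np.cumsum(increments[::-1])[::-1], [0.0]))   # integral over [t_i, T]
max_residual = np.max(np.abs(Y - (xi + tail)))
print(max_residual)  # pure quadrature error
```

The point of the toy case is only the direction of time: the datum is prescribed at $T$ and the solution is propagated backwards, which is exactly what the contraction argument below does in the general stochastic setting.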
In order to construct a solution, a contraction argument on suitable
Banach spaces will be used. For its derivation we shall need the following
a priori inequalities.
Lemma 10.1 For $i = 1, 2$ let $(f_i, \xi_i)$ be standard parameters, and let $(Y^i, Z^i) \in H^2(\mathbb{R}^m) \times H^2(\mathbb{R}^{n\times m})$ be solutions of (*) with the corresponding standard parameters. Let $C$ be a Lipschitz constant for $f_1$. Define for $0 \le t \le T$
$$\Delta Y_t = Y^1_t - Y^2_t, \qquad \Delta Z_t = Z^1_t - Z^2_t, \qquad \Delta_2 f_t = f_1(\cdot, t, Y^2_t, Z^2_t) - f_2(\cdot, t, Y^2_t, Z^2_t).$$
Then for any triple $(\lambda, \mu, \beta)$ with $\lambda > 0$, $\lambda^2 > C$, $\beta \ge C(2 + \lambda^2) + \mu^2$ we have
$$\|\Delta Y\|^2_{2,\beta} \le T\Big[e^{\beta T} E(|\Delta Y_T|^2) + \frac{1}{\mu^2}\, \|\Delta_2 f\|^2_{2,\beta}\Big],$$
$$\|\Delta Z\|^2_{2,\beta} \le \frac{\lambda^2}{\lambda^2 - C}\Big[e^{\beta T} E(|\Delta Y_T|^2) + \frac{1}{\mu^2}\, \|\Delta_2 f\|^2_{2,\beta}\Big].$$
Proof
1. Let $(Y, Z) \in H^2(\mathbb{R}^m) \times H^2(\mathbb{R}^{n\times m})$ be a solution of (*) with standard parameters $(f, \xi)$. This means that we may write for $0 \le t \le T$
$$Y_t = \xi - \int_t^T Z^*_s\, dW_s + \int_t^T f(\cdot, s, Y_s, Z_s)\, ds.$$
We show:
$$\sup_{0\le t\le T} |Y_t| \in L^2(\mathbb{R}).$$
In fact, due to (*) we have
$$\sup_{0\le t\le T} |Y_t| \le |\xi| + \int_0^T |f(\cdot, s, Y_s, Z_s)|\, ds + \sup_{0\le t\le T}\Big|\int_t^T Z^*_s\, dW_s\Big|,$$
and, with the help of Doob's inequality,
$$E\Big(\sup_{0\le t\le T}\Big|\int_t^T Z^*_s\, dW_s\Big|^2\Big) \le 4\, E\Big(\sup_{0\le t\le T}\Big|\int_0^t Z^*_s\, dW_s\Big|^2\Big) \le 16\, E\Big(\int_0^T |Z_s|^2\, ds\Big).$$
Since in addition (H1) and (H2) guarantee that $|\xi| + \int_0^T |f(\cdot, s, Y_s, Z_s)|\, ds \in L^2(\mathbb{R})$, we obtain the desired
$$E\Big(\sup_{0\le t\le T} |Y_t|^2\Big) < \infty.$$
2. Now we derive a preliminary bound. Apply Itô's formula to the semimartingale $(e^{\beta s}|\Delta Y_s|^2)_{0\le s\le T}$ to obtain for $0 \le t \le T$
$$e^{\beta T}|\Delta Y_T|^2 - e^{\beta t}|\Delta Y_t|^2 = \beta\int_t^T e^{\beta s}|\Delta Y_s|^2\, ds - 2\int_t^T e^{\beta s}\big\langle \Delta Y_s,\, f_1(\cdot,s,Y^1_s,Z^1_s) - f_2(\cdot,s,Y^2_s,Z^2_s)\big\rangle\, ds + 2\int_t^T e^{\beta s}\big\langle \Delta Y_s,\, \Delta Z^*_s\, dW_s\big\rangle + \int_t^T e^{\beta s}|\Delta Z_s|^2\, ds.$$
By reordering the terms in the equation we obtain
$$e^{\beta t}|\Delta Y_t|^2 + \beta\int_t^T e^{\beta s}|\Delta Y_s|^2\, ds + \int_t^T e^{\beta s}|\Delta Z_s|^2\, ds = e^{\beta T}|\Delta Y_T|^2 - 2\int_t^T e^{\beta s}\big\langle \Delta Y_s,\, \Delta Z^*_s\, dW_s\big\rangle + 2\int_t^T e^{\beta s}\big\langle \Delta Y_s,\, f_1(\cdot,s,Y^1_s,Z^1_s) - f_2(\cdot,s,Y^2_s,Z^2_s)\big\rangle\, ds.$$
3. We prove for $0 \le t \le T$:
$$E\big(e^{\beta t}|\Delta Y_t|^2\big) \le E\big(e^{\beta T}|\Delta Y_T|^2\big) + \frac{1}{\mu^2}\, E\Big(\int_t^T e^{\beta s}|\Delta_2 f_s|^2\, ds\Big).$$
To prove this, first take expectations on both sides of the equation obtained in 2. (by step 1 the stochastic integral is a true martingale, so its expectation vanishes), with the result
$$E\big(e^{\beta t}|\Delta Y_t|^2\big) + \beta\, E\Big(\int_t^T e^{\beta s}|\Delta Y_s|^2\, ds\Big) + E\Big(\int_t^T e^{\beta s}|\Delta Z_s|^2\, ds\Big) = E\big(e^{\beta T}|\Delta Y_T|^2\big) + 2\, E\Big(\int_t^T e^{\beta s}\big\langle \Delta Y_s,\, f_1(\cdot,s,Y^1_s,Z^1_s) - f_2(\cdot,s,Y^2_s,Z^2_s)\big\rangle\, ds\Big).$$
Now by our assumptions, for $0 \le s \le T$,
$$|f_1(\cdot,s,Y^1_s,Z^1_s) - f_2(\cdot,s,Y^2_s,Z^2_s)| \le |f_1(\cdot,s,Y^1_s,Z^1_s) - f_1(\cdot,s,Y^2_s,Z^2_s)| + |\Delta_2 f_s| \le C\big[|\Delta Y_s| + |\Delta Z_s|\big] + |\Delta_2 f_s|.$$
The latter implies
$$\int_t^T E\Big(2 e^{\beta s}\big|\big\langle \Delta Y_s,\, f_1(\cdot,s,Y^1_s,Z^1_s) - f_2(\cdot,s,Y^2_s,Z^2_s)\big\rangle\big|\Big)\, ds \le \int_t^T 2 e^{\beta s}\, E\Big(|\Delta Y_s|\big[C(|\Delta Y_s| + |\Delta Z_s|) + |\Delta_2 f_s|\big]\Big)\, ds = \int_t^T 2 e^{\beta s}\Big[C\, E(|\Delta Y_s|^2) + E\big(|\Delta Y_s|(C|\Delta Z_s| + |\Delta_2 f_s|)\big)\Big]\, ds.$$
Now for $C, y, z, t \ge 0$ and $\lambda, \mu > 0$,
$$2y(Cz + t) = 2Cyz + 2yt \le C\Big[(\lambda y)^2 + \Big(\frac{z}{\lambda}\Big)^2\Big] + (\mu y)^2 + \Big(\frac{t}{\mu}\Big)^2 = C\Big(\frac{z}{\lambda}\Big)^2 + \Big(\frac{t}{\mu}\Big)^2 + y^2\big(\mu^2 + C\lambda^2\big).$$
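This elementary inequality is just $2ab \le a^2 + b^2$ applied twice, to $(\lambda y, z/\lambda)$ and $(\mu y, t/\mu)$. A quick numerical sanity check over random nonnegative inputs (my own addition, not from the text):

```python
import random

# Sanity check (not in the text) of the elementary inequality
#   2y(Cz + t) <= C(z/lam)**2 + (t/mu)**2 + y**2 * (mu**2 + C*lam**2),
# valid for C, y, z, t >= 0 and lam, mu > 0; it follows from
# 2ab <= a**2 + b**2 applied to (lam*y, z/lam) and (mu*y, t/mu).
random.seed(0)
violations = 0
for _ in range(100_000):
    C, y, z, t = (random.uniform(0.0, 10.0) for _ in range(4))
    lam, mu = random.uniform(0.1, 10.0), random.uniform(0.1, 10.0)
    lhs = 2.0 * y * (C * z + t)
    rhs = C * (z / lam) ** 2 + (t / mu) ** 2 + y ** 2 * (mu ** 2 + C * lam ** 2)
    if lhs > rhs + 1e-9:   # small slack for floating point
        violations += 1
print(violations)
```

The free parameters $\lambda, \mu$ are what make the inequality useful below: they let us trade the mixed terms against the squared terms whose coefficients are then absorbed into the choice of $\beta$.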
With this we can estimate the last term in our inequality further:
$$\int_t^T 2 e^{\beta s}\Big[C\, E(|\Delta Y_s|^2) + E\big(|\Delta Y_s|(C|\Delta Z_s| + |\Delta_2 f_s|)\big)\Big]\, ds \le \int_t^T e^{\beta s}\Big[2C\, E(|\Delta Y_s|^2) + \frac{C}{\lambda^2}\, E(|\Delta Z_s|^2) + \frac{1}{\mu^2}\, E(|\Delta_2 f_s|^2) + (\mu^2 + C\lambda^2)\, E(|\Delta Y_s|^2)\Big]\, ds = \int_t^T e^{\beta s}\Big[\big(\mu^2 + C(2 + \lambda^2)\big)\, E(|\Delta Y_s|^2) + \frac{C}{\lambda^2}\, E(|\Delta Z_s|^2) + \frac{1}{\mu^2}\, E(|\Delta_2 f_s|^2)\Big]\, ds.$$
Summarizing, we obtain, using our assumptions on the parameters $(\lambda, \mu, \beta)$,
$$(**)\qquad E\big(e^{\beta t}|\Delta Y_t|^2\big) \le E\Big(\int_t^T e^{\beta s}|\Delta Y_s|^2\, ds\Big)\big[-\beta + C(2+\lambda^2) + \mu^2\big] + E\Big(\int_t^T e^{\beta s}|\Delta Z_s|^2\, ds\Big)\Big[\frac{C}{\lambda^2} - 1\Big] + E\big(e^{\beta T}|\Delta Y_T|^2\big) + \frac{1}{\mu^2}\, E\Big(\int_t^T e^{\beta s}|\Delta_2 f_s|^2\, ds\Big) \le E\big(e^{\beta T}|\Delta Y_T|^2\big) + \frac{1}{\mu^2}\, E\Big(\int_t^T e^{\beta s}|\Delta_2 f_s|^2\, ds\Big),$$
since $\beta \ge C(2+\lambda^2) + \mu^2$ and $\lambda^2 > C$ make the first two brackets nonpositive.
This is the claimed inequality.
4. In order to obtain the first inequality in the assertion, it remains to integrate the inequality resulting from 3. in $t \in [0, T]$.
5. The second inequality in the assertion follows from (**) by taking the second term from the right hand side to the left. This completes the proof.
We are in a position to state existence and uniqueness results for our
BSDE (*).
Theorem 10.1 Let $(\xi, f)$ be standard parameters. Then there exists a uniquely determined pair $(Y, Z) \in H^2(\mathbb{R}^m) \times H^2(\mathbb{R}^{n\times m})$ with the property
$$(\mathrm{BSDE})\qquad Y_t = \xi - \int_t^T Z^*_s\, dW_s + \int_t^T f(\cdot, s, Y_s, Z_s)\, ds, \qquad 0 \le t \le T.$$
Proof
Consider
$$\Phi : H^{2,\beta}(\mathbb{R}^m) \times H^{2,\beta}(\mathbb{R}^{n\times m}) \to H^{2,\beta}(\mathbb{R}^m) \times H^{2,\beta}(\mathbb{R}^{n\times m}), \qquad (y, z) \mapsto (Y, Z),$$
where $(Y, Z)$ is a solution of the BSDE
$$Y_t = \xi - \int_t^T Z^*_s\, dW_s + \int_t^T f(\cdot, s, y_s, z_s)\, ds, \qquad 0 \le t \le T.$$
1. We prove: $(Y, Z)$ is well defined. First of all, our assumptions yield
$$\xi + \int_t^T f(\cdot, s, y_s, z_s)\, ds \in L^2(\Omega), \qquad 0 \le t \le T.$$
Therefore
$$M_t = E\Big(\xi + \int_0^T f(\cdot, s, y_s, z_s)\, ds\,\Big|\,\mathcal{F}_t\Big), \qquad 0 \le t \le T,$$
is a well defined martingale. $M$ possesses a continuous version, due to the fact that we are working in a Wiener filtration. $M$ is square integrable. Hence we may apply the martingale representation theorem, which provides (a unique) $Z \in H^2(\mathbb{R}^{n\times m})$ such that
$$M_t = M_0 + \int_0^t Z^*_s\, dW_s, \qquad 0 \le t \le T.$$
Let now
$$Y_t = M_t - \int_0^t f(\cdot, s, y_s, z_s)\, ds.$$
Then $Y$ is square integrable, and we have
$$Y_t = E\Big(\xi + \int_t^T f(\cdot, s, y_s, z_s)\, ds\,\Big|\,\mathcal{F}_t\Big), \qquad 0 \le t \le T.$$
Hence
$$Y_T = \xi = M_0 + \int_0^T Z^*_s\, dW_s - \int_0^T f(\cdot, s, y_s, z_s)\, ds,$$
and thus for $0 \le t \le T$
$$Y_t = \xi - \int_0^T Z^*_s\, dW_s + \int_0^T f(\cdot, s, y_s, z_s)\, ds + \int_0^t Z^*_s\, dW_s - \int_0^t f(\cdot, s, y_s, z_s)\, ds = \xi - \int_t^T Z^*_s\, dW_s + \int_t^T f(\cdot, s, y_s, z_s)\, ds.$$
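The martingale representation step can be made concrete on a standard toy example (my own choice, not from the text): for $\xi = W_T^2$ one has $M_t = E(W_T^2 \mid \mathcal{F}_t) = W_t^2 + (T - t)$, $M_0 = T$, and the representing integrand is $Z_s = 2W_s$, since $W_t^2 - t = \int_0^t 2W_s\, dW_s$. A discretized pathwise check:

```python
import numpy as np

# Toy illustration of the martingale representation step (my own example,
# not from the text): for xi = W_T**2 one has
#   M_t = E(W_T**2 | F_t) = W_t**2 + (T - t),  M_0 = T,
# and M_t = M_0 + int_0^t Z_s dW_s with integrand Z_s = 2*W_s.
# We check this pathwise with a left-point Riemann sum for the Ito integral.
rng = np.random.default_rng(42)
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito_integral = np.sum(2.0 * W[:-1] * dW)   # approximates int_0^T 2 W_s dW_s
err = abs(T + ito_integral - W[-1] ** 2)   # |M_0 + integral - M_T|
print(err)  # discretization error, shrinking as n grows
```

The left-point (non-anticipating) evaluation of the integrand is essential here; it is the discrete counterpart of the adaptedness built into the Itô integral.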
2. We prove: for $\beta > 2(1+T)C^2$ the mapping $\Phi$ is a contraction. For this purpose, let $(y^1, z^1), (y^2, z^2) \in H^{2,\beta}(\mathbb{R}^m) \times H^{2,\beta}(\mathbb{R}^{n\times m})$, and let $(Y^1, Z^1), (Y^2, Z^2)$ be the corresponding solutions according to 1. We apply Lemma 10.1 with Lipschitz constant $0$ (the generator $f(\cdot, s, y_s, z_s)$ does not depend on the solution), with $\mu^2 = \beta$, and with $\Delta_2 f_s = f(\cdot, s, y^1_s, z^1_s) - f(\cdot, s, y^2_s, z^2_s)$; note that $\Delta Y_T = 0$. With this
choice we obtain
$$\|\Delta Y\|^2_{2,\beta} \le \frac{T}{\beta}\, E\Big(\int_0^T e^{\beta s}\big|f(\cdot,s,y^1_s,z^1_s) - f(\cdot,s,y^2_s,z^2_s)\big|^2\, ds\Big),$$
$$\|\Delta Z\|^2_{2,\beta} \le \frac{1}{\beta}\, E\Big(\int_0^T e^{\beta s}\big|f(\cdot,s,y^1_s,z^1_s) - f(\cdot,s,y^2_s,z^2_s)\big|^2\, ds\Big).$$
Since $f$ is Lipschitz continuous, we further obtain
$$\|\Delta Y\|^2_{2,\beta} \le \frac{2TC^2}{\beta}\big[\|\Delta y\|^2_{2,\beta} + \|\Delta z\|^2_{2,\beta}\big], \qquad \|\Delta Z\|^2_{2,\beta} \le \frac{2C^2}{\beta}\big[\|\Delta y\|^2_{2,\beta} + \|\Delta z\|^2_{2,\beta}\big].$$
We summarize to obtain
$$(**)\qquad \|\Delta Y\|^2_{2,\beta} + \|\Delta Z\|^2_{2,\beta} \le \frac{2C^2(T+1)}{\beta}\big[\|\Delta y\|^2_{2,\beta} + \|\Delta z\|^2_{2,\beta}\big].$$
By the choice of $\beta$, $\Phi$ is a contraction.
3. Now let $(\bar Y, Z)$ be the fixed point of $\Phi$, which exists due to 2. Let
$$Y_t = E\Big(\xi + \int_t^T f(\cdot, s, \bar Y_s, Z_s)\, ds\,\Big|\,\mathcal{F}_t\Big), \qquad 0 \le t \le T.$$
Then $Y$ is continuous and $P$-a.s. identical to $\bar Y$. Hence $(Y, Z)$ is a solution of our BSDE.
4. Uniqueness follows from the contraction property of $\Phi$ and the uniqueness of the fixed point.
The construction of solutions in the preceding proof rests upon a recursive algorithm. The algorithm converges, as we shall note in the following Corollary.
Corollary 10.1 Let $\beta > 2(1+T)C^2$, and let $((Y^k, Z^k))_{k\ge 0}$ be the sequence of processes given by $Y^0 = Z^0 = 0$ and
$$Y^{k+1}_t = \xi - \int_t^T (Z^{k+1}_s)^*\, dW_s + \int_t^T f(\cdot, s, Y^k_s, Z^k_s)\, ds,$$
according to the proof of the preceding Theorem. Then $((Y^k, Z^k))_{k\ge 0}$ converges in $H^{2,\beta}(\mathbb{R}^m) \times H^{2,\beta}(\mathbb{R}^{n\times m})$ to the uniquely determined solution $(Y, Z)$ of (BSDE).
Proof
The inequality (**) in the proof of Theorem 10.1 recursively yields
$$\|Y^{k+1} - Y^k\|^2_{2,\beta} + \|Z^{k+1} - Z^k\|^2_{2,\beta} \le \alpha^k\big[\|Y^1 - Y^0\|^2_{2,\beta} + \|Z^1 - Z^0\|^2_{2,\beta}\big],$$
with $\alpha = \frac{2C^2(T+1)}{\beta} < 1$. This implies
$$\sum_{k\in\mathbb{N}}\big[\|Y^{k+1} - Y^k\|_{2,\beta} + \|Z^{k+1} - Z^k\|_{2,\beta}\big] < \infty.$$
Now a standard argument applies.
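The Picard scheme of the Corollary can be watched converging in the degenerate case already used above (my own toy reduction, not from the text): for a deterministic $\xi$ and a generator $f(t, y, z) = y$ depending only on $y$, the stochastic integral drops out ($Z \equiv 0$) and the recursion becomes $Y^{k+1}_t = \xi + \int_t^T Y^k_s\, ds$, the classical Picard iteration for the backward ODE with solution $Y_t = \xi\, e^{T-t}$.

```python
import numpy as np

# Toy reduction of the Picard scheme (my own illustration, not from the
# text): xi deterministic, f(t, y, z) = y.  Then Z = 0 and the recursion
#   Y^{k+1}_t = xi + int_t^T Y^k_s ds,   Y^0 = 0,
# converges to Y_t = xi * exp(T - t).
T, xi, n = 1.0, 1.0, 4000
t = np.linspace(0.0, T, n + 1)
dt = T / n

Y = np.zeros(n + 1)                        # Y^0 = 0
for k in range(40):
    # backward cumulative trapezoid rule for int_t^T Y_s ds
    increments = 0.5 * (Y[1:] + Y[:-1]) * dt
    tail_integral = np.concatenate((np.cumsum(increments[::-1])[::-1], [0.0]))
    Y = xi + tail_integral                 # Y^{k+1}

exact = xi * np.exp(T - t)
err = np.max(np.abs(Y - exact))
print(err)  # quadrature error only; the iteration error decays like T**k / k!
```

After a handful of iterations the Picard remainder (of order $T^k/k!$ here) is negligible and only the quadrature error of the grid survives, which mirrors the geometric decay established in the Corollary.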
11 Interpretation of backward stochastic differential equations in Malliavin's calculus

In this chapter we shall establish the vital connection between Malliavin's calculus and the structure of solutions of BSDE. We shall see that, provided the standard parameters are sufficiently smooth, the process $Z$ can be interpreted as the Malliavin trace of the process $Y$. For doing this, we first have to introduce the version of the space $L^2_1$ which corresponds to integrability with respect to some arbitrary power $p \ge 2$. For simplicity, we let the dimension of our underlying Wiener process be one, i.e. for this chapter we set $n = 1$.
Definition 11.1 Let $p \ge 2$, and let
$$L^p_1(\mathbb{R}^m) = \Big\{u \;\Big|\; u \text{ adapted},\ \Big[\int_0^T |u_t|^2\, dt\Big]^{1/2} \in L^p(\Omega),\ u_t \in (D^p_1)^m \text{ for a.a. } t \ge 0, \text{ and for some measurable version of } (s,t) \mapsto D_s u_t \text{ we have } E\Big(\Big[\int_0^T\!\!\int_0^T |D_s u_t|^2\, ds\, dt\Big]^{p/2}\Big) < \infty\Big\}.$$
For $u \in L^p_1(\mathbb{R}^m)$ define
$$\|u\|_{1,p} = E\Big(\Big[\int_0^T |u_t|^2\, dt\Big]^{p/2}\Big) + E\Big(\Big[\int_0^T\!\!\int_0^T |D_s u_t|^2\, ds\, dt\Big]^{p/2}\Big).$$
To abbreviate, for $k \in \mathbb{N}$ and $v \in L^2(\Omega \times [0,T]^k)$ denote
$$\|v\| = \Big[\int_{[0,T]^k} |v_t|^2\, dt\Big]^{1/2}.$$
In these terms, Jensen's inequality gives
$$E\big(\|Du\|^p\big) \le T^{\frac{p}{2}-1}\int_0^T \|D_s u\|^p_p\, ds.$$
We next prove that solutions of BSDE for regular standard parameters are Malliavin differentiable, and that $Z$ allows an interpretation as a Malliavin trace of $Y$. We need some versions of the process spaces considered in the previous chapter that correspond to $p$-integrable random variables. For $p \ge 2$ denote by $S^p(\mathbb{R}^m)$ the linear space of all measurable $(\mathcal{F}_t)$-adapted continuous processes $X : \Omega \times [0,T] \to \mathbb{R}^m$, endowed with the norm $\|X\|_{S^p} = E(\sup_{0\le t\le T}|X_t|^p)^{1/p}$. Let further $H^p(\mathbb{R}^m)$ denote the linear space of measurable $(\mathcal{F}_t)$-adapted processes $X : \Omega \times [0,T] \to \mathbb{R}^m$ endowed with the norm $\|X\|_p = E(\|X\|^p)^{1/p}$. To abbreviate, let $B^p(\mathbb{R}^m) = S^p(\mathbb{R}^m) \times H^p(\mathbb{R}^m)$, with the norm $\|(Y,Z)\|_p = \big[\|Y\|^p_{S^p} + \|Z\|^p_p\big]^{1/p}$.
We are ready to state our main result.

Theorem 11.1 Let $(f, \xi)$ be standard parameters such that $\xi \in D^2_1 \cap L^4(\mathbb{R}^m)$, $f : \Omega \times [0,T] \times \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^m$ is continuously differentiable in $(y, z)$, with uniformly bounded and continuous partial derivatives, and such that for $(y, z) \in \mathbb{R}^m \times \mathbb{R}^m$ we have

(H3) $f(\cdot, y, z) \in L^2_1$, $f(\cdot, 0, 0) \in H^4(\mathbb{R}^m)$;

and for $t \in [0,T]$ and $(y_1, z_1, y_2, z_2) \in (\mathbb{R}^m \times \mathbb{R}^m)^2$ we have

(H4) $|D_s f(\cdot, t, y_1, z_1) - D_s f(\cdot, t, y_2, z_2)| \le K_s(t)\big[|y_1 - y_2| + |z_1 - z_2|\big]$

(in $L^2(\Omega \times [0,t])$), with a real-valued measurable process $(K_s(t))_{0\le s\le t}$ which is $(\mathcal{F}_t)$-adapted in $t$ and satisfies $\int_0^T \|K_s\|^4_4\, ds < \infty$.

For the unique solution $(Y, Z)$ of the BSDE (*) we moreover suppose
$$\int_0^T \|D_s f(\cdot, Y, Z)\|^2\, ds < \infty \quad P\text{-a.s.}$$
Then we have:
$$(Y, Z) \in L^2_1(\mathbb{R}^m) \times L^2_1(\mathbb{R}^m),$$
and a (measurable) version of $(D_s Y_t, D_s Z_t)_{0\le s,t\le T}$ possesses the properties
$$D_s Y_t = D_s Z_t = 0, \qquad 0 \le t < s \le T,$$
$$D_s Y_t = D_s\xi - \int_t^T D_s Z^*_u\, dW_u + \int_t^T \Big[\frac{\partial}{\partial y} f(\cdot,u,Y_u,Z_u)\, D_s Y_u + \frac{\partial}{\partial z} f(\cdot,u,Y_u,Z_u)\, D_s Z_u + D_s f(\cdot,u,Y_u,Z_u)\Big]\, du, \qquad 0 \le s \le t \le T,$$
and $(D_s Y_s)_{0\le s\le T}$ is a version of $(Z_s)_{0\le s\le T}$.
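As a consistency check (a standard example, not taken from the text), consider $m = n = 1$, $f \equiv 0$ and $\xi = W_T^2$. Then
$$Y_t = E(W_T^2 \mid \mathcal{F}_t) = W_t^2 + (T - t), \qquad Z_t = 2W_t,$$
since $W_t^2 - t = \int_0^t 2W_s\, dW_s$. For $s \le t$ the Malliavin derivative is $D_s Y_t = D_s(W_t^2) = 2W_t$, so indeed $D_s Y_s = 2W_s = Z_s$, in accordance with the last assertion of the theorem: the control process is recovered as the Malliavin trace of $Y$.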
Proof
1. For further simplification of notation, we assume $m = 1$. As in chapter 10, our arguments are mainly based upon several a priori estimates. The first one is an analogue of Lemma 10.1 and investigates the properties of the contraction map $\Phi$ on $B^p(\mathbb{R})$.
Lemma 11.1 Let $p \ge 2$, assume $f(\cdot, 0, 0) \in H^p(\mathbb{R})$, and define
$$\Phi : B^p(\mathbb{R}) \to B^p(\mathbb{R}), \qquad (y, z) \mapsto (Y, Z),$$
where $(Y, Z)$ is the solution of the BSDE
$$(+)\qquad Y_t = \xi - \int_t^T Z_s\, dW_s + \int_t^T f(\cdot, s, y_s, z_s)\, ds.$$
Let further for $i = 1, 2$ $(Y^i, Z^i)$ be the solutions corresponding to $(y^i, z^i)$ in (+), and let
$$\Delta Y = Y^1 - Y^2, \quad \Delta Z = Z^1 - Z^2, \quad \Delta y = y^1 - y^2, \quad \Delta z = z^1 - z^2.$$
Then there exists a constant $C_p$ not depending on $(y, z, Y, Z)$ such that

(i) $\|Y\|^p_{S^p} \le C_p\, E\Big(|\xi|^p + T^{\frac p2}\Big(\int_0^T |f(\cdot,s,y_s,z_s)|^2\, ds\Big)^{\frac p2}\Big)$,

(ii) $\|Z\|^p_p \le C_p\, E\Big(|\xi|^p + T^{\frac p2}\Big(\int_0^T |f(\cdot,s,y_s,z_s)|^2\, ds\Big)^{\frac p2}\Big)$,

(iii) $\|(\Delta Y, \Delta Z)\|^p_p \le C_p\, T^{\frac p2}\, \|(\Delta y, \Delta z)\|^p_p$.
Proof
a) Taking up the notation of the proof of Lemma 10.1, we show: $\Phi$ is well defined.
To do this, recall for $0 \le t \le T$
$$Y_t = E\Big(\xi + \int_t^T f(\cdot, s, y_s, z_s)\, ds\,\Big|\,\mathcal{F}_t\Big).$$
We have
$$|Y_t| \le E\Big(|\xi| + \int_t^T |f(\cdot, s, y_s, z_s)|\, ds\,\Big|\,\mathcal{F}_t\Big),$$
hence Doob's inequality provides a universal constant $C^1_p$ such that
$$\|Y\|^p_{S^p} \le C^1_p\, E\Big(\Big[|\xi| + \int_0^T |f(\cdot, s, y_s, z_s)|\, ds\Big]^p\Big).$$
Moreover, by the Cauchy-Schwarz inequality
$$\int_0^T |f(\cdot, s, y_s, z_s)|\, ds \le T^{\frac12}\Big[\int_0^T |f(\cdot, s, y_s, z_s)|^2\, ds\Big]^{\frac12},$$
hence with another universal constant $C^2_p$ we have
$$\|Y\|^p_{S^p} \le C^2_p\, E\Big(|\xi|^p + T^{\frac p2}\Big[\int_0^T |f(\cdot, s, y_s, z_s)|^2\, ds\Big]^{\frac p2}\Big).$$
Invoke $f(\cdot, 0, 0) \in H^p(\mathbb{R})$, that $f$ is Lipschitz continuous, and that $(y, z) \in B^p(\mathbb{R})$, to see that the right hand side of the preceding inequality is finite.
We next prove that $Z \in H^p(\mathbb{R})$. For this purpose we shall use the inequality of Burkholder. It yields further universal constants $C^3_p, \ldots, C^5_p$ such that
$$E\big(\|Z\|^p\big) \le C^3_p\, E\Big(\Big|\int_0^T Z_s\, dW_s\Big|^p\Big) \le C^4_p\, E\Big(\Big|\xi + \int_0^T f(\cdot, s, y_s, z_s)\, ds - Y_0\Big|^p\Big) \le C^5_p\, E\Big(|\xi|^p + T^{\frac p2}\Big[\int_0^T |f(\cdot, s, y_s, z_s)|^2\, ds\Big]^{\frac p2}\Big).$$
Hence we obtain $Z \in H^p(\mathbb{R})$, and therefore $(Y, Z) \in B^p(\mathbb{R})$. The inequalities (i) and (ii) have also been established.
b) We prove (iii). The pair $(\Delta Y, \Delta Z)$ solves the BSDE associated with the generator $f(\cdot, t, y^1_t, z^1_t) - f(\cdot, t, y^2_t, z^2_t)$ and terminal condition $\xi = 0$. Therefore (i) and (ii), as well as an appeal to the Lipschitz condition, give, with universal constants $C^6_p, C^7_p$,
$$\|(\Delta Y, \Delta Z)\|^p_p \le C^6_p\, T^{\frac p2}\, E\Big(\Big[\int_0^T |f(\cdot,t,y^1_t,z^1_t) - f(\cdot,t,y^2_t,z^2_t)|^2\, dt\Big]^{\frac p2}\Big) \le C^7_p\, T^{\frac p2}\big[\|\Delta y\|^p_{S^p} + \|\Delta z\|^p_p\big] = C^7_p\, T^{\frac p2}\, \|(\Delta y, \Delta z)\|^p_p.$$
This completes the proof.
Let us return to the proof of Theorem 11.1. We define approximations of the solution of the BSDE recursively. Let for $k \ge 0$, $0 \le t \le T$,
$$Y^0 = Z^0 = 0, \qquad Y^{k+1}_t = \xi - \int_t^T Z^{k+1}_s\, dW_s + \int_t^T f(\cdot, s, Y^k_s, Z^k_s)\, ds.$$
We show:
$$\|(Y^k, Z^k) - (Y, Z)\|_4 \to 0 \qquad (k \to \infty).$$
Recall the universal constant $C_4$ from Lemma 11.1, (iii). We may (modulo repeating the argument finitely often on successive subintervals of $[0, T]$) assume that $[C_4 T^2]^{\frac14} < 1$. With this condition, Lemma 11.1 implies that $\Phi$ is a contraction, and the solution $(Y, Z)$ of the BSDE is its unique fixed point in $B^4(\mathbb{R})$. From this observation, we obtain our assertion via the Cauchy sequence property of the approximate solutions, which follows from
$$\|(Y^k, Z^k) - (Y^l, Z^l)\|_4 \le \sum_{r=k+1}^{l} \|(Y^r, Z^r) - (Y^{r-1}, Z^{r-1})\|_4 \to 0 \qquad (k, l \to \infty).$$
2. We prove by recursion on $k$:
$$(Y^k, Z^k) \in L^2_1(\mathbb{R}) \times L^2_1(\mathbb{R}).$$
This is trivial for $k = 0$. Let it be guaranteed for $k$. According to the chain rule for the Malliavin derivative and our hypotheses concerning the standard parameters we know for $0 \le t \le T$
$$\xi + \int_t^T f(\cdot, s, Y^k_s, Z^k_s)\, ds \in D^2_1,$$
with Malliavin derivative
$$D_s\xi + \int_t^T \Big[\frac{\partial}{\partial y} f(\cdot,u,Y^k_u,Z^k_u)\, D_s Y^k_u + \frac{\partial}{\partial z} f(\cdot,u,Y^k_u,Z^k_u)\, D_s Z^k_u + D_s f(\cdot,u,Y^k_u,Z^k_u)\Big]\, du.$$
This is seen by discretizing the Lebesgue integral, using the chain rule, and then approximating by means of the boundedness properties of the partial derivatives, the Lipschitz continuity properties of $f$, and the closedness of the operator $D$. Consequently, Lemma 9.2 yields for fixed $0 \le t \le T$
$$Y^{k+1}_t = E\Big(\xi + \int_t^T f(\cdot, s, Y^k_s, Z^k_s)\, ds\,\Big|\,\mathcal{F}_t\Big) \in D^2_1$$
as well. Consequently, also
$$\int_t^T Z^{k+1}_s\, dW_s = \xi + \int_t^T f(\cdot, s, Y^k_s, Z^k_s)\, ds - Y^{k+1}_t \in D^2_1.$$
Now an appeal to Theorem 9.4 implies that $Z^{k+1} \in L^2_1(\mathbb{R})$ and that in $L^2(\Omega \times [0,T]^2)$ we have the equations
$$D_s \int_t^T Z^{k+1}_u\, dW_u = \int_t^T D_s Z^{k+1}_u\, dW_u, \qquad s \le t,$$
$$D_s \int_t^T Z^{k+1}_u\, dW_u = Z^{k+1}_s + \int_s^T D_s Z^{k+1}_u\, dW_u, \qquad s > t.$$
All stated differentiabilities go along with square integrability of the Malliavin derivatives in all variables. This means that
$$(Y^{k+1}, Z^{k+1}) \in L^2_1(\mathbb{R}) \times L^2_1(\mathbb{R}),$$
and the recursion step is completed. We can also identify the Malliavin derivative by the formula valid for $0 \le s \le t \le T$ in the usual sense:
$$D_s Y^{k+1}_t = D_s\xi - \int_t^T D_s Z^{k+1}_u\, dW_u + \int_t^T \Big[\frac{\partial}{\partial y} f(\cdot,u,Y^k_u,Z^k_u)\, D_s Y^k_u + \frac{\partial}{\partial z} f(\cdot,u,Y^k_u,Z^k_u)\, D_s Z^k_u + D_s f(\cdot,u,Y^k_u,Z^k_u)\Big]\, du.$$
3. In this step we show:
$$(DY^k, DZ^k) \to (Y^\cdot, Z^\cdot) \quad \text{in } L^2(\Omega \times [0,T]^2),$$
where for $0 \le s \le T$ $(Y^s, Z^s)$ is the solution of the BSDE
$$Y^s_t = D_s\xi - \int_t^T Z^s_u\, dW_u + \int_t^T \Big[\frac{\partial}{\partial y} f(\cdot,u,Y_u,Z_u)\, Y^s_u + \frac{\partial}{\partial z} f(\cdot,u,Y_u,Z_u)\, Z^s_u + D_s f(\cdot,u,Y_u,Z_u)\Big]\, du, \qquad 0 \le s \le t \le T,$$
$$Y^s_t = Z^s_t = 0, \qquad 0 \le t < s \le T.$$
We first consult our hypotheses to verify that, at least for $\lambda$-a.e. $0 \le s \le T$, the parameters $(F^s, D_s\xi)$ with
$$F^s(\omega, t, y, z) = \frac{\partial}{\partial y} f(\omega,t,Y_t,Z_t)\, y + \frac{\partial}{\partial z} f(\omega,t,Y_t,Z_t)\, z + D_s f(\omega,t,Y_t,Z_t),$$
$0 \le t \le T$, $y, z \in \mathbb{R}$, are standard. Hence $(Y^s, Z^s)$ is well defined (and set trivial on the set of $s$ where the parameters eventually fail to be standard). Also in this case our arguments will be based on a priori inequalities.
Lemma 11.2 Let $(f_i, \xi_i)$, $i = 1, 2$, be standard parameters of a BSDE, $p \ge 2$. Suppose
$$\xi_i \in L^p(\Omega), \quad f_i(\cdot, 0, 0) \in H^p(\mathbb{R}), \qquad i = 1, 2.$$
Let $(Y^i, Z^i) \in B^p(\mathbb{R})$ be the corresponding solutions, $C$ a Lipschitz constant for $f_1$. Put
$$\Delta Y = Y^1 - Y^2, \quad \Delta Z = Z^1 - Z^2, \quad \Delta_2 f_t = f_1(\cdot,t,Y^2_t,Z^2_t) - f_2(\cdot,t,Y^2_t,Z^2_t),$$
$0 \le t \le T$. Then for $T$ small enough there exists a constant $C_{p,T}$ such that
$$\|(\Delta Y, \Delta Z)\|^p_p \le C_{p,T}\Big[E(|\Delta Y_T|^p) + E\Big(\Big(\int_0^T |\Delta_2 f_s|\, ds\Big)^p\Big)\Big] \le C_{p,T}\Big[E(|\Delta Y_T|^p) + T^{\frac p2}\, \|\Delta_2 f\|^p_p\Big].$$
Proof
With a calculation analogous to the one used to prove (i) and (ii) in Lemma 11.1, for which we also use Doob's and the Cauchy-Schwarz inequalities, we arrive at the following inequality, valid with universal constants $C^1_p, \ldots, C^3_p$:
$$\|\Delta Y\|^p_{S^p} + \|\Delta Z\|^p_p \le C^1_p\, E\Big(|\Delta Y_T|^p + \Big(\int_0^T |f_1(\cdot,t,Y^1_t,Z^1_t) - f_2(\cdot,t,Y^2_t,Z^2_t)|\, dt\Big)^p\Big) \le C^2_p\, E\Big(|\Delta Y_T|^p + \Big(\int_0^T \big[|\Delta Y_s| + |\Delta Z_s| + |\Delta_2 f_s|\big]\, ds\Big)^p\Big) \le C^3_p\Big[E\Big(|\Delta Y_T|^p + \Big(\int_0^T |\Delta_2 f_s|\, ds\Big)^p\Big) + T^p\, \|\Delta Y\|^p_{S^p} + T^{\frac p2}\, \|\Delta Z\|^p_p\Big].$$
Now choose $T$ small enough to ensure $C^3_p(T^p + T^{\frac p2}) < 1$. This being done, we may take the last two expressions in the previous inequality from the right to the left hand side, to obtain the desired estimate.
Let us now apply Lemma 11.2 to prove that for $\lambda$-a.a. $0 \le s \le T$ we have $(Y^s, Z^s) \in B^2(\mathbb{R})$. For this purpose, we apply the Lemma with
$$Y^1 = Y^s, \quad Y^2 = 0, \quad \xi_1 = D_s\xi, \quad \xi_2 = 0,$$
$$f_1(\omega, t, y, z) = \frac{\partial}{\partial y} f(\omega,t,Y_t,Z_t)\, y + \frac{\partial}{\partial z} f(\omega,t,Y_t,Z_t)\, z + D_s f(\omega,t,Y_t,Z_t), \qquad f_2 = 0,$$
$0 \le t \le T$, $y, z \in \mathbb{R}$. Then we have
$$\Delta_2 f_t = D_s f(\cdot, t, Y_t, Z_t), \qquad 0 \le t \le T.$$
We obtain with some universal constant $C$ the inequality
$$\|(Y^s, Z^s)\|^2_2 \le C\, E\big(|D_s\xi|^2 + \|D_s f(\cdot, Y, Z)\|^2\big),$$
and therefore
$$\int_0^T \|(Y^s, Z^s)\|^2_2\, ds < \infty.$$
This implies the desired integrability.
To obtain estimates for differences of $(DY^k, DZ^k)$ and $(Y^\cdot, Z^\cdot)$, let us next, fixing $k \in \mathbb{N}$, apply Lemma 11.2 to the following parameters:
$$\xi_1 = \xi_2 = D_s\xi,$$
$$f_1(t) = \frac{\partial}{\partial y} f(\cdot,t,Y^k_t,Z^k_t)\, D_s Y^k_t + \frac{\partial}{\partial z} f(\cdot,t,Y^k_t,Z^k_t)\, D_s Z^k_t + D_s f(\cdot,t,Y^k_t,Z^k_t),$$
$$f_2(t) = \frac{\partial}{\partial y} f(\cdot,t,Y_t,Z_t)\, Y^s_t + \frac{\partial}{\partial z} f(\cdot,t,Y_t,Z_t)\, Z^s_t + D_s f(\cdot,t,Y_t,Z_t),$$
$s \le t \le T$. Set for abbreviation
$$\delta^k_t = f_1(t) - f_2(t), \qquad s \le t \le T.$$
The Lemma yields the inequality
$$\big\|\big(D_s Y^{k+1} - Y^s,\ D_s Z^{k+1} - Z^s\big)\big\|^2_2 \le C_1\, E\Big(\Big(\int_s^T |\delta^k_t|\, dt\Big)^2\Big)$$
with a universal constant $C_1$. Let us now further estimate the right hand side of this inequality. We have, fixing $0 \le s \le T$,
$$E\Big(\Big(\int_s^T |\delta^k_t|\, dt\Big)^2\Big) \le C_2\big[A^k_s(T) + B^k_s(T) + C^k_s(T)\big],$$
where
$$A^k_s(T) = E\Big(\Big[\int_s^T |D_s f(\cdot,t,Y_t,Z_t) - D_s f(\cdot,t,Y^k_t,Z^k_t)|\, dt\Big]^2\Big),$$
$$B^k_s(T) = E\Big(\Big[\int_s^T \Big|\frac{\partial}{\partial y} f(\cdot,t,Y^k_t,Z^k_t)\,(Y^s_t - D_s Y^k_t)\Big|\, dt\Big]^2\Big) + E\Big(\Big[\int_s^T \Big|\frac{\partial}{\partial z} f(\cdot,t,Y^k_t,Z^k_t)\,(Z^s_t - D_s Z^k_t)\Big|\, dt\Big]^2\Big),$$
$$C^k_s(T) = E\Big(\Big[\int_s^T \Big|\frac{\partial}{\partial y} f(\cdot,t,Y_t,Z_t) - \frac{\partial}{\partial y} f(\cdot,t,Y^k_t,Z^k_t)\Big|\, |Y^s_t|\, dt\Big]^2\Big) + E\Big(\Big[\int_s^T \Big|\frac{\partial}{\partial z} f(\cdot,t,Y_t,Z_t) - \frac{\partial}{\partial z} f(\cdot,t,Y^k_t,Z^k_t)\Big|\, |Z^s_t|\, dt\Big]^2\Big).$$
With further universal constants $C_3, C_4$ we deduce, using (H4),
$$A^k_s(T) \le E\Big(\Big[\int_s^T K_s(t)\big[|Y_t - Y^k_t| + |Z_t - Z^k_t|\big]\, dt\Big]^2\Big) \le C_3\, E\Big(\int_s^T K_s(t)^2\, dt\,\Big[\int_s^T |Y_t - Y^k_t|^2\, dt + \int_s^T |Z_t - Z^k_t|^2\, dt\Big]\Big) \le C_4\, \Big[E\Big(\int_s^T K_s(t)^4\, dt\Big)\Big]^{\frac12}\Big[E\Big(\int_s^T |Y_t - Y^k_t|^4\, dt\Big)^{\frac12} + E\Big(\int_s^T |Z_t - Z^k_t|^4\, dt\Big)^{\frac12}\Big].$$
Hence by part 1. of the proof
$$\lim_{k\to\infty}\int_0^T A^k_s(T)\, ds = 0.$$
Furthermore, since the partial derivatives of $f$ with respect to $y, z$ are bounded and continuous, and since $E\big(\int_0^T \|(Y^s, Z^s)\|^2\, ds\big) < \infty$, dominated convergence allows to conclude
$$\lim_{k\to\infty}\int_0^T C^k_s(T)\, ds = 0.$$
Let us finally discuss the convergence of the $B^k_s(T)$ as $k \to \infty$. Again by boundedness of the partial derivatives of $f$ we obtain with a universal constant $C_5$
$$B^k_s(T) \le C_5\, T^2\, \big\|\big(D_s Y^k - Y^s,\ D_s Z^k - Z^s\big)\big\|^2_2.$$
Now choose $T$ small enough to ensure $\gamma = C_5 T^2 < 1$. Let $\epsilon > 0$. Then by what has been shown there exists $N \in \mathbb{N}$ large enough so that for $k \ge N$ we have
$$E\Big(\int_0^T \big\|\big(D_s Y^{k+1} - Y^s,\ D_s Z^{k+1} - Z^s\big)\big\|^2\, ds\Big) \le \epsilon + \gamma\, E\Big(\int_0^T \big\|\big(D_s Y^k - Y^s,\ D_s Z^k - Z^s\big)\big\|^2\, ds\Big).$$
By recursion we obtain for $k \ge N$
$$E\Big(\int_0^T \big\|\big(D_s Y^k - Y^s,\ D_s Z^k - Z^s\big)\big\|^2\, ds\Big) \le \epsilon\,\big(1 + \gamma + \gamma^2 + \cdots + \gamma^{k-N-1}\big) + \gamma^{k-N}\, E\Big(\int_0^T \big\|\big(D_s Y^N - Y^s,\ D_s Z^N - Z^s\big)\big\|^2\, ds\Big) \le \frac{\epsilon}{1-\gamma} + \gamma^{k-N}\, E\Big(\int_0^T \big\|\big(D_s Y^N - Y^s,\ D_s Z^N - Z^s\big)\big\|^2\, ds\Big).$$
Now let $k \to \infty$. Since $\epsilon$ is arbitrary, we conclude
$$\lim_{k\to\infty}\int_0^T B^k_s(T)\, ds = 0.$$
4. Since $L^2_1(\mathbb{R})$ is a Hilbert space, and $D$ is a closed operator, we obtain that $(Y, Z) \in L^2_1(\mathbb{R}) \times L^2_1(\mathbb{R})$, and that $(Y^s, Z^s)_{0\le s\le T}$ is a version of $(D_s Y, D_s Z)_{0\le s\le T}$ in the usual sense.
5. We show:
$$(D_t Y_t)_{0\le t\le T} \text{ is a version of } (Z_t)_{0\le t\le T}.$$
For $t \le s$ we have
$$Y_s = Y_t + \int_t^s Z_r\, dW_r - \int_t^s f(\cdot, r, Y_r, Z_r)\, dr.$$
Hence by Theorem 9.4, for $t < u \le s$,
$$D_u Y_s = Z_u + \int_u^s D_u Z_r\, dW_r - \int_u^s \Big[\frac{\partial}{\partial y} f(\cdot,r,Y_r,Z_r)\, D_u Y_r + \frac{\partial}{\partial z} f(\cdot,r,Y_r,Z_r)\, D_u Z_r + D_u f(\cdot,r,Y_r,Z_r)\Big]\, dr.$$
By continuity in $t$ of $(Y^s_t, Z^s_t)$ we may choose $u = s$, to obtain the desired identity.