
An Introduction to Malliavin Calculus

Courant Institute of Mathematical Sciences
New York University

Peter K. Friz
August 3, 2002

These notes are available on
www.math.nyu.edu/phd students/frizpete
Please send corrections to
Peter.Friz@cims.nyu.edu
These lecture notes are based on a couple of seminar talks I gave at Courant in Spring 2001. I am extremely grateful to Prof. S.R.S. Varadhan for supporting and stimulating my interest in Malliavin calculus. I am also indebted to Nicolas Victoir and Enrique Loubet for their careful reading of this text.
-Peter F.
Notations:

$\Omega$ ... Wiener space $C[0,1]$ resp. $C([0,1],\mathbb{R}^m)$
$\mathcal{F}$ ... natural filtration
$H$ ... $L^2[0,1]$ resp. $L^2([0,1],\mathbb{R}^m)$
$H^{\otimes k}$ ... tensor product, $\cong L^2([0,1]^k)$; $H^{\odot k}$ ... symmetric tensor product
$\tilde H$ ... Cameron-Martin space, elements are paths with derivative in $H$
$W : \mathcal{F} \to \mathbb{R}$ ... Wiener measure on $\Omega$
$\beta_t = \beta(t)$ ... Brownian motion (= coordinate process on $(\Omega, \mathcal{F}, W)$)
$W : H \to L^2(\Omega)$ ... defined by $W(h) = \int_0^1 h\,d\beta$
$S_2$ ... Wiener polynomials, functionals of the form polynomial$(W(h_1),\dots,W(h_n))$
$S_1$ ... cylindrical functionals, $\subset S_2$
$\mathbb{D}^{k,p}$ ... subset of $L^p(\Omega)$ containing the $k$-times Malliavin differentiable functionals
$\mathbb{D}^\infty$ ... $\bigcap_{k,p}\mathbb{D}^{k,p}$, smooth Wiener functionals
$\lambda,\ \lambda_m$ ... ($m$-dimensional) Lebesgue measure
$\mu,\ \mu_n$ ... ($n$-dimensional) standard Gaussian measure
$\nabla$ ... gradient operator on $\mathbb{R}^n$
$L^p(\Omega,H)$ ... $H$-valued random variables s.t. $\int_\Omega \|\cdot\|_H^p\,dW < \infty$
$D$ ... Malliavin derivative, operator $L^p(\Omega) \to L^p(\Omega,H)$
$\delta$ ... $= D^*$, the adjoint operator; also: divergence, Skorohod integral
$L$ ... $= \delta D$, Ornstein-Uhlenbeck operator $L^p(\Omega) \to L^p(\Omega)$
$W^{k,p}$ ... Sobolev spaces built on $\mathbb{R}^n$; $H^k$ ... $= W^{k,2}$
$\partial$ ... (for functions $f:\mathbb{R}\to\mathbb{R}$) simple differentiation
$\partial^*$ ... adjoint of $\partial$ on $L^2(\mathbb{R},\mu)$
$L$ ... $= \partial^*\partial$, one-dimensional OU operator
$\partial_i,\ \partial_{ij}$ ... partial derivatives w.r.t. $x_i$, $x_j$ etc.
$L$ ... generator of an $m$-dimensional diffusion process, for instance $L = E^{ij}\partial_{ij} + B^i\partial_i$
$H_n$ ... Hermite polynomials
$\Delta_n(t)$ ... $n$-dimensional simplex $\{0 < t_1 < \dots < t_n < t\} \subset [0,1]^n$
$J(\cdot)$ ... iterated Wiener-Ito integral, operator $L^2[\Delta_n] \to C_n \subset L^2(\Omega)$
$C_n$ ... $n$-th Wiener chaos
$\alpha$ ... multiindex (finite-dimensional)
$X$ ... $m$-dimensional diffusion process given by an SDE, driven by $d$ BMs
$\gamma = \gamma(X)$ ... $\langle DX, DX\rangle_H$, Malliavin covariance matrix
$V, W$ ... vector fields on $\mathbb{R}^m$, seen as maps $\mathbb{R}^m \to \mathbb{R}^m$ or as first order differential operators
$B, A_0$ ... vector fields on $\mathbb{R}^m$, appearing as drift term in the Ito (resp. Stratonovich) SDE
$A_1,\dots,A_d$ ... vector fields on $\mathbb{R}^m$, appearing in the diffusion term of the SDE
$\circ\,d\beta$ ... Stratonovich differential = Ito differential + $(\dots)dt$
$X$ ... diffusion given by the SDE, $X(0) = x$
$Y, Z$ ... $\mathbb{R}^{m\times m}$-valued processes, derivative of $X$ w.r.t. $X(0)$ resp. its inverse
$\partial V$ ... short for the matrix $(\partial_j V^i)$
$\nabla_W V$ ... covariant derivative, $= (\partial V)W$; $\nabla$ is called a connection
$[V,W]$ ... Lie bracket, yields another vector field
$\mathrm{Lie}\,\{\dots\}$ ... the smallest vector space closed under Lie brackets, containing $\{\dots\}$
$\mathcal{D}$ ... $= C_c^\infty$, test functions
$\mathcal{D}'$ ... Schwartz distributions = continuous functionals on $\mathcal{D}$
Chapter 1

Analysis on the Wiener Space

1.1 Wiener Space

$\Omega$ will denote the Wiener space $C([0,1])$. As usual, we put the Wiener measure $W$ on $\Omega$, therefore getting a probability space
$$(\Omega, \mathcal{F}, W)$$
where $\mathcal{F}$ is generated by the coordinate maps. On the other hand we can furnish $\Omega$ with the $\|\cdot\|_\infty$-norm, making it a (separable) Banach space. $\mathcal{F}$ coincides with the $\sigma$-field generated by the open sets of this Banach space. Random variables on $\Omega$ are called Wiener functionals. The coordinate process $\beta(t)$ is a Brownian motion under $W$, with natural filtration $\sigma(\{\beta(s) : s \le t\}) \equiv \mathcal{F}_t$. Often we will write this Brownian motion as $\beta(t) = \beta(t,\omega) = \omega(t)$, in particular in the context of stochastic Wiener-Ito integrals.
1.2 Two simple classes of Wiener functionals

Let $f$ be a polynomial and $h_1,\dots,h_n \in H \equiv L^2[0,1]$. Define first a class of cylindrical functionals
$$S_1 = \{F : F = f(\beta_{t_1},\dots,\beta_{t_n})\},$$
then the larger class of Wiener polynomials
$$S_2 = \{F : F = f(W(h_1),\dots,W(h_n))\}$$
where $W(h) \equiv \int_0^1 h\,d\beta$.
Remarks: - Both $S_i$ are algebras. In particular $S_2$ is what [Malliavin2] p13 calls the fundamental algebra.
- An $S_2$-type functional with all $h_i$'s deterministic step functions is in $S_1$.
- In both cases, we are dealing with r.v. of the type
$$F = f(n\text{-dimensional gaussian}) = \tilde f(n \text{ indep. std. gaussians}).$$
Constructing $\tilde f$ boils down to a Gram-Schmidt orthonormalization of the $h_i$'s. When restricting the discussion to $S_2$-functionals one can actually forget $\Omega$ and simply work with $(\mathbb{R}^n,\mu_n)$, that is, $\mathbb{R}^n$ with the $n$-dimensional standard Gaussian measure $d\mu_n(x) = (2\pi)^{-n/2}\exp(-|x|^2/2)\,dx$. This remark looks harmless here but will prove useful during the whole setup of the theory.
- $S_1 \subset S_2 \subset \bigcap_{p \ge 1} L^p(\Omega)$, as the polynomial growth of $f$ assures the existence of all moments. From this point of view, one could weaken the assumptions on $f$: for instance, smooth and of maximal polynomial growth, or exponential-martingale-type functionals.
1.3 Directional derivatives on the Wiener Space

Recall that $[W(h)](\omega) = \int_0^1 h\,d\beta(\omega)$ is constructed as an $L^2$-limit and hence, as an element of $L^2(\Omega,W)$, only $W$-a.s. defined. Hence, any $S_2$- or more general Wiener functional is only $W$-a.s. defined.
In which directions can we shift the argument of a functional while keeping it a.s. well-defined? By Girsanov's theorem, the Cameron-Martin directions
$$\tilde h(\cdot) := \int_0^\cdot h(t)\,dt \quad \text{with } h \in H$$
are fine, as the shifted Wiener measure $(\tau_{\tilde h})_*W$ is equivalent to $W$. The set of all $\tilde h$ is the Cameron-Martin space $\tilde H$. It is known that for a direction $k \notin \tilde H$ the shifted measure is singular w.r.t. $W$, see [RY], Ch. VIII/2. Hence $F(\cdot + k)$ does not make sense when $F$ is an a.s. defined functional, and neither does a directional derivative in direction $k$.
Remarks: - The paths $\tilde h$ are sometimes called finite energy paths.
- The set $\tilde H$ has zero $W$-measure, since every $\tilde h$ is of bounded variation while $W$-a.s. Brownian paths are not.
- The map $h \mapsto \tilde h$ is a continuous linear injection from $H$ into $(\Omega, \|\cdot\|_\infty)$.
- Also, $h \mapsto \tilde h$ is a bijection from $H$ onto $\tilde H$ with inverse $\frac{d}{dt}\tilde h(t) = h(t)$. This derivative exists $dt$-a.s. since $\tilde h$ is absolutely continuous; moreover $h \in H$, i.e. square-integrable. In particular, we can use this to transfer the Hilbert structure from $H$ to $\tilde H$: for $\tilde g, \tilde k \in \tilde H$ let $g, k$ denote their square-integrable derivatives; then
$$\langle \tilde g, \tilde k\rangle_{\tilde H} := \langle g, k\rangle_H = \int_0^1 g\,k\,d\lambda.$$
- In a more general context $\tilde H$ (or indeed $H$) is known as the reproducing kernel space for the Gaussian measure $W$ on the Banach space $\Omega$ (terminology from [DaPrato], p40).
1.4 The Malliavin derivative D in special cases

Take $F \in S_1$, with slightly different notation
$$F(\omega) = f(\beta(t_1),\dots,\beta(t_n)) = f\big(W(1_{[0,t_1]}),\dots,W(1_{[0,t_n]})\big).$$
Then, at $\varepsilon = 0$,
$$\frac{d}{d\varepsilon}F(\omega + \varepsilon\tilde h)$$
equals
$$\sum_{i=1}^n \partial_i f(\beta(t_1),\dots,\beta(t_n))\int_0^{t_i} h\,d\lambda =: \langle DF, h\rangle_H$$
where we define
$$DF = \sum_i \partial_i f\big(W(1_{[0,t_1]}),\dots,W(1_{[0,t_n]})\big)\,1_{[0,t_i]}.$$
This extends naturally to $S_2$ functionals,
$$DF = \sum_i \partial_i f(W(h_1),\dots,W(h_n))\,h_i,$$
and this should be regarded as an $H$-valued r.v.
Remarks: - $D$ is well-defined. In particular, for $F = W(h) = \int_0^1 h\,d\beta$ this is a consequence of the Ito isometry.
- Sometimes it is convenient to write
$$D_tF(\omega) = \sum_i \partial_i f(W(h_1)(\omega),\dots)\,h_i(t)$$
which, of course, is only $W$-a.s. well-defined.
- Since $D(\int_0^1 h\,d\beta) = D(W(h)) = h$,
$$DF = \sum_i \partial_i f(W(h_1),\dots,W(h_n))\,D(W(h_i)),$$
which is the germ of a chain-rule formula.
- Here is a product rule, for $F, G \in S_2$:
$$D(FG) = F\,DG + G\,DF. \quad (1.1)$$
(Just check it for monomials, $F = W(h)^n$, $G = W(g)^m$.) See [Nualart], p34 for an extension.
- As $f$ has only polynomial growth, we have $DF \in L^p(\Omega,H)$, i.e. $\int_\Omega \|DF(\omega)\|_H^p\,dW < \infty$. For $p = 2$ this can be expressed more simply: $DF \in L^2([0,1]\times\Omega)$, and (after fixing a version) $DF = DF(t,\omega)$ can be thought of as a stochastic process.
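A quick sanity check of these formulas (our own toy computation, not part of the original text): for $F = W(h)^2$ the defining formula gives $DF = 2W(h)\,h$, and the product rule (1.1) with $F = G = W(h)$ gives the same. In particular $\|DF\|_H^2 = 4W(h)^2\|h\|_H^2$, which has moments of all orders, as claimed.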
1.5 Extending the Malliavin Derivative D

So far we have
$$D : L^p(\Omega) \supset S_2 \to L^p(\Omega,H).$$
It is instructive to compare this to the following well-known situation in (Sobolev) analysis. Take $f \in L^p(U)$, some domain $U \subset \mathbb{R}^n$. Then the gradient operator $\nabla = (\partial_i)_{i=1,\dots,n}$ maps an appropriate subset of $L^p(U)$ into $L^p(U,\mathbb{R}^n)$. The $\mathbb{R}^n$ comes clearly into play as it is (isomorphic to) the tangent space at any point of $U$. Going back to the Wiener space we see that $H$ (or, equivalently, $\tilde H$) plays the role of the tangent space to the Wiener space.[1]
[1] We follow Malliavin himself and also Nualart in defining $DF$ as an $H$-valued r.v. This seems the simplest choice in view of the calculus to come. Oksendal, Uestuenel and Hsu define it as an $\tilde H$-valued r.v. As commented in Section 1.3 the difference is purely notational since there is a natural isomorphism between $H$ and $\tilde H$. For instance, we can write $D(\int_0^1 h\,d\beta) = h$, while the $\tilde H$-choice leads to ($\tilde H$-derivative of)$(\int_0^1 h\,d\beta) = \int_0^\cdot h\,d\lambda$.
Again, consider $\nabla$ on $L^p(U)$: what is its natural domain? The best you can do is $(\nabla, W^{1,p})$, which is a closed operator, while $(\nabla, C_c^1)$ (for instance) is a closable operator. This closability (see [RR]) is exactly what you need to extend the operator to the closure of $C_c^1$ with respect to $\|\cdot\|_{W^{1,p}}$, where
$$\|f\|^p_{W^{1,p}} = \int_U |f|^p\,d\lambda_n + \sum_{i=1}^n\int_U |\partial_i f|^p\,d\lambda_n$$
or, equivalently,
$$\int_U |f|^p\,d\lambda_n + \int_U \|\nabla f\|^p_{\mathbb{R}^n}\,d\lambda_n.$$
Using an integration-by-parts formula (see the following section on IBP), $(D, S_2)$ is easily seen to be closable (details are found in [Nualart] p26 or [Uestuenel]). The extended domain is denoted by $\mathbb{D}^{1,p}$ and is exactly the closure of $S_2$ with respect to $\|\cdot\|_{1,p}$, where
$$\|F\|^p_{1,p} = \int_\Omega |F|^p\,dW + \int_\Omega \|DF\|^p_H\,dW = E|F|^p + E\|DF\|^p_H.$$
Remarks: - Look up the definition of a closed operator and compare Bass' way of introducing the Malliavin derivative ([Bass] p193) with the classical result in [Stein] p122.
- For simplicity take $p = 2$ and consider $F = f(W(h_1),\dots,W(h_n))$ with $h_i$'s in $H$. As mentioned in section 1.2, there is no loss of generality in assuming the $h_i$'s to be orthonormal. Then
$$\|DF\|^2_H = \sum_{i=1}^n \big(\partial_i f(n \text{ iid std gaussians})\big)^2,$$
and $\|F\|^2_{1,2}$ simply becomes
$$\int_{\mathbb{R}^n} f^2\,d\mu_n + \int_{\mathbb{R}^n} \|\nabla f\|^2_{\mathbb{R}^n}\,d\mu_n,$$
which is just the norm of the weighted Sobolev space $W^{1,2}(\mu_n)$. More on this link between $\mathbb{D}^{1,p}$ and finite-dimensional Sobolev spaces is to be found in [Malliavin1] and [Nualart].
- A frequent characterization of Sobolev spaces on $\mathbb{R}^n$ is via the Fourier transform (see, for instance, [Evans] p282). Let $f \in L^2 = L^2(\mathbb{R}^n)$; then
$$f \in H^k \iff (1 + |x|^k)\hat f \in L^2.$$
Moreover,
$$\|f\|_{H^k} \approx \|(1 + |x|^k)\hat f\|_{L^2}.$$
In particular, this allows a natural definition of $H^s(\mathbb{R}^n)$ for all $s \in \mathbb{R}$. For later reference, we consider the case $k = 1$ and, for simplicity, $n = 1$. Recall that $i\partial$ is a self-adjoint operator on $(L^2(\mathbb{R}), \langle\cdot,\cdot\rangle)$:
$$\langle (1+x)\hat f, (1+x)\hat f\rangle = \langle (1+i\partial)f, (1+i\partial)f\rangle = \langle f, f\rangle + \langle i\partial f, i\partial f\rangle = \langle f, f\rangle + \langle f, -\partial^2 f\rangle = \langle f, (1+A)f\rangle,$$
where $A$ denotes the negative second derivative (the cross terms vanish by integration by parts). In Section 1.10 this will be linked to the usual definition of Sobolev spaces (as seen at the beginning of this section), both on $(\mathbb{R}^n,\mu_n)$ and on $(\Omega, W)$.
- The preceding discussion about how to obtain the optimal domain for the gradient on $(\mathbb{R}^n,\lambda_n)$ is rarely an issue in practical expositions of Sobolev theory on $\mathbb{R}^n$. The reason is, of course, that we can take weak resp. distributional derivatives. As is well known, Sobolev spaces can then be defined as those $L^p$-functions whose weak derivatives are again in $L^p$. A priori, this can't be done on the Wiener space (at this stage, what are the smooth test functions there?).
1.6 Integration by Parts

As motivation, we look at $(\mathbb{R},\lambda)$ first. Take $f$ smooth with compact support (for instance); then, by the translation invariance of Lebesgue measure,
$$\int f(x+h)\,d\lambda = \int f(x)\,d\lambda$$
and hence, after dividing by $h$ and letting $h \to 0$,
$$\int f'\,d\lambda = 0.$$
Replacing $f$ by $fg$ this reads
$$\int f'g\,d\lambda = -\int fg'\,d\lambda.$$
The point is that IBP is the infinitesimal expression of a measure invariance. Things are simple here because $\lambda_n$ is translation invariant, $(\tau_h)_*\lambda_n = \lambda_n$. Let's look at $(\mathbb{R}^n,\mu_n)$. It is elementary to check that for any $h \in \mathbb{R}^n$
$$\frac{d(\tau_h)_*\mu_n}{d\mu_n}(x) = \exp\Big(\sum_{i=1}^n h_ix_i - \frac{1}{2}\sum_{i=1}^n h_i^2\Big).$$
The corresponding fact on the Wiener space $(\Omega,W)$ is the Cameron-Martin theorem. For $\tilde h \in \tilde H$ and with $\tau_{\tilde h}(\omega) = \omega + \int_0^\cdot h\,d\lambda = \omega + \tilde h$,
$$\frac{d(\tau_{\tilde h})_*W}{dW}(\omega) = \exp\Big(\int_0^1 h\,d\beta(\omega) - \frac{1}{2}\int_0^1 h^2\,d\lambda\Big).$$

Theorem 1 (IBP on the Wiener Space) Let $h \in H$, $F \in S_2$. Then
$$E(\langle DF, h\rangle_H) = E\Big(F\int_0^1 h\,d\beta\Big).$$

Proof: (1st variant, following [Nualart]) By homogeneity, w.l.o.g. $\|h\| = 1$. Furthermore, we can find $f$ such that $F = f(W(h_1),\dots,W(h_n))$ with $(h_i)$ orthonormal in $H$ and $h = h_1$. Then, using classical IBP,
$$E\langle DF, h\rangle = E\sum_i \partial_i f\,\langle h_i, h\rangle = \int_{\mathbb{R}^n}\partial_1 f(x)\,(2\pi)^{-n/2}e^{-|x|^2/2}\,dx$$
$$= -\int_{\mathbb{R}^n} f(x)\,(2\pi)^{-n/2}e^{-|x|^2/2}\,(-x_1)\,dx = \int_{\mathbb{R}^n} f(x)\,x_1\,d\mu_n = E(F\,W(h_1)) = E(F\,W(h)).$$
(2nd variant, following an idea of Bismut, see [Bass]) We already saw in section 1.4 that for $F \in S_1$ the directional derivative in direction $\tilde h$ exists and coincides with $\langle DF, h\rangle$. For such $F$,
$$\int_\Omega F(\omega)\,dW(\omega) = \int_\Omega F\big(\tau_{-\tilde h}(\omega) + \tilde h\big)\,dW(\omega) = \int_\Omega F(\omega + \tilde h)\,d(\tau_{-\tilde h})_*W(\omega)$$
$$= \int_\Omega F\Big(\omega + \int_0^\cdot h\,d\lambda\Big)\exp\Big(-\int_0^1 h\,d\beta - \frac{1}{2}\int_0^1 h^2\,d\lambda\Big)\,dW(\omega),$$
using Girsanov's theorem. Replace $h$ by $\varepsilon h$ and observe that the l.h.s. is independent of $\varepsilon$. At least formally, exchanging integration over $\Omega$ and $\frac{d}{d\varepsilon}$ at $\varepsilon = 0$, we find
$$\int_\Omega\Big(\langle DF, h\rangle - F(\omega)\int_0^1 h\,d\beta\Big)\,dW(\omega) = 0$$
as required. To make this rigorous, approximate $F$ by $F$'s which are, together with $\|DF\|_H$, bounded on $\Omega$. Another approximation leads to $S_2$-type functionals. $\Box$

Remarks: - IBP on the Wiener space is one of the cornerstones of Malliavin calculus. The second variant of the proof inspired the name "stochastic calculus of variations": Wiener paths are perturbed by paths $\varepsilon\tilde h(\cdot)$. Stochastic calculus of variations has (well, a priori) nothing to do with the classical calculus of variations.
- As before, we can apply this result to a product $FG$, where both $F, G \in S_2$. This yields
$$E(G\langle DF, h\rangle) = E(-F\langle DG, h\rangle + FG\,W(h)). \quad (1.2)$$
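Theorem 1 is easy to test by Monte Carlo. Here is a minimal sketch (our own illustration, not part of the original notes; seed and sample size are arbitrary): for $F = f(W(h))$ with $\|h\|_H = 1$ we have $\langle DF, h\rangle_H = f'(W(h))$, and $W(h)$ is a standard Gaussian, so the theorem reduces to the classical Gaussian IBP $E[f'(Z)] = E[Zf(Z)]$.

import numpy as np

# Monte Carlo sanity check of Theorem 1 for F = f(W(h)), ||h||_H = 1, f = exp.
# Then <DF, h>_H = f'(W(h)) and W(h) ~ N(0,1), so the claim is E[f'(Z)] = E[f(Z) Z].
rng = np.random.default_rng(0)
Z = rng.standard_normal(10**6)      # samples of W(h)
lhs = np.mean(np.exp(Z))            # E <DF, h>_H
rhs = np.mean(np.exp(Z) * Z)        # E [F W(h)]
print(lhs, rhs)                     # both approx e^{1/2} = 1.6487...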
1.7 Ito representation formula / Clark-Ocone-Haussmann formula

As already mentioned, $DF \in L^2([0,1]\times\Omega)$ can be thought of as a stochastic process. Is it adapted? Let's see. Set
$$F(s) := \mathcal{E}_s(h) := \exp\Big(\int_0^s h\,d\beta - \frac{1}{2}\int_0^s h^2\,d\lambda\Big),$$
an exponential martingale. $F := \mathcal{E}(h) := F(1)$ is not quite in $S_2$ but is easily seen to be in $\mathbb{D}^{1,p}$ and, at least formally and $d\lambda(t)$-a.s.,
$$D_tF = e^{-\frac{1}{2}\int_0^1 h^2\,d\lambda}\,D_t\Big(\exp\int_0^1 h\,d\beta\Big) = e^{-\frac{1}{2}\int_0^1 h^2\,d\lambda}\exp\Big(\int_0^1 h\,d\beta\Big)\,h(t) = F\,h(t).$$
The chain rule used here is made rigorous by approximation in $S_2$ using the partial sums of the exponential.
As $F$ contains information up to time 1, $D_tF$ is not adapted to $\mathcal{F}_t$, but we can always project down:
$$E(D_tF\,|\,\mathcal{F}_t) = E(F(1)h(t)\,|\,\mathcal{F}_t) = h(t)\,E(F(1)\,|\,\mathcal{F}_t) = h(t)\,F(t),$$
using the martingale property of $F(t)$. On the other hand, $F$ solves the SDE
$$dF(t) = h(t)\,F(t)\,d\beta(t)$$
with $F(0) = 1 = E(F)$. Hence
$$F = E(F) + \int_0^1 h(t)F(t)\,d\beta(t) = E(F) + \int_0^1 E(D_tF\,|\,\mathcal{F}_t)\,d\beta(t). \quad (1.3)$$
By throwing away some information this reads
$$F = E(F) + \int_0^1 \varphi(t,\omega)\,d\beta \quad (1.4)$$
for some adapted process $\varphi \in L^2([0,1]\times\Omega)$. We proved (1.4) for $F$ of the form $\mathcal{E}(h)$, sometimes called Wick exponentials; call $\mathcal{E}$ the set of all such $F$'s. Obviously this extends to the linear span$(\mathcal{E})$ and, by a density argument (the span of $\mathcal{E}$ is dense in $L^2$, cf. section 1.14), to any $F \in L^2(\Omega, W)$. This is the Ito representation theorem.
Looking back to (1.3), we can't expect this to hold for any $F \in L^2(\Omega,W)$, since $D$ is only defined on the proper subset $\mathbb{D}^{1,2}$. However, it is true for $F \in \mathbb{D}^{1,2}$; this is the Clark-Ocone-Haussmann formula.
Remarks: - In most books, for instance [Nualart], the proof uses the Wiener-Ito chaos decomposition, although approximation via span$(\mathcal{E})$ should work.
- A similar type of computation allows one to show, at least for $F \in \mathrm{span}(\mathcal{E})$,[2] and $d\lambda(t)\otimes dW$-a.s.,
$$D_t\,E(F\,|\,\mathcal{F}_s) = E(D_tF\,|\,\mathcal{F}_s)\,1_{[0,s]}(t).$$
In particular,
$$F \text{ is } \mathcal{F}_s\text{-measurable} \implies D_tF = 0 \text{ for Lebesgue-a.e. } t > s. \quad (1.5)$$
[2] An extension to $\mathbb{D}^{1,2}$ is proved in [Nualart], p32, via the WICD.
The intuition here is very clear: if $F$ only depends on the early part of the path up to time $s$, i.e. on $\{\beta(s') : s' \le s\}$, perturbing the path later on (i.e. on $t > s$) shouldn't change a thing. Now recall the interpretation of $\langle DF, h\rangle = \int D_tF\,h(t)\,dt$ as the directional derivative in direction of the perturbation $\tilde h = \int_0^\cdot h\,d\lambda$.
- Comparing (1.4) and (1.3), the question arises what really happens for $F \in L^2 \setminus \mathbb{D}^{1,2}$. There is an extension of $D$ to $\mathbb{D}^{-\infty}$, the space of Meyer-Watanabe distributions built on the space $\mathbb{D}^\infty$ (introduced a little later in this text), and $L^2 \subset \mathbb{D}^{-\infty}$. In this context, (1.3) makes sense for all $F \in L^2$, see [Uestuenel], p42.
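Here is the formula in action on the simplest nontrivial example (our own illustration): take $F = \beta(1)^2 = W(1_{[0,1]})^2 \in \mathbb{D}^{1,2}$. Then $D_tF = 2\beta(1)$, $E(D_tF\,|\,\mathcal{F}_t) = 2\beta(t)$ and $E(F) = 1$, so the Clark-Ocone-Haussmann formula reads
$$\beta(1)^2 = 1 + \int_0^1 2\beta(t)\,d\beta(t),$$
which is precisely Ito's formula for $\beta(1)^2$.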
1.8 Higher derivatives

When $f : \mathbb{R}^n \supset U \to \mathbb{R}$, then $\nabla f = (\partial_i f)$ is a vector field on $U$, meaning that at each point $\nabla f(x) \in T_xU \cong \mathbb{R}^n$, with standard differential-geometry notation. Then $(\partial_{ij}f)$ is a (symmetric) 2-tensor field, i.e. at each point an element of $T_x^*U \otimes T_x^*U \cong \mathbb{R}^n\otimes\mathbb{R}^n$. As seen in section 1.5, the tangent space of $\Omega$ corresponds to $H$; therefore $D^2F$ (still to be defined!) should be an $H\otimes H$-valued r.v. (or $H\odot H$, to indicate symmetry). No need to worry about tensor calculus in infinite dimension, since $H\otimes H \cong L^2([0,1]^2)$. For $F \in S_2$ (for instance), randomness fixed,
$$D^2_{s,t}F := D_s(D_tF)$$
is $d\lambda^2(s,t)$-a.s. well-defined, i.e. good enough to define an element of $L^2([0,1]^2)$. Again, there is closability of the operator $D^2 : L^p(W) \to L^p(W, H\odot H)$ to check, leading to a maximal domain $\mathbb{D}^{2,p}$ with associated norm $\|\cdot\|_{2,p}$, and the same is done for higher derivatives. Details are in [Nualart], p26.
Remarks: - $\mathbb{D}^{k,p}$ is not an algebra, but
$$\mathbb{D}^\infty := \bigcap_{k,p}\mathbb{D}^{k,p}$$
is. As with the class of rapidly decreasing functions, underlying the tempered distributions, $\mathbb{D}^\infty$ can be given a metric and then serve to introduce continuous functionals on it, the Meyer-Watanabe distributions. This is quite a central point in many expositions, including [IW], [Oksendal2], [Ocone] and [Uestuenel].
- Standard Sobolev embedding theorems, as for instance [RR] p215, tell us that for $U = \mathbb{R}^n$
$$W^{k,p}(U) \subset C_b(U)$$
whenever $kp > \dim U = n$. Now, very formally, when $n = \infty$ one could have a function in the intersection of all these Sobolev spaces without achieving any continuity. And this is what happens on $\Omega$!!! For instance, taking $F = W(h)$, $h \in H$, gives $DF = h$, $D^2F = 0$, therefore $F \in \mathbb{D}^\infty$. On the other hand, [Nualart] has classified those $h$ for which a continuous choice of $W(h)$ exists, as those $L^2$-functions that have a representative of bounded variation; see [Nualart] p32 and the references therein.
1.9 The Skorohod Integral / Divergence

For simplicity consider $p = 2$; then
$$D : L^2(\Omega) \supset \mathbb{D}^{1,2} \to L^2(\Omega, H),$$
a densely defined unbounded operator. Let $\delta$ denote the adjoint operator, i.e. for $u \in \mathrm{Dom}\,\delta \subset L^2(\Omega,H) \cong L^2([0,1]\times\Omega)$ we require
$$E(\langle DF, u\rangle_H) = E(F\,\delta(u)).$$
Remark: On $(\mathbb{R}^n,\mu_n)$,
$$\int_{\mathbb{R}^n}\langle \nabla f, u\rangle_{\mathbb{R}^n}\,d\mu_n = \int_{\mathbb{R}^n} f\,(\delta u)\,d\mu_n; \quad (1.6)$$
this explains (up to a minus sign) why $\delta$ is called the divergence.
Take $F, G \in S_2$, $h \in H$. Then $\delta(Fh)$ is easily computed using the IBP formula (1.2):
$$E(\delta(Fh)\,G) = E(\langle Fh, DG\rangle) = E(F\langle h, DG\rangle) = E(-G\langle h, DF\rangle + FG\,W(h))$$
which implies
$$\delta(Fh) = F\,W(h) - \langle h, DF\rangle. \quad (1.7)$$
Taking $F \equiv 1$ we immediately get that $\delta$ coincides with the Ito integral on (deterministic) $L^2$-functions. But we can see much more: take $F$ $\mathcal{F}_r$-measurable and $h = 1_{(r,s]}$. We know from (1.5) that $D_tF = 0$ for a.e. $t > r$. Therefore
$$\langle h, DF\rangle = \int_0^1 1_{(r,s]}(t)\,D_tF\,dt = 0,$$
i.e.
$$\delta(Fh) = F\,W(h) = F(\beta_s - \beta_r) = \int_0^1 Fh\,d\beta$$
by the very definition of the Ito integral on adapted step functions.[3]
[3] Also called simple processes. See [KS] for definitions and density results.
By an approximation, for $u \in L^2_a$, the closed subspace of $L^2([0,1]\times\Omega)$ formed by the adapted processes, it still holds that
$$\delta(u) = \int_0^1 u(t)\,d\beta(t),$$
see [Nualart] p41 or [Uestuenel] p15. The divergence $\delta$ is therefore a generalization of the Ito integral (to non-adapted integrands) and - in this context - is called the Skorohod integral.
Remark: For $u \in H$, $W(u) = \delta(u)$, so (1.7) also reads
$$\delta(Fu) = F\,\delta(u) - \langle u, DF\rangle, \quad (1.8)$$
and this relation stays true for $u \in \mathrm{Dom}(\delta)$, $F \in \mathbb{D}^{1,2}$ and some integrability condition, see [Nualart], p40. The formal proof is simple, using the product rule:
$$E\langle Fu, DG\rangle = E\langle u, F\,DG\rangle = E\langle u, D(FG) - G(DF)\rangle = E[\delta(u)FG - \langle u, DF\rangle G] = E[(F\delta(u) - \langle u, DF\rangle)G].$$
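As an illustration with a genuinely non-adapted integrand (our own example, straight from (1.7)): take $u(t) = \beta(1)\,h(t)$, i.e. $F = \beta(1) = W(1_{[0,1]})$ with $DF = 1_{[0,1]}$. Then (1.7) gives
$$\delta(\beta(1)\,h) = \beta(1)\int_0^1 h\,d\beta - \int_0^1 h\,d\lambda;$$
the correction term $\langle h, DF\rangle_H = \int_0^1 h\,d\lambda$ is the price for the integrand anticipating $\beta(1)$.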
1.10 The OU-operator

We found gradient and divergence on $\Omega$. On $\mathbb{R}^n$, plugging them together yields a positive operator (the negative Laplacian):
$$A = \nabla^*\nabla = -\mathrm{div}\,\nabla.$$
Here is an application. Again we are on $(\mathbb{R}^n,\lambda_n)$; $\langle\cdot,\cdot\rangle$ denotes the inner product on $L^2(\mathbb{R}^n)$.
$$\|f\|^2_{W^{1,2}} = \|f\|^2_{H^1} = \int |f|^2\,d\lambda_n + \int \|\nabla f\|^2_{\mathbb{R}^n}\,d\lambda_n = \int |f|^2\,d\lambda_n + \int f\,Af\,d\lambda_n \quad \text{(using the Lebesgue analogue of (1.6))}$$
$$= \langle f, f\rangle + \langle Af, f\rangle = \langle (1+A)f, f\rangle = \langle (1+A)^{1/2}f, (1+A)^{1/2}f\rangle = \|(1+A)^{1/2}f\|^2,$$
using the square root of the positive operator $(1+A)$ as defined, for instance, by spectral calculus. For $p \neq 2$ there is no equality, but one still has
$$\|\cdot\|_{W^{1,p}} \approx \|(1+A)^{1/2}\,\cdot\,\|_{L^p}$$
when $p > 1$, see [Stein] p135.
Let's do the same on $(\Omega, W)$. First define the Ornstein-Uhlenbeck operator
$$L := \delta D.$$
Then the same is true, i.e. for $1 < p < \infty$,
$$\|\cdot\|_{1,p} \approx \|(1+L)^{1/2}\,\cdot\,\|_{L^p(\Omega)},$$
with equality for $p = 2$; the latter case is seen as before. This result is a corollary of the Meyer inequalities. The proof is not easy and is found in [Uestuenel] p19, [Nualart] p61 or [Sugita] p37.
How does $L$ act on an $S_2$-type functional $F = f(W(h_1),\dots,W(h_n))$, where we take w.l.o.g. the $h_i$'s orthonormal? Using $DF = \sum(\partial_i f)\,h_i$ and formula (1.7) we get
$$LF = \sum_i \partial_i f\; W(h_i) - \sum_i \langle D(\partial_i f), h_i\rangle = \sum_i \partial_i f\; W(h_i) - \sum_{i,j}\partial_{ij}f\,\langle h_j, h_i\rangle = (L^{(n)}f)(W(h_1),\dots,W(h_n))$$
where $L^{(n)}$ is defined as the operator, for functions on $\mathbb{R}^n$,
$$L^{(n)} := \sum_{i=1}^n\big[x_i\partial_i - \partial_{ii}\big] = x\cdot\nabla - \Delta.$$
Remarks: - Minus $L^{(n)}$ is the generator of the $n$-dimensional OU process given by the SDE
$$dx = \sqrt{2}\,d\beta - x\,dt,$$
with explicit solution
$$x(t) = x_0e^{-t} + \sqrt{2}\,e^{-t}\int_0^t e^s\,d\beta(s) \quad (1.9)$$
and, for $t$ fixed, $x(t)$ has law $N\big(x_0e^{-t}, (1 - e^{-2t})\,\mathrm{Id}\big)$.
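This law is easy to check by simulation. A minimal Euler-Maruyama sketch in one dimension (our own illustration; step count, sample size and seed are arbitrary choices):

import numpy as np

# Simulate dx = sqrt(2) dbeta - x dt and compare the empirical moments of x(t)
# with the exact law N(x0 e^{-t}, 1 - e^{-2t}) stated after (1.9).
rng = np.random.default_rng(1)
x0, t, n_steps, n_paths = 2.0, 1.0, 1000, 100_000
dt = t / n_steps
x = np.full(n_paths, x0)
for _ in range(n_steps):                     # Euler-Maruyama steps
    x += -x * dt + np.sqrt(2 * dt) * rng.standard_normal(n_paths)
print(x.mean(), x0 * np.exp(-t))             # both approx 0.7358
print(x.var(), 1 - np.exp(-2 * t))           # both approx 0.8647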
- $L$ plays the same role on $(\Omega,W)$ as $L^{(n)}$ on $(\mathbb{R}^n,\mu_n)$ or $A = -\Delta$ on $(\mathbb{R}^n,\lambda_n)$.
- Here is some OU calculus, (at least) for $F, G \in S_2$:
$$L(FG) = F\,LG + G\,LF - 2\langle DF, DG\rangle, \quad (1.10)$$
as is immediately seen from (1.1) and (1.8).
- Some more of that kind:
$$\delta(F\,DG) = F\,LG - \langle DF, DG\rangle. \quad (1.11)$$
1.11 The OU-semigroup

We first give a result from semigroup theory.

Theorem 2 Let $H$ be a Hilbert space, $B : H \to H$ a (possibly unbounded, densely defined) positive, self-adjoint operator. Then $-B$ is the infinitesimal generator of a strongly continuous semigroup of contractions on $H$.

Proof: Self-adjointness and positivity imply that $-B$ is dissipative in Pazy's terminology. Now use Corollary 4.4 in [Pazy], page 15 (which is derived from the Lumer-Phillips theorem, which is, itself, based on the Hille-Yosida theorem). $\Box$

Applying this to $A$ yields the heat semigroup on $L^2(\mathbb{R}^n,\lambda_n)$, applying it to $L^{(n)}$ yields the OU semigroup on $L^2(\mathbb{R}^n,\mu_n)$, and for $L$ we get the OU semigroup on $L^2(\Omega,W)$.

Let's look at the OU semigroup $P_t^{(n)}$ with generator $-L^{(n)}$. Take $f : \mathbb{R}^n \to \mathbb{R}$, say, smooth with compact support. Then it is well known that, using (1.9),
$$(P_t^{(n)}f)(x) = E_xf(x(t)) = \int_{\mathbb{R}^n} f\big(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\big)\,d\mu_n(y)$$
is again a continuous function in $x$. (This property is summarized by saying that $P_t^{(n)}$ is a Feller semigroup.) Similarly, whenever $F : \Omega \to \mathbb{R}$ is nice enough we can set
$$(P_tF)(x) = \int_\Omega F\big(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\big)\,dW(y) \quad (1.12)$$
$$= \int_\Omega F(x\cos\theta + y\sin\theta)\,dW(y), \qquad \cos\theta := e^{-t}.$$
A priori, this is not well-defined for $F \in L^p(\Omega)$ since two $W$-a.s. identical $F$'s could lead to different results. However, this does not happen:

Proposition 3 Let $1 \le p < \infty$. Then $P_t$ is a well-defined (bounded) operator from $L^p(\Omega)$ to $L^p(\Omega)$ (with norm 1).

Proof: Using Jensen and the rotational invariance of Wiener measure, with $R(x,y) = (x\cos\theta + y\sin\theta,\ -x\sin\theta + y\cos\theta)$ and $(F\otimes 1)(x,y) := F(x)$, we have
$$\|P_tF\|^p_{L^p(\Omega)} = \int_\Omega\Big|\int_\Omega F(x\cos\theta + y\sin\theta)\,dW(y)\Big|^p\,dW(x)$$
$$\le \int\!\!\int\big(|F\otimes 1|(R(x,y))\big)^p\,d(W\otimes W)(x,y)$$
$$= \int\!\!\int\big(|F\otimes 1|(x,y)\big)^p\,d(W\otimes W)(x,y)$$
$$= \int\!\!\int |F(x)|^p\,d(W\otimes W)(x,y) = \int|F(x)|^p\,dW(x) = \|F\|^p_{L^p(\Omega)}. \qquad \Box$$

It can be checked that $P_t$, as defined via (1.12) and considered as an operator on $L^2(\Omega)$, coincides with the abstract semigroup provided by the theorem at the beginning of this section. It suffices to check that $P_t$ is a semigroup with infinitesimal generator $-L$, the OU operator, see [Uestuenel] p17.
Remark: $P_t$ is actually more than just a contraction on $L^p$; it is hypercontractive, meaning that it increases the degree of integrability, see also [Uestuenel].
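The Mehler-type formula (1.12) is easy to probe numerically in one dimension. The following Monte Carlo sketch (our own illustration, not from the notes; the point $x$, time $t$, seed and sample size are arbitrary choices) checks the eigenfunction relation $P_tH_3 = e^{-3t}H_3$, consistent with the spectral decomposition of $L$ in section 1.14 below:

import numpy as np

# One-dimensional Mehler representation (P_t f)(x) = E f(e^{-t} x + sqrt(1-e^{-2t}) Z),
# Z ~ N(0,1), applied to the Hermite polynomial H_3(x) = x^3 - 3x.
rng = np.random.default_rng(2)
H3 = lambda x: x**3 - 3 * x
x, t = 1.5, 0.7
Z = rng.standard_normal(10**6)
mc = np.mean(H3(np.exp(-t) * x + np.sqrt(1 - np.exp(-2 * t)) * Z))
print(mc, np.exp(-3 * t) * H3(x))   # both approx -0.1378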
1.12 Some calculus on $(\mathbb{R},\mu)$

From section 1.10,
$$(Lf)(x) := (L^{(1)}f)(x) = xf'(x) - f''(x).$$
Following [Malliavin1], [Malliavin2] in notation, denote by $\partial f = f'$ the differentiation operator and by $\partial^*$ its adjoint on $L^2(\mu)$. By standard IBP,
$$(\partial^*f)(x) = -f'(x) + xf(x).$$
Note that $L = \partial^*\partial$. Define the Hermite polynomials by
$$H_0(x) = 1, \quad H_n = \partial^*H_{n-1} = (\partial^*)^n 1.$$
Using the commutation relation $\partial\partial^* - \partial^*\partial = \mathrm{Id}$, an induction (one-line) proof yields $\partial H_n = nH_{n-1}$. An immediate consequence is
$$LH_n = nH_n.$$
Since $H_n$ is a polynomial of degree $n$, $\partial^mH_n = 0$ when $m > n$; therefore, for $m > n$,
$$\langle H_n, H_m\rangle_{L^2(\mu)} = \langle H_n, (\partial^*)^m 1\rangle = \langle \partial^m H_n, 1\rangle = 0.$$
On the other hand, since $\partial^nH_n = n!$,
$$\langle H_n, H_n\rangle_{L^2(\mu)} = n!,$$
hence $\big\{(n!)^{-1/2}H_n\big\}$ is an orthonormal system, which is known to be complete, see [Malliavin2] p7. Hence, given $f \in L^2(\mu)$ we have
$$f = \sum c_n H_n \quad \text{with } c_n = \frac{1}{n!}\langle f, H_n\rangle.$$
Assume that all derivatives of $f$ are in $L^2(\mu)$, too. Then
$$\langle f, H_n\rangle = \langle f, \partial^*H_{n-1}\rangle = \langle \partial f, H_{n-1}\rangle = \dots = \langle \partial^n f, 1\rangle.$$
Denote this projection on $1$ by $E(\partial^n f)$ and observe that it equals $E((\partial^n f)(X))$ for a standard gaussian $X$. We have
$$f = \sum_{n=0}^\infty \frac{1}{n!}E(\partial^n f)\,H_n. \quad (1.13)$$
Apply this to $f_t(x) = \exp(tx - t^2/2)$ where $t$ is a fixed parameter. Noting $\partial^nf_t = t^nf_t$ and $E(\partial^nf_t) = t^n$ we get
$$\exp(tx - t^2/2) = \sum_{n=0}^\infty \frac{t^n}{n!}H_n(x).$$
Remark: [Malliavin1], [Malliavin2] extend $\partial$, $\partial^*$, $L$ in a straightforward manner to $(\mathbb{R}^{\mathbb{N}}, \mu^{\mathbb{N}})$, which is, in some sense, $(\Omega,W)$ with a fixed ONB in $H$.
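The operators $\partial$ and $\partial^*$ make the Hermite polynomials easy to generate and test numerically. A minimal sketch (our own, not part of the original notes): from $H_n = \partial^*H_{n-1} = xH_{n-1} - H_{n-1}'$ and $\partial H_{n-1} = (n-1)H_{n-2}$ one gets the three-term recursion $H_n = xH_{n-1} - (n-1)H_{n-2}$, and the relation $\langle H_n, H_m\rangle_{L^2(\mu)} = n!\,\delta_{nm}$ can be checked by sampling from $\mu$.

import numpy as np

# Hermite polynomials via H_n = x H_{n-1} - (n-1) H_{n-2}, plus a Monte Carlo
# check of the orthogonality relation <H_n, H_m>_{L^2(mu)} = n! delta_{nm}.
def hermite(n, x):
    h_prev, h = np.ones_like(x), x          # H_0 = 1, H_1 = x
    if n == 0:
        return h_prev
    for k in range(2, n + 1):
        h_prev, h = h, x * h - (k - 1) * h_prev
    return h

rng = np.random.default_rng(3)
X = rng.standard_normal(10**6)              # samples from mu = N(0,1)
print(np.mean(hermite(3, X) * hermite(3, X)))   # approx 3! = 6
print(np.mean(hermite(3, X) * hermite(2, X)))   # approx 0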
1.13 Iterated Wiener-Ito integrals

There is a close link between Hermite polynomials and iterated Wiener-Ito integrals of the form
$$J_n(f) := \int_{\Delta_n} f\,d\beta^n := \int_0^1\Big(\int_0^{t_n}\cdots\Big(\int_0^{t_2} f(t_1,\dots,t_n)\,d\beta_{t_1}\Big)\cdots\Big)\,d\beta_{t_n},$$
(well-defined) for $f \in L^2(\Delta_n)$, where $\Delta_n := \Delta_n(1) := \{0 < t_1 < \dots < t_n < 1\} \subset [0,1]^n$. Note that only integration over such a simplex makes sure that every Ito integration has an adapted integrand. Note that $J_n(f) \in L^2(\Omega)$. A straightforward computation using the Ito isometry shows that for $n \neq m$
$$E(J_n(f)J_m(g)) = 0$$
while
$$E(J_n(f)J_n(g)) = \langle f, g\rangle_{L^2(\Delta_n)}.$$

Proposition 4 Let $h \in H$ with $\|h\|_H = 1$. Let $h^{\otimes n}$ be the $n$-fold product, a (symmetric) element of $L^2([0,1]^n)$, and restrict it to $\Delta_n$. Then
$$n!\,J_n(h^{\otimes n}) = H_n(W(h)). \quad (1.14)$$

Proof: Set
$$M_t := \mathcal{E}_t(g) \quad \text{and} \quad N_t := 1 + \sum_{n=1}^\infty \int_{\Delta_n(t)} g^{\otimes n}\,d\beta^n$$
where $g \in H$. By the above orthogonality relations $N_t$ is seen to be in $L^2$. Moreover, both $Y = M$ resp. $N$ solve the integral equation
$$Y_t = 1 + \int_0^t Y_s\,g(s)\,d\beta_s.$$
By a uniqueness result for SDEs (OK, it's just Gronwall's lemma for the $L^2$-norm of $M_t - N_t$) we see that, $W$-a.s., $M_t = N_t$. Now take $f \in H$ with norm one. Using the above result with $g = \lambda f$, $t = 1$:
$$\exp\Big(\lambda\int_0^1 f\,d\beta - \frac{\lambda^2}{2}\Big) = 1 + \sum_{n=1}^\infty \lambda^n\,J_n(f^{\otimes n}), \quad (1.15)$$
and using the generating function for the Hermite polynomials finishes the proof. $\Box$

A simple geometric corollary of the preceding is that, for $h, g$ both norm-one elements of $H$,
$$E\big(H_n(W(h))\,H_m(W(g))\big) = 0$$
if $n \neq m$, and
$$E\big(H_n(W(h))\,H_n(W(g))\big) = n!\,(\langle h, g\rangle_H)^n.$$
Remark: If it were just for this corollary, an elementary and simple proof is contained in [Nualart].
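The corollary can also be checked by simulation. A small sketch (our own; the correlation value is an arbitrary illustrative choice): since $(W(h), W(g))$ is a centered Gaussian pair with unit variances and covariance $\rho = \langle h, g\rangle_H$, it suffices to sample such a pair directly.

import math
import numpy as np

# Monte Carlo check of E[H_n(W(h)) H_n(W(g))] = n! <h,g>_H^n for unit h, g.
rng = np.random.default_rng(4)
rho, n = 0.6, 3
Z1 = rng.standard_normal(10**6)
Z2 = rho * Z1 + np.sqrt(1 - rho**2) * rng.standard_normal(10**6)
H3 = lambda x: x**3 - 3 * x                  # Hermite polynomial H_3
print(np.mean(H3(Z1) * H3(Z2)))              # approx 1.296
print(math.factorial(n) * rho**n)            # 6 * 0.216 = 1.296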
1.14 The Wiener-Ito Chaos Decomposition

Set
$$C_n := \{J_n(f) : f \in L^2(\Delta_n)\} \quad (n\text{-th Wiener chaos}),$$
a family of closed, mutually orthogonal subspaces of $L^2(\Omega)$.
For $F = \mathcal{E}(h) \in L^2(\Omega)$ we know from the proof of Proposition 4 that
$$F = 1 + \sum_{n=1}^\infty J_n(h^{\otimes n}) \quad \text{(orthogonal sum)}.$$
Less explicitly, this is an orthogonal decomposition of the form
$$F = f_0 + \sum_{n=1}^\infty J_n(f_n)$$
for some sequence of $f_n \in L^2(\Delta_n)$. Clearly, this extends to span$(\mathcal{E})$, and since this span is dense in $L^2(\Omega)$ it further extends to any $F \in L^2(\Omega)$, which is the same as saying that
$$L^2(\Omega) = \bigoplus_{n=0}^\infty C_n \quad \text{(orthogonal)}$$
when setting $C_0$ the subspace of the constants. Indeed, assume there is a non-zero element $G \in (\bigoplus C_n)^\perp$, wlog of norm one. But there is an $F \in \mathrm{span}(\mathcal{E}) \subset \bigoplus C_n$ arbitrarily close - contradiction. This result is called the Wiener-Ito Chaos Decomposition.
Remarks: - A slightly different description of the Wiener chaos:
$$C_n = \text{closure of span}\{J_n(h^{\otimes n}) : \|h\|_H = 1\} = \text{closure of span}\{H_n(W(h)) : \|h\|_H = 1\}. \quad (1.16)$$
The second equality is clear by (1.14). Denote by $B_n$ the r.h.s.; clearly $B_n \subset C_n$. But since span$(\mathcal{E}) \subset \bigoplus B_n$, taking closures yields $\bigoplus B_n = L^2$, hence $B_n = C_n$.
We now turn to the spectral decomposition of the OU operator $L$.

Theorem 5 Let $\pi_n$ denote the orthogonal projection on $C_n$; then
$$L = \sum_{n=1}^\infty n\,\pi_n.$$

Proof: Set $X = W(h)$, $Y = W(k)$ for two norm-one elements of $H$, $a = \langle h, k\rangle$, $F = H_n(X)$. Then
$$E(LF\cdot H_m(Y)) = E\langle DH_n(X), DH_m(Y)\rangle$$
$$= E\langle nH_{n-1}(X)\,h,\ mH_{m-1}(Y)\,k\rangle \quad (\text{using } H_n' = \partial H_n = nH_{n-1})$$
$$= nm\,a\,E(H_{n-1}(X)H_{m-1}(Y))$$
which, see the end of the last section, is $0$ when $n \neq m$ and
$$nm\,a\,(n-1)!\,a^{n-1} = n\cdot n!\,a^n = n\,E(H_n(X)H_m(Y))$$
otherwise, i.e. when $n = m$. By density of the linear span of such $H_n(X)$'s the result follows. $\Box$
Another application of Hermite polynomials is the fine structure of $C_n$. Let $p : \mathbb{N} \to \mathbb{N}_0$ be such that $|p| = \sum_n p(n) < \infty$. Fix an ONB $(e_i)$ of $H$ and set
$$H_p := \prod_n H_{p(n)}(W(e_n)), \quad (1.17)$$
well-defined since $H_0 = 1$ and $p(n) \neq 0$ only finitely often. Set $p! = \prod p(n)!$.

Proposition 6 The set
$$\Big\{\frac{1}{(p!)^{1/2}}H_p : |p| = n\Big\}$$
forms a complete orthonormal set for the $n$-th Wiener chaos $C_n$.

Note that this proposition is true for any choice of ONB in $H$.

Proof: Orthonormality is quickly checked with the orthogonality properties of the $H_n(W(h))$ seen before. Next we show that $H_p \in C_n$. We do induction on $N$, the number of non-trivial factors in (1.17). For $N = 1$ this is a consequence of (1.14). For $N > 1$, $H_p$ splits up as
$$H_p = H_q\cdot H_i \quad \text{with } H_i = H_i(W(e_j))$$
for some $i, j$, where $H_q \in S_2$ is a Wiener polynomial in which $W(e_j)$ does not appear as an argument. Randomness fixed, it follows from the orthonormality of the $e_i$'s that $DH_q \perp e_j$, hence $DH_q \perp DH_i$.
By the induction hypothesis, $H_q \in C_{|q|} = C_{n-i}$. Hence
$$LH_q = (n-i)\,H_q,$$
using the spectral decomposition of the OU operator. By (1.10),
$$L(H_p) = L(H_qH_i) = H_q\,LH_i + H_i\,LH_q - 2\langle DH_q, DH_i\rangle = H_q\,(iH_i) + H_i\,(n-i)H_q = n\,H_p,$$
hence $H_p \in C_n$. Introduce $\tilde C_n$, the closure of the span of all $H_p$'s with $|p| = n$. We saw that $\tilde C_n \subset C_n$ and we want to show equality. To this end, take any $F \in L^2(\Omega,W)$ and set
$$f_k := E\big[F\,\big|\,\sigma(W(e_1),\dots,W(e_k))\big].$$
By martingale convergence, $f_k \to F$ in $L^2$. Furthermore
$$f_k = g_k(W(e_1),\dots,W(e_k))$$
for some $g_k \in L^2(\mathbb{R}^k,\mu_k) = (L^2(\mathbb{R},\mu))^{\otimes k}$. Since the (simple) Hermite polynomials form an ONB for $L^2(\mathbb{R},\mu)$, its $k$-fold tensor product has the ONB
$$\Big\{\frac{1}{(q!)^{1/2}}\prod_{i=1}^k H_{q(i)}(x_i) : \text{all multiindices } q : \{1,\dots,k\} \to \mathbb{N}_0\Big\}.$$
Hence
$$f_k \in \bigoplus_{i=0}^\infty \tilde C_i.$$
Set $f^n_k := \pi_nf_k$; then we still have $\lim_k f^n_k = \pi_nF$, while $f^n_k \in \tilde C_n$ for all $k$. Therefore $\tilde C_n = C_n$ as claimed. $\Box$

Remarks: - Compare this ONB for $C_n$ with (1.16). Choosing $h = e_1, e_2, \dots$ in that line will not span $C_n$. The reason is that $(e_i^{\otimes n})_i$ is not a basis for $H^{\odot n}$, the symmetric tensor product space, whereas the $h^{\otimes n}$ over all unit elements $h$ do span it. For instance, look at $n = 2$. A basis is $(e_i^{\otimes 2})_i$ together with $(e_i\odot e_j)_{i<j}$, and
$$(e_i + e_j)^{\otimes 2} - e_i^{\otimes 2} - e_j^{\otimes 2} = e_i\otimes e_j + e_j\otimes e_i;$$
the last expression equals (up to a constant) $e_i\odot e_j$.
- The link between Hermite polynomials and iterated Wiener-Ito integrals can be extended to this setting. For instance,
$$H_p = H_2(W(e_1))\cdot H_1(W(e_2)) = (\text{some constant})\,J_3(e_1\odot e_1\odot e_2).$$
There is surprisingly little to be found in books about this. Of course, it's contained in Ito's original paper [Ito], but even [Oksendal2] p3.4 refers to that paper when it comes down to it.
1.15 The Stroock-Taylor formula

Going back to the WICD, most authors prove it by an iterated application of the Ito representation theorem, see section 1.7. For instance, [Oksendal2], p1.4, writes this down in detail. Let's do the first step:
$$F = EF + \int_0^1 \varphi_t\,d\beta_t$$
$$= EF + \int_0^1\Big(E(\varphi_t) + \int_0^t \varphi_2(s,t,\cdot)\,d\beta_s\Big)\,d\beta_t$$
$$= EF + \int_0^1 E(\varphi_t)\,d\beta_t + \int_{\Delta_2}\varphi_2(s,t,\omega)\,d\beta_s\,d\beta_t$$
$$= f_0 + J_1(f_1) + \int_{\Delta_2}\varphi_2(s,t,\omega)\,d\beta_s\,d\beta_t$$
when setting $f_0 = E(F)$, $f_1 = E(\varphi)$. It's not hard to see that $\int_{\Delta_2}\varphi_2(s,t,\omega)\,d\beta_s\,d\beta_t$ is orthogonal to $C_0$ and $C_1$ (the same proof as for deterministic integrands - it always boils down to the fact that an Ito integral has mean zero), hence we found the first two $f$'s of the WICD. But we also saw in section 1.7 that $\varphi_t = E(D_tF\,|\,\mathcal{F}_t)$, hence
$$f_1(t) = E(D_tF),$$
$d\lambda(t)$-a.s. and for $F \in \mathbb{D}^{1,2}$. Similarly,
$$f_2(s,t) = E(D^2_{s,t}F)$$
$d\lambda^2(s,t)$-a.s., and so for the higher $f_n$'s, provided all necessary Malliavin derivatives of $F$ exist. We have

Theorem 7 (Stroock-Taylor) Let $F \in \bigcap_k\mathbb{D}^{k,2}$; then the following refined WICD holds:
$$F = EF + \sum_{n=1}^\infty J_n\big(E(D^nF)\big) = EF + \sum_{n=1}^\infty \frac{1}{n!}I_n\big(E(D^nF)\big)$$
where
$$I_n(f) := \int_{[0,1]^n} f\,d\beta^{\otimes n} := n!\,J_n(f)$$
for any $f \in L^2(\Delta_n)$ (or symmetric $f \in L^2([0,1]^n)$); this notation is only introduced here because of its current use in other texts.

Example: Consider $F = f(W(h))$ with $\|h\|_H = 1$ and a smooth function $f$ which, together with all its derivatives, is in $L^2(\mu)$. By iteration,
$$D^nF = (\partial^nf)(W(h))\,h^{\otimes n},$$
hence
$$E(D^nF) = h^{\otimes n}\,E\big((\partial^nf)(W(h))\big) = h^{\otimes n}\,E(\partial^nf)$$
where we use the notation from 1.12, $E(\partial^nf) = \int \partial^nf\,d\mu$. Then
$$J_n\big(E(D^nF)\big) = E(\partial^nf)\,J_n(h^{\otimes n}) = E(\partial^nf)\,\frac{1}{n!}H_n(W(h)),$$
and Stroock-Taylor just says
$$f(W(h)) = E(f) + \sum_{n=1}^\infty \frac{1}{n!}E(\partial^nf)\,H_n(W(h)),$$
which is, unsurprisingly, just (1.13) evaluated at $W(h)$.
Chapter 2

Smoothness of laws

2.1

Proposition 8 Let $F = (F^1,\dots,F^m)$ be an $m$-dimensional r.v. Suppose that for all $k$ and all multiindices $\alpha$ with $|\alpha| = k$ there is a constant $c_k$ such that for all $g \in C^k(\mathbb{R}^m)$
$$|E[\partial_\alpha g(F)]| \le c_k\,\|g\|_\infty. \quad (2.1)$$
Then the law of $F$ has a $C^\infty$ density.

Proof: Let $\mu(dx) = P(F \in dx)$ and $\hat\mu$ its Fourier transform. Fix $u \in \mathbb{R}^m$ and take $g = \exp(i\langle u,\cdot\rangle)$. Then, when $|\alpha| = k$,
$$|u_\alpha|\,|\hat\mu(u)| = |E[\partial_\alpha g(F)]| \le c_k$$
(where $u_\alpha = u_{\alpha_1}\cdots u_{\alpha_k}$). For any integer $l$, by choosing the right $\alpha$'s of order $l$ and maximising the l.h.s. we see that
$$\big(\max_{i=1,\dots,m}|u_i|\big)^l\,|\hat\mu(u)| \le c_l.$$
Hence, at infinity, $\hat\mu(u)$ decays faster than any polynomial in $|u|$. On the other hand, a Fourier transform is bounded (by one); therefore $\hat\mu \in L^1(\mathbb{R}^m)$. By standard Fourier transform results we have
$$\mathcal{F}^{-1}(\hat\mu) =: f \in C_0(\mathbb{R}^m)$$
and since $\hat f = \hat\mu$, by uniqueness, $d\mu = f\,d\lambda_m$. Replacing $\alpha$ by $\alpha + (0,\dots,0,l,0,\dots,0)$ we have
$$|u_i|^l\,|u_\alpha|\,|\hat f(u)| \le c_{k+l}.$$
But since $u_\alpha\hat f(u)$ is (up to a power of $i$) the Fourier transform of $\partial_\alpha f$, we conclude as before that $\partial_\alpha f \in C_0$. $\Box$

Remark: - Having (2.1) only for $k \le m+1$ you can still conclude that $\hat\mu(u) = O(|u|^{-(m+1)})$ and hence $\hat\mu \in L^1$, therefore $d\mu = f\,d\lambda$ for continuous $f$. However, as shown in [Malliavin1], having (2.1) only for $k = 1$, i.e. only involving first derivatives, one still has $d\mu = f\,d\lambda_m$ for some $f \in L^1(\mathbb{R}^m)$.

Now one way to proceed is as follows: for all $i = 1,\dots,m$ let $F^i \in \mathbb{D}^{1,2}$ (for the moment) and take $g : \mathbb{R}^m \to \mathbb{R}$ as above. By an application of the chain rule, $j$ fixed,
$$\langle Dg(F), DF^j\rangle = \Big\langle \sum_i \partial_ig(F)\,DF^i, DF^j\Big\rangle = \sum_i \partial_ig(F)\,\langle DF^i, DF^j\rangle.$$
Introducing the Malliavin covariance matrix
$$\gamma_{ij} = \langle DF^i, DF^j\rangle \quad (2.2)$$
and assuming that
$$\gamma^{-1} \text{ exists } W\text{-a.s.}, \quad (2.3)$$
this yields, a.s.,
$$\partial_ig(F) = \sum_j(\gamma^{-1})_{ij}\,\langle Dg(F), DF^j\rangle = \Big\langle Dg(F), \sum_j(\gamma^{-1})_{ij}\,DF^j\Big\rangle$$
and hence
$$E[\partial_ig(F)] = E\big\langle Dg(F), (\gamma^{-1})_{ij}DF^j\big\rangle = E\big[g(F)\,\delta\big((\gamma^{-1})_{ij}DF^j\big)\big]$$
by definition of the divergence, while hoping that $(\gamma^{-1})_{ij}DF^j \in \mathrm{Dom}\,\delta$. In this case we have
$$E[\partial_ig(F)] \le \|g\|_\infty\,E\big[\big|\delta\big((\gamma^{-1})_{ij}DF^j\big)\big|\big]$$
and we can conclude that $F$ has a density w.r.t. Lebesgue measure $\lambda_m$. With some additional assumptions this outline is made rigorous:[1]
[1] [Nualart], p81.

Theorem 9 Suppose $F = (F^1,\dots,F^m)$, $F^i \in \mathbb{D}^{2,4}$, and $\gamma^{-1}$ exists a.s. Then $F$ has a density w.r.t. $\lambda_m$.

Under much stronger assumptions we have the following result.

Theorem 10 Suppose $F = (F^1,\dots,F^m) \in \mathbb{D}^\infty$ and $\gamma^{-1} \in L^p$ for all $p$; then $F$ has a $C^\infty$ density.
For reference in the following proof,
$$D(g(F)) = \sum_i \partial_ig(F)\,DF^i \quad (2.4)$$
$$L(g(F)) = \sum_i \partial_ig(F)\,LF^i - \sum_{ij}\partial_{ij}g(F)\,\gamma_{ij} \quad (2.5)$$
$$L(FG) = F\,LG + G\,LF - 2\langle DF, DG\rangle; \quad (2.6)$$
the last equation was already seen in (1.10). The middle equation is a simple consequence of the chain rule (2.4) and (1.8). Also, $D$, $L$ and $E$ are extended componentwise to vector- or matrix-valued r.v.; for instance $\langle DF, DF\rangle = \gamma$.

Proof: Since $0 = D(\gamma\gamma^{-1})$ and $0 = L(\gamma\gamma^{-1})$, we have
$$D(\gamma^{-1}) = -\gamma^{-1}(D\gamma)\gamma^{-1}$$
and
$$L(\gamma^{-1}) = -\gamma^{-1}(L\gamma)\gamma^{-1} - 2\big\langle \gamma^{-1}D\gamma,\ \gamma^{-1}(D\gamma)\gamma^{-1}\big\rangle.$$
Take a (scalar-valued) $Q \in \mathbb{D}^\infty$ (at first reading take $Q = 1$) and a smooth function $g : \mathbb{R}^m \to \mathbb{R}$. Then
$$E[\gamma^{-1}\langle DF, D(g\circ F)\rangle\,Q] = E[\gamma^{-1}\langle DF, DF\rangle\,(\nabla g\circ F)\,Q] = E[(\nabla g\circ F)\,Q]. \quad (2.7)$$
We also have
$$L(F\,(g\circ F)) = F\,L(g\circ F) + (LF)\,(g\circ F) - 2\langle DF, D(g\circ F)\rangle.$$
This and the self-adjointness of $L$ yield
$$E[\gamma^{-1}\langle DF, D(g\circ F)\rangle\,Q] = \frac{1}{2}E\big[\gamma^{-1}\{F\,L(g\circ F) + (LF)(g\circ F) - L(F(g\circ F))\}\,Q\big]$$
$$= \frac{1}{2}E\big[(g\circ F)\,L(\gamma^{-1}FQ) + (g\circ F)\,\gamma^{-1}(LF)\,Q - F(g\circ F)\,L(\gamma^{-1}Q)\big]$$
$$= E[(g\circ F)\,R(Q)] \quad (2.8)$$
with the random vector
$$R(Q) = \frac{1}{2}\big[L(\gamma^{-1}FQ) + \gamma^{-1}(LF)\,Q - F\,L(\gamma^{-1}Q)\big].$$
From the vector equality (2.7) = (2.8),
$$E[(\partial_ig\circ F)\,Q] = E\big[(g\circ F)\,\{e_i\cdot R(Q)\}\big],$$
with $i$-th unit vector $e_i$. Now the idea is that, together with the other assumptions, $Q \in \mathbb{D}^\infty$ implies (componentwise) $R(Q) \in \mathbb{D}^\infty$. To see this one starts with Proposition 3, but then some more information about $L$ and its action on $\mathbb{D}^\infty$ is required. We don't go into details here, but see [Bass] and [IW].
The rest is easy: taking $Q = 1$ yields
$$|E[\partial_ig\circ F]| \le c_1\,\|g\|_\infty,$$
and the nice thing is that we can simply iterate: taking $Q = e_j\cdot R(1)$ we get
$$E[\partial_{ji}g\circ F] = E\big[(\partial_ig\circ F)(e_j\cdot R(1))\big] = E\big[(g\circ F)\,e_i\cdot R(e_j\cdot R(1))\big]$$
and one concludes as before. Obviously we can continue by induction. Hence, by the first proposition of this section we get the desired result. $\Box$
Chapter 3

Degenerate Diffusions

3.1 Malliavin Calculus on the d-dimensional Wiener Space

Generalizing the setup of Chapter 1, we call
$$\Omega = C([0,1],\mathbb{R}^d)$$
the $d$-dimensional Wiener space. Under the $d$-dimensional Wiener measure on $\Omega$ the coordinate process becomes a $d$-dimensional Brownian motion, $(\beta^1,\dots,\beta^d)$. The reproducing kernel space is now
$$H = L^2([0,1],\mathbb{R}^d) = L^2[0,1]\oplus\dots\oplus L^2[0,1] \quad (d \text{ copies}).$$
As in Chapter 1, the Malliavin derivative of a real-valued r.v. $X$ can be considered as an $H$-valued r.v. Hence we can write
$$DX = (D^1X,\dots,D^dX).$$
For an $m$-dimensional random variable $X = (X^i)$ set
$$DX = (D^jX^i)_{ij},$$
which appears as an $(m\times d)$-matrix of $L^2[0,1]$-valued r.v. The Malliavin covariance matrix, as introduced in Chapter 2, reads
$$\gamma_{ij} = \langle DX^i, DX^j\rangle_H = \sum_{k=1}^d\langle D^kX^i, D^kX^j\rangle_{L^2[0,1]},$$
or simply
$$\gamma = \langle DX, (DX)^T\rangle_{L^2[0,1]}. \quad (3.1)$$
3.2 The problem

Given vector fields $A_1,\dots,A_d, B$ on $\mathbb{R}^m$, consider the SDE
$$dX_t = A_j(X_t)\,d\beta^j_t + B(X_t)\,dt \quad (3.2)$$
(summation over $j = 1,\dots,d$). For some fixed $t > 0$ (and actually $t \le 1$ due to our choice of $\Omega$) we want to investigate the regularity of the law of $X(t)$, i.e. existence and smoothness of a density with respect to $\lambda_m$ on $\mathbb{R}^m$. We assume all the coefficients to be as nice as we need (smooth, bounded, bounded derivatives etc.). Indeed, the degeneracy we are interested in lies somewhere else: taking all coefficients zero, the law of $X(t)$ is just the Dirac measure at $X(0) = x$; in particular there doesn't exist a density.
3.3 SDEs and Malliavin Calculus, the 1-dimensional case

For simplicity take $m = d = 1$ and consider
$$X_t = x + \int_0^t a(X_s)\,d\beta_s + \int_0^t b(X_s)\,ds. \quad (3.3)$$
Our strategy is to assume[1] that all $X_s$ are in the domain of $D$ and then to bring $D$ under the integrals. To this end recall from section 1.7 that for fixed $s$ and an $\mathcal{F}_s$-measurable r.v. $F$ one has $D_rF = 0$ for $\lambda$-a.e. $r > s$.
[1] For a proof see [IW], p393.
Let $u(s,\omega)$ be some $\mathcal{F}_s$-adapted process, and let $r \le t$. Then
$$D_r\int_0^t u(s)\,ds = \int_0^t D_ru(s)\,ds = \int_r^t D_ru(s)\,ds;$$
the first step can be justified by a Riemann-sum approximation and the closedness of the operator $D$. The stochastic integral is more interesting; we restrict ourselves to a simple adapted process[2] of the form
$$u(t,\omega) = F(\omega)\,h(t)$$
with $h(t) = 1_{(s_1,s_2]}(t)$ and $\mathcal{F}_{s_1}$-measurable $F$. Again, let $r \le t$. Then
$$D_r\int_0^t Fh(s)\,d\beta(s) = D_r\Big(\int_{[0,r)}Fh(s)\,d\beta(s) + \int_{[r,t]}Fh(s)\,d\beta(s)\Big)$$
$$= 0 + D_r\int_0^1 Fh(s)1_{[r,t]}(s)\,d\beta(s)$$
$$= D_r\big(F\,W(h1_{[r,t]})\big)$$
$$= (D_rF)\,W(h1_{[r,t]}) + F\,h(r)$$
$$= \int_0^1(D_rF)\,h(s)1_{[r,t]}(s)\,d\beta(s) + u(r)$$
$$= u(r) + \int_r^t D_ru(s)\,d\beta(s). \quad (*)$$
[2] We already proceeded like this in section 1.9 when computing $\delta(u)$.
Let us comment on this result. First, if it makes you uncomfortable that our only-a.s.-well-defined little $r$ pops up in intervals, rewrite the preceding computation in integrated form, i.e. multiply everything with some arbitrary deterministic $L^2[0,1]$-function $k = k(r)$ and integrate $r$ over $[0,1]$. (Hint: interchange integration w.r.t. $d\beta_s$ and $dr$.)
Secondly, a few words about $(*)$. The reduction from $\int_0^t$ on the l.h.s. to $\int_r^t$ at the end is easy to understand - see the recall above. Next, taking $t = r + \varepsilon$ we can, at least formally, reduce $(*)$ to $u(r)$ alone. Also, the l.h.s. is easily seen to equal $D_r\int_r^t$. That is, when operating $D_r$ on $\int_r^{r+\varepsilon}u\,d\beta$ we somehow create a Dirac point mass $\delta_r(s)$. But that is not surprising! Formally, $D_rY = \langle DY, \delta_r\rangle$, corresponding to a (non-admissible!) perturbation of $\omega$ by a Heaviside function with jump at $r$, say $H(\cdot - r)$, with derivative $\delta_r$. Now, very formally, we interpret $\omega$ as a Brownian path perturbed in direction $H(\cdot - r)$. Taking differentials for use in the stochastic integral we find the Dirac mass $\delta_r$ appearing. (A detailed proof is found in [Oksendal2], Corollary 5.13.)
Back to our SDE: applying these results to (3.3) we get
$$D_rX_t = a(X_r) + \int_r^t D_ra(X_s)\,d\beta_s + \int_r^t D_rb(X_s)\,ds$$
$$= a(X_r) + \int_r^t a'(X_s)\,D_rX(s)\,d\beta_s + \int_r^t b'(X_s)\,D_rX(s)\,ds.$$
Fix $r$ and set $\tilde X := D_rX$. We found the (linear!) SDE
$$d\tilde X_t = a'(X_t)\,\tilde X_t\,d\beta_t + b'(X_t)\,\tilde X_t\,dt, \quad t > r, \quad (3.4)$$
with initial condition $\tilde X_r = a(X_r)$.
3.4 Stochastic Flow, the 1-dimensional case

A similar situation occurs when investigating the sensitivity of (3.3) w.r.t. the initial condition $X(0) = x$. Set
$$Y(t) = \partial_xX(t).$$
(A nice version of) $X(t,x)$ is called the stochastic flow.
A formal computation (see [Bass], p30 for a rigorous proof) gives the same SDE
$$dY_t = a'(X_t)\,Y_t\,d\beta_t + b'(X_t)\,Y_t\,dt, \quad t > 0,$$
and clearly $Y(0) = 1$. Matching this with (3.4) yields
$$D_rX(t) = Y(t)\,Y^{-1}(r)\,a(X(r)). \quad (3.5)$$
Remark: In the multidimensional setting note that, for fixed $r$,
$$D_rX(t) \in \mathbb{R}^{m\times d} \quad \text{while} \quad Y(t) \in \mathbb{R}^{m\times m}.$$
([Bass] actually makes the choice $m = d$ for a simpler exposure.)
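A consistency check of (3.5) on an explicitly solvable example (our own illustration): for geometric Brownian motion, $a(x) = \sigma x$, $b \equiv 0$, we have $X_t = x\exp(\sigma\beta_t - \sigma^2t/2)$ and $Y(t) = \partial_xX_t = X_t/x$. Then (3.5) gives, for $r \le t$,
$$D_rX_t = \frac{X_t}{x}\cdot\frac{x}{X_r}\cdot\sigma X_r = \sigma X_t,$$
in agreement with differentiating the explicit solution in a Cameron-Martin direction: $\frac{d}{d\varepsilon}\big|_0 X_t(\omega + \varepsilon\tilde h) = \sigma X_t\int_0^t h\,d\lambda$, i.e. $D_rX_t = \sigma X_t\,1_{[0,t]}(r)$.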
3.5 SDE/flows in multidimensional setting

Rewrite (3.2) in coordinates
$$dX^i = A^i_k(X)\,d\beta^k + B^i(X)\,dt, \quad i = 1,\dots,m, \quad (3.6)$$
with initial condition $X(0) = x = (x^j) \in \mathbb{R}^m$. Set
$$(Y)_{ij} = \partial_jX^i \equiv \frac{\partial}{\partial x^j}X^i.$$
As before (formally)
$$d\,\partial_jX^i = \partial_lA^i_k\,\partial_jX^l\,d\beta^k + \partial_lB^i\,\partial_jX^l\,dt.$$
To simplify notation, for any vector field $V$ on $\mathbb{R}^m$, considered as a map $\mathbb{R}^m \to \mathbb{R}^m$, we set[3]
$$(\partial V)_{ij} = \partial_jV^i. \quad (3.7)$$
[3] If you know classical tensor calculus it is clear that $\partial_jV^i$ corresponds to a matrix where $i$ represents the lines and $j$ the columns.
This yields the following $(m\times m)$-matrix SDE
$$dY = \partial A_k(X)\,Y\,d\beta^k + \partial B(X)\,Y\,dt, \quad Y(0) = I,$$
and there is no ambiguity in this notation. Note that this is (as before) a linear SDE. We will be interested in the inverse $Z := Y^{-1}$. As a motivation, consider the following 1-dimensional ODE:
$$dy = f(t)\,y\,dt.$$
Clearly $z = 1/y$ satisfies
$$dz = -f(t)\,z\,dt.$$
We can recover the same simplicity in the multidimensional SDE case by using Stratonovich calculus, a first-order stochastic calculus.
3.6 Stratonovich Integrals

3.6.1

Let $M, N$ be continuous semimartingales; define[4]
$$\int_0^t M_s\circ dN_s = \int_0^t M_s\,dN_s + \frac{1}{2}\langle M, N\rangle_t$$
resp.
$$M_t\circ dN_t = M_t\,dN_t + \frac{1}{2}\,d\langle M, N\rangle_t.$$
[4] Do not mix up the bracket with the inner product on Hilbert spaces.
The Ito formula becomes
$$f(M_t) = f(M_0) + \int_0^t f'(M_s)\circ dM_s. \quad (3.8)$$
See [Bass] p27 or any modern account of semimartingales for these results. A special case occurs when $M$ is given by the SDE
$$dM_t = u_t\,d\beta_t + v_t\,dt \quad \text{or} \quad dM_t = u_t\circ d\beta_t + \tilde v_t\,dt.$$
Then
$$M_t\circ d\beta_t = M_t\,d\beta_t + \frac{1}{2}u_t\,dt. \quad (3.9)$$
One could take this as a definition ([Nualart], p21 does this).
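The difference between the two integrals is easy to see numerically. A minimal sketch (our own illustration; grid size and seed are arbitrary): for $\int_0^1\beta\,d\beta$, left-point Riemann sums converge to the Ito integral $(\beta(1)^2-1)/2$, while averaging the endpoint values of $\beta$ (the Stratonovich prescription) yields $\beta(1)^2/2$; the difference is $\frac{1}{2}\langle\beta,\beta\rangle_1 = \frac{1}{2}$.

import numpy as np

# Left-point vs averaged-endpoint Riemann sums for int_0^1 beta dbeta.
rng = np.random.default_rng(5)
n = 10**5
db = rng.standard_normal(n) * np.sqrt(1 / n)     # Brownian increments
b = np.concatenate([[0.0], np.cumsum(db)])       # beta on the grid
ito = np.sum(b[:-1] * db)                        # left-point (Ito) sum
strat = np.sum(0.5 * (b[:-1] + b[1:]) * db)      # Stratonovich sum
print(ito, (b[-1]**2 - 1) / 2)
print(strat, b[-1]**2 / 2)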
3.6.2

Of course there is a multidimensional version of (3.8) (write it down!). For instance, let $V : \mathbb{R}^m \to \mathbb{R}^m$ and $X$ some $m$-dimensional process; then
$$dV(X) = (\partial V)(X)\circ dX. \quad (3.10)$$
It also implies a first-order product rule
$$d(MN) = N\circ dM + M\circ dN$$
where $M, N$ are (real-valued) semimartingales.
For later use we discuss a slight generalization. Let $Y, Z$ be two matrix-valued semimartingales (with dimensions such that $ZY$ makes sense). Define $d(ZY)$ component-wise. Then
$$d(ZY) = (\circ\,dZ)\,Y + Z\circ dY. \quad (3.11)$$
This might look confusing at first glance, but it simply means
$$Z^i_k(t)\,Y^k_j(t) = Z^i_k(0)\,Y^k_j(0) + \int_0^t Y^k_j\circ dZ^i_k + \int_0^t Z^i_k\circ dY^k_j.$$

3.6.3

Let $M, N, O, P$ be semimartingales and
$$dP = N\,dO.$$
Then it is well known that[5]
$$M\,dP = MN\,dO. \quad (3.12)$$
[5] [KS], p145.
A similar formula, less well known, holds for Stratonovich differentials. Let
$$dP = N\circ dO;$$
then
$$M\circ dP = MN\circ dO. \quad (3.13)$$
Proof: The l.h.s. equals $M\,dP + \frac{1}{2}d\langle M, P\rangle = M(N\,dO + \frac{1}{2}d\langle N, O\rangle) + \frac{1}{2}d\langle M, P\rangle$, so the only thing to show is
$$M\,d\langle N, O\rangle + d\langle M, P\rangle = d\langle MN, O\rangle.$$
Now $d\langle M, P\rangle = N\,d\langle M, O\rangle$ ([KS], p143). On the other hand,
$$d(MN) = M\,dN + N\,dM + d(\text{bounded variation})$$
shows $d\langle MN, O\rangle = M\,d\langle N, O\rangle + N\,d\langle M, O\rangle$ (since the bracket kills the bounded variation parts) and we are done. $\Box$
3.7 Some differential geometry jargon

3.7.1 Covariant derivatives

Given two smooth vector fields $V$, $W$ (on $\mathbb{R}^m$) and using (3.7),
$$(\partial V)W = W^j\,\partial_jV^i\,\partial_i,$$
where we follow the differential-geometry usage of denoting the standard basis by $(\partial_1,\dots,\partial_m)$. This simply means that $(\partial V)W$ is a vector whose $i$-th component is $W^j\partial_jV^i$. Also, we recognize directional derivatives (in direction $W$) on the r.h.s. In Riemannian geometry this is known as the covariant derivative[6] of $V$ in direction $W$. A standard notation is
$$\nabla_WV = W^j\,\partial_jV^i\,\partial_i.$$
$\nabla$ is called a connection.
[6] On a general Riemannian manifold there is an additional term due to curvature. Clearly, curvature is zero on $\mathbb{R}^m$.

3.7.2 The Lie Bracket

Let $V, W$ be as before. It is common in differential geometry that a vector $V(x) = (V^i(x))$ is identified with the first-order differential operator
$$V(x) = V^i(x)\,\partial_i|_x.$$
Consider the ODE on $\mathbb{R}^m$ given by
$$dX = V(X)\,dt.$$
It is known[7] that there exists (at least locally) an integral curve. More precisely, for every $x \in \mathbb{R}^m$ there exists some open (time) interval $I_x$ around $0$ and a smooth curve $X_x : I_x \to \mathbb{R}^m$ which satisfies the ODE and the initial condition $X_x(0) = x$. By setting
$$V_t(x) = X_x(t)$$
we obtain a so-called local 1-parameter group. For $t$ fixed, $V_t(\cdot)$ is a diffeomorphism between appropriate open sets. [Warner] p37 proves all this, including existence, on a general manifold.
[7] A simple consequence of the standard existence/uniqueness result for ODEs.
Consider a second ODE, say $dY = W(Y)\,dt$, with local one-parameter group $W_t(\cdot)$. Then, for $t$ small enough everything exists, and a second-order expansion yields
$$W_{-t}\circ V_{-t}\circ W_t\circ V_t(x) - x \sim t^2.$$
Dividing the l.h.s. by $t^2$ and letting $t \to 0$ one obtains a limit in $\mathbb{R}^m$ depending on $x$, say $[V,W](x)$, the so-called Lie bracket. It measures how much the two flows fail to commute (infinitesimally).
Considering $[V,W]$ as a first-order operator one actually finds
$$[V,W] = V\circ W - W\circ V,$$
where the r.h.s. is to be understood as composition of differential operators. Note that the r.h.s. is indeed a first-order operator, since $\partial_{ij} = \partial_{ji}$ (when operating on smooth functions, as here).
It is immediate to check that[8]
$$[V,W] = \nabla_VW - \nabla_WV = (\partial W)V - (\partial V)W.$$
Generally speaking, whenever two vector fields are "mixed together", the Lie bracket is likely to appear.
[8] In Riemannian geometry, the first equation is known as the torsion-free property of a Riemannian connection $\nabla$.

Example: Let $A$ be a vector field. Inspired by section 3.5 consider
$$dX = A(X)\,dt, \quad X(0) = x,$$
and
$$dY = \partial A(X)\,Y\,dt, \quad Y(0) = I.$$
Consider the matrix ODE
$$dZ = -Z\,\partial A(X)\,dt, \quad Z(0) = I. \quad (3.14)$$
By computing $d(ZY) = (-Z\,\partial A(X)\,dt)Y + Z(\partial A(X)\,Y\,dt) = 0$ we see that $Y^{-1}$ exists for all times and $Z = Y^{-1}$. Without special motivation, but for later use, we compute
$$d[Z_tV(X_t)] = (dZ_t)\,V(X_t) + Z_t\,dV(X_t)$$
$$= \big(-Z_t\,\partial A(X_t)\,V(X_t) + Z_t\,\partial V(X_t)\,A(X_t)\big)\,dt$$
$$= Z_t\big((\partial V)(X_t)A(X_t) - (\partial A)(X_t)V(X_t)\big)\,dt$$
$$= Z_t\,[A,V](X_t)\,dt. \quad (3.15)$$
3.8 Our SDEs in Stratonovich form

Recall
$$dX = A_k(X)\,d\beta^k + B(X)\,dt = A_k(X)\circ d\beta^k + A_0(X)\,dt \quad (3.16)$$
with $X(0) = x$. It is easy to check that
$$A_0^i = B^i - \frac{1}{2}A^j_k\,\partial_jA^i_k, \quad i = 1,\dots,m.$$
In the notations introduced in the last two sections,
$$A_0 = B - \frac{1}{2}(\partial A_k)A_k = B - \frac{1}{2}\nabla_{A_k}A_k.$$
With $Y$ defined as in section 3.5 we obtain
$$dY = \partial A_k(X)\,Y\circ d\beta^k + \partial A_0(X)\,Y\,dt, \quad Y(0) = I,$$
and $Z = Y^{-1}$ exists for all times and satisfies a generalized version of (3.14),
$$dZ = -Z\,\partial A_k(X)\circ d\beta^k - Z\,\partial A_0(X)\,dt, \quad Z(0) = I. \quad (3.17)$$
The proof goes along (3.14), using (3.11): since we already discussed the deterministic version we restrict to the case where $A_0 \equiv 0$. Then
$$d(ZY) = (dZ)\,Y + Z\circ dY = -Z\,\partial A_k(X)\,Y\circ d\beta^k + Z\,\partial A_k(X)\,Y\circ d\beta^k = 0.$$
(References for this and the next section are [Bass] p199-201, [Nualart] p109-113 and [IW] p393.)
3.9 The Malliavin Covariance Matrix

Define the $(m\times d)$ matrix
$$\sigma = (A_1 \,|\, \dots \,|\, A_d).$$
Then a generalization of (3.5) holds (see [Nualart] p109 for details):
$$D_rX(t) = Y(t)\,Y^{-1}(r)\,\sigma(X_r) = Y(t)\,Z(r)\,\sigma(X_r),$$
and $D_rX(t)$ appears as a (random) $\mathbb{R}^{m\times d}$-matrix, as already remarked at the end of section 3.4. Fix $t$ and write $X = X(t)$. From (3.1), the Malliavin covariance matrix equals
$$\gamma = \gamma_t = \int_0^1 D_rX\,(D_rX)^T\,dr = Y(t)\Big[\int_0^t Z(r)\,\sigma(X_r)\,\sigma^T(X_r)\,Z^T(r)\,dr\Big]\,Y^T(t). \quad (3.18)$$
3.10 Absolute continuity under Hormander's condition

We need a generalization of (3.15).

Lemma 11 Let $V$ be a smooth vector field on $\mathbb{R}^m$. Let $X$ and $Z$ be the processes given by the Stratonovich SDEs (3.16) and (3.17). Then
$$d(Z_tV(X_t)) = Z_t[A_k,V](X_t)\circ d\beta^k + Z_t[A_0,V](X_t)\,dt \quad (3.19)$$
$$= Z_t[A_k,V](X_t)\,d\beta^k + Z_t\Big(\frac{1}{2}[A_k,[A_k,V]] + [A_0,V]\Big)(X_t)\,dt \quad (3.20)$$
(summation over $k = 1,\dots,d$).

First observe that the second equality is a simple application of (3.9) and (3.19) with $V$ replaced by $[A_k,V]$. To see the first equality one could just point at (3.15) and argue with "first-order Stratonovich calculus". Here is a rigorous
Proof: Since the deterministic case was already considered in (3.15), we take w.l.o.g. $A_0 \equiv 0$. Using (3.10) and (3.11) we find
$$d(ZV(X)) = (dZ)\,V(X) + Z\circ dV(X) = -(Z\,\partial A_k\,V)\circ d\beta^k + (Z\,(\partial V)A_k)\circ d\beta^k = Z\,[A_k,V]\big|_X\circ d\beta^k. \qquad \Box$$
If you don't like the Stratonovich differentials, a (straightforward but quite tedious) computation via standard Ito calculus is given in [Nualart], p113.

Corollary 12 [9] Let $\tau$ be a stopping time and $y \in \mathbb{R}^m$ such that
$$\langle Z_tV(X_t), y\rangle_{\mathbb{R}^m} \equiv 0 \quad \text{for } t \in [0,\tau].$$
Then, for $i = 0,1,\dots,d$,
$$\langle Z_t[A_i,V](X_t), y\rangle_{\mathbb{R}^m} \equiv 0 \quad \text{for } t \in [0,\tau].$$
[9] Compare [Bell], p75.

Proof: Let's prove
$$Z_tV(X_t) \equiv 0 \implies Z_t[A_i,V](X_t) \equiv 0.$$
(The proof of the actual statement goes along the same lines.) First, the assumption implies that
$$Z_t[A_k,V](X_t)\,d\beta^k + Z_t\Big(\frac{1}{2}[A_k,[A_k,V]] + [A_0,V]\Big)(X_t)\,dt \equiv 0 \quad \text{for } t \in [0,\tau].$$
By uniqueness of the semimartingale decomposition into a (local) martingale part and a bounded variation part we get (always for $t \in [0,\tau]$)
$$Z_t[A_k,V](X_t) \equiv 0 \quad \text{for } k = 1,\dots,d$$
and
$$Z_t\Big(\frac{1}{2}[A_k,[A_k,V]] + [A_0,V]\Big)(X_t) \equiv 0.$$
By iterating this argument on the first relation,
$$Z_t[A_k,[A_j,V]](X_t) \equiv 0 \quad \text{for } k,j = 1,\dots,d,$$
and together with the second relation we find
$$Z_t[A_0,V](X_t) \equiv 0$$
and we are done. $\Box$
In the following, "range" denotes the image $\Theta(\mathbb{R}^m) \subset \mathbb{R}^m$ of some (random, time-dependent) $m\times m$-matrix $\Theta$.

Theorem 13 Recalling $X(0) = x$, for any $t > 0$ it holds that
$$\mathrm{span}\big\{A_1|_x,\dots,A_d|_x,\ [A_j,A_k]|_x,\ [[A_j,A_k],A_l]|_x,\ \dots;\ j,k,l = 0,\dots,d\big\} \subset \mathrm{range}\,\gamma_t \quad \text{a.s.}$$

Proof: For all $s \le t$ define
$$R_s = \mathrm{span}\,\{Z(r)A_i(X_r) : r \in [0,s],\ i = 1,\dots,d\}$$
and
$$R = R(\omega) = \bigcap_{s>0}R_s.$$
We claim that $R_t = \mathrm{range}\,\gamma_t$. From (3.18) it follows that
$$\mathrm{range}\,\gamma_t = \mathrm{range}\int_0^t Z(r)\,\sigma(X_r)\,\sigma^T(X_r)\,Z^T(r)\,dr \quad (3.21)$$
and since, for any $r \le t$ fixed, $\mathrm{span}\{Z(r)A_i(X_r) : i = 1,\dots,d\} = \mathrm{range}\,Z(r)\sigma(X_r) \supset \mathrm{range}\,Z(r)\sigma(X_r)\sigma^T(X_r)Z^T(r)$, the inclusion $\mathrm{range}\,\gamma_t \subset R_t$ is clear. On the other hand take some $v \in \mathbb{R}^m$ orthogonal to $\mathrm{range}\,\gamma_t$. Clearly
$$v^T\gamma_tv = 0.$$
Using (3.21) we actually have
$$\int_0^t|v^TZ(r)\,\sigma(X_r)|^2_{\mathbb{R}^d}\,dr = \sum_{k=1}^d\int_0^t|v^TZ(r)A_k(X_r)|^2\,dr = 0.$$
Since the diffusion paths are continuous, the whole integrand is continuous, and we deduce that, for all $k$ and all $r \le t$,
$$v \perp Z(r)A_k(X_r).$$
We showed $(\mathrm{range}\,\gamma_t)^\perp \subset (R_t)^\perp$, and hence the claim is proved.
Now, by Blumenthal's 0-1 law there exists a (deterministic) set $\tilde R$ such that $\tilde R = R(\omega)$ a.s. Suppose that $y \in \tilde R^\perp$. Then a.s. there exists a stopping time $\tau > 0$ such that $R_s = \tilde R$ for $s \in [0,\tau]$. This means that for all $i = 1,\dots,d$ and for all $s \in [0,\tau]$
$$\langle Z_sA_i(X_s), y\rangle_{\mathbb{R}^m} = 0,$$
or simply $y \perp Z_sA_i(X_s)$. Moreover, by iterating Corollary 12 we get
$$y \perp Z_s[A_j,A_k](X_s),\ Z_s[[A_j,A_k],A_l](X_s),\ \dots$$
for all $s \in [0,\tau]$. Calling $S$ the span appearing on the l.h.s. of the statement of the theorem and using the last result at $s = 0$ (where $Z_0 = I$ and $X_0 = x$), we see that $y \in S^\perp$. So we showed $S \subset \tilde R$. On the other hand, it is clear that a.s. $\tilde R \subset R_t = \mathrm{range}\,\gamma_t$, as we saw earlier. The proof is finished. $\Box$
Combining this with Theorem 9 we conclude

Theorem 14 Let $A_0,\dots,A_d$ be smooth vector fields (satisfying certain boundedness conditions[10]) on $\mathbb{R}^m$ which satisfy Hormander's condition (H1), that is,[11]
$$A_1|_x,\dots,A_d|_x,\ [A_j,A_k]|_x,\ [[A_j,A_k],A_l]|_x,\ \dots;\quad j,k,l = 0,\dots,d \quad (3.22)$$
span the whole space $\mathbb{R}^m$. Equivalently we can write
$$\mathrm{Lie}\,\{A_1|_x,\dots,A_d|_x,\ [A_1,A_0]|_x,\dots,[A_d,A_0]|_x\} = \mathbb{R}^m. \quad (3.23)$$
Fix $t > 0$ and let $X_t$ be the solution of the SDE
$$dX_t = \sum_{k=1}^dA_k(X_t)\circ d\beta^k_t + A_0(X_t)\,dt, \quad X(0) = x.$$
Then the law of $X(t)$, i.e. the measure $P[X(t) \in dy]$, has a density w.r.t. Lebesgue measure on $\mathbb{R}^m$.
[10] Bounded with bounded derivatives will do - we have to guarantee existence and uniqueness of $X$ and $Y$ as solutions of the corresponding SDEs.
[11] Note that $A_0|_x$ alone is not contained in the list, while it does appear in all the brackets.
3.11 Smoothness under Hormander's condition

Under essentially the same hypothesis as in the last theorem[12] one actually has a smooth density of $X(t)$, i.e. $\in C^\infty(\mathbb{R}^m)$. The idea is clearly to use Theorem 10, but there is some work to do. We refer to [Norris], [Nualart], [Bass] and [Bell].
[12] One requires that the vector fields have bounded derivatives of all orders, since higher-order analogues of $Y$ come into play.
3.12 The generator

It is well known that the generator of a (Markov) process given by the SDE
$$dX = A_k(X)\circ d\beta^k + A_0(X)\,dt = A_k(X)\,d\beta^k + B(X)\,dt = \sigma\,d\beta + B(X)\,dt \quad (3.24)$$
is the second-order differential operator
$$L = \frac{1}{2}E^{ij}\,\partial_{ij} + B^i\,\partial_i \quad (3.25)$$
with $(m\times m)$-matrix $E = \sigma\sigma^T$ (or $E^{ij} = \sum_{k=1}^dA^i_kA^j_k$ in coordinates). Identifying a vector field, say $V$, with a first-order differential operator, the expression $V^2 = V\circ V$ makes sense as a second-order differential operator. In coordinates,
$$V^i\partial_i(V^j\partial_j) = V^iV^j\,\partial_{ij} + V^j(\partial_jV^i)\,\partial_i.$$
Note that the last term on the r.h.s. is the vector $(\partial V)V = \nabla_VV$. Replacing $V$ by $A_k$ and summing over all $k$ we see that
$$E^{ij}\partial_{ij} = \sum_{k=1}^d A_k^2 - \nabla_{A_k}A_k.$$
We recall (see section 3.8) that $A_0 = B - \frac{1}{2}\sum\nabla_{A_k}A_k$. Hence
$$L = \frac{1}{2}\sum_{k=1}^dA_k^2 + A_0. \quad (3.26)$$
Besides giving another justification of the Stratonovich calculus, it is important to notice that this "sum of squares" form is invariant under coordinate transformations, hence a suitable operator for analysis on manifolds.
3.13

Example 1 (bad): Given two vector fields on $\mathbb{R}^2$ (in first-order differential operator notation)
$$A_1 = x_1\partial_1 + \partial_2, \quad A_2 = \partial_2,$$
set
$$L = \frac{1}{2}(A_1^2 + A_2^2).$$
Expanding,
$$(x_1\partial_1 + \partial_2)^2 = x_1^2\,\partial_{11} + x_1\,\partial_1 + 2x_1\,\partial_{12} + \partial_{22},$$
yields
$$L = \frac{1}{2}E^{ij}\partial_{ij} + B^i\partial_i$$
with
$$E = \begin{pmatrix} x_1^2 & x_1 \\ x_1 & 2 \end{pmatrix} \quad \text{and} \quad B = (x_1/2,\ 0)^T.$$
Now $E = \sigma\sigma^T$ with
$$\sigma = \begin{pmatrix} x_1 & 0 \\ 1 & 1 \end{pmatrix}$$
and the associated diffusion process is
$$dX_t = \sigma(X_t)\,d\beta_t + B(X_t)\,dt = A_1(X_t)\,d\beta^1_t + A_2(X_t)\,d\beta^2_t + B(X_t)\,dt.$$
We see that when we start from the $x_2$-axis, i.e. on $\{x_1 = 0\}$, there is no drift, $B \equiv 0$, and both Brownian motions push us around along the direction $x_2$ only; therefore there is no chance of ever leaving this axis again. Clearly, in such a situation, the law of $X(t)$ is singular with respect to Lebesgue measure on $\mathbb{R}^2$.
To check Hormander's condition H1 compute
$$[A_1, A_2] = (x_1\partial_1 + \partial_2)\,\partial_2 - \partial_2\,(x_1\partial_1 + \partial_2) = 0;$$
therefore the Lie algebra generated by $A_1$ and $A_2$ simply equals span$\{A_1, A_2\}$ and is not all of $\mathbb{R}^2$ when evaluated on the degenerate set $\{x_1 = 0\}$ - exactly as expected.
Example 2 (good): Same setting but
$$A_1 = x_2\partial_1 + \partial_2, \quad A_2 = \partial_2.$$
Again,
$$L = \frac{1}{2}(A_1^2 + A_2^2) = \frac{1}{2}E^{ij}\partial_{ij} + B^i\partial_i.$$
Similarly we find
$$dX_t = A_1(X_t)\,d\beta^1_t + A_2(X_t)\,d\beta^2_t + B(X_t)\,dt$$
with drift $B = (1/2,\ 0)^T$.
The situation looks similar: on the $x_1$-axis, where $\{x_2 = 0\}$, we have $A_1 = A_2$, therefore diffusion happens in the $x_2$-direction only. However, when we start at $\{x_2 = 0\}$ we are pushed in the $x_2$-direction and hence immediately leave the degenerate area.
To check Hormander's condition H1 compute
$$[A_1, A_2] = (x_2\partial_1 + \partial_2)\,\partial_2 - \partial_2\,(x_2\partial_1 + \partial_2) = -\partial_1.$$
See that
$$\mathrm{span}(A_2, [A_1, A_2]) = \mathbb{R}^2$$
at all points, and our Theorem 14 applies.
Example 3 (How many driving BMs?):
Consider the $m = 2$-dimensional process driven by one BM ($d = 1$),
$$dX^1 = d\beta, \quad dX^2 = X^1\,dt.$$
From this extract $A_1 = \partial_1$, drift $A_0 = x_1\partial_2$, and since $[A_1, A_0] = \partial_2$, Hormander's condition holds at all points of $\mathbb{R}^2$. Actually, it is an easy exercise to see that (for $X(0) = 0$) $(X^1, X^2)$ is a zero-mean Gaussian process with covariance matrix
$$\begin{pmatrix} t & t^2/2 \\ t^2/2 & t^3/3 \end{pmatrix};$$
indeed $E[\beta_t\int_0^t\beta_s\,ds] = \int_0^t s\,ds = t^2/2$ and $E[(\int_0^t\beta_s\,ds)^2] = \int_0^t\!\int_0^t(r\wedge s)\,dr\,ds = t^3/3$. Hence we can write down explicitly a density with respect to 2-dimensional Lebesgue measure. Generally, one BM together with the right drift is enough to have a density.
Example 4 (Is Hormander's condition necessary?):
No! Take $f : \mathbb{R} \to \mathbb{R}$ smooth, bounded etc. such that $f^{(n)}(0) = 0$ for all $n \ge 0$ (in particular $f(0) = 0$) but $f \neq 0$ away from the origin - think of $f(x) = e^{-1/x^2}$ - and look at
$$2L = (\partial_1)^2 + \big(f(x_1)\,\partial_2\big)^2,$$
as arising from $m = d = 2$, $A_1 = \partial_1$, $A_2 = f(x_1)\partial_2$. Check that $A_2, [A_1,A_2], \dots$ are all $0$ when evaluated at $x_1 = 0$ (simply because the Lie brackets make all derivatives of $f$ appear). Hence Hormander's condition is not satisfied when starting from the degenerate region $\{x_1 = 0\}$. On the other hand, due to $A_1$ we will immediately leave the degenerate region, and hence there is a density (the same argument as in Example 2).
37
Chapter 4
Hypelliptic PDEs
4.1

Let $V_0, \dots, V_d$ be smooth vector fields on some open $U \subset \mathbb{R}^n$, and let $c$ be a smooth function on $U$. Define the second order differential operator (where $c$ operates by multiplication)

$$ G := \sum_{k=1}^d V_k^2 + V_0 + c. $$

Let $f, g \in \mathcal{D}'(U)$, and assume

$$ Gf = g $$

in the distributional sense, which means (by definition)

$$ \langle f, G^*\phi \rangle = \langle g, \phi \rangle $$

for all test-functions $\phi \in \mathcal{D}(U)$. We call the operator $G$ hypoelliptic if, for all open $V \subset U$,

$$ g|_V \in C^\infty(V) \implies f|_V \in C^\infty(V). $$
Hörmander's Theorem, as proved in [Kohn], states:

Theorem 15 Assume

$$ \mathrm{Lie}\,\{V_0|_y, \dots, V_d|_y\} = \mathbb{R}^n $$

for all $y \in U$. Then the operator $G$ as given above is hypoelliptic.

Remark: An example showing that Hörmander's Theorem is a sufficient condition for hypoellipticity but not a necessary one goes along Example 4 from the last chapter.
4.2

Take $X$ as in section (3.8), take $U = (0,\infty) \times \mathbb{R}^m$ and let $\phi \in \mathcal{D}(U)$. For $T$ large enough

$$ E[\phi(T, X_T)] = E[\phi(0, X_0)] = 0, $$

hence, by Ito's formula,

$$ 0 = E\int_0^T (\partial_t + L)\phi(t, X_t)\,dt. $$
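In full, the Ito-formula step behind this identity reads (a standard expansion, added here for completeness; $\nabla_y$ denotes the gradient in the space variables):

$$ \phi(T, X_T) - \phi(0, X_0) = \int_0^T (\partial_t + L)\phi(t, X_t)\,dt + \int_0^T \nabla_y\phi(t, X_t)\,\sigma(X_t)\,d\beta_t, $$

and the stochastic integral has zero expectation since $\phi$ has compact support, so its integrand is bounded.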
By Fubini and $T \to \infty$ this implies

$$ 0 = \int_0^\infty\!\!\int_{\mathbb{R}^m} (\partial_t + L)\phi(t, y)\, p_t(dy)\,dt = \int_0^\infty\!\!\int_{\mathbb{R}^m} \psi(t, y)\, p_t(dy)\,dt $$

for $\psi \in \mathcal{D}(U)$ as defined through the last equation. This also reads

$$ 0 = \langle \mu, (\partial_t + L)\phi \rangle = \langle \mu, \psi \rangle $$

for some distribution $\mu \in \mathcal{D}'(U)$.(1) In distributional sense this writes

$$ (\partial_t + L)^*\mu = (-\partial_t + L^*)\mu = 0, \qquad (4.1) $$

saying that $\mu$ satisfies the forward Fokker-Planck equation. If we can guarantee that $-\partial_t + L^*$ is hypoelliptic then, by Hörmander's theorem, there exists $p(t, y)$ smooth in both variables s.t.

$$ \langle \mu, \phi \rangle = \int_{(0,\infty)\times\mathbb{R}^m} p(t, y)\,\phi(t, y)\,dt\,dy = \int_{(0,\infty)\times\mathbb{R}^m} \phi(t, y)\, p_t(dy)\,dt. $$

This implies

$$ p_t(dy) = p(t, y)\,dy $$

for $p(t, y)$ smooth on $(0,\infty) \times \mathbb{R}^m$.(2)

(1) The distribution $\mu$ is also represented by the (finite-on-compacts) measure given by the semi-direct product of the kernel $p_s(dy)$ and Lebesgue measure $ds = d\lambda(s)$.
(2) Note that the smoothness conclusion via Malliavin calculus doesn't say anything about smoothness in $t$, i.e. our conclusion here is stronger.
4.3

We need sufficient conditions to guarantee the hypoellipticity of $G = -\partial_t + L^*$ as operator on $U = (0,\infty) \times \mathbb{R}^m \subset \mathbb{R}^n$ with $n = m + 1$.

Lemma 16 Given a first order differential operator $V = v^i\partial_i$, its adjoint is given by

$$ V^* = -(V + c_V) $$

where $c_V = \partial_i v^i$ is a scalar-field acting by multiplication.

Proof: Easy. $\Box$
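For the reader's convenience, the one-line integration by parts behind the lemma (for $f, g \in \mathcal{D}$; no boundary terms appear since the test functions have compact support):

$$ \langle Vf, g\rangle = \int v^i(\partial_i f)\,g\,dx = -\int f\,\partial_i(v^i g)\,dx = -\int f\,\bigl(v^i\partial_i g + (\partial_i v^i)g\bigr)\,dx = \langle f, -(V + c_V)g\rangle. $$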
As a corollary,

$$ (V^2)^* = (V^*)^2 = (c_V + V)^2 = V^2 + 2c_V V + c $$
for some scalar-field $c$. For $L$ as given in (3.26) this implies

$$ L^* = \frac{1}{2}\sum_{k=1}^d A_k^2 - \Bigl(A_0 - \sum_k c_{A_k} A_k\Bigr) + c $$

for some (different) scalar-field $c$. Defining

$$ \tilde{A}_0 = A_0 - \sum_k c_{A_k} A_k \qquad (4.2) $$

this reads

$$ L^* = \frac{1}{2}\sum_{k=1}^d A_k^2 - \tilde{A}_0 + c. $$
We can trivially extend vector fields on $\mathbb{R}^m$ to vector fields on $U = (0,\infty) \times \mathbb{R}^m$ (time-independent vector fields). From the differential-operator point of view it just means that we are acting only on the space-variables and not on $t$. Then

$$ G = \frac{1}{2}\sum_{k=1}^d A_k^2 - (\tilde{A}_0 + \partial_t) + c $$

is an operator on $U$, in Hörmander form as needed. Define the vector

$$ \tilde{A} = \tilde{A}_0 + \partial_t \in \mathbb{R}^n. $$

(As a vector: think of having a $1$ in the $0$th position (time), then use the coordinates of $\tilde{A}_0$ to fill up positions $1$ to $m$.) Hence Hörmander's (sufficient) condition for $G$ being hypoelliptic reads

$$ \mathrm{Lie}\,\{A_1|_y, \dots, A_d|_y, \tilde{A}|_y\} = \mathbb{R}^n \qquad (4.3) $$
for all $y \in U$. Note that for $k = 1, \dots, d$

$$ [A_k, \partial_t] = -(\partial_t A_k^i)\,\partial_i = 0 $$

since the $A_k^i$ are functions of space only. It follows that

$$ [A_k, \tilde{A}] = [A_k, \tilde{A}_0] $$

(we abuse notation here: the bracket on the l.h.s. is taken in $\mathbb{R}^n$, resulting in a vector with no component in $t$-direction, which is therefore identified with the $\mathbb{R}^m$-vector on the r.h.s., the result of the bracket-operation in $\mathbb{R}^m$), and similarly no higher bracket will yield any component in $t$-direction. From this it follows that (4.3) is equivalent to

$$ \mathrm{Lie}\,\{A_1|_y, \dots, A_d|_y, [A_1, \tilde{A}_0]|_y, \dots, [A_d, \tilde{A}_0]|_y\} = \mathbb{R}^m \qquad (4.4) $$

for all $y \in \mathbb{R}^m$. Using (4.2) we can replace $\tilde{A}_0$ in condition (4.4) by $A_0$ without changing the spanned Lie-algebra. We summarize:
Theorem 17 Assume that Hörmander's condition (H2) holds:

$$ \mathrm{Lie}\,\{A_1|_y, \dots, A_d|_y, [A_1, A_0]|_y, \dots, [A_d, A_0]|_y\} = \mathbb{R}^m \qquad (4.5) $$

for all $y \in \mathbb{R}^m$. Then the law of the process $X_t$ has a density $p(t, y)$ which is smooth on $(0,\infty) \times \mathbb{R}^m$.
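As an illustration (a hedged, self-contained sympy sketch with ad-hoc helper names, not part of the original notes), condition (H2) for Example 3 of the last chapter can be checked mechanically:

```python
# H2 for dX1 = dbeta, dX2 = X1 dt, i.e. A_1 = d_1 and drift A_0 = x1 d_2.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
X = [x1, x2]

def lie_bracket(V, W):
    """Coordinates of [V, W] for vector fields given as coefficient lists."""
    return [sum(V[j] * sp.diff(W[i], X[j]) - W[j] * sp.diff(V[i], X[j])
                for j in range(2))
            for i in range(2)]

A1 = [1, 0]     # d_1
A0 = [0, x1]    # x1 d_2
br = lie_bracket(A1, A0)
print(br)                              # [0, 1], i.e. d_2
print(sp.Matrix([A1, br]).rank())      # 2: H2 holds at every point of R^2
```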
Remarks: - Compare conditions H1 and H2, see (3.23) and (4.5). The only difference is that H2 is required at all points while H1 only needs to hold at $x = X(0)$. (Hence H2 is the stronger condition.)
- Using H1 (i.e. Malliavin's approach) we don't get (a priori) information about smoothness in $t$.
- Neither H1 nor H2 allows $A_0$ (alone!) to help out with the span. The intuitive meaning is clear: $A_0$ alone represents the drift, hence doesn't cause any diffusion, which is the origin of a density for the process $X_t$.
- We identified the distribution $\mu$ as uniquely associated to the (smooth) function $p(t, y) = p(t, y; x)$. Hence from (4.1)

$$ \partial_t p = L^* p \qquad (L^* \text{ acts on } y) $$

and $p(0, dy)$ is the Dirac measure at $x$. All that is usually summarized by saying that $p$ is a fundamental solution of the above parabolic PDE, and our theorem gives smoothness results for it.
- Let $\sigma = (A_1 | \dots | A_d)$ and assume that $E = \sigma\sigma^T$ is uniformly elliptic. We claim that in this case the vectors $\{A_1, \dots, A_d\}$ already span $\mathbb{R}^m$ (at all points), so that Hörmander's condition is always satisfied.
Proof: Assume $v \in \mathrm{span}\{A_1, \dots, A_d\}^\perp$. Then

$$ 0 = \sum_k \langle v, A_k\rangle^2 = \sum_k |v_i A_k^i|^2 = v_i \Bigl(\sum_k A_k^i A_k^j\Bigr) v_j = v^T E v. $$

Since $E$ is symmetric and positive definite, we see that $v = 0$. $\Box$
Bibliography

[Bass] Bass, Diffusions and Elliptic Operators, Springer, 1997.
[Bell] Bell, The Malliavin Calculus, Pitman Monographs 34, 1987.
[BH] Bouleau, Hirsch, Dirichlet Forms ..., Springer, 1997.
[DaPrato] Da Prato, Zabczyk, Stochastic Equations in Infinite Dimensions, Cambridge University Press, 1992.
[Evans] L.C. Evans, Partial Differential Equations, AMS.
[IW] Ikeda, Watanabe, Stochastic Differential Equations and Diffusion Processes, North-Holland, 1989.
[Ito] K. Ito, Multiple Wiener integral, J. Math. Soc. Japan 3 (1951), 157-169.
[KS] Karatzas, Shreve, Brownian Motion and Stochastic Calculus, 2nd Ed., Springer.
[Kohn] J.J. Kohn, Pseudo-differential Operators and Hypoellipticity, Proc. Symp. Pure Math. 23, A.M.S. (1973), 61-69.
[Malliavin1] P. Malliavin, Integration and Probability, Springer.
[Malliavin2] P. Malliavin, Stochastic Analysis, Springer.
[Norris] Norris, Simplified Malliavin Calculus.
[Nualart] D. Nualart, The Malliavin Calculus and Related Topics, Springer.
[Ocone] Ocone, A guide to stochastic calculus of variations, LNM 1316, 1987.
[Oksendal1] B. Oksendal, Stochastic Differential Equations, Springer, 1995.
[Oksendal2] B. Oksendal, An Introduction to Malliavin Calculus with Applications to Economics, Lecture Notes, 1997, available on www.nhh.no/for/dp/1996/
[Pazy] A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer, 1983.
[RR] Renardy, Rogers, Introduction to Partial Differential Equations, Springer.
[RU] W. Rudin, Functional Analysis, McGraw-Hill, 1991.
[RY] Revuz, Yor, Continuous Martingales and Brownian Motion, Springer.
[Stein] Stein, Singular Integrals, Princeton UP, 1970.
[Sugita] H. Sugita, Sobolev spaces of Wiener functionals and Malliavin's calculus, J. Math. Kyoto Univ. 25-1 (1985), 31-48.
[Uestuenel] A.S. Uestuenel, An Introduction to Analysis on Wiener Space, Springer LNM 1610.
[Warner] Warner, Foundations of Differentiable Manifolds and Lie Groups, Springer.
[Williams] Williams, To begin at the beginning..., in: Stochastic Integrals, Lecture Notes in Math. 851 (1981), 1-55.