– Typeset by FoilTEX –
Contents

2 Probability Theory
  2.3 Expectation and integration w.r.t. measures
  2.7 Independence
3 Stochastic Processes
  3.3 Martingales
  3.5 Distribution of some important variables associated with Brownian motion
4 Stochastic Integration
  4.8 Stochastic integral representation
1 Warm-up & Motivation
Modern finance makes critical use of mathematical tools, in particular tools from a discipline called "stochastics", which comprises probability calculus, stochastic processes and stochastic calculus. Practical applications in finance requiring a profound knowledge of these prerequisites are, for example, risk management, statistical analysis of financial markets and, most importantly, the pricing and hedging of complex financial products such as derivatives.
This section takes the problem of pricing and hedging an option as a starting point to develop
the motivation and a course programme for our journey through the theory of stochastic processes.
Consider a model for a dynamic financial market, where time is evolving discretely, t =
0, 1, 2, . . . . The time t = 0 refers to today, whereas t = 1, 2, . . . are future points in time.
In our financial market there are liquidly traded financial securities $S^1$, $S^2$ with prices $S^1_t$, resp. $S^2_t$, at time t = 0, 1, 2, . . . . Assume security $S^1$ is a risk-free bank account that increases in each time period at an interest rate r:
$$S^1_0 = 1, \qquad S^1_t = (1+r)^t.$$
The other security is assumed to be risky, i.e., its price at future points in time is unknown today; it is a random quantity. Of course, the price today, $S^2_0$, is known, and we assume that $S^2_0 = 1$. For the future time points we suppose a random behaviour consisting of two possible price branchings from time step to time step:
$$S^2_{t+1} = S^2_t \cdot \begin{cases} u & \text{("up" move)} \\ d & \text{("down" move)}, \end{cases}$$
where u and d are fixed numbers with u > d. At time t the possible values of the random price $S^2_t$ are
$$S^2_t \in \big\{S^2_0 \cdot u^k d^{t-k},\ k = 0, 1, \ldots, t\big\}.$$
Here is an example with
$$u = 1.100, \qquad d = 1/u = 0.9091.$$
Starting from $S^2_0 = 1.000$, the possible prices of $S^2$ are:

t = 0: 1.000
t = 1: 1.100, 0.9091
t = 2: 1.210, 1.000, 0.8265
. . .
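The lattice of attainable prices is easy to generate mechanically. The following is a small sketch in Python (the function name and layout are our own), using the parameters of the example:

```python
# Possible values of the risky security S^2 in the binomial tree:
# at time t the price is S2_0 * u^k * d^(t-k) for k = 0, ..., t.

def binomial_lattice(s0, u, d, T):
    """Return a list of lists: level t holds the t+1 possible prices,
    ordered from k = t "up" moves down to k = 0 "up" moves."""
    return [[s0 * u**k * d**(t - k) for k in range(t, -1, -1)]
            for t in range(T + 1)]

u = 1.100
d = 1 / u          # = 0.9091, as in the example
tree = binomial_lattice(1.0, u, d, 2)
for t, level in enumerate(tree):
    print(f"t={t}:", [round(x, 4) for x in level])
```

Printed with four decimals this reproduces the tree above (up to rounding of d = 1/1.1).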
We are now interested in obtaining a "fair" price today for a payoff XT at time T that is random and depends on the random price of $S^2$ at time T, or, even more generally, on the random price path of $S^2$ up to time T,
$$X_T = F(S^2_T) \qquad \text{or} \qquad X_T = F(S^2_t,\ t = 0, 1, \ldots, T),$$
with some payoff function F. The simplest example is a call option with strike (exercise) price K: $X_T = (S^2_T - K)^+$.
At first glance it looks as if there were no solution for the "fair" price of XT because, naively, one would expect that the price depends on our expectations about the future behaviour of the underlying security $S^2$, in particular on the likelihoods we would assign to the up and down moves. But those expectations are highly subjective and possibly different for each market participant. It is one of the most remarkable inventions of modern finance that there is indeed a fair price that is independent of subjective expectations! The reason is the following. Although XT, whose payoff is determined by its underlying $S^2$, introduces a new security to our market, its fair price at times t < T must somehow be linked to the prices of the primary securities $S^1$, $S^2$ to avoid price conflicts between the prices of the now three securities in our market. This will become more transparent soon.
It turns out that XT does not really introduce a new financial instrument to our market, since it can be replicated by a clever strategy of trading in the primary securities $S^1$, $S^2$. A trading strategy in $S^1$, $S^2$ is given by a pair $\theta = (\theta^1_t, \theta^2_t)_{t=0,1,2,\ldots}$, where $\theta^i_t$ is the quantity of shares of security $S^i$ we hold at time t, more precisely, between time t and t + 1. Of course, the strategy $(\theta^1_t, \theta^2_t)$ at time t is not predetermined as of today; it may depend on the random behaviour of the prices of $S^1$, $S^2$ up to time t.
In the binomial tree model the random behaviour of the prices of $S^1$, $S^2$ is characterized by a sequence ω of moves ξt, each being "up" (u) or "down" (d):
$$\omega = (\xi_1, \xi_2, \ldots), \qquad \xi_t \in \{u, d\}.$$
Such an ω is also called a state of the world. The behaviour up to time t is described by the up-to-time-t part ωt of this sequence: $\omega_t = (\xi_1, \xi_2, \ldots, \xi_t)$. A trading strategy is then fully specified by the quantities

t = 0: $\theta^i_0$, i = 1, 2
t = 1: $\theta^i_1(u)$, $\theta^i_1(d)$, i = 1, 2
t = 2: $\theta^i_2(uu)$, $\theta^i_2(ud)$, $\theta^i_2(du)$, $\theta^i_2(dd)$, i = 1, 2
. . .

For example, the quantity $\theta^i_2(ud)$ is the number of shares of security $S^i$ to be held between time t = 2 and t = 3 if the realized price moves up to time t = 2 were an "up" u followed by a "down" d.
In our binomial tree example above consider a call option with maturity T = 2 and payoff
$$X_2 = (S^2_2 - 0.98)^+.$$
The option payouts in the possible states at maturity are X2(uu) = 0.23, X2(ud) = 0.02, X2(du) = 0.02, X2(dd) = 0.0. For security $S^1$ we assume an interest rate of r = 5%. Consider the following trading strategy:

t = 0: $\theta^1_0 = -0.67871$, $\theta^2_0 = 0.79937$
t = 1: $\theta^1_1(u) = -0.88889$, $\theta^2_1(u) = 1.00000$; $\theta^1_1(d) = -0.08638$, $\theta^2_1(d) = 0.11524$
At time t = 0 the strategy requires an initial investment of $V_0(\theta) = \theta^1_0 S^1_0 + \theta^2_0 S^2_0 = 0.12066$, which is the value of the position we hold at time t = 0. If at time t = 1 the market goes "up", the strategy generates a profit & loss of
$$\theta^1_0\,[S^1_1(u) - S^1_0] + \theta^2_0\,[S^2_1(u) - S^2_0] = 0.046,$$
giving us a gross value of 0.16667 at time t = 1 if the market went up. To continue from here with our strategy from time t = 1 to t = 2 we need capital of $V_1(\theta)(u) = \theta^1_1(u) S^1_1(u) + \theta^2_1(u) S^2_1(u) = 0.16667$, fitting exactly what we have obtained so far by our strategy! So rearranging our portfolio is costless. Continuing from time t = 1 to t = 2 with our strategy and adding up the profit & loss generated, we end up with a portfolio value from our strategy at time t = 2 of
$$V_2(\theta)(uu) = 0.23, \qquad V_2(\theta)(ud) = 0.02.$$
But this is exactly the payoff of our call option X2 in the states (uu), (ud). One can easily verify that the given strategy replicates the payoff of the option XT for all possible future paths of the market. This means that the payoff of the option XT and the result of the above strategy are indistinguishable in all states ω of our world. The above strategy perfectly replicates the option payoff; it is called a replicating strategy for XT.
As a consequence, the price of the option and the cost of replication must coincide. Since the replication only required an initial investment of V0(θ), the fair price of the option is V0(θ) = 0.12066. Any price different from that would give rise to a contradiction and would imply that there is a money printing machine.
So far we have made no assumption on the probabilities associated with the up and down moves, and the fair price of XT clearly does not depend on any probability assignments! The reason is that the replication strategy above works in all states of the world, independently of their probabilities.
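The replication idea can be sketched as a small backward-solving program. The code below is our own illustration, assuming an interest rate of r = 0.05 (the value consistent with the figures of this example); at each node it solves the 2×2 linear system that matches the portfolio value to both branches:

```python
# Backward solving of the replicating strategy in the two-period binomial
# tree example (u = 1.1, d = 1/u, r = 0.05 assumed). Function and variable
# names are our own.

def replicate(payoff, u, d, r, T, s0=1.0):
    """payoff: dict mapping a path string like 'ud' to the payout X_T.
    Returns (V0, strategy), where strategy maps each node (path up to
    time t) to the pair (theta1, theta2) held from t to t+1."""
    strategy = {}
    values = dict(payoff)                      # V_T at the terminal nodes
    for t in range(T - 1, -1, -1):
        new_values = {}
        for path in {p[:t] for p in values}:   # nodes at time t
            s1_next = (1 + r) ** (t + 1)       # S^1_{t+1}, deterministic
            s2 = s0 * u**path.count('u') * d**path.count('d')  # S^2_t
            vu, vd = values[path + 'u'], values[path + 'd']
            # Solve the 2x2 system matching both branches:
            #   theta1*s1_next + theta2*s2*u = vu
            #   theta1*s1_next + theta2*s2*d = vd
            theta2 = (vu - vd) / (s2 * (u - d))
            theta1 = (vu - theta2 * s2 * u) / s1_next
            strategy[path] = (theta1, theta2)
            new_values[path] = theta1 * (1 + r) ** t + theta2 * s2  # V_t
        values = new_values
    return values[''], strategy

u, r = 1.1, 0.05
d = 1 / u
payoff = {'uu': 0.23, 'ud': 0.02, 'du': 0.02, 'dd': 0.0}
v0, theta = replicate(payoff, u, d, r, T=2)
print(round(v0, 5))                # initial cost of replication
print({k: (round(a, 5), round(b, 5)) for k, (a, b) in theta.items()})
```

Running it reproduces the initial investment V0(θ) = 0.12066 and the strategy table of the example.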
Now it is time to summarize and formalize what we have learned from the example.
We have replicated the option payoff XT in all states of the world by a clever strategy $\theta = (\theta^1, \theta^2)$. Replication means that the initial investment of the strategy plus the cumulative profit & loss from the strategy, finally, at time T, generates exactly the same payout as the option,
$$X_T = V_0(\theta) + \sum_{i=1,2} \sum_{t=1}^{T} \theta^i_{t-1}\,[S^i_t - S^i_{t-1}] = \sum_{i=1,2} \theta^i_{T-1} S^i_T. \tag{1.1}$$
During the replication, rearranging the strategy was costless since
$$\sum_{i=1,2} \theta^i_{t-1} S^i_t = \sum_{i=1,2} \theta^i_t S^i_t.$$
A strategy with this property is also called self-financing. Recall that all these equalities are
equalities between random variables, i.e., the random variables coincide for all states ω .
The appropriate strategy, $\theta^1_t(\omega_t), \theta^2_t(\omega_t)$, both depending on the state of the world $\omega_t$ up to time t, always¹ exists. It is obtained from the following backward solving algorithm:
$$X_T(\omega_T) = \sum_{i=1,2} \theta^i_{T-1}(\omega_{T-1})\, S^i_T(\omega_T), \tag{1.2}$$
$$\sum_{i=1,2} \theta^i_t(\omega_t)\, S^i_t(\omega_t) = \sum_{i=1,2} \theta^i_{t-1}(\omega_{t-1})\, S^i_t(\omega_t), \tag{1.3}$$
$$t = T-1, T-2, \ldots, 1. \tag{1.4}$$
Finally, the fair price V0(XT ) of XT today has to be the same as the costs of replicating XT by
the strategy θ . In view of (1.3) there are only the costs of initially setting up the strategy, i.e.,
V0(XT ) = V0(θ).
1 Up to the pathological case where u = d or u = 1 + r or d = 1 + r .
In other words, the fair price of the derivative XT is equal to its costs of replication.
Clearly it is cumbersome to calculate the replicating strategy by the backward induction (1.2),
(1.3) just to obtain the amount that has to be invested initially. It turns out that there is a clever
tool to shorten the calculation.
The replication algorithm can also be formalized in the following way. We started with the goal of generating at time T a portfolio value VT(θ) that replicates the payoff XT,
$$V_T(\theta) = X_T = \sum_i \theta^i_{T-1} S^i_T. \tag{1.5}$$
Then for each time step t = T, T − 1, . . . , 1 the required capital in the previous step is
$$V_{t-1}(\theta) := \sum_i \theta^i_{t-1} S^i_{t-1}, \tag{1.6}$$
which, by the self-financing property (1.3), satisfies
$$V_t(\theta) = \sum_i \theta^i_{t-1} S^i_t. \tag{1.7}$$
Proposition 1.1. In the binomial tree model the value process Vt(θ), t = T, T − 1, . . . , 1, satisfies the recursion equation
$$V_{t-1}(\theta)(\omega_{t-1}) = \frac{1}{1+r}\Big[\, p\, V_t(\theta)(\omega_{t-1}, u) + (1-p)\, V_t(\theta)(\omega_{t-1}, d) \,\Big], \tag{1.8}$$
where
$$p = \frac{1+r-d}{u-d} \tag{1.9}$$
and $(\omega_{t-1}, u)$ denotes the path $\omega_{t-1}$ continued with an "up" move u in step t, and analogously for $(\omega_{t-1}, d)$.
If we assume that
d < (1 + r) < u,
then 0 < p < 1 can be interpreted as a probability and (1.8) tells us that Vt−1(θ) is the
expectation of Vt(θ) divided (“discounted”) by (1 + r).
Warning. One has to be careful interpreting the probability p. The whole approach so far was completely independent of any assumptions on the probabilities associated with the u, d moves. Also, here we do not assume that p is the "real" or any subjective likelihood of the up move! The probability p is purely artificial and, as such, just a tool or trick to shorten calculations.
Proof. For the two branchings u, d continuing the path $\omega_{t-1}$, equation (1.7) reads as
$$V_t(\theta)(\omega_{t-1}, u) = \theta^1_{t-1}(\omega_{t-1})\, S^1_{t-1}(\omega_{t-1})\,(1+r) + \theta^2_{t-1}(\omega_{t-1})\, S^2_{t-1}(\omega_{t-1})\, u,$$
$$V_t(\theta)(\omega_{t-1}, d) = \theta^1_{t-1}(\omega_{t-1})\, S^1_{t-1}(\omega_{t-1})\,(1+r) + \theta^2_{t-1}(\omega_{t-1})\, S^2_{t-1}(\omega_{t-1})\, d.$$
Multiplying the first equation by $\frac{p}{1+r}$ and the second by $\frac{1-p}{1+r}$ and, finally, adding both equations yields the assertion.
In each step, associating with the up move u a probability p and with the down move d the probability 1 − p, we construct a probability distribution Q on the set Ω of all states of the world $\omega = (\xi_1, \xi_2, \ldots)$, $\xi_i \in \{u, d\}$. For the set of states ω which follow a given path $(\xi_1, \xi_2, \ldots, \xi_T)$ up to time T we set
$$Q(\{\omega : \omega_T = (\xi_1, \xi_2, \ldots, \xi_T)\}) = p^{\#u(\omega_T)}\,(1-p)^{\#d(\omega_T)}, \tag{1.10}$$
where $\#u(\omega_T)$, resp. $\#d(\omega_T)$, denotes the number of occurrences of "ups" u, resp. "downs" d, in the path $\omega_T = (\xi_1, \xi_2, \ldots, \xi_T)$.
We will learn from the exercises that this particular probability distribution Q admits an interpretation as a risk-neutral distribution if p is given by (1.9). Moreover, under Q the moves of the security $S^2$ are independent from step to step and the security $S^2$ is a Markov process.
Corollary 1.2. Again assume a binomial tree model. Consider a derivative with payoff $X_T(\omega_T)$ at time T in state $\omega_T$ which can be replicated by a strategy θ as in (1.2), (1.3). The fair price V0(XT) of XT today is then given by
$$V_0(X_T) = \frac{1}{(1+r)^T} \sum_{\omega_T} X_T(\omega_T)\, p^{\#u(\omega_T)}\,(1-p)^{\#d(\omega_T)} = \frac{1}{(1+r)^T}\, E_Q(X_T), \tag{1.11}$$
where p is as in (1.9). So V0(XT) is nothing else but the "discounted" expectation of the payoff XT, where the expectation is taken under an appropriate artificial probability distribution assigning probability p (resp. 1 − p) to the "up" (resp. "down") move.
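Formula (1.11) shortens the calculation considerably. The following sketch (our own code, again assuming r = 0.05) prices the call of the example directly as a discounted expectation under the artificial probability p from (1.9):

```python
from math import comb

# Fair price via the 'discounted expectation' formula (1.11), sketched
# in Python. We assume r = 0.05; u, d and the strike are taken from the
# example of Section 1.1.

def price(payoff_fn, s0, u, d, r, T):
    """Discounted Q-expectation of a payoff depending on the terminal
    price only: group the 2^T paths by their number k of 'up' moves."""
    p = (1 + r - d) / (u - d)          # risk-neutral up-probability (1.9)
    total = 0.0
    for k in range(T + 1):
        sT = s0 * u**k * d**(T - k)    # terminal price after k 'up' moves
        # comb(T, k) paths share this terminal price and probability
        total += comb(T, k) * p**k * (1 - p)**(T - k) * payoff_fn(sT)
    return total / (1 + r) ** T

u, r = 1.1, 0.05
d = 1 / u
v0 = price(lambda s: max(s - 0.98, 0.0), 1.0, u, d, r, T=2)
print(round(v0, 5))
```

The result agrees with the cost of replication, V0(θ) = 0.12066, obtained by backward induction.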
To be able to carry out the same analysis as in the preceding two sections, but in a much more general modelling framework than just a binomial tree model, it is obvious that we have to get a grip on the mathematical concepts and tools necessary to do the job.
2 Probability Theory
The mathematical model of an event with uncertain outcome is a probability space (Ω, F , P)
with Ω as the (nonempty) set of all possible outcomes (states of the world) ω ∈ Ω, which is also
often named sample space. F is called the set of observable events, it is a set of subsets
of Ω. Finally, P is the probability measure which associates a probability P(A) to each
event A ∈ F .
The family F of observable events is assumed to be a σ-algebra, i.e.,
(i) ∅ ∈ F,
(ii) A ∈ F ⇒ $A^c$ ∈ F, where $A^c$ denotes the complement of A in Ω,
(iii) A1, A2, · · · ∈ F ⇒ $\bigcup_{n=1}^{\infty} A_n \in F$.
As a consequence of these three conditions one can easily show that F is closed under all kinds
of set-theoretic operations (union, intersection, set-difference . . . )
The probability measure P is a mapping P|F → [0, 1] with the following properties:
(i) P(Ω) = 1,
(ii) for every sequence A1, A2, · · · ∈ F of pairwise disjoint events,
$$P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n).$$
Again, many simple calculation rules for probabilities can be derived as a consequence of these
two properties.
For a random variable X, i.e., a (B, F)-measurable mapping X|Ω → R, the probabilities
$$P_X(B) := P(X^{-1}(B)), \qquad B \in B, \tag{2.1}$$
are well-defined. PX is a probability measure on the space (R, B); it is called the distribution of X.
Closely related is the distribution function FX of X, which is defined by
$$F_X(x) := P_X((-\infty, x]) = P(X \le x), \qquad x \in R. \tag{2.2}$$
Associated with a random variable X is the σ-algebra σ(X) generated by the random variable X, defined by²
$$\sigma(X) = \{X^{-1}(B) : B \in B\}. \tag{2.3}$$
The family σ(X) is a sub-σ -algebra of F , i.e., σ(X) ⊆ F . Informally, σ(X) contains all
events, ”information”, from Ω that are observable by observing the random variable X .
There is a quite useful little result on the structure of any other random variable Y that is
σ(X)-measurable3.
2 Verify that the right hand side is indeed a σ -algebra!
3 More precisely, (B, σ(X))–measurable.
Lemma 2.1. Let Y be a random variable on (Ω, F , P). Then Y is σ(X) measurable if and
only if there is a function g|R → R that is (B, B)–measurable and such that Y = g(X).
For an at most countable space of outcomes Ω the expectation EX of the random variable X is
defined as the probability weighted average of all possible values of X ,
$$EX = \sum_{\omega \in \Omega} X(\omega)\, P(\{\omega\}). \tag{2.4}$$
For a general sample space Ω and a non-negative random variable X one defines the approximating simple random variables
$$X_n = \sum_{k=0}^{n2^n - 1} \frac{k}{2^n}\, 1_{\left\{\frac{k}{2^n} \le X < \frac{k+1}{2^n}\right\}} + n\, 1_{\{X \ge n\}};$$
then for n → ∞ we have Xn(ω) ↑ X(ω), ∀ω ∈ Ω, the expectation EXn is well-defined and
increasing in n and we define finally
EX = lim EXn.
n→∞
The random variable X is called integrable, if EX + < ∞ and EX − < ∞, which is the
same as E|X| < ∞.
One can easily extend the definition of EX by allowing that at most one of the two components, EX⁺ or EX⁻, takes the value ∞. The expectation in this generalized sense then takes values in [−∞, ∞].
Observe that for any function g|R → R which is (B, B)-measurable4, the variable g(X)
is again a random variable. So along the above lines we have also defined the expectation
E(g(X)).
Proposition 2.2. The expectation of a random variable satisfies the following properties.
(i) (Linearity) For integrable X1, X2 and c1, c2 ∈ R, E(c1X1 + c2X2) = c1EX1 + c2EX2.
(ii) (Monotonicity) If X and Y are integrable and X ≤ Y P–a.s., then
EX ≤ EY.
4 Such a function is also called Borel-measurable.
(iii) (Jensen’s inequality) If X is integrable and g|R → R is a convex function such that
g(X) is integrable, then
E(g(X)) ≥ g(EX). (2.6)
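As a quick numerical illustration of (2.6), take a small discrete distribution and the convex function g(x) = x² (a toy example of our own):

```python
# Jensen's inequality E(g(X)) >= g(EX) for convex g, checked on a small
# discrete distribution (toy values of our own choosing).
xs = [-1.0, 0.0, 2.0]          # values of X
ps = [0.2, 0.5, 0.3]           # their probabilities

g = lambda x: x * x            # a convex function

EX  = sum(x * p for x, p in zip(xs, ps))        # = 0.4
EgX = sum(g(x) * p for x, p in zip(xs, ps))     # E(g(X))
print(EX, EgX, g(EX))          # EgX dominates g(EX)
```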
When calculating the expectation of a random variable X or g(X), the integration over the
sample space Ω can be transformed equivalently into an integration over the value space R by
replacing the measure P on (Ω, F ) by PX on (R, B),
$$E(g(X)) = \int_{\Omega} g(X(\omega))\, dP(\omega) = \int_{R} g(x)\, dP_X(x) = \int_{-\infty}^{\infty} g(x)\, dF_X(x), \tag{2.7}$$
the latter integral being a Lebesgue-Stieltjes integral w.r.t. the increasing function FX .
Consider a sequence of random variables X1, X2, . . . defined on the probability space
(Ω, F , P). There are many different ways of how this sequence can converge to a certain
limiting random variable X . In the simplest case, we can think of pointwise convergence
limn Xn(ω) = X(ω), ∀ω ∈ Ω. However, since for most applications, in the limit exceptional
sets of vanishing probability can be ignored it is more appropriate to work with weaker concepts
of convergence than pointwise convergence.
(i) Convergence P-almost surely: $X_n \xrightarrow{P\text{-a.s.}} X$, if $P\big(\omega : \lim_n X_n(\omega) = X(\omega)\big) = 1$,
(ii) Convergence in P-probability: $X_n \xrightarrow{P} X$, if for every ε > 0,
$$\lim_n P\big(\omega : |X_n(\omega) - X(\omega)| > \varepsilon\big) = 0,$$
(iii) $L^p$ convergence: $X_n \xrightarrow{L^p} X$, if $E|X_n|^p < \infty$ and $\lim_n E|X_n - X|^p = 0$,
(iv) Convergence in distribution (weak convergence): $X_n \xrightarrow{d} X$, if $\lim_n F_{X_n}(x) = F_X(x)$ for all points x ∈ R of continuity of FX.⁵
Convergence P–almost surely implies convergence in probability, but not vice versa. Convergence in $L^p$ and convergence in probability are equivalent, provided $E|X_n|^p < \infty$ and the family $|X_n|^p$, n = 1, 2, . . . , is uniformly integrable, a kind of boundedness condition.
The notion of P–almost surely is also important and useful in other contexts. We say a
statement A holds P–almost surely, in short, P-a.s., if P(A) = 1.
It is important for many applications to know under which conditions taking the limit and
taking expectation are interchangeable operations.
Theorem 2.4. [Monotone convergence] Let (Xn) be a P–almost surely increasing sequence of non-negative random variables with limit X, P(X1 ≤ X2 ≤ . . . , limn Xn = X) = 1. Then
$$EX = \lim_n EX_n.$$
⁵ Observe that it is even not necessary that the random variables Xn and X are defined on the same probability space.
Consider first the situation of a finite sample space Ω where for each ω the one point set is
in the σ –algebra F , {ω} ∈ F . Then the probability measure P assigns a certain probability
to each ω , and we assume for simplicity P(ω) = P({ω}) > 0. Then any other probability
measure Q on (Ω, F ) can be seen as nothing else but a re-scaling of the probabilities P with a
scaling function Z , indeed,
$$Q(\omega) = Z(\omega)\, P(\omega), \tag{2.9}$$
with $Z(\omega) = \frac{Q(\omega)}{P(\omega)}$. For Q defined by (2.9) to be a probability measure it is necessary and sufficient that Z(ω) ≥ 0 and $E_P Z = 1$, where $E_P$ denotes the expectation taken w.r.t. the measure P.
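On a finite sample space the re-scaling (2.9) is easy to make concrete. The following sketch (toy distributions of our own choosing) verifies that Z = Q/P has P-expectation 1 and that expectations under Q can be computed under P after weighting with Z:

```python
# Change of measure on a finite sample space: Q(w) = Z(w) P(w), with
# Z = Q/P. The distributions below are toy values of our own.
P = {'a': 0.5, 'b': 0.3, 'c': 0.2}
Q = {'a': 0.25, 'b': 0.25, 'c': 0.5}

Z = {w: Q[w] / P[w] for w in P}        # the density (scaling function)

# Z is nonnegative and has P-expectation 1:
EP_Z = sum(Z[w] * P[w] for w in P)
print(EP_Z)

# Expectations under Q equal P-expectations after weighting with Z:
X = {'a': 1.0, 'b': 2.0, 'c': 3.0}
EQ_X  = sum(X[w] * Q[w] for w in Q)
EP_ZX = sum(Z[w] * X[w] * P[w] for w in P)
print(EQ_X, EP_ZX)
```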
In the case of an infinite sample space Ω the transformation (2.9) may not make sense anymore, since there will be ω's such that P(ω) = 0 and Q(ω) = 0. This indicates that the transformation has to be done on the basis of sets instead of single outcomes.
Theorem 2.6. On the probability space (Ω, F, P) let Z be a P–a.s. nonnegative random variable with $E_P Z = 1$. Define
$$Q(A) = \int_A Z(\omega)\, dP(\omega) = \int_{\Omega} 1_A(\omega)\, Z(\omega)\, dP(\omega), \qquad A \in F. \tag{2.10}$$
Then Q is a probability measure on (Ω, F).
Theorem 2.7. On the probability space (Ω, F, P) let Q be another probability measure absolutely continuous w.r.t. P. Then there exists a P–a.s. unique nonnegative random variable Z with $E_P Z = 1$ such that
$$Q(A) = \int_A Z(\omega)\, dP(\omega), \qquad \forall A \in F. \tag{2.11}$$
The random variable Z is called the density, or Radon–Nikodym derivative, of Q w.r.t. P, in symbols
$$Z = \frac{dQ}{dP}.$$
If, moreover, Z > 0 P–a.s., then P is in turn absolutely continuous w.r.t. Q with
$$P(A) = \int_A \frac{1}{Z}\, dQ, \qquad \text{i.e.,} \quad \frac{dP}{dQ} = \frac{1}{Z}.$$
For events A, B ∈ F the conditional probability of A given B is defined by
$$P(A|B) := \frac{P(A \cap B)}{P(B)},$$
where this definition makes sense only if P(B) > 0. Analogously one defines the conditional expectation E(X|B) of the random variable X given B:
$$E(X|B) := \frac{1}{P(B)} \int_{\Omega} X\, 1_B\, dP = \frac{\int_B X\, dP}{P(B)}. \tag{2.12}$$
Intuitively P(A|B) resp. E(X|B) is the probability of A, resp. the expectation of X , provided
that the event B has ”occurred”. Conditional expectation and conditional probability are linked
by
$$P(A|B) = E(1_A\,|\,B),$$
$$E(X|B) = \int_{\Omega} X(\omega)\, dP(\omega|B) = E_{P(\cdot|B)}\, X.$$
The σ–algebra F in our probability space was interpreted as the set of all observable events to which the probability P is assigned. For a random variable X, observing the outcomes of X, we know whether an event of the form $X^{-1}(B) = \{\omega : X(\omega) \in B\}$, B ∈ B, has occurred or not. So observing X gives us information on the occurrence of all sets of the form $X^{-1}(B)$, which form the σ–algebra σ(X) generated by X. Clearly, $X^{-1}(B) \in F$ by the definition of a random variable, but it is in general not true that every A ∈ F is of the form $X^{-1}(B)$. This means that in general the observation of X gives us only information about the events contained in σ(X), but not about the whole of F.
Theorem 2.8. Let X be an integrable random variable on the probability space (Ω, F, P). Then for any σ-algebra G ⊆ F there exists a P–a.s. unique random variable X̃ such that
(i) X̃ is G–measurable,
(ii) $\int_A \tilde{X}\, dP = \int_A X\, dP$ for all A ∈ G.
Definition 2.9. Any random variable X̃ with the properties of Theorem 2.8 is called condi-
tional expectation of X under the σ –algebra G and is denoted formally by E(X|G).
Example. Let us calculate the conditional expectation of the random variable X under the σ–algebra G generated by observing whether the given event B ∈ F has occurred or not: $G = \{B, B^c, \Omega, \emptyset\}$. Assume 0 < P(B) < 1. Then
$$E(X|G)(\omega) = \begin{cases} \dfrac{1}{P(B)} \displaystyle\int_B X\, dP & \text{if } \omega \in B, \\[8pt] \dfrac{1}{P(B^c)} \displaystyle\int_{B^c} X\, dP & \text{if } \omega \in B^c. \end{cases}$$
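The example can be made concrete on a small finite sample space (toy numbers of our own):

```python
# E(X|G) for G = {∅, B, B^c, Ω}: the conditional expectation is constant
# on B and on B^c, equal to the P-average of X over each set.
omega = ['w1', 'w2', 'w3', 'w4']
P = {'w1': 0.1, 'w2': 0.4, 'w3': 0.3, 'w4': 0.2}
X = {'w1': 5.0, 'w2': 1.0, 'w3': 2.0, 'w4': 4.0}
B = {'w1', 'w2'}                       # the observed event

def cond_exp(w):
    """E(X|G)(w): average X over whichever of B, B^c contains w."""
    A = B if w in B else set(omega) - B
    return sum(X[v] * P[v] for v in A) / sum(P[v] for v in A)

# Tower property: E(E(X|G)) = E(X)
EX = sum(X[w] * P[w] for w in omega)
E_cond = sum(cond_exp(w) * P[w] for w in omega)
print(EX, E_cond)
```

On B the conditional expectation equals the P-average of X over B, and averaging the conditional expectation recovers EX.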
When working with conditional expectations the following fundamental calculation rules are
extremely useful.
(i) (Linearity) If X1 and X2 are integrable random variables and c1, c2 ∈ R, then
$$E(c_1 X_1 + c_2 X_2\,|\,G) = c_1 E(X_1|G) + c_2 E(X_2|G), \quad P\text{–a.s.}$$
(ii) (Taking out what is known) If XY and Y are integrable and X is G–measurable, then
$$E(XY\,|\,G) = X\, E(Y|G), \quad P\text{–a.s.}$$
(v) (Jensen's inequality) If X is integrable and g|R → R is a convex function such that g(X) is integrable, then
$$g\big(E(X|G)\big) \le E\big(g(X)\,|\,G\big), \quad P\text{–a.s.} \tag{2.17}$$
Proof: . . .
Proposition 2.11. Let the random variable X be square integrable, i.e., EX 2 < ∞. Denote
by L2(G) the space of all square integrable random variables that are G –measurable. Then
E(X|G) ∈ L2(G) and E(X|G) is characterized by the fact that
$$E\big[X - E(X|G)\big]^2 = \min_{Z \in L^2(G)} E\big[X - Z\big]^2. \tag{2.18}$$
Definition 2.12. We define the conditional probability of the event A ∈ F under the
σ –algebra G , by
P(A|G) = E( 1A|G), P–a.s. (2.19)
Consider the conditional expectation E(Y|σ(X)) of the random variable Y w.r.t. the σ–algebra σ(X) generated by another random variable X. As we know from Lemma 2.1 there exists a Borel–measurable function g|R → R such that
$$E(Y|\sigma(X)) = g(X).$$
This suggests the notation
$$E(Y|X = x) := g(x), \qquad x \in R.$$
The expression E(Y|X = x) on the left-hand side is a purely formal notation which gets defined by the function g. Recall that, referring to (2.12), the conditional expectation E(Y|{X = x}) is only well-defined if P(X = x) > 0!
2.7 Independence
More generally, a family Ai, i ∈ I is called independent, if for any n and every finite choice
of n different indices i1, i2, . . . , in ∈ I we have
$$P\left(\bigcap_{k=1}^{n} A_{i_k}\right) = \prod_{k=1}^{n} P(A_{i_k}).$$
The concept of independence is extended to sets of events in the following way. Let G1, G2
be subsets of F , in particular, G1, G2 could be sub-σ –algebras of F . Then G1, G2 are called
independent, if for every choice A1 ∈ G1, A2 ∈ G2 the events A1, A2 are independent, i.e.,
P(A1 ∩ A2) = P(A1) P(A2).
As a special case consider two random variables X1, X2 and the σ –algebras σ(X1), σ(X2)
generated by X1, X2, respectively. Then the random variables X1, X2 are named independent,
if σ(X1), σ(X2) are independent. Independence of X1, X2 is equivalent to
P(X1 < x1, X2 < x2) = P(X1 < x1)P(X2 < x2), for all x1, x2 ∈ R.
Here are some extremely useful rules related to independence for working with conditional
expectations.
Theorem 2.13. Let (Ω, F, P) be a probability space and let G and H be sub-σ–algebras of F. The random variable X is assumed to be integrable.
(i) If X is independent of G, then
$$E(X|G) = EX, \quad P\text{–a.s.}$$
(ii) If X and G are independent, G and H are independent, and G ∨ H denotes the smallest σ–algebra containing G and H, then
$$E(X\,|\,G \vee H) = E(X\,|\,H), \quad P\text{–a.s.}$$
(iii) Assume the random variables X1, . . . , Xn are G–measurable and the random variables Y1, . . . , Ym are independent of G. Then for any bounded function f = f(x1, . . . , xn, y1, . . . , ym) we have that
$$E\big(f(X_1, \ldots, X_n, Y_1, \ldots, Y_m)\,\big|\,G\big) = g(X_1, \ldots, X_n),$$
where g is given by
$$g(x_1, \ldots, x_n) = E\big(f(x_1, \ldots, x_n, Y_1, \ldots, Y_m)\big).$$
Proof: . . .
Property (iii) is in some sense an extension of the "Taking out what is known" property of conditional expectation, combined with property (i) above.
3 Stochastic Processes
A stochastic process can be seen as a random variable taking values in the space of all
functions from T to R, but to make that precise one has to equip this space with an appropriate
σ –algebra to define measurability of the mapping.
For all stochastic processes of interest in this course one can show that there is a so-called version which has regular paths, meaning paths which are right-continuous with left limits ("càdlàg"), or left-continuous with right limits ("càglàd"), or even continuous.
(i) For a stochastic process X = (Xt)t∈T define the σ–algebra
$$F^X_t = \bigvee_{s \le t} \sigma(X_s) = \sigma(X_s : s \le t),$$
which contains all events of F observable through the stochastic process X until time t. Clearly $F^X_s \subseteq F^X_t$ for s ≤ t. The family $F^X = (F^X_t)_{t \in T}$ is called the filtration or flow of information generated by X.
(ii) A family F = (Ft)t∈T of sub–σ –algebras Ft of F is called a filtration if Fs ⊆ Ft for all
s, t ∈ T with s ≤ t.
(iii) A stochastic process Y = (Yt)t∈T is called F–adapted, if, for every t ∈ T, the random
variable Yt is (B, Ft)–measurable.
(iv) A filtered probability space (Ω, F , (Ft)t∈T, P) is a probability space equipped with a
filtration F = (Ft)t∈T of sub–σ –algebras of F .
If Y = (Yt) is F–adapted this means, intuitively, that by observing the flow of information F = (Ft) the random variable Yt is "known" at each point in time t.
In most of our applications the underlying filtration will be generated by the observation of the price processes of securities $(S^i_t)_{t \in T}$, i = 1, . . . , N,
$$F_t = \sigma(S^i_s : s \le t,\ i = 1, \ldots, N).$$
Brownian motion is the prototype of many types of stochastic processes: it is at the same time a martingale, a Markov process, a Gaussian process, a process with independent increments, and a process with stationary increments. On the other hand, it can be shown that almost every stochastic process with continuous paths can be obtained by a suitable transformation from Brownian motion.
Brownian motion plays an outstanding role in mathematical finance since the majority of popular models are based on it.
Definition 3.2. The stochastic process W = (Wt)t∈T defined on a probability space (Ω, F , P)
is called a Brownian motion or Wiener process if
(i) W possesses continuous paths, i.e., for every ω ∈ Ω the function t 7→ Wt(ω) is continuous.
(ii) W has independent increments, i.e., for all t1 < t2 < · · · < tn the increments Wt2 −
Wt1 , Wt3 − Wt2 , . . . , Wtn − Wtn−1 are independent random variables.
(iii) For all t > s the increment Wt − Ws has Normal distribution with mean zero and variance
t − s, in symbols (Wt − Ws) ∼ N(0, t − s).
(iv) W0 = 0.
A filtration F = (Ft)t∈T is called a filtration for the Brownian motion W if
(i) W is F–adapted,
(ii) for all t > s the increment Wt − Ws is independent of the σ–algebra Fs.
Clearly, the filtration FW generated by W is a filtration for the Brownian motion W . Any F
being a filtration for the Brownian motion W satisfies F ⊇ FW , i.e., Ft ⊇ FtW , ∀t.
Many useful transformations of a given Brownian motion result again in another Brownian
motion.
Proposition 3.3. Let W = (Wt)t≥0 be a Wiener process. Then each of the following processes B = (Bt) is again a Wiener process:
(i) Bt = −Wt, t ≥ 0,
(ii) Bt = Ws+t − Ws, t ≥ 0, for arbitrary fixed s ∈ T,
(iii) $B_t = \frac{1}{c} W_{c^2 t}$, t ≥ 0, for some fixed c ∈ R, c ≠ 0.
The paths W.(ω) of Brownian motion admit some very interesting properties that will stretch our imagination. On the other hand, understanding those path properties is critical when we approach the problem of defining the integral $\int_0^t g_s\, dW_s$ in Section 4.
The p-variation of a path on [0, T] is defined as the limit (if it exists)
$$\lim_{\delta_n \to 0} \sum_{i=1}^{n} |X_{t^n_i} - X_{t^n_{i-1}}|^p,$$
where $0 = t^n_0 < t^n_1 < \cdots < t^n_n = T$ is a partition of [0, T] and $\delta_n = \max_i (t^n_i - t^n_{i-1})$.
For p = 1 we simply speak of finite resp. bounded variation. The case p = 2 is called
quadratic variation.
Proposition 3.4. For P–almost all ω ∈ Ω the path W.(ω) of a Brownian motion has the
following properties.
Proof: . . .
The quadratic variation of a stochastic process, if it exists, will play an outstanding role in
the definition of the general stochastic integral. Quadratic variation of the stochastic process
X = (Xt)t≥0 up to time t is usually denoted by the symbol
$$[X, X]_t = \lim_{\delta_n \to 0} \sum_{i=1}^{n} |X_{t^n_i} - X_{t^n_{i-1}}|^2, \tag{3.2}$$
where $\delta_n$ is the maximum width of the partition of [0, t]. So for a Brownian motion W we have
$$[W, W]_t = t, \qquad t \ge 0, \quad P\text{–a.s.}$$
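This result can be observed in simulation: sampling a Brownian path on a fine grid of [0, 1], the sum of squared increments is close to 1, while the sum of absolute increments (the first variation) is huge. A sketch using NumPy; the grid size and seed are our own choice:

```python
import numpy as np

# Simulated check of [W, W]_1 = 1: on a fine partition of [0, 1] the sum
# of squared Brownian increments concentrates near 1, while the first
# variation grows without bound as the grid is refined.
rng = np.random.default_rng(0)
n, t = 100_000, 1.0
dt = t / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)  # independent N(0, dt) increments

quad_var = np.sum(dW**2)                   # approximates [W, W]_1 = 1
first_var = np.sum(np.abs(dW))             # of order sqrt(n) -> infinity
print(quad_var, first_var)
```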
3.3 Martingales
The theory of martingales, although developed by probabilists during the 50s to 80s, has turned
out to be THE MACHINERY for modern finance.
When working with different probability measures and different filtrations one has to be more
precise saying that X is a martingale (sub-, supermartingale) with respect to the measure P and
the filtration F.
Example 1. Consider the binomial tree model from Section 1.1. Under the probability measure Q as introduced in Section 1.3, formula (1.10), the process $X_t = S^2_t / S^1_t$, t = 0, 1, . . . , is a martingale w.r.t. the filtration $F^{S^1, S^2}$.
More generally, for any integrable random variable ξ the process
$$X_t = E(\xi\,|\,F_t), \qquad t \in T, \tag{3.3}$$
is a martingale w.r.t. F.
Proposition 3.6. Let W be a Brownian motion on the filtered probability space (Ω, F, F, P). The filtration is assumed to be a filtration for the Brownian motion W; in particular, we could take $F = F^W$. Then
(i) W is a martingale,
(ii) $(W_t^2 - t)_{t \ge 0}$ is a martingale,
(iii) for every σ ∈ R the process $\big(\exp(\sigma W_t - \tfrac{1}{2}\sigma^2 t)\big)_{t \ge 0}$ is a martingale.
Proof. . . .
The martingale in item (iii) above is called the exponential martingale associated with
σW .
Definition 3.7. Given a Brownian motion W, we call the process X = (Xt) defined by
$$X_t = X_0 \exp\Big(\sigma W_t + \big(\mu - \tfrac{1}{2}\sigma^2\big)\, t\Big), \qquad t \ge 0, \tag{3.4}$$
a geometric Brownian motion.
For easy reference in forthcoming examples and remarks we introduce the dynamics of the
stochastic security price processes underlying the famous Black & Scholes model. The model
assumes geometric Brownian motion for the price St of a stock. Under the analogous risk-neutral
distribution Q as investigated in Section 1.3 the drift of the geometric Brownian motion is equal
to the (continuously compounded) interest rate r , i.e.,
$$S_t = S_0 \exp\Big(rt + \sigma W_t - \tfrac{1}{2}\sigma^2 t\Big).$$
The model also assumes that we can invest and borrow money at the risk-free rate r; that is, there is another security, the money market account, with deterministic price process
$$B_t = \exp(rt).$$
The flow of information (filtration) F observable in the market is generated by observing the
stock price path through time, i.e., F = FS .
The ratio (St/Bt), which is the discounted stock price process, is clearly a (Q, F)–martingale.
Consider a payoff XT at time T , which is dependent on the stock price (path) via the payoff
function F :
XT = F (St, t ≤ T ).
No-arbitrage pricing theory and our analysis in Section 4.9 justifies that for any time 0 ≤ t ≤ T
the fair price Vt(XT ) is given by
$$V_t(X_T) = B_t\, E_Q\Big(\frac{X_T}{B_T}\,\Big|\,F_t\Big), \tag{3.5}$$
in particular,
$$V_0(X_T) = E_Q\Big(\frac{X_T}{B_T}\Big).$$
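The discounted expectation can be evaluated by Monte Carlo simulation and compared with the classical Black–Scholes closed-form call price, which is not derived in these notes; we quote it as standard. The parameter values below are our own choice:

```python
import numpy as np
from math import log, sqrt, exp, erf

# Monte Carlo evaluation of V_0(X_T) = E_Q(X_T / B_T) for a call payoff
# X_T = (S_T - K)^+ under the risk-neutral geometric Brownian motion,
# checked against the classical Black-Scholes formula.

def bs_call(s0, k, r, sigma, T):
    """Classical Black-Scholes call price (standard normal cdf via erf)."""
    N = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    d1 = (log(s0 / k) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return s0 * N(d1) - k * exp(-r * T) * N(d2)

s0, k, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
rng = np.random.default_rng(42)
W_T = rng.normal(0.0, sqrt(T), size=500_000)                  # W_T ~ N(0, T)
S_T = s0 * np.exp(r * T + sigma * W_T - 0.5 * sigma**2 * T)   # GBM under Q
mc_price = exp(-r * T) * np.mean(np.maximum(S_T - k, 0.0))    # E_Q(X_T/B_T)

print(mc_price, bs_call(s0, k, r, sigma, T))
```

With half a million samples the two prices agree to a few cents, in line with the Monte Carlo standard error.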
Generations of probabilists have worked hard to reveal the probability distribution of almost all
random quantities of interest related to Brownian motion. For example, we are interested in the
distribution of the maximum of the Brownian path up to some fixed time, or the distribution of
the first time the path hits some level.
For a stochastic process X and a level a ∈ R define the hitting time
$$\tau^X_a := \inf\{t \ge 0 : X_t = a\},$$
the first time X hits the level a. We use the convention inf ∅ = ∞. Also define the running maximum $M^X_t$ and the running minimum $m^X_t$ of the process X up to time t by⁸
$$M^X_t(\omega) = \max\{X_s(\omega) : s \le t\}, \qquad m^X_t(\omega) = \min\{X_s(\omega) : s \le t\}. \tag{3.7}$$
8 We are a bit sloppy here. In general one should use the supremum resp. infimum instead of the maximum resp. minimum which is not
always defined. But since we are working mostly with continuous processes X this is not a problem.
Proposition 3.8. Let W be a Brownian motion.
(i) For every a ∈ R the random hitting time $\tau^W_a$ is finite P–a.s.,
$$P(\tau^W_a < \infty) = 1.$$
(ii) For a > 0 the distribution of $\tau^W_a$ is given by
$$P(\tau^W_a \le t) = \frac{2}{\sqrt{2\pi t}} \int_a^{\infty} e^{-\frac{x^2}{2t}}\, dx. \tag{3.8}$$
Proof. . . .
Using the same ideas as in the proof of Proposition 3.8 one shows the following result on the
joint distribution of a Brownian motion and its running maximum.
Proposition 3.9. Let W be a Brownian motion. The joint distribution of $(W_T, M^W_T)$ possesses a density that is given by
$$f_{(W_T, M^W_T)}(x, y) = \frac{2(2y - x)}{T\sqrt{2\pi T}}\, e^{-\frac{(2y-x)^2}{2T}}, \qquad x \le y,\ y > 0. \tag{3.9}$$
Proof. . . .
As motivated by our analysis in Section 1.3 and justified by no-arbitrage pricing theory, the fair
price V0(XT) of this derivative is the "discounted" expectation
$$V_0(X_T) = \frac{1}{(1+r)^T}\, E_Q\Big[(S_T - E)^+\, 1_{\{M^S_T \le B\}}\Big],$$
under an appropriately chosen probability measure Q. To calculate this expectation the knowledge
of the joint distribution of (ST , MTS ) is required. In the framework of the Black & Scholes
model the security S is modeled as a geometric Brownian motion with a certain drift. To derive a closed pricing formula for barrier options, in Proposition 4.10 below we will generalize the result of Proposition 3.9 to Brownian motions with drift, applying Girsanov's theorem.
In applications many phenomena are modeled as stochastic processes that are so-called Markov
processes. The theory of Markov processes is extremely rich and offers a fruitful interplay with
functional analysis. Many well-known techniques in financial applications, such as, for example,
using partial differential equations or tree methods to solve for option prices, are critically based
on the Markov property of the underlying process.
However, despite the richness of methods available for Markov processes, one should be aware that it is sometimes highly questionable whether a real-life process, such as the price process of a stock, is really Markovian.
A process X is called a Markov process w.r.t. F if

(i) X is F–adapted,

(ii) for every s ≤ t and B ∈ B it holds that

P(X_t ∈ B | F_s) = P(X_t ∈ B | X_s).   (3.10)

As a consequence of Lemma 2.1, for every s, t, B the right hand side of (3.10) can be written as some Borel measurable function P(s, ·, t, B) of the random variable X_s:

P(X_t ∈ B | X_s) = P(s, X_s, t, B).
The function P (s, x, t, B) is called the transition probability function of the Markov
process X . It can be shown that, considered as a function of B ∈ B, the transition
probability P (s, x, t, B) is a probability measure on R. In case this measure possesses a density
p(s, x, t, y), i.e.,

P(s, x, t, B) = ∫_B p(s, x, t, y) dy,

then p is called the transition density of X.
Theorem 3.11. Let W be a Wiener process with filtration F. Then W is a Markov process
w.r.t. F. Moreover W admits a transition density given by
P(W_t ∈ dy | W_s = x) = p(s, x, t, y) dy = (1/√(2π(t − s))) e^{−(y−x)²/(2(t−s))} dy,   t > s.   (3.12)
Proof: . . .
The Markov property is the key to the popular backward-induction method frequently used
in pricing derivatives. Let us illustrate this going back to the Binomial tree model of Section 1.1.
The underlying probability space (Ω, F , Q) is given by
There are two securities S 1, S 2 which are modeled as stochastic processes for times
T = {0, 1, 2, . . . }:
S_t^1(ω) = (1 + r)^t,   (3.17)

S_t^2(ω) = S_0^2 · u^{♯u(ω^t)} d^{♯d(ω^t)},   (3.18)

where ♯u(ω^t) resp. ♯d(ω^t) counts the up- resp. down-moves among the first t coordinates of ω. The filtration is generated by the price processes,

F = F^{S^1,S^2} = F^{S^2}.
The following properties hold:

(i) For every t the return ratio S_{t+1}^2 / S_t^2 is independent of F_t.

(ii) The process S^2/S^1 is a martingale.

(iii) The process S^2 is a Markov process w.r.t. F.
Consider a derivative with integrable payoff XT at time T . As known from Corollary 1.2, the fair
price V0(XT ) today is given by
V_0(X_T) = (1/(1 + r)^T) E_Q(X_T).
Denote Vt(XT ) = (1 + r)−(T −t)EQ(XT |Ft), t ≤ T . As will become clear from arbitrage
pricing theory, Vt(XT ) is indeed the fair price of the derivative XT at time t – but this is not
important for now.
If X_T is of the form X_T = F(S_T^2), then applying the Markov property of S^2 and the rule of iterated conditioning (2.15) from Theorem 2.10, one derives that V_t(X_T) can be written as a function of t and the current price S_t^2 alone; in particular, it does not depend on the history of the price S_u^2, u < t.
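The backward-induction method can be sketched in a few lines of Python (our own function names; q below is the usual risk-neutral up-probability of the binomial model):

```python
def binomial_price(payoff, S0, u, d, r, T):
    """Price a payoff payoff(S_T) by backward induction on a
    recombining binomial tree with T periods."""
    q = (1.0 + r - d) / (u - d)  # risk-neutral up-probability
    # values at maturity, indexed by the number k of up-moves
    v = [payoff(S0 * u**k * d**(T - k)) for k in range(T + 1)]
    for t in range(T - 1, -1, -1):
        # one-step discounted conditional expectation under Q
        v = [(q * v[k + 1] + (1.0 - q) * v[k]) / (1.0 + r) for k in range(t + 1)]
    return v[0]
```

For example, with S_0 = 4, u = 2, d = 1/2, r = 0, a call with strike 5 and T = 1 gives q = 1/3 and price 1.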
4 Stochastic Integration

In this section we give a meaning to stochastic integrals of the form

∫_0^t H_s dX_s,

where H and X are stochastic processes on a filtered probability space (Ω, F, F, P). From Subsection 4.2 on we will mainly deal with the case that X is a Brownian motion.
Recall our motivation for looking for integrals of stochastic integrands H integrated w.r.t. a
stochastic process X . In Section 1, working there on a discrete time scale, we have seen that
for a strategy θ = (θt) of trading in the security S = (St) the cumulative profit and loss up to
time t is of the form

∫_0^t θ_u dS_u.
Consider first integrating a ”simple” process H , where there is a partition 0 = t0 < t1 <
t2 · · · < tn of time such that H is ”constant” on each interval (ti, ti+1]:
H_t = ξ_0 1_{{0}}(t) + Σ_{i=0}^{n} ξ_i 1_{(t_i, t_{i+1}]}(t),   t ≥ 0.   (4.1)
The ”values” ξ_i are not necessarily constant; they could be random variables. Then a natural definition of the stochastic integral would be the following. For every t ≥ 0 and ω ∈ Ω we define the random variable (∫_0^t H_s dX_s)(ω) by

(∫_0^t H_s dX_s)(ω) = Σ_{i=0}^{n} ξ_i(ω) (X_{t∧t_{i+1}}(ω) − X_{t∧t_i}(ω)),   t ≥ 0.   (4.2)
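Formula (4.2) is straightforward to implement pathwise. A small sketch (our own names), where X is one fixed path given as a function of time:

```python
def simple_integral(xi, grid, X, t):
    """Evaluate (4.2): sum_i xi[i] * (X(t ^ t_{i+1}) - X(t ^ t_i))
    for a simple integrand taking the value xi[i] on (grid[i], grid[i+1]]."""
    total = 0.0
    for i, value in enumerate(xi):
        total += value * (X(min(t, grid[i + 1])) - X(min(t, grid[i])))
    return total
```

For the bounded-variation path X(s) = s, the integrand with ξ_0 = 1 on (0, 1] and ξ_1 = 2 on (1, 2] yields the Riemann–Stieltjes value 1·1 + 2·1 = 3 at t = 2.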
Now, as is standard, one could try to extend the definition of the integral to more general integrands H by taking limits of sequences of simple integrands, hoping that the respective integrals also converge to a limit, and so on. Unfortunately, this does not work in the case of particular interest to us, namely when X is a Brownian motion.
To give an idea where the problem is coming from, we consider a result which can be found in
[13], Theorem 52. Assume h = h(t) and x = x(t) are deterministic functions of time t ∈ [0, 1].
For a partition π = {0 = t_0 < t_1 < . . . < t_n = 1} define I_π = Σ_i h(t_i)(x(t_{i+1}) − x(t_i)).
If the limit lim|π|→0 Iπ exists for every continuous integrand h then x = x(t) is necessarily a
function of bounded variation!
But as we know from Proposition 3.4 the Brownian path W.(ω) is of unbounded variation,
so a naive path by path definition of the stochastic integral is not viable for X being a Brownian
motion. To solve this problem we have to think of a limiting procedure which is not based on
a path by path convergence but rather on a weaker concept of measuring the distance when
passing to the limit.
From now on we assume that the integrating process X is a Brownian motion W = (Wt) with
filtration F. For the simple integrand H ,
H_t = ξ_0 1_{{0}}(t) + Σ_{i=0}^{n} ξ_i 1_{(t_i, t_{i+1}]}(t),   t ≥ 0,   (4.3)
we suppose now in addition that the random variable ξi is Fti –measurable and square integrable.
This measurability assumption implies that H is F–adapted. Recall from our motivation that the
integrands of interest to us are trading strategies which have to be adapted anyway.
Theorem 4.1. The stochastic integral I(H) = (It(H)) satisfies the following properties.
(i) (Continuity) The stochastic process I(H) = (It(H)) possesses continuous paths.
(ii) (Adaptedness) The process I(H) is F–adapted, i.e., It(H) is Ft–measurable for every t.
(iii) (Linearity) For two simple integrands H^1, H^2 and constants c_1, c_2 we have that

I_t(c_1H^1 + c_2H^2) = c_1I_t(H^1) + c_2I_t(H^2),   t ≥ 0.
(iv) (Martingale property) The integral process I(H) = (It(H)) is a martingale which
starts at zero, I0(H) = 0.
(v) (Isometry) For every t it holds that
E (I_t(H))² = E ∫_0^t (H_s)² ds.   (4.5)
(vi) (Covariance) For every t and two simple integrands H, K the covariance of the integrals is
E (I_t(H) I_t(K)) = E ∫_0^t H_s K_s ds.   (4.6)
(vii) (Quadratic Variation) The quadratic variation of the integral process I(H) is
[I(H), I(H)]_t = ∫_0^t (H_s)² ds.   (4.7)
Proof: . . .
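The martingale property and the isometry (4.5) can be illustrated by simulation. For the integrand H_s = W_s (left-endpoint values on a fine grid) one has E I_1(H) = 0 and E I_1(H)² = ∫_0^1 E W_s² ds = 1/2. A Monte Carlo sketch (our own names, illustrative parameters):

```python
import math
import random

random.seed(1)

def sample_W_dW(n_steps=500):
    # one sample of int_0^1 W_s dW_s using left-endpoint (non-anticipating) sums
    dt = 1.0 / n_steps
    w, total = 0.0, 0.0
    for _ in range(n_steps):
        dw = random.gauss(0.0, math.sqrt(dt))
        total += w * dw  # integrand evaluated at the left endpoint
        w += dw
    return total

samples = [sample_W_dW() for _ in range(4000)]
mean = sum(samples) / len(samples)
second_moment = sum(s * s for s in samples) / len(samples)
# mean should be close to 0, second_moment close to 1/2
```

Note that the left-endpoint evaluation is essential: it is what makes the integral a martingale.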
The isometry property is the key to extending the definition of the stochastic integral to
more general integrands H. We assume that H = (H_t) is F–adapted and satisfies the condition that

E (∫_0^t (H_s)² ds) < ∞,   ∀t ≥ 0.   (4.8)

10 Implicitly this assumes also that for every ω the integral ∫_0^t (H_s(ω))² ds is defined and gives again a random variable. For that to hold H_s(ω) has to be B(R^+) ⊗ F–measurable.
Then one can show that H can be approximated by a sequence H n of simple processes such that
lim_n E (∫_0^t (H_s^n − H_s)² ds) = 0,
cf. [6], Lemma 6.3.5. Applying the isometry property we can show that the stochastic integrals
It(H n) converge11 to a limiting variable, which we denote by It(H), where the convergence is
in the sense of mean square convergence:
lim_n E (I_t(H^n) − I_t(H))² = 0.
It turns out that all properties of the stochastic integral in Theorem 4.1 are preserved by the
limiting procedure.
Finally, it is possible to extend the definition of the stochastic integral to an even wider class of F–adapted integrands H, namely those satisfying only

∫_0^t (H_s)² ds < ∞,   P–a.s., ∀t ≥ 0.   (4.9)

11 To be more precise, I_t(H^n) forms a Cauchy sequence in the complete space L²(Ω, F, P) and has therefore a limit.
To extend the definition of the integral further to those integrands we have to utilize an even
weaker concept of convergence. One can show that any such integrand H can be approximated12
by a sequence H n of integrands satisfying (4.8). The corresponding integrals It(H n) are then
convergent in probability P and the integral It(H) is defined to be their limit.
Observe that for integrands H satisfying only the weaker condition (4.9) the martingale
property of the stochastic integral as well as the isometry and the formula for the covariance get
lost. The integral It(H) is in general no longer square integrable and it is merely a so-called
local martingale instead of a martingale.
In case the integrand H is non-stochastic the integral preserves the normal distribution of
the Brownian motion.
Proposition 4.2. Let h = h(t) be a deterministic function satisfying

∫_0^t (h(s))² ds < ∞.

Then the stochastic integral I_t(h) = ∫_0^t h(s) dW_s is a Gaussian process. In particular, its expectation and covariance function are

E I_t(h) = 0,   (4.10)

E (I_t(h) I_s(h)) = ∫_0^{t∧s} (h(u))² du.   (4.11)
The definition of the stochastic integral is trivially extended from Brownian motion as the
integrating process to so-called Ito processes.
Definition 4.3. Given two F–adapted processes µ = (µ_t) and σ = (σ_t) satisfying

∫_0^t (σ_s)² ds < ∞,   P–a.s., ∀t ≥ 0,   (4.12)

∫_0^t |µ_s| ds < ∞,   P–a.s., ∀t ≥ 0,   (4.13)

the process

X_t = X_0 + ∫_0^t σ_s dW_s + ∫_0^t µ_s ds,   t ≥ 0,   (4.14)

is called an Ito process.
Lemma 4.4. For an Ito process (4.14) to be a martingale, it is necessary that the ”dt” part vanishes, i.e., µ ≡ 0.
The Ito-formula is one of the most powerful tools of stochastic calculus. It is a version of the
well known chain rule of differential calculus, adapted and modified for the particular needs of
stochastic integration.
Let us recall the chain rule. For smooth functions f and g we know that

(d/dt) f(g(t)) = f′(g(t)) g′(t),

which can also be written as

df(g(t)) = f′(g(t)) g′(t) dt = f′(g(t)) dg(t).
A formula of this type remains valid as long as the function g is of bounded variation. In case of
the stochastic integral we know that the integrating Brownian motion has paths of unbounded
variation, which forced us to define the integral in the sense of a mean square convergence. This
will be exactly the reason for the appearance of an additional term in the chain rule of stochastic
calculus.
Theorem 4.5. [Ito formula for Brownian motion] Let f = f (t, x) be a function for which
the partial derivatives ft(t, x), fx(t, x), fxx(t, x) exist and are continuous. Then
f(t, W_t) = f(0, W_0) + ∫_0^t f_x(s, W_s) dW_s + ∫_0^t f_t(s, W_s) ds + (1/2) ∫_0^t f_xx(s, W_s) ds,   t ≥ 0.   (4.16)
Proof: We restrict ourselves to the case of a function f that does not depend on time,
f = f (x). We sketch the main steps of the proof discussing more details in the class.
For f = f (x) depending only on x the Ito formula reads as
f(W_t) = f(W_0) + ∫_0^t f_x(W_s) dW_s + (1/2) ∫_0^t f_xx(W_s) ds.
Take a partition (t_i^n) of the interval [0, t], 0 = t_0^n < t_1^n < . . . < t_n^n = t. Then, clearly,

f(W_t) = f(W_0) + Σ_{i=0}^{n−1} ( f(W_{t_{i+1}^n}) − f(W_{t_i^n}) ).

A second-order Taylor expansion of each increment gives

f(W_t) = f(W_0) + Σ_{i=0}^{n−1} f_x(W_{t_i^n}) (W_{t_{i+1}^n} − W_{t_i^n}) + (1/2) Σ_{i=0}^{n−1} f_xx(θ_i^n) (W_{t_{i+1}^n} − W_{t_i^n})²,
with θ_i^n inside the interval spanned by W_{t_i^n} and W_{t_{i+1}^n}. It is not difficult to see that the first sum converges to the stochastic integral,

Σ_{i=0}^{n−1} f_x(W_{t_i^n}) (W_{t_{i+1}^n} − W_{t_i^n}) → ∫_0^t f_x(W_s) dW_s.
It is a bit more tricky to see that the second sum converges to a ”ds”–integral,

Σ_{i=0}^{n−1} f_xx(θ_i^n) (W_{t_{i+1}^n} − W_{t_i^n})² → ∫_0^t f_xx(W_s) ds.
Observe that in classical differential calculus this sum would converge to zero: if the function W_s(ω), s ∈ [0, 1], had bounded variation then its quadratic variation would be zero (cf. Exercises!), causing the second sum to disappear in the limit.
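The fact behind the extra term — the quadratic variation of W over [0, t] equals t, while a smooth path has quadratic variation zero — is easy to see numerically (a sketch; the step count is illustrative):

```python
import math
import random

random.seed(2)

n = 100_000
dt = 1.0 / n

# sum of squared Brownian increments over [0, 1]: converges to t = 1
qv_brownian = sum(random.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(n))

# the same sum for the smooth path g(s) = s is n * dt^2, which vanishes as n grows
qv_smooth = sum(dt ** 2 for _ in range(n))
```

The Brownian sum concentrates around 1 while the smooth-path sum is of order 1/n.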
Along the same lines one derives an Ito formula for general Ito processes.
Theorem 4.6. [Ito formula for Ito processes] Suppose X = (Xt) is an Ito process,
X_t = X_0 + ∫_0^t σ_s dW_s + ∫_0^t µ_s ds.
Let f = f(t, x) be a function for which the partial derivatives f_t(t, x), f_x(t, x), f_xx(t, x) exist and are continuous. Then

f(t, X_t) = f(0, X_0) + ∫_0^t f_x(s, X_s) σ_s dW_s + ∫_0^t ( f_t(s, X_s) + f_x(s, X_s) µ_s ) ds + (1/2) ∫_0^t f_xx(s, X_s) (σ_s)² ds,   t ≥ 0.
The last integral is often written using the quadratic variation [X, X] of the Ito process X. By (4.7), and by the fact that the ”ds” part of X does not contribute to the quadratic variation, the quadratic variation of X is

[X, X]_t = ∫_0^t (σ_s)² ds.
Now it is time to relax from the hard work so far. Let us enjoy the power of the Ito formula
applying it now to solve some stochastic differential equations that are of extreme importance in
finance.
Consider the linear stochastic differential equation

X_t = X_0 + ∫_0^t σX_s dW_s + ∫_0^t µX_s ds,   t ≥ 0,   (4.18)

or, formally,

dX_t = σX_t dW_t + µX_t dt,

with constants σ, µ. A stochastic process X satisfying this equation is called a solution of the SDE (4.18).
Proposition 4.7. The linear stochastic differential equation (4.18) possesses a unique solution
that is given by

X_t = X_0 exp( σW_t + (µ − ½σ²)t ),   t ≥ 0.   (4.19)
In other words, the solution of (4.18) is a geometric Brownian motion with volatility σ and drift
µ.
Proof: . . .
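One can convince oneself numerically that (4.19) solves (4.18): drive an Euler discretization of the SDE and the closed-form solution with the same Brownian increments and compare the endpoints (a sketch; step count and coefficients are illustrative):

```python
import math
import random

def gbm_endpoints(x0=1.0, sigma=0.2, mu=0.05, t=1.0, n_steps=20_000, seed=3):
    """Return (Euler endpoint, closed-form endpoint (4.19)) on one path."""
    random.seed(seed)
    dt = t / n_steps
    x_euler, w = x0, 0.0
    for _ in range(n_steps):
        dw = random.gauss(0.0, math.sqrt(dt))
        x_euler += sigma * x_euler * dw + mu * x_euler * dt  # dX = sigma X dW + mu X dt
        w += dw
    x_exact = x0 * math.exp(sigma * w + (mu - 0.5 * sigma * sigma) * t)
    return x_euler, x_exact
```

For a fine grid the two endpoints agree closely, and both stay strictly positive, as a geometric Brownian motion must.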
In the same way one derives that the solution of the inhomogeneous SDE

dX_t = σ_tX_t dW_t + µ_tX_t dt,   (4.20)

with F–adapted coefficient processes σ = (σ_t) and µ = (µ_t), is given by

X_t = X_0 exp( ∫_0^t σ_s dW_s + ∫_0^t (µ_s − ½σ_s²) ds ).   (4.21)
In particular, for µ ≡ 0,

X_t = X_0 exp( ∫_0^t σ_s dW_s − ½ ∫_0^t σ_s² ds )

is the solution of the SDE dX_t = X_tσ_t dW_t, and X is called the exponential (local) martingale or the exponential associated with ∫_0^t σ_s dW_s. Exponential martingales will act as Radon–Nikodym densities when changing probability measures on filtered spaces. We will meet those exponentials soon in Section 4.6 below.
In the generalized Vasicek model (also called Hull–White model) the so-called short rate (r_t) is modeled by the mean-reverting SDE

dr_t = λ(t)(ϑ(t) − r_t) dt + σ(t) dW_t.   (4.22)
Proposition 4.8. The unique solution (r_t) of the SDE (4.22) is given by

r_t = A(t)^{−1} [ r_0 + ∫_0^t A(s)λ(s)ϑ(s) ds + ∫_0^t A(s)σ(s) dW_s ],   (4.23)

where A(t) = exp( ∫_0^t λ(s) ds ).
Proof: Use Ito’s formula. Hint: start calculating the dynamics of the process rtA(t).
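A quick consistency check of (4.23) (our own helper names): for constant λ, ϑ and σ ≡ 0 the formula must reduce to the solution r_t = ϑ + (r_0 − ϑ)e^{−λt} of the mean-reverting ODE:

```python
import math

def vasicek_deterministic(r0, lam, theta, t):
    # formula (4.23) with sigma = 0 and constant lambda, theta:
    # A(t) = exp(lam t),  int_0^t A(s) lam theta ds = theta (A(t) - 1)
    A = math.exp(lam * t)
    return (r0 + theta * (A - 1.0)) / A
```

This indeed equals ϑ + (r_0 − ϑ)e^{−λt}: the rate is pulled from r_0 toward the level ϑ at speed λ.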
Clearly, when switching from a measure P to another measure Q, a Wiener process W under the probability P will in general no longer be a Wiener process w.r.t. Q. Girsanov’s theorem identifies the drift that W acquires under the new measure. Define the exponential

Z_t = exp( ∫_0^t H_s dW_s − ½ ∫_0^t (H_s)² ds ),   t ≤ T,   (4.24)

where the integrand H is such that the Ito integral is well-defined. Define a new probability measure Q by

dQ = Z_T dP.   (4.25)

Theorem 4.9. [Girsanov] Suppose that

E_P Z_T = 1.   (4.26)

Then the process

W̃_t = W_t − ∫_0^t H_s ds,   t ≤ T,

is a Q–Brownian motion.
Proof: . . .
The condition (4.26) is not easy to verify. There is a sufficient condition for (4.26) to hold
true due to Novikov:
(Novikov condition)

E_P exp( ½ ∫_0^T H_s² ds ) < ∞   ⇒   E_P Z_T = 1.   (4.28)
As an application we use Girsanov’s theorem to generalize Proposition 3.9 and to derive the
joint distribution of the process and its running maximum for a Brownian motion with drift µ
Xt = Wt + µt.
Proposition 4.10. The joint distribution of (X_T, M_T^X) possesses a density that is given by

f_{(X_T, M_T^X)}(x, y) = (2(2y − x) / (T√(2πT))) e^{µx − µ²T/2} e^{−(2y−x)²/(2T)},   x ≤ y, y > 0.   (4.29)
Proof. . . .
We can use this result to get the explicit valuation formula for an up-and-out call (barrier
option) with strike E and up-and-out barrier B in the Black & Scholes model
E_Q exp(−rT)(S_T − E)^+ 1_{{M_T^S ≤ B}},

where

S_t = S_0 exp( rt + σW_t − ½σ²t ).
See the Exercises.
According to no-arbitrage pricing theory the price of a payoff is given by the expectation under
an appropriate probability measure.
Given that the underlying security price processes are Markovian, it turns out that expectations
are solutions to certain partial differential equations (PDE). This is the key to using methods of
PDE solving to calculate prices.
Consider the process X being the solution of the stochastic differential equation

dX_t = σ(t, X_t) dW_t + µ(t, X_t) dt.   (4.30)
Under relatively mild conditions on the functions σ(t, x), µ(t, x) one can show existence and
uniqueness of the solution.
Theorem 4.11. The solution X of equation (4.30) is a Markov process, i.e., for every bounded Borel function h : R → R and t ≤ T it holds that

E(h(X_T) | F_t) = E(h(X_T) | X_t).
Proof: We give only an intuitive argument why the solution is Markov, a detailed proof is quite
technical. Let t = T − δ for some small δ . Then
X_T = X_{T−δ} + ∫_{T−δ}^T σ(s, X_s) dW_s + ∫_{T−δ}^T µ(s, X_s) ds.
Now applying Proposition 2.13 (iii), using that XT −δ is FT −δ –measurable and that (WT −WT −δ )
is independent of FT −δ , we get
E(h(X_T) | F_{T−δ})

≈ E( h( X_{T−δ} + σ(T−δ, X_{T−δ})(W_T − W_{T−δ}) + µ(T−δ, X_{T−δ}) δ ) | F_{T−δ} )

≈ ∫_R h( X_{T−δ} + σ(T−δ, X_{T−δ}) z + µ(T−δ, X_{T−δ}) δ ) n_{0,δ}(z) dz,
where n_{0,δ}(z) is the density of the normal distribution with expectation zero and variance δ. The expression on the right hand side is obviously σ(X_{T−δ})–measurable. Now repeating the argument from T − δ to T − 2δ and so on, and applying the property of iterated conditional expectations, ”proves” the assertion.
Due to the Markov property, we know that E(h(X_T)|F_t) is a σ(X_t)–measurable random variable which, by Lemma 2.1, can be written as a function of X_t,

E(h(X_T)|F_t) = g(t, X_t).

It is common to write

g(t, x) = E_{t,x}(h(X_T)).
Theorem 4.12. [Feynman–Kac] Let X be the solution of the stochastic differential equation (4.30). Then the function

f(t, x) = E_{t,x}( e^{−∫_t^T r(s,X_s) ds} h(X_T) ),   0 ≤ t ≤ T, x ∈ R,

solves the partial differential equation

f_t(t, x) + µ(t, x) f_x(t, x) + ½ σ²(t, x) f_xx(t, x) = r(t, x) f(t, x),   (4.33)

with terminal condition f(T, x) = h(x).
Proof: We sketch the proof in the case r ≡ 0, where f(t, x) = g(t, x) = E_{t,x}(h(X_T)). First, one can show that g(t, x) is a smooth function, so that we can apply Ito’s formula for Ito processes, Theorem 4.6, to the process g(t, X_t):

dg(t, X_t) = g_t(t, X_t) dt + g_x(t, X_t) µ(t, X_t) dt + g_x(t, X_t) σ(t, X_t) dW_t + ½ g_xx(t, X_t) σ²(t, X_t) dt.
On the other hand, being a conditional expectation of the random variable h(X_T), the process g(t, X_t) = E(h(X_T)|F_t) is a martingale (cf. (3.3)). Therefore, by Lemma 4.4 the ”dt” contribution has to vanish, i.e.,

g_t(t, X_t) + g_x(t, X_t) µ(t, X_t) + ½ g_xx(t, X_t) σ²(t, X_t) = 0,
which ”shows” that g(t, x) has to solve the asserted PDE. The terminal condition is obvious.
As an application we derive the Black & Scholes partial differential equation for the price of an option in the framework of the Black & Scholes model, see Section 3.4. Under the pricing measure Q the stock price S = (S_t) satisfies the SDE

dS_t = rS_t dt + σS_t dW_t,

and the price at time t of an option with payoff h(S_T) is v(t, S_t) with v(t, x) = E_{t,x}(e^{−r(T−t)} h(S_T)). Now applying the Feynman–Kac theorem we get immediately that v(t, x) solves the PDE

v_t(t, x) + rx v_x(t, x) + ½ σ²x² v_xx(t, x) = rv(t, x),   (4.34)
with terminal condition
v(T, x) = h(x).
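The PDE (4.34) can be verified numerically against the classical Black & Scholes call price formula (restated here for reference; the numerical parameters are illustrative). The finite-difference residual of (4.34) should vanish up to discretization error:

```python
import math

def norm_cdf(x):
    # standard normal distribution function Phi(x)
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_call(t, x, T=1.0, K=1.0, r=0.05, sigma=0.2):
    # Black & Scholes price v(t, x) of a call with strike K and maturity T
    tau = T - t
    d1 = (math.log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return x * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(t, x, h=1e-4, r=0.05, sigma=0.2):
    # v_t + r x v_x + 0.5 sigma^2 x^2 v_xx - r v, via central differences
    vt = (bs_call(t + h, x) - bs_call(t - h, x)) / (2.0 * h)
    vx = (bs_call(t, x + h) - bs_call(t, x - h)) / (2.0 * h)
    vxx = (bs_call(t, x + h) - 2.0 * bs_call(t, x) + bs_call(t, x - h)) / h ** 2
    return vt + r * x * vx + 0.5 * sigma ** 2 * x ** 2 * vxx - r * bs_call(t, x)
```

The residual is numerically zero away from maturity, and near maturity the price approaches the payoff (x − K)^+.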
The result of this section is the key to answering the question whether a payoff X_T can be replicated by trading in the underlying securities. Recall from Section 1 that this was our starting point for pricing payoffs by the no-arbitrage argument. Basically, payoffs which cannot be replicated cannot be priced by no-arbitrage.
Theorem 4.13. [Martingale representation] Let W be a Brownian motion with filtration F^W.

(i) Let X_T be an integrable random variable which is F_T^W–measurable. Then there exists an appropriate integrand H = (H_t) such that X_T has the representation

X_T = EX_T + ∫_0^T H_s dW_s.   (4.35)
We now come to the most important application of the previous result in the field of finance. Consider the Black & Scholes model with the underlying stock price S modeled as

dS_t = rS_t dt + σS_t dW_t

under the risk-neutral measure Q, and with the risk-free bank account B given by

dB_t = rB_t dt.

The filtration is generated by the price processes,

F = F^{S,B} = F^S.
We are interested in the pricing of a payoff XT at time T which is an integrable FTS –measurable
random variable. The payoff X_T can be replicated with the strategy θ = (θ_t^S, θ_t^B) if the value generated by the strategy until time T coincides with X_T in all states ω of the world. The value V_t(θ) at time t of the portfolio associated with the strategy is defined as

V_t(θ) = θ_t^S S_t + θ_t^B B_t.

Replication means that

X_T(ω) = V_T(θ)(ω) = θ_T^S(ω) S_T(ω) + θ_T^B(ω) B_T(ω),   ∀ω ∈ Ω.

Moreover, the strategy has to be self-financing, meaning that there are only initial costs V_0(θ):

V_t(θ) = V_0(θ) + ∫_0^t θ_u^S dS_u + ∫_0^t θ_u^B dB_u,   t ≤ T.
Proposition 4.14. In the framework of the Black & Scholes model any integrable payoff XT at
time T can be replicated by a self-financing strategy.
Proof: Because of F^S = F^W we can apply the martingale representation Theorem 4.13 (i) to get

X_T/B_T = E_Q(X_T/B_T) + ∫_0^T H_s dW_s.
Now we have to translate the dW–integral into a dS̃–integral. Let S̃ = S/B; then

dS̃_t = σ S̃_t dW_t,

and thus

X_T/B_T = E_Q(X_T/B_T) + ∫_0^T (H_s/(σS̃_s)) dS̃_s.
Now define

θ_t^S = H_t/(σS̃_t),

θ_t^B = E_Q(X_T/B_T) + ∫_0^t θ_s^S dS̃_s − θ_t^S S̃_t.

The initial costs are

V_0(θ) = θ_0^S S_0 + θ_0^B B_0 = E_Q(X_T/B_T).   (4.37)
The strategy replicates X_T. Indeed,

θ_T^S S_T + θ_T^B B_T = θ_T^S S_T + ( E_Q(X_T/B_T) + ∫_0^T θ_s^S dS̃_s − θ_T^S S̃_T ) B_T = X_T.
It remains to show that θ = (θ_t^S, θ_t^B) is self-financing. Right from the definition of the strategy we obtain

V_t(θ) = θ_t^S S_t + θ_t^B B_t = B_t ( V_0(θ) + ∫_0^t θ_s^S dS̃_s ).
Now, Ito’s formula applied to the Ito process Y_t = V_0(θ) + ∫_0^t θ_s^S dS̃_s = θ_t^B + θ_t^S S̃_t, remembering that B_t = e^{rt}, yields

dV_t(θ) = d(B_tY_t) = d(e^{rt} Y_t)
  = r e^{rt} Y_t dt + e^{rt} θ_t^S dS̃_t
  = (θ_t^B + θ_t^S S̃_t) dB_t + B_t θ_t^S dS̃_t
  = θ_t^B dB_t + θ_t^S dS_t,
where the last step uses Ito’s formula for B_tS̃_t. This shows that

V_t(θ) = V_0(θ) + ∫_0^t θ_u^B dB_u + ∫_0^t θ_u^S dS_u,

i.e., the strategy θ is indeed self-financing.
Knowing that every payoff XT can be replicated, the price of XT equals its costs of
replication, i.e., V0(XT ) = V0(θ). As a side result of the proof, namely equation (4.37), we
have shown the following result, which is the analogue to our result for the Binomial model in
Section 1.3.
Proposition 4.15. The fair price of the integrable payoff XT is given by the expectation
V_0(X_T) = E_Q(X_T/B_T).
The other representation for θ_t^B follows from the PDE (4.34).
The replicating position θ_t^S is called the option delta, which is nothing else but the first derivative (the sensitivity) of the option price v(t, S_t) with respect to the current stock price S_t.
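For a call in the Black & Scholes model the delta has the well-known closed form v_x(t, x) = Φ(d_1). A sketch (our helper names; illustrative parameters) that checks it against a finite difference of the price:

```python
import math

def norm_cdf(x):
    # standard normal distribution function Phi(x)
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_call(t, x, T=1.0, K=1.0, r=0.05, sigma=0.2):
    # Black & Scholes call price v(t, x)
    tau = T - t
    d1 = (math.log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return x * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d1 - sigma * math.sqrt(tau))

def bs_delta(t, x, T=1.0, K=1.0, r=0.05, sigma=0.2):
    # the replicating stock position theta_t^S = v_x(t, x) = Phi(d1)
    tau = T - t
    d1 = (math.log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1)
```

Before maturity the call delta always lies strictly between 0 and 1, so the replicating portfolio never shorts the stock and never holds more than one share per option.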
References
[10] Klebaner, F.C.: Introduction to Stochastic Calculus with Applications, Imperial College Press 2005
[11] Lamberton, D. and Lapeyre, B.: Introduction to Stochastic Calculus Applied to
Finance, Chapman & Hall London, 1996
[12] Øksendal, B.: Stochastic Differential Equations, Springer Berlin 1995
[13] Protter, P.: Stochastic Integration and Differential Equations, Springer Berlin 1990

[14] Royden, H.L.: Real Analysis, Macmillan New York 1988
[15] Shreve, S.E.: Stochastic Calculus for Finance I, II, Springer Berlin 2004