
Stochastic Calculus and Applications to Finance

Ovidiu Calin
Department of Mathematics
Eastern Michigan University
Ypsilanti, MI 48197 USA
ocalin@emich.edu
Preface

Contents

I Stochastic Calculus

1 Basic Notions
1.1 Probability Space
1.2 Sample Space
1.3 Events and Probability
1.4 Random Variables
1.5 Distribution Functions
1.6 Basic Distributions
1.7 Independent Random Variables
1.8 Expectation
1.9 Radon-Nikodym's Theorem
1.10 Conditional Expectation
1.11 Inequalities of Random Variables
1.12 Limits of Sequences of Random Variables
1.13 Properties of Limits
1.14 Stochastic Processes

2 Useful Stochastic Processes
2.1 The Brownian Motion
2.2 Geometric Brownian Motion
2.3 Integrated Brownian Motion
2.4 Exponential Integrated Brownian Motion
2.5 Brownian Bridge
2.6 Brownian Motion with Drift
2.7 Bessel Process
2.8 The Poisson Process
2.8.1 Definition and Properties
2.8.2 Interarrival times
2.8.3 Waiting times
2.8.4 The Integrated Poisson Process
2.8.5 The Fundamental Relation dM_t² = dN_t
2.8.6 The Relations dt dM_t = 0, dW_t dM_t = 0

3 Properties of Stochastic Processes
3.1 Hitting Times
3.2 Limits of Stochastic Processes
3.3 Convergence Theorems
3.3.1 The Martingale Convergence Theorem
3.3.2 The Squeeze Theorem

4 Stochastic Integration
4.0.3 Nonanticipating Processes
4.0.4 Increments of Brownian Motions
4.1 The Ito Integral
4.2 Examples of Ito integrals
4.2.1 The case F_t = c, constant
4.2.2 The case F_t = W_t
4.3 The Fundamental Relation dW_t² = dt
4.4 Properties of the Ito Integral
4.5 The Wiener Integral
4.6 Poisson Integration
4.6.1 A Worked-Out Example: the case F_t = M_t

5 Stochastic Differentiation
5.1 Differentiation Rules
5.2 Basic Rules
5.3 Ito's Formula
5.3.1 Ito's formula for diffusions
5.3.2 Ito's formula for Poisson processes
5.3.3 Ito's multidimensional formula

6 Stochastic Integration Techniques
6.0.4 Fundamental Theorem of Stochastic Calculus
6.0.5 Stochastic Integration by Parts
6.0.6 The Heat Equation Method

7 Stochastic Differential Equations
7.1 Definitions and Examples
7.2 Finding Mean and Variance
7.3 The Integration Technique
7.4 Exact Stochastic Equations
7.5 Integration by Inspection
7.6 Linear Stochastic Equations
7.7 The Method of Variation of Parameters
7.8 Integrating Factors
7.9 Existence and Uniqueness

8 Martingales
8.1 Examples of Martingales
8.2 Girsanov's Theorem

II Applications to Finance

9 Modeling Stochastic Rates
9.1 An Introductory Problem
9.2 Langevin's Equation
9.3 Equilibrium Models
9.4 The Rendleman and Bartter Model
9.4.1 The Vasicek Model
9.4.2 The Cox-Ingersoll-Ross Model
9.5 No-arbitrage Models
9.5.1 The Ho and Lee Model
9.5.2 The Hull and White Model
9.6 Nonstationary Models
9.6.1 Black, Derman and Toy Model
9.6.2 Black and Karasinski Model

10 Modeling Stock Prices
10.1 Constant Drift and Volatility Model
10.2 Time-dependent Drift and Volatility Model
10.3 Models for Stock Price Averages
10.4 Stock Prices with Rare Events
10.5 Modeling other Asset Prices

11 Risk-Neutral Valuation
11.1 The Method of Risk-Neutral Valuation
11.2 Call option
11.3 Cash-or-nothing
11.4 Log-contract
11.5 Power-contract
11.6 Forward contract
11.7 The Superposition Principle
11.8 Call Option
11.9 Asian Forward Contracts
11.10 Asian Options
11.11 Forward Contracts with Rare Events

12 Martingale Measures
12.1 Martingale Measures
12.1.1 Is the stock price S_t a martingale?
12.1.2 Risk-neutral World and Martingale Measure
12.1.3 Finding the Risk-Neutral Measure
12.2 Risk-neutral World Density Functions
12.3 Correlation of Stocks
12.4 The Sharpe Ratio
12.5 Risk-neutral Valuation for Derivatives

13 Black-Scholes Analysis
13.1 Heat Equation
13.2 What is a Portfolio?
13.3 Risk-less Portfolios
13.4 Black-Scholes Equation
13.5 Delta Hedging
13.6 Tradable securities
13.7 Risk-less investment revised
13.8 Solving Black-Scholes
13.9 Black-Scholes and Risk-neutral Valuation
13.10 Boundary Conditions
13.11 Risk-less Portfolios for Rare Events

14 Black-Scholes for Asian Derivatives
14.0.1 Weighted averages
14.1 Setting up the Black-Scholes Equation
14.2 Weighted Average Strike Call Option
14.3 Boundary Conditions
14.4 Asian Forward Contracts on Weighted Averages

Index
Part I

Stochastic Calculus

Chapter 1

Basic Notions

1.1 Probability Space


The modern theory of probability stems from the work of A. N. Kolmogorov published in 1933. Kolmogorov associates a random experiment with a probability space, which is a triplet (Ω, F, P) consisting of the set of outcomes Ω, a σ-field F with Boolean algebra properties, and a probability measure P. In the following, each of these elements will be discussed in more detail.

1.2 Sample Space


A random experiment in the theory of probability is an experiment whose outcomes cannot be determined in advance. Most of the time such experiments are performed only mentally.
When an experiment is performed, the set of all possible outcomes is called the sample space, and we shall denote it by Ω. One can also regard Ω as the set of states of the world, understanding by this all possible states the world might have. For instance, flipping a coin produces the sample space with two states {H, T}, while rolling a die yields a sample space with six states. Picking a number at random between 0 and 1 corresponds to a sample space which is the entire segment (0, 1).
All subsets of the sample space Ω form a set denoted by 2^Ω. The reason for this notation is that the set of parts of Ω can be put into bijective correspondence with the set of binary functions f : Ω → {0, 1}. The number of elements of this set is 2^|Ω|, where |Ω| denotes the cardinality of Ω. If the set is finite, |Ω| = n, then 2^Ω has 2^n elements. If Ω is infinitely countable (i.e. can be put into bijective correspondence with the set of natural numbers), then 2^Ω is infinite and its cardinality is the same as that of the set of real numbers R. The next couple of examples illustrate the set 2^Ω in the finite and infinite cases.

Example 1.2.1 Flip a coin and measure the occurrence of outcomes by 0 and 1: associate a
0 if the outcome does not occur and a 1 if the outcome occurs. We obtain the following four
possible assignments:

{H → 0, T → 0}, {H → 0, T → 1}, {H → 1, T → 0}, {H → 1, T → 1},

so the set of subsets of {H, T} can be represented as 4 sequences of length 2 formed with 0 and 1: (0, 0), (0, 1), (1, 0), (1, 1). These correspond, in order, to Ø, {T}, {H}, {H, T}, which is 2^{H,T}.

Example 1.2.2 Pick a natural number at random. Any subset of the sample space corresponds to a sequence formed with 0 and 1. For instance, the subset {1, 3, 5, 6} corresponds to the sequence 10101100000... having 1 on the 1st, 3rd, 5th and 6th places and 0 elsewhere. It is known that the number of these sequences is infinite and can be put into bijective correspondence with the set of real numbers R. This can also be written as |2^N| = |R|.
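The bijective correspondence between subsets and binary functions is easy to make concrete. The following Python sketch (our illustration, not part of the text) enumerates 2^Ω for Ω = {H, T}, exactly as in Example 1.2.1:

```python
from itertools import product

omega = ["H", "T"]

# Each binary function f : Omega -> {0, 1} is a tuple of 0/1 values;
# the corresponding subset contains the outcomes mapped to 1.
for bits in product([0, 1], repeat=len(omega)):
    subset = {o for o, b in zip(omega, bits) if b == 1}
    print(bits, "->", subset if subset else "Ø")
```

Running it prints the four assignments (0, 0), (0, 1), (1, 0), (1, 1) together with the subsets Ø, {T}, {H}, {H, T}, confirming |2^Ω| = 2^|Ω| = 4.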

1.3 Events and Probability


The set 2^Ω has the following obvious properties:

1. It contains the empty set Ø;

2. If it contains a set A, then it also contains its complement Ā = Ω\A;

3. It is closed under unions, i.e., if A₁, A₂, ... is a sequence of sets, then their union A₁ ∪ A₂ ∪ ··· also belongs to 2^Ω.

Any subset F of 2^Ω that satisfies the previous three properties is called a σ-field. The sets belonging to F are called events. This way, the complement of an event, or the union of events, is also an event. We say that an event occurs if the outcome of the experiment is an element of that subset.

The chance of occurrence of an event is measured by a probability function P : F → [0, 1]


which satisfies the following two properties

1. P (Ω) = 1;

2. For any mutually disjoint events A1 , A2 , · · · ∈ F ,

P (A1 ∪ A2 ∪ · · · ) = P (A1 ) + P (A2 ) + · · · .

The triplet (Ω, F, P ) is called a probability space. This is the main setup in which the
probability theory works.

Example 1.3.1 In the case of flipping a coin, the probability space has the following elements: Ω = {H, T}, F = {Ø, {H}, {T}, {H, T}} and P defined by P(Ø) = 0, P({H}) = 1/2, P({T}) = 1/2, P({H, T}) = 1.

Example 1.3.2 Consider a finite sample space Ω = {s₁, ..., s_n}, with the σ-field F = 2^Ω, and probability given by P(A) = |A|/n, ∀A ∈ F. Then (Ω, 2^Ω, P) is called the classical probability space.
Figure 1.1: If any pullback X^{-1}((a, b)) is known, then the random variable X : Ω → R is 2^Ω-measurable.

1.4 Random Variables


Since the σ-field F provides the knowledge about which events are possible on the considered probability space, F can be regarded as the information component of the probability space (Ω, F, P). A random variable X is a function that assigns a numerical value to each state of the world, X : Ω → R, such that the values taken by X are known to someone who has access to the information F. More precisely, given any two numbers a, b ∈ R, the states of the world for which X takes values between a and b form a set that is an event (an element of F), i.e.

{ω ∈ Ω; a < X(ω) < b} ∈ F.

Another way of saying this is that X is an F-measurable function. It is worth noting that in the case of the classical probability space the knowledge is maximal, since F = 2^Ω, and hence the measurability of random variables is automatically satisfied. From now on, instead of measurable we shall use the more suggestive word predictable. This will make more sense in the sequel, when we introduce conditional expectations.

Example 1.4.1 Consider the experiment of flipping three coins. In this case Ω is the set of
all possible triplets. Consider the random variable X which gives the number of tails obtained.
For instance X(HHH) = 0, X(HHT ) = 1, etc. The sets

{ω; X(ω) = 0} = {HHH}, {ω; X(ω) = 1} = {HHT, HT H, T HH},


{ω; X(ω) = 3} = {T T T }, {ω; X(ω) = 2} = {HT T, T HT, T T H}

obviously belong to 2^Ω, and hence X is a random variable.

Example 1.4.2 A graph is a set of elements, called nodes, and a set of unordered pairs of
nodes, called edges. Consider the set of nodes N = {n1 , n2 , . . . , nk } and the set of edges
E = {(n1 , n2 ), . . . , (ni , nj ), . . . , (nk−1 , nk )}. Define the probability space (Ω, F, P ), where
◦ the sample space is Ω = N ∪ E (the complete graph);
◦ the σ-field F is the set of all subgraphs of Ω;

◦ the probability is given by P (G) = n(G)/k, where n(G) is the number of nodes of the
graph G.
As an example of a random variable we consider Y : F → R, Y (G) = the total number of edges
of the graph G. Since given F, one can count the total number of edges of each subgraph, it
follows that Y is F-measurable, and hence it is a random variable.

1.5 Distribution Functions


Let X be a random variable on the probability space (Ω, F, P ). The distribution function of X
is the function FX : R → [0, 1] defined by

FX (x) = P (ω; X(ω) ≤ x).

The distribution function is non-decreasing and satisfies the limits

lim_{x→−∞} F_X(x) = 0,   lim_{x→+∞} F_X(x) = 1.

If we have

(d/dx) F_X(x) = p(x),

then we say that p(x) is the probability density function of X. A useful property which follows from the Fundamental Theorem of Calculus is

P(a < X < b) = P(ω; a < X(ω) < b) = ∫_a^b p(x) dx.

In the case of discrete random variables the aforementioned integral is replaced by the sum

P(a < X < b) = Σ_{a<x<b} P(X = x).

1.6 Basic Distributions


We shall recall a few basic distributions, which are most often seen in applications.

Normal distribution. A random variable X is said to have a normal distribution if its probability density function is given by

p(x) = 1/(σ√(2π)) e^{−(x−µ)²/(2σ²)},

with µ and σ > 0 constant parameters, see Fig. 1.2a. The mean and variance are given by

E[X] = µ,   Var[X] = σ².

Log-normal distribution. Let X be normally distributed with mean µ and variance σ². Then the random variable Y = e^X is said to be log-normally distributed. The mean and variance of Y are given by

E[Y] = e^{µ+σ²/2},
Var[Y] = e^{2µ+σ²}(e^{σ²} − 1).
Figure 1.2: a Normal distribution; b Log-normal distribution; c Gamma distributions (α = 3, β = 2 and α = 4, β = 3); d Beta distributions (α = 3, β = 9 and α = 8, β = 3).

The density function of the log-normally distributed random variable Y is given by

p(x) = 1/(xσ√(2π)) e^{−(ln x − µ)²/(2σ²)},   x > 0,

see Fig. 1.2b.
Exercise 1.6.1 Given that the moment generating function of a normally distributed random variable X ∼ N(µ, σ²) is m(t) = E[e^{tX}] = e^{µt + t²σ²/2}, show that
(a) E[Yⁿ] = e^{nµ + n²σ²/2}, where Y = e^X;
(b) the mean and variance of the log-normal random variable Y = e^X are

E[Y] = e^{µ+σ²/2},   Var[Y] = e^{2µ+σ²}(e^{σ²} − 1).

Gamma distribution. A random variable X is said to have a gamma distribution with parameters α > 0, β > 0 if its density function is given by

p(x) = x^{α−1} e^{−x/β} / (β^α Γ(α)),   x ≥ 0,

where Γ(α) denotes the gamma function, see Fig. 1.2c. The mean and variance are

E[X] = αβ,   Var[X] = αβ².

The case α = 1 is known as the exponential distribution, see Fig. 1.3a. In this case

p(x) = (1/β) e^{−x/β},   x > 0.
Figure 1.3: a Exponential distribution (β = 3); b Poisson distribution (λ = 15, 0 < k < 30).

The particular case when α = n/2 and β = 2 becomes the χ²-distribution with n degrees of freedom. This characterizes the sum of squares of n independent standard normal random variables.
Beta distribution. A random variable X is said to have a beta distribution with parameters α > 0, β > 0 if its probability density function is of the form

p(x) = x^{α−1}(1 − x)^{β−1} / B(α, β),   0 ≤ x ≤ 1,

where B(α, β) denotes the beta function. See Fig. 1.2d for two particular density functions. In this case

E[X] = α/(α + β),   Var[X] = αβ / ((α + β)²(α + β + 1)).

Poisson distribution. A discrete random variable X is said to have a Poisson probability distribution if

P(X = k) = (λ^k / k!) e^{−λ},   k = 0, 1, 2, ...,

with λ > 0 a parameter, see Fig. 1.3b. In this case E[X] = λ and Var[X] = λ.
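These closed-form means and variances are easy to sanity-check numerically. The following Python sketch (our illustration, with arbitrarily chosen parameter values) compares them against the values computed by scipy.stats; note that scipy's gamma distribution uses a scale parameter, which plays the role of β above:

```python
# Numerical check of the stated means and variances using scipy.stats.
from scipy import stats
import numpy as np

mu, sigma = 0.5, 1.2
alpha, beta = 3.0, 2.0
lam = 15.0

checks = {
    "normal":    (stats.norm(mu, sigma), (mu, sigma**2)),
    "lognormal": (stats.lognorm(sigma, scale=np.exp(mu)),
                  (np.exp(mu + sigma**2 / 2),
                   np.exp(2*mu + sigma**2) * (np.exp(sigma**2) - 1))),
    "gamma":     (stats.gamma(alpha, scale=beta), (alpha*beta, alpha*beta**2)),
    "beta":      (stats.beta(alpha, beta),
                  (alpha / (alpha + beta),
                   alpha*beta / ((alpha + beta)**2 * (alpha + beta + 1)))),
    "poisson":   (stats.poisson(lam), (lam, lam)),
}

for name, (dist, (m, v)) in checks.items():
    assert np.isclose(dist.mean(), m) and np.isclose(dist.var(), v)
    print(f"{name:9s} mean={dist.mean():.4f} var={dist.var():.4f}")
```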

1.7 Independent Random Variables


Roughly speaking, two random variables X and Y are independent if the occurrence of one of them does not change the probability density of the other. More precisely, if for any Borel sets A, B ⊂ R the events

{ω; X(ω) ∈ A},   {ω; Y(ω) ∈ B}

are independent, then X and Y are called independent random variables.

Proposition 1.7.1 Let X and Y be independent random variables with probability density functions p_X(x) and p_Y(y). Then the pair (X, Y) has the joint probability density function p_{X,Y}(x, y) = p_X(x) p_Y(y).
Proof: Let p_{X,Y}(x, y) be the joint probability density of (X, Y). Using the independence of the events we have

p_{X,Y}(x, y) dxdy = P(x < X < x + dx, y < Y < y + dy)
                  = P(x < X < x + dx) P(y < Y < y + dy)
                  = p_X(x) dx p_Y(y) dy
                  = p_X(x) p_Y(y) dxdy.

Dropping the factor dxdy yields the desired result.

1.8 Expectation

A random variable X : Ω → R is called integrable if

∫_Ω |X(ω)| dP(ω) = ∫_R |x| p(x) dx < ∞.

The expectation of an integrable random variable X is defined by

E[X] = ∫_Ω X(ω) dP(ω) = ∫_R x p(x) dx,

where p(x) denotes the probability density function of X. Customarily, the expectation of X is denoted by µ and is also called the mean. In general, for any continuous¹ function h : R → R, we have

E[h(X)] = ∫_Ω h(X(ω)) dP(ω) = ∫_R h(x) p(x) dx.

Proposition 1.8.1 The expectation operator E is linear, i.e. for any integrable random variables X and Y:
1. E[cX] = cE[X], ∀c ∈ R;
2. E[X + Y] = E[X] + E[Y].

Proof: It follows from the fact that the integral is a linear operator.

Proposition 1.8.2 Let X and Y be two independent integrable random variables. Then

E[XY] = E[X]E[Y].

Proof: This is a variant of Fubini's theorem. Let p_X, p_Y, p_{X,Y} denote the probability densities of X, of Y, and of the pair (X, Y), respectively. Since X and Y are independent, by Proposition 1.7.1 we have p_{X,Y} = p_X p_Y. Then

E[XY] = ∫∫ x y p_{X,Y}(x, y) dxdy = ∫ x p_X(x) dx ∫ y p_Y(y) dy = E[X]E[Y].

1 in general, measurable
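Proposition 1.8.2 is easy to test by simulation. A minimal sketch of ours (not from the text), using independent normal and exponential samples with arbitrary parameters:

```python
# Monte Carlo check of E[XY] = E[X]E[Y] for independent X, Y.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
X = rng.normal(loc=2.0, scale=1.0, size=n)   # E[X] = 2
Y = rng.exponential(scale=3.0, size=n)       # E[Y] = 3

print(np.mean(X * Y))            # ~ 6.0
print(np.mean(X) * np.mean(Y))   # ~ 6.0
```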

1.9 Radon-Nikodym's Theorem

This section is concerned with existence and uniqueness results that will be useful later in defining conditional expectations. Since this section is rather theoretical, it can be skipped at a first reading.

Proposition 1.9.1 Consider the probability space (Ω, F, P), and let G be a σ-field included in F. If X is a G-predictable random variable such that

∫_A X dP = 0   ∀A ∈ G,

then X = 0 a.s.

Proof: In order to show that X = 0 almost surely, it suffices to prove that P(ω; X(ω) = 0) = 1. We shall show first that X takes values as small as possible with probability one, i.e. ∀ε > 0 we have P(|X| < ε) = 1. To do this, let A = {ω; X(ω) ≥ ε}. Then

0 ≤ P(X ≥ ε) = ∫_A dP = (1/ε) ∫_A ε dP ≤ (1/ε) ∫_A X dP = 0,

and hence P(X ≥ ε) = 0. Similarly P(X ≤ −ε) = 0. Therefore

P(|X| < ε) = 1 − P(X ≥ ε) − P(X ≤ −ε) = 1 − 0 − 0 = 1.

Taking ε → 0 leads to P(|X| = 0) = 1. This can be formalized as follows. Let ε = 1/n and consider B_n = {ω; |X(ω)| ≤ 1/n}, with P(B_n) = 1. Then

P(X = 0) = P(|X| = 0) = P(∩_{n≥1} B_n) = lim_{n→∞} P(B_n) = 1.

Corollary 1.9.2 If X and Y are G-predictable random variables such that

∫_A X dP = ∫_A Y dP   ∀A ∈ G,

then X = Y a.s.

Proof: Since ∫_A (X − Y) dP = 0, ∀A ∈ G, by Proposition 1.9.1 we have X − Y = 0 a.s.

Theorem 1.9.3 (Radon-Nikodym) Let (Ω, F, P) be a probability space and G be a σ-field included in F. Then for any integrable random variable X there is a G-predictable random variable Y such that

∫_A X dP = ∫_A Y dP,   ∀A ∈ G.   (1.9.1)

We shall omit the proof but discuss a few aspects.

1. All σ-fields G ⊂ F contain the impossible and certain events Ø, Ω ∈ G. Making A = Ω yields

∫_Ω X dP = ∫_Ω Y dP,

which is E[X] = E[Y].


2. Radon-Nikodym's theorem states the existence of Y. In fact Y is unique almost surely. In order to show this, assume there are two G-predictable random variables Y₁ and Y₂ with the aforementioned property. Then (1.9.1) yields

∫_A Y₁ dP = ∫_A Y₂ dP,   ∀A ∈ G.

Applying Corollary 1.9.2 yields Y₁ = Y₂ a.s.

1.10 Conditional Expectation


Let X be a random variable on the probability space (Ω, F, P). Let G be a σ-field contained in F. Since X is F-predictable, the expectation of X given the information F must be X itself; this is written as E[X|F] = X. It is natural to ask what the expectation of X is, given the information G. This is a random variable denoted by E[X|G] satisfying the following properties:
1. E[X|G] is G-predictable;
2. ∫_A E[X|G] dP = ∫_A X dP,   ∀A ∈ G.
E[X|G] is called the conditional expectation of X given G.
We owe a few explanations regarding the correctness of the aforementioned definition. The existence of the G-predictable random variable E[X|G] is assured by the Radon-Nikodym theorem. The almost sure uniqueness is an application of Proposition 1.9.1 (see the discussion at point 2 of Section 1.9).
It is worth noting that the expectation of X, denoted by E[X], is a number, while the conditional expectation E[X|G] is a random variable. When are they equal, and what is their relationship? The answer is inferred from the following solved exercises.
Exercise 1.10.1 Show that if G = {Ø, Ω}, then E[X|G] = E[X].
Proof: We need to show that the constant E[X] satisfies conditions 1 and 2. The first one is obviously satisfied, since any constant is G-predictable. The latter condition is checked on each set of G. We have

∫_Ω X dP = E[X] = ∫_Ω E[X] dP,
∫_Ø X dP = 0 = ∫_Ø E[X] dP.

Exercise 1.10.2 Show that E[E[X|G]] = E[X], i.e. all conditional expectations have the same mean, which is the mean of X.
Proof: Using the definition of expectation and taking A = Ω in the second relation of the aforementioned definition yields

E[E[X|G]] = ∫_Ω E[X|G] dP = ∫_Ω X dP = E[X],

which ends the proof.

Exercise 1.10.3 The conditional expectation of X given the total information F is the random variable X itself, i.e.

E[X|F] = X.

Proof: The random variables X and E[X|F] are both F-predictable (the former from the definition of a random variable). From the definition of the conditional expectation we have

∫_A E[X|F] dP = ∫_A X dP,   ∀A ∈ F.

Corollary 1.9.2 implies that E[X|F] = X almost surely.


General properties of the conditional expectation are stated below without proof. The proof involves more or less simple manipulations of integrals and can be taken as an exercise for the reader.

Proposition 1.10.4 Let X and Y be two random variables on the probability space (Ω, F, P). We have
1. Linearity:
E[aX + bY|G] = aE[X|G] + bE[Y|G],   ∀a, b ∈ R;
2. Factoring out the predictable part:
E[XY|G] = X E[Y|G]
if X is G-predictable. In particular, E[X|G] = X.
3. Tower property:
E[E[X|G]|H] = E[X|H],   if H ⊂ G;
4. Positivity:
E[X|G] ≥ 0,   if X ≥ 0;
5. Expectation of a constant is a constant:
E[c|G] = c.
6. An independent condition drops out:
E[X|G] = E[X],
if X is independent of G.
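For a finite Ω, the conditional expectation with respect to a σ-field generated by a partition is simply a block-wise average, which makes these properties easy to verify numerically. A small illustrative sketch of ours, with an arbitrary X and partition:

```python
# Conditional expectation on a finite probability space: G is generated
# by a partition of Omega, and E[X|G] averages X over each block.
import numpy as np

p = np.full(6, 1/6)                     # uniform P on Omega = {0,...,5}
X = np.array([3.0, -1.0, 2.0, 5.0, 0.0, 4.0])
partition = [[0, 1], [2, 3, 4], [5]]    # blocks generating G

EXG = np.empty_like(X)
for block in partition:
    # block-wise weighted average of X
    EXG[block] = np.dot(p[block], X[block]) / p[block].sum()

print(EXG)                              # G-predictable: constant on blocks
print(np.dot(p, EXG), np.dot(p, X))     # E[E[X|G]] == E[X], as in Ex. 1.10.2
```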

1.11 Inequalities of Random Variables


This section prepares the reader for the limits of sequences of random variables and limits of stochastic processes. We shall start with a classical inequality result regarding expectations.

Theorem 1.11.1 (Jensen's inequality) Let ϕ : R → R be a convex function and let X be an integrable random variable on the probability space (Ω, F, P). If ϕ(X) is integrable, then

ϕ(E[X]) ≤ E[ϕ(X)].

Figure 1.4: Jensen’s inequality ϕ(E[X]) < E[ϕ(X)] for a convex function ϕ.

Proof: Let µ = E[X]. Expanding ϕ in a Taylor series about µ yields

ϕ(x) = ϕ(µ) + ϕ′(µ)(x − µ) + (1/2) ϕ″(ξ)(x − µ)²,

with ξ between x and µ. Since ϕ is convex, ϕ″ ≥ 0, and hence

ϕ(x) ≥ ϕ(µ) + ϕ′(µ)(x − µ),

which means the graph of ϕ lies above its tangent line at the point (µ, ϕ(µ)). Replacing x by the random variable X and taking the expectation yields

E[ϕ(X)] ≥ E[ϕ(µ) + ϕ′(µ)(X − µ)] = ϕ(µ) + ϕ′(µ)(E[X] − µ) = ϕ(µ) = ϕ(E[X]),

which proves the result.


Fig. 1.4 provides a graphical interpretation of Jensen's inequality. If the distribution of X is symmetric, then the distribution of ϕ(X) is skewed, with ϕ(E[X]) < E[ϕ(X)].
It is worth noting that the inequality is reversed for ϕ concave. We shall present next a couple of applications.
A random variable X : Ω → R is called square integrable if

E[X²] = ∫_Ω |X(ω)|² dP(ω) = ∫_R x² p(x) dx < ∞.

Application 1.11.2 If X is a square integrable random variable, then it is integrable.

Proof: Jensen's inequality with ϕ(x) = x², applied to |X|, becomes

E[|X|]² ≤ E[X²].

Since the right side is finite, it follows that E[|X|] < ∞, so X is integrable.

Application 1.11.3 If m_X(t) denotes the moment generating function of the random variable X with mean µ, then

m_X(t) ≥ e^{tµ}.

Proof: Applying Jensen's inequality with the convex function ϕ(x) = e^x yields

e^{E[X]} ≤ E[e^X].

Substituting tX for X yields

e^{E[tX]} ≤ E[e^{tX}].   (1.11.2)

Using the definition of the moment generating function m_X(t) = E[e^{tX}] and the fact that E[tX] = tE[X] = tµ, (1.11.2) leads to the desired inequality.
The variance of a square integrable random variable X is defined by

Var(X) = E[X²] − E[X]².

By Application 1.11.2 we have Var(X) ≥ 0, so there is a constant σ_X ≥ 0, called the standard deviation, such that

σ_X² = Var(X).

Exercise 1.11.4 Prove that a non-constant random variable has a non-zero standard deviation.

Exercise 1.11.5 Prove the following identity:

Var[X] = E[(X − E[X])²].

Theorem 1.11.6 (Markov's inequality) For any λ > 0 and p > 0,

P(ω; |X(ω)| ≥ λ) ≤ E[|X|^p] / λ^p.

Proof: Let A = {ω; |X(ω)| ≥ λ}. Then

E[|X|^p] = ∫ |x|^p p(x) dx ≥ ∫_{|x|≥λ} |x|^p p(x) dx ≥ λ^p ∫_{|x|≥λ} p(x) dx = λ^p P(A) = λ^p P(|X| ≥ λ).

Dividing by λ^p leads to the desired result.

Theorem 1.11.7 (Tchebychev's inequality) If X is a random variable with mean µ and variance σ², then

P(ω; |X(ω) − µ| ≥ λ) ≤ σ²/λ².

Proof: Let A = {ω; |X(ω) − µ| ≥ λ}. Then

σ² = Var(X) = E[(X − µ)²] = ∫ (x − µ)² p(x) dx ≥ ∫_{|x−µ|≥λ} (x − µ)² p(x) dx ≥ λ² ∫_{|x−µ|≥λ} p(x) dx = λ² P(A) = λ² P(ω; |X(ω) − µ| ≥ λ).

Dividing by λ² leads to the desired inequality.

Theorem 1.11.8 (Chernoff bounds) Let X be a random variable. Then for any λ > 0 we have

1. P(X ≥ λ) ≤ E[e^{tX}] / e^{λt},   ∀t > 0;
2. P(X ≤ λ) ≤ E[e^{tX}] / e^{λt},   ∀t < 0.

Proof: 1. Let t > 0 and denote Y = e^{tX}. By Markov's inequality

P(Y ≥ e^{λt}) ≤ E[Y] / e^{λt}.

Then we have

P(X ≥ λ) = P(tX ≥ λt) = P(e^{tX} ≥ e^{λt}) = P(Y ≥ e^{λt}) ≤ E[Y]/e^{λt} = E[e^{tX}]/e^{λt}.

2. The case t < 0 is similar.
In the following we shall present an application of the Chernoff bounds for normally distributed random variables.
Let X be a random variable normally distributed with mean µ and variance σ². It is known that its moment generating function is given by

m(t) = E[e^{tX}] = e^{µt + t²σ²/2}.

Using the first Chernoff bound we obtain

P(X ≥ λ) ≤ m(t)/e^{λt} = e^{(µ−λ)t + t²σ²/2},   ∀t > 0,

which implies

P(X ≥ λ) ≤ e^{min_{t>0} [(µ−λ)t + t²σ²/2]}.

It is easy to see that, for λ > µ, the quadratic function f(t) = (µ − λ)t + t²σ²/2 attains its minimum at t = (λ − µ)/σ² > 0, with

min_{t>0} f(t) = f((λ − µ)/σ²) = −(λ − µ)²/(2σ²).

Substituting in the previous formula, we obtain the following result:

Proposition 1.11.9 If X is a normally distributed variable, X ∼ N(µ, σ²), then for any λ > µ

P(X ≥ λ) ≤ e^{−(λ−µ)²/(2σ²)}.
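A quick numerical comparison of this bound against the exact Gaussian tail probability (a sketch of ours, not from the text):

```python
# Compare the Chernoff bound exp(-(lam-mu)^2/(2 sigma^2)) with the
# exact normal tail P(X >= lam) for a few thresholds.
import numpy as np
from scipy.stats import norm

mu, sigma = 0.0, 1.0
for lam in [1.0, 2.0, 3.0]:
    exact = norm.sf(lam, loc=mu, scale=sigma)        # P(X >= lam)
    bound = np.exp(-(lam - mu)**2 / (2 * sigma**2))  # Chernoff bound
    print(f"lam={lam}: exact={exact:.5f} <= bound={bound:.5f}")
```

The bound is loose for small λ but captures the correct Gaussian decay rate e^{−λ²/2}.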

Example 1.11.1 Let X be a Poisson random variable with mean λ > 0.
1. Show that the moment generating function of X is m(t) = e^{λ(e^t − 1)};
2. Use a Chernoff bound to show that

P(X ≥ k) ≤ e^{λ(e^t − 1) − tk},   t > 0.

Markov's, Tchebychev's and Chernoff's inequalities will be useful later when computing limits of random variables.
The next inequality is called Tchebychev's inequality for monotone sequences of numbers.

Lemma 1.11.10 Let (a_i) and (b_i) be two sequences of real numbers such that either

a₁ ≤ a₂ ≤ ··· ≤ a_n,   b₁ ≤ b₂ ≤ ··· ≤ b_n

or

a₁ ≥ a₂ ≥ ··· ≥ a_n,   b₁ ≥ b₂ ≥ ··· ≥ b_n.

If (λ_i) is a sequence of non-negative numbers such that Σ_{i=1}^n λ_i = 1, then

(Σ_{i=1}^n λ_i a_i)(Σ_{i=1}^n λ_i b_i) ≤ Σ_{i=1}^n λ_i a_i b_i.

Proof: Since the sequences (a_i) and (b_i) are either both increasing or both decreasing,

(a_i − a_j)(b_i − b_j) ≥ 0.

Multiplying by the non-negative quantity λ_i λ_j and summing over i and j, we get

Σ_{i,j} λ_i λ_j (a_i − a_j)(b_i − b_j) ≥ 0.

Expanding yields

(Σ_j λ_j)(Σ_i λ_i a_i b_i) − (Σ_i λ_i a_i)(Σ_j λ_j b_j) − (Σ_j λ_j a_j)(Σ_i λ_i b_i) + (Σ_i λ_i)(Σ_j λ_j a_j b_j) ≥ 0.

Using Σ_j λ_j = 1, the expression becomes

Σ_i λ_i a_i b_i ≥ (Σ_i λ_i a_i)(Σ_j λ_j b_j),

which ends the proof.

Next we present a meaningful application of the previous inequality.

Proposition 1.11.11 Let X be a random variable and f and g be two functions, both increasing or both decreasing. Then

E[f(X)g(X)] ≥ E[f(X)]E[g(X)].   (1.11.3)

Proof: If X is a discrete random variable with outcomes {x₁, ..., x_n}, inequality (1.11.3) becomes

Σ_j f(x_j)g(x_j)p(x_j) ≥ Σ_j f(x_j)p(x_j) Σ_j g(x_j)p(x_j),

where p(x_j) = P(X = x_j). Denoting a_j = f(x_j), b_j = g(x_j), and λ_j = p(x_j), the inequality transforms into

Σ_j a_j b_j λ_j ≥ (Σ_j a_j λ_j)(Σ_j b_j λ_j),

which holds true by Lemma 1.11.10.

If X is a continuous random variable with density function p : I → R, inequality (1.11.3) can be written in the integral form

∫_I f(x)g(x)p(x) dx ≥ ∫_I f(x)p(x) dx ∫_I g(x)p(x) dx.   (1.11.4)

Let x₀ < x₁ < ··· < x_n be a partition of the interval I, with Δx = x_{k+1} − x_k. Using Lemma 1.11.10 we obtain the following inequality between Riemann sums:

Σ_j f(x_j)g(x_j)p(x_j)Δx ≥ (Σ_j f(x_j)p(x_j)Δx)(Σ_j g(x_j)p(x_j)Δx),

where a_j = f(x_j), b_j = g(x_j), and λ_j = p(x_j)Δx. Taking the limit ‖Δx‖ → 0 we obtain (1.11.4), which leads to the desired result.
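Inequality (1.11.3) can also be checked by simulation. A small sketch of ours, with two increasing functions of a standard normal sample:

```python
# Monte Carlo check of E[f(X)g(X)] >= E[f(X)]E[g(X)] for increasing f, g.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=500_000)

f = np.tanh(X)   # increasing function of X
g = X**3         # increasing function of X

lhs = np.mean(f * g)
rhs = np.mean(f) * np.mean(g)
print(lhs, ">=", rhs)   # lhs exceeds rhs, as the proposition predicts
```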

Exercise 1.11.12 Let f and g be two differentiable functions such that f′(x)g′(x) > 0, x ∈ I. Then

E[f(X)g(X)] ≥ E[f(X)]E[g(X)],

for any random variable X with values in I.

Exercise 1.11.13 Use Exercise 1.11.12 to show the following inequalities:

(a) E[X²] ≥ E[X]²;
(b) E[X² cosh(X)] ≥ E[X²]E[cosh(X)];
(c) E[X sinh(X)] ≥ E[X]E[sinh(X)];
(d) E[X⁶] ≥ E[X²]E[X⁴];
(e) E[X⁶] ≥ E[X]E[X⁵];
(f) E[X⁶] ≥ E[X³]².

Exercise 1.11.14 For any n, k ≥ 1, k ≤ 2n, show that

E[X^{2n}] ≥ E[X^k]E[X^{2n−k}].

Can you prove a similar inequality for E[X^{2n+1}]?

1.12 Limits of Sequences of Random Variables


Consider a sequence (X_n)_{n≥1} of random variables defined on the probability space (Ω, F, P). There are several ways of making sense of the limit expression X = lim_{n→∞} X_n, and they will be discussed in the following.

Almost Certain Limit

The sequence X_n converges almost certainly to X if, for all states of the world ω except a set of probability zero, we have

lim_{n→∞} X_n(ω) = X(ω).

More precisely, this means

P(ω; lim_{n→∞} X_n(ω) = X(ω)) = 1,

and we shall write ac-lim_{n→∞} X_n = X. An important example where this type of limit occurs is the Strong Law of Large Numbers:

If X_n is a sequence of independent and identically distributed random variables with the same mean µ, then ac-lim_{n→∞} (X₁ + ··· + X_n)/n = µ.

It is worth noting that this type of convergence is also known under the name of strong convergence. This is the reason why the aforementioned theorem bears its name.
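The Strong Law is easy to visualize by simulation: along (almost) every sample path, the running average settles down to µ. A short illustrative sketch of ours:

```python
# Running averages of iid exponential(mean=2) samples along a few paths;
# by the Strong Law each path's average converges to mu = 2.
import numpy as np

rng = np.random.default_rng(2)
mu, n_steps, n_paths = 2.0, 100_000, 3

for path in range(n_paths):
    samples = rng.exponential(scale=mu, size=n_steps)
    running_avg = np.cumsum(samples) / np.arange(1, n_steps + 1)
    print(f"path {path}: average after {n_steps} steps = {running_avg[-1]:.4f}")
```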

Mean Square Limit

Another possibility of convergence is to look at the mean square deviation of X_n from X. We say that X_n converges to X in the mean square if

lim_{n→∞} E[(X_n − X)²] = 0.

More precisely, this should be interpreted as

lim_{n→∞} ∫ (X_n(ω) − X(ω))² dP(ω) = 0.

This limit will be abbreviated by ms-lim_{n→∞} X_n = X. The mean square convergence is useful when defining the Ito integral.

Example 1.12.1 Consider a sequence X_n of random variables such that there is a constant k with E[X_n] → k and Var(X_n) → 0 as n → ∞. Show that ms-lim_{n→∞} X_n = k.

Proof: Since we have

E[|X_n − k|²] = E[X_n² − 2kX_n + k²] = E[X_n²] − 2kE[X_n] + k²
             = (E[X_n²] − E[X_n]²) + (E[X_n]² − 2kE[X_n] + k²)
             = Var(X_n) + (E[X_n] − k)²,

the right side tends to 0 when taking the limit n → ∞.



Limit in Probability or Stochastic Limit

The random variable X is the stochastic limit of X_n if, for n large enough, the probability of deviation from X can be made smaller than any arbitrary ε. More precisely, for any ε > 0,

lim_{n→∞} P(ω; |X_n(ω) − X(ω)| ≤ ε) = 1.

This can also be written as

lim_{n→∞} P(ω; |X_n(ω) − X(ω)| > ε) = 0.

This limit is denoted by st-lim_{n→∞} X_n = X.

It is worth noting that both almost certain convergence and convergence in mean square imply the stochastic convergence. Hence, the stochastic convergence is weaker than the two aforementioned convergence cases. This is the reason why it is also called weak convergence. One application is the Weak Law of Large Numbers:

If X₁, X₂, ... are identically distributed with expected value µ and if any finite number of them are independent, then st-lim_{n→∞} (X₁ + ··· + X_n)/n = µ.
Proposition 1.12.1 Convergence in the mean square implies stochastic convergence.

Proof: Let ms-lim_{n→∞} Y_n = Y, and let ε > 0 be arbitrary, fixed. Applying Markov's inequality with X = Y_n − Y, p = 2 and λ = ε yields

0 ≤ P(|Y_n − Y| ≥ ε) ≤ (1/ε²) E[|Y_n − Y|²].

The right side tends to 0 as n → ∞. Applying the Squeeze Theorem we obtain

lim_{n→∞} P(|Y_n − Y| ≥ ε) = 0,

which means that Y_n converges stochastically to Y.

Exercise 1.12.2 Let X_n be a sequence of random variables such that E[|X_n|] → 0 as n → ∞. Prove that st-lim_{n→∞} X_n = 0.

Proof: Let ε > 0 be arbitrary, fixed. We need to show

lim_{n→∞} P(ω; |X_n(ω)| ≥ ε) = 0.   (1.12.5)

From Markov's inequality (Theorem 1.11.6) we have

0 ≤ P(ω; |X_n(ω)| ≥ ε) ≤ E[|X_n|]/ε.

Using the Squeeze Theorem we obtain (1.12.5).

Remark 1.12.3 The conclusion still holds true in the case when there is a p > 0 such that E[|X_n|^p] → 0 as n → ∞.

Limit in Distribution

We say the sequence X_n converges in distribution to X if for any continuous bounded function ϕ(x) we have

lim_{n→∞} E[ϕ(X_n)] = E[ϕ(X)].

This type of limit is even weaker than the stochastic convergence, i.e. it is implied by it.
An application of the limit in distribution is obtained by considering ϕ(x) = e^{itx}. In this case, if X_n converges in distribution to X, then the characteristic function of X_n converges to the characteristic function of X; in particular, the probability density of X_n approaches the probability density of X.
It can be shown that convergence in distribution is equivalent to

lim_{n→∞} F_n(x) = F(x)

at every point x where F is continuous, where F_n and F denote the distribution functions of X_n and X, respectively. This is the reason why this convergence bears its name.

1.13 Properties of Limits

Lemma 1.13.1 If ms-lim_{n→∞} X_n = 0 and ms-lim_{n→∞} Y_n = 0, then

1. ms-lim_{n→∞} (X_n + Y_n) = 0;
2. ms-lim_{n→∞} (X_n Y_n) = 0.

Proof: Since ms-lim_{n→∞} X_n = 0, we have lim_{n→∞} E[X_n²] = 0. Applying the Squeeze Theorem to the inequality²

0 ≤ E[X_n]² ≤ E[X_n²]

yields lim_{n→∞} E[X_n] = 0. Then

lim_{n→∞} Var[X_n] = lim_{n→∞} E[X_n²] − lim_{n→∞} E[X_n]² = 0.

Similarly, we have lim_{n→∞} E[Y_n²] = 0, lim_{n→∞} E[Y_n] = 0 and lim_{n→∞} Var[Y_n] = 0. Then lim_{n→∞} σ_{X_n} = lim_{n→∞} σ_{Y_n} = 0. Using the correlation formula

Corr(X_n, Y_n) = Cov(X_n, Y_n) / (σ_{X_n} σ_{Y_n})

and the fact that |Corr(X_n, Y_n)| ≤ 1 yields

0 ≤ |Cov(X_n, Y_n)| ≤ σ_{X_n} σ_{Y_n}.

Since lim_{n→∞} σ_{X_n} σ_{Y_n} = 0, from the Squeeze Theorem it follows that

lim_{n→∞} Cov(X_n, Y_n) = 0.

Taking n → ∞ in the relation

Cov(X_n, Y_n) = E[X_n Y_n] − E[X_n]E[Y_n]

yields lim_{n→∞} E[X_n Y_n] = 0. Using the previous relations, we have

lim_{n→∞} E[(X_n + Y_n)²] = lim_{n→∞} E[X_n² + 2X_n Y_n + Y_n²] = lim_{n→∞} E[X_n²] + 2 lim_{n→∞} E[X_n Y_n] + lim_{n→∞} E[Y_n²] = 0,

which means ms-lim_{n→∞} (X_n + Y_n) = 0.

² This follows from the fact that Var[X_n] ≥ 0.

Proposition 1.13.2 If the sequences of random variables X_n and Y_n converge in the mean square, then

1. ms-lim_{n→∞} (X_n + Y_n) = ms-lim_{n→∞} X_n + ms-lim_{n→∞} Y_n;
2. ms-lim_{n→∞} (cX_n) = c · ms-lim_{n→∞} X_n, ∀c ∈ R.

Proof: 1. Let ms-lim_{n→∞} X_n = L and ms-lim_{n→∞} Y_n = M. Consider the sequences X′_n = X_n − L and Y′_n = Y_n − M. Then ms-lim_{n→∞} X′_n = 0 and ms-lim_{n→∞} Y′_n = 0. Applying Lemma 1.13.1 yields

ms-lim_{n→∞} (X′_n + Y′_n) = 0.

This is equivalent to

ms-lim_{n→∞} (X_n − L + Y_n − M) = 0,

which becomes ms-lim_{n→∞} (X_n + Y_n) = L + M.

1.14 Stochastic Processes

A stochastic process on the probability space (Ω, F, P) is a family of random variables X_t parameterized by t ∈ T, where T ⊂ R. If T is an interval, we say that X_t is a stochastic process in continuous time. If T = {1, 2, 3, ...} we say that X_t is a stochastic process in discrete time; the latter case describes a sequence of random variables. The aforementioned types of convergence can be easily extended to continuous time. For instance, X_t converges in the strong sense to X as t → ∞ if

P(ω; lim_{t→∞} X_t(ω) = X(ω)) = 1.

The evolution in time of a given state of the world ω ∈ Ω, given by the function t ↦ X_t(ω), is called a path or realization of X_t. The study of stochastic processes using computer simulations is based on retrieving information about the process X_t given a large number of its realizations.
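As a simple illustration of this simulation viewpoint (our sketch, not part of the text), one can generate many realizations of a discrete-time process and estimate quantities such as E[X_t] and Var[X_t] by averaging across paths:

```python
# Simulate many realizations of the random walk X_t = xi_1 + ... + xi_t
# (xi_i = +-1 with probability 1/2) and estimate E[X_t] and Var[X_t].
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps = 10_000, 50

steps = rng.choice([-1, 1], size=(n_paths, n_steps))
paths = np.cumsum(steps, axis=1)          # one realization per row

t = n_steps
print("estimated E[X_t]  :", paths[:, t-1].mean())   # ~ 0
print("estimated Var[X_t]:", paths[:, t-1].var())    # ~ t = 50
```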

Consider that all the information accumulated until time t is contained in the σ-field F_t. This means that F_t contains the information about which events have already occurred until time t and which have not. Since the information grows in time, we have

F_s ⊂ F_t ⊂ F

for any s, t ∈ T with s ≤ t. The family (F_t) is called a filtration.

A stochastic process X_t is called adapted to the filtration F_t if X_t is F_t-predictable, for any t ∈ T.

Example 1.14.1 Here are a few examples of filtrations:

1. F_t represents the information about the evolution of a stock until time t, with t > 0.
2. F_t represents the information about the evolution of a Black-Jack game until time t, with t > 0.

Example 1.14.2 If X is a random variable, consider the conditional expectation

Xt = E[X|Ft ].

From the definition of the conditional expectation, the random variable Xt is Ft -predictable, and
can be regarded as the measurement of X at time t using the information Ft . If the accumulated
knowledge Ft increases and eventually equals the σ-field F, then X = E[X|F], i.e. we obtain
the entire random variable. The process Xt is adapted to Ft .

Example 1.14.3 Don Joe is asking the doctor how long he still has to live. The age at which
he will pass away is a random variable, denoted by X. Given his medical condition today, which
is contained in Ft , the doctor infers that Mr. Joe will die at the age of Xt = E[X|Ft ]. The
stochastic process Xt is adapted to the medical knowledge Ft .

We shall define next an important type of stochastic process.

Definition 1.14.4 A process X_t, t ∈ T, is called a martingale with respect to the filtration F_t if
1. X_t is integrable for each t ∈ T;
2. X_t is adapted to the filtration F_t;
3. X_s = E[X_t|F_s], ∀s < t.

Remark 1.14.5 The first condition states that the unconditional forecast is finite: E[|X_t|] = ∫ |X_t| dP < ∞. Condition 2 says that the value X_t is known, given the information set F_t. The third relation asserts that the best forecast of unobserved future values is the last observation on X_t.

Remark 1.14.6 If the third condition is replaced by
3′. X_s ≤ E[X_t|F_s], ∀s ≤ t,
then X_t is called a submartingale; and if it is replaced by
3″. X_s ≥ E[X_t|F_s], ∀s ≤ t,
then X_t is called a supermartingale.
It is worth noting that X_t is a submartingale if and only if −X_t is a supermartingale.

Example 1.14.1 Let Xt denote Mr. Li Zhu’s salary after t years of work in the same company.
Since Xt is known at time t and it is bounded above, as all salaries are, then the first two
conditions hold. Being honest, Mr. Zhu expects today that his future salary will be the same as
today’s, i.e. Xs = E[Xt |Fs ], for s < t. This means that Xt is a martingale.

Example 1.14.2 If in the previous example Mr. Zhu is optimistic and believes as today that
his future salary will increase, then Xt is a submartingale.

Example 1.14.3 If X is an integrable random variable on (Ω, F, P ), and Ft is a filtration,


then Xt = E[X|Ft ] is a martingale.

Example 1.14.4 Let Xt and Yt be martingales with respect to the filtration Ft . Show that for
any a, b, c ∈ R the process Zt = aXt + bYt + c is a Ft -martingale.

Example 1.14.5 Let Xt and Yt be martingales with respect to the filtration Ft . Is the process
Xt Yt a martingale with respect to Ft ?

In the following, if X_t is a stochastic process, the minimal amount of information resulting from knowing the process X_t up to time t is denoted by σ(X_s; s ≤ t). In the case of a discrete process, we write σ(X_k; k ≤ n).

Example 1.14.6 Let X_n, n ≥ 0, be a sequence of integrable independent random variables. Let S₀ = X₀, S_n = X₀ + ··· + X_n. If E[X_n] ≥ 0, then S_n is a σ(X_k; k ≤ n)-submartingale. In addition, if E[X_n] = 0 and E[X_n²] < ∞, ∀n ≥ 0, then S_n² − Var(S_n) is a σ(X_k; k ≤ n)-martingale.

Example 1.14.7 Let X_n, n ≥ 0, be a sequence of independent random variables with E[X_n] = 1 for n ≥ 0. Then P_n = X₀ · X₁ ··· X_n is a σ(X_k; k ≤ n)-martingale.

In Section 8.1 we shall encounter several processes which are martingales.


Chapter 2

Useful Stochastic Processes

2.1 The Brownian Motion


The observation made first by the botanist Robert Brown in 1827, that small pollen grains
suspended in water have a very irregular and unpredictable state of motion, led to the definition
of the Brownian motion, which is formalized in the following.

Definition 2.1.1 A Brownian motion process is a stochastic process Bt , t ≥ 0, which satisfies


1. The process starts at the origin, B0 = 0;
2. Bt has stationary, independent increments;
3. The process Bt is continuous in t;
4. The increments Bt − Bs are normally distributed with mean zero and variance |t − s|,

Bt − Bs ∼ N (0, |t − s|).

The process Xt = x + Bt has all the properties of a Brownian motion that starts at x. Since
Bt − Bs is stationary, its distribution function depends only on the time interval t − s, i.e.

P (Bt+s − Bs ≤ a) = P (Bt − B0 ≤ a) = P (Bt ≤ a).

From condition 4 we get that Bt is normally distributed with mean E[Bt ] = 0 and V ar[Bt ] = t

Bt ∼ N (0, t).

Let 0 < s < t. Since the increments are independent, we can write

E[Bs Bt ] = E[(Bs − B0 )(Bt − Bs ) + Bs2 ] = E[Bs − B0 ]E[Bt − Bs ] + E[Bs2 ] = s.

Proposition 2.1.2 A Brownian motion process Bt is a martingale with respect to the infor-
mation set Ft = σ(Bs ; s ≤ t).

Proof: Let s < t and write Bt = Bs + (Bt − Bs ). Then

E[Bt |Fs ] = E[Bs + (Bt − Bs )|Fs ]


= E[Bs |Fs ] + E[Bt − Bs |Fs ]
= Bs ,


where we used that B_s is F_s-measurable (whence E[B_s|F_s] = B_s) and that the increment B_t − B_s is independent of the previous values of the process contained in the information set F_s = σ(B_u; u ≤ s).

A process with similar properties as the Brownian motion was introduced by Wiener.

Definition 2.1.3 A Wiener process Wt is a process adapted to a filtration Ft such that


1. The process starts at the origin, W0 = 0;
2. Wt is an Ft -martingale with E[Wt2 ] < ∞ for all t ≥ 0 and

E[(Wt − Ws )2 ] = t − s, s ≤ t;

3. The process Wt is continuous in t.

Since Wt is a martingale, its increments are unpredictable and hence E[Ws − Wt ] = 0; in


particular E[Wt ] = 0. It is easy to show that

V ar[Wt − Ws ] = |t − s|, V ar[Wt ] = t.

The only property Bt has and Wt seems not to have is that the increments are normally
distributed. However, there is no distinction between these two processes, as the following
result states.

Theorem 2.1.4 (Lévy) A Wiener process is a Brownian motion process.

In stochastic calculus we often need to use infinitesimal notations and its properties. If dWt
denotes the infinitesimal increment of a Wiener process in the time interval dt, the aforemen-
tioned properties become dWt ∼ N (0, dt), E[dWt ] = 0, and E[(dWt )2 ] = dt.

Proposition 2.1.5 If Wt is a Wiener process with respect to the information set Ft , then
Yt = Wt2 − t is a martingale.

Proof: Let s < t. Using that the increments Wt − Ws and (Wt − Ws )2 are independent of the
information set Fs and applying Proposition 1.10.4 yields

E[Wt2 |Fs ] = E[(Ws + Wt − Ws )2 |Fs ]


= E[Ws2 + 2Ws (Wt − Ws ) + (Wt − Ws )2 |Fs ]
= E[Ws2 |Fs ] + E[2Ws (Wt − Ws )|Fs ] + E[(Wt − Ws )2 |Fs ]
= Ws2 + 2Ws E[Wt − Ws |Fs ] + E[(Wt − Ws )2 |Fs ]
= Ws2 + 2Ws E[Wt − Ws ] + E[(Wt − Ws )2 ]
= Ws2 + t − s,

and hence E[Wt2 − t|Fs ] = Ws2 − s, for s < t.


The following result states the memoryless property of Brownian motion.¹

Proposition 2.1.6 The conditional distribution of W_{t+s}, given the present W_t and the past W_u, 0 ≤ u < t, depends only on the present.

¹ Such processes are called Markov processes.
Proof: Using the independent increment assumption, we have

P(W_{t+s} ≤ c | W_t = x, W_u, 0 ≤ u < t)
  = P(W_{t+s} − W_t ≤ c − x | W_t = x, W_u, 0 ≤ u < t)
  = P(W_{t+s} − W_t ≤ c − x)
  = P(W_{t+s} ≤ c | W_t = x).

Since W_t is normally distributed with mean 0 and variance t, its density function is

φ_t(x) = (1/√(2πt)) e^{−x²/(2t)}.

Then its distribution function is

F_t(x) = P(W_t ≤ x) = (1/√(2πt)) ∫_{−∞}^x e^{−u²/(2t)} du.

The probability that W_t is between the values a and b is given by

P(a ≤ W_t ≤ b) = (1/√(2πt)) ∫_a^b e^{−u²/(2t)} du,   a < b.

Even if the increments of a Brownian motion are independent, its values are still correlated.

Proposition 2.1.7 Let 0 ≤ s ≤ t. Then

1. Cov(W_s, W_t) = s;
2. Corr(W_s, W_t) = √(s/t).

Proof: 1. Using the properties of covariance,

Cov(W_s, W_t) = Cov(W_s, W_s + W_t − W_s)
  = Cov(W_s, W_s) + Cov(W_s, W_t − W_s)
  = Var(W_s) + E[W_s(W_t − W_s)] − E[W_s]E[W_t − W_s]
  = s + 0 = s,

since, by the independence of the increments, E[W_s(W_t − W_s)] = E[W_s]E[W_t − W_s].
We can arrive at the same result starting from the formula

Cov(W_s, W_t) = E[W_s W_t] − E[W_s]E[W_t] = E[W_s W_t].

Using the tower property and the fact that W_t is a martingale, we have

E[W_s W_t] = E[E[W_s W_t|F_s]] = E[W_s E[W_t|F_s]] = E[W_s W_s] = E[W_s²] = s,

so Cov(W_s, W_t) = s.

2. The correlation formula yields

Corr(W_s, W_t) = Cov(W_s, W_t)/(σ(W_s)σ(W_t)) = s/(√s √t) = √(s/t).

Remark 2.1.8 Removing the order relation between s and t, the previous relations can also be stated as

Cov(W_s, W_t) = min{s, t};
Corr(W_s, W_t) = √(min{s, t}/max{s, t}).

The following exercises state the translation and scaling invariance of the Brownian motion.

Exercise 2.1.9 For any t₀ ≥ 0, show that the process X_t = W_{t+t₀} − W_{t₀} is a Brownian motion.

Exercise 2.1.10 For any λ > 0, show that the process X_t = (1/√λ) W_{λt} is a Brownian motion.

Exercise 2.1.11 Let 0 < s < t < u. Show the following multiplicative property:

Corr(W_s, W_t) Corr(W_t, W_u) = Corr(W_s, W_u).

Research topic: Find all stochastic processes with the aforementioned property.

Exercise 2.1.12 (a) Use the martingale property of W_t² − t to find E[(W_t² − t)(W_s² − s)];
(b) Evaluate E[W_t² W_s²];
(c) Compute Cov(W_t², W_s²);
(d) Find Corr(W_t², W_s²).

Exercise 2.1.13 Consider the process Y_t = t W_{1/t}, t > 0.
(a) Find the distribution of Y_t;
(b) Find Cov(Y_s, Y_t);
(c) Is Y_t a Brownian motion process?

Exercise 2.1.14 The process X_t = |W_t| is called Brownian motion reflected at the origin. Show that
(a) E[|W_t|] = √(2t/π);
(b) Var(|W_t|) = (1 − 2/π)t.

Research topic: Find all functions g(t) and h(t) such that the process X_t = g(t)W_{h(t)} is a Brownian motion.

Exercise 2.1.15 Let 0 < s < t. Find E[W_t²|F_s].

Exercise 2.1.16 Let 0 < s < t. Show that
(a) E[W_t³|F_s] = 3(t − s)W_s + W_s³;
(b) E[W_t⁴|F_s] = 3(t − s)² + 6(t − s)W_s² + W_s⁴.
Figure 2.1: a Three simulations of the Brownian motion process W_t; b Two simulations of the geometric Brownian motion process e^{W_t}.
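Sample paths like those in Fig. 2.1 can be generated directly from the defining properties: the increments W_{t+Δt} − W_t are independent and distributed N(0, Δt). A minimal sketch of ours:

```python
# Simulate paths of Brownian motion W_t on [0, T] from its independent
# N(0, dt) increments, and of geometric Brownian motion exp(W_t).
import numpy as np

rng = np.random.default_rng(4)
T, n_steps, n_paths = 1.0, 1000, 3
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
GBM = np.exp(W)   # geometric Brownian motion paths

print(W[:, -1])   # terminal values W_T, distributed N(0, T)
print(GBM[:, -1])
```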

2.2 Geometric Brownian Motion

The process X_t = e^{W_t}, t ≥ 0, is called geometric Brownian motion. A few simulations of this process are depicted in Fig. 2.1b. The following result will be useful in the sequel.

Lemma 2.2.1 We have E[e^{αW_t}] = e^{α²t/2}, for α ≥ 0.

Proof: Using the definition of the expectation,

E[e^{αW_t}] = ∫ e^{αx} φ_t(x) dx = (1/√(2πt)) ∫ e^{−x²/(2t) + αx} dx = e^{α²t/2},

where we have used the integral formula

∫ e^{−ax² + bx} dx = √(π/a) e^{b²/(4a)},   a > 0,

with a = 1/(2t) and b = α.
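A quick Monte Carlo check of Lemma 2.2.1 (our sketch), sampling W_t ~ N(0, t):

```python
# Verify E[exp(alpha * W_t)] = exp(alpha^2 t / 2) by simulation.
import numpy as np

rng = np.random.default_rng(5)
alpha, t = 1.0, 1.0

W_t = rng.normal(0.0, np.sqrt(t), size=2_000_000)
print(np.mean(np.exp(alpha * W_t)))   # empirical mean
print(np.exp(alpha**2 * t / 2))       # ~ e^{1/2} = 1.6487...
```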

Proposition 2.2.2 The geometric Brownian motion X_t = e^{W_t} is log-normally distributed with mean e^{t/2} and variance e^{2t} − e^t.

Proof: Since W_t is normally distributed, X_t = e^{W_t} has a log-normal distribution. Using Lemma 2.2.1 we have

E[X_t] = E[e^{W_t}] = e^{t/2},
E[X_t²] = E[e^{2W_t}] = e^{2t},

and hence the variance is

Var[X_t] = E[X_t²] − E[X_t]² = e^{2t} − (e^{t/2})² = e^{2t} − e^t.

The distribution function of X_t can be obtained by reducing it to the distribution function of a Brownian motion:

F_{X_t}(x) = P(X_t ≤ x) = P(e^{W_t} ≤ x) = P(W_t ≤ ln x) = F_{W_t}(ln x)
           = (1/√(2πt)) ∫_{−∞}^{ln x} e^{−u²/(2t)} du.

The density function of the geometric Brownian motion X_t = e^{W_t} is given by

p(x) = (d/dx) F_{X_t}(x) = (1/(x√(2πt))) e^{−(ln x)²/(2t)} if x > 0, and p(x) = 0 elsewhere.

Exercise 2.2.3 If X_t = e^{W_t}, find the covariance Cov(X_s, X_t).

Exercise 2.2.4 Let X_t = e^{W_t}.
(a) Show that X_t is not a martingale.
(b) Show that e^{−t/2} X_t is a martingale.

2.3 Integrated Brownian Motion

The stochastic process

Z_t = ∫_0^t W_s ds,   t ≥ 0,

is called integrated Brownian motion.
Let 0 = s₀ < s₁ < ··· < s_k < ··· < s_n = t, with s_k = kt/n and Δs = t/n. Then Z_t can be written as a limit of Riemann sums

Z_t = lim_{n→∞} Σ_{k=1}^n W_{s_k} Δs = t lim_{n→∞} (W_{s₁} + ··· + W_{s_n})/n.

We are tempted to apply the Central Limit Theorem, but the W_{s_k} are not independent, so we first need to transform the sum into a sum of independent, normally distributed random variables. A straightforward computation shows that

W_{s₁} + ··· + W_{s_n} = n(W_{s₁} − W₀) + (n − 1)(W_{s₂} − W_{s₁}) + ··· + (W_{s_n} − W_{s_{n−1}})
                      = X₁ + X₂ + ··· + X_n.   (2.3.1)

Since the increments of a Brownian motion are independent and normally distributed, we have

X₁ ∼ N(0, n²Δs),  X₂ ∼ N(0, (n − 1)²Δs),  X₃ ∼ N(0, (n − 2)²Δs),  ...,  X_n ∼ N(0, Δs).

Recall now the following variant of the Central Limit Theorem:

Theorem 2.3.1 If X_j are independent random variables normally distributed with mean µ_j and variance σ_j², then the sum X₁ + ··· + X_n is also normally distributed, with mean µ₁ + ··· + µ_n and variance σ₁² + ··· + σ_n².

Then

X₁ + ··· + X_n ∼ N(0, (1² + 2² + ··· + n²)Δs) = N(0, n(n + 1)(2n + 1)Δs/6),

with Δs = t/n. Using (2.3.1) yields

t (W_{s₁} + ··· + W_{s_n})/n ∼ N(0, (n + 1)(2n + 1)t³/(6n²)).

Taking the limit we get

Z_t ∼ N(0, t³/3).

Proposition 2.3.2 The integrated Brownian motion Z_t has a Gaussian distribution with mean 0 and variance t³/3.
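This distribution can be checked by simulating Z_t as a Riemann sum along each Brownian path (our sketch):

```python
# Approximate Z_t = int_0^t W_s ds by Riemann sums along simulated paths
# and compare the sample variance with t^3/3.
import numpy as np

rng = np.random.default_rng(6)
t, n_steps, n_paths = 2.0, 500, 10_000
ds = t / n_steps

dW = rng.normal(0.0, np.sqrt(ds), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)
Z = W.sum(axis=1) * ds          # Riemann sum approximation of Z_t

print("sample mean    :", Z.mean())   # ~ 0
print("sample variance:", Z.var())    # ~ t^3/3
print("t^3/3          :", t**3 / 3)   # = 2.6667
```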
The mean and the variance can also be computed in a direct way, as follows. By Fubini's theorem we have

E[Z_t] = E[∫_0^t W_s ds] = ∫_Ω ∫_0^t W_s ds dP = ∫_0^t ∫_Ω W_s dP ds = ∫_0^t E[W_s] ds = 0,

since E[W_s] = 0. Then the variance is given by

Var[Z_t] = E[Z_t²] − E[Z_t]² = E[Z_t²]
  = E[∫_0^t W_u du · ∫_0^t W_v dv] = E[∫_0^t ∫_0^t W_u W_v du dv]
  = ∫_0^t ∫_0^t E[W_u W_v] du dv = ∫∫_{[0,t]×[0,t]} min{u, v} du dv
  = ∫∫_{D₁} min{u, v} du dv + ∫∫_{D₂} min{u, v} du dv,   (2.3.2)

where

D₁ = {(u, v); u < v, 0 ≤ v ≤ t},   D₂ = {(u, v); u > v, 0 ≤ u ≤ t}.

On D₁ we have min{u, v} = u, so the first integral can be evaluated using Fubini's theorem:

∫∫_{D₁} min{u, v} du dv = ∫_0^t (∫_0^v u du) dv = ∫_0^t v²/2 dv = t³/6.

By symmetry, the latter integral has the same value:

∫∫_{D₂} min{u, v} du dv = t³/6.

Substituting in (2.3.2) yields

Var[Z_t] = t³/6 + t³/6 = t³/3.
Exercise 2.3.3 (a) Prove that the moment generating function of Z_t is

m(u) = e^{u²t³/6}.

(b) Use the first part to find the mean and variance of Z_t.

Exercise 2.3.4 Let s < t. Show that the covariance of the integrated Brownian motion is given by

Cov[Z_s, Z_t] = s²(t/2 − s/6).

Exercise 2.3.5 Show that
(a) Cov[Z_t, Z_t − Z_{t−h}] = t²h/2 + o(h);
(b) Cov[Z_t, W_t] = t²/2.

Exercise 2.3.6 Consider the process X_t = ∫_0^t e^{W_s} ds.
(a) Find the mean of X_t;
(b) Find the variance of X_t;
(c) What is the distribution of X_t?

2.4 Exponential Integrated Brownian Motion


If $Z_t = \int_0^t W_s\, ds$ denotes the integrated Brownian motion, the process
$$V_t = e^{Z_t}$$
is called exponential integrated Brownian motion. The process starts at $V_0 = e^0 = 1$. Since $Z_t$ is normally distributed, $V_t$ is log-normally distributed. We shall compute the mean and the variance in a direct way. Using Exercise 2.3.3 we have
$$E[V_t] = E[e^{Z_t}] = m(1) = e^{t^3/6}, \qquad E[V_t^2] = E[e^{2Z_t}] = m(2) = e^{4t^3/6} = e^{2t^3/3},$$
$$Var[V_t] = E[V_t^2] - E[V_t]^2 = e^{2t^3/3} - e^{t^3/3}.$$
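
The first of these moments can be verified by simulation. A minimal sketch (not from the text; $t$ is kept small so the Monte Carlo error of the exponential moment stays manageable):

```python
import numpy as np

# Check E[V_t] = E[exp(Z_t)] = e^{t^3/6} by simulating Z_t on a fine grid.
rng = np.random.default_rng(2)
t, n, paths = 1.0, 400, 50_000
dt = t / n
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(paths, n)), axis=1)
Z = W.sum(axis=1) * dt             # Riemann sum approximation of Z_t

print(np.exp(Z).mean(), np.exp(t**3 / 6))   # both close to e^{t^3/6}
```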

2.5 Brownian Bridge


The process $X_t = W_t - tW_1$, $0 \le t \le 1$, is called the Brownian bridge, tied down at both 0 and 1. Since we can also write
$$X_t = W_t - tW_t - tW_1 + tW_t = (1-t)(W_t - W_0) - t(W_1 - W_t),$$

using that the increments $W_t - W_0$ and $W_1 - W_t$ are independent and normally distributed, with
$$W_t - W_0 \sim N(0, t), \qquad W_1 - W_t \sim N(0, 1-t),$$
it follows that $X_t$ is normally distributed with
$$E[X_t] = (1-t)E[W_t - W_0] - tE[W_1 - W_t] = 0,$$
$$Var[X_t] = (1-t)^2\, Var[W_t - W_0] + t^2\, Var[W_1 - W_t] = (1-t)^2 t + t^2(1-t) = t(1-t).$$
This can also be stated by saying that the Brownian bridge tied at 0 and 1 is a Gaussian process with mean 0 and variance $t(1-t)$.
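
A short numerical sketch of the bridge variance (not from the text; the time $t = 0.3$ and sample size are arbitrary), exploiting the decomposition $W_1 = W_t + (W_1 - W_t)$ with independent increments:

```python
import numpy as np

# Sample X_t = W_t - t*W_1 at a fixed t in (0,1); Var should be t(1-t).
rng = np.random.default_rng(3)
t, paths = 0.3, 500_000
W_t = rng.normal(0.0, np.sqrt(t), size=paths)
W_1 = W_t + rng.normal(0.0, np.sqrt(1 - t), size=paths)  # independent increment
X = W_t - t * W_1

print(X.mean())                # close to 0
print(X.var(), t * (1 - t))    # close to t(1-t) = 0.21
```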

2.6 Brownian Motion with Drift


The process $Y_t = \mu t + W_t$, $t \ge 0$, is called Brownian motion with drift. The process tends to drift off at rate $\mu$. It starts at $Y_0 = 0$ and is a Gaussian process with mean

E[Yt ] = µt + E[Wt ] = µt

and variance
V ar[Yt ] = V ar[µt + Wt ] = V ar[Wt ] = t.

2.7 Bessel Process


This section deals with the process given by the Euclidean distance from the origin to a particle following a Brownian motion in $\mathbb{R}^n$. More precisely, if $W_1(t), \dots, W_n(t)$ are independent Brownian motions, let $W(t) = (W_1(t), \dots, W_n(t))$ be a Brownian motion in $\mathbb{R}^n$, $n \ge 2$. The process
$$R_t = \mathrm{dist}(O, W(t)) = \sqrt{W_1(t)^2 + \cdots + W_n(t)^2}$$
is called the $n$-dimensional Bessel process.
The probability density of this process is given by the following result.

Proposition 2.7.1 The probability density function of $R_t$, $t > 0$, is given by
$$p_t(\rho) = \begin{cases} \dfrac{2}{(2t)^{n/2}\,\Gamma(n/2)}\, \rho^{n-1} e^{-\frac{\rho^2}{2t}}, & \rho \ge 0; \\[6pt] 0, & \rho < 0, \end{cases}$$
with
$$\Gamma\Big(\frac{n}{2}\Big) = \begin{cases} \big(\frac{n}{2} - 1\big)!, & \text{for $n$ even;} \\[4pt] \big(\frac{n}{2}-1\big)\big(\frac{n}{2}-2\big) \cdots \frac{3}{2}\cdot\frac{1}{2}\,\sqrt{\pi}, & \text{for $n$ odd.} \end{cases}$$

Proof: Since the Brownian motions $W_1(t), \dots, W_n(t)$ are independent, their joint density function is
$$f_{W_1 \cdots W_n}(x_1, \dots, x_n) = f_{W_1(t)}(x_1) \cdots f_{W_n(t)}(x_n) = \frac{1}{(2\pi t)^{n/2}}\, e^{-(x_1^2 + \cdots + x_n^2)/(2t)}, \qquad t > 0.$$

In the next computation we shall use the following integration formula, which follows from the use of polar coordinates:
$$\int_{\{|x| \le \rho\}} f(x)\, dx = \sigma(S^{n-1}) \int_0^\rho r^{n-1} g(r)\, dr, \tag{2.7.3}$$
where $f(x) = g(|x|)$ is a function on $\mathbb{R}^n$ with spherical symmetry, and where
$$\sigma(S^{n-1}) = \frac{2\pi^{n/2}}{\Gamma(n/2)}$$
is the area of the $(n-1)$-dimensional unit sphere in $\mathbb{R}^n$.

Let $\rho \ge 0$. The distribution function of $R_t$ is
$$\begin{aligned}
F_R(\rho) = P(R_t \le \rho) &= \int_{\{R_t \le \rho\}} f_{W_1\cdots W_n}\, dx_1 \cdots dx_n \\
&= \int_{x_1^2 + \cdots + x_n^2 \le \rho^2} \frac{1}{(2\pi t)^{n/2}}\, e^{-(x_1^2 + \cdots + x_n^2)/(2t)}\, dx_1 \cdots dx_n \\
&= \frac{\sigma(S^{n-1})}{(2\pi t)^{n/2}} \int_0^\rho r^{n-1} e^{-r^2/(2t)}\, dr.
\end{aligned}$$
Differentiating yields
$$p_t(\rho) = \frac{d}{d\rho} F_R(\rho) = \frac{\sigma(S^{n-1})}{(2\pi t)^{n/2}}\, \rho^{n-1} e^{-\frac{\rho^2}{2t}} = \frac{2}{(2t)^{n/2}\,\Gamma(n/2)}\, \rho^{n-1} e^{-\frac{\rho^2}{2t}}, \qquad \rho > 0,\ t > 0.$$

It is worth noting that in the 2-dimensional case this density becomes a particular case of the Weibull distribution with parameters $m = 2$ and $\alpha = 2t$, known as the Rayleigh distribution:
$$p_t(x) = \frac{x}{t}\, e^{-\frac{x^2}{2t}}, \qquad x > 0,\ t > 0.$$

2.8 The Poisson Process


A Poisson process describes the number of occurrences of a certain event before time t, such as: the number of electrons arriving at an anode until time t; the number of cars arriving at a gas station until time t; the number of phone calls received in a certain day until time t; the number of visitors entering a museum in a certain day until time t; the number of earthquakes that occurred in Japan during the time interval [0, t]; the number of shocks in the stock market from the beginning of the year until time t; the number of twisters that might hit Alabama during a decade.

2.8.1 Definition and Properties


The definition of a Poisson process is stated more precisely in the following.

Definition 2.8.1 A Poisson process is a stochastic process Nt , t ≥ 0, which satisfies


1. The process starts at the origin, N0 = 0;
2. Nt has stationary, independent increments;
3. The process Nt is right continuous in t, with left hand limits;
4. The increments $N_t - N_s$, with $0 < s < t$, have a Poisson distribution with parameter $\lambda(t-s)$, i.e.
$$P(N_t - N_s = k) = \frac{\lambda^k (t-s)^k}{k!}\, e^{-\lambda(t-s)}.$$
It can be shown that condition 4 in the previous definition can be replaced by the following
two conditions:

P (Nt − Ns = 1) = λ(t − s) + o(t − s) (2.8.4)


P (Nt − Ns ≥ 2) = o(t − s), (2.8.5)

where o(h) denotes a quantity such that limh→0 o(h)/h = 0. This means the probability that a
jump of size 1 occurs in the infinitesimal interval dt is equal to λdt, and the probability that at
least 2 events occur in the same small interval is zero. This implies that the random variable
dNt may take only two values, 0 and 1, and hence satisfies

P (dNt = 1) = λ dt (2.8.6)
P (dNt = 0) = 1 − λ dt. (2.8.7)

The fact that $N_t - N_s$ is stationary can be stated as
$$P(N_{t+s} - N_s \le n) = P(N_t - N_0 \le n) = P(N_t \le n) = \sum_{k=0}^n e^{-\lambda t}\, \frac{(\lambda t)^k}{k!}.$$

From condition 4 we get the mean and variance of increments

E[Nt − Ns ] = λ(t − s), V ar[Nt − Ns ] = λ(t − s).

In particular, the random variable Nt is Poisson distributed with E[Nt ] = λt and V ar[Nt ] = λt.
The parameter λ is called the rate of the process. This means that the events occur at the
constant rate λ.
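
A minimal simulation sketch (not from the text; $\lambda$, $t$ and the sample size are arbitrary): building $N_t$ from exponential interarrival times (introduced formally in section 2.8.2) and checking $E[N_t] = Var[N_t] = \lambda t$:

```python
import numpy as np

# Build N_t from exponential interarrival times with rate lambda.
rng = np.random.default_rng(5)
lam, t, paths = 2.0, 3.0, 100_000
T = rng.exponential(1.0 / lam, size=(paths, 50))  # 50 arrivals per path is ample here
S = np.cumsum(T, axis=1)                          # cumulative waiting times S_n
N_t = (S <= t).sum(axis=1)                        # number of arrivals up to time t

print(N_t.mean(), N_t.var(), lam * t)             # both moments close to lambda*t = 6
```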

Since the increments are independent, we have
$$\begin{aligned}
E[N_s N_t] &= E[(N_s - N_0)(N_t - N_s) + N_s^2] = E[N_s - N_0]\, E[N_t - N_s] + E[N_s^2] \\
&= \lambda s \cdot \lambda(t-s) + \big(Var[N_s] + E[N_s]^2\big) = \lambda^2 st + \lambda s. \qquad (2.8.8)
\end{aligned}$$

As a consequence we have the following result:

Proposition 2.8.2 Let $0 \le s \le t$. Then
1. $Cov(N_s, N_t) = \lambda s$;
2. $Corr(N_s, N_t) = \sqrt{\dfrac{s}{t}}$.

Proof: 1. Using (2.8.8) we have

Cov(Ns , Nt ) = E[Ns Nt ] − E[Ns ]E[Nt ]


= λ2 st + λs − λsλt
= λs.

2. Using the formula for the correlation yields
$$Corr(N_s, N_t) = \frac{Cov(N_s, N_t)}{(Var[N_s]\, Var[N_t])^{1/2}} = \frac{\lambda s}{(\lambda s\, \lambda t)^{1/2}} = \sqrt{\frac{s}{t}}.$$
It is worth noting the similarity with Proposition 2.1.7.

Proposition 2.8.3 Let Nt be Ft -adapted. Then the process Mt = Nt − λt is a martingale.

Proof: Let s < t and write Nt = Ns + (Nt − Ns ). Then

E[Nt |Fs ] = E[Ns + (Nt − Ns )|Fs ]


= E[Ns |Fs ] + E[Nt − Ns |Fs ]
= Ns + E[Nt − Ns ]
= Ns + λ(t − s),

where we used that Ns is Fs -measurable (and hence E[Ns |Fs ] = Ns ) and that the increment
Nt − Ns is independent of previous values of Ns and the information set Fs . Subtracting λt
yields
E[Nt − λt|Fs ] = Ns − λs,
or E[Mt |Fs ] = Ms . Since it is obvious that Mt is Ft -adapted, it follows that Mt is a martingale.

It is worth noting that the Poisson process $N_t$ itself is not a martingale. The martingale process $M_t = N_t - \lambda t$ is called the compensated Poisson process.

Exercise 2.8.4 Compute E[Nt2 |Fs ]. Is the process Nt2 a martingale?



Exercise 2.8.5 (i) Show that the moment generating function of the random variable $N_t$ is
$$m_{N_t}(x) = e^{\lambda t (e^x - 1)}.$$
(ii) Find the expressions $E[N_t^2]$, $E[N_t^3]$, and $E[N_t^4]$.

Exercise 2.8.6 Find the mean and variance of the process $X_t = e^{N_t}$.

Exercise 2.8.7 (i) Show that the moment generating function of the random variable $M_t$ is
$$m_{M_t}(x) = e^{\lambda t (e^x - x - 1)}.$$
(ii) Let $s < t$. Verify that
$$E[M_t - M_s] = 0, \qquad E[(M_t - M_s)^2] = \lambda(t-s), \qquad E[(M_t - M_s)^3] = \lambda(t-s),$$
$$E[(M_t - M_s)^4] = \lambda(t-s) + 3\lambda^2(t-s)^2.$$

Exercise 2.8.8 Let $s < t$. Show that
$$Var[N_t - N_s] = \lambda(t-s), \qquad Var[(M_t - M_s)^2] = \lambda(t-s) + 2\lambda^2(t-s)^2.$$

2.8.2 Interarrival times


For each state of the world ω, the path t → Nt (ω) is a step function that exhibits unit jumps.
Each jump in the path corresponds to an occurrence of a new event. Let T1 be the random
variable which describes the time of the 1st jump. Let T2 be the time between the 1st jump
and the second one. In general, denote by Tn the time elapsed between the (n − 1)th and nth
jumps. The random variables Tn are called interarrival times.

Proposition 2.8.9 The random variables Tn are independent and exponentially distributed
with mean E[Tn ] = 1/λ.

Proof: We start by noticing that the events {T1 > t} and {Nt = 0} are the same, since both
describe the situation that no events occurred until time t. Then

P (T1 > t) = P (Nt = 0) = P (Nt − N0 = 0) = e−λt ,

and hence the distribution function of T1 is

FT1 (t) = P (T1 ≤ t) = 1 − P (T1 > t) = 1 − e−λt .

Differentiating yields the density function

$$f_{T_1}(t) = \frac{d}{dt} F_{T_1}(t) = \lambda e^{-\lambda t}.$$

It follows that $T_1$ has an exponential distribution, with $E[T_1] = 1/\lambda$. The conditional distribution of $T_2$ is
$$\begin{aligned}
F(t\,|\,s) = P(T_2 \le t\,|\,T_1 = s) &= 1 - P(T_2 > t\,|\,T_1 = s) = 1 - \frac{P(T_2 > t,\ T_1 = s)}{P(T_1 = s)} \\
&= 1 - \frac{P\big(\text{0 jumps in }(s, s+t],\ \text{1 jump in }(0,s]\big)}{P(T_1 = s)} \\
&= 1 - \frac{P\big(\text{0 jumps in }(s, s+t]\big)\, P\big(\text{1 jump in }(0,s]\big)}{P\big(\text{1 jump in }(0,s]\big)} \\
&= 1 - P(N_{s+t} - N_s = 0) = 1 - e^{-\lambda t},
\end{aligned}$$
which is independent of $s$. Then $T_2$ is independent of $T_1$ and exponentially distributed. A similar argument for any $T_n$ leads to the desired result.

2.8.3 Waiting times


The random variable Sn = T1 + T2 + · · · + Tn is called the waiting time until the nth jump.
The event {Sn ≤ t} means that there are n jumps that occurred before or at time t, i.e. there
are at least n events that happened up to time t; the event is equal to {Nt ≥ n}. Hence the
distribution function of Sn is given by

$$F_{S_n}(t) = P(S_n \le t) = P(N_t \ge n) = \sum_{k=n}^{\infty} e^{-\lambda t}\, \frac{(\lambda t)^k}{k!}.$$
Differentiating we obtain the density function of the waiting time $S_n$:
$$f_{S_n}(t) = \frac{d}{dt} F_{S_n}(t) = \frac{\lambda e^{-\lambda t} (\lambda t)^{n-1}}{(n-1)!}.$$
Writing
$$f_{S_n}(t) = \frac{t^{n-1} e^{-\lambda t}}{(1/\lambda)^n\, \Gamma(n)},$$
it turns out that $S_n$ has a gamma distribution with parameters $\alpha = n$ and $\beta = 1/\lambda$. It follows that
$$E[S_n] = \frac{n}{\lambda}, \qquad Var[S_n] = \frac{n}{\lambda^2}.$$
The relation $\lim_{n\to\infty} E[S_n] = \infty$ states that the expected waiting time grows unboundedly as $n \to \infty$.

2.8.4 The Integrated Poisson Process


The function $u \to N_u$ is continuous with the exception of a countable set of jumps of size 1. It is known that such functions are Riemann integrable, so it makes sense to define the process
$$U_t = \int_0^t N_u\, du,$$

Figure 2.2: The Poisson process Nt and the waiting times S1 , S2 , · · · Sn . The shaded rectangle
has area n(Sn+1 − t).

called the integrated Poisson process. The next result provides a relation between the process
Ut and the partial sum of the waiting times Sk .

Proposition 2.8.10 The integrated Poisson process can be expressed as
$$U_t = t N_t - \sum_{k=1}^{N_t} S_k.$$

Proof: Let $N_t = n$. Since $N_u$ is equal to $k$ between the waiting times $S_k$ and $S_{k+1}$, the process $U_t$, which is equal to the area of the subgraph of $N_u$ between 0 and $t$, can be expressed as
$$U_t = \int_0^t N_u\, du = 1\cdot(S_2 - S_1) + 2\cdot(S_3 - S_2) + \cdots + n(S_{n+1} - S_n) - n(S_{n+1} - t).$$
Since $S_n < t < S_{n+1}$, the last term subtracts the area of the rectangle of width $S_{n+1} - t$ and height $n$ lying beyond $t$. Using associativity, a computation yields
$$1\cdot(S_2 - S_1) + 2\cdot(S_3 - S_2) + \cdots + n(S_{n+1} - S_n) = nS_{n+1} - (S_1 + S_2 + \cdots + S_n).$$
Substituting in the previous relation yields
$$U_t = nS_{n+1} - (S_1 + \cdots + S_n) - n(S_{n+1} - t) = nt - (S_1 + \cdots + S_n) = tN_t - \sum_{k=1}^{N_t} S_k,$$
where we replaced $n$ by $N_t$.

Exercise 2.8.11 Find the following means:
(a) $E[U_t]$.
(b) $E\Big[\sum_{k=1}^{N_t} S_k\Big]$.

The following result deals with the quadratic variation of the compensated Poisson process
Mt = Nt − λt.
Proposition 2.8.12 Let $a < b$ and consider the partition $a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b$. Then
$$\text{ms-}\lim_{\|\Delta_n\| \to 0}\ \sum_{k=0}^{n-1} (M_{t_{k+1}} - M_{t_k})^2 = N_b - N_a, \tag{2.8.9}$$
where $\|\Delta_n\| = \sup_{0 \le k \le n-1} (t_{k+1} - t_k)$.

Proof: For the sake of simplicity we shall use the notations
$$\Delta t_k = t_{k+1} - t_k, \qquad \Delta M_k = M_{t_{k+1}} - M_{t_k}, \qquad \Delta N_k = N_{t_{k+1}} - N_{t_k}.$$
The relation we need to prove can also be written as
$$\text{ms-}\lim_{n\to\infty}\ \sum_{k=0}^{n-1} \big[(\Delta M_k)^2 - \Delta N_k\big] = 0.$$
Let
$$Y_k = (\Delta M_k)^2 - \Delta N_k = (\Delta M_k)^2 - \Delta M_k - \lambda \Delta t_k,$$
where we used $\Delta N_k = \Delta M_k + \lambda \Delta t_k$. It suffices to show that
$$E\Big[\sum_{k=0}^{n-1} Y_k\Big] = 0, \tag{2.8.10}$$
$$\lim_{n\to\infty} Var\Big[\sum_{k=0}^{n-1} Y_k\Big] = 0. \tag{2.8.11}$$

The first identity follows from the properties of Poisson processes, see Exercise 2.8.7:
$$E\Big[\sum_{k=0}^{n-1} Y_k\Big] = \sum_{k=0}^{n-1} E[Y_k] = \sum_{k=0}^{n-1} \big(E[(\Delta M_k)^2] - E[\Delta N_k]\big) = \sum_{k=0}^{n-1} (\lambda \Delta t_k - \lambda \Delta t_k) = 0.$$
For the proof of the identity (2.8.11) we first need the variance of $Y_k$:
$$\begin{aligned}
Var[Y_k] &= Var\big[(\Delta M_k)^2 - (\Delta M_k + \lambda \Delta t_k)\big] = Var\big[(\Delta M_k)^2 - \Delta M_k\big] \\
&= Var[(\Delta M_k)^2] + Var[\Delta M_k] - 2\, Cov[(\Delta M_k)^2, \Delta M_k] \\
&= \lambda \Delta t_k + 2\lambda^2 \Delta t_k^2 + \lambda \Delta t_k - 2\big(E[(\Delta M_k)^3] - E[(\Delta M_k)^2]\, E[\Delta M_k]\big) \\
&= 2\lambda^2 (\Delta t_k)^2,
\end{aligned}$$

where we used Exercise 2.8.7 and the fact that $E[\Delta M_k] = 0$. Since $M_t$ is a process with independent increments, $Cov[Y_k, Y_j] = 0$ for $k \ne j$. Then
$$Var\Big[\sum_{k=0}^{n-1} Y_k\Big] = \sum_{k=0}^{n-1} Var[Y_k] + 2\sum_{k \ne j} Cov[Y_k, Y_j] = \sum_{k=0}^{n-1} 2\lambda^2 (\Delta t_k)^2 \le 2\lambda^2 \|\Delta_n\| \sum_{k=0}^{n-1} \Delta t_k = 2\lambda^2 (b-a)\|\Delta_n\|,$$
and hence $Var\big[\sum_{k=0}^{n-1} Y_k\big] \to 0$ as $\|\Delta_n\| \to 0$. By Proposition 3.3.1 we obtain the desired limit in the mean square sense.
The previous result states that the quadratic variation of the martingale Mt between a and
b is equal to the jump of the Poisson process between a and b.
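
This can be seen at work on a single simulated path. A minimal sketch (not from the text; rate, interval, and partition size are arbitrary choices):

```python
import numpy as np

# Sum (M_{t_{k+1}} - M_{t_k})^2 over a fine partition of [a,b]; relation (2.8.9)
# says the result should be close to N_b - N_a.
rng = np.random.default_rng(6)
lam, a, b, n = 5.0, 0.0, 4.0, 200_000
grid = np.linspace(a, b, n + 1)
jumps = np.cumsum(rng.exponential(1.0 / lam, size=200))   # jump times S_k
N = np.searchsorted(jumps, grid, side="right")            # N_t at grid points
M = N - lam * grid                                        # compensated process
Q = np.sum(np.diff(M) ** 2)                               # quadratic variation sum

print(Q, N[-1] - N[0])                                    # close to each other
```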

2.8.5 The Fundamental Relation dMt2 = dNt


Recall relation (2.8.9):
$$\text{ms-}\lim_{n\to\infty} \sum_{k=0}^{n-1} (M_{t_{k+1}} - M_{t_k})^2 = N_b - N_a. \tag{2.8.12}$$
The right side can be regarded as a Riemann–Stieltjes integral
$$N_b - N_a = \int_a^b dN_t,$$
while the left side can be regarded as a stochastic integral with respect to $(dM_t)^2$:
$$\int_a^b (dM_t)^2 := \text{ms-}\lim_{n\to\infty} \sum_{k=0}^{n-1} (M_{t_{k+1}} - M_{t_k})^2.$$
Substituting in (2.8.12) yields
$$\int_a^b (dM_t)^2 = \int_a^b dN_t$$
for any $a < b$. The equivalent differential form is
$$dM_t^2 = dN_t. \tag{2.8.13}$$

This relation will be useful in formal calculations involving Ito’s formula.

2.8.6 The Relations dt dMt = 0, dWt dMt = 0


In order to show that $dt\, dM_t = 0$ in the mean square sense, it suffices to prove the limit
$$\text{ms-}\lim_{n\to\infty} \sum_{k=0}^{n-1} (t_{k+1} - t_k)(M_{t_{k+1}} - M_{t_k}) = 0, \tag{2.8.14}$$
since this is thought of as a vanishing integral of the increment process $dM_t$ with respect to $dt$:
$$\int_a^b dM_t\, dt = 0, \qquad \forall\, a, b \in \mathbb{R}.$$
Denote
$$X_n = \sum_{k=0}^{n-1} (t_{k+1} - t_k)(M_{t_{k+1}} - M_{t_k}) = \sum_{k=0}^{n-1} \Delta t_k\, \Delta M_k.$$

In order to show (2.8.14) it suffices to prove that
1. $E[X_n] = 0$;
2. $\lim_{n\to\infty} Var[X_n] = 0$.
Using the additivity of the expectation and Exercise 2.8.7 (ii),
$$E[X_n] = E\Big[\sum_{k=0}^{n-1} \Delta t_k\, \Delta M_k\Big] = \sum_{k=0}^{n-1} \Delta t_k\, E[\Delta M_k] = 0.$$

Since the Poisson process $N_t$ has independent increments, the same property holds for the compensated Poisson process $M_t$. Then $\Delta t_k \Delta M_k$ and $\Delta t_j \Delta M_j$ are independent for $k \ne j$, and using the properties of the variance we have
$$Var[X_n] = Var\Big[\sum_{k=0}^{n-1} \Delta t_k\, \Delta M_k\Big] = \sum_{k=0}^{n-1} (\Delta t_k)^2\, Var[\Delta M_k] = \lambda \sum_{k=0}^{n-1} (\Delta t_k)^3,$$
where we used
$$Var[\Delta M_k] = E[(\Delta M_k)^2] - (E[\Delta M_k])^2 = \lambda \Delta t_k,$$
see Exercise 2.8.7 (ii). If we let $\|\Delta_n\| = \max_k \Delta t_k$, then
$$Var[X_n] = \lambda \sum_{k=0}^{n-1} (\Delta t_k)^3 \le \lambda \|\Delta_n\|^2 \sum_{k=0}^{n-1} \Delta t_k = \lambda (b-a) \|\Delta_n\|^2 \to 0$$
as $n \to \infty$. Hence we proved the stochastic differential relation
$$dt\, dM_t = 0. \tag{2.8.15}$$

For showing the relation $dW_t\, dM_t = 0$, we need to prove
$$\text{ms-}\lim_{n\to\infty} Y_n = 0, \tag{2.8.16}$$
where we have denoted
$$Y_n = \sum_{k=0}^{n-1} (W_{t_{k+1}} - W_{t_k})(M_{t_{k+1}} - M_{t_k}) = \sum_{k=0}^{n-1} \Delta W_k\, \Delta M_k.$$

Since the Brownian motion $W_t$ and the process $M_t$ have independent increments, and $\Delta W_k$ is independent of $\Delta M_k$, we have
$$E[Y_n] = \sum_{k=0}^{n-1} E[\Delta W_k\, \Delta M_k] = \sum_{k=0}^{n-1} E[\Delta W_k]\, E[\Delta M_k] = 0,$$
where we used $E[\Delta W_k] = E[\Delta M_k] = 0$. Using also $E[(\Delta W_k)^2] = \Delta t_k$, $E[(\Delta M_k)^2] = \lambda \Delta t_k$, and invoking the independence of $\Delta W_k$ and $\Delta M_k$, we get
$$Var[\Delta W_k\, \Delta M_k] = E[(\Delta W_k)^2 (\Delta M_k)^2] - (E[\Delta W_k\, \Delta M_k])^2 = E[(\Delta W_k)^2]\, E[(\Delta M_k)^2] - E[\Delta W_k]^2 E[\Delta M_k]^2 = \lambda (\Delta t_k)^2.$$
Then using the independence of the terms in the sum, we get
$$Var[Y_n] = \sum_{k=0}^{n-1} Var[\Delta W_k\, \Delta M_k] = \lambda \sum_{k=0}^{n-1} (\Delta t_k)^2 \le \lambda \|\Delta_n\| \sum_{k=0}^{n-1} \Delta t_k = \lambda (b-a) \|\Delta_n\| \to 0,$$

as n → ∞. Since Yn is a random variable with mean zero and variance decreasing to zero, it
follows that Yn → 0 in the mean square sense. Hence we proved that

dWt dMt = 0. (2.8.17)

Exercise 2.8.13 Show the following stochastic differential relations


(a) dt dNt = 0
(b) dWt dNt = 0
(c) dt dWt = 0.

The relations proved in this section will be useful in the sequel, when developing the stochas-
tic model of a stock price that exhibits jumps modeled by a Poisson process.
Chapter 3

Properties of Stochastic
Processes

3.1 Hitting Times


Hitting times are useful in finance when studying barrier options and lookback options. For instance, knock-in options come into existence when the stock price hits a certain barrier before option maturity. A lookback option is priced using the maximum value of the stock until the present time. The stock price is not a Brownian motion, but it depends on one; hence the need to study the hitting time for the Brownian motion.

The first result deals with the hitting time for a Brownian motion to reach the barrier a ∈ R,
see Fig.3.1.

Lemma 3.1.1 Let $T_a$ be the first time the Brownian motion $W_t$ hits $a$. Then
$$P(T_a \le t) = \frac{2}{\sqrt{2\pi}} \int_{|a|/\sqrt{t}}^{\infty} e^{-y^2/2}\, dy.$$

Figure 3.1: The first hitting time $T_a$ given by $W_{T_a} = a$.


Proof: If $A$ and $B$ are two events, then
$$P(A) = P(A \cap B) + P(A \cap \bar{B}) = P(A|B)P(B) + P(A|\bar{B})P(\bar{B}). \tag{3.1.1}$$
Let $a > 0$. Using formula (3.1.1) for $A = \{\omega;\ W_t(\omega) \ge a\}$ and $B = \{\omega;\ T_a(\omega) \le t\}$ yields
$$P(W_t \ge a) = P(W_t \ge a\,|\,T_a \le t)\, P(T_a \le t) + P(W_t \ge a\,|\,T_a > t)\, P(T_a > t). \tag{3.1.2}$$

If $T_a > t$, the Brownian motion has not reached the barrier $a$ yet, so we must have $W_t < a$. Therefore
$$P(W_t \ge a\,|\,T_a > t) = 0.$$
If $T_a \le t$, then $W_{T_a} = a$. Since the Brownian motion is a Markov process, it starts afresh at $T_a$. Due to the symmetry of the density of a normal variable, $W_t$ has equal chances to go up or down after the time interval $t - T_a$. It follows that
$$P(W_t \ge a\,|\,T_a \le t) = \frac{1}{2}.$$
Substituting in (3.1.2) yields
$$P(T_a \le t) = 2P(W_t \ge a) = \frac{2}{\sqrt{2\pi t}} \int_a^{\infty} e^{-x^2/(2t)}\, dx = \frac{2}{\sqrt{2\pi}} \int_{a/\sqrt{t}}^{\infty} e^{-y^2/2}\, dy.$$
If $a < 0$, symmetry reasons imply that the distribution of $T_a$ is the same as that of $T_{-a}$, so we get
$$P(T_a \le t) = P(T_{-a} \le t) = \frac{2}{\sqrt{2\pi}} \int_{-a/\sqrt{t}}^{\infty} e^{-y^2/2}\, dy.$$

Theorem 3.1.2 Let a ∈ R be fixed. Then the Brownian motion hits a in finite time with
probability 1.

Proof: The probability that $W_t$ hits $a$ in finite time is
$$P(T_a < \infty) = \lim_{t\to\infty} P(T_a \le t) = \lim_{t\to\infty} \frac{2}{\sqrt{2\pi}} \int_{|a|/\sqrt{t}}^{\infty} e^{-y^2/2}\, dy = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} e^{-y^2/2}\, dy = 1,$$
where we used the well known integral
$$\int_0^{\infty} e^{-y^2/2}\, dy = \frac{1}{2} \int_{-\infty}^{\infty} e^{-y^2/2}\, dy = \frac{1}{2}\sqrt{2\pi}.$$

Remark 3.1.3 Even if the hitting time is finite with probability 1, its expectation E[Ta ] is
infinite. This means that the expected time to hit the barrier is infinite.

Corollary 3.1.4 A Brownian motion process returns to the origin in finite time with probability
1.

Proof: Choose a = 0 and apply Theorem 3.1.2.

Exercise 3.1.5 Show that the distribution function of the process $X_t = \max_{s\in[0,t]} W_s$ is given by
$$P(X_t \le a) = \frac{2}{\sqrt{2\pi}} \int_0^{a/\sqrt{t}} e^{-y^2/2}\, dy, \qquad a \ge 0.$$

The fact that a Brownian motion returns to the origin or hits a barrier almost surely is a property characteristic of dimension 1 only. The next result states that in larger dimensions this is no longer possible.

Theorem 3.1.6 Let $(a,b) \in \mathbb{R}^2$. The 2-dimensional Brownian motion $W(t) = (W_1(t), W_2(t))$ hits the point $(a,b)$ with probability zero. The same result is valid for any $n$-dimensional Brownian motion, with $n \ge 2$.
Research topic. This deals with the hitting time of a 2-dimensional Brownian motion to reach a given disk. Let $D_\epsilon(x_0) = \{x \in \mathbb{R}^2;\ |x - x_0| \le \epsilon\}$. Find the probability $P(\exists t;\ W(t) \in D_\epsilon(x_0))$ as a function of $x_0$ and $\epsilon$. Let $T_{D_\epsilon(x_0)} = \inf_{t\ge 0}\{t;\ W(t) \in D_\epsilon(x_0)\}$. Find the distribution and the expectation of the random variable $T_{D_\epsilon(x_0)}$. It is known that $P(T_{D_\epsilon(x_0)} < \infty) = 1$. However, in the $n$-dimensional version, $n \ge 3$, this probability is zero.

Research topic. This deals with the exit time of a 2-dimensional Brownian motion from the disk of radius $a$. Let
$$T_a = \inf_{t>0}\{t;\ W(t) \notin D(0,a)\} = \inf_{t>0}\{t;\ |W(t)| > a\}.$$
Find the distribution function of the exit time $T_a$ and its expectation. It is known that $P(T_a < \infty) = 1$. You may try to apply the same argument as in the proof of Lemma 3.1.1; however, because of convexity,
$$P(R_t \ge a\,|\,T_a \le t) > \frac{1}{2}.$$
What is the exact value of this probability?

Theorem 3.1.7 (The Law of Arc-sine) The probability that a Brownian motion $W_t$ does not have any zeros in the interval $(t_1, t_2)$ is equal to
$$P(W_t \ne 0,\ t_1 \le t \le t_2) = \frac{2}{\pi}\arcsin\sqrt{\frac{t_1}{t_2}}.$$
Proof: Let $A(a; t_1, t_2)$ denote the event that the Brownian motion $W_t$ takes on the value $a$ between $t_1$ and $t_2$. In particular, $A(0; t_1, t_2)$ denotes the event that $W_t$ has (at least) a zero between $t_1$ and $t_2$. Substituting $A = A(0; t_1, t_2)$ and $X = W_{t_1}$ in the following formula of conditional probability
$$P(A) = \int P(A\,|\,X = x)\, f_X(x)\, dx$$

Figure 3.2: The event $A(a; t_1, t_2)$ in the Law of Arc-sine.

yields
$$P\big(A(0; t_1, t_2)\big) = \int P\big(A(0; t_1, t_2)\,|\,W_{t_1} = x\big)\, f_{W_{t_1}}(x)\, dx = \frac{1}{\sqrt{2\pi t_1}} \int_{-\infty}^{\infty} P\big(A(0; t_1, t_2)\,|\,W_{t_1} = x\big)\, e^{-\frac{x^2}{2t_1}}\, dx. \tag{3.1.3}$$
Using the properties of $W_t$ with respect to time translation and symmetry we have
$$\begin{aligned}
P\big(A(0; t_1, t_2)\,|\,W_{t_1} = x\big) &= P\big(A(0; 0, t_2 - t_1)\,|\,W_0 = x\big) = P\big(A(-x; 0, t_2 - t_1)\,|\,W_0 = 0\big) \\
&= P\big(A(|x|; 0, t_2 - t_1)\,|\,W_0 = 0\big) = P\big(A(|x|; 0, t_2 - t_1)\big) = P\big(T_{|x|} \le t_2 - t_1\big),
\end{aligned}$$
the last identity stating that $W_t$ hits $|x|$ before $t_2 - t_1$. Using Lemma 3.1.1 yields
$$P\big(A(0; t_1, t_2)\,|\,W_{t_1} = x\big) = \frac{2}{\sqrt{2\pi (t_2 - t_1)}} \int_{|x|}^{\infty} e^{-\frac{y^2}{2(t_2 - t_1)}}\, dy.$$
Substituting in (3.1.3) we obtain
$$P\big(A(0; t_1, t_2)\big) = \frac{1}{\sqrt{2\pi t_1}} \int_{-\infty}^{\infty} \Big(\frac{2}{\sqrt{2\pi(t_2 - t_1)}} \int_{|x|}^{\infty} e^{-\frac{y^2}{2(t_2 - t_1)}}\, dy\Big)\, e^{-\frac{x^2}{2t_1}}\, dx = \frac{1}{\pi\sqrt{t_1(t_2 - t_1)}} \int_0^{\infty}\!\!\int_{|x|}^{\infty} e^{-\frac{y^2}{2(t_2 - t_1)} - \frac{x^2}{2t_1}}\, dy\, dx.$$
The above integral can be evaluated to get (see Exercise 3.1.8)
$$P\big(A(0; t_1, t_2)\big) = 1 - \frac{2}{\pi}\arcsin\sqrt{\frac{t_1}{t_2}}.$$
Using $P(W_t \ne 0,\ t_1 \le t \le t_2) = 1 - P\big(A(0; t_1, t_2)\big)$ we obtain the desired result.

Exercise 3.1.8 Use polar coordinates to show
$$\frac{1}{\pi\sqrt{t_1(t_2 - t_1)}} \int_0^{\infty}\!\!\int_{|x|}^{\infty} e^{-\frac{y^2}{2(t_2 - t_1)} - \frac{x^2}{2t_1}}\, dy\, dx = 1 - \frac{2}{\pi}\arcsin\sqrt{\frac{t_1}{t_2}}.$$
Exercise 3.1.9 Find the probability that a 2-dimensional Brownian motion $W(t) = (W_1(t), W_2(t))$ stays in the same quadrant for the time interval $t \in (t_1, t_2)$.

Exercise 3.1.10 Find the probability that a Brownian motion Wt does not take the value a in
the interval (t1 , t2 ).

Exercise 3.1.11 Let a 6= b. Find the probability that a Brownian motion Wt does not take any
of the values {a, b} in the interval (t1 , t2 ). Formulate and prove a generalization.

Research topic. What is the probability that a 2-dimensional Brownian process hits a set D
between time instances t1 and t2 ?
We provide below a similar result without proof.
Theorem 3.1.12 (Arc-sine Law of Lévy) Let $L_t^+ = \int_0^t \mathrm{sgn}^+ W_s\, ds$ be the amount of time a Brownian motion $W_t$ is positive during the time interval $[0,t]$. Then
$$P(L_t^+ \le \tau) = \frac{2}{\pi}\arcsin\sqrt{\frac{\tau}{t}}.$$

Research topic. Let Lt (D) be the amount of time spent by a 2-dimensional Brownian motion
W (t) inside the set D. Find P (Lt (D) ≤ τ ). When D is the half-plane {(x, y); y > 0} we retrieve
the previous result.

Theorem 3.1.13 Let $X_t = \mu t + W_t$ denote a Brownian motion with nonzero drift rate $\mu$, and consider $\alpha, \beta > 0$. Then
$$P(X_t \text{ goes up to } \alpha \text{ before going down to } -\beta) = \frac{e^{2\mu\beta} - 1}{e^{2\mu\beta} - e^{-2\mu\alpha}}.$$
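
This two-barrier probability lends itself to a Monte Carlo check. A minimal sketch (not from the text; drift, barriers, horizon and step size are arbitrary tuning choices, and the horizon is taken long enough that essentially every path resolves):

```python
import numpy as np

# Check Theorem 3.1.13 for X_t = mu*t + W_t with barriers alpha (up), -beta (down).
rng = np.random.default_rng(8)
mu, alpha, beta = 0.5, 1.0, 1.5
n, dt = 10_000, 2e-3                       # horizon T = 20 per path
hits_up, total = 0, 0

for _ in range(20):                        # 20 batches of 500 paths (memory-friendly)
    X = np.cumsum(mu * dt + rng.normal(0.0, np.sqrt(dt), size=(500, n)), axis=1)
    up = np.where((X >= alpha).any(axis=1), (X >= alpha).argmax(axis=1), n + 1)
    dn = np.where((X <= -beta).any(axis=1), (X <= -beta).argmax(axis=1), n + 1)
    hits_up += (up < dn).sum()             # alpha reached strictly before -beta
    total += 500

exact = (np.exp(2 * mu * beta) - 1) / (np.exp(2 * mu * beta) - np.exp(-2 * mu * alpha))
print(hits_up / total, exact)              # both close to ~0.846
```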

3.2 Limits of Stochastic Processes


Let $(X_t)_{t\ge 0}$ be a stochastic process. One can make sense of the limit expression $X = \lim_{t\to\infty} X_t$ in a similar way as we did in section 1.12 for sequences of random variables. We shall re-write the definitions for the continuous case.

Almost Certain Limit


The process $X_t$ converges almost certainly to $X$ if for all states of the world $\omega$, except a set of probability zero, we have
$$\lim_{t\to\infty} X_t(\omega) = X(\omega).$$
We shall write ac-$\lim_{t\to\infty} X_t = X$. This is also sometimes called strong convergence.

Mean Square Limit


We say that the process $X_t$ converges to $X$ in mean square if
$$\lim_{t\to\infty} E[(X_t - X)^2] = 0.$$
In this case we write ms-$\lim_{t\to\infty} X_t = X$.

Limit in Probability or Stochastic Limit


The stochastic process $X_t$ converges in stochastic limit to $X$ if, for every $\epsilon > 0$,
$$\lim_{t\to\infty} P\big(\omega;\ |X_t(\omega) - X(\omega)| > \epsilon\big) = 0.$$
This limit is abbreviated by st-$\lim_{t\to\infty} X_t = X$.
It is worth noting that, as in the case of sequences of random variables, both almost certain convergence and convergence in mean square imply stochastic convergence.

Limit in Distribution
We say that $X_t$ converges in distribution to $X$ if for any continuous bounded function $\varphi(x)$ we have
$$\lim_{t\to\infty} E[\varphi(X_t)] = E[\varphi(X)].$$
It is worth noting that stochastic convergence implies convergence in distribution.

3.3 Convergence Theorems


The following property is a reformulation of the Exercise 1.12.1 in the continuous setup.

Proposition 3.3.1 Consider a stochastic process Xt such that E[Xt ] → k, constant, and
V ar(Xt ) → 0 as t → ∞. Then ms-lim Xt = k.
t→∞

Next we shall provide a few applications that show how some processes compare with powers
of t for t large.

Application 3.3.2 If $\alpha > 1/2$, then
$$\text{ms-}\lim_{t\to\infty} \frac{W_t}{t^\alpha} = 0.$$
Proof: Let $X_t = \dfrac{W_t}{t^\alpha}$. Then $E[X_t] = \dfrac{E[W_t]}{t^\alpha} = 0$, and
$$Var[X_t] = \frac{1}{t^{2\alpha}}\, Var[W_t] = \frac{t}{t^{2\alpha}} = \frac{1}{t^{2\alpha-1}}$$
for any $t > 0$. Since $\frac{1}{t^{2\alpha-1}} \to 0$ as $t \to \infty$, applying Proposition 3.3.1 yields ms-$\lim_{t\to\infty} \frac{W_t}{t^\alpha} = 0$.

Corollary 3.3.3 We have ms-$\lim_{t\to\infty} \dfrac{W_t}{t} = 0$.

Application 3.3.4 Let $Z_t = \int_0^t W_s\, ds$. If $\beta > 3/2$, then
$$\text{ms-}\lim_{t\to\infty} \frac{Z_t}{t^\beta} = 0.$$
Proof: Let $X_t = \dfrac{Z_t}{t^\beta}$. Then $E[X_t] = \dfrac{E[Z_t]}{t^\beta} = 0$, and
$$Var[X_t] = \frac{1}{t^{2\beta}}\, Var[Z_t] = \frac{t^3}{3t^{2\beta}} = \frac{1}{3t^{2\beta-3}}$$
for any $t > 0$. Since $\frac{1}{3t^{2\beta-3}} \to 0$ as $t \to \infty$, applying Proposition 3.3.1 leads to the desired result.

Application 3.3.5 For any $p > 0$, $c \ge 1$ we have
$$\text{ms-}\lim_{t\to\infty} \frac{e^{W_t - ct}}{t^p} = 0.$$
Proof: Consider the process $X_t = \dfrac{e^{W_t - ct}}{t^p} = \dfrac{e^{W_t}}{t^p e^{ct}}$. Since
$$E[X_t] = \frac{E[e^{W_t}]}{t^p e^{ct}} = \frac{e^{t/2}}{t^p e^{ct}} = \frac{1}{e^{(c - \frac{1}{2})t}\, t^p} \to 0, \qquad t \to \infty,$$
$$Var[X_t] = \frac{Var[e^{W_t}]}{t^{2p} e^{2ct}} = \frac{e^{2t} - e^t}{t^{2p} e^{2ct}} = \frac{1}{t^{2p}}\Big(\frac{1}{e^{2t(c-1)}} - \frac{1}{e^{t(2c-1)}}\Big) \to 0,$$
Proposition 3.3.1 leads to the desired result.

Application 3.3.6 If $\beta > 1/2$, then
$$\text{ms-}\lim_{t\to\infty} \frac{\max_{0\le s\le t} W_s}{t^\beta} = 0.$$
Proof: Let $X_t = \dfrac{\max_{0\le s\le t} W_s}{t^\beta}$. There is an $s \in [0,t]$ such that $W_s = \max_{0\le s\le t} W_s$, so $X_t = \dfrac{W_s}{t^\beta}$. The mean and the variance satisfy
$$E[X_t] = \frac{E[W_s]}{t^\beta} = 0, \qquad Var[X_t] = \frac{Var[W_s]}{t^{2\beta}} = \frac{s}{t^{2\beta}} \le \frac{t}{t^{2\beta}} = \frac{1}{t^{2\beta-1}} \to 0, \quad t \to \infty.$$
Apply Proposition 3.3.1 to get the desired result.

Remark 3.3.7 The strongest result regarding limits of Brownian motion is the law of the iterated logarithm, which goes back to Khinchin:
$$\limsup_{t\to\infty} \frac{W_t}{\sqrt{2t \ln\ln t}} = 1$$
almost certainly.

Proposition 3.3.8 Let $X_t$ be a stochastic process. Then
$$\text{ms-}\lim_{t\to\infty} X_t = 0 \iff \text{ms-}\lim_{t\to\infty} X_t^2 = 0.$$

Proof: Left as an exercise.

Exercise 3.3.9 Let $X_t$ be a stochastic process. Show that
$$\text{ms-}\lim_{t\to\infty} X_t = 0 \iff \text{ms-}\lim_{t\to\infty} |X_t| = 0.$$

Another convergence result can be obtained if we consider the continuous analog of Exercise 1.12.2:

Proposition 3.3.10 Let $X_t$ be a stochastic process such that there is a $p > 0$ with $E[|X_t|^p] \to 0$ as $t \to \infty$. Then st-$\lim_{t\to\infty} X_t = 0$.

Application 3.3.11 We shall show that for any $\alpha > 1/2$
$$\text{st-}\lim_{t\to\infty} \frac{W_t}{t^\alpha} = 0.$$
Proof: Consider the process $X_t = \dfrac{W_t}{t^\alpha}$. By Proposition 3.3.8 it suffices to show st-$\lim_{t\to\infty} X_t^2 = 0$. Since
$$E[|X_t|^2] = E[X_t^2] = E\Big[\frac{W_t^2}{t^{2\alpha}}\Big] = \frac{E[W_t^2]}{t^{2\alpha}} = \frac{t}{t^{2\alpha}} = \frac{1}{t^{2\alpha-1}} \to 0, \qquad t \to \infty,$$
Proposition 3.3.10 yields st-$\lim_{t\to\infty} X_t^2 = 0$.

The following result can be regarded as L'Hospital's rule for sequences:

Lemma 3.3.12 (Cesaró-Stoltz) Let $x_n$ and $y_n$ be two sequences of real numbers, $n \ge 1$. If the limit $\lim_{n\to\infty} \dfrac{x_{n+1} - x_n}{y_{n+1} - y_n}$ exists and is equal to $L$, then the following limit exists:
$$\lim_{n\to\infty} \frac{x_n}{y_n} = L.$$
Proof: (Sketch) Assume there are differentiable functions $f$ and $g$ such that $f(n) = x_n$ and $g(n) = y_n$. (How do you construct these functions?) From Cauchy's theorem¹ there is a $c_n \in (n, n+1)$ such that
$$L = \lim_{n\to\infty} \frac{x_{n+1} - x_n}{y_{n+1} - y_n} = \lim_{n\to\infty} \frac{f(n+1) - f(n)}{g(n+1) - g(n)} = \lim_{n\to\infty} \frac{f'(c_n)}{g'(c_n)}.$$
Since $c_n \to \infty$ as $n \to \infty$, we can write the aforementioned limit also as
$$\lim_{t\to\infty} \frac{f'(t)}{g'(t)} = L.$$
(Here one may argue against this, but we recall the freedom of choice for the functions $f$ and $g$, so that $c_n$ can be any number between $n$ and $n+1$.) By l'Hospital's rule we get
$$\lim_{t\to\infty} \frac{f(t)}{g(t)} = L.$$
¹This says that if $f$ and $g$ are differentiable on $(a,b)$ and continuous on $[a,b]$, then there is a $c \in (a,b)$ such that $\dfrac{f(b) - f(a)}{g(b) - g(a)} = \dfrac{f'(c)}{g'(c)}$.

Setting $t = n$ yields $\lim_{n\to\infty} \dfrac{x_n}{y_n} = L$.
The next application states that if a sequence is convergent, then the arithmetic average of
its terms is also convergent, and the sequences have the same limit.
Example 3.3.1 Let $a_n$ be a convergent sequence with $\lim_{n\to\infty} a_n = L$. Let
$$A_n = \frac{a_1 + a_2 + \cdots + a_n}{n}$$
be the arithmetic average of the first $n$ terms. Show that $A_n$ is convergent and $\lim_{n\to\infty} A_n = L$.

Proof: This is an application of the Cesaró-Stoltz lemma. Consider the sequences $x_n = a_1 + a_2 + \cdots + a_n$ and $y_n = n$. Since
$$\frac{x_{n+1} - x_n}{y_{n+1} - y_n} = \frac{(a_1 + \cdots + a_{n+1}) - (a_1 + \cdots + a_n)}{(n+1) - n} = a_{n+1},$$
then
$$\lim_{n\to\infty} \frac{x_{n+1} - x_n}{y_{n+1} - y_n} = \lim_{n\to\infty} a_{n+1} = L.$$
Applying the Cesaró-Stoltz lemma yields
$$\lim_{n\to\infty} A_n = \lim_{n\to\infty} \frac{x_n}{y_n} = L.$$

Exercise 3.3.13 Let $b_n$ be a convergent sequence with $\lim_{n\to\infty} b_n = L$. Let
$$G_n = (b_1 \cdot b_2 \cdots b_n)^{1/n}$$
be the geometric average of the first $n$ terms. Show that $G_n$ is convergent and $\lim_{n\to\infty} G_n = L$.

The following result extends the Cesaró-Stoltz lemma to sequences of random variables.

Proposition 3.3.14 Let $X_n$, $Y_n$ be sequences of random variables on the probability space $(\Omega, \mathcal{F}, P)$ such that
$$\text{ac-}\lim_{n\to\infty} \frac{X_{n+1} - X_n}{Y_{n+1} - Y_n} = L.$$
Then
$$\text{ac-}\lim_{n\to\infty} \frac{X_n}{Y_n} = L.$$
Proof: Consider the sets
$$A = \Big\{\omega \in \Omega;\ \lim_{n\to\infty} \frac{X_{n+1}(\omega) - X_n(\omega)}{Y_{n+1}(\omega) - Y_n(\omega)} = L\Big\}, \qquad B = \Big\{\omega \in \Omega;\ \lim_{n\to\infty} \frac{X_n(\omega)}{Y_n(\omega)} = L\Big\}.$$
Since for any given state of the world $\omega$ the sequences $x_n = X_n(\omega)$ and $y_n = Y_n(\omega)$ are numerical sequences, Lemma 3.3.12 yields the inclusion $A \subset B$. This implies $P(A) \le P(B)$, and since $P(A) = 1$, it follows that $P(B) = 1$, which leads to the desired conclusion.
Remark 3.3.16 Let $X_n$ and $Y_n$ denote the prices of two stocks on day $n$. The previous result states that if $Corr(X_{n+1} - X_n, Y_{n+1} - Y_n) \to 1$ as $n \to \infty$, then $Corr(X_n, Y_n) \to 1$. So, if the correlation of the daily changes of the stock prices tends to 1 in the long run, then the correlation of the stock prices does the same.
Example 3.3.17 Let $S_n$ denote the price of a stock on day $n$, and assume that ac-$\lim_{n\to\infty} S_n = L$. Then
$$\text{ac-}\lim_{n\to\infty} \frac{S_1 + \cdots + S_n}{n} = L \qquad \text{and} \qquad \text{ac-}\lim_{n\to\infty} (S_1 \cdots S_n)^{1/n} = L.$$
This says that if almost all future simulations of the stock price approach the steady state limit $L$, then the arithmetic and geometric averages converge to the same limit. The statement is a consequence of Proposition 3.3.14 and follows a proof similar to Example 3.3.1. Asian options have payoffs depending on these types of averages, as we shall see in the sequel.
Research topic: Extend Proposition 3.3.14 to the continuous case of stochastic processes.

3.3.1 The Martingale Convergence Theorem


We state now a result which is a powerful way of proving almost certain convergence.
Theorem 3.3.18 Let Xn be a martingale with bounded means
∃M > 0 such that E[|Xn |] ≤ M, ∀n ≥ 1. (3.3.4)
Then there is L < ∞ such that
¡ ¢
P ω; lim Xn (ω) = L = 1.
n→∞
2
Since E[|Xn |] ≤ E[Xn2 ], the boundness condition (3.3.4) can be replaced by its stronger version
∃M > 0 such that E[Xn2 ] ≤ M, ∀n ≥ 1.
Example 3.3.2 It is known that Xt = eWt −t/2 is a martingale. Since
E[Xt ] = E[eWt −t/2 ] = e−t/2 E[eWt ] = e−t/2 et/2 = 1,
by the Martingale Convergence Theorem there is a number L such that Xt → L a.c. as t → ∞.
What is the limit L? How did you make your guess?
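
A simulation hint for the question above (a sketch, not from the text; horizon and path count are arbitrary): each path of $e^{W_t - t/2}$ decays to 0 even though every time slice has mean 1, the mass escaping along rare large paths.

```python
import numpy as np

# Simulate a few paths of X_t = exp(W_t - t/2) up to a large horizon.
rng = np.random.default_rng(9)
t_max, n, paths = 50.0, 5000, 8
dt = t_max / n
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(paths, n)), axis=1)
t = dt * np.arange(1, n + 1)
X = np.exp(W - t / 2)

print(X[:, -1])    # all terminal values close to 0: the a.c. limit is L = 0
```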

3.3.2 The Squeeze Theorem


The following result is the analog of the Squeeze Theorem from usual Calculus.
Theorem 3.3.19 Let Xn , Yn , Zn be sequences of random variables on the probability space
(Ω, F, P ) such that
Xn ≤ Yn ≤ Zn a.s. ∀n ≥ 1.
If Xn and Zn converge to L as n → ∞ almost certainly (or in mean square, or stochastic or
in distribution), then Yn converges to L in a similar mode.

Proof: For any state of the world ω ∈ Ω consider the sequences xn = Xn (ω), yn = Yn (ω) and
zn = Zn (ω) and apply the usual Squeeze Theorem to them.

Remark 3.3.20 The previous theorem remains valid if n is replaced by a continuous positive
parameter t.

Example 3.3.3 Show that ac-$\lim_{t\to\infty} \dfrac{W_t \sin(W_t)}{t} = 0$.

Proof: Since $|\sin(W_t)| \le 1$, consider the processes $X_t = -\dfrac{|W_t|}{t}$, $Y_t = \dfrac{W_t \sin(W_t)}{t}$ and $Z_t = \dfrac{|W_t|}{t}$, so that $X_t \le Y_t \le Z_t$. From Application 3.3.11 we have ac-$\lim_{t\to\infty} Z_t = 0$, and hence also ac-$\lim_{t\to\infty} X_t = 0$. Applying the Squeeze Theorem we obtain the desired result.
t→∞
58
Chapter 4

Stochastic Integration

This chapter deals with one of the most useful stochastic integrals, the Ito integral. This type of stochastic integral was introduced in 1944 by the Japanese mathematician K. Ito, and was originally motivated by a construction of diffusion processes.

4.0.3 Nonanticipating Processes


Consider the Brownian motion $W_t$. A process $F_t$ is called a nonanticipating process if $F_t$ is independent of the increment $W_{t'} - W_t$ for any $t$ and $t'$ with $t < t'$. Consequently, the process $F_t$ is independent of the future behavior of the Brownian motion, i.e. it cannot anticipate the future. For instance, $W_t$, $e^{W_t}$, $W_t^2 - W_t + t$ are examples of nonanticipating processes, while $W_{t+1}$ or $\frac{1}{2}(W_{t+1} - W_t)^2$ are not.
Nonanticipating processes are important because the Ito integral concept applies only to
them.
If Ft denotes the information known until time t, where this information is generated by
the Brownian motion {Ws ; s ≤ t}, then any Ft -adapted process Ft is nonanticipating.

4.0.4 Increments of Brownian Motions


In this section we shall discuss a few basic properties of the increments of a Brownian motion,
which will be useful when computing stochastic integrals.

Proposition 4.0.21 Let $W_t$ be a Brownian motion. If $s < t$, we have
1. $E[(W_t - W_s)^2] = t - s$.
2. $Var[(W_t - W_s)^2] = 2(t-s)^2$.

Proof: 1. Using that $W_t - W_s \sim N(0, t-s)$, so that $E[W_t - W_s] = 0$, we have
$$E[(W_t - W_s)^2] = Var(W_t - W_s) + (E[W_t - W_s])^2 = t - s.$$
2. Dividing by the standard deviation yields the standard normal random variable $\dfrac{W_t - W_s}{\sqrt{t-s}} \sim N(0,1)$. Its square, $\dfrac{(W_t - W_s)^2}{t-s}$, is $\chi^2$-distributed with 1 degree of freedom.¹ Its mean is 1 and its variance is 2. This implies
$$E\Big[\frac{(W_t - W_s)^2}{t-s}\Big] = 1 \implies E[(W_t - W_s)^2] = t - s;$$
$$Var\Big[\frac{(W_t - W_s)^2}{t-s}\Big] = 2 \implies Var[(W_t - W_s)^2] = 2(t-s)^2.$$
¹A $\chi^2$-distributed random variable with $n$ degrees of freedom has mean $n$ and variance $2n$.

Remark 4.0.22 The infinitesimal version of the previous result is obtained by replacing $t - s$ with $dt$:
1. $E[dW_t^2] = dt$;
2. $Var[dW_t^2] = 2\, dt^2$.
We shall see in a later section that in fact $dW_t^2$ and $dt$ are equal in a mean square sense.

Exercise 4.0.23 Show that


1. E[(Wt − Ws )4 ] = 3(t − s)2 ;
2. E[(Wt − Ws )6 ] = 15(t − s)3 .

4.1 The Ito Integral


The Ito integral is defined in a way that is similar to the Riemann integral. The Ito integral is taken with respect to the infinitesimal increments of a Brownian motion, $dW_t$, which are random variables, while the Riemann integral considers integration with respect to the predictable infinitesimal changes $dt$. It is worth noting that the Ito integral is a random variable, while the Riemann integral is just a real number. Despite this fact, there are several common properties and relations between these two types of integrals.
Consider $0 \le a < b$ and let $F_t = f(W_t, t)$ be a nonanticipating process with
$$E\Big[\int_a^b F_t^2\, dt\Big] < \infty.$$

Divide the interval $[a,b]$ into $n$ subintervals using the partition points
$$a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b,$$
and consider the partial sums
$$S_n = \sum_{i=0}^{n-1} F_{t_i}(W_{t_{i+1}} - W_{t_i}).$$
We emphasize that the intermediate points are the left endpoints of each interval, and this is the way they should always be chosen. Since the process $F_t$ is nonanticipative, the random variables $F_{t_i}$ and $W_{t_{i+1}} - W_{t_i}$ are independent; this is an important feature in the definition of the Ito integral.
The Ito integral is the limit of the partial sums $S_n$:
$$\text{ms-}\lim_{n\to\infty} S_n = \int_a^b F_t\, dW_t,$$
provided the limit exists. It can be shown that the choice of partition does not influence the value of the Ito integral. This is the reason why, for practical purposes, it suffices to assume the intervals equidistant, i.e.
$$t_i = a + \frac{i(b-a)}{n}, \qquad t_{i+1} - t_i = \frac{b-a}{n}, \qquad i = 0, 1, \dots, n-1.$$
The previous convergence is in the mean square sense, i.e.
$$\lim_{n\to\infty} E\Big[\Big(S_n - \int_a^b F_t\, dW_t\Big)^2\Big] = 0.$$

Existence of the Ito integral

It is known that the Ito stochastic integral $\int_a^b F_t\, dW_t$ exists if the process $F_t = f(W_t, t)$ satisfies the following two properties:
1. The paths $\omega \to F_t(\omega)$ are continuous on $[a,b]$ for any state of the world $\omega \in \Omega$;
2. The process $F_t$ is nonanticipating for $t \in [a,b]$.
For instance, the following stochastic integrals exist:
$$\int_0^T W_t^2\, dW_t, \qquad \int_0^T \sin(W_t)\, dW_t, \qquad \int_a^b \frac{\cos(W_t)}{t}\, dW_t.$$

4.2 Examples of Ito integrals


As in the case of Riemann integral, using the definition is not an efficient way of computing
integrals. The same philosophy applies to Ito integrals. We shall compute in the following two
simple Ito integrals. In later sections we shall introduce more efficient methods for computing
Ito integrals.

4.2.1 The case Ft = c, constant


In this case the partial sums can be computed explicitly:
$$S_n = \sum_{i=0}^{n-1} F_{t_i}(W_{t_{i+1}} - W_{t_i}) = \sum_{i=0}^{n-1} c\, (W_{t_{i+1}} - W_{t_i}) = c(W_b - W_a),$$
and since the answer does not depend on $n$, we have
$$\int_a^b c\, dW_t = c(W_b - W_a).$$
In particular, taking $c = 1$, since the Brownian motion starts at 0, we have the following formula:
$$\int_0^T dW_t = W_T.$$

4.2.2 The case Ft = Wt


We shall integrate the process $W_t$ between 0 and $T$. Considering an equidistant partition, we take $t_k = \dfrac{kT}{n}$, $k = 0, 1, \dots, n$. The partial sums are given by
$$S_n = \sum_{i=0}^{n-1} W_{t_i}(W_{t_{i+1}} - W_{t_i}).$$
Since
$$xy = \frac{1}{2}\big[(x+y)^2 - x^2 - y^2\big],$$
letting $x = W_{t_i}$ and $y = W_{t_{i+1}} - W_{t_i}$ (so $x + y = W_{t_{i+1}}$) yields
$$W_{t_i}(W_{t_{i+1}} - W_{t_i}) = \frac{1}{2}W_{t_{i+1}}^2 - \frac{1}{2}W_{t_i}^2 - \frac{1}{2}(W_{t_{i+1}} - W_{t_i})^2.$$
Then after pair cancelations the sum becomes
$$S_n = \frac{1}{2}\sum_{i=0}^{n-1} W_{t_{i+1}}^2 - \frac{1}{2}\sum_{i=0}^{n-1} W_{t_i}^2 - \frac{1}{2}\sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2 = \frac{1}{2}W_{t_n}^2 - \frac{1}{2}\sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2.$$
Using $t_n = T$, we get
$$S_n = \frac{1}{2}W_T^2 - \frac{1}{2}\sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2.$$
Since the first term is independent of $n$, we have
$$\text{ms-}\lim_{n\to\infty} S_n = \frac{1}{2}W_T^2 - \frac{1}{2}\,\text{ms-}\lim_{n\to\infty}\sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2. \tag{4.2.1}$$

In the following we shall compute the limit of the last term. Denote
$$X_n = \sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2.$$
Since the increments are independent, Proposition 4.0.21 yields
$$E[X_n] = \sum_{i=0}^{n-1} E[(W_{t_{i+1}} - W_{t_i})^2] = \sum_{i=0}^{n-1}(t_{i+1} - t_i) = t_n - t_0 = T;$$
$$Var[X_n] = \sum_{i=0}^{n-1} Var[(W_{t_{i+1}} - W_{t_i})^2] = \sum_{i=0}^{n-1} 2(t_{i+1} - t_i)^2 = \frac{2T^2}{n},$$
where we used that the partition is equidistant. Since $X_n$ satisfies the conditions
$$E[X_n] = T, \quad \forall n \ge 1; \qquad Var[X_n] \to 0, \quad n \to \infty,$$
by Proposition 3.3.1 we obtain ms-$\lim_{n\to\infty} X_n = T$, or
$$\text{ms-}\lim_{n\to\infty}\sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2 = T. \tag{4.2.2}$$
This states that the quadratic variation of the Brownian motion on $[0,T]$ is $T$. Hence (4.2.1) becomes
$$\text{ms-}\lim_{n\to\infty} S_n = \frac{1}{2}W_T^2 - \frac{1}{2}T.$$
We have obtained the following explicit formula for a stochastic integral:
$$\int_0^T W_t\, dW_t = \frac{1}{2}W_T^2 - \frac{1}{2}T.$$

In a similar way one can obtain
$$\int_a^b W_t\, dW_t = \frac{1}{2}(W_b^2 - W_a^2) - \frac{1}{2}(b-a).$$
It is worth noting that the right side contains random variables depending on the limits of integration $a$ and $b$.

Exercise 4.2.1 Show the following identities:
1. $E\big[\int_0^T dW_t\big] = 0$;
2. $E\big[\int_0^T W_t\, dW_t\big] = 0$;
3. $Var\big[\int_0^T W_t\, dW_t\big] = \dfrac{T^2}{2}$.

4.3 The Fundamental Relation dWt2 = dt


The relation discussed in this section can be regarded as the fundamental relation of Stochastic Calculus. We start by recalling relation (4.2.2):
$$\text{ms-}\lim_{n\to\infty}\sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2 = T. \tag{4.3.3}$$
The right side can be regarded as a regular Riemann integral
$$T = \int_0^T dt,$$
while the left side can be regarded as a stochastic integral with respect to $(dW_t)^2$:
$$\int_0^T (dW_t)^2 := \text{ms-}\lim_{n\to\infty}\sum_{i=0}^{n-1}(W_{t_{i+1}} - W_{t_i})^2.$$
Substituting in (4.3.3) yields
$$\int_0^T (dW_t)^2 = \int_0^T dt, \qquad \forall T > 0.$$
The differential form of this integral equation is
$$dW_t^2 = dt.$$
Roughly speaking, the process $dW_t^2$, which is the square of the infinitesimal increments of a Brownian motion, is totally predictable. This relation plays a central role in Stochastic Calculus and will be useful when dealing with Ito's Lemma.
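
The concentration of the sums in (4.3.3) can be seen numerically. A minimal sketch (not from the text; the horizon and partition sizes are arbitrary):

```python
import numpy as np

# The sums sum (W_{t_{i+1}} - W_{t_i})^2 concentrate at T as the partition
# is refined, illustrating dW_t^2 = dt.
rng = np.random.default_rng(11)
T = 2.0
for n in (100, 10_000, 1_000_000):
    dW = rng.normal(0.0, np.sqrt(T / n), size=n)
    print(n, np.sum(dW**2))    # approaches T = 2 as n grows
```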

4.4 Properties of the Ito Integral


We start with some properties which are similar to those of the Riemann integral.

Proposition 4.4.1 Let $f(W_t, t)$, $g(W_t, t)$ be nonanticipating processes and $c \in \mathbb{R}$. Then we have
1. Additivity:
$$\int_0^T [f(W_t, t) + g(W_t, t)]\, dW_t = \int_0^T f(W_t, t)\, dW_t + \int_0^T g(W_t, t)\, dW_t.$$
2. Homogeneity:
$$\int_0^T c f(W_t, t)\, dW_t = c \int_0^T f(W_t, t)\, dW_t.$$
3. Partition property:
$$\int_0^T f(W_t, t)\, dW_t = \int_0^u f(W_t, t)\, dW_t + \int_u^T f(W_t, t)\, dW_t, \qquad \forall\, 0 < u < T.$$
Proof: 1. Consider the partial sum sequences
$$X_n = \sum_{i=0}^{n-1} f(W_{t_i}, t_i)(W_{t_{i+1}} - W_{t_i}), \qquad Y_n = \sum_{i=0}^{n-1} g(W_{t_i}, t_i)(W_{t_{i+1}} - W_{t_i}).$$

Since ms-$\lim_{n\to\infty} X_n = \int_0^T f(W_t, t)\, dW_t$ and ms-$\lim_{n\to\infty} Y_n = \int_0^T g(W_t, t)\, dW_t$, using Proposition 1.13.2 yields
$$\begin{aligned}
\int_0^T \big(f(W_t,t) + g(W_t,t)\big)\, dW_t &= \text{ms-}\lim_{n\to\infty}\sum_{i=0}^{n-1}\big(f(W_{t_i}, t_i) + g(W_{t_i}, t_i)\big)(W_{t_{i+1}} - W_{t_i}) \\
&= \text{ms-}\lim_{n\to\infty}\Big[\sum_{i=0}^{n-1} f(W_{t_i}, t_i)(W_{t_{i+1}} - W_{t_i}) + \sum_{i=0}^{n-1} g(W_{t_i}, t_i)(W_{t_{i+1}} - W_{t_i})\Big] \\
&= \text{ms-}\lim_{n\to\infty}(X_n + Y_n) = \text{ms-}\lim_{n\to\infty} X_n + \text{ms-}\lim_{n\to\infty} Y_n \\
&= \int_0^T f(W_t, t)\, dW_t + \int_0^T g(W_t, t)\, dW_t.
\end{aligned}$$
The proofs of parts 2 and 3 are left as an exercise to the reader.
Some other properties, such as monotonicity, do not hold in general. It is possible to have a nonnegative process $F_t$ for which the random variable $\int_0^T F_t\, dW_t$ has negative values.
Some of the random variable properties of the Ito integral are given by the following result.

Proposition 4.4.2 We have
1. Zero mean:
$$E\Big[\int_a^b f(W_t, t)\, dW_t\Big] = 0.$$
2. Isometry:
$$E\Big[\Big(\int_a^b f(W_t, t)\, dW_t\Big)^2\Big] = E\Big[\int_a^b f(W_t, t)^2\, dt\Big].$$
3. Covariance:
$$E\Big[\Big(\int_a^b f(W_t, t)\, dW_t\Big)\Big(\int_a^b g(W_t, t)\, dW_t\Big)\Big] = E\Big[\int_a^b f(W_t, t)\, g(W_t, t)\, dt\Big].$$

We shall discuss the previous properties, giving rough reasons why they hold true. The detailed proofs are beyond the goal of this book.
1. The Ito integral is the mean square limit of the partial sums $S_n = \sum_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i})$, where we denoted $f_{t_i} = f(W_{t_i}, t_i)$. Since $f(W_t, t)$ is a nonanticipative process, $f_{t_i}$ is independent of the increment $W_{t_{i+1}} - W_{t_i}$, and then we have
$$E[S_n] = E\Big[\sum_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i})\Big] = \sum_{i=0}^{n-1} E[f_{t_i}(W_{t_{i+1}} - W_{t_i})] = \sum_{i=0}^{n-1} E[f_{t_i}]\, E[W_{t_{i+1}} - W_{t_i}] = 0,$$
because the increments have mean zero. Since each partial sum has zero mean, their limit, which is the Ito integral, will also have zero mean.

2. The square of the partial sum can be written as
$$S_n^2 = \Big(\sum_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i})\Big)^2 = \sum_{i=0}^{n-1} f_{t_i}^2 (W_{t_{i+1}} - W_{t_i})^2 + 2\sum_{i\ne j} f_{t_i}(W_{t_{i+1}} - W_{t_i})\, f_{t_j}(W_{t_{j+1}} - W_{t_j}).$$
Using the independence yields
$$E[S_n^2] = \sum_{i=0}^{n-1} E[f_{t_i}^2]\, E[(W_{t_{i+1}} - W_{t_i})^2] + 2\sum_{i\ne j} E[f_{t_i}]\, E[W_{t_{i+1}} - W_{t_i}]\, E[f_{t_j}]\, E[W_{t_{j+1}} - W_{t_j}] = \sum_{i=0}^{n-1} E[f_{t_i}^2](t_{i+1} - t_i),$$
which are the Riemann sums of the integral $\int_a^b E[f_t^2]\, dt = E\big[\int_a^b f_t^2\, dt\big]$, where the last identity follows from Fubini's theorem. Hence $E[S_n^2]$ converges to the aforementioned integral. It remains to be shown that the convergence also holds in mean square.
3. Consider the partial sums
$$S_n = \sum_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i}), \qquad V_n = \sum_{j=0}^{n-1} g_{t_j}(W_{t_{j+1}} - W_{t_j}).$$
Their product is
$$S_n V_n = \sum_{i=0}^{n-1} f_{t_i} g_{t_i}(W_{t_{i+1}} - W_{t_i})^2 + \sum_{i\ne j} f_{t_i} g_{t_j}(W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j}).$$
Using that $f_t$ and $g_t$ are nonanticipative and that
$$E[(W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j})] = E[W_{t_{i+1}} - W_{t_i}]\, E[W_{t_{j+1}} - W_{t_j}] = 0, \quad i \ne j,$$
$$E[(W_{t_{i+1}} - W_{t_i})^2] = t_{i+1} - t_i,$$
it follows that
$$E[S_n V_n] = \sum_{i=0}^{n-1} E[f_{t_i} g_{t_i}]\, E[(W_{t_{i+1}} - W_{t_i})^2] = \sum_{i=0}^{n-1} E[f_{t_i} g_{t_i}](t_{i+1} - t_i),$$

which is the Riemann sum for the integral $\int_a^b E[f_t g_t]\, dt$.
From 1 and 2 it follows that the random variable $\int_a^b f(W_t, t)\, dW_t$ has mean zero and variance
$$Var\Big[\int_a^b f(W_t, t)\, dW_t\Big] = E\Big[\int_a^b f(W_t, t)^2\, dt\Big].$$
From 1 and 3 it follows that
$$Cov\Big[\int_a^b f(W_t, t)\, dW_t,\ \int_a^b g(W_t, t)\, dW_t\Big] = \int_a^b E[f(W_t, t)\, g(W_t, t)]\, dt.$$

Corollary 4.4.3 (Cauchy's integral inequality) Let $f(t) = f(W_t, t)$ and $g(t) = g(W_t, t)$. Then
$$\Big(\int_a^b E[f_t g_t]\, dt\Big)^2 \le \Big(\int_a^b E[f_t^2]\, dt\Big)\Big(\int_a^b E[g_t^2]\, dt\Big).$$
Proof: It follows from the previous theorem and from the correlation inequality
$$|Corr(X, Y)| = \frac{|Cov(X,Y)|}{[Var(X)\, Var(Y)]^{1/2}} \le 1.$$
Let $\mathcal{F}_t$ be the information set at time $t$. Then $f_{t_i}$ and $W_{t_{i+1}} - W_{t_i}$ are known at time $t$, for any $t_{i+1} \le t$. It follows that the partial sum $S_n = \sum_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i})$ is $\mathcal{F}_t$-predictable. The following result states that this also holds in the mean square limit:

Proposition 4.4.4 The Ito integral $\int_0^t f_s\, dW_s$ is $\mathcal{F}_t$-predictable.

The following two results state that if the upper limit of an Ito integral is replaced by the
parameter t we obtain a continuous martingale.

Proposition 4.4.5 For any $s < t$ we have
$$E\Big[\int_0^t f(W_u, u)\, dW_u\ \Big|\ \mathcal{F}_s\Big] = \int_0^s f(W_u, u)\, dW_u.$$
Proof: Using the partition property (part 3 of Proposition 4.4.1) we get
$$E\Big[\int_0^t f(W_u, u)\, dW_u \Big| \mathcal{F}_s\Big] = E\Big[\int_0^s f(W_u, u)\, dW_u \Big| \mathcal{F}_s\Big] + E\Big[\int_s^t f(W_u, u)\, dW_u \Big| \mathcal{F}_s\Big]. \tag{4.4.4}$$
Since $\int_0^s f(W_u, u)\, dW_u$ is $\mathcal{F}_s$-predictable (see Proposition 4.4.4), by part 2 of Proposition 1.10.4
$$E\Big[\int_0^s f(W_u, u)\, dW_u \Big| \mathcal{F}_s\Big] = \int_0^s f(W_u, u)\, dW_u.$$
Since $\int_s^t f(W_u, u)\, dW_u$ contains only information between $s$ and $t$, it is independent of the information set $\mathcal{F}_s$, so
$$E\Big[\int_s^t f(W_u, u)\, dW_u \Big| \mathcal{F}_s\Big] = 0.$$
Substituting in (4.4.4) yields the desired result.
Rt
Proposition 4.4.6 Consider the process Xt = 0 f (Ws , s) dWs . Then Xt is continuous, i.e.
for almost any state of the world ω ∈ Ω, the path t → Xt (ω) is continuous.
Proof: A rigorous proof is beyond the purpose of this book. We shall provide a rough sketch.
Assume the process f (Wt , t) satisfies E[f (Wt , t)2 ] < M , for some M > 0. Let t0 be fixed and
consider h > 0. Consider the increment Yh = Xt0 +h −Xt0 . Using the aforementioned properties
of the Ito integral we have
h Z t0 +h i
E[Yh ] = E[Xt0 +h − Xt0 ] = E f (Wt , t) dWt = 0
t0
h³ Z t0 +h ´2 i Z t0 +h
E[Yh2 ] = E f (Wt , t) dWt = E[f (Wt , t)2 ] dt
t0 t0
Z t0 +h
< M dt = M h.
t0

The process Yh has zero mean for any h > 0 and its variance tends to 0 as h → 0. Using a
convergence theorem yields that Yh tends to 0 in mean square, as h → 0. This is equivalent
with the continuity of Xt at t0 .

4.5 The Wiener Integral


The Wiener integral is a particular case of the Ito stochastic integral. It is obtained by replacing the nonanticipating stochastic process $f(W_t, t)$ by the deterministic function $f(t)$. The Wiener integral $\int_a^b f(t)\, dW_t$ is the mean square limit of the partial sums
$$S_n = \sum_{i=0}^{n-1} f(t_i)(W_{t_{i+1}} - W_{t_i}).$$
All the properties of Ito integrals also hold for Wiener integrals. The Wiener integral is a random variable with mean zero,
$$E\Big[\int_a^b f(t)\, dW_t\Big] = 0,$$
and variance
$$E\Big[\Big(\int_a^b f(t)\, dW_t\Big)^2\Big] = \int_a^b f(t)^2\, dt.$$
However, in the case of Wiener integrals we can say more about the distribution itself.

Proposition 4.5.1 The Wiener integral $I(f) = \int_a^b f(t)\, dW_t$ is a normal random variable with mean 0 and variance
$$Var[I(f)] = \int_a^b f(t)^2\, dt := \|f\|_{L^2}^2.$$

Proof: Since the increments $W_{t_{i+1}} - W_{t_i}$ are normally distributed with mean 0 and variance $t_{i+1} - t_i$,
$$f(t_i)(W_{t_{i+1}} - W_{t_i}) \sim N\big(0,\ f(t_i)^2(t_{i+1} - t_i)\big).$$
Since these random variables are independent, by the Central Limit Theorem (see Theorem 2.3.1) their sum is also normally distributed, with
$$S_n = \sum_{i=0}^{n-1} f(t_i)(W_{t_{i+1}} - W_{t_i}) \sim N\Big(0,\ \sum_{i=0}^{n-1} f(t_i)^2(t_{i+1} - t_i)\Big).$$
Taking $n \to \infty$ and $\max_i |t_{i+1} - t_i| \to 0$, the normal distribution tends to
$$N\Big(0,\ \int_a^b f(t)^2\, dt\Big).$$
The previous convergence holds in distribution, and it still needs to be shown to hold in mean square. We shall omit this essential proof detail.

Exercise 4.5.2 Show that the random variable $X = \int_1^T \frac{1}{\sqrt{t}}\, dW_t$ is normally distributed with mean 0 and variance $\ln T$.

Exercise 4.5.3 Let $Y = \int_1^T \sqrt{t}\, dW_t$. Show that $Y$ is normally distributed with mean 0 and variance $(T^2 - 1)/2$.

Exercise 4.5.4 Find the distribution of the integral $\int_0^t e^{t-s}\, dW_s$.

Exercise 4.5.5 Show that $X_t = \int_0^t (2t - u)\, dW_u$ and $Y_t = \int_0^t (3t - 4u)\, dW_u$ are Gaussian processes with mean 0 and variance $\frac{7}{3}t^3$.

Exercise 4.5.6 Find all constants $a, b$ such that $X_t = \int_0^t \big(a + \frac{bu}{t}\big)\, dW_u$ is a Brownian motion process.
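
Proposition 4.5.1 can be checked numerically for a concrete integrand. A minimal sketch (not from the text; $f(t) = t$ on $[0,1]$, so the predicted law is $N(0, \int_0^1 t^2\, dt) = N(0, 1/3)$; grid and sample sizes are arbitrary):

```python
import numpy as np

# Sample the Wiener integral I(f) = int_0^1 t dW_t as a sum of independent normals.
rng = np.random.default_rng(12)
n, paths = 1_000, 20_000
t = np.linspace(0.0, 1.0, n, endpoint=False)            # left endpoints t_i
dW = rng.normal(0.0, np.sqrt(1.0 / n), size=(paths, n))
I = (t * dW).sum(axis=1)

print(I.mean(), I.var())                                # close to 0 and 1/3
print(np.quantile(I, 0.975) / np.sqrt(I.var()))         # ~1.96, as for a normal law
```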

4.6 Poisson Integration


In this section we deal with integration with respect to the compensated Poisson process $M_t = N_t - \lambda t$, which is a martingale. Consider $0 \le a < b$ and let $F_t = F(t, M_t)$ be a nonanticipating process with
$$E\Big[\int_a^b F_t^2\, dt\Big] < \infty.$$
Consider the partition
$$a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b$$
of the interval $[a,b]$, and associate the partial sums
$$S_n = \sum_{i=0}^{n-1} F_{t_i-}(M_{t_{i+1}} - M_{t_i}),$$
where $F_{t_i-}$ denotes the left-hand limit of $F$ at $t_i$. For predictability reasons, the intermediate points are the left-hand limits at the left endpoints of each interval. Since the process $F_t$ is nonanticipative, the random variables $F_{t_i-}$ and $M_{t_{i+1}} - M_{t_i}$ are independent.
The integral of $F_{t-}$ with respect to $M_t$ is the mean square limit of the partial sums $S_n$:
$$\text{ms-}\lim_{n\to\infty} S_n = \int_a^b F_{t-}\, dM_t,$$
provided the limit exists. More precisely, this convergence means that
$$\lim_{n\to\infty} E\Big[\Big(S_n - \int_a^b F_{t-}\, dM_t\Big)^2\Big] = 0.$$

4.6.1 A Worked-Out Example: the case Ft = Mt

We shall integrate the process $M_{t-}$ between 0 and $T$ with respect to $M_t$. Consider the partition points $t_k = \frac{kT}{n}$, $k = 0, 1, \dots, n$. The partial sums are given by
$$S_n = \sum_{i=0}^{n-1} M_{t_i-}(M_{t_{i+1}} - M_{t_i}).$$
Using $xy = \frac{1}{2}[(x+y)^2 - x^2 - y^2]$, by letting $x = M_{t_i-}$ and $y = M_{t_{i+1}} - M_{t_i}$, we get
$$M_{t_i-}(M_{t_{i+1}} - M_{t_i}) = \frac{1}{2}M_{t_{i+1}}^2 - \frac{1}{2}M_{t_i}^2 - \frac{1}{2}(M_{t_{i+1}} - M_{t_i})^2.$$
After pair cancelations we have
$$S_n = \frac{1}{2}\sum_{i=0}^{n-1} M_{t_{i+1}}^2 - \frac{1}{2}\sum_{i=0}^{n-1} M_{t_i}^2 - \frac{1}{2}\sum_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})^2 = \frac{1}{2}M_{t_n}^2 - \frac{1}{2}\sum_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})^2.$$
Since $t_n = T$, we get
$$S_n = \frac{1}{2}M_T^2 - \frac{1}{2}\sum_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})^2.$$
The second term on the right is the quadratic variation of $M_t$; using formula (2.8.9) yields that $S_n$ converges in mean square towards $\frac{1}{2}M_T^2 - \frac{1}{2}N_T$, since $N_0 = 0$.
Hence we have arrived at the following formula:
$$\int_0^T M_{t-}\, dM_t = \frac{1}{2}M_T^2 - \frac{1}{2}N_T.$$
Similarly, one can obtain
$$\int_a^b M_{t-}\, dM_t = \frac{1}{2}(M_b^2 - M_a^2) - \frac{1}{2}(N_b - N_a).$$

Exercise 4.6.1 Show that
(a) $E\big[\int_a^b M_{t-}\, dM_t\big] = 0$,
(b) $Var\big[\int_a^b M_{t-}\, dM_t\big] = E\big[\int_a^b M_{t-}^2\, dN_t\big]$.

The stochastic integral with respect to the compensated Poisson process $M_t$ has, in general, the following properties, whose proofs are left as an exercise to the reader.

Proposition 4.6.2 We have
1. Linearity:
$$\int_a^b (\alpha f + \beta g)\, dM_t = \alpha \int_a^b f\, dM_t + \beta \int_a^b g\, dM_t, \qquad \alpha, \beta \in \mathbb{R};$$
2. Zero mean:
$$E\Big[\int_a^b f\, dM_t\Big] = 0;$$
3. Isometry:
$$E\Big[\Big(\int_a^b f\, dM_t\Big)^2\Big] = E\Big[\int_a^b f^2\, dN_t\Big].$$

Exercise 4.6.3 Let $\omega$ be a fixed state of the world and assume the sample path $t \to N_t(\omega)$ has a jump in the interval $(a,b)$. Show that the Riemann-Stieltjes integral
$$\int_a^b N_t(\omega)\, dN_t$$
does not exist.

Exercise 4.6.4 Let $N_{t-}$ denote the left-hand limit of $N_t$. Show that $N_{t-}$ is predictable, while $N_t$ is not.

The previous exercises provide the reason why in the following we shall work with $M_{t-}$ instead of $M_t$: the integral $\int_a^b M_t\, dN_t$ might not exist, while $\int_a^b M_{t-}\, dN_t$ does exist.
a a

Exercise 4.6.5 Show that
$$\int_0^T M_{t-}\, dM_t = \frac{1}{2}(M_T^2 - N_T).$$
Exercise 4.6.6 Show that
$$\int_0^T N_{t-}\, dM_t = \frac{1}{2}(N_T^2 - N_T) - \lambda \int_0^T N_t\, dt.$$
Exercise 4.6.7 Find the variance of
$$\int_0^T N_{t-}\, dM_t.$$
Chapter 5

Stochastic Differentiation

5.1 Differentiation Rules


Most stochastic processes are not differentiable. For instance, the Brownian motion process $W_t$ is a continuous process which is nowhere differentiable. Hence, derivatives like $\frac{dW_t}{dt}$ do not make sense in stochastic calculus. The only quantities allowed to be used are the infinitesimal changes of the process, in our case $dW_t$.

The infinitesimal change of a process
The change in the process $X_t$ between instances $t$ and $t + \Delta t$ is given by $\Delta X_t = X_{t+\Delta t} - X_t$. When $\Delta t$ is infinitesimally small, we obtain the infinitesimal change of the process $X_t$:
$$dX_t = X_{t+dt} - X_t.$$
Sometimes it is useful to use the equivalent formulation $X_{t+dt} = X_t + dX_t$.

5.2 Basic Rules


The following rules are the analog of some familiar differentiation rules from elementary Cal-
culus.
The constant multiple rule
If Xt is a stochastic processes and c is a constant, then

d(c Xt ) = c dXt .

The verification follows from a straightforward application of the infinitesimal change formula

d(c Xt ) = c Xt+dt − c Xt = c(Xt+dt − Xt ) = c dXt .

The sum rule


If Xt and Yt are two stochastic processes, then

d(Xt + Yt ) = dXt + dYt .


The verification is as follows:
$$d(X_t + Y_t) = (X_{t+dt} + Y_{t+dt}) - (X_t + Y_t) = (X_{t+dt} - X_t) + (Y_{t+dt} - Y_t) = dX_t + dY_t.$$
The difference rule
If Xt and Yt are two stochastic processes, then

d(Xt − Yt ) = dXt − dYt .

The proof is similar with the one for the sum rule.
The product rule
If Xt and Yt are two stochastic processes, then

d(Xt Yt ) = Xt dYt + Yt dXt + dXt dYt .

The proof is as follows:
$$\begin{aligned}
d(X_t Y_t) &= X_{t+dt} Y_{t+dt} - X_t Y_t \\
&= X_t(Y_{t+dt} - Y_t) + Y_t(X_{t+dt} - X_t) + (X_{t+dt} - X_t)(Y_{t+dt} - Y_t) \\
&= X_t\, dY_t + Y_t\, dX_t + dX_t\, dY_t,
\end{aligned}$$
where the second identity is verified by direct computation.
If the process $X_t$ is replaced by the deterministic function $f(t)$, then the aforementioned formula becomes
$$d(f(t) Y_t) = f(t)\, dY_t + Y_t\, df(t) + df(t)\, dY_t.$$
Since in most practical cases the process $Y_t$ is an Ito diffusion,
$$dY_t = a(t, W_t)\, dt + b(t, W_t)\, dW_t,$$
using the relations $dt\, dW_t = dt^2 = 0$, the last term vanishes:
$$df(t)\, dY_t = f'(t)\, dt\, dY_t = 0,$$
and hence
$$d(f(t) Y_t) = f(t)\, dY_t + Y_t\, df(t).$$
This relation looks like the usual product rule.
The quotient rule
If $X_t$ and $Y_t$ are two stochastic processes, then
$$d\Big(\frac{X_t}{Y_t}\Big) = \frac{Y_t\, dX_t - X_t\, dY_t - dX_t\, dY_t}{Y_t^2} + \frac{X_t}{Y_t^3}(dY_t)^2.$$
The proof follows from Ito's formula and shall be postponed for the time being.
When the process $Y_t$ is replaced by the deterministic function $f(t)$, and $X_t$ is an Ito diffusion, the previous formula becomes
$$d\Big(\frac{X_t}{f(t)}\Big) = \frac{f(t)\, dX_t - X_t\, df(t)}{f(t)^2}.$$

Example 5.2.1 We shall show that
$$d(W_t^2) = 2W_t\, dW_t + dt.$$
Applying the product rule and the fundamental relation $(dW_t)^2 = dt$ yields
$$d(W_t^2) = W_t\, dW_t + W_t\, dW_t + dW_t\, dW_t = 2W_t\, dW_t + dt.$$

Example 5.2.2 Show that
$$d(W_t^3) = 3W_t^2\, dW_t + 3W_t\, dt.$$
Applying the product rule and the previous example yields
$$\begin{aligned}
d(W_t^3) = d(W_t \cdot W_t^2) &= W_t\, d(W_t^2) + W_t^2\, dW_t + d(W_t^2)\, dW_t \\
&= W_t(2W_t\, dW_t + dt) + W_t^2\, dW_t + dW_t(2W_t\, dW_t + dt) \\
&= 2W_t^2\, dW_t + W_t\, dt + W_t^2\, dW_t + 2W_t(dW_t)^2 + dt\, dW_t \\
&= 3W_t^2\, dW_t + 3W_t\, dt,
\end{aligned}$$
where we used $(dW_t)^2 = dt$ and $dt\, dW_t = 0$.
Example 5.2.3 Show that $d(tW_t) = W_t\, dt + t\, dW_t$.
Using the product rule and $dt\, dW_t = 0$, we get
$$d(tW_t) = W_t\, dt + t\, dW_t + dt\, dW_t = W_t\, dt + t\, dW_t.$$
Example 5.2.4 Let $Z_t = \int_0^t W_u\, du$ be the integrated Brownian motion. Show that
$$dZ_t = W_t\, dt.$$
The infinitesimal change of $Z_t$ is
$$dZ_t = Z_{t+dt} - Z_t = \int_t^{t+dt} W_s\, ds = W_t\, dt,$$
since $W_s$ is a continuous function in $s$.


Example 5.2.5 Let $A_t = \frac{1}{t} Z_t = \frac{1}{t}\int_0^t W_u\, du$ be the average of the Brownian motion on the time interval $[0, t]$. Show that
$$dA_t = \frac{1}{t}\Big(W_t - \frac{1}{t}Z_t\Big)\, dt.$$
We have
$$\begin{aligned}
dA_t &= d\Big(\frac{1}{t}\Big) Z_t + \frac{1}{t}\, dZ_t + d\Big(\frac{1}{t}\Big)\, dZ_t \\
&= -\frac{1}{t^2} Z_t\, dt + \frac{1}{t} W_t\, dt - \frac{1}{t^2} W_t\, \underbrace{dt^2}_{=0} \\
&= \frac{1}{t}\Big(W_t - \frac{1}{t}Z_t\Big)\, dt.
\end{aligned}$$

Exercise 5.2.1 Let $G_t = \frac{1}{t}\int_0^t e^{W_u}\, du$ be the average of the geometric Brownian motion on $[0,t]$. Find $dG_t$.

5.3 Ito’s Formula


Ito's formula is the analog of the chain rule from elementary Calculus. We shall start by reviewing a few concepts regarding function approximations.
Let $f$ be a twice differentiable function of a real variable $x$. Let $x_0$ be fixed and consider the changes $\Delta x = x - x_0$ and $\Delta f(x) = f(x) - f(x_0)$. It is known from Calculus that the following second order Taylor approximation holds:
$$\Delta f(x) = f'(x)\Delta x + \frac{1}{2}f''(x)(\Delta x)^2 + O(\Delta x)^3.$$
When $x$ is infinitesimally close to $x_0$, we replace $\Delta x$ by the differential $dx$ and obtain
$$df(x) = f'(x)\, dx + \frac{1}{2}f''(x)(dx)^2 + O(dx)^3. \tag{5.3.1}$$
In elementary Calculus, all the terms involving powers of $dx$ of order two or higher are neglected; then the aforementioned formula becomes
$$df(x) = f'(x)\, dx.$$
If we now consider $x = x(t)$ to be a differentiable function of $t$, substituting in the previous formula we obtain the differential form of the well known chain rule
$$df\big(x(t)\big) = f'\big(x(t)\big)\, dx(t) = f'\big(x(t)\big)\, x'(t)\, dt.$$
We shall work out a similar formula in the stochastic environment. In this case the deterministic function $x(t)$ is replaced by a stochastic process $X_t$. The composition between the differentiable function $f$ and the process $X_t$ is denoted by $F_t = f(X_t)$. Since the increments involving powers of $dt^2$ or higher are neglected, we may assume that the same holds true for the increment $dX_t$, i.e. $dX_t = O(dt)$. Then the expression (5.3.1) becomes
$$dF_t = f'(X_t)\, dX_t + \frac{1}{2}f''(X_t)(dX_t)^2. \tag{5.3.2}$$
In the computation of $dX_t$ we may take into account stochastic relations such as $dW_t^2 = dt$ or $dt\, dW_t = 0$.

5.3.1 Ito's formula for diffusions

The previous formula is a general form of Ito's formula. However, in most cases the increments
dXt are given by particular relations. An important case is when the increment is given by

dXt = a(Wt, t)dt + b(Wt, t)dWt.

A process Xt satisfying this relation is called an Ito diffusion.

Theorem 5.3.1 (Ito's formula for diffusions) If Xt is an Ito diffusion and Ft = f(Xt), then

dFt = [a(Wt, t)f'(Xt) + (b(Wt, t)^2/2) f''(Xt)] dt + b(Wt, t)f'(Xt) dWt.   (5.3.3)

Proof: We shall provide a formal proof. Using the relations dWt^2 = dt and dt^2 = dWt dt = 0,
we have

(dXt)^2 = (a(Wt, t)dt + b(Wt, t)dWt)^2
        = a(Wt, t)^2 dt^2 + 2a(Wt, t)b(Wt, t) dWt dt + b(Wt, t)^2 dWt^2
        = b(Wt, t)^2 dt.

Substituting in (5.3.2) yields

dFt = f'(Xt) dXt + (1/2)f''(Xt)(dXt)^2
    = f'(Xt)(a(Wt, t)dt + b(Wt, t)dWt) + (1/2)f''(Xt) b(Wt, t)^2 dt
    = [a(Wt, t)f'(Xt) + (b(Wt, t)^2/2) f''(Xt)] dt + b(Wt, t)f'(Xt) dWt.

In the case Xt = Wt we obtain the following consequence:

Corollary 5.3.2 Let Ft = f(Wt). Then

dFt = (1/2)f''(Wt) dt + f'(Wt) dWt.   (5.3.4)

Particular cases

1. If f(x) = x^α, with α constant, then f'(x) = αx^{α−1} and f''(x) = α(α − 1)x^{α−2}, and (5.3.4)
becomes the following useful formula

d(Wt^α) = (1/2)α(α − 1)Wt^{α−2} dt + αWt^{α−1} dWt.

A couple of useful cases easily follow:

d(Wt^2) = 2Wt dWt + dt

d(Wt^3) = 3Wt^2 dWt + 3Wt dt.

2. If f(x) = e^{kx}, with k constant, then f'(x) = ke^{kx}, f''(x) = k^2 e^{kx}. Therefore

d(e^{kWt}) = ke^{kWt} dWt + (1/2)k^2 e^{kWt} dt.

In particular, for k = 1 we obtain the increments of a geometric Brownian motion

d(e^{Wt}) = e^{Wt} dWt + (1/2)e^{Wt} dt.

3. If f(x) = sin x, then

d(sin Wt) = cos Wt dWt − (1/2) sin Wt dt.
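As an informal numerical check of the last identity (a sketch assuming NumPy; grid size and seed are arbitrary), the discrete Ito sum of cos Wt dWt − (1/2) sin Wt dt should reproduce sin(W_T) along a single simulated path.

```python
import numpy as np

# Pathwise check of d(sin W) = cos W dW - (1/2) sin W dt.
rng = np.random.default_rng(1)
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito_sum = np.sum(np.cos(W[:-1])*dW) - 0.5*np.sum(np.sin(W[:-1]))*dt
print(f"sin(W_T) = {np.sin(W[-1]):.5f},  Ito sum = {ito_sum:.5f}")
```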
Exercise 5.3.3 Use the previous rules to find the following increments

a. d(Wt e^{Wt})
b. d(3Wt^2 + 2e^{5Wt})
c. d(e^{t + Wt^2})
d. d((t + Wt)^n)
e. d((1/t)∫_0^t Wu du)
f. d((1/t)∫_0^t α e^{Wu} du), where α is a constant.

In the case when the function f = f(t, x) is also time dependent, the analog of (5.3.1) is
given by

df(t, x) = ∂t f(t, x)dt + ∂x f(t, x)dx + (1/2)∂x^2 f(t, x)(dx)^2 + O(dx)^3 + O(dt)^2.   (5.3.5)

Substituting x = Xt yields

df(t, Xt) = ∂t f(t, Xt)dt + ∂x f(t, Xt)dXt + (1/2)∂x^2 f(t, Xt)(dXt)^2.   (5.3.6)

If Xt is an Ito diffusion we obtain an extra term in formula (5.3.3):

dFt = [∂t f(t, Xt) + a(Wt, t)∂x f(t, Xt) + (b(Wt, t)^2/2)∂x^2 f(t, Xt)] dt
      + b(Wt, t)∂x f(t, Xt) dWt.   (5.3.7)

Exercise 5.3.4 Show that

d(tWt^2) = (1 + Wt^2)dt + 2tWt dWt.

Exercise 5.3.5 Find the following increments

(a) d(tWt)          (c) d(t^2 cos Wt)
(b) d(e^t Wt)       (d) d(sin t · Wt^2).

5.3.2 Ito's formula for Poisson processes

Consider the process Ft = F(Mt), where Mt = Nt − λt is the compensated Poisson process.
Using the formal relation (2.8.13),

dMt^2 = dNt,

Ito's formula becomes

dFt = F'(Mt)dMt + (1/2)F''(Mt)dNt,

which is equivalent to

dFt = (F'(Mt) + (1/2)F''(Mt)) dMt + (λ/2)F''(Mt)dt.

For instance, if Ft = Mt^2, then

d(Mt^2) = 2Mt dMt + dNt,

which is equivalent to the stochastic integral

∫_0^T d(Mt^2) = 2∫_0^T Mt dMt + ∫_0^T dNt,

and this yields

∫_0^T Mt− dMt = (1/2)(M_T^2 − N_T).

The left-hand limit is used for predictability reasons, see section 4.6.
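The identity can be verified numerically on a simulated Poisson path. The sketch below (an informal illustration assuming NumPy; the intensity, horizon, and seed are arbitrary) splits dMt = dNt − λdt into a sum over jumps, with the left limit Mt−, and an exactly computable Riemann integral of Mt, which is piecewise linear between jumps.

```python
import numpy as np

# Check of  int_0^T M_- dM = (M_T^2 - N_T)/2  for M_t = N_t - lam*t.
rng = np.random.default_rng(2)
lam, T = 3.0, 10.0

arrivals = np.cumsum(rng.exponential(1.0/lam, size=1000))
jumps = arrivals[arrivals <= T]                 # jump times of N_t in [0, T]
N_T = len(jumps)
M_T = N_T - lam*T

# sum over jumps: just before the (i+1)-th jump, M_{t-} = i - lam*t_i
jump_part = np.sum(np.arange(N_T) - lam*jumps)

# int_0^T M_t dt: M_t = k - lam*t is linear between jumps, integrate exactly
times = np.concatenate(([0.0], jumps, [T]))
k = np.arange(N_T + 1)                          # value of N_t on each interval
a, b = times[:-1], times[1:]
int_M_dt = np.sum(k*(b - a) - lam*(b**2 - a**2)/2.0)

lhs = jump_part - lam*int_M_dt
print(f"int M_- dM = {lhs:.6f},   (M_T^2 - N_T)/2 = {(M_T**2 - N_T)/2:.6f}")
```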

5.3.3 Ito's multidimensional formula

If the process Ft depends on several Ito diffusions, say Ft = f(t, Xt, Yt), then a formula similar
to (5.3.7) leads to

dFt = ∂f/∂t (t, Xt, Yt) dt + ∂f/∂x (t, Xt, Yt) dXt + ∂f/∂y (t, Xt, Yt) dYt
      + (1/2) ∂^2f/∂x^2 (t, Xt, Yt)(dXt)^2 + (1/2) ∂^2f/∂y^2 (t, Xt, Yt)(dYt)^2
      + ∂^2f/∂x∂y (t, Xt, Yt) dXt dYt.
Particular cases
In the case when Ft = f(Xt, Yt), with Xt = Wt^1, Yt = Wt^2 independent Brownian motions, we
have

dFt = ∂f/∂x dWt^1 + ∂f/∂y dWt^2 + (1/2) ∂^2f/∂x^2 (dWt^1)^2 + (1/2) ∂^2f/∂y^2 (dWt^2)^2
      + ∂^2f/∂x∂y dWt^1 dWt^2
    = ∂f/∂x dWt^1 + ∂f/∂y dWt^2 + (1/2)(∂^2f/∂x^2 + ∂^2f/∂y^2) dt,

where we used (dWt^1)^2 = (dWt^2)^2 = dt and dWt^1 dWt^2 = 0. The expression

∆f = (1/2)(∂^2f/∂x^2 + ∂^2f/∂y^2)

is called the Laplacian of f. We can rewrite the previous formula as

dFt = ∂f/∂x dWt^1 + ∂f/∂y dWt^2 + ∆f dt.

A function f with ∆f = 0 is called harmonic. The aforementioned formula in the case of
harmonic functions takes the very simple form

dFt = ∂f/∂x dWt^1 + ∂f/∂y dWt^2.

Exercise 5.3.6 Use the previous formulas to find dFt in the following cases
(a) Ft = (Wt^1)^2 + (Wt^2)^2
(b) Ft = ln[(Wt^1)^2 + (Wt^2)^2].

Exercise 5.3.7 Consider the Bessel process Rt = sqrt((Wt^1)^2 + (Wt^2)^2), where Wt^1 and Wt^2 are
two independent Brownian motions. Prove that

dRt = (1/(2Rt)) dt + (Wt^1/Rt) dWt^1 + (Wt^2/Rt) dWt^2.
Example 5.3.1 (The product rule) Let Xt and Yt be two processes. Show that

d(Xt Yt) = Yt dXt + Xt dYt + dXt dYt.

Consider the function f(x, y) = xy. Since ∂x f = y, ∂y f = x, ∂x^2 f = ∂y^2 f = 0, and ∂x∂y f = 1,
Ito's multidimensional formula yields

d(Xt Yt) = d(f(Xt, Yt)) = ∂x f dXt + ∂y f dYt
           + (1/2)∂x^2 f (dXt)^2 + (1/2)∂y^2 f (dYt)^2 + ∂x∂y f dXt dYt
         = Yt dXt + Xt dYt + dXt dYt.

Example 5.3.2 (The quotient rule) Let Xt and Yt be two processes. Show that

d(Xt/Yt) = (Yt dXt − Xt dYt − dXt dYt)/Yt^2 + (Xt/Yt^3)(dYt)^2.

Consider the function f(x, y) = x/y. Since ∂x f = 1/y, ∂y f = −x/y^2, ∂x^2 f = 0, ∂y^2 f = 2x/y^3,
and ∂x∂y f = −1/y^2, applying Ito's multidimensional formula yields

d(Xt/Yt) = d(f(Xt, Yt)) = ∂x f dXt + ∂y f dYt
           + (1/2)∂x^2 f (dXt)^2 + (1/2)∂y^2 f (dYt)^2 + ∂x∂y f dXt dYt
         = (1/Yt) dXt − (Xt/Yt^2) dYt + (Xt/Yt^3)(dYt)^2 − (1/Yt^2) dXt dYt
         = (Yt dXt − Xt dYt − dXt dYt)/Yt^2 + (Xt/Yt^3)(dYt)^2.
Chapter 6

Stochastic Integration Techniques

Computing a stochastic integral starting from the definition of the Ito integral is quite an
inefficient method. As in elementary Calculus, several methods can be developed to compute
stochastic integrals. In order to keep the analogy with elementary Calculus, we have called
them the Fundamental Theorem of Stochastic Calculus and Integration by Parts. Integration
by substitution is more complicated in the stochastic environment, and we have considered only
a particular case of it, which we called the method of the heat equation.

6.0.4 Fundamental Theorem of Stochastic Calculus

Consider a process Xt whose increments satisfy the equation dXt = f(t, Wt)dWt. Integrating
formally between a and t yields

∫_a^t dXs = ∫_a^t f(s, Ws) dWs.   (6.0.1)

The integral on the left side can be computed as follows. If we consider the partition
a = t0 < t1 < · · · < tn−1 < tn = t, then

∫_a^t dXs = ms-lim_{n→∞} Σ_{j=0}^{n−1} (Xt_{j+1} − Xt_j) = Xt − Xa,

since the terms cancel in pairs. Substituting in formula (6.0.1) yields Xt = Xa +
∫_a^t f(s, Ws) dWs, and hence dXt = d(∫_a^t f(s, Ws) dWs), since Xa is a constant.

The following result bears its name from the analogy with the similar result from elementary
Calculus.

Theorem 6.0.8 (The Fundamental Theorem of Stochastic Calculus) (i) For any a < t,
we have

d(∫_a^t f(s, Ws) dWs) = f(t, Wt) dWt.

(ii) If Yt is a stochastic process such that Yt dWt = dFt, then

∫_a^b Yt dWt = Fb − Fa.

We shall provide a few applications of the aforementioned theorem.

Example 6.0.3 Verify the stochastic formula

∫_0^t Ws dWs = Wt^2/2 − t/2.

Let Xt = ∫_0^t Ws dWs and Yt = Wt^2/2 − t/2. From Ito's formula

dYt = d(Wt^2/2) − d(t/2) = (1/2)(2Wt dWt + dt) − (1/2)dt = Wt dWt,

and from the Fundamental Theorem of Stochastic Calculus

dXt = d(∫_0^t Ws dWs) = Wt dWt.

Hence dXt = dYt, or d(Xt − Yt) = 0. Since the process Xt − Yt has zero increments,
Xt − Yt = c, constant. Taking t = 0 yields

c = X0 − Y0 = ∫_0^0 Ws dWs − (W0^2/2 − 0/2) = 0,

and hence c = 0. It follows that Xt = Yt, which verifies the desired relation.
Example 6.0.4 Verify the formula

∫_0^t sWs dWs = (t/2)(Wt^2 − 1) − (1/2)∫_0^t Ws^2 ds.

Consider the stochastic processes Xt = ∫_0^t sWs dWs, Yt = (t/2)(Wt^2 − 1), and
Zt = (1/2)∫_0^t Ws^2 ds. The Fundamental Theorem yields

dXt = tWt dWt
dZt = (1/2)Wt^2 dt.

Applying Ito's formula, see Exercise 5.3.4, we get

dYt = d((t/2)(Wt^2 − 1)) = (1/2)d(tWt^2) − d(t/2)
    = (1/2)[(1 + Wt^2)dt + 2tWt dWt] − (1/2)dt
    = (1/2)Wt^2 dt + tWt dWt.

We can easily see that

dXt = dYt − dZt.

This implies d(Xt − Yt + Zt) = 0, i.e. Xt − Yt + Zt = c, constant. Since X0 = Y0 = Z0 = 0, it
follows that c = 0. This proves the desired relation.
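As a quick numerical sanity check (a sketch assuming NumPy; grid size and seed are arbitrary), the two sides of the verified identity can be compared along a discretized path.

```python
import numpy as np

# Pathwise check of  int_0^t s W dW = (t/2)(W_t^2 - 1) - (1/2) int_0^t W^2 ds.
rng = np.random.default_rng(3)
t, n = 1.0, 400_000
dt = t / n
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))
s = np.linspace(0.0, t, n + 1)

lhs = np.sum(s[:-1]*W[:-1]*dW)                       # Ito sum (left endpoints)
rhs = (t/2)*(W[-1]**2 - 1) - 0.5*np.sum(W[:-1]**2)*dt
print(f"lhs = {lhs:.5f},  rhs = {rhs:.5f}")
```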

Example 6.0.5 Show that

∫_0^t (Ws^2 − s) dWs = (1/3)Wt^3 − tWt.

Consider the function f(t, x) = (1/3)x^3 − tx, and let Ft = f(t, Wt). Since ∂t f = −x,
∂x f = x^2 − t, and ∂x^2 f = 2x, Ito's formula provides

dFt = ∂t f dt + ∂x f dWt + (1/2)∂x^2 f (dWt)^2
    = −Wt dt + (Wt^2 − t) dWt + (1/2) · 2Wt dt
    = (Wt^2 − t) dWt.

From the Fundamental Theorem we get

∫_0^t (Ws^2 − s) dWs = ∫_0^t dFs = Ft − F0 = Ft = (1/3)Wt^3 − tWt.

6.0.5 Stochastic Integration by Parts

Consider the process Ft = f(t)g(Wt), with f and g differentiable. Using the product rule yields

dFt = df(t) g(Wt) + f(t) dg(Wt)
    = f'(t)g(Wt)dt + f(t)(g'(Wt)dWt + (1/2)g''(Wt)dt)
    = f'(t)g(Wt)dt + (1/2)f(t)g''(Wt)dt + f(t)g'(Wt)dWt.

Writing the relation in integral form, we obtain the first integration by parts formula:

∫_a^b f(t)g'(Wt) dWt = f(t)g(Wt)|_a^b − ∫_a^b f'(t)g(Wt) dt − (1/2)∫_a^b f(t)g''(Wt) dt.

This formula is to be used when integrating a product between a function of t and a function
of the Brownian motion Wt, for which an antiderivative is known. The following two particular
cases are important and useful in applications.

1. If g(Wt) = Wt, the aforementioned formula takes the simple form

∫_a^b f(t) dWt = f(t)Wt|_{t=a}^{t=b} − ∫_a^b f'(t)Wt dt.   (6.0.2)

It is worth noting that the left side is a Wiener integral.

2. If f(t) = 1, then the formula becomes

∫_a^b g'(Wt) dWt = g(Wt)|_{t=a}^{t=b} − (1/2)∫_a^b g''(Wt) dt.   (6.0.3)

Application 1 Consider the Wiener integral I_T = ∫_0^T t dWt. From the general theory, see
Proposition 4.5.1, it is known that I_T is a random variable normally distributed with mean 0
and variance

Var[I_T] = ∫_0^T t^2 dt = T^3/3.

Recall the definition of the integrated Brownian motion

Zt = ∫_0^t Wu du.

Formula (6.0.2) yields a relationship between I_T and the integrated Brownian motion:

I_T = ∫_0^T t dWt = T W_T − ∫_0^T Wt dt = T W_T − Z_T,

and hence I_T + Z_T = T W_T. This relation can be used to compute the covariance between I_T
and Z_T:

Var(I_T + Z_T) = Var[T W_T] ⟺
Var[I_T] + Var[Z_T] + 2Cov(I_T, Z_T) = T^2 Var[W_T] ⟺
T^3/3 + T^3/3 + 2Cov(I_T, Z_T) = T^3 ⟺
Cov(I_T, Z_T) = T^3/6,

where we used Var[Z_T] = T^3/3. The processes It and Zt are not independent. Their
correlation coefficient is 0.5, as the following calculation shows:

Corr(I_T, Z_T) = Cov(I_T, Z_T)/(Var[I_T] Var[Z_T])^{1/2} = (T^3/6)/(T^3/3) = 1/2.
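These moments can be estimated by Monte Carlo. The following sketch (an informal check, assuming NumPy; sample sizes and seed are arbitrary) also verifies the pathwise relation I_T + Z_T = T W_T, which holds up to an O(dt) discretization error.

```python
import numpy as np

# Monte Carlo check of Cov(I_T, Z_T) = T^3/6 for I_T = int t dW, Z_T = int W dt.
rng = np.random.default_rng(4)
T, n, paths = 1.0, 1000, 200_000
dt = T / n
W = np.zeros(paths); I = np.zeros(paths); Z = np.zeros(paths)
for j in range(n):
    dW = rng.normal(0.0, np.sqrt(dt), paths)
    I += (j*dt)*dW          # Wiener integral, left endpoints
    Z += W*dt               # Riemann integral of the path
    W += dW
cov = np.mean(I*Z) - I.mean()*Z.mean()
print(f"Cov(I,Z) = {cov:.4f}   (theory {T**3/6:.4f})")
print(f"Var(I)   = {I.var():.4f}   (theory {T**3/3:.4f})")
print(f"max |I + Z - T*W_T| = {np.max(np.abs(I + Z - T*W)):.1e}   (O(dt))")
```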
Application 2 If we let g(x) = x^2/2 in formula (6.0.3), we get

∫_a^b Wt dWt = Wt^2/2 |_a^b − (1/2)(b − a).

It is worth noting that letting a = 0 and b = T we retrieve a formula proved by direct methods
in a previous chapter:

∫_0^T Wt dWt = W_T^2/2 − T/2.

Next we shall deduce from (6.0.3) a formula for the stochastic integral ∫_0^T Wt^n dWt, for n a
natural number. Letting g(x) = x^{n+1}/(n + 1) in (6.0.3), so that g'(x) = x^n and
g''(x) = n x^{n−1}, we obtain

∫_0^T Wt^n dWt = (1/(n + 1)) W_T^{n+1} − (n/2) ∫_0^T Wt^{n−1} dt.

This expresses the stochastic integral of Wt^n through a Riemann integral of the lower power
Wt^{n−1}; for n ≥ 2 this Riemann integral cannot, in general, be reduced further to a function
of W_T alone. The following particular cases might be useful in applications:

∫_0^T Wt^2 dWt = (1/3) W_T^3 − ∫_0^T Wt dt = (1/3) W_T^3 − Z_T.   (6.0.4)

∫_0^T Wt^3 dWt = (1/4) W_T^4 − (3/2) ∫_0^T Wt^2 dt.   (6.0.5)

Here Z_T denotes the integrated Brownian motion. For n = 1 we recover the familiar formula
∫_0^T Wt dWt = (1/2)W_T^2 − T/2.
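Formula (6.0.4) can be checked numerically along a discretized path (an informal sketch assuming NumPy; grid size and seed are arbitrary choices).

```python
import numpy as np

# Pathwise check of (6.0.4):  int_0^T W^2 dW = W_T^3/3 - int_0^T W dt.
rng = np.random.default_rng(5)
T, n = 1.0, 400_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))

lhs = np.sum(W[:-1]**2 * dW)                 # Ito sum, left endpoints
rhs = W[-1]**3/3 - np.sum(W[:-1])*dt         # closed form with int W dt
print(f"int W^2 dW = {lhs:.5f},   W_T^3/3 - int W dt = {rhs:.5f}")
```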

Application 3
Choosing f(t) = e^{αt} and g(x) = sin x (so that g'(x) = cos x), we shall compute the stochastic
integral ∫_0^T e^{αt} cos Wt dWt using the formula of integration by parts:

∫_0^T e^{αt} cos Wt dWt = ∫_0^T e^{αt} (sin Wt)' dWt
  = e^{αt} sin Wt |_0^T − ∫_0^T (e^{αt})' sin Wt dt − (1/2)∫_0^T e^{αt} (sin Wt)'' dt
  = e^{αT} sin W_T − α∫_0^T e^{αt} sin Wt dt + (1/2)∫_0^T e^{αt} sin Wt dt
  = e^{αT} sin W_T − (α − 1/2)∫_0^T e^{αt} sin Wt dt.

The particular case α = 1/2 leads to the following exact formula for a stochastic integral:

∫_0^T e^{t/2} cos Wt dWt = e^{T/2} sin W_T.   (6.0.6)

In a similar way, we can obtain an exact formula for the stochastic integral ∫_0^T e^{βt} sin Wt dWt
as follows:

∫_0^T e^{βt} sin Wt dWt = −∫_0^T e^{βt} (cos Wt)' dWt
  = −e^{βt} cos Wt |_0^T + β∫_0^T e^{βt} cos Wt dt − (1/2)∫_0^T e^{βt} cos Wt dt.

Taking β = 1/2 yields the closed form formula

∫_0^T e^{t/2} sin Wt dWt = 1 − e^{T/2} cos W_T.   (6.0.7)

A consequence of the last two formulas and of Euler's formula

e^{iWt} = cos Wt + i sin Wt

is

∫_0^T e^{t/2 + iWt} dWt = i(1 − e^{T/2 + iW_T}).

The proof details are left to the reader.


A general form of the integration by parts formula
In general, if Xt and Yt are two Ito diffusions, from the product formula

d(Xt Yt ) = Xt dYt + Yt dXt + dXt dYt .

Integrating between the limits a and b


Z b Z b Z b Z b
d(Xt Yt ) = Xt dYt + Yt dXt + dXt dYt .
a a a a
87

From the Fundamental Theorem


Z b
d(Xt Yt ) = Xb Yb − Xa Ya ,
a

so the previous formula takes the following form of integration by parts


Z b Z b Z b
Xt dYt = Xb Yb − Xa Ya − Yt dXt − dXt dYt .
a a a

This formula is of theoretical value. In practice, the term dXt dYt needs to be computed using
the rules Wt2 = dt, and dt dWt = 0.

Exercise 6.0.9 (a) Use integration by parts to get

∫_0^T (1/(1 + Wt^2)) dWt = tan^{−1}(W_T) + ∫_0^T (Wt/(1 + Wt^2)^2) dt, T > 0.

(b) Show that

E[tan^{−1}(W_T)] = −E[∫_0^T (Wt/(1 + Wt^2)^2) dt].

(c) Prove the double inequality

−3√3/16 ≤ x/(1 + x^2)^2 ≤ 3√3/16, ∀x ∈ R.

(d) Use part (c) to obtain

−(3√3/16) T ≤ ∫_0^T (Wt/(1 + Wt^2)^2) dt ≤ (3√3/16) T.

(e) Use part (d) to get

−(3√3/16) T ≤ E[tan^{−1}(W_T)] ≤ (3√3/16) T.

(f) Does part (e) contradict the inequality

−π/2 < tan^{−1}(W_T) < π/2 ?
Exercise 6.0.10 (a) Show the relation

∫_0^T e^{Wt} dWt = e^{W_T} − 1 − (1/2)∫_0^T e^{Wt} dt.

(b) Use part (a) to find E[e^{Wt}].

Exercise 6.0.11 (a) Use integration by parts to show

∫_0^T Wt e^{Wt} dWt = 1 + W_T e^{W_T} − e^{W_T} − (1/2)∫_0^T e^{Wt}(1 + Wt) dt.

(b) Use part (a) to find E[Wt e^{Wt}].

Exercise 6.0.12 (a) Let T > 0. Show the following relation using integration by parts:

∫_0^T (2Wt/(1 + Wt^2)) dWt = ln(1 + W_T^2) − ∫_0^T ((1 − Wt^2)/(1 + Wt^2)^2) dt.

(b) Show that for any real number x the following double inequality holds:

−1/8 ≤ (1 − x^2)/(1 + x^2)^2 ≤ 1.

(c) Use part (b) to show that

−T/8 ≤ ∫_0^T ((1 − Wt^2)/(1 + Wt^2)^2) dt ≤ T.

(d) Use parts (a) and (c) to get

−T/8 ≤ E[ln(1 + W_T^2)] ≤ T.

(e) Use Jensen's inequality to get

E[ln(1 + W_T^2)] ≤ ln(1 + T).

Does this contradict the upper bound provided in (d)?

6.0.6 The Heat Equation Method

In elementary Calculus, integration by substitution is the inverse application of the chain
rule. In the stochastic environment, this will be the inverse application of Ito's formula. This
is difficult to apply in general, but there is a particular case of great importance.

Let ϕ(t, x) be a solution of the equation

∂t ϕ + (1/2)∂x^2 ϕ = 0.   (6.0.8)

This is called the heat equation without sources. The non-homogeneous equation

∂t ϕ + (1/2)∂x^2 ϕ = G(t, x)   (6.0.9)

is called the heat equation with sources. The function G(t, x) represents the density of heat
sources, while the function ϕ(t, x) is the temperature at point x at time t in a one-dimensional
wire. If the heat source is time independent, then G = G(x), i.e. G is a function of x only.

Example 6.0.6 Find all solutions of the equation (6.0.8) of the type ϕ(t, x) = a(t) + b(x).

Substituting into equation (6.0.8) yields

(1/2)b''(x) = −a'(t).

Since the left side is a function of x only, while the right side is a function of the variable t,
the only case when the previous equation is satisfied is when both sides are equal to the same
constant C. This is called a separation constant. Therefore a(t) and b(x) satisfy the equations

a'(t) = −C,  (1/2)b''(x) = C.

Integrating yields a(t) = −Ct + C0 and b(x) = Cx^2 + C1 x + C2. It follows that

ϕ(t, x) = C(x^2 − t) + C1 x + C3,

with C, C1, C3 arbitrary constants (C3 = C0 + C2).

Example 6.0.7 Find all solutions of the equation (6.0.8) of the type ϕ(t, x) = a(t)b(x).

Substituting in the equation and dividing by a(t)b(x) yields

a'(t)/a(t) + (1/2) b''(x)/b(x) = 0.

There is a separation constant C such that a'(t)/a(t) = −C and b''(x)/b(x) = 2C. There are
three distinct cases to discuss:

1. C = 0. In this case a(t) = a0 and b(x) = b1 x + b0, with a0, b0, b1 real constants. Then

ϕ(t, x) = a(t)b(x) = c1 x + c0, c0, c1 ∈ R

is just a linear function in x.

2. C > 0. Let λ > 0 be such that 2C = λ^2. Then a'(t) = −(λ^2/2)a(t) and b''(x) = λ^2 b(x),
with solutions

a(t) = a0 e^{−λ^2 t/2}
b(x) = c1 e^{λx} + c2 e^{−λx}.

The general solution of (6.0.8) is

ϕ(t, x) = e^{−λ^2 t/2}(c1 e^{λx} + c2 e^{−λx}), c1, c2 ∈ R.

3. C < 0. Let λ > 0 be such that 2C = −λ^2. Then a'(t) = (λ^2/2)a(t) and b''(x) = −λ^2 b(x).
Solving yields

a(t) = a0 e^{λ^2 t/2}
b(x) = c1 sin(λx) + c2 cos(λx).

The general solution of (6.0.8) in this case is

ϕ(t, x) = e^{λ^2 t/2}(c1 sin(λx) + c2 cos(λx)), c1, c2 ∈ R.

In particular, the functions x, x^2 − t, e^{x − t/2}, e^{−x − t/2}, e^{t/2} sin x and e^{t/2} cos x, or any linear
combination of them, are solutions of the heat equation (6.0.8). However, there are other
solutions which are not of the previous type.
Exercise 6.0.13 Prove that ϕ(t, x) = (1/3)x^3 − tx is a solution of the heat equation (6.0.8).

Exercise 6.0.14 Show that ϕ(t, x) = t^{−1/2} e^{−x^2/(2t)} is a solution of the heat equation (6.0.8)
for t > 0.

Theorem 6.0.15 Let ϕ(t, x) be a solution of the heat equation (6.0.8) and denote
f(t, x) = ∂x ϕ(t, x). Then

∫_a^b f(t, Wt) dWt = ϕ(b, Wb) − ϕ(a, Wa).

Proof: Let Ft = ϕ(t, Wt). Applying Ito's formula we get

dFt = ∂x ϕ(t, Wt) dWt + (∂t ϕ + (1/2)∂x^2 ϕ) dt.

Since ∂t ϕ + (1/2)∂x^2 ϕ = 0 and ∂x ϕ(t, Wt) = f(t, Wt), we have

dFt = f(t, Wt) dWt.

Applying the Fundamental Theorem yields

∫_a^b f(t, Wt) dWt = ∫_a^b dFt = Fb − Fa = ϕ(b, Wb) − ϕ(a, Wa).

Application 6.0.16 Show that

∫_0^T Wt dWt = (1/2)W_T^2 − (1/2)T.

Choose the solution of the heat equation (6.0.8) given by ϕ(t, x) = x^2 − t. Then
f(t, x) = ∂x ϕ(t, x) = 2x. Theorem 6.0.15 yields

∫_0^T 2Wt dWt = ∫_0^T f(t, Wt) dWt = ϕ(t, Wt)|_0^T = W_T^2 − T.

Dividing by 2 leads to the desired result.

Application 6.0.17 Show that

∫_0^T (Wt^2 − t) dWt = (1/3)W_T^3 − T W_T.

Consider the function ϕ(t, x) = (1/3)x^3 − tx, which is a solution of the heat equation (6.0.8),
see Exercise 6.0.13. Then f(t, x) = ∂x ϕ(t, x) = x^2 − t. Applying Theorem 6.0.15 yields

∫_0^T (Wt^2 − t) dWt = ∫_0^T f(t, Wt) dWt = ϕ(t, Wt)|_0^T = (1/3)W_T^3 − T W_T.

Application 6.0.18 Let λ > 0. Prove the identities

∫_0^T e^{−λ^2 t/2 ± λWt} dWt = (1/(±λ))(e^{−λ^2 T/2 ± λW_T} − 1).

Consider the function ϕ(t, x) = e^{−λ^2 t/2 ± λx}, which is a solution of the homogeneous heat
equation (6.0.8), see Example 6.0.7. Then f(t, x) = ∂x ϕ(t, x) = ±λ e^{−λ^2 t/2 ± λx}. Apply
Theorem 6.0.15 to get

∫_0^T ±λ e^{−λ^2 t/2 ± λWt} dWt = ∫_0^T f(t, Wt) dWt = ϕ(t, Wt)|_0^T = e^{−λ^2 T/2 ± λW_T} − 1.

Dividing by the constant ±λ ends the proof.
In particular, for λ = 1 the aforementioned formula becomes

∫_0^T e^{−t/2 + Wt} dWt = e^{−T/2 + W_T} − 1.   (6.0.10)
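Formula (6.0.10) lends itself to a direct pathwise check (an informal sketch assuming NumPy; grid size and seed are arbitrary).

```python
import numpy as np

# Pathwise check of (6.0.10):  int_0^T e^{-t/2 + W_t} dW = e^{-T/2 + W_T} - 1.
rng = np.random.default_rng(6)
T, n = 1.0, 400_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))
t = np.linspace(0.0, T, n + 1)

lhs = np.sum(np.exp(-t[:-1]/2 + W[:-1]) * dW)   # Ito sum, left endpoints
rhs = np.exp(-T/2 + W[-1]) - 1
print(f"Ito sum = {lhs:.5f},   closed form = {rhs:.5f}")
```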

Application 6.0.19 Let λ > 0. Prove the identity

∫_0^T e^{λ^2 t/2} cos(λWt) dWt = (1/λ) e^{λ^2 T/2} sin(λW_T).

From Example 6.0.7 we know that ϕ(t, x) = e^{λ^2 t/2} sin(λx) is a solution of the heat equation.
Applying Theorem 6.0.15 to the function f(t, x) = ∂x ϕ(t, x) = λ e^{λ^2 t/2} cos(λx) yields

∫_0^T λ e^{λ^2 t/2} cos(λWt) dWt = ∫_0^T f(t, Wt) dWt = ϕ(t, Wt)|_0^T = e^{λ^2 T/2} sin(λW_T).

Dividing by λ ends the proof.
If we choose λ = 1 we recover a result already familiar to the reader from section 6.0.5:

∫_0^T e^{t/2} cos(Wt) dWt = e^{T/2} sin W_T.   (6.0.11)

Application 6.0.20 Let λ > 0. Show that

∫_0^T e^{λ^2 t/2} sin(λWt) dWt = (1/λ)(1 − e^{λ^2 T/2} cos(λW_T)).

Choose ϕ(t, x) = e^{λ^2 t/2} cos(λx), a solution of the heat equation. Apply Theorem 6.0.15 for
the function f(t, x) = ∂x ϕ(t, x) = −λ e^{λ^2 t/2} sin(λx) to get

∫_0^T (−λ) e^{λ^2 t/2} sin(λWt) dWt = ϕ(t, Wt)|_0^T = e^{λ^2 T/2} cos(λW_T) − 1,

and then divide by −λ.

Application 6.0.21 Let 0 < a < b. Show that

∫_a^b t^{−3/2} Wt e^{−Wt^2/(2t)} dWt = a^{−1/2} e^{−Wa^2/(2a)} − b^{−1/2} e^{−Wb^2/(2b)}.   (6.0.12)

From Exercise 6.0.14 we have that ϕ(t, x) = t^{−1/2} e^{−x^2/(2t)} is a solution of the homogeneous
heat equation. Since f(t, x) = ∂x ϕ(t, x) = −t^{−3/2} x e^{−x^2/(2t)}, applying Theorem 6.0.15 leads
to the desired result. The reader can easily fill in the details.

Integration techniques will be used when solving stochastic differential equations in the next
chapter.
Exercise 6.0.22 Find the value of the stochastic integrals

(a) ∫_0^1 e^t cos(√2 Wt) dWt
(b) ∫_0^3 e^{2t} cos(2Wt) dWt
(c) ∫_0^4 e^{−t + √2 Wt} dWt.

Exercise 6.0.23 Let ϕ(t, x) be a solution of the following non-homogeneous heat equation with
time-dependent and uniform heat source G(t):

∂t ϕ + (1/2)∂x^2 ϕ = G(t).

Denote f(t, x) = ∂x ϕ(t, x). Show that

∫_a^b f(t, Wt) dWt = ϕ(b, Wb) − ϕ(a, Wa) − ∫_a^b G(t) dt.

How does the formula change if the heat source G is constant?


Chapter 7

Stochastic Differential Equations

7.1 Definitions and Examples

Let Xt be a continuous stochastic process. If small changes in the process Xt can be written
as a linear combination of small changes in t and small increments of the Brownian motion Wt,
we may write

dXt = a(t, Wt, Xt)dt + b(t, Wt, Xt) dWt   (7.1.1)

and call it a stochastic differential equation. In fact, this differential relation has the following
integral meaning:

Xt = X0 + ∫_0^t a(s, Ws, Xs) ds + ∫_0^t b(s, Ws, Xs) dWs,   (7.1.2)

where the last integral is taken in the Ito sense. Relation (7.1.2) is taken as the definition of the
stochastic differential equation (7.1.1), so the differential notation (7.1.1) is purely formal.
However, since it is convenient to use stochastic differentials informally, we shall
approach stochastic differential equations by analogy with ordinary differential equations,
and try to present the same methods of solving equations in the new stochastic environment.
The functions a(t, Wt, Xt) and b(t, Wt, Xt) are called the drift rate and the volatility. A process
Xt is called a solution of the stochastic equation (7.1.1) if it satisfies the equation. In the
following we shall start with an example.

Example 7.1.1 (The Brownian Bridge) Let a, b ∈ R. Show that the process

Xt = a(1 − t) + bt + (1 − t)∫_0^t (1/(1 − s)) dWs, 0 ≤ t < 1

is a solution of the stochastic differential equation

dXt = ((b − Xt)/(1 − t)) dt + dWt, 0 ≤ t < 1, X0 = a.

We shall perform a routine verification to show that Xt is a solution. First we compute the
quotient (b − Xt)/(1 − t):

b − Xt = b − a(1 − t) − bt − (1 − t)∫_0^t (1/(1 − s)) dWs
       = (b − a)(1 − t) − (1 − t)∫_0^t (1/(1 − s)) dWs,

and dividing by 1 − t yields

(b − Xt)/(1 − t) = b − a − ∫_0^t (1/(1 − s)) dWs.   (7.1.3)

Using

d(∫_0^t (1/(1 − s)) dWs) = (1/(1 − t)) dWt,

the product rule yields

dXt = a d(1 − t) + b dt + d(1 − t)∫_0^t (1/(1 − s)) dWs + (1 − t) d(∫_0^t (1/(1 − s)) dWs)
    = (b − a − ∫_0^t (1/(1 − s)) dWs) dt + dWt
    = ((b − Xt)/(1 − t)) dt + dWt,

where the last identity comes from (7.1.3). We have just verified that the process Xt is a
solution of the given stochastic equation. The question of how this solution was obtained in the
first place is the subject of study of the next few sections.

7.2 Finding Mean and Variance

For most practical purposes, the most important information one needs to know about a process
is its mean and variance. In some particular cases these can be found directly from the stochastic
equation, without solving it explicitly. We shall deal with this problem in the present section.

Taking the expectation in (7.1.2) and using that the Ito integral has zero mean yields

E[Xt] = X0 + ∫_0^t E[a(s, Ws, Xs)] ds.   (7.2.4)

Applying the Fundamental Theorem of Calculus we obtain

(d/dt)E[Xt] = E[a(t, Wt, Xt)].

We note that Xt is not differentiable, but its expectation E[Xt] is. This equation can be solved
exactly in a few particular cases.

1. If a(t, Wt, Xt) = a(t), then (d/dt)E[Xt] = a(t), with the exact solution
E[Xt] = X0 + ∫_0^t a(s) ds.

2. If a(t, Wt, Xt) = α(t)Xt + β(t), with α(t) and β(t) continuous deterministic functions, then

(d/dt)E[Xt] = α(t)E[Xt] + β(t),

which is a linear differential equation in E[Xt]. Its solution is given by

E[Xt] = e^{A(t)}(X0 + ∫_0^t e^{−A(s)} β(s) ds),   (7.2.5)

where A(t) = ∫_0^t α(s) ds. It is worth noting that the expectation E[Xt] does not depend on the
volatility term b(t, Wt, Xt).

Example 7.2.1 If dXt = (2Xt + e^{2t})dt + b(t, Wt, Xt)dWt, then

E[Xt] = e^{2t}(X0 + t).
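The claim that the mean is insensitive to the volatility can be illustrated by simulation. The sketch below (an informal Euler–Maruyama check assuming NumPy; the choice b = 1 and all sample sizes are arbitrary) estimates E[X_T] for Example 7.2.1.

```python
import numpy as np

# Monte Carlo check of Example 7.2.1: for dX = (2X + e^{2t})dt + b dW the mean
# E[X_t] = e^{2t}(X0 + t) does not depend on b; here b = 1 is an arbitrary choice.
rng = np.random.default_rng(7)
T, n, paths, X0 = 1.0, 1000, 100_000, 1.0
dt = T / n
X = np.full(paths, X0)
for j in range(n):
    t = j*dt
    X = X + (2*X + np.exp(2*t))*dt + 1.0*rng.normal(0.0, np.sqrt(dt), paths)
print(f"MC mean = {X.mean():.4f},   theory = {np.exp(2*T)*(X0 + T):.4f}")
```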

For general drift rates we cannot find the mean, but in the case of concave drift rates we
can find an upper bound for the expectation E[Xt]. The following result will be useful in the
sequel.

Lemma 7.2.1 (Gronwall's inequality) Let f(t) be a non-negative function satisfying the
inequality

f(t) ≤ C + M ∫_0^t f(s) ds

for 0 ≤ t ≤ T, with C, M constants. Then

f(t) ≤ C e^{Mt}, 0 ≤ t ≤ T.

Proposition 7.2.2 Let Xt be a continuous stochastic process such that

dXt = a(Xt)dt + b(t, Wt, Xt) dWt,

with the function a(·) satisfying the following conditions:
1. a(x) ≥ 0, for 0 ≤ x ≤ T;
2. a''(x) < 0, for 0 ≤ x ≤ T;
3. a'(0) = M.
Then E[Xt] ≤ X0 e^{Mt}, for 0 ≤ Xt ≤ T.

Proof: From the mean value theorem there is ξ ∈ (0, x) such that

a(x) = a(x) − a(0) = (x − 0)a'(ξ) ≤ x a'(0) = M x,   (7.2.6)

where we used that a'(x) is a decreasing function. Applying Jensen's inequality for concave
functions yields

E[a(Xt)] ≤ a(E[Xt]).

Combining with (7.2.6) we obtain E[a(Xt)] ≤ M E[Xt]. Substituting in the identity (7.2.4)
implies

E[Xt] ≤ X0 + M ∫_0^t E[Xs] ds.

Applying Gronwall's inequality we obtain E[Xt] ≤ X0 e^{Mt}.

Exercise 7.2.3 State the previous result in the particular case when a(x) = sin x, with
0 ≤ x ≤ π.

Proposition 7.2.4 Let Xt be a process satisfying the stochastic equation

dXt = α(t)Xt dt + b(t)dWt.

Then the mean and variance of Xt are given by

E[Xt] = e^{A(t)} X0
Var[Xt] = e^{2A(t)} ∫_0^t e^{−2A(s)} b^2(s) ds,

where A(t) = ∫_0^t α(s) ds.

Proof: The expression of E[Xt] follows directly from formula (7.2.5) with β = 0. In order to
compute the second moment we first compute

(dXt)^2 = b^2(t) dt;

d(Xt^2) = 2Xt dXt + (dXt)^2
        = 2Xt(α(t)Xt dt + b(t)dWt) + b^2(t)dt
        = (2α(t)Xt^2 + b^2(t)) dt + 2b(t)Xt dWt,

where we used Ito's formula. If we let Yt = Xt^2, the previous equation becomes

dYt = (2α(t)Yt + b^2(t)) dt + 2b(t)√Yt dWt.

Applying formula (7.2.5) with α(t) replaced by 2α(t) and β(t) by b^2(t) yields

E[Yt] = e^{2A(t)}(Y0 + ∫_0^t e^{−2A(s)} b^2(s) ds),

which is equivalent to

E[Xt^2] = e^{2A(t)}(X0^2 + ∫_0^t e^{−2A(s)} b^2(s) ds).

It follows that the variance is

Var[Xt] = E[Xt^2] − (E[Xt])^2 = e^{2A(t)} ∫_0^t e^{−2A(s)} b^2(s) ds.

Remark 7.2.5 We note that the previous equation is of linear type. This shall be solved
explicitly in a future section.

The mean and variance of a given stochastic process can be computed by working out the
associated stochastic equation. We shall next provide a few examples.

Example 7.2.2 Find the mean and variance of e^{kWt}, with k constant.

From Ito's formula

d(e^{kWt}) = k e^{kWt} dWt + (1/2)k^2 e^{kWt} dt,

and integrating yields

e^{kWt} = 1 + k∫_0^t e^{kWs} dWs + (1/2)k^2 ∫_0^t e^{kWs} ds.

Taking expectations we have

E[e^{kWt}] = 1 + (1/2)k^2 ∫_0^t E[e^{kWs}] ds.

If we let f(t) = E[e^{kWt}], then differentiating the previous relation yields the differential
equation

f'(t) = (1/2)k^2 f(t)

with the initial condition f(0) = E[e^{kW0}] = 1. The solution is f(t) = e^{k^2 t/2}, and hence

E[e^{kWt}] = e^{k^2 t/2}.

The variance is

Var[e^{kWt}] = E[e^{2kWt}] − (E[e^{kWt}])^2 = e^{2k^2 t} − e^{k^2 t} = e^{k^2 t}(e^{k^2 t} − 1).
Example 7.2.3 Find the mean of the process Wt e^{Wt}.

We shall set up a stochastic differential equation for Wt e^{Wt}. Using the product formula and
Ito's formula yields

d(Wt e^{Wt}) = e^{Wt} dWt + Wt d(e^{Wt}) + dWt d(e^{Wt})
             = e^{Wt} dWt + (Wt + dWt)(e^{Wt} dWt + (1/2)e^{Wt} dt)
             = ((1/2)Wt e^{Wt} + e^{Wt})dt + (e^{Wt} + Wt e^{Wt})dWt.

Integrating and using W0 e^{W0} = 0 yields

Wt e^{Wt} = ∫_0^t ((1/2)Ws e^{Ws} + e^{Ws}) ds + ∫_0^t (e^{Ws} + Ws e^{Ws}) dWs.

Since the expectation of an Ito integral is zero, we have

E[Wt e^{Wt}] = ∫_0^t ((1/2)E[Ws e^{Ws}] + E[e^{Ws}]) ds.

Let f(t) = E[Wt e^{Wt}]. Using E[e^{Ws}] = e^{s/2}, the previous integral equation becomes

f(t) = ∫_0^t ((1/2)f(s) + e^{s/2}) ds.

Differentiating yields the linear differential equation

f'(t) = (1/2)f(t) + e^{t/2}

with the initial condition f(0) = 0. Multiplying by e^{−t/2} yields the exact equation

(e^{−t/2} f(t))' = 1.

The solution is f(t) = t e^{t/2}. Hence we have obtained

E[Wt e^{Wt}] = t e^{t/2}.
Exercise 7.2.6 Find E[Wt^2 e^{Wt}] and E[Wt e^{kWt}].
Example 7.2.4 Show that for any integer k ≥ 0 we have

E[Wt^{2k}] = ((2k)!/(2^k k!)) t^k,  E[Wt^{2k+1}] = 0.

In particular, E[Wt^4] = 3t^2, E[Wt^6] = 15t^3.

From Ito's formula we have

d(Wt^n) = nWt^{n−1} dWt + (n(n − 1)/2) Wt^{n−2} dt.

Integrating, we get

Wt^n = n∫_0^t Ws^{n−1} dWs + (n(n − 1)/2)∫_0^t Ws^{n−2} ds.

Since the expectation of the first integral on the right side is zero, taking the expectation yields
the following recursive relation

E[Wt^n] = (n(n − 1)/2)∫_0^t E[Ws^{n−2}] ds.

Using the initial values E[Wt] = 0 and E[Wt^2] = t, the method of mathematical induction
implies that E[Wt^{2k+1}] = 0 and E[Wt^{2k}] = ((2k)!/(2^k k!)) t^k.
Exercise 7.2.7 Find E[sin Wt].

From Ito's formula

d(sin Wt) = cos Wt dWt − (1/2) sin Wt dt,

and integrating yields

sin Wt = ∫_0^t cos Ws dWs − (1/2)∫_0^t sin Ws ds.

Taking expectations we arrive at the integral equation

E[sin Wt] = −(1/2)∫_0^t E[sin Ws] ds.

Let f(t) = E[sin Wt]. Differentiating yields the equation f'(t) = −(1/2)f(t) with
f(0) = E[sin W0] = 0. The unique solution is f(t) = 0. Hence

E[sin Wt] = 0.

Exercise 7.2.8 Let σ be a constant. Show that

(a) E[sin(σWt)] = 0;
(b) E[cos(σWt)] = e^{−σ^2 t/2};
(c) E[sin(t + σWt)] = e^{−σ^2 t/2} sin t;
(d) E[cos(t + σWt)] = e^{−σ^2 t/2} cos t.

Exercise 7.2.9 Use the previous exercise and the definition of expectation to show that

(a) ∫_{−∞}^{∞} e^{−x^2} cos x dx = π^{1/2}/e^{1/4};
(b) ∫_{−∞}^{∞} e^{−x^2/2} cos x dx = sqrt(2π/e).

Exercise 7.2.10 Find the expectations

(a) E[Wt^3 e^{Wt}]  (b) E[Wt^2 e^{W_1 − 1/2}].

Not in all cases can the mean and the variance be obtained directly from the stochastic
equation. In these cases we need more powerful methods that produce closed form solutions.
In the next sections we shall discuss several methods of solving stochastic differential equations.

7.3 The Integration Technique

We shall start with the simple case when both the drift and the volatility are functions of
time t only.

Proposition 7.3.1 The solution Xt of the stochastic differential equation

dXt = a(t)dt + b(t)dWt

is Gaussian distributed with mean X0 + ∫_0^t a(s) ds and variance ∫_0^t b^2(s) ds.

Proof: Integrating the equation yields

Xt − X0 = ∫_0^t dXs = ∫_0^t a(s) ds + ∫_0^t b(s) dWs.

Using the properties of Wiener integrals, ∫_0^t b(s) dWs is Gaussian distributed with mean 0 and
variance ∫_0^t b^2(s) ds. Then Xt is Gaussian (as a sum between a predictable function and a
Gaussian), with

E[Xt] = E[X0 + ∫_0^t a(s) ds + ∫_0^t b(s) dWs]
      = X0 + ∫_0^t a(s) ds + E[∫_0^t b(s) dWs]
      = X0 + ∫_0^t a(s) ds,

Var[Xt] = Var[X0 + ∫_0^t a(s) ds + ∫_0^t b(s) dWs]
        = Var[∫_0^t b(s) dWs]
        = ∫_0^t b^2(s) ds,

which ends the proof.
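As an illustration for part (a) of the exercise that follows (a hedged simulation sketch assuming NumPy; parameters, sample size, and seed are arbitrary), the proposition predicts E[X_t] = 1 + sin t and Var[X_t] = ∫_0^t sin^2 s ds = t/2 − sin(2t)/4 for dX = cos t dt − sin t dW with X0 = 1.

```python
import numpy as np

# Monte Carlo illustration of Proposition 7.3.1 for dX = cos t dt - sin t dW, X0 = 1.
rng = np.random.default_rng(9)
t, n, paths = 2.0, 500, 100_000
dt = t / n
X = np.full(paths, 1.0)
for j in range(n):
    s = j*dt
    X += np.cos(s)*dt - np.sin(s)*rng.normal(0.0, np.sqrt(dt), paths)
print(f"mean: MC = {X.mean():.4f},  theory = {1 + np.sin(t):.4f}")
print(f"var : MC = {X.var():.4f},  theory = {t/2 - np.sin(2*t)/4:.4f}")
```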

Exercise 7.3.2 Solve the following stochastic differential equations for t ≥ 0 and determine
the mean and the variance of the solution
(a) dXt = cos t dt − sin t dWt, X0 = 1.
(b) dXt = e^t dt + t dWt, X0 = 0.
(c) dXt = (t/(1 + t^2)) dt + t^{3/2} dWt, X0 = 1.

If the drift and the volatility depend on both variables t and Wt, the stochastic differential
equation

dXt = a(t, Wt)dt + b(t, Wt)dWt, t ≥ 0

defines an Ito diffusion. Integrating yields the solution

Xt = X0 + ∫_0^t a(s, Ws) ds + ∫_0^t b(s, Ws) dWs.

There are several cases when both integrals can be computed explicitly.

Example 7.3.1 Find the solution of the stochastic differential equation

dXt = dt + Wt dWt, X0 = 1.

Integrating between 0 and t we get

Xt = 1 + ∫_0^t ds + ∫_0^t Ws dWs = 1 + t + Wt^2/2 − t/2
   = 1 + (1/2)(Wt^2 + t).
Example 7.3.2 Solve the stochastic differential equation

dXt = (Wt − 1)dt + Wt^2 dWt, X0 = 0.

Let Zt = ∫_0^t Ws ds denote the integrated Brownian motion process. Integrating the equation
between 0 and t and using (6.0.4) yields

Xt = ∫_0^t dXs = ∫_0^t (Ws − 1) ds + ∫_0^t Ws^2 dWs
   = Zt − t + (1/3)Wt^3 − Zt
   = (1/3)Wt^3 − t.
Example 7.3.3 Solve the stochastic differential equation

dXt = t^2 dt + e^{t/2} cos Wt dWt, X0 = 0,

and find E[Xt] and Var[Xt].

Integrating yields

Xt = ∫_0^t s^2 ds + ∫_0^t e^{s/2} cos Ws dWs
   = t^3/3 + e^{t/2} sin Wt,   (7.3.7)

where we used (6.0.11). Even if the process Xt is not Gaussian, we can still compute its mean
and variance. By Ito's formula we have

d(sin Wt) = cos Wt dWt − (1/2) sin Wt dt.

Integrating between 0 and t yields

sin Wt = ∫_0^t cos Ws dWs − (1/2)∫_0^t sin Ws ds,

where we used that sin W0 = sin 0 = 0. Taking the expectation in the previous relation yields

E[sin Wt] = E[∫_0^t cos Ws dWs] − (1/2)∫_0^t E[sin Ws] ds.

From the properties of the Ito integral, the first expectation on the right side is zero. Denoting
µ(t) = E[sin Wt], we obtain the integral equation

µ(t) = −(1/2)∫_0^t µ(s) ds.

Differentiating yields the differential equation

µ'(t) = −(1/2)µ(t)

with the solution µ(t) = ke^{−t/2}. Since k = µ(0) = E[sin W0] = 0, it follows that µ(t) = 0.
Hence

E[sin Wt] = 0.

Taking the expectation in (7.3.7) leads to

E[Xt] = E[t^3/3 + e^{t/2} sin Wt] = t^3/3 + e^{t/2} E[sin Wt] = t^3/3.

Since the variance of predictable functions is zero,

Var[Xt] = Var[t^3/3 + e^{t/2} sin Wt] = (e^{t/2})^2 Var[sin Wt]
        = e^t E[sin^2 Wt] = (e^t/2)(1 − E[cos 2Wt]).   (7.3.8)

In order to compute the last expectation we use Ito's formula

d(cos 2Wt) = −2 sin 2Wt dWt − 2 cos 2Wt dt

and integrate to get

cos 2Wt = cos 2W0 − 2∫_0^t sin 2Ws dWs − 2∫_0^t cos 2Ws ds.

Taking the expectation and using that Ito integrals have zero expectation yields

E[cos 2Wt] = 1 − 2∫_0^t E[cos 2Ws] ds.

If we denote m(t) = E[cos 2Wt], the previous relation becomes the integral equation

m(t) = 1 − 2∫_0^t m(s) ds.

Differentiating, we get

m'(t) = −2m(t),

with the solution m(t) = ke^{−2t}. Since k = m(0) = E[cos 2W0] = 1, we have m(t) = e^{−2t}.
Substituting in (7.3.8) yields

Var[Xt] = (e^t/2)(1 − e^{−2t}) = (e^t − e^{−t})/2 = sinh t.

In conclusion, the solution Xt has the mean and the variance given by

E[Xt] = t^3/3,  Var[Xt] = sinh t.

Example 7.3.4 Solve the following stochastic differential equation

e^{t/2} dXt = dt + e^{Wt} dWt, X0 = 0,

and then find the distribution of the solution Xt and its mean and variance.

Dividing by e^{t/2}, integrating between 0 and t, and using formula (6.0.10) yields

Xt = ∫_0^t e^{−s/2} ds + ∫_0^t e^{−s/2 + Ws} dWs
   = 2(1 − e^{−t/2}) + e^{−t/2 + Wt} − 1
   = 1 + e^{−t/2}(e^{Wt} − 2).

Since e^{Wt} is a geometric Brownian motion, using Proposition 2.2.2 yields

E[Xt] = E[1 + e^{−t/2}(e^{Wt} − 2)] = 1 − 2e^{−t/2} + e^{−t/2} E[e^{Wt}] = 2 − 2e^{−t/2}.
Var[Xt] = Var[1 + e^{−t/2}(e^{Wt} − 2)] = Var[e^{−t/2} e^{Wt}] = e^{−t} Var[e^{Wt}]
        = e^{−t}(e^{2t} − e^t) = e^t − 1.

The process Xt has the distribution of a sum between the predictable function 1 − 2e^{−t/2} and
the log-normal process e^{−t/2 + Wt}.

Example 7.3.5 Solve the stochastic differential equation

dXt = dt + t^{−3/2} Wt e^{−Wt^2/(2t)} dWt, X1 = 1.

Integrating between 1 and t and applying formula (6.0.12) yields

Xt = X1 + ∫_1^t ds + ∫_1^t s^{−3/2} Ws e^{−Ws^2/(2s)} dWs
   = 1 + t − 1 + e^{−W1^2/2} − (1/t^{1/2}) e^{−Wt^2/(2t)}
   = t + e^{−W1^2/2} − (1/t^{1/2}) e^{−Wt^2/(2t)}, ∀t ≥ 1.

7.4 Exact Stochastic Equations

The stochastic differential equation

dXt = a(t, Wt)dt + b(t, Wt)dWt   (7.4.9)

is called exact if there is a differentiable function f(t, x) such that

a(t, x) = ∂t f(t, x) + (1/2)∂x^2 f(t, x)   (7.4.10)
b(t, x) = ∂x f(t, x).   (7.4.11)

Assume the equation is exact. Then substituting in (7.4.9) yields

dXt = (∂t f(t, Wt) + (1/2)∂x^2 f(t, Wt)) dt + ∂x f(t, Wt)dWt.

Applying Ito's formula, the previous equation becomes

dXt = d(f(t, Wt)),

which implies Xt = f(t, Wt) + c, with c constant.

Solving the system of partial differential equations (7.4.10)–(7.4.11) requires the following
steps:
1. Integrate partially with respect to x in the second equation to obtain f(t, x) up to an
additive function T(t);
2. Substitute into the first equation and determine the function T(t);
3. The solution is Xt = f(t, Wt) + c, with c determined from the initial condition on Xt.
Example 7.4.1 Solve the stochastic differential equation

dXt = e^t(1 + Wt^2)dt + (1 + 2e^t Wt)dWt, X0 = 0.

In this case a(t, x) = e^t(1 + x^2) and b(t, x) = 1 + 2e^t x. The associated system is

e^t(1 + x^2) = ∂t f(t, x) + (1/2)∂x^2 f(t, x)
1 + 2e^t x = ∂x f(t, x).

Integrating partially in x in the second equation yields

f(t, x) = ∫(1 + 2e^t x) dx = x + e^t x^2 + T(t).

Then ∂t f = e^t x^2 + T'(t) and ∂x^2 f = 2e^t. Substituting in the first equation yields

e^t(1 + x^2) = e^t x^2 + T'(t) + e^t.

This implies T'(t) = 0, or T = c constant. Hence f(t, x) = x + e^t x^2 + c, and
Xt = f(t, Wt) = Wt + e^t Wt^2 + c. Since X0 = 0, it follows that c = 0. The solution is
Xt = Wt + e^t Wt^2.
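The closed-form solution can be compared against a numerical integration of the equation driven by the same noise (a hedged sketch assuming NumPy; here the integrand depends only on (t, W), so the Euler sum is fully vectorizable; grid size and seed are arbitrary).

```python
import numpy as np

# Pathwise check of Example 7.4.1: Euler sum of
# dX = e^t(1 + W^2)dt + (1 + 2 e^t W)dW  vs  X_t = W_t + e^t W_t^2.
rng = np.random.default_rng(16)
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))
t = np.linspace(0.0, T, n + 1)

X_T = np.sum(np.exp(t[:-1])*(1 + W[:-1]**2)*dt
             + (1 + 2*np.exp(t[:-1])*W[:-1])*dW)
print(f"Euler = {X_T:.4f},   closed form = {W[-1] + np.exp(T)*W[-1]**2:.4f}")
```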
Example 7.4.2 Find the solution of

dXt = (2tWt^3 + 3t^2(1 + Wt))dt + (3t^2 Wt^2 + 1)dWt, X0 = 0.

The coefficient functions are a(t, x) = 2tx^3 + 3t^2(1 + x) and b(t, x) = 3t^2 x^2 + 1. The
associated system is given by

2tx^3 + 3t^2(1 + x) = ∂t f(t, x) + (1/2)∂x^2 f(t, x)
3t^2 x^2 + 1 = ∂x f(t, x).

Integrating partially in the second equation yields

f(t, x) = ∫(3t^2 x^2 + 1) dx = t^2 x^3 + x + T(t).

Then ∂t f = 2tx^3 + T'(t) and ∂x^2 f = 6t^2 x, and plugging into the first equation we get

2tx^3 + 3t^2(1 + x) = 2tx^3 + T'(t) + 3t^2 x.

After cancelations we get T'(t) = 3t^2, so T(t) = t^3 + c. Then

f(t, x) = t^2 x^3 + x + t^3 + c.

The solution process is given by Xt = f(t, Wt) = t^2 Wt^3 + Wt + t^3 + c. Using X0 = 0 we get
c = 0. Hence the solution is Xt = t^2 Wt^3 + Wt + t^3.

The next result deals with a closedness-type condition.

Theorem 7.4.1 If the stochastic differential equation (7.4.9) is exact, then the coefficient
functions a(t, x) and b(t, x) satisfy the condition

∂x a = ∂t b + (1/2)∂x^2 b.   (7.4.12)

Proof: If the stochastic equation is exact, there is a function f(t, x) satisfying the system
(7.4.10)–(7.4.11). Differentiating the first equation of the system with respect to x yields

∂x a = ∂t ∂x f + (1/2)∂x^2 ∂x f.

Substituting b = ∂x f yields the desired relation.

Remark 7.4.2 The equation (7.4.12) has the meaning of a heat equation. The function b(t, x)
represents the temperature measured at x at the instant t, while ∂x a is the density of heat
sources. The function a(t, x) can be regarded as the potential from which the density of heat
sources is derived by taking the gradient in x.

It is worth noting that equation (7.4.12) is just a necessary condition for exactness. This
means that if this condition is not satisfied, then the equation is not exact. In this case we
need to try a different method to solve the equation.

Example 7.4.3 Is the stochastic differential equation

dXt = (1 + Wt^2)dt + (t^4 + Wt^2)dWt

exact?

Collecting the coefficients, we have a(t, x) = 1 + x^2, b(t, x) = t^4 + x^2. Since ∂x a = 2x,
∂t b = 4t^3, and ∂x^2 b = 2, the condition (7.4.12) is not satisfied, and hence the equation is not
exact.

7.5 Integration by Inspection

When solving a stochastic differential equation by inspection we look for opportunities to apply
the product or the quotient formulas:

d(f(t)Yt) = f(t) dYt + Yt df(t),

d(Xt/f(t)) = (f(t) dXt − Xt df(t))/f(t)^2.

For instance, if a stochastic differential equation can be written as

dXt = f'(t)Wt dt + f(t)dWt,

the product rule brings the equation into the exact form

dXt = d(f(t)Wt),

which after integration leads to the solution

Xt = X0 + f(t)Wt.

Example 7.5.1 Solve

dXt = (t + Wt^2)dt + 2tWt dWt, X0 = a.

We can write the equation as

dXt = Wt^2 dt + t(2Wt dWt + dt),

which can be contracted to

dXt = Wt^2 dt + t d(Wt^2).

Using the product rule we can bring it to the exact form

dXt = d(tWt^2),

with the solution Xt = tWt^2 + a.

Example 7.5.2 Solve the stochastic differential equation

dXt = (Wt + 3t^2)dt + t dWt.

If we rewrite the equation as

dXt = 3t^2 dt + (Wt dt + t dWt),

we note the exact expression formed by the last two terms, Wt dt + t dWt = d(tWt). Then

dXt = d(t^3) + d(tWt),

which is equivalent to d(Xt) = d(t^3 + tWt). Hence Xt = t^3 + tWt + c, c ∈ R.

Example 7.5.3 Solve the stochastic differential equation

e^{−2t} dXt = (1 + 2Wt^2)dt + 2Wt dWt.

Multiplying by e^{2t} we get

dXt = e^{2t}(1 + 2Wt^2)dt + 2e^{2t} Wt dWt.

After regrouping, this becomes

dXt = (2e^{2t} dt)Wt^2 + e^{2t}(2Wt dWt + dt).

Since d(e^{2t}) = 2e^{2t}dt and d(Wt^2) = 2Wt dWt + dt, the previous relation becomes

dXt = d(e^{2t})Wt^2 + e^{2t} d(Wt^2).

By the product rule, the right side becomes exact,

dXt = d(e^{2t} Wt^2),

and hence the solution is Xt = e^{2t} Wt^2 + c, c ∈ R.

Example 7.5.4 Solve the equation

t^3 dXt = (3t^2 Xt + t)dt + t^6 dWt, X1 = 0.

The equation can be written as

t^3 dXt − 3t^2 Xt dt = t dt + t^6 dWt.

Dividing by t^6,

(t^3 dXt − Xt d(t^3))/(t^3)^2 = t^{−5} dt + dWt.

Applying the quotient rule yields

d(Xt/t^3) = −d(t^{−4}/4) + dWt.

Integrating between 1 and t yields

Xt/t^3 = −t^{−4}/4 + Wt − W1 + c,

so

Xt = c t^3 − 1/(4t) + t^3(Wt − W1), c ∈ R.

Using X1 = 0 yields c = 1/4, and hence the solution is

Xt = (1/4)(t^3 − 1/t) + t^3(Wt − W1).
Exercise 7.5.1 Solve the following stochastic differential equations by the inspection method
(a) dXt = (1 + Wt)dt + (t + 2Wt)dWt, X0 = 0;
(b) t^2 dXt = (2t^3 − Wt)dt + t dWt, X1 = 0;
(c) e^{−t/2} dXt = (1/2)Wt dt + dWt, X0 = 0;
(d) dXt = 2tWt dWt + Wt^2 dt, X0 = 0;
(e) dXt = (1 + (1/(2√t))Wt)dt + √t dWt, X1 = 0.

7.6 Linear Stochastic Equations

Consider the stochastic differential equation with the drift term linear in Xt,

dXt = (α(t)Xt + β(t))dt + b(t, Wt)dWt, t ≥ 0.

This can also be written as

dXt − α(t)Xt dt = β(t)dt + b(t, Wt)dWt.

Let A(t) = ∫_0^t α(s) ds. Multiplying by the integrating factor e^{−A(t)}, the left side of the
previous equation becomes an exact expression:

e^{−A(t)}(dXt − α(t)Xt dt) = e^{−A(t)}β(t)dt + e^{−A(t)}b(t, Wt)dWt
d(e^{−A(t)} Xt) = e^{−A(t)}β(t)dt + e^{−A(t)}b(t, Wt)dWt.

Integrating yields

e^{−A(t)} Xt = X0 + ∫_0^t e^{−A(s)}β(s) ds + ∫_0^t e^{−A(s)}b(s, Ws) dWs

Xt = X0 e^{A(t)} + e^{A(t)}(∫_0^t e^{−A(s)}β(s) ds + ∫_0^t e^{−A(s)}b(s, Ws) dWs).

The first integral in the previous parenthesis is a Riemann integral, and the latter one is an
Ito stochastic integral. Sometimes, in practical applications, these integrals can be computed
explicitly.

When b(t, Wt) = b(t), the latter integral becomes a Wiener integral. In this case the solution
Xt is Gaussian with mean and variance given by

E[Xt] = X0 e^{A(t)} + e^{A(t)}∫_0^t e^{−A(s)}β(s) ds
Var[Xt] = e^{2A(t)}∫_0^t e^{−2A(s)}b(s)^2 ds.

Another important particular case is when α(t) = α ≠ 0 and β(t) = β are constants and
b(t, Wt) = b(t). The equation in this case is

dXt = (αXt + β)dt + b(t)dWt, t ≥ 0,

and the solution takes the form

Xt = X0 e^{αt} + (β/α)(e^{αt} − 1) + ∫_0^t e^{α(t−s)}b(s) dWs.

Example 7.6.1 Solve the linear stochastic differential equation

dXt = (2Xt + 1)dt + e^{2t} dWt.

Write the equation as

dXt − 2Xt dt = dt + e^{2t} dWt

and multiply by the integrating factor e^{−2t} to get

d(e^{−2t} Xt) = e^{−2t} dt + dWt.

Integrating between 0 and t and multiplying by e^{2t}, we obtain

Xt = X0 e^{2t} + e^{2t}∫_0^t e^{−2s} ds + e^{2t}∫_0^t dWs
   = X0 e^{2t} + (1/2)(e^{2t} − 1) + e^{2t} Wt.
Example 7.6.2 Solve the linear stochastic differential equation

dXt = (2 − Xt)dt + e^{−t} Wt dWt.

Multiplying by the integrating factor e^t yields

e^t(dXt + Xt dt) = 2e^t dt + Wt dWt.

Since e^t(dXt + Xt dt) = d(e^t Xt), integrating between 0 and t we get

e^t Xt = X0 + ∫_0^t 2e^s ds + ∫_0^t Ws dWs.

Dividing by e^t and performing the integration yields

Xt = X0 e^{−t} + 2(1 − e^{−t}) + (1/2)e^{−t}(Wt^2 − t).
Example 7.6.3 Solve the linear stochastic differential equation

dXt = ((1/2)Xt + 1)dt + e^t cos Wt dWt.

Write the equation as

dXt − (1/2)Xt dt = dt + e^t cos Wt dWt

and multiply by the integrating factor e^{−t/2} to get

d(e^{−t/2} Xt) = e^{−t/2} dt + e^{t/2} cos Wt dWt.

Integrating yields

e^{−t/2} Xt = X0 + ∫_0^t e^{−s/2} ds + ∫_0^t e^{s/2} cos Ws dWs.

Multiply by e^{t/2} and use formula (6.0.11) to obtain the solution

Xt = X0 e^{t/2} + 2(e^{t/2} − 1) + e^t sin Wt.

Exercise 7.6.1 Solve the following linear stochastic differential equations

(a) dXt = (4Xt − 1)dt + 2dWt;
(b) dXt = (3Xt − 2)dt + e^{3t} dWt;
(c) dXt = (1 + Xt)dt + e^t Wt dWt;
(d) dXt = (4Xt + t)dt + e^{4t} dWt;
(e) dXt = (t + (1/2)Xt)dt + e^t sin Wt dWt;
(f) dXt = −Xt dt + e^{−t} dWt.

In the following we present an important example of a stochastic differential equation which
can be solved by the method presented in this section.

Proposition 7.6.2 (The mean-reverting Ornstein-Uhlenbeck process) Let m and α be
two constants. Then the solution Xt of the stochastic equation

dXt = (m − Xt)dt + α dWt   (7.6.13)

is given by

Xt = m + (X0 − m)e^{−t} + α∫_0^t e^{s−t} dWs.   (7.6.14)

Xt is Gaussian with mean and variance given by

E[Xt] = m + (X0 − m)e^{−t}
Var[Xt] = (α^2/2)(1 − e^{−2t}).
Proof: Adding Xt dt to both sides and multiplying by the integrating factor e^t we get

d(e^t Xt) = m e^t dt + α e^t dWt,

which after integration yields

e^t Xt = X0 + m(e^t − 1) + α∫_0^t e^s dWs,

and hence

Xt = X0 e^{−t} + m − m e^{−t} + α e^{−t}∫_0^t e^s dWs
   = m + (X0 − m)e^{−t} + α∫_0^t e^{s−t} dWs.

Since Xt is the sum between a predictable function and a Wiener integral, using Proposition
4.5.1 it follows that Xt is Gaussian, with

E[Xt] = m + (X0 − m)e^{−t} + E[α∫_0^t e^{s−t} dWs] = m + (X0 − m)e^{−t}

Var[Xt] = Var[α∫_0^t e^{s−t} dWs] = α^2 e^{−2t}∫_0^t e^{2s} ds
        = α^2 e^{−2t}(e^{2t} − 1)/2 = (α^2/2)(1 − e^{−2t}).

The name mean-reverting obviously comes from the fact that

lim_{t→∞} E[Xt] = m.

The variance tends exponentially to the constant α^2/2, lim_{t→∞} Var[Xt] = α^2/2, so for large
times the process fluctuates around the level m with a stationary amount of noise.
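Both limiting statements can be illustrated by simulation (a hedged Euler–Maruyama sketch assuming NumPy; all parameter values, sample sizes, and the seed are arbitrary choices).

```python
import numpy as np

# Simulation of the mean-reverting OU equation dX = (m - X)dt + alpha dW.
# For large t the mean approaches m and the variance approaches alpha^2/2.
rng = np.random.default_rng(11)
m, alpha, X0 = 5.0, 1.0, 0.0
T, n, paths = 8.0, 2000, 100_000
dt = T / n
X = np.full(paths, X0)
for _ in range(n):
    X += (m - X)*dt + alpha*rng.normal(0.0, np.sqrt(dt), paths)
print(f"mean: MC = {X.mean():.4f},  theory = {m + (X0 - m)*np.exp(-T):.4f}")
print(f"var : MC = {X.var():.4f},  theory = {alpha**2/2*(1 - np.exp(-2*T)):.4f}")
```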
Proposition 7.6.3 (The Brownian Bridge) For a, b ∈ R fixed, the stochastic differential
equation

dXt = ((b − Xt)/(1 − t)) dt + dWt, 0 ≤ t < 1, X0 = a

has the solution

Xt = a(1 − t) + bt + (1 − t)∫_0^t (1/(1 − s)) dWs, 0 ≤ t < 1.   (7.6.15)

The solution has the properties X0 = a and lim_{t→1} Xt = b, almost certainly.

Proof: If we let Yt = b − Xt, the equation becomes linear in Yt:

dYt + (1/(1 − t))Yt dt = −dWt.

Multiplying by the integrating factor ρ(t) = 1/(1 − t) yields

d(Yt/(1 − t)) = −(1/(1 − t)) dWt,

which leads by integration to

Yt/(1 − t) = c − ∫_0^t (1/(1 − s)) dWs.

Taking t = 0 yields c = Y0 = b − a, so

(b − Xt)/(1 − t) = b − a − ∫_0^t (1/(1 − s)) dWs.

Solving for Xt yields

Xt = a(1 − t) + bt + (1 − t)∫_0^t (1/(1 − s)) dWs, 0 ≤ t < 1.

Let Ut = (1 − t)∫_0^t (1/(1 − s)) dWs. First we notice that

E[Ut] = (1 − t)E[∫_0^t (1/(1 − s)) dWs] = 0,

Var[Ut] = (1 − t)^2 Var[∫_0^t (1/(1 − s)) dWs] = (1 − t)^2 ∫_0^t (1/(1 − s)^2) ds
        = (1 − t)^2 (1/(1 − t) − 1) = t(1 − t).

In order to show ac-lim_{t→1} Xt = b, we need to prove

P(ω; lim_{t→1} Xt(ω) = b) = 1.

Since Xt = a(1 − t) + bt + Ut, it suffices to show that

P(ω; lim_{t→1} Ut(ω) = 0) = 1.   (7.6.16)

We evaluate the probability of the complementary event,

P(ω; lim_{t→1} Ut(ω) ≠ 0) = P(ω; |Ut(ω)| > ε, ∀t),

for some ε > 0. Since by Chebyshev's inequality

P(ω; |Ut(ω)| > ε) ≤ Var(Ut)/ε^2 = t(1 − t)/ε^2

holds for any 0 ≤ t < 1, letting t → 1 implies that

P(ω; |Ut(ω)| > ε, ∀t) = 0,

which implies (7.6.16).


The process (7.6.15) is called the Brownian bridge because it joins X0 = a with X1 = b. Since
Xt is the sum between a deterministic linear function in t and a Wiener integral, it follows that
it is a Gaussian process, with mean and variance

E[Xt] = a(1 − t) + bt
Var[Xt] = Var[Ut] = t(1 − t).

It is worth noting that the variance is maximal at the midpoint t = 1/2 and zero at the end
points t = 0 and t = 1.
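The pinning behavior and the variance profile can be observed in simulation (a hedged Euler–Maruyama sketch assuming NumPy; the endpoints, grid size, and seed are arbitrary; the loop stops one step before t = 1 to avoid the drift singularity).

```python
import numpy as np

# Simulation of the Brownian bridge SDE dX = (b - X)/(1 - t) dt + dW, X0 = a.
rng = np.random.default_rng(12)
a, b = 1.0, 2.0
n, paths = 2000, 50_000
dt = 1.0 / n
X = np.full(paths, a)
var_half = None
for j in range(n - 1):                     # stop at t = 1 - dt
    t = j*dt
    X += (b - X)/(1.0 - t)*dt + rng.normal(0.0, np.sqrt(dt), paths)
    if j + 1 == n//2:
        var_half = X.var()                 # variance at t = 1/2
print(f"mean near t = 1: MC = {X.mean():.4f}   (theory {b:.4f})")
print(f"var at t = 1/2 : MC = {var_half:.4f}  (theory {0.25:.4f})")
```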

7.7 The Method of Variation of Parameters

Consider the following stochastic equation

dXt = αXt dWt,   (7.7.17)

with α constant. This is the equation known in physics to model linear noise. Dividing by Xt
yields

dXt/Xt = α dWt.

Switching to the integral form,

∫ dXt/Xt = ∫ α dWt,

and integrating "blindly" we get ln Xt = αWt + c, with c an integration constant. This leads
to the "pseudo-solution"

Xt = e^{αWt + c}.

The nomination "pseudo" stands for the fact that Xt does not satisfy the initial equation. We
shall find a correct solution by letting the parameter c be a function of t. In other words, we
are looking for a solution of the following type:

Xt = e^{αWt + c(t)},   (7.7.18)

where the function c(t) is subject to be determined. Using Ito's formula we get

dXt = d(e^{αWt + c(t)}) = e^{αWt + c(t)}(c'(t) + α^2/2)dt + α e^{αWt + c(t)} dWt
    = Xt(c'(t) + α^2/2)dt + αXt dWt.

Substituting the last term from the initial equation (7.7.17) yields

dXt = Xt(c'(t) + α^2/2)dt + dXt,

which leads to the equation

c'(t) + α^2/2 = 0,

with the solution c(t) = −(α^2/2)t + k. Substituting into (7.7.18) yields

Xt = e^{αWt − (α^2/2)t + k}.

The value of the constant k is determined by taking t = 0. This leads to X0 = e^k. Hence we
have obtained the solution of the equation (7.7.17):

Xt = X0 e^{αWt − (α^2/2)t}.

Example 7.7.1 Use the method of variation of parameters to solve the equation

dXt = Xt Wt dWt.

Dividing by Xt converts the differential equation into the equivalent integral form

∫ (1/Xt) dXt = ∫ Wt dWt.

The right side is a well-known stochastic integral given by

∫_0^t Ws dWs = (1/2)Wt^2 − t/2.

The left side is integrated "blindly" according to the rules of elementary Calculus,

∫ (1/Xt) dXt = ln Xt + C.

Equating the last two relations and solving for Xt we obtain the "pseudo-solution"

Xt = e^{Wt^2/2 − t/2 + c},

with c constant. In order to get a correct solution, we let the parameter c vary in time; we look
for a solution of the form

Xt = e^{Wt^2/2 − t/2 + c(t)},

where the correction c(t) has increments dc(t) = γ(t, Wt)dt, with γ to be determined. Writing
Xt = e^{Ut} with Ut = Wt^2/2 − t/2 + c(t), the relation d(Wt^2/2) = Wt dWt + (1/2)dt gives

dUt = Wt dWt + γ(t, Wt)dt,   (dUt)^2 = Wt^2 dt,

and Ito's formula yields

dXt = Xt dUt + (1/2)Xt(dUt)^2 = Xt(γ(t, Wt) + Wt^2/2)dt + Xt Wt dWt.

Comparing with the initial equation dXt = Xt Wt dWt, the coefficient of dt must vanish, so
γ(t, Wt) = −Wt^2/2 and hence c(t) = c(0) − (1/2)∫_0^t Ws^2 ds. Since X0 = e^{c(0)}, the solution
is given by

Xt = X0 exp(Wt^2/2 − t/2 − (1/2)∫_0^t Ws^2 ds).

Unlike in the previous equation, here the correction c is not a deterministic function of t alone:
it depends on the entire path of the Brownian motion through the integral ∫_0^t Ws^2 ds.

Example 7.7.2 Use the method of variation of parameters to solve the stochastic differential
equation

dXt = µXt dt + σXt dWt,

with µ and σ constants.

After dividing by Xt we bring the equation into the equivalent integral form

∫ dXt/Xt = ∫ µ dt + ∫ σ dWt.

Integrating the left side "blindly" we get

ln Xt = µt + σWt + c,

where c is an integration constant. We arrive at the following "pseudo-solution"

Xt = e^{µt + σWt + c}.

Assume the constant c is replaced by a function c(t), so we are looking for a solution of the
form

Xt = e^{µt + σWt + c(t)}.   (7.7.19)

Applying Ito's formula we get

dXt = Xt(µ + c'(t) + σ^2/2)dt + σXt dWt.

Subtracting the initial equation yields

(c'(t) + σ^2/2)dt = 0,

which is satisfied for c'(t) = −σ^2/2, with the solution c(t) = −(σ^2/2)t + k, k ∈ R. Substituting
into (7.7.19) yields the solution

Xt = e^{µt + σWt − (σ^2/2)t + k} = e^{(µ − σ^2/2)t + σWt + k} = X0 e^{(µ − σ^2/2)t + σWt}.

7.8 Integrating Factors

The method of integrating factors can be applied to a class of stochastic differential equations
of the type

dXt = f(t, Xt)dt + g(t)Xt dWt,   (7.8.20)

where f and g are continuous deterministic functions. The integrating factor is given by

ρt = e^{−∫_0^t g(s) dWs + (1/2)∫_0^t g^2(s) ds}.

The equation can be brought to the following exact form

d(ρt Xt) = ρt f(t, Xt)dt.

Substituting Yt = ρt Xt, we obtain that Yt satisfies the deterministic differential equation

dYt = ρt f(t, Yt/ρt)dt,

which can be solved either by integration or as an exact equation. We shall exemplify this
method with a few examples.

Example 7.8.1 Solve the stochastic differential equation

dXt = r dt + αXt dWt,   (7.8.21)

with r and α constants.

The integrating factor is given by ρt = e^{(1/2)α^2 t − αWt}. Using Ito's formula, we can easily
check that

dρt = ρt(α^2 dt − α dWt).

Using dt^2 = dt dWt = 0 and (dWt)^2 = dt, we obtain

dXt dρt = −α^2 ρt Xt dt.

Multiplying by ρt, the initial equation becomes

ρt dXt − αρt Xt dWt = rρt dt,

and adding and subtracting α^2 ρt Xt dt from the left side yields

ρt dXt − αρt Xt dWt + α^2 ρt Xt dt − α^2 ρt Xt dt = rρt dt.

This can be written as

ρt dXt + Xt dρt + dρt dXt = rρt dt,

which in virtue of the product rule becomes

d(ρt Xt) = rρt dt.

Integrating yields

ρt Xt = ρ0 X0 + r∫_0^t ρs ds,

and hence the solution is

Xt = X0/ρt + (r/ρt)∫_0^t ρs ds
   = X0 e^{αWt − (1/2)α^2 t} + r∫_0^t e^{−(1/2)α^2(t−s) + α(Wt − Ws)} ds.

Exercise 7.8.1 Let α be a constant. Solve the following stochastic differential equations by the
method of integrating factors
(a) dXt = αXt dWt;
(b) dXt = Xt dt + αXt dWt;
(c) dXt = (1/Xt) dt + αXt dWt, X0 > 0.

Exercise 7.8.2 Let Xt be the solution of the stochastic equation dXt = σXt dWt, with σ
constant. Let At = (1/t)∫_0^t Xs ds be the stochastic average of Xt. Find the stochastic equation
followed by At and the mean and variance of At.
116

7.9 Existence and Uniqueness

An exploding solution
Consider the non-linear stochastic differential equation

dXt = Xt^3 dt + Xt^2 dWt, X0 = 1/a.   (7.9.22)

We shall look for a solution of the type Xt = f(Wt). Ito's formula yields

dXt = f'(Wt)dWt + (1/2)f''(Wt)dt.

Equating the coefficients of dt and dWt in the last two equations yields

f'(Wt) = Xt^2 ⟹ f'(Wt) = f(Wt)^2   (7.9.23)
(1/2)f''(Wt) = Xt^3 ⟹ f''(Wt) = 2f(Wt)^3.   (7.9.24)

We note that equation (7.9.23) implies (7.9.24) by differentiation. So it suffices to solve only
the ordinary differential equation

f'(x) = f(x)^2, f(0) = 1/a.

Separating and integrating we have

∫ df/f^2 = ∫ dx ⟹ f(x) = 1/(a − x).

Hence a solution of equation (7.9.22) is

Xt = 1/(a − Wt).

Let Ta be the first time the Brownian motion Wt hits a. Then the process Xt is defined only
for 0 ≤ t < Ta. Ta is a random variable with P(Ta < ∞) = 1 and E[Ta] = ∞, see section 3.1.
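The finiteness of the hitting time, and hence the finite-time blow-up of Xt, can be observed numerically (a hedged sketch assuming NumPy; the level a, step size, horizon, and seed are arbitrary; the theoretical probability P(Ta ≤ t) = 2(1 − Φ(a/√t)) gives about 0.92 for a = 1, t = 100).

```python
import numpy as np

# The solution X_t = 1/(a - W_t) is defined only until T_a = inf{t : W_t = a}.
# We estimate the distribution of T_a by simulating Brownian paths.
rng = np.random.default_rng(14)
a, dt, paths, max_steps = 1.0, 1e-2, 10_000, 10_000
W = np.zeros(paths)
hit = np.full(paths, np.inf)
for j in range(max_steps):
    W += rng.normal(0.0, np.sqrt(dt), paths)
    newly = (W >= a) & np.isinf(hit)
    hit[newly] = (j + 1)*dt
print(f"fraction hitting a = {a} before t = {max_steps*dt:.0f}: "
      f"{np.mean(np.isfinite(hit)):.3f}")
print(f"median hitting time: {np.median(hit[np.isfinite(hit)]):.3f}"
      "   (the mean E[T_a] is infinite)")
```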
The following theorem is the analog of Picard's uniqueness result from ordinary differential
equations:

Theorem 7.9.1 (Existence and Uniqueness) Consider the stochastic differential equation

dXt = b(t, Xt)dt + σ(t, Xt)dWt, X0 = c,

where c is a constant and b and σ are continuous functions on [0, T] × R satisfying

1. |b(t, x)| + |σ(t, x)| ≤ C(1 + |x|), x ∈ R, t ∈ [0, T];
2. |b(t, x) − b(t, y)| + |σ(t, x) − σ(t, y)| ≤ K|x − y|, x, y ∈ R, t ∈ [0, T],

with C, K positive constants. Then there is a unique solution process Xt that is continuous
and satisfies

E[∫_0^T Xt^2 dt] < ∞.

The first condition says that the drift and volatility increase no faster than a linear function
in x. The second condition states that the functions are Lipschitz in the second argument.
Chapter 8

Martingales

8.1 Examples of Martingales


In this section we shall use the knowledge acquired in previous chapters to present a few
important examples of martingales. These will be useful in the proof of Girsanov's theorem in
the next section.
We start by recalling that a process Xt, 0 ≤ t ≤ T, is an Ft-martingale if
1. E[|Xt|] < ∞ (Xt is integrable for each t);
2. Xt is Ft-adapted;
3. the forecast of future values is the last observation: E[Xt|Fs] = Xs, ∀s < t.
We shall present in the following three important examples of martingales and some of their
particular cases.

Example 8.1.1 If v(s) is a continuous function on [0, T], then

Xt = ∫_0^t v(s) dWs

is an Ft-martingale.

Taking out the predictable part,

E[Xt|Fs] = E[∫_0^s v(τ) dWτ + ∫_s^t v(τ) dWτ | Fs]
         = Xs + E[∫_s^t v(τ) dWτ | Fs] = Xs,

where we used that ∫_s^t v(τ) dWτ is independent of Fs, so its conditional expectation equals
the usual expectation:

E[∫_s^t v(τ) dWτ | Fs] = E[∫_s^t v(τ) dWτ] = 0.
Example 8.1.2 Let $X_t = \int_0^t v(s)\, dW_s$ be a process as in Example 8.1.1. Then
$$M_t = X_t^2 - \int_0^t v^2(s)\, ds$$
is an $\mathcal{F}_t$-martingale.

The process Xt satisfies the stochastic equation dXt = v(t)dWt . By Ito’s formula

$$d(X_t^2) = 2X_t\, dX_t + (dX_t)^2 = 2v(t) X_t\, dW_t + v^2(t)\, dt. \tag{8.1.1}$$

Integrating between s and t yields


$$X_t^2 - X_s^2 = 2 \int_s^t X_\tau v(\tau)\, dW_\tau + \int_s^t v^2(\tau)\, d\tau.$$

Then, separating the predictable part from the unpredictable part, we have
$$E[M_t | \mathcal{F}_s] = E\Big[ X_t^2 - \int_0^t v^2(\tau)\, d\tau \,\Big|\, \mathcal{F}_s \Big]$$
$$= E\Big[ X_t^2 - X_s^2 - \int_s^t v^2(\tau)\, d\tau + X_s^2 - \int_0^s v^2(\tau)\, d\tau \,\Big|\, \mathcal{F}_s \Big]$$
$$= X_s^2 - \int_0^s v^2(\tau)\, d\tau + E\Big[ X_t^2 - X_s^2 - \int_s^t v^2(\tau)\, d\tau \,\Big|\, \mathcal{F}_s \Big]$$
$$= M_s + 2E\Big[ \int_s^t X_\tau v(\tau)\, dW_\tau \,\Big|\, \mathcal{F}_s \Big] = M_s,$$
where we used relation (8.1.1) and the fact that $\int_s^t X_\tau v(\tau)\, dW_\tau$ is totally unpredictable given the information set $\mathcal{F}_s$. In the following we shall discuss a few particular cases.
Particular cases.
1. If $v(s) = 1$, then $X_t = W_t$. In this case $M_t = W_t^2 - t$ is an $\mathcal{F}_t$-martingale.
2. If $v(s) = s$, then $X_t = \int_0^t s\, dW_s$, and hence
$$M_t = \Big( \int_0^t s\, dW_s \Big)^2 - \frac{t^3}{3}$$
is an $\mathcal{F}_t$-martingale.
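Both particular cases lend themselves to a quick numerical check, since a martingale has constant expectation: here $E[W_t^2 - t] = 0$ and $E[(\int_0^t s\, dW_s)^2] = t^3/3$ for every $t$. A minimal Monte Carlo sketch in Python (the step and sample sizes are arbitrary choices):

import numpy as np

# Monte Carlo check of E[W_t^2 - t] = 0 and E[(int_0^t s dW_s)^2] = t^3/3.
rng = np.random.default_rng(1)
t, n_steps, n_paths = 2.0, 500, 20_000
dt = t / n_steps
s = np.linspace(0.0, t, n_steps, endpoint=False)            # left endpoints
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W_t = dW.sum(axis=1)
I_t = (s * dW).sum(axis=1)                                  # Ito sum for int s dW
print(np.mean(W_t**2 - t))          # should be close to 0
print(np.mean(I_t**2), t**3 / 3)    # the two values should nearly agree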

Example 8.1.3 Let $u : [0, T] \to \mathbb{R}$ be a continuous function. Then
$$M_t = e^{\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}$$
is an $\mathcal{F}_t$-martingale for $0 \le t \le T$.

Consider the process $U_t = \int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds$. Then

$$dU_t = u(t)\, dW_t - \frac{1}{2} u^2(t)\, dt, \qquad (dU_t)^2 = u^2(t)\, dt.$$

Then Ito’s formula yields


$$dM_t = d(e^{U_t}) = e^{U_t}\, dU_t + \frac{1}{2} e^{U_t} (dU_t)^2
 = e^{U_t} \Big( u(t)\, dW_t - \frac{1}{2} u^2(t)\, dt + \frac{1}{2} u^2(t)\, dt \Big)
 = u(t) M_t\, dW_t.$$

Integrating between s and t yields


$$M_t = M_s + \int_s^t u(\tau) M_\tau\, dW_\tau.$$
Since $\int_s^t u(\tau) M_\tau\, dW_\tau$ is independent of $\mathcal{F}_s$,
$$E\Big[ \int_s^t u(\tau) M_\tau\, dW_\tau \,\Big|\, \mathcal{F}_s \Big] = E\Big[ \int_s^t u(\tau) M_\tau\, dW_\tau \Big] = 0,$$
and hence
$$E[M_t | \mathcal{F}_s] = E\Big[ M_s + \int_s^t u(\tau) M_\tau\, dW_\tau \,\Big|\, \mathcal{F}_s \Big] = M_s.$$

Remark 8.1.4 The condition that $u(s)$ is continuous on $[0, T]$ can be relaxed by asking only $u \in L^2[0, T]$. It is worth noting that the conclusion still holds if the function $u(s)$ is replaced by a stochastic process $u(t, \omega)$ satisfying Novikov's condition
$$E\big[ e^{\frac{1}{2}\int_0^T u^2(s, \omega)\, ds} \big] < \infty.$$

The previous process plays a distinguished role in the theory of martingales, and it will be useful when proving the Girsanov theorem.

Definition 8.1.5 Let $u \in L^2[0, T]$ be a deterministic function. Then the stochastic process
$$M_t = e^{\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}$$
is called the exponential process induced by $u$.

Particular cases of exponential processes.
1. If we let $u(s) = \sigma$, constant, then $M_t = e^{\sigma W_t - \frac{\sigma^2}{2} t}$ is an $\mathcal{F}_t$-martingale.
2. Let $u(s) = s$. Integrating in $d(tW_t) = t\, dW_t + W_t\, dt$ yields
$$\int_0^t s\, dW_s = t W_t - \int_0^t W_s\, ds.$$
Let $Z_t = \int_0^t W_s\, ds$ be the integrated Brownian motion. Then
$$M_t = e^{\int_0^t s\, dW_s - \frac{1}{2}\int_0^t s^2\, ds} = e^{t W_t - Z_t - \frac{t^3}{6}}$$
is an $\mathcal{F}_t$-martingale.
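The first particular case can be checked numerically: since $M_0 = 1$ and $M_t$ is a martingale, $E[e^{\sigma W_t - \frac{\sigma^2}{2} t}]$ should equal 1 for every $t$. A minimal sketch, with an arbitrary choice of $\sigma$:

import numpy as np

# E[exp(sigma W_t - sigma^2 t / 2)] should equal 1 for every t.
rng = np.random.default_rng(2)
sigma, n = 0.8, 1_000_000
for t in (0.5, 1.0, 2.0):
    W_t = np.sqrt(t) * rng.standard_normal(n)
    print(t, np.exp(sigma * W_t - 0.5 * sigma**2 * t).mean())   # close to 1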
Example 8.1.6 Let $X_t$ be a solution of $dX_t = u(t)\, dt + dW_t$, with $u(s)$ a bounded function. Consider the exponential process
$$M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}.$$
Then $Y_t = M_t X_t$ is an $\mathcal{F}_t$-martingale.

In Example 8.1.3 we obtained $dM_t = -u(t) M_t\, dW_t$. Then
$$dM_t\, dX_t = -u(t) M_t\, dt.$$
The product rule yields
$$dY_t = M_t\, dX_t + X_t\, dM_t + dM_t\, dX_t
 = M_t \big( u(t)\, dt + dW_t \big) - X_t u(t) M_t\, dW_t - u(t) M_t\, dt
 = M_t \big( 1 - u(t) X_t \big)\, dW_t.$$
Integrating between $s$ and $t$ yields
$$Y_t = Y_s + \int_s^t M_\tau \big( 1 - u(\tau) X_\tau \big)\, dW_\tau.$$
Since $\int_s^t M_\tau \big( 1 - u(\tau) X_\tau \big)\, dW_\tau$ is independent of $\mathcal{F}_s$,
$$E\Big[ \int_s^t M_\tau \big( 1 - u(\tau) X_\tau \big)\, dW_\tau \,\Big|\, \mathcal{F}_s \Big]
 = E\Big[ \int_s^t M_\tau \big( 1 - u(\tau) X_\tau \big)\, dW_\tau \Big] = 0,$$
and hence $E[Y_t | \mathcal{F}_s] = Y_s$.
Exercise 8.1.7 Prove that $(W_t + t)\, e^{-W_t - \frac{1}{2} t}$ is an $\mathcal{F}_t$-martingale.
Exercise 8.1.8 Let $h$ be a continuous function. Using the properties of the Wiener integral and log-normal random variables, show that
$$E\big[ e^{\int_0^t h(s)\, dW_s} \big] = e^{\frac{1}{2}\int_0^t h^2(s)\, ds}.$$

Exercise 8.1.9 Use the previous exercise to show that for any $t > 0$:
(a) $E[M_t] = 1$;  (b) $E[M_t^2] = e^{\int_0^t u^2(s)\, ds}$.
Exercise 8.1.10 Let $\mathcal{F}_t = \sigma\{W_u;\ u \le t\}$. Show that the following processes are $\mathcal{F}_t$-martingales:
(a) $e^{t/2} \cos W_t$;
(b) $e^{t/2} \sin W_t$.

Research Topic: Given a function $f$, find a function $g$ such that $g(t) f(W_t)$ is an $\mathcal{F}_t$-martingale.

8.2 Girsanov’s Theorem


In this section we shall present and prove a particular version of Girsanov’s theorem, which will
suffice for the purpose of later applications.
We shall first recall a few basic notions. Let $(\Omega, \mathcal{F}, P)$ be a probability space. When dealing with an $\mathcal{F}_t$-martingale on the aforementioned probability space, the filtration $\mathcal{F}_t$ is considered to be the sigma-algebra generated by the given Brownian motion $W_t$, i.e. $\mathcal{F}_t = \sigma\{W_u;\ 0 \le u \le t\}$. By default, a martingale is considered with respect to the probability measure $P$, in the sense that the expectations involve an integration with respect to $P$:
$$E^P[X] = \int_\Omega X(\omega)\, dP(\omega).$$

We have not used the superscript until now since there was no doubt about which probability measure was used. In this section we shall also use another probability measure, given by

dQ = MT dP,

where Mt is an exponential process. This means that Q : F → R is given by


$$Q(A) = \int_A dQ = \int_A M_T\, dP, \qquad \forall A \in \mathcal{F}.$$

Since MT > 0, M0 = 1, using the martingale property of Mt yields

$$Q(A) > 0, \qquad A \ne \emptyset;$$
$$Q(\Omega) = \int_\Omega M_T\, dP = E^P[M_T] = E^P[M_T | \mathcal{F}_0] = M_0 = 1,$$
which shows that $Q$ is a probability measure on $\mathcal{F}$, and hence $(\Omega, \mathcal{F}, Q)$ becomes a probability space.
The following transformation of expectation from the probability measure Q to P will be useful
in the sequel. If X is a random variable
$$E^Q[X] = \int_\Omega X(\omega)\, dQ(\omega) = \int_\Omega X(\omega) M_T(\omega)\, dP(\omega) = E^P[X M_T].$$

The following result will play a central role in proving Girsanov’s theorem:

Lemma 8.2.1 Let $X_t$ be the Ito process
$$dX_t = u(t)\, dt + dW_t, \qquad X_0 = 0, \quad 0 \le t \le T,$$
with $u(s)$ a bounded function. Consider the exponential process
$$M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}.$$

Then Xt is an Ft -martingale with respect to the measure

dQ(ω) = MT (ω)dP (ω).



Proof: We need to prove that Xt is an Ft -martingale w.r.t. Q, so it suffices to show the


following three properties:
1. Integrability of $X_t$. This part usually follows from standard norm estimates. We shall do it here in detail, but we shall omit it in other proofs. Integrating the equation of $X_t$ between $0$ and $t$ yields
$$X_t = \int_0^t u(s)\, ds + W_t. \tag{8.2.2}$$

We start with an estimate of the expectation w.r.t. $P$:
$$E^P[X_t^2] = E^P\Big[ \Big( \int_0^t u(s)\, ds \Big)^2 + 2 \int_0^t u(s)\, ds\; W_t + W_t^2 \Big]
 = \Big( \int_0^t u(s)\, ds \Big)^2 + 2 \int_0^t u(s)\, ds\; E^P[W_t] + E^P[W_t^2]
 = \Big( \int_0^t u(s)\, ds \Big)^2 + t < \infty, \qquad \forall\, 0 \le t \le T,$$

where the finiteness follows from the norm estimate
$$\int_0^t u(s)\, ds \le \int_0^t |u(s)|\, ds \le \Big[ t \int_0^t |u(s)|^2\, ds \Big]^{1/2}
 \le \Big[ T \int_0^T |u(s)|^2\, ds \Big]^{1/2} = T^{1/2} \|u\|_{L^2[0,T]}.$$

Next we obtain an estimate w.r.t. $Q$, using the Cauchy-Schwarz inequality:
$$E^Q[|X_t|]^2 = \Big( \int_\Omega |X_t| M_T\, dP \Big)^2 \le \int_\Omega |X_t|^2\, dP \int_\Omega M_T^2\, dP
 = E^P[X_t^2]\, E^P[M_T^2] < \infty,$$
since $E^P[X_t^2] < \infty$ and $E^P[M_T^2] = e^{\int_0^T u^2(s)\, ds} = e^{\|u\|^2_{L^2[0,T]}}$, see Exercise 8.1.9.
2. Ft -predictability of Xt . This follows from equation (8.2.2) and the fact that Wt is Ft -
predictable.
3. Conditional expectation of Xt . From Examples 8.1.3 and 8.1.6 recall that for any 0 ≤ t ≤ T

Mt is an Ft -martingale w.r.t. probability measure P ;


Xt Mt is an Ft -martingale w.r.t. probability measure P.

We need to verify that
$$E^Q[X_t | \mathcal{F}_s] = X_s, \qquad \forall s \le t,$$
which can be written as
$$\int_A X_t\, dQ = \int_A X_s\, dQ, \qquad \forall A \in \mathcal{F}_s.$$
Since $dQ = M_T\, dP$, the previous relation becomes
$$\int_A X_t M_T\, dP = \int_A X_s M_T\, dP, \qquad \forall A \in \mathcal{F}_s.$$

This can be written in terms of conditional expectation as
$$E^P[X_t M_T | \mathcal{F}_s] = E^P[X_s M_T | \mathcal{F}_s]. \tag{8.2.3}$$
We shall prove this identity by showing that both terms are equal to $X_s M_s$. Since $X_s$ is $\mathcal{F}_s$-predictable and $M_t$ is a martingale, the right-side term becomes
$$E^P[X_s M_T | \mathcal{F}_s] = X_s\, E^P[M_T | \mathcal{F}_s] = X_s M_s, \qquad \forall s \le T.$$
Let $s < t$. Using the tower property (see Proposition 1.10.4, part 3), the left-side term becomes
$$E^P[X_t M_T | \mathcal{F}_s] = E^P\big[ E^P[X_t M_T | \mathcal{F}_t] \,\big|\, \mathcal{F}_s \big]
 = E^P\big[ X_t\, E^P[M_T | \mathcal{F}_t] \,\big|\, \mathcal{F}_s \big]
 = E^P[X_t M_t | \mathcal{F}_s] = X_s M_s,$$
where we used that $M_t$ and $X_t M_t$ are martingales and $X_t$ is $\mathcal{F}_t$-predictable. Hence (8.2.3) holds and $X_t$ is an $\mathcal{F}_t$-martingale w.r.t. the probability measure $Q$.

Lemma 8.2.2 Consider the process
$$X_t = \int_0^t u(s)\, ds + W_t, \qquad 0 \le t \le T,$$
with $u \in L^2[0, T]$ a deterministic function, and let $dQ = M_T\, dP$. Then
$$E^Q[X_t^2] = t.$$
Proof: Denote $U(t) = \int_0^t u(s)\, ds$. Then
$$E^Q[X_t^2] = E^P[X_t^2 M_T] = E^P[U^2(t) M_T + 2U(t) W_t M_T + W_t^2 M_T]$$
$$= U^2(t)\, E^P[M_T] + 2U(t)\, E^P[W_t M_T] + E^P[W_t^2 M_T]. \tag{8.2.4}$$

From Exercise 8.1.9 (a) we have $E^P[M_T] = 1$. In order to compute $E^P[W_t M_T]$ we use the tower property and the martingale property of $M_t$:
$$E^P[W_t M_T] = E\big[ E^P[W_t M_T | \mathcal{F}_t] \big] = E\big[ W_t\, E^P[M_T | \mathcal{F}_t] \big] = E[W_t M_t]. \tag{8.2.5}$$

Using the product rule,
$$d(W_t M_t) = M_t\, dW_t + W_t\, dM_t + dW_t\, dM_t
 = \big( M_t - u(t) M_t W_t \big)\, dW_t - u(t) M_t\, dt,$$

where we used $dM_t = -u(t) M_t\, dW_t$. Integrating between $0$ and $t$ yields
$$W_t M_t = \int_0^t \big( M_s - u(s) M_s W_s \big)\, dW_s - \int_0^t u(s) M_s\, ds.$$
Taking the expectation and using the property of Ito integrals, we have
$$E[W_t M_t] = -\int_0^t u(s)\, E[M_s]\, ds = -\int_0^t u(s)\, ds = -U(t). \tag{8.2.6}$$

Substituting in (8.2.5) yields
$$E^P[W_t M_T] = -U(t). \tag{8.2.7}$$
For computing $E^P[W_t^2 M_T]$ we proceed in a similar way:
$$E^P[W_t^2 M_T] = E\big[ E^P[W_t^2 M_T | \mathcal{F}_t] \big] = E\big[ W_t^2\, E^P[M_T | \mathcal{F}_t] \big] = E[W_t^2 M_t]. \tag{8.2.8}$$

Using the product rule yields
$$d(W_t^2 M_t) = M_t\, d(W_t^2) + W_t^2\, dM_t + d(W_t^2)\, dM_t$$
$$= M_t (2W_t\, dW_t + dt) - W_t^2\, u(t) M_t\, dW_t - (2W_t\, dW_t + dt)\, u(t) M_t\, dW_t$$
$$= M_t W_t \big( 2 - u(t) W_t \big)\, dW_t + \big( M_t - 2u(t) W_t M_t \big)\, dt.$$
Integrating between $0$ and $t$,
$$W_t^2 M_t = \int_0^t M_s W_s \big( 2 - u(s) W_s \big)\, dW_s + \int_0^t \big( M_s - 2u(s) W_s M_s \big)\, ds,$$

and taking the expected value we get
$$E[W_t^2 M_t] = \int_0^t \big( E[M_s] - 2u(s)\, E[W_s M_s] \big)\, ds
 = \int_0^t \big( 1 + 2u(s) U(s) \big)\, ds = t + U^2(t),$$
where we used (8.2.6). Substituting in (8.2.8) yields
$$E^P[W_t^2 M_T] = t + U^2(t). \tag{8.2.9}$$

Substituting (8.2.7) and (8.2.9) into relation (8.2.4) yields
$$E^Q[X_t^2] = U^2(t) - 2U(t)^2 + t + U^2(t) = t, \tag{8.2.10}$$

which ends the proof of the lemma.


Now we are prepared to prove one of the most important results of stochastic calculus.

Theorem 8.2.3 (Girsanov Theorem) Let $u \in L^2[0, T]$ be a deterministic function. Then the process
$$X_t = \int_0^t u(s)\, ds + W_t, \qquad 0 \le t \le T,$$
is a Brownian motion w.r.t. the probability measure $Q$ given by
$$dQ = e^{-\int_0^T u(s)\, dW_s - \frac{1}{2}\int_0^T u(s)^2\, ds}\, dP.$$

Proof: In order to prove that $X_t$ is a Brownian motion on the probability space $(\Omega, \mathcal{F}, Q)$ we shall apply Lévy's characterization theorem, see Theorem 2.1.4; it suffices to show that $X_t$ satisfies the hypotheses of that theorem. Lemma 8.2.1 implies that the process $X_t$ satisfies the following properties:
1. $X_0 = 0$;
2. $X_t$ is continuous in $t$;
3. $X_t$ is a square integrable $\mathcal{F}_t$-martingale on the space $(\Omega, \mathcal{F}, Q)$.
The only property we still need to show is
4. $E^Q[(X_t - X_s)^2] = t - s$, $s < t$.
The rest of the proof is dedicated to the verification of relation 4. Using Lemma 8.2.2, the martingale property of $X_t$ under $Q$, the additivity and the tower property of expectations, we have
$$E^Q[(X_t - X_s)^2] = E^Q[X_t^2] - 2E^Q[X_t X_s] + E^Q[X_s^2]$$
$$= t - 2E^Q\big[ E^Q[X_t X_s | \mathcal{F}_s] \big] + s
 = t - 2E^Q\big[ X_s\, E^Q[X_t | \mathcal{F}_s] \big] + s$$
$$= t - 2E^Q[X_s^2] + s = t - 2s + s = t - s,$$
which proves relation 4.


Choosing u(s) = λ, constant, we obtain the following consequence that will be useful later
in finance applications.

Corollary 8.2.4 Let Wt be a Brownian motion on the probability space (Ω, F, P ). Then the
process
Xt = λt + Wt , 0≤t≤T
is a Brownian motion on the probability space $(\Omega, \mathcal{F}, Q)$, where
$$dQ = e^{-\lambda W_T - \frac{1}{2}\lambda^2 T}\, dP.$$
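The corollary can be illustrated numerically by importance weighting: sampling $W_T$ under $P$ and weighting each sample by $M_T = e^{-\lambda W_T - \frac{1}{2}\lambda^2 T}$ should make $X_T = \lambda T + W_T$ behave like an $N(0, T)$ variable under $Q$. A sketch with illustrative values of $\lambda$ and $T$:

import numpy as np

# Weighting samples of W_T by M_T = exp(-lam W_T - lam^2 T / 2) realizes Q;
# under Q, X_T = lam T + W_T should be N(0, T).
rng = np.random.default_rng(3)
lam, T, n = 0.7, 1.0, 2_000_000
W_T = np.sqrt(T) * rng.standard_normal(n)
M_T = np.exp(-lam * W_T - 0.5 * lam**2 * T)       # Radon-Nikodym weights
X_T = lam * T + W_T
print(M_T.mean())                                  # ~ 1, so Q is a probability
print(np.mean(M_T * X_T))                          # ~ 0 = E^Q[X_T]
print(np.mean(M_T * X_T**2))                       # ~ T = E^Q[X_T^2]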
Part II

Applications to Finance

Chapter 9

Modeling Stochastic Rates

Elementary Calculus provides powerful methods to model an ideal world. However, the real
world is imperfect, and in order to study it, one needs to employ the methods of Stochastic
Calculus.

9.1 An Introductory Problem


The model in an ideal world
Consider the amount of money M (t) at time t invested in a bank account that pays interest at
a constant rate r. The differential equation which models this is
dM (t) = rM (t)dt. (9.1.1)

Given the initial investment M (0) = M0 , the account balance at time t is given by the solution
of the aforementioned equation, which is M (t) = M0 ert .
The model in the real world
In the real world the interest rate r is not constant. It may be assumed constant only for a very
small amount of time, like one day or one week. The interest rate changes unpredictably in time,
which means that it is a stochastic process. This can be modeled in several different ways. For
instance, we may assume that the interest rate at time t is given by the continuous stochastic
process rt = r + σWt , where σ > 0 is a constant that controls the volatility of the rate, and
Wt is a Brownian motion process. The process rt represents a diffusion that starts at r0 = r,
with constant mean $E[r_t] = r$ and variance proportional to the time elapsed, $Var[r_t] = \sigma^2 t$.
With this change in the model, the account balance at time t becomes a stochastic process Mt
that satisfies the stochastic equation
dMt = (r + σWt )Mt dt, t ≥ 0. (9.1.2)

Solving the equation


In order to solve this equation, we write it as $dM_t - r_t M_t\, dt = 0$ and multiply by the integrating factor $e^{-\int_0^t r_s\, ds}$. Using $dt^2 = dt\, dW_t = 0$, the cross term vanishes:
$$dM_t\; d\big( e^{-\int_0^t r_s\, ds} \big) = r_t M_t\, dt\; e^{-\int_0^t r_s\, ds} \Big( -r_t\, dt + \frac{1}{2} r_t^2\, dt^2 \Big) = 0.$$


Using the product rule, the equation becomes exact:
$$d\big( M_t\, e^{-\int_0^t r_s\, ds} \big) = 0.$$
Integrating yields the solution
$$M_t = M_0\, e^{\int_0^t r_s\, ds} = M_0\, e^{\int_0^t (r + \sigma W_s)\, ds} = M_0\, e^{rt + \sigma Z_t},$$
where $Z_t = \int_0^t W_s\, ds$ is the integrated Brownian motion process. Since the moment generating function of $Z_t$ is $m(\sigma) = E[e^{\sigma Z_t}] = e^{\sigma^2 t^3/6}$ (see Exercise 2.3.3), we obtain
$$E[e^{\sigma Z_t}] = e^{\sigma^2 t^3/6};$$
$$Var[e^{\sigma Z_t}] = m(2\sigma) - m(\sigma)^2 = e^{\sigma^2 t^3/3} \big( e^{\sigma^2 t^3/3} - 1 \big).$$

Then the mean and variance of the solution $M_t = M_0\, e^{rt + \sigma Z_t}$ are
$$E[M_t] = M_0\, e^{rt}\, E[e^{\sigma Z_t}] = M_0\, e^{rt + \sigma^2 t^3/6};$$
$$Var[M_t] = M_0^2\, e^{2rt}\, Var[e^{\sigma Z_t}] = M_0^2\, e^{2rt + \sigma^2 t^3/3} \big( e^{\sigma^2 t^3/3} - 1 \big).$$

Conclusions
We shall make a few interesting remarks. If $M(t)$ and $M_t$ represent the balance at time $t$ in the ideal and real worlds, respectively, then
$$E[M_t] = M_0\, e^{rt}\, e^{\sigma^2 t^3/6} > M_0\, e^{rt} = M(t).$$
This means that we expect to have more money in the account in the real world than in the idealistic world; it appears that we make money on an investment if the interest rate is stochastic. In a similar way, a bank can expect to make money when lending at a stochastic interest rate. This inequality is due to the convexity of the exponential function: if $X_t = rt + \sigma Z_t$, then $E[X_t] = rt$, and Jensen's inequality yields
$$E[e^{X_t}] \ge e^{E[X_t]} = e^{rt}.$$

9.2 Langevin’s Equation


We shall consider another stochastic extension of the equation (9.1.1). We shall allow for
continuously random deposits and withdrawals which can be modeled by an unpredictable
term, given by αdWt , with α constant. The obtained equation

dMt = rMt dt + αdWt , t≥0 (9.2.3)

is called Langevin’s equation.


Solving the equation. We shall solve it as a linear stochastic equation. Multiplying by the integrating factor $e^{-rt}$ yields
$$d(e^{-rt} M_t) = \alpha e^{-rt}\, dW_t.$$
Integrating, we obtain
$$e^{-rt} M_t = M_0 + \alpha \int_0^t e^{-rs}\, dW_s,$$
and hence the solution is
$$M_t = M_0\, e^{rt} + \alpha \int_0^t e^{r(t-s)}\, dW_s. \tag{9.2.4}$$

This is called the Ornstein-Uhlenbeck process. Since the last term is a Wiener integral, by Proposition 7.3.1 the process $M_t$ is Gaussian, with mean
$$E[M_t] = M_0\, e^{rt} + E\Big[ \alpha \int_0^t e^{r(t-s)}\, dW_s \Big] = M_0\, e^{rt}$$
and variance
$$Var[M_t] = Var\Big[ \alpha \int_0^t e^{r(t-s)}\, dW_s \Big] = \frac{\alpha^2}{2r} \big( e^{2rt} - 1 \big).$$
It is worth noting that the expected balance equals the deterministic balance $M_0 e^{rt}$, and that for small $t$ the variance is approximately equal to $\alpha^2 t$.
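Langevin's equation is also easy to simulate directly. A minimal Euler-Maruyama sketch (step size and parameters are illustrative) that compares the sample mean and variance at time $t$ against the formulas above:

import numpy as np

# Euler-Maruyama for Langevin's equation dM_t = r M_t dt + alpha dW_t.
rng = np.random.default_rng(5)
M0, r, alpha, t = 1.0, 0.05, 0.3, 2.0            # illustrative values
n_steps, n_paths = 2000, 50_000
dt = t / n_steps
M = np.full(n_paths, M0)
for _ in range(n_steps):
    M += r * M * dt + alpha * np.sqrt(dt) * rng.standard_normal(n_paths)
print(M.mean(), M0 * np.exp(r * t))                           # mean
print(M.var(), alpha**2 / (2 * r) * (np.exp(2 * r * t) - 1))  # variance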
If the constant $\alpha$ is replaced by an unpredictable function $\alpha(t, W_t)$, the equation becomes
$$dM_t = r M_t\, dt + \alpha(t, W_t)\, dW_t, \qquad t \ge 0.$$
Using a similar argument we arrive at the following solution:
$$M_t = M_0\, e^{rt} + \int_0^t e^{r(t-s)} \alpha(s, W_s)\, dW_s. \tag{9.2.5}$$
This process is not Gaussian. Its mean and variance are given by
$$E[M_t] = M_0\, e^{rt},$$
$$Var[M_t] = \int_0^t e^{2r(t-s)}\, E[\alpha^2(s, W_s)]\, ds.$$
In the particular case when $\alpha(t, W_t) = e^{\sqrt{2r}\, W_t}$, using Application 6.0.18 with $\lambda = \sqrt{2r}$, we can work out for (9.2.5) an explicit form of the solution:
$$M_t = M_0\, e^{rt} + \int_0^t e^{r(t-s)}\, e^{\sqrt{2r}\, W_s}\, dW_s
 = M_0\, e^{rt} + e^{rt} \int_0^t e^{-rs}\, e^{\sqrt{2r}\, W_s}\, dW_s$$
$$= M_0\, e^{rt} + e^{rt}\, \frac{1}{\sqrt{2r}} \big( e^{-rt + \sqrt{2r}\, W_t} - 1 \big)
 = M_0\, e^{rt} + \frac{1}{\sqrt{2r}} \big( e^{\sqrt{2r}\, W_t} - e^{rt} \big).$$
The previous model for the interest rate is not a realistic one because it allows for negative rates; the Brownian motion reaches the level $-r/\sigma$ after a finite time with probability 1. However, it might be a good stochastic model for the evolution of a population, since in that case the rate can be negative.


Figure 9.1: A simulation of drt = a(b − rt )dt + σdWt , with r0 = 1.25, a = 3, σ = 1%, b = 1.2.

9.3 Equilibrium Models


Let rt denote the spot rate at time t. This is the rate at which one can invest for the shortest
period of time. The interest rate rt is assumed to satisfy an equation of the form

drt = m(rt )dt + σ(rt )dWt . (9.3.6)

The drift rate and the volatility of the spot rate $r_t$ do not depend explicitly on the time $t$. There are several classical choices for $m(r_t)$ and $\sigma(r_t)$ that will be discussed in the sequel.

9.4 The Rendleman and Bartter Model


In this model the short-time rate satisfies the process

drt = µrt dt + σrt dWt .


The growth rate $\mu$ and the volatility $\sigma$ are considered constants. This equation has been solved in Example 7.7.2 and its solution is given by
$$r_t = r_0\, e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t}.$$
The distribution of $r_t$ is log-normal. Using Example 7.2.2, the mean and variance become
$$E[r_t] = r_0\, e^{(\mu - \frac{\sigma^2}{2})t}\, E[e^{\sigma W_t}] = r_0\, e^{(\mu - \frac{\sigma^2}{2})t}\, e^{\sigma^2 t/2} = r_0\, e^{\mu t},$$
$$Var[r_t] = r_0^2\, e^{2(\mu - \frac{\sigma^2}{2})t}\, Var[e^{\sigma W_t}]
 = r_0^2\, e^{2(\mu - \frac{\sigma^2}{2})t}\, e^{\sigma^2 t} \big( e^{\sigma^2 t} - 1 \big)
 = r_0^2\, e^{2\mu t} \big( e^{\sigma^2 t} - 1 \big).$$
The next two models incorporate the mean-reverting phenomenon of interest rates. This means that in the long run the rate converges towards an average level. These models are more realistic and are based on economic arguments.

9.4.1 The Vasicek Model


Vasicek’s assumption is that the short-term interest rates should satisfy the stochastic differ-
ential equation
drt = a(b − rt )dt + σdWt , (9.4.7)

with $a, b, \sigma$ constants.
Assuming the spot rates are deterministic, we take $\sigma = 0$ and obtain the ODE
$$dr_t = a(b - r_t)\, dt.$$
Solving by separation of variables yields the solution
$$r_t = b + (r_0 - b)\, e^{-at}.$$
This shows that the rate $r_t$ is pulled towards the level $b$ at rate $a$: if $r_0 > b$, then $r_t$ decreases towards $b$, and if $r_0 < b$, then $r_t$ increases towards the horizontal asymptote $b$. The term $\sigma dW_t$ in Vasicek's model adds some "white noise" to the process. In the following we shall find an explicit formula for the spot rate $r_t$ in the stochastic case.

Proposition 9.4.1 The solution of the equation (9.4.7) is given by
$$r_t = b + (r_0 - b)\, e^{-at} + \sigma e^{-at} \int_0^t e^{as}\, dW_s.$$
The process $r_t$ is Gaussian, with mean and variance
$$E[r_t] = b + (r_0 - b)\, e^{-at};$$
$$Var[r_t] = \frac{\sigma^2}{2a} \big( 1 - e^{-2at} \big).$$
Proof: Multiplying by the integrating factor $e^{at}$ yields
$$d\big( e^{at} r_t \big) = ab\, e^{at}\, dt + \sigma e^{at}\, dW_t.$$
Integrating between $0$ and $t$ and dividing by $e^{at}$ we get
$$r_t = r_0\, e^{-at} + b e^{-at} \big( e^{at} - 1 \big) + \sigma e^{-at} \int_0^t e^{as}\, dW_s
 = b + (r_0 - b)\, e^{-at} + \sigma e^{-at} \int_0^t e^{as}\, dW_s.$$
Since the spot rate $r_t$ is the sum of the deterministic function $r_0 e^{-at} + b e^{-at}(e^{at} - 1)$ and a multiple of a Wiener integral, it follows from Proposition 4.5.1 that $r_t$ is Gaussian, with
$$E[r_t] = b + (r_0 - b)\, e^{-at},$$
$$Var[r_t] = Var\Big[ \sigma e^{-at} \int_0^t e^{as}\, dW_s \Big] = \sigma^2 e^{-2at} \int_0^t e^{2as}\, ds
 = \frac{\sigma^2}{2a} \big( 1 - e^{-2at} \big).$$
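A path such as the one in Fig. 9.1 can be produced with a short Euler scheme; the sketch below uses the parameter values from the figure caption and assumes a time step of $dt = 10^{-3}$, since the discretization behind the original figure is not specified:

import numpy as np
import matplotlib.pyplot as plt

# Euler scheme for the Vasicek model dr_t = a(b - r_t) dt + sigma dW_t.
rng = np.random.default_rng(6)
r0, a, b, sigma = 1.25, 3.0, 1.2, 0.01          # values from Figure 9.1
n_steps, dt = 6000, 1e-3                        # assumed discretization
r = np.empty(n_steps)
r[0] = r0
for k in range(1, n_steps):
    r[k] = r[k-1] + a * (b - r[k-1]) * dt \
           + sigma * np.sqrt(dt) * rng.standard_normal()
plt.plot(r)
plt.axhline(b, linestyle="--", label="equilibrium level b")
plt.legend()
plt.show()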

The following consequence explains the name of mean reverting rate.


Corollary 9.4.2 The spot rates rt tend to b in the mean square limit.


Figure 9.2: Comparison between simulations of the CIR and Vasicek models, with parameter values a = 3, σ = 15%, r0 = 12, b = 10. Note that the CIR process tends to be more volatile than Vasicek's.

Proof: Since $\lim_{t\to\infty} E[r_t] = \lim_{t\to\infty} \big( b + (r_0 - b) e^{-at} \big) = b$ and $\lim_{t\to\infty} Var[r_t] = 0$, applying Proposition 3.3.1 we get $\text{ms-}\lim_{t\to\infty} r_t = b$.
Since $r_t$ is normally distributed, Vasicek's model has been criticized because it allows for negative interest rates and unboundedly large rates. See Fig. 9.1 for a simulation of the short-term interest rate in the Vasicek model.

Exercise 9.4.3 Find the probability that rt is negative. What happens with this probability
when t → ∞? Find the rate of change of this probability.

9.4.2 The Cox-Ingersoll-Ross Model


The Cox-Ingersoll-Ross (CIR) model assumes that the spot rates verify the stochastic equation
$$dr_t = a(b - r_t)\, dt + \sigma \sqrt{r_t}\, dW_t, \tag{9.4.8}$$
with $a, b, \sigma$ constants. Two main advantages of this model are:
• the process exhibits mean reversion;
• it is not possible for the interest rates to become negative.
A process that satisfies equation (9.4.8) is called a CIR process. This is not a Gaussian process. In the following we shall compute its first two moments.
Integrating the equation (9.4.8) between $0$ and $t$ yields
$$r_t = r_0 + abt - a \int_0^t r_s\, ds + \sigma \int_0^t \sqrt{r_s}\, dW_s.$$
Taking the expectation, we obtain
$$E[r_t] = r_0 + abt - a \int_0^t E[r_s]\, ds.$$

Denote $\mu(t) = E[r_t]$. Then differentiating with respect to $t$ in
$$\mu(t) = r_0 + abt - a \int_0^t \mu(s)\, ds$$
yields the differential equation $\mu'(t) = ab - a\mu(t)$. Multiplying by the integrating factor $e^{at}$, we obtain
$$d\big( e^{at} \mu(t) \big) = ab\, e^{at}\, dt.$$
Integrating and using $\mu(0) = r_0$ provides the solution
$$\mu(t) = b + (r_0 - b)\, e^{-at}.$$

Hence
$$\lim_{t\to\infty} E[r_t] = \lim_{t\to\infty} \big( b + (r_0 - b)\, e^{-at} \big) = b,$$
which shows that the process is mean reverting.


We compute in the following the second moment $\mu_2(t) = E[r_t^2]$. By Ito's formula we have
$$d(r_t^2) = 2r_t\, dr_t + (dr_t)^2 = 2r_t\, dr_t + \sigma^2 r_t\, dt$$
$$= 2r_t \big[ a(b - r_t)\, dt + \sigma \sqrt{r_t}\, dW_t \big] + \sigma^2 r_t\, dt
 = \big[ (2ab + \sigma^2) r_t - 2a r_t^2 \big]\, dt + 2\sigma r_t^{3/2}\, dW_t.$$
Integrating, we get
$$r_t^2 = r_0^2 + \int_0^t \big[ (2ab + \sigma^2) r_s - 2a r_s^2 \big]\, ds + 2\sigma \int_0^t r_s^{3/2}\, dW_s.$$

Taking the expectation yields
$$\mu_2(t) = r_0^2 + \int_0^t \big[ (2ab + \sigma^2)\, \mu(s) - 2a\, \mu_2(s) \big]\, ds.$$
Differentiating again,
$$\mu_2'(t) = (2ab + \sigma^2)\, \mu(t) - 2a\, \mu_2(t).$$
Solving as a linear differential equation in $\mu_2(t)$ yields
$$d\big( \mu_2(t)\, e^{2at} \big) = (2ab + \sigma^2)\, e^{2at} \mu(t).$$
Substituting the value of $\mu(t)$ and integrating yields
$$\mu_2(t)\, e^{2at} = r_0^2 + (2ab + \sigma^2) \Big[ \frac{b}{2a} \big( e^{2at} - 1 \big) + \frac{r_0 - b}{a} \big( e^{at} - 1 \big) \Big].$$
Hence the second moment has the formula
$$\mu_2(t) = r_0^2\, e^{-2at} + (2ab + \sigma^2) \Big[ \frac{b}{2a} \big( 1 - e^{-2at} \big) + \frac{r_0 - b}{a} \big( 1 - e^{-at} \big)\, e^{-at} \Big].$$
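Since the CIR process has no simple closed-form path, it is usually simulated with an Euler scheme in which the argument of the square root is truncated at zero (a standard practical fix, assumed here). The sketch below checks the first moment $\mu(t)$ against the formula above:

import numpy as np

# Euler scheme for dr_t = a(b - r_t) dt + sigma sqrt(r_t) dW_t, with the
# square-root argument truncated at zero.
rng = np.random.default_rng(7)
r0, a, b, sigma, t = 12.0, 3.0, 10.0, 0.15, 1.0    # values as in Figure 9.2
n_steps, n_paths = 1000, 50_000
dt = t / n_steps
r = np.full(n_paths, r0)
for _ in range(n_steps):
    r += a * (b - r) * dt \
         + sigma * np.sqrt(np.maximum(r, 0.0) * dt) * rng.standard_normal(n_paths)
print(r.mean(), b + (r0 - b) * np.exp(-a * t))     # sample mean vs mu(t)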

Exercise 9.4.4 Use a similar method to find a recursive formula for the moments of a CIR
process.

9.5 No-arbitrage Models


In the following models the drift rate is a function of time, which is chosen such that the model
is consistent with the term structure.

9.5.1 The Ho and Lee Model


The continuous-time limit of the Ho and Lee model for the spot rates is
$$dr_t = \theta(t)\, dt + \sigma\, dW_t.$$
In this model $\theta(t)$ is the average direction in which $r_t$ moves, considered independent of $r_t$, while $\sigma$ is the standard deviation of the short rate. The solution process is Gaussian and is given by
$$r_t = r_0 + \int_0^t \theta(s)\, ds + \sigma W_t.$$
If $F(0, t)$ denotes the forward rate at time $t$ as seen at time $0$, it is known that $\theta(t) = \partial_t F(0, t) + \sigma^2 t$. Since $F(0, 0) = r_0$, the solution becomes
$$r_t = F(0, t) + \frac{\sigma^2 t^2}{2} + \sigma W_t.$$

9.5.2 The Hull and White Model


This is an extension of Vasicek's model which incorporates mean reversion:
$$dr_t = \big( \theta(t) - a r_t \big)\, dt + \sigma\, dW_t,$$
with $a$ and $\sigma$ constants. We can solve the equation by multiplying by the integrating factor $e^{at}$:
$$d\big( e^{at} r_t \big) = \theta(t)\, e^{at}\, dt + \sigma e^{at}\, dW_t.$$
Integrating between $0$ and $t$ yields
$$r_t = r_0\, e^{-at} + e^{-at} \int_0^t \theta(s)\, e^{as}\, ds + \sigma e^{-at} \int_0^t e^{as}\, dW_s. \tag{9.5.9}$$
Since the first two terms are deterministic and the last is a Wiener integral, the process $r_t$ is Gaussian.
The function $\theta(t)$ can be calculated from the term structure as
$$\theta(t) = \partial_t F(0, t) + a F(0, t) + \frac{\sigma^2}{2a} \big( 1 - e^{-2at} \big).$$
Then
$$\int_0^t \theta(s)\, e^{as}\, ds = \int_0^t \partial_s F(0, s)\, e^{as}\, ds + a \int_0^t F(0, s)\, e^{as}\, ds
 + \frac{\sigma^2}{2a} \int_0^t \big( 1 - e^{-2as} \big) e^{as}\, ds$$
$$= F(0, t)\, e^{at} - r_0 + \frac{\sigma^2}{a^2} \big( \cosh(at) - 1 \big),$$

where we used that $F(0, 0) = r_0$. The deterministic part of $r_t$ becomes
$$r_0\, e^{-at} + e^{-at} \int_0^t \theta(s)\, e^{as}\, ds = F(0, t) + \frac{\sigma^2}{a^2}\, e^{-at} \big( \cosh(at) - 1 \big).$$
An algebraic manipulation shows that
$$e^{-at} \big( \cosh(at) - 1 \big) = \frac{1}{2} \big( 1 - e^{-at} \big)^2.$$
Substituting in (9.5.9) yields
$$r_t = F(0, t) + \frac{\sigma^2}{2a^2} \big( 1 - e^{-at} \big)^2 + \sigma e^{-at} \int_0^t e^{as}\, dW_s.$$

The mean and variance are
$$E[r_t] = F(0, t) + \frac{\sigma^2}{2a^2} \big( 1 - e^{-at} \big)^2,$$
$$Var[r_t] = \sigma^2 e^{-2at}\, Var\Big[ \int_0^t e^{as}\, dW_s \Big] = \sigma^2 e^{-2at} \int_0^t e^{2as}\, ds
 = \frac{\sigma^2}{2a} \big( 1 - e^{-2at} \big).$$

9.6 Nonstationary Models


9.6.1 Black, Derman and Toy Model
The binomial tree of Black, Derman and Toy is equivalent to the following continuous model of the short-time rate:
$$d(\ln r_t) = \Big[ \theta(t) + \frac{\sigma'(t)}{\sigma(t)} \ln r_t \Big]\, dt + \sigma(t)\, dW_t.$$
Making the substitution $u_t = \ln r_t$, we obtain a linear equation in $u_t$:
$$du_t = \Big[ \theta(t) + \frac{\sigma'(t)}{\sigma(t)}\, u_t \Big]\, dt + \sigma(t)\, dW_t.$$
The equation can be written in the equivalent way
$$\frac{\sigma(t)\, du_t - d\sigma(t)\, u_t}{\sigma^2(t)} = \frac{\theta(t)}{\sigma(t)}\, dt + dW_t,$$
which, after using the quotient rule, becomes
$$d\Big[ \frac{u_t}{\sigma(t)} \Big] = \frac{\theta(t)}{\sigma(t)}\, dt + dW_t.$$
Integrating and solving for $u_t$ leads to
$$u_t = \sigma(t)\, \frac{u_0}{\sigma(0)} + \sigma(t) \int_0^t \frac{\theta(s)}{\sigma(s)}\, ds + \sigma(t)\, W_t.$$

This implies that $u_t$ is Gaussian, and hence $r_t = e^{u_t}$ is log-normal for each $t$. Using $u_0 = \ln r_0$ and
$$e^{\frac{\sigma(t)}{\sigma(0)} u_0} = e^{\frac{\sigma(t)}{\sigma(0)} \ln r_0} = r_0^{\sigma(t)/\sigma(0)},$$
we obtain the following explicit formula for the spot rate:
$$r_t = r_0^{\sigma(t)/\sigma(0)}\, e^{\sigma(t) \int_0^t \frac{\theta(s)}{\sigma(s)}\, ds}\, e^{\sigma(t) W_t}.$$

Since $\sigma(t) W_t$ is normally distributed with mean $0$ and variance $\sigma^2(t)\, t$, the log-normal variable $e^{\sigma(t) W_t}$ has
$$E[e^{\sigma(t) W_t}] = e^{\sigma^2(t)\, t/2},$$
$$Var[e^{\sigma(t) W_t}] = e^{\sigma^2(t)\, t} \big( e^{\sigma^2(t)\, t} - 1 \big).$$
Hence
$$E[r_t] = r_0^{\sigma(t)/\sigma(0)}\, e^{\sigma(t) \int_0^t \frac{\theta(s)}{\sigma(s)}\, ds}\, e^{\sigma^2(t)\, t/2},$$
$$Var[r_t] = r_0^{2\sigma(t)/\sigma(0)}\, e^{2\sigma(t) \int_0^t \frac{\theta(s)}{\sigma(s)}\, ds}\, e^{\sigma^2(t)\, t} \big( e^{\sigma^2(t)\, t} - 1 \big).$$

9.6.2 Black and Karasinski Model


In this model the spot rates follow the more general model
$$d(\ln r_t) = \big[ \theta(t) - a(t) \ln r_t \big]\, dt + \sigma(t)\, dW_t.$$
We arrive at this model if we substitute
$$\sigma(t) = \sigma(0)\, e^{-\int_0^t a(s)\, ds}$$
in the Black, Derman and Toy model.

Exercise 9.6.1 Find the explicit formula for rt in this case.


Chapter 10

Modeling Stock Prices

The price of a stock can be modeled by a continuous stochastic process which is the sum
between a predictable and an unpredictable part. However, this type of model does not take
into account market crashes. If those are to be taken into consideration, the stock price needs
to contain a third component which models unexpected jumps. We shall discuss these models
in the following.

10.1 Constant Drift and Volatility Model


Let $S_t$ denote the price of a stock at time $t$. If $\mathcal{F}_t$ denotes the information set at time $t$, then $S_t$ is a continuous process that is $\mathcal{F}_t$-adapted. The return on the stock during the time interval $\Delta t$ measures the percentage increase in the stock price between the instances $t$ and $t + \Delta t$, and is given by $\frac{S_{t+\Delta t} - S_t}{S_t}$. When $\Delta t$ is infinitesimally small, we obtain the instantaneous return, $\frac{dS_t}{S_t}$. This is supposed to be the sum of two components:
• the predictable part $\mu\, dt$;
• the noisy part due to unexpected news, $\sigma\, dW_t$.
Adding these parts yields
$$\frac{dS_t}{S_t} = \mu\, dt + \sigma\, dW_t,$$
which leads to the stochastic equation
$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t. \tag{10.1.1}$$
The parameters $\mu$ and $\sigma$ are positive constants which represent the drift and volatility of the stock. This equation has been solved in Example 7.7.2 by applying the method of variation of parameters. The solution is
$$S_t = S_0\, e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t}, \tag{10.1.2}$$
where $S_0$ denotes the price of the stock at time $t = 0$. It is worth noting that the stock price is $\mathcal{F}_t$-adapted, positive, and has a log-normal distribution. Using Exercise 1.6.1, the mean and variance are
$$E[S_t] = S_0\, e^{\mu t}, \tag{10.1.3}$$
$$Var[S_t] = S_0^2\, e^{2\mu t} \big( e^{\sigma^2 t} - 1 \big). \tag{10.1.4}$$



Figure 10.1: Two distinct simulations for the stochastic equation dSt = 0.15St dt + 0.07St dWt ,
with S0 = 1.

See Fig.10.1 for two simulations of the stock price.
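Paths such as those in Fig. 10.1 can be generated exactly from the closed-form solution (10.1.2); the time grid in the sketch below is an assumption, since the figure's axis units are not specified:

import numpy as np
import matplotlib.pyplot as plt

# Exact simulation of S_t = S0 exp((mu - sigma^2/2) t + sigma W_t).
rng = np.random.default_rng(8)
S0, mu, sigma = 1.0, 0.15, 0.07                 # values from Figure 10.1
n_steps = 350                                   # assumed grid over one year
dt = 1.0 / n_steps
t = np.arange(1, n_steps + 1) * dt
for _ in range(2):                              # two distinct paths
    W = np.cumsum(np.sqrt(dt) * rng.standard_normal(n_steps))
    plt.plot(t, S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W))
plt.show()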

Exercise 10.1.1 Let Fu be the information set at time u. Find E[St |Fu ] and V ar[St |Fu ] for
u ≤ t.

Exercise 10.1.2 Find the stochastic process followed by ln St . What are the values of E[ln(St )]
and V ar[ln(St )]? What happens when s = t?

Exercise 10.1.3 Find the stochastic differential equations associated with the following processes:
(a) $\dfrac{1}{S_t}$;  (b) $S_t^n$;  (c) $\ln S_t$;  (d) $(S_t - 1)^2$.
Exercise 10.1.4 Show that $E[S_t^2] = S_0^2\, e^{(2\mu + \sigma^2)t}$. Find a similar formula for $E[S_t^n]$, with $n$ a positive integer.

Exercise 10.1.5 (a) Find the correlation function ρt = Corr(St , Wt ).


(b) Find the expectation E[St Wt ].

The next result deals with the probability that the stock price reaches a certain barrier before another barrier.

Theorem 10.1.6 Let $S_u$ and $S_d$ be fixed, such that $S_d < S_0 < S_u$. The probability that the stock price $S_t$ hits the upper value $S_u$ before the lower value $S_d$ is
$$p = \frac{d^\gamma - 1}{d^\gamma - u^\gamma},$$
where $S_u/S_0 = u$, $S_d/S_0 = d$, and $\gamma = 1 - 2\mu/\sigma^2$.
Proof: Let $X_t = mt + W_t$. Theorem 3.1.13 provides
$$P(X_t \text{ goes up to } \alpha \text{ before down to } -\beta) = \frac{e^{2m\beta} - 1}{e^{2m\beta} - e^{-2m\alpha}}. \tag{10.1.5}$$

Choosing the following values for the parameters,
$$m = \frac{\mu}{\sigma} - \frac{\sigma}{2}, \qquad \alpha = \frac{\ln u}{\sigma}, \qquad \beta = -\frac{\ln d}{\sigma},$$
we have the sequence of identities
$$P(X_t \text{ goes up to } \alpha \text{ before down to } -\beta)$$
$$= P(\sigma X_t \text{ goes up to } \sigma\alpha \text{ before down to } -\sigma\beta)$$
$$= P(S_0 e^{\sigma X_t} \text{ goes up to } S_0 e^{\sigma\alpha} \text{ before down to } S_0 e^{-\sigma\beta})$$
$$= P(S_t \text{ goes up to } S_u \text{ before down to } S_d).$$
Using (10.1.5) yields
$$P(S_t \text{ goes up to } S_u \text{ before down to } S_d) = \frac{e^{2m\beta} - 1}{e^{2m\beta} - e^{-2m\alpha}}. \tag{10.1.6}$$
Since a computation shows that
$$e^{2m\beta} = e^{-(\frac{2\mu}{\sigma^2} - 1)\ln d} = d^{1 - \frac{2\mu}{\sigma^2}} = d^\gamma, \qquad
  e^{-2m\alpha} = e^{(-\frac{2\mu}{\sigma^2} + 1)\ln u} = u^{1 - \frac{2\mu}{\sigma^2}} = u^\gamma,$$
formula (10.1.6) becomes
$$P(S_t \text{ goes up to } S_u \text{ before down to } S_d) = \frac{d^\gamma - 1}{d^\gamma - u^\gamma},$$
which ends the proof.

Corollary 10.1.7 Let $S_u > S_0 > 0$ be fixed. Then
$$P(S_t \text{ hits } S_u) = \Big( \frac{S_0}{S_u} \Big)^{1 - \frac{2\mu}{\sigma^2}}.$$

Proof: Taking $d = 0$ implies $S_d = 0$. Since $S_t$ never reaches zero,
$$P(S_t \text{ hits } S_u) = P(S_t \text{ goes up to } S_u \text{ before down to } S_d = 0)
 = \frac{d^\gamma - 1}{d^\gamma - u^\gamma}\Big|_{d=0} = \frac{1}{u^\gamma} = \Big( \frac{S_0}{S_u} \Big)^{1 - \frac{2\mu}{\sigma^2}}.$$
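Theorem 10.1.6 can also be checked by Monte Carlo. The sketch below discretizes $\ln S_t$ (so barrier crossings are detected only approximately, at grid points) and compares the hit frequency with the closed-form probability; the barriers and parameters are illustrative:

import numpy as np

# Monte Carlo estimate of P(S hits S_u before S_d) vs (d^g - 1)/(d^g - u^g).
rng = np.random.default_rng(9)
S0, mu, sigma = 1.0, 0.10, 0.20                 # illustrative parameters
Su, Sd = 1.3, 0.8
u, d = Su / S0, Sd / S0
g = 1 - 2 * mu / sigma**2
n_paths, dt = 20_000, 1e-3
logS = np.full(n_paths, np.log(S0))
alive = np.ones(n_paths, dtype=bool)            # paths still between barriers
hit_up = np.zeros(n_paths, dtype=bool)
while alive.any():
    z = rng.standard_normal(alive.sum())
    logS[alive] += (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    up = alive & (logS >= np.log(Su))
    down = alive & (logS <= np.log(Sd))
    hit_up |= up
    alive &= ~(up | down)
print(hit_up.mean(), (d**g - 1) / (d**g - u**g))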

10.2 Time-dependent Drift and Volatility Model


This model considers the drift $\mu = \mu(t)$ and the volatility $\sigma = \sigma(t)$ to be deterministic functions of time. In this case the equation (10.1.1) becomes
$$dS_t = \mu(t) S_t\, dt + \sigma(t) S_t\, dW_t. \tag{10.2.7}$$



We shall solve the equation using the method of integrating factors presented in Section 7.8. Multiplying by the integrating factor
$$\rho_t = e^{-\int_0^t \sigma(s)\, dW_s + \frac{1}{2}\int_0^t \sigma^2(s)\, ds},$$
the equation (10.2.7) becomes $d(\rho_t S_t) = \rho_t\, \mu(t) S_t\, dt$. Substituting $Y_t = \rho_t S_t$ yields the deterministic equation $dY_t = \mu(t) Y_t\, dt$, with solution
$$Y_t = Y_0\, e^{\int_0^t \mu(s)\, ds}.$$
Going back to the variable $S_t = \rho_t^{-1} Y_t$, we obtain the closed-form solution of equation (10.2.7):
$$S_t = S_0\, e^{\int_0^t (\mu(s) - \frac{1}{2}\sigma^2(s))\, ds + \int_0^t \sigma(s)\, dW_s}.$$

Proposition 10.2.1 The solution $S_t$ is $\mathcal{F}_t$-adapted and log-normally distributed, with mean and variance given by
$$E[S_t] = S_0\, e^{\int_0^t \mu(s)\, ds},$$
$$Var[S_t] = S_0^2\, e^{2\int_0^t \mu(s)\, ds} \big( e^{\int_0^t \sigma^2(s)\, ds} - 1 \big).$$
Proof: Let $X_t = \int_0^t \big( \mu(s) - \frac{1}{2}\sigma^2(s) \big)\, ds + \int_0^t \sigma(s)\, dW_s$. Since $X_t$ is the sum of a deterministic integral and a Wiener integral, it is normally distributed (see Proposition 4.5.1), with
$$E[X_t] = \int_0^t \Big( \mu(s) - \frac{1}{2}\sigma^2(s) \Big)\, ds,$$
$$Var[X_t] = Var\Big[ \int_0^t \sigma(s)\, dW_s \Big] = \int_0^t \sigma^2(s)\, ds.$$
Using Exercise 1.6.1, the mean and variance of the log-normal random variable $S_t = S_0\, e^{X_t}$ are given by
$$E[S_t] = S_0\, e^{\int_0^t (\mu - \frac{\sigma^2}{2})\, ds + \frac{1}{2}\int_0^t \sigma^2\, ds} = S_0\, e^{\int_0^t \mu(s)\, ds},$$
$$Var[S_t] = S_0^2\, e^{2\int_0^t (\mu - \frac{1}{2}\sigma^2)\, ds + \int_0^t \sigma^2\, ds} \big( e^{\int_0^t \sigma^2\, ds} - 1 \big)
 = S_0^2\, e^{2\int_0^t \mu(s)\, ds} \big( e^{\int_0^t \sigma^2(s)\, ds} - 1 \big).$$

If the average drift and average squared volatility are defined as
$$\overline{\mu} = \frac{1}{t} \int_0^t \mu(s)\, ds, \qquad \overline{\sigma}^2 = \frac{1}{t} \int_0^t \sigma^2(s)\, ds,$$
the aforementioned formulas can also be written as
$$E[S_t] = S_0\, e^{\overline{\mu} t},$$
$$Var[S_t] = S_0^2\, e^{2\overline{\mu} t} \big( e^{\overline{\sigma}^2 t} - 1 \big).$$
It is worth noting that these are similar to formulas (10.1.3)-(10.1.4).



10.3 Models for Stock Price Averages


In this section we shall provide stochastic differential equations for several types of averages on
stocks. These averages are used as underlying assets in the case of Asian options.
Let $S_{t_1}, S_{t_2}, \dots, S_{t_n}$ be a sample of stock prices at $n$ instances of time $t_1 < t_2 < \dots < t_n$. The most common types of discrete averages are:
• The arithmetic average
$$A(t_1, t_2, \dots, t_n) = \frac{1}{n} \sum_{k=1}^n S_{t_k}.$$
• The geometric average
$$G(t_1, t_2, \dots, t_n) = \Big( \prod_{k=1}^n S_{t_k} \Big)^{1/n}.$$
• The harmonic average
$$H(t_1, t_2, \dots, t_n) = \frac{n}{\displaystyle\sum_{k=1}^n \frac{1}{S_{t_k}}}.$$
The well-known inequality of means states that we always have
$$H(t_1, t_2, \dots, t_n) \le G(t_1, t_2, \dots, t_n) \le A(t_1, t_2, \dots, t_n), \tag{10.3.8}$$
with equality in the case of constant stock prices.


In the following we shall obtain expressions for continuously sampled averages and find their associated stochastic equations.

The continuously sampled arithmetic average
Let $t_n = t$ and assume $t_{k+1} - t_k = \frac{t}{n}$. Using the definition of the integral as a limit of Riemann sums, we have
$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^n S_{t_k} = \lim_{n\to\infty} \frac{1}{t} \sum_{k=1}^n S_{t_k}\, \frac{t}{n} = \frac{1}{t} \int_0^t S_u\, du.$$
It follows that the continuously sampled arithmetic average of stock prices between $0$ and $t$ is given by
$$A_t = \frac{1}{t} \int_0^t S_u\, du.$$
Using the formula for $S_t$ we obtain
$$A_t = \frac{S_0}{t} \int_0^t e^{(\mu - \sigma^2/2)u + \sigma W_u}\, du.$$
This integral can be computed explicitly only in the case $\mu = 0$, see Application 6.0.18. It is worth noting that $A_t$ is neither normal nor log-normal, a fact which makes the price of Asian options on arithmetic averages hard to evaluate.

Let $I_t = \int_0^t S_u\, du$. The Fundamental Theorem of Calculus implies $dI_t = S_t\, dt$. Then the quotient rule yields
$$dA_t = d\Big( \frac{I_t}{t} \Big) = \frac{dI_t\, t - I_t\, dt}{t^2} = \frac{S_t\, t\, dt - I_t\, dt}{t^2} = \frac{1}{t} (S_t - A_t)\, dt,$$
i.e. the continuous arithmetic average $A_t$ satisfies
$$dA_t = \frac{1}{t} (S_t - A_t)\, dt.$$
If $A_t < S_t$, the right side is positive, so $dA_t > 0$, i.e. the average $A_t$ goes up. Similarly, if $A_t > S_t$, then the average $A_t$ goes down. This shows that the average $A_t$ tends to trace the stock values $S_t$.
By L'Hospital's rule we have
$$A_0 = \lim_{t \searrow 0} \frac{I_t}{t} = \lim_{t \searrow 0} S_t = S_0.$$

Using that the expectation commutes with integrals, we have
$$E[A_t] = \frac{1}{t} \int_0^t E[S_u]\, du = \frac{1}{t} \int_0^t S_0\, e^{\mu u}\, du = S_0\, \frac{e^{\mu t} - 1}{\mu t}.$$
Hence
$$E[A_t] = \begin{cases} S_0\, \dfrac{e^{\mu t} - 1}{\mu t}, & \text{if } t > 0 \\ S_0, & \text{if } t = 0. \end{cases} \tag{10.3.9}$$
In the following we shall compute the variance $Var[A_t]$. Since
$$Var[A_t] = \frac{1}{t^2} E[I_t^2] - E[A_t]^2, \tag{10.3.10}$$
it suffices to find $E[I_t^2]$. We need first the following result:

Lemma 10.3.1 (i) We have
$$E[I_t S_t] = \frac{S_0^2}{\mu + \sigma^2} \big[ e^{(2\mu + \sigma^2)t} - e^{\mu t} \big].$$
(ii) The processes $A_t$ and $S_t$ are not independent.


Proof: (i) Using Ito's formula,
$$d(I_t S_t) = dI_t\, S_t + I_t\, dS_t + dI_t\, dS_t
 = S_t^2\, dt + I_t (\mu S_t\, dt + \sigma S_t\, dW_t) + \underbrace{S_t\, dt\, dS_t}_{=0}
 = (S_t^2 + \mu I_t S_t)\, dt + \sigma I_t S_t\, dW_t.$$
Using $I_0 S_0 = 0$, integrating between $0$ and $t$ yields
$$I_t S_t = \int_0^t (S_u^2 + \mu I_u S_u)\, du + \sigma \int_0^t I_u S_u\, dW_u.$$

Since the expectation of the Ito integral is zero, we have
$$E[I_t S_t] = \int_0^t \big( E[S_u^2] + \mu E[I_u S_u] \big)\, du.$$
Using Exercise 10.1.4 this becomes
$$E[I_t S_t] = \int_0^t \Big( S_0^2\, e^{(2\mu + \sigma^2)u} + \mu E[I_u S_u] \Big)\, du.$$
If we denote $g(t) = E[I_t S_t]$, differentiating yields the ODE
$$g'(t) = S_0^2\, e^{(2\mu + \sigma^2)t} + \mu g(t),$$
with the initial condition $g(0) = 0$. This can be solved as a linear differential equation in $g(t)$ by multiplying by the integrating factor $e^{-\mu t}$. The solution is
$$g(t) = \frac{S_0^2}{\mu + \sigma^2} \big[ e^{(2\mu + \sigma^2)t} - e^{\mu t} \big].$$
(ii) Since $A_t$ and $I_t$ are proportional, it suffices to show that $I_t$ and $S_t$ are not independent. This follows from part (i) and the fact that
$$E[I_t S_t] \ne E[I_t]\, E[S_t] = \frac{S_0^2}{\mu} \big( e^{\mu t} - 1 \big)\, e^{\mu t}.$$

Next we shall find $E[I_t^2]$. Using $dI_t = S_t\, dt$, we have $(dI_t)^2 = 0$, and hence Ito's formula yields
$$d(I_t^2) = 2I_t\, dI_t + (dI_t)^2 = 2I_t S_t\, dt.$$
Integrating between $0$ and $t$ and using $I_0 = 0$ leads to
$$I_t^2 = 2 \int_0^t I_u S_u\, du.$$
Taking the expectation and using Lemma 10.3.1, we obtain
$$E[I_t^2] = 2 \int_0^t E[I_u S_u]\, du = \frac{2 S_0^2}{\mu + \sigma^2} \Big[ \frac{e^{(2\mu + \sigma^2)t} - 1}{2\mu + \sigma^2} - \frac{e^{\mu t} - 1}{\mu} \Big]. \tag{10.3.11}$$
Substituting in (10.3.10) yields
$$Var[A_t] = \frac{S_0^2}{t^2} \Big\{ \frac{2}{\mu + \sigma^2} \Big[ \frac{e^{(2\mu + \sigma^2)t} - 1}{2\mu + \sigma^2} - \frac{e^{\mu t} - 1}{\mu} \Big] - \frac{(e^{\mu t} - 1)^2}{\mu^2} \Big\}.$$
Summarizing the previous calculations, we have the following result:

Proposition 10.3.2 The arithmetic average $A_t$ satisfies the stochastic equation
$$dA_t = \frac{1}{t} (S_t - A_t)\, dt, \qquad A_0 = S_0.$$
Its mean and variance are given by
$$E[A_t] = S_0\, \frac{e^{\mu t} - 1}{\mu t}, \qquad t > 0,$$
$$Var[A_t] = \frac{S_0^2}{t^2} \Big\{ \frac{2}{\mu + \sigma^2} \Big[ \frac{e^{(2\mu + \sigma^2)t} - 1}{2\mu + \sigma^2} - \frac{e^{\mu t} - 1}{\mu} \Big] - \frac{(e^{\mu t} - 1)^2}{\mu^2} \Big\}.$$
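The mean formula can be checked by simulation, approximating $I_t$ by a Riemann sum along exact paths of $S_t$. A minimal sketch with illustrative parameters:

import numpy as np

# Check E[A_t] = S0 (e^{mu t} - 1) / (mu t) for the continuous arithmetic average.
rng = np.random.default_rng(10)
S0, mu, sigma, t = 1.0, 0.10, 0.30, 1.0          # illustrative values
n_steps, n_paths = 500, 20_000
dt = t / n_steps
W = np.cumsum(np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
u = np.arange(1, n_steps + 1) * dt
S = S0 * np.exp((mu - 0.5 * sigma**2) * u + sigma * W)
A_t = S.mean(axis=1)                              # (1/t) int_0^t S_u du, approx.
print(A_t.mean(), S0 * (np.exp(mu * t) - 1) / (mu * t))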

Exercise 10.3.3 Find approximate formulas for $E[A_t]$ and $Var[A_t]$ for small $t$, up to the order $O(t^2)$.

The continuously sampled geometric average
Dividing the interval $(0, t)$ into equal subintervals of length $t_{k+1} - t_k = \frac{t}{n}$, we have
$$G(t_1, \dots, t_n) = \Big( \prod_{k=1}^n S_{t_k} \Big)^{1/n} = e^{\ln \big( \prod_{k=1}^n S_{t_k} \big)^{1/n}}
 = e^{\frac{1}{n} \sum_{k=1}^n \ln S_{t_k}} = e^{\frac{1}{t} \sum_{k=1}^n \ln S_{t_k}\, \frac{t}{n}}.$$
Using the definition of the integral as a limit of Riemann sums,
$$G_t = \lim_{n\to\infty} \Big( \prod_{k=1}^n S_{t_k} \Big)^{1/n}
 = \lim_{n\to\infty} e^{\frac{1}{t} \sum_{k=1}^n \ln S_{t_k}\, \frac{t}{n}} = e^{\frac{1}{t} \int_0^t \ln S_u\, du}.$$
Therefore the continuously sampled geometric average of stock prices between the instances $0$ and $t$ is given by
$$G_t = e^{\frac{1}{t} \int_0^t \ln S_u\, du}. \tag{10.3.12}$$

Theorem 10.3.4 $G_t$ has a log-normal distribution, with mean and variance given by
$$E[G_t] = S_0\, e^{(\mu - \frac{\sigma^2}{6})\frac{t}{2}},$$
$$Var[G_t] = S_0^2\, e^{(\mu - \frac{\sigma^2}{6})t} \big( e^{\frac{\sigma^2 t}{3}} - 1 \big).$$

Proof: Using
$$\ln S_u = \ln \big( S_0\, e^{(\mu - \frac{\sigma^2}{2})u + \sigma W_u} \big) = \ln S_0 + \Big( \mu - \frac{\sigma^2}{2} \Big) u + \sigma W_u,$$
taking the logarithm in (10.3.12) yields
$$\ln G_t = \frac{1}{t} \int_0^t \Big[ \ln S_0 + \Big( \mu - \frac{\sigma^2}{2} \Big) u + \sigma W_u \Big]\, du
 = \ln S_0 + \Big( \mu - \frac{\sigma^2}{2} \Big) \frac{t}{2} + \frac{\sigma}{t} \int_0^t W_u\, du.$$
Since the integrated Brownian motion $Z_t = \int_0^t W_u\, du$ is Gaussian with $Z_t \sim N(0, t^3/3)$, it follows that $\ln G_t$ has the normal distribution
$$\ln G_t \sim N\Big( \ln S_0 + \Big( \mu - \frac{\sigma^2}{2} \Big) \frac{t}{2},\ \frac{\sigma^2 t}{3} \Big). \tag{10.3.13}$$

This implies that $G_t$ has a log-normal distribution. Using Exercise 1.6.1, we obtain
$$E[G_t] = e^{E[\ln G_t] + \frac{1}{2} Var[\ln G_t]} = e^{\ln S_0 + (\mu - \frac{\sigma^2}{2})\frac{t}{2} + \frac{\sigma^2 t}{6}} = S_0\, e^{(\mu - \frac{\sigma^2}{6})\frac{t}{2}},$$
$$Var[G_t] = e^{2E[\ln G_t] + Var[\ln G_t]} \big( e^{Var[\ln G_t]} - 1 \big)
 = S_0^2\, e^{(\mu - \frac{\sigma^2}{6})t} \big( e^{\frac{\sigma^2 t}{3}} - 1 \big).$$

Corollary 10.3.5 The geometric average $G_t$ is given by the closed-form formula
$$G_t = S_0\, e^{(\mu - \frac{\sigma^2}{2})\frac{t}{2} + \frac{\sigma}{t} \int_0^t W_u\, du}.$$
An important consequence of the fact that $G_t$ is log-normal is that Asian options on geometric averages have closed-form solutions.
Exercise 10.3.6 (a) Show that $\ln G_t$ satisfies the stochastic differential equation
$$d(\ln G_t) = \frac{1}{t} \big( \ln S_t - \ln G_t \big)\, dt.$$
(b) Show that $G_t$ satisfies
$$dG_t = \frac{1}{t}\, G_t \big( \ln S_t - \ln G_t \big)\, dt.$$

The continuously sampled harmonic average
Let $S_{t_k}$ be the values of a stock evaluated at the sampling dates $t_k$, $k = 1, \dots, n$. Their harmonic average is defined by
$$H(t_1, \dots, t_n) = \frac{n}{\displaystyle\sum_{k=1}^n \frac{1}{S_{t_k}}}.$$
Consider $t_k = \frac{kt}{n}$. Then the continuously sampled harmonic average is obtained by taking the limit $n \to \infty$ in the aforementioned relation:
$$\lim_{n\to\infty} \frac{n}{\displaystyle\sum_{k=1}^n \frac{1}{S_{t_k}}}
 = \lim_{n\to\infty} \frac{t}{\displaystyle\sum_{k=1}^n \frac{1}{S_{t_k}}\, \frac{t}{n}}
 = \frac{t}{\displaystyle\int_0^t \frac{1}{S_u}\, du}.$$
Hence the continuously sampled harmonic average is defined by
$$H_t = \frac{t}{\displaystyle\int_0^t \frac{1}{S_u}\, du}.$$
We may also write $H_t = \dfrac{t}{I_t}$, where $I_t = \int_0^t \frac{1}{S_u}\, du$ satisfies
$$dI_t = \frac{1}{S_t}\, dt, \qquad I_0 = 0, \qquad d\Big( \frac{1}{I_t} \Big) = -\frac{1}{S_t I_t^2}\, dt.$$

From L'Hospital's rule we get
$$H_0 = \lim_{t \searrow 0} H_t = \lim_{t \searrow 0} \frac{t}{I_t} = S_0.$$
Using the product rule we obtain the stochastic equation followed by the continuous harmonic average $H_t$:
$$dH_t = t\, d\Big( \frac{1}{I_t} \Big) + \frac{1}{I_t}\, dt + dt\, d\Big( \frac{1}{I_t} \Big)
 = \frac{1}{I_t} \Big( 1 - \frac{t}{I_t S_t} \Big)\, dt = \frac{1}{t}\, H_t \Big( 1 - \frac{H_t}{S_t} \Big)\, dt,$$
so
$$dH_t = \frac{1}{t}\, H_t \Big( 1 - \frac{H_t}{S_t} \Big)\, dt. \tag{10.3.14}$$
If at the instance $t$ we have $H_t < S_t$, it follows from the equation that $dH_t > 0$, i.e. the harmonic average increases. Similarly, if $H_t > S_t$, then $dH_t < 0$, i.e. $H_t$ decreases. It is worth noting that the converses are also true. The random variable $H_t$ is neither normally nor log-normally distributed.
Exercise 10.3.7 Show that $\dfrac{H_t}{t}$ is a decreasing function of $t$. What is its limit as $t \to \infty$?

Exercise 10.3.8 Show that the continuous analog of inequality (10.3.8) is
$$H_t \le G_t \le A_t.$$

Exercise 10.3.9 Consider the power $\alpha$ of the arithmetic average of $S_{t_k}^\alpha$:
$$A^\alpha(t_1, \dots, t_n) = \Big[ \frac{\sum_{k=1}^n S_{t_k}^\alpha}{n} \Big]^\alpha.$$
(i) Show that the aforementioned expression tends to
$$A_t^\alpha = \Big[ \frac{1}{t} \int_0^t S_u^\alpha\, du \Big]^\alpha$$
as $n \to \infty$.
(ii) Find the stochastic differential equation satisfied by $A_t^\alpha$.
(iii) What does $A_t^\alpha$ become in the particular cases $\alpha = \pm 1$?

Exercise 10.3.10 The stochastic average of stock prices between $0$ and $t$ is defined by
$$X_t = \frac{1}{t} \int_0^t S_u\, dW_u,$$
where $W_u$ is a Brownian motion process.
(i) Find $dX_t$, $E[X_t]$ and $Var[X_t]$.
(ii) Show that $\sigma X_t = R_t - \mu A_t$, where $R_t = \dfrac{S_t - S_0}{t}$ is the "raw average" and $A_t = \dfrac{1}{t}\int_0^t S_u\, du$ is the continuous arithmetic average.

10.4 Stock Prices with Rare Events


In order to model the stock price when rare events are taken into account, we shall combine
the effect of two stochastic processes:

• the Brownian motion process Wt , which models regular events given by infinitesimal
changes in the price, and which is a continuous process;
• the Poisson process Nt , which is discontinuous and models sporadic jumps in the stock
price that correspond to shocks in the market.

Since $E[dN_t] = \lambda\, dt$, the Poisson process $N_t$ has a positive drift, and we need to "compensate" it by subtracting $\lambda t$ from $N_t$. The resulting process $M_t = N_t - \lambda t$ is a martingale, called the compensated Poisson process, which models unpredictable jumps of size 1 at a constant rate $\lambda$. It is worth noting that the processes $W_t$ and $M_t$ involved in modeling the stock price are assumed to be independent.
To set up the model, we assume the instantaneous return on the stock, $\frac{dS_t}{S_t}$, to be the sum of the following three components:
• the predictable part $\mu\, dt$;
• the noisy part due to unexpected news, $\sigma\, dW_t$;
• the rare-events part due to unexpected jumps, $\rho\, dM_t$,
where $\mu$, $\sigma$ and $\rho$ are constants, corresponding to the drift rate of the stock, its volatility, and the jump size.
Adding these yields
$$\frac{dS_t}{S_t} = \mu\, dt + \sigma\, dW_t + \rho\, dM_t.$$
Hence the dynamics of a stock price, subject to rare events, is modeled by the following stochastic differential equation:
$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t + \rho S_t\, dM_t. \tag{10.4.15}$$
It is worth noting that in the case of zero jumps, $\rho = 0$, the previous equation becomes the classical stochastic equation (10.1.1).
Using that $W_t$ and $M_t$ are martingales, we have
$$E[\rho S_t\, dM_t | \mathcal{F}_t] = \rho S_t\, E[dM_t | \mathcal{F}_t] = 0, \qquad
  E[\sigma S_t\, dW_t | \mathcal{F}_t] = \sigma S_t\, E[dW_t | \mathcal{F}_t] = 0.$$
This shows the unpredictability of the last two terms: given the information set $\mathcal{F}_t$ at time $t$, it is not possible to predict any future increments in the next interval of time $dt$. The term $\sigma S_t\, dW_t$ captures regular events of insignificant size, while $\rho S_t\, dM_t$ captures rare events of large size. The "rare events" term, $\rho S_t\, dM_t$, incorporates jumps proportional to the stock price and is given in terms of the Poisson process $N_t$ as
$$\rho S_t\, dM_t = \rho S_t\, d(N_t - \lambda t) = \rho S_t\, dN_t - \lambda \rho S_t\, dt.$$
Substituting in equation (10.4.15) yields
$$dS_t = (\mu - \lambda \rho) S_t\, dt + \sigma S_t\, dW_t + \rho S_t\, dN_t.$$



The constant $\lambda$ represents the rate at which the jumps of the Poisson process $N_t$ occur. This is the same as the rate of rare events in the market, and can be determined from historical data. The following result provides an explicit solution for the stock price when rare events are taken into account.

Proposition 10.4.1 The solution of the stochastic equation (10.4.15) is given by
$$S_t = S_0\, e^{\big( \mu - \frac{1}{2}\sigma^2 + \lambda [\sqrt{1 + 2\rho} - \rho - 1] \big) t + \sigma W_t + (\sqrt{1 + 2\rho} - 1) M_t},$$
where
$\mu$ is the stock price drift rate;
$\sigma$ is the volatility of the stock;
$\lambda$ is the rate at which rare events occur;
$\rho$ is the size of the jump in the expected return when a rare event occurs.

Proof: We shall look for a solution of the type $S_t = S_0\, e^{\Phi(t, W_t, M_t)}$, with $\Phi(t, W_t, M_t) = \alpha(t) + \beta(W_t) + \gamma(M_t)$. Using $(dM_t)^2 = dN_t = dM_t + \lambda\, dt$, see (2.8.13), and applying Ito's formula, we have
$$dS_t = S_t\, \alpha'(t)\, dt + S_t\, \beta'(W_t)\, dW_t + S_t\, \gamma'(M_t)\, dM_t
 + \frac{1}{2} S_t \big[ \beta'(W_t)^2 + \beta''(W_t) \big]\, dt
 + \frac{1}{2} S_t \big[ \gamma'(M_t)^2 + \gamma''(M_t) \big] (dM_t + \lambda\, dt)$$
$$= S_t \Big[ \alpha'(t) + \frac{1}{2}\beta'(W_t)^2 + \frac{1}{2}\beta''(W_t) + \frac{\lambda}{2}\gamma'(M_t)^2 + \frac{\lambda}{2}\gamma''(M_t) \Big]\, dt
 + S_t\, \beta'(W_t)\, dW_t
 + S_t \Big[ \gamma'(M_t) + \frac{1}{2}\gamma'(M_t)^2 + \frac{1}{2}\gamma''(M_t) \Big]\, dM_t.$$
Equating the coefficients of $dt$, $dW_t$ and $dM_t$ against the corresponding coefficients of the equation (10.4.15) yields
$$\mu = \alpha'(t) + \frac{1}{2}\beta'(W_t)^2 + \frac{1}{2}\beta''(W_t) + \frac{\lambda}{2}\gamma'(M_t)^2 + \frac{\lambda}{2}\gamma''(M_t),$$
$$\sigma = \beta'(W_t),$$
$$\rho = \gamma'(M_t) + \frac{1}{2}\gamma'(M_t)^2 + \frac{1}{2}\gamma''(M_t).$$
From the second equation we get $\beta(W_t) = \sigma W_t$, and substituting in the first one yields
$$\alpha'(t) = \mu - \frac{1}{2}\sigma^2 - \lambda\rho + \lambda \gamma'(M_t).$$
Since the left-side term is a function of $t$ and the right side is a function of $M_t$, there must be a separation constant $C$ such that
$$\alpha'(t) = C = \mu - \frac{1}{2}\sigma^2 - \lambda\rho + \lambda \gamma'(M_t). \tag{10.4.16}$$
2

Then $\alpha(t) = Ct$. In the following we shall find the value of the constant $C$. The second identity of (10.4.16) yields $\gamma'(M_t) = \kappa$, constant. Then the aforementioned formula for $\rho$ yields
$$\rho = \kappa + \frac{1}{2}\kappa^2.$$
Multiplying by $2$ and completing the square, we have
$$(\kappa + 1)^2 = 1 + 2\rho \;\Longrightarrow\; \kappa_{1,2} = -1 \pm \sqrt{1 + 2\rho}.$$
Substituting in (10.4.16) yields
$$C_{1,2} = \mu - \frac{1}{2}\sigma^2 - \lambda\rho + \lambda \big( -1 \pm \sqrt{1 + 2\rho} \big)
 = \mu - \frac{1}{2}\sigma^2 + \lambda \big[ \pm\sqrt{1 + 2\rho} - \rho - 1 \big].$$
In order to pick the correct sign, we let $\rho = 0$, i.e. we assume the jumps have size zero; then we need to recover $\alpha(t) = (\mu - \frac{1}{2}\sigma^2)t$. This implies the positive sign in front of the square root. Then
$$C = \mu - \frac{1}{2}\sigma^2 + \lambda \big[ \sqrt{1 + 2\rho} - \rho - 1 \big].$$
It is worth noting that $\sqrt{1 + 2\rho} - \rho - 1 \le 0$ for $\rho \ge 0$, with equality only for $\rho = 0$. It follows that
$$\alpha(t) = \Big( \mu - \frac{1}{2}\sigma^2 + \lambda \big[ \sqrt{1 + 2\rho} - \rho - 1 \big] \Big) t,
 \qquad \beta(W_t) = \sigma W_t, \qquad \gamma(M_t) = \big( \sqrt{1 + 2\rho} - 1 \big) M_t,$$
where we choose $\alpha(0) = \beta(0) = \gamma(0) = 0$. Hence
$$\Phi(t, W_t, M_t) = \Big( \mu - \frac{1}{2}\sigma^2 + \lambda \big[ \sqrt{1 + 2\rho} - \rho - 1 \big] \Big) t + \sigma W_t + \big( \sqrt{1 + 2\rho} - 1 \big) M_t.$$

Since the processes $W_t$ and $M_t$ are independent, we have
$$E[S_t] = S_0\, e^{\big( \mu - \frac{1}{2}\sigma^2 + \lambda [\sqrt{1 + 2\rho} - \rho - 1] \big) t}\, E[e^{\sigma W_t}]\, E[e^{(\sqrt{1 + 2\rho} - 1) M_t}]$$
$$= S_0\, e^{\big( \mu - \frac{1}{2}\sigma^2 + \lambda [\sqrt{1 + 2\rho} - \rho - 1] \big) t}\, e^{\frac{\sigma^2 t}{2}}\, e^{\lambda t \big[ e^{\sqrt{1 + 2\rho} - 1} - \sqrt{1 + 2\rho} \big]}$$
$$= S_0\, e^{\big[ \mu - \lambda(\rho + 1) + \lambda e^{\sqrt{1 + 2\rho} - 1} \big] t}. \tag{10.4.17}$$
We can write this as
$$E[S_t] = S_0\, e^{\mu t}\, e^{\lambda f(\rho) t},$$
with $f(\rho) = e^{\sqrt{1 + 2\rho} - 1} - \rho - 1$. Using that $f(0) = 0$ and
$$f'(\rho) = e^{\sqrt{1 + 2\rho} - 1}\, \frac{1}{\sqrt{1 + 2\rho}} - 1 > 0, \qquad \forall \rho > 0$$
(since $e^{x - 1} > x$ for $x = \sqrt{1 + 2\rho} > 1$), it follows that $E[S_t]$ increases as the size of the jumps $\rho$ increases.
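Formula (10.4.17) can be verified by sampling $W_t$ and $N_t$ directly in the closed-form solution of Proposition 10.4.1; $W_t$ and $M_t = N_t - \lambda t$ are sampled independently, as the model assumes. A sketch with illustrative parameters:

import numpy as np

# Check E[S_t] = S0 exp([mu - lam (rho + 1) + lam e^{sqrt(1+2 rho) - 1}] t).
rng = np.random.default_rng(11)
S0, mu, sigma, lam, rho, t = 1.0, 0.10, 0.20, 0.5, 0.3, 1.0
n = 2_000_000
W_t = np.sqrt(t) * rng.standard_normal(n)
M_t = rng.poisson(lam * t, n) - lam * t            # compensated Poisson at t
kappa = np.sqrt(1 + 2 * rho) - 1
drift = mu - 0.5 * sigma**2 + lam * (np.sqrt(1 + 2 * rho) - rho - 1)
S_t = S0 * np.exp(drift * t + sigma * W_t + kappa * M_t)
print(S_t.mean(), S0 * np.exp((mu - lam * (rho + 1) + lam * np.exp(kappa)) * t))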
Exercise 10.4.2 Find V ar[St ].
Exercise 10.4.3 Find E[ln St ] and V ar[ln St ].
Exercise 10.4.4 Compute the conditional expectation E[St |Fu ] for u < t.

10.5 Modeling other Asset Prices


Besides stocks, the underlying asset of a derivative can also be a stock index or a foreign currency. When using risk-neutral valuation for derivatives on a stock index that pays a continuous dividend yield at a rate $q$, the drift rate $\mu$ is replaced by $r - q$. In the case of a foreign currency that pays interest at the foreign interest rate $r_f$, the drift rate $\mu$ is replaced by $r - r_f$.
Chapter 11

Risk-Neutral Valuation

11.1 The Method of Risk-Neutral Valuation


This valuation method is based on the risk-neutral valuation principle, which states that the price of a derivative on an asset $S_t$ is not affected by the risk preferences of the market participants; so we may assume they have the same risk aversion, and in this case the valuation of the derivative price $f_t$ at time $t$ is done as follows:
1. Assume the expected return of the asset $S_t$ is the risk-free rate, $\mu = r$.
2. Calculate the expected payoff of the derivative as of time $t$, under condition 1.
3. Discount at the risk-free rate from time $T$ to time $t$.
The first two steps require considering the expectation as of time $t$ in a risk-neutral world. This expectation is denoted by $\widehat{E}_t[\,\cdot\,]$ and has the meaning of a conditional expectation $E[\,\cdot\,|\, \mathcal{F}_t, \mu = r]$. The method states that if a derivative has the payoff $f_T$, its price at any time $t$ prior to maturity $T$ is given by
$$f_t = e^{-r(T-t)}\, \widehat{E}_t[f_T].$$
The rate $r$ is considered constant, but the method can be easily adapted for time-dependent rates.
In the following we shall present explicit computations of the prices of the most common European-type derivatives using the risk-neutral valuation method.

11.2 Call option


Consider a call option with maturity date $T$ and strike price $K$, with the underlying stock price $S_t$ having constant volatility $\sigma > 0$. The payoff at maturity is $f_T = \max(S_T - K, 0)$. The price of the call at any prior time $0 \le t \le T$ is given by the expectation in a risk-neutral world
$$c(t) = \widehat{E}_t[e^{-r(T-t)} f_T] = e^{-r(T-t)}\, \widehat{E}_t[f_T]. \tag{11.2.1}$$
If we let $x = \ln(S_T/S_t)$, using the log-normality of the stock price in a risk-neutral world,
$$S_T = S_t\, e^{(r - \frac{\sigma^2}{2})(T-t) + \sigma W_{T-t}},$$

it follows that $x$ has the normal distribution
$$x \sim N\Big( \Big( r - \frac{\sigma^2}{2} \Big)(T - t),\ \sigma^2 (T - t) \Big).$$
Then the density function of $x$ is
$$p(x) = \frac{1}{\sigma\sqrt{2\pi(T-t)}}\, e^{-\frac{\big( x - (r - \frac{\sigma^2}{2})(T-t) \big)^2}{2\sigma^2(T-t)}}.$$

We can write the expectation as
$$\widehat{E}_t[f_T] = \widehat{E}_t[\max(S_T - K, 0)] = \widehat{E}_t[\max(S_t e^x - K, 0)]
 = \int_{-\infty}^\infty \max(S_t e^x - K, 0)\, p(x)\, dx
 = \int_{\ln(K/S_t)}^\infty (S_t e^x - K)\, p(x)\, dx = I_2 - I_1, \tag{11.2.2}$$
with the notations
$$I_1 = \int_{\ln(K/S_t)}^\infty K\, p(x)\, dx, \qquad I_2 = \int_{\ln(K/S_t)}^\infty S_t\, e^x\, p(x)\, dx.$$

With the substitution $y = \dfrac{x - (r - \frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}}$, the first integral becomes
$$I_1 = K \int_{\ln(K/S_t)}^\infty p(x)\, dx = K \int_{-d_2}^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\, dy
 = K \int_{-\infty}^{d_2} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\, dy = K N(d_2),$$
where
$$d_2 = \frac{\ln(S_t/K) + (r - \frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}},$$
and
$$N(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^u e^{-z^2/2}\, dz$$

denotes the standard normal distribution function.


Using the aforementioned substitution the second integral can be computed by completing
the square
Z ∞ Z ∞ √
1 1 2 σ2
I2 = St ex p(x) dx = St √ e− 2 y +yσ T −t+(r− 2 )(T −t) dx
ln(K/St ) −d2 2π
Z ∞ √
1 1 2
= St √ e− 2 (y−σ T −t) er(T −t) dy
−d2 2π
Z ∞ Z d1
1 − z2 1 z2
= St er(T −t) √
√ e 2 dz = S er(T −t)
t √ e− 2 dz
−d2 −σ T −t 2π −∞ 2π
= St er(T −t) N (d1 ),

where
$$d_1 = d_2 + \sigma\sqrt{T-t} = \frac{\ln(S_t/K) + (r + \frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}}.$$
Substituting back into (11.2.2) and then into (11.2.1) yields
$$c(t) = e^{-r(T-t)}(I_2 - I_1) = e^{-r(T-t)} \big[ S_t\, e^{r(T-t)} N(d_1) - K N(d_2) \big]
 = S_t N(d_1) - K e^{-r(T-t)} N(d_2).$$
We have obtained the well-known formula of Black and Scholes:

Proposition 11.2.1 The price of a European call option at time $t$ is given by
$$c(t) = S_t N(d_1) - K e^{-r(T-t)} N(d_2).$$
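The formula translates directly into code. A minimal sketch with illustrative inputs, where the standard normal distribution function $N$ is taken from scipy:

from math import exp, log, sqrt
from scipy.stats import norm

def bs_call(S_t: float, K: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes price of a European call; tau = T - t."""
    d1 = (log(S_t / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S_t * norm.cdf(d1) - K * exp(-r * tau) * norm.cdf(d2)

print(bs_call(S_t=100.0, K=105.0, r=0.05, sigma=0.2, tau=0.5))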

11.3 Cash-or-nothing
A financial security that pays 1 dollar if the stock price $S_T \ge K$, and 0 otherwise, is called a bet contract, or cash-or-nothing contract. The payoff can be written as
$$f_T = \begin{cases} 1, & \text{if } S_T \ge K \\ 0, & \text{if } S_T < K. \end{cases}$$
Substituting $S_T = e^{X_T}$, the payoff becomes
$$f_T = \begin{cases} 1, & \text{if } X_T \ge \ln K \\ 0, & \text{if } X_T < \ln K, \end{cases}$$

where $X_T = \ln S_T$ has the normal distribution
$$X_T \sim N\Big( \ln S_t + \Big( \mu - \frac{\sigma^2}{2} \Big)(T - t),\ \sigma^2 (T - t) \Big).$$
The expectation in the risk-neutral world as of time $t$ is
$$\widehat{E}_t[f_T] = E[f_T | \mathcal{F}_t, \mu = r] = \int_{-\infty}^\infty f_T(x)\, p(x)\, dx
 = \int_{\ln K}^{+\infty} \frac{1}{\sqrt{2\pi}\, \sigma\sqrt{T-t}}\, e^{-\frac{[x - \ln S_t - (r - \frac{\sigma^2}{2})(T-t)]^2}{2\sigma^2(T-t)}}\, dx$$
$$= \int_{-d_2}^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\, dy = \int_{-\infty}^{d_2} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\, dy = N(d_2),$$
where we used the substitution $y = \frac{x - \ln S_t - (r - \frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}}$ and the notation
$$d_2 = \frac{\ln S_t - \ln K + (r - \frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}},$$
as in the previous section. The price at time $t$ of a bet contract is
$$f_t = e^{-r(T-t)}\, \widehat{E}_t[f_T] = e^{-r(T-t)} N(d_2).$$

Exercise 11.3.1 Let $0 < K_1 < K_2$. Find the price of a financial derivative which pays 1 dollar at maturity if $K_1 \le S_T \le K_2$, and zero otherwise. This is a "box-bet", and its payoff is given by
$$f_T = \begin{cases} 1, & \text{if } K_1 \le S_T \le K_2 \\ 0, & \text{otherwise.} \end{cases}$$

Exercise 11.3.2 An asset-or-nothing contract pays $S_T$ if $S_T > K$ at maturity time $T$, and pays 0 otherwise. Show that the price of the contract at time $t$ is $f_t = S_t N(d_1)$.

Exercise 11.3.3 Find the price at time $t$ of a derivative which pays at maturity
$$f_T = \begin{cases} S_T^n, & \text{if } S_T \ge K \\ 0, & \text{otherwise.} \end{cases}$$

11.4 Log-contract
The financial security that pays $f_T = \ln S_T$ at maturity is called a log-contract. Since the stock price is log-normally distributed,
$$\ln S_T \sim N\Big( \ln S_t + \Big( \mu - \frac{\sigma^2}{2} \Big)(T - t),\ \sigma^2 (T - t) \Big),$$
the risk-neutral expectation at time $t$ is
$$\widehat{E}_t[f_T] = E[\ln S_T | \mathcal{F}_t, \mu = r] = \ln S_t + \Big( r - \frac{\sigma^2}{2} \Big)(T - t),$$
and hence the price of the log-contract is given by
$$f_t = e^{-r(T-t)}\, \widehat{E}_t[f_T] = e^{-r(T-t)} \Big( \ln S_t + \Big( r - \frac{\sigma^2}{2} \Big)(T - t) \Big).$$

Exercise 11.4.1 Find the price at time $t$ of a square log-contract whose payoff is given by $f_T = (\ln S_T)^2$.

11.5 Power-contract
The financial derivative which pays at maturity the $n$th power of the stock price, $S_T^n$, is called a power contract. Since $S_T$ has a log-normal distribution, with
$$S_T = S_t\, e^{(\mu - \frac{\sigma^2}{2})(T-t) + \sigma W_{T-t}},$$
the $n$th power of the stock, $S_T^n$, is also log-normally distributed, with
$$f_T = S_T^n = S_t^n\, e^{n(\mu - \frac{\sigma^2}{2})(T-t) + n\sigma W_{T-t}}.$$
Then the expectation at time $t$ in the risk-neutral world is
$$E[f_T | \mathcal{F}_t, \mu = r] = S_t^n\, e^{n(r - \frac{\sigma^2}{2})(T-t)}\, E[e^{n\sigma W_{T-t}}]
 = S_t^n\, e^{n(r - \frac{\sigma^2}{2})(T-t)}\, e^{\frac{1}{2} n^2 \sigma^2 (T-t)}.$$
The price of the power-contract is obtained by discounting to time $t$:
$$f_t = e^{-r(T-t)}\, E[f_T | \mathcal{F}_t, \mu = r]
 = S_t^n\, e^{-r(T-t)}\, e^{n(r - \frac{\sigma^2}{2})(T-t)}\, e^{\frac{1}{2} n^2 \sigma^2 (T-t)}
 = S_t^n\, e^{(n-1)(r + \frac{n\sigma^2}{2})(T-t)}.$$
It is worth noting that if $n = 1$, i.e. if the payoff is $f_T = S_T$, then the price of the contract at any time $t \le T$ is $f_t = S_t$, i.e. the stock price itself. This will be used shortly when valuing forward contracts.
In the case $n = 2$, i.e. if the contract pays $S_T^2$ at maturity, the price is $f_t = S_t^2\, e^{(r + \sigma^2)(T-t)}$.

Exercise 11.5.1 Let $n \ge 1$ be an integer. Find the price of a power call option whose payoff is given by $f_T = \max(S_T^n - K^n, 0)$.

11.6 Forward contract
A forward contract pays at maturity the difference between the stock price, $S_T$, and the delivery price of the asset, $K$. The price at time $t$ is
$$f_t = e^{-r(T-t)}\, \widehat{E}_t[S_T - K] = e^{-r(T-t)}\, \widehat{E}_t[S_T] - e^{-r(T-t)} K = S_t - e^{-r(T-t)} K,$$
where we used that $K$ is a constant and that $\widehat{E}_t[S_T] = S_t\, e^{r(T-t)}$.

Exercise 11.6.1 Let $n \in \{2, 3\}$. Find the price of the contract that pays $f_T = (S_T - K)^n$ at maturity.

11.7 The Superposition Principle
If the payoff of a derivative, $f_T$, can be written as a linear combination of payoffs,
$$f_T = \sum_{i=1}^n c_i\, h_{i,T},$$
with $c_i$ constants, then the price at time $t$ is given by
$$f_t = \sum_{i=1}^n c_i\, h_{i,t},$$
where $h_{i,t}$ is the price at time $t$ of a derivative that pays $h_{i,T}$ at maturity. We shall successfully use this method in situations where the payoff $f_T$ can be decomposed into simpler payoffs, for which we can evaluate the prices of the associated derivatives directly. In this case the price of the initial derivative, $f_t$, is obtained as a combination of the prices of the easier-to-value derivatives.
The reason underlying the aforementioned superposition principle is the linearity of the expectation operator $\widehat{E}$:
$$f_t = e^{-r(T-t)}\, \widehat{E}[f_T] = e^{-r(T-t)}\, \widehat{E}\Big[ \sum_{i=1}^n c_i\, h_{i,T} \Big]
 = e^{-r(T-t)} \sum_{i=1}^n c_i\, \widehat{E}[h_{i,T}] = \sum_{i=1}^n c_i\, h_{i,t}.$$

This principle is also connected with the absence of arbitrage opportunities in the market. Consider two portfolios of derivatives with equal values at the maturity time $T$:
$$\sum_{i=1}^n c_i\, h_{i,T} = \sum_{j=1}^m a_j\, g_{j,T}.$$
If we regard this common value as the payoff of a derivative, $f_T$, then by the aforementioned principle the portfolios have the same value at any prior time $t \le T$:
$$\sum_{i=1}^n c_i\, h_{i,t} = \sum_{j=1}^m a_j\, g_{j,t}.$$
The last identity can also be derived from the absence of arbitrage opportunities in the market: if there is a time $t$ at which the identity fails, then buying the cheaper portfolio and selling the more expensive one leads to an arbitrage profit.
The superposition principle can be used to price package derivatives such as spreads, straddles, strips, straps and strangles. We shall deal with these in the sequel.

11.8 Call Option
In the following we price a European call using the superposition principle. The payoff of a call option can be decomposed as
$$c_T = \max(S_T - K, 0) = h_{1,T} - K h_{2,T},$$
with
$$h_{1,T} = \begin{cases} S_T, & \text{if } S_T \ge K \\ 0, & \text{if } S_T < K, \end{cases} \qquad
  h_{2,T} = \begin{cases} 1, & \text{if } S_T \ge K \\ 0, & \text{if } S_T < K. \end{cases}$$
These are the payoffs of an asset-or-nothing and of a cash-or-nothing derivative. From Section 11.3 and Exercise 11.3.2 we have $h_{1,t} = S_t N(d_1)$ and $h_{2,t} = e^{-r(T-t)} N(d_2)$. By superposition we get the price of a call at time $t$:
$$c_t = h_{1,t} - K h_{2,t} = S_t N(d_1) - K e^{-r(T-t)} N(d_2).$$


Exercise 11.8.1 (a) Consider the payoff $h_{1,T} = \begin{cases} 1, & \text{if } S_T \le K \\ 0, & \text{if } S_T > K. \end{cases}$ Show that
$$h_{1,t} = e^{-r(T-t)} N(-d_2), \qquad t \le T.$$
(b) Consider the payoff $h_{2,T} = \begin{cases} S_T, & \text{if } S_T \le K \\ 0, & \text{if } S_T > K. \end{cases}$ Show that
$$h_{2,t} = S_t N(-d_1), \qquad t \le T.$$
(c) The payoff of a put is $p_T = \max(K - S_T, 0)$. Verify that
$$p_T = K h_{1,T} - h_{2,T},$$
and use the superposition principle to find the price $p_t$ of the put.

11.9 Asian Forward Contracts
Let $A_t$ denote the continuous arithmetic average of the asset price between $0$ and $t$. It sometimes makes sense for two parties to enter a contract in which one party pays the other at maturity time $T$ the difference between the average price of the asset, $A_T$, and a fixed delivery price $K$. The payoff of this contract is
$$f_T = A_T - K.$$
For instance, if the asset is electric energy or natural gas, it makes sense to make a deal on the average price of the asset, since such prices are volatile and can sometimes become unaffordably expensive during the winter season.
The risk-neutral expectation $\widehat{E}_t[A_T]$ is obtained from formula (10.3.9) by letting $\mu = r$ and replacing $t$ by $T - t$ and $S_0$ by $S_t$. Since $K$ is a constant, the price of the contract at time $t$ is given by
$$f_t = e^{-r(T-t)}\, \widehat{E}_t[f_T] = e^{-r(T-t)} \big( \widehat{E}_t[A_T] - K \big)
 = e^{-r(T-t)} \Big( S_t\, \frac{e^{r(T-t)} - 1}{r(T-t)} - K \Big)
 = S_t\, \frac{1 - e^{-r(T-t)}}{r(T-t)} - e^{-r(T-t)} K.$$

It is worth noting that the price of an Asian forward contract is always lower than the price of a usual forward contract on the asset. To see this, we substitute $x = r(T-t)$ in the inequality $e^{-x} > 1 - x$, $x > 0$, to get
$$\frac{1 - e^{-r(T-t)}}{r(T-t)} < 1.$$
This implies the inequality
$$S_t\, \frac{1 - e^{-r(T-t)}}{r(T-t)} - e^{-r(T-t)} K < S_t - e^{-r(T-t)} K.$$
Since the left side is the price of an Asian forward contract, while the right side is the price of a usual forward contract, we obtain the desired inequality.

Exercise 11.9.1 Find the price of a contract that pays at maturity date the difference between
the asset price and its arithmetic average, fT = ST − AT .

For theoretical purposes, one may also consider Asian forward contracts on the geometric average. This is a derivative that pays at maturity the difference fT = GT − K, where GT is the continuous geometric average of the asset price between 0 and T and K is a fixed delivery price.
Substituting µ = r, S0 by St, and t by T − t in the first relation provided by Theorem 10.3.4, the risk-neutral expectation of GT as of time t is
$$\widehat{E}_t[G_T] = S_t\, e^{\frac12\left(r - \frac{\sigma^2}{6}\right)(T-t)}.$$

The value at time t of a derivative that pays at maturity the geometric average, gT = GT, is
$$g_t = e^{-r(T-t)}\widehat{E}_t[G_T] = S_t\, e^{-\frac12\left(r + \frac{\sigma^2}{6}\right)(T-t)}.$$
Then the value of the forward contract on the geometric average becomes
$$f_t = g_t - e^{-r(T-t)}K = S_t\, e^{-\frac12\left(r + \frac{\sigma^2}{6}\right)(T-t)} - e^{-r(T-t)}K. \tag{11.9.3}$$

Exercise 11.9.2 Show that the contract (11.9.3) is cheaper than a usual forward contract on
the asset.

11.10 Asian Options


There are several types of Asian options, depending on how the payoff is related to the average stock price:
• Average Price options:
– Call: max(Save − K, 0)
– Put: max(K − Save, 0).
• Average Strike options:
– Call: max(ST − Save, 0)
– Put: max(Save − ST, 0).
The average asset price Save can be either the arithmetic or the geometric average of the asset price between 0 and T.
Geometric average price options  When the average used is the geometric average, GT, we can obtain closed-form formulas for average price options. In order to do this, we need the following two results. The first one concerns a cash-or-nothing type contract where the underlying is the geometric mean.

Lemma 11.10.1 The value at time t of a derivative, which pays at maturity $1 if the geometric average GT ≥ K and 0 otherwise, is given by
$$h_t = e^{-r(T-t)} N(\tilde d_2),$$
where
$$\tilde d_2 = \frac{\ln S_t - \ln K + \left(r - \frac{\sigma^2}{2}\right)\frac{T-t}{2}}{\sigma\sqrt{(T-t)/3}}.$$

Proof: The payoff can be written as
$$h_T = \begin{cases} 1, & \text{if } G_T \ge K\\ 0, & \text{if } G_T < K\end{cases} = \begin{cases} 1, & \text{if } X_T \ge \ln K\\ 0, & \text{if } X_T < \ln K,\end{cases}$$
where XT = ln GT has the normal distribution
$$X_T \sim N\Big(\ln S_t + \Big(\mu - \frac{\sigma^2}{2}\Big)\frac{T-t}{2},\ \frac{\sigma^2(T-t)}{3}\Big),$$

which was obtained by replacing S0 by St and t by T − t in formula (10.3.13). Let p(x) be the
probability density of the random variable XT
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma\sqrt{(T-t)/3}}\; e^{-\left[x-\ln S_t-\left(\mu-\frac{\sigma^2}{2}\right)\frac{T-t}{2}\right]^2 \big/ \left(\frac{2\sigma^2(T-t)}{3}\right)}. \tag{11.10.4}$$

The risk-neutral expectation of the payoff at time t is
$$\widehat{E}_t[h_T] = \int h_T(x)\,p(x)\,dx = \int_{\ln K}^{\infty} p(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma\sqrt{(T-t)/3}}\int_{\ln K}^{\infty} e^{-\left[x-\ln S_t-\left(r-\frac{\sigma^2}{2}\right)\frac{T-t}{2}\right]^2/\left(\frac{2\sigma^2(T-t)}{3}\right)}\,dx,$$

where µ was replaced by r. Substituting


$$y = \frac{x - \ln S_t - \left(r - \frac{\sigma^2}{2}\right)\frac{T-t}{2}}{\sigma\sqrt{(T-t)/3}}, \tag{11.10.5}$$

yields
$$\widehat{E}_t[h_T] = \frac{1}{\sqrt{2\pi}}\int_{-\tilde d_2}^{\infty} e^{-y^2/2}\,dy = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\tilde d_2} e^{-y^2/2}\,dy = N(\tilde d_2),$$
where
$$\tilde d_2 = \frac{\ln S_t - \ln K + \left(r - \frac{\sigma^2}{2}\right)\frac{T-t}{2}}{\sigma\sqrt{(T-t)/3}}.$$
Discounting at the risk-free interest rate yields the price at time t:
$$h_t = e^{-r(T-t)}\widehat{E}_t[h_T] = e^{-r(T-t)} N(\tilde d_2).$$

The following result deals with the price of a geometric average-or-nothing derivative.
Lemma 11.10.2 The value at time t of a derivative, which pays at maturity GT if GT ≥ K and 0 otherwise, is given by the formula
$$g_t = e^{-\frac12\left(r+\frac{\sigma^2}{6}\right)(T-t)}\, S_t\, N(\tilde d_1),$$
where
$$\tilde d_1 = \frac{\ln S_t - \ln K + \left(r + \frac{\sigma^2}{6}\right)\frac{T-t}{2}}{\sigma\sqrt{(T-t)/3}}.$$
Proof: Since the payoff can be written as
$$g_T = \begin{cases} G_T, & \text{if } G_T \ge K\\ 0, & \text{if } G_T < K\end{cases} = \begin{cases} e^{X_T}, & \text{if } X_T \ge \ln K\\ 0, & \text{if } X_T < \ln K,\end{cases}$$
the risk-neutral expectation of the payoff at time t is
$$\widehat{E}_t[g_T] = \int_{-\infty}^{\infty} g_T(x)\,p(x)\,dx = \int_{\ln K}^{\infty} e^{x}\,p(x)\,dx,$$

where p(x) is given by (11.10.4), with µ replaced by r. Using the substitution (11.10.5) and completing the square yields
$$\widehat{E}_t[g_T] = \frac{1}{\sqrt{2\pi}\,\sigma\sqrt{(T-t)/3}}\int_{\ln K}^{\infty} e^{x}\, e^{-\left[x-\ln S_t-\left(r-\frac{\sigma^2}{2}\right)\frac{T-t}{2}\right]^2/\left(\frac{2\sigma^2(T-t)}{3}\right)}\,dx = \frac{1}{\sqrt{2\pi}}\, S_t\, e^{\frac12\left(r-\frac{\sigma^2}{6}\right)(T-t)}\int_{-\tilde d_2}^{\infty} e^{-\frac12\left[y-\sigma\sqrt{(T-t)/3}\right]^2}\,dy.$$

If we let
$$\tilde d_1 = \tilde d_2 + \sigma\sqrt{(T-t)/3} = \frac{\ln S_t - \ln K + \left(r - \frac{\sigma^2}{2}\right)\frac{T-t}{2}}{\sigma\sqrt{(T-t)/3}} + \sigma\sqrt{(T-t)/3} = \frac{\ln S_t - \ln K + \left(r + \frac{\sigma^2}{6}\right)\frac{T-t}{2}}{\sigma\sqrt{(T-t)/3}},$$
the previous integral becomes, after substituting $z = y - \sigma\sqrt{(T-t)/3}$,
$$\frac{1}{\sqrt{2\pi}}\, S_t\, e^{\frac12\left(r-\frac{\sigma^2}{6}\right)(T-t)}\int_{-\tilde d_1}^{\infty} e^{-\frac12 z^2}\,dz = S_t\, e^{\frac12\left(r-\frac{\sigma^2}{6}\right)(T-t)}\, N(\tilde d_1).$$

Then the risk-neutral expectation of the payoff is
$$\widehat{E}_t[g_T] = S_t\, e^{\frac12\left(r-\frac{\sigma^2}{6}\right)(T-t)}\, N(\tilde d_1).$$
The value of the derivative at time t is obtained by discounting at the interest rate r:
$$g_t = e^{-r(T-t)}\widehat{E}_t[g_T] = e^{-r(T-t)}\, S_t\, e^{\frac12\left(r-\frac{\sigma^2}{6}\right)(T-t)}\, N(\tilde d_1) = e^{-\frac12\left(r+\frac{\sigma^2}{6}\right)(T-t)}\, S_t\, N(\tilde d_1).$$

Proposition 11.10.3 The value at time t of a geometric average price call option is
$$f_t = e^{-\frac12\left(r+\frac{\sigma^2}{6}\right)(T-t)}\, S_t\, N(\tilde d_1) - K e^{-r(T-t)} N(\tilde d_2).$$
Proof: Since the payoff fT = max(GT − K, 0) can be decomposed as
$$f_T = g_T - K h_T,$$
with
$$g_T = \begin{cases} G_T, & \text{if } G_T \ge K\\ 0, & \text{if } G_T < K,\end{cases}\qquad h_T = \begin{cases} 1, & \text{if } G_T \ge K\\ 0, & \text{if } G_T < K,\end{cases}$$
applying the superposition principle and Lemmas 11.10.1 and 11.10.2 yields
$$f_t = g_t - K h_t = e^{-\frac12\left(r+\frac{\sigma^2}{6}\right)(T-t)}\, S_t\, N(\tilde d_1) - K e^{-r(T-t)} N(\tilde d_2).$$
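The closed-form price in Proposition 11.10.3 is straightforward to evaluate numerically; the short Python sketch below is our own illustration (function name and parameter values are ours).

```python
import numpy as np
from scipy.stats import norm

def geometric_avg_price_call(S, K, r, sigma, tau):
    """Geometric average price call (Proposition 11.10.3), tau = T - t."""
    s = sigma * np.sqrt(tau / 3.0)        # volatility of ln G_T
    d2 = (np.log(S / K) + (r - 0.5 * sigma**2) * tau / 2.0) / s
    d1 = d2 + s
    disc = np.exp(-0.5 * (r + sigma**2 / 6.0) * tau)
    return disc * S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

print(geometric_avg_price_call(100.0, 95.0, 0.05, 0.30, 1.0))
```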

Exercise 11.10.4 Find the value at time t of a geometric average price put option.

Arithmetic average price options There is no simple closed-form solution for a call or
for a put on the arithmetic average At . However, there is an approximate solution based on
computing exactly the first two moments of the distribution of At , and applying the risk-neutral
valuation assuming that the distribution is log-normal with the same two moments. This idea
was developed by Turnbull and Wakeman (1991), and it works pretty well for volatilities up to
about 20%.
The following result provides the mean and variance of a normal distribution in terms of
the first two moments of the associated log-normal distribution.

Proposition 11.10.5 Let Y be a log-normally distributed random variable, having the first two moments given by
$$m_1 = E[Y],\qquad m_2 = E[Y^2].$$
Then ln Y has the normal distribution ln Y ∼ N(µ, σ²), with
$$\mu = \ln\frac{m_1^2}{\sqrt{m_2}},\qquad \sigma^2 = \ln\frac{m_2}{m_1^2}. \tag{11.10.6}$$
Proof: Using Exercise 1.6.1 we have
$$m_1 = E[Y] = e^{\mu + \frac{\sigma^2}{2}},\qquad m_2 = E[Y^2] = e^{2\mu + 2\sigma^2}.$$
Taking logarithms yields
$$\mu + \frac{\sigma^2}{2} = \ln m_1,\qquad 2\mu + 2\sigma^2 = \ln m_2.$$
Solving for µ and σ² yields (11.10.6).
Assume the arithmetic average At = It/t has a log-normal distribution. Then ln At = ln It − ln t is normal, so ln It is normal, and hence It is log-normally distributed. Since $I_T = \int_0^T S_u\,du$, using (10.3.11) yields
$$m_1 = E[I_T] = S_0\,\frac{e^{\mu T}-1}{\mu};$$
$$m_2 = E[I_T^2] = \frac{2S_0^2}{\mu+\sigma^2}\Big[\frac{e^{(2\mu+\sigma^2)T}-1}{2\mu+\sigma^2} - \frac{e^{\mu T}-1}{\mu}\Big].$$
Using Proposition 11.10.5, it follows that ln AT is normally distributed, with
$$\ln A_T \sim N\Big(\ln\frac{m_1^2}{\sqrt{m_2}} - \ln T,\ \ln\frac{m_2}{m_1^2}\Big). \tag{11.10.7}$$

Relation (11.10.7) represents the normal approximation of ln AT. We shall price the arithmetic average price call under this approximation.
In the next two exercises we shall assume that the distribution of AT is given by the log-normal approximation (11.10.7).
Exercise 11.10.6 Using a method similar to the one used in Lemma 11.10.1, show that the approximate value at time 0 of a derivative, which pays at maturity $1 if the arithmetic average AT ≥ K and 0 otherwise, is given by
$$h_0 = e^{-rT} N(\check d_2),$$
with
$$\check d_2 = \frac{\ln(m_1^2/\sqrt{m_2}) - \ln K - \ln T}{\sqrt{\ln(m_2/m_1^2)}}, \tag{11.10.8}$$
where in the expressions of m1 and m2 we replaced µ by r.

Exercise 11.10.7 Using a method similar to the one used in Lemma 11.10.2, show that the approximate value at time 0 of a derivative, which pays at maturity AT if AT ≥ K and 0 otherwise, is given by the formula
$$a_0 = \frac{1-e^{-rT}}{rT}\, S_0\, N(\check d_1),$$
where
$$\check d_1 = \check d_2 + \sqrt{\ln(m_2/m_1^2)}, \tag{11.10.9}$$
and where in the expressions of m1 and m2 we replaced µ by r.

Proposition 11.10.8 The approximate value at t = 0 of an arithmetic average price call is given by
$$f_0 = \frac{S_0(1-e^{-rT})}{rT}\, N(\check d_1) - K e^{-rT} N(\check d_2),$$
with $\check d_1$ and $\check d_2$ given by formulas (11.10.9) and (11.10.8), respectively.

Exercise 11.10.9 (a) Prove Proposition 11.10.8.


(b) How does the formula change if the value is taken at the time t instead of time 0?
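The Turnbull-Wakeman moment-matching recipe is easy to code. The Python sketch below is our illustration (names and parameter values are ours): it computes m1 and m2 with µ replaced by r, matches the log-normal parameters as in Proposition 11.10.5, and evaluates the approximate call price of Proposition 11.10.8.

```python
import numpy as np
from scipy.stats import norm

def tw_arithmetic_avg_call(S0, K, r, sigma, T):
    """Turnbull-Wakeman lognormal approximation of an arithmetic
    average price call at t = 0 (Proposition 11.10.8)."""
    # First two moments of I_T = int_0^T S_u du, with mu replaced by r
    m1 = S0 * (np.exp(r * T) - 1) / r
    m2 = (2 * S0**2 / (r + sigma**2)) * (
        (np.exp((2 * r + sigma**2) * T) - 1) / (2 * r + sigma**2)
        - (np.exp(r * T) - 1) / r)
    # Matched parameters of ln A_T = ln I_T - ln T (Proposition 11.10.5)
    s = np.sqrt(np.log(m2 / m1**2))
    d2 = (np.log(m1**2 / np.sqrt(m2)) - np.log(K) - np.log(T)) / s
    d1 = d2 + s
    return S0 * (1 - np.exp(-r * T)) / (r * T) * norm.cdf(d1) \
        - K * np.exp(-r * T) * norm.cdf(d2)

print(tw_arithmetic_avg_call(100.0, 95.0, 0.05, 0.20, 1.0))
```

As noted in the text, the approximation works well for volatilities up to about 20%.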

11.11 Forward Contracts with Rare Events


We shall evaluate the price of a forward contract on a stock which follows a stochastic process
with rare events. The payoff at maturity is fT = ST − K, where K is the delivery price. The
risk-neutral expectation at time t of ST is obtained by substituting µ by r, t by T − t and S0
by St in formula (10.4.17)

$$\widehat{E}_t[S_T] = S_t\, e^{\left[r-\lambda(\rho+1)+\lambda e^{\sqrt{1+2\rho}-1}\right](T-t)}.$$
The price of the forward contract is given by the risk-neutral valuation $f_t = e^{-r(T-t)}\widehat{E}_t[S_T - K]$, which becomes
$$f_t = S_t\, e^{\left[-\lambda(\rho+1)+\lambda e^{\sqrt{1+2\rho}-1}\right](T-t)} - e^{-r(T-t)}K. \tag{11.11.10}$$

In the particular case when the jump size is ρ = 0, or the rate of occurrence of jumps is λ = 0, we obtain the familiar result
$$f_t = S_t - e^{-r(T-t)}K.$$
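A one-line check in Python (ours; parameter values arbitrary) confirms that formula (11.11.10) reduces to the usual forward price when ρ = 0 or λ = 0.

```python
import numpy as np

def rare_event_forward(S, K, r, tau, lam, rho):
    # f_t from (11.11.10), tau = T - t
    drift = -lam * (rho + 1) + lam * np.exp(np.sqrt(1 + 2 * rho) - 1)
    return S * np.exp(drift * tau) - np.exp(-r * tau) * K

S, K, r, tau = 100.0, 95.0, 0.05, 1.0
usual = S - np.exp(-r * tau) * K
assert np.isclose(rare_event_forward(S, K, r, tau, lam=0.5, rho=0.0), usual)
assert np.isclose(rare_event_forward(S, K, r, tau, lam=0.0, rho=0.3), usual)
print(usual)
```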
Chapter 12

Martingale Measures

12.1 Martingale Measures


An Ft -predictable stochastic process Xt on the probability space (Ω, F, P ) is not always a
martingale. However, it might become a martingale with respect to another probability measure
Q on F. This is called a martingale measure. The main result of this section is finding a
martingale measure with respect to which the discounted stock price is a martingale. This
measure plays an important role in the mathematical explanation of the risk-neutral valuation.

12.1.1 Is the stock price St a martingale?


Since the stock price St is an Ft-predictable and non-explosive process, the only condition which needs to be satisfied for it to be an Ft-martingale is
$$E[S_t|\mathcal F_u] = S_u,\qquad \forall u < t. \tag{12.1.1}$$
Heuristically speaking, this means that, given all the information in the market at time u, Fu, the expected value of the stock at any future time equals the current price Su; that is, the stock has zero expected return. This cannot hold, since in that case an investor would prefer depositing the money in a bank at the risk-free interest rate rather than buying a stock with zero return. Hence (12.1.1) does not hold. The next result shows how to fix this problem.

Proposition 12.1.1 Let µ be the rate of return of the stock St . Then

E[e−µt St |Fu ] = e−µu Su , ∀u < t, (12.1.2)

i.e. e−µt St is an Ft -martingale.

Proof: The process e−µt St is “non-explosive” since

E[|e−µt St |] = e−µt E[St ] = e−µt S0 eµt = S0 < ∞.


Since St is Ft-predictable, so is e^{-µt}St. Using formula (10.1.2) and taking out the predictable part yields
$$E[S_t|\mathcal F_u] = E[S_0 e^{(\mu-\frac12\sigma^2)t+\sigma W_t}|\mathcal F_u] = E[S_0 e^{(\mu-\frac12\sigma^2)u+\sigma W_u}\, e^{(\mu-\frac12\sigma^2)(t-u)+\sigma(W_t-W_u)}|\mathcal F_u] = E[S_u e^{(\mu-\frac12\sigma^2)(t-u)+\sigma(W_t-W_u)}|\mathcal F_u] = S_u e^{(\mu-\frac12\sigma^2)(t-u)}\, E[e^{\sigma(W_t-W_u)}|\mathcal F_u]. \tag{12.1.3}$$

Since the increment Wt − Wu is independent of all values Ws, s ≤ u, it is also independent of Fu. By Proposition 1.10.4, part 6, the conditional expectation reduces to the usual expectation:
$$E[e^{\sigma(W_t-W_u)}|\mathcal F_u] = E[e^{\sigma(W_t-W_u)}].$$
Since $\sigma(W_t - W_u) \sim N\big(0, \sigma^2(t-u)\big)$, from Exercise 1.6.1 (b) we get
$$E[e^{\sigma(W_t-W_u)}] = e^{\frac12\sigma^2(t-u)}.$$

Substituting back in (12.1.3) yields
$$E[S_t|\mathcal F_u] = S_u e^{(\mu-\frac12\sigma^2)(t-u)}\, e^{\frac12\sigma^2(t-u)} = S_u e^{\mu(t-u)},$$
which is equivalent to
$$E[e^{-\mu t}S_t|\mathcal F_u] = e^{-\mu u}S_u.$$
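As a sanity check, the martingale property of e^{-µt}St can be verified by simulation. The Python sketch below (ours; the parameter values are arbitrary) estimates E[e^{-µt}St] by Monte Carlo and compares it with S0.

```python
import numpy as np

rng = np.random.default_rng(0)
S0, mu, sigma, t = 100.0, 0.08, 0.25, 2.0
n = 10**6

# Simulate S_t = S0 * exp((mu - sigma^2/2) t + sigma W_t) and discount at mu
W = rng.standard_normal(n) * np.sqrt(t)
St = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)
print(np.mean(np.exp(-mu * t) * St), "approx", S0)   # ~ 100.0
```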

The conditional expectation E[St|Fu] can be expressed in terms of the conditional density function as
$$E[S_t|\mathcal F_u] = \int S_t\, p(S_t|\mathcal F_u)\,dS_t, \tag{12.1.4}$$
where St is taken as an integration variable.


Exercise 12.1.2 (a) Find the formula for conditional density function, p(St |Fu ), defined by
(12.1.4).
(b) Verify the formula
E[St |F0 ] = E[St ]
in two different ways, either by using part (a), or by using the independence of St with respect
to F0 .
The martingale relation (12.1.2) can be written equivalently as
$$\int e^{-\mu t} S_t\, p(S_t|\mathcal F_u)\,dS_t = e^{-\mu u} S_u,\qquad u < t.$$
This way, dP (x) = p(x|Fu ) dx becomes a martingale measure for e−µt St . Since the rate of
return µ might not be known from the beginning, and it depends on each particular stock, a
meaningful question would be:
Under what martingale measure does the discounted stock price, Mt = e−rt St , become a mar-
tingale?
The constant r denotes, as usual, the risk-free interest rate.

12.1.2 Risk-neutral World and Martingale Measure


Assume such a martingale measure exists. Then we must have
$$\widehat E_u[e^{-rt}S_t] = \widehat E[e^{-rt}S_t|\mathcal F_u] = e^{-ru}S_u,$$
where $\widehat E$ denotes the expectation with respect to the requested martingale measure. The previous relation can also be written as
$$e^{-r(t-u)}\widehat E[S_t|\mathcal F_u] = S_u,\qquad u < t.$$

This states that the expectation of the stock price, discounted at the risk-free interest rate over the time interval t − u, is the current price of the stock, Su. Since this does not involve any of the riskiness of the stock, we may think of it as an expectation in a risk-neutral world. The aforementioned formula can be written in compounded form as
$$\widehat E[S_t|\mathcal F_u] = S_u e^{r(t-u)},\qquad u < t. \tag{12.1.5}$$

This can be obtained from the conditional expectation E[St|Fu] = Su e^{µ(t−u)} by substituting µ = r and replacing E by $\widehat E$, which corresponds to the definition of the expectation in a risk-neutral world. Therefore, the valuation of derivatives in section 11 is done by using the aforementioned martingale measure, under which e^{-rt}St is a martingale. Next we shall determine this measure explicitly.

12.1.3 Finding the Risk-Neutral Measure


The solution of the stochastic differential equation of the stock price,
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t,$$
can be written as
$$S_t = S_0 e^{\mu t} e^{\sigma W_t - \frac12\sigma^2 t}.$$
Then $e^{-\mu t}S_t = S_0 e^{\sigma W_t - \frac12\sigma^2 t}$ is an exponential process. By Example 8.1.3, particular case 1, this process is an Ft-martingale, where Ft = σ{Wu; u ≤ t} is the information available in the market until time t. Hence e^{-µt}St is a martingale, a result also proved in Proposition 12.1.1. The probability space where this martingale lives is (Ω, F, P).
In the following we shall change the rate of return µ into the risk-free rate r and change the probability measure such that the discounted stock price becomes a martingale. The discounted stock price can be expressed in terms of the Brownian motion with drift
$$\widehat W_t = \frac{\mu - r}{\sigma}\,t + W_t \tag{12.1.6}$$
as follows:
$$e^{-rt}S_t = e^{-rt} S_0 e^{\mu t} e^{\sigma W_t - \frac12\sigma^2 t} = S_0 e^{\sigma \widehat W_t - \frac12\sigma^2 t}.$$
If we let λ = (µ − r)/σ in Corollary 8.2.4 of Girsanov's theorem, it follows that $\widehat W_t$ is a Brownian motion on the probability space (Ω, F, Q), where
$$dQ = e^{-\frac12\left(\frac{\mu-r}{\sigma}\right)^2 T - \lambda W_T}\,dP.$$

As an exponential process, $e^{\sigma\widehat W_t - \frac12\sigma^2 t}$ becomes a martingale on this space. Consequently, e^{-rt}St is a martingale process w.r.t. the probability measure Q. This means
$$E^Q[e^{-rt}S_t|\mathcal F_u] = e^{-ru}S_u,\qquad u < t,$$

where $E^Q[\,\cdot\,|\mathcal F_u]$ denotes the conditional expectation with respect to the measure Q, given by
$$E^Q[X_t|\mathcal F_u] = E^P\big[X_t\, e^{-\frac12\left(\frac{\mu-r}{\sigma}\right)^2 T - \lambda W_T}\big|\mathcal F_u\big].$$

The measure Q is called the equivalent martingale measure, or the risk-neutral measure. The
expectation taken with respect to this measure is called the expectation in the risk-neutral
world. Customarily we shall use the notations
$$\widehat E[e^{-rt}S_t] = E^Q[e^{-rt}S_t],\qquad \widehat E_u[e^{-rt}S_t] = E^Q[e^{-rt}S_t|\mathcal F_u].$$

It is worth noting that $\widehat E[e^{-rt}S_t] = \widehat E_0[e^{-rt}S_t]$, since e^{-rt}St is independent of the initial information set F0.
The importance of the process $\widehat W_t$ is contained in the following useful result.
Proposition 12.1.3 The probability measure that makes the discounted stock price, e^{-rt}St, a martingale changes the rate of return µ into the risk-free interest rate r, i.e.
$$dS_t = rS_t\,dt + \sigma S_t\,d\widehat W_t.$$

Proof: The proof is a straightforward verification using (12.1.6):
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t = rS_t\,dt + (\mu-r)S_t\,dt + \sigma S_t\,dW_t = rS_t\,dt + \sigma S_t\Big(\frac{\mu-r}{\sigma}\,dt + dW_t\Big) = rS_t\,dt + \sigma S_t\,d\widehat W_t.$$

It is worth noting that the solution of the previous stochastic equation is
$$S_t = S_0 e^{rt} e^{\sigma \widehat W_t - \frac12\sigma^2 t}.$$

Exercise 12.1.4 Assume µ ≠ r and let u < t.
(a) Find $E^P[e^{-rt}S_t|\mathcal F_u]$ and show that e^{-rt}St is not a martingale w.r.t. the probability measure P.
(b) Find $E^Q[e^{-\mu t}S_t|\mathcal F_u]$ and show that e^{-µt}St is not a martingale w.r.t. the probability measure Q.

12.2 Risk-neutral World Density Functions


The purpose of this section is to establish formulas for the densities of the Brownian motions Wt and $\widehat W_t$ with respect to both probability measures P and Q, and to discuss their relationship. This will clarify some ambiguities that appear in practical applications when we need to choose the right probability density.
The densities of Wt and $\widehat W_t$ w.r.t. P and Q will be denoted respectively by $p_P$, $p_Q$ and $\hat p_P$, $\hat p_Q$.
Since Wt and $\widehat W_t$ are Brownian motions on the spaces (Ω, F, P) and (Ω, F, Q), respectively, they have the normal probability densities
$$p_P(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{x^2}{2t}} = p(x);\qquad \hat p_Q(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{x^2}{2t}} = p(x).$$
The associated distribution functions are
$$F^P_{W_t}(x) = P(W_t \le x) = \int_{\{W_t\le x\}} dP(\omega) = \int_{-\infty}^{x} p(u)\,du;$$
$$F^Q_{\widehat W_t}(x) = Q(\widehat W_t \le x) = \int_{\{\widehat W_t\le x\}} dQ(\omega) = \int_{-\infty}^{x} p(u)\,du.$$

Expressing Wt in terms of $\widehat W_t$ and using that $\widehat W_t$ is normally distributed w.r.t. Q, we get the distribution function of Wt w.r.t. Q as
$$F^Q_{W_t}(x) = Q(W_t \le x) = Q(\widehat W_t - \eta t \le x) = Q(\widehat W_t \le x + \eta t) = \int_{\{\widehat W_t\le x+\eta t\}} dQ(\omega) = \int_{-\infty}^{x+\eta t} p(y)\,dy,$$
where η = (µ − r)/σ, by (12.1.6). Differentiating yields the density function
$$p_Q(x) = \frac{d}{dx} F^Q_{W_t}(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(x+\eta t)^2}{2t}}.$$
It is worth noting that $p_Q(x)$ can be decomposed as
$$p_Q(x) = e^{-\eta x - \frac12\eta^2 t}\, p(x),$$

which makes the connection with the Girsanov theorem.


The distribution function of $\widehat W_t$ w.r.t. P can be worked out in a similar way:
$$F^P_{\widehat W_t}(x) = P(\widehat W_t \le x) = P(W_t + \eta t \le x) = P(W_t \le x - \eta t) = \int_{\{W_t\le x-\eta t\}} dP(\omega) = \int_{-\infty}^{x-\eta t} p(y)\,dy,$$
so the density function is
$$\hat p_P(x) = \frac{d}{dx} F^P_{\widehat W_t}(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(x-\eta t)^2}{2t}}.$$

12.3 Correlation of Stocks


Consider two stock prices driven by the same novelty term

dS1 = µ1 S1 dt + σ1 S1 dWt (12.3.7)


dS2 = µ2 S2 dt + σ2 S2 dWt . (12.3.8)

Since the underlying Brownian motions are perfectly correlated, one may be tempted to think
that the stock prices S1 and S2 are the same. The following result shows that in general the
stock prices are positively correlated:

Proposition 12.3.1 The correlation coefficient between the stock prices S1 and S2 driven by the same Brownian motion is
$$Corr(S_1, S_2) = \frac{e^{\sigma_1\sigma_2 t} - 1}{(e^{\sigma_1^2 t}-1)^{1/2}(e^{\sigma_2^2 t}-1)^{1/2}} > 0.$$
In particular, if σ1 = σ2, then Corr(S1, S2) = 1.

Proof: Since
$$S_1(t) = S_1(0) e^{\mu_1 t - \frac12\sigma_1^2 t} e^{\sigma_1 W_t},\qquad S_2(t) = S_2(0) e^{\mu_2 t - \frac12\sigma_2^2 t} e^{\sigma_2 W_t},$$
from Exercise 12.3.4 and the formula $E[e^{kW_t}] = e^{k^2 t/2}$ we obtain
$$Corr(S_1,S_2) = Corr(e^{\sigma_1 W_t}, e^{\sigma_2 W_t}) = \frac{Cov(e^{\sigma_1 W_t}, e^{\sigma_2 W_t})}{\sqrt{Var(e^{\sigma_1 W_t})\,Var(e^{\sigma_2 W_t})}} = \frac{E[e^{(\sigma_1+\sigma_2)W_t}] - E[e^{\sigma_1 W_t}]E[e^{\sigma_2 W_t}]}{\sqrt{Var(e^{\sigma_1 W_t})\,Var(e^{\sigma_2 W_t})}} = \frac{e^{\frac12(\sigma_1+\sigma_2)^2 t} - e^{\frac12\sigma_1^2 t}\, e^{\frac12\sigma_2^2 t}}{\left[e^{\sigma_1^2 t}(e^{\sigma_1^2 t}-1)\, e^{\sigma_2^2 t}(e^{\sigma_2^2 t}-1)\right]^{1/2}} = \frac{e^{\sigma_1\sigma_2 t}-1}{(e^{\sigma_1^2 t}-1)^{1/2}(e^{\sigma_2^2 t}-1)^{1/2}}.$$
If σ1 = σ2 = σ, then the previous formula obviously provides
$$Corr(S_1,S_2) = \frac{e^{\sigma^2 t} - 1}{e^{\sigma^2 t} - 1} = 1,$$
i.e. the stocks are perfectly correlated if they have the same volatility.

Corollary 12.3.2 The stock prices S1 and S2 are strongly positively correlated for small values of t:
$$Corr(S_1, S_2) \to 1\quad \text{as } t \to 0.$$

This fact has the following financial interpretation: if some stocks are driven by the same unpredictable news, then when one stock increases, the other tends to increase too, at least for a short amount of time. When bad news affects an entire financial market, the risk becomes systemic: if one stock fails, all the others tend to decrease as well, leading to a severe strain on the financial market.

Figure 12.1: The correlation function $f(t) = \dfrac{e^{\sigma_1\sigma_2 t}-1}{(e^{\sigma_1^2 t}-1)^{1/2}(e^{\sigma_2^2 t}-1)^{1/2}}$ in the case σ1 = 0.15, σ2 = 0.40.

Corollary 12.3.3 The stock price correlation weakens as t gets large:
$$Corr(S_1,S_2) \to 0\quad \text{as } t \to \infty.$$
Proof: It follows from the asymptotic correspondence
$$\frac{e^{\sigma_1\sigma_2 t} - 1}{(e^{\sigma_1^2 t}-1)^{1/2}(e^{\sigma_2^2 t}-1)^{1/2}} \sim \frac{e^{\sigma_1\sigma_2 t}}{e^{\frac{\sigma_1^2+\sigma_2^2}{2}t}} = e^{-\frac{(\sigma_1-\sigma_2)^2}{2}t} \to 0,\qquad t \to \infty.$$

It follows that in the long run any two stocks with different volatilities, driven by the same Brownian motion, tend to become uncorrelated; see Fig. 12.1.
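The decay is easy to see numerically; the Python sketch below (ours) evaluates the correlation function of Proposition 12.3.1 and illustrates both limits.

```python
import numpy as np

def corr(t, s1, s2):
    # Corr(S_1, S_2) for two stocks driven by the same Brownian motion
    return (np.exp(s1 * s2 * t) - 1) / np.sqrt(
        (np.exp(s1**2 * t) - 1) * (np.exp(s2**2 * t) - 1))

s1, s2 = 0.15, 0.40
for t in (0.01, 1.0, 10.0, 100.0):
    print(t, corr(t, s1, s2))   # -> 1 as t -> 0, -> 0 as t -> infinity
```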
Exercise 12.3.4 If X and Y are random variables and α, β ∈ R, show that
$$Corr(\alpha X, \beta Y) = \begin{cases} Corr(X,Y), & \text{if } \alpha\beta > 0\\ -Corr(X,Y), & \text{if } \alpha\beta < 0.\end{cases}$$
Exercise 12.3.5 Find the following:
(a) $Cov\big(dS_1(t), dS_2(t)\big)$;
(b) $Corr\big(dS_1(t), dS_2(t)\big)$.

12.4 The Sharpe Ratio


If µ is the expected return on the stock St, the risk premium is defined as the difference µ − r, where r is the risk-free interest rate. The Sharpe ratio is the quotient between the risk premium and the stock price volatility:
$$\eta = \frac{\mu - r}{\sigma}.$$
The following result shows that the Sharpe ratio is an important invariant for the family of stocks driven by the same uncertainty source.
Proposition 12.4.1 Let S1 and S2 be two stocks satisfying equations (12.3.7)−(12.3.8). Then their Sharpe ratios are equal:
$$\frac{\mu_1 - r}{\sigma_1} = \frac{\mu_2 - r}{\sigma_2}. \tag{12.4.9}$$

Proof: Eliminating the term dWt from equations (12.3.7)−(12.3.8) yields
$$\frac{\sigma_2}{S_1}\,dS_1 - \frac{\sigma_1}{S_2}\,dS_2 = (\mu_1\sigma_2 - \mu_2\sigma_1)\,dt. \tag{12.4.10}$$
Consider the portfolio P(t) = θ1(t)S1(t) − θ2(t)S2(t), with θ1(t) = σ2/S1(t) and θ2(t) = σ1/S2(t). Using the properties of self-financing portfolios, we have
$$dP(t) = \theta_1(t)\,dS_1(t) - \theta_2(t)\,dS_2(t) = \frac{\sigma_2}{S_1}\,dS_1 - \frac{\sigma_1}{S_2}\,dS_2.$$
Substituting in (12.4.10) yields dP = (µ1σ2 − µ2σ1)dt, i.e. P is a risk-less portfolio. Since the portfolio earns interest at the risk-free interest rate, we have dP = rP dt. Then equating the coefficients of dt yields
$$\mu_1\sigma_2 - \mu_2\sigma_1 = rP(t).$$
Using the definition of P(t), the previous relation becomes
$$\mu_1\sigma_2 - \mu_2\sigma_1 = r\theta_1 S_1 - r\theta_2 S_2 = r\sigma_2 - r\sigma_1,$$
which is equivalent to (12.4.9).
Using Proposition 12.1.3, relations (12.3.7)−(12.3.8) can be written as
$$dS_1 = rS_1\,dt + \sigma_1 S_1\,d\widehat W_t,\qquad dS_2 = rS_2\,dt + \sigma_2 S_2\,d\widehat W_t,$$
where the risk-neutral process $d\widehat W_t$ is the same in both equations:
$$d\widehat W_t = \frac{\mu_1 - r}{\sigma_1}\,dt + dW_t = \frac{\mu_2 - r}{\sigma_2}\,dt + dW_t.$$

12.5 Risk-neutral Valuation for Derivatives


The risk-neutral process $d\widehat W_t$ plays an important role in the risk-neutral valuation of derivatives. In this section we shall prove that if fT is the price of a derivative at the maturity time, then $f_t = \widehat E[e^{-r(T-t)}f_T|\mathcal F_t]$ is the price of the derivative at any time t < T.
In other words, the discounted risk-neutral expectation of the payoff is the price of the derivative at the earlier time. This is based on the fact that e^{-rt}ft is an Ft-martingale w.r.t. the risk-neutral measure Q introduced previously.
In particular, the idea of the proof can be applied to the stock St. Applying the product rule,
$$d(e^{-rt}S_t) = d(e^{-rt})S_t + e^{-rt}dS_t + \underbrace{d(e^{-rt})\,dS_t}_{=0} = -re^{-rt}S_t\,dt + e^{-rt}(rS_t\,dt + \sigma S_t\,d\widehat W_t) = e^{-rt}\sigma S_t\,d\widehat W_t.$$

If u < t, integrating between u and t gives
$$e^{-rt}S_t = e^{-ru}S_u + \int_u^t \sigma e^{-rs}S_s\,d\widehat W_s,$$
and taking the risk-neutral expectation w.r.t. the information set Fu yields
$$\widehat E[e^{-rt}S_t|\mathcal F_u] = \widehat E\Big[e^{-ru}S_u + \int_u^t \sigma e^{-rs}S_s\,d\widehat W_s\,\Big|\,\mathcal F_u\Big] = e^{-ru}S_u + \widehat E\Big[\int_u^t \sigma e^{-rs}S_s\,d\widehat W_s\Big] = e^{-ru}S_u,$$
since $\int_u^t \sigma e^{-rs}S_s\,d\widehat W_s$ is independent of Fu and has zero mean. It follows that e^{-rt}St is an Ft-martingale in the risk-neutral world. The following fundamental result can be shown using a proof similar to the one above:

Theorem 12.5.1 If ft = f(t, St) is the price of a derivative at time t, then e^{-rt}ft is an Ft-martingale in the risk-neutral world, i.e.
$$\widehat E[e^{-rt}f_t|\mathcal F_u] = e^{-ru}f_u,\qquad \forall\, 0 < u < t.$$
Proof: Using Ito's formula and the risk-neutral process $dS = rS\,dt + \sigma S\,d\widehat W_t$, the process followed by ft is
$$df_t = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial S}\,dS + \frac12\frac{\partial^2 f}{\partial S^2}(dS)^2 = \Big(\frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac12\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}\Big)dt + \sigma S\frac{\partial f}{\partial S}\,d\widehat W_t = rf\,dt + \sigma S\frac{\partial f}{\partial S}\,d\widehat W_t,$$
where in the last identity we used that f satisfies the Black-Scholes equation. Applying the product rule we obtain

$$d(e^{-rt}f_t) = d(e^{-rt})f_t + e^{-rt}df_t + \underbrace{d(e^{-rt})\,df_t}_{=0} = -re^{-rt}f_t\,dt + e^{-rt}\Big(rf_t\,dt + \sigma S\frac{\partial f}{\partial S}\,d\widehat W_t\Big) = e^{-rt}\sigma S\frac{\partial f}{\partial S}\,d\widehat W_t.$$
Integrating between u and t we get
$$e^{-rt}f_t = e^{-ru}f_u + \int_u^t e^{-rs}\sigma S\frac{\partial f_s}{\partial S}\,d\widehat W_s,$$
which assures that e^{-rt}ft is a martingale, since $\widehat W_s$ is a Brownian motion process. Using that e^{-ru}fu is Fu-predictable, and that the stochastic integral is independent of the information set Fu, we have
$$\widehat E[e^{-rt}f_t|\mathcal F_u] = e^{-ru}f_u + \widehat E\Big[\int_u^t e^{-rs}\sigma S\frac{\partial f_s}{\partial S}\,d\widehat W_s\Big] = e^{-ru}f_u.$$

Exercise 12.5.2 Show the following:
(a) $\widehat E[e^{\sigma(W_t-W_u)}|\mathcal F_u] = e^{(r-\mu+\frac12\sigma^2)(t-u)}$, u < t;
(b) $\widehat E\big[\frac{S_t}{S_u}\big|\mathcal F_u\big] = e^{(\mu-\frac12\sigma^2)(t-u)}\,\widehat E[e^{\sigma(W_t-W_u)}|\mathcal F_u]$, u < t;
(c) $\widehat E\big[\frac{S_t}{S_u}\big|\mathcal F_u\big] = e^{r(t-u)}$, u < t.

Exercise 12.5.3 Find the following risk-neutral world conditional expectations:
(a) $\widehat E[\int_0^t S_u\,du\,|\mathcal F_s]$, s < t;
(b) $\widehat E[S_t\int_0^t S_u\,du\,|\mathcal F_s]$, s < t;
(c) $\widehat E[\int_0^t S_u\,dW_u\,|\mathcal F_s]$, s < t;
(d) $\widehat E[S_t\int_0^t S_u\,dW_u\,|\mathcal F_s]$, s < t;
(e) $\widehat E[\big(\int_0^t S_u\,du\big)^2|\mathcal F_s]$, s < t.

Exercise 12.5.4 Use risk-neutral valuation to find the price of a derivative that pays at maturity the following payoffs:
(a) fT = T ST;
(b) $f_T = \int_0^T S_u\,du$;
(c) $f_T = \int_0^T S_u\,dW_u$.
Chapter 13

Black-Scholes Analysis

13.1 Heat Equation


This section is devoted to a basic discussion of the heat equation. Its importance resides in the remarkable fact that the Black-Scholes equation, which is the main equation of derivatives calculus, can be reduced to this type of equation.
Let u(τ, x) denote the temperature in an infinite rod at point x and time τ. In the absence of exterior heat sources the heat diffuses according to the parabolic differential equation
$$\frac{\partial u}{\partial \tau} - \frac{\partial^2 u}{\partial x^2} = 0, \tag{13.1.1}$$
called the heat equation. If the initial heat distribution is known and given by u(0, x) = f(x), then we have an initial value problem for the heat equation.
Solving this equation involves a convolution between the initial temperature f(x) and the fundamental solution of the heat equation G(τ, x), which will be defined shortly.

Definition 13.1.1 The function
$$G(\tau, x) = \frac{1}{\sqrt{4\pi\tau}}\, e^{-\frac{x^2}{4\tau}},\qquad \tau > 0,$$
is called the fundamental solution of the heat equation (13.1.1).

We recall the most important properties of the function G(τ, x).

• G(τ, x) has the properties of a probability density¹, i.e.
1. G(τ, x) > 0, ∀x ∈ R, τ > 0;
2. $\int_{\mathbb R} G(\tau, x)\,dx = 1$, ∀τ > 0.
¹ In fact it is a Gaussian probability density.

Figure 13.1: The function G(τ, x) tends to the Dirac measure δ(x) as τ ↘ 0, and flattens out as τ → ∞.

• it satisfies the heat equation
$$\frac{\partial G}{\partial \tau} - \frac{\partial^2 G}{\partial x^2} = 0,\qquad \tau > 0;$$

• G tends to the Dirac measure as τ gets closer to the initial time:
$$\lim_{\tau \searrow 0} G(\tau, x) = \delta(x),$$
where the Dirac measure can be defined via integration as
$$\int_{\mathbb R} \varphi(x)\delta(x)\,dx = \varphi(0),$$
for any smooth function ϕ with compact support. Consequently, we also have
$$\int_{\mathbb R} \varphi(x)\delta(x-y)\,dx = \varphi(y).$$

One can think of δ(x) as a measure with infinite value at x = 0, zero for the rest of the values
and with the integral equal to 1, see Fig.13.1.
The physical significance of the fundamental solution G(τ, x) is that it describes the heat
evolution in the infinite rod after an initial heat impulse of infinite size applied at x = 0.

Proposition 13.1.2 The solution of the initial value heat equation
$$\frac{\partial u}{\partial \tau} - \frac{\partial^2 u}{\partial x^2} = 0,\qquad u(0,x) = f(x)$$
is given by the convolution between the fundamental solution and the initial temperature:
$$u(\tau, x) = \int_{\mathbb R} G(\tau, y - x)\, f(y)\,dy,\qquad \tau > 0.$$

Proof: Substituting z = y − x, the solution can be written as
$$u(\tau, x) = \int_{\mathbb R} G(\tau, z)\, f(x+z)\,dz. \tag{13.1.2}$$
Differentiating under the integral yields
$$\frac{\partial u}{\partial \tau} = \int_{\mathbb R} \frac{\partial G(\tau,z)}{\partial \tau}\, f(x+z)\,dz,$$
$$\frac{\partial^2 u}{\partial x^2} = \int_{\mathbb R} G(\tau,z)\frac{\partial^2 f(x+z)}{\partial x^2}\,dz = \int_{\mathbb R} G(\tau,z)\frac{\partial^2 f(x+z)}{\partial z^2}\,dz = \int_{\mathbb R} \frac{\partial^2 G(\tau,z)}{\partial z^2}\, f(x+z)\,dz,$$
where we applied integration by parts twice and the fact that
$$\lim_{z\to\infty} G(\tau,z) = \lim_{z\to\infty}\frac{\partial G(\tau,z)}{\partial z} = 0.$$
Since G satisfies the heat equation,
$$\frac{\partial u}{\partial \tau} - \frac{\partial^2 u}{\partial x^2} = \int_{\mathbb R}\Big[\frac{\partial G(\tau,z)}{\partial \tau} - \frac{\partial^2 G(\tau,z)}{\partial z^2}\Big]\, f(x+z)\,dz = 0.$$
Since the limit and the integral commute², using the properties of the Dirac measure we have
$$u(0,x) = \lim_{\tau\searrow 0} u(\tau,x) = \lim_{\tau\searrow 0}\int_{\mathbb R} G(\tau,z)\, f(x+z)\,dz = \int_{\mathbb R} \delta(z)\, f(x+z)\,dz = f(x).$$
Hence (13.1.2) satisfies the initial value heat equation.
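The convolution formula is simple to evaluate numerically. The Python sketch below (ours; the initial temperature f(x) = exp(−x²) is an arbitrary choice) approximates u(τ, x) by quadrature and illustrates that u(τ, 0) → f(0) as τ ↘ 0, reflecting that G tends to the Dirac measure.

```python
import numpy as np

def G(tau, x):
    # Fundamental solution of the heat equation
    return np.exp(-x**2 / (4 * tau)) / np.sqrt(4 * np.pi * tau)

def heat_solution(tau, x, f, z_max=20.0, n=4001):
    # u(tau, x) = int G(tau, z) f(x + z) dz, approximated by the trapezoid rule
    z = np.linspace(-z_max, z_max, n)
    return np.trapz(G(tau, z) * f(x + z), z)

f = lambda x: np.exp(-x**2)           # initial temperature u(0, x)
for tau in (0.01, 0.1, 1.0):
    print(tau, heat_solution(tau, 0.0, f))
# As tau -> 0 the value approaches f(0) = 1.
```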


It is worth noting that the solution $u(\tau,x) = \int_{\mathbb R} G(\tau, y-x)f(y)\,dy$ provides the temperature at any point in the rod for any time τ > 0, but it cannot provide the temperature for τ < 0, because of the singularity the fundamental solution exhibits at τ = 0. We can reformulate this by saying that the heat equation is semi-deterministic, in the sense that, given the present, we can know the future but not the past.
The semi-deterministic character of diffusion phenomena can be exemplified with a drop of
ink which starts diffusing in a bucket of water at time t = 0. We can determine the density of
the ink at any time t > 0 at any point x in the bucket. However, given the density of ink at
a time t > 0, it is not possible to trace back in time the ink density and find the initial point
where the drop started its diffusion.
The semi-deterministic behavior occurs in the study of derivatives too. In the case of the Black-Scholes equation, which is a backwards heat equation³, given the present value of the derivative, we can find the past values but not the future ones. This is the principal difficulty in foreseeing the prices of stock market instruments from present prices. The difficulty is overcome by working the price backwards from the given final condition, which is the payoff at maturity.
2 This is allowed by the dominated convergence theorem.
3 This comes from the fact that at some point τ becomes −τ due to a substitution

13.2 What is a Portfolio?


A portfolio is a position in the market that consists of long and short positions in one or more stocks and other securities. The value of a portfolio can be represented algebraically as a linear combination of stock prices and other security values:
$$P = \sum_{j=1}^n a_j S_j + \sum_{k=1}^m b_k F_k.$$
The market participant holds aj units of stock Sj and bk units of derivative Fk. The coefficients are positive for long positions and negative for short positions. For instance, a portfolio given by P = 2F − 3S means that we buy 2 securities and sell 3 units of stock (a position with 2 securities long and 3 stocks short).

13.3 Risk-less Portfolios


A portfolio P is called risk-less if the increments dP are completely predictable. In this case the increment dP should equal the interest earned on the portfolio P over the time interval dt. This can be written as
$$dP = rP\,dt, \tag{13.3.3}$$
where r denotes the risk-free interest rate. For the sake of simplicity the rate r will be assumed constant throughout this section.
Let's assume now that the portfolio P depends on only one stock S and one derivative F, whose underlying asset is S. The portfolio also depends on time t, so P = P(t, S, F). We are interested in deriving the stochastic differential equation followed by the portfolio P. We note that at this moment the portfolio is not assumed risk-less. By Ito's formula we get
$$dP = \frac{\partial P}{\partial t}dt + \frac{\partial P}{\partial S}dS + \frac{\partial P}{\partial F}dF + \frac12\frac{\partial^2 P}{\partial S^2}(dS)^2 + \frac12\frac{\partial^2 P}{\partial F^2}(dF)^2. \tag{13.3.4}$$
The stock S is assumed to follow the geometric Brownian motion
$$dS = \mu S\,dt + \sigma S\,dW_t, \tag{13.3.5}$$
where the expected return rate µ and the volatility σ of the stock are constants.
Since the derivative F depends on time and the underlying stock, we can write F = F(t, S). Applying Ito's formula yields
$$dF = \frac{\partial F}{\partial t}dt + \frac{\partial F}{\partial S}dS + \frac12\frac{\partial^2 F}{\partial S^2}(dS)^2 = \Big(\frac{\partial F}{\partial t} + \mu S\frac{\partial F}{\partial S} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big)dt + \sigma S\frac{\partial F}{\partial S}\,dW_t, \tag{13.3.6}$$
where we have used (13.3.5). Taking squares in relations (13.3.5) and (13.3.6), and using the stochastic relations (dWt)² = dt and dt² = dWt dt = 0, we get
$$(dS)^2 = \sigma^2 S^2\,dt,\qquad (dF)^2 = \sigma^2 S^2\Big(\frac{\partial F}{\partial S}\Big)^2\,dt.$$

Substituting back in (13.3.4), and collecting the predictable and unpredictable parts, yields
$$dP = \Big[\frac{\partial P}{\partial t} + \mu S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big) + \frac{\partial P}{\partial F}\Big(\frac{\partial F}{\partial t} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big) + \frac12\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big)\Big]dt + \sigma S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big)dW_t. \tag{13.3.7}$$
Looking at the unpredictable component, we have the following result:
Proposition 13.3.1 The portfolio P is risk-less if and only if dP/dS = 0.
Proof: A portfolio P is risk-less if and only if its unpredictable component is identically zero, i.e.
$$\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0.$$
Since the total derivative of P is given by
$$\frac{dP}{dS} = \frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S},$$
the previous relation becomes dP/dS = 0.

Definition 13.3.2 The quantity ∆P = dP/dS is called the delta of the portfolio P.
The previous result can be reformulated by saying that a portfolio is risk-less if and only if its delta vanishes. In practice this can hold only for a short amount of time, so the portfolio needs to be re-balanced periodically. The process of making a portfolio risk-less involves a procedure called delta hedging, through which the portfolio's delta becomes zero or very close to this value.
Assume P is a risk-less portfolio, so
$$\frac{dP}{dS} = \frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0. \tag{13.3.8}$$
Then equation (13.3.7) simplifies to
$$dP = \Big[\frac{\partial P}{\partial t} + \frac{\partial P}{\partial F}\Big(\frac{\partial F}{\partial t} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big) + \frac12\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big)\Big]dt. \tag{13.3.9}$$
Comparing with (13.3.3) yields
$$\frac{\partial P}{\partial t} + \frac{\partial P}{\partial F}\Big(\frac{\partial F}{\partial t} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big) + \frac12\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big) = rP. \tag{13.3.10}$$
This equation works under the general hypothesis that P = P (t, S, F ) is a risk-free financial
instrument that depends on time t, stock S and derivative F .

13.4 Black-Scholes Equation


This section deals with a parabolic partial differential equation satisfied by all European-type securities, called the Black-Scholes equation. It was initially used by Black and Scholes to find the value of options. It is a deterministic equation obtained by eliminating the unpredictable component of the derivative through the construction of a risk-less portfolio. The main reason this is possible is that both the derivative F and the stock S are driven by the same source of uncertainty.
The next result holds in a market with the following restrictive conditions:

• the risk-free rate r and stock volatility σ are constant.

• there are no arbitrage opportunities.

• no transaction costs.

Proposition 13.4.1 If F(t, S) is a derivative defined for t ∈ [0, T], then
$$\frac{\partial F}{\partial t} + rS\frac{\partial F}{\partial S} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF. \tag{13.4.11}$$

Proof: Equation (13.3.10) works under the general hypothesis that P = P(t, S, F) is a risk-free financial instrument that depends on time t, stock S and derivative F. We shall consider P to be the particular portfolio
$$P = F - \lambda S.$$
This means taking a long position in the derivative and a short position in λ units of stock (assuming λ positive). The partial derivatives in this case are
$$\frac{\partial P}{\partial t} = 0,\quad \frac{\partial P}{\partial F} = 1,\quad \frac{\partial P}{\partial S} = -\lambda,\quad \frac{\partial^2 P}{\partial F^2} = 0,\quad \frac{\partial^2 P}{\partial S^2} = 0.$$
From the risk-less property (13.3.8) we get λ = ∂F/∂S. Substituting in equation (13.3.10) yields
$$\frac{\partial F}{\partial t} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF - rS\frac{\partial F}{\partial S},$$
which is equivalent to the desired equation.
However, the Black-Scholes equation is most often derived in a less rigorous way, based on the assumption that the number λ = ∂F/∂S, which appears in the formula of the risk-less portfolio P = F − λS, is constant over the time interval ∆t. Consider the increments over the time interval ∆t:
$$\Delta W_t = W_{t+\Delta t} - W_t,\qquad \Delta S = S_{t+\Delta t} - S_t,\qquad \Delta F = F(t+\Delta t, S_t+\Delta S) - F(t, S_t).$$
Then Ito's formula yields
$$\Delta F = \Big(\frac{\partial F}{\partial t}(t,S) + \mu S\frac{\partial F}{\partial S}(t,S) + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}(t,S)\Big)\Delta t + \sigma S\frac{\partial F}{\partial S}(t,S)\,\Delta W_t.$$
On the other side, the increments of the stock are given by
$$\Delta S = \mu S\,\Delta t + \sigma S\,\Delta W_t.$$
Since both increments ∆F and ∆S are driven by the same uncertainty source, ∆Wt, we can eliminate it by multiplying the latter equation by ∂F/∂S and subtracting it from the former:
$$\Delta F - \frac{\partial F}{\partial S}(t,S)\,\Delta S = \Big(\frac{\partial F}{\partial t}(t,S) + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}(t,S)\Big)\Delta t.$$
The left side can be regarded as the increment ∆P of the portfolio
$$P = F - \frac{\partial F}{\partial S}\,S.$$
This portfolio is risk-less because its increments are totally deterministic, so it must also satisfy ∆P = rP ∆t. The number ∂F/∂S is assumed constant for small intervals of time ∆t. Even if this assumption is not entirely rigorous, the procedure still leads to the right equation, obtained by equating the coefficients of ∆t in the last two equations:
$$\frac{\partial F}{\partial t}(t,S) + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}(t,S) = r\Big(F - \frac{\partial F}{\partial S}\,S\Big),$$
which is equivalent to the Black-Scholes equation.

13.5 Delta Hedging


The proof of the Black-Scholes equation is based on the fact that the portfolio P = F − (∂F/∂S)S is risk-less. Since the delta of the derivative F is
$$\Delta_F = \frac{dF}{dS} = \frac{\partial F}{\partial S},$$
the portfolio P = F − ∆F S is risk-less. This leads to the delta-hedging procedure, by which selling ∆F units of the underlying stock S yields a risk-less investment.

13.6 Tradable securities


A derivative F(t, S) that is a solution of the Black-Scholes equation is called tradable. Its name comes from the fact that it can be traded (either on an exchange or over-the-counter). The Black-Scholes equation constitutes the equilibrium relation that provides the traded price of the derivative. We shall deal next with a few examples of tradable securities.

Example 13.6.1 (i) It is easy to show that F = S is a solution of the Black-Scholes equation. Hence the stock is a tradable derivative.
(ii) If K is a constant, then F = e^{rt}K is a tradable derivative.
(iii) If S is the stock price, then F = e^S is not a tradable derivative, since it does not satisfy equation (13.4.11).

Exercise 13.6.1 Show that F = ln S is not a tradable derivative.

Exercise 13.6.2 Find all constants α such that S α is tradable.

Substituting F = S^α in equation (13.4.11) we obtain
$$rS\,\alpha S^{\alpha-1} + \frac12\sigma^2 S^2\,\alpha(\alpha-1)S^{\alpha-2} = rS^{\alpha}.$$
Dividing by S^α yields $r\alpha + \frac12\sigma^2\alpha(\alpha-1) = r$. This can be factorized as
$$\frac12\sigma^2(\alpha - 1)\Big(\alpha + \frac{2r}{\sigma^2}\Big) = 0,$$
with two distinct solutions α1 = 1 and α2 = −2r/σ². Hence there are only two tradable securities that are powers of the stock: the stock itself, S, and S^{-2r/σ²}. In particular, S² is not tradable, since −2r/σ² ≠ 2 (the left side is negative). The role of these two cases will be clarified by the next result.
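A symbolic check of this computation is immediate; the Python sketch below (our illustration) plugs F = S^α into the Black-Scholes operator using sympy and solves for α.

```python
import sympy as sp

t, S, r, sigma, alpha = sp.symbols('t S r sigma alpha', positive=True)
F = S**alpha

# Black-Scholes operator applied to F(t, S) = S^alpha
bs = (sp.diff(F, t) + r * S * sp.diff(F, S)
      + sp.Rational(1, 2) * sigma**2 * S**2 * sp.diff(F, S, 2) - r * F)

print(sp.solve(sp.simplify(bs / S**alpha), alpha))   # [1, -2*r/sigma**2]
```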

Proposition 13.6.3 The general form of a tradable derivative which does not depend explicitly on time is given by
$$F(S) = C_1 S + C_2 S^{-2r/\sigma^2}, \tag{13.6.12}$$
with C1, C2 constants.

Proof: If the derivative depends solely on the stock, F = F(S), then the Black-Scholes equation becomes the ordinary differential equation
$$rS\frac{dF}{dS} + \frac12\sigma^2 S^2\frac{d^2F}{dS^2} = rF. \tag{13.6.13}$$
This is an Euler-type equation, which can be solved by using the substitution S = e^x. The derivatives d/dS and d/dx are related by the chain rule
$$\frac{d}{dx} = \frac{dS}{dx}\frac{d}{dS}.$$
Since $\frac{dS}{dx} = \frac{de^x}{dx} = e^x = S$, it follows that $\frac{d}{dx} = S\frac{d}{dS}$. Using the product rule,
$$\frac{d^2}{dx^2} = S\frac{d}{dS}\Big(S\frac{d}{dS}\Big) = S\frac{d}{dS} + S^2\frac{d^2}{dS^2},$$
and hence
$$S^2\frac{d^2}{dS^2} = \frac{d^2}{dx^2} - \frac{d}{dx}.$$
Substituting in (13.6.13) yields
$$\frac12\sigma^2\frac{d^2G(x)}{dx^2} + \Big(r - \frac12\sigma^2\Big)\frac{dG(x)}{dx} = rG(x),$$
where G(x) = F(e^x) = F(S). The associated indicial equation
$$\frac12\sigma^2\alpha^2 + \Big(r - \frac12\sigma^2\Big)\alpha = r$$
has solutions α1 = 1, α2 = −2r/σ², so the general solution has the form
$$G(x) = C_1 e^{x} + C_2 e^{-\frac{2r}{\sigma^2}x},$$
which is equivalent to (13.6.12).

Exercise 13.6.4 Show that the price of a forward contract, which is given by F (t, S) = S − Ke−r(T −t) ,
satisfies the Black-Scholes equation, i.e. a forward contract is a tradable derivative.

Exercise 13.6.5 Let d1 and d2 be given by
$$d_2 = \frac{\ln(S_t/K) + \big(r - \frac{\sigma^2}{2}\big)(T-t)}{\sigma\sqrt{T-t}},\qquad d_1 = d_2 + \sigma\sqrt{T-t}.$$
Show that the following functions satisfy the Black-Scholes equation:
(a) F1(t, S) = S N(d1);
(b) F2(t, S) = e^{-r(T-t)} N(d2);
(c) F3(t, S) = S N(d1) − K e^{-r(T-t)} N(d2).
To which well-known derivatives do these formulas correspond?

13.7 Risk-less investment revised


A risk-less investment P(t, S, F), which depends on time t, the stock price S and a derivative F having S as underlying asset, satisfies equation (13.3.10). Using the Black-Scholes equation satisfied by the derivative F,
$$\frac{\partial F}{\partial t} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF - rS\frac{\partial F}{\partial S},$$
equation (13.3.10) becomes
$$\frac{\partial P}{\partial t} + \frac{\partial P}{\partial F}\Big(rF - rS\frac{\partial F}{\partial S}\Big) + \frac12\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big) = rP.$$
Using the risk-less condition (13.3.8),
$$\frac{\partial P}{\partial S} = -\frac{\partial P}{\partial F}\frac{\partial F}{\partial S}, \tag{13.7.14}$$
the previous equation becomes
$$\frac{\partial P}{\partial t} + rS\frac{\partial P}{\partial S} + rF\frac{\partial P}{\partial F} + \frac12\sigma^2 S^2\Big[\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big] = rP. \tag{13.7.15}$$

In the following we shall find an equivalent expression for the last term on the left side. Differentiating (13.7.14) with respect to F yields
$$\frac{\partial^2 P}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S} - \frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S},$$
where we used
$$\frac{\partial^2 F}{\partial F\partial S} = \frac{\partial}{\partial S}\Big(\frac{\partial F}{\partial F}\Big) = \frac{\partial}{\partial S}(1) = 0.$$
Multiplying by ∂F/∂S implies
$$\frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2 = -\frac{\partial^2 P}{\partial F\partial S}\frac{\partial F}{\partial S}.$$
Substituting in the aforementioned equation yields
$$\frac{\partial P}{\partial t} + rS\frac{\partial P}{\partial S} + rF\frac{\partial P}{\partial F} + \frac12\sigma^2 S^2\Big[\frac{\partial^2 P}{\partial S^2} - \frac{\partial^2 P}{\partial F\partial S}\frac{\partial F}{\partial S}\Big] = rP. \tag{13.7.16}$$

We have seen in section 13.4 that P = F − (∂F/∂S)S is a risk-less investment, in fact a risk-less portfolio. We shall discuss in the following another risk-less investment.

Application 13.7.1 If a risk-less investment P has the variables S and F separable, i.e. it is the sum P(S, F) = f(F) + g(S), with f and g smooth functions, then
$$P(S,F) = F + c_1 S + c_2 S^{-2r/\sigma^2},$$
with c1, c2 constants. The derivative F is given by the formula
$$F(t,S) = -c_1 S - c_2 S^{-2r/\sigma^2} + c_3 e^{rt},\qquad c_3 \in \mathbb R.$$
Since P has separable variables, the mixed derivative term vanishes, and equation (13.7.16) becomes
$$Sg'(S) + \frac{\sigma^2}{2r}S^2 g''(S) - g(S) = f(F) - F f'(F).$$
There is a separation constant C such that
$$f(F) - F f'(F) = C,\qquad S g'(S) + \frac{\sigma^2}{2r}S^2 g''(S) - g(S) = C.$$
Dividing the first equation by F² yields the exact equation
$$\Big(\frac{1}{F}f(F)\Big)' = -\frac{C}{F^2},$$
with the solution f(F) = c0 F + C. To solve the second equation, let κ = σ²/(2r). Then the substitution S = e^x leads to the ordinary differential equation with constant coefficients
$$\kappa h''(x) + (1-\kappa)h'(x) - h(x) = C,$$
where h(x) = g(e^x) = g(S). The associated indicial equation
$$\kappa\lambda^2 + (1-\kappa)\lambda - 1 = 0$$
has the solutions λ1 = 1, λ2 = −1/κ. The general solution is the sum of the particular solution hp(x) = −C and the general solution of the associated homogeneous equation, $h_0(x) = c_1 e^{x} + c_2 e^{-\frac{x}{\kappa}}$. Then
$$h(x) = c_1 e^{x} + c_2 e^{-\frac{x}{\kappa}} - C.$$
Going back to the variable S, we get the general form of g(S):
$$g(S) = c_1 S + c_2 S^{-2r/\sigma^2} - C,$$
with c1, c2 constants. Since the constant C cancels by addition, we have the following formula for the risk-less investment with separable variables F and S:
$$P(S,F) = f(F) + g(S) = c_0 F + c_1 S + c_2 S^{-2r/\sigma^2}.$$

Dividing by c0, we may assume c0 = 1. We shall now find the derivative F(t, S) which enters the previous formula. Substituting into (13.7.14) yields
$$-\frac{\partial F}{\partial S} = c_1 - \frac{2r}{\sigma^2}\,c_2 S^{-1-2r/\sigma^2},$$
which after partial integration in S gives
$$F(t,S) = -c_1 S - c_2 S^{-2r/\sigma^2} + \phi(t),$$
where the integration constant φ(t) is a function of t. The sum of the first two terms is the derivative given by formula (13.6.12). The remaining function φ(t) also has to satisfy the Black-Scholes equation, and hence it is of the form φ(t) = c3 e^{rt}, with c3 constant. Then the derivative F is given by
$$F(t,S) = -c_1 S - c_2 S^{-2r/\sigma^2} + c_3 e^{rt}.$$
It is worth noting that substituting in the formula of P yields P = c3 e^{rt}, which agrees with the formula of a risk-less investment.

Exercise 13.7.2 Find the function g(S) such that the product P = F g(S) is a risk-less investment, where F = F(t, S) is a derivative. Find the expression of the derivative F in terms of S and t.
Proof: Substituting P = F g(S) in equation (13.7.15) and simplifying by rF yields
$$S\frac{dg(S)}{dS} + \frac{\sigma^2}{2r}S^2\frac{d^2 g(S)}{dS^2} = 0.$$
Substituting S = e^x and h(x) = g(e^x) = g(S) yields
$$h''(x) + \Big(\frac{2r}{\sigma^2} - 1\Big)h'(x) = 0.$$
Integrating leads to the solution
$$h(x) = C_1 + C_2 e^{\left(1-\frac{2r}{\sigma^2}\right)x}.$$
Going back to the variable S,
$$g(S) = h(\ln S) = C_1 + C_2 e^{\left(1-\frac{2r}{\sigma^2}\right)\ln S} = C_1 + C_2 S^{1-\frac{2r}{\sigma^2}}.$$

13.8 Solving Black-Scholes


In this section we shall solve the Black-Scholes equation and show that its solution coincides with the one provided by the risk-neutral valuation of section 11. This way, the Black-Scholes equation provides an alternative approach to European-type derivatives, using partial differential equations instead of expectations.
Consider a European-type derivative F, with the payoff at maturity T given by fT, which is a function of the stock price at maturity, ST. Then F(t, S) satisfies the following final condition partial differential equation:
$$\frac{\partial F}{\partial t} + rS\frac{\partial F}{\partial S} + \frac12\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF,\qquad F(T, S_T) = f_T(S_T).$$
This means the solution is known at the final time T and we need to find its expression at any time t prior to T, i.e.
$$f_t = F(t, S_t),\qquad 0 \le t < T.$$
First we shall transform the equation into one with constant coefficients. Substituting S = e^x and using the identities
$$S\frac{\partial}{\partial S} = \frac{\partial}{\partial x},\qquad S^2\frac{\partial^2}{\partial S^2} = \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial x},$$
the equation becomes
$$\frac{\partial V}{\partial t} + \frac12\sigma^2\frac{\partial^2 V}{\partial x^2} + \Big(r - \frac12\sigma^2\Big)\frac{\partial V}{\partial x} = rV,$$
where V(t, x) = F(t, e^x). Using the time scaling τ = ½σ²(T − t), the chain rule provides
$$\frac{\partial}{\partial t} = \frac{\partial \tau}{\partial t}\frac{\partial}{\partial \tau} = -\frac12\sigma^2\frac{\partial}{\partial \tau}.$$
Denote k = 2r/σ². Substituting in the aforementioned equation yields
$$\frac{\partial W}{\partial \tau} = \frac{\partial^2 W}{\partial x^2} + (k-1)\frac{\partial W}{\partial x} - kW, \tag{13.8.17}$$
where W(τ, x) = V(t, x). Next we shall get rid of the last two terms on the right side of the equation by using a crafted substitution.
Consider W(τ, x) = e^{ϕ}u(τ, x), where ϕ = αx + βτ, with α, β constants that will be determined such that the equation satisfied by u(τ, x) has on the right side only the second derivative in x. Since
$$\frac{\partial W}{\partial x} = e^{\varphi}\Big(\alpha u + \frac{\partial u}{\partial x}\Big),\qquad \frac{\partial^2 W}{\partial x^2} = e^{\varphi}\Big(\alpha^2 u + 2\alpha\frac{\partial u}{\partial x} + \frac{\partial^2 u}{\partial x^2}\Big),\qquad \frac{\partial W}{\partial \tau} = e^{\varphi}\Big(\beta u + \frac{\partial u}{\partial \tau}\Big),$$
substituting in (13.8.17), dividing by e^{ϕ} and collecting the derivatives yields
$$\frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial x^2} + \big(2\alpha + k - 1\big)\frac{\partial u}{\partial x} + \big(\alpha^2 + \alpha(k-1) - k - \beta\big)u.$$
The constants α and β are chosen such that the coefficients of ∂u/∂x and u vanish:
$$2\alpha + k - 1 = 0,\qquad \alpha^2 + \alpha(k-1) - k - \beta = 0.$$
Solving yields
$$\alpha = -\frac{k-1}{2},\qquad \beta = \alpha^2 + \alpha(k-1) - k = -\frac{(k+1)^2}{4}.$$
The function u(τ, x) satisfies the heat equation
$$\frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial x^2}$$
with the initial condition expressible in terms of fT:
$$u(0,x) = e^{-\varphi(0,x)}W(0,x) = e^{-\alpha x}V(T,x) = e^{-\alpha x}F(T, e^x) = e^{-\alpha x}f_T(e^x).$$

From the general theory of the heat equation, the solution can be expressed as the convolution between the fundamental solution and the initial condition:
$$u(\tau, x) = \int_{-\infty}^{\infty}\frac{1}{\sqrt{4\pi\tau}}\,e^{-\frac{(y-x)^2}{4\tau}}\,u(0,y)\,dy.$$
The previous substitutions yield the following relation between F and u:
$$F(t,S) = F(t,e^x) = V(t,x) = W(\tau,x) = e^{\varphi(\tau,x)}u(\tau,x),$$
so F(T, e^x) = e^{αx}u(0, x). This implies

$$F(t,e^x) = e^{\varphi(\tau,x)}u(\tau,x) = e^{\varphi(\tau,x)}\int_{-\infty}^{\infty}\frac{1}{\sqrt{4\pi\tau}}\,e^{-\frac{(y-x)^2}{4\tau}}\,u(0,y)\,dy = e^{\varphi(\tau,x)}\int_{-\infty}^{\infty}\frac{1}{\sqrt{4\pi\tau}}\,e^{-\frac{(y-x)^2}{4\tau}}\,e^{-\alpha y}F(T,e^y)\,dy.$$
With the substitution $y = x + s\sqrt{2\tau}$ this becomes
$$F(t,e^x) = e^{\varphi(\tau,x)}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{s^2}{2}-\alpha(x+s\sqrt{2\tau})}\,F(T,e^{x+s\sqrt{2\tau}})\,ds.$$
Completing the square as
$$-\frac{s^2}{2} - \alpha(x+s\sqrt{2\tau}) = -\frac12\Big(s - \frac{k-1}{2}\sqrt{2\tau}\Big)^2 + \frac{(k-1)^2\tau}{4} + \frac{k-1}{2}\,x,$$
after cancelations, the previous integral becomes
$$F(t,e^x) = e^{-\frac{(k+1)^2}{4}\tau}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\frac12\left(s-\frac{k-1}{2}\sqrt{2\tau}\right)^2}\,e^{\frac{(k-1)^2}{4}\tau}\,F(T,e^{x+s\sqrt{2\tau}})\,ds.$$

Using
$$e^{-\frac{(k+1)^2}{4}\tau}\,e^{\frac{(k-1)^2}{4}\tau} = e^{-k\tau} = e^{-r(T-t)},\qquad (k-1)\tau = \Big(r - \frac12\sigma^2\Big)(T-t),$$
after the substitution $z = x + s\sqrt{2\tau}$ we get
$$F(t,e^x) = e^{-r(T-t)}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\frac12\frac{(z-x-(k-1)\tau)^2}{2\tau}}\,F(T,e^z)\,\frac{1}{\sqrt{2\tau}}\,dz = e^{-r(T-t)}\frac{1}{\sqrt{2\pi\sigma^2(T-t)}}\int_{-\infty}^{\infty}e^{-\frac{\left[z-x-(r-\frac12\sigma^2)(T-t)\right]^2}{2\sigma^2(T-t)}}\,F(T,e^z)\,dz.$$

Since e^x = St, considering the probability density
$$p(z) = \frac{1}{\sqrt{2\pi\sigma^2(T-t)}}\;e^{-\frac{\left[z-\ln S_t-(r-\frac12\sigma^2)(T-t)\right]^2}{2\sigma^2(T-t)}},$$
the previous expression becomes
$$F(t,S_t) = e^{-r(T-t)}\int_{-\infty}^{\infty}p(z)\,f_T(e^z)\,dz = e^{-r(T-t)}\,\widehat E_t[f_T],$$
with fT(ST) = F(T, ST) and $\widehat E_t$ the risk-neutral expectation operator as of time t, which was introduced and used in section 11.
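This equivalence is easy to test numerically: the Python sketch below (ours; parameter values arbitrary) evaluates the risk-neutral integral above by quadrature for a call payoff and compares it with the closed-form Black-Scholes price.

```python
import numpy as np
from scipy.stats import norm

S, K, r, sigma, tau = 100.0, 95.0, 0.05, 0.30, 1.0   # tau = T - t

# Risk-neutral integral: e^{-r tau} * int p(z) f_T(e^z) dz
m = np.log(S) + (r - 0.5 * sigma**2) * tau           # mean of z = ln S_T
s = sigma * np.sqrt(tau)
z = np.linspace(m - 10 * s, m + 10 * s, 20001)
p = np.exp(-(z - m)**2 / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)
payoff = np.maximum(np.exp(z) - K, 0.0)              # call payoff f_T
quad = np.exp(-r * tau) * np.trapz(p * payoff, z)

# Closed-form Black-Scholes call
d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / s
d2 = d1 - s
bs = S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

print(quad, bs)   # the two values agree to high accuracy
```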

13.9 Black-Scholes and Risk-neutral Valuation


The conclusion of the elementary but highly skilful computation of the last section is of capital importance for derivatives calculus: it shows the equivalence between the Black-Scholes equation and the risk-neutral valuation. Instead of computing the risk-neutral expectation of the payoff, we may choose to solve the Black-Scholes equation directly, imposing the payoff as the final condition.
In many cases solving a partial differential equation is simpler than evaluating the expectation integral, since we may look for a solution dictated by the particular form of the payoff fT. We shall apply this in finding put-call parities for different types of derivatives.
Consequently, all derivatives evaluated by the risk-neutral valuation are solutions of the
Black-Scholes equation. The only distinction is their payoff. A few of them are given in the
next example.

Example 13.9.1 (a) The price of a European call option is the solution F(t, S) of the Black-Scholes equation satisfying the final condition fT(ST) = max(ST − K, 0).
(b) The price of a European put option is the solution F(t, S) of the Black-Scholes equation with the final condition fT(ST) = max(K − ST, 0).
(c) The value of a forward contract is the solution of the Black-Scholes equation with the final condition fT(ST) = ST − K.

It is worth noting that the superposition principle discussed in section 11 can be explained
now by the fact that the solution space of the Black-Scholes equation is a linear space. This
means that a linear combination of solutions is also a solution.
Another interesting feature of the Black-Scholes equation is its independence of the stock drift rate µ; hence its solutions share the same property. This explains why in the risk-neutral valuation the value of µ does not appear explicitly in the solution.
Asian options satisfy similar Black-Scholes equations, with small differences, as we shall see
in the next section.

13.10 Boundary Conditions


We have solved the Black-Scholes equation for a call option, under the assumption that there
is a unique solution. The Black-Scholes equation is of first order in the time variable t and of
second order in the stock variable S, so it needs one final condition at t = T and two boundary
conditions for S = 0 and S → ∞.
In the case of a call option the final condition is given by the following payoff

F (T, ST ) = max{ST − K, 0}.

When S → 0, the option does not get exercised, so the boundary condition at S = 0 is
$$F(t, 0) = 0.$$

Figure 13.2: The graph of the option price before maturity in the case K = 40, σ = 30%, r = 8%, and T − t = 1.

When S → ∞ the price becomes linear,
$$F(t, S) \sim S - K,$$
the graph of F(·, S) having a slant asymptote; see Fig. 13.2.

13.11 Risk-less Portfolios for Rare Events


Consider the derivative P = P(t, S, F), which depends on the time t, the stock price S and the derivative F, whose underlying asset is S. We shall find the stochastic differential equation followed by P, under the hypothesis that the stock exhibits rare events, i.e.
$$dS = \mu S\,dt + \sigma S\,dW_t + \rho S\,dM_t, \tag{13.11.18}$$
where the constants µ, σ, ρ denote the drift rate, the volatility, and the jump size of the stock price in the case of a rare event. The processes Wt and Mt = Nt − λt denote the Brownian motion and the compensated Poisson process, respectively. The constant λ > 0 denotes the rate of occurrence of the rare events in the market.
By Ito's formula we get
$$dP = \frac{\partial P}{\partial t}dt + \frac{\partial P}{\partial S}dS + \frac{\partial P}{\partial F}dF + \frac12\frac{\partial^2 P}{\partial S^2}(dS)^2 + \frac12\frac{\partial^2 P}{\partial F^2}(dF)^2. \tag{13.11.19}$$
In the following we shall use the stochastic relations
$$(dW_t)^2 = dt,\quad (dM_t)^2 = dN_t,\quad dt^2 = dt\,dW_t = dt\,dM_t = dW_t\,dM_t = 0,$$
see sections 2.8.5, 2.8.6. Then
$$(dS)^2 = \sigma^2 S^2\,dt + \rho^2 S^2\,dN_t = (\sigma^2 + \lambda\rho^2)S^2\,dt + \rho^2 S^2\,dM_t, \tag{13.11.20}$$
where we used dMt = dNt − λdt. It is worth noting that the unpredictable part of (dS)² depends only on the rare events, and does not depend on the regular daily events.

Exercise 13.11.1 If S satisfies (13.11.18), find the following:
(a) E[(dS)²];  (b) E[dS];  (c) Var[dS].


Using Ito's formula, the infinitesimal change in the value of the derivative F = F(t, S) is given by
$$dF = \frac{\partial F}{\partial t}dt + \frac{\partial F}{\partial S}dS + \frac12\frac{\partial^2 F}{\partial S^2}(dS)^2 = \Big(\frac{\partial F}{\partial t} + \mu S\frac{\partial F}{\partial S} + \frac12(\sigma^2+\lambda\rho^2)S^2\frac{\partial^2 F}{\partial S^2}\Big)dt + \sigma S\frac{\partial F}{\partial S}\,dW_t + \Big(\frac12\rho^2 S^2\frac{\partial^2 F}{\partial S^2} + \rho S\frac{\partial F}{\partial S}\Big)dM_t, \tag{13.11.21}$$
where we have used (13.11.18) and (13.11.20). The increment dF has two independent sources of uncertainty, dWt and dMt, both with mean equal to 0.
Taking the square yields
$$(dF)^2 = \sigma^2 S^2\Big(\frac{\partial F}{\partial S}\Big)^2 dt + \Big(\frac12\rho^2 S^2\frac{\partial^2 F}{\partial S^2} + \rho S\frac{\partial F}{\partial S}\Big)dN_t. \tag{13.11.22}$$
Substituting back in (13.11.19), we obtain the unpredictable part of dP as the sum of two components:
$$\sigma S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big)dW_t + \Big[\rho S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big) + \frac12\rho^2 S^2\Big(\frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial S^2} + \frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\frac{\partial^2 F}{\partial S^2}\Big)\Big]dM_t.$$
The risk-less condition for the portfolio P is obtained when the coefficients of dWt and dMt vanish:
$$\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0, \tag{13.11.23}$$
$$\frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial S^2} + \frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\frac{\partial^2 F}{\partial S^2} = 0. \tag{13.11.24}$$
These relations can be further simplified. Differentiating (13.11.23) with respect to S gives
$$\frac{\partial^2 P}{\partial S^2} + \frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial S^2} = -\frac{\partial^2 P}{\partial S\partial F}\frac{\partial F}{\partial S};$$
substituting in (13.11.24) yields
$$\frac{\partial^2 P}{\partial F^2}\frac{\partial^2 F}{\partial S^2} = \frac{\partial^2 P}{\partial S\partial F}\frac{\partial F}{\partial S}. \tag{13.11.25}$$
Differentiating (13.11.23) with respect to F we get
$$\frac{\partial^2 P}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S} - \frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S}, \tag{13.11.26}$$
since
$$\frac{\partial^2 F}{\partial F\partial S} = \frac{\partial}{\partial S}\Big(\frac{\partial F}{\partial F}\Big) = 0.$$
Multiplying (13.11.26) by ∂F/∂S yields
$$\frac{\partial^2 P}{\partial F\partial S}\frac{\partial F}{\partial S} = -\frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2,$$
and substituting in the right side of (13.11.25) leads to the equation
$$\frac{\partial^2 P}{\partial F^2}\Big[\frac{\partial^2 F}{\partial S^2} + \Big(\frac{\partial F}{\partial S}\Big)^2\Big] = 0.$$
We arrive at the following result:
Proposition 13.11.2 Let F = F(t, S) be a derivative with underlying asset S. The investment P = P(t, S, F) is risk-less if and only if
$$\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0,\qquad \frac{\partial^2 P}{\partial F^2}\Big[\frac{\partial^2 F}{\partial S^2} + \Big(\frac{\partial F}{\partial S}\Big)^2\Big] = 0.$$
There are two risk-less conditions because there are two unpredictable components in the increment dP: one due to regular changes and the other due to rare events. The first condition is equivalent to the vanishing of the total derivative, dP/dS = 0, and corresponds to offsetting the regular risk.
The second condition is satisfied either if ∂²P/∂F² = 0 or if ∂²F/∂S² + (∂F/∂S)² = 0. In the first case P is at most linear in F. For instance, if P = F − f(S), the first condition yields f′(S) = ∂F/∂S. In the second case, denote U(t, S) = ∂F/∂S. Then we need to solve the partial differential equation
$$\frac{\partial U}{\partial S} + U^2 = 0.$$
Future research directions:
1. Solve the above equation.
2. Find the predictable part of dP.
3. Obtain an analog of the Black-Scholes equation in this case.
4. Evaluate a call option in this case.
5. Determine whether the risk-neutral valuation still works, and why.
Chapter 14

Black-Scholes for Asian


Derivatives

In this chapter we shall develop the Black-Scholes equation in the case of Asian derivatives and discuss the particular cases of options and forward contracts on weighted averages. In the case of the latter contracts we obtain closed-form solutions, while for the former ones we apply the reduction of variables method to decrease the number of variables and discuss the solution.

14.0.1 Weighted averages


In many practical problems the asset price needs to be weighted in a certain way. For instance, when computing a car insurance premium, more weight is given to recent accidents than to accidents that occurred 10 years ago.
In the following we shall define the weight function and provide several examples.
Let ρ : [0, T] → R be a weight function, i.e. a function satisfying
1. ρ > 0;
2. $\int_0^T \rho(t)\,dt = 1$.
The stock weighted average with respect to the weight ρ is defined as
$$S_{ave} = \int_0^T \rho(t) S_t\,dt.$$

Example 14.0.3 (a) The uniform weight is obtained for ρ(t) = 1/T. In this case
$$S_{ave} = \frac1T\int_0^T S_t\,dt$$
is the continuous arithmetic average of the stock on the time interval [0, T].

(b) The linear weight is obtained for ρ(t) = 2t/T². In this case the weight is proportional to time:
$$S_{ave} = \frac{2}{T^2}\int_0^T t\,S_t\,dt.$$
(c) The exponential weight is obtained for ρ(t) = ke^{kt}/(e^{kT} − 1). If k > 0 the weight is increasing, so recent data are weighted more than old data; if k < 0 the weight is decreasing. The exponentially weighted average is given by
$$S_{ave} = \frac{k}{e^{kT}-1}\int_0^T e^{kt} S_t\,dt.$$

Exercise 14.0.4 Consider the polynomial weighted average
$$S_{ave}^{(n)} = \frac{n+1}{T^{n+1}}\int_0^T t^n S_t\,dt.$$
Find the limit $\lim_{n\to\infty} S_{ave}^{(n)}$ in the cases 0 < T < 1, T = 1, and T > 1.

In all previous examples ρ(t) = ρ(t, T) = f(t)/g(T), with $\int_0^T f(t)\,dt = g(T)$, so g′(T) = f(T) and g(0) = 0. The average becomes
$$S_{ave}(T) = \frac{1}{g(T)}\int_0^T f(u)S_u\,du = \frac{I_T}{g(T)},$$
with $I_t = \int_0^t f(u)S_u\,du$ satisfying dIt = f(t)St dt. From the quotient rule we get
$$dS_{ave}(t) = \frac{dI_t\, g(t) - I_t\, dg(t)}{g(t)^2} = \Big(\frac{f(t)}{g(t)}S_t - \frac{g'(t)}{g(t)}\frac{I_t}{g(t)}\Big)dt = \frac{f(t)}{g(t)}\Big(S_t - \frac{g'(t)}{f(t)}\,S_{ave}(t)\Big)dt = \frac{f(t)}{g(t)}\big(S_t - S_{ave}(t)\big)dt,$$
since g′(t) = f(t). The initial condition is, by l'Hôpital's rule,
$$S_{ave}(0) = \lim_{t\searrow 0} S_{ave}(t) = \lim_{t\searrow 0}\frac{I_t}{g(t)} = \lim_{t\searrow 0}\frac{f(t)S_t}{g'(t)} = S_0\lim_{t\searrow 0}\frac{f(t)}{g'(t)} = S_0.$$

Proposition 14.0.5 The weighted average Save(t) satisfies the stochastic differential equation
$$dX_t = \frac{f(t)}{g(t)}\,(S_t - X_t)\,dt,\qquad X_0 = S_0.$$
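To illustrate Proposition 14.0.5, here is a small Euler-scheme simulation in Python (ours; the uniform weight f(t) = 1, g(t) = t of the arithmetic average is assumed, and parameter values are arbitrary). It checks that the integrated SDE tracks the running arithmetic average of a simulated geometric Brownian motion path.

```python
import numpy as np

rng = np.random.default_rng(1)
S0, mu, sigma, T, n = 100.0, 0.08, 0.25, 1.0, 100_000
dt = T / n

# Simulate a geometric Brownian motion path
dW = rng.standard_normal(n) * np.sqrt(dt)
S = S0 * np.exp(np.cumsum((mu - 0.5 * sigma**2) * dt + sigma * dW))

# Euler scheme for dX = (f/g)(S - X) dt with f(t) = 1, g(t) = t
X = S0
for i in range(1, n):
    t = i * dt
    X += (S[i] - X) / t * dt

print(X, np.mean(S))   # the two running averages nearly coincide
```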

Exercise 14.0.6 Let x(t) = E[Save(t)].
(a) Show that x(t) satisfies the ordinary differential equation
$$x'(t) = \frac{f(t)}{g(t)}\big(S_0 e^{\mu t} - x(t)\big),\qquad x(0) = S_0.$$
(b) Find x(t).


Exercise 14.0.7 Let $y(t) = E[S_{ave}^2(t)]$.
(a) Find the stochastic differential equation satisfied by $S_{ave}^2(t)$;
(b) Find the ordinary differential equation satisfied by y(t);
(c) Solve the previous equation to get y(t) and compute Var[Save].

14.1 Setting up the Black-Scholes Equation


Consider an Asian derivative whose value at time t, F(t, St, Save(t)), depends on the variables t, St, and Save(t). Using the stochastic process of St,

dSt = μSt dt + σSt dWt,

and Proposition 14.0.5, an application of Ito's formula together with the stochastic formulas

dt² = 0, (dWt)² = dt, (dSt)² = σ²St² dt, (dSave)² = 0

yields

dF = (∂F/∂t) dt + (∂F/∂St) dSt + (1/2)(∂²F/∂St²)(dSt)² + (∂F/∂Save) dSave
= (∂F/∂t + μSt ∂F/∂St + (1/2)σ²St² ∂²F/∂St² + (f(t)/g(t))(St − Save) ∂F/∂Save) dt
+ σSt (∂F/∂St) dWt.
Let ΔF = ∂F/∂St and consider the following portfolio at time t,

P(t) = F − ΔF St,

obtained by buying one derivative F and selling ΔF units of stock. The change in the portfolio value during the time dt does not depend on Wt,

dP = dF − ΔF dSt
= (∂F/∂t + (1/2)σ²St² ∂²F/∂St² + (f(t)/g(t))(St − Save) ∂F/∂Save) dt,     (14.1.1)

so the portfolio P is risk-less. Since no arbitrage opportunities are allowed, investing the value P at time t in a bank at the risk-free rate r for the time interval dt yields

dP = rP dt = (rF − rSt ∂F/∂St) dt.     (14.1.2)

Equating (14.1.1) and (14.1.2) yields the following form of the Black-Scholes equation for Asian derivatives on weighted averages:

∂F/∂t + rSt ∂F/∂St + (1/2)σ²St² ∂²F/∂St² + (f(t)/g(t))(St − Save) ∂F/∂Save = rF.

14.2 Weighted Average Strike Call Option


In this section we shall use the variable reduction method to decrease the number of variables from three to two. Since Save(t) = It/g(t), it is convenient to consider the derivative as a function of t, St and It:

V(t, St, It) = F(t, St, Save).

A computation similar to the previous one yields the simpler equation

∂V/∂t + rSt ∂V/∂St + (1/2)σ²St² ∂²V/∂St² + f(t)St ∂V/∂It = rV.     (14.2.3)
The payoff at maturity of an average strike call option can be written in the following form:

VT = V(T, ST, IT) = max{ST − Save(T), 0} = max{ST − IT/g(T), 0}
= ST max{1 − (1/g(T)) (IT/ST), 0} = ST L(T, RT),

where

Rt = It/St,  L(t, R) = max{1 − R/g(t), 0}.
Since at maturity the payoff separates into ST times a function of T and RT, we shall look for a solution of equation (14.2.3) of the same type for any t ≤ T, i.e. V(t, S, I) = S G(t, R). Since

∂V/∂t = S ∂G/∂t,  ∂V/∂I = S (∂G/∂R)(∂R/∂I) = S (∂G/∂R)(1/S) = ∂G/∂R;
∂V/∂S = G + S (∂G/∂R)(∂R/∂S) = G − R ∂G/∂R;
∂²V/∂S² = ∂/∂S (G − R ∂G/∂R) = (∂G/∂R)(∂R/∂S) − (∂R/∂S)(∂G/∂R) − R (∂²G/∂R²)(∂R/∂S)
= R (∂²G/∂R²)(I/S²);
S² ∂²V/∂S² = RI ∂²G/∂R²,

substituting in (14.2.3) and using that RI/S = R², after cancelations, yields

∂G/∂t + (1/2)σ²R² ∂²G/∂R² + (f(t) − rR) ∂G/∂R = 0.     (14.2.4)

This is a partial differential equation in only two variables, t and R. It can sometimes be solved explicitly, depending on the form of the final condition G(T, RT) and on the expression of the function f(t).
In the case of a weighted average strike call option the final condition is

G(T, RT) = max{1 − RT/g(T), 0}.     (14.2.5)

Example 14.2.1 In the case of the arithmetic average the function G(t, R) satisfies the partial differential equation

∂G/∂t + (1/2)σ²R² ∂²G/∂R² + (1 − rR) ∂G/∂R = 0

with the final condition G(T, RT) = max{1 − RT/T, 0}.
Example 14.2.2 In the case of the exponential average the function G(t, R) satisfies the equation

∂G/∂t + (1/2)σ²R² ∂²G/∂R² + (ke^{kt} − rR) ∂G/∂R = 0     (14.2.6)

with the final condition G(T, RT) = max{1 − RT/(e^{kT} − 1), 0}.
Neither of the previous two final condition problems can be solved explicitly.

14.3 Boundary Conditions


The partial differential equation (14.2.4) is of first order in t and second order in R. We need to specify one condition at t = T (the payoff at maturity), which is given by (14.2.5), and two conditions at R = 0 and R → ∞, which specify the behavior of the solution G(t, R) at the two limiting positions of the variable R.
Taking R → 0 in equation (14.2.4) and using Exercise 14.3.1 yields the first boundary condition for G(t, R):

(∂G/∂t + f ∂G/∂R)|_{R=0} = 0.     (14.3.7)

The term ∂G/∂R|_{R=0} represents the slope of G(t, R) with respect to R at R = 0, while ∂G/∂t|_{R=0} is the variation of the price G with respect to time t when R = 0.
Another boundary condition is obtained by specifying the behavior of G(t, R) for large values of R. If Rt → ∞, we must have St → 0, because

Rt = (1/St) ∫₀ᵗ f(u) Su du

and ∫₀ᵗ f(u) Su du > 0 for t > 0. In this case we are better off not exercising the option (since otherwise we would get a negative payoff), so the boundary condition is

lim_{R→∞} G(t, R) = 0.     (14.3.8)

[Figure 14.1: The profile of the solution G(t, R): (a) at expiration; (b) when there is T − t time left before expiration.]

It can be shown in the theory of partial differential equations that equation (14.2.4), together with the final condition (14.2.5), see Fig. 14.1(a), and the boundary conditions (14.3.7) and (14.3.8), has a unique solution G(t, R), see Fig. 14.1(b).
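
Since no closed-form solution is available in general, equation (14.2.4) can be solved numerically. The following is a minimal explicit finite-difference sketch for the arithmetic-average case of Example 14.2.1 (f(t) = 1, g(t) = t), marching backward from the payoff (14.2.5) and imposing the boundary conditions (14.3.7) and (14.3.8). The grid sizes, the cutoff R_max, and the market parameters are hypothetical, and the explicit scheme needs a small time step for stability.

```python
import numpy as np

# Explicit finite differences for G_t + (1/2) sigma^2 R^2 G_RR + (f - r R) G_R = 0,
# arithmetic average (f = 1), final condition G(T, R) = max(1 - R/T, 0).
r, sigma, T = 0.04, 0.50, 0.5                 # hypothetical market parameters
R_max, nR = 2.0, 200                          # hypothetical truncation of the R axis
dR = R_max / nR
R = np.linspace(0.0, R_max, nR + 1)
dt = 0.4 * dR**2 / (sigma**2 * R_max**2)      # heuristic stability bound
nT = int(np.ceil(T / dt))
dt = T / nT

G = np.maximum(1.0 - R / T, 0.0)              # payoff (14.2.5) at t = T
for _ in range(nT):                           # step backward: G(t - dt) = G(t) + dt * L
    G_R  = (G[2:] - G[:-2]) / (2.0 * dR)
    G_RR = (G[2:] - 2.0 * G[1:-1] + G[:-2]) / dR**2
    L = 0.5 * sigma**2 * R[1:-1]**2 * G_RR + (1.0 - r * R[1:-1]) * G_R
    g0 = dt * (G[1] - G[0]) / dR              # (14.3.7) at R = 0, one-sided derivative
    G[1:-1] += dt * L
    G[0] += g0
    G[-1] = 0.0                               # (14.3.8): G -> 0 as R -> infinity

print("G(0, 0) ≈", G[0])                      # option value is S_0 * G(0, R_0), R_0 = 0
```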

Exercise 14.3.1 Let f be a bounded differentiable function. Show that

(a) lim_{x→0} x f′(x) = 0;

(b) lim_{x→0} x² f′′(x) = 0.

There is no closed-form solution for the weighted average strike call option. Even in the simplest case, when the average is arithmetic, the solution is only approximate; see section 11.10. In practice the price is worked out by Monte Carlo simulation. This is based on averaging a large number, n, of simulations of the process Rt in the risk-neutral world, i.e. assuming μ = r. For each realization the associated payoff

GT,j = max{1 − RT,j/g(T), 0}

is computed, with j ≤ n, where RT,j represents the value of R at time T in the jth realization. The average

(1/n) Σ_{j=1}^{n} GT,j

is a good approximation of the payoff expectation E[GT]. Discounting at the risk-free rate, we get the price at time t:

G(t, R) = e^{−r(T−t)} (1/n) Σ_{j=1}^{n} GT,j.

It is worth noting that the term on the right is an approximation of the risk-neutral conditional expectation Ê[GT | Ft].
When simulating the process Rt, it is convenient to know its stochastic differential equation. Using

dIt = f(t) St dt,  d(1/St) = (1/St)((σ² − μ) dt − σ dWt),

the product rule yields

dRt = d(It/St) = d(It · (1/St))
= (1/St) dIt + It d(1/St) + dIt d(1/St)
= f(t) dt + Rt ((σ² − μ) dt − σ dWt).

Collecting terms yields the following stochastic differential equation for Rt:

dRt = −σRt dWt + (f(t) + (σ² − μ)Rt) dt.     (14.3.9)

The initial condition is R0 = I0/S0 = 0, since I0 = 0.
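
With the equation and its initial condition in hand, the Monte Carlo procedure described earlier can be sketched directly: Euler steps of (14.3.9) with μ replaced by r (the risk-neutral world), here for the arithmetic average (f(t) = 1, g(T) = T). The parameter values and sample sizes are hypothetical.

```python
import numpy as np

# Monte Carlo for the average strike call: Euler steps of (14.3.9) with mu = r,
# payoff G_T = max(1 - R_T / g(T), 0), then discount the sample average.
rng = np.random.default_rng(7)
r, sigma, T = 0.04, 0.50, 0.5                 # hypothetical parameters
n_paths, n_steps = 100_000, 500
dt = T / n_steps

R = np.zeros(n_paths)                         # R_0 = I_0 / S_0 = 0
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    R += (1.0 + (sigma**2 - r) * R) * dt - sigma * R * dW   # f(t) = 1
payoff = np.maximum(1.0 - R / T, 0.0)         # g(T) = T for the arithmetic average
G0 = np.exp(-r * T) * payoff.mean()
print("value per unit of stock ≈", G0)        # option price is S_0 * G(0, 0)
```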
Can we solve this equation explicitly? Can we find the mean and variance of Rt?
We shall start by finding the mean E[Rt]. The equation can be written as

dRt − (σ² − μ)Rt dt = f(t) dt − σRt dWt.

Multiplying by e^{−(σ²−μ)t} yields the exact equation

d(e^{−(σ²−μ)t} Rt) = e^{−(σ²−μ)t} f(t) dt − σ e^{−(σ²−μ)t} Rt dWt.

Integrating yields

e^{−(σ²−μ)t} Rt = ∫₀ᵗ e^{−(σ²−μ)u} f(u) du − ∫₀ᵗ σ e^{−(σ²−μ)u} Ru dWu.

The first integral is deterministic, while the second is an Ito integral. Using that the expectations of Ito integrals vanish, we get

E[e^{−(σ²−μ)t} Rt] = ∫₀ᵗ e^{−(σ²−μ)u} f(u) du,

and hence

E[Rt] = e^{(σ²−μ)t} ∫₀ᵗ e^{−(σ²−μ)u} f(u) du.
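
As a sanity check, for f(u) = 1 the last formula evaluates to E[Rt] = (e^{(σ²−μ)t} − 1)/(σ² − μ) when σ² ≠ μ, which can be compared against a Monte Carlo estimate. A sketch with hypothetical parameters:

```python
import numpy as np

# Check E[R_t] = (e^{(sigma^2 - mu) t} - 1) / (sigma^2 - mu) for f = 1
# against an Euler Monte Carlo estimate of (14.3.9).
rng = np.random.default_rng(3)
mu, sigma, t_end = 0.10, 0.30, 1.0            # hypothetical parameters
n_paths, n_steps = 200_000, 400
dt = t_end / n_steps

R = np.zeros(n_paths)
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    R += (1.0 + (sigma**2 - mu) * R) * dt - sigma * R * dW

a = sigma**2 - mu                             # assumed nonzero here
print("closed form:", (np.exp(a * t_end) - 1.0) / a)
print("Monte Carlo:", R.mean())
```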

Exercise 14.3.2 Find E[Rt²] and Var[Rt].


Equation (14.3.9) is a linear equation of the type discussed in section 7.8. Multiplying by the integrating factor

ρt = e^{σWt + σ²t/2},

the equation loses its dWt term and becomes

d(ρt Rt) = (ρt f(t) + (σ² − μ) ρt Rt) dt.

Substituting Yt = ρt Rt yields

dYt = (ρt f(t) + (σ² − μ) Yt) dt,

which can be written as

dYt − (σ² − μ)Yt dt = ρt f(t) dt.

Multiplying by e^{−(σ²−μ)t} yields the exact equation

d(e^{−(σ²−μ)t} Yt) = e^{−(σ²−μ)t} ρt f(t) dt,

which can be solved by integration:

e^{−(σ²−μ)t} Yt = ∫₀ᵗ e^{−(σ²−μ)u} ρu f(u) du.

Going back to the variable Rt = Yt/ρt, we obtain the following closed-form expression:

Rt = ∫₀ᵗ e^{(μ−σ²/2)(u−t)+σ(Wu−Wt)} f(u) du.     (14.3.10)
Exercise 14.3.3 Find E[Rt ] by taking the expectation in formula (14.3.10).


It is worth noting that we can arrive at formula (14.3.10) directly, without solving a stochastic differential equation. We shall show this procedure in the following. Using the well-known formulas for the stock price,

Su = S0 e^{(μ−σ²/2)u+σWu},  St = S0 e^{(μ−σ²/2)t+σWt},

dividing yields

Su/St = e^{(μ−σ²/2)(u−t)+σ(Wu−Wt)}.
Then we get

Rt = It/St = (1/St) ∫₀ᵗ Su f(u) du = ∫₀ᵗ (Su/St) f(u) du = ∫₀ᵗ e^{(μ−σ²/2)(u−t)+σ(Wu−Wt)} f(u) du,

which is formula (14.3.10).
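
Formula (14.3.10) can also be verified pathwise: a Riemann-sum discretization of the integral on a simulated Brownian path should agree, up to discretization error, with an Euler solution of (14.3.9) driven by the same path. A sketch with f(t) = 1 and hypothetical parameters:

```python
import numpy as np

# Pathwise check of (14.3.10) against Euler steps of (14.3.9), with f = 1.
rng = np.random.default_rng(5)
mu, sigma, T, n = 0.10, 0.30, 1.0, 50_000     # hypothetical parameters
dt = T / n
u = np.linspace(0.0, T, n + 1)
W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))

# closed form (14.3.10) as a left-point Riemann sum
integrand = np.exp((mu - 0.5 * sigma**2) * (u - T) + sigma * (W - W[-1]))
R_formula = np.sum(integrand[:-1]) * dt

# Euler scheme for dR = (1 + (sigma^2 - mu) R) dt - sigma R dW on the same path
R = 0.0
for i in range(n):
    R += (1.0 + (sigma**2 - mu) * R) * dt - sigma * R * (W[i + 1] - W[i])

print("formula:", R_formula, "  Euler:", R)
```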


Exercise 14.3.4 Find an explicit formula for Rt in terms of the integrated Brownian motion Zt^{(σ)} = ∫₀ᵗ e^{σWu} du, in the case of an exponential weight with k = σ²/2 − μ; see Example 14.0.3(c).
Exercise 14.3.5 (a) Find the price of a derivative G which satisfies

∂G/∂t + (1/2)σ²R² ∂²G/∂R² + (1 − rR) ∂G/∂R = 0

with the payoff G(T, RT) = RT².

(b) Find the value of an Asian derivative Vt on the arithmetic average that has the payoff

VT = V(T, ST, IT) = IT²/ST,

where IT = ∫₀ᵀ St dt.
Exercise 14.3.6 Use a computer simulation to find the value of an Asian arithmetic average
strike option with r = 4%, σ = 50%, S0 = $40, and T = 0.5 years.

14.4 Asian Forward Contracts on Weighted Averages


Since the payoff of this derivative is given by

VT = ST − Save(T) = ST (1 − RT/g(T)),

the variable reduction method suggests considering a solution of the type V(t, St, It) = St G(t, Rt), where G(t, R) satisfies equation (14.2.4) with the final condition G(T, RT) = 1 − RT/g(T). Since this condition is linear in RT, it is natural to look for a solution G(t, Rt) of the form

G(t, Rt) = a(t)Rt + b(t),     (14.4.11)

with the functions a(t) and b(t) to be determined. Substituting in (14.2.4) and collecting the powers of Rt yields

(a′(t) − ra(t)) Rt + b′(t) + f(t)a(t) = 0.

Since this polynomial in Rt vanishes for all values of Rt, its coefficients must be identically zero, so

a′(t) − ra(t) = 0,  b′(t) + f(t)a(t) = 0.
When t = T we have

G(T, RT) = a(T)RT + b(T) = 1 − RT/g(T).

Equating the coefficients of RT yields the final conditions

a(T) = −1/g(T),  b(T) = 1.
The coefficient a(t) satisfies the ordinary differential equation

a′(t) = ra(t),  a(T) = −1/g(T),

which has the solution

a(t) = −(1/g(T)) e^{−r(T−t)}.

The coefficient b(t) satisfies the equation

b′(t) = −f(t)a(t),  b(T) = 1,

with the solution

b(t) = 1 + ∫ₜᵀ f(u)a(u) du.
Substituting in (14.4.11) yields

G(t, R) = −(1/g(T)) e^{−r(T−t)} Rt + 1 + ∫ₜᵀ f(u)a(u) du
= 1 − (1/g(T)) (Rt e^{−r(T−t)} + ∫ₜᵀ f(u) e^{−r(T−u)} du).

Then going back to the variable It = St Rt yields

V(t, St, It) = St G(t, Rt) = St − (1/g(T)) (It e^{−r(T−t)} + St ∫ₜᵀ f(u) e^{−r(T−u)} du).

Using that ρ(u) = f(u)/g(T) and going back to the initial variable Save(t) = It/g(t) yields

F(t, St, Save(t)) = V(t, St, It) = St − (g(t)/g(T)) e^{−r(T−t)} Save(t) − St ∫ₜᵀ ρ(u) e^{−r(T−u)} du.

We arrived at the following result:

Proposition 14.4.1 The value at time t of an Asian forward contract on a weighted average with the weight function ρ(t), i.e. an Asian derivative with the payoff FT = ST − Save(T), is given by

F(t, St, Save(t)) = St (1 − ∫ₜᵀ ρ(u) e^{−r(T−u)} du) − (g(t)/g(T)) e^{−r(T−t)} Save(t).

It is worth noting that the previous price can be written as a linear combination of St and Save(t),

F(t, St, Save(t)) = α(t) St + β(t) Save(t),

where

α(t) = 1 − ∫ₜᵀ ρ(u) e^{−r(T−u)} du,
β(t) = −(g(t)/g(T)) e^{−r(T−t)} = −(∫₀ᵗ f(u) du / ∫₀ᵀ f(u) du) e^{−r(T−t)}.

In the first formula ρ(u) e^{−r(T−u)} is the discounted weight at time u, so α(t) is 1 minus the total discounted weight between t and T. One can easily check that α(T) = 1 and β(T) = −1.
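
As a numerical illustration of the proposition, the coefficients α(t) and β(t) can be evaluated for any weight. The sketch below uses the exponential weight of Example 14.0.3(c) with hypothetical parameters and checks that α(T) = 1 and β(T) = −1.

```python
import numpy as np

# Asian forward value F = alpha(t) S_t + beta(t) Save(t) for the exponential
# weight rho(u) = k e^{k u} / (e^{k T} - 1); all parameters are hypothetical.
r, T, k = 0.04, 1.0, 2.0
g = lambda s: np.exp(k * s) - 1.0

def alpha(t, n=10_000):
    u = np.linspace(t, T, n + 1)
    rho = k * np.exp(k * u) / (np.exp(k * T) - 1.0)
    disc = np.exp(-r * (T - u))
    return 1.0 - np.sum(rho[:-1] * disc[:-1]) * (T - t) / n

def beta(t):
    return -g(t) / g(T) * np.exp(-r * (T - t))

t = 0.25
S_t, Save_t = 100.0, 95.0                     # hypothetical spot and running average
print("alpha(t), beta(t):", alpha(t), beta(t))
print("alpha(T), beta(T):", alpha(T), beta(T))   # should print 1.0 and -1.0
print("forward value F(t) =", alpha(t) * S_t + beta(t) * Save_t)
```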

Exercise 14.4.2 Find the value at time t of an Asian forward contract on the arithmetic average At = ∫₀ᵗ Su du.

Exercise 14.4.3 (a) Find the value at time t of an Asian forward contract on an exponential
weighted average with the weight given by Example 14.0.3 (c).
(b) What happens if k = −r? Why?

Exercise 14.4.4 Find the value at time t of an Asian power contract with the payoff FT = (∫₀ᵀ Su du)ⁿ.
Hints and Solutions

