Contents

1 Introduction
  1.1 Example of models
  1.2 Application in Finance
  1.3 Application in Insurance

2 Review of probability
  2.1 Distribution of Random Variables. General
  2.2 Expected value or mean
  2.3 Variance, Var, and SD
  2.4 General Properties of Expectation
  2.5 Exponential moments of the Normal distribution
  2.6 LogNormal distribution

3 Independence
  3.1 Joint and marginal densities
  3.2 Multivariate Normal distributions
  3.3 A linear combination of a multivariate normal
  3.4 Independence
  3.5 Covariance
  3.6 Properties of Covariance and Variance
  3.7 Covariance function

4 Conditional Expectation
  4.1 Conditional Distribution and its mean
  4.2 Properties of Conditional Expectation
  4.3 Expectation as best predictor
  4.4 Conditional Expectation as Best Predictor
  4.5 Conditional expectation with many predictors

7 Applications in Insurance
  7.1 The bound for the ruin probability. Constant R
  7.2 R in the Normal model
  7.3 Simulations
  7.4 The Acceptance-Rejection method

8 Brownian Motion
  8.1 Definition of Brownian Motion
  8.2 Independence of Increments

10 Stochastic Calculus
  10.1 Non-differentiability of Brownian motion
  10.2 Ito Integral
  10.3 Distribution of Ito integral of simple deterministic processes
  10.4 Simple stochastic processes and their Ito integral
  10.5 Ito integral for general processes
  10.6 Properties of Ito Integral
  10.7 Rules of Stochastic Calculus
  10.8 Chain Rule: Ito's formula for f(B_t)
  10.9 Martingale property of Ito integral

12 Options
  12.1 Financial Concepts
  12.2 Functions x+ and x−
  12.3 The problem of Option price
  12.4 One-step Binomial Model
  12.5 One-period Binomial Pricing Model
1 Introduction

1.1 Example of models
Let x_t be the amount of money in a savings account. Suppose the interest rate is r, and x_0 > 0. The evolution of x_t is described by the differential equation

dx_t/dt = r x_t.

We solve this equation as follows. Divide by x_t to get x'_t/x_t = r. We know that the derivative of ln x_t equals x'_t/x_t. Hence, by integration, ln x_t = rt + C, where C is a constant. Finally,

x_t = e^C e^{rt}.

In order to find the value of e^C we need to know x_0. Indeed, plugging in t = 0 gives x_0 = e^C. Hence we get

x_t = x_0 e^{rt}.

What is it for? It allows us to predict x_t at a future time t, or to find the rate r if both x_t and x_0 are known.
1.2 Application in Finance

[Figure 1: Prices of stocks — four panels showing price series for Boral, BHP, LLC and NCP.]
1.3 Application in Insurance

Consider a sequence of independent games, and suppose that your payoff at the end of each game is X_i, a random variable. We assume that the X_i are identically distributed. The Random Walk is simply Σ_{i=1}^n X_i for n ∈ N. This is the discrete counterpart of Brownian motion.
Using the Random Walk to model the insurance surplus, we can calculate the ruin probability.
[Figure: simulation panels for mu = -1, mu = 0, mu = 1 and mu = 2.]

The surplus of the insurance company after year n is modelled as

U_n = U_0 + cn − Σ_{k=1}^n X_k,

where U_0 is the initial fund, c is the premium collected in each year, and X_k is the amount of claims paid out in year k. The insurance company wants to compute the probability of ruin, i.e. the probability that sooner or later the process (U_n, n ≥ 1) hits zero or becomes negative.
This model allows one to find initial funds sufficient to control the probability of ruin.
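To make the ruin probability concrete, here is a minimal Monte Carlo sketch of the surplus process U_n = U_0 + cn − Σ X_k. The Normal claim distribution and all parameter values are illustrative assumptions, not part of the model above:

```python
import random

def ruin_probability(u0, c, claim_mu, claim_sd, horizon, n_paths=20000, seed=1):
    """Estimate P(U_n <= 0 for some n <= horizon) by Monte Carlo.
    Claims X_k are taken i.i.d. Normal(claim_mu, claim_sd) -- an assumption
    made here for illustration only."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        u = u0
        for _ in range(horizon):
            u += c - rng.gauss(claim_mu, claim_sd)
            if u <= 0:
                ruined += 1
                break
    return ruined / n_paths

# More initial capital -> smaller ruin-probability estimate.
print(ruin_probability(2, 1.3, 1.0, 1.0, 50), ruin_probability(20, 1.3, 1.0, 1.0, 50))
```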
2 Review of probability

2.1 Distribution of Random Variables. General

A random variable refers to a quantity that takes different values with some probabilities. A random variable is completely defined by its cumulative probability distribution function (cdf)

F(x) = Pr(X ≤ x),  x ∈ IR.

When it exists, the probability density function (pdf) is the derivative

f(x) = dF(x)/dx.

Using the relation between the integral and the derivative, we can calculate probabilities of outcomes by using the pdf. The probability of observing an outcome in the range (a, b] (or (a, b)) is

Pr(a < X ≤ b) = F(b) − F(a) = ∫_a^b f(x) dx.

A density is nonnegative and satisfies ∫_{−∞}^{∞} f(x) dx = 1. Conversely, any such f corresponds to some probability distribution.

Example (Uniform on (0, 1)):
f(x) = 1 if x ∈ (0, 1), and 0 otherwise;
F(x) = 0 if x ≤ 0, x if x ∈ (0, 1), and 1 if x ≥ 1.
2.2 Expected value or mean

E(X) = ∫ x f(x) dx.

Interpretation: if f(x) is a mass density, then EX is the centre of gravity.
2.3 Variance, Var, and SD

SD(X) = σ = √(E(X − EX)²).

SD shows how far, on average, the values are away from the mean.
It turns out that for the N(μ, σ²) distribution the mean is μ and the variance is σ².

Theorem 1 If X has the N(μ, σ²) distribution then
E(X) = μ,  Var(X) = σ²,  SD(X) = σ.

The proof is an exercise in Calculus.
Linear transform

Theorem 2 If Z has the standard Normal distribution N(0, 1), then the random variable
X = μ + σZ
has the N(μ, σ²) distribution. Conversely, if X has the N(μ, σ²) distribution, then
Z = (X − μ)/σ
is standard Normal.

This result allows us to calculate probabilities for any Normal distribution by using tables of the Standard Normal (with cdf Φ). It also allows us to generate any Normal by using a standard Normal.

Example: X ~ N(1, 2). Find P(X > 0).

P(X > 0) = P((X − 1)/√2 > (0 − 1)/√2) = P(Z > −1/√2) = 1 − Φ(−0.707) = Φ(0.707) ≈ 0.76.
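The probability in this example can be checked without Normal tables, since Φ can be expressed through the error function available in the standard library:

```python
import math

def normal_cdf(z):
    """Standard Normal cdf via the error function: Phi(z) = (1 + erf(z/sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# X ~ N(1, 2): standardise with mu = 1, sigma = sqrt(2)
p = 1.0 - normal_cdf((0.0 - 1.0) / math.sqrt(2.0))
print(round(p, 4))  # 0.7602
```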
Example: Consider the process X_t = tZ, where Z is N(0, 1). Give the distribution of X_t. Give the distribution of the increments of X_t.
2.4 General Properties of Expectation

Expectation of a function of a random variable:

E h(X) = ∫ h(x) f_X(x) dx.
2.5 Exponential moments of the Normal distribution

Theorem 3 If X has the N(μ, σ²) distribution, then for any u

E e^{uX} = e^{uμ + σ²u²/2}.

Proof: By Property 4 of the expectation (expectation of a function of a r.v., E h(X) = ∫ h(x) f_X(x) dx) with h(x) = e^{ux},

E e^{uX} = ∫ e^{ux} (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)} dx
= ∫ (1/(σ√(2π))) e^{−(x² − 2μx + μ² − 2σ²ux)/(2σ²)} dx
= ∫ (1/(σ√(2π))) e^{−(x² − 2x(μ + σ²u) + μ²)/(2σ²)} dx
= ∫ (1/(σ√(2π))) e^{−((x − (μ + σ²u))² − (μ + σ²u)² + μ²)/(2σ²)} dx
= e^{(2μσ²u + (σ²u)²)/(2σ²)} ∫ (1/(σ√(2π))) e^{−(x − (μ + σ²u))²/(2σ²)} dx
= e^{uμ + σ²u²/2} · 1 = e^{uμ + σ²u²/2},

since the last integrand is the N(μ + σ²u, σ²) density, which integrates to 1. □
2.6 LogNormal distribution

X has the LogNormal distribution LN(μ, σ²) if X = e^Y, where Y ~ N(μ, σ²).

Exercise: Derive the density of X by using the definition and the normal density.

Example: X ~ LN(1, 2). Find P(X > 1).
X = e^Y where Y ~ N(1, 2). Then
P(X > 1) = P(e^Y > 1) = P(Y > 0) ≈ 0.76,
where the last value is from the previous example.

Theorem 4 If X has the LN(μ, σ²) distribution then its mean is

EX = e^{μ + σ²/2}.
3 Independence

For a random vector X = (X_1, ..., X_n) with joint density f,

Pr(X ∈ B) = ∫...∫_B f(x) dx_1 dx_2 ... dx_n.

3.1 Joint and marginal densities

Consider the case n = 2.

Theorem 5 If X and Y have a joint density f(x, y), then the marginal densities are given by integrating out the other variable.

F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^x ∫_{−∞}^y f(u, v) dv du.

Then

F_X(x) = P(X ≤ x) = F(x, ∞) = ∫_{−∞}^x ∫_{−∞}^∞ f(u, v) dv du.
3.2 Multivariate Normal distributions

A Multivariate Normal distribution is a collection of Normal random variables which may be correlated with each other.

Definition. A Multivariate Normal distribution is determined by its mean vector and its covariance matrix: X = (X_1, X_2, ..., X_d) is N(μ, Σ), where

μ = (EX_1, EX_2, ..., EX_d),  Σ = (Cov(X_i, X_j))_{i,j=1,...,d},

with density

f_X(x) = (2π)^{−d/2} det(Σ)^{−1/2} e^{−(x−μ) Σ^{−1} (x−μ)^T / 2}.

Example: A bivariate normal with μ = 0 and Σ = [[1, ρ], [ρ, 1]]. Calculations give

f_X(x, y) = 1/(2π√(1−ρ²)) e^{−[x² − 2ρxy + y²]/(2(1−ρ²))}.

If Σ = I, the identity matrix (e.g. for d = 3, I = [[1,0,0],[0,1,0],[0,0,1]]), the density factorizes:

(2π)^{−d/2} e^{−z I z^T / 2} = ∏_{i=1}^d (1/√(2π)) e^{−z_i²/2} = ∏_{i=1}^d f_{Z_i}(z_i),

so the components are independent standard Normals.

Exercise: If the random variables (X_1, X_2, ..., X_d) are jointly Normal, then they are independent if and only if they are uncorrelated.
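A common way to generate a correlated bivariate normal in practice is the linear-transformation idea above: build the second coordinate from the first and an independent standard normal. A minimal sketch:

```python
import random, math

def bivariate_normal(rho, n, seed=0):
    """Sample (X, Y) standard bivariate normal with correlation rho by
    writing Y = rho*X + sqrt(1 - rho^2)*Z for independent standard normals."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = rng.gauss(0, 1)
        out.append((x, rho * x + math.sqrt(1 - rho**2) * z))
    return out

pairs = bivariate_normal(0.8, 100_000)
corr = sum(x * y for x, y in pairs) / len(pairs)  # both means ~0, variances ~1
print(round(corr, 2))  # close to 0.8
```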
3.3 A linear combination of a multivariate normal

Theorem 7 A linear combination aX of a multivariate normal vector X is Normal.

Example: Find the distribution of X_1 + X_2, and specify its variance, where X_1, X_2 are correlated normals:

X = (X_1, X_2) is N(μ, Σ),  Σ = [[σ_1², ρσ_1σ_2], [ρσ_1σ_2, σ_2²]].

Note that the sum can be written as a scalar product, X_1 + X_2 = aX with a = (1, 1). Hence its variance is

a Σ a^T = (1, 1) [[σ_1², ρσ_1σ_2], [ρσ_1σ_2, σ_2²]] (1, 1)^T = σ_1² + 2ρσ_1σ_2 + σ_2²,

as it should be, since it can be verified directly that

Var(X_1 + X_2) = σ_1² + 2ρσ_1σ_2 + σ_2².

Example: The average of Normals, even if they are correlated, is again Normal, but not so for LogNormals (e^{N(μ,σ²)}). If X is N(μ, Σ), find the distribution of

X̄ = (1/n) Σ_{i=1}^n X_i.

Find the distribution of the geometric mean (∏_{i=1}^n e^{X_i})^{1/n} = e^{X̄}, and compare it with the average (1/n) Σ_{i=1}^n e^{X_i}.

Remark: Let X be multivariate Normal and U = BX, for a non-random matrix B. Using Theorem 7, show that U is multivariate Normal with mean Bμ_X and covariance matrix B Σ_X B^T.
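The variance formula a Σ a^T can be evaluated mechanically; the helper below (my own naming) reproduces σ_1² + 2ρσ_1σ_2 + σ_2² for a = (1, 1):

```python
def quadratic_form(a, Sigma):
    """Compute a Sigma a^T for a row vector a -- the variance of a.X
    when X has covariance matrix Sigma."""
    d = len(a)
    return sum(a[i] * Sigma[i][j] * a[j] for i in range(d) for j in range(d))

s1, s2, rho = 2.0, 3.0, 0.5
Sigma = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
var_sum = quadratic_form((1.0, 1.0), Sigma)
print(var_sum)  # 19.0  (= s1^2 + 2*rho*s1*s2 + s2^2 = 4 + 6 + 9)
```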
3.4 Independence

Events A_1 and A_2 are independent if the probability that they occur together is given by the product of their probabilities,

P(A_1 ∩ A_2) = P(A_1) P(A_2).

Random variables X and Y are independent if the joint probability distribution is a product of marginal probabilities; in terms of densities,

f_{X,Y}(x, y) = f_X(x) f_Y(y).

In general it is not enough to know the distribution of each variable X and Y in order to know the distribution of the random vector (X, Y). But if the variables X and Y are independent, then their marginal distributions determine their joint distribution (by the product formula).

Theorem 8 If X and Y are independent, then

E(XY) = E(X) E(Y).

Proof: E(XY) = ∫∫ xy f_{X,Y}(x, y) dx dy = ∫∫ xy f_X(x) f_Y(y) dx dy = E(X) E(Y), by independence. □
3.5 Covariance

Let X and Y be two random variables with finite second moments, E(X²) < ∞ and E(Y²) < ∞. Their covariance is defined as

Cov(X, Y) = E(X − EX)(Y − EY).

Theorem 10

Cov(X, Y) = E(XY) − E(X) E(Y).

Proof:

E(X − EX)(Y − EY) = E(XY − Y·EX − X·EY + EX·EY).

Now use the property of expectation that constants can be taken out, E(aX) = aEX:

= E(XY) − 2 E(X)E(Y) + E(X)E(Y) = E(XY) − E(X)E(Y). □

Correlation is defined as

ρ = Cov(X, Y) / √(Var(X) Var(Y)).
3.6 Properties of Covariance and Variance

3.7 Covariance function
4 Conditional Expectation

4.1 Conditional Distribution and its mean
Recall that E(X) = ∫ x f_X(x) dx. Similarly, the conditional expectation is the integral with respect to the conditional distribution,

E(X | Y = y) = ∫ x f(x|y) dx,

where the conditional density is

f(x|y) = f(x, y) / f_Y(y)

at any point y where f_Y(y) > 0. It is easy to see that f(x|y) so defined is indeed a probability density, as it is nonnegative and integrates to one. The expectation of this distribution, when it exists, is called the conditional expectation of X given Y = y, and is given by the above formula.

Example: Let X and Y have a standard bivariate normal distribution with parameter ρ. Then
1. The conditional distribution of X given Y = y is normal, N(ρy, 1 − ρ²).
2. E(X | Y = y) = ρy, E(X | Y) = ρY.

Proof: 1. The joint density is

f(x, y) = 1/(2π√(1−ρ²)) exp{ −[x² − 2ρxy + y²]/(2(1−ρ²)) },

and the marginal density is f_Y(y) = (1/√(2π)) e^{−y²/2}.
Hence

f_{X|y}(x) = f(x, y)/f_Y(y)
= 1/(√(2π)√(1−ρ²)) exp{ −[x² − 2ρxy + y²]/(2(1−ρ²)) + y²/2 }
= 1/(√(2π(1−ρ²))) exp{ −[(x − ρy)² + (1−ρ²)y²]/(2(1−ρ²)) + y²/2 }
= 1/(√(2π(1−ρ²))) exp{ −(x − ρy)²/(2(1−ρ²)) }.

But this is the density of the N(μ, σ²) distribution with μ = ρy and σ² = 1 − ρ².
2. The mean of N(μ, σ²) is μ; thus from 1., E(X | Y = y) = ρy. □
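The conclusion E(X | Y = y) = ρy can be seen empirically by averaging X over a thin slice of Y values. A sketch, with an arbitrary ρ and slice width:

```python
import random, math

rng = random.Random(42)
rho, n = 0.6, 200_000
# Construct the standard bivariate normal as X = rho*Y + sqrt(1-rho^2)*Z.
pairs = []
for _ in range(n):
    y = rng.gauss(0, 1)
    x = rho * y + math.sqrt(1 - rho**2) * rng.gauss(0, 1)
    pairs.append((x, y))

# Average X over a thin slice around Y = 1: should be close to rho * 1 = 0.6.
slice_x = [x for x, y in pairs if abs(y - 1.0) < 0.05]
print(round(sum(slice_x) / len(slice_x), 1))
```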
4.2 Properties of Conditional Expectation

5. If X is independent of Y, then
E(X | Y) = EX;
that is, if the information we know provides no clues about X, then the conditional expectation of X is simply its mean value.
4.3 Expectation as best predictor

4.4 Conditional Expectation as Best Predictor

We now look for the best possible predictor of X based on Y, some function h(Y) of Y. We define the optimal predictor or estimator X̂ as the one that minimizes the mean-squared error; i.e. for any random variable Z which is a function of Y,

E(X − X̂)² ≤ E(X − Z)².

It turns out that the best predictor of X based on Y is the conditional expectation of X given Y, denoted E(X|Y).

Theorem 12 The best predictor (optimal estimator) X̂ based on Y is given by X̂ = E(X|Y); in other words, for any random variable Z which is Y-measurable (a function of Y),

E(X − E(X|Y))² ≤ E(X − Z)².
For the proof we need the following result.

Theorem 13 Any random variable Z which is Y-measurable (a function of Y) is uncorrelated with X − X̂ = X − E(X|Y).

Proof:

Cov(Z, X − X̂) = E(Z(X − X̂)) − E(Z) E(X − X̂).

The second term is zero because, by the law of double expectation,

E(X − X̂) = E(X) − E(E(X|Y)) = E(X) − E(X) = 0.

Thus

Cov(Z, X − X̂) = E(Z(X − X̂)) = E(ZX) − E(Z X̂) = 0,

since by the law of double expectation E(ZX) = E(E(ZX | Y)) = E(Z E(X|Y)) = E(Z X̂). □

Corollary X̂ = E(X|Y) and X − X̂ = X − E(X|Y) are uncorrelated. In particular, for any Y-measurable Z,

E(X − Z)² = E((X − X̂) + (X̂ − Z))² = E(X − X̂)² + E(X̂ − Z)² ≥ E(X − X̂)²,

by the previous result. Thus X̂ = E(X|Y) is the optimal, best predictor/estimator. □
4.5 Conditional expectation with many predictors

Theorem 14 The best predictor X̂ based on Y_1, Y_2, ..., Y_n is given by

X̂ = E(X | Y_1, Y_2, ..., Y_n).

Conditional expectation given many random variables is defined similarly, as the mean of the conditional distribution. It is denoted by E(X | Y_1, Y_2, ..., Y_n). Notation: if we denote the information generated by Y_1, Y_2, ..., Y_n by F_n, then

E(X | Y_1, Y_2, ..., Y_n) = E(X | F_n).

Note that it is often hard to find a formula for the conditional expectation. But in the multivariate Normal case it is known, and is established by direct calculations.

Theorem 15 (Normal Correlation) Suppose X and Y jointly form a multivariate Normal vector. Then

E(X | Y) = EX + (Cov(X, Y)/Var(Y)) (Y − EY).
5.1 Random Walk

A model of pure chance is served by an ideal coin being tossed with equal probabilities for Heads and Tails to come up. Introduce a random variable Y taking values +1 (Heads) and −1 (Tails) with probability 1/2 each. If the coin is tossed n times, then a sequence of random variables Y_1, Y_2, ..., Y_n describes this experiment. All Y_i have exactly the same distribution as Y_1; moreover, they are all independent.

Random walk is the process X_n defined by

X_n = X_0 + Y_1 + Y_2 + ... + Y_n.

X_n gives the fortune of a player in a game of chance after n plays, where a coin is tossed and one wins $1 if Heads comes up and loses $1 when Tails comes up. Random walk is the central model for stock prices; the standard assumption is that returns on stocks follow a random walk.

A more general Random Walk is

X_n = X_0 + Y_1 + Y_2 + ... + Y_n,

where the Y_i are i.i.d. (not necessarily ±1). The RW is unbiased if EY_i = 0 and biased otherwise. For the ±1 walk,

Var(X_n − X_0) = Var(Σ_{i=1}^n Y_i) = Σ_{i=1}^n Var(Y_i) = n Var(Y_1) = n.

Useful tools: the Strong Law of Large Numbers and Central Limit Theorems. In general, if X_1, X_2, ..., X_n, ... are i.i.d. random variables with finite mean, we have

lim_{n→∞} (1/n) Σ_{i=1}^n X_i = E[X_1].
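A simulation sketch of the ±1 walk, also checking Var(X_n − X_0) = n across independent runs:

```python
import random

def random_walk(n_steps, x0=0, seed=0):
    """Simple +/-1 random walk: X_n = X_0 + Y_1 + ... + Y_n with fair coin steps."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choice((-1, 1)))
    return path

walk = random_walk(1000)
print(walk[-1])  # final fortune after 1000 fair plays

# Var(X_n - X_0) = n: estimate over many independent walks of length 100.
finals = [random_walk(100, seed=s)[-1] for s in range(5000)]
var = sum(f * f for f in finals) / len(finals)  # the mean is 0
print(var)  # close to 100
```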
5.2 Martingales

For the martingale property to make sense we need integrability. For the ±1 walk with X_0 = 0, by the triangle inequality,

E|X_n| = E|Σ_{i=1}^n Y_i| ≤ Σ_{i=1}^n E|Y_i| = n E|Y_1| < ∞.

5.3

Some questions about Random Walks, such as ruin probabilities, can be answered with the help of martingales.

Theorem 16 Let X_n, n = 0, 1, 2, ... be a Random Walk with steps Y_i, μ = EY_1 and h(u) = ln E(e^{uY_1}). Then X_n − nμ and M_n = e^{uX_n − n h(u)} are martingales.

Proof:
1. Since, by the triangle inequality, |a + b| ≤ |a| + |b|,

E|X_n − nμ| = E|X_0 + Σ_{i=1}^n Y_i − nμ| ≤ E|X_0| + Σ_{i=1}^n E|Y_i − μ| < ∞.

For the exponential martingale,

E(e^{uX_n − n h(u)}) = e^{uX_0} e^{−n h(u)} E ∏_{i=1}^n e^{uY_i} = e^{uX_0} e^{−n h(u)} ∏_{i=1}^n E(e^{uY_i}), by independence,
= e^{uX_0} e^{−n h(u)} e^{n h(u)} = e^{uX_0} < ∞.   (1)

A similar computation conditionally on F_n gives E(e^{uX_{n+1}} | F_n) = e^{uX_n} e^{h(u)}. Multiplying both sides of this equation by e^{−(n+1)h(u)}, the martingale property is obtained, E(M_{n+1} | F_n) = M_n.

5.4
6.1 Stopping Times

Let us first see an example of a random variable that is not a stopping time. Flip a fair coin 10 times. Denote by τ the last time you observe Heads. Is this a stopping time? No. Looking at X_1 and X_2 is not enough to tell whether τ = 2. In fact, if X_1 = 0 (here 0 means Tails), X_2 = 1 and X_i = 0 for all i ∈ {3, 4, ..., 10}, then τ = 2. On the other hand, if X_1 = 0, X_2 = 1 and X_i = 1 for all i ∈ {3, 4, ..., 10}, then τ = 10. Hence the first two observations X_1 and X_2 are not enough to tell whether τ = 2 holds.

The time of ruin is a stopping time:

τ = min{n : X_n = 0},
{τ > n} = {X_1 ≠ 0, X_2 ≠ 0, ..., X_n ≠ 0}.

If we can tell whether τ > n, we can also tell whether {τ ≤ n}. So by observing the capital at times 1, 2, ..., n, we can decide whether ruin by time n has occurred or not; e.g. if X_1 ≠ 0, X_2 ≠ 0, X_3 ≠ 0 then τ > 3.

The time when something happens for the first time is a stopping time, e.g. the first time the Random Walk hits value 1 (or 100). Say you gamble from 8pm to 11pm, and τ is the first time you win $100. By observing your winnings you can decide whether τ has or has not occurred.

One way to see that a random variable τ is finite is to establish that it has finite mean: if E(τ) < ∞ then P(τ < ∞) = 1.

If τ_1 and τ_2 are stopping times then their minimum

τ = min(τ_1, τ_2) = τ_1 ∧ τ_2

is also a stopping time. We use this result mainly when one of the stopping times is a constant, τ_2 = N. Clearly, any constant N is a stopping time. Then τ ∧ N is a stopping time, which is bounded by N. For example, if τ is the first time one wins $5 in a game of coin tossing, then τ ∧ 10 is the time of winning $5 if it happens before 10 tosses, or time 10 if $5 was not won by toss 10.

Note that max(τ_1, τ_2) = τ_1 ∨ τ_2 and τ_1 + τ_2 are also stopping times, but we don't use these properties.
6.2

One can prove that a martingale has a constant mean at any deterministic time. For example, if (M_n) is a martingale with non-random M_0, then

E(M_5) = E(M_4) = E(M_3) = E(M_2) = E(M_1) = M_0.

There is nothing special about time 5; we can prove the same for all fixed times. What if we substitute a fixed deterministic time with a random one? It turns out that the mean of the stopped martingale does not change for some random times, such as a bounded stopping time; for a general random time, however, the property may fail. This is why we need the following theorem.

Theorem 17 (Optional Stopping Theorem) Let M_n be a martingale.
1. If τ ≤ K < ∞ is a bounded stopping time, then
E(M_τ) = E(M_0).
2. If the M_n are uniformly bounded, |M_n| ≤ C for all n, then for any stopping time τ,
E(M_τ) = E(M_0).

The proof of this theorem is outside this course.
6.3

Unbiased RW. Suppose that you are playing a game of chance by betting on the outcomes of tosses of a fair coin (p = 0.5). You win $1 if Heads comes up and lose $1 if Tails comes up. You start with $20. Find the probability of winning $10 before losing all your initial capital of $20.

Let τ be the first time X_n hits 30 or 0. Denote by a the probability that you win 10 before losing 20, i.e. X_τ = 30; then 1 − a is the probability that X_τ = 0, i.e. you lose 20 before winning 10. Thus the distribution of X_τ is: X_τ = 30 with probability a, and X_τ = 0 with probability 1 − a.

We have seen that the process X_n is a martingale. Applying the optional stopping theorem (without proving that we can),

E(X_τ) = E(X_0) = X_0 = 20.

On the other hand, calculating the expectation directly, E(X_τ) = 30a + 0·(1 − a). Thus from these equations 30a = 20, and a = 2/3: the probability of winning $10 before losing the initial capital of $20 is 2/3.

The same calculation gives that the probability of the process X_n hitting level b before it hits level c, having started at x, b < x < c, is

a = (c − x)/(c − b).
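The answer a = 2/3 is easy to confirm by simulating the game directly (the helper name and sample size are my own choices):

```python
import random

def win_probability(x0=20, target=30, n_games=20000, seed=3):
    """Monte Carlo for the fair-coin gambler: probability of reaching
    `target` before hitting 0, starting from x0. Theory gives x0/target."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        x = x0
        while 0 < x < target:
            x += rng.choice((-1, 1))
        wins += (x == target)
    return wins / n_games

print(win_probability())  # close to 2/3
```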
Biased RW. Let a simple random walk move to the right with probability p and to the left with probability q = 1 − p. We want to find the probability that it hits level b before it hits level c, when started at x, b < x < c. Let τ be the stopping time of the random walk hitting b or c. Stopping the martingale (q/p)^{X_n} at τ gives an equation for a; solving it,

a = ((q/p)^x − (q/p)^c) / ((q/p)^b − (q/p)^c).

6.4
Unbiased RW. We use the martingale M_n = X_n² − n and stop it at τ. Assuming this is allowed,

E(X_τ²) − E(τ) = x², so E(τ) = a b² + (1 − a) c² − x²,

where a is the hitting probability in the unbiased RW.

Biased RW. We use the martingale M_n = X_n − nμ. Here μ = p − q = 2p − 1. Stopping it at τ gives

E(X_τ) − μ E(τ) = x, so E(τ) = (a b + (1 − a) c − x)/(2p − 1),

where a is the hitting probability in the biased RW.

Exercise: Give a proof that Optional Stopping applies to the martingales above.
6.5
k=1
29
Xk
Assumptions.
c E(Xn ) > 0. The premiums are greater than the expected payout.
X1 , X2 , . . . are identically distributed and independent.
Exercise
6.6
Ruin Probability
7 Applications in Insurance

Theorem 18 Suppose there is a constant R > 0 such that

E e^{−R(c − X_1)} = 1.

Then for all n,

P(T ≤ n | U_0 = x) ≤ e^{−Rx},

where T is the time of ruin.

Proof.
Step 1. Show that M_n = e^{−R U_n} is a martingale.
Step 2. Use the Martingale Stopping Theorem with the stopping time min(T, n) = T ∧ n.
Step 3. Extract information from the resulting equation.

Step 1. Finite expectation.

7.1 The bound for the ruin probability. Constant R.

The constant R solves

e^{−Rc} E(e^{RX}) = 1.
7.2 R in the Normal model

Suppose X_1 ~ N(μ, σ²). Then

E(e^{RX_1}) = e^{Rμ + R²σ²/2}

(or use the formula for the Normal moment generating function). Thus the equation for R becomes

m_X(R) = e^{Rμ + R²σ²/2} = e^{Rc},

so Rμ + R²σ²/2 = Rc, giving

R = 2(c − μ)/σ².

Remark: The aggregate claims in consecutive years, X_1, X_2, ..., X_n, ..., are assumed to have the same distribution, say that of X_1. Suppose that there are n insured individuals, each with individual claim distribution Y. Then in one year the aggregate claim is

X_1 = Σ_{i=1}^n Y_i,

and by the Central Limit Theorem (X_1 − nμ_Y)/(√n σ_Y) is approximately N(0, 1). In other words,

X_1 ≈ N(nμ_Y, nσ_Y²).
Example: Consider a car owner who has an 80% chance of no accidents in a year, a 20% chance of being in a single accident in a year, and no chance of being in more than one accident in a year. For simplicity, assume that there is a 50% probability that after the accident the car will need repairs costing 500, a 40% probability that the repairs will cost 5000, and a 10% probability that the car will need to be replaced, which will cost 15,000. Hence the distribution of the random variable Y, the loss due to accident, is

f(x) = 0.80 if x = 0;  0.10 if x = 500;  0.08 if x = 5000;  0.02 if x = 15000.

The car owner's expected loss is the mean of this distribution, E(Y) = 750. The standard deviation of the loss is σ_Y = 2442.

Consider an insurance company that will reimburse repair costs resulting from accidents for 100 such car owners. For the company, the loss in one year is the sum of the losses for each car: if the loss due to car i is Y_i, then

X_1 = Σ_{i=1}^{100} Y_i.

Note that most of the Y_i are zero. This fact is taken into account in the loss (claim) distribution.
For the company, the expected loss in one year is the sum of the expected losses,

μ = μ_X = E(Σ_{i=1}^{100} Y_i) = 100 μ_car = 75,000.

The variance is

σ² = σ_X² = Var(Σ_{i=1}^{100} Y_i) = 100 σ_car² = 596,336,400.

So the aggregate loss in one year, X, has approximately a Normal distribution with these parameters.

Suppose the premium is set to be 30% higher than the expected claim, c = 1.3μ. Then

R = 2(c − μ)/σ² = 2(0.3μ)/σ² = 0.6μ/σ² = 45,000/596,336,400 ≈ 7.55 × 10⁻⁵.

So, if the company has an initial fund of x = 100,000 = 10⁵, then the ruin probability is less than e^{−7.55} ≈ 0.0005. Note that an initial fund of only x = 10,000 = 10⁴ is not enough: the ruin probability bound is only e^{−0.755} ≈ 0.47.
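The arithmetic of this example can be reproduced in a few lines. Note the sketch uses the exact variance of Y rather than the rounded σ_car = 2442 used above, so R differs slightly in the third significant digit:

```python
import math

# Loss distribution for one car (from the example above).
dist = {0: 0.80, 500: 0.10, 5000: 0.08, 15000: 0.02}
mu_car = sum(x * p for x, p in dist.items())
var_car = sum(x * x * p for x, p in dist.items()) - mu_car**2

n_cars = 100
mu = n_cars * mu_car       # 75,000
sigma2 = n_cars * var_car  # exact variance (the notes round sigma_car to 2442)
c = 1.3 * mu               # premium loading of 30%

R = 2 * (c - mu) / sigma2  # adjustment coefficient in the Normal model
for x in (1e5, 1e4):
    print(x, math.exp(-R * x))  # Lundberg bound e^{-Rx} on the ruin probability
```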
7.3 Simulations

7.4 The Acceptance-Rejection method

Example: Simulate from f(x) = 20x(1 − x)³, 0 < x < 1 (a Beta distribution). Take g(x) = 1, the Uniform(0, 1) density. It is an exercise in Calculus to see that f(x)/g(x) ≤ C = max f(x) = 135/64. Thus:

1. Generate Y and U_2 from U(0, 1).
2. If U_2 ≤ (64/135)·20Y(1 − Y)³, then set X = Y and stop; otherwise sample again.

This method applies to any distribution with a bounded density f(x).
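A direct implementation of steps 1–2 (the function name is mine); the sample mean is compared against the known mean 1/3 of this Beta distribution:

```python
import random

def sample_beta24(rng):
    """Acceptance-rejection for f(x) = 20 x (1-x)^3 with uniform proposal.
    C = max f = 135/64, so accept Y when U <= f(Y)/C."""
    while True:
        y = rng.random()
        u = rng.random()
        if u <= (64.0 / 135.0) * 20.0 * y * (1.0 - y) ** 3:
            return y

rng = random.Random(5)
xs = [sample_beta24(rng) for _ in range(100_000)]
print(sum(xs) / len(xs))  # this Beta(2,4) distribution has mean 2/6 = 1/3
```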
8 Brownian Motion

8.1 Definition of Brownian Motion

The last equality holds because the expectation of a constant is that constant. Next, for a random variable X with zero mean, EX = 0, we have

Var(X) = E(X − EX)² = E(X²).

Since B_t − B_s has zero mean, and by a property of the N(0, σ²) distribution,

E(B_t − B_s)² = Var(B_t − B_s) = t − s,  SD(B_t − B_s) = √(t − s).

If we take s = 0 then we obtain E(B_t − B_0) = 0 and E(B_t − B_0)² = t.
8.2 Independence of Increments

For any times s and t, s < t, the random variable B_t − B_s is independent of all the variables B_u, u ≤ s.

Theorem 19 Brownian motion has covariance function min(t, s).

Proof: For s < t, write

E(B_s B_t) = E(B_s(B_s + (B_t − B_s))) = E(B_s²) + E(B_s(B_t − B_s)).

Now Brownian motion has independent increments: B_t − B_s and B_s are independent, therefore the expectation of their product is the product of their expectations (Theorem 8), so that

E(B_s(B_t − B_s)) = E(B_s) E(B_t − B_s).

Brownian motion has Normal increments: B_t − B_s is N(0, t − s). Therefore its mean is zero, E(B_t − B_s) = 0. So

E(B_s B_t) = E(B_s²).

Next, writing B_s = B_0 + (B_s − B_0) and using independence of the terms, we have

E(B_s²) = E(B_0² + (B_s − B_0)² + 2B_0(B_s − B_0)) = E(B_0²) + s = B_0² + s.

Here we used that E(B_s − B_0)² = s, the variance of the N(0, s) distribution, and that B_0 is non-random, E(B_0²) = B_0².

Next, for any t,

E(B_t) = E(B_0) + E(B_t − B_0) = E(B_0) = B_0.

Hence E(B_t) E(B_s) = B_0². Finally,

Cov(B_s, B_t) = E(B_t B_s) − E(B_t) E(B_s) = B_0² + s − B_0² = s = min(s, t). □

The distributions of B(t) for any time t are called the marginal distributions of Brownian motion. The joint distributions of the vector (B(t_1), B(t_2)) of Brownian motion sampled at two arbitrary times t_1 < t_2 are called bivariate distributions. Similarly, for any n, the joint distributions of the vector (B(t_1), B(t_2), ..., B(t_n)) of Brownian motion sampled at n arbitrary times t_1 < t_2 < ... < t_n are called n-dimensional distributions. Finite-dimensional distributions are the joint distributions for n = 1, 2, 3, ....
To describe a random process it is not enough to know the distributions of its values at any single time t; one also needs the joint distributions.
A stochastic (random) process is called Gaussian if all its finite-dimensional distributions are multivariate Normal. In this lecture we prove that Brownian motion is a Gaussian process.

Theorem 20 Brownian Motion is a Gaussian process.
9.1
Proof (case n = 2, with B_0 = 0): Write X = √t_1 Z_1 and Y = √(t_2 − t_1) Z_2, where Z_1, Z_2 are independent standard Normals, and denote σ_1 = √t_1, σ_2 = √(t_2 − t_1). Then the vector

(X, X + Y) = (σ_1 Z_1, σ_1 Z_1 + σ_2 Z_2) = AZ, where A = [[σ_1, 0], [σ_1, σ_2]].

Therefore (X, X + Y) is bivariate Normal with mean vector (0, 0) and covariance matrix

A A^T = [[σ_1², σ_1²], [σ_1², σ_1² + σ_2²]] = [[t_1, t_1], [t_1, t_2]].

Similarly, for n = 3, the joint distribution of the vector (B(t_1), B(t_2), B(t_3)) is trivariate normal with mean (0, 0, 0) and covariance matrix

[[t_1, t_1, t_1], [t_1, t_2, t_2], [t_1, t_2, t_3]].

For general n, one can complete the proof by induction. Alternatively, write directly

(B(t_1), B(t_2), ..., B(t_n)) = (Y_1, Y_1 + Y_2, ..., Y_1 + ... + Y_n),

where Y_1 = B(t_1) and, for k > 1, Y_k = B(t_k) − B(t_{k−1}). By the property of independence of increments of Brownian motion, the Y_k are independent. They also have normal distributions, Y_1 ~ N(0, t_1) and Y_k ~ N(0, t_k − t_{k−1}). Writing Y_k = σ_k Z_k with σ_1 = √t_1, σ_k = √(t_k − t_{k−1}) and independent standard normals Z_k, so that B(t_2) = Y_1 + Y_2 and so on,

(B(t_1), B(t_2), ..., B(t_n)) = AZ, with lower-triangular matrix

A = [[σ_1, 0, ..., 0], [σ_1, σ_2, 0, ..., 0], ..., [σ_1, σ_2, ..., σ_n]].

This shows that the vector is a linear transformation of a standard normal vector, and therefore it is multivariate normal. □
Corollary Brownian motion is a Gaussian process with constant mean function B_0 and covariance function min(t, s).

Example: The covariance matrix of X = (B(1), B(2), B(3), B(4)) is

Σ = [[1, 1, 1, 1], [1, 2, 2, 2], [1, 2, 3, 3], [1, 2, 3, 4]].

Now let a = (1, 1, 1, 1). Then

aX = X_1 + X_2 + X_3 + X_4 = B(1) + B(2) + B(3) + B(4).

aX has a Normal distribution with mean zero and variance aΣa^T, which in this case is the sum of the elements of the covariance matrix. Thus B(1) + B(2) + B(3) + B(4) has a Normal distribution with mean zero and variance 30. Alternatively, we can calculate the variance of the sum by the covariance formula

Var(X_1 + X_2 + X_3 + X_4) = Cov(X_1 + X_2 + X_3 + X_4, X_1 + X_2 + X_3 + X_4) = Σ_{i,j} Cov(X_i, X_j) = 30.
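The variance 30 is just the sum of the entries of the matrix (Cov(B_i, B_j)) = (min(i, j)):

```python
# Covariance matrix of (B(1), ..., B(4)): Cov(B_i, B_j) = min(i, j).
Sigma = [[min(i, j) for j in range(1, 5)] for i in range(1, 5)]
var_sum = sum(sum(row) for row in Sigma)  # variance of B(1)+B(2)+B(3)+B(4)
print(var_sum)  # 30
```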
40
9.2
Two process used in applications are the Arithmetic and Geometric Brownian
motion.
Arithmetic Brownian motion Xt = t + Bt , where and are constants. This
is also known as Brownian motion with drift.
Theorem 21 If Xt is Brownian motion with drift above then
Brownian motion.
Xt t
is a standard
In particular, we have a
= E(X|Y ) and X X
= X E(X|Y ) are uncorrelated.
Corollary X
E(X Z) = E X X + X Z
(
)
2
2
41
9.3

Theorem 22 The best predictor X̂ based on Y_1, Y_2, ..., Y_n is given by

X̂ = E(X | Y_1, Y_2, ..., Y_n).

Conditional expectation given many random variables is defined similarly, as the mean of the conditional distribution. It is denoted by E(X | Y_1, Y_2, ..., Y_n). Notation: if we denote the information generated by Y_1, Y_2, ..., Y_n by F_n, then

E(X | Y_1, Y_2, ..., Y_n) = E(X | F_n).

Note that it is often hard to find a formula for the conditional expectation. But in the multivariate Normal case it is known, and is established by direct calculations.

Theorem 23 (Normal Correlation) Suppose X and Y jointly form a multivariate Normal vector. Then

E(X | Y) = EX + (Cov(X, Y)/Var(Y)) (Y − EY).

Example: Best predictor of a future value of Brownian motion based on the present value.
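For the example, Cov(B_1, B_2) = min(1, 2) = 1 = Var(B_1), so Theorem 23 gives E(B_2 | B_1) = B_1. The regression slope of simulated B_2 on B_1 should then be close to 1:

```python
import random

rng = random.Random(11)
n, t1, t2 = 100_000, 1.0, 2.0
pairs = []
for _ in range(n):
    b1 = rng.gauss(0, t1 ** 0.5)
    b2 = b1 + rng.gauss(0, (t2 - t1) ** 0.5)  # independent increment
    pairs.append((b1, b2))

# Normal Correlation: E(B_2 | B_1) = Cov(B_1, B_2)/Var(B_1) * B_1 = B_1,
# so the least-squares slope of B_2 on B_1 should be near 1.
slope = sum(b1 * b2 for b1, b2 in pairs) / sum(b1 * b1 for b1, _ in pairs)
print(round(slope, 2))  # close to 1.0
```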
9.4

A process M_t, t ≥ 0, is a martingale if: for all t, E|M_t| < ∞; and for all t and s > 0, E(M_{t+s} | M_u, u ≤ t) = M_t.

Introduce the notation F_t = {M_u, u ≤ t} for the values of the process (prices) before time t, the history up to time t. Then the martingale property reads: for all t and s > 0,

E(M_{t+s} | F_t) = M_t.

(In fact, F_t is a model for the flow of information, called a σ-field, and their collection is called a filtration, but we don't study these concepts in this course.)
10 Stochastic Calculus

10.1 Non-differentiability of Brownian motion

10.2 Ito Integral

Here we give a concise introduction to the definition and properties of the stochastic integral, the Ito integral. First it is defined for simple processes as a sum, and then a general process is approximated by simple ones.

If X_t is a constant c, then the integral should be

∫_0^T c dB_t = c(B_T − B_0).

The integral over (0, T] should be the sum of the integrals over two sub-intervals (0, a] and (a, T]. Thus if X_t takes two values, c_1 on (0, a] and c_2 on (a, T], then the integral of X with respect to B is easily defined.
Simple processes. If X_t = c_i for t_i < t ≤ t_{i+1}, i = 0, ..., n−1, with deterministic constants c_i, then

∫_0^T X_t dB_t = Σ_{i=0}^{n−1} c_i (B(t_{i+1}) − B(t_i)).

Example: with c_0 = 1 on (0, 1], c_1 = 1 on (1, 2] and c_2 = 2 on (2, 3],

∫_0^3 X(s) dB(s) = c_0(B(1) − B(0)) + c_1(B(2) − B(1)) + c_2(B(3) − B(2))
= B(1) + (B(2) − B(1)) + 2(B(3) − B(2))
~ N(0, 1) + N(0, 1) + N(0, 4) = N(0, 6),

a sum of independent Normal increments.
10.3 Distribution of Ito integral of simple deterministic processes

For deterministic c_i, the integral ∫_0^T X_t dB_t = Σ_{i=0}^{n−1} c_i (B(t_{i+1}) − B(t_i)) is Normal with mean zero, and its variance is

Var(∫_0^T X_t dB_t)
= Cov(Σ_{i=0}^{n−1} c_i (B(t_{i+1}) − B(t_i)), Σ_{j=0}^{n−1} c_j (B(t_{j+1}) − B(t_j)))
= Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} Cov(c_i (B(t_{i+1}) − B(t_i)), c_j (B(t_{j+1}) − B(t_j)))
= Σ_{i=0}^{n−1} Cov(c_i (B(t_{i+1}) − B(t_i)), c_i (B(t_{i+1}) − B(t_i)))   (increments over disjoint intervals are independent)
= Σ_{i=0}^{n−1} Var(c_i (B(t_{i+1}) − B(t_i))) = Σ_{i=0}^{n−1} c_i² (t_{i+1} − t_i) = ∫_0^T X²(t) dt.

10.4 Simple stochastic processes and their Ito integral
If X_t = ξ_i for t_i < t ≤ t_{i+1}, i = 0, ..., n−1, where the ξ_i are random variables, then

∫_0^T X_t dB_t = Σ_{i=0}^{n−1} ξ_i (B(t_{i+1}) − B(t_i)).

This is similar to the case of simple deterministic processes, except that for random ξ_i the distribution of the Ito integral is no longer Normal.
10.5 Ito integral for general processes

Proposition: Stochastic integrals are defined for adapted processes X_t such that ∫_0^T X_t² dt < ∞. Adapted means that for a given t the value X_t may depend on the past and present values of Brownian motion B(u), u ≤ t, but not on the future values B(u) for u > t.

The integral for general processes is defined by approximation by integrals of simple processes,

∫_0^T X_t dB_t = lim_{n→∞} ∫_0^T X_t^{(n)} dB_t = lim_{n→∞} Σ_i X^{(n)}(t_{i−1}) (B(t_i) − B(t_{i−1})),

where the X_t^{(n)} are simple adapted processes. The limit is the limit in probability; this mathematical theory is too advanced to cover here.
10.6 Properties of Itô Integral
1. Linearity. If X_t and Y_t are adapted processes and \alpha and \beta are constants, then

\int_0^T (\alpha X_t + \beta Y_t)\,dB_t = \alpha \int_0^T X_t\,dB_t + \beta \int_0^T Y_t\,dB_t.
2. If \int_0^T E(X_t^2)\,dt < \infty, then:

Zero mean property: E\int_0^T X_t\,dB_t = 0.

Isometry property: E\Big(\int_0^T X_t\,dB_t\Big)^2 = \int_0^T E(X_t^2)\,dt.

Note that there are cases when the Itô integral does not have a mean.
Example. Let J = \int_0^1 t\,dB_t. We calculate E(J) and Var(J). Since \int_0^1 t^2\,dt < \infty, the Itô integral is defined. Since the integrand t is nonrandom, the integral has the first two moments: E(J) = 0 and E(J^2) = \int_0^1 t^2\,dt = 1/3.
Example. Consider \int_0^T B_t\,dB_t on [0, T]. Since

\int_0^T E(B_t^2)\,dt = \int_0^T t\,dt = T^2/2 < \infty,

we have E\int_0^T B_t\,dB_t = 0 and

E\Big(\int_0^T B_t\,dB_t\Big)^2 = \int_0^T E(B_t^2)\,dt = \int_0^T t\,dt = T^2/2.
10.7
The rules of stochastic calculus are different from the usual ones. This has to do with the properties of the Brownian motion paths B_t.

In the usual calculus only terms which have dt are important, and higher-order terms are all taken to be zero:

(dt)^2 = dt\,dt = 0.

In stochastic calculus, in addition to this,

(dB_t)^2 = dB_t\,dB_t = dt,

but

dt\,dB_t = dB_t\,dt = 0, \quad \text{and} \quad dg_t = g'_t\,dt \ \text{for a differentiable function } g_t.
One can recover the stochastic calculus rules from the usual ones by using Taylor's formula up to second-order terms. Recall

f(x + dx) = f(x) + f'(x)\,dx + \frac{1}{2} f''(x)(dx)^2 + \ldots

The differential of f(x) is the linear part of the increment over [x, x + dx]. Thus

df(x) = f'(x)\,dx.

So if dx = 0.1, inclusion of the next term changes only the next decimal place, by \frac{1}{2} f''(x)(0.1)^2; so it is not included.
10.8
Since (dB_t)^2 = dt gives a linear term in dt, we need to keep the quadratic term to obtain the stochastic differential:

df(x) = f'(x)\,dx + \frac{1}{2} f''(x)(dx)^2.

Itô's formula for Brownian motion. Using (dB_t)^2 = dt and letting x = B_t, we have

df(B_t) = f'(B_t)\,dB_t + \frac{1}{2} f''(B_t)\,dt.
Taking f(x) = x^2 gives d(B_t^2) = 2B_t\,dB_t + dt, so that

\int_0^t B_s\,dB_s = \frac{1}{2}\int_0^t d(B_s^2) - \frac{1}{2}\int_0^t ds = \frac{1}{2} B_t^2 - \frac{1}{2} t.

Compare the stochastic integral \int_0^t B_s\,dB_s to the Riemann integral of a differentiable function g with g_0 = 0: making the change of variable g_s = u,

\int_0^t g_s\,dg_s = \int_0^{g_t} u\,du = \frac{1}{2} g_t^2.

The extra term -t/2 in the stochastic integral is due to (dB_t)^2 = dt.
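The identity \int_0^t B_s\,dB_s = \frac{1}{2}B_t^2 - \frac{1}{2}t can be checked numerically by approximating the Itô integral with left-endpoint sums, as in the definition above. A sketch (standard library only; the step count is arbitrary):

```python
import random, math

random.seed(1)
t, n = 1.0, 100_000
dt = t / n

B = 0.0        # current value of the Brownian path
ito_sum = 0.0  # left-endpoint (Ito) approximation of the integral
for _ in range(n):
    dB = random.gauss(0.0, math.sqrt(dt))
    ito_sum += B * dB   # integrand evaluated at the LEFT endpoint
    B += dB

exact = 0.5 * B**2 - 0.5 * t
print(ito_sum, exact)  # the two values are close
```

Evaluating the integrand at the left endpoint is essential: the midpoint or right-endpoint rule would converge to a different (Stratonovich-type) limit.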
10.9 Martingale property of Itô integral

The Itô integral process

\int_0^t X_s\,dB_s, \quad 0 \le t \le T,

is a martingale. This can be easily proved for the Itô integral of simple processes and then by taking limits for general processes.
Examples.

1. Since \int_0^T E(B_t^2)\,dt = \int_0^T t\,dt < \infty, it follows that \int_0^t B_s\,dB_s is a martingale. This is also verified by direct evaluation of the integral above:

\int_0^t B_s\,dB_s = \frac{1}{2} B_t^2 - \frac{1}{2} t,

which is a martingale, since B_t^2 - t is a martingale.

2. Since \int_0^T E(e^{B_t})^2\,dt = \int_0^T E(e^{2B_t})\,dt < \infty, it follows that \int_0^t e^{B_s}\,dB_s is a martingale.
Now we give results that help to check the martingale property by using stochastic calculus. Stochastic integrals are martingales under some conditions.

Proposition. \int_0^t X_s\,dB_s is a martingale provided \int_0^T E(X_t^2)\,dt < \infty.

It is now intuitively clear, but can be proven, that

Proposition. For a process M_t to be a martingale, it is necessary that its stochastic differential dM_t has no dt term.

This proposition is used together with Itô's formula to obtain equations for the pricing of options, such as the Black-Scholes partial differential equation.

Proposition. A stochastic integral with respect to a martingale is again a martingale, provided some integrability conditions hold.
Examples.

1. M_t = B_t. Then dM_t = dB_t. Here X_t = 1 and \int_0^T X_t^2\,dt = T < \infty, so M_t is a martingale.
11
11.1
Consider the equation describing growth x_t in which the rate of growth is constant and proportional to x_t: for example, the amount of money in a savings account with continuously compounded interest, or bacteria growth.

If b_t is the amount in the account at time t, then db_t is the change in the account over the interval of time [t, t + dt], where dt denotes a small change in time, e.g. 1 day. Continuous compounding means

db_t = r b_t\,dt.

This is an ordinary differential equation (ODE), with solution b_t = b_0 e^{rt}.
11.2
The Black-Scholes model for the stock price, dS_t = a S_t\,dt + b S_t\,dB_t, has the solution S_t = S_0 e^{(a - b^2/2)t + b B_t}.
11.3
For the SDE dX_t = a X_t\,dt + b X_t\,dB_t, Itô's formula applied to \ln X_t gives

d\ln X_t = \frac{1}{X_t}(a X_t\,dt + b X_t\,dB_t) - \frac{b^2}{2}\,dt = \Big(a - \frac{b^2}{2}\Big)dt + b\,dB_t.

Integrating, we have

\ln X_t - \ln X_0 = \Big(a - \frac{b^2}{2}\Big)t + b B_t,

and finally

X_t = X_0 e^{(a - b^2/2)t + b B_t}.
11.4 Itô's formula for functions of two variables

Using Taylor's formula for two variables, keeping the quadratic terms (dx)^2 and (dy)^2, and using (dB_t)^2 = dt, we obtain for a function f(x, y) of two diffusion processes X_t, Y_t:

df(X_t, Y_t) = \frac{\partial f}{\partial x}\,dX_t + \frac{\partial f}{\partial y}\,dY_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(dX_t)^2 + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(dY_t)^2 + \frac{\partial^2 f}{\partial x \partial y}(dX_t)(dY_t),

where all derivatives of f are evaluated at the point (X_t, Y_t). Using the rules for (dX_t)^2 (with \sigma_X and \sigma_Y denoting the diffusion coefficients of X_t and Y_t), we have

df(X_t, Y_t) = \frac{\partial f}{\partial x}(X_t, Y_t)\,dX_t + \frac{\partial f}{\partial y}(X_t, Y_t)\,dY_t
+ \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(X_t, Y_t)\,\sigma_X^2(X_t)\,dt + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(X_t, Y_t)\,\sigma_Y^2(Y_t)\,dt
+ \frac{\partial^2 f}{\partial x \partial y}(X_t, Y_t)\,\sigma_X(X_t)\,\sigma_Y(Y_t)\,dt.
Itô's formula for functions of the form f(X_t, t). Let f(x, t) be twice continuously differentiable in x and continuously differentiable in t; then (by taking Y_t = t) we have

df(X_t, t) = \frac{\partial f}{\partial x}(X_t, t)\,dX_t + \frac{\partial f}{\partial t}(X_t, t)\,dt + \frac{1}{2}\,\sigma_X^2(X_t, t)\,\frac{\partial^2 f}{\partial x^2}(X_t, t)\,dt.
11.5
The usual integration-by-parts formula states that for two differentiable functions u(t) and v(t),

d(uv) = u\,dv + v\,du.

Here we show a similar rule when the functions are functions of Brownian motion, X_t and Y_t, and may not be differentiable. If we take f(x, y) = xy, then we obtain the differential of a product (the product rule), which gives the integration-by-parts formula. Since

\frac{\partial f}{\partial x} = y, \quad \frac{\partial f}{\partial y} = x, \quad \frac{\partial^2 f}{\partial x^2} = 0, \quad \frac{\partial^2 f}{\partial y^2} = 0, \quad \frac{\partial^2 f}{\partial x \partial y} = 1,

we obtain

d(X_t Y_t) = X_t\,dY_t + Y_t\,dX_t + dX_t\,dY_t.
11.6
Ornstein-Uhlenbeck process. Consider the SDE

dX_t = -\alpha X_t\,dt + \sigma\,dB_t,

where \alpha and \sigma are some nonnegative constants. We solve it and later show that it gives the Ornstein-Uhlenbeck process, a Gaussian process with the specified mean and covariance functions.

To solve this equation, consider the process

Y_t = X_t e^{\alpha t}.

Using the differential-of-a-product rule, we have

dY_t = e^{\alpha t}\,dX_t + \alpha e^{\alpha t} X_t\,dt.

Using the SDE for dX_t, we obtain

dY_t = \sigma e^{\alpha t}\,dB_t.

This gives

Y_t = Y_0 + \sigma \int_0^t e^{\alpha s}\,dB_s,

and therefore

X_t = e^{-\alpha t}\Big(X_0 + \sigma \int_0^t e^{\alpha s}\,dB_s\Big) = e^{-\alpha t} X_0 + \sigma \int_0^t e^{-\alpha(t-s)}\,dB_s.
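The solution allows exact simulation over a time grid: over a step \Delta, X_{t+\Delta} = e^{-\alpha\Delta} X_t plus a Gaussian term whose variance \sigma^2(1 - e^{-2\alpha\Delta})/(2\alpha) follows from the isometry property applied to the stochastic integral above. A sketch (parameter values are arbitrary):

```python
import random, math

random.seed(2)
alpha, sigma, x0 = 2.0, 0.5, 1.0
t, n_steps, n_paths = 1.0, 50, 20_000
dt = t / n_steps

# Exact one-step update:
# X_{t+dt} = e^{-alpha dt} X_t + N(0, sigma^2 (1 - e^{-2 alpha dt})/(2 alpha)).
decay = math.exp(-alpha * dt)
step_sd = sigma * math.sqrt((1 - math.exp(-2 * alpha * dt)) / (2 * alpha))

total = 0.0
for _ in range(n_paths):
    x = x0
    for _ in range(n_steps):
        x = decay * x + step_sd * random.gauss(0.0, 1.0)
    total += x

mean = total / n_paths
exact_mean = x0 * math.exp(-alpha * t)
print(mean, exact_mean)  # sample mean close to e^{-alpha t} X_0
```

Unlike an Euler scheme, this update has no discretization bias, so only Monte Carlo error remains.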
11.7
Writing the equation dr_t = b(a - r_t)\,dt + \sigma\,dB_t in integral form, taking expectations, and using that E(B_t) = 0:

E(r_t) - E(r_0) = \int_0^t b(a - E(r_s))\,ds + \sigma E(B_t) = \int_0^t b(a - E(r_s))\,ds.

Taking derivatives, the mean h_t = E(r_t) satisfies

dh_t = b(a - h_t)\,dt.

This equation is solved by separating variables. Integrating from 0 to t and performing the change of variable u = h_s, we have \ln\frac{a - h_0}{a - h_t} = bt, and finally

h_t = a - e^{-bt}(a - h_0).

Note that in the long run the rate approaches the value a: \lim_{t\to\infty} E(r_t) = \lim_{t\to\infty} h_t = a.
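The closed form h_t = a - e^{-bt}(a - h_0) can be checked against an Euler scheme for the ODE dh_t = b(a - h_t)\,dt; a sketch (parameter values are arbitrary):

```python
import math

b, a, h0, t = 0.5, 0.04, 0.10, 2.0
n = 100_000
dt = t / n

# Euler scheme for dh = b(a - h) dt
h = h0
for _ in range(n):
    h += b * (a - h) * dt

exact = a - math.exp(-b * t) * (a - h0)
print(h, exact)  # the two agree to several decimal places
```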
11.8
X_t = r_t - h_t satisfies

dX_t = dr_t - dh_t = b(a - r_t)\,dt + \sigma\,dB_t - b(a - h_t)\,dt = -b(r_t - h_t)\,dt + \sigma\,dB_t,

or dX_t = -b X_t\,dt + \sigma\,dB_t. But this is the Ornstein-Uhlenbeck equation, whose solution was found earlier:

X_t = e^{-bt} X_0 + \sigma \int_0^t e^{-b(t-s)}\,dB_s.

Hence

r_t = X_t + h_t = h_t + \sigma \int_0^t e^{-b(t-s)}\,dB_s = a - e^{-bt}(a - r_0) + \sigma \int_0^t e^{-b(t-s)}\,dB_s.
11.9
Given independent Brownian motions B_1 and B_2, the process W = \rho B_1 + \sqrt{1 - \rho^2}\, B_2 is a Brownian motion correlated with B_1.
11.10
The Itô integral \int_0^t X_s\,dB_s is defined for adapted processes X_t with \int_0^t X_s^2\,ds < \infty.

If \int_0^T E(X_s^2)\,ds < \infty, then \int_0^t X_s\,dB_s, t \le T, is a martingale, with

E\int_0^t X_s\,dB_s = 0, \quad E\Big(\int_0^t X_s\,dB_s\Big)^2 = \int_0^t E(X_s^2)\,ds.

If X_t is deterministic, then \int_0^t X_s\,dB_s is a Normal random variable.
Conventions:

df(X, Y) = \frac{\partial f}{\partial x}(X, Y)\,dX + \frac{\partial f}{\partial y}(X, Y)\,dY + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(X, Y)(dX)^2 + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(X, Y)(dY)^2 + \frac{\partial^2 f}{\partial x \partial y}(X, Y)(dX)(dY).
12 Options

12.1 Financial Concepts

Markets

In Finance a market is where people sell and buy financial papers (agreements): for example, the stock market, bond market, currencies market (FX), options markets, etc.

http://www.asx.com.au/products/all-products.htm

Look up BHP. Price and history chart.

Shares
To raise capital a company issues shares to shareholders. By buying a share a shareholder owns a part of that company. Prices of shares are determined by the market (ASX) and fluctuate in time.

Example: a paper that represents 1 share of BHP.

1 SHARE of BHP

On February 23, 2011, the price of 1 BHP share was $46.58. On March 27, 2008, it was $24.80.

Notation: the price of a share at time t is denoted by S_t.
Options

OPTION on BHP

This paper gives its holder the right to buy 1 share of BHP for K at time T (or before).

Example: T = 23/06/2011, K = $46.00. The price on 23/2/2011 is $2.695. An option contract is on 1000 shares, so it costs $2695.

More formally, an option is a contract between two parties, the buyer and the seller, which either

1. gives its holder the right (not the obligation) to buy a certain amount of shares of stock at the agreed price at any time on (or before) a given date (call option); or

2. gives its holder the right (not the obligation) to sell a certain amount of shares of stock at the agreed price at any time on (or before) a given date (put option).

We denote by T the given date and by K the agreed (strike) price. The contract is set up at time t prior to T.
Types

Bond

A bond is an instrument of indebtedness of the bond issuer to the holders. The issuer owes the holders a debt and, depending on the terms of the bond, is obliged to pay them interest and/or to repay the debt at a later date, which we call the maturity date.

Savings account
This is because: if S_T \le K the option is worthless (it gives the holder the right to buy stock for K from the writer, but he/she can buy it from the market for S_T \le K); if S_T > K then the option is worth S_T - K, because the holder can buy the share for K instead of the price S_T. Hence

C_T = \max(S_T - K, 0).
12.2 Functions x^+ and x^-

Here x^+ = \max(x, 0) and x^- = \max(-x, 0), so that |x| = x^+ + x^-.

Payoff graph. Let S_T = x; then we obtain the payoff function of the option. Here we take K = 10 and write x for S.

A European call with strike K pays \max(x - K, 0) (option to buy):

Payoff(x) = 0 if x \le 10, and x - 10 if x > 10; that is, (x - 10)^+.

A European put with strike K pays \max(K - x, 0) (option to sell):

Payoff(x) = 0 if x \ge 10, and 10 - x if x < 10; that is, (10 - x)^+.
12.3
But this is not so, or at least we have to choose the distribution carefully! In fact, it could give possibilities of arbitrage. Arbitrage will be defined rigorously below; roughly, it is the possibility for an agent to make money with no risk. We shall see this first in the simplest model to price this option: the one-period model.
12.4 Binomial Model

The current price is 10; over one period the stock either goes up to 12 or down to 8. Suppose the interest rate over the period is 10%.

Consider pricing of a call option with exercise price K = 10. Suppose the call is priced at $1 per share. We claim that this price allows one to make a profit out of nothing without taking any risk (arbitrage).

Consider the strategy: buy call options on 200 shares and sell 100 shares of stock. (At this stage it is not clear why we chose such a strategy.) Look at what happens under all possibilities of the model:

                            now   S1 = 12   S1 = 8
  Buy 200 options          -200       400        0
  Sell (short) 100 shares  1000     -1200     -800
  Invest                   -800       880      880
  Profit                      0       +80      +80
In either case a profit of $80 is realized. In a case like this it is said that there exists an arbitrage, i.e. a strategy for making money with no risk involved, also known as a free lunch.

Thus the price of $1 for the option allows for arbitrage: $1 is too little. Arbitrage strategies are not allowed by the theory.

Suppose the call is priced at $2. Then the opposite strategy gives an arbitrage: sell calls on 200 shares and buy 100 shares.

                       now   S1 = 12   S1 = 8
  Sell 200 options     400      -400        0
  Buy 100 shares     -1000      1200      800
  Borrow               600      -660     -660
  Profit                 0      +140     +140

In this case the reverse strategy gives an arbitrage opportunity. The price that does not allow for arbitrage strategies is $1.36. How to compute it? We show this in the next section.
12.5 One period, T = 1

Assume that in one period the stock price moves up by a factor u or down by a factor d, with d < 1 < u. In the previous example, d = 8/10 and u = 12/10. So the model for the random future price of the stock at time 1, S_1, is: uS or dS. These values are realized with some probabilities, but it turns out that the probabilities are not important for the purpose of option pricing.

Current price = S; at time 1 the price is either uS (up) or dS (down).

Savings account: in the one-period model (and other discrete-time models) the interest rate r > 1 (e.g. 10% corresponds to r = 1.1). It corresponds to e^r in continuous-time models. The value of the option at time 1 is denoted by C_1: C_u after an up move and C_d after a down move.
12.6 Replicating Portfolio

A replicating portfolio holds a shares of stock and b in the savings account; its value at time 1 must equal the claim C_1. Since this portfolio is equivalent to the claim C, we obtain two equations with two unknowns:

a u S + b r = C_u,
a d S + b r = C_d.

Solving them gives

a = \frac{C_u - C_d}{(u - d)S}, \qquad b = \frac{u C_d - d C_u}{(u - d)r}.

Thus to avoid arbitrage, C must equal

C = aS + b.

This is because if C is larger than the portfolio value, then we can use the following strategy to make money: sell the option and buy the portfolio; in one period they will be the same. If it is priced below this value, then buy it and sell the portfolio. This gives an arbitrage strategy.
Example (continued): S = 10, u = 1.2 (uS = 12), d = 0.8 (dS = 8), and C_1 = (S_1 - K)^+ with K = 10. Let us find the price C of this option. Here C_u = 2 and C_d = 0. Take r = 1.1. Then by solving the equations for a and b we find

a = 0.5, \qquad b = -3.64.

Thus this option is replicated by the portfolio consisting of borrowing 3.64 dollars and buying 0.5 shares of stock. The initial value of this portfolio is

0.5 \times 10 - 3.64 = 1.36,

which gives the no-arbitrage value for the call option.
12.7
The price

C = \frac{1}{r}\big[p C_u + (1 - p) C_d\big], \quad \text{with } p = \frac{r - d}{u - d},

can be viewed as the discounted expected payoff of the claim, with probability p of an up movement and (1 - p) of a down movement.

This probability p is calculated from the given returns of the stock and has nothing to do with a subjective personal assessment of the market going up or down.

This recovers the main principle of pricing options by no arbitrage, which applies in all other models: the price of an option is the expected discounted payoff, but under a new probability.

C_1 = C_u with probability p, and C_1 = C_d with probability 1 - p, so C = \frac{1}{r} E(C_1).

For the call option C_1 = (S_1 - K)^+ and

C = \frac{1}{r} E(S_1 - K)^+ = \frac{1}{r}\big[(uS - K)^+ p + (dS - K)^+ (1 - p)\big] = \frac{1}{r}\, p\, C_u.

In our example p = \frac{1.1 - 0.8}{1.2 - 0.8} = 0.75, so C = \frac{1}{1.1} \times 2 \times 0.75 = 1.36.
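Both routes to the price, replication and the risk-neutral formula, can be sketched in a few lines using the values of the example above:

```python
S, u, d, r, K = 10.0, 1.2, 0.8, 1.1, 10.0
Cu = max(u * S - K, 0.0)  # payoff after an up move  -> 2
Cd = max(d * S - K, 0.0)  # payoff after a down move -> 0

# Replicating portfolio: a shares of stock plus b in the savings account.
a = (Cu - Cd) / ((u - d) * S)
b = (u * Cd - d * Cu) / ((u - d) * r)
price_replication = a * S + b

# Risk-neutral probability and discounted expected payoff.
p = (r - d) / (u - d)
price_risk_neutral = (p * Cu + (1 - p) * Cd) / r

print(a, b)               # 0.5 and about -3.64
print(price_replication)  # about 1.36
print(price_risk_neutral) # the same value
```

The two prices coincide by construction: the risk-neutral formula is just the algebraic simplification of aS + b.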
12.8
Consider two random variables X and Y. We say that \{X, Y\} is a two-step martingale if E(Y|X) = X and both E(|X|) and E(|Y|) are finite. Next we show that the price in the one-period binomial model is connected with a particular two-step martingale.

Theorem 26. The discounted stock price S_t / r^t, t = 0, 1, is a martingale under the new probability p = \frac{r - d}{u - d}.

Proof: Since there are only two values, S_0 and S_1, all we need to check is that E(S_1 / r \mid S_0) = S_0. Indeed,

E(S_1 \mid S_0) = u S_0\, p + d S_0\,(1 - p) = u S_0 \frac{r - d}{u - d} + d S_0 \frac{u - r}{u - d} = S_0 r.
12.9
The one-period formula can be applied recursively to price the claim C when trading is done one period after another.

Take a 2-period model, T = 2. If all the parameters (r, u, d) are the same for both periods, the tree of option values is

C
  C_u
    C_{uu}
    C_{ud}
  C_d
    C_{du}
    C_{dd}

where

C_u = \frac{1}{r}\big[p C_{uu} + (1 - p) C_{ud}\big], \qquad C_d = \frac{1}{r}\big[p C_{du} + (1 - p) C_{dd}\big],

and using the formula again (with C_{ud} = C_{du}),

C = \frac{1}{r}\big[p C_u + (1 - p) C_d\big] = \frac{1}{r^2}\big[p^2 C_{uu} + 2p(1 - p) C_{ud} + (1 - p)^2 C_{dd}\big] = \frac{1}{r^2} E(C_2).

Multiperiod model

Continuing by induction, if C_{udu\ldots du} = C_{u\ldots u d\ldots d} (the value depends only on the numbers of up and down moves), then

C = \frac{1}{r^T} \sum_{i=0}^{T} \binom{T}{i} p^i (1 - p)^{T - i}\, C_{\underbrace{u\ldots u}_{i}\underbrace{d\ldots d}_{T - i}}.
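The closed-form sum can be checked against backward induction on the tree. A sketch for a call option, with the example's parameters (math.comb supplies the binomial coefficients):

```python
from math import comb

def price_closed_form(S, K, u, d, r, T):
    """C = r^{-T} * sum_i binom(T,i) p^i (1-p)^{T-i} * payoff(u^i d^{T-i} S)."""
    p = (r - d) / (u - d)
    return sum(comb(T, i) * p**i * (1 - p)**(T - i)
               * max(S * u**i * d**(T - i) - K, 0.0)
               for i in range(T + 1)) / r**T

def price_backward(S, K, u, d, r, T):
    """One-period formula applied recursively from the final payoffs."""
    p = (r - d) / (u - d)
    values = [max(S * u**i * d**(T - i) - K, 0.0) for i in range(T + 1)]
    for _ in range(T):  # step back one period at a time
        values = [(p * values[i + 1] + (1 - p) * values[i]) / r
                  for i in range(len(values) - 1)]
    return values[0]

print(price_closed_form(10, 10, 1.2, 0.8, 1.1, 1))  # about 1.36
print(price_closed_form(10, 10, 1.2, 0.8, 1.1, 2))  # two-period price
print(price_backward(10, 10, 1.2, 0.8, 1.1, 2))     # same value
```

For T = 1 this reproduces the $1.36 price found above; for any T the two functions agree.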
12.10 Black-Scholes formula

The Black-Scholes price at time t of a call option is

C = S_t\,\Phi(h_t) - K e^{-r(T - t)}\,\Phi\big(h_t - \sigma\sqrt{T - t}\big),

where

\Phi(h) is the standard Normal distribution function (also denoted by N(h) in finance): \Phi(h) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{h} e^{-x^2/2}\,dx;

S_t is the stock price at time t;

r is the continuously compounding interest rate;

\sigma is the volatility, the standard deviation of the return on the stock;

T is the exercise (maturity) time of the call, and T - t is the time remaining to expiration;

K is the exercise (strike) price;

h_t = \frac{\ln(S_t/K) + (r + \frac{1}{2}\sigma^2)(T - t)}{\sigma\sqrt{T - t}}.

Remarks. \Phi(h_t) gives the number of shares held in the replicating portfolio, the \Delta of the portfolio. K e^{-r(T-t)}\,\Phi(h_t - \sigma\sqrt{T - t}) gives the amount borrowed in the replicating portfolio.
Example. On July 28, 2000 the following information is found: BHP last sale S = 18.50; for the August call option the strike is K = 18.50. Working out the Black-Scholes value:

Time to expiration = 1 month = 1/12 = 0.083.

Interest rate r = 0.062 (from the Bank Bill rate).

Take volatility \sigma = 0.25. Hence the Black-Scholes call price is C = 0.5789.
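A sketch of the Black-Scholes computation for this example, using only the standard library (\Phi is expressed through math.erf):

```python
import math

def Phi(x):
    """Standard Normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes price of a call with time tau = T - t to expiration."""
    h = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return S * Phi(h) - K * math.exp(-r * tau) * Phi(h - sigma * math.sqrt(tau))

price = bs_call(S=18.50, K=18.50, r=0.062, sigma=0.25, tau=1 / 12)
print(round(price, 4))  # close to the 0.5789 quoted above
```

Any small discrepancy in the last decimal places comes from rounding the time to expiration (1/12 versus 0.083).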
13
The market model involves two or more assets. One is riskless (the savings account) with value \beta_t at time t. The others are risky assets (stocks); we consider only one risky asset, S_t.

The model for \beta_t is \beta_t = e^{rt}, i.e. d\beta_t = r\beta_t\,dt. The model for stock prices is given by the SDE

dS_t = \mu S_t\,dt + \sigma S_t\,dB_t.

Here \mu is the annual yield on the stock (the mean of returns), and \sigma is the volatility (the standard deviation of returns). We have seen that the solution to this SDE is

S_t = S_0 e^{(\mu - \frac{\sigma^2}{2})t + \sigma B_t}.
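The closed-form solution gives an exact way to simulate S_t; for instance the mean E S_t = S_0 e^{\mu t} (a property of the Lognormal distribution) can be checked by Monte Carlo. A sketch (parameter values are arbitrary):

```python
import random, math

random.seed(3)
S0, mu, sigma, t = 1.0, 0.10, 0.20, 1.0
n = 200_000

total = 0.0
for _ in range(n):
    B = random.gauss(0.0, math.sqrt(t))  # B_t ~ N(0, t)
    total += S0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * B)

mc_mean = total / n
print(mc_mean)                 # sample mean of S_t
print(S0 * math.exp(mu * t))   # exact E S_t = S_0 e^{mu t}
```

Simulating from the exact solution avoids the discretization bias of an Euler scheme for the SDE.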
13.1 Self-financing Portfolios

This implies that the value of a self-financing portfolio at any time equals its initial value plus the gain from trade:

V_t = V_0 + \int_0^t a_u\,dS_u + \int_0^t b_u\,d\beta_u.
13.2
Proof: If C_t < V_t, then sell the portfolio and buy the option. The difference is V_t - C_t > 0. At time T the values of the option and the portfolio are the same (the condition of the Theorem). It costs nothing to evolve the portfolio, as it is self-financing. Thus we have an arbitrage profit of V_t - C_t plus the interest on it. Since arbitrage is not allowed, we must rule out C_t < V_t.

If C_t > V_t, then the opposite strategy, selling the option and buying the portfolio, results in an arbitrage profit. Thus we cannot have C_t > V_t. The only possibility left is C_t = V_t.
In finance a self-financing replicating portfolio is called a hedge.
13.3
On the other hand, by Itô's formula,

dC_t = \frac{\partial C}{\partial x}(S_t, t)\,dS_t + \frac{\partial C}{\partial t}(S_t, t)\,dt + \frac{1}{2}\frac{\partial^2 C}{\partial x^2}(S_t, t)\,(dS_t)^2.

Comparing the two equations (separating the terms with dS_t and dt) we obtain

a_t = \frac{\partial C}{\partial x}(S_t, t)

and

b_t\,d(e^{rt}) = \Big(\frac{\partial C}{\partial t} + \frac{1}{2}\frac{\partial^2 C}{\partial x^2}\,\sigma^2 S_t^2\Big)dt.
13.4
We derive the PDE for the price of the option, and then give its solution (we don't solve it here). Putting these back into the equation C_t = V_t,

C_t = a_t S_t + b_t e^{rt},

and replacing S_t by x, we obtain the Black-Scholes PDE:

\frac{1}{2}\sigma^2 x^2 \frac{\partial^2 C}{\partial x^2} + r x \frac{\partial C}{\partial x} + \frac{\partial C}{\partial t} - rC = 0,

with the boundary condition for a call option with exercise price K: C(x, T) = (x - K)^+.

The Black-Scholes formula solves this PDE, with

h_t = \frac{\ln(x/K) + (r + \frac{1}{2}\sigma^2)(T - t)}{\sigma\sqrt{T - t}},

by direct verification.
Proof:
Corollary. The replicating self-financing portfolio for a call option in the Black-Scholes model is given by holding \Phi(h_t) shares and borrowing K e^{-r(T-t)}\,\Phi(h_t - \sigma\sqrt{T - t}).
13.5
It can be seen, by using calculations with the Lognormal random variable, that the Black-Scholes formula can be written as the discounted expected final payoff of the option,

C = e^{-rT} E_Q (S_T - K)^+,

but under a different probability Q. This probability makes the discounted stock price S_t e^{-rt} into a martingale. Q is called an equivalent martingale measure (EMM), also known as the risk-neutral probability.
13.6
Options are priced not under the real probability measure but under the risk-neutral EMM Q. For calculations of option prices, including simulations, the equation for the stock under Q must be used, not the original model.

Theorem 28. There is a Brownian motion \tilde{B}_t such that X_t = S_t e^{-rt} = X_0 e^{\sigma \tilde{B}_t - \frac{\sigma^2}{2} t} is a martingale. Further, the SDE for S_t with the new Brownian motion is

dS_t = r S_t\,dt + \sigma S_t\,d\tilde{B}_t,

with the solution for S_T

S_T = S_0 e^{(r - \frac{1}{2}\sigma^2)T + \sigma \tilde{B}_T}.

By Girsanov's theorem with c = \frac{\mu - r}{\sigma}, there is Q so that \tilde{B}_t = ct + B_t is a Brownian motion. Hence under Q the SDE for the discounted stock price X_t = S_t e^{-rt} is

dX_t = \sigma X_t\,d\tilde{B}_t.

Solving this, we have

X_t = X_0 e^{\sigma \tilde{B}_t - \frac{\sigma^2}{2} t}.
Remarks. The effect of Q is changing \mu to r in the coefficient of dt. Financially this makes sense: in the risk-neutral world (Q) the return is r (the same as the risk-free rate), not \mu (an average return \mu > r can only be due to uncertainty in returns, i.e. when there is a possibility of losses).

When the price of an option is evaluated by simulations, the SDE for the stock under Q must be used.
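For example, the call of the earlier Black-Scholes example can be priced by simulating S_T under Q (drift r, not \mu) and discounting. A sketch, reusing S = K = 18.50, r = 0.062, \sigma = 0.25, T - t = 1/12:

```python
import random, math

random.seed(4)
S0, K, r, sigma, tau = 18.50, 18.50, 0.062, 0.25, 1 / 12
n = 200_000

total = 0.0
for _ in range(n):
    z = random.gauss(0.0, 1.0)
    # S_T under the risk-neutral measure Q: drift r, not mu.
    ST = S0 * math.exp((r - 0.5 * sigma**2) * tau + sigma * math.sqrt(tau) * z)
    total += max(ST - K, 0.0)

price = math.exp(-r * tau) * total / n
print(price)  # close to the Black-Scholes value of about 0.58
```

Note that \mu never appears: it was removed by the change of measure, exactly as the remark above says.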
Girsanov's theorem states that if we have a Brownian motion with drift, then there is an equivalent measure under which this process is a Brownian motion.

Theorem 29 (Girsanov). Let B_t, 0 \le t \le T, be a Brownian motion (under the original probability measure P) and let c be a constant. Then there exists an equivalent measure Q such that the process \tilde{B}_t = B_t + ct is a Q-Brownian motion.

The proof is outside this course.
14
14.1
Pricing options.

Definition: A contingent claim (derivative) with delivery time T is a random variable X \in \mathcal{F}_T. It represents the amount X paid at t = T to the holder of the claim by the seller.

Example (European call option): X = \max[S_T - K, 0] = (S_T - K)^+, where S_T is the stock price at time T.

We want to find a price so that there are no arbitrage possibilities.

Arbitrage

An arbitrage strategy is a way to make money out of nothing without taking risk. An arbitrage possibility is a mis-pricing in the market. In the mathematical theory of options, models in which arbitrage strategies exist are not allowed.
14.2 Arbitrage
14.3
The first theorem gives a necessary and sufficient condition for models not to have arbitrage strategies.

Theorem 30 (First fundamental theorem). A model does not have arbitrage strategies if and only if an equivalent martingale measure (EMM) exists.

Since we have seen that there is an EMM in the Black-Scholes model, we have

Corollary. The Black-Scholes model does not have arbitrage.

Remark. It is possible to prove this directly by showing that the discounted portfolio V_t e^{-rt} is a martingale (see Theorem 32) and using the fact that a martingale has a constant mean.

In the binomial model the EMM is given by p = \frac{r - d}{u - d}.
14.4
Since the EMM Q is unique, the Black-Scholes market model is complete. This means that any option can be replicated by a self-financing portfolio, and therefore priced by the no-arbitrage approach.
14.5
We know that the arbitrage method consists of finding a self-financing replicating portfolio; then C_t = V_t. But how do we find V_t? It is possible to give a general formula. It relies on the insight that the discounted portfolio can be represented as an integral with respect to the discounted stock price.

Theorem 32. If the discounted stock price is a martingale, then the discounted value of a self-financing portfolio is also a martingale.

Proof: The discounted value can be written as

V_t e^{-rt} = V_0 + \int_0^t a_u\,d(S_u e^{-ru}),

and a stochastic integral with respect to a martingale is again a martingale.

Corollary (Price Formula). The price of an option is given by the discounted expected payoff taken under the martingale probability; e.g. for the call option the price at time t is

C_t = e^{-r(T - t)}\, E_Q(X \mid \mathcal{F}_t).

For example, for the call option

C_t = e^{-r(T - t)}\, E_Q\big((S_T - K)^+ \mid S_u,\, u \le t\big).

Proof: To avoid arbitrage, C_t = V_t.

Remark. If the interest rate is itself random, r_t, and the savings account is given by \beta_t = e^{\int_0^t r_s\,ds}, then the pricing formula takes the form

C_t = E_Q\Big(e^{-\int_t^T r_s\,ds}\, X \mid \mathcal{F}_t\Big). \qquad (2)
14.6 Summary

The price of an option is its discounted expected payoff under the EMM; in the binomial model the EMM is given by p = \frac{r - d}{u - d}.
15
15.1
If $1 is invested at time t until time T > t, it will result in an amount greater than $1 due to interest. The length of the investment period, T - t, is called the term. Money invested for different terms yields a different rate of interest. The function R(t, T), as a function of the argument T, is called the yield curve, or the term structure of interest rates.

The rates are not traded. They are derived from the prices of bonds, which are traded on the bond market. This leads to the construction of models for bonds and no-arbitrage pricing for bonds and their options. In this section we denote the standard Brownian motion by W_t rather than B_t (this is because in other texts the bond is sometimes denoted by B_t).
15.2
The yield to maturity is

R(t, T) = -\frac{\ln P(t, T)}{T - t},

and as a function of T it is called the yield curve at time t. Assume also that a savings account paying the instantaneous rate r(t) at time t, called the spot (or short) rate, is available. $1 invested until time t will result in

\beta(t) = e^{\int_0^t r(s)\,ds}.

15.3
To avoid arbitrage between bonds and the savings account, a certain relation must hold between bonds and the spot rate. If there were no uncertainty, then to avoid arbitrage the following relation would have to hold:

P(t, T) = e^{-\int_t^T r(s)\,ds},

since investing either of these amounts at time t results in $1 at time T. When the rate is random, then \int_t^T r(s)\,ds is also random and in the future of t, whereas the price P(t, T) is known at time t, and the above relation holds only on average.
The no-arbitrage approach is used for pricing bonds and their options. The market model for bonds is incomplete; hence there are many EMMs. The model for rates is often specified under the EMM Q. We can use the fundamental theorem to price a bond as an option on the rate. By arbitrage pricing theory, the price of the bond P(t, T) is given by

P(t, T) = E_Q\big(e^{-\int_t^T r(s)\,ds} \mid \mathcal{F}_t\big).
15.4
dr(t) = \mu\,dt + \sigma\,dW(t).

The Vasicek model:

dr(t) = b(a - r(t))\,dt + \sigma\,dW(t).

The CIR model:

dr(t) = b(a - r(t))\,dt + \sigma\sqrt{r(t)}\,dW(t).
15.5
Forward rates

Forward rates f(t, T), t \le T, are defined by the relation

P(t, T) = e^{-\int_t^T f(t, u)\,du}, \quad \text{equivalently} \quad f(t, T) = -\frac{\partial \ln P(t, T)}{\partial T}.

The spot rate is r(t) = f(t, t). Consequently the savings account \beta(t) grows according to

\beta(t) = e^{\int_0^t f(s, s)\,ds}.

The class of models suggested by Heath, Jarrow, and Morton (1992) is based on modelling the forward rates. We don't cover this.
15.6
Recall Vasicek's model for the interest rate. We have seen that the solution to Vasicek's SDE

dr(t) = b(a - r(t))\,dt + \sigma\,dW(t)

is given by

r_t = r_0 e^{-bt} + a(1 - e^{-bt}) + \sigma \int_0^t e^{-b(t-s)}\,dW_s.

The bond price in the Vasicek model has the form P(t, T) = e^{A(\tau) - C(\tau) r_t}, where \tau = T - t and

C(\tau) = \frac{1 - e^{-b\tau}}{b}, \qquad A(\tau) = (C(\tau) - \tau)\Big(a - \frac{\sigma^2}{2b^2}\Big) - \frac{\sigma^2 C(\tau)^2}{4b}.

From the bond prices the forward rates can be determined, and then the yield curve. Exercise: find these for Vasicek's model.
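The formulas for A(\tau) and C(\tau) can be sanity-checked: for a very short term \tau the bond price must be close to e^{-r\tau}, since A(\tau) = O(\tau^2) and C(\tau) \approx \tau. A sketch (parameter values are arbitrary):

```python
import math

def vasicek_bond(r, tau, a, b, sigma):
    """P(t, T) = exp(A(tau) - C(tau) r) in the Vasicek model, tau = T - t."""
    C = (1 - math.exp(-b * tau)) / b
    A = (C - tau) * (a - sigma**2 / (2 * b**2)) - sigma**2 * C**2 / (4 * b)
    return math.exp(A - C * r)

r, a, b, sigma = 0.05, 0.04, 0.5, 0.01
tau = 0.001
print(vasicek_bond(r, tau, a, b, sigma))  # close to exp(-r * tau)
print(math.exp(-r * tau))
```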
15.7
The CIR SDE has the same drift as Vasicek's, but its diffusion coefficient is the square root. Unlike Vasicek's model, the CIR process is always positive (this is not proved here):

dr(t) = b(a - r(t))\,dt + \sigma\sqrt{r(t)}\,dW(t).

Bond prices have a similar form, with different functions A(\tau) and C(\tau):

P(t, T) = e^{A(\tau) - C(\tau) r_t},

where, with \gamma = \sqrt{b^2 + 2\sigma^2},

C(\tau) = \frac{2(e^{\gamma\tau} - 1)}{(\gamma + b)(e^{\gamma\tau} - 1) + 2\gamma}, \qquad A(\tau) = \frac{2ab}{\sigma^2} \log\frac{2\gamma\, e^{(\gamma + b)\tau/2}}{(\gamma + b)(e^{\gamma\tau} - 1) + 2\gamma}.

From the bond prices the forward rates can be determined, and then the yield curve. Exercise: find these for the CIR model.
15.8
Options on bonds

A call option on a bond gives its holder the right to buy the T-bond at time S < T. It pays (P(S, T) - K)^+ at time S. The arbitrage-free price of this call at time t < S is given by the option pricing formula, replacing X by its expression in this case:

E_Q\Big(e^{-\int_t^S r_u\,du}\,(P(S, T) - K)^+ \mid \mathcal{F}_t\Big).

In Vasicek's model the conditional distribution of \int_t^T r(s)\,ds given \mathcal{F}_t is the same as that given r(t) (the Markov property), and is a Normal distribution. Hence in Vasicek's model the price of bonds is Lognormal with known mean and variance, and a closed-form expression for the price of an option on the bond can be obtained. It looks like a version of the Black-Scholes formula.

Options on bonds are used to cap interest rates. It can be seen that a cap corresponds to a put option, and a floor to a call option.

A cap is a contract that gives its holder the right to pay the smaller of two rates of interest: the floating rate and the rate k specified in the contract. A party holding the cap will never pay a rate exceeding k; the rate of payment is capped at k. Since the payments are made at a sequence of payment dates T_1, T_2, \ldots, T_n, called a tenor, with T_{i+1} = T_i + \delta (e.g. \delta = \frac{1}{4} of a year), the rate is capped over intervals of time of length \delta. Thus a cap is a collection of caplets.

Consider a caplet over [T, T + \delta]. Without the caplet, the holder of a loan must pay at time T + \delta an interest payment of \delta f, where f is the floating simple rate over [T, T + \delta].
(Diagram: the tenor of payment dates t, T_0, T_1, T_2, \ldots, T_i, T_{i+1}, \ldots, T_n, with the floating rate f_i set over each interval.)
15.9
We show next that a caplet is in effect a put option on the bond. From the basic relation (EMM), P(T, T + \delta) = E\big(\frac{\beta(T)}{\beta(T + \delta)} \mid \mathcal{F}_T\big). Proceeding from (15.8) by the law of double expectation, with E = E_Q:

Caplet(t) = E\Big(E\Big(\frac{\beta(t)}{\beta(T)}\,\frac{\beta(T)}{\beta(T + \delta)}\,\delta\Big(\frac{1}{\delta}\Big(\frac{1}{P(T, T + \delta)} - 1\Big) - k\Big)^+ \,\Big|\, \mathcal{F}_T\Big) \,\Big|\, \mathcal{F}_t\Big)

= E\Big(\frac{\beta(t)}{\beta(T)}\,\delta\Big(\frac{1}{\delta}\Big(\frac{1}{P(T, T + \delta)} - 1\Big) - k\Big)^+ E\Big(\frac{\beta(T)}{\beta(T + \delta)} \,\Big|\, \mathcal{F}_T\Big) \,\Big|\, \mathcal{F}_t\Big)

= (1 + \delta k)\, E\Big(\frac{\beta(t)}{\beta(T)}\Big(\frac{1}{1 + \delta k} - P(T, T + \delta)\Big)^+ \,\Big|\, \mathcal{F}_t\Big). \qquad (4)

Thus a caplet is a put option on P(T, T + \delta) with strike \frac{1}{1 + \delta k} and exercise time T. In practical modelling, as in models with deterministic volatilities, the distribution of P(T, T + \delta) is Lognormal, giving rise to a Black-Scholes type formula for a caplet: Black's (1976) formula.