Steven Shreve. Lectures On Stochastic Calculus and Finance

Steven Shreve: Stochastic Calculus and Finance
P RASAD C HALASANI Carnegie Mellon University chal@cs.cmu.edu S OMESH J HA Carnegie Mellon University sjha@cs.cmu.edu
THIS IS A DRAFT: PLEASE DO NOT DISTRIBUTE c Copyright; Steven E. Shreve, 1996 October 6, 1997
Contents
1 Introduction to Probability Theory 1.1 1.2 1.3 1.4 1.5 The Binomial Asset Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . Finite Probability Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lebesgue Measure and the Lebesgue Integral . . . . . . . . . . . . . . . . . . . . General Probability Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.5.1 1.5.2 1.5.3 1.5.4 1.5.5 1.5.6 1.5.7 Independence of sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Independence of -algebras . . . . . . . . . . . . . . . . . . . . . . . . . Independence of random variables . . . . . . . . . . . . . . . . . . . . . . Correlation and independence . . . . . . . . . . . . . . . . . . . . . . . . Independence and conditional expectation. . . . . . . . . . . . . . . . . . Law of Large Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 11 16 22 30 40 40 41 42 44 45 46 47 49 49 50 52 52 53 54 55 57 58
2 Conditional Expectation 2.1 2.2 2.3 A Binomial Model for Stock Price Dynamics . . . . . . . . . . . . . . . . . . . . Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Conditional Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.1 2.3.2 2.3.3 2.3.4 2.3.5 2.4 An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Denition of Conditional Expectation . . . . . . . . . . . . . . . . . . . . Further discussion of Partial Averaging . . . . . . . . . . . . . . . . . . . Properties of Conditional Expectation . . . . . . . . . . . . . . . . . . . . Examples from the Binomial Model . . . . . . . . . . . . . . . . . . . . .
Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 3 Arbitrage Pricing 3.1 3.2 3.3 Binomial Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . General one-step APT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Risk-Neutral Probability Measure . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 3.3.2 3.4 3.5 Portfolio Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Self-nancing Value of a Portfolio Process . . . . . . . . . . . . . . . . 59 59 60 61 62 62 63 64 67 67 69 70 70 73 74 77 77 79 81 85 85 86 88 89 91 91 92 94 97 97
Simple European Derivative Securities . . . . . . . . . . . . . . . . . . . . . . . . The Binomial Model is Complete . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 The Markov Property 4.1 4.2 4.3 4.4 4.5 Binomial Model Pricing and Hedging . . . . . . . . . . . . . . . . . . . . . . . . Computational Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Markov Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3.1 Different ways to write the Markov property . . . . . . . . . . . . . . . . Showing that a process is Markov . . . . . . . . . . . . . . . . . . . . . . . . . . Application to Exotic Options . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 Stopping Times and American Options 5.1 5.2 5.3 American Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Value of Portfolio Hedging an American Option . . . . . . . . . . . . . . . . . . . Information up to a Stopping Time . . . . . . . . . . . . . . . . . . . . . . . . . .
6 Properties of American Derivative Securities 6.1 6.2 6.3 6.4 The properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Proofs of the Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Compound European Derivative Securities . . . . . . . . . . . . . . . . . . . . . . Optimal Exercise of American Derivative Security . . . . . . . . . . . . . . . . . .
7 Jensens Inequality 7.1 7.2 7.3 Jensens Inequality for Conditional Expectations . . . . . . . . . . . . . . . . . . . Optimal Exercise of an American Call . . . . . . . . . . . . . . . . . . . . . . . . Stopped Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8 Random Walks 8.1 First Passage Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 is almost surely nite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The moment generating function for Expectation of . . . . . . . . . . . . . . . . . . . . . . . . 97 99
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
The Strong Markov Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 General First Passage Times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 Example: Perpetual American Put . . . . . . . . . . . . . . . . . . . . . . . . . . 102 Difference Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 Distribution of First Passage Times . . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.10 The Reection Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 9 Pricing in terms of Market Probabilities: The Radon-Nikodym Theorem. 9.1 9.2 9.3 9.4 9.5 111
Radon-Nikodym Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 Radon-Nikodym Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 The State Price Density Process . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 Stochastic Volatility Binomial Model . . . . . . . . . . . . . . . . . . . . . . . . . 116 Another Applicaton of the Radon-Nikodym Theorem . . . . . . . . . . . . . . . . 118 119
10 Capital Asset Pricing
10.1 An Optimization Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 11 General Random Variables 123
11.1 Law of a Random Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 11.2 Density of a Random Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 11.3 Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 11.4 Two random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 11.5 Marginal Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 11.6 Conditional Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 11.7 Conditional Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 11.8 Multivariate Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 11.9 Bivariate normal distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 11.10MGF of jointly normal random variables . . . . . . . . . . . . . . . . . . . . . . . 130 12 Semi-Continuous Models 131
12.1 Discrete-time Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4 12.2 The Stock Price Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 12.3 Remainder of the Market . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 12.4 Risk-Neutral Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 12.5 Risk-Neutral Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 12.6 Arbitrage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 12.7 Stalking the Risk-Neutral Measure . . . . . . . . . . . . . . . . . . . . . . . . . . 135 12.8 Pricing a European Call . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 13 Brownian Motion 139
13.1 Symmetric Random Walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 13.2 The Law of Large Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 13.3 Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 13.4 Brownian Motion as a Limit of Random Walks . . . . . . . . . . . . . . . . . . . 141 13.5 Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 13.6 Covariance of Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 13.7 Finite-Dimensional Distributions of Brownian Motion . . . . . . . . . . . . . . . . 144 13.8 Filtration generated by a Brownian Motion . . . . . . . . . . . . . . . . . . . . . . 144 13.9 Martingale Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 13.10The Limit of a Binomial Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 13.11Starting at Points Other Than 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 13.12Markov Property for Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . 147 13.13Transition Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 13.14First Passage Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 14 The It Integral o 153
14.1 Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 14.2 First Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 14.3 Quadratic Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 14.4 Quadratic Variation as Absolute Volatility . . . . . . . . . . . . . . . . . . . . . . 157 14.5 Construction of the It Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 o 14.6 It integral of an elementary integrand . . . . . . . . . . . . . . . . . . . . . . . . 158 o 14.7 Properties of the It integral of an elementary process . . . . . . . . . . . . . . . . 159 o 14.8 It integral of a general integrand . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 o
5 14.9 Properties of the (general) It integral . . . . . . . . . . . . . . . . . . . . . . . . 163 o 14.10Quadratic variation of an It integral . . . . . . . . . . . . . . . . . . . . . . . . . 165 o 15 It s Formula o 167
15.1 It s formula for one Brownian motion . . . . . . . . . . . . . . . . . . . . . . . . 167 o 15.2 Derivation of It s formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 o 15.3 Geometric Brownian motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 15.4 Quadratic variation of geometric Brownian motion . . . . . . . . . . . . . . . . . 170 15.5 Volatility of Geometric Brownian motion . . . . . . . . . . . . . . . . . . . . . . 170 15.6 First derivation of the Black-Scholes formula . . . . . . . . . . . . . . . . . . . . 170 15.7 Mean and variance of the Cox-Ingersoll-Ross process . . . . . . . . . . . . . . . . 172 15.8 Multidimensional Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . 173 15.9 Cross-variations of Brownian motions . . . . . . . . . . . . . . . . . . . . . . . . 174 15.10Multi-dimensional It formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 o 16 Markov processes and the Kolmogorov equations 177
16.1 Stochastic Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 16.2 Markov Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 16.3 Transition density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 16.4 The Kolmogorov Backward Equation . . . . . . . . . . . . . . . . . . . . . . . . 180 16.5 Connection between stochastic calculus and KBE . . . . . . . . . . . . . . . . . . 181 16.6 Black-Scholes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 16.7 Black-Scholes with price-dependent volatility . . . . . . . . . . . . . . . . . . . . 186 17 Girsanovs theorem and the risk-neutral measure 17.1 Conditional expectations under
f IP
189
. . . . . . . . . . . . . . . . . . . . . . . . . . 191
17.2 Risk-neutral measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 18 Martingale Representation Theorem 197
18.1 Martingale Representation Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 197 18.2 A hedging application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 18.3 18.4
d-dimensional Girsanov Theorem . . . . . . . . . . . d-dimensional Martingale Representation Theorem . .
. . . . . . . . . . . . . . . 199 . . . . . . . . . . . . . . . 200
18.5 Multi-dimensional market model . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
6 19 A two-dimensional market model 19.1 Hedging when ;1 < 19.2 Hedging when 203 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
=1
<1 .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 209
20 Pricing Exotic Options
20.1 Reection principle for Brownian motion . . . . . . . . . . . . . . . . . . . . . . 209 20.2 Up and out European call. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 20.3 A practical issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218 21 Asian Options 219
21.1 Feynman-Kac Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 21.2 Constructing the hedge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 21.3 Partial average payoff Asian option . . . . . . . . . . . . . . . . . . . . . . . . . . 221 22 Summary of Arbitrage Pricing Theory 223
22.1 Binomial model, Hedging Portfolio . . . . . . . . . . . . . . . . . . . . . . . . . 223 22.2 Setting up the continuous model . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 22.3 Risk-neutral pricing and hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 22.4 Implementation of risk-neutral pricing and hedging . . . . . . . . . . . . . . . . . 229 23 Recognizing a Brownian Motion 233
23.1 Identifying volatility and correlation . . . . . . . . . . . . . . . . . . . . . . . . . 235 23.2 Reversing the process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236 24 An outside barrier option 239
24.1 Computing the option value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242 24.2 The PDE for the outside barrier option . . . . . . . . . . . . . . . . . . . . . . . . 243 24.3 The hedge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245 25 American Options 247
25.1 Preview of perpetual American put . . . . . . . . . . . . . . . . . . . . . . . . . . 247 25.2 First passage times for Brownian motion: rst method . . . . . . . . . . . . . . . . 247 25.3 Drift adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249 25.4 Drift-adjusted Laplace transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 250 25.5 First passage times: Second method . . . . . . . . . . . . . . . . . . . . . . . . . 251
7 25.6 Perpetual American put . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 25.7 Value of the perpetual American put . . . . . . . . . . . . . . . . . . . . . . . . . 256 25.8 Hedging the put . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 25.9 Perpetual American contingent claim . . . . . . . . . . . . . . . . . . . . . . . . . 259 25.10Perpetual American call . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259 25.11Put with expiration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260 25.12American contingent claim with expiration . . . . . . . . . . . . . . . . . . . . . 261 26 Options on dividend-paying stocks 263
26.1 American option with convex payoff function . . . . . . . . . . . . . . . . . . . . 263 26.2 Dividend paying stock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264 26.3 Hedging at time t1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266 267
27 Bonds, forward contracts and futures
27.1 Forward contracts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 27.2 Hedging a forward contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 27.3 Future contracts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270 27.4 Cash ow from a future contract . . . . . . . . . . . . . . . . . . . . . . . . . . . 272 27.5 Forward-future spread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272 27.6 Backwardation and contango . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273 28 Term-structure models 275
28.1 Computing arbitrage-free bond prices: rst method . . . . . . . . . . . . . . . . . 276 28.2 Some interest-rate dependent assets . . . . . . . . . . . . . . . . . . . . . . . . . 276 28.3 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277 28.4 Forward rate agreement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277 28.5 Recovering the interest r(t) from the forward rate . . . . . . . . . . . . . . . . . . 278 28.6 Computing arbitrage-free bond prices: Heath-Jarrow-Morton method . . . . . . . . 279 28.7 Checking for absence of arbitrage . . . . . . . . . . . . . . . . . . . . . . . . . . 280 28.8 Implementation of the Heath-Jarrow-Morton model . . . . . . . . . . . . . . . . . 281 29 Gaussian processes 285
29.1 An example: Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286 30 Hull and White model 293
8 30.1 Fiddling with the formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295 30.2 Dynamics of the bond price . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296 30.3 Calibration of the Hull & White model . . . . . . . . . . . . . . . . . . . . . . . . 297 30.4 Option on a bond . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 31 Cox-Ingersoll-Ross model 303
31.1 Equilibrium distribution of r(t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306 31.2 Kolmogorov forward equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306 31.3 Cox-Ingersoll-Ross equilibrium density . . . . . . . . . . . . . . . . . . . . . . . 309 31.4 Bond prices in the CIR model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310 31.5 Option on a bond . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 31.6 Deterministic time change of CIR model . . . . . . . . . . . . . . . . . . . . . . . 313 31.7 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315 31.8 Tracking down '0(0) in the time change of the CIR model . . . . . . . . . . . . . 316 32 A two-factor model (Dufe & Kan) 319
32.1 Non-negativity of Y . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320 32.2 Zero-coupon bond prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321 32.3 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323 33 Change of num raire e 325
33.1 Bond price as num raire . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327 e 33.2 Stock price as num raire . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328 e 33.3 Merton option pricing formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329 34 Brace-Gatarek-Musiela model 335
34.1 Review of HJM under risk-neutral IP . . . . . . . . . . . . . . . . . . . . . . . . . 335 34.2 Brace-Gatarek-Musiela model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336 34.3 LIBOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337 34.4 Forward LIBOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338 34.5 The dynamics of L(t
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
34.6 Implementation of BGM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340 34.7 Bond prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342 34.8 Forward LIBOR under more forward measure . . . . . . . . . . . . . . . . . . . . 343
9 34.9 Pricing an interest rate caplet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343 34.10Pricing an interest rate cap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345 34.11Calibration of BGM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345 34.12Long rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346 34.13Pricing a swap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346 35 Notes and References 349
35.1 Probability theory and martingales. . . . . . . . . . . . . . . . . . . . . . . . . . . 349 35.2 Binomial asset pricing model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349 35.3 Brownian motion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350 35.4 Stochastic integrals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350 35.5 Stochastic calculus and nancial markets. . . . . . . . . . . . . . . . . . . . . . . 350 35.6 Markov processes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351 35.7 Girsanovs theorem, the martingale representation theorem, and risk-neutral measures.351 35.8 Exotic options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 35.9 American options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 35.10Forward and futures contracts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 35.11Term structure models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 35.12Change of num raire. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 e 35.13Foreign exchange models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 35.14REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
10
Chapter 1
Introduction to Probability Theory

1.1 The Binomial Asset Pricing Model
The binomial asset pricing model provides a powerful tool to understand arbitrage pricing theory and probability theory. In this course, we shall use it for both these purposes. In the binomial asset pricing model, we model stock prices in discrete time, assuming that at each step, the stock price will change to one of two possible values. Let us begin with an initial positive stock price S0. There are two positive numbers, d and u, with
0<d<u
(1.1)
such that at the next period, the stock price will be either dS 0 or uS0. Typically, we take d and u to satisfy 0 < d < 1 < u, so change of the stock price from S0 to dS0 represents a downward movement, and change of the stock price from S0 to uS0 represents an upward movement. It is 1 common to also have d = u , and this will be the case in many of our examples. However, strictly speaking, for what we are about to do we need to assume only (1.1) and (1.2) below. Of course, stock price movements are much more complicated than indicated by the binomial asset pricing model. We consider this simple model for three reasons. First of all, within this model the concept of arbitrage pricing and its relation to risk-neutral pricing is clearly illuminated. Secondly, the model is used in practice because with a sufcient number of steps, it provides a good, computationally tractable approximation to continuous-time models. Thirdly, within the binomial model we can develop the theory of conditional expectations and martingales which lies at the heart of continuous-time models. With this third motivation in mind, we develop notation for the binomial model which is a bit different from that normally found in practice. Let us imagine that we are tossing a coin, and when we get a Head, the stock price moves up, but when we get a Tail, the price moves down. We denote the price at time 1 by S1 (H ) = uS0 if the toss results in head (H), and by S 1 (T ) = dS0 if it 11
12
S2 (HH) = 16
S (H) = 8 1 S2 (HT) = 4 S =4 0 S1 (T) = 2 S2 (TH) = 4
S2 (TT) = 1
Figure 1.1: Binomial tree of stock prices with S 0 results in tail (T). After the second toss, the price will be one of:
= 4, u = 1=d = 2.
S2 (HH ) = uS1(H ) = u2 S0 S2(HT ) = dS1(H ) = duS0 S2 (TH ) = uS1(T ) = udS0 S2(TT ) = dS1(T ) = d2S0:
After three tosses, there are eight possible coin sequences, although not all of them result in different stock prices at time 3. For the moment, let us assume that the third toss is the last one and denote by
= fHHH HHT HTH HTT THH THT TTH TTT g

the set of all possible outcomes of the three tosses. The set of all possible outcomes of a random experiment is called the sample space for the experiment, and the elements ! of are called sample points. In this case, each sample point ! is a sequence of length three. We denote the k-th component of ! by !k . For example, when ! = HTH , we have !1 = H , !2 = T and !3 = H .
The stock price Sk at time k depends on the coin tosses. To emphasize this, we often write S k (! ). Actually, this notation does not quite tell the whole story, for while S 3 depends on all of ! , S2 depends on only the rst two components of ! , S 1 depends on only the rst component of ! , and S0 does not depend on ! at all. Sometimes we will use notation such S 2(!1 !2) just to record more explicitly how S 2 depends on ! = (!1 !2 !3 ).
1 Example 1.1 Set S0 = 4, u = 2 and d = 2 . We have then the binomial tree of possible stock prices shown in Fig. 1.1. Each sample point ! = (! 1 !2 !3) represents a path through the tree. Thus, we can think of the sample space as either the set of all possible outcomes from three coin tosses or as the set of all possible paths through the tree.
To complete our binomial asset pricing model, we introduce a money market with interest rate r; $1 invested in the money market becomes $(1 + r) in the next period. We take r to be the interest
CHAPTER 1. Introduction to Probability Theory
13
rate for both borrowing and lending. (This is not as ridiculous as it rst seems, because in a many applications of the model, an agent is either borrowing or lending (not both) and knows in advance which she will be doing; in such an application, she should take r to be the rate of interest for her activity.) We assume that
d < 1 + r < u:
(1.2)
The model would not make sense if we did not have this condition. For example, if 1 + r u, then the rate of return on the money market is always at least as great as and sometimes greater than the return on the stock, and no one would invest in the stock. The inequality d 1 + r cannot happen unless either r is negative (which never happens, except maybe once upon a time in Switzerland) or d 1. In the latter case, the stock does not really go down if we get a tail; it just goes up less than if we had gotten a head. One should borrow money at interest rate r and invest in the stock, since even in the worst case, the stock price rises at least as fast as the debt used to buy it.
K > 0 and expiration time 1. This option confers the right to buy the stock at time 1 for K dollars, and so is worth S 1 ; K at time 1 if S1 ; K is positive and is otherwise worth zero. We denote by V1(!) = (S1(! ) ; K )+ = maxfS1(!) ; K 0g the value (payoff) of this option at expiration. Of course, V 1(! ) actually depends only on ! 1 , and we can and do sometimes write V1 (!1) rather than V1(! ). Our rst task is to compute the arbitrage
price of this option at time zero. Suppose at time zero you sell the call for V0 dollars, where V0 is still to be determined. You now have an obligation to pay off (uS 0 ; K )+ if !1 = H and to pay off (dS0 ; K )+ if !1 = T . At the time you sell the option, you dont yet know which value ! 1 will take. You hedge your short position in the option by buying 0 shares of stock, where 0 is still to be determined. You can use the proceeds V0 of the sale of the option for this purpose, and then borrow if necessary at interest rate r to complete the purchase. If V0 is more than necessary to buy the 0 shares of stock, you invest the residual money at interest rate r. In either case, you will have V 0 ; 0S0 dollars invested in the money market, where this quantity might be negative. You will also own 0 shares of stock. If the stock goes up, the value of your portfolio (excluding the short position in the option) is
With the stock as the underlying asset, let us consider a European call option with strike price
0 S1(H ) + (1 + r)(V0 ; 0 S0 )
and you need to have V1(H ). Thus, you want to choose V 0 and
0 so that
(1.3)
V1(H ) =
0S1 (H ) + (1 + r)(V0 ; 0 S0):
If the stock goes down, the value of your portfolio is
0 S1 (T ) + (1 + r)(V0 ; 0 S0)
and you need to have V1(T ). Thus, you want to choose V 0 and
0 to also have
(1.4)
V 1 (T ) =
0S1 (T ) + (1 + r)(V0 ; 0S0 ):
14 These are two equations in two unknowns, and we solve them below Subtracting (1.4) from (1.3), we obtain
V1 (H ) ; V1(T ) =
so that
0 (S1(H ) ; S1 (T ))
(1.5)
0=
V1(H ) ; V1(T ) S1(H ) ; S1(T ) :
(1.6)
This is a discrete-time version of the famous delta-hedging formula for derivative securities, according to which the number of shares of an underlying asset a hedge should hold is the derivative (in the sense of calculus) of the value of the derivative security with respect to the price of the underlying asset. This formula is so pervasive the when a practitioner says delta, she means the derivative (in the sense of calculus) just described. Note, however, that my denition of 0 is the number of shares of stock one holds at time zero, and (1.6) is a consequence of this denition, not the denition of 0 itself. Depending on how uncertainty enters the model, there can be cases in which the number of shares of stock a hedge should hold is not the (calculus) derivative of the derivative security with respect to the price of the underlying asset. To complete the solution of (1.3) and (1.4), we substitute (1.6) into either (1.3) or (1.4) and solve for V0 . After some simplication, this leads to the formula
1 r V0 = 1 + r 1 + ; ; d V1(H ) + u ; (1 + r) V1 (T ) : u d u;d This is the arbitrage price for the European call option with payoff V 1 at time 1.
formula, we dene
(1.7) To simplify this
r p = 1 + ;; d ~ u d
so that (1.7) becomes
q = u ; (1 + r) = 1 ; p ~ ~ u;d
(1.8)
1 ~ V0 = 1 + r pV1(H ) + qV1(T )]: ~
(1.9)
Because we have taken d < u, both p and q are dened,i.e., the denominator in (1.8) is not zero. ~ ~ Because of (1.2), both p and q are in the interval (0 1), and because they sum to 1, we can regard ~ ~ them as probabilities of H and T , respectively. They are the risk-neutral probabilites. They appeared when we solved the two equations (1.3) and (1.4), and have nothing to do with the actual probabilities of getting H or T on the coin tosses. In fact, at this point, they are nothing more than a convenient tool for writing (1.7) as (1.9). We now consider a European call which pays off K dollars at time 2. At expiration, the payoff of this option is V 2 = (S2 ; K )+ , where V2 and S2 depend on !1 and !2 , the rst and second coin tosses. We want to determine the arbitrage price for this option at time zero. Suppose an agent sells the option at time zero for V0 dollars, where V0 is still to be determined. She then buys 0 shares
15
of stock, investing V 0 ; 0S0 dollars in the money market to nance this. At time 1, the agent has a portfolio (excluding the short position in the option) valued at
X1 =
0 S1 + (1 + r)(V0 ; 0 S0):
(1.10)
Although we do not indicate it in the notation, S 1 and therefore X1 depend on !1 , the outcome of the rst coin toss. Thus, there are really two equations implicit in (1.10):
X1(H ) = X1(T ) =
0S1 (H ) + (1 + r)(V0 ; 0 S0)
0S1 (T ) + (1 + r)(V0 ; 0 S0 ):
After the rst coin toss, the agent has X 1 dollars and can readjust her hedge. Suppose she decides to now hold 1 shares of stock, where 1 is allowed to depend on ! 1 because the agent knows what value !1 has taken. She invests the remainder of her wealth, X1 ; 1S1 in the money market. In the next period, her wealth will be given by the right-hand side of the following equation, and she wants it to be V 2. Therefore, she wants to have
V2 =
1 S2 + (1 + r)(X1 ; 1 S1):
(1.11)
Although we do not indicate it in the notation, S 2 and V2 depend on !1 and !2 , the outcomes of the rst two coin tosses. Considering all four possible outcomes, we can write (1.11) as four equations:
V2(HH ) V2 (HT ) V2 (TH ) V2(TT )
= = = =
1(H )S2(HH ) + (1 + r)(X1(H ) ; 1 (H )S1(H )) 1(H )S2(HT ) + (1 + r)(X1(H ) ; 1(H )S1(H )) 1(T )S2(TH ) + (1 + r)(X1(T ) ; 1(T )S1(T )) 1(T )S2(TT ) + (1 + r)(X1(T ) ; 1 (T )S1(T )):
To solve these equations, and thereby determine the arbitrage price V 0 at time zero of the option and the hedging portfolio 0 , 1(H ) and 1 (T ), we begin with the last two
We now have six equations, the two represented by (1.10) and the four represented by (1.11), in the six unknowns V 0 , 0 , 1(H ), 1 (T ), X1 (H ), and X1 (T ).
V2(TH ) = V2(TT ) =
1 (T )S2(TH ) + (1 + r)(X1(T ) ; 1(T )S1(T )) 1 (T )S2(TT ) + (1 + r)(X1(T ) ; 1 (T )S1(T )): 1(T ), we obtain the delta-hedging for(1.12)
Subtracting one of these from the other and solving for mula
1(T ) =
V2 (TH ) ; V2(TT ) S2(TH ) ; S2(TT )
and substituting this into either equation, we can solve for
1 ~ X1 (T ) = 1 + r pV2(TH ) + qV2 (TT )]: ~
(1.13)
16 Equation (1.13), gives the value the hedging portfolio should have at time 1 if the stock goes down between times 0 and 1. We dene this quantity to be the arbitrage value of the option at time 1 if !1 = T , and we denote it by V1(T ). We have just shown that
1 ~ V1(T ) = 1 + r pV2 (TH ) + qV2(TT )]: ~
(1.14)
The hedger should choose her portfolio so that her wealth X 1 (T ) if !1 = T agrees with V1(T ) dened by (1.14). This formula is analgous to formula (1.9), but postponed by one step. The rst two equations implicit in (1.11) lead in a similar way to the formulas
1 (H ) =
V2(HH ) ; V2(HT ) S2(HH ) ; S2(HT )
(1.15)
and X1(H ) = V1(H ), where V1(H ) is the value of the option at time 1 if ! 1
= H , dened by
(1.16)
1 ~ V1(H ) = 1 + r pV2(HH ) + qV2(HT )]: ~
This is again analgous to formula (1.9), postponed by one step. Finally, we plug the values X 1(H ) = V1(H ) and X1(T ) = V1(T ) into the two equations implicit in (1.10). The solution of these equations for 0 and V0 is the same as the solution of (1.3) and (1.4), and results again in (1.6) and (1.9). The pattern emerging here persists, regardless of the number of periods. If Vk denotes the value at time k of a derivative security, and this depends on the rst k coin tosses ! 1 : : : !k , then at time k ; 1, after the rst k ; 1 tosses !1 : : : !k;1 are known, the portfolio to hedge a short position should hold k;1 (!1 : : : !k;1) shares of stock, where
H ) ; Vk (!1 : : : !k 1 T ) (1.17) H ) ; Sk (!1 : : : !k 1 T ) and the value at time k ; 1 of the derivative security, when the rst k ; 1 coin tosses result in the outcomes !1 : : : !k 1 , is given by
k 1 (!1
;
V : : : !k 1 ) = Sk (!1 :: :: :: !k !k k (!1
;
1 1
1 ~ Vk 1 (!1 : : : !k 1) = 1 + r pVk (!1 : : : !k

; ;
H ) + q Vk (!1 : : : !k ~
T )]
(1.18)
1.2 Finite Probability Spaces

Let be a set with nitely many elements. An example to keep in mind is
= fHHH HHT HTH HTT THH THT TTH TTT g
(2.1)
of all possible outcomes of three coin tosses. Let F be the set of all subsets of . Some sets in F are , fHHH HHT HTH HTT g, fTTT g, and itself. How many sets are there in F ?

Denition 1.1 A probability measure properties: (i)
17
IP
is a function mapping
F into 0 1] with the following
IP ( ) = 1, (ii) If A1 A2 : : : is a sequence of disjoint sets in F , then IP

1
k=1
Ak =
! X
1
k=1
IP (Ak ):
Probability measures have the following interpretation. Let A be a subset of F . Imagine that is the set of all possible outcomes of some random experiment. There is a certain probability, between 0 and 1, that when that experiment is performed, the outcome will lie in the set A. We think of IP (A) as this probability. Example 1.2 Suppose a coin has probability 1 for H and 2 for T . For the individual elements of 3 3 in (2.1), dene
IP fHHH g = IP fHTH g = IP fTHH g = IP fTTH g =

For A 2 F , we dene
1 3 1 3 1 3 1 3
3 2 2 3 2 1 3 2 2 3
IP fHHT g = IP fHTT g = IP fTHT g = IP fTTT g =
1 2 2 3 3 1 2 2 3 3 1 2 2 3 3 2 3: 3
IP (A) =
For example,
X
! A
2
IP f!g:
2
(2.2)
3 IP fHHH HHT HTH HTT g = 1 + 2 1 3 3
2 + 1 3 3
2 3
=1 3
which is another way of saying that the probability of H on the rst toss is 1 . 3 As in the above example, it is generally the case that we specify a probability measure on only some of the subsets of and then use property (ii) of Denition 1.1 to determine IP (A) for the remaining sets A 2 F . In the above example, we specied the probability measure only for the sets containing a single element, and then used Denition 1.1(ii) in the form (2.2) (see Problem 1.4(ii)) to determine IP for all the other sets in F . Denition 1.2 Let be a nonempty set. A -algebra is a collection following three properties: (i)
G of subsets of
with the
2 G,
18 (ii) If A 2 G , then its complement Ac (iii) If A1
A2 A3 : : : is a sequence of sets in G , then
2 G,
k=1 Ak is also in G .
1
Here are some important -algebras of subsets of the set
F0 = F1 = F2 =
( (
in Example 1.2:
) fHHH HHT HTH HTT g fTHH THT TTH TTT g fHHH HHT g fHTH HTT g fTHH THT g fTTH TTT g )
and all sets which can be built by taking unions of these
F3 = F = The set of all subsets of :

To simplify notation a bit, let us dene
AH = fHHH HHT HTH HTT g = fH on the rst tossg AT = fTHH THT TTH TTT g = fT on the rst tossg
so that and let us dene
F1 = f
AH AT g
AHH = fHHH HHT g = fHH on the rst two tossesg AHT = fHTH HTT g = fHT on the rst two tossesg ATH = fTHH THT g = fTH on the rst two tossesg ATT = fTTH TTT g = fTT on the rst two tossesg
so that
F2 = f
AHH AHT ATH ATT AH AT AHH ATH AHH ATT AHT ATH AHT ATT Ac Ac Ac Ac g: HH HT TH TT
We interpret -algebras as a record of information. Suppose the coin is tossed three times, and you are not told the outcome, but you are told, for every set in F 1 whether or not the outcome is in that set. For example, you would be told that the outcome is not in and is in . Moreover, you might be told that the outcome is not in A H but is in A T . In effect, you have been told that the rst toss was a T , and nothing more. The -algebra F1 is said to contain the information of the rst toss, which is usually called the information up to time 1. Similarly, F 2 contains the information of
19
the rst two tosses, which is the information up to time 2. The -algebra F 3 = F contains full information about the outcome of all three tosses. The so-called trivial -algebra F 0 contains no information. Knowing whether the outcome ! of the three tosses is in (it is not) and whether it is in (it is) tells you nothing about ! Denition 1.3 Let be a nonempty nite set. A ltration is a sequence of -algebras F 0 F1 F2 : : : such that each -algebra in the sequence contains all the sets contained by the previous -algebra. Denition 1.4 Let be a nonempty nite set and let random variable is a function mapping into IR.
Fn
F be the
-algebra of all subsets of
. A
Example 1.3 Let be given by (2.1) and consider the binomial asset pricing Example 1.1, where S0 = 4, u = 2 and d = 1 . Then S0, S1, S2 and S3 are all random variables. For example, 2 S2 (HHT ) = u2 S0 = 16. The random variable S0 is really not random, since S0(!) = 4 for all ! 2 . Nonetheless, it is a function mapping into IR, and thus technically a random variable, albeit a degenerate one. A random variable maps into IR, and we can look at the preimage under the random variable of sets in IR. Consider, for example, the random variable S2 of Example 1.1. We have
S2(HHH ) = S2(HHT ) = 16 S2(HTH ) = S2(HTT ) = S2(THH ) = S2 (THT ) = 4 S2(TTH ) = S2(TTT ) = 1:

Let us consider the interval
4 27]. The preimage under S2 of this interval is dened to be

we can get as preimages of sets in IR is:
f! 2 S2(!) 2 4 27]g = f! 2 4 S2 27g = Ac : TT

The complete list of subsets of
AHH AHT ATH ATT

and sets which can be built by taking unions of these. This collection of sets is a -algebra, called the -algebra generated by the random variable S 2, and is denoted by (S2). The information content of this -algebra is exactly the information learned by observing S 2 . More specically, suppose the coin is tossed three times and you do not know the outcome ! , but someone is willing to tell you, for each set in (S 2), whether ! is in the set. You might be told, for example, that ! is not in AHH , is in AHT ATH , and is not in A TT . Then you know that in the rst two tosses, there was a head and a tail, and you know nothing more. This information is the same you would have gotten by being told that the value of S 2(! ) is 4. Note that F2 dened earlier contains all the sets which are in (S 2), and even more. This means that the information in the rst two tosses is greater than the information in S 2 . In particular, if you see the rst two tosses, you can distinguish A HT from ATH , but you cannot make this distinction from knowing the value of S2 alone.
20 Denition 1.5 Let be a nonemtpy nite set and let F be the -algebra of all subsets of . Let X be a random variable on ( F ). The -algebra (X ) generated by X is dened to be the collection of all sets of the form f! 2 X (! ) 2 Ag, where A is a subset of IR. Let G be a sub- -algebra of F . We say that X is G -measurable if every set in (X ) is also in G . Note: We normally write simply fX
2 Ag rather than f! 2 X (!) 2 Ag. Denition 1.6 Let be a nonempty, nite set, let F be the -algebra of all subsets of , let IP be a probabilty measure on ( F ), and let X be a random variable on . Given any set A IR, we
dene the induced measure of A to be
LX (A) = IP fX 2 Ag:
In other words, the induced measure of a set A tells us the probability that X takes a value in A. In the case of S2 above with the probability measure of Example 1.2, some sets in IR and their induced measures are:
LS2 ( ) = IP ( ) = 0 LS2 (IR) = IP ( ) = 1 LS2 0 1) = IP ( ) = 1 2 LS2 0 3] = IP fS2 = 1g = IP (ATT ) = 2 : 3

1 = 9 at the number 16, a mass of size 4 at the number 4, and a mass of size 2 2 = 4 at the number 1. A common way to record this 9 3 9 information is to give the cumulative distribution function F S2 (x) of S2 , dened by 2 In fact, the induced measure of S2 places a mass of size 1 3
FS2 (x) = IP (S2

By the distribution of a random variable
8 >0 >4 <9 x) = > 8 >1 :9
if x < 1 if 1 x < 4 if 4 x < 16 if 16 x:
(2.3)
large they are, or tell what the cumulative distribution function is. (Later we will consider random variables X which have densities, in which case the induced measure of a set A IR is the integral of the density over the set A.)
LX . If X is discrete, as in the case of S2 above, we can either tell where the masses are and how
X , we mean any of the several ways of characterizing
Important Note. In order to work through the concept of a risk-neutral measure, we set up the denitions to make a clear distinction between random variables and their distributions. A random variable is a mapping from to IR, nothing more. It has an existence quite apart from discussion of probabilities. For example, in the discussion above, S 2 (TTH ) = S2(TTT ) = 1, regardless of whether the probability for H is 1 or 1 . 3 2
21
The distribution of a random variable is a measure L X on IR, i.e., a way of assigning probabilities to sets in IR. It depends on the random variable X and the probability measure IP we use in . If we set the probability of H to be 1 , then LS2 assigns mass 1 to the number 16. If we set the probability 3 9 of H to be 1 , then LS2 assigns mass 1 to the number 16. The distribution of S 2 has changed, but 2 4 the random variable has not. It is still dened by
S2(HHH ) = S2(HHT ) = 16 S2(HTH ) = S2(HTT ) = S2(THH ) = S2 (THT ) = 4 S2(TTH ) = S2(TTT ) = 1:

Thus, a random variable can have more than one distribution (a market or objective distribution, and a risk-neutral distribution). In a similar vein, two different random variables can have the same distribution. Suppose in the binomial model of Example 1.1, the probability of H and the probability of T is 1 . Consider a 2 European call with strike price 14 expiring at time 2. The payoff of the call at time 2 is the random variable (S2 ; 14)+, which takes the value 2 if ! = HHH or ! = HHT , and takes the value 0 in every other case. The probability the payoff is 2 is 1 , and the probability it is zero is 3 . Consider also 4 4 a European put with strike price 3 expiring at time 2. The payoff of the put at time 2 is (3 ; S 2)+ , which takes the value 2 if ! = TTH or ! = TTT . Like the payoff of the call, the payoff of the put is 2 with probability 1 and 0 with probability 3 . The payoffs of the call and the put are different 4 4 random variables having the same distribution. Denition 1.7 Let be a nonempty, nite set, let F be the -algebra of all subsets of , let IP be a probabilty measure on ( F ), and let X be a random variable on . The expected value of X is dened to be
IEX =
X
2
X (!)IP f!g:
(2.4) is a into
Notice that the expected value in (2.4) is dened to be a sum over the sample space . Since nite set, X can take only nitely many values, which we label x 1 : : : xn . We can partition the subsets fX 1 = x1 g : : : fXn = xn g, and then rewrite (2.4) as
IEX =
= = = =
X
2
! n X
X (!)IP f!g
X
2f g
k=1 ! Xk =xk n X X xk IP f!g k=1 ! Xk =xk n X xk IP fXk = xk g k=1 n X xk LX fxk g: k=1

2f g
X (!)IP f! g
22 Thus, although the expected value is dened as a sum over the sample space , we can also write it as a sum over IR.
To make the above set of equations absolutely clear, we consider S 2 with the distribution given by (2.3). The denition of IES2 is
IES2 = S2 (HHH )IP fHHH g + S2(HHT )IP fHHT g +S2 (HTH )IP fHTH g + S2(HTT )IP fHTT g +S2 (THH )IP fTHH g + S2(THT )IP fTHT g +S2 (TTH )IP fTTH g + S2 (TTT )IP fTTT g = 16 IP (AHH ) + 4 IP (AHT ATH ) + 1 IP (ATT ) = 16 IP fS2 = 16g + 4 IP fS2 = 4g + 1 IP fS2 = 1g
= 16 LS2 f16g + 4 LS2 f4g + 1 LS2 f1g = 16 1 + 4 4 + 4 4 9 9 9 48 : = 9
Denition 1.8 Let be a nonempty, nite set, let F be the -algebra of all subsets of , let IP be a probabilty measure on ( F ), and let X be a random variable on . The variance of X is dened to be the expected value of (X ; IEX )2, i.e., Var(X ) =
X
2
(X (! ) ; IEX )2IP f! g:
(2.5)
x1 : : : xn, then
One again, we can rewrite (2.5) as a sum over IR rather than over . Indeed, if X takes the values Var(X ) =
n X k=1
(xk ; IEX )2IP fX = xk g =
n X k=1
(xk ; IEX )2LX (xk ):
1.3 Lebesgue Measure and the Lebesgue Integral

In this section, we consider the set of real numbers IR, which is uncountably innite. We dene the Lebesgue measure of intervals in IR to be their length. This denition and the properties of measure determine the Lebesgue measure of many, but not all, subsets of IR. The collection of subsets of IR we consider, and for which Lebesgue measure is dened, is the collection of Borel sets dened below. We use Lebesgue measure to construct the Lebesgue integral, a generalization of the Riemann integral. We need this integral because, unlike the Riemann integral, it can be dened on abstract spaces, such as the space of innite sequences of coin tosses or the space of paths of Brownian motion. This section concerns the Lebesgue integral on the space IR only; the generalization to other spaces will be given later.
23
Denition 1.9 The Borel -algebra, denoted B(IR), is the smallest -algebra containing all open intervals in IR. The sets in B(IR) are called Borel sets. Every set which can be written down and just about every set imaginable is in B(IR). The following discussion of this fact uses the -algebra properties developed in Problem 1.3.
By denition, every open interval (a b) is in B(IR), where a and b are real numbers. Since B(IR) is a -algebra, every union of open intervals is also in B(IR). For example, for every real number a, the open half-line
(a 1) =
is a Borel set, as is
n=1
1
(a a + n) (a ; n a):
(;1 a) =
For real numbers a and b, the union
n=1
(;1 a) (b 1)
is Borel. Since B(IR) is a -algebra, every complement of a Borel set is Borel, so B(IR) contains
a b] = (;1 a) (b 1) :
This shows that every closed interval is Borel. In addition, the closed half-lines
a 1) =
and
n=1
1
a a + n] a ; n a]
(;1 a] =
n=1
are Borel. Half-open and half-closed intervals are also Borel, since they can be written as intersections of open half-lines and closed half-lines. For example,
(a b] = (;1 b] \ (a 1):
Every set which contains only one real number is Borel. Indeed, if a is a real number, then
fag =
\
1
n=1
1 1 a; n a+ n : = fa 1 a2 : : : an g,
n
This means that every set containing nitely many real numbers is Borel; if A then
A=
k=1
fak g:
24 In fact, every set containing countably innitely many numbers is Borel; if A = fa 1
A=
a2 : : : g, then
k=1
fak g:
This means that the set of rational numbers is Borel, as is its complement, the set of irrational numbers. There are, however, sets which are not Borel. We have just seen that any non-Borel set must have uncountably many points. Example 1.4 (The Cantor set.) This example gives a hint of how complicated a Borel set can be. We use it later when we discuss the sample space for an innite sequence of coin tosses. Consider the unit interval
0 1], and remove the middle half, i.e., remove the open interval 3 A1 = 1 4 : 4
The remaining set
C1 = 0 1 4
1 3 A2 = 16 16
31 4 13 15 : 16 16
has two pieces. From each of these pieces, remove the middle half, i.e., remove the open set
The remaining set
length 41 . The Cantor set k
15 16 1 : has four pieces. Continue this process, so at stage k, the set C k has 2k pieces, and each piece has
1 C2 = 0 16
3 1 16 4
3 13 4 16
C=
\
1
k=1
Ck
is dened to be the set of points not removed at any stage of this nonterminating process. Note that the length of A 1 , the rst set removed, is 1 . The length of A2 , the second set removed, 2 1 1 is 1 + 1 = 4 . The length of the next set removed is 4 32 = 1 , and in general, the length of the 8 8 8 ;k k-th set removed is 2 . Thus, the total length removed is
k=1
X1 2k = 1
1
and so the Cantor set, the set of points not removed, has zero length. Despite the fact that the Cantor set has no length, there are lots of points in this set. In particular, none of the endpoints of the pieces of the sets C 1 C2 : : : is ever removed. Thus, the points are all in C . This is a countably innite set of points. We shall see eventually that the Cantor set has uncountably many points.
3 1 3 13 1 0 1 4 1 16 16 16 15 64 : : : 4 16

Denition 1.10 Let B(IR) be the -algebra of Borel subsets of IR. A measure on (IR function mapping B into 0 1] with the following properties: (i)
25
B(IR)) is a
( ) = 0,
(ii) If A1
A2 : : : is a sequence of disjoint sets in B(IR), then

1
k=1
Ak =
! X
1
k=1
(Ak ):
Lebesgue measure is dened to be the measure on (IR B(IR)) which assigns the measure of each interval to be its length. Following Williamss book, we denote Lebesgue measure by 0 . A measure has all the properties of a probability measure given in Problem 1.4, except that the total measure of the space is not necessarily 1 (in fact, 0 (IR) = 1), one no longer has the equation
(Ac ) = 1 ; (A)
in Problem 1.4(iii), and property (v) in Problem 1.4 needs to be modied to say: (v) If A1
A2 : : : is a sequence of sets in B(IR) with A 1 A2
\
1
k=1
To see that the additional requirment
Ak = nlim (An ):
!1
and
(A1 ) < 1, then
(A 1 ) < 1 is needed in (v), consider
A1 = 1 1) A2 = 2 1) A3 = 3 1) : : ::
Then \1 Ak k=1
, so 0 (\1 Ak ) = 0, but limn!1 0 (An ) = 1. k=1
We specify that the Lebesgue measure of each interval is its length, and that determines the Lebesgue measure of all other Borel sets. For example, the Lebesgue measure of the Cantor set in Example 1.4 must be zero, because of the length computation given at the end of that example. The Lebesgue measure of a set containing only one point must be zero. In fact, since
fag
for every positive integer n, we must have
1 1 a; n a+ n
0
0
Letting n ! 1, we obtain
0 fag
1 1 2 a ; n a + n = n:
0 fag = 0:
26 The Lebesgue measure of a set containing countably many points must also be zero. Indeed, if A = fa1 a2 : : : g, then
0 (A) =
X
1
k=1
0 fak g =
X
1
k=1
0 = 0:
The Lebesgue measure of a set containing uncountably many points can be either zero, positive and nite, or innite. We may not compute the Lebesgue measure of an uncountable set by adding up the Lebesgue measure of its individual members, because there is no way to add up uncountably many numbers. The integral was invented to get around this problem. In order to think about Lebesgue integrals, we must rst consider the functions to be integrated. Denition 1.11 Let
fx 2 IR f (x) 2 Ag is in B(IR) whenever A 2 B(IR). In the language of Section 2, we want the -algebra generated by f to be contained in B(IR).
Denition 3.4 is purely technical and has nothing to do with keeping track of information. It is difcult to conceive of a function which is not Borel-measurable, and we shall pretend such functions dont exist. Hencefore, function mapping IR to IR will mean Borel-measurable function mapping IR to IR and subset of IR will mean Borel subset of IR. Denition 1.12 An indicator function g from and 1. We call
f be a function from IR to IR.
We say that
f is Borel-measurable if the set
IR to IR is a function which takes only the values 0
A = fx 2 IR g (x) = 1g the set indicated by g . We dene the Lebesgue integral of g to be
IR
g d 0 = 0 (A):
n X
A simple function h from IR to IR is a linear combination of indicators, i.e., a function of the form
h(x) =
where each gk is of the form
k=1
ck gk (x)
if x 2 Ak if x 2 Ak =
gk (x) = hd 0 = ck
and each ck is a real number. We dene the Lebesgue integral of h to be
1 0
n X Z
k=1
IR
gk d 0 =
n X
k=1
ck 0 (Ak ):
We
Let f be a nonnegative function dened on dene the Lebesgue integral of f to be
IR, possibly taking the value 1 at some points.
IR
f d 0 = sup
IR
hd
h is simple and h(x) f (x) for every x 2 IR :

It is possible that this integral is innite. If it is nite, we say that f is integrable.
27
Finally, let f be a function dened on IR, possibly taking the value 1 at some points and the value ;1 at other points. We dene the positive and negative parts of f to be
f + (x) = maxff (x) 0g f (x) = maxf;f (x) 0g respectively, and we dene the Lebesgue integral of f to be
;
Let f be a function dened on IR, possibly taking the value 1 at some points and the value ;1 at other points. Let A be a subset of IR. We dene
provided the right-hand side is not of the form 1 ; 1. If both IR f + d 0 and IR f ; d 0 are nite R (or equivalently, IR jf j d 0 < 1, since jf j = f + + f ; ), we say that f is integrable.
IR
fd 0=
IR
f+ d
0;;
IR
f d
;
A
where
fd 0=
I l A (x) =
is the indicator function of A.
IR
I l Af d
1 0
if x 2 A if x 2 A =
The Lebesgue integralRjust dened is related to the Riemann integral in one very important way: if R b the Riemann integral a f (x)dx is dened, then the Lebesgue integral a b] f d 0 agrees with the Riemann integral. The Lebesgue integral has two important advantages over the Riemann integral. The rst is that the Lebesgue integral is dened for more functions, as we show in the following examples. Example 1.5 Let Q be the set of rational numbers in 0 1], and consider f set, Q has Lebesgue measure zero, and so the Lebesgue integral of f over
= l Q . Being a countable I 0 1] is
1 To compute the Riemann integral 0 f (x)dx, we choose partition points 0 = x 0 < x1 < < xn = 1 and divide the interval 0 1] into subintervals x 0 x1] x1 x2] : : : xn;1 xn]. In each subinterval xk;1 xk ] there is a rational point q k , where f (qk ) = 1, and there is also an irrational point rk , where f (rk ) = 0. We approximate the Riemann integral from above by the upper sum
0 1]
f d 0 = 0:
n X
k=1 n X k=1
f (qk )(xk ; xk 1 ) =
;
n X
k=1 n X k=1
1 (xk ; xk 1 ) = 1
;
and we also approximate it from below by the lower sum
f (rk )(xk ; xk 1 ) =
;
0 (xk ; xk 1 ) = 0:
;
28 No matter how ne we take the partition of 0 1], the upper sum is always 1 and the lower sum is always 0. Since these two do not converge to a common value as the partition becomes ner, the Riemann integral is not dened. Example 1.6 Consider the function
f (x) =
if x = 0 if x 6= 0:
This is not a simple function because simple function cannot take the value function which lies between 0 and f is of the form
1.
Every simple
h(x) =
for some y
if x = 0 if x 6= 0
2 0 1), and thus has Lebesgue integral Z

IR
h d 0 = y 0 f0 g = 0 :
It follows that
IR
f d 0 = sup
Z
IR
hd
h is simple and h(x) f (x) for every x 2 IR = 0:
1 Now consider the Riemann integral ;1 f (x) dx, which for this function f is the same as the R 1 f (x) dx. When we partition ;1 1] into subintervals, one of these will contain Riemann integral ;1 R1 the point 0, and when we compute the upper approximating sum for ;1 f (x) dx, this point will contribute 1 times the length of the subinterval containing it. Thus the upper approximating sum is 1. On the other hand, the lower approximating sum is 0, and again the Riemann integral does not exist.
The Lebesgue integral has all linearity and comparison properties one would expect of an integral. In particular, for any two functions f and g and any real constant c,
IR
and whenever f (x)
(f + g ) d
0 0
g (x) for all x 2 IR, we have
IR
cf d
= c
IR Z
fd 0+ fd
0
IR
gd
IR
Finally, if A and B are disjoint sets, then
IR
fd
Z
IR
gd d 0:
A B
fd 0=
Z
A
fd 0+
Z
B
f d 0:
29
There are three convergence theorems satised by the Lebesgue integral. In each of these the situation is that there is a sequence of functions f n n = 1 2 : : : converging pointwise to a limiting function f . Pointwise convergence just means that
nlim fn (x) = f (x) for every x 2 IR:

!1
There are no such theorems for the Riemann integral, because the Riemann integral of the limiting function f is too often not dened. Before we state the theorems, we given two examples of pointwise convergence which arise in probability theory. Example 1.7 Consider a sequence of normal densities, each with variance mean n: 1 ; (x;n)2
1 and the n-th having
fn (x) = p e
2
These converge pointwise to the function
R We have IR fn d 0 = 1 for every n, so lim n
f (x) = 0 for every x 2 IR: R f d = 1, but R f d = 0. 0 IR IR n 0

!1
Example 1.8 Consider a sequence of normal densities, each with mean 0 and the n-th having vari1 ance n : r 2 2 fn (x) = n e; xn :
These converge pointwise to the function
f (x) =
We have again IR fn d 0 = 1 for every n, so limn!1 IR fn d 0 = 1, but IR f d 0 = 0. The function f is not the Dirac delta; the Lebesgue integral of this function was already seen in Example 1.6 to be zero. Theorem 3.1 (Fatous Lemma) Let fn verging pointwise to a function f . Then
if x = 0 if x 6= 0:
n = 1 2 : : : be a sequence of nonnegative functions con0
If limn!1 IR fn d 0 is dened, then Fatous Lemma has the simpler conclusion
IR
fd fd
lim inf n
!1
Z
IR
fn d 0 :
IR
nlim
!1
IR
fn d 0:
This is the case in Examples 1.7 and 1.8, where
nlim IR fn d 0 = 1
!1
30 while IR f d 0 = 0. We could modify either Example 1.7 or 1.8 Rby setting g n = fn if n is even, R f but gn = 2R n if n is odd. Now IR gn d 0 = 1 if n is even, but IR gn d 0 = 2 if n is odd. The sequence f IR gn d 0 g1 has two cluster points, 1 and 2. R By denition, the smaller one, 1, is n=1 R lim inf n!1 IR gn d 0 and the larger one, 2, is lim supn!1 IR gn d 0 . Fatous Lemma guarantees that even the smaller cluster point will be greater than or equal to the integral of the limiting function. The key assumption in Fatous Lemma is that all the functions take only nonnegative values. Fatous Lemma does not assume much but it is is not very satisfying because it does not conclude that
IR
f d 0 = nlim
!1
IR
fn d 0: n = 1 2 : : : be a sequence of functions
for every x 2 IR:
There are two sets of assumptions which permit this stronger conclusion. Theorem 3.2 (Monotone Convergence Theorem) Let fn converging pointwise to a function f . Assume that
0 f1 (x) f2 (x) f3 (x)

Then
Z
IR
where both sides are allowed to be 1.
f d 0 = nlim
Z
IR
!1
fn d
Theorem 3.3 (Dominated Convergence Theorem) Let fn n = 1 2 : : : be a sequence of functions, which may take either positive or negative values, converging pointwise to a function f . Assume R that there is a nonnegative integrable function g (i.e., IR g d 0 < 1) such that
Then
jfn(x)j g(x) for every x 2 IR for every n: Z Z

IR
f d 0 = nlim
!1
IR
fn d
and both sides will be nite.
1.4 General Probability Spaces

Denition 1.13 A probability space ( (i) (ii) (iii)
F IP ) consists of three objects:
, a nonempty set, called the sample space, which contains all possible outcomes of some random experiment;
F, a
-algebra of subsets of ;
IP , a probability measure on ( F ), i.e., a function which assigns to each set A 2 F a number IP (A) 2 0 1], which represents the probability that the outcome of the random experiment lies in the set A.
31
Remark 1.1 We recall from Homework Problem 1.4 that a probability measure IP has the following properties: (a)
IP ( ) = 0.
(b) (Countable additivity) If A1
A2 : : : is a sequence of disjoint sets in F , then IP

1
k=1
Ak =
! X
1
k=1
IP (Ak ):
(c) (Finite additivity) If n is a positive integer and A 1
IP (A1
(d) If A and B are sets in F and A
: : : An are disjoint sets in F , then An ) = IP (A1 ) + + IP (An ):
B, then IP (B ) = IP (A) + IP (B n A): IP (B ) IP (A): A2 : : : is a sequence of sets in F with A 1 A2 IP

1
In particular, (d) (Continuity from below.) If A1
k=1
1
Ak = nlim IP (An ):
!1
, then
(d) (Continuity from above.) If A1
A2 : : : is a sequence of sets in F with A 1 A2 IP
\ !
, then
k=1
Ak = nlim IP (An ):
!1
We have already seen some examples of nite probability spaces. We repeat these and give some examples of innite probability spaces as well. Example 1.9 Finite coin toss space. Toss a coin n times, so that is the set of all sequences of H and T which have n components. We will use this space quite a bit, and so give it a name: n . Let F be the collection of all subsets of n . Suppose the probability of H on each toss is p, a number between zero and one. Then the probability of T is q = 1 ; p. For each ! = (!1 !2 : : : !n ) in n , we dene For each A 2 F , we dene
IP f! g = pNumber of H in ! q Number of T in ! : IP (A) =
X
! A
2
IP f!g:
(4.1)
We can dene IP (A) this way because A has only nitely many elements, and so only nitely many terms appear in the sum on the right-hand side of (4.1).
32 Example 1.10 Innite coin toss space. Toss a coin repeatedly without stopping, so that is the set of all nonterminating sequences of H and T . We call this space 1 . This is an uncountably innite space, and we need to exercise some care in the construction of the -algebra we will use here.
For each positive integer n, we dene Fn to be the -algebra determined by the rst n tosses. For example, F2 contains four basic sets,
AHH = f! = (!1 !2 !3 : : : ) !1 = H !2 = H g = The set of all sequences which begin with HH AHT = f! = (!1 !2 !3 : : : ) !1 = H !2 = T g = The set of all sequences which begin with HT ATH = f! = (!1 !2 !3 : : : ) !1 = T !2 = H g = The set of all sequences which begin with TH ATT = f! = (!1 !2 !3 : : : ) !1 = T !2 = T g = The set of all sequences which begin with TT:
In the -algebra F , we put every set in every -algebra Fn , where n ranges over the positive integers. We also put in every other set which is required to make F be a -algebra. For example, the set containing the single sequence Because sets.
F2 is a
-algebra, we must also put into it the sets ,
, and all unions of the four basic
fHHHHH g = fH on every tossg

is not in any of the F n -algebras, because it depends on all the components of the sequence and not just the rst n components. However, for each positive integer n, the set
fH on the rst n tossesg

is in Fn and hence in F . Therefore,
fH on every tossg =
is also in F .
\
1
n=1
fH on the rst n tossesg
We next construct the probability measure IP on ( 1 F ) which corresponds to probability p 2 0 1] for H and probability q = 1 ; p for T . Let A 2 F be given. If there is a positive integer n such that A 2 Fn , then the description of A depends on only the rst n tosses, and it is clear how to dene IP (A). For example, suppose A = AHH ATH , where these sets were dened earlier. Then A is in F2. We set IP (AHH ) = p2 and IP (ATH ) = qp, and then we have
IP (A) = IP (AHH ATH ) = p2 + qp = (p + q )p = p: In other words, the probability of a H on the second toss is p.
33
Let us now consider a set A 2 F for which there is no positive integer n such that A 2 F . Such is the case for the set fH on every tossg. To determine the probability of these sets, we write them in terms of sets which are in Fn for positive integers n, and then use the properties of probability measures listed in Remark 1.1. For example,
fH on the rst tossg
fH on the rst two tossesg fH on the rst three tossesg
and
\
1
n=1
fH on the rst n tossesg = fH on every tossg:
According to Remark 1.1(d) (continuity from above),
IP fH on every tossg = nlim IP fH on the rst n tossesg = nlim pn :

!1 !1
A similar argument shows that if 0 < p < 1 so that 0 < q < 1, then every set in 1 which contains only one element (nonterminating sequence of H and T ) has probability zero, and hence very set which contains countably many elements also has probabiliy zero. We are in a case very similar to Lebesgue measure: every point has measure zero, but sets can have positive measure. Of course, the only sets which can have positive probabilty in 1 are those which contain uncountably many elements. In the innite coin toss space, we dene a sequence of random variables Y1
If p = 1, then IP fH on every tossg = 1; otherwise, IP fH on every tossg = 0.
Yk (!) =
and we also dene the random variable
Y2 : : : by
1 0
if !k if !k
=H =T
X (! ) =
n X Y k (! ) k=1
2k
Since each Yk is either zero or one, X takes values in the interval 0 1]. Indeed, X (TTTT ) = 0, X (HHHH ) = 1 and the other values of X lie in between. We dene a dyadic rational m number to be a number of the form 2k , where k and m are integers. For example, 3 is a dyadic 4 rational. Every dyadic rational in (0,1) corresponds to two sequences ! 2 1 . For example,
X (HHTTTTT
) = X (HTHHHHH
) = 3: 4
The numbers in (0,1) which are not dyadic rationals correspond to a single ! have a unique binary expansion.
; these numbers
34 Whenever we place a probability measure IP on ( F ), we have a corresponding induced measure LX on 0 1]. For example, if we set p = q = 1 in the construction of this example, then we have 2
1 LX 0 2 = IP fFirst toss is T g = 1 2 1 1 = IP fFirst toss is H g = 1 LX 2 2
1 LX 0 4 = IP fFirst two tosses are TT g = 1 4 1 LX 4 1 = IP fFirst two tosses are TH g = 1 2 4
1 LX 2 3 = IP fFirst two tosses are HT g = 1 4 4 3 LX 4 1 = IP fFirst two tosses are HH g = 1 : 4 0
Continuing this process, we can verify that for any positive integers k and m satisfying
m;1 < m
2k 2k
we have
In other words, the LX -measure of all intervals in 0 1] whose endpoints are dyadic rationals is the same as the Lebesgue measure of these intervals. The only way this can be is for LX to be Lebesgue measure. It is interesing to consider what L X would look like if we take a value of p other than 1 when we 2 construct the probability measure IP on .
m LX m2; 1 2k = 21k : k
We conclude this example with another look at the Cantor set of Example 3.2. Let pairs be the subset of in which every even-numbered toss is the same as the odd-numbered toss immediately preceding it. For example, HHTTTTHH is the beginning of a sequence in pairs , but HT is not. Consider now the set of real numbers
C = fX ( ! ) ! 2
0
pairs g:
The numbers between ( 1 1 ) can be written as X (! ), but the sequence ! must begin with either 4 2 1 3 TH or HT . Therefore, none of these numbers is in C 0. Similarly, the numbers between ( 16 16 ) can be written as X (! ), but the sequence ! must begin with TTTH or TTHT , so none of these numbers is in C 0. Continuing this process, we see that C 0 will not contain any of the numbers which were removed in the construction of the Cantor set C in Example 3.2. In other words, C 0 C. With a bit more work, one can convince onself that in fact C 0 = C , i.e., by requiring consecutive coin tosses to be paired, we are removing exactly those points in 0 1] which were removed in the Cantor set construction of Example 3.2.
35
In addition to tossing a coin, another common random experiment is to pick a number, perhaps using a random number generator. Here are some probability spaces which correspond to different ways of picking a number at random. Example 1.11 Suppose we choose a number from IR in such a way that we are sure to get either 1, 4 or 16. Furthermore, we construct the experiment so that the probability of getting 1 is 4 , the probability of 9 getting 4 is 4 and the probability of getting 16 is 1 . We describe this random experiment by taking 9 9 to be IR, F to be B(IR), and setting up the probability measure so that
4 1 IP f1g = 9 IP f4g = 4 IP f16g = 9 : 9

This determines IP (A) for every set A 2 B(IR). For example, the probability of the interval (0 is 8 , because this interval contains the numbers 1 and 4, but not the number 16. 9
5]
The probability measure described in this example is L S2 , the measure induced by the stock price S2 , when the initial stock price S 0 = 4 and the probability of H is 1 . This distribution was discussed 3 immediately following Denition 2.8. Example 1.12 Uniform distribution on 0 1]. Let = 0 1] and let F = B( 0 1]), the collection of all Borel subsets containined in 0 1]. For each Borel set A 0 1], we dene IP (A) = 0 (A) to be the Lebesgue measure of the set. Because 0 1] = 1, this gives us a probability measure. 0
This probability space corresponds to the random experiment of choosing a number from 0 1] so that every number is equally likely to be chosen. Since there are innitely mean numbers in 0 1], this requires that every number have probabilty zero of being chosen. Nonetheless, we can speak of the probability that the number chosen lies in a particular set, and if the set has uncountably many points, then this probability can be positive. I know of no way to design a physical experiment which corresponds to choosing a number at random from 0 1] so that each number is equally likely to be chosen, just as I know of no way to toss a coin innitely many times. Nonetheless, both Examples 1.10 and 1.12 provide probability spaces which are often useful approximations to reality. Example 1.13 Standard normal distribution. Dene the standard normal density
'(x) = p1 e 2 : 2
;
x2
Let
= IR, F = B(IR) and for every Borel set A IR, dene
IP (A) =
' d 0:
(4.2)
36 If A in (4.2) is an interval
a b], then we can write (4.2) as the less mysterious Riemann integral: Z b 1 x2 IP a b] = p e 2 dx:
;
This corresponds to choosing a point at random on the real line, and every single point has probability zero of being chosen, but if a set A is given, then the probability the point is in that set is given by (4.2). The construction of the integral in a general probability space follows the same steps as the construction of Lebesgue integral. We repeat this construction below. Denition 1.14 Let ( F IP ) be a probability space, and let X be a random variable on this space, i.e., a mapping from to IR, possibly also taking the values 1. If X is an indicator, i.e,
X (! ) = l A (!) = I
for some set A 2 F , we dene If X is a simple function, i.e,
1 0
if ! if !
2A 2 Ac
X dIP = IP (A):
n X k=1
X (! ) =
n X Z
ck l Ak (!) I
n X
where each ck is a real number and each Ak is a set in F , we dene
X dIP =
k=1
ck
I l Ak dIP =
k=1
ck IP (Ak ):
If X is nonnegative but otherwise general, we dene
X dIP
= sup
Y dIP Y
is simple and Y (! )
X (!) for every ! 2
In fact, we can always construct a sequence of simple functions Yn and Y (! ) = limn!1 Yn (! ) for every !
n = 1 2 : : : such that
0 Y1 (! ) Y2 (! ) Y3 (! ) : : : for every ! 2
. With this sequence, we can dene
X dIP = nlim
!1
Yn dIP:

If X is integrable, i.e, where then we dene
37
X + dIP < 1
Z
;
X dIP < 1
;
X + (!) = maxfX (!) 0g X (! ) = maxf;X (!) 0g
X dIP =
X + dIP
; ; X dIP:
;
If A is a set in F and X is a random variable, we dene
The expectation of a random variable X is dened to be
X dIP = IEX =
I l A X dIP: X dIP:
The above integral has all the linearity and comparison properties one would expect. In particular, if X and Y are random variables and c is a real constant, then
(X + Y ) dIP =
cX dIP = c X dP
, then
X dIP +
Y dIP
If X (! )
Y (!) for every ! 2
X dIP
Y dIP:
In fact, we dont need to have X (! ) Y (! ) for every ! 2 in order to reach this conclusion; it is enough if the set of ! for which X (! ) Y (! ) has probability one. When a condition holds with probability one, we say it holds almost surely. Finally, if A and B are disjoint subsets of and X is a random variable, then
A B
X dIP =
X dIP +
X dIP:
We restate the Lebesgue integral convergence theorem in this more general context. We acknowledge in these statements that conditions dont need to hold for every ! ; almost surely is enough. Theorem 4.4 (Fatous Lemma) Let Xn n = 1 2 : : : be a sequence of almost surely nonnegative random variables converging almost surely to a random variable X . Then
X dIP
lim inf n
!1 !1
Xn dIP
or equivalently,
IEX lim inf IEXn: n
38 Theorem 4.5 (Monotone Convergence Theorem) Let Xn n = 1 2 : : : be a sequence of random variables converging almost surely to a random variable X . Assume that
0 X1 X2 X3
Then
almost surely:
X dIP = nlim
!1
Xn dIP
or equivalently,
IEX = nlim IEXn :

!1
Theorem 4.6 (Dominated Convergence Theorem) Let Xn n = 1 2 : : : be a sequence of random variables, converging almost surely to a random variable X . Assume that there exists a random variable Y such that jXnj Y almost surely for every n: Then
X dIP = nlim
!1
Xn dIP
or equivalently,
IEX = nlim IEXn :

!1
In Example 1.13, we constructed a probability measure on (IR B(IR)) by integrating the standard R normal density. In fact, whenever ' is a nonnegative function dened on R satisfying IR ' d 0 = 1, we call ' a density and we can dene an associated probability measure by
IP (A) =
'd
0 for every A 2 B(IR):
(4.3)
We shall often have a situation in which two measure are related by an equation like (4.3). In fact, the market measure and the risk-neutral measures in nancial markets are related this way. We say that ' in (4.3) is the Radon-Nikodym derivative of dIP with respect to 0 , and we write
dIP '= d :
0
(4.4)
The probability measure IP weights different parts of the real line according to the density '. Now suppose f is a function on (R B(IR) IP ). Denition 1.14 gives us a value for the abstract integral
We can also evaluate
Z Z
IR IR
f dIP:
0
f' d
which is an integral with respec to Lebesgue measure over the real line. We want to show that
IR
f dIP =
IR
f' d
(4.5)
39
dIP an equation which is suggested by the notation introduced in (4.4) (substitute d 0 for ' in (4.5) and cancel the d 0 ). We include a proof of this because it allows us to illustrate the concept of the standard machine explained in Williamss book in Section 5.12, page 5.
The standard machine argument proceeds in four steps. Step 1. Assume that f is an indicator function, i.e., f (x) that case, (4.5) becomes Z
= l A (x) for some Borel set A I
IR. In
IP (A) = ' d 0: A This is true because it is the denition of IP (A). Step 2. Now that we know that (4.5) holds when f is an indicator function, assume that f
simple function, i.e., a linear combination of indicator functions. In other words,
is a
f (x) =
n X k=1
ck hk (x)
where each ck is a real number and each hk is an indicator function. Then
IR
f dIP =
= = = =
Z "X n
ck hk dIP IR k=1 n X Z ck hk dIP IR k=1 Z n X ck hk ' d 0 k=1 " IR # Z X n ck hk ' d 0 IR k=1 Z

IR
f' d 0:
Step 3. Now that we know that (4.5) holds when f is a simple function, we consider a general nonnegative function f . We can always construct a sequence of nonnegative simple functions fn n = 1 2 : : : such that
0 f1 (x) f2 (x) f3 (x) : : : for every x 2 IR

and f (x) = limn!1 fn (x) for every x 2 IR. We have already proved that
We let n ! 1 and use the Monotone Convergence Theorem on both sides of this equality to get Z Z
IR
fn dIP =
IR
fn ' d
0 for every n:
IR
f dIP =
IR
f' d 0:
40 Step 4. In the last step, we consider an integrable function f , which can take both positive and negative values. By integrable, we mean that
IR
From Step 3, we have
f + dIP
<1
IR
f dIP < 1:
;
Z ZIR
IR
f + dIP = f dIP =
;
Z ZIR
IR
f +' d
;
f ' d 0:
Subtracting these two equations, we obtain the desired result:
IR
f dIP =
= =
ZIR ZIR
R
f + dIP ; f +' d f' d 0 :
0;
IR Z
f dIP
;
IR
f 'd
;
1.5 Independence
In this section, we dene and discuss the notion of independence in a general probability space ( F IP ), although most of the examples we give will be for coin toss space.
1.5.1
Independence of sets
Denition 1.15 We say that two sets A 2 F and B
IP (A \ B) = IP (A)IP (B):
2 F are independent if
Suppose a random experiment is conducted, and ! is the outcome. The probability that ! 2 A is IP (A). Suppose you are not told !, but you are told that ! 2 B . Conditional on this information, the probability that ! 2 A is The sets A and B are independent if and only if this conditional probability is the uncondidtional probability IP (A), i.e., knowing that ! 2 B does not change the probability you assign to A. This discussion is symmetric with respect to A and B ; if A and B are independent and you know that ! 2 A, the conditional probability you assign to B is still the unconditional probability IP (B ).
(A \ IP (AjB) = IP IP (B)B) :
Whether two sets are independent depends on the probability measure IP . For example, suppose we toss a coin twice, with probability p for H and probability q = 1 ; p for T on each toss. To avoid trivialities, we assume that 0 < p < 1. Then
IP fHH g = p2 IP fHT g = IP fTH g = pq IP fTT g = q2:
(5.1)

Let A = fHH HT g and B = fHT set one H and one T . Then A \ B
41
TH g. In words, A is the set H on the rst toss and B is the = fHT g. We compute IP (A) = p2 + pq = p IP (B ) = 2pq IP (A)IP (B ) = 2p2q IP (A \ B) = pq:
If p = 1 , then IP (B ), the probability of one head and one tail, is 1 . If you are told that the coin 2 2 tosses resulted in a head on the rst toss, the probability of B , which is now the probability of a T on the second toss, is still 1 . 2 Suppose however that p = 0:01. By far the most likely outcome of the two coin tosses is TT , and the probability of one head and one tail is quite small; in fact, IP (B ) = 0:0198. However, if you are told that the rst toss resulted in H , it becomes very likely that the two tosses result in one head and one tail. In fact, conditioned on getting a H on the rst toss, the probability of one H and one T is the probability of a T on the second toss, which is 0:99.
These sets are independent if and only if 2p 2 q
= pq , which is the case if and only if p =
1 2.
1.5.2
Independence of -algebras
Denition 1.16 Let G and H be sub- -algebras of F . We say that G and H are independent if every set in G is independent of every set in H, i.e,
IP (A \ B) = IP (A)IP (B) for every A 2 H B 2 G :

Example 1.14 Toss a coin twice, and let IP be given by (5.1). Let determined by the rst toss: G contains the sets
G = F1 be the
-algebra
fHH HT g fTH TT g:
Let H be the -albegra determined by the second toss: H contains the sets
fHH TH g fHT TT g:
These two -algebras are independent. For example, if we choose the set fHH the set fHH TH g from H, then we have
HT g from G and
IP fHH HT gIP fHH TH g = (p2 + pq )(p2 + pq ) = p2 IP fHH HT g\ fHH TH g = IP fHH g = p2:

No matter which set we choose in G and which set we choose in H, we will nd that the product of the probabilties is the probability of the intersection.
42 Example 1.14 illustrates the general principle that when the probability for a sequence of tosses is dened to be the product of the probabilities for the individual tosses of the sequence, then every set depending on a particular toss will be independent of every set depending on a different toss. We say that the different tosses are independent when we construct probabilities this way. It is also possible to construct probabilities such that the different tosses are not independent, as shown by the following example.
= fHH HT TH TT g to be 1 1 IP fHH g = 9 IP fHT g = 2 IP fTH g = 1 IP fTT g = 3 9 3 and for every set A , dene IP (A) to be the sum of the probabilities of the elements in A. Then IP ( ) = 1, so IP is a probability measure. Note that the sets fH on rst tossg = fHH HT g and fH on second tossg = fHH TH g have probabilities IP fHH HT g = 1 and IP fHH TH g = 3
Example 1.15 Dene IP for the individual elements of independent.
4 4 9 , so the product of the probabilities is 27 . On the other hand, the intersection of fHH HT g and fHH TH g contains the single element fHH g, which has probability 1 . These sets are not 9
1.5.3
Independence of random variables
Denition 1.17 We say that two random variables X and Y are independent if the -algebras they generate (X ) and (Y ) are independent. In the probability space of three independent coin tosses, the price S 2 of the stock at time 2 is independent of S3 . This is because S2 depends on only the rst two coin tosses, whereas S3 is S2 S2 either u or d, depending on whether the third coin toss is H or T .
Denition 1.17 says that for independent random variables X and Y , every set dened in terms of X is independent of every set dened in terms of Y . In the case of S2 and S3 just considered, for exS2 ample, the sets fS2 = are indepedent sets.
udS0g = fHTH HTT g and
n S3
S2 = u = fHHH HTH THH TTH g
Suppose X and Y are independent random variables. We dened earlier the measure induced by X on IR to be Similarly, the measure induced by Y is Now the pair (X pair
LX (A) = IP fX 2 Ag A IR: LY (B) = IP fY 2 Bg B IR:
Y ) takes values in the plane IR 2, and we can dene the measure induced by the
The set C in this last equation is a subset of the plane IR 2 . In particular, C could be a rectangle, i.e, a set of the form A B , where A IR and B IR. In this case,
LX Y (C ) = IP f(X Y ) 2 C g C IR2:
f(X Y ) 2 A Bg = fX 2 Ag \ fY 2 Bg

and X and Y are independent if and only if
43
LX Y (A B) = IP fX 2 Ag \ fY 2 Bg = IP fX 2 AgIP fY 2 B g = LX (A)LY (B ):
(5.2)
In other words, for independent random variables X and Y , the joint distribution represented by the measure LX Y factors into the product of the marginal distributions represented by the measures LX and LY . A joint density for (X
Y ) is a nonnegative function f X Y (x y) such that
LX Y (A B) = Z
Z Z
Not every pair of random variables (X Y ) has a joint density, but if a pair does, then the random variables X and Y have marginal densities dened by
A B
fX Y (x y) dx dy:
fX (x) =
These have the properties
;1
fX Y (x ) d
fY (y)
;1
fX Y ( y ) d :
LX (A) = LY (B) =
Z ZA
B
fX (x) dx A IR fY (y) dy B IR:
Suppose X and Y have a joint density. Then X and Y are independent variables if and only if the joint density is the product of the marginal densities. This follows from the fact that (5.2) is equivalent to independence of X and Y . Take A = (;1 x] and B = (;1 y ], write (5.1) in terms of densities, and differentiate with respect to both x and y . Theorem 5.7 Suppose X and Y are independent random variables. Let g and h be functions from IR to IR. Then g(X ) and h(Y ) are also independent random variables. P ROOF : Let us denote W = g (X ) and Z a typical set in (W ) is of the form
= h(Y ). We must consider sets in (W ) and (Z ). But
f! W (!) 2 Ag = f! : g(X (!)) 2 Ag

which is dened in terms of the random variable X . Therefore, this set is in (X ). (In general, we have that every set in (W ) is also in (X ), which means that X contains at least as much information as W . In fact, X can contain strictly more information than W , which means that (X ) will contain all the sets in (W ) and others besides; this is the case, for example, if W = X 2.) In the same way that we just argued that every set in (W ) is also in (X ), we can show that every set in (Z ) is also in (Y ). Since every set in (X ) is independent of every set in (Y ), we conclude that every set in (W ) is independent of every set in (Z ).
44 Denition 1.18 Let X1 X2 : : : be a sequence of random variables. We say that these random variables are independent if for every sequence of sets A1 2 (X1) A2 2 (X2) : : : and for every positive integer n,
IP (A1 \ A2 \
An) = IP (A1 )IP (A2 )
IP (An ):
1.5.4
Correlation and independence
Theorem 5.8 If two random variables X and Y are independent, and if g and h are functions from IR to IR, then
IE g (X )h(Y )] = IEg (X ) IEh(Y )
provided all the expectations are dened.
I P ROOF : Let g (x) = l A (x) and trying to prove becomes
h(y) = l B (y ) be indicator functions. I
Then the equation we are
IP fX 2 Ag \ fY 2 Bg = IP fX 2 AgIP fY 2 Bg
which is true because X and Y are independent. Now use the standard machine to get the result for general functions g and h. The variance of a random variable X is dened to be Var(X ) = IE
X ; IEX ]2:
The covariance of two random variables X and Y is dened to be Cov(X
Y ) = IE (X ; IEX )(Y ; IEY ) = IE XY ] ; IEX IEY:
i
X and Y
According to Theorem 5.8, for independent random variables, the covariance is zero. If both have positive variances, we dene their correlation coefcient
(X Y ) = p Cov(X Y ) : Var(X )Var(Y )

For independent random variables, the correlation coefcient is zero. Unfortunately, two random variables can have zero correlation and still not be independent. Consider the following example. Example 1.16 Let X be a standard normal random variable, let Z be independent of X and have the distribution IP fZ = 1g = IP fZ = ;1g = 0. Dene Y = XZ . We show that Y is also a standard normal random variable, X and Y are uncorrelated, but X and Y are not independent. The last claim is easy to see. If X 2 = Y 2 almost surely.
X and Y
were independent, so would be X 2 and Y 2 , but in fact,

We next check that Y is standard normal. For y
45
IP fY
Being standard normal, both X and Y have expected value zero. Therefore, Cov(X
Since X is standard normal, IP fX y g = IP fX which shows that Y is also standard normal.
yg = IP fY = IP fX = IP fX 1 = 2 IP fX
2 IR, we have y and Z = 1g + IP fY y and Z = ;1g y and Z = 1g + IP f;X y and Z = ;1g y gIP fZ = 1g + IP f;X y gIP fZ = ;1g y g + 1 IP f;X y g: 2
;yg, and we have IP fY yg = IP fX yg,
Y ) = IE XY ] = IE X 2Z ] = IEX 2 IEZ = 1 0 = 0: Y )?
Where in IR2 does the measure LX Y put its mass, i.e., what is the distribution of (X
We conclude this section with the observation that for independent random variables, the variance of their sum is the sum of their variances. Indeed, if X and Y are independent and Z = X + Y , then Var(Z )
= IE (Z ; IEZ )2 i = IE X + Y ; IEX ; IEY )2 h i = IE (X ; IEX )2 + 2(X ; IEX )(Y ; IEY ) + (Y ; IEY )2 = Var(X ) + 2IE X ; IEX ]IE Y ; IEY ] + Var(Y ) = Var(X ) + Var(Y ):
This argument extends to any nite number of random variables. If we are given independent random variables X1 X2 : : : Xn , then Var(X1 + X2 +
+ Xn ) = Var(X1) + Var(X2) +
+ Var(Xn ):
(5.3)
1.5.5
Independence and conditional expectation.
We now return to property (k) for conditional expectations, presented in the lecture dated October 19, 1995. The property as stated there is taken from Williamss book, page 88; we shall need only the second assertion of the property: (k) If a random variable X is independent of a -algebra H, then
IE X jH] = IEX:
The point of this statement is that if X is independent of H, then the best estimate of X based on the information in H is IEX , the same as the best estimate of X based on no information.
46 To show this equality, we observe rst that IEX is H-measurable, since it is not random. We must also check the partial averaging property
If X is an indicator of some set B , which by assumption must be independent of H, then the partial averaging equation we must check is
IEX dIP =
X dIP for every A 2 H:
The left-hand side of this equation is IP (A)IP (B ), and the right hand side is
IP (B) dIP =
I l B dIP:
I I l Al B dIP =
I l A B dIP = IP (A \ B):
\
The partial averaging equation holds because A and B are independent. The partial averaging equation for general X independent of H follows by the standard machine.
1.5.6
Law of Large Numbers
There are two fundamental theorems about sequences of independent random variables. Here is the rst one. Theorem 5.9 (Law of Large Numbers) Let X1 X2 : : : be a sequence of independent, identically distributed random variables, each with expected value and variance 2. Dene the sequence of averages Then Yn converges to
Yn = X1 + X2 + + Xn n = 1 2 : : :: n almost surely as n ! 1.
We are not going to give the proof of this theorem, but here is an argument which makes it plausible. We will use this argument later when developing stochastic calculus. The argument proceeds in two steps. We rst check that IEYn = for every n. We next check that Var(Yn ) ! 0 as n ! 0. In other words, the random variables Yn are increasingly tightly distributed around as n ! 1. For the rst step, we simply compute
1 IEYn = n IEX1 + IEX2 +
1 + IEXn] = n | + + + } = : ] {z
n times
For the second step, we rst recall from (5.3) that the variance of the sum of independent random variables is the sum of their variances. Therefore, Var(Yn ) = As n ! 1, we have Var(Yn ) ! 0.
n X k=1
Var
n Xk = X 2 = 2 : 2 n n k=1 n
47
1.5.7
Central Limit Theorem
The Law of Large Numbers is a bit boring because the limit is nonrandom. This is because the denominator in the denition of Y n is so large that the variance of Yn converges to zero. If we want p to prevent this, we should divide by n rather than n. In particular, if we again have a sequence of independent, identically distributed random variables, each with expected value and variance 2 , but now we set then each Zn has expected value zero and Var(Zn ) =
) Zn = (X1 ; ) + (X2 ;pn +

n X
Var
+ (Xn ; )
2
As n ! 1, the distributions of all the random variables Z n have the same degree of tightness, as measured by their variance, around their expected value 0. The Central Limit Theorem asserts that as n ! 1, the distribution of Z n approaches that of a normal random variable with mean (expected value) zero and variance 2. In other words, for every set A IR,
k=1
Xk ; pn
n X k=1
2 n = :
1 Z e lim IP fZn 2 Ag = p n 2 A
!1
x2 2 2
dx:
48
Chapter 2
Conditional Expectation
Please see Hulls book (Section 9.6.)
2.1 A Binomial Model for Stock Price Dynamics

Stock prices are assumed to follow this simple binomial model: The initial stock price during the period under study is denoted S 0. At each time step, the stock price either goes up by a factor of u or down by a factor of d. It will be useful to visualize tossing a coin at each time step, and say that the stock price moves up by a factor of u if the coin comes out heads (H ), and down by a factor of d if it comes out tails (T ).
Note that we are not specifying the probability of heads here. Consider a sequence of 3 tosses of the coin (See Fig. 2.1) The collection of all possible outcomes (i.e. sequences of tosses of length 3) is
= fHHH HHT HTH HTT THH THH THT TTH TTT g:

A typical sequence of will be denoted ! , and ! k will denote the kth element in the sequence ! . We write Sk (! ) to denote the stock price at time k (i.e. after k tosses) under the outcome !. Note that Sk (! ) depends only on ! 1 !2 : : : !k . Thus in the 3-coin-toss example we write for instance,
S1 (!) = S1(!1 !2 !3) = S1(! 1)

4 4
S2 (!) = S2(! 1 !2 !3) = S2(!1 !2):

4 4
Each Sk is a random variable dened on the set . More precisely, let F = P ( ). Then F is a -algebra and ( F ) is a measurable space. Each Sk is an F -measurable function !IR, that is, ; Sk 1 is a function B!F where B is the Borel -algebra on I . We will see later that Sk is in fact R 49
50
= 3 S2 (HH) =u 2S0 2 = S (H) = uS0 1 1= S0 1 = S (T) = dS0 1 2 = S2 (TT) = d 2S0 3 = S3 (TTT) = d 3 S0 = 3 2 = 2 = S2 (HT) = ud S 0 S2 (TH) = ud S 0 = 3 3 = 3 = S3 (HHT) = u2 d S0 S3 (HTH) = u2 d S0 S3 (THH) = u2 d S0 3 S3 (HHH) = u S 0
S3 (HTT) = d 2 u S0 S3 (THT) = d 2 u S0 S3 (TTH) = d 2 u S0
Figure 2.1: A three coin period binomial model. measurable under a sub- -algebra of F . Recall that the Borel -algebra B is the -algebra generated by the open intervals of I . In this course we will always deal with subsets of I that belong to B. R R For any random variable X dened on a sample space
4
and any y
2 IR, we will use the notation:
fX yg = f! 2 X (!) yg: The sets fX < y g fX y g fX = y g etc, are dened similarly. Similarly for any subset B of IR,
we dene
fX 2 Bg = f! 2 X (!) 2 Bg:
4
Assumption 2.1
u > d > 0.
2.2 Information
Denition 2.1 (Sets determined by the rst k tosses.) We say that a set A is determined by the rst k coin tosses if, knowing only the outcome of the rst k tosses, we can decide whether the outcome of all tosses is in A. In general we denote the collection of sets determined by the rst k tosses by F k . It is easy to check that F k is a -algebra. Note that the random variable Sk is F k -measurable, for each k = 1
2 : : : n.
Example 2.1 In the 3 coin-toss example, the collection F 1 of sets determined by the rst toss consists of:
CHAPTER 2. Conditional Expectation

1. 2. 3. 4.
51
4 AH = fHHH HHT HTH HT T g, 4 AT = fTHH THT TTH TTT g,

, .
The collection F 2 of sets determined by the rst two tosses consists of: 1. 2. 3.
4. 5. The complements of the above sets, 6. Any union of the above sets (including the complements), 7. and .
4 AHH = fHHH HHT g, 4 AHT = fHTH HTT g, 4 AT H = fTHH THT g, 4 AT T = fTTH T TT g,
Denition 2.2 (Information carried by a random variable.) Let X be a random variable !IR. We say that a set A is determined by the random variable X if, knowing only the value X (!) of the random variable, we can decide whether or not ! 2 A. Another way of saying this is that for every y 2 IR, either X ;1(y ) A or X ;1 (y ) \ A = . The collection of susbets of determined by X is a -algebra, which we call the -algebra generated by X , and denote by (X ). If the random variable X takes nitely many different values, then tion of sets
(X ) is generated by the collec-
fX 1(X (!))j! 2 g
;
these sets are called the atoms of the -algebra In general, if X is a random variable
(X ). !IR, then (X ) is given by

;
(X ) = fX 1(B ) B 2 Bg:
Example 2.2 (Sets determined by S2 ) The -algebra generated by S2 consists of the following sets: 1. 2. 3.
AHH = fHHH HHT g = f! 2 S2 (!) = u2S0 g, AT T = fTTH T TT g = fS2 = d2S0 g AHT AT H = fS2 = udS0g
4. Complements of the above sets, 5. Any union of the above sets, 6. = fS2 (!) 2 g, 7. = fS2 (!) 2 IRg.
52
2.3 Conditional Expectation

In order to talk about conditional expectation, we need to introduce a probability measure on our coin-toss sample space . Let us dene
p 2 (0 1) is the probability of H , q = (1 ; p) is the probability of T ,

4
the coin tosses are independent, so that, e.g., IP (HHT ) = p2q etc.
IP (A) = P! A IP (!), 8A
4 2
Denition 2.3 (Expectation.)
IEX =
4
X
!
2
X (!)IP (! ):
1 0
if ! if !
If A
then
IA (!) =
4
and
IE (IAX ) =
Z
A
XdIP =
2A 62 A X
2
We can think of IE (IA X ) as a partial average of X over the set A.
X (!)IP (! ):
2.3.1
An example
Let us estimate S1, given S2. Denote the estimate by IE (S1jS2). From elementary probability, IE (S1jS2) is a random variable Y whose value at ! is dened by
Y (!) = IE (S1jS2 = y ) where y = S2 (!). Properties of IE (S1jS2):
IE (S1jS2) should depend on !, i.e., it is a random variable. If the value of S2 is known, then the value of IE (S 1jS2 ) should also be known. In particular, If ! = HHH or ! = HHT , then S2 (! ) = u2 S0. If we know that S2(! ) = u2 S0 , then even without knowing ! , we know that S 1 (! ) = uS0. We dene IE (S1jS2)(HHH ) = IE (S1jS2)(HHT ) = uS0: If ! = TTT or ! = TTH , then S2(! ) = d2S0 . If we know that S2(! ) = d2S0 , then even without knowing ! , we know that S 1 (! ) = dS0. We dene IE (S1jS2)(TTT ) = IE (S1jS2 )(TTH ) = dS0 :

If !
53
udS0, then we do not know whether S1 = uS0 or S1 = dS0. We then take a weighted
average:
2 A = fHTH HTT THH THT g, then S2(!) = udS0. If we know S2(!) =

IP (A) = p2 q + pq2 + p2q + pq 2 = 2pq:
Furthermore,
Z
A
S1 dIP = p2quS0 + pq2uS0 + p2qdS0 + pq 2dS0 = pq (u + d)S0
For !
2 A we dene
R S dIP IE (S1jS2)(!) = A (1A) = 1 (u + d)S0: 2 IP Z

A
Then
IE (S1jS2)dIP =
Z
A
S1dIP:
In conclusion, we can write where
In other words, IE (S1jS2) is random only through dependence on S 2 . We also write
8 > uS0 < g(x) = > 1 (u + d)S0 2 : dS

0
IE (S1jS2)(!) = g (S2(!))
if x = u2 S0 if x = udS0 if x = d2 S0
IE (S1jS2 = x) = g (x)
where g is the function dened above. The random variable IE (S1jS2) has two fundamental properties:
IE (S1jS2) is (S2)-measurable. For every set A 2 (S2), Z

A
2.3.2
Let ( on ( (a)
IE (S1jS2)dIP =
Z
A
S1dIP:
Denition of Conditional Expectation
Please see Williams, p.83.
F IP ) be a probability space, and let G be a sub- -algebra of F . Let X be a random variable F IP ). Then IE (X jG) is dened to be any random variable Y that satises:
Y is G -measurable,
54 (b) For every set A 2 G , we have the partial averaging property
Y dIP =
XdIP:
Uniqueness. There can be more than one random variable Y satisfying the above properties, but if Y 0 is another one, then Y = Y 0 almost surely, i.e., IP f! 2 Y (!) = Y 0 (!)g = 1: Notation 2.1 For random variables X
Existence. There is always a random variable Y satisfying the above properties (provided that IE jX j < 1), i.e., conditional expectations always exist.
Y , it is standard notation to write

4
IE (X jY ) = IE (X j (Y )):
Here are some useful ways to think about IE (X jG): A random experiment is performed, i.e., an element ! of is selected. The value of ! is partially but not fully revealed to us, and thus we cannot compute the exact value of X (!). Based on what we know about ! , we compute an estimate of X (!). Because this estimate depends on the partial information we have about ! , it depends on !, i.e., IE X jY ](!) is a function of ! , although the dependence on ! is often not shown explicitly. If the -algebra G contains nitely many sets, there will be a smallest set A in G containing ! , which is the intersection of all sets in G containing ! . The way ! is partially revealed to us is that we are told it is in A, but not told which element of A it is. We then dene IE X jY ](!) to be the average (with respect to IP ) value of X over this set A. Thus, for all ! in this set A, IE X jY ](!) will be the same.
2.3.3
Further discussion of Partial Averaging
The partial averaging property is
A
We can rewrite this as
IE (X jG)dIP =
Z
A
XdIP 8A 2 G :
(3.1)
IE IA:IE (X jG)] = IE IA:X ]:

Note that IA is a G -measurable random variable. In fact the following holds: Lemma 3.10 If V is any G -measurable random variable, then provided IE jV:IE (X jG)j < 1,
(3.2)
IE V:IE (X jG)] = IE V:X ]:
(3.3)
55
Proof: To see this, rst use (3.2) and linearity of expectations to prove (3.3) when V is a simple G -measurable random variable, i.e., V is of the form V = Pn=1 ck IAK , where each Ak is in G and k each ck is constant. Next consider the case that V is a nonnegative G -measurable random variable, but is not necessarily simple. Such a V can be written as the limit of an increasing sequence of simple random variables Vn ; we write (3.3) for each Vn and then pass to the limit, using the Monotone Convergence Theorem (See Williams), to obtain (3.3) for V . Finally, the general G measurable random variable V can be written as the difference of two nonnegative random-variables V = V + ; V ; , and since (3.3) holds for V + and V ; it must hold for V as well. Williams calls this argument the standard machine (p. 56). Based on this lemma, we can replace the second condition in the denition of a conditional expectation (Section 2.3.2) by: (b) For every G -measurable random-variable V , we have
IE V:IE (X jG)] = IE V:X ]:

2.3.4 Properties of Conditional Expectation
(3.4)
Please see Willams p. 88. Proof sketches of some of the properties are provided below. (a)
(b)
IE (IE (X jG)) = IE (X ): Proof: Just take A in the partial averaging property to be . The conditional expectation of X is thus an unbiased estimator of the random variable X . If X is G -measurable, then IE (X jG) = X: Proof: The partial averaging property holds trivially when Y is replaced by X . And since X is G -measurable, X satises the requirement (a) of a conditional expectation as well. If the information content of G is sufcient to determine X , then the best estimate of X based on G is X itself. IE (a1X1 + a2X2jG) = a1IE (X1jG) + a2IE (X2jG):
0 almost surely, then
(c) (Linearity) (d) (Positivity) If X
IE (X jG) 0:
Proof: Take A = f! 2 R (X jG)(! ) < 0g. This set is in G since IE (X jG) is G -measurable. IE R Partial averaging implies A IE (X jG)dIP = A XdIP . The right-hand side is greater than or equal to zero, and the left-hand side is strictly negative, unless IP (A) = 0. Therefore, IP (A) = 0.
56 (h) (Jensens Inequality) If
: R!R is convex and IE j (X )j < 1, then
IE ( (X )jG)
Recall the usual Jensens Inequality: IE
(IE (X jG)): (IE (X )):
(X )
(i) (Tower Property) If H is a sub- -algebra of G , then
IE (IE (X jG)jH) = IE (X jH):

-algebra of G means that G contains more information than H. If we estimate X based on the information in G , and then estimate the estimator based on the smaller amount of information in H, then we get the same result as if we had estimated X directly based on the information in H. (j) (Taking out what is known) If Z is G -measurable, then
H is a sub-
IE (ZX jG) = Z:IE (X jG):

Proof: Let Z be a G -measurable random variable. A random variable Y is IE (ZX jG) if and only if (a) Y is G -measurable; R R (b) A Y dIP = A ZXdIP When conditioning on G , the G -measurable random variable Z acts like a constant.
8A 2 G .
Take Y = Z:IE (X jG). Then Y satises (a) (a product of G -measurable random variables is G-measurable). Y also satises property (b), as we can check below:
Y dIP = IE (IA:Y ) = IE IAZIE (X jG)] = IE IAZ:X ] ((b) with V = IA Z Z = ZXdIP:

A
(k) (Role of Independence) If H is independent of
( (X ) G), then
IE (X j (G H)) = IE (X jG):
In particular, if X is independent of H, then
IE (X jH) = IE (X ):
If H is independent of X and G , then nothing is gained by including the information content of H in the estimation of X .
57
2.3.5
Examples from the Binomial Model
Recall that F 1
= f AH AT g. Notice that IE (S2jF 1 ) must be constant on AH and AT . Now since IE (S2jF 1 ) must satisfy the partial averaging property,
Z
We compute
AH AT
IE (S2jF 1)dIP = IE (S2jF 1)dIP =
Z Z
AH AT
S2 dIP S2dIP:
Z
AH
IE (S2jF 1 )dIP = IP (AH ):IE (S2jF 1 )(!) = pIE (S2jF 1)(! ) 8! 2 AH :
On the other hand,
Z
AH
S2 dIP = p2u2 S0 + pqudS0:
Therefore, We can also write
IE (S2jF 1 )(!) = pu2 S0 + qudS0 8! 2 AH : IE (S2jF 1)(! ) = pu2 S0 + qudS0 = (pu + qd)uS0 = (pu + qd)S1(!) 8! 2 AH
Similarly,
IE (S2jF 1 )(!) = (pu + qd)S1(!) 8! 2 AT : IE (S2jF 1)(!) = (pu + qd)S1(!) 8! 2 :
Thus in both cases we have
A similar argument one time step later shows that
IE (S3jF 2)(!) = (pu + qd)S2(! ):

We leave the verication of this equality as an exercise. We can verify the Tower Property, for instance, from the previous equations we have
IE IE (S3jF 2)jF 1] = IE (pu + qd)S2jF 2] = (pu + qd)IE (S2jF 1) = (pu + qd)2S1: This nal expression is IE (S 3jF 1).
(linearity)
58
2.4 Martingales
The ingredients are: A probability space ( A sequence of
F . Such a sequence of
Conditions for a martingale:
F IP ). -algebras F 0 F 1 : : : F n , with the property that F 0 F 1 : : : F n

-algebras is called a ltration.
A sequence of random variables M0
M1 : : : Mn. This is called a stochastic process.
1. Each Mk is F k -measurable. If you know the information in F k , then you know the value of Mk . We say that the process fMk g is adapted to the ltration fF k g. 2. For each k, IE (Mk+1 jF k ) = Mk . Martingales tend to go neither up nor down. A supermartingale tends to go down, i.e. the second condition above is replaced by IE (M k+1 jF k ) Mk ; a submartingale tends to go up, i.e. IE (M k+1jF k ) Mk .
Example 2.3 (Example from the binomial model.) For k
= 1 2 we already showed that
IE(Sk+1 jF k ) = (pu + qd)Sk :

For k = 0, we set F 0 = f g, the trivial -algebra. This -algebra contains no information, and any F 0 -measurable random variable must be constant (nonrandom). Therefore, by denition, IE(S1 jF 0 ) is that constant which satises the averaging property
IE(S1 jF 0 )dIP =
S1 dIP:
The right hand side is IES1
= (pu + qd)S0 , and so we have IE(S1 jF 0) = (pu + qd)S0 :
In conclusion, If (pu + qd) = 1 then fSk If (pu + qd) 1 then fSk If (pu + qd) 1 then fSk
F k k = 0 1 2 3g is a martingale. F k k = 0 1 2 3g is a submartingale. F k k = 0 1 2 3g is a supermartingale.
Chapter 3
Arbitrage Pricing
3.1 Binomial Pricing
Return to the binomial pricing model Please see: Cox, Ross and Rubinstein, J. Financial Economics, 7(1979), 229263, and Cox and Rubinstein (1985), Options Markets, Prentice-Hall.
Example 3.1 (Pricing a Call Option) Suppose u = 2 d = 0:5 r = 25%(interest rate), S0 = 50. (In this and all examples, the interest rate quoted is per unit time, and the stock prices S0 S1 : : : are indexed by the same time periods). We know that
S1 (!) =
100 25
if ! 1 if ! 1
=H =T
Find the value at time zero of a call option to buy one share of stock at time 1 for $50 (i.e. the strike price is $50). The value of the call at time 1 is
V1(!) = (S1 (!) ; 50)+ =
50 0
if ! 1 if ! 1
=H =T
Suppose the option sells for $20 at time 0. Let us construct a portfolio: 1. Sell 3 options for $20 each. Cash outlay is ;$60: 2. Buy 2 shares of stock for $50 each. Cash outlay is $100. 3. Borrow $40. Cash outlay is ;$40:
59
60
This portfolio thus requires no initial investment. For this portfolio, the cash outlay at time 1 is: Pay off option Sell stock Pay off debt
$0 The arbitrage pricing theory (APT) value of the option at time 0 is V0 = 20.
Assumptions underlying APT: Unlimited short selling of stock. Unlimited borrowing. No transaction costs. Agent is a small investor, i.e., his/her trading does not move the market. Important Observation: The APT value of the option does not depend on the probabilities of H and T .
;;;;; ;;;;;
!1 = H $150 ;$200 $50 $0
!1 = T $0 ;$50 $50
3.2 General one-step APT

Suppose a derivative security pays off the amount V 1 at time 1, where V1 is an F 1 -measurable random variable. (This measurability condition is important; this is why it does not make sense to use some stock unrelated to the derivative security in valuing it, at least in the straightforward method described below). Sell the security for V0 at time 0. (V0 is to be determined later).
negative).
0 shares of stock at time 0. ( 0 is also to be determined later) Invest V0 ; 0S0 in the money market, at risk-free interest rate r. (V0 ;
Buy Then wealth at time 1 is
0S0 might be
X1 =
4
0 S1 + (1 + r)(V0 ; 0S0 ) (1 + r)V0 + 0 (S1 ; (1 + r)S0):
We want to choose V0 and
0 so that
X1 = V1
regardless of whether the stock goes up or down.
CHAPTER 3. Arbitrage Pricing
61
The last condition above can be expressed by two equations (which is fortunate since there are two unknowns):
(1 + r)V0 + (1 + r)V0 +
0(S1 (H ) ; (1 + r)S0) = V1 (H ) 0 (S1(T ) ; (1 + r)S0) = V1 (T )
(2.1) (2.2)
Note that this is where we use the fact that the derivative security value V k is a function of Sk , i.e., when Sk is known for a given ! , Vk is known (and therefore non-random) at that ! as well. Subtracting the second equation above from the rst gives
0=
Plug the formula (2.3) for
V1(H ) ; V1(T ) : S1(H ) ; S1(T )
(2.3)
0 into (2.1):
(1 + r)V0 = V1 (H ) ; 0 (S1(H ) ; (1 + r)S0) V1 = V1 (H ) ; V1((H ) ; )S (T ) (u ; 1 ; r)S0 u;d 0 = 1 (u ; d)V (H ) ; (V (H ) ; V (T ))(u ; 1 ; r)]

1 1 1 u;d r 1 = 1 + ; ; d V1(H ) + u ; ; ; r V1(T ): u d u d
We have already assumed u > d > 0. We now also assume d be an arbitrage opportunity). Dene
4 4
1 + r u (otherwise there would
Then p > 0 and q > 0. Since p + q = 1, we have 0 < p < 1 and q = 1 ; p. Thus, p ~ ~ ~ ~ ~ ~ ~ ~ probabilities. We will return to this later. Thus the price of the call at time 0 is given by
r 1 p = 1 + ;; d q = u ; ; ; r : ~ ~ u d u d
1 ~ ~ V0 = 1 + r pV1(H ) + qV1(T )]:
q are like ~
(2.4)
3.3 Risk-Neutral Probability Measure

Let be the set of possible outcomes from n coin tosses. Construct a probability measure by the formula 4 f IP (! 1 !2 : : : !n ) = p#fj !j =H g q #fj !j =T g ~ ~
f IP on
f f f IP is called the risk-neutral probability measure. We denote by IE the expectation under IP . Equation 2.4 says
f 1 V0 = IE 1 + r V1 :
62
f P Theorem 3.11 Under I , the discounted stock price process f(1+ r) ;k Sk

Proof:
F k gn=0 is a martingale. k
f IE (1 + r)
= =
d(u ; 1 ; r) S k u;d u;d + du = (1 + r) (k+1) u + ur ; ud ; d ; d ; dr Sk u = (1 + r) (k+1) (u ; d)(1 + r) Sk u;d kS : = (1 + r) k

; ; ; ;
(k+1) Sk+1 jF k ] (1 + r) (k+1)(~u + qd)Sk p ~ (1 + r) (k+1) u(1 + r ; d) +

; ;
3.3.1
Portfolio Process
The portfolio process is
=(
:::
n 1), where
;
k is the number of shares of stock held between times k and k + 1.

Each
k is F k -measurable. (No insider trading).
3.3.2
Self-nancing Value of a Portfolio Process

Start with nonrandom initial wealth X 0 , which need not be 0. Dene recursively
Xk+1 =
=
Then each Xk is F k -measurable.
k Sk+1 + (1 + r)(Xk ; k Sk ) (1 + r)Xk + k (Sk+1 ; (1 + r)Sk ):
(3.1) (3.2)
f P Theorem 3.12 Under I , the discounted self-nancing portfolio process value f(1 + r) ;k Xk is a martingale.
Proof: We have
F k gn=0 k
(1 + r) (k+1)Xk+1 = (1 + r) k Xk + k (1 + r) (k+1) Sk+1 ; (1 + r) k Sk :

; ; ; ;

Therefore,
63
f IE (1 + r) (k+1)Xk+1 jF k ] f = IE (1 + r) k Xk jF k ] f +IE (1 + r) (k+1) k Sk+1 jF k ] f ;IE (1 + r) k k Sk jF k ] = (1 + r) k Xk (requirement (b) of conditional exp.) f + k IE (1 + r) (k+1) Sk+1 jF k ] (taking out what is known) ;(1 + r) k k Sk (property (b)) = (1 + r) k Xk (Theorem 3.11)
; ; ; ; ; ; ; ;
3.4 Simple European Derivative Securities

Denition 3.1 () A simple European derivative security with expiration time m is an F m -measurable random variable Vm . (Here, m is less than or equal to n, the number of periods/coin-tosses in the model). Denition 3.2 () A simple European derivative security Vm is said to be hedgeable if there exists a constant X0 and a portfolio process = ( 0 : : : m;1 ) such that the self-nancing value process X0 X1 : : : Xm given by (3.2) satises
Xm(!) = Vm(!) 8! 2 : In this case, for k = 0 1 : : : m, we call Xk the APT value at time k of V m .
Theorem 4.13 (Corollary to Theorem 3.12) If a simple European security Vm is hedgeable, then for each k = 0 1 : : : m, the APT value at time k of V m is
f Vk = (1 + r)k IE (1 + r) m VmjF k ]:
4 ;
(4.1)
Proof: We rst observe that if martingale property for each k
we use the tower property to write
= 0 1 : : : m ; 1, then we also have f IE Mm jF k ] = Mk k = 0 1 : : : m ; 1: (4.2) When k = m ; 1, the equation (4.2) follows directly from the martingale property. For k = m ; 2,
fMk F k k = 0 1 : : : mg is a martingale, i.e., satises the f IE Mk+1 jF k ] = Mk
f ff IE Mm jF m 2 ] = IE IE Mm jF m 1 ]jF m 2 ] f = IE Mm 1 jF m 2 ] = Mm 2 :
; ; ; ; ; ;
64 We can continue by induction to obtain (4.2). If the simple European security Vm is hedgeable, then there is a portfolio process whose selfnancing value process X0 X1 : : : Xm satises Xm = Vm . By denition, Xk is the APT value at time k of Vm . Theorem 3.12 says that
X0 (1 + r) 1X1 : : : (1 + r) m Xm is a martingale, and so for each k, f f (1 + r) k Xk = IE (1 + r) m Xm jF k ] = IE (1 + r) m Vm jF k ]:

; ; ; ; ;
Therefore,
f Xk = (1 + r)k IE (1 + r) m VmjF k ]:
;
3.5 The Binomial Model is Complete

Can a simple European derivative security always be hedged? It depends on the model. If the answer is yes, the model is said to be complete. If the answer is no, the model is called incomplete. Theorem 5.14 The binomial model is complete. In particular, let V m be a simple European derivative security, and set
f Vk (!1 : : : !k ) = (1 + r)k IE (1 + r) m Vm jF k ](!1 : : : !k )

;
(5.1)
k (!1
f E Starting with initial wealth V 0 = I (1 + r);m Vm ], the self-nancing value of the portfolio process 0 1 : : : m;1 is the process V0 V1 : : : Vm.
Proof: Let V0 : : : Vm;1 and 0 : : : m;1 be dened by (5.1) and (5.2). Set X0 = V0 and dene the self-nancing value of the portfolio process 0 : : : m;1 by the recursive formula 3.2:
V : : : !k ) = Sk+1(!1 :: :: :: !k H ) ; Vk+1 (!1 :: :: :: !k T ) : !k H ) ; Sk+1 (!1 !k T ) k+1 (!1
(5.2)
Xk+1 =
We need to show that
k Sk+1 + (1 + r)(Xk ; k Sk ):
Xk = Vk
8k 2 f0 1 : : : mg:
(5.3)
We proceed by induction. For k = 0, (5.3) holds by denition of X 0. Assume that (5.3) holds for some value of k, i.e., for each xed (!1 : : : !k ), we have
Xk (!1 : : : !k ) = Vk (!1 : : : !k ):

We need to show that
65
Xk+1 (!1 : : : !k H ) = Vk+1(!1 : : : !k H ) Xk+1(!1 : : : !k T ) = Vk+1(!1 : : : !k T ):

We prove the rst equality; the second can be shown similarly. Note rst that
f IE (1 + r)
(k+1) Vk+1 jF k ]
f In other words, f(1 + r);k Vk gn=0 is a martingale under I . In particular, P k

;
= = =
ff IE IE (1 + r) m VmjF k+1 ]jF k ] f IE (1 + r) mVm jF k ] (1 + r) k Vk

; ; ;
f Vk (!1 : : : !k ) = IE (1 + r) 1 Vk+1jF k ](!1 : : : !k ) 1 p ~ = 1 + r (~Vk+1 (!1 : : : !k H ) + q Vk+1 (!1 : : : !k T )) :

1 p ~ Vk = 1 + r (~Vk+1 (H ) + qVk+1(T )) :
Since (!1 : : : !k ) will be xed for the rest of the proof, we simplify notation by suppressing these symbols. For example, we write the last equation as
We compute
Xk+1 (H ) = k Sk+1 (H ) + (1 + r)(Xk ; k Sk ) = k (Sk+1 (H ) ; (1 + r)Sk ) + (1 + r)Vk V (H ) ; V = Sk+1 (H ) ; Sk+1 (T ) (Sk+1 (H ) ; (1 + r)Sk ) k+1 k+1 (T ) +~Vk+1 (H ) + q Vk+1 (T ) p ~ (H ; Vk = Vk+1uS ) ; dS+1 (T ) (uSk ; (1 + r)Sk ) k k +~Vk+1 (H ) + q Vk+1 (T ) p ~ = (Vk+1 (H ) ; Vk+1 (T )) u ; 1 ; r + pVk+1 (H ) + qVk+1 (T ) ~ ~ u;d = (Vk+1 (H ) ; Vk+1 (T )) q + pVk+1 (H ) + q Vk+1 (T ) ~ ~ ~ = Vk+1 (H ):
66
Chapter 4
The Markov Property

4.1 Binomial Model Pricing and Hedging
Recall that Vm is the given simple European derivative security, and the value and portfolio processes are given by:
f Vk = (1 + r)k IE (1 + r) m Vm jF k ] k = 0 1 : : : m ; 1: Vk+1 (!1 : : : !k H ) ; Vk+1(!1 : : : !k T ) k (!1 : : : !k ) = S (! : : : ! H ) ; S (! : : : ! T ) k = 0 1 : : : m ; 1: k+1 1 k k+1 1 k

;
Example 4.1 (Lookback Option) u = 2 d = 0:5 r = 0:25 S0 = 4 p = 1+;;d = 0:5 q = 1 ; p = ~ ur d ~ ~ Consider a simple European derivative security with expiration 2, with payoff given by (See Fig. 4.1):
0:5:
V2 = 0max2(Sk ; 5)+ : k
Notice that
V2 (HH) = 11 V2 (HT) = 3 6= V2 (TH) = 0 V2 (T T ) = 0: 1 ~ V1 (H) = 1 + r pV2 (HH) + qV2 (HT)] = 4 0:5 11 + 0:5 3] = 5:60 ~ 5 V1 (T) = 4 0:5 0 + 0:5 0] = 0 5 4 0:5 5:60 + 0:5 0] = 2:24: V0 = 5
0
The payoff is thus path dependent. Working backward in time, we have:
Using these values, we can now compute:
V ; = S1(H) ; V1 (T ) = 0:93 (H) S (T )

1 1 2 2
V (H) = S2 (HH) ; V2 (HT) = 0:67 (HH) ; S (HT)

67
68
S2 (HH) = 16
S (H) = 8 1 S2 (HT) = 4 S =4 0 S1 (T) = 2 S2 (TH) = 4
S2 (TT) = 1
Figure 4.1: Stock price underlying the lookback option.

1
V (T) = S2 (TH) ; V2 (TT ) = 0: (TH) ; S (TT)

2 2
Working forward in time, we can check that
X1 (H) = 0S1 (H) + (1 + r)(X0 ; 0S0 ) = 5:59 V1 (H) = 5:60 X1 (T ) = 0S1 (T) + (1 + r)(X0 ; 0S0 ) = 0:01 V1 (T ) = 0 X1 (HH) = 1(H)S1 (HH) + (1 + r)(X1 (H) ; 1(H)S1 (H)) = 11:01 V1 (HH) = 11
etc.
Example 4.2 (European Call) Let u = 2 with expiration time 2 and payoff function
~ ~ d = 1 r = 1 S0 = 4 p = q = 2 4 V2 = (S2 ; 5)+ :
1 2,
and consider a European call
Note that
V2 (HH) = 11 V2 (HT) = V2 (TH) = 0 V2 (T T ) = 0 1 V1 (H) = 4 2 :11 + 1 :0] = 4:40 2 5 41 V1(T) = 5 2 :0 + 1 :0] = 0 2 41 V0 = 5 2 4:40 + 1 0] = 1:76: 2 Dene vk (x) to be the value of the call at time k when Sk = x. Then v2 (x) = (x ; 5)+ v1(x) = 4 1 v2 (2x) + 1 v2 (x=2)] 2 52 v0(x) = 4 1 v1 (2x) + 1 v1 (x=2)]: 2 52
CHAPTER 4. The Markov Property

In particular,
69
Let
v2(16) = 11 v2 (4) = 0 v2 (1) = 0 1 v1(8) = 4 2 :11 + 1 :0] = 4:40 2 5 1 v1(2) = 4 1 :0 + 2 :0] = 0 52 1 v0 = 4 2 4:40 + 1 0] = 1:76: 2 5 k (x) be the number of shares in the hedging portfolio at time k when S k = x. Then
k
(x) = vk+1 (2x) ; vk+1 (x=2) k = 0 1: 2x ; x=2
4.2 Computational Issues

For a model with n periods (coin tosses), equations of the form has 2 n elements. For period k, we must solve 2k
1 ~ Vk (!1 : : : !k ) = 1 + r pVk+1(!1 : : : !k H ) + qVk+1 (!1 : : : !k T )]: ~

For example, a three-month option has 66 trading days. If each day is taken to be one period, then n = 66 and 266 7 1019. There are three possible ways to deal with this problem: 1. Simulation. We have, for example, that
f V0 = (1 + r) n IEVn
;
and so we could compute V0 by simulation. More specically, we could simulate n coin tosses ! = (!1 : : : !n ) under the risk-neutral probability measure. We could store the value of Vn (! ). We could repeat this several times and take the average value of Vn as an f EV approximation to I n . 2. Approximate a many-period model by a continuous-time model. Then we can use calculus and partial differential equations. Well get to that. 3. Look for Markov structure. Example 4.2 has this. In period 2, the option in Example 4.2 has three possible values v 2 (16) v2(4) v2(1), rather than four possible values V 2(HH ) V2(HT ) V2(TH ) If there were 66 periods, then in period 66 there would be 67 possible stock price values (since the nal price depends only on the number of up-ticks of the stock price i.e., heads so far) and hence only 67 possible option values, rather than 2 66 7 1019.
V2(TT ).
70
4.3 Markov Processes

Technical condition always present: We consider only functions on I and subsets of I which are R R R Borel-measurable, i.e., we only consider subsets A of I that are in B and functions g : IR!IR such that g ;1 is a function B!B. Denition 4.1 () Let ( F P) be a probability space. Let fF k gn=0 be a ltration under k fXk gn=0 be a stochastic process on ( F P). This process is said to be Markov if: k The stochastic process fXk g is adapted to the ltration fF k g, and
F.
Let
(The Markov Property). For each k = 0 1 : : : n ; 1, the distribution of Xk+1 conditioned on F k is the same as the distribution of X k+1 conditioned on X k .
4.3.1
Different ways to write the Markov property
(a) (Agreement of distributions). For every A 2 B
= B(IR), we have
4
IP (Xk+1 2 AjF k ) = IE IA(Xk+1)jF k ] = IE IA(Xk+1 )jXk ] = IP Xk+1 2 AjXk ]:

(b) (Agreement of expectations of all functions). For every (Borel-measurable) function h : IR!IR for which IE jh(Xk+1 )j < 1, we have
IE h(Xk+1 )jF k ] = IE h(Xk+1)jXk]:

(c) (Agreement of Laplace transforms.) For every u 2 IR for which IEeuXk+1
< 1, we have
IE euXk+1 F k = IE euXk+1 Xk :
(If we x u and dene h(x) = eux , then the equations in (b) and (c) are the same. However in (b) we have a condition which holds for every function h, and in (c) we assume this condition only for functions h of the form h(x) = e ux . A main result in the theory of Laplace transforms is that if the equation holds for every h of this special form, then it holds for every h, i.e., (c) implies (b).) (d) (Agreement of characteristic functions) For every u 2 IR, we have
where i = 1.)
p;1. (Since jeiuxj = j cos x +sin xj 1 we dont need to assume that IE jeiuxj <
IE eiuXk+1 jF k = IE eiuXk+1 jXk
71
Remark 4.1 In every case of the Markov properties where IE : : : jXk ] appears, we could just as well write g (Xk ) for some function g . For example, form (a) of the Markov property can be restated as: For every A 2 B, we have
IP (Xk+1 2 AjF k ) = g (Xk )

where g is a function that depends on the set A. Conditions (a)-(d) are equivalent. The Markov property as stated in (a)-(d) involves the process at a current time k and one future time k + 1. Conditions (a)-(d) are also equivalent to conditions involving the process at time k and multiple future times. We write these apparently stronger but actually equivalent conditions below. Consequences of the Markov property. Let j be a positive integer. (A) For every Ak+1
IR : : : Ak+j IR,
IP Xk+1 2 Ak+1 : : : Xk+j 2 Ak+j jF k ] = IP Xk+1 2 Ak+1 : : : Xk+j 2 Ak+j jXk ]:

(A) For every A 2 IRj ,
IP (Xk+1 : : : Xk+j ) 2 AjF k ] = IP (Xk+1 : : : Xk+j ) 2 AjXk ]:

(B) For every function h : IRj !IR for which IE jh(Xk+1
: : : Xk+j )j < 1, we have
IE h(Xk+1 : : : Xk+j )jF k ] = IE h(Xk+1 : : : Xk+j )jXk ]:

(C) For every u = (uk+1
: : : uk+j ) 2 IRj for which IE jeuk+1 Xk+1 +:::+uk+j Xk+j j < 1, we have
IE euk+1 Xk+1+:::+uk+j Xk+j jF k ] = IE euk+1 Xk+1 +:::+uk+j Xk+j jXk ]:

(D) For every u = (uk+1
: : : uk+j ) 2 IRj we have
IE ei(uk+1Xk+1 +:::+uk+j Xk+j )jF k ] = IE ei(uk+1Xk+1 +:::+uk+j Xk+j )jXk ]:

Once again, every expression of the form IE (: : : jXk ) can also be written as g (Xk ), where the function g depends on the random variable represented by : : : in this expression. Remark. All these Markov properties have analogues for vector-valued processes.
72 Proof that (b) Consider
=) (A). (with j = 2 in (A)) Assume (b).
Then (a) also holds (take
h = IA ).
IP Xk+1 2 Ak+1 Xk+2 2 Ak+2 jF k ] = IE IAk+1 (Xk+1 )IAk+2 (Xk+2 )jF k ]

= IE IE IAk+1 (Xk+1 )IAk+2 (Xk+2 )jF k+1 ]jF k ] = IE IAk+1 (Xk+1 ):IE IAk+2 (Xk+2)jF k+1 ]jF k ] = IE IAk+1 (Xk+1 ):IE IAk+2 (Xk+2)jXk+1 ]jF k ] = IE IAk+1 (Xk+1 ):g (Xk+1)jF k ] = IE IAk+1 (Xk+1 ):g (Xk+1)jXk ]
(Markov property, form (b).) (Remark 4.1) (Markov property, form (a).) (Taking out what is known) (Tower property) (Denition of conditional probability)
Now take conditional expectation on both sides of the above equation, conditioned on use the tower property on the left, to obtain
(X k ), and
(3.1)
IP Xk+1 2 Ak+1 Xk+2 2 Ak+2jXk ] = IE IAk+1 (Xk+1 ):g (Xk+1)jXk]:

Since both and
IP Xk+1 2 Ak+1 Xk+2 2 Ak+2 jF k ] IP Xk+1 2 Ak+1 Xk+2 2 Ak+2 jXk ]

= 2.
are equal to the RHS of (3.1)), they are equal to each other, and this is property (A) with j
Example 4.3 It is intuitively clear that the stock price process in the binomial model is a Markov process. We will formally prove this later. If we want to estimate the distribution of S k+1 based on the information in F k , the only relevant piece of information is the value of Sk . For example,
e IE Sk+1 jF k ] = (~u + qd)Sk = (1 + r)Sk p ~
(3.2)
is a function of Sk . Note however that form (b) of the Markov property is stronger then (3.2); the Markov property requires that for any function h,
e IE h(Sk+1)jF k ] is a function of Sk . Equation (3.2) is the case of h(x) = x.
Consider a model with 66 periods and a simple European derivative security whose payoff at time 66 is
1 V66 = 3 (S64 + S65 + S66 ):

The value of this security at time 50 is
73
e V50 = (1 + r)50IE (1 + r);66V66 jF 50 ] e = (1 + r);16IE V66jS50]

because the stock price process is Markov. (We are using form (B) of the Markov property here). In other words, the F50-measurable random variable V50 can be written as
V50(!1 : : : !50) = g(S50 (! 1 : : : !50))

for some function g, which we can determine with a bit of work.
4.4 Showing that a process is Markov

Denition 4.2 (Independence) Let ( F P) be a probability space, and let G and H be sub- algebras of F . We say that G and H are independent if for every A 2 G and B 2 H, we have
IP (A \ B) = IP (A)IP (B):
We say that a random variable X is independent of a -algebra G if by X , is independent of G .
(X ), the
-algebra generated
Example 4.4 Consider the two-period binomial model. Recall that F 1 is the -algebra of sets determined by the rst toss, i.e., F 1 contains the four sets
4 4 AH = fHH HT g AT = fTH TT g
fHH T H g fHT T T g
Let H be the -algebra of sets determined by the second toss, i.e., H contains the four sets
: Then F 1 and H are independent. For example, if we take A = fHH HT g from F1 and B = fHH T H g from H, then IP (A \ B) = IP(HH) = p2 and IP (A)IP(B) = (p2 + pq)(p2 + pq) = p2 (p + q)2 = p2:
Note that F 1 and S2 are not independent (unless p = 1 or p = 0). For example, one of the sets in (S2 ) is f! S2 (!) = u2S0 g = fHH g. If we take A = fHH HT g from F 1 and B = fHH g from (S2 ), then IP(A \ B) = IP(HH) = p2 , but
IP(A)IP(B) = (p2 + pq)p2 = p3(p + q) = p3 :
The following lemma will be very useful in showing that a process is Markov: Lemma 4.15 (Independence Lemma) Let X and Y be random variables on a probability space ( F P). Let G be a sub- -algebra of F . Assume
74
X is independent of G ; Y is G -measurable.
Let f (x
y ) be a function of two variables, and dene g (y) = IEf (X y):

4
Then
IE f (X Y )jG] = g (Y ):
Remark. In this lemma and the following discussion, capital letters denote random variables and lower case letters denote nonrandom variables.
Example 4.5 (Showing the stock price process is Markov) Consider an n-period binomial model. Fix a
4 k 4 = SS+1 and G = F k . Then X = u if !k+1 = H and X = d if !k+1 = T . Since X k 4 depends only on the (k + 1)st toss, X is independent of G . Dene Y = Sk , so that Y is G -measurable. Let h 4 be any function and set f(x y) = h(xy). Then 4 g(y) = IEf(X y) = IEh(Xy) = ph(uy) + qh(dy):
time k and dene X The Independence Lemma asserts that
IE h(Sk+1 )jF k ] = = = =
k IE h SS+1 :Sk jF k ] k IE f(X Y )jG ] g(Y ) ph(uSk ) + qh(dSk ):
This shows the stock price is Markov. Indeed, if we condition both sides of the above equation on (S k ) and use the tower property on the left and the fact that the right hand side is (S k )-measurable, we obtain
IE h(Sk+1 )jSk ] = ph(uSk ) + qh(dSk ): Thus IE h(Sk+1 )jF k ] and IE h(Sk+1 )jXk ] are equal and form (b) of the Markov property is proved. IE h(Sk+1 )jF k ] as a function of Sk . This is a special case of Remark 4.1.
Not only have we shown that the stock price process is Markov, but we have also obtained a formula for
4.5 Application to Exotic Options

Consider an n-period binomial model. Dene the running maximum of the stock price to be
Consider a simple European derivative security with payoff at time n of v n (Sn Examples:
Mk = 1maxk Sj : j
4
Mn ) .
75
vn(Sn Mn ) = (Mn ; K )+ (Lookback option); vn(Sn Mn ) = IMn B (Sn ; K )+ (Knock-in Barrier option).
Lemma 5.16 The two-dimensional process f(S k Mk )gn=0 is Markov. (Here we are working under k the risk-neutral measure I , although that does not matter). P Proof: Fix k. We have
Mk+1 = Mk _ Sk+1
4
where _ indicates the maximum of two quantities. Let Z
f IP (Z = u) = p IP (Z = d) = q ~ f ~
and Z is independent of F k . Let h(x
k = SS+1 , so k
y ) be a function of two variables. We have
h(Sk+1 Mk+1 ) = h(Sk+1 Mk _ Sk+1 ) = h(ZSk Mk _ (ZSk )):

Dene
f g (x y) = IEh(Zx y _ (Zx)) = ph(ux y _ (ux)) + q h(dx y _ (dx)): ~ ~

4
The Independence Lemma implies
f IE h(Sk+1 Mk+1 )jF k ] = g(Sk Mk ) = ph(uSk Mk _ (uSk )) + qh(dSk Mk ) ~ ~

the second equality being a consequence of the fact that M k ^ dSk = Mk . Since the RHS is a function of (Sk Mk ), we have proved the Markov property (form (b)) for this two-dimensional process. Continuing with the exotic option of the previous Lemma... Let V k denote the value of the derivative f security at time k. Since (1 + r);k Vk is a martingale under I , we have P
1 f Vk = 1 + r IE Vk+1jF k ] k = 0 1 : : : n ; 1:
At the nal time, we have
Vn = vn (Sn Mn):
Stepping back one step, we can compute
Vn
1 IE v (S M )jF ] f 1+r n n n n 1 1 ~ = 1 + r pvn (uSn 1 uSn 1 _ Mn 1 ) + q vn (dSn ~

; ; ; ;
Mn 1)] :
;
76 This leads us to dene
1 ~ vn 1 (x y) = 1 + r pvn (ux ux _ y ) + qvn(dx y)] ~

4 ;
so that The general algorithm is
Vn 1 = vn 1 (Sn
; ;
Mn 1):
;
1 ~ ~ vk (x y) = 1 + r pvk+1 (ux ux _ y ) + q vk+1(dx y )

and the value of the option at time k is v k (Sk Mk ). Since this is a simple European option, the hedging portfolio is given by the usual formula, which in this case is
_ Mk = vk+1 (uSk (uSk ) u ; d))S; vk+1 (dSk Mk ) (

k
Chapter 5
Stopping Times and American Options

5.1 American Pricing
Let us rst review the European pricing formula in a Markov model. Consider the Binomial model with n periods. Let Vn = g (Sn ) be the payoff of a derivative security. Dene by backward recursion:
vn (x) = g (x) 1 ~ vk (x) = 1 + r pvk+1 (ux) + qvk+1(dx)]: ~ Then vk (Sk ) is the value of the option at time k, and the hedging portfolio is given by vk+1 (uSk ) ; vk+1 (dSk ) k = 0 1 2 : : : n ; 1: k= (u ; d)Sk
Now consider an American option. Again a function g is specied. In any period k, the holder of the derivative security can exercise and receive payment g (Sk ). Thus, the hedging portfolio should create a wealth process which satises
Xk g (Sk) 8k
almost surely.
This is because the value of the derivative security at time k is at least g (S k ), and the wealth process value at that time must equal the value of the derivative security. American algorithm.
vn(x) = g (x) 1 p vk (x) = max 1 + r (~vk+1 (ux) + q vk+1 (dx)) g(x) ~ Then vk (Sk ) is the value of the option at time k.
77
78
S2 (HH) = 16 v2 (16) = 0
S (H) = 8 1 S2 (HT) = 4 S =4 0 S1 (T) = 2 S2 (TH) = 4
v (4) = 1 2
S2 (TT) = 1
v (1) = 4 2
Figure 5.1: Stock price and nal value of an American put option with strike price 5.
Example 5.1 See Fig. 5.1. S0 Then
1 ~ ~ 2 = 4 u = 2 d = 1 r = 4 p = q = 1 n = 2. Set v2 (x) = g(x) = (5 ; x)+ . 2
v1 (8) = max 4 1 :0 + 1 :1 (5 ; 8)+ 2 5 2 = max 2 0 5 = 0:40 v1 (2) = max 4 1 :1 + 1 :4 (5 ; 2)+ 2 5 2 = maxf2 3g = 3:00 v0 (4) = max 4 1 :(0:4) + 1 :(3:0) (5 ; 4)+ 2 5 2 = maxf1:36 1g = 1:36
Let us now construct the hedging portfolio for this option. Begin with initial wealth X 0 0 as follows:
= 1:36.
Compute
0:40 = = = = 3:00 = = = =
v1 (S1 (H)) S1 (H) 0 + (1 + r)(X0 ; 0S0 ) 8 0 + 5 (1:36 ; 4 0) 4 3 0 + 1:70 =) 0 = ;0:43 v1 (S1 (T)) S1 (T) 0 + (1 + r)(X0 ; 0S0 ) 2 0 + 5 (1:36 ; 4 0) 4 ;3 0 + 1:70 =) 0 = ;0:43
CHAPTER 5. Stopping Times and American Options

Using
0
79
= ;0:43 results in X1 (H) = v1(S1 (H)) = 0:40 X1 (T ) = v1 (S1 (T)) = 3:00

(Recall that S1 (T) = 2):
Now let us compute
1 = = = = 4 = = = =
We get different answers for
1
v2 (4) S2 (TH) 1(T) + (1 + r)(X1 (T ) ; 1(T)S1 (T)) 5 4 1(T ) + 4 (3 ; 2 1(T)) 1:5 1(T ) + 3:75 =) 1(T) = ;1:83 v2 (1) S2 (TT ) 1(T ) + (1 + r)(X1 (T) ; 1 (T )S1 (T )) 5 1(T) + (3 ; 2 1(T)) 4 ;1:5 1(T ) + 3:75 =) 1(T ) = ;0:16 (T)! If we had X1 (T ) = 2, the value of the European put, we would have 1 = 1:5 1(T ) + 2:5 =) 4 = ;1:5 1(T) + 2:5 =)
1
(T) = ;1
1
(T) = ;1
5.2 Value of Portfolio Hedging an American Option
Xk+1 =
=
k Sk+1 + (1 + r)(Xk ; Ck ; k Sk ) (1 + r)Xk + k (Sk+1 ; (1 + r)Sk ) ; (1 + r)Ck
Here, Ck is the amount consumed at time k. The discounted value of the portfolio is a supermartingale. The value satises Xk
g(Sk) k = 0 1 : : : n.
The value process is the smallest process with these properties. When do you consume? If
f IE ((1 + r)
or, equivalently,
(k+1) vk+1 (Sk+1 )jF k ] < (1 + r) k vk (Sk )

;
f 1 IE( 1 + r vk+1(Sk+1)jF k ] < vk (Sk )
80 and the holder of the American option does not exercise, then the seller of the option can consume to close the gap. By doing this, he can ensure that X k = vk (Sk ) for all k, where vk is the value dened by the American algorithm in Section 5.1. In the previous example, v1 (S1(T )) = 3 v2(S2(TH )) = 1 and v2(S2 (TT )) = 4. Therefore,
1 f 1 IE 1 + r v2(S2 )jF 1 ](T ) = 4 2 :1 + 1 :4 2 5
= = v1 (S1(T )) =
4 5 5 2 2 3
so there is a gap of size 1. If the owner of the option does not exercise it at time one in the state ! 1 = T , then the seller can consume 1 at time 1. Thereafter, he uses the usual hedging portfolio
k vk = vk+1 (uSu ) ; )S+1 (dSk ) ( ;d k
In the example, we have v1 (S1(T )) = g (S1(T )). It is optimal for the owner of the American option to exercise whenever its value vk (Sk ) agrees with its intrinsic value g (S k ). Denition 5.1 (Stopping Time) Let ( F tion. A stopping time is a random variable
f! 2
P) be a probability space and let fF k gn=0 be a ltrak : !f0 1 2 : : : ng f1g with the property that: (!) = kg 2 F k 8k = 0 1 : : : n 1:
q= ~
1. 2
Example 5.2 Consider the binomial model with n = 2 S 0 = 4 u = 2 d = 1 r = 1 , so p = ~ 2 4 v0 v1 v2 be the value functions dened for the American put with strike price 5. Dene
Let
(!) = minfk vk(Sk ) = (5 ; Sk )+ g:

The stopping time corresponds to stopping the rst time the value of the option agrees with its intrinsic value. It is an optimal exercise time. We note that
(!) =
We verify that is indeed a stopping time:
1 2
if ! if !
2 AT 2 AH
f! (!) = 0g = 2 F 0 f! (!) = 1g = AT 2 F 1 f! (!) = 2g = AH 2 F 2
Example 5.3 (A random time which is not a stopping time) In the same binomial model as in the previous example, dene
(!) = minfk Sk (!) = m2 (!)g

where m2 = min0 j 2 Sj . In other words, random variable is given by
81
80 < (!) = : 1 2
stops when the stock price reaches its minimum value. This if ! if ! if !
2 AH
= TH = TT
We verify that is not a stopping time:
f! (!) = 0g = AH 62 F 0 f! (!) = 1g = fTH g 62 F 1 f! (!) = 2g = fTT g 2 F 2
5.3 Information up to a Stopping Time

Denition 5.2 Let that be a stopping time. We say that a set A is determined by time provided
A \ f! (!) = kg 2 F k 8k:
The collection of sets determined by
is a -algebra, which we denote by F .
Example 5.4 In the binomial model considered earlier, let
= minfk vk (Sk ) = (5 ; Sk )+ g
i.e.,
(!) =
The set fHT g is determined by time , but the set fTH g is not. Indeed,
1 2
if ! if !
2 AT 2 AH
fHT g \ f! (!) = 0g = 2 F 0 fHT g \ f! (!) = 1g = 2 F 1 fHT g \ f! (!) = 2g = fHT g 2 F 2

but The atoms of F are
fT H g \ f! (!) = 1g = fTH g 62 F 1 : fHT g fHH g AT = fT H T T g:
Notation 5.1 (Value of Stochastic Process at a Stopping Time) If ( F P) is a probability space, fF k gn=0 is a ltration under F , fXk gn=0 is a stochastic process adapted to this ltration, and is k k a stopping time with respect to the same ltration, then X is an F -measurable random variable whose value at ! is given by
X (!) = X (!) (!):

4
82 Theorem 3.17 (Optional Sampling) Suppose that fYk F k g1 (or fYk F k gn=0 ) is a submartink=0 k gale. Let and be bounded stopping times, i.e., there is a nonrandom number n such that
n
If almost surely, then
almost surely.
Y IE (Y jF ): Taking expectations, we obtain IEY IEY , and in particular, Y 0 = IEY0 IEY is a supermartingale, then implies Y IE (Y jF ). implies Y = IE (Y jF ). If fYk F k gk=0 is a martingale, then
1
. If fYk
F k gk=0
1
Example 5.5 In the example 5.4 considered earlier, we dene (!) = 2 for all ! 2 . Under the risk-neutral probability measure, the discounted stock price process ( 5 );k Sk is a martingale. We compute 4
e IE
The atoms of F are fHH g
"
4 5
S2 F :
fHT g
e IE e IE
and for !
"
and AT . Therefore,
"
4 2 S F (HH) = 5 2 # 4 2 S F (HT) = 5 2
4 5 4 5
S2 (HH) S2 (HT )
2
2 AT ,
e IE
"
4 5
S2 F (!) =
4 2 S (T H) + 1 4 2 5 5 2 = 1 2:56 + 1 0:64 2 2 = 1:60

1 2
S2 (T T)
In every case we have gotten (see Fig. 5.2)
e IE
"
4 5
S2 F (!) = 4 5
( )
S (! ) (!):
83
(16/25) S (HH) = 10.24 2
(4/5) S (H) = 6.40 1 (16/25) S2 (HT) = 2.56 S =4 0 (16/25) S2 (TH) = 2.56 (4/5) S (T) = 1.60 1
(16/25) S2 (TT) = 0.64
Figure 5.2: Illustrating the optional sampling theorem.
84
Chapter 6
Properties of American Derivative Securities

6.1 The properties
Denition 6.1 An American derivative security is a sequence of non-negative random variables fGk gn=0 such that each Gk is F k -measurable. The owner of an American derivative security can k exercise at any time k, and if he does, he receives the payment Gk . (a) The value Vk of the security at time k is
f Vk = max (1 + r)k IE (1 + r) G jF k ]
;
where the maximum is over all stopping times
satisfying
k almost surely.
(b) The discounted value process f(1 + r);k Vk gn=0 is the smallest supermartingale which satises k
Vk Gk 8k
almost surely.
(c) Any stopping time
which satises
f V0 = IE (1 + r) G ]
;
is an optimal exercise time. In particular
= minfk Vk = Gk g
4
is an optimal exercise time. (d) The hedging portfolio is given by
k (!1
V ( : : : !k ) = Sk+1(!1 :: :: :: !k H ) ; Vk+1(!1 :: :: :: !k T ) k = 0 1 : : : n ; 1: !k H ) ; Sk+1 (!1 !k T ) k+1 !1

85
86 (e) Suppose for some k and ! , we have Vk (!) = Gk (!). Then the owner of the derivative security should exercise it. If he does not, then the seller of the security can immediately consume
1 f Vk (!) ; 1 + r IE Vk+1jF k ](!)

and still maintain the hedge.
6.2 Proofs of the Properties

Let fGk gn=0 be a sequence of non-negative random variables such that each Gk is F k -measurable. k Dene Tk to be the set of all stopping times satisfying k n almost surely. Dene also
f Vk = (1 + r)k max IE (1 + r) G jF k ] : T
4 ; 2
Lemma 2.18 Proof: Take
Vk Gk for every k.
2 Tk to be the constant k.
attain the maximum in the denition of V k+1 , i.e.,
Lemma 2.19 The process f(1 + r);k Vk gn=0 is a supermartingale. k Proof: Let
fh (1 + r) (k+1) Vk+1 = IE (1 + r)
; ;
G jF k+1 :
;
Because
is also in T k , we have
f IE (1 + r)
(k+1) Vk+1jF k ]
f f IE IE (1 + r) G jF k+1]jF k f IE (1 + r) G jF k ] f max IE (1 + r) G jF k ] Tk = (1 + r) k Vk :
= =
; ; 2 ;
Lemma 2.20 If fYk gn=0 is another process satisfying k
Yk Gk k = 0 1 : : : n
and f(1 + r);k Yk gn=0 is a supermartingale, then k
a.s.,
Yk Vk k = 0 1 : : : n
a.s.
CHAPTER 6. Properties of American Derivative Securities

Proof: The optional sampling theorem for the supermartingale f(1 + r) ;k Yk gn=0 implies k
87
f IE (1 + r) Y jF k ] (1 + r) k Yk 8 2 Tk :
; ;
Therefore,
f Vk = (1 + r)k max IE (1 + r) G jF k ] T
;
= Yk :
Lemma 2.21 Dene
f (1 + r)k max IE (1 + r) Y jF k ] Tk (1 + r) k (1 + r)k Yk

; 2 ;
1 f Ck = Vk ; 1 + r IE Vk+1 jF k ] n f = (1 + r)k (1 + r) k Vk ; IE (1 + r)
;
(k+1) Vk+1jF k ]
Since f(1 + r);k Vk gn=0 is a supermartingale, C k must be non-negative almost surely. Dene k
k (!1
Set X0
= V0 and dene recursively Xk+1 = k Sk+1 + (1 + r)(Xk ; Ck ; k Sk ):
V : : : !k ) = Sk+1(!1 :: :: :: !k H ) ; Vk+1 (!1 :: :: :: !k T ) : !k H ) ; Sk+1 (!1 !k T ) k+1 (!1
Then
Xk = Vk 8k:
for some
We proceed by induction on k. The induction hypothesis is that X k = Vk k 2 f0 1 : : : n ; 1g, i.e., for each xed (!1 : : : !k) we have Xk (!1 : : : !k ) = Vk (!1 : : : !k ): Proof: We need to show that
Xk+1 (!1 : : : !k H ) = Vk+1(!1 : : : !k H ) Xk+1(!1 : : : !k T ) = Vk+1(!1 : : : !k T ):

We prove the rst equality; the proof of the second is similar. Note rst that
Vk (!1 : : : !k ) ; Ck (!1 : : : !k ) f = 1 IE Vk+1 jF k ](!1 : : : !k ) 1+r
1 p = 1 + r (~Vk+1 (!1 : : : !k H ) + q Vk+1 (!1 : : : !k T )) : ~
88 Since (!1 : : : !k ) will be xed for the rest of the proof, we will suppress these symbols. For example, the last equation can be written simply as
1 p Vk ; Ck = 1 + r (~Vk+1 (H ) + qVk+1 (T )) : ~
We compute
Xk+1 (H ) =
= = = =
k Sk+1 (H ) + (1 + r)(Xk ; Ck ; k Sk ) Vk+1 (H ) ; Vk+1(T ) (S (H ) ; (1 + r)S ) k Sk+1 (H ) ; Sk+1 (T ) k+1 +(1 + r)(Vk ; Ck ) Vk+1 (H ) ; Vk+1(T ) (uS ; (1 + r)S ) k k (u ; d)Sk +~Vk+1 (H ) + q Vk+1 (T ) p ~ (Vk+1 (H ) ; Vk+1 (T ))~ + pVk+1(H ) + q Vk+1 (T ) q ~ ~ Vk+1 (H ):
6.3 Compound European Derivative Securities

In order to derive the optimal stopping time for an American derivative security, it will be useful to study compound European derivative securities, which are also interesting in their own right. A compound European derivative security consists of n + 1 different simple European derivative securities (with the same underlying stock) expiring at times 0 1 : : : n; the security that expires at time j has payoff Cj . Thus a compound European derivative security is specied by the process fCj gn=0 , where each Cj is F j -measurable, i.e., the process fCj gn=0 is adapted to the ltration j j fF k gn=0. k
Hedging a short position (one payment). Here is how we can hedge a short position in the j th European derivative security. The value of European derivative security j at time k is given by
f Vk(j) = (1 + r)k IE (1 + r) j Cj jF k ] k = 0 : : : j
;
and the hedging portfolio for that security is given by
(j ) (! k 1
j) j) Vk(+1 (!1 : : : !k H ) ; Vk(+1 (!1 : : : !k T ) : : : !k ) = (j) k = 0 : : : j ; 1: () Sk+1 (!1 : : : !k H ) ; Skj+1 (!1 : : : !k T )

(j ) (j )
Thus, starting with wealth V 0 , and using the portfolio ( 0 time j we have wealth Cj .
:::
(j ) ), we can ensure that at j 1

;
Hedging a short position (all payments). Superpose the hedges for the individual payments. In P (j) other words, start with wealth V 0 = n=0 V0 . At each time k 2 f0 1 : : : n ; 1g, rst make the j payment Ck and then use the portfolio
= k (k+1) + k (k+2) + : : : + k (n)
CHAPTER 6. Properties of American Derivative Securities
89
corresponding to all future payments. At the nal time n, after making the nal payment Cn , we will have exactly zero wealth. Suppose you own a compound European derivative securityfC j gn=0 . Compute j
V0 =
n X j =0
2n 3 X f V0(j ) = IE 4 (1 + r) j Cj 5
;
j =0
;1 and the hedging portfolio is f k gn=0 . You can borrow V0 and consume it immediately. This leaves k you with wealth X 0 = ;V0 . In each period k, receive the payment Ck and then use the portfolio ; k . At the nal time n, after receiving the last payment Cn, your wealth will reach zero, i.e., you will no longer have a debt.
6.4 Optimal Exercise of American Derivative Security

In this section we derive the optimal exercise time for the owner of an American derivative security. Let fGk gn=0 be an American derivative security. Let be the stopping time the owner plans to k use. (We assume that each Gk is non-negative, so we may assume without loss of generality that the owner stops at expiration time n if not before). Using the stopping time , in period j the owner will receive the payment
Cj = I
=j
Gj :
In other words, once he chooses a stopping time, the owner has effectively converted the American derivative security into a compound European derivative security, whose value is
V0( )
2n 3 X f = IE 4 (1 + r) j Cj 5 2j=0 3 n X f = IE 4 (1 + r) j I =j Gj 5
; ;
f = IE (1 + r) G ]:
;
j =0
The owner of the American derivative security can borrow this amount of money immediately, if he chooses, and invest in the market so as to exaclty pay off his debt as the payments fC j gn=0 are j received. Thus, his optimal behavior is to use a stopping time Lemma 4.22 which maximizes V 0 .
( )
V0( ) is maximized by the stopping time

= minfk Vk = Gk g:
Proof: Recall the denition
f V0 = max IE (1 + r) G ] = max V0( T T

4 ; 2
90
f Let 0 be a stopping time which maximizes V0 , i.e., V0 = I (1 + r); G 0 : Because f(1 + r);k Vk gn=0 E k is a supermartingale, we have from the optional sampling theorem and the inequality V k Gk , the following:
( )
0
V0
f IE (1 + r) 0 V 0 jF 0 i fh = IE (1 + r) 0 V 0 i fh IE (1 + r) 0 G 0 = V0:
; ; ; ;
Therefore, and
f f V0 = IE (1 + r) 0 V 0 = IE (1 + r) 0 G 0
;
V 0=G
0
0 a.s.
We have just shown that if
attains the maximum in the formula
f V0 = max IE (1 + r) G ] T
; 2
(4.1)
then But we have dened and so we must have

0
V 0=G
0 a.s.
n almost surely. The optional sampling theorem implies

;
= minfk Vk = Gk g
;
(1 + r)
= (1 + r) V i fh IE (1 + r) 0 V 0 jF i fh = IE (1 + r) 0 G 0 jF :
; ;
Taking expectations on both sides, we obtain
f IE (1 + r)
i fh i IE (1 + r) 0 G 0 = V0 :
;
It follows that also attains the maximum in (4.1), and is therefore an optimal exercise time for the American derivative security.
Chapter 7
Jensens Inequality
7.1 Jensens Inequality for Conditional Expectations
Lemma 1.23 If ' : IR!IR is convex and IE j'(X )j < 1, then
IE '(X )jG] '(IE X jG]):

For instance, if G
=f
g '(x) = x2:
IEX 2 (IEX )2:
Proof: Since ' is convex we can express it as follows (See Fig. 7.1):
'(x) = max h(x): h '

h is linear
Now let h(x) = ax + b lie below '. Then,
IE '(X )jG]
IE aX + bjG] = aIE X jG] + b = h(IE X jG])

max h(IE X jG]) h '
This implies
IE '(X )jG]
= '(IE X jG]):
h is linear
91
92
Figure 7.1: Expressing a convex function as a max over linear functions. Theorem 1.24 If fYk gn=0 is a martingale and k Proof: is convex then f'(Y k )gn=0 is a submartingale. k
IE '(Yk+1)jF k ]
'(IE Yk+1jF k ]) = '(Yk ):
7.2 Optimal Exercise of an American Call

This follows from Jensens inequality. Corollary 2.25 Given a convex function g : 0 1)!IR where g (0) = 0. For instance, g (x) = (x ; K )+ is the payoff function for an American call. Assume that r 0. Consider the American derivative security with payoff g (S k ) in period k. The value of this security is the same as the value of the simple European derivative security with nal payoff g (S n), i.e., f f IE (1 + r) n g (Sn)] = max IE (1 + r) g (S )]
; ;
where the LHS is the European value and the RHS is the American value. In particular optimal exercise time. Proof: Because g is convex, for all
= n is an
2 0 1] we have (see Fig. 7.2): g( x) = g ( x + (1 ; ):0) g (x) + (1 ; ):g (0)

=
g (x):
CHAPTER 7. Jensens Inequality
93
( x, g(x))
(x,g(x))
x ( x, g( x))
Figure 7.2: Proof of Cor. 2.25 Therefore,
1 g 1 + r Sk+1
1 g (S ) 1 + r k+1
;
and
optional sampling theorem implies
1 g (S )jF 1 + r k+1 k 1 f (1 + r) k IE g 1 + r Sk+1 jF k f 1 (1 + r) k g IE 1 + r Sk+1 jF k = (1 + r) k g (Sk ) So f(1 + r) k g (Sk )gn=0 is a submartingale. Let be a stopping time satisfying 0 k
f IE (1 + r)
(k+1) g (Sk+1)jF k
f = (1 + r) k IE
; ;
n. The
f (1 + r) g (S ) IE (1 + r) n g (Sn)jF ] :
; ; ;
Taking expectations, we obtain
f IE (1 + r) g(S )]
f f IE IE (1 + r) n g(Sn)jF ] f = IE (1 + r) n g (Sn)] :
; ; ;
Therefore, the value of the American derivative security is

;
f f max IE (1 + r) g (S )] IE (1 + r) n g (Sn )] f f max IE (1 + r) g (S )] = IE (1 + r) n g (Sn )] :

; ;
and this last expression is the value of the European derivative security. Of course, the LHS cannot be strictly less than the RHS above, since stopping at time n is always allowed, and we conclude that
94
S2 (HH) = 16
S (H) = 8 1 S2 (HT) = 4 S =4 0 S1 (T) = 2 S2 (TH) = 4
S2 (TT) = 1
Figure 7.3: A three period binomial model.
7.3 Stopped Martingales

Let fYk gn=0 be a stochastic process and let k stopped process
^
be a stopping time. We denote by
fY k gn=0 the k
^
Yk (!) (!) k = 0 1 : : : n:
(!) = 1 2
if if
Example 7.1 (Stopped Process) Figure 7.3 shows our familiar 3-period binomial example. Dene
Then
8 S (HH) = 16 > 2 < ) S2^ (! ) (!) = > S2 (HT= = 4 S1 (T ) 2 : S1(T ) = 2
!1 = T !1 = H:
if if if if
! = HH ! = HT ! = TH ! = T T:
Theorem 3.26 A stopped martingale (or submartingale, or supermartingale) is still a martingale (or submartingale, or supermartingale respectively). Proof: Let fYk gn=0 be a martingale, and be a stopping time. Choose some k 2 f0 1 k The set f kg is in F k , so the set f k + 1g = f kgc is also in F k . We compute
: : : ng.
IE Y(k+1) jF k = IE I k Y + I k+1 Yk+1 jF k = I k Y + I k+1 IE Yk+1 jF k ] = I k Y + I k+1 Yk = Yk :

^ f g f g f f g g f f g g ^
CHAPTER 7. Jensens Inequality
95
96
Chapter 8
Random Walks
8.1 First Passage Time
Toss a coin innitely many times. Then the sample space is the set of all innite sequences ! = (!1 !2 : : : ) of H and T . Assume the tosses are independent, and on each toss, the probability of H is 1 , as is the probability of T . Dene 2
Yj (!) = Mk =
;1
k X
if if
!j = H !j = T
M0 = 0
The process fMk g1 is a symmetric random walk (see Fig. 8.1) Its analogue in continuous time is k=0 Brownian motion. Dene
j =1
Yj k = 1 2 : : :
= minfk 0 Mk = 1g: If Mk never gets to 1 (e.g., ! = (TTTT : : : )), then = 1. The random variable
8.2 is almost surely nite
is called the rst passage time to 1. It is the rst time the number of heads exceeds by one the number of tails.
It is shown in a Homework Problem that fM k g1 and fNk g1 where k=0 k=0
Nk = exp Mk ; k log e + e 2
= e Mk 2 e +e
97
!)
98
Mk k
Figure 8.1: The random walk process Mk
e e + 2 1
Figure 8.2: Illustrating two functions of
2 e e + 1
are martingales. (Take Mk = ;Sk in part (i) of the Homework Problem and take (v).) Since N0 = 1 and a stopped martingale is a martingale, we have
=;
in part
1 = IENk = IE e Mk
^
"
2 e +e
(2.1)
for every xed 2 IR (See Fig. 8.2 for an illustration of the various functions involved). We want to let k!1 in (2.1), but we have to worry a bit that for some sequences ! 2 , (! ) = 1. We consider xed As k!1,
> 0, so
e +e
2 e +e
k
; ^
< 1:
if if
(
0
2 e +e;
<1
Furthermore, Mk^
1, because we stop this martingale when it reaches 1, so 0 e Mk

^
=1
CHAPTER 8. Random Walks

and
99
0 e Mk lim e Mk k!
1
2 e +e
k
^
e:
if if
In addition,
^
2 e +e
2 e +e;
<1 = 1:
Recall Equation (2.1):
IE e Mk^
"
Letting k!1, and using the Bounded Convergence Theorem, we obtain
2 e +e
=1 = 1:
2 IE e e + e
For all
<
1g
(2.2)
2 (0 1], we have
so we can let
#0 in (2.2), using the Bounded Convergence Theorem again, to conclude h i IE If < 1g = 1

IP f < 1g = 1:
2 0 e e +e
<
1g
i.e.,
We know there are paths of the symmetric random walk fMk g1 which never reach level 1. We k=0 have just shown that these paths collectively have no probability. (In our innite sample space , each path individually has zero probability). We therefore do not need the indicator I f < 1g in (2.2), and we rewrite that equation as
IE
e +e
=e :
;
(2.3)
8.3 The moment generating function for

Let
2 (0 1) be given. We want to nd > 0 so that

2 = e +e
;
Solution:
e + e ;2=0 (e )2 ; 2e + = 0
; ; ;
100
e =1
;
1;
:
2
We want
> 0, so we must have e < 1. Now 0 < < 1, so

;
0 < (1 ; )2 < (1 ; ) < 1 ; 1; < 1; 2 p 1; 1; 2 < p 1; 1; 2 < 1

We take the negative square root:
e = 1; 1;
;
: > 0:
2
Recall Equation (2.3):
IE
With
2 (0 1) and > 0 related by

e
e +e
;
=e
= 1; 1; 2 = e +e
;
this becomes
IE
= 1; 1;
0 < < 1:
(3.1)
We have computed the moment generating function for the rst passage time to 1.
8.4 Expectation of
Recall that
IE
so
= 1; 1; = IE (
0< <1
;
d IE d
= 1 ;p 1 ; 2 1;
= dd 1 ; 1 ;
1)
2 2 :

Using the Monotone Convergence Theorem, we can let
101
IE (
to obtain
1) =
1 ;p 1 ; 2 1;
"1 in the equation

2 2
IE = 1:
= minfk Mk = 1g
4
Thus in summary:
IP f < 1g = 1 IE = 1:
8.5 The Strong Markov Property
The random walk process fMk g1 is a Markov process, i.e., k=0
IE
random variable depending only on
= IE
same random variable
Mk+1 Mk+2 : : : j F k ] jMk ] :
In discrete time, this Markov property implies the Strong Markov property:
IE
random variable depending only on
= IE
same random variable
M +1 M +2 : : : j F ] j M ]:
for any almost surely nite stopping time .
8.6 General First Passage Times

Dene
Then 2 ; 1 is the number of periods between the rst arrival at level 1 and the rst arrival at level 2. The distribution of 2 ; 1 is the same as the distribution of 1 (see Fig. 8.3), i.e.,
m = minfk
4
0 Mk = mg m = 1 2 : : :
IE
2; 1
= 1; 1;
2 (0 1):
102
Mk k 1 2 1
Figure 8.3: General rst passage times. For
2 (0 1),
IE
jF 1 ] = IE
=
1
2; 1 2; 1
IE IE
jF 1 jF 1 ]
= = =
Take expectations of both sides to get
(taking out what is known)

1 2; 1
jM 1 ]
IE 2 1 ] (M 1 = 1 not random ) ! p
1
;
(strong Markov property)
1; 1;
IE
= IE 1 : 1 ; 1 ; = 1; 1;
! 2 2
In general,
IE
= 1; 1;
! 2 m
2 (0 1):
8.7 Example: Perpetual American Put

1 Consider the binomial model, with u = 2 d = 2 1 , q = 1 , and thus ~ neutral probabilities are p = 2 ~ 2
r=
1 , and payoff function (5 ; S )+ . The risk k 4
Sk = S0 uMk
103 Suppose
S0 = 4. Here are some possible exercise rules:

Rule 0: Stop immediately. 0
where
f Mk is a symmetric random walk under the risk-neutral measure, denoted by IP .

= 0 V ( 0 ) = 1.
1
Rule 1: Stop as soon as stock price falls to 2, i.e., at time

;
= minfk Mk = ;1g:
4
Rule 2: Stop as soon as stock price falls to 1, i.e., at time

;
= minfk Mk = ;2g:
4
f f Because the random walk is symmetric under I , ;m has the same distribution under I as the P P stopping time m in the previous section. This observation leads to the following computations of value. Value of Rule 1: f V ( ;1) = IE (1 + r) ;1 (5 ; S ;1 )+ h4 i = (5 ; 2)+IE ( 5 ) ;1
;
= 3: 3 = 2:
Value of Rule 2:
1 ; 1 ; ( 4 )2 5
4 5
f 5 V ( ;2 ) = (5 ; 1)+IE ( 4 ) ;2 1 = 4:( 2 )2 = 1:
Suppose instead we start with S 0 = 8, and stop the rst time the price falls to 2. This requires 2 down steps, so the value of this rule with this initial stock price is
This suggests that the optimal rule is Rule 1, i.e., stop (exercise the put) as soon as the stock price falls to 2, and the value of the put is 3 if S0 = 4. 2
In general, if S0 = 2j for some j 1, and we stop when the stock price falls to 2, then j ; 1 down steps will be required and the value of the option is
fh 4 i (5 ; 2)+ IE ( 5 ) ;2 = 3:( 1 )2 = 3 : 2 4 h i
;
f 4 (5 ; 2)+ IE ( 5 ) ;(j;1) = 3:( 1 )j 1 : 2

We dene
v(2j ) = 3:( 1 )j 2
4
j = 1 2 3 :::
104
1, then the initial price is at or below 2. In this case, we exercise If S0 = 2j for some j immediately, and the value of the put is
v (2j ) = 5 ; 2j j = 1 0 ;1 ;2 : : :
4
Proposed exercise rule: Exercise the put whenever the stock price is at or below 2. The value of this rule is given by v (2 j ) as we just dened it. Since the put is perpetual, the initial time is no different from any other time. This leads us to make the following: Conjecture 1 The value of the perpetual put at time k is v (S k ). How do we recognize the value of an American derivative security when we see it? There are three parts to the proof of the conjecture. We must show: (a)
v (Sk ) (5 ; Sk )+ 8k o n (b) ( 4 )k v (Sk ) is a supermartingale, 5 k=0 (c) fv (Sk )gk=0 is the smallest process with properties (a) and (b).
1 1
Note: To simplify matters, we shall only consider initial stock prices of the form S 0 always of the form 2j , with a possibly different j . Proof: (a). Just check that
= 2j , so Sk is
v(2j ) = 3:( 1 )j 2
4 4
(5 ; 2j )+
for for
j 1 j 1:
v(2j ) = 5 ; 2j (5 ; 2j )+
This is straightforward. Proof: (b). We must show that
v(Sk )
By assumption, Sk
= 2j for some j . We must show that 2 v(2j ) 5 v (2j +1) + 2 v (2j 1): 5 If j 2, then v (2j ) = 3:( 1 )j 1 and 2 2 v (2j +1 ) + 2 v (2j 1) 5 5 1 1 = 2 :3:( 2 )j + 2 :3:( 2 )j 2 5 5 2 = 3: 5 : 1 + 2 ( 1 )j 2 4 5 2 = 3: 1 :( 1 )j 2 2 2 = v (2j ):
; ; ; ; ; ;
f 5 IE 4 v (Sk+1)jF k = 4 : 1 v (2Sk ) + 4 : 1 v ( 1 Sk ): 5 2 5 2 2

If j
105
= 1, then v (2j ) = v (2) = 3 and

2 v (2j +1 ) + 2 v (2j 1) 5 5 2 = 5 v (4) + 2 v (1) 5 2 :3: 1 + 2 :4 = 5 2 5
;
= 3=5 + 8=5 1 = 2 5 < v (2) = 3
If j
There is a gap of size 4 . 5
0, then v (2j ) = 5 ; 2j and

2 v (2j +1 ) + 2 v (2j 1) 5 5 = 2 (5 ; 2j +1 ) + 2 (5 ; 2j 1 ) 5 5 2 = 4 ; 5 (4 + 1)2j 1 = 4 ; 2j < v (2j ) = 5 ; 2j :
; ; ;
There is a gap of size 1. This concludes the proof of (b). Proof: (c). Suppose fYk gn=0 is some other process satisfying: k (a) (b)
Yk (5 ; Sk )+ 8k f( 4 )k Yk gk=0 is a supermartingale. 5
1
We must show that
Yk v (Sk ) 8k: Y0 v(S0)

For j
(7.1)
Actually, since the put is perpetual, every time k is like every other time, so it will sufce to show (7.2)
provided we let S0 in (7.2) be any number of the form 2j . With appropriate (but messy) conditioning on F k , the proof we give of (7.2) can be modied to prove (7.1).
1,
so if S0
= 2j for some j 1, then (a) implies
v (2j ) = 5 ; 2j = (5 ; 2j )+ Y0 (5 ; 2j )+ = v (S0):
Suppose now that S 0
= 2j for some j 2, i.e., S0 4. Let = minfk Sk = 2g = minfk Mk = j ; 1g:
106 Then
v(S0) = v (2j ) = 3:( 1 )j 1 h 4 2 +i = IE ( 5 ) (5 ; S ) :

;
4 Because f( 5 )k Yk g1 is a supermartingale k=0 4 Y0 IE ( 5 ) Y
IE ( 4 ) (5 ; S )+ = v(S0): 5
Comment on the proof of (c): If the candidate value process is the actual value of a particular exercise rule, then (c) will be automatically satised. In this case, we constructed v so that v (S k ) is the value of the put at time k if the stock price at time k is S k and if we exercise the put the rst time (k, or later) that the stock price is 2 or less. In such a situation, we need only verify properties (a) and (b).
8.8 Difference Equation

If we imagine stock prices which can fall at any point in (0 1), not just at points of the form 2 j for integers j , then we can imagine the function v (x), dened for all x > 0, which gives the value of the perpetual American put when the stock price is x. This function should satisfy the conditions:
v (x) (K ; x)+ 8x, 1 ~ ~ (b) v (x) 1+r pv (ux) + q v (dx)] 8x (c) At each x, either (a) or (b) holds with equality.
(a) In the example we worked out, we have For
6 j 1 : v(2j ) = 3:( 1 )j 1 = 2j 2 For j 1 : v (2j ) = 5 ; 2j :

;
This suggests the formula
v (x) =
We then have (see Fig. 8.4): (a) (b)
x 3 5 ; x 0 < x 3:
6 x
v (x) (5 ; x)+ 8x v (x)
4 1 1 x 5 2 v (2x) + 2 v ( 2 ) for every x except for 2 < x < 4.
107
v(x) 5
(3,2)
Figure 8.4: Graph of v (x). Check of condition (c): If 0 < x If x
3, then (a) holds with equality.

4 1 v (2x) + 1 v ( x ) 5 2 2 2 1 12 1 6 2 2x + 2 x
6, then (b) holds with equality: =4 5 6 = x:
If 3 < x < 4 or 4 < x < 6, then both (a) and (b) are strict. This is an artifact of the discreteness of the binomial model. This artifact will disappear in the continuous model, in which an analogue of (a) or (b) holds with equality at every point.
8.9 Distribution of First Passage Times

Let fMk g1 be a symetric random walk under a probability measure IP , with M0 k=0
= 0. Dening
= minfk 0 Mk = 1g
we recall that
IE
= 1; 1;
0 < < 1:
We will use this moment generating function to obtain the distribution of . We rst obtain the Taylor series expasion of IE as follows:
108
f (x) = 1 ; 1 ; x f (0) = 0 1 1 1 f (x) = 2 (1 ; x) 2 f (0) = 2 3 f (x) = 1 (1 ; x) 2 f (0) = 1 4 4 3 (1 ; x) 5 f (0) = 3 2 f (x) = 8 8 ::: : f (j)(x) = 1 3 : :2j (2j ; 3) (1 ; x) f (j)(0) = 1 3 : : : (2j ; 3)
0 ; 0 00 ; 00 000 ; 000
(2j 1) 2
: = 1 3 : :2j (2j ; 3) : 2 42j :1:(: ;(2j ; 2) j 1)! 2j 1 (2j ; 2)! 1 = 2 (j ; 1)!

; ;
2j
The Taylor series expansion of f (x) is given by
f (x) = 1 ; 1 ; x X 1 (j) j = j ! f (0)x

1
j =0
X
1
X = x+ 2
1
j =1
1 2j 1 2
;
(2j ; 2)! j j !(j ; 1)! x

;
j =2
1 2j 1 1 2 (j ; 1)
2j ; 2 xj : j
So we have
IE
= 1; 1; = 1 f ( 2) = 2 +
X
1
2j 1
;
j =2
1
2
2j 1 IP f
;
(j ; 1)
2j ; 2 :
But also,
IE =
X
j =1
= 2j ; 1g:
109
Figure 8.5: Reection principle.
Figure 8.6: Example with j Therefore,
= 2.
IP f = 1g = IP f = 2j ; 1g =
1 2
! 1 2j 1 1 2j ; 2 j = 2 3 : : : 2 (j ; 1) j
;
8.10 The Reection Principle

To count how many paths reach level 1 by time 2j ; 1, count all those for which M 2j ;1 double count all those for which M 2j ;1 3. (See Figures 8.5, 8.6.)
= 1 and
110 In other words,
IP f
For j
2j ; 1g = IP fM2j 1 = 1g + 2IP fM2j 1 3g = IP fM2j 1 = 1g + IP fM2j 1 3g + IP fM2j = 1 ; IP fM2j 1 = ;1g:

; ; ; ; ;
;3g
2,
IP f = 2j ; 1g = IP f
= = =
= = = =
2j ; 1g ; IP f 2j ; 3g 1 ; IP fM2j 1 = ;1g] ; 1 ; IP fM2j 3 = ;1g] IP fM2j 3 = ;1g ; IP fM2j 1 = ;1g (2j ; 3)! ; 1 2j 1 (2j ; 1)! 1 2j 3 2 (j ; 1)!(j ; 2)! 2 j !(j ; 1)! 2j 1 (2j ; 3)! 1 2 j !(j ; 1)! 4j (j ; 1) ; (2j ; 1)(2j ; 2)] 1 2j 1 (2j ; 3)! 2j (2j ; 2) ; (2j ; 1)(2j ; 2)] 2 j !(j ; 1)! 2j 1 (2j ; 2)! 1 2 j !(j ; 1)! ! 2j ; 2 : 1 2j 1 1 2 (j ; 1) j
; ; ; ; ; ; ; ; ; ;
Chapter 9
Pricing in terms of Market Probabilities: The Radon-Nikodym Theorem.

9.1 Radon-Nikodym Theorem
f P Theorem 1.27 (Radon-Nikodym) Let I and I be two probability measures on a space ( F ). P f Assume that for every A 2 F satisfying IP (A) = 0, we also have I (A) = 0. Then we say that P f is absolutely continuous with respect to I . Under this assumption, there is a nonegative random P IP variable Z such that f IP (A) = Z f P P and Z is called the Radon-Nikodym derivative of I with respect to I . f IEX = IE XZ ] for every random variable X for which IE jXZ j < 1. f Remark 9.2 If I is absolutely continuous with respect to I , and I is absolutely continuous with P P P f, we say that I and IP are equivalent. I and IP are equivalent if and only if f f respect to I P P P f IP (A) = 0 exactly when IP (A) = 0 8A 2 F : 1 f f If I and I are equivalent and Z is the Radon-Nikodym derivative of I w.r.t. I , then Z is the P P P P f Radon-Nikodym derivative of I w.r.t. I , i.e., P P f (1.2) IEX = IE XZ ] 8X f 1 IEY = IE Y: Z ] 8Y: (Let X and Y be related by the equation Y = XZ to see that (1.2) and (1.3) are the same.)
111 (1.3) Remark 9.1 Equation (1.1) implies the apparently stronger condition
ZdIP 8A 2 F
(1.1)
112
Example 9.1 (Radon-Nikodym Theorem) Let = fHH HT TH T T g, the set of coin toss sequences of length 2. Let P correspond to probability 1 for H and 2 for T , and let IP correspond to probability 1 for 3 3 2
e H and 1 for T . Then Z(!) = IP (!) , so

2
I ( ) P
! 9 9 Z(HH) = 4 Z(HT) = 9 Z(TH) = 9 Z(T T) = 16 : 8 8
9.2 Radon-Nikodym Martingales

Let be the set of all sequences of n coin tosses. Let I be the market probability measure and let P f IP be the risk-neutral probability measure. Assume
f f P P so that I and I are equivalent. The Radon-Nikodym derivative of I with respect to I is P P f IP (!) Z (!) = IP (! ) :
Dene the I -martingale P
f IP (!) > 0 IP (!) > 0 8! 2
We can check that Zk is indeed a martingale:
Zk = IE Z jF k ] k = 0 1 : : : n:
4
f Lemma 2.28 If X is F k -measurable, then I EX

Proof:
IE Zk+1jF k ] = IE IE Z jF k+1 ]jF k ] = IE Z jF k ] = Zk :

= IE XZk ].
f IEX = IE XZ ] = IE IE XZ jF k ]] = IE X:IE Z jF k ]] = IE XZk ]: f IE IA X ] = IE ZkIA X ]

A
Note that Lemma 2.28 implies that if X is F k -measurable, then for any A 2 F k ,
or equivalently,
f XdIP =
Z
A
XZk dIP:
CHAPTER 9. Pricing in terms of Market Probabilities

Z2 (HH) = 9/4 1/3 Z (H) = 3/2 1 1/3 Z =1 0 2/3 Z (T) = 3/4 1 2/3 Z2 (TT) = 9/16 1/3 2/3 Z2 (HT) = 9/8 Z2 (TH) = 9/8
113
Figure 9.1: Showing the Zk values in the 2-period binomial model example. The probabilities shown f are for I , not I . P P Lemma 2.29 If X is F k -measurable and 0
j k, then
j
1 f IE X jF j ] = Z IE XZk jF j ]:
1 Proof: Note rst that Z
Z Z 1 f IE XZk jF j ]dIP = IE XZkjF j ]dIP A A Zj Z

= =
IE XZk jF j ] is F j -measurable. So for any A 2 F j , we have

(Lemma 2.28)
ZA
A
XZk dIP
(Partial averaging) (Lemma 2.28)
f XdIP
Example 9.2 (Radon-Nikodym Theorem, continued) We show in Fig. 9.1 the values of the martingale Zk . We always have Z0 = 1, since
Z0 = IEZ =
e ZdIP = IP ( ) = 1:
9.3 The State Price Density Process

In order to express the value of a derivative security in terms of the market probabilities, it will be useful to introduce the following state price density process:
= (1 + r) k Zk k = 0 : : : n:
;
114 We then have the following pricing formulas: For a Simple European derivative security with payoff Ck at time k,
f V0 = IE (1 + r) k Ck h i = IE (1 + r) k Zk Ck = IE k Ck ]:
; ;
(Lemma 2.28)
More generally for 0
j k,
f Vj = (1 + r)j IE (1 + r) k Ck jF j j h i = (1 + r) IE (1 + r) k Zk Ck jF j Zj 1 IE C jF ] = k k j
; ;
(Lemma 2.29)
Remark 9.3
f j Vj gk=0 is a martingale under I , as we can check below: P j IE j+1Vj +1 jF j ] = IE IE k Ck jF j +1]jF j ] = IE k Ck jF j ]

= j Vj :
Now for an American derivative security fGk gn=0 : k

2
f V0 = sup IE (1 + r) G ] T0 = sup IE (1 + r) Z G ] T0 = sup IE G ]:

; ; 2 2
T0
More generally for 0
f Vj = (1 + r)j sup IE (1 + r) G jF j ]
= 1 sup IE G jF j ]:
j
2
j n,
1 = (1 + r)j sup Z IE (1 + r) Z G jF j ]
2
Tj
Tj j
Tj
Remark 9.4 Note that
f j Vj gn=0 is a supermartingale under I , P j (b) j Vj j Gj 8j

(a)

2() = 1.44 S2 (HH) = 16 1/3 () = 1.20 1 S (H) = 8 1 1/3 S =4 0 2/3 0 = 1.00 2/3 2() = 0.72 S2 (HT) = 4 S2 (TH) = 4 () = 0.72 2
115
1/3 S (T) = 2 1 () = 0.6 1 2/3
() = 0.36 2 S2 (TT) = 1
f Figure 9.2: Showing the state price values k . The probabilities shown are for I , not I . P P
(c)
f j Vj gn=0 is the smallest process having properties (a) and (b). j
We interpret k by observing that k (! )IP (! ) is the value at time zero of a contract which pays $1 at time k if ! occurs.
Example 9.3 (Radon-NikodymTheorem, continued) We illustrate the use of the valuation formulas for 1 European and American derivative securities in terms of market probabilities. Recall that p = 3 , q = 2 . The 3 state price values k are shown in Fig. 9.2. For a European Call with strike price 5, expiration time 2, we have
V2 (HH) = 11 2(HH)V2 (HH) = 1:44 11 = 15:84: V2 (HT) = V2(TH) = V2 (TT ) = 0: V0 = 1 1 15:84 = 1:76: 3 3 2 (HH) V2(HH) = 1:44 11 = 1:20 11 = 13:20 1:20 1 (HH) V1 (H) = 1 13:20 = 4:40 3
Compare with the risk-neutral pricing formulas:
V1(H) = 2 V1 (HH) + 2 V1(HT) = 2 11 = 4:40 5 5 5 2 V1 (T ) = 5 V1 (TH) + 2 V1 (T T ) = 0 5 2 V (H) + 2 V (T) = 2 V0 = 5 1 4:40 = 1:76: 5 1 5

Now consider an American put with strike price 5 and expiration time 2. Fig. 9.3 shows the values of + k (5 ; Sk ) . We compute the value of the put under various stopping times : (0) Stop immediately: value is 1. (1) If
(HH) = (HT) = 2 (T H) = (TT ) = 1, the value is 1 2 0:72 + 2 1:80 = 1:36: 3 3 3
116
+ (5 - S (HH)) = 0 2 + (HH) (5 - S (HH)) = 0 2 2
+ (5 - S (H)) = 0 1 + (H) (5 - S1(H)) = 0 1 1/3 2/3 + (5 - S (T)) = 3 1 (T) (5 - S (T))+ = 1.80 1 1
1/3
+ (5-S 0) =1 + 0 (5-S 0) =1
2/3 1/3
+ (5 - S (HT)) = 1 2 (HT) (5 - S (HT))+= 0.72 2 2 (5 - S (TH))+= 1 2 + 2 (TH) (5 - S2(TH)) = 0.72
2/3
+ (5 - S (TT)) = 4 2 + (TT) (5 - S (TT)) = 1.44 2 2
Figure 9.3: Showing the values k (5 ; Sk )+ for an American put. The probabilities shown are for f I , not I . P P
(2) If we stop at time 2, the value is
1 3
2 3
2 0:72 + 3 1 0:72 + 2 3 3
2 3
1:44 = 0:96
We see that (1) is optimal stopping rule.
9.4 Stochastic Volatility Binomial Model

Let be the set of sequences of n tosses, and let 0 < d k are F k -measurable. Also let
< 1+ rk < uk , where for each k, dk uk rk

k k
f Let I be the risk-neutral probability measure: P
k pk = 1 + r;; dk ~ u d k k
; qk = uk u (1 + rk) : ~ ;d
and for 2
equivalent. Dene
f IP !k+1 = H jF k ] = pk ~ f IP !k+1 = T jF k ] = qk : ~ Let I be the market probability measure, and assume IP f! g > 0 8! 2 P f IP (!) Z (!) = IP (! ) 8! 2
k n,
f IP f!1 = H g = p0 ~ f IP f!1 = T g = q0 ~
. Then I and P
f IP are
117
Zk = IE Z jF k ] k = 0 1 : : : n:
We dene the money market price process as follows:
M0 = 1
Note that Mk is Fk;1 -measurable.
Mk = (1 + rk 1)Mk
;
k = 1 : : : n:
We then dene the state price process to be
1 = M Zk k = 0 : : : n:
; As before the portfolio process is f consists of X 0, the non-random initial wealth, and
k n 1 k gk=0 . The self-nancing value process (wealth process)
Xk+1 =
k Sk+1 + (1 + rk )(Xk ; k Sk )
k = 0 : : : n ; 1:
f P Then the following processes are martingales under I :

1 S n M k
k k=0
and
1 X n M k
k
k=0
and the following processes are martingales under I : P
f k Sk gn=0 k
We thus have the following pricing formulas:
and
f k Xk gn=0: k
Simple European derivative security with payoff Ck at time k:
f C Vj = Mj IE Mk F j k 1 IE C jF ] = k k j
j
American derivative security
fGk gn=0: k f G Vj = Mj sup IE M F j Tj 1 sup IE G jF ] : = j

2
Tj
The usual hedging portfolio formulas still work.
118
9.5 Another Applicaton of the Radon-Nikodym Theorem

Let ( F Q) be a probability space. Let G be a sub- -algebra of F , and let X be a non-negative R X dQ = 1. We construct the conditional expectation (under Q) of X random variable with given G . On G , dene two probability measures
IP (A) = Q(A) 8A 2 G
f IP (A) = Z
Whenever Y is a G -measurable random variable, we have
XdQ 8A 2 G :
Y dIP =
Y dQ
if Y = A for some A 2 G , this is just the denition of IP , and the rest follows from the standard f f machine. If A 2 G and IP (A) = 0, then Q(A) = 0, so I (A) = 0. In other words, the measure I P P f. The Radon-Nikodym theorem implies that P is absolutely continuous with respect to the measure I there exists a G -measurable random variable Z such that
f IP (A) =
4
Z dIP 8A 2 G
A
i.e.,
Z
A
X dQ =
Z dIP 8A 2 G :
This shows that Z has the partial averaging property, and since Z is G -measurable, it is the conditional expectation (under the probability measure Q) of X given G . The existence of conditional expectations is a consequence of the Radon-Nikodym theorem.
Chapter 10
Capital Asset Pricing

10.1 An Optimization Problem
Consider an agent who has initial wealth X 0 and wants to invest in the stock and money markets so as to maximize
IE log Xn :
Remark 10.1 Regardless of the portfolio used by the agent, f k Xk g1 is a martingale under I , so P k=0
IE nXn = X0
Here, (BC) stands for Budget Constraint. Remark 10.2 If is any random variable satisfying (BC), i.e.,
(BC )
IE
= X0
then there is a portfolio which starts with initial wealth X 0 and produces Xn = at time n. To see this, just regard as a simple European derivative security paying off at time n. Then X 0 is its value at time 0, and starting from this value, there is a hedging portfolio which produces X n = . Remarks 10.1 and 10.2 show that the optimal obtained by solving the following Constrained Optimization Problem: Find a random variable which solves: Maximize Subject to Equivalently, we wish to Maximize
Xn for the capital asset pricing problem can be
IE log
n
IE
= X0:
X
!
2
(log (! )) IP (! )
119
120 Subject to
X
!
2
n (! )
(!)IP (! ) ; X0 = 0:
There are 2n sequences ! in . Call them ! 1
!2 : : : !2n . Adopt the notation
x1 = (! 1) x2 = (!2 ) : : : x2n = (!2n ):

We can thus restate the problem as: Maximize
2n X
k=1
(log xk )IP (! k )
Subject to In order to solve this problem we use:
2n X
k=1
n (! k )xk IP (! k )
; Xo = 0:
Theorem 1.30 (Lagrange Multiplier) If (x1 Maxmize Subject to then there is a number such that
: : : xm) solve the problem f (x1 : : : xm ) g (x1 : : : xm) = 0

(1.1)
@ f (x : : : x ) = @ g(x : : : x ) k = 1 : : : m m m @xk 1 @xk 1

and
g(x1 : : : xm) = 0:
For our problem, (1.1) and (1.2) become
(1.2)
xk IP (! k ) =
2n X
n (! k )IP (! k ) n (! k )xk IP (! k )
k = 1 : : : 2n
= X0 : (1:2 )
0
(1:1 )
0
k=1
Equation (1.1) implies
xk =
Plugging this into (1.2) we get
n
1 : n (! k )
2 1 X IP (! ) = X =) 1 = X : 0 k 0
k=1
CHAPTER 10. Capital Asset Pricing

Therefore,
121
0 xk = X! ) k = 1 : : : 2n: ( k n
solves the problem Maximize Subject to
Thus we have shown that if
IE log IE ( n ) = X0
= X0 :
n
(1.3)
then (1.4)
Theorem 1.31 If Proof: Fix Z
is given by (1.4), then
solves the problem (1.3).
> 0 and dene
We maximize f over x > 0:

0
f (x) = log x ; xZ:

1 1 f (x) = x ; Z = 0 () x = Z
The function f is maximized at x
1 f (x) = ; x2 < 0 8x 2 IR:

00
1 log x ; xZ f (x ) = log Z ; 1 8x > 0 8Z > 0:

Let be any random variable satisfying
1 = Z , i.e.,
(1.5)
IE ( n ) = X0
and let
= X0 :
n
From (1.5) we have
log ;
n X0
log X0 ; 1:
n
Taking expectations, we have
1 IE log ; X IE ( n ) IE log ; 1 0
and so
IE log
IE log :
122 In summary, capital asset pricing works as follows: Consider an agent who has initial wealth and wants to invest in the stock and money market so as to maximize
X0
IE log Xn :
The optimal Xn is Xn
0 = Xn , i.e.,
Since f k Xk gn=0 is a martingale under I , we have P k
n Xn = X0:
k Xk
so
= IE nXn jF k ] = X0 k = 0 : : : n
Xk = X0
k k (!1
and the optimal portfolio is given by
: : : !k ) =
X0 ; k+1(!1 X0: : : k+1 (!1 : : : !k H ) Sk+1 (!1 : : : !k H ) ; Sk+1 (!1 : : :
!k T ) : !k T )
Chapter 11
General Random Variables

11.1 Law of a Random Variable
Thus far we have considered only random variables whose domain and range are discrete. We now consider a general random variable X : !IR dened on the probability space ( F P). Recall that:
F is a
I is a probability measure on F , i.e., IP (A) is dened for every A 2 F . P A function X : !IR is a random variable if and only if for every Borel subsets of I ), the set R
4 ; 4
-algebra of subsets of .
B 2 B(IR) (the
-algebra of
fX 2 Bg = X 1(B) = f! X (!) 2 Bg 2 F
i.e., X 11.1)
: !IR is a random variable if and only if X
1 is a function from B(IR) to F (See Fig.
Thus any random variable X induces a measure X on the measurable space by
(IR B(IR)) dened
in Williams book this is denoted by L X .
X 1(B) 8B 2 B(IR) where the probabiliy on the right is dened since X 1(B ) 2 F . X is often called the Law of X
X (B ) = IP
; ;
11.2 Density of a Random Variable

The density of X (if it exists) is a function f X
X (B ) =
: IR! 0 1) such that
fX (x) dx 8B 2 B (IR):
123
124
R B
{X B}
Figure 11.1: Illustrating a real-valued random variable X . We then write
where the integral is with respect to the Lebesgue measure on I . fX is the Radon-Nikodym derivaR tive of X with respect to the Lebesgue measure. Thus X has a density if and only if X is absolutely continuous with respect to Lebesgue measure, which means that whenever B 2 B(IR) has Lebesgue measure zero, then
d X (x) = fX (x)dx
IP fX 2 Bg = 0:
11.3 Expectation
Theorem 3.32 (Expectation of a function of X ) Let h : IR!IR be given. Then
IEh(X ) =
4
Z Z
h(X (!)) dIP (!) h(x) d X (x) h(x)fX (x) dx: IR, then these equations are
= =
ZIR
IR
Proof: (Sketch). If h(x) = B (x) for some B

4
IE 1B (X ) = P fX 2 B g = Z X (B ) = fX (x) dx
B
which are true by denition. Now use the standard machine to get the equations for general h.
CHAPTER 11. General Random Variables
125
(X,Y)
C y { (X,Y) C} x
Figure 11.2: Two real-valued random variables X
Y.
11.4 Two random variables

Let X Y be two random variables !IR dened on the space ( F P). Then measure on B(IR2 ) (see Fig. 11.2) called the joint law of (X Y ), dened by
XY
induce a
X Y (C ) = IP f(X
4
Y ) 2 C g 8C 2 B(IR2 ):
The joint density of (X
Y ) is a function fX Y : IR2! 0 1)
that satises
X Y (C ) =
ZZ
C
fX Y (x y ) dxdy 8C 2 B(IR2): Y in a manner analogous to the univariate case:
We compute the expectation of a function of X
fX Y
is the Radon-Nikodym derivative of X Y with respect to the Lebesgue measure (area) on IR2 .
IEk(X Y ) =
4
= =
ZZ
IR ZZ
2
k(X (!) Y (!)) dIP (!) k(x y ) d

X Y (x
y)
IR
k(x y )fX Y (x y ) dxdy
126
11.5 Marginal Density

Suppose (X
Y ) has joint density f X Y . Let B IR be given. Then

Y (B )
= IP fY 2 B g = IP f(X Y ) 2 IR B g = Z X Z (IR B ) Y = fX Y (x y ) dxdy =
ZB Z
B
4
fY (y ) dy fX Y (x y ) dx:
IR
where
fY (y ) =
Therefore, fY (y ) is the (marginal) density for Y .
IR
11.6 Conditional Expectation

Suppose
(X Y ) has joint density f X Y . Let h : IR!IR be given. Recall that IE h(X )jY ] = IE h(X )j (Y )] depends on ! through Y , i.e., there is a function g (y ) (g depending on h) such that
4
IE h(X )jY ](!) = g (Y (!)):

We can characterize g using partial averaging: Recall that A 2 (Y )()A = B 2 B(IR). Then the following are equivalent characterizations of g : How do we determine g ?
fY 2 Bg for some
(6.1)
g (Y ) dIP =
h(X ) dIP 8A 2 (Y )
Z Z
IR
1B (Y )g(Y ) dIP =
ZZ
IR2
1B (Y )h(X ) dIP 8B 2 B(IR) 1B (y)h(x) d

X Y (x
(6.2)
1B (y)g(y) Y (dy) =
g(y )fY (y ) dy =
y ) 8B 2 B(IR)
(6.3)
Z
B
Z Z
B IR
h(x)fX Y (x y ) dxdy 8B 2 B(IR):
(6.4)
127
11.7 Conditional Density

A function fX jY (xjy ) : IR2 ! 0 any function h : IR!IR:
1) is called a conditional density for X given Y provided that for

g (y ) =
Z
IR
h(x)fX Y (xjy ) dx:

j
(7.1)
(Here g is the function satisfying
and g depends on h, but fX jY does not.) Theorem 7.33 If (X
IE h(X )jY ] = g (Y )
Y ) has a joint density fX Y , then fX Y (xjy ) = fXfY (x)y) : Y (y

j
(7.2)
Proof: Just verify that g dened by (7.1) satises (6.4): For B
Z Z
B | IR
h(x)fX Y (xjy) dx fY (y ) dy =
g (y )
Z Z
2 B(IR)
{z
j
B IR
h(x)fX Y (x y) dxdy:
Notation 11.1 Let g be the function satisfying
IE h(X )jY ] = g (Y ):
The function g is often written as
g(y ) = IE h(X )jY = y ]

and (7.1) becomes
IE h(X )jY = y ] = g (y ) =
Z
IR
h(x)fX Y (xjy ) dx:

j
In conclusion, to determine IE
h(X )jY ] (a function of ! ), rst compute
and then replace the dummy variable y by the random variable Y :
IR
h(x)fX Y (xjy ) dx
j
IE h(X )jY ](!) = g (Y (!)):

Example 11.1 (Jointly normal random variables) Given parameters: (X Y ) have the joint density
1
>0
1 2
> 0 ;1 < < 1. :
Let
fX Y (x y) =
1 2
1 p1 ; 2 exp
1 ; 2(1 ; 2 )
x2 ; 2 x y + y2 2 2
1 2
128
The exponent is
x2 ; 2 x y + y2 2 2 2(1 ; 2 ) 1 " 1 2 2 # x ; y 2 + y2 (1 ; 2 ) 1 = ; 2(1 ; 2 ) 2 1 2 2 2 2 1 = ; 2(1 ; 2 ) 12 x ; 1 y ; 1 y 2 : 2 2 1 2 1 p
We can compute the Marginal density of Y as follows
fY (y) =
Z1
1 1 Z 1 e; u2 du:e; 2 y = 2 2
2
1 2
1;
;1
; 2(1;1 2)
2 2 2
2 1
x;
1 2
dx:e
1 ; 2 y2 2
using the substitution u =

1 y2 2 2 2
;1
Thus Y is normal with mean 0 and variance 2 . 2 Conditional density. From the expressions
; = p1 e 2 2
;2
x;
1 2
dx y , du = p1; 2
fX Y (x y) =
1 2
1 p
1;
e 2
1 ; 2(1; 2 )
1 x 2 1 2
;;
1y 2 2
e; 2
1 y2
2 2
1y 2 fY (y) = p 1 e; 2 2 2 2
we have
fX jY (xjy) = fXfY (x y) Y (y) 2 1 1 p 1 e; 2(1; 2) 12 x ; 21 y : 1 = p 2 1 1; 2 In the x-variable, fX jY (xjy) is a normal density with mean 21 y and variance (1 ; 2 ) Z1 IE X jY = y] = xf (xjy) dx = 1 y 2 ;1 X jY IE
2. 1
Therefore,
"
X;
1 2
Z1
Y =y
1 2
;1 = (1 ; 2 ) 2 : 1
x;
y fX jY (xjy) dx 2

From the above two formulas we have the formulas
129
IE X jY ] = IE
1 2
(7.3)
"
X;
1 2
Y = (1 ; 2 ) 2: 1
(7.4)
Taking expectations in (7.3) and (7.4) yields
IEX = IE
1 2
IEY = 0
(7.5)
"
X;
1 2
= (1 ; 2 ) 2: 1
(7.6)
Based on Y , the best estimator of X is 21 Y . This estimator is unbiased (has expected error zero) and the expected square error is (1 ; 2 ) 2. No other estimator based on Y can have a smaller expected square error 1 (Homework problem 2.1).
11.8 Multivariate Normal Distribution

Please see Oksendal Appendix A. Let denote the column vector of random variables (X1 X2 : : : Xn )T , and the corresponding has a multivariate normal distribution if and only if column vector of values (x1 x2 : : : xn )T . the random variables have the joint density
p o n fX(x) = (2detnA exp ; 1 (X ; )T:A:(X ; ) : 2 ) =2

=(
4
Here, and A is an n
T T n ) = IE X = (IEX1 : : : IEXn) n nonsingular matrix. A 1 is the covariance matrix

1
:::
;
A 1 = IE (X ; ):(X ; )T
i.e. the (i j )th element of A;1 is IE (Xi ; i )(Xj ; j ). The random variables in if and only if A;1 is diagonal, i.e.,
X are independent
A 1 = diag(
;
2 2 1 2
:::
2) n
where 2 j
= IE (Xj ; j )2 is the variance of Xj .
130
11.9 Bivariate normal distribution

Take n = 2 in the above denitions, and let
4
= IE (X1 ; 1 )(X2 ; 2 ) :
1 2
;
Thus,
2 A=4
A =
"
;
2 1 1 2
1 2 2 2
#
;
; p
1 (1 2 ) 2) 1 2 (1
2 1
;
;
1 p
1 2 2 2
(1 ) 1 (1 2 )
2
;
3 5
det A =
1 2
1;
and we have the formula from Example 11.1, adjusted to account for the possibly non-zero expectations:
fX1 X2 (x1 x2) =
1 p 2 1;
#) ( " 1 (x1 ; 1 )2 ; 2 (x1 ; 1 )(x2 ; 2 ) + (x2 ; 2 )2 : 2 2 2 exp ; 2(1 ; 2) 1 2 1 2
11.10 MGF of jointly normal random variables

Let = (u1 u2 : : : un )T denote a column vector with components in IR, and let have a multivariate normal distribution with covariance matrix A ;1 and mean vector . Then the moment generating function is given by
T IEeu :X =
T eu :XfX1 X2 : : : Xn (x1 x2 : : : xn ) dx1 : : :dxn o n = exp 1 uT A 1 u + uT : 2

1 ;1
:::
;1
If any n random variables X1 X2 : : : Xn have this moment generating function, then they are jointly normal, and we can read out the means and covariances. The random variables are jointly normal and independent if and only if for any real column vector = (u 1 : : : un )T
u 8n 9 8n 9 <X = <X 1 2 2 = IEeuT :X = IE exp : uj Xj = exp : 2 j uj + uj j ] : j =1 j =1

4
Chapter 12
Semi-Continuous Models
12.1 Discrete-time Brownian Motion
Let fYj gn=1 be a collection of independent, standard normal random variables dened on ( j where I is the market measure. As before we denote the column vector (Y1 : : : Yn )T by P therefore have for any real colum vector = (u1 : : : un )T ,
u 8n 9 8n 9 <X = <X = TY IEeu = IE exp : uj Yj = exp : 1 u2 : 2 j j =1 j =1

B0 = 0 Bk =
k X
F P), Y. We
Dene the discrete-time Brownian motion (See Fig. 12.1):
If we know Y1 Y2 : : : Yk , then we know B1 B2 : : : Bk . Conversely, if we know B1 then we know Y1 = B1 Y2 = B2 ; B1 : : : Yk = Bk ; Bk;1 . Dene the ltration
j =1
Yj k = 1 : : : n: B2 : : : Bk ,
F0 = f g F k = (Y1 Y2 : : : Yk ) = (B1 B2 : : : Bk ) k = 1 : : : n:
Theorem 1.34 Proof:
fBk gn=0 is a martingale (under I ). P k

IE Bk+1 jF k ] = IE Yk+1 + Bk jF k ] = IEYk+1 + Bk = Bk :
131
132
B k Y2 Y1 0 1 k 2 Y 3 3 Y 4 4
Figure 12.1: Discrete-time Brownian motion. Theorem 1.35
fBk gn=0 is a Markov process. k

IE h(Bk+1 )jF k ] = IE h(Yk+1 + Bk )jF k ]:
1 Z g(b) = IEh(Yk+1 + b) = p 2
1
Proof: Note that
Use the Independence Lemma. Dene
;1
h(y + b)e
1 y2 2
dy:
Then which is a function of B k alone.
IE h(Yk+1 + Bk )jF k ] = g(Bk )
12.2 The Stock Price Process

Given parameters:
> 0, the volatility. S0 > 0, the initial stock price.
2 IR, the mean rate of return.
The stock price process is then given by
Sk = S0 exp Bk + ( ; 1 2 )k 2
Note that
k = 0 : : : n:
1 Sk+1 = Sk exp Yk+1 + ( ; 2 2 )
CHAPTER 12. Semi-Continuous Models
133
;
IE Sk+1jF k ] = Sk IE e Yk+1 jF k ]:e 1 2 1 2 = Sk e 2 e 2 = e Sk :

;
1 2
Thus
k +1 = log IE SkS jF k ] = log IE SS+1 F k k k k 1 var log SS+1 = var Yk+1 + ( ; 2 2) = 2 : k
and
12.3 Remainder of the Market

The other processes in the market are dened as follows. Money market process: Portfolio process:
Mk = erk k = 0 1 : : : n:
0
Each
k is F k -measurable.
:::
n 1
;
Wealth process:
X0 given, nonrandom. Xk+1 =

=
Each Xk is F k -measurable. Discounted wealth process:
r k Sk+1 + e (Xk ; k Sk ) r r k (Sk+1 ; e Sk ) + e Xk
Xk+1 Mk+1 =
12.4 Risk-Neutral Measure
Sk+1 Sk Xk Mk+1 ; Mk + Mk :
f Denition 12.1 Let I be a probability measure on ( P n o

Sk Mk
n f f is a martingale under I , we say that I is a risk-neutral measure. P P k=0
F ), equivalent to the market measure I . If P
134
f Theorem 4.36 If I is a risk-neutral measure, then every discounted wealth process P f a martingale under I , regardless of the portfolio process used to generate it. P
Proof:
n Xk on
Mk
k=0
is
f X IE Mk+1 F k k+1
f = IE
=
X = Mk : k
Sk+1 Sk Xk Mk+1 ; Mk + Mk F k S X f S IE Mkk+1 F k ; Mkk + Mk +1 k

k
12.5 Risk-Neutral Pricing

Let Vn be the payoff at time n, and say it is F n -measurable. Note that Vn may be path-dependent. Hedging a short position: Sell the simple European derivative security V n . Receive X0 at time 0.
::: n f P If there is a risk-neutral measure I , then

Construct a portfolio process
1 which starts with X 0 and ends with Xn
= Vn .
fX f V X0 = IE Mn = IE Mnn : n
Remark 12.1 Hedging in this semi-continuous model is usually not possible because there are not enough trading dates. This difculty will disappear when we go to the fully continuous model.
12.6 Arbitrage
Denition 12.2 An arbitrage is a portfolio which starts with X 0
= 0 and ends with Xn satisfying
IP (Xn 0) = 1 IP (Xn > 0) > 0:

(I here is the market measure). P Theorem 6.37 (Fundamental Theorem of Asset Pricing: Easy part) If there is a risk-neutral measure, then there is no arbitrage.
135
f measure, let X0 = 0, and let Xn be the nal wealth corresponding P Proof: Let I be a risk-neutraln on f is a martingale under I , P to any portfolio process. Since Xk
Mk
k=0
fX fX IE Mn = IE M00 = 0: n
Suppose IP (Xn
(6.1)
0) = 1. We have
(6.2)
f f IP (Xn 0) = 1 =) IP (Xn < 0) = 0 =) IP (Xn < 0) = 0 =) IP (Xn 0) = 1: f (6.1) and (6.2) imply I (Xn P
This is not an arbitrage.
= 0) = 1. We have
f f IP (Xn = 0) = 1 =) IP (Xn > 0) = 0 =) IP (Xn > 0) = 0:
12.7 Stalking the Risk-Neutral Measure

Recall that
Y1 Y2 : : : Yn are independent, standard normal random variables on some probability space

( F P). o n Sk = S0 exp Bk + ( ; 1 2)k . 2
Sk+1 = S0 exp (Bk + Yk+1 ) + ( ; 1 2)(k + 1) o 2 n 1 2) : = Sk exp Yk+1 + ( ; 2

Therefore,
Sk+1 = Sk : exp n Yk+1 + ( ; r ; 1 2)o 2 Mk+1 Mk
If
S S IE Mkk+1 F k = Mkk :IE exp f Yk+1 gjF k ] : expf ; r ; 1 2g 2 +1 S = Mkk : expf 1 2 g: expf ; r ; 1 2g 2 2 Sk : = e r : Mk = r, the market measure is risk neutral. If 6= r, we must seek further.
;
136
Sk+1 = Sk : exp n Yk+1 + ( ; r ; 1 2 )o 2 Mk+1 Mk o n S = Mkk : exp (Yk+1 + r ) ; 1 2 2 n ~ o S = Mkk : exp Yk+1 ; 1 2 2
;
where The quantity

;
f ~ P We want a probability measure I under which Y1 dom variables. Then we would have
r is denoted and is called the market price of risk.
~ Yk+1 = Yk+1 +
r:
~ : : : Yn are independent, standard normal ran-
i S fh 1 f S ~ IE Mkk+1 F k = Mkk :IE expf Yk+1 gjF k : expf; 2 2g +1 S 1 = Mkk : expf 2 2 g: expf; 1 2g 2 S = Mkk : 3 2n X 1 Z = exp 4 (; Yj ; 2 2 )5 :
j =1
Cameron-Martin-Girsanovs Idea: Dene the random variable
Properties of Z :
Z 0.
9 8n = <X IEZ = IE exp : (; Yj ) : exp ; n 2 j =1

= exp n 2
2
: exp ; n 2
= 1:
Dene
f IP (A) =
0 for all A 2 F and
Z
A
Z dIP 8A 2 F :
f Then I (A) P
f In other words, I is a probability measure. P
f IP ( ) = IEZ = 1:
137
f P We show that I is a risk-neutral measure. For this, it sufces to show that

~ ~ Y 1 = Y1 + : : : Yn = Yn +
f are independent, standard normal under I . P

Verication:
Y1 Y2 : : : Yn : Independent, standard normal under I , and P
2n 3 2n 3 X 5 X IE exp 4 uj Yj = exp 4 1 u2 5 : 2 j
j =1 j =1
i h Z = exp Pn=1 (; Yj ; 1 2) j 2
~ ~ Y = Y1 + : : : Yn = Yn + : Z > 0 almost surely.
f IP (A) =
Z
A
Z dIP 8A 2 F
f IEX = IE (XZ ) for every random variable X . f ~ ~ Compute the moment generating function of (Y1 : : : Yn ) under I : P 2n 3 2n 3 n X ~5 X X f IE exp 4 uj Yj = IE exp 4 uj (Yj + ) + (; Yj ; 1 2)5 2 j =1 j =1 j =1 3 2n 3 2n X X = IE exp 4 (uj ; )Yj 5 : exp 4 (uj ; 1 2 )5 2 j =1 j =1 2n 3 2n 3 X1 X 1 = exp 4 2 (uj ; )25 : exp 4 (uj ; 2 2 )5 j =1 2j=1 3 n X 1 2 = exp 4 ( 2 uj ; uj + 1 2 ) + (uj ; 1 2) 5 2 2 j =1 2n 3 X = exp 4 1 u2 5 : 2 j
j =1
138
12.8 Pricing a European Call

Stock price at time n is
Sn = S0 exp Bn + ( ; 1 2)n 2
o n 8 n 9 < X = = S0 exp : Yj + ( ; 1 2 )n 2 j =1 8 n 9 < X = = S0 exp : (Yj + r ) ; ( ; r)n + ( ; 1 2)n 2 j =1 8 n 9 < X = 1 ~ = S0 exp : Yj + (r ; 2 2)n : j =1
;
Payoff at time n is (Sn ; K )+ . Price at time zero is
2 0 1+3 n (Sn ; K )+ = IE 4e rn @S exp < X Y + (r ; 1 2)n= ; K A 5 f f ~j IE M 0 2 :

;
8 n
;1
S0 exp b + (r ; 1 2 )n ; K : p 1 e 2n2 db 2 n Pn Y is normal with mean 0, variance n,2under IP . f ~j since j =1

1
j =1
rn
b2
This is the Black-Scholes price. It does not depend on .
Chapter 13
Brownian Motion
13.1 Symmetric Random Walk
Toss a fair coin innitely many times. Dene
Xj (!) = 1
Set
;1
if if
!j = H ! j = T:
M0 = 0 Mk =
k X j =1
Xj
k 1:
13.2 The Law of Large Numbers

We will use the method of moment generating functions to derive the Law of Large Numbers:
Theorem 2.38 (Law of Large Numbers:)
1 M !0 k k
almost surely, as 139
k!1:
140 Proof:
'k (u) = IE exp u Mk 9 8kk <X u = = IE exp : k Xj j =1 k Y = IE exp u Xj k j =1

=
which implies,
(Def. of
Mk :)
(Independence of the Xj s)
1 u 2e k
+ 1e 2
u
u k k
log 'k (u) = k log 1 e k + 1 e 2 2

Let x = 1 . Then k
u k
k!
lim log 'k (u) = xlim0 !

1
log 1 eux + 1 e ux 2 2
;
u eux ; = xlim0 2 ux ! 1e + 2
= 0:
u e ux 2 1 e ux 2
; ;
(LH pitals Rule) o
Therefore,
k!
lim 'k (u) = e0 = 1

1
which is the m.g.f. for the constant 0.
13.3 Central Limit Theorem

We use the method of moment generating functions to prove the Central Limit Theorem. Theorem 3.39 (Central Limit Theorem)
1 p Mk ! k
Proof:
Standard normal, as
k!1:
u 'k (u) = IE exp p Mk k u u 1 = 2 e pk + 1 e pk 2

;
CHAPTER 13. Brownian Motion

so that,
141
1 log 'k (u) = k log 1 e pk + 2 e 2 1 Let x = pk . Then
u pk
lim log 'k (u) = xlim0 ! k!

1
u eux ; u e ux = xlim0 1 ux 1 1 ux : xlim0 2 2x2 ! ! 2e + 2e

; ;
x2 u eux ; u e ux = xlim0 2 1 ux 2 1 ux ! 2x 2 e + 2 e
; ;
1 log 2 eux + 1 e ux 2
;
(LH pitals Rule) o
= xlim0 2 ! = xlim0 ! 1 u2 : =2
Therefore,
u eux ; u e ux 2
;
u2 eux ; u2 e ux 2 2
;
2x
(LH pitals Rule) o
lim ' (u) = e 2 u k! k

1
which is the m.g.f. for a standard normal random variable.
13.4 Brownian Motion as a Limit of Random Walks

Let n be a positive integer. If t
k 0 is of the form n , then set
1 1 B (n) (t) = pn Mtn = pn Mk :

k 0 is not of the form n , then dene B(n) (t) by linear interpolation (See Fig. 13.1). Here are some properties of B(100)(t):
If t
142
k/n
(k+1)/n
Figure 13.1: Linear Interpolation to dene B (n) (t).
Properties of
B(100)(1) :
1 B (100)(1) = 10
100 X
1 IEB(100)(1) = 10
j =1 100 X
Xj IEXj = 0:
var(Xj ) = 1
(Approximately normal)
var(B (100)(1)) = 100

Properties of
j =1 100 1 X j =1
B(100)(2) :
1 B (100)(2) = 10
200 X
IEB(100)(2) = 0: var(B (100)(2)) = 2:

Also note that:
j =1
Xj
(Approximately normal)
B (100)(1) and B (100)(2) ; B (100)(1) are independent. B (100)(t) is a continuous function of t. To get Brownian motion, let n!1 in B (n) (t) t 0.
13.5 Brownian Motion
(Please refer to Oksendal, Chapter 2.)
143
B(t) = B(t,) (, F,P)

Figure 13.2: Continuous-time Brownian Motion. A random variable properties: 1. 2. 3.
B(t) (see Fig.
13.2) is called a Brownian Motion if it satises the following
B (0) = 0, B (t) is a continuous function of t; B has independent, normally distributed increments: If 0 = t0 < t1 < t2 < : : : < tn
and
Y1 = B (t1 ) ; B (t0 ) Y2 = B(t2 ) ; B(t1 ) : : : Yn = B (tn ) ; B(tn 1 )

;
then
Y1 Y2 : : : Yn are independent, IEYj = 0 8j var(Yj ) = tj ; tj 1 8j:

;
13.6 Covariance of Brownian Motion

Let
0 s t be given. Then B (s) and B (t) ; B (s) are independent, so B (s) and B (t) = (B (t) ; B (s)) + B (s) are jointly normal. Moreover, IEB (s) = 0 var(B (s)) = s IEB(t) = 0 var(B (t)) = t IEB(s)B (t) = IEB(s) (B(t) ; B(s)) + B(s)] = IEB (s)(B{z) ; B (s)) + IEB 2 (s) (t } | {z } | = s:
0
144 Thus for any s
0, t 0 (not necessarily s t), we have
IEB(s)B(t) = s ^ t:
13.7 Finite-Dimensional Distributions of Brownian Motion
Let be given. Then
0 < t1 < t2 < : : : < tn (B (t1 ) B (t2) : : : B (tn ))
is jointly normal with covariance matrix
2 3 IEB 2(t1 ) IEB (t1 )B (t2) : : : IEB(t1 )B (tn ) 6 7 2 C = 6:IEB:(t:2:):B:(:t1 ) : : : : :IEB: :(t2:): : : : : :: :: :: : : IEB:(:t:2:):B (t:n: )7 6 ::: : :: 4 ::: : ::: : : :7 5 IEB(tn)B (t1 ) IEB(tn )B(t2 ) : : : IEB2 (tn) 2 3 t1 t1 : : : t1 6 7 = 6t1 : : t:2: : ::::::: : : t:2:7 6: : 7 4 5
t1 t2 : : : tn
13.8 Filtration generated by a Brownian Motion
fF (t)gt
Required properties: For each t, B (t) is F (t)-measurable, For each t and for t < t1
< t2 < < tn , the Brownian motion increments B (t1 ) ; B(t) B(t2 ) ; B(t1 ) : : : B (tn ) ; B (tn 1 ) are independent of F (t).
;
Here is one way to construct F (t). First x t. Let s 2
0 t] and C 2 B(IR) be given. Put the set

Then put in every other set
fB(s) 2 C g = f! : B(s !) 2 C g in F (t). Do this for all possible numbers s 2 0 t] and C 2 B(IR).
required by the -algebra properties.
This F (t) contains exactly the information learned by observing the Brownian motion upto time t. fF (t)gt 0 is called the ltration generated by the Brownian motion.
145
13.9 Martingale Property

Theorem 9.40 Brownian motion is a martingale. Proof: Let 0
s t be given. Then IE B (t)jF (s)] = IE (B(t) ; B(s)) + B(s)jF (s)] = IE B (t) ; B (s)] + B (s) = B (s):
Theorem 9.41 Let
2 IR be given. Then
Z (t) = exp ; B(t) ; 1 2 t 2

is a martingale. Proof: Let 0
s t be given. Then
1 = IE Z (s) expf; (B (t) ; B (s)) ; 2 2 (t ; s)g F (s)
IE Z (t)jF (s)] = IE expf; (B(t) ; B (s) + B(s)) ; 1 2((t ; s) + s)g F (s) 2

= Z (s)IE expf; (B (t) ; B (s)) ; 1 2 (t ; s)g 2 n1 2 o = Z (s) exp 2 (; ) var(B (t) ; B (s)) ; 1 2 (t ; s) 2 = Z (s):
13.10 The Limit of a Binomial Model

Consider the nth Binomial model with the following parameters:
un = 1 + n : Up factor. ( > 0). dn = 1 ; n : Down factor. r = 0. pn = u1n ddnn = 2 == nn = 1 . ~ 2 qn = 1 . ~ 2

p p p ; ; p
146 Let ]k (H ) denote the number of H in the rst k tosses, and let ] k (T ) denote the number of T in the rst k tosses. Then
]k (H ) + ]k (T ) = k ]k (H ) ; ]k (T ) = Mk
which implies,
]k (H ) = 1 (k + Mk ) 2 ]k (T ) = 1 (k ; Mk ): 2
In the nth model, take n steps per unit time. Set S0
f P Under I , the price process S (n) is a martingale.
S (n)(t) = 1 + pn
(n) = 1. Let t = k for some k, and let n 1 (nt+Mnt ) 1 (nt Mnt ) 2 2
1 ; pn
Theorem 10.42 As n!1, the distribution of S (n) (t) converges to the distribution of where B is a Brownian motion. Note that the correction martingale. Proof: Recall that from the Taylor series we have
expf B (t) ; 1 2 tg 2
; 1 2t is necessary in order to have a 2
log(1 + x) = x ; 1 x2 + O(x3) 2
so
log S (n)(t) = 1 (nt + Mnt ) log(1 + p ) + 1 (nt ; Mnt ) log(1 ; p ) 2 2 = nt

1 log(1 + 2
+ Mnt
1 2 + O(n 3=2 ) = nt ;4 n ; 4n ! 1 2 + 1 p + 1 2 + O(n 3=2 ) + Mnt 1 pn ; 4 n 2 n 4 n 2

1p 2 n 1 2 pn ;
; ;
1 2 log(1 + 1 2
pn ) + 1 log(1 ; pn ) 2
pn ) ; 1 log(1 ; pn ) 2
1 = ; 2 2 t + O (n 2 ) 1 1 1 + pn Mnt + n Mnt O(n 2 )

; ;
| {z } | {z } !0 !Bt
As n!1, the distribution of log S (n) (t) approaches the distribution of
B(t) ; 1 2t. 2
147
B(t) = B(t,) x t
(, F, Px)
Figure 13.3: Continuous-time Brownian Motion, starting at x 6= 0.
13.11 Starting at Points Other Than 0

(The remaining sections in this chapter were taught Dec 7.) For a Brownian motion B (t) that starts at 0, we have:
IP (B(0) = 0) = 1:
For a Brownian motion B (t) that starts at x, denote the corresponding probability measure by IP x (See Fig. 13.3), and for such a Brownian motion we have:
IP x (B(0) = x) = 1:
Note that: If x 6= 0, then IP x puts all its probability on a completely different set from I . P The distribution of B (t) under IP x is the same as the distribution of x + B (t) under I . P
13.12 Markov Property for Brownian Motion

We prove that Theorem 12.43 Brownian motion has the Markov property. Proof: Let s
2 6 ) IE h(B (s + t)) F (s) = IE 6h( B(s + t{z; B(s) + 4 | }

Independent of F (s)
F
t 0 be given (See Fig. 13.4).
(s)-measurable
B{zs) |( }
3 7 ) F (s)7 5
148
B(s) s restart
Figure 13.4: Markov Property of Brownian Motion. Use the Independence Lemma. Dene
s+t
g (x) = IE h( B (s + t) ; B(s) + x )]
2 6 = IE 6h( x + 4
= IE xh(B (t)):
Then
same distribution as B (s + t) ; B (s)
B (t) |{z}
3 7 )7 5
IE h (B(s + t) ) F (s) = g (B(s)) = E B (s)h(B (t)):
In fact Brownian motion has the strong Markov property.

Example 13.1 (Strong Markov Property) See Fig. 13.5. Fix x > 0 and dene
= min ft 0 B(t) = xg :
Then we have:
IE h( B( + t) ) F ( ) = g(B( )) = IE x h(B(t)):
149
x restart
Figure 13.5: Strong Markov Property of Brownian Motion.
+t
13.13 Transition Density

Let p(t x y ) be the probability that the Brownian motion changes value from x to y in time t, and let be dened as in the previous section.
p(t x y) = p 1 e 2 t g(x) = IE xh(B(t)) =
(y
;x)2
2t
h(y )p(t x y) dy:
;1
IE h(B (s + t)) F (s) = g(B (s)) = IE h(B( + t)) F ( ) =

13.14 First Passage Time
Fix x > 0. Dene Fix
h(y )p(t B (s) y) dy:
;1
h(y )p(t x y) dy:
;1
= min ft 0
B (t) = xg :
> 0. Then
exp B (t ^ ) ; 1 2 (t ^ ) 2
is a martingale, and
1 IE exp B(t ^ ) ; 2 2 (t ^ ) = 1:
150 We have
8 n 12 o <e 1 2 if < 1 lim exp ; 2 (t ^ ) = : 2 t! if = 1 0 0 expf B (t ^ ) ; 1 2 (t ^ )g e x : 2

; 1
(14.1)
Let t!1 in (14.1), using the Bounded Convergence Theorem, to get
IE expf x ; 1 2
Let
g1
<
1g
= 1:
#0 to get IE 1
<
1g
= 1, so
IP f < 1g = 1 IE expf; 1 2 g = e 2
Let
x:
(14.2)
=1 2
2 . We have the m.g.f.:
IEe
Differentiation of (14.3) w.r.t.
=e x
;
> 0:
p
(14.3)
yields
;IE e
Letting
= ; px e x 2
;
#0, we obtain
IE = 1:
(14.4)
Conclusion. Brownian motion reaches level x with probability 1. The expected time to reach level x is innite. We use the Reection Principle below (see Fig. 13.6).
IP f
t B (t) < xg = IP fB (t) > xg IP f tg = IP f t B(t) < xg + IP f t B (t) > xg = IP fB (t) > xg + IP fB (t) > xg = 2IP fB (t) > xg
Z e = p2 2 tx
1
y2 2t dy
151
shadow path
x t Brownian motion
Figure 13.6: Reection Principle in Brownian Motion. Using the substitution z
= yt
p
dz =
dy we get t
IP f
Density:
tg = p2
x pt
z2
2
dz:
x2 2t
@ f (t) = @t IP f F (t) =
tg = p x 3 e 2 t
which follows from the fact that if
Zb
a(t)
g (z) dz
then
@F = ; @a g (a(t)): @t @t IEe
;
Laplace transform formula:
= e t f (t)dt = e x
; ;
Z
0
152
Chapter 14
The It Integral o
The following chapters deal with Stochastic Differential Equations in Finance. References: 1. B. Oksendal, Stochastic Differential Equations, Springer-Verlag,1995 2. J. Hull, Options, Futures and other Derivative Securities, Prentice Hall, 1993.
14.1 Brownian Motion

(See Fig. 13.3.) ( F P) is given, always in the background, even when not explicitly mentioned. !IR, has the following properties: Brownian motion, B (t ! ) : 0 1) 1. 2. 3.
B (0) = 0 Technically, IP f! B(0 !) = 0g = 1, B (t) is a continuous function of t, If 0 = t0 t1 : : : tn , then the increments B (t1) ; B (t0 ) : : : B(tn ) ; B(tn 1 )
;
are independent,normal, and
IE B (tk+1 ) ; B(tk )] = 0 IE B (tk+1 ) ; B(tk )]2 = tk+1 ; tk :

14.2 First Variation
Quadratic variation is a measure of volatility. First we will consider rst variation, function f (t). 153
FV (f ), of a
154
f(t)
t2 t1 T t
Figure 14.1: Example function f (t). For the function pictured in Fig. 14.1, the rst variation over the interval
0 T ] is given by:
FV 0 T ](f ) = f (t1) ; f (0)] ; f (t2) ; f (t1 )] + f (T ) ; f (t2 )]

= f (t) dt + (;f (t)) dt + f (t) dt:
0 0 0
Zt1
0 ZT 0
Zt2
ZT
t1
t2
= jf (t)j dt:
0
Thus, rst variation measures the total amount of up and down motion of the path. The general denition of rst variation is as follows: Denition 14.1 (First Variation) Let
= ft0 t1 : : : tn g be a partition of 0 T ], i.e.,
0 = t0 t1 : : : tn = T:
The mesh of the partition is dened to be
jj jj = k=0max 1(tk+1 ; tk ): ::: n

;
We then dene
FV 0 T ](f ) = lim 0 !
jj jj
n 1 X
;
k=0
jf (tk+1) ; f (tk )j:
Suppose f is differentiable. Then the Mean Value Theorem implies that in each subinterval t k tk+1 ], there is a point tk such that
f (tk+1 ) ; f (tk ) = f (tk )(tk+1 ; tk ):

0
CHAPTER 14. The It Integral o

Then
155
n 1 X
;
k=0
and
jf (tk+1) ; f (tk )j =
n 1 X
;
k=0
;
jf (tk )j(tk+1 ; tk )
0 0
FV 0 T ](f ) = lim 0 !
jj jj
n 1 X k=0
jf (tk )j(tk+1 ; tk )
= jf (t)j dt:
0
ZT
0
14.3 Quadratic Variation

Denition 14.2 (Quadratic Variation) The quadratic variation of a function f on an interval is
0 T]
hf i(T ) = lim 0 jf (tk+1 ) ; f (tk )j2: ! k=0

jj jj
n 1 X
;
Remark 14.1 (Quadratic Variation of Differentiable Functions) If f is differentiable, then hf i(T ) = 0, because
n 1 X
;
k=0
jf (tk+1) ; f (tk )j2 =
n 1 X
;
k=0
jf (tk )j2(tk+1 ; tk )2
0
jj jj:
and
n 1 X
;
k=0
jf (tk )j2(tk+1 ; tk )
0
hf i(T )
jj
lim 0 jj jj: lim 0 ! !

jj jj jj jj
n 1 X
;
= lim jj jj jf (t)j2 dt
jj
!0
ZT
0
k=0
jf (tk )j2(tk+1 ; tk )
0
= 0:
Theorem 3.44
hBi(T ) = T
or more precisely,
IP f! 2
hB(: !)i(T ) = T g = 1:
In particular, the paths of Brownian motion are not differentiable.
156 Proof: (Outline) Let = ft0 t1 : : : tn g be a partition of B (tk+1 ) ; B(tk ). Dene the sample quadratic variation
0 T ]. To simplify notation, set D k =
Q =
Then
n 1 X
;
k=0
2 Dk :
Q ;T =
We want to show that
jj jj
n 1 X
;
k=0
2 Dk ; (tk+1 ; tk )]:
lim (Q ; T ) = 0:
!0
Consider an individual summand
2 Dk ; (tk+1 ; tk ) = B(tk+1 ) ; B(tk )]2 ; (tk+1 ; tk ):

This has expectation 0, so
IE (Q ; T ) = IE
For j
n 1 X
;
6= k, the terms
var(Q ; T ) = = =
n 1 X
;
k=0
2 Dk ; (tk+1 ; tk )] = 0: 2 Dk ; (tk+1 ; tk )
Dj2 ; (tj+1 ; tj )
and
are independent, so
k=0 n 1 X
;
2 var Dk ; (tk+1 ; tk )] 4 2 IE Dk ; 2(tk+1 ; tk )Dk + (tk+1 ; tk )2]
k=0 n 1 X
;
k=0
3(tk+1 ; tk )2 ; 2(tk+1 ; tk )2 + (tk+1 ; tk )2 ] (tk+1 ; tk )2

n 1 X
;
=2
n 1 X
;
(if X is normal with mean 0 and variance 2, then IE (X 4) = 3 4)
k=0
2jj jj
= 2jj jj T:
Thus we have
k=0
(tk+1 ; tk )
IE (Q ; T ) = 0 var(Q ; T ) 2jj jj:T:

As jj
157
jj!0, var(Q ; T )!0, so

jj
lim (Q ; T ) = 0:
jj
!0
Remark 14.2 (Differential Representation) We know that
IE (B(tk+1 ) ; B(tk ))2 ; (tk+1 ; tk )] = 0:

We showed above that
var (B (tk+1 ) ; B (tk ))2 ; (tk+1 ; tk )] = 2(tk+1 ; tk )2 :

When (tk+1 ; tk ) is small, (tk+1 ; tk )2 is very small, and we have the approximate equation
(B (tk+1 ) ; B (tk ))2 ' tk+1 ; tk

which we can write informally as
dB(t) dB(t) = dt:
14.4 Quadratic Variation as Absolute Volatility

On any time interval
T1 T2], we can sample the Brownian motion at times T1 = t0 t1 : : : tn = T2 T2 ; T1 k=0(B(tk+1 ) ; B(tk )) :

2
and compute the squared sample absolute volatility
n 1 X
;
This is approximately equal to
hBi(T2) ; hBi(T1)] = T2 ; T1 = 1: T2 ; T1 T2 ; T1
1
As we increase the number of sample points, this approximation becomes exact. In other words, Brownian motion has absolute volatility 1. Furthermore, consider the equation
hBi(T ) = T = 1 dt
0
ZT
8T 0:
This says that quadratic variation for Brownian motion accumulates at rate 1 at all times along almost every path.
158
14.5 Construction of the It Integral o

The integrator is Brownian motion following properties: 1. 2. 3.
B(t) t
0, with associated ltration F (t) t
0, and the
s t=) every set in F (s) is also in F (t), B (t) is F (t)-measurable, 8t, For t t1 : : : tn , the increments B (t1 ) ; B (t) B (t2 ) ; B (t1 ) : : : B (tn ) ; B (tn 1 ) are independent of F (t).
;
The integrand is 1. 2.
(t) t 0, where
is adapted)
(t) is F (t)-measurable 8t (i.e.,

is square-integrable:
IE
We want to dene the It Integral: o
ZT
0
2(t) dt < 1
8T:
I (t) =
Zt
0
(u) dB (u)
t 0: f (t) is a differentiable function, then
Remark 14.3 (Integral w.r.t. a differentiable function) If we can dene
Zt
0
(u) df (u) =
Zt
0
(u)f (u) du:

0
This wont work when the integrator is Brownian motion, because the paths of Brownian motion are not differentiable.
14.6 It integral of an elementary integrand o

Let
= ft0 t1 : : : tn g be a partition of 0 T ], i.e., 0 = t0 t1 : : : tn = T:

an
Assume that (t) is constant on each subinterval t k tk+1 ] (see Fig. 14.2). We call such a elementary process. The functions B (t) and
(tk ) can be interpreted as follows:
Think of B (t) as the price per unit share of an asset at time t.
159
( t ) = ( t 1 ) ( t ) = ( t ) 0
( t )= ( t 3 )
0=t0
t1
t2
t3
t4 = T
( t ) = ( t 2 )
Figure 14.2: An elementary function . Think of t0 t1
Think of (tk ) as the number of shares of the asset acquired at trading date t k and held until trading date tk+1 . Then the It integral I (t) can be interpreted as the gain from trading at time t; this gain is given by: o
: : : tn as the trading dates for the asset.
8 > (t0) B(t) ; | (t0) ] B{z } 0 t t1 > < =B (0)=0 I (t) = > (t0 ) B (t1) ; B (t0 )] + (t1 ) B (t) ; B (t1 )] t t t2 > : (t0) B(t1) ; B(t0)] + (t1) B(t2) ; B(t1 )] + (t2) B(t) ; B(t2)] t1 t t3: 2
t tk+1 , I (t) =
k 1 X
;
In general, if tk
j =0
(tj ) B (tj +1 ) ; B (tj )] + (tk ) B (t) ; B (tk )]:
14.7 Properties of the Ito integral of an elementary process

Adaptedness For each t Linearity If
I (t) is F (t)-measurable. I (t) =
Zt
0
(u) dB (u)
J (t) =
( (u)
Zt
0
(u) dB (u)
then
I (t) J (t) =
Zt
0
(u)) dB (u)
160
s t l t l+1 t
.....
k+1
Figure 14.3: Showing s and t in different partitions. and
cI (t) =
Martingale
Zt
0
c (u)dB (u):
I (t) is a martingale.
We prove the martingale property for the elementary process case. Theorem 7.45 (Martingale Property)
I (t) =
is a martingale.
k 1 X
;
j =0
(tj ) B (tj +1 ) ; B (tj )] + (tk ) B (t) ; B (tk )]
tk t tk+1
s t be given. We treat the more difcult case that s and t are in different Proof: Let 0 subintervals, i.e., there are partition points t ` and tk such that s 2 t` t`+1 ] and t 2 tk tk+1 ] (See Fig. 14.3).
Write
I (t) =
` 1 X
;
j =0
(tj ) B (tj +1 ) ; B (tj )] + (t` ) B (t`+1 ) ; B (t` )]

k 1 X
;
j =`+1
(tj ) B (tj +1 ) ; B (tj )] + (tk ) B (t) ; B (tk )]
We compute conditional expectations:
2` 1 3 `1 X X IE 4 (tj )(B (tj+1 ) ; B(tj )) F (s)5 = (tj )(B (tj +1 ) ; B (tj )):
; ;
j =0
j =0
IE (t`)(B (t`+1 ) ; B(t` )) F (s) = (t` ) (IE B(t`+1)jF (s)] ; B(t` )) = (t` ) B (s) ; B (t` )]

These rst two terms add up to I (s). We show that the third and fourth terms are zero.
; ;
161
2k 1 3 k1 X X IE 4 (tj )(B (tj +1 ) ; B (tj )) F (s)5 = IE IE (tj )(B (tj +1 ) ; B (tj )) F (tj ) F (s) j =`+1 j =`+1 3 2 k 1 X 6 = IE 4 (tj ) (IE B (tj +1)jF (tj )] ; B(tj )) F (s)7 5 {z } | j =`+1 =0 2 3 IE (tk )(B (t) ; B(tk )) F (s) = IE 6 (tk ) |IE B(t)jF ({z )] ; B(tk )) F (s)7 tk 4 ( 5 }
;
=0
Theorem 7.46 (It Isometry) o
IEI 2(t) = IE
Proof: To simplify notation, assume t = t k , so
Zt
0
2 (u) du:
I (t) =
k X
j =0
(tj ) B (tj +1 ){z B (tj )] ; } |

Dj
Each Dj has expectation 0, and different Dj are independent.
0k 12 X I 2(t) = @ (tj )Dj A
j =0 k X 2 X 2 = (tj )Dj + 2 j =0 i<j k X j =0 k X j =0 k X j =0
(ti ) (tj )DiDj :
Since the cross terms have expectation zero,
IEI 2(t) =
= =
IE 2(tj )Dj2] IE
2 (t
j )IE
(B (tj +1 ) ; B (tj ))2 F (tj )
IE 2(tj )(tj+1 ; tj )
2 (u) du
= IE
k X tZj+1
j =0 tj Zt 2 (u) du = IE 0
162
path of 4
path of
0=t0
t1
t2
t3
t4 = T
Figure 14.4: Approximating a general process by an elementary process 4 , over
0 T ].
14.8 It integral of a general integrand o

Fix T
> 0. Let IE
be a process (not necessarily an elementary process) such that
(t) is F (t)-measurable, 8t 2 0 T ],
R T 2(t) dt < 1: 0
Theorem 8.47 There is a sequence of elementary processes f n g1 such that n=1
nlim IE !
1
ZT
0
j n(t) ; (t)j2 dt = 0:
Proof: Fig. 14.4 shows the main idea.
In the last section we have dened
In (T ) =
for every n. We now dene
ZT
0
n (t) dB (t)
ZT
0
(t) dB (t) = nlim !
ZT
0
n (t) dB (t):
163
The only difculty with this approach is that we need to make sure the above limit exists. Suppose n and m are large positive integers. Then
var(In (T ) ; Im (T )) = IE
(It Isometry:) = IE o
ZT
0 T
Z
0 0
n (t) ; m (t)] dB (t)

2 n (t) ; m (t)] dt
!2
((a + b)2
j n(t) ; (t)j + j (t) ; m(t)j ]2 dt ZT ZT 2a2 + 2b2 :) 2IE j n (t) ; (t)j2 dt + 2IE j m (t) ; (t)j2 dt
= IE
0 0
1
ZT
which is small. This guarantees that the sequence fI n (T )gn=1 has a limit.
14.9 Properties of the (general) It o integral
I (t) =
Zt
0
(u) dB (u):
Here is any adapted, square-integrable process. Adaptedness. For each t, I (t) is F (t)-measurable. Linearity. If
I (t) =
then
Zt
0
(u) dB (u)
J (t) =
( (u)
Zt
0
(u) dB (u)
I (t) J (t) =
Zt
0
(u)) dB (u)
and
cI (t) =
Zt
0
c (u)dB (u):
Martingale.
I (t) is a martingale. Continuity. I (t) is a continuous function of the upper limit of integration t. Rt It Isometry. IEI 2(t) = IE 0 2 (u) du. o
Example 14.1 () Consider the It integral o
ZT
0
B(u) dB(u):
We approximate the integrand as shown in Fig. 14.5
164
T/4
2T/4
3T/4
Figure 14.5: Approximating the integrand B (u) with 4 , over
0 T ].
8 >B(0) = 0 > <B(T=n) n (u) = >: : : > (n;1)T :

B
T
if if if
0 u < T=n T=n u < 2T=n

(n 1)T
n
u < T:
By denition,
ZT
0
B(u) dB(u) = nlim !1
n 1
; X
B kT n k =0
B (k + 1)T ; B kT n n
To simplify notation, we denote
so
4 Bk = B kT n ZT n;1 X B(u) dB(u) = nlim !1 k=0 Bk (Bk+1 ; Bk ): 0

n 1 k =0
We compute
1 2
n 1 k =0
; X
(Bk+1 ; Bk )2 = 1 2
; X
2 Bk+1 ;
n 1 j =0
n 1 k =0
; X
Bk Bk+1 + 1 2
n 1 k =0
n 1 k =0
; X
2 Bk
n 1 k =0
1 2 = 2 Bn + 1 2 1 2 = 2 Bn +
; X
Bj2 ;
; X
Bk Bk+1 + 1 2
; X
2 Bk
n 1 k =0 n 1
; X ; X
2 Bk ;
n 1 k =0
; X
Bk Bk+1
= Bn ;
1 2 2
k =0
Bk (Bk+1 ; Bk ):

Therefore,
165
n 1 k =0
; X
2 Bk (Bk+1 ; Bk ) = 1 Bn ; 1 2 2
n 1 k =0
; X
(Bk+1 ; Bk )2
n 1 k =0
or equivalently
n 1
; X
B kT n k =0
B (k + 1)T ; B kT n n
1 = 1 B 2 (T) ; 2 2
; X
B (k + 1)T n
k T
Let n!1 and use the denition of quadratic variation to get
ZT
0
1 B(u) dB(u) = 2 B 2 (T) ; 1 T: 2
Remark 14.4 (Reason for the 1 T term) If f is differentiable with f (0) = 0, then 2
ZT
0
f (u) df (u) =
ZT
0
f (u)f (u) du
0
= 1 f 2 (u) 2
In contrast, for Brownian motion, we have
T
0
= 1 f 2 (T ): 2
ZT
0
B (u)dB(u) = 1 B 2(T ) ; 1 T: 2 2
The extra term 1 T comes from the nonzero quadratic variation of Brownian motion. It has to be 2 there, because Z
IE
B (u) dB (u) = 0
(It integral is a martingale) o
but
IE 1 B 2 (T ) = 1 T: 2 2
14.10 Quadratic variation of an It integral o

Theorem 10.48 (Quadratic variation of It integral) Let o
I (t) =
Then
Zt
0
(u) dB (u):
0 2(u) du:
hI i(t) =
Zt
166 This holds even if is not an elementary process. The quadratic variation formula says that at each time u, the instantaneous absolute volatility of I is 2 (u). This is the absolute volatility of the Brownian motion scaled by the size of the position (i.e. (t)) in the Brownian motion. Informally, we can write the quadratic variation formula in differential form as follows:
dI (t) dI (t) = 2(t) dt:

Compare this with
dB(t) dB(t) = dt:

(t) =
n 1 X
;
Proof: (For an elementary process ). Let = ft0 t1 : : : tn g be the partition for , i.e., (tk ) for tk t tk+1 . To simplify notation, assume t = tn . We have
hI i(t) =
Let us compute hI i(tk+1) ; hI i(tk ). Let
k=0
hI i(tk+1) ; hI i(tk)] :
tk = s0
Then
= fs0 s1 : : : sm g be a partition s1 : : : sm = tk+1 :

sZ+1 j sj
I (sj+1 ) ; I (sj ) =
so
(tk ) dB (u)
= (tk ) B (sj +1 ) ; B (sj )]
hI i(tk+1) ; hI i(tk) =
m 1 X
;
j =0
I (sj+1) ; I (sj )]2

m 1 X
;
= 2 (tk )
jj jj!0 ;;;;;!
It follows that
j =0 2 (t
B(sj+1 ) ; B(sj )]2

k )(tk+1 ; tk ):
hI i(t) =
=
n 1 X
; ;
k=0 n 1 tZ +1 X k k=0 tk
2(t )(t k k+1 ; tk ) 2 (u) du 0 2(u) du:
jj 0 ; jj!! ;;;;;
Zt
Chapter 15
It s Formula o
15.1 It s formula for one Brownian motion o
We want a rule to differentiate expressions of the form f (B (t)), where f (x) is a differentiable function. If B (t) were also differentiable, then the ordinary chain rule would give
d f (B(t)) = f (B(t))B (t) dt

0 0
which could be written in differential notation as
df (B (t)) = f (B(t))B (t) dt = f (B (t))dB (t)

0 0 0
However, B (t) is not differentiable, and in particular has nonzero quadratic variation, so the correct formula has an extra term, namely,
dt df (B(t)) = f (B(t)) dB (t) + 1 f (B(t)) |{z} : 2

0 00
dB (t) dB (t)
This is It s formula in differential form. Integrating this, we obtain It s formula in integral form: o o
f (B(t)) ; | (B (0)) = f {z }
f (0)
Zt
0
1 f (B(u)) dB (u) + 2
0
Zt
0
f (B (u)) du:
00
Remark 15.1 (Differential vs. Integral Forms) The mathematically meaningful form of It s foro mula is It s formula in integral form: o
f (B(t)) ; f (B (0)) =
Zt
0
1 (B (u)) dB (u) + 2 0
167
Zt
f (B (u)) du:
00
168 This is because we have solid denitions for both integrals appearing on the right-hand side. The rst,
Zt
0
f (B(u)) dB (u)
0
is an It integral, dened in the previous chapter. The second, o
Zt
0
f (B (u)) du
00
is a Riemann integral, the type used in freshman calculus. For paper and pencil computations, the more convenient form of It s rule is It s formula in differo o ential form:
df (B (t)) = f (B (t)) dB(t) + 1 f (B (t)) dt: 2 There is an intuitive meaning but no solid denition for the terms df (B (t)) dB (t) and dt appearing
0 00
in this formula. This formula becomes mathematically respectable only after we integrate it.
15.2 Derivation of It s formula o

Consider f (x) = 1 x2 , so that 2 Let xk
f (x) = x f (x) = 1:
0 00 0 00
xk+1 be numbers. Taylors formula implies f (xk+1) ; f (xk ) = (xk+1 ; xk )f (xk ) + 1 (xk+1 ; xk )2f (xk ): 2
In the general case, the above equation is only approximate, and the error is of the order of (x k+1 ; xk )3 . The total error will have limit zero in the last step of the following argument. Fix T
In this case, Taylors formula to second order is exact because f is a quadratic function.
> 0 and let = ft0 t1 : : : tng be a partition of 0 T ]. Using Taylors formula, we write: f (B(T )) ; f (B(0)) 1 = 2 B 2 (T ) ; 1 B 2 (0) 2
= = =
n 1 X
; ;
k=0 n 1 X k=0 n 1 X
;
f (B(tk+1 )) ; f (B (tk ))] B(tk+1 ) ; B (tk )] f (B (tk )) + 1 2

0
n 1 X
;
k=0
B(tk ) B(tk+1 ) ; B(tk )] + 1 2
n 1 X
;
k=0
B (tk+1 ) ; B(tk )]2 f (B(tk ))

00
k=0
B(tk+1 ) ; B(tk )]2 :
CHAPTER 15. It s Formula o

We let jj
169
jj!0 to obtain
f (B (T )) ; f (B(0)) =
=
ZT ZT
0 0
B(u) dB(u) + 1 hB{zT ) 2 | i( } f

0
T ZT (B (u)) dB (u) + 1 f 2 0 |
00
(B (u)) du: {z }
1
This is It s formula in integral form for the special case o
f (x) = 1 x2 : 2
15.3 Geometric Brownian motion
Denition 15.1 (Geometric Brownian Motion) Geometric Brownian motion is
S (t) = S (0) exp B(t) +

where Dene and
> 0 are constant. f (t x) = S (0) exp x +
1 ;2
;1 2
so
S (t) = f (t B (t)):
Then
ft =
According to It s formula, o
;1 2
f fx = f fxx = 2f:
dS (t) = df (t B(t)) 1 = ft dt + fx dB + 2 fxx dBdB | {z }

=( ; dt + f dB + 1 2 f dt 2 = S (t)dt + S (t) dB (t)
1 2 )f 2
Thus, Geometric Brownian motion in differential form is
dt
dS (t) = S (t)dt + S (t) dB (t)

and Geometric Brownian motion in integral form is
S (t) = S (0) +
Zt
0
S (u) du +
Zt
0
S (u) dB(u):
170
15.4 Quadratic variation of geometric Brownian motion

In the integral form of Geometric Brownian motion,
S (t) = S (0) +
the Riemann integral
Zt
0
S (u) du +
Zt
0
S (u) dB(u)
F (t) =
is differentiable with F 0 (t) =
Zt
0
S (u) du S (u) dB (u)

2 S 2(u) du:
S (t). This term has zero quadratic variation. The It integral o G(t) =
Zt
0
is not differentiable. It has quadratic variation
hGi(t) =
Zt
0
Thus the quadratic variation of S is given by the quadratic variation of G. In differential notation, we write
dS (t) dS (t) = ( S (t)dt + S (t)dB(t))2 = 2S 2 (t) dt
15.5 Volatility of Geometric Brownian motion

Fix 0 T1 T2. Let = volatility of S on T 1 T2] is
ft0 : : : tng be a partition of T1 T2]. The squared absolute sample

n 1 X
;
T2 ; T1 k=0 S (tk+1) ; S (tk ' T2 ; T1 T1 2 S 2(T ) ' 1
)]2
ZT2
2 S 2(u) du
As T2 # T1 , the above approximation becomes exact. In other words, the instantaneous relative volatility of S is 2 . This is usually called simply the volatility of S .
15.6 First derivation of the Black-Scholes formula

Wealth of an investor. An investor begins with nonrandom initial wealth X 0 and at each time t, holds (t) shares of stock. Stock is modelled by a geometric Brownian motion:
dS (t) = S (t)dt + S (t)dB(t):

lending at interest rate r.
171 The investor nances his investing by borrowing or
(t) can be random, but must be adapted.
Let X (t) denote the wealth of the investor at time t. Then
dX (t) = (t)dS (t) + r X (t) ; (t)S (t)] dt = (t) S (t)dt + S (t)dB (t)] + r X (t) ; (t)S (t)] dt = rX (t)dt + (t)S (t) ( {z r) dt + (t)S (t) dB (t): | ; }
Risk premium
Value of an option. Consider an European option which pays g (S (T )) at time T . Let v (t x) denote the value of this option at time t if the stock price is S (t) = x. In other words, the value of the option at each time t 2 0 T ] is
v (t S (t)):
The differential of this value is
dv (t S (t)) = vt dt + vxdS + 1 vxxdS dS 2 = vt dt + vx S dt + S dB ] + 1 vxx 2 S 2 dt h i 2 1 2 S 2v = vt + Svx + 2 xx dt + SvxdB A hedging portfolio starts with some initial wealth X 0 and invests so that the wealth X (t) at each time tracks v (t S (t)). We saw above that dX (t) = rX + ( ; r)S ] dt + S dB: To ensure that X (t) = v (t S (t)) for all t, we equate coefcients in their differentials. Equating the dB coefcients, we obtain the -hedging rule: (t) = vx (t S (t)): Equating the dt coefcients, we obtain: vt + Svx + 1 2 S 2vxx = rX + ( ; r)S: 2 But we have set = vx , and we are seeking to cause X to agree with v . Making these substitutions,
1 vt + Svx + 2 2S 2vxx = rv + vx ( ; r)S (where v = v (t S (t)) and S = S (t)) which simplies to vt + rSvx + 1 2 S 2vxx = rv: 2 In conclusion, we should let v be the solution to the Black-Scholes partial differential equation 1 vt (t x) + rxvx (t x) + 2 2 x2vxx(t x) = rv (t x)
satisfying the terminal condition If an investor starts with X 0 = v (0 S (0)) and uses the hedge (t) = vx (t X (t) = v(t S (t)) for all t, and in particular, X (T ) = g (S (T )). we obtain
v (T x) = g (x):
S (t)), then he will have
172
15.7 Mean and variance of the Cox-Ingersoll-Ross process

The Cox-Ingersoll-Ross model for interest rates is
dr(t) = a(b ; cr(t))dt +

where a
r(t) dB(t)
bc
and r(0) are positive constants. In integral form, this equation is
0 0 2(t). This is df (r(t)), where f (x) = x2 . We obtain We apply It s formula to compute dr o dr2(t) = df (r(t)) = f (r(t)) dr(t) + 1 f (r(t)) dr(t) dr(t) 2
0 00
r(t) = r(0) + a (b ; cr(u)) du +
Zt
Z tq
r(u) dB (u):
= 2r(t) a(b ; cr(t)) dt +
r(t) dB (t) + a(b ; cr(t)) dt +

3
r(t) dB (t)
= 2abr(t) dt ; 2acr2(t) dt + 2 r 2 (t) dB (t) + 2 r(t) dt 3 = (2ab + 2 )r(t) dt ; 2acr2(t) dt + 2 r 2 (t) dB (t)
The mean of r(t). The integral form of the CIR equation is
r(t) = r(0) + a (b ; cr(u)) du +

0
Zt
Z tq
0
r(u) dB (u):
Taking expectations and remembering that the expectation of an It integral is zero, we obtain o
IEr(t) = r(0) + a (b ; cIEr(u)) du:

0
Differentiation yields
Zt
d dt IEr(t) = a(b ; cIEr(t)) = ab ; acIEr(t)
which implies that
d heact IEr(t)i = eact acIEr(t) + d IEr(t) = eact ab: dt dt eactIEr(t) ; r(0) = ab
Integration yields We solve for IEr(t):
Zt
0
;
eacu du = b (eact ; 1): c r(0) ; b : c c
b If r(0) = c , then IEr(t) = b for every t. If r(0) 6= b , then r(t) exhibits mean reversion: c c
IEr(t) = b + e c
act
lim IEr(t) = b : t!
1

Variance of r(t). The integral form of the equation derived earlier for dr2(t) is
173
r2 (t) = r2(0) + (2ab + 2 )

Taking expectations, we obtain
Zt
0
r(u) du ; 2ac
Zt
0
r2(u) du + 2
Zt
0
2 r 3 (u) dB(u):
IEr2(t) = r2(0) + (2ab + 2)

Differentiation yields
Zt
0
IEr(u) du ; 2ac
Zt
0
IEr2(u) du:
d IEr2(t) = (2ab + 2)IEr(t) ; 2acIEr2(t) dt
which implies that
d e2actIEr2(t) = e2act 2acIEr2(t) + d IEr2(t) dt dt 2act (2ab + 2)IEr(t): =e Using the formula already derived for IEr(t) and integrating the last equation, after considerable
algebra we obtain
2 2b b 2 + b2 + r(0) ; b act 2 c2 2ac c ac + c e 2 2 2 + r(0) ; b e 2act + ac 2bc ; r(0) e 2act : c ac var r(t) = IEr2(t) ; (IEr(t))2 2 2 b 2 = 2ac2 + r(0) ; b ac e act + ac 2bc ; r(0) e 2act : c
IEr2(t) =
15.8 Multidimensional Brownian Motion

Denition 15.2 (d-dimensional Brownian Motion) A d-dimensional Brownian Motion is a process
B (t) = (B1 (t) : : : Bd (t))
with the following properties: Each Bk (t) is a one-dimensional Brownian motion; If i 6= j , then the processes Bi (t) and Bj (t) are independent. Associated with a d-dimensional Brownian motion, we have a ltration fF (t)g such that For each t, the random vector B (t) is F (t)-measurable; For each t
t1 : : : tn , the vector increments B(t1 ) ; B(t) : : : B(tn) ; B(tn 1 ) are independent of F (t).
;
174
15.9 Cross-variations of Brownian motions

Because each component Bi is a one-dimensional Brownian motion, we have the informal equation
dBi(t) dBi (t) = dt:

However, we have: Theorem 9.49 If i 6= j ,
dBi(t) dBj (t) = 0

0 T ]. For i 6= j , dene the sample cross variation
Proof: Let = ft0 : : : tn g be a partition of of Bi and Bj on 0 T ] to be
C =
n 1 X
;
k=0
Bi (tk+1 ) ; Bi (tk )] Bj (tk+1) ; Bj (tk )] :
The increments appearing on the right-hand side of the above equation are all independent of one another and all have mean zero. Therefore,
IEC = 0:
We compute var(C ). First note that
C2 =
+2
n 1 X
;
k=0 n 1 X
;
Bi(tk+1 ) ; Bi (tk ) Bj (tk+1) ; Bj (tk )
`<k
Bi (t`+1) ; Bi (t` )] Bj (t`+1 ) ; Bj (t` )] : Bi (tk+1 ) ; Bi (tk )] Bj (tk+1 ) ; Bj (tk )]
All the increments appearing in the sum of cross terms are independent of one another and have mean zero. Therefore,
var(C ) = IEC 2 = IE
n 1 X
;
k=0
Bi (tk+1) ; Bi(tk )]2 Bj (tk+1 ) ; Bj (tk )]2 :
But Bi (tk+1 ) ; Bi (tk )]2 and Bj (tk+1 ) ; Bj (tk )]2 are independent of one another, and each has expectation (tk+1 ; tk ). It follows that
var(C ) =
As jj
n 1 X
;
k=0
(tk+1 ; tk )2 jj jj
n 1 X
;
k=0
(tk+1 ; tk ) = jj jj:T: = 0.
jj!0, we have var(C )!0, so C
converges to the constant IEC
175
15.10 Multi-dimensional It formula o

To keep the notation as simple as possible, we write the It formula for two processes driven by a o two-dimensional Brownian motion. The formula generalizes to any number of processes driven by a Brownian motion of any number (not necessarily the same number) of dimensions. Let X and Y be processes of the form
X (t) = X (0) + Y (t) = Y (0) +
Zt
0
Z0t
(u) du + (u) du +
Zt Z0t
0
11(u) dB1 (u) + 21 (u) dB1 (u) +
Zt Z0t
0
12 (u) dB2(u) 22 (u) dB2 (u):
Such processes, consisting of a nonrandom initial condition, plus a Riemann integral, plus one or more It integrals, are called semimartingales. The integrands (u) (u) and ij (u) can be any o adapted processes. The adaptedness of the integrands guarantees that X and Y are also adapted. In differential notation, we write
dX = dt + 11 dB1 + 12 dB2 dY = dt + 21 dB1 + 22 dB2 : Given these two semimartingales X and Y , the quadratic and cross variations are: dX dX = ( dt + 11 dB1 + 12 dB2)2 2 2 = 11 dB1{z 1 +2 11 12 dB1{z 2 + 12 dB2{z 2 | dB } | dB} | dB}
=( dY dY = ( =( dX dY = ( =(
0 dt 2 + 2 )2 dt 11 12 dt + 21 dB1 + 22 dB2 )2 2 2 2 21 + 22 ) dt dt + 11 dB1 + 12 dB2)( 11 21 + 12 22 ) dt
dt
dt +
21 dB1 + 22 dB2 )
Let f (t x y ) be a function of three variables, and let X (t) and Y (t) be semimartingales. Then we have the corresponding It formula: o
df (t x y) = ft dt + fx dX + fy dY + 1 fxx dX dX + 2fxy dX dY + fyy dY dY ] : 2 In integral form, with X and Y as decribed earlier and with all the variables lled in, this equation
is
f (t X (t) Y (t)) ; f (0 X (0) Y (0)) Zt 2 2 = ft + fx + fy + 1 ( 11 + 12)fxx + ( 2

+
where f
0Z
= f (u X (u) Y (u), for i j 2 f1 2g, ij = ij (u), and Bi = Bi (u).
11 fx + 21 fy ]
dB1 +
Zt
0
2 1 2 11 21 + 12 22)fxy + 2 ( 21 + 22)fyy ] du
12 fx + 22 fy ]
dB2
176
Chapter 16
Markov processes and the Kolmogorov equations

16.1 Stochastic Differential Equations
Consider the stochastic differential equation:
dX (t) = a(t X (t)) dt + (t X (t)) dB (t):

Here a(t x) and (t x) are given functions, usually assumed to be continuous in (t chitz continuous in x,i.e., there is a constant L such that
(SDE)
x) and Lips-
ja(t x) ; a(t y)j Ljx ; yj

for all t satisfying
j (t x) ; (t y)j Ljx ; yj
t0
x y. Let (t0 x) be given. A solution to (SDE) with the initial condition (t 0 x) is a process fX (t)gt X (t0) = x
X (t) = X (t0) + a(s X (s)) ds +

t0
Zt
Zt
t0
(s X (s)) dB (s)
t t0
The solution process fX (t)g t t0 will be adapted to the ltration fF (t)g t 0 generated by the Brownian motion. If you know the path of the Brownian motion up to time t, then you can evaluate X (t).
Example 16.1 (Drifted Brownian motion) Let a be a constant and
= 1, so
dX(t) = a dt + dB(t):
If (t0
x) is given and we start with the initial condition X(t0 ) = x

177
178
then
X(t) = x + a(t ; t0) + (B(t) ; B(t0 )) To compute the differential w.r.t. t, treat t0 and B(t0 ) as constants: dX(t) = a dt + dB(t):
t t0:
Example 16.2 (Geometric Brownian motion) Let r and
be constants. Consider
dX(t) = rX(t) dt + X(t) dB(t):

Given the initial condition
X(t0 ) = x
the solution is
X(t) = x exp (B(t) ; B(t0 )) + (r ; 1 2 )(t ; t0) : 2 Again, to compute the differential w.r.t. t, treat t0 and B(t0 ) as constants: dX(t) = (r ; 1 2)X(t) dt + X(t) dB(t) + 1 2 X(t) dt 2 2 = rX(t) dt + X(t) dB(t):
16.2 Markov Property

Let 0
t0 < t1 be given and let h(y ) be a function. Denote by IE t0 xh(X (t1))

= x. Now let 2 IR be given, and start with initial
the expectation of h(X (t1)), given that X (t0) condition
X (0) = :
We have the Markov property
IE 0 h(X (t1)) F (t0) = IE t0 X (t0)h(X (t1)):

In other words, if you observe the path of the driving Brownian motion from time 0 to time t 0 , and based on this information, you want to estimate h(X (t 1)), the only relevant information is the value of X (t0). You imagine starting the (SDE ) at time t0 at value X (t0), and compute the expected value of h(X (t1)).
CHAPTER 16. Markov processes and the Kolmogorov equations
179
16.3 Transition density

Denote by
p(t0 t1 x y) the density (in the y variable) of X (t 1), conditioned on X (t 0) = x. In other words, IE t0 xh(X (t1)) =
The Markov property says that for 0
IR
h(y )p(t0 t1 x y ) dy:

,
t0 t1 and for every
IE 0
h(X (t1)) F (t0 ) =
IR
h(y )p(t0 t1 X (t0) y ) dy:
Example 16.3 (Drifted Brownian motion) Consider the SDE
dX(t) = a dt + dB(t):
Conditioned on (t1 ; t0 ), i.e.,
X(t 0 ) = x, the random variable X(t1 ) is normal with mean x + a(t1 ; t0 ) and variance
2 p(t0 t1 x y) = p 1 exp ; (y ; (x + a(t1 ; t0 ))) : 2(t1 ; t0 ) 2 (t1 ; t0) Note that p depends on t0 and t1 only through their difference t 1 ; t0 . This is always the case when a(t x) and (t x) dont depend on t.
Example 16.4 (Geometric Brownian motion) Recall that the solution to the SDE
dX(t) = rX(t) dt + X(t) dB(t)

with initial condition X(t 0 ) = x, is Geometric Brownian motion:
1 X(t1 ) = x exp (B(t1 ) ; B(t0 )) + (r ; 2 2 )(t1 ; t0) :
The random variable B(t1 ) ; B(t0 ) has density

2 IP fB(t1 ) ; B(t0 ) 2 dbg = p 1 exp ; 2(t b; t ) db 2 (t1 ; t0) 1 0
and we are making the change of variable
y = x exp b + (r ; 1 2)(t1 ; t0 ) 2
or equivalently,
h y i b = 1 log x ; (r ; 1 2)(t1 ; t0) : 2

dy = y db
or equivalently,
The derivative is
db = dy : y
180
Therefore,
p(t0 t1 x y) dy = IP fX(t1 ) 2 dyg 1 exp ; 2(t ; t ) = p 1 y 2 (t1 ; t0) 1 0 IE (X(T ) ; K) =

tx
h y i2 1 log x ; (r ; 2 2 )(t1 ; t0 ) dy:
Using the transition density and a fair amount of calculus, one can compute the expected payoff from a European call:
+
(y ; K)+ p(t T x y) dy 0 h x i r (T ;t) 1 =e xN p 1 log K + r(T ; t) + 2 2(T ; t) T ;t h x i log K + r(T ; t) ; 1 2 (T ; t) ; KN p 1 2 T ;t
Z1
where
N( ) = p1 2
Therefore,
Z1 12 1 2 e; 2 x dx = p1 e; 2 x dx: 2 ; ;1
IE 0 e;r(T ;t) (X(T) ; K)+ F (t) = e;r(T ;t) IE t X (t) (X(T) ; K)+ 1 = X(t)N p 1 log X(t) + r(T ; t) + 2 2(T ; t) K T ;t log X(t) + r(T ; t) ; 1 2 (T ; t) ; e;r(T ;t) K N p 1 2 K T ;t
16.4 The Kolmogorov Backward Equation

Consider and let p(t0 t1
x y ) be the transition density. Then the Kolmogorov Backward Equation is: @ @ @2 1 ; @t p(t0 t1 x y) = a(t0 x) @x p(t0 t1 x y) + 2 2(t0 x) @x2 p(t0 t1 x y): 0 (KBE)
dX (t) = a(t X (t)) dt + (t X (t)) dB (t)
In the case that a and are functions of x alone, p(t0 t1 x y ) depends on t0 and t1 only through their difference = t1 ; t0 . We then write p( x y ) rather than p(t0 t1 x y ), and (KBE ) becomes
The variables t0 and x in (KBE ) are called the backward variables.
@ p( x y ) = a(x) @ p( x y) + 1 2(x) @ 2 p( x y ): 2 @ @x @x2
(KBE)

Example 16.5 (Drifted Brownian motion)
181
dX(t) = a dt + dB(t) 2 p( x y) = p 1 exp ; (y ; (x + a )) : 2 2 @ p = p = @ p 1 exp ; (y ; x ; a )2 @ @ 2 2 @ (y ; x ; a )2 p 1 exp ; (y ; x ; a )2 ; @ 2 2 2 1 + a(y ; x ; a ) + (y ; x ; a ) p: = ;2 22 @ p = p = y ; x ; a p: x @x @2 p = p = @ y ; x ; a p + y ; x ; a p xx x @x2 @x 2 = ; 1 p + (y ; x ; a ) p:

2
Therefore,
2 1 apx + 2 pxx = a(y ; x ; a ) ; 21 + (y ; x ; a ) p 2 2 =p :
This is the Kolmogorov backward equation. Example 16.6 (Geometric Brownian motion)
dX(t) = rX(t) dt + X(t) dB(t): h y 1 1 p( x y) = p exp ; 2 1 2 log x ; (r ; 2 2 ) y 2 It is true but very tedious to verify that p satises the KBE 1 p = rxpx + 2 2x2 pxx:
i2
16.5 Connection between stochastic calculus and KBE

Consider
dX (t) = a(X (t)) dt + (X (t)) dB(t):

Let h(y ) be a function, and dene
(5.1)
v (t x) = IE t xh(X (T ))
182 where 0
t T . Then
v(t x) = h(y) p(T ; t x y ) dy vt(t x) = ; h(y ) p (T ; t x y ) dy vx(t x) = h(y) px(T ; t x y ) dy vxx(t x) = h(y) pxx(T ; t x y ) dy:
Therefore, the Kolmogorov backward equation implies
Z Z Z
vt(t x) +Za(x)vx(t x) + 1 2(x)vxx(t x) = 2 h i h(y ) ;p (T ; t x y) + a(x)px(T ; t x y ) + 1 2(x)pxx(T ; t x y ) dy = 0 2

Let (0 IE 0 .
) be an initial condition for the SDE (5.1). We simplify notation by writing IE rather than
Theorem 5.50 Starting at X (0) = , the process v (t
X (t)) satises the martingale property:

0 s t T:
IE v (t X (t)) F (s) = v (s X (s))

Proof: According to the Markov property,
IE h(X (T )) F (t) = IE t X (t)h(X (T )) = v (t X (t))

so
IE v (t X (t))jF(s)] = IE IE h(X (T )) F (t) F (s)

= IE h(X (T )) F (s) = IE s X (s)h(X (T )) = v (s X (s)):
(Markov property)
It s formula implies o
dv(t X (t)) = vt dt + vxdX + 1 vxx dX dX 2 1 = vt dt + avx dt + vx dB + 2 2 vxx dt:

In integral form, we have
183
v (t X (t)) = v (0 X (0)) Z th i + vt (u X (u)) + a(X (u))vx(u X (u)) + 1 2 (X (u))vxx(u X (u)) du 2

+
Z0t
0
(X (u))vx(u X (u)) dB (u):
Rt We know that v (t X (t)) is a martingale, so the integral 0 for all t. This implies that the integrand is zero; hence
1 vt + avx + 2 2 vxx du must be zero
vt + avx + 1 2vxx = 0: 2
Thus by two different arguments, one based on the Kolmogorov backward equation, and the other based on It s formula, we have come to the same conclusion. o Theorem 5.51 (Feynman-Kac) Dene
v(t x) = IE t xh(X (T ))
where Then
0 t T
dX (t) = a(X (t)) dt + (X (t)) dB(t): vt(t x) + a(x)vx(t x) + 1 2 (x)vxx(t x) = 0 2

(FK)
and
v (T x) = h(x):
The Black-Scholes equation is a special case of this theorem, as we show in the next section. Remark 16.1 (Derivation of KBE) We plunked down the Kolmogorov backward equation without any justication. In fact, one can use It s formula to prove the Feynman-Kac Theorem, and use o the Feynman-Kac Theorem to derive the Kolmogorov backward equation.
16.6 Black-Scholes
Consider the SDE With initial condition the solution is
dS (t) = rS (t) dt + S (t) dB (t): S (t) = x
S (u) = x exp (B (u) ; B(t)) + (r ; 1 2)(u ; t) 2
u t:
184 Dene
v (t x) = IE t xh(S (T )) n o = IEh x exp (B (T ) ; B (t)) + (r ; 1 2)(T ; t) 2

Recall the Independence Lemma: If G is a -eld, X is G -measurable, and Y is independent of G , then where h is a function to be specied later.
IE h(X Y ) G = (X )
where
(x) = IEh(x Y ):
With geometric Brownian motion, for 0
S (t) = S (0) exp B(t) + (r ; 1 2)t 2 n o S (T ) = S (0) exp B(T ) + (r ; 1 2)T 2 n o 1 = S (t) exp (B (T ) ; B (t)) + (r ; 2 2 )(T ; t) |{z}
F
t T , we have
(t)-measurable
independent of F (t)
{z
We thus have
S (T ) = XY
where
X = S (t) o n Y = exp (B(T ) ; B (t)) + (r ; 1 2 )(T ; t) : 2

Now
IEh(xY ) = v(t x):

The independence lemma implies
IE h(S (T )) F (t) = IE h(XY )jF (t)] = v (t X ) = v (t S (t)):

We have shown that
185
v(t S (t)) = IE h(S (T )) F (t)
0 t T:
Note that the random variable h(S (T )) whose conditional expectation is being computed does not depend on t. Because of this, the tower property implies that v (t S (t)) 0 t T , is a martingale: For 0 s t T ,
IE v(t S (t)) F (s) = IE IE h(S (T )) F (t) F (s)

= IE h(S (T )) F (s) = v (s S (s)):
This is a special case of Theorem 5.51. Because formula,
v(t S (t)) is a martingale, the sum of the dt terms in dv (t S (t)) must be 0. dv (t S (t)) = vt(t S (t)) dt + rS (t)vx(t S (t)) + 1 2 S 2(t)vxx(t S (t)) dt 2 + S (t)vx(t S (t)) dB (t):
By It s o
This leads us to the equation
vt(t x) + rxvx(t x) + 1 2x2vxx(t x) = 0 2

This is a special case of Theorem 5.51 (Feynman-Kac).
0 t < T x 0:
Along with the above partial differential equation, we have the terminal condition
v(T x) = h(x) x 0: Furthermore, if S (t) = 0 for some t 2 0 T ], then also S (T ) = 0.

condition
This gives us the boundary
Finally, we shall eventually see that the value at time t of a contingent claim paying h(S (T )) is
v(t 0) = h(0)
; ;
0 t T:
u(t x) = e =e
at time t if S (t) = x. Therefore,
r(T t) IE t xh(S (T )) r(T t) v (t x)

;
v(t v t (t vx(t vxx(t
x) = er(T t)u(t x) x) = ;rer(T t)u(t x) + er(T t)ut (t x) x) = er(T t)ux(t x) x) = er(T t)uxx(t x):
; ; ; ; ;
186 Plugging these formulas into the partial differential equation for v and cancelling the e r(T ;t) appearing in every term, we obtain the Black-Scholes partial differential equation:
;ru(t x) + ut(t x) + rxux(t x) + 1 2x2uxx(t x) = 0 2

In terms of the transition density
0 t < T x 0:
(BS)
Compare this with the earlier derivation of the Black-Scholes PDE in Section 15.6.
1 p(t T x y ) = p 1 exp ; 2(T ; t) y 2 (T ; t)
y 1 log x ; (r ; 2 2)(T ; t)
2)
for geometric Brownian motion (See Example 16.4), we have the stochastic representation
u(t x) = e =e
In the case of a call, and
r(T t) IE t xh(S (T ))
; ;
r(T t)
(SR)
h(y )p(t T x y) dy:
h(y ) = (y ; K )+
u(t x) = x N
p1
satises the Black-Scholes PDE (BS) derived above.
T ;t Even if h(y ) is some other function (e.g., h(y ) = (K ; y )+ , a put), u(t x) is still given by and
16.7 Black-Scholes with price-dependent volatility
x log K + r(T ; t) + 1 2 (T ; t) 2 T ;t x log K + r(T ; t) ; 1 2 (T ; t) ; e r(T t)K N p 1 2

; ;
dS (t) = rS (t) dt + (S (t)) dB(t) v(t x) = e r(T t) IE t x(S (T ) ; K )+ :

; ;
The Feynman-Kac Theorem now implies that
;rv(t x) + vt(t x) + rxvx(t x) + 1 2(x)vxx(t x) = 0 2

v also satises the terminal condition v(T x) = (x ; K )+ x 0
0 t < T x > 0:

and the boundary condition
187
v(t 0) = 0
0 t T:
An example of such a process is the following from J.C. Cox, Notes on options pricing I: Constant elasticity of variance diffusions, Working Paper, Stanford University, 1975:
dS (t) = rS (t) dt + S (t) dB (t)

where 0 < 1. The volatility sponding Black-Scholes equation is
1(t) decreases with increasing stock price. The corre-
;rv + vt + rxvx + 1 2x2 vxx = 0 2
0 t<T x>0 v (t 0) = 0 0 t T v (T x) = (x ; K )+ x 0:
188
Chapter 17
Girsanovs theorem and the risk-neutral measure

(Please see Oksendal, 4th ed., pp 145151.) Theorem 0.52 (Girsanov, One-dimensional) Let B (t) 0 t T , be a Brownian motion on a probability space ( F P). Let F (t) 0 t T , be the accompanying ltration, and let (t) 0 t T , be a process adapted to this ltration. For 0 t T , dene
e B (t) =
Zt
0
(u) du + B (t)
Z (t) = exp ;
Zt
0
(u) dB (u) ;
Zt 2 1 2 0 (u) du 8A 2 F :
and dene a new probability measure by
f IP (A) = f e Under I , the process B (t) P
Z
A
Z (T ) dIP
0 t T , is a Brownian motion.
Caveat: This theorem requires a technical condition on the size of . If
IE exp
everything is OK. We make the following remarks:
( ZT
1 2 0
2 (u) du
<1
Z (t) is a matingale. In fact, 1 dZ (t) = ; (t)Z (t) dB (t) + 2 2 (t)Z (t) dB(t) dB(t) ; 1 2(t)Z (t) dt 2 = ; (t)Z (t) dB (t):
189
190
f IP is a probability measure.
Since Z (0) = 1, we have IEZ (t) = 1 for every t
f IP ( ) =
0. In particular
Z (T ) dIP = IEZ (T ) = 1
f IE in terms of IE .
f so I is a probability measure. P
f IEZ = IE Z (T )X ] : To see this, consider rst the case X = 1 A , where A 2 F . We have Z Z f f IEX = IP (A) = Z (T ) dIP = Z (T )1A dIP = IE Z (T )X ] :
A
Now use Williams standard machine. The intuition behind the formula
f f E P Let I denote expectation under I . If X is a random variable, then
f IP and IP .
f IP (A) =
Z
A
Z (T ) dIP
8A 2 F f IP . Thus,
is that we want to have
e Distribution of B (T ). If
f but since IP (! ) = 0 and I (!) = 0, this doesnt really tell us anything useful about P we consider subsets of , rather than individual elements of .
is constant, then
f IP (!) = Z (T !)IP (!) n
Z (T ) = exp ; B (T ) ; 1 2 T 2 e (T ) = T + B(T ): B e Under IP , B (T ) is normal with mean 0 and variance T , so B (T ) is normal with mean T and variance T : ( ) b 1 exp ; (~ ; T )2 d~: e IP (B (T ) 2 d~) = p b b 2T 2 T e f e P Removal of Drift from B (T ). The change of measure from IP to I removes the drift from B (T ).
To see this, we compute
Z = p1 2 T 1 Z =p 2 T Z (y = T + b) = p 1 2 T = 0:
fe IEB (T ) = IE Z (T )( T + B(T ))] h n o i = IE exp ; B (T ) ; 1 2 T ( T + B (T )) 2

1 ;1 1
( T + b) expf; b ;
;1 1
( T + b) exp ; (b + TT ) 2
2 y exp ; y2 dy
1 2 2 Tg
( 2) b exp ; 2T db 2)
db
= T + b)
;1
(Substitute y
CHAPTER 17. Girsanovs theorem and the risk-neutral measure
191
fe E We can also see that I B (T ) = 0 by arguing directly from the density formula
2 ~ IP B (t) 2 db = p 1 exp ; (b ; TT ) d~: b 2 2 T
Because
ne
o ~
Z (T ) = expf; B(T ) ; 1 2T g 2 e (T ) ; T ) ; 1 2 T g = expf; (B 2 e (T ) + 1 2 T g = expf; B 2

we have
f e e IP B(T ) 2 d~ = IP B (T ) 2 d~ exp ; ~ + 1 2 T b b b 2 ( ~ ) 1 exp ; (b ; T )2 ; ~ + 1 2 T d~: b 2 =p b 2T 2 T ( ) b 1 exp ; ~2 d~: =p 2T b 2 T f e Under I , B (T ) is normal with mean zero and variance P mean T and variance T .
T.
Under IP ,
e B (T ) is normal with
Means change, variances dont. When we use the Girsanov Theorem to change the probability measure, means change but variances do not. Martingales may be destroyed or created. Volatilities, quadratic variations and cross variations are unaffected. Check:
e e dB dB = ( (t) dt + dB(t))2 = dB:dB = dt: f 17.1 Conditional expectations under I P

Lemma 1.53 Let 0
t T . If X is F (t)-measurable, then
f IEX = IE X:Z (t)]:
Proof:
f IEX = IE X:Z (T )] = IE IE X:Z (T )jF (t)] ] = IE X IE Z (T )jF (t)] ] = IE X:Z (t)]

because Z (t)
0 t T , is a martingale under IP .
192 Lemma 1.54 (Bayes Rule) If X is F (t)-measurable and 0
1 f IE X jF (s)] = Z (s) IE XZ (t)jF (s)]:
s t T , then
(1.1)
Proof: It is clear that Z 1s) IE XZ (t)jF (s)] is ( property. For A 2 F (s), we have
F (s)-measurable.
We check the partial averaging
1 IE XZ (t)jF (s)] dIP = IE 1 1 IE XZ (t)jF (s)] f f A Z (s) A Z (s) (Lemma 1.53) = IE 1A IE XZ (t)jF (s)]] = IE IE 1A XZ (t)jF (s)]] (Taking in what is known) = IE 1A XZ (t)] f = IE 1A X ] (Lemma 1.53 again) Z f = X dIP :
A
Although we have proved Lemmas 1.53 and 1.54, we have not proved Girsanovs Theorem. We will not prove it completely, but here is the beginning of the proof. Lemma 1.55 Using the notation of Girsanovs Theorem, we have the martingale property
fe e IE B (t)jF (s)] = B (s)
0 s t T:
e Proof: We rst check that B (t)Z (t) is a martingale under IP . Recall e dB (t) = (t) dt + dB (t) dZ (t) = ; (t)Z (t) dB(t):
Therefore,
e e e e d(BZ ) = B dZ + Z dB + dB dZ e = ;B Z dB + Z dt + Z dB ; Z dt e = (;B Z + Z ) dB:

s t T,
Next we use Bayes Rule. For 0
fe e IE B (t)jF (s)] = Z 1s) IE B (t)Z (t)jF (s)] ( e = Z 1s) B (s)Z (s) ( e = B (s):
193
Denition 17.1 (Equivalent measures) Two measures on the same probability space which have the same measure-zero sets are said to be equivalent. The probability measures dened by
IP
and
f IP of the Girsanov Theorem are equivalent.

A 2 F:
Recall that
f IP is
Z f(A) = Z (T ) dIP IP Z
If IP (A) = 0, then A Z (T ) f of I to obtain P
dIP = 0: Because Z (T ) > 0 for every !, we can invert the denition IP (A) = dIP A Z (T )
1
A 2 F:
f P If I (A) = 0, then A Z (1T )
dIP = 0:
17.2 Risk-neutral measure

As usual we are given the Brownian motion: B (t) 0 t T , with ltration F (t) dened on a probability space ( F P). We can then dene the following. Stock price:
T,
dS (t) = (t)S (t) dt + (t)S (t) dB (t):

The processes (t) and (t) are adapted to the ltration. The stock price model is completely general, subject only to the condition that the paths of the process are continuous.
r(t) 0 t T . The process r(t) is adapted. Wealth of an agent, starting with X (0) = x. We can write the wealth process differential in
Interest rate: several ways:
dX (t) =
= r(t)X (t) dt + (t) dS (t) ; rS (t) dt] = r(t)X (t) dt + (t) ( (t) {z r(t)) S (t) dt + (t) (t)S (t) dB (t) | ; }
Capital gains from Stock
dS ) | (t){z (t} + r(t) X (t) ;{z (t)S (t)] dt | }

Interest earnings
2 3 6 (t) ; r(t) 7 6 7 6 = r(t)X (t) dt + (t) (t)S (t) 6 dt + dB (t)7 7 ( 4 | {zt) } 5

Risk premium Market price of risk=
(t)
194 Discounted processes:
d e d e
R t r(u) du
0
R t r(u) du
0
S (t) = e
X (t) = e
R t r(u) du
0
;
R t r(u) du
0
;r(t)S (t) dt + dS (t)] ;r(t)X (t) dt + dX (t)]

S (t) :
0
= (t)d e
Notation:
R t r(u) du
Rt (t) = e 0 r(u) du
d (t) = r(t) (t) dt
1 = e R0t r(u) du (t) d 1t) = ; r(tt) dt: ( ()

;
The discounted formulas are
= 1t) (t)S (t) (t) dt + dB (t)] ( (t) d X((tt)) = (t) d S(t) () = (tt) (t)S (t) (t) dt + dB (t)]:
Changing the measure. Dene
(t) d S(t) = 1t) ;r(t)S (t) dt + dS (t)] ( = 1t) ( (t) ; r(t))S (t) dt + (t)S (t) dB (t)] (
e B (t) =
Then
Zt
0
(u) du + B (t):
(t) e d S(t) = 1t) (t)S (t) dB (t) ( e d X((tt)) = ((tt)) (t)S (t) dB (t):
f (t) Under I , S (t) and X((tt)) are martingales. P

Denition 17.2 (Risk-neutral measure) A risk-neutral measure (sometimes called a martingale measure) is any probability measure, equivalent to the market measure IP , which makes all discounted asset prices martingales.

For the market model considered here,
195
f IP (A) =
where
Z
A
0
Z (T ) dIP
(u) dB (u) ; 1 2
A2F
Z (t) = exp ;
Zt
Zt
is the unique risk-neutral measure. Note that because 0.
(t) =
2 (u) du 0 (t) r(t) we must assume that (t)

;
(t) 6=
Risk-neutral valuation. Consider a contingent claim paying an F (T )-measurable random variable V at time T .
Example 17.1
V = (S(T) ; K)+ European call + V = (K ; S(T)) European put !+ ZT 1 S(u) du ; K V= T Asian call 0 V = 0maxT S(t) Look back t
If there is a hedging portfolio, i.e., a process satises X (T ) = V , then
(t) 0 t T , whose corresponding wealth process
f V X (0) = IE (T ) : f P This is because X((tt)) is a martingale under I , so

(0) f T f V X (0) = X(0) = IE X((T )) = IE (T ) :
196
Chapter 18
Martingale Representation Theorem

18.1 Martingale Representation Theorem
See Oksendal, 4th ed., Theorem 4.11, p.50. Theorem 1.56 Let B (t) 0 t T be a Brownian motion on ( the ltration generated by this Brownian motion. Let X (t) 0 t relative to this ltration. Then there is an adapted process (t) 0
F P). Let F (t) 0 t T , be

T , be a martingale (under IP ) t T , such that
X (t) = X (0) +
Zt
0
(u) dB (u)
0 t T:
In particular, the paths of X are continuous. Remark 18.1 We already know that if X (t) is a process satisfying
dX (t) = (t) dB (t) then X (t) is a martingale. Now we see that if X (t) is a martingale adapted to the ltration generated by the Brownian motion B (t), i.e, the Brownian motion is the only source of randomness in X , then dX (t) = (t) dB(t) for some (t).
18.2 A hedging application
Homework Problem 4.5. In the context of Girsanovs Theorem, suppse that F (t) the ltration generated by the Brownian motion B (under IP ). Suppose that Y is a Then there is an adapted process (t) 0 t T , such that
0 t T is f IP -martingale.
Y (t) = Y (0) +
Zt
0
e (u) dB (u)
197
0 t T:
198
dS (t) = (t)S (t) dt + (t)S (t) dB (t) Zt (t) = exp r(u) du 0 (t) = (t) ; r(t) Z t (t) e B (t) = (u) du + B(t)
0 Z f(A) = Z (T ) dIP IP
Z (t) = exp ;
A
Zt
(u) dB (u) ; 1 2
Zt
0
2 (u) du
8A 2 F :
Then
Let
(t) (t) e d S(t) = S(t) (t) dB (t): (t) 0 t T be a portfolio process. The corresponding wealth process X (t) satises (t) e d X((tt)) = (t) (t) S(t) dB(t)
i.e.,
X (t) = X (0) + Z t (u) (u) S (u) dB(u) e (t) (u) 0
0 t T:
Let V be an F (T )-measurable random variable, representing the payoff of a contingent claim at time T . We want to choose X (0) and (t) 0 t T , so that
f P Dene the I -martingale
X (T ) = V:
f V Y (t) = IE (T ) F (t) Zt
0
0 t T: (t) 0 t T , such that 0 t T:
According to Homework Problem 4.5, there is an adapted process
f Set X (0) = Y (0) = I E
Y (t) = Y (0) +
V (T ) and choose
e (u) dB (u)
(u) so that
(u) (u) S (u) = (u): (u)
CHAPTER 18. Martingale Representation Theorem

With this choice of
199
(u) 0 u T , we have X (t) = Y (t) = IE V F (t) f (t) (T )
0 t T:
In particular,
X (T ) = IE V F (T ) = V f (T ) (T ) (T )
so
X (T ) = V:
The Martingale Representation Theorem guarantees the existence of a hedging portfolio, although it does not tell us how to compute it. It also justies the risk-neutral pricing formula
f X (t) = (t)IE = Z(t) IE (t) 1 IE = (t)

where
(T ) F (t) Z (T ) V F (t) (T ) (T )V F (t)
0 t T
(t) (t) = Z(t) = exp ;
Zt
0
(u) dB (u) ;
Zt
0
(r(u) + 1 2 (u)) du 2
18.3 d-dimensional Girsanov Theorem

Theorem 3.57 (d-dimensional Girsanov) dimensional Brownian motion on (
F (t) 0 t T
t T
dene
For 0
(t) = ( 1(t) : : : d (t)) 0 t T , d-dimensional adapted process.
the accompanying ltration, perhaps larger than the one generated by B ;
F P);
B(t) = (B1 (t) : : : Bd(t)) 0
T , a d-
e Bj (t) =
Zt
0
Z f(A) = Z (T ) dIP: IP
A
Z (t) = exp ;
j (u) du + Bj (t) Zt
j = 1 ::: d
1 (u): dB (u) ; 2 0 0
Zt
jj (u)jj2 du
200
f P Then, under I , the process e e e B (t) = (B1(t) : : : Bd (t))

is a d-dimensional Brownian motion.
0 t T
18.4 d-dimensional Martingale Representation Theorem

Theorem 4.58 on ( F
t T , is a martingale (under IP ) relative to F (t) 0 If X (t) 0 d-dimensional adpated process (t) = ( 1(t) : : : d (t)), such that X (t) = X (0) +
F (t) 0 t T
P);
B(t) = (B1 (t) : : : Bd (t)) 0
T a d-dimensional Brownian motion t T , then there is a
the ltration generated by the Brownian motion B .
Zt
0
(u): dB (u)
0 t T:
Corollary 4.59 If we have a d-dimensional adapted process (t) = ( 1 (t) : : : d (t)) then we can e f f P P dene B Z and I as in Girsanovs Theorem. If Y (t) 0 t T , is a martingale under I relative to F (t) 0 t T , then there is a d-dimensional adpated process (t) = ( 1(t) : : : d(t)) such that Z
Y (t) = Y (0) +
e (u): dB (u)
0 t T:
18.5 Multi-dimensional market model
following:
B (t) = (B1(t) : : : Bd(t)) 0 t T , be a d-dimensional Brownian motion on some ( F P), and let F (t) 0 t T , be the ltration generated by B . Then we can dene the
Let
Stocks
dSi (t) = i (t)Si(t) dt + Si (t)

Accumulation factor
d X j =1
ij (t) dBj (t)
i = 1 ::: m
(t) = exp
Here, i (t)
Zt
0
r(u) du :
ij (t) and r(t) are adpated processes.
CHAPTER 18. Martingale Representation Theorem

Discounted stock prices
201
d X d Si((tt)) = ( i(t){z r(t)) Si((tt)) dt + Si((tt)) ; } ij (t) dBj (t) | j =1 d ? Si (t) X (t) (t) + dB (t)] = (t) ij | j {z j } j =1 e dBj (t)
Risk Premium
(5.1)
For 5.1 to be satised, we need to choose 1 (t)
:::
d (t), so that
(MPR)
d X j =1
ij (t) j (t) = i (t) ; r(t)
i = 1 : : : m: :::
Market price of risk. The market price of risk is an adapted process (t) = ( 1 (t) satisfying the system of equations (MPR) above. There are three cases to consider:
d (t))
Case I: (Unique Solution). For Lebesgue-almost every t and IP -almost every ! , (MPR) has a unique solution (t). Using (t) in the d-dimensional Girsanov Theorem, we dene a unique f f P P risk-neutral probability measure I . Under I , every discounted stock price is a martingale. f Consequently, the discounted wealth process corresponding to any portfolio process is a I P martingale, and this implies that the market admits no arbitrage. Finally, the Martingale Representation Theorem can be used to show that every contingent claim can be hedged; the market is said to be complete. Case II: (No solution.) If (MPR) has no solution, then there is no risk-neutral probability measure and the market admits arbitrage. Case III: (Multiple solutions). If (MPR) has multiple solutions, then there are multiple risk-neutral probability measures. The market admits no arbitrage, but there are contingent claims which cannot be hedged; the market is said to be incomplete. Theorem 5.60 (Fundamental Theorem of Asset Pricing) Part I. (Harrison and Pliska, Martingales and Stochastic integrals in the theory of continuous trading, Stochastic Proc. and Applications 11 (1981), pp 215-260.): If a market has a risk-neutral probability measure, then it admits no arbitrage. Part II. (Harrison and Pliska, A stochastic calculus model of continuous trading: complete markets, Stochastic Proc. and Applications 15 (1983), pp 313-316): The risk-neutral measure is unique if and only if every contingent claim can be hedged.
202
Chapter 19
A two-dimensional market model

Let B (t)
In what follows, all processes can depend on t and ! , but are adapted to simplify notation, we omit the arguments whenever there is no ambiguity. Stocks:
F (t) 0 t T
= (B1 (t) B2(t)) 0
be a two-dimensional Brownian motion on ( be the ltration generated by B .
t T
F P). Let
t T.
To
F (t) 0
dS1 = S1 dS2 = S2
We assume 1
dt + 1 dB1] q 2 dt + 2 dB1 + 1 ;
1: Note that
2 2 dB2
>0
2>0
;1
2 2 dS1 dS2 = S1 2 dB1 dB1 = 2S1 dt 1 1 2 2 dS2 dS2 = S2 2 2 dB1 dB1 + S2 (1 ; 2) 2 dB2 dB2 2 2 2 S 2 dt = 2 2 dS1 dS2 = S1 1S2 2 dB1 dB1 = 1 2S1S2 dt:
In other words,
dS1 has instantaneous variance 2 , 1 S1 dS2 has instantaneous variance 2 , 2 S2 dS1 and dS2 have instantaneous covariance S1 S2
Accumulation factor:
1 2.
(t) = exp
Zt
0
r du :
= 2=
1;r
(MPR)
The market price of risk equations are
2 1+
1;
1 1 2 2
2;r
203
204 The solution to these equations is
1 2
provided ;1 <
= =
< 1. Suppose ;1 < < 1. Then (MPR) has a unique solution (
1;r 1 ( 2 ; r) p 2( 1 ; r) ; 1 1 2 1; 2
Z f(A) = Z (T ) dIP IP
A
Z (t) = exp
1 2); we dene Zt Zt Zt 2 2 ; 1 dB1 ; 2 dB2 ; 1 ( 1 + 2 ) du 2 0 0 0
8A 2 F :
f IP is the unique risk-neutral measure. Dene e B1 (t) = e B2(t) =

Then
Zt Z0
0
1 du + B1 (t) 2 du + B2 (t):
dS1 = S1 r dt + dS2 = S2 r dt +
e 1 dB1
i q
2 2 dB2 e
e 2 dB1 + 1 ;
We have changed the mean rates of return of the stock prices, but not the variances and covariances.
19.1 Hedging when ;1 <
<1 1S1 ; 2S2 ) dt
dX = 1 dS1 + 2 dS2 + r(X ; d X = 1 (dX ; rX dt)

=1 =1
1(dS1 ; rS1 dt) +
2 (dS2 ; rS2 dt)
e 1 2S2 1S1 1 dB1 + f V Y (t) = IE (T ) F (t)
q e1 + 1 ; 2 dB
0 t T:
2 2
e dB2 :
f P Let V be F (T )-measurable. Dene the I -martingale
CHAPTER 19. A two-dimensional market model

The Martingale Representation Corollary implies
205
Y (t) = Y (0) +
We have
Zt
0
e 1 dB1 +
Zt
0
e 2 dB2: e dB1
d X = 1 1 S1 1 + 1 2 S2 dY =
We solve the equations
e e 1 dB1 + 2 dB2 :
1 S1 1 +
+1
2S2
1;
2 2 dB2 e
1
for the hedging portfolio ( 1
2S2 1 ;
2 S2
2= 1 2 2= 2 2) and setting
2). With this choice of ( 1
f V X (0) = Y (0) = IE (T )
we have X (t) = Y (t)
0 t T
and in particular,
X (T ) = V:
Every F (T )-measurable random variable can be hedged; the market is complete.
19.2 Hedging when

The case
=1
= ;1 is analogous. Assume that = 1. Then
dS1 = S1 dS2 = S2
The stocks are perfectly correlated. The market price of risk equations are
1 dt + 1 dB1 ] 2 dt + 2 dB1 ]
1 1= 1;r 2 1= 2;r
The process 2 is free. There are two cases:
(MPR)
206 Case I: 1 ;r 6= 2 ;r : There is no solution to (MPR), and consequently, there is no risk-neutral 1 2 measure. This market admits arbitrage. Indeed
d X = 1 1(dS1 ; rS1 dt) + 1 2(dS2 ; rS2 dt)

=1
Suppose
1; 1
1S1
( 1 ; r) dt + 1
dB1] + 1 2S2 ( 2 ; r) dt +
1 : S
2 ; r dt + dB 1 2
dB1]
>
2; 2
r : Set
1= 1 S1 2=; 2 2
Then
d X =1
=1
1; 1
1 ; r dt + dB ; 1 1 1 1 ; r ; 2 ; r dt 1 {z 2 }
Positive
Case II:
2; 2
r : The market price of risk equations

1 1 2 1
= =
1;r 2;r
have the solution
1=
f P 2 is free; there are innitely many risk-neutral measures. Let I be one of them.
Hedging:
1;r 1
2;r 2
d X = 1 1S1 ( 1 ; r) dt +
=1 = 1
1 S1 1 +
dB1] + 1 2S2 ( 2 ; r) dt +
1
2 S2 2 1 dt + dB1 ]
dB1]
1S1 1 1 dt + dB1 ] +
2S2 2
e dB1 :
e Notice that B2 does not appear.
Let V be an F (T )-measurable random variable. If V depends on B2 , then it can probably not be hedged. For example, if
and 1 or 2 depend on B2 , then there is trouble.
V = h(S1 (T ) S2(T ))
CHAPTER 19. A two-dimensional market model
207
f P More precisely, we dene the I -martingale f V Y (t) = IE (T ) F (t)

We can write
0 t T:
Y (t) = Y (0) +
so
Zt
0
e 1 dB1 +
Zt
0
e 2 dB2
dY =
e e 1 dB1 + 2 dB2 :
To get d X to match dY , we must have
2 = 0:
208
Chapter 20
Pricing Exotic Options

20.1 Reection principle for Brownian motion
Without drift. Dene
M (T ) = 0maxT B (t): t
Then we have:
IP fM (T ) > m B(T ) < bg
= IP fB (T ) > 2m ; bg ( 2) 1 Z x =p exp ; 2T dx 2 T 2m b
1 ;
m>0 b<m
So the joint density is
Z 2 @2 IP fM (T ) 2 dm B(T ) 2 dbg = ; @m @b p 1 exp ; x dx dm db 2T 2 T 2m b ( )! @ p 1 exp ; (2m ; b)2 dm db = ; @m 2T 2 T ( ) 2(2p ; b) exp ; (2m ; b)2 dm db m > 0 b < m: m = 2T T 2 T
1 ;
) !
With drift. Let
e B(t) = t + B(t)
209
210
2m-b
shadow path
m b Brownian motion
Figure 20.1: Reection Principle for Brownian motion without drift
m m=b (B(T), M(T)) lies in here
Figure 20.2: Possible values of B (T )
M (T ).
CHAPTER 20. Pricing Exotic Options

where B (t)
211
0 t T , is a Brownian motion (without drift) on ( F P). Dene
Z (T ) = expf; B (T ) ; 1 2 T g 2 = expf; (B (T ) + T ) + 1 2 T g 2 e (t) + 1 2T g = expf; B 2 Z f(A) = Z (T ) dIP 8A 2 F : IP

A
f SetM (T ) = max0 t T f Under I P
e B (T ):
) ( m b ~ ~ b ffM (T ) 2 dm B (T ) 2 d~g = 2(2p ; ~) exp ; (2m ; ~)2 dm d~ m > 0 ~ < m: f e ~ b ~ b ~ IP ~ b 2T

T 2 T
~) be a function of two variables. Then b
e B is a Brownian motion (without drift), so
Let h(m ~
T f e f IEh(M (T ) B (T )) = IE h(M (Z ()TB (T )) )
f f e e = IE h(M (T ) B (T )) expf B (T ) ; 1 2 T g 2
=
m= ~Z
1
~=m bZ ~
;1
m=0 ~= ~ b
f f h(m ~) expf ~ ; 2 2 T g IP fM (T ) 2 dm B (T ) 2 d~g: ~ b b 1 ~ e b
But also,
f e IEh(M (T ) B (T )) =
m= ~Z
~Z m b= ~
;1
m=0 ~= ~ b
f h(m ~) IP fM (T ) 2 dm B (T ) 2 d~g: ~ b ~ e b
Since h is arbitrary, we conclude that (MPR)
f IP fM (T ) 2 dm B (T ) 2 d~g ~ e b 1 2 T g I fM (T ) 2 dm B (T ) 2 d~g f f e = expf ~ ; 2 b b )~ (P m ~ ~ ~ ~2 b 2 = 2(2p ; b) exp ; (2m ; b) : expf ~ ; 1 2 T gdm d~ m > 0 ~ < m: ~ b ~ b ~ 2T T 2 T
212
20.2 Up and out European call.

Let 0 < K where
< L be given. The payoff at time T is (S (T ) ; K )+ 1 S (T )<L

f
To simplify notation, assume that IP is already the risk-neutral measure, so the value at time zero of the option is
S (T ) = 0maxT S (t): t
v (0 S (0)) = e rT IE (S (T ) ; K )+ 1 S (T )<L : Because IP is the risk-neutral measure, dS (t) = rS (t) dt + S (t) dB (t) S (t) = S0 expf B (t) + (r ; 1 2 )tg 2
; f g
8 > > < = S0 exp > > :
e = S0 expf B (t)g
2 6 6B(t) + r ; 6 4 | {z 2
39 > 7> = t7> 7 } 5>
where
= r;2 e B (t) = t + B(t):

Consequently,
f S (t) = S0 expf M (t)g

where,
f e M (t) = 0maxt B (u): u

We compute,
v (0 S (0)) = e
(S (T ) ; K )+ 1 S (T )<L e = e rT IE S (0) expf B (T )g ; K

;
rT IE
i
+
e = e rT IE S (0) expf B (T )g ; K 1
;
"
1 S(0)exp M (T ) e
f f
<L
e B (T )>
1 log K M (T )< 1 log L e S (0)} | {z | {zS (0)}

~ b
m ~
213
~ M(T) y ~ m (B(T), M(T)) lies in here x ~ B(T) x=y
~ b
e Figure 20.3: Possible values of B (T )

We consider only the case
f M (T ).
S (0) K < L so 0 ~ < m: b ~ The other case, K < S (0) L leads to ~ < 0 m and the analysis is similar. b ~ R m R m : : :dy dx: ~ ~ We compute ~ x b
;
; 2 1 p v (0 S (0)) = e (S (0) expf xg ; K ) 2(2y ; x) exp ; (2y2T x) + x ; 2 2 T dy dx ~ x b T 2 T( ) ~ Zm ~ 1 exp ; (2y ; x)2 + x ; 1 2 T y=m dx rT = ;e (S (0) expf xg ; K ) p 2 ~ 2T 2 T b y=x " ( 2 ) Zm ~ = e rT ~ (S (0) expf xg ; K ) p 1 exp ; x + x ; 1 2 T 2 2T b 2 T )# ( (2m ; x)2 + x ; 1 2 T dx ~ ; exp ; 2T 2 ( ) ~ 1 e rT S (0) Z m exp x ; x2 + x ; 1 2 T dx =p 2 ~ 2T 2 T b ) ( ~ 1 e rT K Z m exp ; x2 + x ; 1 2 T dx ; p2 T 2 ~ 2T b ( ) ~ 1 e rT S (0) Z m exp x ; (2m ; x)2 + x ; 1 2 T dx ~ ;p 2 ~ 2T 2 T b ) ( ~ 1 e rT K Z m exp ; (2m ; x)2 + x ; 1 2 T dx: ~ +p 2 ~ 2T 2 T b
rT
; ; ; ; ; ;
Z mZ m ~ ~
The standard method for all these integrals is to complete the square in the exponent and then recognize a cumulative normal distribution. We carry out the details for the rst integral and just
214 give the result for the other three. The exponent in the rst integrand is
= ; 21 (x ; T ; T )2 + 1 2T + T 2 T 2 T = ; 21 x ; rT ; 2 + rT: T
In the rst integral we make the change of variable
x2 x ; 2T + x ; 1 2 T 2
y = (x ; rT= ; T=2)= T dy = dx= T

to obtain
rT ~ e p S (0) Z m exp b 2 T ~
;
x2 x ; 2T + x ; 1 2T dx 2 ( ) ~ 1 S (0) Z m exp ; 1 x ; rT ; T 2 dx =p ~ 2T 2 2 T b
p m p p~T ; r T ; 2 T
= p 1 S (0): 2 T ~ b
~ m ~ = S (0) N p ; r T ; 2 T ; N pb ; r T ; 2 T T T
Putting all four integrals together, we have
"
p p pT ; r T ; 2 T
expf; y2 g dy
p !
p !#
~ m ~ v (0 S (0)) = S (0) N p ; r T ; 2 T ; N pb ; r T ; 2 T T " T p p ! p p !# ~ T ; N pb ; r T + T m ~ rT K N p ; r T + ;e 2 2 T T " p p ! p p !# ~ b m + r T + T ; N (2m ; ~) + r T + T ~ p ; S (0) N pT 2 2 pT p ! m ~ + exp ;rT + 2m r ; 2 ~ N p + r T ; 2T ; p p ! T ~ ~ p N (2m ; b) + r T ; 2 T T

;
"
p !
p !#
where
~ = 1 log K b S (0)
m = 1 log SL : ~ (0)
215
v(t,L) = 0
v(T,x) = (x - K)+
v(t,0) = 0
Figure 20.4: Initial and boundary conditions. If we let L!1 we obtain the classical Black-Scholes formula
v (0 S (0)) = S (0) 1 ; N p ; r T ; 2 T T " p p !# ~ b T rT K 1 ; N p ; r T + ;e 2 T p p ! 1 (0) = S (0)N p log SK + r T + 2 T T p p ! T : 1 rT KN p log S (0) + r T ; ;e K 2 T

; ;
"
~ b
p !#
If we replace T by T ; t and replace S (0) by x in the formula for v (0 S (0)), we obtain a formula for v (t x), the value of the option at the time t if S (t) = x. We have actually derived the formula under the assumption x K L, but a similar albeit longer formula can also be derived for K < x L. We consider the function
v(t x) = IE t x e
r(T t) (S (T ) ; K )+ 1
;
S (T )<L
0 t T 0 x L:
This function satises the terminal condition
v(T x) = (x ; K )+
and the boundary conditions
0 x<L
v(t 0) = 0 0 t T v(t L) = 0 0 t T:
We show that v satises the Black-Scholes equation
;rv + vt + rxvx + 1 2x2vxx 0 t < T 0 x L: 2
216 Let S (0) > 0 be given and dene the stopping time
= minft 0 S (t) = Lg:

Theorem 2.61 The process
e
is a martingale. Proof: First note that
r(t )v (t ^
^
S (t ^ )) 0 t T
S (T ) < L () > T: Let ! 2 be given, and choose t 2 0 T ]. If (!) t, then IE e

But when
;
rT (S (T ) ; K )+ 1
S (T )<L
F (t) (!) = 0:
(!) t, we have
v (t ^ (!) S (t ^ (! ) !)) = v(t ^ (!) L) = 0

so we may write
IE e
rT (S (T ) ; K )+ 1
S (T )<L
F (t) (!) = e
r(t (!)) v (t ^
^
(!) S (t ^ (! ) !)) :
On the other hand, if
(!) > t, then the Markov property implies
IE e
= IE e = e rt v (t S (t !)) = e r(t (!))v (t ^

; ; ; ^
h t S (t !)
rT (S (T ) ; K )+ 1
S (T )<L F (t) (! ) i rT (S (T ) ; K )+ 1 S (T )<L

f g f g
S (t ^ (!) !)) :
;
In both cases, we have
e
Suppose 0
r(t )v (t ^
^
S (t ^ )) = IE e
r(t )v (t ^
^ ;
rT (S (T ) ; K )+ 1
S (T )<L
F (t) :
u t T . Then IE e
;
S (t ^ )) F (u)
f g
= IE IE e rT (S (T ) ; K )+ 1 S (T )<L F (t) F (u) = IE e rT (S (T ) ; K )+ 1 S (T )<L F (u) = e r(u )v (u ^ S (u ^ )) :

; f g ; ^
217
For 0
t T , we compute the differential d e rt v(t S (t)) = e rt(;rv + vt + rSvx + 1 2S 2vxx ) dt + e 2

; ; ;
rt
Svx dB:
Integrate from 0 to t ^ :
r(t )v (t ^
^
S (t ^ )) = v(0 S (0)) Zt + e ru (;rv + vt + rSvx + 1 2 S 2vxx) du 2

^ ;
+
Because e;r(t^ )v (t ^
Zt |0
ru
A stopped martingale is still a martingale
{z
Svx dB:
Zt
0
S (t ^ )) is also a martingale, the Riemann integral

^
ru (;rv + v
2 2 t + rSvx + 1 S vxx ) du 2
is a martingale. Therefore,
1 ;rv(u S (u))+ vt(u S (u)) + rS (u)vx(u S (u))+ 2 2S 2(u)vxx(u S (u)) = 0 0 u t ^ :

The PDE then follows. The Hedge
;rv + vt + rxvx + 1 2x2vxx = 0 0 t T 0 x L 2

d e rtv (t S (t)) = e
; ; ;
S (t)vx(t S (t)) dB (t) 0 t Let X (t) be the wealth process corresponding to some portfolio (t). Then d(e rtX (t)) = e
We should take and Then
;
rt
rt
(t) S (t) dB (t):
X (0) = v (0 S (0))
(t) = vx (t S (t)) 0 t T^ :
X (T ^ ) = v(T ^ S (T ^ )) ( v (T S (T )) = (S (T ) ; K )+ = v ( L) = 0
if if
>T T.
218
v(T, x)
v(t, x)
Figure 20.5: Practial issue.
20.3 A practical issue

For t < T but t near T , v (t
x) has the form shown in the bottom part of Fig. 20.5.

(t) = vx (t S (t))
In particular, the hedging portfolio
can become very negative near the knockout boundary. The hedger is in an unstable situation. He should take a large short position in the stock. If the stock does not cross the barrier L, he covers this short position with funds from the money market, pays off the option, and is left with zero. If the stock moves across the barrier, he is now in a region of (t) = v x(t S (t)) near zero. He should cover his short position with the money market. This is more expensive than before, because the stock price has risen, and consequently he is left with no money. However, the option has knocked out, so no money is needed to pay it off. Because a large short position is being taken, a small error in hedging can create a signicant effect. Here is a possible resolution. Rather than using the boundary condition
v(t L) = 0 0 t T
solve the PDE with the boundary condition
where short position. The new boundary condition guarantees: 1.
v(t L) + Lvx (t L) = 0 0 t T is a tolerance parameter, say 1%. At the boundary, Lvx (t L) is the dollar size of the
Lvx (t L) remains bounded;

times the dollar
2. The value of the portfolio is always sufcient to cover a hedging error of size of the short position.
Chapter 21
Asian Options
Stock:
dS (t) = rS (t) dt + S (t) dB (t): V =h
Payoff:
ZT
0
S (t) dt
Value of the payoff at time zero:
X (0) = IE e
"
rT h
ZT
0
S (t) dt :
!#
Introduce an auxiliary process Y (t) by specifying
dY (t) = S (t) dt:

With the initial conditions
S (t) = x Y (t) = y
we have the solutions
S (T ) = x exp (B(T ) ; B(t)) + (r ; 1 2 )(T ; t) 2 Y (T ) = y +
ZT
t
S (u) du:
Dene the undiscounted expected payoff
u(t x y ) = IE t x y h(Y (T )) 0 t T x 0 y 2 IR:

219
220
21.1 Feynman-Kac Theorem

The function u satises the PDE
1 ut + rxux + 2 2x2 uxx + xuy = 0 0 t T x 0 y 2 IR
the terminal condition and the boundary condition
u(T x y ) = h(y ) x 0 y 2 IR u(t 0 y) = h(y ) 0 t T y 2 IR:
One can solve this equation. Then
v t S (t)
is the option value at time t, where The PDE for v is
Zt
0
;
S (u) du
;
v (t x y ) = e
r(T t) u(t
x y ):
(1.1)
v (T x y ) = h(y ) v(t 0 y) = e r(T t)h(y ): One can solve this equation rather than the equation for u.
; ;
;rv + vt + rxvx + 1 2x2vxx + xvy = 0 2
21.2 Constructing the hedge

Start with the stock price S (0). The differential of the value X (t) of a portfolio
dX = dS + r(X ; S ) dt = S (r dt + dB ) + rX dt ; r S dt = S dB + rX dt:
(t) is
We want to have
X (t) = v t S (t)
so that
Zt
0
S (u) du S (u) du
X (T ) = v T S (0)
=h
ZT
0
ZT
0
S (u) du :
CHAPTER 21. Asian Options

The differential of the value of the option is
221
dv t S (t)
Zt
0
S (u) du = vtdt + vxdS + vy S dt + 1 vxx dS dS 2 1 = (vt + rSvx + Svy + 2 2 S 2vxx ) dt + Svx dB = rv (t S (t)) dt + vx (t S (t)) S (t) dB (t): (From Eq. 1.1)
Compare this with
Take
dX (t) = rX (t) dt + (t) S (t) dB(t): (t) = vx (t S (t)): If X (0) = v (0 S (0) 0), then X (t) = v t S (t)
Zt
0
S (u) du
0 t T
because both these processes satisfy the same stochastic differential equation, starting from the same initial condition.
21.3 Partial average payoff Asian option

Now suppose the payoff is
V =h
where 0 <
ZT
S (t) dt
; ;
< T . We compute v( x y ) = IE x y e r(T )h(Y (T )) just as before. For 0 t , we compute next the value of a derivative security which pays off v ( S ( ) 0)
at time . This value is The function w satises the Black-Scholes PDE
w(t x) = IE t xe
r( t) v (
;
S ( ) 0): x 0
;rw + wt + rxwx + 1 2x2wxx = 0 0 t 2

w( x) = v ( x 0) x 0 w(t 0) = e
;
with terminal condition and boundary condition The hedge is given by
r(T t)h(0)
;
0 t T: 0 t
8 <w (t S (t)) R (t) = : x vx t S (t) t S (u) du
< t T:
222 Remark 21.1 While no closed-form for the Asian option price is known, the Laplace transform (in 2 the variable 4 (T ; t)) has been computed. See H. Geman and M. Yor, Bessel processes, Asian options, and perpetuities, Math. Finance 3 (1993), 349375.
Chapter 22
Summary of Arbitrage Pricing Theory

A simple European derivative security makes a random payment at a time xed in advance. The value at time t of such a security is the amount of wealth needed at time t in order to replicate the security by trading in the market. The hedging portfolio is a specication of how to do this trading.
22.1 Binomial model, Hedging Portfolio

Let be the set of all possible sequences of n coin-tosses. We have no probabilities at this point. Let r 0 u > r + 1 d = 1=u be given. (See Fig. 2.1) Evolution of the value of a portfolio:
Xk+1 =
k Sk+1 + (1 + r)(Xk ; k Sk ):
Given a simple European derivative security V (! 1 use a portfolio processes
!2), we want to start with a nonrandom X0 and

1 (T )
0
so that
1 (H )
X2(!1 !2 ) = V (!1 !2 ) 8! 1 !2 :
There are four unknowns: X0
(four equations)
1(H )
1(T ). Solving the equations, we obtain:

223
224
2 3 7 1 6 r X1(!1) = 1 + r 6 1 + ; ; d X2(!1 H } + u ; (1 + r) X2(! 1 T )7 4 u d | {z ) u ; d | {z }5

r 1 X0 = 1 + r 1 + ; ; d X1(H ) + u ; (1 + r) X1(T ) u d u;d X2(!1 H ) ; X2(!1 T ) 1 (!1 ) = S (! H ) ; S (! T ) 2 1 2 1 X1(H ) ; X1(T ) : 0 = S (H ) ; S (T )
1 1
V (!1 H )
V (! 1 T )
The probabilities of the stock price paths are irrelevant, because we have a hedge which works on every path. From a practical point of view, what matters is that the paths in the model include all the possibilities. We want to nd a description of the paths in the model. They all have the property
k (log Sk+1 ; log Sk )2 = log SS+1 k = ( log u)2 = (log u)2 :

Let
= log u > 0. Then
n 1 X
;
The paths of log Sk accumulate quadratic variation at rate 2 per unit time.
k=0
(log Sk+1 ; log Sk )2 = 2 n:
If we change u, then we change , and the pricing and hedging formulas on the previous page will give different results. We reiterate that the probabilities are only introduced as an aid to understanding and computation. Recall:
Xk+1 =
k Sk+1 + (1 + r)(Xk ; k Sk ): k
Dene Then
= (1 + r)k :
Xk+1 =
k+1 k+1
Sk+1 + Xk ;
k+1 k k k+1
Sk
k
i.e.,
Xk+1 ; Xk =
k
Sk+1 ; Sk :
k
In continuous time, we will have the analogous equation
(t) d X((tt)) = (t) d S(t) :
CHAPTER 22. Summary of Arbitrage Pricing Theory
225
k f If we introduce a probability measure I under which Sk is a martingale, then P martingale, regardless of the portfolio used. Indeed,
Xk
k will also be a
f f IE Xk+1 F k = IE Xk + k Sk+1 ; Sk F k k+1 k k+1 k Sk+1 F ; Sk : Xk + f E = k I k

k
k+1
=0
{z
}
Then we
Suppose we want to have must have
X2 = V , where V
is some
F 2-measurable random variable.
X1 f X2 1 f V 1 + r X1 = 1 = IE 2 F 1 = IE 2 F 1 f f X0 = X0 = IE X1 = IE V :
0 1 2
To nd the risk-neutral probability measure
f IP f!k = H g, q = IP f!k = T g, and compute ~ f f ~ ~ IE Sk+1 F k = pu Sk + qd Sk k+1 k+1 k+1 1 pu + q d] Sk : = 1+r ~ ~ k We need to choose p and q so that ~ ~
pu + qd = 1 + r ~ ~ p + q = 1: ~ ~
The solution of these equations is
f IP under which Skk
is a martingale, we denote
p= ~
r p = 1 + ;; d ~ u d
q = u ; (1 + r) : ~ u;d
22.2 Setting up the continuous model

Now the stock price S (t) 0 t T , is a continuous function of t. We would like to hedge along every possible path of S (t), but that is impossible. Using the binomial model as a guide, we choose > 0 and try to hedge along every path S (t) for which the quadratic variation of log S (t) accumulates at rate 2 per unit time. These are the paths with volatility 2. To generate these paths, we use Brownian motion, rather than coin-tossing. To introduce Brownian motion, we need a probability measure. However, the only thing about this probability measure which ultimately matters is the set of paths to which it assigns probability zero.
226 Let
B (t) 0 t T , be a Brownian motion dened on a probability space ( F P). 2 IR, the paths of t + B(t) S (t) = S (0) expf t + B(t)g
For any
accumulate quadratic variation at rate 2 per unit time. We want to dene
so that the paths of
log S (t) = log S (0) + t + B (t)
accumulate quadratic variation at rate 2 per unit time. Surprisingly, the choice of in this denition is irrelevant. Roughly, the reason for this is the following: Choose ! 1 2 . Then, for 1 2 IR,
1t +
B (t !1) 0 t T B(t !1 ) is a different function.
is a continuous function of t. If we replace 1 by 2, then 2 t + However, there is an ! 2 2 such that
1t +
B (t !1 ) = 2t + B (t !2) 0 t T:
In other words, regardless of whether we use 1 or 2 in the denition of S (t), we will see the same paths. The mathematically precise statement is the following: If a set of stock price paths has a positive probability when S (t) is dened by
S (t) = S (0) expf 1t + B(t)g

then this set of paths has positive probability when S (t) is dened by
S (t) = S (0) expf 2t + B(t)g:

Since we are interested in hedging along every path, except possibly for a set of paths which has probability zero, the choice of is irrelevant. The most convenient choice of so and is
1 = r; 2
S (t) = S (0) expfrt + B(t) ; 1 2tg 2 e rt S (t) = S (0) expf B(t) ; 1 2tg 2
;
is a martingale under IP . With this choice of ,
dS (t) = rS (t) dt + S (t) dB (t)

and IP is the risk-neutral measure. If a different choice of is made, we have
227
S (t) = S (0) expf t + B(t)g dS (t) = ( +{z1 2 ) S (t) dt + S (t) dB(t): | 2 }

= rS (t) dt +
h |
r dt + dB (t)
f We can change to the risk-neutral measure I , under which P Brownian motion, and then proceed as if had been chosen to be equal to r ; 1 2 . 2 e B has the same paths as B .
22.3 Risk-neutral pricing and hedging
{z e dB (t)
} e B is a
f P Let I denote the risk-neutral measure. Then
e dS (t) = rS (t) dt + S (t) dB (t) e f P where B is a Brownian motion under I . Set (t) = ert :
Then
f P so S (t) is a martingale under I . (t) Evolution of the value of a portfolio:

which is equivalent to
(t) (t) e d S(t) = S(t) dB (t)
dX (t) = (t)dS (t) + r(X (t) ; (t)S (t)) dt

(t) d X((tt)) = (t)d S(t) (t) e = (t) S(t) dB (t):
(3.1)
(3.2)
f Regardless of the portfolio used, X((tt)) is a martingale under I . P Now suppose V is a given F (T )-measurable random variable, the payoff of a simple European derivative security. We want to nd the portfolio process (T ) 0 t T , and initial portfolio X (t) must be a martingale, we must have value X (0) so that X (T ) = V . Because (t)
X (t) = IE V F (t) f (t) (T )
0 t T:
(3.3) This is the risk-neutral pricing formula. We have the following sequence:
228 1.
2. Dene X (t)
is given,
3. Construct dened in step 2.
0 t T , by (3.3) (not by (3.1) or (3.2), because we do not yet have (t)). (t) so that (3.2) (or equivalently, (3.1)) is satised by the X (t) 0 t T ,
To carry out step 3, we rst use the tower property to show that X((tt)) dened by (3.3) is a martingale f P under I . We next use the corollary to the Martingale Representation Theorem (Homework Problem 4.5) to show that
e d X((tt)) = (t) dB (t)

(t) = (t) (t) : S (t)
From (3.3), the denition of X , we see that the hedging portfolio must begin with value Then (3.4) implies (3.2), which implies (3.1), which implies that X (t) the portfolio process (t) 0 t T .
(3.4)
for some proecss . Comparing (3.4), which we know, and (3.2), which we want, we decide to dene (3.5)
T , is the value of
f V X (0) = IE (T )
and it will end with value
V f V X (T ) = (T )IE (T ) F (T ) = (T ) (T ) = V: Remark 22.1 Although we have taken r and to be constant, the risk-neutral pricing formula is still valid when r and are processes adapted to the ltration generated by B . If they depend on e either B or on S , they are adapted to the ltration generated by B . The validity of the risk-neutral
pricing formula means: 1. If you start with
then there is a hedging portfolio
2. At each time t, the value X (t) of the hedging portfolio in 1 satises
f V X (0) = IE (T ) (t) 0 t T , such that X (T ) = V ;
X (t) = IE V F (t) : f (t) (T )

Remark 22.2 In general, when there are multiple assets and/or multiple Brownian motions, the risk-neutral pricing formula is valid provided there is a unique risk-neutral measure. A probability measure is said to be risk-neutral provided

it has the same probability-zero sets as the original measure; it makes all the discounted asset prices be martingales.
229
To see if the risk-neutral measure is unique, compute the differential of all discounted asset prices e e and check if there is more than one way to dene B so that all these differentials have only d B terms.
22.4 Implementation of risk-neutral pricing and hedging

To get a computable result from the general risk-neutral pricing formula
X (t) = IE V F (t) f (t) (T )

one uses the Markov property. We need to identify some state variables, the stock price and possibly other variables, so that
f V X (t) = (t)IE (T ) F (t)
is a function of these variables.

Example 22.1 Assume r and variable. Dene are constant, and V
= h(S(T)). We can take the stock price to be the state i e h v(t x) = IE t x e;r(T ;t) h(S(T)) :
Then
e X(t) = ert IE e;rT h(S(T)) F (t)

= v(t S(t))
(t) and X(t)
e = e;rt v(t S(t)) is a martingale under IP .

are constant.
Example 22.2 Assume r and
V =h
Take S(t) and Y (t) = 0
ZT
0
S(u) du :
R t S(u) du to be the state variables. Dene e t x y h ;r(T ;t)

v(t x y) = IE e Y (T) = y +
h(Y (T))
where
ZT
t
S(u) du:
230
Then
e X(t) = ert IE e;rT h(S(T)) F (t)

= v(t S(t) Y (t))
and is a martingale under IP .
X(t) = e;rt v(t S(t) Y (t)) (t)
Example 22.3 (Homework problem 4.2)
e dS(t) = r(t Y (t)) S(t)dt + (t Y (t))S(t) dB(t) e dY (t) = (t Y (t)) dt + (t Y (t)) dB(t) V = h(S(T)):
Take S(t) and Y (t) to be the state variables. Dene
2 3 6 ( Z 7 ) 6 7 T txy6 e 6exp ; r(u Y (u)) du h(S(T ))7 : 7 v(t x y) = IE 6 7 t 6| 7 4 5 {z }

(t) (T )
Then
e X(t) = (t)IE h(S(T)) F (t) " ( (T) e = IE exp ; ZT

t
r(u Y (u)) du h(S(T )) F (t)
= v(t S(t) Y (t))

and
is a martingale under IP .
X(t) = exp ; Z t r(u Y (u)) du v(t S(t) Y (t)) (t) 0
In every case, we get an expression involving v to be a martingale. We take the differential and set the dt term to zero. This gives us a partial differential equation for v , and this equation must e hold wherever the state processes can be. The dB term in the differential of the equation is the differential of a martingale, and since the martingale is
we can solve for
(t). This is the argument which uses (3.4) to obtain (3.5).
X (t) = X (0) + Z t (u) S (u) dB (u) e (t) (u) 0

Example 22.4 (Continuation of Example 22.3)
231
X(t) = exp ; Z t r(u Y (u)) du v(t S(t) Y (t)) (t) | 0 {z }

1= (t)
is a martingale under IP . We have
1 d X(t) = (t) ;r(t Y (t))v(t S(t) Y (t)) dt (t) + vt dt + vx dS + vy dY + 1 vxx dS dS + vxy dS dY + 1 vyy dY dY 2 2 1 1 = (t) (;rv + vt + rSvx + vy + 1 2 S 2 vxx + Svxy + 2 2 vyy ) dt 2 e + ( Svx + vy ) dB
The partial differential equation satised by v is
;rv + vt + rxvx + vy + 1 2x2 vxx + 2

where it should be noted that v
xvxy + 1 2 vyy = 0 2
where (see (3.2))
= v(t x y), and all other variables are functions of (t y). We have 1 e d X(t) = (t) Svx + vy ] dB(t) (t) = (t Y (t)), = (t Y (t)), v = v(t S(t) Y (t)), and S = S(t). We want to choose (t) so that (t) to be
e d X(t) = (t) (t Y (t)) S(t) dB(t): (t) (t)
Therefore, we should take
Y (t)) (t) = vx (t S(t) Y (t)) + (t (t (t)) S(t) vy (t S(t) Y (t)): Y
232
Chapter 23
Recognizing a Brownian Motion

Theorem 0.62 (Levy) Let B (t) F (t) 0 t T , such that:
be a process on
( F P), adapted to a ltration
1. the paths of B (t) are continuous, 2. 3.
B is a martingale, hBi(t) = t 0 t T , (i.e., informally dB(t) dB(t) = dt).
Then B is a Brownian motion. Proof: (Idea) Let 0 s < t T be given. We need to show that B (t) ; B (s) is normal, with mean zero and variance t ; s, and B (t) ; B (s) is independent of F (s). We shall show that the conditional moment generating function of B (t) ; B (s) is
IE eu(B (t)
B (s)) F (s)
1 2 = e 2 u (t s) :
;
Since the moment generating function characterizes the distribution, this shows that B (t) ; B (s) is normal with mean 0 and variance t ; s, and conditioning on F (s) does not affect this, i.e., B (t) ; B(s) is independent of F (s). We compute (this uses the continuity condition (1) of the theorem)
deuB(t) = ueuB(t)dB(t) + 1 u2 euB(t)dB(t) dB(t) 2

so
euB (t) = euB (s) +
Zt
s
ueuB (v) dB(v ) + 1 u2 2

233
Zt
s
euB(v)
uses cond. 3
dv: |{z}
234
Rt Now 0 ueuB (v)dB (v ) is a martingale (by condition 2), and so

IE
Zt
s
=; = 0:
It follows that
Zs
0
ueuB(v) dB (v ) F (s) ueuB (v) dB(v ) + IE
Zt
0
ueuB(v)dB (v ) F (s)
IE euB(t) F (s) = euB(s) + 1 u2 2

We dene
Zt
s
IE euB(v) F (s) dv:
'(v) = IE euB(v) F (s)

so that
'(s) = euB(s)
and
1 '(t) = euB(s) + 2 u2 1 ' (t) = 2 u2'(t) 1 '(t) = ke 2 u2 t :

0
Zt
s
'(v) dv
Plugging in s, we get
euB (s) = ke 2 u2 s =)k = euB (s)

Therefore,
1 u2 s 2 :
IE euB(t) F (s) = '(t) = euB(s)+ 2 u2 (t IE eu(B(t)

;
s)
B (s)) F (s)
1 2 = e 2 u ( t s) :
;
CHAPTER 23. Recognizing a Brownian Motion
235
23.1 Identifying volatility and correlation

Let B1 and B2 be independent Brownian motions and
dS1 = r dt + S1 dS2 = r dt + S
2
11 dB1 + 12 dB2 21 dB1 + 22 dB2
Dene
2 + 2 1= q 11 12 2 + 2 2= 21 22 11 21 + 12 =
Dene processes W1 and W2 by
1 2
22 :
dW1 = dW2 =
Then W1 and W2 have continuous paths, are martingales, and
11 dB1 + 12 dB2 1 dB1 + 22 dB2 : 21 2
dW1 dW1 = 12 ( 11dB1 +

= 12 ( = dt
1 1 2 dB 11 1
12 dB2 )2 2 dB dB ) 12 2 2
dB1 +
and similarly Therefore, W1 and W2 are Brownian motions. The stock prices have the representation
dW2 dW2 = dt:
The Brownian motions W1 and W2 are correlated. Indeed,
dS1 = r dt + S1 dS2 = r dt + S
2
1 dW1 2 dW2: 12 dB2 )( 21dB1 + 22dB2 )
dW1 dW2 = 1 ( 11dB1 +

= 1 ( 1 2 = dt:
1 2
11 21 + 12 22 ) dt
236
23.2 Reversing the process

Suppose we are given that
where W1 and W2 are Brownian motions with correlation coefcient . We want to nd
dS1 = r dt + dW 1 1 S1 dS2 = r dt + dW 2 2 S
2
=
so that
0
"
11 21 11 12
12 22 21 22
= = =
" " "
11 21
2 + 2 11 12 + 12 22 11 21 # 2 1 2 1 2 1 2 2 11 = 1
12 22
#"
# #
11 21 + 12 22 2 + 2 21 22
A simple (but not unique) solution is (see Chapter 19)
12 = 0 22 =
21 =
This corresponds to
1;
2 2:
p 2 =) dB2 = dW2 ; dW1 1; If = 1, then there is no B2 and dW2 = dB1 = dW1: Continuing in the case 6= 1, we have
1 dW1 = 1 dB1 =)dB1 = dW1 q 2 dW2 = 2 dB1 + 1 ; 2 2
dB2
( 6= 1)
dB1 dB1 = dW1 dW1 = dt 1 dB2 dB2 = 1 ; 2 dW2 dW2 ; 2 dW1 dW2 + 2dW2 dW2
1 = 1; = dt
2
dt ; 2 2 dt + 2 dt
CHAPTER 23. Recognizing a Brownian Motion

so both B1 and B2 are Brownian motions. Furthermore,
237
dB1 dB2 = p 1
1; =p 1 1;
2 (dW1 dW2 ; 2(
dW1 dW1 )
dt ; dt) = 0:
We can now apply an Extension of Levys Theorem that says that Brownian motions with zero cross-variation are independent, to conclude that B 1 B2 are independent Brownians.
238
Chapter 24
An outside barrier option

Barrier process:
dY (t) = dt + Y (t)
Stock process:
1 dB1 (t):
dS (t) = dt + S (t)
where 1
2 dB1 (t) + 1 ;
2 2 dB2 (t)
> 0 2 > 0 ;1 < < 1, and B1 and B2 are independent Brownian motions on some ( F P). The option pays off: (S (T ) ; K )+ 1 Y (T )<L
f g
at time T , where
0 < S (0) < K
0 < Y (0) < L
Y (T ) = 0maxT Y (t): t
Remark 24.1 The option payoff depends on both the Y and S processes. In order to hedge it, we will need the money market and two other assets, which we take to be Y and S . The risk-neutral measure must make the discounted value of every traded asset be a martingale, which in this case means the discounted Y and S processes. We want to nd 1 and 2 and dene
e dB1 = 1 dt + dB1
e dB2 =
239
2 dt + dB2
240 so that
dY = r dt + dB 1 e1 Y = r dt + 1 1 dt + 1 dB1 dS = r dt + dB + q1 ; 2 e1 S
= r dt + 2 1 dt + 1 ; 2 2 2 dt q + 2 dB1 + 1 ; 2 2 dB2 : =r+ =r+

1 1 2 1+
2 2 dB2 e
We must have
(0.1)
1;
2 2 2:
(0.2)
We solve to get
;r; 2= p 1; 2 n
= ;r
1
2 1: 2
We shall see that the formulas for 1 and 2 do not matter. What matters is that (0.1) and (0.2) uniquely determine 1 and 2 . This implies the existence and uniqueness of the risk-neutral measure. We dene
2 2 Z (T ) = exp ; 1 B1 (T ) ; 2 B2 (T ) ; 1 ( 1 + 2 )T 2 Z f IP (A) = Z (T ) dIP 8A 2 F :
f e e P Under I , B1 and B2 are independent Brownian motions (Girsanovs Theorem). risk-neutral measure. f P Remark 24.2 Under both IP and I , Y has volatility 1, S has volatility 2 and
i.e., the correlation between dY and dS is . Y S The value of the option at time zero is
f IP is the unique
dY dS = YS
1 2 dt
f v(0 S (0) Y (0)) = IE e
rT (S (T ) ; K )+ 1
Y (T )<L
i
g
We need to work out a density which permits us to compute the right-hand side.
CHAPTER 24. An outside barrier option

Recall that the barrier process is
241
dY = r dt + Y
so
e 1 dB1 n o
1 e Y (t) = Y (0) exp rt + 1B1 (t) ; 2 2t : 1

Set
b = r= 1 ; 1=2 b e B (t) = bt + B1 (t) c b M (T ) = max B (t):

0 t T
Then
The joint density of
b c B (T ) and M (T ), appearing in Chapter 20, is f b IP fB (T ) 2 d^ M (T ) 2 dmg b c ^ ^) ( (2m ; ^)2 b^ 1 b2 ) ^ m ^ = 2(2p ; b exp ; ^2T b + b ; 2 T db dm ^ T 2 T m > 0 ^ < m: ^ b ^ q e1 + 1 ; 2 dB
2 2 dB2 e
b Y (t) = Y (0) expf 1B (t)g c Y (T ) = Y (0) expf 1M (T )g:
The stock process.
dS = r dt + S
so
1 e e S (T ) = S (0) expfrT + 2B1 (T ) ; 1 2 2 T + 1 ; 2 2B2 (T ) ; 2 (1 ; 2) 2 T g 2 2 2 q e e = S (0) expfrT ; 1 2 T + 2 B1 (T ) + 1 ; 2 2 B2 (T )g 2 2

From the above paragraph we have
e b B1 (T ) = ; bT + B (T )
so
b S (T ) = S (0) expfrT + 2B (T ) ; 1 2 T ; 2 2
q bT + 1 ; 2
2 2B2 (T )g e
242
24.1 Computing the option value
f v (0 S (0) Y (0)) = IE e
;
rT (S (T ) ; K )+ 1
Y (T )<L
i
g
1 f = e rT IE S (0) exp (r ; 2 2 ; 2
2 b)T +
b 2 B (T ) + 1 ;
2 2 B2 (T ) e
;K
:1 Y (0)exp 1 M (T )]<L b
f
b We know the joint density of ( B (T )
( ) b b b ffB2(T ) 2 d~g = p 1 exp ; ~2 d~ ~ 2 IR: e IP b 2T 2 T e e b c Furthermore, the pair of random variables (B (T ) M (T )) is independent of B2 (T ) because B1 and e2 are independent under IP . Therefore, the joint density of the random vector (B2(T ) B (T ) M (T )) f e b c B
is
c e M (T )). The density of B2 (T ) is
f e f e IP fB2 (T ) 2 d~ B(T ) 2 d^ M (T ) 2 dm g = IP fB2 (T ) 2 d~g:IP fB(T ) 2 d^ M (T ) 2 dmg b b b c ^ b f b b c ^

The option value at time zero is
v (0 S (0) Y (0))
=e
;
rT
1 1
log Y L Zm Z (0) ^ 0
( ) b 1 exp ; ~2 :p 2T 2 T ( ) 2(2p ; ^) exp ; (2m ; ^)2 + b^ ; 1 b2 T m b ^ ^ b : b 2 2T

;1 ;1
S (0) exp (r ;
1 2; 2 2
2 b)T +
b 2^ + 1 ;
2 2~ b
;K
T 2 T :d~ d^ dm: b b ^
The answer depends on T S (0) and Y (0). It also depends on 1 2 r K and L. It does not depend on 1 nor 2 . The parameter b appearing in the answer is b = r1 ; 21 : Remark 24.3 If we had not regarded Y as a traded asset, then we would not have tried to set its mean return equal to r. We would have had only one equation (see Eqs (0.1),(0.2))
= r+
1+ 1;
2 2 2
(1.1)
to determine 1 and 2 . The nonuniqueness of the solution alerts us that some options cannot be hedged. Indeed, any option whose payoff depends on Y cannot be hedged when we are allowed to trade only in the stock.

If we have an option whose payoff depends only on original equation for S ,
243
S , then Y
1;
is superuous. Returning to the
dS = dt + S
we should set
2 dB1 +
2 2 dB2
dW = dB1 + 1 ; 2dB2 so W is a Brownian motion under IP (Levys theorem), and dS = dt + dW: 2 S

Now we have only Brownian motion, there will be only one , namely,
= ;r
f so with dW
= dt + dW
we have
dS = r dt + S
and we are on our way.
f dW
24.2 The PDE for the outside barrier option

Returning to the case of the option with payoff
(S (T ) ; K )+ 1 Y (T )<L
f
we obtain a formula for
v (t x y ) = e
r(T t)I t x y f E
;
(S (T ) ; K )+ 1 maxt
f
u T Y (u) < Lg
by replacing T , S (0) and Y (0) by T ; t, x and y respectively in the formula for v (0 S (0) Y (0)). Now start at time 0 at S (0) and Y (0). Using the Markov property, we can show that the stochastic process
f is a martingale under I . We compute P

d e rtv(t S (t) Y (t))
; ;
e rt v (t S (t) Y (t))
;
1 = e rt ;rv + vt + rSvx + rY vy + 2 2S 2 vxx + 2
e 2 Svx dB1 +
2 2 1 2 SY vxy + 1 1 Y vyy 2
dt
1;
2 2Svx dB2 + 1Y vy dB1 e e
244
y L v(t, x, L) = 0, x >= 0
x v(t, 0, 0) = 0
Figure 24.1: Boundary conditions for barrier option. Note that t 2
0 T ] is xed.
Setting the dt term equal to 0, we obtain the PDE
; rv + vt + rxvx + ryvy + 1 2x2vxx 2 2

+
2 2 1 2 xyvxy + 1 1 y vyy 2
=0 0 t<T
x 0 0 y L:
The terminal condition is
v(T x y) = (x ; K )+
x 0 0 y<L
and the boundary conditions are
v(t 0 0) = 0 0 t T v (t x L) = 0 0 t T x 0:
245
;rv + vt + ryvy + 1 2y2vyy = 0 2 1

This is the usual Black-Scholes formula in y . The boundary conditions are the terminal condition is
x=0
y=0 ;rv + vt + rxvx + 1 2x2vxx = 0 2 2

This is the usual Black-Scholes formula in x. The boundary condition is
v (t 0 L) = 0 v(t 0 0) = 0 v (T 0 y) = (0 ; K )+ = 0 y 0:
On the x = 0 boundary, the option value is v (t 0 y ) = 0 0 y L:
v(t 0 0) = e
r(T t)(0 ; K )+ = 0
;
the terminal condition is
v (T x 0) = (x ; K )+
x 0:
On the y = 0 boundary, the barrier is irrelevant, and the option value is given by the usual Black-Scholes formula for a European call.
24.3 The hedge

After setting the dt term to 0, we have the equation
d e rtv (t S (t) Y (t))

;
= e rt
;
e 2Svx dB1 + 1 ; i i
2 2 Svx dB2 + 1 Y vy dB1 e e

are functions of t. Note
where vx that
e e = vx (t S (t) Y (t)), vy = vy (t S (t) Y (t)), and B1 B2 S Y

d e rt S (t) = e
;
rt
d e rt Y (t) = e =e
;
= e rt
; ;
;rS (t) dt + dS (t)] q e 2S (t) dB1 (t) + 1 ; ;rY (t) dt + dY (t)] e 1Y (t) dB1 (t):
; ;
2 2S (t) dB2 (t) e
rt
rt
Therefore,
d e rtv (t S (t) Y (t)) = vxd e rtS ] + vy d e rtY ]:

;
Let 2 (t) denote the number of shares of stock held at time t, and let 1(t) denote the number of shares of the barrier process Y . The value X (t) of the portfolio has the differential
dX =
2dS + 1 dY
+rX;
2 S ; 1Y ] dt:
246 This is equivalent to
d e rtX (t)] =
;
2 (t)d e
rtS (t)] +
1(t)d e
rt Y (t)]:
To get X (t) = v (t
S (t) Y (t)) for all t, we must have X (0) = v (0 S (0) Y (0))
and
2 (t) = vx (t
S (t) Y (t)) 1 (t) = vy (t S (t) Y (t)):
Chapter 25
American Options
This and the following chapters form part of the course Stochastic Differential Equations for Finance II.
25.1 Preview of perpetual American put
dS = rS dt + S dB
Intrinsic value at time t : (K ; S (t))+ : Let L 2
0 K ] be given. Suppose we exercise the rst time the stock price is L or lower. We dene
vL
L = minft 0 S (t) (x) = IEe r L (K ; S (
K;x (K ; L)IEe
L ))
Lg
+
if x L, if x > L:
rL
The plan is to comute vL (x) and then maximize over L to nd the optimal exercise price. We need to know the distribution of L .
25.2 First passage times for Brownian motion: rst method

(Based on the reection principle) Let B be a Brownian motion under IP , let x > 0 be given, and dene
= minft 0 B (t) = xg:

is called the rst passage time to x. We compute the distribution of . 247
248
Intrinsic value
Stock price
Figure 25.1: Intrinsic value of perpetual American put Dene
M (t) = 0maxt B (u): u
From the rst section of Chapter 20 we have
2 m IP fM (t) 2 dm B(t) 2 dbg = 2(2p ; b) exp ; (2m2; b) dm db t t 2 t

Therefore,
m > 0 b < m:
Z Z m 2(2m ; b) ( (2m ; b)2 ) p IP fM (t) xg = db dm exp ; 2t t 2 t x ) ( Z 2 exp ; (2m ; b)2 b=m dm p = 2t 2 t x b= ( 2) Z p 2 exp ; mt dm: = 2 2 t
1 ;1 1 ;1 1
We make the change of variable z
= mt in the integral to get

p
2 p2 exp ; z2 dz: = x= t 2
1 p
Now
t()M (t) x
CHAPTER 25. American Options

so
249
@ IP f 2 dtg = @t IP f tg dt @ = @t IP fM (t) xg dt " Z ( ) # @ 2 exp ; z 2 dz dt p = @t 2 x= t 2 ( 2) @ x = ; p2 exp ; xt : @t p dt 2 t 2 ( 2) = px exp ; x dt: 2t t 2 t

1 p
We also have the Laplace transform formula
IEe
=e x
;
e t IP f 2 dtg
;
> 0:
(See Homework)
Reference: Karatzas and Shreve, Brownian Motion and Stochastic Calculus, pp 95-96.
25.3 Drift adjustment

Reference: Karatzas/Shreve, Brownian motion and Stochastic Calculus, pp 196197. For 0
t < 1, dene
e B (t) = t + B (t) Z (t) = expf; B (t) ; 1 2 tg 2 e (t) + 1 2tg = expf; B 2 e ~ = minft 0 B (t) = xg:
A
Dene
We x a nite time T and change the probability measure only up to T . More specically, with T xed, dene Z
f e Under I , the process B (t) P
f IP (A) =
Z (T ) dP
A 2 F (T ):
0 t T , is a (nondrifted) Brownian motion, so f IP f~ 2 dtg = IP f 2 dtg

2 = px exp ; x dt 0 < t T: 2t t 2 t
250 For 0 < t
T we have
IP f~ tg = IE 1
h h
f = IE 1
~ t
i
g g
f = IE 1
i fh e = IE 1 ~ t expf B (~ ^ t) ; 1 2 (~ ^ t)g 2 i fh = IE 1 ~ t expf x ; 1 2 ~g 2 Zt 1 f = expf x ; 2 2 sgIP f ~ 2 dsg 0 ( Zt x 2) 1 2s ; x p exp x ; 2 = 2s ds 0 s 2 s ) ( Zt x (x ; s)2 ds: p exp ; 2s = s 2 s
f g f g f g
1 2 e ~ t expf B (T ) ; 2 T g 1 f f e = IE 1 ~ t IE expf B (T ) ; 2 2 T g F (~ ^ t)
f g
~ t
Z (T )
Therefore,
2 IP f~ 2 dtg = px exp ; (x ; t t) dt 0 < t T: 2 t 2 t

Since T is arbitrary, this must in fact be the correct formula for all t > 0.
25.4 Drift-adjusted Laplace transform

Recall the Laplace transform formula for
= minft 0 B (t) = xg
for nondrifted Brownian motion:
IEe
For
Z
0
2 px exp ; t ; xt dt = e 2 t 2 t
x 2
p
> 0 x > 0:
~ = minft 0 t + B (t) = xg

the Laplace transform is
251
IEe
~=
2 px exp ; t ; (x ; t t) dt 2 0 t 2 t ( ) Z x exp ; t ; x2 + x ; 1 2 t dt p = 2 2t 0 t 2 t ) ( Z x exp ;( + 1 2 )t ; x2 dt x p =e 2 2t 0 t 2 t = ex x 2 + 2 >0 x>0

1 1 1 p ;
where in the last step we have used the formula for IEe; If
~(!) < 1, then
with
replaced by
+1 2
2.
lim e
;
if
~(! ) = 1, then e
~(!) = 0 for every
#0 #0
~(! ) = 1
> 0, so lim e ~(! ) = 0:

; ;
Therefore,
lim e
Letting
#0 and using the Monotone Convergence Theorem in the Laplace transform formula
IEe
;
#0
~(! ) = 1
~<
:
2
~ = ex x 2 +
p ; ;
we obtain If If
IP f~ < 1g = ex
= ex x :
; j j
0, then
IP f ~ < 1g = 1: IP f ~ < 1g = e2x < 1:
< 0, then
(Recall that x > 0).
25.5 First passage times: Second method

(Based on martingales) Let
> 0 be given. Then
Y (t) = expf B(t) ; 1 2tg 2
252 is a martingale, so Y (t ^
) is also a martingale. We have 1 = Y (0 ^ ) = IEY (t ^ ) = IE expf B (t ^ ) ; 1 2 (t ^ )g: 2 1 = t! IE expf B (t ^ ) ; 2 2 (t ^ )g: lim

1
We want to take the limit inside the expectation. Since
expf B (t ^ ) ; 1 2 (t ^ )g 2
1
ex
this is justied by the Bounded Convergence Theorem. Therefore,
1 = IE t! expf B (t ^ ) ; 1 2 (t ^ )g: lim 2 There are two possibilities. For those ! for which (! ) < 1,
; 1
1 2 lim expf B (t ^ ) ; 1 2 (t ^ )g = e x 2 : 2 t! For those ! for which (! ) = 1, 1 lim expf B (t ^ ) ; 1 2 (t ^ )g t! expf x ; 2 2tg = 0: lim 2 t!
1 1
Therefore,
1 1 = IE t! expf B (t ^ ) ; 2 2(t ^ )g lim

1
= IE e x
;
1 2
1<
1 where we understand e x; 2
Let
= IEe x
2
1 2
=1 2
2 , so
= 2
to be zero if
= 1.
;
. We have again derived the Laplace transform formula
x 2
p
= IEe
>0 x>0
for the rst passage time for nondrifted Brownian motion.
25.6 Perpetual American put
dS = rS dt + S dB S (0) = x S (t) = x expf(r ; 1 2 )t + B(t)g 2
8 > > < = x exp > > :
2 39 > > 6 r 6 ; t + B(t)7= : 7 6 7> 4| {z 2 } 5>

Intrinsic value of the put at time t: Let L 2
253
0 K ] be given. Dene for x L,

L = minft
(K ; S (t)) +.
0 S (t) = Lg = minft 0 t + B (t) = 1 log L g x 1 log x g = minft 0 ; t ; B (t) = L
Dene
vL = (K ; L)IEe r L x xp = (K ; L) exp ; log L ; 1 log L 2r + 1 2r + 2 x : = (K ; L) L

; p ; ;
We compute the exponent
p ; ; 1 2r +
2 = ; r + 1 ; 1 2r + r ; =2 2 2
s s s s
= ; r2 + 1 ; 1 2r + r 2 ; r + 2 =4 2
2 2 = ; r2 + 1 ; 1 r 2 + r + 2=4 2
= ; r2 + 1 ; 1 r + =2 2 = ; r2 + 1 ; 1 r + =2 2 = ; 2r :
2
Therefore,
8 <(K ; x) vL (x) = : ;x (K ; L) L
;
2r=
x L:
2
0 x L
;x The curves (K ; L) L
2r=
are all of the form Cx;2r=
We want to choose the largest possible constant. The constant is
C = (K ; L)L2r=
254
value
K-x K 2 (K - L) (x/L)-2r/
Stock price
Figure 25.2: Value of perpetual American put
value
C3 x C2 x C1 x
-2r/ 2
-2r/ 2 -2r/ 2
Stock price
Figure 25.3: Curves.

and
255
@C = ;L 2r2 + 2r (K ; L)L 2r2 1 2 @L 2r 1 = L 2 ;1 + 2r (K ; L) L 2 2r = L 2 ; 1 + 2r + 2r K : 2 2L

;
We solve
; 1 + 2r + 2r K = 0 2 2L
to get
2 L = 2 rK2r : +
Since 0 < 2r < 2 + 2r we have
0 < L < K: 0 x L
Solution to the perpetual American put pricing problem (see Fig. 25.4):
8 < v (x) = :(K ; x) ; x (K ; L ) L (
2r=
x L
where
2 L = 2 rK2r : +
Note that
v (x) = ;12r
0
; 2 (K ; L) (L
0
)2r=
2r=
2;
0 x<L
x>L :
We have
x#L
lim v (x) = ;2 r2 (K ; L ) 1
L 2+ 2 = ;2 r2 K ; 2 rK2r 2rK2r + 2 + 2r ; 2r ! 2 + 2r = ;2 r2 2 + 2r 2r
= ;1 = lim v (x):
x"L
0
256
value
K-x K -2r/ (K - L* )(x/L* ) 2
L*
Stock price
Figure 25.4: Solution to perpetual American put.
25.7 Value of the perpetual American put

Set
2 L = 2 rK2r = + 1 K: + If 0 x < L , then v (x) = K ; x. If L x < 1, then v(x) = (K ; L )(L ) } x | {z

;
= 2r 2
(7.1) (7.2)
= IE x
where
e (K ; L )+1
;
C r
<
1g
S (0) = x
If 0
= minft 0 S (t) = L g:
00
(7.3) (7.4)
x < L , then ;rv(x) + rxv (x) + 1 2x2v (x) = ;r(K ; x) + rx(;1) = ;rK: 2 If L x < 1, then ;rv(x) + rxv (x) + 1 2x2v (x) 2 = C ;rx ; rx x 1 ; 1 2 x2 (; ; 1)x 2 ] 2 1 2 (; ; 1)] = Cx ;r ; r ; 2 = C (; ; 1)x r ; 1 2 2r 2 2 = 0:
0 0 00 ; ; ; ; ; ; ;
In other words, v solves the linear complementarity problem: (See Fig. 25.5).
257
6
K@
@@
L
@@ @@
-x
Figure 25.5: Linear complementarity
For all x 2 IR, x 6= L ,
1 rv ; rxv ; 2 2x2 v 0 v (K ; x)+

0 00
(a) (b) (c)
One of the inequalities (a) or (b) is an equality. The half-line
0 1) is divided into two regions:
C = fx v(x) > (K ; x)+ g S = fx rv ; rxv ; 1 2x2v > 0g 2

0 00
and L is the boundary between them. If the stock price is in C , the owner of the put should not exercise (should continue). If the stock price is in S or at L , the owner of the put should exercise (should stop).
25.8 Hedging the put

Let S (0) be given. Sell the put at time zero for v (S (0)). Invest the money, holding (t) shares of stock and consuming at rate C (t) at time t. The value X (t) of this portfolio is governed by
dX (t) = (t) dS (t) + r(X (t) ; (t)S (t)) dt ; C (t) dt

or equivalently,
d(e rt X (t)) = ;e rt C (t) dt + e

; ;
rt
(t) S (t) dB (t):
258 The discounted value of the put satises
d e rtv(S (t)) = e
;
rt
+ e rt S (t)v (S (t)) dB (t) = ;rKe rt 1 S (t)<L dt + e rt S (t)v (S (t)) dB (t):

; 0 ; ; 0 f g
h i ;rv(S (t)) + rS (t)v (S (t)) + 1 2S 2(t)v (S (t)) dt 2

0 00
We should set
C (t) = rK 1 S (t)<L (t) = v (S (t)):

f 0
Remark 25.1 If S (t) < L , then
As long as the owner does not exercise, you can consume the interest from the money market position, i.e.,
v(S (t)) = K ; S (t) (t) = v (S (t)) = ;1: To hedge the put when S (t) < L , short one share of stock and hold K in the money market.
0
C (t) = rK 1 S(t)<L :
f g
Properties of e;rt v (S (t)): 1. 2. 3.
e rtv (S (t)) is a supermartingale (see its differential above). e rtv (S (t)) e rt (K ; S (t))+ , 0 t < 1; e rtv (S (t)) is the smallest process with properties 1 and 2.
; ; ; ;
Explanation of property 3. Let Y be a supermartingale satisfying
Y (t) e rt (K ; S (t))+
;
0 t < 1:
(8.1)
Then property 3 says that
Y (t) e rt v (S (t)) 0 t < 1: (8.2) We use (8.1) to prove (8.2) for t = 0, i.e., Y (0) v(S (0)): (8.3) If t is not zero, we can take t to be the initial time and S (t) to be the initial stock price, and then
;
adapt the argument below to prove property (8.2). Proof of (8.3), assuming Y is a supermartingale satisfying (8.1): Case I: S (0)
L : We have
Y (0) |{z}(K ; S (0))+ = v(S (0)):

(8:1)

Case II: S (0) > L : For T
259
> 0, we have
f 1g
Y (0) IEY ( ^ T ) (Stopped supermartingale is a supermartingale) h i IE Y ( ^ T )1 < : (Since Y 0)

Now let T !1 to get
Y (0)
lim IE Y ( ^ T )1 < T! h i (Fatous Lemma) IE Y ( )1 <

1 f 1g
2 IE 6e r (K ; S{z ))+ 1 4 |( }
f 1g ;
= v (S (0)):
<
1g
3 7 5
(by 8.1)
(See eq. 7.2)
25.9 Perpetual American contingent claim

Intinsic value: h(S (t)). Value of the American contingent claim:
v (x) = sup IE x e r h(S ( ))

;
where the supremum is over all stopping times. Optimal exercise rule: Any stopping time Characterization of v : 1. 2. 3. which attains the supremum.
e rtv (S (t)) is a supermartingale; e rtv (S (t)) e rt h(S (t)) 0 < t < 1; e rtv (S (t)) is the smallest process with properties 1 and 2.
; ; ; ;
25.10 Perpetual American call
v(x) = sup IE x e r (S ( ) ; K )+
;
Theorem 10.63
v (x) = x 8x 0:
260 Proof: For every t,
v(x) IE x e rt (S (t) ; K )+ h i IE x e rt (S (t) ; K ) h i = IE x e rt S (t) ; e rt K = x ; e rt K:

; ; ; ; ;
x. Now start with S (0) = x and dene Y (t) = e rtS (t):

;
Let t!1 to get v (x)
Then: 1. 2.
Y is a supermartingale (in fact, Y is a martingale); Y (t) e rt(S (t) ; K )+ 0 t < 1.

;
Therefore, Y (0)
v(S (0)), i.e.,
x v(x):
we choose,
;
Remark 25.2 No matter what

;
IE x e r (S ( ) ; K )+ < IE x e r S ( )
There is no optimal exercise time.
x = v(x):
25.11 Put with expiration
T > 0. Intrinsic value: (K ; S (t)) +.

Expiration time: Value of the put:
v(t x) = (value of the put at time t if S (t) = x) = sup IE xe r( t) (K ; S ( ))+:

; ;
:stopping time
See Fig. 25.6. It can be shown that jump. Let S (0) be given. Then
| {z }
v vt vx are continuous across the boundary, while v xx has a
261
v>K;x ;rv + vt + rxvx + 1 2x2vxx = 0 2 K
v (T x) = 0 x K
v=K;x vt = 0 vx = ;1 vxx = 0 ;rv + vt + rxvx + 1 2x2vxx = ;rK 2 T

1. 2. 3.
;
v (T x) = K ; x 0 x K
-t
Figure 25.6: Value of put with expiration
e rtv (t S (t)) 0 t T is a supermartingale; e rtv (t S (t)) e rt (K ; S (t))+ 0 t T ; e rtv (t S (t)) is the smallest process with properties 1 and 2.
; ; ;
25.12 American contingent claim with expiration
T > 0. Intrinsic value: h(S (t)).

Expiration time: Value of the contingent claim:
v(t x) = sup IE x e
t T
Then
r( t)h(S (
;
)):
At every point (t
x) 2 0 T ]
0 1), either (a) or (b) is an equality.
rv ; vt ; rxvx ; 1 2 x2vxx 0 2 v h(x)
(a) (b) (c)
Characterization of v : Let S (0) be given. Then
262 1. 2. 3.
e rtv (t S (t)) 0 t T is a supermartingale; e rtv (t S (t)) e rt h(S (t)); e rtv (t S (t)) is the smallest process with properties 1 and 2.
; ; ; ;
The optimal exercise time is
= minft 0 v (t S (t)) = h(S (t))g

If
(!) = 1, then there is no optimal exercise time along the particular path ! .
Chapter 26
Options on dividend-paying stocks

26.1 American option with convex payoff function
Theorem 1.64 Consider the stock price process
dS (t) = r(t)S (t) dt + (t)S (t) dB (t) where r and are processes and r(t) 0 0 t T a.s. This stock pays no dividends. 0, and assume h(0) = 0. (E.g., h(x) = (x ; K )+ ). An Let h(x) be a convex function of x American contingent claim paying h(S (t)) if exercised at time t does not need to be exercised
before expiration, i.e., waiting until expiration to decide whether to exercise entails no loss of value. Proof: For 0
1 and x 0, we have
h( x) = h((1 ; )0 + x) (1 ; )h(0) + h(x) = h(x): Let T be the time of expiration of the contingent claim. For 0 t T ,
0
and S (T )
(t) = exp ; Z T r(u) du (T ) t (t) S (T ) (T )
0, so
(t) h(S (T )): (T )
(*)
Consider a European contingent claim paying h(S (T )) at time T . The value of this claim at time t 2 0 T ] is
X (t) = (t) IE (1 ) h(S (T )) F (t) : T

263
264
.... .... .... .. ......r.. . ..... . ..... ....... (x h(x)) .. h(x) ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... ...... ...... ...... ...... ....... ....... ....... ............r....... .. ............ ...... . . . . . .... . . .. . . . . h( x) ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... ...... ...... ...... ...... ................................. ...................r.......... .. h . . . . . . . . ....... . . . . ........ . . ......... . . . . . . ........... . . ............ . . ..............
Figure 26.1: Convex payoff function Therefore,
X (t) = 1 IE (t) h(S (T )) F (t) (t) (t) (T ) (t) 1 (t) IE h (T ) S (T ) F (t) (by (*)) 1 h (t) IE S (T ) F (t) (Jensens inequality) (t) (T ) = 1 h (t) S (t) ( S is a martingale) (t) (t)
1 = (t) h(S (t)):
This shows that the value X (t) of the European contingent claim dominates the intrinsic value h(S (t)) of the American claim. In fact, except in degenerate cases, the inequality
X (t) h(S (t)) 0 t T

is strict, i.e., the American claim should not be exercised prior to expiration.
26.2 Dividend paying stock

Let r and be constant, let be a dividend coefcient satisfying
0 < < 1:
CHAPTER 26. Options on dividend paying stocks

Let T > 0 be an expiration time, and let t1 price is given by
265
2 (0 T ) be the time of dividend payment. The stock
1 2 0 t t1 S (t) = S (0) expf(r ; 2 )t +1 B(t)g 2 )(t ; t1 ) + (B (t) ; B (t1 ))g (1 ; )S (t1) expf(r ; 2 t1 < t T: Consider an American call on this stock. At times t 2 (t 1 T ), it is not optimal to exercise, so the
value of the call is given by the usual Black-Scholes formula
v(t x) = xN (d+(T ; t x)) ; Ke

where
r(T t) N (d
;
(T ; t x))
t1 < t T :
At time t1 , immediately after payment of the dividend, the value of the call is
x d (T ; t x) = p 1 log K + (T ; t)(r T ;t v (t1 (1 ; )S (t1)):
2=2)
At time t1 , immediately before payment of the dividend, the value of the call is
w(t1 S (t1))
where Theorem 2.65 For 0
w(t1 x) = max (x ; K )+ v (t1 (1 ; )x : t t1 , the value of the American call is w(t S (t)), where w(t x) = IE t x e
r(t1 t) w(t
;
S (t1)) :
This function satises the usual Black-Scholes equation
;rw + wt + rxwx + 1 2x2wxx = 0 0 t t1 x 0 2

(where w = w(t
x)) with terminal condition w(t1 x) = max (x ; K )+ v(t1 (1 ; )x) x 0 w(t 0) = 0 0 t T:

(t) =
and boundary condition The hedging portfolio is
wx(t S (t)) vx (t S (t))
0 t t1
t1 < t T:
Proof: We only need to show that an American contingent claim with payoff w(t 1 S (t1)) at time t1 need not be exercised before time t1. According to Theorem 1.64, it sufces to prove 1.
w(t1 0) = 0,
266 2.
w(t1 x) is convex in x.
0) = 0, we have immediately that
Since v (t1
w(t1 0) = max (0 ; K )+ v(t1 (1 ; )0) = 0:

To prove that w(t1 x) is convex in x, we need to show that v (t 1 (1 ; )x) is convex is x. Obviously, (x ; K )+ is convex in x, and the maximum of two convex functions is convex. The proof of the convexity of v (t1 (1 ; )x) in x is left as a homework problem.
26.3 Hedging at time t1

Case I: v (t1 (1 ; )x) (x ; K )+ . The option need not be exercised at time t 1 (should not be exercised if the inequality is strict). We have Let x = S (t1).
w(t1 x) = v (t1 (1 ; )x) (t1) = wx (t1 x) = (1 ; )vx (t1 (1 ; )x) = (1 ; ) (t1+)

where
(t1 +) = lim (t)

t#t1
is the number of shares of stock held by the hedge immediately after payment of the dividend. The post-dividend position can be achieved by reinvesting in stock the dividends received on the stock held in the hedge. Indeed,
(t1 +) = 1 (t1) = (t1) + 1; 1; (t1 )S (t1) = (t1 ) + (1 ; )S (t )

1
(t1)
dividends received price per share when dividend is reinvested
= # of shares held when dividend is paid +
Case II: v (t1 (1 ; )x) < (x ; K )+ . The owner of the option should exercise before the dividend payment at time t 1 and receive (x ; K ). The hedge has been constructed so the seller of the option has x ; K before the dividend payment at time t1 . If the option is not exercised, its value drops from x ; K to v (t 1 (1 ; )x), and the seller of the option can pocket the difference and continue the hedge.
Chapter 27
Bonds, forward contracts and futures

Let fW (t) F (t) 0 t T g be a Brownian motion (Wiener process) on some ( sider an asset, which we call a stock, whose price satises
F P). Con-
dS (t) = r(t)S (t) dt + (t)S (t) dW (t):

Here, r and are adapted processes, and we have already switched to the risk-neutral measure, which we call IP . Assume that every martingale under IP can be represented as an integral with respect to W . Dene the accumulation factor
(t) = exp
Zt
0
r(u) du :
A zero-coupon bond, maturing at time T , pays 1 at time T and nothing before time T . According to the risk-neutral pricing formula, its value at time t 2 0 T ] is
B(t T ) = (t) IE (1T ) F (t) t = IE ((T)) F (t) " (

= IE exp ;
ZT
t
r(u) du F (t) :
Given B (t
T ) dollars at time t, one can construct a portfolio of investment in the stock and money
267
268 market so that the portfolio value at time T is 1 almost surely. Indeed, for some process ,
B(t T ) = (t) IE (1 ) F (t) | T {z }

martingale
1 + Z t (u) dW (u) = (t) IE (T ) Zt 0 = (t) B (0 T ) + (u) dW (u)

0
dB (t T ) = r(t) (t) B (0 T ) + (u) dW (u) dt + (t) (t) dW (t) 0 = r(t)B (t T ) dt + (t) (t) dW (t):
The value of a portfolio satises
Zt
dX (t) = (t) dS (t) + r(t) X (t) ; (t)S (t)]dt = r(t)X (t) dt + (t) (t)S (t) dW (t):
(*) We set
(t) (t) = (t)S(t) : (t)
If, at any time t, X (t) = B (t
T ) and we use the portfolio (u) t u T , then we will have X (T ) = B(T T ) = 1:
If r(t) is nonrandom for all t, then
( ZT ) B (t T ) = exp ; r(u) du
t
dB (t T ) = r(t)B(t T ) dt
i.e., = 0. Then given above is zero. If, at time t, you are given B (t invest only in the money market, then at time T you will have
T ) dollars and you always
B(t T ) exp
(Z T
t
r(u) du = 1:
If r(t) is random for all t, then is not zero. One generally has three different instruments: the stock, the money market, and the zero coupon bond. Any two of them are sufcient for hedging, and the two which are most convenient can depend on the instrument being hedged.
CHAPTER 27. Bonds, forward contracts and futures
269
27.1 Forward contracts

We continue with the set-up for zero-coupon bonds. The T -forward price of the stock at time t 2 0 T ] is the F (t)-measurable price, agreed upon at time t, for purchase of a share of stock at time T , chosen so the forward contract has value zero at time t. In other words,
We solve for F (t):
IE (1 ) (S (T ) ; F (t)) F (t) = 0 0 t T: T
0 = IE 1 (S (T ) ; F (t)) F (t) (T ) = IE S (T ) F (t) ; F (t) IE (t) F (t) (T ) (t) (T ) = S (t) ; F (t) B (t T ): (t) (t)
This implies that
) F (t) = BSt(tT ) : (
Remark 27.1 (Value vs. Forward price) The T -forward price F (t) is not the value at time t of the forward contract. The value of the contract at time t is zero. F (t) is the price agreed upon at time t which will be paid for the stock at time T .
27.2 Hedging a forward contract

Enter a forward contract at time 0, i.e., agree to pay F (0) = BS (0) ) for a share of stock at time T . (0 T At time zero, this contract has value 0. At later times, however, it does not. In fact, its value at time t 2 0 T ] is
V (t) = (t) IE (1 ) (S (T ) ; F (0)) F (t) T t (T ) = (t) IE S(T ) F (t) ; F (0) IE ((T)) F (t) (t) = (t) S(t) ; F (0)B (t T ) = S (t) ; F (0)B (t T ): F (0)B (0 T ) = BS (0) ) B (0 T ) = S (0): (0 T
This suggests the following hedge of a short position in the forward contract. At time 0, short F (0) T -maturity zero-coupon bonds. This generates income
270 Buy one share of stock. This portfolio requires no initial investment. Maintain this position until time T , when the portfolio is worth
S (T ) ; F (0)B (T T ) = S (T ) ; F (0):
Deliver the share of stock and receive payment F (0). A short position in the forward could also be hedged using the stock and money market, but the implementation of this hedge would require a term-structure model.
27.3 Future contracts

Future contracts are designed to remove the risk of default inherent in forward contracts. Through the device of marking to market, the value of the future contract is maintained at zero at all times. Thus, either party can close out his/her position at any time. Let us rst consider the situation with discrete trading dates
0 = t0 < t1 < : : : < tn = T:

On each tj tj +1 ), r is constant, so
(tk+1 ) = exp
Z tk+1
8 0k 9 <X = = exp : r(tj )(tj +1 ; tj ) j =0
r(u) du
Enter a future contract at time tk , taking the long position, when the future price is (t k ). At time tk+1 , when the future price is (tk+1 ), you receive a payment (tk+1 ) ; (tk ). (If the price has fallen, you make the payment ;( (tk+1 ) ; (tk )). ) The mechanism for receiving and making these payments is the margin account held by the broker. By time T
is F (tk )-measurable.
= tn , you have received the sequence of payments (tk+1 ) ; (tk ) (tk+2 ) ; (tk+1 ) : : : (tn ) ; (tn 1 )
;
at times tk+1 tk+2
: : : tn . The value at time t = t0 of this sequence is
2n 1 X (t) IE 4
;
j =k
3 1 5 (tj +1 ) ( (tj +1 ) ; (tj )) F (t) :
Because it costs nothing to enter the future contract at time t, this expression must be zero almost surely.

The continuous-time version of this condition is
271
(t) IE
"Z T
t
1 d (u) F (t) = 0 (u)
0 t T:
Note that (tj +1 ) appearing in the discrete-time version is F (t j )-measurable, as it should be when approximating a stochastic integral. Denition 27.1 The T -future price of the stock is any F (t)-adapted stochastic process
f (t) 0 t T g
satisfying
IE
"Z T
t
1 d (u) F (t) = 0 (u)
(T ) = S (T ) a.s., and
(a) (b)
0 t T:
Theorem 3.66 The unique process satisfying (a) and (b) is
(t) = IE S (T ) F (t) 0 t T:
Proof: We rst show that (b) holds if and only if R t 1 d (u) is also a martingale, so 0 (u) is a martingale. If is a martingale, then
IE
"Z T
t
# 1 d (u) F (t) = IE Z t 1 d (u) F (t) ; Z t 1 d (u) (u) 0 (u) 0 (u) = 0:

M (t) = IE
On the other hand, if (b) holds, then the martingale
"Z T
0
1 d (u) F (t) (u)
satises
"Z T # Zt 1 1 d (u) F (t) M (t) = d (u) + IE 0 (u) t (u) Zt 1

=
0
(u)
d (u) 0 t T:
this implies
1 dM (t) = (t) d (t) d (t) = (t) dM (t)
272 and so is a martingale (its differential has no dt term).
Now dene
(t) = IE S (T ) F (t)
0 t T:
Clearly (a) is satised. By the tower property, is the only martingale satisfying (a).
is a martingale, so (b) is also satised. Indeed, this
27.4 Cash ow from a future contract

With a forward contract, entered at time 0, the buyer agrees to pay F (0) for an asset valued at S (T ). The only payment is at time T . With a future contract, entered at time 0, the buyer receives a cash ow (which may at times be negative) between times 0 and T . If he still holds the contract at time T , then he pays S (T ) at time T for an asset valued at S (T ). The cash ow received between times 0 and T sums to
ZT
0
d (u) = (T ) ; (0) = S (T ) ; (0):

( (0) ; S (T )) + S (T ) = (0)
Thus, if the future contract holder takes delivery at time T , he has paid a total of for an asset valued at S (T ).
27.5 Forward-future spread

Future price: Forward price:
(t) = IE S (T ) F (t)
) F (t) = BSt(tT ) = (
(t)IE
S (t)
1 (T )
F (t)
Forward-future spread:
h 1 (0) ; F (0) = IE S (T )] ; S (0) i

=
IE
IE
(T )
1 (T )
IE
1 IE (S (T )) ; IE S (T ) (T ) (T )
If (1 ) and S (T ) are uncorrelated, T
(0) = F (0):

If (1 ) and S (T ) are positively correlated, then T
273
(0) F (0):
This is the case that a rise in stock price tends to occur with a fall in the interest rate. The owner of the future tends to receive income when the stock price rises, but invests it at a declining interest rate. If the stock price falls, the owner usually must make payments on the future contract. He withdraws from the money market to do this just as the interest rate rises. In short, the long position in the future is hurt by positive correlation between (1 ) and S (T ). The buyer of the future is T compensated by a reduction of the future price below the forward price.
27.6 Backwardation and contango

Suppose Dene
dS (t) = S (t) dt + S (t) dW (t): f W (t) = t + W (t), Z (T ) = expf; W (T ) ; 1 2 T g 2 Z f IP (A) = Z (T ) dIP 8A 2 F (T ):

A
f f Then W is a Brownian motion under I , and P f dS (t) = rS (t) dt + S (t) dW (t):

We have
Because (1 ) T
f = e rT is nonrandom, S (T ) and (1T ) are uncorrelated under IP . Therefore,

;
(t) = ert 1 S (t) = S (0) expf( ; 2 2)t + W (t)g f = S (0) expf(r ; 1 2 )t + W (t)g 2
f (t) = IE S (T ) F (t)]
= F (t) ) = BSt(tT ) = er(T t)S (t): (
;
The expected future spot price of the stock under IP is
IES (T ) = S (0)e T IE exp ; 1 2T + W (T ) 2 T S (0): =e
oi
274 The future price at time 0 is
If > r, then (0) < IES (T ): This situation is called normal backwardation (see Hull). If then (0) > IES (T ). This is called contango.
(0) = erT S (0):
< r,
Chapter 28
Term-structure models
Throughout this discussion, fW (t) 0 t T g is a Brownian motion on some probability space ( F P), and fF (t) 0 t T g is the ltration generated by W .
Suppose we are given an adapted interest rate process fr(t) lation factor Z
0 t T g. We dene the accumu-
(t) = exp
r(u) du
0 t T:
In a term-structure model, we take the zero-coupon bonds (zeroes) of various maturities to be the primitive assets. We assume these bonds are default-free and pay $1 at maturity. For 0 t T T , let B(t T ) = price at time t of the zero-coupon bond paying $1 at time T . Theorem 0.67 (Fundamental Theorem of Asset Pricing) A term structure model is free of arbif trage if and only if there is a probability measure I on (a risk-neutral measure) with the same P probability-zero sets as IP (i.e., equivalent to IP ), such that for each T 2 (0 T ], the process
B(t T ) (t)
0 t T
f is a martingale under I . P
Remark 28.1 We shall always have
dB (t T ) = (t T )B (t T ) dt + (t T )B(t T ) dW (t) 0 t T
for some functions
(t T ) and (t T ). Therefore
1 1 d B(t(tT ) = B(t T ) d (t) + (t) dB(t T ) ) = (t T ) ; r(t)] B (t(tT ) dt + (t T ) B (t tT ) dW (t) ) ()

275
276 so IP is a risk-neutral measure if and only if (t T ), the mean rate of return of B (t T ) under IP , is the interest rate r(t). If the mean rate of return of B (t T ) under IP is not r(t) at each time t and for f each maturity T , we should change to a measure I under which the mean rate of return is r(t). If P such a measure does not exist, then the model admits an arbitrage by trading in zero-coupon bonds.
28.1 Computing arbitrage-free bond prices: rst method

Begin with a stochastic differential equation (SDE)
dX (t) = a(t X (t)) dt + b(t X (t)) dW (t):

The solution X (t) is the factor. If we want to have n-factors, we let W be an n-dimensional Brownian motion and let X be an n-dimensional process. We let the interest rate r(t) be a function of X (t). In the usual one-factor models, we take r(t) to be X (t) (e.g., Cox-Ingersoll-Ross, HullWhite). Now that we have an interest rate process prices to be
fr(t) 0 t )
T g, we dene the zero-coupon bond
B (t T ) = (t) IE (1T ) F (t) " (

= IE exp ;
ZT
t
r(u) du F (t)
0 t T
T:
We showed in Chapter 27 that
dB (t T ) = r(t)B(t T ) dt + (t) (t) dW (t)

for some process . Since B (t and there is no arbitrage.
T ) has mean rate of return r(t) under IP , IP is a risk-neutral measure
28.2 Some interest-rate dependent assets

Coupon-paying bond: Payments P1
P2 : : : Pn at times T1 T2 : : : Tn. Price at time t is
k:t<Tk
Pk B (t Tk ):
g
Call option on a zero-coupon bond: Bond matures at time Price at time t is
T.
Option expires at time T1
< T.
(t) IE
1 (B (T T ) ; K )+ F (t) 1 (T1)
0 t T1:
CHAPTER 28. Term-structure models
277
28.3 Terminology
Denition 28.1 (Term-structure model) Any mathematical model which determines, at least theoretically, the stochastic processes
B(t T ) 0 t T
for all T
2 (0 T ].
t T T
, the yield to maturity
Denition 28.2 (Yield to maturity) For 0 F (t)-measurable random-variable satisfying
Y (t T ) is the
B(t T ) exp f(T ; t)Y (t T )g = 1

or equivalently,
Y (t T ) = ; T 1 t log B(t T ): ; B (t T ) 0 t T Y (t T ) 0 t T T T:
Determining is equivalent to determining
28.4 Forward rate agreement

Let 0 t T < T + T be given. Suppose you want to borrow $1 at time T with repayment (plus interest) at time T + , at an interest rate agreed upon at time t. To synthesize a forward-rate agreement to do this, at time t buy a T -maturity zero and short BBt(t T ) ) (T + )-maturity zeroes. ( T+ The value of this portfolio at time t is
Bt B(t T ) ; B(t (T T ) ) B(t T + ) = 0: +

At time T , you receive $1 from the T -maturity zero. At time T + , you pay $ BBt(t T ) ) . The ( T+ effective interest rate on the dollar you receive at time T is R(t T T + ) given by
B (t T ) = expf R(t T T + )g B (t T + )
or equivalently,
R(t T T + ) = ; log B (t T + ) ; log B (t T ) : @ f (t T ) = lim R(t T T + ) = ; @T log B(t T ):
The forward rate is
#0
(4.1)
278 This is the instantaneous interest rate, agreed upon at time t, for money borrowed at time T . Integrating the above equation, we obtain
ZT
t
f (t u) du = ;
ZT @ @u log B(t u) du
t u=T u=t
= ; log B (t u) = ; log B (t T )
so
( ZT ) B(t T ) = exp ; f (t u) du :
t
You can agree at time t to receive interest rate f (t u) at each time u 2 t T ]. If you invest $ B (t at time t and receive interest rate f (t u) at each time u between t and T , this will grow to
T)
B(t T ) exp
at time T .
(Z T
t
f (t u) du = 1
28.5 Recovering the interest r (t) from the forward rate
@ @T B(t T ) T =t = IE ;r(t) F (t) = ;r(t):

On the other hand,
( ZT ) # B (t T ) = IE exp ; r(u) du F (t) t " ( ) # @ B (t T ) = IE ;r(T ) exp ; Z T r(u) du F (t) @T

t
"
@ @T B(t T ) T =t = ;f (t t): Conclusion: r(t) = f (t t).
( ZT ) B (t T ) = exp ; f (t u) du t ( ) @ B (t T ) = ;f (t T ) exp ; Z T f (t u) du @T
t
279
28.6 Computing arbitrage-free bond prices: Heath-Jarrow-Morton method

For each T
2 (0 T ], let the forward rate be given by Zt Zt

f (t T ) = f (0 T ) +
0
(u T ) du +
Here f
(u T ) 0 u T g and f (u T ) 0 u T g are adapted processes.
(u T ) dW (u)
0 t T:
In other words, Recall that
df (t T ) = (t T ) dt + (t T ) dW (t):
( ZT ) B(t T ) = exp ; f (t u) du :
t
Now
( ZT ) ZT d ; f (t u) du = f (t t) dt ; df (t u) du t t ZT = r(t) dt ; (t u) dt + (t u) dW (t)] du t "Z T # "Z T # (t u) du dW (t) = r(t) dt ; (t u) du dt ; t t {z } | {z } |

= r(t) dt ; (t T ) dt ; (t T ) dW (t):
(t T ) (t T )
Let Then
g(x) = ex g (x) = ex g (x) = ex:

0 00
B (t T ) = g ;
and
ZT
t
f (t u) du
! !
dt ; dW )
dB (t T ) = dg ;
=g ;
0
ZT
t
00
t ZT
f (t u) du
f (t u) du (r dt ;
1g 2
= B (t T ) r(t) ; (t T ) + 1 ( (t T ))2 dt 2 ; (t T )B(t T ) dW (t):
ZT
t
f (t u) du ( )2 dt
280
28.7 Checking for absence of arbitrage
IP is a risk-neutral measure if and only if (t T ) = 1 ( (t T ))2 2

i.e.,
0 t T
T T: T:
(7.1)
ZT
t
(t u) du =
1 2
ZT
t
(t u) du
!2
0 t T 0 t T
Differentiating this w.r.t.
T , we obtain
(t T ) = (t T )
ZT
t
(t u) du
(7.2)
Not only does (7.1) imply (7.2), (7.2) also implies (7.1). This will be a homework problem. Suppose (7.1) does not hold. Then IP is not a risk-neutral measure, but there might still be a riskneutral measure. Let f (t) 0 t T g be an adapted process, and dene
f W (t) = f IP (A) =
Then
Zt
0
(u) du + W (t)
Z (t) = exp ;
Zt
0
(u) dW (u) ;
Z (T ) dIP 8A 2 F (T ):
Zt 2 1 2 0 (u) du i
dB (t T ) = B(t T ) r(t) ; (t T ) + 1 ( (t T ))2 dt 2 ; (h T )B(t T ) dW (t) t i = B (t T ) r(t) ; (t T ) + 1 ( (t T ))2 + (t T ) (t) dt 2 f ; (t T )B(t T ) dW (t) 0 t T: f In order for B (t T ) to have mean rate of return r(t) under I , we must have P (t T ) = 1 ( (t T ))2 + (t T ) (t) 0 t T T : 2 Differentiation w.r.t. T yields the equivalent condition (t T ) = (t T ) (t T ) + (t T ) (t) 0 t T T :
Theorem 7.68 (Heath-Jarrow-Morton) For each T 2 (0 (u T ) 0 u T , be adapted processes, and assume f (0 T ) 0 t T , be a deterministic function, and dene
(7.3)
(7.4) and Let
T ], let (u T ) 0 u T (u T ) > 0 for all u and T .

(u T ) dW (u):
f (t T ) = f (0 T ) +
Zt
0
(u T ) du +
Zt
0
281
t T T is a family of forward rate processes for a term-structure model Then f (t T ) 0 without arbitrage if and only if there is an adapted process (t) 0 t T , satisfying (7.3), or equivalently, satisfying (7.4).
Remark 28.2 Under IP , the zero-coupon bond with maturity T has mean rate of return
r(t) ; (t T ) + 1 ( (t T ))2 2
and volatility
(t T ). The excess mean rate of return, above the interest rate, is

1 ; (t T ) + 2 ( (t T ))2
and when normalized by the volatility, this becomes the market price of risk
1 ; (t T ) + 2 ( (t T ))2 :
(t T )
The no-arbitrage condition is that this market price of risk at time t does not depend on the maturity T of the bond. We can then set
; (t T ) + 1 ( (t T ))2 2 (t) = ; (t T )
and (7.3) is satised. (The remainder of this chapter was taught Mar 21)
"
Suppose the market price of risk does not depend on the maturity T , so we can solve (7.3) for . Plugging this into the stochastic differential equation for B (t T ), we obtain for every maturity T :
f dB(t T ) = r(t)B(t T ) dt ; (t T )B(t T ) dW (t):

Because (7.4) is equivalent to (7.3), we may plug (7.4) into the stochastic differential equation for f (t T ) to obtain, for every maturity T :
df (t T ) = (t T ) (t T ) + (t T ) (t)] dt + (t T ) dW (t) f = (t T ) (t T ) dt + (t T ) dW (t):

28.8 Implementation of the Heath-Jarrow-Morton model
Choose
(t T ) 0 t T T (t) 0 t T :
282 These may be stochastic processes, but are usually taken to be deterministic functions. Dene
(t T ) = (t T ) (t T ) + (t T ) (t)
f W (t) =
Zt
0
(u) du + W (t)
Z f(A) = Z (T ) dIP 8A 2 F (T ): IP
0
Let f (0
Z (t) = exp ;
A
Zt
(u) dW (u) ; 1 2
Zt
0
2 (u) du
T) 0 T
be determined by the market; recall from equation (4.1):
@ f (0 T ) = ; @T log B (0 T ) 0 T T :
Then f (t
T ) for 0 t T is determined by the equation f df (t T ) = (t T ) (t T ) dt + (t T ) dW (t) r(t) = f (t t) 0 t T
(8.1)
this determines the interest rate process (8.2)
and then the zero-coupon bond prices are determined by the initial conditions B (0 , gotten from the market, combined with the stochastic differential equation
T) 0
T
(8.3)
f dB(t T ) = r(t)B(t T ) dt ; (t T )B(t T ) dW (t):
f Because all pricing of interest rate dependent assets will be done under the risk-neutral measure I , P f is a Brownian motion, we have written (8.1) and (8.3) in terms of W rather than f under which W W . Written this way, it is apparent that neither (t) nor (t T ) will enter subsequent computations. The only process which matters is (t T ) 0 t T T , and the process
(t T ) =
obtained from From (8.3) we see that (t T . Equation (8.4) implies This is because B (T vanish.
ZT
t
(t u) du
0 t T
(8.4)
(t T ).
T ) is the volatility at time t of the zero coupon bond maturing at time

(T T ) = 0 0 T
T:
(8.5)
T ) = 1 and so as t approaches T (from below), the volatility in B (t T ) must T) 0

(t T ) 0 t T
In conclusion, to implement the HJM model, it sufces to have the initial market data B (0 T T and the volatilities
T:

We require that
283
(t T ) be differentiable in T and satisfy (8.5). We can then dene (t T ) = @
@T (t T )
and (8.4) will be satised because
ZT @ (t T ) = (t T ) ; (t t) = @u (t u) du:
t
f f We then let W be a Brownian motion under a probability measure I , and we let B (t T ) 0 t P T T , be given by (8.3), where r(t) is given by (8.2) and f (t T ) by (8.1). In (8.1) we use the initial conditions f f P Remark 28.3 It is customary in the literature to write W rather than W and IP rather than I , so that IP is the symbol used for the risk-neutral measure and no reference is ever made to the market measure. The only parameter which must be estimated from the market is the bond volatility (t T ), and volatility is unaffected by the change of measure.
@ f (0 T ) = ; @T log B (0 T ) 0 T T :
284
Chapter 29
Gaussian processes
Denition 29.1 (Gaussian Process) A Gaussian process X (t), t 0, is a stochastic process with the property that for every set of times 0 t1 t2 : : : tn , the set of random variables
X (t1) X (t2) : : : X (tn)
is jointly normally distributed. Remark 29.1 If X is a Gaussian process, then its distribution is determined by its mean function
m(t) = IEX (t)

and its covariance function
(s t) = IE (X (s) ; m(s)) (X (t) ; m(t))]: Indeed, the joint density of X (t 1) : : : X (tn) is IP fX (t1) 2 dx1 : : : X (tn) 2 dxng n o 1p exp ; 1 (x ; m(t)) 1 (x ; m(t))T dx1 : : : dxn = 2 (2 )n=2 det
;
where
is the covariance matrix
x is the row vector
::: 5 (tn t1) (tn t2 ) : : : (tn tn ) x1 x2 : : : xn], t is the row vector t1 t2 : : : tn ], and m(t) = m(t1) m(t2) : : : m(tn)]. IE exp
2 6 (t1 t1) (t1 t2) :: :: :: = 6 (t2: :t1) (t2: :t2 ) : : : 6 : 4 : )
(t1 tn ) (t2 tn ) 7 7 7
The moment generating function is
(X n
where
u = u1 u2 : : : un].
k=1
uk X (tk ) = exp u m(t)T + 1 u 2
uT
285
286
29.1 An example: Brownian Motion

Brownian motion W is a Gaussian process with m(t) = 0 and then
(s t) = s ^ t. Indeed, if 0 s t,
(s t) = IE W (s)W (t)] = IE W (s) (W (t) ; W (s)) + W 2 (s) = IEW (s):IE (W (t) ; W (s)) + IEW 2 (s) = IEW 2(s) = s ^ t:
To prove that a process is Gaussian, one must show that X (t 1) : : : X (tn) has either a density or a moment generating function of the appropriate form. We shall use the m.g.f., and shall cheat a bit by considering only two times, which we usually call s and t. We will want to show that
IE exp fu1 X (s) + u2X (t)g = exp u1 m1 + u2 m2 +

Theorem 1.69 (Integral w.r.t. a Brownian) Let dom function. Then Z
1 u1 u2 ] 2
"
11 21
12 22
# " #)
u1 u2
W (t) be a Brownian motion and (t) a nonran(u) dW (u)

^
X (t) =
is a Gaussian process with m(t) = 0 and
(s t) =
Proof: (Sketch.) We have Therefore,
Zs
0
2(u) du:
dX = dW:
deuX (s) = ueuX (s) (s) dW (s) + 1 u2euX (s) 2 (s) ds 2 Zs Zs uX (s) = euX (0) + u euX (v) (v ) dW (v ) + 1 u2 euX (v) 2 (v ) dv e 2 IEeuX (s) = 1 + 1 u2 2
| 0 Zs
0
Martingale
{z
d IEeuX (s) = 1 u2 2(s)IEeuX (s) 2 ds Zs uX (s) = euX (0) exp 1 u2 2(v ) dv IEe 2
= exp
2(v ) dv 1 u2 2 0
2(v )IEeuX (v) dv
Zs
(1.1)
Rs This shows that X (s) is normal with mean 0 and variance 0 2(v ) dv .
CHAPTER 29. Gaussian processes
287
s < t be given. Just as before, deuX (t) = ueuX (t) (t) dW (t) + 1 u2euX (t) 2 (t) dt: 2 Integrate from s to t to get
Now let 0
euX (t) = euX (s) + u

Take IE
Zt
IE
Zt
s
: : : jF (s)] conditional expectations and use the martingale property

(v )euX (v) dW (v ) F (s) = IE =0
Zt uX (v) dW (v ) + 1 u2 2 (v )euX (v) dv: (v )e 2 s s Zt

0
(v )euX (v) dW (v ) F (s)
Zs
0
(v )euX (v) dW (v )
to get
The solution to this ordinary differential equation with initial time s is
d uX (t) F (s) = 1 u2 2(t)IE euX (t) F (s) 2 dt IE e IE euX (t) F (s) = euX (s) exp 1 u2 2
IE euX (t) F (s) = euX (s) + 1 u2 2
Zt
s
2 (v )IE
euX (v) F (s) dv t s: t s:

(1.2)
Zt
s
2(v ) dv
We now compute the m.g.f. for (X (s)
X (t)), where 0 s t:
IE eu1 X (s)+u2X (t) F (s) = eu1 X (s)IE eu2X (t) F (s)
t 2 (1.2) (u1 +u2 )X (s) 1 (v ) dv = e exp 2 u2 2 s h i IE eu1 X (s)+u2 X (t) = IE IE eu1X (s)+u2 X (t) F (s)
= IE e(u1 +u2 )X (s) : exp
This shows that (X (s)
X (t)) is jointly normal with IEX (s) = IEX (t) = 0,
t 2 1 u2 2 2 s (v ) dv Zs Zt (1.1) 2 (v ) dv + 1 u2 2 = exp 1 (u1 + u2 )2 2 2 2 s (v ) dv 0 Zs Zt 2 + 2u u ) 2(v ) dv + 1 u2 2 1 (u = exp 2 1 1 2 2 2 0 (v ) dv "R s 2 0R s 2# " #) ( 1 = exp 2 u1 u2 ] R0s 2 R0t 2 u1 : u2 0 0
IEX 2(s) =
Zs
0
2 (v ) dv
IE X (s)X (t)] =
Zs
0
IEX 2(t) =
Zt
0
2 (v ) dv
2(v ) dv:
288
Remark 29.2 The hard part of the above argument, and the reason we use moment generating functions, is to prove the normality. The computation of means and variances does not require the use of moment generating functions. Indeed,
X (t) =
is a martingale and X (0) = 0, so For xed s
Zt
0
(u) dW (u)
0,
m(t) = IEX (t) = 0 8t 0: IEX 2(s) =
Zs
0
2(v ) dv
by the It isometry. For 0 o
s t,
IE X (s)(X (t) ; X (s))] = IE IE X (s)(X (t) ; X (s)) F (s)
2 3 6 7 = IE 6X (s) IE X (t) F (s) ; X (s) 7 6 7 4 | {z }5

= 0:
0
Therefore,
IE X (s)X (t)] = IE X (s)(X (t) ; X (s)) + X 2(s)] Zs 2 (v ) dv: = IEX 2(s) =

0
If were a stochastic proess, the It isometry says o
IE 2(v ) dv and the same argument used above shows that for 0 s t, Zs 2(s) = IE X (s)X (t)] = IEX IE 2(v) dv: 0 However, when is stochastic, X is not necessarily a Gaussian process, so its distribution is not
0
determined from its mean and covariance functions. Remark 29.3 When is nonrandom,
IEX 2(s) =
Zs
X (t) =
Zt
0
(u) dW (u)
is also Markov. We proved this before, but note again that the Markov property follows immediately from (1.2). The equation (1.2) says that conditioned on F (s), the distribution of X (t) depends only R on X (s); in fact, X (t) is normal with mean X (s) and variance st 2(v ) dv .

y= v=
289
z
z z
s y t (a)
y=
s v (b)
s v y t (c)
Figure 29.1: Range of values of y
z v for the integrals in the proof of Theorem 1.70.
Theorem 1.70 Let Dene
W (t) be a Brownian motion, and let (t) and h(t) be nonrandom functions. X (t) =
Zt
0
(u) dW (u)
Y (t) =
Zt
0
h(u)X (u) du:
Then Y is a Gaussian process with mean function m Y (t) = 0 and covariance function
Y (s
t) =
Zs
0
2 (v )
Zs
v
h(y ) dy
Zt
v
h(y) dy dv:
(1.3)
Proof: (Partial) Computation of Y (s t): Let 0 s t be given. It is shown in a homework problem that (Y (s) Y (t)) is a jointly normal pair of random variables. Here we observe that
mY (t) = IEY (t) =

and we verify that (1.3) holds.
Zt
0
h(u) IEX (u) du = 0
290 We have
Y (s
t) = IE Y (s)Y (t)] Zs Zt = IE h(y )X (y ) dy: h(z )X (z) dz

= IE = = = =
Z s0 Z t
0 Z sZ t 0
h(y)h(z )X (y)X (z ) dy dz
Z0s Z0t Z0s Z0t

0 Z z
h(y )h(z)IE X (y )X (z )] dy dz h(y )h(z )
Zy
0 Z
2 (v ) dv dy dz
Zs
0
sZ s
0 y
h(y )h(z )
h(z )
Zt
z
h(y )h(z) h(y ) dy
Zy
0
2 (v ) dv
dy dz dz dy dz dy
(See Fig. 29.1(a))
Zz
0
2(v ) dv
Z sZ
0 Z 0
Zs
0 z
h(y )
Zs
y
h(z ) dz
Zy
0
2 (v ) dv
Z sZ
0 Z v
sZ y
h(z )
2 (v )
Zt
z
2(v ) dv
h(y ) dy dv dz
0 0 s h(z ) 2(v )
h(y )
2(v )
Zs
y
Zt
z
h(z ) dz dv dy h(z) dz dy dv
Zs
0
sZ s
h(y ) dy dz dv
0 v
Z0s 2 (v ) = 0 Zs 2 (v ) = 0
Zs
Zs
0
2 (v )
Z sZ t
v Z z
h(y ) 2(v )
sZ s
Zs
y
(See Fig. 29.1(b))
h(y )h(z) dy dz dv h(y )h(z) dz dy dv h(y )h(z) dy dz dv
2 (v )
2 (v )
v y sZ t
Zv s v Zv
v s
h(y ) dy h(y ) dy
Zt Zv
v
(See Fig. 29.1(c))
h(z ) dz dv h(y ) dy dv
Remark 29.4 Unlike the process
X (t) =
R t (u) dW (u), the process Y (t) = R t X (u) du is 0 0

neither Markov nor a martingale. For 0
291
IE Y (t)jF (s)] =
Zs
0
s < t,
h(u)X (u) du + IE
Zt
s
= Y (s) + = Y (s) +
Zt Zt
s s
h(u)X (u) du F (s)
h(u)IE X (u) F (s)] du h(u)X (s) du
= Y (s) + X (s)
Zt
s
h(u) du Y (t)jF (s)] is
where we have used the fact that X is a martingale. The conditional expectation IE not equal to Y (s), nor is it a function of Y (s) alone.
292
Chapter 30
Hull and White model

Consider where
(t), (t) and
dr(t) = ( (t) ; (t)r(t)) dt + (t) dW (t) (t) are nonrandom functions of t. K (t) =
We can solve the stochastic differential equation. Set
Zt
0
(u) du:
Then
d eK (t)r(t) = eK(t) (t)r(t) dt + dr(t) = eK (t) ( (t) dt + (t) dW (t)) :

Integrating, we get
eK (t)r(t) = r(0) +
so
Zt
0
eK(u) eK(u)
(u) du +
Zt
0
eK (u) (u) dW (u) eK(u) (u) dW (u) :
r(t) = e
K (t)
r(0) +
Zt
0
(u) du +
Zt
0
From Theorem 1.69 in Chapter 29, we see that r(t) is a Gaussian process with mean function
mr (t) = e
and covariance function
K (t)
r(0) +
;
Zt
0
^
eK(u) (u) du
(0.1)
r (s
t) = e
K (s) K (t)
Zs
0
t 2K (u) 2 e (u) du:
(0.2)
The process r(t) is also Markov.
293
294
RT We want to study 0
r(t) dt. To do this, we dene
X (t) =
Then
Zt
0
eK (u) (u) dW (u) Y (T ) =

K (t)
ZT
0
K (t) X (t) dt:
ZT
0
r(t) = e r(t) dt =
ZT
0
r(0) +
K (t)
Zt
0
eK(u) (u) du + e
r(0) +
Zt
0
K (t) X (t)
eK (u) (u) du dt + Y (T ):
RT According to Theorem 1.70 in Chapter 29, 0

IE
and its variance is
ZT
0
r(t) dt =
ZT
0
r(t) dt is normal. Its mean is r(0) +
K (t)
Zt
0
eK (u) (u) du dt
(0.3)
var
ZT
0
r(t) dt = IEY 2(T )

=
ZT
0
e2K(v) 2(v )
ZT
v
K (y) dy
!2
dv:
The price at time 0 of a zero-coupon bond paying $1 at time T is
( ZT ) B (0 T ) = IE exp ; r(t) dt 0 ( !) ZT ZT 1 (;1)2 var = exp (;1)IE r(t) dt + 2 r(t) dt 0 0 ZT ZTZ t = exp ;r(0) e K (t) dt ; e K (t)+K (u) (u) du dt 0 0 0 !2 ZT ZT
; ;
1 +2
= expf;r(0)C (0 T ) ; A(0 T )g
where
e2K (v) 2 (v )
K (y)
dy
dv
C (0 T ) = A(0 T ) =
ZT
0 0
Z TZt
0
K (t) dt
K (t)+K (u)
(u) du dt ; 1 2
ZT
0
e2K (v) 2(v )
ZT
v
K (y )
dy
!2
dv:
CHAPTER 30. Hull and White model
295
T u
Figure 30.1: Range of values of u
u
t for the integral.
30.1 Fiddling with the formulas
Note that (see Fig 30.1)
ZTZ t
0
= (y = t v = u) =
Therefore,
Z Z
0 TZ T 0 0
K (t)+K (u)
(u) du dt (u) dt du
K (y) dy
u T eK (v)
K (t)+K (u)
(v )
ZT
v
=
dv:
C (0 T ) = e K(y) dy 0 B(0 T ) = exp f;r(0)C (0 T ) ; A(0 T )g :

;
Z T2 Z 4eK(v) (v) T e A(0 T ) = 0 v ZT
K (y)
dy ; 1 e2K (v) 2(v ) 2
ZT
v
!23 e K(y) dy 5 dv
;
Consider the price at time t 2
( ZT ) # B(t T ) = IE exp ; r(u) du F (t) :

t
0 T ] of the zero-coupon bond:
"
Because r is a Markov process, this should be random only through a dependence on r(t). In fact,
B (t T ) = exp f;r(t)C (t T ) ; A(t T )g
296 where
C (t T ) = eK(t)
Z T2 Z 4eK(v) (v) T e A(t T ) = t v ZT

t
K (y) dy
; 1 e2K(v) 2(v) 2
ZT
v
!23 e K(y) dy 5 dv
;
K (y) dy:
The reason for these changes is the following. We are now taking the initial time to be t rather than RT R zero, so it is plausible that 0 : : : dv should be replaced by tT : : : dv: Recall that
K (v ) =
and this should be replaced by
Zv
0
(u) du
K (v ) ; K (t) =
Zv
t
(u) du:
Similarly, K (y ) should be replaced by K (y ) ; K (t). Making these replacements in A(0 see that the K (t) terms cancel. In C (0 T ), however, the K (t) term does not cancel.
T ), we
30.2 Dynamics of the bond price

Let Ct(t
T ) and At (t T ) denote the partial derivatives with respect to t. From the formula

we have
dB (t T ) = B (t T ) ;C (t T ) dr(t) ; 1 C 2(t T ) dr(t) dr(t) ; r(t)Ct(t T ) dt ; At (t T ) dt 2

= B (t T ) ; C (t T ) ( (t) ; (t)r(t)) dt
; C (t T ) (t) dW (t) ; 1 C 2(t T ) 2(t) dt 2 ; r(t)Ct(t T ) dt ; At(t T ) dt :

Because we have used the risk-neutral pricing formula
) # ( ZT B(t T ) = IE exp ; r(u) du F (t) t "

dB(t T ) = r(t)B(t T ) dt + (: : : ) dW (t):
to obtain the bond price, its differential must be of the form

Therefore, we must have
297
;C (t T ) ( (t) ; (t)r(t)) ; 1 C 2(t T ) 2(t) ; r(t)Ct(t T ) ; At(t T ) = r(t): 2

We leave the verication of this equation to the homework. After this verication, we have the formula
dB (t T ) = r(t)B (t T ) dt ; (t)C (t T )B(t T ) dW (t): In particular, the volatility of the bond price is (t)C (t T ).
30.3 Calibration of the Hull & White model
Recall:
C (t T ) = eK(t) e K(y) dy t B (t T ) = exp f;r(t)C (t T ) ; A(t T )g :

;
ZT2 Z 4eK(v) (v) T e A(t T ) = t v ZT

0
dr(t) = ( (t) ; (t)r(t)) dt + (t) dB(t) Zt K (t) = (u) du

;
K (y)
dy ; 1 e2K (v) 2 (v ) 2
ZT
v
!2 3 e K (y) dy 5 dv
;
Suppose we obtain B (0 determine the functions
T ) for all T 2 0 T ] from market data (with some interpolation). Can we (t), (t), and (t) for all t 2 0 T ]? Not quite. Here is what we can do.
;
We take the following input data for the calibration: 1. 2. 3. 4. 5.
B (0 T ) 0 T T r(0);
(0); (t) 0 t T
(usually assumed to be constant);
(0)C (0 T ) 0 T
, i.e., the volatility at time zero of bonds of all maturities.
Step 1. From 4 and 5 we solve for
C (0 T ) =
ZT
0
K (y)
dy:
298 We can then compute
We now have
@ C (0 T ) = e K(T ) @T @ =) K (T ) = ; log @T C (0 T ) @ K (T ) = @ Z T (u) du = (T ): @T @T 0 (T ) for all T 2 0 T ].

;
Step 2. From the formula
B(0 T ) = expf;r(0)C (0 T ) ; A(0 T )g we can solve for A(0 T ) for all T 2 0 T ]. Recall that
Z T2 ZT A(0 T ) = 4eK (v) (v ) e

0
K (y) dy
; 1 e2K(v) 2(v) 2
T
as follows:
ZT
v
!23 e K (y) dy 5 dv:

;
We can use this formula to determine
ZT @ A(0 T ) = K (v) (v )e K (T ) ; e2K (v) 2 (v )e K (T ) e e K (y) dy dv @T 0 v !# Z " ZT K (T ) @ A(0 T ) = T eK (v) (v ) ; e2K (v) 2 (v ) K (y) dy e @T e dv 0 v @ eK (T ) @ A(0 T ) = eK (T ) (T ) ; Z T e2K(v) 2(v ) e K (T ) dv @T @T 0 @ eK (T ) @ A(0 T ) = e2K (T ) (T ) ; Z T e2K(v) 2 (v ) dv eK(T ) @T @T 0 @ eK(T ) @ eK (T ) @ A(0 T ) = (T )e2K (T ) + 2 (T ) (T )e2K(T ) ; e2K (T ) 2(T ) 0 T T : @T @T @T
; ; ; ; ; 0 0
(T ) 0 T
ZT"
!#
This gives us an ordinary differential equation for , i.e.,
(t)e2K (t) + 2 (t) (t)e2K (t) ; e2K (t) 2(t) = known function of t:
From assumption 4 and step 1, we know all the coefcients in this equation. From assumption 3, we have the initial condition (0). We can solve the equation numerically to determine the function (t) 0 t T . Remark 30.1 The derivation of the ordinary differential equation for (t) requires three differentiations. Differentiation is an unstable procedure, i.e., functions which are close can have very different derivatives. Consider, for example,
f (x) = 0 8x 2 IR g (x) = sin(1000x) 8x 2 IR: 100

Then
299
1 jf (x) ; g(x)j 100 8x 2 IR

but because
g (x) = 10 cos(1000x)
0
we have for many values of x.
jf (x) ; g (x)j = 10
0 0
Assumption 5 for the calibration was that we know the volatility at time zero of bonds of all maturities. These volatilities can be implied by the prices of options on bonds. We consider now how the model prices options.
30.4 Option on a bond

Consider a European call option on a zero-coupon bond with strike price K and expiration time T 1 . The bond matures at time T2 > T1. The price of the option at time 0 is R T1 r(u) du ; +
R T1 r(u) du 0 (expf;r(T1)C (T1 T2) ; A(T1 T2)g ; K )+ : Z Z + = e x expf;yC (T1 T2) ; A(T1 T2)g ; K f (x y) dx dy RT where f (x y ) is the joint density of 0 1 r(u) du r(T1) . R We observed at the beginning of this Chapter (equation (0.3)) that 0T1 r(u) du is normal with "Z T1 # Z T1
= IEe
; 1 1 ; ;1 ;1
IE e
(B (T1 T2) ; K )
1 = IE
4
r(u) du =
=
Z T1
0
IEr(u) du r(0)e
;
2 = var 1
4
"Z T1
0
r(u) du =
# Z0T1
K (v) + e K (v)
;
Zv
0
e2K (v) 2(v )
Z T1
v
eK(u) (u) du dv
K (y) dy
!2
dv:
We also observed (equation (0.1)) that r(T1) is normal with

4 ; ;
Z T1 = IEr(T1) = r(0)e K (T1) + e K (T1 ) eK (u) (u) du 2

;
2 = var (r(T )) = e 2K (T1 ) 1 2

4
Z T1
0
0 2K (u) 2 (u) du: e
300 In fact,
R T1 r(u) du r(T ) 1 0
is jointly normal, and the covariance is
1 2 = IE
"Z T1
0
= =
Z T1 Z T1
0 0
(r(u) ; IEr(u)) du: (r(T1) ; IEr(T1))
IE (r(u) ; IEr(u)) (r(T1) ; IEr(T1))] du

r (u T1) du
where r (u
T1) is dened in Equation 0.2.
The option on the bond has price at time zero of
;1
;1
expf;yC (T1 T2) ; A(T1 T2)g ; K 2

1
1 p 2 1;
2 2 exp ; 2(1 ; 2) x2 + 2 xy + y 2 2 1 2 1 2
"
#)
dx dy:
(4.1)
The price of the option at time t 2
0 T1] is
IE e
R T1 r(u) du
t
(B (T1 T2) ; K )+ F (t)

;
= IE e
R T1 r(u) du
t
(expf;r(T1)C (T1 T2) ; A(T1 T2)g ; K )+ F (t)
(4.2)
Because of the Markov property, this is random only through a dependence on r(t). To compute R this option price, we need the joint distribution of tT1 r(u) du r(T1) conditioned on r(t). This

pair of random variables has a jointly normal conditional distribution, and
301
1 (t) = IE
"Z T1
t
2 Z 3 !2 T1 2 (t) = IE 4 r(u) du ; 1(t) F (t)5 1 t !2 Z T1 Z T1

=
t
Z T1
t
r(u) du F (t)
;
r(t)e
K (v)+K (t) + e K (v)

;
Zv
t
eK (u) (u) du dv
e2K (v) 2(v )
K (y) dy
dv
2 (t) = IE
Z T1 = r(t)e K (T1 )+K (t) + e K (T1 ) eK(u) (u) du

; ;
r(T1) r(t)
2 (t) = IE 2
(r(T1) ;
2K (T1 )
=e
Z T1
t
(t))2
F (t) !
(t) 1 (t) 2 (t) = IE =

t
" Z T1
t
e2K (u) 2 (u) du
Z T1
Z K (u) K (T1 ) u e2K (v) 2 (v ) dv du: t

;
r(u) du ; 1 (t) (r(T1) ; 2(t)) F (t)
The variances and covariances are not random. The means are random through a dependence on r(t). Advantages of the Hull & White model: 1. Leads to closed-form pricing formulas. 2. Allows calibration to t initial yield curve exactly. Short-comings of the Hull & White model: 1. One-factor, so only allows parallel shifts of the yield curve, i.e.,

so bond prices of all maturities are perfectly correlated. 2. Interest rate is normally distributed, and hence can take negative values. Consequently, the bond price " ( ) #
B (t T ) = IE exp ;
ZT
t
r(u) du F (t)
can exceed 1.
302
Chapter 31
Cox-Ingersoll-Ross model
In the Hull & White model, r(t) is a Gaussian process. Since, for each t, r(t) is normally distributed, there is a positive probability that r(t) < 0. The Cox-Ingersoll-Ross model is the simplest one which avoids negative interest rates. We begin with a d-dimensional Brownian motion (W 1 W2 constants. For j = 1 : : : d, let Xj (0) 2 IR be given so that
: : : Wd ) .
Let
> 0 and > 0 be
2 2 2 X1 (0) + X2 (0) + : : : + Xd (0) 0

and let Xj be the solution to the stochastic differential equation
dXj (t) = ; 1 Xj (t) dt + 1 dWj (t): 2 2 Xj is called the Orstein-Uhlenbeck process. It always has a drift toward the origin. The solution to
this stochastic differential equation is
Xj (t) = e
1 2 t
Xj (0) + 1 2
;
Zt
0
e 2 u dWj (u) :
This solution is a Gaussian process with mean function
mj (t) = e
and covariance function
1 2 t Xj (0)
^
(s t) = 1 2 e 4
4
1 (s+t) Z s t u 2 e 0
du:
Dene
2 2 2 r(t) = X1 (t) + X2 (t) + : : : + Xd (t): 2 If d = 1, we have r(t) = X1 (t) and for each t, IP fr(t) > 0g = 1, but (see Fig. 31.1)
IP
There are innitely many values of t > 0 for which r(t) = 0 303
= 1
304
r(t) = X 1 (t)
2
t x2 ( X (t), X (t) ) 2 1
Figure 31.1: r(t) can be zero. If d
2, (see Fig. 31.1)
IP fThere is at least one value of t > 0 for which r(t) = 0g = 0:

Let f (x1
x2 : : : xd) = x2 + x2 + : : : + x2 . Then 1 2 d fxi = 2xi
fxi xj = 2
0
d X i=1
if i = j if i 6= j:
It s formula implies o
dr(t) =
=
d X i=1 d X i=1
fxi dXi + 1 2
fxi xi dXi dXi

d X1 i=1 4
2
2Xi ; 1 Xi dt + 1 dWi(t) + 2 2
d X i !=1
dWi dWi
= ; r(t) dt +
2
d q XX = d4 ; r(t) dt + r(t) p i(t) dWi(t): i=1 r(t)

Dene
Xi dWi + d4 dt
d X Z t Xi(u) p dWi(u): W (t) = r(u) i=1 0
CHAPTER 31. Cox-Ingersoll-Ross model

Then W is a martingale,
305
d X Xi dW = pr dWi
dW dW =
d X Xi2 i=1
i=1
r dt = dt
so W is a Brownian motion. We have
! d 2 ; r(t) dt + qr(t) dW (t): dr(t) = 4

dr(t) = ( ; r(t)) dt +
The Cox-Ingersoll-Ross (CIR) process is given by
r(t) dW (t)
We dene
If d happens to be an integer, then we have the representation
d = 4 2 > 0:
d X i=1
r(t) =
Xi2(t) <1 2
2), then
but we do not require d to be an integer. If d < 2 (i.e.,
IP fThere are innitely many values of t > 0 for which r(t) = 0g = 1:

This is not a good parameter choice. If d
2 (i.e.,
1 2 ), then 2
IP fThere is at least one value of t > 0 for which r(t) = 0g = 0:

With the CIR process, one can derive formulas under the assumption that integer, and they are still correct even when d is not an integer. For example, here is the distribution of r(t) for xed t > 0. Let r(0)
;
d=
4 2 is a positive
0 be given. Take
X1(0) = 0 X2(0) = 0 : : : Xd 1(0) = 0 Xd(0) = r(0):

For i = 1
2 : : : d ; 1, Xi (t) is normal with mean zero and variance

2 (t t) = 4 (1 ; e t ):
;
306
Xd (t) is normal with mean

and variance
(t t). Then
md(t) = e
d 1 X
;
1 q 2 t r(0)
r(t) =
Chi-square with d ; 1 = freedom
(t t)
i=1
{z
X p i((tt)t)
4
!2 }
degrees of
+
term
;2
Normal squared and independent of the other
2 Xd (t) | {z }
(0.1)
Thus r(t) has a non-central chi-square distribution.
31.1 Equilibrium distribution of r (t)

As t!1, md (t)!0. We have
d X Xi(t) !2 p r(t) = (t t) : (t t) i=1

2 2
As t!1, we have (t t) = 4 , and so the limiting distribution of r(t) is 4 times a chi-square with d = 4 2 degrees of freedom. The chi-square density with 4 2 degrees of freedom is
f (y ) =
We make the change of variable r = 4
1 2 = 2; 2
2
22
y=2 :
y. The limiting density for r(t) is

2
2
p(r) = 4 2 :
1 2 = 2; 2
2 2
4 r 2
= 22
22
;2 2
r:
We computed the mean and variance of r(t) in Section 15.7.
31.2 Kolmogorov forward equation

Consider a Markov process governed by the stochastic differential equation
dX (t) = b(X (t)) dt + (X (t)) dW (t):
307
-y
Figure 31.2: The function h(y )
Because we are going to apply the following analysis to the case X (t) 0 for all t.
X (t) = r(t), we assume that
0 at time 0. Then X (t) is random with density p(0 t x y ) (in the y We start at X (0) = x variable). Since 0 and x will not change during the following, we omit them and write p(t y ) rather than p(0 t x y ). We have
IEh(X (t)) =
for any function h.
h(y )p(t y ) dy
t and y . We derive it below. Let h(y ) be a smooth function of y 0 which vanishes near y = 0 and for all large values of y (see
Fig. 31.2). It s formula implies o
The Kolmogorov forward equation (KFE) is a partial differential equation in the forward variables
dh(X (t)) = h (X (t))b(X (t)) + 1 h (X (t)) 2(X (t)) dt + h (X (t)) (X (t)) dW (t) 2
0 00 0
so
h(X (t)) = h(X (0)) +
Z th
0
0
Zt
0
h (X (s))b(X (s)) + 1 h (X (s)) 2 (X (s)) ds + 2

0 00
h (X (s)) (X (s)) dW (s)
IEh(X (t)) = h(X (0)) + IE
Z th
0
h (X (s))b(X (s)) dt + 1 h (X (s)) 2 (X (s)) ds 2

0 00
308 or equivalently,
Z
0
h(y)p(t y) dy = h(X (0)) +
Z tZ
1
Zt 1 2 0 0
Z0 0
h (y )b(y )p(s y) dy ds +
0
h (y ) 2 (y)p(s y) dy ds:
00
Differentiate with respect to t to get
h(y)pt(t y) dy =
h (y )b(y )p(t y ) dy + 1 2
0
Z
0
h (y ) 2 (y)p(t y ) dy:
00
Integration by parts yields
Z
0
h (y)b(y)p(t y ) dy = h(y)b(y )p(t y )

0
y=
Z
0
h (y )
00
2(y )p(t
y) dy = h (y)
0
2 (y )p(t =0
=0
{z
y=0 } y=
;
1
Z
0
@ h(y ) @y (b(y )p(t y)) dy

1
{z
y)
y=0 }
@ h (y ) @y
0
2(y )p(t
y ) dy
2(y )p(t
@ = ;h(y ) @y
2(y )p(t =0
{z
y)
y=
y=0 }
Z
0
@2 h(y) @y 2
y ) dy:
Therefore,
Z
0
h(y)pt(t y) dy = ;
Z
0
Z @ @2 h(y ) @y (b(y )p(t y )) dy + 1 h(y ) @y2 2 0

1
2(y )p(t
y ) dy
or equivalently,
Z
0
@ @2 h(y) pt(t y ) + @y (b(y )p(t y )) ; 1 @y2 2 @ @2 pt(t y) + @y ((b(y )p(t y )) ; 1 @y 2 2
"
2 (y )p(t
y)
dy = 0:
This last equation holds for every function h of the form in Figure 31.2. It implies that
2 (y )p(t
y) = 0:
(KFE)
If there were a place where (KFE) did not hold, then we could take points, but take h to be zero elsewhere, and we would obtain
h(y ) > 0 at that and nearby
@ @2 h pt + @y (bp) ; 1 @y2 ( 2p) dy 6= 0: 2
"

If the process X (t) has an equilibrium density, it will be
309
p(y ) = t! p(t y ): lim

1
In order for this limit to exist, we must have
Letting t!1 in (KFE), we obtain the equilibrium Kolmogorov forward equation
0 = t! pt (t y ): lim
1
@ (b(y)p(y )) ; 1 @ 2 2 @y 2 @y
2 (y )p(y )
= 0:
When an equilibrium density exists, it is the unique solution to this equation satisfying
p(y ) 0 8y 0
p(y ) dy = 1:
31.3 Cox-Ingersoll-Ross equilibrium density

We computed this to be
p(r) = Cr
where
;2 2
C = 22
We compute
22
p (r) = ; 22r2
00
2 p (r) = 2 ; : p(rr) ; 2 2 p(r) 2 2 1 2 = 2 r ; 2 ; r p(r)

0
; 1 2 ; r p(r) + 2r (; )p(r) + 2r ; 1 2 ; r p (r) 2 2 2 2 2 ; 1 ( ; 1 2 ; r) ; + 2 ( ; 1 2 ; r)2 p(r) = 2r r 2 2 2r

0
We want to verify the equilibrium Kolmogorov forward equation for the CIR process:
@ (( ; r)p(r)) ; 1 @ 2 ( 2 rp(r)) = 0: 2 @r2 @r
(EKFE)
310 Now
@ (( ; r)p(r)) = ; p(r) + ( ; r)p (r) @r @ 2 ( 2 rp(r)) = @ ( 2 p(r) + 2rp (r)) @r2 @r = 2 2 p (r) + 2rp (r):
0 0 0 00
The LHS of (EKFE) becomes
; p(r) + ( ; r)p (r) ; 2p (r) ; 1 2rp (r) 2 2 ( ; 1 2 ; r) = p(r) ; + ( ; r ; 2 ) 2 r 2

0 0 00
1 + r ( ; 1 2 ; r) + ; 2r ( ; 1 2 ; r)2 2 2 2 = p(r) ( ; 1 2 ; r) 2 ( ; 1 2 ; r) 2 2 2r ; 1 2 2 ( ; 1 2 ; r) =0 1 + r(
2 2 2r ; 1 2 ; r) ; 2 r ( 2 2 1 ; 2 2 ; r)2
as expected.
31.4 Bond prices in the CIR model

The interest rate process r(t) is given by
dr(t) = ( ; r(t)) dt +
where r(0) is given. The bond price process is
r(t) dW (t)
( ZT ) # B(t T ) = IE exp ; r(u) du F (t) :

t
"
Because
( ZT ) # exp ; r(u) du B (t T ) = IE exp ; r(u) du F (t) Zt

0 0
"
the tower property implies that this is a martingale. The Markov property implies that B (t T ) is random only through a dependence on r(t). Thus, there is a function B (r t T ) of the three dummy variables r t T such that the process B (t T ) is the function B (r t T ) evaluated at r(t) t T , i.e.,
B(t T ) = B(r(t) t T ):

Because exp pute
311
n Rt o ; 0 r(u) du B(r(t) t T ) is a martingale, its differential has no dt term. We comZt Z0t

0
d exp ;
= exp ;
r(u) du B (r(t) t T ) r(u) du ;r(t)B(r(t) t T ) dt + Br (r(t) t T ) dr(t) + t T ) dr(t) dr(t) + Bt (r(t) t T ) dt :
1 Brr (r(t) 2
The expression in
: : : ] equals
= ;rB dt + Br ( ; r) dt + Br 1 + 2 Brr 2 r dt + Bt dt:
pr dW
Setting the dt term to zero, we obtain the partial differential equation
; rB(r t T ) + Bt (r t T ) + ( ; r)Br (r t T ) + 1 2rBrr (r t T ) = 0 2

The terminal condition is
0 t < T r 0:
(4.1)
B(r T T ) = 1 r 0: B (r t T ) = e
rC (t T ) A(t T )
;
Surprisingly, this equation has a closed form solution. Using the Hull & White model as a guide, we look for a solution of the form
;
where C (T
T ) = 0 A(T T ) = 0. Then we have Bt = (;rCt ; At)B Br = ;CB Brr = C 2B
and the partial differential equation becomes
0 = ;rB + (;rCt ; At )B ; ( ; r)CB + 1 2 rC 2B 2 1 2 C 2 ) ; B (A t + C ) = rB (;1 ; Ct + C + 2

We rst solve the ordinary differential equation
;1 ; Ct(t T ) + C (t T ) + 1 2C 2(t T ) = 0 C (T T ) = 0 2
and then set
A (t T ) =
ZT
t
C (u T ) du
312 so A(T
T ) = 0 and
At (t T ) = ; C (t T ):
It is tedious but straightforward to check that the solutions are given by
C (t T ) =
sinh( (T ; t)) cosh( (T ; t)) + 1 sinh( (T ; t)) 2 2 3 1 2 log 4 e 2 (T t) 5 A(t T ) = ; 2 cosh( (T ; t)) + 1 sinh( (T ; t)) 2
;
where
1 2
2+2 2
u u sinh u = e ; e 2
;
u u cosh u = e + e : 2
;
Thus in the CIR model, we have
( ZT ) # IE exp ; r(u) du F (t) = B (r(t) t T )

t
"
where
B(r t T ) = exp f;rC (t T ) ; A(t T )g 0 t < T r 0 and C (t T ) and A(t T ) are given by the formulas above. Because the coefcients in dr(t) = ( ; r(t)) dt + r(t) dW (t) do not depend on t, the function B (r t T ) depends on t and T only through their difference = T ; t. Similarly, C (t T ) and A(t T ) are functions of = T ; t. We write B(r ) instead of B (r t T ), and we have B(r ) = exp f;rC ( ) ; A( )g 0 r 0
where
C ( ) = cosh( sinh( 1 )sinh( ) )+ 2 2 1 2 log 4 e2 A( ) = ;

=
We have
1 2
cosh( ) + 1 sinh( ) 2
3 5
2 + 2 2:
( ZT ) B(r(0) T ) = IE exp ; r(u) du :

T ) is strictly decreasing in T . Moreover,
0
Now r(u) > 0 for each u, almost surely, so B (r(0)
B (r(0) 0) = 1
313
lim B (r(0) T ) = IE exp ; T!

1
Z
0
r(u) du = 0:
But also, so
B(r(0) T ) = exp f;r(0)C (T ) ; A(T )g r(0)C (0) + A(0) = 0 lim r(0)C (T ) + A(T )] = 1 T!
1
and
r(0)C (T ) + A(T )
is strictly inreasing in T .
31.5 Option on a bond

The value at time t of an option on a bond in the CIR model is
( Z T1 ) # + F (t) v(t r(t)) = IE exp ; r(u) du (B (T1 T2) ; K ) t
"
where T1 is the expiration time of the option, T 2 is the maturity time of the bond, and 0 t T 1 n R o T2. As usual, exp ; 0t r(u) du v(t r(t)) is a martingale, and this leads to the partial differential equation (where v
= v (t r).) The terminal condition is
;rv + vt + ( ; r)vr + 1 2rvrr = 0 0 t < T1 r 0: 2

v(T1 r) = (B (r T1 T2) ; K )+ r 0:
Other European derivative securities on the bond are priced using the same partial differential equation with the terminal condition appropriate for the particular security.
31.6 Deterministic time change of CIR model

Process time scale: In this time scale, the interest rate r(t) is given by the constant coefcient CIR equation q
dr(t) = ( ; r(t)) dt +
r(t) dW (t):
Real time scale: In this time scale, the interest rate r( t) is given by a time-dependent CIR equation ^^
^ ^r ^ ^ ^ ^ ^ ^ ^ dr(t) = (^(t) ; ^(t)^(t)) dt + ^ (t) r(t) dW (t): ^^
t:
314
Process time
. . . . . . . . . . . . . . . . .
A period of high interest rate volatility
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
^ t = '(t)
^ t:
Real time
Figure 31.3: Time change function. There is a strictly increasing time change function t Fig. 31.3).
^ = '( t) which relates the two time scales (See
^rt ^ ^ ^ Let B (^ ^ T ) denote the price at real time t of a bond with maturity T when the interest rate at time ^ is r. We want to set things up so t ^ ^rt ^ B (^ ^ T ) = B (r t T ) = e
;
rC (t T ) A(t T )
;
^ where t = '(t)
^ T = '(T ), and C (t T ) and A(t T ) are as dened previously. ^ We need to determine the relationship between r and r. We have
( ZT ) B(r(0) 0 T ) = IE exp ; r(t) dt ( Z0T ) ^ ^ B(^(0) 0 T ) = IE exp ; r(t) dt : r ^^ ^

0
With T
^ ^ ^ ^ = '(T ), make the change of variable t = '(t), dt = ' (t) dt in the rst integral to get
0
( Z T^ ) ^ ^ ^ B (r(0) 0 T ) = IE exp ; r('(t))' (t) dt

0
and this will be B (^(0) r
^ 0 T ) if we set ^ ^ r(t) = r('(t)) ' (t): ^^

0
315
31.7 Calibration
^ ^) ^ ^ ^r^ t ^ B (^(t) ^ T ) = B r(t^ '(t) '(T ) ' ) ( (t) ^ ^ C ('(t) '(T )) ; A('(t) '(T )) ^ ^ = exp ;r(t) ^^ ^ ' (t) n o = exp ;r(t)C (t T ) ; A(t T ) ^^^ ^ ^ ^^ ^
0 0
where
^ '^ ^^^ C (t T ) = C ('(t) ^ (T )) ' (t) ^^^ ^ ^ A(t T ) = A('(t) '(T ))

0
^ do not depend on ^ and T only through t are time dependent. ^r ^ Suppose we know r(0) and B (^(0)
^ ^ T ; t, since, in the real time scale, the model coefcients
^ ^ ^ 0 T ) for all T 2 0 T ]. We calibrate by writing the equation
^r ^ B (^(0) 0 T ) = exp ;r(0)C (0 T ) ; A(0 T ) ^ ^ ^ ^ ^

or equivalently,
^ ^ ^ ^r ^ r ; log B (^(0) 0 T ) = '(0) C ('(0) '(T )) + A('(0) '(T )): (0)

0
Take and so the equilibrium distribution of r(t) seems reasonable. These values determine ^ the functions C A. Take '0(0) = 1 (we justify this in the next section). For each T , solve the ^): equation for '(T
^r ^ ^ ^ ^ ; log B (^(0) 0 T ) = r(0)C (0 '(T )) + A(0 '(T )): ^ The right-hand side of this equation is increasing in the '( T ) variable, starting at 0 at time having limit 1 at 1, i.e.,
(*)
0 and
^r ^ ^ ^ Since 0 ; log B (^(0) 0 T ) < 1 (*) has a unique solution for each T . For T ^1 < T2, then ^ is '(0) = 0. If T ^ ^ ^ ^ ; log B (r(0) 0 T1) < ; log B (r(0) 0 T2)
r(0)C (0 0) + A(0 0) = 0 ^ lim r(0)C (0 T ) + A(0 T )] = 1: ^ T!

1
= 0, this solution
^ ^ so '(T1) < '(T2). Thus ' is a strictly increasing time-change-function with the right properties.
316
31.8 Tracking down '0 (0) in the time change of the CIR model
Result for general term structure models:
@ ; @T log B(0 T )
Justication:
T =0
= r(0):
@ log B (0 T ) = IE r(T )e 0 ; @T RT IEe 0 r(u) du @ ; @T log B(0 T ) = r(0):

; ;
( ZT ) B(0 T ) = IE exp ; r(u) du : 0 ( ZT ) ; log B(0 T ) = ; log IE exp ; r(u) du 0 R T r(u) du
T =0
In the real time scale associated with the calibration of CIR by time change, we write the bond price as
^r ^ B(^(0) 0 T )
thereby indicating explicitly the initial interest rate. The above says that
^r ^ ; @^ log B(^(0) 0 T ) ^ = r(0): ^
@T
T =0
The calibration of CIR by time change requires that we nd a strictly increasing function ' with '(0) = 0 such that
^r ^ ^ ^ where B (^(0) 0 T ), determined by market data, is strictly increasing in T , starts at 1 when T ^r ^ ^ and goes to zero as T !1. Therefore, ; log B (^(0) 0 T ) is as shown in Fig. 31.4.
Consider the function Here C (T ) and A(T ) are given by
^r ^ ^ ^ ^ ; log B (^(0) 0 T ) = ' 1 r(0)C ('(T )) + A('(T )) T 0 (0) ^

0
(cal)
= 0,
r(0)C (T ) + A(T ) ^
C (T ) =
sinh( T ) cosh( T ) + 1 sinh( T ) 2 2 3 1 T 2 A(T ) = ; 2 2 log 4 cosh( T ) e 1 sinh( T ) 5 +2 =

1 2
2 + 2 2:
317 Goes to 1
^r ^ 6; log B (^(0) 0 T )
Strictly increasing
-T ^
Figure 31.4: Bond price in CIR model
6
r(0)C (T ) + A(T ) ^
^r ^ ; log B (^(0) 0 T )
...... ...... ...... ...... ....... ....... ....... ....... ...... ...... ..... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
^ '(T )
Figure 31.5: Calibration
-T
^ The function r(0)C (T ) + A(T ) is zero at T = 0, is strictly increasing in T , and goes to 1 as T !1. This is because the interest rate is positive in the CIR model (see last paragraph of Section 31.4).
To solve (cal), let us rst consider the related equation
^r ^ ^ ^ ^ ; log B (^(0) 0 T ) = r(0)C ('(T )) + A('(T )): ^ ^ Fix T and dene '(T ) to be the unique T for which (see Fig. 31.5)
(cal)
dened a time-change function ' which has all the right properties, except it satises (cal) rather than (cal).
^r ^ ^ ; log B (^(0) 0 T ) = r(0)C (T ) + A(T ) ^ ^ ^ ^ ^ ^ ^ ^ If T = 0, then '(T ) = 0. If T1 < T2, then '(T1) < '(T2). As T !1, '(T )!1. We have thus
318 We conclude by showing that ' 0(0) = 1 so ' also satises (cal). From (cal) we compute
^r ^ r(0) = ; @^ log B (^(0) 0 T ) ^ ^ @T T =0 = r(0)C ('(0))' (0) + A ('(0))' (0) ^ = r(0)C (0)' (0) + A (0)' (0): ^
0 0 0 0 0 0 0 0
We show in a moment that C 0 (0) = 1, A0(0) = 0, so we have
r(0) = r(0)' (0): ^ ^

0
Note that r(0) is the initial interest rate, observed in the market, and is striclty positive. Dividing by ^ r(0), we obtain ^ ' (0) = 1:
0
Computation of C 0(0):
C( )=
0
1 cosh( ) + 1 sinh( ) 2
cosh( ) )+ 1 2
1 cosh( ) + 2 sinh( )
h i C (0) = 12 ( + 0) ; 0(0 + 1 ) = 1: 2
0
; sinh( )
2 sinh(
cosh( )
Computation of A0 (0):
A ( ) = ;2 2
0
"
cosh( ) + 1 sinh( ) 2
#
2
1 cosh( ) + 1 sinh( ) 2
=2
=2 2e
1 cosh( ) + 2 sinh( )
; e
A (0) = ; 2 2
0
=2
2 sinh(
+0
= ; 2 2 12 = 0:
1 1 ( + 0)2 #2 ( + 0) ; (0 + 2 ) " 2
1 2 ;2 2
1 )+ 2
cosh( )
Chapter 32
A two-factor model (Dufe & Kan)

Let us dene:
X1 (t) = Interest rate at time t X2(t) = Yield at time t on a bond maturing at time t +
Let X1(0) > 0, X2 (0) differential equations
> 0 be given, and let X1(t) and X2(t) be given by the coupled stochastic
dX1(t) = (a11X1(t) + a12X2(t) + b1) dt + 1 1X1(t) + 2X2(t) + dW1(t) (SDE1) q q dX2(t) = (a21X1(t) + a22X2(t) + b2) dt + 2 1X1(t) + 2X2(t) + ( dW1(t) + 1 ; 2 dW2(t))
(SDE2) where W1 and W2 are independent Brownian motions. To simplify notation, we dene
Y (t) = 1X1(t) + 2X2 (t) + q W3(t) = W1(t) + 1 ; 2W2 (t):

4 4
Then W3 is a Brownian motion with
dW1(t) dW3(t) = dt
and
dX1 dX1 = 2Y dt dX2 dX2 = 2Y dt dX1 dX2 = 1 2

319
1 2Y
dt:
320
32.1 Non-negativity of Y
dY =
1 dX1 + 2 dX2 = ( 1a11X1 + 1a12 X2 + 1b1) dt + ( 2a21 X1 + 2a22 X2 + 2 b2) dt q p + Y ( 1 1 dW1 + 2 2 dW1 + 2 1 ; 2 2 dW2)
= ( 1a11 + 2 a21)X1 + ( 1a12 + 2a22)X2 ] dt + ( 1 b1 + 2 b2) dt q 2 2 1 + ( 1 2 + 2 1 2 1 2 + 2 2 ) 2 Y (t) dW4(t) 1 2

1 1 + 2 2 )W1(t) + 2 1 ; 2 2 W2 (t) q2 2 2 2 1 1+2 1 2 1 2+ 2 2
where
W4(t) = (
is a Brownian motion. We shall choose the parameters so that: Assumption 1: For some , Then
1 a11 + 2a21 =
1 a12 + 2a22 =
2:
dY =
1X1 +
+( = Y dt + (
2X2 + ] dt + ( 1b1 + 2b2 ; ) dt 2 2 2 2 1p 1 1 + 2 1 2 1 2 + 2 2 ) 2 Y dW4 2 2 2 2 1p 1b1 + 2 b2 ; ) dt + ( 1 1 + 2 1 2 1 2 + 2 2) 2 Y
dW4:
From our discussion of the CIR process, we recall that Y will stay strictly positive provided that: Assumption 2: and Assumption 3: Under Assumptions 1,2, and 3,
Y (0) = 1X1 (0) + 2X2(0) + > 0

1( 2 2 + 2 1 2 2 1 1 2 2 1 2 + 2 2 ):
1b1 + 2 b2 ;
Y (t) > 0 0 t < 1
almost surely,
and (SDE1) and (SDE2) make sense. These can be rewritten as
dX1(t) = (a11X1 (t) + a12X2 (t) + b1) dt + 1 Y (t) dW1(t) q dX2(t) = (a21X1 (t) + a22X2 (t) + b2) dt + 2 Y (t) dW3(t):
(SDE1) (SDE2)
CHAPTER 32. A two-factor model (Dufe & Kan)
321
32.2 Zero-coupon bond prices

The value at time t
T of a zero-coupon bond paying $1 at time T is
( ZT ) # B(t T ) = IE exp ; X1(u) du F (t) :

t
"
Since the pair (X1 X2) of processes is Markov, this is random only through a dependence on X1(t) X2(t). Since the coefcients in (SDE1) and (SDE2) do not depend on time, the bond price depends on t and T only through their difference = T ; t. Thus, there is a function B (x 1 x2 ) of the dummy variables x1 x2 and , so that
( ZT ) # B(X1 (t) X2(t) T ; t) = IE exp ; X1(u) du F (t) :

t
"
The usual tower property argument shows that
exp ;
Zt
0
X1(u) du B(X1(t) X2(t) T ; t)
is a martingale. We compute its stochastic differential and set the dt term equal to zero.
d exp ;
= exp ; = exp ;
Zt
0
Z0t Zt
0
X1(u) du B(X1(t) X2(t) T ; t) X1(u) du ;X1 B dt + Bx1 dX1 + Bx2 dX2 ; B dt

+ 1 Bx1 x1 dX1 dX1 + Bx1 x2 dX1 dX2 + 1 Bx2 x2 dX2 dX2 2 2
X1(u) du
+
;X1B + (a11X1 + a12X2 + b1)Bx1 + (a21X1 + a22X2 + b2)Bx2 ; B p

1 2 Y Bx1 x2 + 1 2 Y Bx2 x2 2 2 2
+ 1 2 Y Bx1 x1 + 2 1
1
Y Bx1 dW1 +
dt
Y Bx2 dW3
The partial differential equation for B (x 1
x2 ) is ; x1B ; B +(a11x1 + a12x2 + b1)Bx1 +(a21x1 + a22x2 + b2)Bx2 + 1 2( 1x1 + 2x2 + )Bx1 x1 2 1 1 + 1 2( 1x1 + 2x2 + )Bx1 x2 + 2 2 ( 1x1 + 2x2 + )Bx2 x2 = 0: (PDE) 2 B (x1 x2 ) = exp f;x1 C1 ( ) ; x2C2( ) ; A( )g 0 and all x1 x2 satisfying 1x1 + 2 x2 + > 0:
We seek a solution of the form
valid for all
(*)
322 We must have
B(x1 x2 0) = 1 8x1 x2 satisfying (*) because = 0 corresponds to t = T . This implies the initial conditions C1(0) = C2(0) = A(0) = 0: We want to nd C1( ) C2( ) A( ) for > 0. We have B (x1 x2 ) = ;x1 C1( ) ; x2C2( ) ; A ( ) B (x1 x2 ) Bx1 (x1 x2 ) = ;C1( )B(x1 x2 ) Bx2 (x1 x2 ) = ;C2( )B(x1 x2 ) 2 Bx1 x1 (x1 x2 ) = C1 ( )B(x1 x2 ) Bx1 x2 (x1 x2 ) = C1( )C2( )B (x1 x2 ) 2 Bx2 x2 (x1 x2 ) = C2 ( )B(x1 x2 ):
0 0 0
(IC)
(PDE) becomes
0 = B (x1 x2 ) ;x1 + x1C1( ) + x2 C2( ) + A ( ) ; (a11x1 + a12 x2 + b1)C1( ) ; (a21x1 + a22x2 + b2)C2( ) 2 + 1 2 ( 1x1 + 2 x2 + )C1 ( ) + 1 2( 1x1 + 2x2 + )C1( )C2( ) 2 1 2 + 1 2 ( 1x1 + 2 x2 + )C2 ( ) 2 2
0 0 0
= x1 B (x1 x2 ) ; 1 + C1 ( ) ; a11C1( ) ; a21C2 ( )

0
+1 2 +1 2 +1 2
We get three equations:
0
2 1C 2 ( 1 1
)+
0
1 2 1C1(
)C2( ) + 1 2 )C2( ) + 1 2
2 2
2 1C 2 ( 2 2
) )
+ x2 B (x1 x2 ) C2 ( ) ; a12C1( ) ; a22C2 ( )

2 C 2( 1 2 1
0
)+
1 2 2C1(
2 C 2( 2 2 2
+ B (x1 x2 ) A ( ) ; b1 C1( ) ; b2C2( )

2 1 2 C1 ( ) + 1 2
C1 ( )C2( ) + 1 2
);
2 C2 ( )
C1( ) = 1 + a11C1 ( ) + a21C2( ) ; 1 2

0
2 C 2( 1 1 1
1 2 1C1 (
)C2( ) ; 1 2
2 C 2( 2 1 2
)
(1) (2) (3)
C1(0) = 0 2 C2( ) = a12C1( ) + a22 C2( ) ; 1 2 2C1 ( ) ; 2 1 C2(0) = 0 2 A ( ) = b1C1( ) + b2C2( ) ; 1 2 C1 ( ) ; 1 2 1 A(0) = 0
0
1 2 2C1( 2
)C2( ) ; 1 2
2 2
2 2C 2 ( 2 2 2 C2 ( )
C1 ( )C2( ) ; 1 2
CHAPTER 32. A two-factor model (Dufe & Kan)
323
We rst solve (1) and (2) simultaneously numerically, and then integrate (3) to obtain the function A( ).
32.3 Calibration
Let 0
> 0 be given. The value at time t of a bond maturing at time t +
0 is
B(X1 (t) X2(t) 0) = expf;X1(t)C1( 0) ; X2(t)C2( 0) ; A( 0)g

and the yield is
; 1 log B(X1(t) X2(t) 0) = 1 X1(t)C1( 0) + X2(t)C2( 0) + A( 0)] :

0 0
But we have set up the model so that X 2(t) is the yield at time t of a bond maturing at time t + 0 . Thus
X2(t) = 1 X1(t)C1( 0 ) + X2(t)C2( 0) + A( 0)] :

0
This equation must hold for every value of X 1(t) and X2 (t), which implies that
C1( 0) = 0 C2 ( 0) =
We must choose the parameters
A( ) = 0:
1 2
a11 a12 b1 a21 a22 b2

so that these three equations are satised.
1 2
324
Chapter 33
Change of num raire e

Consider a Brownian motion driven market model with time horizon T . For now, we will have one asset, which we call a stock even though in applications it will usually be an interest rate dependent claim. The price of the stock is modeled by
dS (t) = r(t) S (t) dt + (t)S (t) dW (t)

where the interest rate process may be larger than the ltration generated by W .
(0.1)
fF (t) 0 t T g. W is a Brownian motion relative to this ltration, but fF (t) 0 t T g

This is not a geometric Brownian motion model. We are particularly interested in the case that the interest rate is stochastic, given by a term structure model we have not yet specied. We shall work only under the risk-neutral measure, which is reected by the fact that the mean rate of return for the stock is r(t). We dene the accumulation factor
r(t) and the volatility process (t) are adapted to some ltration
(t) = exp
Zt
0
r(u) du
so that the discounted stock price S (t) is a martingale. Indeed, (t)
(t) (t) d S(t) = S(t) (t) dW (t):

The zero-coupon bond prices are given by
) # ( ZT B (t T ) = IE exp ; r(u) du F (t) "

= IE
325
(t) F (t) (T )
326 so
B(t T ) = IE 1 F (t) (t) (T )

is also a martingale (tower property). The T -forward price F (t T ) of the stock is the price set at time t for delivery of one share of stock at time T with payment at time T . The value of the forward contract at time t is zero, so
0 = IE
(t) (S (T ) ; F (t T )) F (t) (T ) = (t)IE S (T ) Ft ; F (t T )IE (t) F (t) (T ) (T ) = (t) S (t) ; F (t T )B (t T ) (t) = S (t) ; F (t T )B (t T )
Therefore,
) F (t T ) = BSt(tT ) : (
Denition 33.1 (Num raire) Any asset in the model whose price is always strictly positive can be e taken as the num raire. We then denominate all other assets in units of this num raire. e e
Example 33.1 (Money market as num raire) The money market could be the num raire. At time t, the e e S (t) B (t T ) stock is worth (t) units of money market and the T -maturity bond is worth (t) units of money market.
Example 33.2 (Bond as num raire) The T -maturity bond could be the num raire. At time t e e is worth F(t T ) units of T -maturity bond and the T -maturity bond is worth 1 unit.
T , the stock
We will say that a probability measure IP N is risk-neutral for the num raire N if every asset price, e divided by N , is a martingale under IP N . The original probability measure IP is risk-neutral for the num raire (Example 33.1). e Theorem 0.71 Let N be a num raire, i.e., the price process for some asset whose price is always e strictly positive. Then IP N dened by
IPN (A) = N1 (0)

is risk-neutral for N .
Z N (T ) dIP 8A 2 F (T ) (T )
A
CHAPTER 33. Change of num raire e

Note:
327
IP and IPN are equivalent, i.e., have the same probability zero sets, and Z IP (A) = N (0) N((T )) dIPN 8A 2 F (T ): A T
1 Z N (T ) dIP IPN ( ) = N (0) (T ) 1 :IE N (T ) = N (0) (T ) (0) = N1 N(0) (0) =1
Proof: Because N is the price process for some asset, N= is a martingale under IP . Therefore,
and we see that IPN is a probability measure.
Let Y be an asset price. Under IP , Y= is a martingale. We must show that under IPN , Y=N is a martingale. For this, we need to recall how to combine conditional expectations with change of measure (Lemma 1.54). If 0 t T T and X is F (T )-measurable, then
N( IEN X F (t) = N (0)t)(t) IE N (0) T()T ) X F (t) N( T = N((tt)) IE N((T )) X F (t) :
Therefore,
Y T Y IEN N(T ) F (t) = N((tt)) IE N((T )) N(T ) F (t) (T ) (T ) () = N((tt)) Y (tt) Y = N (t) (t) which is the martingale property for Y=N under IP N .
33.1 Bond price as num raire e
Fix T
2 (0 T ] and let B(t T ) be the num raire. The risk-neutral measure for this num raire is e e 1 Z B (T T )
IPT (A) = B (0 T ) dIP A (T ) Z 1 = 1 B (0 T ) A (T ) dIP 8A 2 F (T ):
328 Because this bond is not dened after time T , we change the measure only up to time T 1 using B (0 T ) B ((TT ) and only for A 2 F (T ). ) is called the stock is
T , i.e.,
IPT
T -forward measure.
Denominated in units of
T -maturity bond, the value of the
) F (t T ) = BSt(tT ) (
0 t T:
This is a martingale under IPT , and so has a differential of the form
dF (t T ) =
We write F (t) rather than F (t
F (t
T )F (t T ) dWT (t) 0 t T t
0.
(1.1)
i.e., a differential without a dt term. The process fWT 0 IPT . We may assume without loss of generality that F (t T )
T g is a Brownian motion under
T ) from now on.
33.2 Stock price as num raire e

Let S (t) be the num raire. In terms of this num raire, the stock price is identically 1. The riske e neutral measure under this num raire is e
1 IPS (A) = S (0)
Z S (T ) (T ) dIP 8A 2 F (T ):
A
Denominated in shares of stock, the value of the T -maturity bond is
B(t T ) = 1 : S (t) F (t)

This is a martingale under IPS , and so has a differential of the form
d F 1t) = (t T ) F 1t) dWS (t) ( (

where fWS (t) generality that
(2.1)
0 t T g is a Brownian motion under IPS . (t T ) 0.
We may assume without loss of
Theorem 2.72 The volatility (t words, (2.1) can be rewritten as
T ) in (2.1) is equal to the volatility

F (t
F (t
T ) in (1.1).
In other
d F 1t) = (
T ) F 1t) dWS (t) (
(2.1)

Proof: Let g (x) = 1=x, so g 0(x) = ;1=x2
329
g (x) = 2=x3. Then

00
d F 1t) = dg (F (t)) ( = g (F (t)) dF (t) + 1 g (F (t)) dF (t) dF (t) 2 1 = ; F 2 (t) F (t T )F (t T ) dWT (t) + F 31(t) 2 (t T )F 2(t T ) dt F
0 00
h i = F 1t) ; F (t T ) dWT (t) + 2 (t T ) dt F ( = F (t T ) F 1t) ;dWT (t) + F (t T ) dt]: (
Under IPT ;WT is a Brownian motion. Under this measure, F 1t) has volatility F (t T ) and mean ( rate of return 2 (t T ). The change of measure from IPT to IPS makes F 1t) a martingale, i.e., it F ( changes the mean return to zero, but the change of measure does not affect the volatility. Therefore, (t T ) in (2.1) must be F (t T ) and WS must be
WS (t) = ;WT (t) +
Zt
0
F (u
T ) du:
33.3 Merton option pricing formula

The price at time zero of a European call is
V (0) = IE (1T ) (S (T ) ; K )+ = IE S (T ) 1 S (T )>K ; KIE 1 1 S (T )>K (T ) Z(T ) S (T ) dIP ; KB (0 T ) Z 1 = S (0) S (0) (T ) B(0 T ) (T ) dIP S (T )>K S (T )>K = S (0)IPS fS (T ) > K g ; KB (0 T )IPT fS (T ) > K g = S (0)IPS fF (T ) > K g ; KB (0 T )IPT fF (T ) > K g = S (0)IPS 1 < 1 ; KB (0 T )IPT fF (T ) > K g: F (T ) K
f g f g f g f g
330 This is a completely general formula which permits computation as soon as we specify F (t we assume that F (t T ) is a constant F , we have the following:
T ). If
1 = B (0 T ) exp n W (T ) ; 1 2 T o F S 2 F F (T ) S (0) 1 S (0) IPS F (1T ) < K = IPS F WS (T ) ; 1 2 T < log KB (0 T ) 2 F p S (0) S p = IPS Wp(T ) < 1 log KB (0 T ) + 1 F T 2 = N ( 1) 1 p
where
1=
Similarly,
log
S (0) 1 2 KB (0 T ) + 2 F T :
o n F (T ) = BS (0) ) exp F WT (T ) ; 1 2 T F 2 (0 T IPT fF (T ) > K g = IPT F WT (T ) ; 1 2 T > log KB (0 T ) 2 F S (0) T 1 p = IPT Wp(T ) > 1 log KB (0 T ) + 2 2 T F S (0) T F T 1 p p = IPT ;WT (T ) < 1 log S (0) ; 2 2 T F KB(0 T ) T T F = N ( 2)
where
2=
If r is constant, then B (0
F
;
1 p
S (0) log KB (0 T ) ; 1 2 T : 2 F
T) = e
1=
rT ,
(0) log SK + (r + 1 2 )T 2 F F T S (0) 1 2 2 = p log K + (r ; 1 F )T 2 1 p

F
and we have the usual Black-Scholes formula. When formula
r is not constant, we still have the explicit
V (0) = S (0)N ( 1) ; KB(0 T )N ( 2):

As this formula suggests, if F is constant, then for 0 at time T is where
331
t T , the value of a European call expiring
V (t) = S (t)N ( 1(t)) ; KB (t T )N ( 2(t))
This formula also suggests a hedge: at each time t, hold KN ( 2(t)) bonds.
( p1 ; t log FKt) + 1 2 (T ; t) 2 F F T ( 1 log FKt) ; 1 2 (T ; t) : 2(t) = p 2 F T ;t

1(t) =
N ( 1(t)) shares of stock and short
We want to verify that this hedge is self-nancing. Suppose we begin with $ V (0) and at each time t hold N ( 1(t)) shares of stock. We short bonds as necessary to nance this. Will the position in the bond always be ;KN ( 2(t))? If so, the value of the portfolio will always be S (t)N ( 1(t)) ; KB (t T )N ( 2(t)) = V (t) and we will have a hedge. Mathematically, this question takes the following form. Let
(t) = N ( 1(t)): At time t, hold (t) shares of stock. If X (t) is the value of the portfolio at time t, then X (t) ; (t (t)S (t) will be invested in the bond, so the number of bonds owned is X B)(t T )(t) S (t) and the
;
portfolio value evolves according to
t ; dX (t) = (t) dS (t) + X (B)(t T )(t) S (t) dB(t T ):

The value of the option evolves according to
(3.1)
dV (t) = N ( 1(t)) dS (t) + S (t) dN ( 1(t)) + dS (t) dN ( 1(t)) ; KN ( 2(t)) dB(t T ) ; K dB(t T ) dN ( 2(t)) ; KB(t T ) dN ( 2(t)): If X (0) = V (0), will X (t) = V (t) for 0 t T ?
(3.2)
Formulas (3.1) and (3.2) are difcult to compare, so we simplify them by a change of num raire. e This change is justied by the following theorem. Theorem 3.73 Changes of num raire affect portfolio values in the way you would expect. e Proof: Suppose we have a model with k assets with prices S 1 S2 : : : Sk . At each time t, hold i (t) shares of asset i, i = 1 2 : : : k ; 1, and invest the remaining wealth in asset k. Begin with a nonrandom initial wealth X (0), and let X (t) be the value of the portfolio at time t. The number of shares of asset k held at time t is
X (t) ; Pk=11 i(t)Si(t) i k (t) = Sk (t)

;
332 and X evolves according to the equation
dX =
=
Note that
k 1 X
;
k X i=1
i=1
i dSi + i dSi: k X
X;
k 1 X
;
i=1
i Si
dSk Sk
Xk (t) =
and we only get to specify Let N be a num raire, and dene e
:::
i=1 k 1, not k , in advance.

;
i (t)Si (t)
X (t) b X (t) = N (t)

Then
c Si (t) = Si(t) N (t)
i = 1 2 : : : k:
1 1 b 1 dX = N dX + X d N + dX d N !
i i i
k k k 1 X dS + X S d 1 + X dS d 1 =N i i i i N i=1 i i N i=1 i=1 k X 1 dS + S d 1 + dS d 1 =
=
Now
i=1 k X i=1
Si i dc:
X ; Pk=11 i Si i k= Sk X=N ; Pk=11 iSi=N i = Sk =N b ; Pk=11 ici i =X c S: Sk

; ; ;
Therefore,
k c X c b k 1 c! dSk X b= dX i dSi + X ; iSi c Sk i=1 i=1

;

This is the formula for the evolution of a portfolio which holds i shares of asset i, i = 1 1, and all assets and the portfolio are denominated in units of N .
333
2 : : : k;
We return to the European call hedging problem (comparison of (3.1) and (3.2)), but we now use the zero-coupon bond as num raire. We still hold (t) = N ( 1(t)) shares of stock at each time t. e In terms of the new num raire, the asset values are e Stock: Bond: The portfolio value evolves according to
S (t) = F (t) B(t T ) B(t T ) = 1: B(t T )

(3.1)
b b dX (t) = (t) dF (t) + (X (t) ; (t)) d(1) = (t) dF (t): 1

In the new num raire, the option value formula e
V (t) = N ( 1(t))S (t) ; KB (t T )N ( 2(t))

becomes
) b V (t) = BVt(tT ) = N ( 1(t))F (t) ; KN ( 2(t)) (
and
b dV = N ( 1(t)) dF (t) + F (t) dN ( 1(t)) + dN ( 1(t)) dF (t) ; K dN ( 2(t)):

To show that the hedge works, we must show that
(3.2)
F (t) dN ( 1(t)) + dN ( 1(t)) dF (t) ; K dN ( 2(t)) = 0:

This is a homework problem.
334
Chapter 34
Brace-Gatarek-Musiela model
34.1 Review of HJM under risk-neutral I P
f (t T ) = df (t T ) =
where The interest rate is r(t) = f (t
Forward rate at time t for borrowing at time T:
(t T ) (t T ) dt + (t T ) dW (t) (t T ) =
ZT
t
(t u) du
( ZT ) # B (t T ) = IE exp ; r(u) du F (t) ( ZT t ) = exp ; f (t u) du

t
volatility of T -maturity bond.
t). The bond prices
"
satisfy
dB (t T ) = r(t) B(t T ) dt ;
(t T )
t ) | ({z T }
B(t T ) dW (t):
To implement HJM, you specify a function
0 t T:
A simple choice we would like to use is
where
> 0 is the constant volatility of the forward rate. This is not possible because it leads to
(t T ) =
( t T ) = f (t T )
ZT
t
f (t u) du
df (t T ) = 2f (t T )
ZT
t
f (t u) du dt + f (t T ) dW (t)
335
336 and Heath, Jarrow and Morton show that solutions to this equation explode before T .
The problem with the above equation is that the dt term grows like the square of the forward rate. To see what problem this causes, consider the similar deterministic ordinary differential equation
f (t) = f 2(t)
0
where f (0) = c > 0. We have
This solution explodes at t
1 + 1 = Z t 1 du = t ; f (t) f (0) 0 1 = t ; 1 = t ; 1=c = ct ; 1 ; f (t) f (0) c c : f (t) = 1 ; ct = 1=c.
f (t) = 1 f 2(t) d 1 ; dt f (t) = 1

0
34.2 Brace-Gatarek-Musiela model

New variables: Current time t Time to maturity Forward rates:
= T ; t:
(2.1) (2.2)
r(t ) = f (t t + ) r(t 0) = f (t t) = r(t) @ r(t ) = @ f (t t + ) @ @T

Bond prices:
D(t ) = B(t t + )
= exp ; (u = v ; t du = dv ) : = exp ; = exp ;
Z t+ Zt Z0
0
(2.3)
f (t v ) dv
f (t t + u) du r(t u) du
(2.4)
@ D(t ) = @ B(t t + ) = ;r(t )D(t ): @ @T
CHAPTER 34. Brace-Gatarek-Musiela model

We will now write
337
(t ) = (t T ; t) rather than (t T ). In this notation, the HJM model is
df (t T ) = (t ) (t ) dt + (t ) dW (t) dB (t T ) = r(t)B(t T ) dt ; (t )B (t T ) dW (t)

where
(2.5) (2.6)
@ @
(t ) =
Z
0
(t u) du
(2.7) (2.8)
(t ) = (t ): ) and D(t ), analogous to (2.5) and (2.6) We have
We now derive the differentials of r(t
dr(t ) =
(2.5),(2.2) (2.8)
differential applies only to rst argument
df (t {z+ ) | t }
@ + @T f (t t + ) dt
h i = @@ r(t ) + 1 ( (t ))2 dt + (t ) dW (t): 2

dB (t {z + ) | t } @ + @T B (t t + ) dt
(t ) (t ) dt + (t ) dW (t) + @@ r(t ) dt
(2.9)
Also,
dD(t ) =
(2.6),(2.4)
differential applies only to rst argument
= r(t) B (t t + ) dt ; (t )B (t t + ) dW (t) ; r(t )D(t ) dt (2.1) = r(t 0) ; r(t )] D(t ) dt ; (t )D(t ) dW (t): (2.10)
34.3 LIBOR
Fix > 0 (say, time t + . L(t
1 = 4 year). $ D(t ) invested at time t in a (t + )-maturity bond grows to $ 1 at 0) is dened to be the corresponding rate of simple interest:
Z@ r(t u) du 1 + L(t 0) = D(1 ) = exp t 0 nR o exp 0@ r(t u) du ; 1 L(t 0) = :
D(t )(1 + L(t 0)) = 1
338
34.4 Forward LIBOR

( At time t, agree to invest $ DDt(t +) ) at time t + , with payback of $1 at time ( t + + . Can do this at time t by shorting DDt(t +) ) bonds maturing at time t + and going long one bond maturing at time t + + . The value of this portfolio at time t is
> 0 is still xed.
The forward LIBOR L(t
) is dened to be the simple (forward) interest rate for this investment: D(t + ) (1 + L(t )) = 1 D(t ) R D(t ) = exp f; 0 r(t u) dugo n R 1 + L(t ) = D(t + ) exp ; 0 + r(t u) du = exp
; D(t (t +) ) D(t ) + D(t + ) = 0: D
(Z nR
r(t u) du
L(t ) =
Connection with forward rates:
exp
r(t u) du ; 1
(4.1)
( @ exp Z @
so
r(t u) du
)
=0
= r(t + ) exp = r(t )
(Z
r(t u) du
)
=0
f (t t + ) = r(t ) = lim L(t ) =

exp
nR + #0
exp
nR
r(t u) du ; 1 >0
xed: (4.2)
r(t u) du ; 1
r(t ) is the continuously compounded rate. L(t ) is the simple rate over a period of duration . We cannot have a log-normal model for r(t ) because solutions explode as we saw in Section 34.1. For xed positive , we can have a log-normal model for L(t ).
34.5 The dynamics of L(t )
We want to choose
(t ) t 0 0, appearing in (2.5) so that dL(t ) = (: : : ) dt + L(t ) (t ) dW (t)
339
0 0. This is the BGM model, and is a subclass of HJM models, for some (t ) t corresponding to particular choices of (t ).
Recall (2.9):
i @ h dr(t ) = @u r(t u) + 1 ( (t u))2 dt + (t u) dW (t): 2

Therefore,
r(t u) du =
=
! Z Z
+ +
@ hr(t u) + 1 ( (t u))2i du dt + Z + (t u) du dW (t) 2 i h @u 1 = r(t + ) ; r(t ) + 2 ( (t + ))2 ; 1 ( (t ))2 dt 2 + (t + ) ; (t )] dW (t)
dr(t u) du
(5.1)
and
o 3 2 nR + exp r(t u) du ; 1 5 dL(t ) (4:1) d 4 = (Z + ) Z + 1

= exp
( ) 1 exp Z + r(t u) du d Z +2 (4.1), (5.1) 1 = 1 + L(t )]
r(t u) du d
r(t u) du
+
r(t u) du
!2
(5.2)
= 1 1 + L(t )] r(t + ) ; r(t )] dt + (t + ) (t + ) ; (t )] dt = + (t + ) ; (t )] dW (t) :
1 r(t + ) ; r(t ) + 2 ( (t + ))2 ; 1 ( (t ))2] dt 2 + (t + ) ; (t )] dW (t) + 1 (t + ) ; (t )]2 dt 2
340 But
2 nR @ L(t ) = @ 4 exp @ @ (Z +
= exp
r(t u) du ; 1 5
r(t u) du : r(t + ) ; r(t )]
= 1 1 + L(t )] r(t + ) ; r(t )]:

Therefore,
dL(t ) = @@ L(t ) dt + 1 1 + L(t )] (t + ) ; (t )]: (t + ) dt + dW (t)]:

Take
(t ) to be given by (t )L(t ) = 1 1 + L(t )] (t + ) ; (t )]:

(5.3)
Then
dL(t ) = @@ L(t ) + (t )L(t ) (t + )] dt + (t )L(t ) dW (t):

Note that (5.3) is equivalent to
(5.4)
) (t + ) = (t ) + L(t L(t(t ) ) : 1+
Plugging this into (5.4) yields
(5.3)
2 ) 2 dL(t ) = @@ L(t ) + (t )L(t ) (t ) + L1 (t L(t (t ) ) dt + + (t )L(t ) dW (t):
"
(5.4)
34.6 Implementation of BGM

Obtain the initial forward LIBOR curve
L(0 )
(t )
0 0:
from market data. Choose a forward LIBOR volatility function (usually nonrandom)
t 0
341
Because LIBOR gives no rate information on time periods smaller than , we must also choose a partial bond volatility function
(t )
With these functions, we can for each
t 0 0
<
for maturities less than from the current time variable t.
2 0 ) solve (5.4) to obtain

< : <2 <2
) generated recursively by (5.3). (t ) is random.
. We then solve (5.4) to obtain
L(t ) t 0 0
Plugging the solution into (5.3), we obtain
(t ) for
L(t ) t 0
and we continue recursively.
Remark 34.1 BGM is a special case of HJM with HJMs (t In BGM, (t ) is usually taken to be nonrandom; the resulting
Remark 34.2 (5.4) (equivalently, (5.4)) is a stochastic partial differential equation because of the @ @ L(t ) term. This is not as terrible as it rst appears. Returning to the HJM variables t and T , set
K (t T ) = L(t T ; t):
Then
dK (t T ) = dL(t T ; t) ; @@ L(t T ; t) dt dK (t T ) = (t T ; t)K (t T ) = (t T ; t)K (t T )

(t T ; t + ) dt + dW (t)] (t T ; t) dt + K (t T ) (t T ; t) dt + dW (t) : 1 + K (t T ) (6.1)
and (5.4) and (5.4) become
Remark 34.3 From (5.3) we have
(t )L(t ) = 1 + L(t )] (t + ) ; (t ) :
If we let
#0, then
@ (t )L(t )! @
(t + )
=0
= (t )
and so We saw before (eq. 4.2) that as
(t T ; t)K (t T )! (t T ; t): #0,
L(t )!r(t ) = f (t t + )
342 so
K (t T )!f (t T ):
Therefore, the limit as
#0 of (6.1) is given by equation (2.5):

df (t T ) = (t T ; t)
(t T ; t) dt + dW (t)] :
2
Remark 34.4 Although the dt term in (6.1) has the term to this equation do not explode because
(t T t)K 2 (t T ) involving K 2 , solutions 1+K (t T )

;
2(t
T ; t)K 2 (t T ) 1 + K (t T )
T ; t)K 2(t T ) K (t T ) 2(t T ; t)K (t T ):

2(t
34.7 Bond prices

Let
(t) = exp
nR t
0 r(u) du : From (2.6) we have
d B(t(tT ) = 1t) ;r(t)B(t T ) dt + dB(t T )] ) ( = ; B (t T ) (t T ; t) dW (t): (t)

The solution B (ttT ) to this stochastic differential equation is given by ()
B (t T ) = exp ; Z t (u T ; u) dW (u) ; 1 Z t( (u T ; u))2 du : 2 0 (t)B (0 T ) 0

1 Z 1 dIP IPT (A) = B (0 T ) Z B(TAT )(T ) = dIP 8A 2 F (T ): A (T )B (0 T )
This is a martingale, and we can use it to switch to the forward measure
Girsanovs Theorem implies that
WT (t) = W (t) +
is a Brownian motion under IPT .
Zt
0
(u T ; u) du
0 t T
343
34.8 Forward LIBOR under more forward measure

From (6.1) we have
dK (t T ) = (t T ; t)K (t T ) (t T ; t + ) dt + dW (t)] = (t T ; t)K (t T ) dWT + (t)

so
K (t T ) = K (0 T ) exp
and
Zt
0
(u T ; u) dWT + (u) ; 1 2
Zt
0
2(u
T ; u) du
K (T T ) = K (0 T ) exp
= K (t T ) exp
We assume that
(Z T
0
(Z T
t
(u T ; u) dWT + (u) ;
ZT 2(u 1 2 0 ZT
t
2 (u
T ; u) du
)
(8.1)
(u T ; u) dWT + (u) ; 1 2
T ; u) du :
is nonrandom. Then
X (t) =
is normal with variance
ZT
t
(u T ; u) dWT + (u) ; 1 2
2(t) =
ZT
t
2(u
T ; u) du
(8.2)
ZT
t
2(u
T ; u) du
1 and mean ; 2 2(t).
34.9 Pricing an interest rate caplet

Consider a oating rate interest payment settled in arrears. At time T + , the oating rate interest payment due is L(T 0) = K (T T ) the LIBOR at time T . A caplet protects its owner by requiring him to pay only the cap c if K (T T ) > c. Thus, the value of the caplet at time T + is (K (T T ) ; c)+ . We determine its value at times 0 t T + .
Case I: T
t T+
t) CT + (t) = IE (T(+ ) (K (T T ) ; c)+ F (t) t) = (K (T T ) ; c)+ IE (T (+ ) F (t) = (K (T T ) ; c)+ B (t T + ):
(9.1)
344 Case II: 0 Recall that
t T.
IPT + (A) =
where
Z
A
Z (T + ) dIP
8A 2 F (T + )
T Z (t) = (B (t (0 + +) ) : t)B T
We have
t) CT + (t) = IE (T(+ ) (K (T T ) ; c)+ F (t) 2
3 7 6 (t)B (0 T + ) IE 6 B (T + T + ) (K (T T ) ; c)+ F (t)7 7 6 = B (t T + ) B (t T + ) 7 6 (T + )B(0 T + ) 5 4| {z } {z } | 1

Z (t)
Z (T + )
= B (t T + )IET + (K (T T ) ; c)+ F (t)

From (8.1) and (8.2) we have
K (T T ) = K (t T ) expfX (t)g R 1 where X (t) is normal under IPT + with variance 2(t) = tT 2(u T ; u) du and mean ; 2 2(t). Furthermore, X (t) is independent of F (t). CT + (t) = B(t T + )IET + (K (t T ) expfX (t)g ; c)+ F (t) :
Set
g(y) = IET + (y expfX (t)g ; c)+ 1 = y N 1 log y + 2 (t) ; c N (t) c

Then
1 log y ; 1 (t) : ( t) c 2
(9.2)
CT + (t) = B(t T + ) g (K (t T )) 0 t T ; :
In the case of constant , we have
(t) =
T ;t
and (9.2) is called the Black caplet formula.
345
34.10 Pricing an interest rate cap

Let
T0 = 0 T1 = T2 = 2 : : : Tn = n :
(K (Tk Tk ) ; c)+
at time Tk+1
A cap is a series of payments
k = 0 1 : : : n ; 1:
The value at time t of the cap is the value of all remaining caplets, i.e.,
C (t) =
34.11 Calibration of BGM
The interest rate caplet c on L(0
k:t Tk
CTk (t):
T ) at time T +
has time-zero value
CT + (0) = B(0 T + ) g (K (0 T ))
where g (dened in the last section) depends on
ZT
0
2 (u
T ; u) du:
Let us suppose
is a deterministic function of its second argument, i.e.,
(t ) = ( ):
Then g depends on
ZT
0
If we know the caplet price CT + know caplet prices where T0
R (0), we can back out the squared volatility 0T 2(v ) dv . If we Z T1

0
2(T ; u) du =
ZT
0
2(v ) dv:
Z T0
0
< T1 < : : : < Tn , we can back out
CT0+ (0) CT1+ (0) : : : CTn+ (0)

2(v ) dv ;
2(v ) dv
Z T1
T0
2(v ) dv =
Z T0
0
2(v ) dv
:::
In this case, we may assume that is constant on each of the intervals
;
Z Tn
Tn;1
2(v ) dv: (11.1)
(0 T0) (T0 T1) : : : (Tn
Tn )
346
If we know caplet prices CT + tiate to discover 2( ) and (
R (0) for all T 0, we can back out 0T 2(v ) dv and then differenp 2( ) for all 0. )= 0, and To implement BGM, we need both ( )
(t )
and choose these constants to make the above integrals have the values implied by the caplet prices.
t 0 0
< : t+
(see (2.6)).
Now (t ) is the volatility at time t of a zero coupon bond maturing at time < , it is reasonable to set Since is small (say 1 year), and 0 4
(t ) = 0
We can now solve (or simulate) to get
t 0 0
0
< :
L(t ) t 0
or equivalently,
K (t T ) t 0 T 0
using the recursive procedure outlined at the start of Section 34.6.
34.12 Long rates

The long rate is determined by long maturity bond prices. Let n be a large xed positive integer, so that n is 20 or 30 years. Then
D(t n ) = exp
= =
n Y k=1 n Y k=1
(Z n
0
exp
(Z k
r(t u) du
(k 1)
;
r(t u) du
1 + L(t (k ; 1) )]
where the last equality follows from (4.1). The long rate is
n 1 log 1 = 1 X log 1 + L(t (k ; 1) )]: n D(t n ) n k=1
34.13 Pricing a swap

Let T0
0 be given, and set
T1 = T0 +
T2 = T0 + 2 : : : Tn = T0 + n :

The swap is the series of payments For 0
347
t T0, the value of the swap is

n 1 X
;
(L(Tk 0) ; c)
at time Tk+1
k = 0 1 : : : n ; 1:
IE (T(t) ) (L(Tk 0) ; c) F (t) : k+1 k=0
Now
1 + L(Tk 0) = B (T 1T ) k k+1
so
L(Tk 0) = 1 B(T 1T ) ; 1 : k k+1 IE (T(t) ) (L(Tk 0) ; c) F (t) k+1 = IE (T(t) ) B (T 1T ) ; 1 ; c F (t)
We compute
k k+1 3 2 k+1 6 7 6 (Tk ) F (T ) F (t)7 ; (1 + c)B (t T ) (t) 7 6 = IE 6 (T )B (T T ) IE (T ) k k+1 7 k+1 5 4 k k k+1 | {z }
The value of the swap at time t is
(t) (Tk+1 ) F (t) ; (1 + c)B (t Tk+1) = B (t Tk ) ; (1 + c)B (t Tk+1): = IE
B (Tk Tk+1 )
n 1 X
;
IE (T(t) ) (L(Tk 0) ; c) F (t) k+1 k=0

=
n 1 X
;
= B (t T0) ; (1 + c)B (t T1) + B (t T1) ; (1 + c)B (t T2) + : : : + B (t Tn 1 ) ; (1 + c)B (t Tn ) = B (t T0) ; cB (t T1) ; cB (t T2) ; : : : ; cB (t Tn ) ; B (t Tn ): The forward swap rate wT0 (t) at time t for maturity T0 is the value of c which makes the time-t
;
k=0
B (t Tk ) ; (1 + c)B(t Tk+1)]
value of the swap equal to zero:
Bt ) ( wT0 (t) = B (t (T T0+ ;: B+tBT(n )T )] : t n 1) : :
In contrast to the cap formula, which depends on the term structure model and requires estimation of , the swap formula is generic.
348
Chapter 35
Notes and References

35.1 Probability theory and martingales.
Probability theory is usually learned in two stages. In the rst stage, one learns that a discrete random variable has a probability mass function and a continuous random variable has a density. These can be used to compute expectations and variances, and even conditional expectations. Furthermore, one learns how transformations of continuous random variables cause changes in their densities. A well-written book which contains all these things is DeGroot (1986). The second stage of probability theory is measure theoretic. In this stage one views a random variable as a function from a sample space to the set of real numbers IR. Certain subsets of are called events, and the collection of all events forms a -algebra F . Each set A in F has a probability IP (A). This point of view handles both discrete and continuous random variables within the same unifying framework. A conditional expectation is itself a random variable, measurable with respect to the conditioning -algebra. This point of view is indispensible for treating the rather complicated conditional expectations which arise in martingale theory. A well-written book on measure-theoretic probability is Billingsley (1986). A succinct book on measure-theoretic probability and martingales in discrete time is Williams (1991). A more detailed book is Chung (1968). The measure-theoretic view of probability theory was begun by Kolmogorov (1933). The term martingale was apparently rst used by Ville (1939), although the concept dates back to 1934 work of L vy. The rst complete account of martingale theory is Doob (1953). e
35.2 Binomial asset pricing model.

The binomial asset pricing model was developed by Cox, Ross & Rubinstein (1979). Accounts of this model can be found in several places, including Cox & Rubinstein (1985), Dothan (1990) and Ritchken (1987). Many models are rst developed and understood in continuous time, and then binomial versions are developed for purposes of implementation. 349
350
35.3 Brownian motion.

In 1828 Robert Brown observed irregular movement of pollen suspended in water. This motion is now known to be caused by the buffeting of the pollen by water molecules, as explained by Einstein (1905). Bachelier (1900) used Brownian motion (not geometric Brownian motion) as a model of stock prices, even though Brownian motion can take negative values. L vy (1939, 1948) discove ered many of the nonintuitive properties of Brownian motion. The rst mathematically rigorous construction of Brownian motion was carried out by Wiener (1923, 1924). Brownian motion and its properties are presented in a numerous texts, including Billingsley (1986). The development in this course is a summary of that found in Karatzas & Shreve (1991).
35.4 Stochastic integrals.

The integral with respect to Brownian motion was developed by It (1944). It was introduced to o nance by Merton (1969). A mathematical construction of this integral, with a minimum of fuss, is given by ksendal (1995). The quadratic variation of martingales was introduced by Fisk (1966) and developed into the form used in this course by Kunita & Watanabe (1967).
35.5 Stochastic calculus and nancial markets.

Stochastic calculus begins with It (1944). Many nance books, including (in order of increasing o mathematical difculty) Hull (1993), Dothan (1990) and Dufe (1992), include sections on It s o integral and formula. Some other books on dynamic models in nance are Cox & Rubinstein (1985), Huang & Litzenberger (1988), Ingersoll (1987), and Jarrow (1988). An excellent reference for practitioners, now in preprint form, is Musiela & Rutkowski (1996). Some mathematics texts on stochastic calculus are ksendal (1995), Chung & Williams (1983), Protter (1990) and Karatzas & Shreve (1991). Samuelson (1965, 1973) presents the argument that geometric Brownian motion is a good model for stock prices. This is often confused with the efcient market hypothesis, which asserts that all information which can be learned from technical analysis of stock prices is already reected in those prices. According to this hypothesis, past stock prices may be useful to estimate the parameters of the distribution of future returns, but they do not provide information which permits an investor to outperform the market. The mathematical formulation of the efcient market hypothesis is that there is a probability measure under which all discounted stock prices are martingales, a much weaker condition than the claim that stock prices follow a geometric Brownian motion. Some empirical studies supporting the efcient market hypothesis are Kendall (1953), Osborne (1959), Sprenkle (1961), Boness (1964), Alexander (1961) and Fama (1965). The last of these papers discusses other distributions which t stock prices better than geometric Brownian motion. A criticism of the efcient market hypothesis is provided by LeRoy (1989). A provocative article on the source of
CHAPTER 35. Notes and References

stock price movements is Black (1986).
351
The rst derivation of the Black-Scholes formula given in this course, using only It s formula, o is similar to that originally given by Black & Scholes (1973). An important companion paper is Merton (1973), which makes good reading even today. (This and many other papers by Merton are collected in Merton (1990).) Even though geometric Brownian motion is a less than perfect model for stock prices, the Black-Scholes option hedging formula seems not to be very sensitive to deciencies in the model.
35.6 Markov processes.

Markov processes which are solutions to stochastic differential equations are called diffusion processes. A good introduction to this topic, including discussions of the Kolmogorov forward and backward equations, is Chapter 15 of Karlin & Taylor (1981). The other books cited previously, ksendal (1995), Protter (1990), Chung & Williams (1983), and Karatzas & Shreve (1991), all treat this subject. Kloeden & Platen (1992) is a thorough study of the numerical solution of stochastic differential equations. The constant elasticity of variance model for option pricing appears in Cox & Ross (1976). Another alternative model for the stock price underlying options, due to F llmer & Schweizer (1993), has o the geometric Ornstein-Uhlenbeck process as a special case. The Feynman-Kac Theorem, connecting stochastic differential equations to partial differential equations, is due to Feyman (1948) and Kac (1951). A numerical treatment of the partial differential equations arising in nance is contained in Wilmott, Dewynne and Howison (1993, 1995) and also Dufe (1992).
35.7 Girsanovs theorem, the martingale representation theorem, and risk-neutral measures.
Girsanovs Theorem in the generality stated here is due to Girsanov (1960), although the result for constant was established much earlier by Cameron & Martin (1944). The theorem requires a f technical condition to ensure that IEZ (T ) = 1, so that I is a probability measure; see Karatzas & P Shreve (1991), page 198. The form of the martingale representation theorem presented here is from Kunita & Watanabe (1967). It can also be found in Karatzas & Shreve (1991), page 182. The application of the Girsanov Theorem and the martingale representation theorem to risk-neutral pricing is due to Harrison & Pliska (1981). This methodology frees the Brownian-motion driven model from the assumption of constant interest rate and volatility; these parameters can be random through dependence on the path of the underlying asset, or even through dependence on the paths of other assets. When both the interest rate and volatility of an asset are allowed to be stochastic, the Brownian-motion driven model is mathematically the most general possible for asset prices without jumps.
352 When asset processes have jumps, risk-free hedging is generally not possible. Some works on hedging and/or optimization in models which allow for jumps are Aase (1993), Back (1991), Bates (1988,1992), Beinert & Trautman (1991), Elliott & Kopp (1990), Jarrow & Madan (1991b,c), Jones (1984), Madan & Seneta (1990), Madan & Milne (1991), Mercurio & Runggaldier (1993), Merton (1976), Naik & Lee (1990), Schweizer (1992a,b), Shirakawa (1990,1991) and Xue (1992). The Fundamental Theorem of Asset Pricing, as stated here, can be found in Harrison & Pliska (1981, 1983). It is tempting to believe the converse of Part I, i.e., that the absence of arbitrage implies the existence of a risk-neutral measure. This is true in discrete-time models, but in continuous-time models, a slightly stronger condition is needed to guarantee existence of a risk-neutral measure. For the continuous-time case, results have been obtained by many authors, including Stricker (1990), Delbaen (1992), Lakner (1993), Delbaen & Schachermayer (1994a,b), and Fritelli & Lakner (1994, 1995). In addition to the fundamental papers of Harrison & Kreps (1979), and Harrison & Pliska (1981, 1983), some other works on the relationship between market completeness and uniqueness of the risk-neutral measure are Artzner & Heath (1990), Delbaen (1992), Jacka (1992), Jarrow & Madan (1991a), M ller (1989) and Taqqu & Willinger (1987). u
35.8 Exotic options.

The reection principle, adjusted to account for drift, is taken from Karatzas & Shreve (1991), pages 196197. Explicit formulas for the prices of barrier options have been obtained by Rubinstein & Reiner (1991) and Kunitomo & Ikeda (1992). Lookback options have been studied by Goldman, Sosin & Gatto (1979), Goldman, Sosin & Shepp (1979) and Conz & Viswanathan (1991). e Because it is difcult to obtain explicit formulas for the prices of Asian options, most work has been devoted to approximations. We do not provide an explicit pricing formula here, although the partial differential equation given here by the Feynman-Kac Theorem characterizes the exact price. Bouaziz, Bryis & Crouhy (1994) provide an approximate pricing formula, Rogers & Shi (1995) provide a lower bound, and Geman & Yor (1993) obtain the Laplace transform of the price.
35.9 American options.

A general arbitrage-based theory for the pricing of American contingent claims and options begins with the articles of Bensoussan (1984) and Karatzas (1988); see Myneni (1992) for a survey and additional references. The perpetual American put problem was solved by McKean (1965). Approximation and/or numerical solutions for the American option problem have been proposed by several authors, including Black (1975), Brennan & Schwartz (1977) (see Jaillet et al. (1990) for a treatment of the American option optimal stopping problem via variational inequalities, which leads to a justication of the Brennan-Schwartz algorithm), by Cox, Ross & Rubinstein (1979) (see Lamberton (1993) for the convergence of the associated binomial and/or nite difference schemes)
353
and by Parkinson (1977), Johnson (1983), Geske & Johnson (1984), MacMillan (1986), Omberg (1987), Barone-Adesi & Whalley (1987), Barone-Adesi & Elliott (1991), Bunch & Johnson (1992), Broadie & Detemple (1994), and Carr & Faguet (1994).
35.10 Forward and futures contracts.

The distinction between futures contracts and daily resettled forward contracts has only recently been recognized (see Margrabe (1976), Black (1976)) and even more recently understood. Cox, Ingersoll & Ross (1981) and Jarrow & Oldeld (1981) provide a discrete-time arbitrage-based analysis of the relationship between forwards and futures, whereas Richard & Sundaresan (1981) study these claims in a continuous-time, equilibrium setting. Our presentation of this material is similar to that of Dufe & Stanton (1992), which also considers options on futures, and to Chapte 7 of Dufe (1992). For additional reading on forward and futures contracts, one may consult Dufe (1989).
35.11 Term structure models.

The Hull & White (1990) model is a generalization of the constant-coefcent Vasicek (1977) model. Implementations of the model appear in Hull & White (1994a,b). The Cox-Ingersoll-Ross model is presented in (1985a,b). The presentations of these given models here is taken from Rogers (1995). Other surveys of term structure models are Dufe & Kan (1994) and Vetzal (1994). A partial list of other term structure models is Black, Derman & Toy (1990), Brace & Musiela (1994a,b), Brennan & Schwartz (1979, 1982) (but see Hogan (1993) for discussion of a problem with this model), Dufe & Kan (1993), Ho & Lee (1986), Jamshidian (1990), and Longstaff & Schwartz (1992a,b). The continuous-time Heath-Jarrow-Morton model appears in Heath, Jarrow & Morton (1992), and a discrete-time version is provided by Heath, Jarrow & Morton (1990). Carverhill & Pang (1995) discuss implementation. The Brace-Gatarek-Musiela variation of the HJM model is taken from Brace, et al. (1995). A summary of this model appears as Reed (1995). Related works on term structure models and swaps are Flesaker & Hughston (1995) and Jamshidian (1996).
35.12 Change of num raire. e

This material in this course is taken from Geman, El Karoui and Rochet (1995). Similar ideas were used by by Jamshidian (1989). The Merton option pricing formula appears in Merton (1973).
35.13 Foreign exchange models.

Foreign exchange options were priced by Biger & Hull (1983) and Garman & Kohlhagen (1983). The prices for differential swaps have been worked out by Jamshidian (1993a, 1993b) and Brace & Musiela (1994a).
354
35.14 REFERENCES
(1993) Aase, K. K., A jump/diffusion consumption-based capital asset pricing model and the equity premium puzzle, Math. Finance 3(2), 6584. (1961) Alexander, S. S., Price movements in speculative markets: trends or random walks, Industrial Management Review 2, 726. Reprinted in Cootner (1964), 199218. (1990) Artzner, P. & Heath, D., Completeness and non-unique pricing, preprint, School of Operations Research and Industrial Engineering, Cornell University. (1900) Bachelier, L., Th orie de la sp culation, Ann. Sci. Ecole Norm. Sup. 17, 2186. Reprinted e e in Cootner (1964). (1991) Back, K., Asset pricing for general processes, J. Math. Econonics 20, 371395. (1991) Barone-Adesi, G. & Elliott, R., Approximations for the values of American options, Stoch. Anal. Appl. 9, 115131. (1987) Barone-Adesi, G. & Whalley, R., Efcient analytic approximation of American option values, J. Finance 42, 301320. (1988) Bates, D. S., Pricing options under jump-diffusion processes, preprint, Wharton School of Business, University of Pennsylvania. (1992) Bates, D. S., Jumps and stochastic volatility: exchange rate processes implicit in foreign currency options, preprint, Wharton School of Business, University of Pennsylvania. (1991) Beinert, M. & Trautmann, S., Jump-diffusion models of stock returns: a statistical investigation, Statistische Hefte 32, 269280. (1984) Bensoussan, A., On the theory of option pricing, Acta Appl. Math. 2, 139158. (1983) Biger, H. & Hull, J., The valuation of currency options, Financial Management, Spring, 2428. (1986) Billingsley, P., Probability and Measure, 2nd edition, Wiley, New York. (1975) Black, F., Fact and fantasy in the use of options, Financial Analysts J., 31, 3641, 6172.

(1976) Black, F., The pricing of commodity contracts, J. Financial Economics 3. (1986) Black, F., Noise, J. Finance 41, 529543.
355
(1990) Black, F., Derman, E. & Toy, W., A one-factor model of interest rates and its application to treasury bond options, Fin. Anal. Journal, 3339. (1973) Black, F. & Scholes, M., The pricing of options and corporate liabilities, J. Political Economy 81, 637659. (1964) Boness, Some evidence of the protability of trading put and call options. In Cootner (1964), 475496. (1994) Bouaziz, L., Bryis, E. & Crouhy, M., The pricing of forward-starting Asian options, J. Banking Finance. (1994a) Brace, A. & Musiela, M., Swap derivatives in a Gaussian HJM framework, preprint, School of Mathematics, University of New South Wales. (1994b) Brace, A. & Musiela, M., A multi-factor Gauss Markov implementation of Heath, Jarrow and Morton, Math. Finance 4, 259283. (1995) Brace, A.. Gatarek, D. & Musiela, M., The market model of interest rate dynamics, preprint, School of Mathematics, University of New South Wales. (1977) Brennan, M. J. & Schwartz, E. S., The valuation of the American put option, J. Finance 32, 449462. (1979) Brennan, M. J. & Schwartz, E. S., A continuous-time approach to the pricing of bonds, J. Banking Finance 3, 133155. (1982) Brennan, M. J. & Schwartz, E. S., An equilibrium model of bond pricing and a test of market efciency, J. Financial Quantitative Analysis 17, 301329. (1994) Broadie, M. & Detemple, J., American option values: new bounds, approximations and a comparison of existing methods, preprint, Graduate School of Business, Columbia University. (1992) Bunch, D. & Johnson, H., A simple and numerically efcient valuation method for American puts, using a modied Geske-Johnson approach, J. Finance 47, 809816. (1944) Cameron, R. H. & Martin, W. T., Transformation of Wiener integrals under translations,
356 Ann. Math. 45, 386396. (1994) Carr, P. & Faguet, D., Fast accurate valuation of American options, preprint, Johnson School of Business, Cornell University. (1995) Carverhill, A. & Pang, K., Efcient and exible bond option valuation in the Heath, Jarrow, and Morton framework, J. Fixed Income, September, 7077. (1968) Chung, K. L. A Course in Probability Theory, Academic Press, Orlando, FL. (1983) Chung, K. L. & Williams, R. J. (1983) Introduction to Stochastic Integration, Birkh user, a Boston. (1991) Conz , A. & Viswanathan, R., Path dependent options: the case of lookback options, J. e Finance 46, 18931907. (1964) Cootner, P. H. (editor), The Random Character of Stock Market Prices, MIT Press, Cambridge, MA. (1981) Cox, J. C., Ingersoll, J. E., & Ross, S, The relation between forward prices and futures prices, J. Financial Economics 9, 321346. (1985a) Cox, J. C., Ingersoll, J. E. & Ross, S., An intertemporal general equilibrium model of asset prices, Econometrica 53, 363384. (1985b) Cox, J. C., Ingersoll, J. E. & Ross, S., A theory of the term structure of interest rates, Econometrica 53, 373384. (1976) Cox, J. C. & Ross, S. A., The valuation of options for alternative stochastic processes, J. Fin. Economics 3, 145166. (1979) Cox, J. C., Ross, S. A. & Rubinstein, M., Option pricing: a simplied approach, J. Financial Economics 7, 229263. (1985) Cox, J. C. & Rubinstein, M. Options Markets, Prentice Hall, Englewood Cliffs, NJ. (1986) DeGroot, M., Probability and Statistics, 2nd edition, Addison Wesley, Reading, MA. (1992) Delbaen, F., Representing martingale measures when asset prices are continuous and bounded, Math. Finance 2(2), 107130.
357
(1994a) Delbaen, F. & Schachermayer, W., A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463520. (1994b) Delbaen, F. & Schachermayer, W., Arbitrage and free-lunch with bounded risk, for unbounded continuous processes, Math. Finance 4(4), 343348. (1953) Doob, J., Stochastic Processes, Wiley, New York. (1990) Dothan, M. U., Prices in Financial Markets, Oxford University Press, New York. (1989) Dufe, D., Futures Markets, Prentice-Hall, Englewood Cliffs, NJ. (1992) Dufe, D., Dynamic Asset Pricing Theory, Princeton University Press, Princeton. (1993) Dufe, D. & Kan, R., A yield-factor model of interest rates, preprint. (1994) Dufe, D. & Kan, R., Multi-factor term structure models, Phil. Transactions Royal Society London A 347, 577586. (1992) Dufe, D. & Stanton, R., Pricing continuously resettled contingent claims, J. Economic Dynamics and Control, 16, 561573. (1905) Einstein, A. On the movement of small particles suspended in a stationary liquid demanded by the molecular-kinetic theory of heat, Ann. Physik 17. (1990) Elliott, R. J. & Kopp, P. E., Option pricing and hedge portfolios for Poisson processes, Stoch. Anal. Appl. 9, 429444. (1965) Fama, E. The behavior of stock-market prices, J. Business 38, 34104. (1948) Feynman, R., Space-time approach to nonrelativistic quantum mechanics, Rev. Mod. Phys. 20, 367387. (1966) Fisk, D. L., Sample quadratic variation of continuous, second-order martingales, Z. Wahrscheinlichkeitstheorie verw. Gebiete 6, 273278. (1995) Flesaker, B. & Hughston, L., Positive interest, preprint. (1993) F llmer, H. & Schweizer, M., A microeconomic approach to diffusion models for stock o prices, Math. Finance 3, 123.
358 (1994) Fritelli, M. & Lakner, P., Almost sure characterization of martingales, Stochastics 49, 181 190. (1995) Fritelli, M. & Lakner, P., Arbitrage and free-lunch in a general nance market model: the fundamental theorem of asset pricing, IMA Vol 65: Mathematical Finance, 8992, (M. Davis, D. Dufe, W. Fleming & S. Shreve, eds.), Springer-Verlag, New York. (1983) Garman, M. & Kohlhagen, S., Foreign currency option values, J. International Money Finance 2, 231237. (1995) Geman, H., El Karoui, N. & Rochet, J.-C., Changes of num raire, changes of probability e measure, and option pricing, J. Applied Probability 32, 443458. (1993) Geman, H. & Yor, M., Bessel processes, Asian options, and perpetuities, Mathematical Finance 3, 349375. (1984) Geske, R. & Johnson, H., The American put-option valued analytically, J. Finance 39, 1511 1524. (1960) Girsanov, I. V., On transforming a certain class of stochastic processes by absolutely continuous substitution of measures, Theory Probability Applications 5, 285301. (1979) Goldman, B., Sosin, H. & Gatto, M., Path dependent options: Buy at the low, sell at the high, J. Finance 34, 6984. (1979) Goldman, B., Sosin, H. & Shepp, L. A., On contingent claims that insure ex-post optimal stock market timing, J. Finance 34, 401414. (1979) Harrison, J. M. & Kreps, D. M., Martingales and arbitrage in multiperiod securities markets, J. Economic Theory 20, 381408. (1981) Harrison, J. M. & Pliska, S. R., Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes Applications 11, 215260. (1983) Harrison, J. M. & Pliska, S. R., A stochastic calculus model of continuous trading: complete markets, Stochastic Processes Applications 15, 313316. (1990) Heath, D., Jarrow, R. & Morton, A., Bond pricing and the term structure of interest rates: a discrete time approximation, J. Financial Quantitative Analysis 25, 419440. (1992) Heath, D., Jarrow, R. & Morton, A., Bond pricing and the term structure of interest rates: a

new methodology, Econometrica 60, 77105.
359
(1993) Hogan, M., Problems in certain two-factor term structure models, Annals Applied Probability 3, 576581. (1986) Ho, T. S. Y. & Lee, S.-B., Term structure movements and pricing interest rate contingent claims, J. Finance 41, 10111029. (1988) Huang, C.-f. & Litzenberger, R., Foundations for Financial Economics, North Holland, Amsterdam. (1993) Hull, J., Options, Futures, and other Derivative Securities, Prentice Hall, Englewood Cliffs, NJ. (1990) Hull, J. & White, A., Pricing interest rate derivative securities, Rev. Financial Studies 3, 573592. (1994a) Hull, J. & White, A., Numerical procedures for implementing term structure models I: single-factor models, J. Derivatives, Fall, 716. (1994b) Hull, J. & White, A., Numerical procedures for implementing term structure models II: two factor models, J. Derviatives, Winter. (1987) Ingersoll, J. E., Theory of Financial Decision Making, Rowman & Littleeld. (1944) It , K., Stochastic integral, Proc. Imperial Acad. Tokyo 20, 519524. o (1992) Jacka, S., A martingale representation result and an application to incomplete nancial markets, Math. Finance 2, 239250. (1990) Jaillet, P., Lamberton, D. & Lapeyre, B., Variational inequaltiies and the pricing of American options, Acta Appl. Math. 21, 263289. (1989) Jamshidian, F., An exact bond option pricing formula, J. Finance 44, 205209. (1990) Jamshidian, F., The preference-free determination of bond and option prices from the spot interest rate, Advances Futures & Options Research 4, 5167. (1993a) Jamshidian, F., Price differentials, Risk Magazine 6/7.
360 (1993b) Jamshidian, F., Corralling quantos, Risk Magazine 7/3. (1996) Jamshidian, F., Sorting out swaptions, Risk Magazine 9/3, 5960. (1988) Jarrow, R., Finance Theory, Prentice Hall, Englewood Cliffs, NJ. (1991a) Jarrow, R. & Madan, D., A characterization of complete security markets on a Brownian ltration, Math. Finance 1(3), 3143. (1991b) Jarrow, R. & Madan, D., Valuing and hedging contingent claims on semimartingales, preprint, Johnson School of Business, Cornell University. (1991c) Jarrow, R. & Madan, D., Option pricing using the term structure of interest rates to hedge systematic discontinuities in asset returns, preprint, Johnson School of Business, Cornell University. (1981) Jarrow, R. & Oldeld, G., Forward contracts and futures contracts, J. Financial Economics 9, 373382. (1983) Johnson, H., An analytic approximation for the American put price, J. Financial Quant. Analysis 18, 141148. (1984) Jones, E. P., Option arbitrage and strategy with large price changes, J. Financial Economics 13, 91113. (1951) Kac, M., On some connections between probability theory and differential and integral equations, Proc. 2nd Berkeley Symposium on Math. Stat. & Probability, 189215. (1988) Karatzas, I., On the pricing of American options, App. Math. Optimization 17, 3760. (1991) Karatzas, I. & Shreve, S. E., Brownian Motion and Stochastic Calculus, Springer-Verlag, New York. (1981) Karlin, S. & Taylor, H. M., A Second Course in Stochastic Processes, Academic Press, Orlando, FL. (1953) Kendall, M. G., Analysis of economic time-series Part I: prices, J. Royal Statistical Society 96, Part I, 1125. Reprinted in Cootner (1964), 8599. (1992) Kloeden, P. E. & Platen, E., Numerical Solution of Stochastic Differential Equations, SpringerVerlag, New York.
361
(1933) Kolmogorov, A. N., Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergebnisse Mathematik 2. English translation: Foundations of Probability Theory, Chelsea, New York, 1950. (1967) Kunita, H. & Watanabe, S., On square-integrable martingales, Nagoya Math. Journal 30, 209245. (1992) Kunitomo, N. & Ikeda, M., Pricing options with curved boundaries, Mathematical Finance 2, 275298. (1993) Lakner, P., Martingale measures for a class of right-continuous processes, Math. Finance 3, 4353. (1993) Lamberton, D., Convergence of the critical price in the approximation of American options, Math. Finance 3, 179190. (1989) LeRoy, S. F., Efcient capital markets and martingales, J. Economic Literature 27, 1583 1621. (1939) L vy, P., Sur certains processus stochastiques homog nes, Compositio Math. 7, 283339. e e (1948) L vy, P., Processus Stochastiques et Mouvement Brownian, Gauthier-Villars, Paris. e (1992a) Longstaff, F. A. & Schwartz, E. S., Interest rate volatility and the term structure: a twofactor general equilibrium model, J. Finance 47, 12591282. (1992b) Longstaff, F. A. & Schwartz, E. S., A two-factor interest rate model and contingent claim valuation, J. Fixed Income 3, 1623. (1986) MacMillan, L., Analytic approximation for the American put option, Adv. Futures Options Research 1, 119139. (1990) Madan, D. & Seneta, E., The V. G. model for share returns, J. Business 63, 511524. (1991) Madan, D. & Milne, F., Option pricing with V. G. martingale components, Math. Finance 1(4), 3956. (1976) Margrabe, W., A theory of forward and futures prices, preprint, Wharton School of Business, University of Pennsylvania. (1965) McKean, H. P., Jr., Appendix to P. A. Samuelson (1965): A free boundary problem for the heat equation arising from a problem in mathematical economics, Industrial Management Review
362 6, 3239. (1993) Mercurio, F. & Runggaldier, W., Option pricing for jump diffusions: approximations and their interpretation, Math. Finance 3, 191200. (1969) Merton, R. C., Lifetime portfolio selection under uncertainty: the continuous-time case, Review Economics Statistics 51, 247257. (1973) Merton, R. C., Theory of rational option pricing, Bell J. Econ. Management Science 4, 141 183. (1976) Merton, R. C., Option pricing when underlying stock returns are discontinuous, J. Fin. Economics 3, 125144. (1990) Merton, R. C., Continuous-Time Finance, Basil Blackwell, Cambridge, 1990. (1989) M ller, S. On complete securities markets and the martingale property on securities prices, u Economic Letters 31 3741. (1996) Musiela, M. & Rutkowski, M., Arbitrage Pricing of Derivative Securities. Theory and Applications, Springer-Verlag, New York, to appear. (1992) Myneni, R., The pricing of the American option, Ann. Applied Probability 2, 123. (1990) Naik, V. & Lee, M., General equilibrium pricing of options on the market portfolio with discontinuous returns, Rev. Financial Studies 3, 493521. (1995) ksendal, B., Stochastic Differential Equations, 4th Edition, Springer-Verlag, New York. (1987) Omberg, E., The valuation of American puts with exponential exercise policies, Adv. Futures Option Research 2, 117142. (1959) Osborne, M. F. M., Brownian motion in the stock market, Operations Research 7, 145173. Reprinted in Cootner (1964), 100128. (1977) Parkinson, M., Option pricing: the American put, J. Business 50, 2136. (1990) Protter, P., Stochastic Integration and Differential Equations, Springer-Verlag, New York. (1995) Reed, N., If the cap ts : : : , Risk 8(8), 3435.
363
(1981) Richard, S. F. & Sundaresan, M., A continuous-time equilibrium model of forward and futures prices in a multigood economy, J. Financial Economics, 9, 347371. (1987) Ritchken, P. Options: Theory, Strategy and Applications, Harper Collins. (1995) Rogers, L. C. G., Which model of term-structure of interest rates should one use?, IMA Vol. 65: Mathematical Finance, 93116, (M. Davis, D. Dufe, W. Fleming & S. Shreve, editors), Springer-Verlag, New York. (1995) Rogers, L. C. G. & Shi, Z., The value of an Asian option, J. Applied Probability, to appear. (1991) Rubinstein, M. & Reiner, E., Breaking down the barriers, Risk Magazine 4/8, 2835. (1965) Samuelson, P. A., Proof that properly anticipated prices uctuate randomly, Industrial Management Review 4, 4150. (1973) Samuelson, P. A., Mathematics of speculative prices, SIAM Review 15, 142. (1992a) Schweizer, M., Martingale densities for general asset prices, J. Math. Economics 21, 123 131. (1992b) Schweizer, M., Mean-variance hedging for general claims, Annals Appl. Probab. 2, 171 179. (1990) Shirakawa, H., Optimal dividend and portfolio decisions with Poisson and diffusion-type return processes, preprint IHSS 90-20, Tokyo Institute of Technology. (1991) Shirakawa, H., Interest rate option pricing with Poisson-Gaussian forward rate curve processes, Math. Finance 1(4), 7794. (1961) Sprenkle, C. M., Warrant prices as indicators of expectations and preferences, Yale Economic Essays 1, 178231. Reprinted in Cootner (1964), 412474. (1990) Stricker, C., Arbitrage et lois de martingales, Ann. Inst. henri Poincar 26, 451460. e (1987) Taqqu, M. & Willinger, W., The analysis of nite security markets using martingales, Adv. Appl. Probab. 19, 125. (1977) Vasicek, O., An equilibrium characterization of the term structure, J. Financial Economics 5, 177188.
364 (1994) Vetzal, K. R., A survey of stochastic continuous time models of the term structure of interest rates, Insurance: Mathematics and Economics 14, 139161. (1939) Ville, J., Etude Critique de la Notion du Collectif, Gauthier-Villars, Paris. (1923) Wiener, N., Differential spaces, J. Math. Physics 2, 131174. (1924) Wiener, N., Un probl me de probabilit s d nombrables, Bull. Soc. Math. France 52, 569 e e e 578. (1991) Williams, D., Probability with Martingales, Cambridge University Press, Cambridge. (1993) Wilmott, P., Dewynne, J. & Howison, S., Option Pricing, Oxford Financial Press, PO Box 348, Oxford, OX4 1DR, UK. (1995) Wilmott, P., Dewynne, J. & Howison, S., The Mathematics of Financial Derivatives, Cambridge University Press, Cambridge. (1992) Xue, X. X., Martingale representation for a class of processes with independent increments, and its applications, in Lecture Notes in Control and Information Sciences 177, 279311, SpringerVerlag, Berlin.

Steven Shreve. Lectures On Stochastic Calculus and Finance

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Steven Shreve. Lectures On Stochastic Calculus and Finance

Uploaded by

Copyright:

Available Formats

Steven Shreve: Stochastic Calculus and Finance

Simple European Derivative Securities . . . . . . . . . . . . . . . . . . . . . . . . The Binomial Model is Complete . . . . . . . . . . . . . . . . . . . . . . . . . . .

8 Random Walks 8.1 First Passage Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

10 Capital Asset Pricing

10.1 An Optimization Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 11 General Random Variables 123

12.1 Discrete-time Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

17.2 Risk-neutral measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 18 Martingale Representation Theorem 197

d-dimensional Girsanov Theorem . . . . . . . . . . . d-dimensional Martingale Representation Theorem . .

18.5 Multi-dimensional market model . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

20 Pricing Exotic Options

27 Bonds, forward contracts and futures

Introduction to Probability Theory

S (H) = 8 1 S2 (HT) = 4 S =4 0 S1 (T) = 2 S2 (TH) = 4

= fHHH HHT HTH HTT THH THT TTH TTT g

CHAPTER 1. Introduction to Probability Theory

0S1 (H ) + (1 + r)(V0 ; 0 S0):

If the stock goes down, the value of your portfolio is

0S1 (T ) + (1 + r)(V0 ; 0S0 ):

V1(H ) ; V1(T ) S1(H ) ; S1(T ) :

(1.7) To simplify this

1 ~ V0 = 1 + r pV1(H ) + qV1(T )]: ~

CHAPTER 1. Introduction to Probability Theory

0S1 (H ) + (1 + r)(V0 ; 0 S0)

V2(HH ) V2 (HT ) V2 (TH ) V2(TT )

V2 (TH ) ; V2(TT ) S2(TH ) ; S2(TT )

and substituting this into either equation, we can solve for

1 ~ X1 (T ) = 1 + r pV2(TH ) + qV2 (TT )]: ~

1 ~ V1(T ) = 1 + r pV2 (TH ) + qV2(TT )]: ~

V2(HH ) ; V2(HT ) S2(HH ) ; S2(HT )

1 ~ V1(H ) = 1 + r pV2(HH ) + qV2(HT )]: ~

1 ~ Vk 1 (!1 : : : !k 1) = 1 + r pVk (!1 : : : !k

1.2 Finite Probability Spaces

= fHHH HHT HTH HTT THH THT TTH TTT g

CHAPTER 1. Introduction to Probability Theory

F into 0 1] with the following

IP ( ) = 1, (ii) If A1 A2 : : : is a sequence of disjoint sets in F , then IP

IP fHHH g = IP fHTH g = IP fTHH g = IP fTTH g =

IP fHHT g = IP fHTT g = IP fTHT g = IP fTTT g =

3 IP fHHH HHT HTH HTT g = 1 + 2 1 3 3

18 (ii) If A 2 G , then its complement Ac (iii) If A1

A2 A3 : : : is a sequence of sets in G , then

Here are some important -algebras of subsets of the set

F3 = F = The set of all subsets of :

CHAPTER 1. Introduction to Probability Theory

-algebra of all subsets of

S2(HHH ) = S2(HHT ) = 16 S2(HTH ) = S2(HTT ) = S2(THH ) = S2 (THT ) = 4 S2(TTH ) = S2(TTT ) = 1:

4 27]. The preimage under S2 of this interval is dened to be

f! 2 S2(!) 2 4 27]g = f! 2 4 S2 27g = Ac : TT

AHH AHT ATH ATT

LS2 ( ) = IP ( ) = 0 LS2 (IR) = IP ( ) = 1 LS2 0 1) = IP ( ) = 1 2 LS2 0 3] = IP fS2 = 1g = IP (ATT ) = 2 : 3

FS2 (x) = IP (S2

8 >0 >4 <9 x) = > 8 >1 :9

if x < 1 if 1 x < 4 if 4 x < 16 if 16 x:

X , we mean any of the several ways of characterizing

CHAPTER 1. Introduction to Probability Theory

S2(HHH ) = S2(HHT ) = 16 S2(HTH ) = S2(HTT ) = S2(THH ) = S2 (THT ) = 4 S2(TTH ) = S2(TTT ) = 1:

k=1 ! Xk =xk n X X xk IP f!g k=1 ! Xk =xk n X xk IP fXk = xk g k=1 n X xk LX fxk g: k=1

(xk ; IEX )2IP fX = xk g =

(xk ; IEX )2LX (xk ):

1.3 Lebesgue Measure and the Lebesgue Integral

CHAPTER 1. Introduction to Probability Theory

24 In fact, every set containing countably innitely many numbers is Borel; if A = fa 1

The remaining set